Re: libcpp PATCH to avoid deprecated copy assignment

2018-05-22 Thread Gerald Pfeifer
On Mon, 21 May 2018, Jason Merrill wrote:
>> broke bootstrap on systems using libc++ instead of libstdc++

>>   In file included from /usr/include/c++/v1/new:91:
>>   /usr/include/c++/v1/exception:180:5: error: no member named 'fancy_abort' 
>> in namespace 'std::__1'; did you mean simply 'fancy_abort'?
>>   _VSTD::abort();

>> The problem appears to be the added #include 
> Does moving the #include  up higher help?

Yes, it does!

(Tested both with a direct bootstrap and by adding this to the 
FreeBSD port of gcc9-devel; both succeeded now.)

Thanks,
Gerald


Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support.

2018-05-22 Thread Zhangshaokun

Hi Kyrill,

On 2018/5/22 18:52, Kyrill Tkachov wrote:
> Hi Shaokun,
> 
> On 22/05/18 09:40, Shaokun Zhang wrote:
>> This patch adds a HiSilicon mcpu: tsv110.
>>
>> ---
>>  gcc/ChangeLog|   9 +++
>>  gcc/config/aarch64/aarch64-cores.def |   5 ++
>>  gcc/config/aarch64/aarch64-cost-tables.h | 103 
>> +++
>>  gcc/config/aarch64/aarch64-tune.md   |   2 +-
>>  gcc/config/aarch64/aarch64.c |  79 
>>  gcc/doc/invoke.texi  |   2 +-
>>  6 files changed, 198 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index cec2892..5d44966 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,12 @@
>> +2018-05-22  Shaokun Zhang 
>> +Bo Zhou  
>> +
>> +   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
>> +   * config/aarch64/aarch64-tune.md: Regenerated.
>> +   * doc/invoke.texi (AArch61 Options/-mtune): Add "tsv110".
> 
> typo: AArch64.
> 

Good catch, my mistake.

>> +   * gcc/config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
>> +   * gcc/config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
> 
> Please start the path with config/.
> 

Sure, will remove gcc/ in the next version.

>> +
>>  2018-05-21  Michael Meissner 
>>
>>  PR target/85657
>> diff --git a/gcc/config/aarch64/aarch64-cores.def 
>> b/gcc/config/aarch64/aarch64-cores.def
>> index 33b96ca..db7a412 100644
>> --- a/gcc/config/aarch64/aarch64-cores.def
>> +++ b/gcc/config/aarch64/aarch64-cores.def
>> @@ -91,6 +91,11 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
>> AARCH64_FL_FOR_ARCH8_2
>>  /* Qualcomm ('Q') cores. */
>>  AARCH64_CORE("saphira", saphira,falkor,8_3A, 
>> AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
>> 0x51, 0xC01, -1)
>>
>> +/* ARMv8.4-A Architecture Processors.  */
>> +
>> +/* HiSilicon ('H') cores. */
>> +AARCH64_CORE("tsv110", tsv110,tsv110,8_4A, 
>> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES 
>> | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
>> +
> 
> The third field is the scheduler model to use when optimising.
> Since there is no tsv110 scheduling model, using the name "tsv110"
> in the third field will generally give pretty poor schedules.
> I recommend you specify a scheduling model that most closely matches your
> core for the time being. But I don't think it's required and I wouldn't let
> it hold

I checked again: cortexa57 most closely matches tsv110. Thanks for the
suggestion.
If I choose cortexa57, can I still add tsv110_tunings to make use of tsv110's
pipeline features, as in the rest of the patch below, or should I use only the
generic ones?

> up the patch.
> 
> You'll need approval from an aarch64 maintainer (cc'ed some for you).
> 

Good, thanks for your nice guidance.

Thanks,
Shaokun

> Thanks,
> Kyrill
> 
>>  /* ARMv8-A big.LITTLE implementations.  */
>>
>>  AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE 
>> (0xd07, 0xd03), -1)
>> diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
>> b/gcc/config/aarch64/aarch64-cost-tables.h
>> index a455c62..b6890d6 100644
>> --- a/gcc/config/aarch64/aarch64-cost-tables.h
>> +++ b/gcc/config/aarch64/aarch64-cost-tables.h
>> @@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
>>}
>>  };
>>
>> +const struct cpu_cost_table tsv110_extra_costs =
>> +{
>> +  /* ALU */
>> +  {
>> +0, /* arith.  */
>> +0, /* logical.  */
>> +0, /* shift.  */
>> +0, /* shift_reg.  */
>> +COSTS_N_INSNS (1), /* arith_shift.  */
>> +COSTS_N_INSNS (1), /* arith_shift_reg.  */
>> +COSTS_N_INSNS (1), /* log_shift.  */
>> +COSTS_N_INSNS (1), /* log_shift_reg.  */
>> +0, /* extend.  */
>> +COSTS_N_INSNS (1), /* extend_arith.  */
>> +0, /* bfi.  */
>> +0, /* bfx.  */
>> +0, /* clz.  */
>> +0,/* rev.  */
>> +0, /* non_exec.  */
>> +true   /* non_exec_costs_exec.  */
>> +  },
>> +  {
>> +/* MULT SImode */
>> +{
>> +  COSTS_N_INSNS (2),   /* simple.  */
>> +  COSTS_N_INSNS (2),   /* flag_setting.  */
>> +  COSTS_N_INSNS (2),   /* extend.  */
>> +  COSTS_N_INSNS (2),   /* add.  */
>> +  COSTS_N_INSNS (2),   /* extend_add.  */
>> +  COSTS_N_INSNS (11)   /* idiv.  */
>> +},
>> +/* MULT DImode */
>> +{
>> +  COSTS_N_INSNS (3),   /* simple.  */
>> +  0,   /* flag_setting (N/A).  */
>> +  COSTS_N_INSNS (3),   /* extend.  */
>> +  COSTS_N_INSNS (3),   /* add.  

C++ PATCH for c++/81420, lifetime extension and array subscripting

2018-05-22 Thread Jason Merrill
The first hunk fixes looking through the array reference; the second
fixes looking through the base class conversion.  There's more to do
to implement DR 1299, but this is a solid improvement from a fairly
trivial patch.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a065c041556f30a685e2d6b02c0486f7e7e77443
Author: Jason Merrill 
Date:   Tue May 1 20:23:18 2018 -0400

PR c++/81420 - not extending temporary lifetime.

* call.c (extend_ref_init_temps_1): Handle ARRAY_REF.
* class.c (build_base_path): Avoid redundant move of an rvalue.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 1df4d14dfe6..c100a92f2fb 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -11061,7 +11061,9 @@ extend_ref_init_temps_1 (tree decl, tree init, vec<tree, va_gc> **cleanups)
   if (TREE_CODE (sub) != ADDR_EXPR)
 return init;
   /* Deal with binding to a subobject.  */
-  for (p = &TREE_OPERAND (sub, 0); TREE_CODE (*p) == COMPONENT_REF; )
+  for (p = &TREE_OPERAND (sub, 0);
+   (TREE_CODE (*p) == COMPONENT_REF
+	|| TREE_CODE (*p) == ARRAY_REF); )
 p = &TREE_OPERAND (*p, 0);
   if (TREE_CODE (*p) == TARGET_EXPR)
 {
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index a9a0fa92727..25753d4c45f 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -426,7 +426,7 @@ build_base_path (enum tree_code code,
 {
   expr = cp_build_fold_indirect_ref (expr);
   expr = build_simple_base_path (expr, binfo);
-  if (rvalue)
+  if (rvalue && lvalue_p (expr))
 	expr = move (expr);
   if (want_pointer)
 	expr = build_address (expr);
diff --git a/gcc/testsuite/g++.dg/cpp0x/temp-extend1.C b/gcc/testsuite/g++.dg/cpp0x/temp-extend1.C
new file mode 100644
index 000..639f9456573
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/temp-extend1.C
@@ -0,0 +1,19 @@
+// PR c++/81420
+// { dg-do run { target c++11 } }
+
+int d;
+
+struct A
+{
+  int i[2];
+  ~A() { ++d; };
+};
+
+struct B: A {};
+
+int main()
+{
+  const int  = B().i[0];
+  if (d != 0)
+__builtin_abort();
+}


Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support

2018-05-22 Thread Zhangshaokun
Hi Ramana,

On 2018/5/22 18:28, Ramana Radhakrishnan wrote:
> On Tue, May 22, 2018 at 9:40 AM, Shaokun Zhang
>  wrote:
>> tsv110 is designed by HiSilicon and supports v8_4A. It also optimizes the
>> L1 Icache, which can access the L1 Dcache.
>> Therefore, DC CVAU is not necessary in __aarch64_sync_cache_range for
>> tsv110. Is there a good way to skip the DC CVAU operation for tsv110?
> 
> A solution would be to use an ifunc but on a cpu variant.
> 

Could you explain the ifunc approach further?
As for CPU variants, HiSilicon tsv110 has two versions, with CPU variants
0 and 1. Both are expected to skip the DC CVAU operation when syncing the
icache and dcache.

Hi ARM guys,
would you be happy to share your ideas about this?

> Is this really that important for performance and on what workloads ?
> 

Since DC CVAU is not necessary for syncing the icache and dcache on tsv110,
it is beneficial for performance to skip the redundant DC CVAU and do IC IVAU
only. For the JVM, __clear_cache is called many times.

Thanks,
Shaokun

> regards
> Ramana
> 
>>
>> Any thoughts and ideas are welcome.
>>
>> Shaokun Zhang (1):
>>   [aarch64] Add HiSilicon tsv110 CPU support.
>>
>>  gcc/ChangeLog|   9 +++
>>  gcc/config/aarch64/aarch64-cores.def |   5 ++
>>  gcc/config/aarch64/aarch64-cost-tables.h | 103 
>> +++
>>  gcc/config/aarch64/aarch64-tune.md   |   2 +-
>>  gcc/config/aarch64/aarch64.c |  79 
>>  gcc/doc/invoke.texi  |   2 +-
>>  6 files changed, 198 insertions(+), 2 deletions(-)
>>
>> --
>> 2.7.4
>>
> 
> 



C++ PATCH for c++/85866, error with .* in template arg

2018-05-22 Thread Jason Merrill
During function template argument deduction, when we encounter a
non-template parameter with a type that depends on other parameters,
we want to check whether substituting in the arguments we already have
causes a substitution failure.  Since we might not have all the
arguments yet, we need to do this with processing_template_decl set.
tsubst_copy_and_build wasn't dealing well with a call to a .*
expression in that context; outside of a template, that shows up as an
OFFSET_REF, but within a template it's DOTSTAR_EXPR.
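
For readers unfamiliar with the construct, here is a minimal sketch of a
function template whose non-type template parameter has a type that depends
on the earlier parameters through a call to a .* expression.  It is not the
testcase from the PR; Foo, bar and call_member are names made up for
illustration, and std::declval stands in for the hand-written declval in
the test.

```cpp
#include <utility>

struct Foo {
  int bar() { return 42; }
};

// The type of the unnamed third template parameter depends on U and V:
// it is "pointer to the result of (u.*v)()".  Forming that type requires
// substituting U and V into a call through a pointer to member.
template <typename U, typename V,
          decltype((std::declval<U>().*std::declval<V>())())* = nullptr>
int call_member(U obj, V pmf) {
  return (obj.*pmf)();
}
```

Calling call_member(Foo{}, &Foo::bar) deduces U and V from the arguments,
after which the compiler has to substitute them into the .* call to form
the type of the third parameter.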

Tested x86_64-pc-linux-gnu, applying to trunk and 8.
commit 7704d0c55fb91ff619ff5487fec643490608d8d7
Author: Jason Merrill 
Date:   Tue May 22 15:32:27 2018 -0400

PR c++/85866 - error with .* in default template arg.

* pt.c (tsubst_copy_and_build): Handle partial instantiation.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 81de633b1ee..0b04770e123 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18433,7 +18433,9 @@ tsubst_copy_and_build (tree t,
 	  /* Unsupported internal function with arguments.  */
 	  gcc_unreachable ();
 	}
-	else if (TREE_CODE (function) == OFFSET_REF)
+	else if (TREE_CODE (function) == OFFSET_REF
+		 || TREE_CODE (function) == DOTSTAR_EXPR
+		 || TREE_CODE (function) == MEMBER_REF)
 	  ret = build_offset_ref_call_from_tree (function, &call_args,
 		 complain);
 	else if (TREE_CODE (function) == COMPONENT_REF)
diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg9.C b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg9.C
new file mode 100644
index 000..833049c6de3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg9.C
@@ -0,0 +1,29 @@
+// PR c++/85866
+// { dg-do compile { target c++11 } }
+
+template
+_Up
+__declval(int);
+
+template
+_Tp
+__declval(long);
+
+template
+auto declval() noexcept -> decltype(__declval<_Tp>(0));
+
+template
+using void_t = void;
+
+template().*declval()) () )
+		>* = nullptr>
+void boom(){}
+
+struct Foo {
+  void bar(){}
+};
+
+int main() {
+  boom();
+}


Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-22 Thread Jason Merrill
On Tue, May 22, 2018 at 7:25 PM, Marek Polacek  wrote:
> On Mon, May 21, 2018 at 09:51:44PM -0400, Jason Merrill wrote:
>> On Mon, May 21, 2018 at 7:34 PM, Marek Polacek  wrote:
>> > The previous version of this patch got confused by
>> >
>> >   for (int i = 0; n > 0 ? true : false; i++)
>> > // ...
>> >
>> > because even though we see a ; followed by a :, it's not a range-based for 
>> > with
>> > an initializer.  I find it very strange that this didn't show up during the
>> > regtest.
>> >
>> > To fix this, I had to uglify range_based_for_with_init_p to also check for 
>> > a ?.
>> > Yuck.
>>
>> Perhaps cp_parser_skip_to_closing_parenthesis_1 should handle balanced
>> ?: like ()/[]/{}.
>
> Good point.  Clearly there's a difference between ?: and e.g. () because : can
> stand alone--e.g. in asm (: "whatever"), labels, goacc arrays like a[0:N], and
> so on.  The following seems to work well, and is certainly less ugly than the
> previous version.
>
> +   case CPP_QUERY:
> + if (!brace_depth)
> +   ++condop_depth;
> + break;
> +
> +   case CPP_COLON:
> + if (!brace_depth && condop_depth > 0)
> +   condop_depth--;
> + break;

Since, as you say, colons can appear in more places, maybe we only
want to adjust condop_depth when all the other depths are 0, not just
brace_depth.
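
To make the idea concrete, here is a rough character-level sketch of the
balancing, not the parser code itself.  It works on a plain string rather
than on preprocessor tokens, ignores string literals and comments, and the
names find_unnested and looks_like_range_for_with_init are made up for
illustration.  The conditional depth is only adjusted when every other
depth is zero.

```cpp
#include <cstddef>
#include <string>

// Return the index of the first WANTED character at or after POS that is
// not nested in (), [], {} or a pending ?: pair; npos if there is none.
static std::size_t find_unnested(const std::string &s, std::size_t pos,
                                 char wanted) {
  unsigned paren = 0, square = 0, brace = 0, condop = 0;
  for (std::size_t i = pos; i < s.size(); ++i) {
    const char c = s[i];
    if (c == ':' && i + 1 < s.size() && s[i + 1] == ':') {
      ++i;  // Skip the scope operator "::", which is a separate token.
      continue;
    }
    const bool top = !paren && !square && !brace;
    if (c == wanted && top && !condop)
      return i;
    switch (c) {
      case '(': ++paren; break;
      case ')': if (paren) --paren; break;
      case '[': ++square; break;
      case ']': if (square) --square; break;
      case '{': ++brace; break;
      case '}': if (brace) --brace; break;
      case '?': if (top) ++condop; break;            // opens a ?: pair
      case ':': if (top && condop) --condop; break;  // closes a ?: pair
      default: break;
    }
  }
  return std::string::npos;
}

// HEADER is the text between the parentheses of a for statement.  There
// has to be an unnested ';' followed by an unnested ':'.
bool looks_like_range_for_with_init(const std::string &header) {
  const std::size_t semi = find_unnested(header, 0, ';');
  if (semi == std::string::npos)
    return false;
  return find_unnested(header, semi + 1, ':') != std::string::npos;
}
```

With this, "int i = 0; n > 0 ? true : false; i++" is rejected because its
only ':' closes a '?', while "auto v = get(); int x : v" is accepted.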

Jason


[PATCH] allow more strncat calls with -Wstringop-truncation (PR 85700)

2018-05-22 Thread Martin Sebor

Here's another small refinement to -Wstringop-truncation to
avoid diagnosing more arguably "safe" cases of strncat() that
match the expected pattern of

  strncat (d, s, sizeof d - strlen (d) - 1);

such as

  extern char a[4];
  strncat (a, "12", sizeof a - strlen (a) - 1);

Since the bound is derived from the length of the destination
as GCC documents is the expected use, the call should probably
not be diagnosed even though truncation is possible.

The trouble with strncat is that it specifies a single count
that can be (and has been) used to specify either the remaining
space in the destination or the maximum number of characters
to append, but not both.  It's nearly impossible to tell for
certain which the author meant, and if it's safe, hence all
this fine-tuning.  I suspect this isn't the last tweak, either.
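
As an illustration of the "remaining space" reading of the bound, here is
a small sketch of a wrapper around strncat.  The helper name
append_bounded and its return convention are assumptions made for this
example; nothing like it is part of the patch.

```cpp
#include <cstddef>
#include <cstring>

// Append SRC to the NUL-terminated string in DST, whose total capacity is
// DST_SIZE bytes.  The bound passed to strncat is the space left in DST,
// minus one for the terminating NUL.  Returns true if all of SRC fit,
// false if it had to be truncated (or DST was already full).
bool append_bounded(char *dst, std::size_t dst_size, const char *src) {
  const std::size_t used = std::strlen(dst);
  if (used >= dst_size)
    return false;  // no room even for the terminator; leave DST alone
  const std::size_t room = dst_size - used - 1;
  std::strncat(dst, src, room);
  return std::strlen(src) <= room;
}
```

With a 4-byte destination holding "ab", appending "12" leaves "ab1" and
reports truncation; the call is safe even though the source did not fit.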

In any event, I'd like to commit the patch to both trunk and
gcc-8-branch.  The bug isn't marked regression but I suppose
it could (should) well be considered one.

Martin
PR tree-optimization/85700 - Spurious -Wstringop-truncation warning with strncat

gcc/ChangeLog:

	PR tree-optimization/85700
	* gimple-fold.c (gimple_fold_builtin_strncat): Adjust comment.
	* tree-ssa-strlen.c (is_strlen_related_p): Handle integer subtraction.
	(maybe_diag_stxncpy_trunc): Distinguish strncat from strncpy.

gcc/testsuite/ChangeLog:

	PR tree-optimization/85700
	* gcc.dg/Wstringop-truncation-3.c: New test.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index b45798c..c37abe1 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2062,10 +2062,12 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
   if (!nowarn && cmpsrc == 0)
 {
   tree fndecl = gimple_call_fndecl (stmt);
-
-  /* To avoid certain truncation the specified bound should also
-	 not be equal to (or less than) the length of the source.  */
   location_t loc = gimple_location (stmt);
+
+  /* To avoid possible overflow the specified bound should also
+	 not be equal to the length of the source, even when the size
+	 of the destination is unknown (it's not an uncommon mistake
+	 to specify as the bound to strncpy the length of the source).  */
   if (warning_at (loc, OPT_Wstringop_overflow_,
 		  "%G%qD specified bound %E equals source length",
 		  stmt, fndecl, len))
diff --git a/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c b/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c
new file mode 100644
index 000..f394863
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstringop-truncation-3.c
@@ -0,0 +1,63 @@
+/* PR tree-optimization/85700 - Spurious -Wstringop-truncation warning
+   with strncat
+   { dg-do compile }
+   { dg-options "-O2 -Wno-stringop-overflow -Wstringop-truncation -ftrack-macro-expansion=0" } */
+
+#define NOIPA __attribute__ ((noipa))
+#define strncat __builtin_strncat
+#define strlen __builtin_strlen
+
+extern char a4[4], b4[4], ax[];
+
+NOIPA void cat_a4_s1_1 (void)
+{
+  /* There is no truncation here but since the bound of 1 equals
+ the length of the source string it's likely a mistake that
+ could cause overflow so it's diagnosed by -Wstringop-overflow */
+  strncat (a4, "1", 1);
+}
+
+NOIPA void cat_a4_s1_2 (void)
+{
+  strncat (a4, "1", 2);
+}
+
+NOIPA void cat_a4_s1_3 (void)
+{
+  strncat (a4, "1", 3);
+}
+
+NOIPA void cat_a4_s1_4 (void)
+{
+  /* There is no truncation here but since the bound of 1 equals
+ the length of the source string it's likely a mistake that
+ could cause overflow so it's diagnosed by -Wstringop-overflow */
+  strncat (a4, "1", 4);
+}
+
+NOIPA void cat_a4_s1_5 (void)
+{
+  /* A bound in excess of the destination size is diagnosed by
+ -Wstringop-overflow.  */
+  strncat (a4, "1", 5);
+}
+
+NOIPA void cat_a4_s1_dlen (void)
+{
+  strncat (a4, "1", sizeof a4 - strlen (a4) - 1);
+}
+
+NOIPA void cat_a4_s2_dlen (void)
+{
+  strncat (a4, "12", sizeof a4 - strlen (a4) - 1);  /* { dg-bogus "\\\[-Wstringop-truncation]" } */
+}
+
+NOIPA void cat_a4_b4_dlen (void)
+{
+  strncat (a4, b4, sizeof a4 - strlen (a4) - 1);  /* { dg-bogus "\\\[-Wstringop-truncation]" } */
+}
+
+NOIPA void cat_ax_b4_dlen (void)
+{
+  strncat (ax, b4, 32 - strlen (ax) - 1);  /* { dg-bogus "\\\[-Wstringop-truncation]" } */
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 556c5bc..22a17d6 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1778,6 +1778,15 @@ is_strlen_related_p (tree src, tree len)
   return is_strlen_related_p (src, rhs1);
 }
 
+  if (tree rhs2 = gimple_assign_rhs2 (def_stmt))
+{
+  /* Integer subtraction is considered strlen-related when both
+	 arguments are integers and second one is strlen-related.  */
+  rhstype = TREE_TYPE (rhs2);
+  if (INTEGRAL_TYPE_P (rhstype) && code == MINUS_EXPR)
+	return is_strlen_related_p (src, rhs2);
+}
+
   return false;
 }
 
@@ -1969,6 +1978,12 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
 
   gcall *call = as_a <gcall *> (stmt);
 
+  /* 

Re: C++ PATCH to implement P0614R1, Range-based for statements with initializer (take 2)

2018-05-22 Thread Marek Polacek
On Mon, May 21, 2018 at 09:51:44PM -0400, Jason Merrill wrote:
> On Mon, May 21, 2018 at 7:34 PM, Marek Polacek  wrote:
> > The previous version of this patch got confused by
> >
> >   for (int i = 0; n > 0 ? true : false; i++)
> > // ...
> >
> > because even though we see a ; followed by a :, it's not a range-based for 
> > with
> > an initializer.  I find it very strange that this didn't show up during the
> > regtest.
> >
> > To fix this, I had to uglify range_based_for_with_init_p to also check for 
> > a ?.
> > Yuck.
> 
> Perhaps cp_parser_skip_to_closing_parenthesis_1 should handle balanced
> ?: like ()/[]/{}.

Good point.  Clearly there's a difference between ?: and e.g. () because : can
stand alone--e.g. in asm (: "whatever"), labels, goacc arrays like a[0:N], and
so on.  The following seems to work well, and is certainly less ugly than the
previous version.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-05-22  Marek Polacek  

Implement P0614R1, Range-based for statements with initializer.
* parser.c (cp_parser_range_based_for_with_init_p): New.
(cp_parser_init_statement): Use it.  Parse the optional init-statement
for a range-based for loop.
(cp_parser_skip_to_closing_parenthesis_1): Handle balancing ?:.

* g++.dg/cpp2a/range-for1.C: New test.
* g++.dg/cpp2a/range-for2.C: New test.
* g++.dg/cpp2a/range-for3.C: New test.
* g++.dg/cpp2a/range-for4.C: New test.
* g++.dg/cpp2a/range-for5.C: New test.
* g++.dg/cpp2a/range-for6.C: New test.
* g++.dg/cpp2a/range-for7.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 6f51f03f47c..8e4c66d8503 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -3493,6 +3493,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
   unsigned paren_depth = 0;
   unsigned brace_depth = 0;
   unsigned square_depth = 0;
+  unsigned condop_depth = 0;
 
   if (recovering && or_ttype == CPP_EOF
   && cp_parser_uncommitted_to_tentative_parse_p (parser))
@@ -3504,7 +3505,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
 
   /* Have we found what we're looking for before the closing paren?  */
   if (token->type == or_ttype && or_ttype != CPP_EOF
- && !brace_depth && !paren_depth && !square_depth)
+ && !brace_depth && !paren_depth && !square_depth && !condop_depth)
return -1;
 
   switch (token->type)
@@ -3551,6 +3552,16 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
*parser,
}
  break;
 
+   case CPP_QUERY:
+ if (!brace_depth)
+   ++condop_depth;
+ break;
+
+   case CPP_COLON:
+ if (!brace_depth && condop_depth > 0)
+   condop_depth--;
+ break;
+
default:
  break;
}
@@ -11255,6 +11266,40 @@ cp_parser_statement_seq_opt (cp_parser* parser, tree 
in_statement_expr)
 }
 }
 
+/* Return true if this is the C++20 version of range-based-for with
+   init-statement.  */
+
+static bool
+cp_parser_range_based_for_with_init_p (cp_parser *parser)
+{
+  bool r = false;
+
+  /* Save tokens so that we can put them back.  */
+  cp_lexer_save_tokens (parser->lexer);
+
+  /* There has to be an unnested ; followed by an unnested :.  */
+  if (cp_parser_skip_to_closing_parenthesis_1 (parser,
+  /*recovering=*/false,
+  CPP_SEMICOLON,
+  /*consume_paren=*/false) != -1)
+goto out;
+
+  /* We found the semicolon, eat it now.  */
+  cp_lexer_consume_token (parser->lexer);
+
+  /* Now look for ':' that is not nested in () or {}.  */
+  r = (cp_parser_skip_to_closing_parenthesis_1 (parser,
+   /*recovering=*/false,
+   CPP_COLON,
+   /*consume_paren=*/false) == -1);
+
+out:
+  /* Roll back the tokens we skipped.  */
+  cp_lexer_rollback_tokens (parser->lexer);
+
+  return r;
+}
+
 /* Return true if we're looking at (init; cond), false otherwise.  */
 
 static bool
@@ -12299,7 +12344,7 @@ cp_parser_iteration_statement (cp_parser* parser, bool 
*if_p, bool ivdep,
  simple-declaration  */
 
 static bool
-cp_parser_init_statement (cp_parser* parser, tree *decl)
+cp_parser_init_statement (cp_parser *parser, tree *decl)
 {
   /* If the next token is a `;', then we have an empty
  expression-statement.  Grammatically, this is also a
@@ -12312,6 +12357,29 @@ cp_parser_init_statement (cp_parser* parser, tree 
*decl)
   bool is_range_for = false;
   bool saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
 
+  /* Try to parse the init-statement.  */
+  if (cp_parser_range_based_for_with_init_p (parser))
+   {
+ tree dummy;
+ cp_parser_parse_tentatively 

[PATCH] PR target/85358: Add target hook to prevent default widening

2018-05-22 Thread Michael Meissner
I posted this patch at the end of GCC 8 as an RFC.  Now that we are in GCC 9, I
would like to repost it.  Sorry to spam some of you.  It is unclear who the
reviewers for things like target hooks and basic mode handling are.

Here is the original patch.
https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00764.html

PowerPC has 3 different 128-bit floating point types (KFmode, IFmode, and
TFmode).  We are in the process of migrating long double from IBM extended
double to IEEE 128-bit floating point.

*   IFmode is IBM extended double (__ibm128)
*   KFmode is IEEE 128-bit floating point (__float128, _Float128N)
*   TFmode is whatever long double maps to

If we are compiling for a power8 system (which does not have hardware IEEE
128-bit floating point), the current system works because none of the 3 modes
has hardware support.

If we are compiling for a power9 system and long double is IBM extended double,
again things are fine.

However, if we are compiling for power9 and we've flipped the default for long
double to be IEEE 128-bit floating point, then the code to support __ibm128
breaks.  The machine independent portions of the mode handling say: oh, there
is hardware to support TFmode operations, let's widen the type to TFmode and do
those operations.  However, converting IFmode to TFmode is not cheap; it has to
be done in a function call.

This patch adds a new target hook that, when overridden, lets the backend say:
don't automatically widen this type to that type.  The PowerPC port defines the
target hook so that it doesn't automatically convert IBM extended double to
IEEE 128-bit and vice versa.
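
As a toy model of the decision such a hook would make, and not the code in
the patch, consider the sketch below.  The Mode and Format enums, the
tf_is_ieee flag and the function names are all made up for illustration;
the real hook takes two machine_mode values and an unsigned_p flag.

```cpp
// Hypothetical stand-ins for the modes discussed above.
enum class Mode { DF, IF, KF, TF };
enum class Format { Double, IbmExtended, Ieee128 };

// Map a mode to its representation.  TF follows whatever long double is
// configured to be, modelled here by the TF_IS_IEEE flag.
Format format_of(Mode m, bool tf_is_ieee) {
  switch (m) {
    case Mode::IF: return Format::IbmExtended;
    case Mode::KF: return Format::Ieee128;
    case Mode::TF: return tf_is_ieee ? Format::Ieee128 : Format::IbmExtended;
    default:       return Format::Double;
  }
}

// Return false when FROM and TO are both 128-bit modes with different
// representations, since converting between them needs a function call.
bool allow_default_widening(Mode from, Mode to, bool tf_is_ieee) {
  const Format f = format_of(from, tf_is_ieee);
  const Format t = format_of(to, tf_is_ieee);
  const bool both_128 = f != Format::Double && t != Format::Double;
  return !(both_128 && f != t);
}
```

In this model, widening IF to TF is refused once long double is IEEE
128-bit, while widening DF to TF is still allowed.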

This patch goes through all of the places that call GET_MODE_WIDER_MODE and
makes them call the target hook as well.  Now, PowerPC only needs to block certain
floating point widenings.  Several of the changes are to integer widenings, and
if desired, we could restrict the changes to just floating point types.
However, there might be other ports that need the flexibility for other types.

I have tried various other approaches to fix this problem, and so far, I have
not been able to come up with a PowerPC back-end only solution that works.

Alternatively, Segher has suggested that the call to the target hook be in
GET_MODE_WIDER_MODE and GET_MODE_2XWIDER_MODE (plus any places where we access
the mode_wider array directly).

I have built little endian PowerPC builds with this patch, and I have verified
that it does work.  I have tested the same patch in April on a big endian
PowerPC system and x86_64 and it worked there also.

Can I check in this patch as is (I will verify that x86 and big endian PowerPC
still work before checkin)?  Or would people prefer modifications to the patch?

[gcc]
2018-05-22  Michael Meissner  

PR target/85358
* target.def (default_widening_p): New target hook to say whether
default widening between modes should be done.
* targhooks.h (hook_bool_mode_mode_bool_true): New declaration.
* targhooks.c (hook_bool_mode_mode_bool_true): New default target
hook.
* optabs.c (expand_binop): Before doing default widening, check
whether the backend allows the widening.
(expand_twoval_unop): Likewise.
(expand_twoval_binop): Likewise.
(widen_leading): Likewise.
(widen_bswap): Likewise.
(expand_unop): Likewise.
* cse.c (cse_insn): Likewise.
* combine.c (simplify_comparison): Likewise.
* var-tracking.c (prepare_call_arguments): Likewise.
* config/rs6000/rs6000.c (TARGET_DEFAULT_WIDENING_P): Define
target hook to prevent IBM extended double and IEEE 128-bit
floating point from being converted to each other by default.
(rs6000_default_widening_p): Likewise.
* doc/tm.texi (TARGET_DEFAULT_WIDENING_P): Document the new
default widening hook.
* doc/tm.texi.in (TARGET_DEFAULT_WIDENING_P): Likewise.

[gcc/testsuite]
2018-05-22  Michael Meissner  

PR target/85358
* gcc.target/powerpc/pr85358.c: New test to make sure __ibm128
does not widen to __float128 on ISA 3.0 systems.

In order to start the transition of PowerPC long double to IEEE 128-bit, we
will need this patch or a similar patch to be back ported to GCC 8.2.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Index: gcc/target.def
===
--- gcc/target.def  (revision 260550)
+++ gcc/target.def  (working copy)
@@ -3498,6 +3498,13 @@ If this hook allows @code{val} to have a
  hook_bool_mode_uhwi_false)
 
 DEFHOOK
+(default_widening_p,
+ "Return true if GCC can automatically widen from @var{from_mode} to\n\
+@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.",
+ bool, (machine_mode, machine_mode, bool),
+ 

Re: [PATCH] PR libgcc/60790: Avoid IFUNC resolver access to uninitialized data

2018-05-22 Thread Jeff Law
On 03/29/2018 08:00 AM, Florian Weimer wrote:
> This patch performs lazy initialization of the relevant CPUID feature
> register value.  It will needlessly invoke the CPUID determination code
> on architectures which lack CPUID support or support for the feature
> register, but I don't think avoiding that is worth the added complexity.
> 
> I verified manually that the CMPXCHG16B implementation is still selected
> for a 128-bit load after the change.  I don't know how to write an
> automated test for that.
> 
> Thanks,
> Florian
> 
> pr60790.patch
> 
> 
> Index: libatomic/ChangeLog
> ===
> --- libatomic/ChangeLog   (revision 258952)
> +++ libatomic/ChangeLog   (working copy)
> @@ -1,3 +1,18 @@
> +2018-03-29  Florian Weimer  
> +
> + PR libgcc/60790
> + x86: Do not assume ELF constructors run before IFUNC resolvers.
> + * config/x86/host-config.h (libat_feat1_ecx, libat_feat1_edx):
> + Remove declarations.
> + (__libat_feat1, __libat_feat1_init): Declare.
> + (FEAT1_REGISTER): Define.
> + (load_feat1): New function.
> + (IFUNC_COND_1): Adjust.
> + * config/x86/init.c (libat_feat1_ecx, libat_feat1_edx)
> + (init_cpuid): Remove definitions.
> + (__libat_feat1): New variable.
> + (__libat_feat1_init): New function.
OK.

Do you have write access to the GCC repository?  If not I can commit for
you.

jeff


Re: [RFC] Configure and testsuite updates for ARM FDPIC target

2018-05-22 Thread Jeff Law
On 05/07/2018 06:29 AM, Christophe Lyon wrote:
> Hello,
> 
> 
> I am preparing the submission of a patch series to support the FDPIC ABI
> for Linux on ARM.
> 
> During development, we internally used arm-linux-uclibceabi as target
> name, but I had to change it to handle feedback when I submitted the
> binutils patches.
> 
> These have been merged to binutils master, and use arm-uclinuxfdpiceabi
> as target name.
> 
> 
> Changing the target name in binutils was a bit painful, but the
> equivalent change in GCC is more invasive: to get the same testsuite
> results we had with arm-linux-uclibceabi, I not only had to update
> several tests to accept the additional target, but I also had to edit
> several configure-related files to activate the same paths we had
> validated with the previous target name.
> 
> 
> Roughly speaking, it is a matter of extending cases where we try to
> match $target or $host against *-linux*, or $host_os against linux*. In
> all these cases I conservatively chose to add arm*-*-uclinuxfdpiceabi or
> uclinuxfdpiceabi to avoid side-effects on other uclinux targets.
> 
> I have attached the patch that handles that in the series, to check if
> this will be acceptable. Indeed, I am surprised by the number of changes
> involved, and I thought most of the enablement had already been done on
> other uclinux targets.
I don't see anything particularly objectionable here.  The ARM
maintainers would have the final say on this.

jeff



Re: [PATCH 2/2] df-scan: remove ad-hoc handling of global regs in asms

2018-05-22 Thread Jeff Law
On 05/16/2018 04:30 AM, Alexander Monakov wrote:
> 
> 
> On Mon, 23 Apr 2018, Alexander Monakov wrote:
> 
>> As discussed in the cover letter, the code removed in this patch is
>> unnecessary, references to global reg vars from inline asms do not work
>> reliably, and so we should simply require that inline asms use constraints
>> to make such references properly visible to the compiler.
>>
>> Bootstrapped/regtested on powerpc64, will retest on ppc64le and x86 in
>> stage 1.
>>
>> PR rtl-optimization/79985
>>  * df-scan.c (df_insn_refs_collect): Remove special case for
>> global registers and asm statements.
> 
> Ping. I've retested once on ppc64le since posting.
This has the potential to break existing code that we've tried to keep
working.  Worse yet, it's not code we're likely to see until gcc-9 goes
into wide deployment.  So there's certainly a risk of complaints around
this change.

I would not expect our testsuite to provide any meaningful test coverage
here.  Matz (in the discussion around pr44281 on gcc-patches) indicates
that some jits may utilize global registers.  Unfortunately, he doesn't
indicate which ones -- which would provide a pointer for deeper testing.

But even with those caveats, I think the consensus is to go forward with
the doc change.  This change naturally follows from the doc update.

So OK for the trunk.  If there's fallout in gcc-9, we'll obviously have
to deal with it.

jeff


[PATCH] Fix PR85712 (SLSR cleanup of alternative interpretations)

2018-05-22 Thread Bill Schmidt
Hi,

PR85712 shows an existing test case failing in the SLSR pass because the
code that cleans up alternative interpretations (CAND_ADD versus CAND_MULT,
for example) after a replacement is flawed.  This patch fixes the flaw by
ensuring that we always visit all interpretations, not just subsequent ones
in the next_interp chain.  I found six occurrences of this mistake in the
code.
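
The difference between the two traversals can be sketched as follows.
This is a simplified model rather than the pass's data structures: Cand,
its stale flag and the two function names are made up for illustration,
and candidate numbers are 1-based with 0 meaning "none", as in the patch.

```cpp
#include <vector>

struct Cand {
  int next_interp;   // number of the next interpretation, 0 if none
  int first_interp;  // number of the first interpretation in the chain
  bool stale;        // set once the candidate has been cleaned up
};

// Flawed variant: starting from candidate IDX, only IDX and the
// interpretations after it in the chain are visited.
int mark_subsequent_interps(std::vector<Cand> &cands, int idx) {
  int visited = 0;
  for (int i = idx; i != 0; i = cands[i - 1].next_interp) {
    cands[i - 1].stale = true;
    ++visited;
  }
  return visited;
}

// Fixed variant: jump to the head of the chain first, so that every
// interpretation of the statement is visited.
int mark_all_interps(std::vector<Cand> &cands, int idx) {
  return mark_subsequent_interps(cands, cands[idx - 1].first_interp);
}
```

With a two-element chain and a replacement made through the second
interpretation, the flawed variant never reaches the first one.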

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
No new test case is added since the failure occurs on an existing test
in the test suite.  Is this okay for trunk, and for backports to all
supported branches after some burn-in time?

Thanks,
Bill


2018-05-22  Bill Schmidt  

* gimple-ssa-strength-reduction.c (struct slsr_cand_d): Add
first_interp field.
(alloc_cand_and_find_basis): Initialize first_interp field.
(slsr_process_mul): Modify first_interp field.
(slsr_process_add): Likewise.
(slsr_process_cast): Modify first_interp field for each new
interpretation.
(slsr_process_copy): Likewise.
(dump_candidate): Dump first_interp field.
(replace_mult_candidate): Process all interpretations, not just
subsequent ones.
(replace_rhs_if_not_dup): Likewise.
(replace_one_candidate): Likewise.

Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 260484)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -266,6 +266,10 @@ struct slsr_cand_d
  of a statement.  */
   cand_idx next_interp;
 
+  /* Index of the first candidate record in a chain for the same
+ statement.  */
+  cand_idx first_interp;
+
   /* Index of the basis statement S0, if any, in the candidate vector.  */
   cand_idx basis;
 
@@ -686,6 +690,7 @@ alloc_cand_and_find_basis (enum cand_kind kind, gi
   c->kind = kind;
   c->cand_num = cand_vec.length () + 1;
   c->next_interp = 0;
+  c->first_interp = c->cand_num;
   c->dependent = 0;
   c->sibling = 0;
   c->def_phi = kind == CAND_MULT ? find_phi_def (base) : 0;
@@ -1261,6 +1266,7 @@ slsr_process_mul (gimple *gs, tree rhs1, tree rhs2
 is the stride and RHS2 is the base expression.  */
   c2 = create_mul_ssa_cand (gs, rhs2, rhs1, speed);
   c->next_interp = c2->cand_num;
+  c2->first_interp = c->cand_num;
 }
   else if (TREE_CODE (rhs2) == INTEGER_CST)
 {
@@ -1498,7 +1504,10 @@ slsr_process_add (gimple *gs, tree rhs1, tree rhs2
{
  c2 = create_add_ssa_cand (gs, rhs2, rhs1, false, speed);
  if (c)
-   c->next_interp = c2->cand_num;
+   {
+ c->next_interp = c2->cand_num;
+ c2->first_interp = c->cand_num;
+   }
  else
add_cand_for_stmt (gs, c2);
}
@@ -1621,6 +1630,8 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
 
   if (base_cand && base_cand->kind != CAND_PHI)
 {
+  slsr_cand_t first_cand = NULL;
+
   while (base_cand)
{
  /* Propagate all data from the base candidate except the type,
@@ -1635,6 +1646,12 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
 base_cand->index, base_cand->stride,
 ctype, base_cand->stride_type,
 savings);
+ if (!first_cand)
+   first_cand = c;
+
+ if (first_cand != c)
+   c->first_interp = first_cand->cand_num;
+
  if (base_cand->next_interp)
base_cand = lookup_cand (base_cand->next_interp);
  else
@@ -1657,6 +1674,7 @@ slsr_process_cast (gimple *gs, tree rhs1, bool spe
   c2 = alloc_cand_and_find_basis (CAND_MULT, gs, rhs1, 0,
  integer_one_node, ctype, sizetype, 0);
   c->next_interp = c2->cand_num;
+  c2->first_interp = c->cand_num;
 }
 
   /* Add the first (or only) interpretation to the statement-candidate
@@ -1681,6 +1699,8 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
 
   if (base_cand && base_cand->kind != CAND_PHI)
 {
+  slsr_cand_t first_cand = NULL;
+
   while (base_cand)
{
  /* Propagate all data from the base candidate.  */
@@ -1693,6 +1713,12 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
 base_cand->index, base_cand->stride,
 base_cand->cand_type,
 base_cand->stride_type, savings);
+ if (!first_cand)
+   first_cand = c;
+
+ if (first_cand != c)
+   c->first_interp = first_cand->cand_num;
+
  if (base_cand->next_interp)
base_cand = lookup_cand (base_cand->next_interp);
  else
@@ -1717,6 +1743,7 @@ slsr_process_copy (gimple *gs, tree rhs1, bool spe
  

Re: [PATCH] Print working directory to gcov files (PR gcov-profile/84846).

2018-05-22 Thread Eric Botcazou
> How do you mean that? Why would be that dependent?

I don't really understand the question...  The coverage result contains the
working directory where the program was run, so by definition it depends on the
working directory.  Put differently: run the same program in 2 different
directories and compare the .gcov files; they are now different, which is a
very effective way of breaking automatic coverage testing.

-- 
Eric Botcazou


Re: [PATCH] testsuite: Introduce be/le selectors

2018-05-22 Thread Jeff Law
On 05/21/2018 03:46 PM, Segher Boessenkool wrote:
> This patch creates "be" and "le" selectors, which can be used by all
> architectures, similar to ilp32 and lp64.
> 
> Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-21  Segher Boessenkool  
> 
> gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_be): New.
>   (check_effective_target_le): New.
I think this is fine.  "be" "le" are used all over the place in gcc and
the kernel to denote big/little endian.
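
For readers unfamiliar with what the selectors distinguish, a small
sketch of a byte-order probe in C follows.  The function name is
hypothetical, and the directive in the comment assumes the new selectors
are used the same way as ilp32 and lp64; the patch itself is not shown
here, so take that as an assumption.

```c
#include <stdint.h>

/* Assumed usage in a test, by analogy with ilp32/lp64:
   { dg-do run { target le } }  */

/* Return 1 if the least significant byte of a multi-byte integer is
   stored first (little endian), 0 otherwise.  */
static int
is_little_endian (void)
{
  const uint16_t probe = 1;
  return *(const unsigned char *) &probe == 1;
}
```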

jeff


Re: [PATCH 1/2] Introduce prefetch-minimum stride option

2018-05-22 Thread Jeff Law
On 05/22/2018 12:55 PM, Luis Machado wrote:
> 
> 
> On 05/16/2018 08:18 AM, Luis Machado wrote:
>>
>>
>> On 05/16/2018 06:08 AM, Kyrill Tkachov wrote:
>>>
>>> On 15/05/18 12:12, Luis Machado wrote:
>>>> Hi,
>>>>
>>>> On 05/15/2018 06:37 AM, Kyrill Tkachov wrote:
>>>>> Hi Luis,
>>>>>
>>>>> On 14/05/18 22:18, Luis Machado wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Here's an updated version of the patch (now reverted) that
>>>>>> addresses the previous bootstrap problem (signedness and long
>>>>>> long/int conversion).
>>>>>>
>>>>>> I've checked that it bootstraps properly on both aarch64-linux and
>>>>>> x86_64-linux and that tests look sane.
>>>>>>
>>>>>> James, would you please give this one a try to see if you can
>>>>>> still reproduce PR85682? I couldn't reproduce it in multiple
>>>>>> attempts.
>>>>>>
>>>>>
>>>>> The patch doesn't hit the regressions in PR85682 from what I can see.
>>>>> I have a comment on the patch below.
>>>>>
>>>>
>>>> Great. Thanks for checking Kyrill.
>>>>
>>>>> --- a/gcc/tree-ssa-loop-prefetch.c
>>>>> +++ b/gcc/tree-ssa-loop-prefetch.c
>>>>> @@ -992,6 +992,23 @@ prune_by_reuse (struct mem_ref_group *groups)
>>>>>   static bool
>>>>>   should_issue_prefetch_p (struct mem_ref *ref)
>>>>>   {
>>>>> +  /* Some processors may have a hardware prefetcher that may
>>>>> conflict with
>>>>> + prefetch hints for a range of strides.  Make sure we don't issue
>>>>> + prefetches for such cases if the stride is within this
>>>>> particular
>>>>> + range.  */
>>>>> +  if (cst_and_fits_in_hwi (ref->group->step)
>>>>> +  && abs_hwi (int_cst_value (ref->group->step)) <
>>>>> +  (HOST_WIDE_INT) PREFETCH_MINIMUM_STRIDE)
>>>>> +    {
>>>>>
>>>>> The '<' should go on the line below together with
>>>>> PREFETCH_MINIMUM_STRIDE.
>>>>
>>>> I've fixed this locally now.
>>>
>>> Thanks. I haven't followed the patch in detail, are you looking for
>>> midend changes approval since the last version?
>>> Or do you need aarch64 approval?
>>
>> The changes are not substantial, but midend approval is what I was
>> aiming at.
>>
>> Also the confirmation that PR85682 is no longer happening.
> 
> James confirmed PR85682 is no longer reproducible with the updated patch
> and the bootstrap issue is fixed now. So i take it this should be OK to
> push to mainline?
> 
> Also, i'd like to discuss the possibility of having these couple options
> backported to GCC 8. As is, the changes don't alter code generation by
> default, but they allow better tuning of the software prefetcher for
> targets that benefit from it.
> 
> Maybe after letting the changes bake on mainline enough to be confirmed
> stable?
OK for the trunk.  But they don't really seem appropriate for the
release branches.  We're primarily concerned with correctness issues on
the release branches.

jeff


Re: [Patch, Fortran] PR 85841: [F2018] reject deleted features

2018-05-22 Thread Janus Weil
2018-05-22 20:56 GMT+02:00 H.J. Lu :
> On Mon, May 21, 2018 at 10:47 PM, Janus Weil  wrote:
>> 2018-05-21 18:57 GMT+02:00 Steve Kargl :
>>> On Mon, May 21, 2018 at 12:14:13PM +0200, Janus Weil wrote:
>>>>
>>>> So, here is the promised follow-up patch. It mostly removes
>>>> GFC_STD_F2008_TS and replaces it by GFC_STD_F2018 in a mechanical
>>>> manner. Plus, it fixes the resulting fallout in the testsuite and
>>>> updates the documentation. The non-mechanical parts are libgfortran.h
>>>> and options.c. Regtests cleanly. Ok for trunk with a suitable
>>>> ChangeLog?
>>>
>>> Looks good to me.
>>
>> I have now also committed this follow-up as r260499 (after checking
>> that not only the gfortran testsuite is regression-free, but also the
>> Fortran part of the libgomp testsuite).
>>
>
> Another one:
>
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gfortran.dg/pr30667.f:9:72:
> Warning: Fortran 2018 deleted feature: DO termination statement which
> is not END DO or CONTINUE with label 100 at (1)^M
> FAIL: gfortran.dg/pr30667.f   -O  (test for excess errors)
> Excess errors:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gfortran.dg/pr30667.f:9:72:
> Warning: Fortran 2018 deleted feature: DO termination statement which
> is not END DO or CONTINUE with label 100 at (1)

I added "-std=legacy" for this case in r260555, which should fix it.

Cheers,
Janus


Re: Replace FMA_EXPR with one internal fn per optab

2018-05-22 Thread H.J. Lu
On Thu, May 17, 2018 at 1:56 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>>> @@ -2698,23 +2703,26 @@ convert_mult_to_fma_1 (tree mul_result,
>>>  }
>>
>>> if (negate_p)
>>> -   mulop1 = force_gimple_operand_gsi (&gsi,
>>> -  build1 (NEGATE_EXPR,
>>> -  type, mulop1),
>>> -  true, NULL_TREE, true,
>>> -  GSI_SAME_STMT);
>>> +   mulop1 = gimple_build (&seq, NEGATE_EXPR, type, mulop1);
>>
>>> -  fma_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt),
>>> - FMA_EXPR, mulop1, op2, addop);
>>> +  if (seq)
>>> +   gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
>>> +  fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
>>> +  gimple_call_set_lhs (fma_stmt, gimple_assign_lhs (use_stmt));
>>> +  gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (use_stmt));
>>> +  gsi_replace (&gsi, fma_stmt, true);
>>> +  /* Valueize aggressively so that we generate FMS, FNMA and FNMS
>>> +regardless of where the negation occurs.  */
>>> +  if (fold_stmt (&gsi, aggressive_valueize))
>>> +   update_stmt (gsi_stmt (gsi));
>>
>> I think it would be nice to be able to use gimple_build () with IFNs so you
>> can
>> gimple_build () the IFN and then use gsi_replace_with_seq () on it.  You
>> only need to fold with generated negates, not with negates already in the
>> IL?
>> Then the folding implied with gimple_build will take care of it.
>
> The idea was to pick up existing negates that feed the multiplication
> as well as any added by the pass itself.
>
> On IRC yesterday we talked about how this should handle the ECF_NOTHROW
> flag, and whether things like IFN_SQRT and IFN_FMA should always be
> nothrow (like the built-in functions are).  But in the end I thought
> it'd be better to keep things as they are.  We already handle
> -fnon-call-exceptions for unfused a * b + c and before the patch also
> handled it for FMA_EXPR.  It'd seem like a step backwards if the new
> internal functions didn't handle it too.  If anything it seems like the
> built-in functions should change to be closer to the tree_code and
> internal_fn way of doing things, if we want to support -fnon-call-exceptions
> properly.
>
> This also surprised me when doing the if-conversion patch I sent yesterday.
> We're happy to vectorise:
>
>   for (int i = 0; i < 100; ++i)
> x[i] = ... ? sqrt (x[i]) : 0;
>
> by doing the sqrt unconditionally and selecting on the result, even with
> the default maths flags, but refuse to vectorise the simpler:
>
>   for (int i = 0; i < 100; ++i)
> x[i] = ... ? x[i] + 1 : 0;
>
> in the same way.
>
>> Otherwise can you please move aggressive_valueize to gimple-fold.[ch]
>> alongside no_follow_ssa_edges / follow_single_use_edges and maybe
>> rename it as follow_all_ssa_edges?
>
> Ah, yeah, that's definitely a better name.
>
> I also renamed all_scalar_fma to scalar_all_fma, since I realised
> after Andrew's reply that the old name made it sound like it was
> "all scalars", whereas it was meant to mean "all fmas".
>
> Tested as before.
>
> Thanks,
> Richard
>
> 2018-05-17  Richard Sandiford  
>
> gcc/
> * doc/sourcebuild.texi (scalar_all_fma): Document.
> * tree.def (FMA_EXPR): Delete.
> * internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
> * internal-fn.c (ternary_direct): New macro.
> (expand_ternary_optab_fn): Likewise.
> (direct_ternary_optab_supported_p): Likewise.
> * Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
> * builtins.c (fold_builtin_fma): Delete.
> (fold_builtin_3): Don't call it.
> * cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
> * expr.c (expand_expr_real_2): Likewise.
> * fold-const.c (operand_equal_p): Likewise.
> (fold_ternary_loc): Likewise.
> * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> * gimple.c (DEFTREECODE): Likewise.
> * gimplify.c (gimplify_expr): Likewise.
> * optabs-tree.c (optab_for_tree_code): Likewise.
> * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> * tree-eh.c (operation_could_trap_p): Likewise.
> (stmt_could_throw_1_p): Likewise.
> * tree-inline.c (estimate_operator_cost): Likewise.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> (op_code_prio): Likewise.
> * tree-ssa-loop-im.c (stmt_cost): Likewise.
> * tree-ssa-operands.c (get_expr_operands): Likewise.
> * tree.c (commutative_ternary_tree_code, add_expr): Likewise.
> * fold-const-call.h (fold_fma): Delete.
> * fold-const-call.c (fold_const_call_): Handle CFN_FMS,
>  

Re: [PATCH] Minor documentation correction in aarch64-simd.md

2018-05-22 Thread James Greenhalgh
On Wed, Apr 25, 2018 at 03:17:28PM -0500, Indu Bhagat wrote:
> In function minmax_replacement in tree-ssa-phiopt.c, MIN_EXPR/MAX_EXPR 
> are substituted for when the following condition is false - (HONOR_NANS 
> (type) || HONOR_SIGNED_ZEROS (type)). So for FP mode, this is false when 
> _both_ of the following conditions are fulfilled : 1. flag_signed_zeros 
> is zero and 2. flag_finite_math_only is set. So, the documentation in 
> aarch64-simd.md is partially misleading. Here is a patch to correct 
> that. Thanks

This is OK for trunk.

Thanks,
James

> gcc/ChangeLog:
> 
>  * config/aarch64/aarch64-simd.md: correct flags text for 
> MIN_EXPR replacement

> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 1154fc3..7fd20fd 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2211,8 +2211,9 @@
>  ;; Max/Min are introduced by idiom recognition by GCC's mid-end.  An
>  ;; expression like:
>  ;;  a = (b < c) ? b : c;
> -;; is idiom-matched as MIN_EXPR only if -ffinite-math-only is enabled
> -;; either explicitly or indirectly via -ffast-math.
> +;; is idiom-matched as MIN_EXPR only if -ffinite-math-only and
> +;; -fno-signed-zeros are enabled either explicitly or indirectly via
> +;; -ffast-math.
>  ;;
>  ;; MIN_EXPR and MAX_EXPR eventually map to 'smin' and 'smax' in RTL.
>  ;; The 'smax' and 'smin' RTL standard pattern names do not specify which



Re: [Patch, Fortran] PR 85841: [F2018] reject deleted features

2018-05-22 Thread H.J. Lu
On Mon, May 21, 2018 at 10:47 PM, Janus Weil  wrote:
> 2018-05-21 18:57 GMT+02:00 Steve Kargl :
>> On Mon, May 21, 2018 at 12:14:13PM +0200, Janus Weil wrote:
>>>
>>> So, here is the promised follow-up patch. It mostly removes
>>> GFC_STD_F2008_TS and replaces it by GFC_STD_F2018 in a mechanical
>>> manner. Plus, it fixes the resulting fallout in the testsuite and
>>> updates the documentation. The non-mechanical parts are libgfortran.h
>>> and options.c. Regtests cleanly. Ok for trunk with a suitable
>>> ChangeLog?
>>
>> Looks good to me.
>
> I have now also committed this follow-up as r260499 (after checking
> that not only the gfortran testsuite is regression-free, but also the
> Fortran part of the libgomp testsuite).
>

Another one:

/export/gnu/import/git/sources/gcc/gcc/testsuite/gfortran.dg/pr30667.f:9:72:
Warning: Fortran 2018 deleted feature: DO termination statement which
is not END DO or CONTINUE with label 100 at (1)^M
FAIL: gfortran.dg/pr30667.f   -O  (test for excess errors)
Excess errors:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gfortran.dg/pr30667.f:9:72:
Warning: Fortran 2018 deleted feature: DO termination statement which
is not END DO or CONTINUE with label 100 at (1)



-- 
H.J.


Re: [PATCH 1/2] Introduce prefetch-minimum stride option

2018-05-22 Thread Luis Machado



On 05/16/2018 08:18 AM, Luis Machado wrote:



On 05/16/2018 06:08 AM, Kyrill Tkachov wrote:


On 15/05/18 12:12, Luis Machado wrote:

Hi,

On 05/15/2018 06:37 AM, Kyrill Tkachov wrote:

Hi Luis,

On 14/05/18 22:18, Luis Machado wrote:

Hi,

Here's an updated version of the patch (now reverted) that 
addresses the previous bootstrap problem (signedness and long 
long/int conversion).


I've checked that it bootstraps properly on both aarch64-linux and 
x86_64-linux and that tests look sane.


James, would you please give this one a try to see if you can still 
reproduce PR85682? I couldn't reproduce it in multiple attempts.




The patch doesn't hit the regressions in PR85682 from what I can see.
I have a comment on the patch below.



Great. Thanks for checking Kyrill.


--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -992,6 +992,23 @@ prune_by_reuse (struct mem_ref_group *groups)
  static bool
  should_issue_prefetch_p (struct mem_ref *ref)
  {
+  /* Some processors may have a hardware prefetcher that may conflict with
+ prefetch hints for a range of strides.  Make sure we don't issue
+ prefetches for such cases if the stride is within this particular
+ range.  */
+  if (cst_and_fits_in_hwi (ref->group->step)
+  && abs_hwi (int_cst_value (ref->group->step)) <
+  (HOST_WIDE_INT) PREFETCH_MINIMUM_STRIDE)
+    {

The '<' should go on the line below together with 
PREFETCH_MINIMUM_STRIDE.


I've fixed this locally now.


Thanks. I haven't followed the patch in detail, are you looking for 
midend changes approval since the last version?

Or do you need aarch64 approval?


The changes are not substantial, but midend approval is what I was aiming
at.


Also the confirmation that PR85682 is no longer happening.


James confirmed PR85682 is no longer reproducible with the updated patch 
and the bootstrap issue is fixed now. So i take it this should be OK to 
push to mainline?


Also, i'd like to discuss the possibility of having these couple options 
backported to GCC 8. As is, the changes don't alter code generation by 
default, but they allow better tuning of the software prefetcher for 
targets that benefit from it.
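
As a rough illustration of the check under discussion, here is a
standalone sketch.  It is not the code from the patch: the function name
and the convention that a non-positive threshold disables the check are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a software prefetch should still be considered for a
   reference with constant stride STEP (in bytes).  MIN_STRIDE is the
   smallest stride magnitude worth prefetching; a value <= 0 is assumed
   to mean "no restriction", so default behaviour is unchanged.  */
static bool
stride_allows_prefetch (int64_t step, int64_t min_stride)
{
  /* Take the magnitude in unsigned arithmetic so INT64_MIN is safe.  */
  uint64_t magnitude = step < 0 ? 0 - (uint64_t) step : (uint64_t) step;

  if (min_stride <= 0)
    return true;
  return magnitude >= (uint64_t) min_stride;
}
```

Keeping both sides of the comparison in one unsigned 64-bit type avoids
the kind of signedness and width mismatch mentioned earlier in the
thread.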


Maybe after letting the changes bake on mainline enough to be confirmed 
stable?


Thanks,
Luis


[RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions with AVX512F

2018-05-22 Thread Uros Bizjak
Hello!

Attached patch implements scalar unsigned int->float conversions with AVX512F.
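
For context, without a native unsigned conversion instruction the
conversion has to be synthesized from the signed one.  The sketch below
shows one common way to do that for 64-bit inputs; it is an illustration
only, the function name is hypothetical, and it is not claimed to match
what x86_emit_floatuns generates.

```c
#include <stdint.h>

/* Convert an unsigned 64-bit value to double using only a signed
   64-bit conversion.  */
static double
u64_to_double_via_signed (uint64_t x)
{
  uint64_t half;
  double d;

  /* Values that fit in a signed 64-bit integer convert directly.  */
  if (x <= (uint64_t) INT64_MAX)
    return (double) (int64_t) x;

  /* Otherwise halve the value, keeping the dropped bit as a sticky bit
     so that rounding is unaffected, convert, and double the result.  */
  half = (x >> 1) | (x & 1);
  d = (double) (int64_t) half;
  return d + d;
}
```

A single AVX512F instruction replaces this whole sequence, which is the
motivation for the new insn pattern.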

2018-05-22  Uros Bizjak  

* config/i386/i386.md (*floatuns2_avx512):
New insn pattern.
(floatunssi2): Also enable for AVX512F and TARGET_SSE_MATH.
Rewrite expander pattern.  Emit gen_floatunssi2_i387_with_xmm
for non-SSE modes.
(floatunsdisf2): Rewrite expander pattern.  Handle TARGET_AVX512F.
(floatunsdidf2): Ditto.

testsuite/ChangeLog:

2018-05-22  Uros Bizjak  

* gcc.target/i386/cvt-3.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32}, but not tested on an AVX512 target.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 260441)
+++ config/i386/i386.md (working copy)
@@ -5615,16 +5615,26 @@
   DONE;
 })
 
+(define_insn "*floatuns2_avx512"
+  [(set (match_operand:MODEF 0 "register_operand" "=v")
+   (unsigned_float:MODEF
+ (match_operand:SWI48 1 "nonimmediate_operand" "rm")))]
+  "TARGET_AVX512F && TARGET_SSE_MATH"
+  "vcvtusi2\t{%1, %0, %0|%0, %0, %1}"
+  [(set_attr "type" "sseicvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 ;; Avoid store forwarding (partial memory) stall penalty by extending
 ;; SImode value to DImode through XMM register instead of pushing two
 ;; SImode values to stack. Also note that fild loads from memory only.
 
-(define_insn_and_split "*floatunssi2_i387_with_xmm"
+(define_insn_and_split "floatunssi2_i387_with_xmm"
   [(set (match_operand:X87MODEF 0 "register_operand" "=f")
(unsigned_float:X87MODEF
  (match_operand:SI 1 "nonimmediate_operand" "rm")))
-   (clobber (match_scratch:DI 3 "=x"))
-   (clobber (match_operand:DI 2 "memory_operand" "=m"))]
+   (clobber (match_operand:DI 2 "memory_operand" "=m"))
+   (clobber (match_scratch:DI 3 "=x"))]
   "!TARGET_64BIT
&& TARGET_80387 && X87_ENABLE_FLOAT (mode, DImode)
&& TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC"
@@ -5639,43 +5649,59 @@
(set_attr "mode" "")])
 
 (define_expand "floatunssi2"
-  [(parallel
- [(set (match_operand:X87MODEF 0 "register_operand")
-  (unsigned_float:X87MODEF
-(match_operand:SI 1 "nonimmediate_operand")))
-  (clobber (match_scratch:DI 3))
-  (clobber (match_dup 2))])]
-  "!TARGET_64BIT
-   && ((TARGET_80387 && X87_ENABLE_FLOAT (mode, DImode)
-   && TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC)
-   || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))"
+  [(set (match_operand:X87MODEF 0 "register_operand")
+   (unsigned_float:X87MODEF
+ (match_operand:SI 1 "nonimmediate_operand")))]
+  "(!TARGET_64BIT
+&& TARGET_80387 && X87_ENABLE_FLOAT (mode, DImode)
+&& TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC)
+   || ((!TARGET_64BIT || TARGET_AVX512F)
+   && SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)"
 {
-  if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+  if (!(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
 {
+  emit_insn (gen_floatunssi2_i387_with_xmm
+ (operands[0], operands[1],
+  assign_386_stack_local (DImode, SLOT_TEMP)));
+  DONE;
+}
+  if (!TARGET_AVX512F)
+{
   ix86_expand_convert_uns_si_sse (operands[0], operands[1]);
   DONE;
 }
-  else
-operands[2] = assign_386_stack_local (DImode, SLOT_TEMP);
 })
 
 (define_expand "floatunsdisf2"
-  [(use (match_operand:SF 0 "register_operand"))
-   (use (match_operand:DI 1 "nonimmediate_operand"))]
+  [(set (match_operand:SF 0 "register_operand")
+   (unsigned_float:SF
+ (match_operand:DI 1 "nonimmediate_operand")))]
   "TARGET_64BIT && TARGET_SSE && TARGET_SSE_MATH"
-  "x86_emit_floatuns (operands); DONE;")
+{
+  if (!TARGET_AVX512F)
+{
+  x86_emit_floatuns (operands);
+  DONE;
+}
+})
 
 (define_expand "floatunsdidf2"
-  [(use (match_operand:DF 0 "register_operand"))
-   (use (match_operand:DI 1 "nonimmediate_operand"))]
-  "(TARGET_64BIT || TARGET_KEEPS_VECTOR_ALIGNED_STACK)
+  [(set (match_operand:DF 0 "register_operand")
+   (unsigned_float:DF
+ (match_operand:DI 1 "nonimmediate_operand")))]
+  "(TARGET_KEEPS_VECTOR_ALIGNED_STACK || TARGET_AVX512F)
&& TARGET_SSE2 && TARGET_SSE_MATH"
 {
-  if (TARGET_64BIT)
-x86_emit_floatuns (operands);
-  else
-ix86_expand_convert_uns_didf_sse (operands[0], operands[1]);
-  DONE;
+  if (!TARGET_64BIT)
+{
+  ix86_expand_convert_uns_didf_sse (operands[0], operands[1]);
+  DONE;
+}
+  if (!TARGET_AVX512F)
+{
+  x86_emit_floatuns (operands);
+  DONE;
+}
 })
 
 ;; Load effective address instructions
Index: testsuite/gcc.target/i386/cvt-3.c
===
--- testsuite/gcc.target/i386/cvt-3.c   (nonexistent)
+++ testsuite/gcc.target/i386/cvt-3.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f 

Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-22 Thread Richard Biener
On May 22, 2018 6:53:57 PM GMT+02:00, Joseph Myers  
wrote:
>On Tue, 22 May 2018, Richard Biener wrote:
>
>> +  if (*sptr & (1 << (CHAR_BIT * sizeof (T) - 1)))
>> +gcc_unreachable ();
>> +  m_flag = 1 << ((CHAR_BIT * sizeof (T)) - clz_hwi (*sptr));
>
>I don't see how the use of clz_hwi works with a type T that may be 
>narrower than HOST_WIDE_INT.  Surely this logic requires a count of 
>leading zeros in something of type T, not a possibly larger number of 
>leading zeros after conversion to HOST_WIDE_INT?  Also, if T is wider than
>int, shifting plain 1 won't work here.

I messed up the conversion to a template. The bitnum should be subtracted from 
HOST_BITS_PER_WIDE_INT and yes, 1 in unsigned hwi should be shifted. 
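
A standalone sketch of the corrected arithmetic, for illustration only.
It uses a fixed 64-bit type in place of HOST_WIDE_INT and a portable
loop in place of clz_hwi; all names are hypothetical and this is not the
code from the patch.

```c
#include <stdint.h>

/* Count leading zeros of a 64-bit value; returns 64 for zero.  */
static int
clz64 (uint64_t x)
{
  int n = 0;
  if (x == 0)
    return 64;
  while (!(x & (UINT64_C (1) << 63)))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Mismatched widths: a 64-bit leading-zero count subtracted from the
   width of a 32-bit type gives a meaningless bit number.  */
static int
buggy_bitnum32 (uint32_t allocated)
{
  return 32 - clz64 (allocated);
}

/* Corrected: subtract from the width the count was taken in, and shift
   a 1 of that same wide unsigned type.  Returns 0 if no bit is free.  */
static uint32_t
next_flag32 (uint32_t allocated)
{
  int bitnum = 64 - clz64 (allocated);
  if (bitnum >= 32)
    return 0;
  return (uint32_t) (UINT64_C (1) << bitnum);
}
```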

Richard. 



[PATCH v2] Don't mark IFUNC resolver as only called directly

2018-05-22 Thread H.J. Lu
On Tue, May 22, 2018 at 10:11 AM, H.J. Lu  wrote:
> On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
>>> > >  class ipa_opt_pass_d;
>>> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
>>> > > @@ -2894,7 +2896,8 @@ 
>>> > > cgraph_node::only_called_directly_or_aliased_p (void)
>>> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
>>> > >   && !DECL_STATIC_DESTRUCTOR (decl)
>>> > >   && !used_from_object_file_p ()
>>> > > - && !externally_visible);
>>> > > + && !externally_visible
>>> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)));
>>> >
>>> > How's it handled for our own generated resolver functions?  That is,
>>> > isn't there sth cheaper than doing a lookup_attribute here?  I see
>>> > that make_dispatcher_decl nor ix86_get_function_versions_dispatcher
>>> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
>>> 
>>>  Is there any drawback of setting force_output flag?
>>>  Honza
>>> >>>
>>> >>> Setting force_output may prevent some optimizations.  Can we add a bit
>>> >>> for IFUNC resolver?
>>> >>>
>>> >>
>>> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on x86-64
>>> >> and i686.  Any comments?
>>> >>
>>> >
>>> > PING:
>>> >
>>> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
>>> >
>>>
>>> PING.
>> OK, but please extend the verifier that ifunc_resolver flag is equivalent to
>> lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
>> so we are sure things stays in sync.
>>
>
> Like this
>
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index 80f6f910c3b..954920b6dff 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -998,6 +998,13 @@ symtab_node::verify_base (void)
>error ("function symbol is not function");
>error_found = true;
>}
> +  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
> + != NULL)
> + != dyn_cast <cgraph_node *> (this)->ifunc_resolver)
> +  {
> +  error ("inconsistent `ifunc' attribute");
> +  error_found = true;
> +  }
>  }
>else if (is_a <varpool_node *> (this))
>  {
>
>
> Thanks.
>

This is the patch I am checking in.  Tested on x86-64.

Thanks.

-- 
H.J.
From 91d0b4bc0222ce85bd529a7b3ae9e11904802c26 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 11 Apr 2018 12:31:21 -0700
Subject: [PATCH] Don't mark IFUNC resolver as only called directly

Since IFUNC resolver is called indirectly, don't mark IFUNC resolver as
only called directly.  This patch adds ifunc_resolver to cgraph_node,
sets ifunc_resolver for ifunc attribute and checks ifunc_resolver
instead of looking up ifunc attribute.
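
The pattern can be sketched outside of GCC as follows.  This is only a
model: struct fn_node and every function below are hypothetical
stand-ins, not cgraph_node or its members.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a call-graph node.  */
struct fn_node
{
  const char *const *attrs;  /* attribute names on the declaration */
  size_t n_attrs;
  bool externally_visible;
  bool ifunc_resolver;       /* cached: declaration has "ifunc" */
};

/* Slow path: scan the attribute list.  */
static bool
has_attr (const struct fn_node *n, const char *name)
{
  for (size_t i = 0; i < n->n_attrs; i++)
    if (strcmp (n->attrs[i], name) == 0)
      return true;
  return false;
}

/* Set the cached bit once, when the node is created.  */
static void
init_node (struct fn_node *n, const char *const *attrs, size_t n_attrs)
{
  n->attrs = attrs;
  n->n_attrs = n_attrs;
  n->externally_visible = false;
  n->ifunc_resolver = has_attr (n, "ifunc");
}

/* Hot query: consult the bit instead of scanning attributes.  */
static bool
only_called_directly_p (const struct fn_node *n)
{
  return !n->externally_visible && !n->ifunc_resolver;
}

/* Verifier: the cached bit must agree with the attribute list.  */
static bool
verify_node (const struct fn_node *n)
{
  return has_attr (n, "ifunc") == n->ifunc_resolver;
}
```

The verifier is what keeps the cheap flag honest: if some code path
adds the attribute without setting the bit, the mismatch is reported
rather than silently ignored.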

gcc/

	PR target/85345
	* cgraph.h (cgraph_node::create): Set ifunc_resolver for ifunc
	attribute.
	(cgraph_node::create_alias): Likewise.
	(cgraph_node::get_availability): Check ifunc_resolver instead
	of looking up ifunc attribute.
	* cgraphunit.c (maybe_diag_incompatible_alias): Likewise.
	* varasm.c (do_assemble_alias): Likewise.
	(assemble_alias): Likewise.
	(default_binds_local_p_3): Likewise.
	* cgraph.h (cgraph_node): Add ifunc_resolver.
	(cgraph_node::only_called_directly_or_aliased_p): Return false
	for IFUNC resolver.
	* lto-cgraph.c (input_node): Set ifunc_resolver for ifunc
	attribute.
	* symtab.c (symtab_node::verify_base): Verify that ifunc_resolver
	is equivalent to lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)).
	(symtab_node::binds_to_current_def_p): Check ifunc_resolver
	instead of looking up ifunc attribute.

gcc/testsuite/

	PR target/85345
	* gcc.target/i386/pr85345.c: New test.
---
 gcc/cgraph.c|  7 +++-
 gcc/cgraph.h|  4 +++
 gcc/cgraphunit.c|  2 +-
 gcc/lto-cgraph.c|  2 ++
 gcc/symtab.c| 11 +--
 gcc/testsuite/gcc.target/i386/pr85345.c | 44 +
 gcc/varasm.c|  8 +++--
 7 files changed, 71 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85345.c

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 9a7d54d7cee..9f3a2929f6b 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -517,6 +517,9 @@ cgraph_node::create (tree decl)
 	g->have_offload = true;
 }
 
+  if (lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)))
+node->ifunc_resolver = true;
+
   node->register_symbol ();
 
   if (DECL_CONTEXT (decl) && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
@@ -575,6 +578,8 @@ cgraph_node::create_alias (tree alias, tree target)
   alias_node->alias = true;
   if (lookup_attribute ("weakref", DECL_ATTRIBUTES (alias)) != NULL)
 alias_node->transparent_alias = alias_node->weakref = true;
+  if (lookup_attribute ("ifunc", DECL_ATTRIBUTES (alias)))
+alias_node->ifunc_resolver = true;
   return alias_node;
 }
 
@@ -2299,7 +2304,7 @@ cgraph_node::get_availability (symtab_node *ref)
 avail 

Re: [PATCH] Make __ibm128 a distinct type, patch 2 of 2 (PR 85657)

2018-05-22 Thread Michael Meissner
Evidently I forgot the patch.

[gcc]
2018-05-21  Michael Meissner  

PR target/85657
* config/rs6000/rs6000-builtin.def (BU_IBM128_2): New helper macro
for __builtin_{,un}pack_ibm128.
(PACK_IF): Declare __builtin_{,un}pack_ibm128.
(UNPACK_IF): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): The mask
for long double builtins (RS6000_BTM_LDBL128) requires that long
double is IBM extended double.
(rs6000_invalid_builtin): Add a new error message if the long
double {,un}pack builtins are used when long double is IEEE
128-bit floating point.
* config/rs6000/rs6000.h (RS6000_BTM_LDBL128): Update comment.
* doc/extend.texi (PowerPC builtins): Update documentation for
__builtin_{,un}pack_longdouble.  Add documentation for
__builtin_{,un}pack_ibm128.

[gcc/testsuite]
2018-05-21  Michael Meissner  

PR target/85657
* gcc.target/powerpc/pr85657-4.c: New tests for pack/unpack
__ibm128 builtin functions, and whether an appropriate error
message is generated if the long double pack/unpack are used when
long double is IEEE 128.
* gcc.target/powerpc/pr85657-5.c: Likewise.
* gcc.target/powerpc/pr85657-6.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(.../trunk) (revision 
260267)
+++ gcc/config/rs6000/rs6000-builtin.def(.../branches/ibm/ieee) 
(revision 260381)
@@ -628,6 +628,17 @@
 | RS6000_BTC_BINARY),  \
CODE_FOR_ ## ICODE) /* ICODE */
 
+/* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
+   __ibm128 is available).  */
+#define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)   \
+  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */  \
+   "__builtin_" NAME,  /* NAME */  \
+   (RS6000_BTM_HARD_FLOAT  /* MASK */  \
+| RS6000_BTM_FLOAT128),\
+   (RS6000_BTC_ ## ATTR/* ATTR */  \
+| RS6000_BTC_BINARY),  \
+   CODE_FOR_ ## ICODE) /* ICODE */
+
 /* Miscellaneous builtins for instructions added in ISA 3.0.  These
instructions don't require either the DFP or VSX options, just the basic
ISA 3.0 enablement since they operate on general purpose registers.  */
@@ -2315,6 +2326,9 @@ BU_P9_64BIT_MISC_0 (DARN, "darn", MISC
 BU_LDBL128_2 (PACK_TF, "pack_longdouble",  CONST,  packtf)
 BU_LDBL128_2 (UNPACK_TF,   "unpack_longdouble",CONST,  unpacktf)
 
+BU_IBM128_2 (PACK_IF,  "pack_ibm128",  CONST,  packif)
+BU_IBM128_2 (UNPACK_IF,"unpack_ibm128",CONST,  unpackif)
+
 BU_P7_MISC_2 (PACK_V1TI,   "pack_vector_int128",   CONST,  packv1ti)
 BU_P7_MISC_2 (UNPACK_V1TI, "unpack_vector_int128", CONST,  unpackv1ti)
 
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (.../trunk) (revision 260267)
+++ gcc/config/rs6000/rs6000.c  (.../branches/ibm/ieee) (revision 260381)
@@ -3891,7 +3891,8 @@ rs6000_builtin_mask_calculate (void)
  | ((TARGET_HTM)   ? RS6000_BTM_HTM   : 0)
  | ((TARGET_DFP)   ? RS6000_BTM_DFP   : 0)
  | ((TARGET_HARD_FLOAT)? RS6000_BTM_HARD_FLOAT : 0)
- | ((TARGET_LONG_DOUBLE_128)   ? RS6000_BTM_LDBL128   : 0)
+ | ((TARGET_LONG_DOUBLE_128
+ && !TARGET_IEEEQUAD)  ? RS6000_BTM_LDBL128   : 0)
  | ((TARGET_FLOAT128_TYPE) ? RS6000_BTM_FLOAT128  : 0)
  | ((TARGET_FLOAT128_HW)   ? RS6000_BTM_FLOAT128_HW : 0));
 }
@@ -15311,6 +15312,10 @@ rs6000_invalid_builtin (enum rs6000_buil
   else if ((fnmask & RS6000_BTM_P9_MISC) == RS6000_BTM_P9_MISC)
 error ("builtin function %qs requires the %qs option", name,
   "-mcpu=power9");
+  else if ((fnmask & RS6000_BTM_LDBL128)
+  && (!TARGET_LONG_DOUBLE_128 || TARGET_IEEEQUAD))
+error ("builtin function %qs requires the %qs and %qs options",
+  name, "-mabi=ibmlongdouble", "-mlong-double-128");
   else if ((fnmask & (RS6000_BTM_HARD_FLOAT | RS6000_BTM_LDBL128))
   == (RS6000_BTM_HARD_FLOAT | RS6000_BTM_LDBL128))
 error ("builtin function %qs requires the %qs and %qs options",
Index: gcc/config/rs6000/rs6000.h

Re: [PATCH] use string length to relax -Wstringop-overflow for nonstrings (PR 85623)

2018-05-22 Thread Martin Sebor

On 05/21/2018 05:02 PM, Jeff Law wrote:

On 05/10/2018 01:26 PM, Martin Sebor wrote:

GCC 8.1 warns for unbounded (and some bounded) string comparisons
involving arrays declared attribute nonstring (i.e., char arrays
that need not be nul-terminated).  For instance:

  extern __attribute__((nonstring)) char a[4];

  int f (void)
  {
return strncmp (a, "123", sizeof a);
  }

  warning: ‘strcmp’ argument 1 declared attribute ‘nonstring’

Note that the warning refers to strcmp even though the call in
the source is to strncmp, because prior passes transform one to
the other.

The warning above is unnecessary (for strcmp) and incorrect for
strncmp because the call reads exactly four bytes from the non-
string array a regardless of the bound and so there is no risk
that it will read past the end of the array.

The attached change enhances the warning to use the length of
the string argument to suppress some of these needless warnings
for both bounded and unbounded string comparison functions.
When the length of the string is unknown, the warning uses its
size (when possible) as the upper bound on the number of accessed
bytes.  The change adds no new warnings.
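
As an illustration of why such a comparison is bounded (a sketch only, not part of the patch): strncmp stops at the bound or at the nul in the literal "123", whichever comes first, so at most four bytes of the array are examined even though the array has no terminator.  The names tag, tag_is_123 and compare_tag are made up for this example, and the nonstring attribute is left out so that the snippet builds with any C compiler.

```c
#include <string.h>

/* Four bytes with no terminating nul: the kind of array the nonstring
   attribute is meant for (hypothetical example data).  */
static const char tag[4] = { '1', '2', '3', '4' };

/* Returns nonzero if the first four bytes at P are "123" followed by a
   nul.  strncmp stops at the bound (4) or at the nul in "123",
   whichever comes first, so it never reads P[4].  */
int
tag_is_123 (const char *p)
{
  return strncmp (p, "123", 4) == 0;
}

/* Same shape as the call in the example above, applied to TAG.  */
int
compare_tag (void)
{
  return strncmp (tag, "123", sizeof tag);
}
```

Here compare_tag returns a positive value: the first three bytes match and the fourth byte '4' is compared against the nul that ends "123".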

I'm looking for approval to commit it to both trunk and 8-branch.

Martin

gcc-85623.diff


PR c/85623 - strncmp() warns about attribute 'nonstring' incorrectly in -Wstringop-overflow

gcc/ChangeLog:

PR c/85623
* calls.c (maybe_warn_nonstring_arg): Use string length to set
or ajust the presumed bound on an operation to avoid unnecessary
warnings.

s/ajust/adjust/



gcc/testsuite/ChangeLog:

PR c/85623
* c-c++-common/attr-nonstring-3.c: Adjust.
* c-c++-common/attr-nonstring-4.c: Adjust.
* c-c++-common/attr-nonstring-6.c: New test.

diff --git a/gcc/calls.c b/gcc/calls.c
index 9eb0467..f5c8ad4 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "builtins.h"
+#include "gimple-fold.h"

 /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
 #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
@@ -1612,15 +1613,36 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
   /* The bound argument to a bounded string function like strncpy.  */
   tree bound = NULL_TREE;

+  /* The range of lengths of a string argument to one of the comparison
+ functions.  If the length is less than the bound it is used instead.  */
+  tree lenrng[2] = { NULL_TREE, NULL_TREE };
+
   /* It's safe to call "bounded" string functions with a non-string
  argument since the functions provide an explicit bound for this
  purpose.  */
   switch (DECL_FUNCTION_CODE (fndecl))
 {
-case BUILT_IN_STPNCPY:
-case BUILT_IN_STPNCPY_CHK:
+case BUILT_IN_STRCMP:
 case BUILT_IN_STRNCMP:
 case BUILT_IN_STRNCASECMP:
+  {
+   /* For these, if one argument refers to one or more of a set
+  of string constants or arrays of known size, determine
+  the range of their known or possible lengths and use it
+  conservatively as the bound for the unbounded function,
+  and to adjust the range of the bound of the bounded ones.  */
+   unsigned stride = with_bounds ? 2 : 1;
+   for (unsigned argno = 0; argno < nargs && !*lenrng; argno += stride)
+ {
+   tree arg = CALL_EXPR_ARG (exp, argno);
+   if (!get_attr_nonstring_decl (arg))
+ get_range_strlen (arg, lenrng);
+ }
+  }
+  /* Fall through.  */
+
+case BUILT_IN_STPNCPY:
+case BUILT_IN_STPNCPY_CHK:
 case BUILT_IN_STRNCPY:
 case BUILT_IN_STRNCPY_CHK:
   {
@@ -1647,6 +1669,33 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
+   {
+ /* Replace the bound on the oparation with the upper bound

s/oparation/operation/

OK for the trunk with the nits fixed.

Also note that I've acked a patch from Martin L (I believe) that removes
the chkp/bounds checking bits that were deprecated in gcc-8.  So there's
some chance the bounds-related bits will need to be updated depending on
whether or not Martin L's patch has been committed.

This isn't strictly a regression.  So unless this is affecting some
critical code (i.e., glibc, the kernel, or something similar) this probably
would require an explicit OK from Jakub or Richi to be eligible for the
gcc-8 branch.


There are a number of warnings in Binutils/GDB that people have
been suppressing by pragmas because the attribute isn't always
effective, most due to bug 85643:

  https://sourceware.org/ml/binutils/2018-05/msg00212.html

I don't know if this bug is also among those instances (there
are several threads on the mailing lists and I may have missed
some).

If the warning for the strncmp() test case above isn't one of
them it certainly is a bug/oversight in the warning that makes
the attribute less useful than it's meant to be and the warning
more noisy.


Ping: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026

2018-05-22 Thread Qing Zhao
Ping for the following patch, sent 3 months ago at the end of GCC 8:

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg184075.html

I have rebased the patch on the latest GCC 9 trunk.

Bootstrapped and tested on both x86 and AArch64; no regressions.

the following are more details:



Hi, this is the 3rd version of this patch.

The main changes compared with the 2nd version are:

1. Do not use "compute_objsize" to get the minimum object size, per Jeff's
and Richard's comments.  Instead, add a new function "determine_min_objsize"
for this purpose.  This new function calls "compute_builtin_object_size" to
get the minimum object size.  However, because "compute_builtin_object_size"
does nothing for SSA_NAME when optimize > 0 (I don't know the exact reason
for this), I had to add more code inside "determine_min_objsize" to handle
some simple SSA_NAME cases.

2. In gimple-fold.c and tree-ssa-structalias.c, add handling of the new
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where
BUILT_IN_STRCMP and BUILT_IN_STRNCMP are checked.

3. Some formatting and comment changes per Jeff's comments.

Let me know if you have any comments.

Thanks a lot.

Qing

*

2nd Patch for PR78809
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

B.1. (PR83026) When the lengths of both arguments are constant and
 it's a strcmp:
   * if the lengths are NOT equal, we can safely fold the call
 to a non-zero value.
   * otherwise, do nothing now.

B.2. (PR78809) When the length of one argument is constant, try to replace
the call with a __builtin_str(n)cmp_eq call where possible, i.e:

strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length, C is a constant.
  if (C <= strlen(STR) && sizeof_array(s) > C)
{
  replace this call with
  __builtin_strncmp_eq (s, STR, C) (!)= 0
}
  if (C > strlen(STR))
{
  it can be safely treated as a call to strcmp (s, STR) (!)= 0
  and can be handled by the following strcmp case.
}

strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length.
  if  (sizeof_array(s) > strlen(STR))
{
  replace this call with
  __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
}

later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
as __builtin_memcmp_eq, if the expansion does not succeed, change them back
to call to __builtin_str(n)cmp.
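
The equivalence that B.2 relies on can be checked in plain C.  This is only a sketch of the reasoning, not the GCC implementation: streq_via_memcmp is a hypothetical helper standing in for the proposed __builtin_strcmp_eq, and the 8-byte buffer and the string "abc" are arbitrary choices.  Because the array is larger than strlen (STR), reading strlen (STR) + 1 bytes from it stays in bounds, and because the terminating nul of STR takes part in the comparison, the memcmp result is zero exactly when the strcmp result is.

```c
#include <string.h>

#define STR "abc"   /* constant string, strlen (STR) == 3 */

/* Hypothetical stand-in for __builtin_strcmp_eq (s, STR, strlen (STR) + 1).
   Valid only when the object S points to holds more than strlen (STR)
   bytes, so that memcmp cannot read out of bounds.  */
static int
streq_via_memcmp (const char *s)
{
  return memcmp (s, STR, strlen (STR) + 1) == 0;
}

/* Returns 1 if the strcmp form and the memcmp form agree for an 8-byte
   buffer holding (at most the first 7 characters of) TEXT.  */
int
forms_agree (const char *text)
{
  char buf[8] = { 0 };               /* sizeof buf > strlen (STR) */
  strncpy (buf, text, sizeof buf - 1);
  return (strcmp (buf, STR) == 0) == streq_via_memcmp (buf);
}
```

The B.1 case needs no helper: two strings whose constant lengths differ can never compare equal, so strcmp ("ab", "abc"), for instance, is always nonzero.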

Adding test cases strcmpopt_2.c and strcmpopt_4.c to gcc.dg for part B of
PR78809, and test case strcmpopt_3.c to gcc.dg for PR83026.

Bootstrapped and tested on both x86 and AArch64; no regressions.


gcc/ChangeLog

+2018-05-21  
+   
+   PR middle-end/78809
+   PR middle-end/83026
+   * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+   and BUILT_IN_STRNCMP_EQ.
+   * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+   BUILT_IN_STRNCMP_EQ.
+   * gimple-fold.c (gimple_fold_builtin_string_compare): Add the
+   handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+   (gimple_fold_builtin): Likewise.
+   * tree-ssa-strlen.c (compute_string_length): New function.
+   (determine_min_objsize): New function.
+   (handle_builtin_string_cmp): New function to handle calls to
+   string compare functions.
+   (strlen_optimize_stmt): Add handling to builtin string compare
+   calls. 
+   * tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+   Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+   * tree.c (build_common_builtin_nodes): Add new defines of
+   BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-05-21  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_2.c: New testcase.
+   * gcc.dg/strcmpopt_4.c: New testcase.
+
+   PR middle-end/83026
+   * gcc.dg/strcmpopt_3.c: New testcase.



PR78809_B.patch
Description: Binary data


Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-22 Thread Richard Sandiford
Wilco Dijkstra  writes:

> A recent commit removing '*' from the md files caused a large regression
> in h264ref.
> It turns out aarch64_ira_change_pseudo_allocno_class is no longer
> effective after the
> SVE changes, and the combination results in the regression.  This patch
> fixes it by
> using the new POINTER_AND_FP_REGS register class which is now used
> instead of ALL_REGS.
> Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>
> Passes regress, OK for commit?
>
> Since it is a regression introduced in GCC8, OK to backport to GCC8?
>
> ChangeLog:
> 2018-05-22  Wilco Dijkstra  
>
>   * config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
>   Use POINTER_AND_FP_REGS instead of ALL_REGS.
>   * config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase
> cost of r=w alternative.
> --
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> 2ebd256329c1a6a6b790d16955cbcee3feca456c..3d5fe44b53198a92afb726712c6e9dee890afe38
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2961,7 +2961,7 @@ (define_insn "*aarch64_get_lane_zero_extendsi"
>  ;; is guaranteed so upper bits should be considered undefined.
>  ;; RTL uses GCC vector extension indices throughout so flip only for 
> assembly.
>  (define_insn "aarch64_get_lane"
> -  [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=r, w, 
> Utv")
> +  [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, 
> Utv")
>   (vec_select:
> (match_operand:VALL_F16 1 "register_operand" "w, w, w")
> (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 47d98dfd095cdcd15908a86091cf2f8a4d6137b1..a119760c7f332aded200fa1b5bcfb1bbac7b6420
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1059,16 +1059,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const 
> char *msg)
>  }
>  
>  /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
> -   The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
> -   the same cost even if ALL_REGS has a much larger cost.  ALL_REGS is also
> -   used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
> -   cost (in this case the best class is the lowest cost one).  Using ALL_REGS
> -   irrespectively of its cost results in bad allocations with many redundant
> -   int<->FP moves which are expensive on various cores.
> -   To avoid this we don't allow ALL_REGS as the allocno class, but force a
> -   decision between FP_REGS and GENERAL_REGS.  We use the allocno class if it
> -   isn't ALL_REGS.  Similarly, use the best class if it isn't ALL_REGS.
> -   Otherwise set the allocno class depending on the mode.
> +   The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
> +   GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
> +   higher cost.  POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
> +   and GENERAL_REGS is lower than the memory cost (in this case the best 
> class
> +   is the lowest cost one).  Using POINTER_AND_FP_REGS irrespectively of its
> +   cost results in bad allocations with many redundant int<->FP moves which
> +   are expensive on various cores.
> +   To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
> +   force a decision between FP_REGS and GENERAL_REGS.  We use the allocno 
> class
> +   if it isn't POINTER_AND_FP_REGS.  Similarly, use the best class if it 
> isn't
> +   POINTER_AND_FP_REGS.  Otherwise set the allocno class depending on the 
> mode.
> The result of this is that it is no longer inefficient to have a higher
> memory move cost than the register move cost.
>  */
> @@ -1079,10 +1080,10 @@ aarch64_ira_change_pseudo_allocno_class (int regno, 
> reg_class_t allocno_class,
>  {
>machine_mode mode;
>  
> -  if (allocno_class != ALL_REGS)
> +  if (allocno_class != POINTER_AND_FP_REGS)
>  return allocno_class;
>  
> -  if (best_class != ALL_REGS)
> +  if (best_class != POINTER_AND_FP_REGS)
>  return best_class;
>  
>mode = PSEUDO_REGNO_MODE (regno);

I think it'd be better to use !reg_class_subset_p (POINTER_AND_FP_REGS, ...)
instead of ... != POINTER_AND_FP_REGS, since this in principle still applies
to ALL_REGS too.
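
To make the difference concrete, here is a toy model (not GCC code): register classes are represented as bitmasks and toy_subset_p plays the role of reg_class_subset_p.  All the TOY_* names and mask values are invented for illustration.  An equality test only catches POINTER_AND_FP_REGS itself, while the subset test also treats any superset such as ALL_REGS as too wide.

```c
/* Toy register classes as bitmasks (invented values, not GCC's).  */
enum
{
  TOY_GENERAL_REGS = 0x1,
  TOY_STACK_REG = 0x2,
  TOY_FP_REGS = 0x4,
  TOY_PR_REGS = 0x8,
  TOY_POINTER_AND_FP_REGS = TOY_GENERAL_REGS | TOY_STACK_REG | TOY_FP_REGS,
  TOY_ALL_REGS = TOY_POINTER_AND_FP_REGS | TOY_PR_REGS
};

/* Nonzero if every register in class A is also in class B.  */
static int
toy_subset_p (unsigned a, unsigned b)
{
  return (a & ~b) == 0;
}

/* Equality-based check, as in the posted patch: only the exact class
   is considered too wide.  */
int
too_wide_by_equality (unsigned cl)
{
  return cl == TOY_POINTER_AND_FP_REGS;
}

/* Subset-based check, as suggested above: any class that contains all
   of POINTER_AND_FP_REGS is considered too wide.  */
int
too_wide_by_subset (unsigned cl)
{
  return toy_subset_p (TOY_POINTER_AND_FP_REGS, cl);
}
```

In this model too_wide_by_equality (TOY_ALL_REGS) is 0 while too_wide_by_subset (TOY_ALL_REGS) is 1, which is the case the subset form is meant to cover.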

FWIW, the patch looks good to me with that change.

Thanks,
Richard


Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-22 Thread H.J. Lu
On Tue, May 22, 2018 at 9:21 AM, Jan Hubicka  wrote:
>> > >  class ipa_opt_pass_d;
>> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
>> > > @@ -2894,7 +2896,8 @@ cgraph_node::only_called_directly_or_aliased_p 
>> > > (void)
>> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
>> > >   && !DECL_STATIC_DESTRUCTOR (decl)
>> > >   && !used_from_object_file_p ()
>> > > - && !externally_visible);
>> > > + && !externally_visible
>> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)));
>> >
>> > How's it handled for our own generated resolver functions?  That is,
>> > isn't there sth cheaper than doing a lookup_attribute here?  I see
>> > that make_dispatcher_decl nor ix86_get_function_versions_dispatcher
>> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
>> 
>>  Is there any drawback of setting force_output flag?
>>  Honza
>> >>>
>> >>> Setting force_output may prevent some optimizations.  Can we add a bit
>> >>> for IFUNC resolver?
>> >>>
>> >>
>> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on x86-64
>> >> and i686.  Any comments?
>> >>
>> >
>> > PING:
>> >
>> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
>> >
>>
>> PING.
> OK, but please extend the verifier that ifunc_resolver flag is equivalent to
> lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
> so we are sure things stays in sync.
>

Like this

diff --git a/gcc/symtab.c b/gcc/symtab.c
index 80f6f910c3b..954920b6dff 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -998,6 +998,13 @@ symtab_node::verify_base (void)
   error ("function symbol is not function");
   error_found = true;
   }
+  else if ((lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
+ != NULL)
+ != dyn_cast  (this)->ifunc_resolver)
+  {
+  error ("inconsistent `ifunc' attribute");
+  error_found = true;
+  }
 }
   else if (is_a  (this))
 {


Thanks.

-- 
H.J.


[PATCH] [AArch64, Falkor] Falkor address costs tuning

2018-05-22 Thread Luis Machado
Switch from using generic address costs to using Falkor-specific ones, which
give Falkor better results overall.

OK for trunk?

Given this is a Falkor-specific adjustment, would this be an acceptable
backport for GCC 8 as well?

gcc/ChangeLog:

2018-05-22  Luis Machado  

* config/aarch64/aarch64.c (qdf24xx_addrcost_table): New static
global.
(qdf24xx_tunings) : Set to qdf24xx_addrcost_table.
---
 gcc/config/aarch64/aarch64.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f60e0ad..548d87a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -314,6 +314,22 @@ static const struct cpu_addrcost_table 
thunderx2t99_addrcost_table =
   0, /* imm_offset  */
 };
 
+static const struct cpu_addrcost_table qdf24xx_addrcost_table =
+{
+{
+  1, /* hi  */
+  1, /* si  */
+  1, /* di  */
+  2, /* ti  */
+},
+  1, /* pre_modify  */
+  1, /* post_modify  */
+  3, /* register_offset  */
+  4, /* register_sextend  */
+  3, /* register_zextend  */
+  2, /* imm_offset  */
+};
+
 static const struct cpu_regmove_cost generic_regmove_cost =
 {
   1, /* GP2GP  */
@@ -856,7 +872,7 @@ static const struct tune_params xgene1_tunings =
 static const struct tune_params qdf24xx_tunings =
 {
   _extra_costs,
-  _addrcost_table,
+  _addrcost_table,
   _regmove_cost,
   _vector_cost,
   _branch_cost,
-- 
2.7.4



Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-22 Thread Joseph Myers
On Tue, 22 May 2018, Richard Biener wrote:

> +  if (*sptr & (1 << (CHAR_BIT * sizeof (T) - 1)))
> + gcc_unreachable ();
> +  m_flag = 1 << ((CHAR_BIT * sizeof (T)) - clz_hwi (*sptr));

I don't see how the use of clz_hwi works with a type T that may be 
narrower than HOST_WIDE_INT.  Surely this logic requires a count of 
leading zeros in something of type T, not a possibly larger number of 
leading zeros after conversion to HOST_WIDE_INT?  Also, if T is wider than 
int, shifting plain 1 won't work here.
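
A small stand-alone sketch of both points, assuming a 64-bit HOST_WIDE_INT and T = uint16_t (clz64 is a portable stand-in for clz_hwi; every function name here is made up for the illustration and none of this is the proposed patch):

```c
#include <limits.h>
#include <stdint.h>

/* Portable count of leading zero bits in a 64-bit value; returns 64
   for zero.  Stands in for clz_hwi with a 64-bit HOST_WIDE_INT.  */
static int
clz64 (uint64_t x)
{
  int n = 0;
  for (uint64_t bit = (uint64_t) 1 << 63; bit != 0 && (x & bit) == 0;
       bit >>= 1)
    n++;
  return n;
}

/* Shift count computed as in the quoted hunk for T = uint16_t: the
   width of T minus the leading zeros of the value widened to 64 bits.  */
int
shift_from_widened_clz (uint16_t flags)
{
  return (int) (CHAR_BIT * sizeof (uint16_t)) - clz64 (flags);
}

/* Shift count using the leading zeros within T itself, i.e. the widened
   count minus the extra bits introduced by the conversion.  */
int
shift_from_narrow_clz (uint16_t flags)
{
  int extra = (int) (CHAR_BIT * (sizeof (uint64_t) - sizeof (uint16_t)));
  return (int) (CHAR_BIT * sizeof (uint16_t)) - (clz64 (flags) - extra);
}

/* Next free flag for a 64-bit T.  The 1 is given type uint64_t because
   shifting a plain int 1 by 32 or more is undefined.  The caller must
   ensure the top bit of FLAGS is clear, as the quoted code checks.  */
uint64_t
next_flag64 (uint64_t flags)
{
  return (uint64_t) 1 << (64 - clz64 (flags));
}
```

With flags == 0x0003 and 8-bit bytes, the widened count gives a shift of 16 - 62 = -46, whereas counting within the 16-bit type gives 2, the index of the next free bit.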

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH PR85720/partial]Support runtime loop versioning if loop can be distributed into builtin functions

2018-05-22 Thread Bin Cheng
Hi,
This patch partially improves loop distribution for PR85720.  It now supports
runtime loop versioning if the loop can be distributed into builtin functions.
Note that for the moment only a coarse-grained runtime alias check is done;
different overlapping cases for different dependence relations are not
supported yet.
Note that the changes in break_alias_scc_partitions and
version_loop_by_alias_check do not strictly match each other, with the latter
being more restrictive, because it's hard to pass information around.
Hopefully this will be resolved when classifying distributor.
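
For readers who want a source-level picture, the following is a hand-written sketch of what a versioned loop amounts to for the pr85720.c test case.  It is not what the pass emits: fill_versioned and versions_agree are hypothetical names, the overlap test is a simple address-range comparison, and it is assumed that both stores can become memset calls when the ranges are disjoint.

```c
#include <stdint.h>
#include <string.h>

/* Reference loop, same body as the pr85720.c test case.  */
void
fill_loop (char *a, char *b, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    {
      a[i] = 0;
      b[i] = a[i] + 1;
    }
}

/* Hand-written equivalent of a versioned loop: a coarse-grained runtime
   check that [a, a+n) and [b, b+n) do not overlap guards the builtin
   version; otherwise fall back to the original loop.  */
void
fill_versioned (char *a, char *b, unsigned n)
{
  uintptr_t pa = (uintptr_t) a, pb = (uintptr_t) b;
  if (pa + n <= pb || pb + n <= pa)
    {
      memset (a, 0, n);
      memset (b, 1, n);
    }
  else
    fill_loop (a, b, n);
}

/* Returns 1 if both versions leave a 64-byte buffer in the same state
   when B starts OFFSET bytes after A (requires OFFSET + N <= 64).  */
int
versions_agree (unsigned offset, unsigned n)
{
  char u[64], v[64];
  memset (u, 9, sizeof u);
  memset (v, 9, sizeof v);
  fill_loop (u, u + offset, n);
  fill_versioned (v, v + offset, n);
  return memcmp (u, v, sizeof u) == 0;
}
```

The fallback matters: with b == a + 4 and n == 8 the original loop overwrites bytes it has already set through the other pointer, so the two memset calls alone would give a different result.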

Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

2018-05-22  Bin Cheng  

* tree-loop-distribution.c (break_alias_scc_partitions): Don't merge
SCC if all partitions are builtins.
(version_loop_by_alias_check): New parameter.  Generate cancelable
runtime alias check if all partitions are builtins.
(distribute_loop): Update call to above function.

gcc/testsuite
2018-05-22  Bin Cheng  

* gcc.dg/tree-ssa/pr85720.c: New test.
* gcc.target/i386/avx256-unaligned-store-2.c: Disable loop pattern
distribution.

From 2518709d31440525010fa6692b531419fc81b426 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Mon, 21 May 2018 15:49:55 +0100
Subject: [PATCH] pr85720-20180520

---
 gcc/testsuite/gcc.dg/tree-ssa/pr85720.c| 13 +++
 .../gcc.target/i386/avx256-unaligned-store-2.c |  2 +-
 gcc/tree-loop-distribution.c   | 40 +-
 3 files changed, 45 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr85720.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr85720.c b/gcc/testsuite/gcc.dg/tree-ssa/pr85720.c
new file mode 100644
index 000..18d8be9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr85720.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target size32plus } } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist" } */
+
+void fill(char* A, char* B, unsigned n)
+{
+for (unsigned i = 0; i < n; i++)
+{
+A[i] = 0;
+B[i] = A[i] + 1;
+}
+}
+
+/* { dg-final { scan-tree-dump-times "_builtin_memset" 2 "ldist" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c b/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c
index 87285c6..1e7969b 100644
--- a/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O3 -mtune-ctrl=sse_typeless_stores -dp -mavx -mavx256-split-unaligned-store -mno-prefer-avx128" } */
+/* { dg-options "-O3 -mtune-ctrl=sse_typeless_stores -dp -mavx -mavx256-split-unaligned-store -mno-prefer-avx128 -fno-tree-loop-distribute-patterns" } */
 
 #define N 1024
 
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 5e327f4..c6e0a60 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -2268,21 +2268,26 @@ break_alias_scc_partitions (struct graph *rdg,
 	  for (j = 0; partitions->iterate (j, ); ++j)
 	if (pg->vertices[j].component == i)
 	  break;
+
+	  bool same_type = true, all_builtins = partition_builtin_p (first);
 	  for (++j; partitions->iterate (j, ); ++j)
 	{
 	  if (pg->vertices[j].component != i)
 		continue;
 
-	  /* Note we Merge partitions of parallel type on purpose, though
-		 the result partition is sequential.  The reason is vectorizer
-		 can do more accurate runtime alias check in this case.  Also
-		 it results in more conservative distribution.  */
 	  if (first->type != partition->type)
 		{
-		  bitmap_clear_bit (sccs_to_merge, i);
+		  same_type = false;
 		  break;
 		}
+	  all_builtins &= partition_builtin_p (partition);
 	}
+	  /* Merge SCC if all partitions in SCC have the same type, though the
+	 result partition is sequential, because vectorizer can do better
+	 runtime alias check.  One exception is all partitions in SCC are
+	 builtins.  */
+	  if (!same_type || all_builtins)
+	bitmap_clear_bit (sccs_to_merge, i);
 	}
 
   /* Initialize callback data for traversing.  */
@@ -2458,7 +2463,8 @@ compute_alias_check_pairs (struct loop *loop, vec *alias_ddrs,
checks and version LOOP under condition of these runtime alias checks.  */
 
 static void
-version_loop_by_alias_check (struct loop *loop, vec *alias_ddrs)
+version_loop_by_alias_check (vec *partitions,
+			 struct loop *loop, vec *alias_ddrs)
 {
   profile_probability prob;
   basic_block cond_bb;
@@ -2481,9 +2487,25 @@ version_loop_by_alias_check (struct loop *loop, vec *alias_ddrs)
   is_gimple_val, NULL_TREE);
 
   /* Depend on vectorizer to fold IFN_LOOP_DIST_ALIAS.  */
-  if (flag_tree_loop_vectorize)
+  bool cancelable_p = flag_tree_loop_vectorize;
+  if (cancelable_p)

Re: PING^2: [PATCH] Don't mark IFUNC resolver as only called directly

2018-05-22 Thread Jan Hubicka
> > >  class ipa_opt_pass_d;
> > >  typedef ipa_opt_pass_d *ipa_opt_pass;
> > > @@ -2894,7 +2896,8 @@ cgraph_node::only_called_directly_or_aliased_p 
> > > (void)
> > >   && !DECL_STATIC_CONSTRUCTOR (decl)
> > >   && !DECL_STATIC_DESTRUCTOR (decl)
> > >   && !used_from_object_file_p ()
> > > - && !externally_visible);
> > > + && !externally_visible
> > > + && !lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)));
> >
> > How's it handled for our own generated resolver functions?  That is,
> > isn't there sth cheaper than doing a lookup_attribute here?  I see
> > that make_dispatcher_decl nor ix86_get_function_versions_dispatcher
> > adds the 'ifunc' attribute (though they are TREE_PUBLIC there).
> 
>  Is there any drawback of setting force_output flag?
>  Honza
> >>>
> >>> Setting force_output may prevent some optimizations.  Can we add a bit
> >>> for IFUNC resolver?
> >>>
> >>
> >> Here is the patch to add ifunc_resolver to cgraph_node. Tested on x86-64
> >> and i686.  Any comments?
> >>
> >
> > PING:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00647.html
> >
> 
> PING.
OK, but please extend the verifier that ifunc_resolver flag is equivalent to
lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl))
so we are sure things stays in sync.

Thanks and sorry for the delay,
Honza
> 
> 
> -- 
> H.J.


Re: [AARCH64] Neon vld1_*_x3, vst1_*_x2 and vst1_*_x3 intrinsics

2018-05-22 Thread Sameera Deshpande
On Tue 22 May, 2018, 9:26 PM James Greenhalgh, 
wrote:

> On Mon, Apr 30, 2018 at 06:35:11PM -0500, Sameera Deshpande wrote:
> > On 13 April 2018 at 20:21, James Greenhalgh 
> wrote:
> > > On Fri, Apr 13, 2018 at 03:39:32PM +0100, Sameera Deshpande wrote:
> > >> On Fri 13 Apr, 2018, 8:04 PM James Greenhalgh, <
> james.greenha...@arm.com> wrote:
> > >> On Fri, Apr 06, 2018 at 08:55:47PM +0100, Christophe Lyon wrote:
> > >> > Hi,
> > >> >
> > >> > 2018-04-06 12:15 GMT+02:00 Sameera Deshpande <
> sameera.deshpa...@linaro.org>:
> > >> > > Hi Christophe,
> > >> > >
> > >> > > Please find attached the updated patch with testcases.
> > >> > >
> > >> > > Ok for trunk?
> > >> >
> > >> > Thanks for the update.
> > >> >
> > >> > Since the new intrinsics are only available on aarch64, you want to
> > >> > prevent the tests from running on arm.
> > >> > Indeed gcc.target/aarch64/advsimd-intrinsics/ is shared between the
> two targets.
> > >> > There are several examples on how to do that in that directory.
> > >> >
> > >> > I have also noticed that the tests fail at execution on aarch64_be.
> > >>
> > >> I think this is important to fix. We don't want the big-endian target
> to have
> > >> failing implementations of the Neon intrinsics. What is the nature of
> the
> > >> failure?
> > >>
> > >> From what I can see, nothing in the patch prevents using these
> intrinsics
> > >> on big-endian, so either the intrinsics behaviour is wrong (we have a
> wrong
> > >> code bug), or the testcase expected behaviour is wrong.
> > >>
> > >> I don't think disabling the test for big-endian is the right fix. We
> should
> > >> either fix the intrinsics, or fix the testcase.
> > >>
> > >> Thanks,
> > >> James
> > >>
> > >> Hi James,
> > >>
> > >> As the tests assume the little endian order of elements while
> checking the
> > >> results, the tests are failing for big endian targets. So, the
> failures are
> > >> not because of intrinsic implementations, but because of the testcase.
> > >
> > > The testcase is a little hard to follow through the macros, but why
> would
> > > this be the case?
> > >
> > > ld1 is deterministic on big and little endian for which elements will
> be
> > > loaded from memory, as is st1.
> > >
> > > My expectation would be that:
> > >
> > >   int __attribute__ ((noinline))
> > >   test_vld_u16_x3 ()
> > >   {
> > > uint16_t data[3 * 3];
> > > uint16_t temp[3 * 3];
> > > uint16x4x3_t vectors;
> > > int i,j;
> > > for (i = 0; i < 3 * 3; i++)
> > >   data [i] = (uint16_t) 3*i;
> > > asm volatile ("" : : : "memory");
> > > vectors = vld1_u16_x3 (data);
> > > vst1_u16 (temp, vectors.val[0]);
> > > vst1_u16 ([3], vectors.val[1]);
> > > vst1_u16 ([3 * 2], vectors.val[2]);
> > > asm volatile ("" : : : "memory");
> > > for (j = 0; j < 3 * 3; j++)
> > >   if (temp[j] != data[j])
> > > return 1;
> > > return 0;
> > >   }
> > >
> > > would work equally well for big- or little-endian.
> > >
> > > I think this is more likely to be an intrinsics implementation bug.
> > >
> > > Thanks,
> > > James
> > >
> >
> > Hi James,
> >
> > Please find attached the updated patch, which now passes for little as
> > well as big endian.
> > Ok for trunk?
>
>
> OK.
>
> Thanks,
> James
>
> >
> > --
> > - Thanks and regards,
> >   Sameera D.
> >
> > gcc/Changelog:
> >
> > 2018-05-01  Sameera Deshpande  
> >
> >
> > * config/aarch64/aarch64-simd-builtins.def (ld1x3): New.
> > (st1x2): Likewise.
> > (st1x3): Likewise.
> > * config/aarch64/aarch64-simd.md
> > (aarch64_ld1x3): New pattern.
> > (aarch64_ld1_x3_): Likewise
> > (aarch64_st1x2): Likewise
> > (aarch64_st1_x2_): Likewise
> > (aarch64_st1x3): Likewise
> > (aarch64_st1_x3_): Likewise
> > * config/aarch64/arm_neon.h (vld1_u8_x3): New function.
> > (vld1_s8_x3): Likewise.
> > (vld1_u16_x3): Likewise.
> > (vld1_s16_x3): Likewise.
> > (vld1_u32_x3): Likewise.
> > (vld1_s32_x3): Likewise.
> > (vld1_u64_x3): Likewise.
> > (vld1_s64_x3): Likewise.
> > (vld1_f16_x3): Likewise.
> > (vld1_f32_x3): Likewise.
> > (vld1_f64_x3): Likewise.
> > (vld1_p8_x3): Likewise.
> > (vld1_p16_x3): Likewise.
> > (vld1_p64_x3): Likewise.
> > (vld1q_u8_x3): Likewise.
> > (vld1q_s8_x3): Likewise.
> > (vld1q_u16_x3): Likewise.
> > (vld1q_s16_x3): Likewise.
> > (vld1q_u32_x3): Likewise.
> > (vld1q_s32_x3): Likewise.
> > (vld1q_u64_x3): Likewise.
> > (vld1q_s64_x3): Likewise.
> > (vld1q_f16_x3): Likewise.
> > (vld1q_f32_x3): Likewise.
> > (vld1q_f64_x3): Likewise.
> > (vld1q_p8_x3): Likewise.
> > (vld1q_p16_x3): 

[PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-22 Thread Wilco Dijkstra
A recent commit removing '*' from the md files caused a large regression in
h264ref.  It turns out aarch64_ira_change_pseudo_allocno_class is no longer
effective after the SVE changes, and the combination results in the
regression.  This patch fixes it by using the new POINTER_AND_FP_REGS register
class which is now used instead of ALL_REGS.
Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.

Passes regress, OK for commit?

Since it is a regression introduced in GCC8, OK to backport to GCC8?

ChangeLog:
2018-05-22  Wilco Dijkstra  

* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
Use POINTER_AND_FP_REGS instead of ALL_REGS.
* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of 
r=w alternative.
--

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 2ebd256329c1a6a6b790d16955cbcee3feca456c..3d5fe44b53198a92afb726712c6e9dee890afe38 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2961,7 +2961,7 @@ (define_insn "*aarch64_get_lane_zero_extendsi"
 ;; is guaranteed so upper bits should be considered undefined.
 ;; RTL uses GCC vector extension indices throughout so flip only for assembly.
 (define_insn "aarch64_get_lane"
-  [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
+  [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
(vec_select:
  (match_operand:VALL_F16 1 "register_operand" "w, w, w")
  (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 47d98dfd095cdcd15908a86091cf2f8a4d6137b1..a119760c7f332aded200fa1b5bcfb1bbac7b6420 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1059,16 +1059,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
 }
 
 /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
-   The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
-   the same cost even if ALL_REGS has a much larger cost.  ALL_REGS is also
-   used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
-   cost (in this case the best class is the lowest cost one).  Using ALL_REGS
-   irrespectively of its cost results in bad allocations with many redundant
-   int<->FP moves which are expensive on various cores.
-   To avoid this we don't allow ALL_REGS as the allocno class, but force a
-   decision between FP_REGS and GENERAL_REGS.  We use the allocno class if it
-   isn't ALL_REGS.  Similarly, use the best class if it isn't ALL_REGS.
-   Otherwise set the allocno class depending on the mode.
+   The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
+   GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
+   higher cost.  POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
+   and GENERAL_REGS is lower than the memory cost (in this case the best class
+   is the lowest cost one).  Using POINTER_AND_FP_REGS irrespectively of its
+   cost results in bad allocations with many redundant int<->FP moves which
+   are expensive on various cores.
+   To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
+   force a decision between FP_REGS and GENERAL_REGS.  We use the allocno class
+   if it isn't POINTER_AND_FP_REGS.  Similarly, use the best class if it isn't
+   POINTER_AND_FP_REGS.  Otherwise set the allocno class depending on the mode.
The result of this is that it is no longer inefficient to have a higher
memory move cost than the register move cost.
 */
@@ -1079,10 +1080,10 @@ aarch64_ira_change_pseudo_allocno_class (int regno, 
reg_class_t allocno_class,
 {
   machine_mode mode;
 
-  if (allocno_class != ALL_REGS)
+  if (allocno_class != POINTER_AND_FP_REGS)
 return allocno_class;
 
-  if (best_class != ALL_REGS)
+  if (best_class != POINTER_AND_FP_REGS)
 return best_class;
 
   mode = PSEUDO_REGNO_MODE (regno);
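
The selection order described in the comment of this hunk can be modelled in isolation. The following is only a sketch: RegClass, change_pseudo_allocno_class and the is_fp_or_vector_mode parameter are illustrative stand-ins rather than GCC types, and the final mode-based fallback (FP_REGS for float/vector modes, GENERAL_REGS otherwise) is an assumption, since the hunk above is cut off before that part of the function.

```cpp
// Illustrative stand-in for the three register classes named in the comment.
enum class RegClass { GENERAL_REGS, FP_REGS, POINTER_AND_FP_REGS };

// Model of the decision order: keep the allocno class unless it is the
// combined class, then try the best class, then decide by mode.
RegClass
change_pseudo_allocno_class (RegClass allocno_class, RegClass best_class,
                             bool is_fp_or_vector_mode)
{
  if (allocno_class != RegClass::POINTER_AND_FP_REGS)
    return allocno_class;

  if (best_class != RegClass::POINTER_AND_FP_REGS)
    return best_class;

  // Assumption: the mode-based fallback picks FP_REGS for float and
  // vector modes and GENERAL_REGS for everything else.
  return is_fp_or_vector_mode ? RegClass::FP_REGS : RegClass::GENERAL_REGS;
}
```

In this model the combined class is never returned, which is the "force a decision between FP_REGS and GENERAL_REGS" behaviour the comment describes.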


Re: [PATCH][AArch64] Simplify frame pointer logic

2018-05-22 Thread James Greenhalgh
On Tue, May 22, 2018 at 10:37:30AM -0500, Wilco Dijkstra wrote:
> James Greenhalgh wrote:
> 
> > +/* Determine whether a frame chain needs to be generated.  */
> > +static bool
> > +aarch64_needs_frame_chain (void)
> > +{
> > +  /* Force a frame chain for EH returns so the return address is at FP+8.  
> > */
> > +  if (frame_pointer_needed || crtl->calls_eh_return)
> > +    return true;
> 
> > To match the original logic, I think this needs to not return true, but set
> > some temporary to true which may be overwritten by...
> 
> It's only overwritten if false, once true it cannot ever become false.

Doh, of course.

OK for trunk.

Thanks,
James



Re: [AARCH64] Neon vld1_*_x3, vst1_*_x2 and vst1_*_x3 intrinsics

2018-05-22 Thread James Greenhalgh
On Mon, Apr 30, 2018 at 06:35:11PM -0500, Sameera Deshpande wrote:
> On 13 April 2018 at 20:21, James Greenhalgh  wrote:
> > On Fri, Apr 13, 2018 at 03:39:32PM +0100, Sameera Deshpande wrote:
> >> On Fri 13 Apr, 2018, 8:04 PM James Greenhalgh wrote:
> >> On Fri, Apr 06, 2018 at 08:55:47PM +0100, Christophe Lyon wrote:
> >> > Hi,
> >> >
> >> > 2018-04-06 12:15 GMT+02:00 Sameera Deshpande:
> >> > > Hi Christophe,
> >> > >
> >> > > Please find attached the updated patch with testcases.
> >> > >
> >> > > Ok for trunk?
> >> >
> >> > Thanks for the update.
> >> >
> >> > Since the new intrinsics are only available on aarch64, you want to
> >> > prevent the tests from running on arm.
> >> > Indeed gcc.target/aarch64/advsimd-intrinsics/ is shared between the two 
> >> > targets.
> >> > There are several examples on how to do that in that directory.
> >> >
> >> > I have also noticed that the tests fail at execution on aarch64_be.
> >>
> >> I think this is important to fix. We don't want the big-endian target to 
> >> have
> >> failing implementations of the Neon intrinsics. What is the nature of the
> >> failure?
> >>
> >> From what I can see, nothing in the patch prevents using these intrinsics
> >> on big-endian, so either the intrinsics behaviour is wrong (we have a wrong
> >> code bug), or the testcase expected behaviour is wrong.
> >>
> >> I don't think disabling the test for big-endian is the right fix. We should
> >> either fix the intrinsics, or fix the testcase.
> >>
> >> Thanks,
> >> James
> >>
> >> Hi James,
> >>
> >> As the tests assume the little endian order of elements while checking the
> >> results, the tests are failing for big endian targets. So, the failures are
> >> not because of intrinsic implementations, but because of the testcase.
> >
> > The testcase is a little hard to follow through the macros, but why would
> > this be the case?
> >
> > ld1 is deterministic on big and little endian for which elements will be
> > loaded from memory, as is st1.
> >
> > My expectation would be that:
> >
> >   int __attribute__ ((noinline))
> >   test_vld_u16_x3 ()
> >   {
> > uint16_t data[3 * 3];
> > uint16_t temp[3 * 3];
> > uint16x4x3_t vectors;
> > int i,j;
> > for (i = 0; i < 3 * 3; i++)
> >   data [i] = (uint16_t) 3*i;
> > asm volatile ("" : : : "memory");
> > vectors = vld1_u16_x3 (data);
> > vst1_u16 (temp, vectors.val[0]);
> > vst1_u16 (&temp[3], vectors.val[1]);
> > vst1_u16 (&temp[3 * 2], vectors.val[2]);
> > asm volatile ("" : : : "memory");
> > for (j = 0; j < 3 * 3; j++)
> >   if (temp[j] != data[j])
> > return 1;
> > return 0;
> >   }
> >
> > would work equally well for big- or little-endian.
> >
> > I think this is more likely to be an intrinsics implementation bug.
> >
> > Thanks,
> > James
> >
> 
> Hi James,
> 
> Please find attached the updated patch, which now passes for little as
> well as big endian.
> Ok for trunk?


OK.

Thanks,
James

> 
> -- 
> - Thanks and regards,
>   Sameera D.
> 
> gcc/Changelog:
> 
> 2018-05-01  Sameera Deshpande  
> 
> 
> * config/aarch64/aarch64-simd-builtins.def (ld1x3): New.
> (st1x2): Likewise.
> (st1x3): Likewise.
> * config/aarch64/aarch64-simd.md
> (aarch64_ld1x3): New pattern.
> (aarch64_ld1_x3_): Likewise
> (aarch64_st1x2): Likewise
> (aarch64_st1_x2_): Likewise
> (aarch64_st1x3): Likewise
> (aarch64_st1_x3_): Likewise
> * config/aarch64/arm_neon.h (vld1_u8_x3): New function.
> (vld1_s8_x3): Likewise.
> (vld1_u16_x3): Likewise.
> (vld1_s16_x3): Likewise.
> (vld1_u32_x3): Likewise.
> (vld1_s32_x3): Likewise.
> (vld1_u64_x3): Likewise.
> (vld1_s64_x3): Likewise.
> (vld1_f16_x3): Likewise.
> (vld1_f32_x3): Likewise.
> (vld1_f64_x3): Likewise.
> (vld1_p8_x3): Likewise.
> (vld1_p16_x3): Likewise.
> (vld1_p64_x3): Likewise.
> (vld1q_u8_x3): Likewise.
> (vld1q_s8_x3): Likewise.
> (vld1q_u16_x3): Likewise.
> (vld1q_s16_x3): Likewise.
> (vld1q_u32_x3): Likewise.
> (vld1q_s32_x3): Likewise.
> (vld1q_u64_x3): Likewise.
> (vld1q_s64_x3): Likewise.
> (vld1q_f16_x3): Likewise.
> (vld1q_f32_x3): Likewise.
> (vld1q_f64_x3): Likewise.
> (vld1q_p8_x3): Likewise.
> (vld1q_p16_x3): Likewise.
> (vld1q_p64_x3): Likewise.
> (vst1_s64_x2): Likewise.
> (vst1_u64_x2): Likewise.
> (vst1_f64_x2): Likewise.
> (vst1_s8_x2): Likewise.
> (vst1_p8_x2): Likewise.
> (vst1_s16_x2): Likewise.
> (vst1_p16_x2): Likewise.
> (vst1_s32_x2): Likewise.
> (vst1_u8_x2): 

Re: [PATCH][AArch64] Simplify frame pointer logic

2018-05-22 Thread Wilco Dijkstra
James Greenhalgh wrote:

> +/* Determine whether a frame chain needs to be generated.  */
> +static bool
> +aarch64_needs_frame_chain (void)
> +{
> +  /* Force a frame chain for EH returns so the return address is at FP+8.  */
> +  if (frame_pointer_needed || crtl->calls_eh_return)
> +    return true;

> To match the original logic, I think this needs to not return true, but set
> some temporary to true which may be overwritten by...

It's only overwritten if false, once true it cannot ever become false.

> +
> +  /* A leaf function cannot have calls or write LR.  */
> +  bool is_leaf = crtl->is_leaf && !df_regs_ever_live_p (LR_REGNUM);
> +
> +  /* Don't use a frame chain in leaf functions if leaf frame pointers
> + are disabled.  */
> +  if (flag_omit_leaf_frame_pointer && is_leaf)
> +    return false;

> This.

No, a leaf function with alloca or EH return still must use a frame pointer.

> +
> +  return aarch64_use_frame_pointer;
> +}
> +


> I say that because here

> -  /* Force a frame chain for EH returns so the return address is at FP+8.  */
> -  cfun->machine->frame.emit_frame_chain
> -    = frame_pointer_needed || crtl->calls_eh_return;
> -

> We fall through to the next check.


> -  /* Emit a frame chain if the frame pointer is enabled.
> - If -momit-leaf-frame-pointer is used, do not use a frame chain
> - in leaf functions which do not use LR.  */
> -  if (flag_omit_frame_pointer == 2
> -  && !(flag_omit_leaf_frame_pointer && crtl->is_leaf
> -  && !df_regs_ever_live_p (LR_REGNUM)))
> -    cfun->machine->frame.emit_frame_chain = true;
> +  cfun->machine->frame.emit_frame_chain = aarch64_needs_frame_chain ();
 
> That may well have been a long-standing bug, but I wanted to query it
> as you don't mention any bug fixes in the patch cover letter.

There is no bug here. The code is non-trivial, but it does exactly what it
says: it forces the frame chain on when required in a non-leaf function, or in
a leaf function if leaf frame pointer omission is disabled. The point of the
new code is to make it a bit simpler to read.

Wilco
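
The equivalence argued above can be checked mechanically on a small model. This is a sketch only: FrameState and its fields are hypothetical stand-ins for the GCC state, and it assumes that the old test "flag_omit_frame_pointer == 2" and the new aarch64_use_frame_pointer carry the same information, as the patch description suggests.

```cpp
// Hypothetical model of the inputs both versions look at.
struct FrameState
{
  bool frame_pointer_needed;     // frame_pointer_needed
  bool calls_eh_return;          // crtl->calls_eh_return
  bool use_frame_pointer;        // old: flag_omit_frame_pointer == 2
  bool omit_leaf_frame_pointer;  // flag_omit_leaf_frame_pointer
  bool is_leaf;                  // crtl->is_leaf
  bool lr_live;                  // df_regs_ever_live_p (LR_REGNUM)
};

// Old logic: set the flag, then possibly overwrite it with true.
bool
old_emit_frame_chain (const FrameState &s)
{
  bool emit = s.frame_pointer_needed || s.calls_eh_return;
  if (s.use_frame_pointer
      && !(s.omit_leaf_frame_pointer && s.is_leaf && !s.lr_live))
    emit = true;
  return emit;
}

// New logic: early return once the chain is known to be required.
bool
new_needs_frame_chain (const FrameState &s)
{
  if (s.frame_pointer_needed || s.calls_eh_return)
    return true;

  bool is_leaf = s.is_leaf && !s.lr_live;
  if (s.omit_leaf_frame_pointer && is_leaf)
    return false;

  return s.use_frame_pointer;
}

// Compare both versions over all 64 combinations of the six inputs.
bool
logic_is_equivalent ()
{
  for (unsigned bits = 0; bits < 64; ++bits)
    {
      FrameState s;
      s.frame_pointer_needed = (bits & 1) != 0;
      s.calls_eh_return = (bits & 2) != 0;
      s.use_frame_pointer = (bits & 4) != 0;
      s.omit_leaf_frame_pointer = (bits & 8) != 0;
      s.is_leaf = (bits & 16) != 0;
      s.lr_live = (bits & 32) != 0;
      if (old_emit_frame_chain (s) != new_needs_frame_chain (s))
        return false;
    }
  return true;
}
```

Under that assumption the old code only ever overwrites the flag with true, so the early return in the new function does not change the result for any input.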

Re: [AArch64, patch] Refactor of aarch64-ldpstp

2018-05-22 Thread James Greenhalgh
On Tue, May 22, 2018 at 10:06:15AM -0500, Kyrill Tkachov wrote:
> [sending on behalf of Jackson Woodruff]
> 
> Hi all,
> 
> This patch removes a lot of duplicated code in aarch64-ldpstp.md.
> 
> The patterns that did not previously generate a base register now
> do not check for aarch64_mem_pair_operand in the pattern. This has
> been extracted to a check in aarch64_operands_ok_for_ldpstp.
> 
> All patterns in the file used to have explicit switching code to
> swap loads and stores that were in the wrong order.
> 
> This has been extracted into aarch64_operands_ok_for_ldpstp
> as a final operation after all the checks have been performed.
> 
> This patch is based on the patch here: 
> https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01129.html
> 
> 
> Bootstrap and regtest OK on AArch64.
> 
> OK for trunk?

OK.

Thanks,
James

> 
> Jackson.
> 
> gcc/
> 
> 2018-05-22  Jackson Woodruff  
>  Kyrylo Tkachov  
> 
>  * config/aarch64/aarch64-ldpstp.md: Replace uses of
>  aarch64_mem_pair_operand with memory_operand and delete operand swapping
>  code.
>  * config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
>  Add check for legitimate_address.
>  (aarch64_gen_adjusted_ldpstp): Swap operands where appropriate.
>  (aarch64_swap_ldrstr_operands): New.
>  * config/aarch64/aarch64-protos.h (aarch64_swap_ldrstr_operands):
>  Define prototype.




Re: [PATCH][AArch64] Merge stores of D-register values with different modes

2018-05-22 Thread James Greenhalgh
On Tue, May 22, 2018 at 08:13:22AM -0500, Kyrill Tkachov wrote:
> [sending on behalf of Jackson Woodruff]
> 
> Hi all,
> 
> This patch merges loads and stores from D-registers that are of different 
> modes.
> 
> Code like this:
> 
>  typedef int __attribute__((vector_size(8))) vec;
>  struct pair
>  {
>    vec v;
>    double d;
>  };
> 
> Now generates a store pair instruction:
> 
>  void
>  assign (struct pair *p, vec v)
>  {
>p->v = v;
>p->d = 1.0;
>  }
> 
> Whereas previously it generated two `str` instructions.
> 
> This patch also merges storing of double zero values with
> long integer values:
> 
>  struct pair
>  {
>    long long l;
>    double d;
>  };
> 
>  void
>  foo (struct pair *p)
>  {
>p->l = 10;
>p->d = 0.0;
>  }
> 
> Now generates a single store pair instruction rather than two `str` 
> instructions.
> 
> The patch basically generalises the mode iterators on the patterns in 
> aarch64.md
> and the peepholes in aarch64-ldpstp.md to take all combinations of pairs of 
> modes
> so, while it may be a large-ish patch, it does fairly mechanical stuff.
> 
> Bootstrap and testsuite run OK. OK for trunk?

OK for trunk.

Thanks,
James
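
For reference, the first example quoted in this message relies on the two members being 8 bytes wide and adjacent in memory, which is what lets a single store pair cover both. A minimal sketch that checks this layout, assuming the GNU vector_size extension and an 8-byte double; the struct is renamed dpair here only to avoid any clash with std::pair in C++:

```cpp
#include <cstddef>

// GNU vector extension: two 32-bit ints in one 8-byte (D-register sized) value.
typedef int __attribute__((vector_size(8))) vec;

// Same shape as the quoted struct, renamed for this sketch.
struct dpair
{
  vec v;
  double d;
};

// Both members are D-register sized and sit at consecutive 8-byte offsets.
static_assert (sizeof (vec) == 8, "vec is expected to be 8 bytes");
static_assert (sizeof (double) == 8, "double is expected to be 8 bytes");
static_assert (offsetof (dpair, d) == offsetof (dpair, v) + 8,
               "d is expected to follow v directly");

// The two stores that the patch description says are merged into one stp.
void
assign (dpair *p, vec v)
{
  p->v = v;
  p->d = 1.0;
}
```

Whether a given compiler actually emits a store pair for assign depends on the target and options; the sketch only shows the layout precondition.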

> 
> Jackson
> 
> 2018-05-22  Jackson Woodruff  
>  Kyrylo Tkachov  
> 
>  * config/aarch64/aarch64.md: New patterns to generate stp
>  and ldp.
>  (store_pair_sw, store_pair_dw): New patterns to generate stp for
>  single words and double words.
>  (load_pair_sw, load_pair_dw): Likewise.
>  (store_pair_sf, store_pair_df, store_pair_si, store_pair_di):
>  Delete.
>  (load_pair_sf, load_pair_df, load_pair_si, load_pair_di):
>  Delete.
>  * config/aarch64/aarch64-ldpstp.md: Modify peephole
>  for different mode ldpstp and add peephole for merged zero stores.
>  Likewise for loads.
>  * config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
>  Add size check.
>  (aarch64_gen_store_pair): Rename calls to match new patterns.
>  (aarch64_gen_load_pair): Rename calls to match new patterns.
>  * config/aarch64/aarch64-simd.md (load_pair): Rename to...
>  (load_pair): ... This.
>  (store_pair): Rename to...
>  (vec_store_pair): ... This.
>  * config/aarch64/iterators.md (DREG, DREG2, DX2, SX, SX2, DSX):
>  New mode iterators.
>  (V_INT_EQUIV): Handle SImode.
>  * config/aarch64/predicates.md (aarch64_reg_zero_or_fp_zero):
>  New predicate.
> 
> 
> 2018-05-22  Jackson Woodruff  
> 
>  * gcc.target/aarch64/ldp_stp_6.c: New.
>  * gcc.target/aarch64/ldp_stp_7.c: New.
>  * gcc.target/aarch64/ldp_stp_8.c: New.



Re: [PATCH][AArch64] Simplify frame pointer logic

2018-05-22 Thread James Greenhalgh
On Tue, May 15, 2018 at 08:11:21AM -0500, Wilco Dijkstra wrote:
> 
> ping
> 
> 
> From: Wilco Dijkstra
> Sent: 25 October 2017 16:29
> To: GCC Patches
> Cc: nd
> Subject: [PATCH][AArch64] Simplify frame pointer logic
>   
> 
> Simplify frame pointer logic based on review comments here
> (https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01727.html).
> 
> This patch incrementally adds to these frame pointer cleanup patches:
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00377.html
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00381.html
> 
> Add aarch64_needs_frame_chain to decide when to emit the frame
> chain using clearer logic. Introduce aarch64_use_frame_pointer
> which contains the value of -fno-omit-frame-pointer
> (flag_omit_frame_pointer is set to a magic value so that the mid-end
> won't force the frame pointer in all cases, and leaf frame pointer
> emission can't be supported).
> 
> OK for commit?
> 
> +/* Determine whether a frame chain needs to be generated.  */
> +static bool
> +aarch64_needs_frame_chain (void)
> +{
> +  /* Force a frame chain for EH returns so the return address is at FP+8.  */
> +  if (frame_pointer_needed || crtl->calls_eh_return)
> +    return true;

To match the original logic, I think this needs to not return true, but set
some temporary to true which may be overwritten by...

> +
> +  /* A leaf function cannot have calls or write LR.  */
> +  bool is_leaf = crtl->is_leaf && !df_regs_ever_live_p (LR_REGNUM);
> +
> +  /* Don't use a frame chain in leaf functions if leaf frame pointers
> + are disabled.  */
> +  if (flag_omit_leaf_frame_pointer && is_leaf)
> +    return false;

This.

> +
> +  return aarch64_use_frame_pointer;
> +}
> +


I say that because here

> -  /* Force a frame chain for EH returns so the return address is at FP+8.  */
> -  cfun->machine->frame.emit_frame_chain
> -    = frame_pointer_needed || crtl->calls_eh_return;
> -

We fall through to the next check.


> -  /* Emit a frame chain if the frame pointer is enabled.
> - If -momit-leaf-frame-pointer is used, do not use a frame chain
> - in leaf functions which do not use LR.  */
> -  if (flag_omit_frame_pointer == 2
> -  && !(flag_omit_leaf_frame_pointer && crtl->is_leaf
> -  && !df_regs_ever_live_p (LR_REGNUM)))
> -    cfun->machine->frame.emit_frame_chain = true;
> +  cfun->machine->frame.emit_frame_chain = aarch64_needs_frame_chain ();
 
That may well have been a long-standing bug, but I wanted to query it
as you don't mention any bug fixes in the patch cover letter.

Thanks,
James


[AArch64, patch] Refactor of aarch64-ldpstp

2018-05-22 Thread Kyrill Tkachov

[sending on behalf of Jackson Woodruff]

Hi all,

This patch removes a lot of duplicated code in aarch64-ldpstp.md.

The patterns that did not previously generate a base register now
do not check for aarch64_mem_pair_operand in the pattern. This has
been extracted to a check in aarch64_operands_ok_for_ldpstp.

All patterns in the file used to have explicit switching code to
swap loads and stores that were in the wrong order.

This has been extracted into aarch64_operands_ok_for_ldpstp
as a final operation after all the checks have been performed.
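
The ordering step that each peephole used to carry can be pictured with a small standalone model. This is a sketch, not the code from the patch: MemRef, PairOps and order_by_offset are hypothetical names, and the real helper works on the rtx operands array with a flag saying whether the pair is a load or a store.

```cpp
#include <utility>

// Hypothetical stand-in for a memory operand: base register plus offset.
struct MemRef
{
  int base;
  long offset;
};

// Hypothetical stand-in for the four peephole operands: two registers
// and the two memory references they are loaded from or stored to.
struct PairOps
{
  int reg[2];
  MemRef mem[2];
};

// Put the access with the lower offset first, keeping each register
// paired with its own memory reference.
void
order_by_offset (PairOps &ops)
{
  if (ops.mem[0].offset > ops.mem[1].offset)
    {
      std::swap (ops.reg[0], ops.reg[1]);
      std::swap (ops.mem[0], ops.mem[1]);
    }
}
```

Having one such helper means every peephole body shrinks to a single call instead of repeating the extract/compare/swap sequence.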

This patch is based on the patch here: 
https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01129.html


Bootstrap and regtest OK on AArch64.

OK for trunk?

Jackson.

gcc/

2018-05-22  Jackson Woodruff  
Kyrylo Tkachov  

* config/aarch64/aarch64-ldpstp.md: Replace uses of
aarch64_mem_pair_operand with memory_operand and delete operand swapping
code.
* config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
Add check for legitimate_address.
(aarch64_gen_adjusted_ldpstp): Swap operands where appropriate.
(aarch64_swap_ldrstr_operands): New.
* config/aarch64/aarch64-protos.h (aarch64_swap_ldrstr_operands):
Define prototype.
diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index f6fe8a6a93b5466723e3ed6b892e0ec1e67ee89d..7f1031dc80fab31f691c0b03d6a485c1b6fd7e53 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -20,26 +20,18 @@
 
 (define_peephole2
   [(set (match_operand:GPI 0 "register_operand" "")
-	(match_operand:GPI 1 "aarch64_mem_pair_operand" ""))
+	(match_operand:GPI 1 "memory_operand" ""))
(set (match_operand:GPI 2 "register_operand" "")
 	(match_operand:GPI 3 "memory_operand" ""))]
   "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
-  rtx base, offset_1, offset_2;
-
-  extract_base_offset_in_addr (operands[1], &base, &offset_1);
-  extract_base_offset_in_addr (operands[3], &base, &offset_2);
-  if (INTVAL (offset_1) > INTVAL (offset_2))
-{
-  std::swap (operands[0], operands[2]);
-  std::swap (operands[1], operands[3]);
-}
+  aarch64_swap_ldrstr_operands (operands, true);
 })
 
 (define_peephole2
-  [(set (match_operand:GPI 0 "aarch64_mem_pair_operand" "")
+  [(set (match_operand:GPI 0 "memory_operand" "")
 	(match_operand:GPI 1 "aarch64_reg_or_zero" ""))
(set (match_operand:GPI 2 "memory_operand" "")
 	(match_operand:GPI 3 "aarch64_reg_or_zero" ""))]
@@ -47,39 +39,23 @@ (define_peephole2
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
-  rtx base, offset_1, offset_2;
-
-  extract_base_offset_in_addr (operands[0], &base, &offset_1);
-  extract_base_offset_in_addr (operands[2], &base, &offset_2);
-  if (INTVAL (offset_1) > INTVAL (offset_2))
-{
-  std::swap (operands[0], operands[2]);
-  std::swap (operands[1], operands[3]);
-}
+  aarch64_swap_ldrstr_operands (operands, false);
 })
 
 (define_peephole2
   [(set (match_operand:GPF 0 "register_operand" "")
-	(match_operand:GPF 1 "aarch64_mem_pair_operand" ""))
+	(match_operand:GPF 1 "memory_operand" ""))
(set (match_operand:GPF 2 "register_operand" "")
 	(match_operand:GPF 3 "memory_operand" ""))]
   "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
-  rtx base, offset_1, offset_2;
-
-  extract_base_offset_in_addr (operands[1], &base, &offset_1);
-  extract_base_offset_in_addr (operands[3], &base, &offset_2);
-  if (INTVAL (offset_1) > INTVAL (offset_2))
-{
-  std::swap (operands[0], operands[2]);
-  std::swap (operands[1], operands[3]);
-}
+  aarch64_swap_ldrstr_operands (operands, true);
 })
 
 (define_peephole2
-  [(set (match_operand:GPF 0 "aarch64_mem_pair_operand" "")
+  [(set (match_operand:GPF 0 "memory_operand" "")
 	(match_operand:GPF 1 "aarch64_reg_or_fp_zero" ""))
(set (match_operand:GPF 2 "memory_operand" "")
 	(match_operand:GPF 3 "aarch64_reg_or_fp_zero" ""))]
@@ -87,39 +63,23 @@ (define_peephole2
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
-  rtx base, offset_1, offset_2;
-
-  extract_base_offset_in_addr (operands[0], &base, &offset_1);
-  extract_base_offset_in_addr (operands[2], &base, &offset_2);
-  if (INTVAL (offset_1) > INTVAL (offset_2))
-{
-  std::swap (operands[0], operands[2]);
-  std::swap (operands[1], operands[3]);
-}
+  aarch64_swap_ldrstr_operands (operands, false);
 })
 
 (define_peephole2
   [(set (match_operand:DREG 0 "register_operand" "")
-	(match_operand:DREG 1 "aarch64_mem_pair_operand" ""))
+	(match_operand:DREG 1 "memory_operand" ""))
(set (match_operand:DREG2 2 "register_operand" "")
 	(match_operand:DREG2 3 "memory_operand" ""))]
   "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
   

Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-22 Thread Richard Biener
On Tue, 22 May 2018, David Malcolm wrote:

> On Tue, 2018-05-22 at 10:43 +0200, Richard Biener wrote:
> > On Mon, 21 May 2018, Jeff Law wrote:
> > 
> > > On 05/18/2018 07:15 AM, David Malcolm wrote:
> > > > On Fri, 2018-05-18 at 13:11 +0200, Richard Biener wrote:
> > > > > The following adds a simple alloc/free_flag machinery
> > > > > allocating
> > > > > bits from an int typed pool and applies that to bb->flags and
> > > > > edge->flags.
> > > > > 
> > > > > This should allow infrastructure pieces to use edge/bb flags
> > > > > temporarily
> > > > > without worrying that users might already use it as for example
> > > > > BB_VISITED and friends.  It converts one clever user to the new
> > > > > interface.
> > > > > 
> > > > > The allocation state is per CFG but we could also make it
> > > > > global
> > > > > or merge the two pools so one allocates a flag that can be used
> > > > > for
> > > > > bbs and edges at the same time.
> > > > > 
> > > > > Thus - any opinions welcome.  I'm mainly targeting cfganal
> > > > > algorithms
> > > > > where I want to add a few region-based ones that to be
> > > > > O(region-size)
> > > > > complexity may not use sbitmaps for visited sets because of the
> > > > > clearing
> > > > > overhead and bitmaps are probably more expensive to use than a
> > > > > BB/edge
> > > > > flag that needs to be cleared afterwards.
> > > > > 
> > > > > Built on x86_64, otherwise untested.
> > > > > 
> > > > > Any comments?
> > > > 
> > > > Rather than putting alloc/free pairs at the usage sites, how
> > > > about an
> > > > RAII class?  Something like this:
> > > 
> > > Yes, please if at all possible we should be using RAII.
> > 
> > So like the following?  (see comments in the hwint.h hunk for
> > extra C++ questions...)
> > 
> > I dropped the non-RAII interface - it's very likely never needed.
> > 
> > Better suggestions for placement of auto_flag welcome.
> 
> Do you have ideas for other uses?  If not, maybe just put it in cfg.h
> right in front of auto_edge_flag and auto_bb_flag, for simplicity?

I don't have more users but of course gimple stmts and tree nodes
would come to my mind ;)  Basically nodes in any data structure we walk
and that we can (cheaply) re-walk to clear flags in the end.  Cost
comparison would always be to a simple pointer-set or bitmap.

But sure, I'll stick it to cfg.h for the moment.  As said, my
main use case didn't materialize on trunk yet but is in a patchset
I have to bring up-to-date.

> > Thanks,
> > Richard.
> [...snip...]
> 
> The new classes are missing leading comments.  I think it's worth
> noting that the auto_flag (and thus their subclasses) hold a pointer
> into a control_flow_graph instance, but they don't interact with the
> garbage collector, so there's an implicit assumption that the auto_flag
> instances are short-lived and that the underlying storage is kept alive
> some other way (e.g. as cfun is kept alive by cfun being a GC root).

Ah, yes - missed a comment.

> 
> > +class auto_edge_flag : public auto_flag
> > +{
> > +public:
> > +  auto_edge_flag (function *fun)
> > +: auto_flag (&fun->cfg->edge_flags_allocated) {}
> > +};
> > +
> > +class auto_bb_flag : public auto_flag
> > +{
> > +public:
> > +  auto_bb_flag (function *fun)
> > +: auto_flag (&fun->cfg->bb_flags_allocated) {}
> > +};
> > +
> >  #endif /* GCC_CFG_H */
> 
> [...snip...]
> 
> Hope this is constructive

Sure!

Thanks,
Richard.
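
To make the RAII idea discussed in this thread concrete, here is a minimal standalone sketch. It is not the class from the patch: auto_flag_sketch is a hypothetical name, the pool is a plain unsigned rather than a field of control_flow_graph, and picking the lowest free bit is an assumption about the allocation policy.

```cpp
// Allocates one free bit from *pool on construction and releases it on
// destruction.  The object only holds a pointer into the pool, so the
// pool must outlive it (the lifetime caveat raised in the review).
class auto_flag_sketch
{
public:
  explicit auto_flag_sketch (unsigned *pool)
    : m_pool (pool), m_flag (0)
  {
    // Find the lowest bit that is not yet allocated.  If every bit is
    // taken, m_flag stays 0 and nothing is allocated.
    for (unsigned bit = 1; bit != 0; bit <<= 1)
      if ((*m_pool & bit) == 0)
        {
          m_flag = bit;
          break;
        }
    *m_pool |= m_flag;
  }

  ~auto_flag_sketch ()
  {
    // Return the bit to the pool.
    *m_pool &= ~m_flag;
  }

  // Copying would release the same bit twice.
  auto_flag_sketch (const auto_flag_sketch &) = delete;
  auto_flag_sketch &operator= (const auto_flag_sketch &) = delete;

  // Lets the object be used directly in expressions such as
  // "bb_flags |= flag".
  operator unsigned () const { return m_flag; }

private:
  unsigned *m_pool;
  unsigned m_flag;
};
```

A user would declare one such object in the scope of an algorithm, set and test the bit on the nodes it walks, clear it on those nodes before returning, and let the destructor hand the bit back to the pool.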


[Ada] Fix Reraise_Occurrence of Foreign_Exception

2018-05-22 Thread Pierre-Marie de Rodat
In a sequence like

    (d)           (c)                 (b)               (a)
c++ raises <-- Ada calls c++,  <-- c++ call Ada <-- Ada calls
exception      others handler      and handles      c++
               gets foreign        c++ exception
               exception and
               re-raises

the original exception raised in the C++ world at (d) couldn't be caught
as a regular c++ exception at (b) when the re-raise performed at (c) is
done with an explicit call to Ada.Exceptions.Reraise_Occurrence.

Indeed, the latter just re-crafted a new Ada-ish occurrence and the
nature and contents of the original exception object were lost.

This patch fixes this by refining Reraise_Occurrence to be more careful
with exceptions in the course of a propagation, just resuming propagation
of the original object.
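
As a loose C++-only analogy (not the Ada runtime code), the difference between the old and the new behaviour resembles the difference between throwing a fresh object from a handler and resuming propagation with a bare "throw;". All names in this sketch are hypothetical:

```cpp
// Plays the role of d(): raises the original exception.
void
raise_five ()
{
  throw 5;
}

// Like the fixed re-raise: resume propagation of the original object.
void
rethrow_same ()
{
  try
    {
      raise_five ();
    }
  catch (...)
    {
      throw;
    }
}

// Like the old behaviour: a new, unrelated object is thrown and the
// nature and contents of the original exception are lost.
struct Recrafted {};

void
rethrow_new ()
{
  try
    {
      raise_five ();
    }
  catch (...)
    {
      throw Recrafted ();
    }
}

// Plays the role of b(): only understands the original int exception.
// Returns the caught value, or -1 if something else arrived.
int
caught_by_outer (void (*middle) ())
{
  try
    {
      middle ();
    }
  catch (int x)
    {
      return x;
    }
  catch (...)
    {
      return -1;
    }
  return 0;
}
```

With rethrow_same in the middle the outer handler still sees the int, while with rethrow_new it does not, which mirrors why b() could not catch the exception before the fix.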

From the set of sources below, compilation and execution with:

  g++ -c bd.cc && gnatmake -f -g a.adb -largs bd.o --LINK=g++ && ./a

is expected to output:

foreign exception caught, reraising ...
b() caught x = 5



// bd.cc

#include <stdio.h>

extern "C" {
  extern void c();

  void b ();
  void d ();
}

void b ()
{
  try {
    c();
  } catch (int x) {
    printf ("b() caught x = %d\n", x);
  }
}

void d ()
{
  throw (5);
}

-- a.adb

with C;
procedure A is
   procedure B;
   pragma Import (Cpp, B);
begin
   B;
end;

-- c.ads

procedure C;
pragma Export (C, C, "c");

-- c.adb

with Ada.Exceptions; use Ada.Exceptions;
with System.Standard_Library;
with Ada.Unchecked_Conversion;

with Ada.Text_IO; use Ada.Text_IO;

procedure C is
   package SSL renames System.Standard_Library;
   use type SSL.Exception_Data_Ptr;

   function To_Exception_Data_Ptr is new
     Ada.Unchecked_Conversion (Exception_Id, SSL.Exception_Data_Ptr);

   procedure D;
   pragma Import (Cpp, D);

   Foreign_Exception : aliased SSL.Exception_Data;
   pragma Import
     (Ada, Foreign_Exception, "system__exceptions__foreign_exception");
begin
   D;
exception
   when E : others =>
      if To_Exception_Data_Ptr (Exception_Identity (E))
        = Foreign_Exception'Unchecked_Access
      then
         Put_Line ("foreign exception caught, reraising ...");
         Reraise_Occurrence (E);
      end if;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Olivier Hainque  

gcc/ada/

* libgnat/a-except.adb (Exception_Propagation.Propagate_Exception):
Expect an Exception_Occurrence object, not an Access.
(Complete_And_Propagate_Occurrence): Adjust accordingly.
(Raise_From_Signal_Handler): Likewise.
(Reraise_Occurrence_No_Defer): If we have a Machine_Occurrence
available in the provided occurrence object, just re-propagate the
latter as a bare "raise;" would do.
* libgnat/a-exexpr.adb (Propagate_Exception): Adjust to spec change.
* libgnat/a-exstat.adb (String_To_EO): Initialize X.Machine_Occurrence
to null, to mark that the occurrence we're crafting from the stream
contents is not being propagated (yet).

--- gcc/ada/libgnat/a-except.adb
+++ gcc/ada/libgnat/a-except.adb
@@ -228,7 +228,7 @@ package body Ada.Exceptions is
   function Allocate_Occurrence return EOA;
   --  Allocate an exception occurrence (as well as the machine occurrence)
 
-  procedure Propagate_Exception (Excep : EOA);
+  procedure Propagate_Exception (Excep : Exception_Occurrence);
   pragma No_Return (Propagate_Exception);
   --  This procedure propagates the exception represented by Excep
 
@@ -940,7 +940,7 @@ package body Ada.Exceptions is
procedure Complete_And_Propagate_Occurrence (X : EOA) is
begin
   Complete_Occurrence (X);
-  Exception_Propagation.Propagate_Exception (X);
+  Exception_Propagation.Propagate_Exception (X.all);
end Complete_And_Propagate_Occurrence;
 
-
@@ -1091,7 +1091,7 @@ package body Ada.Exceptions is
is
begin
   Exception_Propagation.Propagate_Exception
-(Create_Occurrence_From_Signal_Handler (E, M));
+(Create_Occurrence_From_Signal_Handler (E, M).all);
end Raise_From_Signal_Handler;
 
-
@@ -1587,12 +1587,25 @@ package body Ada.Exceptions is
-
 
procedure Reraise_Occurrence_No_Defer (X : Exception_Occurrence) is
-  Excep: constant EOA := Exception_Propagation.Allocate_Occurrence;
-  Saved_MO : constant System.Address := Excep.Machine_Occurrence;
begin
-  Save_Occurrence (Excep.all, X);
-  Excep.Machine_Occurrence := Saved_MO;
-  Complete_And_Propagate_Occurrence (Excep);
+  --  If we have a Machine_Occurrence at hand already, e.g. when we are
+  --  reraising a foreign exception, just repropagate. Otherwise, e.g.
+  --  when reraising a GNAT exception or an occurrence read back from a
+  --  stream, set up a new occurrence with its own Machine block first.
+
+  if X.Machine_Occurrence /= System.Null_Address then
+ 

[Ada] Missing warning for unreferenced formals in expression functions

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes an issue whereby the compiler failed to properly warn against
unreferenced formal parameters when analyzing expression functions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Justin Squirek  

gcc/ada/

* sem_ch6.adb (Analyze_Expression_Function): Propagate flags from the
original function spec into the generated function spec due to
expansion of expression functions during analysis.
(Analyze_Subprogram_Body_Helper): Modify check on formal parameter
references from the body to the subprogram spec in the case of
expression functions because of inconsistencies related to having a
generated body.
* libgnarl/s-osinte__android.ads: Flag parameters as unused.
* libgnarl/s-osinte__lynxos178e.ads: Likewise.
* libgnarl/s-osinte__qnx.adb: Likewise.
* libgnarl/s-osinte__qnx.ads: Likewise.

gcc/testsuite/

* gnat.dg/warn14.adb: New testcase.

--- gcc/ada/libgnarl/s-osinte__android.ads
+++ gcc/ada/libgnarl/s-osinte__android.ads
@@ -313,7 +313,7 @@ package System.OS_Interface is
Stack_Base_Available : constant Boolean := False;
--  Indicates whether the stack base is available on this target
 
-   function Get_Stack_Base (thread : pthread_t)
+   function Get_Stack_Base (ignored_thread : pthread_t)
  return Address is (Null_Address);
--  This is a dummy procedure to share some GNULLI files
 
@@ -425,12 +425,12 @@ package System.OS_Interface is
PTHREAD_PRIO_INHERIT : constant := 1;
 
function pthread_mutexattr_setprotocol
- (attr : access pthread_mutexattr_t;
-  protocol : int) return int is (0);
+ (ignored_attr : access pthread_mutexattr_t;
+  ignored_protocol : int) return int is (0);
 
function pthread_mutexattr_setprioceiling
- (attr: access pthread_mutexattr_t;
-  prioceiling : int) return int is (0);
+ (ignored_attr: access pthread_mutexattr_t;
+  ignored_prioceiling : int) return int is (0);
 
type struct_sched_param is record
   sched_priority : int;  --  scheduling priority

--- gcc/ada/libgnarl/s-osinte__lynxos178e.ads
+++ gcc/ada/libgnarl/s-osinte__lynxos178e.ads
@@ -453,8 +453,8 @@ package System.OS_Interface is
pragma Import (C, pthread_setschedparam, "pthread_setschedparam");
 
function pthread_attr_setscope
- (attr: access pthread_attr_t;
-  contentionscope : int) return int is (0);
+ (Unused_attr: access pthread_attr_t;
+  Unused_contentionscope : int) return int is (0);
--  pthread_attr_setscope is not implemented in production mode
 
function pthread_attr_setinheritsched

--- gcc/ada/libgnarl/s-osinte__qnx.adb
+++ gcc/ada/libgnarl/s-osinte__qnx.adb
@@ -42,13 +42,25 @@ pragma Polling (Off);
 with Interfaces.C; use Interfaces.C;
 package body System.OS_Interface is
 
+   -
+   -- sigaltstack --
+   -
+
+   function sigaltstack
+ (ss  : not null access stack_t;
+  oss : access stack_t) return int
+   is
+  pragma Unreferenced (ss, oss);
+   begin
+  return 0;
+   end sigaltstack;
+

-- Get_Stack_Base --

 
function Get_Stack_Base (thread : pthread_t) return Address is
-  pragma Warnings (Off, thread);
-
+  pragma Unreferenced (thread);
begin
   return Null_Address;
end Get_Stack_Base;

--- gcc/ada/libgnarl/s-osinte__qnx.ads
+++ gcc/ada/libgnarl/s-osinte__qnx.ads
@@ -301,7 +301,7 @@ package System.OS_Interface is
function sigaltstack
  (ss  : not null access stack_t;
   oss : access stack_t) return int
-   is (0);
+ with Inline;
--  Not supported on QNX
 
Alternate_Stack : aliased System.Address;
@@ -315,7 +315,7 @@ package System.OS_Interface is
--  Indicates whether the stack base is available on this target
 
function Get_Stack_Base (thread : pthread_t) return System.Address
- with Inline_Always;
+ with Inline;
--  This is a dummy procedure to share some GNULLI files
 
function Get_Page_Size return int;

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -490,8 +490,8 @@ package body Sem_Ch6 is
   Orig_N   : Node_Id;
   Ret  : Node_Id;
 
-  Def_Id   : Entity_Id := Empty;
-  Prev : Entity_Id;
+  Def_Id : Entity_Id := Empty;
+  Prev   : Entity_Id;
   --  If the expression is a completion, Prev is the entity whose
   --  declaration is completed. Def_Id is needed to analyze the spec.
 
@@ -783,11 +783,44 @@ package body Sem_Ch6 is
 Related_Nod => Original_Node (N));
   end if;
 
-  --  If the return expression is a static constant, we suppress warning
-  --  messages on unused formals, which in most cases will be noise.
+  --  We must enforce checks for unreferenced formals in our newly
+  --  generated function, so we propagate the referenced flag from the
+  --  

[Ada] Crash on partial initialization of controlled component

2018-05-22 Thread Pierre-Marie de Rodat
This patch modifies the late expansion of record aggregates to ensure that the
generated code which handles a controlled component initialized by a function
call is inserted in line with the rest of the initialization code, rather than
before the record aggregate. This way the function call has proper access to
the discriminants of the object being created.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Hristian Kirtchev  

gcc/ada/

* exp_aggr.adb (Initialize_Ctrl_Record_Component): Insert the generated
code for a transient component in line with the rest of the
initialization code, rather than before the aggregate. This ensures
that the component has proper visibility of the discriminants.

gcc/testsuite/

* gnat.dg/controlled8.adb: New testcase.

--- gcc/ada/exp_aggr.adb
+++ gcc/ada/exp_aggr.adb
@@ -2846,7 +2846,7 @@ package body Exp_Aggr is
 
  In_Place_Expansion :=
Nkind (Init_Expr) = N_Function_Call
-and then not Is_Build_In_Place_Result_Type (Comp_Typ);
+ and then not Is_Build_In_Place_Result_Type (Comp_Typ);
 
  --  The initialization expression is a controlled function call.
  --  Perform in-place removal of side effects to avoid creating a
@@ -2865,7 +2865,11 @@ package body Exp_Aggr is
 Set_No_Side_Effect_Removal (Init_Expr);
 
 --  Install all hook-related declarations and prepare the clean up
---  statements.
+--  statements. The generated code follows the initialization order
+--  of individual components and discriminants, rather than being
+--  inserted prior to the aggregate. This ensures that a transient
+--  component which mentions a discriminant has proper visibility
+--  of the discriminant.
 
 Process_Transient_Component
   (Loc=> Loc,
@@ -2873,7 +2877,7 @@ package body Exp_Aggr is
Init_Expr  => Init_Expr,
Fin_Call   => Fin_Call,
Hook_Clear => Hook_Clear,
-   Aggr   => N);
+   Stmts  => Stmts);
  end if;
 
  --  Use the noncontrolled component initialization circuitry to

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/controlled8.adb
@@ -0,0 +1,63 @@
+--  { dg-do compile }
+
+with Ada.Finalization; use Ada.Finalization;
+
+procedure Controlled8
+  (Int_Input : Integer;
+   Str_Input : String)
+is
+   type Ctrl is new Controlled with null record;
+   type Integer_Ptr is access all Integer;
+   type String_Ptr  is access all String;
+
+   function Func (Val : Integer) return Ctrl is
+   begin return Result : Ctrl; end Func;
+
+   function Func (Val : String) return Ctrl is
+   begin return Result : Ctrl; end Func;
+
+   type Rec_1 (Val : Integer) is record
+  Comp : Ctrl := Func (Val);
+   end record;
+
+   type Rec_2 (Val : access Integer) is record
+  Comp : Ctrl := Func (Val.all);
+   end record;
+
+   type Rec_3 (Val : Integer_Ptr) is record
+  Comp : Ctrl := Func (Val.all);
+   end record;
+
+   type Rec_4 (Val : access String) is record
+  Comp : Ctrl := Func (Val.all);
+   end record;
+
+   type Rec_5 (Val : String_Ptr) is record
+  Comp : Ctrl := Func (Val.all);
+   end record;
+
+   Int_Heap  : constant Integer_Ptr := new Integer'(Int_Input);
+   Int_Stack : aliased  Integer := Int_Input;
+   Str_Heap  : constant String_Ptr  := new String'(Str_Input);
+   Str_Stack : aliased  String  := Str_Input;
+
+   Obj_1  : constant Rec_1 := (Val => Int_Input, others => <>);
+
+   Obj_2  : constant Rec_2 := (Val => Int_Heap, others => <>);
+   Obj_3  : constant Rec_2 := (Val => Int_Stack'Access, others => <>);
+   Obj_4  : constant Rec_2 := (Val => new Integer'(Int_Input), others => <>);
+
+   Obj_5  : constant Rec_3 := (Val => Int_Heap, others => <>);
+   Obj_6  : constant Rec_3 := (Val => Int_Stack'Access, others => <>);
+   Obj_7  : constant Rec_3 := (Val => new Integer'(Int_Input), others => <>);
+
+   Obj_8  : constant Rec_4 := (Val => Str_Heap, others => <>);
+   Obj_9  : constant Rec_4 := (Val => Str_Stack'Access, others => <>);
+   Obj_10 : constant Rec_4 := (Val => new String'(Str_Input), others => <>);
+
+   Obj_11 : constant Rec_5 := (Val => Str_Heap, others => <>);
+   Obj_12 : constant Rec_5 := (Val => Str_Stack'Access, others => <>);
+   Obj_13 : constant Rec_5 := (Val => new String'(Str_Input), others => <>);
+begin
+   null;
+end Controlled8;



[Ada] Fix retrieval of number of CPUs on QNX

2018-05-22 Thread Pierre-Marie de Rodat
Although the sysconf name _SC_NPROCESSORS_ONLN is also defined by the API, the
only documented way to retrieve the number of CPUs is by using the syspage.

This also better organises the QNX-specific macros in adaint.c.
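
As a rough illustration of the selection logic described above (a sketch only,
not the code in adaint.c): the helper name number_of_cpus is hypothetical, and
the header <sys/syspage.h> is assumed to be the one that declares _syspage_ptr
on QNX. On other POSIX systems the sketch falls back to sysconf.

```c
#include <unistd.h>

#ifdef __QNX__
/* Assumption: on QNX, _syspage_ptr is declared by this header.  */
#include <sys/syspage.h>
#endif

/* Hypothetical helper (not __gnat_number_of_cpus itself): returns the
   number of CPUs, or 1 when no query mechanism is available.  */
static int
number_of_cpus (void)
{
  int cores = 1;

#ifdef __QNX__
  /* QNX: read the CPU count from the system page.  */
  cores = (int) _syspage_ptr->num_cpu;
#elif defined (_SC_NPROCESSORS_ONLN)
  /* Other POSIX-like systems: ask sysconf for the online CPU count.  */
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    cores = (int) n;
#endif

  return cores;
}
```

Keeping the QNX case in its own preprocessor branch mirrors the patch, which
removes __QNX__ from the sysconf list instead of special-casing it at run time.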

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Jerome Lambourg  

gcc/ada/

* adaint.c: Reorganize QNX-specific macros, use syspage to retrieve the
number of CPUs.

--- gcc/ada/adaint.c
+++ gcc/ada/adaint.c
@@ -39,7 +39,9 @@
 #define _THREAD_SAFE
 
 /* Use 64 bit Large File API */
-#ifndef _LARGEFILE_SOURCE
+#if defined (__QNX__)
+#define _LARGEFILE64_SOURCE 1
+#elif !defined(_LARGEFILE_SOURCE)
 #define _LARGEFILE_SOURCE
 #endif
 #define _FILE_OFFSET_BITS 64
@@ -81,8 +83,8 @@
 #define __BSD_VISIBLE 1
 #endif
 
-#if defined (__QNX__)
-#define _LARGEFILE64_SOURCE 1
+#ifdef __QNX__
+#include 
 #endif
 
 #ifdef IN_RTS
@@ -2350,9 +2352,12 @@ __gnat_number_of_cpus (void)
 
 #if defined (__linux__) || defined (__sun__) || defined (_AIX) \
   || defined (__APPLE__) || defined (__FreeBSD__) || defined (__OpenBSD__) \
-  || defined (__DragonFly__) || defined (__NetBSD__) || defined (__QNX__)
+  || defined (__DragonFly__) || defined (__NetBSD__)
   cores = (int) sysconf (_SC_NPROCESSORS_ONLN);
 
+#elif defined (__QNX__)
+  cores = (int) _syspage_ptr->num_cpu;
+
 #elif defined (__hpux__)
   struct pst_dynamic psd;
   if (pstat_getdynamic (&psd, sizeof (psd), 1, 0) != -1)



[Ada] In-place initialization for Initialize_Scalars

2018-05-22 Thread Pierre-Marie de Rodat
This patch optimizes the initialization and allocation of scalar array objects
when pragma Initialize_Scalars is in effect. The patch also extends the syntax
and semantics of pragma Initialize_Scalars to allow for the specification of
invalid values pertaining to families of scalar types. The new syntax is as
follows:

   pragma Initialize_Scalars
 [ ( TYPE_VALUE_PAIR {, TYPE_VALUE_PAIR} ) ];

   TYPE_VALUE_PAIR ::=
 SCALAR_TYPE => static_EXPRESSION

   SCALAR_TYPE ::=
 Short_Float
   | Float
   | Long_Float
   | Long_Long_Float
   | Signed_8
   | Signed_16
   | Signed_32
   | Signed_64
   | Unsigned_8
   | Unsigned_16
   | Unsigned_32
   | Unsigned_64

Depending on the value specified by pragma Initialize_Scalars, the backend may
optimize the creation of the scalar array object into a fast memset.
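
To see why the chosen value matters, here is a minimal C sketch of the idea
(an illustration only, not what the GNAT back end actually emits): a fill can
be turned into memset only when every byte of the value is identical, as with
0 or -1. The names has_uniform_bytes and fill_int32 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when all four bytes of VALUE are equal,
   so that storing VALUE repeatedly is the same as a byte-wise fill.  */
static int
has_uniform_bytes (uint32_t value)
{
  uint8_t b = (uint8_t) (value & 0xFF);
  return value == (uint32_t) b * 0x01010101u;
}

/* Hypothetical helper: fills ARR[0 .. N-1] with INVALID, using memset
   when the bit pattern allows it and an element-wise loop otherwise.  */
static void
fill_int32 (int32_t *arr, size_t n, int32_t invalid)
{
  uint32_t bits = (uint32_t) invalid;

  if (has_uniform_bytes (bits))
    memset (arr, (int) (bits & 0xFF), n * sizeof *arr);
  else
    for (size_t i = 0; i < n; i++)
      arr[i] = invalid;
}
```

With all values set to 0, as in the gnat.adc file below, every scalar family
has a uniform byte pattern, which is the case where the fast path applies.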


-- Source --


--  gnat.adc

pragma Initialize_Scalars
  (Short_Float     => 0.0,
   Float           => 0.0,
   Long_Float      => 0.0,
   Long_Long_Float => 0.0,
   Signed_8        => 0,
   Signed_16       => 0,
   Signed_32       => 0,
   Signed_64       => 0,
   Unsigned_8      => 0,
   Unsigned_16     => 0,
   Unsigned_32     => 0,
   Unsigned_64     => 0);

--  types.ads

with System;

package Types is
   Max : constant := 10_000;
   subtype Big is Integer range 1 .. Max;

   type Byte is range 0 .. 255;
   for Byte'Size use System.Storage_Unit;

   type Byte_Arr_1 is array (1 .. Max) of Byte;
   type Byte_Arr_2 is array (Big) of Byte;
   type Byte_Arr_3 is array (Integer range <>) of Byte;
   type Byte_Arr_4 is array (Integer range <>,
 Integer range <>) of Byte;
   type Constr_Arr_1 is array (1 .. Max) of Integer;
   type Constr_Arr_2 is array (Big) of Integer;
   type Constr_Arr_3 is array (1 .. Max, 1 .. Max) of Integer;
   type Constr_Arr_4 is array (Big, Big) of Integer;

   type Unconstr_Arr_1 is array (Integer range <>) of Integer;
   type Unconstr_Arr_2 is array (Integer range <>,
 Integer range <>) of Integer;

   subtype Subt_Arr_1 is Unconstr_Arr_1 (1 .. Max);
   subtype Subt_Arr_2 is Unconstr_Arr_1 (Big);
   subtype Subt_Arr_3 is Unconstr_Arr_2 (1 .. Max, 1 .. Max);
   subtype Subt_Arr_4 is Unconstr_Arr_2 (Big, Big);

   subtype Subt_Str_1 is String (1 .. Max);
   subtype Subt_Str_2 is String (Big);

   type Byte_Arr_1_Ptr is access Byte_Arr_1;
   type Byte_Arr_2_Ptr is access Byte_Arr_2;
   type Byte_Arr_3_Ptr is access Byte_Arr_3;
   type Byte_Arr_4_Ptr is access Byte_Arr_4;
   type Constr_Arr_1_Ptr   is access Constr_Arr_1;
   type Constr_Arr_2_Ptr   is access Constr_Arr_2;
   type Constr_Arr_3_Ptr   is access Constr_Arr_3;
   type Constr_Arr_4_Ptr   is access Constr_Arr_4;
   type Unconstr_Arr_1_Ptr is access Unconstr_Arr_1;
   type Unconstr_Arr_2_Ptr is access Unconstr_Arr_2;
   type Subt_Arr_1_Ptr is access Subt_Arr_1;
   type Subt_Arr_2_Ptr is access Subt_Arr_2;
   type Subt_Arr_3_Ptr is access Subt_Arr_3;
   type Subt_Arr_4_Ptr is access Subt_Arr_4;
   type Str_Ptr is access String;
   type Subt_Str_1_Ptr is access Subt_Str_1;
   type Subt_Str_2_Ptr is access Subt_Str_2;
end Types;

--  main.adb

with Types; use Types;

procedure Main is
   Byte_Arr_1_Obj : Byte_Arr_1;
   Byte_Arr_2_Obj : Byte_Arr_2;
   Byte_Arr_3_Obj : Byte_Arr_3 (1 .. Max);
   Byte_Arr_4_Obj : Byte_Arr_3 (Big);
   Byte_Arr_5_Obj : Byte_Arr_4 (1 .. Max, 1 .. Max);
   Byte_Arr_6_Obj : Byte_Arr_4 (Big, Big);
   Constr_Arr_1_Obj   : Constr_Arr_1;
   Constr_Arr_2_Obj   : Constr_Arr_2;
   Constr_Arr_3_Obj   : Constr_Arr_3;
   Constr_Arr_4_Obj   : Constr_Arr_4;
   Unconstr_Arr_1_Obj : Unconstr_Arr_1 (1 .. Max);
   Unconstr_Arr_2_Obj : Unconstr_Arr_1 (Big);
   Unconstr_Arr_3_Obj : Unconstr_Arr_2 (1 .. Max, 1 .. Max);
   Unconstr_Arr_4_Obj : Unconstr_Arr_2 (Big, Big);
   Subt_Arr_1_Obj : Subt_Arr_1;
   Subt_Arr_2_Obj : Subt_Arr_2;
   Subt_Arr_3_Obj : Subt_Arr_3;
   Subt_Arr_4_Obj : Subt_Arr_4;
   Str_1_Obj  : String (1 .. Max);
   Str_2_Obj  : String (Big);
   Subt_Str_1_Obj : Subt_Str_1;
   Subt_Str_2_Obj : Subt_Str_2;

   Byte_Arr_1_Ptr_Obj : Byte_Arr_1_Ptr := new Byte_Arr_1;
   Byte_Arr_2_Ptr_Obj : Byte_Arr_2_Ptr := new Byte_Arr_2;
   Byte_Arr_3_Ptr_Obj : Byte_Arr_3_Ptr := new Byte_Arr_3 (1 .. Max);
   Byte_Arr_4_Ptr_Obj : Byte_Arr_3_Ptr := new Byte_Arr_3 (Big);
   Byte_Arr_5_Ptr_Obj : Byte_Arr_4_Ptr :=
  new Byte_Arr_4 (1 .. Max, 1 .. Max);
   Byte_Arr_6_Ptr_Obj : Byte_Arr_4_Ptr := new Byte_Arr_4 (Big, Big);
   Constr_Arr_1_Ptr_Obj   : Constr_Arr_1_Ptr   := new Constr_Arr_1;
   Constr_Arr_2_Ptr_Obj   : Constr_Arr_2_Ptr   := new Constr_Arr_2;
   Constr_Arr_3_Ptr_Obj   : Constr_Arr_3_Ptr   := new Constr_Arr_3;
   Constr_Arr_4_Ptr_Obj   : Constr_Arr_4_Ptr   := new Constr_Arr_4;
   Unconstr_Arr_1_Ptr_Obj : Unconstr_Arr_1_Ptr :=
 

[Ada] Disable name generation for External_Tag and Expanded_Name

2018-05-22 Thread Pierre-Marie de Rodat
In order to avoid exposing internal names of tagged types in the
binary code generated by the compiler, this enhancement
initializes the External_Tag of a tagged type with an empty string
when pragma No_Tagged_Streams is applicable to the tagged type, and
initializes its Expanded_Name with an empty string when
pragma Discard_Names is applicable to the tagged type.

This enhancement can be verified by means of the following small
test:

package Library_Level_Test is
   type Typ_01 is tagged null record;  --  Case 1: No pragmas

   type Typ_02 is tagged null record;  --  Case 2: Discard_Names
   pragma Discard_Names (Typ_02);

   pragma No_Tagged_Streams;
   type Typ_03 is tagged null record;  --  Case 3: No_Tagged_Streams

   type Typ_04 is tagged null record;  --  Case 4: Both pragmas
   pragma Discard_Names (Typ_04);
end;

Commands:
  gcc -c -gnatD library_level_test.ads
  grep "\.TYP_" library_level_test.ads.dg

Output:
 "LIBRARY_LEVEL_TEST.TYP_01["00"]";
 "LIBRARY_LEVEL_TEST.TYP_02["00"]";
 "LIBRARY_LEVEL_TEST.TYP_03["00"]";

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Javier Miranda  

gcc/ada/

* exp_disp.adb (Make_DT): Initialize the External_Tag with an empty
string when pragma No_Tagged_Streams is applicable to the tagged type,
and initialize the Expanded_Name with an empty string when pragma
Discard_Names is applicable to the tagged type.

--- gcc/ada/exp_disp.adb
+++ gcc/ada/exp_disp.adb
@@ -4511,7 +4511,8 @@ package body Exp_Disp is
   DT_Aggr_List   : List_Id;
   DT_Constr_List : List_Id;
   DT_Ptr : Entity_Id;
-  Exname : Entity_Id;
+  Expanded_Name  : Entity_Id;
+  External_Tag_Name  : Entity_Id;
   HT_Link: Entity_Id;
   ITable : Node_Id;
   I_Depth: Nat := 0;
@@ -4590,12 +4591,44 @@ package body Exp_Disp is
  end if;
   end if;
 
-  DT   := Make_Defining_Identifier (Loc, Name_DT);
-  Exname   := Make_Defining_Identifier (Loc, Name_Exname);
-  HT_Link  := Make_Defining_Identifier (Loc, Name_HT_Link);
-  Predef_Prims := Make_Defining_Identifier (Loc, Name_Predef_Prims);
-  SSD  := Make_Defining_Identifier (Loc, Name_SSD);
-  TSD  := Make_Defining_Identifier (Loc, Name_TSD);
+  DT:= Make_Defining_Identifier (Loc, Name_DT);
+  Expanded_Name := Make_Defining_Identifier (Loc, Name_Exname);
+  HT_Link   := Make_Defining_Identifier (Loc, Name_HT_Link);
+  Predef_Prims  := Make_Defining_Identifier (Loc, Name_Predef_Prims);
+  SSD   := Make_Defining_Identifier (Loc, Name_SSD);
+  TSD   := Make_Defining_Identifier (Loc, Name_TSD);
+
+  --  Expanded_Name
+  --  -
+
+  --  We generally initialize the Expanded_Name and the External_Tag of
+  --  tagged types with the same name, unless pragmas Discard_Names or
+  --  No_Tagged_Streams apply: Discard_Names allows us to initialize its
+  --  Expanded_Name with an empty string because in such a case it's
+  --  value is implementation defined (Ada RM Section C.5(7/2)); pragma
+  --  No_Tagged_Streams inhibits the generation of stream routines and
+  --  we initialize its External_Tag with an empty string since Ada.Tags
+  --  services Internal_Tag and External_Tag are mainly used with streams.
+
+  --  Small optimization: when both pragmas apply then there is no need to
+  --  declare two objects initialized with empty strings (since the two
+  --  aggregate components can be initialized with the same object).
+
+  if (Global_Discard_Names or else Discard_Names (Typ))
+and then Present (No_Tagged_Streams_Pragma (Typ))
+  then
+ External_Tag_Name := Expanded_Name;
+
+  elsif Global_Discard_Names
+or else Discard_Names (Typ)
+or else Present (No_Tagged_Streams_Pragma (Typ))
+  then
+ External_Tag_Name :=
+   Make_Defining_Identifier (Loc,
+ New_External_Name (Tname, 'N', Suffix_Index => -1));
+  else
+ External_Tag_Name := Expanded_Name;
+  end if;
 
   --  Initialize Parent_Typ handling private types
 
@@ -5000,20 +5033,72 @@ package body Exp_Disp is
  end if;
   end if;
 
-  --  Generate: Exname : constant String := full_qualified_name (typ);
+  --  Generate: Expanded_Name : constant String := "";
+
+  if Global_Discard_Names or else Discard_Names (Typ) then
+ Append_To (Result,
+   Make_Object_Declaration (Loc,
+ Defining_Identifier => Expanded_Name,
+ Constant_Present=> True,
+ Object_Definition   => New_Occurrence_Of (Standard_String, Loc),
+ Expression =>
+   Make_String_Literal (Loc, "")));
+
+  --  Generate:
+  --Expanded_Name : constant 

[Ada] Fix the signal trampoline on QNX

2018-05-22 Thread Pierre-Marie de Rodat
The trampoline now properly restores the link register as well as the stack
pointer. As a minor optimisation, now only callee-saved registers are
restored: the scratch registers don't need that.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Jerome Lambourg  

gcc/ada/

* sigtramp-qnx.c: Properly restore link register in signal trampoline.

--- gcc/ada/sigtramp-qnx.c
+++ gcc/ada/sigtramp-qnx.c
@@ -170,33 +170,20 @@ TCR("ret")
 #define REG_OFFSET_GR(n) (n * 8)
 #define REGNO_GR(n)   n
 
-/* point to the ELR value of the mcontext registers list */
+/* ELR value offset withing the mcontext registers list */
 #define REG_OFFSET_ELR   (32 * 8)
-#define REGNO_PC  30
+/* The register used to hold the PC value to restore. We need a scratch
+   register.  */
+#define REGNO_PC  9
 
 #define CFI_DEF_CFA \
   TCR(".cfi_def_cfa " S(CFA_REG) ", 0")
 
+/* This restores the callee-saved registers, the FP, the LR, and the SP.
+   A scratch register is used as return column to indicate the new value
+   for PC */
 #define CFI_COMMON_REGS \
   CR("# CFI for common registers\n") \
-  TCR(COMMON_CFI(GR(0)))  \
-  TCR(COMMON_CFI(GR(1)))  \
-  TCR(COMMON_CFI(GR(2)))  \
-  TCR(COMMON_CFI(GR(3)))  \
-  TCR(COMMON_CFI(GR(4)))  \
-  TCR(COMMON_CFI(GR(5)))  \
-  TCR(COMMON_CFI(GR(6)))  \
-  TCR(COMMON_CFI(GR(7)))  \
-  TCR(COMMON_CFI(GR(8)))  \
-  TCR(COMMON_CFI(GR(9)))  \
-  TCR(COMMON_CFI(GR(10))) \
-  TCR(COMMON_CFI(GR(11))) \
-  TCR(COMMON_CFI(GR(12))) \
-  TCR(COMMON_CFI(GR(13))) \
-  TCR(COMMON_CFI(GR(14))) \
-  TCR(COMMON_CFI(GR(15))) \
-  TCR(COMMON_CFI(GR(16))) \
-  TCR(COMMON_CFI(GR(17))) \
   TCR(COMMON_CFI(GR(18))) \
   TCR(COMMON_CFI(GR(19))) \
   TCR(COMMON_CFI(GR(20))) \
@@ -209,6 +196,8 @@ TCR("ret")
   TCR(COMMON_CFI(GR(27))) \
   TCR(COMMON_CFI(GR(28))) \
   TCR(COMMON_CFI(GR(29))) \
+  TCR(COMMON_CFI(GR(30))) \
+  TCR(COMMON_CFI(GR(31))) \
   TCR(".cfi_offset " S(REGNO_PC) "," S(REG_OFFSET_ELR)) \
   TCR(".cfi_return_column " S(REGNO_PC))
 



[Ada] Better error message on illegal 'Access on formal subprogram

2018-05-22 Thread Pierre-Marie de Rodat
This patch improves on the error message for an attempt to apply 'Access
to a formal subprogram. It also applies the check to a renaming of a formal
subprogram.

Compiling p.adb must yield:

p.adb:15:18: not subtype conformant with declaration at line 2
p.adb:15:18: formal subprograms are not subtype conformant (RM 6.3.1 (17/3))
p.adb:16:18: not subtype conformant with declaration at line 2
p.adb:16:18: formal subprograms are not subtype conformant (RM 6.3.1 (17/3))


package body P is
  procedure Non_Generic (P : access procedure (I : Integer)) is
  begin
P.all (5);
  end Non_Generic;

  procedure G is
procedure Local (I : Integer) is
begin
  Action (I);
end;
procedure Local_Action (I : Integer) renames Action;
  begin
Non_Generic (Local'access);
Non_Generic (Local_Action'access);
Non_Generic (Action'access);
-- p.adb:15:18: not subtype conformant with declaration at line 2
-- p.adb:15:18: formal subprograms not allowed
  end G;
end P;

package P is
  generic
with procedure Action (I : Integer);
  procedure G;
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* sem_ch6.adb (Check_Conformance): Add RM reference for rule that a
formal subprogram is never subtype conformant, and thus cannot be the
prefix of 'Access.  Reject as well the attribute when applied to a
renaming of a formal subprogram.

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -5348,9 +5348,13 @@ package body Sem_Ch6 is
 
  elsif Is_Formal_Subprogram (Old_Id)
or else Is_Formal_Subprogram (New_Id)
+   or else (Is_Subprogram (New_Id)
+ and then Present (Alias (New_Id))
+ and then Is_Formal_Subprogram (Alias (New_Id)))
  then
-Conformance_Error ("\formal subprograms not allowed!");
-return;
+Conformance_Error
+   ("\formal subprograms are not subtype conformant "
+ & "(RM 6.3.1 (17/3))");
  end if;
   end if;
 



[Ada] In-place initialization for Initialize_Scalars

2018-05-22 Thread Pierre-Marie de Rodat
This patch cleans up the implementation of routine Get_Simple_Init_Val. It also
eliminates potentially large and unnecessary tree replications in the context
of object default initialization.

No change in behavior, no test needed.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Hristian Kirtchev  

gcc/ada/

* exp_ch3.adb (Build_Array_Init_Proc): Update the call to
Needs_Simple_Initialization.
(Build_Init_Statements): Update the call to Get_Simple_Init_Val.
(Check_Subtype_Bounds): Renamed to Extract_Subtype_Bounds. Update the
profile and comment on usage.
(Default_Initialize_Object): Do not use New_Copy_Tree to set the proper
Sloc of a value obtained from aspect Default_Value because this could
potentially replicate large trees. The proper Sloc is now set in
Get_Simple_Init_Val.
(Get_Simple_Init_Val): Reorganized by breaking the various cases into
separate routines. Eliminate the use of global variables.
(Init_Component): Update the call to Get_Simple_Init_Val.
(Needs_Simple_Initialization): Update the parameter profile and all
uses of T.
(Simple_Init_Defaulted_Type): Copy the value of aspect Default_Value
and set the proper Sloc.
* exp_ch3.ads (Get_Simple_Init_Val): Update the parameter profile and
comment on usage.
(Needs_Simple_Initialization): Update the parameter profile.

--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -520,7 +520,7 @@ package body Exp_Ch3 is
   Comp_Type: constant Entity_Id := Component_Type (A_Type);
   Comp_Simple_Init : constant Boolean   :=
 Needs_Simple_Initialization
-  (T   => Comp_Type,
+  (Typ => Comp_Type,
Consider_IS =>
  not (Validity_Check_Copies and Is_Bit_Packed_Array (A_Type)));
   --  True if the component needs simple initialization, based on its type,
@@ -576,13 +576,17 @@ package body Exp_Ch3 is
 Name   => Comp,
 Expression =>
   Get_Simple_Init_Val
-(Comp_Type, Nod, Component_Size (A_Type;
+(Typ  => Comp_Type,
+ N=> Nod,
+ Size => Component_Size (A_Type;
 
  else
 Clean_Task_Names (Comp_Type, Proc_Id);
 return
   Build_Initialization_Call
-(Loc, Comp, Comp_Type,
+(Loc  => Loc,
+ Id_Ref   => Comp,
+ Typ  => Comp_Type,
  In_Init_Proc => True,
  Enclos_Type  => A_Type);
  end if;
@@ -3106,7 +3110,12 @@ package body Exp_Ch3 is
elsif Component_Needs_Simple_Initialization (Typ) then
   Actions :=
 Build_Assignment
-  (Id, Get_Simple_Init_Val (Typ, N, Esize (Id)));
+  (Id  => Id,
+   Default =>
+ Get_Simple_Init_Val
+   (Typ  => Typ,
+N=> N,
+Size => Esize (Id)));
 
--  Nothing needed for this case
 
@@ -3277,7 +3286,12 @@ package body Exp_Ch3 is
   elsif Component_Needs_Simple_Initialization (Typ) then
  Append_List_To (Stmts,
Build_Assignment
- (Id, Get_Simple_Init_Val (Typ, N, Esize (Id;
+ (Id  => Id,
+  Default =>
+Get_Simple_Init_Val
+  (Typ  => Typ,
+   N=> N,
+   Size => Esize (Id;
   end if;
end if;
 
@@ -6004,9 +6018,9 @@ package body Exp_Ch3 is
and then not Initialization_Suppressed (Typ)
  then
 --  Do not initialize the components if No_Default_Initialization
---  applies as the actual restriction check will occur later
---  when the object is frozen as it is not known yet whether the
---  object is imported or not.
+--  applies as the actual restriction check will occur later when
+--  the object is frozen as it is not known yet whether the object
+--  is imported or not.
 
 if not Restriction_Active (No_Default_Initialization) then
 
@@ -6016,8 +6030,8 @@ package body Exp_Ch3 is
Aggr_Init := Static_Initialization (Base_Init_Proc (Typ));
 
if Present (Aggr_Init) then
-  Set_Expression
-(N, New_Copy_Tree (Aggr_Init, New_Scope => Current_Scope));
+  Set_Expression (N,
+New_Copy_Tree (Aggr_Init, New_Scope => Current_Scope));
 

[Ada] Fix compiler abort on invalid discriminant constraint

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a compiler abort on a discriminant constraint when the
constraint is a subtype indication.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Patrick Bernardi  

gcc/ada/

* sem_ch3.adb (Build_Discriminant_Constraints): Raise an error if the
user tries to use a subtype indication as a discriminant constraint.

gcc/testsuite/

* gnat.dg/discr50.adb: New testcase.

--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -9877,6 +9877,12 @@ package body Sem_Ch3 is
   ("a range is not a valid discriminant constraint", Constr);
 Discr_Expr (D) := Error;
 
+ elsif Nkind (Constr) = N_Subtype_Indication then
+Error_Msg_N
+  ("a subtype indication is not a valid discriminant constraint",
+   Constr);
+Discr_Expr (D) := Error;
+
  else
 Process_Discriminant_Expression (Constr, Discr);
 Discr_Expr (D) := Constr;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr50.adb
@@ -0,0 +1,11 @@
+--  { dg-do compile }
+
+procedure Discr50 is
+   type My_Record (D : Integer) is record
+  A : Integer;
+   end record;
+
+   B : My_Record (Positive range 1 .. 10);  -- { dg-error "a subtype indication is not a valid discriminant constraint" }
+begin
+   null;
+end Discr50;



[Ada] Ada2020: Reduction expressions

2018-05-22 Thread Pierre-Marie de Rodat
This patch dismantles the prototype implementation of the first proposal
for Reduction expressions, one of the important potentially parallel
constructs for Ada2020. The ARG is going in a different direction with
a simpler syntax.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* exp_ch4.ads, exp_ch4.adb, exp_util.adb, expander.adb: Remove mention
of N_Reduction_Expression and N_Reduction_Expression_Parameter.
* par-ch4.adb: Remove parsing routines for reduction expressions.
* sem.adb, sinfo.ads, sinfo.adb, sem_ch4.ads, sem_ch4.adb, sem_res.adb,
sem_spark.adb, sprint.adb: Remove analysis routines for reduction
expressions.

--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -10077,77 +10077,6 @@ package body Exp_Ch4 is
   Analyze_And_Resolve (N, Standard_Boolean);
end Expand_N_Quantified_Expression;
 
-   ---
-   -- Expand_N_Reduction_Expression --
-   ---
-
-   procedure Expand_N_Reduction_Expression (N : Node_Id) is
-  Actions   : constant List_Id:= New_List;
-  Expr  : constant Node_Id:= Expression (N);
-  Iter_Spec : constant Node_Id:= Iterator_Specification (N);
-  Loc   : constant Source_Ptr := Sloc (N);
-  Loop_Spec : constant Node_Id:= Loop_Parameter_Specification (N);
-  Typ   : constant Entity_Id  := Etype (N);
-
-  Actual: Node_Id;
-  New_Call  : Node_Id;
-  Reduction_Par : Node_Id;
-  Result: Entity_Id;
-  Scheme: Node_Id;
-
-   begin
-  Result   := Make_Temporary (Loc, 'R', N);
-  New_Call := New_Copy_Tree (Expr);
-
-  if Nkind (New_Call) = N_Function_Call then
- Actual := First (Parameter_Associations (New_Call));
-
- if Nkind (Actual) /= N_Reduction_Expression_Parameter then
-Actual := Next_Actual (Actual);
- end if;
-
-  elsif Nkind (New_Call) in N_Binary_Op then
- Actual := Left_Opnd (New_Call);
-
- if Nkind (Actual) /= N_Reduction_Expression_Parameter then
-Actual := Right_Opnd (New_Call);
- end if;
-  end if;
-
-  Reduction_Par := Expression (Actual);
-
-  Append_To (Actions,
-Make_Object_Declaration (Loc,
-  Defining_Identifier => Result,
-  Object_Definition   => New_Occurrence_Of (Typ, Loc),
-  Expression  => New_Copy_Tree (Reduction_Par)));
-
-  if Present (Iter_Spec) then
- Scheme :=
-   Make_Iteration_Scheme (Loc,
- Iterator_Specification => Iter_Spec);
-  else
- Scheme :=
-   Make_Iteration_Scheme (Loc,
- Loop_Parameter_Specification => Loop_Spec);
-  end if;
-
-  Replace (Actual, New_Occurrence_Of (Result, Loc));
-
-  Append_To (Actions,
-Make_Loop_Statement (Loc,
-  Iteration_Scheme => Scheme,
-  Statements   => New_List (Make_Assignment_Statement (Loc,
-New_Occurrence_Of (Result, Loc), New_Call)),
-  End_Label=> Empty));
-
-  Rewrite (N,
-Make_Expression_With_Actions (Loc,
-  Expression => New_Occurrence_Of (Result, Loc),
-  Actions=> Actions));
-  Analyze_And_Resolve (N, Typ);
-   end Expand_N_Reduction_Expression;
-
-
-- Expand_N_Selected_Component --
-

--- gcc/ada/exp_ch4.ads
+++ gcc/ada/exp_ch4.ads
@@ -68,7 +68,6 @@ package Exp_Ch4 is
procedure Expand_N_Or_Else (N : Node_Id);
procedure Expand_N_Qualified_Expression(N : Node_Id);
procedure Expand_N_Quantified_Expression   (N : Node_Id);
-   procedure Expand_N_Reduction_Expression(N : Node_Id);
procedure Expand_N_Selected_Component  (N : Node_Id);
procedure Expand_N_Slice   (N : Node_Id);
procedure Expand_N_Type_Conversion (N : Node_Id);

--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -7349,8 +7349,6 @@ package body Exp_Util is
| N_Real_Literal
| N_Real_Range_Specification
| N_Record_Definition
-   | N_Reduction_Expression
-   | N_Reduction_Expression_Parameter
| N_Reference
| N_SCIL_Dispatch_Table_Tag_Init
| N_SCIL_Dispatching_Call

--- gcc/ada/expander.adb
+++ gcc/ada/expander.adb
@@ -435,9 +435,6 @@ package body Expander is
when N_Record_Representation_Clause =>
   Expand_N_Record_Representation_Clause (N);
 
-   when N_Reduction_Expression =>
-  Expand_N_Reduction_Expression (N);
-
when N_Requeue_Statement =>
   Expand_N_Requeue_Statement (N);
 

--- gcc/ada/par-ch4.adb
+++ gcc/ada/par-ch4.adb
@@ -75,8 +75,7 @@ package body Ch4 is
function 

[Ada] Crash with private types and renamed discriminants

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a compiler abort on an object declaration whose type
is a private type with discriminants, and whose full view is a derived
type that renames some discriminant of its parent.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* sem_ch3.adb (Search_Derivation_Levels): Whenever a parent type is
private, use the full view if available, because it may include renamed
discriminants whose values are stored in the corresponding
Stored_Constraint.

gcc/testsuite/

* gnat.dg/discr49.adb, gnat.dg/discr49_rec1.adb,
gnat.dg/discr49_rec1.ads, gnat.dg/discr49_rec2.adb,
gnat.dg/discr49_rec2.ads: New testcase.

--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -17977,9 +17977,19 @@ package body Sem_Ch3 is
   Search_Derivation_Levels (Ti, Stored_Constraint (Ti), True);
  else
 declare
-   Td : constant Entity_Id := Etype (Ti);
+   Td : Entity_Id := Etype (Ti);
 
 begin
+
+   --  If the parent type is private, the full view may include
+   --  renamed discriminants, and it is those stored values
+   --  that may be needed (the partial view never has more
+   --  information than the full view).
+
+   if Is_Private_Type (Td) and then Present (Full_View (Td)) then
+  Td := Full_View (Td);
+   end if;
+
if Td = Ti then
   Result := Discriminant;
 

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr49.adb
@@ -0,0 +1,12 @@
+--  { dg-do run }
+
+with Discr49_Rec2; use Discr49_Rec2;
+
+procedure Discr49 is
+   Obj : Child (True);
+   I : Integer := Value (Obj) + Boolean'Pos (Obj.Discr);
+begin
+   if I /= 125 then
+  raise Program_Error;
+   end if;
+end Discr49;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr49_rec1.adb
@@ -0,0 +1,6 @@
+package body Discr49_Rec1 is
+   function Value (Obj : Parent) return Integer is
+   begin
+  return Obj.V + Boolean'Pos (Obj.Discr_1);
+   end;
+end Discr49_Rec1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr49_rec1.ads
@@ -0,0 +1,8 @@
+package Discr49_Rec1 is
+   type Parent (Discr_1 : Boolean; Discr_2 : Boolean) is private;
+   function Value (Obj : Parent) return Integer;
+private
+   type Parent (Discr_1 : Boolean; Discr_2 : Boolean) is record
+  V : Integer := 123;
+   end record;
+end Discr49_Rec1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr49_rec2.adb
@@ -0,0 +1,6 @@
+package body Discr49_Rec2 is
+   function Value (Obj : Child) return Integer is
+   begin
+  return Value (Parent (Obj));
+   end;
+end Discr49_Rec2;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/discr49_rec2.ads
@@ -0,0 +1,10 @@
+with Discr49_Rec1; use Discr49_Rec1;
+
+package Discr49_Rec2 is
+   type Child (Discr : Boolean) is private;
+   function Value (Obj : Child) return Integer;
+
+private
+   type Child (Discr : Boolean) is
+ new Parent (Discr_1 => Discr, Discr_2 => True);
+end Discr49_Rec2;



[Ada] Allow attribute 'Valid_Scalars on private types

2018-05-22 Thread Pierre-Marie de Rodat
This patch modifies the analysis and expansion of attribute 'Valid_Scalars. It
is now possible to specify the attribute on a prefix of an untagged private
type.


-- Source --


--  gnat.adc

pragma Initialize_Scalars;

--  pack1.ads

package Pack1 is
   type Acc_1  is private;
   type Acc_2  is private;
   type Arr_1  is private;
   type Arr_2  is private;
   type Bool_1 is private;
   type Cmpx_1 is private;
   type Cmpx_2 is private;
   type Enum_1 is private;
   type Enum_2 is private;
   type Fix_1  is private;
   type Fix_2  is private;
   type Flt_1  is private;
   type Flt_2  is private;
   type Modl_1 is private;
   type Prot_1 is limited private;
   type Prot_2 is limited private;
   type Prot_3 (Discr : Boolean) is limited private;
   type Rec_1  is private;
   type Rec_2  is private;
   type Rec_3  is private;
   type Rec_4 (Discr : Boolean) is private;
   type Rec_5 (Discr_1 : Boolean; Discr_2 : Boolean) is private;
   type Sign_1 is private;
   type Tag_1  is tagged private;
   type Task_1 is limited private;
   type Task_2 (Discr : Boolean) is limited private;

   type Prec_Arr_1 is private;
   type Prec_Arr_2 is private;
   type Prec_Arr_3 is private;
   type Prec_Arr_4 is private;
   type Prec_Arr_5 is private;

   type Prec_Rec_1 is private;
   type Prec_Rec_2 (Discr : Boolean) is private;
   type Prec_Rec_3 (Discr_1 : Boolean; Discr_2 : Boolean) is private;
   type Prec_Rec_4 is private;
   type Prec_Rec_5 is private;
   type Prec_Rec_6 is private;
   type Prec_Rec_7 is private;
   type Prec_Rec_8 is private;
   type Prec_Rec_9 is private;

private
   type Acc_1 is access Boolean;
   type Acc_2 is access procedure;
   type Arr_1  is array (1 .. 10) of Boolean;
   type Arr_2  is array (1 .. 3) of access Boolean;
   type Bool_1 is new Boolean;
   type Cmpx_1 is array (1 .. 5) of Rec_5 (True, True);
   type Cmpx_2 is record
  Comp_1 : Cmpx_1;
  Comp_2 : Rec_4 (True);
   end record;
   type Enum_1 is (One, Two, Three);
   type Enum_2 is ('f', 'o', 'u', 'r');
   type Fix_1  is delta 0.5 range 0.0 .. 10.0;
   type Fix_2  is delta 0.1 digits 15;
   type Flt_1  is digits 8;
   type Flt_2  is digits 10 range -1.0 .. 1.0;
   type Modl_1 is mod 8;
   protected type Prot_1 is
   end Prot_1;
   protected type Prot_2 is
   private
  Comp_1 : Boolean;
  Comp_2 : Boolean;
   end Prot_2;
   protected type Prot_3 (Discr : Boolean) is
   private
  Comp_1 : Boolean;
  Comp_2 : Rec_4 (Discr);
   end Prot_3;
   type Rec_1  is null record;
   type Rec_2  is record
  null;
   end record;
   type Rec_3  is record
  Comp_1 : Boolean;
  Comp_2 : Boolean;
   end record;
   type Rec_4 (Discr : Boolean) is record
  case Discr is
 when True =>
Comp_1 : Boolean;
Comp_2 : Boolean;
 when False =>
Comp_3 : access Boolean;
  end case;
   end record;
   type Rec_5 (Discr_1 : Boolean; Discr_2 : Boolean) is record
  Comp_1 : Boolean;
  Comp_2 : Boolean;
  case Discr_1 is
 when True =>
case Discr_2 is
   when True =>
  Comp_3 : Boolean;
  Comp_4 : Boolean;
   when False =>
  null;
end case;
 when False =>
null;
  end case;
   end record;
   type Sign_1 is range 1 .. 10;
   type Tag_1 is tagged null record;
   task type Task_1;
   task type Task_2 (Discr : Boolean);

   type Prec_Arr_1 is array (1 .. 2) of Boolean;
   type Prec_Arr_2 is array (1 .. 2, 1 .. 2) of Boolean;
   type Prec_Arr_3 is array (1 .. 2) of Prec_Rec_1;
   type Prec_Arr_4 is array (1 .. 2) of Prec_Rec_2 (True);
   type Prec_Arr_5 is array (1 .. 2) of Prec_Rec_3 (True, True);

   type Prec_Rec_1 is record
  Comp_1 : Boolean;
   end record;

   type Prec_Rec_2 (Discr : Boolean) is record
  case Discr is
 when True =>
Comp_1 : Boolean;
 when others =>
Comp_2 : Boolean;
  end case;
   end record;

   type Prec_Rec_3 (Discr_1 : Boolean; Discr_2 : Boolean) is record
  case Discr_1 is
 when True =>
case Discr_2 is
   when True =>
  Comp_1 : Boolean;
   when others =>
  Comp_2 : Boolean;
end case;
 when False =>
case Discr_2 is
   when True =>
  Comp_3 : Boolean;
   when others =>
  Comp_4 : Boolean;
end case;
  end case;
   end record;

   type Prec_Rec_4 is record
  Comp : Prec_Arr_1;
   end record;

   type Prec_Rec_5 is record
  Comp : Prec_Arr_4;
   end record;

   type Prec_Rec_6 is record
  Comp : Prec_Arr_5;
   end record;

   type Prec_Rec_7 is record
  Comp : Prec_Rec_4;
   end record;

   type Prec_Rec_8 is record
  Comp : Prec_Rec_5;
   end record;

   type Prec_Rec_9 is record
  Comp : Prec_Rec_6;
   end record;
end Pack1;

--  pack1.adb


[Ada] Spurious visibility error in a nested instance with formal package

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a spurious visibility error with a nested instance of a
generic unit with a formal package, when the actual for it is a formal
package PA of an enclosing generic, and there are subsequent uses of the
formals of PA in that generic unit.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* einfo.ads, einfo.adb: New attribute Hidden_In_Formal_Instance,
defined on packages that are actuals for formal packages, in order to
set/reset the visibility of the formals of a formal package with given
actuals, when there are subsequent uses of those formals in the
enclosing generic, as required by RM 12.7 (10).
* atree.ads, atree.adb: Add operations for Elist30.
* atree.h: Add Elist30.
* sem_ch12.adb (Analyze_Formal_Package_Instantiation): Collect formals
that are not defaulted and are thus not visible within the current
instance.
(Check_Formal_Packages): Reset visibility of formals of a formal
package that are not defaulted, on exit from current instance.

gcc/testsuite/

* gnat.dg/gen_formal_pkg.adb, gnat.dg/gen_formal_pkg_a.ads,
gnat.dg/gen_formal_pkg_b.ads, gnat.dg/gen_formal_pkg_w.ads: New
testcase.

--- gcc/ada/atree.adb
+++ gcc/ada/atree.adb
@@ -3408,6 +3408,17 @@ package body Atree is
  end if;
   end Elist29;
 
+  function Elist30 (N : Node_Id) return Elist_Id is
+ pragma Assert (Nkind (N) in N_Entity);
+ Value : constant Union_Id := Nodes.Table (N + 5).Field6;
+  begin
+ if Value = 0 then
+return No_Elist;
+ else
+return Elist_Id (Value);
+ end if;
+  end Elist30;
+
   function Elist36 (N : Node_Id) return Elist_Id is
  pragma Assert (Nkind (N) in N_Entity);
  Value : constant Union_Id := Nodes.Table (N + 6).Field6;
@@ -6318,6 +6329,13 @@ package body Atree is
  Nodes.Table (N + 4).Field11 := Union_Id (Val);
   end Set_Elist29;
 
+  procedure Set_Elist30 (N : Node_Id; Val : Elist_Id) is
+  begin
+ pragma Assert (not Locked);
+ pragma Assert (Nkind (N) in N_Entity);
+ Nodes.Table (N + 5).Field6 := Union_Id (Val);
+  end Set_Elist30;
+
   procedure Set_Elist36 (N : Node_Id; Val : Elist_Id) is
   begin
  pragma Assert (not Locked);

--- gcc/ada/atree.ads
+++ gcc/ada/atree.ads
@@ -1523,6 +1523,9 @@ package Atree is
   function Elist29 (N : Node_Id) return Elist_Id;
   pragma Inline (Elist29);
 
+  function Elist30 (N : Node_Id) return Elist_Id;
+  pragma Inline (Elist30);
+
   function Elist36 (N : Node_Id) return Elist_Id;
   pragma Inline (Elist36);
 
@@ -2889,6 +2892,9 @@ package Atree is
   procedure Set_Elist29 (N : Node_Id; Val : Elist_Id);
   pragma Inline (Set_Elist29);
 
+  procedure Set_Elist30 (N : Node_Id; Val : Elist_Id);
+  pragma Inline (Set_Elist30);
+
   procedure Set_Elist36 (N : Node_Id; Val : Elist_Id);
   pragma Inline (Set_Elist36);
 

--- gcc/ada/atree.h
+++ gcc/ada/atree.h
@@ -530,6 +530,7 @@ extern Node_Id Current_Error_Node;
 #define Elist25(N)Field25 (N)
 #define Elist26(N)Field26 (N)
 #define Elist29(N)Field29 (N)
+#define Elist30(N)Field30 (N)
 #define Elist36(N)Field36 (N)
 
 #define Name1(N)  Field1  (N)

--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -255,6 +255,7 @@ package body Einfo is
--Corresponding_Equality  Node30
--Last_Aggregate_Assignment   Node30
--Static_Initialization   Node30
+   --Hidden_In_Formal_Instance   Elist30
 
--Derived_Type_Link   Node31
--Thunk_EntityNode31
@@ -1989,6 +1990,12 @@ package body Einfo is
   return Node8 (Id);
end Hiding_Loop_Variable;
 
+   function Hidden_In_Formal_Instance (Id : E) return L is
+   begin
+  pragma Assert (Ekind (Id) = E_Package);
+  return Elist30 (Id);
+   end Hidden_In_Formal_Instance;
+
function Homonym (Id : E) return E is
begin
   return Node4 (Id);
@@ -5167,6 +5174,12 @@ package body Einfo is
   Set_Node8 (Id, V);
end Set_Hiding_Loop_Variable;
 
+   procedure Set_Hidden_In_Formal_Instance (Id : E; V : L) is
+   begin
+  pragma Assert (Ekind (Id) = E_Package);
+  Set_Elist30 (Id, V);
+   end Set_Hidden_In_Formal_Instance;
+
procedure Set_Homonym (Id : E; V : E) is
begin
   pragma Assert (Id /= V);

--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -2172,6 +2172,14 @@ package Einfo is
 --   warning messages if the hidden variable turns out to be unused
 --   or is referenced without being set.
 
+--Hidden_In_Formal_Instance (Elist30)
+--   Defined on actuals for formal packages. Entities on the list are
+--   formals that are hidden outside of the formal package when this
+--   package is not declared with a box, or 

[Ada] Prohibit output dependency items on functions

2018-05-22 Thread Pierre-Marie de Rodat
This patch modifies the analysis of pragma [Refined_]Depends to emit an error
when the pragma is associated with a [generic] function, and one of its clauses
contains a non-null, non-'Result output item.


-- Source --


--  pack.ads

package Pack with SPARK_Mode is
   Obj_1 : Integer := 1;
   Obj_2 : Integer := 2;

   function Func_1 return Integer
 with Global => (In_Out => Obj_1);   --  Error

   function Func_2 return Integer
 with Global => (Output => Obj_1);   --  Error

   function Func_3 return Integer
 with Depends => (Func_3'Result => Obj_1,--  OK
  Obj_1 => Obj_1);   --  Error

   function Func_4 return Integer
 with Depends => (Func_4'Result => Obj_1,--  OK
  null  => Obj_2);   --  OK
end Pack;


-- Compilation and output --


$ gcc -c pack.ads
pack.ads:6:22: global mode "In_Out" is not applicable to functions
pack.ads:9:22: global mode "Output" is not applicable to functions
pack.ads:13:23: output item is not applicable to function

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Hristian Kirtchev  

gcc/ada/

* sem_prag.adb (Analyze_Input_Output): Emit an error when a non-null,
non-'Result output appears in the output list of a function.

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -941,6 +941,17 @@ package body Sem_Prag is
 
 Ekind_In (Item_Id, E_Abstract_State, E_Variable)
   then
+ --  A [generic] function is not allowed to have Output
+ --  items in its dependency relations. Note that "null"
+ --  and attribute 'Result are still valid items.
+
+ if Ekind_In (Spec_Id, E_Function, E_Generic_Function)
+   and then not Is_Input
+ then
+SPARK_Msg_N
+  ("output item is not applicable to function", Item);
+ end if;
+
  --  The item denotes a concurrent type. Note that single
  --  protected/task types are not considered here because
  --  they behave as objects in the context of pragma



[Ada] Spurious visibility error on aspect in generic unit

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a spurious visibility error on an instantiation of a generic
package that contains a type declaration with an aspect specification for
an aspect that must be delayed (i.e. an aspect whose value may be specified
at a later point).

The package g.ads must compile quietly:


with S;
generic
package G
is
   type Buffer_Type is record
  Data   : Integer;
   end record;

   package Buffer is new S (Buffer_Type => Buffer_Type);
end G;

generic
   type Buffer_Type is private;
package S
is
   Page_Size : constant := 4096;

   type Reader_Type is limited record
  Data   : Buffer_Type;
   end record
 with
Alignment => Page_Size; -- Using a constant does not work
--  Alignment => 4096;  -- Using a number works

-- for Reader_Type'Alignment use Page_Size; -- so does an attribute.
   pragma Compile_Time_Error (Reader_Type'Size /= 12345, "Ooops");
   -- Note: We set 'Alignment and check for 'Size.
end S;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* freeze.adb (Freeze_Entity): When analyzing delayed aspects of an
entity E within a generic unit, indicate that there are no remaining
delayed aspects after invoking Analyze_Aspects_At_Freeze_Point. The
entity E is not frozen yet but the aspects should not be reanalyzed at
the freeze point, which may be outside of the generic and may not have
the proper visibility.

--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -5167,11 +5167,14 @@ package body Freeze is
   --  be frozen in the proper scope after the current generic is analyzed.
   --  However, aspects must be analyzed because they may be queried later
   --  within the generic itself, and the corresponding pragma or attribute
-  --  definition has not been analyzed yet.
+  --  definition has not been analyzed yet. After this, indicate that the
+  --  entity has no further delayed aspects, to prevent a later aspect
+  --  analysis out of the scope of the generic.
 
   elsif Inside_A_Generic and then External_Ref_In_Generic (Test_E) then
  if Has_Delayed_Aspects (E) then
 Analyze_Aspects_At_Freeze_Point (E);
+Set_Has_Delayed_Aspects (E, False);
  end if;
 
  Result := No_List;



[Ada] Spurious size error on fixed point type with aspect Small

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a spurious size error on a fixed point type that carries an
aspect specification for the 'Small of the type, when there is a subsequent
derivation of that type before the type is frozen, the given 'Small is not
a power of two, and the bounds of the type require its full size, also
given by an aspect specification.
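
To see why the value of 'Small matters, consider the type in the testcase, whose bounds are (-2 ** 63) * 0.001 .. (2 ** 63 - 1) * 0.001 with Size => 64. The worked arithmetic below is only a sketch. It assumes that, while the Small aspect has not yet been analyzed, the size check falls back on a default 'Small equal to the largest power of two not exceeding the delta (2 ** (-10) here); that assumption is not stated in the patch itself.

```latex
\begin{align*}
\text{Small} = 10^{-3}: \quad
  & \frac{-2^{63}\cdot 10^{-3}}{10^{-3}} = -2^{63},
  \qquad \frac{(2^{63}-1)\cdot 10^{-3}}{10^{-3}} = 2^{63}-1
  \;\Rightarrow\; \text{64 bits suffice} \\
\text{Small} = 2^{-10}: \quad
  & \frac{-2^{63}\cdot 10^{-3}}{2^{-10}} = -2^{63}\cdot 1.024 < -2^{63}
  \;\Rightarrow\; \text{more than 64 bits are needed}
\end{align*}
```

Under that assumption, the declared Size of 64 is only large enough once the user-specified 'Small has been taken into account, which is consistent with analyzing the delayed aspect before the size is validated.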

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* freeze.adb (Freeze_Fixed_Point_Type): If the first subtype has
delayed aspects, analyze them now, so that the representation of the
type (size, bounds) can be computed and validated.

gcc/testsuite/

* gnat.dg/fixedpnt3.adb: New testcase.

--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -7466,6 +7466,16 @@ package body Freeze is
--  Start of processing for Freeze_Fixed_Point_Type
 
begin
+  --  The type, or its first subtype if we are freezing the anonymous
+  --  base, may have a delayed Small aspect. It must be analyzed now,
+  --  so that all characteristics of the type (size, bounds) can be
+  --  computed and validated in the call to Minimum_Size that follows.
+
+  if Has_Delayed_Aspects (First_Subtype (Typ)) then
+ Analyze_Aspects_At_Freeze_Point (First_Subtype (Typ));
+ Set_Has_Delayed_Aspects (First_Subtype (Typ), False);
+  end if;
+
   --  If Esize of a subtype has not previously been set, set it now
 
   if Unknown_Esize (Typ) then

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/fixedpnt3.adb
@@ -0,0 +1,16 @@
+--  { dg-do compile }
+--  { dg-options "-gnatws" }
+
+procedure Fixedpnt3 is
+  C_Unit : constant := 0.001;
+
+  type T_Fixed_Point is
+ delta C_Unit range (-2 ** 63) * C_Unit .. (2 ** 63 - 1) * C_Unit
+ with Size  => 64, Small => C_Unit;
+
+  type T_Short_Fixed_Point is
+ new T_Fixed_Point range (-2 ** 31) * C_Unit .. (2 ** 31 - 1) * C_Unit
+ with Size  => 32;
+begin
+   null;
+end Fixedpnt3;



[Ada] Crash on pragma Compile_Time_Warning with declared string constant

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes a compiler abort on a pragma Compile_Time_Warning when its
second argument is a reference to a constant string (rather than a string
literal or an expression that evaluates to a string literal).

Compiling main.adb must yield:

   main.adb:5:33: warning: Good
   main.adb:6:33: warning: VALLUE
   main.adb:7:33: warning: Test


procedure Main is
   Value : constant String := "Test";
   Switch : constant Boolean := True;
begin
   pragma Compile_Time_Warning (Switch, "Good");
   pragma Compile_Time_Warning (Switch, "VAL" & "LUE");
   pragma Compile_Time_Warning (Switch, value);
   null;
end Main;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Ed Schonberg  

gcc/ada/

* sem_prag.adb (Process_Compile_Time_Warning_Or_Error): Handle properly
a second argument that is a constant of a given string value.
--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -30359,11 +30359,18 @@ package body Sem_Prag is
 
   if Compile_Time_Known_Value (Arg1x) then
  if Is_True (Expr_Value (Arg1x)) then
+
+--  We have already verified that the second argument is a static
+--  string expression. Its string value must be retrieved
+--  explicitly if it is a declared constant, otherwise it has
+--  been constant-folded previously.
+
 declare
Cent: constant Entity_Id := Cunit_Entity (Current_Sem_Unit);
Pname   : constant Name_Id   := Pragma_Name_Unmapped (N);
Prag_Id : constant Pragma_Id := Get_Pragma_Id (Pname);
-   Str : constant String_Id := Strval (Get_Pragma_Arg (Arg2));
+   Str : constant String_Id :=
+   Strval (Expr_Value_S (Get_Pragma_Arg (Arg2)));
Str_Len : constant Nat   := String_Length (Str);
 
Force : constant Boolean :=



[Ada] No error on misplaced pragma Pure_Function

2018-05-22 Thread Pierre-Marie de Rodat
This patch fixes an issue whereby the placement of the pragma/aspect
Pure_Function was not verified to be in the same declarative part as the
function declaration, incorrectly allowing it to appear after a function body
or in a different region such as a private section.
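
The kind of check involved can be sketched in C++ as a walk up the parent chain, loosely modelled on the helper that the diff below removes from sem_ch12.adb. The Node and Kind types and the function name are hypothetical stand-ins, subunit handling is left out, and the generalized version in sem_util.adb is not shown in this message, so it may differ.

```cpp
#include <cassert>

// Hypothetical stand-ins for tree nodes; not the real GNAT data structures.
enum class Kind {
  SubprogramBody, PackageBody, PackageDeclaration, TaskBody,
  ProtectedBody, BlockStatement, CompilationUnit, Other
};

struct Node {
  Kind kind;
  const Node *parent;
};

// Return true if node n sits directly in the declarative part owned by
// decls, i.e. decls is reached by walking up from n without first crossing
// a body, package, block, or compilation unit boundary.
bool in_same_declarative_part(const Node *decls, const Node *n) {
  assert(decls != nullptr && n != nullptr);
  for (const Node *p = n->parent; p != nullptr; p = p->parent) {
    if (p == decls)
      return true;  // reached the owning construct first
    switch (p->kind) {
    case Kind::SubprogramBody:
    case Kind::PackageBody:
    case Kind::PackageDeclaration:
    case Kind::TaskBody:
    case Kind::ProtectedBody:
    case Kind::BlockStatement:
    case Kind::CompilationUnit:
      return false;  // crossed into a different declarative region
    default:
      break;  // keep climbing
    }
  }
  return false;
}
```

Note that the identity test comes before the kind test, so a body that is itself the owner of the declarative part still yields true.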

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Justin Squirek  

gcc/ada/

* sem_ch12.adb (In_Same_Declarative_Part): Moved to sem_util.
(Freeze_Subprogram_Body, Install_Body): Modify calls to
In_Same_Declarative_Part.
* sem_prag.adb (Analyze_Pragma-Pragma_Pure_Function): Add check to
verify pragma declaration is within the same declarative list with
corresponding error message.
* sem_util.adb, sem_util.ads (In_Same_Declarative_Part): Moved from
sem_ch12.adb and generalized to be useful outside the scope of
freezing.

gcc/testsuite/

* gnat.dg/pure_function1.adb, gnat.dg/pure_function1.ads,
gnat.dg/pure_function2.adb, gnat.dg/pure_function2.ads: New testcases.

--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -657,17 +657,6 @@ package body Sem_Ch12 is
--  not done for the instantiation of the bodies, which only require the
--  instances of the generic parents to be in scope.
 
-   function In_Same_Declarative_Part
- (F_Node : Node_Id;
-  Inst   : Node_Id) return Boolean;
-   --  True if the instantiation Inst and the given freeze_node F_Node appear
-   --  within the same declarative part, ignoring subunits, but with no inter-
-   --  vening subprograms or concurrent units. Used to find the proper plave
-   --  for the freeze node of an instance, when the generic is declared in a
-   --  previous instance. If predicate is true, the freeze node of the instance
-   --  can be placed after the freeze node of the previous instance, Otherwise
-   --  it has to be placed at the end of the current declarative part.
-
function In_Main_Context (E : Entity_Id) return Boolean;
--  Check whether an instantiation is in the context of the main unit.
--  Used to determine whether its body should be elaborated to allow
@@ -8664,7 +8653,8 @@ package body Sem_Ch12 is
 
   if Is_Generic_Instance (Par)
 and then Present (Freeze_Node (Par))
-and then In_Same_Declarative_Part (Freeze_Node (Par), Inst_Node)
+and then In_Same_Declarative_Part
+   (Parent (Freeze_Node (Par)), Inst_Node)
   then
  --  The parent was a premature instantiation. Insert freeze node at
  --  the end the current declarative part.
@@ -8711,11 +8701,11 @@ package body Sem_Ch12 is
 and then Present (Freeze_Node (Par))
 and then Present (Enc_I)
   then
- if In_Same_Declarative_Part (Freeze_Node (Par), Enc_I)
+ if In_Same_Declarative_Part (Parent (Freeze_Node (Par)), Enc_I)
or else
  (Nkind (Enc_I) = N_Package_Body
-   and then
- In_Same_Declarative_Part (Freeze_Node (Par), Parent (Enc_I)))
+   and then In_Same_Declarative_Part
+  (Parent (Freeze_Node (Par)), Parent (Enc_I)))
  then
 --  The enclosing package may contain several instances. Rather
 --  than computing the earliest point at which to insert its freeze
@@ -8985,46 +8975,6 @@ package body Sem_Ch12 is
 (Current_Scope, Current_Scope, Assoc_Null);
end Init_Env;
 
-   --
-   -- In_Same_Declarative_Part --
-   --
-
-   function In_Same_Declarative_Part
- (F_Node : Node_Id;
-  Inst   : Node_Id) return Boolean
-   is
-  Decls : constant Node_Id := Parent (F_Node);
-  Nod   : Node_Id;
-
-   begin
-  Nod := Parent (Inst);
-  while Present (Nod) loop
- if Nod = Decls then
-return True;
-
- elsif Nkind_In (Nod, N_Subprogram_Body,
-  N_Package_Body,
-  N_Package_Declaration,
-  N_Task_Body,
-  N_Protected_Body,
-  N_Block_Statement)
- then
-return False;
-
- elsif Nkind (Nod) = N_Subunit then
-Nod := Corresponding_Stub (Nod);
-
- elsif Nkind (Nod) = N_Compilation_Unit then
-return False;
-
- else
-Nod := Parent (Nod);
- end if;
-  end loop;
-
-  return False;
-   end In_Same_Declarative_Part;
-
-
-- In_Main_Context --
-
@@ -9536,7 +9486,7 @@ package body Sem_Ch12 is
 --  Freeze instance of inner generic after instance of enclosing
 --  generic.
 
-if In_Same_Declarative_Part (Freeze_Node (Par), N) then
+if In_Same_Declarative_Part (Parent (Freeze_Node (Par)), N) then
 
--  Handle the following case:
 
@@ -9570,7 

[Ada] Adding support for Ada.Locales package

2018-05-22 Thread Pierre-Marie de Rodat
This patch adds generic support for the Ada.Locales package that
relies on the setlocale() C service.
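
As a rough illustration of how a locale name returned by setlocale() can be split into language and country parts, here is a minimal C++ sketch. The names split_locale, current_locale and LocaleParts are hypothetical and are not taken from locales.c. The sketch assumes the common language_COUNTRY.codeset naming of locales and does not attempt the ISO 639 table lookup that the patch adds.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Hypothetical result type: empty strings mean "unknown".
struct LocaleParts {
  std::string language;  // e.g. "en"
  std::string country;   // e.g. "US"
};

// Split a locale name such as "en_US.UTF-8" into language and country.
LocaleParts split_locale(const std::string &name) {
  LocaleParts parts;
  // The "C" and "POSIX" locales carry no language or country information.
  if (name.empty() || name == "C" || name == "POSIX")
    return parts;
  // Drop the codeset (".UTF-8") and modifier ("@euro") suffixes, if any.
  const std::string base = name.substr(0, name.find_first_of(".@"));
  const std::string::size_type sep = base.find('_');
  parts.language = base.substr(0, sep);
  if (sep != std::string::npos)
    parts.country = base.substr(sep + 1);
  // Sanity check: the suffixes must have been stripped.
  assert(parts.country.find('.') == std::string::npos);
  return parts;
}

// Select the locale from the environment and split its name.
LocaleParts current_locale() {
  const char *name = std::setlocale(LC_ALL, "");
  return split_locale(name != nullptr ? name : "C");
}
```

A caller would typically invoke current_locale() once and fall back to an "unknown" value when either field is empty.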

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Javier Miranda  

gcc/ada/

* locales.c: New implementation for the Ada.Locales package.
* libgnat/a-locale.ads: Remove comment indicating that this is not
implemented.
* doc/gnat_rm/standard_library_routines.rst: Remove comment indicating
that this is not implemented.
* gnat_rm.texi: Regenerate.

--- gcc/ada/doc/gnat_rm/standard_library_routines.rst
+++ gcc/ada/doc/gnat_rm/standard_library_routines.rst
@@ -273,9 +273,7 @@ the unit is not implemented.
 
 ``Ada.Locales`` *(A.19)*
   This package provides declarations providing information (Language
-  and Country) about the current locale. This package is currently not
-  implemented other than by providing stubs which will always return
-  Language_Unknown/Country_Unknown.
+  and Country) about the current locale.
 
 
 ``Ada.Numerics``

--- gcc/ada/gnat_rm.texi
+++ gcc/ada/gnat_rm.texi
@@ -20617,9 +20617,7 @@ This package provides a generic interface to generalized iterators.
 @item @code{Ada.Locales} @emph{(A.19)}
 
 This package provides declarations providing information (Language
-and Country) about the current locale. This package is currently not
-implemented other than by providing stubs which will always return
-Language_Unknown/Country_Unknown.
+and Country) about the current locale.
 
 @item @code{Ada.Numerics}
 

--- gcc/ada/libgnat/a-locale.ads
+++ gcc/ada/libgnat/a-locale.ads
@@ -15,10 +15,6 @@
 --  --
 --
 
---  Note that this package is currently not implemented on any platform and
---  functions Language and Country will always return
---  Language_Unknown/Country_Unknown.
-
 package Ada.Locales is
pragma Preelaborate (Locales);
pragma Remote_Types (Locales);

--- gcc/ada/locales.c
+++ gcc/ada/locales.c
@@ -31,17 +31,777 @@
 
 /*  This file provides OS-dependent support for the Ada.Locales package.*/
 
+#include 
+#include 
+#include 
+
 typedef char char4 [4];
 
+/* Table containing equivalences between ISO_639_1 codes and their ISO_639_3
+   alpha-3 code plus their language name. */
+
+static char* iso_639[] =
+{
+  "aa", "aar", "Afar",
+  "ab", "abk", "Abkhazian",
+  "ae", "ave", "Avestan",
+  "af", "afr", "Afrikaans",
+  "ak", "aka", "Akan",
+  "am", "amh", "Amharic",
+  "an", "arg", "Aragonese",
+  "ar", "ara", "Arabic",
+  "as", "asm", "Assamese",
+  "av", "ava", "Avaric",
+  "ay", "aym", "Aymara",
+  "az", "aze", "Azerbaijani",
+
+  "ba", "bak", "Bashkir",
+  "be", "bel", "Belarusian",
+  "bg", "bul", "Bulgarian",
+  "bi", "bis", "Bislama",
+  "bm", "bam", "Bambara",
+  "bn", "ben", "Bengali",
+  "bo", "bod", "Tibetan",
+  "br", "bre", "Breton",
+  "bs", "bos", "Bosnian",
+
+  "ca", "cat", "Catalan",
+  "ce", "che", "Chechen",
+  "ch", "cha", "Chamorro",
+  "co", "cos", "Corsican",
+  "cr", "cre", "Cree",
+  "cs", "ces", "Czech",
+  "cu", "chu", "Church Slavic",
+  "cv", "chv", "Chuvash",
+  "cy", "cym", "Welsh",
+
+  "da", "dan", "Danish",
+  "de", "deu", "German",
+  "dv", "div", "Divehi",
+  "dz", "dzo", "Dzongkha",
+
+  "ee", "ewe", "Ewe",
+  "el", "ell", "Modern Greek",
+  "en", "eng", "English",
+  "eo", "epo", "Esperanto",
+  "es", "spa", "Spanish",
+  "et", "est", "Estonian",
+  "eu", "eus", "Basque",
+
+  "fa", "fas", "Persian",
+  "ff", "ful", "Fulah",
+  "fi", "fin", "Finnish",
+  "fj", "fij", "Fijian",
+  "fo", "fao", "Faroese",
+  "fr", "fra", "French",
+  "fy", "fry", "Western Frisian",
+
+  "ga", "gle", "Irish",
+  "gd", "gla", "Scottish Gaelic",
+  "gl", "glg", "Galician",
+  "gn", "grn", "Guarani",
+  "gu", "guj", "Gujarati",
+  "gv", "glv", "Manx",
+
+  "ha", "hau", "Hausa",
+  "he", "heb", "Hebrew",
+  "hi", "hin", "Hindi",
+  "ho", "hmo", "Hiri Motu",
+  "hr", "hrv", "Croatian",
+  "ht", "hat", "Haitian",
+  "hu", "hun", "Hungarian",
+  "hy", "hye", "Armenian",
+  "hz", "her", "Herero",
+
+  "ia", "ina", "Interlingua",
+  "id", "ind", "Indonesian",
+  "ie", "ile", "Interlingue",
+  "ig", "ibo", "Igbo",
+  "ii", "iii", "Sichuan Yi",
+  "ik", "ipk", "Inupiaq",
+  "io", "ido", "Ido",
+  "is", "isl", "Icelandic",
+  "it", "ita", "Italian",
+  "iu", "iku", "Inuktitut",
+
+  "ja", "jpn", "Japanese",
+  "jv", "jav", "Javanese",
+
+  "ka", "kat", "Georgian",
+  "kg", "kon", "Kongo",
+  "ki", "kik", "Kikuyu",
+  "kj", "kua", "Kuanyama",
+  "kk", "kaz", "Kazakh",
+  "kl", "kal", "Kalaallisut",
+  "km", "khm", "Central Khmer",
+  "kn", "kan", "Kannada",
+  "ko", "kor", "Korean",
+  "kr", "kau", "Kanuri",
+  "ks", "kas", "Kashmiri",
+  "ku", "kur", "Kurdish",
+  "kv", "kom", "Komi",
+  "kw", "cor", "Cornish",
+  "ky", "kir", "Kirghiz",
+
+  "la", "lat", "Latin",
+  "lb", "ltz", "Luxembourgish",
+  "lg", "lug", "Ganda",
+  "li", "lim", 

[Ada] Prevent caching of non-text symbols for symbolic tracebacks

2018-05-22 Thread Pierre-Marie de Rodat
We now only have the executable code section boundaries at hand,
so can only infer offsets for symbols within those boundaries.

Symbols outside of this region are non-text symbols, pointless for
traceback symbolization anyway.
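
The filtering described above can be sketched in C++ as follows. This is an illustration only: Symbol, CacheEntry and build_cache are hypothetical names, and the sketch fills a vector in a single pass, whereas the Ada code in the patch counts the symbols in a first phase and fills the cache in a second one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the object symbol and cache entry types.
struct Symbol {
  std::uint64_t value;  // absolute address of the symbol
  std::uint64_t size;   // size in bytes
};

struct CacheEntry {
  std::uint32_t first;  // offset from the start of the code section
  std::uint32_t size;
};

// Keep only symbols of non-zero size located within [low, high], and store
// their addresses as 32-bit offsets from low. Consecutive symbols at the
// same offset are collapsed (best effort, as in the original comment).
std::vector<CacheEntry> build_cache(const std::vector<Symbol> &syms,
                                    std::uint64_t low, std::uint64_t high) {
  assert(low <= high);
  std::vector<CacheEntry> cache;
  bool have_prev = false;
  std::uint32_t prev = 0;
  for (const Symbol &s : syms) {
    // Checking the bounds first keeps the subtraction below from wrapping
    // around for symbols located below the code section.
    if (s.size == 0 || s.value < low || s.value > high)
      continue;
    const std::uint32_t addr = static_cast<std::uint32_t>(s.value - low);
    if (have_prev && addr == prev)
      continue;
    cache.push_back({addr, static_cast<std::uint32_t>(s.size)});
    prev = addr;
    have_prev = true;
  }
  return cache;
}
```

The point of the range test is that an offset is only meaningful relative to the code section; anything outside it cannot be expressed as such an offset.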

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Olivier Hainque  

gcc/ada/

* libgnat/s-dwalin.adb (Enable_Cache): Skip symbols outside of the
executable code section boundaries.

--- gcc/ada/libgnat/s-dwalin.adb
+++ gcc/ada/libgnat/s-dwalin.adb
@@ -1202,6 +1202,9 @@ package body System.Dwarf_Lines is
   --  Phase 1: count number of symbols. Phase 2: fill the cache.
   declare
  S   : Object_Symbol;
+ Val : uint64;
+ Xcode_Low   : constant uint64 := uint64 (C.Low);
+ Xcode_High  : constant uint64 := uint64 (C.High);
  Sz  : uint32;
  Addr, Prev_Addr : uint32;
  Nbr_Symbols : Natural;
@@ -1211,22 +1214,31 @@ package body System.Dwarf_Lines is
 S   := First_Symbol (C.Obj.all);
 Prev_Addr   := uint32'Last;
 while S /= Null_Symbol loop
-   --  Discard symbols whose length is 0
+   --  Discard symbols of length 0 or located outside of the
+   --  execution code section outer boundaries.
Sz := uint32 (Size (S));
+   Val := Value (S);
 
-   --  Try to filter symbols at the same address. This is a best
-   --  effort as they might not be consecutive.
-   Addr := uint32 (Value (S) - uint64 (C.Low));
-   if Sz > 0 and then Addr /= Prev_Addr then
-  Nbr_Symbols := Nbr_Symbols + 1;
-  Prev_Addr   := Addr;
-
-  if Phase = 2 then
- C.Cache (Nbr_Symbols) :=
-   (First => Addr,
-Size  => Sz,
-Sym   => uint32 (Off (S)),
-Line  => 0);
+   if Sz > 0
+ and then Val >= Xcode_Low
+ and then Val <= Xcode_High
+   then
+
+  Addr := uint32 (Val - Xcode_Low);
+
+  --  Try to filter symbols at the same address. This is a best
+  --  effort as they might not be consecutive.
+  if Addr /= Prev_Addr then
+ Nbr_Symbols := Nbr_Symbols + 1;
+ Prev_Addr   := Addr;
+
+ if Phase = 2 then
+C.Cache (Nbr_Symbols) :=
+  (First => Addr,
+   Size  => Sz,
+   Sym   => uint32 (Off (S)),
+   Line  => 0);
+ end if;
   end if;
end if;
 



[Ada] Missing error on illegal categorization dependency

2018-05-22 Thread Pierre-Marie de Rodat
This patch modifies the analysis of subprogram declarations to ensure that an
aspect which is converted into a categorization pragma is properly taken into
account when verifying the dependencies of a subprogram unit.


-- Source --


--  pack.ads

package Pack is end Pack;

--  proc1.ads

with Pack;

procedure Proc1 with Pure;

--  proc2.ads

with Pack;

procedure Proc2;
pragma Pure (Proc2);


-- Compilation and output --


$ gcc -c proc1.ads
$ gcc -c proc2.ads
proc1.ads:1:06: cannot depend on "Pack" (wrong categorization)
proc1.ads:1:06: pure unit cannot depend on non-pure unit
proc2.ads:1:06: cannot depend on "Pack" (wrong categorization)
proc2.ads:1:06: pure unit cannot depend on non-pure unit

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-22  Hristian Kirtchev  

gcc/ada/

* sem_ch6.adb (Analyze_Subprogram_Declaration): Set the proper
categorization of the unit after processing the aspects in case one of
its aspects is converted into a categorization pragma.

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -4844,18 +4844,6 @@ package body Sem_Ch6 is
  Set_Kill_Elaboration_Checks (Designator);
   end if;
 
-  if Scop /= Standard_Standard and then not Is_Child_Unit (Designator) then
- Set_Categorization_From_Scope (Designator, Scop);
-
-  else
- --  For a compilation unit, check for library-unit pragmas
-
- Push_Scope (Designator);
- Set_Categorization_From_Pragmas (N);
- Validate_Categorization_Dependency (N, Designator);
- Pop_Scope;
-  end if;
-
   --  For a compilation unit, set body required. This flag will only be
   --  reset if a valid Import or Interface pragma is processed later on.
 
@@ -4883,19 +4871,35 @@ package body Sem_Ch6 is
  Write_Eol;
   end if;
 
-  if Is_Protected_Type (Current_Scope) then
-
- --  Indicate that this is a protected operation, because it may be
- --  used in subsequent declarations within the protected type.
+  --  Indicate that this is a protected operation, because it may be used
+  --  in subsequent declarations within the protected type.
 
+  if Is_Protected_Type (Current_Scope) then
  Set_Convention (Designator, Convention_Protected);
   end if;
 
   List_Inherited_Pre_Post_Aspects (Designator);
 
+  --  Process the aspects before establishing the proper categorization in
+  --  case the subprogram is a compilation unit and one of its aspects is
+  --  converted into a categorization pragma.
+
   if Has_Aspects (N) then
  Analyze_Aspect_Specifications (N, Designator);
   end if;
+
+  if Scop /= Standard_Standard and then not Is_Child_Unit (Designator) then
+ Set_Categorization_From_Scope (Designator, Scop);
+
+  --  Otherwise the unit is a compilation unit and/or a child unit. Set the
+  --  proper categorization of the unit based on its pragmas.
+
+  else
+ Push_Scope (Designator);
+ Set_Categorization_From_Pragmas (N);
+ Validate_Categorization_Dependency (N, Designator);
+ Pop_Scope;
+  end if;
end Analyze_Subprogram_Declaration;
 
--



Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-22 Thread David Malcolm
On Tue, 2018-05-22 at 10:43 +0200, Richard Biener wrote:
> On Mon, 21 May 2018, Jeff Law wrote:
> 
> > On 05/18/2018 07:15 AM, David Malcolm wrote:
> > > On Fri, 2018-05-18 at 13:11 +0200, Richard Biener wrote:
> > > > The following adds a simple alloc/free_flag machinery
> > > > allocating
> > > > bits from an int typed pool and applies that to bb->flags and
> > > > edge->flags.
> > > > 
> > > > This should allow infrastructure pieces to use edge/bb flags
> > > > temporarily
> > > > without worrying that users might already use it as for example
> > > > BB_VISITED and friends.  It converts one clever user to the new
> > > > interface.
> > > > 
> > > > The allocation state is per CFG but we could also make it global
> > > > or merge the two pools so one allocates a flag that can be used
> > > > for bbs and edges at the same time.
> > > > 
> > > > Thus - any opinions welcome.  I'm mainly targeting cfganal
> > > > algorithms where I want to add a few region-based ones that to be
> > > > O(region-size) complexity may not use sbitmaps for visited sets
> > > > because of the clearing overhead and bitmaps are probably more
> > > > expensive to use than a BB/edge flag that needs to be cleared
> > > > afterwards.
> > > > 
> > > > Built on x86_64, otherwise untested.
> > > > 
> > > > Any comments?
> > > 
> > > Rather than putting alloc/free pairs at the usage sites, how about
> > > an RAII class?  Something like this:
> > 
> > Yes, please if at all possible we should be using RAII.
> 
> So like the following?  (see comments in the hwint.h hunk for
> extra C++ questions...)
> 
> I dropped the non-RAII interface - it's very likely never needed.
> 
> Better suggestions for placement of auto_flag welcome.

Do you have ideas for other uses?  If not, maybe just put it in cfg.h
right in front of auto_edge_flag and auto_bb_flag, for simplicity?

> Thanks,
> Richard.
[...snip...]

The new classes are missing leading comments.  I think it's worth
noting that the auto_flag (and thus their subclasses) hold a pointer
into a control_flow_graph instance, but they don't interact with the
garbage collector, so there's an implicit assumption that the auto_flag
instances are short-lived and that the underlying storage is kept alive
some other way (e.g. as cfun is kept alive by cfun being a GC root).
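
To make the lifetime point concrete, here is a minimal standalone sketch of the RAII idea discussed above. It is not the code from the patch: the class name scoped_flag and the plain int pool are made up for illustration, and the real auto_flag may pick its bit differently. The holder only keeps a raw pointer to the pool, so the pool has to outlive it, which is the same assumption noted above for the control_flow_graph storage.

```cpp
#include <cassert>

// Hypothetical RAII holder for one bit out of an int-typed pool.
// It claims the lowest free bit on construction and gives it back
// on destruction.
class scoped_flag
{
public:
  explicit scoped_flag (int *pool) : m_pool (pool), m_flag (0)
  {
    // Bits 0..30 only, so that the shift never touches the sign bit.
    for (int bit = 0; bit < 31; ++bit)
      {
        int mask = 1 << bit;
        if ((*m_pool & mask) == 0)
          {
            m_flag = mask;
            *m_pool |= mask;
            break;
          }
      }
    assert (m_flag != 0 && "flag pool exhausted");
  }

  // Release the bit so that a later user can allocate it again.
  ~scoped_flag () { *m_pool &= ~m_flag; }

  // Copying would release the same bit twice.
  scoped_flag (const scoped_flag &) = delete;
  scoped_flag &operator= (const scoped_flag &) = delete;

  // The allocated bit, usable as a mask on a flags word.
  operator int () const { return m_flag; }

private:
  int *m_pool;   // not owned; must outlive this object
  int m_flag;
};
```

A user would declare one of these at the top of an algorithm, use the value as a mask on the flags word, and clear the bit on the visited blocks or edges before leaving the scope.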


> +class auto_edge_flag : public auto_flag
> +{
> +public:
> +  auto_edge_flag (function *fun)
> +: auto_flag (&fun->cfg->edge_flags_allocated) {}
> +};
> +
> +class auto_bb_flag : public auto_flag
> +{
> +public:
> +  auto_bb_flag (function *fun)
> +: auto_flag (&fun->cfg->bb_flags_allocated) {}
> +};
> +
>  #endif /* GCC_CFG_H */

[...snip...]

Hope this is constructive
Dave


[PATCH][AArch64] Merge stores of D-register values with different modes

2018-05-22 Thread Kyrill Tkachov

[sending on behalf of Jackson Woodruff]

Hi all,

This patch merges loads and stores from D-registers that are of different modes.

Code like this:

typedef int __attribute__((vector_size(8))) vec;
struct pair
{
  vec v;
  double d;
};

Now generates a store pair instruction:

void
assign (struct pair *p, vec v)
{
  p->v = v;
  p->d = 1.0;
}

Whereas previously it generated two `str` instructions.

This patch also merges storing of double zero values with
long integer values:

struct pair
{
  long long l;
  double d;
};

void
foo (struct pair *p)
{
  p->l = 10;
  p->d = 0.0;
}

Now generates a single store pair instruction rather than two `str` 
instructions.
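
For anyone who wants to try both cases in one file, the two snippets above can be combined as sketched below. The struct names pair_vd and pair_ld are made up so that the two definitions do not clash, and the vector typedef relies on the GCC/Clang vector_size extension. Whether a store pair is actually emitted depends on the compiler version and options; that is not checked here.

```cpp
// GCC/Clang vector extension: two 32-bit ints in one 64-bit value.
typedef int vec __attribute__ ((vector_size (8)));

// Hypothetical names; the mail calls both of these "struct pair".
struct pair_vd
{
  vec v;
  double d;
};

struct pair_ld
{
  long long l;
  double d;
};

// Two adjacent 64-bit stores of different modes (vector and double).
void
assign_vd (pair_vd *p, vec v)
{
  p->v = v;
  p->d = 1.0;
}

// An integer store next to a store of floating-point zero.
void
assign_ld (pair_ld *p)
{
  p->l = 10;
  p->d = 0.0;
}
```

Compiling this for an AArch64 target with optimisation enabled and inspecting the assembly should show whether the stores were paired.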

The patch basically generalises the mode iterators on the patterns in aarch64.md
and the peepholes in aarch64-ldpstp.md to take all combinations of pairs of modes
so, while it may be a large-ish patch, it does fairly mechanical stuff.

Bootstrap and testsuite run OK. OK for trunk?

Jackson

2018-05-22  Jackson Woodruff  
Kyrylo Tkachov  

* config/aarch64/aarch64.md: New patterns to generate stp
and ldp.
(store_pair_sw, store_pair_dw): New patterns to generate stp for
single words and double words.
(load_pair_sw, load_pair_dw): Likewise.
(store_pair_sf, store_pair_df, store_pair_si, store_pair_di):
Delete.
(load_pair_sf, load_pair_df, load_pair_si, load_pair_di):
Delete.
* config/aarch64/aarch64-ldpstp.md: Modify peephole
for different mode ldpstp and add peephole for merged zero stores.
Likewise for loads.
* config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
Add size check.
(aarch64_gen_store_pair): Rename calls to match new patterns.
(aarch64_gen_load_pair): Rename calls to match new patterns.
* config/aarch64/aarch64-simd.md (load_pair): Rename to...
(load_pair): ... This.
(store_pair): Rename to...
(vec_store_pair): ... This.
* config/aarch64/iterators.md (DREG, DREG2, DX2, SX, SX2, DSX):
New mode iterators.
(V_INT_EQUIV): Handle SImode.
* config/aarch64/predicates.md (aarch64_reg_zero_or_fp_zero):
New predicate.


2018-05-22  Jackson Woodruff  

* gcc.target/aarch64/ldp_stp_6.c: New.
* gcc.target/aarch64/ldp_stp_7.c: New.
* gcc.target/aarch64/ldp_stp_8.c: New.
diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index c008477c741d20f530b66ad9420113fa84a54ebb..f6fe8a6a93b5466723e3ed6b892e0ec1e67ee89d 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -99,11 +99,11 @@ (define_peephole2
 })
 
 (define_peephole2
-  [(set (match_operand:VD 0 "register_operand" "")
-	(match_operand:VD 1 "aarch64_mem_pair_operand" ""))
-   (set (match_operand:VD 2 "register_operand" "")
-	(match_operand:VD 3 "memory_operand" ""))]
-  "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
+  [(set (match_operand:DREG 0 "register_operand" "")
+	(match_operand:DREG 1 "aarch64_mem_pair_operand" ""))
+   (set (match_operand:DREG2 2 "register_operand" "")
+	(match_operand:DREG2 3 "memory_operand" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
@@ -119,11 +119,12 @@ (define_peephole2
 })
 
 (define_peephole2
-  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "")
-	(match_operand:VD 1 "register_operand" ""))
-   (set (match_operand:VD 2 "memory_operand" "")
-	(match_operand:VD 3 "register_operand" ""))]
-  "TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, mode)"
+  [(set (match_operand:DREG 0 "aarch64_mem_pair_operand" "")
+	(match_operand:DREG 1 "register_operand" ""))
+   (set (match_operand:DREG2 2 "memory_operand" "")
+	(match_operand:DREG2 3 "register_operand" ""))]
+  "TARGET_SIMD
+   && aarch64_operands_ok_for_ldpstp (operands, false, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
 	  (set (match_dup 2) (match_dup 3))])]
 {
@@ -138,7 +139,6 @@ (define_peephole2
 }
 })
 
-
 ;; Handle sign/zero extended consecutive load/store.
 
 (define_peephole2
@@ -181,6 +181,36 @@ (define_peephole2
 }
 })
 
+;; Handle storing of a floating point zero with integer data.
+;; This handles cases like:
+;;   struct pair { int a; float b; }
+;;
+;;   p->a = 1;
+;;   p->b = 0.0;
+;;
+;; We can match modes that won't work for a stp instruction
+;; as aarch64_operands_ok_for_ldpstp checks that the modes are
+;; compatible.
+(define_peephole2
+  [(set (match_operand:DSX 0 "aarch64_mem_pair_operand" "")
+	(match_operand:DSX 1 "aarch64_reg_zero_or_fp_zero" ""))
+   (set (match_operand: 2 "memory_operand" "")
+	(match_operand: 3 "aarch64_reg_zero_or_fp_zero" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, false, mode)"
+  [(parallel [(set (match_dup 0) (match_dup 

Re: [PATCH] libsanitizer: Don't intercept ustat for Linux

2018-05-22 Thread Jakub Jelinek
On Tue, May 22, 2018 at 05:37:03AM -0700, H.J. Lu wrote:
> <sys/ustat.h> has been removed from glibc 2.28 by:
> 
> commit cf2478d53ad7071e84c724a986b56fe17f4f4ca7
> Author: Adhemerval Zanella 
> Date:   Sun Mar 18 11:28:59 2018 +0800
> 
> Deprecate ustat syscall interface
> 
> This patch removes its reference from libsanitizer for Linux.
> 
> The LLVM patch is posted at
> 
> https://reviews.llvm.org/D47165
> 
> OK for trunk?

Please wait on whatever upstream accepts, then it can be cherry-picked.

Jakub


[PATCH] libsanitizer: Don't intercept ustat for Linux

2018-05-22 Thread H.J. Lu
<sys/ustat.h> has been removed from glibc 2.28 by:

commit cf2478d53ad7071e84c724a986b56fe17f4f4ca7
Author: Adhemerval Zanella 
Date:   Sun Mar 18 11:28:59 2018 +0800

Deprecate ustat syscall interface

This patch removes its reference from libsanitizer for Linux.
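
As an aside, and explicitly not what this patch does: code outside libsanitizer that still wants the struct size where the header exists could probe for it at preprocessing time. The sketch below assumes a compiler that provides __has_include; the macro HAVE_SYS_USTAT_H and the zero fallback are made up for illustration.

```cpp
// Probe for the header instead of assuming it is there.
#if defined (__has_include)
# if __has_include (<sys/ustat.h>)
#  define HAVE_SYS_USTAT_H 1
# endif
#endif
#ifndef HAVE_SYS_USTAT_H
# define HAVE_SYS_USTAT_H 0
#endif

#if HAVE_SYS_USTAT_H
# include <sys/ustat.h>
// Header present: take the size from the real definition.
constexpr unsigned struct_ustat_sz = sizeof (struct ustat);
#else
// Header absent: hypothetical fallback meaning "size unknown".
constexpr unsigned struct_ustat_sz = 0;
#endif
```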

The LLVM patch is posted at

https://reviews.llvm.org/D47165

OK for trunk?

H.J.
---
PR sanitizer/85835
* sanitizer_common/sanitizer_common_syscalls.inc (ustat): Don't
intercept for Linux.
* sanitizer_common/sanitizer_platform_limits_posix.cc: Don't
include <sys/ustat.h> for Linux.
(struct_ustat_sz): Don't define for Linux.
---
 libsanitizer/sanitizer_common/sanitizer_common_syscalls.inc | 2 +-
 .../sanitizer_common/sanitizer_platform_limits_posix.cc | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_syscalls.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_syscalls.inc
index 6fd5ef74274..20b9afc5f60 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_syscalls.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_syscalls.inc
@@ -921,7 +921,7 @@ POST_SYSCALL(newfstat)(long res, long fd, void *statbuf) {
   }
 }
 
-#if !SANITIZER_ANDROID
+#if !SANITIZER_LINUX && !SANITIZER_ANDROID
 PRE_SYSCALL(ustat)(long dev, void *ubuf) {}
 
 POST_SYSCALL(ustat)(long res, long dev, void *ubuf) {
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc 
b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
index 858bb218450..8c2c30161c8 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
@@ -157,7 +157,6 @@ typedef struct user_fpregs elf_fpregset_t;
 # include 
 #endif
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -250,7 +249,6 @@ namespace __sanitizer {
 #endif // SANITIZER_LINUX || SANITIZER_FREEBSD
 
 #if SANITIZER_LINUX && !SANITIZER_ANDROID
-  unsigned struct_ustat_sz = sizeof(struct ustat);
   unsigned struct_rlimit64_sz = sizeof(struct rlimit64);
   unsigned struct_statvfs64_sz = sizeof(struct statvfs64);
 #endif // SANITIZER_LINUX && !SANITIZER_ANDROID
-- 
2.17.0



Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-22 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:
>
>> Hi,
>> As reported in PR85804, bump step is wrongly computed for vector(1) load of
>> single-element group access.  This patch fixes the issue by correcting bump
>> step computation for the specific VMAT_CONTIGUOUS case.
>
>> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>
> To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
> We'd fall into the grouped_load case otherwise which should handle
> the situation correctly?
>
> Richard?

Yeah, I agree.  I mentioned to Bin privately that that was probably
a misstep and that we should instead continue to treat them as
VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
from the array of loaded vectors, instead of doing an actual permute.

(Note that VMAT_CONTIGUOUS is OK for stores, since we don't allow
gaps there.  But it might be easiest to handle both loads and stores
in the same way.)

Although it still seems weird to "vectorise" stuff to one element.
Why not leave the original scalar code in place, and put the onus on
whatever wants to produce or consume a V1 to do the appropriate
conversion?

Thanks,
Richard



Re: [Bug libstdc++/85845] [9 Regression] Many libstdc++ test failures

2018-05-22 Thread Paolo Carlini

Hi,

On 22/05/2018 11:24, Richard Biener wrote:

On Mon, May 21, 2018 at 7:01 PM François Dumont 
wrote:


I just committed this patch as trivial.
I must have run tests without rebuilding pre-compiled headers. I'll have
to find out how to build tests without pre-compiled headers to avoid it
in the future.

I configure with --disable-libstdcxx-pch which also reduces build-time.

If I understand the issue correctly, to test without PCHs I used to 
configure / test / etc. normally and then remove the generated PCHs by 
hand and run make check inside libstdc++-v3/testsuite (*not* inside 
libstdc++-v3) when needed. It's a hack, which exploits a sort of 
bug: in that case make check doesn't lead to the automatic regeneration 
of the PCHs.


Paolo.


[C++ Patch] Add INDIRECT_TYPE_P

2018-05-22 Thread Paolo Carlini

Hi,

so this is the patch only adding INDIRECT_TYPE_P to the C++ front-end 
and using it instead of the misleading POINTER_TYPE_P. It also replaces 
a couple of existing TYPE_PTR_P || TYPE_REF_P. Poisoning at the same 
time POINTER_TYPE_P in the front-end - via #pragma GCC poison - seems 
tricky, because we can't just do it in cp-tree.h: tree.c includes at the 
end the generated gt-cp-tree.h which in turn uses POINTER_TYPE_P. I 
don't know if we want to try really hard to do that at the same time...
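
For readers who have not hit this before: POINTER_TYPE_P in tree.h is true for reference types as well as pointer types, which is why the name misleads. The toy model below only illustrates the naming point. The TOY_ enum, struct and macros are stand-ins invented for this sketch, not the real tree definitions, and the shape of the new macro is an assumption based on the TYPE_PTR_P || TYPE_REF_P uses mentioned above.

```cpp
// Toy stand-ins for tree codes and type nodes; not GCC's definitions.
enum toy_code
{
  TOY_INTEGER_TYPE,
  TOY_POINTER_TYPE,
  TOY_REFERENCE_TYPE
};

struct toy_type
{
  toy_code code;
};

// Narrow predicates: exactly a pointer, exactly a reference.
#define TOY_TYPE_PTR_P(NODE) ((NODE)->code == TOY_POINTER_TYPE)
#define TOY_TYPE_REF_P(NODE) ((NODE)->code == TOY_REFERENCE_TYPE)

// The combined predicate, named for what it tests: either kind of
// indirection.
#define TOY_INDIRECT_TYPE_P(NODE) \
  (TOY_TYPE_PTR_P (NODE) || TOY_TYPE_REF_P (NODE))
```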


Tested x86_64-linux.

Thanks, Paolo.

/

2018-05-22  Paolo Carlini  

* cp-tree.h (INDIRECT_TYPE_P): New.
* call.c (build_trivial_dtor_call, maybe_warn_class_memaccess,
joust): Use it instead of POINTER_TYPE_P.
* class.c (update_vtable_entry_for_fn, find_flexarrays,
fixed_type_or_null, resolves_to_fixed_type_p): Likewise.
* constexpr.c (cxx_eval_binary_expression, cxx_fold_indirect_ref,
cxx_eval_increment_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.c (cp_gimplify_expr, cp_genericize_r): Likewise.
* cp-objcp-common.c (cxx_get_alias_set): Likewise.
* cp-ubsan.c (cp_ubsan_maybe_instrument_member_call,
cp_ubsan_maybe_instrument_downcast): Likewise.
* cvt.c (cp_convert_to_pointer, ocp_convert,
cp_get_fndecl_from_callee, maybe_warn_nodiscard, convert): Likewise.
* cxx-pretty-print.c (cxx_pretty_printer::abstract_declarator,
pp_cxx_offsetof_expression_1): Likewise.
* decl.c (grokparms, static_fn_type): Likewise.
* decl2.c (grokbitfield): Likewise.
* error.c (dump_expr): Likewise.
* except.c (initialize_handler_parm, check_noexcept_r): Likewise.
* init.c (warn_placement_new_too_small): Likewise.
* lambda.c (build_capture_proxy, add_capture): Likewise. 
* parser.c (cp_parser_omp_for_loop): Likewise.
* pt.c (convert_nontype_argument, fn_type_unification,
uses_deducible_template_parms, check_cv_quals_for_unify,
dependent_type_p_r): Likewise.
* search.c (check_final_overrider): Likewise.
* semantics.c (handle_omp_array_sections, finish_omp_clauses,
finish_omp_for): Likewise. 
* tree.c (cp_build_qualified_type_real): Likewise.
* typeck.c (build_class_member_access_expr,
finish_class_member_access_expr, build_x_indirect_ref,
cp_build_indirect_ref_1, cp_build_binary_op, build_const_cast_1):
Likewise.
Index: call.c
===
--- call.c  (revision 260499)
+++ call.c  (working copy)
@@ -7622,7 +7622,7 @@ build_trivial_dtor_call (tree instance)
   return fold_convert (void_type_node, instance);
 }
 
-  if (POINTER_TYPE_P (TREE_TYPE (instance)))
+  if (INDIRECT_TYPE_P (TREE_TYPE (instance)))
 {
   if (VOID_TYPE_P (TREE_TYPE (TREE_TYPE (instance))))
goto no_clobber;
@@ -8511,7 +8511,7 @@ maybe_warn_class_memaccess (location_t loc, tree f
   unsigned srcidx = !dstidx;
 
   tree dest = (*args)[dstidx];
-  if (!TREE_TYPE (dest) || !POINTER_TYPE_P (TREE_TYPE (dest)))
+  if (!TREE_TYPE (dest) || !INDIRECT_TYPE_P (TREE_TYPE (dest)))
 return;
 
   tree srctype = NULL_TREE;
@@ -8643,7 +8643,7 @@ maybe_warn_class_memaccess (location_t loc, tree f
 case BUILT_IN_MEMPCPY:
   /* Determine the type of the source object.  */
   srctype = TREE_TYPE ((*args)[srcidx]);
-  if (!srctype || !POINTER_TYPE_P (srctype))
+  if (!srctype || !INDIRECT_TYPE_P (srctype))
srctype = void_type_node;
   else
srctype = TREE_TYPE (srctype);
@@ -10210,7 +10210,7 @@ joust (struct z_candidate *cand1, struct z_candida
  tree t = TREE_TYPE (TREE_TYPE (l->fn));
  tree f = TREE_TYPE (TREE_TYPE (w->fn));
 
- if (TREE_CODE (t) == TREE_CODE (f) && POINTER_TYPE_P (t))
+ if (TREE_CODE (t) == TREE_CODE (f) && INDIRECT_TYPE_P (t))
{
  t = TREE_TYPE (t);
  f = TREE_TYPE (f);
@@ -10226,7 +10226,7 @@ joust (struct z_candidate *cand1, struct z_candida
   else if (warn)
{
  tree source = source_type (w->convs[0]);
- if (POINTER_TYPE_P (source))
+ if (INDIRECT_TYPE_P (source))
source = TREE_TYPE (source);
  if (warning (OPT_Wconversion, "choosing %qD over %qD", w->fn, l->fn)
  && warning (OPT_Wconversion, "  for conversion from %qH to %qI",
Index: class.c
===
--- class.c (revision 260499)
+++ class.c (working copy)
@@ -2445,7 +2445,7 @@ update_vtable_entry_for_fn (tree t, tree binfo, tr
   over_return = TREE_TYPE (TREE_TYPE (overrider_target));
   base_return = TREE_TYPE (TREE_TYPE (target_fn));
 
-  if (POINTER_TYPE_P (over_return)
+  if (INDIRECT_TYPE_P (over_return)
   && TREE_CODE (over_return) == 

[PATCH] Fix PR85834

2018-05-22 Thread Richard Biener

I originally forgot to restrict memset args to constants - instead of
adding that check the following properly fixes non-constant handling.
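
To spell out what "non-constant handling" has to preserve: memset stores its int argument converted to unsigned char, so a byte load after the call sees that converted value and not the original int. A small sketch of the semantics follows. The function names are made up, and this is not the code in tree-ssa-sccvn.c.

```cpp
#include <cstring>

// Fill a buffer with a run-time value, then read one byte back.
int
load_after_memset (int value)
{
  char x[256];
  std::memset (x, value, sizeof x);
  return x[54];
}

// What a value-numbering replacement for that load must compute:
// the memset argument truncated to a byte, then read as char.
int
expected_load (int value)
{
  return static_cast<char> (static_cast<unsigned char> (value));
}
```

For example, load_after_memset (0x141) yields 0x41 because only the low byte is stored, so forwarding the int argument unchanged would be wrong.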

I'm not quite sure that ao_ref with offset % 8 == 0 but size != maxsize
has any guarantee that all accesses that this covers are aligned to 8
bits but I failed to create a testcase with a variable bit-alignment.
But I'm quite sure Ada can do that, right?  There may be existing code
that assumes that if offset is byte-aligned all covered accesses are.

Eric?

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

This should restore -O3 bootstrap.

Richard.

From 07481b3c924f095e7330337649172b6c237d2e09 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 22 May 2018 10:36:13 +0200
Subject: [PATCH 2/2] fix-pr85834

2018-05-22  Richard Biener  

PR tree-optimization/85834
* tree-ssa-sccvn.c (vn_reference_lookup_3): Properly handle
non-constant and non-zero memset arguments.

* g++.dg/torture/pr85834.C: New testcase.
* gcc.dg/tree-ssa/ssa-fre-64.c: Likewise.

diff --git a/gcc/testsuite/g++.dg/torture/pr85834.C 
b/gcc/testsuite/g++.dg/torture/pr85834.C
new file mode 100644
index 000..bbdc695849c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr85834.C
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+
+typedef __SIZE_TYPE__ a;
+extern "C" void *memset(void *, int, a);
+typedef struct b c;
+enum d { e };
+template  class f {
+public:
+template  f(g);
+};
+typedef f<1, long> h;
+template  struct j {
+enum k {};
+};
+class l {
+public:
+typedef j::k k;
+l(k);
+operator d();
+};
+struct b {};
+class m {};
+c q(h, d);
+c n(unsigned char o[]) {
+int i;
+long r;
+for (i = 0; i < 4; i++)
+  r = o[i];
+return q(r, l((l::k)e));
+}
+m p() {
+unsigned char o[4], s = 1;
+for (;;) {
+   memset(o, s, 4);
+   n(o);
+   s = 2;
+}
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-64.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-64.c
new file mode 100644
index 000..15f278e1945
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-64.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1-details -fdump-tree-dse1-details" } */
+
+int foo(unsigned char c, signed char d, int e)
+{
+  int res = 0;
+  char x[256];
+  __builtin_memset (x, c, 256);
+  res += x[54];
+  __builtin_memset (x, d, 256);
+  res += x[54];
+  __builtin_memset (x, e, 256);
+  res += x[54];
+  return res;
+}
+
+/* The loads get replaced with conversions from c or d and e.  */
+/* { dg-final { scan-tree-dump-times "Inserted" 2 "fre1" } } */
+/* { dg-final { scan-tree-dump-times "Replaced x" 3 "fre1" } } */
+/* And the memsets removed as dead.  */
+/* { dg-final { scan-tree-dump-times "Deleted dead call" 3 "dse1" } } */
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 39de866a8ce..884cce12bb3 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -1959,7 +1959,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
   if (is_gimple_reg_type (vr->type)
   && gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET)
   && (integer_zerop (gimple_call_arg (def_stmt, 1))
- || (INTEGRAL_TYPE_P (vr->type) && known_eq (ref->size, 8)))
+ || (INTEGRAL_TYPE_P (vr->type)
+ && CHAR_BIT == 8 && BITS_PER_UNIT == 8
+ && known_eq (ref->size, 8)
+ && known_eq (ref->size, maxsize)
+ && offset.is_constant ()
+ && offseti % BITS_PER_UNIT == 0))
   && poly_int_tree_p (gimple_call_arg (def_stmt, 2))
   && (TREE_CODE (gimple_call_arg (def_stmt, 0)) == ADDR_EXPR
  || TREE_CODE (gimple_call_arg (def_stmt, 0)) == SSA_NAME))
@@ -2026,7 +2031,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
  if (integer_zerop (gimple_call_arg (def_stmt, 1)))
val = build_zero_cst (vr->type);
  else
-   val = fold_convert (vr->type, gimple_call_arg (def_stmt, 1));
+   {
+ code_helper rcode = NOP_EXPR;
+ tree ops[3] = {};
+ ops[0] = gimple_call_arg (def_stmt, 1);
+ val = vn_nary_build_or_lookup (rcode, vr->type, ops);
+ if (!val
+ || (TREE_CODE (val) == SSA_NAME
+ && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)))
+   return (void *)-1;
+   }
  return vn_reference_lookup_or_insert_for_pieces
   (vuse, vr->set, vr->type, vr->operands, val);
}
-- 
2.12.3



Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support.

2018-05-22 Thread Kyrill Tkachov

Hi Shaokun,

On 22/05/18 09:40, Shaokun Zhang wrote:

This patch adds an mcpu for HiSilicon: tsv110.

---
 gcc/ChangeLog|   9 +++
 gcc/config/aarch64/aarch64-cores.def |   5 ++
 gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  79 
 gcc/doc/invoke.texi  |   2 +-
 6 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cec2892..5d44966 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-05-22  Shaokun Zhang 
+Bo Zhou  
+
+   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
+   * config/aarch64/aarch64-tune.md: Regenerated.
+   * doc/invoke.texi (AArch61 Options/-mtune): Add "tsv110".


typo: AArch64.


+   * gcc/config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
+   * gcc/config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.


Please start the path with config/.


+
 2018-05-21  Michael Meissner 

 PR target/85657
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 33b96ca..db7a412 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -91,6 +91,11 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2
 /* Qualcomm ('Q') cores. */
 AARCH64_CORE("saphira", saphira,falkor,8_3A, 
AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)

+/* ARMv8.4-A Architecture Processors.  */
+
+/* HiSilicon ('H') cores. */
+AARCH64_CORE("tsv110", tsv110,tsv110,8_4A, AARCH64_FL_FOR_ARCH8_4 
| AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,   0x48, 
0xd01, -1)
+


The third field is the scheduler model to use when optimising.
Since there is no tsv110 scheduling model, using the name "tsv110"
in the third field will generally give pretty poor schedules.
I recommend you specify a scheduling model that most closely matches your core
for the time being. But I don't think it's required and I wouldn't let it hold
up the patch.
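
As a rough illustration of why the third field matters, here is a toy version of how a .def-style list can be turned into a table with a separate scheduler column. Everything in it is made up for the sketch: the three-field TOY_CORES list, the "example-core" entry and the choice of cortexa57 as its scheduler. The real AARCH64_CORE entries carry more fields.

```cpp
#include <cstring>

// Hypothetical core list: name, identifier, scheduler model to reuse.
#define TOY_CORES(X)                          \
  X ("cortex-a57", cortexa57, cortexa57)      \
  X ("example-core", examplecore, cortexa57)

struct toy_core
{
  const char *name;
  const char *ident;
  const char *sched;
};

// Expand every list entry into one table row.
#define TOY_ROW(NAME, IDENT, SCHED) { NAME, #IDENT, #SCHED },
static const toy_core toy_cores[] = { TOY_CORES (TOY_ROW) };
#undef TOY_ROW

// Look up the scheduler model used when tuning for NAME;
// returns nullptr for an unknown core.
const char *
sched_for (const char *name)
{
  for (unsigned i = 0; i < sizeof toy_cores / sizeof toy_cores[0]; ++i)
    if (std::strcmp (toy_cores[i].name, name) == 0)
      return toy_cores[i].sched;
  return nullptr;
}
```

Here the new core borrows an existing scheduler by naming it in the third field, which is the arrangement suggested above until a dedicated model exists.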

You'll need approval from an aarch64 maintainer (cc'ed some for you).

Thanks,
Kyrill


 /* ARMv8-A big.LITTLE implementations.  */

 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 
0xd03), -1)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index a455c62..b6890d6 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
   }
 };

+const struct cpu_cost_table tsv110_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0, /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+COSTS_N_INSNS (1), /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0,/* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (2),   /* add.  */
+  COSTS_N_INSNS (2),   /* extend_add.  */
+  COSTS_N_INSNS (11)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (19)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (3), /* load.  */
+COSTS_N_INSNS (4), /* load_sign_extend.  */
+COSTS_N_INSNS (3), /* ldrd.  */
+COSTS_N_INSNS (3), /* ldm_1st.  */
+1, /* ldm_regs_per_insn_1st. */
+2, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (4), /* loadf.  */
+COSTS_N_INSNS (4), /* loadd.  */
+COSTS_N_INSNS (4), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+1, 

Re: [RFC] [aarch64] Add HiSilicon tsv110 CPU support

2018-05-22 Thread Ramana Radhakrishnan
On Tue, May 22, 2018 at 9:40 AM, Shaokun Zhang
 wrote:
> tsv110 is designed by HiSilicon and supports v8_4A, it also optimizes
> L1 Icache which can access L1 Dcache.
> Therefore, DC CVAU is not necessary in __aarch64_sync_cache_range for
> tsv110, is there any good idea to skip DC CVAU operation for tsv110.

A solution would be to use an ifunc but on a cpu variant.

Is this really that important for performance, and on what workloads?

regards
Ramana

>
> Any thoughts and ideas are welcome.
>
> Shaokun Zhang (1):
>   [aarch64] Add HiSilicon tsv110 CPU support.
>
>  gcc/ChangeLog|   9 +++
>  gcc/config/aarch64/aarch64-cores.def |   5 ++
>  gcc/config/aarch64/aarch64-cost-tables.h | 103 
> +++
>  gcc/config/aarch64/aarch64-tune.md   |   2 +-
>  gcc/config/aarch64/aarch64.c |  79 
>  gcc/doc/invoke.texi  |   2 +-
>  6 files changed, 198 insertions(+), 2 deletions(-)
>
> --
> 2.7.4
>


[PATCH] Fix PR85863

2018-05-22 Thread Richard Biener

The following fixes mishandling of widening invariant comparisons
in vect_is_simple_cond which isn't supported for SLP code generation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk,
queued for backports.

Richard.

2018-05-22  Richard Biener  

PR tree-optimization/85863
* tree-vect-stmts.c (vect_is_simple_cond): Only widen invariant
comparisons when vectype is specified.
(vectorizable_condition): Do not specify vectype for
vect_is_simple_cond when SLP vectorizing.

* gfortran.fortran-torture/compile/pr85863.f: New testcase.


Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 260499)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -8661,7 +8661,7 @@ vect_is_simple_cond (tree cond, vec_info
 
   *comp_vectype = vectype1 ? vectype1 : vectype2;
   /* Invariant comparison.  */
-  if (! *comp_vectype)
+  if (! *comp_vectype && vectype)
 {
   tree scalar_type = TREE_TYPE (lhs);
   /* If we can widen the comparison to match vectype do so.  */
@@ -8773,7 +8773,7 @@ vectorizable_condition (gimple *stmt, gi
   else_clause = gimple_assign_rhs3 (stmt);
 
   if (!vect_is_simple_cond (cond_expr, stmt_info->vinfo,
-   &comp_vectype, &dts[0], vectype)
+   &comp_vectype, &dts[0], slp_node ? NULL : vectype)
   || !comp_vectype)
 return false;
 
Index: gcc/testsuite/gfortran.fortran-torture/compile/pr85863.f
===
--- gcc/testsuite/gfortran.fortran-torture/compile/pr85863.f(nonexistent)
+++ gcc/testsuite/gfortran.fortran-torture/compile/pr85863.f(working copy)
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! { dg-additional-options "-ffast-math -ftree-vectorize" }
+  SUBROUTINE SOBOOK(MHSO,HSOMAX,MS)
+  IMPLICIT DOUBLE PRECISION(A-H,O-Z)
+  COMPLEX*16 HSOT,HSO1(2)
+  PARAMETER (ZERO=0.0D+00,TWO=2.0D+00)
+  DIMENSION SOL1(3,2),SOL2(3)
+  CALL FOO(SOL1,SOL2)
+  SQRT2=SQRT(TWO)
+  DO IH=1,MHSO
+IF(MS.EQ.0) THEN
+  HSO1(IH) =  DCMPLX(ZERO,-SOL1(3,IH))
+  HSOT =  DCMPLX(ZERO,-SOL2(3))
+ELSE
+  HSO1(IH) =  DCMPLX(-SOL1(2,IH),SOL1(1,IH))/SQRT2
+  HSOT =  DCMPLX(-SOL2(2),SOL2(1))/SQRT2
+ENDIF
+  ENDDO
+  HSOT=HSOT+HSO1(1)
+  HSOMAX=MAX(HSOMAX,ABS(HSOT))
+  RETURN
+  END


Re: [Bug libstdc++/85845] [9 Regression] Many libstdc++ test failures

2018-05-22 Thread Richard Biener
On Mon, May 21, 2018 at 7:01 PM François Dumont 
wrote:

> I just committed this patch as trivial.

> I must have run tests without rebuilding pre-compiled headers. I'll have
> to find out how to build tests without pre-compiled headers to avoid it
> in the future.

I configure with --disable-libstdcxx-pch which also reduces build-time.

Richard.

> François

> On 20/05/2018 19:06, fdumont at gcc dot gnu.org wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85845
> >
> > François Dumont  changed:
> >
> > What|Removed |Added
> >

> >   Status|NEW |ASSIGNED
> > Assignee|unassigned at gcc dot gnu.org  |fdumont at gcc
dot gnu.org
> >


Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

2018-05-22 Thread Richard Biener
On Mon, May 21, 2018 at 3:14 PM Bin Cheng  wrote:

> Hi,
> As reported in PR85804, bump step is wrongly computed for vector(1) load of
> single-element group access.  This patch fixes the issue by correcting bump
> step computation for the specific VMAT_CONTIGUOUS case.

> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?

To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
We'd fall into the grouped_load case otherwise which should handle
the situation correctly?

Richard?

Thanks,
Richard.


> Thanks,
> bin

> 2018-05-17  Bin Cheng  

>  PR tree-optimization/85804
>  * tree-vect-stmts.c (vectorizable_load): Compute correct bump step
>  for vector(1) load in single-element group access.

> gcc/testsuite
> 2018-05-17  Bin Cheng  

>  PR tree-optimization/85804
>  * gcc.c-torture/execute/pr85804.c: New test.


Re: [PATCH] Do not ICE for incomplete types in ICF (PR ipa/85607).

2018-05-22 Thread Richard Biener
On Mon, May 21, 2018 at 9:27 AM Martin Liška  wrote:

> PING^1

OK.

> On 05/11/2018 03:12 PM, Martin Liška wrote:
> > On 05/11/2018 11:35 AM, Richard Biener wrote:
> >> On Thu, May 10, 2018 at 9:58 AM, Martin Liška  wrote:
> >>> Hi.
> >>>
> >>> It's removal of an assert at place where we calculate hash of a type.
> >>> For incomplete types, let's skip it.
> >>>
> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>>
> >>> Ready to be installed?
> >>
> >> Seems to be a redundant check in the !val case as well.  Also why not
> >> at least do
> >>
> >>   hstate2.add_int (RECORD_TYPE);
> >>
> >> for incomplete types?
> >
> > Thanks, done that.
> >
> >  That said, your patch fixes the ICE but what
> >> is supposed to happen for incomplete types?  Note that with LTO
> >> we no longer "complete" types so you can see a mix of struct S;
> >> and struct S {  }; in the IL.  It looks like comparison later just
> >> looks at types_compatible_p here.
> >
> > Yes, that should be fine. The hashing of types is only an optimization.
> >
> >>
> >> Anyway, please at least remove the other redundant assert.
> >
> > Done and tested.
> >
> > Martin
> >
> >>
> >> Thanks,
> >> Richard.
> >>
> >>> Martin
> >>>
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>> 2018-05-09  Martin Liska  
> >>>
> >>> PR ipa/85607
> >>> * ipa-icf.c (sem_item::add_type): Do not ICE for incomplete
> >>> types.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>> 2018-05-09  Martin Liska  
> >>>
> >>> PR ipa/85607
> >>> * g++.dg/ipa/pr85606.C: New test.
> >>> ---
> >>>  gcc/ipa-icf.c  |  5 -
> >>>  gcc/testsuite/g++.dg/ipa/pr85606.C | 14 ++
> >>>  2 files changed, 18 insertions(+), 1 deletion(-)
> >>>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr85606.C
> >>>
> >>>
> >


Re: [PATCHv2] PR 85822 - Fix handling of negative constants

2018-05-22 Thread Richard Biener
On Sun, May 20, 2018 at 9:45 PM Yuri Gribov  wrote:

> On Sun, May 20, 2018 at 1:01 PM, Alexander Monakov 
wrote:
> > On Sun, 20 May 2018, Yuri Gribov wrote:
> >
> >> Hi all,
> >>
> >> This fixes PR 85822 by removing incorrect reversal of condition in VRP
> >> assertion. Bootstrapped and regtested on x86_64.
> >>
> >> Ok for trunk?
> >
> > Please address the following issues:
> >
> > Use correct PR reference in Changelog.

> Ah, right.

> > Double-check the comment before the function, I think NE_EXPR and EQ_EXPR
> > should be swapped there.

> Thanks, fixed.

> > Address Richard's request from the bug report:
> >
> >>> Ok, please make sure to say why not doing anything special for
> >>> negative numbers is ok.

> True, I didn't notice the "to say" part :/

> Comparison
>    (a & 11...100...0) == XX...X00...0
> (where the RHS XX...X is covered by the 11...100...0 mask,
> i.e. (RHS & MASK) == RHS) means that 'a' can have the values
>    XX...X00...00
>    XX...X00...01
>    XX...X00...10
>    ...
>    XX...X11...11
> For positive and negative, signed and unsigned RHSs alike, the first
> number is less than the last one, so we can derive
>    XX...X00...00 <= a
>    a <= XX...X11...11
> in all cases.

> > Note there are at least three special cases that are handled
> > incorrectly either before or after the patch:
> >
> >  - not two's complement integers

> Commented by Richard.

> >  - mask being 0
> >  - mask being ~0

> Done (AND with 0 or -1 is removed during Gimple generation so I didn't
> bother to optimize this special case, just bail out).

OK.

Thanks,
Richard.

> -Y
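
The range argument quoted above can be checked exhaustively on a small type. The following is only a sketch under stated assumptions: it uses 8-bit values, the helper names mask_range_holds and check_all_masks are hypothetical, and it is not the VRP code from the patch. It assumes two's complement conversion to int8_t (guaranteed since C++20, and what common compilers do before that). Masks of 0 and ~0 are skipped, matching the bail-out mentioned in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: for a mask of the form 11...100...0 and an RHS that
// is covered by the mask, 'a' satisfies (a & mask) == rhs exactly when
// rhs <= a <= (rhs | ~mask), with T deciding signed or unsigned ordering.
template <typename T>
bool mask_range_holds (std::uint8_t mask, std::uint8_t rhs)
{
  const std::uint8_t hi_bits
    = static_cast<std::uint8_t> (rhs | static_cast<std::uint8_t> (~mask));
  const T lo = static_cast<T> (rhs);      // XX...X00...00
  const T hi = static_cast<T> (hi_bits);  // XX...X11...11
  for (int i = 0; i < 256; ++i)
    {
      const std::uint8_t bits = static_cast<std::uint8_t> (i);
      const bool matches = static_cast<std::uint8_t> (bits & mask) == rhs;
      const T a = static_cast<T> (bits);
      const bool in_range = lo <= a && a <= hi;
      if (matches != in_range)
        return false;
    }
  return true;
}

// Try every mask 0xFE, 0xFC, ..., 0x80 with every RHS covered by it.
bool check_all_masks ()
{
  for (int k = 1; k < 8; ++k)
    {
      const std::uint8_t mask = static_cast<std::uint8_t> (0xFF << k);
      for (int r = 0; r < 256; ++r)
        {
          const std::uint8_t rhs = static_cast<std::uint8_t> (r);
          if ((rhs & mask) != rhs)
            continue;  // RHS not covered by the mask.
          if (!mask_range_holds<std::uint8_t> (mask, rhs)
              || !mask_range_holds<std::int8_t> (mask, rhs))
            return false;
        }
    }
  return true;
}
```

The signed case works because the sign bit is always part of the mask here, so all matching values share the same sign and compare in the same order as their unsigned bit patterns.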


Re: [patch] Improve support for up-level references (1/n)

2018-05-22 Thread Richard Biener
On Fri, May 18, 2018 at 11:52 PM Eric Botcazou wrote:

> > + /* If the next declaration is a PARM_DECL pointing to the DECL,
> > +    we need to adjust its VALUE_EXPR directly, since chains of
> > +    VALUE_EXPRs run afoul of garbage collection.  This occurs
> > +    in Ada for Out parameters that aren't copied in.  */
> > + if (next
> > + && TREE_CODE (next) == PARM_DECL
> > + && DECL_HAS_VALUE_EXPR_P (next)
> > + && DECL_VALUE_EXPR (next) == decl)
> > +   SET_DECL_VALUE_EXPR (next, x);
> >
> > maybe you can explain the GC issue a bit.

> It's the issue with GCed tables pointing to each other (which one is marked
> first?) applied to the VALUE_EXPR table.  Not clear it's worth verifying since
> it's marginal, but I guess it's cheap enough to do in decl_value_expr_insert.

Yeah, I guess it's cheap enough and we might have this issue lurking
somewhere...

Richard.

> --
> Eric Botcazou


Re: Handle a null lhs in expand_direct_optab_fn (PR85862)

2018-05-22 Thread Richard Biener
On Tue, May 22, 2018 at 8:58 AM Richard Sandiford <richard.sandif...@linaro.org> wrote:

> This PR showed that the normal function for expanding directly-mapped
> internal functions didn't handle the case in which the call was only
> being kept for its side-effects.

> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  OK to install?

OK.

Richard.

> Richard


> 2018-05-22  Richard Sandiford  

> gcc/
>  PR middle-end/85862
>  * internal-fn.c (expand_direct_optab_fn): Cope with a null lhs.

> gcc/testsuite/
>  PR middle-end/85862
>  * gcc.dg/torture/pr85862.c: New test.

> Index: gcc/internal-fn.c
> ===
> --- gcc/internal-fn.c   2018-05-18 09:26:37.734714355 +0100
> +++ gcc/internal-fn.c   2018-05-22 07:56:26.130013854 +0100
> @@ -2891,14 +2891,15 @@ expand_direct_optab_fn (internal_fn fn,
> insn_code icode = direct_optab_handler (optab, TYPE_MODE (types.first));

> tree lhs = gimple_call_lhs (stmt);
> -  tree lhs_type = TREE_TYPE (lhs);
> -  rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +  rtx lhs_rtx = NULL_RTX;
> +  if (lhs)
> +lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);

> /* Do not assign directly to a promoted subreg, since there is no
>guarantee that the instruction will leave the upper bits of the
>register in the state required by SUBREG_PROMOTED_SIGN.  */
> rtx dest = lhs_rtx;
> -  if (GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
> +  if (dest && GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
>   dest = NULL_RTX;

> create_output_operand (&ops[0], dest, insn_data[icode].operand[0].mode);
> @@ -2917,7 +2918,7 @@ expand_direct_optab_fn (internal_fn fn,
>   }

> expand_insn (icode, nargs + 1, ops);
> -  if (!rtx_equal_p (lhs_rtx, ops[0].value))
> +  if (lhs_rtx && !rtx_equal_p (lhs_rtx, ops[0].value))
>   {
> /* If the return value has an integral type, convert the instruction
>   result to that type.  This is useful for things that return an
> @@ -2931,7 +2932,7 @@ expand_direct_optab_fn (internal_fn fn,
>/* If this is a scalar in a register that is stored in a wider
>   mode than the declared mode, compute the result into its
>   declared mode and then convert to the wider mode.  */
> - gcc_checking_assert (INTEGRAL_TYPE_P (lhs_type));
> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>rtx tmp = convert_to_mode (GET_MODE (lhs_rtx), ops[0].value, 0);
>convert_move (SUBREG_REG (lhs_rtx), tmp,
>  SUBREG_PROMOTED_SIGN (lhs_rtx));
> @@ -2940,7 +2941,7 @@ expand_direct_optab_fn (internal_fn fn,
>  emit_move_insn (lhs_rtx, ops[0].value);
> else
>  {
> - gcc_checking_assert (INTEGRAL_TYPE_P (lhs_type));
> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>convert_move (lhs_rtx, ops[0].value, 0);
>  }
>   }
> Index: gcc/testsuite/gcc.dg/torture/pr85862.c
> ===
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr85862.c  2018-05-22 07:56:26.131013803 +0100
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fexceptions -fnon-call-exceptions" } */
> +/* { dg-additional-options "-fexceptions -fnon-call-exceptions -mfma" { target i?86-*-* x86_64-*-* } } */
> +
> +void
> +ki (double nq)
> +{
> +  double no = 1.1 * nq - nq;
> +}


Re: [PATCH] Print working directory to gcov files (PR gcov-profile/84846).

2018-05-22 Thread Martin Liška
On 05/21/2018 10:17 PM, Eric Botcazou wrote:
>> Simple format extension which prints working directory of TU when it was
>> compiled. It's requested by LCOV folks.
> 
> Can we make that optional, please?  Having the coverage results depend on
> the current working directory is quite annoying, to say the least.
> 

Hi Eric.

What do you mean by that? Why would the results be dependent on it?

Martin


Re: [PATCH][RFC] Add dynamic edge/bb flag allocation

2018-05-22 Thread Richard Biener
On Mon, 21 May 2018, Jeff Law wrote:

> On 05/18/2018 07:15 AM, David Malcolm wrote:
> > On Fri, 2018-05-18 at 13:11 +0200, Richard Biener wrote:
> >> The following adds a simple alloc/free_flag machinery allocating
> >> bits from an int typed pool and applies that to bb->flags and
> >> edge->flags.
> >> This should allow infrastructure pieces to use edge/bb flags
> >> temporarily without worrying that users might already use them, as
> >> happens for example with BB_VISITED and friends.  It converts one
> >> clever user to the new interface.
> >>
> >> The allocation state is per CFG but we could also make it global
> >> or merge the two pools so one allocates a flag that can be used for
> >> bbs and edges at the same time.
> >>
> >> Thus - any opinions welcome.  I'm mainly targeting cfganal algorithms
> >> where I want to add a few region-based ones.  To stay at O(region-size)
> >> complexity these cannot use sbitmaps for visited sets because of the
> >> clearing overhead, and bitmaps are probably more expensive to use than
> >> a BB/edge flag that needs to be cleared afterwards.
> >>
> >> Built on x86_64, otherwise untested.
> >>
> >> Any comments?
> > 
> > Rather than putting alloc/free pairs at the usage sites, how about an
> > RAII class?  Something like this:
> Yes, please if at all possible we should be using RAII.

So like the following?  (see comments in the hwint.h hunk for
extra C++ questions...)

I dropped the non-RAII interface - it's very likely never needed.

Better suggestions for placement of auto_flag welcome.

Thanks,
Richard.
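
For readers following the RAII suggestion, here is a minimal sketch of the idea. It is not the auto_flag class from the patch (the hwint.h hunk is not shown in this message); the class name scoped_flag, the unsigned pool type and the lowest-free-bit policy are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the RAII flag idea: take one free bit from a
// pool on construction and hand it back on destruction.
class scoped_flag
{
public:
  explicit scoped_flag (unsigned *pool) : m_pool (pool), m_bit (0)
  {
    const unsigned free_bits = ~*pool;
    assert (free_bits != 0 && "no free flag bits left");
    // Isolate the lowest bit that is still clear in the pool.
    m_bit = free_bits & (~free_bits + 1u);
    *m_pool |= m_bit;
  }

  ~scoped_flag () { *m_pool &= ~m_bit; }

  // Copying would release the same bit twice, so forbid it.
  scoped_flag (const scoped_flag &) = delete;
  scoped_flag &operator= (const scoped_flag &) = delete;

  // Let the object be used directly in flag expressions.
  operator unsigned () const { return m_bit; }

private:
  unsigned *m_pool;
  unsigned m_bit;
};

// Usage sketch: the pool starts with the statically assigned bits set,
// in the way init_flow seeds edge_flags_allocated with EDGE_ALL_FLAGS.
unsigned demo_scoped_flag ()
{
  unsigned pool = 0x7;        // Bits 0..2 are already taken.
  unsigned edge_flags = 0x1;  // Flags word of some imaginary edge.
  {
    scoped_flag visited (&pool);
    edge_flags |= visited;    // Mark the edge.
    edge_flags &= ~visited;   // The user still clears the bit afterwards.
  }
  return pool;                // Back to 0x7 once 'visited' is destroyed.
}
```

Note that releasing the bit in the pool does not clear it in any bb or edge; as the cover text says, the flag still needs to be cleared by its user before it goes out of scope.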

From 8ae07eb0aa6c430605a16f043ec08726f81b2442 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Fri, 18 May 2018 13:01:36 +0200
Subject: [PATCH 2/2] add dynamic cfg flag allocation

* cfg.h (struct control_flow_graph): Add edge_flags_allocated and
bb_flags_allocated members.
(auto_edge_flag): New RAII class for allocating edge flags.
(auto_bb_flag): New RAII class for allocating bb flags.
* hwint.h (auto_flag): New RAII class for allocating flags.
* cfgloop.c (verify_loop_structure): Allocate temporary edge
flag dynamically.

cfg flag

diff --git a/gcc/cfg.c b/gcc/cfg.c
index 11026e7209a..f8b217d39ca 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -79,6 +79,8 @@ init_flow (struct function *the_fun)
 = EXIT_BLOCK_PTR_FOR_FN (the_fun);
   EXIT_BLOCK_PTR_FOR_FN (the_fun)->prev_bb
 = ENTRY_BLOCK_PTR_FOR_FN (the_fun);
+  the_fun->cfg->edge_flags_allocated = EDGE_ALL_FLAGS;
+  the_fun->cfg->bb_flags_allocated = BB_ALL_FLAGS;
 }
 
 /* Helper function for remove_edge and clear_edges.  Frees edge structure
diff --git a/gcc/cfg.h b/gcc/cfg.h
index 0953456782b..f9f762a520b 100644
--- a/gcc/cfg.h
+++ b/gcc/cfg.h
@@ -74,6 +74,10 @@ struct GTY(()) control_flow_graph {
 
   /* Maximal count of BB in function.  */
   profile_count count_max;
+
+  /* Dynamically allocated edge/bb flags.  */
+  int edge_flags_allocated;
+  int bb_flags_allocated;
 };
 
 
@@ -121,4 +125,18 @@ extern basic_block get_bb_copy (basic_block);
 void set_loop_copy (struct loop *, struct loop *);
 struct loop *get_loop_copy (struct loop *);
 
+class auto_edge_flag : public auto_flag
+{
+public:
+  auto_edge_flag (function *fun)
+: auto_flag (&fun->cfg->edge_flags_allocated) {}
+};
+
+class auto_bb_flag : public auto_flag
+{
+public:
+  auto_bb_flag (function *fun)
+: auto_flag (&fun->cfg->bb_flags_allocated) {}
+};
+
 #endif /* GCC_CFG_H */
diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 8af793c6015..fb5ebad1dfd 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -1539,6 +1539,7 @@ verify_loop_structure (void)
   /* Check irreducible loops.  */
   if (loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS))
 {
+  auto_edge_flag saved_irr_mask (cfun);
   /* Record old info.  */
   auto_sbitmap irreds (last_basic_block_for_fn (cfun));
   FOR_EACH_BB_FN (bb, cfun)
@@ -1550,7 +1551,7 @@ verify_loop_structure (void)
bitmap_clear_bit (irreds, bb->index);
  FOR_EACH_EDGE (e, ei, bb->succs)
if (e->flags & EDGE_IRREDUCIBLE_LOOP)
- e->flags |= EDGE_ALL_FLAGS + 1;
+ e->flags |= saved_irr_mask;
}
 
   /* Recount it.  */
@@ -1576,20 +1577,20 @@ verify_loop_structure (void)
  FOR_EACH_EDGE (e, ei, bb->succs)
{
  if ((e->flags & EDGE_IRREDUCIBLE_LOOP)
- && !(e->flags & (EDGE_ALL_FLAGS + 1)))
+ && !(e->flags & saved_irr_mask))
{
  error ("edge from %d to %d should be marked irreducible",
 e->src->index, e->dest->index);
  err = 1;
}
  else if (!(e->flags & EDGE_IRREDUCIBLE_LOOP)
-  && (e->flags & (EDGE_ALL_FLAGS + 1)))
+  && (e->flags & saved_irr_mask))
{
  error ("edge from %d to %d should not be marked irreducible",
   

[RFC] [aarch64] Add HiSilicon tsv110 CPU support.

2018-05-22 Thread Shaokun Zhang
This patch adds an mcpu for HiSilicon's tsv110.

---
 gcc/ChangeLog|   9 +++
 gcc/config/aarch64/aarch64-cores.def |   5 ++
 gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  79 
 gcc/doc/invoke.texi  |   2 +-
 6 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cec2892..5d44966 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2018-05-22  Shaokun Zhang  
+Bo Zhou  
+
+   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
+   * config/aarch64/aarch64-tune.md: Regenerated.
+   * doc/invoke.texi (AArch64 Options/-mtune): Add "tsv110".
+   * config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
+   * gcc/config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
+
 2018-05-21  Michael Meissner  
 
PR target/85657
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 33b96ca..db7a412 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -91,6 +91,11 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2
 /* Qualcomm ('Q') cores. */
 AARCH64_CORE("saphira", saphira,falkor,8_3A,  AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
 
+/* ARMv8.4-A Architecture Processors.  */
+
+/* HiSilicon ('H') cores. */
+AARCH64_CORE("tsv110", tsv110, tsv110, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
+
 /* ARMv8-A big.LITTLE implementations.  */
 
 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 0xd03), -1)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h
index a455c62..b6890d6 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
   }
 };
 
+const struct cpu_cost_table tsv110_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0, /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+COSTS_N_INSNS (1), /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0, /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (2),   /* add.  */
+  COSTS_N_INSNS (2),   /* extend_add.  */
+  COSTS_N_INSNS (11)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (19)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (3), /* load.  */
+COSTS_N_INSNS (4), /* load_sign_extend.  */
+COSTS_N_INSNS (3), /* ldrd.  */
+COSTS_N_INSNS (3), /* ldm_1st.  */
+1, /* ldm_regs_per_insn_1st.  */
+2, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (4), /* loadf.  */
+COSTS_N_INSNS (4), /* loadd.  */
+COSTS_N_INSNS (4), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+1, /* stm_regs_per_insn_1st.  */
+2, /* stm_regs_per_insn_subsequent.  */
+0, /* storef.  */
+0, /* stored.  */
+COSTS_N_INSNS (1), /* store_unaligned.  */
+COSTS_N_INSNS (4), /* loadv.  */
+COSTS_N_INSNS (4)  /* storev.  */
+  },
+  {
+/* FP SFmode */
+{
+  COSTS_N_INSNS (10),  /* div.  */
+  COSTS_N_INSNS (4),   /* mult.  */
+  COSTS_N_INSNS (4),   /* mult_addsub.  */
+  

[RFC] [aarch64] Add HiSilicon tsv110 CPU support

2018-05-22 Thread Shaokun Zhang
tsv110 is designed by HiSilicon and supports ARMv8.4-A.  Its L1 Icache is
optimized so that it can access the L1 Dcache.
Therefore, DC CVAU is not necessary in __aarch64_sync_cache_range for
tsv110.  Is there a good way to skip the DC CVAU operation for tsv110?

Any thoughts and ideas are welcome.

Shaokun Zhang (1):
  [aarch64] Add HiSilicon tsv110 CPU support.

 gcc/ChangeLog|   9 +++
 gcc/config/aarch64/aarch64-cores.def |   5 ++
 gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  79 
 gcc/doc/invoke.texi  |   2 +-
 6 files changed, 198 insertions(+), 2 deletions(-)

-- 
2.7.4



Add a class to represent a gimple match result

2018-05-22 Thread Richard Sandiford
Gimple match results are represented by a code_helper for the operation,
a tree for the type, and an array of three trees for the operands.
This patch wraps them up in a class so that they don't need to be
passed around individually.

The main reason for doing this is to make it easier to increase the
number of operands (for calls) or to support more complicated kinds
of operation.  But passing around fewer operands also helps to reduce
the size of gimple-match.o (about 7% for development builds and 4% for
release builds).
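
To illustrate the shape of the change, here is a small sketch with stand-in types. It is not the gimple_match_op class from the patch: match_op, code_t, type_t, operand_t and the toy "x + 0" rule are all hypothetical, and GCC's real code_helper and tree types are not modelled. Only the idea of bundling the code, the type and a fixed-size operand array comes from the description above.

```cpp
#include <cassert>

// Hypothetical stand-ins for code_helper, the type tree and operand trees.
typedef int code_t;
typedef const char *type_t;
typedef int operand_t;

// Bundles what used to be passed as three separate parameters.
struct match_op
{
  static const unsigned int MAX_NUM_OPS = 3;

  match_op () : code (0), type (nullptr), num_ops (0), ops () {}
  match_op (code_t c, type_t t, operand_t op0, operand_t op1)
    : code (c), type (t), num_ops (2), ops ()
  {
    ops[0] = op0;
    ops[1] = op1;
  }

  code_t code;
  type_t type;
  unsigned int num_ops;
  operand_t ops[MAX_NUM_OPS];
};

// Before: bool resimplify2 (code_t *code, type_t type, operand_t *ops);
// After: one pointer carries everything and can be updated in place.
bool resimplify2 (match_op *res_op)
{
  // Toy rule: "x + 0" becomes "x".  Code 1 stands for plus and code 0
  // for a plain value; both encodings are invented for this sketch.
  if (res_op->code == 1 && res_op->num_ops == 2 && res_op->ops[1] == 0)
    {
      res_op->code = 0;
      res_op->num_ops = 1;
      return true;
    }
  return false;
}
```

With this layout, raising MAX_NUM_OPS or adding a field changes the struct only, not the signature of every function that passes a match result along.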

2018-05-21  Richard Sandiford  

gcc/
* gimple-match.h (gimple_match_op): New class.
(mprts_hook): Replace parameters with a gimple_match_op *.
(maybe_build_generic_op): Likewise.
(gimple_simplified_result_is_gimple_val): Replace parameters with
a const gimple_match_op *.
(gimple_simplify): Replace code_helper * and tree * parameters with
a gimple_match_op * parameter.
(gimple_resimplify1): Replace code_helper *, tree and tree *
parameters with a gimple_match_op * parameter.
(gimple_resimplify2): Likewise.
(gimple_resimplify3): Likewise.
(maybe_push_res_to_seq): Replace code_helper, tree and tree *
parameters with a gimple_match_op * parameter.
* gimple-match-head.c (gimple_simplify): Change prototypes of
auto-generated functions to take a gimple_match_op * instead of
separate code_helper * and tree * parameters.  Make the same
change in the top-level overload and update calls to the
gimple_resimplify routines.  Update calls to the auto-generated
functions and to maybe_push_res_to_seq in the publicly-facing
operation-specific gimple_simplify overloads.
(gimple_match_op::MAX_NUM_OPS): Define.
(gimple_resimplify1): Replace rcode and ops with a single res_op
parameter.  Update call to gimple_simplify.
(gimple_resimplify2): Likewise.
(gimple_resimplify3): Likewise.
(mprts_hook): Replace parameters with a gimple_match_op *.
(maybe_build_generic_op): Likewise.
(build_call_internal): Replace type, nargs and ops with
a gimple_match_op *.
(maybe_push_res_to_seq): Replace res_code, type and ops parameters
with a single gimple_match_op *.  Update calls to mprts_hook,
build_call_internal and gimple_simplified_result_is_gimple_val.
Factor out code that is common to the tree_code and combined_fn cases.
* genmatch.c (expr::gen_transform): Replace tem_code and
tem_ops with a gimple_match_op called tem_op.  Update calls
to the gimple_resimplify functions and maybe_push_res_to_seq.
(dt_simplify::gen_1): Manipulate res_op instead of res_code and
res_ops.  Update call to the gimple_resimplify functions.
(dt_simplify::gen): Pass res_op instead of res_code and res_ops.
(decision_tree::gen): Make the functions take a gimple_match_op *
called res_op instead of separate res_code and res_ops parameters.
Update call accordingly.
* gimple-fold.c (replace_stmt_with_simplification): Replace rcode
and ops with a single res_op parameter.  Update calls to
maybe_build_generic_op and maybe_push_res_to_seq.
(fold_stmt_1): Update calls to gimple_simplify and
replace_stmt_with_simplification.
(gimple_fold_stmt_to_constant_1): Update calls to gimple_simplify
and gimple_simplified_result_is_gimple_val.
* tree-cfgcleanup.c (cleanup_control_expr_graph): Update call to
gimple_simplify.
* tree-ssa-sccvn.c (vn_lookup_simplify_result): Replace parameters
with a gimple_match_op *.
(vn_nary_build_or_lookup): Likewise.  Update call to
vn_nary_build_or_lookup_1.
(vn_nary_build_or_lookup_1): Replace rcode, type and ops with a
gimple_match_op *.  Update calls to the gimple_resimplify routines
and to gimple_simplified_result_is_gimple_val.
(vn_nary_simplify): Update call to vn_nary_build_or_lookup_1.
Use gimple_match_op::MAX_NUM_OPS instead of a hard-coded 3.
(vn_reference_lookup_3): Update call to vn_nary_build_or_lookup.
(visit_nary_op): Likewise.
(visit_reference_op_load): Likewise.

Index: gcc/gimple-match.h
===
--- gcc/gimple-match.h  2018-05-22 08:22:40.094593327 +0100
+++ gcc/gimple-match.h  2018-05-22 08:22:40.324588555 +0100
@@ -40,31 +40,165 @@ #define GCC_GIMPLE_MATCH_H
   int rep;
 };
 
-/* Return whether OPS[0] with CODE is a non-expression result and
-   a gimple value.  */
+/* Represents an operation to be simplified, or the result of the
+   simplification.  */
+struct gimple_match_op
+{
+  gimple_match_op () : type (NULL_TREE), num_ops (0) {}
+  gimple_match_op (code_helper, tree, unsigned int);
+  gimple_match_op (code_helper, tree, 

Handle a null lhs in expand_direct_optab_fn (PR85862)

2018-05-22 Thread Richard Sandiford
This PR showed that the normal function for expanding directly-mapped
internal functions didn't handle the case in which the call was only
being kept for its side-effects.

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu.  OK to install?

Richard


2018-05-22  Richard Sandiford  

gcc/
PR middle-end/85862
* internal-fn.c (expand_direct_optab_fn): Cope with a null lhs.

gcc/testsuite/
PR middle-end/85862
* gcc.dg/torture/pr85862.c: New test.

Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c   2018-05-18 09:26:37.734714355 +0100
+++ gcc/internal-fn.c   2018-05-22 07:56:26.130013854 +0100
@@ -2891,14 +2891,15 @@ expand_direct_optab_fn (internal_fn fn,
   insn_code icode = direct_optab_handler (optab, TYPE_MODE (types.first));
 
   tree lhs = gimple_call_lhs (stmt);
-  tree lhs_type = TREE_TYPE (lhs);
-  rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx lhs_rtx = NULL_RTX;
+  if (lhs)
+lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
 
   /* Do not assign directly to a promoted subreg, since there is no
  guarantee that the instruction will leave the upper bits of the
  register in the state required by SUBREG_PROMOTED_SIGN.  */
   rtx dest = lhs_rtx;
-  if (GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
+  if (dest && GET_CODE (dest) == SUBREG && SUBREG_PROMOTED_VAR_P (dest))
 dest = NULL_RTX;
 
   create_output_operand (&ops[0], dest, insn_data[icode].operand[0].mode);
@@ -2917,7 +2918,7 @@ expand_direct_optab_fn (internal_fn fn,
 }
 
   expand_insn (icode, nargs + 1, ops);
-  if (!rtx_equal_p (lhs_rtx, ops[0].value))
+  if (lhs_rtx && !rtx_equal_p (lhs_rtx, ops[0].value))
 {
   /* If the return value has an integral type, convert the instruction
 result to that type.  This is useful for things that return an
@@ -2931,7 +2932,7 @@ expand_direct_optab_fn (internal_fn fn,
  /* If this is a scalar in a register that is stored in a wider
 mode than the declared mode, compute the result into its
 declared mode and then convert to the wider mode.  */
- gcc_checking_assert (INTEGRAL_TYPE_P (lhs_type));
+ gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
  rtx tmp = convert_to_mode (GET_MODE (lhs_rtx), ops[0].value, 0);
  convert_move (SUBREG_REG (lhs_rtx), tmp,
SUBREG_PROMOTED_SIGN (lhs_rtx));
@@ -2940,7 +2941,7 @@ expand_direct_optab_fn (internal_fn fn,
emit_move_insn (lhs_rtx, ops[0].value);
   else
{
- gcc_checking_assert (INTEGRAL_TYPE_P (lhs_type));
+ gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
  convert_move (lhs_rtx, ops[0].value, 0);
}
 }
Index: gcc/testsuite/gcc.dg/torture/pr85862.c
===
--- /dev/null   2018-04-20 16:19:46.369131350 +0100
+++ gcc/testsuite/gcc.dg/torture/pr85862.c  2018-05-22 07:56:26.131013803 +0100
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fexceptions -fnon-call-exceptions" } */
+/* { dg-additional-options "-fexceptions -fnon-call-exceptions -mfma" { target i?86-*-* x86_64-*-* } } */
+
+void
+ki (double nq)
+{
+  double no = 1.1 * nq - nq;
+}
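
The pattern the patch applies can be shown outside GCC with a small sketch. Everything here is hypothetical (call_stmt, expand_call and the counter are invented for illustration and do not correspond to GCC data structures): an operation is still carried out when nobody reads its result, and the destination is only touched when one exists.

```cpp
#include <cassert>

// Hypothetical model of a call whose result may or may not be used.
struct call_stmt
{
  int *lhs;  // Null when the call is kept only for its side effects.
  int arg;
};

static int side_effects_seen = 0;

// Always performs the operation, but guards every use of the
// destination, mirroring the "if (lhs)" and "lhs_rtx &&" checks above.
int expand_call (const call_stmt &stmt)
{
  int *dest = stmt.lhs;
  const int value = stmt.arg * 2;  // The operation itself.
  ++side_effects_seen;             // Happens with or without a result.
  if (dest)
    *dest = value;
  return value;
}
```

In the new test the result of the multiply-subtract is never read, which is the kind of call with a null lhs that the expander previously did not handle.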