[c++-delayed-folding] First stab at convert_to_integer

2015-10-16 Thread Marek Polacek
I felt like it'd be good to resolve some unfinished business in
convert_to_integer before I start messing with the other convert_to_*
functions.  This is a response to
.

> > +  if (!dofold)
> > +{
> > + expr = build1 (CONVERT_EXPR,
> > +lang_hooks.types.type_for_size
> > +  (TYPE_PRECISION (intype), 0),
> > +expr);
> > + return build1 (CONVERT_EXPR, type, expr);
> > +   }
>
> When we're not folding, I don't think we want to do the two-step 
> conversion, just the second one.  And we might want to use NOP_EXPR 
> instead of CONVERT_EXPR, but I'm not sure about that.

Changed.  I kept CONVERT_EXPR there even though NOP_EXPR seemed to work as
well.

> > @@ -818,10 +828,15 @@ convert_to_integer (tree type, tree expr)
> > if (TYPE_UNSIGNED (typex))
> >   typex = signed_type_for (typex);
> >   }
> > -   return convert (type,
> > -   fold_build2 (ex_form, typex,
> > -convert (typex, arg0),
> > -convert (typex, arg1)));
> > +   if (dofold)
> > + return convert (type,
> > + fold_build2 (ex_form, typex,
> > +  convert (typex, arg0),
> > +  convert (typex, arg1)));
> > +   arg0 = build1 (CONVERT_EXPR, typex, arg0);
> > +   arg1 = build1 (CONVERT_EXPR, typex, arg1);
> > +   expr = build2 (ex_form, typex, arg0, arg1);
> > +   return build1 (CONVERT_EXPR, type, expr);
>
> This code path seems to be for pushing a conversion down into a binary 
> expression.  We shouldn't do this at all when we aren't folding.

I tend to agree, but this case is tricky.  What this code is about is,
e.g., for

int
fn (long p, long o)
{
  return p + o;
}

we want to narrow the operation and do the addition on unsigned ints and then
convert to int.  We do it here because we're still missing the
promotion/demotion pass on GIMPLE (PR45397 / PR47477).  Disabling this
optimization here would regress a few testcases, so I kept the code as it was.
Thoughts?
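
As a rough illustration of why the narrowing is safe (a sketch only, not code
from convert.c; fn_wide and fn_narrow are names made up for this example), the
wide addition followed by truncation and the addition done on unsigned ints
agree, because unsigned arithmetic wraps modulo 2^N.  Results that do not fit
in int rely on implementation-defined conversion, which GCC defines as
modulo reduction.

```c
/* Hypothetical helper: what the source says.  Add in the wide type,
   then convert the sum to int.  */
int
fn_wide (long long p, long long o)
{
  return (int) (p + o);
}

/* Hypothetical helper: the narrowed form.  Convert both operands to
   unsigned int first, add there (wrapping is well defined for
   unsigned types), then convert the result to int.  */
int
fn_narrow (long long p, long long o)
{
  unsigned int np = (unsigned int) p;
  unsigned int no = (unsigned int) o;
  return (int) (np + no);
}
```

Doing the narrow addition on a signed type instead would risk signed overflow,
which is undefined; that is why the unsigned variant of the type is used.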

> > @@ -845,9 +860,14 @@ convert_to_integer (tree type, tree expr)
> >
> > if (!TYPE_UNSIGNED (typex))
> >   typex = unsigned_type_for (typex);
> > +   if (!dofold)
> > + return build1 (CONVERT_EXPR, type,
> > +build1 (ex_form, typex,
> > +build1 (CONVERT_EXPR, typex,
> > +TREE_OPERAND (expr, 0))));
>
> Likewise.

Changed.

> > @@ -867,6 +887,14 @@ convert_to_integer (tree type, tree expr)
> >  the conditional and never loses.  A COND_EXPR may have a throw
> >  as one operand, which then has void type.  Just leave void
> >  operands as they are.  */
> > + if (!dofold)
> > +   return build3 (COND_EXPR, type, TREE_OPERAND (expr, 0),
> > +  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 1)))
> > +  ? TREE_OPERAND (expr, 1)
> > +  : build1 (CONVERT_EXPR, type, TREE_OPERAND 
> > (expr, 1)),
> > +  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 2)))
> > +  ? TREE_OPERAND (expr, 2)
> > +  : build1 (CONVERT_EXPR, type, TREE_OPERAND 
> > (expr, 2)));
>
> Likewise.

Changed.

> > @@ -903,6 +933,10 @@ convert_to_integer (tree type, tree expr)
> >return build1 (FIXED_CONVERT_EXPR, type, expr);
> >
> >  case COMPLEX_TYPE:
> > +  if (!dofold)
> > +   return build1 (CONVERT_EXPR, type,
> > +  build1 (REALPART_EXPR,
> > +  TREE_TYPE (TREE_TYPE (expr)), expr));
>
> Why can't we call convert here rather than build1 a CONVERT_EXPR?

I don't know if there was any particular reason, but a bare build1 seems
dubious, so I've changed this to convert and didn't see any problems.

> It would be good to ask a fold/convert maintainer to review the changes 
> to this file, too.

Certainly; comments welcome.

Moreover, there are some places in the C++ FE where we still call
convert_to_integer and not convert_to_integer_nofold -- should they be
changed to the _nofold variant?

Bootstrapped/regtested on x86_64-linux, ok for branch?

diff --git gcc/convert.c gcc/convert.c
index fdb9b9a..40db767 100644
--- gcc/convert.c
+++ gcc/convert.c
@@ -571,13 +571,7 @@ convert_to_integer_1 (tree type, tree expr, bool dofold)
 coexistence of multiple valid pointer sizes, so fetch the one we need
 from the type.  */
   if (!dofold)

[PATCH] Use GET_MODE_BITSIZE to get vector natural alignment

2015-10-16 Thread H.J. Lu
Since GET_MODE_ALIGNMENT is defined by the psABI and the largest alignment
in the IA MCU psABI is 4 bytes, we should use GET_MODE_BITSIZE to get the
vector natural alignment when checking for a misaligned vector move.
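
A minimal sketch of the effect, with made-up helper names and an assumed
32-bit ABI cap (this is not the i386.c code): when the ABI caps the reported
alignment, a 128-bit vector stored at a 32-bit-aligned address no longer looks
misaligned, whereas comparing against the mode size still catches it.

```c
/* Assumption for this sketch: the ABI never reports more than
   32 bits (4 bytes) of alignment.  */
#define ABI_MAX_ALIGN_BITS 32u

/* Hypothetical stand-in for an ABI-capped alignment query.  */
unsigned int
abi_mode_alignment (unsigned int mode_bits)
{
  return mode_bits < ABI_MAX_ALIGN_BITS ? mode_bits : ABI_MAX_ALIGN_BITS;
}

/* A memory operand is misaligned if its known alignment is smaller
   than the alignment the move requires.  */
int
is_misaligned (unsigned int mem_align_bits, unsigned int required_bits)
{
  return mem_align_bits < required_bits;
}
```

With a 128-bit vector at a 32-bit-aligned address, the capped query yields a
requirement of 32 and the check passes silently, while using the mode size
(128) as the requirement flags the access as misaligned.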

OK for trunk?

Thanks.

H.J.
---
* config/i386/i386.c (ix86_expand_vector_move): Use
GET_MODE_BITSIZE to get vector natural alignment.
---
 gcc/config/i386/i386.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ebe2b0a..d0e1f4c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18650,7 +18650,9 @@ void
 ix86_expand_vector_move (machine_mode mode, rtx operands[])
 {
   rtx op0 = operands[0], op1 = operands[1];
-  unsigned int align = GET_MODE_ALIGNMENT (mode);
+  /* Use GET_MODE_BITSIZE instead of GET_MODE_ALIGNMENT since the
+ biggest alignment is 4 byte for IA MCU psABI.  */
+  unsigned int align = GET_MODE_BITSIZE (mode);
 
   if (push_operand (op0, VOIDmode))
 op0 = emit_move_resolve_push (mode, op0);
-- 
2.4.3



[gomp4] small oacc cleanup

2015-10-16 Thread Nathan Sidwell

A small patch committed to gomp4.

1) extract_oacc_routine_gwv does more work than necessary; we just need to check 
if there are oacc function attributes.


2) We still want to run the oacc_device_lower pass, even if errors were 
detected.
  (a) we still need to process the loop markers etc.
  (b) I'm going to shortly be emitting diagnostics from the device compiler, 
and we don't want to only deliver ones from the first offloaded function.


nathan
2015-10-16  Nathan Sidwell  

	* omp-low.c (build_outer_var_ref): Just check for openacc function
	attrib.
	(pass_oacc_device_lower::execute): Don't inhibit if errors have
	happened.

Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 228858)
+++ gcc/omp-low.c	(working copy)
@@ -1283,7 +1283,7 @@ build_outer_var_ref (tree var, omp_conte
 	x = lookup_decl (var, ctx->outer);
 }
   else if (is_reference (var)
-	   || extract_oacc_routine_gwv (current_function_decl) != 0)
+	   || get_oacc_fn_attrib (current_function_decl))
 /* This can happen with orphaned constructs.  If var is reference, it is
possible it is shared and as such valid.  */
 x = var;
@@ -16533,7 +16533,7 @@ public:
   /* opt_pass methods: */
   virtual unsigned int execute (function *)
 {
-  bool gate = (flag_openacc != 0 && !seen_error ());
+  bool gate = (flag_openacc != 0);
 
   if (!gate)
 	return 0;


[PATCH] PR fortran/67987 -- character lengths cannot be negative

2015-10-16 Thread Steve Kargl
The attached patch enforces the Fortran Standard's requirement
that character length must be greater than or equal to zero.
The fix submitted here supersedes the fix for PR fortran/31250,
which silently converted a negative string length to zero.
In removing the fix for 31250, a regression occurred, because
a substring reference where the 'start' value is larger than 
the 'end' value results in a zero length string.  gfortran was computing
the length, but not checking for a negative value.  This has
been fixed by actually doing the check and resetting the length to
zero. 
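
The substring rule can be sketched in plain C as follows (substring_length is
a hypothetical helper for illustration, not a gfortran function): the length
is end - start + 1, clamped to zero when the start exceeds the end.

```c
/* Hypothetical helper: length of the substring s(start:end).
   A start position beyond the end position yields an empty
   substring rather than a negative length.  */
long
substring_length (long start, long end)
{
  long len = end - start + 1;
  return len < 0 ? 0 : len;
}
```

For example, s(2:5) has length 4, while s(5:2) has length 0.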

Built and tested on x86_64-*-freebsd.  OK to commit? 

2015-10-16  Steven G. Kargl  

* decl.c (char_len_param_value): Check for negative character length.
Unwrap a nearby unlong line.
* resolve.c (gfc_resolve_substring_charlen): If a substring would have
a negative character length, set it to zero per the Fortran standard.
Unwrap a nearby unlong line.
(resolve_charlen): Check for negative character length.

2015-10-16  Steven G. Kargl  

gfortran.dg/char_length_2.f90: Update the testcase.

-- 
Steve
Index: fortran/decl.c
===
--- fortran/decl.c	(revision 228667)
+++ fortran/decl.c	(working copy)
@@ -697,8 +697,7 @@ char_len_param_value (gfc_expr **expr, b
 
   if (gfc_match_char (':') == MATCH_YES)
 {
-  if (!gfc_notify_std (GFC_STD_F2003, "deferred type "
-			   "parameter at %C"))
+  if (!gfc_notify_std (GFC_STD_F2003, "deferred type parameter at %C"))
 	return MATCH_ERROR;
 
   *deferred = true;
@@ -708,11 +707,13 @@ char_len_param_value (gfc_expr **expr, b
 
   m = gfc_match_expr (expr);
 
-  if (m == MATCH_YES
-  && !gfc_expr_check_typed (*expr, gfc_current_ns, false))
+  if (m == MATCH_NO || m == MATCH_ERROR)
+return m;
+
+  if (!gfc_expr_check_typed (*expr, gfc_current_ns, false))
 return MATCH_ERROR;
 
-  if (m == MATCH_YES && (*expr)->expr_type == EXPR_FUNCTION)
+  if ((*expr)->expr_type == EXPR_FUNCTION)
 {
   if ((*expr)->value.function.actual
 	  && (*expr)->value.function.actual->expr->symtree)
@@ -731,6 +732,16 @@ char_len_param_value (gfc_expr **expr, b
 	}
 	}
 }
+
+  /* F2008, 4.4.3.1:  The length is a type parameter; its kind is processor
+ dependent and its value is greater than or equal to zero.  */
+  if ((*expr)->expr_type == EXPR_CONSTANT
+  && mpz_cmp_si ((*expr)->value.integer, 0) < 0)
+{
+  gfc_error ("LEN at %C must be greater than or equal to 0");
+  return MATCH_ERROR;
+}
+
   return m;
 
 syntax:
Index: fortran/resolve.c
===
--- fortran/resolve.c	(revision 228667)
+++ fortran/resolve.c	(working copy)
@@ -4562,8 +4562,7 @@ gfc_resolve_substring_charlen (gfc_expr 
 {
   if (e->ts.u.cl->length)
 	gfc_free_expr (e->ts.u.cl->length);
-  else if (e->expr_type == EXPR_VARIABLE
-		 && e->symtree->n.sym->attr.dummy)
+  else if (e->expr_type == EXPR_VARIABLE && e->symtree->n.sym->attr.dummy)
 	return;
 }
 
@@ -4596,12 +4595,19 @@ gfc_resolve_substring_charlen (gfc_expr 
   return;
 }
 
-  /* Length = (end - start +1).  */
+  /* Length = (end - start + 1).  */
   e->ts.u.cl->length = gfc_subtract (end, start);
   e->ts.u.cl->length = gfc_add (e->ts.u.cl->length,
 gfc_get_int_expr (gfc_default_integer_kind,
 		  NULL, 1));
 
+  /* F2008, 6.4.1:  Both the starting point and the ending point shall
+ be within the range 1, 2, ..., n unless the starting point exceeds
+ the ending point, in which case the substring has length zero.  */
+
+  if (mpz_cmp_si (e->ts.u.cl->length->value.integer, 0) < 0)
+mpz_set_si (e->ts.u.cl->length->value.integer, 0);
+
   e->ts.u.cl->length->ts.type = BT_INTEGER;
   e->ts.u.cl->length->ts.kind = gfc_charlen_int_kind;
 
@@ -10882,17 +10888,12 @@ resolve_charlen (gfc_charlen *cl)
 	}
 }
 
-  /* "If the character length parameter value evaluates to a negative
- value, the length of character entities declared is zero."  */
   if (cl->length && !gfc_extract_int (cl->length, &i) && i < 0)
 {
-  if (warn_surprising)
-	gfc_warning_now (OPT_Wsurprising,
-			 "CHARACTER variable at %L has negative length %d,"
-			 " the length has been set to zero",
-			 &cl->length->where, i);
-  gfc_replace_expr (cl->length,
-			gfc_get_int_expr (gfc_default_integer_kind, NULL, 0));
+  gfc_error ("LEN at %L must be greater than or equal to 0", 
+		 &cl->length->where);
+  specification_expr = saved_specification_expr;
+  return false;
 }
 
   /* Check that the character length is not too large.  */
Index: testsuite/gfortran.dg/char_length_2.f90
===
--- testsuite/gfortran.dg/char_length_2.f90	(revision 228667)
+++ testsuite/gfortran.dg/char_length_2.f90	(working copy)
@@ -1,22 +1,12 @@
-! { dg-do link }
-! { 

Re: [PATCH, rs6000][v3] powerpc musl libc support

2015-10-16 Thread Szabolcs Nagy

On 16/10/15 17:35, Segher Boessenkool wrote:

Hi!

On Fri, Oct 16, 2015 at 04:58:06PM +0100, Szabolcs Nagy wrote:

  #if DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{mglibc:" G ";:" U "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) \
+  "%{mglibc:" G ";:%{mmusl:" M ";:" U "}}"
  #elif DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{muclibc:" U ";:" G "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) \
+  "%{muclibc:" U ";:%{mmusl:" M ";:" G "}}"
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) \
+  "%{mglibc:" G ";:%{muclibc:" U ";:" M "}}"
  #else
  #error "Unsupported DEFAULT_LIBC"
  #endif


This doesn't really scale, I wonder if some more elegant non-quadratic
way is possible?  Not that I expect terribly many other libcs to show
up in the near future ;-)



it is also error prone, but it was easier to use the
existing infrastructure than to figure out a clean way..

i guess the macro could be changed to

#define CHOOSE_LD(G,U,M,D) \
  "%{mglibc:" G ";:" \
  "%{muclibc:" U ";:" \
  "%{mmusl:" M ";:" \
  D \
  "}}}"

where D is the default and then

#if DEFAULT_LIBC == LIBC_UCLIBC
#define DEFAULT_LD UCLIBC_LD
#elif
..
#endif

#define LINUX_LD \
  CHOOSE_LD(GLIBC_LD, UCLIBC_LD, MUSL_LD, DEFAULT_LD)

but then the default dynlinker is listed twice
in the expansion of LINUX_LD.

i don't see an easy way to do this -mlibc logic in
the linkspec.
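
to make the duplication concrete, here is a small sketch of that macro with
placeholder loader names (ld-glibc.so etc. are invented for this example, and
uClibc is simply assumed to be the configured default):

```c
#include <string.h>

/* Placeholder dynamic linker names, chosen only for this sketch.  */
#define GLIBC_LD  "ld-glibc.so"
#define UCLIBC_LD "ld-uclibc.so"
#define MUSL_LD   "ld-musl.so"

/* Try each -m<libc> option in turn and fall back to the default D.  */
#define CHOOSE_LD(G, U, M, D) \
  "%{mglibc:" G ";:" \
  "%{muclibc:" U ";:" \
  "%{mmusl:" M ";:" \
  D \
  "}}}"

/* Assumption: uClibc is the configured default libc.  */
#define DEFAULT_LD UCLIBC_LD

#define LINUX_LD \
  CHOOSE_LD (GLIBC_LD, UCLIBC_LD, MUSL_LD, DEFAULT_LD)

/* Return the fully expanded spec string.  */
const char *
linux_ld_spec (void)
{
  return LINUX_LD;
}

/* Count how often NEEDLE (which must be non-empty) occurs in the
   expanded spec string.  */
int
count_in_spec (const char *needle)
{
  int n = 0;
  const char *p = linux_ld_spec ();
  while ((p = strstr (p, needle)) != NULL)
    {
      n++;
      p += strlen (needle);
    }
  return n;
}
```

the macro itself is linear in the number of libcs, but the default loader
shows up once in its own branch and once as the fallback.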



[hsa] Allow gridification of loop pre_bodies

2015-10-16 Thread Martin Jambor
Hi,

the patch below allows gridification and thus fast execution on HSA
GPUs of loops even when they have some statements in their pre-bodies.

It also moves the bulk of target construct preparation for
gridification to even before omp scanning, which should considerably
ease the transition to OpenMP 4.5, which has landed on trunk.  I'll start
working on that next week.

Thanks,

Martin


2015-10-16  Martin Jambor  

* gimple-walk.c (walk_gimple_stmt): Also handle GIMPLE_OMP_GPUKERNEL.
* omp-low.c (omp_context): Removed field kernel_seq.
(single_stmt_in_seq_skip_bind): Moved down in the file.
(seq_only_contains_local_assignments): Likewise.
(target_follows_kernelizable_pattern): Removed.
(find_mark_kernel_components): Moved down in the file.
(attempt_target_kernelization): Removed.
(scan_omp_target): Scan kernel bounds.  Do not handle ctx->kernel_seq.
(check_omp_nesting_restrictions): Do not check GIMPLE_OMP_GPUKERNEL.
(scan_omp_1_stmt):  Also handle GIMPLE_OMP_GPUKERNEL.
(lower_omp_target): Do not process ctx->kernel_seq.
(lower_omp_gpukernel): New function.
(lower_omp_1): Call it.
(target_follows_gridifiable_pattern): New function.
(remap_prebody_decls): New function.
(attempt_target_gridification): Likewise.
(create_target_gpukernel_stmt): Likewise.
(create_target_gpukernels): Likewise.
(execute_lower_omp): Call create_target_gpukernels.

diff --git a/gcc/gimple-walk.c b/gcc/gimple-walk.c
index e62cf62..a91abf1 100644
--- a/gcc/gimple-walk.c
+++ b/gcc/gimple-walk.c
@@ -633,6 +633,7 @@ walk_gimple_stmt (gimple_stmt_iterator *gsi, walk_stmt_fn callback_stmt,
 case GIMPLE_OMP_SINGLE:
 case GIMPLE_OMP_TARGET:
 case GIMPLE_OMP_TEAMS:
+case GIMPLE_OMP_GPUKERNEL:
   ret = walk_gimple_seq_mod (gimple_omp_body_ptr (stmt), callback_stmt,
 callback_op, wi);
   if (ret)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 4f6c833..383f34a 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -184,11 +184,6 @@ struct omp_context
  barriers should jump to during omplower pass.  */
   tree cancel_label;
 
-  /* When we are about to produce a special gridified copy of a target
- construct for a GPU, the copy is stored here between scanning and
- lowering.  */
-  gimple_seq kernel_seq;
-
   /* What to do with variables with implicitly determined sharing
  attributes.  */
   enum omp_clause_default_kind default_kind;
@@ -2654,292 +2649,6 @@ scan_omp_single (gomp_single *stmt, omp_context *outer_ctx)
 layout_type (ctx->record_type);
 }
 
-/* If SEQ is a sequence containing only one statement or a bind statement which
-   itself contains only one statement, return that statement.  Otherwise return
-   NULL.  TARGET_LOC must be location of the target statement and NAME the name
-   of the currently processed statement, both are used for dumping.  */
-
-static gimple *
-single_stmt_in_seq_skip_bind (gimple_seq seq, location_t target_loc,
- const char *name)
-{
-  gimple *stmt;
-  bool loop;
-  do
-{
-  if (!seq)
-   {
- gcc_assert (name);
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_NOTE, target_loc,
-"Will not turn target construct into a simple "
-"GPGPU kernel because %s construct has empty "
-"body\n",
-name);
- return NULL;
-   }
-
-  if (!gimple_seq_singleton_p (seq))
-   {
- gcc_assert (name);
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_NOTE, target_loc,
-"Will not turn target construct into a simple "
-"GPGPU kernel because %s construct contains "
-"multiple statements\n", name);
- return NULL;
-   }
-
-  stmt = gimple_seq_first_stmt (seq);
-  if (is_a <gbind *> (stmt))
-   {
- loop = true;
- gbind *bind = as_a <gbind *> (stmt);
- seq = gimple_bind_body (bind);
-   }
-  else
-   loop = false;
-}
-  while (loop);
-  return stmt;
-}
-
-/* If TARGET follows a pattern that can be turned into a GPGPU kernel, return
-   true, otherwise return false.  In the case of success, also fill in
-   GROUP_SIZE_P with the requested group size or NULL if there is none.  */
-
-static bool
-target_follows_kernelizable_pattern (gomp_target *target, tree *group_size_p)
-{
-  if (gimple_omp_target_kind (target) != GF_OMP_TARGET_KIND_REGION)
-return false;
-
-  location_t tloc = gimple_location (target);
-  gimple *stmt = single_stmt_in_seq_skip_bind (gimple_omp_body (target), tloc,
-  "target");
-  if (!stmt)
-return false;
-  gomp_teams *teams;
-  tree group_size = NULL;
-  if ((teams = dyn_cast  

[PATCH] Fix def_test_returning_type in iamcu/test_basic_returning.c

2015-10-16 Thread H.J. Lu
Use a union to check the float return bits, to avoid converting from integer
to float when comparing the float return value.  I will check it in after
regression testing.
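
A short sketch of the difference (helper names are invented here, and it
assumes float is a 32-bit IEEE-754 type; the patch itself uses unsigned long
for the register value): a cast converts the integer value, while a union
reads the same bits back as a float.

```c
#include <stdint.h>

/* Value conversion: the integer 0x3f800000 becomes the float
   1065353216.0f, which is not what the register actually holds.  */
float
from_cast (uint32_t reg)
{
  return (float) reg;
}

/* Bit reinterpretation through a union: the 32 bits stored via REG
   are read back unchanged as a float.  */
float
from_union (uint32_t reg)
{
  union { float r; uint32_t reg; } u;
  u.reg = reg;
  return u.r;
}
```

With the bit pattern 0x3f800000, from_union yields 1.0f, whereas from_cast
yields the numeric value of the integer.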


H.J.
---
* gcc.target/i386/iamcu/test_basic_returning.c
(def_test_returning_type): Use union to check float return bits.
---
 gcc/testsuite/gcc.target/i386/iamcu/test_basic_returning.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/iamcu/test_basic_returning.c b/gcc/testsuite/gcc.target/i386/iamcu/test_basic_returning.c
index 23efa6e..e617d8d 100644
--- a/gcc/testsuite/gcc.target/i386/iamcu/test_basic_returning.c
+++ b/gcc/testsuite/gcc.target/i386/iamcu/test_basic_returning.c
@@ -39,7 +39,10 @@ fun_test_returning_float (void)
 
 #define def_test_returning_type(fun, type, ret, reg) \
   { type var = WRAP_RET (fun) (); \
-  assert (ret == (type) reg && ret == var); }
+union { type r; unsigned long reg; } u; \
+u.reg = reg; \
+  assert (ret == u.r && ret == var); }
+
 int
 main (void)
 {
-- 
2.4.3



[committed, PATCH] Replace printf with __builtin_printf

2015-10-16 Thread H.J. Lu
Index: ChangeLog
===
--- ChangeLog   (revision 228921)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2015-10-16  H.J. Lu  
 
+   * gcc.target/i386/iamcu/test_basic_64bit_returning.c (main):
+   Replace printf with __builtin_printf.
+
+2015-10-16  H.J. Lu  
+
* gcc.target/i386/iamcu/test_passing_unions.c (main): Properly
initialize u5.
 
Index: gcc.target/i386/iamcu/test_basic_64bit_returning.c
===
--- gcc.target/i386/iamcu/test_basic_64bit_returning.c  (revision 228920)
+++ gcc.target/i386/iamcu/test_basic_64bit_returning.c  (working copy)
@@ -49,7 +49,7 @@ main (void)
   if (d != test_64.d
   || (test_64.ll & 0x) != eax
   || ((test_64.ll >> 32) & 0x) != edx)
-printf ("fail double\n"), failed++;
+__builtin_printf ("fail double\n"), failed++;
 
   if (failed)
 abort ();


Re: config header file reduction patch checked in.

2015-10-16 Thread Andrew MacLeod

On 10/16/2015 03:49 PM, Andrew MacLeod wrote:

On 10/12/2015 04:04 AM, Jeff Law wrote:

On 10/08/2015 07:37 AM, Andrew MacLeod wrote:

On 10/07/2015 06:02 PM, Jeff Law wrote:


I'm slightly concerned about the darwin, windows and solaris bits.  
The former primarily because Darwin has been a general source of 
pain, and in the others because I'm not sure the cross testing will 
exercise that code terribly much.


I'll go ahead and approve all the config/ bits.  Please be on the 
lookout for any fallout.


I'll try and get into more of the other patches tomorrow.




OK, I've checked in the config changes.  I rebuilt all the cross 
compilers for the 200+ targets, and they still build, and 
bootstrapping on x86_64-pc-linux-gnu shows no regressions.


So, if anyone runs into a native build issue you can either add the 
required header back in, or back out the file for your port, and I'll 
look into why something happened.   The only thing I can imagine is 
files that have conditional compilation based on a macro that is only 
ever defined on a native build command line or in headers.  It's 
unlikely... but possible.


btw, out of all the targets, the only one which didn't build before my 
patch was i686-interix3OPT-enable-obsolete...


so that one isn't my fault :-)

Andrew


[committed, PATCH] Properly initialize u5

2015-10-16 Thread H.J. Lu
Index: ChangeLog
===
--- ChangeLog   (revision 228920)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2015-10-16  H.J. Lu  
+
+   * gcc.target/i386/iamcu/test_passing_unions.c (main): Properly
+   initialize u5.
+
 2015-10-16  Eric Botcazou  
 
PR middle-end/67966
Index: gcc.target/i386/iamcu/test_passing_unions.c
===
--- gcc.target/i386/iamcu/test_passing_unions.c (revision 228920)
+++ gcc.target/i386/iamcu/test_passing_unions.c (working copy)
@@ -123,7 +123,7 @@ main (void)
   struct long_struct ls;
 #endif /* CHECK_LARGER_UNION_PASSING */
   union un4 u4[8];
-  union un5 u5 = { { 48.394, 39.3, -397.9, 3484.9 } };
+  union un5 u5 = { 48.394 };
   int i;
   union un6 u6;
 


config header file reduction patch checked in.

2015-10-16 Thread Andrew MacLeod

On 10/12/2015 04:04 AM, Jeff Law wrote:

On 10/08/2015 07:37 AM, Andrew MacLeod wrote:

On 10/07/2015 06:02 PM, Jeff Law wrote:


I'm slightly concerned about the darwin, windows and solaris bits.  
The former primarily because Darwin has been a general source of pain, 
and in the others because I'm not sure the cross testing will exercise 
that code terribly much.


I'll go ahead and approve all the config/ bits.  Please be on the 
lookout for any fallout.


I'll try and get into more of the other patches tomorrow.




OK, I've checked in the config changes.  I rebuilt all the cross 
compilers for the 200+ targets, and they still build, and 
bootstrapping on x86_64-pc-linux-gnu shows no regressions.


So, if anyone runs into a native build issue you can either add the 
required header back in, or back out the file for your port, and I'll 
look into why something happened.   The only thing I can imagine is 
files that have conditional compilation based on a macro that is only 
ever defined on a native build command line or in headers.  It's unlikely... 
but possible.



I've attached the latest version of the patch for the record.

Andrew




config2-final.patch.bz2
Description: application/bzip


Re: [PATCH] PR fortran/67987 -- character lengths cannot be negative

2015-10-16 Thread FX
> The attach patch enforces the Fortran Standard's requirement
> that character length must be great than or equal to zero.

We've got to be careful about this. The standard (F2008) has this to say about 
character lengths:

4.4.3.1. "The number of characters in the string is called the length of the 
string. The length is a type parameter; its kind is processor dependent and its 
value is greater than or equal to zero."

but

4.4.3.2. "If the character length parameter value evaluates to a negative 
value, the length of character entities declared is zero."


So while strings cannot have negative length, they can be declared with a 
length parameter value which is itself negative, leading to the string having 
zero length. Or, said otherwise:

  character(len=-2) :: s

is legal and declares a string of zero length, being thus equivalent to:

  character(len=0) :: s


Thus: not OK to commit.

FX

[Patch, MIPS] Frame header optimization for MIPS (part 2)

2015-10-16 Thread Steve Ellcey

Here is the second part of the MIPS frame header optimization patch.  The
first part avoided allocating a frame header if it knew that none of the
functions that it called would need it.  This part uses a frame header
to save callee saved registers if doing so will allow the function to avoid
having to allocate stack space (thus avoiding the need to increment and
decrement the stack pointer at all).

I did a little reorganization of my first patch as part of this, separating
the is_leaf_function check from needs_frame_header_p and making it a
separate check from callees_functions_use_frame_header.  This allowed me
to reuse the needs_frame_header_p function and the does_not_use_frame_header
field that stores its value in this new patch.

The optimization may be more conservative than it has to be in some
cases, like functions with floating point arguments, but it catches the
most important cases and we can extend it later if needed.

Tested with no regressions using the mips-mti-linux-gnu toolchain.

OK to check in?

Steve Ellcey
sell...@imgtec.com


2015-10-16  Steve Ellcey  

* frame-header-opt.c (gate): Check for optimize > 0.
(has_inlined_assembly): New function.
(needs_frame_header_p): Remove is_leaf_function check,
add argument type check.
(callees_functions_use_frame_header): Add is_leaf_function
and has_inlined_assembly calls.
(set_callers_may_not_allocate_frame): New function.
(frame_header_opt): Add is_leaf_function call, add
set_callers_may_not_allocate_frame call.
* config/mips/mips.c (mips_compute_frame_info): Add check
to see if callee saved regs can be put in frame header.
(mips_expand_prologue): Add check to see if step1 is zero,
fix cfa restores when using frame header to store regs.
(mips_can_use_return_insn): Check to see if registers are
stored in frame header.
* config/mips/mips.h (machine_function): Add
callers_may_not_allocate_frame and
use_frame_header_for_callee_saved_regs fields.


diff --git a/gcc/config/mips/frame-header-opt.c b/gcc/config/mips/frame-header-opt.c
index 7c7b1f2..5512838 100644
--- a/gcc/config/mips/frame-header-opt.c
+++ b/gcc/config/mips/frame-header-opt.c
@@ -79,7 +79,7 @@ public:
   /* This optimization has no affect if TARGET_NEWABI.   If optimize
  is not at least 1 then the data needed for the optimization is
  not available and nothing will be done anyway.  */
-  return TARGET_OLDABI && flag_frame_header_optimization;
+  return TARGET_OLDABI && flag_frame_header_optimization && (optimize > 0);
 }
 
   virtual unsigned int execute (function *) { return frame_header_opt (); }
@@ -125,6 +125,29 @@ is_leaf_function (function *fn)
   return true;
 }
 
+/* Return true if this function has inline assembly code or if we cannot
+   be certain that it does not.  False if know that there is no inline
+   assembly.  */
+
+static bool
+has_inlined_assembly (function *fn)
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+
+  /* If we do not have a cfg for this function be conservative and assume
+ it is may have inline assembly.  */
+  if (fn->cfg == NULL)
+return true;
+
+  FOR_EACH_BB_FN (bb, fn)
+for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  if (gimple_code (gsi_stmt (gsi)) == GIMPLE_ASM)
+   return true;
+
+  return false;
+}
+
 /* Return true if this function will use the stack space allocated by its
caller or if we cannot determine for certain that it does not.  */
 
@@ -136,20 +159,26 @@ needs_frame_header_p (function *fn)
   if (fn->decl == NULL)
 return true;
 
-  if (fn->stdarg || !is_leaf_function (fn))
+  if (fn->stdarg)
 return true;
 
   for (t = DECL_ARGUMENTS (fn->decl); t; t = TREE_CHAIN (t))
 {
   if (!use_register_for_decl (t))
- return true;
+   return true;
+
+  /* Some 64 bit types may get copied to general registers using the frame
+header, see mips_output_64bit_xfer.  Checking for SImode only may be
+ overly restrictive but it is gauranteed to be safe. */
+  if (DECL_MODE (t) != SImode)
+   return true;
 }
 
   return false;
 }
 
-/* Returns TRUE if the argument stack space allocated by function FN is used.
-   Returns FALSE if the space is needed or if the need for the space cannot
+/* Return true if the argument stack space allocated by function FN is used.
+   Return false if the space is needed or if the need for the space cannot
be determined.  */
 
 static bool
@@ -177,6 +206,8 @@ callees_functions_use_frame_header (function *fn)
  called_fn = DECL_STRUCT_FUNCTION (called_fn_tree);
  if (called_fn == NULL
  || DECL_WEAK (called_fn_tree) 
+ || has_inlined_assembly (called_fn)
+ || !is_leaf_function (called_fn)
  || 

Re: [PATCH] PR fortran/67987 -- character lengths cannot be negative

2015-10-16 Thread FX
> 2015-10-16  Steven G. Kargl  
> 
>   PR fortran/67987
>   * decl.c (char_len_param_value): Unwrap unlong line.  If LEN < 0,
>   then force it to zero pre Fortran Standards. 
>   * resolve.c (gfc_resolve_substring_charlen): Unwrap unlong line.
>   If 'start' is larger than 'end', then length of string is negative,
>   so explicitly set it to zero.
>   (resolve_charlen): Remove -Wsurprising warning.  Update comment to
>   text from F2008 standard.
> 
> 2015-10-16  Steven G. Kargl  
> 
>   PR fortran/67987
>   * gfortran.dg/char_length_2.f90: Add declaration from PR to testcase.

The patch is now mostly OK to me. Minor remarks:

  - I’m thinking you mean “force it to zero per [not pre] Fortran standards”
  - why remove the -Wsurprising warning? it seems a good case for -Wsurprising: 
legal code, but dubious anyway

OK after you ponder that second point.

FX

Re: [PATCH: RL78] libgcc fixes for divmodsi, divmodhi and divmodqi

2015-10-16 Thread DJ Delorie

> This is regression tested for RL78 -msim. Please let me know if it is
> OK to commit.

I've committed this patch for you.  Thanks!

> Best Regards,
> Kaushik
> 
> Changelog:
> 2015-08-21  Kaushik Phatak  
> 
> * config/rl78/divmodqi.S: Return 0x00 by default for div by 0.
> * config/rl78/divmodsi.S: Update return register to r8.
> * config/rl78/divmodhi.S: Update return register to r8,r9.
>   Branch to main_loop_done_himode to pop registers before return.


Re: [Patch, fortran] COMMON block error recovery: PR 67758 (second pass)

2015-10-16 Thread Steve Kargl
On Tue, Oct 06, 2015 at 07:52:16PM +0200, Mikael Morin wrote:
> 
> Dominique noticed that the test coming with the preceding PR67758 patch 
> [1] was failing if compiled as free form.
> [1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00301.html
> 
> The problem is again an inconsistent state, but this time between the 
> in_common attribute and the common_block pointer.
> So, here is another iteration, hopefully fixing the remaining problems.
> The changes are:
> - adding a symbol to a common block list in gfc_match_common is 
> delayed after the call to gfc_add_in_common.
> - gfc_restore_latest_undo_checkpoint is changed to check the 
> common_block pointer directly instead of the in_common attribute.
> Both of these changes fix the testcase independently, but with some 
> regressions, so there is additionally:
> - gfc_restore_old_symbol is changed to also restore the 
> common-related pointers.  This is done using a new function created to 
> factor the related memory management.
> - In gfc_restore_last_undo_checkpoint, when a symbol has been 
> removed from the common block linked list, its common_next pointer is 
> cleared.
> 
> Regression tested on x86_64-linux.  OK for trunk?
> 

Hi Mikael,

I think the patch is OK

-- 
Steve


[committed, PATCH] Disable X86_TUNE_ALWAYS_FANCY_MATH_387 for Lakemont

2015-10-16 Thread H.J. Lu
Since the Lakemont processor doesn't have a 387 unit, we should disable
X86_TUNE_ALWAYS_FANCY_MATH_387 for Lakemont.
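
A hedged sketch of how such a selector mask behaves (the bit values and the
tune_enabled helper are invented for this example; the real m_* masks are
defined in i386.c): the selector is the complement of the processors that must
not get the tuning, and a tuning applies when the selector and the processor
bit overlap.

```c
/* Hypothetical processor bits for this sketch only.  */
#define M_386      (1u << 0)
#define M_486      (1u << 1)
#define M_LAKEMONT (1u << 2)
#define M_OTHER    (1u << 3)   /* stands in for any other processor */

/* Selector before and after the change.  */
#define FANCY_MATH_387_OLD (~(M_386 | M_486))
#define FANCY_MATH_387_NEW (~(M_386 | M_486 | M_LAKEMONT))

/* A tuning is enabled when the processor's bit is in the selector.  */
int
tune_enabled (unsigned int selector, unsigned int processor)
{
  return (selector & processor) != 0;
}
```

Under the old selector the tuning is on for the Lakemont bit; under the new
one it is off, while other processors are unaffected.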

* i386/x86-tune.def (X86_TUNE_ALWAYS_FANCY_MATH_387): Disable
for Lakemont.
---
 gcc/config/i386/x86-tune.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 05f9737..b2d3921 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -435,7 +435,7 @@ DEF_TUNE (X86_TUNE_DOUBLE_WITH_ADD, "double_with_add", ~m_386)
such as fsqrt, fprem, fsin, fcos, fsincos etc.
Should be enabled for all targets that always has coprocesor.  */
 DEF_TUNE (X86_TUNE_ALWAYS_FANCY_MATH_387, "always_fancy_math_387",
-  ~(m_386 | m_486))
+  ~(m_386 | m_486 | m_LAKEMONT))
 
 /* X86_TUNE_UNROLL_STRLEN: Produce (quite lame) unrolled sequence for
inline strlen.  This affects only -minline-all-stringops mode. By
-- 
2.4.3



Re: [PATCH] PR fortran/67987 -- character lengths cannot be negative

2015-10-16 Thread Steve Kargl
On Fri, Oct 16, 2015 at 10:17:34PM +0200, FX wrote:
> > The attached patch enforces the Fortran Standard's requirement
> > that character length must be greater than or equal to zero.
> 
> 4.4.3.2. "If the character length parameter value evaluates to
> a negative value, the length of character entities declared is zero."
> 

Thanks.  I missed the above.  That text goes back to 
at least F90 (see p. 41 of N693.pdf).  New diff attached.
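
The two rules the new diff applies can be written down as plain
arithmetic.  A minimal sketch in C, with hypothetical helper names (the
patch itself works on gfc_expr values with mpz_cmp_si and mpz_set_si,
not on plain integers):

```c
#include <assert.h>

/* Declared length: a negative value yields a zero-length entity
   (F2008, 4.4.3.2).  */
long
declared_charlen (long len)
{
  return len < 0 ? 0 : len;
}

/* Substring s(start:end): the length is end - start + 1, or zero when
   the starting point exceeds the ending point (F2008, 6.4.1).  */
long
substring_charlen (long start, long end)
{
  long len = end - start + 1;
  return len < 0 ? 0 : len;
}
```

For example, character(len=-3) declares an entity of length 0, and
s(5:2) is a zero-length substring.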

2015-10-16  Steven G. Kargl  

PR fortran/67987
* decl.c (char_len_param_value): Unwrap unlong line.  If LEN < 0,
then force it to zero per Fortran Standards.
* resolve.c (gfc_resolve_substring_charlen): Unwrap unlong line.
If 'start' is larger than 'end', then length of string is negative,
so explicitly set it to zero.
(resolve_charlen): Remove -Wsurprising warning.  Update comment to
text from F2008 standard.

2015-10-16  Steven G. Kargl  

PR fortran/67987
* gfortran.dg/char_length_2.f90: Add declaration from PR to testcase.

-- 
Steve
Index: fortran/decl.c
===
--- fortran/decl.c	(revision 228667)
+++ fortran/decl.c	(working copy)
@@ -697,8 +697,7 @@ char_len_param_value (gfc_expr **expr, b
 
   if (gfc_match_char (':') == MATCH_YES)
 {
-  if (!gfc_notify_std (GFC_STD_F2003, "deferred type "
-			   "parameter at %C"))
+  if (!gfc_notify_std (GFC_STD_F2003, "deferred type parameter at %C"))
 	return MATCH_ERROR;
 
   *deferred = true;
@@ -708,11 +707,13 @@ char_len_param_value (gfc_expr **expr, b
 
   m = gfc_match_expr (expr);
 
-  if (m == MATCH_YES
-  && !gfc_expr_check_typed (*expr, gfc_current_ns, false))
+  if (m == MATCH_NO || m == MATCH_ERROR)
+return m;
+
+  if (!gfc_expr_check_typed (*expr, gfc_current_ns, false))
 return MATCH_ERROR;
 
-  if (m == MATCH_YES && (*expr)->expr_type == EXPR_FUNCTION)
+  if ((*expr)->expr_type == EXPR_FUNCTION)
 {
   if ((*expr)->value.function.actual
 	  && (*expr)->value.function.actual->expr->symtree)
@@ -731,6 +732,15 @@ char_len_param_value (gfc_expr **expr, b
 	}
 	}
 }
+
+  /* F2008, 4.4.3.1:  The length is a type parameter; its kind is processor
+ dependent and its value is greater than or equal to zero.
+ F2008, 4.4.3.2:  If the character length parameter value evaluates to
+ a negative value, the length of character entities declared is zero.  */
+  if ((*expr)->expr_type == EXPR_CONSTANT
+  && mpz_cmp_si ((*expr)->value.integer, 0) < 0)
+mpz_set_si ((*expr)->value.integer, 0);
+
   return m;
 
 syntax:
Index: fortran/resolve.c
===
--- fortran/resolve.c	(revision 228667)
+++ fortran/resolve.c	(working copy)
@@ -4562,8 +4562,7 @@ gfc_resolve_substring_charlen (gfc_expr 
 {
   if (e->ts.u.cl->length)
 	gfc_free_expr (e->ts.u.cl->length);
-  else if (e->expr_type == EXPR_VARIABLE
-		 && e->symtree->n.sym->attr.dummy)
+  else if (e->expr_type == EXPR_VARIABLE && e->symtree->n.sym->attr.dummy)
 	return;
 }
 
@@ -4596,12 +4595,19 @@ gfc_resolve_substring_charlen (gfc_expr 
   return;
 }
 
-  /* Length = (end - start +1).  */
+  /* Length = (end - start + 1).  */
   e->ts.u.cl->length = gfc_subtract (end, start);
   e->ts.u.cl->length = gfc_add (e->ts.u.cl->length,
 gfc_get_int_expr (gfc_default_integer_kind,
 		  NULL, 1));
 
+  /* F2008, 6.4.1:  Both the starting point and the ending point shall
+ be within the range 1, 2, ..., n unless the starting point exceeds
+ the ending point, in which case the substring has length zero.  */
+
+  if (mpz_cmp_si (e->ts.u.cl->length->value.integer, 0) < 0)
+mpz_set_si (e->ts.u.cl->length->value.integer, 0);
+
   e->ts.u.cl->length->ts.type = BT_INTEGER;
   e->ts.u.cl->length->ts.kind = gfc_charlen_int_kind;
 
@@ -10882,18 +10888,11 @@ resolve_charlen (gfc_charlen *cl)
 	}
 }
 
-  /* "If the character length parameter value evaluates to a negative
- value, the length of character entities declared is zero."  */
+  /* F2008, 4.4.3.2:  If the character length parameter value evaluates to
+ a negative value, the length of character entities declared is zero.  */
   if (cl->length && !gfc_extract_int (cl->length, &i) && i < 0)
-{
-  if (warn_surprising)
-	gfc_warning_now (OPT_Wsurprising,
-			 "CHARACTER variable at %L has negative length %d,"
-			 " the length has been set to zero",
-			 &cl->length->where, i);
-  gfc_replace_expr (cl->length,
+gfc_replace_expr (cl->length,
 			gfc_get_int_expr (gfc_default_integer_kind, NULL, 0));
-}
 
   /* Check that the character length is not too large.  */
   k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
Index: testsuite/gfortran.dg/char_length_2.f90
===
--- 

Re: [PATCH] c/67882 - improve -Warray-bounds for invalid offsetof

2015-10-16 Thread Martin Sebor

On 10/16/2015 06:27 AM, Bernd Schmidt wrote:
> On 10/09/2015 04:55 AM, Martin Sebor wrote:
>> GCC attempts to diagnose invalid offsetof expressions whose member
>> designator is an array element with an out-of-bounds index. The
>> logic in the function that does this detection is incomplete, leading
>> to false negatives. Since the result of the expression in these cases
>> can be surprising, this patch tightens up the logic to diagnose more
>> such cases.


Thank you for the review. Attached is an updated patch that hopefully
addresses all your comments. I ran the check_GNU_style.sh script on
it to make sure I didn't miss something.  I've also added replies to
a few of your comments below.



> In the future, please explain more clearly in the patch submission what
> the false negatives are. That'll make the reviewer's job easier.


The false negatives are at a high level explained in the bug:

  GCC fails to diagnose cases of invalid offsetof expressions whose
  member designator refers to an array element past the end of the
  array plus one.

The example there is intended to illustrate the general problem.
Beyond that, the test included in the patch shows other examples.
With an unpatched GCC, the latest one shows failures on the
following lines:

  128, 129, 134, 141, 142, 148, 149, 155, 156, 157, 158, 159, 163,
  164, 167, 168, 169, 170, 171, 181, 182, 183, 204, 205, 206, and
  207.

Hopefully that along with the comments in the code makes the problem
clear enough but I'd be happy to add more of either if that helps.
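
For reference, a minimal sketch in C of the valid side of the boundary.
The struct and function names are made up for the example and are not
taken from the test in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct used only for this illustration.  */
struct A
{
  char c;
  int a[4];
  int b[2][3];
};

/* In bounds: a[2] is a real element of the array.  */
size_t
offset_of_a2 (void)
{
  return offsetof (struct A, a[2]);
}

/* In bounds in both dimensions: b[1][2] is the last element of b.  */
size_t
offset_of_b12 (void)
{
  return offsetof (struct A, b[1][2]);
}

/* By contrast, designators such as a[5] (past the element just past the
   end) or b[0][3] (an index exceeding a bound of b) are the kind of
   expressions that, per the description in this thread, the patch aims
   to diagnose.  They are deliberately not evaluated here.  */
```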


>> Tested by bootstrapping and running c/c++ tests on x86_64 with no
>> regressions.
>
> Should run the full testsuite (standard practice, and library tests
> might have occurrences of offsetof).

Yes, that is what I meant by running c/c++ tests (i.e., I configured
gcc with --enable-languages=c,c++, bootstrapped it, and ran make
check).


>> +  /* Index is considered valid when it's either less than
>> + ...
>
> I admit to having trouble parsing this comment. Can you write that in a
> clearer way somehow? I'm still trying to make my mind up whether the
> logic in this patch could be simplified.


I've reworded and expanded the comment in the updated patch. Please
let me know if it's still unclear.


> So I checked and it looks like we accept flexible array member syntax
> like "int a[][2];", which suggests that the test might have the right
> idea, but has the indices swapped (the first one is the flexible one)?
> Ccing Joseph for a ruling.


I believe the test is in line with Joseph's expectation. I added
a few more test cases to cover the constructs he referred to in
his response (IIUC).

Martin
gcc/ChangeLog
2015-10-16  Martin Sebor  

	PR c++-common/67882
	* c-family/c-common.h (struct offsetof_ctx_t): Declare.
	(fold_offsetof_1): Add argument.
	* c-family/c-common.c (struct offsetof_ctx_t): Define.
	(fold_offsetof_1): Diagnose more invalid offsetof expressions
	that reference elements past the end of an array.

testsuite/ChangeLog
2015-10-16  Martin Sebor  

	PR c++-common/67882
	* c-c++-common/builtin-offsetof-2.c: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b922bf..c313a78 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10535,13 +10535,38 @@ c_common_to_target_charset (HOST_WIDE_INT c)
 }

 /* Fold an offsetof-like expression.  EXPR is a nested sequence of component
-   references with an INDIRECT_REF of a constant at the bottom; much like the
-   traditional rendering of offsetof as a macro.  Return the folded result.  */
+   references with an INDIRECT_REF of a constant at the bottom; much like
+   the traditional rendering of offsetof as a macro.  Return the folded
+   result.  PCTX, which is initially null, is intended only for internal
+   use by the function.  It is set by the first recursive invocation of
+   the function to refer to a local object describing the potentially
+   out-of-bounds index of the array member whose offset is being computed,
+   and to indicate whether all indices to the same array object have
+   the highest valid value.  The function issues a warning for out-of-
+   bounds array indices that either refer to elements past the one just
+   past the end of the array object or that exceed any of the major
+   bounds.  */
+
+struct offsetof_ctx_t
+{
+  /* The possibly invalid array index or NULL_TREE.  */
+  tree index;
+  /* Clear when no index to the (possibly multi-dimensional) array
+ is known to have the same value as the corresponding upper bound
+ minus one.  Negative when unknown/don't care, positive otherwise.  */
+  int max_index;
+};

 tree
-fold_offsetof_1 (tree expr)
+fold_offsetof_1 (tree expr, offsetof_ctx_t *pctx /* = 0 */)
 {
   tree base, off, t;
+  offsetof_ctx_t ctx = { NULL_TREE, -1 };
+
+  /* Set the context pointer to point to the local context object
+ to use by subsequent recursive calls.  */
+  if (!pctx)
+pctx = &ctx;

   switch (TREE_CODE (expr))
 

Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-16 Thread Richard Biener
On October 16, 2015 5:55:08 PM GMT+02:00, Eric Botcazou wrote:
>> I wasn't aware that x86/IA-64 is still broken.  I am flying to NY tomorrow
>> but will try to take a look. The ICEs are not caused by operand_equal_p
>> changes, but the change to useless_type_conversion to ignore mode on
>> aggregate types.
>
>Sure, but I'd like to avoid hiding new problems against preexisting ICEs.
>
>> A safe way would be to add the mode check back (as was in my original patch)
>> that does not change my original intent to separate CANONICAL_TYPE from
>> gimple semantic type equivalence machinery. It was however outcome of the
>> discussion that we would prefer the mode to be ignored in this case, which
>> means fixing expansion side.
>
>What do we gain by doing this?  Pretending that the mode doesn't matter is a
>lie at the RTL level and I don't see why GIMPLE would have to care.

Well, it would (I think) ICE on assigning a packed variant to a non-packed
variant of a struct that happens to get a non-BLKmode when not packed.

Richard.

>> I have no way to reproduce the IA-64 change, but will send proposed patch -
>> from backtrace it was clear where the wrong mode went in.  Will wait with
>> operand_equal_p changes until this is fixed.
>
>Thanks.  I have installed 2 testcases that exhibit 2 distinct ICEs on x86-64,
>pack21.adb at -O0 and pack22.adb at -O1 (similar to the IA-64 one).
>
>
>   PR middle-end/67966
>   * gnat.dg/pack21.adb: New test.
>   * gnat.dg/pack22.adb: Likewise.
>   * gnat.dg/pack22_pkg.ad[sb]: New helper.




[PATCH] Remove some force_gimple_operand usage

2015-10-16 Thread Richard Biener

This removes the trivial cases from gimple-fold.c.  Recursing from there
to the gimplifier was always a problem.  Now, the remaining calls are
difficult to replace (as many other places in the compiler) as helpers
such as c_strlen may return arbitrary GENERIC expressions.

Well, at least a bit closer to removing force_gimple_operand and 
friends...

Bootstrapped & tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-10-16  Richard Biener  

* gimple-fold.c (gimple_fold_builtin_memory_op): Use gimple_build
and get rid of force_gimple_operand_gsi.
(gimple_fold_builtin_memory_chk): Likewise.
(gimple_fold_builtin_stxcpy_chk): Likewise.
(rewrite_to_defined_overflow): Likewise.
(gimple_convert_to_ptrofftype): New function.
* gimple-fold.h (gimple_convert_to_ptrofftype): New overload,
declare.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 228811)
+++ gcc/gimple-fold.c   (working copy)
@@ -1045,16 +1045,20 @@ gimple_fold_builtin_memory_op (gimple_st
 }
 
 done:
+  gimple_seq stmts = NULL;
   if (endp == 0 || endp == 3)
 len = NULL_TREE;
   else if (endp == 2)
-len = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (len), len,
-  ssize_int (1));
+len = gimple_build (&stmts, loc, MINUS_EXPR, TREE_TYPE (len), len,
+   ssize_int (1));
   if (endp == 2 || endp == 1)
-dest = fold_build_pointer_plus_loc (loc, dest, len);
+{
+  len = gimple_convert_to_ptrofftype (&stmts, loc, len);
+  dest = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+  TREE_TYPE (dest), dest, len);
+}
 
-  dest = force_gimple_operand_gsi (gsi, dest, false, NULL_TREE, true,
-  GSI_SAME_STMT);
+  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
   gimple *repl = gimple_build_assign (lhs, dest);
   gsi_replace (gsi, repl, false);
   return true;
@@ -1708,10 +1710,10 @@ gimple_fold_builtin_memory_chk (gimple_s
}
   else
{
- tree temp = fold_build_pointer_plus_loc (loc, dest, len);
- temp = force_gimple_operand_gsi (gsi, temp,
-  false, NULL_TREE, true,
-  GSI_SAME_STMT);
+ gimple_seq stmts = NULL;
+ len = gimple_convert_to_ptrofftype (&stmts, loc, len);
+ tree temp = gimple_build (&stmts, loc, POINTER_PLUS_EXPR, dest, len);
+ gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
  replace_call_with_value (gsi, temp);
  return true;
}
@@ -1844,11 +1846,11 @@ gimple_fold_builtin_stxcpy_chk (gimple_s
  if (!fn)
return false;
 
- len = fold_convert_loc (loc, size_type_node, len);
- len = size_binop_loc (loc, PLUS_EXPR, len,
-   build_int_cst (size_type_node, 1));
- len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE,
- true, GSI_SAME_STMT);
+ gimple_seq stmts = NULL;
+ len = gimple_convert (&stmts, loc, size_type_node, len);
+ len = gimple_build (&stmts, loc, PLUS_EXPR, size_type_node, len,
+ build_int_cst (size_type_node, 1));
+ gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
  gimple *repl = gimple_build_call (fn, 4, dest, src, len, size);
  replace_call_with_call_and_fold (gsi, repl);
  return true;
@@ -5940,12 +5941,9 @@ rewrite_to_defined_overflow (gimple *stm
   gimple_seq stmts = NULL;
   for (unsigned i = 1; i < gimple_num_ops (stmt); ++i)
 {
-  gimple_seq stmts2 = NULL;
-  gimple_set_op (stmt, i,
-force_gimple_operand (fold_convert (type,
-gimple_op (stmt, i)),
-  &stmts2, true, NULL_TREE));
-  gimple_seq_add_seq (&stmts, stmts2);
+  tree op = gimple_op (stmt, i);
+  op = gimple_convert (&stmts, type, op);
+  gimple_set_op (stmt, i, op);
 }
   gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
@@ -6154,6 +6152,20 @@ gimple_convert (gimple_seq *seq, locatio
   return gimple_build (seq, loc, NOP_EXPR, type, op);
 }
 
+/* Build the conversion (ptrofftype) OP with a result of a type
+   compatible with ptrofftype with location LOC if such conversion
+   is necessary in GIMPLE, simplifying it first.
+   Returns the built expression value and appends
+   statements possibly defining it to SEQ.  */
+
+tree
+gimple_convert_to_ptrofftype (gimple_seq *seq, location_t loc, tree op)
+{
+  if (ptrofftype_p (TREE_TYPE (op)))
+return op;
+  return gimple_convert (seq, loc, sizetype, op);
+}
+
 /* Return true if the result of assignment STMT is known to be non-negative.

Re: [patch] Minor adjustment to gimplify_addr_expr

2015-10-16 Thread Eric Botcazou
> Btw, would be really nice to have libbacktrace support for ada ...

OK, I'll keep that in mind.

> While the patch looks technically ok I think you'll run into the same issue
> with a non-zero offset MEM_REF as that will get you a POINTER_PLUS_EXPR
> from build_fold_addr_expr.  We might be lucky not to ICE in
> recompute_tree_invariant_for_addr_expr because we can access operand
> zero of that of course.  I think recompute_tree_invariant_for_addr_expr
> misses an assert that it receives an ADDR_EXPR and the gimplify.c
> caller would need to handle POINTER_PLUS_EXPR specially.
> 
> Or change your patch to also handle non-zero offset MEM_REFs by
> simply gimplifying to POINTER_PLUS_EXPR op0, op1.

I couldn't cover the new case though, because you need a record with variable 
size and an array of those yields a non-constant offset so no MEM_REF and a 
record with fixed offset doesn't yield a MEM_REF either for some reason...

But I can add the assert in recompute_tree_invariant_for_addr_expr:

Index: tree.c
===
--- tree.c  (revision 228794)
+++ tree.c  (working copy)
@@ -4248,6 +4248,8 @@ recompute_tree_invariant_for_addr_expr (
   tree node;
   bool tc = true, se = false;
 
+  gcc_assert (TREE_CODE (t) == ADDR_EXPR);
+
   /* We started out assuming this address is both invariant and constant, but
does not have side effects.  Now go down any handled components and see if
any of them involve offsets that are either non-constant or non-invariant.

-- 
Eric Botcazou


Re: Do not use TYPE_CANONICAL in useless_type_conversion

2015-10-16 Thread Jan Hubicka
> Jan Hubicka  writes:
> 
> > Does the patch in https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00902.html 
> > help?
> 
> No, it doesn't.
> 
Andreas,
I am sorry for getting to this so late. I hoped that the alternative patch by
Alexandre would fix this.  I still don't know how to reproduce this without an
IA-64 box, so I am attaching a patch that I think should fix it.  Does the
attached patch work?

Thank you,
Jan

Index: expr.c
===
--- expr.c  (revision 228851)
+++ expr.c  (working copy)
@@ -6669,9 +6669,16 @@ store_field (rtx target, HOST_WIDE_INT b
 GET_MODE_BITSIZE (GET_MODE (temp)) - bitsize,
 NULL_RTX, 1);
 
+  /* We allow move between structures of same size but different mode.
+If source is in memory and the mode differs, simply change the memory.  */
+  if (GET_MODE (temp) == BLKmode && mode != BLKmode)
+   { 
+ gcc_assert (MEM_P (temp));
+ temp = adjust_address_nv (temp, mode, 0);
+   }
   /* Unless MODE is VOIDmode or BLKmode, convert TEMP to MODE.  */
-  if (mode != VOIDmode && mode != BLKmode
- && mode != TYPE_MODE (TREE_TYPE (exp)))
+  else if (mode != VOIDmode && mode != BLKmode
+  && mode != TYPE_MODE (TREE_TYPE (exp)))
temp = convert_modes (mode, TYPE_MODE (TREE_TYPE (exp)), temp, 1);
 
   /* If TEMP is not a PARALLEL (see below) and its mode and that of TARGET


Re: [PATCH] Fix pr67963

2015-10-16 Thread Uros Bizjak
On Thu, Oct 15, 2015 at 9:30 PM, Uros Bizjak  wrote:

>>>> Do we support -O2 -march=lakemont with
>>>>
>>>> __attribute__((target("arch=silvermont")))
>>>
>>> Hm, no.
>>>
>>
>> Do we issue an error or silently ignore
>> __attribute__((target("arch=silvermont")))?
>> If we don't support it, should we support
>>
>> -O2 -march=silvermont
>>
>> __attribute__((target("arch=lakemont")))
>
> Actually, we have to re-initialize:
>
>   opts->x_target_flags
> |= (TARGET_DEFAULT | TARGET_SUBTARGET_DEFAULT) & 
> ~opts_set->x_target_flags;
>
> just before TARGET_SUBTARGET{32,64}_DEFAULT processing, and it will work.

No, this won't work. The value of MASK_NO_FANCY_MATH depends on the
MASK_80387 setting, and once the fancy math bit is set, it can't be
cleared for march != lakemont.

It looks just like we want to error out when lakemont is enabled with -m80387.

Uros.
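
For readers following the thread, the re-initialization expression
quoted above can be sketched in isolation.  This is a minimal sketch in
C with hypothetical option bits and a hypothetical helper name; it only
mirrors the shape of the expression, not the real i386 option handling.

```c
#include <assert.h>

/* Hypothetical option bits, for illustration only.  */
#define OPT_FPU   (1u << 0)
#define OPT_ALIGN (1u << 1)

/* Add every default bit that the user did not set explicitly, mirroring
     flags |= defaults & ~explicitly_set;
   Bits already present in FLAGS are never cleared.  */
unsigned
apply_defaults (unsigned flags, unsigned explicitly_set, unsigned defaults)
{
  return flags | (defaults & ~explicitly_set);
}
```

Because the update is a bitwise OR, it can only add bits.  That fits the
objection above: once a bit has been set, re-applying defaults this way
cannot clear it again.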


[PATCH] Remove build_addr dependence on function context

2015-10-16 Thread Richard Biener

This removes the now unnecessary setting of current_function_decl
around build_fold_addr_expr in build_addr and thus the context
function argument of build_addr.  It was formerly necessary
because recompute_tree_invariant_for_addr_expr was using
current_function_decl which it now no longer does.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-10-16  Richard Biener  

* tree-nested.h (build_addr): Adjust prototype.
* tree-nested.c (build_addr): Remove context argument and use
mark_addressable.
(get_static_chain): Adjust calls to build_addr.
(convert_nl_goto_reference): Likewise.
(convert_tramp_reference_op): Likewise.
(finalize_nesting_tree_1): Likewise.
* value-prof.c (gimple_ic): Likewise.
* gimple-low.c (lower_builtin_setjmp): Likewise.
* tree-parloops.c (take_address_of): Likewise.
(create_call_for_reduction_1): Likewise.
* tree-profile.c (gimple_gen_interval_profiler): Likewise.
(gimple_gen_ic_func_profiler): Likewise.

fortran/
* trans-intrinsic.c (gfc_conv_intrinsic_lib_function): Adjust
calls to build_addr.
(gfc_conv_intrinsic_mod): Likewise.
(gfc_conv_intrinsic_ctime): Likewise.
(gfc_conv_intrinsic_fdate): Likewise.
(gfc_conv_intrinsic_ttynam): Likewise.
(gfc_conv_intrinsic_minmax_char): Likewise.
(gfc_conv_intrinsic_index_scan_verify): Likewise.
(gfc_conv_intrinsic_trim): Likewise.

Index: gcc/value-prof.c
===
--- gcc/value-prof.c(revision 228861)
+++ gcc/value-prof.c(working copy)
@@ -1376,8 +1376,7 @@ gimple_ic (gcall *icall_stmt, struct cgr
   load_stmt = gimple_build_assign (tmp0, tmp);
   gsi_insert_before (, load_stmt, GSI_SAME_STMT);
 
-  tmp = fold_convert (optype, build_addr (direct_call->decl,
- current_function_decl));
+  tmp = fold_convert (optype, build_addr (direct_call->decl));
   load_stmt = gimple_build_assign (tmp1, tmp);
   gsi_insert_before (, load_stmt, GSI_SAME_STMT);
 
Index: gcc/gimple-low.c
===
--- gcc/gimple-low.c(revision 228861)
+++ gcc/gimple-low.c(working copy)
@@ -751,7 +751,7 @@ lower_builtin_setjmp (gimple_stmt_iterat
   dest = gimple_call_lhs (stmt);
 
   /* Build '__builtin_setjmp_setup (BUF, NEXT_LABEL)' and insert.  */
-  arg = build_addr (next_label, current_function_decl);
+  arg = build_addr (next_label);
   t = builtin_decl_implicit (BUILT_IN_SETJMP_SETUP);
   g = gimple_build_call (t, 2, gimple_call_arg (stmt, 0), arg);
   gimple_set_location (g, loc);
@@ -776,7 +776,7 @@ lower_builtin_setjmp (gimple_stmt_iterat
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
 
   /* Build '__builtin_setjmp_receiver (NEXT_LABEL)' and insert.  */
-  arg = build_addr (next_label, current_function_decl);
+  arg = build_addr (next_label);
   t = builtin_decl_implicit (BUILT_IN_SETJMP_RECEIVER);
   g = gimple_build_call (t, 1, arg);
   gimple_set_location (g, loc);
Index: gcc/tree-parloops.c
===
--- gcc/tree-parloops.c (revision 228861)
+++ gcc/tree-parloops.c (working copy)
@@ -540,7 +540,7 @@ take_address_of (tree obj, tree type, ed
   if (gsi == NULL)
 return build_fold_addr_expr_with_type (obj, type);
 
-  name = force_gimple_operand (build_addr (obj, current_function_decl),
+  name = force_gimple_operand (build_addr (obj),
   , true, NULL_TREE);
   if (!gimple_seq_empty_p (stmts))
 gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
@@ -1094,7 +1094,7 @@ create_call_for_reduction_1 (reduction_i
   load_struct = build_simple_mem_ref (clsn_data->load);
   t = build3 (COMPONENT_REF, type, load_struct, reduc->field, NULL_TREE);
 
-  addr = build_addr (t, current_function_decl);
+  addr = build_addr (t);
 
   /* Create phi node.  */
   bb = clsn_data->load_bb;
Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c   (revision 228861)
+++ gcc/fortran/trans-intrinsic.c   (working copy)
@@ -873,7 +873,7 @@ gfc_conv_intrinsic_lib_function (gfc_se
   fndecl = gfc_get_intrinsic_lib_fndecl (m, expr);
   rettype = TREE_TYPE (TREE_TYPE (fndecl));
 
-  fndecl = build_addr (fndecl, current_function_decl);
+  fndecl = build_addr (fndecl);
   se->expr = build_call_array_loc (input_location, rettype, fndecl, num_args, 
args);
 }
 
@@ -2294,7 +2294,7 @@ gfc_conv_intrinsic_mod (gfc_se * se, gfc
   /* The builtin should always be available.  */
   gcc_assert (fmod != NULL_TREE);
 
-  tmp = build_addr (fmod, current_function_decl);
+  tmp = build_addr (fmod);
   se->expr = build_call_array_loc (input_location,
   TREE_TYPE (TREE_TYPE (fmod)),
 

[testsuite] Fix potential race conditions in gfortran tests

2015-10-16 Thread Christophe Lyon
Hi,

We have noticed a few random failures in gfortran tests in our validations.

Maxim investigated some of them and noticed a possible race condition
in the streamio tests, for which he'll post a patch.

I looked for other similar cases (checking which files are unlinked
several times during 'make check'), and noticed that a few other cases
might read/write/delete the same file concurrently.

The proposed fix is to use different names for the different testcases.
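
The idea, sketched in C rather than Fortran: derive the scratch file
name from the testcase name so that tests running in parallel never
share a file.  The helper name below is hypothetical; the patch itself
simply hard-codes a distinct name in each test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>_<testcase>" in BUF.  Return 0 on success, or -1 if the
   result does not fit in SIZE bytes.  */
int
unique_test_filename (char *buf, size_t size, const char *base,
		      const char *testcase)
{
  int n = snprintf (buf, size, "%s_%s", base, testcase);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With base "foobar_file" and testcase "chmod_1" this produces
"foobar_file_chmod_1", the name used in the chmod_1.f90 hunk of the
patch.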

I ran make check -jX on various ARM/AArch64 configurations with no regression.

OK?

Christophe.
2015-10-16  Christophe Lyon  

* gfortran.dg/chmod_1.f90: Add suffix to the temporary filename to
make it unique per testcase.
* gfortran.dg/chmod_2.f90: Likewise.
* gfortran.dg/chmod_3.f90: Likewise.
* gfortran.dg/direct_io_8.f90: Likewise.
* gfortran.dg/f2003_inquire_1.f03: Likewise.
* gfortran.dg/f2003_io_1.f03: Likewise.
* gfortran.dg/f2003_io_2.f03: Likewise.
* gfortran.dg/f2003_io_8.f03: Likewise.
* gfortran.dg/inquire_size.f90: Likewise.
* gfortran.dg/namelist_66.f90: Likewise.
* gfortran.dg/namelist_82.f90: Likewise.
* gfortran.dg/namelist_87.f90: Likewise.
* gfortran.dg/open_negative_unit_1.f90: Likewise.
* gfortran.dg/open_new.f90: Likewise.
* gfortran.dg/stat_1.f90: Likewise.
* gfortran.dg/stat_2.f90: Likewise.
* gfortran.dg/streamio_15.f90: Likewise.
* gfortran.dg/unf_read_corrupted_1.f90: Likewise.
diff --git a/gcc/testsuite/gfortran.dg/chmod_1.f90 b/gcc/testsuite/gfortran.dg/chmod_1.f90
index 07760cf..452b333 100644
--- a/gcc/testsuite/gfortran.dg/chmod_1.f90
+++ b/gcc/testsuite/gfortran.dg/chmod_1.f90
@@ -2,7 +2,7 @@
 ! { dg-options "-std=gnu" }
 ! See PR38956.  Test fails on cygwin when user has Administrator rights
   implicit none
-  character(len=*), parameter :: n = "foobar_file"
+  character(len=*), parameter :: n = "foobar_file_chmod_1"
   integer :: i
 
   open (10,file=n)
diff --git a/gcc/testsuite/gfortran.dg/chmod_2.f90 b/gcc/testsuite/gfortran.dg/chmod_2.f90
index 3e5ed61..68ff17d 100644
--- a/gcc/testsuite/gfortran.dg/chmod_2.f90
+++ b/gcc/testsuite/gfortran.dg/chmod_2.f90
@@ -2,7 +2,7 @@
 ! { dg-options "-std=gnu" }
 ! See PR38956.  Test fails on cygwin when user has Administrator rights
   implicit none
-  character(len=*), parameter :: n = "foobar_file"
+  character(len=*), parameter :: n = "foobar_file_chmod_2"
   integer :: i
 
   open (10,file=n)
diff --git a/gcc/testsuite/gfortran.dg/chmod_3.f90 b/gcc/testsuite/gfortran.dg/chmod_3.f90
index 9e92eca..5df7528 100644
--- a/gcc/testsuite/gfortran.dg/chmod_3.f90
+++ b/gcc/testsuite/gfortran.dg/chmod_3.f90
@@ -2,7 +2,7 @@
 ! { dg-options "-std=gnu -fdefault-integer-8" }
 ! See PR38956.  Test fails on cygwin when user has Administrator rights
   implicit none
-  character(len=*), parameter :: n = "foobar_file"
+  character(len=*), parameter :: n = "foobar_file_chmod_3"
   integer :: i
 
   open (10,file=n)
diff --git a/gcc/testsuite/gfortran.dg/direct_io_8.f90 b/gcc/testsuite/gfortran.dg/direct_io_8.f90
index 5e384a1..87a8a6b 100644
--- a/gcc/testsuite/gfortran.dg/direct_io_8.f90
+++ b/gcc/testsuite/gfortran.dg/direct_io_8.f90
@@ -7,7 +7,7 @@ program main
   i=44
   ir = -42
 
-  open(11,file="foo.dat")
+  open(11,file="foo_direct_io_8.dat")
   ! Try a direct access read on a formatted sequential rile
   READ (11, REC = I, ERR = 99) TEMP_CHANGES
   call abort
diff --git a/gcc/testsuite/gfortran.dg/f2003_inquire_1.f03 b/gcc/testsuite/gfortran.dg/f2003_inquire_1.f03
index 544a810..87ddf59 100644
--- a/gcc/testsuite/gfortran.dg/f2003_inquire_1.f03
+++ b/gcc/testsuite/gfortran.dg/f2003_inquire_1.f03
@@ -4,7 +4,7 @@ character(25) :: sround, ssign, sasynchronous, sdecimal, 
sencoding
 integer :: vsize, vid
 logical :: vpending
 
-open(10, file='mydata', asynchronous="yes", blank="null", &
+open(10, file='mydata_f2003_inquire_1', asynchronous="yes", blank="null", &
 & decimal="comma", encoding="utf-8", sign="plus")
 
 inquire(unit=10, round=sround, sign=ssign, size=vsize, id=vid, &
diff --git a/gcc/testsuite/gfortran.dg/f2003_io_1.f03 b/gcc/testsuite/gfortran.dg/f2003_io_1.f03
index f1d67c5..2a2294f 100644
--- a/gcc/testsuite/gfortran.dg/f2003_io_1.f03
+++ b/gcc/testsuite/gfortran.dg/f2003_io_1.f03
@@ -8,7 +8,7 @@ character(25) :: msg
 
 a = 23.45
 b = 0.0
-open(10, file='mydata', asynchronous="yes", blank="null")
+open(10, file='mydata_f2003_io_1', asynchronous="yes", blank="null")
 
 write(10,'(10f8.3)', asynchronous="yes", decimal="comma", id=j) a
 rewind(10)
diff --git a/gcc/testsuite/gfortran.dg/f2003_io_2.f03 b/gcc/testsuite/gfortran.dg/f2003_io_2.f03
index 54c0516..599eb5b 100644
--- a/gcc/testsuite/gfortran.dg/f2003_io_2.f03
+++ b/gcc/testsuite/gfortran.dg/f2003_io_2.f03
@@ -7,7 +7,7 @@ character(25) :: msg
 real, dimension(10) :: a, b
 
 a = 43.21
-open(10, file='mydata', 

Re: [PATCH] Fix pr67963

2015-10-16 Thread H.J. Lu
On Fri, Oct 16, 2015 at 2:35 AM, Uros Bizjak  wrote:
> On Fri, Oct 16, 2015 at 8:43 AM, Uros Bizjak  wrote:
>> On Thu, Oct 15, 2015 at 9:30 PM, Uros Bizjak  wrote:
>>
>>>>>> Do we support -O2 -march=lakemont with
>>>>>>
>>>>>> __attribute__((target("arch=silvermont")))
>>>>>
>>>>> Hm, no.
>>>>>
>>>>
>>>> Do we issue an error or silently ignore
>>>> __attribute__((target("arch=silvermont")))?
>>>> If we don't support it, should we support
>>>>
>>>> -O2 -march=silvermont
>>>>
>>>> __attribute__((target("arch=lakemont")))
>>>
>>> Actually, we have to re-initialize:
>>>
>>>   opts->x_target_flags
>>> |= (TARGET_DEFAULT | TARGET_SUBTARGET_DEFAULT) & 
>>> ~opts_set->x_target_flags;
>>>
>>> just before TARGET_SUBTARGET{32,64}_DEFAULT processing, and it will work.
>>
>> No, this won't work. The value of MASK_NO_FANCY_MATH depends on the
>> MASK_80387 setting, and once the fancy math bit is set, it can't be
>> cleared for march != lakemont.
>>
>> It looks just like we want to error out when lakemont is enabled with 
>> -m80387.
>
> Like in the attached patch, that also slightly improves existing error
> reporting.
>

We should use a bit instead of checking PROCESSOR_LAKEMONT
so that we don't need to check another PROCESSOR_XXX for
a new IA MCU processor.

Thanks.


-- 
H.J.


[Ada] Minimize the save/restore of Ghost_Mode

2015-10-16 Thread Arnaud Charlet
This patch minimizes the stack-like handling of global variable Ghost_Mode when
processing Ghost code. The patch addresses references to Ghost entities within
the expanded code for pragma Contract_Cases.
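
The "stack-like handling" mentioned above is the familiar save, set,
process, restore pattern around a global.  A minimal sketch in C rather
than Ada, with hypothetical names throughout (this is not the GNAT
front end's code, and the mode values are placeholders):

```c
#include <assert.h>

/* Hypothetical mode values.  */
enum ghost_mode { GHOST_NONE, GHOST_CHECK, GHOST_IGNORE };

/* The global that processing code consults.  */
enum ghost_mode current_ghost_mode = GHOST_NONE;

/* Records the mode seen by the callback below.  */
enum ghost_mode observed_ghost_mode = GHOST_NONE;

/* Example callback: note which mode was active while it ran.  */
void
record_mode (void)
{
  observed_ghost_mode = current_ghost_mode;
}

/* Run PROCESS with the global set to MODE, then put the saved value
   back so that the caller's mode is unaffected.  */
void
process_in_ghost_mode (enum ghost_mode mode, void (*process) (void))
{
  enum ghost_mode saved = current_ghost_mode;
  current_ghost_mode = mode;
  process ();
  current_ghost_mode = saved;
}
```

Each routine that does this pays for a save and a restore, which is why
the ChangeLog below drops the pattern wherever it is not needed.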

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Hristian Kirtchev  

* exp_ch3.adb (Expand_N_Full_Type_Declaration): Do not capture,
set and restore the Ghost mode.
(Expand_N_Object_Declaration): Do not capture, set and restore the
Ghost mode.
(Freeze_Type): Redo the capture and restore of the Ghost mode.
(Restore_Globals): Removed.
* exp_ch5.adb (Expand_N_Assignment_Statement): Redo the capture
and restore of the Ghost mode.
(Restore_Globals): Removed.
* exp_ch6.adb (Expand_N_Procedure_Call_Statement):
Redo the capture and restore of the Ghost mode.
(Expand_N_Subprogram_Body): Redo the capture, set and restore
of the Ghost mode.
(Expand_N_Subprogram_Declaration): Do not
capture, set and restore the Ghost mode.
(Restore_Globals): Removed.
* exp_ch7.adb (Expand_N_Package_Body): Redo the capture, set
and restore of the Ghost mode.
(Expand_N_Package_Declaration): Do not capture, set and restore the
Ghost mode.
* exp_ch8.adb (Expand_N_Exception_Renaming_Declaration):
Redo the capture and restore of the Ghost mode.
(Expand_N_Object_Renaming_Declaration): Redo
the capture and restore of the Ghost mode.
(Expand_N_Package_Renaming_Declaration):
Redo the capture and restore of the Ghost mode.
(Expand_N_Subprogram_Renaming_Declaration): Redo the capture
and restore of the Ghost mode.
* exp_ch11.adb Remove with and use clauses for Ghost.
(Expand_N_Exception_Declaration): Do not capture, set and restore
the Ghost mode.
* exp_disp.adb (Make_DT): Redo the capture and restore of the
Ghost mode.
(Restore_Globals): Removed.
* exp_prag.adb (Expand_Pragma_Check): Do not capture, set
and restore the Ghost mode.
(Expand_Pragma_Contract_Cases):
Redo the capture and restore of the Ghost mode.  Preserve the
original context of contract cases by setting / resetting the
In_Assertion_Expr counter.
(Expand_Pragma_Initial_Condition):
Redo the capture and restore of the Ghost mode.
(Expand_Pragma_Loop_Variant): Redo the capture and restore of
the Ghost mode.
(Restore_Globals): Removed.
* exp_util.adb (Make_Predicate_Call): Redo the capture and
restore of the Ghost mode.
(Restore_Globals): Removed.
* freeze.adb (Freeze_Entity): Redo the capture and restore of
the Ghost mode.
(Restore_Globals): Removed.
* ghost.adb (Check_Ghost_Context): Remove the RM reference from
the error message.
(Is_OK_Statement): Account for statements
that appear in assertion expressions.
(Is_Subject_To_Ghost):
Moved from spec.
* ghost.ads (Is_Subject_To_Ghost): Moved to body.
* rtsfind.ads (Load_RTU): Redo the capture and restore of the
Ghost mode.
* sem.adb Add with and use clauses for Ghost.
(Analyze): Redo
the capture and restore of the Ghost mode. Set the Ghost mode
when analyzing a declaration.
(Do_Analyze): Redo the capture
and restore of the Ghost mode.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Do not capture, set
and restore the Ghost mode.
(Analyze_Incomplete_Type_Decl):
Do not capture, set and restore the Ghost mode.
(Analyze_Number_Declaration): Do not capture, set and restore the
Ghost mode.
(Analyze_Object_Declaration): Do not capture, set and
restore the Ghost mode.
(Analyze_Private_Extension_Declaration):
Do not capture, set and restore the Ghost mode.
(Analyze_Subtype_Declaration): Do not capture, set and restore
the Ghost mode.
(Restore_Globals): Removed.
* sem_ch5.adb (Analyze_Assignment): Redo the capture and restore
of the Ghost mode.
(Restore_Globals): Removed.
* sem_ch6.adb (Analyze_Abstract_Subprogram_Declaration):
Do not capture, set and restore the Ghost mode.
(Analyze_Procedure_Call): Redo the capture and restore of the
Ghost mode.
(Analyze_Subprogram_Body_Helper): Redo the capture
and restore of the Ghost mode.
(Analyze_Subprogram_Declaration):
Do not capture, set and restore the Ghost mode.
(Restore_Globals): Removed.
* sem_ch7.adb (Analyze_Package_Body_Helper): Redo the capture and
restore of the Ghost mode.
(Analyze_Package_Declaration):
Do not capture, set and restore the Ghost mode.
(Analyze_Private_Type_Declaration): Do not capture, set and

Re: refactoring TARGET_PTRMEMFUNC_VBIT_LOCATION checks

2015-10-16 Thread Bernd Schmidt

On 10/16/2015 12:48 PM, Christian Bruel wrote:

I'm not sure. at each point of the macro, we have the current alignment
== FUNCTION_BOUNDARY, because we are just returning from the sequence
build_lang_decl/make_node

so it looks like

   DECL_ALIGN (fn) = MAX (MINIMUM_METHOD_BOUNDARY, DECL_ALIGN (fn))

would be redundant with just

   DECL_ALIGN (fn) = MINIMUM_METHOD_BOUNDARY

did I miss something ?


No, I managed to confuse myself. Patch is ok to install next Wednesday 
unless you hear otherwise from a C++ (or Java) maintainer.



Bernd


Re: Remove undefined behaviour from builtins-20.c

2015-10-16 Thread Richard Biener
On Thu, Oct 15, 2015 at 3:18 PM, Richard Sandiford
 wrote:
> builtins-20.c had:
>
>   if (cos((y*=2, -fabs(tan(x/-y)))) != cos((y*=2,tan(x/y))))
> link_error ();
>
> which is undefined behaviour.  The test expected that y had the same
> value in x/y and x/-y, but gimplification actually implements the
> "obvious" interpretation, multiplying y by 2, using it for one cos call,
> then multiplying it by 2 again and using it for the other cos call.
>
> The file has other (valid) tests that side-effects don't block
> optimisation, such as:
>
>   if (cosf((y*=3, -x)) != cosf((y*=3,x)))
> link_error ();
>
> so this patch simply removes this instance.
>
> Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
> OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/testsuite/
> * gcc.dg/builtins-20.c: Remove undefined behavior.
>
> diff --git a/gcc/testsuite/gcc.dg/builtins-20.c 
> b/gcc/testsuite/gcc.dg/builtins-20.c
> index 43aa71b..2b63428 100644
> --- a/gcc/testsuite/gcc.dg/builtins-20.c
> +++ b/gcc/testsuite/gcc.dg/builtins-20.c
> @@ -122,7 +122,7 @@ void test2(double x, double y)
>if (cos((y*=3, -x)) != cos((y*=3,x)))
>  link_error ();
>
> -  if (cos((y*=2, -fabs(tan(x/-y)))) != cos((y*=2,tan(x/y))))
> +  if (cos(-fabs(tan(x/-y))) != cos(tan(x/y)))
>  link_error ();
>
>if (cos(copysign(x,y)) != cos(x))
> @@ -350,7 +350,7 @@ void test2f(float x, float y)
>if (cosf((y*=3, -x)) != cosf((y*=3,x)))
>  link_error ();
>
> -  if (cosf((y*=2, -fabsf(tanf(x/-y)))) != cosf((y*=2,tanf(x/y))))
> +  if (cosf(-fabsf(tanf(x/-y))) != cosf(tanf(x/y)))
>  link_error ();
>
>if (cosf(copysignf(x,y)) != cosf(x))
> @@ -577,7 +577,7 @@ void test2l(long double x, long double y)
>if (cosl((y*=3, -x)) != cosl((y*=3,x)))
>  link_error ();
>
> -  if (cosl((y*=2, -fabsl(tanl(x/-y)))) != cosl((y*=2,tanl(x/y))))
> +  if (cosl(-fabsl(tanl(x/-y))) != cosl(tanl(x/y)))
>  link_error ();
>
>if (cosl(copysignl(x,y)) != cosl(x))
>
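
The sequencing point made in this thread can be sketched in isolation. The helper below is hypothetical (it is not part of builtins-20.c) and avoids the math library on purpose: it only spells out the "obvious" interpretation described above, where y is doubled before each use, so the two divisions see different values of y.

```c
#include <assert.h>

/* Hypothetical illustration only: the two quotients that the removed test
   implicitly compared, with the updates of y made explicit statements.  */
struct quotients
{
  double negated;   /* x / -y, computed after the first doubling of y.  */
  double plain;     /* x / y, computed after the second doubling of y.  */
};

static struct quotients
eval_in_obvious_order (double x, double y)
{
  struct quotients q;

  assert (y != 0.0);

  y *= 2;               /* First side effect, fully sequenced.  */
  q.negated = x / -y;

  y *= 2;               /* Second side effect, fully sequenced.  */
  q.plain = x / y;

  return q;
}
```

With x = 8 and y = 1 the quotients are -4 and 2, so they no longer differ only in sign, and the identity the test relied on stops applying. Writing the updates as separate statements, as here, removes the dependence on evaluation order.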


Re: Drop CONSTRUCTOR comparsion from ipa-icf-gimple

2015-10-16 Thread Richard Biener
On Fri, Oct 16, 2015 at 5:12 AM, Jan Hubicka  wrote:
> Hi,
> as Richard noticed in my port of the code to operand_equal_p, the checking of
> CONSTRUCTOR in ipa-icf-gimple is incomplete: it is missing the index checks.
> It is also unnecessary since non-empty ctors do not happen as gimple
> operands.  This patch thus removes the unnecessary code.

Err - they do happen, for vector constructors.  Just empty constructors
are not allowed for vector constructors - vector constructors are required
to have elements in proper order and none left out.

Sorry for misleading you.

> Bootstrapped/regtested x86_64-linux, committed.

this will definitely ICE ...

Richard.

> Honza
>
> * ipa-icf-gimple.c (func_checker::compare_operand): Compare only
> empty constructors.
> Index: ipa-icf-gimple.c
> ===
> --- ipa-icf-gimple.c(revision 228851)
> +++ ipa-icf-gimple.c(working copy)
> @@ -415,20 +415,9 @@ func_checker::compare_operand (tree t1,
>switch (TREE_CODE (t1))
>  {
>  case CONSTRUCTOR:
> -  {
> -   unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (t1));
> -   unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (t2));
> -
> -   if (length1 != length2)
> - return return_false ();
> -
> -   for (unsigned i = 0; i < length1; i++)
> - if (!compare_operand (CONSTRUCTOR_ELT (t1, i)->value,
> -   CONSTRUCTOR_ELT (t2, i)->value))
> -   return return_false();
> -
> -   return true;
> -  }
> +  gcc_assert (!vec_safe_length (CONSTRUCTOR_ELTS (t1))
> + && !vec_safe_length (CONSTRUCTOR_ELTS (t2)));
> +  return true;
>  case ARRAY_REF:
>  case ARRAY_RANGE_REF:
>/* First argument is the array, second is the index.  */
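
For readers who do not have the removed loop in mind, here is a minimal standalone sketch of the same shape of comparison: a length check followed by an element-wise value check. The struct and function names are hypothetical stand-ins, not GCC's tree API. Like the removed code, the sketch compares values only and ignores indices, which is the incompleteness mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a constructor: a length plus element values.  */
struct ctor
{
  size_t length;
  const int *values;
};

/* Return true if both constructors have the same length and the same
   value at every position.  Indices are deliberately not modelled.  */
static bool
ctor_values_equal (const struct ctor *a, const struct ctor *b)
{
  assert (a != NULL && b != NULL);

  if (a->length != b->length)
    return false;

  for (size_t i = 0; i < a->length; i++)
    if (a->values[i] != b->values[i])
      return false;

  return true;
}
```

Two empty constructors compare equal under this sketch, and two non-empty ones are compared element by element, which is the case the replacement assert ruled out.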


[gomp4,committed] Handle oacc region in oacc routine

2015-10-16 Thread Tom de Vries

Hi,

this patch checks for occurrence of oacc offload regions in oacc routines 
(which means nested parallelism, which is currently not supported) and 
gives an appropriate error message.


Committed to gomp-4_0-branch.

Thanks,
- Tom
Handle oacc region in oacc routine

2015-10-16  Tom de Vries  

	* omp-low.c (check_omp_nesting_restrictions): Check for oacc region in
	oacc routine.

	* c-c++-common/goacc/parallel-in-routine.c: New test.
---
 gcc/omp-low.c  | 9 +
 gcc/testsuite/c-c++-common/goacc/parallel-in-routine.c | 8 
 2 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/parallel-in-routine.c

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f27bde7..f7e4afc 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -3204,6 +3204,15 @@ check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
 	}
   break;
 case GIMPLE_OMP_TARGET:
+  if (is_gimple_omp_offloaded (stmt)
+	  && get_oacc_fn_attrib (cfun->decl) != NULL)
+	{
+	  error_at (gimple_location (stmt),
+		"OpenACC region inside of OpenACC routine, nested "
+		"parallelism not supported yet");
+	  return false;
+	}
+
   for (; ctx != NULL; ctx = ctx->outer)
 	{
 	  if (gimple_code (ctx->stmt) != GIMPLE_OMP_TARGET)
diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-in-routine.c b/gcc/testsuite/c-c++-common/goacc/parallel-in-routine.c
new file mode 100644
index 000..b93d63b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/parallel-in-routine.c
@@ -0,0 +1,8 @@
+#pragma acc routine
+void
+foo (void)
+{
+#pragma acc parallel /* { dg-error "OpenACC region inside of OpenACC routine, nested parallelism not supported yet" } */
+  ;
+}
+
-- 
1.9.1



Re: OpenACC (gomp-4_0-branch) patch review (was: Merge from gomp-4_1-branch to trunk)

2015-10-16 Thread Jakub Jelinek
On Fri, Oct 16, 2015 at 11:44:17AM +0200, Thomas Schwinge wrote:
> But: working on getting our changes into trunk, for example, when we make
> an effort to extract from gomp-4_0-branch self-contained, individual
> patches, but it then takes weeks to get commit approval or review
> comments, I don't see how that's going to work for the thousands of lines
> patches that we're still to submit?  I mean, GCC development stage 1 is
> going to end in just a few weeks, I suppose?

Yes, see Richi's mail from today, there are 3 weeks plus weekend left in
stage 1.  I'm sorry for the delays in patch review, and will try to arrange
an hour or two daily for OpenACC/OpenMP/HSA patch review for the remainder of
stage1 (and still work on the missing OpenMP 4.5 features or fixes during
the rest of the days).  The PTX changes Nathan can review himself, or Bernd.
And I certainly appreciate if Bernd reviews some of the OMP bits from time
to time, after all, all global reviewers can do that.

> also, allow for "somewhat non-perfect" changes to be committed, and later
> address the "warts"?  (Allowing for incremental progress, while keeping

I'd prefer not to do that though, that will just mean the bad code will stay
around forever.

In any case, if you have unreviewed OpenACC/OpenMP/libgomp patches ready to
be merged into trunk, please ping them.

Jakub


Re: [patch] Minor adjustment to gimplify_addr_expr

2015-10-16 Thread Eric Botcazou
> Sure, if that works it's pre-approved.  Your original patch is also ok
> (though I still
> think it's incomplete - but we'll wait until a testcase comes up with
> the assert).

It passed a bootstrap/regtest cycle on x86-64/Linux so I have installed it.


2015-10-16  Eric Botcazou  

* tree.c (recompute_tree_invariant_for_addr_expr): Assert that the
argument is an ADDR_EXPR.

-- 
Eric Botcazou

Index: tree.c
===
--- tree.c	(revision 228794)
+++ tree.c	(working copy)
@@ -4248,6 +4248,8 @@ recompute_tree_invariant_for_addr_expr (
   tree node;
   bool tc = true, se = false;
 
+  gcc_assert (TREE_CODE (t) == ADDR_EXPR);
+
   /* We started out assuming this address is both invariant and constant, but
  does not have side effects.  Now go down any handled components and see if
  any of them involve offsets that are either non-constant or non-invariant.


Re: Drop CONSTRUCTOR comparsion from ipa-icf-gimple

2015-10-16 Thread Jan Hubicka
> On Fri, Oct 16, 2015 at 1:46 AM, Richard Biener
>  wrote:
> > On Fri, Oct 16, 2015 at 5:12 AM, Jan Hubicka  wrote:
> >> Hi,
> >> as Richard noticed in my port of the code to operand_equal_p, the checking
> >> of CONSTRUCTOR in ipa-icf-gimple is incomplete: it is missing the index
> >> checks.  It is also unnecessary since non-empty ctors do not happen as
> >> gimple operands.  This patch thus removes the unnecessary code.
> >
> > Err - they do happen, for vector constructors.  Just empty constructors
> > are not allowed for vector constructors - vector constructors are required
> > to have elements in proper order and none left out.
> >
> > Sorry for misleading you.
> >
> >> Bootstrapped/regtested x86_64-linux, committed.
> >
> > this will definitely ICE ...
> >
> 
> And it did on x86:
> 
> https://gcc.gnu.org/ml/gcc-regression/2015-10/msg00166.html
> 
I am going to commit the following revert, which also adds a generic testcase,
as soon as testing converges.

Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 228865)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2015-10-11  Jan Hubicka  
+
+   * gcc.c-torture/compile/icfmatch.c: Add testcase
+
 2015-10-16  Paolo Carlini  
 
PR c++/67926
Index: testsuite/gcc.c-torture/compile/icfmatch.c
===
--- testsuite/gcc.c-torture/compile/icfmatch.c  (revision 0)
+++ testsuite/gcc.c-torture/compile/icfmatch.c  (revision 0)
@@ -0,0 +1,11 @@
+typedef char __attribute__ ((vector_size (4))) v4qi;
+void retv (int a,int b,int c,int d, v4qi *ret)
+{
+  v4qi v = { a, b , c, d };
+  *ret = v;
+}
+void retv2 (int a,int b,int c,int d, v4qi *ret)
+{
+  v4qi v = { a, b , c, d };
+  *ret = v;
+}
Index: ChangeLog
===
--- ChangeLog   (revision 228867)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2015-10-11  Jan Hubicka  
+
+   Revert:
+   * ipa-icf-gimple.c (func_checker::compare_operand): Compare only
+   empty constructors.
+
 2015-10-16  Richard Biener  
 
* gimple-fold.c (gimple_fold_builtin_memory_op): Use gimple_build


Re: [PR67383][ARM][4.9]Backport of "Allow any register for DImode values in Thumb2"

2015-10-16 Thread Ramana Radhakrishnan
On Thu, Oct 15, 2015 at 03:01:24PM +0100, Renlin Li wrote:
> Hi all,
> 
> This is a backport patch to loosen restrictions on core registers
> for DImode values in Thumb2.
> 
> It fixes PR67383. In this particular case, reload tries to spill a
> hard register, and use next register together as a pair to reload a
> DImode pseudo register. However, the spilled register number is
> odd. This is rejected by arm_hard_regno_mode_ok(). There is no other
> register available, so the compiler throws an ICE.

I was not convinced enough by the reasoning provided in the description
because this patch was intended to be a bit of an optimization
rather than a correctness fix.

The command line implies we remove r7 (frame pointer in Thumb2 - historical 
accident, -fno-omit-frame-pointer), r9 (-ffixed-r9), r10 (-mpic-register) which
leaves us with:

* r0, r1
* r2, r3
* r4, r5

as the only free registers available for DImode values for the whole 
compilation.
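
The counting above can be reproduced with a small model. This is only a sketch under stated assumptions: a DImode value is taken to need an even-numbered register plus its successor, only r0-r11 are considered, and live ranges are ignored. The function names are hypothetical and this is not how arm_hard_regno_mode_ok() is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if REGNO appears in the RESERVED list.  */
static bool
is_reserved (int regno, const int *reserved, size_t n_reserved)
{
  for (size_t i = 0; i < n_reserved; i++)
    if (reserved[i] == regno)
      return true;
  return false;
}

/* Count even/odd pairs (r0,r1) ... (r10,r11) in which neither register
   is reserved.  Simplified model, see the assumptions above.  */
static int
count_free_pairs (const int *reserved, size_t n_reserved)
{
  int count = 0;

  assert (reserved != NULL || n_reserved == 0);

  for (int regno = 0; regno <= 10; regno += 2)
    if (!is_reserved (regno, reserved, n_reserved)
        && !is_reserved (regno + 1, reserved, n_reserved))
      count++;

  return count;
}
```

Reserving r7, r9 and r10 leaves the three pairs listed above. Reserving r11 instead of r7, as in ARM state, leaves a fourth pair (r6, r7).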

We then have r0, r1 and r2 live across the insn which means that there are no 
free registers to handle DImode values
under the constraints provided unless LRA / reload can spill the argument 
registers which it doesn't seem to be able to do
in this particular testcase. Vlad, is that correct ?

Then I wondered why the same problem did not occur in ARM state, given that it
has the same restriction.  In ARM state life is a bit better because the frame
pointer is r11, which means you pretty much have r6 and r7 available as well in
addition to the above list, so theoretically you can get away with this in ARM
state.

Can you do some more comparison with ARM state as to why we don't have the same 
issue there ?


> 
> The test case in PR67383 is too big, so I didn't include it as part
> of the patch.

I've put up a reduced testcase on the bz, the one I was using to debug.

> arm-none-eabi regression test Okay without any new issues. Okay to
> backport to 4.9?

For changes of this nature, please also bootstrap and regression test in both
ARM and Thumb2 state.

regards
Ramana


Re: [patch] Minor adjustment to gimplify_addr_expr

2015-10-16 Thread Richard Biener
On Fri, Oct 16, 2015 at 9:48 AM, Eric Botcazou  wrote:
>> Btw, would be really nice to have libbacktrace support for ada ...
>
> OK, I'll keep that in mind.
>
>> While the patch looks technically ok I think you'll run into the same issue
>> with a non-zero offset MEM_REF as that will get you a POINTER_PLUS_EXPR
>> from build_fold_addr_expr.  We might be lucky not to ICE in
>> recompute_tree_invariant_for_addr_expr because we can access operand
>> zero of that of course.  I think recompute_tree_invariant_for_addr_expr
>> misses an assert that it receives an ADDR_EXPR and the gimplify.c
>> caller would need to handle POINTER_PLUS_EXPR specially.
>>
>> Or change your patch to also handle non-zero offset MEM_REFs by
>> simply gimplifying to POINTER_PLUS_EXPR op0, op1.
>
> I couldn't cover the new case though, because you need a record with variable
> size and an array of those yields a non-constant offset so no MEM_REF and a
> record with fixed offset doesn't yield a MEM_REF either for some reason...
>
> But I can add the assert in recompute_tree_invariant_for_addr_expr:

Sure, if that works it's pre-approved.  Your original patch is also ok
(though I still
think it's incomplete - but we'll wait until a testcase comes up with
the assert).

Thanks,
Richard.

> Index: tree.c
> ===
> --- tree.c  (revision 228794)
> +++ tree.c  (working copy)
> @@ -4248,6 +4248,8 @@ recompute_tree_invariant_for_addr_expr (
>tree node;
>bool tc = true, se = false;
>
> +  gcc_assert (TREE_CODE (t) == ADDR_EXPR);
> +
>/* We started out assuming this address is both invariant and constant, but
> does not have side effects.  Now go down any handled components and see if
> any of them involve offsets that are either non-constant or non-invariant.
>
> --
> Eric Botcazou


Re: [5/7] Allow gimple debug stmt in widen mode

2015-10-16 Thread Richard Biener
On Thu, Oct 15, 2015 at 7:44 AM, Kugan
 wrote:
>
>
> On 15/09/15 22:57, Richard Biener wrote:
>> On Tue, Sep 8, 2015 at 2:00 AM, Kugan  
>> wrote:
>>>
>>> Thanks for the review.
>>>
>>> On 07/09/15 23:20, Michael Matz wrote:
 Hi,

 On Mon, 7 Sep 2015, Kugan wrote:

> Allow GIMPLE_DEBUG with values in promoted register.

 Patch does much more.

>>>
>>> Oops sorry. Copy and paste mistake.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-09-07 Kugan Vivekanandarajah 
>>>
>>> * cfgexpand.c (expand_debug_locations): Remove assert as now we are
>>> also allowing values in promoted register.
>>> * gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
>>> values in promoted register.
>>> * rtl.h (wi::int_traits ::decompose): Accept zero extended value
>>> also.
>>>
>>>
> gcc/ChangeLog:
>
> 2015-09-07  Kugan Vivekanandarajah  
>
>  * expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
>  SSA_NAME that was set by GIMPLE_CALL and assigned to another
>  SSA_NAME of same type.

 ChangeLog doesn't match patch, and patch contains dubious changes:

> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
> tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
> rtx val;
> rtx_insn *prev_insn, *insn2;
> -   machine_mode mode;
>
> if (value == NULL_TREE)
>   val = NULL_RTX;
> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>
> if (!val)
>   val = gen_rtx_UNKNOWN_VAR_LOC ();
> -   else
> - {
> -   mode = GET_MODE (INSN_VAR_LOCATION (insn));
> -
> -   gcc_assert (mode == GET_MODE (val)
> -   || (GET_MODE (val) == VOIDmode
> -   && (CONST_SCALAR_INT_P (val)
> -   || GET_CODE (val) == CONST_FIXED
> -   || GET_CODE (val) == LABEL_REF)));
> - }
>
> INSN_VAR_LOCATION_LOC (insn) = val;
> prev_insn = PREV_INSN (insn);

 So it seems that the modes of the values location and the value itself
 don't have to match anymore, which seems dubious when considering how a
 debugger should load the value in question from the given location.  So,
 how is it supposed to work?
>>>
>>> For example (simplified test-case from creduce):
>>>
>>> fn1() {
>>>   char a = fn1;
>>>   return a;
>>> }
>>>
>>> --- test.c.142t.veclower21  2015-09-07 23:47:26.362201640 +
>>> +++ test.c.143t.promotion   2015-09-07 23:47:26.362201640 +
>>> @@ -5,13 +5,18 @@
>>>  {
>>>char a;
>>>long int fn1.0_1;
>>> +  unsigned int _2;
>>>int _3;
>>> +  unsigned int _5;
>>> +  char _6;
>>>
>>>:
>>>fn1.0_1 = (long int) fn1;
>>> -  a_2 = (char) fn1.0_1;
>>> -  # DEBUG a => a_2
>>> -  _3 = (int) a_2;
>>> +  _5 = (unsigned int) fn1.0_1;
>>> +  _2 = _5 & 255;
>>> +  # DEBUG a => _2
>>> +  _6 = (char) _2;
>>> +  _3 = (int) _6;
>>>return _3;
>>>
>>>  }
>>>
>>> Please see that DEBUG now points to _2 which is a promoted mode. I am
>>> assuming that the debugger would load required precision from promoted
>>> register. May be I am missing the details but how else we can handle
>>> this? Any suggestions?
>>
>> I would have expected the DEBUG insn to be adjusted as
>>
>> # DEBUG a => (char)_2
>
> Thanks for the review. Please find the attached patch that attempts to
> do this. I have also tested a version of this patch with gdb testsuite.
>
> As Michael wanted, I have also removed the changes in rtl.h and
> promoting constants in GIMPLE_DEBUG.
>
>
>> Btw, why do we have
>>
>>> +  _6 = (char) _2;
>>> +  _3 = (int) _6;
>>
>> ?  I'd have expected
>>
>>  unsigned int _6 = SEXT <_2, 8>
>>  _3 = (int) _6;
>>  return _3;
>
> I am looking into it.
>
>>
>> see my other mail about promotion of PARM_DECLs and RESULT_DECLs -- we should
>> promote those as well.
>>
>
> Just to be sure, are you referring to
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00244.html
> where you wanted an IPA pass to perform this. This is one of my todo items
> after this. Please let me know if what you wanted here is a different issue.

No, that's the same issue.
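
As a side note on the quoted GIMPLE, the difference between the masking form (_2 = _5 & 255) and a sign extension from 8 bits can be sketched in plain C. The helper names are hypothetical, and the sketch only models an 8-bit value held in a 32-bit promoted register. It makes no claim about how the promotion pass or SEXT is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low 8 bits of V, as "_2 = _5 & 255" does.  */
static uint32_t
zext8 (uint32_t v)
{
  return v & 0xffu;
}

/* Sign-extend the low 8 bits of V: flipping bit 7 and then subtracting
   128 maps 0x00..0x7f to 0..127 and 0x80..0xff to -128..-1.  */
static int32_t
sext8 (uint32_t v)
{
  return (int32_t) ((v & 0xffu) ^ 0x80u) - 0x80;
}
```

For a low byte of 0xff the masked value is 255 while the sign-extended value is -1, which is why the two forms are not interchangeable when the narrow type is signed.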

You remove


@@ -5269,16 +5268,6 @@ expand_debug_locations (void)

if (!val)
  val = gen_rtx_UNKNOWN_VAR_LOC ();
-   else
- {
-   mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-   gcc_assert (mode == GET_MODE (val)
-   || (GET_MODE (val) == VOIDmode
-   && (CONST_SCALAR_INT_P (val)
-   || GET_CODE (val) == CONST_FIXED
-   || GET_CODE (val) == LABEL_REF)));
- }

which is in place to ensure the debug insns 

Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
Hi,

 A few more style nits:

> +  builtin_decls_data bdda[] = {

New line before  {

> +{double_type_node, "__builtin_aarch64_rsqrt_df", 
> AARCH64_BUILTIN_RSQRT_DF},

Space after {
Space  before }

> +void aarch64_emit_swrsqrt (rtx, rtx);
> +

> +tree aarch64_builtin_rsqrt (unsigned int fn, bool md_fn);
> +

Drop the formal argument names as you did in the first declaration.

See my previous comment w.r.t the naming of new test cases in
gcc.target/aarch64, at least the following still need s/-/_/

> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-common.h 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-common.h
> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-negative_1.c 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-negative_1.c
> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check_1.c 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check_1.c

> +//   With -ffast-math these return positive INF.
> +//   t_double (-0.0, -inf);
> +//   t_float (-0.0, -inff);
> +
> +//   The reason here is that -ffast-math flushes to zero.
> +//   t_double  (__DBL_MIN__/256, 0X1.00P+515);
> +//   t_float (__FLT_MIN__/256, 0X1.00P+67);

Comment consistently with the rest of the backend ie /* */

Thanks
/Marcus


OpenACC (gomp-4_0-branch) patch review (was: Merge from gomp-4_1-branch to trunk)

2015-10-16 Thread Thomas Schwinge
Hi!

On Tue, 13 Oct 2015 21:12:14 +0200, Jakub Jelinek  wrote:
> I've bootstrapped/regtested on x86_64-linux and i686-linux following
> merge from gomp-4_1-branch to trunk, which brings in most of the OpenMP 4.5
> support for C and C++

With nvptx offloading, I'm seeing the following regressions (on trunk):

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
[-PASS:-]{+FAIL:+} 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

From a quick look, it might have something to do with changes that affect
handling of the OpenACC async clause.  I guess you're not set up for
nvptx offloading testing; I'll try to figure it out.


More generally, "as you've committed first", the burden of merging your
gomp-4_1-branch merge (trunk r228777) with our gomp-4_0-branch changes
will now be upon us.  From a quick assessment, this certainly doesn't
look trivial.  But well, that's how it is; one of us has to swallow the
bitter pill...

But: working on getting our changes into trunk, for example, when we make
an effort to extract from gomp-4_0-branch self-contained, individual
patches, but it then takes weeks to get commit approval or review
comments, I don't see how that's going to work for the thousands of lines
patches that we're still to submit?  I mean, GCC development stage 1 is
going to end in just a few weeks, I suppose?

At the GNU Tools Cauldron 2013 we've repeatedly been asked (by you, by
Jeff Law, and others) a) to tightly integrate our OpenACC/nvptx
offloading changes with the existing OMP code, and b) to do our
development in public so that you have a chance to review our changes
early, so they can then be merged without much effort -- we've tried our
best for a) given our understanding of the existing OMP code, and our
changes have incrementally been committed to gomp-4_0-branch over the
course of a lot of months now, but I'm not convinced that much of that
has actually been reviewed so far?  How are we going to tackle this?

I understand that you're a very busy person, with lots of
responsibilities, and I also understand that often you have a rather
"rigid" idea of how changes should/shouldn't be done (which generally is
a good thing, of course!), but if we're apparently not making much
progress with our merge into trunk -- and, as you know yourself, juggling
with (that is, rebasing on top of trunk changes, and so on) thousands of
lines patches is no fun! -- would it perhaps make sense to officially
appoint somebody else to review OMP changes in addition to you?  And,
also, allow for "somewhat non-perfect" changes to be committed, and later
address the "warts"?  (Allowing for incremental progress, while keeping
GCC test results clean, and all that, of course!)  Lately, Bernd has
stepped up a few times to review OMP patches (many thanks, Bernd!), but
also he sometimes stated that even though a patch appears fine to him,
he'd like "Jakub to have a look".

Please note that my concern here is not to accuse people, but it's to
find a way to improve the review situation/process.

There's still a bit of clean-up and development going on on
gomp-4_0-branch, but what should be the strategy to get it merged into
trunk?  Instead of us extracting and submitting individual changes (which
we certainly can do, but which is a huge time-sink if they're then not
handled quickly), would you maybe prefer to do a "whole-branch" review?


Grüße
 Thomas




Re: OpenACC (gomp-4_0-branch) patch review

2015-10-16 Thread Bernd Schmidt

On 10/16/2015 11:44 AM, Thomas Schwinge wrote:

Lately, Bernd has
stepped up a few times to review OMP patches (many thanks, Bernd!), but
also he sometimes stated that even though a patch appears fine to him,
he'd like "Jakub to have a look".


I'll just comment on this briefly. In general I'll try to review 
anything I think I can figure out and which doesn't have a super-active 
maintainer, but some areas have folks who clearly know the code better. 
The whole gomp area is one of those, and when in doubt, I like to err on 
the side of allowing (in this case) Jakub a chance to take a look.



There's still a bit of clean-up and development going on on
gomp-4_0-branch, but what should be the strategy to get it merged into
trunk?  Instead of us extracting and submitting individual changes (which
we certainly can do, but which is a huge time-sink if they're then not
handled quickly), would you maybe prefer to do a "whole-branch" review?


It might be good to start with a relatively high-level overview of the 
current approach, also documenting which parts of OpenACC the changes 
implement and which ones they don't.



would it perhaps make sense to officially
appoint somebody else to review OMP changes in addition to you?  And,
also, allow for "somewhat non-perfect" changes to be committed, and later
address the "warts"?


Depends on how "somewhat non-perfect" is defined - you might want to 
elaborate. Code should follow the coding standards, add documentation, 
testcases, etc., those are minimum requirements for everyone. In cases 
where something is implemented in a way that has a clearly superior 
alternative, I think it is reasonable to ask for it to be changed (the 
builtin folding fell into this category for me). In terms of the current 
general approach towards implementing OpenACC I don't intend to give you 
a hard time, especially since past patch review from Jakub pointed in 
this direction and it would be unreasonable to second-guess this choice now.


On the bright side I think most of omp-low.c could be described as 
"somewhat non-perfect", so you'd be following existing practice.



Bernd


Re: Drop CONSTRUCTOR comparsion from ipa-icf-gimple

2015-10-16 Thread H.J. Lu
On Fri, Oct 16, 2015 at 1:46 AM, Richard Biener
 wrote:
> On Fri, Oct 16, 2015 at 5:12 AM, Jan Hubicka  wrote:
>> Hi,
>> as Richard noticed in my port of the code to operand_equal_p, the checking of
>> CONSTRUCTOR in ipa-icf-gimple is incomplete: it is missing the index checks.
>> It is also unnecessary since non-empty ctors do not happen as gimple
>> operands.  This patch thus removes the unnecessary code.
>
> Err - they do happen, for vector constructors.  Just empty constructors
> are not allowed for vector constructors - vector constructors are required
> to have elements in proper order and none left out.
>
> Sorry for misleading you.
>
>> Bootstrapped/regtested x86_64-linux, committed.
>
> this will definitely ICE ...
>

And it did on x86:

https://gcc.gnu.org/ml/gcc-regression/2015-10/msg00166.html

FAIL: gcc.dg/pr63914.c (internal compiler error)
FAIL: gcc.dg/pr63914.c (test for excess errors)
FAIL: gcc.target/i386/avx-1.c (internal compiler error)
FAIL: gcc.target/i386/avx-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16sf-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16sf-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16sf-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16sf-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16sf-3.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16sf-3.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16si-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16si-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16si-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16si-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v16si-3.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v16si-3.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8df-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8df-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8df-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8df-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8df-3.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8df-3.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8di-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8di-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8di-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8di-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-set-v8di-3.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-set-v8di-3.c (test for excess errors)
FAIL: gcc.target/i386/sse-13.c (internal compiler error)
FAIL: gcc.target/i386/sse-13.c (test for excess errors)
FAIL: gcc.target/i386/sse-18.c (internal compiler error)
FAIL: gcc.target/i386/sse-18.c (test for excess errors)
FAIL: gcc.target/i386/sse-19.c (internal compiler error)
FAIL: gcc.target/i386/sse-19.c (test for excess errors)
FAIL: gcc.target/i386/sse-23.c (internal compiler error)
FAIL: gcc.target/i386/sse-23.c (test for excess errors)
FAIL: gcc.target/i386/sse-24.c (internal compiler error)
FAIL: gcc.target/i386/sse-24.c (test for excess errors)
FAIL: gcc.target/i386/sse-25.c (internal compiler error)
FAIL: gcc.target/i386/sse-25.c (test for excess errors)
FAIL: gcc.target/i386/vecinit-1.c (internal compiler error)
FAIL: gcc.target/i386/vecinit-1.c (test for excess errors)
FAIL: gcc.target/i386/vecinit-2.c (internal compiler error)
FAIL: gcc.target/i386/vecinit-2.c (test for excess errors)
FAIL: gcc.target/i386/vecinit-5.c (internal compiler error)
FAIL: gcc.target/i386/vecinit-5.c (test for excess errors)
FAIL: gcc.target/i386/vecinit-6.c (internal compiler error)
FAIL: gcc.target/i386/vecinit-6.c (test for excess errors)

-- 
H.J.


GCC 6 Status Report (2015-10-16)

2015-10-16 Thread Richard Biener

Status
======

Trunk which will eventually become GCC 6 is still in Stage 1 but its
end is near and we are planning to transition into Stage 3 starting Nov 9th.

This means it is time to get things you want to have in GCC 6 finalized
and reviewed.  As usual there may be exceptions to late reviewed features
but don't count on that.  Likewise target specific features can sneak in
during Stage 3 if maintainers ok them.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                2   -   2
P2               84   +   4
P3              120   + 111
P4               88   +   4
P5               32   -   3
--------        ---   -----------------------
Total P1-P3     206   + 113
Total           326   + 114


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2015-04/msg00146.html



Re: [gomp4.1] OpenMP 4.1 is dead, long live OpenMP 4.5

2015-10-16 Thread Thomas Schwinge
Hi!

;-) "Bikeshed" discussion, but while waiting for a test run to finish:

On Thu, 15 Oct 2015 13:06:42 +0200, Jakub Jelinek  wrote:
> On Fri, Oct 09, 2015 at 08:26:25PM +0300, Ilya Verbin wrote:
> > On Fri, Oct 09, 2015 at 09:55:07 +0200, Jakub Jelinek wrote:
> > > -GOMP_4.1 {
> > > +GOMP_4.5 {
> > >global:
> > >   GOMP_target_41;
> > >   GOMP_target_data_41;
> > 
> > Should we rename it to GOMP_target*_45, or do you know some more mnemonic 
> > name?

The OpenMP version is already being encoded in the symbol version.  Which
to me appears a bit "arbitrary", but that at least makes sense in that it
documents for which OpenMP version a specific symbol was introduced
first, and the symbol versions form a list in the libgomp.map file that
is easily understood.  But, encoding the OpenMP version also in the
symbol name itself might be more confusing, because to the casual reader
it's not clear whether, for example, GOMP_target_41 applies just to OpenMP
4.1, or OpenMP 4.1 and later, and likewise, whether the generic
GOMP_target applies to OpenMP versions before or after GOMP_target_41.

> Either to 45, or find a better name, sure.  The latter would be preferable.
> Now, for the latter, either we could add something to say those use 16bit
> kinds, or something to indicate they allow both synchronous and asynchronous
> execution (but what word covers both of that), or that they create a target
> task (but that is the case only for GOMP_target and GOMP_target_update (and
> the new GOMP_target_enter_exit_data)).
> So 16bit kinds is the only thing all of them have in common, thus perhaps
> _kind16 suffixes instead of _4{1,5} ?

Given that it seems difficult to express the several kinds of changes,
what about simply using GOMP_target_2 and so on?  Everyone should then
understand that it's a replacement/successor of GOMP_target.


Grüße,
 Thomas




[Ada] Missing inlining of init_proc

2015-10-16 Thread Arnaud Charlet
The compiler may silently skip inlining the init proc of a non-tagged record
type because the frontend internally forgets to process it. This issue
generally does not occur, since as soon as the frontend processes some unit
that has pragma Inline, the internal machinery that takes care of such
processing is enabled (hence the problem reproduces only in small sources).

After this patch the following runs without output (which means that
inlining is now performed in this case).

package Test is
   function F return Natural;

   type No_Tagged_Rec is
      record
         Field1, Field2 : Integer := F;
         Field3, Field4 : Float := Float (F);
      end record;
end Test;

package body Test is
   Counter : Natural := 0;
   function F return Natural is
   begin
      Counter := Counter + 1;
      return Counter;
   end F;
end Test;

with Test;
use  Test;
pragma Elaborate (Test);

package Main is
   Global_Rec : No_Tagged_Rec;
end Main;

Command: gcc -gnatDG -S -O1 -gnat05 -gnatn -gnatp main.ads
 grep -i test__no_tagged_recIP main.s
No output

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Javier Miranda  

* inline.adb (Add_Inlined_Body): Ensure that
Analyze_Inlined_Bodies will be invoked after completing the
analysis of the current unit.

Index: inline.adb
===
--- inline.adb  (revision 228864)
+++ inline.adb  (working copy)
@@ -405,6 +405,11 @@
 Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
 
  begin
+--  Ensure that Analyze_Inlined_Bodies will be invoked after
+--  completing the analysis of the current unit.
+
+Inline_Processing_Required := True;
+
 if Pack = E then
 
--  Library-level inlined function. Add function itself to


Re: [PATCH] Add new hooks ASM_OUTPUT_START_FUNCTION_HEADER ...

2015-10-16 Thread Dominik Vogt
On Mon, Sep 21, 2015 at 12:31:58PM +0100, Dominik Vogt wrote:
> This patch adds two new backend hooks
> ASM_OUTPUT_START_FUNCTION_HEADER and
> ASM_OUTPUT_END_FUNCTION_FOOTER that may be defined to emit
> assembly code at the very start or end of a function.

We no longer need this patch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [Patch, avr] Fix PR 67839 - bit addressable instructions generated for out of range addresses

2015-10-16 Thread Senthil Kumar Selvaraj
Ping!

Regards
Senthil

On Mon, Oct 05, 2015 at 02:30:58PM +0530, Senthil Kumar Selvaraj wrote:
> Hi,
> 
>   As part of support for io and io_low attributes, the upper bound of
>   the range check for low IO and IO addresses was changed from hardcoded
>   values to hardcoded_range_end + 1 - GET_MODE_SIZE(mode).
> 
>   GCC passes VOID as the mode from genrecog, and GET_MODE_SIZE returns
>   0, resulting in the range getting incorrectly extended by a byte.
> 
>   Not sure why it was done, as the mode of the operand shouldn't really
>   matter when computing the upper bound. In any case, the insns that use 
>   the predicate already have a mem:QI wrapping it, and all the bit
>   addressable instructions operate on a single IO register only.
> 
>   This patch reverts the check back to a hardcoded value, and adds a
>   test to prevent regressions.
> 
>   No new regression failures. If ok, could someone commit please? I
>   don't have commit access.
> 
> 
> Regards
> Senthil
> 
> gcc/ChangeLog
> 
> 2015-10-05  Senthil Kumar Selvaraj  
> 
>   PR target/67839
>   * config/avr/predicates.md (low_io_address_operand): Don't
>   consider MODE when computing upper bound.
>   (io_address_operand): Likewise.
> 
> 
> gcc/testsuite/ChangeLog
> 
> 2015-10-05  Senthil Kumar Selvaraj  
> 
>   PR target/67839
>   * gcc.target/avr/pr67839.c: New test.
> 
> 
> 
> diff --git gcc/config/avr/predicates.md gcc/config/avr/predicates.md
> index 2d12bc6..622bc0b 100644
> --- gcc/config/avr/predicates.md
> +++ gcc/config/avr/predicates.md
> @@ -46,7 +46,7 @@
>  (define_special_predicate "low_io_address_operand"
>(ior (and (match_code "const_int")
>   (match_test "IN_RANGE (INTVAL (op) - avr_arch->sfr_offset,
> -0, 0x20 - GET_MODE_SIZE (mode))"))
> +0, 0x1F)"))
> (and (match_code "symbol_ref")
>   (match_test "SYMBOL_REF_FLAGS (op) & SYMBOL_FLAG_IO_LOW"
>  
> @@ -60,7 +60,7 @@
>  (define_special_predicate "io_address_operand"
>(ior (and (match_code "const_int")
>   (match_test "IN_RANGE (INTVAL (op) - avr_arch->sfr_offset,
> -0, 0x40 - GET_MODE_SIZE (mode))"))
> +0, 0x3F)"))
> (and (match_code "symbol_ref")
>   (match_test "SYMBOL_REF_FLAGS (op) & SYMBOL_FLAG_IO"
>  
> diff --git gcc/testsuite/gcc.target/avr/pr67839.c 
> gcc/testsuite/gcc.target/avr/pr67839.c
> new file mode 100644
> index 000..604ab4b
> --- /dev/null
> +++ gcc/testsuite/gcc.target/avr/pr67839.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os" } */
> +/* { dg-final { scan-assembler "sbi 0x1f,0" } } */
> +/* { dg-final { scan-assembler "cbi 0x1f,0" } } */
> +/* { dg-final { scan-assembler-not "sbi 0x20,0" } } */
> +/* { dg-final { scan-assembler-not "cbi 0x20,0" } } */
> +/* { dg-final { scan-assembler "in r\\d+,__SREG__" } } */
> +/* { dg-final { scan-assembler "out __SREG__,r\\d+" } } */
> +/* { dg-final { scan-assembler-not "in r\\d+,0x40" } } */
> +/* { dg-final { scan-assembler-not "out 0x40, r\\d+" } } */
> +
> +/* This testcase verifies that SBI/CBI/SBIS/SBIC
> +   and IN/OUT instructions are not generated for
> +   an IO addresses outside the valid range.
> +*/
> +#define IO_ADDR(x) (*((volatile char *)x + __AVR_SFR_OFFSET__))
> +int main ()
> +{
> +  IO_ADDR(0x1f) |= 1;
> +  IO_ADDR(0x1f) &= 0xFE;
> +
> +  IO_ADDR(0x20) |= 1;
> +  IO_ADDR(0x20) &= 0xFE;
> +
> +  IO_ADDR(0x3f) = IO_ADDR(0x3f);
> +
> +  IO_ADDR(0x40) = IO_ADDR(0x40);
> +  return 0;
> +}
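
The off-by-one described in the quoted message can be reproduced outside the
compiler.  The following is only a minimal model under stated assumptions, not
the AVR back end: in_range, old_low_io_ok and new_low_io_ok are hypothetical
helpers, the mode size is passed as a plain integer (0 standing in for what
GET_MODE_SIZE returns for VOID, as the message reports), and the SFR offset
of 0x20 is an assumed value chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCC's IN_RANGE: true if lo <= v <= hi.  */
static bool
in_range (long v, long lo, long hi)
{
  return v >= lo && v <= hi;
}

/* Old predicate: the upper bound depends on the operand's mode size.  */
static bool
old_low_io_ok (long addr, long sfr_offset, long mode_size)
{
  return in_range (addr - sfr_offset, 0, 0x20 - mode_size);
}

/* New predicate: hardcoded upper bound, independent of the mode.  */
static bool
new_low_io_ok (long addr, long sfr_offset)
{
  return in_range (addr - sfr_offset, 0, 0x1F);
}

/* Self-check covering the boundary addresses 0x1F and 0x20.  */
static void
check_low_io_range (void)
{
  const long sfr_offset = 0x20;  /* Assumed value, illustration only.  */

  /* With a one-byte mode the old bound is correct.  */
  assert (old_low_io_ok (sfr_offset + 0x1F, sfr_offset, 1));
  assert (!old_low_io_ok (sfr_offset + 0x20, sfr_offset, 1));

  /* With a mode size of 0 the old bound admits one address too many.  */
  assert (old_low_io_ok (sfr_offset + 0x20, sfr_offset, 0));

  /* The hardcoded bound rejects 0x20 and still accepts 0x1F.  */
  assert (new_low_io_ok (sfr_offset + 0x1F, sfr_offset));
  assert (!new_low_io_ok (sfr_offset + 0x20, sfr_offset));
}
```

Calling check_low_io_range () from a test driver exercises both bounds; the
same reasoning applies to io_address_operand with 0x40 and 0x3F.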


Re: refactoring TARGET_PTRMEMFUNC_VBIT_LOCATION checks

2015-10-16 Thread Bernd Schmidt

On 10/16/2015 10:01 AM, Christian Bruel wrote:

-
-  if (TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn
-  && DECL_ALIGN (fn) < 2 * BITS_PER_UNIT)
-DECL_ALIGN (fn) = 2 * BITS_PER_UNIT;
-
+  DECL_ALIGN (fn) = MINIMUM_METHOD_BOUNDARY;


This looks like a change in behaviour. You want to use the max of M_M_B 
and the current alignment.
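
A minimal sketch of the difference, assuming MINIMUM_METHOD_BOUNDARY expands
to 2 * BITS_PER_UNIT (its real definition is in the patch under review and is
not shown here); the two helper functions are hypothetical and operate on a
plain integer instead of DECL_ALIGN:

```c
#include <assert.h>

/* Assumed values, for illustration only.  */
#define BITS_PER_UNIT 8u
#define MINIMUM_METHOD_BOUNDARY (2u * BITS_PER_UNIT)

/* What the patch as posted does: overwrite the alignment, which
   lowers it for functions that were already more strictly aligned.  */
static unsigned
overwrite_method_align (unsigned current_align)
{
  assert (current_align > 0);
  return MINIMUM_METHOD_BOUNDARY;
}

/* What the old code did: only ever raise the alignment.  */
static unsigned
raise_method_align (unsigned current_align)
{
  assert (current_align > 0);
  return current_align > MINIMUM_METHOD_BOUNDARY
	 ? current_align : MINIMUM_METHOD_BOUNDARY;
}
```

For a function aligned to 64 bits, overwrite_method_align returns 16 while
raise_method_align keeps 64; for one aligned to 8 bits both return 16.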



Bernd


Re: Move some bit and binary optimizations in simplify and match

2015-10-16 Thread Hurugalawadi, Naveen
Hi,

Thanks very much for your detailed explanation regarding the queries.

>> you are missing the convert? on the lshift now, without it the
>> tree_nop_conversion_p check always evaluates to true.
Done.

>> fold-const.c which handles TRUTH_NOT_EXPR but logical_inverted_value
>> does not handle it.  I suggest to add
Done.

Please find attached the modified patch as per your suggestions.

>> You should move them to match.pd.  It requires duplication to
>> handle the const vs. non-const cases the fold-const.c code handled.

Have modified patterns to handle const and non-const cases.
Will test it and post it once finished with it.

Thanks,
Naveen

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index de45a2c..1e7fbb4 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9803,20 +9803,6 @@ fold_binary_loc (location_t loc,
   goto associate;
 
 case MULT_EXPR:
-  /* (-A) * (-B) -> A * B  */
-  if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1))
-	return fold_build2_loc (loc, MULT_EXPR, type,
-			fold_convert_loc (loc, type,
-	  TREE_OPERAND (arg0, 0)),
-			fold_convert_loc (loc, type,
-	  negate_expr (arg1)));
-  if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0))
-	return fold_build2_loc (loc, MULT_EXPR, type,
-			fold_convert_loc (loc, type,
-	  negate_expr (arg0)),
-			fold_convert_loc (loc, type,
-	  TREE_OPERAND (arg1, 0)));
-
   if (! FLOAT_TYPE_P (type))
 	{
 	  /* Transform x * -C into -x * C if x is easily negatable.  */
@@ -9830,16 +9816,6 @@ fold_binary_loc (location_t loc,
 		  negate_expr (arg0)),
 tem);
 
-	  /* (a * (1 << b)) is (a << b)  */
-	  if (TREE_CODE (arg1) == LSHIFT_EXPR
-	  && integer_onep (TREE_OPERAND (arg1, 0)))
-	return fold_build2_loc (loc, LSHIFT_EXPR, type, op0,
-TREE_OPERAND (arg1, 1));
-	  if (TREE_CODE (arg0) == LSHIFT_EXPR
-	  && integer_onep (TREE_OPERAND (arg0, 0)))
-	return fold_build2_loc (loc, LSHIFT_EXPR, type, op1,
-TREE_OPERAND (arg0, 1));
-
 	  /* (A + A) * C -> A * 2 * C  */
 	  if (TREE_CODE (arg0) == PLUS_EXPR
 	  && TREE_CODE (arg1) == INTEGER_CST
@@ -9882,21 +9858,6 @@ fold_binary_loc (location_t loc,
 	}
   else
 	{
-	  /* Convert (C1/X)*C2 into (C1*C2)/X.  This transformation may change
- the result for floating point types due to rounding so it is applied
- only if -fassociative-math was specify.  */
-	  if (flag_associative_math
-	  && TREE_CODE (arg0) == RDIV_EXPR
-	  && TREE_CODE (arg1) == REAL_CST
-	  && TREE_CODE (TREE_OPERAND (arg0, 0)) == REAL_CST)
-	{
-	  tree tem = const_binop (MULT_EXPR, TREE_OPERAND (arg0, 0),
-  arg1);
-	  if (tem)
-		return fold_build2_loc (loc, RDIV_EXPR, type, tem,
-TREE_OPERAND (arg0, 1));
-	}
-
   /* Strip sign operations from X in X*X, i.e. -Y*-Y -> Y*Y.  */
 	  if (operand_equal_p (arg0, arg1, 0))
 	{
@@ -10053,22 +10014,6 @@ fold_binary_loc (location_t loc,
   goto bit_rotate;
 
 case BIT_AND_EXPR:
-  /* ~X & X, (X == 0) & X, and !X & X are always zero.  */
-  if ((TREE_CODE (arg0) == BIT_NOT_EXPR
-	   || TREE_CODE (arg0) == TRUTH_NOT_EXPR
-	   || (TREE_CODE (arg0) == EQ_EXPR
-	   && integer_zerop (TREE_OPERAND (arg0, 1))))
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg1);
-
-  /* X & ~X , X & (X == 0), and X & !X are always zero.  */
-  if ((TREE_CODE (arg1) == BIT_NOT_EXPR
-	   || TREE_CODE (arg1) == TRUTH_NOT_EXPR
-	   || (TREE_CODE (arg1) == EQ_EXPR
-	   && integer_zerop (TREE_OPERAND (arg1, 1))))
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	return omit_one_operand_loc (loc, type, integer_zero_node, arg0);
-
   /* Fold (X ^ 1) & 1 as (X & 1) == 0.  */
   if (TREE_CODE (arg0) == BIT_XOR_EXPR
 	  && INTEGRAL_TYPE_P (type)
diff --git a/gcc/match.pd b/gcc/match.pd
index f3813d8..1120c59 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -324,6 +324,22 @@ along with GCC; see the file COPYING3.  If not see
 (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)
  (pows @0 @1))
 
+/* Fold (a * (1 << b)) into (a << b)  */
+(simplify
+ (mult:c @0 (convert? (lshift integer_onep@1 @2)))
+  (if (! FLOAT_TYPE_P (type)
+   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
+   (lshift @0 @2)))
+
+/* Fold (C1/X)*C2 into (C1*C2)/X.  */
+(simplify
+ (mult (rdiv:s REAL_CST@0 @1) REAL_CST@2)
+  (if (flag_associative_math)
+  (with
+   { tree tem = const_binop (MULT_EXPR, type, @0, @2); }
+  (if (tem)
+   (rdiv { tem; } @1)))))
+
 /* X % Y is smaller than Y.  */
 (for cmp (lt ge)
  (simplify
@@ -543,6 +559,13 @@ along with GCC; see the file COPYING3.  If not see
 (match negate_expr_p
  VECTOR_CST
  (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
+
+/* (-A) * (-B) -> A * B  */
+(simplify
+ (mult:c (convert1? (negate @0)) (convert2? negate_expr_p@1))
+  (if 
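
The two integer folds moved in the patch above can be sanity-checked with
ordinary C.  This is only a sketch of the arithmetic identities, not of the
match.pd machinery: the function names are hypothetical, and 32-bit unsigned
operands are used so that overflow wraps and the comparison is well defined
(it assumes int is at most 32 bits wide, so no promotion to signed int
happens).

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of "a * (1 << b) -> a << b".  */
static uint32_t
mul_by_shifted_one (uint32_t a, unsigned b)
{
  assert (b < 32);
  return a * (UINT32_C (1) << b);
}

/* Right-hand side of the same fold.  */
static uint32_t
shift_left (uint32_t a, unsigned b)
{
  assert (b < 32);
  return a << b;
}

/* Left-hand side of "(-A) * (-B) -> A * B", using wrapping negation.  */
static uint32_t
mul_negated (uint32_t a, uint32_t b)
{
  return (UINT32_C (0) - a) * (UINT32_C (0) - b);
}

/* Compare both sides of each fold over a small sample of operands.  */
static void
check_fold_identities (void)
{
  uint32_t a, c;
  unsigned b;

  for (a = 0; a < 1000; a += 7)
    for (b = 0; b < 32; b++)
      assert (mul_by_shifted_one (a, b) == shift_left (a, b));

  for (a = 0; a < 200; a += 3)
    for (c = 0; c < 200; c += 5)
      assert (mul_negated (a, c) == a * c);
}
```

The (C1/X)*C2 fold is deliberately left out: as the removed fold-const.c
comment says, it may change floating-point results through rounding, which is
why it is guarded by flag_associative_math.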

Re: [PATCH] Fix pr67963

2015-10-16 Thread Uros Bizjak
On Fri, Oct 16, 2015 at 8:43 AM, Uros Bizjak  wrote:
> On Thu, Oct 15, 2015 at 9:30 PM, Uros Bizjak  wrote:
>
> Do we support -O2 -march=lakemont with
>
> __attribute__((target("arch=silvermont")))

 Hm, no.

>>>
>>> Do we issue an error or silently ignore
>>> __attribute__((target("arch=silvermont")))?
>>> If we don't support it, should we support
>>>
>>> -O2 -march=silvermont
>>>
>>> __attribute__((target("arch=lakemont")))
>>
>> Actually, we have to re-initialize:
>>
>>   opts->x_target_flags
>> |= (TARGET_DEFAULT | TARGET_SUBTARGET_DEFAULT) & 
>> ~opts_set->x_target_flags;
>>
>> just before TARGET_SUBTARGET{32,64}_DEFAULT processing, and it will work.
>
> No, this won't work. The value of MASK_NO_FANCY_MATH depends on the
> MASK_80387 setting, and once the fancy math bit is set, it can't be
> cleared for march != lakemont.
>
> It looks just like we want to error out when lakemont is enabled with -m80387.

Like in the attached patch, that also slightly improves existing error
reporting.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 228863)
+++ config/i386/i386.c  (working copy)
@@ -4949,16 +4949,30 @@ ix86_option_override_internal (bool main_args_p,
 {
   /* Verify that x87/MMX/SSE/AVX is off for -miamcu.  */
   if (TARGET_80387_P (opts->x_target_flags))
-   sorry ("X87 FPU isn%'t supported in Intel MCU psABI");
-  else if ((opts->x_ix86_isa_flags & (OPTION_MASK_ISA_MMX
- | OPTION_MASK_ISA_SSE
- | OPTION_MASK_ISA_AVX)))
-   sorry ("%s isn%'t supported in Intel MCU psABI",
-  TARGET_MMX_P (opts->x_ix86_isa_flags)
-  ? "MMX"
-  : TARGET_SSE_P (opts->x_ix86_isa_flags) ? "SSE" : "AVX");
+   sorry ("X87 FPU is not supported in Intel MCU psABI");
+  if ((opts->x_ix86_isa_flags & (OPTION_MASK_ISA_MMX
+| OPTION_MASK_ISA_SSE
+| OPTION_MASK_ISA_AVX)))
+   sorry ("%s is not supported in Intel MCU psABI",
+  TARGET_AVX_P (opts->x_ix86_isa_flags)
+  ? "AVX"
+  : TARGET_SSE_P (opts->x_ix86_isa_flags) ? "SSE" : "MMX");
 }
 
+  if (ix86_arch == PROCESSOR_LAKEMONT)
+{
+  /* Verify that x87/MMX/SSE/AVX is off for -march=lakemont.  */
+  if (TARGET_80387_P (opts->x_target_flags))
+   error ("X87 FPU is not supported on Lakemont CPU");
+  if ((opts->x_ix86_isa_flags & (OPTION_MASK_ISA_MMX
+| OPTION_MASK_ISA_SSE
+| OPTION_MASK_ISA_AVX)))
+   error ("%s is not supported on Lakemont CPU",
+  TARGET_AVX_P (opts->x_ix86_isa_flags)
+  ? "AVX"
+  : TARGET_SSE_P (opts->x_ix86_isa_flags) ? "SSE" : "MMX");
+}
+
   if (!strcmp (opts->x_ix86_arch_string, "generic"))
 error ("generic CPU can be used only for %stune=%s %s",
   prefix, suffix, sw);
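
Besides turning the "else if" into an independent "if" (so an x87 problem no
longer hides an MMX/SSE/AVX one), the patch reverses the order in which the
offending extension is named.  A small model of that selection, with
hypothetical flag bits standing in for the OPTION_MASK_ISA_* masks:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits; the real masks live in the i386 back end.  */
enum
{
  ISA_MMX = 1 << 0,
  ISA_SSE = 1 << 1,
  ISA_AVX = 1 << 2
};

/* Old order: MMX is tested first, so it wins whenever it is set.  */
static const char *
old_isa_name (unsigned flags)
{
  assert (flags & (ISA_MMX | ISA_SSE | ISA_AVX));
  return (flags & ISA_MMX) ? "MMX" : (flags & ISA_SSE) ? "SSE" : "AVX";
}

/* New order: the most recent extension that is set is reported.  */
static const char *
new_isa_name (unsigned flags)
{
  assert (flags & (ISA_MMX | ISA_SSE | ISA_AVX));
  return (flags & ISA_AVX) ? "AVX" : (flags & ISA_SSE) ? "SSE" : "MMX";
}
```

If enabling a newer extension also leaves the older bits set (an assumption
of this sketch), the old order reports "MMX" for a flag set that contains
AVX, while the new order reports "AVX".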


Re: [PATCH] Fix pr67963

2015-10-16 Thread H.J. Lu
On Fri, Oct 16, 2015 at 3:31 AM, H.J. Lu  wrote:
> On Fri, Oct 16, 2015 at 2:35 AM, Uros Bizjak  wrote:
>> On Fri, Oct 16, 2015 at 8:43 AM, Uros Bizjak  wrote:
>>> On Thu, Oct 15, 2015 at 9:30 PM, Uros Bizjak  wrote:
>>>
>>> Do we support -O2 -march=lakemont with
>>>
>>> __attribute__((target("arch=silvermont")))
>>
>> Hm, no.
>>
>
> Do we issue an error or silently ignore
> __attribute__((target("arch=silvermont")))?
> If we don't support it, should we support
>
> -O2 -march=silvermont
>
> __attribute__((target("arch=lakemont")))

 Actually, we have to re-initialize:

   opts->x_target_flags
 |= (TARGET_DEFAULT | TARGET_SUBTARGET_DEFAULT) & 
 ~opts_set->x_target_flags;

 just before TARGET_SUBTARGET{32,64}_DEFAULT processing, and it will work.
>>>
>>> No, this won't work. The value of MASK_NO_FANCY_MATH depends on the
>>> MASK_80387 setting, and once the fancy math bit is set, it can't be
>>> cleared for march != lakemont.
>>>
>>> It looks just like we want to error out when lakemont is enabled with 
>>> -m80387.
>>
>> Like in the attached patch, that also slightly improves existing error
>> reporting.
>>
>
> We should use a bit instead of checking PROCESSOR_LAKEMONT
> so that we don't need to check another PROCESSOR_XXX for
> a new IA MCU processor.
>

Another related bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67985

We may be able to fix both in a patch.

-- 
H.J.


update in Ada maintainers

2015-10-16 Thread Arnaud Charlet
Robert Dewar is no longer with us unfortunately
(http://www.adacore.com/press/adacore-president-robert-dewar-1945-2015/),
and Geert is no longer working for AdaCore, nor working on GNAT.

I've also updated my email.

Committed on trunk.

* MAINTAINERS: Update list of Ada maintainers and email addresses.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 228864)
+++ MAINTAINERS (working copy)
@@ -148,9 +148,7 @@
 
 C front end/ISO C99     Joseph Myers
 C front end/ISO C99     Richard Henderson
-Ada front end           Geert Bosch
-Ada front end           Robert Dewar
-Ada front end           Arnaud Charlet
+Ada front end           Arnaud Charlet
 Ada front end           Eric Botcazou
 c++                     Jason Merrill
 c++                     Nathan Sidwell


Re: Move some bit and binary optimizations in simplify and match

2015-10-16 Thread Marc Glisse


+(match (logical_inverted_value @0)
+ (truth_not @0))

That's good.

+/* Simplify ~X & X as zero.  */
+(simplify
+ (bit_and:c (convert? truth_valued_p@0) (convert? (logical_inverted_value @0)))
+  { build_zero_cst (type); })

That's not what Richard meant. We already have:

/* X & !X -> 0.  */
(simplify
 (bit_and:c @0 (logical_inverted_value @0))
 { build_zero_cst (type); })

which automatically benefits from your addition to logical_inverted_value 
(we might indeed want to add some convert? there though). But we still 
need what you had in your previous patch:


+/* Simplify ~X & X as zero.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_not @0)))
+  { build_zero_cst (type); })

to simplify the case where X is not a truth value.
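
To make the distinction concrete, here is a small C sketch of the two
identities; it only illustrates the arithmetic, not the patterns themselves,
and the function names are hypothetical.  The bit_not form holds for any X,
while the logical_inverted_value form relies on !X being 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

/* ~X & X: every bit is cleared in one of the two operands.  */
static uint32_t
and_with_bit_not (uint32_t x)
{
  return ~x & x;
}

/* X & !X: !X is 1 only when X is 0, and 0 otherwise.  */
static uint32_t
and_with_logical_not (uint32_t x)
{
  return x & (uint32_t) !x;
}

/* Both expressions are zero for truth values and for arbitrary X.  */
static void
check_and_identities (void)
{
  static const uint32_t samples[] = { 0u, 1u, 2u, 0x55u, 0xFFFFFFFFu };
  unsigned i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (and_with_bit_not (samples[i]) == 0u);
      assert (and_with_logical_not (samples[i]) == 0u);
    }
}
```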

(detail: the indentation looks off for (C1/X)*C2)

--
Marc Glisse


Re: [gomp4.1] depend nowait support for target {update,{enter,exit} data}

2015-10-16 Thread Martin Jambor
Hi,

On Thu, Oct 15, 2015 at 04:01:56PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> CCing various people, because I'd like to have something that won't work on
> XeonPhi only.

thanks.  However, I have not paid too much attention to OMP tasks
yet.  Nevertheless, let me try to answer some of the questions.

> 
> On Fri, Oct 02, 2015 at 10:28:01PM +0300, Ilya Verbin wrote:
> > On Tue, Sep 08, 2015 at 11:20:14 +0200, Jakub Jelinek wrote:
> > > nowait support for #pragma omp target is not implemented yet, supposedly 
> > > we
> > > need to mark those somehow (some flag) already in the struct gomp_task
> > > structure, essentially it will need either 2 or 3 callbacks
> > > (the current one, executed when the dependencies are resolved (it actually
> > > waits until some thread schedules it after that point, I think it is
> > > undesirable to run it with the tasking lock held), which would perform
> > > the gomp_map_vars and initiate the running of the region, and then some
> > > query routine which would poll the plugin whether the task is done or not,
> > > and either perform the finalization (unmap_vars) if it is done (and in any
> > > case return bool whether it should be polled again or not), and if the
> > > finalization is not done there, also another callback for the 
> > > finalization.
> > > Also, there is the issue that if we are waiting for task that needs to be
> > > polled, and we don't have any further tasks to run, we shouldn't really
> > > attempt to sleep on some semaphore (e.g. in taskwait, end of
> > > taskgroup, etc.) or barrier, but rather either need to keep polling it, or
> > > call the query hook with some argument that it should sleep in there until
> > > the work is done by the offloading device.
> > > Also, there needs to be a way for the target nowait first callback to say
> > > that it is using host fallback and thus acts as a normal task, therefore
> > > once the task fn finishes, the task is done.
> > 
> > Here is my WIP patch.  target.c part is obviously incorrect, but it 
> > demonstrates
> > a possible libgomp <-> plugin interface for running a target task function
> > asynchronously and checking whether it is completed or not.
> > (Refactored liboffloadmic/runtime/emulator from trunk is required to run
> > target-tmp.c testcase.)
> 
> The difficulty is designing something that will work (if possible fast) on the
> various devices we want to eventually support (at least XeonPhi, XeonPhi emul,
> PTX/Cuda and HSA), ideally without too much busy waiting.
> 
> The OpenMP 4.5 spec says that there is a special "target task" on the host
> side around the target region, and that the "target task" is mergeable and
> if nowait is not specified is included (otherwise it may be), and that the
> mapping operations (which include target device memory allocation,
> refcount management and mapping data structure updates as well as the
> memory copying to target device) happens only after the (optional) 
> dependencies
> are satisfied.  After the memory mapping operations are done, the offloading
> kernel starts, and when it finishes, the unmapping operations are performed
> (which includes memory copying from the target device, refcount management
> and mapping data structure updates, and finally memory deallocation).
> 
> Right now on the OpenMP side everything is synchronous, e.g. target
> enter/exit data and update are asynchronous only in that the mapping or
> unmapping operation is scheduled as a task, but the whole mapping or
> unmapping operations including all the above mentioned subparts are
> performed while holding the particular device's lock.

Memory mapping and unmapping is a no-op on HSA, so this is fortunately
not a concern for us.  (I'm assuming that ref-counting is also something
device specific and not part of running a task here.)

> Anyway, let's put the asynchronous memory data transfers (which also implies
> the ability to enqueue multiple different target regions into a stream for
> the device to operate on independently from the host) on the side for now
> and just discuss what we want for the actual async execution and for now
> keep a device lock around all the mapping or unmapping operations.
> 
> If the "target task" has unresolved dependencies, then it will use existing
> task.c waiting code first (if the above is resolved somehow, there could be
> exceptions of "target task" depending on another "target task").
> When the dependencies are resolved, we can run the gomp_target_task_fn
> callback (but not with the team's tasking lock held), which can perform
> the gomp_map_vars call and start the async execution.  For host fallback,
> that is all we do, the task is at this point a normal task.
> For offloading task, we now want the host to continue scheduling other tasks
> if there are any, which means (not currently implemented on the task.c side)
> we want to move the task somewhere that we don't consider it finished, and
> that we'll need to schedule it again at some point 

Re: [PATCH 1/2] s/390: Implement "target" attribute.

2015-10-16 Thread Dominik Vogt
On Fri, Sep 25, 2015 at 02:59:41PM +0100, Dominik Vogt wrote:
> The following set of two patches implements the function
> __attribute__ ((target("..."))) and the corresponding #pragma GCC
> target("...") on S/390.  It comes with certain limitations:
> 
>  * It is not possible to change any options that affect the ABI or
>the definition of target macros by using the attribute (vx,
>htm, zarch and others).  Some of them are still supported but
>unable to change the definition of the corresponding target macros.
>In these cases, the pragma has to be used.  One reason for this
>is that it is not possible to change the definition of the target
>macros with the attribute, but the implementation of some features
>relies on them.
> 
>  * Even with the pragma it is not possible to switch between zarch
>    and esa architecture because internal data types would have to be
>    changed at GCC run time.
> 
> The second patch contains a long term change in the interface with
> the assembler.  Currently, the compiler wrapper passes the same
> -march= and -mtune= options to the compiler and the assembler.
> The patch makes this obsolete by emitting ".machine" and
> ".machinemode" directives to the top of the assembly language file.
> The old way is still supported but may be removed once the
> ".machine" feature is supported by all as versions in the field.
> 
> The second patch depends on the first one, and both require the
> (latest) change proposed in this thread:
> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01546.html

This is an updated version of patch 1 that uses existing hooks and
works without any change to common code.  (Second patch is
unchanged).
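
For readers who have not used the feature, a usage sketch of both forms
follows.  It is written so that it also builds on other targets: the
attribute and the pragma are only applied when the S/390 predefined macros
are present.  The option string "arch=z13" and the function names are
assumptions chosen for illustration, not taken from the patch.

```c
#include <assert.h>

/* Apply the attribute only when compiling for S/390; elsewhere the
   macro expands to nothing so the sketch still builds.  */
#if defined (__s390__) || defined (__s390x__)
# define TARGET_Z13 __attribute__ ((target ("arch=z13")))
#else
# define TARGET_Z13
#endif

/* Attribute form: affects this one function only.  */
static TARGET_Z13 int
sum_ints (const int *v, int n)
{
  int total = 0;
  int i;

  assert (n >= 0);
  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Pragma form: affects every function up to pop_options.  */
#if defined (__s390__) || defined (__s390x__)
# pragma GCC push_options
# pragma GCC target ("arch=z13")
#endif

static int
max_int (const int *v, int n)
{
  int best;
  int i;

  assert (n > 0);
  best = v[0];
  for (i = 1; i < n; i++)
    if (v[i] > best)
      best = v[i];
  return best;
}

#if defined (__s390__) || defined (__s390x__)
# pragma GCC pop_options
#endif
```

Per the limitations listed in the quoted message, the pragma form is the one
to use when the target macros have to follow the changed options.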

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.opt (s390_arch_string): Remove.
(s390_tune_string): Likewise.
(s390_cost_pointer): Add Variable.
(s390_arch_specified): Add TargetVariable.
(s390_tune_specified): Likewise.
(s390_tune_flags): Likewise.
(s390_arch_flags): Save option.
(march=): Likewise.
(mbackchain): Likewise.
(mdebug): Likewise.
(mesa): Likewise.
(mhard-dfp): Likewise.
(mhard-float): Likewise.
(mlong-double-128): Likewise.
(mlong-double-64): Likewise.
(mhtm): Likewise.
(mvx): Likewise.
(mpacked-stack): Likewise.
(msmall-exec): Likewise.
(msoft-float): Likewise.
(mstack-guard=): Likewise.
(mstack-size=): Likewise.
(mtune=): Likewise.
(mmvcle): Likewise.
(mzvector): Likewise.
(mzarch): Likewise.
(mbranch-cost=): Likewise.
(mwarn-dynamicstack): Likewise.
(mwarn-framesize=): Likewise.
(mwarn-dynamicstack): Allow mno-warn-dynamicstack.
(mwarn-framesize=): Convert to UInteger (negative values are rejected
now).
* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Split setting
macros changeable through the GCC target pragma into a separate
function.
(s390_cpu_cpp_builtins): Likewise.
(s390_pragma_target_parse): New function, implement GCC target pragma
if enabled.
(s390_register_target_pragmas): Register s390_pragma_target_parse if
available.
* common/config/s390/s390-common.c (s390_handle_option):
Export.
Move setting s390_arch_flags to s390.c.
Remove s390_tune_flags.
Allow 0 as argument to -mstack-size (switch to default value).
Allow 0 as argument to -mstack-guard (switch off).
Remove now unnecessary explicit parsing code for -mwarn-framesize.
* config/s390/s390-protos.h (s390_handle_option): Export.
(s390_valid_target_attribute_tree): Export.
(s390_reset_previous_fndecl): Export.
* config/s390/s390-builtins.def: Use new macro B_GROUP to mark the start
and end of HTM and VX builtins.
(s390_asm_output_function_prefix): Declare hook.
(s390_asm_declare_function_size): Likewise.
* config/s390/s390-builtins.h (B_GROUP): Use macro.
* config/s390/s390-opts.h: Add comment about processor_type usage.
* config/s390/s390.h (TARGET_CPU_IEEE_FLOAT_P): New macro.
(TARGET_CPU_ZARCH_P): Likewise.
(TARGET_CPU_LONG_DISPLACEMENT_P): Likewise
(TARGET_CPU_EXTIMM_P): Likewise
(TARGET_CPU_DFP_P): Likewise
(TARGET_CPU_Z10_P): Likewise
(TARGET_CPU_Z196_P): Likewise
(TARGET_CPU_ZEC12_P): Likewise
(TARGET_CPU_HTM_P): Likewise
(TARGET_CPU_Z13_P): Likewise
(TARGET_CPU_VX_P): Likewise
(TARGET_HARD_FLOAT_P): Likewise
(TARGET_LONG_DISPLACEMENT_P): Likewise
(TARGET_EXTIMM_P): Likewise
(TARGET_DFP_P): Likewise
(TARGET_Z10_P): Likewise
(TARGET_Z196_P): Likewise
(TARGET_ZEC12_P): Likewise
(TARGET_HTM_P): Likewise

Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Oleg Endo
On Thu, 2015-10-15 at 22:03 +, Benedikt Huber wrote:
>  
> +/* Add builtins for reciprocal square root.  */
> +
> +void
> +aarch64_init_builtin_rsqrt (void)
> +{
> +  tree fndecl = NULL;
> +  tree ftype = NULL;
> +
> +  tree V2SF_type_node = build_vector_type (float_type_node, 2);
> +  tree V2DF_type_node = build_vector_type (double_type_node, 2);
> +  tree V4SF_type_node = build_vector_type (float_type_node, 4);
> +
> +  typedef struct
> +  {
> +tree type_node;
> +const char *builtin_name;
> +int function_code;
> +  } builtin_decls_data;

There is an ongoing effort to remove all the unnecessary typedef struct
and enum etc stuff.  Please try not to add more of it.

Cheers,
Oleg



[Ada] Cleanups in inter-unit inlining engine

2015-10-16 Thread Arnaud Charlet
This removes a component in the record attached to every subprogram considered
for inter-unit inlining, which doesn't serve any useful purpose.

In addition, this fixes a small inconsistency in the code driving inter-unit
inlining from the front-end.  The code was at the same time discarding the
subprograms that cannot be inlined outside their unit, typically nested
subprograms, and taking them into account to build the edges of the callgraph.

This could later fool the algorithm computing the transitive closure of the
calls inlined from the main unit because vertices of the callgraph were not
present in the table of inlined subprograms.

No simple testcase.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Eric Botcazou  

* inline.adb (Subp_Info): Remove Listed component.
(Add_Inlined_Subprogram): Take an entity instead of an index.
Do not set Listed component to True.
(New_Entry): Do not initialize Listed component to False.
(Analyze_Inlined_Bodies): Do not test Listed component.
(Must_Inline): Add calls not in the main unit only
if they are in a subprogram that can be inlined outside its unit.
(Add_Inlined_Body): Move test around and add comment.

Index: inline.adb
===
--- inline.adb  (revision 228866)
+++ inline.adb  (working copy)
@@ -158,7 +158,6 @@
   Name: Entity_Id  := Empty;
   Next: Subp_Index := No_Subp;
   First_Succ  : Succ_Index := No_Succ;
-  Listed  : Boolean:= False;
   Main_Call   : Boolean:= False;
   Processed   : Boolean:= False;
end record;
@@ -180,8 +179,8 @@
--  called, and for the inlined subprogram that contains the call. If
--  the call is in the main compilation unit, Caller is Empty.
 
-   procedure Add_Inlined_Subprogram (Index : Subp_Index);
-   --  Add the subprogram to the list of inlined subprogram for the unit
+   procedure Add_Inlined_Subprogram (E : Entity_Id);
+   --  Add subprogram E to the list of inlined subprogram for the unit
 
function Add_Subp (E : Entity_Id) return Subp_Index;
--  Make entry in Inlined table for subprogram E, or return table index
@@ -347,15 +346,19 @@
 return Inline_Package;
  end if;
 
- --  The call is not in the main unit. See if it is in some inlined
- --  subprogram. If so, inline the call and, if the inlining level is
- --  set to 1, stop there; otherwise also compile the package as above.
+ --  The call is not in the main unit. See if it is in some subprogram
+ --  that can be inlined outside its unit. If so, inline the call and,
+ --  if the inlining level is set to 1, stop there; otherwise also
+ --  compile the package as above.
 
  Scop := Current_Scope;
  while Scope (Scop) /= Standard_Standard
and then not Is_Child_Unit (Scop)
  loop
-if Is_Overloadable (Scop) and then Is_Inlined (Scop) then
+if Is_Overloadable (Scop)
+  and then Is_Inlined (Scop)
+  and then not Is_Nested (Scop)
+then
Add_Call (E, Scop);
 
if Inline_Level = 1 then
@@ -378,6 +381,15 @@
begin
   Append_New_Elmt (N, To => Backend_Calls);
 
+  --  Skip subprograms that cannot be inlined outside their unit
+
+  if Is_Abstract_Subprogram (E)
+or else Convention (E) = Convention_Protected
+or else Is_Nested (E)
+  then
+ return;
+  end if;
+
   --  Find unit containing E, and add to list of inlined bodies if needed.
   --  If the body is already present, no need to load any other unit. This
   --  is the case for an initialization procedure, which appears in the
@@ -391,13 +403,6 @@
   --  no enclosing package to retrieve. In this case, it is the body of
   --  the function that will have to be loaded.
 
-  if Is_Abstract_Subprogram (E)
-or else Is_Nested (E)
-or else Convention (E) = Convention_Protected
-  then
- return;
-  end if;
-
   Level := Must_Inline;
 
   if Level /= Dont_Inline then
@@ -475,8 +480,7 @@
-- Add_Inlined_Subprogram --

 
-   procedure Add_Inlined_Subprogram (Index : Subp_Index) is
-  E: constant Entity_Id := Inlined.Table (Index).Name;
+   procedure Add_Inlined_Subprogram (E : Entity_Id) is
   Decl : constant Node_Id   := Parent (Declaration_Node (E));
   Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
 
@@ -538,8 +542,6 @@
   else
  Register_Backend_Not_Inlined_Subprogram (E);
   end if;
-
-  Inlined.Table (Index).Listed := True;
end Add_Inlined_Subprogram;
 

@@ -606,7 +608,6 @@
  Inlined.Table (Inlined.Last).Name:= E;
  Inlined.Table (Inlined.Last).Next:= 

[PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-10-16 Thread Segher Boessenkool
For x86, STC still gives better results for optimise-for-size than
"simple" does.  So use STC at -Os as well.

Is this okay for trunk?


Segher


2015-10-16  Segher Boessenkool  

PR rtl-optimization/67864
* common/config/i386/i386-common.c (ix86_option_optimization_table)
: Use REORDER_BLOCKS_ALGORITHM_STC
at -Os and up.

---
 gcc/common/config/i386/i386-common.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/common/config/i386/i386-common.c 
b/gcc/common/config/i386/i386-common.c
index 79b2472..bb9f29c 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1011,6 +1011,9 @@ static const struct default_options 
ix86_option_optimization_table[] =
 { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
 /* Enable function splitting at -O2 and higher.  */
 { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_and_partition, NULL, 1 },
+/* The STC algorithm produces the smallest code at -Os, for x86.  */
+{ OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
+  REORDER_BLOCKS_ALGORITHM_STC },
 /* Turn off -fschedule-insns by default.  It tends to make the
problem with not enough registers even worse.  */
 { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 },
-- 
2.4.3



[PATCH] mn10300: Use the STC bb-reorder algorithm at -Os

2015-10-16 Thread Segher Boessenkool
For mn10300, STC still gives better results for optimise-for-size than
"simple" does.  So use STC at -Os as well.

Is this okay for trunk?


Segher


2015-10-16  Segher Boessenkool  

* common/config/mn10300/mn10300-common.c
(mn10300_option_optimization_table) :
Use REORDER_BLOCKS_ALGORITHM_STC at -Os and up.

---
 gcc/common/config/mn10300/mn10300-common.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/common/config/mn10300/mn10300-common.c 
b/gcc/common/config/mn10300/mn10300-common.c
index 2df93eb..52421e5 100644
--- a/gcc/common/config/mn10300/mn10300-common.c
+++ b/gcc/common/config/mn10300/mn10300-common.c
@@ -30,6 +30,9 @@
 static const struct default_options mn10300_option_optimization_table[] =
   {
 { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
+/* The STC algorithm produces the smallest code at -Os.  */
+{ OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
+  REORDER_BLOCKS_ALGORITHM_STC },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 
-- 
2.4.3



Re: [PATCH] Random shuffle moveable: container size

2015-10-16 Thread Jonathan Wakely

Committed to trunk, thanks for the patch.



Re: [PATCH 7/7] Libsanitizer merge from upstream r249633.

2015-10-16 Thread Maxim Ostapenko

On 14/10/15 15:12, Jakub Jelinek wrote:

On Wed, Oct 14, 2015 at 03:02:22PM +0300, Maxim Ostapenko wrote:

On 14/10/15 14:06, Jakub Jelinek wrote:

On Wed, Oct 14, 2015 at 01:51:44PM +0300, Maxim Ostapenko wrote:

Ok, got it. The first solution would require changes in libsanitizer because the
heuristic doesn't work for GCC, so perhaps a new UBSan entry point should go
upstream, right? Or could this be implemented as a local patch for GCC?

No.  The heuristic relies on:
1) either it is old style float cast overflow without location
2) or it is new style float cast with location, but the location must:
a) not have NULL filename
b) the filename must not be ""
c) the filename must not be "\1"
So, my proposal was to emit in GCC the old style float cast overflow if a), b)
or c) is true, otherwise the new style.  I have no idea what you mean by
heuristic doesn't work for GCC after that.

I mean that there are some cases where (FilenameOrTypeDescriptor[0] +
FilenameOrTypeDescriptor[1] < 2) is not sufficient to determine if we should
use old style. I actually caught this on float-cast-overflow-10.c testcase.

Ah, ok, in that case the heuristic is flawed.  If they want to keep it,
they should check if MaybeFromTypeKind is either < 2 or equal to 0x1fe.
Can you report it upstream?  If that is changed, we'd need to change the
above and also add
   d) the filename must not start with "\xff\xff"
to the rules.

I think it would be better to just add a whole new entrypoint, but if they
think the heuristic is good enough, they should at least fix it up.
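
The byte-sum check discussed above can be sketched as follows.  This is a
simplified model and not the libsanitizer source: the function name
`looks_like_old_style` is hypothetical, and it encodes only what the thread
states, namely that the sum of the first two bytes is tested for being
less than 2, plus the suggested extra test for 0x1fe (which is 0xff + 0xff).

```cpp
#include <cassert>

// Hypothetical model of the heuristic from the thread.  P points at the
// first bytes of either a filename or a type descriptor.  The data is
// treated as old style when the sum of the first two bytes is < 2, or
// equals 0x1fe (both bytes 0xff), the extra check suggested above.
inline bool
looks_like_old_style (const unsigned char *p)
{
  unsigned maybe_from_type_kind = p[0] + p[1];
  return maybe_from_type_kind < 2 || maybe_from_type_kind == 0x1fe;
}
```

In this model a filename of "\1" (bytes 1, 0) and a filename starting with
"\xff\xff" are both misread as old style, which is what rules c) and d)
guard against.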

Jakub



Done. I've realized that we could just set loc to input_location if loc
== UNKNOWN_LOCATION. In that case we would always emit the new style. This
requires some changes in the tests, because upstream UBSan suppresses
distinct reports that share one location. How about this?


-Maxim
gcc/ChangeLog:

2015-10-16  Maxim Ostapenko  

	* ubsan.c (ubsan_instrument_float_cast): If location is unknown, assign
	 input_location to loc. Propagate loc to ubsan_create_data.

gcc/testsuite/ChangeLog:

2015-10-16  Maxim Ostapenko  

	* c-c++-common/ubsan/float-cast-overflow-10.c: Adjust test.
	* c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-9.c: Likewise.

Index: gcc/ubsan.c
===
--- gcc/ubsan.c	(revision 228817)
+++ gcc/ubsan.c	(working copy)
@@ -1484,6 +1484,7 @@
   machine_mode mode = TYPE_MODE (expr_type);
   int prec = TYPE_PRECISION (type);
   bool uns_p = TYPE_UNSIGNED (type);
+  if (loc == UNKNOWN_LOCATION) loc = input_location;
 
   /* Float to integer conversion first truncates toward zero, so
  even signed char c = 127.875f; is not problematic.
@@ -1581,8 +1582,8 @@
   else
 {
   /* Create the __ubsan_handle_float_cast_overflow fn call.  */
-  tree data = ubsan_create_data ("__ubsan_float_cast_overflow_data", 0,
- NULL, ubsan_type_descriptor (expr_type),
+  tree data = ubsan_create_data ("__ubsan_float_cast_overflow_data", 1,
+ &loc, ubsan_type_descriptor (expr_type),
  ubsan_type_descriptor (type), NULL_TREE,
  NULL_TREE);
   enum built_in_function bcode
Index: gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c
===
--- gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c	(revision 228817)
+++ gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c	(working copy)
@@ -10,70 +10,37 @@
 
 /* _Decimal32 */
 /* { dg-output "value  is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value  is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */

[Ada] Optimization of predicate checks

2015-10-16 Thread Arnaud Charlet
This patch marks the generated predicate functions as Pure, so that the
back-end can optimize redundant calls to these functions when inlining and
a high level of optimization are requested.

This is a performance enhancement, no change in behavior.
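
As a loose analogy in C++ (not what GNAT emits), GCC's `pure` function
attribute expresses the same property: the call has no side effects, so
repeated calls with the same argument may be merged.  The predicate and
the caller below are made up for illustration; how Set_Is_Pure is mapped
onto back-end attributes is not shown in this message.

```cpp
#include <cassert>

// `pure` is a GCC function attribute: the result depends only on the
// arguments and on global memory, and the call has no side effects.
// The predicate itself is made up for illustration.
[[gnu::pure]] inline bool
in_range_predicate (int x)
{
  return x >= 1 && x <= 100;
}

// Two checks of the same value with nothing in between: because the
// predicate is pure, an optimizing compiler is allowed to evaluate it once.
inline int
checked_sum (int x)
{
  int total = 0;
  if (in_range_predicate (x))
    total += x;
  if (in_range_predicate (x))
    total += x;
  return total;
}
```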

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Ed Schonberg  

* sem_ch13.adb (Build_Predicate_Functions): If expression for
predicate is side-effect free, indicate that the predicate
function is pure, to allow for optimization of redundant
predicate checks.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 228874)
+++ sem_ch13.adb(working copy)
@@ -8702,6 +8702,16 @@
 
 Insert_Before_And_Analyze (N, FDecl);
 Insert_After_And_Analyze  (N, FBody);
+
+--  Static predicate functions are always side-effect free, and
+--  in most cases dynamic predicate functions are as well. Mark
+--  them as such whenever possible, so redundant predicate checks
+--  can be optimized.
+
+if Expander_Active then
+   Set_Is_Pure (SId, Side_Effect_Free (Expr));
+   Set_Is_Inlined (SId);
+end if;
  end;
 
  --  Test for raise expressions present and if so build M version


[Ada] Premature finalization leads to wrong short circuit result

2015-10-16 Thread Arnaud Charlet
This patch modifies the expansion of expression_with_actions nodes to force the
evaluation of the expression when its type is Boolean. This prevents "leaks" of
dependencies on transient controlled objects which lead to incorrect results in
short circuit operators.
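
The idea can be pictured with a small C++ analogy.  This is not the GNAT
expansion, only a sketch of "capture the Boolean result in a flag while the
transient object is still alive"; the type and function names mirror the
Ada example below but are otherwise assumptions.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Illustrative stand-in for the controlled type in the Ada example.
struct VirtualFile
{
  std::shared_ptr<std::string> value;   // released when the object dies
  const std::string *full_name () const { return value.get (); }
};

inline VirtualFile
create (const std::string &s)
{
  return VirtualFile { std::make_shared<std::string> (s) };
}

// Model of the expansion: the Boolean result is captured in a flag while
// the transient object is alive; only the flag escapes the inner scope.
inline bool
name_is_hello ()
{
  bool flag;
  {
    VirtualFile tmp = create ("hello");       // transient controlled object
    flag = (*tmp.full_name () == "hello");    // Flag : constant Boolean := Expr
  }                                           // tmp is finalized here
  return flag;
}
```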


-- Source --


--  types.ads

with Ada.Finalization; use Ada.Finalization;

package Types is
   type FS_String is new String;
   Empty_FS_String : aliased FS_String := "ERROR";

   type FS_String_Access is access all FS_String;

   type File_Record is tagged record
  Normalized : FS_String_Access;
  Ref_Count  : Natural := 0;
   end record;

   type File_Access is access all File_Record'Class;

   procedure Ref (Obj : File_Access);
   procedure Unref (Obj : in out File_Access);

   type Virtual_File is new Controlled with record
  Value : File_Access;
   end record;

   procedure Adjust (Obj : in out Virtual_File);
   function Create (Str : FS_String) return Virtual_File;
   procedure Finalize (Obj : in out Virtual_File);
   function Full_Name (Obj : Virtual_File) return FS_String_Access;
end Types;

--  types.adb

with Ada.Unchecked_Deallocation;

package body Types is
   procedure Adjust (Obj : in out Virtual_File) is
   begin
  if Obj.Value /= null then
 Ref (Obj.Value);
  end if;
   end Adjust;

   function Create (Str : FS_String) return Virtual_File is
   begin
  return
(Controlled with Value =>
   new File_Record'(Ref_Count  => 1,
Normalized => new FS_String'(Str)));
   end Create;

   procedure Finalize (Obj : in out Virtual_File) is
  Value : File_Access := Obj.Value;

   begin
  Obj.Value := null;

  if Value /= null then
 Unref (Value);
  end if;
   end Finalize;

   function Full_Name (Obj : Virtual_File) return FS_String_Access is
   begin
  if Obj.Value /= null then
 return Obj.Value.Normalized;
  else
 return Empty_FS_String'Access;
  end if;
   end Full_Name;

   procedure Ref (Obj : File_Access) is
   begin
  Obj.Ref_Count := Obj.Ref_Count + 1;
   end Ref;

   procedure Unref (Obj : in out File_Access) is
  procedure Free_FA is
new Ada.Unchecked_Deallocation (File_Record'Class, File_Access);
  procedure Free_FS is
new Ada.Unchecked_Deallocation (FS_String, FS_String_Access);

   begin
  if Obj.Ref_Count > 0 then
 Obj.Ref_Count := Obj.Ref_Count - 1;

 if Obj.Ref_Count = 0 then
Free_FS (Obj.all.Normalized);
Free_FA (Obj);
 end if;
  end if;
   end Unref;
end Types;

--  main.adb

with Ada.Text_IO; use Ada.Text_IO;
with Types;   use Types;

procedure Main is
   function Return_Self (Flag : Boolean) return Boolean is
   begin
  return Flag;
   end Return_Self;

begin
   if Return_Self (True)
 and then Create ("hello").Full_Name.all = "hello"
   then
  Put_Line ("OK");
   else
  Put_Line ("ERROR: premature finalization");
   end if;
end Main;


-- Compilation and output --


$ gnatmake -q main.adb
$ ./main
OK

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Hristian Kirtchev  

* exp_ch4.adb (Expand_N_Expression_With_Actions):
Force the evaluation of the expression when its type is Boolean.
(Force_Boolean_Evaluation): New routine.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 228874)
+++ exp_ch4.adb (working copy)
@@ -5039,12 +5039,49 @@
--
 
procedure Expand_N_Expression_With_Actions (N : Node_Id) is
+  Acts : constant List_Id := Actions (N);
+
+  procedure Force_Boolean_Evaluation (Expr : Node_Id);
+  --  Force the evaluation of Boolean expression Expr
+
   function Process_Action (Act : Node_Id) return Traverse_Result;
   --  Inspect and process a single action of an expression_with_actions for
   --  transient controlled objects. If such objects are found, the routine
   --  generates code to clean them up when the context of the expression is
   --  evaluated or elaborated.
 
+  --
+  -- Force_Boolean_Evaluation --
+  --
+
+  procedure Force_Boolean_Evaluation (Expr : Node_Id) is
+ Loc   : constant Source_Ptr := Sloc (N);
+ Flag_Decl : Node_Id;
+ Flag_Id   : Entity_Id;
+
+  begin
+ --  Relocate the expression to the actions list by capturing its value
+ --  in a Boolean flag. Generate:
+ --Flag : constant Boolean := Expr;
+
+ Flag_Id := Make_Temporary (Loc, 'F');
+
+ Flag_Decl :=
+   Make_Object_Declaration (Loc,
+ Defining_Identifier => Flag_Id,
+ Constant_Present=> True,
+ Object_Definition   => New_Occurrence_Of 

[Ada] Crash on illegal program with -gnatf.

2015-10-16 Thread Arnaud Charlet
This patch fixes a crash in the compiler when reporting an error on an illegal
prefixed call whose prefix is overloaded, one of its interpretations has
an untagged type, and All_Errors_Mode is set.

No short example available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Ed Schonberg  

* sem_ch4.adb (Try_Object_Operation, Try_One_Interpretation):
Do not reset the Obj_Type of the prefix if an interpretation
involves an untagged type, to prevent a crash when analyzing an
illegal program in All_Errors mode.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 228864)
+++ sem_ch4.adb (working copy)
@@ -8135,6 +8135,12 @@
   ---
 
   procedure Try_One_Prefix_Interpretation (T : Entity_Id) is
+
+ --  If the interpretation does not have a valid candidate type,
+ --  preserve current value of Obj_Type for subsequent errors.
+
+ Prev_Obj_Type : constant Entity_Id := Obj_Type;
+
   begin
  Obj_Type := T;
 
@@ -8167,6 +8173,10 @@
  if not Is_Tagged_Type (Obj_Type)
or else Is_Incomplete_Type (Obj_Type)
  then
+
+--  Restore previous type if current one is not legal candidate.
+
+Obj_Type := Prev_Obj_Type;
 return;
  end if;
 


[PATCH][Testsuite] Turn on 64-bit-vector tests for AArch64

2015-10-16 Thread Alan Lawrence
This enables tests bb-slp-11.c and bb-slp-26.c for AArch64. Both of these are
currently passing on little- and big-endian.

(Tested on aarch64-none-linux-gnu and aarch64_be-none-elf).

OK for trunk?

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vect64): Add AArch64.
---
 gcc/testsuite/lib/target-supports.exp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 3088369..bd03108 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4762,6 +4762,7 @@ proc check_effective_target_vect64 { } {
 if { ([istarget arm*-*-*]
  && [check_effective_target_arm_neon_ok]
  && [check_effective_target_arm_little_endian])
+|| [istarget aarch64*-*-*]
  || [istarget sparc*-*-*] } {
set et_vect64_saved 1
 }
-- 
1.9.1



[PATCH][AArch64] Add support for 64-bit vector-mode ldp/stp

2015-10-16 Thread Kyrill Tkachov

Hi all,

We already support load/store-pair operations on the D-registers when they
contain an FP value, but the peepholes/sched-fusion machinery that
does all the hard work currently ignores 64-bit vector modes.

This patch adds support for fusing loads/stores of 64-bit vector operands into
ldp and stp instructions.
I've seen this trigger a few times in SPEC2006. Not too many times, but when it
did trigger, the code seemed objectively better, i.e. long sequences of ldr and
str instructions were essentially halved in size.
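
A minimal sketch of the pairing rule the new patterns rely on, under
simplifying assumptions: each access is reduced to a base register number
and a byte offset, and any further checks done by
aarch64_operands_ok_for_ldpstp are left out.  The `access` struct and
`can_pair` function are illustrative names, not GCC code.

```cpp
#include <cassert>
#include <utility>

// Simplified model of one 64-bit vector memory access: a base register
// number and a byte offset.  Illustrative only, not a GCC data structure.
struct access
{
  int base_reg;
  long offset;
};

const long VEC64_SIZE = 8;   // size in bytes of a 64-bit vector mode

// Order the two accesses by ascending offset (as the new peepholes do with
// std::swap) and report whether the second one sits exactly one element
// after the first, which is what the load_pair/store_pair conditions check.
inline bool
can_pair (access &first, access &second)
{
  if (first.base_reg != second.base_reg)
    return false;
  if (first.offset > second.offset)
    std::swap (first, second);
  return second.offset == first.offset + VEC64_SIZE;
}
```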

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2015-10-16  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
New function.
(fusion_load_store): Use it.
* config/aarch64/aarch64-ldpstp.md: Add new peephole2s for
ldp and stp in VD modes.
* config/aarch64/aarch64-simd.md (load_pair, VD): New pattern.
(store_pair, VD): Likewise.

2015-10-16  Kyrylo Tkachov  

* gcc.target/aarch64/stp_vec_64_1.c: New test.
* gcc.target/aarch64/ldp_vec_64_1.c: New test.
commit b5f4a5b87a7315fb8a4d88da3e4c4afc52d16052
Author: Kyrylo Tkachov 
Date:   Tue Oct 6 12:08:24 2015 +0100

[AArch64] Add support for 64-bit vector-mode ldp/stp

diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index 8d6d882..458829c 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -98,6 +98,47 @@ (define_peephole2
 }
 })
 
+(define_peephole2
+  [(set (match_operand:VD 0 "register_operand" "")
+	(match_operand:VD 1 "aarch64_mem_pair_operand" ""))
+   (set (match_operand:VD 2 "register_operand" "")
+	(match_operand:VD 3 "memory_operand" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, true, <MODE>mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+	  (set (match_dup 2) (match_dup 3))])]
+{
+  rtx base, offset_1, offset_2;
+
+  extract_base_offset_in_addr (operands[1], &base, &offset_1);
+  extract_base_offset_in_addr (operands[3], &base, &offset_2);
+  if (INTVAL (offset_1) > INTVAL (offset_2))
+{
+  std::swap (operands[0], operands[2]);
+  std::swap (operands[1], operands[3]);
+}
+})
+
+(define_peephole2
+  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "")
+	(match_operand:VD 1 "register_operand" ""))
+   (set (match_operand:VD 2 "memory_operand" "")
+	(match_operand:VD 3 "register_operand" ""))]
+  "TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, <MODE>mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+	  (set (match_dup 2) (match_dup 3))])]
+{
+  rtx base, offset_1, offset_2;
+
+  extract_base_offset_in_addr (operands[0], &base, &offset_1);
+  extract_base_offset_in_addr (operands[2], &base, &offset_2);
+  if (INTVAL (offset_1) > INTVAL (offset_2))
+{
+  std::swap (operands[0], operands[2]);
+  std::swap (operands[1], operands[3]);
+}
+})
+
+
 ;; Handle sign/zero extended consecutive load/store.
 
 (define_peephole2
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 6a2ab61..bf051c3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -153,6 +153,34 @@ (define_insn "*aarch64_simd_mov"
(set_attr "length" "4,4,4,8,8,8,4")]
 )
 
+(define_insn "load_pair<mode>"
+  [(set (match_operand:VD 0 "register_operand" "=w")
+	(match_operand:VD 1 "aarch64_mem_pair_operand" "Ump"))
+   (set (match_operand:VD 2 "register_operand" "=w")
+	(match_operand:VD 3 "memory_operand" "m"))]
+  "TARGET_SIMD
+   && rtx_equal_p (XEXP (operands[3], 0),
+		   plus_constant (Pmode,
+  XEXP (operands[1], 0),
+  GET_MODE_SIZE (<MODE>mode)))"
+  "ldp\\t%d0, %d2, %1"
+  [(set_attr "type" "neon_ldp")]
+)
+
+(define_insn "store_pair<mode>"
+  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "=Ump")
+	(match_operand:VD 1 "register_operand" "w"))
+   (set (match_operand:VD 2 "memory_operand" "=m")
+	(match_operand:VD 3 "register_operand" "w"))]
+  "TARGET_SIMD
+   && rtx_equal_p (XEXP (operands[2], 0),
+		   plus_constant (Pmode,
+  XEXP (operands[0], 0),
+  GET_MODE_SIZE (<MODE>mode)))"
+  "stp\\t%d1, %d3, %0"
+  [(set_attr "type" "neon_stp")]
+)
+
 (define_split
   [(set (match_operand:VQ 0 "register_operand" "")
   (match_operand:VQ 1 "register_operand" ""))]
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d7d05b8..7682417 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3491,6 +3491,18 @@ offset_12bit_unsigned_scaled_p (machine_mode mode, HOST_WIDE_INT offset)
 	  && offset % GET_MODE_SIZE (mode) == 0);
 }
 
+/* Return true if MODE is one of the modes for which we
+   support LDP/STP operations.  */
+
+static bool
+aarch64_mode_valid_for_sched_fusion_p (machine_mode mode)
+{
+  return mode == SImode || mode == DImode
+	 || mode == SFmode || mode == DFmode
+	 || (aarch64_vector_mode_supported_p (mode)
+	 && GET_MODE_SIZE (mode) == 8);
+}
+
 

Re: [PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-10-16 Thread Bernd Schmidt

On 10/16/2015 02:53 PM, Segher Boessenkool wrote:

For x86, STC still gives better results for optimise-for-size than
"simple" does.  So use STC at -Os as well.


For how many targets is this true, and for the others, what is the 
biggest win from "simple"? If the list of targets which get patches such 
as this one is too large, maybe we ought to admit defeat on the "simple" 
algorithm and revert it.



Bernd



[Patch AArch64 63304] Fix issue with global state.

2015-10-16 Thread Ramana Radhakrishnan
Hi,

Jiong pointed out privately that there was a thinko
in the way in which the global state was being
set and reset. I don't like adding such
global state but 

Tested on aarch64-none-elf with no regressions
Bootstrapped and regression tested on aarch64-none-linux-gnu

Ok to apply ?

regards
Ramana

2015-10-15  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_nopcrelative_literal_loads): New.
(aarch64_expand_mov_immediate): Use aarch64_nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_secondary_reload): Likewise.
(aarch64_override_options_after_change_1): Adjust.
* config/aarch64/aarch64.md (aarch64_reload_movcp):
Use aarch64_nopcrelative_literal_loads.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_nopcrelative_literal_loads): 
Declare

2015-10-15  Jiong Wang  
Ramana Radhakrishnan  

PR target/63304
* gcc.target/aarch64/pr63304-1.c: New test.
---
 gcc/config/aarch64/aarch64-protos.h  |  1 +
 gcc/config/aarch64/aarch64.c | 21 +++--
 gcc/config/aarch64/aarch64.md|  4 +--
 gcc/testsuite/gcc.target/aarch64/pr63304-1.c | 47 
 4 files changed, 62 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr63304-1.c

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index baaf1bd..bf501c8 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -400,4 +400,5 @@ int aarch64_ccmp_mode_to_code (enum machine_mode mode);
 bool extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset);
 bool aarch64_operands_ok_for_ldpstp (rtx *, bool, enum machine_mode);
 bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, enum machine_mode);
+extern bool aarch64_nopcrelative_literal_loads;
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5130e37..f9664f5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -163,6 +163,9 @@ enum aarch64_processor aarch64_tune = cortexa53;
 /* Mask to specify which instruction scheduling options should be used.  */
 unsigned long aarch64_tune_flags = 0;
 
+/* Global flag for PC relative loads.  */
+bool aarch64_nopcrelative_literal_loads;
+
 /* Support for command line parsing of boolean flags in the tuning
structures.  */
 struct aarch64_flag_desc
@@ -1551,7 +1554,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
 we need to expand the literal pool access carefully.
 This is something that needs to be done in a number
 of places, so could well live as a separate function.  */
- if (nopcrelative_literal_loads)
+ if (aarch64_nopcrelative_literal_loads)
{
  gcc_assert (can_create_pseudo_p ());
  base = gen_reg_rtx (ptr_mode);
@@ -3665,7 +3668,7 @@ aarch64_classify_address (struct aarch64_address_info 
*info,
  return ((GET_CODE (sym) == LABEL_REF
   || (GET_CODE (sym) == SYMBOL_REF
   && CONSTANT_POOL_ADDRESS_P (sym)
-  && !nopcrelative_literal_loads)));
+  && !aarch64_nopcrelative_literal_loads)));
}
   return false;
 
@@ -4899,7 +4902,7 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx 
x,
   if (MEM_P (x) && GET_CODE (x) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (x)
   && (SCALAR_FLOAT_MODE_P (GET_MODE (x))
  || targetm.vector_mode_supported_p (GET_MODE (x)))
-  && nopcrelative_literal_loads)
+  && aarch64_nopcrelative_literal_loads)
 {
   sri->icode = aarch64_constant_pool_reload_icode (mode);
   return NO_REGS;
@@ -7555,21 +7558,21 @@ aarch64_override_options_after_change_1 (struct 
gcc_options *opts)
 
   /* If nopcrelative_literal_loads is set on the command line, this
  implies that the user asked for PC relative literal loads.  */
-  if (nopcrelative_literal_loads == 1)
-nopcrelative_literal_loads = 0;
+  if (opts->x_nopcrelative_literal_loads == 1)
+aarch64_nopcrelative_literal_loads = false;
 
   /* If it is not set on the command line, we default to no
  pc relative literal loads.  */
-  if (nopcrelative_literal_loads == 2)
-nopcrelative_literal_loads = 1;
+  if (opts->x_nopcrelative_literal_loads == 2)
+aarch64_nopcrelative_literal_loads = true;
 
   /* In the tiny memory model it makes no sense
  to disallow non PC relative literal pool loads
  as many other things will break anyway.  */
-  if (nopcrelative_literal_loads
+  if (opts->x_nopcrelative_literal_loads
   && (aarch64_cmodel == AARCH64_CMODEL_TINY
  || aarch64_cmodel == AARCH64_CMODEL_TINY_PIC))
-

Re: Do not describe -std=c11 etc. as experimental in c.opt help text

2015-10-16 Thread Marek Polacek
Ping^2.

On Fri, Oct 09, 2015 at 03:50:02PM +0200, Marek Polacek wrote:
> Jason: ping.
> 
> On Fri, Oct 02, 2015 at 05:35:39PM +0200, Marek Polacek wrote:
> > On Thu, Oct 01, 2015 at 05:01:26PM +, Joseph Myers wrote:
> > > I noticed that c.opt still described -std=c11 and related options as
> > > experimental in the --help text.  This patch fixes this.
> > > 
> > > Jason, note that -std=gnu++11 and -std=gnu++14 still have that text,
> > > contrary to the descriptions of -std=c++11 and -std=c++14.
> > 
> > Thus, ok to commit this one (to trunk + 5)?
> > 
> > 2015-10-02  Marek Polacek  
> > 
> > * c.opt (std=gnu++11): Do not describe as experimental.
> > (std=gnu++14): Likewise.
> > 
> > diff --git gcc/c-family/c.opt gcc/c-family/c.opt
> > index a79b9f1..bfa09ee 100644
> > --- gcc/c-family/c.opt
> > +++ gcc/c-family/c.opt
> > @@ -1694,7 +1694,7 @@ corrigendum with GNU extensions
> >  
> >  std=gnu++11
> >  C++ ObjC++
> > -Conform to the ISO 2011 C++ standard with GNU extensions (experimental and 
> > incomplete support)
> > +Conform to the ISO 2011 C++ standard with GNU extensions
> >  
> >  std=gnu++0x
> >  C++ ObjC++ Alias(std=gnu++11) Undocumented
> > @@ -1706,7 +1706,7 @@ Deprecated in favor of -std=gnu++14
> >  
> >  std=gnu++14
> >  C++ ObjC++
> > -Conform to the ISO 2014 C++ standard with GNU extensions (experimental and 
> > incomplete support)
> > +Conform to the ISO 2014 C++ standard with GNU extensions
> >  
> >  std=gnu++1z
> >  C++ ObjC++

Marek


[gomp4,committed] Handle device-resident and link map kinds in dump_omp_clause

2015-10-16 Thread Tom de Vries

Hi,

this patch fixes an ICE when compiling c-c++-common/goacc/declare-1.c 
with -fdump-tree-omplower.


Committed to gomp-4_0-branch.

Thanks,
- Tom
Handle device-resident and link map kinds in dump_omp_clause

2015-10-16  Tom de Vries  

	* tree-pretty-print.c (dump_omp_clause): Handle device-resident and link
	map kinds.
---
 gcc/tree-pretty-print.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 1b52aa2..33559f0 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -552,6 +552,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
 	case GOMP_MAP_FORCE_DEVICEPTR:
 	  pp_string (pp, "force_deviceptr");
 	  break;
+	case GOMP_MAP_DEVICE_RESIDENT:
+	  pp_string (pp, "device_resident");
+	  break;
+	case GOMP_MAP_LINK:
+	  pp_string (pp, "link");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
-- 
1.9.1



[PATCH] Fix PR67975, teach SCCVN basic control equivalency for PHI value-numbering

2015-10-16 Thread Richard Biener

The following patch teaches SCCVN to value-number two PHI nodes the same
even when they are in different basic blocks.  To do that we have to
prove equivalency of the edge predicates into the PHI (and of course
equivalence of the PHI arguments).  The patch handles the simple case
of PHI nodes with two arguments; more arguments would require some sort
of pairwise reduction.  The patch doesn't yet consider swapped
predicates (non-canonical form can easily happen if the operands
are valueized).
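
As a toy model of the comparison described above (not the code in
tree-ssa-sccvn.c): a two-argument PHI is reduced to its controlling
predicate and the values arriving on the true and false edges.  The struct,
the string representation and the block numbers used with it are all
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Toy model of a two-argument PHI: the predicate controlling the incoming
// edges plus the values arriving on the true and false edges.
struct phi_node
{
  int block;                 // index of the basic block holding the PHI
  std::string predicate;     // e.g. "x_6 < 2.0e+1", already valueized
  std::string true_arg;
  std::string false_arg;
};

// Hash that ignores the block, so PHIs in different blocks can land in the
// same bucket and then be compared by phi_equal.
inline std::size_t
phi_hash (const phi_node &p)
{
  std::hash<std::string> h;
  std::size_t seed = 2;   // number of PHI arguments, used in place of the block
  seed = seed * 31 + h (p.true_arg);
  seed = seed * 31 + h (p.false_arg);
  return seed;
}

// Two PHIs get the same value number when their controlling predicates
// match and the same values arrive on the true and false edges.  The block
// is deliberately not compared.  Swapped predicates are not handled.
inline bool
phi_equal (const phi_node &a, const phi_node &b)
{
  return a.predicate == b.predicate
	 && a.true_arg == b.true_arg
	 && a.false_arg == b.false_arg;
}
```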

It's enough to make the testcase in PR67975 be optimized in FRE1 where
we get

  :
  if (y_5(D) >= 1.0e+1)
goto ;
  else
goto ;

  :
  if (x_6(D) < 2.0e+1)
goto ;
  else
goto ;

  :
  iftmp.1_7 = x_6(D);
  goto ;

  :
  iftmp.1_8 = y_5(D);

  :
  # iftmp.1_2 = PHI 
  iftmp.0_9 = __builtin_tan (iftmp.1_2);
  goto ;

  :
  iftmp.0_10 = x_6(D);

  :
  # iftmp.0_1 = PHI 
  z1_11 = __builtin_cos (iftmp.0_1);
  if (y_5(D) >= 1.0e+1)
goto ;
  else
goto ;

  :
  if (x_6(D) < 2.0e+1)
goto ;
  else
goto ;

  :
  iftmp.3_12 = x_6(D);
  goto ;

  :
  iftmp.3_13 = y_5(D);

  :
  # iftmp.3_4 = PHI 
  iftmp.2_14 = __builtin_tan (iftmp.3_4);
  goto ;

  :
  iftmp.2_15 = x_6(D);

  :
  # iftmp.2_3 = PHI 
  z2_16 = __builtin_cos (iftmp.2_3);
  _17 = z1_11 == z2_16;
  _18 = (int) _17;
  return _18;

and FRE1 now figures out that z1_11 == z2_16 and makes us return 1 via

Setting value number of iftmp.1_2 to iftmp.1_2 (changed)
Setting value number of iftmp.0_9 to iftmp.0_9 (changed)
Setting value number of iftmp.0_1 to iftmp.0_1 (changed)
Setting value number of z1_11 to z1_11 (changed)
Setting value number of iftmp.3_4 to iftmp.1_2 (changed)
Setting value number of iftmp.2_14 to iftmp.0_9 (changed)
Setting value number of iftmp.2_3 to iftmp.0_1 (changed)
Setting value number of z2_16 to z1_11 (changed)
Setting value number of _17 to 1 (changed)
Setting value number of _18 to 1 (changed)
  
Previously figuring out any such equivalence would require jump threading 
and PRE to prove PHI equivalency via PHI translation.  For the testcase in
question jump-threading wasn't aggressive enough.
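
For orientation, the dump above has roughly the shape one would expect from source like the following sketch. This is a reconstruction from the dump, not the PR67975 testcase itself. The helpers `inner` and `outer` are hypothetical stand-ins for the tan and cos calls, chosen so the sketch needs no math library; substituting the <math.h> functions should bring it closer to the dump.

```c
/* Hypothetical stand-ins for tan and cos.  */
static double
inner (double v)
{
  return v * 0.5 + 1.0;
}

static double
outer (double v)
{
  return v * v;
}

int
same_value (double x, double y)
{
  /* Each initializer corresponds to one inner and one outer two-argument
     PHI in the dump.  The two copies are controlled by the same
     predicates (y >= 10.0 and x < 20.0) but sit in different blocks.  */
  double z1 = outer (y >= 10.0 ? inner (x < 20.0 ? x : y) : x);
  double z2 = outer (y >= 10.0 ? inner (x < 20.0 ? x : y) : x);
  return z1 == z2;
}
```

At run time the function returns 1 for any finite inputs; the point of the patch is that value numbering can now establish this at compile time by matching the PHIs across blocks.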

A not-yet-cleaned-up patch passed bootstrap and regtest on
x86_64-unknown-linux-gnu; I am currently re-bootstrapping and testing
the following (and plan to commit on Monday if that succeeds).

Thanks,
Richard.

2015-10-16  Richard Biener  

PR tree-optimization/67975
* tree-cfg.h (extract_true_false_controlled_edges): Declare.
* tree-cfg.c (extract_true_false_controlled_edges): Split out
core worker from ...
* tree-ssa-loop-im.c (extract_true_false_args_from_phi): ... here.
* tree-ssa-sccvn.c (vn_phi_compute_hash): Hash number of args
instead of block number for PHIs with two or one args.
(vn_phi_eq): Compare edge predicates of PHIs that are in different
blocks.

* gcc.dg/tree-ssa/ssa-fre-50.c: New testcase.

Index: gcc/tree-cfg.h
===
*** gcc/tree-cfg.h  (revision 228863)
--- gcc/tree-cfg.h  (working copy)
*** extern unsigned int execute_fixup_cfg (v
*** 105,109 
--- 105,111 
  extern unsigned int split_critical_edges (void);
  extern basic_block insert_cond_bb (basic_block, gimple *, gimple *);
  extern bool gimple_find_sub_bbs (gimple_seq, gimple_stmt_iterator *);
+ extern bool extract_true_false_controlled_edges (basic_block, basic_block,
+edge *, edge *);
  
  #endif /* _TREE_CFG_H  */
Index: gcc/tree-cfg.c
===
*** gcc/tree-cfg.c  (revision 228863)
--- gcc/tree-cfg.c  (working copy)
*** extract_true_false_edges_from_block (bas
*** 8532,8537 
--- 8578,8652 
  }
  }
  
+ 
+ /* From a controlling predicate in the immediate dominator DOM of
+PHIBLOCK determine the edges into PHIBLOCK that are chosen if the
+predicate evaluates to true and false and store them to
+*TRUE_CONTROLLED_EDGE and *FALSE_CONTROLLED_EDGE if
+they are non-NULL.  Returns true if the edges can be determined,
+else return false.  */
+ 
+ bool
+ extract_true_false_controlled_edges (basic_block dom, basic_block phiblock,
+edge *true_controlled_edge,
+edge *false_controlled_edge)
+ {
+   basic_block bb = phiblock;
+   edge true_edge, false_edge, tem;
+   edge e0 = NULL, e1 = NULL;
+ 
+   /* We have to verify that one edge into the PHI node is dominated
+  by the true edge of the predicate block and the other edge
+  dominated by the false edge.  This ensures that the PHI argument
+  we are going to take is completely determined by the path we
+  take from the predicate block.
+  We can only use BB 

Re: [PATCH] c/67882 - improve -Warray-bounds for invalid offsetof

2015-10-16 Thread Bernd Schmidt

On 10/09/2015 04:55 AM, Martin Sebor wrote:

Gcc attempts to diagnose invalid offsetof expressions whose member
designator is an array element with an out-of-bounds index. The
logic in the function that does this detection is incomplete, leading
to false negatives. Since the result of the expression in these cases
can be surprising, this patch tightens up the logic to diagnose more
such cases.


In the future, please explain more clearly in the patch submission what 
the false negatives are. That'll make the reviewer's job easier.
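
As an illustration of the kind of expression under discussion (not taken from the patch or its tests), here is a small sketch of offsetof with an array-element member designator. The struct and function names are hypothetical. Only the in-bounds form is compiled; the out-of-bounds form is mentioned in a comment because that is what the diagnostic is about.

```c
#include <stddef.h>

struct S
{
  char tag;
  int a[4];
};

/* In-bounds element designator: well defined.  */
size_t
offset_of_element_2 (void)
{
  return offsetof (struct S, a[2]);
}

/* The same value computed by hand: offset of the array plus two
   elements.  */
size_t
expected_offset_of_element_2 (void)
{
  return offsetof (struct S, a) + 2 * sizeof (int);
}

/* An index past the end, e.g. offsetof (struct S, a[5]), is the sort of
   member designator the warning is meant to catch; it is deliberately
   left out of the compiled code.  */
```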



Tested by bootstrapping and running c/c++ tests on x86_64 with no
regressions.


Should run the full testsuite (standard practice, and library tests 
might have occurrences of offsetof).


A ChangeLog is missing. (Not that I personally care about ChangeLogs, 
but apparently others do.)



+struct offsetof_ctx_t
+{
+  tree inx; /* The invalid array index or NULL_TREE.  */
+  int maxinx;   /* All indices to the array have the highest valid value. */
+};


I think "idx" is commonly used, I've never seen the spelling "inx". 
Also, typically comments go on their own lines before the field.



+
+ if (tree_int_cst_lt (upbound, t)) {
+   pctx->inx = t;


Non-standard formatting. Elsewhere too.


+ /* Index is considered valid when it's either less than
+the upper bound or equal to it and refers to the lowest
+rank.  Since in the latter case it may not at this point
+in the recursive call to the function be known whether
+the element at the index is used to do more than to
+compute its offset (e.g., it can be used to designate
+a member of the type to which the just past-the-end
+element refers), point the INX variable at the index
+and leave it up to the caller to decide whether or not
+to diagnose it.  Special handling is required for minor
+index values referring to the element just past the end
+of the array object.  */


I admit to having trouble parsing this comment. Can you write that in a 
clearer way somehow? I'm still trying to make my mind up whether the 
logic in this patch could be simplified.



t = convert (sizetype, t);
off = size_binop (MULT_EXPR, TYPE_SIZE_UNIT (TREE_TYPE (expr)), t);
+
break;


Spurious change, please remove.


+extern tree fold_offsetof_1 (tree, offsetof_ctx_t* = NULL);


Space before *.


+// treatment since the offsetof exression yields the same result for


"expression".


+// The following expression is silently accepted as an extension
+// because it simply forms the equivalent of a just-past-the-end
+// address.
+__builtin_offsetof (A, a1_1 [0][1]),// extension


Hmm, do we really want to support any kind of multidimensional array for 
this extension? My guess would have been to warn here.


So I checked and it looks like we accept flexible array member syntax 
like "int a[][2];", which suggests that the test might have the right 
idea, but has the indices swapped (the first one is the flexible one)? 
Ccing Joseph for a ruling.



Bernd


[HSA] HSA back-end improvement

2015-10-16 Thread Martin Liška
Hello.

Attached patch set applies a bunch of small changes to HSA back-end.
Patches have been installed to hsa branch.

Martin
>From 10cf42ce8c0199471271edea80bb0cd717b6f0d1 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 9 Oct 2015 14:36:31 +0200
Subject: [PATCH 1/8] HSA: fix types in switch to if conversion code

gcc/ChangeLog:

2015-10-15  Martin Liska  

	* hsa-gen.c (convert_switch_statements): Generate fold_convert
	for situations where index type and label value types are different.
---
 gcc/hsa-gen.c | 8 
 1 file changed, 8 insertions(+)
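
A rough model of what the change does, written with plain C integers instead of GIMPLE trees. The function name and the choice of types are assumptions made for illustration. The idea is that the case bounds are first converted to the type of the switch index, so that both comparisons of the generated range test are carried out in one type.

```c
#include <stdbool.h>

/* Model of one "case LOW ... HIGH:" test after switch-to-if conversion.
   The labels arrive as int while the index is unsigned, so the labels
   are converted to the index type first (the analogue of fold_convert)
   and both comparisons are then done in that one type.  */
bool
in_case_range (unsigned int index, int low, int high)
{
  unsigned int low_cvt = (unsigned int) low;
  unsigned int high_cvt = (unsigned int) high;

  return low_cvt <= index && index <= high_cvt;
}
```

In the patch the conversion is only emitted when useless_type_conversion_p says the types really differ; the sketch converts unconditionally, which is harmless for plain integers.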

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index aa57669..3366e01 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5048,6 +5048,7 @@ convert_switch_statements ()
 
 	unsigned labels = gimple_switch_num_labels (s);
 	tree index = gimple_switch_index (s);
+	tree index_type = TREE_TYPE (index);
 	tree default_label = gimple_switch_default_label (s);
 	basic_block default_label_bb = label_to_block_fn
 	  (func, CASE_LABEL (default_label));
@@ -5100,17 +5101,24 @@ convert_switch_statements ()
 	tree low = CASE_LOW (label);
 	tree high = CASE_HIGH (label);
 
+	if (!useless_type_conversion_p (TREE_TYPE (low), index_type))
+	  low = fold_convert (index_type, low);
+
 	gimple_stmt_iterator cond_gsi = gsi_last_bb (cur_bb);
 	gimple *c = NULL;
 	if (high)
 	  {
 		tree tmp1 = make_temp_ssa_name (boolean_type_node, NULL,
 		"switch_cond_op1");
+
 		gimple *assign1 = gimple_build_assign (tmp1, LE_EXPR, low,
 		  index);
 
 		tree tmp2 = make_temp_ssa_name (boolean_type_node, NULL,
 		"switch_cond_op2");
+
+		if (!useless_type_conversion_p (TREE_TYPE (high), index_type))
+		  high = fold_convert (index_type, high);
 		gimple *assign2 = gimple_build_assign (tmp2, LE_EXPR, index,
 		  high);
 
-- 
2.6.0

>From b0fee876ca636e1623bc2a0a06d73ed435ff6397 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 9 Oct 2015 16:27:55 +0200
Subject: [PATCH 2/8] HSA: introduce seen_error for hsa_symbol class.

gcc/ChangeLog:

2015-10-15  Martin Liska  

	* hsa-gen.c (fillup_sym_for_decl): Add new seen_error guard.
	(get_symbol_for_decl): Mark all functions that use a problematic
	symbol as problematic too.
	* hsa.c (hsa_type_bit_size): Use correct seen_error function.
	* hsa.h (struct hsa_symbol): New member flag.
---
 gcc/hsa-gen.c | 14 +-
 gcc/hsa.c |  2 +-
 gcc/hsa.h |  3 +++
 3 files changed, 17 insertions(+), 2 deletions(-)
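
The caching problem this patch addresses can be sketched in isolation. Everything below (the struct layouts, the fixed-size table, the `error_now` parameter) is a hypothetical simplification and not the HSA back-end's data structures. It only shows why the flag is needed: on a cache hit the original error is not detected again, so it has to be replayed onto the function that is currently being compiled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct symbol
{
  const char *name;
  /* True if an error was seen when the symbol was first filled in.  */
  bool seen_error;
};

struct function_state
{
  bool failed;
};

#define MAX_SYMBOLS 8
static struct symbol cache[MAX_SYMBOLS];
static size_t cache_len;

/* Look NAME up in the cache, creating it on first use.  ERROR_NOW models
   "an error was reported while the symbol was being filled in".  FN is
   the function currently being compiled.  */
struct symbol *
get_symbol (const char *name, bool error_now, struct function_state *fn)
{
  for (size_t i = 0; i < cache_len; i++)
    if (strcmp (cache[i].name, name) == 0)
      {
        /* Cache hit: the earlier error is not re-detected here, so
           propagate it to the current function.  */
        if (cache[i].seen_error)
          fn->failed = true;
        return &cache[i];
      }

  if (cache_len == MAX_SYMBOLS)
    return NULL;

  struct symbol *sym = &cache[cache_len++];
  sym->name = name;
  sym->seen_error = error_now;
  if (error_now)
    fn->failed = true;
  return sym;
}
```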

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 3366e01..892fac2 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -658,6 +658,9 @@ fillup_sym_for_decl (tree decl, struct hsa_symbol *sym)
 {
   sym->decl = decl;
   sym->type = hsa_type_for_tree_type (TREE_TYPE (decl), >dim);
+
+  if (hsa_seen_error ())
+sym->seen_error = true;
 }
 
 /* Lookup or create the associated hsa_symbol structure with a given VAR_DECL
@@ -680,7 +683,16 @@ get_symbol_for_decl (tree decl)
   slot = hsa_global_variable_symbols->find_slot (, INSERT);
   gcc_checking_assert (slot);
   if (*slot)
-	return *slot;
+	{
+	  sym = *slot;
+
+	  /* If the symbol is problematic, mark current function also as
+	 problematic.  */
+	  if (sym->seen_error)
+	hsa_fail_cfun ();
+
+	  return sym;
+	}
   sym = XCNEW (struct hsa_symbol);
   sym->segment = BRIG_SEGMENT_GLOBAL;
   sym->linkage = BRIG_LINKAGE_FUNCTION;
diff --git a/gcc/hsa.c b/gcc/hsa.c
index 1617ec6..ed6a779 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -404,7 +404,7 @@ hsa_type_bit_size (BrigType16_t t)
   return 128;
 
 default:
-  gcc_assert (seen_error ());
+  gcc_assert (hsa_seen_error ());
   return t;
 }
 }
diff --git a/gcc/hsa.h b/gcc/hsa.h
index c7e3957..e2d5aed 100644
--- a/gcc/hsa.h
+++ b/gcc/hsa.h
@@ -86,6 +86,9 @@ struct hsa_symbol
 
   /* Is in global scope.  */
   bool global_scope_p;
+
+  /* True if an error has been seen for the symbol.  */
+  bool seen_error;
 };
 
 /* Abstract class for HSA instruction operands. */
-- 
2.6.0

>From 9eb195d3b8a967d8843299afcd1a487107db9aa1 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 15 Oct 2015 13:18:04 +0200
Subject: [PATCH 3/8] HSA: add new warning and enhance host fallback.

gcc/ChangeLog:

2015-10-15  Martin Liska  

	* hsa-gen.c (get_symbol_for_decl): Replace warning with
	HSA_SORRY_ATV.

libgomp/ChangeLog:

2015-10-15  Martin Liska  

	* target.c (GOMP_target): Add new case where we want to process
	host fallback.
---
 gcc/hsa-gen.c|  4 ++--
 libgomp/target.c | 14 ++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 892fac2..9ec1049 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -708,8 +708,8 @@ get_symbol_for_decl (tree decl)
 	  hsa_cfun->readonly_variables.safe_push (sym);
 	}
   else
-	warning (0, "referring to global symbol %q+D by name from HSA code "
-		 

[Ada] Minor cleanup in finalization support of the runtime

2015-10-16 Thread Arnaud Charlet
This removes a couple of redundant/unused things.  No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Eric Botcazou  

* a-tags.ads (Parent_Size): Remove obsolete pragma Export.
* s-finmas.ads (Header_Offset): Delete.
* s-finmas.adb (Header_Offset): Likewise.
(Finalize): Call Header_Size instead of Header_Offset.
* s-stposu.adb (Allocate_Any_Controlled): Likewise.
(Deallocate_Any_Controlled): Likewise.

Index: a-tags.ads
===
--- a-tags.ads  (revision 228884)
+++ a-tags.ads  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2014, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
 --  --
 -- This specification is derived from the Ada Reference Manual for use with --
 -- GNAT. The copyright notice above, and the license provisions that follow --
@@ -526,9 +526,6 @@
--  ancestor is the parent of the type represented by tag T. This function
--  assumes that _size is always in slot one of the dispatch table.
 
-   pragma Export (Ada, Parent_Size, "ada__tags__parent_size");
-   --  This procedure is used in s-finimp and is thus exported manually
-
procedure Register_Interface_Offset
  (This : System.Address;
   Interface_T  : Tag;
Index: s-stposu.adb
===
--- s-stposu.adb(revision 228884)
+++ s-stposu.adb(working copy)
@@ -281,7 +281,7 @@
  -- +- Header_And_Padding --+
 
  N_Ptr := Address_To_FM_Node_Ptr
-(N_Addr + Header_And_Padding - Header_Offset);
+(N_Addr + Header_And_Padding - Header_Size);
 
  --  Prepend the allocated object to the finalization master
 
@@ -414,7 +414,7 @@
 
  --  Convert the bits preceding the object into a list header
 
- N_Ptr := Address_To_FM_Node_Ptr (Addr - Header_Offset);
+ N_Ptr := Address_To_FM_Node_Ptr (Addr - Header_Size);
 
  --  Detach the object from the related finalization master. This
  --  action does not need to know the prior context used during
Index: s-finmas.adb
===
--- s-finmas.adb(revision 228884)
+++ s-finmas.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
--- Copyright (C) 2011, Free Software Foundation, Inc.   --
+-- Copyright (C) 2015, Free Software Foundation, Inc.   --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -212,7 +212,7 @@
  --  Skip the list header in order to offer proper object layout for
  --  finalization.
 
- Obj_Addr := Curr_Ptr.all'Address + Header_Offset;
+ Obj_Addr := Curr_Ptr.all'Address + Header_Size;
 
  --  Retrieve TSS primitive Finalize_Address depending on the master's
  --  mode of operation.
@@ -327,15 +327,6 @@
   return FM_Node'Size / Storage_Unit;
end Header_Size;
 
-   ---
-   -- Header_Offset --
-   ---
-
-   function Header_Offset return System.Storage_Elements.Storage_Offset is
-   begin
-  return FM_Node'Size / Storage_Unit;
-   end Header_Offset;
-

-- Initialize --

Index: s-finmas.ads
===
--- s-finmas.ads(revision 228884)
+++ s-finmas.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2011-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2011-2015, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -111,9 

[Ada] Spurious error on SPARK_Mode in generic package instantiation

2015-10-16 Thread Arnaud Charlet
This patch modifies the generic instantiation to ensure that a context with a
missing SPARK_Mode annotation is treated as having SPARK_Mode set to Off. This
ensures that the following SPARK UG rule 9.4.1

   Code where SPARK_Mode is Off shall not enclose code where Spark_Mode is
   On. However, if an instance of a generic unit is enclosed by code where
   SPARK_Mode is Off and if any SPARK_Mode specifications occur within the
   generic unit, then the corresponding SPARK_Mode specifications occurring
   within the instance have no semantic effect.

does not lead to spurious errors.
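
The effect of the change can be modelled with a three-state value. The sketch below is in C rather than Ada, and the enum and function names are hypothetical; it assumes the context's mode is one of "no annotation", Off, or On. The old test only reacted to an explicit Off, while the new one also covers the missing annotation.

```c
#include <stdbool.h>

/* Hypothetical three-state model of the enclosing context's SPARK_Mode.  */
enum spark_mode
{
  SPARK_NONE,   /* No annotation at all.  */
  SPARK_OFF,
  SPARK_ON
};

/* Old test: only an explicit Off made the instance ignore its own
   SPARK_Mode pragmas.  */
bool
ignore_pragmas_old (enum spark_mode context)
{
  return context == SPARK_OFF;
}

/* New test: anything other than On, including a missing annotation,
   makes the instance ignore them.  */
bool
ignore_pragmas_new (enum spark_mode context)
{
  return context != SPARK_ON;
}
```

The two predicates differ only for the "no annotation" state, which is the case exercised by the example that follows.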


-- Source --


--  gen_pack.ads

generic
  type T is private;

package Gen_Pack with SPARK_Mode is
   procedure Force_Body;

   generic
  type Inner_T is private;

   package Inner_Gen_Pack with SPARK_Mode => Off is
  type Inner_T_Ptr is access Inner_T;
   end Inner_Gen_Pack;

   package Inner_Inst is new Inner_Gen_Pack (Inner_T => T);

   type T_Ptr is private;

private
   pragma SPARK_Mode (Off);
   type T_Ptr is new Inner_Inst.Inner_T_Ptr;
end Gen_Pack;

--  gen_pack.adb

package body Gen_Pack with SPARK_Mode => Off is
   procedure Force_Body is begin null; end Force_Body;
end Gen_Pack;

--  main.adb

with Gen_Pack;

procedure Main is
   package Inst is new Gen_Pack (T => Integer);
begin
   null;
end Main;

-
-- Compilation --
-

$ gcc -c main.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Hristian Kirtchev  

* sem_ch12.adb (Analyze_Package_Instantiation):
Treat a missing SPARK_Mode annotation as having mode "Off".
(Analyze_Subprogram_Instantiation): Treat a missing SPARK_Mode
annotation as having mode "Off".
(Instantiate_Package_Body): Code
reformatting. Treat a missing SPARK_Mode annotation as having mode
"Off".
(Instantiate_Subprogram_Body): Code reformatting. Treat
a missing SPARK_Mode annotation as having mode "Off".

Index: sem_ch12.adb
===
--- sem_ch12.adb(revision 228884)
+++ sem_ch12.adb(working copy)
@@ -3723,11 +3723,12 @@
  goto Leave;
 
   else
- --  If the context of the instance is subject to SPARK_Mode "off",
- --  set the global flag which signals Analyze_Pragma to ignore all
- --  SPARK_Mode pragmas within the instance.
+ --  If the context of the instance is subject to SPARK_Mode "off" or
+ --  the annotation is altogether missing, set the global flag which
+ --  signals Analyze_Pragma to ignore all SPARK_Mode pragmas within
+ --  the instance.
 
- if SPARK_Mode = Off then
+ if SPARK_Mode /= On then
 Ignore_Pragma_SPARK_Mode := True;
  end if;
 
@@ -5098,11 +5099,12 @@
  Error_Msg_NE ("instantiation of & within itself", N, Gen_Unit);
 
   else
- --  If the context of the instance is subject to SPARK_Mode "off",
- --  set the global flag which signals Analyze_Pragma to ignore all
- --  SPARK_Mode pragmas within the instance.
+ --  If the context of the instance is subject to SPARK_Mode "off" or
+ --  the annotation is altogether missing, set the global flag which
+ --  signals Analyze_Pragma to ignore all SPARK_Mode pragmas within
+ --  the instance.
 
- if SPARK_Mode = Off then
+ if SPARK_Mode /= On then
 Ignore_Pragma_SPARK_Mode := True;
  end if;
 
@@ -10632,18 +10634,19 @@
   Act_Spec: constant Node_Id:= Specification (Act_Decl);
   Act_Decl_Id : constant Entity_Id  := Defining_Entity (Act_Spec);
 
+  Save_IPSM: constant Boolean := Ignore_Pragma_SPARK_Mode;
+  Save_Style_Check : constant Boolean := Style_Check;
+
+  Act_Body  : Node_Id;
+  Act_Body_Id   : Entity_Id;
   Act_Body_Name : Node_Id;
   Gen_Body  : Node_Id;
   Gen_Body_Id   : Node_Id;
-  Act_Body  : Node_Id;
-  Act_Body_Id   : Entity_Id;
+  Par_Ent   : Entity_Id := Empty;
+  Par_Vis   : Boolean   := False;
 
   Parent_Installed : Boolean := False;
-  Save_Style_Check : constant Boolean := Style_Check;
 
-  Par_Ent : Entity_Id := Empty;
-  Par_Vis : Boolean   := False;
-
   Vis_Prims_List : Elist_Id := No_Elist;
   --  List of primitives made temporarily visible in the instantiation
   --  to match the visibility of the formal type
@@ -10783,8 +10786,17 @@
   if Present (Gen_Body_Id) then
  Save_Env (Gen_Unit, Act_Decl_Id);
  Style_Check := False;
+
+ --  If the context of the instance is subject to SPARK_Mode "off" or
+ --  the annotation is altogether missing, set the global flag which
+ --  signals Analyze_Pragma to ignore all SPARK_Mode pragmas within
+ --  the instance.
+
+ if SPARK_Mode /= On then
+   

Re: [PATCH 1/7] Libsanitizer merge from upstream r249633.

2015-10-16 Thread Renato Golin
On 14 October 2015 at 19:38, Renato Golin  wrote:
> On 14 October 2015 at 19:21, Evgenii Stepanov  
> wrote:
>> Wait. As Jakub correctly pointed out in the other thread, there is no
>> obvious reason why there could not be a single shadow offset value
>> that would work for all 3 possible VMA settings. I suggest figuring
>> this out first.
>
> We are.

For anyone interested, here are the first few reviews:

http://reviews.llvm.org/D13781
http://reviews.llvm.org/D13782

There's more coming... :)

I don't want to spam this list for all the future patches, so if
you're interested, you might monitor the LLVM list, or register in our
Phabricator and create a filter for reviews with "VMA" to always CC
you.

cheers,
--renato


Re: [PATCH 1/7] Libsanitizer merge from upstream r249633.

2015-10-16 Thread Maxim Ostapenko

On 16/10/15 16:48, Renato Golin wrote:

On 14 October 2015 at 19:38, Renato Golin  wrote:

On 14 October 2015 at 19:21, Evgenii Stepanov  wrote:

Wait. As Jakub correctly pointed out in the other thread, there is no
obvious reason why there could not be a single shadow offset value
that would work for all 3 possible VMA settings. I suggest figuring
this out first.

We are.

For anyone interested, here are the first few reviews:

http://reviews.llvm.org/D13781
http://reviews.llvm.org/D13782

There's more coming... :)

I don't want to spam this list for all the future patches, so if
you're interested, you might monitor the LLVM list, or register in our
Phabricator and create a filter for reviews with "VMA" to always CC
you.

cheers,
--renato



Yeah, thanks. Just wondering if I should step back until they are 
resolved upstream or we can have another merge in the future (stage3 is 
coming ...)?


-Maxim


[PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Benedikt Huber
* config/aarch64/aarch64-builtins.c: Builtins for rsqrt and rsqrtf.
* config/aarch64/aarch64-protos.h: Declare.
* config/aarch64/aarch64-simd.md: Matching expressions for frsqrte and
frsqrts.
* config/aarch64/aarch64-tuning-flags.def: Added recip_sqrt.
* config/aarch64/aarch64.c: New functions. Emit rsqrt estimation code when
applicable.
* config/aarch64/aarch64.md: Added enum entries.
* config/aarch64/aarch64.opt: Added option -mlow-precision-recip-sqrt.
* testsuite/gcc.target/aarch64/rsqrt_asm_check_common.h: Common macros for
assembly checks.
* testsuite/gcc.target/aarch64/rsqrt_asm_check_negative_1.c: Make sure
frsqrts and frsqrte are not emitted.
* testsuite/gcc.target/aarch64/rsqrt_asm_check_1.c: Make sure frsqrts and
frsqrte are emitted.
* testsuite/gcc.target/aarch64/rsqrt_1.c: Functional tests for rsqrt.

Signed-off-by: Philipp Tomsich 
---
 gcc/ChangeLog  |  20 
 gcc/config/aarch64/aarch64-builtins.c  | 115 +
 gcc/config/aarch64/aarch64-protos.h|   4 +
 gcc/config/aarch64/aarch64-simd.md |  27 +
 gcc/config/aarch64/aarch64-tuning-flags.def|   1 +
 gcc/config/aarch64/aarch64.c   | 107 ++-
 gcc/config/aarch64/aarch64.md  |   3 +
 gcc/config/aarch64/aarch64.opt |   5 +
 gcc/doc/invoke.texi|  12 +++
 gcc/testsuite/gcc.target/aarch64/rsqrt_1.c | 111 
 .../gcc.target/aarch64/rsqrt_asm_check_1.c |  25 +
 .../gcc.target/aarch64/rsqrt_asm_check_common.h|  42 
 .../aarch64/rsqrt_asm_check_negative_1.c   |  12 +++
 13 files changed, 482 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_common.h
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_negative_1.c
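
For background, the estimate-and-refine scheme behind frsqrte/frsqrts can be sketched in portable C. This is not the code the patch emits. The initial estimate below uses a well-known bit-level approximation as a stand-in for the hardware estimate (it does not reproduce what frsqrte returns), the function names are hypothetical, and the number of refinement steps is a free parameter rather than the count the patch uses. The step function computes (3 - a * b) / 2, which is the operation frsqrts provides.

```c
#include <stdint.h>
#include <string.h>

/* Crude initial estimate of 1/sqrt(a); a software stand-in for the
   hardware estimate instruction.  Valid for positive, normal A only.  */
static float
rsqrt_estimate (float a)
{
  uint32_t bits;
  float x;

  memcpy (&bits, &a, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Refinement factor (3 - a * b) / 2.  */
static float
rsqrt_step (float a, float b)
{
  return (3.0f - a * b) / 2.0f;
}

/* Approximate 1/sqrt(a) with ITERATIONS Newton-Raphson steps:
   x' = x * (3 - a * x * x) / 2.  */
float
approx_rsqrt (float a, int iterations)
{
  float x = rsqrt_estimate (a);

  for (int i = 0; i < iterations; i++)
    x = x * rsqrt_step (a, x * x);
  return x;
}
```

Each step roughly squares the relative error, so the result is only as precise as the estimate and the step count allow, which is presumably why the option name mentions low precision.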

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 30860c4..2abe832 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,23 @@
+2015-10-15  Benedikt Huber  
+   Philipp Tomsich  
+
+   * config/aarch64/aarch64-builtins.c: Builtins for rsqrt and rsqrtf.
+   * config/aarch64/aarch64-protos.h: Declare.
+   * config/aarch64/aarch64-simd.md: Matching expressions for frsqrte and
+   frsqrts.
+   * config/aarch64/aarch64-tuning-flags.def: Added recip_sqrt.
+   * config/aarch64/aarch64.c: New functions. Emit rsqrt estimation code when
+   applicable.
+   * config/aarch64/aarch64.md: Added enum entries.
+   * config/aarch64/aarch64.opt: Added option -mlow-precision-recip-sqrt.
+   * testsuite/gcc.target/aarch64/rsqrt_asm_check_common.h: Common macros for
+   assembly checks.
+   * testsuite/gcc.target/aarch64/rsqrt_asm_check_negative_1.c: Make sure
+   frsqrts and frsqrte are not emitted.
+   * testsuite/gcc.target/aarch64/rsqrt_asm_check_1.c: Make sure frsqrts and
+   frsqrte are emitted.
+   * testsuite/gcc.target/aarch64/rsqrt_1.c: Functional tests for rsqrt.
+
 2015-10-14  Uros Bizjak  
 
* config/mips/mips.h (MIPS_STACK_ALIGN): Implement using
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 716ed6e..0fb19a4 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -344,6 +344,11 @@ enum aarch64_builtins
   AARCH64_BUILTIN_GET_FPSR,
   AARCH64_BUILTIN_SET_FPSR,
 
+  AARCH64_BUILTIN_RSQRT_DF,
+  AARCH64_BUILTIN_RSQRT_SF,
+  AARCH64_BUILTIN_RSQRT_V2DF,
+  AARCH64_BUILTIN_RSQRT_V2SF,
+  AARCH64_BUILTIN_RSQRT_V4SF,
   AARCH64_SIMD_BUILTIN_BASE,
   AARCH64_SIMD_BUILTIN_LANE_CHECK,
 #include "aarch64-simd-builtins.def"
@@ -842,6 +847,46 @@ aarch64_init_crc32_builtins ()
 }
 }
 
+/* Add builtins for reciprocal square root.  */
+
+void
+aarch64_init_builtin_rsqrt (void)
+{
+  tree fndecl = NULL;
+  tree ftype = NULL;
+
+  tree V2SF_type_node = build_vector_type (float_type_node, 2);
+  tree V2DF_type_node = build_vector_type (double_type_node, 2);
+  tree V4SF_type_node = build_vector_type (float_type_node, 4);
+
+  typedef struct
+  {
+tree type_node;
+const char *builtin_name;
+int function_code;
+  } builtin_decls_data;
+
+  builtin_decls_data bdda[] =
+  {
+{ double_type_node, "__builtin_aarch64_rsqrt_df", AARCH64_BUILTIN_RSQRT_DF },
+{ float_type_node, "__builtin_aarch64_rsqrt_sf", AARCH64_BUILTIN_RSQRT_SF },
+{ V2DF_type_node, "__builtin_aarch64_rsqrt_v2df", AARCH64_BUILTIN_RSQRT_V2DF },
+{ V2SF_type_node, 

Re: [PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-10-16 Thread Segher Boessenkool
On Fri, Oct 16, 2015 at 02:55:54PM +0200, Bernd Schmidt wrote:
> On 10/16/2015 02:53 PM, Segher Boessenkool wrote:
> >For x86, STC still gives better results for optimise-for-size than
> >"simple" does.  So use STC at -Os as well.
> 
> For how many targets is this true, and for the others, what is the 
> biggest win from "simple"?

See .

Of the targets I could test there, only x86 and mn10300 like STC better.
My theory is that this is because they have a smaller encoding for very
short branches.

Simple is about half a percent smaller on most targets.

> If the list of targets which get patches such 
> as this one is too large, maybe we ought to admit defeat on the "simple" 
> algorithm and revert it.

(Revert it for -Os, not -O1).

Yes, but it's just these two.  For everything else simple wins now.


Segher


Re: [PR67383][ARM][4.9]Backport of "Allow any register for DImode values in Thumb2"

2015-10-16 Thread Renlin Li

Hi Ramana,

On 16/10/15 11:52, Ramana Radhakrishnan wrote:

On Thu, Oct 15, 2015 at 03:01:24PM +0100, Renlin Li wrote:

Hi all,

This is a backport patch to loosen restrictions on core registers
for DImode values in Thumb2.

It fixes PR67383. In this particular case, reload tries to spill a
hard register, and use next register together as a pair to reload a
DImode pseudo register. However, the spilled register number is
odd. This is rejected by arm_hard_regno_mode_ok(). There is no other
register available, so the compiler throws an ICE.

I was not convinced enough by the reasoning provided in the description
because this patch was intended to be a bit of an optimization
rather than a correctness fix.


True, it's not a fix. It just allows more flexibility for register
allocation.



The command line implies we remove r7 (frame pointer in Thumb2 - historical 
accident, fno-omit-frame-pointer), r9 (ffixed-r9), r10 (-mpic-register) which
leaves us with:

* r0, r1
* r2, r3
* r4, r5

as the only free registers available for DImode values for the whole 
compilation.

We then have r0, r1 and r2 live across the insn which means that there are no 
free registers to handle DImode values
under the constraints provided unless LRA / reload can spill the argument 
registers which it doesn't seem to be able to do
in this particular testcase. Vlad, is that correct ?

According to the logic, conflicting hard registers are excluded from the
spill candidates. That's why, in this case, r0, r1, r2 cannot be used.



Then I wondered why the same problem did not occur in ARM state given that has 
the same restriction.
In ARM state life is a bit better because the Frame pointer is r11 which means 
you pretty much have r6 and r7
as well available in addition to the above list, which means that theoretically 
you can
get away with this in ARM state.
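
The register arithmetic in the discussion above can be checked with a small model. It is a simplification and not the logic of arm_hard_regno_mode_ok(): it assumes a 64-bit value must start on an even core register, treats sp, lr and pc as never available, and takes the remaining fixed registers as a bit mask. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Bit N set means core register rN is unavailable.  */
#define REG_BIT(n) (1u << (n))

/* sp, lr and pc are always treated as unavailable in this model.  */
#define ALWAYS_FIXED (REG_BIT (13) | REG_BIT (14) | REG_BIT (15))

/* Return a mask with bit N set for every even N such that the pair
   (rN, rN+1) could hold a 64-bit value, i.e. neither register of the
   pair is in UNAVAILABLE.  */
uint32_t
free_even_pairs (uint32_t unavailable)
{
  uint32_t result = 0;

  unavailable |= ALWAYS_FIXED;
  for (unsigned n = 0; n + 1 < 16; n += 2)
    if (!(unavailable & (REG_BIT (n) | REG_BIT (n + 1))))
      result |= REG_BIT (n);
  return result;
}
```

With r7, r9 and r10 marked unavailable (the Thumb2 case described above) the model yields the pairs starting at r0, r2 and r4. With r11 in place of r7 (the ARM state case) it additionally yields the pair starting at r6.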

Can you do some more comparison with ARM state as to why we don't have the same 
issue there ?


Presumably, ARM state should suffer from the same issue. I will have a look.

Regards,
Renlin

The test case in PR67383 is too big, so I didn't include it as part
of the patch.

I've put up a reduced testcase on the bz, the one I was using to debug.


arm-none-eabi regression testing is okay, without any new issues. Okay to
backport to 4.9?

For changes of this nature, please bootstrap and regression test this in ARM and
Thumb2 state as well.

regards
Ramana





Re: [PATCH 1/7] Libsanitizer merge from upstream r249633.

2015-10-16 Thread Renato Golin
On 16 October 2015 at 14:59, Maxim Ostapenko
 wrote:
> Yeah, thanks. Just wondering if I should step back until they are resolved
> upstream or we can have another merge in the future (stage3 is coming ...)?

Well, right now, the support is patchy, experimental, but it's
reasonably stable. From the moment the first patch concerning VMA
lands, until they're all applied, support will be unstable.

So, if you want to merge, you better merge from the current tree, or
wait for the VMA issue to stabilise, which can take a few weeks. You
can also take 3.7.0, which was slightly better tested than current
trunk and isn't missing a lot.

cheers,
--renato


[Ada] Check suppression in Ada.Containers

2015-10-16 Thread Arnaud Charlet
This patch implements two new check names (Container_Checks and
Tampering_Check) that may be used with pragma Suppress.
Suppressing Tampering_Check suppresses checks for "tampering with cursors" and
"tampering with elements". If pragma Suppress(Tampering_Check) is in force at
the point of instantiating Ada.Containers.Vectors, then tampering checks, and
all the controlled-type machinery that goes with them, are removed from the
generated code. This makes many operations much more efficient. Suppressing
Container_Checks suppresses all checks within the container package instance,
including the tampering checks. Pragma Suppress(All_Checks) or the command-line
switch -gnatp may also be used to suppress these checks.
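
The idea of compiling the tampering machinery out can be sketched as follows. This is a C analogue with hypothetical names, not the Ada implementation: a preprocessor symbol stands in for pragma Suppress (Tampering_Check), a plain counter stands in for the controlled-type machinery, and a false return stands in for the exception the real containers raise.

```c
#include <stdbool.h>
#include <stddef.h>

/* Define SUPPRESS_TAMPERING_CHECK before this point to compile the
   checks out; a hypothetical analogue of the pragma.  */
#ifndef SUPPRESS_TAMPERING_CHECK
#define TAMPERING_CHECK 1
#else
#define TAMPERING_CHECK 0
#endif

#define CAPACITY 16

struct vector
{
  int items[CAPACITY];
  size_t length;
  /* Number of iterations currently in progress over the vector.  */
  unsigned busy;
};

void
begin_iteration (struct vector *v)
{
  if (TAMPERING_CHECK)
    v->busy++;
}

void
end_iteration (struct vector *v)
{
  if (TAMPERING_CHECK)
    v->busy--;
}

/* Append VALUE.  Return false when the vector is full or, with checks
   enabled, when an iteration is in progress.  */
bool
append (struct vector *v, int value)
{
  if (TAMPERING_CHECK && v->busy > 0)
    return false;
  if (v->length == CAPACITY)
    return false;
  v->items[v->length++] = value;
  return true;
}
```

With the checks suppressed, the counter updates and the busy test are constant-false branches that a compiler can drop, which is where the efficiency gain in the sketch comes from.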

No test available -- this is purely an efficiency improvement.

So far, only Ada.Containers.Vectors implements check suppression.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Bob Duff  

* a-contai.ads: Add two check names: Container_Checks and
Tampering_Check.  Move the tampering check machinery from
Ada.Containers.Vectors to Ada.Containers. Later we can share it
with other containers.
Disable the tampering machinery in the presence of
Suppress(Tampering_Check).
Simplify the implementation of tampering checks. E.g. use RAII
to make incrementing/decrementing of the counts more concise.
* a-contai.adb: New package body, implementing the above.
* a-convec.ads, a-convec.adb: Use tampering check machinery
in Ada.Containers.
Disable all checking code when checks are suppressed.
Simplify many of the operations. Implement "&" in terms of Append,
rather than "by hand".
Remove: function "=" (L, R : Elements_Array) return Boolean is
abstract; so we can call the predefined "=" on Elements_Array.
For "=" on Vectors: Previously, we returned True immediately if
Left'Address = Right'Address.  That seems like a non-optimization
("if X = X" is unusual), so removed that.  Simplify by using
slice comparison ("=" on Element_Array will automatically call
"=" on the components, even if user defined).

Index: a-contai.adb
===
--- a-contai.adb(revision 0)
+++ a-contai.adb(revision 0)
@@ -0,0 +1,189 @@
+--
+--  --
+-- GNAT LIBRARY COMPONENTS  --
+--  --
+--   A D A . C O N T A I N E R S--
+--  --
+-- B o d y  --
+--  --
+--Copyright (C) 2015, Free Software Foundation, Inc.--
+--  --
+-- GNAT is free software;  you can  redistribute it  and/or modify it under --
+-- terms of the  GNU General Public License as published  by the Free Soft- --
+-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
+-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+--  --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception,   --
+-- version 3.1, as published by the Free Software Foundation.   --
+--  --
+-- You should have received a copy of the GNU General Public License and--
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
+-- .  --
+--
+
+package body Ada.Containers is
+
+   package body Generic_Implementation is
+
+  
+  -- Adjust --
+  
+
+  procedure Adjust (Control : in out Reference_Control_Type) is
+ pragma Assert (T_Check); -- not called if check suppressed
+  begin
+ if Control.T_Counts /= null then
+Lock (Control.T_Counts.all);
+ end if;
+  end Adjust;
+
+  --
+  -- Busy --
+  --
+
+  procedure Busy (T_Counts : in 

Re: [PATCH][haifa-sched] model load/store multiples properly in autoprefetcher scheduling

2015-10-16 Thread Kyrill Tkachov


On 16/10/15 04:55, Vladimir Makarov wrote:

On 10/15/2015 11:27 AM, Kyrill Tkachov wrote:


On 15/10/15 11:16, Bernd Schmidt wrote:

On 10/15/2015 11:40 AM, Kyrill Tkachov wrote:

The code that analyzes the offsets of the loads/stores doesn't try to
handle load/store-multiple insns.
These appear rather frequently in memory streaming workloads on aarch64
in the form of load-pair/store-pair instructions
i.e. ldp/stp.  In RTL, they are created by the sched_fusion pass + a
subsequent peephole and during sched2 they appear
as PARALLEL rtxes of multiple SETs to/from memory.




 * sched-int.h (struct autopref_multipass_data_): Remove offset
 field.  Add min_offset, max_offset, multi_mem_insn_p fields.
 * haifa-sched.c (analyze_set_insn_for_autopref): New function.
 (autopref_multipass_init): Use it.  Handle PARALLEL sets.
 (autopref_rank_data): New function.
 (autopref_rank_for_schedule): Use it.
 (autopref_multipass_dfa_lookahead_guard_1): Likewise.
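
As a rough illustration of why a single offset field is not enough for an
ldp/stp-style instruction, the sketch below summarizes the constant offsets
of all memory accesses made by one instruction into a minimum and a maximum.
The struct and function names are hypothetical; only the field names are
borrowed from the ChangeLog above, and this is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-instruction summary.  The field names mirror the ones
   listed in the ChangeLog; everything else is an assumption.  */
struct mem_access_summary
{
  int min_offset;
  int max_offset;
  bool multi_mem_insn_p;
};

/* Summarize the constant offsets of the N memory accesses made by one
   instruction.  N must be at least 1.  */
static struct mem_access_summary
summarize_offsets (const int *offsets, size_t n)
{
  assert (n > 0);
  struct mem_access_summary s = { offsets[0], offsets[0], n > 1 };
  for (size_t i = 1; i < n; i++)
    {
      if (offsets[i] < s.min_offset)
        s.min_offset = offsets[i];
      if (offsets[i] > s.max_offset)
        s.max_offset = offsets[i];
    }
  return s;
}
```

For a single load the minimum and maximum coincide, so the one-access case
stays a special case of the same representation.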


Looks pretty reasonable to me. Ok to commit with a few changes next Wednesday 
unless you hear from Vlad in the meantime (I just want to give him time to look 
at it).


Thanks, I'll wait as you suggested (and cc'ing Vlad).
In the meantime, here's the updated patch with the suggested changes for the 
record.

Ok for me.



Thanks, I'll commit it on Monday then.
Cheers,
Kyrill



[PATCH] Correctly fill up cgraph_node::local.versionable flag.

2015-10-16 Thread Martin Liška
Hello.

I've been working on the HSA branch, where we have a cloning pass running at all
optimization levels. The patch makes the computation of
cgraph_node::local.versionable independent of IPA CP and uses the flag to verify
that a function can be cloned.

The patch bootstraps on x86_64-linux-pc and survives the test suite.

Ready for trunk?
Thanks,
Martin
>From d17b51257d5e01ab6bd9a018b08f8ed6fd39c029 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 8 Oct 2015 17:57:30 +0200
Subject: [PATCH 1/3] Correctly fill up cgraph_node::local.versionable flag.

gcc/ChangeLog:

2015-10-15  Martin Liska  

	* cgraphclones.c (cgraph_node::create_virtual_clone):
	Verify cgraph_node.local.versionable instead of calling
	tree_versionable_function_p.
	* ipa-cp.c (determine_versionability): Save the information
	to ipa_node_params summary.
	(ipcp_versionable_function_p): Use it.
	(ipcp_propagate_stage): Pass IPA_NODE_REF to a called function.
	(ipcp_generate_summary): Do not compute cgraph_node
	versionability.
	* ipa-inline-analysis.c (inline_generate_summary): Compute
	versionability for all cgraph nodes.
	* ipa-prop.c (ipa_node_params_t::duplicate): Duplicate
	ipa_node_params::versionability.
	* ipa-prop.h (struct ipa_node_params): Declare it.
---
 gcc/cgraphclones.c|  2 +-
 gcc/ipa-cp.c  | 15 ++-
 gcc/ipa-inline-analysis.c |  4 
 gcc/ipa-prop.c|  1 +
 gcc/ipa-prop.h|  2 ++
 5 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index e51431c..5c04dc4 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -570,7 +570,7 @@ cgraph_node::create_virtual_clone (vec redirect_callers,
   char *name;
 
   if (!in_lto_p)
-gcc_checking_assert (tree_versionable_function_p (old_decl));
+gcc_checking_assert (local.versionable);
 
   gcc_assert (local.can_change_signature || !args_to_skip);
 
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 69a181d..0136af5 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -478,7 +478,8 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
with NODE.  */
 
 static void
-determine_versionability (struct cgraph_node *node)
+determine_versionability (struct cgraph_node *node,
+			  struct ipa_node_params *info)
 {
   const char *reason = NULL;
 
@@ -510,7 +511,7 @@ determine_versionability (struct cgraph_node *node)
 fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n",
 	 node->name (), node->order, reason);
 
-  node->local.versionable = (reason == NULL);
+  info->versionable = (reason == NULL);
 }
 
 /* Return true if it is at all technically possible to create clones of a
@@ -519,7 +520,7 @@ determine_versionability (struct cgraph_node *node)
 static bool
 ipcp_versionable_function_p (struct cgraph_node *node)
 {
-  return node->local.versionable;
+  return IPA_NODE_REF (node)->versionable;
 }
 
 /* Structure holding accumulated information about callers of a node.  */
@@ -2793,7 +2794,7 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
   {
 struct ipa_node_params *info = IPA_NODE_REF (node);
 
-determine_versionability (node);
+determine_versionability (node, info);
 if (node->has_gimple_body_p ())
   {
 	info->lattices = XCNEWVEC (struct ipcp_param_lattices,
@@ -4497,11 +4498,7 @@ ipcp_generate_summary (void)
   ipa_register_cgraph_hooks ();
 
   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
-  {
-	node->local.versionable
-	  = tree_versionable_function_p (node->decl);
-	ipa_analyze_node (node);
-  }
+ipa_analyze_node (node);
 }
 
 /* Write ipcp summary for nodes in SET.  */
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 108ff3e..9184944 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -4104,6 +4104,10 @@ inline_generate_summary (void)
 {
   struct cgraph_node *node;
 
+  FOR_EACH_DEFINED_FUNCTION (node)
+if (DECL_STRUCT_FUNCTION (node->decl))
+  node->local.versionable = tree_versionable_function_p (node->decl);
+
   /* When not optimizing, do not bother to analyze.  Inlining is still done
  because edge redirection needs to happen there.  */
   if (!optimize && !flag_generate_lto && !flag_generate_offload && !flag_wpa)
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 8dd9479..19846a8 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3507,6 +3507,7 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
 
   new_info->analysis_done = old_info->analysis_done;
   new_info->node_enqueued = old_info->node_enqueued;
+  new_info->versionable = old_info->versionable;
 
   old_av = ipa_get_agg_replacements_for_node (src);
   if (old_av)
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index b9868bb..35952dc 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -334,6 +334,8 @@ struct ipa_node_params
   unsigned node_within_scc : 1;
   /* Node is calling a private function called only once.  */
   unsigned 

[Ada] Pragma Constant_After_Elaboration

2015-10-16 Thread Arnaud Charlet
This patch implements the legality rules of pragma Constant_After_Elaboration:

   The Boolean aspect Constant_After_Elaboration may be specified as part of
   the declaration of a library level variable.

The semantic checks of this annotation cannot be performed by the compiler as
this requires full flow analysis.


-- Source --


--  gen_pack.ads

generic
package Gen_Pack is
   Var : Integer := 1;

   OK : Integer := Var   --  OK
 with Constant_After_Elaboration => True;
end Gen_Pack;

--  semantics.ads

with Gen_Pack;

package Semantics is
   Var : Integer := 1;

   --  3.3.1 The Boolean aspect Constant_After_Elaboration may be specified as
   --  part of the declaration of a library level variable.

   OK_1 : Integer := Var
 with Constant_After_Elaboration => False;   --  OK

   package OK_2 is new Gen_Pack; --  OK

   Error_1 : Integer
 with Constant_After_Elaboration;--  Error

   Error_2 : Integer := Var
 with Constant_After_Elaboration => 2;   --  Error

   Error_3 : constant Integer := Var
 with Constant_After_Elaboration;--  Error

   procedure Error_4
 with Constant_After_Elaboration;--  Error

   procedure Proc;
end Semantics;

--  semantics.adb

package body Semantics is
   procedure Error_4 is begin null; end Error_4;

   procedure Proc is
  Error_5 : Integer := Var
with Constant_After_Elaboration; --  Error

  package Error_6 is new Gen_Pack;   --  Error
   begin
  null;
   end Proc;
end Semantics;


-- Compilation and output --


$ gcc -c semantics.adb
semantics.adb:6:14: aspect "Constant_After_Elaboration" must apply to a library
  level variable
semantics.adb:8:07: instantiation error at gen_pack.ads:6
semantics.adb:8:07: aspect "Constant_After_Elaboration" must apply to a library
  level variable
semantics.ads:15:11: aspect "Constant_After_Elaboration" must apply to a
  variable with initialization expression
semantics.ads:18:41: expected type "Standard.Boolean"
semantics.ads:18:41: found type universal integer
semantics.ads:21:11: aspect "Constant_After_Elaboration" must apply to a
  variable declaration
semantics.ads:24:11: incorrect placement of aspect "Constant_After_Elaboration"

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-16  Hristian Kirtchev  

* aspects.adb Add an entry for Constant_After_Elaboration in
table Canonical_Aspect.
* aspects.ads Add entries for Constant_After_Elaboration in
tables Aspect_Argument, Aspect_Delay, Aspect_Id, Aspect_Names
and Implementation_Defined_Aspect.
* par-prag.adb Pragma Constant_After_Elaboration does not require
special processing by the parser.
* sem_ch13.adb Add an entry for Constant_After_Elaboration
in table Sig_Flags.
(Analyze_Aspect_Specifications):
Add processing for aspect Constant_After_Elaboration.
(Check_Aspect_At_Freeze_Point): Aspect Constant_After_Elaboration
does not require special processing at freeze time.
* sem_prag.adb (Analyze_Pragma): Add processing for pragma
Constant_After_Elaboration. Use routine Find_Related_Context to
retrieve the context of pragma Part_Of.
(Duplication_Error): Update comment on usage.
(Find_Related_Context): New routine.
* sem_prag.ads Add an entry for Constant_After_Elaboration
in table Aspect_Specifying_Pragma.
(Analyze_Contract_Cases_In_Decl_Part): Update the comment on usage.
* sem_util.adb (Add_Contract_Item): Add processing for pragma
Constant_After_Elaboration.
* sem_util.ads (Add_Contract_Item): Update the comment on usage.
* snames.ads-tmpl Add new predefined name and aspect id for
Constant_After_Elaboration.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 228884)
+++ sem_prag.adb(working copy)
@@ -200,10 +200,18 @@
--  context denoted by Context. If this is the case, emit an error.
 
procedure Duplication_Error (Prag : Node_Id; Prev : Node_Id);
-   --  Subsidiary to routines Find_Related_Package_Or_Body and
-   --  Find_Related_Subprogram_Or_Body. Emit an error on pragma Prag that
-   --  duplicates previous pragma Prev.
+   --  Subsidiary to all Find_Related_xxx routines. Emit an error on pragma
+   --  Prag that duplicates previous pragma Prev.
 
+   function Find_Related_Context
+ (Prag  : Node_Id;
+  Do_Checks : Boolean := False) return Node_Id;
+   --  Subsidiary to the analysis of pragmas Constant_After_Elaboration and
+   --  Part_Of. Find the first 

[patch] Document options for building and linking to libstdc++fs.a

2015-10-16 Thread Jonathan Wakely

This documents how to use the Filesystem TS library.

Committed to trunk.


commit 2d8dfef4311b51a2743f5ab722d467792c7c32dd
Author: Jonathan Wakely 
Date:   Fri Oct 16 14:53:46 2015 +0100

Document options for Filesystem TS library

	* doc/xml/manual/configure.xml: Document
	--enable-libstdcxx-filesystem-ts option.
	* doc/xml/manual/status_cxx2014.xml: Document libstdc++fs.a.
	* doc/xml/manual/using.xml: Likewise.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml
index 2f558d2..ac383cf 100644
--- a/libstdc++-v3/doc/xml/manual/configure.xml
+++ b/libstdc++-v3/doc/xml/manual/configure.xml
@@ -411,6 +411,15 @@
  
  
 
+ --enable-libstdcxx-filesystem-ts[default]
+ 
+Build libstdc++fs.a as well
+  as the usual libstdc++ and libsupc++ libraries. This is enabled by
+  default on select POSIX targets where it is known to work and disabled
+  otherwise.
+
+ 
+
 
 
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index d022ea4..6f1fbe5 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -402,7 +402,11 @@ not in any particular release.
   
   File System
   Y
-  
+  
+	Link with
+	
+	-lstdc++fs
+  
 
 
 
diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index 2c8d179..96ae686 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -96,6 +96,14 @@
 
 
 
+  -lstdc++fs
+  Linking to libstdc++fs
+is required for use of the Filesystem library extensions in
+experimental/filesystem.
+  
+
+
+
   -fopenmp
   For parallel mode.
 
@@ -1361,8 +1369,31 @@ A quick read of the relevant part of the GCC
   you.
 
 
-  
 
+Experimental Library Extensions
+
+
+  GCC 5.3 includes an implementation of the Filesystem library defined
+  by the technical specification ISO/IEC TS 18822:2015. Because this is
+  an experimental library extension, not part of the C++ standard, it
+  is implemented in a separate library,
+  libstdc++fs.a, and there is
+  no shared library for it. To use the library you should include
+  experimental/filesystem
+  and link with -lstdc++fs. The library implementation
+  is incomplete on non-POSIX platforms, specifically Windows support is
+  rudimentary.
+
+
+
+  Due to the experimental nature of the Filesystem library the usual
+  guarantees about ABI stability and backwards compatibility do not apply
+  to it. There is no guarantee that the components in any
+  experimental/xxx
+  header will remain compatible between different GCC releases.
+
+
+  
 
   Concurrency
 


[PATCH v8][aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-10-16 Thread Benedikt Huber
This eighth revision of the patch:
 * Style improvements.

Ok for check in.


Benedikt Huber (1):
  2015-10-15  Benedikt Huber  
Philipp Tomsich  

 gcc/ChangeLog  |  20 
 gcc/config/aarch64/aarch64-builtins.c  | 115 +
 gcc/config/aarch64/aarch64-protos.h|   4 +
 gcc/config/aarch64/aarch64-simd.md |  27 +
 gcc/config/aarch64/aarch64-tuning-flags.def|   1 +
 gcc/config/aarch64/aarch64.c   | 107 ++-
 gcc/config/aarch64/aarch64.md  |   3 +
 gcc/config/aarch64/aarch64.opt |   5 +
 gcc/doc/invoke.texi|  12 +++
 gcc/testsuite/gcc.target/aarch64/rsqrt_1.c | 111 
 .../gcc.target/aarch64/rsqrt_asm_check_1.c |  25 +
 .../gcc.target/aarch64/rsqrt_asm_check_common.h|  42 
 .../aarch64/rsqrt_asm_check_negative_1.c   |  12 +++
 13 files changed, 482 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_common.h
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/rsqrt_asm_check_negative_1.c

-- 
1.9.1



[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-16 Thread Alan Lawrence
This lets the vectorizer handle some simple strides expressed using left-shift
rather than mul, e.g. a[i << 1] (whereas previously only a[i * 2] would have
been handled).

This patch does *not* handle the general case of shifts - neither a[i << j]
nor a[1 << i] will be handled; that would be a significantly bigger patch
(probably duplicating or generalizing much of chrec_fold_multiply and
chrec_fold_multiply_poly_poly in tree-chrec.c), and would probably also only
be applicable to machines with gather-load support.
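
The identity the patch relies on can be checked in isolation.  The sketch
below is a standalone illustration, not code from the patch: the helper name
is made up, and it uses unsigned arithmetic so that both forms are well
defined for every input, whereas the testcase indexes with a signed int.

```c
#include <assert.h>
#include <limits.h>

/* Compute i << k as i * (1 << k).  With unsigned operands both forms wrap
   modulo 2^N and therefore agree; k must be below the bit width.  */
static unsigned int
shift_as_multiply (unsigned int i, unsigned int k)
{
  assert (k < sizeof (unsigned int) * CHAR_BIT);
  unsigned int factor = 1u << k;
  return i * factor;
}
```

With a constant shift count the factor is itself a constant, which is why a
stride written as a[i << 1] can be treated like a[i * 2].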

Bootstrapped+check-gcc,g++,gfortran on x86_64, AArch64 and ARM, also Ada on 
x86_64.

Is this OK for trunk?

gcc/ChangeLog:

PR tree-optimization/65963
* tree-scalar-evolution.c (interpret_rhs_expr): Handle some LSHIFT_EXPRs
as equivalent MULT_EXPRs.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-strided-shift-1.c: New.
---
 gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c | 33 
 gcc/tree-scalar-evolution.c  | 18 +
 2 files changed, 51 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c 
b/gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c
new file mode 100644
index 000..b1ce2ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c
@@ -0,0 +1,33 @@
+/* PR tree-optimization/65963.  */
+#include "tree-vect.h"
+
+#define N 512
+
+int in[2*N], out[N];
+
+__attribute__ ((noinline)) void
+loop (void)
+{
+  for (int i = 0; i < N; i++)
+out[i] = in[i << 1] + 7;
+}
+
+int
+main (int argc, char **argv)
+{
+  check_vect ();
+  for (int i = 0; i < 2*N; i++)
+{
+  in[i] = i;
+  __asm__ volatile ("" : : : "memory");
+}
+  loop ();
+  __asm__ volatile ("" : : : "memory");
+  for (int i = 0; i < N; i++)
+{
+  if (out[i] != i*2 + 7)
+   abort ();
+}
+  return 0;
+}
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 1 "vect" 
{ target { vect_strided2 } } } } */
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 0753bf3..e478b0e 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -1831,12 +1831,30 @@ interpret_rhs_expr (struct loop *loop, gimple *at_stmt,
   break;
 
 case MULT_EXPR:
+case LSHIFT_EXPR:
+  /* Handle A< TYPE_PRECISION (type))
+   type = TREE_TYPE (chrec1);
+ if (TYPE_PRECISION (type) == 0)
+   {
+ res = chrec_dont_know;
+ break;
+   }
+ chrec2 = fold_build2 (LSHIFT_EXPR, type,
+   build_int_cst (type, 1),
+   chrec2);
+   }
   res = chrec_fold_multiply (type, chrec1, chrec2);
   break;
 
-- 
1.9.1



RFC: always default to -mno-unaligned-access for bare-metal ARM

2015-10-16 Thread Sandra Loosemore
Recently I tracked down a target crash problem in an ARM EABI 
configuration running on a Cortex-A9 board to an unaligned access fault. 
 The startup code provided by the customer for this board doesn't 
enable the MMU, and unaligned access support requires the MMU to be 
enabled per


http://infocenter.arm.com/help/topic/com.arm.doc.faqs/ka13671.html

The GCC manual presently says "By default unaligned access is disabled 
for all pre-ARMv6 and all ARMv6-M architectures, and enabled for all 
other architectures."  I think it would be safer for GCC to default to 
-munaligned-access only in configurations that are known also to enable
the MMU, e.g. GNU/Linux.  Code being built for bare-metal ARM EABI
targets typically includes low-level startup code and boot loaders that
run before the MMU is enabled, and likewise I don't think RTEMS,
uClinux, etc. imply MMU support, either.


We could tell all ARM EABI users to build with -mno-unaligned-access, 
but if multilibs for (say) -march=armv7-a are being provided then those 
must also be built with the "safe" setting to avoid crashes in e.g. the 
Newlib memcpy implementation.  I think it would be more user-friendly 
just to change the default.
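
For context, this is the kind of source-level pattern affected.  The sketch
below is only an illustration with a made-up helper name: it reads a 32-bit
value from an address that may not be 4-byte aligned by going through
memcpy, which avoids dereferencing a misaligned pointer in C.  Which
instructions the compiler emits for the copy still depends on the
unaligned-access setting in effect, which is why the default matters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address.  The copy goes
   through memcpy so the C code never dereferences a misaligned
   uint32_t pointer.  */
static uint32_t
load_u32_unaligned (const unsigned char *p)
{
  assert (p != NULL);
  uint32_t value;
  memcpy (&value, p, sizeof value);
  return value;
}
```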


ARM maintainers, WDYT?  I will work up and test a patch if there is 
agreement that this is a reasonable thing to do.


-Sandra



Re: [PATCH 5/9] i386: Add address spaces for fs/gs segments

2015-10-16 Thread Paolo Bonzini


On 08/10/2015 06:59, Richard Henderson wrote:
> +/* Address space support.
> +
> +   This is not "far pointers" in the 16-bit sense, but an easy way
> +   to use %fs and %gs segment prefixes.  Therefore:
> +
> +(a) All address spaces have the same modes,
> +(b) All address spaces have the same address forms,
> +(c) While %fs and %gs are technically subsets of the generic
> +address space, they are probably not subsets of each other.
> +(d) Since we have no access to the segment base register values
> +without resorting to a system call, we cannot convert a
> +non-default address space to a default address space.
> +Therefore we do not claim %fs or %gs are subsets of generic.

rdfsbase and rdgsbase are potentially accessible to userspace too, so I
think %fs or %gs should be considered subsets of generic.

Paolo

> +   Therefore, we need not override any of the address space hooks.  */



[gomp4,committed] Handle bind clause in dump_omp_clause

2015-10-16 Thread Tom de Vries

Hi,

this patch handles the oacc bind clause in dump_omp_clause.

Committed to gomp-4_0-branch.

Thanks,
- Tom
Handle bind clause in dump_omp_clause

2015-10-16  Tom de Vries  

	* tree-pretty-print.c (dump_omp_clause): Handle bind clause.
---
 gcc/tree-pretty-print.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 19ebdbc..c651b03 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -805,6 +805,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
 case OMP_CLAUSE_NOHOST:
   pp_string (pp, "nohost");
   break;
+case OMP_CLAUSE_BIND:
+  pp_string (pp, "bind(");
+  dump_generic_node (pp, OMP_CLAUSE_BIND_NAME (clause),
+			 spc, flags, false);
+  pp_string (pp, ")");
+  break;
 
 default:
   pp_string (pp, "unknown");
-- 
1.9.1



[gomp4, committed] Add nohost clause support in dump_omp_clause

2015-10-16 Thread Tom de Vries

Hi,

this patch adds nohost clause support in dump_omp_clause. Furthermore, 
it fixes an infinite recursion bug when handling unrecognized clauses.
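
The recursion can be pictured with a small standalone model.  All names
below are hypothetical, and the assumption that the generic dumper hands
clause nodes back to the clause dumper is mine; the sketch only shows why
the default label should print a fixed marker.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum node_kind { NODE_CLAUSE_NOHOST, NODE_CLAUSE_OTHER, NODE_EXPR };

static void dump_clause (char *buf, size_t size, enum node_kind kind);

/* Generic entry point: clause nodes are delegated to dump_clause.  */
static void
dump_node (char *buf, size_t size, enum node_kind kind)
{
  assert (size > 0);
  if (kind == NODE_EXPR)
    snprintf (buf, size, "expr");
  else
    dump_clause (buf, size, kind);
}

static void
dump_clause (char *buf, size_t size, enum node_kind kind)
{
  switch (kind)
    {
    case NODE_CLAUSE_NOHOST:
      snprintf (buf, size, "nohost");
      break;
    default:
      /* Calling dump_node here would bounce straight back into this
         function forever, so print a fixed marker instead.  */
      snprintf (buf, size, "unknown");
      break;
    }
}
```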


Committed to gomp-4_0-branch.

Thanks,
- Tom
Add nohost clause support in dump_omp_clause

2015-10-16  Tom de Vries  

	* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_NOHOST.  Fix
	infinite recursion in default label.
---
 gcc/tree-pretty-print.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index cd965bb..19ebdbc 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -802,10 +802,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
 			spc, flags);
   pp_string (pp, " ]");
   break;
+case OMP_CLAUSE_NOHOST:
+  pp_string (pp, "nohost");
+  break;
 
 default:
-  /* Should never happen.  */
-  dump_generic_node (pp, clause, spc, flags, false);
+  pp_string (pp, "unknown");
   break;
 }
 }
-- 
1.9.1



Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Benedikt Huber
I introduced this in revision 7 due to a request from James Greenhalgh.
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00963.html

> Given that this is all so mechanical, I'd have a preference towards
> refactoring this to loop over some structured data.

Do you mean, that I should get rid of the typedef and leave the struct without 
it?
Or should I completely drop the struct?

> On 16 Oct 2015, at 14:37, Oleg Endo  wrote:
> 
> On Thu, 2015-10-15 at 22:03 +, Benedikt Huber wrote:
>> 
>> +/* Add builtins for reciprocal square root.  */
>> +
>> +void
>> +aarch64_init_builtin_rsqrt (void)
>> +{
>> +  tree fndecl = NULL;
>> +  tree ftype = NULL;
>> +
>> +  tree V2SF_type_node = build_vector_type (float_type_node, 2);
>> +  tree V2DF_type_node = build_vector_type (double_type_node, 2);
>> +  tree V4SF_type_node = build_vector_type (float_type_node, 4);
>> +
>> +  typedef struct
>> +  {
>> +tree type_node;
>> +const char *builtin_name;
>> +int function_code;
>> +  } builtin_decls_data;
> 
> There is an ongoing effort to remove all the unnecessary typedef struct
> and enum etc stuff.  Please try not to add more of it.
> 
> Cheers,
> Oleg
> 





Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
On 16 October 2015 at 15:31, Benedikt Huber
 wrote:
> I introduced this in revision 7 due to a request from James Greenhalgh.
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00963.html
>
>> Given that this is all so mechanical, I'd have a preference towards
>> refactoring this to loop over some structured data.
>
> Do you mean, that I should get rid of the typedef and leave the struct 
> without it?
> Or should I completely drop the struct?

The use of the struct is fine, we are being discouraged from using
unnecessary typedefs.  Just rewrite it as:

 struct builtin_decls_data
   {
   ...
   };

The references to the typedef'd name don't need to be modified.
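
A minimal standalone sketch of the shape being asked for, looping over
structured data with a named struct and no typedef.  The table contents and
the lookup helper are hypothetical, and type_node is a string here because
GCC's tree type is not available outside the compiler.  GCC itself is built
as C++, where the bare name builtin_decls_data works; this sketch is plain
C, so it spells out the struct keyword.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Named struct, no typedef.  Field names follow the quoted patch.  */
struct builtin_decls_data
{
  const char *type_node;
  const char *builtin_name;
  int function_code;
};

/* Hypothetical table contents, for illustration only.  */
static const struct builtin_decls_data builtin_decls[] = {
  { "DF",   "example_rsqrt_df",   0 },
  { "V2SF", "example_rsqrt_v2sf", 1 },
  { "V4SF", "example_rsqrt_v4sf", 2 },
};

/* Return the function code registered under NAME, or -1 if none.  */
static int
lookup_function_code (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof builtin_decls / sizeof builtin_decls[0]; i++)
    if (strcmp (builtin_decls[i].builtin_name, name) == 0)
      return builtin_decls[i].function_code;
  return -1;
}
```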
Cheers
/Marcus


Re: Do not use TYPE_CANONICAL in useless_type_conversion

2015-10-16 Thread Andreas Schwab
Jan Hubicka  writes:

>> Jan Hubicka  writes:
>> 
>> > Does the patch in https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00902.html 
>> > help?
>> 
>> No, it doesn't.
>> 
> Andreas,
> I am sorry for getting late to this. I hoped that the alternative patch by 
> Alexandre would fix this.
> I still don't know how to reproduce without IA-64 box, so I am attaching a 
> patch that I think should
> fix it.  Does the attached patch work?

With that patch I'm getting a different ICE:

| in expand_debug_locations, at cfgexpand.c:5265   |
| Error detected around ../../gcc/ada/par_sco.adb:2690:10  |

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
On 16 October 2015 at 14:59, Benedikt Huber
 wrote:

> +  typedef struct
> +  {
> +tree type_node;
> +const char *builtin_name;
> +int function_code;
> +  } builtin_decls_data;

Please address Oleg's comment.

Cheers
/Marcus


[AArch64] Update comments on the usage of X30 in FIXED_REGISTERS and CALL_USED_REGISTERS

2015-10-16 Thread Jiong Wang

The patch https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02654.html
from last year changed the definition of LR in CALL_USED_REGISTERS,
but didn't update the comment above the #define to reflect the new usage.

This patch brings the comment in line with the implementation.

OK for trunk?

Thanks.

2015-10-16  Jiong. Wang  

gcc/
  * config/aarch64/aarch64.h: Update the comments on usage of X30.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 5a8db76..1eaaca0 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -210,14 +210,17 @@ extern unsigned aarch64_architecture_version;
significant bits.  Unlike AArch32 S1 is not packed into D0,
etc.  */
 
-/* Note that we don't mark X30 as a call-clobbered register.  The idea is
-   that it's really the call instructions themselves which clobber X30.
-   We don't care what the called function does with it afterwards.
-
-   This approach makes it easier to implement sibcalls.  Unlike normal
-   calls, sibcalls don't clobber X30, so the register reaches the
-   called function intact.  EPILOGUE_USES says that X30 is useful
-   to the called function.  */
+/* We don't mark X30 as a fixed register while we mark it as a caller-saved
+   register.  The idea is we want X30 to be allocable as a caller-saved
+   register when possible.
+
+   NOTE: although X30 is marked as caller-saved, it's callee-saved at the same
+   time.  The caller-saved attribute makes sure if X30 is allocated as free
+   register to hold any temporary value then the value is saved properly across
+   function call.  While on AArch64, the call instruction writes the return
+   address to LR.  If the called function is a non-leaf function, it is the
+   responsibility of the callee to save and restore LR appropriately in its
+   prologue / epilogue.  */
 
 #define FIXED_REGISTERS	\
   {			\


[PATCH, wwwdocs] Add -march=skylake-avx512 to gcc-6/changes.html.

2015-10-16 Thread Kirill Yukhin
Hello,
The patch at the bottom adds a mention of the new
`-march=skylake-avx512' switch to gcc-6/changes.html.

Is it ok to install?

This switch was backported to gcc-5.
Is it ok to create a new section `GCC 5.3' and put it there,
or do I need to wait for the actual release?

--
Thanks, K

Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.36
diff -p -r1.36 changes.html
*** htdocs/gcc-6/changes.html   12 Oct 2015 16:55:25 -  1.36
--- htdocs/gcc-6/changes.html   16 Oct 2015 12:24:54 -
*** enum {
*** 188,193 
--- 188,197 

  IA-32/x86-64
 
+  GCC now supports the Intel CPU named Skylake with AVX-512 extensions
+   through -march=skylake-avx512. The switch enables following
+   ISA extensions: AVX-512F, AVX512VL, AVX-512CD, AVX-512BW, AVX-512DQ.
+  
   
 Support for new AMD instructions monitorx and
 mwaitx has been added. This includes new intrinsic


[PATCH, rs6000] Enable secureplt by default on musl

2015-10-16 Thread Szabolcs Nagy

The musl dynamic loader can only deal with secure-plt,
so make it the default.

Split out from
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01640.html
for easier review (independent of the rest of the patch).

gcc/ChangeLog:

2015-10-16  Gregor Richards  
Szabolcs Nagy  

* config.gcc (enable_secureplt): Add *-linux*-musl*.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 5818663..06376bb 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2442,6 +2442,10 @@ powerpc*-*-linux*)
 	powerpc*-*-linux*paired*)
 		tm_file="${tm_file} rs6000/750cl.h" ;;
 	esac
+	case ${target} in
+	*-linux*-musl*)
+		enable_secureplt=yes ;;
+	esac
 	if test x${enable_secureplt} = xyes; then
 		tm_file="rs6000/secureplt.h ${tm_file}"
 	fi


Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))

2015-10-16 Thread David Malcolm
On Wed, 2015-10-14 at 11:00 +0200, Richard Biener wrote:
> On Tue, Oct 13, 2015 at 5:32 PM, David Malcolm  wrote:
> > On Thu, 2015-09-24 at 10:15 +0200, Richard Biener wrote:
> >> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm  wrote:
> >> > On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
> >> >> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz  wrote:
> >> >> > Hi,
> >> >> >
> >> >> > On Tue, 22 Sep 2015, David Malcolm wrote:
> >> >> >
> >> >> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> >> >> >> table ever get smaller, or does it only ever get inserted into?
> >> >> >
> >> >> > It only ever grows.
> >> >> >
> >> >> >> An idea I had is that we could stash short ranges directly into the 
> >> >> >> 32
> >> >> >> bits of location_t, by offsetting the per-column-bits somewhat.
> >> >> >
> >> >> > It's certainly worth an experiment: let's say you restrict yourself to
> >> >> > tokens less than 8 characters, you need an additional 3 bits (using one
> >> >> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> >> >> > numbers (for the normal 8 bit columns), which might be enough for most
> >> >> > single-file compilations.  For LTO compilation this often won't be enough.
> >> >> >
> >> >> >> My plan is to investigate the impact these patches have on the time and
> >> >> >> memory consumption of the compiler,
> >> >> >
> >> >> > When you do so, make sure you're also measuring an LTO compilation with
> >> >> > debug info of something big (firefox).  I know that we already had issues
> >> >> > with the size of the linemap data in the past for these cases (probably
> >> >> > when we added columns).
> >> >>
> >> >> The issue we have with LTO is that the linemap gets populated in quite
> >> >> random order and thus we repeatedly switch files (we've mitigated this
> >> >> somewhat for GCC 5).  We also considered dropping column info
> >> >> (and would drop range info) as diagnostics are from optimizers only
> >> >> with LTO and we keep locations merely for debug info.
> >> >
> >> > Thanks.  Presumably the mitigation you're referring to is the
> >> > lto_location_cache class in lto-streamer-in.c?
> >> >
> >> > Am I right in thinking that, right now, the LTO code doesn't support
> >> > ad-hoc locations? (presumably the block pointers only need to exist
> >> > during optimization, which happens after the serialization)
> >>
> >> LTO code does support ad-hoc locations but they are "restored" only
> >> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
> >>
> >> > The obvious simplification would be, as you suggest, to not bother
> >> > storing range information with LTO, falling back to just the existing
> >> > representation.  Then there's no need to extend LTO to serialize ad-hoc
> >> > data; simply store the underlying locus into the bit stream.  I think
> >> > that this happens already: lto-streamer-out.c calls expand_location and
> >> > stores the result, so presumably any ad-hoc location_t values made by
> >> > the v2 patches would have dropped their range data there when I ran the
> >> > test suite.
> >>
> >> Yep.  We only preserve BLOCKs, so if you don't add extra code to
> >> preserve ranges they'll be "dropped".
> >>
> >> > If it's acceptable to not bother with ranges for LTO, one way to do the
> >> > "stashing short ranges into the location_t" idea might be for the
> >> > bits-per-range of location_t values to be a property of the line_table
> >> > (or possibly the line map), set up when the struct line_maps is created.
> >> > For non-LTO it could be some tuned value (maybe from a param?); for LTO
> >> > it could be zero, so that we have as many bits as before for line/column
> >> > data.
> >>
> >> That could be a possibility (likewise for column info?)
> >>
> >> Richard.
> >>
> >> > Hope this sounds sane
> >> > Dave
> >
> > I did some crude benchmarking of the patchkit, using these scripts:
> >   https://github.com/davidmalcolm/gcc-benchmarking
> > (specifically, bb0222b455df8cefb53bfc1246eb0a8038256f30),
> > using the "big-code.c" and "kdecore.cc" files Michael posted as:
> >   https://gcc.gnu.org/ml/gcc-patches/2013-09/msg00062.html
> > and "influence.i", a preprocessed version of SPEC2006's 445.gobmk
> > engine/influence.c (as an example of a moderate-sized pure C source
> > file).
> >
> > This doesn't yet cover very large autogenerated C files, and the .cc
> > file is only being measured to see the effect on the ad-hoc table (and
> > tokenization).
> >
> > "control" was r227977.
> > "experiment" was the same revision with the v2 patchkit applied.
> >
> > Recall that this patchkit captures ranges for tokens as an extra field
> > within tokens within libcpp and the C FE, and adds ranges to the ad-hoc
> > location lookaside, storing them for all tree nodes within the C FE that
> > have a 
