Re: [PATCH, trivial][AArch64] Fix mode iterator for *aarch64_simd_ld1r<mode> pattern

2014-11-15 Thread Yangfei (Felix)
 On 13 November 2014 06:14, Yangfei (Felix) felix.y...@huawei.com wrote:
  Hi,
 
We find that the VALLDI mode iterator used in *aarch64_simd_ld1r<mode>
 pattern is not appropriate.
The reason is that it's impossible to get a new operand of DImode by
 vec_duplicating an operand of the same mode.
So this patch just excludes the DImode and uses VALL instead.
Reg-tested for aarch64-linux-gnu with QEMU.  OK for the trunk?
 
 OK, can you back port it to 4.9?
 Thanks
 /Marcus

Yes, committed: 
- trunk: r217573. 
- 4.9 branch: r217574. 


[PATCH] Check 'fd' neither -1 nor 0, before close it

2014-11-15 Thread Chen Gang
'fd' may be 0, in which case no 'open' operation was performed on it, so
it does not need a 'close' operation either.

Also, on the failure path in c_common_read_pch(), make sure that 'fd' is
not -1 before the close operation.
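
For illustration only, the guard that the patch repeats at each call site could be sketched as a small helper. The names fd_needs_close and close_if_opened are hypothetical and not part of the patch; the sketch assumes, as the patch does, that both 0 and -1 mean "nothing to close".

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical predicate (not in the patch): true when FD should be closed,
// i.e. it is neither the "never opened" value 0 nor the error value -1.
static bool
fd_needs_close (int fd)
{
  return fd != 0 && fd != -1;
}

// Hypothetical wrapper: close FD only when the predicate holds, then return
// -1 so the caller can store the "closed" marker unconditionally, mirroring
// the `file->fd = -1;` assignments that follow each close in the patch.
static int
close_if_opened (int fd)
{
  assert (fd >= -1);            // open() never yields anything below -1
  if (fd_needs_close (fd))
    close (fd);
  return -1;
}
```

With such a helper a call site would read `file->fd = close_if_opened (file->fd);`.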

It passes the testsuite on Fedora x86_64-unknown-linux-gnu.


gcc/

* c-family/c-pch.c (c_common_read_pch): Check that 'fd' is neither -1
nor 0 before closing it.

libcpp/

* files.c (_cpp_compare_file_date, read_file, validate_pch,
open_file, _cpp_save_file_entries): Check that 'fd' is neither -1 nor 0
before closing it.
---
 gcc/c-family/c-pch.c | 10 ++
 libcpp/files.c   | 20 +++-
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/gcc/c-family/c-pch.c b/gcc/c-family/c-pch.c
index 93609b6..d001965 100644
--- a/gcc/c-family/c-pch.c
+++ b/gcc/c-family/c-pch.c
@@ -355,7 +355,8 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   if (f == NULL)
 {
   cpp_errno (pfile, CPP_DL_ERROR, "calling fdopen");
-  close (fd);
+  if (fd && fd != -1)
+   close (fd);
   goto end;
 }
 
@@ -376,14 +377,15 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   timevar_push (TV_PCH_CPP_RESTORE);
   if (cpp_read_state (pfile, name, f, smd) != 0)
 {
-  fclose (f);
+  if (fd)
+   fclose (f);
   timevar_pop (TV_PCH_CPP_RESTORE);
   goto end;
 }
   timevar_pop (TV_PCH_CPP_RESTORE);
 
-
-  fclose (f);
+  if (fd)
+fclose (f);
 
   line_table->trace_includes = saved_trace_includes;
   linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
diff --git a/libcpp/files.c b/libcpp/files.c
index 3984821..5c845da 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -243,7 +243,8 @@ open_file (_cpp_file *file)
  errno = ENOENT;
}
 
-  close (file->fd);
+  if (file->fd)
+    close (file->fd);
   file->fd = -1;
 }
 #if defined(_WIN32) && !defined(__CYGWIN__)
@@ -753,7 +754,8 @@ read_file (cpp_reader *pfile, _cpp_file *file)
 }
 
   file->dont_read = !read_file_guts (pfile, file);
-  close (file->fd);
+  if (file->fd)
+    close (file->fd);
   file->fd = -1;
 
   return !file->dont_read;
@@ -1435,11 +1437,9 @@ _cpp_compare_file_date (cpp_reader *pfile, const char *fname,
   if (file->err_no)
 return -1;
 
-  if (file->fd != -1)
-{
-  close (file->fd);
-  file->fd = -1;
-}
+  if (file->fd && file->fd != -1)
+    close (file->fd);
+  file->fd = -1;
 
   return file->st.st_mtime > pfile->buffer->file->st.st_mtime;
 }
@@ -1694,7 +1694,8 @@ validate_pch (cpp_reader *pfile, _cpp_file *file, const char *pchname)
 
   if (!valid)
{
- close (file->fd);
+ if (file->fd)
+   close (file->fd);
  file->fd = -1;
}
 
@@ -1849,7 +1850,8 @@ _cpp_save_file_entries (cpp_reader *pfile, FILE *fp)
}
  ff = fdopen (f->fd, "rb");
  md5_stream (ff, result->entries[count].sum);
- fclose (ff);
+ if (f->fd)
+   fclose (ff);
  f->fd = oldfd;
}
   result->entries[count].size = f->st.st_size;
-- 
1.9.3


[Patch, Fortran] Convert gfc_fatal_error to common diagnostics

2014-11-15 Thread Tobias Burnus
Especially since color diagnostic is now the default [1], it makes sense 
to convert more gfortran diagnostics to use the common diagnostics.


For an example, see [1]. That also brings all the nice features like 
placing the warning option in brackets:

  Warning: USE statement at (1) has no ONLY qualifier [-Wuse-without-only]
Adding -Werror changes it to:
  Error: USE statement at (1) has no ONLY qualifier [-Werror=use-without-only]
-fno-diagnostics-show-caret compactifies the error into a single line 
without showing the source code – and other nice features.

[Thanks Manuel!]


This patch converts all gfc_fatal_error to the new scheme, except for 
those few using %L. As most calls could be converted, I renamed the old 
one to _1 instead of using the _2 name for the new one.


Additionally, I changed quoted strings from '%s' to %qs and added quotes 
around -farguments via %< … %>. That also has a colouring effect 
(default: black and bold).
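
As a rough illustration of what the quoting directives do to a message, here is a toy expander. It is not GCC's pretty-printer: render_quotes is a made-up name, it handles only %qs, %< and %>, and it always emits ASCII quotes, whereas the common diagnostics machinery picks the quote characters by locale and adds the colouring.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy expansion of three directives only: %qs (quoted string argument) and
// %< / %> (opening / closing quote).  Everything else is copied verbatim.
static std::string
render_quotes (const std::string &fmt, const std::vector<std::string> &args)
{
  std::string out;
  std::size_t next_arg = 0;
  for (std::size_t i = 0; i < fmt.size (); ++i)
    {
      if (fmt.compare (i, 3, "%qs") == 0)
        {
          assert (next_arg < args.size ());   // one argument per %qs
          out += "'" + args[next_arg++] + "'";
          i += 2;                             // skip the "qs"
        }
      else if (fmt[i] == '%' && i + 1 < fmt.size ()
               && (fmt[i + 1] == '<' || fmt[i + 1] == '>'))
        {
          out += '\'';
          i += 1;                             // skip the '<' or '>'
        }
      else
        out += fmt[i];
    }
  return out;
}
```

For example, the format "use %<-fcoarray=%> to enable" comes out with the option name in quotes, and a %qs argument is quoted the same way without the caller writing the quotes by hand.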


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: Manuel is working on %L support and buffered output (gfc_error, 
gfc_warning); thus, expect more colors in the next weeks. Meanwhile, I intend 
to convert gfc_intrinsic_error and the remaining gfc_{error,warning}_now 
calls (those that do not use %L).


PPS: Turning on colors comes too late for the command-line diagnostics - 
they always come up in black and white.


[1] See https://gcc.gnu.org/gcc-5/changes.html under Fortran and under 
C family.
2014-11-15  Tobias Burnus  bur...@net-b.de

gcc/fortran/
* error.c (gfc_fatal_error_1): Renamed from gfc_fatal_error.
	(gfc_fatal_error): Add; uses common diagnostics.
* array.c (gfc_match_array_ref, gfc_match_array_spec): Use
	%< %>.
* check.c (check_co_collective, gfc_check_lcobound,
	gfc_check_image_index, gfc_check_num_images,
	gfc_check_this_image, gfc_check_ucobound): Ditto.
* cpp.c (gfc_cpp_post_options): Ditto.
	(gfc_cpp_init_0, gfc_cpp_done): Change %s to %qs.
* gfc-diagnostic.def (DK_FATAL): Capitalize first letter.
* gfortran.h (gfc_fatal_error_1): Add.
* match.c (gfc_match_name, gfc_match_critical,
	lock_unlock_statement, sync_statement): Add %< %>.
* module.c (bad_module, gfc_dump_module, gfc_use_module): Change
	%s to %qs.
* options.c (gfc_handle_module_path_options, gfc_handle_fpe_option,
	gfc_handle_coarray_option, gfc_handle_runtime_check_option,
	gfc_handle_option): Add %< %>.
* simplify.c (gfc_simplify_num_images): Ditto.
* trans-stmt.c (gfc_trans_sync): Use gfc_fatal_error_1.
* trans-array.c (gfc_conv_array_initializer): Ditto.
* trans-types.c (gfc_init_kinds): Use gfc_fatal_error instead
	of fatal_error; add %< %> quotations.

gcc/testsuite/
* gfortran.dg/binding_label_tests_4.f03: Add dg-excess-errors.
* gfortran.dg/coarray_9.f90: Ditto.
* gfortran.dg/empty_label.f: Ditto.
* gfortran.dg/empty_label.f90: Ditto.

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index ef2aa69..159e626 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -209,7 +209,7 @@ coarray:
 
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error ("Coarrays disabled at %C, use -fcoarray= to enable");
+  gfc_fatal_error ("Coarrays disabled at %C, use %<-fcoarray=%> to enable");
   return MATCH_ERROR;
 }
 
@@ -592,7 +592,7 @@ coarray:
 
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error ("Coarrays disabled at %C, use -fcoarray= to enable");
+  gfc_fatal_error ("Coarrays disabled at %C, use %<-fcoarray=%> to enable");
   goto cleanup;
 }
 
diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 6f1fe3f..034b329 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1482,8 +1482,8 @@ check_co_collective (gfc_expr *a, gfc_expr *image_idx, gfc_expr *stat,
 
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error ("Coarrays disabled at %L, use -fcoarray= to enable",
-		   &a->where);
+  gfc_fatal_error_1 ("Coarrays disabled at %L, use -fcoarray= to enable",
+			 &a->where);
   return false;
 }
 
@@ -2569,7 +2569,7 @@ gfc_check_lcobound (gfc_expr *coarray, gfc_expr *dim, gfc_expr *kind)
 {
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error ("Coarrays disabled at %C, use -fcoarray= to enable");
+  gfc_fatal_error ("Coarrays disabled at %C, use %<-fcoarray=%> to enable");
   return false;
 }
 
@@ -4847,7 +4847,7 @@ gfc_check_image_index (gfc_expr *coarray, gfc_expr *sub)
 
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error ("Coarrays disabled at %C, use -fcoarray= to enable");
+  gfc_fatal_error ("Coarrays disabled at %C, use %<-fcoarray=%> to enable");
   return false;
 }
 
@@ -4885,7 +4885,7 @@ gfc_check_num_images (gfc_expr *distance, gfc_expr *failed)
 {
   if (gfc_option.coarray == GFC_FCOARRAY_NONE)
 {
-  gfc_fatal_error (Coarrays disabled at %C, use -fcoarray= to 

Re: Follow-up to PR51471

2014-11-15 Thread Eric Botcazou
 IIRC, fill_eager and its related friends are all speculative in some way
 and aren't those precisely the ones that are causing us problems.   Also
 note we have backends working around this stuff in fairly blunt ways:

I'd say that the PA back-end went a bit too far here, especially if it marks 
some insns of the epilogue as frame-related.  dwarf2cfi.c has special code to 
handle delay slots (SEQUENCEs) so it's not an all-or-nothing game.

 Given architectural difficulties of delay slots on modern processors,
 would it be that painful to just not allow filling slots with frame
 insns and let dbr try to find something else or drop in a nop?  I
 wouldn't be all that surprised if there wasn't a measurable performance
 difference on something like a modern Sparc.

Yes, modern SPARCs have (short) branches without delay slots.  But the other 
big contender is MIPS here and the story might be different for it.

-- 
Eric Botcazou


Re: Drop target_option_node reconstruction logic.

2014-11-15 Thread Richard Biener
On November 14, 2014 8:13:15 PM CET, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
this patch kills lto's code to rebuild DECL_FUNCTION_SPECIFIC_TARGET from
target attributes.  This code was never complete and it should be a no-op
now that we save the target nodes.
It also makes free_lang_data_in_decl actually annotate all function bodies
with a default option node.  The reason is that when LTOing units, one
compiled with default settings and one, say, with -msse3, we want to keep
these functions preserved.

Incrementally I will proceed with similar changes for optimization nodes.

Bootstrapped/regtested ppc64-linux, earlier version tested at x86_64-linux
and firefox LTO, OK?

OK.

Thanks,
Richard.

Honza

   * lto.c (lto_read_decls): Do not rebuild
   DECL_FUNCTION_SPECIFIC_TARGET.
   * tree.c (free_lang_data_in_decl): Annotate all function bodies with
   DECL_FUNCTION_SPECIFIC_TARGET.
Index: lto/lto.c
===
--- lto/lto.c  (revision 217571)
+++ lto/lto.c  (working copy)
@@ -1935,15 +1935,6 @@
 if (TREE_CODE (t) == INTEGER_CST
  && !TREE_OVERFLOW (t))
   cache_integer_cst (t);
-/* Re-build DECL_FUNCTION_SPECIFIC_TARGET, we need that
-   for both WPA and LTRANS stage.  */
-if (TREE_CODE (t) == FUNCTION_DECL)
-  {
-tree attr = lookup_attribute ("target", DECL_ATTRIBUTES (t));
-if (attr)
-  targetm.target_option.valid_attribute_p
-  (t, NULL_TREE, TREE_VALUE (attr), 0);
-  }
 /* Register TYPE_DECLs with the debuginfo machinery.  */
 if (!flag_wpa
  && TREE_CODE (t) == TYPE_DECL)
Index: tree.c
===
--- tree.c (revision 217571)
+++ tree.c (working copy)
@@ -5115,6 +5115,9 @@
the PARM_DECL will be used in the function's body).  */
 for (t = DECL_ARGUMENTS (decl); t; t = TREE_CHAIN (t))
   DECL_CONTEXT (t) = decl;
+if (!DECL_FUNCTION_SPECIFIC_TARGET (decl))
+  DECL_FUNCTION_SPECIFIC_TARGET (decl)
+= target_option_default_node;
   }
 
   /* DECL_SAVED_TREE holds the GENERIC representation for DECL.




Re: [Patch, Fortran] Convert gfc_fatal_error to common diagnostics

2014-11-15 Thread FX
 Build and regtested on x86-64-gnu-linux.
 OK for the trunk?

Please document, in the source, the difference between gfc_fatal_error and 
gfc_fatal_error_1. They currently have the same generic description, which 
wouldn’t help people writing new front-end code to know which one to use. 
Moreover, if the transition will not be complete soon (or indeterminate), it 
should be added to the wiki’s list of partial transitions.

Other than that, OK, and thanks for doing this tedious work.

FX



Re: [GRAPHITE, PATCH] Ping: Loop unroll and jam optimization

2014-11-15 Thread Mircea Namolaru
Stage 1 is about to close (very soon). Even though there is not much new code
(basically, the new code computes the separation class option for the AST
build), I am not sure that the patch qualifies for stage 2.

Unroll-and-jam (strip mining) generates very nice code for small kernels, for
both constant and non-constant loop bounds, and this is an argument for the new
isl-based code generator. Otherwise I'm afraid that the generated code looks
very similar to the cloog-generated one: an inner loop with min/max bounds that
GCC doesn't optimize further, which prevents the expected advantages of strip
mining (register reuse and scalar reduction, instruction scheduling, etc.).
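
To make the transformation concrete, here is a hand-written sketch of unroll-and-jam on a small kernel. It is not output of the patch (which works on isl schedules, not on source); the kernel and the names matvec_plain and matvec_unroll_jam are chosen only for this example. It shows the register reuse mentioned above: after unrolling the outer loop by 2 and fusing the inner loops, each x[j] is read once for two rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<long> > matrix;

// Reference loop nest: y[i] = sum over j of a[i][j] * x[j].
static std::vector<long>
matvec_plain (const matrix &a, const std::vector<long> &x)
{
  std::vector<long> y (a.size (), 0);
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      assert (a[i].size () == x.size ());   // every row matches x
      for (std::size_t j = 0; j < x.size (); ++j)
        y[i] += a[i][j] * x[j];
    }
  return y;
}

// Same computation with the outer loop unrolled by 2 and the two copies of
// the inner loop fused ("jammed"), so each x[j] is loaded once per two rows.
static std::vector<long>
matvec_unroll_jam (const matrix &a, const std::vector<long> &x)
{
  std::vector<long> y (a.size (), 0);
  std::size_t i = 0;
  for (; i + 1 < a.size (); i += 2)
    {
      long s0 = 0, s1 = 0;      // scalar accumulators, candidates for registers
      for (std::size_t j = 0; j < x.size (); ++j)
        {
          const long xj = x[j];
          s0 += a[i][j] * xj;
          s1 += a[i + 1][j] * xj;
        }
      y[i] = s0;
      y[i + 1] = s1;
    }
  for (; i < a.size (); ++i)    // remainder row when the row count is odd
    for (std::size_t j = 0; j < x.size (); ++j)
      y[i] += a[i][j] * x[j];
  return y;
}
```

The remainder loop is the hand-written counterpart of the partial tile; the separation class in the patch exists so that the full tiles can be generated without min/max bounds.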

ok for trunk ?

Thanks, Mircea

Index: gcc/graphite-poly.h
===
--- gcc/graphite-poly.h	(revision 217013)
+++ gcc/graphite-poly.h	(working copy)
@@ -349,6 +349,9 @@
   poly_scattering_p _saved;
   isl_map *saved;
 
+  /* For tiling, the map for computing the separating class.  */
+  isl_map *map_sepclass;
+
   /* True when this PBB contains only a reduction statement.  */
   bool is_reduction;
 };
Index: gcc/graphite.c
===
--- gcc/graphite.c	(revision 217013)
+++ gcc/graphite.c	(working copy)
@@ -383,7 +383,8 @@
   || flag_loop_strip_mine
   || flag_graphite_identity
   || flag_loop_parallelize_all
-  || flag_loop_optimize_isl)
+  || flag_loop_optimize_isl
+  || flag_loop_unroll_jam)
 flag_graphite = 1;
 
   return flag_graphite != 0;
Index: gcc/common.opt
===
--- gcc/common.opt	(revision 217013)
+++ gcc/common.opt	(working copy)
@@ -1328,6 +1328,10 @@
 Common Report Var(flag_loop_block) Optimization
 Enable Loop Blocking transformation
 
+floop-unroll-and-jam
+Common Report Var(flag_loop_unroll_jam) Optimization
+Enable Loop Unroll Jam transformation
+ 
 fgnu-tm
 Common Report Var(flag_tm)
 Enable support for GNU transactional memory
Index: gcc/graphite-optimize-isl.c
===
--- gcc/graphite-optimize-isl.c	(revision 217013)
+++ gcc/graphite-optimize-isl.c	(working copy)
@@ -186,7 +186,7 @@
   PartialSchedule = isl_band_get_partial_schedule (Band);
   *Dimensions = isl_band_n_member (Band);
 
-  if (DisableTiling)
+  if (DisableTiling || flag_loop_unroll_jam)
 return PartialSchedule;
 
   /* It does not make any sense to tile a band with just one dimension.  */
@@ -241,7 +241,9 @@
constant number of iterations, if the number of loop iterations at
DimToVectorize can be devided by VectorWidth. The default VectorWidth is
currently constant and not yet target specific. This function does not reason
-   about parallelism.  */
+   about parallelism.
+
+  */
 static isl_map *
 getPrevectorMap (isl_ctx *ctx, int DimToVectorize,
 		 int ScheduleDimensions,
@@ -305,8 +307,98 @@
   isl_constraint_set_constant_si (c, VectorWidth - 1);
   TilingMap = isl_map_add_constraint (TilingMap, c);
 
-  isl_map_dump (TilingMap);
+  return TilingMap;
+}
 
+/* Compute an auxiliary map to getPrevectorMap, for computing the separating 
+   class defined by full tiles.  Used in graphite_isl_ast_to_gimple.c to set the 
+   corresponding option for AST build.
+
+   The map (for VectorWidth=4):
+
+   [i,j] -> [it,j,ip] : it % 4 = 0 and it <= ip <= it + 3 and it + 3 = i and
+ip >= 0
+
+   The image of this map is the separation class. The range of this map includes
+   all the i that are multiple of 4 in the domain beside the greater one. 
+
+ */ 
+static isl_map *
+getPrevectorMap_full (isl_ctx *ctx, int DimToVectorize,
+		 int ScheduleDimensions,
+		 int VectorWidth)
+{
+  isl_space *Space;
+  isl_local_space *LocalSpace, *LocalSpaceRange;
+  isl_set *Modulo;
+  isl_map *TilingMap;
+  isl_constraint *c;
+  isl_aff *Aff;
+  int PointDimension; /* ip */
+  int TileDimension;  /* it */
+  isl_val *VectorWidthMP;
+  int i;
+
+  /* assert (0 <= DimToVectorize && DimToVectorize < ScheduleDimensions);*/
+
+  Space = isl_space_alloc (ctx, 0, ScheduleDimensions, ScheduleDimensions + 1);
+  TilingMap = isl_map_universe (isl_space_copy (Space));
+  LocalSpace = isl_local_space_from_space (Space);
+  PointDimension = ScheduleDimensions;
+  TileDimension = DimToVectorize;
+
+  /* Create an identity map for everything except DimToVectorize and the 
+ point loop. */
+  for (i = 0; i < ScheduleDimensions; i++)
+{
+  if (i == DimToVectorize)
+continue;
+
+  c = isl_equality_alloc (isl_local_space_copy (LocalSpace));
+
+  isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1);
+  isl_constraint_set_coefficient_si (c, isl_dim_out, i, 1);
+
+  TilingMap = isl_map_add_constraint (TilingMap, c);
+}
+
+  /* it % 'VectorWidth' = 0  */
+  LocalSpaceRange = isl_local_space_range (isl_local_space_copy (LocalSpace));
+  Aff = 

Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-15 Thread David Malcolm
On Thu, 2014-11-13 at 11:45 +0100, Richard Biener wrote:
 On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm dmalc...@redhat.com wrote:
  On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
  On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek ja...@redhat.com wrote:
   On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
   On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
 To be constructive here - the above case is from within a
 GIMPLE_ASSIGN case label
 and thus I'd have expected

 case GIMPLE_ASSIGN:
   {
     gassign *a1 = as_a <gassign *> (s1);
     gassign *a2 = as_a <gassign *> (s2);
     lhs1 = gimple_assign_lhs (a1);
     lhs2 = gimple_assign_lhs (a2);
     if (TREE_CODE (lhs1) != SSA_NAME
         && TREE_CODE (lhs2) != SSA_NAME)
       return (operand_equal_p (lhs1, lhs2, 0)
               && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
                                                gimple_assign_rhs1 (a2)));
     else if (TREE_CODE (lhs1) == SSA_NAME
              && TREE_CODE (lhs2) == SSA_NAME)
       return vn_valueize (lhs1) == vn_valueize (lhs2);
     return false;
   }

 instead.  That's the kind of changes I have expected and have 
 approved of.
   
But even that looks like just adding extra work for all developers, 
with no
gain.  You only have to add extra code and extra temporaries, in 
switches
typically also have to add {} because of the temporaries and thus 
extra
indentation level, and it doesn't simplify anything in the code.
  
   The branch attempts to use the C++ typesystem to capture information
   about the kinds of gimple statement we expect, both:
 (A) so that the compiler can detect type errors, and
 (B) as a comprehension aid to the human reader of the code
  
   The ideal here is when function params and struct field can be
   strengthened from gimple to a subclass ptr.  This captures the
   knowledge that every use of a function or within a struct has a given
   gimple code.
  
   I just don't like all the as_a/is_a stuff enforced everywhere,
   it means more typing, more temporaries, more indentation.
   So, as I view it, instead of the checks being done cheaply (yes, I think
   the gimple checking as we have right now is very cheap) under the
   hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
   put the burden on the developers, who has to check that manually through
   the as_a/is_a stuff everywhere, more typing and uglier syntax.
   I just don't see that as a step forward, instead a huge step backwards.
   But perhaps I'm alone with this.
   Can you e.g. compare the size of - lines in your patchset combined, and
   size of + lines in your patchset?  As in, if your changes lead to less
   typing or more.
 
  I see two ways out here.  One is to add overloads to all the functions
  taking the special types like
 
  tree
  gimple_assign_rhs1 (gimple *);
 
  or simply add
 
  gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
 
  into a gimple-compat.h header which you include in places that
  are not converted nicely.
 
  Thanks for the suggestions.
 
  Am I missing something, or is the gimple-compat.h idea above not valid C
  ++?
 
  Note that gimple is still a typedef to
gimple_statement_base *
  (as noted before, the gimple -> gimple * change would break everyone
  else's patches, so we talked about that as a followup patch for early
  stage3).
 
  Given that, if I try to create an operator () outside of a class, I
  get this error:
 
  ‘gassign* operator()(gimple)’ must be a nonstatic member function
 
  which is emitted from cp/decl.c's grok_op_properties:
/* An operator function must either be a non-static member function
   or have at least one parameter of a class, a reference to a class,
   an enumeration, or a reference to an enumeration.  13.4.0.6 */
 
  I tried making it a member function of gimple_statement_base, but that
  doesn't work either: we want a conversion
from a gimple_statement_base * to a gassign *, not
from a gimple_statement_base   to a gassign *.
 
  Is there some syntactic trick here that I'm missing?  Sorry if I'm being
  dumb (I can imagine there's a way of doing it by making gimple become
  some kind of wrapped ptr class, but that way lies madness, surely).
 
 Hmm.
 
 struct assign;
 struct base {
   operator assign *() const { return (assign *)this; }
 };
 struct assign : base {
 };
 
 void foo (assign *);
 void bar (base *b)
 {
   foo (b);
 }
 
 doesn't work, but
 
 void bar (base b)
 {
   foo (b);
 }
 
 does.  Indeed C++ doesn't seem to provide what is necessary
 for the compat trick :(
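
 The point can be checked at compile time, and the overload route still works. The sketch below is a toy model, not GCC code: base and assign stand in for the gimple classes, the code field and the assign_lhs accessors are hypothetical, and the static_asserts only restate what the two bar() variants above show.

```cpp
#include <cassert>
#include <type_traits>

struct assign;

struct base
{
  int code;                     // 1 marks an assignment in this toy model
  operator assign * ();         // considered for base objects only
};

struct assign : base
{
  int lhs;
};

inline base::operator assign * ()
{
  return static_cast<assign *> (this);
}

// The conversion function applies to an object of class type ...
static_assert (std::is_convertible<base &, assign *>::value,
               "a base object converts through the member operator");
// ... but never to a pointer, because a pointer is not a class type.
static_assert (!std::is_convertible<base *, assign *>::value,
               "a base pointer does not convert implicitly");

// Hypothetical accessor on the strengthened type.
inline int
assign_lhs (assign *a)
{
  return a->lhs;
}

// Hypothetical compat overload: accepts the base pointer, checks the code
// at run time, and forwards to the typed accessor.
inline int
assign_lhs (base *b)
{
  assert (b->code == 1);
  return assign_lhs (static_cast<assign *> (b));
}
```

 Callers holding either pointer type can then call assign_lhs unchanged; overload resolution picks the checked variant only for the base pointer.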
 
 So the gimple-compat.h header would need to provide
 additional overloads for the affected functions like
 
 inline 

Re: [Patch, Fortran] Convert gfc_fatal_error to common diagnostics

2014-11-15 Thread Tobias Burnus

FX wrote:
Please document, in the source, the difference between gfc_fatal_error 
and gfc_fatal_error_1. They currently have the same generic 
description, which wouldn’t help people writing new front-end code to 
know which one to use. Moreover, if the transition will not be 
complete soon (or indeterminate), it should be added to the wiki’s 
list of partial transitions.


Well, the diagnostics conversion is ongoing and was only delayed by the 
review of the line-map part of the last patch. (That part was 
required for %C support; the last patch added the _2 variants for 
gfc_error_now/gfc_warning_now.) In any case, the support for %L should 
be ready soon. When that's in, there won't be any need for 
gfc*_error*/gfc*_warning* duplication any more.


Support for buffered output (and discarding it) will take a bit longer, 
but is also planned for GCC 5. This support is required for 
gfc_error/gfc_warning and, hence, for most diagnostic output.


Thus, I don't think it should be put into the wiki. (Admittedly, I also 
do not know which page you are referring to.) In any case, there are 
several PRs about issues fixed by the on-going change to the common 
diagnostics.


When that's done, there are still additional tasks for diagnostic 
improvements left (see PRs), all of which require the common diagnostics 
to be in place.



Other than that, OK, and thanks for doing this tedious work.


Thanks for the review! For the diagnostic changes, you mainly have to 
thank Manuel, who is the driving force behind all diagnostic work (C, 
C++) and who did the lion's share of the Fortran front end work (including 
the required changes in the common part).


Tobias

PS: Attached is the error.c part of the committed patch (r217600); I 
added a few lines above the functions _2/_1 to make clear when to use 
them. I hope that we can soon remove the old version.
Index: gcc/fortran/error.c
===
--- gcc/fortran/error.c	(revision 217599)
+++ gcc/fortran/error.c	(working copy)
@@ -933,6 +933,7 @@ gfc_notify_std (int std, const char *gmsgid, ...)
 
 
 /* Immediate warning (i.e. do not buffer the warning).  */
+/* Use gfc_warning_now_2 instead, unless gmsgid contains a %L.  */
 
 void
 gfc_warning_now (const char *gmsgid, ...)
@@ -1086,6 +1087,7 @@ gfc_diagnostic_finalizer (diagnostic_context *cont
 }
 
 /* Immediate warning (i.e. do not buffer the warning).  */
+/* This function uses the common diagnostics, but does not support %L, yet.  */
 
 bool
 gfc_warning_now_2 (int opt, const char *gmsgid, ...)
@@ -1104,6 +1106,7 @@ gfc_warning_now_2 (int opt, const char *gmsgid, ..
 }
 
 /* Immediate warning (i.e. do not buffer the warning).  */
+/* This function uses the common diagnostics, but does not support %L, yet.  */
 
 bool
 gfc_warning_now_2 (const char *gmsgid, ...)
@@ -1122,6 +1125,7 @@ gfc_warning_now_2 (const char *gmsgid, ...)
 
 
 /* Immediate error (i.e. do not buffer).  */
+/* This function uses the common diagnostics, but does not support %L, yet.  */
 
 void
 gfc_error_now_2 (const char *gmsgid, ...)
@@ -1135,6 +1139,24 @@ gfc_error_now_2 (const char *gmsgid, ...)
   va_end (argp);
 }
 
+
+/* Fatal error, never returns.  */
+/* This function uses the common diagnostics, but does not support %L, yet.  */
+
+void
+gfc_fatal_error (const char *gmsgid, ...)
+{
+  va_list argp;
+  diagnostic_info diagnostic;
+
+  va_start (argp, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  report_diagnostic (&diagnostic);
+  va_end (argp);
+
+  gcc_unreachable ();
+}
+
 /* Clear the warning flag.  */
 
 void
@@ -1213,6 +1235,7 @@ warning:
 
 
 /* Immediate error.  */
+/* Use gfc_error_now_2 instead, unless gmsgid contains a %L.  */
 
 void
 gfc_error_now (const char *gmsgid, ...)
@@ -1243,9 +1266,10 @@ gfc_error_now (const char *gmsgid, ...)
 
 
 /* Fatal error, never returns.  */
+/* Use gfc_fatal_error instead, unless gmsgid contains a %L.  */
 
 void
-gfc_fatal_error (const char *gmsgid, ...)
+gfc_fatal_error_1 (const char *gmsgid, ...)
 {
   va_list argp;
 


Re: [PATCH] Add force option to find_best_rename_reg in regrename pass

2014-11-15 Thread Eric Botcazou
 It looks at register that respect the constraints of all the instructions in
 the set and tries to pick one in the preferred class for all the
 instructions involved. This is generally useful for any pass that wants to
 do register renaming. However it also contains some logic to only select
 the register that also haven't been used for a longer time than the
 register that should be replaced. This bit is specific to the register
 renaming pass and makes the function unusable for this new pass as a result
 which forces us to do a copy of the function.

 This patch adds an extra parameter to skip this check and only consider the
 constraints and tries to pick a register in the preferred class.

OK on principle but...

 2014-11-14  Thomas Preud'homme  thomas.preudho...@arm.com
 
 * regrename.c (find_best_rename_reg): Rename to ...
 (find_rename_reg): This. Also add a parameter to skip tick check.
 * regrename.h: Likewise.
 * config/c6x/c6x.c: Adapt to above renaming.

Missing function in config/c6x/c6x.c entry.

 @@ -408,8 +410,13 @@ find_best_rename_reg (du_head_p this_head, enum reg_class super_class,
   ((pass == 0
    && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
  best_new_reg))
 -   || tick[best_new_reg] > tick[new_reg]))
 - best_new_reg = new_reg;
 +   || !best_rename || tick[best_new_reg] > tick[new_reg]))
 + {
 +   if (best_rename)
 + best_new_reg = new_reg;
 +   else
 + return new_reg;
 + }
   }
 if (pass == 0 && best_new_reg != old_reg)
   break;

Please write it like so:

  if (!check_new_reg_p (old_reg, new_reg, this_head, *unavailable))
    continue;

  if (!best_rename)
    return new_reg;

  /* In the first pass, we force the renaming of registers that
     don't belong to PREFERRED_CLASS to registers that do, even
     though the latters were used not very long ago.  */
  if ((pass == 0
       && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
                              best_new_reg))
      || tick[best_new_reg] > tick[new_reg])
    best_new_reg = new_reg;
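
For readers without regrename.c at hand, the control flow requested above can be modelled in isolation. This is a simplified sketch under stated assumptions, not the actual function: pick_rename_reg is a made-up name, the candidate list stands for registers that already passed check_new_reg_p, and the preferred-class handling of the first pass is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: CANDIDATES are registers that already passed the constraint
// check, in search order; TICK[r] is the last time register r was used.
// Preferred-class handling (the pass == 0 case) is omitted on purpose.
static int
pick_rename_reg (int old_reg, const std::vector<int> &candidates,
                 const std::vector<int> &tick, bool best_rename)
{
  int best_new_reg = old_reg;
  for (std::size_t i = 0; i < candidates.size (); ++i)
    {
      const int new_reg = candidates[i];
      assert (new_reg >= 0 && (std::size_t) new_reg < tick.size ());

      if (!best_rename)
        return new_reg;         // first acceptable register, no tick check

      if (tick[best_new_reg] > tick[new_reg])
        best_new_reg = new_reg; // prefer the register unused for longer
    }
  return best_new_reg;
}
```

With best_rename false the function returns as soon as a register satisfies the constraints; with best_rename true it keeps scanning for the one with the smallest tick.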

-- 
Eric Botcazou


Re: [Patch, Fortran] Convert gfc_fatal_error to common diagnostics

2014-11-15 Thread FX
 Thus, I don't think it should be put into the wiki. (Admittedly, I also do 
 not know which page you are referring to.) In any case, there are several PRs 
 about issues fixed by the on-going change to the common diagnostics.

OK. (The page I was referring to is here: 
https://gcc.gnu.org/wiki/Partial_Transitions)


FX

[PATCH] Fix Cilk+ ICEs with overflow builtins (PR middle-end/63884)

2014-11-15 Thread Marek Polacek
The problem here is that the Cilk+ code wasn't prepared to handle
internal calls that the new overflow builtins entail.  Fixed by
checking that the CALL_EXPR_FN isn't NULL.

Looking at cilk-plus.exp, I think this file will need some tweaks
now that the C default is gnu11...

Bootstrapped/regtested on powerpc64-linux, ok for trunk?

2014-11-15  Marek Polacek  pola...@redhat.com

PR middle-end/63884
c-family/
* array-notation-common.c (is_sec_implicit_index_fn): Return false
for NULL fndecl.
(extract_array_notation_exprs): Return for NULL node.
testsuite/
* c-c++-common/cilk-plus/AN/pr63884.c: New test.

diff --git gcc/c-family/array-notation-common.c gcc/c-family/array-notation-common.c
index f8bce04..cb5708c 100644
--- gcc/c-family/array-notation-common.c
+++ gcc/c-family/array-notation-common.c
@@ -35,6 +35,9 @@ along with GCC; see the file COPYING3.  If not see
 bool
 is_sec_implicit_index_fn (tree fndecl)
 {
+  if (!fndecl)
+return false;
+
   if (TREE_CODE (fndecl) == ADDR_EXPR)
 fndecl = TREE_OPERAND (fndecl, 0);
 
@@ -327,6 +330,9 @@ extract_array_notation_exprs (tree node, bool ignore_builtin_fn,
  vec<tree, va_gc> **array_list)
 {
   size_t ii = 0;  
+
+  if (!node)
+return;
   if (TREE_CODE (node) == ARRAY_NOTATION_REF)
 {
   vec_safe_push (*array_list, node);
diff --git gcc/testsuite/c-c++-common/cilk-plus/AN/pr63884.c gcc/testsuite/c-c++-common/cilk-plus/AN/pr63884.c
index e69de29..c876a8d 100644
--- gcc/testsuite/c-c++-common/cilk-plus/AN/pr63884.c
+++ gcc/testsuite/c-c++-common/cilk-plus/AN/pr63884.c
@@ -0,0 +1,10 @@
+/* PR middle-end/63884 */
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus" } */
+
+int
+foo (int x, int y)
+{
+  int r;
+  return __builtin_sadd_overflow (x, y, &r);
+}

Marek


Add log message for max-completely-peeled-times

2014-11-15 Thread Eric Botcazou
try_unroll_loop_completely logs a message for max-completely-peeled-insns:

  else if (unr_insns
           > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file, "Not unrolling loop %d: "
                 "(--param max-completely-peeled-insns limit reached).\n",
                 loop->num);
      return false;
    }

but not for max-completely-peeled-times:

  max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
  if (n_unroll > max_unroll)
    return false;

so the attached patch adds one.
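
The shape of the added check, stripped of GCC internals, is roughly the following. This is a stand-alone sketch, not the patched function: peel_times_ok is a hypothetical name, and a std::ostream pointer stands in for dump_file so that the message can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the check in try_unroll_loop_completely: returns
// false when N_UNROLL exceeds MAX_UNROLL and, if a detailed dump is active,
// says why.  DUMP may be null, like dump_file when no dump was requested.
static bool
peel_times_ok (std::ostream *dump, bool details, int loop_num,
               unsigned long n_unroll, unsigned long max_unroll)
{
  assert (loop_num >= 0);       // loop numbers are non-negative
  if (n_unroll > max_unroll)
    {
      if (dump && details)
        *dump << "Not unrolling loop " << loop_num
              << " (--param max-completely-peeled-times limit reached).\n";
      return false;
    }
  return true;
}
```

The early return stays silent unless a detailed dump was requested, matching the existing message for the insns limit.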

Tested on x86_64-suse-linux, applied on the mainline as obvious.


2014-11-15  Eric Botcazou  ebotca...@adacore.com

* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Add log message
for max-completely-peeled-times limit.


-- 
Eric Botcazou
Index: tree-ssa-loop-ivcanon.c
===
--- tree-ssa-loop-ivcanon.c	(revision 217538)
+++ tree-ssa-loop-ivcanon.c	(working copy)
@@ -674,7 +674,7 @@ try_unroll_loop_completely (struct loop
 			HOST_WIDE_INT maxiter,
 			location_t locus)
 {
-  unsigned HOST_WIDE_INT n_unroll = 0, ninsns, max_unroll, unr_insns;
+  unsigned HOST_WIDE_INT n_unroll = 0, ninsns, unr_insns;
   gimple cond;
   struct loop_size size;
   bool n_unroll_found = false;
@@ -720,9 +720,14 @@ try_unroll_loop_completely (struct loop
   if (!n_unroll_found)
 return false;
 
-  max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
-  if (n_unroll > max_unroll)
-return false;
+  if (n_unroll > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Not unrolling loop %d "
+		 "(--param max-completely-peeled-times limit reached).\n",
+		 loop->num);
+  return false;
+}
 
   if (!edge_to_cancel)
 edge_to_cancel = loop_edge_to_cancel (loop);


openacc kernels directive -- initial support

2014-11-15 Thread Tom de Vries

Hi,

I'm submitting a patch series with initial support for the oacc kernels 
directive.

The patch series uses pass_parallelize_loops to implement parallelization of 
loops in the oacc kernels region.


The patch series consists of these 8 patches:
...
1  Expand oacc kernels after pass_build_ealias
2  Add pass_oacc_kernels
3  Add pass_ch_oacc_kernels to pass_oacc_kernels
4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
5  Add pass_loop_im to pass_oacc_kernels
6  Add pass_ccp to pass_oacc_kernels
7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
8  Do simple omp lowering for no address taken var
...

The patch series does not yet apply cleanly to trunk, since it's dependent on 
the oacc middle end changes present in the gomp-4_0-branch, already submitted by 
Thomas for trunk.


Furthermore, it's dependent on an assert fix submitted for trunk ('Fix 
gcc_assert in expand_omp_for_static_chunk' @ 
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01149.html ).


The patch series is intended for trunk, but - given the dependency on the oacc 
middle end changes - has been bootstrapped for x86_64 on top of gomp-4_0-branch.


I'll post the patch series in reply to this email.

Thanks,
- Tom

[ FTR  In order to get clean libgomp and goacc test results in gomp-4_0-branch, 
to have a good basis for testing, I used the following patch set:


 Don't allow flto-partition=balance for fopenacc
   Unsubmitted. This works around a compilation problem for
   libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c that I ran into on
   our internal dev branch.  I'll investigate whether I can reproduce with
   gomp-4_0-branch asap.

 Mark fopenacc as LTO option
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00085.html   

 Only use nvidia accelerator if present
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00247.html

 Set default LIBGOMP_PLUGIN_PATH
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00242.html
]


Fix ICE on pragma Loop_Optimize in Ada

2014-11-15 Thread Eric Botcazou
The attached testcase triggers an ICE because it contains pragma Loop_Optimize, 
which survives down to RTL expansion.  There are 2 bugs: first, this should 
not ICE but issue an "ignoring loop annotation" warning (this was accidentally 
disabled in https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00681.html) and, 
second, the code must look for IFN_ANNOTATE calls in the latch as well.

Tested on x86_64-suse-linux, applied on the mainline as obvious.
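
As a rough stand-alone sketch of the second fix (scan the header block and then, if there is one, the latch block), assuming a toy model in which each block is just an array of annotation kinds. None of these types or names are GCC's; they are made up for the example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical annotation kinds, loosely mirroring annot_expr_*_kind.  */
enum annot_kind { ANNOT_IVDEP, ANNOT_NO_VECTOR, ANNOT_VECTOR };

/* Hypothetical subset of the per-loop flags set by the annotations.  */
struct loop_flags
{
  int safelen;
  bool dont_vectorize;
  bool force_vectorize;
};

/* Propagate the N annotations found in one block to FLAGS.  */
static void
apply_block_annotations (const enum annot_kind *annots, size_t n,
                         struct loop_flags *flags)
{
  for (size_t i = 0; i < n; i++)
    switch (annots[i])
      {
      case ANNOT_IVDEP:
        flags->safelen = INT_MAX;
        break;
      case ANNOT_NO_VECTOR:
        flags->dont_vectorize = true;
        break;
      case ANNOT_VECTOR:
        flags->force_vectorize = true;
        break;
      }
}

/* First look into the header, then into the latch, if any (LATCH may be
   NULL, as loop->latch may be in the patch).  */
static struct loop_flags
collect_loop_annotations (const enum annot_kind *header, size_t n_header,
                          const enum annot_kind *latch, size_t n_latch)
{
  struct loop_flags flags = { 0, false, false };

  apply_block_annotations (header, n_header, &flags);
  if (latch != NULL)
    apply_block_annotations (latch, n_latch, &flags);
  return flags;
}
```

Factoring the per-block scan into its own helper is what lets the same code run on both blocks, which is the shape of replace_loop_annotate_in_block in the patch below.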


2014-11-15  Eric Botcazou  ebotca...@adacore.com

* tree-cfg.c (replace_loop_annotate_in_block): New function extracted
from...
(replace_loop_annotate): ...here.  Call it on the header and on the
latch block, if any.  Restore proper behavior of final cleanup.


2014-11-15  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/opt44.ad[sb]: New test.


-- 
Eric Botcazou
Index: tree-cfg.c
===
--- tree-cfg.c	(revision 217538)
+++ tree-cfg.c	(working copy)
@@ -265,13 +265,56 @@ build_gimple_cfg (gimple_seq seq)
   discriminator_per_locus = NULL;
 }
 
+/* Look for ANNOTATE calls with loop annotation kind in BB; if found, remove
+   them and propagate the information to LOOP.  We assume that the annotations
+   come immediately before the condition in BB, if any.  */
+
+static void
+replace_loop_annotate_in_block (basic_block bb, struct loop *loop)
+{
+  gimple_stmt_iterator gsi = gsi_last_bb (bb);
+  gimple stmt = gsi_stmt (gsi);
+
+  if (!(stmt && gimple_code (stmt) == GIMPLE_COND))
+return;
+
+  for (gsi_prev_nondebug (gsi); !gsi_end_p (gsi); gsi_prev (gsi))
+{
+  stmt = gsi_stmt (gsi);
+  if (gimple_code (stmt) != GIMPLE_CALL)
+	break;
+  if (!gimple_call_internal_p (stmt)
+	  || gimple_call_internal_fn (stmt) != IFN_ANNOTATE)
+	break;
+
+  switch ((annot_expr_kind) tree_to_shwi (gimple_call_arg (stmt, 1)))
+	{
+	case annot_expr_ivdep_kind:
+	  loop->safelen = INT_MAX;
+	  break;
+	case annot_expr_no_vector_kind:
+	  loop->dont_vectorize = true;
+	  break;
+	case annot_expr_vector_kind:
+	  loop->force_vectorize = true;
+	  cfun->has_force_vectorize_loops = true;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+
+  stmt = gimple_build_assign (gimple_call_lhs (stmt),
+  gimple_call_arg (stmt, 0));
+  gsi_replace (gsi, stmt, true);
+}
+}
 
 /* Look for ANNOTATE calls with loop annotation kind; if found, remove
them and propagate the information to the loop.  We assume that the
annotations come immediately before the condition of the loop.  */
 
 static void
-replace_loop_annotate ()
+replace_loop_annotate (void)
 {
   struct loop *loop;
   basic_block bb;
@@ -280,37 +323,12 @@ replace_loop_annotate ()
 
   FOR_EACH_LOOP (loop, 0)
 {
-  gsi = gsi_last_bb (loop->header);
-  stmt = gsi_stmt (gsi);
-  if (!(stmt && gimple_code (stmt) == GIMPLE_COND))
-	continue;
-  for (gsi_prev_nondebug (gsi); !gsi_end_p (gsi); gsi_prev (gsi))
-	{
-	  stmt = gsi_stmt (gsi);
-	  if (gimple_code (stmt) != GIMPLE_CALL)
-	break;
-	  if (!gimple_call_internal_p (stmt)
-	  || gimple_call_internal_fn (stmt) != IFN_ANNOTATE)
-	break;
-	  switch ((annot_expr_kind) tree_to_shwi (gimple_call_arg (stmt, 1)))
-	{
-	case annot_expr_ivdep_kind:
-	  loop->safelen = INT_MAX;
-	  break;
-	case annot_expr_no_vector_kind:
-	  loop->dont_vectorize = true;
-	  break;
-	case annot_expr_vector_kind:
-	  loop->force_vectorize = true;
-	  cfun->has_force_vectorize_loops = true;
-	  break;
-	default:
-	  gcc_unreachable ();
-	}
-	  stmt = gimple_build_assign (gimple_call_lhs (stmt),
-  gimple_call_arg (stmt, 0));
-	  gsi_replace (gsi, stmt, true);
-	}
+  /* First look into the header.  */
+  replace_loop_annotate_in_block (loop->header, loop);
+
+  /* Then look into the latch, if any.  */
+  if (loop->latch)
+	replace_loop_annotate_in_block (loop->latch, loop);
 }
 
   /* Remove IFN_ANNOTATE.  Safeguard for the case loop->latch == NULL.  */
@@ -320,10 +338,11 @@ replace_loop_annotate ()
 	{
 	  stmt = gsi_stmt (gsi);
 	  if (gimple_code (stmt) != GIMPLE_CALL)
-	break;
+	continue;
 	  if (!gimple_call_internal_p (stmt)
 	  || gimple_call_internal_fn (stmt) != IFN_ANNOTATE)
-	break;
+	continue;
+
 	  switch ((annot_expr_kind) tree_to_shwi (gimple_call_arg (stmt, 1)))
 	{
 	case annot_expr_ivdep_kind:
@@ -333,6 +352,7 @@ replace_loop_annotate ()
 	default:
 	  gcc_unreachable ();
 	}
+
 	  warning_at (gimple_location (stmt), 0, "ignoring loop annotation");
 	  stmt = gimple_build_assign (gimple_call_lhs (stmt),
   gimple_call_arg (stmt, 0));
-- { dg-do compile }
-- { dg-options "-O" }

package body Opt44 is

   procedure Addsub (X, Y : Sarray; R : out Sarray; N : Integer) is
   begin
  for I in Sarray'Range loop
 pragma Loop_Optimize (Ivdep);
 pragma Loop_Optimize (Vector);
 if N  0 then
   R(I) := 

Re: [PATCH 2/5] combine: handle I2 a parallel of two SETs

2014-11-15 Thread Segher Boessenkool
On Fri, Nov 14, 2014 at 08:35:48PM +0100, Bernd Schmidt wrote:
 On 11/14/2014 08:19 PM, Segher Boessenkool wrote:
 +  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
 + make those two SETs separate I1 and I2 insns, and make an I0 that is
 + the original I1.  */
 +  if (i0 == 0
 +      && GET_CODE (PATTERN (i2)) == PARALLEL
 +      && XVECLEN (PATTERN (i2), 0) >= 2
 +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
 +      && GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
 +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))
 +      && REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
 +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
 +      && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)
 
 Don't we have other code in combine checking the reg_used_between case?

It doesn't make any sense at all.  What I wanted to check is whether
splitting the parallel creates a conflict, but I woefully failed at that.
Will fix.

 +      && (XVECLEN (PATTERN (i2), 0) == 2
 +          || GET_CODE (XVECEXP (PATTERN (i2), 0, 2)) == CLOBBER))
 
 This probably wants to test for XVECLEN == 3 for the second case. Can 
 then drop the earlier test.

It needs to test there are exactly two SETs, any amount of clobbers, and
nothing else.
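
To make that condition concrete, here is a small stand-alone sketch of "exactly two SETs, any number of CLOBBERs, nothing else". It assumes, as the quoted patch does, that the two SETs are elements 0 and 1 of the PARALLEL. The enum and the function name are hypothetical; real code would look at GET_CODE of each XVECEXP element.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element kinds standing in for GET_CODE of each element of
   the PARALLEL; these are not GCC's rtx codes.  */
enum elt_kind { ELT_SET, ELT_CLOBBER, ELT_OTHER };

/* Return true if the N elements are two SETs followed only by CLOBBERs
   (possibly none).  */
static bool
two_sets_then_clobbers_p (const enum elt_kind *elts, size_t n)
{
  if (n < 2 || elts[0] != ELT_SET || elts[1] != ELT_SET)
    return false;

  /* Everything after the two SETs must be a CLOBBER; this also rejects
     a third SET.  */
  for (size_t i = 2; i < n; i++)
    if (elts[i] != ELT_CLOBBER)
      return false;

  return true;
}
```

Looping over the tail is what distinguishes this from testing only element 2, which would accept a PARALLEL whose fourth element is something other than a CLOBBER.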

 I think you also need to check that !reg_overlap_mentioned_p between the 
 two dests and the other set's sources.

Only the dest of the new I1 with the sources of the new I2, but yes.


Segher


Re: [PATCH][AArch64] LR register not used in leaf functions

2014-11-15 Thread Jiong Wang
2014-11-15 0:15 GMT+00:00 Andrew Pinski pins...@gmail.com:
 On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang jiong.w...@arm.com wrote:
 On 27/09/14 22:20, Kugan wrote:


 On 23/09/14 01:58, Jiong Wang wrote:

 +  /* If we decided that we didn't need a leaf frame pointer but then used
 + LR in the function, then we'll want a frame pointer after all, so
 + prevent this elimination to ensure a frame pointer is used.  */
 +  if (to == STACK_POINTER_REGNUM
 +      && flag_omit_leaf_frame_pointer
 +      && df_regs_ever_live_p (LR_REGNUM))
 +    return false;

 This breaks my build on aarch64-elf (with some local modifications)

Hi Andrew,

  Then what is your local modification?
  I think we need to figure out why there is an ICE after your local
modification.
  Can you please send me your local modification and a testcase if possible?


 aarch64_frame_pointer_required returns true but then we use LR but now
 aarch64_can_eliminate and aarch64_frame_pointer_required are
 inconsistent which is not a valid thing for LRA (and reload).

 This was mentioned in 
 https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
  IRA calls hook frame_pointer_required and it returns false. After
 that LRA calls can_eliminate hook and it returns false which means
 that fp can not be used for allocation and we should spill all pseudos
 assigned to it.

 Thanks,
 Andrew Pinski


Re: patch switching on LRA remat

2014-11-15 Thread H.J. Lu
On Fri, Nov 14, 2014 at 12:07 PM, Vladimir Makarov vmaka...@redhat.com wrote:
  The LRA rematerialization patch I submitted about a day ago broke H.J.'s
 32-bit bootstrap.  So I switched off the rematerialization right away.  The
 set for bootstrapping used by H.J. was very useful.  I've fixed several
 existing and potential bugs.

 Here is the patch fixing the bugs and switching LRA remat back on.  The patch
 was bootstrapped on x86-64 and i686 (using H.J.'s options).

 Committed as rev. 217588.

 2014-11-14  Vladimir Makarov  vmaka...@redhat.com

 * lra-int.h (lra_create_live_ranges): Add parameter.
 * lra-lives.c (temp_bitmap): Move higher.
 (initiate_live_solver): Move temp_bitmap initialization into
 lra_live_ranges_init.
 (finish_live_solver): Move temp_bitmap clearing into
 live_ranges_finish.
 (process_bb_lives): Add parameter.  Use it to control live info
 update and dead insn elimination.  Pass it to mark_regno_live and
 mark_regno_dead.
 (lra_create_live_ranges): Add parameter.  Pass it to
 process_bb_lives.
 (lra_live_ranges_init, lra_live_ranges_finish): See changes in
 initiate_live_solver and finish_live_solver.
 * lra-remat.c (do_remat): Process insn non-operand hard regs too.
 Use temp_bitmap to update avail_cands.
 * lra.c (lra): Pass new parameter to lra_create_live_ranges.  Move
 check with lra_need_for_spill_p after live range pass.  Switch on
 rematerialization pass.

Unfortunately, it failed to bootstrap ia32 GCC:

https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00392.html

You can bootstrap ia32 GCC on Linux/x86-64:

1. Install ia32 binutils under /foo/bar.
2. Set PATH=/foo/bar:$PATH
3. Install 32-bit libraries used by GCC, glibc, mpfr, gmp, libmpc. ...
4. Configure GCC with

CC="gcc -m32" CXX="g++ -m32" ../src-trunk/configure \
--with-arch=core2 --with-cpu=slm --prefix=/usr/5.0.0 \
--enable-clocale=gnu  --enable-shared --with-demangler-in-ld \
i686-linux --with-fpmath=sse --enable-languages=c,c++


-- 
H.J.


Re: [PATCH][AArch64] LR register not used in leaf functions

2014-11-15 Thread Andrew Pinski
On Sat, Nov 15, 2014 at 6:08 AM, Jiong Wang
wong.kwongyuan.to...@gmail.com wrote:
 2014-11-15 0:15 GMT+00:00 Andrew Pinski pins...@gmail.com:
 On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang jiong.w...@arm.com wrote:
 On 27/09/14 22:20, Kugan wrote:


 On 23/09/14 01:58, Jiong Wang wrote:

 +  /* If we decided that we didn't need a leaf frame pointer but then 
 used
 + LR in the function, then we'll want a frame pointer after all, so
 + prevent this elimination to ensure a frame pointer is used.  */
 +  if (to == STACK_POINTER_REGNUM
 +      && flag_omit_leaf_frame_pointer
 +      && df_regs_ever_live_p (LR_REGNUM))
 +    return false;

 This breaks my build on aarch64-elf (with some local modifications)

 Hi Andrew,

   then what's your local modification?
   I think the problem is we need to figure out why there is an ICE
 after your local modification?
   can you please send me your local modification and testcase if possible.

My local modifications can be found in the gcc git at
apinski/thunderx-cost.  Note I reverted this patch so I can continue
working.  The testcase is compiling newlib.  Let me try to get it
again.
I was configuring a combined build with:
--disable-fixed-point --without-ppl --without-python --disable-werror
--enable-plugins --enable-checking --disable-sim --with-newlib
--disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
--target=aarch64-thunderx-elf --enable-languages=c,c++

Thanks,
Andrew Pinski



 aarch64_frame_pointer_required returns true but then we use LR but now
 aarch64_can_eliminate and aarch64_frame_pointer_required are
 inconsistent which is not a valid thing for LRA (and reload).

 This was mentioned in 
 https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
  IRA calls hook frame_pointer_required and it returns false. After
 that LRA calls can_eliminate hook and it returns false which means
 that fp can not be used for allocation and we should spill all pseudos
 assigned to it.

 Thanks,
 Andrew Pinski


[committed,testsuite] Fix dg-error for a darwin testcase

2014-11-15 Thread FX
Committed as trivial, as the error wording changed due to more precise 
diagnostics: it now says ‘CFStringRef {aka const struct __CFString *}’ instead 
of just ‘CFStringRef’

FX



2014-10-19  Francois-Xavier Coudert  fxcoud...@gcc.gnu.org

* gcc.dg/darwin-cfstring-format-1.c: Adjust dg-error.


Index: gcc.dg/darwin-cfstring-format-1.c
===
--- gcc.dg/darwin-cfstring-format-1.c   (revision 217599)
+++ gcc.dg/darwin-cfstring-format-1.c   (working copy)
@@ -18,7 +18,7 @@ int s2 (int a, CFStringRef fmt, ... ) __
 int s2a (int a, CFStringRef fmt, ... ) __attribute__((format(CFString, 2, 2))) ; /* { dg-error "format string argument follows the args to be formatted" } */
 
 int s3 (const char *fmt, ... ) __attribute__((format(__CFString__, 1, 2))) ; /* { dg-error "format argument should be a .CFString. reference but a string was found" } */
-int s4 (CFStringRef fmt, ... ) __attribute__((format(printf, 1, 2))) ; /* { dg-error "found a .CFStringRef. but the format argument should be a string" } */
+int s4 (CFStringRef fmt, ... ) __attribute__((format(printf, 1, 2))) ; /* { dg-error "found a .CFStringRef.* but the format argument should be a string" } */
 
 char *s5 (char dum, char *fmt1, ... ) __attribute__((format_arg(2))) ; /* OK */
 CFStringRef s6 (CFStringRef dum, CFStringRef fmt1, ... ) __attribute__((format_arg(2))) ; /* OK */



Re: [gofrontend-dev] [PATCH 4/4] Gccgo port to s390[x] -- part II

2014-11-15 Thread Ian Taylor
On Thu, Nov 13, 2014 at 2:58 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote:

 What do you think about the attached patches?  They work for me, but I'm
 not sure whether the patch to go-test.exp is good because I know
 nothing about tcl.

Looks plausible to me.

Ian


Re: [PATCH][AArch64] LR register not used in leaf functions

2014-11-15 Thread Andrew Pinski
On Sat, Nov 15, 2014 at 7:21 AM, Andrew Pinski pins...@gmail.com wrote:
 On Sat, Nov 15, 2014 at 6:08 AM, Jiong Wang
 wong.kwongyuan.to...@gmail.com wrote:
 2014-11-15 0:15 GMT+00:00 Andrew Pinski pins...@gmail.com:
 On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang jiong.w...@arm.com wrote:
 On 27/09/14 22:20, Kugan wrote:


 On 23/09/14 01:58, Jiong Wang wrote:

 +  /* If we decided that we didn't need a leaf frame pointer but then 
 used
 + LR in the function, then we'll want a frame pointer after all, so
 + prevent this elimination to ensure a frame pointer is used.  */
 +  if (to == STACK_POINTER_REGNUM
 +      && flag_omit_leaf_frame_pointer
 +      && df_regs_ever_live_p (LR_REGNUM))
 +    return false;

 This breaks my build on aarch64-elf (with some local modifications)

 Hi Andrew,

   then what's your local modification?
   I think the problem is we need to figure out why there is an ICE
 after your local modification?
   can you please send me your local modification and testcase if possible.

 My local modifications can be found in the gcc git at
 apinski/thunderx-cost.  Note I reverted this patch so I can continue
 working.  The testcase is compiling newlib.  Let me try to get it
 again.
 I was configuring a combined build with:
 --disable-fixed-point --without-ppl --without-python --disable-werror
 --enable-plugins --enable-checking --disable-sim --with-newlib
 --disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
 --target=aarch64-thunderx-elf --enable-languages=c,c++

Attached is the preprocessed source.
cc1 strtol.i -mabi=ilp32 -O2
is enough to reproduce the ICE.

Thanks,
Andrew


 Thanks,
 Andrew Pinski



 aarch64_frame_pointer_required returns true but then we use LR but now
 aarch64_can_eliminate and aarch64_frame_pointer_required are
 inconsistent which is not a valid thing for LRA (and reload).

 This was mentioned in 
 https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
  IRA calls hook frame_pointer_required and it returns false. After
 that LRA calls can_eliminate hook and it returns false which means
 that fp can not be used for allocation and we should spill all pseudos
 assigned to it.

 Thanks,
 Andrew Pinski




[committed,testsuite] Fix missing includes for darwin testcases

2014-11-15 Thread FX
Committed as trivial.
And also, fixed wrong date on my earlier ChangeLog entry :)

FX



2014-11-15  Francois-Xavier Coudert  fxcoud...@gcc.gnu.org

* gcc.dg/pubtypes-3.c: Include string.h.
* gcc.dg/pubtypes-4.c: Likewise.


 
Index: gcc.dg/pubtypes-3.c
===
--- gcc.dg/pubtypes-3.c (revision 217599)
+++ gcc.dg/pubtypes-3.c (working copy)
@@ -9,6 +9,7 @@
 
 #include <stdlib.h>
 #include <stdio.h>
+#include <string.h>
 
 struct used_struct 
 {
Index: gcc.dg/pubtypes-4.c
===
--- gcc.dg/pubtypes-4.c (revision 217599)
+++ gcc.dg/pubtypes-4.c (working copy)
@@ -11,6 +11,7 @@
 
 #include <stdlib.h>
 #include <stdio.h>
+#include <string.h>
 
 struct used_struct 
 {



Re: [BUILDROBOT] error: ‘cl_target_option_stream_in’ was not declared in this scope (was: LTO streaming of TARGET_OPTIMIZE_NODE)

2014-11-15 Thread Jan-Benedict Glaw
On Fri, 2014-11-14 19:53:33 +0100, Jan Hubicka hubi...@ucw.cz wrote:
  Breaks build:
  
  g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
  -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
  -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
  -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
  -fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
  -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
  -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
  -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
  -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
  -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o tree-streamer-in.o -MT 
  tree-streamer-in.o -MMD -MP -MF ./.deps/tree-streamer-in.TPo 
  /home/jbglaw/repos/gcc/gcc/tree-streamer-in.c
  /home/jbglaw/repos/gcc/gcc/tree-streamer-in.c: In function ‘void 
  unpack_value_fields(data_in*, bitpack_d*, tree)’:
  /home/jbglaw/repos/gcc/gcc/tree-streamer-in.c:527:180: error: 
  ‘cl_target_option_stream_in’ was not declared in this scope
  make[1]: *** [tree-streamer-in.o] Error 1
  
  
  See eg. these builds:
  
  http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=376049
  http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=376050
  http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=376051
 
 I managed to do a partial commit (mistyping lto-streamer.h). It should be 
 fixed by the followup commit a minute later.  Does it work for you now?

Looks good. Thanks!

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: They that give up essential liberty to obtain temporary safety,
the second  : deserve neither liberty nor safety.  (Ben Franklin)




[committed,testsuite] Only run gcc.dg/tree-ssa/pr61144.c when aliases are supported

2014-11-15 Thread FX
All other tests in gcc.dg/ that use __attribute__((__alias__())) are guarded by 
dg-require-alias.
Let’s do the same for gcc.dg/tree-ssa/pr61144.c, otherwise it complains on 
darwin.



2014-11-15  Francois-Xavier Coudert  fxcoud...@gcc.gnu.org

* gcc.dg/tree-ssa/pr61144.c: Add dg-require-alias.


 
Index: gcc.dg/tree-ssa/pr61144.c
===
--- gcc.dg/tree-ssa/pr61144.c   (revision 217599)
+++ gcc.dg/tree-ssa/pr61144.c   (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-weak "" } */
+/* { dg-require-alias "" } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
 static int dummy = 0;
 extern int foo __attribute__((__weak__, __alias__("dummy")));



Reorg streaming of OPTIMIZATION_NODE

2014-11-15 Thread Jan Hubicka
Hi,
this patch implements OPTIMIZATION_NODE streaming the same way as the previous
patch did for TARGET_OPTION_NODE. Since the code turned out to be completely
analogous to the previous one, I will go ahead and commit it as obvious.
It will help to make followup changes easier to follow.

I also tested this with forcing default optimization node on every function
with LTO.  It seems to just work, modulo inliner ignoring most of the flags and
happily dragging code from one set of optimization options to another.
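
For readers who do not want to decode the awk template, here is a rough stand-alone sketch of the kind of field-by-field stream-out/stream-in pair it generates, assuming each field is written as one 64-bit value (as the bp_pack_value (bp, ..., 64) calls in the template suggest). The record, the toy word buffer and all function names are hypothetical; GCC's struct cl_optimization and struct bitpack_d are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option record; the real struct has many more fields.  */
struct opt_record
{
  char optimize;
  char optimize_size;
  int flag_example;
};

/* Toy word buffer standing in for a bitpack.  */
struct word_pack
{
  uint64_t words[8];
  size_t pos;
};

static void
pack_word (struct word_pack *bp, uint64_t value)
{
  assert (bp->pos < 8);
  bp->words[bp->pos++] = value;
}

static uint64_t
unpack_word (struct word_pack *bp)
{
  assert (bp->pos < 8);
  return bp->words[bp->pos++];
}

/* Write every field in a fixed order, one word per field.  */
static void
opt_stream_out (struct word_pack *bp, const struct opt_record *ptr)
{
  pack_word (bp, (uint64_t) ptr->optimize);
  pack_word (bp, (uint64_t) ptr->optimize_size);
  pack_word (bp, (uint64_t) ptr->flag_example);
}

/* Read the fields back in the same order, casting to each field's type.  */
static void
opt_stream_in (struct word_pack *bp, struct opt_record *ptr)
{
  ptr->optimize = (char) unpack_word (bp);
  ptr->optimize_size = (char) unpack_word (bp);
  ptr->flag_example = (int) unpack_word (bp);
}

/* Convenience helper for the example: stream out, rewind, stream in.  */
static struct opt_record
opt_round_trip (struct opt_record in)
{
  struct word_pack bp = { { 0 }, 0 };
  struct opt_record out;

  opt_stream_out (&bp, &in);
  bp.pos = 0;
  opt_stream_in (&bp, &out);
  return out;
}
```

The only invariant that matters is that both sides walk the fields in the same order, which the generator gets for free by emitting both routines from the same var_opt_val list.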

Bootstrapped/regtested ppc64-linux and x86_64-linux, tested with Firefox, 
committed.

Honza

* lto-streamer-out.c (hash_tree): Use cl_optimization_hash.
* lto-streamer.h (cl_optimization_stream_out, 
cl_optimization_stream_in): Declare.
* optc-save-gen.awk: Generate cl_optimization LTO streaming and hashing 
routines.
* opth-gen.awk: Add prototype of cl_optimization_hash.
* tree-streamer-in.c (unpack_ts_optimization): Remove.
(streamer_unpack_tree_bitfields): Use cl_optimization_stream_in.
* tree-streamer-out.c (pack_ts_optimization): Remove.
(streamer_pack_tree_bitfields): Use cl_optimization_stream_out.
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 217572)
+++ lto-streamer-out.c  (working copy)
@@ -948,7 +948,7 @@
 hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
-hstate.add (t, sizeof (struct cl_optimization));
+hstate.add_wide_int (cl_optimization_hash (TREE_OPTIMIZATION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_IDENTIFIER))
 hstate.merge_hash (IDENTIFIER_HASH_VALUE (t));
Index: lto-streamer.h
===
--- lto-streamer.h  (revision 217572)
+++ lto-streamer.h  (working copy)
@@ -844,7 +844,11 @@
 struct bitpack_d *,
 struct cl_target_option *);
 
+void cl_optimization_stream_out (struct bitpack_d *, struct cl_optimization *);
 
+void cl_optimization_stream_in (struct bitpack_d *, struct cl_optimization *);
+
+
 /* In lto-symtab.c.  */
 extern void lto_symtab_merge_decls (void);
 extern void lto_symtab_merge_symbols (void);
Index: optc-save-gen.awk
===
--- optc-save-gen.awk   (revision 217571)
+++ optc-save-gen.awk   (working copy)
@@ -551,4 +551,61 @@
 
 print "}";
 
+n_opt_val = 2;
+var_opt_val[0] = "x_optimize"
+var_opt_val_type[0] = "char "
+var_opt_val[1] = "x_optimize_size"
+var_opt_val_type[1] = "char "
+for (i = 0; i < n_opts; i++) {
+   if (flag_set_p("Optimization", flags[i])) {
+   name = var_name(flags[i])
+   if(name == "")
+   continue;
+
+   if(name in var_opt_list_seen)
+   continue;
+
+   var_opt_list_seen[name]++;
+
+   otype = var_type_struct(flags[i])
+   var_opt_val_type[n_opt_val] = otype;
+   var_opt_val[n_opt_val++] = "x_" name;
+   }
 }
+print "";
+print "/* Hash optimization options  */";
+print "hashval_t";
+print "cl_optimization_hash (struct cl_optimization const *ptr ATTRIBUTE_UNUSED)";
+print "{";
+print "  inchash::hash hstate;";
+for (i = 0; i < n_opt_val; i++) {
+   name = var_opt_val[i]
+   print "  hstate.add_wide_int (ptr->" name");";
+}
+print "  return hstate.end ();";
+print "}";
+
+print "";
+print "/* Stream out optimization options  */";
+print "void";
+print "cl_optimization_stream_out (struct bitpack_d *bp,";
+print "                            struct cl_optimization *ptr)";
+print "{";
+for (i = 0; i < n_opt_val; i++) {
+   name = var_opt_val[i]
+   print "  bp_pack_value (bp, ptr->" name", 64);";
+}
+print "}";
+
+print "";
+print "/* Stream in optimization options  */";
+print "void";
+print "cl_optimization_stream_in (struct bitpack_d *bp,";
+print "                           struct cl_optimization *ptr)";
+print "{";
+for (i = 0; i < n_opt_val; i++) {
+   name = var_opt_val[i]
+   print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_value (bp, 64);";
+}
+print "}";
+}
Index: opth-gen.awk
===
--- opth-gen.awk(revision 217571)
+++ opth-gen.awk(working copy)
@@ -299,6 +299,9 @@
 print "/* Hash option variables from a structure.  */";
 print "extern hashval_t cl_target_option_hash (const struct cl_target_option *);";
 print "";
+print "/* Hash optimization from a structure.  */";
+print "extern hashval_t cl_optimization_hash (const struct cl_optimization *);";
+print "";
 print "/* Anything that includes tm.h, does not necessarily need this.  */"
 print "#if !defined(GCC_TM_H)"
 print "#include \"input.h\" /* for location_t */"
Index: tree-streamer-in.c
===
--- tree-streamer-in.c  (revision 217571)
+++ tree-streamer-in.c  

Re: patch switching on LRA remat

2014-11-15 Thread Vladimir Makarov

On 2014-11-15 9:58 AM, H.J. Lu wrote:

On Fri, Nov 14, 2014 at 12:07 PM, Vladimir Makarov vmaka...@redhat.com wrote:

  The LRA rematerialization patch I submitted about a day ago broke H.J.'s
32-bit bootstrap.  So I switched off the rematerialization right away.  The
set for bootstrapping used by H.J. was very useful.  I've fixed several
existing and potential bugs.

Here is the patch fixing the bugs and switching LRA remat back on.  The patch
was bootstrapped on x86-64 and i686 (using H.J.'s options).

Committed as rev. 217588.

2014-11-14  Vladimir Makarov  vmaka...@redhat.com

 * lra-int.h (lra_create_live_ranges): Add parameter.
 * lra-lives.c (temp_bitmap): Move higher.
 (initiate_live_solver): Move temp_bitmap initialization into
 lra_live_ranges_init.
 (finish_live_solver): Move temp_bitmap clearing into
 live_ranges_finish.
 (process_bb_lives): Add parameter.  Use it to control live info
 update and dead insn elimination.  Pass it to mark_regno_live and
 mark_regno_dead.
 (lra_create_live_ranges): Add parameter.  Pass it to
 process_bb_lives.
 (lra_live_ranges_init, lra_live_ranges_finish): See changes in
 initiate_live_solver and finish_live_solver.
 * lra-remat.c (do_remat): Process insn non-operand hard regs too.
 Use temp_bitmap to update avail_cands.
 * lra.c (lra): Pass new parameter to lra_create_live_ranges.  Move
 check with lra_need_for_spill_p after live range pass.  Switch on
 rematerialization pass.


Unfortunately, it failed to bootstrap ia32 GCC:

https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00392.html

You can bootstrap ia32 GCC on Linux/x86-64:

1. Install ia32 binutils under /foo/bar.
2. Set PATH=/foo/bar:$PATH
3. Install 32-bit libraries used by GCC, glibc, mpfr, gmp, libmpc. ...
4. Configure GCC with



Thanks, H.J.  I see it's a different set of options than it was before.  I 
switched off remat. temporarily (rev. 217609).


Index: ChangeLog
===
--- ChangeLog   (revision 217608)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2014-11-15  Vladimir Makarov  vmaka...@redhat.com
+
+   * lra.c (lra): Switch off rematerialization pass.
+
 2014-11-15  Marc Glisse  marc.gli...@inria.fr

* config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps,
Index: lra.c
===
--- lra.c   (revision 217602)
+++ lra.c   (working copy)
@@ -2354,7 +2354,7 @@
break;
   /* Now we know what pseudos should be spilled.  Try to
 rematerialize them first.  */
-  if (lra_remat ())
+  if (0 && lra_remat ())
{
  /* We need full live info -- see the comment above.  */
  lra_create_live_ranges (lra_reg_spill_p, true);




[PATCH, 1/8] Expand oacc kernels after pass_build_ealias

2014-11-15 Thread Tom de Vries

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels 
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch moves omp expansion of the oacc kernels directive to after 
pass_build_ealias.


The rationale is that in order to use pass_parallelize_loops for analysis and 
transformation of an oacc kernels region, we postpone omp expansion of that 
region until the earliest point in the pass list where enough information is 
available to run pass_parallelize_loops, in other words, after pass_build_ealias.


The patch postpones expansion in expand_omp, and ensures expansion by adding 
pass_expand_omp_ssa:

- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa 
the way it left expand_omp, the patch makes pass_ccp and pass_forwprop aware of 
lowered omp code, so that they handle it conservatively.


The patch contains changes in expand_omp_target to deal with ssa-code, similar 
to what is already present in expand_omp_taskreg.


Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to not be 
static for oacc kernels. It does this to get some references to .omp_data_sizes 
and .omp_data_kinds in the ssa code.  Without these references, the definitions 
will be removed. The reference of the variables in GIMPLE_OACC_KERNELS is not 
enough to have them not removed. [ In vries/oacc-kernels, I used a BUILT_IN_USE 
kludge for this purpose ].


Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the 
original function of which the definition has been removed (as in moved to the 
split off function). TODO_remove_unused_locals takes care of some of them, but 
not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these 
dangling SSA_NAMEs and releases them.
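
As a toy illustration of that last step, the following stand-alone sketch walks a table of names and releases those that are still in use but whose definition is gone. It is a deliberate simplification with made-up types: real SSA_NAMEs also include default definitions, which have no defining statement yet are not dangling, and that case is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an SSA name table entry.  */
struct toy_name
{
  bool in_use;   /* Not yet released.  */
  bool has_def;  /* Definition still present in this function.  */
};

/* Release every in-use name whose definition has been moved away, and
   return how many names were released.  */
static size_t
release_dangling_names (struct toy_name *names, size_t n)
{
  size_t released = 0;

  for (size_t i = 0; i < n; i++)
    if (names[i].in_use && !names[i].has_def)
      {
        names[i].in_use = false;
        released++;
      }

  return released;
}
```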


OK for trunk?

Thanks,
- Tom
2014-11-14  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1.  Don't use
	static arrays for kernels directive.  Use static_arrays variable.
	Handle case that .omp_data_kinds is not static.
	(gimple_stmt_omp_lowering_p): New function.
	* omp-low.h (gimple_stmt_omp_lowering_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_build_ealias.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p): Handle omp lowering code conservatively.
	* tree-ssa-forwprop.c: Include omp-low.h.
	(pass_forwprop::execute): Handle omp lowering code conservatively.
---
 gcc/function.h  |   3 +
 gcc/gimplify.c  |   1 +
 gcc/omp-low.c   | 194 +---
 gcc/omp-low.h   |   1 +
 gcc/passes.def  |   2 +
 gcc/tree-ssa-ccp.c  |   4 +
 gcc/tree-ssa-forwprop.c |   4 +-
 7 files changed, 196 insertions(+), 13 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index 08ab761..a72c154 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -664,6 +664,9 @@ struct GTY(()) function {
 
   /* Set when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
+
+  /* Set when the function contains oacc kernels directives.  */
+  unsigned int contains_oacc_kernels : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 2c8c666..52d7e6d 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7281,6 +7281,7 @@ gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
   break;
 case OACC_KERNELS:
   stmt = gimple_build_oacc_kernels (body, OACC_KERNELS_CLAUSES (expr));
+  cfun->contains_oacc_kernels = 1;
   break;
 case OACC_PARALLEL:
   stmt = 

[PATCH, 2/8] Add pass_oacc_kernels

2014-11-15 Thread Tom de Vries

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels 
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch adds a pass group pass_oacc_kernels.

The rationale is that we want a pass group in which to run the (optimization) 
passes related to oacc kernels regions.
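
The effect of a gated pass group can be sketched with a small self-contained
model. This is not GCC's pass manager: Function, Pass, PassGroup, run_group and
make_oacc_kernels_group are hypothetical names, flag_openacc is modelled as a
plain parameter, and the sub-pass names are taken from the later patches in the
series purely for illustration. The point it shows is that the sub-passes are
skipped entirely for a function in which the gate fails.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for GCC's struct function; only the flag matters.
struct Function {
  std::string name;
  bool contains_oacc_kernels;
};

// A toy pass: a name plus a callback that records that it ran.
struct Pass {
  std::string name;
  std::function<void(const Function &, std::vector<std::string> &)> execute;
};

// A toy pass group: the gate is evaluated once per function, and the
// sub-passes only run when it returns true.
struct PassGroup {
  std::function<bool(const Function &)> gate;
  std::vector<Pass> sub_passes;
};

// Run GROUP on FN and return the names of the sub-passes that executed.
std::vector<std::string> run_group(const PassGroup &group, const Function &fn) {
  std::vector<std::string> log;
  if (!group.gate(fn))
    return log;  // Gate failed: no sub-pass runs.
  for (const Pass &pass : group.sub_passes)
    pass.execute(fn, log);
  return log;
}

// Build a group whose gate mirrors gate_oacc_kernels in the patch.
// The sub-pass names are illustrative only.
PassGroup make_oacc_kernels_group(bool flag_openacc) {
  PassGroup group;
  group.gate = [flag_openacc](const Function &fn) {
    return flag_openacc && fn.contains_oacc_kernels;
  };
  const std::vector<std::string> names = {"ch_oacc_kernels", "lim", "ccp"};
  for (const std::string &name : names) {
    Pass pass;
    pass.name = name;
    pass.execute = [name](const Function &, std::vector<std::string> &log) {
      log.push_back(name);
    };
    group.sub_passes.push_back(pass);
  }
  return group;
}
```

Calling run_group on a function without kernels directives, or with the
OpenACC flag off, returns an empty log in this model.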


OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass group pass_oacc_kernels.
	* tree-pass.h (make_pass_oacc_kernels): Declare.
	* tree-ssa-loop.c (gate_oacc_kernels): New static function.
	(pass_data_oacc_kernels): New pass_data.
	(class pass_oacc_kernels): New pass.
	(make_pass_oacc_kernels): New function.
---
 gcc/passes.def  |  5 +
 gcc/tree-pass.h |  1 +
 gcc/tree-ssa-loop.c | 48 
 3 files changed, 54 insertions(+)

diff --git a/gcc/passes.def b/gcc/passes.def
index bce8591..1fdb70a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -72,6 +72,11 @@ along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	 execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  /* Pass group that runs when there are oacc kernels in the
+	 function.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_fre);
 	  NEXT_PASS (pass_merge_phi);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index eaa69b4..0bae847 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -445,6 +445,7 @@ extern gimple_opt_pass *make_pass_strength_reduction (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ubsan (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 758b5fc..c29aa22 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -157,6 +157,54 @@ make_pass_tree_loop (gcc::context *ctxt)
   return new pass_tree_loop (ctxt);
 }
 
+/* Gate for oacc kernels pass group.  */
+
+static bool
+gate_oacc_kernels (function *fn)
+{
+  if (!flag_openacc)
+return false;
+
+  return fn->contains_oacc_kernels;
+}
+
+/* The oacc kernels superpass.  */
+
+namespace {
+
+const pass_data pass_data_oacc_kernels =
+{
+  GIMPLE_PASS, /* type */
+  "oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_LOOP, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_oacc_kernels : public gimple_opt_pass
+{
+public:
+  pass_oacc_kernels (gcc::context *ctxt)
+: gimple_opt_pass (pass_data_oacc_kernels, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *fn) { return gate_oacc_kernels (fn); }
+
+}; // class pass_oacc_kernels
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_oacc_kernels (gcc::context *ctxt)
+{
+  return new pass_oacc_kernels (ctxt);
+}
+
 /* The no-loop superpass.  */
 
 namespace {
-- 
1.9.1







[PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels

2014-11-15 Thread Tom de Vries

This patch adds a pass_ch_oacc_kernels to the pass group pass_oacc_kernels.

The idea is that pass_parallelize_loops only deals with loops for which the 
header has been copied, so the easiest way to meet that requirement when running 
pass_parallelize_loops in group pass_oacc_kernels, is to run pass_ch as a part 
of pass_oacc_kernels.
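
For readers unfamiliar with loop header copying, the transformation can be
illustrated at source level. The two functions below are a hand-written
sketch, not compiler output, and their names are made up: the first has the
exit test in the loop header, the second shows the shape after the header has
been copied (a guard followed by a do-while loop).

```cpp
#include <cstddef>
#include <vector>

// Original shape: the exit test sits in the loop header, so the body is
// only reached through the test.
unsigned sum_while(const std::vector<unsigned> &v) {
  unsigned sum = 0;
  std::size_t i = 0;
  while (i < v.size()) {
    sum += v[i];
    ++i;
  }
  return sum;
}

// Shape after header copying: one copy of the test guards the loop, the
// other copy becomes the latch test of a do-while loop.
unsigned sum_header_copied(const std::vector<unsigned> &v) {
  unsigned sum = 0;
  std::size_t i = 0;
  if (i < v.size()) {
    do {
      sum += v[i];
      ++i;
    } while (i < v.size());
  }
  return sum;
}
```

Both functions compute the same result for every input, including the empty
vector, which is what the guard is there for.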


We define a separate pass pass_ch_oacc_kernels, to leave all loops that aren't 
part of a kernels region alone.
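
The region test used to decide whether a loop is left alone can be modelled on
a plain dominator tree. The sketch below is a simplified model and not the GCC
code: blocks are integers, the immediate-dominator array idom is given by
hand, and dominated_by and region_blocks are hypothetical names. It mirrors
the bitmap logic of loop_in_oacc_kernels_region_p: a block is in the region if
it is dominated by the block ending in the kernels statement, is not that
block itself, and is not strictly dominated by the block ending in the return.

```cpp
#include <set>
#include <vector>

// idom[b] is the immediate dominator of block b; the function entry has -1.
// Return true if BLOCK is dominated by DOM (a block dominates itself).
bool dominated_by(const std::vector<int> &idom, int block, int dom) {
  for (int b = block; b != -1; b = idom[b])
    if (b == dom)
      return true;
  return false;
}

// Collect the blocks of the region that starts after ENTRY (the block ending
// in the kernels statement) and ends at EXIT_BLOCK (the block ending in the
// return statement).  ENTRY is excluded, EXIT_BLOCK is included.
std::set<int> region_blocks(const std::vector<int> &idom, int entry,
                            int exit_block) {
  std::set<int> region;
  for (int b = 0; b < static_cast<int>(idom.size()); ++b) {
    if (b == entry || !dominated_by(idom, b, entry))
      continue;  // Outside the part of the CFG dominated by the entry.
    if (b != exit_block && dominated_by(idom, b, exit_block))
      continue;  // Strictly after the end of the region.
    region.insert(b);
  }
  return region;
}
```

For a straight-line CFG 0 -> 1 -> 2 -> 3 -> 4 with entry 1 and exit 3, the
region in this model is {2, 3}; a loop is then considered part of the kernels
region if its header is in that set.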


OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* omp-low.c (loop_in_oacc_kernels_region_p): New function.
	* omp-low.h (loop_in_oacc_kernels_region_p): Declare.
	* passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
	* tree-pass.h (make_pass_ch_oacc_kernels): Declare.
	* tree-ssa-loop-ch.c: Include omp-low.h.
	(pass_ch_execute): Declare.
	(pass_ch::execute): Factor out ...
	(pass_ch_execute): ... this new function.  If handling oacc kernels,
	skip loops that are not in oacc kernels region.
	(pass_data_ch_oacc_kernels): New pass_data.
	(class pass_ch_oacc_kernels): New pass.
	(pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
	function.
---
 gcc/omp-low.c  | 83 ++
 gcc/omp-low.h  |  2 ++
 gcc/passes.def |  1 +
 gcc/tree-pass.h|  1 +
 gcc/tree-ssa-loop-ch.c | 59 +--
 5 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6caeae9..e35fa8b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13909,4 +13909,87 @@ gimple_stmt_omp_lowering_p (gimple stmt)
   return false;
 }
 
+/* Return true if LOOP is inside a kernels region.  */
+
+bool
+loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
+			   basic_block *region_exit)
+{
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap region_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap_clear (region_bitmap);
+
+  if (region_entry != NULL)
+*region_entry = NULL;
+  if (region_exit != NULL)
+*region_exit = NULL;
+
+  basic_block bb;
+  gimple last;
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  if (bitmap_bit_p (region_bitmap, bb->index))
+	continue;
+
+  last = last_stmt (bb);
+  if (!last)
+	continue;
+
+  if (gimple_code (last) != GIMPLE_OACC_KERNELS)
+	continue;
+
+  bitmap_clear (excludes_bitmap);
+  bitmap_set_bit (excludes_bitmap, bb->index);
+
+  vec<basic_block> dominated
+	= get_all_dominated_blocks (CDI_DOMINATORS, bb);
+
+  unsigned di;
+  basic_block dom;
+
+  basic_block end_region = NULL;
+  FOR_EACH_VEC_ELT (dominated, di, dom)
+	{
+	  if (dom == bb)
+	continue;
+
+	  last = last_stmt (dom);
+	  if (!last)
+	continue;
+
+	  if (gimple_code (last) != GIMPLE_OMP_RETURN)
+	continue;
+
+	  if (end_region == NULL
+	  || dominated_by_p (CDI_DOMINATORS, end_region, dom))
+	end_region = dom;
+	}
+
+  vec<basic_block> excludes
+	= get_all_dominated_blocks (CDI_DOMINATORS, end_region);
+
+  unsigned di2;
+  basic_block exclude;
+
+  FOR_EACH_VEC_ELT (excludes, di2, exclude)
+	if (exclude != end_region)
+	  bitmap_set_bit (excludes_bitmap, exclude->index);
+
+  FOR_EACH_VEC_ELT (dominated, di, dom)
+	if (!bitmap_bit_p (excludes_bitmap, dom->index))
+	  bitmap_set_bit (region_bitmap, dom->index);
+
+  if (bitmap_bit_p (region_bitmap, loop->header->index))
+	{
+	  if (region_entry != NULL)
+	*region_entry = bb;
+	  if (region_exit != NULL)
+	*region_exit = end_region;
+	  return true;
+	}
+}
+
+  return false;
+}
+
 #include "gt-omp-low.h"
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index ff8a956..f1b9d77 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,6 +29,8 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 extern bool gimple_stmt_omp_lowering_p (gimple);
+extern bool loop_in_oacc_kernels_region_p (struct loop *, basic_block *,
+	   basic_block *);
 
 extern GTY(()) vec<tree, va_gc> *offload_funcs;
 extern GTY(()) vec<tree, va_gc> *offload_vars;
diff --git a/gcc/passes.def b/gcc/passes.def
index 1fdb70a..5eefe73 100644
--- a/gcc/passes.def
+++ 

[PATCH, 4/8] Add pass_tree_loop_{init,done} to pass_oacc_kernels

2014-11-15 Thread Tom de Vries

This patch adds pass_tree_loop_init and pass_tree_loop_done to 
pass_oacc_kernels.


In the pass group pass_tree_loop, pass_parallelize_loops runs between these 
two passes, since it requires loop information.  We do the same for 
pass_oacc_kernels.
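
Since both passes now appear a second time in passes.def, each needs a clone
method so that the second listing gets its own instance. The toy model below
sketches that pattern under stated assumptions: ToyPass, ToyLoopInit,
ToyNoClone and list_pass are hypothetical names, and the base-class behaviour
(refusing to clone by default) is modelled with an exception, which is only an
approximation of what the real pass manager does.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy base class: by default a pass refuses to be cloned, so a pass that is
// listed more than once has to override clone().
class ToyPass {
public:
  explicit ToyPass(std::string name) : name_(std::move(name)) {}
  virtual ~ToyPass() = default;
  virtual std::unique_ptr<ToyPass> clone() const {
    throw std::logic_error("pass " + name_ + " does not support cloning");
  }
  const std::string &name() const { return name_; }

private:
  std::string name_;
};

// A pass that may be listed several times: clone() returns a fresh instance.
class ToyLoopInit : public ToyPass {
public:
  ToyLoopInit() : ToyPass("loopinit") {}
  std::unique_ptr<ToyPass> clone() const override {
    return std::unique_ptr<ToyPass>(new ToyLoopInit());
  }
};

// A pass without a clone() override; listing it fails in this model.
class ToyNoClone : public ToyPass {
public:
  ToyNoClone() : ToyPass("noclone") {}
};

// List PROTOTYPE in the pipeline USES times; every use gets its own
// instance, obtained through clone().
std::vector<std::unique_ptr<ToyPass>> list_pass(const ToyPass &prototype,
                                                int uses) {
  std::vector<std::unique_ptr<ToyPass>> pipeline;
  for (int i = 0; i < uses; ++i)
    pipeline.push_back(prototype.clone());
  return pipeline;
}
```

In this model, list_pass (ToyLoopInit (), 2) yields two distinct objects,
while the same call with ToyNoClone throws.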


OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
	group pass_oacc_kernels.
	* tree-ssa-loop.c (pass_tree_loop_init::clone)
	(pass_tree_loop_done::clone): New function.
---
 gcc/passes.def  | 2 ++
 gcc/tree-ssa-loop.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/gcc/passes.def b/gcc/passes.def
index 5eefe73..83f437b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -77,6 +77,8 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_oacc_kernels);
 	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
 	  NEXT_PASS (pass_ch_oacc_kernels);
+	  NEXT_PASS (pass_tree_loop_init);
+	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_fre);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index c29aa22..c78b013 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -269,6 +269,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *);
+  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
 
 }; // class pass_tree_loop_init
 
@@ -563,6 +564,7 @@ public:
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
+  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
 
 }; // class pass_tree_loop_done
 
-- 
1.9.1







[PATCH, 6/8] Add pass_ccp to pass_oacc_kernels

2014-11-15 Thread Tom de Vries

This patch adds pass_ccp to pass group pass_oacc_kernels.

We need this pass to simplify the loop body, and allow pass_parloops to detect 
that loop iterations are independent.
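
The testsuite updates in this patch (ccp2 becoming ccp3) follow from how dump
files are numbered per pass instance. The sketch below is a toy model of that
numbering, assuming that a pass listed once keeps its plain name and that a
pass listed several times gets a 1-based instance number in pipeline order;
dump_names is a hypothetical helper, not a GCC function.

```cpp
#include <map>
#include <string>
#include <vector>

// Return the dump name of every pass in PIPELINE: the plain name for a pass
// that occurs once, name plus 1-based instance number otherwise.
std::vector<std::string> dump_names(const std::vector<std::string> &pipeline) {
  // First count how often each pass name occurs.
  std::map<std::string, int> total;
  for (const std::string &name : pipeline)
    ++total[name];

  // Then number the instances in pipeline order.
  std::map<std::string, int> seen;
  std::vector<std::string> result;
  for (const std::string &name : pipeline) {
    if (total[name] == 1)
      result.push_back(name);
    else
      result.push_back(name + std::to_string(++seen[name]));
  }
  return result;
}
```

With the pipeline {ccp, fre, ccp} the last entry is named ccp2 in this model;
adding another ccp instance ahead of it turns that entry into ccp3, which is
why the existing tests need a different dump name.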


OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_ccp in pass group pass_oacc_kernels.

	* gcc.dg/pr43513.c: Update for new pass_ccp.
	* gcc.dg/tree-ssa/alias-17.c: Same.
	* gcc.dg/tree-ssa/foldconst-4.c: Same.
	* gcc.dg/tree-ssa/ssa-ccp-29.c: Same.
	* gcc.dg/tree-ssa/ssa-ccp-3.c: Same.
---
 gcc/passes.def  | 1 +
 gcc/testsuite/gcc.dg/pr43513.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/alias-17.c| 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/foldconst-4.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-29.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-3.c   | 6 +++---
 6 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index f6c16b9..cd9443c 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_ch_oacc_kernels);
 	  NEXT_PASS (pass_tree_loop_init);
 	  NEXT_PASS (pass_lim);
+	  NEXT_PASS (pass_ccp);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_expand_omp_ssa);
diff --git a/gcc/testsuite/gcc.dg/pr43513.c b/gcc/testsuite/gcc.dg/pr43513.c
index 78a037b..3fb0890 100644
--- a/gcc/testsuite/gcc.dg/pr43513.c
+++ b/gcc/testsuite/gcc.dg/pr43513.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-ccp2" } */
+/* { dg-options "-O2 -fdump-tree-ccp3" } */
 
 void bar (int *);
 void foo (char *, int);
@@ -15,5 +15,5 @@ foo3 ()
 foo ("%d ", results[i]);
 }
 
-/* { dg-final { scan-tree-dump-times "alloca" 0 "ccp2"} } */
-/* { dg-final { cleanup-tree-dump "ccp2" } } */
+/* { dg-final { scan-tree-dump-times "alloca" 0 "ccp3"} } */
+/* { dg-final { cleanup-tree-dump "ccp3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-17.c b/gcc/testsuite/gcc.dg/tree-ssa/alias-17.c
index 48e72ff..59862f6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/alias-17.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-17.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fno-early-inlining -fdump-tree-ccp2" } */
+/* { dg-options "-O -fno-early-inlining -fdump-tree-ccp3" } */
 
 int *p;
 int inline bar(void) { return 0; }
@@ -14,5 +14,5 @@ int foo(int x)
   return *q + *p;
 }
 
-/* { dg-final { scan-tree-dump-not "NOTE: no flow-sensitive alias info for" "ccp2" } } */
-/* { dg-final { cleanup-tree-dump "ccp2" } } */
+/* { dg-final { scan-tree-dump-not "NOTE: no flow-sensitive alias info for" "ccp3" } } */
+/* { dg-final { cleanup-tree-dump "ccp3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/foldconst-4.c b/gcc/testsuite/gcc.dg/tree-ssa/foldconst-4.c
index 445d415..916a857 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/foldconst-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/foldconst-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp2" } */
+/* { dg-options "-O -fdump-tree-ccp3" } */
 
 struct a {int a,b;};
 const static struct a a;
@@ -10,5 +10,5 @@ test()
 {
   return a.a+b[c];
 }
-/* { dg-final { scan-tree-dump "return 0;" "ccp2" } } */
-/* { dg-final { cleanup-tree-dump "ccp2" } } */
+/* { dg-final { scan-tree-dump "return 0;" "ccp3" } } */
+/* { dg-final { cleanup-tree-dump "ccp3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-29.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-29.c
index 44d2945..1e3f41b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-29.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-29.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp2" } */
+/* { dg-options "-O -fdump-tree-ccp3" } */
 
 static double num;
 int foo (void)
@@ -7,5 +7,5 @@ int foo (void)
   return *(unsigned *)num;
 }
 
-/* { dg-final { scan-tree-dump "return 0;" "ccp2" } } */
-/* { dg-final { cleanup-tree-dump "ccp2" } } */
+/* { dg-final { scan-tree-dump "return 0;" "ccp3" } } */
+/* { dg-final { cleanup-tree-dump "ccp3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-3.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-3.c
index 86a706b..03717e1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-3.c
@@ -1,5 +1,5 @@
 /* { dg-do 

[PATCH, 5/8] Add pass_loop_im to pass_oacc_kernels

2014-11-15 Thread Tom de Vries

This patch adds pass_loop_im to pass group pass_oacc_kernels.

We need this pass to simplify the loop body, and allow pass_parloops to detect 
that loop iterations are independent.
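
As an illustration of what invariant motion buys here, consider a loop whose
body reloads a pointer and a bound from a data block on every iteration. The
sketch below is hand-written and hypothetical (DataBlock, fill_in_loop and
fill_hoisted are made-up names standing in for the .omp_data_i accesses); it
shows the loop before and after the invariant loads are moved out. The
hoisting is only valid when the stores in the loop cannot modify the data
block itself, which is assumed here.

```cpp
#include <vector>

// Hypothetical data block, standing in for the .omp_data_i structure.
struct DataBlock {
  unsigned *c;
  unsigned n;
};

// Before invariant motion: d->c and d->n are loaded inside the loop, so each
// iteration contains memory reads that do not depend on i.
void fill_in_loop(DataBlock *d) {
  for (unsigned i = 0; i < d->n; ++i)
    d->c[i] = 2 * i;
}

// After invariant motion: the loads are done once before the loop and the
// body is reduced to a single store indexed by i.
void fill_hoisted(DataBlock *d) {
  unsigned *c = d->c;
  unsigned n = d->n;
  for (unsigned i = 0; i < n; ++i)
    c[i] = 2 * i;
}
```

The second form leaves a loop body whose only memory access is c[i], which is
the kind of body a dependence analysis can reason about most easily.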


OK for trunk?

Thanks,
- Tom


2014-11-14  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_lim in pass group pass_oacc_kernels.

	* c-c++-common/restrict-2.c: Update for new pass_lim.
	* c-c++-common/restrict-4.c: Same.
	* g++.dg/tree-ssa/pr33615.C:  Same.
	* g++.dg/tree-ssa/restrict1.C: Same.
	* gcc.dg/tm/pub-safety-1.c:  Same.
	* gcc.dg/tm/reg-promotion.c:  Same.
	* gcc.dg/tree-ssa/20050314-1.c:  Same.
	* gcc.dg/tree-ssa/loop-32.c: Same.
	* gcc.dg/tree-ssa/loop-33.c: Same.
	* gcc.dg/tree-ssa/loop-34.c: Same.
	* gcc.dg/tree-ssa/loop-35.c: Same.
	* gcc.dg/tree-ssa/loop-7.c: Same.
	* gcc.dg/tree-ssa/pr23109.c: Same.
	* gcc.dg/tree-ssa/restrict-3.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-1.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-10.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-11.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-12.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-2.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-3.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-6.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-7.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-8.c: Same.
	* gcc.dg/tree-ssa/ssa-lim-9.c: Same.
	* gcc.dg/tree-ssa/structopt-1.c: Same.
	* gfortran.dg/pr32921.f: Same.
---
 gcc/passes.def  | 1 +
 gcc/testsuite/c-c++-common/restrict-2.c | 6 +++---
 gcc/testsuite/c-c++-common/restrict-4.c | 6 +++---
 gcc/testsuite/g++.dg/tree-ssa/pr33615.C | 6 +++---
 gcc/testsuite/g++.dg/tree-ssa/restrict1.C   | 6 +++---
 gcc/testsuite/gcc.dg/tm/pub-safety-1.c  | 6 +++---
 gcc/testsuite/gcc.dg/tm/reg-promotion.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/20050314-1.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-32.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-33.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-34.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c | 8 
 gcc/testsuite/gcc.dg/tree-ssa/loop-7.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr23109.c | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/restrict-3.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-1.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-10.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-11.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-12.c  | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-2.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-3.c   | 8 
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-6.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-7.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-9.c   | 6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/structopt-1.c | 6 +++---
 gcc/testsuite/gfortran.dg/pr32921.f | 6 +++---
 27 files changed, 81 insertions(+), 80 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 83f437b..f6c16b9 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
 	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
 	  NEXT_PASS (pass_ch_oacc_kernels);
 	  NEXT_PASS (pass_tree_loop_init);
+	  NEXT_PASS (pass_lim);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_expand_omp_ssa);
diff --git a/gcc/testsuite/c-c++-common/restrict-2.c b/gcc/testsuite/c-c++-common/restrict-2.c
index 3f71b77..f0b0e15a 100644
--- a/gcc/testsuite/c-c++-common/restrict-2.c
+++ b/gcc/testsuite/c-c++-common/restrict-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fno-strict-aliasing -fdump-tree-lim1-details" } */
+/* { dg-options "-O -fno-strict-aliasing -fdump-tree-lim2-details" } */
 
 void foo (float * __restrict__ a, float * __restrict__ b, int n, int j)
 {
@@ -10,5 +10,5 @@ void foo (float * __restrict__ a, float * __restrict__ b, int n, int j)
 
 /* We should move the RHS of the store out of the loop.  */
 
-/* { dg-final { scan-tree-dump-times "Moving statement" 11 "lim1" } } */
-/* { dg-final { cleanup-tree-dump "lim1" } } */
+/* { dg-final { scan-tree-dump-times "Moving statement" 11 "lim2" } } */
+/* { dg-final { cleanup-tree-dump "lim2" } } */
diff --git a/gcc/testsuite/c-c++-common/restrict-4.c b/gcc/testsuite/c-c++-common/restrict-4.c

[PATCH, 7/8] Add pass_parloops_oacc_kernels to pass_oacc_kernels

2014-11-15 Thread Tom de Vries

This patch adds:
- a specialized version of pass_parallelize_loops called
pass_parloops_oacc_kernels to pass group pass_oacc_kernels, and
- relevant test-cases.

The pass only handles loops that are in a kernels region, and skips over bits of 
pass_parallelize_loops that are already done for oacc kernels.


The pass reintroduces the use of omp_expand_local; I haven't yet managed to 
make it work using the external pass pass_expand_omp_ssa.


An obvious limitation of the patch is that we copy the clauses from the 
kernels directive over to the generated parallel directive. We'll need to do 
something more intelligent here, for instance setting vector_length based on 
the parallelization factor.


Another limitation is that the pass still needs -ftree-parallelize-loops to 
trigger.

OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* passes.def: Add pass_parallelize_loops_oacc_kernels in pass group
	pass_oacc_kernels.  Move pass_expand_omp_ssa into pass group
	pass_oacc_kernels.
	* tree-parloops.c (create_parallel_loop): Add function parameters
	region_entry and bool oacc_kernels_p.  Handle oacc_kernels_p.
	(gen_parallel_loop): Same.  Use omp_expand_local if oacc_kernels_p.
	Call create_parallel_loop with additional args.
	(parallelize_loops): Add function parameter oacc_kernels_p.  Calculate
	dominance info.  Skip loops that are not in a kernels region. Call
	gen_parallel_loop with additional args.
	(pass_parallelize_loops::execute): Call parallelize_loops with false
	argument.
	(pass_data_parallelize_loops_oacc_kernels): New pass_data.
	(class pass_parallelize_loops_oacc_kernels): New pass.
	(pass_parallelize_loops_oacc_kernels::execute)
	(make_pass_parallelize_loops_oacc_kernels): New function.
	* tree-pass.h (make_pass_parallelize_loops_oacc_kernels): Declare.

	* testsuite/libgomp.oacc-c/oacc-kernels-2-run.c: New test.
	* testsuite/libgomp.oacc-c/oacc-kernels-run.c: New test.

	* gcc.dg/oacc-kernels-2.c: New test.
	* gcc.dg/oacc-kernels.c: New test.
---
 gcc/passes.def |   3 +-
 gcc/testsuite/gcc.dg/oacc-kernels-2.c  |  79 +++
 gcc/testsuite/gcc.dg/oacc-kernels.c|  71 ++
 gcc/tree-parloops.c| 242 -
 gcc/tree-pass.h|   2 +
 .../testsuite/libgomp.oacc-c/oacc-kernels-2-run.c  |  65 ++
 .../testsuite/libgomp.oacc-c/oacc-kernels-run.c|  59 +
 7 files changed, 465 insertions(+), 56 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/oacc-kernels-2.c
 create mode 100644 gcc/testsuite/gcc.dg/oacc-kernels.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/oacc-kernels-2-run.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/oacc-kernels-run.c

diff --git a/gcc/passes.def b/gcc/passes.def
index cd9443c..cc09ba9 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -80,9 +80,10 @@ along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_tree_loop_init);
 	  NEXT_PASS (pass_lim);
 	  NEXT_PASS (pass_ccp);
+  	  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
-	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_fre);
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/testsuite/gcc.dg/oacc-kernels-2.c b/gcc/testsuite/gcc.dg/oacc-kernels-2.c
new file mode 100644
index 000..1ff4bad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/oacc-kernels-2.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenacc } */
+/* { dg-options "-fopenacc -ftree-parallelize-loops=32 -O2 -std=c99 -fdump-tree-parloops_oacc_kernels-all -fdump-tree-copyrename" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#define N (1024 * 512)
+#define N_REF 4293394432
+
+#if 1
+#define COUNTERTYPE unsigned int
+#else
+#define COUNTERTYPE int
+#endif
+
+int
+main (void)
+{
+  unsigned int i;
+
+  unsigned int *__restrict a;
+  unsigned int *__restrict b;
+  unsigned int *__restrict c;
+
+  a = malloc (N * sizeof (unsigned int));
+  b = malloc (N * sizeof (unsigned int));
+  c = malloc (N * sizeof (unsigned int));
+
+
+#pragma acc kernels copyout (a[0:N])
+  {
+for 

[PATCH, 8/8] Do simple omp lowering for no address taken var

2014-11-15 Thread Tom de Vries

This patch lowers integer variables that do not have their address taken as 
local variables.  We use a copy at region entry and exit to copy the value in 
and out.


In the context of reduction handling in a kernels region, this allows the 
parloops reduction analysis to recognize the reduction, even after oacc lowering 
has been done in pass_lower_omp.


In more detail, without this patch, the omp_data_i loads and stores are 
generated in place (in this case, in the loop):

...
{
  .omp_data_iD.2201 = .omp_data_arr.15D.2220;
  {
unsigned intD.9 iD.2146;

iD.2146 = 0;
goto D.2207;
D.2208:
D.2216 = .omp_data_iD.2201->cD.2203;
c.9D.2176 = *D.2216;
D.2177 = (long unsigned intD.10) iD.2146;
D.2178 = D.2177 * 4;
D.2179 = c.9D.2176 + D.2178;
D.2180 = *D.2179;
D.2217 = .omp_data_iD.2201->sumD.2205;
D.2218 = *D.2217;
D.2217 = .omp_data_iD.2201->sumD.2205;
D.2219 = D.2180 + D.2218;
*D.2217 = D.2219;
iD.2146 = iD.2146 + 1;
D.2207:
if (iD.2146 <= 524287) goto D.2208; else goto D.2209;
D.2209:
  }
...

With this patch, the omp_data_i loads and stores for sum are generated at 
entry and exit:

...
{
  .omp_data_iD.2201 = .omp_data_arr.15D.2218;
  D.2216 = .omp_data_iD.2201->sumD.2205;
  sumD.2206 = *D.2216;
  {
unsigned intD.9 iD.2146;

iD.2146 = 0;
goto D.2207;
D.2208:
D.2217 = .omp_data_iD.2201->cD.2203;
c.9D.2176 = *D.2217;
D.2177 = (long unsigned intD.10) iD.2146;
D.2178 = D.2177 * 4;
D.2179 = c.9D.2176 + D.2178;
D.2180 = *D.2179;
sumD.2206 = D.2180 + sumD.2206;
iD.2146 = iD.2146 + 1;
D.2207:
if (iD.2146 <= 524287) goto D.2208; else goto D.2209;
D.2209:
  }
  *D.2216 = sumD.2206;
  #pragma omp return
}
...


So, without the patch the reduction operation looks like this:
...
*(.omp_data_iD.2201->sumD.2205) = *(.omp_data_iD.2201->sumD.2205) + x
...

And with this patch the reduction operation is simply:
...
sumD.2206 = sumD.2206 + x;
...
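
The two shapes of the reduction can also be written out at source level. The
following is a hand-written sketch, not generated code: OmpData,
reduce_in_place and reduce_via_local are hypothetical names modelled on the
dumps above. The first function updates sum through the data block on every
iteration; the second copies the value in at entry, reduces into a local, and
copies it out at exit. The two are only equivalent when nothing else can
observe or modify sum inside the loop, which is what the address-not-taken
condition is meant to guarantee.

```cpp
#include <vector>

// Hypothetical data block, modelled on .omp_data_i in the dumps.
struct OmpData {
  const unsigned *c;
  unsigned *sum;
};

// Without the patch: every iteration loads and stores sum through the data
// block, so the reduction is hidden behind memory accesses.
void reduce_in_place(const OmpData *d, unsigned n) {
  for (unsigned i = 0; i < n; ++i)
    *d->sum = *d->sum + d->c[i];
}

// With the patch: copy in at region entry, reduce into a local variable,
// copy out at region exit.
void reduce_via_local(const OmpData *d, unsigned n) {
  unsigned sum = *d->sum;    // Copy in.
  for (unsigned i = 0; i < n; ++i)
    sum = sum + d->c[i];     // Plain scalar reduction.
  *d->sum = sum;             // Copy out.
}
```

In the second form the loop body contains the pattern sum = sum + x on a
local scalar, which is the form the parloops reduction analysis looks for.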

OK for trunk?

Thanks,
- Tom

2014-11-03  Tom de Vries  t...@codesourcery.com

	* gimple.c (gimple_seq_ior_addresses_taken_op)
	(gimple_seq_ior_addresses_taken): New function.
	* gimple.h (gimple_seq_ior_addresses_taken): Declare.
	* omp-low.c (addresses_taken): Declare local variable.
	(lower_oacc_offload): Lower variables that do not have their address
	taken as local variable.  Use a copy at region entry and exit to copy
	the value in and out.
	(execute_lower_omp): Calculate addresses_taken.
---
 gcc/gimple.c  | 35 +++
 gcc/gimple.h  |  1 +
 gcc/omp-low.c | 25 ++---
 3 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple.c b/gcc/gimple.c
index a9174e6..107eb26 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -2428,6 +2428,41 @@ gimple_ior_addresses_taken (bitmap addresses_taken, gimple stmt)
 	gimple_ior_addresses_taken_1);
 }
 
+/* Helper function for gimple_seq_ior_addresses_taken.  */
+
+static tree
+gimple_seq_ior_addresses_taken_op (tree *tp,
+   int *walk_subtrees ATTRIBUTE_UNUSED,
+   void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *)data;
+  bitmap addresses_taken = (bitmap)wi->info;
+
+  tree t = *tp;
+  if (TREE_CODE (t) != ADDR_EXPR)
+return NULL_TREE;
+
+  tree var = TREE_OPERAND (t, 0);
+  if (!DECL_P (var))
+return NULL_TREE;
+
+  bitmap_set_bit (addresses_taken, DECL_UID (var));
+
+  return NULL_TREE;
+}
+
+/* Find the decls in SEQ that have their address taken, and set the
+   corresponding decl_uid 

Re: [PATCH] Fix Cilk+ ICEs with overflow builtins (PR middle-end/63884)

2014-11-15 Thread Jakub Jelinek
On Sat, Nov 15, 2014 at 01:03:46PM +0100, Marek Polacek wrote:
 The problem here is that the Cilk+ code wasn't prepared to handle
 internal calls that the new overflow builtins entail.  Fixed by
 checking that the CALL_EXPR_FN isn't NULL.
 
 Looking at cilk-plus.exp, I think this file will need some tweaks
 now that the C default is gnu11...
 
 Bootstrapped/regtested on powerpc64-linux, ok for trunk?
 
 2014-11-15  Marek Polacek  pola...@redhat.com
 
   PR middle-end/63884
 c-family/
   * array-notation-common.c (is_sec_implicit_index_fn): Return false
   for NULL fndecl.
   (extract_array_notation_exprs): Return for NULL node.
 testsuite/
   * c-c++-common/cilk-plus/AN/pr63884.c: New test.

Ok, thanks.

Jakub


PATCH: PR bootstrap/63888: [5 Regression] bootstrap failed when configured with -with-build-config=bootstrap-asan --disable-werror

2014-11-15 Thread H.J. Lu
Hi,

GCC uses xstrndup/xstrdup throughout the source tree, and that memory
may not be freed explicitly before exit.  LeakSanitizer isn't very
useful here.  This patch suppresses LeakSanitizer in bootstrap.  OK
for trunk?

This patch isn't sufficient.  I got

configure:3612: /export/build/gnu/gcc-asan/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-asan/build-x86_64-linux/./gcc/
-B/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/sys-include -c -g -O2
conftest.c >&5
=
==14370==ERROR: AddressSanitizer: odr-violation (0x02b38aa0):
  [1] size=12 'CSWTCH.2819'
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4056:7
  [2] size=12 'CSWTCH.2820'
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4109:8
These globals were registered at these points:
  [1]:
#0 0x68e9c6 in __asan_register_globals
/export/gnu/import/git/sources/gcc/libsanitizer/asan/asan_globals.cc:217
#1 0x28dc89c in __libc_csu_init
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x28dc89c)
#2 0x309e821c34 in __libc_start_main (/lib64/libc.so.6+0x309e821c34)
#3 0x683d3e
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x683d3e)

  [2]:
#0 0x68e9c6 in __asan_register_globals
/export/gnu/import/git/sources/gcc/libsanitizer/asan/asan_globals.cc:217
#1 0x28dc89c in __libc_csu_init
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x28dc89c)
#2 0x309e821c34 in __libc_start_main (/lib64/libc.so.6+0x309e821c34)
#3 0x683d3e
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x683d3e)

==14370==HINT: if you don't care about these warnings you may set
ASAN_OPTIONS=detect_odr_violation=0
SUMMARY: AddressSanitizer: odr-violation: global 'CSWTCH.2819' at
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4056:7
==14370==ABORTING



H.J.
---
2014-11-15  H.J. Lu  hongjiu...@intel.com

PR bootstrap/63888
* bootstrap-asan.mk (ASAN_OPTIONS): Export detect_leaks=0.

diff --git a/config/bootstrap-asan.mk b/config/bootstrap-asan.mk
index fbef021..52ef30e 100644
--- a/config/bootstrap-asan.mk
+++ b/config/bootstrap-asan.mk
@@ -1,5 +1,8 @@
 # This option enables -fsanitize=address for stage2 and stage3.
 
+# Suppress LeakSanitizer in bootstrap.
+export ASAN_OPTIONS=detect_leaks=0
+
 STAGE2_CFLAGS += -fsanitize=address
 STAGE3_CFLAGS += -fsanitize=address
 POSTSTAGE1_LDFLAGS += -fsanitize=address -static-libasan \
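
As a side note, and not part of this patch: for a single program the same
setting can also be supplied from inside the binary, because the sanitizer
runtime looks for a user-defined `__asan_default_options` hook.  A minimal
sketch follows; the option string simply mirrors the value exported above,
and whether this would be appropriate for the GCC drivers is an open
assumption.

```cpp
// Hook read by the AddressSanitizer runtime when the program starts.
// It must have C linkage so the runtime can find it by name.
extern "C" const char *
__asan_default_options ()
{
  // Same setting as "export ASAN_OPTIONS=detect_leaks=0" in the makefile
  // fragment: leak detection is disabled for this program.
  return "detect_leaks=0";
}
```

Exporting the variable from bootstrap-asan.mk affects every tool run during
stage2 and stage3, which is why the patch does it there instead.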


Re: patch switching on LRA remat

2014-11-15 Thread H.J. Lu
On Sat, Nov 15, 2014 at 9:07 AM, Vladimir Makarov vmaka...@redhat.com wrote:
 On 2014-11-15 9:58 AM, H.J. Lu wrote:

 On Fri, Nov 14, 2014 at 12:07 PM, Vladimir Makarov vmaka...@redhat.com
 wrote:

   The LRA rematerialization patch I've submitted about a day ago broke
 H.J.'s
 32-bit bootstrap.  So I switched off the rematerialization right away.
 The
 set for bootstrapping used by H.J. was very useful.  I've fixed several
 existing and potential bugs.

 Here is the patch fixing the bugs and switching LRA remat back on.  The
 patch
 was bootstrapped on x86-64 and i686 (using H.J.'s options).

 Committed as rev. 217588.

 2014-11-14  Vladimir Makarov  vmaka...@redhat.com

  * lra-int.h (lra_create_live_ranges): Add parameter.
  * lra-lives.c (temp_bitmap): Move higher.
  (initiate_live_solver): Move temp_bitmap initialization into
  lra_live_ranges_init.
  (finish_live_solver): Move temp_bitmap clearing into
  live_ranges_finish.
  (process_bb_lives): Add parameter.  Use it to control live info
  update and dead insn elimination.  Pass it to mark_regno_live
 and
  mark_regno_dead.
  (lra_create_live_ranges): Add parameter.  Pass it to
  process_bb_lives.
  (lra_live_ranges_init, lra_live_ranges_finish): See changes in
  initiate_live_solver and finish_live_solver.
  * lra-remat.c (do_remat): Process insn non-operand hard regs
 too.
  Use temp_bitmap to update avail_cands.
  * lra.c (lra): Pass new parameter to lra_create_live_ranges.
 Move
  check with lra_need_for_spill_p after live range pass.  Switch
 on
  rematerialization pass.


 Unfortunately, it failed to bootstrap ia32 GCC:

 https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00392.html

 You can bootstrap ia32 GCC on Linux/x86-64:

 1. Install ia32 binutils under /foo/bar.
 2. Set PATH=/foo/bar:$PATH
 3. Install 32-bit libraries used by GCC, glibc, mpfr, gmp, libmpc. ...
 4. Configure GCC with


 Thanks, H.J.  I see it's a different set of options than it was before.  I
 switched off remat. temporarily (rev. 217609).

It also miscompiled SPEC CPU 2000 on both ia32 and x86-64:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63895



-- 
H.J.


[committed,testsuite] Not run gcc.target/i386/sibcall-1.c on PIC targets

2014-11-15 Thread FX
Don’t run gcc.target/i386/sibcall-1.c on PIC targets.


2014-11-15  Francois-Xavier Coudert  fxcoud...@gcc.gnu.org

PR target/60104
* gcc.target/i386/sibcall-1.c: Don't run on pic targets.


Index: gcc.target/i386/sibcall-1.c
===
--- gcc.target/i386/sibcall-1.c (revision 217599)
+++ gcc.target/i386/sibcall-1.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile { target ia32 } } */
+/* { dg-do compile { target { ia32 && nonpic } } } */
 /* { dg-options "-O2" } */
 
 extern int (*foo)(int);



Re: [PATCH][AArch64] LR register not used in leaf functions

2014-11-15 Thread Jiong Wang
2014-11-15 15:49 GMT+00:00 Andrew Pinski pins...@gmail.com:
 My local modifications can be found in the gcc git at
 apinski/thunderx-cost.  Note I reverted this patch so I can continue
 working.  The testcase is compiling newlib.  Let me try to get it
 again.
 I was configuring a combined build with:
 --disable-fixed-point --without-ppl --without-python --disable-werror
 --enable-plugins --enable-checking --disable-sim --with-newlib
 --disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
 --target=aarch64-thunderx-elf --enable-languages=c,c++

 Attached is the preprocessed source.
 cc1 strtol.i -mabi=ilp32 -O2
 is enough to reproduce the ICE.

Thanks.  I can reproduce this ICE under -mabi=ilp32 on your code base.

It's strange that LR is marked as live while the final assembly doesn't use it.
The reason for this is that the following patterns

define_insn "*tb<optab><mode>1"
define_insn "*cb<optab><mode>1"

always declare that they clobber a register even though they don't always use it
in code generation, so sub-optimal code is generated and some registers are wasted.

You can see that before my patch x19 is allocated for that clobber, so there
is an unnecessary save of x19 to the stack, while after my patch x30 is allocated
for that clobber, so aarch64_can_eliminate, invoked after this, will think
this function requires a frame pointer according to our ABI, and so there are
unnecessary frame setup instructions.

Basically, the inconsistency is not between aarch64_require_frame_p and
aarch64_can_eliminate.  It is between aarch64_can_eliminate invoked
before assign_by_spills and invoked after it.

My first impression is that the gcc_assert in lra-eliminate.c is
too strong and needs to be relaxed in some situations.


 Thanks,
 Andrew



Re: [committed,testsuite] Not run gcc.target/i386/sibcall-1.c on PIC targets

2014-11-15 Thread H.J. Lu
On Sat, Nov 15, 2014 at 11:46 AM, FX fxcoud...@gmail.com wrote:
 Don’t run gcc.target/i386/sibcall-1.c on PIC targets.


 2014-11-15  Francois-Xavier Coudert  fxcoud...@gcc.gnu.org

 PR target/60104
 * gcc.target/i386/sibcall-1.c: Don't run on pic targets.


 Index: gcc.target/i386/sibcall-1.c
 ===
 --- gcc.target/i386/sibcall-1.c (revision 217599)
 +++ gcc.target/i386/sibcall-1.c (working copy)
 @@ -1,4 +1,4 @@
 -/* { dg-do compile { target ia32 } } */
 +/* { dg-do compile { target { ia32 && nonpic } } } */
  /* { dg-options "-O2" } */

  extern int (*foo)(int);


This looks wrong.  This test should pass for 64-bit or ia32 && nonpic.

-- 
H.J.


Re: [committed,testsuite] Not run gcc.target/i386/sibcall-1.c on PIC targets

2014-11-15 Thread FX
 This looks wrong.  This test should pass for 64-bit or ia32 && nonpic.

It was Kai’s original testcase, so I don’t want to modify it too much, other 
than make it skip where it clearly fails.

FX

Re: [committed,testsuite] Not run gcc.target/i386/sibcall-1.c on PIC targets

2014-11-15 Thread H.J. Lu
On Sat, Nov 15, 2014 at 12:22 PM, FX fxcoud...@gmail.com wrote:
 This looks wrong.  This test should pass for 64-bit or ia32 && nonpic.

 It was Kai’s original testcase, so I don’t want to modify it too much, other 
 than make it skip where it clearly fails.


Original bug report was filed against x86-64:

The attached testcase is a greatly reduced interpreter loop,
containing a simple load and indirect branch:

  goto *addresses[*pc++]

gcc 4.8.2 (as well as older versions) with -O2 produces the following
x86-64 output:

  movq addresses.1721(,%rax,8), %rax
  jmp *%rax

Since the loaded value is not used after the branch, there's no need
to hold it in a register, so the load could be folded into the branch.
This would improve code size and instruction count.

Adding a testcase only for ia32 makes no sense at all.
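
For readers without the PR at hand, here is a minimal sketch of the kind of
interpreter loop the report describes.  It is not the attached testcase: the
opcode names and the `run` function are made up for illustration, and it
relies on the GNU "labels as values" extension.

```cpp
// Hypothetical opcodes, used only for this illustration.
enum opcode : unsigned char { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

// Execute the bytecode starting at PC and return the final accumulator.
int
run (const unsigned char *pc)
{
  // One label address per opcode, indexed by the opcode value.
  static void *const addresses[] = { &&do_inc, &&do_dec, &&do_halt };
  int acc = 0;

  // The value loaded from the table is used only by the indirect jump,
  // which is what makes folding the load into the branch possible.
  goto *addresses[*pc++];

do_inc:
  ++acc;
  goto *addresses[*pc++];
do_dec:
  --acc;
  goto *addresses[*pc++];
do_halt:
  return acc;
}
```

Nothing in such a loop is specific to ia32, which is the point being made
about where the test should run.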


H.J.




-- 
H.J.


Re: Rerog streaming of OPTIMIZATION_NODE

2014-11-15 Thread Jan-Benedict Glaw
On Sat, 2014-11-15 17:57:20 +0100, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 this patch implements OPTIMIZATION_NODE streaming the same way as the previous patch
 did for TARGET_OPTION_NODE. Since the code turned out to be completely
 analogous to the previous one I will go ahead and commit it as obvious.
 It will help to make followup changes easier to follow.
 
 I also tested this with forcing default optimization node on every function
 with LTO.  It seems to just work, modulo inliner ignoring most of the flags 
 and
 happily dragging code from one set of optimization options to another.
 
 Bootstrapped/regtested ppc64-linux and x86_64-linux, tested with Firefox, 
 Committed.
 
 Honza
 
   * lto-streamer-out.c (hash_tree): Use cl_optimization_hash.
   * lto-streamer.h (cl_optimization_stream_out, 
 cl_optimization_stream_in): Declare.
   * optc-save-gen.awk: Generate cl_optimization LTO streaming and hashing 
 routines.
   * opth-gen.awk: Add prototype of cl_optimization_hash.
   * tree-streamer-in.c (unpack_ts_optimization): Remove.
   (streamer_unpack_tree_bitfields): Use cl_optimization_stream_in.
   * tree-streamer-out.c (pack_ts_optimization): Remove.
   (streamer_pack_tree_bitfields): Use cl_optimization_stream_out.

The recent work, I'm not exactly sure if it's actually /this/ commit,
broke nios2-rtems, see eg. build
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=376303

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  
-DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
-I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o optabs.o -MT optabs.o -MMD 
-MP -MF ./.deps/optabs.TPo /home/jbglaw/repos/gcc/gcc/optabs.c
gawk -f /home/jbglaw/repos/gcc/gcc/opt-functions.awk -f 
/home/jbglaw/repos/gcc/gcc/opt-read.awk \
   -f /home/jbglaw/repos/gcc/gcc/optc-save-gen.awk \
   -v header_name=config.h system.h coretypes.h tm.h  optionlist  
options-save.c
g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  
-DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
-I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o options-save.o -MT 
options-save.o -MMD -MP -MF ./.deps/options-save.TPo options-save.c
options-save.c: In function ‘void cl_target_option_stream_in(data_in*, 
bitpack_d*, cl_target_option*)’:
options-save.c:1901:41: error: expected primary-expression before ‘enum’
   ptr->saved_custom_code_status[256] = (enum nios2_ccs_code 
saved_custom_code_status[256]) bp_unpack_value (bp, 64);
 ^
options-save.c:1901:41: error: expected ‘)’ before ‘enum’
options-save.c:1902:40: error: expected primary-expression before ‘int’
   ptr->saved_custom_code_index[256] = (int saved_custom_code_index[256]) 
bp_unpack_value (bp, 64);
^
options-save.c:1902:40: error: expected ‘)’ before ‘int’
options-save.c:1903:49: error: expected primary-expression before ‘int’
   ptr->saved_fpu_custom_code[n2fpu_code_num] = (int 
saved_fpu_custom_code[n2fpu_code_num]) bp_unpack_value (bp, 64);
 ^
options-save.c:1903:49: error: expected ‘)’ before ‘int’
make[1]: *** [options-save.o] Error 1


(I just bisected it just now, it's this commit:

2014-11-14  Jan Hubicka  hubi...@ucw.cz

* optc-save-gen.awk: Output cl_target_option_eq,
cl_target_option_hash, cl_target_option_stream_out,
cl_target_option_stream_in functions.
* opth-gen.awk: Output prototypes for
cl_target_option_eq and cl_target_option_hash.
* lto-streamer.h (cl_target_option_stream_out,
cl_target_option_stream_in): Declare.
* tree.c (cl_option_hash_hash): Use cl_target_option_hash.
(cl_option_hash_eq): Use cl_target_option_eq.
* tree-streamer-in.c (unpack_value_fields): Stream in
TREE_TARGET_OPTION.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Follow
DECL_FUNCTION_SPECIFIC_TARGET.
(hash_tree): Hash TREE_TARGET_OPTION; visit

RE: Follow-up to PR51471

2014-11-15 Thread Matthew Fortune
Eric Botcazou ebotca...@adacore.com writes:
  IIRC, fill_eager and its related friends are all speculative in some
 way
  and aren't those precisely the ones that are causing us problems.
 Also
  note we have backends working around this stuff in fairly blunt ways:
 
 I'd say that the PA back-end went a bit too far here, especially if it
 marks some insns of the epilogue as frame-related.  dwarf2cfi.c has
 special code to handle delay slots (SEQUENCEs) so it's not an all-or-
 nothing game.
 
  Given architectural difficulties of delay slots on modern processors,
  would it be that painful to just not allow filling slots with frame
  insns and let dbr try to find something else or drop in a nop?  I
  wouldn't be all that surprised if there wasn't a measurable
  performance difference on something like a modern Sparc.
 
 Yes, modern SPARCs have (short) branches without delay slots.  But the
 other big contender is MIPS here and the story might be different for
 it.

MIPSr6 introduces 'compact' branches which do not have delay slots.

So the issues of filling delay slots will be less important from R6
onwards. However, delay slots remain important for now.

I haven't thought about the problem much but instinctively I'd be surprised
if a blanket restriction on frame-related instructions would lead to lots
of NOPs in delay slots.

Matthew



Re: Rerog streaming of OPTIMIZATION_NODE

2014-11-15 Thread Jan Hubicka
 
 The recent work, I'm not exactly sure if it's actually /this/ commit,
 broke nios2-rtems, see eg. build
 http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=376303
 
 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
 -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
 -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
 -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
 -fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
 -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
 -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o optabs.o -MT optabs.o -MMD 
 -MP -MF ./.deps/optabs.TPo /home/jbglaw/repos/gcc/gcc/optabs.c
 gawk -f /home/jbglaw/repos/gcc/gcc/opt-functions.awk -f 
 /home/jbglaw/repos/gcc/gcc/opt-read.awk \
-f /home/jbglaw/repos/gcc/gcc/optc-save-gen.awk \
-v header_name=config.h system.h coretypes.h tm.h  optionlist  
 options-save.c
 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
 -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
 -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
 -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
 -fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
 -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
 -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
 -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o options-save.o -MT 
 options-save.o -MMD -MP -MF ./.deps/options-save.TPo options-save.c
 options-save.c: In function ‘void cl_target_option_stream_in(data_in*, 
 bitpack_d*, cl_target_option*)’:
 options-save.c:1901:41: error: expected primary-expression before ‘enum’
ptr->saved_custom_code_status[256] = (enum nios2_ccs_code 
 saved_custom_code_status[256]) bp_unpack_value (bp, 64);
  ^
 options-save.c:1901:41: error: expected ‘)’ before ‘enum’
 options-save.c:1902:40: error: expected primary-expression before ‘int’
ptr->saved_custom_code_index[256] = (int saved_custom_code_index[256]) 
 bp_unpack_value (bp, 64);
 ^
 options-save.c:1902:40: error: expected ‘)’ before ‘int’
 options-save.c:1903:49: error: expected primary-expression before ‘int’
ptr->saved_fpu_custom_code[n2fpu_code_num] = (int 
 saved_fpu_custom_code[n2fpu_code_num]) bp_unpack_value (bp, 64);
  ^
 options-save.c:1903:49: error: expected ‘)’ before ‘int’
 make[1]: *** [options-save.o] Error 1

Yep, it is because my code does not handle streaming of arrays into the target 
optimization nodes.
I will take a look at why that array is really needed.  It seems like overkill?
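
To make the failure mode concrete: the generated line casts with the whole
array declaration as if it were a scalar type, where an array field would
have to be streamed element by element.  The sketch below only shows that
general idea.  It does not use GCC's bitpack API; the structure, the
function names and the word buffer are all hypothetical, and it assumes a
32-bit int.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a target option structure with array fields.
struct target_opts_sketch
{
  int saved_index[4];
  unsigned char saved_status[4];
};

// Write every array element as its own word instead of treating the
// array declaration as one scalar value.
void
stream_out (const target_opts_sketch &opts, std::vector<std::uint64_t> &buf)
{
  for (int v : opts.saved_index)
    buf.push_back (static_cast<std::uint32_t> (v));
  for (unsigned char v : opts.saved_status)
    buf.push_back (v);
}

// Read the elements back in the same order, advancing POS past them.
void
stream_in (target_opts_sketch &opts, const std::vector<std::uint64_t> &buf,
	   std::size_t &pos)
{
  for (int &v : opts.saved_index)
    v = static_cast<int> (static_cast<std::uint32_t> (buf[pos++]));
  for (unsigned char &v : opts.saved_status)
    v = static_cast<unsigned char> (buf[pos++]);
}
```

The cost is one word per element, which is part of why a 256-entry array in
the option structure looks heavy.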

Honza
 
 
 (I just bisected it just now, it's this commit:
 
 2014-11-14  Jan Hubicka  hubi...@ucw.cz
 
   * optc-save-gen.awk: Output cl_target_option_eq,
   cl_target_option_hash, cl_target_option_stream_out,
   cl_target_option_stream_in functions.
   * opth-gen.awk: Output prototypes for
   cl_target_option_eq and cl_target_option_hash.
   * lto-streamer.h (cl_target_option_stream_out,
   cl_target_option_stream_in): Declare.
   * tree.c (cl_option_hash_hash): Use cl_target_option_hash.
   (cl_option_hash_eq): Use cl_target_option_eq.
   * tree-streamer-in.c (unpack_value_fields): Stream in
   TREE_TARGET_OPTION.
   * lto-streamer-out.c (DFS::DFS_write_tree_body): Follow
   DECL_FUNCTION_SPECIFIC_TARGET.
   (hash_tree): Hash TREE_TARGET_OPTION; visit
   DECL_FUNCTION_SPECIFIC_TARGET.
   * tree-streamer-out.c (streamer_pack_tree_bitfields): Skip
   TS_TARGET_OPTION.
   (streamer_write_tree_body): Output TS_TARGET_OPTION.
 )
 
 MfG, JBG
 
 -- 
   Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
  Signature of:  http://perl.plover.com/Questions.html
  the second  :




Re: [PATCH] Fix gimple_fold_stmt_to_constant regression

2014-11-15 Thread H.J. Lu
On Fri, Nov 14, 2014 at 4:39 AM, Richard Biener rguent...@suse.de wrote:

 Following up https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01233.html and
 fixing the regressions this caused as soon as I removed the dispatch
 to fold_unary (and more regressions it would have caused if I managed
 to finish the idea to also remove the dispatches to fold_binary
 and fold_ternary...) the following patch makes CCP and VRP follow
 selected SSA edges again when gimple_fold_stmt_to_constant_1
 dispatches to gimple_simplify.

 The valueization for gimple_simplify of SSA propagator users may
 both valueize to anything (in particular constants) and it may
 signal to follow SSA edges if the destination will never be
 visited again by the propagator (thus its lattice value is stable).
 Esp. cutting out valueizing SSA names to constants is what caused
 the regressions.

 Note that this highlights the fact that overloading the valueization
 result with the signal to (not) follow SSA edges isn't the very
 best thing to do - for example we can't valueize to a SSA name
 (like for looking through SSA copies) but at the same time say
 that gimple_simplify shouldn't follow the edge to its definition.
 This shouldn't be a serious limitation for CCP and VRP which
 care about constants only - but it shows a defect in the
 gimple_simplify interface.  I haven't yet concluded on a better
 one though - options go from adding a secondary return to
 the valueize hook to adding a second hook maybe with additionally
 adding a simple flag to turn off SSA edge following globally.

 Anyway - the following patch should fix the immediate regression
 and allows to go forward with removing GENERIC folding from
 both fold_stmt and gimple_fold_stmt_to_constant.  Just not
 for this stage1 which will end too soon.

 Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

 Thanks,
 Richard.

 2014-11-14  Richard Biener  rguent...@suse.de

 * gimple-fold.h (gimple_fold_stmt_to_constant_1): Add 2nd
 valueization hook defaulted to no_follow_ssa_edges.
 * gimple-fold.c (gimple_fold_stmt_to_constant_1): Pass
 2nd valueization hook to gimple_simplify.
 * tree-ssa-ccp.c (valueize_op_1): New function to be
 used for gimple_simplify called via gimple_fold_stmt_to_constant_1.
 (ccp_fold): Adjust.
 * tree-vrp.c (vrp_valueize_1): New function to be
 used for gimple_simplify called via gimple_fold_stmt_to_constant_1.
 (vrp_visit_assignment_or_call): Adjust.


This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63898

-- 
H.J.


Re: [PING][PATCH] c11-atomic-exec-5: Avoid dead code where LDBL_MANT_DIG is 106

2014-11-15 Thread Maciej W. Rozycki
On Fri, 14 Nov 2014, Joseph Myers wrote:

 OK.

 Applied, thanks.

  Maciej


Re: [PATCH] gcc/testsuite: guality.exp: Fix `test_counts' restoration

2014-11-15 Thread Maciej W. Rozycki
On Fri, 14 Nov 2014, Jakub Jelinek wrote:

  gcc/testsuite/ 
  * g++.dg/guality/guality.exp (check_guality): Fix `test_counts' 
  restoration.
 
 Ok, thanks.

 Applied, thanks.

  Maciej


Small C++ constexpr PATCHes

2014-11-15 Thread Jason Merrill
Here are two small constexpr changes that I made while working on C++14 
constexpr support that I thought should go in separately.  The first 
clarifies the error about missing mem-initializers in constexpr 
constructors so that people aren't confused about why assigning to the 
field in the constructor body doesn't count as initialization.
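
A small sketch of the distinction, under the C++11/C++14 rules that apply
here (the struct and variable names are only for illustration):

```cpp
struct A
{
  int i;

  // Accepted: the member is initialized by a mem-initializer.
  constexpr A (int i_) : i (i_) { }

  // By contrast, "constexpr A (int i_) { i = i_; }" only assigns in the
  // body, so the member has no initializer.  That is the case the
  // reworded diagnostic is aimed at.
};

// The object can be used in constant expressions.
constexpr A a (42);
static_assert (a.i == 42, "a.i is a constant expression");
```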


The second is a small optimization opportunity that I noticed.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 2b9db1c0982dfdee378b41039a37262d87575e25
Author: Jason Merrill ja...@redhat.com
Date:   Thu Nov 13 20:44:02 2014 -0500

	* constexpr.c (cx_check_missing_mem_inits): Clarify error message.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index d30bf635..0d45f31 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -716,8 +716,9 @@ cx_check_missing_mem_inits (tree fun, tree body, bool complain)
 	}
 	  if (!complain)
 	return true;
-	  error ("uninitialized member %qD in %<constexpr%> constructor",
-		 field);
+	  error ("member %qD must be initialized by mem-initializer "
+		 "in %<constexpr%> constructor", field);
+	  inform (DECL_SOURCE_LOCATION (field), "declared here");
 	  bad = true;
 	}
   if (field == NULL_TREE)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor.C
index 659e733..55beda7 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor.C
@@ -3,5 +3,5 @@
 struct A
 {
   int i;
-  constexpr A() { }		// { dg-error "uninitialized member .A::i" }
+  constexpr A() { }		// { dg-error "A::i" }
 };
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag4.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag4.C
index 29f574d..13ca6fa 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag4.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag4.C
@@ -21,5 +21,5 @@ struct A1
 struct B1
 {
 A1 a1;
-constexpr B1() {} // { dg-error "uninitialized member" }
+constexpr B1() {} // { dg-error "B1::a1" }
 };
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ex3.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ex3.C
index 3e2685b..a589356 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ex3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ex3.C
@@ -6,7 +6,7 @@
 struct A
 {
   int i;
-  constexpr A(int _i) { i = _i; } // { dg-error "empty body|uninitialized member" }
+  constexpr A(int _i) { i = _i; } // { dg-error "empty body|A::i" }
 };
 
 template <class T>
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-template2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-template2.C
index a316b34..12a8d42 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-template2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-template2.C
@@ -3,7 +3,7 @@
 template <class T> struct A
 {
   T t;
-  constexpr A() { }		// { dg-error "uninitialized" }
+  constexpr A() { }		// { dg-error "::t" }
 };
 
 int main()
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi3.C b/gcc/testsuite/g++.dg/cpp0x/nsdmi3.C
index 6ac414b..d2e7439 100644
--- a/gcc/testsuite/g++.dg/cpp0x/nsdmi3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi3.C
@@ -15,4 +15,4 @@ struct B
 
 constexpr B b;			// { dg-error B::B }
 
-// { dg-prune-output uninitialized member }
+// { dg-prune-output B::a1 }
commit ac8ad66594ec8fcf2c3a1a03b71f0f3b4e6825e5
Author: Jason Merrill ja...@redhat.com
Date:   Sat Nov 15 00:53:13 2014 -0500

	* constexpr.c (cxx_eval_builtin_function_call): Use
	fold_builtin_call_array.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 0d45f31..66d356f 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -995,9 +995,8 @@ cxx_eval_builtin_function_call (const constexpr_ctx *ctx, tree t,
 }
   if (*non_constant_p)
 return t;
-  new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
-   CALL_EXPR_FN (t), nargs, args);
-  new_call = fold (new_call);
+  new_call = fold_builtin_call_array (EXPR_LOCATION (t), TREE_TYPE (t),
+  CALL_EXPR_FN (t), nargs, args);
   VERIFY_CONSTANT (new_call);
   return new_call;
 }


Small C++ PATCH to cp_parser_omp_declare_reduction_exprs

2014-11-15 Thread Jason Merrill
With the C++14 constexpr support we were getting confused by using 
finish_expr_stmt on something that isn't an expression.  This already 
should have been add_stmt, so I'm checking it in separately.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 046fdf7db65830bc1030d766ffa8f4ba696e0660
Author: Jason Merrill ja...@redhat.com
Date:   Sat Nov 15 01:31:47 2014 -0500

	* parser.c (cp_parser_omp_declare_reduction_exprs): A block is not
	an expression.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3ab65a9..111ec10 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31188,7 +31188,7 @@ cp_parser_omp_declare_reduction_exprs (tree fndecl, cp_parser *parser)
 
   block = finish_omp_structured_block (block);
   cp_walk_tree (block, cp_remove_omp_priv_cleanup_stmt, omp_priv, NULL);
-  finish_expr_stmt (block);
+  add_stmt (block);
 
   if (ctor)
 	add_decl_expr (omp_orig);


Re: Rerog streaming of OPTIMIZATION_NODE

2014-11-15 Thread Jan Hubicka
Jonah,
 Yep, it is because my code does not handle streaming of arrays into the 
 target optimization nodes.
 I will take a look at why that array is really needed.  It seems like 
 overkill?

I am looking into nios2_register_custom_code and I do not quite understand
what it is good for.  If it tracks custom codes function-wide, then perhaps
the target part of cfun would be a better place to house it.  If it is
unit-wide, then saving/restoring does not seem to make much sense.

Can you please explain to me why this needs to be stored in the target option
structure?  If this is really needed, then we can always add support for
streaming arrays, but I would prefer keeping that structure small and simple
;)

Honza


[BUILDROBOT] nios2: build breakage (was: Rerog streaming of OPTIMIZATION_NODE)

2014-11-15 Thread Jan-Benedict Glaw
Hi,

On Sun, 2014-11-16 00:36:27 +0100, Jan Hubicka hubi...@ucw.cz wrote:
  Yep, it is because my code does not handle streaming of arrays
  into the target optimization nodes.  I will take a look at why
  that array is really needed.  It seems like overkill?
 
 I am looking into nios2_register_custom_code and I do not quite
 understand what it is good for.  If it tracks custom codes
 function-wide, then perhaps the target part of cfun would be a better
 place to house it.  If it is unit-wide, then saving/restoring does not
 seem to make much sense.
 
 Can you please explain to me why this needs to be stored in the target
 option structure?  If this is really needed, then we can always add
 support for streaming arrays, but I would prefer keeping that
 structure small and simple ;)

Port maintainers Cc'ed.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:   http://www.eyrie.org/~eagle/faqs/questions.html
the second  :


signature.asc
Description: Digital signature


[PATCH, committed] Sync config.{guess,sub} with upstream

2014-11-15 Thread Jan-Benedict Glaw
Hi!

I was under the impression that somebody else took over keeping an eye
on syncing common files between gcc, binutils, automake, config, ...

  Seems I was kind of wrong with that assumption? Alas, I've started
my scripts again and will continue my former syncing work, starting
with some easy stuff and doing it step by step, additionally verifying
that multiple targets build without errors in the build robot.


2014-11-16  Jan-Benedict Glaw  jbg...@lug-owl.de

* config.sub: Update from upstream config repo.
* config.guess: Ditto.

 
diff --git a/config.guess b/config.guess
index 1f5c50c..6c32c86 100755
--- a/config.guess
+++ b/config.guess
@@ -2,7 +2,7 @@
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2014 Free Software Foundation, Inc.
 
-timestamp='2014-03-23'
+timestamp='2014-11-04'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -24,12 +24,12 @@ timestamp='2014-03-23'
 # program.  This Exception is an additional permission under section 7
 # of the GNU General Public License, version 3 (GPLv3).
 #
-# Originally written by Per Bothner.
+# Originally written by Per Bothner; maintained since 2000 by Ben Elliston.
 #
 # You can get the latest version of this script from:
 # http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
 #
-# Please send patches with a ChangeLog entry to config-patc...@gnu.org.
+# Please send patches to config-patc...@gnu.org.
 
 
 me=`echo $0 | sed -e 's,.*/,,'`
@@ -579,8 +579,9 @@ EOF
else
IBM_ARCH=powerpc
fi
-   if [ -x /usr/bin/oslevel ] ; then
-   IBM_REV=`/usr/bin/oslevel`
+   if [ -x /usr/bin/lslpp ] ; then
+   IBM_REV=`/usr/bin/lslpp -Lqc bos.rte.libc |
+  awk -F: '{ print $3 }' | sed s/[0-9]*$/0/`
else
IBM_REV=${UNAME_VERSION}.${UNAME_RELEASE}
fi
diff --git a/config.sub b/config.sub
index 88a0cb4..7cc68ba 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2014 Free Software Foundation, Inc.
 
-timestamp='2014-07-28'
+timestamp='2014-09-26'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -25,7 +25,7 @@ timestamp='2014-07-28'
 # of the GNU General Public License, version 3 (GPLv3).
 
 
-# Please send patches with a ChangeLog entry to config-patc...@gnu.org.
+# Please send patches to config-patc...@gnu.org.
 #
 # Configuration subroutine to validate and canonicalize a configuration type.
 # Supply the specified configuration type as an argument.
@@ -302,6 +302,7 @@ case $basic_machine in
| pdp10 | pdp11 | pj | pjl \
| powerpc | powerpc64 | powerpc64le | powerpcle \
| pyramid \
+   | riscv32 | riscv64 \
| rl78 | rx \
| score \
| sh | sh[1234] | sh[24]a | sh[24]aeb | sh[23]e | sh[34]eb | sheb | shbe | shle | sh[1234]le | sh3ele \
@@ -326,6 +327,9 @@ case $basic_machine in
c6x)
basic_machine=tic6x-unknown
;;
+   leon|leon[3-9])
+   basic_machine=sparc-$basic_machine
+   ;;
m6811 | m68hc11 | m6812 | m68hc12 | m68hcs12x | nvptx | picochip)
basic_machine=$basic_machine-unknown
os=-none
@@ -773,6 +777,9 @@ case $basic_machine in
basic_machine=m68k-isi
os=-sysv
;;
+   leon-*|leon[3-9]-*)
+   basic_machine=sparc-`echo $basic_machine | sed 's/-.*//'`
+   ;;
m68knommu)
basic_machine=m68k-unknown
os=-linux
-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
  Signature of:  Zensur im Internet? Nein danke!
  the second  :




Fix speculation in ipa-cp

2014-11-15 Thread Jan Hubicka
Hi,
this patch enables propagation of speculative contexts, which I promised to fix after
Martin's merge.  There were a few bugs that ended up disturbing the testsuite:

 1) ipa_polymorphic_call_context::combine_with did not handle well the case
    where the incoming type is in construction and its current type is known.
    This made us drop the ball on one of the devirt-*.C testcases.
 2) ipa_get_indirect_edge_target_1 did not correctly apply the offset adjustment
    and type_preserved prior to using the known context.  This caused an ICE
    while building Firefox.
 3) propagate_context_accross_jump_function copy-and-pasted the logic where,
    for constants, we track whether the function may be called externally with
    an unknown parameter (and thus we need to clone).
    In the case of contexts we do not really need to set this flag if we
    did not use the incoming information.
    Also, I think it would be better to handle these by speculation rather than
    cloning, but I hope Martin will follow up on this.
 4) I noticed that find_more_scalar_values_for_callers_subset seems to contain
    a bug when checking whether all incoming edges contribute the same constant
    to a given parameter.  The code seems to set the constant to NULL and then
    set it to non-NULL at the first occurrence.  When it hits two different
    constants, however, it resets back to NULL, and later it may become
    non-NULL again.
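
To illustrate the kind of bug described in item 4, here is a minimal,
standalone sketch.  It is not the ipa-cp.c code: the function names and the
use of plain int pointers are hypothetical stand-ins.  It only shows how using
NULL both for "nothing seen yet" and for "callers disagree" lets a later value
slip through, and one possible way to avoid that.

```c
#include <stddef.h>

/* Buggy variant: NULL means both "no value seen yet" and "callers
   disagree", so after a disagreement the next value is accepted again.  */
const int *
common_value_buggy (const int *const *vals, size_t n)
{
  const int *common = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (common == NULL)
        common = vals[i];
      else if (*common != *vals[i])
        common = NULL;          /* Reset, but the loop keeps going.  */
    }
  return common;
}

/* One possible fix: give up at the first disagreement.  */
const int *
common_value_fixed (const int *const *vals, size_t n)
{
  const int *common = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (common == NULL)
        common = vals[i];
      else if (*common != *vals[i])
        return NULL;            /* Callers disagree; no common value.  */
    }
  return common;
}
```

With three callers passing 1, 2 and 3, the buggy variant ends up returning the
last value, while the fixed variant reports that there is no common constant.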

Otherwise the patch just disables the parts where speculation is cleared; it makes
equal_to work better and it also adds a meet operation that is used to compute the
context of edges that have multiple callers.

Bootstrapped/regtested x86_64-linux, tested with Firefox; I plan to commit it
shortly.

Honza

* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::speculation_consistent_p): Constify.
(ipa_polymorphic_call_context::meet_speculation_with): New function.
(ipa_polymorphic_call_context::combine_with): Handle types in
construction better.
(ipa_polymorphic_call_context::equal_to): Do not bother about useless
speculation.
(ipa_polymorphic_call_context::meet_with): New function.
* cgraph.h (class ipa_polymorphic_call_context): Add
meet_with, meet_speculation_with; constify speculation_consistent_p.
* ipa-cp.c (ipa_context_from_jfunc): Handle speculation; combine with
incoming context.
(propagate_context_accross_jump_function): Likewise; be more careful
about set_contains_variable.
(ipa_get_indirect_edge_target_1): Fix handling of dynamic type changes.
(find_more_scalar_values_for_callers_subset): Fix.
(find_more_contexts_for_caller_subset): Perform meet operation.
Index: ipa-polymorphic-call.c
===
--- ipa-polymorphic-call.c  (revision 217609)
+++ ipa-polymorphic-call.c  (working copy)
@@ -1727,7 +1727,7 @@ bool
 ipa_polymorphic_call_context::speculation_consistent_p (tree spec_outer_type,
    HOST_WIDE_INT spec_offset,
    bool spec_maybe_derived_type,
-   tree otr_type)
+   tree otr_type) const
 {
   if (!flag_devirtualize_speculatively)
 return false;
@@ -1873,6 +1873,102 @@ ipa_polymorphic_call_context::combine_sp
   return false;
 }
 
+/* Make speculation less specific so
+   NEW_OUTER_TYPE, NEW_OFFSET, NEW_MAYBE_DERIVED_TYPE is also included.
+   If OTR_TYPE is set, assume the context is used with OTR_TYPE.  */
+
+bool
+ipa_polymorphic_call_context::meet_speculation_with
+   (tree new_outer_type, HOST_WIDE_INT new_offset, bool new_maybe_derived_type,
+tree otr_type)
+{
+  if (!new_outer_type && speculative_outer_type)
+{
+  clear_speculation ();
+  return true;
+}
+
+  /* restrict_to_inner_class may eliminate wrong speculation making our job
+ easeier.  */
+  if (otr_type)
+restrict_to_inner_class (otr_type);
+
+  if (!speculative_outer_type
+  || !speculation_consistent_p (speculative_outer_type,
+   speculative_offset,
+   speculative_maybe_derived_type,
+   otr_type))
+return false;
+
+  if (!speculation_consistent_p (new_outer_type, new_offset,
+new_maybe_derived_type, otr_type))
+{
+  clear_speculation ();
+  return true;
+}
+
+  else if (types_must_be_same_for_odr (speculative_outer_type,
+  new_outer_type))
+{
+  if (speculative_offset != new_offset)
+   {
+ clear_speculation ();
+ return true;
+   }
+  else
+   {
+ if (!speculative_maybe_derived_type && new_maybe_derived_type)
+   {
+ speculative_maybe_derived_type = true;
+ 

Re: OpenACC middle end changes

2014-11-15 Thread Gerald Pfeifer
On Thursday 2014-11-13 17:59, Thomas Schwinge wrote:
 Here is our current set of OpenACC middle end changes.  As discussed
 before, this is not yet all of OpenACC 2.0 -- we shall a) document what
 is working already, and b) continue to work on closing the gap.

As David wrote in a different context, strchrnul is a GNU extension and 
not present at least on AIX and FreeBSD 8 (and possibly 9).

Gerald

PS: Sorry, this mail got stuck in my outbox.


Re: Concepts code review

2014-11-15 Thread Braden Obrzut
I don't believe I'll be able to familiarize myself adequately with the 
more complex issues before the stage 1 deadline (from what I understand 
Andrew is/was taking care of the blocking issues?), so I will leave what 
I have for the more trivial issues and a few comments.


On 11/11/2014 12:05 PM, Jason Merrill wrote:



// Diagnose constraint failures in a variable concept.
void
diagnose_var (location_t loc, tree t, tree args)


The name and comment seem misleading since T is actually a 
TEMPLATE_ID_EXPR.


Variable templates (and thus concepts) are TEMPLATE_ID_EXPR.  I changed 
the comment to explicitly state that it will be a TEMPLATE_ID_EXPR, but 
I'm not sure the name needs to be changed.  If I recall correctly, I 
initially implemented as *_var_concept and Andrew told me to shorten 
it.  Note that this also matches up with normalize_var.



+// Bring the parameters of a function declaration back into
+// scope without entering the function body. The declarator
+// must be a function declarator. The caller is responsible
+// for calling finish_scope.
+void
+push_function_parms (cp_declarator *declarator)


I think if the caller is calling finish_scope, I'd prefer for the 
begin_scope call to be there as well.


Even though Andrew said that this will change later for other reasons, 
it's a function I wrote so: I actually debated this with Andrew before.  
My rationale for calling begin_scope in the function was that it feels 
consistent with the semantics of the call. Specifically it can be seen 
as reopening the function parameter scope.  Thus the call is balanced by 
calling finish_scope.  Either way would work of course, but perhaps it 
just needed a better name and/or comment?



+  // Save the current template requirements.
+  saved_template_reqs = release (current_template_reqs);


It seems like a lot of places with saved_template_reqs variables could 
be converted to use cp_manage_requirements.


Probably.  The instance you quoted isn't very trivial though.  The 
requirements are saved in two different branches and need to be restored 
before a certain point in the function.  Might just need to spend more 
time looking over the code.



+  // FIXME: This could be improved. Perhaps the type of the requires
+  // expression depends on the satisfaction of its constraints. That
+  // is, its type is bool only if its substitution into its normalized
+  // constraints succeeds.


The requires-expression is not type-dependent, but it can be 
instantiation-dependent and value-dependent.


This is an interesting change.  The REQUIRES_EXPR is currently marked as 
value dependent.  The ChangeLog indicates that Andrew loosened the 
conditions for being value dependent for some cases, but then added it 
as type dependent when something else failed.  May require some time to 
pin down exactly what needs to be done here.


- Braden Obrzut

2014-11-15  Braden Obrzut  ad...@maniacsvault.net

* gcc/cp/constraint.cc (resolve_constraint_check): Move definition
check to grokfndecl.
(normalize_template_id): Use expression location if available when
informing about missing parentheses.
(build_requires_expr): Added comment.
(diagnose_var): Clarified comment.
* gcc/cp/decl.c (check_concept_refinement): Remove outdated comment
regarding variable concepts.
(grokfndecl): Ensure that all concept declarations are definitions.
(grokdeclarator): Remove outdated comment regarding variable concepts.
* gcc/cp/parser.c (cp_parser_introduction_list): Use vec for temporary
list instead of a TREE_LIST.
(get_id_declarator): Renamed from cp_get_id_declarator.
(get_unqualified_id): Renamed from cp_get_identifier.
(is_constrained_parameter): Renamed from cp_is_constrained_parameter.
(cp_parser_check_constrained_type_parm): Renamed from
cp_check_constrained_type_parm.
(cp_parser_constrained_type_template_parm): Renamed from
cp_constrained_type_template_parm.
(cp_parser_constrained_template_template_parm): Renamed from
cp_constrained_template_template_parm.
(constrained_non_type_template_parm): Renamed from
cp_constrained_non_type_template_parm.
(finish_constrained_parameter): Renamed from
cp_finish_constrained_parameter.
(maybe_type_parameter): Renamed from cp_maybe_type_parameter.
(declares_type_parameter): Renamed from cp_declares_type_parameter.
(declares_type_template_parameter): Renamed from
cp_declares_type_template_parameter.
(declares_template_template_parameter): Renamed from
cp_declares_template_template_parameter.
(cp_parser_type_parameter): Call
cp_parser_default_type_template_argument and
cp_parser_default_template_template_argument which were already
factored out from this function.
(cp_maybe_constrained_type_specifier): Use the new INTRODUCED_PARM_DECL
instead of PLACEHOLDER_EXPR.
(cp_parser_requires_expr_scope): Remove old comment and change
destructor to use 

[PATCH] driver: ignore SIGINT while waiting on subprocesses to finish

2014-11-15 Thread Patrick Palka
Currently the top-level driver handles SIGINT by immediately killing
itself even when the driver has subprocesses (e.g. cc1, as) running.  I
don't think this is a good idea because

  1. if the top-level driver is waiting on a hanging subprocess,
  pressing ^C will kill the driver but it may not necessarily kill the
  subprocess; an unresponsive, perhaps busy-looping subprocess may be
  running in the background yet the compiler will seem to have
  terminated successfully.

  2. when debugging gcc with gcc -wrapper gdb,--args we are unable to
  use ^C from within the GDB subprocess because pressing ^C immediately
  kills the driver and we lose our terminal.  This makes debugging more
  inconvenient.

This patch fixes these two issues by having the driver ignore SIGINT
while a subprocess is running.  The driver instead will have to wait for
the subprocess to exit before it terminates, like usual.
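
As a rough illustration of the approach (not the gcc.c code itself), the
following standalone POSIX sketch ignores SIGINT in the parent while a child
runs and re-raises it afterwards if the child was killed by SIGINT.  The names
fatal_signal_sketch and run_and_wait are hypothetical, plain fork/execvp/waitpid
stand in for the driver's pex machinery, and the flag is declared as
volatile sig_atomic_t here rather than bool.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Nonzero while a child is running and SIGINT should not kill us.  */
static volatile sig_atomic_t ignore_sigint_p;

static void
fatal_signal_sketch (int signum)
{
  if (signum == SIGINT && ignore_sigint_p)
    {
      /* Re-arm the handler and keep waiting for the child.  */
      signal (signum, fatal_signal_sketch);
      return;
    }

  /* Otherwise fall back to the default action.  */
  signal (signum, SIG_DFL);
  raise (signum);
}

/* Run ARGV as a child and wait for it.  Returns the wait status,
   or -1 if the child could not be started or waited for.  */
int
run_and_wait (char *const argv[])
{
  int status = -1;
  pid_t pid;

  signal (SIGINT, fatal_signal_sketch);
  ignore_sigint_p = 1;

  pid = fork ();
  if (pid == 0)
    {
      execvp (argv[0], argv);
      _exit (127);
    }
  if (pid > 0)
    while (waitpid (pid, &status, 0) < 0)
      if (errno != EINTR)
        {
          status = -1;
          break;
        }

  ignore_sigint_p = 0;

  /* If the child died from SIGINT, die the same way.  */
  if (status != -1 && WIFSIGNALED (status) && WTERMSIG (status) == SIGINT)
    raise (SIGINT);
  return status;
}
```

The terminal delivers ^C to the whole foreground process group, so the child
still receives SIGINT directly; the parent merely outlives it and then
propagates the result.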

I tested this change by running gcc -wrapper gdb, gcc -wrapper
valgrind and plain old gcc in various ways (-pipe, -flto, -c, etc)
and pressing ^C during compilation.  I noticed no differences in
behavior or promptness other than finally being able to use ^C inside
GDB.

Does this change look OK for trunk after a successful bootstrap +
regtest on x86_64-unknown-linux-gnu?
---
 gcc/gcc.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 653ca8d..0802f41 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -2653,7 +2653,11 @@ add_sysrooted_prefix (struct path_prefix *pprefix, const char *prefix,
   add_prefix (pprefix, prefix, component, priority,
  require_machine_suffix, os_multilib);
 }
-
+
+/* True if the SIGINT signal should be ignored by the driver's
+   signal handler.  */
+static bool ignore_sigint_p;
+
 /* Execute the command specified by the arguments on the current line of spec.
When using pipes, this includes several piped-together commands
with `|' between them.
@@ -2839,6 +2843,10 @@ execute (void)
   if (pex == NULL)
     fatal_error ("pex_init failed: %m");
 
+  /* A SIGINT raised while subprocesses are running should not kill the
+ driver.  */
+  ignore_sigint_p = true;
+
   for (i = 0; i < n_commands; i++)
 {
   const char *errmsg;
@@ -2878,6 +2886,8 @@ execute (void)
 if (!pex_get_status (pex, n_commands, statuses))
       fatal_error ("failed to get exit status: %m");
 
+ignore_sigint_p = false;
+
 if (report_times || report_times_to_file)
   {
times = (struct pex_time *) alloca (n_commands * sizeof (struct pex_time));
@@ -2893,6 +2903,9 @@ execute (void)
 
if (WIFSIGNALED (status))
  {
+   if (WTERMSIG (status) == SIGINT)
+ raise (SIGINT);
+   else
 #ifdef SIGPIPE
/* SIGPIPE is a special case.  It happens in -pipe mode
   when the compiler dies before the preprocessor is done,
@@ -6710,6 +6723,12 @@ set_input (const char *filename)
 static void
 fatal_signal (int signum)
 {
+  if (signum == SIGINT && ignore_sigint_p)
+{
+  signal (signum, fatal_signal);
+  return;
+}
+
   signal (signum, SIG_DFL);
   delete_failure_queue ();
   delete_temp_files ();
-- 
2.2.0.rc1.23.gf570943



Re: [BUILDROBOT] nios2: build breakage

2014-11-15 Thread Sandra Loosemore

On 11/15/2014 04:49 PM, Jan-Benedict Glaw wrote:

Hi,

On Sun, 2014-11-16 00:36:27 +0100, Jan Hubicka hubi...@ucw.cz wrote:

Yep, it is because my code does not handle streaming of arrays
into the target optimization nodes.  I will take a look at why
that array is really needed.  It seems like overkill?


I am looking into nios2_register_custom_code and I do not quite
understand what it is good for.  If it tracks custom codes
function-wide, then perhaps the target part of cfun would be a better
place to home it.  If it is unit-wide, then saving/restoring does not
seem to make much sense.

Can you please explain to me why this needs to be stored in the target
option structure?  If this is really needed, then we can always add
support for streaming arrays, but I would prefer keeping that
structure small and simple ;)


Port maintainers Cc'ed.


I can explain why this is needed, at least.

The Nios II architecture optionally allows custom instructions that 
are typically used to implement floating-point operations.  The nios2 
GCC backend knows to generate these instructions if the user tells it 
what opcodes implement these instructions.  This is typically done by 
command-line options, but can also be done on a per-function basis by 
means of target attributes or pragmas -- see 
gcc/testsuite/gcc.target/nios2/custom-fp-* for examples.  The 
command-line options, attribute, and pragma support was a requirement 
from Altera, so yes, this is really needed.


-Sandra



Re: [PATCH, libgfortran] PR 60324 Unbounded stack allocations in libgfortran

2014-11-15 Thread Janne Blomqvist
On Sat, Nov 15, 2014 at 8:01 AM, Janne Blomqvist
blomqvist.ja...@gmail.com wrote:
 On Fri, Nov 14, 2014 at 11:02 PM, Tobias Burnus bur...@net-b.de wrote:
 I think instead of doing a run-time check I'd prefer something like the
 following, keeping the compile-time assert.

 --- a/libgfortran/intrinsics/random.c
 +++ b/libgfortran/intrinsics/random.c
 @@ -253 +253 @@ static GFC_UINTEGER_4 kiss_default_seed[] = {
 -static const GFC_INTEGER_4 kiss_size =
 sizeof(kiss_seed)/sizeof(kiss_seed[0]);
 +#define KISS_SIZE ((GFC_INTEGER_4) (sizeof(kiss_seed)/sizeof(kiss_seed[0])))

 (plus s/kiss_size/KISS_SIZE/ changes in the code.)

 Janne, what do you think?

 I like it. With this, you can also get rid of the assert and the newly
 introduced KISS_MAX_SIZE macro, and just make the seed array the
 correct size, as was originally done (with a VLA). Consider such a
 patch pre-approved.

I went ahead and did it myself, committed the attached patch as
r217623 after regtesting.
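
To show why the macro helps, here is a small standalone sketch.  It is not the
libgfortran code: the seed values and the seed_bytes helper are hypothetical,
and uint32_t stands in for GFC_UINTEGER_4.  In C, a "static const" integer
variable is not an integer constant expression, so an array sized with it is a
variable-length array; a sizeof-based macro is a constant expression, so the
buffer below has a fixed size known at compile time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical seed table; only its length matters here.  */
static uint32_t kiss_seed[] = {
  1u, 2u, 3u, 4u, 5u, 6u, 7u, 8u, 9u, 10u, 11u, 12u
};

/* Element count as a constant expression.  */
#define KISS_SIZE (sizeof (kiss_seed) / sizeof (kiss_seed[0]))

/* Copy the raw seed bytes into DEST.  Returns the number of bytes
   written, or 0 if DEST is too small.  */
size_t
seed_bytes (unsigned char *dest, size_t dest_len)
{
  /* Fixed-size array, not a VLA, because KISS_SIZE is constant.  */
  unsigned char seed[4 * KISS_SIZE];
  _Static_assert (sizeof (seed) == sizeof (kiss_seed),
                  "seed buffer must match kiss_seed");

  if (dest_len < sizeof (seed))
    return 0;

  memcpy (seed, kiss_seed, sizeof (seed));
  memcpy (dest, seed, sizeof (seed));
  return sizeof (seed);
}
```

Because the macro has type size_t, comparisons against signed values need a
cast, which is presumably why the patch adds (index_type) casts in the extent
checks.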

2014-11-16  Janne Blomqvist  j...@gcc.gnu.org

PR libfortran/60324
* intrinsics/random.c (kiss_size): Rename to KISS_SIZE, make it a
macro instead of a variable.
(random_seed_i4): Make seed correct size, remove assert, KISS_SIZE
related changes.
(random_seed_i8): KISS_SIZE related changes.


-- 
Janne Blomqvist
diff --git a/libgfortran/intrinsics/random.c b/libgfortran/intrinsics/random.c
index 5e91929..d2510b2 100644
--- a/libgfortran/intrinsics/random.c
+++ b/libgfortran/intrinsics/random.c
@@ -224,7 +224,7 @@ KISS algorithm.  */
z=0,c=0 and z=2^32-1,c=698769068
should be avoided.  */
 
-/* Any modifications to the seeds that change kiss_size below need to be
+/* Any modifications to the seeds that change KISS_SIZE below need to be
reflected in check.c (gfc_check_random_seed) to enable correct
compile-time checking of PUT size for the RANDOM_SEED intrinsic.  */
 
@@ -250,7 +250,7 @@ static GFC_UINTEGER_4 kiss_default_seed[] = {
 #endif
 };
 
-static const GFC_INTEGER_4 kiss_size = sizeof(kiss_seed)/sizeof(kiss_seed[0]);
+#define KISS_SIZE (sizeof(kiss_seed)/sizeof(kiss_seed[0]))
 
 static GFC_UINTEGER_4 * const kiss_seed_1 = kiss_seed;
 static GFC_UINTEGER_4 * const kiss_seed_2 = kiss_seed + 4;
@@ -665,12 +665,7 @@ unscramble_seed (unsigned char *dest, unsigned char *src, int size)
 void
 random_seed_i4 (GFC_INTEGER_4 *size, gfc_array_i4 *put, gfc_array_i4 *get)
 {
-  int i;
-
-#define KISS_MAX_SIZE 12
-  unsigned char seed[4 * KISS_MAX_SIZE];
-  _Static_assert (kiss_size <= KISS_MAX_SIZE,
- "kiss_size must <= KISS_MAX_SIZE");
+  unsigned char seed[4 * KISS_SIZE];
 
   __gthread_mutex_lock (random_lock);
 
@@ -681,11 +676,11 @@ random_seed_i4 (GFC_INTEGER_4 *size, gfc_array_i4 *put, gfc_array_i4 *get)
   /* From the standard: If no argument is present, the processor assigns
      a processor-dependent value to the seed.  */
   if (size == NULL && put == NULL && get == NULL)
-  for (i = 0; i < kiss_size; i++)
+  for (size_t i = 0; i < KISS_SIZE; i++)
kiss_seed[i] = kiss_default_seed[i];
 
   if (size != NULL)
-*size = kiss_size;
+*size = KISS_SIZE;
 
   if (put != NULL)
 {
@@ -694,18 +689,18 @@ random_seed_i4 (GFC_INTEGER_4 *size, gfc_array_i4 *put, gfc_array_i4 *get)
     runtime_error ("Array rank of PUT is not 1.");
 
   /* If the array is too small, abort.  */
-  if (GFC_DESCRIPTOR_EXTENT(put,0) < kiss_size)
+  if (GFC_DESCRIPTOR_EXTENT(put,0) < (index_type) KISS_SIZE)
     runtime_error ("Array size of PUT is too small.");
 
   /*  We copy the seed given by the user.  */
-  for (i = 0; i < kiss_size; i++)
+  for (size_t i = 0; i < KISS_SIZE; i++)
 memcpy (seed + i * sizeof(GFC_UINTEGER_4),
-   &(put->base_addr[(kiss_size - 1 - i) * GFC_DESCRIPTOR_STRIDE(put,0)]),
+   &(put->base_addr[(KISS_SIZE - 1 - i) * GFC_DESCRIPTOR_STRIDE(put,0)]),
 sizeof(GFC_UINTEGER_4));
 
   /* We put it after scrambling the bytes, to paper around users who
 provide seeds with quality only in the lower or upper part.  */
-  scramble_seed ((unsigned char *) kiss_seed, seed, 4*kiss_size);
+  scramble_seed ((unsigned char *) kiss_seed, seed, 4 * KISS_SIZE);
 }
 
   /* Return the seed to GET data.  */
@@ -716,15 +711,15 @@ random_seed_i4 (GFC_INTEGER_4 *size, gfc_array_i4 *put, gfc_array_i4 *get)
 runtime_error ("Array rank of GET is not 1.");
 
   /* If the array is too small, abort.  */
-  if (GFC_DESCRIPTOR_EXTENT(get,0) < kiss_size)
+  if (GFC_DESCRIPTOR_EXTENT(get,0) < (index_type) KISS_SIZE)
 runtime_error ("Array size of GET is too small.");
 
   /* Unscramble the seed.  */
-  unscramble_seed (seed, (unsigned char *) kiss_seed, 4*kiss_size);
+  unscramble_seed (seed, (unsigned char *) kiss_seed, 4 * KISS_SIZE);
 
   /*  Then copy it back to the user variable.  */
-  for (i = 0; i < kiss_size; i++)
-   memcpy (&(get->base_addr[(kiss_size - 1 - i) * 

Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish

2014-11-15 Thread Patrick Palka

On Sat, 15 Nov 2014, Patrick Palka wrote:


Currently the top-level driver handles SIGINT by immediately killing
itself even when the driver has subprocesses (e.g. cc1, as) running.  I
don't think this is a good idea because

 1. if the top-level driver is waiting on a hanging subprocess,
 pressing ^C will kill the driver but it may not necessarily kill the
 subprocess; an unresponsive, perhaps busy-looping subprocess may be
 running in the background yet the compiler will seem to have
 terminated successfully.

 2. when debugging gcc with gcc -wrapper gdb,--args we are unable to
 use ^C from within the GDB subprocess because pressing ^C immediately
 kills the driver and we lose our terminal.  This makes debugging more
 inconvenient.

This patch fixes these two issues by having the driver ignore SIGINT
while a subprocess is running.  The driver instead will have to wait for
the subprocess to exit before it terminates, like usual.

I tested this change by running gcc -wrapper gdb, gcc -wrapper
valgrind and plain old gcc in various ways (-pipe, -flto, -c, etc)
and pressing ^C during compilation.  I noticed no differences in
behavior or promptness other than finally being able to use ^C inside
GDB.

Does this change look OK for trunk after a successful bootstrap +
regtest on x86_64-unknown-linux-gnu?


I forgot a ChangeLog entry:

* gcc.c (ignore_sigint_p): New static variable.
(execute): Use it.
(fatal_signal): Ignore SIGINT if ignore_sigint_p is true.


Re: Concepts code review

2014-11-15 Thread Jason Merrill

On 11/15/2014 07:58 PM, Braden Obrzut wrote:

Variable templates (and thus concepts) are TEMPLATE_ID_EXPR.  I changed
the comment to explicitly state that it will be a TEMPLATE_ID_EXPR, but
I'm not sure the name needs to be changed.  If I recall correctly, I
initially implemented as *_var_concept and Andrew told me to shorten
it.  Note that this also matches up with normalize_var.


OK.


I think if the caller is calling finish_scope, I'd prefer for the
begin_scope call to be there as well.


Even though Andrew said that this will change later for other reasons,
it's a function I wrote so: I actually debated this with Andrew before.
My rationale for calling begin_scope in the function was that it feels
consistent with the semantics of the call. Specifically it can be seen
as reopening the function parameter scope.  Thus the call is balanced by
calling finish_scope.  Either way would work of course, but perhaps it
just needed a better name and/or comment?


A better name would be OK.  Perhaps push_function_parms_with_scope.

Jason



Re: [Patch] PR 61692 - Fix for inline asm ICE

2014-11-15 Thread David Wohlferd

On 9/15/2014 2:51 PM, Jeff Law wrote:
Let's go with your original inputs + outputs + labels change and punt 
the clobbers stuff for now.


jeff


I have also added the test code you requested.

I have a release on file with the FSF, but don't have SVN write access.

Problem:
extract_insn() in recog.c will ICE if (noperands > MAX_RECOG_OPERANDS).  
Normally this isn't a problem since expand_asm_operands() in cfgexpand.c 
catches and reports a proper error for this condition.  However, 
expand_asm_operands() only checks (ninputs + noutputs) instead of 
(ninputs + noutputs + nlabels), so you can get the ICE when using asm 
goto.
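
The difference between the old and the new check can be sketched in isolation.
This is a simplified stand-in, not the cfgexpand.c code: the helper names are
hypothetical, and MAX_OPERANDS is assumed to be 30 only because the test below
expects the message "more than 30 operands" (the real MAX_RECOG_OPERANDS comes
from the target configuration).

```c
#include <stdbool.h>

/* Assumed limit for this sketch; see the note above.  */
#define MAX_OPERANDS 30

/* Old check: labels of an asm goto are not counted.  */
bool
too_many_operands_old (int ninputs, int noutputs, int nlabels)
{
  (void) nlabels;
  return ninputs + noutputs > MAX_OPERANDS;
}

/* New check: inputs, outputs and labels all count as operands.  */
bool
too_many_operands_new (int ninputs, int noutputs, int nlabels)
{
  return ninputs + noutputs + nlabels > MAX_OPERANDS;
}
```

With 30 inputs and one label, the old check lets the statement through (and the
later operand count of 31 would trip extract_insn), while the new check rejects
it up front.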


ChangeLog:
2014-11-15  David Wohlferd d...@limegreensocks.com

PR target/61692
* cfgexpand.c (expand_asm_operands): Count all inline asm params.
* testsuite/gcc.dg/pr61692.c: New test.

dw
Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c	(revision 217623)
+++ gcc/cfgexpand.c	(working copy)
@@ -2589,7 +2589,7 @@
 }
 
   ninputs += ninout;
-  if (ninputs + noutputs > MAX_RECOG_OPERANDS)
+  if (ninputs + noutputs + nlabels > MAX_RECOG_OPERANDS)
     {
       error ("more than %d operands in %<asm%>", MAX_RECOG_OPERANDS);
   return;
Index: gcc/testsuite/gcc.dg/pr61692.c
===
--- gcc/testsuite/gcc.dg/pr61692.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr61692.c	(working copy)
@@ -0,0 +1,173 @@
+/*  PR 61692  */
+
+/* Check for ice when exceededing the max #
+   of parameters to inline asm. */
+
+int Labels()
+{
+label01: label02: label03: label04: label05:
+label06: label07: label08: label09: label10:
+label11: label12: label13: label14: label15:
+label16: label17: label18: label19: label20:
+label21: label22: label23: label24: label25:
+label26: label27: label28: label29: label30:
+label31:
+
+__asm__ goto ("" /* Works. */
+: /* no outputs */
+: /* no inputs */
+: /* no clobbers */
+: label01, label02, label03, label04, label05,
+  label06, label07, label08, label09, label10,
+  label11, label12, label13, label14, label15,
+  label16, label17, label18, label19, label20,
+  label21, label22, label23, label24, label25,
+  label26, label27, label28, label29, label30);
+
+__asm__ goto ("" /* { dg-error "more than 30 operands" } */
+: /* no outputs */
+: /* no inputs */
+: /* no clobbers */
+: label01, label02, label03, label04, label05,
+  label06, label07, label08, label09, label10,
+  label11, label12, label13, label14, label15,
+  label16, label17, label18, label19, label20,
+  label21, label22, label23, label24, label25,
+  label26, label27, label28, label29, label30,
+  label31);
+
+return 0;
+}
+
+int Labels_and_Inputs()
+{
+int b01, b02, b03, b04, b05, b06, b07, b08, b09, b10;
+int b11, b12, b13, b14, b15, b16, b17, b18, b19, b20;
+int b21, b22, b23, b24, b25, b26, b27, b28, b29, b30;
+int b31;
+
+label01: label02: label03: label04: label05:
+label06: label07: label08: label09: label10:
+label11: label12: label13: label14: label15:
+label16: label17: label18: label19: label20:
+label21: label22: label23: label24: label25:
+label26: label27: label28: label29: label30:
+label31:
+
+b01 = b02 = b03 = b04 = b05 = b06 = b07 = b08 = b09 = b10 = 0;
+b11 = b12 = b13 = b14 = b15 = b16 = b17 = b18 = b19 = b20 = 0;
+b21 = b22 = b23 = b24 = b25 = b26 = b27 = b28 = b29 = b30 = 0;
+b31 = 0;
+
+__asm__ goto ("" /* Works. */
+  : /* no outputs */
+  : "m" (b01), "m" (b02), "m" (b03), "m" (b04), "m" (b05),
+    "m" (b06), "m" (b07), "m" (b08), "m" (b09), "m" (b10),
+    "m" (b11), "m" (b12), "m" (b13), "m" (b14), "m" (b15),
+    "m" (b16), "m" (b17), "m" (b18), "m" (b19), "m" (b20),
+    "m" (b21), "m" (b22), "m" (b23), "m" (b24), "m" (b25),
+    "m" (b26), "m" (b27), "m" (b28), "m" (b29)
+  : /* no clobbers */
+  : label01);
+
+__asm__ goto ("" /* { dg-error "more than 30 operands" } */
+  : /* no outputs */
+  : "m" (b01), "m" (b02), "m" (b03), "m" (b04), "m" (b05),
+    "m" (b06), "m" (b07), "m" (b08), "m" (b09), "m" (b10),
+    "m" (b11), "m" (b12), "m" (b13), "m" (b14), "m" (b15),
+    "m" (b16), "m" (b17), "m" (b18), "m" (b19), "m" (b20),
+    "m" (b21), "m" (b22), "m" (b23), "m" (b24), "m" (b25),
+    "m" (b26), "m" (b27), "m" (b28), "m" (b29), "m" (b30)
+  : /* no clobbers */
+  : label01);
+
+  return 0;
+}
+
+int Outputs()
+{
+int b01, b02, b03, b04, b05, b06, b07, b08, b09, b10;
+int b11, b12, b13, b14, b15, b16, b17, b18, b19, b20;
+int b21, b22, b23, b24, b25, b26, b27, b28, b29, b30;
+int b31;
+
+/* Outputs. */
+__asm__ volatile ("" /* Works. */
+ : "=m" (b01),  "=m" (b02),  "=m" (b03),  "=m" (b04), "=m" (b05),
+   "=m" (b06),  "=m" (b07),  "=m" (b08),  "=m" (b09), "=m" (b10),
+   "=m" (b11),  

Re: [PATCH] Look through widening type conversions for possible edge assertions

2014-11-15 Thread Patrick Palka
On Wed, Nov 12, 2014 at 3:38 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Wed, Nov 12, 2014 at 5:17 AM, Patrick Palka patr...@parcs.ath.cx wrote:
 On Tue, Nov 11, 2014 at 8:48 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Tue, Nov 11, 2014 at 1:10 PM, Patrick Palka patr...@parcs.ath.cx wrote:
 This patch is a replacement for the 2nd VRP refactoring patch.  It
 simply teaches VRP to look through widening type conversions when
 finding suitable edge assertions, e.g.

 bool p = x != y;
 int q = (int) p;
 if (q == 0) // new edge assert: p == 0 and therefore x == y

 I think the proper fix is to forward x != y to q == 0 instead of this one.
 That said - the tree-ssa-forwprop.c restriction on only forwarding
 single-uses into conditions is clearly bogus here.  I suggest to
 relax it for conversions and compares.  Like with

 Index: tree-ssa-forwprop.c
 ===
 --- tree-ssa-forwprop.c (revision 217349)
 +++ tree-ssa-forwprop.c (working copy)
 @@ -476,7 +476,7 @@ forward_propagate_into_comparison_1 (gim
 {
   rhs0 = rhs_to_tree (TREE_TYPE (op1), def_stmt);
   tmp = combine_cond_expr_cond (stmt, code, type,
 -   rhs0, op1, !single_use0_p);
 +   rhs0, op1, false);
   if (tmp)
 return tmp;
 }


 Thanks,
 Richard.

 That makes sense.  Attached is what I have so far.  I relaxed the
 forwprop restriction in the case of comparing an integer constant with
 a comparison or with a conversion from a boolean value.  (If I allow
 all conversions, not just those from a boolean value, then a couple of
 -Wstrict-overflow failures trigger.)  Does the change look sensible?
 Should the logic be duplicated for the case when TREE_CODE (op1) ==
 SSA_NAME?  Thanks for your help so far!

 It looks good though I'd have allowed all kinds of conversions, not only
 those from booleans.

 If the patch tests ok with that change it is ok.

Sadly, changing the patch to propagate all kinds of conversions, not
just those from booleans, introduces regressions that I don't
know how to adequately fix.


RFA (tree-inline): PATCH for more C++14 constexpr support

2014-11-15 Thread Jason Merrill
This patch implements more support for C++14 constexpr: it allows 
arbitrary modification of variables in a constexpr function, but does 
not currently handle jumping -- multiple returns, loops, switches.


The approach I took for this was to just use the DECL_SAVED_TREE for a 
constexpr function as the basis for expansion rather than trying to 
massage it into a magic expression.  And now the values of local 
variables, including parameters, are kept in the values hash map that I 
introduced with the aggregate NSDMI patch.


But in the presence of recursive constexpr calls we can't use the same 
PARM_DECL as a key, so we need to remap it.  Thus I've added 
remap_fn_body to tree-inline.c to unshare the entire function body and 
remap the parms and result to avoid clashes.


This handles some more C++14 testcases and has no regressions on C++11 
constexpr testcases.  Support for jumps will follow soon.


Tested x86_64-pc-linux-gnu and powerpc64-unknown-linux-gnu.

Is the remap_fn_body function ok for trunk?
commit decc90baa31ae1535b4b0ab80aeee185471a5ddb
Author: Jason Merrill ja...@redhat.com
Date:   Sat Nov 15 10:46:55 2014 -0500

	C++14 constexpr support (minus loops and multiple returns)
gcc/
	* tree-inline.c (remap_fn_body): New.
	* tree-inline.h: Declare it.
gcc/cp/
	* constexpr.c (use_new_call): New macro.
	(build_data_member_initialization): Ignore non-mem-inits.
	(check_constexpr_bind_expr_vars): Remove C++14 checks.
	(constexpr_fn_retval): Likewise.
	(check_constexpr_ctor_body): Do nothing in C++14.
	(massage_constexpr_body): In C++14 only collect mem-inits.
	(get_function_named_in_call): Handle null CALL_EXPR_FN.
	(cxx_bind_parameters_in_call): Build bindings in same order as
	parameters.  Don't treat invisiref parms specially in new call mode.
	(cxx_eval_call_expression): If use_new_call, do constexpr expansion
	based on DECL_SAVED_TREE rather than the massaged constexpr body.
	(cxx_eval_component_reference): A null element means we're mid-
	initialization.
	(cxx_eval_store_expression, cxx_eval_increment_expression): New.
	(cxx_eval_constant_expression): Handle RESULT_DECL, DECL_EXPR,
	MODIFY_EXPR, STATEMENT_LIST, BIND_EXPR, USING_STMT,
	PREINCREMENT_EXPR, POSTINCREMENT_EXPR, PREDECREMENT_EXPR,
	POSTDECREMENT_EXPR.  Don't look into DECL_INITIAL of variables in
	constexpr functions.  In new-call mode find parms in the values table.
	(potential_constant_expression_1): Handle null CALL_EXPR_FN.
	Handle STATEMENT_LIST, MODIFY_EXPR, MODOP_EXPR, IF_STMT,
	PREINCREMENT_EXPR, POSTINCREMENT_EXPR, PREDECREMENT_EXPR,
	POSTDECREMENT_EXPR, BIND_EXPR, WITH_CLEANUP_EXPR,
	CLEANUP_POINT_EXPR, MUST_NOT_THROW_EXPR, TRY_CATCH_EXPR,
	EH_SPEC_BLOCK, EXPR_STMT, DECL_EXPR.
	(cxx_eval_array_reference): Call build_cplus_new.
	(cxx_eval_component_reference): Likewise.
	(cxx_eval_outermost_constant_expr): Pull object out of AGGR_INIT_EXPR.
	(maybe_constant_init): Look through INIT_EXPR.
	(ensure_literal_type_for_constexpr_object): Set
	cp_function_chain->invalid_constexpr.
	* cp-tree.h (struct language_function): Add invalid_constexpr bitfield.
	* decl.c (start_decl): Set cp_function_chain->invalid_constexpr.
	(check_for_uninitialized_const_var): Likewise.
	(maybe_save_function_definition): Check it.
	* parser.c (cp_parser_jump_statement): Set
	cp_function_chain->invalid_constexpr.
	(cp_parser_asm_definition): Likewise.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 66d356f..53cfb18 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-iterator.h"
 #include "gimplify.h"
 #include "builtins.h"
+#include "tree-inline.h"
 
 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X)		\
@@ -93,8 +94,11 @@ ensure_literal_type_for_constexpr_object (tree decl)
 	error ("the type %qT of constexpr variable %qD is not literal",
 		   type, decl);
 	  else
-	error ("variable %qD of non-literal type %qT in %<constexpr%> "
-		   "function", decl, type);
+	{
+	  error ("variable %qD of non-literal type %qT in %<constexpr%> "
+		 "function", decl, type);
+	  cp_function_chain->invalid_constexpr = true;
+	}
 	  explain_non_literal_class (type);
 	  return NULL;
 	}
@@ -310,13 +314,20 @@ build_data_member_initialization (tree t, vec<constructor_elt, va_gc> **vec)
   if (TREE_CODE (t) == CONVERT_EXPR)
 t = TREE_OPERAND (t, 0);
   if (TREE_CODE (t) == INIT_EXPR
-  || TREE_CODE (t) == MODIFY_EXPR)
+  /* vptr initialization shows up as a MODIFY_EXPR.  In C++14 we only
+	 use what this function builds for cx_check_missing_mem_inits, and
+	 assignment in the ctor body doesn't count.  */
+  || (cxx_dialect < cxx14 && TREE_CODE (t) == MODIFY_EXPR))
 {
   member = TREE_OPERAND (t, 0);
   init = break_out_target_exprs 

a patch to fix an x86 bootstrap with LRA

2014-11-15 Thread Vladimir Makarov
  The following patch fixes an x86 bootstrap failure with different sets 
of options (--with-arch=corei7 --with-cpu=intel, --with-arch=core2 
--with-cpu=slm).


  The patch was bootstrapped on x86 (with the 2 sets) and x86-64.

  Committed as rev. 217624.

2014-11-15  Vladimir Makarov  vmaka...@redhat.com

* lra-remat.c (cand_trans_fun): Process regno for
rematerialization too.
* lra.c (lra): Switch on rematerialization pass.
Index: lra.c
===
--- lra.c   (revision 217609)
+++ lra.c   (working copy)
@@ -2354,7 +2354,7 @@ lra (FILE *f)
break;
   /* Now we know what pseudos should be spilled.  Try to
 rematerialize them first.  */
-  if (0 && lra_remat ())
+  if (lra_remat ())
{
  /* We need full live info -- see the comment above.  */
  lra_create_live_ranges (lra_reg_spill_p, true);
Index: lra-remat.c
===
--- lra-remat.c (revision 217602)
+++ lra-remat.c (working copy)
@@ -860,6 +860,10 @@ cand_trans_fun (int bb_index, bitmap bb_
bitmap_set_bit (temp_bitmap, cid);
break;
  }
+  /* Check regno for rematerialization.  */
+  if (bitmap_bit_p (bb_changed_regs, cand->regno)
+ || bitmap_bit_p (bb_dead_regs, cand->regno))
+   bitmap_set_bit (temp_bitmap, cid);
 }
   return bitmap_ior_and_compl (bb_out,
   bb_info->gen_cands, bb_in, temp_bitmap);


RFC: PATCH to genericize C++ loops to LOOP_EXPR instead of gotos

2014-11-15 Thread Jason Merrill
I've had a TODO in genericize_cp_loop for a long time suggesting that we 
should genericize to LOOP_EXPR rather than gotos, and now that I need to 
interpret the function body for constexpr evaluation, making this change 
will also simplify that handling.


This change also does away with canonicalizing the condition to the 
bottom of the loop, to avoid the extra goto.  It seems to me that this 
is unnecessary nowadays, since the optimizers are very capable of making 
any necessary transformations, but I'm interested in feedback from other 
people.
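
To make the two shapes concrete, here is a rough sketch in ordinary C++ rather than GENERIC.  The function names are invented for the example, and the hand-written control flow only approximates what the front end builds: sum_gotos mimics the old lowering with the condition canonicalized to the bottom and an extra goto to reach it, while sum_loop mimics a LOOP_EXPR whose body starts with the exit test.

```cpp
#include <cassert>

// Source form: a plain for loop with the condition first.
int sum_for (int n)
{
  int s = 0;
  for (int i = 0; i < n; ++i)
    s += i;
  return s;
}

// Approximation of the old lowering: the test sits at the bottom of the
// loop, so entering the loop needs an extra goto to reach it.
int sum_gotos (int n)
{
  int s = 0;
  int i = 0;
  goto entry;
top:
  s += i;      // loop body
  ++i;         // increment expression
entry:
  if (i < n)
    goto top;
  return s;
}

// Approximation of the new lowering: an unconditional loop whose first
// statement is "if (cond) <empty> else goto break-label".
int sum_loop (int n)
{
  int s = 0;
  int i = 0;
  for (;;)
    {
      if (i < n)
        ;        // condition holds: nothing to do
      else
        break;   // condition fails: leave the loop
      s += i;    // loop body
      ++i;       // increment expression
    }
  return s;
}

// All three forms should agree for any trip count, including zero.
void check_lowerings ()
{
  for (int n = 0; n < 8; ++n)
    {
      assert (sum_gotos (n) == sum_for (n));
      assert (sum_loop (n) == sum_for (n));
    }
}
```

Whether the optimizers recover the bottom-tested form from the second shape in every case is exactly the question raised above; the sketch only shows that the two are semantically equivalent.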


Tested x86_64-pc-linux-gnu.

Opinions?
commit 1a45860e7757ee054f6bf98bee4ebe5c661dfb90
Author: Jason Merrill ja...@redhat.com
Date:   Thu Nov 13 23:54:48 2014 -0500

	* cp-gimplify.c (genericize_cp_loop): Use LOOP_EXPR.
	(genericize_for_stmt): Handle null statement-list.

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index e5436bb..81b26d2 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -208,7 +208,7 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
 		void *data)
 {
   tree blab, clab;
-  tree entry = NULL, exit = NULL, t;
+  tree exit = NULL;
   tree stmt_list = NULL;
 
   blab = begin_bc_block (bc_break, start_locus);
@@ -222,64 +222,46 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
   cp_walk_tree (&incr, cp_genericize_r, data, NULL);
   *walk_subtrees = 0;
 
-  /* If condition is zero don't generate a loop construct.  */
-  if (cond && integer_zerop (cond))
+  if (cond && TREE_CODE (cond) != INTEGER_CST)
 {
-  if (cond_is_first)
-	{
-	  t = build1_loc (start_locus, GOTO_EXPR, void_type_node,
-			  get_bc_label (bc_break));
-	  append_to_statement_list (t, &stmt_list);
-	}
-}
-  else
-{
-  /* Expand to gotos, just like c_finish_loop.  TODO: Use LOOP_EXPR.  */
-  tree top = build1 (LABEL_EXPR, void_type_node,
-			 create_artificial_label (start_locus));
-
-  /* If we have an exit condition, then we build an IF with gotos either
-	 out of the loop, or to the top of it.  If there's no exit condition,
-	 then we just build a jump back to the top.  */
-  exit = build1 (GOTO_EXPR, void_type_node, LABEL_EXPR_LABEL (top));
-
-  if (cond && !integer_nonzerop (cond))
-	{
-	  /* Canonicalize the loop condition to the end.  This means
-	 generating a branch to the loop condition.  Reuse the
-	 continue label, if possible.  */
-	  if (cond_is_first)
-	{
-	  if (incr)
-		{
-		  entry = build1 (LABEL_EXPR, void_type_node,
-  create_artificial_label (start_locus));
-		  t = build1_loc (start_locus, GOTO_EXPR, void_type_node,
-  LABEL_EXPR_LABEL (entry));
-		}
-	  else
-		t = build1_loc (start_locus, GOTO_EXPR, void_type_node,
-get_bc_label (bc_continue));
-	  append_to_statement_list (t, &stmt_list);
-	}
-
-	  t = build1 (GOTO_EXPR, void_type_node, get_bc_label (bc_break));
-	  exit = fold_build3_loc (start_locus,
-  COND_EXPR, void_type_node, cond, exit, t);
-	}
-
-  append_to_statement_list (top, &stmt_list);
+  /* If COND is constant, don't bother building an exit.  If it's false,
+	 we won't build a loop.  If it's true, any exits are in the body.  */
+  location_t cloc = EXPR_LOC_OR_LOC (cond, start_locus);
+  exit = build1_loc (cloc, GOTO_EXPR, void_type_node,
+			 get_bc_label (bc_break));
+  exit = fold_build3_loc (cloc, COND_EXPR, void_type_node, cond,
+			  build_empty_stmt (cloc), exit);
 }
 
+  if (exit && cond_is_first)
+    append_to_statement_list (exit, &stmt_list);
   append_to_statement_list (body, &stmt_list);
   finish_bc_block (&stmt_list, bc_continue, clab);
   append_to_statement_list (incr, &stmt_list);
-  append_to_statement_list (entry, &stmt_list);
-  append_to_statement_list (exit, &stmt_list);
+  if (exit && !cond_is_first)
+    append_to_statement_list (exit, &stmt_list);
+
+  if (!stmt_list)
+stmt_list = build_empty_stmt (start_locus);
+
+  tree loop;
+  if (cond && integer_zerop (cond))
+{
+  if (cond_is_first)
+	loop = fold_build3_loc (start_locus, COND_EXPR,
+void_type_node, cond, stmt_list,
+build_empty_stmt (start_locus));
+  else
+	loop = stmt_list;
+}
+  else
+loop = build1_loc (start_locus, LOOP_EXPR, void_type_node, stmt_list);
+
+  stmt_list = NULL;
+  append_to_statement_list (loop, &stmt_list);
   finish_bc_block (&stmt_list, bc_break, blab);
-
-  if (stmt_list == NULL_TREE)
-stmt_list = build1 (NOP_EXPR, void_type_node, integer_zero_node);
+  if (!stmt_list)
+stmt_list = build_empty_stmt (start_locus);
 
   *stmt_p = stmt_list;
 }
@@ -303,6 +285,8 @@ genericize_for_stmt (tree *stmt_p, int *walk_subtrees, void *data)
   genericize_cp_loop (&loop, EXPR_LOCATION (stmt), FOR_COND (stmt),
 		  FOR_BODY (stmt), FOR_EXPR (stmt), 1, walk_subtrees, data);
   append_to_statement_list (loop, &expr);
+  if (expr == NULL_TREE)
+expr = loop;
   *stmt_p = expr;
 }
 


Re: RFA (tree-inline): PATCH for more C++14 constexpr support

2014-11-15 Thread Jason Merrill

On 11/15/2014 11:59 PM, Jason Merrill wrote:

This handles some more C++14 testcases and has no regressions on C++11
constexpr testcases.  Support for jumps will follow soon.


Like this, though it still needs a bit of cleaning up.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 53cfb18..611d4f2 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -871,7 +871,7 @@ struct constexpr_ctx {
 static GTY (()) hash_table<constexpr_call_hasher> *constexpr_call_table;
 
 static tree cxx_eval_constant_expression (const constexpr_ctx *, tree,
-	  bool, bool, bool *, bool *);
+	  bool, bool, bool *, bool *, tree *);
 
 /* Compute a hash value for a constexpr call representation.  */
 
@@ -993,7 +993,8 @@ cxx_eval_builtin_function_call (const constexpr_ctx *ctx, tree t,
 {
   args[i] = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, i),
 	  allow_non_constant, addr,
-	  non_constant_p, overflow_p);
+	  non_constant_p, overflow_p,
+	  NULL);
   if (allow_non_constant && *non_constant_p)
 	return t;
 }
@@ -1070,7 +1071,7 @@ cxx_bind_parameters_in_call (const constexpr_ctx *ctx, tree t,
 	}
   arg = cxx_eval_constant_expression (ctx, x, allow_non_constant,
 	  TREE_CODE (type) == REFERENCE_TYPE,
-	  non_constant_p, overflow_p);
+	  non_constant_p, overflow_p, NULL);
   /* Don't VERIFY_CONSTANT here.  */
   if (*non_constant_p && allow_non_constant)
 	return;
@@ -1151,7 +1152,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   /* Might be a constexpr function pointer.  */
   fun = cxx_eval_constant_expression (ctx, fun, allow_non_constant,
 	  /*addr*/false, non_constant_p,
-	  overflow_p);
+	  overflow_p, NULL);
   STRIP_NOPS (fun);
   if (TREE_CODE (fun) == ADDR_EXPR)
 	fun = TREE_OPERAND (fun, 0);
@@ -1187,7 +1188,8 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	{
 	  tree arg = convert_from_reference (get_nth_callarg (t, 1));
 	  return cxx_eval_constant_expression (ctx, arg, allow_non_constant,
-	   addr, non_constant_p, overflow_p);
+	   addr, non_constant_p,
+	   overflow_p, NULL);
 	}
   else if (TREE_CODE (t) == AGGR_INIT_EXPR
 	&& AGGR_INIT_ZERO_FIRST (t))
@@ -1270,7 +1272,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	  result = (cxx_eval_constant_expression
 			(&new_ctx, new_call.fundef->body,
 			 allow_non_constant, addr,
-			 non_constant_p, overflow_p));
+			 non_constant_p, overflow_p, NULL));
 	}
 	  else
 	{
@@ -1310,8 +1312,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	  else
 		ctx->values->put (res, NULL_TREE);
 
+	  tree jump_target = NULL_TREE;
 	  cxx_eval_constant_expression (ctx, body, allow_non_constant,
-	addr, non_constant_p, overflow_p);
+	addr, non_constant_p, overflow_p,
+	&jump_target);
 
 	  if (VOID_TYPE_P (TREE_TYPE (res)))
 		/* This can be null for a subobject constructor call, in
@@ -1320,7 +1324,16 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 		   to put such a call in the hash table.  */
 		result = addr ? ctx->object : ctx->ctor;
 	  else
-		result = *ctx->values->get (slot ? slot : res);
+		{
+		  result = *ctx->values->get (slot ? slot : res);
+		  if (result == NULL_TREE)
+		{
+		  if (!allow_non_constant)
+			error ("constexpr call flows off the end "
+			   "of the function");
+		  *non_constant_p = true;
+		}
+		}
 
 	  /* Remove the parms/result from the values map.  Is it worth
 		 bothering to do this when the map itself is only live for
@@ -1433,7 +1446,8 @@ cxx_eval_unary_expression (const constexpr_ctx *ctx, tree t,
   tree r;
   tree orig_arg = TREE_OPERAND (t, 0);
   tree arg = cxx_eval_constant_expression (ctx, orig_arg, allow_non_constant,
-	   addr, non_constant_p, overflow_p);
+	   addr, non_constant_p, overflow_p,
+	   NULL);
   VERIFY_CONSTANT (arg);
   if (arg == orig_arg)
 return t;
@@ -1456,11 +1470,11 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
   tree lhs, rhs;
   lhs = cxx_eval_constant_expression (ctx, orig_lhs,
   allow_non_constant, addr,
-  non_constant_p, overflow_p);
+  non_constant_p, overflow_p, NULL);
   VERIFY_CONSTANT (lhs);
   rhs = cxx_eval_constant_expression (ctx, orig_rhs,
   allow_non_constant, addr,
-  non_constant_p, overflow_p);
+  non_constant_p, overflow_p, NULL);
   VERIFY_CONSTANT (rhs);
   if (lhs == orig_lhs && rhs == orig_rhs)
 return t;
@@ -1476,20 +1490,24 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
 static tree
 cxx_eval_conditional_expression (const constexpr_ctx *ctx, tree t,
  bool allow_non_constant, bool addr,
- bool *non_constant_p, bool *overflow_p)
+ bool *non_constant_p, bool *overflow_p,
+ tree *jump_target)
 {
   tree val = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 

Avoid applying inline plan for all functions ahead of late compilation

2014-11-15 Thread Jan Hubicka
Hi,
late in GCC 4.9 development we broke the feature that ltrans stages do not read
all functions in ahead of time.  This is because late IPA passes do not like to
see functions without IPA transformations applied.  I was originally OK with
that solution, based on the fact that the only late IPA pass we had was IPA-PTA,
which is disabled by default and probably should eventually become part of WPA
in some form.  SIMD streaming was added, however, and it causes us to stream in
all function bodies and apply all inlining decisions at the very beginning of
the optimization queue.

Fixed by this patch.  get_body is now responsible for applying transformations
on demand, and late IPA passes need to call get_body on the functions they are
interested in.  They are also advised not to be interested in every single
function in the program.
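
As a rough model of that split (not GCC's actual cgraph_node; the Node type, its fields and the string "body" are all invented for this sketch), the idea is that one accessor merely reads the body in, while the other also applies whatever transformations are still queued, and only for the node that was asked for:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a call-graph node.  The "body" is just a string
// and a "transform" is any function that rewrites it.
struct Node
{
  std::string body;      // empty until the body has been read in
  bool loaded = false;
  std::vector<std::function<void (std::string &)>> pending;  // queued transforms

  // Read the body in, leaving queued transforms untouched.
  std::string &get_untransformed_body ()
  {
    if (!loaded)
      {
        body = "raw";    // placeholder for streaming the body in
        loaded = true;
      }
    return body;
  }

  // Read the body in and apply the queued transforms on demand.
  std::string &get_body ()
  {
    get_untransformed_body ();
    for (auto &transform : pending)
      transform (body);
    pending.clear ();    // each transform is applied at most once
    return body;
  }
};
```

A pass that only needs a handful of functions would call get_body on those alone, so the remaining nodes are never read in or transformed up front.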

The patch also hits a bug in i386's ix86_set_current_function.  That function is
responsible for initializing the backend, and it does so lazily by remembering
the options the backend was last initialized for.  Pragma parsing, however,
clears that cache, which leads to wrong settings being used for subsequent
functions.

Bootstrapped/regtested x86_64-linux; will commit it tomorrow after a bit more
testing.

Index: gcc/cgraphclones.c
===
--- gcc/cgraphclones.c  (revision 217612)
+++ gcc/cgraphclones.c  (working copy)
@@ -307,7 +307,7 @@ duplicate_thunk_for_node (cgraph_node *t
 node = duplicate_thunk_for_node (thunk_of, node);
 
   if (!DECL_ARGUMENTS (thunk->decl))
-    thunk->get_body ();
+    thunk->get_untransformed_body ();
 
   cgraph_edge *cs;
   for (cs = node->callers; cs; cs = cs->next_caller)
@@ -1067,7 +1067,7 @@ symbol_table::materialize_all_clones (vo
   && !gimple_has_body_p (node->decl))
{
  if (!node->clone_of->clone_of)
-   node->clone_of->get_body ();
+   node->clone_of->get_untransformed_body ();
  if (gimple_has_body_p (node->clone_of->decl))
{
  if (symtab->dump_file)
Index: gcc/ipa-icf.c
===
--- gcc/ipa-icf.c   (revision 217612)
+++ gcc/ipa-icf.c   (working copy)
@@ -706,7 +706,7 @@ void
 sem_function::init (void)
 {
   if (in_lto_p)
-    get_node ()->get_body ();
+    get_node ()->get_untransformed_body ();
 
   tree fndecl = node->decl;
   function *func = DECL_STRUCT_FUNCTION (fndecl);
Index: gcc/passes.c
===
--- gcc/passes.c(revision 217612)
+++ gcc/passes.c(working copy)
@@ -2214,36 +2214,6 @@ execute_one_pass (opt_pass *pass)
  executed.  */
   invoke_plugin_callbacks (PLUGIN_PASS_EXECUTION, pass);
 
-  /* SIPLE IPA passes do not handle callgraphs with IPA transforms in it.
- Apply all trnasforms first.  */
-  if (pass->type == SIMPLE_IPA_PASS)
-{
-  struct cgraph_node *node;
-  bool applied = false;
-  FOR_EACH_DEFINED_FUNCTION (node)
-   if (node->analyzed
-&& node->has_gimple_body_p ()
-&& (!node->clone_of || node->decl != node->clone_of->decl))
- {
-   if (!node->global.inlined_to
-&& node->ipa_transforms_to_apply.exists ())
- {
-   node->get_body ();
-   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
-   execute_all_ipa_transforms ();
-   cgraph_edge::rebuild_edges ();
-   free_dominance_info (CDI_DOMINATORS);
-   free_dominance_info (CDI_POST_DOMINATORS);
-   pop_cfun ();
-   applied = true;
- }
- }
-  if (applied)
-   symtab->remove_unreachable_nodes (false, dump_file);
-  /* Restore current_pass.  */
-  current_pass = pass;
-}
-
   if (!quiet_flag && !cfun)
     fprintf (stderr, " <%s>", pass->name ? pass->name : "");
 
Index: gcc/cgraphunit.c
===
--- gcc/cgraphunit.c(revision 217612)
+++ gcc/cgraphunit.c(working copy)
@@ -197,7 +197,6 @@ along with GCC; see the file COPYING3.
 #include "target.h"
 #include "diagnostic.h"
 #include "params.h"
-#include "fibheap.h"
 #include "intl.h"
 #include "hash-map.h"
 #include "plugin-api.h"
@@ -1469,7 +1468,7 @@ cgraph_node::expand_thunk (bool output_a
}
 
   if (in_lto_p)
-   get_body ();
+   get_untransformed_body ();
   a = DECL_ARGUMENTS (thunk_fndecl);
   
   current_function_decl = thunk_fndecl;
@@ -1522,7 +1521,7 @@ cgraph_node::expand_thunk (bool output_a
   gimple ret;
 
   if (in_lto_p)
-   get_body ();
+   get_untransformed_body ();
   a = DECL_ARGUMENTS (thunk_fndecl);
 
   current_function_decl = thunk_fndecl;
@@ -1744,7 +1743,7 @@ cgraph_node::expand (void)
   announce_function (decl);
   process = 0;
   gcc_assert (lowered);
-  get_body ();
+  get_untransformed_body ();
 
   /* Generate RTL for the body of DECL.  */
 
Index: 

Re: Avoid applying inline plan for all functions ahead of late compilation

2014-11-15 Thread Richard Biener
On November 16, 2014 8:15:37 AM CET, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
late in GCC 4.9 development we broke the feature that ltrans stages do not read
all functions in ahead.  This is because of late IPA passes that do not like to
see functions without IPA transformations applied.  I was originally OK with the
solution based on fact that we have only IPA-PTA as late IPA pass that is
disabled by default and eventually probably should become part of WPA in some
form.  SIMD streaming was however added and this causes us to stream in all
function bodies and apply all inlining decisions at very beginning of
optimization queue.

Fixed by this patch.  get_body is now responsible for applying transformations
on demand and late IPA passes needs to call get_body on functions that they are
interested in + are advised to not be interested in every single function in
the program.

The patch also hits a bug in i386's ix86_set_current_function. It is
responsible for initializing backend and it does so lazily remembering the
previous options backend was initialized for. Pragma parsing however clears the
cache that leads to wrong settings being used for subsequent functions.

Bootstrapped/regtested x86_64-linux, will commit it tomorrow after bit of more
testing.

But for IPA-PTA, for example, does this mean we apply all IPA transforms
without any garbage collection run?

Richard.

Index: gcc/cgraphclones.c
===
--- gcc/cgraphclones.c (revision 217612)
+++ gcc/cgraphclones.c (working copy)
@@ -307,7 +307,7 @@ duplicate_thunk_for_node (cgraph_node *t
 node = duplicate_thunk_for_node (thunk_of, node);
 
   if (!DECL_ARGUMENTS (thunk-decl))
-thunk-get_body ();
+thunk-get_untransformed_body ();
 
   cgraph_edge *cs;
   for (cs = node-callers; cs; cs = cs-next_caller)
@@ -1067,7 +1067,7 @@ symbol_table::materialize_all_clones (vo
  !gimple_has_body_p (node-decl))
   {
 if (!node-clone_of-clone_of)
-  node-clone_of-get_body ();
+  node-clone_of-get_untransformed_body ();
 if (gimple_has_body_p (node-clone_of-decl))
   {
 if (symtab-dump_file)
Index: gcc/ipa-icf.c
===
--- gcc/ipa-icf.c  (revision 217612)
+++ gcc/ipa-icf.c  (working copy)
@@ -706,7 +706,7 @@ void
 sem_function::init (void)
 {
   if (in_lto_p)
-get_node ()-get_body ();
+get_node ()-get_untransformed_body ();
 
   tree fndecl = node-decl;
   function *func = DECL_STRUCT_FUNCTION (fndecl);
Index: gcc/passes.c
===
--- gcc/passes.c   (revision 217612)
+++ gcc/passes.c   (working copy)
@@ -2214,36 +2214,6 @@ execute_one_pass (opt_pass *pass)
  executed.  */
   invoke_plugin_callbacks (PLUGIN_PASS_EXECUTION, pass);
 
-  /* SIPLE IPA passes do not handle callgraphs with IPA transforms in
it.
- Apply all trnasforms first.  */
-  if (pass-type == SIMPLE_IPA_PASS)
-{
-  struct cgraph_node *node;
-  bool applied = false;
-  FOR_EACH_DEFINED_FUNCTION (node)
-  if (node-analyzed
-   node-has_gimple_body_p ()
-   (!node-clone_of || node-decl != node-clone_of-decl))
-{
-  if (!node-global.inlined_to
-   node-ipa_transforms_to_apply.exists ())
-{
-  node-get_body ();
-  push_cfun (DECL_STRUCT_FUNCTION (node-decl));
-  execute_all_ipa_transforms ();
-  cgraph_edge::rebuild_edges ();
-  free_dominance_info (CDI_DOMINATORS);
-  free_dominance_info (CDI_POST_DOMINATORS);
-  pop_cfun ();
-  applied = true;
-}
-}
-  if (applied)
-  symtab-remove_unreachable_nodes (false, dump_file);
-  /* Restore current_pass.  */
-  current_pass = pass;
-}
-
   if (!quiet_flag  !cfun)
 fprintf (stderr,  %s, pass-name ? pass-name : );
 
Index: gcc/cgraphunit.c
===
--- gcc/cgraphunit.c   (revision 217612)
+++ gcc/cgraphunit.c   (working copy)
@@ -197,7 +197,6 @@ along with GCC; see the file COPYING3.
 #include target.h
 #include diagnostic.h
 #include params.h
-#include fibheap.h
 #include intl.h
 #include hash-map.h
 #include plugin-api.h
@@ -1469,7 +1468,7 @@ cgraph_node::expand_thunk (bool output_a
   }
 
   if (in_lto_p)
-  get_body ();
+  get_untransformed_body ();
   a = DECL_ARGUMENTS (thunk_fndecl);
   
   current_function_decl = thunk_fndecl;
@@ -1522,7 +1521,7 @@ cgraph_node::expand_thunk (bool output_a
   gimple ret;
 
   if (in_lto_p)
-  get_body ();
+  get_untransformed_body ();
   a = DECL_ARGUMENTS (thunk_fndecl);
 
   current_function_decl = thunk_fndecl;
@@ -1744,7 +1743,7 @@ cgraph_node::expand (void)
   announce_function (decl);