date:20140417

Reduce -flto -fprofile-generate memory use

2014-04-17 Thread Jan Hubicka

Hi,
while compiling Firefox I noticed that -fprofile-generate -flto goes to 8GB.
It turns out that this is caused by ipa_reference no longer being disabled,
because in_lto_p became a flag that is set later (it is not clear to me why it
needs to be this way).

I do not, however, see a reason not to disable ipa-reference for the non-LTO
path, too.
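
The change can be pictured with a small stand-alone model. This is only a sketch: `Options`, `OptionsSet` and `handle_profile_generate` are hypothetical names, not GCC's real `gcc_options` machinery. It assumes one structure holding option values and a parallel one recording which options the user set explicitly, and shows the patched condition, which no longer consults in_lto_p.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's option state; the real code uses
// struct gcc_options for both the values and the "explicitly set" record.
struct Options {
  bool flag_ipa_reference = true;  // pass enabled by default in this model
  bool in_lto_p = false;           // may only become true later
};

struct OptionsSet {
  bool flag_ipa_reference = false;  // true if the user set the flag explicitly
};

// Models the patched -fprofile-generate handling: the default is overridden
// only when the user did not set the flag, and in_lto_p is not consulted.
void handle_profile_generate(Options &opts, const OptionsSet &opts_set) {
  if (!opts_set.flag_ipa_reference)
    opts.flag_ipa_reference = false;
}
```

With this shape an explicit user choice survives, while the default is dropped on both the LTO and the non-LTO path.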

Bootstrapped/regtested x86_linux, committed to mainline.
OK for 4.9.1?

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 209461)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2014-04-16  Jan Hubicka  hubi...@ucw.cz
 
+   * opts.c (common_handle_option): Disable -fipa-reference correctly
+   with -fuse-profile.
+
+2014-04-16  Jan Hubicka  hubi...@ucw.cz
+
* ipa-devirt.c (odr_type_d): Add field all_derivations_known.
(type_all_derivations_known_p): New predicate.
(type_all_ctors_visible_p): New predicate.
Index: opts.c
===
--- opts.c  (revision 209461)
+++ opts.c  (working copy)
@@ -1732,7 +1732,7 @@ common_handle_option (struct gcc_options
   /* FIXME: Instrumentation we insert makes ipa-reference bitmaps
 quadratic.  Disable the pass until better memory representation
 is done.  */
-  if (!opts_set->x_flag_ipa_reference && opts->x_in_lto_p)
+  if (!opts_set->x_flag_ipa_reference)
     opts->x_flag_ipa_reference = false;
   break;

Re: Reduce -flto -fprofile-generate memory use

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Jan Hubicka wrote:

 Hi,
 while compiling Firefox I noticed that -fprofile-generate -flto goes to 8GB.
 It turns out that this is caused by ipa_reference no longer being disabled,
 because in_lto_p became a flag that is set later (it is not clear to me why it
 needs to be this way).
 
 I do not, however, see a reason not to disable ipa-reference for the non-LTO
 path, too.
 
 Bootstrapped/regtested x86_linux, committed to mainline.
 OK for 4.9.1?

Yes.

Thanks,
Richard.

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

Re: Fix lto/PR60854

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 4:30 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 the testcase shows a problem where a cpp implicit alias is always inline and
 symtab_remove_unreachable_nodes removes the body of the aliased function before
 inlining happens.  The real problem is that cgraph_state is set too early,
 and not after inlining as the comment says, but for the release branch I think
 it is easier to solve the problem by simply making the alias target
 reachable by hand.

 Bootstrapped/regtested x86_64-linux, committed to trunk. Let me know
 when it is OK for the release branch.
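
As a rough illustration of "making the alias target reachable by hand", here is a toy model. `Symbol`, `ultimate_target` and `mark_callee` are hypothetical names and not GCC's cgraph API; `ultimate_target` stands in for what the patch does with cgraph_function_node, and the outer condition that decides whether the callee is marked at all is left out.

```cpp
#include <cassert>
#include <set>

// Toy symbol, not GCC's cgraph_node: an alias points at its target.
struct Symbol {
  bool external = false;
  const Symbol *alias_target = nullptr;  // non-null for aliases
};

// Follow alias links to the symbol that actually carries the body.
const Symbol *ultimate_target(const Symbol *s) {
  while (s->alias_target)
    s = s->alias_target;
  return s;
}

// Mark CALLEE reachable.  Before inlining, an external alias also keeps its
// target alive so the body is still there when the inliner needs it.
void mark_callee(std::set<const Symbol *> &reachable, const Symbol &callee,
                 bool before_inlining_p) {
  if (callee.external && callee.alias_target && before_inlining_p)
    reachable.insert(ultimate_target(&callee));
  reachable.insert(&callee);
}
```

After inlining the extra marking is skipped, so the target body can still be removed once nothing needs it.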

It's ok for 4.9.1.

Richard.

 Honza

 Index: ChangeLog
 ===
 --- ChangeLog   (revision 209458)
 +++ ChangeLog   (working copy)
 @@ -1,3 +1,9 @@
 +2014-04-16  Jan Hubicka  hubi...@ucw.cz
 +
 +   PR ipa/60854
 +   * ipa.c (symtab_remove_unreachable_nodes): Mark targets of
 +   external aliases alive, too.
 +
  2014-04-16  Andrew  Pinski  apin...@cavium.com

 * config/host-linux.c (TRY_EMPTY_VM_SPACE): Change aarch64 ilp32
 Index: testsuite/ChangeLog
 ===
 --- testsuite/ChangeLog (revision 209450)
 +++ testsuite/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2014-04-16  Jan Hubicka  hubi...@ucw.cz
 +
 +   PR ipa/60854
 +   * g++.dg/torture/pr60854.C: New testcase.
 +
  2014-04-16  Catherine Moore  c...@codesourcery.com

 * gcc.target/mips/umips-store16-2.c: New test.
 Index: ipa.c
 ===
 --- ipa.c   (revision 209450)
 +++ ipa.c   (working copy)
 @@ -415,7 +415,18 @@ symtab_remove_unreachable_nodes (bool be
                   || !DECL_EXTERNAL (e->callee->decl)
                   || e->callee->alias
                   || before_inlining_p))
 -              pointer_set_insert (reachable, e->callee);
 +              {
 +                /* Be sure that we will not optimize out alias target
 +                   body.  */
 +                if (DECL_EXTERNAL (e->callee->decl)
 +                    && e->callee->alias
 +                    && before_inlining_p)
 +                  {
 +                    pointer_set_insert (reachable,
 +                                        cgraph_function_node (e->callee));
 +                  }
 +                pointer_set_insert (reachable, e->callee);
 +              }
                enqueue_node (e->callee, first, reachable);
              }

 Index: testsuite/g++.dg/torture/pr60854.C
 ===
 --- testsuite/g++.dg/torture/pr60854.C  (revision 0)
 +++ testsuite/g++.dg/torture/pr60854.C  (revision 0)
 @@ -0,0 +1,13 @@
 +template <typename T>
 +class MyClass
 +{
 +public:
 +  __attribute__ ((__always_inline__)) inline MyClass () { ; }
 +};
 +
 +extern template class MyClass<double>;
 +
 +void Func()
 +{
 +  MyClass<double> x;
 +}

Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 7:30 AM, Jeff Law l...@redhat.com wrote:
 On 03/18/14 04:13, bin.cheng wrote:

 Hi,
 After the control flow graph change made by
 http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
 gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
 targets including cortex-m3/cortex-m0.
 The regression reveals a missed opportunity in jump threading, which
 causes a forwarder basic block not to be removed in cfgcleanup after jump
 threading in VRP1.  The root cause is stated in the corresponding PR:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to it for
 a detailed report.

 This patch fixes the issue by adding the constant value instead of the
 ssa_name as the new phi argument.  Bootstrapped and tested on x86_64, also
 tested on cortex-m3, and the regression is gone.
 I think this should wait for stage1, but would like to hear some comments
 now.  So does it look reasonable?


 2014-03-18  Bin Cheng  bin.ch...@arm.com

 PR regression/60363
 * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
 (copy_phi_args): New parameters.  Call get_value_locus_in_path.
 (update_destination_phis): New parameter.
 (create_edge_and_update_destination_phis): Ditto.
 (ssa_fix_duplicate_block_edges): Pass new arguments.
 (thread_single_edge): Ditto.

 This is a good and interesting catch. DOM knows how to propagate these
 context sensitive equivalences which should expose the optimizable forwarder
 blocks.

 But I'm a big believer in catching as many CFG simplifications as early as
 we can as they tend to have nice cascading effects.  So if we can pick it up
 by being smarter in how we duplicate arguments, then I'm all for it.

 +  for (int j = idx - 1; j >= 0; j--)
 +    {
 +      edge e = (*path)[j]->e;
 +      if (e->dest == def_bb)
 +       {
 +         arg = gimple_phi_arg_def (def_phi, e->dest_idx);
 +         *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
 +         return (TREE_CODE (arg) == INTEGER_CST ? arg : def);

 Presumably any constant that can legitimately appear in a PHI node is good
 here.  So for example ADDR_EXPR something in static storage ought to be
 handled as well.

 One could also argue that we should go ahead and do a context sensitive copy
 propagation here too if ARG turns out to be an SSA_NAME.  You have to be a
 bit more careful with those and use may_propagate_copy_p and you'd probably
 want to test the loop depth of the SSA_NAMEs to ensure you're not doing a
 propagation that is going to muck up LICM.  See loop_depth_of_name uses in
 tree-ssa-dom.c.

 Overall I think it's good.  We just need to resolve whether or not we want
 to catch constant ADDR_EXPRs and/or do the context sensitive copy
 propagations.

Simply use is_gimple_min_invariant (arg) ? arg : def

 jeff
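
The backward walk discussed above, with the invariant test suggested in the reply, can be sketched outside of GCC roughly as follows. Everything here is a toy model: `Value`, `PathEdge` and `value_along_path` are hypothetical names, and the `is_invariant` flag merely stands in for what is_gimple_min_invariant would decide on a real tree.

```cpp
#include <cassert>
#include <vector>

// Toy model, not GCC's data structures: a value is either an invariant
// (for example a constant) or an SSA-like name.
struct Value {
  bool is_invariant;
  int id;  // constant value or name number
};

// One edge on the threading path: the block it enters and the argument the
// PHI in that block receives along this edge.
struct PathEdge {
  int dest_block;
  Value phi_arg;
};

// Walk the path backwards from position IDX.  If an earlier edge enters the
// block that defines DEF, return that edge's PHI argument when it is
// invariant; otherwise fall back to DEF itself.
Value value_along_path(const std::vector<PathEdge> &path, int idx,
                       int def_block, Value def) {
  assert(idx >= 0 && idx <= static_cast<int>(path.size()));
  for (int j = idx - 1; j >= 0; j--) {
    const PathEdge &e = path[j];
    if (e.dest_block == def_block)
      return e.phi_arg.is_invariant ? e.phi_arg : def;
  }
  return def;
}
```

Propagating SSA names instead of falling back to DEF would need the extra checks mentioned in the review (may_propagate_copy_p, loop depth), which this sketch does not attempt.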

[PATCH] Fix PR60841

2014-04-17 Thread Richard Biener


This fixes running into the exponential value-graph -> SLP tree
expansion by artificially limiting the overall SLP tree size.
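
To see why a size limit helps, consider a toy operand graph in which every node uses the same operand twice: expanding it into a tree doubles the work at each level. The sketch below is not the vectorizer code; `Node` and `build_tree` are hypothetical, and a single shared counter replaces the per-call bookkeeping of the actual patch.

```cpp
#include <cassert>
#include <vector>

// Toy operand graph, not GCC's SLP structures.  Shared operands are visited
// once per use, so an unbounded expansion into a tree can grow exponentially.
struct Node {
  std::vector<const Node *> operands;
};

// Expand the graph rooted at NODE into a tree, counting created children in
// TREE_SIZE.  Gives up (returns false) once the budget is exceeded.
bool build_tree(const Node &node, unsigned &tree_size, unsigned max_tree_size) {
  for (const Node *op : node.operands) {
    if (++tree_size > max_tree_size)
      return false;
    if (!build_tree(*op, tree_size, max_tree_size))
      return false;
  }
  return true;
}
```

A chain of ten such nodes already expands to 2046 children, so a budget tied to the number of statements stops the build early instead of exploring the whole expansion.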

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16   Richard Biener  rguent...@suse.de

PR tree-optimization/60841
* tree-vect-data-refs.c (vect_analyze_data_refs): Count stmts.
* tree-vect-loop.c (vect_analyze_loop_2): Pass down number
of stmts to SLP build.
* tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise.
(vect_analyze_slp): Likewise.
(vect_analyze_slp_instance): Likewise.
(vect_build_slp_tree): Limit overall SLP tree growth.
* tree-vectorizer.h (vect_analyze_data_refs,
vect_analyze_slp): Adjust prototypes.

* gcc.dg/vect/pr60841.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 209423)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -3172,7 +3213,7 @@ vect_check_gather (gimple stmt, loop_vec
 bool
 vect_analyze_data_refs (loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo,
-   int *min_vf)
+   int *min_vf, unsigned *n_stmts)
 {
   struct loop *loop = NULL;
   basic_block bb = NULL;
@@ -3207,6 +3248,9 @@ vect_analyze_data_refs (loop_vec_info lo
  for (gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple stmt = gsi_stmt (gsi);
+ if (is_gimple_debug (stmt))
+   continue;
+ ++*n_stmts;
  if (!find_data_references_in_stmt (loop, stmt, datarefs))
{
  if (is_gimple_call (stmt) && loop->safelen)
@@ -3260,6 +3304,9 @@ vect_analyze_data_refs (loop_vec_info lo
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple stmt = gsi_stmt (gsi);
+ if (is_gimple_debug (stmt))
+   continue;
+ ++*n_stmts;
  if (!find_data_references_in_stmt (NULL, stmt,
 BB_VINFO_DATAREFS (bb_vinfo)))
{
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 209423)
+++ gcc/tree-vect-loop.c(working copy)
@@ -1629,6 +1629,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
   int max_vf = MAX_VECTORIZATION_FACTOR;
   int min_vf = 2;
   unsigned int th;
+  unsigned int n_stmts = 0;
 
   /* Find all data references in the loop (which correspond to vdefs/vuses)
  and analyze their evolution in the loop.  Also adjust the minimal
@@ -1637,7 +1638,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
  FORNOW: Handle only simple, array references, which
  alignment can be forced, and aligned pointer-references.  */
 
-  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf);
+  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf, &n_stmts);
   if (!ok)
 {
   if (dump_enabled_p ())
@@ -1747,7 +1748,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
 }
 
   /* Check the SLP opportunities in the loop, analyze and build SLP trees.  */
-  ok = vect_analyze_slp (loop_vinfo, NULL);
+  ok = vect_analyze_slp (loop_vinfo, NULL, n_stmts);
   if (ok)
 {
   /* Decide which possible SLP instances to SLP.  */
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 209423)
+++ gcc/tree-vect-slp.c (working copy)
@@ -849,9 +849,10 @@ vect_build_slp_tree (loop_vec_info loop_
  unsigned int *max_nunits,
  vec<slp_tree> *loads,
  unsigned int vectorization_factor,
-bool *matches, unsigned *npermutes)
+bool *matches, unsigned *npermutes, unsigned *tree_size,
+unsigned max_tree_size)
 {
-  unsigned nops, i, this_npermutes = 0;
+  unsigned nops, i, this_npermutes = 0, this_tree_size = 0;
   gimple stmt;
 
   if (!matches)
@@ -911,6 +912,12 @@ vect_build_slp_tree (loop_vec_info loop_
   if (oprnd_info->first_dt != vect_internal_def)
     continue;
 
+  if (++this_tree_size > max_tree_size)
+   {
+     vect_free_oprnd_info (oprnds_info);
+     return false;
+   }
+
   child = vect_create_new_slp_node (oprnd_info->def_stmts);
   if (!child)
{
@@ -921,7 +928,8 @@ vect_build_slp_tree (loop_vec_info loop_
   bool *matches = XALLOCAVEC (bool, group_size);
   if (vect_build_slp_tree (loop_vinfo, bb_vinfo, child,
   group_size, max_nunits, loads,
-  vectorization_factor, matches, npermutes))
+  vectorization_factor, matches,
+  npermutes, &this_tree_size, max_tree_size))
 {
   oprnd_info->def_stmts = vNULL;
  SLP_TREE_CHILDREN

[PATCH] Fix PR60849

2014-04-17 Thread Richard Biener


This fixes PR60849 by properly rejecting non-boolean typed
comparisons from valid_gimple_rhs_p so they go through the
gimplification paths.
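
The condition the patch adds can be read as an "effective boolean" test on the result type of a comparison. Here is a stand-alone sketch of that predicate; `TypeKind`, `Type` and `effective_boolean_p` are hypothetical, and the model ignores enumeral types, which the real INTEGRAL_TYPE_P also accepts.

```cpp
#include <cassert>

// Toy type descriptor; GCC's real checks are the macros INTEGRAL_TYPE_P,
// TREE_CODE and TYPE_PRECISION applied to tree nodes.
enum class TypeKind { Boolean, Integer, Real, Pointer };

struct Type {
  TypeKind kind;
  unsigned precision;  // width in bits
};

// A comparison result is acceptable when its type is integral and is
// either a boolean type or has exactly one bit of precision.
bool effective_boolean_p(const Type &t) {
  bool integral = t.kind == TypeKind::Boolean || t.kind == TypeKind::Integer;
  if (!integral)
    return false;
  return t.kind == TypeKind::Boolean || t.precision == 1;
}
```

A comparison typed as a plain 32-bit int fails the test and is therefore sent through gimplification rather than being accepted as a ready-made right-hand side.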

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  rguent...@suse.de

PR middle-end/60849
* tree-ssa-propagate.c (valid_gimple_rhs_p): Only allow effective
boolean results for comparisons.

* g++.dg/opt/pr60849.C: New testcase.

Index: gcc/tree-ssa-propagate.c
===
--- gcc/tree-ssa-propagate.c(revision 209423)
+++ gcc/tree-ssa-propagate.c(working copy)
@@ -571,8 +571,14 @@ valid_gimple_rhs_p (tree expr)
   /* All constants are ok.  */
   break;
 
-case tcc_binary:
 case tcc_comparison:
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
+      || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
+          && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+    return false;
+
+  /* Fallthru.  */
+case tcc_binary:
   if (!is_gimple_val (TREE_OPERAND (expr, 0))
  || !is_gimple_val (TREE_OPERAND (expr, 1)))
return false;
Index: gcc/testsuite/g++.dg/opt/pr60849.C
===
--- gcc/testsuite/g++.dg/opt/pr60849.C  (revision 0)
+++ gcc/testsuite/g++.dg/opt/pr60849.C  (working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+int g;
+
+extern "C" int isnan ();
+
+void foo(float a) {
+  int (*xx)(...);
+  xx = isnan;
+  if (xx(a))
+g++;
+}

[PATCH] Fix PR60836

2014-04-17 Thread Richard Biener


This fixes PR60836 by emitting a non-proper PHI argument to the
incoming edge.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  rguent...@suse.de

PR tree-optimization/60836
* tree-vect-loop.c (vect_create_epilog_for_reduction): Force
initial PHI args to be gimple values.

* g++.dg/vect/pr60836.cc: New testcase.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c(revision 209423)
--- gcc/tree-vect-loop.c(working copy)
*** vect_create_epilog_for_reduction (vectr
*** 3951,3958 
/* Set phi nodes arguments.  */
FOR_EACH_VEC_ELT (reduction_phis, i, phi)
  {
!   tree vec_init_def = vec_initial_defs[i];
!   tree def = vect_defs[i];
for (j = 0; j < ncopies; j++)
  {
/* Set the loop-entry arg of the reduction-phi.  */
--- 3952,3963 
/* Set phi nodes arguments.  */
FOR_EACH_VEC_ELT (reduction_phis, i, phi)
  {
!   tree vec_init_def, def;
!   gimple_seq stmts;
!   vec_init_def = force_gimple_operand (vec_initial_defs[i], &stmts,
!  true, NULL_TREE);
!   gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts);
!   def = vect_defs[i];
for (j = 0; j < ncopies; j++)
  {
/* Set the loop-entry arg of the reduction-phi.  */
Index: gcc/testsuite/g++.dg/vect/pr60836.cc
===
*** gcc/testsuite/g++.dg/vect/pr60836.cc(revision 0)
--- gcc/testsuite/g++.dg/vect/pr60836.cc(working copy)
***
*** 0 
--- 1,39 
+ // { dg-do compile }
+ 
+ int a, b;
+ typedef double (*NormFunc) (const int &);
+ int &
+ max (int &p1, int &p2)
+ {
+   if (p1 < p2)
+     return p2;
+   return p1;
+ }
+ 
+ struct A
+ {
+   int operator  () (int p1, int p2)
+     {
+       return max (p1, p2);
+     }
+ };
+ template < class, class > double
+ norm_ (const int &)
+ {
+   char c, d;
+   A e;
+   for (; a; a++)
+     {
+       b = e (b, d);
+       b = e (b, c);
+     }
+ }
+ 
+ void
+ norm ()
+ {
+   static NormFunc f = norm_ < int, A >;
+   f = 0;
+ }
+ 
+ // { dg-final { cleanup-tree-dump "vect" } }

[PATCH 2/6] merge register_dump_files_1 into register_dump_files

2014-04-17 Thread tsaunders

From: Trevor Saunders tsaund...@mozilla.com

Hi,

simplification allowed by previous patch.
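
For readers without the source at hand, the merged function is a plain walk over a list of passes that recurses into sub-pass lists. A minimal stand-alone model might look like this; `Pass` is a hypothetical stand-in for opt_pass, and collecting names into a vector replaces the real register_one_dump_file call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for opt_pass: a singly linked list of passes where each
// pass may own a nested list of sub-passes.
struct Pass {
  const char *name = nullptr;
  Pass *sub = nullptr;
  Pass *next = nullptr;
};

// Visit PASS and its siblings, recursing into sub-pass lists.  Passes whose
// name starts with '*' get no dump file, as in the patch.
void register_dump_files(Pass *pass, std::vector<std::string> &registered) {
  assert(pass != nullptr);
  do {
    if (pass->name && pass->name[0] != '*')
      registered.push_back(pass->name);
    if (pass->sub)
      register_dump_files(pass->sub, registered);
    pass = pass->next;
  } while (pass);
}
```

Since nothing is returned or accumulated any more, the separate worker function has no reason to exist, which is the simplification the patch makes.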

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

* pass_manager.h (pass_manager::register_dump_files_1): Remove 
declaration.
* passes.c (pass_manager::register_dump_files_1): Merge into
(pass_manager::register_dump_files): this, and remove its handling of
properties since the pass always has the properties anyway.
(pass_manager::pass_manager): Adjust.


diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 8309567..9f4d67b 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,8 +91,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  void register_dump_files_1 (opt_pass *pass);
-  void register_dump_files (opt_pass *pass, int properties);
+  void register_dump_files (opt_pass *pass);
 
 private:
   context *m_ctxt;
diff --git a/gcc/passes.c b/gcc/passes.c
index 3f9590a..7508771 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -706,11 +706,10 @@ pass_manager::register_one_dump_file (opt_pass *pass)
   free (CONST_CAST (char *, full_name));
 }
 
-/* Recursive worker function for register_dump_files.  */
+/* Register the dump files for the pass_manager starting at PASS. */
 
 void
-pass_manager::
-register_dump_files_1 (opt_pass *pass)
+pass_manager::register_dump_files (opt_pass *pass)
 {
   do
 {
@@ -718,25 +717,13 @@ register_dump_files_1 (opt_pass *pass)
 register_one_dump_file (pass);
 
   if (pass->sub)
-    register_dump_files_1 (pass->sub);
+    register_dump_files (pass->sub);
 
   pass = pass->next;
 }
   while (pass);
 }
 
-/* Register the dump files for the pass_manager starting at PASS.
-   PROPERTIES reflects the properties that are guaranteed to be available at
-   the beginning of the pipeline.  */
-
-void
-pass_manager::
-register_dump_files (opt_pass *pass,int properties)
-{
-  pass->properties_required |= properties;
-  register_dump_files_1 (pass);
-}
-
 struct pass_registry
 {
   const char* unique_name;
@@ -1536,19 +1523,11 @@ pass_manager::pass_manager (context *ctxt)
 #undef TERMINATE_PASS_LIST
 
   /* Register the passes with the tree dump code.  */
-  register_dump_files (all_lowering_passes, PROP_gimple_any);
-  register_dump_files (all_small_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_regular_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_late_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
+  register_dump_files (all_lowering_passes);
+  register_dump_files (all_small_ipa_passes);
+  register_dump_files (all_regular_ipa_passes);
+  register_dump_files (all_late_ipa_passes);
+  register_dump_files (all_passes);
 }
 
 /* If we are in IPA mode (i.e., current_function_decl is NULL), call
-- 
1.9.2

[PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread tsaunders

From: Trevor Saunders tsaund...@mozilla.com

Hi,

just removing some dead code.

bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

* pass_manager.h (pass_manager::register_dump_files_1): Adjust.
* passes.c (pass_manager::register_dump_files_1): Remove dead code
dealing with properties.
(pass_manager::register_dump_files): Adjust.

diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index e1d8143..8309567 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,7 +91,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  int register_dump_files_1 (opt_pass *pass, int properties);
+  void register_dump_files_1 (opt_pass *pass);
   void register_dump_files (opt_pass *pass, int properties);
 
 private:
diff --git a/gcc/passes.c b/gcc/passes.c
index 60fb135..3f9590a 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)
 
 /* Recursive worker function for register_dump_files.  */
 
-int
+void
 pass_manager::
-register_dump_files_1 (opt_pass *pass, int properties)
+register_dump_files_1 (opt_pass *pass)
 {
   do
 {
-  int new_properties = (properties | pass->properties_provided)
-                       & ~pass->properties_destroyed;
-
   if (pass->name && pass->name[0] != '*')
     register_one_dump_file (pass);
 
   if (pass->sub)
-    new_properties = register_dump_files_1 (pass->sub, new_properties);
-
-  /* If we have a gate, combine the properties that we could have with
-     and without the pass being examined.  */
-  if (pass->has_gate)
-    properties &= new_properties;
-  else
-    properties = new_properties;
+    register_dump_files_1 (pass->sub);
 
   pass = pass->next;
 }
   while (pass);
-
-  return properties;
 }
 
 /* Register the dump files for the pass_manager starting at PASS.
@@ -746,7 +734,7 @@ pass_manager::
 register_dump_files (opt_pass *pass,int properties)
 {
   pass->properties_required |= properties;
-  register_dump_files_1 (pass, properties);
+  register_dump_files_1 (pass);
 }
 
 struct pass_registry
-- 
1.9.2

[PATCH 4/6] enable -Woverloaded-virtual when available

2014-04-17 Thread tsaunders

From: Trevor Saunders tbsau...@mozilla.com

hi,

it's a useful warning, and it helps catch bugs in the next two patches.
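
As a reminder of the kind of bug this warning is about, here is a small made-up example (the classes `Base` and `Derived` are hypothetical and unrelated to the GCC sources): a member that was meant to override a virtual function but, because of a different parameter type, hides it instead.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int size(int scale) const { return 1 * scale; }
};

struct Derived : Base {
  // Intended as an override, but the parameter type differs, so this
  // declares a new function that hides Base::size(int).  GCC's
  // -Woverloaded-virtual reports this kind of hiding.
  virtual int size(long scale) const { return 2 * static_cast<int>(scale); }
};

// Calls through a Base reference still reach Base::size(int).
int size_via_base(const Base &b) { return b.size(10); }
```

Calling `size` on a `Derived` object directly picks the new function, while calling it through a `Base` reference silently runs the base version, which is why the mismatch is easy to miss without the warning.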

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

* configure.ac: Check for -Woverloaded-virtual and enable it if found.
* configure: Regenerate.


diff --git a/gcc/configure b/gcc/configure
index 415377a..1a48ca3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -6427,6 +6427,50 @@ fi
   done
 CFLAGS="$save_CFLAGS"
 
+save_CFLAGS="$CFLAGS"
+for real_option in -Woverloaded-virtual; do
+  # Do the check with the no- prefix removed since gcc silently
+  # accepts any -Wno-* option on purpose
+  case $real_option in
+    -Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
+    *) option=$real_option ;;
+  esac
+  as_acx_Woption=`$as_echo "acx_cv_prog_cc_warning_$option" | $as_tr_sh`
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports $option" >&5
+$as_echo_n "checking whether $CC supports $option... " >&6; }
+if { as_var=$as_acx_Woption; eval "test \"\${$as_var+set}\" = set"; }; then :
+  $as_echo_n "(cached) " >&6
+else
+  CFLAGS="$option"
+    cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  eval "$as_acx_Woption=yes"
+else
+  eval "$as_acx_Woption=no"
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+
+fi
+eval ac_res=\$$as_acx_Woption
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
+$as_echo "$ac_res" >&6; }
+  if test `eval 'as_val=${'$as_acx_Woption'};$as_echo "$as_val"'` = yes; then :
+  strict_warn="$strict_warn${strict_warn:+ }$real_option"
+fi
+  done
+CFLAGS="$save_CFLAGS"
+
 c_strict_warn=
 save_CFLAGS="$CFLAGS"
 for real_option in -Wold-style-definition -Wc++-compat; do
@@ -17927,7 +17971,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17930 "configure"
+#line 17974 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18033,7 +18077,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18036 "configure"
+#line 18080 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 0336066..b2726e5 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -340,6 +340,8 @@ ACX_PROG_CC_WARNING_OPTS(
 ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wmissing-format-attribute])), [strict_warn])
 ACX_PROG_CC_WARNING_OPTS(
+   m4_quote(m4_do([-Woverloaded-virtual])), [strict_warn])
+ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wold-style-definition -Wc++-compat])), [c_strict_warn])
 ACX_PROG_CC_WARNING_ALMOST_PEDANTIC(
m4_quote(m4_do([-Wno-long-long -Wno-variadic-macros ], 
-- 
1.9.2

Re: [PATCH 2/6] merge register_dump_files_1 into register_dump_files

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,  tsaund...@mozilla.com wrote:
 From: Trevor Saunders tsaund...@mozilla.com

 Hi,

 simplification allowed by previous patch.

 bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

Re: [PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,  tsaund...@mozilla.com wrote:
 From: Trevor Saunders tsaund...@mozilla.com

 Hi,

 just removing some dead code.

 bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

Re: [PATCH 4/6] enable -Woverloaded-virtual when available

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,  tsaund...@mozilla.com wrote:
 From: Trevor Saunders tbsau...@mozilla.com

 hi,

 it's a useful warning, and it helps catch bugs in the next two patches.

 bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Tristan Gingold


On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:

 On 04/16/2014 12:39 AM, Eric Botcazou wrote:
 The primary bit of rfc here is the hunk that applies to ada/types.h
 with respect to Fat_Pointer.  Given that the Ada type, as defined in
 s-stratt.ads, does not include alignment, I can't imagine why the C
 type should have it.
 
 See gcc-interface/utils.c:finish_fat_pointer_type.
 
 Ah hah.
 
  /* Make sure we can put it into a register.  */
  if (STRICT_ALIGNMENT)
TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
 
 AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.
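
The mismatch discussed above can be illustrated with a small C sketch. The struct and function names here are hypothetical stand-ins, not the actual declarations from ada/types.h; the sketch only assumes a two-word fat pointer and the GCC `aligned` attribute, and shows how an alignment of twice the pointer size differs from the natural alignment of the same record.

```c
#include <stddef.h>

/* Hypothetical stand-in for a two-word fat pointer with natural alignment,
   which is what a non-STRICT_ALIGNMENT target would get from the record.  */
struct fat_plain
{
  const char *array;
  const void *bounds;
};

/* The same layout, but forced to an alignment of twice the pointer size,
   mirroring the MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE) idea quoted above
   (assumption: BIGGEST_ALIGNMENT is at least that large on the host).  */
struct fat_aligned
{
  const char *array;
  const void *bounds;
} __attribute__ ((aligned (2 * sizeof (void *))));

size_t
fat_plain_align (void)
{
  return _Alignof (struct fat_plain);
}

size_t
fat_aligned_align (void)
{
  return _Alignof (struct fat_aligned);
}
```

Both structs have the same size, so the difference only shows up where the alignment is observable, for example in how the object is passed or embedded in another record.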

As the align attribute in types.h is for the host, couldn't a configure test
solve this issue?

 If we were to make this alignment unconditional, would it be better to drop
 the code from here in finish_fat_pointer_type and instead record that in the
 Ada source, as we do with the C source?
 
 I presume
 
  for Fat_Pointer'Alignment use System.Address'Size * 2;
 
 or some such incantation would do that...

One of the most common Fat_Pointer types is the one for strings, which isn't
declared in any source and is very commonly used.

OTOH, I think this optimization mostly targets sparc.

Tristan.

Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.

2014-04-17 Thread Jonathan Wakely

On 17 April 2014 01:56, Luke Allardyce wrote:
 Thanks, I was wrong about that.

 Then I think we should just bite the bullet and provide the new
 behaviour. If we do have an abi_tag on those types in the next release
 then we can preserve the old behaviour in the old ABI and use the
 C++11 semantics for the abi_tagged type, which will be used for both
 C++03 and C++11 code. I am not too concerned that people who use a
 meaningless modifier in C++03 code get the C++11 behaviour. If they
 really want %g or %G then they shouldn't use fixed|scientific.

 Does that mean abi_tag will be enabled with a separate compiler flag /
 define rather than checking against the __cplusplus value?

I'm going to send a mail later on today, but the plan is that it's not
going to depend on __cplusplus at all. That makes it possible to pass
the abi_tagged types between C++03 and C++11 code.

Re: [PATCH] Enhancing the widen-mult pattern in vectorization.

2014-04-17 Thread Richard Biener

On Sat, Dec 7, 2013 at 12:45 AM, Cong Hou co...@google.com wrote:
 After further reviewing this patch, I found I don't have to change the
 code in tree-vect-stmts.c to allow further type conversion after
 widen-mult operation. Instead, I detect the following pattern in
 vect_recog_widen_mult_pattern():

 T1 a, b;
 ai = (T2) a;
 bi = (T2) b;
 c = ai * bi;

 where T2 is more than double the size of T1 (e.g. T1 is char and T2 is int).

 In this case I just create a new type T3 whose size is double of the
 size of T1, then get an intermediate result of type T3 from
 widen-mult. Then I add a new statement to STMT_VINFO_PATTERN_DEF_SEQ
 converting the result into type T2.
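
A scalar sketch of the scheme described above, assuming T1 = unsigned char, T3 = unsigned short and T2 = unsigned int. The function names are made up for illustration; this models only the arithmetic, not the vectorizer code in tree-vect-patterns.c.

```c
#include <stdint.h>

/* Source form: both operands are promoted straight to the wide type T2.  */
uint32_t
mult_direct (uint8_t a, uint8_t b)
{
  uint32_t ai = (uint32_t) a;
  uint32_t bi = (uint32_t) b;
  return ai * bi;
}

/* Pattern form: a widening multiply into the intermediate type T3
   (twice the size of T1), followed by a separate conversion to T2.  */
uint32_t
mult_via_intermediate (uint8_t a, uint8_t b)
{
  uint16_t t3 = (uint16_t) ((uint16_t) a * (uint16_t) b);
  return (uint32_t) t3;
}
```

For unsigned 8-bit inputs the largest product is 255 * 255 = 65025, which fits in 16 bits, so the intermediate step loses nothing and the two functions agree on every input.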

 This strategy makes the patch cleaner.

 Bootstrapped and tested on an x86-64 machine.

Ok for trunk (please re-bootstrap/test of course).

Thanks,
Richard.


 thanks,
 Cong


 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index f298c0b..12990b2 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,10 @@
 +2013-12-02  Cong Hou  co...@google.com
 +
 + * tree-vect-patterns.c (vect_recog_widen_mult_pattern): Enhance
 + the widen-mult pattern by handling two operands with different
 + sizes, and operands whose size is smaller than half of the result
 + type.
 +
  2013-11-22  Jakub Jelinek  ja...@redhat.com

   PR sanitizer/59061
 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
 index 12d2c90..611ae1c 100644
 --- a/gcc/testsuite/ChangeLog
 +++ b/gcc/testsuite/ChangeLog
 @@ -1,3 +1,8 @@
 +2013-12-02  Cong Hou  co...@google.com
 +
 + * gcc.dg/vect/vect-widen-mult-u8-s16-s32.c: New test.
 + * gcc.dg/vect/vect-widen-mult-u8-u32.c: New test.
 +
  2013-11-22  Jakub Jelinek  ja...@redhat.com

   * c-c++-common/asan/no-redundant-instrumentation-7.c: Fix
 diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
 b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
 new file mode 100644
 index 000..9f9081b
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
 @@ -0,0 +1,48 @@
 +/* { dg-require-effective-target vect_int } */
 +
 +#include <stdarg.h>
 +#include "tree-vect.h"
 +
 +#define N 64
 +
 +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 +short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 +int result[N];
 +
 +/* unsigned char * short -> int widening-mult.  */
 +__attribute__ ((noinline)) int
 +foo1(int len) {
 +  int i;
 +
 +  for (i=0; i<len; i++) {
 +result[i] = X[i] * Y[i];
 +  }
 +}
 +
 +int main (void)
 +{
 +  int i;
 +
 +  check_vect ();
 +
 +  for (i=0; i<N; i++) {
 +X[i] = i;
 +Y[i] = 64-i;
 +__asm__ volatile ("");
 +  }
 +
 +  foo1 (N);
 +
 +  for (i=0; i<N; i++) {
 +if (result[i] != X[i] * Y[i])
 +  abort ();
 +  }
 +
 +  return 0;
 +}
 +
 +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
 +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
 +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
 +/* { dg-final { cleanup-tree-dump "vect" } } */
 +
 diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
 b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
 new file mode 100644
 index 000..12c4692
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
 @@ -0,0 +1,48 @@
 +/* { dg-require-effective-target vect_int } */
 +
 +#include <stdarg.h>
 +#include "tree-vect.h"
 +
 +#define N 64
 +
 +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 +unsigned char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 +unsigned int result[N];
 +
 +/* unsigned char -> unsigned int widening-mult.  */
 +__attribute__ ((noinline)) int
 +foo1(int len) {
 +  int i;
 +
 +  for (i=0; i<len; i++) {
 +result[i] = X[i] * Y[i];
 +  }
 +}
 +
 +int main (void)
 +{
 +  int i;
 +
 +  check_vect ();
 +
 +  for (i=0; i<N; i++) {
 +X[i] = i;
 +Y[i] = 64-i;
 +__asm__ volatile ("");
 +  }
 +
 +  foo1 (N);
 +
 +  for (i=0; i<N; i++) {
 +if (result[i] != X[i] * Y[i])
 +  abort ();
 +  }
 +
 +  return 0;
 +}
 +
 +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_mult_qi_to_hi || vect_unpack } } } } */
 +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { target vect_widen_mult_qi_to_hi_pattern } } } */
 +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" { target vect_widen_mult_qi_to_hi_pattern } } } */
 +/* { dg-final { cleanup-tree-dump "vect" } } */
 +
 diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
 index 7823cc3..f412e2d 100644
 --- a/gcc/tree-vect-patterns.c
 +++ b/gcc/tree-vect-patterns.c
 @@ -529,7 +529,8 @@ vect_handle_widen_op_by_const (gimple stmt, enum
 tree_code code,

 Try to find the following pattern:

 - type a_t, b_t;
 + type1 a_t;
 +

Re: Remove obsolete Solaris 9 support

2014-04-17 Thread Rainer Orth

Uros Bizjak ubiz...@gmail.com writes:

 On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth
 r...@cebitec.uni-bielefeld.de wrote:
 Now that 4.9 has branched, it's time to actually remove the obsolete
 Solaris 9 configuration.  Most of this is just legwork and falls under
 my Solaris maintainership.

 A couple of questions, though:

 * Uros: I'm removing all sse_os_support() checks from the testsuite.
   Solaris 9 was the only consumer, so it seems best to do away with it.

 This is OK, but please leave sse-os-check.h (and corresponding
 sse_os_support calls) in the testsuite. Just remove the Solaris 9
 specific code from sse-os-check.h and always return 1, perhaps with
 the comment that all currently supported OSes support SSE
 instructions.

Done.  I'll repost the final patch once another round of testing has
completed.
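
A minimal sketch of what the simplified check described above might look like once the Solaris 9 specific code is gone. The function name comes from the discussion; the body and comment wording are an assumption based on the suggestion to always return 1, not the committed code.

```c
/* Sketch of a simplified sse-os-check.h helper: with the Solaris 9
   probing removed, the check reduces to a constant.  */
static int
sse_os_support (void)
{
  /* All currently supported OSes support SSE instructions.  */
  return 1;
}
```

Keeping the function in place means existing tests that call sse_os_support () need no changes.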

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path

2014-04-17 Thread Bin.Cheng

On Thu, Apr 17, 2014 at 1:30 PM, Jeff Law l...@redhat.com wrote:
 On 03/18/14 04:13, bin.cheng wrote:

 Hi,
 After control flow graph change made by
 http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
 gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
 targets including cortex-m3/cortex-m0.
 The regression reveals a missed opportunity in jump threading, which
 causes a forwarder basic block not to be removed in cfgcleanup after
 jump threading in VRP1.  The root cause is described in the
 corresponding PR, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363;
 please refer to it for the detailed report.

 This patch fixes the issue by adding a constant value instead of an ssa_name
 as the new phi argument.  Bootstrapped and tested on x86_64, also tested on
 cortex-m3, and the regression is gone.
 I think this should wait for stage1, but would like to hear some comments
 now.  So does it look reasonable?


 2014-03-18  Bin Cheng  bin.ch...@arm.com

 PR regression/60363
 * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
 (copy_phi_args): New parameters.  Call get_value_locus_in_path.
 (update_destination_phis): New parameter.
 (create_edge_and_update_destination_phis): Ditto.
 (ssa_fix_duplicate_block_edges): Pass new arguments.
 (thread_single_edge): Ditto.

 This is a good and interesting catch. DOM knows how to propagate these
 context sensitive equivalences which should expose the optimizable forwarder
 blocks.
At the time I was looking into the problem, DOM couldn't understand
the equivalence.  Maybe it can be improved too.

 But I'm a big believer in catching as many CFG simplifications as early as
 we can as they tend to have nice cascading effects.  So if we can pick it up
 by being smarter in how we duplicate arguments, then I'm all for it.

 +  for (int j = idx - 1; j >= 0; j--)
 +{
 +  edge e = (*path)[j]->e;
 +  if (e->dest == def_bb)
 +   {
 + arg = gimple_phi_arg_def (def_phi, e->dest_idx);
 + *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
 + return (TREE_CODE (arg) == INTEGER_CST ? arg : def);

 Presumably any constant that can legitimately appear in a PHI node is good
 here.  So for example ADDR_EXPR something in static storage ought to be
 handled as well.

 One could also argue that we should go ahead and do a context sensitive copy
 propagation here too if ARG turns out to be an SSA_NAME.  You have to be a
 bit more careful with those and use may_propagate_copy_p and you'd probably
 want to test the loop depth of the SSA_NAMEs to ensure you're not doing a
 propagation that is going to muck up LICM.  See loop_depth_of_name uses in
 tree-ssa-dom.c.

 Overall I think it's good.  We just need to resolve whether or not we want
 to catch constant ADDR_EXPRs and/or do the context sensitive copy
 propagations.
Do you mean const/copy propagation in jump threading optimization, or
just an independent opt somewhere else?  It's naturally flow sensitive
along jump threading path, which looks interesting to me.

Thanks,
bin

 jeff



-- 
Best Regards.

Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics

2014-04-17 Thread Jakub Jelinek

On Wed, Jul 03, 2013 at 08:14:25AM +0200, Uros Bizjak wrote:
 On Tue, Jul 2, 2013 at 10:32 AM, Kirill Yukhin kirill.yuk...@gmail.com 
 wrote:
  Bootstrap passing. Updated tests passing on BMI-featured HW.
 
  ChangeLog:
  2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
 
  * config/i386/bmiintrin.h (_blsi_u32): New.
  (_blsi_u64): Ditto.
  (_blsr_u32): Ditto.
  (_blsr_u64): Ditto.
  (_blsmsk_u32): Ditto.
  (_blsmsk_u64): Ditto.
  (_tzcnt_u32): Ditto.
  (_tzcnt_u64): Ditto.
 
  testsuite/ChangeLog:
  2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
 
  * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
  Fix scan patterns.
  * gcc.target/i386/bmi-2.c: Ditto.
 
  [1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01286.html
 
 This is OK for mainline.
 
 BTW: Do we want to backport this patch (and your previous) to 4.8 branch?

Kyrill, you've committed this only to the 4.8 branch and not to the trunk,
which means we actually regress on this in 4.9 compared to 4.8.2.

As the patch has been approved, I went ahead and after testing it
on x86_64 (-m32/-m64) committed it to the trunk and 4.9.

2014-04-17  Jakub Jelinek  ja...@redhat.com

PR target/60847
Forward port from 4.8 branch
2013-07-19  Kirill Yukhin  kirill.yuk...@intel.com

* config/i386/bmiintrin.h (_blsi_u32): New.
(_blsi_u64): Ditto.
(_blsr_u32): Ditto.
(_blsr_u64): Ditto.
(_blsmsk_u32): Ditto.
(_blsmsk_u64): Ditto.
(_tzcnt_u32): Ditto.
(_tzcnt_u64): Ditto.

* gcc.target/i386/bmi-1.c: Extend with new instrinsics.
Fix scan patterns.
* gcc.target/i386/bmi-2.c: Ditto.
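
For readers unfamiliar with the operations behind these intrinsics, here is a portable C sketch of what each one computes. The function names below are illustrative only and are not part of bmiintrin.h; the bit identities themselves are the standard definitions of BLSI, BLSMSK, BLSR and TZCNT.

```c
#include <stdint.h>

/* BLSI: isolate the lowest set bit.  */
uint32_t
blsi32 (uint32_t x)
{
  return x & (0u - x);
}

/* BLSMSK: mask of all bits up to and including the lowest set bit.  */
uint32_t
blsmsk32 (uint32_t x)
{
  return x ^ (x - 1u);
}

/* BLSR: reset (clear) the lowest set bit.  */
uint32_t
blsr32 (uint32_t x)
{
  return x & (x - 1u);
}

/* TZCNT: count trailing zero bits; yields the operand width for 0.  */
unsigned
tzcnt32 (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```

Note that the TZCNT instruction is defined for a zero input, whereas `__builtin_ctz` leaves the result for zero undefined, so the loop above handles that case explicitly.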

--- gcc/config/i386/bmiintrin.h (revision 201046)
+++ gcc/config/i386/bmiintrin.h (revision 201047)
@@ -40,7 +40,6 @@ __tzcnt_u16 (unsigned short __X)
   return __builtin_ctzs (__X);
 }
 
-
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __andn_u32 (unsigned int __X, unsigned int __Y)
 {
@@ -66,17 +65,34 @@ __blsi_u32 (unsigned int __X)
 }
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsi_u32 (unsigned int __X)
+{
+  return __blsi_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __blsmsk_u32 (unsigned int __X)
 {
   return __X ^ (__X - 1);
 }
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsmsk_u32 (unsigned int __X)
+{
+  return __blsmsk_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __blsr_u32 (unsigned int __X)
 {
   return __X & (__X - 1);
 }
 
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsr_u32 (unsigned int __X)
+{
+  return __blsr_u32 (__X);
+}
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __tzcnt_u32 (unsigned int __X)
@@ -84,6 +100,12 @@ __tzcnt_u32 (unsigned int __X)
   return __builtin_ctz (__X);
 }
 
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_tzcnt_u32 (unsigned int __X)
+{
+  return __builtin_ctz (__X);
+}
+
 
 #ifdef  __x86_64__
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
@@ -111,22 +133,46 @@ __blsi_u64 (unsigned long long __X)
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsi_u64 (unsigned long long __X)
+{
+  return __blsi_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __blsmsk_u64 (unsigned long long __X)
 {
   return __X ^ (__X - 1);
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsmsk_u64 (unsigned long long __X)
+{
+  return __blsmsk_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __blsr_u64 (unsigned long long __X)
 {
   return __X & (__X - 1);
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsr_u64 (unsigned long long __X)
+{
+  return __blsr_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __tzcnt_u64 (unsigned long long __X)
 {
   return __builtin_ctzll (__X);
 }
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_tzcnt_u64 (unsigned long long __X)
+{
+  return __builtin_ctzll (__X);
+}
 
 #endif /* __x86_64__  */
 
--- gcc/testsuite/gcc.target/i386/bmi-1.c   (revision 201046)
+++ gcc/testsuite/gcc.target/i386/bmi-1.c   (revision 201047)
@@ -2,10 +2,10 @@
 /* { dg-options

Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 7:19 AM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 From: Richard Biener [mailto:richard.guent...@gmail.com]

 With handling only the outermost handled-component and then only a
 selected subset you'll catch many but not all cases.  Why not simply
 use get_inner_reference () here (plus stripping the constant offset
 from an innermost MEM_REF) and get the best of both worlds (not
 duplicate parts of its logic and handle more cases)?  Eventually
 using tree-affine.c and get_inner_reference_aff is even more appropriate
 so you can compute the address differences without decomposing
 them yourselves.

 Why does the constant offset from an innermost MEM_REF need to be
 stripped? Shouldn't that be part of the offset in the symbolic number?

Yes, but get_inner_reference returns MEM[ptr, constant-offset] as base,
thus it doesn't move the constant offset therein to bitpos and doesn't
return MEM[ptr, 0].  You have to do that yourselves.

(as you are really interested in the _address_ of the memory reference
instead of the reference itself it would be appropriate to introduce a
variant of get_inner_reference that returns 'ptr' in this case and
x for x.field1 for example)


 + /*  Compute address to load from and cast according to the size
 + of the load.  */
 + load_ptr_type = build_pointer_type (load_type);
 + addr_expr = build1 (ADDR_EXPR, load_ptr_type, bswap_src);
 + addr_tmp = make_temp_ssa_name (load_ptr_type, NULL,
 load_src);
 + addr_stmt = gimple_build_assign_with_ops
 +(NOP_EXPR, addr_tmp, addr_expr, NULL);
 + gsi_insert_before (gsi, addr_stmt, GSI_SAME_STMT);
 +
 + /* Perform the load.  */
 + load_offset_ptr = build_int_cst (load_ptr_type, 0);
 + val_tmp = make_temp_ssa_name (load_type, NULL, load_dst);
 + val_expr = build2 (MEM_REF, load_type, addr_tmp, 
 load_offset_ptr);
 + load_stmt = gimple_build_assign_with_ops
 +(MEM_REF, val_tmp, val_expr, NULL);

 this is unnecessarily complex and has TBAA issues.  You don't need to
 create a correct pointer type, so doing

 addr_expr = fold_build_addr_expr (bswap_src);

 is enough.  Now, to fix the TBAA issues you either need to remember
 and combine the reference_alias_ptr_type of each original load and
 use that for the load_offset_ptr value or decide that isn't worth it and
 use alias-set zero (use ptr_type_node).

 Sorry, this is only my second patch [1] to gcc, so it's not all clear to me.
 The TBAA issue you mention comes from val_expr referring to a memory area
 that overlaps with the smaller memory areas used in the bitwise OR
 operation, am I right? Now, I have no idea how to combine the values
 returned by reference_alias_ptr_type () for each individual small memory
 area. Can you advise me on this? And what is the effect of not doing it and
 using ptr_type_node for the alias-set?

You can combine two reference_alias_ptr_type()s with

  if (alias_ptr_types_compatible_p (type1, type2))
return type1;
  else
return ptr_type_node;

using ptr_type_node for the alias-set will make it alias with all memory
references (that is, type-based disambiguation will be disabled).  That's
required for example if you combine four loads with type 'short' using
a single load with type 'long'.
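
As background, the kind of source pattern this pass targets can be sketched in plain C. The function names are hypothetical and this is not code from the patch: the first function assembles a value from individual byte loads, and the second performs one wide load through `memcpy`, which may access any object type and is loosely comparable at the source level to a load that aliases everything.

```c
#include <stdint.h>
#include <string.h>

/* Endian-independent load: build a 32-bit value from four byte loads,
   treating p[0] as the least significant byte.  */
uint32_t
load_le32 (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* A single wide load in host byte order.  memcpy is allowed to read the
   bytes of any object, so no type-based aliasing assumption is made.  */
uint32_t
load_native32 (const unsigned char *p)
{
  uint32_t v;

  memcpy (&v, p, sizeof v);
  return v;
}
```

On a little-endian host the two functions return the same value, which is the equivalence that lets the byte-wise form be replaced by one load (plus a byte swap on a big-endian host).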

 [1] First one was a fix on the existing implementation of the bswap pass.


 Can you also expand the comment about size vs. range?  Is it
 that range can be bigger than size if you have (short)a[0] |
 ((short)a[3]  1) sofar
 where size == 2 but range == 3?  Thus range can also be smaller than size
 for example for (short)a[0] | ((short)a[0]  1) where range would be 1 and
 size == 2?  I suppose adding two examples like this to the comment, together
 with the expected value of 'n' would help here.

 You understood correctly. I will add the suggested example.

 Otherwise the patch looks good.  Now we're only missing the addition
 of trying to match to a VEC_PERM_EXPR with a constant mask
 using can_vec_perm_p ;)

 Is that the vector shuffle engine you were mentioning in PR54733? If I
 understand correctly it is a generalization of the check against CMPNOP and
 CMPXCHG in find_bswap in this new patchset. I will look at whether ARM could
 benefit from this and if yes I might take a look (yep, two conditions).

Yep.  For example it might match on things like

int foo (char *x)
{
   return x[0]  1 | x[0])  1) | x[1])  1) | x[0];
}

not sure if target support for shuffles on small vectors (or vector parts)
is working well.  Thus on v1si as in the example.

Richard.

 Thanks a lot for such quick and detailed comments after my ping.

 Best regards,

 Thomas

Re: [PATCH] dwarf2out: Use normal constant values in bound_info if possible.

2014-04-17 Thread Mark Wielaard

On Tue, 2014-04-15 at 14:24 -0700, Cary Coutant wrote:
  +   /* If HOST_WIDE_INT is big enough then represent the bound as
  +  a constant value.  Note that we need to make sure the type
  +  is signed or unsigned.  We cannot just add an unsigned
  +  constant if the value itself is positive.  Some DWARF
  +  consumers will lookup the bounds type and then sign extend
  +  any unsigned values found for signed types.  This is only
  +  for DW_AT_lower_bound, normally unsigned values
  +  (DW_FORM_data[1248]) are assumed to not need
  +  sign-extension.  */
 
 This comment confuses me.

Sorry, obviously not my intention. But I see what I was trying to say
and how I said it didn't make things very clear. Apologies.

  By "we need to make sure the type is signed
 or unsigned" (what else can it be?), I think you mean "we need to
 choose a form based on whether the type is signed or unsigned."

Yes, right. I was confusing matters in my comment because I was thinking
of non-constants (reference or exprlocs) that are handled elsewhere
later on in the code.

  And by "This is only for DW_AT_lower_bound, ...", I think you mean "This is
 needed only for DW_AT_{lower,upper}_bound, since for most other
 attributes, consumers will treat DW_FORM_data[1248] as unsigned
 values, regardless of the underlying type."

Yes, right again.

 Otherwise, the patch looks OK to me.

Thanks, I pushed it with the comment changed to how you expressed things.
It now reads:

/* If HOST_WIDE_INT is big enough then represent the bound as   
   a constant value.  We need to choose a form based on 
   whether the type is signed or unsigned.  We cannot just  
   call add_AT_unsigned if the value itself is positive 
   (add_AT_unsigned might add the unsigned value encoded as 
   DW_FORM_data[1248]).  Some DWARF consumers will lookup the   
   bounds type and then sign extend any unsigned values found   
   for signed types.  This is needed only for   
   DW_AT_{lower,upper}_bound, since for most other attributes,  
   consumers will treat DW_FORM_data[1248] as unsigned values,  
   regardless of the underlying type.  */
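
The hazard the comment describes can be modelled with a small C sketch. The function names are hypothetical and this is not dwarf2out or consumer code; it only assumes a consumer that reads a 1, 2 or 4 byte constant and then either sign-extends it (because the bound's type is signed) or zero-extends it.

```c
#include <stdint.h>

/* Interpret the low NBYTES bytes (1, 2 or 4) of RAW as a two's complement
   value, as a consumer might for a bound whose type is signed.  */
int64_t
read_as_signed (uint64_t raw, unsigned nbytes)
{
  uint64_t sign = (uint64_t) 1 << (8 * nbytes - 1);
  uint64_t mask = (sign << 1) - 1;

  raw &= mask;
  return (int64_t) (raw ^ sign) - (int64_t) sign;
}

/* Interpret the low NBYTES bytes (1, 2 or 4) of RAW as an unsigned value.  */
int64_t
read_as_unsigned (uint64_t raw, unsigned nbytes)
{
  uint64_t mask = ((uint64_t) 1 << (8 * nbytes)) - 1;

  return (int64_t) (raw & mask);
}
```

With these definitions a bound of 255 stored in a single byte reads back as 255 when zero-extended but as -1 when sign-extended, which is why the form has to be chosen based on the signedness of the type rather than on the sign of the value.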

Thanks,

Mark

Re: [PATCH] Fix PR60849

2014-04-17 Thread Marc Glisse


On Thu, 17 Apr 2014, Richard Biener wrote:


This fixes PR60849 by properly rejecting non-boolean typed
comparisons from valid_gimple_rhs_p so they go through the
gimplification paths.


Could you also accept vector comparisons please?

--
Marc Glisse

Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics

2014-04-17 Thread Kirill Yukhin

Thanks! Sorry, missed that!

K


Re: [PATCH] Fix PR60849

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Marc Glisse wrote:

 On Thu, 17 Apr 2014, Richard Biener wrote:
 
  This fixes PR60849 by properly rejecting non-boolean typed
  comparisons from valid_gimple_rhs_p so they go through the
  gimplification paths.
 
 Could you also accept vector comparisons please?

Sure.  Testing in progress.

Richard.

2014-04-17  Richard Biener  rguent...@suse.de

PR middle-end/60849
* tree-ssa-propagate.c (valid_gimple_rhs_p): Allow vector
comparison results and add clarifying comment.

Index: gcc/tree-ssa-propagate.c
===
--- gcc/tree-ssa-propagate.c(revision 209469)
+++ gcc/tree-ssa-propagate.c(working copy)
@@ -572,9 +572,13 @@ valid_gimple_rhs_p (tree expr)
   break;
 
 case tcc_comparison:
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
-      || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
-          && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+  /* GENERIC allows comparisons with non-boolean types, reject
+     those for GIMPLE.  Let vector-typed comparisons pass - rules
+     for GENERIC and GIMPLE are the same here.  */
+  if (!(INTEGRAL_TYPE_P (TREE_TYPE (expr))
+        && (TREE_CODE (TREE_TYPE (expr)) == BOOLEAN_TYPE
+            || TYPE_PRECISION (TREE_TYPE (expr)) == 1))
+      && TREE_CODE (TREE_TYPE (expr)) != VECTOR_TYPE)
return false;
 
   /* Fallthru.  */
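
The rule stated in the new comment (reject comparisons whose type is not boolean-like, but let vector-typed comparisons pass) can be modelled outside the compiler. The types and names below are a hypothetical toy model, not GCC's tree API.

```c
#include <stdbool.h>

/* Toy stand-in for the kinds of result type a comparison can have.  */
enum type_kind { KIND_INTEGER, KIND_BOOLEAN, KIND_VECTOR, KIND_OTHER };

struct type_model
{
  enum type_kind kind;
  unsigned precision;   /* Width in bits; only meaningful for integers.  */
};

/* Models INTEGRAL_TYPE_P for this toy type system.  */
static bool
integral_p (const struct type_model *t)
{
  return t->kind == KIND_INTEGER || t->kind == KIND_BOOLEAN;
}

/* Returns true when a comparison with result type T would be accepted:
   either a boolean-like integral type (boolean, or precision 1) or a
   vector type.  */
bool
comparison_type_ok (const struct type_model *t)
{
  bool boolean_like = integral_p (t)
                      && (t->kind == KIND_BOOLEAN || t->precision == 1);

  return boolean_like || t->kind == KIND_VECTOR;
}
```

In this model a 32-bit integer result is rejected, while a boolean, a 1-bit integer and a vector result are all accepted.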

[PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Richard Biener


The patch below increases the number of coalesces we attempt
to also cover unary and binary operations.  This improves
initial code generation for code like

int foo (int i, int j, int k, int l)
{
  int res = i;
  res += j;
  res += k;
  res += l;
  return res;
}

from

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 87 [ i ])
(reg/v:SI 88 [ j ])))
(clobber (reg:CC 17 flags))
]) t.c:4 -1
 (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
(set (reg/v:SI 84 [ res ])
(plus:SI (reg/v:SI 83 [ res ])
(reg/v:SI 89 [ k ])))
(clobber (reg:CC 17 flags))
]) t.c:5 -1
 (nil))
...

to

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 85 [ i ])
(reg/v:SI 86 [ j ])))
(clobber (reg:CC 17 flags))
]) t.c:4 -1
 (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 83 [ res ])
(reg/v:SI 87 [ k ])))
(clobber (reg:CC 17 flags))
]) t.c:5 -1
 (nil))

re-using the same pseudo for the LHS.

Expansion has special code to improve coalescing of op1 with
target thus this is what we try to match here.

There are both positive and negative size effects during
a bootstrap on x86_64, but overall it seems to be a loss:
stage3 cc1 text size is 18261647 bytes without the patch
compared to 18265751 bytes with the patch.

Now the question is what does this tell us?  Not re-using
the same pseudo as op and target is always better?

Btw, I tried this to find a convincing metric for an intra-BB
scheduling pass (during out-of-SSA) on GIMPLE (to be able
to kill that odd scheduling code we now have in reassoc).
And to have something that TER does not immediately undo we have
to disable TER, which conveniently happens for coalesced
SSA names.  Thus - schedule for register pressure, and thus
reduce SSA name lifetime - with the goal that out-of-SSA can
do more coalescing.  But it won't even try to coalesce
anything else than PHI copies (not affected by scheduling)
or plain SSA name copies (shouldn't happen anyway due to
copy propagation).

So - any ideas?  Or is the overall negative for cc1 just
an artifact to ignore and we _should_ coalesce as much
as possible (even if it doesn't avoid copies - thus the
cost of 0 used in the patch)?

Otherwise the patch bootstraps and tests fine on x86_64-unknown-linux-gnu.

Thanks,
Richard.

2014-04-17  Richard Biener  rguent...@suse.de

* tree-ssa-coalesce.c (create_outofssa_var_map): Try to
coalesce SSA name uses with SSA name results in all unary
and binary operations.

Index: gcc/tree-ssa-coalesce.c
===
*** gcc/tree-ssa-coalesce.c (revision 209469)
--- gcc/tree-ssa-coalesce.c (working copy)
*** create_outofssa_var_map (coalesce_list_p
*** 991,1007 
case GIMPLE_ASSIGN:
  {
tree lhs = gimple_assign_lhs (stmt);
tree rhs1 = gimple_assign_rhs1 (stmt);
!   if (gimple_assign_ssa_name_copy_p (stmt)
     && gimple_can_coalesce_p (lhs, rhs1))
  {
v1 = SSA_NAME_VERSION (lhs);
v2 = SSA_NAME_VERSION (rhs1);
!   cost = coalesce_cost_bb (bb);
!   add_coalesce (cl, v1, v2, cost);
bitmap_set_bit (used_in_copy, v1);
bitmap_set_bit (used_in_copy, v2);
  }
  }
  break;
  
--- 993,1031 
case GIMPLE_ASSIGN:
  {
tree lhs = gimple_assign_lhs (stmt);
+   if (TREE_CODE (lhs) != SSA_NAME)
+ break;
+ 
+   /* Expansion handles target == op1 properly and also
+  target == op2 for commutative binary ops.  */
tree rhs1 = gimple_assign_rhs1 (stmt);
!   enum tree_code code = gimple_assign_rhs_code (stmt);
!   enum gimple_rhs_class klass = get_gimple_rhs_class (code);
!   if (TREE_CODE (rhs1) == SSA_NAME
     && gimple_can_coalesce_p (lhs, rhs1))
  {
v1 = SSA_NAME_VERSION (lhs);
v2 = SSA_NAME_VERSION (rhs1);
!   add_coalesce (cl, v1, v2,
! klass == GIMPLE_SINGLE_RHS
! ? coalesce_cost_bb (bb) : 0);
bitmap_set_bit (used_in_copy, v1);
bitmap_set_bit (used_in_copy, v2);
  }
+   if (klass == GIMPLE_BINARY_RHS
+       && commutative_tree_code (code))
+ {
+

Re: [PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Richard Biener wrote:

 
> So - any ideas?  Or is the overall negative for cc1 just
> an artifact to ignore and we _should_ coalesce as much
> as possible (even if it doesn't avoid copies - thus the
> cost of 0 used in the patch)?

One example where it delivers bad initial expansion on x86_64 is

int foo (int *p)
{
  int res = p[0];
  res += p[1];
  res += p[2];
  res += p[3];
  return res;
}

where i386.c:ix86_fixup_binary_operands tries to be clever
and improve address combine, generating two instructions
for (plus:SI (reg/v:SI 83 [ res ]) (mem:SI (...))) and thus
triggering expand_binop_directly

  pat = maybe_gen_insn (icode, 3, ops);
  if (pat)
    {
      /* If PAT is composed of more than one insn, try to add an appropriate
         REG_EQUAL note to it.  If we can't because TEMP conflicts with an
         operand, call expand_binop again, this time without a target.  */
      if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
          && ! add_equal_note (pat, ops[0].value, optab_to_code (binoptab),
                               ops[1].value, ops[2].value))
        {
          delete_insns_since (last);
          return expand_binop (mode, binoptab, op0, op1, NULL_RTX,
                               unsignedp, methods);
        }

and thus we end up with

(insn 9 6 10 (set (reg:SI 91)
        (mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
                (const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 A32])) t.c:4 -1
     (nil))

(insn 10 9 11 (parallel [
            (set (reg:SI 90)
                (plus:SI (reg/v:SI 83 [ res ])
                    (reg:SI 91)))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (expr_list:REG_EQUAL (plus:SI (reg/v:SI 83 [ res ])
            (mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
                    (const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 A32]))
        (nil)))

(insn 11 10 0 (set (reg/v:SI 83 [ res ])
        (reg:SI 90)) t.c:4 -1
     (nil))

Unpatched, we avoid the last move (the tiny testcase of course
ends up optimizing the same anyway).  Not sure if that strong
desire to add a REG_EQUAL note makes up for the losses.  At
least it looks backwards compared to the code preceding it:

  /* If operation is commutative,
     try to make the first operand a register.
     Even better, try to make it the same as the target.
     Also try to make the last operand a constant.  */
  if (commutative_p
      && swap_commutative_operands_with_target (target, xop0, xop1))
    {
      swap = xop1;
      xop1 = xop0;
      xop0 = swap;
    }

RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)

2014-04-17 Thread Zoran Jovanovic

Hello,
My apologies for the inconvenience.
I removed every occurrence of -ftree-bitfield-merge from the patch and fixed an
issue with unions.
The rest of the patch is the same as before.

Regards,
Zoran Jovanovic

--

Lowering is applied only for bit-fields copy sequences that are merged.
Data structure representing bit-field copy sequences is renamed and reduced in 
size.
Optimization turned on by default for -O2 and higher.
Some comments fixed.

Benchmarking performed on WebKit for Android.
Code size reduction noticed on several files, best examples are:

core/rendering/style/StyleMultiColData (632->520 bytes)
core/platform/graphics/FontDescription (1715->1475 bytes)
core/rendering/style/FillLayer (5069->4513 bytes)
core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
core/css/CSSSelectorList (4047->3887 bytes)
core/platform/animation/CSSAnimationData (3844->3440 bytes)
core/css/resolver/FontBuilder (13818->13350 bytes)
core/platform/graphics/Font (16447->15975 bytes)


Example:

One of the motivating examples for this work was the copy constructor of a
class that contains bit-fields.

C++ code:
class A
{
public:
  A(const A &x);
  unsigned a : 1;
  unsigned b : 2;
  unsigned c : 4;
};

A::A(const A &x)
{
  a = x.a;
  b = x.b;
  c = x.c;
}

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:
  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x1
        andi    $2,$2,0xfe
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0xf9
        andi    $3,$3,0x6
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0x87
        andi    $3,$3,0x78
        or      $2,$2,$3
        j       $31
        sb      $2,0($4)

Optimized MIPS32r2 assembly code:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x7f
        andi    $2,$2,0x80
        or      $2,$3,$2
        j       $31
        sb      $2,0($4)


The algorithm works at the basic block level and consists of the following three major steps:
1. Go through the basic block's statement list.  If there are statement pairs that
implement a copy of bit-field content from one memory location to another, record
the statement pointers and other necessary data in the corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as
merged.
3. Lower bit-field accesses by using the new field size for those that can be
merged.
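
As a rough illustration of what the merged access computes for the class A
example above: the three single-field copies collapse into one masked
read-modify-write of the containing byte.  The sketch below is plain C with
hypothetical names (A_MASK, B_MASK, C_MASK, BF_MASK, copy_fields_separately,
copy_fields_merged).  It assumes the three fields occupy the low 7 bits of one
byte, matching the masks visible in the MIPS code (0x1, 0x6, 0x78 versus 0x7f),
and it is not the code the pass itself emits.

```c
#include <stdint.h>

/* Assumed layout: a (1 bit), b (2 bits), c (4 bits) packed into the
   low 7 bits of one byte; bit 7 belongs to some other data.  */
#define A_MASK  0x01u
#define B_MASK  0x06u
#define C_MASK  0x78u
#define BF_MASK (A_MASK | B_MASK | C_MASK)   /* 0x7f */

/* Field-by-field copy: three read-modify-write sequences.  */
static uint8_t
copy_fields_separately (uint8_t dst, uint8_t src)
{
  dst = (uint8_t) ((dst & ~A_MASK) | (src & A_MASK));
  dst = (uint8_t) ((dst & ~B_MASK) | (src & B_MASK));
  dst = (uint8_t) ((dst & ~C_MASK) | (src & C_MASK));
  return dst;
}

/* Merged copy: one read-modify-write over the combined 7-bit field.  */
static uint8_t
copy_fields_merged (uint8_t dst, uint8_t src)
{
  return (uint8_t) ((dst & ~BF_MASK) | (src & BF_MASK));
}
```

Both functions produce the same byte for any input; the merged form simply
needs one load, two masks, one or and one store instead of three of each.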


New command line option -fmerge-bitfields is introduced.


Tested - passed gcc regression tests for MIPS32r2.


Changelog -

gcc/ChangeLog:
2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Add reference to -fmerge-bitfields.
  * tree-sra.c (lower_bitfields): New function.
  Entry for (-fmerge-bitfields).
  (part_of_union_p): New function.
  (bf_access_candidate_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_bfcopy_pair::hash): New function.
  (bitfield_stmt_bfcopy_pair::equal): New function.
  (bitfield_stmt_bfcopy_pair::remove): New function.
  (create_and_insert_bfcopy): New function.
  (get_bit_offset): New function.
  (add_stmt_bfcopy_pair): New function.
  (cmp_bfcopies): New function.
  (get_merged_bit_field_size): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Move declaration to tree.h and make it extern.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
  * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): Move from dwarf2out.c.
  * tree.h (expressions_equal_p): Add declaration.
  (field_byte_offset): Add declaration.

Patch -

diff --git a/gcc/common.opt b/gcc/common.opt
index da275e5..52c7f58 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2203,6 +2203,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+fmerge-bitfields
+Common Report Var(flag_tree_bitfield_merge) Optimization
+Merge loads and stores of consecutive bitfields
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA-normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index

[RFC][PING^2] Do not consider volatile asms as optimization barriers #1

2014-04-17 Thread Yury Gribov

--
From: Yury Gribov y.gri...@samsung.com
Sent:  Tuesday, March 25, 2014 11:57AM
To: Jakub Jelinek ja...@redhat.com, Eric Botcazou 
ebotca...@adacore.com, gcc-patches@gcc.gnu.org, Hans-Peter Nilsson 
h...@bitrange.com, rdsandif...@googlemail.com

Subject: Re: [RFC] Do not consider volatile asms as optimization barriers #1

On 03/25/2014 11:57 AM, Yury Gribov wrote:
Jakub Jelinek wrote:

Richard Sandiford wrote:

OK, how about this?  It looks like the builtins.c and stmt.c stuff
wasn't
merged until 4.9, and at this stage it seemed safer to just add the same
use/clobber sequence to both places.

Please wait a little bit, the patch has been committed to the trunk only
very recently, we want to see if it has any fallout.

It has been two weeks since Richard committed this to trunk. Perhaps it's
ok to backport to 4.8 branch now?

-Y

Link to original email: 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01306.html

Re: Patch ping

2014-04-17 Thread Jakub Jelinek

On Wed, Apr 16, 2014 at 02:45:37PM -0400, DJ Delorie wrote:
> I'll approve both patches, if you agree to think about a way to solve
> this problem without module-specific configury changes for each such
> command line option.  I understand the usefulness of having
> instrumentation, but the configure hack is a hack.

Only the second patch I'd consider a hack, the first patch merely makes sure
the POSTSTAGE1_LDFLAGS stuff actually isn't eaten by libtool.

I'll think about other options for the second patch.

> Note that in a combined tree this isn't a problem, because we'd just
> instrument the linker at the same time.

Only if you never use the plugin from the combined tree build with any other
linker.  Add -B ../ to some other linker and suddenly it will crash.

Jakub

Re: [PATCH] Make SRA tolerate most throwing statements

2014-04-17 Thread Martin Jambor

On Wed, Apr 16, 2014 at 11:22:28AM +0200, Richard Biener wrote:
> On Tue, 15 Apr 2014, Martin Jambor wrote:
>
> > Hi,
> >
> > back in January in
> > http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed
> > out a testcase where the problem was SRA not scalarizing an aggregate
> > because it was involved in a throwing statement.  The reason is that
> > SRA is likely to need to append new statements after each one where a
> > replaced aggregate is present, but throwing statements must end their
> > BBs.  This patch comes up with a fix for most such situations by
> > adding these new statements onto a single successor non-EH edge, if
> > there is one and only one such edge.
> >
> > I have bootstrapped and tested a very similar version on x86_64-linux,
> > bootstrap and testing of this exact one is currently underway.  OK for
> > trunk?  Eric, if and once this gets in, can you please add the
> > testcase from your original post to the suite?
> >
> > Thanks,
> >
> > Martin
> >
> >
> > 2014-04-15  Martin Jambor  mjam...@suse.cz
> >
> >     * tree-sra.c (single_non_eh_succ): New function.
> >     (disqualify_ops_if_throwing_stmt): Renamed to
> >     disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
> >     having one non-EH successor BB.
> >     (gsi_for_eh_followups): New function.
> >     (sra_modify_expr): If stmt ends bb, use single non-EH successor to
> >     generate loads into replacements.
> >     (sra_modify_assign): Likewise and also use the simple path for
> >     such statements.
> >     (sra_modify_function_body): Iterate safely over BBs.
> >
>
> ...
>
> > @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr)
> >    return get_var_base_offset_size_access (base, offset, max_size);
> >  }
> >
> > +/* Split the single non-EH successor edge from BB (there must be exactly one)
> > +   and return a gimple iterator to the new block.  */
> > +
> > +static gimple_stmt_iterator
> > +gsi_for_eh_followups (basic_block bb)
> > +{
> > +  edge e = single_non_eh_succ (bb);
> > +  gcc_assert (e);
> > +
> > +  basic_block new_bb = split_edge (e);
> > +  return gsi_start_bb (new_bb);
> > +}
> > +
> >  /* Replace the expression EXPR with a scalar replacement if there is one and
> >     generate other statements to do type conversion or subtree copying if
> >     necessary.  GSI is used to place newly created statements, WRITE is true if
> > @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
> >    type = TREE_TYPE (*expr);
> >
> >    loc = gimple_location (gsi_stmt (*gsi));
> > +  gimple_stmt_iterator alt_gsi = gsi_none ();
> > +  if (write && stmt_ends_bb_p (gsi_stmt (*gsi)))
> > +    {
> > +      alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi));
> > +      gsi = &alt_gsi;
>
> I think you should try to either use gsi_insert_on_edge_immediate
> (yeah, bad we can't build a gsi_for_edge_insert ()) or add
> a gsi_for_edge_insert () building on gimple_find_edge_insert_loc
> (note the before/after flag that returns - gsi_insert_* variants
> that take a flag specifying after/before would come handy here).
> You could also add a flag to gimple_find_edge_insert_loc whether
> it always should be possible to use gsi_insert_after and split
> the block in some more cases (or split it if both after and
> before inserts should be valid, but that would not split in
> the very rare case of an empty successor only).
>
> Basically usually you can avoid splitting the edge.

The following patch adds gsi_start_edge for that purpose and uses it
together with gsi_commit_edge_inserts from within SRA.

I did not make it an inline static function in the header like the
other gsi initializing functions because that would make
gimple-iterator.h depend on tree-cfg.h and with our current flat
includes that triggered changes of includes in half a gazillion
unrelated c files (I have that patch too because I was apparently too
lazy to think before the third coffee yesterday but I do not think it
is worth it).

Bootstrapped and tested on x86_64-linux, this time it also includes
Eric's testcase.  OK for trunk?

Thanks,

Martin


2014-04-16  Martin Jambor  mjam...@suse.cz

* gimple-iterator.c (gsi_start_edge): New function.
* gimple-iterator.h (gsi_start_edge): Declare.
* tree-sra.c (single_non_eh_succ): New function.
(disqualify_ops_if_throwing_stmt): Renamed to
disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
having one non-EH successor BB.
(sra_modify_expr): If stmt ends bb, use single non-EH successor to
generate loads into replacements.
(sra_modify_assign): Likewise and also use the simple path for
such statements.
(sra_modify_function_body): Commit statements on edges.

testsuite/
* gnat.dg/opt34.adb: New.
* gnat.dg/opt34_pkg.ads: Likewise.

diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c
index 1cfeb73..8a1ec53 100644
--- a/gcc/gimple-iterator.c
+++ b/gcc/gimple-iterator.c
@@ -689,6 +689,15 @@ gsi_insert_seq_on_edge (edge e, gimple_seq seq)
   gimple_seq_add_seq

Re: [PATCH] Make SRA tolerate most throwing statements

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 2:21 PM, Martin Jambor mjam...@suse.cz wrote:
> Bootstrapped and tested on x86_64-linux, this time it also includes
> Eric's testcase.  OK for trunk?

Ok.

Thanks,
Richard.


Re: [PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread Trevor Saunders

On Thu, Apr 17, 2014 at 10:53:07AM +0200, Richard Biener wrote:
> On Thu, Apr 17, 2014 at 10:37 AM,  tsaund...@mozilla.com wrote:
> > From: Trevor Saunders tsaund...@mozilla.com
> >
> > Hi,
> >
> > just removing some dead code.
> >
> > bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?
>
> Ok.

Thanks for the quick reviews! committed as r209477 - 209482

Trev

 
> Thanks,
> Richard.
>
> > Trev
> >
> > 2014-03-19  Trevor Saunders  tsaund...@mozilla.com
> >
> >     * pass_manager.h (pass_manager::register_dump_files_1): Adjust.
> >     * passes.c (pass_manager::register_dump_files_1): Remove dead code
> >     dealing with properties.
> >     (pass_manager::register_dump_files): Adjust.
> >
> > diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> > index e1d8143..8309567 100644
> > --- a/gcc/pass_manager.h
> > +++ b/gcc/pass_manager.h
> > @@ -91,7 +91,7 @@ public:
> >
> >  private:
> >    void set_pass_for_id (int id, opt_pass *pass);
> > -  int register_dump_files_1 (opt_pass *pass, int properties);
> > +  void register_dump_files_1 (opt_pass *pass);
> >    void register_dump_files (opt_pass *pass, int properties);
> >
> >  private:
> > diff --git a/gcc/passes.c b/gcc/passes.c
> > index 60fb135..3f9590a 100644
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
> > @@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)
> >
> >  /* Recursive worker function for register_dump_files.  */
> >
> > -int
> > +void
> >  pass_manager::
> > -register_dump_files_1 (opt_pass *pass, int properties)
> > +register_dump_files_1 (opt_pass *pass)
> >  {
> >    do
> >      {
> > -      int new_properties = (properties | pass->properties_provided)
> > -                           & ~pass->properties_destroyed;
> > -
> >        if (pass->name && pass->name[0] != '*')
> >          register_one_dump_file (pass);
> >
> >        if (pass->sub)
> > -        new_properties = register_dump_files_1 (pass->sub, new_properties);
> > -
> > -      /* If we have a gate, combine the properties that we could have with
> > -         and without the pass being examined.  */
> > -      if (pass->has_gate)
> > -        properties &= new_properties;
> > -      else
> > -        properties = new_properties;
> > +        register_dump_files_1 (pass->sub);
> >
> >        pass = pass->next;
> >      }
> >    while (pass);
> > -
> > -  return properties;
> >  }
> >
> >  /* Register the dump files for the pass_manager starting at PASS.
> > @@ -746,7 +734,7 @@ pass_manager::
> >  register_dump_files (opt_pass *pass,int properties)
> >  {
> >    pass->properties_required |= properties;
> > -  register_dump_files_1 (pass, properties);
> > +  register_dump_files_1 (pass);
> >  }
> >
> >  struct pass_registry
> > --
> > 1.9.2
 



Changes for if-convert to recognize simple conditional reduction.

2014-04-17 Thread Yuri Rumyantsev

Hi All,

We implemented an enhancement to the if-convert phase that recognizes the
simplest conditional reduction and transforms it into a vectorizable form,
e.g. the statement
if (A[i] != 0) num += 1;
will be recognized.
A new test case is also provided.
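
To make the idea concrete, here is a source-level picture of such a reduction
before and after the condition is folded into the addend.  This is only a
sketch: the function names are hypothetical, and the real transformation
happens on GIMPLE inside tree-if-conv.c rather than on C source.

```c
#include <stddef.h>

/* Original form: the update of num is guarded by a branch.  */
static int
count_nonzero_branchy (const int *a, size_t n)
{
  int num = 0;
  for (size_t i = 0; i < n; i++)
    if (a[i] != 0)
      num += 1;
  return num;
}

/* If-converted form: the addition is unconditional and the condition
   only selects the addend, so the loop body is straight-line code.  */
static int
count_nonzero_ifconverted (const int *a, size_t n)
{
  int num = 0;
  for (size_t i = 0; i < n; i++)
    num += (a[i] != 0) ? 1 : 0;
  return num;
}
```

Both loops compute the same count; the second one has no control flow inside
the loop body, which is the shape a vectorizer can handle as a plain reduction.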

Bootstrapping and regression testing did not show any new failures.

Is it OK for trunk?

gcc/ChangeLog:
2014-04-17  Yuri Rumyantsev  ysrum...@gmail.com

* tree-if-conv.c (is_cond_scalar_reduction): New function.
(convert_scalar_cond_reduction): Likewise.
(predicate_scalar_phi): Add recognition and transformation
of simple conditional reduction to be vectorizable.

gcc/testsuite/ChangeLog:
2014-04-17  Yuri Rumyantsev  ysrum...@gmail.com

* gcc.dg/cond-reduc.c: New test.


if-conv-cond-reduc.patch
Description: Binary data

Re: [PATCH] Redesign jump threading profile updates

2014-04-17 Thread Teresa Johnson

On Wed, Apr 16, 2014 at 10:39 PM, Jeff Law l...@redhat.com wrote:
> On 03/26/14 17:44, Teresa Johnson wrote:
>>
>> Recently I discovered that the profile updates being performed by jump
>> threading were incorrect in many cases, particularly in the case where
>> the threading path contains a joiner. Some of the duplicated
>> blocks/edges were not getting any counts, leading to incorrect
>> function splitting and other downstream optimizations, and there were
>> other insanities as well. After making a few attempts to fix the
>> handling I ended up completely redesigning the profile update code,
>> removing a few places throughout the code where it was attempting to
>> do some updates.
>
> The profile updates in that code is a mess.  It's never really been looked
> at in any systematic way, what's there is ad-hoc and usually in response to
> someone mentioning the profile data was incorrectly updated.   As we rely
> more and more on that data the ad-hoc updating is going to cause us more and
> more pain.
>
> So any work in this space is going to be greatly appreciated.
>
> I'll have to look at this in some detail.  But I wanted you to know I was
> aware of the work and it's in my queue.

Great, thanks for the update! I realize that it is not a trivial
change so it would take some time to get through. Hopefully it should
address the ongoing profile fixup issues.
Teresa


> Thanks!
>
> jeff



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[PING] [PATCH] Fix for PR libstdc++/60758

2014-04-17 Thread Alexey Merzlyakov


Hi,

This fixes infinite backtrace in __cxa_end_cleanup().
Regtest was finished with no regressions on arm-linux-gnueabi(sf).

The patch posted at:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html

Thanks in advance.

Best regards,
Merzlyakov Alexey

[ PATCH] Extend mode-switching to support toggle (1/2)

2014-04-17 Thread Christian Bruel

Hello,

Here is a new version of the patch. It hookizes the mode-setting and
mode-toggling macros. It is split into 2 parts.
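
For readers unfamiliar with the macro-to-hook conversion, the following toy
sketch shows the general shape in plain C.  All names here are hypothetical
(mode_switching_hooks, toy_mode_entry_exit, toy_hooks); the real hooks are
declared in target.def with different types.  It mirrors how the epiphany
port wraps its shared entry/exit helper in two single-purpose functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-target hook vector.  */
struct mode_switching_hooks
{
  int (*mode_entry) (int entity);
  int (*mode_exit) (int entity);
};

/* Shared helper: one function serves both directions, selected by
   IS_EXIT.  The returned mode numbers are made up for the example.  */
static int
toy_mode_entry_exit (int entity, bool is_exit)
{
  return is_exit ? entity + 100 : entity;
}

static int
toy_mode_entry (int entity)
{
  return toy_mode_entry_exit (entity, false);
}

static int
toy_mode_exit (int entity)
{
  return toy_mode_entry_exit (entity, true);
}

/* What used to be a pair of target macros becomes data: generic code
   calls through the table instead of expanding macros.  */
static const struct mode_switching_hooks toy_hooks = {
  toy_mode_entry,
  toy_mode_exit
};
```

Generic code would then call toy_hooks.mode_entry (entity) where it previously
expanded a target macro, so the target-specific functions can be static.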

Successfully bootstrapped/regtested on ix86 and SH4/SH4a.

I was able to do a limited build on Epiphany, if someone could give it a
try on it that would be great.

comments ? suggestions ?

many thanks,

Christian









2014-04-02  Christian Bruel  christian.br...@st.com

	* target.def (mode_switching): New hook vector.
	(mode_emit, mode_needed, mode_after, mode_entry): New hooks.
	(mode_exit, modepriority_to_mode): Likewise.
	* mode-switching.c (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Hookify.
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	(default_priority_to_mode): Define.
	* targhooks.h (default_priority_to_mode): Declare.
	* target.h: Include tm.h and hard-reg-set.h.
	* doc/tm.texi.in (EMIT_MODE_SET, MODE_NEEDED, MODE_AFTER, MODE_ENTRY)
	(MODE_EXIT, MODE_PRIORITY_TO_MODE): Delete and hookify.
	* doc/tm.texi: Regenerate.
	* config/sh/sh.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/sh/sh.c (emit_fpu_toggle): New function.
	(sh4_emit_mode_set, sh4_mode_needed): Hookify.
	(sh4_mode_after, sh4_mode_entry, sh4_mode_exit): Likewise.
	* config/i386/i386.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/i386/i386-protos.h (ix86_mode_needed, ix86_mode_after)
	(ix86_mode_entry, ix86_emit_mode_set): Remove external declaration.
	* config/i386/i386.c (ix86_mode_needed, ix86_mode_after, ix86_mode_exit)
	(ix86_mode_entry, ix86_mode_priority, ix86_emit_mode_set): Hookify.
	* config/epiphany/epiphany.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY):
	Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/sh/sh.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/sh/sh.c (sh4_emit_mode_set, sh4_mode_needed): Hookify.
	(sh4_mode_after, sh4_mode_entry, sh4_mode_exit): Likewise.
	* config/epiphany/epiphany-protos.h (epiphany_mode_needed)
	(emit_set_fp_mode, epiphany_mode_entry_exit, epiphany_mode_after)
	(epiphany_mode_priority_to_mode): Remove declaration.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Hookify.
	(epiphany_mode_needed, epiphany_mode_priority_to_mode): Likewise.
	(epiphany_mode_entry, epiphany_mode_exit, epiphany_mode_after):
	Likewise.
	(epiphany_mode_priority_to_mode): Change priority type. Hookify.
	(epiphany_mode_needed, epiphany_mode_entry_exit): Hookify.
	(epiphany_mode_after, epiphany_mode_entry, emit_set_fp_mode): Hookify.

--- gcc/config/epiphany/epiphany-protos.h	(revision 209415)
+++ gcc/config/epiphany/epiphany-protos.h	(working copy)
@@ -45,9 +45,7 @@ extern void emit_set_fp_mode (int entity, int mode
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
 extern int epiphany_mode_needed (int entity, rtx insn);
-extern int epiphany_mode_entry_exit (int entity, bool);
 extern int epiphany_mode_after (int entity, int last_mode, rtx insn);
-extern int epiphany_mode_priority_to_mode (int entity, unsigned priority);
 extern bool epiphany_epilogue_uses (int regno);
 extern bool epiphany_optimize_mode_switching (int entity);
 extern bool epiphany_is_interrupt_p (tree);
--- gcc/config/epiphany/epiphany.c	(revision 209415)
+++ gcc/config/epiphany/epiphany.c	(working copy)
@@ -152,6 +152,20 @@ static rtx frame_insn (rtx);
 /* We further restrict the minimum to be a multiple of eight.  */
 #define TARGET_MIN_ANCHOR_OFFSET (optimize_size ? 0 : -2040)
 
+/* Mode switching hooks.  */
+
+#define TARGET_MODE_EMIT emit_set_fp_mode
+
+#define TARGET_MODE_NEEDED epiphany_mode_needed
+
+#define TARGET_MODE_PRIORITY epiphany_mode_priority
+
+#define TARGET_MODE_ENTRY epiphany_mode_entry
+
+#define TARGET_MODE_EXIT epiphany_mode_exit
+
+#define TARGET_MODE_AFTER epiphany_mode_after
+
 #include "target-def.h"
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -2306,8 +2320,8 @@ epiphany_optimize_mode_switching (int entity)
   gcc_unreachable ();
 }
 
-int
-epiphany_mode_priority_to_mode (int entity, unsigned priority)
+static int
+epiphany_mode_priority (int entity, int priority)
 {
   if (entity == EPIPHANY_MSW_ENTITY_AND || entity == EPIPHANY_MSW_ENTITY_OR
   || entity== EPIPHANY_MSW_ENTITY_CONFIG)
@@ -2415,7 +2429,7 @@ epiphany_mode_needed (int entity, rtx insn)
   }
 }
 
-int
+static int
 epiphany_mode_entry_exit (int entity, bool exit)
 {
   int normal_mode = epiphany_normal_fp_mode ;
@@ -2502,6 +2516,18 @@ epiphany_mode_after (int entity, int last_mode, rt
   return last_mode;
 }
 
+static int
+epiphany_mode_entry (int entity)
+{
+  return epiphany_mode_entry_exit (entity, false);
+}
+
+static int
+epiphany_mode_exit (int entity)
+{
+  return epiphany_mode_entry_exit (entity, true);
+}
+
 void
 emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
---
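
The entry/exit split in the epiphany.c hunk above follows a common pattern when converting target macros to hooks: the existing helper that takes a bool flag stays as it is, and two thin static wrappers give each hook the one-argument signature the hook table expects. A minimal, self-contained sketch of that pattern follows. The names `demo_mode_hooks`, `demo_mode_entry_exit` and the mode values are hypothetical and only illustrate the shape; they are not GCC's actual hook vector.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a target hook table: one pointer per hook.  */
struct demo_mode_hooks
{
  int (*entry_mode) (int entity);
  int (*exit_mode) (int entity);
};

/* Hypothetical combined helper, analogous in shape to
   epiphany_mode_entry_exit (entity, exit).  The rule below is made up:
   entity 0 leaves the function in mode 1, everything else uses mode 0.  */
static int
demo_mode_entry_exit (int entity, bool is_exit)
{
  return (entity == 0 && is_exit) ? 1 : 0;
}

/* Thin wrappers that adapt the two-argument helper to the hook signature.  */
static int
demo_mode_entry (int entity)
{
  return demo_mode_entry_exit (entity, false);
}

static int
demo_mode_exit (int entity)
{
  return demo_mode_entry_exit (entity, true);
}

/* The hook table is filled in once; callers go through the pointers only.  */
static const struct demo_mode_hooks demo_hooks =
{
  demo_mode_entry,
  demo_mode_exit
};
```

A caller would then use `demo_hooks.entry_mode (entity)` and `demo_hooks.exit_mode (entity)` without knowing about the shared helper, which is why the helper itself can become static.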

Re: [PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Michael Matz

Hi,

On Thu, 17 Apr 2014, Richard Biener wrote:

 The patch below increases the number of coalesces we attempt
 to also cover unary and binary operations.

This is not usually a good idea if not mitigated by things like register 
pressure measurement and using target properties to determine if it's a 
two- or three-address instruction.  It increases register pressure and 
naturally generates multiple-def pseudos which aren't liked by some of the 
RTL passes.  It will lead to fewer pseudos, so there's a positive side.

 Now the question is what does this tell us?  Not re-using the same 
 pseudo as op and target is always better?

No, it tells us that tree-ssa-coalesce is too early for such coalescing.  
The register allocator is the right spot (or instruction selection if we 
had that), and it's done there.

 And to have sth that TER not immediately un-does we have
 to disable TER which conveniently happens for coalesced
 SSA names.

So, instead TER should be improved to not disturb the incoming instruction 
order (except where secondary effects of expanding larger trees can be 
had).  Changing the coalescing set to disable some bad parts in a later 
pass doesn't sound very convincing :)


Ciao,
Michael.

[PATCH] Extend mode-switching to support toggle (2/2)

2014-04-17 Thread Christian Bruel

And here is the toggle support, hookized.

Many thanks,

Christian



2014-04-02  Christian Bruel  christian.br...@st.com

	* target.def (mode_switching): New hook vector.
	(toggle_init, toggle_destroy, toggle_set, toggle_test):
	New mode toggle hooks.
	* targhooks.h (default_toggle_test): Declare.
	* basic-block.h (pre_edge_lcm_avs): Declare.
	* lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm.
	Call clear_aux_for_edges. Fix comments.
	(pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs.
	(pre_edge_rev_lcm): Idem.
	* mode-switching.c (init_modes_infos): New function.
	(free_modes_infos): Likewise.
	(add_mode_set): Likewise.
	(get_mode): Likewise.
	(commit_mode_sets): Likewise.
	(merge_modes): Likewise.
	(optimize_mode_switching): Support mode toggle.
	(default_priority_to_mode, default_toggle_test): Define.
	* doc/tm.texi.in (TARGET_MODE_TOGGLE_INIT, TARGET_MODE_TOGGLE_TEST)
	(TARGET_MODE_TOGGLE_DESTROY, TARGET_MODE_TOGGLE_SET):
	 New target hooks.
	* doc/tm.texi: Regenerate.
	* config/sh/sh.c (sh4_toggle_init, sh4_toggle_destroy): Add hook and define.
	(sh4_toggle_set, sh4_toggle_test): Likewise.
	(mode_in_flip, mode_out_flip): Add bitmap to compute mode flipping.
	(TARGET_MODE_EMIT): New toggle parameter.
	* config/sh/sh.md (toggle_pr): Defined for TARGET_SH4_300 and TARGET_SH4A_FP.
	(in_delay_slot): fpscr_toggle don't go in delay slot.
	* config/i386/i386.c (ix86_emit_mode_set): Add bool unused parameter.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Add bool unused parameter.

--- gcc/basic-block.h	2014-01-07 10:30:59.0 +0100
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/basic-block.h	2014-04-15 16:17:53.0 +0200
@@ -711,6 +711,9 @@
 extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *,
    sbitmap *, sbitmap *, sbitmap **,
    sbitmap **);
+extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap **, sbitmap **);
 extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *,
 	   sbitmap *, sbitmap *,
 	   sbitmap *, sbitmap **,
--- gcc/config/epiphany/epiphany.c	2014-04-17 13:23:48.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany.c	2014-04-17 13:25:54.0 +0200
@@ -2529,7 +2529,8 @@
 }
 
 void
-emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
+emit_set_fp_mode (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+		  HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   rtx save_cc, cc_reg, mask, src, src2;
   enum attr_fp_mode fp_mode;
--- gcc/config/epiphany/epiphany-protos.h	2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany-protos.h	2014-04-17 11:22:02.0 +0200
@@ -40,7 +40,8 @@
 extern void epiphany_init_expanders (void);
 extern int hard_regno_mode_ok (int regno, enum machine_mode mode);
 #ifdef HARD_CONST
-extern void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live);
+extern void emit_set_fp_mode (int entity, int mode,
+			  bool toggle ATTRIBUTE_UNUSED, HARD_REG_SET regs_live);
 #endif
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
--- gcc/config/epiphany/resolve-sw-modes.c	2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/resolve-sw-modes.c	2014-04-17 11:21:07.0 +0200
@@ -147,7 +147,7 @@
 	}
 	  start_sequence ();
 	  emit_set_fp_mode (EPIPHANY_MSW_ENTITY_ROUND_UNKNOWN,
-			jilted_mode, NULL);
+			jilted_mode, false, NULL);
 	  seq = get_insns ();
 	  end_sequence ();
 	  need_commit = true;
--- gcc/config/i386/i386.c	2014-04-17 13:02:49.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/i386/i386.c	2014-04-17 13:04:18.0 +0200
@@ -16409,7 +16409,8 @@
are to be inserted.  */
 
 static void
-ix86_emit_mode_set (int entity, int mode, HARD_REG_SET regs_live)
+ix86_emit_mode_set (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+		HARD_REG_SET regs_live)
 {
   switch (entity)
 {
--- gcc/config/sh/sh.c	2014-04-17 13:23:07.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/sh/sh.c	2014-04-17 13:25:27.0 +0200
@@ -202,7 +202,7 @@
 static int calc_live_regs (HARD_REG_SET *);
 static HOST_WIDE_INT rounded_frame_size (int);
 static bool sh_frame_pointer_required (void);
-static void sh4_emit_mode_set (int, int, HARD_REG_SET);
+static void sh4_emit_mode_set (int, int, bool, HARD_REG_SET);
 static int sh4_mode_needed (int, rtx);
 static int sh4_mode_after (int, int, rtx);
 static int sh4_mode_entry (int);
@@ -590,9 +590,21 @@
 #undef TARGET_MODE_EXIT
 #define TARGET_MODE_EXIT sh4_mode_exit
 
+#undef TARGET_MODE_TOGGLE_INIT
+#define TARGET_MODE_TOGGLE_INIT sh4_toggle_init
+
 #undef TARGET_MODE_PRIORITY
 #define TARGET_MODE_PRIORITY sh4_mode_priority
 
+#undef
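
For readers unfamiliar with the idea behind the extra `bool toggle` parameter, here is a rough sketch of when a mode set could be emitted as a toggle instead of a full set. It assumes an entity with exactly two real modes plus an "unknown" state, and it assumes that a toggle is only valid when the previous mode is known and differs from the requested one. The enum, the function names and the decision rule are all hypothetical; they are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical two-mode entity, e.g. a single/double precision bit.  */
enum demo_fp_mode
{
  DEMO_FP_SINGLE,
  DEMO_FP_DOUBLE,
  DEMO_FP_UNKNOWN
};

/* Hypothetical ways a mode switch could be emitted.  */
enum demo_emit_kind
{
  DEMO_EMIT_NOTHING,   /* Already in the requested mode.  */
  DEMO_EMIT_TOGGLE,    /* Flip the single mode bit.  */
  DEMO_EMIT_FULL_SET   /* Load the complete control value.  */
};

/* A toggle is only safe when both modes are known and they differ
   (assumption made for this sketch).  */
static bool
demo_can_toggle (enum demo_fp_mode prev, enum demo_fp_mode next)
{
  return prev != DEMO_FP_UNKNOWN && next != DEMO_FP_UNKNOWN && prev != next;
}

/* Pick the cheapest emission strategy for switching PREV -> NEXT.  */
static enum demo_emit_kind
demo_choose_emit (enum demo_fp_mode prev, enum demo_fp_mode next)
{
  if (prev == next && prev != DEMO_FP_UNKNOWN)
    return DEMO_EMIT_NOTHING;
  return demo_can_toggle (prev, next) ? DEMO_EMIT_TOGGLE : DEMO_EMIT_FULL_SET;
}
```

Targets that have no cheap toggle instruction would simply ignore the flag, which matches the `ATTRIBUTE_UNUSED` annotations in the i386 and epiphany hunks.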

[PATCH 0/3] libsanitizer libc conditionals

2014-04-17 Thread Bernhard Reutner-Fischer

Respun. First two patches are for gcc, the last one is for upstream
LLVM.

The gcc part was bootstrapped and regtested on x86_64-unknown-linux-gnu
without regressions and bootstrapped on x86_64-unknown-linux-uclibc to
verify that the configury works as expected and that the library links
without errors. These two patches are essentially backports of the
LLVM bits in patch #3.

The LLVM part was compiled on x86_64 (X86_64 ?) against glibc and
verified that the configury picks up the previously hard-coded values
both with configure && make as well as with cmake && make.

LLVM'er, please install the LLVM bits.

Ok for trunk?


Bernhard Reutner-Fischer (3):
  libsanitizer: Fix !statfs64 builds
  libsanitizer: add conditionals for libc
  [LLVM] [sanitizer] add conditionals for libc

 libsanitizer/asan/Makefile.am  |   6 +
 libsanitizer/asan/Makefile.in  |  17 +-
 libsanitizer/config.h.in   |  60 +
 libsanitizer/configure | 281 -
 libsanitizer/configure.ac  |  38 +++
 libsanitizer/interception/interception_linux.cc|   2 +
 libsanitizer/interception/interception_linux.h |   8 +
 libsanitizer/lsan/Makefile.am  |   6 +
 libsanitizer/lsan/Makefile.in  |  11 +-
 libsanitizer/sanitizer_common/Makefile.am  |   5 +
 libsanitizer/sanitizer_common/Makefile.in  |  18 +-
 .../sanitizer_common_interceptors.inc  | 100 +++-
 .../sanitizer_platform_interceptors.h  |   4 +-
 .../sanitizer_platform_limits_linux.cc |   2 +
 .../sanitizer_platform_limits_posix.cc |  44 +++-
 .../sanitizer_platform_limits_posix.h  |  27 +-
 .../sanitizer_common/sanitizer_posix_libcdep.cc|   7 +
 libsanitizer/tsan/Makefile.am  |   6 +
 libsanitizer/tsan/Makefile.in  |  11 +-
 19 files changed, 619 insertions(+), 34 deletions(-)

-- 
1.9.1

[PATCH 1/3] libsanitizer: Fix !statfs64 builds

2014-04-17 Thread Bernhard Reutner-Fischer

libsanitizer/ChangeLog
2014-04-02  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* configure.ac: Check for sizeof(struct statfs64).
* configure, config.h.in: Regenerate.
* sanitizer_common/sanitizer_platform_interceptors.h
(SANITIZER_INTERCEPT_STATFS64): Make conditional on
SIZEOF_STRUCT_STATFS64 being not 0.
* sanitizer_common/sanitizer_platform_limits_linux.cc
(namespace __sanitizer): Make unsigned
struct_statfs64_sz conditional on SANITIZER_INTERCEPT_STATFS64.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 libsanitizer/config.h.in   |  9 +++
 libsanitizer/configure | 69 ++
 libsanitizer/configure.ac  | 15 +
 .../sanitizer_platform_interceptors.h  |  4 +-
 .../sanitizer_platform_limits_linux.cc |  2 +
 5 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/libsanitizer/config.h.in b/libsanitizer/config.h.in
index e4b2786..4bd6a7f 100644
--- a/libsanitizer/config.h.in
+++ b/libsanitizer/config.h.in
@@ -61,12 +61,18 @@
 /* Define to 1 if you have the <sys/mman.h> header file. */
 #undef HAVE_SYS_MMAN_H
 
+/* Define to 1 if you have the <sys/statfs.h> header file. */
+#undef HAVE_SYS_STATFS_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
 /* Define to 1 if you have the <sys/types.h> header file. */
 #undef HAVE_SYS_TYPES_H
 
+/* Define to 1 if you have the <sys/vfs.h> header file. */
+#undef HAVE_SYS_VFS_H
+
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
@@ -107,6 +113,9 @@
 /* The size of `short', as computed by sizeof. */
 #undef SIZEOF_SHORT
 
+/* The size of `struct statfs64', as computed by sizeof. */
+#undef SIZEOF_STRUCT_STATFS64
+
 /* The size of `void *', as computed by sizeof. */
 #undef SIZEOF_VOID_P
 
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 5e4840f..c636212 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -15463,6 +15463,75 @@ _ACEOF
 
 
 
+for ac_header in sys/statfs.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "sys/statfs.h" "ac_cv_header_sys_statfs_h" "$ac_includes_default"
+if test x$ac_cv_header_sys_statfs_h = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_SYS_STATFS_H 1
+_ACEOF
+
+fi
+
+done
+
+if test $ac_cv_header_sys_statfs_h = no; then
+  for ac_header in sys/vfs.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "sys/vfs.h" "ac_cv_header_sys_vfs_h" "$ac_includes_default"
+if test x$ac_cv_header_sys_vfs_h = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_SYS_VFS_H 1
+_ACEOF
+
+fi
+
+done
+
+fi
+# The cast to long int works around a bug in the HP C Compiler
+# version HP92453-01 B.11.11.23709.GP, which incorrectly rejects
+# declarations like `int a3[[(sizeof (unsigned char)) >= 0]];'.
+# This bug is HP SR number 8606223364.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking size of struct statfs64" >&5
+$as_echo_n "checking size of struct statfs64... " >&6; }
+if test ${ac_cv_sizeof_struct_statfs64+set} = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (struct statfs64))" "ac_cv_sizeof_struct_statfs64" "
+#ifdef HAVE_SYS_STATFS_H
+# include <sys/statfs.h>
+#endif
+#ifdef HAVE_SYS_VFS_H
+# include <sys/vfs.h>
+#endif
+
+"; then :
+
+else
+  if test $ac_cv_type_struct_statfs64 = yes; then
+ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+{ as_fn_set_status 77
+as_fn_error "cannot compute sizeof (struct statfs64)
+See \`config.log' for more details." "$LINENO" 5; }; }
+   else
+ ac_cv_sizeof_struct_statfs64=0
+   fi
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_sizeof_struct_statfs64" >&5
+$as_echo "$ac_cv_sizeof_struct_statfs64" >&6; }
+
+
+
+cat >>confdefs.h <<_ACEOF
+#define SIZEOF_STRUCT_STATFS64 $ac_cv_sizeof_struct_statfs64
+_ACEOF
+
+
+
+
 if test ${multilib} = yes; then
   multilib_arg=--enable-multilib
 else
diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac
index e672131..746c216 100644
--- a/libsanitizer/configure.ac
+++ b/libsanitizer/configure.ac
@@ -78,6 +78,21 @@ AC_SUBST(enable_static)
 
 AC_CHECK_SIZEOF([void *])
 
+dnl Careful, this breaks on glibc for e.g. dirent.d_ino being 64bit
+dnl AC_SYS_LARGEFILE
+AC_CHECK_HEADERS(sys/statfs.h)
+if test $ac_cv_header_sys_statfs_h = no; then
+  AC_CHECK_HEADERS(sys/vfs.h)
+fi
+AC_CHECK_SIZEOF([struct statfs64],[],[
+#ifdef HAVE_SYS_STATFS_H
+# include <sys/statfs.h>
+#endif
+#ifdef HAVE_SYS_VFS_H
+# include <sys/vfs.h>
+#endif
+])
+
 if test ${multilib} = yes; then
   multilib_arg=--enable-multilib
 else
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
index f37d84b..b9ebd5c 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
+++

[PATCH 2/3] libsanitizer: add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

Conditionalize usage of dlvsym(), nanosleep(), usleep().
Conditionalize layout of struct sigaction and type of its member
sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.
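
To make the intent of these conditionals concrete, here is a small sketch of how code could pick between nanosleep() and usleep() based on configure results. `HAVE_NANOSLEEP` and `HAVE_USLEEP` are the macros that `AC_CHECK_FUNCS` would define for the functions named above; the helper names `demo_have_sleep` and `demo_sleep_ms` are hypothetical, and the "return -1 when neither exists" fallback is an assumption made for this example only.

```c
#include <stddef.h>

#if defined(HAVE_NANOSLEEP)
# include <time.h>
#elif defined(HAVE_USLEEP)
# include <unistd.h>
#endif

/* Return 1 if some sleep primitive was found at configure time.  */
static int
demo_have_sleep (void)
{
#if defined(HAVE_NANOSLEEP) || defined(HAVE_USLEEP)
  return 1;
#else
  return 0;
#endif
}

/* Sleep for MS milliseconds using whatever the libc provides.
   Returns 0 on success and -1 if no primitive is available or the
   underlying call fails.  */
static int
demo_sleep_ms (unsigned ms)
{
#if defined(HAVE_NANOSLEEP)
  struct timespec ts;
  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (long) (ms % 1000) * 1000000L;
  return nanosleep (&ts, NULL) == 0 ? 0 : -1;
#elif defined(HAVE_USLEEP)
  return usleep (ms * 1000u) == 0 ? 0 : -1;
#else
  (void) ms;
  return -1;
#endif
}
```

Built without either macro defined, the helper compiles to the stub branch, which is the situation the patch wants to handle gracefully on libcs that omit these functions.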

libsanitizer/ChangeLog

2014-04-16  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* configure.ac (AC_CHECK_HEADERS): Add time.h, wordexp.h,
glob.h, netrom/netrom.h, sys/ustat.h.
(AC_CHECK_MEMBERS): Check GNU extension glob_t members.
(AC_CHECK_SIZEOF): Determine size of sigset_t.
(HAVE_STRUCT_SIGACTION_SA_MASK_LAST,
STRUCT_SIGACTION_SA_FLAGS_TYPE): New.
(AC_CHECK_FUNCS): Add usleep, nanosleep, dlvsym.
* configure, config.h.in: Regenerate.
* asan/Makefile.am, lsan/Makefile.am, tsan/Makefile.am,
sanitizer_common/Makefile.am (AM_CXXFLAGS): Include config.h,
add include search directory.
* asan/Makefile.in, lsan/Makefile.in, tsan/Makefile.in,
sanitizer_common/Makefile.in: Regenerate.
* interception/interception_linux.h,
interception/interception_linux.cc,
sanitizer_common/sanitizer_common_interceptors.inc,
sanitizer_common/sanitizer_platform_limits_posix.cc,
sanitizer_common/sanitizer_platform_limits_posix.h,
sanitizer_common/sanitizer_posix_libcdep.cc: Use config.h's new
defines.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 libsanitizer/asan/Makefile.am  |   6 +
 libsanitizer/asan/Makefile.in  |  17 +-
 libsanitizer/config.h.in   |  51 +
 libsanitizer/configure | 212 -
 libsanitizer/configure.ac  |  23 +++
 libsanitizer/interception/interception_linux.cc|   2 +
 libsanitizer/interception/interception_linux.h |   8 +
 libsanitizer/lsan/Makefile.am  |   6 +
 libsanitizer/lsan/Makefile.in  |  11 +-
 libsanitizer/sanitizer_common/Makefile.am  |   5 +
 libsanitizer/sanitizer_common/Makefile.in  |  18 +-
 .../sanitizer_common_interceptors.inc  | 100 +-
 .../sanitizer_platform_limits_posix.cc |  44 -
 .../sanitizer_platform_limits_posix.h  |  27 ++-
 .../sanitizer_common/sanitizer_posix_libcdep.cc|   7 +
 libsanitizer/tsan/Makefile.am  |   6 +
 libsanitizer/tsan/Makefile.in  |  11 +-
 17 files changed, 521 insertions(+), 33 deletions(-)

diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index 3f07a83..851774c 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -9,6 +9,12 @@ DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
 endif
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
+AM_CXXFLAGS += -include $(top_builddir)/config.h
+if LIBBACKTRACE_SUPPORTED
+# backtrace-rename.h is included from config.h, provide -I dir for it
+AM_CXXFLAGS += -I $(top_srcdir)
+endif
+
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 
 toolexeclib_LTLIBRARIES = libasan.la
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 273eb4b..a9b889d 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -37,8 +37,10 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 @USING_MAC_INTERPOSE_TRUE@am__append_1 = -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
-@USING_MAC_INTERPOSE_FALSE@am__append_2 = $(top_builddir)/interception/libinterception.la
-@LIBBACKTRACE_SUPPORTED_TRUE@am__append_3 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la
+# backtrace-rename.h is included from config.h, provide -I dir for it
+@LIBBACKTRACE_SUPPORTED_TRUE@am__append_2 = -I $(top_srcdir)
+@USING_MAC_INTERPOSE_FALSE@am__append_3 = $(top_builddir)/interception/libinterception.la
+@LIBBACKTRACE_SUPPORTED_TRUE@am__append_4 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la
 subdir = asan
 DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/Makefile.am
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
@@ -86,8 +88,8 @@ LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
 am__DEPENDENCIES_1 =
 libasan_la_DEPENDENCIES =  \
$(top_builddir)/sanitizer_common/libsanitizer_common.la \
-   $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_2) \
-   $(am__append_3) $(am__DEPENDENCIES_1)
+   $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_3) \
+

[PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

Conditionalize usage of dlvsym(), nanosleep(), usleep().
Conditionalize layout of struct sigaction and type of its member
sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.
Determine size of struct statfs64, if available.

Leave defaults to match what glibc expects but probe them for uClibc.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 CMakeLists.txt |  58 +++
 cmake/Modules/CompilerRTUtils.cmake|  15 ++
 cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
 lib/interception/interception_linux.cc |   2 +
 lib/interception/interception_linux.h  |   9 ++
 .../sanitizer_common_interceptors.inc  | 101 +++-
 .../sanitizer_platform_limits_posix.cc |  44 -
 .../sanitizer_platform_limits_posix.h  |  27 +++-
 lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
 make/platform/clang_linux.mk   | 180 +
 make/platform/clang_linux_test_libc.c  |  68 
 11 files changed, 561 insertions(+), 8 deletions(-)
 create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
 create mode 100644 make/platform/clang_linux_test_libc.c

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e1a7a1f..af8073e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -330,6 +330,64 @@ if(APPLE)
 -isysroot ${IOSSIM_SDK_DIR})
 endif()
 
+set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
+check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
+check_include_file(utime.h HAVE_UTIME_H)
+check_include_file(wordexp.h HAVE_WORDEXP_H)
+check_include_file(glob.h HAVE_GLOB_H)
+include(FunctionExistsNotStub)
+check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
+check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
+include(CheckTypeSize)
+# check for sizeof sigset_t
+set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
+set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
+check_type_size(sigset_t SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
+if(EXISTS HAVE_SIZEOF_SIGSET_T)
+  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
+# check for sizeof struct statfs64
+set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
+check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
+check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
+if(HAVE_SYS_STATFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
+endif()
+if(HAVE_SYS_VFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
+endif()
+# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
+# We forcefully enable LFS to retain glibc legacy behaviour herein.
+set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
+set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} -D_LARGEFILE64_SOURCE)
+check_type_size("struct statfs64" SIZEOF_STRUCT_STATFS64)
+if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
+  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
+else()
+  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
+# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+# it back here either way.
+include(CheckStructHasMember)
+check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
+check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
+check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
+check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
+check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
+check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
+
+# folks seem to have an aversion to configure_file? So be it..
+foreach(x HAVE_SYS_USTAT_H HAVE_UTIME_H HAVE_WORDEXP_H HAVE_GLOB_H
+HAVE_NANOSLEEP HAVE_USLEEP SIZEOF_SIGSET_T SIZEOF_STRUCT_STATFS64
+HAVE_GLOB_T_GL_FLAGS HAVE_GLOB_T_GL_CLOSEDIR
+HAVE_GLOB_T_GL_READDIR HAVE_GLOB_T_GL_OPENDIR
+HAVE_GLOB_T_GL_LSTAT HAVE_GLOB_T_GL_STAT)
+def_undef_string(${x} SANITIZER_COMMON_CFLAGS)
+endforeach()
+
+
 # Architectures supported by Sanitizer runtimes. Specific sanitizers may
 # support only subset of these (e.g. TSan works on x86_64 only).
 filter_available_targets(SANITIZER_COMMON_SUPPORTED_ARCH
diff --git a/cmake/Modules/CompilerRTUtils.cmake b/cmake/Modules/CompilerRTUtils.cmake
index e22e775..3a0beec 100644
--- a/cmake/Modules/CompilerRTUtils.cmake
+++ b/cmake/Modules/CompilerRTUtils.cmake
@@ -59,3 +59,18 @@ macro(append_no_rtti_flag list)

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

Hi,

If you are trying to modify the libsanitizer files, please read here:
https://code.google.com/p/address-sanitizer/wiki/HowToContribute

--kcc


Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

On 17 April 2014 16:07, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:
 Hi,

 If you are trying to modify the libsanitizer files, please read here:
 https://code.google.com/p/address-sanitizer/wiki/HowToContribute

I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
please install it there, i do not have write access to the LLVM nor
compiler-rt trees.
TIA,


Re: [PATCH v7?] PR middle-end/60281

2014-04-17 Thread lin zuojian

Hi Bernd,
I have signed my copyright assignment and the process has completed. Now I
am going to answer two more questions before my patch can be
committed, right?

[Did you copy any
files or text written by someone else in these changes?]

No.

[Which files have you changed so far, and which new files have you written
so far?]
gcc/asan.c
gcc/ChangeLog
gcc/cfgexpand.c

Okay, you may review my patch again. If there is no problem, please
commit it for me.
--
Regards
lin zuojian

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 02:00 AM, Tristan Gingold wrote:
 
 On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:
 
 On 04/16/2014 12:39 AM, Eric Botcazou wrote:
 The primary bit of rfc here is the hunk that applies to ada/types.h
 with respect to Fat_Pointer.  Given that the Ada type, as defined in
 s-stratt.ads, does not include alignment, I can't imagine why the C
 type should have it.

 See gcc-interface/utils.c:finish_fat_pointer_type.

 Ah hah.

  /* Make sure we can put it into a register.  */
  if (STRICT_ALIGNMENT)
TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);

 AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.
 
 As the align attribute in types.h is for the host, couldn't a configure test 
 solve
 this issue ?

I doubt it.  I'm not sure what kind of configure test you could write that
would determine the setting of STRICT_ALIGNMENT, since even non-strict-align
targets prefer to align data for performance reasons.  Be careful that the test
couldn't be an execution test, lest you break host != build.
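
To illustrate the host != build constraint: a probe that must never be run can
still encode its answer as a compile error.  The following is a minimal sketch
of such a compile-only check, using a negative array size.  The names
toy_fat_pointer and TOY_COMPILE_TIME_CHECK are hypothetical, and the check only
looks at the C type, so by itself it does not reveal the STRICT_ALIGNMENT
setting.

```c
#include <stddef.h>

/* Hypothetical stand-in for the C view of a fat pointer: two pointers.  */
struct toy_fat_pointer { const char *array; const void *bounds; };

/* Fails to compile (negative array size) when COND is false, so a
   configure script only needs to compile this, never to run it.  */
#define TOY_COMPILE_TIME_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

TOY_COMPILE_TIME_CHECK (toy_fat_pointer_is_naturally_aligned,
                        __alignof__ (struct toy_fat_pointer)
                        == __alignof__ (char *));

/* Exposed only so that the same value can also be inspected at run time.  */
size_t toy_fat_pointer_alignment (void)
{
  return __alignof__ (struct toy_fat_pointer);
}
```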

 One of the most common Fat_Pointer is for strings, which aren't declared in 
 any
 source and is very commonly used.
 
 OTOH, I think this optimization mostly targets sparc.

Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions.

Perhaps the better optimization (supposing it's really worth keeping) is to
DECL_ALIGN the static strings, rather than align the type?

Presumably Ada strings are as with C string literals -- symbols private to the
compilation unit which are normally passed by value.  Thus functions within the
compilation unit would see the extra alignment of the data and be able to use
ldd to load the pair.  On the receiving end being able to use std would remain
a matter of luck.


r~

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 On 17 April 2014 16:07, Konstantin Serebryany
 konstantin.s.serebry...@gmail.com wrote:
 Hi,

 If you are trying to modify the libsanitizer files, please read here:
 https://code.google.com/p/address-sanitizer/wiki/HowToContribute

 I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
 please install it there, i do not have write access to the LLVM nor
 compiler-rt trees.

I can commit your patch to llvm tree only after you follow the process
described on that page.
Sorry, this is a hard rule.

--kcc

 TIA,

 --kcc

 On Thu, Apr 17, 2014 at 5:49 PM, Bernhard Reutner-Fischer
 rep.dot@gmail.com wrote:
 Conditionalize usage of dlvsym(), nanosleep(), usleep();
 Conditionalize layout of struct sigaction and type of its member
 sa_flags.
 Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
 gl_flags, gl_lstat, gl_stat.
 Check for availability of glob.h for use with above members.
 Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
 ustat() function), utime.h (for obsolete utime() function), wordexp.h.
 Determine size of sigset_t instead of hardcoding it.
 Determine size of struct statfs64, if available.

 Leave defaults to match what glibc expects but probe them for uClibc.

 Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
 ---
  CMakeLists.txt |  58 +++
  cmake/Modules/CompilerRTUtils.cmake|  15 ++
  cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
  lib/interception/interception_linux.cc |   2 +
  lib/interception/interception_linux.h  |   9 ++
  .../sanitizer_common_interceptors.inc  | 101 +++-
  .../sanitizer_platform_limits_posix.cc |  44 -
  .../sanitizer_platform_limits_posix.h  |  27 +++-
  lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
  make/platform/clang_linux.mk   | 180 
 +
  make/platform/clang_linux_test_libc.c  |  68 
  11 files changed, 561 insertions(+), 8 deletions(-)
  create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
  create mode 100644 make/platform/clang_linux_test_libc.c

 diff --git a/CMakeLists.txt b/CMakeLists.txt
 index e1a7a1f..af8073e 100644
 --- a/CMakeLists.txt
 +++ b/CMakeLists.txt
 @@ -330,6 +330,64 @@ if(APPLE)
  -isysroot ${IOSSIM_SDK_DIR})
  endif()

 +set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
 +check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
 +check_include_file(utime.h HAVE_UTIME_H)
 +check_include_file(wordexp.h HAVE_WORDEXP_H)
 +check_include_file(glob.h HAVE_GLOB_H)
 +include(FunctionExistsNotStub)
 +check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
 +check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
 +include(CheckTypeSize)
 +# check for sizeof sigset_t
 +set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
 +set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
 +check_type_size(sigset_t SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
 +if(EXISTS HAVE_SIZEOF_SIGSET_T)
 +  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
 +endif()
 +set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
 +# check for sizeof struct statfs64
 +set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
 +check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
 +check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
 +if(HAVE_SYS_STATFS_H)
 +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
 +endif()
 +if(HAVE_SYS_VFS_H)
 +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
 +endif()
 +# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
 +# We forcefully enable LFS to retain glibc legacy behaviour herein.
 +set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
 +set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} 
 -D_LARGEFILE64_SOURCE)
 +check_type_size(struct statfs64 SIZEOF_STRUCT_STATFS64)
 +if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
 +  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
 +else()
 +  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
 +endif()
 +set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
 +# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
 +# it back here either way.
 +include(CheckStructHasMember)
 +check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
 +check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
 +check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
 +check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
 +check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
 +check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
 +
 +# folks seem to have an aversion to configure_file? So be

Re: fuse-caller-save - hook format

2014-04-17 Thread Vladimir Makarov

On 2014-04-16, 3:19 PM, Tom de Vries wrote:
 Vladimir,
 
 All patches for the fuse-caller-save optimization have been ok-ed. The only 
 part
 not approved is the MIPS-specific part.
 
 The objection of Richard S. is not so much the patch itself, but more the idea
 of the hook fn_other_hard_reg_usage.
 
 For clarity, I'm restating the current hook definition here:
 ...
 +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct
 hard_reg_set_container *@var{regs})
 Add any hard registers to @var{regs} that are set or clobbered by a call to 
 the
 function.  This hook only needs to add registers that cannot be found by
 examination of the final RTL representation of a function.  This hook returns
 true if it managed to determine which registers need to be added.  The default
 version of this hook returns false.
 ...
 
 Richard prefers to, rather than having a hook specifying what registers are
 implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.
 
 I can see these possibilities (and perhaps there are more):
 
 1. We go with Richard's proposal: we make each target responsible for adding
 these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
 targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
 indicate whether a target has taken care of that, meaning it's safe to do the
 fuse-caller-save optimization.
 
 2. A mixed solution: we make each target responsible for specifying which
 clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called 
 f.i.
 targetm.call_clobbered_regs, and add generic code to add those clobbers to
 CALL_INSN_FUNCTION_USAGE.
 
 3. We stick with the current, approved hook format, and try to convince 
 Richard
 to live with it.
 
 
 Since you are a register allocator maintainer, familiar with the
 fuse-caller-save optimization, and have approved the original hook, I would 
 like
 to ask you to make a decision on how to proceed from here.
 

I have no preference; it is a matter of taste.  Each solution has its
own advantages and disadvantages.  Putting this info into
CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big
drawback in RTL memory footprint (especially for some targets which have
a lot of regs, like AM29k or IA64).  On the other hand, an analogous approach
is already used in the DF infrastructure (which would be nice to fix, imho).

Still, choosing between GCC users and GCC developers, I'd prefer the solution
that favours users (even if the effect on the amount of resources used by GCC
is quite insignificant), as there are a few orders of magnitude more users
than developers.

But I can live with any solution.  So it is up to you.  I am flexible.

PATCH: PR target/60863: Incorrect codegen in ix86_expand_clear for -Os

2014-04-17 Thread H.J. Lu

Hi,

I checked in this preapproved patch to generate xor reg, reg when
optimizing for size.


H.J.
Index: ChangeLog
===
--- ChangeLog   (revision 209487)
+++ ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2014-04-17  H.J. Lu  hongjiu...@intel.com
+
+   PR target/60863
+   * config/i386/i386.c (ix86_expand_clear): Remove outdated
+   comment.  Check optimize_insn_for_size_p instead of
+   optimize_insn_for_speed_p.
+
 2014-04-17  Martin Jambor  mjam...@suse.cz
 
* gimple-iterator.c (gsi_start_edge): New function.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 209487)
+++ config/i386/i386.c  (working copy)
@@ -16668,8 +16668,7 @@ ix86_expand_clear (rtx dest)
 dest = gen_rtx_REG (SImode, REGNO (dest));
   tmp = gen_rtx_SET (VOIDmode, dest, const0_rtx);
 
-  /* This predicate should match that for movsi_xor and movdi_xor_rex64.  */
-  if (!TARGET_USE_MOV0 || optimize_insn_for_speed_p ())
+  if (!TARGET_USE_MOV0 || optimize_insn_for_size_p ())
 {
   rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
   tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
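
The decision changed by the hunk above can be sketched as a plain predicate.
This is only an illustration of the condition's shape, with hypothetical
function names and booleans standing in for TARGET_USE_MOV0 and the
optimize_insn_for_*_p () queries.  The xor form has a shorter encoding than
loading an immediate zero but writes the flags, which is why the diff wraps
the SET in a PARALLEL with a CLOBBER of FLAGS_REG.

```c
#include <stdbool.h>

/* Mirrors the patched condition:
   !TARGET_USE_MOV0 || optimize_insn_for_size_p ().  */
bool toy_clear_with_xor (bool target_use_mov0, bool optimizing_for_size)
{
  return !target_use_mov0 || optimizing_for_size;
}

/* Mirrors the condition before the patch, which tested for speed.  */
bool toy_clear_with_xor_before_patch (bool target_use_mov0,
                                      bool optimizing_for_speed)
{
  return !target_use_mov0 || optimizing_for_speed;
}
```

With the patched predicate, a target that prefers mov $0 still gets the xor
form when optimizing for size; before the patch, that case fell through to
the mov form.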

Re: fuse-caller-save - hook format

2014-04-17 Thread Richard Sandiford

Vladimir Makarov vmaka...@redhat.com writes:
 On 2014-04-16, 3:19 PM, Tom de Vries wrote:
 Vladimir,
 
 All patches for the fuse-caller-save optimization have been ok-ed. The
 only part
 not approved is the MIPS-specific part.
 
 The objection of Richard S. is not so much the patch itself, but more the 
 idea
 of the hook fn_other_hard_reg_usage.
 
 For clarity, I'm restating the current hook definition here:
 ...
 +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct
 hard_reg_set_container *@var{regs})
 Add any hard registers to @var{regs} that are set or clobbered by a
 call to the
 function.  This hook only needs to add registers that cannot be found by
 examination of the final RTL representation of a function.  This hook returns
 true if it managed to determine which registers need to be added.  The 
 default
 version of this hook returns false.
 ...
 
 Richard prefers to, rather than having a hook specifying what registers are
 implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.
 
 I can see these possibilities (and perhaps there are more):
 
 1. We go with Richard's proposal: we make each target responsible for adding
 these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
 targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
 indicate whether a target has taken care of that, meaning it's safe to do the
 fuse-caller-save optimization.
 
 2. A mixed solution: we make each target responsible for specifying which
 clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook
 called f.i.
 targetm.call_clobbered_regs, and add generic code to add those clobbers to
 CALL_INSN_FUNCTION_USAGE.
 
 3. We stick with the current, approved hook format, and try to
 convince Richard
 to live with it.
 
 
 Since you are a register allocator maintainer, familiar with the
 fuse-caller-save optimization, and have approved the original hook, I
 would like
 to ask you to make a decision on how to proceed from here.
 

 I have no preference; it is a matter of taste.  Each solution has its
 own advantages and disadvantages.  Putting this info into
 CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big
 drawback in RTL memory footprint (especially for some targets which have
 a lot of regs, like AM29k or IA64).  On the other hand, an analogous approach
 is already used in the DF infrastructure (which would be nice to fix, imho).

 Still, choosing between GCC users and GCC developers, I'd prefer the solution
 that favours users (even if the effect on the amount of resources used by GCC
 is quite insignificant), as there are a few orders of magnitude more users
 than developers.

Hmm, but you're talking like there are going to be a lot of these registers.
This isn't about which registers are call-clobbered or call-saved according
to the ABI (that's already available in other places).  All we want here
is the set of registers that are clobbered _in the caller_ before reaching
the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use
lots of registers in the process of making the call.

On all targets we should be talking about one or two registers here.

Thanks,
Richard
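
A minimal sketch of the hook shape under discussion, assuming a toy register
set stored as a bit mask.  All names, the register numbers and the fallback
policy are hypothetical and are not the GCC implementation; the point is only
that the hook adds the one or two registers clobbered around the call that
cannot be seen in the callee's RTL, and that returning false leaves the caller
with the ABI's call-clobbered set.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_hard_reg_set;                 /* one bit per hard register */
typedef bool (*toy_reg_usage_hook) (toy_hard_reg_set *);

enum { TOY_SCRATCH_A = 24, TOY_SCRATCH_B = 25 };   /* hypothetical registers */

/* Default hook: nothing is known, so report failure.  */
bool toy_default_fn_other_hard_reg_usage (toy_hard_reg_set *regs)
{
  (void) regs;
  return false;
}

/* Target hook: add registers clobbered on the way into or out of the
   callee that do not show up in the callee's final RTL.  */
bool toy_target_fn_other_hard_reg_usage (toy_hard_reg_set *regs)
{
  *regs |= (1ULL << TOY_SCRATCH_A) | (1ULL << TOY_SCRATCH_B);
  return true;
}

/* Registers the caller must treat as clobbered by a call.  */
toy_hard_reg_set toy_call_clobbers (toy_hard_reg_set seen_in_callee_rtl,
                                    toy_hard_reg_set abi_call_clobbered,
                                    toy_reg_usage_hook hook)
{
  toy_hard_reg_set regs = seen_in_callee_rtl;
  if (!hook (&regs))
    return abi_call_clobbered;   /* no precise info: be conservative */
  return regs;
}
```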

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Tristan Gingold


On 17 Apr 2014, at 16:50, Richard Henderson r...@redhat.com wrote:

 On 04/17/2014 02:00 AM, Tristan Gingold wrote:
 
 On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:
 
 On 04/16/2014 12:39 AM, Eric Botcazou wrote:
 The primary bit of rfc here is the hunk that applies to ada/types.h
 with respect to Fat_Pointer.  Given that the Ada type, as defined in
 s-stratt.ads, does not include alignment, I can't imagine why the C
 type should have it.
 
 See gcc-interface/utils.c:finish_fat_pointer_type.
 
 Ah hah.
 
 /* Make sure we can put it into a register.  */
 if (STRICT_ALIGNMENT)
   TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
 
 AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.
 
 As the align attribute in types.h is for the host, couldn't a configure test 
 solve
 this issue ?
 
 I doubt it.  I'm not sure what kind of configure test you could write that
 would determine the setting of STRICT_ALIGNMENT, since even non-strict-align
 targets prefer to align data for performance reasons.  Be careful that the 
 test
 couldn't be an execution test, lest you break host != build.

What about this compile-time check:

package Fatptralign is
   type String_Acc is access String;
   type Integer_Acc is access Integer;

   pragma Compile_Time_Error
     (String_Acc'Alignment = 1 * Integer_Acc'Alignment,
      "Fat pointers are simply aligned");

   pragma Compile_Time_Error
     (String_Acc'Alignment = 2 * Integer_Acc'Alignment,
      "Fat pointers are doubly aligned");
end Fatptralign;


 One of the most common Fat_Pointer is for strings, which aren't declared in 
 any
 source and is very commonly used.
 
 OTOH, I think this optimization mostly targets sparc.
 
 Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions.
 
 Perhaps the better optimization (supposing it's really worth keeping)

That's a real question (whether it is worth keeping).  I think this also
affects powerpc (an important target).

Eric ?

 is to
 DECL_ALIGN the static strings, rather than align the type?

[ Ada strings (and more generally Ada unconstrained arrays and Ada accesses to
  unconstrained arrays) are represented in GNAT by a fat pointer, i.e. a structure
  containing a pointer to the bounds and a pointer to the data.
  We are talking about the alignment of that structure. ]

 Presumably Ada strings are as with C string literals -- symbols private to the
 compilation unit which are normally passed by value.  Thus functions within 
 the
 compilation unit would see the extra alignment of the data and be able to use
 ldd to load the pair.  On the receiving end being able to use std would remain
 a matter of luck.

I think this would lose most of the gain.  Fat pointers can be heavily used in
some applications and can be present in structures.  The gain with only private
symbols might be tiny.

Tristan.

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Eric Botcazou

 Ah hah.
 
   /* Make sure we can put it into a register.  */
   if (STRICT_ALIGNMENT)
 TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
 
 AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

I see.  Initially this alignment promotion had been universal, but someone 
recently complained about holes in structures on x86-64 because of it so we 
restricted it to the platforms where it is really necessary for the goal 
stated in the comment; we left types.h untouched because the alignment could 
not possibly change the calling convention for non-strict-alignment targets...

 If we were to make this alignment unconditional, would it be better to drop
 the code from here in finish_fat_pointer_type and instead record that in
 the Ada source, as we do with the C source?

We cannot really do that; the s-stratt.ads thing is a red herring.  Alignment 
of fat pointer types is entirely decided inside the compiler (layout.adb:3213 
and gcc-interface/utils.c:finish_fat_pointer_type).

I presume that the attached kludge is sufficient to make it work?


* fe.h (Compiler_Abort): Replace Fat_Pointer by String.
(Error_Msg_N): Likewise.
(Error_Msg_NE): Likewise.
(Get_External_Name_With_Suffix): Likewise.
* types.h (Fat_Pointer): Delete.
(String): New type.
(DECLARE_STRING): New macro.
* gcc-interface/decl.c (create_concat_name): Adjust.
* gcc-interface/trans.c (post_error): Likewise.
(post_error_ne): Likewise.
* gcc-interface/misc.c (internal_error_function): Likewise.
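
A minimal sketch of the idea behind the types.h change (overalign the objects,
not the type), assuming GCC-style attributes.  The toy_ names are hypothetical
stand-ins for String_Template, String and DECLARE_STRING.  The type keeps its
natural alignment, so it leaves no holes when embedded in structures, while
each object declared through the macro gets twice the pointer alignment.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int low_bound, high_bound; } toy_template;
typedef struct { const char *array; toy_template *bounds; } toy_string;

/* Overalign only the declared object, not the toy_string type.  */
#define TOY_DECLARE_STRING(s, a, t) \
  __attribute__ ((aligned (sizeof (char *) * 2))) toy_string s = { a, t }

/* The type itself stays naturally aligned.  */
size_t toy_type_alignment (void)
{
  return __alignof__ (toy_string);
}

/* An object declared through the macro sits on a 2 * pointer boundary.  */
int toy_object_is_overaligned (void)
{
  toy_template bounds = { 1, 3 };
  TOY_DECLARE_STRING (s, "abc", &bounds);
  return (uintptr_t) &s % (sizeof (char *) * 2) == 0;
}
```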


-- 
Eric Botcazou
Index: fe.h
===
--- fe.h	(revision 209461)
+++ fe.h	(working copy)
@@ -39,7 +39,7 @@ extern "C" {
 /* comperr:  */
 
 #define Compiler_Abort comperr__compiler_abort
-extern int Compiler_Abort (Fat_Pointer, int, Fat_Pointer) ATTRIBUTE_NORETURN;
+extern int Compiler_Abort (String, int, String) ATTRIBUTE_NORETURN;
 
 /* csets: */
 
@@ -90,8 +90,8 @@ extern Node_Id Get_Attribute_Definition_
 #define Error_Msg_NE  errout__error_msg_ne
 #define Set_Identifier_Casing errout__set_identifier_casing
 
-extern void Error_Msg_N	  (Fat_Pointer, Node_Id);
-extern void Error_Msg_NE  (Fat_Pointer, Node_Id, Entity_Id);
+extern void Error_Msg_N	  (String, Node_Id);
+extern void Error_Msg_NE  (String, Node_Id, Entity_Id);
 extern void Set_Identifier_Casing (Char *, const Char *);
 
 /* err_vars: */
@@ -151,7 +151,7 @@ extern void Setup_Asm_Outputs		(Node_Id)
 
 extern void Get_Encoded_Name			(Entity_Id);
 extern void Get_External_Name			(Entity_Id, Boolean);
-extern void Get_External_Name_With_Suffix	(Entity_Id, Fat_Pointer);
+extern void Get_External_Name_With_Suffix	(Entity_Id, String);
 
 /* exp_util: */
 
Index: types.h
===
--- types.h	(revision 209461)
+++ types.h	(working copy)
@@ -76,11 +76,14 @@ typedef Char *Str;
 /* Pointer to string of Chars */
 typedef Char *Str_Ptr;
 
-/* Types for the fat pointer used for strings and the template it
-   points to.  */
+/* Types for the fat pointer used for strings and the template it points to.
+   On most platforms the fat pointer is naturally aligned but, on the rest,
+   it is given twice the natural alignment.  For maximum portability, we do
+   not overalign the type but only the objects.  */
 typedef struct {int Low_Bound, High_Bound; } String_Template;
-typedef struct {const char *Array; String_Template *Bounds; }
-	__attribute ((aligned (sizeof (char *) * 2))) Fat_Pointer;
+typedef struct {const char *Array; String_Template *Bounds; } String;
+#define DECLARE_STRING(s, a, t) \
+  __attribute__ ((aligned (sizeof (char *) * 2))) String s = { a, t }
 
 /* Types for Node/Entity Kinds:  */
 
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 209461)
+++ gcc-interface/decl.c	(working copy)
@@ -8861,8 +8861,8 @@ create_concat_name (Entity_Id gnat_entit
   if (suffix)
 {
   String_Template temp = {1, (int) strlen (suffix)};
-  Fat_Pointer fp = {suffix, &temp};
-  Get_External_Name_With_Suffix (gnat_entity, fp);
+  DECLARE_STRING (s, suffix, &temp);
+  Get_External_Name_With_Suffix (gnat_entity, s);
 }
   else
 Get_External_Name (gnat_entity, 0);
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 209461)
+++ gcc-interface/trans.c	(working copy)
@@ -7833,7 +7833,6 @@ gnat_gimplify_stmt (tree *stmt_p)
 	  gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond,
  build_int_cst (integer_type_node,
 		annot_expr_ivdep_kind));
-
 	if (LOOP_STMT_NO_VECTOR (stmt))
 	  gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond,
  build_int_cst (integer_type_node,
@@ -9357,16 +9356,14 @@ void
 post_error

Re: [C PATCH] Make attributes accept enum values (PR c/50459)

2014-04-17 Thread Marek Polacek

On Wed, Apr 16, 2014 at 01:40:22PM -0400, Jason Merrill wrote:
 On 04/15/2014 03:56 PM, Marek Polacek wrote:
 The testsuite doesn't hit this code with C++, but does hit this code
 with C.  The thing is, if we have e.g.
 enum { A = 128 };
 void *fn1 (void) __attribute__((assume_aligned (A)));
 then handle_assume_aligned_attribute walks the attribute arguments
 and gets the argument via TREE_VALUE.  If this argument is an enum
 value, then for C the argument is identifier_node that contains const_decl,
 
 Ah.  Then I think the C parser should be fixed to check
 attribute_takes_identifier_p and look up the argument if false.

Ok, thanks, I didn't know about attribute_takes_identifier_p.  Should be done
in the following.  Regtested/bootstrapped on x86_64-linux.  Ok now?
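
A small sketch of what the PR asks for: an enumerator used directly as an
attribute argument in C.  It uses the aligned attribute rather than
assume_aligned so that the effect can be observed, and the names are
hypothetical.  It assumes a compiler that resolves the enumerator to its
value, which is the behaviour this patch is meant to provide.

```c
#include <stddef.h>

enum { TOY_ALIGN = 64 };

/* The enumerator is the attribute argument; no literal constant appears.  */
struct toy_padded { char c; } __attribute__ ((aligned (TOY_ALIGN)));

size_t toy_padded_alignment (void)
{
  return __alignof__ (struct toy_padded);
}
```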

2014-04-17  Marek Polacek  pola...@redhat.com

PR c/50459
c-family/
* c-common.c (handle_aligned_attribute): Don't call default_conversion
on FUNCTION_DECLs.
(handle_vector_size_attribute): Likewise.
(handle_sentinel_attribute): Call default_conversion and allow even
integral types as an argument.  
c/
* c-parser.c (c_parser_attributes): If the attribute doesn't take an
identifier, call lookup_name for arguments. 
testsuite/
* c-c++-common/pr50459.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index c0e247b..1443914 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -7539,7 +7539,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED 
(name), tree args,
   if (args)
 {
   align_expr = TREE_VALUE (args);
-  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE)
+  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE
+  && TREE_CODE (align_expr) != FUNCTION_DECL)
align_expr = default_conversion (align_expr);
 }
   else
@@ -8533,7 +8534,8 @@ handle_vector_size_attribute (tree *node, tree name, tree 
args,
   *no_add_attrs = true;
 
   size = TREE_VALUE (args);
-  if (size && TREE_CODE (size) != IDENTIFIER_NODE)
+  if (size && TREE_CODE (size) != IDENTIFIER_NODE
+  && TREE_CODE (size) != FUNCTION_DECL)
 size = default_conversion (size);
 
   if (!tree_fits_uhwi_p (size))
@@ -8944,8 +8946,12 @@ handle_sentinel_attribute (tree *node, tree name, tree 
args,
   if (args)
 {
   tree position = TREE_VALUE (args);
+  if (position && TREE_CODE (position) != IDENTIFIER_NODE
+  && TREE_CODE (position) != FUNCTION_DECL)
+   position = default_conversion (position);
 
-  if (TREE_CODE (position) != INTEGER_CST)
+  if (TREE_CODE (position) != INTEGER_CST
+  || !INTEGRAL_TYPE_P (TREE_TYPE (position)))
{
  warning (OPT_Wattributes,
   "requested position is not an integer constant");
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 5653e49..f8fe424 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -3912,6 +3912,7 @@ c_parser_attributes (c_parser *parser)
 || c_parser_next_token_is (parser, CPP_KEYWORD))
{
  tree attr, attr_name, attr_args;
+ bool attr_takes_id_p;
  vec<tree, va_gc> *expr_list;
  if (c_parser_next_token_is (parser, CPP_COMMA))
{
@@ -3922,6 +3923,7 @@ c_parser_attributes (c_parser *parser)
  attr_name = c_parser_attribute_any_word (parser);
  if (attr_name == NULL)
break;
+ attr_takes_id_p = attribute_takes_identifier_p (attr_name);
  if (is_cilkplus_vector_p (attr_name))   
{
  c_token *v_token = c_parser_peek_token (parser);
@@ -3950,6 +3952,15 @@ c_parser_attributes (c_parser *parser)
  == CPP_CLOSE_PAREN)))
{
  tree arg1 = c_parser_peek_token (parser)->value;
+ if (!attr_takes_id_p)
+   {
+ /* This is for enum values, so that they can be used as
+an attribute parameter; lookup_name will find their
+CONST_DECLs.  */
+ tree ln = lookup_name (arg1);
+ if (ln)
+   arg1 = ln;
+   }
  c_parser_consume_token (parser);
  if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
attr_args = build_tree_list (NULL_TREE, arg1);
diff --git gcc/testsuite/c-c++-common/pr50459.c 
gcc/testsuite/c-c++-common/pr50459.c
index e69de29..f954b32 100644
--- gcc/testsuite/c-c++-common/pr50459.c
+++ gcc/testsuite/c-c++-common/pr50459.c
@@ -0,0 +1,14 @@
+/* PR c/50459 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra" } */
+
+enum { A = 128, B = 1 };
+void *fn1 (void) __attribute__((assume_aligned (A)));
+void *fn2 (void) __attribute__((assume_aligned (A, 4)));
+void fn3 (void) __attribute__((constructor (A)));
+void fn4 (void) __attribute__((destructor (A)));
+void *fn5 (int) __attribute__((alloc_size (B)));
+void *fn6 (int) __attribute__((alloc_align (B)));
+void fn7 (const char *,

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Jan Hubicka

   +
   +  /* At this stage we know that majority of GGC memory is reachable.  
   + Growing the limits prevents unnecesary invocation of GGC.  */
   +  ggc_grow ();
  ggc_collect ();
  
  Isn't the collect here pointless?  I see not in ENABLE_CHECKING, but
  shouldn't this be abstracted away, thus call ggc_collect from ggc_grow?
  Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
  and simply drop the ggc_collect above ().
 
 I am fine with both.  I basically decided to keep the explicit ggc_collect() 
 to
 make it clear (from lto.c source code) that we are GGC safe at this point and
 to have way to double check that we do not produce too much of garbage with
 checking disabled. (so with -Q I will see how much it is collected at that 
 place).
 
 We can embed it into ggc_grow and document that w/o checking it is equivalent
 to ggc_collect.
  
  Anyway, this is sth for stage1 at this point.
 
 OK,
 Honza

Ping...
The patch saves 33 GGC runs during the libxul.so link, which is not that bad ;)

Honza
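
A simplified model of why raising the baseline avoids collections.  It assumes
a collector that only runs once the heap has grown by a fixed percentage over
the size recorded at the last collection.  The toy_ functions and the
percentage parameter are hypothetical and only mimic the shape of the
G.allocated / G.allocated_last_gc update in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Collect only once ALLOCATED exceeds the baseline by MIN_EXPAND_PCT.  */
bool toy_should_collect (size_t allocated, size_t allocated_last_gc,
                         unsigned min_expand_pct)
{
  size_t min_expand = allocated_last_gc / 100 * min_expand_pct;
  return allocated >= allocated_last_gc + min_expand;
}

/* Model of ggc_grow: treat everything allocated so far as reachable by
   raising the baseline to the current heap size.  Returns the new baseline.  */
size_t toy_grow (size_t allocated_last_gc, size_t allocated)
{
  return allocated_last_gc > allocated ? allocated_last_gc : allocated;
}
```

In this model a heap that grew from 100 to 500 units would trigger a
collection at the next check, whereas after toy_grow the next collection is
deferred until the heap grows well past 500.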
  
  Thanks,
  Richard.
  
  /* Set the hooks so that all of the ipa passes can read in their data. 
*/
   Index: ggc-none.c
   ===
   --- ggc-none.c(revision 209170)
   +++ ggc-none.c(working copy)
   @@ -63,3 +63,8 @@ ggc_free (void *p)
{
  free (p);
}
   +
   +void
   +ggc_grow (void)
   +{
   +}
   Index: ggc-page.c
   ===
   --- ggc-page.c(revision 209170)
   +++ ggc-page.c(working copy)
   @@ -2095,6 +2095,19 @@ ggc_collect (void)
fprintf (G.debug_file, "END COLLECTING\n");
}

   +/* Assume that all GGC memory is reachable and grow the limits for next 
   collection. */
   +
   +void
   +ggc_grow (void)
   +{
   +#ifndef ENABLE_CHECKING
   +  G.allocated_last_gc = MAX (G.allocated_last_gc,
   +  G.allocated);
   +#endif
   +  if (!quiet_flag)
   +fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 
   1024);
   +}
   +
/* Print allocation statistics.  */
#define SCALE(x) ((unsigned long) ((x) < 1024*10 \
   ? (x) \
   
   
  
  -- 
  Richard Biener rguent...@suse.de
  SUSE / SUSE Labs
  SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
  GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)

2014-04-17 Thread Zoran Jovanovic

Hello,
Unfortunately, the optimization is limited to bit-fields that have the same 
bit-field representative (DECL_BIT_FIELD_REPRESENTATIVE), and fields from 
different classes have different representatives.
In the given example, the optimization would merge the accesses to the x and y 
bit-fields from the Base class, but not the access to z from the Der class.

Regards,
Zoran
  

From: Daniel Gutson [daniel.gut...@tallertechnologies.com]
Sent: Wednesday, April 16, 2014 4:16 PM
To: Zoran Jovanovic
Cc: Bernhard Reutner-Fischer; Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)

On Wed, Apr 16, 2014 at 8:38 AM, Zoran Jovanovic
zoran.jovano...@imgtec.com wrote:
 Hello,
 This is new patch version.
 Lowering is applied only for bit-fields copy sequences that are merged.
 Data structure representing bit-field copy sequences is renamed and reduced 
 in size.
 Optimization turned on by default for -O2 and higher.
 Some comments fixed.

 Benchmarking performed on WebKit for Android.
 Code size reduction noticed on several files, best examples are:

 core/rendering/style/StyleMultiColData (632->520 bytes)
 core/platform/graphics/FontDescription (1715->1475 bytes)
 core/rendering/style/FillLayer (5069->4513 bytes)
 core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
 core/css/CSSSelectorList (4047->3887 bytes)
 core/platform/animation/CSSAnimationData (3844->3440 bytes)
 core/css/resolver/FontBuilder (13818->13350 bytes)
 core/platform/graphics/Font (16447->15975 bytes)


 Example:

 One of the motivating examples for this work was copy constructor of the 
 class which contains bit-fields.

 C++ code:
 class A
 {
 public:
 A(const A &x);
 unsigned a : 1;
 unsigned b : 2;
 unsigned c : 4;
 };

 A::A(const A &x)
 {
 a = x.a;
 b = x.b;
 c = x.c;
 }

Very interesting.

Does this work with inheritance too? E.g.

struct Base
{
uint32_t x:1;
uint32_t y:3;

Base(const Base& other) { x = other.x; y = other.y; }
};

struct Der : Base
{
Der() = default;
Der(const Der& other) : Base(other)
{ z = other.z; }

uint32_t z:9;
};



 GIMPLE code without optimization:

   <bb 2>:
   _3 = x_2(D)->a;
   this_4(D)->a = _3;
   _6 = x_2(D)->b;
   this_4(D)->b = _6;
   _8 = x_2(D)->c;
   this_4(D)->c = _8;
   return;

 Optimized GIMPLE code:
   <bb 2>:
   _10 = x_2(D)->D.1867;
   _11 = BIT_FIELD_REF <_10, 7, 0>;
   _12 = this_4(D)->D.1867;
   _13 = _12 & 128;
   _14 = (unsigned char) _11;
   _15 = _13 | _14;
   this_4(D)->D.1867 = _15;
   return;

 Generated MIPS32r2 assembly code without optimization:
 lw      $3,0($5)
 lbu     $2,0($4)
 andi    $3,$3,0x1
 andi    $2,$2,0xfe
 or      $2,$2,$3
 sb      $2,0($4)
 lw      $3,0($5)
 andi    $2,$2,0xf9
 andi    $3,$3,0x6
 or      $2,$2,$3
 sb      $2,0($4)
 lw      $3,0($5)
 andi    $2,$2,0x87
 andi    $3,$3,0x78
 or      $2,$2,$3
 j       $31
 sb      $2,0($4)

 Optimized MIPS32r2 assembly code:
 lw      $3,0($5)
 lbu     $2,0($4)
 andi    $3,$3,0x7f
 andi    $2,$2,0x80
 or      $2,$3,$2
 j       $31
 sb      $2,0($4)


 Algorithm works on basic block level and consists of following 3 major steps:
 1. Go through basic block statements list. If there are statement pairs that 
 implement copy of bit field content from one memory location to another 
 record statements pointers and other necessary data in corresponding data 
 structure.
 2. Identify records that represent adjacent bit field accesses and mark them 
 as merged.
 3. Lower bit-field accesses by using new field size for those that can be 
 merged.
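The three steps above can be sketched in a few lines of plain C. This is only an illustration under stated assumptions, not code from the patch: the record type `toy_bf_copy` and the helpers `toy_merge_adjacent` and `toy_copy_bits` are hypothetical names, each record is assumed to copy a field at the same bit offset in source and destination, and everything is assumed to fit in one byte (as in the class A example with a:1, b:2, c:4).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one bit-field copy (step 1 would fill these in
   while walking the statements of a basic block).  */
struct toy_bf_copy
{
  unsigned src_bit;  /* bit offset of the field that is read */
  unsigned dst_bit;  /* bit offset of the field that is written */
  unsigned size;     /* field width in bits */
};

/* Order records by destination bit offset.  */
static int
toy_cmp (const void *a, const void *b)
{
  const struct toy_bf_copy *x = a, *y = b;
  return (x->dst_bit > y->dst_bit) - (x->dst_bit < y->dst_bit);
}

/* Step 2: sort, then fold a record into its predecessor when both the
   source and the destination fields are adjacent.  Returns the number
   of records that remain.  */
static size_t
toy_merge_adjacent (struct toy_bf_copy *c, size_t n)
{
  if (n == 0)
    return 0;
  qsort (c, n, sizeof *c, toy_cmp);
  size_t out = 0;
  for (size_t i = 1; i < n; i++)
    {
      struct toy_bf_copy *prev = &c[out];
      if (c[i].dst_bit == prev->dst_bit + prev->size
          && c[i].src_bit == prev->src_bit + prev->size)
        prev->size += c[i].size;
      else
        c[++out] = c[i];
    }
  return out + 1;
}

/* Step 3: one masked copy of SIZE bits starting at bit offset BIT.  */
static uint8_t
toy_copy_bits (uint8_t dst, uint8_t src, unsigned bit, unsigned size)
{
  uint8_t mask = (uint8_t) (((1u << size) - 1u) << bit);
  return (uint8_t) ((dst & ~mask) | (src & mask));
}
```

For the three fields of class A the merged record has width 7, so the single masked copy uses the masks 0x7f and 0x80, which matches the constants in the optimized MIPS code quoted above.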


 New command line option -fmerge-bitfields is introduced.


 Tested - passed gcc regression tests for MIPS32r2.


 Changelog -

 gcc/ChangeLog:
 2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com)
   * common.opt (fmerge-bitfields): New option.
   * doc/invoke.texi: Add reference to -fmerge-bitfields.
   * tree-sra.c (lower_bitfields): New function.
   Entry for (-fmerge-bitfields).
   (bf_access_candidate_p): New function.
   (lower_bitfield_read): New function.
   (lower_bitfield_write): New function.
   (bitfield_stmt_bfcopy_pair::hash): New function.
   (bitfield_stmt_bfcopy_pair::equal): New function.
   (bitfield_stmt_bfcopy_pair::remove): New function.
   (create_and_insert_bfcopy): New function.
   (get_bit_offset): New function.
   (add_stmt_bfcopy_pair): New function.
   (cmp_bfcopies): New function.
   (get_merged_bit_field_size): New function.
   * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
   (field_byte_offset): Move declaration to tree.h and make it extern.
   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
   * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
   *

[C++ Patch] PR 59120

2014-04-17 Thread Paolo Carlini


Hi,

we can fix this crash during error recovery very easily, by grouping 
together the two conditions leading to skip & early return, in complete 
analogy with the single-declaration case (note that we explicitly commit 
to tentative parse at the beginning of the function, thus we are good). 
Tested x86_64-linux.


Thanks,
Paolo.

///
/cp
2014-04-17  Paolo Carlini  paolo.carl...@oracle.com

PR c++/59120
* parser.c (cp_parser_alias_declaration): Check return value of
cp_parser_require.

/testsuite
2014-04-17  Paolo Carlini  paolo.carl...@oracle.com

PR c++/59120
* g++.dg/cpp0x/alias-decl-42.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 209472)
+++ cp/parser.c (working copy)
@@ -16142,20 +16142,13 @@ cp_parser_alias_declaration (cp_parser* parser)
   if (parser->num_template_parameter_lists)
     parser->type_definition_forbidden_message = saved_message;
 
-  if (type == error_mark_node)
+  if (type == error_mark_node
+  || !cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
 {
   cp_parser_skip_to_end_of_block_or_statement (parser);
   return error_mark_node;
 }
 
-  cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
-
-  if (cp_parser_error_occurred (parser))
-{
-  cp_parser_skip_to_end_of_block_or_statement (parser);
-  return error_mark_node;
-}
-
   /* A typedef-name can also be introduced by an alias-declaration. The
  identifier following the using keyword becomes a typedef-name. It has
  the same semantics as if it were introduced by the typedef
Index: testsuite/g++.dg/cpp0x/alias-decl-42.C
===
--- testsuite/g++.dg/cpp0x/alias-decl-42.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/alias-decl-42.C  (working copy)
@@ -0,0 +1,4 @@
+// PR c++/59120
+// { dg-do compile { target c++11 } }
+
+template<typename T> using X = int T::T*;  // { dg-error "expected" }

[PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c

2014-04-17 Thread Kyrill Tkachov


Hi all,

While looking at the build logs I noticed a warning while building 
tree-ssa-loop-ivcanon.c about a potential use of an uninitialised variable.
This patchlet fixes that warning by initialising it to 0.

Tested arm-none-eabi.

Ok for trunk?

2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely):
Initialise n_unroll to 0.
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index cdf1559..7a83b12 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -656,7 +656,7 @@ try_unroll_loop_completely (struct loop *loop,
 			HOST_WIDE_INT maxiter,
 			location_t locus)
 {
-  unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
+  unsigned HOST_WIDE_INT n_unroll = 0, ninsns, max_unroll, unr_insns;
   gimple cond;
   struct loop_size size;
   bool n_unroll_found = false;

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

On 17 April 2014 16:51:23 Konstantin Serebryany 
konstantin.s.serebry...@gmail.com wrote:



On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 On 17 April 2014 16:07, Konstantin Serebryany
 konstantin.s.serebry...@gmail.com wrote:
 Hi,

 If you are trying to modify the libsanitizer files, please read here:
 https://code.google.com/p/address-sanitizer/wiki/HowToContribute

 I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
 please install it there, i do not have write access to the LLVM nor
 compiler-rt trees.

I can commit your patch to llvm tree only after you follow the process
described on that page.
Sorry, this is a hard rule.


What part of the process do you think I did not follow?

I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu, then 
provided the corresponding GCC parts, along with a backport of the new bits that 
I expect to be overwritten once you do a new merge, leaving just the GCC 
configury bits. This is how I read the wiki page you cite.


Please tell me what you expect me to do differently?

Thanks,


--kcc




Re: fuse-caller-save - hook format

2014-04-17 Thread Vladimir Makarov


On 2014-04-17, 11:29 AM, Richard Sandiford wrote:

Vladimir Makarov vmaka...@redhat.com writes:

On 2014-04-16, 3:19 PM, Tom de Vries wrote:

Vladimir,

All patches for the fuse-caller-save optimization have been ok-ed. The only part
not approved is the MIPS-specific part.

The objection of Richard S. is not so much the patch itself, but more the idea
of the hook fn_other_hard_reg_usage.

For clarity, I'm restating the current hook definition here:
...
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
Add any hard registers to @var{regs} that are set or clobbered by a call to the
function.  This hook only needs to add registers that cannot be found by
examination of the final RTL representation of a function.  This hook returns
true if it managed to determine which registers need to be added.  The default
version of this hook returns false.
...
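To make the hook contract quoted above concrete, here is a small stand-alone C model. It is a sketch under assumptions and does not use GCC's real types: `struct hard_reg_set_container` is replaced by a toy 64-bit mask, the register numbers and the `toy_` functions are hypothetical, and the fallback to the full ABI call-clobbered set when the hook returns false is my reading of "returns true if it managed to determine which registers need to be added".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for GCC's container: one bit per hard register.  */
struct hard_reg_set_container
{
  uint64_t set;
};

/* Hypothetical scratch registers clobbered around a call on an
   imaginary target (for example by call stubs), invisible in the
   callee's own RTL.  */
enum { TOY_REG_STUB_SCRATCH = 24, TOY_REG_CALL_ADDR = 25 };

/* Model of the default hook: nothing could be determined.  */
static bool
toy_default_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
{
  (void) regs;
  return false;
}

/* Model of a target hook: add the implicitly clobbered registers.  */
static bool
toy_target_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
{
  regs->set |= UINT64_C (1) << TOY_REG_STUB_SCRATCH;
  regs->set |= UINT64_C (1) << TOY_REG_CALL_ADDR;
  return true;
}

/* Caller side: start from the registers found by scanning the callee,
   let the hook add the implicit ones, and fall back to the whole ABI
   call-clobbered set if the hook could not answer (assumption).  */
static uint64_t
toy_call_clobbers (uint64_t found_in_rtl, uint64_t abi_call_clobbered,
                   bool (*hook) (struct hard_reg_set_container *))
{
  struct hard_reg_set_container regs = { found_in_rtl };
  if (!hook (&regs))
    return abi_call_clobbered;
  return regs.set;
}
```

The alternative discussed in this thread would record the same one or two registers as clobbers in CALL_INSN_FUNCTION_USAGE instead of asking a hook.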

Richard prefers to, rather than having a hook specifying what registers are
implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.

I can see these possibilities (and perhaps there are more):

1. We go with Richard's proposal: we make each target responsible for adding
these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
indicate whether a target has taken care of that, meaning it's safe to do the
fuse-caller-save optimization.

2. A mixed solution: we make each target responsible for specifying which
clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i.
targetm.call_clobbered_regs, and add generic code to add those clobbers to
CALL_INSN_FUNCTION_USAGE.

3. We stick with the current, approved hook format, and try to convince Richard
to live with it.


Since you are a register allocator maintainer, familiar with the
fuse-caller-save optimization, and have approved the original hook, I would like
to ask you to make a decision on how to proceed from here.



I have no preferences and it is a matter of taste.  Each solution has
its own advantages and disadvantages.  Putting this info into
CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big
drawback in RTL memory footprint (especially for some targets which have
a lot of regs like AM29k or IA64).  On the other hand, an analogous approach
is already used in the DF infrastructure (which would be nice to fix, imho).

Still, between GCC users and GCC developers, I'd prefer the solution for
users (even if the effect on the amount of resources used by GCC is quite
insignificant), as their number is a few orders of magnitude larger than
that of the developers.


Hmm, but you're talking like there are going to be a lot of these registers.


Yes, you are right.  That is what I thought.  I should have read Tom's 
email with more attention.



This isn't about which registers are call-clobbered or call-saved according
to the ABI (that's already available in other places).  All we want here
are the set of registers that are clobbered _in the caller_ before reaching
the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use
lots of registers in the process of making the call.

On all targets we should be talking about one or two registers here.



I see.  I guess your proposed solution is ok then.

[PATCH] Fix warning in libgfortran configure script

2014-04-17 Thread Kyrill Tkachov


Hi all,

While configuring libgfortran I'm getting this message:
libgfortran/configure: line 25938: test: =: unary operator expected
The script doesn't fail and continues afterwards, but I don't think it's 
supposed to give that warning.
This patch makes it go away and makes it more consistent with other similar uses 
(a few lines below, $ac_cv_lib_rt_clock_gettime is quoted when used in a test 
structure). configure.ac is updated and configure is regenerated with autoconf 2.64.


Ok for trunk?

Made sure libgfortran builds for arm-none-eabi.

libgfortran/
2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
* configure: Regenerate.
diff --git a/libgfortran/configure b/libgfortran/configure
index 23f57c7..d3ced74 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -25935,7 +25935,7 @@ fi
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
 $as_echo_n "checking for clock_gettime in -lrt... " >&6; }
 if test "${ac_cv_lib_rt_clock_gettime+set}" = set; then :
diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index de2d65e..24dbf2b 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -510,7 +510,7 @@ AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes AC_DEFINE([HAVE_FEENA
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   AC_CHECK_LIB(rt, clock_gettime,
 [AC_DEFINE(HAVE_CLOCK_GETTIME_LIBRT, 1,
[Define to 1 if you have the `clock_gettime' function in librt.])])

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

On Thu, Apr 17, 2014 at 8:45 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 On 17 April 2014 16:51:23 Konstantin Serebryany
 konstantin.s.serebry...@gmail.com wrote:

 On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
 rep.dot@gmail.com wrote:
  On 17 April 2014 16:07, Konstantin Serebryany
  konstantin.s.serebry...@gmail.com wrote:
  Hi,
 
  If you are trying to modify the libsanitizer files, please read here:
  https://code.google.com/p/address-sanitizer/wiki/HowToContribute
 
  I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
  please install it there, i do not have write access to the LLVM nor
  compiler-rt trees.

 I can commit your patch to llvm tree only after you follow the process
 described on that page.
 Sorry, this is a hard rule.


 What part of the process do you think I did not follow?

 I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu, then
 provided the corresponding GCC parts, along with a backport of the new bits that
 I expect to be overwritten once you do a new merge, leaving just the GCC
 configury bits. This is how I read the wiki page you cite.

 Please tell me what you expect me to do differently?

First, I did not notice that you've sent it to llvm-commits because it
was also sent to the gcc list (unusual thing to happen)
and got filtered into the gcc part of my mail. Sorry.
But second, the patch is far from trivial and you should not expect us
to commit it w/o a careful review,
so here comes another part of the wiki: For non-trivial patches
please use Phabricator -- this will help us reply faster.

--kcc



 Thanks,


 --kcc




RE: [PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c

2014-04-17 Thread Daniel Marjamäki

Hello!

I am not against it..

However I think there is no danger. I see no potential use of
uninitialized variable.

The use of n_unroll is guarded by n_unroll_found.

Best regards,
Daniel Marjamäki
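The pattern under discussion can be shown with a simplified stand-alone function. This is an illustration only, not the body of the GCC function: `toy_pick_unroll` and its parameters are hypothetical. The variable is written on every path that sets the flag and read only when the flag is set, so no uninitialised read can happen at run time, yet a compiler that does not track the correlation between the two variables may still warn.

```c
#include <assert.h>
#include <stdbool.h>

/* N_UNROLL is assigned only on the paths that also set N_UNROLL_FOUND,
   and is read only after N_UNROLL_FOUND has been checked.  */
static bool
toy_pick_unroll (bool have_niter, unsigned long niter, long maxiter,
                 unsigned long *result)
{
  unsigned long n_unroll = 0;   /* the "= 0" is what the patch adds */
  bool n_unroll_found = false;

  if (have_niter)
    {
      n_unroll = niter;
      n_unroll_found = true;
    }
  else if (maxiter >= 0)
    {
      n_unroll = (unsigned long) maxiter;
      n_unroll_found = true;
    }

  if (!n_unroll_found)
    return false;               /* n_unroll is never read on this path */

  *result = n_unroll;
  return true;
}
```

With the initialiser removed the function behaves identically, which is the point made above; the initialiser only silences the warning.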

PATCH: PR target/60868: [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -minline-all-stringops -minline-stringops-dynamically -march=core2

2014-04-17 Thread H.J. Lu

Hi,

GET_MODE returns VOIDmode on CONST_INT.  It happens with -O0.  This
patch uses counter_mode on count_exp to get mode.  Tested on
Linux/x86-64 without regressions.  OK for trunk and 4.9 branch?

Thanks.


H.J.
---
gcc/

2014-04-17  H.J. Lu  hongjiu...@intel.com

PR target/60868
* config/i386/i386.c (ix86_expand_set_or_movmem): Call counter_mode 
on count_exp to get mode.

gcc/testsuite/

2014-04-17  H.J. Lu  hongjiu...@intel.com

PR target/60868
* gcc.target/i386/pr60868.c: New testcase.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 536f50f..7a68623 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24392,7 +24392,8 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
  if (jump_around_label == NULL_RTX)
jump_around_label = gen_label_rtx ();
  emit_cmp_and_jump_insns (count_exp, GEN_INT (dynamic_check - 1),
-  LEU, 0, GET_MODE (count_exp), 1, hot_label);
+  LEU, 0, counter_mode (count_exp),
+  1, hot_label);
  predict_jump (REG_BR_PROB_BASE * 90 / 100);
  if (issetmem)
set_storage_via_libcall (dst, count_exp, val_exp, false);
diff --git a/gcc/testsuite/gcc.target/i386/pr60868.c b/gcc/testsuite/gcc.target/i386/pr60868.c
new file mode 100644
index 000..c30bbfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60868.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -minline-all-stringops -minline-stringops-dynamically -march=core2" } */
+
+void bar (float *);
+
+void foo (void)
+{
+  float b[256] = {0};
+  bar(b);
+}

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Richard Biener

On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote:
   +
   +  /* At this stage we know that majority of GGC memory is
reachable.  
   + Growing the limits prevents unnecesary invocation of GGC. 
*/
   +  ggc_grow ();
  ggc_collect ();
  
  Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
but
  shouldn't this be abstracted away, thus call ggc_collect from
ggc_grow?
  Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
  and simply drop the ggc_collect above ().
 
 I am fine with both.  I basically decided to keep the explicit
ggc_collect() to
 make it clear (from lto.c source code) that we are GGC safe at this
point and
 to have way to double check that we do not produce too much of
garbage with
 checking disabled. (so with -Q I will see how much it is collected at
that place).
 
 We can embed it into ggc_grow and document that w/o checking it is
equivalent
 to ggc_cooect.
  
  Anyway, this is sth for stage1 at this point.
 
 OK,
 Honza

Ping...
the patches saves 33 GGC runs during libxul.so link, that is not that
bad ;)

What is the updated patch you propose?

Richard

Honza
  
  Thanks,
  Richard.
  
  /* Set the hooks so that all of the ipa passes can read in
their data.  */
   Index: ggc-none.c
  
===
   --- ggc-none.c   (revision 209170)
   +++ ggc-none.c   (working copy)
   @@ -63,3 +63,8 @@ ggc_free (void *p)
{
  free (p);
}
   +
   +void
   +ggc_grow (void)
   +{
   +}
   Index: ggc-page.c
  
===
   --- ggc-page.c   (revision 209170)
   +++ ggc-page.c   (working copy)
   @@ -2095,6 +2095,19 @@ ggc_collect (void)
fprintf (G.debug_file, "END COLLECTING\n");
}

   +/* Assume that all GGC memory is reachable and grow the limits
for next collection. */
   +
   +void
   +ggc_grow (void)
   +{
   +#ifndef ENABLE_CHECKING
   +  G.allocated_last_gc = MAX (G.allocated_last_gc,
   + G.allocated);
   +#endif
   +  if (!quiet_flag)
   +fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
   +}
   +
/* Print allocation statistics.  */
#define SCALE(x) ((unsigned long) ((x) < 1024*10 \
  ? (x) \
   
   
  
  -- 
  Richard Biener rguent...@suse.de
  SUSE / SUSE Labs
  SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
  GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Jan Hubicka

 On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote:
+
+  /* At this stage we know that majority of GGC memory is
 reachable.  
+ Growing the limits prevents unnecesary invocation of GGC. 
 */
+  ggc_grow ();
   ggc_collect ();
   
   Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
 but
   shouldn't this be abstracted away, thus call ggc_collect from
 ggc_grow?
   Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
   and simply drop the ggc_collect above ().
  
  I am fine with both.  I basically decided to keep the explicit
 ggc_collect() to
  make it clear (from lto.c source code) that we are GGC safe at this
 point and
  to have way to double check that we do not produce too much of
 garbage with
  checking disabled. (so with -Q I will see how much it is collected at
 that place).
  
  We can embed it into ggc_grow and document that w/o checking it is
 equivalent
  to ggc_cooect.
   
   Anyway, this is sth for stage1 at this point.
  
  OK,
  Honza
 
 Ping...
 the patches saves 33 GGC runs during libxul.so link, that is not that
 bad ;)
 
 What is the updated patch you propose?

I was trying to explain why I kept the explicit ggc_collect just after ggc_grow:

I want to make it clear that we are ggc safe at that point. I also want to see
the ggc run happening w/o checking, so that -Q reports how much garbage we see
at this stage and I can keep an eye on it.

I can hide the ENABLE_CHECKING ggc_collect call in ggc_grow and update the
documentation if you prefer.

Honza
 
 Richard
 
 Honza
   
   Thanks,
   Richard.
   
   /* Set the hooks so that all of the ipa passes can read in
 their data.  */
Index: ggc-none.c
   
 ===
--- ggc-none.c (revision 209170)
+++ ggc-none.c (working copy)
@@ -63,3 +63,8 @@ ggc_free (void *p)
 {
   free (p);
 }
+
+void
+ggc_grow (void)
+{
+}
Index: ggc-page.c
   
 ===
--- ggc-page.c (revision 209170)
+++ ggc-page.c (working copy)
@@ -2095,6 +2095,19 @@ ggc_collect (void)
 fprintf (G.debug_file, "END COLLECTING\n");
 }
 
+/* Assume that all GGC memory is reachable and grow the limits
 for next collection. */
+
+void
+ggc_grow (void)
+{
+#ifndef ENABLE_CHECKING
+  G.allocated_last_gc = MAX (G.allocated_last_gc,
+   G.allocated);
+#endif
+  if (!quiet_flag)
+fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
+}
+
 /* Print allocation statistics.  */
 #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
 ? (x) \


   
   -- 
   Richard Biener rguent...@suse.de
   SUSE / SUSE Labs
   SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
   GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 08:35 AM, Tristan Gingold wrote:
 What about this compile-time check:
 
 package Fatptralign is
type String_Acc is access String;
type Integer_acc is access Integer;
 
pragma Compile_Time_Error
 (String_Acc'Alignment = 1 * Integer_Acc'Alignment,
  "Fat pointer are simply aligned");
 
pragma Compile_Time_Error
 (String_Acc'Alignment = 2 * Integer_Acc'Alignment,
  "Fat pointer are doubly aligned");
 end Fatptralign;

Yes, that seems to work, even with a cross-compiler.


r~

Re: [PATCH] C++ thunk section names

2014-04-17 Thread Sriraman Tallam

Ping.

On Wed, Feb 5, 2014 at 4:31 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

   I would like this patch reviewed and considered for commit when
 Stage 1 is active again.

 Patch Description:

 A C++ thunk's section name is set to be the same as the section name of the
 original function for which the thunk was created, in order to place the two
 together.  This is done in cp/method.c in function use_thunk.
 However, with function reordering turned on, the original function's section
 name can change to something like .text.hot.original or
 .text.unlikely.original in function default_function_section in varasm.c,
 based on the node count of that function.  The thunk function's section name
 is not updated to match the original there, and updating it would not always
 be correct either, as the original function can be hotter than the thunk.

 I have created a patch to not give the thunk function's section the same name
 as the original function's when function reordering is enabled.

 Thanks
 Sri

Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program

2014-04-17 Thread Jan Hubicka

Hi,
I think for 4.10 we should revisit inliner behaviour to be more LTO and LTO+FDO
ready. This is the first of the small patches I made to sanitize the behaviour of
the current bounds.

The main problem LTO brings is that we get way too many inline candidates. In the
per-file model only a small percentage of calls are inlinable, since most of them
go to other units, so our current heuristics behave quite well, usually inlining
all calls that they consider beneficial.

With LTO almost all calls are inlinable and if we inline everything we consider
profitable we get insane code size growths, so practically always we hit our
30% unit growth threshold.  This is not always a good idea.  Reducing
inline-insns-auto/inline-insns-single to avoid the inliner hitting the growth limit
would cause a regression on benchmarks that need inlining of large functions.

LLVM seems to get around the problem by doing code expanding inlining at compile
time (in the equivalent of our early inliner). This makes functions big, so the LTO
doesn't inline much, but it also misses useful cross-module inlines and replaces
them by less useful intra-module ones.

Another approach would be to have inline-insns-crossmodule that is significantly
smaller than inline-insns-auto. We already have a crossmodule hint that probably
ought to be made smarter to not fire on COMDAT functions.
I do not want to do it, since the numbers I collected in
http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html
suggest that inline-insns-auto is already quite a bad limit.

I would be happy to hear about alternative solutions to this.  We may want
to switch the whole program inliner to a temperature style bound, like open64 does.

Well, this patch actually goes in a bit different direction - making the unit
growth threshold more sane.

While looking into inliner behaviour on Firefox to write my blog entry
I noticed that with profile feedback only a very small portion of the program
is trained (15%) and only around 7% of the code contains something that we consider
hot.

The inliner however still hits the inline-unit-growth limit with:
Unit growth for small function inlining: 7232256->9220597 (27%)
Inlined 183353 calls, eliminated 54652 functions

We do not grow the code in the cold portions of the program, but because of
the dead padding we grow everything we consider hot 4 times, instead
of 1.3 times as we would usually do if it was unpadded.

This patch fixes the problem by considering only non-cold functions for
the unit size calculation.  We now get:

Unit growth for small function inlining: 2083217->2537163 (21%)
Inlined 134611 calls, eliminated 53586 functions

So while the relative growth is still close to 30%, the absolute
growth is only 22% of the previous one.  We inline fewer calls but
in the dynamic stats there is a very minor (sub 0.01%) difference.
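The accounting change and the numbers quoted above can be checked with a short stand-alone C model. It is a sketch, not the inliner: `struct toy_node` and the `toy_` helpers are hypothetical, and the budget formula is the simple "initial size times 1.3" reading of inline-unit-growth=30, ignoring any other limits the real pass applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function summary.  */
struct toy_node
{
  long long size;   /* estimated size in instructions */
  bool external;    /* DECL_EXTERNAL-like: optimized out if not inlined */
  bool cold;        /* unlikely executed (attribute or profile feedback) */
};

/* Initial unit size as the patch computes it: external and cold
   functions are not counted.  */
static long long
toy_initial_unit_size (const struct toy_node *nodes, size_t n)
{
  long long total = 0;
  for (size_t i = 0; i < n; i++)
    if (!nodes[i].external && !nodes[i].cold)
      total += nodes[i].size;
  return total;
}

/* Size budget for a given inline-unit-growth percentage (simplified).  */
static long long
toy_max_unit_size (long long initial_size, int inline_unit_growth)
{
  return initial_size + initial_size * inline_unit_growth / 100;
}

/* Relative growth in whole percent, truncated as in the dump lines.  */
static long long
toy_growth_percent (long long before, long long after)
{
  return (after - before) * 100 / before;
}
```

Applied to the figures above: 7232256->9220597 is a growth of 1988341 (27%), 2083217->2537163 is a growth of 453946 (21%), and 453946 is about 22% of 1988341.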

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* ipa-inline.c (inline_small_functions): Account only non-cold
functions.

* doc/invoke.texi (inline-unit-growth): Update documentation.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 209461)
+++ ipa-inline.c(working copy)
@@ -1585,7 +1590,10 @@ inline_small_functions (void)
         struct inline_summary *info = inline_summary (node);
         struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux;
 
-        if (!DECL_EXTERNAL (node->decl))
+        /* Do not account external functions, they will be optimized out
+           if not inlined.  Also only count the non-cold portion of program.  */
+        if (!DECL_EXTERNAL (node->decl)
+            && node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED)
           initial_size += info->size;
         info->growth = estimate_growth (node);
         if (dfs && dfs->next_cycle)
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 209461)
+++ doc/invoke.texi (working copy)
@@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
 @item inline-unit-growth
 Specifies maximal overall growth of the compilation unit caused by inlining.
 The default value is 30 which limits unit growth to 1.3 times the original
-size.
+size. Cold functions (either marked cold via an attribute or by profile
+feedback) are not accounted into the unit size.
 
 @item ipcp-unit-growth
 Specifies maximal overall growth of the compilation unit caused by

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Richard Biener

On April 17, 2014 7:18:05 PM CEST, Jan Hubicka hubi...@ucw.cz wrote:
 On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz
wrote:
+
+  /* At this stage we know that majority of GGC memory is
 reachable.  
+ Growing the limits prevents unnecesary invocation of
GGC. 
 */
+  ggc_grow ();
   ggc_collect ();
   
   Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
 but
   shouldn't this be abstracted away, thus call ggc_collect from
 ggc_grow?
   Or maybe rather even for ENABLE_CHECKING adjust
G.allocated_last_gc
   and simply drop the ggc_collect above ().
  
  I am fine with both.  I basically decided to keep the explicit
 ggc_collect() to
  make it clear (from lto.c source code) that we are GGC safe at
this
 point and
  to have way to double check that we do not produce too much of
 garbage with
  checking disabled. (so with -Q I will see how much it is collected
at
 that place).
  
  We can embed it into ggc_grow and document that w/o checking it is
 equivalent
  to ggc_cooect.
   
   Anyway, this is sth for stage1 at this point.
  
  OK,
  Honza
 
 Ping...
 the patches saves 33 GGC runs during libxul.so link, that is not
that
 bad ;)
 
 What is the updated patch you propose?

I was trying to explain, why I kept explicit ggc_collect just after
ggc_grow:

I want to make it clear that we are ggc safe at that point. I also want
to see
the ggc run happening w/o checking to have -Q report how much of
garbage we see
at this stage so I can keep eye on it.

I can hide ENABLE_CHECKING ggc_collect call in ggc_grow and update
documentation if your preffer.

I'd prefer that.  OK with that change.

Thanks,
Richard.
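The variant agreed on here (ggc_grow collects when checking is enabled and otherwise only raises the threshold) can be modelled in a few lines. This is a toy sketch, not the ggc-page.c code: the `toy_` names, the global `toy_G`, the `toy_collections` counter and the `TOY_ENABLE_CHECKING` macro are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Toy collector state: bytes currently allocated and bytes that were
   allocated after the last collection.  */
static struct
{
  size_t allocated;
  size_t allocated_last_gc;
} toy_G;

static int toy_collections;   /* how many collections have run */

/* Toy collection: pretend everything is reachable, so nothing is freed
   and only the bookkeeping is updated.  */
static void
toy_ggc_collect (void)
{
  toy_collections++;
  toy_G.allocated_last_gc = toy_G.allocated;
}

/* Assume all memory is reachable and grow the limit for the next
   collection.  With checking enabled, run a real collection instead so
   that GC safety at this point keeps being exercised.  */
static void
toy_ggc_grow (void)
{
#ifdef TOY_ENABLE_CHECKING
  toy_ggc_collect ();
#else
  if (toy_G.allocated > toy_G.allocated_last_gc)
    toy_G.allocated_last_gc = toy_G.allocated;
#endif
}
```

Built without `TOY_ENABLE_CHECKING`, calling `toy_ggc_grow` moves the threshold without running a collection, which is where the saved GGC runs mentioned in this thread come from.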

Honza
 
 Richard
 
 Honza
   
   Thanks,
   Richard.
   
   /* Set the hooks so that all of the ipa passes can read in
 their data.  */
Index: ggc-none.c
   
 ===
--- ggc-none.c(revision 209170)
+++ ggc-none.c(working copy)
@@ -63,3 +63,8 @@ ggc_free (void *p)
 {
   free (p);
 }
+
+void
+ggc_grow (void)
+{
+}
Index: ggc-page.c
   
 ===
--- ggc-page.c(revision 209170)
+++ ggc-page.c(working copy)
@@ -2095,6 +2095,19 @@ ggc_collect (void)
 fprintf (G.debug_file, "END COLLECTING\n");
 }
 
+/* Assume that all GGC memory is reachable and grow the
limits
 for next collection. */
+
+void
+ggc_grow (void)
+{
+#ifndef ENABLE_CHECKING
+  G.allocated_last_gc = MAX (G.allocated_last_gc,
+  G.allocated);
+#endif
+  if (!quiet_flag)
+fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
+}
+
 /* Print allocation statistics.  */
 #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
? (x) \


   
   -- 
   Richard Biener rguent...@suse.de
   SUSE / SUSE Labs
   SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
   GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer

Re: [gomp4] Add tables generation

2014-04-17 Thread Ilya Verbin

On 27 Mar 17:16, Jakub Jelinek wrote:
 On Thu, Mar 27, 2014 at 08:13:00PM +0400, Ilya Verbin wrote:
  On 27 Mar 15:02, Jakub Jelinek wrote:
   The tables need to be created before IPA, that way it really shouldn't
   matter in what order you emit them.  E.g. the outlined target functions
   could be added to the table during ompexp pass which actually creates the
   outlined functions, the vars need to be added before target lto or host lto
   is streamed.
  
  For host tables it's ok, but when target compiler will create tables with functions?
  It reads bytecode from target_lto sections, so it never executes ompexp pass.
 
 Which is why the table created for host by the ompexp pass should be
 streamed into the target_lto sections (marked specially somehow, special
 attribute or whatever), and then corresponding target table created from
 that, rather then created from some possibly different ordering there.
 
   Jakub

Hi Jakub,

Could you please take a look at this patch?  It fixes the ordering issue in the
tables stated above, and passes all the tests that I have.  But I'm not sure
about its correctness from the architectural point of view.
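The ordering guarantee rests on a simple idea: the host side writes one tagged entry per function and per variable, in table order, followed by a zero terminator, and the target side rebuilds its tables by reading the entries back in the same order. The following stand-alone C model shows that round trip. It is a sketch under assumptions: the `toy_` stream, tags and functions are hypothetical, plain integers stand in for decl indices, and the reader is my guess at the counterpart of the writer in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy tags; 0 is reserved as the end-of-table marker.  */
enum toy_tag { TOY_END = 0, TOY_FUNC = 1, TOY_VAR = 2 };

/* Toy in-memory stream standing in for an LTO section.  */
struct toy_stream
{
  unsigned buf[64];
  size_t len;   /* number of words written */
  size_t pos;   /* read cursor */
};

static void
toy_put (struct toy_stream *s, unsigned v)
{
  if (s->len < sizeof s->buf / sizeof s->buf[0])
    s->buf[s->len++] = v;
}

static unsigned
toy_get (struct toy_stream *s)
{
  return s->pos < s->len ? s->buf[s->pos++] : TOY_END;
}

/* Writer: functions first, then variables, then the terminator.  */
static void
toy_output_tables (struct toy_stream *s, const unsigned *funcs, size_t nf,
                   const unsigned *vars, size_t nv)
{
  for (size_t i = 0; i < nf; i++)
    {
      toy_put (s, TOY_FUNC);
      toy_put (s, funcs[i]);
    }
  for (size_t i = 0; i < nv; i++)
    {
      toy_put (s, TOY_VAR);
      toy_put (s, vars[i]);
    }
  toy_put (s, TOY_END);
}

/* Reader: append entries to the matching table in stream order.
   CAP is the capacity of each output array.  */
static void
toy_input_tables (struct toy_stream *s, unsigned *funcs, size_t *nf,
                  unsigned *vars, size_t *nv, size_t cap)
{
  unsigned tag;
  *nf = *nv = 0;
  while ((tag = toy_get (s)) != TOY_END)
    {
      unsigned index = toy_get (s);
      if (tag == TOY_FUNC && *nf < cap)
        funcs[(*nf)++] = index;
      else if (tag == TOY_VAR && *nv < cap)
        vars[(*nv)++] = index;
    }
}
```

Because both sides walk the same stream, the host and target tables end up with identical ordering regardless of the order in which either compiler would otherwise discover the symbols.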


---
 gcc/lto-cgraph.c   | 93 ++
 gcc/lto-section-in.c   |  3 +-
 gcc/lto-streamer-out.c |  2 ++
 gcc/lto-streamer.h |  3 ++
 gcc/lto/lto.c  |  2 ++
 gcc/omp-low.c  | 68 +++-
 6 files changed, 115 insertions(+), 56 deletions(-)

diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 544f04b..3d6637e 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -82,6 +82,8 @@ enum LTO_symtab_tags
   LTO_symtab_last_tag
 };
 
+extern vec<tree, va_gc> *offload_funcs, *offload_vars;
+
 /* Create a new symtab encoder.
if FOR_INPUT, the encoder allocate only datastructures needed
to read the symtab.  */
@@ -958,6 +960,51 @@ output_symtab (void)
   output_refs (encoder);
 }
 
+void
+output_offload_tables (void)
+{
+  /* Collect all omp-target global variables to offload_vars, if they have not
+ been gathered earlier by input_offload_tables.  */
+  if (vec_safe_is_empty (offload_vars))
+{
+  struct varpool_node *vnode;
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+   {
+ if (!lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (vnode->decl))
+ || TREE_CODE (vnode->decl) != VAR_DECL
+ || DECL_SIZE (vnode->decl) == 0)
+   continue;
+ vec_safe_push (offload_vars, vnode->decl);
+   }
+}
+
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+return;
+
+  struct lto_simple_output_block *ob
+= lto_create_simple_output_block (LTO_section_offload_table);
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+{
+  streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+  LTO_symtab_last_tag, LTO_symtab_unavail_node);
+  lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
+   (*offload_funcs)[i]);
+}
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+{
+  streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+  LTO_symtab_last_tag, LTO_symtab_variable);
+  lto_output_var_decl_index (ob->decl_state, ob->main_stream,
+(*offload_vars)[i]);
+}
+
+  streamer_write_uhwi_stream (ob->main_stream, 0);
+  lto_destroy_simple_output_block (ob);
+}
+
 /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
STACK_SIZE, SELF_TIME and SELF_SIZE.  This is called either to initialize
NODE or to replace the values in it, for instance because the first
@@ -1611,6 +1658,52 @@ input_symtab (void)
 }
 }
 
+void
+input_offload_tables (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+{
+  const char *data;
+  size_t len;
+  struct lto_input_block *ib
+   = lto_create_simple_input_block (file_data, LTO_section_offload_table,
+&data, &len);
+  if (!ib)
+   continue;
+
+  enum LTO_symtab_tags tag
+   = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+  while (tag)
+   {
+ if (tag == LTO_symtab_unavail_node)
+   {
+ int decl_index = streamer_read_uhwi (ib);
+ tree fn_decl
+   = lto_file_decl_data_get_fn_decl (file_data, decl_index);
+ vec_safe_push (offload_funcs, fn_decl);
+   }
+ else if (tag == LTO_symtab_variable)
+   {
+ int decl_index = streamer_read_uhwi (ib);
+ tree var_decl
+   = lto_file_decl_data_get_var_decl (file_data, decl_index);
+ vec_safe_push (offload_vars, var_decl);
+   }
+

Re: Inliner heuristics TLC 1/n - let small function inlining to ignore cold portion of program

2014-04-17 Thread David Malcolm

On Thu, 2014-04-17 at 19:52 +0200, Jan Hubicka wrote:

[...]

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi   (revision 209461)
 +++ doc/invoke.texi   (working copy)
 @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
  @item inline-unit-growth
  Specifies maximal overall growth of the compilation unit caused by inlining.
  The default value is 30 which limits unit growth to 1.3 times the original
 -size.
 +size. Cold functions (either marked cold via an attribibute or by profile
FWIW, there's a trivial typo here-^^

Go patch committed: Mark various expressions as immutable

2014-04-17 Thread Ian Lance Taylor

This patch from Chris Manghane marks various expression types as
immutable: numerics, constants, type info, address of, type conversion
when appropriate.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 194e0f47c9e5 go/expressions.cc
--- a/go/expressions.cc	Wed Apr 16 13:33:13 2014 -0700
+++ b/go/expressions.cc	Thu Apr 17 11:57:28 2014 -0700
@@ -555,6 +555,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_unsigned_long(NULL, 0);
@@ -1422,6 +1426,10 @@
   do_is_constant() const
   { return true; }
 
+  bool
+  do_is_immutable() const
+  { return true; }
+
   Type*
   do_type();
 
@@ -1790,6 +1798,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const;
 
   Type*
@@ -2109,6 +2121,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_float(this->type_, this->val_);
@@ -2292,6 +2308,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_complex(this->type_, this->real_, this->imag_);
@@ -2506,6 +2526,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const;
 
   bool
@@ -2994,6 +3018,9 @@
   do_is_constant() const;
 
   bool
+  do_is_immutable() const;
+
+  bool
   do_numeric_constant_value(Numeric_constant*) const;
 
   bool
@@ -3175,6 +3202,27 @@
   return true;
 }
 
+// Return whether a type conversion is immutable.
+
+bool
+Type_conversion_expression::do_is_immutable() const
+{
+  Type* type = this->type_;
+  Type* expr_type = this->expr_->type();
+
+  if (type->interface_type() != NULL
+  || expr_type->interface_type() != NULL)
+return false;
+
+  if (!this->expr_->is_immutable())
+return false;
+
+  if (Type::are_identical(type, expr_type, false, NULL))
+return true;
+
+  return type->is_basic_type() && expr_type->is_basic_type();
+}
+
 // Return the constant numeric value if there is one.
 
 bool
@@ -3599,7 +3647,8 @@
 
   bool
   do_is_immutable() const
-  { return this->expr_->is_immutable(); }
+  { return this->expr_->is_immutable()
+  || (this->op_ == OPERATOR_AND && this->expr_->is_variable()); }
 
   bool
   do_numeric_constant_value(Numeric_constant*) const;
@@ -14076,6 +14125,10 @@
   { }
 
  protected:
+  bool
+  do_is_immutable() const
+  { return true; }
+
   Type*
   do_type();

Re: Inliner heuristics TLC 1/n - let small function inlining to ignore cold portion of program

2014-04-17 Thread Xinliang David Li

This looks fine.  LIPO has similar change too.  Other directions worth
looking into:

1) To model the icache effect better, weighted callee size needs to be
used with profile. The weight for a BB may look like: min(1,
FREQ(BB)/FREQ(ENTRY)).
2) When function splitting is turned on, are any inline heuristic
changes needed? E.g. only consider the hot code part of a node for
unit growth computation?
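
A minimal sketch of the weighting in point 1, assuming each basic block carries a
size estimate and a profile frequency.  Block and weighted_callee_size are
hypothetical names for illustration only, not existing GCC data structures:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-basic-block summary: estimated size and profile frequency.
struct Block {
  double size;
  double freq;
};

// Sum of block sizes, each scaled by min(1, FREQ(BB)/FREQ(ENTRY)).
// Blocks executed at least as often as the entry count in full;
// colder blocks contribute proportionally less.
double weighted_callee_size(const std::vector<Block>& blocks,
                            double entry_freq) {
  double total = 0.0;
  for (const Block& bb : blocks)
    total += bb.size * std::min(1.0, bb.freq / entry_freq);
  return total;
}
```

For example, with an entry frequency of 100, blocks of (size 10, freq 100),
(size 20, freq 50) and (size 4, freq 400) give 10 + 10 + 4 = 24 instead of the
unweighted 34.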

We are also looking into a more aggressive approach to track a per-loop
(inter-procedural) region growth limit, instead of using one single
global limit.

David

On Thu, Apr 17, 2014 at 10:52 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 I think for 4.10 we should revisit inliner behaviour to be more LTO and 
 LTO+FDO
 ready. This is first of small patches I made to sanitize behaviour of current 
 bounds.

 The main problem LTO brings is that we get way too many inline candidates. In 
 per-file
 model one gets only small percentage of calls inlinable, since most of them 
 go to other
 units, so our current heuristics behave quite well, usually inlining all
 calls that it considers beneficial.

 With LTO almost all calls are inlinable and if we inline everything we 
 consider
 profitable we get insane code size growths, so practically always we hit our
 30% unit growth threshold.  This is not always a good idea.  Reducing
 inline-insns-auto/inline-insns-single to avoid inliner hitting the growth 
 limit
 would cause a regression on benchmarks that needs inlining of large functions.

 LLVM seems to get around the problem by doing code expanding inlining at 
 compile
 time (in equivalent of our early inliner). This makes functions big, so the 
 LTO
 doesn't inline much, but it also misses useful cross-module inlines and
 replaces them by less useful inter-module.

 Other approach would be to have inline-insns-crossmodule that is significantly
 smaller than inline-insns-auto. We already have crossmodule hint that probably
 ought to be made smarter to not fire on COMDAT functions.
 I do not want to do it, since the numbers I collected in
 http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html
 suggest that inline-insns-auto is already quite bad limit.

 I would be happy to hear about alternative solutions to this.  We may want
 to switch whole program inliner into temperature style bound, like open64 
 does.

 Well, this patch actually goes in a slightly different direction - making the
 unit growth threshold more sane.

 While looking into inliner behaviour at Firefox to write my blog entry
 I noticed that with profile feedback only very small portion of the program
 is trained (15%) and only around 7% of code contains something that we 
 consider
 hot.

 Inliner however still hits the inline-unit-growth limit with:
 Unit growth for small function inlining: 7232256->9220597 (27%)
 Inlined 183353 calls, eliminated 54652 function

 We do not grow the code in the cold portions of program, but because of
 the dead padding we grow everything we consider hot 4 times, instead
 of 1.3 times as we would usually do if it was unpadded.

 This patch fixes the problem by considering only non-cold functions for
 frequency calculation.  We now get:

 Unit growth for small function inlining: 2083217->2537163 (21%)
 Inlined 134611 calls, eliminated 53586 functions

 So while the relative growth is still close to 30%, the absolute
 growth is only 22% of the previous one.  We inline fewer calls but
 in the dynamic stats there is a very minor (sub 0.01%) difference.
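
The percentages quoted above can be re-derived from the two dump lines with plain
integer arithmetic.  The sketch below assumes the dump truncates toward zero (an
assumption, but it matches both printed values); growth_percent and percent_of are
hypothetical helper names:

```cpp
#include <cstdint>

// Relative unit growth in percent, truncated toward zero.
std::int64_t growth_percent(std::int64_t before, std::int64_t after) {
  return (after - before) * 100 / before;
}

// How large PART is relative to WHOLE, in percent, truncated toward zero.
std::int64_t percent_of(std::int64_t part, std::int64_t whole) {
  return part * 100 / whole;
}
```

With the numbers from the dumps: growth_percent(7232256, 9220597) is 27 and
growth_percent(2083217, 2537163) is 21, while the absolute growth after the patch
(2537163 - 2083217 = 453946) is 22% of the absolute growth before it
(9220597 - 7232256 = 1988341).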

 Bootstrapped/regtested x86_64-linux, will commit it shortly.

 Honza

 * ipa-inline.c (inline_small_functions): Account only non-cold
 functions.

 * doc/invoke.texi (inline-unit-growth): Update documentation.
 Index: ipa-inline.c
 ===
 --- ipa-inline.c(revision 209461)
 +++ ipa-inline.c(working copy)
 @@ -1585,7 +1590,10 @@ inline_small_functions (void)
 struct inline_summary *info = inline_summary (node);
 struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux;

 -   if (!DECL_EXTERNAL (node->decl))
 +   /* Do not account external functions, they will be optimized out
 +  if not inlined.  Also only count the non-cold portion of 
 program.  */
 +   if (!DECL_EXTERNAL (node->decl)
 +&& node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED)
   initial_size += info->size;
 info->growth = estimate_growth (node);
 if (dfs && dfs->next_cycle)
 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi (revision 209461)
 +++ doc/invoke.texi (working copy)
 @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
  @item inline-unit-growth
  Specifies maximal overall growth of the compilation unit caused by inlining.
  The default value is 30 which limits unit growth to 1.3 times the original
 -size.
 +size.

RE: [PATCH v7?] PR middle-end/60281

2014-04-17 Thread Bernd Edlinger

Hi Lin,

On Thu, 17 Apr 2014 22:29:14, Lin Zuojian wrote:

 Hi Bernd,
 I have my copyright mark signed and the process has completed. Now I
 am going to answer two more questions before my patch can be
 committed, right?

 "Did you copy any
 files or text written by someone else in these changes?"

 no

 [Which files have you changed so far, and which new files have you written
 so far?]
 gcc/asan.c
 gcc/ChangeLog
 gcc/cfgexpand.c

 Okay, you may review my patch again, if there is no problem, please
 commit it for me.
 --
 Regards
 lin zuojian

I am not sure if your patch was already approved by a global GCC reviewer.
That is however absolutely necessary before it can be committed.

I think it would be best to re-submit the latest version of your patch now,
and ask a global reviewer for approval.

The message should be sent to gcc-patches@gcc.gnu.org and contain the
following information in addition to the proposed patch itself and the
change-log entry:

a) On which target(s) did you boot-strap your patch?
 
b) Did you run the testsuite?

c) When you compare the test results with and without the patch, were there any 
regressions?


Regards
Bernd.

Go patch committed: Only convert function type when necessary

2014-04-17 Thread Ian Lance Taylor

This patch to the Go frontend fixes it to not convert the function type
in a call when calling an interface method.  The function type of an
interface method is not correct, since it does not include the receiver,
but the type of the method field is correct, and as such should not be
converted.  This is PR 60870.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Tested by Ulrich Weigand on PPC.  Committed
to mainline.

Ian

diff -r 43e2635914c2 go/expressions.cc
--- a/go/expressions.cc	Thu Apr 17 12:09:37 2014 -0700
+++ b/go/expressions.cc	Thu Apr 17 12:24:08 2014 -0700
@@ -9619,9 +9619,20 @@
   fn = Expression::make_compound(set_closure, fn, location);
 }
 
-  Btype* bft = fntype->get_backend_fntype(gogo);
   Bexpression* bfn = tree_to_expr(fn->get_tree(context));
-  bfn = gogo->backend()->convert_expression(bft, bfn, location);
+
+  // When not calling a named function directly, use a type conversion
+  // in case the type of the function is a recursive type which refers
+  // to itself.  We don't do this for an interface method because 1)
+  // an interface method never refers to itself, so we always have a
+  // function type here; 2) we pass an extra first argument to an
+  // interface method, so fntype is not correct.
+  if (func == NULL && !is_interface_method)
+{
+  Btype* bft = fntype->get_backend_fntype(gogo);
+  bfn = gogo->backend()->convert_expression(bft, bfn, location);
+}
+
   Bexpression* call = gogo->backend()->call_expression(bfn, fn_args, location);
 
   if (this-results_ != NULL)

Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.

2014-04-17 Thread Richard Hulme


On 15/04/14 22:58, DJ Delorie wrote:

I typically leave the mode off when the operand accepts a CONST_INT as
I've had problems with patterns matching CONST_INTs otherwise, as
CONST_INT rtx's do not have a mode (or have VOIDmode).

(yes, I know gcc is supposed to accommodate that, but like I said, I've
had problems...)


Ok, that's fine.  I was just trying to mop up one little bit of the sea 
of warnings.


It seems a little inconsistent, however, that *movqi_real and 
*xorqi3_real don't specify modes but *movhi_real and 
*andqi_real/*iorqi_real do (and they also accept CONST_INTs).  Not that 
I'm advocating generating more warnings, but my inner OCD likes 
consistency :)


Richard.

Re: [patch] change specific int128 - generic intN

2014-04-17 Thread Marc Glisse


On Tue, 15 Apr 2014, DJ Delorie wrote:


I wasn't sure what to do with that array, since it was static and
couldn't have empty slots in them like the arrays in tree.h.  Also,
do we need to have *every* type in that list?  What's the rule for
whether a type gets installed there or not?  The comment says
guaranteed to be in the runtime support but does that mean for this
particular build (wrt multilibs) as not all intN types are guaranteed
(even the int128 types were not guaranteed to be supported before my
patch).  In other parts of the patch, just taking out the special case
for __int128 was sufficient to do the right thing for all __intN
types.


You need someone who understands this better than me (ask Jason). To be 
able to throw/catch a type, you need some typeinfo symbols. The front-end 
generates that for classes when they are defined. For fundamental types, 
it assumes libsupc++ will provide it, and the function you are modifying 
is the one generating libsupc++ (I am surprised your patch didn't cause 
any failure on x64_64, at least in abi_check). We need to generate the 
typeinfo for __intN, either in libsupc++, or in each TU, and since both 
cases will require code, I assume libsupc++ is preferable.
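
As a small illustration of why typeinfo matters here: a handler is selected by
comparing the type of the thrown object with the handler's type at run time, so
every type that can be thrown needs its type_info to exist somewhere.  The sketch
below uses only standard C++ and ordinary fundamental types (not __intN);
handler_matches is a hypothetical helper written for this example:

```cpp
#include <typeinfo>

// Throw a value of type Thrown and report whether a handler for
// Handler& catches it.  No integral conversions are applied when
// matching handlers, so only the same type matches for fundamentals.
template <typename Thrown, typename Handler>
bool handler_matches(Thrown value) {
  try {
    throw value;
  } catch (Handler&) {
    return true;
  } catch (...) {
    return false;
  }
}

// Distinct fundamental types have distinct type_info objects.
bool int_and_long_differ() { return typeid(int) != typeid(long); }
```

Here handler_matches<int, int>(1) returns true and handler_matches<int, long>(1)
returns false; both answers depend on the runtime having type information for the
fundamental types involved.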



I can certainly put the intN types in there, but note that it would
mean regenerating the fundamentals[] array at runtime to include those
types which are supported at the time.


After the patch I linked, it should just mean calling the helper function 
on your new types, no need to touch the array.



Do the entries in the array need to be in a particular order?


No, any random order would do.

--
Marc Glisse

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 08:56 AM, Eric Botcazou wrote:
 I presume that the attached kludge is sufficient to make it work?
 
 
   * fe.h (Compiler_Abort): Replace Fat_Pointer by String.
   (Error_Msg_N): Likewise.
   (Error_Msg_NE): Likewise.
   (Get_External_Name_With_Suffix): Likewise.
   * types.h (Fat_Pointer): Delete.
   (String): New type.
   (DECLARE_STRING): New macro.
   * gcc-interface/decl.c (create_concat_name): Adjust.
   * gcc-interface/trans.c (post_error): Likewise.
   (post_error_ne): Likewise.
   * gcc-interface/misc.c (internal_error_function): Likewise.

Yes, this bootstrapped.


r~

Go patch committed: Use backend interface for constant expressions

2014-04-17 Thread Ian Lance Taylor

This patch from Chris Manghane changes the Go frontend to use the
backend interface for global constants.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2014-04-17  Chris Manghane  cm...@google.com

* go-gcc.cc (Gcc_backend::named_constant_expression): New
function.


Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 209494)
+++ gcc/go/go-gcc.cc	(revision 209495)
@@ -227,6 +227,10 @@ class Gcc_backend : public Backend
   indirect_expression(Bexpression* expr, bool known_valid, Location);
 
   Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+			Bexpression* val, Location);
+
+  Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val);
 
   Bexpression*
@@ -962,6 +966,29 @@ Gcc_backend::indirect_expression(Bexpres
   return tree_to_expr(ret);
 }
 
+// Return an expression that declares a constant named NAME with the
+// constant value VAL in BTYPE.
+
+Bexpression*
+Gcc_backend::named_constant_expression(Btype* btype, const std::string& name,
+   Bexpression* val, Location location)
+{
+  tree type_tree = btype->get_tree();
+  tree const_val = val->get_tree();
+  if (type_tree == error_mark_node || const_val == error_mark_node)
+return this->error_expression();
+
+  tree name_tree = get_identifier_from_string(name);
+  tree decl = build_decl(location.gcc_location(), CONST_DECL, name_tree,
+			 type_tree);
+  DECL_INITIAL(decl) = const_val;
+  TREE_CONSTANT(decl) = 1;
+  TREE_READONLY(decl) = 1;
+
+  go_preserve_from_gc(decl);
+  return this->make_expression(decl);
+}
+
 // Return a typed value as a constant integer.
 
 Bexpression*
Index: gcc/go/gofrontend/gogo-tree.cc
===
--- gcc/go/gofrontend/gogo-tree.cc	(revision 209494)
+++ gcc/go/gofrontend/gogo-tree.cc	(revision 209495)
@@ -1015,44 +1015,22 @@ Named_object::get_tree(Gogo* gogo, Named
 {
 case NAMED_OBJECT_CONST:
   {
-	Named_constant* named_constant = this->u_.const_value;
 	Translate_context subcontext(gogo, function, NULL, NULL);
-	tree expr_tree = named_constant->expr()->get_tree(subcontext);
-	if (expr_tree == error_mark_node)
-	  decl = error_mark_node;
-	else
+	Type* type = this->u_.const_value->type();
+	Location loc = this->location();
+
+	Expression* const_ref = Expression::make_const_reference(this, loc);
+Bexpression* const_decl =
+	  tree_to_expr(const_ref->get_tree(subcontext));
+	if (type != NULL && type->is_numeric_type())
 	  {
-	Type* type = named_constant->type();
-	if (type != NULL && !type->is_abstract())
-	  {
-		if (type->is_error())
-		  expr_tree = error_mark_node;
-		else
-		  {
-		Btype* btype = type->get_backend(gogo);
-		expr_tree = fold_convert(type_to_tree(btype), expr_tree);
-		  }
-	  }
-	if (expr_tree == error_mark_node)
-	  decl = error_mark_node;
-	else if (INTEGRAL_TYPE_P(TREE_TYPE(expr_tree)))
-	  {
-tree name = get_identifier_from_string(this->get_id(gogo));
-		decl = build_decl(named_constant->location().gcc_location(),
-  CONST_DECL, name, TREE_TYPE(expr_tree));
-		DECL_INITIAL(decl) = expr_tree;
-		TREE_CONSTANT(decl) = 1;
-		TREE_READONLY(decl) = 1;
-	  }
-	else
-	  {
-		// A CONST_DECL is only for an enum constant, so we
-		// shouldn't use for non-integral types.  Instead we
-		// just return the constant itself, rather than a
-		// decl.
-		decl = expr_tree;
-	  }
+	Btype* btype = type->get_backend(gogo);
+	std::string name = this->get_id(gogo);
+const_decl =
+	  gogo->backend()->named_constant_expression(btype, name,
+			 const_decl, loc);
 	  }
+	decl = expr_to_tree(const_decl);
   }
   break;
 
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h	(revision 209494)
+++ gcc/go/gofrontend/backend.h	(revision 209495)
@@ -257,6 +257,12 @@ class Backend
   virtual Bexpression*
   indirect_expression(Bexpression* expr, bool known_valid, Location) = 0;
 
+  // Return an expression that declares a constant named NAME with the
+  // constant value VAL in BTYPE.
+  virtual Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+ Bexpression* val, Location) = 0;
+
   // Return an expression for the multi-precision integer VAL in BTYPE.
   virtual Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val) = 0;
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc	(revision 209494)
+++ gcc/go/gofrontend/expressions.cc	(revision 209495)
@@ -2792,12 +2792,12 @@ Const_expression::do_get_tree(Translate_
   // If the type has been set for this expression, but the underlying
   // object is an abstract int

Re: Patch ping

2014-04-17 Thread Uros Bizjak

On Wed, Apr 16, 2014 at 11:35 PM, Jeff Law l...@redhat.com wrote:

 I'd like to ping 2 patches:

 http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html
 - Ensure GET_MODE_{SIZE,INNER,NUNITS} (const) is constant rather than
memory load after optimization (I'd like to keep the current
 MODE_SIZE
patch for the reasons mentioned there, but also add this patch)

 This is fine.  Per the follow-up discussion, I think you can mark it as
 resolving 36109 as well.



 http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00131.html
 - PR target/59617
handle gather loads for AVX512 (at least non-masked ones, masked ones
will need to wait for 5.0 and we need to find how to represent it in
GIMPLE)

 I'll leave this to Uros :-)

IIRC, this patch was already committed to 4.9 some time ago.

Uros.

[PATCH], PR target/60876 -- fix build issue with powerpc

2014-04-17 Thread Michael Meissner

I committed the following patch as obvious to fix the PowerPC build issue that
came up with changes to machmode.h.  These changes allow the compiler to build
and bootstrap. Submitted as subversion id 209498.
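
For readers unfamiliar with this class of build failure, here is a small
standalone sketch of the pattern the patch applies: the loop counter is a plain
integer, so it is converted to the enum type once and the enum-typed copy is
passed on.  The toy_mode enum and toy_mode_size are hypothetical stand-ins, not
the real machine_mode or GET_MODE_SIZE; the sketch only assumes that the callee
requires the enum type, as the ChangeLog entry states.

```cpp
// Hypothetical stand-in for enum machine_mode.
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, NUM_TOY_MODES };

// Hypothetical stand-in for GET_MODE_SIZE: it takes the enum type, and
// C++ does not implicitly convert an int argument to an enum.
inline unsigned toy_mode_size(toy_mode m) {
  switch (m) {
    case TOY_QI: return 1;
    case TOY_HI: return 2;
    case TOY_SI: return 4;
    default: return 0;
  }
}

// Iterate with an int counter, as the rs6000 loops do, and convert
// explicitly before every use that needs the enum.
unsigned total_mode_size() {
  unsigned total = 0;
  for (int m = 0; m < NUM_TOY_MODES; ++m) {
    toy_mode m2 = static_cast<toy_mode>(m);  // mirrors "m2 = (enum machine_mode)m"
    total += toy_mode_size(m2);
  }
  return total;
}
```

Calling toy_mode_size(m) with the int counter would be rejected by a C++
compiler, which is the kind of error the explicit m2 variable avoids.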

2014-04-17  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/60876
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Make sure
GET_MODE_SIZE gets passed an enum machine_mode type and not
integer.
(rs6000_init_hard_regno_mode_ok): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 209494)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -2329,6 +2329,8 @@ rs6000_setup_reg_addr_masks (void)
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
 {
+  enum machine_mode m2 = (enum machine_mode)m;
+
   /* SDmode is special in that we want to access it only via REG+REG
 addressing on power7 and above, since we want to use the LFIWZX and
 STFIWZX instructions to load it.  */
@@ -2363,13 +2365,13 @@ rs6000_setup_reg_addr_masks (void)
 
  if (TARGET_UPDATE
   && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
-  && GET_MODE_SIZE (m) <= 8
-  && !VECTOR_MODE_P (m)
-  && !COMPLEX_MODE_P (m)
+  && GET_MODE_SIZE (m2) <= 8
+  && !VECTOR_MODE_P (m2)
+  && !COMPLEX_MODE_P (m2)
   && !indexed_only_p
-  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m) == 8)
-  && !(m == DFmode && TARGET_UPPER_REGS_DF)
-  && !(m == SFmode && TARGET_UPPER_REGS_SF))
+  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
+  && !(m2 == DFmode && TARGET_UPPER_REGS_DF)
+  && !(m2 == SFmode && TARGET_UPPER_REGS_SF))
{
  addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -2815,6 +2817,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
{
+ enum machine_mode m2 = (enum machine_mode)m;
  int reg_size2 = reg_size;
 
  /* TFmode/TDmode always takes 2 registers, even in VSX.  */
@@ -2823,7 +2826,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
reg_size2 = UNITS_PER_FP_WORD;
 
  rs6000_class_max_nregs[m][c]
-   = (GET_MODE_SIZE (m) + reg_size2 - 1) / reg_size2;
+   = (GET_MODE_SIZE (m2) + reg_size2 - 1) / reg_size2;
}
 }

Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.

2014-04-17 Thread DJ Delorie


 It seems a little inconsistent, however, that *movqi_real and 
 *xorqi3_real don't specify modes but *movhi_real and 
 *andqi_real/*iorqi_real do (and they also accept CONST_INTs).  Not that 
 I'm advocating generating more warnings, but my inner OCD likes 
 consistency :)

Adding the mode might be the right way, but I've seen cases where it
wasn't.  My paranoia supersedes my OCD ;-)

libgo patch committed: Avoid unnecessary gccgo extension

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne avoids an unnecessary gccgo
extension in libgo.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 801009f33610 libgo/go/syscall/libcall_posix.go
--- a/libgo/go/syscall/libcall_posix.go	Thu Apr 17 16:01:58 2014 -0700
+++ b/libgo/go/syscall/libcall_posix.go	Thu Apr 17 16:03:58 2014 -0700
@@ -138,7 +138,7 @@
 //sys	Select(nfd int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) (n int, err error)
 //select(nfd _C_int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) _C_int
 
-const nfdbits = int(unsafe.Sizeof(fds_bits_type) * 8)
+const nfdbits = int(unsafe.Sizeof(fds_bits_type(0)) * 8)
 
 type FdSet struct {
 	Bits [(FD_SETSIZE + nfdbits - 1) / nfdbits]fds_bits_type

libgo patch committed: Use delete rather than old map deletion syntax

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne changes libgo to use the builtin
delete function rather than the old map deletion syntax.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r 1e27a38c43ea libgo/go/syscall/syscall_unix.go
--- a/libgo/go/syscall/syscall_unix.go	Thu Apr 17 16:13:05 2014 -0700
+++ b/libgo/go/syscall/syscall_unix.go	Thu Apr 17 16:17:50 2014 -0700
@@ -153,7 +153,7 @@
 	if errno := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b))); errno != nil {
 		return errno
 	}
-	m.active[p] = nil, false
+	delete(m.active, p)
 	return nil
 }

libgo patch committed: Avoid duplicate function declarations in syscall

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne avoids duplicate function
declarations in the generated libcalls.go file when building the syscall
package.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
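
The fix is the usual "emit once" pattern: remember which C function names already
have a declaration and skip the declaration on later wrappers for the same name.
A rough C++ rendering of that logic follows; the real script is awk, and
generate_wrappers plus the simplified output strings are hypothetical, made up
only to show the pattern:

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Each entry is (Go wrapper name, C function name).  Several Go wrappers
// may share one C function, but its extern declaration is emitted once.
std::vector<std::string> generate_wrappers(
    const std::vector<std::pair<std::string, std::string> >& wrappers) {
  std::set<std::string> declared;  // C names already declared
  std::vector<std::string> out;
  for (const auto& w : wrappers) {
    // insert() reports whether the name was new; declare only then.
    if (declared.insert(w.second).second)
      out.push_back("//extern " + w.second);
    out.push_back("func " + w.first);
  }
  return out;
}
```

Two wrappers that both call the same C function therefore produce one extern
line followed by two wrapper lines.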

Ian

diff -r 5009262c3e56 libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk	Thu Apr 17 16:26:57 2014 -0700
+++ b/libgo/go/syscall/mksyscall.awk	Thu Apr 17 16:30:29 2014 -0700
@@ -96,8 +96,11 @@
 cfnresult = line
 
 printf("// Automatically generated wrapper for %s/%s\n", gofnname, cfnname)
-printf("//extern %s\n", cfnname)
-printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+if (!(cfnname in cfns)) {
+cfns[cfnname] = 1
+printf("//extern %s\n", cfnname)
+printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+}
 printf("func %s(%s) %s%s%s%s{\n",
 	   gofnname, gofnparams, gofnresults == "" ? "" : "(", gofnresults,
 	   gofnresults == "" ? "" : ")", gofnresults == "" ? "" : " ")

[PATCH, rs6000, 4.8, 4.9, trunk] Fix little endian behavior of vec_merge[hl] for V4SI/V4SF with VSX

2014-04-17 Thread Bill Schmidt

Hi,

I missed a case in the vector API work for little endian.  When VSX is
enabled, the vec_mergeh and vec_mergel interfaces for 4x32 vectors are
translated into xxmrghw and xxmrglw.  The patterns for these were not
adjusted for little endian.  This patch fixes this and adds tests for
V4SI and V4SF modes when VSX is available.
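
Why the fix swaps both the instruction and the operands can be checked with a
small model.  The sketch assumes (my reading, not stated in the mail) that
xxmrghw and xxmrglw interleave words 0-1 and words 2-3 of their inputs in
big-endian register numbering, and that on little endian element i of a vector
sits in register word 3-i.  Words, flip, mergeh_le and mergel_le are hypothetical
names for this illustration:

```cpp
#include <algorithm>
#include <array>

using Words = std::array<int, 4>;  // register contents, big-endian word order

// Simplified models of the two instructions (assumed semantics).
Words xxmrghw(const Words& x, const Words& y) { return {{x[0], y[0], x[1], y[1]}}; }
Words xxmrglw(const Words& x, const Words& y) { return {{x[2], y[2], x[3], y[3]}}; }

// Convert between little-endian element order and register word order.
Words flip(Words w) {
  std::reverse(w.begin(), w.end());
  return w;
}

// Little-endian "merge high": the opposite instruction with swapped operands.
Words mergeh_le(const Words& a, const Words& b) {
  return flip(xxmrglw(flip(b), flip(a)));
}

// Little-endian "merge low", by the same reasoning.
Words mergel_le(const Words& a, const Words& b) {
  return flip(xxmrghw(flip(b), flip(a)));
}
```

With a = {0,1,2,3} and b = {4,5,6,7} in element order, mergeh_le gives {0,4,1,5}
and mergel_le gives {2,6,3,7}, the same element-order results that xxmrghw and
xxmrglw produce directly on big endian.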

Bootstrapped and tested on 4.8, 4.9, and trunk for
powerpc64le-unknown-linux-gnu with no regressions.  Tests are still
ongoing for powerpc64-unknown-linux-gnu.  Provided those complete
without regressions, is this fix ok for trunk, 4.9, and 4.8?

Thanks,
Bill


[gcc]

2014-04-17  Bill Schmidt  wschm...@linux.vnet.ibm.com

* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust for
little-endian.
(vsx_xxmrglw_<mode>): Likewise.

[gcc/testsuite]

2014-04-17  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.dg/vmx/merge-vsx.c: Add V4SI and V4SF tests.
* gcc.dg/vmx/merge-vsx-be-order.c: Likewise.


Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 209513)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -1891,7 +1891,12 @@
  (parallel [(const_int 0) (const_int 4)
 (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrghw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+return "xxmrghw %x0,%x1,%x2";
+  else
+return "xxmrglw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "vsx_xxmrglw_<mode>"
@@ -1903,7 +1908,12 @@
  (parallel [(const_int 2) (const_int 6)
 (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrglw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+return "xxmrglw %x0,%x1,%x2";
+  else
+return "xxmrghw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 ;; Shift left double by word immediate
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c   (revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c   (working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
 
   /* Result vectors.  */
   vector long long vlh, vll;
   vector double vdh, vdl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector float vfh, vfl;
 
   /* Expected result vectors.  */
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
@@ -32,11 +41,23 @@ static void test()
   vector long long vlrl = {0,-2};
   vector double vdrh = {1.0,-1.0};
   vector double vdrl = {0.0,-2.0};
+  vector unsigned int vuirh = {6,2,7,3};
+  vector unsigned int vuirl = {4,0,5,1};
+  vector signed int vsirh = {2,-2,3,-1};
+  vector signed int vsirl = {0,-4,1,-3};
+  vector float vfrh = {2.0,-2.0,3.0,-1.0};
+  vector float vfrl = {0.0,-4.0,1.0,-3.0};
 #else
   vector long long vlrh = {-2,0};
   vector long long vlrl = {-1,1};
   vector double vdrh = {-2.0,0.0};
   vector double vdrl = {-1.0,1.0};
+  vector unsigned int vuirh = {0,4,1,5};
+  vector unsigned int vuirl = {2,6,3,7};
+  vector signed int vsirh = {-4,0,-3,1};
+  vector signed int vsirl = {-2,2,-1,3};
+  vector float vfrh = {-4.0,0.0,-3.0,1.0};
+  vector float vfrl = {-2.0,2.0,-1.0,3.0};
 #endif
 
   vlh = vec_mergeh (vla, vlb);
@@ -43,9 +64,21 @@ static void test()
   vll = vec_mergel (vla, vlb);
   vdh = vec_mergeh (vda, vdb);
   vdl = vec_mergel (vda, vdb);
+  vuih = vec_mergeh (vuia, vuib);
+  vuil = vec_mergel (vuia, vuib);
+  vsih = vec_mergeh (vsia, vsib);
+  vsil = vec_mergel (vsia, vsib);
+  vfh  = vec_mergeh (vfa,  vfb );
+  vfl  = vec_mergel (vfa,  vfb );
 
   check (vec_long_long_eq (vlh, vlrh), "vlh");
   check (vec_long_long_eq (vll, vlrl), "vll");
   check (vec_double_eq (vdh, vdrh), "vdh" );
   check (vec_double_eq (vdl, vdrl), "vdl" );
+  check (vec_all_eq (vuih, vuirh), "vuih");
+  check (vec_all_eq (vuil, vuirl), "vuil");
+  check (vec_all_eq (vsih, vsirh), "vsih");
+  check (vec_all_eq (vsil, vsirl), "vsil");
+  check (vec_all_eq (vfh,  vfrh),  "vfh");
+  check (vec_all_eq (vfl,  vfrl),  "vfl");
 }
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx.c
===
--- gcc/testsuite/gcc.dg/vmx/merge-vsx.c(revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx.c(working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa =

[PATCH v8] PR middle-end/60281

2014-04-17 Thread lin zuojian

 Hi,
Here is the patch after Jakub's review; Jakub also helped with the
coding style.

--

 * asan.c (asan_emit_stack_protection):
 Force the base to align to appropriate bits if STRICT_ALIGNMENT.  Set
 shadow_mem align to appropriate bits if STRICT_ALIGNMENT. 
 * cfgexpand.c
 (expand_stack_vars): Set base_align appropriately when asan is on.
 (expand_used_vars): Leave a space in the stack frame for alignment if
 STRICT_ALIGNMENT.

---
 gcc/ChangeLog   |  9 +
 gcc/asan.c  | 15 +++
 gcc/cfgexpand.c | 18 --
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index da35be8..30a2b33 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2014-04-18  Lin Zuojian  manjian2...@gmail.com
+   PR middle-end/60281
+   * asan.c (asan_emit_stack_protection): Force the base to align to
+   appropriate bits if STRICT_ALIGNMENT.  Set shadow_mem align to
+   appropriate bits if STRICT_ALIGNMENT.
+   * cfgexpand.c (expand_stack_vars): Set base_align appropriately
+   when asan is on.
+   (expand_used_vars): Leave a space in the stack frame for alignment
+   if STRICT_ALIGNMENT.
 2014-04-17  Jakub Jelinek  ja...@redhat.com
 
PR target/60847
diff --git a/gcc/asan.c b/gcc/asan.c
index 53992a8..28a476f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1017,8 +1017,17 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
       base_align_bias = ((asan_frame_size + alignb - 1)
                          & ~(alignb - HOST_WIDE_INT_1)) - asan_frame_size;
     }
+  /* Align base if target is STRICT_ALIGNMENT.  */
+  if (STRICT_ALIGNMENT)
+    base = expand_binop (Pmode, and_optab, base,
+                         gen_int_mode (-((GET_MODE_ALIGNMENT (SImode)
+                                          << ASAN_SHADOW_SHIFT)
+                                         / BITS_PER_UNIT), Pmode), NULL_RTX,
+                         1, OPTAB_DIRECT);
+
   if (use_after_return_class == -1 && pbase)
     emit_move_insn (pbase, base);
+
   base = expand_binop (Pmode, add_optab, base,
   gen_int_mode (base_offset - base_align_bias, Pmode),
   NULL_RTX, 1, OPTAB_DIRECT);
@@ -1097,6 +1106,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
               && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
   shadow_mem = gen_rtx_MEM (SImode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+  if (STRICT_ALIGNMENT)
+set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
   prev_offset = base_offset;
   for (l = length; l; l -= 2)
 {
@@ -1186,6 +1197,10 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 
   shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+
+  if (STRICT_ALIGNMENT)
+set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
+
   prev_offset = base_offset;
   last_offset = base_offset;
   last_size = 0;
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7f6360..14511e1 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1013,10 +1013,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
               if (data->asan_base == NULL)
                 data->asan_base = gen_reg_rtx (Pmode);
               base = data->asan_base;
+
+              if (!STRICT_ALIGNMENT)
+                base_align = crtl->max_used_stack_slot_alignment;
+              else
+                base_align = MAX (crtl->max_used_stack_slot_alignment,
+                                  GET_MODE_ALIGNMENT (SImode)
+                                  << ASAN_SHADOW_SHIFT);
             }
           else
-            offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
-          base_align = crtl->max_used_stack_slot_alignment;
+            {
+              offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+              base_align = crtl->max_used_stack_slot_alignment;
+            }
         }
       else
         {
@@ -1845,6 +1854,11 @@ expand_used_vars (void)
= alloc_stack_frame_space (redzonesz, ASAN_RED_ZONE_SIZE);
  data.asan_vec.safe_push (prev_offset);
  data.asan_vec.safe_push (offset);
+          /* Leave space for alignment if STRICT_ALIGNMENT.  */
+          if (STRICT_ALIGNMENT)
+            alloc_stack_frame_space ((GET_MODE_ALIGNMENT (SImode)
+                                      << ASAN_SHADOW_SHIFT)
+                                     / BITS_PER_UNIT, 1);
 
  var_end_seq
= asan_emit_stack_protection (virtual_stack_vars_rtx,
-- 
1.8.3.2

--
Regards
lin zuojian
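
The alignment arithmetic used in the asan.c and cfgexpand.c hunks can be tried outside the compiler. The following is a minimal sketch, not the GCC code itself: it assumes a shadow shift of 3 and a 32-bit SImode alignment, and the names `base_align_bytes`, `round_down` and `round_up` are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; in GCC they come from the target.  */
enum
{
  SHADOW_SHIFT = 3,        /* stands in for ASAN_SHADOW_SHIFT */
  SIMODE_ALIGN_BITS = 32,  /* stands in for GET_MODE_ALIGNMENT (SImode) */
  BITS_PER_BYTE = 8        /* stands in for BITS_PER_UNIT */
};

/* Byte alignment of the frame base that makes (base >> SHADOW_SHIFT)
   a multiple of the SImode alignment in bytes.  */
uintptr_t
base_align_bytes (void)
{
  return ((uintptr_t) SIMODE_ALIGN_BITS << SHADOW_SHIFT) / BITS_PER_BYTE;
}

/* Round ADDR down to a multiple of ALIGN, which must be a power of two.
   This mirrors the "base & -align" form used for the frame base.  */
uintptr_t
round_down (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return addr & -align;
}

/* Round SIZE up to a multiple of ALIGN, which must be a power of two.
   This mirrors the "(size + align - 1) & ~(align - 1)" form.  */
uintptr_t
round_up (uintptr_t size, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}
```

Under these assumed values the base alignment is 32 bytes, so rounding the base down can move it by at most 31 bytes. That is consistent with expand_used_vars reserving extra frame space of the same size before the base is masked.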

Re: [PATCH v8] PR middle-end/60281

2014-04-17 Thread lin zuojian

Hi Bernd,
a) On which target(s) did you boot-strap your patch?
I only ran it on x86. I can't run it on ARM, because Android is neither a
POSIX system nor a System V compatible system. My code does not
affect x86.

b) Did you run the testsuite?
Yes, but again my code does not affect x86.

c) When you compare the test results with and without the patch, were there any 
regressions?
The only difference is that the bug is gone. My app can run on my Android ARM system.

On Fri, Apr 18, 2014 at 12:21:50PM +0800, lin zuojian wrote:

[patch, testsuite] Fix fragile case nsdmi-union5

2014-04-17 Thread Joey Ye

Resulting from discussion here:
http://gcc.gnu.org/ml/gcc/2014-04/msg00125.html

ChangeLog:
* g++.dg/cpp0x/nsdmi-union5.C: Change to runtime test.

Index: gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C
===
--- gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C   (revision 209462)
+++ gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C   (working copy)
@@ -1,6 +1,5 @@
 // PR c++/58701
-// { dg-require-effective-target c++11 }
-// { dg-final { scan-assembler "7" } }
+// { dg-do run { target c++11 } }
 
 static union
 {
@@ -9,3 +8,10 @@
 int i = 7;
   };
 };
+
+extern "C" void abort(void);
+int main()
+{
+  if (i != 7) abort();
+  return 0;
+}

[C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marek Polacek

This patch implements a new warning that triggers when the controlling
expression of a switch has boolean value.  (I intentionally don't
warn if the controlling expression is an (un)signed:1 bit-field.)
I guess the question is whether this should be enabled by default or
deserves a new warning option.  Since clang does the former,
I did too, so currently this warning is enabled by default.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-04-17  Marek Polacek  pola...@redhat.com

PR c/60439
c/
* c-typeck.c (c_start_case): Warn if switch condition has boolean
value.
testsuite/
* gcc.dg/pr60439.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 65aad45..91b1109 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -9344,6 +9344,28 @@ c_start_case (location_t switch_loc,
   else
{
  tree type = TYPE_MAIN_VARIANT (orig_type);
+ tree e = exp;
+ enum tree_code exp_code;
+
+ while (TREE_CODE (e) == COMPOUND_EXPR)
+   e = TREE_OPERAND (e, 1);
+ exp_code = TREE_CODE (e);
+
+ if (TREE_CODE (type) == BOOLEAN_TYPE
+ || exp_code == TRUTH_ANDIF_EXPR
+ || exp_code == TRUTH_AND_EXPR
+ || exp_code == TRUTH_ORIF_EXPR
+ || exp_code == TRUTH_OR_EXPR
+ || exp_code == TRUTH_XOR_EXPR
+ || exp_code == TRUTH_NOT_EXPR
+ || exp_code == EQ_EXPR
+ || exp_code == NE_EXPR
+ || exp_code == LE_EXPR
+ || exp_code == GE_EXPR
+ || exp_code == LT_EXPR
+ || exp_code == GT_EXPR)
+            warning_at (switch_cond_loc, 0,
+                        "switch condition has boolean value");
 
           if (!in_system_header_at (input_location)
               && (type == long_integer_type_node
diff --git gcc/testsuite/gcc.dg/pr60439.c gcc/testsuite/gcc.dg/pr60439.c
index e69de29..26e7c25 100644
--- gcc/testsuite/gcc.dg/pr60439.c
+++ gcc/testsuite/gcc.dg/pr60439.c
@@ -0,0 +1,112 @@
+/* PR c/60439 */
+/* { dg-do compile } */
+
+typedef _Bool bool;
+extern _Bool foo (void);
+
+void
+f1 (const _Bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f2 (int a, int b)
+{
+  switch (a && b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch ((bool) (a && b)) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch ((a && b) || a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f3 (int a)
+{
+  switch (!!a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (!a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f4 (void)
+{
+  switch (foo ()) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f5 (int a)
+{
+  switch (a == 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a != 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a > 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a < 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a <= 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a >= 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (foo (), foo (), a >= 42) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a == 3, a & 4, a ^ 5, a)
+    case 1:
+      break;
+}
+
+void
+f6 (bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (!b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (b++) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f7 (void)
+{
+  bool b;
+  switch (b = 1) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f8 (int i)
+{
+  switch (i)
+    case 0:
+      break;
+  switch ((unsigned int) i)
+    case 0:
+      break;
+  switch ((bool) i) /* { dg-warning "switch condition has boolean value" } */
+    case 0:
+      break;
+}

Marek
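
For readers wondering why such a switch is worth a diagnostic, here is a small sketch in plain C. The function names `classify_with_switch` and `classify_with_if` are hypothetical and not part of the patch or its testcase. A comparison only ever yields 0 or 1, so any other case label is unreachable and the whole construct is usually clearer as a conditional.

```c
/* The controlling expression is a comparison, so it is always 0 or 1.
   A compiler with the warning discussed here is expected to diagnose
   the switch; the "case 2" arm can never be taken.  */
int
classify_with_switch (int a)
{
  switch (a > 3)
    {
    case 0:
      return 10;
    case 1:
      return 20;
    case 2:        /* unreachable: (a > 3) is never 2 */
      return 30;
    }
  return -1;
}

/* The same logic written as a conditional, which states the
   two-way choice directly.  */
int
classify_with_if (int a)
{
  return a > 3 ? 20 : 10;
}
```

Both functions return the same value for every input, which is the sense in which the switch form adds nothing beyond an opportunity for dead or mistaken case labels.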

Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marc Glisse


On Fri, 18 Apr 2014, Marek Polacek wrote:


This patch implements a new warning that triggers when the controlling
expression of a switch has boolean value.  (I intentionally don't
warn if the controlling expression is an (un)signed:1 bit-field.)
I guess the question is whether this should be enabled by default or
deserves a new warning option.  Since clang does the former,
I did too, so currently this warning is enabled by default.


It can be enabled by -Wsome-name which is itself enabled by default but
at least gives the possibility to use -Wno-some-name, -Werror=some-name,
etc. No? I believe Manuel insists regularly that no new warning should
use 0 (and old ones should progressively lose it).

--
Marc Glisse
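
To illustrate what a named option buys users beyond the command line, here is a sketch using an existing GCC warning, -Wunused-variable, purely as a stand-in (the option name for the new warning is not settled in this thread, and `quiet_region` is a hypothetical function). A warning that has a name can also be controlled per region with `#pragma GCC diagnostic`, which is not possible for a warning emitted with option code 0.

```c
/* Save the current diagnostic state, silence one named warning for the
   region below, then restore the previous state.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"

int
quiet_region (void)
{
  int unused;   /* would be reported under -Wall without the pragma */
  return 0;
}

#pragma GCC diagnostic pop
```

The same name is what makes -Wno-unused-variable and -Werror=unused-variable work, which is the flexibility being asked for here.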
