Re: [Google] Recompute function frequency after calculating branch probability
Ok. Thanks,
David

On Sun, Apr 7, 2013 at 8:07 PM, Dehao Chen <de...@google.com> wrote:

Hi,

This patch updates the function frequency after calculating branch
probability. This is important because a cold function could be promoted
to hot after ipa-inline.

Bootstrapped and passed gcc regression tests. Okay for google-4_7?

Thanks,
Dehao

--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2877,7 +2877,10 @@ rebuild_frequencies (void)
   else if (profile_status == PROFILE_READ)
     {
       if (flag_auto_profile)
-	afdo_calculate_branch_prob ();
+	{
+	  afdo_calculate_branch_prob ();
+	  compute_function_frequency ();
+	}
       counts_to_freqs ();
     }
   else
Re: [PATCH] Fix PR48182
On Fri, Apr 05, 2013 at 03:00:43PM -0600, Jeff Law wrote:
On 04/05/2013 02:50 PM, Jakub Jelinek wrote:
On Fri, Apr 05, 2013 at 02:42:19PM -0600, Jeff Law wrote:

? I must be missing something, the change causes an early bail out from
try_crossjump_to_edge. We don't want to raise the min to 0 as that
doesn't allow the user to turn on this specific transformation.

The condition is

  if (nmatch < PARAM_VALUE (PARAM_MIN_CROSSJUMP_INSNS))
    return false; // aka don't crossjump

So, the smaller the N in --param min-crossjump-insns=N is, the more
likely we crossjump. Thus N=0 should mean that it is most likely we
crossjump, and as N=1 requires that at least one insn matches, N=0 would
mean that even zero insns can match. If we for --param
min-crossjump-insns=0 always return false, it means we never crossjump,
so it is least likely that we crossjump, which corresponds to the
largest possible N, not the smallest one.

Yes, the smaller the N, the more likely we are to crossjump; of course
the value 0 would make no sense (I'm clearly out of practice on reviews
:-). Yea, changing the min value in params.def to 1 would be a better
way to fix it. Consider that patch pre-approved.

Ok, thanks. I'll apply this one. Regtest/bootstrap pending.

2013-04-08  Marek Polacek  <pola...@redhat.com>

	PR rtl-optimization/48182
	* params.def (PARAM_MIN_CROSSJUMP_INSNS): Increase the minimum
	value to 1.

--- gcc/params.def.mp	2013-04-08 08:38:48.515263034 +0200
+++ gcc/params.def	2013-04-08 08:39:10.444340238 +0200
@@ -433,7 +433,7 @@ DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
 DEFPARAM(PARAM_MIN_CROSSJUMP_INSNS,
	 "min-crossjump-insns",
	 "The minimum number of matching instructions to consider for crossjumping",
-	 5, 0, 0)
+	 5, 1, 0)
 
 /* The maximum number expansion factor when copying basic blocks.  */
 DEFPARAM(PARAM_MAX_GROW_COPY_BB_INSNS,

	Marek
Re: [PATCH] Fix PR48182
On Mon, Apr 08, 2013 at 08:48:22AM +0200, Marek Polacek wrote:

Yea, changing the min value in params.def to 1 would be a better way to
fix it. Consider that patch pre-approved.

Ok, thanks. I'll apply this one. Regtest/bootstrap pending.

Thanks. Also ok for 4.8.

2013-04-08  Marek Polacek  <pola...@redhat.com>

	PR rtl-optimization/48182
	* params.def (PARAM_MIN_CROSSJUMP_INSNS): Increase the minimum
	value to 1.

--- gcc/params.def.mp	2013-04-08 08:38:48.515263034 +0200
+++ gcc/params.def	2013-04-08 08:39:10.444340238 +0200
@@ -433,7 +433,7 @@ DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
 DEFPARAM(PARAM_MIN_CROSSJUMP_INSNS,
	 "min-crossjump-insns",
	 "The minimum number of matching instructions to consider for crossjumping",
-	 5, 0, 0)
+	 5, 1, 0)
 
 /* The maximum number expansion factor when copying basic blocks.  */
 DEFPARAM(PARAM_MAX_GROW_COPY_BB_INSNS,

	Jakub
Re: [Patch, fortran, 4.9] Use bool type instead gfc_try
PING (now in plain text mode so that the lists will accept the message, hopefully. $#% gmail improvements.) On Fri, Mar 22, 2013 at 8:58 AM, Janne Blomqvist blomqvist.ja...@gmail.com wrote: On Thu, Mar 21, 2013 at 11:31 PM, Janne Blomqvist blomqvist.ja...@gmail.com wrote: Updated patch which in addition does the above transformations as well. .. and here is the actual patch (thanks Bernhard!) -- Janne Blomqvist -- Janne Blomqvist
[Committed] S/390: Fix pr48335 testsuite fails
Hi,

I've committed the attached patch fixing the following testsuite fails
on s390 (-march=z196):

FAIL: gcc.dg/pr48335-2.c (internal compiler error)
FAIL: gcc.dg/pr48335-2.c (test for excess errors)
FAIL: gcc.dg/pr48335-3.c (internal compiler error)
FAIL: gcc.dg/pr48335-3.c (test for excess errors)

I've also committed the fix to the 4.8 branch since it is a regression
from 4.7.

Bye,

-Andreas-

2013-04-08  Andreas Krebbel  <andreas.kreb...@de.ibm.com>

	* config/s390/s390.c (s390_expand_insv): Only accept insertions
	within mode size.

 gcc/config/s390/s390.c |    3 +++
 1 file changed, 3 insertions(+)

Index: gcc/config/s390/s390.c
===================================================================
*** gcc/config/s390/s390.c.orig
--- gcc/config/s390/s390.c
*************** s390_expand_insv (rtx dest, rtx op1, rtx
*** 4648,4653 ****
--- 4648,4656 ----
    int smode_bsize, mode_bsize;
    rtx op, clobber;
  
+   if (bitsize + bitpos > GET_MODE_SIZE (mode))
+     return false;
+ 
    /* Generate INSERT IMMEDIATE (IILL et al).  */
    /* (set (ze (reg)) (const_int)).  */
    if (TARGET_ZARCH
[PATCH] Another ldist testcase
Hi!

I was curious whether we don't miscompile the following testcase on the
4.8 branch (-1+0i matches integer_all_onesp), but apparently we don't,
because TYPE_PRECISION on the COMPLEX_TYPE is 0. Anyway, I'd like to
check this into trunk/4.8 branch, ok?

2013-04-08  Jakub Jelinek  <ja...@redhat.com>

	* gcc.c-torture/execute/pr56837.c: New test.

--- gcc/testsuite/gcc.c-torture/execute/pr56837.c.jj	2013-02-13 21:50:57.150673158 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr56837.c	2013-04-08 10:23:44.941870778 +0200
@@ -0,0 +1,21 @@
+extern void abort (void);
+_Complex int a[1024];
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    a[i] = -1;
+}
+
+int
+main ()
+{
+  int i;
+  foo ();
+  for (i = 0; i < 1024; i++)
+    if (a[i] != -1)
+      abort ();
+  return 0;
+}

	Jakub
Re: [Patch, fortran, 4.9] Use bool type instead gfc_try
Janne Blomqvist wrote:
On Thu, Mar 21, 2013 at 11:31 PM, Janne Blomqvist
<blomqvist.ja...@gmail.com> wrote:

Updated patch which in addition does the above transformations as well.
.. and here is the actual patch (thanks Bernhard!)

http://gcc.gnu.org/ml/fortran/2013-03/msg00108.html

Thanks for the update and sorry for the delay. The patch idea as such is
okay. However, the patch isn't.

+ if (!gfc_notify_std(GFC_STD_F2003, "Noninteger exponent in an initialization expression at %L", &op2->where))

Missing space before the "("; additionally, the line is way too long.
That's actually an issue throughout the whole file. Additionally, the
reformatting caused split strings like "Noninteger exponent in ", which
is quite ugly.

If you fix those issues, and update the patch for the newly added code
(which presumably added a few FAILUREs), the patch is okay.

It is, indeed, most of the time helpful as it shortens the code without
losing its clarity. (Only at a few places I found FAILURE/SUCCESS a tad
clearer.) Thanks for the patch.

For nicer looking code, you could also do:

* Remove the trailing space for "+ return false;" (that's the only
  modified line with a trailing space).

* Change
    + if (!gfc_resolve_expr(e)
    +     || !gfc_specification_expr(e))
    +   return false;
  to
    if (!gfc_resolve_expr(e) || !gfc_specification_expr(e))
      return false;

* Ditto for:
    + if (t && b->expr1 != NULL
  and a few more.

Tobias
[PATCH] Adjust g++.dg/vect/slp-pr56812.cc
This adjusts g++.dg/vect/slp-pr56812.cc for targets that cannot handle
HW misaligned vector loads. Tested on x86_64-unknown-linux-gnu,
confirmed by Andreas that it helps ia64 / powerpc.

Richard.

2013-04-08  Richard Biener  <rguent...@suse.de>

	* g++.dg/vect/slp-pr56812.cc: Adjust.

Index: gcc/testsuite/g++.dg/vect/slp-pr56812.cc
===================================================================
--- gcc/testsuite/g++.dg/vect/slp-pr56812.cc	(revision 197480)
+++ gcc/testsuite/g++.dg/vect/slp-pr56812.cc	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_float } */
+/* { dg-require-effective-target vect_hw_misalign } */
 /* { dg-additional-options "-O3 -funroll-loops -fvect-cost-model" } */
 
 class mydata {
Re: [committed] Fix GCC bootstrap on hppa*-*-hpux* using HP cat
On Sat, 6 Apr 2013, John David Anglin wrote: The patch fixes PR other/55274 and we now generate a non empty map file. As noted in the PR, this problem causes a hang when bootstrap is done using HP cat. Tested on hppa64-hp-hpux11.11 and hppa2.0w-hp-hpux11.11. Committed to trunk and 4.8. Richard, would it be ok to apply to the 4.7 branch? This is a 4.7 regression. Sure. Thanks, Richard.
[PATCH] Adjust gfortran.dg/vect/fast-math-pr37021.f90
To require vect_double. Committed.

Richard.

2013-04-08  Richard Biener  <rguent...@suse.de>

	* gfortran.dg/vect/fast-math-pr37021.f90: Adjust.

Index: gcc/testsuite/gfortran.dg/vect/fast-math-pr37021.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/fast-math-pr37021.f90	(revision 197568)
+++ gcc/testsuite/gfortran.dg/vect/fast-math-pr37021.f90	(working copy)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-require-effective-target vect_double }
 subroutine to_product_of(self,a,b,a1,a2)
   complex(kind=8) :: self (:)
Re: [PATCH] Another ldist testcase
On Mon, 8 Apr 2013, Jakub Jelinek wrote:

Hi!

I was curious whether we don't miscompile the following testcase on the
4.8 branch (-1+0i matches integer_all_onesp), but apparently we don't,
because TYPE_PRECISION on the COMPLEX_TYPE is 0. Anyway, I'd like to
check this into trunk/4.8 branch, ok?

Ok.

Thanks,
Richard.

2013-04-08  Jakub Jelinek  <ja...@redhat.com>

	* gcc.c-torture/execute/pr56837.c: New test.

--- gcc/testsuite/gcc.c-torture/execute/pr56837.c.jj	2013-02-13 21:50:57.150673158 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr56837.c	2013-04-08 10:23:44.941870778 +0200
@@ -0,0 +1,21 @@
+extern void abort (void);
+_Complex int a[1024];
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    a[i] = -1;
+}
+
+int
+main ()
+{
+  int i;
+  foo ();
+  for (i = 0; i < 1024; i++)
+    if (a[i] != -1)
+      abort ();
+  return 0;
+}

	Jakub

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: [PATCH] MEM_REF clobber handling fixes/improvements (PR c++/34949, PR c++/50243)
On Thu, 4 Apr 2013, Jakub Jelinek wrote:

Hi!

The vt3.C testcase (from PR34949) ICEd, because sink_clobbers sunk a
MEM_REF[SSA_NAME] clobber from a landing pad which that SSA_NAME
definition dominated to an outer one which wasn't dominated by the
definition. As neither the ehcleanup nor ehdisp passes keep dominance
info current and they perform changes that invalidate it, I can't
unfortunately do cheaply a dominated_by_p check, so the patch just
throws away all MEM_REF[SSA_NAME] clobbers if SSA_NAME isn't a default
def, which is valid everywhere. As sink_clobbers is only done on
otherwise empty bbs, and typically the clobbers are preceded by some
stores which are to be DSEd if unneeded and after DSEing aren't really
needed anymore, this doesn't seem to hurt much.

The patch also improves optimize_clobbers, so that it only removes any
clobbers if the bb is actually empty (except for clobbers, resx, maybe
debug stmts or __builtin_stack_restore); that way needed clobbers are
kept around until they are used by DSE.

Also, MEM_REF[SSA_NAME] clobbers aren't useful very late in the
optimization pipeline, but could cause some SSA_NAMEs to be considered
unnecessarily live (especially if they are considered live across EH
edges it is undesirable); such clobbers are mainly useful during
DSE1/DSE2, but at expansion are completely ignored (unlike VAR_DECL
clobbers, which are also used for the stack layout decisions), so the
patch removes all those MEM_REF[SSA_NAME] clobbers shortly after dse2
(in the fab pass).

Bootstrapped/regtested on x86_64-linux and i686-linux; on libstdc++ I
saw some code size reduction with the patch. Ok for trunk?

Ok.

Thanks,
Richard.

2013-04-04  Jakub Jelinek  <ja...@redhat.com>

	PR c++/34949
	PR c++/50243
	* tree-eh.c (optimize_clobbers): Only remove clobbers if bb
	doesn't contain anything but clobbers, at most one
	__builtin_stack_restore, optionally debug stmts and final resx,
	and if it has at least one incoming EH edge.  Don't check for
	SSA_NAME on LHS of a clobber.
	(sink_clobbers): Don't check for SSA_NAME on LHS of a clobber.
	Instead of moving clobbers with MEM_REF LHS with SSA_NAME address
	which isn't default definition, remove them.
	(unsplit_eh, cleanup_empty_eh): Use single_{pred,succ}_{p,edge}
	instead of EDGE_COUNT comparisons or EDGE_{PRED,SUCC}.
	* tree-ssa-ccp.c (execute_fold_all_builtins): Remove clobbers
	with MEM_REF LHS with SSA_NAME address.

	* g++.dg/opt/vt3.C: New test.
	* g++.dg/opt/vt4.C: New test.

--- gcc/tree-eh.c.jj	2013-03-26 10:03:55.0 +0100
+++ gcc/tree-eh.c	2013-04-04 13:44:27.718982776 +0200
@@ -3230,14 +3230,48 @@ static void
 optimize_clobbers (basic_block bb)
 {
   gimple_stmt_iterator gsi = gsi_last_bb (bb);
+  bool any_clobbers = false;
+  bool seen_stack_restore = false;
+  edge_iterator ei;
+  edge e;
+
+  /* Only optimize anything if the bb contains at least one clobber,
+     ends with resx (checked by caller), optionally contains some
+     debug stmts or labels, or at most one __builtin_stack_restore
+     call, and has an incoming EH edge.  */
   for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
       if (is_gimple_debug (stmt))
	continue;
-      if (!gimple_clobber_p (stmt)
-	  || TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME)
-	return;
+      if (gimple_clobber_p (stmt))
+	{
+	  any_clobbers = true;
+	  continue;
+	}
+      if (!seen_stack_restore
+	  && gimple_call_builtin_p (stmt, BUILT_IN_STACK_RESTORE))
+	{
+	  seen_stack_restore = true;
+	  continue;
+	}
+      if (gimple_code (stmt) == GIMPLE_LABEL)
+	break;
+      return;
+    }
+  if (!any_clobbers)
+    return;
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    if (e->flags & EDGE_EH)
+      break;
+  if (e == NULL)
+    return;
+  gsi = gsi_last_bb (bb);
+  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+      if (!gimple_clobber_p (stmt))
+	continue;
       unlink_stmt_vdef (stmt);
       gsi_remove (&gsi, true);
       release_defs (stmt);
@@ -3278,8 +3312,7 @@ sink_clobbers (basic_block bb)
	continue;
       if (gimple_code (stmt) == GIMPLE_LABEL)
	break;
-      if (!gimple_clobber_p (stmt)
-	  || TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME)
+      if (!gimple_clobber_p (stmt))
	return 0;
       any_clobbers = true;
     }
@@ -3292,11 +3325,27 @@ sink_clobbers (basic_block bb)
   for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
+      tree lhs;
       if (is_gimple_debug (stmt))
	continue;
       if (gimple_code (stmt) == GIMPLE_LABEL)
	break;
       unlink_stmt_vdef (stmt);
+      lhs = gimple_assign_lhs (stmt);
+      /* Unfortunately we don't have
Re: [patch] Fix node weight updates during ipa-cp (issue7812053)
On Fri, Apr 5, 2013 at 4:18 PM, Teresa Johnson tejohn...@google.com wrote: On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener richard.guent...@gmail.com wrote: On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson tejohn...@google.com wrote: I found that the node weight updates on cloned nodes during ipa-cp were leading to incorrect/insane weights. Both the original and new node weight computations used truncating divides, leading to a loss of total node weight. I have fixed this by making both rounding integer divides. Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk? I'm sure we can outline a rounding integer divide inline function on gcov_type. To gcov-io.h, I suppose. Otherwise this looks ok to me. Thanks. I went ahead and worked on outlining this functionality. In the process of doing so, I discovered that there was already a method in basic-block.h to do part of this: apply_probability(), which does the rounding divide by REG_BR_PROB_BASE. There is a related function combine_probabilities() that takes 2 int probabilities instead of a gcov_type and an int probability. I decided to use apply_probability() in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to compute the scale factor/probability via a rounding divide. So the ipa-cp changes I made use both GCOV_COMPUTE_SCALE and apply_probability. I then went through all the code to look for instances where we were computing scale factors/probabilities and performing scaling. I found a mix of existing uses of apply/combine_probabilities, uses of RDIV, inlined rounding divides, and truncating divides. I think it would be good to unify all of this. As a first step, I replaced all inline code sequences that were already doing rounding divides to compute scale factors/probabilities or do the scaling, to instead use the appropriate helper function/macro described above. For these locations, there should be no change to behavior. 
There are a number of places where there are truncating divides right
now. Since changing those may impact the resulting behavior, for this
patch I simply added a comment as to which helper they should use. As
soon as this patch goes in I am planning to change those to use the
appropriate helper and test performance, and then will send that patch
for review.

So for this patch, the only place where behavior is changed is in ipa-cp,
which was my original change. New patch is attached. Bootstrapped (both
bootstrap and profiledbootstrap) and tested on x86-64-unknown-linux-gnu.
Ok for trunk?

Ok.

Thanks,
Richard.

Thanks,
Teresa

Thanks,
Richard.

2013-03-27  Teresa Johnson  <tejohn...@google.com>

	* ipa-cp.c (update_profiling_info): Perform rounding integer
	division when updating weights instead of truncating.
	(update_specialized_profile): Ditto.

Index: ipa-cp.c
===================================================================
--- ipa-cp.c	(revision 197118)
+++ ipa-cp.c	(working copy)
@@ -2588,14 +2588,18 @@ update_profiling_info (struct cgraph_node *orig_no
 
   for (cs = new_node->callees; cs; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
-			       / orig_node_count) / REG_BR_PROB_BASE;
+      cs->count = (cs->count
+		   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
+		      / orig_node_count)
+		   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;
 
   for (cs = orig_node->callees; cs; cs = cs->next_callee)
-    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
-			     / orig_node_count) / REG_BR_PROB_BASE;
+    cs->count = (cs->count
+		 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
+		    / orig_node_count)
+		 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
 
   if (dump_file)
     dump_profile_updates (orig_node, new_node);
@@ -2627,14 +2631,19 @@ update_specialized_profile (struct cgraph_node *ne
 
   for (cs = new_node->callees; cs; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count += cs->count * redirected_sum / new_node_count;
+      cs->count += (cs->count
+		    * ((redirected_sum * REG_BR_PROB_BASE
+			+ new_node_count/2) / new_node_count)
+		    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;
 
   for (cs = orig_node->callees; cs; cs = cs->next_callee)
     {
-      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
-				   / orig_node_count) / REG_BR_PROB_BASE;
+      gcov_type dec = (cs->count
+		       * ((redirected_sum * REG_BR_PROB_BASE
+			   + orig_node_count/2) / orig_node_count)
+		       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
       if (dec < cs->count)
	cs->count -= dec;
       else
Re: [patch tree-ssa-structalias.c]: Small finding in find_func_aliases function
On Fri, Apr 5, 2013 at 9:30 PM, Jeff Law <l...@redhat.com> wrote:
On 04/05/2013 02:29 AM, Kai Tietz wrote:

Hello,

while debugging I made the finding that in find_func_aliases, rhsop
might be used as NULL for gimple_assign_single_p items. It should use
for the gimple_assign_single_p case the rhs1 item directly as the
argument to pass to get_constraint_for_rhs.

ChangeLog

2013-04-05  Kai Tietz

	* tree-ssa-structalias.c (find_func_aliases): Special-case
	gimple_assign_single_p handling.

Ok for apply?

Yes. OK for the trunk. Do you have a testcase?

He can't, because the analysis is wrong. A GIMPLE_SINGLE_RHS statement
has exactly two operands, thus rhsop is always gimple_assign_rhs1 (). So
the patch only un-CSEs gimple_assign_rhs1 (). The is_gimple_assign ()
case can surely be re-worked to be easier to read, but the patch doesn't
improve things. Please revert it.

Thanks,
Richard.

jeff
Re: [patch tree-ssa-structalias.c]: Small finding in find_func_aliases function
I haven't even applied it. Kai
Re: [RFA] [PATCH] Minor improvement to canonicalization of COND_EXPR for gimple
On Sat, Apr 6, 2013 at 1:13 PM, Jeff Law <l...@redhat.com> wrote:

The tree combiner/forward propagator is missing opportunities to
collapse sequences like this:

  _15 = _12 ^ _14;
  if (_15 != 0)

Into:

  if (_12 != _14)

The tree combiner/forward propagator builds this tree:

  x ^ y

Then passes it to canonicalize_cond_expr_cond. That is not suitable for
the condition in a gimple COND_EXPR, so canonicalize_cond_expr_cond
returns NULL. Thus combine_cond_expr_cond decides the tree it created
isn't useful and throws it away.

This patch changes canonicalize_cond_expr_cond to rewrite x ^ y into
x != y. The net result being the tree combiner/forward propagator is
able to perform the desired simplification, eliminating the
BIT_XOR_EXPR.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. As you
can see from the testcase, these kinds of sequences show up when
compiling gcc itself. OK for the trunk?

Nice. Ok.

Thanks,
Richard.

commit 809408a4bde6dfbaf62c5bda9ab7ae6c4447d984
Author: Jeff Law <l...@redhat.com>
Date:   Sat Apr 6 05:11:17 2013 -0600

	* gimple.c (canonicalize_cond_expr_cond): Rewrite x ^ y into x != y.

	* gcc.dg/tree-ssa/forwprop-25.c: New test

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b8a6900..44797cc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-04-06  Jeff Law  <l...@redhat.com>
+
+	* gimple.c (canonicalize_cond_expr_cond): Rewrite x ^ y into
+	x != y.
+
 2013-04-03  Jeff Law  <l...@redhat.com>
 
 	* Makefile.in (lra-constraints.o): Depend on $(OPTABS_H).
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 785c2f0..cdb6f24 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -2958,7 +2958,11 @@ canonicalize_cond_expr_cond (tree t)
       t = build2 (TREE_CODE (top0), TREE_TYPE (t),
		  TREE_OPERAND (top0, 0), TREE_OPERAND (top0, 1));
     }
-
+  /* For x ^ y use x != y.  */
+  else if (TREE_CODE (t) == BIT_XOR_EXPR)
+    t = build2 (NE_EXPR, TREE_TYPE (t),
+		TREE_OPERAND (t, 0), TREE_OPERAND (t, 1));
+
   if (is_gimple_condexpr (t))
     return t;
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dc0b745..601ca66 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-04-06  Jeff Law  <l...@redhat.com>
+
+	* gcc.dg/tree-ssa/forwprop-25.c: New test
+
 2013-04-03  Jeff Law  <l...@redhat.com>
 
 	PR tree-optimization/56799
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-25.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-25.c
new file mode 100644
index 000..cf0c504
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-25.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1" } */
+
+struct rtx_def;
+typedef struct rtx_def *rtx;
+typedef const struct rtx_def *const_rtx;
+enum machine_mode
+{
+  MAX_MACHINE_MODE,
+  NUM_MACHINE_MODES = MAX_MACHINE_MODE
+};
+extern const char *const mode_name[NUM_MACHINE_MODES];
+enum mode_class
+{ MODE_RANDOM, MODE_CC, MODE_INT, MODE_PARTIAL_INT, MODE_FRACT, MODE_UFRACT,
+  MODE_ACCUM, MODE_UACCUM, MODE_FLOAT, MODE_DECIMAL_FLOAT, MODE_COMPLEX_INT,
+  MODE_COMPLEX_FLOAT, MODE_VECTOR_INT, MODE_VECTOR_FRACT,
+  MODE_VECTOR_UFRACT, MODE_VECTOR_ACCUM, MODE_VECTOR_UACCUM,
+  MODE_VECTOR_FLOAT, MAX_MODE_CLASS };
+extern const unsigned char mode_class[NUM_MACHINE_MODES];
+extern const unsigned short mode_precision[NUM_MACHINE_MODES];
+struct rtx_def
+{
+  __extension__ enum machine_mode mode:8;
+};
+void
+convert_move (rtx to, rtx from, int unsignedp)
+{
+  enum machine_mode to_mode = ((enum machine_mode) (to)->mode);
+  enum machine_mode from_mode = ((enum machine_mode) (from)->mode);
+  ((void)
+   (!((mode_precision[from_mode] != mode_precision[to_mode])
+      || ((((enum mode_class) mode_class[from_mode]) == MODE_DECIMAL_FLOAT) !=
+	  (((enum mode_class) mode_class[to_mode]) == MODE_DECIMAL_FLOAT))) ?
+    fancy_abort ("/home/gcc/virgin-gcc/gcc/expr.c", 380, __FUNCTION__),
+    0 : 0));
+}
+
+/* { dg-final { scan-tree-dump "Replaced.*!=.*with.*!=.*" "forwprop1"} } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */
[PATCH, libstdc++]: Update alpha baseline_symbols.txt
Hello! 2013-04-08 Uros Bizjak ubiz...@gmail.com * config/abi/post/alpha-linux-gnu/baseline_symbols.txt: Update. Tested on alphaev68-pc-linux-gnu. OK for mainline SVN? Uros. Index: config/abi/post/alpha-linux-gnu/baseline_symbols.txt === --- config/abi/post/alpha-linux-gnu/baseline_symbols.txt(revision 197551) +++ config/abi/post/alpha-linux-gnu/baseline_symbols.txt(working copy) @@ -543,6 +543,7 @@ FUNC:_ZNKSt17__gnu_cxx_ldbl1289money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE8__do_putES4_bRSt8ios_basewd@@GLIBCXX_LDBL_3.4 FUNC:_ZNKSt17__gnu_cxx_ldbl1289money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE9_M_insertILb0EEES4_S4_RSt8ios_basewRKSbIwS3_SaIwEE@@GLIBCXX_LDBL_3.4 FUNC:_ZNKSt17__gnu_cxx_ldbl1289money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE9_M_insertILb1EEES4_S4_RSt8ios_basewRKSbIwS3_SaIwEE@@GLIBCXX_LDBL_3.4 +FUNC:_ZNKSt17bad_function_call4whatEv@@GLIBCXX_3.4.18 FUNC:_ZNKSt18basic_stringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4 FUNC:_ZNKSt18basic_stringstreamIcSt11char_traitsIcESaIcEE5rdbufEv@@GLIBCXX_3.4 FUNC:_ZNKSt18basic_stringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4 @@ -732,6 +733,8 @@ FUNC:_ZNKSt7num_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE6do_putES3_RSt8ios_basewm@@GLIBCXX_3.4 FUNC:_ZNKSt7num_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE6do_putES3_RSt8ios_basewx@@GLIBCXX_3.4 FUNC:_ZNKSt7num_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE6do_putES3_RSt8ios_basewy@@GLIBCXX_3.4 +FUNC:_ZNKSt8__detail20_Prime_rehash_policy11_M_next_bktEm@@GLIBCXX_3.4.18 +FUNC:_ZNKSt8__detail20_Prime_rehash_policy14_M_need_rehashEmmm@@GLIBCXX_3.4.18 FUNC:_ZNKSt8bad_cast4whatEv@@GLIBCXX_3.4.9 FUNC:_ZNKSt8ios_base7failure4whatEv@@GLIBCXX_3.4 FUNC:_ZNKSt8messagesIcE18_M_convert_to_charERKSs@@GLIBCXX_3.4 @@ -1353,6 +1356,7 @@ FUNC:_ZNSt11regex_errorD0Ev@@GLIBCXX_3.4.15 FUNC:_ZNSt11regex_errorD1Ev@@GLIBCXX_3.4.15 FUNC:_ZNSt11regex_errorD2Ev@@GLIBCXX_3.4.15 
+FUNC:_ZNSt11this_thread11__sleep_forENSt6chrono8durationIlSt5ratioILl1ELl1NS1_IlS2_ILl1ELl10@@GLIBCXX_3.4.18 FUNC:_ZNSt12__basic_fileIcE2fdEv@@GLIBCXX_3.4 FUNC:_ZNSt12__basic_fileIcE4fileEv@@GLIBCXX_3.4.1 FUNC:_ZNSt12__basic_fileIcE4openEPKcSt13_Ios_Openmodei@@GLIBCXX_3.4 @@ -1635,6 +1639,11 @@ FUNC:_ZNSt13basic_ostreamIwSt11char_traitsIwEElsEt@@GLIBCXX_3.4 FUNC:_ZNSt13basic_ostreamIwSt11char_traitsIwEElsEx@@GLIBCXX_3.4 FUNC:_ZNSt13basic_ostreamIwSt11char_traitsIwEElsEy@@GLIBCXX_3.4 +FUNC:_ZNSt13random_device14_M_init_pretr1ERKSs@@GLIBCXX_3.4.18 +FUNC:_ZNSt13random_device16_M_getval_pretr1Ev@@GLIBCXX_3.4.18 +FUNC:_ZNSt13random_device7_M_finiEv@@GLIBCXX_3.4.18 +FUNC:_ZNSt13random_device7_M_initERKSs@@GLIBCXX_3.4.18 +FUNC:_ZNSt13random_device9_M_getvalEv@@GLIBCXX_3.4.18 FUNC:_ZNSt13runtime_errorC1ERKSs@@GLIBCXX_3.4 FUNC:_ZNSt13runtime_errorC2ERKSs@@GLIBCXX_3.4 FUNC:_ZNSt13runtime_errorD0Ev@@GLIBCXX_3.4 @@ -2393,14 +2402,17 @@ FUNC:_ZNVSt9__atomic011atomic_flag5clearESt12memory_order@@GLIBCXX_3.4.11 FUNC:_ZSt10unexpectedv@@GLIBCXX_3.4 FUNC:_ZSt11_Hash_bytesPKvmm@@CXXABI_1.3.5 +FUNC:_ZSt13get_terminatev@@GLIBCXX_3.4.19 FUNC:_ZSt13set_terminatePFvvE@@GLIBCXX_3.4 FUNC:_ZSt14__convert_to_vIdEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4 FUNC:_ZSt14__convert_to_vIeEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4 FUNC:_ZSt14__convert_to_vIfEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4 FUNC:_ZSt14__convert_to_vIgEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_LDBL_3.4 +FUNC:_ZSt14get_unexpectedv@@GLIBCXX_3.4.19 FUNC:_ZSt14set_unexpectedPFvvE@@GLIBCXX_3.4 FUNC:_ZSt15_Fnv_hash_bytesPKvmm@@CXXABI_1.3.5 FUNC:_ZSt15future_categoryv@@GLIBCXX_3.4.15 +FUNC:_ZSt15get_new_handlerv@@GLIBCXX_3.4.19 FUNC:_ZSt15set_new_handlerPFvvE@@GLIBCXX_3.4 FUNC:_ZSt15system_categoryv@@GLIBCXX_3.4.11 FUNC:_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@@GLIBCXX_3.4.9 @@ -2678,6 +2690,7 @@ FUNC:__cxa_guard_release@@CXXABI_1.3 
FUNC:__cxa_pure_virtual@@CXXABI_1.3 FUNC:__cxa_rethrow@@CXXABI_1.3 +FUNC:__cxa_thread_atexit@@CXXABI_1.3.7 FUNC:__cxa_throw@@CXXABI_1.3 FUNC:__cxa_tm_cleanup@@CXXABI_TM_1 FUNC:__cxa_vec_cctor@@CXXABI_1.3 @@ -2724,6 +2737,7 @@ OBJECT:0:CXXABI_1.3.4 OBJECT:0:CXXABI_1.3.5 OBJECT:0:CXXABI_1.3.6 +OBJECT:0:CXXABI_1.3.7 OBJECT:0:CXXABI_LDBL_1.3 OBJECT:0:CXXABI_TM_1 OBJECT:0:GLIBCXX_3.4 @@ -2736,6 +2750,8 @@ OBJECT:0:GLIBCXX_3.4.15 OBJECT:0:GLIBCXX_3.4.16 OBJECT:0:GLIBCXX_3.4.17 +OBJECT:0:GLIBCXX_3.4.18 +OBJECT:0:GLIBCXX_3.4.19 OBJECT:0:GLIBCXX_3.4.2 OBJECT:0:GLIBCXX_3.4.3 OBJECT:0:GLIBCXX_3.4.4
Re: C: Add new warning -Wunprototyped-calls
On Sat, Apr 6, 2013 at 11:50 PM, Andreas Schwab <sch...@linux-m68k.org> wrote:
Tobias Burnus <bur...@net-b.de> writes:

gcc.dg/Wunprototyped-calls.c:13:3: warning: call to function ‘g’ without
a real prototype [-Wunprototyped-calls]

What is a real prototype?

One reason I didn't bother to upstream that patch is language-lawyer
legalese ... We want to catch

  int foo ();
  int bar (T x) { return foo (x); }
  int foo (U) { ... }

that is, calling foo () from a context where the definition or
declaration with argument specification is not visible. This causes the
C frontend to apply varargs promotion rules to all arguments, which may
differ from the promotion rules that would be applied when a real
prototype was visible at the point of the function call. I'd just say
"without a prototype".

int foo (); or just foo (); is specified as part of 6.7.5.3/14 as "The
empty list in a function declarator that is not part of a definition of
that function specifies that no information about the number or types of
the parameters is supplied." (this appears mostly in K&R style programs
where the "T D ( identifier-list(opt) )" form is valid). I am not sure
that GCC doing varargs-style promotions for calls with only this kind of
declarator is valid, or if the program would be rejected by K&R (and
only the GCC extension of varargs functions without a first named
argument makes us do what we do ...). The patch was implemented while
hunting down miscompiles in either X or ghostscript (I don't
remember...).

Richard.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [patch] update documentation for SEQUENCE
On Sun, Apr 7, 2013 at 12:04 AM, Steven Bosscher <stevenb@gmail.com> wrote:

Hello,

The existing documentation for SEQUENCE still states it is used for
DEFINE_EXPAND sequences. I think I wasn't even hacking GCC when that
practice was abandoned, and in the mean time some other uses of SEQUENCE
have appeared in the compiler. So, a long-overdue documentation update.

OK for trunk?

Ok.

Thanks,
Richard.

Ciao!
Steven

	* doc/rtl.texi (sequence): Rewrite documentation to match the
	current use of SEQUENCE rtl objects.
	* rtl.def (SEQUENCE): Likewise.

Index: doc/rtl.texi
===================================================================
--- doc/rtl.texi	(revision 197532)
+++ doc/rtl.texi	(working copy)
@@ -3099,18 +3099,11 @@ side-effects.
 @findex sequence
 @item (sequence [@var{insns} @dots{}])
-Represents a sequence of insns.  Each of the @var{insns} that appears
-in the vector is suitable for appearing in the chain of insns, so it
-must be an @code{insn}, @code{jump_insn}, @code{call_insn},
-@code{code_label}, @code{barrier} or @code{note}.
+Represents a sequence of insns.  If a @code{sequence} appears in the
+chain of insns, then each of the @var{insns} that appears in the sequence
+must be suitable for appearing in the chain of insns, i.e. must satisfy
+the @code{INSN_P} predicate.
 
-A @code{sequence} RTX is never placed in an actual insn during RTL
-generation.  It represents the sequence of insns that result from a
-@code{define_expand} @emph{before} those insns are passed to
-@code{emit_insn} to insert them in the chain of insns.  When actually
-inserted, the individual sub-insns are separated out and the
-@code{sequence} is forgotten.
-
 After delay-slot scheduling is completed, an insn and all the insns that
 reside in its delay slots are grouped together into a @code{sequence}.
 The insn requiring the delay slot is the first insn in the vector;
@@ -3123,6 +3116,19 @@ the effect of the insns in the delay slots.  In su
 the branch and should be executed only if the branch is taken; otherwise
 the insn should be executed only if the branch is not taken.
 @xref{Delay Slots}.
+
+Some back ends also use @code{sequence} objects for purposes other than
+delay-slot groups.  This is not supported in the common parts of the
+compiler, which treat such sequences as delay-slot groups.
+
+DWARF2 Call Frame Address (CFA) adjustments are sometimes also expressed
+using @code{sequence} objects as the value of a @code{RTX_FRAME_RELATED_P}
+note.  This only happens if the CFA adjustments cannot be easily derived
+from the pattern of the instruction to which the note is attached.  In
+such cases, the value of the note is used instead of best-guessing the
+semantics of the instruction.  The back end can attach notes containing
+a @code{sequence} of @code{set} patterns that express the effect of the
+parent instruction.
 @end table
 
 These expression codes appear in place of a side effect, as the body of

Index: rtl.def
===================================================================
--- rtl.def	(revision 197533)
+++ rtl.def	(working copy)
@@ -102,10 +102,24 @@ DEF_RTL_EXPR(EXPR_LIST, "expr_list", "ee", RTX_EXT
    The insns are represented in print by their uids.  */
 DEF_RTL_EXPR(INSN_LIST, "insn_list", "ue", RTX_EXTRA)
 
-/* SEQUENCE appears in the result of a `gen_...' function
-   for a DEFINE_EXPAND that wants to make several insns.
-   Its elements are the bodies of the insns that should be made.
-   `emit_insn' takes the SEQUENCE apart and makes separate insns.  */
+/* SEQUENCE is used in late passes of the compiler to group insns for
+   one reason or another.
+
+   For example, after delay slot filling, branch instructions with filled
+   delay slots are represented as a SEQUENCE of length 1 + n_delay_slots,
+   with the branch instruction in XEXPVEC(seq, 0, 0) and the instructions
+   occupying the delay slots in the remaining XEXPVEC slots.
+
+   Another place where a SEQUENCE may appear, is in REG_FRAME_RELATED_EXPR
+   notes, to express complex operations that are not obvious from the insn
+   to which the REG_FRAME_RELATED_EXPR note is attached.  In this usage of
+   SEQUENCE, the sequence vector slots do not hold real instructions but
+   only pseudo-instructions that can be translated to DWARF CFA expressions.
+
+   Some back ends also use SEQUENCE to group insns in bundles.
+
+   Much of the compiler infrastructure is not prepared to handle SEQUENCE
+   objects.  Only passes after pass_free_cfg are expected to handle them.  */
 DEF_RTL_EXPR(SEQUENCE, "sequence", "E", RTX_EXTRA)
 
 /* Represents a non-global base address.  This is only used in alias.c.  */
Re: [RFA][PATCH] Improve VRP of COND_EXPR_CONDs
On Sat, Apr 6, 2013 at 2:48 PM, Jeff Law l...@redhat.com wrote: Given something like this: bb 6: _23 = changed_17 ^ 1; _12 = (_Bool) _23; if (_12 != 0) goto bb 10; else goto bb 7; Assume _23 and changed_17 have integer types wider than a boolean, but VRP has determined they have a range [0..1]. We should be turning that into: bb 6: _23 = changed_17 ^ 1; _12 = (_Bool) _23; if (_23 != 0) goto bb 10; else goto bb 7; Note the change in the conditional. This also makes the statement _12 = (_Bool) _23 dead which should be eliminated by DCE. This kind of thing happens regularly in GCC itself and is fixed by the attached patch. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for the trunk? commit fd82eea6f208bb12646e0e0e307fb86f043c1649 Author: Jeff Law l...@redhat.com Date: Sat Apr 6 06:46:58 2013 -0600 * tree-vrp.c (simplify_cond_using_ranges): Simplify test of boolean when the boolean was created by converting a wider object which had a boolean range. * gcc.dg/tree-ssa/vrp87.c: New test diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 44797cc..d34ecde 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,9 @@ 2013-04-06 Jeff Law l...@redhat.com + * tree-vrp.c (simplify_cond_using_ranges): Simplify test of boolean + when the boolean was created by converting a wider object which + had a boolean range. + * gimple.c (canonicalize_cond_expr_cond): Rewrite x ^ y into x != y. 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 601ca66..6ed8af2 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,5 +1,7 @@ 2013-04-06 Jeff Law l...@redhat.com + * gcc.dg/tree-ssa/vrp87.c: New test + * gcc.dg/tree-ssa/forwprop-25.c: New test 2013-04-03 Jeff Law l...@redhat.com diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c new file mode 100644 index 000..7feff81 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c @@ -0,0 +1,81 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp2-details -fdump-tree-cddce2-details" } */ + +struct bitmap_head_def; +typedef struct bitmap_head_def *bitmap; +typedef const struct bitmap_head_def *const_bitmap; + + +typedef unsigned long BITMAP_WORD; +typedef struct bitmap_element_def +{ + struct bitmap_element_def *next; + unsigned int indx; + BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))]; +} bitmap_element; + + + + + + +typedef struct bitmap_head_def +{ + bitmap_element *first; + +} bitmap_head; + + + +static __inline__ unsigned char +bitmap_elt_ior (bitmap dst, bitmap_element * dst_elt, + bitmap_element * dst_prev, const bitmap_element * a_elt, + const bitmap_element * b_elt, unsigned char changed) +{ + + if (a_elt) +{ + + if (!changed && dst_elt) + { + changed = 1; + } +} + else +{ + changed = 1; +} + return changed; +} + +unsigned char +bitmap_ior_into (bitmap a, const_bitmap b) +{ + bitmap_element *a_elt = a->first; + const bitmap_element *b_elt = b->first; + bitmap_element *a_prev = ((void *) 0); + unsigned char changed = 0; + + while (b_elt) +{ + + if (!a_elt || a_elt->indx == b_elt->indx) + changed = bitmap_elt_ior (a, a_elt, a_prev, a_elt, b_elt, changed); + else if (a_elt->indx < b_elt->indx) + changed = 1; + b_elt = b_elt->next; + + +} + + return changed; +} + +/* Verify that VRP simplified an if statement.
*/ +/* { dg-final { scan-tree-dump "Folded into: if.*" "vrp2"} } */ +/* Verify that DCE after VRP2 eliminates a dead conversion + to a (_Bool). */ +/* { dg-final { scan-tree-dump "Deleting.*_Bool.*;" "cddce2"} } */ +/* { dg-final { cleanup-tree-dump "vrp2" } } */ +/* { dg-final { cleanup-tree-dump "cddce2" } } */ + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 250a506..d76cead 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -8584,6 +8584,43 @@ simplify_cond_using_ranges (gimple stmt) } } + /* If we have a comparison of a SSA_NAME boolean against + a constant (which obviously must be [0..1]). See if the + SSA_NAME was set by a type conversion where the source + of the conversion is another SSA_NAME with a range [0..1]. + + If so, we can replace the SSA_NAME in the comparison with + the RHS of the conversion. This will often make the type + conversion dead code which DCE will clean up. */ + if (TREE_CODE (op0) == SSA_NAME + && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE Use (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE || (INTEGRAL_TYPE_P (TREE_TYPE (op0)) && TYPE_PRECISION (TREE_TYPE (op0)) == 1)) to catch some more cases. + && is_gimple_min_invariant (op1)) In
Re: [PATCH v3]IPA: fixing inline fail report caused by overwritable functions.
On Mon, Apr 8, 2013 at 4:47 AM, Zhouyi Zhou zhouzho...@gmail.com wrote: When inlining fails because the callee is overwritable, gcc will not report it in the dump file (triggered by -fdump-tree-einline) as it does for other non-inlinable cases. This patch corrects this. Regtested/bootstrapped on x86_64-linux. Can you trigger this message to show up with -Winline before/after the patch? Can you please add a testcase then? Thanks, Richard. ChangeLog: 2013-04-08 Zhouyi Zhou yizhouz...@ict.ac.cn * cif-code.def (OVERWRITABLE): Correct the comment for overwritable functions. * ipa-inline.c (can_inline_edge_p): Let the dump mechanism report the inline failure caused by overwritable functions. Index: gcc/ipa-inline.c === --- gcc/ipa-inline.c(revision 197549) +++ gcc/ipa-inline.c(working copy) @@ -266,7 +266,7 @@ can_inline_edge_p (struct cgraph_edge *e else if (avail <= AVAIL_OVERWRITABLE) { e->inline_failed = CIF_OVERWRITABLE; - return false; + inlinable = false; } else if (e->call_stmt_cannot_inline_p) { Index: gcc/cif-code.def === --- gcc/cif-code.def(revision 197549) +++ gcc/cif-code.def(working copy) @@ -48,7 +48,7 @@ DEFCIFCODE(REDEFINED_EXTERN_INLINE, /* Function is not inlinable. */ DEFCIFCODE(FUNCTION_NOT_INLINABLE, N_("function not inlinable")) -/* Function is not overwritable. */ +/* Function is overwritable. */ DEFCIFCODE(OVERWRITABLE, N_("function body can be overwritten at link time")) /* Function is not an inlining candidate. */
Re: [Patch, fortran, 4.9] Use bool type instead gfc_try
On 08/04/2013 10:34, Tobias Burnus wrote: Janne Blomqvist wrote: On Thu, Mar 21, 2013 at 11:31 PM, Janne Blomqvist blomqvist.ja...@gmail.com wrote: Updated patch which in addition does the above transformations as well. .. and here is the actual patch (thanks Bernhard!) http://gcc.gnu.org/ml/fortran/2013-03/msg00108.html Thanks for the update and sorry for the delay. The patch idea as such is okay. However, the patch isn't. [... formatting problems ...] there is also a SUCCESS_EXIT_CODE changed to true_EXIT_CODE. I think that's unintended. Mikael
Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1
On Fri, Apr 5, 2013 at 2:34 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Richard, There has been something that has bothered me about your proposal for the storage manager and i think i can now characterize that problem. Say i want to compute the expression (a + b) / c converting from tree values, using wide-int as the engine and then storing the result in a tree. (A very common operation for the various simplifiers in gcc.) In my version of wide-int, where there is only the stack allocated fixed size allocation for the data, the compiler arranges for 6 instances of wide-int that are statically allocated on the stack when the function is entered. There would be 3 copies of the precision and data to get things started, one allocation of a variable sized object at the end when the INT_CST is built and one copy to put it back. As i have argued, these copies are of negligible size. In your world, to get things started, you would do 3 pointer copies to get the values out of the tree to set the expression leaves but then you will call the allocator 3 times to get space to hold the intermediate nodes before you get to pointer copy the result back into the result cst which still needs an allocation to build it. I am assuming that we can play the same game at the tree level that we do at the rtl level where we do 1 variable sized allocation to get the entire INT_CST rather than doing 1 fixed sized allocation and 1 variable sized one. Even if we take the simpler example of a + b, you still lose. The cost of the extra allocation and its subsequent recovery is more than my copies. In fact, even in the simplest case of someone going from a HWI thru wide_int into tree, you have 2 allocations vs my 1. Just to clarify, my code wouldn't handle tree a, b, c; tree res = (a + b) / c; transparently.
The most complex form of the above that I think would be reasonable to handle would be tree a, b, c; wide_int wires = (wi (a) + b) / c; tree res = build_int_cst (TREE_TYPE (a), wires); and the code as posted would even require you to specify the return type of operator+ and operator/ explicitely like wide_int wires = (wi (a).operator+wi_embed_var (b)).operator/wi_embed_var (c); but as I said I just didn't bother to decide that the return type is always of wide_int variable-len-storage kind. Now, the only real allocation that happens is done by build_int_cst. There is one wide_int on the stack to hold the a + b result and one separate wide_int to hold wires (it's literally written in the code). There are no pointer copies involved in the end - the result from converting a tree to a wide_inttree-storage is the original 'tree' pointer itself, thus a register. I just do not see the cost savings and if there are no cost savings, you certainly cannot say that having these templates is simpler than not having the templates. I think you are missing the point - by abstracting away the storage you don't necessarily need to add the templates. But you open up a very easy route for doing so and you make the operations _trivially_ work on the tree / RTL storage with no overhead in generated code and minimal overhead in the amount of code in GCC itself. In my prototype the overhead of adding 'tree' support is to place class wi_tree_int_cst { tree cst; public: void construct (tree c) { cst = c; } const HOST_WIDE_INT *storage() const { return reinterpret_cast HOST_WIDE_INT *(TREE_INT_CST (cst)); } unsigned len() const { return 2; } }; template class wi_traits tree { public: typedef wide_int wi_tree_int_cst wi_t; wi_traits(tree t) { wi_tree_int_cst ws; ws.construct (t); w.construct (ws); } wi_t* operator-() { return w; } private: wi_t w; }; into tree.h. Richard. 
Kenny On 04/02/2013 11:04 AM, Richard Biener wrote: On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch contains a large number of the changes requested by Richi. It does not contain any of the changes that he requested to abstract the storage layer. That suggestion appears to be quite unworkable. I of course took this claim as a challenge ... with the following result. It is of course quite workable ;) The attached patch implements the core wide-int class and three storage models (fixed size for things like plain HWI and double-int, variable size similar to how your wide-int works and an adaptor for the double-int as contained in trees). With that you can now do HOST_WIDE_INT wi_test (tree x) { // template argument deduction doesn't do the magic we want it to do // to make this kind of implicit conversions work // overload resolution considers this kind of conversions so we // need some magic that combines both ... but seeding the overload // set with some instantiations doesn't seem to be possible :/ // wide_int w = x + 1; wide_int w; w += x; w += 1; // template argument
Re: Comments on the suggestion to use infinite precision math for wide int.
On Sun, Apr 7, 2013 at 7:16 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Richard, You advocate that I should be using an infinite precision representation and I advocate a finite precision representation where the precision is taken from the context. I would like to make the case for my position here, in a separate thread, because the other thread is just getting too messy. At both the tree level and the rtl level you have a type (mode is just bad rep for types) and both of those explicitly have precisions. The semantics of the programming languages that we implement define, or at least recommend, that most operations be done in a precision that is implementation dependent (or like java a particular machine independent precision). Each hardware platform specifies exactly how every operation is done. I will admit that infinite precision is more esthetically pleasing than what i have done, but exact precision matches the needs of these clients. The problem is that the results from infinite precision arithmetic differ in many significant ways from finite precision math. And the number of places where you have to inject a precision to get the expected answer, ultimately makes the infinite precision representation unattractive. As I said on Thursday, whenever you do operations that do not satisfy the requirements of a mathematical ring (add sub and mul are in a ring, divide, shift and comparisons are not) you run the risk of getting a result that is not what would have been obtained with either a strict interpretation of the semantics or the machine. Intuitively any operation that looks at the bits above the precision does not qualify as an operation that works in a ring. The poster child for operations that do not belong to a ring is division. For my example, I am using 4 bit integers because it makes the examples easy, but similar examples exist for any fixed precision. 
Consider 8 * 10 / 4 in an infinite precision world the result is 20, but in a 4 bit precision world the answer is 0. another example is to ask if -10 * 10 is less than 0? again you get a different answer with infinite precision. I would argue that if i declare a variable of type uint32 and scale my examples i have the right to expect the compiler to produce the same result as the machine would. While C and C++ may have enough wiggle room in their standards so that this is just an unexpected, but legal, result as opposed to being wrong, everyone will hate you (us) if we do this. Furthermore, Java explicitly does not allow this (not that anyone actually uses gcj). I do not know enough about go, ada and fortran to say how it would effect them. In looking at the double-int class, the only operation that does not fit in a ring that is done properly is shifting. There we explicitly pass in the precision. The reason that we rarely see this kind of problem even though double-int implements 128 bit infinite precision is that currently very little of the compiler actually uses infinite precision in a robust way. In a large number of places, the code looks like: if (TYPE_PRECISION (TREE_TYPE (...)) HOST_BITS_PER_WIDE_INT) do something using inline operators. else either do not do something or use const-double, such code clears out most of these issues before the two passes that embrace infinite precision get a chance to do much damage. However, my patch at the rtl level gets rid of most of this kind of code and replaces it with calls to wide-int that currently uses only operations within the precision. I assume that if i went down the infinite precision road at the tree level, that all of this would come to the surface very quickly. I prefer to not change my rep and not have to deal with this later. Add, subtract, multiply and the logicals are all safe. But divide, remainder, and all of the comparisons need explicit precisions. 
In addition, operations like clz, ctz and clrsb need precisions. In total about half of the functions would need a precision passed in. My point is that once you have to start passing in the precision for all of those operations, it seems to be cleaner to get the precision from the leaves of the tree as I currently do. Once you buy into the math in a particular precision world, a lot of the other issues that you raise are just settled. Asking how to extend a value beyond its precision is like asking what the universe was like before the big bang. It is just something you do not need to know. I understand that you would like to have functions like x + 1 work, and so do I. I just could not figure out how to make them have unsurprising semantics. In particular, g++ did not seem to be happy with me defining two plus operators, one for each of signed and unsigned HWIs. It seems like if someone explicitly added a wide_int and an unsigned HWI that they had a right to have the unsigned hwi not be sign
[C++ Patch] PR 56871
Hi, this seems an easy issue: we aren't allowing an explicit specialization differing from the template declaration with respect to the constexpr specifier. Tested x86_64-linux. Thanks, Paolo. // /cp 2013-04-08 Paolo Carlini paolo.carl...@oracle.com PR c++/56871 * decl.c (validate_constexpr_redeclaration): Allow an explicit specialization to be different wrt the constexpr specifier. /testsuite 2013-04-08 Paolo Carlini paolo.carl...@oracle.com PR c++/56871 * g++.dg/cpp0x/constexpr-specialization.C: New. Index: cp/decl.c === --- cp/decl.c (revision 197572) +++ cp/decl.c (working copy) @@ -1203,6 +1203,14 @@ validate_constexpr_redeclaration (tree old_decl, t = DECL_DECLARED_CONSTEXPR_P (new_decl); return true; } + /* 7.1.5 [dcl.constexpr] + Note: An explicit specialization can differ from the template + declaration with respect to the constexpr specifier. */ + if (TREE_CODE (old_decl) == FUNCTION_DECL + && TREE_CODE (new_decl) == FUNCTION_DECL + && ! DECL_TEMPLATE_SPECIALIZATION (old_decl) + && DECL_TEMPLATE_SPECIALIZATION (new_decl)) +return true; error ("redeclaration %qD differs in %<constexpr%>", new_decl); error ("from previous declaration %q+D", old_decl); return false; Index: testsuite/g++.dg/cpp0x/constexpr-specialization.C === --- testsuite/g++.dg/cpp0x/constexpr-specialization.C (revision 0) +++ testsuite/g++.dg/cpp0x/constexpr-specialization.C (working copy) @@ -0,0 +1,12 @@ +// PR c++/56871 +// { dg-options "-std=c++11" } + +template<typename T> constexpr int foo(T); +template<> int foo(int); +template<> int foo(int);// { dg-error "previous" } +template<> constexpr int foo(int); // { dg-error "redeclaration" } + +template<typename T> int bar(T); +template<> constexpr int bar(int); +template<> constexpr int bar(int); // { dg-error "previous" } +template<> int bar(int);// { dg-error "redeclaration" }
Re: [patch, AVR] Add new ATmega*RFR* devices
As Georg-Johann Lay wrote: Joerg Wunsch wrote: The attached patch adds the new ATmega*RFR* devices to AVR-GCC. [...] Supply the auto generated files, too. Cf. t-avr, avr-mcus.def etc. OK, thanks for the reminder. Here is the updated patch. -- Joerg Wunsch * Development engineer, Dresden, Germany Atmel Automotive GmbH, Theresienstrasse 2, D-74027 Heilbronn Geschaeftsfuehrung: Steven A. Laub, Stephen Cumming Amtsgericht Stuttgart, Registration HRB 106594 ChangeLog entry: 2013-04-08 Joerg Wunsch joerg.wun...@atmel.com * gcc/config/avr/avr-mcus.def: Add ATmega644RFR2, ATmega128RFR2, ATmega1284RFR2, ATmega256RFR2, ATmega2564RFR2; remove non-existent ATmega64RFA2 * gcc/doc/avr-mmcu.texi: Regenerate. * gcc/config/avr/avr-tables.opt: Regenerate. * gcc/config/avr/t-multilib: Regenerate. Index: gcc/config/avr/avr-mcus.def === --- gcc/config/avr/avr-mcus.def (Revision 197562) +++ gcc/config/avr/avr-mcus.def (Arbeitskopie) @@ -229,8 +229,8 @@ AVR_MCU (atmega64c1, ARCH_AVR5, __AVR_ATmega64C1__,0, 0, 0x0100, 1, m64c1) AVR_MCU (atmega64m1, ARCH_AVR5, __AVR_ATmega64M1__,0, 0, 0x0100, 1, m64m1) AVR_MCU (atmega64hve, ARCH_AVR5, __AVR_ATmega64HVE__, 0, 0, 0x0100, 1, m64hve) -AVR_MCU (atmega64rfa2, ARCH_AVR5, __AVR_ATmega64RFA2__, 0, 0, 0x0200, 1, m64rfa2) AVR_MCU (atmega64rfr2, ARCH_AVR5, __AVR_ATmega64RFR2__, 0, 0, 0x0200, 1, m64rfr2) +AVR_MCU (atmega644rfr2,ARCH_AVR5, __AVR_ATmega644RFR2__, 0, 0, 0x0200, 1, m644rfr2) AVR_MCU (atmega32hvb, ARCH_AVR5, __AVR_ATmega32HVB__, 0, 0, 0x0100, 1, m32hvb) AVR_MCU (atmega32hvbrevb, ARCH_AVR5, __AVR_ATmega32HVBREVB__, 0, 0, 0x0100, 1, m32hvbrevb) AVR_MCU (atmega16hva2, ARCH_AVR5, __AVR_ATmega16HVA2__, 0, 0, 0x0100, 1, m16hva2) @@ -262,6 +262,8 @@ AVR_MCU (atmega1284, ARCH_AVR51, __AVR_ATmega1284__, 0, 0, 0x0100, 2, m1284) AVR_MCU (atmega1284p, ARCH_AVR51, __AVR_ATmega1284P__, 0, 0, 0x0100, 2, m1284p) AVR_MCU (atmega128rfa1,ARCH_AVR51, __AVR_ATmega128RFA1__,0, 0, 0x0200, 2, m128rfa1) +AVR_MCU (atmega128rfr2,ARCH_AVR51, __AVR_ATmega128RFR2__,0, 
0, 0x0200, 2, m128rfr2) +AVR_MCU (atmega1284rfr2, ARCH_AVR51, __AVR_ATmega1284RFR2__, 0, 0, 0x0200, 2, m1284rfr2) AVR_MCU (at90can128, ARCH_AVR51, __AVR_AT90CAN128__, 0, 0, 0x0100, 2, can128) AVR_MCU (at90usb1286, ARCH_AVR51, __AVR_AT90USB1286__, 0, 0, 0x0100, 2, usb1286) AVR_MCU (at90usb1287, ARCH_AVR51, __AVR_AT90USB1287__, 0, 0, 0x0100, 2, usb1287) @@ -269,6 +271,8 @@ AVR_MCU (avr6, ARCH_AVR6, NULL,0, 0, 0x0200, 4, m2561) AVR_MCU (atmega2560, ARCH_AVR6, __AVR_ATmega2560__,0, 0, 0x0200, 4, m2560) AVR_MCU (atmega2561, ARCH_AVR6, __AVR_ATmega2561__,0, 0, 0x0200, 4, m2561) +AVR_MCU (atmega256rfr2,ARCH_AVR6, __AVR_ATmega256RFR2__, 0, 0, 0x0200, 4, m256rfr2) +AVR_MCU (atmega2564rfr2, ARCH_AVR6, __AVR_ATmega2564RFR2__,0, 0, 0x0200, 4, m2564rfr2) /* Xmega, 16K = Flash 64K, RAM = 64K */ AVR_MCU (avrxmega2,ARCH_AVRXMEGA2, NULL, 0, 0, 0x2000, 1, x32a4) AVR_MCU (atxmega16a4, ARCH_AVRXMEGA2, __AVR_ATxmega16A4__, 0, 0, 0x2000, 1, x16a4) Index: gcc/doc/avr-mmcu.texi === --- gcc/doc/avr-mmcu.texi (Revision 197562) +++ gcc/doc/avr-mmcu.texi (Arbeitskopie) @@ -38,15 +38,15 @@ @item avr5 ``Enhanced'' devices with 16@tie{}KiB up to 64@tie{}KiB of program memory. 
-@*@var{mcu}@tie{}= @code{ata5790}, @code{ata5790n}, @code{ata5795}, @code{atmega16}, @code{atmega16a}, @code{atmega16hva}, @code{atmega16hva}, @code{atmega16hva2}, @code{atmega16hva2}, @code{atmega16hvb}, @code{atmega16hvb}, @code{atmega16hvbrevb}, @code{atmega16m1}, @code{atmega16m1}, @code{atmega16u4}, @code{atmega16u4}, @code{atmega161}, @code{atmega162}, @code{atmega163}, @code{atmega164a}, @code{atmega164p}, @code{atmega164pa}, @code{atmega165}, @code{atmega165a}, @code{atmega165p}, @code{atmega165pa}, @code{atmega168}, @code{atmega168a}, @code{atmega168p}, @code{atmega168pa}, @code{atmega169}, @code{atmega169a}, @code{atmega169p}, @code{atmega169pa}, @code{atmega26hvg}, @code{atmega32}, @code{atmega32a}, @code{atmega32a}, @code{atmega32c1}, @code{atmega32c1}, @code{atmega32hvb}, @code{atmega32hvb}, @code{atmega32hvbrevb}, @code{atmega32m1}, @code{atmega32m1}, @code{atmega32u4}, @code{atmega32u4}, @code{atmega32u6}, @code{atmega32u6}, @code{atmega323}, @code{atmega324a}, @code{atmega324p}, @code{atmega324pa}, @code{atmega325}, @code{atmega325a}, @code{atmega325p}, @code{atmega3250}, @code{atmega3250a}, @code{atmega3250p}, @code{atmega3250pa}, @code{atmega328},
[PATCH] Fix PR48762
This patch prevents two "Invalid read of size 8" and one "Invalid write of size 8" warnings when cc1 is run under valgrind. What happens here is that we firstly allocate 0B ebb_data.path = XNEWVEC (struct branch_path, PARAM_VALUE (PARAM_MAX_CSE_PATH_LENGTH)); (in fact, XNEWVEC always allocates at least 1B--but still it's not enough), then in cse_find_path we have (path_size is 0) if (path_size == 0) data->path[path_size++].bb = first_bb; so we immediately have the invalid write, and moreover path_size increments, thus we call cse_find_path again, then we get the invalid reads. So fixed by guarding the write with PARAM_MAX_CSE_PATH_LENGTH > 0. Alternatively, we can bump the minimum of that param, as usual ;) Bootstrapped/regtested on x86_64-linux, ok for trunk/4.8? 2013-04-08 Marek Polacek pola...@redhat.com PR tree-optimization/48762 * cse.c (cse_find_path): Require PARAM_MAX_CSE_PATH_LENGTH to be > 0. --- gcc/cse.c.mp2013-04-08 13:19:15.082670099 +0200 +++ gcc/cse.c 2013-04-08 13:19:29.014713914 +0200 @@ -6166,7 +6166,7 @@ cse_find_path (basic_block first_bb, str } /* If the path was empty from the beginning, construct a new path. */ - if (path_size == 0) + if (path_size == 0 && PARAM_VALUE (PARAM_MAX_CSE_PATH_LENGTH) > 0) data->path[path_size++].bb = first_bb; else { Marek
[build] Use -z ignore instead of --as-needed on Solaris
While the Solaris linker doesn't support the --as-needed/--no-as-needed options (yet), it long has provided the equivalent -z ignore/-z record options. This patch makes use of them, avoiding unnecessary dependencies on libgcc_s.so.1. Bootstrapped without regressions on i386-pc-solaris2.11 (and checking that many dependencies on libgcc_s.so.1 in runtime libraries are gone that were flagged as unused by ldd -u) and x86_64-unknown-linux-gnu (gcc/specs unchanged, make check still running). Ok for mainline if it passes? Thanks. Rainer 2013-04-05 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.ac (gcc_cv_ld_as_needed): Set gcc_cv_ld_as_needed_option, gcc_cv_no_as_needed_option. Use -z ignore, -z record on *-*-solaris2*. (HAVE_LD_AS_NEEDED): Update comment. (LD_AS_NEEDED_OPTION, LD_NO_AS_NEEDED_OPTION): Define. * configure: Regenerate. * config.in: Regenerate. * gcc.c (init_gcc_specs) [USE_LD_AS_NEEDED]: Use LD_AS_NEEDED_OPTION, LD_NO_AS_NEEDED_OPTION. * config/sol2.h [HAVE_LD_AS_NEEDED] (USE_LD_AS_NEEDED): Define. * doc/tm.texi.in (USE_LD_AS_NEEDED): Allow for --as-needed equivalents. Fix markup. * doc/tm.texi: Regenerate. # HG changeset patch # Parent 602ad5b6c5e29819082e386836c33220c78ae4b7 Use -z ignore instead of --as-needed on Solaris diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h --- a/gcc/config/sol2.h +++ b/gcc/config/sol2.h @@ -181,6 +181,11 @@ along with GCC; see the file COPYING3. %(link_arch) \ %{Qy:} %{!Qn:-Qy} +/* Use --as-needed/-z ignore -lgcc_s for eh support. */ +#ifdef HAVE_LD_AS_NEEDED +#define USE_LD_AS_NEEDED 1 +#endif + #ifdef USE_GLD /* Solaris 11 build 135+ implements dl_iterate_phdr. GNU ld needs --eh-frame-hdr to create the required .eh_frame_hdr sections. 
*/ diff --git a/gcc/configure.ac b/gcc/configure.ac --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -4538,6 +4538,8 @@ AC_MSG_RESULT($gcc_cv_ld_eh_gc_sections_ AC_CACHE_CHECK(linker --as-needed support, gcc_cv_ld_as_needed, [gcc_cv_ld_as_needed=no +gcc_cv_ld_as_needed_option='--as-needed' +gcc_cv_ld_no_as_needed_option='--no-as-needed' if test $in_tree_ld = yes ; then if test $gcc_cv_gld_major_version -eq 2 -a $gcc_cv_gld_minor_version -ge 16 -o $gcc_cv_gld_major_version -gt 2 \ && test $in_tree_ld_is_elf = yes; then @@ -4547,12 +4549,25 @@ elif test x$gcc_cv_ld != x; then # Check if linker supports --as-needed and --no-as-needed options if $gcc_cv_ld --help 2>/dev/null | grep as-needed > /dev/null; then gcc_cv_ld_as_needed=yes + else + case $target in + # Solaris 2 ld always supports -z ignore/-z record. + *-*-solaris2*) + gcc_cv_ld_as_needed=yes + gcc_cv_ld_as_needed_option="-z ignore" + gcc_cv_ld_no_as_needed_option="-z record" + ;; + esac fi fi ]) if test x$gcc_cv_ld_as_needed = xyes; then AC_DEFINE(HAVE_LD_AS_NEEDED, 1, -[Define if your linker supports --as-needed and --no-as-needed options.]) +[Define if your linker supports --as-needed/--no-as-needed or equivalent options.]) + AC_DEFINE_UNQUOTED(LD_AS_NEEDED_OPTION, "$gcc_cv_ld_as_needed_option", +[Define to the linker option to ignore unused dependencies.]) + AC_DEFINE_UNQUOTED(LD_NO_AS_NEEDED_OPTION, "$gcc_cv_ld_no_as_needed_option", +[Define to the linker option to keep unused dependencies.]) fi case $target:$tm_file in diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -260,7 +260,8 @@ line, but, unlike @code{LIBGCC_SPEC}, it @defmac USE_LD_AS_NEEDED A macro that controls the modifications to @code{LIBGCC_SPEC} mentioned in @code{REAL_LIBGCC_SPEC}.
If nonzero, a spec will be -generated that uses --as-needed and the shared libgcc in place of the +generated that uses @option{--as-needed} or equivalent options and the +shared @file{libgcc} in place of the static exception handler library, when linking without any of @code{-static}, @code{-static-libgcc}, or @code{-shared-libgcc}. @end defmac diff --git a/gcc/gcc.c b/gcc/gcc.c --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -1361,7 +1361,8 @@ init_gcc_specs (struct obstack *obstack, %{!static:%{!static-libgcc: #if USE_LD_AS_NEEDED %{!shared-libgcc:, - static_name, --as-needed , shared_name, --no-as-needed + static_name, LD_AS_NEEDED_OPTION , + shared_name, LD_NO_AS_NEEDED_OPTION } %{shared-libgcc:, shared_name, %{!shared: , static_name, } -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS
ChangeLog 2013-04-02 Xinyu Qi x...@marvell.com PR target/54338 * config/arm/arm.h (REG_CLASS_CONTENTS): Include IWMMXT_GR_REGS in ALL_REGS. Thanks, Xinyu Thanks, now applied to trunk. For the future, please consider creating patches at the top level directory. It makes it easier for application by someone else :). regards Ramana
[PATCH] Assorted dump/debug fixes for the vectorizer
The following avoids the excessive verboseness of get_vectype_* and leaves better traces of the original stmt in the vectorizer temporary names by preserving their SSA name version. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard.

2013-04-08  Richard Biener  rguent...@suse.de

	* gimple-pretty-print.c (debug_gimple_stmt): Do not print extra newline.
	* tree-vect-loop.c (vect_determine_vectorization_factor): Dump
	determined vector type.
	(vect_analyze_data_refs): Likewise.
	(vect_get_new_vect_var): Adjust.
	(vect_create_destination_var): Preserve SSA name versions.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Do not
	dump anything here.
	* gfortran.dg/vect/fast-math-mgrid-resid.f: Adjust.

Index: gcc/gimple-pretty-print.c
===================================================================
--- gcc/gimple-pretty-print.c	(revision 197486)
+++ gcc/gimple-pretty-print.c	(working copy)
@@ -84,7 +84,6 @@ DEBUG_FUNCTION void
 debug_gimple_stmt (gimple gs)
 {
   print_gimple_stmt (stderr, gs, 0, TDF_VOPS|TDF_MEMSYMS);
-  fprintf (stderr, "\n");
 }

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	(revision 197486)
+++ gcc/tree-vect-loop.c	(working copy)
@@ -409,6 +409,12 @@ vect_determine_vectorization_factor (loo
 	}

       STMT_VINFO_VECTYPE (stmt_info) = vectype;
+
+      if (dump_enabled_p ())
+	{
+	  dump_printf_loc (MSG_NOTE, vect_location, "vectype: ");
+	  dump_generic_expr (MSG_NOTE, TDF_SLIM, vectype);
+	}
     }

   /* The vectorization factor is according to the smallest

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	(revision 197486)
+++ gcc/tree-vect-data-refs.c	(working copy)
@@ -3206,6 +3206,17 @@ vect_analyze_data_refs (loop_vec_info lo
	    }
	  return false;
	}
+      else
+	{
+	  if (dump_enabled_p ())
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "got vectype for stmt: ");
+	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+	      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+				 STMT_VINFO_VECTYPE (stmt_info));
+	    }
+	}

       /* Adjust the minimal vectorization factor according to the
	  vector type.  */
@@ -3293,13 +3304,13 @@ vect_get_new_vect_var (tree type, enum v
   switch (var_kind)
   {
   case vect_simple_var:
-    prefix = "vect_";
+    prefix = "vect";
     break;
   case vect_scalar_var:
-    prefix = "stmp_";
+    prefix = "stmp";
     break;
   case vect_pointer_var:
-    prefix = "vect_p";
+    prefix = "vectp";
     break;
   default:
     gcc_unreachable ();
@@ -3307,7 +3318,7 @@ vect_get_new_vect_var (tree type, enum v

   if (name)
     {
-      char* tmp = concat (prefix, name, NULL);
+      char* tmp = concat (prefix, "_", name, NULL);
       new_vect_var = create_tmp_reg (type, tmp);
       free (tmp);
     }
@@ -3836,7 +3847,8 @@ tree
 vect_create_destination_var (tree scalar_dest, tree vectype)
 {
   tree vec_dest;
-  const char *new_name;
+  const char *name;
+  char *new_name;
   tree type;
   enum vect_var_kind kind;
@@ -3845,10 +3857,13 @@ vect_create_destination_var (tree scalar

   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);

-  new_name = get_name (scalar_dest);
-  if (!new_name)
-    new_name = "var_";
+  name = get_name (scalar_dest);
+  if (name)
+    asprintf (&new_name, "%s_%u", name, SSA_NAME_VERSION (scalar_dest));
+  else
+    asprintf (&new_name, "_%u", SSA_NAME_VERSION (scalar_dest));
   vec_dest = vect_get_new_vect_var (type, kind, new_name);
+  free (new_name);

   return vec_dest;
 }

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	(revision 197486)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -6094,30 +6094,10 @@ get_vectype_for_scalar_type_and_size (tr
     return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);
-  if (dump_enabled_p ())
-    {
-      dump_printf_loc (MSG_NOTE, vect_location,
-		       "get vectype with %d units of type ", nunits);
-      dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type);
-    }
-
-  if (!vectype)
-    return NULL_TREE;
-
-  if (dump_enabled_p ())
-    {
-      dump_printf_loc (MSG_NOTE, vect_location, "vectype: ");
-      dump_generic_expr (MSG_NOTE, TDF_SLIM, vectype);
-    }

   if (!VECTOR_MODE_P (TYPE_MODE (vectype))
       && !INTEGRAL_MODE_P (TYPE_MODE (vectype)))
-    {
-      if (dump_enabled_p ())
-	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-			 "mode not supported by target.");
-      return NULL_TREE;
-    }
+    return
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 06:46 AM, Richard Biener wrote: On Sun, Apr 7, 2013 at 7:16 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Richard, You advocate that I should be using an infinite precision representation and I advocate a finite precision representation where the precision is taken from the context. I would like to make the case for my position here, in a separate thread, because the other thread is just getting too messy. At both the tree level and the rtl level you have a type (mode is just a bad rep for types) and both of those explicitly have precisions. The semantics of the programming languages that we implement define, or at least recommend, that most operations be done in a precision that is implementation dependent (or, like Java, a particular machine-independent precision). Each hardware platform specifies exactly how every operation is done. I will admit that infinite precision is more esthetically pleasing than what I have done, but exact precision matches the needs of these clients. The problem is that the results from infinite precision arithmetic differ in many significant ways from finite precision math, and the number of places where you have to inject a precision to get the expected answer ultimately makes the infinite precision representation unattractive. As I said on Thursday, whenever you do operations that do not satisfy the requirements of a mathematical ring (add, sub and mul are in a ring; divide, shift and comparisons are not) you run the risk of getting a result that is not what would have been obtained with either a strict interpretation of the semantics or the machine. Intuitively, any operation that looks at the bits above the precision does not qualify as an operation that works in a ring. The poster child for operations that do not belong to a ring is division. For my example, I am using 4-bit integers because it makes the examples easy, but similar examples exist for any fixed precision.
Consider 8 * 10 / 4: in an infinite precision world the result is 20, but in a 4-bit precision world the answer is 0. Another example is to ask whether -10 * 10 is less than 0; again you get a different answer with infinite precision. I would argue that if I declare a variable of type uint32 and scale my examples, I have the right to expect the compiler to produce the same result as the machine would. While C and C++ may have enough wiggle room in their standards so that this is just an unexpected, but legal, result as opposed to being wrong, everyone will hate you (us) if we do this. Furthermore, Java explicitly does not allow this (not that anyone actually uses gcj). I do not know enough about go, ada and fortran to say how it would affect them. In looking at the double-int class, the only operation that does not fit in a ring that is done properly is shifting. There we explicitly pass in the precision. The reason that we rarely see this kind of problem even though double-int implements 128-bit infinite precision is that currently very little of the compiler actually uses infinite precision in a robust way. In a large number of places, the code looks like: "if (TYPE_PRECISION (TREE_TYPE (...)) <= HOST_BITS_PER_WIDE_INT) do something using inline operators; else either do not do something or use const-double". Such code clears out most of these issues before the two passes that embrace infinite precision get a chance to do much damage. However, my patch at the rtl level gets rid of most of this kind of code and replaces it with calls to wide-int that currently uses only operations within the precision. I assume that if I went down the infinite precision road at the tree level, all of this would come to the surface very quickly. I prefer not to change my rep and not have to deal with this later. Add, subtract, multiply and the logicals are all safe. But divide, remainder, and all of the comparisons need explicit precisions.
In addition, operations like clz, ctz and clrsb need precisions. In total, about half of the functions would need a precision passed in. My point is that once you have to start passing in the precision for all of those operations, it seems cleaner to get the precision from the leaves of the tree as I currently do. Once you buy into the math-in-a-particular-precision world, a lot of the other issues that you raise are just settled. Asking how to extend a value beyond its precision is like asking what the universe was like before the big bang. It is just something you do not need to know. I understand that you would like to have functions like x + 1 work, and so do I. I just could not figure out how to make them have unsurprising semantics. In particular, g++ did not seem to be happy with me defining two plus operators, one for each of signed and unsigned HWIs. It seems like if someone explicitly added a wide_int and an unsigned HWI that they had a right to have the unsigned HWI not be sign extended. But if you can show
[PATCH][ARM] Improve code generation for anddi3
Hi all,

When compiling:

unsigned long long muld (unsigned long long X, unsigned long long Y)
{
  unsigned long long mask = 0xffffffffull;
  return (X & mask) * (Y & mask);
}

we get a suboptimal sequence:

	stmfd	sp!, {r4, r5}
	mvn	r4, #0
	mov	r5, #0
	and	r0, r0, r4
	and	r3, r3, r5
	and	r1, r1, r5
	and	r2, r2, r4
	mul	r3, r0, r3
	mla	r3, r2, r1, r3
	umull	r0, r1, r0, r2
	ldmfd	sp!, {r4, r5}
	add	r1, r3, r1
	bx	lr

This patch improves that situation by changing the anddi3 insn into an insn_and_split and simplifying the SImode ands. Also, the NEON version is merged with the non-NEON one. This allows us to generate just:

	umull	r0, r1, r2, r0
	bx	lr

for the above code.

Regtested arm-none-eabi on qemu. Ok for trunk?

Thanks,
Kyrill

gcc/ChangeLog
2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

	* config/arm/arm.c (const_ok_for_dimode_op): Handle AND case.
	* config/arm/arm.md (*anddi3_insn): Change to insn_and_split.
	* config/arm/constraints.md (De): New constraint.
	* config/arm/neon.md (anddi3_neon): Delete.
	(neon_vand<mode>): Expand to standard anddi3 pattern.
	* config/arm/predicates.md (imm_for_neon_inv_logic_operand):
	Move earlier in the file.
	(neon_inv_logic_op2): Likewise.
	(arm_anddi_operand_neon): New predicate.

gcc/testsuite/ChangeLog
2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

	* gcc.target/arm/anddi3-opt.c: New test.
	* gcc.target/arm/anddi3-opt2.c: Likewise.

anddi3_new.patch
Description: Binary data
Re: Comments on the suggestion to use infinite precision math for wide int.
It may be interesting to look at what we have done in Ada with regard to overflow in intermediate expressions. Briefly, we allow specification of three modes:

- all intermediate arithmetic is done in the base type, with overflow signalled if an intermediate value is outside this range;

- all intermediate arithmetic is done in the widest integer type, with overflow signalled if an intermediate value is outside this range;

- all intermediate arithmetic uses an infinite precision arithmetic package built for this purpose.

In the second and third cases we do range analysis that allows smaller intermediate precision if we know it's safe. We also allow separate specification of the mode inside and outside assertions (e.g. preconditions and postconditions), since in the latter you often want to regard integers as mathematical, not subject to intermediate overflow.
Re: C: Add new warning -Wunprototyped-calls
Richard Biener richard.guent...@gmail.com writes: "when a real prototype was visible" How is that different from a prototype?

Andreas.
--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: C: Add new warning -Wunprototyped-calls
On Mon, Apr 8, 2013 at 3:05 PM, Andreas Schwab sch...@linux-m68k.org wrote: Richard Biener richard.guent...@gmail.com writes: "when a real prototype was visible" How is that different from a prototype? It's different from the case where a K&R definition was seen and thus type information is present via that mechanism. We don't want to warn in that case. As I suggested, the warning should just print "without a prototype", but "prototype" here means that a definition before the call is enough to make us happy (as opposed to -Wstrict-prototypes, which warns about function definitions without a previous prototype; we want to warn about calls to functions without a definition or a prototype). Any better suggestion? Richard.

Andreas.
--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 04:56 AM, Florian Weimer wrote: On 04/07/2013 07:16 PM, Kenneth Zadeck wrote: The poster child for operations that do not belong to a ring is division. For my example, I am using 4 bit integers because it makes the examples easy, but similar examples exist for any fixed precision. Consider 8 * 10 / 4 in an infinite precision world the result is 20, but in a 4 bit precision world the answer is 0. I think you mean 4 instead of 20. oops another example is to ask if -10 * 10 is less than 0? again you get a different answer with infinite precision. Actually, for C/C++, you don't, because of undefined signed overflow (at least with default compiler flags). But similar examples with unsigned types exist, so this point isn't too relevant. I would argue that if i declare a variable of type uint32 and scale my examples i have the right to expect the compiler to produce the same result as the machine would. In my very, very limited experience, the signed/unsigned mismatch is more confusing. With infinite precision, this confusion would not arise (but adjustment would be needed to get limited-precision results, as you write). With finite precision, you either need separate types for signed/unsigned, or separate operations. I come from a world where people write code where they expect full control of the horizontal and the vertical when they program. Hank Warren, the author of Hacker's Delight, is in my group, and a huge number of those tricks require understanding what is going on in the machine. If the compiler decides that it wants to do things differently, you are dead. While C and C++ may have enough wiggle room in their standards so that this is just an unexpected, but legal, result as opposed to being wrong, everyone will hate you (us) if we do this. Furthermore, Java explicitly does not allow this (not that anyone actually uses gcj).
I do not know enough about go, Go specifies two's-complement signed arithmetic and does not automatically promote to int (i.e., it performs arithmetic in the type, and mixed arguments are not supported). Go constant arithmetic is infinite precision. ada and fortran to say how it would affect them. Ada requires trapping arithmetic for signed integers. Currently, this is implemented in the front end. Arithmetic happens in the base range of a type (which is symmetric around zero and chosen to correspond to a machine type). Ada allows omitting intermediate overflow checks as long as you produce the infinite precision result (or raise an overflow exception). I think this applies to Ada constant arithmetic as well. (GNAT has a mode where comparisons are computed with infinite precision, which is extremely useful for writing bounds checking code.) Considering the range of different arithmetic operations we need to support, I'm not convinced that the ring model is appropriate. I will answer this in Robert's email.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 9:15 AM, Kenneth Zadeck wrote: I think this applies to Ada constant arithmetic as well. Ada constant arithmetic (at compile time) is always infinite precision (for float as well as for integer).
Re: RFC: color diagnostics markers
On Fri, Apr 05, 2013 at 11:51:43PM +0200, Manuel López-Ibáñez wrote: In this patch the default is never, because for some reason auto triggers colorization during regression testing. I have not found a

That reason is obvious, dejagnu (expect?) creates pseudo terminals, so isatty is true; we'd need to just use -fno-diagnostics-color by default for the testsuite (IMHO not a big deal). Anyway, I've kept the default as never for now, but am sending my review comments in the form of a new diff, which fixes formatting, avoids memory leaks and changes it to introduce more color names (for caret, locus, quoted text) and change the default note color (for some color compatibility with clang: bold green is there used for caret lines; for notes they use bold black apparently, but that doesn't work too well on white-on-black terminals).

Right now the patch is unfinished, because there is no support for the new "%[locus]%s:%d:%d%[]" style diagnostic strings (where %[locus] and %[] stand for switching to the locus color and resetting the color back) in the -Wformat code (and gettext). I'm wondering if, instead of the %[colorname] and %[], it wouldn't be better to just have some %r or whatever letter isn't taken yet which would consume a const char * color name from va_arg, and some other letter with no argument that would do the color reset. Ideas for the best unused letters for that? Perhaps then -Wformat support for it would be easier. I.e. instead of:

  pp_printf ("%[locus]%s:%d:%d%[]", loc.file, loc.line, loc.column);

one would write:

  pp_printf ("%r%s:%d:%d%R", "locus", loc.file, loc.line, loc.column);

	Jakub

--- gcc/opts.c.jj	2013-03-05 07:00:46.847494476 +0100
+++ gcc/opts.c	2013-04-08 14:29:20.592412422 +0200
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
 #include "flags.h"
 #include "params.h"
 #include "diagnostic.h"
+#include "diagnostic-color.h"
 #include "opts-diagnostic.h"
 #include "insn-attr-common.h"
 #include "common/common-target.h"
@@ -1497,6 +1498,11 @@ common_handle_option (struct gcc_options
       dc->show_caret = value;
       break;

+    case OPT_fdiagnostics_color_:
+      pp_show_color (dc->printer)
+	= colorize_init ((diagnostic_color_rule_t) value);
+      break;
+
     case OPT_fdiagnostics_show_option:
       dc->show_option_requested = value;
       break;
--- gcc/Makefile.in.jj	2013-04-04 15:03:29.285380160 +0200
+++ gcc/Makefile.in	2013-04-08 14:44:47.076155748 +0200
@@ -1465,7 +1465,7 @@ OBJS = \

 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o

 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
@@ -2668,11 +2668,12 @@ fold-const.o : fold-const.c $(CONFIG_H)
 	$(GIMPLE_H) realmpfr.h $(TREE_FLOW_H)
 diagnostic.o : diagnostic.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 	version.h $(DEMANGLE_H) $(INPUT_H) intl.h $(BACKTRACE_H) $(DIAGNOSTIC_H) \
-	diagnostic.def
+	diagnostic.def diagnostic-color.h
+diagnostic-color.o : diagnostic-color.c $(CONFIG_H) $(SYSTEM_H) diagnostic-color.h
 opts.o : opts.c $(OPTS_H) $(OPTIONS_H) $(DIAGNOSTIC_CORE_H) $(CONFIG_H) $(SYSTEM_H) \
 	coretypes.h $(DUMPFILE_H) $(TM_H) \
 	$(DIAGNOSTIC_H) insn-attr-common.h intl.h $(COMMON_TARGET_H) \
-	$(FLAGS_H) $(PARAMS_H) opts-diagnostic.h
+	$(FLAGS_H) $(PARAMS_H) opts-diagnostic.h diagnostic-color.h
 opts-global.o : opts-global.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 	$(DIAGNOSTIC_H) $(OPTS_H) $(FLAGS_H) $(GGC_H) $(TREE_H) langhooks.h \
 	$(TM_H) $(RTL_H) $(DBGCNT_H) debug.h $(LTO_STREAMER_H) output.h \
@@ -3434,7 +3435,8 @@ params.o : params.c $(CONFIG_H) $(SYSTEM
 	$(PARAMS_H) $(DIAGNOSTIC_CORE_H)
 pointer-set.o: pointer-set.c pointer-set.h $(CONFIG_H) $(SYSTEM_H)
 hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(HOOKS_H)
-pretty-print.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h $(PRETTY_PRINT_H)
+pretty-print.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h $(PRETTY_PRINT_H) \
+	diagnostic-color.h
 errors.o : errors.c $(CONFIG_H) $(SYSTEM_H) errors.h
 dbgcnt.o: dbgcnt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(DUMPFILE_H) \
 	$(DIAGNOSTIC_CORE_H) $(DBGCNT_H)
--- gcc/common.opt.jj	2013-04-04 15:03:29.285380160 +0200
+++ gcc/common.opt	2013-04-08 11:32:33.438159412 +0200
@@ -1028,6 +1028,30 @@
 fdiagnostics-show-caret
 Common Var(flag_diagnostics_show_caret) Init(1)
 Show the source line with a caret indicating the column

+fdiagnostics-color
+Common Alias(fdiagnostics-color=,always,never)
+;
+
+fdiagnostics-color=
+Common Joined RejectNegative Enum(diagnostic_color_rule)
+-fdiagnostics-color=[never|always|auto]	Colorize diagnostics
+
+; Required for these enum values.
+SourceInclude
+diagnostic-color.h
+
+Enum
+Name(diagnostic_color_rule) Type(int)
+
+EnumValue
+Enum(diagnostic_color_rule)
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 09:19 AM, Robert Dewar wrote: On 4/8/2013 9:15 AM, Kenneth Zadeck wrote: I think this applies to Ada constant arithmetic as well. Ada constant arithmetic (at compile time) is always infinite precision (for float as well as for integer). What do you mean when you say constant arithmetic? Do you mean places where there is an explicit 8 * 6 in the source, or do you mean any arithmetic that a compiler, using the full power of interprocedural constant propagation, can discover?
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 09:03 AM, Robert Dewar wrote: It may be interesting to look at what we have done in Ada with regard to overflow in intermediate expressions. Briefly we allow specification of three modes: all intermediate arithmetic is done in the base type, with overflow signalled if an intermediate value is outside this range; all intermediate arithmetic is done in the widest integer type, with overflow signalled if an intermediate value is outside this range; all intermediate arithmetic uses an infinite precision arithmetic package built for this purpose. In the second and third cases we do range analysis that allows smaller intermediate precision if we know it's safe. We also allow separate specification of the mode inside and outside assertions (e.g. preconditions and postconditions) since in the latter you often want to regard integers as mathematical, not subject to intermediate overflow. So then how does a language like Ada work in gcc? My assumption is that most of what you describe here is done in the front end, and by the time you get to the middle end of the compiler, you have chosen types in which you are comfortable having any remaining math done, along with explicit checks for overflow where the programmer asked for them. Otherwise, how could Ada have ever worked with gcc? kenny
Re: [RFA][PATCH] Improve VRP of COND_EXPR_CONDs
On 04/08/2013 03:45 AM, Richard Biener wrote:

@@ -8584,6 +8584,43 @@ simplify_cond_using_ranges (gimple stmt)
	 }
     }

+  /* If we have a comparison of an SSA_NAME boolean against
+     a constant (which obviously must be [0..1]), see if the
+     SSA_NAME was set by a type conversion where the source
+     of the conversion is another SSA_NAME with a range [0..1].
+
+     If so, we can replace the SSA_NAME in the comparison with
+     the RHS of the conversion.  This will often make the type
+     conversion dead code which DCE will clean up.  */
+  if (TREE_CODE (op0) == SSA_NAME
+      && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE

Use

  (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
   || (INTEGRAL_TYPE_P (TREE_TYPE (op0))
       && TYPE_PRECISION (TREE_TYPE (op0)) == 1))

to catch some more cases.

Good catch. Done.

+      && is_gimple_min_invariant (op1))

In this case it's simpler to test TREE_CODE (op1) == INTEGER_CST.

Agreed, fixed.

+    {
+      gimple def_stmt = SSA_NAME_DEF_STMT (op0);
+      tree innerop;
+
+      if (!is_gimple_assign (def_stmt)
+	  || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+	return false;
+
+      innerop = gimple_assign_rhs1 (def_stmt);
+
+      if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop))

As Steven said, the abnormal check is not necessary, but for completeness you should check TREE_CODE (innerop) == SSA_NAME. Valid (but unfolded) GIMPLE can have (_Bool) 1, too.

Agreed, fixed.

Note that we already have code with similar functionality (see if a conversion would alter the value of X) as part of optimizing (T1)(T2)X to (T1)X in simplify_conversion_using_ranges. Maybe a part of it can be split out and used to simplify conditions for a bigger range of types than just compares against boolean 0/1.

That may be a follow-up -- there are still several of these things I'm looking at. I wanted to go ahead and start pushing out those which were clearly improvements rather than queue them while I looked at all the oddities I'm seeing in the dumps.

jeff
Re: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size
On 04/06/2013 09:15 PM, Bin Cheng wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Bin Cheng Sent: Tuesday, March 26, 2013 4:33 PM To: 'Joern Rennecke' Cc: gcc-patches@gcc.gnu.org; 'Jeff Law' Subject: RE: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size -Original Message- From: Joern Rennecke [mailto:joern.renne...@embecosm.com] Sent: Monday, March 25, 2013 8:53 PM To: Bin Cheng Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH GCC]Relax the probability condition in CE pass when optimizing for code size Quoting Bin Cheng bin.ch...@arm.com: During the work I observed passes before combine might interfere with the CE pass, so this patch is enabled for ce2/ce3 after the combination pass. It is tested on x86/thumb2 for both normal and Os. Is it ok for trunk? There are bound to be target- and application-specific variations on which scaling factors work best. 2013-03-25 Bin Cheng bin.ch...@arm.com * ifcvt.c (ifcvt_after_combine): New static variable. It would make more sense to pass in the scale factor as an argument to if_convert, and get the respective values from a set of gcc parameters, so they can be tweaked by ports and/or by a user/ML learning framework (e.g. Milepost) supplying the appropriate --param option. I agree it would be more flexible to pass the factor as a parameter, but I am not sure how useful it will be to users: firstly, it is already target-specific via the BRANCH_COST heuristic; secondly, for code size, the heuristic should be tuned to achieve overall good results, and I doubt to what extent it depends on a specific target/application. Hi Jeff, This is based on your heuristic tuning in ifcvt; would you help us on this issue with some suggestions? Not sure what you need from me. It seems to me that having the scaling factor be dependent on optimizing for size vs optimizing for speed makes sense.
The only question is whether or not it's important enough to be a knob the user can turn -- I've got no strong opinions on that. jeff
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 9:24 AM, Kenneth Zadeck wrote: So then how does a language like ada work in gcc? My assumption is that most of what you describe here is done in the front end and by the time you get to the middle end of the compiler, you have chosen types for which you are comfortable to have any remaining math done in along with explicit checks for overflow where the programmer asked for them. That's right, the front end does all the promotion of types. Otherwise, how could ada have ever worked with gcc? Sometimes we do have to make changes to gcc to accommodate Ada-specific requirements, but this was not one of those cases. Of course the back end would do a better job of the range analysis to remove some unnecessary use of infinite precision, but the front end in practice does a good enough job.
Re: C: Add new warning -Wunprototyped-calls
Richard Biener richard.guent...@gmail.com writes: On Mon, Apr 8, 2013 at 3:05 PM, Andreas Schwab sch...@linux-m68k.org wrote: Richard Biener richard.guent...@gmail.com writes: "when a real prototype was visible" How is that different from a prototype? It's different from the case where a K&R definition was seen and thus type information is present via that mechanism. We don't want to warn in that case. But that isn't a prototype. As I suggested, the warning should just print "without a prototype", but "prototype" here means that a definition before the call is enough to make us happy (as opposed to -Wstrict-prototypes, which warns about function definitions without a previous prototype; we want to warn about calls to functions without a definition or a prototype). How does a definition help here if it isn't a prototype?

Andreas.
--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 9:23 AM, Kenneth Zadeck wrote: On 04/08/2013 09:19 AM, Robert Dewar wrote: On 4/8/2013 9:15 AM, Kenneth Zadeck wrote: I think this applies to Ada constant arithmetic as well. Ada constant arithmetic (at compile time) is always infinite precision (for float as well as for integer). What do you mean when you say constant arithmetic? Do you mean places where there is an explicit 8 * 6 in the source or do you mean any arithmetic that a compiler, using the full power of interprocedural constant propagation, can discover? Somewhere between the two. Ada has a very well defined notion of what is and what is not a static expression; it definitely does not include everything the compiler can discover, but it goes beyond just explicit literal arithmetic, e.g. declared constants X : Integer := 75; are considered static. It is static expressions that must be computed with full precision at compile time. For expressions the compiler can tell are constant even though not officially static, it is fine to compute at compile time for integer, but NOT for float, since you want to use target precision for all non-static float operations.
[PING]RE: [patch] cilkplus: Array notation for C patch
Hello Joseph,
Did you get a chance to look at this patch?

Thanks,
Balaji V. Iyer.

-Original Message-
From: Iyer, Balaji V
Sent: Friday, March 29, 2013 5:58 PM
To: 'Joseph Myers'; 'Aldy Hernandez'
Cc: 'gcc-patches'
Subject: RE: [patch] cilkplus: Array notation for C patch

Hello Joseph, Aldy et al.,
I reworded a couple of comments (e.g. changed "builtin" to "built-in", etc.) and added a header comment to c-array-notation.c that explains the overall process. I am attaching a fixed patch.

Thanks,
Balaji V. Iyer.

Here are the ChangeLog entries again:

gcc/ChangeLog
+2013-03-28  Balaji V. Iyer  balaji.v.i...@intel.com
+
+	* doc/extend.texi (C Extensions): Added documentation about Cilk Plus
+	array notation built-in reduction functions.
+	* doc/passes.texi (Passes): Added documentation about changes done
+	for Cilk Plus.
+	* doc/invoke.texi (C Dialect Options): Added documentation about
+	the -fcilkplus flag.
+	* doc/generic.texi (Storage References): Added documentation for
+	ARRAY_NOTATION_REF storage.
+	* Makefile.in (C_COMMON_OBJS): Added c-family/array-notation-common.o.
+	* tree-pretty-print.c (dump_generic_node): Add case for
+	ARRAY_NOTATION_REF.
+	(BUILTINS_DEF): Depend on cilkplus.def.
+	* builtins.def: Include cilkplus.def.
+	Define DEF_CILKPLUS_BUILTIN.
+	* builtin-types.def: Define BT_FN_INT_PTR_PTR_PTR.
+	* cilkplus.def: New file.

gcc/c-family/ChangeLog
+2013-03-28  Balaji V. Iyer  balaji.v.i...@intel.com
+
+	* c-common.c (c_define_builtins): When cilkplus is enabled, the
+	function array_notation_init_builtins is called.
+	(c_common_init_ts): Added ARRAY_NOTATION_REF as typed.
+	* c-common.def (ARRAY_NOTATION_REF): New tree.
+	* c-common.h (build_array_notation_expr): New function declaration.
+	(build_array_notation_ref): Likewise.
+	(extract_sec_implicit_index_arg): New extern declaration.
+	(is_sec_implicit_index_fn): Likewise.
+	(ARRAY_NOTATION_CHECK): New define.
+	(ARRAY_NOTATION_ARRAY): Likewise.
+	(ARRAY_NOTATION_START): Likewise.
+	(ARRAY_NOTATION_LENGTH): Likewise.
+	(ARRAY_NOTATION_STRIDE): Likewise.
+	(ARRAY_NOTATION_TYPE): Likewise.
+	* c-pretty-print.c (pp_c_postfix_expression): Added a new case for
+	ARRAY_NOTATION_REF.
+	(pp_c_expression): Likewise.
+	* c.opt (flag_enable_cilkplus): New flag.
+	* array-notation-common.c: New file.

gcc/c/ChangeLog
+2013-03-28  Balaji V. Iyer  balaji.v.i...@intel.com
+
+	* c-typeck.c (build_array_ref): Added a check to see if array's
+	index is greater than one. If true, then emit an error.
+	(build_function_call_vec): Exclude error reporting and checking
+	for builtin array-notation functions.
+	(convert_arguments): Likewise.
+	(c_finish_return): Added a check for array notations as a return
+	expression. If true, then emit an error.
+	(c_finish_loop): Added a check for array notations in a loop
+	condition. If true then emit an error.
+	(lvalue_p): Added a ARRAY_NOTATION_REF case.
+	(build_binary_op): Added a check for array notation expr inside
+	op1 and op0. If present, we call another function to find correct
+	type.
+	* Make-lang.in (C_AND_OBJC_OBJS): Added c-array-notation.o.
+	* c-parser.c (c_parser_compound_statement): Check if array
+	notation code is used in tree, if so, then transform them into
+	appropriate C code.
+	(c_parser_expr_no_commas): Check if array notation is used in LHS
+	or RHS, if so, then build array notation expression instead of
+	regular modify.
+	(c_parser_postfix_expression_after_primary): Added a check for
+	colon(s) after square braces, if so then handle it like an array
+	notation. Also, break up array notations in unary op if found.
+	(c_parser_direct_declarator_inner): Added a check for array
+	notation.
+	(c_parser_compound_statement): Added a check for array notation in
+	a stmt. If one is present, then expand array notation expr.
+	(c_parser_if_statement): Likewise.
+	(c_parser_switch_statement): Added a check for array notations in
+	a switch statement's condition. If true, then output an error.
+	(c_parser_while_statement): Similarly, but for a while.
+	(c_parser_do_statement): Similarly, but for a do-while.
+	(c_parser_for_statement): Similarly, but for a for-loop.
+	(c_parser_unary_expression): Check if array notation is used in a
+	pre-increment or pre-decrement expression. If true, then expand
+	them.
+	(c_parser_array_notation): New function.
+	* c-array-notation.c: New file.
+	* c-tree.h (is_cilkplus_reduce_builtin): Protoize.

-Original Message-
From: Iyer, Balaji V
Sent: Thursday, March 28, 2013 1:07 PM
To: Joseph Myers;
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 09:52 AM, Robert Dewar wrote: On 4/8/2013 9:23 AM, Kenneth Zadeck wrote: On 04/08/2013 09:19 AM, Robert Dewar wrote: On 4/8/2013 9:15 AM, Kenneth Zadeck wrote: I think this applies to Ada constant arithmetic as well. Ada constant arithmetic (at compile time) is always infinite precision (for float as well as for integer). What do you mean when you say constant arithmetic? Do you mean places where there is an explicit 8 * 6 in the source, or do you mean any arithmetic that a compiler, using the full power of interprocedural constant propagation, can discover? Somewhere between the two. Ada has a very well defined notion of what is and what is not a static expression; it definitely does not include everything the compiler can discover, but it goes beyond just explicit literal arithmetic, e.g. declared constants X : Integer := 75; I actually guessed that it was something like this, but I did not want to spend the time trying to figure this bit of Ada syntax out. are considered static. It is static expressions that must be computed with full precision at compile time. For expressions the compiler can tell are constant even though not officially static, it is fine to compute at compile time for integer, but NOT for float, since you want to use target precision for all non-static float operations. yes, but the relevant question for the not-officially-static integer constants is: in what precision are those operations to be performed? I assume that you choose gcc types for these operations and you expect the math to be done within that type, i.e. exactly the way you expect the machine to perform.
Re: [RFA][PATCH] Improve VRP of COND_EXPR_CONDs
On Mon, Apr 8, 2013 at 3:27 PM, Jeff Law l...@redhat.com wrote: On 04/08/2013 03:45 AM, Richard Biener wrote: @@ -8584,6 +8584,43 @@ simplify_cond_using_ranges (gimple stmt) } } + /* If we have a comparison of a SSA_NAME boolean against + a constant (which obviously must be [0..1]). See if the + SSA_NAME was set by a type conversion where the source + of the conversion is another SSA_NAME with a range [0..1]. + + If so, we can replace the SSA_NAME in the comparison with + the RHS of the conversion. This will often make the type + conversion dead code which DCE will clean up. */ + if (TREE_CODE (op0) == SSA_NAME + && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE Use (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE || (INTEGRAL_TYPE_P (TREE_TYPE (op0)) && TYPE_PRECISION (TREE_TYPE (op0)) == 1)) to catch some more cases. Good catch. Done. + && is_gimple_min_invariant (op1)) In this case it's simpler to test TREE_CODE (op1) == INTEGER_CST. Agreed, fixed. +{ + gimple def_stmt = SSA_NAME_DEF_STMT (op0); + tree innerop; + + if (!is_gimple_assign (def_stmt) + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))) + return false; + + innerop = gimple_assign_rhs1 (def_stmt); + + if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop)) As Steven said, the abnormal check is not necessary, but for completeness you should check TREE_CODE (innerop) == SSA_NAME. Valid (but unfolded) GIMPLE can have (_Bool) 1, too. Agreed, fixed. Note that we already have code with similar functionality (see if a conversion would alter the value of X) as part of optimizing (T1)(T2)X to (T1)X in simplify_conversion_using_ranges. Maybe a part of it can be split out and used to simplify conditions for a bigger range of types than just compares against boolean 0/1. That may be a follow-up -- there are still several of these things I'm looking at. I wanted to go ahead and start pushing out those which were clearly improvements rather than queue them while I looked at all the oddities I'm seeing in the dumps. Fine with me.
Richard. jeff
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 9:58 AM, Kenneth Zadeck wrote: yes, but the relevant question for the not-officially-static integer constants is: in what precision are those operations to be performed? I assume that you choose gcc types for these operations and you expect the math to be done within that type, i.e. exactly the way you expect the machine to perform. As I explained in an earlier message, *within* a single expression, we are free to use higher precision, and we provide modes that allow this up to and including the use of infinite precision. That applies not just to constant expressions but to all expressions.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 10:12 AM, Robert Dewar wrote: On 4/8/2013 9:58 AM, Kenneth Zadeck wrote: yes, but the relevant question for the not-officially-static integer constants is: in what precision are those operations to be performed? I assume that you choose gcc types for these operations and you expect the math to be done within that type, i.e. exactly the way you expect the machine to perform. As I explained in an earlier message, *within* a single expression, we are free to use higher precision, and we provide modes that allow this up to and including the use of infinite precision. That applies not just to constant expressions but to all expressions. My confusion is what you mean by we? Do you mean we the writer of the program, we the person invoking the compiler by the use of command line options, or we, your company's implementation of Ada? My interpretation of your first email was that it was possible for the programmer to do something equivalent to adding attributes surrounding a block in the program to control the precision and overflow detection of the expressions in the block. And if this is so, then by the time the expression is seen by the middle end of gcc, those attributes will have been converted into tree code that will evaluate the code in a well defined way by both the optimization passes and the target machine. Kenny
Re: RFC: color diagnostics markers
On 8 April 2013 15:23, Jakub Jelinek ja...@redhat.com wrote: On Fri, Apr 05, 2013 at 11:51:43PM +0200, Manuel López-Ibáñez wrote: In this patch the default is never, because for some reason auto triggers colorization during regression testing. I have not found a That reason is obvious, dejagnu (expect?) creates pseudo terminals, so isatty is true, we'd need to just use -fno-diagnostics-color by default for the testsuite (IMHO not a big deal). Fine for me. Anyway, I've kept the default as never for now, but am sending my review comments in form of a new diff, which fixes formatting, avoids memory leaks and changes it to introduce more color names (for caret, locus, quoted text), change default of note color (for some color compatibility with clang; bold green is there used for caret lines, for notes they use bold black apparently, but that doesn't work too well on white-on-black terminals). Right now the patch is unfinished, because there is no support for the new %[locus]%s:%d:%d%[] style diagnostics strings (where %[locus] and %[] stand for switching to locus color and resetting color back) in the -Wformat code (and gettext). I'm wondering if instead of the %[colorname] and %[] it wouldn't be better to just have some %r or whatever letter isn't taken yet which would consume a const char * colorname via va_arg, and some other letter with no argument that would do color reset. Ideas for best unused letters for that? Perhaps then -Wformat support for it would be easier. I.e. instead of: pp_printf ("%[locus]%s:%d:%d%[]", loc.file, loc.line, loc.column); one would write: pp_printf ("%r%s:%d:%d%R", locus, loc.file, loc.line, loc.column); Thanks for working on this, your improvements are quite nice. About %r versus %[colorname], I just don't see the use-case for dynamic color names. In fact, I would be fine with something like: pp_start_color() pp_stop_color() pp_wrap_in_color() It is a bit more verbose, but also clearer when reading the code.
And no need for %[colorname] or %r or -Wformat support. Cheers, Manuel.
Re: Comments on the suggestion to use infinite precision math for wide int.
On Mon, Apr 8, 2013 at 2:43 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/08/2013 06:46 AM, Richard Biener wrote: On Sun, Apr 7, 2013 at 7:16 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Richard, You advocate that I should be using an infinite precision representation and I advocate a finite precision representation where the precision is taken from the context. I would like to make the case for my position here, in a separate thread, because the other thread is just getting too messy. At both the tree level and the rtl level you have a type (mode is just bad rep for types) and both of those explicitly have precisions. The semantics of the programming languages that we implement define, or at least recommend, that most operations be done in a precision that is implementation dependent (or like java a particular machine independent precision). Each hardware platform specifies exactly how every operation is done. I will admit that infinite precision is more esthetically pleasing than what i have done, but exact precision matches the needs of these clients. The problem is that the results from infinite precision arithmetic differ in many significant ways from finite precision math. And the number of places where you have to inject a precision to get the expected answer, ultimately makes the infinite precision representation unattractive. As I said on Thursday, whenever you do operations that do not satisfy the requirements of a mathematical ring (add sub and mul are in a ring, divide, shift and comparisons are not) you run the risk of getting a result that is not what would have been obtained with either a strict interpretation of the semantics or the machine. Intuitively any operation that looks at the bits above the precision does not qualify as an operation that works in a ring. The poster child for operations that do not belong to a ring is division. 
For my example, I am using 4 bit integers because it makes the examples easy, but similar examples exist for any fixed precision. Consider 8 * 10 / 4: in an infinite precision world the result is 20, but in a 4 bit precision world the answer is 0. Another example is to ask if -10 * 10 is less than 0? Again you get a different answer with infinite precision. I would argue that if I declare a variable of type uint32 and scale my examples, I have the right to expect the compiler to produce the same result as the machine would. While C and C++ may have enough wiggle room in their standards so that this is just an unexpected, but legal, result as opposed to being wrong, everyone will hate you (us) if we do this. Furthermore, Java explicitly does not allow this (not that anyone actually uses gcj). I do not know enough about go, ada and fortran to say how it would affect them. In looking at the double-int class, the only operation that does not fit in a ring that is done properly is shifting. There we explicitly pass in the precision. The reason that we rarely see this kind of problem even though double-int implements 128 bit infinite precision is that currently very little of the compiler actually uses infinite precision in a robust way. In a large number of places, the code looks like: if (TYPE_PRECISION (TREE_TYPE (...)) < HOST_BITS_PER_WIDE_INT) do something using inline operators. else either do not do something or use const-double. Such code clears out most of these issues before the two passes that embrace infinite precision get a chance to do much damage. However, my patch at the rtl level gets rid of most of this kind of code and replaces it with calls to wide-int that currently uses only operations within the precision. I assume that if I went down the infinite precision road at the tree level, all of this would come to the surface very quickly. I prefer to not change my rep and not have to deal with this later.
Add, subtract, multiply and the logicals are all safe. But divide, remainder, and all of the comparisons need explicit precisions. In addition, operations like clz, ctz and clrsb need precisions. In total about half of the functions would need a precision passed in. My point is that once you have to start passing the precision in for all of those operations, it seems to be cleaner to get the precision from the leaves of the tree as I currently do. Once you buy into the math-in-a-particular-precision world, a lot of the other issues that you raise are just settled. Asking how to extend a value beyond its precision is like asking what the universe was like before the big bang. It is just something you do not need to know. I understand that you would like to have functions like x + 1 work, and so do I. I just could not figure out how to make them have unsurprising semantics. In particular, g++ did not seem to be happy with me defining two plus operators, one for each of signed and unsigned HWIs.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 10:26 AM, Kenneth Zadeck wrote: My confusion is what you mean by we? Do you mean we the writer of the program, we the person invoking the compiler by the use of command line options, or we, your company's implementation of Ada? Sorry, bad usage. The gcc implementation of Ada allows the user to specify by pragmas how intermediate overflow is handled. My interpretation of your first email was that it was possible for the programmer to do something equivalent to adding attributes surrounding a block in the program to control the precision and overflow detection of the expressions in the block. And if this is so, then by the time the expression is seen by the middle end of gcc, those attributes will have been converted into tree code that will evaluate the code in a well defined way by both the optimization passes and the target machine. Yes, that's a correct understanding. Kenny
Re: [Fortran, RFC patch] Document naming and argument passing convention
Dear all, attached is an updated version of the patch, which addresses the raised issues and some minor problems and omissions I found. OK for the trunk? Tobias 2013-04-08 Tobias Burnus bur...@net-b.de * gfortran.texi (KIND Type Parameters, Internal representation of LOGICAL variables): Add crossrefs. (Intrinsic Types): Mention issues with _Bool interop. (Naming and argument-passing conventions): New section. diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi index 4f9008d..46fdeb3 100644 --- a/gcc/fortran/gfortran.texi +++ b/gcc/fortran/gfortran.texi @@ -1166,7 +1166,8 @@ parameters of the @code{ISO_FORTRAN_ENV} module instead of the concrete values. The available kind parameters can be found in the constant arrays @code{CHARACTER_KINDS}, @code{INTEGER_KINDS}, @code{LOGICAL_KINDS} and @code{REAL_KINDS} in the @code{ISO_FORTRAN_ENV} module -(see @ref{ISO_FORTRAN_ENV}). +(see @ref{ISO_FORTRAN_ENV}). For C interoperability, the kind parameters of +the @code{ISO_C_BINDING} module should be used (see @ref{ISO_C_BINDING}). @node Internal representation of LOGICAL variables @@ -1184,16 +1185,7 @@ A @code{LOGICAL(KIND=N)} variable is represented as an values: @code{1} for @code{.TRUE.} and @code{0} for @code{.FALSE.}. Any other integer value results in undefined behavior. -Note that for mixed-language programming using the -@code{ISO_C_BINDING} feature, there is a @code{C_BOOL} kind that can -be used to create @code{LOGICAL(KIND=C_BOOL)} variables which are -interoperable with the C99 _Bool type. The C99 _Bool type has an -internal representation described in the C99 standard, which is -identical to the above description, i.e. with 1 for true and 0 for -false being the only permissible values. Thus the internal -representation of @code{LOGICAL} variables in GNU Fortran is identical -to C99 _Bool, except for a possible difference in storage size -depending on the kind. +See also @ref{Argument passing conventions} and @ref{Interoperability with C}.
@node Thread-safety of the runtime library @@ -2204,6 +2196,7 @@ common, but not the former. * Interoperability with C:: * GNU Fortran Compiler Directives:: * Non-Fortran Main Program:: +* Naming and argument-passing conventions:: @end menu This chapter is about mixed-language interoperability, but also applies @@ -2250,6 +2243,16 @@ in C and Fortran, the named constants shall be used which are defined in the for kind parameters and character named constants for the escape sequences in C. For a list of the constants, see @ref{ISO_C_BINDING}. +For logical types, please note that the Fortran standard only guarantees +interoperability between C99's @code{_Bool} and Fortran's @code{C_Bool}-kind +logicals and C99 defines that @code{true} has the value 1 and @code{false} +the value 0. Using any other integer value with GNU Fortran's @code{LOGICAL} +(with any kind parameter) gives an undefined result. (Passing integer +values other than 0 and 1 to GCC's @code{_Bool} is also undefined, unless the +integer is explicitly or implicitly cast to @code{_Bool}.) + + + @node Derived Types and struct @subsection Derived Types and struct @@ -2975,6 +2978,144 @@ int main (int argc, char *argv[]) @end table +@node Naming and argument-passing conventions +@section Naming and argument-passing conventions + +This section gives an overview about the naming convention of procedures +and global variables and about the argument passing conventions used by +GNU Fortran. If a C binding has been specified, the naming convention +and some of the argument-passing conventions change. If possible, +mixed-language and mixed-compiler projects should use the better defined +C binding for interoperability. See @pxref{Interoperability with C}.
+ +@menu +* Naming conventions:: +* Argument passing conventions:: +@end menu + + +@node Naming conventions +@subsection Naming conventions + +According to the Fortran standard, valid Fortran names consist of a letter +between @code{A} to @code{Z}, @code{a} to @code{z}, digits @code{0}, +@code{1} to @code{9} and underscores (@code{_}) with the restriction +that names may only start with a letter. As a vendor extension, the +dollar sign (@code{$}) is additionally permitted with the option +@option{-fdollar-ok}, but not as first character and only if the +target system supports it. + +By default, the procedure name is the lower-cased Fortran name with an +appended underscore (@code{_}); using @option{-fno-underscoring} no +underscore is appended while @code{-fsecond-underscore} appends two +underscores. Depending on the target system and the calling convention, +the procedure might be additionally dressed; for instance, on 32bit +Windows with @code{stdcall}, an at-sign @code{@@} followed by an integer +number is appended. For changing the calling convention, see +@pxref{GNU Fortran Compiler Directives}. + +For common blocks, the same convention is used, i.e. by default an +underscore is
Re: [C++ Patch] PR 56871
... I think that by the time we do the check, if old_decl is a FUNCTION_DECL we can safely assume that new_decl is also a FUNCTION_DECL, thus I can simplify the code. I'm finishing testing the below variant. Thanks Paolo. /// Index: cp/decl.c === --- cp/decl.c (revision 197572) +++ cp/decl.c (working copy) @@ -1196,12 +1196,21 @@ validate_constexpr_redeclaration (tree old_decl, t if (DECL_DECLARED_CONSTEXPR_P (old_decl) == DECL_DECLARED_CONSTEXPR_P (new_decl)) return true; - if (TREE_CODE (old_decl) == FUNCTION_DECL && DECL_BUILT_IN (old_decl)) + if (TREE_CODE (old_decl) == FUNCTION_DECL) { - /* Hide a built-in declaration. */ - DECL_DECLARED_CONSTEXPR_P (old_decl) - = DECL_DECLARED_CONSTEXPR_P (new_decl); - return true; + if (DECL_BUILT_IN (old_decl)) + { + /* Hide a built-in declaration. */ + DECL_DECLARED_CONSTEXPR_P (old_decl) + = DECL_DECLARED_CONSTEXPR_P (new_decl); + return true; + } + /* 7.1.5 [dcl.constexpr] +Note: An explicit specialization can differ from the template +declaration with respect to the constexpr specifier. */ + if (! DECL_TEMPLATE_SPECIALIZATION (old_decl) + && DECL_TEMPLATE_SPECIALIZATION (new_decl)) + return true; } error ("redeclaration %qD differs in %<constexpr%>", new_decl); error ("from previous declaration %q+D", old_decl); Index: testsuite/g++.dg/cpp0x/constexpr-specialization.C === --- testsuite/g++.dg/cpp0x/constexpr-specialization.C (revision 0) +++ testsuite/g++.dg/cpp0x/constexpr-specialization.C (working copy) @@ -0,0 +1,12 @@ +// PR c++/56871 +// { dg-options "-std=c++11" } + +template<typename T> constexpr int foo(T); +template<> int foo(int); +template<> int foo(int);// { dg-error "previous" } +template<> constexpr int foo(int); // { dg-error "redeclaration" } + +template<typename T> int bar(T); +template<> constexpr int bar(int); +template<> constexpr int bar(int); // { dg-error "previous" } +template<> int bar(int);// { dg-error "redeclaration" }
Re: RFC: color diagnostics markers
On Mon, Apr 08, 2013 at 04:29:02PM +0200, Manuel López-Ibáñez wrote: In fact, I would be fine with something like: pp_start_color() pp_stop_color() pp_wrap_in_color() It is a bit more verbose, but also clearer when reading the code. And no need for %[colorname] or %r or -Wformat support. But you then need to break the code into multiple function calls, which decreases readability. pp_verbatim (context->printer, _("%s:%d:%d: [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"), xloc.file, xloc.line, xloc.column, skip); can be right now a single call, while you would need several. Also, if you eventually want to colorize something in say error_at, warning_at and similar format strings. For those you really don't have the printer at hand, and can't easily break it into multiple calls. The reason for %r/%R instead of the %[ in the patch is that I think it will be easier to teach -Wformat and gettext about it that way, rather than if the argument is embedded in between [ and ]. With %r/%R it would be: pp_verbatim (context->printer, _("%r%s:%d:%d:%R [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"), locus, xloc.file, xloc.line, xloc.column, skip); Jakub
Re: [cilkplus] misc cleanups for #pragma simd implementation
On 04/08/13 08:59, Iyer, Balaji V wrote: Hi Aldy, Here are the things I found with the patch. All my comments have BVI: in front of them. BTW, it would be nice if you could use standard mailer quotation when responding (>, etc.). - return; BVI: I am OK with removing this return, but the reason why I put it there is because it gets easier for me to set the break point there. This is not standard practice in GCC source code. It will have to be removed if we ever merge. case PRAGMA_CILK_GRAINSIZE: - if (context == pragma_external) - { - c_parser_error (parser, "pragma grainsize must be inside a function"); - return false; - } - if (flag_enable_cilk) - c_parser_cilk_grainsize (parser); - else - { - warning (0, "pragma grainsize ignored"); - c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL); - } + if (!c_parser_pragma_simd_ok_p (parser, context)) + return false; + cilkplus_local_simd_values.loc = loc; + c_parser_cilk_grainsize (parser); BVI: This is incorrect. #pragma grainsize is part of the Cilk keywords. It has no relation to pragma simd and it will work without vectorization support. Fixed, let me know if the current implementation is correct. +// FIXME: We should really rewrite all this psv* business to use vectors. +/* Given an index into the pragma simd list (PSV_INDEX), find its + entry and return it. */ BVI: I am in the process of doing so. I will send out that patch as soon as I get some free time. Ok, I have left all the FIXMEs so we don't miss any of them. for (ps_iter = psv_head; ps_iter->ptr_next != NULL; ps_iter = ps_iter->ptr_next) -{ - ; -} +; BVI: Are you sure the compiler let you get away with this? It gave me a warning once (in stage2 I believe). Sure, I've done it for years. BVI: I have fixed these scripts already: the correct notation that I have used is cilkplus_type_language_compile/execute/errors.exp Ok, I have removed them from my patch. - + BVI: Why did you replace a space with a tab? Whoops, removed the space (and the tab). How about this?
commit 1847c6c76ca2ed0da68cb7985fde4c0b4d634b65 Author: Aldy Hernandez al...@redhat.com Date: Mon Apr 8 09:59:38 2013 -0500 Minor cleanups for pragma simd implementation. diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index f00d28d..a48b011 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -117,26 +117,11 @@ c_parse_init (void) ridpointers [(int) c_common_reswords[i].rid] = id; } - /* Here we initialize the simd_values structure. We only need it - initialized the first time, after each consumptions, for-loop will - automatically consume the values and delete the information. */ - cilkplus_local_simd_values.index = 0; - cilkplus_local_simd_values.pragma_encountered = false; - cilkplus_local_simd_values.types = P_SIMD_NOASSERT; - cilkplus_local_simd_values.vectorlength = NULL_TREE; - cilkplus_local_simd_values.vec_length_list= NULL; - cilkplus_local_simd_values.vec_length_size= 0; - cilkplus_local_simd_values.private_vars = NULL_TREE; - cilkplus_local_simd_values.priv_var_list = NULL; - cilkplus_local_simd_values.priv_var_size = 0; - cilkplus_local_simd_values.linear_vars= NULL_TREE; - cilkplus_local_simd_values.linear_var_size= 0; - cilkplus_local_simd_values.linear_var_list= NULL; - cilkplus_local_simd_values.linear_steps = NULL_TREE; - cilkplus_local_simd_values.linear_steps_list = NULL; - cilkplus_local_simd_values.linear_steps_size = 0; - cilkplus_local_simd_values.reduction_vals = NULL; - cilkplus_local_simd_values.ptr_next = NULL; + /* Only initialize the first time. After each consumption, the + for-loop handling code (c_finish_loop) will automatically consume + the values and delete the information. 
*/ + memset (&cilkplus_local_simd_values, 0, + sizeof (cilkplus_local_simd_values)); clear_pragma_simd_list (); } @@ -1251,12 +1236,16 @@ static void c_parser_objc_at_synthesize_declaration (c_parser *); static void c_parser_objc_at_dynamic_declaration (c_parser *); static bool c_parser_objc_diagnose_bad_element_prefix (c_parser *, struct c_declspecs *); + +// FIXME: Re-work this so there are only prototypes for mutually +// recursive functions. +/* Cilk Plus supporting routines. */ static void c_parser_cilk_for_statement (c_parser *, tree); -void c_parser_simd_linear (c_parser *); -void c_parser_simd_private (c_parser *); -void c_parser_simd_assert (c_parser *, bool); -void c_parser_simd_vectorlength (c_parser *); -void c_parser_simd_reduction (c_parser *); +static void c_parser_simd_linear (c_parser *); +static void c_parser_simd_private (c_parser *); +static void c_parser_simd_assert (c_parser *, bool); +static void c_parser_simd_vectorlength (c_parser *); +static void
RFA: Fix tree-optimization/55524
This is basically the same patch as attached to the PR, except that I have changed the goto-loop into a do-while loop with a new comment; this caused the need for a lot of reformatting. Bootstrapped and regtested on i686-pc-linux-gnu. 2013-04-08 Joern Rennecke joern.renne...@embecosm.com * tree-ssa-math-opts.c (mult_to_fma_pass): New file static struct. (convert_mult_to_fma): In first pass, don't use an fms construct when we don't have an fms operation, but fnma. (execute_optimize_widening_mul): Add a second pass if convert_mult_to_fma requests it. Index: gcc/tree-ssa-math-opts.c === --- gcc/tree-ssa-math-opts.c(revision 197578) +++ gcc/tree-ssa-math-opts.c(working copy) @@ -2461,6 +2461,12 @@ convert_plusminus_to_widen (gimple_stmt_ return true; } +static struct +{ + bool second_pass; + bool retry_request; +} mult_to_fma_pass; + /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2 with uses in additions and subtractions to form fused multiply-add operations. Returns true if successful and MUL_STMT should be removed. */ @@ -2570,6 +2576,22 @@ convert_mult_to_fma (gimple mul_stmt, tr return false; } + /* If the subtrahend (gimple_assign_rhs2 (use_stmt)) is computed +by a MULT_EXPR that we'll visit later, we might be able to +get a more profitable match with fnma. +OTOH, if we don't, a negate / fma pair has likely lower latency +than a mult / subtract pair. */ + if (use_code == MINUS_EXPR && !negate_p + && gimple_assign_rhs1 (use_stmt) == result + && optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing + && optab_handler (fnma_optab, TYPE_MODE (type)) != CODE_FOR_nothing + && mult_to_fma_pass.second_pass == false) + { + /* ??? Could make setting of retry_request dependent on some +rtx_cost measure we evaluate beforehand. */ + mult_to_fma_pass.retry_request = true; + return false; + } /* We can't handle a * b + a * b.
*/ if (gimple_assign_rhs1 (use_stmt) == gimple_assign_rhs2 (use_stmt)) return false; @@ -2657,76 +2679,89 @@ execute_optimize_widening_mul (void) memset (widen_mul_stats, 0, sizeof (widen_mul_stats)); - FOR_EACH_BB (bb) -{ - gimple_stmt_iterator gsi; - for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);) -{ - gimple stmt = gsi_stmt (gsi); - enum tree_code code; + /* We may run one or two passes. In the first pass, if we have fnma, + but not fms, we don't synthesize fms so that we can get the maximum + matches for fnma. If we have therefore skipped opportunities to + synthesize fms, we'll run a second pass where we use any such + opportunities that still remain. */ + mult_to_fma_pass.retry_request = false; + do +{ + mult_to_fma_pass.second_pass = mult_to_fma_pass.retry_request; + FOR_EACH_BB (bb) + { + gimple_stmt_iterator gsi; - if (is_gimple_assign (stmt)) + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);) { - code = gimple_assign_rhs_code (stmt); - switch (code) + gimple stmt = gsi_stmt (gsi); + enum tree_code code; + + if (is_gimple_assign (stmt)) { - case MULT_EXPR: - if (!convert_mult_to_widen (stmt, &gsi) - && convert_mult_to_fma (stmt, - gimple_assign_rhs1 (stmt), - gimple_assign_rhs2 (stmt))) + code = gimple_assign_rhs_code (stmt); + switch (code) { - gsi_remove (&gsi, true); - release_defs (stmt); - continue; - } - break; - - case PLUS_EXPR: - case MINUS_EXPR: - convert_plusminus_to_widen (&gsi, stmt, code); - break; + case MULT_EXPR: + if (!convert_mult_to_widen (stmt, &gsi) + && convert_mult_to_fma (stmt, + gimple_assign_rhs1 (stmt), + gimple_assign_rhs2 (stmt))) + { + gsi_remove (&gsi, true); + release_defs (stmt); + continue; + } + break; + + case PLUS_EXPR: + case MINUS_EXPR: + convert_plusminus_to_widen (&gsi, stmt, code); + break; - default:; + default:; + } } - } - else if
[patch cygwin]: Replace use of TARGET_CYGWIN64 by TARGET_64BIT
Hi, this patch fixes an obvious typo in a recently applied patch. ChangeLog 2013-04-08 Kai Tietz kti...@redhat.com * config/i386/cygwin.h (EXTRA_OS_CPP_BUILTINS): Replaced TARGET_CYGWIN64 by TARGET_64BIT. Applied to trunk as an obvious fix, as revision 197593. Regards, Kai Index: cygwin.h === --- cygwin.h(Revision 197586) +++ cygwin.h(Arbeitskopie) @@ -22,7 +22,7 @@ along with GCC; see the file COPYING3. If not see do \ { \ builtin_define ("__CYGWIN__"); \ - if (!TARGET_CYGWIN64)\ + if (!TARGET_64BIT) \ builtin_define ("__CYGWIN32__");\ builtin_define ("__unix__"); \ builtin_define ("__unix"); \
Re: [PATCH] Avoid warning when unused attribute applied to C++ member variables (issue8212043)
Ping. Thanks, Teresa On Sun, Mar 31, 2013 at 9:39 AM, Teresa Johnson tejohn...@google.com wrote: On Sun, Mar 31, 2013 at 1:36 AM, Andrew Pinski pins...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:10 AM, Teresa Johnson tejohn...@google.com wrote: This patch allows the unused attribute to be used without warning on C++ class members, which are of type FIELD_DECL. This is for compatibility with clang, which allows the attribute to be specified on class members and struct fields. It looks like more work would need to be done to implement the actual unused variable detection and warning on FIELD_DECLs, but this change will at least avoid the warning on the code that uses the unused attribute in these cases. The documentation at http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html also doesn't seem to preclude its use on C++ member variables. This also allows it on field in normal C case. As far as I understand they are fields and not variables in the normal programming sense which is why the document does not mention them. That's true that this change will also allow the unused attribute on normal C struct fields. I just verified that clang also allows this, and it could potentially be taken advantage of to warn on unused fields as well. Teresa Thanks, Andrew Pinski Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk? 2013-03-30 Teresa Johnson tejohn...@google.com * c-family/c-common.c (handle_unused_attribute): Handle FIELD_DECL for C++ class members. 
Index: c-family/c-common.c === --- c-family/c-common.c (revision 197266) +++ c-family/c-common.c (working copy) @@ -6753,6 +6753,7 @@ handle_unused_attribute (tree *node, tree name, tr if (TREE_CODE (decl) == PARM_DECL || TREE_CODE (decl) == VAR_DECL + || TREE_CODE (decl) == FIELD_DECL || TREE_CODE (decl) == FUNCTION_DECL || TREE_CODE (decl) == LABEL_DECL || TREE_CODE (decl) == TYPE_DECL) -- This patch is available for review at http://codereview.appspot.com/8212043 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [C++ Patch] PR 56871
OK. Jason
Re: RFC: color diagnostics markers
On 8 April 2013 16:43, Jakub Jelinek ja...@redhat.com wrote: On Mon, Apr 08, 2013 at 04:29:02PM +0200, Manuel López-Ibáñez wrote: In fact, I would be fine with something like: pp_start_color() pp_stop_color() pp_wrap_in_color() It is a bit more verbose, but also clearer when reading the code. And no need for %[colorname] or %r or -Wformat support. But you then need to break the code into multiple function calls, which decreases readability.

  pp_verbatim (context->printer,
	       _("%s:%d:%d:   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
	       xloc.file, xloc.line, xloc.column, skip);

I guess "decreases readability" depends on whether one knows what the extra codes mean or not. I still have to check many times what %K and %q#+T and other less common codes exactly do. I'd rather have fewer codes than more. And one could argue that the above call should be split, since the "%s:%d:%d:" part should not be translated. That said, I would prefer that instead of

  expanded_location xloc;
  xloc = expand_location (loc);
  if (context->show_column)
    pp_verbatim (context->printer,
		 _("%r%s:%d:%d:%R   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
		 "locus", xloc.file, xloc.line, xloc.column, skip);
  else
    pp_verbatim (context->printer,
		 _("%r%s:%d:%R   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
		 "locus", xloc.file, xloc.line, skip);

we had:

  pp_verbatim (context->printer,
	       _("%X   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
	       expand_location (loc), skip);

and the pretty-printer takes care of applying color or not (or expanding column numbers or not, etc).
Or without the extra %X code:

  pp_print_locus (context->printer, loc);
  pp_verbatim (context->printer,
	       _("   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
	       skip);

Or even if we don't want the pretty printer to know about ->show_column:

  diagnostic_print_locus (context, loc);
  pp_verbatim (context->printer,
	       _("   [ skipping %d instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]\n"),
	       skip);

and the internal diagnostics machinery takes care of applying color or not (or expanding column numbers or not, etc). can be right now a single call, while you would need several. Also, if you eventually want to colorize something in say error_at, warning_at and similar format strings. For those you really don't have the printer at Do we really want to allow that much flexibility? Then the color_dict needs to be dynamic or the caller is restricted to re-using existing color names. I was expecting the use of color to be rather limited to a very few well-defined concepts. I was hoping that higher-level diagnostic functions would be oblivious to the color stuff, to not make the diagnostics code much more complex. Maybe I am wrong here and FE maintainers do want that flexibility. Cheers, Manuel.
Re: RFC: color diagnostics markers
On Mon, Apr 08, 2013 at 07:54:18PM +0200, Manuel López-Ibáñez wrote: can be right now a single call, while you would need several. Also, if you eventually want to colorize something in say error_at, warning_at and similar format strings. For those you really don't have the printer at Do we really want to allow that much flexibility? Then the color_dict needs to be dynamic or the caller is restricted to re-using existing colornames. Yes, I think we want that flexibility, it certainly isn't that much difficult to support it (a few lines of code, will try to code the %r/%R variant tomorrow), and from time to time it can be useful. Perhaps that %L or whatever character isn't taken for the expanded location could be used too. I was expecting the use of color to be rather limited to a very very few well-defined concepts. I was hoping that higher-level diagnostic functions would be oblivious to the color stuff to not make the diagnostics code much more complex. I don't see why we would need dynamic color names, as the color names are to be overridable through GCC_COLORS, documented in invoke.text etc., the list better be static and not too long, but we can add new color names in the future when needed. Jakub
useless cast blocking some optimization in gcc 4.7.3
Hello, I have identified a big performance regression between 4.6 and 4.7 (I have enclosed a pathological test). After investigation, it is because of the += statement applied on two signed chars:
- it is now type-promoted to int when it is written result += foo() (since 4.7);
- it is type-promoted to unsigned char when it is written result = result + foo().
The char->int->char cast is blocking some optimizations in later phases. Anyway, this doesn't look wrong, so I extended fold optimization in order to catch this case (patch enclosed). The patch basically transforms:

  (TypeA) ( (TypeB) a1 + (TypeB) a2 )  /* with a1 and a2 of the signed type TypeA */

into:

  a1 + a2

I believe this is legal for any licit a1/a2 input values (no overflow on signed char). No new failure on the two tested targets: sh-superh-elf and x86_64-unknown-linux-gnu. Should I enter a bugzilla to track this? Is it ok for trunk?

2013-04-08  Laurent Alfonsi  laurent.alfo...@st.com

	* fold-const.c (fold_unary_loc): Suppress useless type promotion.
Thanks, Laurent

#include <cstdio>
typedef char int8_t;

const int iterations = 20;
const int SIZE = 200;
int8_t data8[SIZE];

/**/
template <typename T>
inline void check_result(T result) {
  if (result != T(200)) {
    printf("test failed %d!=%d\n", result, 200);
  }
}

/**/
template <typename T>
struct all_constants {
  static T get_one(T input) { return (T(1)); }
};

/**/
template <typename T, typename Input>
void test_constant(T* first, int count) {
  int i;
  for (i = 0; i < iterations; ++i) {
    T result = 0;
    for (int n = 0; n < count; ++n) {
      result += Input::get_one( first[n] );
    }
    check_result<T>(result);
  }
}

/**/
int main(int argc, char** argv) {
  test_constant<int8_t, all_constants<int8_t> >(data8, SIZE);
  return 0;
}

--- ./gcc.orig/gcc/fold-const.c	2013-04-08 14:09:32.0 +0200
+++ ./gcc/gcc/fold-const.c	2013-04-08 11:08:16.0 +0200
@@ -8055,6 +8055,26 @@
 	    }
 	}
 
+      /* Convert (T1) ((T2)X + (T2)Y) into X + Y,
+	 if X and Y already have type T1 (integral only), and T2 > T1.  */
+      if (INTEGRAL_TYPE_P (type)
+	  && TYPE_OVERFLOW_UNDEFINED (type)
+	  && (TREE_CODE (op0) == PLUS_EXPR || TREE_CODE (op0) == MINUS_EXPR
+	      || TREE_CODE (op0) == MULT_EXPR)
+	  && TREE_CODE (TREE_OPERAND (op0, 0)) == NOP_EXPR
+	  && TREE_CODE (TREE_OPERAND (op0, 1)) == NOP_EXPR
+	  && type == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (op0, 0), 0))
+	  && type == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (op0, 1), 0))
+	  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0)))
+	{
+	  tem = fold_build2_loc (loc, TREE_CODE (op0), type,
+				 fold_convert_loc (loc, type,
+						   TREE_OPERAND (op0, 0)),
+				 fold_convert_loc (loc, type,
						   TREE_OPERAND (op0, 1)));
+	  return fold_convert_loc (loc, type, tem);
+	}
+
       tem = fold_convert_const (code, type, op0);
       return tem ? tem : NULL_TREE;
[PATCH] Don't forwprop into clobbers in some cases (PR tree-optimization/56854)
Hi! lhs ={v} {CLOBBER}; stmts right now allow only VAR_DECL or MEM_REF lhs, but the forwprop code below on the attached testcase attempts to propagate an ARRAY_REF (of a MEM_REF) into it. Fixed by not propagating in that case; allowing arbitrary memory lhs is IMHO unnecessary and such lhs's wouldn't be very useful for DSE anyway. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-04-08  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/56854
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Don't
	forward into clobber stmts if it would change MEM_REF lhs into
	non-MEM_REF.

	* g++.dg/torture/pr56854.C: New test.

--- gcc/tree-ssa-forwprop.c.jj	2013-02-25 23:51:21.0 +0100
+++ gcc/tree-ssa-forwprop.c	2013-04-08 16:12:37.0 +0200
@@ -826,7 +826,11 @@ forward_propagate_addr_expr_1 (tree name
	  && integer_zerop (TREE_OPERAND (lhs, 1))
	  && useless_type_conversion_p
	       (TREE_TYPE (TREE_OPERAND (def_rhs, 0)),
-		TREE_TYPE (gimple_assign_rhs1 (use_stmt))))
+		TREE_TYPE (gimple_assign_rhs1 (use_stmt)))
+	  /* Don't forward anything into clobber stmts if it would result
+	     in the lhs no longer being a MEM_REF.  */
+	  && (!gimple_clobber_p (use_stmt)
+	      || TREE_CODE (TREE_OPERAND (def_rhs, 0)) == MEM_REF))
 	{
 	  tree *def_rhs_basep = &TREE_OPERAND (def_rhs, 0);
 	  tree new_offset, new_base, saved, new_lhs;
--- gcc/testsuite/g++.dg/torture/pr56854.C.jj	2013-04-08 18:03:37.978009666 +0200
+++ gcc/testsuite/g++.dg/torture/pr56854.C	2013-04-08 18:03:09.0 +0200
@@ -0,0 +1,24 @@
+// PR tree-optimization/56854
+// { dg-do compile }
+
+inline void *
+operator new (__SIZE_TYPE__, void *p) throw ()
+{
+  return p;
+}
+
+struct A
+{
+  int a;
+  A () : a (0) {}
+  ~A () {}
+  A &operator= (const A &v) { this->~A (); new (this) A (v); return *this; }
+};
+A b[4], c[4];
+
+void
+foo ()
+{
+  for (int i = 0; i < 4; ++i)
+    c[i] = b[i];
+}

	Jakub
[linaro/gcc-4_8-branch] Merge from upstream gcc-4_8-branch and backports from trunk
Hi, I have just merged upstream gcc-4_8-branch into linaro/gcc-4_8-branch, up to r197294. (The merge is r197598.) I have also backported the following trunk revisions into the linaro/gcc-4_8-branch: 196856, 196858, 196876, 197046, 197051, 197052, 197153, 197207, 197341, 197342, and 197346. (Backports are revisions 197599:197609). Thanks, Matt -- Matthew Gretton-Dann Toolchain Working Group, Linaro
[patch, fortran] Committed fix for PR 56782
Hello world, I committed the attached patch as obvious to fix the regression with array constructors on trunk, after regression-testing. Will commit to 4.8 next. Thomas

2013-04-08  Thomas Koenig  tkoe...@gcc.gnu.org

	PR fortran/56782
	* frontend-passes.c (callback_reduction): Don't do any
	simplification if there is only a single element
	which has an iterator.

2013-04-08  Thomas Koenig  tkoe...@gcc.gnu.org

	PR fortran/56782
	* gfortran.dg/array_constructor_44.f90: New test.

Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 197233)
+++ frontend-passes.c	(Arbeitskopie)
@@ -300,7 +300,12 @@ callback_reduction (gfc_expr **e, int *walk_subtre
 
   c = gfc_constructor_first (arg->value.constructor);
 
-  if (c == NULL)
+  /* Don't do any simplification if we have
+     - no element in the constructor or
+     - only have a single element in the array which contains an
+     iterator.  */
+
+  if (c == NULL || (c->iterator != NULL && gfc_constructor_next (c) == NULL))
     return 0;
 
   res = copy_walk_reduction_arg (c->expr, fn);

! { dg-do run }
! { dg-options "-ffrontend-optimize" }
! PR 56872 - wrong front-end optimization with a single constructor.
! Original bug report by Rich Townsend.
   integer :: k
   real :: s
   integer :: m
   s = 2.0
   m = 4
   res = SUM([(s**(REAL(k-1)/REAL(m-1)),k=1,m)])
   if (abs(res - 5.84732246) > 1e-6) call abort
   end
Re: useless cast blocking some optimization in gcc 4.7.3
Hello, On Mon, 8 Apr 2013, Laurent Alfonsi wrote: I have identified a big performance regression between 4.6 and 4.7. (I have enclosed a pathological test). After investigation, it is because of the += statement applied on 2 signed chars. - It is now type-promoted to int when it is written result += foo(). (since 4.7) - it is type promoted to unsigned char when it is written result = result + foo(). The char-int-char cast is blocking some optimizations in later phases. Which ones? Anyway, this doesn't look wrong, so I extended fold optimization in order to catch this case. (patch enclosed) The patch basically transforms : (TypeA) ( (TypeB) a1 + (TypeB) a2 )/* with a1 and a2 of the signed type TypeA */ into : a1 + a2 I believe this is legal for any licit a1/a2 input values (no overflow on signed char). I don't think this is ok, please refer to the discussion around the PR and patch that added this conversion, it was done on purpose. According to this (4th item) http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html char a=100; a+=a; is perfectly defined and a is -56 (assuming a signed 8 bit char and a strictly larger int). However, your transformation turns it into undefined behavior: an addition that overflows in a type with TYPE_OVERFLOW_UNDEFINED. -- Marc Glisse
Make lto-symtab to ignore conflicts in static functions
Hi, currently lto-symtab is trying to resolve all duplicated declarations, including static variables where such duplicates should not happen. This conflicts with the plan to solve PR54095 by postponning renaming to the partitioning. This patch adds lto_symtab_symbol_p that disable merging on statics and keeps duplicate entries for a given asm name. Boostrapped/regtested x86_64-linux, OK? Honza PR lto/54095 lto-symtab.c (lto_symtab_symbol_p): New function. (lto_symtab_resolve_can_prevail_p, lto_symtab_resolve_symbols, lto_symtab_resolve_symbols, lto_symtab_merge_decls_2, lto_symtab_merge_decls_1, lto_symtab_merge_cgraph_nodes_1): Skip static symbols. Index: lto-symtab.c === *** lto-symtab.c(revision 197551) --- lto-symtab.c(working copy) *** lto_symtab_resolve_replaceable_p (symtab *** 226,237 return false; } /* Return true if the symtab entry E can be the prevailing one. */ static bool lto_symtab_resolve_can_prevail_p (symtab_node e) { ! if (!symtab_real_symbol_p (e)) return false; /* The C++ frontend ends up neither setting TREE_STATIC nor --- 226,249 return false; } + /* Return true, if the symbol E should be resolved by lto-symtab. +Those are all real symbols that are not static (we handle renaming +of static later in partitioning). */ + + static bool + lto_symtab_symbol_p (symtab_node e) + { + if (!TREE_PUBLIC (e-symbol.decl)) + return false; + return symtab_real_symbol_p (e); + } + /* Return true if the symtab entry E can be the prevailing one. */ static bool lto_symtab_resolve_can_prevail_p (symtab_node e) { ! if (!lto_symtab_symbol_p (e)) return false; /* The C++ frontend ends up neither setting TREE_STATIC nor *** lto_symtab_resolve_symbols (symtab_node *** 261,267 /* Always set e-node so that edges are updated to reflect decl merging. */ for (e = first; e; e = e-symbol.next_sharing_asm_name) ! 
if (symtab_real_symbol_p (e) (e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP || e-symbol.resolution == LDPR_PREVAILING_DEF)) --- 273,279 /* Always set e-node so that edges are updated to reflect decl merging. */ for (e = first; e; e = e-symbol.next_sharing_asm_name) ! if (lto_symtab_symbol_p (e) (e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP || e-symbol.resolution == LDPR_PREVAILING_DEF)) *** lto_symtab_resolve_symbols (symtab_node *** 275,281 { /* Assert it's the only one. */ for (e = prevailing-symbol.next_sharing_asm_name; e; e = e-symbol.next_sharing_asm_name) ! if (symtab_real_symbol_p (e) (e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP || e-symbol.resolution == LDPR_PREVAILING_DEF)) --- 287,293 { /* Assert it's the only one. */ for (e = prevailing-symbol.next_sharing_asm_name; e; e = e-symbol.next_sharing_asm_name) ! if (lto_symtab_symbol_p (e) (e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY || e-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP || e-symbol.resolution == LDPR_PREVAILING_DEF)) *** lto_symtab_resolve_symbols (symtab_node *** 310,317 /* Do a second round choosing one from the replaceable prevailing decls. */ for (e = first; e; e = e-symbol.next_sharing_asm_name) { ! if (!lto_symtab_resolve_can_prevail_p (e) ! || !symtab_real_symbol_p (e)) continue; /* Choose the first function that can prevail as prevailing. */ --- 322,328 /* Do a second round choosing one from the replaceable prevailing decls. */ for (e = first; e; e = e-symbol.next_sharing_asm_name) { ! if (!lto_symtab_resolve_can_prevail_p (e)) continue; /* Choose the first function that can prevail as prevailing. */ *** lto_symtab_merge_decls_2 (symtab_node fi *** 365,375 /* Try to merge each entry with the prevailing one. 
*/ for (e = prevailing-symbol.next_sharing_asm_name; e; e = e-symbol.next_sharing_asm_name) ! { ! if (!lto_symtab_merge (prevailing, e) ! !diagnosed_p) ! mismatches.safe_push (e-symbol.decl); ! } if (mismatches.is_empty ()) return; --- 376,387 /* Try to merge each entry with the prevailing one. */ for (e = prevailing-symbol.next_sharing_asm_name; e; e = e-symbol.next_sharing_asm_name) ! if (TREE_PUBLIC (e-symbol.decl)) ! { ! if (!lto_symtab_merge (prevailing, e) !
Re: [patch] update documentation for SEQUENCE
On Mon, Apr 8, 2013 at 11:30 AM, Richard Biener wrote: On Sun, Apr 7, 2013 at 12:04 AM, Steven Bosscher wrote: Hello, The existing documentation for SEQUENCE still states it is used for DEFINE_EXPAND sequences. I think I wasn't even hacking GCC when that practice was abandoned, and in the mean time some other uses of SEQUENCE have appeared in the compiler. So, a long-overdue documentation update. OK for trunk? Ok. Thanks, I'm committing this along with something else I noticed: NOTE_INSN_LOOP notes don't exist anymore, and NOTE_INSN_EH_REGION notes don't have NOTE_BLOCK_NUMBER anymore but do have so-far-undocumented NOTE_EH_HANDLER. @@ -3602,29 +3608,9 @@ of debugging information. @item NOTE_INSN_EH_REGION_BEG @itemx NOTE_INSN_EH_REGION_END These types of notes indicate the position of the beginning and end of a -level of scoping for exception handling. @code{NOTE_BLOCK_NUMBER} -identifies which @code{CODE_LABEL} or @code{note} of type -@code{NOTE_INSN_DELETED_LABEL} is associated with the given region. +level of scoping for exception handling. @code{NOTE_EH_HANDLER} +identifies which region is associated with these notes. -@findex NOTE_INSN_LOOP_BEG -@findex NOTE_INSN_LOOP_END -@item NOTE_INSN_LOOP_BEG -@itemx NOTE_INSN_LOOP_END -These types of notes indicate the position of the beginning and end -of a @code{while} or @code{for} loop. They enable the loop optimizer -to find loops quickly. - -@findex NOTE_INSN_LOOP_CONT -@item NOTE_INSN_LOOP_CONT -Appears at the place in a loop that @code{continue} statements jump to. - -@findex NOTE_INSN_LOOP_VTOP -@item NOTE_INSN_LOOP_VTOP -This note indicates the place in a loop where the exit test begins for -those loops in which the exit test has been duplicated. This position -becomes another virtual start of the loop when considering loop -invariants. - @findex NOTE_INSN_FUNCTION_BEG @item NOTE_INSN_FUNCTION_BEG Appears at the start of the function body, after the function
[patch] obvious: remove REG_EH_CONTEXT note
Remnants of the RTL inliner... Committed as obvious. * reg-notes.def (REG_EH_CONTEXT): Remove unused note. --- trunk/gcc/reg-notes.def 2013/04/08 19:36:43 197610 +++ trunk/gcc/reg-notes.def 2013/04/08 19:59:57 197611 @@ -172,11 +172,6 @@ the rest of the compiler as a CALL_INSN. */ REG_NOTE (CFA_FLUSH_QUEUE) -/* Indicates that REG holds the exception context for the function. - This context is shared by inline functions, so the code to acquire - the real exception context is delayed until after inlining. */ -REG_NOTE (EH_CONTEXT) - /* Indicates what exception region an INSN belongs in. This is used to indicate what region to which a call may throw. REGION 0 indicates that a call cannot throw at all. REGION -1 indicates
Track symbol names that are unique in a DSO
Hi, this patch adds a new symbol flag, UNIQUE_NAME. Its purpose is to disable renaming at LTO time when the symbol is already known to be unique in the whole resulting DSO. This happens for symbols that were previously global where we know from LTO plugin resolution data that they are not bound by the non-LTO world. I also made clone names unique. This needs more care: 1) when cloning at compilation time, one can produce two clones of the same name (like foo.sra.1) for static functions; 2) we make the assumption here that the namespace .clonetype.num is private to GCC. This is how things have worked since the introduction of WHOPR, but it is not documented. We may need to add those ugly __GLOBAL_XYZ manglings. I would like to handle 2) incrementally after some discussion with the plugin folks. The flag is currently write-only; I am going to use it in a later patch. Bootstrapped/regtested x86_64-linux, will commit it after we settle on the other changes that need the flag. Honza

	PR lto/54095
	* cgraph.c (cgraph_make_node_local_1): Set unique_name.
	* cgraph.h (symtab_node_base): Add unique_name.
	* lto-cgraph.c (lto_output_node, lto_output_varpool_node,
	input_overwrite_node, input_varpool_node): Stream unique_name.
	* cgraphclones.c (cgraph_create_virtual_clone,
	cgraph_function_versioning): Set unique_name.
	* ipa.c (function_and_variable_visibility): Set unique_name.
Index: cgraph.c === *** cgraph.c(revision 197551) --- cgraph.c(working copy) *** cgraph_make_node_local_1 (struct cgraph_ *** 1798,1803 --- 1800,1807 node-symbol.externally_visible = false; node-local.local = true; + node-symbol.unique_name = (node-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY + || node-symbol.resolution == LDPR_PREVAILING_DEF_IRONLY_EXP); node-symbol.resolution = LDPR_PREVAILING_DEF_IRONLY; gcc_assert (cgraph_function_body_availability (node) == AVAIL_LOCAL); } Index: cgraph.h === *** cgraph.h(revision 197551) --- cgraph.h(working copy) *** struct GTY(()) symtab_node_base *** 62,67 --- 62,69 /* Needed variables might become dead by optimization. This flag forces the variable to be output even if it appears dead otherwise. */ unsigned force_output : 1; + /* True when the name is known to be unique and thus it does not need mangling. */ + unsigned unique_name : 1; /* Ordering of all symtab entries. */ int order; Index: lto-cgraph.c === *** lto-cgraph.c(revision 197551) --- lto-cgraph.c(working copy) *** lto_output_node (struct lto_simple_outpu *** 468,473 --- 468,474 bp_pack_value (bp, node-local.can_change_signature, 1); bp_pack_value (bp, node-local.redefined_extern_inline, 1); bp_pack_value (bp, node-symbol.force_output, 1); + bp_pack_value (bp, node-symbol.unique_name, 1); bp_pack_value (bp, node-symbol.address_taken, 1); bp_pack_value (bp, node-abstract_and_needed, 1); bp_pack_value (bp, tag == LTO_symtab_analyzed_node *** lto_output_varpool_node (struct lto_simp *** 533,538 --- 534,540 bp = bitpack_create (ob-main_stream); bp_pack_value (bp, node-symbol.externally_visible, 1); bp_pack_value (bp, node-symbol.force_output, 1); + bp_pack_value (bp, node-symbol.unique_name, 1); bp_pack_value (bp, node-finalized, 1); bp_pack_value (bp, node-alias, 1); bp_pack_value (bp, node-alias_of != NULL, 1); *** input_overwrite_node (struct lto_file_de *** 886,891 --- 888,894 node-local.can_change_signature = bp_unpack_value (bp, 1); 
node-local.redefined_extern_inline = bp_unpack_value (bp, 1); node-symbol.force_output = bp_unpack_value (bp, 1); + node-symbol.unique_name = bp_unpack_value (bp, 1); node-symbol.address_taken = bp_unpack_value (bp, 1); node-abstract_and_needed = bp_unpack_value (bp, 1); node-symbol.used_from_other_partition = bp_unpack_value (bp, 1); *** input_varpool_node (struct lto_file_decl *** 1040,1045 --- 1043,1049 bp = streamer_read_bitpack (ib); node-symbol.externally_visible = bp_unpack_value (bp, 1); node-symbol.force_output = bp_unpack_value (bp, 1); + node-symbol.unique_name = bp_unpack_value (bp, 1); node-finalized = bp_unpack_value (bp, 1); node-alias = bp_unpack_value (bp, 1); non_null_aliasof = bp_unpack_value (bp, 1); Index: cgraphclones.c === *** cgraphclones.c (revision 197551) --- cgraphclones.c (working copy) *** cgraph_create_virtual_clone (struct cgra *** 324,329 --- 324,337
Make change_decl_assembler_name functional with inline clones
Hi, this patch makes change_decl_assembler_name to do the right thing with inline clones. My original plan was to remove inline clones from assembler_name_hash, but it hits the problem that we currently need to make them unique for purposes of LTO sreaming. It is not hard to walk the clone tree and update it. Later we can reorg streaming to not rely on uniqueness of symbol names of function bodies not associated with a real symbol and perhaps simplify this somewhat. Bootstrapped/regtested x86_64-linux, will commit it shortly. PR lto/54095 * symtab.c (insert_to_assembler_name_hash): Handle clones. (unlink_from_assembler_name_hash): Likewise. (symtab_prevail_in_asm_name_hash, symtab_register_node, symtab_unregister_node, symtab_initialize_asm_name_hash, change_decl_assembler_name): Update. Index: symtab.c === *** symtab.c(revision 197551) --- symtab.c(working copy) *** eq_assembler_name (const void *p1, const *** 102,108 /* Insert NODE to assembler name hash. */ static void ! insert_to_assembler_name_hash (symtab_node node) { if (is_a varpool_node (node) DECL_HARD_REGISTER (node-symbol.decl)) return; --- 102,108 /* Insert NODE to assembler name hash. */ static void ! insert_to_assembler_name_hash (symtab_node node, bool with_clones) { if (is_a varpool_node (node) DECL_HARD_REGISTER (node-symbol.decl)) return; *** insert_to_assembler_name_hash (symtab_no *** 111,116 --- 111,119 if (assembler_name_hash) { void **aslot; + struct cgraph_node *cnode; + tree decl = node-symbol.decl; + tree name = DECL_ASSEMBLER_NAME (node-symbol.decl); aslot = htab_find_slot_with_hash (assembler_name_hash, name, *** insert_to_assembler_name_hash (symtab_no *** 121,126 --- 124,136 if (*aslot != NULL) ((symtab_node)*aslot)-symbol.previous_sharing_asm_name = node; *aslot = node; + + /* Update also possible inline clones sharing a decl. 
*/ + cnode = dyn_cast cgraph_node (node); + if (cnode cnode-clones with_clones) + for (cnode = cnode-clones; cnode; cnode = cnode-next_sibling_clone) + if (cnode-symbol.decl == decl) + insert_to_assembler_name_hash ((symtab_node) cnode, true); } } *** insert_to_assembler_name_hash (symtab_no *** 128,137 /* Remove NODE from assembler name hash. */ static void ! unlink_from_assembler_name_hash (symtab_node node) { if (assembler_name_hash) { if (node-symbol.next_sharing_asm_name) node-symbol.next_sharing_asm_name-symbol.previous_sharing_asm_name = node-symbol.previous_sharing_asm_name; --- 138,150 /* Remove NODE from assembler name hash. */ static void ! unlink_from_assembler_name_hash (symtab_node node, bool with_clones) { if (assembler_name_hash) { + struct cgraph_node *cnode; + tree decl = node-symbol.decl; + if (node-symbol.next_sharing_asm_name) node-symbol.next_sharing_asm_name-symbol.previous_sharing_asm_name = node-symbol.previous_sharing_asm_name; *** unlink_from_assembler_name_hash (symtab_ *** 155,160 --- 168,180 } node-symbol.next_sharing_asm_name = NULL; node-symbol.previous_sharing_asm_name = NULL; + + /* Update also possible inline clones sharing a decl. */ + cnode = dyn_cast cgraph_node (node); + if (cnode cnode-clones with_clones) + for (cnode = cnode-clones; cnode; cnode = cnode-next_sibling_clone) + if (cnode-symbol.decl == decl) + unlink_from_assembler_name_hash ((symtab_node) cnode, true); } } *** unlink_from_assembler_name_hash (symtab_ *** 163,170 void symtab_prevail_in_asm_name_hash (symtab_node node) { ! unlink_from_assembler_name_hash (node); ! insert_to_assembler_name_hash (node); } --- 183,190 void symtab_prevail_in_asm_name_hash (symtab_node node) { ! unlink_from_assembler_name_hash (node, false); ! insert_to_assembler_name_hash (node, false); } *** symtab_register_node (symtab_node node) *** 196,202 /* Be sure to do this last; C++ FE might create new nodes via DECL_ASSEMBLER_NAME langhook! */ ! 
insert_to_assembler_name_hash (node); } /* Make NODE to be the one symtab hash is pointing to. Used when reshaping tree --- 216,222 /* Be sure to do this last; C++ FE might create new nodes via DECL_ASSEMBLER_NAME langhook! */ ! insert_to_assembler_name_hash (node, false); } /* Make NODE to be the one symtab hash is pointing to. Used when reshaping tree *** symtab_unregister_node (symtab_node node *** 259,265 else *slot =
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/13, Kenneth Zadeck zad...@naturalbridge.com wrote: The other problem, which I invite you to use the full power of your C++ sorcery on, is the one where defining an operator so that wide-int + unsigned hwi is either rejected or properly zero extended. If you can do this, I will go along with your suggestion that the internal rep should be sign extended. Saying that constants are always sign extended seems ok, but there are a huge number of places where we convert unsigned hwis as the second operand and I do not want that to be a trap. I went thru a round of this, where I did not post the patch because I could not make this work. And the number of places where you want to use an hwi as the second operand dwarfs the number of places where you want to use a small integer constant.

You can use overloading, as in the following, which actually ignores handling the sign in the representation.

class number {
  unsigned int rep1;
  int representation;
public:
  number(int arg) : representation(arg) {}
  number(unsigned int arg) : representation(arg) {}
  friend number operator+(number, int);
  friend number operator+(number, unsigned int);
  friend number operator+(int, number);
  friend number operator+(unsigned int, number);
};
number operator+(number n, int si) { return n.representation + si; }
number operator+(number n, unsigned int ui) { return n.representation + ui; }
number operator+(int si, number n) { return n.representation + si; }
number operator+(unsigned int ui, number n) { return n.representation + ui; }

If the argument type is of a template type parameter, then you can test the template type via

  if (std::is_signed<T>::value)
    // sign extend
  else
    // zero extend

See http://www.cplusplus.com/reference/type_traits/is_signed/. If you want to handle non-builtin types that are signed or unsigned, then you need to add a specialization for is_signed. -- Lawrence Crowl
Re: [i386] Replace builtins with vector extensions
On Sun, 7 Apr 2013, Marc Glisse wrote:

extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_slli_epi16 (__m128i __A, int __B)
{
-  return (__m128i)__builtin_ia32_psllwi128 ((__v8hi)__A, __B);
+  return (__m128i) ((__v8hi)__A << __B);
}

Actually, I believe I have to keep using the builtins for shifts, because the intrinsics have well-defined behavior for large __B whereas << and >> don't. -- Marc Glisse
Re: RFC: add some static probes to libstdc++
Jonathan == Jonathan Wakely jwakely@gmail.com writes: Jonathan On 2 April 2013 16:39, Marc Glisse wrote: On Tue, 2 Apr 2013, Jonathan Wakely wrote: Should we update the prerequisites documentation to say that if Systemtap is installed it needs to be at least version X? I thought you were going to suggest enhancing the configure test so it fails on old systemtap (detects it as absent). Jonathan Ah yes, that's a much better idea! Sorry about the delay on this. I've been away. I will try to write a fix this week. Tom
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/13, Richard Biener richard.guent...@gmail.com wrote: I advocate the infinite precision signed representation as one solution to avoid the issues that come up with your implementation (as I currently have access to), which has a representation with N bits of precision encoded with M <= N bits and no sign information. That obviously leaves operations on numbers of that representation with differing N undefined. You define it by having coded the operations, which as far as I can see simply assume N is equal for any two operands and that the effective sign for extending the M-bit encoding to the common N-bit precision is available. A thorough specification of both the encoding scheme and the operation semantics is missing. I can side-step both of these issues nicely by simply using an infinite precision signed representation and requiring the client to explicitly truncate / extend to a specific precision when required. I also leave open the possibility to have the _encoding_ be always the same as an infinite precision signed representation but to always require an explicitly specified target precision for each operation (which rules out the use of operator overloading).

For efficiency, the machine representation of an infinite precision number should allow for a compact one-word representation.

class infinite {
  int length;
  union representation {
    int inside_word;
    int *outside_words;
  } field;
public:
  int mod_one_word() {
    if (length == 1)
      return field.inside_word;
    else
      return field.outside_words[0];
  }
};

Also for efficiency, you want to know the modulus at the time you do the last normal operation on it, not as a subsequent operation. Citing your example:

  8 * 10 / 4

and transforming it slightly into a commonly used pattern:

  (byte-size * 8 + bit-size) / 8

then I argue that what people want here is this carried out in _infinite_ precision! But what people want isn't really relevant; what is relevant is what the language and/or compatibility requires.
Ideally, gcc should accurately represent languages with both finite size and infinite size. Even if byte-size happens to come from a sizetype TREE_INT_CST with 64bit precision. So either choice - having a fixed-precision representation or an infinite-precision representation - can and will lead to errors done by the programmer. And as you can easily build a finite precision wrapper around an infinite precision implementation but not the other way around it's obvious to me what the implementation should provide. IIUC, the issue here is not the logical chain of implementation, but the interface that is most helpful to the programmers in getting to performant, correct code. I expect we need the infinite precision forms, but also that having more concise coding for fixed-precision would be helpful. For mixed operations, all the languages that I know of promote smaller operands to larger operands, so I think a reasonable definition is possible here. -- Lawrence Crowl
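The point that a finite-precision wrapper can be built around an infinite-precision implementation (but not the other way around) can be sketched in a few lines. The following Python fragment is illustrative only (the names are not from the wide-int patches; Python's native ints are already arbitrary precision, which stands in for the infinite-precision layer):

```python
# Sketch: fixed-precision two's-complement operations built on top of
# an arbitrary-precision integer type (Python ints). Illustrative
# names only; not taken from the wide-int patches.

def truncate(value, bits):
    """Reduce an infinite-precision value to an N-bit signed result."""
    mask = (1 << bits) - 1
    value &= mask                # keep the low N bits
    if value >> (bits - 1):      # sign bit set -> negative in two's complement
        value -= 1 << bits
    return value

def fixed_add(a, b, bits):
    """N-bit wrapping addition, expressed via infinite precision."""
    return truncate(a + b, bits)

def fixed_mul(a, b, bits):
    """N-bit wrapping multiplication."""
    return truncate(a * b, bits)

# 8-bit examples: 127 + 1 wraps to -128, 16 * 16 wraps to 0.
print(fixed_add(127, 1, 8))   # -128
print(fixed_mul(16, 16, 8))   # 0
```

The converse direction — recovering an infinite-precision result from operations that have already wrapped — is impossible in general, which is the asymmetry the argument rests on.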
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/13, Robert Dewar de...@adacore.com wrote: On 4/8/2013 10:26 AM, Kenneth Zadeck wrote: On 04/08/2013 10:12 AM, Robert Dewar wrote: On 4/8/2013 9:58 AM, Kenneth Zadeck wrote: yes but the relevant question for the not officially static integer constants is in what precision are those operations to be performed in? I assume that you choose gcc types for these operations and you expect the math to be done within that type, i.e. exactly the way you expect the machine to perform. As I explained in an earlier message, *within* a single expression, we are free to use higher precision, and we provide modes that allow this up to and including the use of infinite precision. That applies not just to constant expressions but to all expressions. My confusion is what you mean by we? Do you mean we the writer of the program, we the person invoking the compiler by the use of command line options, or we, your company's implementation of Ada? Sorry, bad usage. The gcc implementation of Ada allows the user to specify by pragmas how intermediate overflow is handled. Correct me if I'm wrong, but the Ada standard doesn't require any particular maximum evaluation precision, but only that you get an exception if the values exceed the chosen maximum. My interpretation of your first email was that it was possible for the programmer to do something equivalent to adding attributes surrounding a block in the program to control the precision and overflow detection of the expressions in the block. And if this is so, then by the time the expression is seen by the middle end of gcc, those attributes will have been converted into tree code that will evaluate the code in a well defined way by both the optimization passes and the target machine. Yes, that's a correct understanding. In essence, you have moved some of the optimization from the back end to the front end. Correct? -- Lawrence Crowl
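The practical difference between the intermediate-overflow strategies being discussed can be shown with a small sketch (hypothetical helper names; this is not GNAT's implementation): evaluating (a + b) / 2 with every intermediate wrapped to the 8-bit operand type gives a different answer than keeping intermediates in unbounded precision and truncating only at the end.

```python
def wrap8(v):
    """Truncate an intermediate result to 8-bit signed two's complement."""
    v &= 0xff
    return v - 0x100 if v & 0x80 else v

a, b = 100, 100

# Evaluate (a + b) / 2 with every intermediate wrapped to 8 bits...
wrapped = wrap8(wrap8(a + b) // 2)

# ...and with intermediates kept in unbounded precision, truncating
# only the final result.
infinite = wrap8((a + b) // 2)

print(wrapped)   # -28: the sum wrapped to -56 before the division
print(infinite)  # 100: the mathematically expected answer
```

Both outcomes fit in the 8-bit result type, so the choice of intermediate precision is observable to the programmer, which is exactly why the pragma-controlled modes exist.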
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 5:12 PM, Lawrence Crowl wrote: (BTW, you *really* don't need to quote entire messages, I find it rather redundant for the entire thread to be in every message, we all have thread following mail readers!) Correct me if I'm wrong, but the Ada standard doesn't require any particular maximum evaluation precision, but only that you get an exception if the values exceed the chosen maximum. Right, that's at run-time, at compile-time for static expressions, infinite precision is required. But at run-time, all three of the modes we provide are standard conforming. In essence, you have moved some of the optimization from the back end to the front end. Correct? Sorry, I don't quite understand that. If you are saying that the back end could handle this widening for intermediate values, sure it could, this is the kind of thing that can be done at various different places.
Re: [patch] Hash table changes from cxx-conversion branch
Ping? On 3/31/13, Lawrence Crowl cr...@googlers.com wrote: On 3/28/13, Richard Biener richard.guent...@gmail.com wrote: On Mar 27, 2013 Lawrence Crowl cr...@googlers.com wrote: On 3/27/13, Richard Biener richard.guent...@gmail.com wrote: On Mar 23, 2013 Lawrence Crowl cr...@googlers.com wrote: This patch is a consolidation of the hash_table patches to the cxx-conversion branch. Update various hash tables from htab_t to hash_table. Modify types and calls to match. Ugh. Can you split it up somewhat ... like split target bits away at least? Targets may prefer to keep the old hashes for ease of branch maintenance. I will do that. * tree-ssa-live.c'var_map_base_init::tree_to_index New struct tree_int_map_hasher. I think this wants to be generalized - we have the common tree_map/tree_decl_map and tree_int_map maps in tree.h - those (and its users) should be tackled in a separate patch by providing common hashtable traits implementations. I will investigate for a separate patch. Remove unused: htab_t scop::original_pddrs SCOP_ORIGINAL_PDDRS Remove unused: insert_loop_close_phis insert_guard_phis debug_ivtype_map ivtype_map_elt_info new_ivtype_map_elt Unused function/type removal are obvious changes. Remove unused: dse.c bitmap clear_alias_sets dse.c bitmap disqualified_clear_alias_sets dse.c alloc_pool clear_alias_mode_pool dse.c dse_step2_spill dse.c dse_step5_spill graphds.h htab_t graph::indices See above. It wasn't obvious that the functions could be removed. :-) Are you saying you don't want these notations in the description? No, I was saying that removal of unused functions / types should be committed separately and do not need approval as they are obvious. If they are not obvious (I didn't look at that patch part), then posting separately still helps ;) I've split out the removals to separate patches. The remaining work is in two independent pieces. The changes within the config directory and the changes outside that directory. 
The descriptions and patch are attached compressed due to mailer size issues. Okay for trunk? -- Lawrence Crowl
Re: Comments on the suggestion to use infinite precision math for wide int.
In some sense you have to think in terms of three worlds: 1) what you call compile-time static expressions is one world which in gcc is almost always done by the front ends. 2) the second world is what the optimizers can do. This is not compile-time static expressions because that is what the front end has already done. 3) there is run time. My view on this is that optimization is just doing what is normally done at run time but doing it early. From that point of view, we are if not required, morally obligated to do things in the same way that the hardware would have done them. This is why I am so against richi on wanting to do infinite precision. By the time the middle or the back end sees the representation, all of the things that are allowed to be done in infinite precision have already been done. What we are left with is a (mostly) strongly typed language that pretty much says exactly what must be done. Anything that we do in the middle end or back ends in infinite precision will only surprise the programmer and make them want to use llvm. Kenny On 04/08/2013 05:36 PM, Robert Dewar wrote: On 4/8/2013 5:12 PM, Lawrence Crowl wrote: (BTW, you *really* don't need to quote entire messages, I find it rather redundant for the entire thread to be in every message, we all have thread following mail readers!) Correct me if I'm wrong, but the Ada standard doesn't require any particular maximum evaluation precision, but only that you get an exception if the values exceed the chosen maximum. Right, that's at run-time, at compile-time for static expressions, infinite precision is required. But at run-time, all three of the modes we provide are standard conforming. In essence, you have moved some of the optimization from the back end to the front end. Correct? Sorry, I don't quite understand that. If you are saying that the back end could handle this widening for intermediate values, sure it could, this is the kind of thing that can be done at various different places.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 5:46 PM, Kenneth Zadeck wrote: In some sense you have to think in terms of three worlds: 1) what you call compile-time static expressions is one world which in gcc is almost always done by the front ends. 2) the second world is what the optimizers can do. This is not compile-time static expressions because that is what the front end has already done. 3) there is run time. My view on this is that optimization is just doing what is normally done at run time but doing it early. From that point of view, we are if not required, morally obligated to do things in the same way that the hardware would have done them. This is why I am so against richi on wanting to do infinite precision. By the time the middle or the back end sees the representation, all of the things that are allowed to be done in infinite precision have already been done. What we are left with is a (mostly) strongly typed language that pretty much says exactly what must be done. Anything that we do in the middle end or back ends in infinite precision will only surprise the programmer and make them want to use llvm. That may be so in C, in Ada it would be perfectly reasonable to use infinite precision for intermediate results in some cases, since the language standard specifically encourages this approach.
[google gcc-4_7] offline profile merge tool (issue8508048)
Hi, This is an offline profile merge program. Usage: profile_merge.py [options] arg1 arg2 ... Options: -h, --help show this help message and exit -w MULTIPLIERS, --multipliers=MULTIPLIERS Comma separated list of multipliers to be applied for each corresponding profile. -o OUTPUT_PROFILE, --output=OUTPUT_PROFILE Output directory or zip file to dump the merged profile. Default output is profile-merged.zip. Arguments: Comma separated list of input directories or zip files that contain profile data to merge. Histogram is recomputed (i.e. precise). Module grouping information in LIPO is an approximation. Thanks, -Rong 2013-04-08 Rong Xu x...@google.com * contrib/profile_merge.py: An offline profile merge tool. Index: contrib/profile_merge.py === --- contrib/profile_merge.py(revision 0) +++ contrib/profile_merge.py(revision 0) @@ -0,0 +1,1301 @@ +#!/usr/bin/python2.7 +# +# Copyright 2013 Google Inc. All Rights Reserved. + +"""Merge two or more gcda profiles.""" + + +__author__ = 'Seongbae Park, Rong Xu' +__author_email__ = 'sp...@google.com, x...@google.com' + +import array +from optparse import OptionGroup +from optparse import OptionParser +import os +import struct +import zipfile + +new_histogram = None + + +class Error(Exception): + """Exception class for profile module.""" + + +def ReadAllAndClose(path): + """Return the entire byte content of the specified file. + + Args: +path: The path to the file to be opened and read. + + Returns: +The byte sequence of the content of the file. + """ + data_file = open(path, 'rb') + data = data_file.read() + data_file.close() + return data + + +def MergeCounters(objs, index, multipliers): + """Accumulate the counter at index from all counter objs.""" + val = 0 + for j in xrange(len(objs)): +val += multipliers[j] * objs[j].counters[index] + return val + + +class DataObject(object): + """Base class for various datum in GCDA/GCNO file.""" + + def __init__(self, tag): +self.tag = tag + + +class Function(DataObject): + """Function and its counters. 
+ + Attributes: +length: Length of the data on the disk +ident: Ident field +line_checksum: Checksum of the line number +cfg_checksum: Checksum of the control flow graph +counters: All counters associated with the function +file: The name of the file the function is defined in. Optional. +line: The line number the function is defined at. Optional. + + Function object contains other counter objects and block/arc/line objects. + """ + + def __init__(self, reader, tag, n_words): +"""Read function record information from a gcda/gcno file. + +Args: + reader: gcda/gcno file. + tag: function tag. + n_words: length of function record in units of 4 bytes. +""" +DataObject.__init__(self, tag) +self.length = n_words +self.counters = [] + +if reader: + pos = reader.pos + self.ident = reader.ReadWord() + self.line_checksum = reader.ReadWord() + self.cfg_checksum = reader.ReadWord() + + # Function name string is in gcno files, but not + # in gcda files. Here we make string reading optional. + if (reader.pos - pos) < n_words: +reader.ReadStr() + + if (reader.pos - pos) < n_words: +self.file = reader.ReadStr() +self.line_number = reader.ReadWord() + else: +self.file = '' +self.line_number = 0 +else: + self.ident = 0 + self.line_checksum = 0 + self.cfg_checksum = 0 + self.file = None + self.line_number = 0 + + def Write(self, writer): +"""Write out the function.""" +writer.WriteWord(self.tag) +writer.WriteWord(self.length) +writer.WriteWord(self.ident) +writer.WriteWord(self.line_checksum) +writer.WriteWord(self.cfg_checksum) +for c in self.counters: + c.Write(writer) + + def EntryCount(self): +"""Return the number of times the function is called.""" +return self.ArcCounters().counters[0] + + def Merge(self, others, multipliers): +"""Merge all functions in others into self. + +Args: + others: A sequence of Function objects + multipliers: A sequence of integers to be multiplied during merging. 
+""" +for o in others: + assert self.ident == o.ident + assert self.line_checksum == o.line_checksum + assert self.cfg_checksum == o.cfg_checksum + +for i in xrange(len(self.counters)): + self.counters[i].Merge([o.counters[i] for o in others], multipliers) + + def Print(self): +"""Print all the attributes in full detail.""" +print 'function: ident %d length %d line_chksum %x cfg_chksum %x' % ( +self.ident, self.length, +self.line_checksum, self.cfg_checksum) +if self.file: + print 'file: %s' % self.file + print 'line_number: %d' % self.line_number +for c in
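The weighted counter merge at the heart of the tool can be sketched standalone. The function and data names below are illustrative (not taken from profile_merge.py); the computation mirrors the MergeCounters loop in the patch, applying a per-profile multiplier to each counter before summing:

```python
def merge_counters(counter_lists, multipliers):
    """Weighted elementwise merge of per-function counter arrays.

    Each profile contributes multiplier * counter for every counter
    slot, matching the MergeCounters accumulation in the patch.
    """
    assert all(len(c) == len(counter_lists[0]) for c in counter_lists)
    merged = []
    for index in range(len(counter_lists[0])):
        merged.append(sum(m * c[index]
                          for m, c in zip(multipliers, counter_lists)))
    return merged

# Two profiles of the same function; the second is weighted 3x, e.g. to
# approximate three runs of the second workload.
run_a = [10, 4, 0]
run_b = [2, 5, 1]
print(merge_counters([run_a, run_b], [1, 3]))  # [16, 19, 3]
```

This is why the -w/--multipliers option takes one value per input profile: the weights scale each profile's contribution before the counters are added.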
Re: [google gcc-4_7] offline profile merge tool (issue8508048)
The copyright header is wrong. Please use the standard one for GCC. David On Mon, Apr 8, 2013 at 2:57 PM, Rong Xu x...@google.com wrote: Hi, This is an offline profile merge program. Usage: profile_merge.py [options] arg1 arg2 ... Options: -h, --help show this help message and exit -w MULTIPLIERS, --multipliers=MULTIPLIERS Comma separated list of multipliers to be applied for each corresponding profile. -o OUTPUT_PROFILE, --output=OUTPUT_PROFILE Output directory or zip file to dump the merged profile. Default output is profile-merged.zip. Arguments: Comma separated list of input directories or zip files that contain profile data to merge. Histogram is recomputed (i.e. precise). Module grouping information in LIPO is an approximation. Thanks, -Rong 2013-04-08 Rong Xu x...@google.com * contrib/profile_merge.py: An offline profile merge tool.
Re: Comments on the suggestion to use infinite precision math for wide int.
On Apr 8, 2013, at 2:48 PM, Robert Dewar de...@adacore.com wrote: That may be so in C, in Ada it would be perfectly reasonable to use infinite precision for intermediate results in some cases, since the language standard specifically encourages this approach. gcc lacks an infinite precision plus operator?! :-)
[google gcc-4_7] offline profile merge (patchset 2) (issue8508048)
Revised copyright info. -Rong 2013-04-08 Rong Xu x...@google.com * contrib/profile_merge.py: An offline profile merge tool. Index: contrib/profile_merge.py === --- contrib/profile_merge.py(revision 0) +++ contrib/profile_merge.py(revision 0) @@ -0,0 +1,1320 @@ +#!/usr/bin/python2.7 +# +#Copyright (C) 2013 +#Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# http://www.gnu.org/licenses/. +# + + +"""Merge two or more gcda profiles.""" + + +__author__ = 'Seongbae Park, Rong Xu' +__author_email__ = 'sp...@google.com, x...@google.com' + +import array +from optparse import OptionGroup +from optparse import OptionParser +import os +import struct +import zipfile + +new_histogram = None + + +class Error(Exception): + """Exception class for profile module.""" + + +def ReadAllAndClose(path): + """Return the entire byte content of the specified file. + + Args: +path: The path to the file to be opened and read. + + Returns: +The byte sequence of the content of the file. + """ + data_file = open(path, 'rb') + data = data_file.read() + data_file.close() + return data + + +def MergeCounters(objs, index, multipliers): + """Accumulate the counter at index from all counter objs.""" + val = 0 + for j in xrange(len(objs)): +val += multipliers[j] * objs[j].counters[index] + return val + + +class DataObject(object): + """Base class for various datum in GCDA/GCNO file.""" 
+ + def __init__(self, tag): +self.tag = tag + + +class Function(DataObject): + """Function and its counters. + + Attributes: +length: Length of the data on the disk +ident: Ident field +line_checksum: Checksum of the line number +cfg_checksum: Checksum of the control flow graph +counters: All counters associated with the function +file: The name of the file the function is defined in. Optional. +line: The line number the function is defined at. Optional. + + Function object contains other counter objects and block/arc/line objects. + """ + + def __init__(self, reader, tag, n_words): +"""Read function record information from a gcda/gcno file. + +Args: + reader: gcda/gcno file. + tag: function tag. + n_words: length of function record in units of 4 bytes. +""" +DataObject.__init__(self, tag) +self.length = n_words +self.counters = [] + +if reader: + pos = reader.pos + self.ident = reader.ReadWord() + self.line_checksum = reader.ReadWord() + self.cfg_checksum = reader.ReadWord() + + # Function name string is in gcno files, but not + # in gcda files. Here we make string reading optional. + if (reader.pos - pos) < n_words: +reader.ReadStr() + + if (reader.pos - pos) < n_words: +self.file = reader.ReadStr() +self.line_number = reader.ReadWord() + else: +self.file = '' +self.line_number = 0 +else: + self.ident = 0 + self.line_checksum = 0 + self.cfg_checksum = 0 + self.file = None + self.line_number = 0 + + def Write(self, writer): +"""Write out the function.""" +writer.WriteWord(self.tag) +writer.WriteWord(self.length) +writer.WriteWord(self.ident) +writer.WriteWord(self.line_checksum) +writer.WriteWord(self.cfg_checksum) +for c in self.counters: + c.Write(writer) + + def EntryCount(self): +"""Return the number of times the function is called.""" +return self.ArcCounters().counters[0] + + def Merge(self, others, multipliers): +"""Merge all functions in others into self. + +Args: + others: A sequence of Function objects + multipliers: A sequence of integers to be multiplied during merging. 
+""" +for o in others: + assert self.ident == o.ident + assert self.line_checksum == o.line_checksum + assert self.cfg_checksum == o.cfg_checksum + +for i in xrange(len(self.counters)): + self.counters[i].Merge([o.counters[i] for o in others], multipliers) + + def Print(self): +"""Print all the attributes in full detail.""" +print 'function: ident %d length %d line_chksum %x cfg_chksum %x' % ( +self.ident, self.length, +self.line_checksum, self.cfg_checksum) +if self.file: + print 'file: %s' % self.file + print 'line_number: %d' % self.line_number +for c in self.counters: + c.Print() + + def
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 6:34 PM, Mike Stump wrote: On Apr 8, 2013, at 2:48 PM, Robert Dewar de...@adacore.com wrote: That may be so in C, in Ada it would be perfectly reasonable to use infinite precision for intermediate results in some cases, since the language standard specifically encourages this approach. gcc lacks an infinite precision plus operator?! :-) Right, that's why we do everything in the front end in the case of Ada. But it would be perfectly reasonable for the back end to do this substitution.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 04/08/2013 06:45 PM, Robert Dewar wrote: On 4/8/2013 6:34 PM, Mike Stump wrote: On Apr 8, 2013, at 2:48 PM, Robert Dewar de...@adacore.com wrote: That may be so in C, in Ada it would be perfectly reasonable to use infinite precision for intermediate results in some cases, since the language standard specifically encourages this approach. gcc lacks an infinite precision plus operator?! :-) Right, that's why we do everything in the front end in the case of Ada. But it would be perfectly reasonable for the back end to do this substitution. but there is no way in the current tree language to convey which ones you can and which ones you cannot.
Re: Comments on the suggestion to use infinite precision math for wide int.
On 4/8/2013 7:46 PM, Kenneth Zadeck wrote: On 04/08/2013 06:45 PM, Robert Dewar wrote: On 4/8/2013 6:34 PM, Mike Stump wrote: On Apr 8, 2013, at 2:48 PM, Robert Dewar de...@adacore.com wrote: That may be so in C, in Ada it would be perfectly reasonable to use infinite precision for intermediate results in some cases, since the language standard specifically encourages this approach. gcc lacks an infinite precision plus operator?! :-) Right, that's why we do everything in the front end in the case of Ada. But it would be perfectly reasonable for the back end to do this substitution. but there is no way in the current tree language to convey which ones you can and which ones you cannot. Well the back end has all the information to figure this out I think! But anyway, for Ada, the current situation is just fine, and has the advantage that the -gnatG expanded code listing clearly shows in Ada source form, what is going on.
Re: [PATCH, updated] Vtable pointer verification, C++ front end changes (patch 1 of 3)
Hi, sorry it has taken me so long to get back to this. Hopefully we can wrap it up quickly now that we're back in stage 1. On 02/25/2013 02:24 PM, Caroline Tice wrote: -CXX_FOR_TARGET='$$r/$(HOST_SUBDIR)/gcc/xg++ -B$$r/$(HOST_SUBDIR)/gcc/ -nostdinc++ `if test -f $$r/$(TARGET_SUBDIR) /libstdc++-v3/scripts/testsuite_flags; then $(SHELL) $$r/$(TARGET_SUBDIR)/libstdc++-v3/scripts/testsuite_flags --build- includes; else echo -funconfigured-libstdc++-v3 ; fi` -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src -L$$r/$(TARGET_SUBDIR)/li bstdc++-v3/src/.libs' +CXX_FOR_TARGET='$$r/$(HOST_SUBDIR)/gcc/xg++ -B$$r/$(HOST_SUBDIR)/gcc/ -nostdinc++ `if test -f $$r/$(TARGET_SUBDIR) /libstdc++-v3/scripts/testsuite_flags; then $(SHELL) $$r/$(TARGET_SUBDIR)/libstdc++-v3/scripts/testsuite_flags --build- includes; else echo -funconfigured-libstdc++-v3 ; fi` -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src -L$$r/$(TARGET_SUBDIR)/li bstdc++-v3/src/.libs -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs' You shouldn't need this, since libstdc++ includes libsupc++. And if you did need to do it, it would need to be in configure.ac or it will be discarded by the next autoconf. + information aboui which vtable will actually be emitted. */ about +vtv_finish_verification_constructor_init_function (tree function_body) +{ + tree fn; + + finish_compound_stmt (function_body); + fn = finish_function (0); + DECL_STATIC_CONSTRUCTOR (fn) = 1; + decl_init_priority_insert (fn, MAX_RESERVED_INIT_PRIORITY - 1); Why did you stop using finish_objects? If it was to be able to return the function, you can get that from current_function_decl before calling finish_objects. Index: gcc/cp/g++spec.c Changes to g++spec.c only affect the g++ driver, not the gcc driver. Are you sure this is what you want? Can't you handle this stuff directly in the specs like address sanitizer does? I haven't seen a response to this comment. 
+ vtv_rts.cc \ + vtv_malloc.cc \ + vtv_utils.cc It seems to me that this code belongs in a separate library like libsanitizer, not in libstdc++. Or this one. - switch_to_section (sect); + if (sect->named.name + && (strcmp (sect->named.name, ".vtable_map_vars") == 0)) + { +#if defined (OBJECT_FORMAT_ELF) + targetm.asm_out.named_section (sect->named.name, +sect->named.common.flags +| SECTION_LINKONCE, +DECL_NAME (decl)); + in_section = sect; +#else + switch_to_section (sect); +#endif +} + else +switch_to_section (sect); + if (strcmp (name, ".vtable_map_vars") == 0) + flags |= SECTION_LINKONCE; These changes should not be necessary. Just set DECL_ONE_ONLY on the vtable map variables. I believe this change was necessary so that each vtable map variable would have its own comdat name and be in its own comdat group...but I will revisit this and see if we still need it. What did you find? Perhaps you need to make sure that the map variables are getting passed to comdat_linkage at some point, such as here in vtable_find_or_create_map_decl: + DECL_SECTION_NAME (var_decl) = build_string (strlen (sect_name), + sect_name); + DECL_HAS_IMPLICIT_SECTION_NAME_P (var_decl) = true; + DECL_COMDAT_GROUP (var_decl) = get_identifier (var_name); Here comdat_linkage (var_decl) could replace these three lines and I believe make the above varasm change unnecessary. +/* This function adds classes we are interested in to a list of + classes that is saved during pre-compiled header generation. ... +/* This function goes through the list of classes we saved before the + pre-compiled header generation and calls vtv_save_base_class_info + on each one, to build up our class hierarchy data structure. */ These functions apply to non-PCH compiles as well; I find the mention of PCH here confusing. 
+ tree void_ptr_type = build_pointer_type (void_type_node); + tree const_char_ptr_type = build_pointer_type + (build_qualified_type (char_type_node, + TYPE_QUAL_CONST)); These are already built, as ptr_type_node and const_string_type_node. + arg_types = chainon (arg_types, build_tree_list (NULL_TREE, void_type_node)); And you can use void_list_node instead of building a new void list. + arg_types = build_tree_list (NULL_TREE, build_pointer_type (void_ptr_type)); + arg_types = chainon (arg_types, build_tree_list (NULL_TREE, + const_ptr_type_node)); + arg_types = chainon (arg_types, build_tree_list (NULL_TREE, + size_type_node)); + arg_types = chainon (arg_types, build_tree_list (NULL_TREE, void_type_node)); + +
[RFA][PATCH] Improve VRP of COND_EXPR_CONDs -- v2
This incorporates the concrete suggestions from Steven & Richi -- it doesn't do any refactoring of the VRP code. There's still stuff I'm looking at that might directly lead to some refactoring. In the mean time I'm submitting the obvious small improvements. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? Jeff commit d6d1e36561b9022bbcdf157a886895f5bb0ef2ae Author: Jeff Law l...@redhat.com Date: Sat Apr 6 06:46:58 2013 -0600 * tree-vrp.c (simplify_cond_using_ranges): Simplify test of boolean when the boolean was created by converting a wider object which had a boolean range. * gcc.dg/tree-ssa/vrp87.c: New test diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6ee7d9c..110f61e 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -16,8 +16,14 @@ 2013-04-08 Jeff Law l...@redhat.com + * tree-vrp.c (simplify_cond_using_ranges): Simplify test of boolean + when the boolean was created by converting a wider object which + had a boolean range. + +2013-04-08 Jeff Law l...@redhat.com + * gimple.c (canonicalize_cond_expr_cond): Rewrite x ^ y into x != y. 
- + 2013-04-08 Richard Biener rguent...@suse.de * gimple-pretty-print.c (debug_gimple_stmt): Do not print diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c new file mode 100644 index 000..7feff81 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp87.c @@ -0,0 +1,81 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp2-details -fdump-tree-cddce2-details" } */ + +struct bitmap_head_def; +typedef struct bitmap_head_def *bitmap; +typedef const struct bitmap_head_def *const_bitmap; + + +typedef unsigned long BITMAP_WORD; +typedef struct bitmap_element_def +{ + struct bitmap_element_def *next; + unsigned int indx; + BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))]; +} bitmap_element; + + + + + + +typedef struct bitmap_head_def +{ + bitmap_element *first; + +} bitmap_head; + + + +static __inline__ unsigned char +bitmap_elt_ior (bitmap dst, bitmap_element * dst_elt, + bitmap_element * dst_prev, const bitmap_element * a_elt, + const bitmap_element * b_elt, unsigned char changed) +{ + + if (a_elt) +{ + + if (!changed && dst_elt) + { + changed = 1; + } +} + else +{ + changed = 1; +} + return changed; +} + +unsigned char +bitmap_ior_into (bitmap a, const_bitmap b) +{ + bitmap_element *a_elt = a->first; + const bitmap_element *b_elt = b->first; + bitmap_element *a_prev = ((void *) 0); + unsigned char changed = 0; + + while (b_elt) +{ + + if (!a_elt || a_elt->indx == b_elt->indx) + changed = bitmap_elt_ior (a, a_elt, a_prev, a_elt, b_elt, changed); + else if (a_elt->indx < b_elt->indx) + changed = 1; + b_elt = b_elt->next; + + +} + + return changed; +} + +/* Verify that VRP simplified an if statement. */ +/* { dg-final { scan-tree-dump "Folded into: if.*" "vrp2"} } */ +/* Verify that DCE after VRP2 eliminates a dead conversion + to a (_Bool). 
*/ +/* { dg-final { scan-tree-dump "Deleting.*_Bool.*;" "cddce2" } } */ +/* { dg-final { cleanup-tree-dump "vrp2" } } */ +/* { dg-final { cleanup-tree-dump "cddce2" } } */ + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 250a506..4520c89 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -8584,6 +8584,45 @@ simplify_cond_using_ranges (gimple stmt) } } + /* If we have a comparison of a SSA_NAME boolean against + a constant (which obviously must be [0..1]), see if the + SSA_NAME was set by a type conversion where the source + of the conversion is another SSA_NAME with a range [0..1]. + + If so, we can replace the SSA_NAME in the comparison with + the RHS of the conversion. This will often make the type + conversion dead code which DCE will clean up. */ + if (TREE_CODE (op0) == SSA_NAME + && (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE + || (INTEGRAL_TYPE_P (TREE_TYPE (op)) + && TYPE_PRECISION (TREE_TYPE (op0)) == 1)) + && TREE_CODE (op1) == INTEGER_CST) +{ + gimple def_stmt = SSA_NAME_DEF_STMT (op0); + tree innerop; + + if (!is_gimple_assign (def_stmt) + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))) + return false; + + innerop = gimple_assign_rhs1 (def_stmt); + + if (TREE_CODE (innerop) == SSA_NAME) + { + value_range_t *vr = get_value_range (innerop); + + if (range_int_cst_p (vr) + && operand_equal_p (vr->min, integer_zero_node, 0) + && operand_equal_p (vr->max, integer_one_node, 0)) + { + tree newconst = fold_convert (TREE_TYPE (innerop), op1); + gimple_cond_set_lhs (stmt, innerop); + gimple_cond_set_rhs (stmt, newconst); + return true; + } + } +} + return false; }
[PATCH] Improve cstore code generation on 64-bit sparc.
One major suboptimal area of the sparc back end is cstore generation on 64-bit. Due to the way arguments and return values of functions must be promoted, the ideal mode for cstore's result would be DImode. But this hasn't been done because of a fundamental limitation of the cstore patterns. They require that a fixed mode be used for the boolean result value. I've decided to work around this by building a target hook which specifies the type to use for conditional store results, and then I use a special predicate for operand 0 in the cstore expanders so that they still match even when we use DImode. The default version of the target hook just does what it does now, so no other target should be impacted by this at all. Regstrapped on 32-bit sparc-linux-gnu and I've run the testsuite with -m64 to validate the 64-bit side. Any major objections? gcc/ * target.def (cstore_mode): New hook. * target.h: Include insn-codes.h. * targhooks.c: Likewise. (default_cstore_mode): New function. * targhooks.h: Declare it. * doc/tm.texi.in: New hook slot for TARGET_CSTORE_MODE. * doc/tm.texi: Rebuild. * expmed.c (emit_cstore): Obtain cstore boolean result mode using target hook, rather than inspecting the insn_data. * config/sparc/sparc.c (sparc_cstore_mode): New function. (TARGET_CSTORE_MODE): Redefine. (emit_scc_insn): When TARGET_ARCH64, emit new 64-bit boolean result patterns. * config/sparc/predicates.md (cstore_result_operand): New special predicate. * config/sparc/sparc.md (cstoresi4, cstoredi4, cstore<F:mode>4): Use it for operand 0. (*seqsi_special): Rewrite using 'P' mode iterator on operand 0. (*snesi_special): Likewise. (*snesi_zero): Likewise. (*seqsi_zero): Likewise. (*sltu_insn): Likewise. (*sgeu_insn): Likewise. (*seqdi_special): Make operand 0 and comparison operation be of DImode. (*snedi_special): Likewise. (*snedi_special_vis3): Likewise. (*neg_snesi_zero): Rename to *neg_snesisi_zero. (*neg_snesi_sign_extend): Rename to *neg_snesidi_zero. 
(*snesi_zero_extend): Delete, covered by 'P' mode iterator. (*neg_seqsi_zero): Rename to *neg_seqsisi_zero. (*neg_seqsi_sign_extend): Rename to *neg_seqsidi_zero. (*seqsi_zero_extend): Delete, covered by 'P' mode iterator. (*sltu_extend_sp64): Likewise. (*neg_sltu_insn): Rename to *neg_sltusi_insn. (*neg_sltu_extend_sp64): Rename to *neg_sltudi_insn. (*sgeu_extend_sp64): Delete, covered by 'P' mode iterator. (*neg_sgeu_insn): Rename to *neg_sgeusi_insn. (*neg_sgeu_extend_sp64): Rename to *neg_sgeudi_insn. gcc/testsuite/ * gcc.target/sparc/setcc-4.c: New test. * gcc.target/sparc/setcc-5.c: New test. --- gcc/config/sparc/predicates.md | 5 ++ gcc/config/sparc/sparc.c | 23 +- gcc/config/sparc/sparc.md| 137 ++- gcc/doc/tm.texi | 4 + gcc/doc/tm.texi.in | 2 + gcc/expmed.c | 2 +- gcc/target.def | 10 +++ gcc/target.h | 1 + gcc/targhooks.c | 9 ++ gcc/targhooks.h | 1 + gcc/testsuite/gcc.target/sparc/setcc-4.c | 44 ++ gcc/testsuite/gcc.target/sparc/setcc-5.c | 42 ++ 12 files changed, 183 insertions(+), 97 deletions(-) create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-4.c create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-5.c diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md index b8524e5..073bce2 100644 --- a/gcc/config/sparc/predicates.md +++ b/gcc/config/sparc/predicates.md @@ -265,6 +265,11 @@ (ior (match_test "register_operand (op, SImode)") (match_test "TARGET_ARCH64 && register_operand (op, DImode)"))) +;; Return true if OP is an integer register of the appropriate mode +;; for a cstore result. +(define_special_predicate cstore_result_operand + (match_test "register_operand (op, TARGET_ARCH64 ? DImode : SImode)")) + ;; Return true if OP is a floating point condition code register. 
(define_predicate "fcc_register_operand" (match_code "reg") diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index 3e98325..4a73c73 100644 --- a/gcc/config/sparc/sparc.c +++ b/gcc/config/sparc/sparc.c @@ -597,6 +597,7 @@ static void sparc_print_operand_address (FILE *, rtx); static reg_class_t sparc_secondary_reload (bool, rtx, reg_class_t, enum machine_mode, secondary_reload_info *); +static enum machine_mode sparc_cstore_mode (enum insn_code icode); #ifdef SUBTARGET_ATTRIBUTE_TABLE /* Table of valid machine attributes. */
Re: [RFA][PATCH] Improve VRP of COND_EXPR_CONDs -- v2
On 04/08/2013 07:54 PM, Jeff Law wrote: This incorporates the concrete suggestions from Steven & Richi -- it doesn't do any refactoring of the VRP code. There's still stuff I'm looking at that might directly lead to some refactoring. In the meantime I'm submitting the obvious small improvements. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? Just a note, there's a typo in that patch: op should be op0; not sure why git gave me the old version since that's something I thought I'd patched and squashed out... Clearly a git user workflow error of some kind. Jeff
Re: [PATCH v3]IPA: fixing inline fail report caused by overwritable functions.
On Mon, Apr 8, 2013 at 5:48 PM, Richard Biener richard.guent...@gmail.com wrote: Can you trigger this message to show up with -Winline before/after the patch? Can you please add a testcase then? Thanks Richard for reviewing. From my reading of gcc and my testing, -Winline only works on callees that are declared inline, but if the callee is declared inline, it will be AVAIL_AVAILABLE in function can_inline_edge_p, thus outside the scope of my patch. So I only added a testcase checking the tree dump; is there anything more I can do? Regtested/bootstrapped on x86_64-linux. ChangeLog: 2013-04-08 Zhouyi Zhou yizhouz...@ict.ac.cn * cif-code.def (OVERWRITABLE): Correct the comment for overwritable functions. * ipa-inline.c (can_inline_edge_p): Let the dump mechanism report the inline failure caused by overwritable functions. * gcc.dg/tree-ssa/inline-11.c: New test. Index: gcc/cif-code.def === --- gcc/cif-code.def(revision 197549) +++ gcc/cif-code.def(working copy) @@ -48,7 +48,7 @@ DEFCIFCODE(REDEFINED_EXTERN_INLINE, /* Function is not inlinable. */ DEFCIFCODE(FUNCTION_NOT_INLINABLE, N_("function not inlinable")) -/* Function is not overwritable. */ +/* Function is overwritable. */ DEFCIFCODE(OVERWRITABLE, N_("function body can be overwritten at link time")) /* Function is not an inlining candidate. 
*/ Index: gcc/testsuite/gcc.dg/tree-ssa/inline-11.c === --- gcc/testsuite/gcc.dg/tree-ssa/inline-11.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/inline-11.c (working copy) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-einline" } */ +int w; +int bar (void) __attribute__ ((weak)); +int bar (){ + w++; +} +void foo() +{ + bar(); +} +/* { dg-final { scan-tree-dump-times "function body can be overwritten at link time" 1 "einline" } } */ +/* { dg-final { cleanup-tree-dump "einline" } } */ Index: gcc/ipa-inline.c === --- gcc/ipa-inline.c(revision 197549) +++ gcc/ipa-inline.c(working copy) @@ -266,7 +266,7 @@ can_inline_edge_p (struct cgraph_edge *e else if (avail <= AVAIL_OVERWRITABLE) { e->inline_failed = CIF_OVERWRITABLE; - return false; + inlinable = false; } else if (e->call_stmt_cannot_inline_p) {
Re: [patch libgcc]: Adjust cygming-crtbegin code to use weak
On 22/03/2013 08:44, Kai Tietz wrote: 2013-03-22 Kai Tietz kti...@redhat.com * config/i386/cygming-crtbegin.c (__register_frame_info): Make weak. (__deregister_frame_info): Likewise. Hi Kai, I read your explanation of the problem relating to x86-64 memory models over on the Cygwin dev list, and that explained your motivation for making this change; I see why it's not easy to get an *ABS* 0 reference there. So, providing dummy versions of the functions makes perfect sense to me, and certainly won't cause problems for i686. (I did a lot of testing, and the only problem I found is that a weak definition has to be provided on the linker command line *after* the file that contains the weak-with-zero-default definition if it is to override that; in the case here however we're going to be overriding the weak-with-default by a strong function declaration, so that issue does not arise.) I still have a comment or two about the patch itself: Index: libgcc/config/i386/cygming-crtbegin.c === --- libgcc/config/i386/cygming-crtbegin.c (revision 196898) +++ libgcc/config/i386/cygming-crtbegin.c (working copy) @@ -46,15 +46,33 @@ see the files COPYING3 and COPYING.RUNTIME respect #define LIBGCJ_SONAME "libgcj_s.dll" #endif - +#if DWARF2_UNWIND_INFO /* Make the declarations weak. This is critical for _Jv_RegisterClasses because it lives in libgcj.a */ -extern void __register_frame_info (const void *, struct object *) +extern void __register_frame_info (__attribute__((unused)) const void *, +__attribute__((unused)) struct object *) TARGET_ATTRIBUTE_WEAK; -extern void *__deregister_frame_info (const void *) +extern void *__deregister_frame_info (__attribute__((unused)) const void *) TARGET_ATTRIBUTE_WEAK; -extern void _Jv_RegisterClasses (const void *) TARGET_ATTRIBUTE_WEAK; +TARGET_ATTRIBUTE_WEAK void +__register_frame_info (__attribute__((unused)) const void *p, +__attribute__((unused)) struct object *o) +{} Braces should go on separate lines I think. 
+TARGET_ATTRIBUTE_WEAK void * +__deregister_frame_info (__attribute__((unused)) const void *p) +{ return (void*) 0; } Certainly here. +#endif /* DWARF2_UNWIND_INFO */ + +#if TARGET_USE_JCR_SECTION +extern void _Jv_RegisterClasses (__attribute__((unused)) const void *) + TARGET_ATTRIBUTE_WEAK; + +TARGET_ATTRIBUTE_WEAK void +_Jv_RegisterClasses (__attribute__((unused)) const void *p) +{} +#endif /* TARGET_USE_JCR_SECTION */ + #if defined(HAVE_LD_RO_RW_SECTION_MIXING) # define EH_FRAME_SECTION_CONST const #else Also, now that you've provided a default weak definition of the functions in the file itself, it's no longer possible for the function pointer variables (register_frame_fn, register_class_fn, deregister_frame_fn) to be zero, so you should remove the if () tests on them and just call them unconditionally. cheers, DaveK