Re: C6X port 8/11: A new FUNCTION_ARG macro
On 05/12/2011 05:40 PM, Bernd Schmidt wrote:
> +  if (targetm.calls.function_arg_round_to_arg_boundary (passed_mode, type))
> +    round_boundary = boundary;
> +  else
> +    round_boundary = PARM_BOUNDARY;

Why add an if, instead of making the new target hook function_arg_round_boundary? The default implementation can then reuse default_function_arg_boundary, and C6X will redefine it to c6x_function_arg_boundary.

Paolo
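[Editor's note: a minimal sketch of the suggested shape, mirroring the existing TARGET_FUNCTION_ARG_BOUNDARY hook. The hook name comes from Paolo's suggestion; the code itself is an assumption, not part of the patch.]

    /* Default: round argument sizes to PARM_BOUNDARY, as every target
       except C6X does today.  */
    static unsigned int
    default_function_arg_round_boundary (enum machine_mode mode ATTRIBUTE_UNUSED,
                                         const_tree type ATTRIBUTE_UNUSED)
    {
      return PARM_BOUNDARY;
    }

    /* The caller in function.c then collapses to a single line:

         round_boundary
           = targetm.calls.function_arg_round_boundary (passed_mode, type);

       and C6X defines the hook to c6x_function_arg_boundary, so rounding
       uses the same (larger) boundary as argument alignment.  */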
Re: Prefixes for libgcc symbols (C6X 9.5/11)
On 05/13/2011 03:40 PM, Bernd Schmidt wrote:

> gcc/
> 	* libgcc2.h (__NW, __NDW): Define using a __gnu_ prefix if
> 	LIBGCC2_GNU_PREFIX is defined.
> 	(__N): New macro.
> 	(__powisf2, __powidf2, __powitf2, __powixf2, __bswapsi2,
> 	__bswapdi2, __mulsc3, __muldc3, __mulxc3, __multc3, __divsc3,
> 	__divdc3, __divxc3, __divtc3, __udiv_w_sdiv, __clear_cache,
> 	__enable_execute_stack, __clz_tab): Define using __N.
> 	(__absvsi2, __negvsi2, __addvsi3, __subvsi3, __mulvsi3): Likewise
> 	if COMPAT_SIMODE_TRAPPING_ARITHMETIC.
> 	* target.def (libfunc_gnu_prefix): New hook.
> 	* doc/tm.texi.in (LIBGCC2_GNU_PREFIX): Document.
> 	(TARGET_LIBFUNC_GNU_PREFIX): Add hook.
> 	* doc/tm.texi: Regenerate.
> 	* optabs.c (gen_libfunc): Take the libfunc_gnu_prefix hook into
> 	account.
> 	(gen_interclass_conv_libfunc, gen_intraclass_conv_libfunc):
> 	Likewise.
> 	(init_optabs): Likewise for the bswap libfuncs.
> 	* tree.c (build_common_builtin_nodes): Likewise for complex
> 	multiply and divide.
> 	* config/t-slibgcc-elf-ver (SHLIB_MAPFILES): Use $$(libgcc_objdir).
> 	* config/t-slibgcc-sld (SHLIB_MAPFILES): Likewise.
> 	* libgcc-std.ver: Remove.
> 	* Makefile.in (srcdirify): Handle $$(libgcc_objdir).
> 	* config/bfin/libgcc-bfin.ver: Remove.
> 	* config/bfin/t-bfin-linux (SHLIB_MAPFILES): Remove.
> 	* config/frv/t-linux (SHLIB_MAPFILES): Use $$(libgcc_objdir) for
> 	libgcc-std.ver.
> 	* config/i386/t-linux (SHLIB_MAPFILES): Likewise.
> 	* config/mips/t-slibgcc-irix (SHLIB_MAPFILES): Likewise.
> 	* config/rs6000/t-aix43 (SHLIB_MAPFILES): Likewise.
> 	* config/rs6000/t-aix52 (SHLIB_MAPFILES): Likewise.
> 	* config/sparc/t-linux (SHLIB_MAPFILES): Likewise.
> 	* config/fixed-bit.h (FIXED_OP): Define differently depending on
> 	LIBGCC2_GNU_PREFIX.  All uses changed not to pass leading
> 	underscores.
> 	(FIXED_CONVERT_OP, FIXED_CONVERT_OP2): Likewise.
>
> libgcc/
> 	* libgcc-std.ver.in: New file.
> 	* Makefile.in (LIBGCC_VER_GNU_PREFIX, LIBGCC_VER_SYMBOLS_PREFIX):
> 	New variables.
> 	(libgcc-std.ver): New rule.
> 	* config/t-gnu-prefix: New file.
> 	* config/t-underscore-prefix: New file.

Build parts are ok.

Paolo
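[Editor's note: a minimal sketch of what such a prefixing scheme typically looks like in libgcc2.h — illustrative only, not the actual patch text.]

    /* Targets that define LIBGCC2_GNU_PREFIX get __gnu_-prefixed libgcc
       entry points; everybody else keeps the traditional leading
       double underscore.  */
    #ifdef LIBGCC2_GNU_PREFIX
    #define __N(a) __gnu_ ## a
    #else
    #define __N(a) __ ## a
    #endif

    /* So a libfunc declared via __N(mulsc3) is emitted as either
       __gnu_mulsc3 or __mulsc3, and gen_libfunc consults the new
       libfunc_gnu_prefix hook to generate the matching caller-side
       name.  */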
[PATCH] Optimize __sync_fetch_and_add (x, -N) == N and __sync_add_and_fetch (x, N) == 0 (PR target/48986)
Hi!

This patch optimizes using peephole2 __sync_fetch_and_add (x, -N) == N and __sync_add_and_fetch (x, N) == 0 by just doing lock {add,sub,inc,dec} and testing flags, instead of lock xadd plus comparison. The sync_old_add<mode> predicate change makes it possible to optimize __sync_add_and_fetch with a constant second argument to the same code as __sync_fetch_and_add.

Doing it in peephole2 has a disadvantage though: the 3 instructions need to be consecutive, and e.g. the xadd insn has to be supported by the CPU. The other alternative would be to come up with a new bool builtin that would represent the whole __sync_fetch_and_add (x, -N) == N operation (perhaps with a dot or space in its name to make it inaccessible), try to match it during some folding and expand it using a special optab.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk this way?

2011-05-16  Jakub Jelinek  <ja...@redhat.com>

	PR target/48986
	* config/i386/sync.md (sync_old_add<mode>): Relax operand 2
	predicate to allow CONST_INT.
	(*sync_old_add_cmp<mode>): New insn and peephole2 for it.

--- gcc/config/i386/sync.md.jj	2010-05-21 11:46:29.000000000 +0200
+++ gcc/config/i386/sync.md	2011-05-16 14:42:08.000000000 +0200
@@ -170,11 +170,62 @@ (define_insn "sync_old_add<mode>"
 	  [(match_operand:SWI 1 "memory_operand" "+m")] UNSPECV_XCHG))
    (set (match_dup 1)
 	(plus:SWI (match_dup 1)
-		  (match_operand:SWI 2 "register_operand" "0")))
+		  (match_operand:SWI 2 "nonmemory_operand" "0")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_XADD"
   "lock{%;} xadd{<imodesuffix>}\t{%0, %1|%1, %0}")
 
+(define_peephole2
+  [(set (match_operand:SWI 0 "register_operand" "")
+	(match_operand:SWI 2 "const_int_operand" ""))
+   (parallel [(set (match_dup 0)
+		   (unspec_volatile:SWI
+		     [(match_operand:SWI 1 "memory_operand" "")] UNSPECV_XCHG))
+	      (set (match_dup 1)
+		   (plus:SWI (match_dup 1)
+			     (match_dup 0)))
+	      (clobber (reg:CC FLAGS_REG))])
+   (set (reg:CCZ FLAGS_REG)
+	(compare:CCZ (match_dup 0)
+		     (match_operand:SWI 3 "const_int_operand" "")))]
+  "peep2_reg_dead_p (3, operands[0])
+   && (unsigned HOST_WIDE_INT) INTVAL (operands[2])
+      == -(unsigned HOST_WIDE_INT) INTVAL (operands[3])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])"
+  [(parallel [(set (reg:CCZ FLAGS_REG)
+		   (compare:CCZ (unspec_volatile:SWI [(match_dup 1)]
+						     UNSPECV_XCHG)
+				(match_dup 3)))
+	      (set (match_dup 1)
+		   (plus:SWI (match_dup 1)
+			     (match_dup 2)))])])
+
+(define_insn "*sync_old_add_cmp<mode>"
+  [(set (reg:CCZ FLAGS_REG)
+	(compare:CCZ (unspec_volatile:SWI
+		       [(match_operand:SWI 0 "memory_operand" "+m")]
+		       UNSPECV_XCHG)
+		     (match_operand:SWI 2 "const_int_operand" "i")))
+   (set (match_dup 0)
+	(plus:SWI (match_dup 0)
+		  (match_operand:SWI 1 "const_int_operand" "i")))]
+  "(unsigned HOST_WIDE_INT) INTVAL (operands[1])
+   == -(unsigned HOST_WIDE_INT) INTVAL (operands[2])"
+{
+  if (TARGET_USE_INCDEC)
+    {
+      if (operands[1] == const1_rtx)
+	return "lock{%;} inc{<imodesuffix>}\t%0";
+      if (operands[1] == constm1_rtx)
+	return "lock{%;} dec{<imodesuffix>}\t%0";
+    }
+
+  if (x86_maybe_negate_const_int (&operands[1], <MODE>mode))
+    return "lock{%;} sub{<imodesuffix>}\t{%1, %0|%0, %1}";
+
+  return "lock{%;} add{<imodesuffix>}\t{%1, %0|%0, %1}";
+})
+
 ;; Recall that xchg implicitly sets LOCK#, so adding it again wastes space.
 (define_insn "sync_lock_test_and_set<mode>"
   [(set (match_operand:SWI 0 "register_operand" "=<r>")

	Jakub
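[Editor's note: for illustration, the kind of source code this targets — my example, not from the patch. With the peephole, the function below can compile to a single lock sub whose flags feed the comparison, instead of lock xadd followed by a separate compare.]

    /* Returns nonzero iff this call decremented the counter to zero:
       the old value was 1 (N == 1, added value -N == -1).  */
    int
    dec_was_last (int *counter)
    {
      return __sync_fetch_and_add (counter, -1) == 1;
    }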
Re: [patch, ARM] Fix PR42017, LR not used in leaf functions
On 2011/5/13 04:26 PM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandif...@linaro.org> writes:
>> Chung-Lin Tang <clt...@codesourcery.com> writes:
>>> My fix here simply adds 'reload_completed' as an additional condition
>>> for EPILOGUE_USES to return true for LR_REGNUM. I think this should be
>>> valid, as correct LR save/restoring is handled by the epilogue/prologue
>>> code; it should be safe for IRA to treat it as a normal call-used
>>> register.
>>
>> FWIW, epilogue_completed might be a more accurate choice.
>
> I still stand by this, although I realise no other target does it.

Did a re-test of the patch just to be sure; as expected, the test results were also clean. Attached is the updated patch.

>> It seems a lot of other ports suffer from the same problem though.
>> I wonder which targets really do want to make a register live throughout
>> the function? If none do, perhaps we should say that this macro is only
>> meaningful once the epilogue has been generated.
>
> To answer my own question, I suppose VRSAVE is one. So I was wrong
> about the target-independent fix.
>
> Richard

To rehash what I remember we discussed at LDS, such registers like VRSAVE might be more appropriately placed in global regs. It looks like the intended use of EPILOGUE_USES could be clarified further...

To Richard Earnshaw and Ramana: is the patch okay for trunk? This should be a not-so-insignificant performance regression fix/improvement.

Thanks,
Chung-Lin

Index: config/arm/arm.h
===================================================================
--- config/arm/arm.h	(revision 173814)
+++ config/arm/arm.h	(working copy)
@@ -1627,7 +1627,7 @@
    frame.  */
 #define EXIT_IGNORE_STACK 1
 
-#define EPILOGUE_USES(REGNO) ((REGNO) == LR_REGNUM)
+#define EPILOGUE_USES(REGNO) (epilogue_completed && (REGNO) == LR_REGNUM)
 
 /* Determine if the epilogue should be output as RTL.
    You should override this if you define FUNCTION_EXTRA_EPILOGUE.  */
[PATCH, PR45098]
Hi Zdenek,

I have a patch set for PR45098.

01_object-size-target.patch
02_pr45098-rtx-cost-set.patch
03_pr45098-computation-cost.patch
04_pr45098-iv-init-cost.patch
05_pr45098-bound-cost.patch
06_pr45098-bound-cost.test.patch
07_pr45098-nowrap-limits-iterations.patch
08_pr45098-nowrap-limits-iterations.test.patch
09_pr45098-shift-add-cost.patch
10_pr45098-shift-add-cost.test.patch

I will send out the patches individually.

The patch set has been bootstrapped and reg-tested on x86_64, and reg-tested on ARM.

The effect of the patch set on examples is the removal of 1 iterator, demonstrated below for '-Os -mthumb -march=armv7-a' on example tr4.

tr4.c:
...
extern void foo2 (short*);
void tr4 (short array[], int n)
{
  int i;
  if (n > 0)
    for (i = 0; i < n; i++)
      foo2 (&array[i]);
}
...

tr4.s diff (left without, right with patch):
...
        push    {r4, r5, r6, lr}    |         cmp     r1, #0
        subs    r6, r1, #0          |         push    {r3, r4, r5, lr}
        ble     .L1                           ble     .L1
        mov     r5, r0              |         mov     r4, r0
        movs    r4, #0              |         add     r5, r0, r1, lsl #1
.L3:                                  .L3:
        mov     r0, r5              |         mov     r0, r4
        adds    r4, r4, #1          |         adds    r4, r4, #2
        bl      foo2                          bl      foo2
        adds    r5, r5, #2          |         cmp     r4, r5
        cmp     r4, r6              <
        bne     .L3                           bne     .L3
.L1:                                  .L1:
        pop     {r4, r5, r6, pc}    |         pop     {r3, r4, r5, pc}
...

The effect of the patch set on the test cases in terms of size is listed in the following 2 tables.

--- -Os -mthumb -march=armv7-a ---
      without  with  delta
---
tr1        32    30     -2
tr2        36    36      0
tr3        32    30     -2
tr4        26    26      0
tr5        20    20      0
---

--- -Os -march=armv7-a ---
      without  with  delta
---
tr1        60    52     -8
tr2        64    60     -4
tr3        60    52     -8
tr4        48    44     -4
tr5        36    32     -4
---

The size impact on several benchmarks is shown in the following table (%, lower is better).

                 none             pic
            thumb1  thumb2  thumb1  thumb2
spec2000      99.9    99.9    99.9    99.9
eembc         99.9   100.0    99.9   100.1
dhrystone    100.0   100.0   100.0   100.0
coremark      99.3    99.9    99.3   100.0

Thanks,
- Tom
[PATCH, PR45098, 3/10]
On 05/17/2011 09:10 AM, Tom de Vries wrote:
> Hi Zdenek,
>
> I have a patch set for PR45098.
>
> 01_object-size-target.patch
> 02_pr45098-rtx-cost-set.patch
> 03_pr45098-computation-cost.patch
> 04_pr45098-iv-init-cost.patch
> 05_pr45098-bound-cost.patch
> 06_pr45098-bound-cost.test.patch
> 07_pr45098-nowrap-limits-iterations.patch
> 08_pr45098-nowrap-limits-iterations.test.patch
> 09_pr45098-shift-add-cost.patch
> 10_pr45098-shift-add-cost.test.patch
>
> I will send out the patches individually.

OK for trunk?

Thanks,
- Tom

2011-05-05  Tom de Vries  <t...@codesourcery.com>

	PR target/45098
	* tree-ssa-loop-ivopts.c (computation_cost): Prevent cost of 0.

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 173380)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -2862,7 +2862,9 @@ computation_cost (tree expr, bool speed)
   default_rtl_profile ();
   node->frequency = real_frequency;
 
-  cost = seq_cost (seq, speed);
+  cost = (seq != NULL_RTX
+	  ? seq_cost (seq, speed)
+	  : (unsigned) rtx_cost (rslt, SET, speed));
   if (MEM_P (rslt))
     cost += address_cost (XEXP (rslt, 0), TYPE_MODE (type),
			  TYPE_ADDR_SPACE (type), speed);
[PATCH, PR45098, 4/10]
On 05/17/2011 09:10 AM, Tom de Vries wrote:
> Hi Zdenek,
>
> I have a patch set for PR45098.
>
> 01_object-size-target.patch
> 02_pr45098-rtx-cost-set.patch
> 03_pr45098-computation-cost.patch
> 04_pr45098-iv-init-cost.patch
> 05_pr45098-bound-cost.patch
> 06_pr45098-bound-cost.test.patch
> 07_pr45098-nowrap-limits-iterations.patch
> 08_pr45098-nowrap-limits-iterations.test.patch
> 09_pr45098-shift-add-cost.patch
> 10_pr45098-shift-add-cost.test.patch
>
> I will send out the patches individually.

OK for trunk?

Thanks,
- Tom

2011-05-05  Tom de Vries  <t...@codesourcery.com>

	* tree-ssa-loop-ivopts.c (determine_iv_cost): Prevent
	cost_base.cost == 0.

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 173380)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -4688,6 +4688,8 @@ determine_iv_cost (struct ivopts_data *d
   base = cand->iv->base;
   cost_base = force_var_cost (data, base, NULL);
+  if (cost_base.cost == 0)
+    cost_base.cost = COSTS_N_INSNS (1);
   cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)), data->speed);
 
   cost = cost_step + adjust_setup_cost (data, cost_base.cost);
[PATCH, PR45098, 7/10]
On 05/17/2011 09:10 AM, Tom de Vries wrote:
> Hi Zdenek,
>
> I have a patch set for PR45098.
>
> 01_object-size-target.patch
> 02_pr45098-rtx-cost-set.patch
> 03_pr45098-computation-cost.patch
> 04_pr45098-iv-init-cost.patch
> 05_pr45098-bound-cost.patch
> 06_pr45098-bound-cost.test.patch
> 07_pr45098-nowrap-limits-iterations.patch
> 08_pr45098-nowrap-limits-iterations.test.patch
> 09_pr45098-shift-add-cost.patch
> 10_pr45098-shift-add-cost.test.patch
>
> I will send out the patches individually.

OK for trunk?

Thanks,
- Tom

2011-05-05  Tom de Vries  <t...@codesourcery.com>

	PR target/45098
	* tree-ssa-loop-ivopts.c (struct ivopts_data): Add fields
	max_iterations_p and max_iterations.
	(is_nonwrap_use, max_loop_iterations, set_max_iterations): New
	function.
	(may_eliminate_iv): Use max_iterations_p and max_iterations.
	(tree_ssa_iv_optimize_loop): Use set_max_iterations.

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 173355)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -291,6 +291,12 @@ struct ivopts_data
 
   /* Whether the loop body includes any function calls.  */
   bool body_includes_call;
+
+  /* Whether max_iterations is valid.  */
+  bool max_iterations_p;
+
+  /* Maximum number of iterations of current_loop.  */
+  double_int max_iterations;
 };
 
 /* An assignment of iv candidates to uses.  */
@@ -4319,6 +4325,108 @@ iv_elimination_compare (struct ivopts_da
   return (exit->flags & EDGE_TRUE_VALUE ? EQ_EXPR : NE_EXPR);
 }
 
+/* Determine if USE contains non-wrapping arithmetic.  */
+
+static bool
+is_nonwrap_use (struct ivopts_data *data, struct iv_use *use)
+{
+  gimple stmt = use->stmt;
+  tree var, ptr, ptr_type;
+
+  if (!is_gimple_assign (stmt))
+    return false;
+
+  switch (gimple_assign_rhs_code (stmt))
+    {
+    case POINTER_PLUS_EXPR:
+      ptr = gimple_assign_rhs1 (stmt);
+      ptr_type = TREE_TYPE (ptr);
+      var = gimple_assign_rhs2 (stmt);
+      if (!expr_invariant_in_loop_p (data->current_loop, ptr))
+	return false;
+      break;
+    case ARRAY_REF:
+      ptr = TREE_OPERAND ((gimple_assign_rhs1 (stmt)), 0);
+      ptr_type = build_pointer_type (TREE_TYPE (gimple_assign_rhs1 (stmt)));
+      var = TREE_OPERAND ((gimple_assign_rhs1 (stmt)), 1);
+      break;
+    default:
+      return false;
+    }
+
+  if (!nowrap_type_p (ptr_type))
+    return false;
+
+  if (TYPE_PRECISION (ptr_type) != TYPE_PRECISION (TREE_TYPE (var)))
+    return false;
+
+  return true;
+}
+
+/* Attempt to infer maximum number of loop iterations of DATA->current_loop
+   from uses in loop containing non-wrapping arithmetic.  If successful,
+   return true, and return maximum iterations in MAX_NITER.  */
+
+static bool
+max_loop_iterations (struct ivopts_data *data, double_int *max_niter)
+{
+  struct iv_use *use;
+  struct iv *iv;
+  bool found = false;
+  double_int period;
+  gimple stmt;
+  unsigned i;
+
+  for (i = 0; i < n_iv_uses (data); i++)
+    {
+      use = iv_use (data, i);
+
+      stmt = use->stmt;
+      if (!just_once_each_iteration_p (data->current_loop, gimple_bb (stmt)))
+	continue;
+
+      if (!is_nonwrap_use (data, use))
+	continue;
+
+      iv = use->iv;
+      if (iv->step == NULL_TREE || TREE_CODE (iv->step) != INTEGER_CST)
+	continue;
+      period = tree_to_double_int (iv_period (iv));
+
+      if (found)
+	*max_niter = double_int_umin (*max_niter, period);
+      else
+	{
+	  found = true;
+	  *max_niter = period;
+	}
+    }
+
+  return found;
+}
+
+/* Initializes DATA->max_iterations and DATA->max_iterations_p.  */
+
+static void
+set_max_iterations (struct ivopts_data *data)
+{
+  double_int max_niter, max_niter2;
+  bool estimate1, estimate2;
+
+  data->max_iterations_p = false;
+  estimate1 = estimated_loop_iterations (data->current_loop, true, &max_niter);
+  estimate2 = max_loop_iterations (data, &max_niter2);
+  if (!(estimate1 || estimate2))
+    return;
+  if (estimate1 && estimate2)
+    data->max_iterations = double_int_umin (max_niter, max_niter2);
+  else if (estimate1)
+    data->max_iterations = max_niter;
+  else
+    data->max_iterations = max_niter2;
+  data->max_iterations_p = true;
+}
+
 /* Check whether it is possible to express the condition in USE by comparison
    of candidate CAND.  If so, store the value compared with to BOUND.  */
 
@@ -4391,10 +4499,10 @@ may_eliminate_iv (struct ivopts_data *da
   /* See if we can take advantage of infered loop bound information.  */
   if (loop_only_exit_p (loop, exit))
     {
-      if (!estimated_loop_iterations (loop, true, &max_niter))
+      if (!data->max_iterations_p)
	return false;
       /* The loop bound is already adjusted by adding 1.  */
-      if (double_int_ucmp (max_niter, period_value) > 0)
+      if (double_int_ucmp (data->max_iterations, period_value) > 0)
	return false;
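[Editor's note: to make the idea concrete, an example of the kind of loop this helps — mine, not from the patch. The address computation &array[x] is non-wrapping pointer arithmetic executed every iteration, so the trip count can be bounded by the period of that induction variable even though n itself is unknown, which in turn lets may_eliminate_iv replace the exit test.]

    void
    clear (short *array, unsigned int n)
    {
      unsigned int x;
      for (x = 0; x < n; x++)
        array[x] = 0;   /* &array[x] must not wrap, bounding the trip count.  */
    }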
[PATCH, PR45098, 8/10]
On 05/17/2011 09:10 AM, Tom de Vries wrote:
> Hi Zdenek,
>
> I have a patch set for PR45098.
>
> 01_object-size-target.patch
> 02_pr45098-rtx-cost-set.patch
> 03_pr45098-computation-cost.patch
> 04_pr45098-iv-init-cost.patch
> 05_pr45098-bound-cost.patch
> 06_pr45098-bound-cost.test.patch
> 07_pr45098-nowrap-limits-iterations.patch
> 08_pr45098-nowrap-limits-iterations.test.patch
> 09_pr45098-shift-add-cost.patch
> 10_pr45098-shift-add-cost.test.patch
>
> I will send out the patches individually.

OK for trunk?

Thanks,
- Tom

2011-05-05  Tom de Vries  <t...@codesourcery.com>

	PR target/45098
	* gcc.target/arm/ivopts-3.c: New test.
	* gcc.target/arm/ivopts-4.c: New test.
	* gcc.target/arm/ivopts-5.c: New test.
	* gcc.dg/tree-ssa/ivopt_infer_2.c: Adapt test.

Index: gcc/testsuite/gcc.target/arm/ivopts-3.c
===================================================================
--- /dev/null	(new file)
+++ gcc/testsuite/gcc.target/arm/ivopts-3.c	(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do assemble } */
+/* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */
+
+extern unsigned int foo2 (short*) __attribute__((pure));
+
+unsigned int
+tr3 (short array[], unsigned int n)
+{
+  unsigned sum = 0;
+  unsigned int x;
+  for (x = 0; x < n; x++)
+    sum += foo2 (&array[x]);
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <x" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */
+/* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.target/arm/ivopts-4.c
===================================================================
--- /dev/null	(new file)
+++ gcc/testsuite/gcc.target/arm/ivopts-4.c	(revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do assemble } */
+/* { dg-options "-mthumb -Os -fdump-tree-ivopts -save-temps" } */
+
+extern unsigned int foo (int*) __attribute__((pure));
+
+unsigned int
+tr2 (int array[], int n)
+{
+  unsigned int sum = 0;
+  int x;
+  if (n > 0)
+    for (x = 0; x < n; x++)
+      sum += foo (&array[x]);
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <x" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */
+/* { dg-final { object-size text <= 36 { target arm_thumb2_ok } } } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.target/arm/ivopts-5.c
===================================================================
--- /dev/null	(new file)
+++ gcc/testsuite/gcc.target/arm/ivopts-5.c	(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do assemble } */
+/* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */
+
+extern unsigned int foo (int*) __attribute__((pure));
+
+unsigned int
+tr1 (int array[], unsigned int n)
+{
+  unsigned int sum = 0;
+  unsigned int x;
+  for (x = 0; x < n; x++)
+    sum += foo (&array[x]);
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <x" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */
+/* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c	(revision 173380)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c	(working copy)
@@ -7,7 +7,8 @@
 
 extern int a[];
 
-/* Can not infer loop iteration from array -- exit test can not be replaced.  */
+/* Can infer loop iteration from nonwrapping pointer arithmetic.
+   exit test can be replaced.  */
 void foo (int i_width, TYPE dst, TYPE src1, TYPE src2)
 {
   TYPE dstn = dst + i_width;
@@ -21,5 +22,5 @@ void foo (int i_width, TYPE dst, TYPE sr
   }
 }
 
-/* { dg-final { scan-tree-dump-times "Replacing" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "Replacing" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
[PATCH, PR45098, 10/10]
On 05/17/2011 09:10 AM, Tom de Vries wrote:
> Hi Zdenek,
>
> I have a patch set for PR45098.
>
> 01_object-size-target.patch
> 02_pr45098-rtx-cost-set.patch
> 03_pr45098-computation-cost.patch
> 04_pr45098-iv-init-cost.patch
> 05_pr45098-bound-cost.patch
> 06_pr45098-bound-cost.test.patch
> 07_pr45098-nowrap-limits-iterations.patch
> 08_pr45098-nowrap-limits-iterations.test.patch
> 09_pr45098-shift-add-cost.patch
> 10_pr45098-shift-add-cost.test.patch
>
> I will send out the patches individually.

OK for trunk?

Thanks,
- Tom

2011-05-05  Tom de Vries  <t...@codesourcery.com>

	PR target/45098
	* gcc.target/arm/ivopts-6.c: New test.

Index: gcc/testsuite/gcc.target/arm/ivopts-6.c
===================================================================
--- /dev/null	(new file)
+++ gcc/testsuite/gcc.target/arm/ivopts-6.c	(revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do assemble } */
+/* { dg-options "-Os -fdump-tree-ivopts -save-temps -marm" } */
+
+void
+tr5 (short array[], int n)
+{
+  int x;
+  if (n > 0)
+    for (x = 0; x < n; x++)
+      array[x] = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts"} } */
+/* { dg-final { object-size text <= 32 } } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Re: [PATCH] Optimize __sync_fetch_and_add (x, -N) == N and __sync_add_and_fetch (x, N) == 0 (PR target/48986)
On Tue, May 17, 2011 at 9:02 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> Hi!
>
> This patch optimizes using peephole2 __sync_fetch_and_add (x, -N) == N
> and __sync_add_and_fetch (x, N) == 0 by just doing lock {add,sub,inc,dec}
> and testing flags, instead of lock xadd plus comparison.
> The sync_old_add<mode> predicate change makes it possible to optimize
> __sync_add_and_fetch with a constant second argument to the same code as
> __sync_fetch_and_add.
>
> Doing it in peephole2 has a disadvantage though: the 3 instructions need
> to be consecutive, and e.g. the xadd insn has to be supported by the CPU.
> The other alternative would be to come up with a new bool builtin that
> would represent the whole __sync_fetch_and_add (x, -N) == N operation
> (perhaps with a dot or space in its name to make it inaccessible), try to
> match it during some folding and expand it using a special optab.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk this
> way?
>
> 2011-05-16  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR target/48986
> 	* config/i386/sync.md (sync_old_add<mode>): Relax operand 2
> 	predicate to allow CONST_INT.
> 	(*sync_old_add_cmp<mode>): New insn and peephole2 for it.

OK, but please add a comment explaining why we have a matched constraint with a non-matched predicate. These operands are otherwise targets for cleanups ;) Also, a comment explaining the purpose of the added peephole would be nice.

IMO, the change to sync_old_add<mode> is also appropriate for release branches.

Thanks,
Uros.
Re: Don't let search bots look at buglist.cgi
On Mon, May 16, 2011 at 10:27:44PM -0700, Ian Lance Taylor wrote:
> On Mon, May 16, 2011 at 6:42 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> httpd being in the top-10 always, fiddling with bugzilla URLs? (Note, I
>> don't have access to gcc.gnu.org, I'm relaying info from multiple
>> instances of discussion on #gcc and richi poking on it; that said, it
>> still might not be web crawlers, that's right, but I'll happily accept
>> _any_ load improvement on gcc.gnu.org, however unfounded it might seem)
>
> I think that simply blocking buglist.cgi has dropped bugzilla off the
> immediate radar. It also seems to have lowered the load, although I'm
> not sure if we are still keeping historical data.
>
>> I for example see also
>> 66.249.71.59 - - [16/May/2011:13:37:58 +0000] "GET /viewcvs?view=revision&revision=169814 HTTP/1.1" 200 1334 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" (35%) 2060117us
>> and viewvc is certainly even worse (from an I/O perspective).
>
> I thought we blocked all bot traffic from the viewvc stuff ... This is
> only happening at top level. I committed this patch to fix this.

Probably you know it much better than me, but wouldn't it be a possibility to only allow some of Google's crawlers, if all of them try to crawl bugzilla? As I read
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=1061943
it would be possible to block the crawlers Googlebot-Mobile, Mediapartners-Google and AdsBot-Google (which seem to be independent crawlers?) while allowing the main Googlebot. (Well, I don't know how often which crawler appears on bugzilla...)

Axel
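[Editor's note: per-crawler blocking is a plain robots.txt matter; a sketch follows, illustrative only — the paths and policy are assumptions, not the actual gcc.gnu.org configuration.]

    # Block the secondary Google crawlers entirely ...
    User-agent: Googlebot-Mobile
    Disallow: /

    User-agent: Mediapartners-Google
    Disallow: /

    User-agent: AdsBot-Google
    Disallow: /

    # ... while keeping the main Googlebot (and everyone else) away
    # from the expensive CGI only.
    User-agent: *
    Disallow: /bugzilla/buglist.cgi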
Commit: RX: Add peepholes for move followed by compare
Hi Guys,

  I am applying the patch below to add a peephole optimization to the RX backend. It was suggested by Kazuhiro Inaoka at Renesas Japan, and adapted by me to use the peephole2 system. It finds a register move followed by a comparison of the moved register against zero, and replaces the two instructions with a single addition instruction. The addition does not actually add anything, since the value being added is zero, but as a side effect it moves the register and performs the comparison.

Cheers
  Nick

gcc/ChangeLog
2011-05-17  Kazuhiro Inaoka  <kazuhiro.inaoka...@renesas.com>
	    Nick Clifton  <ni...@redhat.com>

	* config/rx/rx.md: Add peepholes to match a register move followed
	by a comparison of the moved register.  Replace these with an
	addition of zero that does both actions in one instruction.

Index: gcc/config/rx/rx.md
===================================================================
--- gcc/config/rx/rx.md	(revision 173815)
+++ gcc/config/rx/rx.md	(working copy)
@@ -904,6 +904,39 @@
    (set_attr "length" "3,4,5,6,7,6")]
 )
 
+;; Peepholes to match:
+;;
+;;   (set (reg A) (reg B))
+;;   (set (CC) (compare:CC (reg A/reg B) (const_int 0)))
+;;
+;; and replace them with the addsi3_flags pattern, using an add
+;; of zero to copy the register and set the condition code bits.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(match_operand:SI 1 "register_operand"))
+   (set (reg:CC CC_REG)
+	(compare:CC (match_dup 0)
+		    (const_int 0)))]
+  ""
+  [(parallel [(set (match_dup 0)
+		   (plus:SI (match_dup 1) (const_int 0)))
+	      (set (reg:CC_ZSC CC_REG)
+		   (compare:CC_ZSC (plus:SI (match_dup 1) (const_int 0))
+				   (const_int 0)))])]
+)
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(match_operand:SI 1 "register_operand"))
+   (set (reg:CC CC_REG)
+	(compare:CC (match_dup 1)
+		    (const_int 0)))]
+  ""
+  [(parallel [(set (match_dup 0)
+		   (plus:SI (match_dup 1) (const_int 0)))
+	      (set (reg:CC_ZSC CC_REG)
+		   (compare:CC_ZSC (plus:SI (match_dup 1) (const_int 0))
+				   (const_int 0)))])]
+)
+
 (define_expand "adddi3"
   [(set (match_operand:DI 0 "register_operand")
	(plus:DI (match_operand:DI 1 "register_operand")
Commit: RX: Add peepholes to remove redundant extensions
Hi Guys,

  I am applying the patch below to add a couple of peephole optimizations to the RX backend. It seems that GCC does not cope very well with the RX's ability to perform either sign-extending or zero-extending loads, and so it can sometimes generate an extending load followed by a register-to-register extension. The peepholes match these cases and delete the unnecessary extension where possible.

Cheers
  Nick

gcc/ChangeLog
2011-05-17  Nick Clifton  <ni...@redhat.com>

	* config/rx/rx.md: Add peepholes to remove redundant extensions
	after loads.

Index: gcc/config/rx/rx.md
===================================================================
--- gcc/config/rx/rx.md	(revision 173819)
+++ gcc/config/rx/rx.md	(working copy)
@@ -1701,6 +1701,35 @@
	 (extend_types:SI (match_dup 1))))]
 )
 
+;; Convert:
+;;   (set (reg1) (sign_extend (mem)))
+;;   (set (reg2) (zero_extend (reg1)))
+;; into
+;;   (set (reg2) (zero_extend (mem)))
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(sign_extend:SI (match_operand:small_int_modes 1 "memory_operand")))
+   (set (match_operand:SI 2 "register_operand")
+	(zero_extend:SI (match_operand:small_int_modes 3 "register_operand")))]
+  "REGNO (operands[0]) == REGNO (operands[3])
+   && (REGNO (operands[0]) == REGNO (operands[2])
+       || peep2_regno_dead_p (2, REGNO (operands[0])))"
+  [(set (match_dup 2)
+	(zero_extend:SI (match_dup 1)))]
+)
+
+;; Remove the redundant sign extension from:
+;;   (set (reg) (extend (mem)))
+;;   (set (reg) (extend (reg)))
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(extend_types:SI (match_operand:small_int_modes 1 "memory_operand")))
+   (set (match_dup 0)
+	(extend_types:SI (match_operand:small_int_modes 2 "register_operand")))]
+  "REGNO (operands[0]) == REGNO (operands[2])"
+  [(set (match_dup 0) (extend_types:SI (match_dup 1)))]
+)
+
 (define_insn "comparesi3_<extend_types:code><small_int_modes:mode>"
   [(set (reg:CC CC_REG)
	(compare:CC (match_operand:SI 0 "register_operand" "=r")
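[Editor's note: for illustration, C of roughly this shape is the sort of thing that can leave the redundant pair behind — my example, assuming typical code generation, not taken from a PR.]

    /* The short may be fetched with a sign-extending load and then
       zero-extended for the unsigned result; the first new peephole
       folds the two into one zero-extending load when the intermediate
       register is dead.  */
    unsigned int
    load_u16 (short *p)
    {
      return (unsigned short) *p;
    }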
Commit: RX: Fix predicates for restricted memory patterns
Hi Guys,

  I am applying the patch below to fix a minor discrepancy in the rx.md file. Several patterns can only use restricted memory addresses. They have the correct Q constraint, but they were using the more permissive memory_operand predicate. The patch fixes these patterns by replacing memory_operand with rx_restricted_mem_operand.

Cheers
  Nick

gcc/ChangeLog
2011-05-17  Nick Clifton  <ni...@redhat.com>

	* config/rx/rx.md (bitset_in_memory): Use rx_restricted_mem_operand.
	(bitinvert_in_memory): Likewise.
	(bitclr_in_memory): Likewise.

Index: gcc/config/rx/rx.md
===================================================================
--- gcc/config/rx/rx.md	(revision 173820)
+++ gcc/config/rx/rx.md	(working copy)
@@ -1831,7 +1831,7 @@
 )
 
 (define_insn "*bitset_in_memory"
-  [(set (match_operand:QI 0 "memory_operand" "+Q")
+  [(set (match_operand:QI 0 "rx_restricted_mem_operand" "+Q")
	(ior:QI (ashift:QI (const_int 1)
			   (match_operand:QI 1 "nonmemory_operand" "ri"))
		(match_dup 0)))]
@@ -1852,7 +1852,7 @@
 )
 
 (define_insn "*bitinvert_in_memory"
-  [(set (match_operand:QI 0 "memory_operand" "+Q")
+  [(set (match_operand:QI 0 "rx_restricted_mem_operand" "+Q")
	(xor:QI (ashift:QI (const_int 1)
			   (match_operand:QI 1 "nonmemory_operand" "ri"))
		(match_dup 0)))]
@@ -1875,7 +1875,7 @@
 )
 
 (define_insn "*bitclr_in_memory"
-  [(set (match_operand:QI 0 "memory_operand" "+Q")
+  [(set (match_operand:QI 0 "rx_restricted_mem_operand" "+Q")
	(and:QI (not:QI
		  (ashift:QI
		    (const_int 1)
Commit: RX: Include cost of register moving in the cost of register loading.
Hi Guys,

  I am applying the patch below to fix a bug in the rx_memory_move_cost function. The problem was that the costs are meant to be relative to the cost of moving a value between registers, but the existing definition was making stores cheaper than moves, and loads the same cost as moves. Thus gcc was sometimes choosing to store values in memory when it was actually better to keep them in registers. The patch fixes the problem by adding the register move cost to the memory move cost. It also removes the call to memory_move_secondary_cost, since there is no secondary cost.

Cheers
  Nick

gcc/ChangeLog
2011-05-17  Nick Clifton  <ni...@redhat.com>

	* config/rx/rx.c (rx_memory_move_cost): Include cost of register
	moves.

Index: gcc/config/rx/rx.c
===================================================================
--- gcc/config/rx/rx.c	(revision 173815)
+++ gcc/config/rx/rx.c	(working copy)
@@ -2638,7 +2638,7 @@
 static int
 rx_memory_move_cost (enum machine_mode mode, reg_class_t regclass, bool in)
 {
-  return (in ? 2 : 0) + memory_move_secondary_cost (mode, regclass, in);
+  return (in ? 2 : 0) + REGISTER_MOVE_COST (mode, regclass, regclass);
}
 
 /* Convert a CC_MODE to the set of flags that it represents.  */
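[Editor's note: to spell out the arithmetic, with numbers assumed for illustration: if the register-register move cost for the class is the default 2, the old hook priced a store at 0 and a load at 2, i.e. at or below a plain move, so spilling could look free to the allocator; with the fix a load costs 2 + 2 = 4 and a store 0 + 2 = 2, keeping memory accesses at least as expensive as keeping the value in a register.]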
Re: [PATCH] Fix PR46728 (move pow/powi folds to tree phases)
On Mon, May 16, 2011 at 7:30 PM, William J. Schmidt
<wschm...@linux.vnet.ibm.com> wrote:
> Richi, thank you for the detailed review!
>
> I'll plan to move the power-series expansion into the existing IL walk
> during pass_cse_sincos. As part of this, I'll move
> tree_expand_builtin_powi and its subfunctions from builtins.c into
> tree-ssa-math-opts.c. I'll submit this as a separate patch.
>
> I will also stop attempting to make code generation match completely at
> -O0. If there are tests in the test suite that fail only at -O0 due to
> these changes, I'll modify the tests to require -O1 or higher.
>
> I understand that you'd prefer that I leave the existing canonicalization
> folds in place, and only un-canonicalize them during my new pass (now,
> during cse_sincos). Actually, that was my first approach to this issue.
> The problem that I ran into is that the various folds are not performed
> just by the front end, but can pop up later, after my pass is done. In
> particular, pass_fold_builtins will undo my changes, turning expressions
> involving roots back into expressions involving pow/powi. It wasn't
> clear to me whether the folds could kick in elsewhere as well, so I took
> the approach of shutting them down. I see now that this does lose some
> optimizations such as pow(sqrt(cbrt(x)),6.0), as you pointed out.

Yeah, it's always a delicate balance between canonicalization and optimal form for further optimization. Did you really see sqrt(cbrt(x)) being transformed back to pow()? I would doubt that, as on gimple the foldings that require two function calls to match shouldn't trigger. Or do you see sqrt(x) turned into pow(x,0.5)? I see that the vectorizer for example handles both pow(x,0.5) and pow(x,2), so indeed that might happen.

I think what we might want to do is limit what the generic gimple fold_stmt folding does to function calls, also to avoid building regular GENERIC call statements again. But that might be a bigger project and certainly should be done separately. So I'd say don't worry about this issue for the initial patch but instead deal with it separately.

We also repeatedly thought about whether canonicalizing everything to pow is a good idea or not; especially our canonicalizing of x * x to pow (x, 2) leads to interesting effects in some cases, as several passes do not handle function calls very well. So I also thought about introducing a POW_EXPR tree code that would be easier in this regard and would be a more IL-friendly canonical form of the power-related functions.

> Should I attempt to leave the folds in place, and screen out the
> particular cases that are causing trouble in pass_fold_builtins? Or is
> it too fragile to try to catch all places where folds occur? If there's
> a flag that indicates parsing is complete, I suppose I could disable
> individual folds once we're into the optimizer. I'd appreciate your
> guidance.

Indeed restricting canonicalization to earlier passes would be the way to go I think. I will think of the best way to achieve this.

Richard.

> Thanks,
> Bill
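[Editor's note: for readers, the folds under discussion are of this kind — illustrative C, not from the thread.]

    double f (double x)
    {
      return x * x;                  /* may be canonicalized to pow (x, 2.0) */
    }

    double g (double x)
    {
      return __builtin_pow (x, 0.5); /* may be folded to sqrt (x) */
    }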
[PATCH][?/n] LTO type merging cleanup
This avoids the odd cases where gimple_register_canonical_type could end up running in cycles. I was able to reproduce this issue with an intermediate tree and LTO bootstrap. While the following patch is not the real fix (that one runs into a known cache-preloading issue again ...) it certainly makes a lot of sense and avoids the issue by design.

LTO bootstrapped on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-05-17  Richard Guenther  <rguent...@suse.de>

	* gimple.c (gimple_register_canonical_type): Use the main-variant
	leader for computing the canonical type.

Index: gcc/gimple.c
===================================================================
*** gcc/gimple.c	(revision 173825)
--- gcc/gimple.c	(working copy)
*************** gimple_register_canonical_type (tree t)
*** 4856,4874 ****
    if (TYPE_CANONICAL (t))
      return TYPE_CANONICAL (t);
  
!   /* Always register the type itself first so that if it turns out
!      to be the canonical type it will be the one we merge to as well.  */
!   t = gimple_register_type (t);
  
    if (TYPE_CANONICAL (t))
      return TYPE_CANONICAL (t);
  
-   /* Always register the main variant first.  This is important so we
-      pick up the non-typedef variants as canonical, otherwise we'll end
-      up taking typedef ids for structure tags during comparison.  */
-   if (TYPE_MAIN_VARIANT (t) != t)
-     gimple_register_canonical_type (TYPE_MAIN_VARIANT (t));
- 
    if (gimple_canonical_types == NULL)
      gimple_canonical_types = htab_create_ggc (16381, gimple_canonical_type_hash,
					       gimple_canonical_type_eq, 0);
--- 4856,4869 ----
    if (TYPE_CANONICAL (t))
      return TYPE_CANONICAL (t);
  
!   /* Use the leader of our main variant for determining our canonical
!      type.  The main variant leader is a type that will always
!      prevail.  */
!   t = gimple_register_type (TYPE_MAIN_VARIANT (t));
  
    if (TYPE_CANONICAL (t))
      return TYPE_CANONICAL (t);
  
    if (gimple_canonical_types == NULL)
      gimple_canonical_types = htab_create_ggc (16381, gimple_canonical_type_hash,
					       gimple_canonical_type_eq, 0);
Re: [PATCH][?/n] Cleanup LTO type merging
On Mon, 16 May 2011, H.J. Lu wrote:
> On Mon, May 16, 2011 at 7:17 AM, Richard Guenther <rguent...@suse.de> wrote:
>> The following patch improves hashing types by re-instantiating the patch
>> that makes us visit aggregate target types of pointers and function
>> return and argument types. This halves the collision rate on the type
>> hash table for a linux-kernel build, improves WPA compile-time from
>> 3mins to 1min and reduces memory usage by 1GB for that testcase.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6
>> build-tested.
>>
>> Richard.
>>
>> (patch is reversed)
>>
>> 2011-05-16  Richard Guenther  <rguent...@suse.de>
>>
>> 	* gimple.c (iterative_hash_gimple_type): Re-instantiate change to
>> 	always visit pointer target and function result and argument types.
>
> This caused:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013

I have reverted the patch for now.

Richard.
Re: Reintroduce -mflat option on SPARC
> Right, the -mflat option should only be for the 32-bit SPARC target.

OK, let's keep it that way for now.

Another question: why does the model hijack %i7 to use it as frame pointer, instead of just using %fp? AFAICS both are kept as fixed registers by the code, so the model seems to be wasting 1 register (2 without frame pointer).

-- 
Eric Botcazou
Re: Don't let search bots look at buglist.cgi
Hi,

On Mon, 16 May 2011, Ian Lance Taylor wrote:

>> httpd being in the top-10 always, fiddling with bugzilla URLs? (Note, I
>> don't have access to gcc.gnu.org, I'm relaying info from multiple
>> instances of discussion on #gcc and richi poking on it; that said, it
>> still might not be web crawlers, that's right, but I'll happily accept
>> _any_ load improvement on gcc.gnu.org, however unfounded it might seem)
>
> I think that simply blocking buglist.cgi has dropped bugzilla off the
> immediate radar. It also seems to have lowered the load, although I'm
> not sure if we are still keeping historical data.

Btw. FWIW, I had a quick look at one of the httpd log files, and in seven hours last Saturday (from 5:30 to 12:30) there were 435203 GET requests overall, and 391319 of them (90%) came from our own MnoGoSearch engine. Granted, many are then in fact 304 (not modified) responses, but still, perhaps the eagerness of our own crawler can be turned down a bit.

Ciao,
Michael.
Re: [PATCH][?/n] Cleanup LTO type merging
> On Mon, 16 May 2011, Jan Hubicka wrote:
>>> I've seen us merge different named structs which happen to reside on
>>> the same variant list. That's bogus, not only because we are adjusting
>>> TYPE_MAIN_VARIANT during incremental type-merging and fixup, so
>>> computing a persistent hash by looking at it looks fishy as well.
>>
>> Hi,
>> as reported on IRC earlier, I get the segfault while building libxul
>> due to an infinite recursion problem. I now however also get a lot
>> more of the following ICEs:
>>
>> In function '__unguarded_insertion_sort':
>> lto1: internal compiler error: in splice_child_die, at dwarf2out.c:8274
>>
>> previously it was reported once during the Mozilla build (and I put a
>> testcase into bugzilla), now it reproduces on many libraries. I did
>> not see this problem when applying only the SCC hashing change.
>
> This change causes us to preserve more TYPE_DECLs I think, so we might
> run more often into pre-existing debuginfo issues. Previously most of
> the types were merged into their nameless variant which probably didn't
> get output into debug info.
>
> Do you by chance have small testcases for your problems? ;)

I think you might just look into one at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48354

Honza
Re: [PATCH] Fix PR46728 (move pow/powi folds to tree phases)
On Tue, 2011-05-17 at 11:03 +0200, Richard Guenther wrote:
> On Mon, May 16, 2011 at 7:30 PM, William J. Schmidt
> <wschm...@linux.vnet.ibm.com> wrote:
>> Richi, thank you for the detailed review!
>>
>> I'll plan to move the power-series expansion into the existing IL walk
>> during pass_cse_sincos. As part of this, I'll move
>> tree_expand_builtin_powi and its subfunctions from builtins.c into
>> tree-ssa-math-opts.c. I'll submit this as a separate patch.
>>
>> I will also stop attempting to make code generation match completely at
>> -O0. If there are tests in the test suite that fail only at -O0 due to
>> these changes, I'll modify the tests to require -O1 or higher.
>>
>> I understand that you'd prefer that I leave the existing
>> canonicalization folds in place, and only un-canonicalize them during
>> my new pass (now, during cse_sincos). Actually, that was my first
>> approach to this issue. The problem that I ran into is that the various
>> folds are not performed just by the front end, but can pop up later,
>> after my pass is done. In particular, pass_fold_builtins will undo my
>> changes, turning expressions involving roots back into expressions
>> involving pow/powi. It wasn't clear to me whether the folds could kick
>> in elsewhere as well, so I took the approach of shutting them down. I
>> see now that this does lose some optimizations such as
>> pow(sqrt(cbrt(x)),6.0), as you pointed out.
>
> Yeah, it's always a delicate balance between canonicalization and
> optimal form for further optimization. Did you really see sqrt(cbrt(x))
> being transformed back to pow()? I would doubt that, as on gimple the
> foldings that require two function calls to match shouldn't trigger.
> Or do you see sqrt(x) turned into pow(x,0.5)? I see that the vectorizer
> for example handles both pow(x,0.5) and pow(x,2), so indeed that might
> happen.

Yes, I was seeing sqrt(x) turned back into pow(x,0.5), and even x*x turning back into pow(x,2.0). I don't specifically recall the sqrt(cbrt(x)) case; you're probably right about that one. But I had several test cases break because of this.

> I think what we might want to do is limit what the generic gimple
> fold_stmt folding does to function calls, also to avoid building
> regular GENERIC call statements again. But that might be a bigger
> project and certainly should be done separately. So I'd say don't worry
> about this issue for the initial patch but instead deal with it
> separately.

Agreed...

> We also repeatedly thought about whether canonicalizing everything to
> pow is a good idea or not; especially our canonicalizing of x * x to
> pow (x, 2) leads to interesting effects in some cases, as several
> passes do not handle function calls very well. So I also thought about
> introducing a POW_EXPR tree code that would be easier in this regard
> and would be a more IL-friendly canonical form of the power-related
> functions.
>
>> Should I attempt to leave the folds in place, and screen out the
>> particular cases that are causing trouble in pass_fold_builtins? Or is
>> it too fragile to try to catch all places where folds occur? If
>> there's a flag that indicates parsing is complete, I suppose I could
>> disable individual folds once we're into the optimizer. I'd appreciate
>> your guidance.
>
> Indeed restricting canonicalization to earlier passes would be the way
> to go I think. I will think of the best way to achieve this.

Thanks. I think we need to address this as part of this patch, unless you're willing to live with a number of broken test cases in the meanwhile. If I only do the un-canonicalization in the new pass and let some of the folds be re-done later, some will fail. I'll start experimenting and see how many.

Bill
Re: [patch ada]: Fix boolean_type_node setup and some cleanup for boolean use
> 2011-05-16  Kai Tietz
>
> 	PR middle-end/48989
> 	* gcc-interface/trans.c (Exception_Handler_to_gnu_sjlj): Use
> 	boolean_false_node instead of integer_zero_node.
> 	(convert_with_check): Likewise.
> 	* gcc-interface/decl.c (choices_to_gnu): Likewise.

OK for this part.

> 	* gcc-interface/misc.c (gnat_init): Set precision for generated
> 	boolean_type_node and initialize boolean_false_node.

Not OK, you cannot set the precision of boolean_type_node to 1 in Ada.

-- 
Eric Botcazou
[PATCH][?/n] LTO type merging cleanup
This fixes an oversight in the new SCC hash mixing code - we of course need to return the adjusted hash of our type, not the purely local one. There's still something weird going on; hash values somehow depend on the order we feed it types ...

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-05-17  Richard Guenther  <rguent...@suse.de>

	* gimple.c (iterative_hash_gimple_type): Simplify singleton case
	some more, fix final hash value of the non-singleton case.

Index: gcc/gimple.c
===================================================================
--- gcc/gimple.c	(revision 173827)
+++ gcc/gimple.c	(working copy)
@@ -4213,25 +4213,24 @@ iterative_hash_gimple_type (tree type, h
   if (state->low == state->dfsnum)
     {
       tree x;
-      struct sccs *cstate;
       struct tree_int_map *m;
 
       /* Pop off the SCC and set its hash values.  */
       x = VEC_pop (tree, *sccstack);
-      cstate = (struct sccs *)*pointer_map_contains (sccstate, x);
-      cstate->on_sccstack = false;
       /* Optimize SCC size one.  */
       if (x == type)
	{
+	  state->on_sccstack = false;
	  m = ggc_alloc_cleared_tree_int_map ();
	  m->base.from = x;
-	  m->to = cstate->u.hash;
+	  m->to = v;
	  slot = htab_find_slot (type_hash_cache, m, INSERT);
	  gcc_assert (!*slot);
	  *slot = (void *) m;
	}
       else
	{
+	  struct sccs *cstate;
	  unsigned first, i, size, j;
	  struct type_hash_pair *pairs;
	  /* Pop off the SCC and build an array of type, hash pairs.  */
@@ -4241,6 +4240,8 @@ iterative_hash_gimple_type (tree type, h
	  size = VEC_length (tree, *sccstack) - first + 1;
	  pairs = XALLOCAVEC (struct type_hash_pair, size);
	  i = 0;
+	  cstate = (struct sccs *)*pointer_map_contains (sccstate, x);
+	  cstate->on_sccstack = false;
	  pairs[i].type = x;
	  pairs[i].hash = cstate->u.hash;
	  do
@@ -4275,6 +4276,8 @@ iterative_hash_gimple_type (tree type, h
	      for (j = 0; pairs[j].hash != pairs[i].hash; ++j)
		hash = iterative_hash_hashval_t (pairs[j].hash, hash);
	      m->to = hash;
+	      if (pairs[i].type == type)
+		v = hash;
	      slot = htab_find_slot (type_hash_cache, m, INSERT);
	      gcc_assert (!*slot);
	      *slot = (void *) m;
Re: [patch ada]: Fix boolean_type_node setup and some cleanup for boolean use
2011/5/17 Eric Botcazou <ebotca...@adacore.com>:
>> 2011-05-16  Kai Tietz
>>
>> 	PR middle-end/48989
>> 	* gcc-interface/trans.c (Exception_Handler_to_gnu_sjlj): Use
>> 	boolean_false_node instead of integer_zero_node.
>> 	(convert_with_check): Likewise.
>> 	* gcc-interface/decl.c (choices_to_gnu): Likewise.
>
> OK for this part.
>
>> 	* gcc-interface/misc.c (gnat_init): Set precision for generated
>> 	boolean_type_node and initialize boolean_false_node.
>
> Not OK, you cannot set the precision of boolean_type_node to 1 in Ada.
>
> --
> Eric Botcazou

Hmm, sad. A check in tree-cfg that truth expressions have a type precision of 1 would be a good thing to have. What is actually the cause for not setting the type precision here? At least in the testcases I didn't find a regression caused by this.

Regards,
Kai
Re: [PATCH][?/n] Cleanup LTO type merging
On Tue, May 17, 2011 at 3:29 AM, Richard Guenther <rguent...@suse.de> wrote:
> On Mon, 16 May 2011, H.J. Lu wrote:
>> On Mon, May 16, 2011 at 7:17 AM, Richard Guenther <rguent...@suse.de> wrote:
>>> The following patch improves hashing types by re-instantiating the patch
>>> that makes us visit aggregate target types of pointers and function
>>> return and argument types. This halves the collision rate on the type
>>> hash table for a linux-kernel build, improves WPA compile-time from
>>> 3mins to 1min and reduces memory usage by 1GB for that testcase.
>>>
>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6
>>> build-tested.
>>>
>>> Richard.
>>>
>>> (patch is reversed)
>>>
>>> 2011-05-16  Richard Guenther  <rguent...@suse.de>
>>>
>>> 	* gimple.c (iterative_hash_gimple_type): Re-instantiate change to
>>> 	always visit pointer target and function result and argument types.
>>
>> This caused:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>
> I have reverted the patch for now.

It doesn't solve the problem and I reopened:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013

Your followup patches may have similar issues.

-- 
H.J.
Re: [PATCH][?/n] Cleanup LTO type merging
On Tue, May 17, 2011 at 5:59 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Tue, May 17, 2011 at 3:29 AM, Richard Guenther <rguent...@suse.de> wrote:
>> On Mon, 16 May 2011, H.J. Lu wrote:
>>> On Mon, May 16, 2011 at 7:17 AM, Richard Guenther <rguent...@suse.de> wrote:
>>>> The following patch improves hashing types by re-instantiating the patch
>>>> that makes us visit aggregate target types of pointers and function
>>>> return and argument types. This halves the collision rate on the type
>>>> hash table for a linux-kernel build, improves WPA compile-time from
>>>> 3mins to 1min and reduces memory usage by 1GB for that testcase.
>>>>
>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6
>>>> build-tested.
>>>>
>>>> Richard.
>>>>
>>>> (patch is reversed)
>>>>
>>>> 2011-05-16  Richard Guenther  <rguent...@suse.de>
>>>>
>>>> 	* gimple.c (iterative_hash_gimple_type): Re-instantiate change to
>>>> 	always visit pointer target and function result and argument types.
>>>
>>> This caused:
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>>
>> I have reverted the patch for now.
>
> It doesn't solve the problem and I reopened:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>
> Your followup patches may have similar issues.

I think you reverted the WRONG patch:

http://gcc.gnu.org/viewcvs?view=revision&revision=173827

-- 
H.J.
Re: [PATCH][?/n] Cleanup LTO type merging
On Tue, May 17, 2011 at 3:01 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Tue, May 17, 2011 at 5:59 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Tue, May 17, 2011 at 3:29 AM, Richard Guenther <rguent...@suse.de> wrote:
>>> On Mon, 16 May 2011, H.J. Lu wrote:
>>>> On Mon, May 16, 2011 at 7:17 AM, Richard Guenther <rguent...@suse.de> wrote:
>>>>> The following patch improves hashing types by re-instantiating the patch
>>>>> that makes us visit aggregate target types of pointers and function
>>>>> return and argument types. This halves the collision rate on the type
>>>>> hash table for a linux-kernel build, improves WPA compile-time from
>>>>> 3mins to 1min and reduces memory usage by 1GB for that testcase.
>>>>>
>>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6
>>>>> build-tested.
>>>>>
>>>>> Richard.
>>>>>
>>>>> (patch is reversed)
>>>>>
>>>>> 2011-05-16  Richard Guenther  <rguent...@suse.de>
>>>>>
>>>>> 	* gimple.c (iterative_hash_gimple_type): Re-instantiate change to
>>>>> 	always visit pointer target and function result and argument types.
>>>>
>>>> This caused:
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>>>
>>> I have reverted the patch for now.
>>
>> It doesn't solve the problem and I reopened:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>>
>> Your followup patches may have similar issues.
>
> I think you reverted the WRONG patch:
>
> http://gcc.gnu.org/viewcvs?view=revision&revision=173827

No, that was on purpose.
Re: [PATCH][?/n] Cleanup LTO type merging
On Tue, May 17, 2011 at 6:03 AM, Richard Guenther
<richard.guent...@gmail.com> wrote:
> On Tue, May 17, 2011 at 3:01 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Tue, May 17, 2011 at 5:59 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>> On Tue, May 17, 2011 at 3:29 AM, Richard Guenther <rguent...@suse.de> wrote:
>>>> On Mon, 16 May 2011, H.J. Lu wrote:
>>>>> On Mon, May 16, 2011 at 7:17 AM, Richard Guenther <rguent...@suse.de> wrote:
>>>>>> The following patch improves hashing types by re-instantiating the patch
>>>>>> that makes us visit aggregate target types of pointers and function
>>>>>> return and argument types. This halves the collision rate on the type
>>>>>> hash table for a linux-kernel build, improves WPA compile-time from
>>>>>> 3mins to 1min and reduces memory usage by 1GB for that testcase.
>>>>>>
>>>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6
>>>>>> build-tested.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>> (patch is reversed)
>>>>>>
>>>>>> 2011-05-16  Richard Guenther  <rguent...@suse.de>
>>>>>>
>>>>>> 	* gimple.c (iterative_hash_gimple_type): Re-instantiate change to
>>>>>> 	always visit pointer target and function result and argument types.
>>>>>
>>>>> This caused:
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>>>>
>>>> I have reverted the patch for now.
>>>
>>> It doesn't solve the problem and I reopened:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49013
>>>
>>> Your followup patches may have similar issues.
>>
>> I think you reverted the WRONG patch:
>>
>> http://gcc.gnu.org/viewcvs?view=revision&revision=173827
>
> No, that was on purpose.

But it doesn't fix the problem.

-- 
H.J.
Re: FDO patch -- make ic related vars TLS if target allows
On Wed, Apr 27, 2011 at 10:54 AM, Xinliang David Li <davi...@google.com> wrote:
> Hi, please review the trivial patch below. It reduces race conditions
> in value profiling. Another trivial change (to initialize the
> function_list struct) is also included.
>
> Bootstrapped and regression tested on x86-64/linux.
>
> Thanks,
>
> David
>
> 2011-04-27  Xinliang David Li  <davi...@google.com>
>
> 	* tree-profile.c (init_ic_make_global_vars): Set tls attribute
> 	on ic vars.
> 	* coverage.c (coverage_end_function): Initialize function_list
> 	with zero.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49014

-- 
H.J.
Re: [patch ada]: Fix boolean_type_node setup and some cleanup for boolean use
> Hmm, sad. A check in tree-cfg that truth expressions have a type
> precision of 1 would be a good thing to have. What is actually the
> cause for not setting the type precision here?

But we are setting it:

  /* In Ada, we use an unsigned 8-bit type for the default boolean type.  */
  boolean_type_node = make_unsigned_type (8);
  TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);

See make_unsigned_type:

/* Create and return a type for unsigned integers of PRECISION bits.  */

tree
make_unsigned_type (int precision)
{
  tree type = make_node (INTEGER_TYPE);

  TYPE_PRECISION (type) = precision;

  fixup_unsigned_type (type);
  return type;
}

The other languages are changing the precision, but in Ada we need a standard scalar (precision == mode size) in order to support invalid values.

> At least in the testcases I didn't find a regression caused by this.

Right, I've just installed the attached testcase; it passes with the unmodified compiler but fails with your gcc-interface/misc.c change.

2011-05-17  Eric Botcazou  <ebotca...@adacore.com>

	* gnat.dg/invalid1.adb: New test.

-- 
Eric Botcazou

-- { dg-do run }
-- { dg-options "-gnatws -gnatVa" }

pragma Initialize_Scalars;

procedure Invalid1 is
   X : Boolean;
   A : Boolean := False;

   procedure Uninit (B : out Boolean) is
   begin
      if A then
         B := True;
         raise Program_Error;
      end if;
   end;

begin
   --  first, check that initialize_scalars is enabled
   begin
      if X then
         A := False;
      end if;
      raise Program_Error;
   exception
      when Constraint_Error => null;
   end;

   --  second, check if copyback of an invalid value raises constraint error
   begin
      Uninit (A);
      if A then
         --  we expect constraint error in the 'if' above according to gnat ug:
         --
         --  call. Note that there is no specific option to test `out'
         --  parameters, but any reference within the subprogram will be
         --  tested in the usual manner, and if an invalid value is copied
         --  back, any reference to it will be subject to validity checking.
         --  ...
         raise Program_Error;
      end if;
      raise Program_Error;
   exception
      when Constraint_Error => null;
   end;
end;
Re: [PATCH] comment precising need to use free_dominance_info
So maybe this patch, adding a comment on calculate_dominance_info, is better suited.

ChangeLog:

2011-05-17  Pierre Vittet  <pier...@pvittet.com>

	* dominance.c (calculate_dominance_info): Add comment specifying
	when to free with free_dominance_info

contributor number: 634276

Index: gcc/dominance.c
===================================================================
--- gcc/dominance.c	(revision 173830)
+++ gcc/dominance.c	(working copy)
@@ -628,8 +628,15 @@ compute_dom_fast_query (enum cdi_direction dir)
 }
 
 /* The main entry point into this module.  DIR is set depending on whether
-   we want to compute dominators or postdominators.  */
+   we want to compute dominators or postdominators.
 
+   We try to keep dominance info alive as long as possible (to avoid
+   recomputing it often).  It has to be freed with free_dominance_info when
+   a CFG transformation makes it invalid.
+
+   Post-dominance info is less often used, and should be freed after each
+   use.  */
+
 void
 calculate_dominance_info (enum cdi_direction dir)
 {
RFA: MN10300: Add TLS support
Hi Richard, Hi Jeff, Hi Alex, Here is another MN10300 patch. This ones adds support for TLS. I must confess that I did not actually write this code - DJ did - but I have been asked to submit it upstream, so here goes: OK to apply ? Cheers Nick gcc/ChangeLog 2011-05-17 DJ Delorie d...@redhat.com Nick Clifton ni...@redhat.com * config/mn10300/mn10300.c (mn10300_unspec_int_label_counter): New variable. (mn10300_option_override): Disable TLS for the MN10300. (tls_symbolic_operand_kind): New function. (get_some_local_dynamic_name_1): New function. (get_some_local_dynamic_name): New function. (mn10300_print_operand): Handle %. (mn10300_legitimize_address): Legitimize TLS addresses. (is_legitimate_tls_operand): New function. (mn10300_legitimate_pic_operand_p): TLS operands are legitimate. (mn10300_legitimate_address_p): TLS symbols do not make legitimate addresses. Allow TLS operands under some circumstances. (mn10300_legitimate_constant_p): Handle TLS UNSPECs. (mn10300_init_machine_status): New function. (mn10300_init_expanders): New function. (pic_nonpic_got_ptr): New function. (mn10300_tls_get_addr): New function. (mn10300_legitimize_tls_address): New function. (mn10300_constant_address_p): New function. (TARGET_HAVE_TLS): Define. * config/mn10300/predicates.md (tls_symbolic_operand): New. (nontls_general_operand): New. * config/mn10300/mn10300.h (enum reg_class): Add D0_REGS, A0_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (struct machine_function): New structure. (INIT_EXPANDERS): Define. (mn10300_unspec_int_label_counter): New variable. (PRINT_OPERAND_PUNCT_VALID_P): Define. (CONSTANT_ADDRESS_P): Define. * config/mn10300/constraints (B): New constraint. (C): New constraint. * config/mn10300/mn10300-protos.h: Alpha sort. (mn10300_init_expanders): Prototype. (mn10300_tls_get_addr): Prototype. (mn10300_legitimize_tls_address): Prototype. (mn10300_constant_address_p): Prototype. * config/mn10300/mn10300.md (TLS_REG): New constant. (UNSPEC_INT_LABEL): New constant. (UNSPEC_TLSGD): New constant. (UNSPEC_TLSLDM): New constant. (UNSPEC_DTPOFF): New constant. (UNSPEC_GOTNTPOFF): New constant. (UNSPEC_INDNTPOFF): New constant. (UNSPEC_TPOFF): New constant. (UNSPEC_TLS_GD): New constant. (UNSPEC_TLS_LD_BASE): New constant. (movsi): Add TLS code. (tls_global_dynamic_i): New pattern. (tls_global_dynamic): New pattern. (tls_local_dynamic_base_i): New pattern. (tls_local_dynamic_base): New pattern. (tls_initial_exec): New pattern. (tls_initial_exec_1): New pattern. (tls_initial_exec_2): New pattern. (am33_set_got): New pattern. (int_label): New pattern. (am33_loadPC_anyreg): New pattern. (add_GOT_to_any_reg): New pattern. Index: gcc/config/mn10300/mn10300.c === --- gcc/config/mn10300/mn10300.c (revision 173815) +++ gcc/config/mn10300/mn10300.c (working copy) @@ -46,7 +46,12 @@ #include df.h #include opts.h #include cfgloop.h +#include ggc.h +/* This is used by GOTaddr2picreg to uniquely identify + UNSPEC_INT_LABELs. */ +int mn10300_unspec_int_label_counter; + /* This is used in the am33_2.0-linux-gnu port, in which global symbol names are not prefixed by underscores, to tell whether to prefix a label with a plus sign or not, so that the assembler can tell @@ -124,6 +129,9 @@ target_flags = ~MASK_MULT_BUG; else { + /* We can't do TLS if we don't have the TLS register. */ + targetm.have_tls = false; + /* Disable scheduling for the MN10300 as we do not have timing information available for it. 
	     */
 	  flag_schedule_insns = 0;
@@ -162,6 +170,51 @@
     fprintf (asm_out_file, "\t.am33\n");
 }
 
+/* Returns non-zero if OP has the KIND tls model.  */
+
+static inline bool
+tls_symbolic_operand_kind (rtx op, enum tls_model kind)
+{
+  if (GET_CODE (op) != SYMBOL_REF)
+    return false;
+  return SYMBOL_REF_TLS_MODEL (op) == kind;
+}
+
+/* Locate some local-dynamic symbol still in use by this function
+   so that we can print its name in some tls_local_dynamic_base
+   pattern.  This is used by %& in print_operand().  */
+
+static int
+get_some_local_dynamic_name_1 (rtx *px, void *data ATTRIBUTE_UNUSED)
+{
+  rtx x = *px;
+
+  if (GET_CODE (x) == SYMBOL_REF
+      && tls_symbolic_operand_kind (x, TLS_MODEL_LOCAL_DYNAMIC))
+    {
+      cfun->machine->some_ld_name = XSTR (x, 0);
+      return 1;
+    }
+
+  return 0;
+}
+
+static const char *
+get_some_local_dynamic_name (void)
+{
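As background for the pattern names in the ChangeLog above: they line up with the standard TLS access models, which user code selects per variable. A rough user-level sketch follows; the variable names are invented, and which model a plain __thread variable gets by default depends on -fPIC and symbol visibility.

__thread int gd_counter;            /* with -fPIC, typically global-dynamic
                                       (cf. tls_global_dynamic)           */
static __thread int ld_cache;       /* with -fPIC, typically local-dynamic
                                       (cf. tls_local_dynamic_base)       */
__thread int ie_tick
  __attribute__ ((tls_model ("initial-exec")));   /* cf. tls_initial_exec */

int
sum_tls (void)
{
  return gd_counter + ld_cache + ie_tick;
}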
[PATCH] Fixup LTO SCC hash comparison fn
Quite obvious if you look at it for the 100th time...

Richard.

2011-05-17  Richard Guenther  rguent...@suse.de

	* gimple.c (type_hash_pair_compare): Fix comparison.

Index: gcc/gimple.c
===
--- gcc/gimple.c (revision 173830)
+++ gcc/gimple.c (working copy)
@@ -4070,9 +4070,11 @@ type_hash_pair_compare (const void *p1_,
 {
   const struct type_hash_pair *p1 = (const struct type_hash_pair *) p1_;
   const struct type_hash_pair *p2 = (const struct type_hash_pair *) p2_;
-  if (p1->hash == p2->hash)
-    return TYPE_UID (p1->type) - TYPE_UID (p2->type);
-  return p1->hash - p2->hash;
+  if (p1->hash < p2->hash)
+    return -1;
+  else if (p1->hash > p2->hash)
+    return 1;
+  return 0;
 }
 
 /* Returning a hash value for gimple type TYPE combined with VAL.
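The bug is the classic subtraction comparator applied to unsigned hash values: the difference wraps modulo 2^32, so the sign of the truncated result no longer reflects the ordering, and qsort is handed an inconsistent comparator. A standalone illustration, with values chosen to trigger the wraparound (nothing below is from the patch):

#include <cstdio>

static int
bad_cmp (unsigned a, unsigned b)
{
  /* The wrapped difference is truncated to int; its sign is meaningless
     once the two values are more than INT_MAX apart.  */
  return (int) (a - b);
}

int
main ()
{
  unsigned lo = 1, hi = 0x80000001u;   /* hi > lo */
  /* On the usual two's-complement targets both calls print INT_MIN,
     i.e. each value compares "less than" the other.  */
  printf ("%d %d\n", bad_cmp (lo, hi), bad_cmp (hi, lo));
  return 0;
}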
[PING][PATCH 13/18] move TS_EXP to be a substructure of TS_TYPED
On 05/10/2011 04:18 PM, Nathan Froyd wrote: On 03/10/2011 11:23 PM, Nathan Froyd wrote: After all that, we can finally make tree_exp inherit from typed_tree. Quite anticlimactic. Ping. http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00559.html Ping^2. -Nathan
Re: Libiberty: POSIXify psignal definition
On Thu, 2011-05-05 at 09:30 +0200, Corinna Vinschen wrote: [Please keep me CCed, I'm not subscribed to gcc-patches. Thank you] Hi, the definition of psignal in libiberty is void psignal (int, char *); The correct definition per POSIX is void psignal (int, const char *); The below patch fixes that. Thanks, Corinna * strsignal.c (psignal): Change second parameter to const char *. Fix comment accordingly. OK. R.
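Reduced to its essence, the mismatch the patch removes looks like this when one translation unit sees both prototypes (a hypothetical reduction; the exact diagnostic wording varies by compiler version):

extern "C" void psignal (int, const char *);  /* POSIX/newlib prototype  */
extern "C" void psignal (int, char *);        /* old libiberty prototype */
/* g++: error: conflicting declaration of C function
   'void psignal(int, char*)'; a C compile reports
   "conflicting types for 'psignal'" instead.  */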
Re: Libiberty: POSIXify psignal definition
* strsignal.c (psignal): Change second parameter to const char *. Fix comment accordingly. OK. I had argued against this patch: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00439.html The newlib change broke ALL released versions of gcc, and the above patch does NOT fix the problem, but merely hides it until the next time we trip over it.
Re: [patch ada]: Fix boolean_type_node setup and some cleanup for boolean use
2011/5/17 Eric Botcazou ebotca...@adacore.com:

Hmm, sad, as a check in tree-cfg that truth expressions have a type precision of 1 would be a good thing. What is the actual cause for not setting the type precision here?

But we are setting it:

  /* In Ada, we use an unsigned 8-bit type for the default boolean type.  */
  boolean_type_node = make_unsigned_type (8);
  TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);

See make_unsigned_type:

/* Create and return a type for unsigned integers of PRECISION bits.  */

tree
make_unsigned_type (int precision)
{
  tree type = make_node (INTEGER_TYPE);

  TYPE_PRECISION (type) = precision;

  fixup_unsigned_type (type);
  return type;
}

The other languages are changing the precision, but in Ada we need a standard scalar (precision == mode size) in order to support invalid values.

At least in the testcases I didn't find a regression caused by this.

Right, I've just installed the attached testcase; it passes with the unmodified compiler but fails with your gcc-interface/misc.c change.

2011-05-17  Eric Botcazou  ebotca...@adacore.com

	* gnat.dg/invalid1.adb: New test.

-- Eric Botcazou

Ok, thanks for explaining it. So would the patch be OK to apply without the precision setting?

Regards,
Kai
Re: Libiberty: POSIXify psignal definition
On Tue, 2011-05-17 at 11:52 -0400, DJ Delorie wrote: * strsignal.c (psignal): Change second parameter to const char *. Fix comment accordingly. OK. I had argued against this patch: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00439.html The newlib change broke ALL released versions of gcc, and the above patch does NOT fix the problem, but merely hides it until the next time we trip over it. So regardless of whether the changes to newlib are a good idea or not, I think the fix to libiberty is still right. POSIX says that psignal takes a const char *, and libiberty's implementation doesn't. That's just silly. I do agree that the newlib code should be tightened up, particularly in order to support older compilers; but that doesn't mean we shouldn't fix libiberty as well. R.
Re: Libiberty: POSIXify psignal definition
On May 17 16:33, Richard Earnshaw wrote: On Thu, 2011-05-05 at 09:30 +0200, Corinna Vinschen wrote: [Please keep me CCed, I'm not subscribed to gcc-patches. Thank you] Hi, the definition of psignal in libiberty is void psignal (int, char *); The correct definition per POSIX is void psignal (int, const char *); The below patch fixes that. Thanks, Corinna * strsignal.c (psignal): Change second parameter to const char *. Fix comment accordingly. OK. R. Thanks. I just have no check-in rights to the gcc repository. I applied the change to the sourceware CVS repository but for gcc I need a proxy. Thanks, Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat
Re: Libiberty: POSIXify psignal definition
On May 17 17:07, Richard Earnshaw wrote: On Tue, 2011-05-17 at 11:52 -0400, DJ Delorie wrote: * strsignal.c (psignal): Change second parameter to const char *. Fix comment accordingly. OK. I had argued against this patch: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00439.html The newlib change broke ALL released versions of gcc, and the above patch does NOT fix the problem, but merely hides it until the next time we trip over it. So regardless of whether the changes to newlib are a good idea or not, I think the fix to libiberty is still right. POSIX says that psignal takes a const char *, and libiberty's implementation doesn't. That's just silly. I do agree that the newlib code should be tightened up, particularly in order to support older compilers; What I don't understand is why the newlib change broke older compilers. The function has been added to newlib and the definitions in newlib are correct. If this is referring to the fact that libiberty doesn't grok automatically if a symbol has been added to newlib, then that's a problem in libiberty, not in newlib. Otherwise, if you're building an older compiler, just use an older newlib as well. Corinna -- Corinna Vinschen Cygwin Project Co-Leader Red Hat
Re: Libiberty: POSIXify psignal definition
So regardless of whether the changes to newlib are a good idea or not, I think the fix to libiberty is still right. Irrelevant. I said I'd accept that change *after* the real problem is fixed. The real problem hasn't been fixed. The real problem is that libiberty should NOT INCLUDE PSIGNAL AT ALL if newlib has it. What *should* have happened is that libiberty should have been fixed *first*, and newlib waited until a gcc/binutils release cycle happened, so that at least ONE version of those could build with newlib.
Re: Libiberty: POSIXify psignal definition
Thanks. I just have no check in rights to the gcc repository. I applied the change to the sourceware CVS repository but for gcc I need a proxy. Please, never apply libiberty patches only to src. They're likely to get deleted by the robomerge. The rule is: gcc only, or both at the same time.
Re: Libiberty: POSIXify psignal definition
What I don't understand is why the newlib change broke older compilers. Older compilers have the older libiberty. At the moment, libiberty cannot be built by *any* released gcc, because you cannot *build* any released gcc, because it cannot build its target libiberty. The function has been added to newlib and the definitions in newlib are correct. Correct is irrelevant. They don't match libiberty, so the build breaks. If this is referring to the fact that libiberty doesn't grok automatically if a symbol has been added to newlib, then that's a problem in libiberty, not in newlib. It's a problem in every released gcc at the moment, so no released gcc can be built for a newlib target, without hacking the sources. Otherwise, if you're building an older compiler, just use an older newlib as well. The only option here is to not release a newlib at all until a fixed gcc release happens, then, and require that fixed gcc for that version of newlib forward.
Re: [Patch, libfortran] PR 48931 Async-signal-safety of backtrace signal handler
On 05/14/2011 09:40 PM, Janne Blomqvist wrote: Hi, the current version of showing the backtrace is not async-signal-safe as it uses backtrace_symbols() which, in turn, uses malloc(). The attached patch changes the backtrace printing functionality to instead use backtrace_symbols_fd() and pipes. Great - this would solve a problem I filed a bugzilla report for years ago (unfortunately, I do not know the number of it). I closed it WONTFIX, because neither FX nor I could come up with an alternative way *not* using malloc. [ The problem was getting a traceback after corruption of the malloc arena, which just hangs under the current implementation. ] -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
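For reference, the async-signal-safe shape of such a handler, as a minimal sketch against the glibc API rather than libgfortran's actual code: backtrace_symbols_fd writes each frame straight to a file descriptor, so no malloc happens at signal time.

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

extern "C" void
fatal_handler (int sig)
{
  void *frames[64];
  int n = backtrace (frames, 64);
  backtrace_symbols_fd (frames, n, STDERR_FILENO);
  _exit (128 + sig);  /* _exit, not exit: exit() runs non-safe cleanup */
}

int
main ()
{
  /* A real handler would also call backtrace() once at startup, since
     glibc may allocate on the first call while loading libgcc.  */
  signal (SIGSEGV, fatal_handler);
  /* ... */
  return 0;
}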
[PATCH, i386]: Trivial, use bool some more.
Hello! 2011-05-16 Uros Bizjak ubiz...@gmail.com * config/i386/i386-protos.h (output_fix_trunc): Change arg 3 to bool. (output_fp_compare): Change args 3 and 4 to bool. (ix86_expand_call): Change arg 6 to bool. (ix86_attr_length_immediate_default): Change arg 2 to bool. (ix86_attr_length_vex_default): Change arg 3 to bool. * config/i386/i386.md: Update all uses. * config/i386/i386.c: Ditto. (ix86_flags_dependent): Change return type to bool. Patch was tested on x86_64-pc-linux-gnu {,-m32}, also with --enable-build-with-cxx (additional patch is needed to bootstrap without errors ATM). Committed to mainline SVN. Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 173832) +++ config/i386/i386.md (working copy) @@ -414,9 +414,9 @@ (const_int 0) (eq_attr type alu,alu1,negnot,imovx,ishift,rotate,ishift1,rotate1, imul,icmp,push,pop) - (symbol_ref ix86_attr_length_immediate_default(insn,1)) + (symbol_ref ix86_attr_length_immediate_default (insn, true)) (eq_attr type imov,test) - (symbol_ref ix86_attr_length_immediate_default(insn,0)) + (symbol_ref ix86_attr_length_immediate_default (insn, false)) (eq_attr type call) (if_then_else (match_operand 0 constant_call_address_operand ) (const_int 4) @@ -524,11 +524,11 @@ (if_then_else (and (eq_attr prefix_0f 1) (eq_attr prefix_extra 0)) (if_then_else (eq_attr prefix_vex_w 1) - (symbol_ref ix86_attr_length_vex_default (insn, 1, 1)) - (symbol_ref ix86_attr_length_vex_default (insn, 1, 0))) + (symbol_ref ix86_attr_length_vex_default (insn, true, true)) + (symbol_ref ix86_attr_length_vex_default (insn, true, false))) (if_then_else (eq_attr prefix_vex_w 1) - (symbol_ref ix86_attr_length_vex_default (insn, 0, 1)) - (symbol_ref ix86_attr_length_vex_default (insn, 0, 0) + (symbol_ref ix86_attr_length_vex_default (insn, false, true)) + (symbol_ref ix86_attr_length_vex_default (insn, false, false) ;; Set when modrm byte is used. 
(define_attr modrm @@ -1262,7 +1262,7 @@ UNSPEC_FNSTSW))] X87_FLOAT_MODE_P (GET_MODE (operands[1])) GET_MODE (operands[1]) == GET_MODE (operands[2]) - * return output_fp_compare (insn, operands, 0, 0); + * return output_fp_compare (insn, operands, false, false); [(set_attr type multi) (set_attr unit i387) (set (attr mode) @@ -1309,7 +1309,7 @@ (match_operand:XF 2 register_operand f))] UNSPEC_FNSTSW))] TARGET_80387 - * return output_fp_compare (insn, operands, 0, 0); + * return output_fp_compare (insn, operands, false, false); [(set_attr type multi) (set_attr unit i387) (set_attr mode XF)]) @@ -1343,7 +1343,7 @@ (match_operand:MODEF 2 nonimmediate_operand fm))] UNSPEC_FNSTSW))] TARGET_80387 - * return output_fp_compare (insn, operands, 0, 0); + * return output_fp_compare (insn, operands, false, false); [(set_attr type multi) (set_attr unit i387) (set_attr mode MODE)]) @@ -1378,7 +1378,7 @@ UNSPEC_FNSTSW))] X87_FLOAT_MODE_P (GET_MODE (operands[1])) GET_MODE (operands[1]) == GET_MODE (operands[2]) - * return output_fp_compare (insn, operands, 0, 1); + * return output_fp_compare (insn, operands, false, true); [(set_attr type multi) (set_attr unit i387) (set (attr mode) @@ -1428,7 +1428,7 @@ X87_FLOAT_MODE_P (GET_MODE (operands[1])) (TARGET_USE_MODEMODE_FIOP || optimize_function_for_size_p (cfun)) (GET_MODE (operands [3]) == GET_MODE (operands[1])) - * return output_fp_compare (insn, operands, 0, 0); + * return output_fp_compare (insn, operands, false, false); [(set_attr type multi) (set_attr unit i387) (set_attr fp_int_src true) @@ -1504,7 +1504,7 @@ TARGET_MIX_SSE_I387 SSE_FLOAT_MODE_P (GET_MODE (operands[0])) GET_MODE (operands[0]) == GET_MODE (operands[1]) - * return output_fp_compare (insn, operands, 1, 0); + * return output_fp_compare (insn, operands, true, false); [(set_attr type fcmp,ssecomi) (set_attr prefix orig,maybe_vex) (set (attr mode) @@ -1533,7 +1533,7 @@ TARGET_SSE_MATH SSE_FLOAT_MODE_P (GET_MODE (operands[0])) GET_MODE (operands[0]) == GET_MODE (operands[1]) - * return output_fp_compare (insn, operands, 1, 0); + * return output_fp_compare (insn, operands, true, false); [(set_attr type ssecomi) (set_attr prefix maybe_vex) (set (attr mode) @@ -1557,7 +1557,7 @@ TARGET_CMOVE !(SSE_FLOAT_MODE_P (GET_MODE (operands[0])) TARGET_SSE_MATH) GET_MODE (operands[0]) == GET_MODE (operands[1]) - * return output_fp_compare (insn, operands, 1, 0); + * return output_fp_compare (insn, operands, true, false); [(set_attr type fcmp) (set (attr mode) (cond
[PATCH]: Restore bootstrap with --enable-build-with-cxx
Hello!

2011-05-17  Uros Bizjak  ubiz...@gmail.com

	* ipa-inline-analysis.c (inline_node_duplication_hook): Initialize
	info->entry with 0.
	* tree-inline.c (maybe_inline_call_in_expr): Initialize
	id.transform_lang_insert_block with NULL.

Tested on x86_64-pc-linux-gnu {,-m32} with --enable-build-with-cxx. Committed to mainline SVN as obvious.

Uros.

Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c (revision 173832)
+++ ipa-inline-analysis.c (working copy)
@@ -702,7 +702,7 @@ inline_node_duplication_hook (struct cgr
   bool inlined_to_p = false;
   struct cgraph_edge *edge;
 
-  info->entry = false;
+  info->entry = 0;
   VEC_safe_grow_cleared (tree, heap, known_vals, count);
   for (i = 0; i < count; i++)
     {
Index: tree-inline.c
===
--- tree-inline.c (revision 173832)
+++ tree-inline.c (working copy)
@@ -5232,7 +5232,7 @@ maybe_inline_call_in_expr (tree exp)
   id.transform_call_graph_edges = CB_CGE_DUPLICATE;
   id.transform_new_cfg = false;
   id.transform_return_to_modify = true;
-  id.transform_lang_insert_block = false;
+  id.transform_lang_insert_block = NULL;
 
   /* Make sure not to unshare trees behind the front-end's back
      since front-end specific mechanisms may rely on sharing.  */
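The C/C++ difference behind these two hunks, reduced to a few lines. The struct below is a hypothetical stand-in for the real copy_body_data, and the quoted diagnostic is approximate:

typedef void (*insert_block_fn) (void *);

struct mock_copy_body_data          /* stand-in for the real struct */
{
  insert_block_fn transform_lang_insert_block;
};

void
init (struct mock_copy_body_data *id)
{
  id->transform_lang_insert_block = NULL;    /* accepted as C and as C++ */
  /* id->transform_lang_insert_block = false;
     accepted by a C compiler (false is just 0), but g++ complains about
     converting 'false' to pointer type, which is fatal under -Werror.  */
}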
[PATCH, MELT] correcting path error in the Makefile.in
This patch corrects a bug in the current revision of MELT which was preventing MELT from running correctly: a path problem in gcc/Makefile.in (melt-modules/ and melt-module.mk were not found).

My contributor number is 634276.

changelog :
2011-05-17  Pierre Vittet  pier...@pvittet.com

	* Makefile.in : Correct path errors for melt_module_dir and for
	install-melt-mk target

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 173832)
+++ gcc/Makefile.in (working copy)
@@ -5352,7 +5352,7 @@
 melt_default_modules_list=melt-default-modules
 melt_source_dir=$(libexecsubdir)/melt-source/
 ## this is the installation directory of melt dynamic modules (*.so)
-melt_module_dir=$(libexecsubdir)/melt-module/
+melt_module_dir=$(libexecsubdir)/melt-modules/
 ## this is the installed path of the MELT module makefile
 melt_installed_module_makefile=$(libexecsubdir)/melt-module.mk
@@ -5416,8 +5416,8 @@ install-melt-modules: melt-modules melt-all-module
 ## install the makefile for MELT modules
 install-melt-mk: melt-module.mk
-	$(mkinstalldirs) $(DESTDIR)$(plugin_includedir)
-	$(INSTALL_DATA) $< $(DESTDIR)/$(plugin_includedir)/
+	$(mkinstalldirs) $(DESTDIR)$(libexecsubdir)
+	$(INSTALL_DATA) $< $(DESTDIR)/$(libexecsubdir)/
 ## install the default modules list
 install-melt-default-modules-list: $(melt_default_modules_list).modlis
Re: [Patch, libfortran] PR 48931 Async-signal-safety of backtrace signal handler
On 05/17/2011 07:50 PM, Toon Moene wrote: On 05/14/2011 09:40 PM, Janne Blomqvist wrote: Hi, the current version of showing the backtrace is not async-signal-safe as it uses backtrace_symbols() which, in turn, uses malloc(). The attached patch changes the backtrace printing functionality to instead use backtrace_symbols_fd() and pipes. Great - this would solve a problem I filed a bugzilla report for years ago (unfortunately, I do not know the number of it). It was 33905 (2007-10-26). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: [PATCH, MELT] correcting path error in the Makefile.in
On Tue, 17 May 2011 21:30:44 +0200 Pierre Vittet pier...@pvittet.com wrote: This patch correct a bug in the current revision of MELT, which was preventing MELT to run correctly. This was a path problem in gcc/Makefile.in (melt-modules/ and melt-modules.mk) were not found. My contributor number is 634276. changelog : 2011-05-17 Pierre Vittet pier...@pvittet.com * Makefile.in : Correct path errors for melt_module_dir and for install-melt-mk target The ChangeLog.MELT entry should mention the Makefile target as changelog functions. And the colon shouldn't have any space before. So I applied the patch with the following entry: 2011-05-17 Pierre Vittet pier...@pvittet.com * Makefile.in (melt_module_dir,install-melt-mk): Correct path errors. Committed revision 173835. Thanks. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} ***
Re: [PATCH]: Restore bootstrap with --enable-build-with-cxx
On 05/17/2011 08:32 PM, Uros Bizjak wrote: Tested on x86_64-pc-linux-gnu {, m32} with --enable-build-with-cxx. Committed to mainline SVN as obvious. Does that mean that I can now remove the --disable-werror from my daily C++ bootstrap run ? It's great that some people understand the intricacies of the infight^H^H^H^H^H^H differences between the C and C++ type model. OK: 1/2 :-) -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Restore MIPS builds
I've applied the patch below to restore -Werror MIPS builds. Tested on mips64-linux-gnu. Richard gcc/ * config/mips/mips.c (mips_handle_option): Remove unused variable. Index: gcc/config/mips/mips.c === --- gcc/config/mips/mips.c 2011-05-15 08:37:21.0 +0100 +++ gcc/config/mips/mips.c 2011-05-15 08:37:28.0 +0100 @@ -15287,7 +15287,6 @@ mips_handle_option (struct gcc_options * location_t loc ATTRIBUTE_UNUSED) { size_t code = decoded-opt_index; - const char *arg = decoded-arg; switch (code) {
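The failure mode being fixed, in miniature; a hypothetical reduction, not the mips.c code itself:

static int
handle_option_like (int code, const char *decoded_arg)
{
  const char *arg = decoded_arg;  /* declared, initialized, never read */
  /* -Wall: warning: unused variable 'arg' [-Wunused-variable];
     with -Werror this becomes a hard build failure.  */
  return code;
}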
[PATCH] fix vfmsubaddpd/vfmaddsubpd generation
This patch fixes an obvious problem: the fma4_fmsubadd/fma4_fmaddsub instruction templates don't generate vfmsubaddpd/vfmaddsubpd because they don't use <ssemodesuffix>.

This passes bootstrap on x86_64 on trunk. Okay to commit?

BTW, I'm testing on gcc-4_6-branch. Should I post a different patch thread, or just use this one?

--
Quentin

From aa70d4f6180f1c6712888b7328723232b5da8bdc Mon Sep 17 00:00:00 2001
From: Quentin Neill quentin.ne...@amd.com
Date: Tue, 17 May 2011 10:24:17 -0500
Subject: [PATCH] 2011-05-17  Harsha Jagasia  harsha.jaga...@amd.com

	* config/i386/sse.md (fma4_fmsubadd): Use <ssemodesuffix>.
	(fma4_fmaddsub): Likewise

---
 gcc/ChangeLog          |    5 +++++
 gcc/config/i386/sse.md |    4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3625d9b..e86ea4e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-05-17  Harsha Jagasia  harsha.jaga...@amd.com
+
+	* config/i386/sse.md (fma4_fmsubadd): Use <ssemodesuffix>.
+	(fma4_fmaddsub): Likewise
+
 2011-05-17  Richard Guenther  rguent...@suse.de
 
 	* gimple.c (iterative_hash_gimple_type): Simplify singleton
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 291bffb..7c4e6dd 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1663,7 +1663,7 @@
	   (match_operand:VF 3 "nonimmediate_operand" "xm,x")]
	  UNSPEC_FMADDSUB))]
   "TARGET_FMA4"
-  "vfmaddsubps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "vfmaddsub<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssemuladd")
    (set_attr "mode" "<MODE>")])
 
@@ -1676,7 +1676,7 @@
	     (match_operand:VF 3 "nonimmediate_operand" "xm,x"))]
	  UNSPEC_FMADDSUB))]
   "TARGET_FMA4"
-  "vfmsubaddps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "vfmsubadd<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssemuladd")
    (set_attr "mode" "<MODE>")])
--
1.7.1
[v3] tuple vs noexcept
Hi, this time too, took the occasion to add the get(tuple) bits. Tested x86_64-linux, committed. Paolo. /// 2011-05-17 Paolo Carlini paolo.carl...@oracle.com * include/std/tuple: Use noexcept where appropriate. (tuple::swap): Rework implementation. (_Head_base::_M_swap_impl): Remove. (get(std::tuple)): Add. * testsuite/20_util/tuple/element_access/get2.cc: New. * testsuite/20_util/weak_ptr/comparison/cmp_neg.cc: Adjust dg-error line number. Index: include/std/tuple === --- include/std/tuple (revision 173832) +++ include/std/tuple (working copy) @@ -59,6 +59,15 @@ struct __add_ref_Tp { typedef _Tp type; }; + // Adds an rvalue reference to a non-reference type. + templatetypename _Tp +struct __add_r_ref +{ typedef _Tp type; }; + + templatetypename _Tp +struct __add_r_ref_Tp +{ typedef _Tp type; }; + templatestd::size_t _Idx, typename _Head, bool _IsEmpty struct _Head_base; @@ -78,13 +87,6 @@ _Head _M_head() { return *this; } const _Head _M_head() const { return *this; } - - void - _M_swap_impl(_Head __h) - { - using std::swap; - swap(__h, _M_head()); - } }; templatestd::size_t _Idx, typename _Head @@ -103,13 +105,6 @@ _Head _M_head() { return _M_head_impl; } const _Head _M_head() const { return _M_head_impl; } - void - _M_swap_impl(_Head __h) - { - using std::swap; - swap(__h, _M_head()); - } - _Head _M_head_impl; }; @@ -130,9 +125,11 @@ */ templatestd::size_t _Idx struct _Tuple_impl_Idx -{ +{ + templatestd::size_t, typename... friend class _Tuple_impl; + protected: - void _M_swap_impl(_Tuple_impl) { /* no-op */ } + void _M_swap(_Tuple_impl) noexcept { /* no-op */ } }; /** @@ -145,6 +142,8 @@ : public _Tuple_impl_Idx + 1, _Tail..., private _Head_base_Idx, _Head, std::is_empty_Head::value { + templatestd::size_t, typename... friend class _Tuple_impl; + typedef _Tuple_impl_Idx + 1, _Tail... _Inherited; typedef _Head_base_Idx, _Head, std::is_empty_Head::value _Base; @@ -218,10 +217,14 @@ protected: void - _M_swap_impl(_Tuple_impl __in) + _M_swap(_Tuple_impl __in) + noexcept(noexcept(swap(std::declval_Head(), +std::declval_Head())) + noexcept(__in._M_tail()._M_swap(__in._M_tail( { - _Base::_M_swap_impl(__in._M_head()); - _Inherited::_M_swap_impl(__in._M_tail()); + using std::swap; + swap(this-_M_head(), __in._M_head()); + _Inherited::_M_swap(__in._M_tail()); } }; @@ -300,14 +303,15 @@ void swap(tuple __in) - { _Inherited::_M_swap_impl(__in); } + noexcept(noexcept(__in._M_swap(__in))) + { _Inherited::_M_swap(__in); } }; template class tuple { public: - void swap(tuple) { /* no-op */ } + void swap(tuple) noexcept { /* no-op */ } }; /// tuple (2-element), with construction and assignment from a pair. @@ -360,6 +364,7 @@ tuple operator=(tuple __in) + // noexcept has to wait is_nothrow_move_assignable { static_cast_Inherited(*this) = std::move(__in); return *this; @@ -392,7 +397,7 @@ templatetypename _U1, typename _U2 tuple -operator=(pair_U1, _U2 __in) +operator=(pair_U1, _U2 __in) noexcept { this-_M_head() = std::forward_U1(__in.first); this-_M_tail()._M_head() = std::forward_U2(__in.second); @@ -401,11 +406,8 @@ void swap(tuple __in) - { - using std::swap; - swap(this-_M_head(), __in._M_head()); - swap(this-_M_tail()._M_head(), __in._M_tail()._M_head()); - } + noexcept(noexcept(__in._M_swap(__in))) + { _Inherited::_M_swap(__in); } }; /// tuple (1-element). @@ -473,7 +475,8 @@ void swap(tuple __in) - { _Inherited::_M_swap_impl(__in); } + noexcept(noexcept(__in._M_swap(__in))) + { _Inherited::_M_swap(__in); } }; @@ -522,22 +525,31 @@ __get_helper(const _Tuple_impl__i, _Head, _Tail... 
__t) { return __t._M_head(); } - // Return a reference (const reference) to the ith element of a tuple. - // Any const or non-const ref elements are returned with their original type. + // Return a reference (const reference, rvalue reference) to the ith element + // of a tuple. Any const or non-const ref elements are returned with their + // original type. templatestd::size_t __i, typename... _Elements inline typename __add_ref - typename tuple_element__i, tuple_Elements... ::type + typename tuple_element__i,
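As a usage note, here is what the new get overload on rvalue tuples buys in user code; a small sketch, to be compiled with -std=c++0x on a compiler of this vintage:

#include <string>
#include <tuple>
#include <utility>

int
main ()
{
  std::tuple<int, std::string> t (1, "payload");

  /* get on an rvalue tuple now yields an rvalue reference, so the
     string is moved out of the expiring tuple rather than copied.  */
  std::string s = std::get<1> (std::move (t));

  return s == "payload" ? 0 : 1;
}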
Fix PR 49026 (-mfpmath= attribute bug)
PR 49026 identified testsuite regressions when mfpmath= is set by target attributes, that for some reason appear on x86_64-darwin but not x86_64-linux. This patch fixes one place where I failed to preserve the logic of this attribute handling, and restores the code generated for the testcase to the code attached to that PR as being generated before my previous patch.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu. Applied to mainline.

2011-05-17  Joseph Myers  jos...@codesourcery.com

	* config/i386/i386.c (ix86_valid_target_attribute_tree): Use
	enum_opts_set when testing if attributes have set -mfpmath=.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c (revision 173809)
+++ gcc/config/i386/i386.c (working copy)
@@ -4692,7 +4692,7 @@ ix86_valid_target_attribute_tree (tree a
       || target_flags != def->x_target_flags
       || option_strings[IX86_FUNCTION_SPECIFIC_ARCH]
       || option_strings[IX86_FUNCTION_SPECIFIC_TUNE]
-      || ix86_fpmath != def->x_ix86_fpmath)
+      || enum_opts_set.x_ix86_fpmath)
     {
       /* If we are using the default tune= or arch=, undo the string
	  assigned, and use the default.  */

-- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH]: Restore bootstrap with --enable-build-with-cxx
On Tue, May 17, 2011 at 2:46 PM, Toon Moene t...@moene.org wrote: On 05/17/2011 08:32 PM, Uros Bizjak wrote: Tested on x86_64-pc-linux-gnu {, m32} with --enable-build-with-cxx. Committed to mainline SVN as obvious. Does that mean that I can now remove the --disable-werror from my daily C++ bootstrap run ? It's great that some people understand the intricacies of the infight^H^H^H^H^H^H differences between the C and C++ type model. OK: 1/2 :-) I suspect this infight would vanish if we just switched, as we discussed in the past. -- Gaby
Re: [google] Parameterize function overhead estimate for inlining
You will have a followup patch to override arm defaults, right?

Ok for google/main.

Thanks,

David

On Tue, May 17, 2011 at 9:29 PM, Mark Heffernan meh...@google.com wrote:

This tiny change improves the size estimation for inlining and results in an average 1% size reduction and a small (maybe 0.25% geomean) performance increase on internal benchmarks on x86-64. I parameterized the value rather than changing it directly because previous exploration with x86 and ARM arches indicated that it varies significantly with architecture. Default value is tuned for x86-64.

Bootstrapped and tested on x86-64. Will explore relevance and effectiveness for trunk and SPEC later.

Ok for google/main?

Mark

2011-05-17  Mark Heffernan  meh...@google.com

	* ipa-inline.c (estimate_function_body_sizes): Parameterize static
	function static overhead.
	* params.def (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE): New parameter.

Index: ipa-inline.c
===
--- ipa-inline.c (revision 173845)
+++ ipa-inline.c (working copy)
@@ -1979,10 +1979,11 @@ estimate_function_body_sizes (struct cgr
   gcov_type time = 0;
   gcov_type time_inlining_benefit = 0;
   /* Estimate static overhead for function prologue/epilogue and alignment.  */
-  int size = 2;
+  int size = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
   /* Benefits are scaled by probability of elimination that is in range
      <0,2>.  */
-  int size_inlining_benefit = 2 * 2;
+  int size_inlining_benefit =
+    PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE) * 2;
   basic_block bb;
   gimple_stmt_iterator bsi;
   struct function *my_function = DECL_STRUCT_FUNCTION (node->decl);
Index: params.def
===
--- params.def (revision 173845)
+++ params.def (working copy)
@@ -110,6 +110,11 @@ DEFPARAM (PARAM_MIN_INLINE_RECURSIVE_PRO
	  "Inline recursively only when the probability of call being executed exceeds the parameter",
	  10, 0, 0)
 
+DEFPARAM (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE,
+	  "inline-function-overhead-size",
+	  "Size estimate of function overhead (prologue and epilogue) for inlining purposes",
+	  7, 0, 0)
+
 /* Limit of iterations of early inliner.  This basically bounds number of
    nested indirect calls early inliner can resolve.  Deeper chains are still
    handled by late inlining.  */
[google] Increase inlining limits with FDO/LIPO
This small patch greatly expands the function size limits for inlining with FDO/LIPO. With profile information, the inliner is much more selective and precise and so the limits can be increased with less worry that functions and total code size will blow up. This speeds up x86-64 internal benchmarks by about geomean 1.5% to 3% with LIPO (depending on microarch), and 1% to 1.5% with FDO. Size increase is negligible (0.1% mean).

Bootstrapped and regression tested on x86-64. Trunk testing to follow.

Ok for google/main?

Mark

2011-05-17  Mark Heffernan  meh...@google.com

	* opts.c (finish_options): Increase inlining limits with profile
	generate and use.

Index: opts.c
===
--- opts.c (revision 173666)
+++ opts.c (working copy)
@@ -828,6 +828,22 @@ finish_options (struct gcc_options *opts
	  opts->x_flag_split_stack = 0;
	}
     }
+
+  if (opts->x_flag_profile_use
+      || opts->x_profile_arc_flag
+      || opts->x_flag_profile_values)
+    {
+      /* With accurate profile information, inlining is much more
+	 selective and makes better decisions, so increase the
+	 inlining function size limits.  Changes must be added to both
+	 the generate and use builds to avoid profile mismatches.  */
+      maybe_set_param_value
+	(PARAM_MAX_INLINE_INSNS_SINGLE, 1000,
+	 opts->x_param_values, opts_set->x_param_values);
+      maybe_set_param_value
+	(PARAM_MAX_INLINE_INSNS_AUTO, 1000,
+	 opts->x_param_values, opts_set->x_param_values);
+    }
 }