Re: PATCH: Check ia32 in GCC tests

2011-07-11 Thread Mike Stump
On Jul 9, 2011, at 7:22 PM, H.J. Lu wrote:
 On Thu, Jul 07, 2011 at 10:29:53AM -0700, H.J. Lu wrote:
 Hi,
 
 On Linux/x86-64, when we pass
 
 RUNTESTFLAGS=--target_board='unix{-mx32}'
 
 to GCC tests, we can't check lp64/ilp32 for availability of 64bit x86
 instructions.  This patch adds ia32 and x32 effective targets.  OK for
 trunk?
 
 
 Here is a follow-up patch to use the ia32 effective target.  OK for trunk?

Ok.


Re: PATCH: Check ia32 in GCC tests

2011-07-11 Thread Mike Stump
On Jul 9, 2011, at 7:25 PM, H.J. Lu wrote:
 2011-07-09  H.J. Lu  hongjiu...@intel.com
 
  * gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp: Check
  ia32.
  * go.test/go-test.exp (go-set-goarch): Likewise.
 
 
 A small update.

Ok.


Re: [PATCH] Fix configure --with-cloog

2011-07-11 Thread Romain Geissler
2011/7/6 Romain Geissler romain.geiss...@gmail.com:

 I forgot configure was a generated script. Here is the patch that fixes
 it at the m4 macro level:


 2011-07-06  Romain Geissler  romain.geiss...@gmail.com

      * config/cloog.m4: Add $gmplibs to cloog $LDFLAGS
      * configure: Regenerate


 Index: config/cloog.m4
 ===
 --- config/cloog.m4     (revision 175907)
 +++ config/cloog.m4     (working copy)
 @@ -142,7 +142,7 @@ AC_DEFUN([CLOOG_FIND_FLAGS],
   dnl clooglibs & clooginc may have been initialized by CLOOG_INIT_FLAGS.
   CFLAGS="${CFLAGS} ${clooginc} ${gmpinc}"
   CPPFLAGS="${CPPFLAGS} ${_cloogorginc}"
 -  LDFLAGS="${LDFLAGS} ${clooglibs}"
 +  LDFLAGS="${LDFLAGS} ${clooglibs} ${gmplibs}"

   case $cloog_backend in
     ppl-legacy)


Ping: It seems that this little patch has been forgotten.
Is it OK for trunk?

NB: I don't have write access to the trunk

Romain Geissler


Re: plugin event for C/C++ declarations

2011-07-11 Thread Romain Geissler
2011/7/7 Diego Novillo dnovi...@google.com:
 OK.  This one fell through the cracks in my inbox.  Apologies.


 Diego.

Hi,

I don't have write access; could you please commit the patch to trunk?

Romain Geissler


Re: [PATCH] Remove call_expr_arg and call_expr_argp

2011-07-11 Thread Romain Geissler
2011/7/8 Richard Guenther richard.guent...@gmail.com:
 Ok.

 Thanks,
 Richard.


Hi,

I don't have write access; could you please commit the patch to trunk?

Romain Geissler


[Patch, Fortran, committed] Remove bogus dg-error in gfortran.dg/coarray_lock_3.f90

2011-07-11 Thread Tobias Burnus

Hi all,

when committing the LOCK patch, I forgot to include the attached change in 
the testsuite, which causes testsuite failures.


I planned to correct that together with other constraint-check issues, 
but obviously I haven't done so for several weeks. Thus, I decided to 
start by fixing the test suite. (The line is indeed OK, i.e. the just 
committed patch is correct.)


There are still some issues with LOCK_TYPE checking, in particular with 
LOCK_TYPES in derived types.


Committed as Rev. 176137.

Tobias
Index: gcc/testsuite/gfortran.dg/coarray_lock_3.f90
===
--- gcc/testsuite/gfortran.dg/coarray_lock_3.f90	(revision 176136)
+++ gcc/testsuite/gfortran.dg/coarray_lock_3.f90	(working copy)
@@ -69,7 +69,7 @@
   lock(lock)
   lock(lock2(1))
   lock(lock2) ! { dg-error "must be a scalar coarray of type LOCK_TYPE" }
-  lock(lock[1]) ! { dg-error "must be a scalar coarray of type LOCK_TYPE" }
+  lock(lock[1]) ! OK
 end subroutine lock_test2
 
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 176136)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2011-07-11  Tobias Burnus  bur...@net-b.de
+
+	PR fortran/18918
+	* gfortran.dg/coarray_lock_3.f90: Remove bogus dg-error.
+
 2011-07-11  Georg-Johann Lay  a...@gjlay.de
 	
 	* lib/target-supports.exp (check_effective_target_scheduling):


[Patch, AVR]: Fix PR39633 (missing *cmpqi)

2011-07-11 Thread Georg-Johann Lay
char >> 7 is compiled to

LSL reg
SBC reg,reg

which leaves cc0 in a mess because Z-flag is not set by SBC, it's
propagated from LSL.

The patch is obvious; the new testcase passes and its code contains *cmpqi.

Ok to commit?

Johann


gcc/
PR target/39633
* config/avr/avr.c (notice_update_cc): For ashiftrt:QI, only
offsets 1..5 set cc0 in a usable way.

testsuite/
* gcc.target/avr/torture/pr39633.c: New test case.
Index: testsuite/gcc.target/avr/torture/pr39633.c
===
--- testsuite/gcc.target/avr/torture/pr39633.c	(revision 0)
+++ testsuite/gcc.target/avr/torture/pr39633.c	(revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+char c = 42;
+
+void __attribute__((noinline,noclone))
+pr39633 (char a)
+{
+  a >>= 7;
+  if (a)
+c = a;
+}
+
+int main()
+{
+  pr39633 (6);
+
+  if (c != 42)
+abort();
+
+  exit(0);
+
+  return 0;
+}
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 176136)
+++ config/avr/avr.c	(working copy)
@@ -1479,9 +1479,8 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
 	{
 	  rtx x = XEXP (src, 1);
 
-	  if (GET_CODE (x) == CONST_INT
-	      && INTVAL (x) > 0
-	      && INTVAL (x) != 6)
+	  if (CONST_INT_P (x)
+	      && IN_RANGE (INTVAL (x), 1, 5))
 		{
 		  cc_status.value1 = SET_DEST (set);
 		  cc_status.flags |= CC_OVERFLOW_UNUSABLE;


[PATCH] Extra invariant motion step after ivopt

2011-07-11 Thread Andreas Krebbel
Hi,

with the changes in the IVopts pass from last year I see a reduced
number of induction variables used for the first of the 3 hotloops in
the 436.cactus benchmark:

http://gcc.gnu.org/viewcvs?view=revision&revision=162653

This leads to a heavily increased number of instructions in the body
of the first loop in the resulting binary:

with GCC 4.5: BB 4: 52  - number of instructions
with GCC 4.6: BB 4: 110 - similar result with GCC head

With GCC 4.6 a lot of loop invariant integer arithmetic is done in
order to calculate the addresses which are used to access the array
fields.

Adding another invariant motion pass improves the loop even beyond the
4.5 result:

with GCC 4.6 + attached patch: BB 4: 47

The benchmark result for 436.cactus improves by only about 2%, since
the first loop is not actually the hottest of the three, but the code
is much better.

I have not been able to measure any compile-time overhead.  Out of 10
measurements compiling the cactus testcase, the minimum compile time
was even lower than before.  Perhaps having fewer instructions in
the loop body made other passes faster.  Overall I expect a very small
compile-time increase.

Ok for mainline?

Bye,

-Andreas-


2011-07-11  Andreas Krebbel  andreas.kreb...@de.ibm.com

* passes.c (init_optimization_passes): Add invariant motion pass
after induction variable optimization.

Index: gcc/passes.c
===
*** gcc/passes.c.orig
--- gcc/passes.c
*** init_optimization_passes (void)
*** 1363,1368 
--- 1363,1369 
  NEXT_PASS (pass_parallelize_loops);
  NEXT_PASS (pass_loop_prefetch);
  NEXT_PASS (pass_iv_optimize);
+ NEXT_PASS (pass_lim);
  NEXT_PASS (pass_tree_loop_done);
}
NEXT_PASS (pass_cse_reciprocals);


RFA: Fix bug in optimize_mode_switching

2011-07-11 Thread Joern Rennecke

I work on a target with complex mode-switching needs, so it can happen that
in some block a mode is provided for an entity without the need for a set.
This causes the current optimize_mode_switching to crash when it later
dereferences a NULL seginfo pointer.
Fixed by using an actual flag to track whether any seginfo has been allocated.

Bootstrapped on x86_64-unknown-linux-gnu.
2011-07-08  Joern Rennecke joern.renne...@embecosm.com

* mode-switching.c (optimize_mode_switching): Fix bug in MODE_AFTER
handling.

Index: mode-switching.c
===
--- mode-switching.c(revision 1670)
+++ mode-switching.c(revision 1671)
@@ -499,6 +499,7 @@ optimize_mode_switching (void)
{
  struct seginfo *ptr;
  int last_mode = no_mode;
+ bool any_set_required = false;
  HARD_REG_SET live_now;
 
  REG_SET_TO_HARD_REG_SET (live_now, df_get_live_in (bb));
@@ -527,6 +528,7 @@ optimize_mode_switching (void)
 
  if (mode != no_mode && mode != last_mode)
{
+ any_set_required = true;
  last_mode = mode;
  ptr = new_seginfo (mode, insn, bb->index, live_now);
  add_seginfo (info + bb->index, ptr);
@@ -548,8 +550,10 @@ optimize_mode_switching (void)
}
 
  info[bb->index].computing = last_mode;
- /* Check for blocks without ANY mode requirements.  */
- if (last_mode == no_mode)
+ /* Check for blocks without ANY mode requirements.
+N.B. because of MODE_AFTER, last_mode might still be different
+from no_mode.  */
+ if (!any_set_required)
{
  ptr = new_seginfo (no_mode, BB_END (bb), bb->index, live_now);
  add_seginfo (info + bb->index, ptr);


[SPARC] Another minor tweak

2011-07-11 Thread Eric Botcazou
Since DWARF2 uses DW_CFA_GNU_window_save and the middle-end REG_CFA_WINDOW_SAVE 
to designate the thing, this makes the SPARC back-end use the same wording.

Tested on SPARC/Solaris, applied on the mainline.


2011-07-11  Eric Botcazou  ebotca...@adacore.com

* config/sparc/sparc.md (save_register_window_1): Rename to...
(window_save): ...this.
* config/sparc/sparc.c (emit_save_register_window): Rename to...
(emit_window_save): ...this.
(sparc_expand_prologue): Adjust to above renaming.


-- 
Eric Botcazou
Index: config/sparc/sparc.md
===
--- config/sparc/sparc.md	(revision 176072)
+++ config/sparc/sparc.md	(working copy)
@@ -6276,10 +6276,10 @@ (define_expand prologue
   DONE;
 })
 
-;; The save register window insn is modelled as follows.  The dwarf2
-;; information is manually added in emit_save_register_window in sparc.c.
+;; The register window save insn is modelled as follows.  The dwarf2
+;; information is manually added in emit_window_save.
 
-(define_insn save_register_window_1
+(define_insn window_save
   [(unspec_volatile
 	[(match_operand 0 arith_operand rI)]
 	UNSPECV_SAVEW)]
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 176072)
+++ config/sparc/sparc.c	(working copy)
@@ -4590,14 +4590,12 @@ emit_save_or_restore_local_in_regs (rtx
 			 save_local_or_in_reg_p, action, SORR_ADVANCE);
 }
 
-/* Generate a save_register_window insn.  */
+/* Emit a window_save insn.  */
 
 static rtx
-emit_save_register_window (rtx increment)
+emit_window_save (rtx increment)
 {
-  rtx insn;
-
-  insn = emit_insn (gen_save_register_window_1 (increment));
+  rtx insn = emit_insn (gen_window_save (increment));
   RTX_FRAME_RELATED_P (insn) = 1;
 
   /* The incoming return address (%o7) is saved in %i7.  */
@@ -4716,10 +4714,10 @@ sparc_expand_prologue (void)
   rtx size_int_rtx = GEN_INT (-size);
 
   if (size = 4096)
-	emit_save_register_window (size_int_rtx);
+	emit_window_save (size_int_rtx);
   else if (size = 8192)
 	{
-	  emit_save_register_window (GEN_INT (-4096));
+	  emit_window_save (GEN_INT (-4096));
 	  /* %sp is not the CFA register anymore.  */
 	  emit_insn (gen_stack_pointer_inc (GEN_INT (4096 - size)));
 	}
@@ -4727,7 +4725,7 @@ sparc_expand_prologue (void)
 	{
 	  rtx size_rtx = gen_rtx_REG (Pmode, 1);
 	  emit_move_insn (size_rtx, size_int_rtx);
-	  emit_save_register_window (size_rtx);
+	  emit_window_save (size_rtx);
 	}
 }
 


Re: RFA PR regression/49498

2011-07-11 Thread Richard Guenther
On Fri, Jul 8, 2011 at 7:25 PM, Jeff Law l...@redhat.com wrote:

 As detailed in the PR, improvements to jump threading caused the
 relatively simple guard predicates in this testcase to become
 significantly more complex.  The predicate complexity is enough to
 confuse the predicate-aware pruning of bogus uninitialized variable
 warnings.

 Note the actual runtime flow control was improved by jump threading,
 which was doing exactly what it should.

 Based on David's comments, it's unlikely the predicate-aware code in
 tree-ssa-uninit.c is going to be able to handle the more complex guards.
  So I'm turning off DOM (jump threading) for this testcase.

 OK for trunk?

Ok.

Thanks,
Richard.





Re: [Patch, AVR]: Fix PR39633 (missing *cmpqi)

2011-07-11 Thread Denis Chertykov
2011/7/11 Georg-Johann Lay a...@gjlay.de:
 char  7 is compiled to

 LSL reg
 SBC reg,reg

 which leaves cc0 in a mess because Z-flag is not set by SBC, it's
 propagated from LSL.

 Patch as obvious, new testcase pass and contains *cmpqi.

 Ok to commit?


Please, commit.

Denis.


[PATCH] Remove obsolete alias check in cgraph_redirect_edge_call_stmt_to_callee

2011-07-11 Thread Martin Jambor
Hi,

since (same body) aliases have their own cgraph_nodes, the check for
them in cgraph_redirect_edge_call_stmt_to_callee is now unnecessary
because e->callee is now the alias, not the function node.

The following patch therefore removes it.  Bootstrapped and tested on
x86_64-linux, OK for trunk?

Thanks,

Martin


2011-07-08  Martin Jambor  mjam...@suse.cz

* cgraphunit.c (cgraph_redirect_edge_call_stmt_to_callee): Alias
check removed.

Index: src/gcc/cgraphunit.c
===
--- src.orig/gcc/cgraphunit.c
+++ src/gcc/cgraphunit.c
@@ -2380,9 +2380,7 @@ cgraph_redirect_edge_call_stmt_to_callee
 #endif
 
   if (e->indirect_unknown_callee
-      || decl == e->callee->decl
-      /* Don't update call from same body alias to the real function.  */
-      || (decl && cgraph_get_node (decl) == cgraph_get_node (e->callee->decl)))
+      || decl == e->callee->decl)
     return e->call_stmt;
 
 #ifdef ENABLE_CHECKING


Re: [PATCH] Remove call_expr_arg and call_expr_argp

2011-07-11 Thread Richard Guenther
On Mon, Jul 11, 2011 at 9:53 AM, Romain Geissler
romain.geiss...@gmail.com wrote:
 2011/7/8 Richard Guenther richard.guent...@gmail.com:
 Ok.

 Thanks,
 Richard.


 Hi,

 I don't have write access, can you please add the patch to the trunk ?

Done.  Btw, a proper changelog would have been

2011-07-11  Romain Geissler  romain.geiss...@gmail.com

   * tree.h (call_expr_arg): Remove.
   (call_expr_argp): Likewise.


 Romain Geissler



Re: [PATCH] Extra invariant motion step after ivopt

2011-07-11 Thread Richard Guenther
On Mon, Jul 11, 2011 at 10:50 AM, Andreas Krebbel
kreb...@linux.vnet.ibm.com wrote:
 Hi,

 with the changes in the IVopts pass from last year I see a reduced
 number of induction variables used for the first of the 3 hotloops in
 the 436.cactus benchmark:

 http://gcc.gnu.org/viewcvs?view=revision&revision=162653

 This leads to a heavily increased number of instructions in the body
 of the first loop in the resulting binary:

 with GCC 4.5: BB 4: 52  - number of instructions
 with GCC 4.6: BB 4: 110 - similar result with GCC head

 With GCC 4.6 a lot of loop invariant integer arithmetic is done in
 order to calculate the addresses which are used to access the array
 fields.

 Adding another invariant motion pass improves the loop even beyond the
 4.5 result:

 with GCC 4.6 + attached patch: BB 4: 47

 The benchmark result for 436.cactus improves by only about 2%, since
 the first loop is not actually the hottest of the three, but the code
 is much better.

 I have not been able to measure any compile-time overhead.  Out of 10
 measurements compiling the cactus testcase, the minimum compile time
 was even lower than before.  Perhaps having fewer instructions in
 the loop body made other passes faster.  Overall I expect a very small
 compile-time increase.

 Ok for mainline?

Ok.

Thanks,
Richard.

 Bye,

 -Andreas-


 2011-07-11  Andreas Krebbel  andreas.kreb...@de.ibm.com

        * passes.c (init_optimization_passes): Add invariant motion pass
        after induction variable optimization.

 Index: gcc/passes.c
 ===
 *** gcc/passes.c.orig
 --- gcc/passes.c
 *** init_optimization_passes (void)
 *** 1363,1368 
 --- 1363,1369 
          NEXT_PASS (pass_parallelize_loops);
          NEXT_PASS (pass_loop_prefetch);
          NEXT_PASS (pass_iv_optimize);
 +         NEXT_PASS (pass_lim);
          NEXT_PASS (pass_tree_loop_done);
        }
        NEXT_PASS (pass_cse_reciprocals);



Re: [11/11] Fix get_mode_bounds

2011-07-11 Thread Bernd Schmidt
On 07/06/11 20:37, Richard Henderson wrote:
 On 07/01/2011 10:42 AM, Bernd Schmidt wrote:
 get_mode_bounds should also use GET_MODE_PRECISION, but this exposes a
 problem on ia64 - BImode needs to be handled specially here to work
 around another preexisting special case in gen_int_mode.
 
 Would it be better to remove the trunc_int_for_mode special case?
 It appears that I added that for ia64 and it's unchanged since...

I tried that on ia64. It didn't bootstrap with the special case removed
(configure-stage1-target-libgomp failure), and progressed further
without the change.

(It still failed with
/usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel
relocation against dynamic symbol __gmp_errno
/usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel
relocation against dynamic symbol __gmp_errno
/usr/bin/ld: /opt/cfarm/gmp-4.2.4/lib/libgmp.a(errno.o): @gprel
relocation against dynamic symbol __gmp_errno
/usr/bin/ld: final link failed: Nonrepresentable section on output
)

 That said, I'm willing to approve the patch as-is.

I'll commit it then.


Bernd


Re: Ping: The TI C6X port

2011-07-11 Thread Bernd Schmidt
On 06/06/11 14:53, Gerald Pfeifer wrote:
 not a direct approval for any of the outstanding patches, but I am happy 
 to report that the steering committee is appointing you maintainer of the 
 C6X port.
 
 Please go ahead and add yourself to the MAINTAINERS file as part of the
 patch that actually adds the port (10/11 if I recall correctly).

Internally, the question came up whether that means I can just commit
the port once the preliminary patches are approved (which I think is
now). Opinions?


Bernd


RFA: Use create_*_operand expand_insn for movmisalign

2011-07-11 Thread Richard Sandiford
When I added the new optabs insn-expansion routines, I looked for code
that checked the predicates before calling GEN_FCN.  This patch also
uses the routines in two cases where we don't currently check the
predicates.  The benefits are:

1) We assert that the predicates really do match.

2) We support targets (like ARM) that only support restricted addressing
   modes.  See the allows_mem stuff in maybe_legitimize_operand_same_code.

Tested on x86_64-linux-gnu and (with an ARM patch to take advantage of it)
on arm-linux-gnueabi.  OK to install?

Richard


gcc/
* expr.c (expand_expr_real_1): Use expand_insn for movmisalign.

Index: gcc/expr.c
===
--- gcc/expr.c  2011-07-11 11:29:58.0 +0100
+++ gcc/expr.c  2011-07-11 11:31:45.0 +0100
@@ -8692,7 +8692,8 @@ expand_expr_real_1 (tree exp, rtx target
   {
addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
struct mem_address addr;
-   int icode, align;
+   enum insn_code icode;
+   int align;
 
get_address_description (exp, &addr);
op0 = addr_for_mem_ref (&addr, as, true);
@@ -8709,18 +8710,15 @@ expand_expr_real_1 (tree exp, rtx target
 ((icode = optab_handler (movmisalign_optab, mode))
!= CODE_FOR_nothing))
  {
-   rtx reg, insn;
+   struct expand_operand ops[2];
 
/* We've already validated the memory, and we're creating a
-  new pseudo destination.  The predicates really can't fail.  */
-   reg = gen_reg_rtx (mode);
-
-   /* Nor can the insn generator.  */
-   insn = GEN_FCN (icode) (reg, temp);
-   gcc_assert (insn != NULL_RTX);
-   emit_insn (insn);
-
-   return reg;
+  new pseudo destination.  The predicates really can't fail,
+  nor can the generator.  */
+   create_output_operand (&ops[0], NULL_RTX, mode);
+   create_fixed_operand (&ops[1], temp);
+   expand_insn (icode, 2, ops);
+   return ops[0].value;
  }
return temp;
   }
@@ -8732,7 +8730,8 @@ expand_expr_real_1 (tree exp, rtx target
enum machine_mode address_mode;
tree base = TREE_OPERAND (exp, 0);
gimple def_stmt;
-   int icode, align;
+   enum insn_code icode;
+   int align;
/* Handle expansion of non-aliased memory with non-BLKmode.  That
   might end up in a register.  */
if (TREE_CODE (base) == ADDR_EXPR)
@@ -8806,17 +8805,15 @@ expand_expr_real_1 (tree exp, rtx target
 ((icode = optab_handler (movmisalign_optab, mode))
!= CODE_FOR_nothing))
  {
-   rtx reg, insn;
+   struct expand_operand ops[2];
 
/* We've already validated the memory, and we're creating a
-  new pseudo destination.  The predicates really can't fail.  */
-   reg = gen_reg_rtx (mode);
-
-   /* Nor can the insn generator.  */
-   insn = GEN_FCN (icode) (reg, temp);
-   emit_insn (insn);
-
-   return reg;
+  new pseudo destination.  The predicates really can't fail,
+  nor can the generator.  */
+   create_output_operand (&ops[0], NULL_RTX, mode);
+   create_fixed_operand (&ops[1], temp);
+   expand_insn (icode, 2, ops);
+   return ops[0].value;
  }
return temp;
   }


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-11 Thread Paolo Bonzini

On 07/11/2011 02:04 AM, H.J. Lu wrote:

With my original change,  I got

(const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2]
var_decl 0x70857960 iplane)
 (const_int -4 [0xfffc])))

I think it is safe to permute the conversion and addition operation
if one operand is a constant and we are zero-extending.  This is
how zero-extending works.


Ok, I think I understand what you mean.  The key is the

   XEXP (x, 1) == convert_memory_address_addr_space
  (to_mode, XEXP (x, 1), as)

test.  It ensures basically that the constant has 31-bit precision, 
because otherwise the constant would change from e.g. (const_int 
-0x7ffffffc) to (const_int 0x80000004) when zero-extending it from 
SImode to DImode.


But I'm not sure it's safe.  You have

  (zero_extend:DI (plus:SI FOO:SI (const_int Y)))

and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

(where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF 
(this piece of code does not assume anything about its shape); if FOO == 
0xfffffffc and Y = 8, the results will be 0x4 (valid) and 
0x100000004 (invalid), respectively.


If pointers extend as signed you have a similar case.  If FOO == 
0x7ffffffc and Y = 8, the results of


  (sign_extend:DI (plus:SI FOO:SI (const_int Y)))

and

  (plus:DI FOO:DI (sign_extend:DI (const_int Y)))

will be 0xffffffff80000004 (valid) and 0x80000004 (invalid), respectively.


What happens if you just return NULL instead of the assertion (good idea 
adding it!)?


Of course then you need to:

1) check the return values of convert_memory_address_addr_space_1, and 
propagate NULL up to simplify_unary_operation;


2) check in simplify-rtx.c whether the return value of 
convert_memory_address_1 is NULL, and only return if the return value is 
not NULL.  This is not yet necessary (convert_memory_address is the last 
transformation for both SIGN_EXTEND and ZERO_EXTEND) but it is better to 
keep code clean.


Thanks,

Paolo


Re: RFA: Use create_*_operand expand_insn for movmisalign

2011-07-11 Thread Richard Guenther
On Mon, Jul 11, 2011 at 12:38 PM, Richard Sandiford
richard.sandif...@linaro.org wrote:
 When I added the new optabs insn-expansion routines, I looked for code
 that checked the predicates before calling GEN_FCN.  This patch also
 uses the routines in two cases where we don't currently check the
 predicates.  The benefits are:

 1) We assert that the predicates really do match.

 2) We support targets (like ARM) that only support restricted addressing
   modes.  See the allows_mem stuff in maybe_legitimize_operand_same_code.

 Tested on x86_64-linux-gnu and (with an ARM patch to take advantage of it)
 on arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

 Richard


 gcc/
        * expr.c (expand_expr_real_1): Use expand_insn for movmisalign.

 Index: gcc/expr.c
 ===
 --- gcc/expr.c  2011-07-11 11:29:58.0 +0100
 +++ gcc/expr.c  2011-07-11 11:31:45.0 +0100
 @@ -8692,7 +8692,8 @@ expand_expr_real_1 (tree exp, rtx target
       {
        addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
        struct mem_address addr;
 -       int icode, align;
 +       enum insn_code icode;
 +       int align;

        get_address_description (exp, &addr);
        op0 = addr_for_mem_ref (&addr, as, true);
 @@ -8709,18 +8710,15 @@ expand_expr_real_1 (tree exp, rtx target
             ((icode = optab_handler (movmisalign_optab, mode))
                != CODE_FOR_nothing))
          {
 -           rtx reg, insn;
 +           struct expand_operand ops[2];

            /* We've already validated the memory, and we're creating a
 -              new pseudo destination.  The predicates really can't fail.  */
 -           reg = gen_reg_rtx (mode);
 -
 -           /* Nor can the insn generator.  */
 -           insn = GEN_FCN (icode) (reg, temp);
 -           gcc_assert (insn != NULL_RTX);
 -           emit_insn (insn);
 -
 -           return reg;
 +              new pseudo destination.  The predicates really can't fail,
 +              nor can the generator.  */
 +           create_output_operand (&ops[0], NULL_RTX, mode);
 +           create_fixed_operand (&ops[1], temp);
 +           expand_insn (icode, 2, ops);
 +           return ops[0].value;
          }
        return temp;
       }
 @@ -8732,7 +8730,8 @@ expand_expr_real_1 (tree exp, rtx target
        enum machine_mode address_mode;
        tree base = TREE_OPERAND (exp, 0);
        gimple def_stmt;
 -       int icode, align;
 +       enum insn_code icode;
 +       int align;
        /* Handle expansion of non-aliased memory with non-BLKmode.  That
           might end up in a register.  */
        if (TREE_CODE (base) == ADDR_EXPR)
 @@ -8806,17 +8805,15 @@ expand_expr_real_1 (tree exp, rtx target
             ((icode = optab_handler (movmisalign_optab, mode))
                != CODE_FOR_nothing))
          {
 -           rtx reg, insn;
 +           struct expand_operand ops[2];

            /* We've already validated the memory, and we're creating a
 -              new pseudo destination.  The predicates really can't fail.  */
 -           reg = gen_reg_rtx (mode);
 -
 -           /* Nor can the insn generator.  */
 -           insn = GEN_FCN (icode) (reg, temp);
 -           emit_insn (insn);
 -
 -           return reg;
 +              new pseudo destination.  The predicates really can't fail,
 +              nor can the generator.  */
 +           create_output_operand (&ops[0], NULL_RTX, mode);
 +           create_fixed_operand (&ops[1], temp);
 +           expand_insn (icode, 2, ops);
 +           return ops[0].value;
          }
        return temp;
       }



Re: [PATCH 4/6] Shrink-wrapping

2011-07-11 Thread Bernd Schmidt
On 07/11/11 13:08, Richard Sandiford wrote:
 Bernd Schmidt ber...@codesourcery.com writes:
 On 07/07/11 22:08, Richard Sandiford wrote:
 Sure, I understand that returns does more than return on ARM.
 What I meant was: we'd normally want that other stuff to be
 expressed in rtl alongside the (return) rtx.  E.g. something like:

   (parallel
 [(return)
  (set (reg r4) (mem (plus (reg sp) (const_int ...
  (set (reg r5) (mem (plus (reg sp) (const_int ...
  (set (reg sp) (plus (reg sp) (const_int ...)))])

 I've thought about it some more. Isn't this just a question of
 definitions? Much like we implicitly clobber call-used registers for a
 CALL rtx, we might as well define RETURN to restore the intersection
 between regs_ever_live and call-saved regs? This is what its current
 usage implies, but I guess it's never been necessary to spell it out
 explicitly since we don't optimize across branches to the exit block.
 
 I don't think we could assume that for all targets.  On ARM, (return)
 restores registers, but on many targets it's done separately.

An instruction that does not do this should then use simple_return,
which has the appropriate definition (just return, nothing else).

For most ports I expect there is no difference, since HAVE_return tends
to have a guard that requires no epilogue (as the documentation suggests
should be the case).


Bernd


Re: [PATCH] Build a bi-arch compiler on s390-linux-gnu

2011-07-11 Thread Matthias Klose
On 03/25/2009 04:30 PM, Andreas Krebbel wrote:
 2009-03-23  Arthur Loiret  aloi...@debian.org

  * config.gcc (s390-*-linux*): If 'enabled_targets' is 'all', build
  a bi-arch compiler defaulting to 31-bit. In this case:
  (tmake_file): Add s390/t-linux64.
  * doc/install.texi: Add s390-linux to the list of targets supporting
  --enable-targets=all.
 
 This is ok. Thanks!

Now checked in.

  Matthias


[PATCH][0/N][RFC] Change POINTER_PLUS_EXPR offset type requirements

2011-07-11 Thread Richard Guenther

This is the first patch in a series of patches that will eventually lead
to changed requirements for the POINTER_PLUS_EXPR offset operand.  The
first and foremost goal is to reduce the number of sizetyped computations
in our IL (with sizetype being that oddball type that is unsigned
but sign-extended).  The following patch goes for a canonical type
precision for the offset operand (matching the precision of sizetype)
but allows both signed (preferred) and unsigned offsets.

The patch introduces several wrappers around the concept of
valid offset types to be able to convert users without actually
switching the implementations to a different set of types.

The abstractions include

 - ptrofftype_p (t) - whether t is a valid type for operand 1 of ppe
 - convert_to_ptrofftype (t) - shortcut for the fold_convert (...)
 pattern, allows for advanced promotion rules
 - common_ptrofftype (t) - needed for the rare case when you combine
 two pointer-plus-expr offsets
 - fold_build_pointer_plus[_hwi] - for the common pattern that
 first converts the offset to a proper type and then builds
 a pointer-plus-expr (or builds an offset tree from a HWI calculation)

The patch actually provides implementations for the desired final
set of types.

Thus, comments on the abstraction itself and the (choice of) final
implementation welcome.

I suppose the fold_build_pointer_plus* one is least controversial
so I'll start with singling out that and its uses.

Thanks,
Richard.

2011-06-17  Richard Guenther  rguent...@suse.de

* expr.c (expand_expr_real_2): Extend the POINTER_PLUS_EXPR offset
operand to pointer precision.
* tree-cfg.c (verify_expr): Use ptrofftype_p for POINTER_PLUS_EXPR
offset verification.
(verify_gimple_assign_binary): Likewise.
* tree.c (build2_stat): Likewise.
(build_common_tree_nodes): Build ptrofftype and uptrofftype.
* tree.h (enum size_type_kind): Add PTROFFTYPE and UPTROFFTYPE.
(ptrofftype): Define.
(uptrofftype): Likewise.
(convert_to_ptrofftype_loc): New helper function.
(convert_to_ptrofftype): Define.
(common_ptrofftype): New helper function.
(ptrofftype_p): Likewise.
(fold_build_pointer_plus_loc): New helper function.
(fold_build_pointer_plus_hwi_loc): Likewise.
(fold_build_pointer_plus): Define.
(fold_build_pointer_plus_hwi): Likewise.
* tree.def (POINTER_PLUS_EXPR): Adjust documentation.

Index: trunk/gcc/expr.c
===
*** trunk.orig/gcc/expr.c   2011-07-11 11:48:46.0 +0200
--- trunk/gcc/expr.c2011-07-11 12:57:40.0 +0200
*** expand_expr_real_2 (sepops ops, rtx targ
*** 7428,7442 
}
  
  case POINTER_PLUS_EXPR:
!   /* Even though the sizetype mode and the pointer's mode can be different
!  expand is able to handle this correctly and get the correct result 
out
!  of the PLUS_EXPR code.  */
!   /* Make sure to sign-extend the sizetype offset in a POINTER_PLUS_EXPR
!  if sizetype precision is smaller than pointer precision.  */
!   if (TYPE_PRECISION (sizetype) < TYPE_PRECISION (type))
!   treeop1 = fold_convert_loc (loc, type,
!   fold_convert_loc (loc, ssizetype,
! treeop1));
  case PLUS_EXPR:
/* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and
 something else, make sure we add the register to the constant and
--- 7428,7439 
}
  
  case POINTER_PLUS_EXPR:
!   /* Extend/truncate the offset operand to pointer width according
!  to its signedness.  */
!   if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (treeop1)))
!   treeop1 = fold_convert_loc (loc, type, treeop1);
! 
!   /* Fallthru.  */
  case PLUS_EXPR:
/* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and
 something else, make sure we add the register to the constant and
Index: trunk/gcc/tree-cfg.c
===
*** trunk.orig/gcc/tree-cfg.c   2011-07-11 11:48:46.0 +0200
--- trunk/gcc/tree-cfg.c2011-07-11 12:12:42.0 +0200
*** verify_expr (tree *tp, int *walk_subtree
*** 2845,2857 
  error (invalid operand to pointer plus, first operand is not a pointer);
  return t;
}
!   /* Check to make sure the second operand is an integer with type of
!sizetype.  */
!   if (!useless_type_conversion_p (sizetype,
!TREE_TYPE (TREE_OPERAND (t, 1
{
  error (invalid operand to pointer plus, second operand is not an 
!integer with type of sizetype);
  return t;
}
/* FALLTHROUGH */
--- 2845,2855 
  error (invalid 

Re: [PATCH] Make VRP optimize useless conversions

2011-07-11 Thread Richard Guenther
On Fri, 8 Jul 2011, Richard Guenther wrote:

 On Fri, 8 Jul 2011, Michael Matz wrote:
 
  Hi,
  
  On Fri, 8 Jul 2011, Richard Guenther wrote:
  
   It should be indeed safe with the current handling of conversions, but 
   better be safe.  So, like the following?
  
  No.  The point is that you can't compare the bounds that VRP computes with 
  each other when the outcome affects correctness.  Think about a very 
  trivial and stupid VRP, that assigns the range [WIDEST_INT_MIN .. 
  WIDEST_UINT_MAX] to each and every SSA name without looking at types and 
  operations at all (assuming that this reflects the largest int type on the 
  target).  It's useless but correct.  Of course we wouldn't implement such 
  useless range discovery, but similar situations can arise when some VRP 
  algorithms give up for certain reasons, or computation of tight bounds 
  merely isn't implemented for some operations.
  
  Your routines need to work also in the presence of such imprecise ranges.
  
  Hence, the check that the intermediate conversion is useless needs to take 
  into account the input value range (that's conservatively correct), and 
  the precision and signedness of the target type (if it can represent all 
  value of the input range the conversion was useless).  It must not look at 
  the suspected value range of the destination, precisely because it is 
  conservative only.
 
 Ok, indeed conservative is different for what VRP does and for what
 a transformation must assess.  So the following patch makes
 a conservative attempt at checking the transformation (which of
 course non-surprisingly matches what the VRP part does).
 
 So, more like the following?

The following actually works.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Can you double-check it?

Thanks,
Richard.

2011-07-11  Richard Guenther  rguent...@suse.de

* tree-vrp.c (simplify_conversion_using_ranges): Manually
translate the source value-range through the conversion chain.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 176030)
+++ gcc/tree-vrp.c  (working copy)
@@ -7347,30 +7347,55 @@ simplify_switch_using_ranges (gimple stm
 static bool
 simplify_conversion_using_ranges (gimple stmt)
 {
-  tree rhs1 = gimple_assign_rhs1 (stmt);
-  gimple def_stmt = SSA_NAME_DEF_STMT (rhs1);
-  value_range_t *final, *inner;
+  tree innerop, middleop, finaltype;
+  gimple def_stmt;
+  value_range_t *innervr;
+  double_int innermin, innermax, middlemin, middlemax;
 
-  /* Obtain final and inner value-ranges for a conversion
- sequence (final-type)(intermediate-type)inner-type.  */
-  final = get_value_range (gimple_assign_lhs (stmt));
-  if (final-type != VR_RANGE)
-return false;
+  finaltype = TREE_TYPE (gimple_assign_lhs (stmt));
+  middleop = gimple_assign_rhs1 (stmt);
+  def_stmt = SSA_NAME_DEF_STMT (middleop);
   if (!is_gimple_assign (def_stmt)
   || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
 return false;
-  rhs1 = gimple_assign_rhs1 (def_stmt);
-  if (TREE_CODE (rhs1) != SSA_NAME)
+  innerop = gimple_assign_rhs1 (def_stmt);
+  if (TREE_CODE (innerop) != SSA_NAME)
 return false;
-  inner = get_value_range (rhs1);
-  if (inner-type != VR_RANGE)
+
+  /* Get the value-range of the inner operand.  */
+  innervr = get_value_range (innerop);
+  if (innervr-type != VR_RANGE
+  || TREE_CODE (innervr-min) != INTEGER_CST
+  || TREE_CODE (innervr-max) != INTEGER_CST)
 return false;
-  /* If the value-range is preserved by the conversion sequence strip
- the intermediate conversion.  */
-  if (!tree_int_cst_equal (final-min, inner-min)
-  || !tree_int_cst_equal (final-max, inner-max))
+
+  /* Simulate the conversion chain to check if the result is equal if
+ the middle conversion is removed.  */
+  innermin = tree_to_double_int (innervr-min);
+  innermax = tree_to_double_int (innervr-max);
+  middlemin = double_int_ext (innermin, TYPE_PRECISION (TREE_TYPE (middleop)),
+ TYPE_UNSIGNED (TREE_TYPE (middleop)));
+  middlemax = double_int_ext (innermax, TYPE_PRECISION (TREE_TYPE (middleop)),
+ TYPE_UNSIGNED (TREE_TYPE (middleop)));
+  /* If the middle values do not represent a proper range fail.  */
+  if (double_int_cmp (middlemin, middlemax,
+ TYPE_UNSIGNED (TREE_TYPE (middleop)))  0)
 return false;
-  gimple_assign_set_rhs1 (stmt, rhs1);
+  if (!double_int_equal_p (double_int_ext (middlemin,
+  TYPE_PRECISION (finaltype),
+  TYPE_UNSIGNED (finaltype)),
+  double_int_ext (innermin,
+  TYPE_PRECISION (finaltype),
+  TYPE_UNSIGNED (finaltype)))
+  || !double_int_equal_p (double_int_ext (middlemax,
+ 

[ARM] Tighten predicates for misaligned loads and stores

2011-07-11 Thread Richard Sandiford
While working on another patch, I noticed that the new misaligned
load/store patterns allow REG+CONST addresses before reload, even though
the instruction (and its constraints) don't.

This patch tightens the predicates to match the existing vstN patterns.
It depends on:

  http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00795.html

which I've just applied.

The patch also makes neon_struct_operand a normal predicate instead
of a special predicate.  That was a cut-and-pasto of mine, sorry.

Tested on arm-linux-gnueabi.  OK to install?

Richard


gcc/
* config/arm/predicates.md (neon_struct_operand): Make a normal
predicate.
(neon_struct_or_register_operand): New predicate.
* config/arm/neon.md (movmisalignmode): Replace predicates
with neon_struct_or_register_operand.
(*movmisalignmode_neon_store, *movmisalignmode_neon_load): Use
neon_struct_operand instead of memory_operand.

Index: gcc/config/arm/predicates.md
===
--- gcc/config/arm/predicates.md2011-07-11 11:29:58.0 +0100
+++ gcc/config/arm/predicates.md2011-07-11 13:14:25.0 +0100
@@ -732,9 +732,13 @@ (define_special_predicate vect_par_cons
   return true; 
 })
 
-(define_special_predicate neon_struct_operand
+(define_predicate neon_struct_operand
   (and (match_code mem)
(match_test TARGET_32BIT  neon_vector_mem_operand (op, 2
 
+(define_predicate neon_struct_or_register_operand
+  (ior (match_operand 0 neon_struct_operand)
+   (match_operand 0 s_register_operand)))
+
 (define_special_predicate add_operator
   (match_code plus))
Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md  2011-07-11 12:21:19.0 +0100
+++ gcc/config/arm/neon.md  2011-07-11 13:14:25.0 +0100
@@ -372,8 +372,8 @@ (define_split
 })
 
 (define_expand movmisalignmode
-  [(set (match_operand:VDQX 0 nonimmediate_operand )
-   (unspec:VDQX [(match_operand:VDQX 1 general_operand )]
+  [(set (match_operand:VDQX 0 neon_struct_or_register_operand)
+   (unspec:VDQX [(match_operand:VDQX 1 neon_struct_or_register_operand)]
 UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN
 {
@@ -386,7 +386,7 @@ (define_expand movmisalignmode
 })
 
 (define_insn *movmisalignmode_neon_store
-  [(set (match_operand:VDX 0 memory_operand =Um)
+  [(set (match_operand:VDX 0 neon_struct_operand=Um)
(unspec:VDX [(match_operand:VDX 1 s_register_operand  w)]
UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN
@@ -394,15 +394,15 @@ (define_insn *movmisalignmode_neon_st
   [(set_attr neon_type neon_vst1_1_2_regs_vst2_2_regs)])
 
 (define_insn *movmisalignmode_neon_load
-  [(set (match_operand:VDX 0 s_register_operand =w)
-   (unspec:VDX [(match_operand:VDX 1 memory_operand  Um)]
+  [(set (match_operand:VDX 0 s_register_operand  =w)
+   (unspec:VDX [(match_operand:VDX 1 neon_struct_operand  Um)]
UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN
   vld1.V_sz_elem\t{%P0}, %A1
   [(set_attr neon_type neon_vld1_1_2_regs)])
 
 (define_insn *movmisalignmode_neon_store
-  [(set (match_operand:VQX 0 memory_operand =Um)
+  [(set (match_operand:VQX 0 neon_struct_operand=Um)
(unspec:VQX [(match_operand:VQX 1 s_register_operand  w)]
UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN
@@ -410,8 +410,8 @@ (define_insn *movmisalignmode_neon_st
   [(set_attr neon_type neon_vst1_1_2_regs_vst2_2_regs)])
 
 (define_insn *movmisalignmode_neon_load
-  [(set (match_operand:VQX 0 s_register_operand =w)
-   (unspec:VQX [(match_operand:VQX 1 memory_operand  Um)]
+  [(set (match_operand:VQX 0 s_register_operand   =w)
+   (unspec:VQX [(match_operand:VQX 1 neon_struct_operand  Um)]
UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN
   vld1.V_sz_elem\t{%q0}, %A1


[Patch] Add my name to the Write After Approval list.

2011-07-11 Thread Daniel Carrera

Hello,

As my very first commit to GCC I have added my name to the MAINTAINERS 
file in the Write After Approval section.


--
I'm not overweight, I'm undertall.
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 176151)
+++ MAINTAINERS	(revision 176152)
@@ -324,6 +324,7 @@
 Christian Bruel	christian.br...@st.com
 Kevin Buettner	kev...@redhat.com
 Andrew Cagney	cag...@redhat.com
+Daniel Carrera	dcarr...@gmail.com
 Stephane Carrez	stcar...@nerim.fr
 Gabriel Charette	gch...@google.com
 Chandra Chavva	ccha...@redhat.com


Re: [PATCH] Make VRP optimize useless conversions

2011-07-11 Thread Michael Matz
Hi,

On Mon, 11 Jul 2011, Richard Guenther wrote:

 The following actually works.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu.
 
 Can you double-check it?

Seems sensible.  Given this:

  short s;
  int i;
  for (s = 0; s <= 127; s++)
    i += (signed char)(unsigned char)s;
  return i;

(or similar), does it remove the conversions to signed and unsigned char 
now?  And does it _not_ remove them if the upper bound is 128, or the 
lower bound is -1 ?

Similar (now with extensions):
  signed char c;
  unsigned u;
  for (c = 1; c < 127; c++)
    u += (unsigned)(int)c;

The conversion to int is not necessary; but it is when the lower bound 
is -1.
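
For experimenting with ranges like these, the check in the patch can be
approximated outside of GCC.  The sketch below is only a model under stated
assumptions: 64-bit integers stand in for double_int, ext plays the role of
double_int_ext, and the final comparison of the maximum is guessed by
symmetry with the minimum.  The function names are hypothetical.  It only
models dropping the middle conversion, so it does not answer the question of
removing both conversions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Truncate V to PREC bits, then zero- or sign-extend back to 64 bits.
   Requires 1 <= PREC <= 64.  The final cast relies on the usual
   two's-complement wrap-around, as GCC provides.  */
static int64_t
ext (int64_t v, unsigned prec, bool uns)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return v;
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t low = (uint64_t) v & mask;
  if (!uns && ((low >> (prec - 1)) & 1))
    low |= ~mask;                 /* replicate the sign bit */
  return (int64_t) low;
}

/* Compare A <= B in the given signedness.  */
static bool
less_equal (int64_t a, int64_t b, bool uns)
{
  return uns ? (uint64_t) a <= (uint64_t) b : a <= b;
}

/* Return true if, for the inner range [INNERMIN, INNERMAX], converting
   through the middle type and then to the final type gives the same
   bounds as converting to the final type directly.  */
bool
middle_conversion_is_useless (int64_t innermin, int64_t innermax,
                              unsigned midprec, bool miduns,
                              unsigned finprec, bool finuns)
{
  int64_t middlemin = ext (innermin, midprec, miduns);
  int64_t middlemax = ext (innermax, midprec, miduns);

  /* If the middle values do not form a proper range, give up.  */
  if (!less_equal (middlemin, middlemax, miduns))
    return false;

  return ext (middlemin, finprec, finuns) == ext (innermin, finprec, finuns)
         && ext (middlemax, finprec, finuns) == ext (innermax, finprec, finuns);
}
```

For example, an inner range of [0, 127] with an 8-bit unsigned middle type
and an 8-bit signed final type passes the check, while an inner range of
[0, 300] with an 8-bit unsigned middle type and a 32-bit signed final type
does not.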


Ciao,
Michael.


Re: [PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

2011-07-11 Thread David Edelsohn
On Thu, Jul 7, 2011 at 4:19 PM, Richard Guenther
richard.guent...@gmail.com wrote:

 Does XLC have a similar switch whose name we can use?

The IBM XL compiler team is discussing a similar feature, but it is not
implemented yet and does not have a formal command line option name.

- David


[PLUGIN] c-family files installation

2011-07-11 Thread Romain Geissler
This patch adds a new exception to the plugin header flattening strategy.
c-family files can't be installed in the plugin include root directory as some
other files like cp/cp-tree.h will look for them in the c-family directory.

Furthermore, I had to correct an include in c-pretty-print.h so that it
looks for c-common.h in the c-family directory. That way, headers will
work out of the box when compiling a plugin; there is no need for an
additional include directory.
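
The resulting naming rule can be sketched in C as follows.  This is only a
model of the shell `case' in the install-plugin rule: the helper name
plugin_header_install_name is hypothetical, and the path is assumed to be
already relative to $(srcdir).

```c
#include <string.h>

/* Hypothetical helper, not part of GCC: return the name under which a
   plugin header would be installed.  Headers in config/ or c-family/,
   and .def files, keep their directory structure; everything else is
   flattened to its base name.  */
const char *
plugin_header_install_name (const char *rel)
{
  size_t len = strlen (rel);

  if (strncmp (rel, "config/", 7) == 0
      || strncmp (rel, "c-family/", 9) == 0
      || (len >= 4 && strcmp (rel + len - 4, ".def") == 0))
    return rel;                     /* keep the directory structure */

  const char *slash = strrchr (rel, '/');
  return slash ? slash + 1 : rel;   /* flatten to the base name */
}
```

Under this rule c-family/c-common.h stays where cp-tree.h expects it, while
cp/cp-tree.h itself is still installed as cp-tree.h.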

Builds and installs fine

OK for trunk (I have no write access)?

Romain Geissler

gcc/c-family/
2011-07-11  Romain Geissler  romain.geiss...@gmail.com

 * c-pretty-print.h: Search c-common.h in c-family

gcc/
2011-07-11  Romain Geissler  romain.geiss...@gmail.com

 PR plugins/45348
 PR plugins/48425
 PR plugins/46577
 * Makefile.in: Do not flatten c-family directory
 when installing plugin headers


Index: gcc/c-family/c-pretty-print.h
===
--- gcc/c-family/c-pretty-print.h   (revision 175907)
+++ gcc/c-family/c-pretty-print.h   (working copy)
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.
 #define GCC_C_PRETTY_PRINTER

 #include tree.h
-#include c-common.h
+#include c-family/c-common.h
 #include pretty-print.h


Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 175907)
+++ gcc/Makefile.in (working copy)
@@ -4643,7 +4643,7 @@ s-header-vars: Makefile

 # Install the headers needed to build a plugin.
 install-plugin: installdirs lang.install-plugin s-header-vars
-# We keep the directory structure for files in config and .def files. All
+# We keep the directory structure for files in config or c-family and .def files. All
 # other files are flattened to a single directory.
$(mkinstalldirs) $(DESTDIR)$(plugin_includedir)
headers=`echo $(PLUGIN_HEADERS) | tr ' ' '\012' | sort -u`; \
@@ -4656,7 +4656,7 @@ install-plugin: installdirs lang.install
  else continue; \
  fi; \
  case $$path in \
- $(srcdir)/config/* | $(srcdir)/*.def ) \
+ $(srcdir)/config/* | $(srcdir)/c-family/* | $(srcdir)/*.def ) \
base=`echo $$path | sed -e s|$$srcdirstrip/||`;; \
  *) base=`basename $$path` ;; \
  esac; \


Re: [Patch,testsuite]: Skip AVR if .text overflows

2011-07-11 Thread Georg-Johann Lay
Mike Stump wrote:
 On Jul 8, 2011, at 7:57 AM, Georg-Johann Lay wrote:
 These tests are too big for AVR: .text (128 KiB) overflows and ld
  complains.
 
 Ok to commit?
 
 Ok.  If people feel they have a nice design for `too big', let us
 know...  I think it would have to be ld message scan based, to
 allow things that are just small enough, and correctly identify
 those that are just a hair too big.  I'd preapprove work in that
 direction.  :-)

I don't know enough of dejagnu/tcl/expect to do that.
For the moment I'm happy with the explicit skip-avr-solution, and it's
just a handful of tests that fail.

Johann



Re: [patch tree-optimization]: [2 of 3]: Boolify compares more

2011-07-11 Thread Kai Tietz
2011/7/8 Richard Guenther richard.guent...@gmail.com:
 On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/7/8 Richard Guenther richard.guent...@gmail.com:
 On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 This patch - second of the series - adds boolification of comparisons in
 the gimplifier.  For this, casts from/to boolean are marked as not-useless,
 and in fold_unary_loc casts to non-boolean integral types are preserved.
 The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
 necessary - as long as fold-const handles 1-bit precision
 bitwise-expressions with truth-logic - but it has been shown to short-cut
 some more expensive folding.  So I kept it within this patch.

 Please split it out.  Also ...


 The adjusted testcase gcc.dg/uninit-15.c indicates that due to
 optimization we lose the variable's declaration in this case.  But this
 might be expected.

 In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
 test-case.  It's caused by always having boolean-type on conditions.  So
 the vectorizer sees different types, which aren't handled by the
 vectorizer right now.  Maybe this issue could be special-cased for
 boolean-types in tree-vect-loop, by making the operand for the used
 condition equal to vector-type.  But this is a subject for a different
 patch and not addressed by this series.

 There is a regression in tree-ssa/vrp47.c, and the fix is addressed
 by the 3rd patch of this series.

 Bootstrapped and regression tested for all standard-languages (plus
 Ada and Obj-C++) on host x86_64-pc-linux-gnu.

 Ok for apply?

 Regards,
 Kai


 ChangeLog

 2011-07-07  Kai Tietz  kti...@redhat.com

        * fold-const.c (fold_unary_loc): Preserve
        non-boolean-typed casts.
        * gimplify.c (gimple_boolify): Handle boolification
        of comparisons.
        (gimplify_expr): Boolifiy non aggregate-typed
        comparisons.
        * tree-cfg.c (verify_gimple_comparison): Check result
        type of comparison expression.
        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
        casts from/to boolean,
        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
        support for one-bit-precision typed X for cases X != 0 and X == 0.
        (forward_propagate_comparison): Adjust test of condition
        result.


        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
        * gcc.dg/tree-ssa/pr21031.c: Likewise.
        * gcc.dg/tree-ssa/pr30978.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
        * gcc.dg/binop-xor1.c: Mark it as expected fail.
        * gcc.dg/binop-xor3.c: Likewise.
        * gcc.dg/uninit-15.c: Adjust reported message.

 Index: gcc-head/gcc/fold-const.c
 ===
 --- gcc-head.orig/gcc/fold-const.c
 +++ gcc-head/gcc/fold-const.c
 @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
             non-integral type.
             Do not fold the result as that would not simplify further, also
             folding again results in recursions.  */
 -         if (INTEGRAL_TYPE_P (type))
 +         if (TREE_CODE (type) == BOOLEAN_TYPE)
            return build2_loc (loc, TREE_CODE (op0), type,
                               TREE_OPERAND (op0, 0),
                               TREE_OPERAND (op0, 1));
 -         else
 +         else if (!INTEGRAL_TYPE_P (type))
            return build3_loc (loc, COND_EXPR, type, op0,
                               fold_convert (type, boolean_true_node),
                               fold_convert (type, boolean_false_node));
 Index: gcc-head/gcc/gimplify.c
 ===
 --- gcc-head.orig/gcc/gimplify.c
 +++ gcc-head/gcc/gimplify.c
 @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)

     case TRUTH_NOT_EXPR:
       TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
 -      /* FALLTHRU */

 -    case EQ_EXPR: case NE_EXPR:
 -    case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR:
       /* These expressions always produce boolean results.  */
 -      TREE_TYPE (expr) = boolean_type_node;
 +      if (TREE_CODE (type) != BOOLEAN_TYPE)
 +       TREE_TYPE (expr) = boolean_type_node;
       return expr;

     default:
 +      if (COMPARISON_CLASS_P (expr))
 +       {
 +         /* There expressions always prduce boolean results.  */
 +         if (TREE_CODE (type) != BOOLEAN_TYPE)
 +           TREE_TYPE (expr) = boolean_type_node;
 +         return expr;
 +       }
       /* Other expressions that get here must have boolean values, but
         might need to be converted to the appropriate mode.  */
 -      if (type == boolean_type_node)
 +      if (TREE_CODE (type) == BOOLEAN_TYPE)
        return expr;
       return fold_convert_loc (loc, boolean_type_node, expr);
     }
 @@ -6763,7 +6768,7 @@ gimplify_expr (tree *expr_p, gimple_seq
            tree org_type = TREE_TYPE (*expr_p);

  

Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-11 Thread H.J. Lu
On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 02:04 AM, H.J. Lu wrote:

 With my original change,  I got

 (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2]
 var_decl 0x70857960 iplane)
         (const_int -4 [0xfffc])))

 I think it is safe to permute the conversion and addition operation
 if one operand is a constant and we are zero-extending.  This is
 how zero-extending works.

 Ok, I think I understand what you mean.  The key is the

   XEXP (x, 1) == convert_memory_address_addr_space
                  (to_mode, XEXP (x, 1), as)

 test.  It ensures basically that the constant has 31-bit precision, because
 otherwise the constant would change from e.g. (const_int -0x7ffc) to
 (const_int 0x8004) when zero-extending it from SImode to DImode.

 But I'm not sure it's safe.  You have,

  (zero_extend:DI (plus:SI FOO:SI (const_int Y)))

 and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).

This example contradicts what you said above: "It ensures basically that the
constant has 31-bit precision".  For zero-extend, the issue is address-wrap.
As I understand it, to support address-wrap, you need to use ptr_mode.

 If pointers extend as signed you also have a similar case.   If FOO ==
 0x7ffc and Y = 8, the result of

  (sign_extend:DI (plus:SI FOO:SI (const_int Y)))

 and

  (plus:DI FOO:DI (sign_extend:DI (const_int Y)))

 will be respectively 0x8004 (valid) and 0x8004 (invalid).
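
The two wrap-around cases can be reproduced with plain integer arithmetic.
The sketch below is not GCC code; SImode and DImode are modelled as 32-bit
and 64-bit unsigned integers, and the function names are hypothetical.  The
hex constants in the archived text appear truncated, so the inputs
0xfffffffc and 0x7ffffffc used in the comments are assumptions.

```c
#include <stdint.h>

/* Sign-extend a 32-bit value to 64 bits without relying on
   implementation-defined conversions.  */
static uint64_t
sext32 (uint32_t v)
{
  return (v & UINT32_C (0x80000000))
         ? (uint64_t) v | UINT64_C (0xffffffff00000000)
         : (uint64_t) v;
}

/* (zero_extend:DI (plus:SI FOO Y)): add in 32 bits, then extend.  */
uint64_t
zext_of_sum (uint32_t foo, uint32_t y)
{
  return (uint64_t) (uint32_t) (foo + y);
}

/* (plus:DI (zero_extend:DI FOO) (zero_extend:DI Y)): extend, then add.  */
uint64_t
sum_of_zext (uint32_t foo, uint32_t y)
{
  return (uint64_t) foo + (uint64_t) y;
}

/* (sign_extend:DI (plus:SI FOO Y)): add in 32 bits, then extend.  */
uint64_t
sext_of_sum (uint32_t foo, uint32_t y)
{
  return sext32 ((uint32_t) (foo + y));
}

/* (plus:DI (sign_extend:DI FOO) (sign_extend:DI Y)): extend, then add.  */
uint64_t
sum_of_sext (uint32_t foo, uint32_t y)
{
  return sext32 (foo) + sext32 (y);
}
```

With FOO = 0xfffffffc and Y = 8, zext_of_sum gives 0x4 while sum_of_zext
gives 0x100000004.  With FOO = 0x7ffffffc and Y = 8, sext_of_sum gives
0xffffffff80000004 while sum_of_sext gives 0x80000004.  In both cases the
32-bit addition wraps and the 64-bit one does not, which is why permuting
the extension and the addition is only safe when no wrap can occur.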


 What happens if you just return NULL instead of the assertion (good idea
 adding it!)?

 Of course then you need to:

 1) check the return values of convert_memory_address_addr_space_1, and
 propagate NULL up to simplify_unary_operation;

 2) check in simplify-rtx.c whether the return value of
 convert_memory_address_1 is NULL, and only return if the return value is not
 NULL.  This is not yet necessary (convert_memory_address is the last
 transformation for both SIGN_EXTEND and ZERO_EXTEND) but it is better to
 keep code clean.

I will give it a try.

Thanks.


-- 
H.J.


Re: Ping: The TI C6X port

2011-07-11 Thread Mike Stump
On Jul 11, 2011, at 3:18 AM, Bernd Schmidt ber...@codesourcery.com wrote:
 On 06/06/11 14:53, Gerald Pfeifer wrote:
 not a direct approval for any of the outstanding patches, but I am happy 
 to report that the steering committee is appointing you maintainer of the 
 C6X port.
 
 Please go ahead and add yourself to the MAINTAINERS file as part of the
 patch that actually adds the port (10/11 if I recall correctly).
 
 Internally, the question came up whether that means I can just commit
 the port once the preliminary patches are approved (which I think is
 now). Opinions?

My take: you need approval for everything outside your area.  Once you have 
that, and that work is checked in, you can check in all the target bits, 
self-approving those bits if they meet your standard.  You're free to reject 
them as well.  ;-)


[PATCH] Remove cgraph_get_node_or_alias

2011-07-11 Thread Martin Jambor
Hi,

cgraph_get_node_or_alias is now completely equivalent to
cgraph_get_node; in fact, it is exactly the same, character by character.
Therefore it should be removed, which is what the patch below does.

Bootstrapped and tested on x86_64-linux, OK for trunk?

Thanks,

Martin


2011-07-11  Martin Jambor  mjam...@suse.cz

* cgraph.h (cgraph_get_node_or_alias): Removed declaration.

* cgraph.c (cgraph_get_node_or_alias): Removed.
(change_decl_assembler_name): Changed all calls to
cgraph_get_node_or_alias to a call to cgraph_get_node.
(cgraph_make_decl_local): Likewise.
* lto-symtab.c (lto_symtab_resolve_symbols): Likewise.
* varasm.c (default_binds_local_p_1): Likewise.
(decl_binds_to_current_def_p): Likewise.

Index: src/gcc/cgraph.c
===
--- src.orig/gcc/cgraph.c
+++ src/gcc/cgraph.c
@@ -642,29 +642,6 @@ cgraph_add_thunk (struct cgraph_node *de
is assigned.  */
 
 struct cgraph_node *
-cgraph_get_node_or_alias (const_tree decl)
-{
-  struct cgraph_node key, *node = NULL, **slot;
-
-  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
-
-  if (!cgraph_hash)
-return NULL;
-
-  key.decl = CONST_CAST2 (tree, const_tree, decl);
-
-  slot = (struct cgraph_node **) htab_find_slot (cgraph_hash, key,
-NO_INSERT);
-
-  if (slot  *slot)
-node = *slot;
-  return node;
-}
-
-/* Returns the cgraph node assigned to DECL or NULL if no cgraph node
-   is assigned.  */
-
-struct cgraph_node *
 cgraph_get_node (const_tree decl)
 {
   struct cgraph_node key, *node = NULL, **slot;
@@ -1984,7 +1961,7 @@ change_decl_assembler_name (tree decl, t
 
   if (assembler_name_hash
   TREE_CODE (decl) == FUNCTION_DECL
-  (node = cgraph_get_node_or_alias (decl)) != NULL)
+  (node = cgraph_get_node (decl)) != NULL)
{
  tree old_name = DECL_ASSEMBLER_NAME (decl);
  slot = htab_find_slot_with_hash (assembler_name_hash, old_name,
@@ -2002,7 +1979,7 @@ change_decl_assembler_name (tree decl, t
 }
   if (assembler_name_hash
TREE_CODE (decl) == FUNCTION_DECL
-   (node = cgraph_get_node_or_alias (decl)) != NULL)
+   (node = cgraph_get_node (decl)) != NULL)
 {
   slot = htab_find_slot_with_hash (assembler_name_hash, name,
   decl_assembler_name_hash (name),
@@ -2525,7 +2502,7 @@ cgraph_make_decl_local (tree decl)
  old_name  = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
  if (TREE_CODE (decl) == FUNCTION_DECL)
{
- struct cgraph_node *node = cgraph_get_node_or_alias (decl);
+ struct cgraph_node *node = cgraph_get_node (decl);
  change_decl_assembler_name (decl,
  clone_function_name (decl, local));
  if (node-local.lto_file_data)
Index: src/gcc/cgraph.h
===
--- src.orig/gcc/cgraph.h
+++ src/gcc/cgraph.h
@@ -480,7 +480,6 @@ struct cgraph_edge *cgraph_create_indire
 int, gcov_type, int);
 struct cgraph_indirect_call_info *cgraph_allocate_init_indirect_info (void);
 struct cgraph_node * cgraph_get_node (const_tree);
-struct cgraph_node * cgraph_get_node_or_alias (const_tree);
 struct cgraph_node * cgraph_create_node (tree);
 struct cgraph_node * cgraph_get_create_node (tree);
 struct cgraph_node * cgraph_same_body_alias (struct cgraph_node *, tree, tree);
Index: src/gcc/lto-symtab.c
===
--- src.orig/gcc/lto-symtab.c
+++ src/gcc/lto-symtab.c
@@ -438,7 +438,7 @@ lto_symtab_resolve_symbols (void **slot)
   for (e = (lto_symtab_entry_t) *slot; e; e = e-next)
 {
   if (TREE_CODE (e-decl) == FUNCTION_DECL)
-   e-node = cgraph_get_node_or_alias (e-decl);
+   e-node = cgraph_get_node (e-decl);
   else if (TREE_CODE (e-decl) == VAR_DECL)
e-vnode = varpool_get_node (e-decl);
 }
Index: src/gcc/varasm.c
===
--- src.orig/gcc/varasm.c
+++ src/gcc/varasm.c
@@ -6720,7 +6720,7 @@ default_binds_local_p_1 (const_tree exp,
 }
   else if (TREE_CODE (exp) == FUNCTION_DECL  TREE_PUBLIC (exp))
 {
-  struct cgraph_node *node = cgraph_get_node_or_alias (exp);
+  struct cgraph_node *node = cgraph_get_node (exp);
   if (node
   resolution_local_p (node-resolution))
resolved_locally = true;
@@ -6808,7 +6808,7 @@ decl_binds_to_current_def_p (tree decl)
 }
   else if (TREE_CODE (decl) == FUNCTION_DECL)
 {
-  struct cgraph_node *node = cgraph_get_node_or_alias (decl);
+  struct cgraph_node *node = cgraph_get_node (decl);
   if (node
   node-resolution != LDPR_UNKNOWN)
return resolution_to_local_definition_p (node-resolution);


Re: [PATCH] Remove cgraph_get_node_or_alias

2011-07-11 Thread Jan Hubicka
 Hi,
 
 cgraph_get_node_or_alias is now completely equivalent to
 cgraph_get_node, in fact it is exactly same character-by-character.
 Therefore it should be removed, which is what the patch below does.
 
 Bootstrapped and tested on x86_64-linux, OK for trunk?
OK,
thanks!
Honza


[libgcc] Remove libgcov.c from EXCLUDES

2011-07-11 Thread Rainer Orth
After installing the libgcov move patch, I noticed that I had overlooked
an instance in gcc/po/EXCLUDES.

This patch removes it; installed as obvious.

Rainer


2011-07-09  Rainer Orth  r...@cebitec.uni-bielefeld.de

* EXCLUDES (libgcov.c): Remove.

diff --git a/gcc/po/EXCLUDES b/gcc/po/EXCLUDES
--- a/gcc/po/EXCLUDES
+++ b/gcc/po/EXCLUDES
@@ -40,7 +40,6 @@ gthr-win32.h
 gthr.h
 libgcc2.c
 libgcc2.h
-libgcov.c
 limitx.h
 limity.h
 longlong.h

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Ping: The TI C6X port

2011-07-11 Thread Gerald Pfeifer
On Mon, 11 Jul 2011, Mike Stump wrote:
 My take, you need approval for everything outside your area, once you 
 have that, and that work is checked in, then, you can check in all the 
 target bits, self approving those bits, if they meet your standard.  

That's my understanding as well.  (With the caveat that if someone is
really hating something in your new port, it would be good to take that
very seriously. :-)

Gerald


ARM: Clear icache when creating a closure

2011-07-11 Thread Andrew Haley
On a multicore ARM, you really do have to clear both caches, not just the
dcache.  This bug may exist in other ports too.
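
For readers unfamiliar with the builtin involved, here is a minimal sketch
of the general pattern: write the instruction bytes, then call
__builtin___clear_cache over the same range before executing them.  It does
not reproduce the libffi macro, and the helper name install_code is
hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copy LEN bytes of machine code to DST and make
   sure the instruction cache sees the new contents for [DST, DST + LEN).
   On targets with coherent caches the builtin may expand to nothing.  */
void *
install_code (void *dst, const void *src, size_t len)
{
  memcpy (dst, src, len);
  __builtin___clear_cache ((char *) dst, (char *) dst + len);
  return dst;
}
```

The destination must of course live in memory that is mapped executable
before anything jumps to it; the sketch only covers the cache maintenance
step.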

Andrew.


2011-07-11  Andrew Haley  a...@redhat.com

* src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.

diff --git a/src/arm/ffi.c b/src/arm/ffi.c
index 885a9cb..b2e7667 100644
--- a/src/arm/ffi.c
+++ b/src/arm/ffi.c
@@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
 ({ unsigned char *__tramp = (unsigned char*)(TRAMP);   \
unsigned int  __fun = (unsigned int)(FUN);  \
unsigned int  __ctx = (unsigned int)(CTX);  \
+   unsigned char *insns = (unsigned char *)(CTX);   \
*(unsigned int*) __tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
*(unsigned int*) __tramp[4] = 0xe59f; /* ldr r0, [pc] */   \
*(unsigned int*) __tramp[8] = 0xe59ff000; /* ldr pc, [pc] */   \
*(unsigned int*) __tramp[12] = __ctx;  \
*(unsigned int*) __tramp[16] = __fun;  \
-   __clear_cache((__tramp[0]), (__tramp[19]));   \
+   __clear_cache((__tramp[0]), (__tramp[19])); /* Clear data mapping.  */ \
+   __clear_cache(insns, insns + 3 * sizeof (unsigned int)); \
+ /* Clear instruction   \
+mapping.  */\
  })

 #endif


[PATCH] Fix gfc_trans_pointer_assign_need_temp (PR fortran/49698)

2011-07-11 Thread Jakub Jelinek
Hi!

As the attached testcase (on x86-64) shows, inner_size is initialized to
1 of a wrong type, which results in verify_stmt ICEs because a PLUS has
one 64-bit and one 32-bit operand.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk/4.6?

2011-07-11  Jakub Jelinek  ja...@redhat.com

PR fortran/49698
* trans-stmt.c (gfc_trans_pointer_assign_need_temp): Initialize
inner_size to gfc_index_one_node instead of integer_one_node.

* gfortran.dg/pr49698.f90: New test.

--- gcc/fortran/trans-stmt.c.jj 2011-07-07 13:23:57.0 +0200
+++ gcc/fortran/trans-stmt.c2011-07-11 10:53:34.0 +0200
@@ -3323,7 +3323,7 @@ gfc_trans_pointer_assign_need_temp (gfc_
   count = gfc_create_var (gfc_array_index_type, count);
   gfc_add_modify (block, count, gfc_index_zero_node);
 
-  inner_size = integer_one_node;
+  inner_size = gfc_index_one_node;
   lss = gfc_walk_expr (expr1);
   rss = gfc_walk_expr (expr2);
   if (lss == gfc_ss_terminator)
--- gcc/testsuite/gfortran.dg/pr49698.f90.jj    2011-07-11 11:32:01.0 +0200
+++ gcc/testsuite/gfortran.dg/pr49698.f90   2011-07-11 11:21:53.0 +0200
@@ -0,0 +1,15 @@
+! PR fortran/49698
+! { dg-do compile }
+subroutine foo (x, y, z)
+  type S
+integer, pointer :: e = null()
+  end type S
+  type T
+type(S), dimension(:), allocatable :: a
+  end type T
+  type(T) :: x, y
+  integer :: z, i
+  forall (i = 1 : z)
+y%a(i)%e = x%a(i)%e
+  end forall
+end subroutine foo

Jakub


Re: [PATCH] Fix gfc_trans_pointer_assign_need_temp (PR fortran/49698)

2011-07-11 Thread Tobias Burnus

On 07/11/2011 06:24 PM, Jakub Jelinek wrote:

As the attached testcase (on x86-64) shows, inner_size is initialized to
1 of a wrong type, which results in verify_stmt ICEs because a PLUS has
one 64-bit and one 32-bit operand.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk/4.6?


OK. I would even claim the patch is obvious.

Thanks for taking care of this PR.

Tobias


Define [CD]TORS_SECTION_ASM_OP on Solaris/x86 with Sun ld

2011-07-11 Thread Rainer Orth
While investigating why many libmudflap execution tests failed on
Solaris 11/x86 with Sun ld, but succeeded on Solaris 11/SPARC, I came
across the following:

The first failure is fail17-frag.  With MUDFLAP_OPTIONS=-trace-calls, I see

mf: __mfwrap_strcpy
mf: check ptr=8050f8c b=995 size=10 read location=`(strcpy src)'
mf: violation pc=fee8894d location=(strcpy src) type=1 ptr=8050f8c size=10
***
mudflap violation 1 (check/read): time=1310040104.249932 ptr=8050f8c size=10
pc=fee8894d location=`(strcpy src)'

The constant string isn't ever registered here, thus the failure.

With gld instead, I see no failure

mf: __mfwrap_strcpy
mf: check ptr=8048bac b=747 size=10 read location=`(strcpy src)'

and the string is registered very early:

mf: register ptr=0 size=1 type=0 name='NULL'
mf: register ptr=8048bac size=10 type=4 name='string literal'

This registration is from tree-mudflap.c (mudflap_enqueue_constant),
emitted at the end of mudflap_finish_file via

  cgraph_build_static_cdtor ('I', ctor_statements,
 MAX_RESERVED_INIT_PRIORITY-1);

With gld, the registration function _GLOBAL__sub_I_00099_0_main is
entered into a .ctors section, but with Sun ld I find separate .ctors
and .ctors.65436 sections.  varasm.c (get_cdtor_priority_section), which
is called by default_named_section_asm_out_constructor, states:

  /* ??? This only works reliably with the GNU linker.  */

This is obviously true: Sun ld doesn't coalesce different .ctors.N
sections, thus the constructor isn't called.

On Solaris 11/SPARC instead, CTORS_SECTION_ASM_OP is defined in
sparc/sysv4.h, and thus default_ctor_section_asm_out_constructor is
called.

The obvious solution is to define [CD]TORS_SECTION_ASM_OP on Solaris/x86
with Sun ld, too.  And indeed this fixes all remaining libmudflap
failures.
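
For readers unfamiliar with prioritized constructors, here is a minimal
sketch (not taken from the patch) of what a non-default init priority looks
like at the source level.  It assumes GCC or a compatible compiler that
accepts __attribute__((constructor(N))) and a linker that merges the
per-priority sections in priority order; all function and variable names
are made up for illustration.

```c
#include <assert.h>

/* Records the order in which the constructors below were run.  */
static int run_order[2];
static int run_count;

/* Priority 101 is the lowest value open to user code; 0..100 are reserved.
   The compiler places this in a separate numbered section, which is the
   kind of section the report says Sun ld does not coalesce.  */
__attribute__((constructor(101))) static void
prioritized_init (void)
{
  run_order[run_count++] = 101;
}

/* No explicit priority: goes into the plain, unnumbered section.  */
__attribute__((constructor)) static void
default_init (void)
{
  run_order[run_count++] = 0;
}

/* Number of constructors that actually ran before main.  */
int
constructors_run (void)
{
  return run_count;
}

/* Priority of the constructor that ran first.  */
int
first_constructor_priority (void)
{
  assert (run_count > 0);
  return run_order[0];
}
```

With a linker that handles the numbered sections, both constructors run
and the prioritized one runs first.  If the numbered section is left
uncoalesced, as described above for Sun ld, one would expect
constructors_run () to report only the default constructor.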

On the other hand, there's the question why tree-mudflap.c tries to
create a constructor with a non-default priority on a platform with
SUPPORTS_INIT_PRIORITY == 0 or at all: it seems that all is fine even
with init_priority ignored.  Either mudflap_enqueue_constant should
check for this condition, or at least the middle-end should emit an
error or a warning in this case.

I've bootstrapped the following patch without regressions on
i386-pc-solaris2.11 (Sun as/ld) and i386-pc-solaris2.8 (GNU as/ld).

Installed on mainline.

Rainer


2011-07-08  Rainer Orth  r...@cebitec.uni-bielefeld.de

* config/i386/sol2.h [!USE_GLD] (CTORS_SECTION_ASM_OP): Define.
(DTORS_SECTION_ASM_OP): Define.

diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -152,6 +152,13 @@ along with GCC; see the file COPYING3.  
 #undef TARGET_ASM_NAMED_SECTION
 #define TARGET_ASM_NAMED_SECTION i386_solaris_elf_named_section
 
+/* Unlike GNU ld, Sun ld doesn't coalesce .ctors.N/.dtors.N sections, so
+   inhibit their creation.  Also cf. sparc/sysv4.h.  */
+#ifndef USE_GLD
+#define CTORS_SECTION_ASM_OP   "\t.section\t.ctors, \"aw\""
+#define DTORS_SECTION_ASM_OP   "\t.section\t.dtors, \"aw\""
+#endif
+
 /* We do not need NT_VERSION notes.  */
 #undef X86_FILE_START_VERSION_DIRECTIVE
 #define X86_FILE_START_VERSION_DIRECTIVE false

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[build] Use libgcc copy of i386/t-crtstuff on Solaris/x86

2011-07-11 Thread Rainer Orth
I noticed that libgcc/configure.ac uses the gcc copy of i386/t-crtstuff
on Solaris/x86, while the identical libgcc copy is perfectly fine.  This
patch corrects this.

Bootstrapped without regressions on i386-pc-solaris2.11, installed on
mainline.

Rainer


2011-07-10  Rainer Orth  r...@cebitec.uni-bielefeld.de

* configure.ac (i?86-*-solaris2*): Use libgcc copy of
i386/t-crtstuff.
* configure: Regenerate.

diff --git a/libgcc/configure.ac b/libgcc/configure.ac
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -215,9 +215,7 @@ i?86-*-solaris2* | x86_64-*-solaris2.1[[
.zero   8
 EOF
   if AC_TRY_COMMAND(${CC-cc} -shared -nostartfiles -nodefaultlibs -o conftest.so conftest.s 1>&AS_MESSAGE_LOG_FD); then
-  # configure expects config files in libgcc/config, so need a relative
-  # path here.
-  tmake_file=${tmake_file} ../../gcc/config/i386/t-crtstuff
+  tmake_file=${tmake_file} i386/t-crtstuff
   fi
   ;;
 esac

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


More mudflap fixes for Solaris 11

2011-07-11 Thread Rainer Orth
When testing libmudflap on Solaris 8, 9, and 10 with GNU ld, I found a
couple of testsuite failures:

* On Solaris 10, several libmudflap.cth tests fail with

FAIL: libmudflap.cth/pass37-frag.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/trunk/local/libmudflap/testsuite/libmudflap.cth/pass37-frag.c:23: undefined reference to `sched_yield'

  Before Solaris 11, one needs -lrt for sched_yield.

* On Solaris 9 (which, unlike Solaris 10+, still provides a static
  libc), many -static tests fail:

FAIL: libmudflap.c/fail1-frag.c (-static) (test for excess errors)
Excess errors:
/vol/gcc/bin/gld-2.21.1: cannot find -ldl
collect2: error: ld returned 1 exit status

  There is no static libdl, of course, and it seems to be unnecessary in
  the testsuite anyway.  In theory, one could avoid adding it to
  mfconfig.exp (mfconfig_libs), but that complexity is probably
  unwarranted for the following.

* There is no static librt, so all -static tests fail for that reason.
  Again, one could think about only adding it for the tests that need
  it, but given that linking statically against system libraries is
  heavily frowned upon even in Solaris 8/9, I decided against it.

  Instead, I chose to add mfconfig_libs to the -static check in
  libmudflap-init, which disables them completely for Solaris.

* Not a testsuite issue, but the pth directory is now completely
  unused/unnecessary, so I don't create it.

With this patch, all libmudflap tests (with the exception of 64-bit
libmudflap.c++/pass55-frag.cxx) pass on i386-pc-solaris2.11,
i386-pc-solaris2.8, sparc-sun-solaris2.8, and x86_64-unknown-linux-gnu.

Ok for mainline?

Rainer


2011-07-08  Rainer Orth  r...@cebitec.uni-bielefeld.de

* configure.ac: Don't create pth.
Check for library containing sched_yield.
* configure: Regenerate.
* config.h.in: Regenerate.

* testsuite/lib/libmudflap.exp (libmudflap-init): Use
mfconfig_libs in -static check.

diff --git a/libmudflap/configure.ac b/libmudflap/configure.ac
--- a/libmudflap/configure.ac
+++ b/libmudflap/configure.ac
@@ -112,12 +112,6 @@ else
 fi
 AC_SUBST(MF_HAVE_UINTPTR_T)
 
-if test ! -d pth
-then
-  # libmudflapth objects are built in this subdirectory
-  mkdir pth
-fi
-
 AC_CHECK_HEADERS(pthread.h)
 
 AC_MSG_CHECKING([for thread model used by GCC])
@@ -150,6 +144,7 @@ AC_SUBST(build_libmudflapth)
 AC_CHECK_LIB(dl, dlsym)
 
 AC_CHECK_FUNC(connect,, AC_CHECK_LIB(socket, connect))
+AC_CHECK_FUNC(sched_yield,, AC_CHECK_LIB(rt, sched_yield))
 
 # Calculate toolexeclibdir
 # Also toolexecdir, though it's only used in toolexeclibdir
diff --git a/libmudflap/testsuite/lib/libmudflap.exp 
b/libmudflap/testsuite/lib/libmudflap.exp
--- a/libmudflap/testsuite/lib/libmudflap.exp
+++ b/libmudflap/testsuite/lib/libmudflap.exp
@@ -124,9 +124,11 @@ proc libmudflap-init { language } {
 
 # If there is no static library then don't run tests with -static.
 global tool
+global mfconfig_libs
 set opts additional_flags=-static
 lappend opts additional_flags=-fmudflap
 lappend opts additional_flags=-lmudflap
+lappend opts libs=$mfconfig_libs
 set src stlm[pid].c
 set exe stlm[pid].x
 
-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-11 Thread H.J. Lu
On Mon, Jul 11, 2011 at 8:54 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 02:04 AM, H.J. Lu wrote:

 With my original change,  I got

 (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2]
 var_decl 0x70857960 iplane)
         (const_int -4 [0xfffc])))

 I think it is safe to permute the conversion and addition operation
 if one operand is a constant and we are zero-extending.  This is
 how zero-extending works.

 Ok, I think I understand what you mean.  The key is the

   XEXP (x, 1) == convert_memory_address_addr_space
                  (to_mode, XEXP (x, 1), as)

 test.  It ensures basically that the constant has 31-bit precision, because
 otherwise the constant would change from e.g. (const_int -0x7ffc) to
 (const_int 0x8004) when zero-extending it from SImode to DImode.

 But I'm not sure it's safe.  You have,

  (zero_extend:DI (plus:SI FOO:SI) (const_int Y))

 and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).

 This example contradicts what you said above: "It ensures basically that the
 constant has 31-bit precision".  For zero-extend, the issue is address-wrap.
 As I understand it, to support address-wrap, you need to use ptr_mode.


I am totally confused what the current code

  /* For addition we can safely permute the conversion and addition
     operation if one operand is a constant and converting the constant
     does not change it or if one operand is a constant and we are
     using a ptr_extend instruction  (POINTERS_EXTEND_UNSIGNED < 0).
     We can always safely permute them if we are making the address
     narrower.  */
  if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
      || (GET_CODE (x) == PLUS
          && CONST_INT_P (XEXP (x, 1))
          && (XEXP (x, 1) == convert_memory_address_addr_space
                               (to_mode, XEXP (x, 1), as)
              || POINTERS_EXTEND_UNSIGNED < 0)))
    return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                           convert_memory_address_addr_space
                             (to_mode, XEXP (x, 0), as),
                           XEXP (x, 1));

is trying to do.  It doesn't support address-wrap at all, regardless if
converting the constant changes the constant.  I think it should be
OK to permute if no instructions are allowed, like:

  if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
      || (GET_CODE (x) == PLUS
          && CONST_INT_P (XEXP (x, 1))
          && POINTERS_EXTEND_UNSIGNED != 0
          && no_emit))
    return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                           convert_memory_address_addr_space_1
                             (to_mode, XEXP (x, 0), as, no_emit),
                           XEXP (x, 1));
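
The address-wrap objection quoted above can be reproduced with ordinary
fixed-width integers.  This is only a sketch under stated assumptions: it
takes FOO to be the 32-bit value 0xfffffffc (chosen because it gives the
0x4 result mentioned above), models SImode and DImode with uint32_t and
uint64_t, and uses illustrative function names that are not GCC's.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI foo y)): add in 32 bits, wrapping
   modulo 2^32, then widen to 64 bits.  */
uint64_t
extend_after_add (uint32_t foo, uint32_t y)
{
  uint32_t sum = foo + y;
  return (uint64_t) sum;
}

/* Models (plus:DI (zero_extend:DI foo) y): widen first, then add in
   64 bits, so a carry out of bit 31 is kept.  */
uint64_t
extend_before_add (uint32_t foo, uint32_t y)
{
  return (uint64_t) foo + (uint64_t) y;
}

/* Nonzero when permuting the extension and the addition keeps the value.  */
int
permutation_is_safe (uint32_t foo, uint32_t y)
{
  return extend_after_add (foo, y) == extend_before_add (foo, y);
}

/* The wrapping case under the assumed FOO value, plus a case with no
   carry out of bit 31 where both forms agree.  */
void
check_address_wrap_example (void)
{
  assert (extend_after_add (0xfffffffcu, 8) == 0x4);
  assert (extend_before_add (0xfffffffcu, 8) == 0x100000004ull);
  assert (!permutation_is_safe (0xfffffffcu, 8));
  assert (permutation_is_safe (0x1000u, 8));
}
```

The two forms differ exactly when the 32-bit addition carries out of
bit 31, which is the wrap case the discussion is about.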


-- 
H.J.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-11 Thread Andrew Stubbs

On 07/07/11 10:58, Richard Guenther wrote:

I think you should assume that series of widenings, (int)(short)char_variable
are already combined.  Thus I believe you only need to consider a single
conversion in valid_types_for_madd_p.


Ok, here's my new patch.

This version only allows one conversion between the multiply and 
addition, so assumes that VRP has eliminated any needless ones.


That one conversion may either be a truncate, if the mode was too large 
for the meaningful data, or an extend, which must be of the right flavour.


This means that this patch now has the same effect as the last patch,
for all valid cases (following your VRP patch), but rejects the cases
where the C language (unhelpfully) requires an intermediate temporary to
be of the 'wrong' signedness.


Hopefully the output will now be the same between both -O0 and -O2, and 
programmers will continue to have to be careful about casting unsigned 
variables whenever they expect purely unsigned math. :(
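
To see why a truncating conversion between the multiply and the add has
to block the transformation, here is a small sketch modelled on the
no-wmla-1.c test in the patch.  It assumes a 16-bit short and a 32-bit
int, and it relies on GCC's documented behaviour of reducing an
out-of-range value modulo 2^16 when converting to short (the C standard
leaves that conversion implementation-defined).  The function names are
illustrative only.

```c
#include <assert.h>

/* Same shape as no-wmla-1.c: the product is narrowed to short before
   the addition, so the high bits of the product are discarded.  */
int
add_truncated_product (int a, short b, short c)
{
  int bc = b * c;
  return a + (short) bc;
}

/* The shape a widening multiply-and-accumulate computes: the full
   product is added.  */
int
add_full_product (int a, short b, short c)
{
  return a + b * c;
}

/* 300 * 300 = 90000 does not fit in 16 bits; 90000 - 65536 = 24464.  */
void
check_truncation_matters (void)
{
  assert (add_full_product (0, 300, 300) == 90000);
  assert (add_truncated_product (0, 300, 300) == 24464);
}
```

Because the two functions can return different values, fusing the
truncated form into a single multiply-and-accumulate would change the
result, which is why the test only expects a plain mul.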


Is this one ok?

Andrew
2011-07-11  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
	conversion statement separating multiply-and-accumulate.

	gcc/testsuite/
	* gcc.target/arm/wmul-5.c: New file.
	* gcc.target/arm/no-wmla-1.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2135,6 +2135,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			enum tree_code code)
 {
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+  gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
   tree type, type1, type2;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2175,6 +2176,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   else
 return false;
 
+  /* Allow for one conversion statement between the multiply
+ and addition/subtraction statement.  If there is more than
+ one conversion then we assume they would invalidate this
+ transformation.  If that's not the case then they should have
+ been folded before now.  */
+  if (CONVERT_EXPR_CODE_P (rhs1_code))
+{
+  conv1_stmt = rhs1_stmt;
+  rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+  if (TREE_CODE (rhs1) == SSA_NAME)
+	{
+	  rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+	  if (is_gimple_assign (rhs1_stmt))
+	rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+	}
+  else
+	return false;
+}
+  if (CONVERT_EXPR_CODE_P (rhs2_code))
+{
+  conv2_stmt = rhs2_stmt;
+  rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+  if (TREE_CODE (rhs2) == SSA_NAME)
+	{
+	  rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+	  if (is_gimple_assign (rhs2_stmt))
+	rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+	}
+  else
+	return false;
+}
+
   /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
  is_widening_mult_p, but we still need the rhs returns.
 
@@ -2188,6 +2221,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			   type2, mult_rhs2))
 	return false;
   add_rhs = rhs2;
+  conv_stmt = conv1_stmt;
 }
   else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
 {
@@ -2195,6 +2229,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			   type2, mult_rhs2))
 	return false;
   add_rhs = rhs1;
+  conv_stmt = conv2_stmt;
 }
   else
 return false;
@@ -2202,6 +2237,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
 return false;
 
+  /* If there was a conversion between the multiply and addition
+ then we need to make sure it fits a multiply-and-accumulate.
+ There should be a single mode change which does not change the
+ value.  */
+  if (conv_stmt)
+{
+  tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+  tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+  int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+  bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+  if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+	{
+	  /* Conversion is a truncate.  */
+	  if (TYPE_PRECISION (to_type) < data_size)
+	return false;
+	}
+  else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+	{
+	  /* Conversion is an extend.  Check 

[PATCH] Create smaller DWARF ops for some int_loc_descriptor constants etc. (PR debug/49676)

2011-07-11 Thread Jakub Jelinek
Hi!

While working on the last dwarf2out.c patch, I've noticed we can generate
more compact DWARF location descriptions in several cases, e.g.
DW_OP_constu 0xf8000
is 7 bytes long, while
DW_OP_lit31 DW_OP_lit31 DW_OP_shl
pushes the same value to the stack and is just 3 bytes long.
The patch adjusts {,size_of_}int_loc_descriptor to generate those and
similar sequences when shorter (for case of the same length prefers
single op over several smaller ops though).
In addition to that, it attempts to optimize DW_OP_GNU_const_type
into int_loc_descriptor + DW_OP_GNU_convert when possible, and
optimizes DW_OP_plus_uconst const into int_loc_descriptor (const) DW_OP_plus
if the latter is shorter.
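
The size comparison behind this can be sketched in a few lines.  This is
not the patch's code: it is a standalone illustration with made-up
function names, assuming the usual DWARF encoding in which each opcode
takes one byte and the DW_OP_constu operand is an unsigned LEB128 number
carrying 7 payload bits per byte.  The value used is 31 << 31, which is
what DW_OP_lit31 DW_OP_lit31 DW_OP_shl leaves on the stack.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to encode VALUE as unsigned LEB128.  */
unsigned
uleb128_size (uint64_t value)
{
  unsigned size = 0;
  do
    {
      value >>= 7;
      size++;
    }
  while (value != 0);
  return size;
}

/* Size of DW_OP_constu <value>: one opcode byte plus the operand.  */
unsigned
constu_size (uint64_t value)
{
  return 1 + uleb128_size (value);
}

/* Size of DW_OP_litX DW_OP_litY DW_OP_shl: three one-byte opcodes.
   Only usable when both the base and the shift count are in 0..31.  */
unsigned
lit_shift_size (unsigned base, unsigned shift)
{
  assert (base <= 31 && shift <= 31);
  return 3;
}

/* Nonzero when the three-opcode form is strictly shorter than
   DW_OP_constu for the value BASE << SHIFT.  */
int
shift_form_is_shorter (unsigned base, unsigned shift)
{
  uint64_t value = (uint64_t) base << shift;
  return lit_shift_size (base, shift) < constu_size (value);
}
```

For 31 << 31 the operand needs 36 bits, hence 6 LEB128 bytes and 7 bytes
in total for DW_OP_constu, against 3 bytes for the shifted form.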

On the attached testcase .debug_info on x86_64 -m64 shrunk from
1092 bytes to 986 bytes (i.e. almost 10% reduction, though on artificial
testcase), and for -m32 it shrunk from 1440 to 1295 bytes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-07-11  Jakub Jelinek  ja...@redhat.com

PR debug/49676
* dwarf2out.c (int_shift_loc_descriptor): New function.
(int_loc_descriptor): If shorter, emit i as
(i >> shift), shift, DW_OP_shl for suitable shift value.
Similarly, try to optimize large negative values using
DW_OP_neg of a positive value if shorter.
(size_of_int_shift_loc_descriptor): New function.
(size_of_int_loc_descriptor): Adjust to match int_loc_descriptor
changes.
(mem_loc_descriptor) case CONST_INT: Emit zero-extended constants
that fit into DWARF2_ADDR_SIZE bytes as int_loc_descriptor +
DW_OP_GNU_convert instead of DW_OP_GNU_const_type if the former
is shorter.
(resolve_addr_in_expr): Optimize DW_OP_plus_uconst with a large
addend as added DW_OP_plus if it is shorter.

--- gcc/dwarf2out.c.jj  2011-07-11 10:39:50.0 +0200
+++ gcc/dwarf2out.c 2011-07-11 14:38:07.0 +0200
@@ -10135,6 +10135,21 @@ multiple_reg_loc_descriptor (rtx rtl, rt
   return loc_result;
 }
 
+static unsigned long size_of_int_loc_descriptor (HOST_WIDE_INT);
+
+/* Return a location descriptor that designates a constant i,
+   as a compound operation from constant (i >> shift), constant shift
+   and DW_OP_shl.  */
+
+static dw_loc_descr_ref
+int_shift_loc_descriptor (HOST_WIDE_INT i, int shift)
+{
+  dw_loc_descr_ref ret = int_loc_descriptor (i >> shift);
+  add_loc_descr (ret, int_loc_descriptor (shift));
+  add_loc_descr (ret, new_loc_descr (DW_OP_shl, 0, 0));
+  return ret;
+}
+
 /* Return a location descriptor that designates a constant.  */
 
 static dw_loc_descr_ref
@@ -10146,15 +10161,45 @@ int_loc_descriptor (HOST_WIDE_INT i)
  defaulting to the LEB encoding.  */
   if (i >= 0)
     {
+      int clz = clz_hwi (i);
+      int ctz = ctz_hwi (i);
       if (i <= 31)
         op = (enum dwarf_location_atom) (DW_OP_lit0 + i);
       else if (i <= 0xff)
         op = DW_OP_const1u;
       else if (i <= 0xffff)
         op = DW_OP_const2u;
-      else if (HOST_BITS_PER_WIDE_INT == 32
-               || i <= 0xffffffff)
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 5
+               && clz + 5 + 255 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_litX DW_OP_litY DW_OP_shl takes just 3 bytes and
+           DW_OP_litX DW_OP_const1u Y DW_OP_shl takes just 4 bytes,
+           while DW_OP_const4u is 5 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 5);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 8
+               && clz + 8 + 31 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const1u X DW_OP_litY DW_OP_shl takes just 4 bytes,
+           while DW_OP_const4u is 5 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
+      else if (HOST_BITS_PER_WIDE_INT == 32 || i <= 0xffffffff)
         op = DW_OP_const4u;
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 8
+               && clz + 8 + 255 >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const1u X DW_OP_const1u Y DW_OP_shl takes just 5 bytes,
+           while DW_OP_constu of constant >= 0x100000000 takes at least
+           6 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 16
+               && clz + 16 + (size_of_uleb128 (i) > 5 ? 255 : 31)
+                  >= HOST_BITS_PER_WIDE_INT)
+        /* DW_OP_const2u X DW_OP_litY DW_OP_shl takes just 5 bytes,
+           DW_OP_const2u X DW_OP_const1u Y DW_OP_shl takes 6 bytes,
+           while DW_OP_constu takes in this case at least 6 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 16);
+      else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 32
+               && clz + 32 + 31 >= HOST_BITS_PER_WIDE_INT
+               && size_of_uleb128 (i) > 6)
+        /* DW_OP_const4u X DW_OP_litY DW_OP_shl takes just 7 bytes.  */
+        return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 32);
   else
op = 

[PATCH, 4.6, PR 49094, committed] Backport of fixes for PR 49094

2011-07-11 Thread Martin Jambor
Hi,

I have just committed the following to the 4.6 branch (after
re-testing and as the rev. 176166) to fix PR 49094 there too.  It's
the following two patches which are already in trunk in one:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02342.html and
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02371.html

The first has been explicitly approved for the branch, the second is
a straightforward followup.  Sorry it took so long, I just nearly
forgot to do the backport too.

Thanks,

Martin



2011-07-11  Martin Jambor  mjam...@suse.cz

PR tree-optimization/49094
* tree-sra.c (tree_non_mode_aligned_mem_p): New function.
(build_accesses_from_assign): Use it.

* testsuite/gcc.dg/tree-ssa/pr49094.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr49094.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr49094.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr49094.c (revision 0)
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+struct in_addr {
+   unsigned int s_addr;
+};
+
+struct ip {
+   unsigned char ip_p;
+   unsigned short ip_sum;
+   struct  in_addr ip_src,ip_dst;
+} __attribute__ ((aligned(1), packed));
+
+struct ip ip_fw_fwd_addr;
+
+int test_alignment( char *m )
+{
+  struct ip *ip = (struct ip *) m;
+  struct in_addr pkt_dst;
+  pkt_dst = ip->ip_dst ;
+  if( pkt_dst.s_addr == 0 )
+return 1;
+  else
+return 0;
+}
+
+int __attribute__ ((noinline, noclone))
+intermediary (char *p)
+{
+  return test_alignment (p);
+}
+
+int
+main (int argc, char *argv[])
+{
+  ip_fw_fwd_addr.ip_dst.s_addr = 1;
+  return intermediary ((void *) &ip_fw_fwd_addr);
+}
Index: gcc/tree-sra.c
===
--- gcc/tree-sra.c  (revision 176152)
+++ gcc/tree-sra.c  (working copy)
@@ -1020,6 +1020,27 @@ disqualify_ops_if_throwing_stmt (gimple
   return false;
 }
 
+/* Return true iff type of EXP is not sufficiently aligned.  */
+
+static bool
+tree_non_mode_aligned_mem_p (tree exp)
+{
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  unsigned int align;
+
+  if (TREE_CODE (exp) == SSA_NAME
+  || TREE_CODE (exp) == MEM_REF
+  || mode == BLKmode
+  || !STRICT_ALIGNMENT)
+return false;
+
+  align = get_object_alignment (exp, BIGGEST_ALIGNMENT);
+  if (GET_MODE_ALIGNMENT (mode) > align)
+return true;
+
+  return false;
+}
+
 /* Scan expressions occuring in STMT, create access structures for all accesses
to candidates for scalarization and remove those candidates which occur in
statements or expressions that prevent them from being split apart.  Return
@@ -1044,7 +1065,10 @@ build_accesses_from_assign (gimple stmt)
   lacc = build_access_from_expr_1 (lhs, stmt, true);
 
   if (lacc)
-lacc->grp_assignment_write = 1;
+{
+  lacc->grp_assignment_write = 1;
+  lacc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (rhs);
+}
+}
 
   if (racc)
 {
@@ -1052,6 +1076,7 @@ build_accesses_from_assign (gimple stmt)
   if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
   && !is_gimple_reg_type (racc->type))
 bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
+  racc->grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (lhs);
 }
 
   if (lacc && racc


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-11 Thread H.J. Lu
On Mon, Jul 11, 2011 at 9:55 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jul 11, 2011 at 8:54 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Jul 11, 2011 at 4:03 AM, Paolo Bonzini bonz...@gnu.org wrote:
 On 07/11/2011 02:04 AM, H.J. Lu wrote:

 With my original change,  I got

 (const:DI (plus:DI (symbol_ref:DI (iplane.1577) [flags 0x2]
 var_decl 0x70857960 iplane)
         (const_int -4 [0xfffc])))

 I think it is safe to permute the conversion and addition operation
 if one operand is a constant and we are zero-extending.  This is
 how zero-extending works.

 Ok, I think I understand what you mean.  The key is the

   XEXP (x, 1) == convert_memory_address_addr_space
                  (to_mode, XEXP (x, 1), as)

 test.  It ensures basically that the constant has 31-bit precision, because
 otherwise the constant would change from e.g. (const_int -0x7ffc) to
 (const_int 0x8004) when zero-extending it from SImode to DImode.

 But I'm not sure it's safe.  You have,

  (zero_extend:DI (plus:SI FOO:SI) (const_int Y))

 and you want to convert it to

  (plus:DI FOO:DI (zero_extend:DI (const_int Y)))

 (where the zero_extend is folded).  Ignore that FOO is a SYMBOL_REF (this
 piece of code does not assume anything about its shape); if FOO ==
 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and
 0x10004 (invalid).

  This example contradicts what you said above: "It ensures basically that the
  constant has 31-bit precision".  For zero-extend, the issue is address-wrap.
  As I understand it, to support address-wrap, you need to use ptr_mode.


 I am totally confused what the current code

      /* For addition we can safely permute the conversion and addition
         operation if one operand is a constant and converting the constant
         does not change it or if one operand is a constant and we are
         using a ptr_extend instruction  (POINTERS_EXTEND_UNSIGNED < 0).
         We can always safely permute them if we are making the address
         narrower.  */
      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && (XEXP (x, 1) == convert_memory_address_addr_space
                                   (to_mode, XEXP (x, 1), as)
                  || POINTERS_EXTEND_UNSIGNED < 0)))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space
                                 (to_mode, XEXP (x, 0), as),
                               XEXP (x, 1));

 is trying to do.  It doesn't support address-wrap at all, regardless if
 converting the constant changes the constant.  I think it should be
 OK to permute if no instructions are allowed, like:

      if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
          || (GET_CODE (x) == PLUS
              && CONST_INT_P (XEXP (x, 1))
              && POINTERS_EXTEND_UNSIGNED != 0
              && no_emit))
        return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
                               convert_memory_address_addr_space_1
                                 (to_mode, XEXP (x, 0), as, no_emit),
                               XEXP (x, 1));



This patch implements it.

-- 
H.J.
---
2011-07-11  H.J. Lu  hongjiu...@intel.com

PR middle-end/47727
* explow.c (convert_memory_address_addr_space_1): New.
(convert_memory_address_addr_space): Use it.

* expr.c (convert_modes_1): New.
(convert_modes): Use it.

* expr.h (convert_modes_1): New.

* rtl.h (convert_memory_address_addr_space_1): New.
(convert_memory_address_1): Likewise.

* simplify-rtx.c (simplify_unary_operation_1): Call
convert_memory_address_1 instead of convert_memory_address.

diff --git a/gcc/explow.c b/gcc/explow.c
index 3c692f4..d2c54ff 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -320,8 +320,9 @@ break_out_memory_refs (rtx x)
arithmetic insns can be used.  */
 
 rtx
-convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
-  rtx x, addr_space_t as ATTRIBUTE_UNUSED)
+convert_memory_address_addr_space_1 (enum machine_mode to_mode 
ATTRIBUTE_UNUSED,
+rtx x, addr_space_t as ATTRIBUTE_UNUSED,
+bool no_emit ATTRIBUTE_UNUSED)
 {
 #ifndef POINTERS_EXTEND_UNSIGNED
   gcc_assert (GET_MODE (x) == 

[v3] Small testsuite patch for -Wall

2011-07-11 Thread Paolo Carlini

Hi,

tested x86_64-linux, committed.

Paolo.

/
2011-07-11  Paolo Carlini  paolo.carl...@oracle.com

* testsuite/util/testsuite_allocator.h (propagating_allocator::
operator=(const propagating_allocator)): Return *this.
Index: testsuite/util/testsuite_allocator.h
===
--- testsuite/util/testsuite_allocator.h(revision 176144)
+++ testsuite/util/testsuite_allocator.h(working copy)
@@ -408,6 +408,7 @@
{
  static_assert(P2, "assigning propagating_allocator<T, true>");
  propagating_allocator(a).swap_base(*this);
+ return *this;
}
 
   // postcondition: a.get_personality() == 0


[build] Move crtfastmath to toplevel libgcc

2011-07-11 Thread Rainer Orth
Another low-hanging fruit in the toplevel libgcc move is crtfastmath.

The following patch moves the various crtfastmath.c files over to libgcc
and removes the remnants of the gcc side of the configuration.

Unfortunately, one piece needs to stay behind: crtfastmath.o must remain
in gcc/config/i386/t-linux64 (EXTRA_MULTILIB_PARTS): If I remove just
crtfastmath.o, the libgcc Makefile detects a mismatch between the
extra_parts lists of gcc and libgcc.  If I remove the whole variable,
the *-*-linux* default from config.gcc kicks in and we get another
mismatch ;-(

There's one other question here: alpha/t-crtfm uses
-frandom-seed=gcc-crtfastmath with this comment:

# FIXME drow/20061228 - I have preserved this -frandom-seed option
# while migrating this rule from the GCC directory, but I do not
# know why it is necessary if no other crt file uses it.

Is there any particular reason to either keep this or not to use it in
the generic file?  This way, only i386 needs to stay separate with its
use of -msse -minline-all-stringops.

Bootstrapped without regressions on i386-pc-solaris2.11 and
x86_64-unknown-linux-gnu.

Ok for mainline?

Rainer


2011-07-10  Rainer Orth  r...@cebitec.uni-bielefeld.de

gcc:
* config/alpha/crtfastmath.c: Move to ../libgcc/config/alpha.
* config/alpha/t-crtfm: Remove.
* config/i386/crtfastmath.c: Move to ../libgcc/config/i386.
* config/i386/t-crtfm: Remove.
* config/ia64/crtfastmath.c: Move to ../libgcc/config/ia64.
* config/mips/crtfastmath.c: Move to ../libgcc/config/mips.
* config/sparc/crtfastmath.c: Move to ../libgcc/config/sparc.
* config/sparc/t-crtfm: Remove.

* config.gcc (alpha*-*-linux*): Remove alpha/t-crtfm from tmake_file.
(alpha*-*-freebsd*): Likewise.
(i[34567]86-*-darwin*): Remove i386/t-crtfm from tmake_file.
(x86_64-*-darwin*): Likewise.
(i[34567]86-*-linux*): Likewise.
(x86_64-*-linux*): Likewise.
(x86_64-*-mingw*): Likewise.
(ia64*-*-elf*): Remove crtfastmath.o from extra_parts.
(ia64*-*-freebsd*): Likewise.
(ia64*-*-linux*): Likewise.
(mips64*-*-linux*): Likewise.
(mips*-*-linux*): Likewise.
(sparc-*-linux*): Remove sparc/t-crtfm from tmake_file.
(sparc64-*-linux*): Likewise.
(sparc64-*-freebsd*): Likewise.

libgcc:
* config/alpha/crtfastmath.c: New file.
* config/i386/crtfastmath.c: New file.
* config/ia64/crtfastmath.c: New file.
* config/mips/crtfastmath.c: New file.
* config/sparc/crtfastmath.c: New file.

* config/t-crtfm (crtfastmath.o): Use $(srcdir) to refer to
crtfastmath.c.
* config/alpha/t-crtfm: Likewise.
* config/i386/t-crtfm: Likewise.
* config/ia64/t-ia64 (crtfastmath.o): Remove.

* config.host (alpha*-*-freebsd*): Add alpha/t-crtfm to tmake_file.
Add crtfastmath.o to extra_parts.
(i[34567]86-*-darwin*): Add i386/t-crtfm to tmake_file.
Add crtfastmath.o to extra_parts.
(x86_64-*-darwin*): Likewise.
(x86_64-*-mingw*): Likewise.
(ia64*-*-elf*): Add t-crtfm to tmake_file.
(ia64*-*-freebsd*): Likewise.
(ia64*-*-linux*): Likewise.
(sparc64-*-freebsd*): Add t-crtfm to tmake_file.
Add crtfastmath.o to extra_parts.

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -756,13 +756,13 @@ alpha*-*-linux*)
 	tm_file="${tm_file} alpha/elf.h alpha/linux.h alpha/linux-elf.h glibc-stdint.h"
 	extra_options="${extra_options} alpha/elf.opt"
 	target_cpu_default="MASK_GAS"
-	tmake_file="${tmake_file} alpha/t-crtfm alpha/t-alpha alpha/t-ieee alpha/t-linux"
+	tmake_file="${tmake_file} alpha/t-alpha alpha/t-ieee alpha/t-linux"
 	;;
 alpha*-*-freebsd*)
 	tm_file="${tm_file} ${fbsd_tm_file} alpha/elf.h alpha/freebsd.h"
 	extra_options="${extra_options} alpha/elf.opt"
 	target_cpu_default="MASK_GAS"
-	tmake_file="${tmake_file} alpha/t-crtfm alpha/t-alpha alpha/t-ieee"
+	tmake_file="${tmake_file} alpha/t-alpha alpha/t-ieee"
 	extra_parts="crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o"
 	;;
 alpha*-*-netbsd*)
@@ -1208,12 +1208,12 @@ i[34567]86-*-darwin*)
 	need_64bit_isa=yes
 	# Baseline choice for a machine that allows m64 support.
 	with_cpu=${with_cpu:-core2}
-	tmake_file="${tmake_file} t-slibgcc-dummy i386/t-crtpc i386/t-crtfm"
+	tmake_file="${tmake_file} t-slibgcc-dummy i386/t-crtpc"
 	libgcc_tm_file="$libgcc_tm_file i386/darwin-lib.h"
 	;;
 x86_64-*-darwin*)
 	with_cpu=${with_cpu:-core2}
-	tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy i386/t-crtpc i386/t-crtfm"
+	tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy i386/t-crtpc"
 	tm_file="${tm_file} ${cpu_type}/darwin64.h"
libgcc_tm_file=$libgcc_tm_file 

Re: [PATCH] Create smaller DWARF ops for some int_loc_descriptor constants etc. (PR debug/49676)

2011-07-11 Thread Richard Henderson
On 07/11/2011 09:33 AM, Jakub Jelinek wrote:
   PR debug/49676
   * dwarf2out.c (int_shift_loc_descriptor): New function.
   (int_loc_descriptor): If shorter, emit i as
   (i >> shift), shift, DW_OP_shl for suitable shift value.
   Similarly, try to optimize large negative values using
   DW_OP_neg of a positive value if shorter.
   (size_of_int_shift_loc_descriptor): New function.
   (size_of_int_loc_descriptor): Adjust to match int_loc_descriptor
   changes.
   (mem_loc_descriptor) <case CONST_INT>: Emit zero-extended constants
   that fit into DWARF2_ADDR_SIZE bytes as int_loc_descriptor +
   DW_OP_GNU_convert instead of DW_OP_GNU_const_type if the former
   is shorter.
   (resolve_addr_in_expr): Optimize DW_OP_plus_uconst with a large
   addend as added DW_OP_plus if it is shorter.

Ok.


r~
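
To illustrate the size comparison described in the ChangeLog above, here is a minimal sketch in C. It is not the code from dwarf2out.c: the function names are hypothetical, only unsigned constants are handled, and the byte counts are taken from the DWARF operation encodings (DW_OP_lit0 to DW_OP_lit31 take 1 byte, DW_OP_const1u/2u/4u/8u take 2/3/5/9 bytes, DW_OP_constu takes 1 byte plus a ULEB128 operand, DW_OP_shl takes 1 byte).

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes needed to encode V as ULEB128.  */
static unsigned
uleb128_size (uint64_t v)
{
  unsigned n = 0;

  do
    {
      n++;
      v >>= 7;
    }
  while (v != 0);
  return n;
}

/* Hypothetical helper: size in bytes of the smallest single operation
   that pushes the unsigned constant V.  */
static unsigned
plain_const_size (uint64_t v)
{
  unsigned fixed, leb;

  if (v <= 31)
    return 1;                   /* DW_OP_lit0 .. DW_OP_lit31 */
  if (v <= 0xff)
    fixed = 2;                  /* DW_OP_const1u */
  else if (v <= 0xffff)
    fixed = 3;                  /* DW_OP_const2u */
  else if (v <= 0xffffffffu)
    fixed = 5;                  /* DW_OP_const4u */
  else
    fixed = 9;                  /* DW_OP_const8u */
  leb = 1 + uleb128_size (v);   /* DW_OP_constu */
  return leb < fixed ? leb : fixed;
}

/* Hypothetical helper: size of the best encoding when V may also be
   emitted as (V >> shift), shift, DW_OP_shl.  */
static unsigned
shifted_const_size (uint64_t v)
{
  unsigned best = plain_const_size (v);
  unsigned shift;

  for (shift = 1; shift < 64; shift++)
    {
      unsigned cost;

      /* Only a shift that drops nothing but zero bits preserves V.  */
      if ((v & ((UINT64_C (1) << shift) - 1)) != 0)
        break;
      cost = plain_const_size (v >> shift) + plain_const_size (shift) + 1;
      if (cost < best)
        best = cost;
    }

  /* Every encoding needs at least one opcode byte.  */
  assert (best >= 1);
  return best;
}
```

For example, 0x80000000 needs 5 bytes as DW_OP_const4u but only 3 bytes as DW_OP_lit1, DW_OP_lit31, DW_OP_shl.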


[Committed, Backport 4.6, AVR]: PR39633 (missing *cmpqi)

2011-07-11 Thread Georg-Johann Lay
Backported to 4.6:

http://gcc.gnu.org/viewcvs?view=revision&revision=176143


PING^2 Re: PATCH: fix collect2 handling of --demangle and --no-demangle

2011-07-11 Thread Sandra Loosemore

Ping?

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01368.html


We had a bug report from a customer that the linker was ignoring the
--demangle and --no-demangle options when generating map files.
The failure was host-dependent: on Windows hosts it always emitted
demangled names in the map file, while on Linux hosts it never did.
On Windows hosts it also ignored the setting of the
COLLECT_NO_DEMANGLE environment variable.

This turns out to be a problem in collect2, or actually, three
problems:

(1) By default, collect2 is configured to filter out --demangle and
--no-demangle from the linker options, and it tries to do demangling
on symbol names in stdout and stderr itself instead.  But, it is too
stupid to know about the map file.

(2) Collect2 is trying to set COLLECT_NO_DEMANGLE to disable
demangling in ld, but in a nonportable way that causes it to be
always unset instead on Windows.

(3) If you configure with --with-demangler-in-ld to try to disable
the collect2 demangling, there's another bug that causes it to ignore
any explicit --demangle or --no-demangle options and only pay
attention to whether or not COLLECT_NO_DEMANGLE is set.

The attached patch addresses all three problems:

(1) I've flipped the default to --with-demangler-in-ld=yes.  Note
that configure.ac already takes care not to let this percolate
through to collect2 without verifying that the linker is GNU ld and
that it is a version that supports --demangle.  Perhaps back in 2004
when this option was first added, the ld demangling support was
deemed too experimental to make it the default, but that's surely not
the case any more.  Also, since this has been broken since 2004, I'm
not sure there's much reason to be concerned with backwards
compatibility here.

(2) I fixed the COLLECT_NO_DEMANGLE environment variable setting
recipe.

(3) I simplified the argument processing for --demangle and
--no-demangle to pass them straight through to the linker when
HAVE_LD_DEMANGLE is defined.
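
The portability issue behind point (2) can be sketched as follows. This is an illustration, not the collect2.c change itself, and the function name is hypothetical. It assumes the consumer only tests whether COLLECT_NO_DEMANGLE is present in the environment. It relies on the documented behaviour of the Windows C runtime, where putenv with an empty value removes the variable, so the sketch sets a non-empty value instead.

```c
#define _XOPEN_SOURCE 700       /* for putenv on POSIX hosts */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: request that the linker not demangle by making
   COLLECT_NO_DEMANGLE present in the environment.  A non-empty value
   is used because "NAME=" removes the variable on Windows.
   Returns 0 on success and -1 on failure.  */
static int
disable_ld_demangling (void)
{
  /* putenv keeps the pointer, so the string must outlive the call.  */
  static char setting[] = "COLLECT_NO_DEMANGLE=1";

  if (putenv (setting) != 0)
    return -1;
  assert (getenv ("COLLECT_NO_DEMANGLE") != NULL);
  return 0;
}
```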

OK to commit?

-Sandra

2011-06-17  Sandra Loosemore  san...@codesourcery.com

gcc/
* configure.ac (demangler_in_ld): Default to yes.
* configure: Regenerated.
* collect2.c (main): When HAVE_LD_DEMANGLE is defined, don't
mess with COLLECT_NO_DEMANGLE, and just pass --demangle and
--no-demangle options straight through to ld.  When
HAVE_LD_DEMANGLE is not defined, set COLLECT_NO_DEMANGLE in a
way that has the intended effect on Windows.




Re: [build] Move crtfastmath to toplevel libgcc

2011-07-11 Thread Richard Henderson
On 07/11/2011 10:26 AM, Rainer Orth wrote:
 There's one other question here: alpha/t-crtfm uses
 -frandom-seed=gcc-crtfastmath with this comment:
 
 # FIXME drow/20061228 - I have preserved this -frandom-seed option
 # while migrating this rule from the GCC directory, but I do not
 # know why it is necessary if no other crt file uses it.
 
 Is there any particular reason to either keep this or not to use it in
 the generic file?  This way, only i386 needs to stay separate with its
 use of -msse -minline-all-stringops.

This random-seed thing is there for the mangled name we build
for the constructor on Tru64.

It's not needed for any target for which a .ctors section is
supported.  It also doesn't hurt, so you could move it to any
generic build rule.


r~


[PATCH, i386]: ix86_trampoline_init: use offset everywhere

2011-07-11 Thread Uros Bizjak
Hello!

A small cleanup, no functional change.  This allows us to assert that
generated code length is less than TRAMPOLINE_SIZE also for 32bit
targets.
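
As a rough illustration of why a running offset makes the size assertion possible on 32-bit targets too, here is a minimal sketch in C that writes the two 32-bit trampoline instructions into a byte buffer. It is not the i386.c code: the function names and SKETCH_TRAMPOLINE_SIZE are hypothetical, and the addresses are plain integers rather than RTL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TRAMPOLINE_SIZE 10       /* assumed size for this sketch */

/* Store the 32-bit value V at P in little-endian byte order.  */
static void
store_le32 (unsigned char *p, uint32_t v)
{
  p[0] = v & 0xff;
  p[1] = (v >> 8) & 0xff;
  p[2] = (v >> 16) & 0xff;
  p[3] = (v >> 24) & 0xff;
}

/* Hypothetical helper: write a 32-bit trampoline into BUF, which must
   hold SKETCH_TRAMPOLINE_SIZE bytes, and return the number of bytes
   written.  OPCODE is 0xb8 or 0xb9 (load a register with a constant)
   or 0x68 (push the constant); CHAIN_ON_STACK is nonzero for the
   push form.  */
static size_t
write_tramp32 (unsigned char *buf, unsigned char opcode,
               uint32_t chain_value, uint32_t fnaddr,
               uint32_t tramp_addr, int chain_on_stack)
{
  size_t offset = 0;
  uint32_t disp;

  /* Opcode byte followed by the 32-bit static chain value.  */
  buf[offset] = opcode;
  store_le32 (buf + offset + 1, chain_value);
  offset += 5;

  /* jmp rel32.  The displacement is relative to the end of the jmp.
     In the push form the target is one byte past the function entry,
     which is the same as subtracting one byte less.  */
  disp = fnaddr - (tramp_addr + (uint32_t) offset + (chain_on_stack ? 4 : 5));
  buf[offset] = 0xe9;
  store_le32 (buf + offset + 1, disp);
  offset += 5;

  assert (offset <= SKETCH_TRAMPOLINE_SIZE);
  return offset;
}
```

Because every store is expressed relative to offset, the final check covers whatever was emitted, in the same way the 64-bit arm already did.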

2011-07-11  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_trampoline_init): Switch arms of if expr.
Use offset everywhere.  Always assert that offset <= TRAMPOLINE_SIZE.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline.

Uros.
Index: i386.c
===
--- i386.c  (revision 176159)
+++ i386.c  (working copy)
@@ -22683,54 +22683,14 @@ static void
 ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 {
   rtx mem, fnaddr;
+  int opcode;
+  int offset = 0;
 
   fnaddr = XEXP (DECL_RTL (fndecl), 0);
 
-  if (!TARGET_64BIT)
-{
-  rtx disp, chain;
-  int opcode;
-
-  /* Depending on the static chain location, either load a register
-with a constant, or push the constant to the stack.  All of the
-instructions are the same size.  */
-  chain = ix86_static_chain (fndecl, true);
-  if (REG_P (chain))
-   {
- if (REGNO (chain) == CX_REG)
-   opcode = 0xb9;
- else if (REGNO (chain) == AX_REG)
-   opcode = 0xb8;
- else
-   gcc_unreachable ();
-   }
-  else
-   opcode = 0x68;
-
-  mem = adjust_address (m_tramp, QImode, 0);
-  emit_move_insn (mem, gen_int_mode (opcode, QImode));
-
-  mem = adjust_address (m_tramp, SImode, 1);
-  emit_move_insn (mem, chain_value);
-
-  /* Compute offset from the end of the jmp to the target function.
-In the case in which the trampoline stores the static chain on
-the stack, we need to skip the first insn which pushes the
-(call-saved) register static chain; this push is 1 byte.  */
-  disp = expand_binop (SImode, sub_optab, fnaddr,
-  plus_constant (XEXP (m_tramp, 0),
- MEM_P (chain) ? 9 : 10),
-  NULL_RTX, 1, OPTAB_DIRECT);
-
-  mem = adjust_address (m_tramp, QImode, 5);
-  emit_move_insn (mem, gen_int_mode (0xe9, QImode));
-
-  mem = adjust_address (m_tramp, SImode, 6);
-  emit_move_insn (mem, disp);
-}
-  else
+  if (TARGET_64BIT)
 {
-  int offset = 0, size;
+  int size;
 
   /* Load the function address to r11.  Try to load address using
 the shorter movl instead of movabs.  We may want to support
@@ -22757,20 +22717,22 @@ ix86_trampoline_init (rtx m_tramp, tree 
  offset += 10;
}
 
-  /* Load static chain using movabs to r10.  */
-  mem = adjust_address (m_tramp, HImode, offset);
-  /* Use the shorter movl instead of movabs for x32.  */
+  /* Load static chain using movabs to r10.  Use the
+shorter movl instead of movabs for x32.  */
   if (TARGET_X32)
{
+ opcode = 0xba41;
  size = 6;
- emit_move_insn (mem, gen_int_mode (0xba41, HImode));
}
   else
{
+ opcode = 0xba49;
  size = 10;
- emit_move_insn (mem, gen_int_mode (0xba49, HImode));
}
 
+  mem = adjust_address (m_tramp, HImode, offset);
+  emit_move_insn (mem, gen_int_mode (opcode, HImode));
+
   mem = adjust_address (m_tramp, ptr_mode, offset + 2);
   emit_move_insn (mem, chain_value);
   offset += size;
@@ -22780,10 +22742,56 @@ ix86_trampoline_init (rtx m_tramp, tree 
   mem = adjust_address (m_tramp, SImode, offset);
   emit_move_insn (mem, gen_int_mode (0x90e3ff49, SImode));
   offset += 4;
+}
+  else
+{
+  rtx disp, chain;
 
-  gcc_assert (offset <= TRAMPOLINE_SIZE);
+  /* Depending on the static chain location, either load a register
+with a constant, or push the constant to the stack.  All of the
+instructions are the same size.  */
+  chain = ix86_static_chain (fndecl, true);
+  if (REG_P (chain))
+   {
+ switch (REGNO (chain))
+   {
+   case AX_REG:
+ opcode = 0xb8; break;
+   case CX_REG:
+ opcode = 0xb9; break; 
+   default:
+ gcc_unreachable ();
+   }
+   }
+  else
+   opcode = 0x68;
+
+  mem = adjust_address (m_tramp, QImode, offset);
+  emit_move_insn (mem, gen_int_mode (opcode, QImode));
+
+  mem = adjust_address (m_tramp, SImode, offset + 1);
+  emit_move_insn (mem, chain_value);
+  offset += 5;
+
+  mem = adjust_address (m_tramp, QImode, offset);
+  emit_move_insn (mem, gen_int_mode (0xe9, QImode));
+
+  mem = adjust_address (m_tramp, SImode, offset + 1);
+
+  /* Compute offset from the end of the jmp to the target function.
+In the case in which the trampoline stores the static chain on
+the stack, we need to skip the first insn which pushes the
+(call-saved) register static chain; this push is 1 byte.  

[Patch, Fortran] Allocate + CAF library

2011-07-11 Thread Daniel Carrera

Hello,

This is my largest patch so far and the first that I'll commit myself. 
This patch improves support for the ALLOCATE statement when using the 
coarray library. Specifically, it adds support for the stat= and errmsg= 
attributes:


ALLOCATE( x(n)[*] , stat=i , errmsg=str )

These attributes are now written by the CAF library. This patch also 
involves a good amount of code cleanup.


ChangeLog is attached.

As soon as I get the go-ahead, I'll commit this patch.


Cheers,
Daniel.
--
I'm not overweight, I'm undertall.
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c	(revision 176148)
+++ gcc/fortran/trans-array.c	(working copy)
@@ -4366,7 +4366,8 @@ gfc_array_init_size (tree descriptor, in
 /*GCC ARRAYS*/
 
 bool
-gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree pstat)
+gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
+		tree errlen)
 {
   tree tmp;
   tree pointer;
@@ -4460,22 +4461,15 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   error = build_call_expr_loc (input_location,
   			   gfor_fndecl_runtime_error, 1, msg);
 
-  if (pstat != NULL_TREE && !integer_zerop (pstat))
+  if (status != NULL_TREE)
 {
-  /* Set the status variable if it's present.  */
+  tree status_type = TREE_TYPE (status);
   stmtblock_t set_status_block;
-  tree status_type = pstat ? TREE_TYPE (TREE_TYPE (pstat)) : NULL_TREE;
 
   gfc_start_block (&set_status_block);
-  gfc_add_modify (&set_status_block,
-  		  fold_build1_loc (input_location, INDIRECT_REF,
-     status_type, pstat),
-  			   build_int_cst (status_type, LIBERROR_ALLOCATION));
-
-  tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
-  			 pstat, build_int_cst (TREE_TYPE (pstat), 0));
-  error = fold_build3_loc (input_location, COND_EXPR, void_type_node, tmp,
-  			   error, gfc_finish_block (&set_status_block));
+  gfc_add_modify (&set_status_block, status,
+		  build_int_cst (status_type, LIBERROR_ALLOCATION));
+  error = gfc_finish_block (&set_status_block);
 }
 
   gfc_start_block (&elseblock);
@@ -4484,14 +4478,15 @@ gfc_array_allocate (gfc_se * se, gfc_exp
   pointer = gfc_conv_descriptor_data_get (se->expr);
   STRIP_NOPS (pointer);
 
-  /* The allocate_array variants take the old pointer as first argument.  */
+  /* The allocatable variant takes the old pointer as first argument.  */
   if (allocatable)
-tmp = gfc_allocate_allocatable_with_status (&elseblock,
-		pointer, size, pstat, expr);
+tmp = gfc_allocate_allocatable (&elseblock, pointer, size,
+status, errmsg, errlen, expr);
   else
-tmp = gfc_allocate_with_status (&elseblock, size, pstat, false);
-  tmp = fold_build2_loc (input_location, MODIFY_EXPR, void_type_node, pointer,
-			 tmp);
+tmp = gfc_allocate_using_malloc (&elseblock, size, status);
+
+  tmp = fold_build2_loc (input_location, MODIFY_EXPR, void_type_node,
+			 pointer, tmp);
 
   gfc_add_expr_to_block (&elseblock, tmp);
 
Index: gcc/fortran/trans-array.h
===
--- gcc/fortran/trans-array.h	(revision 176148)
+++ gcc/fortran/trans-array.h	(working copy)
@@ -24,7 +24,7 @@ tree gfc_array_deallocate (tree, tree, g
 
 /* Generate code to initialize an allocate an array.  Statements are added to
se, which should contain an expression for the array descriptor.  */
-bool gfc_array_allocate (gfc_se *, gfc_expr *, tree);
+bool gfc_array_allocate (gfc_se *, gfc_expr *, tree, tree, tree);
 
 /* Allow the bounds of a loop to be set from a callee's array spec.  */
 void gfc_set_loop_bounds_from_array_spec (gfc_interface_mapping *,
Index: gcc/fortran/trans-openmp.c
===
--- gcc/fortran/trans-openmp.c	(revision 176148)
+++ gcc/fortran/trans-openmp.c	(working copy)
@@ -188,9 +188,9 @@ gfc_omp_clause_default_ctor (tree clause
   size = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type,
 			  size, esize);
   size = gfc_evaluate_now (fold_convert (size_type_node, size), &cond_block);
-  ptr = gfc_allocate_allocatable_with_status (&cond_block,
-	  build_int_cst (pvoid_type_node, 0),
-	  size, NULL, NULL);
+  ptr = gfc_allocate_allocatable (&cond_block,
+			  build_int_cst (pvoid_type_node, 0),
+			  size, NULL_TREE, NULL_TREE, NULL_TREE, NULL);
   gfc_conv_descriptor_data_set (&cond_block, decl, ptr);
   then_b = gfc_finish_block (&cond_block);
 
@@ -241,9 +241,9 @@ gfc_omp_clause_copy_ctor (tree clause, t
   size = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type,
 			  size, esize);
   size = gfc_evaluate_now (fold_convert (size_type_node, size), &block);
-  ptr = gfc_allocate_allocatable_with_status (&block,
-	  build_int_cst (pvoid_type_node, 0),
-	  size, NULL, NULL);
+  ptr = gfc_allocate_allocatable (&block,
+			  build_int_cst (pvoid_type_node, 0),

[pph] Stream out chains backwards (issue4657092)

2011-07-11 Thread Gabriel Charette
**This patch goes on top of patches in issues 4672055 and 4675069 (which have 
yet to be committed)**

Some things are built as soon as a tree is streamed in, and since the chains 
are backwards, flipping them after streaming them in is not sufficient as some 
things (e.g. unique numbers given to functions) have already been allocated.

The solution is to stream out the chain backwards to begin with.
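
A minimal sketch of the idea in C, under simplifying assumptions: struct node is a hypothetical stand-in for a chained tree, and the "stream" is just an output array of ids. The chain is walked once to count it and once to fill the output from the back, so the last node in the chain is emitted first and the chain itself is left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chained tree node.  */
struct node
{
  int id;
  struct node *chain;
};

/* Copy the ids of the chain starting at FIRST into OUT in reverse
   chain order, without modifying the chain.  OUT must have room for
   CAP entries and the chain must not be longer than CAP.  Returns the
   number of entries written.  */
static size_t
emit_chain_reversed (const struct node *first, int *out, size_t cap)
{
  const struct node *t;
  size_t count = 0, i;

  for (t = first; t != NULL; t = t->chain)
    count++;
  assert (count <= cap);

  /* Fill OUT from the back so that the last node comes first.  */
  i = count;
  for (t = first; t != NULL; t = t->chain)
    out[--i] = t->id;
  return count;
}
```

With a chain built by prepending 1, 2, 3 (so it reads 3, 2, 1), the reader then sees the ids in creation order and allocates anything order-dependent in the same sequence as the non-pph compile.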

This fixes the assembly diffs in which the LFB# were different in the pph and 
non-pph assembly.

As noted by a FIXME comment, we probably want to do this for usings and 
using_directives as well, but I didn't for this patch as we don't handle those 
yet, and I'm not sure whether their chain is flipped or not.

Fixed tests x1functions and c1pr36533.
The c1pr44948-1a test changed from an ICE in lto_streamer_cache_get to an ICE 
in lto_get_pickled_tree, but lto_get_pickled_tree calls lto_streamer_cache_get 
and I'm pretty sure this is the same bug, not a new one introduced by this 
patch.

Tested with bootstrap build and pph regression testing.

2011-07-11  Gabriel Charette  gch...@google.com

* pph-streamer-in.c (pph_add_bindings_to_namespace): Don't reverse 
names and namespaces chains. Reverse names and namespaces
only for the binding levels of namespaces streamed in as is.
* pph-streamer-out.c (pph_out_chained_tree): New.
(pph_out_chain_filtered): Add REVERSE parameter.
(pph_out_binding_level): Use REVERSE parameter of
pph_out_chain_filtered.
* g++.dg/pph/c1pr36533.cc: Expect no asm difference.
* g++.dg/pph/c1pr44948-1a.cc: Adjust XFAIL pattern.
* g++.dg/pph/x1functions.cc: Expect no asm difference.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 55f7e12..fde1b93 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1146,11 +1146,6 @@ pph_add_bindings_to_namespace (struct cp_binding_level 
*bl, tree ns)
 {
   tree t, chain;
 
-  /* The chains are built backwards (ref: add_decl_to_level),
- reverse them before putting them back in.  */
-  bl->names = nreverse (bl->names);
-  bl->namespaces = nreverse (bl->namespaces);
-
   for (t = bl->names; t; t = chain)
 {
   /* Pushing a decl into a scope clobbers its DECL_CHAIN.
@@ -1164,11 +1159,26 @@ pph_add_bindings_to_namespace (struct cp_binding_level 
*bl, tree ns)
 
   for (t = bl->namespaces; t; t = chain)
 {
+  struct cp_binding_level* ns_lvl;
+
   /* Pushing a decl into a scope clobbers its DECL_CHAIN.
 Preserve it.  */
   chain = DECL_CHAIN (t);
   pushdecl_into_namespace (t, ns);
-  pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t);
+
+  /* FIXME pph: verify whether this namespace exists already,
+if it does we should merge it.  */
+  ns_lvl = NAMESPACE_LEVEL (t);
+  /* FIXME pph: the only benefit of making this call is the embedded call 
to
+varpool_finalize_decl for the names contained in this namespace and
+it's transitive closure of namespaces, the bindings themselves do NOT
+need to be added to this namespace as they are already part of it.  */
+  pph_add_bindings_to_namespace (ns_lvl, t);
+  /* Adding the bindings to another namespace automatically reverses them, 
but
+since these were already part of this namespace, they weren't: reverse
+them in place now.  */
+  ns_lvl->names = nreverse (ns_lvl->names);
+  ns_lvl->namespaces = nreverse (ns_lvl->namespaces);
 }
 }
 
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index d1e757f..445fca5 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -584,21 +584,44 @@ pph_out_label_binding (pph_stream *stream, 
cp_label_binding *lb, bool ref_p)
 }
 
 
+/* Outputs chained tree T by nulling out it's chain first and restoring it
+   after the streaming is done. STREAM and REF_P are as in
+   pph_out_chain_filtered.  */
+
+static inline void
+pph_out_chained_tree (pph_stream *stream, tree t, bool ref_p)
+{
+  tree saved_chain;
+
+  saved_chain = TREE_CHAIN (t);
+  TREE_CHAIN (t) = NULL_TREE;
+
+  pph_out_tree_or_ref_1 (stream, t, ref_p, 2);
+
+  TREE_CHAIN (t) = saved_chain;
+}
+
+
 /* Output a chain of nodes to STREAM starting with FIRST.  Skip any
nodes that do not match FILTER.  REF_P is true if nodes in the chain
-   should be emitted as references.  */
+   should be emitted as references.  Stream the chain in the reverse order
+   if REVERSE is true.*/
 
 static void
 pph_out_chain_filtered (pph_stream *stream, tree first, bool ref_p,
-  enum chain_filter filter)
+  enum chain_filter filter, bool reverse)
 {
   unsigned count;
+  int i;
   tree t;
+  tree *to_stream = NULL;
 
   /* Special case.  If the caller wants no filtering, it is much
  faster to just call pph_out_chain directly.  */
   if (filter == NONE)
 {
+  if (reverse)
+   nreverse (first);
   

[gomp-3_1-branch] Update openmp_version in gcc/fortran/intrinsic.texi

2011-07-11 Thread Tobias Burnus
The attached patch updates the version number in gfortran's intrinsic 
documentation. I don't know whether it makes sense to keep the number, 
but if one does, it should be up to date.


Jakub, is the attached patch OK for the branch?

Tobias
2011-07-11  Tobias Burnus  bur...@net-b.de

	* intrinsic.texi (OMP_LIB): Updated openmp_version's
	value to 201107.

Index: intrinsic.texi
===
--- intrinsic.texi	(revision 176173)
+++ intrinsic.texi	(working copy)
@@ -13100,7 +13100,7 @@
 @code{OMP_LIB} provides the scalar default-integer
 named constant @code{openmp_version} with a value of the form
 @var{yyyymm}, where @code{yyyy} is the year and @var{mm} the month
-of the OpenMP version; for OpenMP v3.0 the value is @code{200805}.
+of the OpenMP version; for OpenMP v3.1 the value is @code{201107}.
 
 And the following scalar integer named constants of the
 kind @code{omp_sched_kind}:


Re: [gomp-3_1-branch] Update openmp_version in gcc/fortran/intrinsic.texi

2011-07-11 Thread Jakub Jelinek
On Mon, Jul 11, 2011 at 08:26:22PM +0200, Tobias Burnus wrote:
 The attached patch updates the version number in gfortran's
 intrinsic documentation. I don't know whether it makes sense to keep
 the number, but if one does, it should be up to date.
 
 Jakub, is the attached patch OK for the branch?

Yeah, thanks.

 2011-07-11  Tobias Burnus  bur...@net-b.de
 
   * intrinsic.texi (OMP_LIB): Updated openmp_version's
   value to 201107.
 

Jakub


[Committed, Backport 4.6, AVR]: PR target/46779

2011-07-11 Thread Georg-Johann Lay
Backported fix for PR46779 to 4.6:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176055


[v3] Fix libstdc++/49559

2011-07-11 Thread Paolo Carlini

Hi,

for details, see the audit trail. Compared to the last draft, I also 
fixed __rotate_adaptive (issue noticed by artificially reducing the size 
of the buffer).


Tested x86_64-linux, committed.

Thanks,
Paolo.

2011-07-11  Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/49559
* include/bits/stl_algo.h (__move_merge_backward): Remove.
(__move_merge_adaptive, __move_merge_adaptive_backward): New.
(__merge_adaptive): Use the latter two.
(__rotate_adaptive): Avoid self move-assignment.
* include/bits/stl_algobase.h (move_backward): Fix comment.
* testsuite/25_algorithms/stable_sort/49559.cc: New.
* testsuite/25_algorithms/inplace_merge/49559.cc: Likewise.
* testsuite/25_algorithms/inplace_merge/moveable.cc: Extend.
* testsuite/25_algorithms/inplace_merge/moveable2.cc: Likewise.
* testsuite/util/testsuite_rvalref.h (rvalstruct::operator=
(rvalstruct&&)): Check for self move-assignment.
Index: include/bits/stl_algobase.h
===
--- include/bits/stl_algobase.h (revision 176144)
+++ include/bits/stl_algobase.h (working copy)
@@ -641,7 +641,7 @@
*  loop count will be known (and therefore a candidate for compiler
*  optimizations such as unrolling).
*
-   *  Result may not be in the range [first,last).  Use move instead.  Note
+   *  Result may not be in the range (first,last].  Use move instead.  Note
*  that the start of the output range may overlap [first,last).
   */
   template<typename _BI1, typename _BI2>
Index: include/bits/stl_algo.h
===
--- include/bits/stl_algo.h (revision 176144)
+++ include/bits/stl_algo.h (working copy)
@@ -2716,20 +2716,76 @@
 
   // merge
 
-  /// This is a helper function for the merge routines.
+  /// This is a helper function for the __merge_adaptive routines.
+  template<typename _InputIterator1, typename _InputIterator2,
+  typename _OutputIterator>
+void
+__move_merge_adaptive(_InputIterator1 __first1, _InputIterator1 __last1,
+ _InputIterator2 __first2, _InputIterator2 __last2,
+ _OutputIterator __result)
+{
+  while (__first1 != __last1 && __first2 != __last2)
+   {
+ if (*__first2 < *__first1)
+   {
+ *__result = _GLIBCXX_MOVE(*__first2);
+ ++__first2;
+   }
+ else
+   {
+ *__result = _GLIBCXX_MOVE(*__first1);
+ ++__first1;
+   }
+ ++__result;
+   }
+  if (__first1 != __last1)
+   _GLIBCXX_MOVE3(__first1, __last1, __result);
+}
+
+  /// This is a helper function for the __merge_adaptive routines.
+  template<typename _InputIterator1, typename _InputIterator2,
+  typename _OutputIterator, typename _Compare>
+void
+__move_merge_adaptive(_InputIterator1 __first1, _InputIterator1 __last1,
+ _InputIterator2 __first2, _InputIterator2 __last2,
+ _OutputIterator __result, _Compare __comp)
+{
+  while (__first1 != __last1 && __first2 != __last2)
+   {
+ if (__comp(*__first2, *__first1))
+   {
+ *__result = _GLIBCXX_MOVE(*__first2);
+ ++__first2;
+   }
+ else
+   {
+ *__result = _GLIBCXX_MOVE(*__first1);
+ ++__first1;
+   }
+ ++__result;
+   }
+  if (__first1 != __last1)
+   _GLIBCXX_MOVE3(__first1, __last1, __result);
+}
+
+  /// This is a helper function for the __merge_adaptive routines.
   template<typename _BidirectionalIterator1, typename _BidirectionalIterator2,
   typename _BidirectionalIterator3>
-_BidirectionalIterator3
-__move_merge_backward(_BidirectionalIterator1 __first1,
- _BidirectionalIterator1 __last1,
- _BidirectionalIterator2 __first2,
- _BidirectionalIterator2 __last2,
- _BidirectionalIterator3 __result)
+void
+__move_merge_adaptive_backward(_BidirectionalIterator1 __first1,
+  _BidirectionalIterator1 __last1,
+  _BidirectionalIterator2 __first2,
+  _BidirectionalIterator2 __last2,
+  _BidirectionalIterator3 __result)
 {
   if (__first1 == __last1)
-   return _GLIBCXX_MOVE_BACKWARD3(__first2, __last2, __result);
-  if (__first2 == __last2)
-   return _GLIBCXX_MOVE_BACKWARD3(__first1, __last1, __result);
+   {
+ _GLIBCXX_MOVE_BACKWARD3(__first2, __last2, __result);
+ return;
+   }
+  else if (__first2 == __last2)
+   return;
+
   --__last1;
   --__last2;
   while (true)
@@ -2738,34 +2794,41 @@
{

C++ PATCH for c++/44609 (printing an error for each step in infinite template recursion)

2011-07-11 Thread Jason Merrill
The PR complained about G++ getting into an infinite loop, but it isn't 
really infinite; the problem is that in the testcase a function template 
has an error and then depends on another instance of itself.


I've fixed this for many cases by refusing to instantiate a declaration 
if there have been errors since beginning to instantiate the nearest 
enclosing declaration.  This doesn't affect classes and constexpr 
variables/functions, because we can't just decide not to instantiate 
them without producing other errors.
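
The core of the check can be sketched in C as follows. This is a simplified illustration, not the pt.c code: the names are hypothetical, error_count stands in for errorcount + sorrycount, and the sketch ignores the exemption for classes and constexpr declarations, so it only compares against the immediately enclosing level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an instantiation level.  */
struct level
{
  int errors;                   /* error count when this level was pushed */
  struct level *next;           /* enclosing level */
};

static int error_count;         /* stand-in for errorcount + sorrycount */
static struct level *current_level;

/* Try to enter a new instantiation level.  Returns 0, pushing nothing,
   if errors were reported since the enclosing level was pushed;
   otherwise records the current error count and returns 1.  */
static int
push_level (struct level *lev)
{
  assert (lev != NULL);
  if (current_level != NULL && error_count > current_level->errors)
    return 0;
  lev->errors = error_count;
  lev->next = current_level;
  current_level = lev;
  return 1;
}
```

A template that has already produced an error can therefore no longer start further instantiations, which bounds the number of diagnostics printed for a recursive chain.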


Tested x86_64-pc-linux-gnu, applying to trunk.
commit af8514f2c47162f32f56d0ee3f18e9040d756b1f
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jul 11 09:38:11 2011 -0400

	PR c++/44609
	* cp-tree.h (struct tinst_level): Add errors field.
	* pt.c (neglectable_inst_p, limit_bad_template_recurson): New.
	(push_tinst_level): Don't start another decl in that case.
	(reopen_tinst_level): Adjust errors field.
	* decl2.c (cp_write_global_declarations): Don't complain about
	undefined inline if its template was defined.
	* mangle.c (mangle_decl_string): Handle failure from push_tinst_level.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 357295c..cc08640 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4679,6 +4679,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
   /* The location where the template is instantiated.  */
   location_t locus;
 
+  /* errorcount+sorrycount when we pushed this level.  */
+  int errors;
+
   /* True if the location is in a system header.  */
   bool in_system_header_p;
 };
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 8cd51c2..d90d4b5 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3950,10 +3950,10 @@ cp_write_global_declarations (void)
 	 #pragma interface, etc.) we decided not to emit the
 	 definition here.  */
 	  && !DECL_INITIAL (decl)
-	  /* An explicit instantiation can be used to specify
-	 that the body is in another unit. It will have
-	 already verified there was a definition.  */
-	  && !DECL_EXPLICIT_INSTANTIATION (decl))
+	  /* Don't complain if the template was defined.  */
+	  && !(DECL_TEMPLATE_INSTANTIATION (decl)
+	       && DECL_INITIAL (DECL_TEMPLATE_RESULT
+				(template_for_substitution (decl)))))
 	{
 	  warning (0, "inline function %q+D used but never defined", decl);
 	  /* Avoid a duplicate warning from check_global_declaration_1.  */
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 81b772f..4a83c9a 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3106,11 +3106,11 @@ mangle_decl_string (const tree decl)
   if (DECL_LANG_SPECIFIC (decl) && DECL_USE_TEMPLATE (decl))
 {
   struct tinst_level *tl = current_instantiation ();
-  if (!tl || tl->decl != decl)
+  if ((!tl || tl->decl != decl)
+	  && push_tinst_level (decl))
 	{
 	  template_p = true;
 	  saved_fn = current_function_decl;
-	  push_tinst_level (decl);
 	  current_function_decl = NULL_TREE;
 	}
 }
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2c64dd4..7c735ef 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7499,6 +7499,36 @@ uses_template_parms_level (tree t, int level)
  /*include_nondeduced_p=*/true);
 }
 
+/* Returns TRUE iff INST is an instantiation we don't need to do in an
+   ill-formed translation unit, i.e. a variable or function that isn't
+   usable in a constant expression.  */
+
+static inline bool
+neglectable_inst_p (tree d)
+{
+  return (DECL_P (d)
+	  && !(TREE_CODE (d) == FUNCTION_DECL ? DECL_DECLARED_CONSTEXPR_P (d)
+	       : decl_maybe_constant_var_p (d)));
+}
+
+/* Returns TRUE iff we should refuse to instantiate DECL because it's
+   neglectable and instantiated from within an erroneous instantiation.  */
+
+static bool
+limit_bad_template_recurson (tree decl)
+{
+  struct tinst_level *lev = current_tinst_level;
+  int errs = errorcount + sorrycount;
+  if (lev == NULL || errs == 0 || !neglectable_inst_p (decl))
+return false;
+
+  for (; lev; lev = lev->next)
+if (neglectable_inst_p (lev->decl))
+  break;
+
+  return (lev && errs > lev->errors);
+}
+
 static int tinst_depth;
 extern int max_tinst_depth;
 #ifdef GATHER_STATISTICS
@@ -7532,9 +7562,16 @@ push_tinst_level (tree d)
   return 0;
 }
 
+  /* If the current instantiation caused problems, don't let it instantiate
+ anything else.  Do allow deduction substitution and decls usable in
+ constant expressions.  */
+  if (limit_bad_template_recurson (d))
+return 0;
+
   new_level = ggc_alloc_tinst_level ();
   new_level->decl = d;
   new_level->locus = input_location;
+  new_level->errors = errorcount+sorrycount;
   new_level->in_system_header_p = in_system_header;
   new_level->next = current_tinst_level;
   current_tinst_level = new_level;
@@ -7578,6 +7615,8 @@ reopen_tinst_level (struct tinst_level *level)
 
   current_tinst_level = level;
   pop_tinst_level ();
+  if (current_tinst_level)
+current_tinst_level->errors = errorcount+sorrycount;
   return level->decl;
 }
 
diff --git 

[Ada] Fix --enable-build-with-cxx build

2011-07-11 Thread Eric Botcazou
This is an updated version of Laurent's patch originally here:
  http://gcc.gnu.org/ml/gcc/2009-06/msg00635.html

Bootstrapped/regtested on x86_64-suse-linux with --enable-build-with-cxx.

Arno, I think we should apply it.  This isn't very intrusive in the end and 
with it we can go full C++ instead of requiring 3 compilers to bootstrap.
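
For readers unfamiliar with the wrapping used throughout this patch, here is a minimal sketch in C. The function is hypothetical and not taken from the GNAT sources. The guard is a no-op for a C compiler; when the same file is built with a C++ compiler, the declaration inside the block gets C linkage, so the symbol keeps the unmangled name that the Ada side imports.

```c
#include <assert.h>
#include <stddef.h>

/* What a wrapped header would contain.  */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical prototype, not one from the GNAT sources.  */
extern size_t sketch_name_length (const char *name);

#ifdef __cplusplus
}
#endif

/* Definition, as it would appear in the matching .c file.  It picks
   up C linkage from the declaration above.  */
size_t
sketch_name_length (const char *name)
{
  size_t n = 0;

  assert (name != NULL);
  while (name[n] != '\0')
    n++;
  return n;
}
```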


2011-07-11  Laurent GUERBY laur...@guerby.net
Eric Botcazou  ebotca...@adacore.com

gnattools/
* Makefile.in (TOOLS_FLAGS_TO_PASS_1): Add LINKER.
(TOOLS_FLAGS_TO_PASS_1re): Likewise.
(TOOLS_FLAGS_TO_PASS_NATIVE): Likewise.
(TOOLS_FLAGS_TO_PASS_CROSS): Likewise.

gcc/
* prefix.h: Wrap up in extern "C" block.

ada/
* adadecode.c: Likewise.
* adadecode.h: Likewise.
* adaint.c: Likewise.
* adaint.h: Likewise.
* argv.c: Likewise.
* arit64.c: Likewise.
* atree.h: Likewise.
* aux-io.c: Likewise.
* cal.c: Likewise.
* cio.c: Likewise.
* cstreams.c: Likewise.
* ctrl_c.c: Likewise.
* env.c: Likewise.
* errno.c: Likewise.
* exit.c: Likewise.
* expect.c: Likewise.
* fe.h: Likewise.
* final.c: Likewise.
* init.c: Likewise.
* initialize.c: Likewise.
* link.c: Likewise.
* mkdir.c: Likewise.
* namet.h: Likewise.
* nlists.h: Likewise.
* raise-gcc.c: Likewise.
* raise.c: Likewise.
* raise.h: Likewise.
* repinfo.h: Likewise.
* s-oscons-tmplt.c: Likewise.
* seh_init.c: Likewise.
* socket.c: Likewise.
* sysdep.c: Likewise.
* targext.c: Likewise.
* tb-alvms.c: Likewise.
* tb-alvxw.c: Likewise.
* tb-gcc.c: Likewise.
* tb-ivms.c: Likewise.
* tracebak.c: Likewise.
* uintp.h: Likewise.
* urealp.h: Likewise.
* vx_stack_info.c: Likewise.
* xeinfo.adb: Wrap up generated C code in extern "C" block.
* xsinfo.adb: Wrap up generated C code in extern "C" block.
* xsnamest.adb: Wrap up generated C code in extern "C" block.
* gcc-interface/gadaint.h: Wrap up in extern "C" block.
* gcc-interface/gigi.h: Wrap up prototypes in extern "C" block.
* gcc-interface/misc.c: Wrap up prototypes in extern "C" block.
* gcc-interface/Make-lang.in (GCC_LINK): Use LINKER.
* gcc-interface/Makefile.in (GCC_LINK): Likewise.


-- 
Eric Botcazou
Index: gnattools/Makefile.in
===
--- gnattools/Makefile.in	(revision 176072)
+++ gnattools/Makefile.in	(working copy)
@@ -67,6 +67,7 @@ ADA_INCLUDES_FOR_SUBDIR = -I. -I$(fsrcdi
 # Variables for gnattools1, native
 TOOLS_FLAGS_TO_PASS_1= \
 	"CC=../../xgcc -B../../" \
+	"LINKER=$(CXX)" \
 	"CFLAGS=$(CFLAGS) $(WARN_CFLAGS)" \
 	"LDFLAGS=$(LDFLAGS)" \
 	"ADAFLAGS=$(ADAFLAGS)" \
@@ -82,6 +83,7 @@ TOOLS_FLAGS_TO_PASS_1= \
 # Variables for regnattools
 TOOLS_FLAGS_TO_PASS_1re= \
 	"CC=../../xgcc -B../../" \
+	"LINKER=$(CXX)" \
 	"CFLAGS=$(CFLAGS)" \
 	"ADAFLAGS=$(ADAFLAGS)" \
 	"ADA_CFLAGS=$(ADA_CFLAGS)" \
@@ -99,6 +101,7 @@ TOOLS_FLAGS_TO_PASS_1re= \
 # Variables for gnattools2, native
 TOOLS_FLAGS_TO_PASS_NATIVE= \
 	"CC=../../xgcc -B../../" \
+	"LINKER=$(CXX)" \
 	"CFLAGS=$(CFLAGS)" \
 	"ADAFLAGS=$(ADAFLAGS)" \
 	"ADA_CFLAGS=$(ADA_CFLAGS)" \
@@ -115,6 +118,7 @@ TOOLS_FLAGS_TO_PASS_NATIVE= \
 # Variables for gnattools, cross
 TOOLS_FLAGS_TO_PASS_CROSS= \
 	"CC=$(CC)" \
+	"LINKER=$(CXX)" \
 	"CFLAGS=$(CFLAGS) $(WARN_CFLAGS)" \
 	"LDFLAGS=$(LDFLAGS)" \
 	"ADAFLAGS=$(ADAFLAGS)"	\
Index: gcc/ada/adadecode.h
===
--- gcc/ada/adadecode.h	(revision 176072)
+++ gcc/ada/adadecode.h	(working copy)
@@ -29,6 +29,10 @@
  *  *
  /
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* This function will return the Ada name from the encoded form.
The Ada coding is done in exp_dbug.ads and this is the inverse function.
see exp_dbug.ads for full encoding rules, a short description is added
@@ -51,3 +55,7 @@ extern void get_encoding (const char *,
function used in the binutils and GDB. Always consider using __gnat_decode
instead of ada_demangle. Caller must free the pointer returned.  */
 extern char *ada_demangle (const char *);
+
+#ifdef __cplusplus
+}
+#endif
Index: gcc/ada/sysdep.c
===
--- gcc/ada/sysdep.c	(revision 176072)
+++ gcc/ada/sysdep.c	(working copy)
@@ -30,7 +30,11 @@
  /
 
 /* This file contains system dependent symbols that are referenced in the
-   GNAT Run Time Library */
+   GNAT Run Time Library.  */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
 
 #ifdef __vxworks
 #include "ioLib.h"
@@ -1012,3 

[dwarf2cfi] Cleanup interpretation of cfa.reg

2011-07-11 Thread Richard Henderson
Sometimes we compare cfa.reg with REGNO, and sometimes with
something that has been passed through DWARF_FRAME_REGNUM.
This leads to all sorts of confusion.

I think that ideally we'd leave dw_cfa_location.reg in the
GCC regno space, because that's convenient for the majority
of the code that interprets rtl and turns it into CFIs.

However, we have no inverse of DWARF_FRAME_REGNUM, which
means that lookup_cfa_1 cannot read CFI data in dwarf2
regno space and produce an output in GCC regno space.

Therefore, I've audited all uses of dw_cfa_location.reg
to ensure that all references are in dwarf2 regno space.

It would have been nice to be able to use C++ classes to
be able to do this checking in perpetuity, but the equivalent
struct wrapping in C would have made the source too ugly.
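
To illustrate the struct wrapping mentioned above (which the patch deliberately does not adopt), here is a rough sketch. The type names, the helper to_dwarf_regno and its toy mapping are all hypothetical; a real target would go through DWARF_FRAME_REGNUM.

```c
#include <assert.h>

/* Two distinct wrapper types, so that mixing the two register number
   spaces becomes a compile-time type error.  Hypothetical names.  */
typedef struct { unsigned int n; } gcc_regno;
typedef struct { unsigned int n; } dwarf_regno;

/* Toy stand-in for DWARF_FRAME_REGNUM: an arbitrary fixed offset.  */
static dwarf_regno
to_dwarf_regno (gcc_regno r)
{
  dwarf_regno d = { r.n + 16u };
  return d;
}

/* A simplified CFA location whose register is documented, and enforced
   by its type, to be in the dwarf2 register number space.  */
struct demo_cfa_loc
{
  dwarf_regno reg;
  long offset;
};

/* Compare the CFA register with R; both are known to be dwarf2 numbers.  */
static int
cfa_uses_reg (const struct demo_cfa_loc *cfa, dwarf_regno r)
{
  assert (cfa != 0);
  return cfa->reg.n == r.n;
}
```

Passing a gcc_regno to cfa_uses_reg is rejected by the compiler, which is the check the message wishes it could have; the cost is the .n noise at every use, which is presumably the ugliness being referred to.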

Tested on x86_64-linux.  Committed.


r~
* dwarf2cfi.c (DW_STACK_POINTER_REGNUM): New.
(DW_FRAME_POINTER_REGNUM): New.
(expand_builtin_init_dwarf_reg_sizes): Use unsigned for rnum.
(def_cfa_1): Do not convert reg to DWARF_FRAME_REGNUM here.
(dwf_regno): New.
(dwarf2out_flush_queued_reg_saves, dwarf2out_frame_debug_def_cfa,
dwarf2out_frame_debug_adjust_cfa, dwarf2out_frame_debug_cfa_register,
dwarf2out_frame_debug_cfa_expression, dwarf2out_frame_debug_expr):
Use it. 
* dwarf2out.c (based_loc_descr): Use dwarf_frame_regnum.
* dwarf2out.h (dwarf_frame_regnum): New.
(struct cfa_loc): Document the domain of the reg member.


diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 5b8420e..1c76b3f 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -57,6 +57,10 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Maximum size (in bytes) of an artificially generated label.  */
 #define MAX_ARTIFICIAL_LABEL_BYTES 30
+
+/* Short-hand for commonly used register numbers.  */
+#define DW_STACK_POINTER_REGNUM  dwarf_frame_regnum (STACK_POINTER_REGNUM)
+#define DW_FRAME_POINTER_REGNUM  dwarf_frame_regnum (HARD_FRAME_POINTER_REGNUM)
 
 /* A vector of call frame insns for the CIE.  */
 cfi_vec cie_cfi_vec;
@@ -85,7 +89,7 @@ static void dwarf2out_frame_debug_restore_state (void);
 rtx
 expand_builtin_dwarf_sp_column (void)
 {
-  unsigned int dwarf_regnum = DWARF_FRAME_REGNUM (STACK_POINTER_REGNUM);
+  unsigned int dwarf_regnum = DW_STACK_POINTER_REGNUM;
   return GEN_INT (DWARF2_FRAME_REG_OUT (dwarf_regnum, 1));
 }
 
@@ -113,7 +117,7 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 {
-  int rnum = DWARF2_FRAME_REG_OUT (DWARF_FRAME_REGNUM (i), 1);
+  unsigned int rnum = DWARF2_FRAME_REG_OUT (dwarf_frame_regnum (i), 1);
 
   if (rnum < DWARF_FRAME_REGISTERS)
{
@@ -123,7 +127,7 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
  if (HARD_REGNO_CALL_PART_CLOBBERED (i, save_mode))
save_mode = choose_hard_reg_mode (i, 1, true);
- if (DWARF_FRAME_REGNUM (i) == DWARF_FRAME_RETURN_COLUMN)
+ if (dwarf_frame_regnum (i) == DWARF_FRAME_RETURN_COLUMN)
{
  if (save_mode == VOIDmode)
continue;
@@ -415,8 +419,6 @@ def_cfa_1 (dw_cfa_location *loc_p)
   if (cfa_store.reg == loc.reg && loc.indirect == 0)
 cfa_store.offset = loc.offset;
 
-  loc.reg = DWARF_FRAME_REGNUM (loc.reg);
-
   /* If nothing changed, no need to issue any call frame instructions.  */
   if (cfa_equal_p (loc, old_cfa))
 return;
@@ -810,10 +812,10 @@ dwarf2out_args_size (HOST_WIDE_INT size)
 static void
 dwarf2out_stack_adjust (HOST_WIDE_INT offset)
 {
-  if (cfa.reg == STACK_POINTER_REGNUM)
+  if (cfa.reg == DW_STACK_POINTER_REGNUM)
 cfa.offset += offset;
 
-  if (cfa_store.reg == STACK_POINTER_REGNUM)
+  if (cfa_store.reg == DW_STACK_POINTER_REGNUM)
 cfa_store.offset += offset;
 
   if (ACCUMULATE_OUTGOING_ARGS)
@@ -859,7 +861,7 @@ dwarf2out_notice_stack_adjust (rtx insn, bool after_p)
 
   /* If only calls can throw, and we have a frame pointer,
  save up adjustments until we see the CALL_INSN.  */
-  if (!flag_asynchronous_unwind_tables && cfa.reg != STACK_POINTER_REGNUM)
+  if (!flag_asynchronous_unwind_tables && cfa.reg != DW_STACK_POINTER_REGNUM)
 {
   if (CALL_P (insn) && !after_p)
{
@@ -952,6 +954,16 @@ static GTY(()) VEC(reg_saved_in_data, gc) *regs_saved_in_regs;
 
 static GTY(()) reg_saved_in_data *cie_return_save;
 
+/* Short-hand inline for the very common D_F_R (REGNO (x)) operation.  */
+/* ??? This ought to go into dwarf2out.h alongside dwarf_frame_regnum,
+   except that dwarf2out.h is used in places where rtl is prohibited.  */
+
+static inline unsigned
+dwf_regno (const_rtx reg)
+{
+  return dwarf_frame_regnum (REGNO (reg));
+}
+
 /* Compare X and Y for equivalence.  The inputs may be REGs or PC_RTX.  */
 
 static bool
@@ -1031,9 +1043,9 @@ dwarf2out_flush_queued_reg_saves (void)
   if (q->reg == pc_rtx)
reg = DWARF_FRAME_RETURN_COLUMN;
   else
-reg = 

libgo patch committed: Use abort, not std::abort, in C code

2011-07-11 Thread Ian Lance Taylor
This patch changes std::abort() to abort() in C code.  I'm not sure how
this was working previously.  Bootstrapped on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 3291a9609c87 libgo/runtime/go-unwind.c
--- a/libgo/runtime/go-unwind.c	Thu Jul 07 09:48:11 2011 -0700
+++ b/libgo/runtime/go-unwind.c	Mon Jul 11 09:59:04 2011 -0700
@@ -293,7 +293,7 @@
   break;
 
 default:
-  std::abort();
+  abort();
 }
   actions |= state & _US_FORCE_UNWIND;
 


[gomp-3.1] const qualified vs. predetermination

2011-07-11 Thread Jakub Jelinek
Hi!

In the final standard, const-qualified variables without a mutable member
are back to being predetermined shared, but they may still be specified in
a firstprivate clause (so that valid OpenMP 3.0 programs using default(none)
aren't suddenly invalid).
The following patch implements that.  I'm not 100% sure about
const-qualified static data members or const-qualified threadprivate
variables; I have asked about them on the OpenMP forums.  For the time
being they aren't allowed in a firstprivate clause.
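
As an illustration of the kind of code this is meant to keep valid, here is a small sketch, assuming a compiler that implements the rule as described above. The function and variable names are invented for the example and are not part of the testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* Sum V[0..N-1], each element multiplied by a const-qualified local.
   Under default(none) every referenced variable needs an explicit data
   sharing attribute; SCALE is listed in firstprivate even though it is
   const qualified.  Without OpenMP enabled the pragma is ignored and the
   loop simply runs serially.  */
static long
sum_scaled (const int *v, int n)
{
  const int scale = 3;
  long total = 0;
  int i;

  assert (v != NULL);
#pragma omp parallel for default(none) firstprivate(scale) shared(v, n) reduction(+:total)
  for (i = 0; i < n; i++)
    total += (long) v[i] * scale;
  return total;
}
```

The loop variable i is predetermined private, so it does not need to appear in any clause.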

2011-07-11  Jakub Jelinek  ja...@redhat.com

gcc/
* c-typeck.c (c_finish_omp_clauses): Don't complain about
const qualified predetermined vars in firstprivate clause.

Revert
2011-03-10  Jakub Jelinek  ja...@redhat.com

* c-typeck.c (c_finish_omp_clauses): Complain about
TREE_READONLY decls in private, lastprivate and reduction
clauses.
gcc/c-family/
Revert
2011-03-10  Jakub Jelinek  ja...@redhat.com

* c-omp.c (c_omp_predetermined_sharing): Don't return
OMP_CLAUSE_DEFAULT_SHARED for TREE_READONLY decls.
gcc/cp/
* semantics.c (finish_omp_clauses): Don't complain about
const qualified predetermined vars in firstprivate clause,
unless it is a static data member.

Revert
2011-03-10  Jakub Jelinek  ja...@redhat.com

* cp-gimplify.c (cxx_omp_predetermined_sharing): Don't return
OMP_CLAUSE_DEFAULT_SHARED for decls with TYPE_READONLY
type having no mutable member.
* semantics.c (finish_omp_clauses): Complain about
TREE_READONLY decls with no mutable member in private,
lastprivate and reduction clauses.
gcc/testsuite/
* g++.dg/gomp/private-1.C: Adjust for expected wording of error
messages.

Revert
2011-03-10  Jakub Jelinek  ja...@redhat.com

* gcc.dg/gomp/appendix-a/a.24.1.c: Adjust for const-qualified
decls having no mutable members no longer being predetermined
shared.
* gcc.dg/gomp/sharing-1.c: Likewise.
* gcc.dg/gomp/clause-1.c: Likewise.
* g++.dg/gomp/sharing-1.C: Likewise.
* g++.dg/gomp/clause-3.C: Likewise.
* g++.dg/gomp/predetermined-1.C: Likewise.

--- gcc/c-family/c-omp.c(revision 176179)
+++ gcc/c-family/c-omp.c(working copy)
@@ -601,7 +601,12 @@ c_split_parallel_clauses (location_t loc
 /* True if OpenMP sharing attribute of DECL is predetermined.  */
 
 enum omp_clause_default_kind
-c_omp_predetermined_sharing (tree decl ATTRIBUTE_UNUSED)
+c_omp_predetermined_sharing (tree decl)
 {
+  /* Variables with const-qualified type having no mutable member
+ are predetermined shared.  */
+  if (TREE_READONLY (decl))
+return OMP_CLAUSE_DEFAULT_SHARED;
+
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
--- gcc/cp/cp-gimplify.c(revision 176179)
+++ gcc/cp/cp-gimplify.c(working copy)
@@ -1372,6 +1372,8 @@ cxx_omp_privatize_by_reference (const_tr
 enum omp_clause_default_kind
 cxx_omp_predetermined_sharing (tree decl)
 {
+  tree type;
+
   /* Static data members are predetermined as shared.  */
   if (TREE_STATIC (decl))
 {
@@ -1380,6 +1382,41 @@ cxx_omp_predetermined_sharing (tree decl
return OMP_CLAUSE_DEFAULT_SHARED;
 }
 
+  type = TREE_TYPE (decl);
+  if (TREE_CODE (type) == REFERENCE_TYPE)
+{
+  if (!is_invisiref_parm (decl))
+   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+  type = TREE_TYPE (type);
+
+  if (TREE_CODE (decl) == RESULT_DECL && DECL_NAME (decl))
+   {
+ /* NVR doesn't preserve const qualification of the
+variable's type.  */
+ tree outer = outer_curly_brace_block (current_function_decl);
+ tree var;
+
+ if (outer)
+   for (var = BLOCK_VARS (outer); var; var = DECL_CHAIN (var))
+ if (DECL_NAME (decl) == DECL_NAME (var)
+ && (TYPE_MAIN_VARIANT (type)
+ == TYPE_MAIN_VARIANT (TREE_TYPE (var))))
+   {
+ if (TYPE_READONLY (TREE_TYPE (var)))
+   type = TREE_TYPE (var);
+ break;
+   }
+   }
+}
+
+  if (type == error_mark_node)
+return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+
+  /* Variables with const-qualified type having no mutable member
+ are predetermined shared.  */
+  if (TYPE_READONLY (type) && !cp_has_mutable_p (type))
+return OMP_CLAUSE_DEFAULT_SHARED;
+
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
 
--- gcc/cp/semantics.c  (revision 176179)
+++ gcc/cp/semantics.c  (working copy)
@@ -3966,7 +3966,6 @@ finish_omp_clauses (tree clauses)
   bool need_copy_ctor = false;
   bool need_copy_assignment = false;
   bool need_implicitly_determined = false;
-  bool no_const = false;
   tree type, inner_type;
 
   switch (c_kind)
@@ -3980,7 +3979,6 @@ finish_omp_clauses (tree clauses)
  need_complete_non_reference = true;
  need_default_ctor = true;
  need_implicitly_determined = 

Re: [PLUGIN] c-family files installation

2011-07-11 Thread Matthias Klose
On 07/11/2011 05:18 PM, Romain Geissler wrote:
 This patch adds a new exception to the plugin header flattening strategy.
 c-family files can't be installed in the plugin include root directory as some
 other files like cp/cp-tree.h will look for them in the c-family directory.
 
 Furthermore, I had to correct an include in c-pretty-print.h so that it
 looks for c-common.h in the c-family directory. That way, headers will
 work out of the box when compiling a plugin, there is no need for
 additional include directory.
 
 Builds and installs fine
 
 OK for the trunk (I have no write access)?

Looks OK (but I cannot approve it).  Almost the same patch was submitted at
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01461.html, but this chunk
went unreviewed.

  Matthias


libgo patch committed: Define CC_FOR_BUILD in Makefile

2011-07-11 Thread Ian Lance Taylor
This patch to libgo defines CC_FOR_BUILD in Makefile, to make it more
likely to be able to build code in the libgo subdirectory.  Bootstrapped
on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 4732400182b5 libgo/configure.ac
--- a/libgo/configure.ac	Mon Jul 11 13:13:33 2011 -0700
+++ b/libgo/configure.ac	Mon Jul 11 13:25:16 2011 -0700
@@ -42,6 +42,9 @@
 AC_SUBST(enable_shared)
 AC_SUBST(enable_static)
 
+CC_FOR_BUILD=${CC_FOR_BUILD:-gcc}
+AC_SUBST(CC_FOR_BUILD)
+
 WARN_FLAGS='-Wall -Wextra -Wwrite-strings -Wcast-qual'
 AC_SUBST(WARN_FLAGS)
 


RFC: attribute to reverse bitfield allocations

2011-07-11 Thread DJ Delorie

Finally getting around to writing this one.  The idea is to have an
attribute which determines how bitfields are allocated within words
(lsb-first vs msb-first), assuming the programmer doesn't ask us to do
something impossible.  __attribute__((bitorder(FOO))) where FOO is:

  native (or omitted, or no attribute): no swapping
  lsb, msb: swap as needed to get the desired allocation order
  swapped: always swap

First pass.  Still missing: documentation, checks for overlapped
bitfields after swapping.

Is this approach acceptable?  Note: the qsort is because the output
function requires fields to be in bit-index order, but you can't sort
them earlier or the constructors wouldn't match the fields.
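
A standalone sketch of the arithmetic involved, for anyone who wants to experiment outside the compiler. It assumes every field lives in one storage unit of storage_bits bits, which is a simplification of what the patch does with TYPE_SIZE and DECL_SIZE; the struct and function names are made up.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified description of one bitfield.  */
struct demo_field
{
  const char *name;
  unsigned int bit_offset;  /* offset of the field within its storage unit */
  unsigned int bit_size;    /* width of the field in bits */
};

/* Mirror a field inside a storage unit of STORAGE_BITS bits:
   new offset = storage size - field size - old offset.  */
static unsigned int
reversed_offset (unsigned int storage_bits, const struct demo_field *f)
{
  assert (f->bit_offset + f->bit_size <= storage_bits);
  return storage_bits - f->bit_size - f->bit_offset;
}

/* qsort comparator: ascending bit offset.  */
static int
by_bit_offset (const void *a, const void *b)
{
  const struct demo_field *fa = (const struct demo_field *) a;
  const struct demo_field *fb = (const struct demo_field *) b;
  return (fa->bit_offset > fb->bit_offset) - (fa->bit_offset < fb->bit_offset);
}

/* Reverse every field, then sort into bit-index order, as an output
   routine that walks fields by increasing offset would need.  */
static void
reverse_and_sort (struct demo_field *fields, size_t n,
                  unsigned int storage_bits)
{
  size_t i;
  for (i = 0; i < n; i++)
    fields[i].bit_offset = reversed_offset (storage_bits, &fields[i]);
  qsort (fields, n, sizeof fields[0], by_bit_offset);
}
```

Sorting only at output time, on a copy if need be, leaves the declaration order alone, which is the point made above about constructors having to match the fields.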

Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 176083)
+++ c-family/c-common.c (working copy)
@@ -312,12 +312,13 @@ struct visibility_flags visibility_optio
 
 static tree c_fully_fold_internal (tree expr, bool, bool *, bool *);
 static tree check_case_value (tree);
 static bool check_case_bounds (tree, tree, tree *, tree *);
 
 static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bitorder_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *);
 static tree handle_common_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_hot_attribute (tree *, tree, tree, int, bool *);
 static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
@@ -589,12 +590,14 @@ const unsigned int num_c_common_reswords
 const struct attribute_spec c_common_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
affects_type_identity } */
   { "packed", 0, 0, false, false, false,
  handle_packed_attribute , false},
+  { "bitorder",   0, 1, false, true, false,
+ handle_bitorder_attribute , false},
   { "nocommon",   0, 0, true,  false, false,
  handle_nocommon_attribute, false},
   { "common", 0, 0, true,  false, false,
  handle_common_attribute, false },
   /* FIXME: logically, noreturn attributes should be listed as
  false, true, true and apply to function types.  But implementing this
@@ -5764,12 +5767,42 @@ handle_packed_attribute (tree *node, tre
   *no_add_attrs = true;
 }
 
   return NULL_TREE;
 }
 
+/* Handle a bitorder attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bitorder_attribute (tree *ARG_UNUSED (node), tree ARG_UNUSED (name),
+  tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  tree bmode;
+  const char *bname;
+
+  /* Allow no arguments to mean native.  */
+  if (args == NULL_TREE)
+return NULL_TREE;
+
+  bmode = TREE_VALUE (args);
+
+  bname = IDENTIFIER_POINTER (bmode);
+  if (strcmp (bname, "msb")
+      && strcmp (bname, "lsb")
+      && strcmp (bname, "swapped")
+      && strcmp (bname, "native"))
+{
+  error ("%qE is not a valid bitorder - use lsb, msb, native, or swapped", bmode);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Handle a nocommon attribute; arguments as in
struct attribute_spec.handler.  */
 
 static tree
 handle_nocommon_attribute (tree *node, tree name,
   tree ARG_UNUSED (args),
Index: stor-layout.c
===
--- stor-layout.c   (revision 176083)
+++ stor-layout.c   (working copy)
@@ -1716,24 +1716,82 @@ finalize_type_size (tree type)
  TYPE_ALIGN (variant) = align;
  TYPE_USER_ALIGN (variant) = user_align;
  SET_TYPE_MODE (variant, mode);
}
 }
 }
+  
+static void
+reverse_bitfield_layout (record_layout_info rli)
+{
+  tree field, oldtype;
+  for (field = TYPE_FIELDS (rli->t); field; field = TREE_CHAIN (field))
+{
+  tree type = TREE_TYPE (field);
+  if (TREE_CODE (field) != FIELD_DECL)
+   continue;
+  if (TREE_CODE (field) == ERROR_MARK || TREE_CODE (type) == ERROR_MARK)
+   return;
+  oldtype = TREE_TYPE (DECL_FIELD_BIT_OFFSET (field));
+  DECL_FIELD_BIT_OFFSET (field)
+   = size_binop (MINUS_EXPR,
+ size_binop (MINUS_EXPR, TYPE_SIZE (type),
+ DECL_SIZE (field)),
+ DECL_FIELD_BIT_OFFSET (field));
+  TREE_TYPE (DECL_FIELD_BIT_OFFSET (field)) = oldtype;
+}
+}
+
+static int
+reverse_bitfields_p (record_layout_info rli)
+{
+  tree st, arg;
+  const char *mode;
+
+  st = rli->t;
+
+  arg = lookup_attribute ("bitorder", TYPE_ATTRIBUTES (st));
+
+  if (!arg)
+return 

Re: Use of vector instructions in memmov/memset expanding

2011-07-11 Thread Michael Zolotukhin
Resending in plain text:

On 11 July 2011 23:50, Michael Zolotukhin
michael.v.zolotuk...@gmail.com wrote:

 The attached patch enables use of vector instructions in memmov/memset 
 expanding.

 New algorithm for move-mode selection is implemented for move_by_pieces, 
 store_by_pieces.
 x86-specific ix86_expand_movmem and ix86_expand_setmem are also changed in 
 similar way, x86 cost-models parameters are slightly changed to support this. 
 This implementation checks if array's alignment is known at compile time and 
 chooses expanding algorithm and move-mode according to it.
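
To make the alignment-driven choice of move mode described above concrete, here is a toy sketch. It is not the algorithm from the patch: count_pieces and its parameters are hypothetical, and it only models picking the widest power-of-two piece that the known alignment and the remaining length allow.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many moves are needed to copy N bytes when the start address
   is known to be ALIGN-byte aligned and the widest available move is
   MAX_PIECE bytes.  ALIGN and MAX_PIECE must be powers of two.  Pieces
   only ever shrink, so each piece stays naturally aligned.  */
static size_t
count_pieces (size_t n, size_t align, size_t max_piece)
{
  size_t piece = max_piece;
  size_t count = 0;

  assert (align > 0 && (align & (align - 1)) == 0);
  assert (max_piece > 0 && (max_piece & (max_piece - 1)) == 0);

  /* Never use a piece wider than the known alignment.  */
  while (piece > align)
    piece >>= 1;

  while (n > 0)
    {
      /* Shrink the piece until it fits in what is left.  */
      while (piece > n)
        piece >>= 1;
      n -= piece;
      count++;
    }
  return count;
}
```

For a 23-byte copy with 16-byte moves available, a known 16-byte alignment gives pieces of 16, 4, 2 and 1 bytes, while a known 8-byte alignment gives 8, 8, 4, 2 and 1; an unknown alignment (treated as 1) degenerates to byte moves, which is why knowing the alignment at compile time matters.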

 Bootstrapped, two new fails due to incorrect tests (see 
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49503). New implementation gives 
 quite big performance gain on memset/memcpy in some cases.

 A bunch of new tests are added to verify the implementation.

 Is it ok for trunk?

 Changelog:

 2011-07-11  Zolotukhin Michael  michael.v.zolotuk...@intel.com

     * config/i386/i386.h (processor_costs): Add second dimension to
     stringop_algs array.
     (clear_ratio): Tune value to improve performance.
     * config/i386/i386.c (cost models): Initialize second dimension of
     stringop_algs arrays.  Tune cost model in atom_cost, generic32_cost
     and generic64_cost.
     (ix86_expand_move): Add support for vector moves, that use half of
     vector register.
     (expand_set_or_movmem_via_loop_with_iter): New function.
     (expand_set_or_movmem_via_loop): Enable reuse of the same iters in
     different loops, produced by this function.
     (emit_strset): New function.
     (promote_duplicated_reg): Add support for vector modes, add
     declaration.
     (promote_duplicated_reg_to_size): Likewise.
     (expand_movmem_epilogue): Add epilogue generation for bigger sizes.
     (expand_setmem_epilogue): Likewise.
     (expand_movmem_prologue): Likewise for prologue.
     (expand_setmem_prologue): Likewise.
     (expand_constant_movmem_prologue): Likewise.
     (expand_constant_setmem_prologue): Likewise.
     (decide_alg): Add new argument align_unknown.  Fix algorithm of
     strategy selection if TARGET_INLINE_ALL_STRINGOPS is set.
     (decide_alignment): Update desired alignment according to chosen move
     mode.
     (ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves.
     (ix86_expand_setmem): Likewise.
     (ix86_slow_unaligned_access): Implementation of new hook
     slow_unaligned_access.
     (ix86_promote_rtx_for_memset): Implementation of new hook
     promote_rtx_for_memset.
     * config/i386/sse.md (sse2_loadq): Add expand for sse2_loadq.
     (vec_dupv4si): Add expand for vec_dupv4si.
     (vec_dupv2di): Add expand for vec_dupv2di.
     * emit-rtl.c (adjust_address_1): Improve algorithm for determining
     alignment of address+offset.
     (get_mem_align_offset): Add handling of MEM_REFs.
     * expr.c (compute_align_by_offset): New function.
     (move_by_pieces_insn): New function.
     (widest_mode_for_unaligned_mov): New function.
     (widest_mode_for_aligned_mov): New function.
     (widest_int_mode_for_size): Change type of size from int to
     HOST_WIDE_INT.
     (set_by_pieces_1): New function (new algorithm of memset expanding).
     (set_by_pieces_2): New function.
     (generate_move_with_mode): New function for set_by_pieces.
     (alignment_for_piecewise_move): Use hook slow_unaligned_access instead
     of macros SLOW_UNALIGNED_ACCESS.
     (emit_group_load_1): Likewise.
     (emit_group_store): Likewise.
     (emit_push_insn): Likewise.
     (store_field): Likewise.
     (expand_expr_real_1): Likewise.
     (compute_aligned_cost): New function.
     (compute_unaligned_cost): New function.
     (vector_mode_for_mode): New function.
     (vector_extensions_used_for_mode): New function.
     (move_by_pieces): New algorithm of memmove expanding.
     (move_by_pieces_ninsns): Update according to changes in
     move_by_pieces.
     (move_by_pieces_1): Remove as unused.
     (store_by_pieces): New algorithm for memset expanding.
     (clear_by_pieces): Likewise.
     (store_by_pieces_1): Remove incorrect parameters' attributes.
     * expr.h (compute_align_by_offset): Add declaration.
     * rtl.h (vector_extensions_used_for_mode): Add declaration.
     * builtins.c (expand_builtin_memset_args): Update according to changes
     in set_by_pieces.
     * target.def (DEFHOOK): Add hook slow_unaligned_access and
     promote_rtx_for_memset.
     * targhooks.c (default_slow_unaligned_access): Add default hook
     implementation.
     (default_promote_rtx_for_memset): Likewise.
     * targhooks.h (default_slow_unaligned_access): Add prototype.
     (default_promote_rtx_for_memset): Likewise.
     * cse.c (cse_insn): Stop forward propagation of vector constants.
     * fwprop.c (forward_propagate_and_simplify): Likewise.
     * doc/tm.texi (SLOW_UNALIGNED_ACCESS): Remove documentation for deleted
     macro SLOW_UNALIGNED_ACCESS.
     (TARGET_SLOW_UNALIGNED_ACCESS): Add documentation on 

C++ PATCH for c++/49672 (ICE with variadic parms to lambda)

2011-07-11 Thread Jason Merrill
Note that this doesn't allow capture of a pack expansion yet, just fixes 
a hole in the patch for c++/48424.  When instantiating a template 
function that has a non-pack parameter after a parameter pack, we were 
incorrectly treating it as part of the pack, leading to confusion.
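
The core of the fix is to consume only the parameters that really came from the pack and to hand back the position of the first one that did not. A plain C sketch of that pointer-to-pointer pattern, with a made-up list type standing in for the PARM_DECL chain and a flag standing in for the expanded-from-pack test:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of instantiated parameters.  */
struct demo_parm
{
  int from_pack;            /* nonzero if expanded from the parameter pack */
  struct demo_parm *next;
};

/* Count the leading parameters at *LIST_P that belong to the pack and
   advance *LIST_P to the first parameter after them (or NULL).  */
static size_t
take_pack_run (struct demo_parm **list_p)
{
  struct demo_parm *p;
  size_t len = 0;

  assert (list_p != NULL);
  for (p = *list_p; p != NULL && p->from_pack; p = p->next)
    len++;
  *list_p = p;
  return len;
}
```

The caller then carries on with whatever is left at *list_p, which is how a trailing non-pack parameter such as the int x in the new test gets registered on its own.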


Tested x86_64-pc-linux-gnu, applying to trunk.  I suppose I should also 
apply it to 4.6 since it has the earlier 48424 patch.
commit 5a1ca9b80d38e86fc997289e0eb90f3bbc98ad0d
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jul 11 16:25:25 2011 -0400

	PR c++/49672
	* pt.c (extract_fnparm_pack): Split out from...
	(make_fnparm_pack): ...here.
	(instantiate_decl): Handle non-pack parms after a pack.
	* semantics.c (maybe_add_lambda_conv_op): Don't in a template.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7c735ef..33b5b5f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8711,11 +8711,12 @@ tsubst_template_arg (tree t, tree args, tsubst_flags_t complain, tree in_decl)
   return r;
 }
 
-/* Give a chain SPEC_PARM of PARM_DECLs, pack them into a
-   NONTYPE_ARGUMENT_PACK.  */
+/* Given a function parameter pack TMPL_PARM and some function parameters
+   instantiated from it at *SPEC_P, return a NONTYPE_ARGUMENT_PACK of them
+   and set *SPEC_P to point at the next point in the list.  */
 
 static tree
-make_fnparm_pack (tree spec_parm)
+extract_fnparm_pack (tree tmpl_parm, tree *spec_p)
 {
   /* Collect all of the extra packed parameters into an
  argument pack.  */
@@ -8723,11 +8724,18 @@ make_fnparm_pack (tree spec_parm)
   tree parmtypevec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   tree argtypepack = cxx_make_type (TYPE_ARGUMENT_PACK);
-  int i, len = list_length (spec_parm);
+  tree spec_parm = *spec_p;
+  int i, len;
+
+  for (len = 0; spec_parm; ++len, spec_parm = TREE_CHAIN (spec_parm))
+if (tmpl_parm
+	&& !function_parameter_expanded_from_pack_p (spec_parm, tmpl_parm))
+  break;
 
   /* Fill in PARMVEC and PARMTYPEVEC with all of the parameters.  */
   parmvec = make_tree_vec (len);
   parmtypevec = make_tree_vec (len);
+  spec_parm = *spec_p;
   for (i = 0; i < len; i++, spec_parm = DECL_CHAIN (spec_parm))
 {
   TREE_VEC_ELT (parmvec, i) = spec_parm;
@@ -8738,9 +8746,19 @@ make_fnparm_pack (tree spec_parm)
   SET_ARGUMENT_PACK_ARGS (argpack, parmvec);
   SET_ARGUMENT_PACK_ARGS (argtypepack, parmtypevec);
   TREE_TYPE (argpack) = argtypepack;
+  *spec_p = spec_parm;
 
   return argpack;
-}
+}
+
+/* Give a chain SPEC_PARM of PARM_DECLs, pack them into a
+   NONTYPE_ARGUMENT_PACK.  */
+
+static tree
+make_fnparm_pack (tree spec_parm)
+{
+  return extract_fnparm_pack (NULL_TREE, &spec_parm);
+}
 
 /* Substitute ARGS into T, which is an pack expansion
(i.e. TYPE_PACK_EXPANSION or EXPR_PACK_EXPANSION). Returns a
@@ -17830,21 +17848,21 @@ instantiate_decl (tree d, int defer_ok,
 	  spec_parm = skip_artificial_parms_for (d, spec_parm);
 	  tmpl_parm = skip_artificial_parms_for (subst_decl, tmpl_parm);
 	}
-  while (tmpl_parm && !FUNCTION_PARAMETER_PACK_P (tmpl_parm))
+  for (; tmpl_parm; tmpl_parm = DECL_CHAIN (tmpl_parm))
 	{
-	  register_local_specialization (spec_parm, tmpl_parm);
-	  tmpl_parm = DECL_CHAIN (tmpl_parm);
-	  spec_parm = DECL_CHAIN (spec_parm);
+	  if (!FUNCTION_PARAMETER_PACK_P (tmpl_parm))
+	{
+	  register_local_specialization (spec_parm, tmpl_parm);
+	  spec_parm = DECL_CHAIN (spec_parm);
+	}
+	  else
+	{
+	  /* Register the (value) argument pack as a specialization of
+		 TMPL_PARM, then move on.  */
+	  tree argpack = extract_fnparm_pack (tmpl_parm, &spec_parm);
+	  register_local_specialization (argpack, tmpl_parm);
+	}
 	}
-  if (tmpl_parm && FUNCTION_PARAMETER_PACK_P (tmpl_parm))
-{
-  /* Register the (value) argument pack as a specialization of
- TMPL_PARM, then move on.  */
-	  tree argpack = make_fnparm_pack (spec_parm);
-  register_local_specialization (argpack, tmpl_parm);
-  tmpl_parm = DECL_CHAIN (tmpl_parm);
-	  spec_parm = NULL_TREE;
-}
   gcc_assert (!spec_parm);
 
   /* Substitute into the body of the function.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 84b0dd8..fd00e29 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -8808,6 +8808,9 @@ maybe_add_lambda_conv_op (tree type)
   if (LAMBDA_EXPR_CAPTURE_LIST (CLASSTYPE_LAMBDA_EXPR (type)) != NULL_TREE)
 return;
 
+  if (processing_template_decl)
+return;
+
   stattype = build_function_type (TREE_TYPE (TREE_TYPE (callop)),
   FUNCTION_ARG_CHAIN (callop));
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C
new file mode 100644
index 000..f17b336
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic1.C
@@ -0,0 +1,15 @@
+// PR c++/49672
+// { dg-options "-std=c++0x" }
+
+template<typename ... Args>
+static void foo()
+{
+  [](Args..., int x) {
+x;
+  };
+}

New automaton_option collapse-ndfa

2011-07-11 Thread Bernd Schmidt
On C6X, we want to use the ndfa option to give the scheduler maximum
freedom when assigning units to instructions. After scheduling is
complete, we process the insns again in c6x_reorg, looking at each cycle
and assigning a unit specifier to each instruction so that there is no
conflict within a cycle.

This works, except for a few reservations that span more than the first
cycle. To handle these properly, one possibility would be to consider
the entire scheduled block with a more complicated algorithm. While
feasible, I'd prefer not to go there for the moment.

I came up with the notion of adding a new transition to the NDFA, one
which collapses a nondeterministic state (which is composed of multiple
possible deterministic ones) to just one of its component states. This
can be done at the end of each cycle, and gives a state that can be
processed with cpu_unit_reservation_p to identify the units chosen by
the scheduler. The following patch implements this.
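
As a mental model only, and not how genautomata represents states, the collapse can be pictured with a nondeterministic state stored as a bitmask of the deterministic states it may be in. The names below are invented for the illustration, and picking the lowest set bit is an arbitrary choice.

```c
#include <assert.h>

/* Toy model: bit I set means "deterministic state I is still possible".  */
typedef unsigned int demo_ndfa_state;

/* True if exactly one component state remains.  */
static int
is_deterministic (demo_ndfa_state s)
{
  return s != 0 && (s & (s - 1u)) == 0;
}

/* Collapse a nondeterministic state to one of its components.  Here the
   lowest-numbered component is kept; any single component would do.  */
static demo_ndfa_state
collapse_state (demo_ndfa_state s)
{
  assert (s != 0);
  return s & (~s + 1u);
}
```

Once a state has been collapsed this way, a question such as "is unit X reserved in this state" has a single answer, which is what makes a query like cpu_unit_reservation_p meaningful at the end of a cycle.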

The new option also modifies the generation of advance-cycle transitions
so that they only exist in deterministic states. This matches the
expected use of the feature where we have a collapse-ndfa transition
before the end of each cycle (using the dfa_pre_cycle_insn hook).
state_transition now recognizes const0_rtx as the collapse-ndfa
transition (NULL_RTX was taken for advance-cycle).

Tested with 4.5 c6x-elf so far. I hope to commit the C6X port to
mainline soon and will retest the patch with that as well. IA64 is
another user of the ndfa option, but it failed to bootstrap a clean
tree when I tried it a few days ago, so I've only built a cross-cc1 and
examined the generated insn-automata.c before/after the patch. No
changes beyond slight expected reorganization in the code recognizing
NULL_RTX as advance-cycle, and no changes in the ia64.dfa file
generated with the v option.

Ok?


Bernd
* doc/md.texi (automata_option): Document collapse-ndfa.
* genautomata.c (COLLAPSE_OPTION): New macro.
(collapse_flag): New static variable.
(struct description): New member normal_decls_num.
(struct automaton): New members advance_ainsn and collapse_ainsn.
(gen_automata_option): Check for COLLAPSE_OPTION.
(collapse_ndfa_insn_decl): New static variable.
(add_collapse_ndfa_insn_decl, special_decl_p): New functions.
(find_arc): If insn is the collapse-ndfa insn, accept any arc we
find.
(transform_insn_regexps): Call add_collapse_ndfa_insn_decl if
necessary.  Use normal_decls_num rather than decls_num, remove
test for special decls.
(create_alt_states, form_ainsn_with_same_reservs): Use
special_decl_p.
(make_automaton): Likewise.  Use the new advance_cycle_insn member
of struct automaton.
(create_composed_state): Disallow advance-cycle arcs if collapse_flag
is set.
(NDFA_to_DFA): Don't create composed states for the collapse-ndfa
transition.  Create the necessary transitions for it.
(create_ainsns): Return void.  Take an automaton_t argument, and
update its ainsn_list, advance_ainsn and collapse_ainsn members.  All
callers changed.
(COLLAPSE_NDFA_VALUE_NAME): New macro.
(output_tables): Output code to define it.
(output_internal_insn_code_evaluation): Output code to accept
const0_rtx as collapse-ndfa transition.
(output_default_latencies, output_print_reservation_func,
output_print_description): Reorganize loops to use normal_decls_num
as loop bound; remove special case for advance_cycle_insn_decl.
(initiate_automaton_gen): Handle COLLAPSE_OPTION.
(check_automata_insn_issues): Check for collapse_ainsn.
(expand_automate): Allocate sufficient space.  Initialize
normal_decls_num.

Index: doc/md.texi
===
--- doc/md.texi (revision 176171)
+++ doc/md.texi (working copy)
@@ -7859,6 +7859,16 @@ nondeterministic treatment means trying
 may be rejected by reservations in the subsequent insns.
 
 @item
+@dfn{collapse-ndfa} modifies the behaviour of the generator when
+producing an automaton.  An additional state transition to collapse a
+nondeterministic @acronym{NDFA} state to a deterministic @acronym{DFA}
+state is generated.  It can be triggered by passing @code{const0_rtx} to
+state_transition.  In such an automaton, cycle advance transitions are
+available only for these collapsed states.  This option is useful for
+ports that want to use the @code{ndfa} option, but also want to use
+@code{define_query_cpu_unit} to assign units to insns issued in a cycle.
+
+@item
 @dfn{progress} means output of a progress bar showing how many states
 were generated so far for automaton being processed.  This is useful
 during debugging a @acronym{DFA} description.  If you see too many
Index: genautomata.c

Re: RFC: attribute to reverse bitfield allocations

2011-07-11 Thread Mike Stump
On Jul 11, 2011, at 1:52 PM, DJ Delorie wrote:
 Finally getting around to writing this one.  The idea is to have an
 attribute which determines how bitfields are allocated

:-)  Apple has one of these sorts of creatures.  You can see the code in the 
Apple tree, marked by APPLE LOCAL {begin ,end ,}bitfield reversal.  Your code 
looks much nicer than the Apple code; I hope it works as well.


[pph] Do not call pushdecl_into_namespace to re-register symbols (issue4685053)

2011-07-11 Thread Diego Novillo


This patch changes the way we re-register symbols as they are read
from the PPH image.  Instead of calling pushdecl...(), we merge the
global bindings from the PPH file (scope_chain->bindings) into the
global bindings of the current translation unit.

This fixes 3 name lookup failures in the testsuite: c2eabi1.cc,
c2meteor-contest.cc and x1namespace.cc.  It does produce a new failure
in x1tmplclass.cc, which is addressed in the 3rd patch in this series.

Tested on x86_64.  Applied to branch.


Diego.

* pph-streamer-in.c (pph_register_decls_in_symtab): Rename
from pph_add_bindings_to_namespace.
Do not call pushdecl_into_namespace for every symbol.  Just
reset the scope for its identifier's namespace binding.
(pph_in_scope_chain): Merge every field in struct
cp_binding_level from the scope_chain->bindings coming from
STREAM and the current scope_chain->bindings.

testsuite/ChangeLog.pph

* g++.dg/pph/c2eabi1.cc: Remove XFAIL markers.  Expect an
assembly difference.
* g++.dg/pph/c2meteor-contest.cc: Likewise.
* g++.dg/pph/x1namespace.cc: Mark fixed.

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 37b8464..1011902 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,13 @@
+2011-07-07   Diego Novillo  dnovi...@google.com
+
+   * pph-streamer-in.c (pph_register_decls_in_symtab): Rename
+   from pph_add_bindings_to_namespace.
+   Do not call pushdecl_into_namespace for every symbol.  Just
+   reset the scope for its identifier's namespace binding.
+   (pph_in_scope_chain): Merge every field in struct
+   cp_binding_level from the scope_chain->bindings coming from
+   STREAM and the current scope_chain->bindings.
+
 2011-07-06   Diego Novillo  dnovi...@google.com
 
* pph-streamer-out.c (pph_out_scope_chain): Fix formatting.
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 0bab93b..571ebf5 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1139,38 +1139,32 @@ pph_in_lang_type (pph_stream *stream)
 }
 
 
-/* Add all bindings declared in BL to NS.  */
+/* Register all the symbols in binding level BL in the callgraph symbol
+   table.  NS is the namespace where all the symbols in BL live.  */
 
 static void
-pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
+pph_register_decls_in_symtab (struct cp_binding_level *bl, tree ns)
 {
-  tree t, chain;
+  tree t;
 
   /* The chains are built backwards (ref: add_decl_to_level),
  reverse them before putting them back in.  */
   bl->names = nreverse (bl->names);
   bl->namespaces = nreverse (bl->namespaces);
 
-  for (t = bl->names; t; t = chain)
-{
-  /* Pushing a decl into a scope clobbers its DECL_CHAIN.
-Preserve it.  */
-  chain = DECL_CHAIN (t);
-  pushdecl_into_namespace (t, ns);
+  for (t = bl->names; t; t = DECL_CHAIN (t))
+if (DECL_NAME (t) && IDENTIFIER_NAMESPACE_BINDINGS (DECL_NAME (t)))
+  {
+   cxx_binding *b = IDENTIFIER_NAMESPACE_BINDINGS (DECL_NAME (t));
+   b->scope = NAMESPACE_LEVEL (ns);
 
-  if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
-   varpool_finalize_decl (t);
-}
+   if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
+ varpool_finalize_decl (t);
+  }
 
-  for (t = bl->namespaces; t; t = chain)
-{
-  /* Pushing a decl into a scope clobbers its DECL_CHAIN.
-Preserve it.  */
-  chain = DECL_CHAIN (t);
-  pushdecl_into_namespace (t, ns);
-  if (NAMESPACE_LEVEL (t))
-   pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t);
-}
+  for (t = bl->namespaces; t; t = DECL_CHAIN (t))
+if (NAMESPACE_LEVEL (t))
+  pph_register_decls_in_symtab (NAMESPACE_LEVEL (t), t);
 }
 
 
@@ -1179,12 +1173,56 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns)
 static void
 pph_in_scope_chain (pph_stream *stream)
 {
-  struct cp_binding_level *pph_bindings;
+  struct saved_scope *file_scope_chain;
+  unsigned i;
+  tree decl;
+  cp_class_binding *cb;
+  cp_label_binding *lb;
+  struct cp_binding_level *cur_bindings, *new_bindings;
+
+  file_scope_chain = ggc_alloc_cleared_saved_scope ();
+  file_scope_chain->bindings = new_bindings = pph_in_binding_level (stream);
+  cur_bindings = scope_chain->bindings;
+
+  pph_register_decls_in_symtab (new_bindings, global_namespace);
+
+  /* Merge the bindings from STREAM into saved_scope->bindings.  */
+  chainon (cur_bindings->names, new_bindings->names);
+  chainon (cur_bindings->namespaces, new_bindings->namespaces);
+
+  for (i = 0; VEC_iterate (tree, new_bindings->static_decls, i, decl); i++)
+VEC_safe_push (tree, gc, cur_bindings->static_decls, decl);
+
+  chainon (cur_bindings->usings, new_bindings->usings);
+  chainon (cur_bindings->using_directives, new_bindings->using_directives);
+
+  for (i = 0;
+   VEC_iterate (cp_class_binding, 

[pph] Add alternate addresses to register in the cache (issue4685054)

2011-07-11 Thread Diego Novillo

This patch adapts an idea from Gab that allows us to register alternate
addresses in the cache.  The problem here is making sure that symbols
read from a PPH file reference the right bindings.

If a symbol is in the global namespace when compiling a header file,
its bindings will point to NAMESPACE_LEVEL(global_namespace)->bindings,
but that global_namespace is the global_namespace instantiated for the
header file.  When reading that PPH image from a translation unit, we
need to refer to the bindings of the *current* global_namespace.

In general we solve this by inserting the pointer in the streamer
cache.  For instance, to avoid instantiating a second global_namespace
decl, the initialization code of both the writer and the reader store
global_namespace into the streaming cache.  This way, all the
references to global_namespace point to the current global_namespace
as known by the writer and the reader.

However, we cannot use the same trick on the bindings for
global_namespace.  If we simply inserted it into the cache then
writing out NAMESPACE_LEVEL(global_namespace)->bindings would simply
write a reference to the current one and on the reader side, it would
simply restore a pointer to the current translation unit's bindings,
without ever actually writing or reading anything (since it was
satisfied from the cache).

Therefore, we want a mechanism that allows the reader to: (a) read all
the symbols in the global bindings, and (b) redirect references to the
global binding made by the symbols so that they point to the global
bindings of the current translation unit (instead of the one in the
PPH image).

That's where ALLOC_AND_REGISTER_ALTERNATE comes in.  When called, it
allocates the data structure but registers another pointer in the
cache.  We use this trick when calling pph_in_binding_level from the
toplevel:

+  new_bindings = pph_in_binding_level (stream, scope_chain->bindings);

This way, when pph_in_binding_level tries to allocate the binding
structure read from STREAM, it registers scope_chain->bindings in the
cache.  As a result, references to the original file's global binding
are automatically redirected to the current translation unit's global
bindings.
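
As a rough illustration of the alternate-registration idea described
above, here is a minimal standalone sketch.  It is not the PPH streamer
code: stream_cache, binding_level, register_shared_data and
read_binding_level are hypothetical stand-ins invented for this example,
and the cache is just a fixed array of pointers.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the streamer types; not GCC's real API.  */
#define CACHE_SLOTS 16

struct binding_level { int id; };

struct stream_cache { void *slot[CACHE_SLOTS]; };

/* Record DATA in slot IX so later references to IX resolve to DATA.  */
static void
register_shared_data (struct stream_cache *cache, void *data, unsigned ix)
{
  cache->slot[ix] = data;
}

/* Allocate a fresh structure for the data read from the image, but
   register ALT in the cache when one is supplied.  References to slot
   IX then resolve to ALT instead of the freshly allocated copy.  */
static struct binding_level *
read_binding_level (struct stream_cache *cache, unsigned ix,
                    struct binding_level *alt)
{
  struct binding_level *bl = calloc (1, sizeof *bl);
  if (bl == NULL)
    abort ();
  register_shared_data (cache, alt ? (void *) alt : (void *) bl, ix);
  return bl;
}
```

The caller still gets the freshly read structure back, so its contents
can be merged into the current bindings, while anything that later
looks up the cache slot sees the alternate pointer.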

Gab, I modified your original implementation to move all the logic to
the place where we need to make this decision.  This way, it is easier
to tell which functions need this alternate registration, instead of
relying on some status flag squirreled away in the STREAM data
structure.


Tested on x86_64.  Applied to branch.

2011-07-11   Diego Novillo  dnovi...@google.com
 Gabriel Charette  gch...@google.com

* pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
(pph_in_binding_level): Add argument TO_REGISTER.  Call
ALLOC_AND_REGISTER_ALTERNATE if set.
Update all users.
(pph_register_decls_in_symtab): Call varpool_finalize_decl
on all file-local symbols.
(pph_in_scope_chain): Call pph_in_binding_level with
scope_chain->bindings as the alternate pointer to
register in the streaming cache.

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 1011902..f18c2f4 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,16 @@
+2011-07-11   Diego Novillo  dnovi...@google.com
+Gabriel Charette  gch...@google.com
+
+   * pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
+   (pph_in_binding_level): Add argument TO_REGISTER.  Call
+   ALLOC_AND_REGISTER_ALTERNATE if set.
+   Update all users.
+   (pph_register_decls_in_symtab): Call varpool_finalize_decl
+   on all file-local symbols.
+   (pph_in_scope_chain): Call pph_in_binding_level with
+   scope_chain->bindings as the alternate pointer to
+   register in the streaming cache.
+
 2011-07-07   Diego Novillo  dnovi...@google.com
 
* pph-streamer-in.c (pph_register_decls_in_symtab): Rename
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 571ebf5..903cd94 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -42,6 +42,18 @@ along with GCC; see the file COPYING3.  If not see
   pph_register_shared_data (STREAM, DATA, IX); \
 } while (0)
 
+/* Same as ALLOC_AND_REGISTER, but instead of registering DATA into the
+   cache at slot IX, it registers ALT_DATA.  Used to support mapping
+   pointers to global data in the original STREAM that need to point
+   to a different instance when aggregating individual PPH files into
+   the current translation unit (see pph_in_binding_level for an
+   example).  */
+#define ALLOC_AND_REGISTER_ALTERNATE(STREAM, IX, DATA, ALLOC_EXPR, ALT_DATA)\
+do {   \
+  (DATA) = (ALLOC_EXPR);   \
+  pph_register_shared_data (STREAM, ALT_DATA, IX); \
+} while (0)
+
 /* Callback for unpacking value fields in ASTs.  BP is the bitpack 
we are 

[pph] Use FOR_EACH_VEC_ELT consistently (issue4673057)

2011-07-11 Thread Diego Novillo
No functional changes.  Just tidying calls to VEC_iterate.

Tested on x86_64.  Committed to branch.

Diego.
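
For readers unfamiliar with the idiom, the kind of tidying involved can
be sketched with a toy vector.  This is only an illustration: int_vec,
int_vec_iterate and FOR_EACH_INT_ELT are hypothetical names made up for
this example and are not the definitions from GCC's vec.h.

```c
#include <stddef.h>

/* A toy read-only vector of ints (hypothetical, for illustration).  */
struct int_vec { size_t len; const int *elts; };

/* Store element IX of V in *ELT and return 1, or return 0 when IX is
   past the end of V.  */
static int
int_vec_iterate (const struct int_vec *v, size_t ix, int *elt)
{
  if (ix < v->len)
    {
      *elt = v->elts[ix];
      return 1;
    }
  return 0;
}

/* Wrap the iterate loop header in a single macro.  */
#define FOR_EACH_INT_ELT(V, I, E) \
  for ((I) = 0; int_vec_iterate ((V), (I), &(E)); ++(I))

/* The loop written out by hand.  */
static int
sum_explicit (const struct int_vec *v)
{
  size_t i;
  int e, sum = 0;
  for (i = 0; int_vec_iterate (v, i, &e); i++)
    sum += e;
  return sum;
}

/* The same loop using the macro; the behaviour is unchanged.  */
static int
sum_macro (const struct int_vec *v)
{
  size_t i;
  int e, sum = 0;
  FOR_EACH_INT_ELT (v, i, e)
    sum += e;
  return sum;
}
```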


* pph-streamer-in.c (pph_in_scope_chain): Replace VEC_iterate
loops with FOR_EACH_VEC_ELT.
(pph_read_file_contents): Likewise.
* pph-streamer-out.c (pph_out_tree_vec): Likewise.
(pph_out_qual_use_vec): Likewise.
(pph_out_binding_level): Likewise.
(pph_out_tree_pair_vec): Likewise.
* pph-streamer.h (pph_out_tree_VEC): Likewise.

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index f18c2f4..46794d0 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,4 +1,15 @@
 2011-07-11   Diego Novillo  dnovi...@google.com
+
+   * pph-streamer-in.c (pph_in_scope_chain): Replace VEC_iterate
+   loops with FOR_EACH_VEC_ELT.
+   (pph_read_file_contents): Likewise.
+   * pph-streamer-out.c (pph_out_tree_vec): Likewise.
+   (pph_out_qual_use_vec): Likewise.
+   (pph_out_binding_level): Likewise.
+   (pph_out_tree_pair_vec): Likewise.
+   * pph-streamer.h (pph_out_tree_VEC): Likewise.
+
+2011-07-11   Diego Novillo  dnovi...@google.com
 Gabriel Charette  gch...@google.com
 
* pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 903cd94..b40c384 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1227,22 +1227,18 @@ pph_in_scope_chain (pph_stream *stream)
   chainon (cur_bindings->names, new_bindings->names);
   chainon (cur_bindings->namespaces, new_bindings->namespaces);
 
-  for (i = 0; VEC_iterate (tree, new_bindings->static_decls, i, decl); i++)
+  FOR_EACH_VEC_ELT (tree, new_bindings->static_decls, i, decl)
 VEC_safe_push (tree, gc, cur_bindings->static_decls, decl);
 
   chainon (cur_bindings->usings, new_bindings->usings);
   chainon (cur_bindings->using_directives, new_bindings->using_directives);
 
-  for (i = 0;
-   VEC_iterate (cp_class_binding, new_bindings->class_shadowed, i, cb);
-   i++)
+  FOR_EACH_VEC_ELT (cp_class_binding, new_bindings->class_shadowed, i, cb)
 VEC_safe_push (cp_class_binding, gc, cur_bindings->class_shadowed, cb);
 
   chainon (cur_bindings->type_shadowed, new_bindings->type_shadowed);
 
-  for (i = 0;
-   VEC_iterate (cp_label_binding, new_bindings->shadowed_labels, i, lb);
-   i++)
+  FOR_EACH_VEC_ELT (cp_label_binding, new_bindings->shadowed_labels, i, lb)
 VEC_safe_push (cp_label_binding, gc, cur_bindings->shadowed_labels, lb);
 
   chainon (cur_bindings->blocks, new_bindings->blocks);
@@ -1412,14 +1408,14 @@ pph_read_file_contents (pph_stream *stream)
   keyed_classes = chainon (file_keyed_classes, keyed_classes);
 
   file_unemitted_tinfo_decls = pph_in_tree_vec (stream);
-  for (i = 0; VEC_iterate (tree, file_unemitted_tinfo_decls, i, t); i++)
+  FOR_EACH_VEC_ELT (tree, file_unemitted_tinfo_decls, i, t)
 VEC_safe_push (tree, gc, unemitted_tinfo_decls, t);
 
   file_static_aggregates = pph_in_tree (stream);
   static_aggregates = chainon (file_static_aggregates, static_aggregates);
 
   /* Expand all the functions with bodies that we read from STREAM.  */
-  for (i = 0; VEC_iterate (tree, stream->fns_to_expand, i, fndecl); i++)
+  FOR_EACH_VEC_ELT (tree, stream->fns_to_expand, i, fndecl)
 {
   /* FIXME pph - This is somewhat gross.  When we generated the
 PPH image, the parser called expand_or_defer_fn on FNDECL,
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index 089bb13..f7bf739 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -483,7 +483,7 @@ pph_out_tree_vec (pph_stream *stream, VEC(tree,gc) *v, bool ref_p)
   tree t;
 
   pph_out_uint (stream, VEC_length (tree, v));
-  for (i = 0; VEC_iterate (tree, v, i, t); i++)
+  FOR_EACH_VEC_ELT (tree, v, i, t)
 pph_out_tree_or_ref (stream, t, ref_p);
 }
 
@@ -499,7 +499,7 @@ pph_out_qual_use_vec (pph_stream *stream,
   qualified_typedef_usage_t *q;
 
   pph_out_uint (stream, VEC_length (qualified_typedef_usage_t, v));
-  for (i = 0; VEC_iterate (qualified_typedef_usage_t, v, i, q); i++)
+  FOR_EACH_VEC_ELT (qualified_typedef_usage_t, v, i, q)
 {
   pph_out_tree_or_ref (stream, q->typedef_decl, ref_p);
   pph_out_tree_or_ref (stream, q->context, ref_p);
@@ -657,13 +657,13 @@ pph_out_binding_level (pph_stream *stream, struct cp_binding_level *bl,
   pph_out_chain_filtered (stream, bl->using_directives, ref_p, NO_BUILTINS);
 
   pph_out_uint (stream, VEC_length (cp_class_binding, bl->class_shadowed));
-  for (i = 0; VEC_iterate (cp_class_binding, bl->class_shadowed, i, cs); i++)
+  FOR_EACH_VEC_ELT (cp_class_binding, bl->class_shadowed, i, cs)
 pph_out_class_binding (stream, cs, ref_p);
 
   pph_out_tree_or_ref (stream, bl->type_shadowed, ref_p);
 
   pph_out_uint (stream, VEC_length (cp_label_binding, bl->shadowed_labels));
-  for (i = 0; VEC_iterate (cp_label_binding, bl->shadowed_labels, i, sl); i++)
+  FOR_EACH_VEC_ELT 

New German PO file for 'gcc' (version 4.6.1)

2011-07-11 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

http://translationproject.org/latest/gcc/de.po

(This file, 'gcc-4.6.1.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.
coordina...@translationproject.org



Re: PING: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)

2011-07-11 Thread H.J. Lu
Ping.

On Wed, Jul 6, 2011 at 2:20 PM, H.J. Lu hjl.to...@gmail.com wrote:
 PING.

 On Thu, Jun 30, 2011 at 1:47 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jun 30, 2011 at 12:02 PM, Richard Henderson r...@redhat.com wrote:
 On 06/30/2011 11:23 AM, H.J. Lu wrote:
 +#ifdef REG_VALUE_IN_UNWIND_CONTEXT
 +typedef _Unwind_Word _Unwind_Context_Reg_Val;
 +/* Signal frame context.  */
 +#define SIGNAL_FRAME_BIT ((_Unwind_Word) 1  0)

 There's absolutely no reason to re-define this.
 So what if the value is most-significant-bit set?

 Nor do I see any reason not to continue setting E_C_B.

 Done.

 +#define _Unwind_IsExtendedContext(c) 1

 Why is this not still an inline function?

 It is defined before _Unwind_Context is declared.  I used
 macros so that there can be one less #ifdef.

 +
 +static inline _Unwind_Word
 +_Unwind_Get_Unwind_Word (_Unwind_Context_Reg_Val val)
 +{
 +  return val;
 +}
 +
 +static inline _Unwind_Context_Reg_Val
 +_Unwind_Get_Unwind_Context_Reg_Val (_Unwind_Word val)
 +{
 +  return val;
 +}

 I cannot believe this actually works.  I see nowhere that
 you copy the by-address slot out of the stack frame and
 place it into the by-value slot in the unwind context.

 I changed the implementation based on the feedback from
 Jason.  Now I use the same reg field for both value and
 address.

    /* This will segfault if the register hasn't been saved.  */
    if (size == sizeof(_Unwind_Ptr))
 -    return * (_Unwind_Ptr *) ptr;
 +    return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val;
    else
      {
        gcc_assert (size == sizeof(_Unwind_Word));
 -      return * (_Unwind_Word *) ptr;
 +      return * (_Unwind_Word *) (_Unwind_Internal_Ptr) val;
      }

 Indeed, this section is both wrong and belies the change
 you purport to make.

 You didn't even test this, did you?


 Here is the updated patch.  It works on simple tests.
 I am running full tests.  I kept config/i386/value-unwind.h
 since libgcc/md-unwind-support.h is included too late
 in unwind-dw2.c and I don't want to move it to be on
 the safe side.

 OK for trunk?

 Thanks.

 --
 H.J.
 ---
 gcc/

 2011-06-30  H.J. Lu  hongjiu...@intel.com

        * config.gcc (libgcc_tm_file): Add i386/value-unwind.h for
        Linux/x86.

        * system.h (REG_VALUE_IN_UNWIND_CONTEXT): Poisoned.

        * unwind-dw2.c (_Unwind_Context_Reg_Val): New.
        (_Unwind_Get_Unwind_Word): Likewise.
        (_Unwind_Get_Unwind_Context_Reg_Val): Likewise.
        (_Unwind_Context): Use _Unwind_Context_Reg_Val on the reg field.
        (_Unwind_IsExtendedContext): Defined as macro.
        (_Unwind_GetGR): Updated.
        (_Unwind_SetGR): Likewise.
        (_Unwind_GetGRPtr): Likewise.
        (_Unwind_SetGRPtr): Likewise.
        (_Unwind_SetGRValue): Likewise.
        (_Unwind_GRByValue): Likewise.
        (__frame_state_for): Likewise.
        (uw_install_context_1): Likewise.

        * doc/tm.texi.in: Document REG_VALUE_IN_UNWIND_CONTEXT.
        * doc/tm.texi: Regenerated.

 libgcc/

 2011-06-30  H.J. Lu  hongjiu...@intel.com

        * config/i386/value-unwind.h: New.




 --
 H.J.




-- 
H.J.


[PATCH] [Annotalysis] Fix to get_canonical_lock_expr

2011-07-11 Thread Delesley Hutchins
This patch fixes get_canonical_lock_expr so that it works on lock
expressions that involve a MEM_REF.  Gimple code can use either
MEM_REF or INDIRECT_REF in many expressions, and the choice of which
to use is somewhat arbitrary.  The canonical form of a lock expression
must rewrite all MEM_REFs to INDIRECT_REFs to accurately compare
expressions.  The surrounding if block prevented this rewrite from
happening in certain cases.
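
To make the intent concrete, here is a small standalone sketch of this
kind of canonicalization on a toy expression tree.  It is not the code
in tree-threadsafe-analyze.c: the expr type and the VAR, ADDR, INDIRECT
and MEMREF codes are hypothetical stand-ins for GCC trees, and nodes are
never freed to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

/* Toy expression codes standing in for GCC tree codes (hypothetical).  */
enum code { VAR, ADDR, INDIRECT, MEMREF };

struct expr { enum code code; struct expr *op; const char *name; };

static struct expr *
make_expr (enum code code, struct expr *op, const char *name)
{
  struct expr *e = malloc (sizeof *e);
  if (e == NULL)
    abort ();
  e->code = code;
  e->op = op;
  e->name = name;
  return e;
}

/* Return a canonical form of E.  Every MEMREF becomes an INDIRECT and
   *(&x) collapses to x.  The dereference is rebuilt unconditionally,
   even when canonicalizing the operand did not change it.  */
static struct expr *
canonicalize (struct expr *e)
{
  struct expr *base;

  switch (e->code)
    {
    case MEMREF:
    case INDIRECT:
      base = canonicalize (e->op);
      if (base->code == ADDR)
        return base->op;
      return make_expr (INDIRECT, base, NULL);
    case ADDR:
      return make_expr (ADDR, canonicalize (e->op), NULL);
    default:
      return e;
    }
}

/* Structural comparison of two expressions.  */
static int
same_expr (const struct expr *a, const struct expr *b)
{
  if (a->code != b->code)
    return 0;
  if (a->code == VAR)
    return strcmp (a->name, b->name) == 0;
  return same_expr (a->op, b->op);
}
```

With a guard like "only rewrite if the operand changed", a MEMREF over a
plain variable would be left alone and would not compare equal to the
matching INDIRECT, which is the kind of mismatch described above.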

Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu.

Okay for branches/annotalysis and google/main?

 -DeLesley


2011-07-06   DeLesley Hutchins  deles...@google.com
  * cp_get_virtual_function_decl.c (handle_call_gs): Changes
  function to return null if the method cannot be found.
  * thread_annot_lock-79.C: Additional annotalysis test cases


Index: gcc/tree-threadsafe-analyze.c
===
--- gcc/tree-threadsafe-analyze.c   (revision 176188)
+++ gcc/tree-threadsafe-analyze.c   (working copy)
@@ -959,19 +959,17 @@ get_canonical_lock_expr (tree lock, tree base_obj,
   tree canon_base = get_canonical_lock_expr (base, base_obj,
  true /* is_temp_expr */,
  new_leftmost_base_var);
-  if (base != canon_base)
-{
-  /* If CANON_BASE is an ADDR_EXPR (e.g. &a), doing an indirect or
- memory reference on top of it is equivalent to accessing the
- variable itself. That is, *(&a) == a. So if that's the case,
- simply return the variable. Otherwise, build an indirect ref
- expression.  */
-  if (TREE_CODE (canon_base) == ADDR_EXPR)
-lock = TREE_OPERAND (canon_base, 0);
-  else
-lock = build1 (INDIRECT_REF,
-   TREE_TYPE (TREE_TYPE (canon_base)), canon_base);
-}
+
+  /* If CANON_BASE is an ADDR_EXPR (e.g. &a), doing an indirect or
+ memory reference on top of it is equivalent to accessing the
+ variable itself. That is, *(&a) == a. So if that's the case,
+ simply return the variable. Otherwise, build an indirect ref
+ expression.  */
+  if (TREE_CODE (canon_base) == ADDR_EXPR)
+lock = TREE_OPERAND (canon_base, 0);
+  else
+lock = build1 (INDIRECT_REF,
+   TREE_TYPE (TREE_TYPE (canon_base)), canon_base);
   break;
 }
   default:


-- 
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315


Re: Add __builtin_clrsb, similar to clz/ctz

2011-07-11 Thread Hans-Peter Nilsson
On Mon, 20 Jun 2011, Bernd Schmidt wrote:
 New patch below. Retested on i686 and bfin.

Yay, bikeshedding opportunity! :P
Can we call them leading *repeated* sign bits? (in docs and
comments)  Calling them redundant makes you think the
representation is not two's complement but new and improved...
like (bitwise) (1<<31) == -1 or something.

brgds, H-P
PS. just a minor change, I can do the legwork.
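
For reference, the quantity being named can be sketched in portable C:
the number of bits directly below the most significant bit that are
identical to it.  This is only an illustration of the definition, not
the implementation in the patch; the function name is made up for this
example, and it assumes int has no padding bits.

```c
#include <limits.h>

/* Count the leading bits of X that repeat the sign bit, not counting
   the sign bit itself.  Hypothetical helper, for illustration only.  */
static int
count_leading_repeated_sign_bits (int x)
{
  const int width = (int) (sizeof (int) * CHAR_BIT);
  unsigned int u = (unsigned int) x;
  unsigned int sign = u >> (width - 1);
  int count = 0;
  int bit;

  /* Walk down from the bit just below the sign bit and stop at the
     first bit that differs from it.  */
  for (bit = width - 2; bit >= 0; bit--)
    {
      if (((u >> bit) & 1u) != sign)
        break;
      count++;
    }
  return count;
}
```

Both 0 and -1 give the maximum result (width minus one), while INT_MIN
and INT_MAX give 0.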


Re: Use of vector instructions in memmov/memset expanding

2011-07-11 Thread H.J. Lu
On Mon, Jul 11, 2011 at 1:57 PM, Michael Zolotukhin
michael.v.zolotuk...@gmail.com wrote:
 Sorry, for sending once again - forgot to attach the patch.

 On 11 July 2011 23:50, Michael Zolotukhin
 michael.v.zolotuk...@gmail.com wrote:
 The attached patch enables use of vector instructions in memmov/memset
 expanding.

 New algorithm for move-mode selection is implemented for move_by_pieces,
 store_by_pieces.
 x86-specific ix86_expand_movmem and ix86_expand_setmem are also changed in
 similar way, x86 cost-models parameters are slightly changed to support
 this. This implementation checks if array's alignment is known at compile
 time and chooses expanding algorithm and move-mode according to it.

 Bootstrapped, two new fails due to incorrect tests (see
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49503). New implementation gives
 quite big performance gain on memset/memcpy in some cases.

 A bunch of new tests are added to verify the implementation.

 Is it ok for trunk?

 Changelog:

 2011-07-11  Zolotukhin Michael  michael.v.zolotuk...@intel.com

     * config/i386/i386.h (processor_costs): Add second dimension to
     stringop_algs array.
     (clear_ratio): Tune value to improve performance.
     * config/i386/i386.c (cost models): Initialize second dimension of
     stringop_algs arrays.  Tune cost model in atom_cost, generic32_cost
     and generic64_cost.
     (ix86_expand_move): Add support for vector moves, that use half of
     vector register.
     (expand_set_or_movmem_via_loop_with_iter): New function.
     (expand_set_or_movmem_via_loop): Enable reuse of the same iters in
     different loops, produced by this function.
     (emit_strset): New function.
     (promote_duplicated_reg): Add support for vector modes, add
     declaration.
     (promote_duplicated_reg_to_size): Likewise.
     (expand_movmem_epilogue): Add epilogue generation for bigger sizes.
     (expand_setmem_epilogue): Likewise.
     (expand_movmem_prologue): Likewise for prologue.
     (expand_setmem_prologue): Likewise.
     (expand_constant_movmem_prologue): Likewise.
     (expand_constant_setmem_prologue): Likewise.
     (decide_alg): Add new argument align_unknown.  Fix algorithm of
     strategy selection if TARGET_INLINE_ALL_STRINGOPS is set.
     (decide_alignment): Update desired alignment according to chosen move
     mode.
     (ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves.
     (ix86_expand_setmem): Likewise.
     (ix86_slow_unaligned_access): Implementation of new hook
     slow_unaligned_access.
     (ix86_promote_rtx_for_memset): Implementation of new hook
     promote_rtx_for_memset.
     * config/i386/sse.md (sse2_loadq): Add expand for sse2_loadq.
     (vec_dupv4si): Add expand for vec_dupv4si.
     (vec_dupv2di): Add expand for vec_dupv2di.
     * emit-rtl.c (adjust_address_1): Improve algorithm for determining
     alignment of address+offset.
     (get_mem_align_offset): Add handling of MEM_REFs.
     * expr.c (compute_align_by_offset): New function.
     (move_by_pieces_insn): New function.
     (widest_mode_for_unaligned_mov): New function.
     (widest_mode_for_aligned_mov): New function.
     (widest_int_mode_for_size): Change type of size from int to
     HOST_WIDE_INT.
     (set_by_pieces_1): New function (new algorithm of memset expanding).
     (set_by_pieces_2): New function.
     (generate_move_with_mode): New function for set_by_pieces.
     (alignment_for_piecewise_move): Use hook slow_unaligned_access instead
     of macros SLOW_UNALIGNED_ACCESS.
     (emit_group_load_1): Likewise.
     (emit_group_store): Likewise.
     (emit_push_insn): Likewise.
     (store_field): Likewise.
     (expand_expr_real_1): Likewise.
     (compute_aligned_cost): New function.
     (compute_unaligned_cost): New function.
     (vector_mode_for_mode): New function.
     (vector_extensions_used_for_mode): New function.
     (move_by_pieces): New algorithm of memmove expanding.
     (move_by_pieces_ninsns): Update according to changes in
     move_by_pieces.
     (move_by_pieces_1): Remove as unused.
     (store_by_pieces): New algorithm for memset expanding.
     (clear_by_pieces): Likewise.
     (store_by_pieces_1): Remove incorrect parameters' attributes.
     * expr.h (compute_align_by_offset): Add declaration.
     * rtl.h (vector_extensions_used_for_mode): Add declaration.
     * builtins.c (expand_builtin_memset_args): Update according to changes
     in set_by_pieces.
     * target.def (DEFHOOK): Add hook slow_unaligned_access and
     promote_rtx_for_memset.
     * targhooks.c (default_slow_unaligned_access): Add default hook
     implementation.
     (default_promote_rtx_for_memset): Likewise.
     * targhooks.h (default_slow_unaligned_access): Add prototype.
     (default_promote_rtx_for_memset): Likewise.
     * cse.c (cse_insn): Stop forward propagation of vector constants.
     * fwprop.c (forward_propagate_and_simplify): Likewise.
     * doc/tm.texi (SLOW_UNALIGNED_ACCESS): Remove 

re: Fix argument pushes to unaligned stack slots

2011-07-11 Thread matthew green

hi folks.

i'm having a problem with GCC 4.5.3 on netbsd-m68k target.  i've tracked
it down to this change from several years ago:

 2007-02-06  Joseph Myers  jos...@codesourcery.com
 
   * expr.c (emit_push_insn): If STRICT_ALIGNMENT, copy to an
   unaligned stack slot via a suitably aligned slot.

the problem is that emit_library_call_value_1() calls emit_push_insn()
with TYPE_NULL which ends up triggering a NULL deref when emit_push_insn()
calls assign_temp() with type = TYPE_NULL, and assign_temp() crashes.

this simple change seems to be sufficient to avoid the crash and the
generated code appears to run OK.  if it is OK, could someone please
commit it?  thanks.  (feel free to update my log message if it could
be clearer or more correct.)


.mrg.


2011-07-10  matthew green  m...@eterna.com.au

* expr.c (emit_push_insn): Don't copy a TYPE_NULL expression
to the stack for correct alignment.

Index: external/gpl3/gcc/dist/gcc/expr.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/expr.c,v
retrieving revision 1.1.1.1
diff -p -u -r1.1.1.1 expr.c
--- external/gpl3/gcc/dist/gcc/expr.c   21 Jun 2011 01:20:17 -  1.1.1.1
+++ external/gpl3/gcc/dist/gcc/expr.c   12 Jul 2011 04:17:00 -
@@ -3764,7 +3764,8 @@ emit_push_insn (rtx x, enum machine_mode
   xinner = x;
 
   if (mode == BLKmode
-  || (STRICT_ALIGNMENT && align < GET_MODE_ALIGNMENT (mode)))
+  || (STRICT_ALIGNMENT && align < GET_MODE_ALIGNMENT (mode)
+   && type != NULL_TREE))
 {
   /* Copy a block into the stack, entirely or partially.  */