Re: Missing guard in ira-color.c ?

2012-05-11 Thread Tristan Gingold

On May 10, 2012, at 10:41 PM, Vladimir Makarov wrote:

 On 05/10/2012 09:10 AM, Tristan Gingold wrote:
 Hi,
 
 I am getting a segfault in ira-color.c:2945 on the trunk:
 
 Program received signal SIGSEGV, Segmentation fault.
 0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945
 2945   || ira_reg_equiv_const[regno] != NULL_RTX
 (gdb) l
 2940   /* don't do the optimization because it can create
 2941  copies and the reload pass can spill the allocno set
 2942  by copy although the allocno will not get memory
 2943  slot.  */
 2944   || ira_reg_equiv_invariant_p[regno]
 2945   || ira_reg_equiv_const[regno] != NULL_RTX
 2946   || !bitmap_bit_p (loop_node-border_allocnos, ALLOCNO_NUM 
 (a)))
 2947 continue;
 2948   mode = ALLOCNO_MODE (a);
 2949   rclass = ALLOCNO_CLASS (a);
 
 while building gcc (gnatcmd.adb file) for ia64-vms using a cross compiler 
 (target=ia64-vms, host=x86_64-linux).
 
 The reason looks to be an out of bounds access:
 
 (gdb) print regno
 $10 = 18476
 (gdb) print ira_reg_equiv_len
 $11 = 17984
 
 (I suppose this setup is not easy at all to reproduce, but I can provide any 
 files, if necessary).
 Tristan, thanks for reporting this.
 Wild guess, as I don't know IRA at all:  looks like in this file most 
 accesses to ira_reg_equiv_* are guarded.  Is it expected that they aren't at 
 this point ?
 Yes, I guess.  It is possible to have the pseudos which are out of range 
 ira_reg_equiv_const.  It should be hard to reproduce such error because they 
 are generated when we need to break circular dependence (e.g. when hard 
 register 1 should be moved to hard register 2 and hard register 2 to hard 
 register 1).
 
 Your solution is perfectly fine.  So you can commit the patch into the trunk 
 as pre-approved.

Thank you for your prompt answer.  I will commit it after regtesting.

Tristan.



Re: [PATCH, 4.7] Backport fix to [un]signed_type_for

2012-05-11 Thread Richard Guenther
On Thu, 10 May 2012, William J. Schmidt wrote:

 On Thu, 2012-05-10 at 18:49 +0200, Jakub Jelinek wrote:
  On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:
   Backporting this patch to 4.7 fixes a problem building Fedora 17.
   Bootstrapped and regression tested on powerpc64-unknown-linux-gnu.  Is
   the backport OK?
  
  For 4.7 I'd very much prefer a less intrusive change (i.e. change
  the java langhook) instead, but I'll defer to Richard if he prefers
  this over that.
 
 OK.  If that's desired, this is the possible change to the langhook:
 
 Index: gcc/java/typeck.c
 ===
 --- gcc/java/typeck.c (revision 187158)
 +++ gcc/java/typeck.c (working copy)
 @@ -189,6 +189,12 @@ java_type_for_size (unsigned bits, int unsignedp)
  return unsignedp ? unsigned_int_type_node : int_type_node;
if (bits = TYPE_PRECISION (long_type_node))
  return unsignedp ? unsigned_long_type_node : long_type_node;
 +  /* A 64-bit target with TImode requires 128-bit type definitions
 + for bitsizetype.  */
 +  if (int128_integer_type_node
 +   bits == TYPE_PRECISION (int128_integer_type_node))
 +return (unsignedp ? int128_unsigned_type_node
 + : int128_integer_type_node);
return 0;
  }
 
 which also fixed the problem and bootstraps without regressions.
 Whichever you guys prefer is fine with me.

I prefer the java variant, too.

Thanks,
Richard.


Re: [C++ Patch] PR 53305

2012-05-11 Thread Paolo Carlini

On 05/11/2012 04:48 AM, Gabriel Dos Reis wrote:

On Thu, May 10, 2012 at 6:40 PM, Paolo Carlinipaolo.carl...@oracle.com  wrote:

Hi,

an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next
DECL_CONTEXT (r) can only crash. Plus a garbled error message because
pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM.

Tested x86_64-linux.

Thanks,
Paolo.

///

Stylistically, I would write

if (r == NULL)

or

   if (r == NULL_TREE)

Patch OK with that change.
Thanks Gaby. Then I guess I'm going to commit the variant with 
NULL_TREE, I like it a tad better because after all formally these 
functions return trees. And consistently I change another instance only 
a few lines above.


Thanks,
Paolo.

/
Index: testsuite/g++.dg/cpp0x/variadic132.C
===
--- testsuite/g++.dg/cpp0x/variadic132.C(revision 0)
+++ testsuite/g++.dg/cpp0x/variadic132.C(revision 0)
@@ -0,0 +1,27 @@
+// PR c++/53305
+// { dg-do compile { target c++11 } }
+
+templateclass... Ts struct tuple { };
+
+struct funct
+{
+  templateclass... argTs
+  int operator()(argTs...);
+};
+
+templateclass... class test;
+
+templatetemplate class... class tp,
+class... arg1Ts, class... arg2Ts
+class testtparg1Ts..., tparg2Ts...
+{
+  templateclass func, class...arg3Ts
+auto test2(func fun, arg1Ts... arg1s, arg3Ts... arg3s)
+- decltype(fun(arg1s..., arg3s...));
+};
+
+int main()
+{
+  testtuple, tuplechar,int t2;
+  t2.test2(funct(), 'a', 2);  // { dg-error no matching function }
+}
Index: cp/cxx-pretty-print.c
===
--- cp/cxx-pretty-print.c   (revision 187376)
+++ cp/cxx-pretty-print.c   (working copy)
@@ -1261,6 +1261,7 @@ pp_cxx_simple_type_specifier (cxx_pretty_printer *
 case TEMPLATE_TYPE_PARM:
 case TEMPLATE_TEMPLATE_PARM:
 case TEMPLATE_PARM_INDEX:
+case BOUND_TEMPLATE_TEMPLATE_PARM:
   pp_cxx_unqualified_id (pp, t);
   break;
 
Index: cp/pt.c
===
--- cp/pt.c (revision 187376)
+++ cp/pt.c (working copy)
@@ -12066,7 +12066,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 case PARM_DECL:
   r = retrieve_local_specialization (t);
 
-  if (r == NULL)
+  if (r == NULL_TREE)
{
  tree c;
 
@@ -12084,6 +12084,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 not the following PARM_DECLs that are chained to T.  */
  c = copy_node (t);
  r = tsubst_decl (c, args, complain);
+ if (r == NULL_TREE)
+   return error_mark_node;
  /* Give it the template pattern as its context; its true context
 hasn't been instantiated yet and this is good enough for
 mangling.  */


[Patch, Fortran, committed] PR53310 - Fix multi-malloc memory leak in EOSHIFT

2012-05-11 Thread Tobias Burnus
I have committed the attached patch as obvious. libgfortran (since GCC 
4.5) allocates the memory rank()-times instead of only once in eoshift2. 
For the test case of the PR, gfortran leaked 6 MB per call!


Build and regtested on x86-64-linux and comitted as Rev. 187395 to the 
4.8 trunk. I intent to backport the patch, but I have not yet decided 
whether it will be down to 4.5 or not.


Tobias

PS: In the ChangeLog I have converted some spaces into tabs.
Index: libgfortran/intrinsics/eoshift2.c
===
--- libgfortran/intrinsics/eoshift2.c	(revision 187394)
+++ libgfortran/intrinsics/eoshift2.c	(working copy)
@@ -77,6 +77,10 @@ eoshift2 (gfc_array_char *ret, const gfc_array_cha
 
   ret-offset = 0;
   ret-dtype = array-dtype;
+
+  /* xmalloc allocates a single byte for zero size.  */
+  ret-base_addr = xmalloc (size * arraysize);
+
   for (i = 0; i  GFC_DESCRIPTOR_RANK (array); i++)
 {
 	  index_type ub, str;
@@ -90,10 +94,6 @@ eoshift2 (gfc_array_char *ret, const gfc_array_cha
 	  * GFC_DESCRIPTOR_STRIDE(ret,i-1);
 
 	  GFC_DIMENSION_SET(ret-dim[i], 0, ub, str);
-
-  /* xmalloc allocates a single byte for zero size.  */
-	  ret-base_addr = xmalloc (size * arraysize);
-
 }
 }
   else if (unlikely (compile_options.bounds_check))
Index: libgfortran/ChangeLog
===
--- libgfortran/ChangeLog	(revision 187394)
+++ libgfortran/ChangeLog	(working copy)
@@ -1,17 +1,23 @@
+2012-05-11  Tobias Burnus  bur...@net-b.de
+
+	PR fortran/53310
+	* intrinsics/eoshift2.c (eoshift2): Do not leak
+	memory by allocating it in the loop.
+
 2012-05-05  Janne Blomqvist  j...@gcc.gnu.org
 
-* config.h.in: Regenerated.
-* configure: Regenerated.
-* configure.ac: Add checks for getegid and __secure_getenv.
-* io/unix.c (P_tmpdir): Fallback definition for macro.
-(tempfile_open): New function.
-(tempfile): Use secure_getenv, call tempfile_open to try each
-directory in turn.
-* libgfortran.h (DEFAULT_TMPDIR): Remove macro.
-(secure_getenv): New macro/prototype.
-* runtime/environ.c (secure_getenv): New function.
-(variable_table): Rename GFORTRAN_TMPDIR to TMPDIR.
-* runtime/main.c (find_addr2line): Use secure_getenv.
+	* config.h.in: Regenerated.
+	* configure: Regenerated.
+	* configure.ac: Add checks for getegid and __secure_getenv.
+	* io/unix.c (P_tmpdir): Fallback definition for macro.
+	(tempfile_open): New function.
+	(tempfile): Use secure_getenv, call tempfile_open to try each
+	directory in turn.
+	* libgfortran.h (DEFAULT_TMPDIR): Remove macro.
+	(secure_getenv): New macro/prototype.
+	* runtime/environ.c (secure_getenv): New function.
+	(variable_table): Rename GFORTRAN_TMPDIR to TMPDIR.
+	* runtime/main.c (find_addr2line): Use secure_getenv.
 
 2012-04-22  Tobias Burnus  bur...@net-b.de
 
@@ -479,24 +485,24 @@
 
 2012-03-15  Janne Blomqvist  j...@gcc.gnu.org
 
-PR libfortran/52434
-PR libfortran/48878
-PR libfortran/38199
-* io/unit.c (get_internal_unit): Default to ROUND_UNSPECIFIED.
-(init_units): Likewise.
-* io/write_float.def (determine_precision): New function.
-(output_float): Take into account buffer with %f format, no need
-for our own rounding if unspecified or processor specified
-rounding.
-(DTOA): Simplify format string, add parameters.
-(FDTOA): New macros similar to DTOA, but using %f format.
-(OUTPUT_FLOAT_FMT_G): Stack allocate newf, determine correct
-precision and fill buffer.
-(EN_PREC): New macro.
-(determine_en_precision): New function.
-(WRITE_FLOAT): For G format, move buffer filling into
-output_float_FMT_G, use FDTOA for F format.
-(write_float): Increase buffer due to F format.
+	PR libfortran/52434
+	PR libfortran/48878
+	PR libfortran/38199
+	* io/unit.c (get_internal_unit): Default to ROUND_UNSPECIFIED.
+	(init_units): Likewise.
+	* io/write_float.def (determine_precision): New function.
+	(output_float): Take into account buffer with %f format, no need
+	for our own rounding if unspecified or processor specified
+	rounding.
+	(DTOA): Simplify format string, add parameters.
+	(FDTOA): New macros similar to DTOA, but using %f format.
+	(OUTPUT_FLOAT_FMT_G): Stack allocate newf, determine correct
+	precision and fill buffer.
+	(EN_PREC): New macro.
+	(determine_en_precision): New function.
+	(WRITE_FLOAT): For G format, move buffer filling into
+	output_float_FMT_G, use FDTOA for F format.
+	(write_float): Increase buffer due to F format.
 
 2012-03-14  Rainer Orth  r...@cebitec.uni-bielefeld.de
 


Re: [C++ Patch] PR 53301

2012-05-11 Thread Paolo Carlini

Hi,

On 05/11/2012 06:41 AM, Jason Merrill wrote:

On 05/10/2012 05:31 PM, Paolo Carlini wrote:

Let's see if we can do something *now* ;) My concrete proposal would be:

TYPE_PTRMEM_P rename to TYPE_PTRDATAMEM_P (consistent with
TYPE_PTRMEMFUNC_P)
TYPE_PTR_TO_MEMBER_P rename to TYPE_PTRMEM_P

and then finally

#define TYPE_PTR_OR_PTRMEM_P(NODE) \
(TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE))

and use it everywhere. Sounds like an improvement?


Sounds pretty good.  But I suspect a lot of places want to check 
TYPE_PTR_P || TYPE_PTRDATAMEM_P (because you can't just use TREE_TYPE 
to get the function type of a PMF), so this new macro doesn't help 
with your desire to avoid writing ||.
Well, I wanted to avoid writing it somewhere *else* ;) But, if we want, 
we could add a new predicate for your case too, maybe in a following 
step. What do you think?

Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P

No, that flag applies to both varieties of members.
I see. Then while we are at it we could probably tweak a bit a the 
comment, I find it a tad misleading:


/* Indicates when overload resolution may resolve to a pointer to
   member function. [expr.unary.op]/3 */

???

Thanks,
Paolo.


Re: PATCH: Add RTM support to -march=native

2012-05-11 Thread Uros Bizjak
On Fri, May 11, 2012 at 4:51 AM, H.J. Lu hongjiu...@intel.com wrote:

 This patch adds RTM support to -march=native.  Tested on Linux/x86-64.
 OK for trunk?

 2012-05-10  H.J. Lu  hongjiu...@intel.com

        * config/i386/driver-i386.c (host_detect_local_cpu): Support
        RTM.

OK.

Thanks,
Uros.


Re: [C++ Patch] PR 53305

2012-05-11 Thread Gabriel Dos Reis
On Fri, May 11, 2012 at 3:07 AM, Paolo Carlini paolo.carl...@oracle.com wrote:

 Thanks Gaby. Then I guess I'm going to commit the variant with NULL_TREE, I
 like it a tad better because after all formally these functions return
 trees. And consistently I change another instance only a few lines above.

OK. Thanks;

-- Gaby


[RFC GCC/patch] Support sinking loads from memory in tree-ssa-sink.c if possible

2012-05-11 Thread Bin Cheng
Hi,
I previously noticed from testcases that gcc now does not sink loads from
memory
in tree-ssa-sink.c. After discussing I worked out a patch to support this in
gcc.
Please refer to  http://gcc.gnu.org/ml/gcc/2012-04/msg00404.html for some
info.

I think it is a trivial optimization, but it might be wanted in GCC since
less
instructions are executed with this.

One potential issue might be: Should I disable it when optimizing for size
and
the sink hurting it.

I bootstrapped x86 and tested the patch on x86/arm, no new fail introduced.

It is OK for trunk? 
And comments are welcome. Thanks.


gcc/ChangeLog:
2012-05-11  Bin Cheng  bin.ch...@arm.com

* tree-ssa-sink.c (pred_blocks, pred_visited): New static vars.
(init_sink_load): New function initializes pred_blocks and
pred_visited.
(free_sink_load): New function frees pred_blocks and pred_visited.
(sink_load_p): New function checks whether load operation should be
sunk.
(execute_sink_code): Call init_sink_load and free_sink_load.
(statement_sink_location): Sink loads from memory if possible.

gcc/testsuite/ChangeLog:
2012-05-11  Bin Cheng  bin.ch...@arm.com

* testsuite/gcc.dg/tree-ssa/ssa-sink-10.c: New test.
* testsuite/gcc.dg/tree-ssa/ssa-sink-11.c: New test.
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c (revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */ 
+/* { dg-options -O2 -fdump-tree-sink-stats } */
+extern int b;
+int
+foo (int a, int c)
+{
+  int x = b;
+  if (c == 1)
+return x;
+  else
+return a;
+}
+/* We should sink load from b into the branch that returns x. */
+/* { dg-final { scan-tree-dump-times Sunk statements: 1 1 sink } } */
+/* { dg-final { cleanup-tree-dump sink } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */ 
+/* { dg-options -O2 -fdump-tree-sink-stats } */
+extern int b;
+void bar();
+int
+foo (int a, int c)
+{
+  int x = b;
+  bar();
+  if (c == 1)
+return x;
+  else
+return a;
+}
+/* We should not sink load of b into the branch that returns x,
+   because it might be clobbered by the call to bar.  */
+/* { dg-final { scan-tree-dump-times Sunk statements: 0 sink } } */
+/* { dg-final { cleanup-tree-dump sink } } */
Index: gcc/tree-ssa-sink.c
===
--- gcc/tree-ssa-sink.c (revision 186736)
+++ gcc/tree-ssa-sink.c (working copy)
@@ -77,6 +77,115 @@
 } sink_stats;
 
 
+/* Stores reversely breadth-first searched block pointer, starting at
+   the potential destination block of sinking from load.  */
+static basic_block *pred_blocks;
+/* Bitmap records the visited basic blocks in breadth-first search.  */
+static bitmap pred_visited;
+
+static void
+init_sink_load (void)
+{
+  pred_blocks = XCNEWVEC (basic_block, last_basic_block);
+  pred_visited = BITMAP_ALLOC (NULL);
+}
+
+static void
+free_sink_load (void)
+{
+  free (pred_blocks);
+  BITMAP_FREE (pred_visited);
+}
+
+/* Given a gimple STMT in basic block FROM, which is a load operation,
+   check whether it should be sinked to TOGSI in basic block TO.
+   Return TRUE if it should be sinked, otherwise return FALSE.
+   
+   Loads should not be sunk across stmts that might clobber the reference,
+   for simplicity, we just stop checking at any store.  */
+
+static bool
+sink_load_p (gimple stmt, basic_block from, basic_block to,
+gimple_stmt_iterator *togsi)
+{
+  int i, vc;
+  basic_block bb;
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+
+  /* Stmt should be a load.  */
+  gcc_assert (gimple_vuse (stmt)  !gimple_vdef (stmt));
+
+  /* Sink has conflict effect with if-conversion pass, which speculates
+ loads. So do not sink possibly trapping stmt, which can not be
+ speculated by if-conversion pass afterward.  */
+  if (gimple_assign_rhs_could_trap_p (stmt))
+return false;
+
+  /* Sink to not post-dominated basic block, unless it's in outer loop.  */
+  if (dominated_by_p (CDI_POST_DOMINATORS, from, to)
+   (from-loop_depth = to-loop_depth))
+return false;
+
+  /* If the latch block is empty, don't make it non-empty by sinking
+ load into it.  */
+  if (to == from-loop_father-latch  empty_block_p (to))
+return false;
+
+  /* Don't sink stmt if reference might be clobbered by successor stmts.  */
+  for (gsi_next(gsi); !gsi_end_p (gsi); gsi_next (gsi))
+if (gimple_vdef (gsi_stmt (gsi)))
+  return false;
+
+  /* Don't sink stmt if reference might be clobbered by predecessors of
+ TOGSI.  */
+  gsi = *togsi;
+  if (!gsi_end_p (gsi))
+gsi_prev (gsi);
+
+  for (; 

[testsuite] Allow for ! comments in g++.dg/debug/dwarf2/nested-3.C

2012-05-11 Thread Rainer Orth
It turns out that g++.dg/debug/dwarf2/nested-3.C was still failing on
Solaris/SPARC: with both as and gas, ! is used as a comment character.
This patch accounts for that.

Tested with the appropritate runtest invocation on sparc-sun-solaris2.11
with both as and gas, installed on mainline.

Rainer


2012-05-11  Rainer Orth  r...@cebitec.uni-bielefeld.de

* g++.dg/debug/dwarf2/nested-3.C: Allow for ! comments.

# HG changeset patch
# Parent 00aab0518e96468d2a15d919ee91176c075ea95e
Allow for ! comments in g++.dg/debug/dwarf2/nested-3.C

diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C b/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
--- a/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
@@ -59,4 +59,4 @@ main ()
 //
 // Hence the scary regexp:
 //
-// { dg-final { scan-assembler \[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_namespace\\)\[\n\r\]+\[^\n\r\]*\thread\[\^\n\r]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_class_type\\)(\[\n\r\]+\[^\n\r\]*)+\Executor\[^\n\r\]+\[\n\r\]+\[^\n\r\]*DW_AT_declaration\[\n\r\]+\[^\n\r\]*DW_AT_signature\[^#/\]*\[#/\] \[^\n\r\]*\\(DIE\[^\n\r\]*DW_TAG_subprogram\\)\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\CurrentExecutor\[^\n\r\]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*end of children of DIE 0x\\3\[\n\r]+\[^\n\r\]*end of children of DIE 0x\\1\[\n\r]+ } }
+// { dg-final { scan-assembler \[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_namespace\\)\[\n\r\]+\[^\n\r\]*\thread\[\^\n\r]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_class_type\\)(\[\n\r\]+\[^\n\r\]*)+\Executor\[^\n\r\]+\[\n\r\]+\[^\n\r\]*DW_AT_declaration\[\n\r\]+\[^\n\r\]*DW_AT_signature\[^#/!\]*\[#/!\] \[^\n\r\]*\\(DIE\[^\n\r\]*DW_TAG_subprogram\\)\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\CurrentExecutor\[^\n\r\]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*end of children of DIE 0x\\3\[\n\r]+\[^\n\r\]*end of children of DIE 0x\\1\[\n\r]+ } }

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFC GCC/patch] Support sinking loads from memory in tree-ssa-sink.c if possible

2012-05-11 Thread Richard Guenther
On Fri, May 11, 2012 at 10:53 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 I previously noticed from testcases that gcc now does not sink loads from
 memory
 in tree-ssa-sink.c. After discussing I worked out a patch to support this in
 gcc.
 Please refer to  http://gcc.gnu.org/ml/gcc/2012-04/msg00404.html for some
 info.

 I think it is a trivial optimization, but it might be wanted in GCC since
 less
 instructions are executed with this.

 One potential issue might be: Should I disable it when optimizing for size
 and
 the sink hurting it.

 I bootstrapped x86 and tested the patch on x86/arm, no new fail introduced.

 It is OK for trunk?
 And comments are welcome. Thanks.

First of all, can you do your patches with -p?  That makes them easier
to review.

I think you should increase testing coverage - interesting cases include

+extern int b;
+void bar();
+int
+foo (int a, int c)
+{
+  int x = b;
+  if (c == 1)
+return x;
+  else
 {
  bar ();
+return a;
 }
+}

where bar () should not prevent sinking.

I don't like the way you perform the checking whether there is a store along
the paths from the source to the destination location - we have virtual operands
in SSA form that should enable a more efficient way to do this.  (in general
SSU form would be the best here, as the whole sinking is really partial dead
code elimination)  Basically you are looking for the active VUSE operand
at the insert location and want to make sure it matches that of the load you
want to sink.  Or the other way around, you are looking for all VDEF operands
that make the loads VUSE operand dead and want to make sure those are
not on any path from the sink source to its destination.  Walking upwards
is matching what SSA provides the best (but doesn't help very much, as as
soon that you've found the SSA name to start walking you are finished).
But it boils down to recording the active VUSE at basic-block entry (something
that's the same for all loads, no need to compute that per load).

But, as we are dealing with simple cases (any store will block sinking of loads)
we can as well limit the CFG structure we can sink over to only consider
immediate successors of the sink source.  As sink_code_in_bb already
walks backwards over the stmts of the sink source it's easy to remember
whether we've seen a store here or not (also, if there is a relevant store
in the source BB then the vuses SSA uses are all in BB or its dominators
and the single store reachable that way is in BB).

Thanks,
Richard.


 gcc/ChangeLog:
 2012-05-11  Bin Cheng  bin.ch...@arm.com

        * tree-ssa-sink.c (pred_blocks, pred_visited): New static vars.
        (init_sink_load): New function initializes pred_blocks and
 pred_visited.
        (free_sink_load): New function frees pred_blocks and pred_visited.
        (sink_load_p): New function checks whether load operation should be
 sunk.
        (execute_sink_code): Call init_sink_load and free_sink_load.
        (statement_sink_location): Sink loads from memory if possible.

 gcc/testsuite/ChangeLog:
 2012-05-11  Bin Cheng  bin.ch...@arm.com

        * testsuite/gcc.dg/tree-ssa/ssa-sink-10.c: New test.
        * testsuite/gcc.dg/tree-ssa/ssa-sink-11.c: New test.


Re: [patch] support for multiarch systems

2012-05-11 Thread Paolo Bonzini
Il 11/05/2012 07:13, Matthias Klose ha scritto:
 ok, I did clarify it in the existing documentation of MULTIARCH_DIRNAME in
 fragments.texi, detailing the search order for the files. Should the search
 order be mentioned in some user documentation as well? if yes, where?

Thanks!  I don't think the search order should be mentioned specially,
since anyway *_INCLUDE_PATH and LIBRARY_PATH are mentioned.  However
under MULTILIB_DIRNAMES I would add something like this:

@code{MULTILIB_DIRNAMES} describes the multilib directories using GCC
conventions and is applied to directories that are part of the GCC
installation.  When multilib-enabled, the compiler will add a
subdirectory of the form @var{prefix}/@var{multilib} before each
directory in the search path for libraries and crt files.

 +@findex MULTILIB_OSDIRNAMES
 +@item MULTILIB_OSDIRNAMES
 +If @code{MULTILIB_OPTIONS} is used, this variable specifies

... a list of subdirectory names, that are used to modify the search
path depending on the chosen multilib.  Unlike @code{MULTILIB_DIRNAMES},
@code{MULTILIB_OSDIRNAMES} describes the multilib directories using
operating systems conventions, and is applied to the directories such as
@code{lib} or those in the @env{LIBRARY_PATH} environment variable.

  The format is either the same as of
 +@code{MULTILIB_DIRNAMES}, or a set of mappings.  When it is the same
 +as @code{MULTILIB_DIRNAMES}, it describes the multilib directories
 +using OS conventions, rather than GCC conventions.

s/OS/operating system/

 When it is a set
 +of mappings of the form @var{gccdir}=@var{osdir}, the left side gives
 +the GCC convention and the right gives the equivalent OS defined
 +location.  If the @var{osdir} part begins with a @samp{!},

... GCC will not search in the non-multilib directory and use
exclusively the multilib directory.  Otherwise, the compiler will
examine the search path for libraries and crt files twice; the first
time it will add @var{multilib} to each directory in the search path,
the second it will not.

 Use the mapping when there is
 +no one-to-one equivalence between GCC levels and the OS.

I'm not sure what you mean here?


 +For multilib enabled configurations (see @code{MULTIARCH_DIRNAME})
 +below), the multilib name is appended to each directory name, separated
 +by a colon (e.g. @samp{../lib:x86_64-linux-gnu}).

For configurations that support both multilib and multiarch,
@code{MULTILIB_OSDIRNAMES} also encodes the multiarch name, thus
subsuming @code{MULTIARCH_DIRNAME}.  The multiarch name is appended to
each directory name, separated by a colon (e.g.
@samp{../lib32:i386-linux-gnu}).

Each multiarch subdirectory will be searched before the corresponding OS
multilib directory, for example @samp{/lib/i386-linux-gnu} before
@samp{/lib/..lib32}.  The multiarch name will also be used to modify the
system header search path, as explained for @code{MULTIARCH_DIRNAME}.


 +@findex MULTIARCH_DIRNAME
 +@item MULTIARCH_DIRNAME
 +If @code{MULTIARCH_DIRNAME} is used, this variable specifies the
 +multiarch name for this configuration.

 +Libraries and crt files are searched first in
 +@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix}
 +added by @code{add_prefix} or @code{add_sysrooted_prefix}.
 +System header files are searched first in
 +@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before
 +@code{LOCAL_INCLUDE_DIR}, and in
 +@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before
 +@code{NATIVE_SYSTEM_HEADER_DIR}.

 +@code{MULTIARCH_DIRNAME} is not used for multilib enabled
 +configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead.

This sounds simpler, and doesn't refer to GCC implementation details
such add_prefix/add_sysrooted_prefix:

This variable specifies the multiarch name for configurations that are
multiarch-enabled but not multilibbed configurations.

The multiarch name is used to augment the search path for libraries, crt
files and system header files with additional locations.  The compiler
will add a multiarch subdirectory of the form
@var{prefix}/@var{multiarch} before each directory in the library and
crt search path.  It will also add two directories
@code{LOCAL_INCLUDE_DIR}/@var{multiarch} and
@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch}) to the system header
search path, respectively before @code{LOCAL_INCLUDE_DIR} and
@code{NATIVE_SYSTEM_HEADER_DIR}.

@code{MULTIARCH_DIRNAME} is not used for configurations that support
both multilib and multiarch.  In that case, multiarch names are encoded
in @code{MULTILIB_OSDIRNAMES} instead.

 +The multiarch tuples are defined
 +in @uref{http://wiki.debian.org/Multiarch/Tuples}.

Is this needed?  Aren't they defined simply by the GCC source code?
Surely if some other OS implements multiarch with different tuples (no
matter how undesirable that would be) Debian would not be an
authoritative source.

Paolo


Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as

2012-05-11 Thread Rainer Orth
Hi Uros,

 there, an alternative patch might be to remove it from the port now
 instead of the patch.

 Right, that's why I was asking for review rather than just installing on
 my own.

 I'd rather see that we remove semicolon in this case, but please note
 that xchg with memory operand doesn't need lock prefix. If it is not
 too much trouble...

I saw your message only after I had committed the patch.  I'll try to
have a look this weekend, but it's quite unclear if I have any time.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes

2012-05-11 Thread H.J. Lu
On Thu, May 10, 2012 at 1:19 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 after some thought, the changes into omp-low are not as obviously harmless as 
 I
 originally tought.  So i decided to handle this by separate patch.  This patch
 simply makes cgraph to not release bodies of artificial functions that papers
 around the problem in easier way.

 Bootstrapped/regtested x86_64-linux, comitted.

 Honza

        * cgraph.h (cgraph_remove_unreachable_nodes): Rename to ...
        (symtab_remove_unreachable_nodes): ... this one.
        * ipa-cp.c (ipcp_driver): Do not remove unreachable nodes.
        * cgraphunit.c (ipa_passes): Update.
        * cgraphclones.c (cgraph_materialize_all_clones): Update.
        * cgraph.c (cgraph_release_function_body): Only turn initial
        into error mark when initial was previously set.
        * ipa-inline.c (ipa_inline): Update.
        * ipa.c: Include ipa-inline.h
        (enqueue_cgraph_node, enqueue_varpool_node): Remove.
        (enqueue_node): New function.
        (process_references): Update.
        (symtab_remove_unreachable_nodes): Cleanup.
        * passes.c (execute_todo, execute_one_pass): Update.

This may have caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53321

-- 
H.J.


[PATCH] Fix PR53295

2012-05-11 Thread Richard Guenther

This fixes the dependency of vectorization of strided loads on
gather support.  For that to work we need to lift the restriction
in data-ref analysis that requries a constant DR_STEP.  Fortunately
fallout is small.

As a side-effect the data-reference for strided loads now makes
sense (rather than being gobbled over by gather analysis BB-level DR).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-05-11  Richard Guenther  rguent...@suse.de

* tree-data-ref.h (stride_of_unit_type_p): Handle non-constant
strides.
* tree-data-ref.c (dr_analyze_innermost): Allow non-constant
strides when analyzing data-references in a loop context.
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Reject
non-constant strides for now.
(vect_enhance_data_refs_alignment): Ignore data references
that are strided loads.
(vect_analyze_data_ref_access): Handle non-constant strides.
(vect_check_strided_load): Verify the data-reference is a load.
(vect_analyze_data_refs): Restructure to make strided load
support not dependent on gather support.
* tree-vect-stmts.c (vectorizable_load): Avoid useless work
when doing strided or gather loads.
* tree-vect-loop-manip.c (vect_vfa_segment_size): Use
integer_zerop to compare stride with zero.

Index: gcc/tree-data-ref.h
===
*** gcc/tree-data-ref.h (revision 187396)
--- gcc/tree-data-ref.h (working copy)
*** bool stmt_with_adjacent_zero_store_dr_p
*** 618,626 
  static inline bool
  stride_of_unit_type_p (tree stride, tree type)
  {
!   return tree_int_cst_equal (fold_unary (ABS_EXPR, TREE_TYPE (stride),
!stride),
!TYPE_SIZE_UNIT (type));
  }
  
  /* Determines whether RDG vertices V1 and V2 access to similar memory
--- 618,627 
  static inline bool
  stride_of_unit_type_p (tree stride, tree type)
  {
!   return (TREE_CODE (stride) == INTEGER_CST
!  tree_int_cst_equal (fold_unary (ABS_EXPR, TREE_TYPE (stride),
!stride),
!TYPE_SIZE_UNIT (type)));
  }
  
  /* Determines whether RDG vertices V1 and V2 access to similar memory
Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c (revision 187396)
--- gcc/tree-data-ref.c (working copy)
*** dr_analyze_innermost (struct data_refere
*** 736,742 
if (in_loop)
  {
if (!simple_iv (loop, loop_containing_stmt (stmt), base, base_iv,
!   false))
  {
if (nest)
  {
--- 736,742 
if (in_loop)
  {
if (!simple_iv (loop, loop_containing_stmt (stmt), base, base_iv,
!   nest ? true : false))
  {
if (nest)
  {
*** dr_analyze_innermost (struct data_refere
*** 773,779 
offset_iv.step = ssize_int (0);
  }
else if (!simple_iv (loop, loop_containing_stmt (stmt),
!poffset, offset_iv, false))
  {
if (nest)
  {
--- 773,780 
offset_iv.step = ssize_int (0);
  }
else if (!simple_iv (loop, loop_containing_stmt (stmt),
!poffset, offset_iv,
!  nest ? true : false))
  {
if (nest)
  {
Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 187396)
--- gcc/tree-vect-data-refs.c   (working copy)
*** vect_mark_for_runtime_alias_test (ddr_p
*** 542,547 
--- 542,558 
return false;
  }
  
+   /* FORNOW: We don't support creating runtime alias tests for non-constant
+  step.  */
+   if (TREE_CODE (DR_STEP (DDR_A (ddr))) != INTEGER_CST
+   || TREE_CODE (DR_STEP (DDR_B (ddr))) != INTEGER_CST)
+ {
+   if (vect_print_dump_info (REPORT_DR_DETAILS))
+   fprintf (vect_dump, versioning not yet supported for non-constant 
+step);
+   return false;
+ }
+ 
VEC_safe_push (ddr_p, heap, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo), ddr);
return true;
  }
*** vect_enhance_data_refs_alignment (loop_v
*** 1522,1527 
--- 1533,1543 
if (integer_zerop (DR_STEP (dr)))
continue;
  
+   /* Strided loads perform only component accesses, alignment is
+irrelevant for them.  */
+   if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
+   continue;
+ 
supportable_dr_alignment = vect_supportable_dr_alignment (dr, true);
do_peeling = vector_alignment_reachable_p (dr);
if (do_peeling)
*** vect_enhance_data_refs_alignment (loop_v
*** 1779,1784 

Re: [PATCH, alpha]: Fix ICE in alpha_emit_conditional_move, at config/alpha/alpha.c:2649

2012-05-11 Thread Uros Bizjak
On Fri, May 11, 2012 at 1:54 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Recently testsuite/gcc.c-torture/execute/ieee/pr50310.c started to ICE
 when compiled with -O3 -mieee on alphaev68-pc-linux-gnu:

 $ ~/gcc-build-alpha/gcc/cc1 -O3 -mieee -quiet pr50310.c
 pr50310.c: In function ‘foo’:
 pr50310.c:31:20: internal compiler error: in
 alpha_emit_conditional_move, at config/alpha/alpha.c:2649
     s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0;
                    ^
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions.

 It turned out that UNORDERED and ORDERED RTX codes are not handled in
 alpha_emit_conditional_move. Attached patch fixes this oversight.

 2012-05-11  Uros Bizjak  ubiz...@gmail.com

        * config/alpha/alpha.c (alpha_emit_conditional_branch): Handle
        ORDERED and UNORDERED conditions.

 Patch was bootstrapped and regression tested on
 alphaev68-pc-linux-gnu.  OK for mainline SVN and release branches?

Oops, wrong patch was attached. Attached is the correct one.

Uros.
Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c(revision 187394)
+++ config/alpha/alpha.c(working copy)
@@ -2335,7 +2335,7 @@ alpha_emit_conditional_branch (rtx operands[], enu
 {
 case EQ:  case LE:  case LT:  case LEU:  case LTU:
 case UNORDERED:
-  /* We have these compares: */
+  /* We have these compares.  */
   cmp_code = code, branch_code = NE;
   break;
 
@@ -2572,13 +2572,15 @@ alpha_emit_conditional_move (rtx cmp, enum machine
   switch (code)
{
case EQ: case LE: case LT: case LEU: case LTU:
+   case UNORDERED:
  /* We have these compares.  */
  cmp_code = code, code = NE;
  break;
 
case NE:
- /* This must be reversed.  */
- cmp_code = EQ, code = EQ;
+   case ORDERED:
+ /* These must be reversed.  */
+ cmp_code = reverse_condition (code), code = EQ;
  break;
 
case GE: case GT: case GEU: case GTU:
@@ -2598,6 +2600,14 @@ alpha_emit_conditional_move (rtx cmp, enum machine
  gcc_unreachable ();
}
 
+  if (cmp_mode == DImode)
+   {
+ if (!reg_or_0_operand (op0, DImode))
+   op0 = force_reg (DImode, op0);
+ if (!reg_or_8bit_operand (op1, DImode))
+   op1 = force_reg (DImode, op1);
+   }
+
   tem = gen_reg_rtx (cmp_mode);
   emit_insn (gen_rtx_SET (VOIDmode, tem,
  gen_rtx_fmt_ee (cmp_code, cmp_mode,
@@ -2609,6 +2619,14 @@ alpha_emit_conditional_move (rtx cmp, enum machine
   local_fast_math = 1;
 }
 
+  if (cmp_mode == DImode)
+{
+  if (!reg_or_0_operand (op0, DImode))
+   op0 = force_reg (DImode, op0);
+  if (!reg_or_8bit_operand (op1, DImode))
+   op1 = force_reg (DImode, op1);
+}
+
   /* We may be able to use a conditional move directly.
  This avoids emitting spurious compares.  */
   if (signed_comparison_operator (cmp, VOIDmode)
@@ -2627,11 +2645,13 @@ alpha_emit_conditional_move (rtx cmp, enum machine
   switch (code)
 {
 case EQ:  case LE:  case LT:  case LEU:  case LTU:
+case UNORDERED:
   /* We have these compares: */
   break;
 
 case NE:
-  /* This must be reversed.  */
+case ORDERED:
+  /* These must be reversed.  */
   code = reverse_condition (code);
   cmov_code = EQ;
   break;


[PATCH] More pass cleanups, make phinodes cache global

2012-05-11 Thread Richard Guenther

This makes some more pass structs internal to passes.c.  It also
makes the phinodes cache global (which it is already - sort of,
it gets cleaned after each early optimization but is kept after
IPA opts).

Bootstrapped on x86_64-unknown-linux-gnu.

Richard.

2012-05-11  Richard Guenther  rguent...@suse.de

* tree-pass.h (pass_rest_of_compilation,
pass_all_optimizations, pass_postreload, pass_all_early_optimizations):
Remove.
* passes.c (pass_all_optimizations, pass_postreload,
pass_all_early_optimizations): Make static.
(pass_rest_of_compilation): Likewise.  Make it an RTL_PASS.
* tree-phinodes.c (init_phinodes, fini_phinodes): Remove.
* tree-ssa.c (init_tree_ssa): Do not call init_phinodes.
(delete_tree_ssa): Do not call fini_phinodes.
* tree-flow.h (init_phinodes, fini_phinodes): Remove.

Index: gcc/tree-pass.h
===
*** gcc/tree-pass.h (revision 187396)
--- gcc/tree-pass.h (working copy)
*** extern struct gimple_opt_pass pass_tree_
*** 430,436 
  extern struct gimple_opt_pass pass_dse;
  extern struct gimple_opt_pass pass_nrv;
  extern struct gimple_opt_pass pass_rename_ssa_copies;
- extern struct gimple_opt_pass pass_rest_of_compilation;
  extern struct gimple_opt_pass pass_sink_code;
  extern struct gimple_opt_pass pass_fre;
  extern struct gimple_opt_pass pass_check_data_deps;
--- 430,435 
*** extern struct simple_ipa_opt_pass pass_i
*** 477,483 
  extern struct ipa_opt_pass_d pass_ipa_profile;
  extern struct ipa_opt_pass_d pass_ipa_cdtor_merge;
  
- extern struct gimple_opt_pass pass_all_optimizations;
  extern struct gimple_opt_pass pass_cleanup_cfg_post_optimizing;
  extern struct gimple_opt_pass pass_init_datastructures;
  extern struct gimple_opt_pass pass_fixup_cfg;
--- 476,481 
*** extern struct rtl_opt_pass pass_sms;
*** 535,541 
  extern struct rtl_opt_pass pass_sched;
  extern struct rtl_opt_pass pass_ira;
  extern struct rtl_opt_pass pass_reload;
- extern struct rtl_opt_pass pass_postreload;
  extern struct rtl_opt_pass pass_clean_state;
  extern struct rtl_opt_pass pass_branch_prob;
  extern struct rtl_opt_pass pass_value_profile_transformations;
--- 533,538 
*** extern struct rtl_opt_pass pass_rtl_seqa
*** 576,582 
  extern struct gimple_opt_pass pass_release_ssa_names;
  extern struct gimple_opt_pass pass_early_inline;
  extern struct gimple_opt_pass pass_inline_parameters;
- extern struct gimple_opt_pass pass_all_early_optimizations;
  extern struct gimple_opt_pass pass_update_address_taken;
  extern struct gimple_opt_pass pass_convert_switch;
  
--- 573,578 
Index: gcc/passes.c
===
*** gcc/passes.c(revision 187396)
--- gcc/passes.c(working copy)
*** gate_all_early_optimizations (void)
*** 334,340 
   !seen_error ());
  }
  
! struct gimple_opt_pass pass_all_early_optimizations =
  {
   {
GIMPLE_PASS,
--- 334,340 
   !seen_error ());
  }
  
! static struct gimple_opt_pass pass_all_early_optimizations =
  {
   {
GIMPLE_PASS,
*** gate_all_optimizations (void)
*** 364,370 
   (!seen_error () || gimple_in_ssa_p (cfun)));
  }
  
! struct gimple_opt_pass pass_all_optimizations =
  {
   {
GIMPLE_PASS,
--- 364,370 
   (!seen_error () || gimple_in_ssa_p (cfun)));
  }
  
! static struct gimple_opt_pass pass_all_optimizations =
  {
   {
GIMPLE_PASS,
*** gate_rest_of_compilation (void)
*** 391,400 
return !(rtl_dump_and_exit || flag_syntax_only || seen_error ());
  }
  
! struct gimple_opt_pass pass_rest_of_compilation =
  {
   {
!   GIMPLE_PASS,
*rest_of_compilation,   /* name */
gate_rest_of_compilation, /* gate */
NULL, /* execute */
--- 391,400 
return !(rtl_dump_and_exit || flag_syntax_only || seen_error ());
  }
  
! static struct rtl_opt_pass pass_rest_of_compilation =
  {
   {
!   RTL_PASS,
*rest_of_compilation,   /* name */
gate_rest_of_compilation, /* gate */
NULL, /* execute */
*** gate_postreload (void)
*** 416,422 
return reload_completed;
  }
  
! struct rtl_opt_pass pass_postreload =
  {
   {
RTL_PASS,
--- 416,422 
return reload_completed;
  }
  
! static struct rtl_opt_pass pass_postreload =
  {
   {
RTL_PASS,
*** init_optimization_passes (void)
*** 1377,1382 
--- 1377,1383 
p = all_late_ipa_passes;
NEXT_PASS (pass_ipa_pta);
*p = NULL;
+ 
/* These passes are run after IPA passes on every function that is being
   output to the assembler file.  */
p = all_passes;
Index: gcc/tree-phinodes.c

Re: [Patch, fortran] PR 52428 Reading of large negative values and range checking

2012-05-11 Thread Janne Blomqvist
PING #2!

On Wed, May 2, 2012 at 10:22 PM, Janne Blomqvist
blomqvist.ja...@gmail.com wrote:
 PING

 On Thu, Apr 26, 2012 at 12:08 AM, Janne Blomqvist
 blomqvist.ja...@gmail.com wrote:
 Hi,

 currently when -frange-check is enabled, we check for overflow when
 doing a formatted read of an integer value. This check, however, is
 against the Fortran numerical model (see 13.4 in F2008), which defines
 a symmetric interval [-huge(), huge()], whereas all targets gfortran
 supports use a two's complement representation with a range of
 [-huge()-1, huge()].

 However, there is no checking against the numerical model when doing
 arithmetic, and thus we can generate and write the value -huge()-1,
 but we cannot read it back in! With the -fno-range-check option, this
 overflow checking can be disabled, but at the cost of disabling all
 overflow checking, which leads to reading nonsense values if the
 hardware supported range overflows.

 The attached patch changes this logic such that overflow checking
 against the hardware supported range [-huge()-1, huge()] is always
 done when reading, regardless of the -frange-check flag setting. This
 also seems to be what ifort 12.0 does, I haven't checked other
 compilers.

 For some more arguments back and forth, see the PR.

 Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

 gcc/fortran ChangeLog:

 2012-04-25  Janne Blomqvist  j...@gcc.gnu.org

        PR fortran/52428
        * gfortran.texi: Update _gfortran_set_options documentation.
        * invoke.texi: Remove runtime behavior description of
        -fno-range-check.
        * trans-decl.c (create_main_function): Don't pass the range-check
        setting to the library.


 libgfortran ChangeLog:

 2012-04-25  Janne Blomqvist  j...@gcc.gnu.org

        PR fortran/52428
        * io/io.h (max_value): Rename to si_max, remove second argument.
        * io/list_read.c (convert_integer): Use unsigned types when
        parsing the digits, set max value depending on the sign.
        * io/read.c (max_value): Rename to si_max, remove second argument,
        simplify.
        (read_decimal): Set max value depending on sign, always check
        overflow.
        (read_radix): Calculate max unsigned value directly.
        * libgfortran.h (struct compile_options_t): Remove range_check
        field.
        * runtime/compile_options.c (set_options): Skip handling
        options[7].
        (init_compile_options): Don't set removed field.


 gcc/testsuite ChangeLog:

 2012-04-25  Janne Blomqvist  j...@gcc.gnu.org

        PR fortran/52428
        * gfortran.dg/int_range_io_1.f90: New test.


 --
 Janne Blomqvist



 --
 Janne Blomqvist



-- 
Janne Blomqvist


Re: [PATCH] Take branch misprediction effects into account when RTL loop unrolling (issue6099055)

2012-05-11 Thread Teresa Johnson
Ping?
Teresa

On Fri, May 4, 2012 at 3:41 PM, Teresa Johnson tejohn...@google.com wrote:

 On David's suggestion, I have removed the changes that rename niter_desc
 to
 loop_desc from this patch to focus the patch on the unrolling changes. I
 can
 submit a cleanup patch to do the renaming as soon as this one goes in.

 Bootstrapped and tested on x86_64-unknown-linux-gnu.  Ok for trunk?

 Thanks,
 Teresa

 Here is the new description of improvements from the original patch:

 Improved patch based on feedback. Main changes are:

 1) Improve efficiency by caching loop analysis results in the loop
 auxiliary
 info structure hanging off the loop structure. Added a new routine,
 analyze_loop_insns, to fill in information about the average and total
 number
 of branches, as well as whether there are any floating point set and call
 instructions in the loop. The new routine is invoked when we first create
 a
 loop's niter_desc struct, and the caller (get_simple_loop_desc) has been
 modified to handle creating a niter_desc for the fake outermost loop.

 2) Improvements to max_unroll_with_branches:
 - Treat the fake outermost loop (the procedure body) as we would a hot
 outer
 loop, i.e. compute the max unroll looking at its nested branches, instead
 of
 shutting off unrolling when we reach the fake outermost loop.
 - Pull the checks previously done in the caller into the routine (e.g.
 whether the loop iterates frequently or contains fp instructions).
 - Fix a bug in the previous version that sometimes caused overflow in the
 new unroll factor.

 3) Remove float variables, and use integer computation to compute the
 average number of branches in the loop.

 4) Detect more types of floating point computations in the loop by walking
 all set instructions, not just single sets.

 2012-05-04   Teresa Johnson  tejohn...@google.com

        * doc/invoke.texi: Update the documentation with new params.
        * loop-unroll.c (max_unroll_with_branches): New function.
        (decide_unroll_constant_iterations,
 decide_unroll_runtime_iterations):
        Add heuristic to avoid increasing branch mispredicts when
 unrolling.
        (decide_peel_simple, decide_unroll_stupid): Retrieve number of
        branches from niter_desc instead of via function that walks loop.
        * loop-iv.c (get_simple_loop_desc): Invoke new analyze_loop_insns
        function, and add guards to enable this function to work for the
        outermost loop.
        * cfgloop.c (insn_has_fp_set, analyze_loop_insns): New functions.
        (num_loop_branches): Remove.
        * cfgloop.h (struct loop_desc): Added new fields to cache
 additional
        loop analysis information.
        (num_loop_branches): Remove.
        (analyze_loop_insns): Declare.
        * params.def (PARAM_MIN_ITER_UNROLL_WITH_BRANCHES): New param.
        (PARAM_UNROLL_OUTER_LOOP_BRANCH_BUDGET): Ditto.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187013)
 +++ doc/invoke.texi     (working copy)
 @@ -8842,6 +8842,12 @@ The maximum number of insns of an unswitched loop.
  @item max-unswitch-level
  The maximum number of branches unswitched in a single loop.

 +@item min-iter-unroll-with-branches
 +Minimum iteration count to ignore branch effects when unrolling.
 +
 +@item unroll-outer-loop-branch-budget
 +Maximum number of branches allowed in hot outer loop region after unroll.
 +
  @item lim-expensive
  The minimum cost of an expensive expression in the loop invariant motion.

 Index: loop-unroll.c
 ===
 --- loop-unroll.c       (revision 187013)
 +++ loop-unroll.c       (working copy)
 @@ -152,6 +152,99 @@ static void combine_var_copies_in_loop_exit (struc
                                             basic_block);
  static rtx get_expansion (struct var_to_expand *);

 +/* Compute the maximum number of times LOOP can be unrolled without
 exceeding
 +   a branch budget, which can increase branch mispredictions. The number
 of
 +   branches is computed by weighting each branch with its expected
 execution
 +   probability through the loop based on profile data. If no profile
 feedback
 +   data exists, simply return the current NUNROLL factor.  */
 +
 +static unsigned
 +max_unroll_with_branches(struct loop *loop, unsigned nunroll)
 +{
 +  struct loop *outer;
 +  struct niter_desc *outer_desc = 0;
 +  int outer_niters = 1;
 +  int frequent_iteration_threshold;
 +  unsigned branch_budget;
 +  struct niter_desc *desc = get_simple_loop_desc (loop);
 +
 +  /* Ignore loops with FP computation as these tend to benefit much more
 +     consistently from unrolling.  */
 +  if (desc-has_fp)
 +    return nunroll;
 +
 +  frequent_iteration_threshold = PARAM_VALUE
 (PARAM_MIN_ITER_UNROLL_WITH_BRANCHES);
 +  if (expected_loop_iterations (loop) = (unsigned)
 frequent_iteration_threshold)
 +    return nunroll;
 +
 +  /* If there was no profile 

unwinding fallbacks for powerpc32 on aix 5.2 and 5.3

2012-05-11 Thread Olivier Hainque
Hello,

We have been using these for a few years now. I have just verified
that they do allow exception propagation from signal for Ada on aix 5.3
with a recent mainline.

Posting here in case that is of interest, even though it was only
exercised on aix 5.2 and 5.3, for 32bit code generation (tested
on 64bit 5.2 kernel nevertheless).

Thanks in advance for your feedback,

Cheers,

Olivier

2012-05-11  Olivier Hainque  hain...@adacore.com

libgcc/
* config/rs6000/aix-unwind.h (ucontext_for): Helper for ...
(ppc_aix_fallback_frame_state): New, implementation of ...
(MD_FALLBACK_FRAME_STATE_FOR): Define.



aix-unw.dif
Description: video/dv




Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-11 Thread Olivier Hainque

On May 10, 2012, at 14:00 , Richard Guenther wrote:
 Of course the mere existence
 of DECL_OFFSET_ALIGN complicates matters for no good reasons (well,
 at least I did not find a good use of it until now ...).

I remember an old discussion about it ... hmm ...
 
   http://gcc.gnu.org/ml/gcc/2006-10/msg00019.html

 JIC that could be of use.

 Olivier




[PATCH] Remove find_new_referenced_vars

2012-05-11 Thread Richard Guenther

This removes the renaming variant of find_referenced_vars_in.  I'm
working towards removing the mark_symbols_for_renaming hammer.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-05-11  Richard Guenther  rguent...@suse.de

* tree-flow.h (referenced_var_check_and_insert): Remove.
(find_new_referenced_vars): Likewise.
* tree-dfa.c (referenced_var_check_and_insert): Make static.
(find_new_referenced_vars_1, find_new_referenced_vars): Remove.
* tree-inline.c (copy_bb): Use find_referenced_vars_in
instead of find_new_referenced_vars.
* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.

Index: gcc/tree-flow.h
===
*** gcc/tree-flow.h (revision 187401)
--- gcc/tree-flow.h (working copy)
*** typedef struct
*** 332,338 
 (VAR) = next_referenced_var ((ITER)))
  
  extern tree referenced_var_lookup (struct function *, unsigned int);
- extern bool referenced_var_check_and_insert (tree);
  #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun))
  
  #define num_ssa_names (VEC_length (tree, cfun-gimple_df-ssa_names))
--- 332,337 
*** extern tree get_virtual_var (tree);
*** 496,502 
  extern bool add_referenced_var (tree);
  extern void remove_referenced_var (tree);
  extern void mark_symbols_for_renaming (gimple);
- extern void find_new_referenced_vars (gimple);
  extern tree make_rename_temp (tree, const char *);
  extern void set_default_def (tree, tree);
  extern tree gimple_default_def (struct function *, tree);
--- 495,500 
Index: gcc/tree-dfa.c
===
*** gcc/tree-dfa.c  (revision 187401)
--- gcc/tree-dfa.c  (working copy)
*** find_vars_r (tree *tp, int *walk_subtree
*** 462,470 
return NULL_TREE;
  }
  
! /* Find referenced variables in STMT.  In contrast with
!find_new_referenced_vars, this function will not mark newly found
!variables for renaming.  */
  
  void
  find_referenced_vars_in (gimple stmt)
--- 462,468 
return NULL_TREE;
  }
  
! /* Find referenced variables in STMT.  */
  
  void
  find_referenced_vars_in (gimple stmt)
*** referenced_var_lookup (struct function *
*** 505,511 
  /* Check if TO is in the referenced_vars hash table and insert it if not.
 Return true if it required insertion.  */
  
! bool
  referenced_var_check_and_insert (tree to)
  {
tree h, *loc;
--- 503,509 
  /* Check if TO is in the referenced_vars hash table and insert it if not.
 Return true if it required insertion.  */
  
! static bool
  referenced_var_check_and_insert (tree to)
  {
tree h, *loc;
*** mark_symbols_for_renaming (gimple stmt)
*** 667,704 
  }
  
  
- /* Find all variables within the gimplified statement that were not
-previously visible to the function and add them to the referenced
-variables list.  */
- 
- static tree
- find_new_referenced_vars_1 (tree *tp, int *walk_subtrees,
-   void *data ATTRIBUTE_UNUSED)
- {
-   tree t = *tp;
- 
-   if (TREE_CODE (t) == VAR_DECL  !var_ann (t))
- {
-   add_referenced_var (t);
-   mark_sym_for_renaming (t);
- }
- 
-   if (IS_TYPE_OR_DECL_P (t))
- *walk_subtrees = 0;
- 
-   return NULL;
- }
- 
- 
- /* Find any new referenced variables in STMT.  */
- 
- void
- find_new_referenced_vars (gimple stmt)
- {
-   walk_gimple_op (stmt, find_new_referenced_vars_1, NULL);
- }
- 
- 
  /* If EXP is a handled component reference for a structure, return the
 base variable.  The access range is delimited by bit positions *POFFSET and
 *POFFSET + *PMAX_SIZE.  The access size is *PSIZE bits.  If either
--- 657,662 
Index: gcc/tree-inline.c
===
*** gcc/tree-inline.c   (revision 187401)
--- gcc/tree-inline.c   (working copy)
*** copy_bb (copy_body_data *id, basic_block
*** 1794,1800 
  ssa_op_iter i;
  tree def;
  
! find_new_referenced_vars (gsi_stmt (copy_gsi));
  FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
if (TREE_CODE (def) == SSA_NAME)
  SSA_NAME_DEF_STMT (def) = stmt;
--- 1794,1800 
  ssa_op_iter i;
  tree def;
  
! find_referenced_vars_in (gsi_stmt (copy_gsi));
  FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
if (TREE_CODE (def) == SSA_NAME)
  SSA_NAME_DEF_STMT (def) = stmt;
Index: gcc/gimple-fold.c
===
*** gcc/gimple-fold.c   (revision 187401)
--- gcc/gimple-fold.c   (working copy)
*** gimplify_and_update_call_from_tree (gimp
*** 625,631 
new_stmt = gsi_stmt (i);
/* The replacement can 

[Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-05-11 Thread Tobias Burnus

Dear all,

the following patch fixes

  character(len=:), allocatable :: str
  str = repeat('X', n)

Before the patch, the (re)allocation happened before the RHS was 
evaluated. Thus, the string length wasn't known! (Work around: Add 
parentheses around the function call.)



Note that the patch assumes that the function's result variable's length 
specification expression is completely known to the caller. I think 
that's always the case in gfortran - or is it not?.


I am not sure whether once can construct a valid program using INTERFACE 
where that's not the case. The answer is probably no, given:
If a type parameter of a function result or a bound of a function 
result array is not a constant expression, the exact dependence on the 
entities in the expression is a characteristic. (F2008, 12.3.3). -- 
Nevertheless, one has to think about whether there could be some issue. 
If there is, I have no idea how one could handle it instead. Thus, there 
be better no issue!



I am also not positive that len=: pointer results are properly handled, 
but probably they are. (Untested; character array results are also 
untested - and array constructors are known to be broken.)


Comments?

The attached patch builds and regtests.
OK for the trunk?

Tobias
2012-05-11  Tobias Burnus  bur...@net-b.de

	PR fortran/49110
	PR fortran/51055
	* trans-expr.c (gfc_trans_assignment_1): Fix allocation
	handling for assignment of function results to allocatable
	deferred-length strings.

2012-05-11  Tobias Burnus  bur...@net-b.de

	PR fortran/49110
	PR fortran/51055
	* gfortran.dg/deferred_type_param_3.f90: New.
	* gfortran.dg/deferred_type_param_4.f90: New.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 8045b1f..8126697 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -7002,17 +7003,18 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
 {
   tmp = gfc_deallocate_alloc_comp (expr2-ts.u.derived, rse.expr, 0);
   gfc_add_expr_to_block (loop.post, tmp);
 }
 
-  /* For a deferred character length function, the function call must
- happen before the (re)allocation of the lhs, otherwise the character
- length of the result is not known.  */
-  def_clen_func = (((expr2-expr_type == EXPR_FUNCTION)
-			   || (expr2-expr_type == EXPR_COMPCALL)
-			   || (expr2-expr_type == EXPR_PPC))
-		expr2-ts.deferred);
+  /* When assigning a character function result to a deferred-length variable,
+ the function call must happen before the (re)allocation of the lhs -
+ otherwise the character length of the result is not known.
+ NOTE: This relies on having the exact dependence of the length type
+ parameter available to the caller; gfortran save it in the .mod files. */
+  def_clen_func = (expr2-expr_type == EXPR_FUNCTION
+		   || expr2-expr_type == EXPR_COMPCALL
+		   || expr2-expr_type == EXPR_PPC);
   if (gfc_option.flag_realloc_lhs
 	 expr2-ts.type == BT_CHARACTER
 	 (def_clen_func || expr2-expr_type == EXPR_OP)
 	 expr1-ts.deferred)
 gfc_add_block_to_block (block, rse.pre);
--- /dev/null	2012-05-11 07:40:56.923831265 +0200
+++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_3.f90	2012-05-11 12:18:46.0 +0200
@@ -0,0 +1,33 @@
+! { dg-do run }
+!
+! PR fortran/51055
+! PR fortran/49110
+!
+
+subroutine test()
+  implicit none
+  integer :: i = 5
+  character(len=:), allocatable :: s1
+  call sub(s1, i)
+  if (len(s1) /= 5) call abort()
+  if (s1 /= Z) call abort()
+contains
+  subroutine sub(str,j)
+character(len=:), allocatable :: str
+integer :: j
+str = REPEAT(Z,j)
+if (len(str) /= 5) call abort()
+if (str /= Z) call abort()
+  end subroutine sub
+end subroutine test
+
+program a
+ character(len=:),allocatable :: s
+ integer :: j=2
+ s = repeat ('x', j)
+ if (len(repeat(' ',j)) /= 2) call abort()
+ if (repeat('y',j) /= yy) call abort()
+ if (len(s) /= 2) call abort()
+ if (s /= xx) call abort()
+ call test()
+end program a
--- /dev/null	2012-05-11 07:40:56.923831265 +0200
+++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_4.f90	2012-05-11 12:22:30.0 +0200
@@ -0,0 +1,33 @@
+! { dg-do run }
+!
+! PR fortran/51055
+! PR fortran/49110
+!
+!
+program test
+  implicit none
+  character(len=:), allocatable :: str
+  integer :: i
+  i = 5
+  str = f()
+  call printIt ()
+  i = 7
+  str = repeat('X', i)
+  call printIt ()
+contains
+  function f()
+character(len=i) :: f
+f = '1234567890'
+  end function f
+  subroutine printIt
+!print *, len(str)
+!print '(3a)', '',str,''
+if (i == 5) then
+  if (str /= 12345 .or. len(str) /= 5) call abort ()
+else if (i == 7) then
+  if (str /= XXX .or. len(str) /= 7) call abort ()
+else
+  call abort ()
+end if
+  end subroutine
+end


Re: patch for PR53125

2012-05-11 Thread Vladimir Makarov

On 05/11/2012 08:17 AM, Richard Earnshaw wrote:

On 10/05/12 20:58, Vladimir Makarov wrote:

The following patch is for PR53125.  The PR is described on
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53125.

The patch improves the compilation speed by 35% for the case.

The patch was successfully bootstrapped on x86-64.

Committed as rev. 187373.

2012-05-10  Vladimir Makarovvmaka...@redhat.com

  PR rtl-optimization/53125
  * ira.c (ira): Call find_moveable_pseudos and
  move_unallocated_pseudos if only ira_conflicts_p is true.




ENOPATCH?


Sorry for ommitting the patch.  I added it to the attachment.


Index: ira.c
===
--- ira.c   (revision 187372)
+++ ira.c   (working copy)
@@ -4125,7 +4125,12 @@ ira (FILE *f)
 }

   allocated_reg_info_size = max_reg_num ();
-  find_moveable_pseudos ();
+
+  /* It is not worth to do such improvement when we use a simple
+ allocation because of -O0 usage or because the function is too
+ big.  */
+  if (ira_conflicts_p)
+find_moveable_pseudos ();

   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset ();
@@ -4234,7 +4239,10 @@ ira (FILE *f)
  max_regno * sizeof (struct ira_spilled_reg_stack_slot));
 }
   allocate_initial_values (reg_equivs);
-  move_unallocated_pseudos ();
+
+  /* See comment for find_moveable_pseudos call.  */
+  if (ira_conflicts_p)
+move_unallocated_pseudos ();
 }

 static void


Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin

2012-05-11 Thread Ramana Radhakrishnan


 You also don't mention how this patch was tested.

 I used the testsuite I developed some time ago to test all the Neon
 builtins, which I posted last year on the qemu mailing-list. With the
 current GCCs, this bug is the only remaining one I could detect.


Fair enough.



  Alternatively it might be worth splitting the
 vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg)  0 )
 followed by a subreg to subreg move which should end up having the
 same effect . That splitting would allow for better instruction
 scheduling.

 Are you aware of examples of similar cases I could use as a model?

I would change the iterator from VQX to VQ in the pattern above (you
can also simplify the setting of neon_type in that case as well as
change that to be a vec_duplicate as below and get rid of any
lingering definitions of UNSPEC_VLD1_DUP if they exist), define a
separate pattern that expressed this as a define_insn_and_split as
below.

 (define_insn_and_split neon_vld1_dupv2di
   [(set (match_operand:V2DI 0 s_register_operand =w)
 (vec_duplicate:V2DI (match_operand:DI 1 neon_struct_operand Um)))]
   TARGET_NEON
   #
reload_completed
   [(const_int 0)]
   {
rtx tmprtx = gen_lowpart (DImode, operands[0]);
emit_insn (gen_neon_vld1_dupdi (tmprtx, operands[1]));
emit_move_insn (gen_highpart (DImode, operands[0]), tmprtx );
DONE;
}
(set_attr length 8)
(set_attr neon_type fromearlierpattern)
)

Do you want to try this and see what you get ?



  In addition it would be nice to have a testcase in
 gcc.target/arm .

 Well. Prior to sending my patch I did look at that directory, but I supposed
 that such a test ought to belong to the neon/ subdir where the tests are
 described as autogenerated. Any doc on how to do that?

 I'd rather have an extra regression test in gcc.target/arm that was a
run time test. for e.g. take a look at gcc.target/arm/neon-vadds64.c .

Ramana


 Thanks,

 Christophe.



Re: [RFC] PR 53063 encode group options in .opt files

2012-05-11 Thread Manuel López-Ibáñez
On 10 May 2012 16:05, Joseph S. Myers jos...@codesourcery.com wrote:
 On Wed, 9 May 2012, Manuel López-Ibáñez wrote:

 2012-05-09  Manuel López-Ibáñez  m...@gcc.gnu.org

       PR 53063
 gcc/
       * doc/options.texi (EnabledBy): Document
       * opts.c: Include opts.h and options.h before tm.h.
       (finish_options): Do not handle some sub-options here...
       (common_handle_option): ... instead call common_handle_option_auto 
 here.
       * optc-gen.awk: Handle EnabledBy.
       * opth-gen.awk: Declare common_handle_option_auto.
       * common.opt (Wuninitialized): Use EnabledBy. Delete Init.
       (Wmaybe-uninitialized): Likewise.
         (Wunused-but-set-variable): Likewise.
         (Wunused-function): Likewise.
         (Wunused-label): Likewise.
         (Wunused-value): Likewise.
         (Wunused-variable): Likewise.
         * opt-read.awk: Create opt_numbers array.
 ada/
       * gcc-interface/misc.c (gnat_parse_file): Move before ...
       (gnat_handle_option): ... this. Use handle_generated_option.
 c-family/
       * c-opts.c (c_common_handle_option): Use handle_generated_option
       to enable sub-options.
 fortran/
       * options.c: Include diagnostics.h instead of
       diagnostics-core.h.
       (set_Wall): Do not see warn_unused here.
       (gfc_handle_option): Set it here using handle_generated_option.

 OK with the

 \ No newline at end of file

 fixed (i.e. a newline added to the end of optc-gen.awk) and 2012 added to
 copyright dates of modified files if not already there.

Great! Now we have EnabledBy for common options. Now, what should we
do with language specific settings? One idea could be to have
something like:

LangEnabledBy(Fortran Ada, Wall)

and then auto-generate something like:

ada_handle_option_auto();
fortran_handle_option_auto()

which would be called by the *_handle_option() of each front-end.

Does this sound reasonable?

Cheers,

Manuel.


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-11 Thread Xinliang David Li
On Fri, May 11, 2012 at 1:49 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Thu, May 10, 2012 at 6:28 PM, Xinliang David Li davi...@google.com wrote:
 I like your suggestion and support the end goal you have.  I don't
 like the -fopt-info behavior to interfere with regular -fdump-xxx
 options either.

 I think we should stage the changes in multiple steps as originally
 planned. Is Sharad's change good to be checked in for the first stage?

 Well - I don't think the change walks in the direction we want to go, so I 
 don't
 see a good reason to make that change.

Is your direction misunderstood or you want everything to be changed
in one single patch (including replacement of all fprintf (dump_file,
...)? If the former, please clarify. If the latter, it can be done.

thanks,

David


 After this one is checked in, the new dump interfaces will be worked
 on (and to allow multiple streams). Most of the remaining changes will
 be massive text replacement.

 thanks,

 David


 On Thu, May 10, 2012 at 1:18 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com 
 wrote:
 Bummer.  I was thinking to reserve '=' for selective  dumping:

 -fdump-tree-pre=func_list_regexp

 I guess this can be achieved via @

 -fdump-tree-pre@func_list

 -fdump-tree-pre=file_name@func_list


 Another issue -- I don't think the current precedence rule is correct.
 Consider that -fopt-info=2 will be mapped to

 -fdump-tree-all-transform-verbose2=stderr
 -fdump-rtl-all-transform-verbose2=stderr

 then

 the current precedence rule will cause surprise when the following is used

 -fopt-info -fdump-tree-pre

 The PRE dump will be emitted to stderr which is not what user wants.
 In short, special streams should be treated as 'weak' the same way as
 your previous implementation.

 Hm, this raises a similar concern I have with the -fvectorizer-verbose flag.
 With -fopt-info -fdump-tree-pre I do not want some information to be
 present only on stderr or in the dump file!  I want it in _both_ places!
 (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less
 information :()

 Thus, the information where dumping goes has to be done differently
 (which is why I asked for some re-org originally, so that passes no
 longer explicitely reference dump_file - dump_file may be different
 for different kind of information it dumps!).  Passes should, instead of

  fprintf (dump_file, ..., ...)

 do

  dump_printf (TDF_scev, ..., ...)

 thus, specify the kind of information they dump (would be mostly
 TDF_details vs. 0 today I guess).  The dump_printf routine would
 then properly direct to one or more places to dump at.

 I realize this needs some more dispatchers for dumping expressions
 and statements (but it should not be too many).  Dumping to
 dump_file would in any case dump to the passes private dump file
 only (unqualified stuff would never be useful for -fopt-info).

 The perfect candidate to convert to this kind of scheme is obviously
 the vectorizer with its existing -fvectorizer-verbose.

 If the patch doesn't work towards this kind of end-result I'd rather
 not have it.

 Thanks,
 Richard.

 thanks,

 David



 On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for your suggestions/comments. I have updated the patch and
 documentation. It supports the following usage:

 gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
 -fdump-rtl-ira=ira.dump

 Here all tree dumps except the PRE are output into tree.dump, PRE dump
 goes to stdout and the IRA dump goes to ira.dump.

 Thanks,
 Sharad

 2012-05-09   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for the new option.
        * tree-dump.c (dump_get_standard_stream): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle dump filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_end): Remove attribute.
        (dump_enable_all): Add new parameter FILENAME.
        All callers updated.
        (enable_rtl_dump_file):
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field FILENAME.
        * testsuite/g++.dg/other/dump-filename-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the 

Re: PING: PATCH: Backport x32 support to libtool

2012-05-11 Thread H.J. Lu
On Mon, Apr 16, 2012 at 10:47 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Apr 3, 2012 at 7:49 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Mar 29, 2012 at 7:34 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Sat, Mar 3, 2012 at 9:54 AM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 This patch backports x32 support to libtool:

 http://git.savannah.gnu.org/cgit/libtool.git/commit/?id=88992fe6771ec3258bde1b03314ce579da0ac2d5

 OK to install?

 Thanks.


 H.J.
 ---
 ommit 0d8c092cac25c3bce5dbfc1981b84df91b3f6086
 Author: H.J. Lu hjl.to...@gmail.com
 Date:   Mon Dec 12 13:03:14 2011 -0800

    Add x32 support to libtool.m4

    2011-12-12  H.J. Lu  hongjiu...@intel.com

        * libtool.m4 (_LT_ENABLE_LOCK): Support x32.

 diff --git a/ChangeLog.x32 b/ChangeLog.x32
 new file mode 100644
 index 000..b6e01ee
 --- /dev/null
 +++ b/ChangeLog.x32
 @@ -0,0 +1,3 @@
 +2011-12-12  H.J. Lu  hongjiu...@intel.com
 +
 +       * libtool.m4 (_LT_ENABLE_LOCK): Support x32.
 diff --git a/libtool.m4 b/libtool.m4
 index 67321a7..a7f99ac 100644
 --- a/libtool.m4
 +++ b/libtool.m4
 @@ -1232,7 +1232,14 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
            LD=${LD-ld} -m elf_i386_fbsd
            ;;
          x86_64-*linux*)
 -           LD=${LD-ld} -m elf_i386
 +           case `/usr/bin/file conftest.o` in
 +             *x86-64*)
 +               LD=${LD-ld} -m elf32_x86_64
 +               ;;
 +             *)
 +               LD=${LD-ld} -m elf_i386
 +               ;;
 +           esac
            ;;
          ppc64-*linux*|powerpc64-*linux*)
            LD=${LD-ld} -m elf32ppclinux


 Hi Ralf,

 Can you review this patch?

 Thanks.


 PING.


 PING.


PING.


-- 
H.J.


Re: PATCH: Add x32 support to boehm-gc

2012-05-11 Thread H.J. Lu
On Sun, Apr 29, 2012 at 10:27 AM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 This patch adds x32 support to boehm-gc.  The same patch has been
 sent to the boehm-gc mailing list. Tested on Linux/x32 and Linux/x86-64.
 OK for trunk?

 Thanks.


 H.J.
 - Forwarded message from H.J. Lu hongjiu...@intel.com -
 Date: Mon, 16 Apr 2012 09:39:20 -0700
 From: H.J. Lu hongjiu...@intel.com
 To: H.J. Lu hjl.to...@gmail.com
 Cc: hans_bo...@hp.com, g...@linux.hpl.hp.com
 Subject: [bdwgc] PATCH: Add x32 support
 User-Agent: Mutt/1.5.21 (2010-09-15)

 Hi,

 Here are 2 small patches to add x32 support to bdwgc.  X32 info can be
 found at

 https://sites.google.com/site/x32abi/

 They are fully tested on Linux/x32 and Linux/x86-64.

 Thanks.


 H.J.
 ---
 From 16ea9de35f16f0859c40862f8ef310c0dde6082c Mon Sep 17 00:00:00 2001
 From: H.J. Lu hjl.to...@gmail.com
 Date: Mon, 16 Apr 2012 09:26:07 -0700
 Subject: [PATCH 1/2] Define ALIGNMENT and CPP_WORDSZ for x32

        * include/private/gcconfig.h: (ALIGNMENT): Set to 4 for x32.
        (CPP_WORDSZ): Set to 32 for x32.
 diff --git a/include/private/gcconfig.h b/include/private/gcconfig.h
 index a163e47..918d100 100644
 --- a/include/private/gcconfig.h
 +++ b/include/private/gcconfig.h
 @@ -2117,8 +2117,13 @@

  # ifdef X86_64
  #   define MACH_TYPE X86_64
 -#   define ALIGNMENT 8
 -#   define CPP_WORDSZ 64
 +#   ifdef __ILP32__
 +#     define ALIGNMENT 4
 +#     define CPP_WORDSZ 32
 +#   else
 +#     define ALIGNMENT 8
 +#     define CPP_WORDSZ 64
 +#   endif
  #   ifndef HBLKSIZE
  #     define HBLKSIZE 4096
  #   endif
 --
 1.7.6.5


Hi Uros,

This patch has been checked into upstream:

https://github.com/ivmai/bdwgc/commit/936c1d5f7b8e8e91f7263bbff884a9d2377951f2

Is this OK for trunk?

Thanks.



-- 
H.J.


[Patch,AVR]: ad PR49868: assemble 3-byte symbols

2012-05-11 Thread Georg-Johann Lay
Currently avr_assemble_integer emits an assembler warning to
make gas complain about missing feature

http://sourceware.org/PR13503

if a 3-byte address must be assembled.

As PR13503 is implemented now, avr-gcc can use this feature.

It's only needed for the new __memx address space.

Ok to install?

The feature is new in gcc and binutils and only rarely
used, IMHO. The current solution will throw a comprehensible
warning but generate bad code, with the patch old binutils
will throw a not really helpful diagnose but work with
new binutils...

Thus okay to backport? Or leave 4.7 as is?

Johann

PR target/46898
* config/avr/avr.c (avr_const_address_lo16): Remove.
(avr_assemble_integer):  Print .byte lo8(x),
.byte hi8(x), .byte hh8(x) instead of emit an assembler
.warning if 3-byte address is assembled.

* doc/extend.texi (AVR Named Address Spaces): Document that
binutils 2.23 is needed to assemble 3-byte addresses.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 187411)
+++ config/avr/avr.c	(working copy)
@@ -6639,48 +6639,6 @@ _reg_unused_after (rtx insn, rtx reg)
 }
 
 
-/* Return RTX that represents the lower 16 bits of a constant address.
-   Unfortunately, simplify_gen_subreg does not handle this case.  */
-
-static rtx
-avr_const_address_lo16 (rtx x)
-{
-  rtx lo16;
-  
-  switch (GET_CODE (x))
-{
-default:
-  break;
-  
-case CONST:
-  if (PLUS == GET_CODE (XEXP (x, 0))
-   SYMBOL_REF == GET_CODE (XEXP (XEXP (x, 0), 0))
-   CONST_INT_P (XEXP (XEXP (x, 0), 1)))
-{
-  HOST_WIDE_INT offset = INTVAL (XEXP (XEXP (x, 0), 1));
-  const char *name = XSTR (XEXP (XEXP (x, 0), 0), 0);
-  
-  lo16 = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
-  lo16 = gen_rtx_CONST (Pmode, plus_constant (Pmode, lo16, offset));
-  
-  return lo16;
-}
-  
-  break;
-  
-case SYMBOL_REF:
-  {
-const char *name = XSTR (x, 0);
-
-return gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
-  }
-}
-  
-  avr_edump (\n%?: %r\n, x);
-  gcc_unreachable();
-}
-
-
 /* Target hook for assembling integer objects.  The AVR version needs
special handling for references to certain labels.  */
 
@@ -6688,7 +6646,7 @@ static bool
 avr_assemble_integer (rtx x, unsigned int size, int aligned_p)
 {
   if (size == POINTER_SIZE / BITS_PER_UNIT  aligned_p
-   text_segment_operand (x, VOIDmode) )
+   text_segment_operand (x, VOIDmode))
 {
   fputs (\t.word\tgs(, asm_out_file);
   output_addr_const (asm_out_file, x);
@@ -6698,15 +6656,17 @@ avr_assemble_integer (rtx x, unsigned in
 }
   else if (GET_MODE (x) == PSImode)
 {
-  default_assemble_integer (avr_const_address_lo16 (x),
-GET_MODE_SIZE (HImode), aligned_p);
+  /* This needs binutils 2.23+, see PR binutils/13503  */
+
+  fputs (\t.byte\tlo8(, asm_out_file);
+  output_addr_const (asm_out_file, x);
+  fputs ()\n, asm_out_file);
   
-  fputs (\t.warning\t\assembling 24-bit address needs binutils
-  extension for hh8(, asm_out_file);
+  fputs (\t.byte\thi8(, asm_out_file);
   output_addr_const (asm_out_file, x);
-  fputs ()\\n, asm_out_file);
+  fputs ()\n, asm_out_file);
   
-  fputs (\t.byte\t0\t ASM_COMMENT_START  hh8(, asm_out_file);
+  fputs (\t.byte\thh8(, asm_out_file);
   output_addr_const (asm_out_file, x);
   fputs ()\n, asm_out_file);
   
Index: doc/extend.texi
===
--- doc/extend.texi	(revision 187411)
+++ doc/extend.texi	(working copy)
@@ -1273,6 +1273,7 @@ If the high bit of the address is set, d
 RAM using the lower two bytes as RAM address.
 If the high bit of the address is clear, data is read from flash
 with @code{RAMPZ} set according to the high byte of the address.
+@xref{AVR Built-in Functions,,@code{__builtin_avr_flash_segment}}.
 
 Objects in this address space will be located in @code{.progmem.data}.
 @end table
@@ -1302,6 +1303,7 @@ int main (void)
 @}
 @end example
 
+@noindent
 For each named address space supported by avr-gcc there is an equally
 named but uppercase built-in macro defined. 
 The purpose is to facilitate testing if respective address space
@@ -1327,7 +1329,8 @@ int read_var (void)
 #endif /* __FLASH */
 @end example
 
-Notice that attribute @ref{AVR Variable Attributes,@code{progmem}}
+@noindent
+Notice that attribute @ref{AVR Variable Attributes,,@code{progmem}}
 locates data in flash but
 accesses to these data will read from generic address space, i.e.@:
 from RAM,
@@ -1335,6 +1338,7 @@ so that you need special accessors like
 from @w{@uref{http://nongnu.org/avr-libc/user-manual,AVR-LibC}}
 together with attribute @code{progmem}.
 
+@noindent
 @b{Limitations and 

Re: [RFC] PR 53063 encode group options in .opt files

2012-05-11 Thread Joseph S. Myers
On Fri, 11 May 2012, Manuel López-Ibáñez wrote:

 Great! Now we have EnabledBy for common options. Now, what should we
 do with language specific settings? One idea could be to have
 something like:
 
 LangEnabledBy(Fortran Ada, Wall)
 
 and then auto-generate something like:
 
 ada_handle_option_auto();
 fortran_handle_option_auto()
 
 which would be called by the *_handle_option() of each front-end.
 
 Does this sound reasonable?

What cases do we have where a language-independent option enables another 
language-independent option only for some front ends?  That's the only 
case that should need a language-dependent generated function here.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [RFC] PR 53063 encode group options in .opt files

2012-05-11 Thread Manuel López-Ibáñez
On 11 May 2012 19:04, Joseph S. Myers jos...@codesourcery.com wrote:
 On Fri, 11 May 2012, Manuel López-Ibáñez wrote:

 Great! Now we have EnabledBy for common options. Now, what should we
 do with language specific settings? One idea could be to have
 something like:

 LangEnabledBy(Fortran Ada, Wall)

 and then auto-generate something like:

 ada_handle_option_auto();
 fortran_handle_option_auto()

 which would be called by the *_handle_option() of each front-end.

 Does this sound reasonable?

 What cases do we have where a language-independent option enables another
 language-independent option only for some front ends?  That's the only
 case that should need a language-dependent generated function here.

Wall enables Wunused in Ada Fortran and C family.

Cheers,

Manuel.


Re: [RFC] PR 53063 encode group options in .opt files

2012-05-11 Thread Manuel López-Ibáñez
On 11 May 2012 19:09, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 On 11 May 2012 19:04, Joseph S. Myers jos...@codesourcery.com wrote:
 On Fri, 11 May 2012, Manuel López-Ibáñez wrote:

 Great! Now we have EnabledBy for common options. Now, what should we
 do with language specific settings? One idea could be to have
 something like:

 LangEnabledBy(Fortran Ada, Wall)

 and then auto-generate something like:

 ada_handle_option_auto();
 fortran_handle_option_auto()

 which would be called by the *_handle_option() of each front-end.

 Does this sound reasonable?

 What cases do we have where a language-independent option enables another
 language-independent option only for some front ends?  That's the only
 case that should need a language-dependent generated function here.

 Wall enables Wunused in Ada Fortran and C family.

But the case of lang-indep enables lang-dependent is far more common:
Wextra enables some C-only options.

Cheers,

Manuel.


Re: [Fortran] Patch ping

2012-05-11 Thread Tobias Burnus

On 18 April 2012 at 18:57, Bernhard Reutner-Fischer wrote:

On Tue, Apr 17, 2012 at 12:47:48AM +0200, Tobias Burnus wrote:

Approved but not yet committed:
Bernhard:
- [PATCH] gfortran testsuite: implicitly cleanup-modules, part 2
  http://gcc.gnu.org/ml/fortran/2012-04/msg00065.html

Before actually pushing this, I ment to ask if we *want* to make
sure that we do not add superfluous cleanup-module calls in the
future (which would slow down testing needlessly)?

If so we would either have to manually reject occurances of those during
patch-review or install a warning or the like if there is an explicit
cleanup-modules call yielding the same set as the now automatically
determined set.


I would go for the manual method: As cleanup-modules is something which 
developers tend to forget, I do not think that many patches will include 
them. On then simply tries to reduce those by patch review. - If patch 
developers do not see it in other files, the chance is high that they do 
not even know (or remember) about that feature in a few months.


And after some time (1/2 year, 1 year?), one can check whether a 
spurious clean-up modules has slipped in - or whether some 
cleanup-module is missing. I expect that there will be none or very, 
very few cases.


Tobias


Re: PATCH: Add x32 support to boehm-gc

2012-05-11 Thread Uros Bizjak
On Fri, May 11, 2012 at 6:55 PM, H.J. Lu hjl.to...@gmail.com wrote:

 This patch adds x32 support to boehm-gc.  The same patch has been
 sent to the boehm-gc mailing list. Tested on Linux/x32 and Linux/x86-64.
 OK for trunk?

 This patch has been checked into upstream:

 https://github.com/ivmai/bdwgc/commit/936c1d5f7b8e8e91f7263bbff884a9d2377951f2

 Is this OK for trunk?

OK, we sync with upstream.

Thanks,
Uros.


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-11 Thread Xinliang David Li
To be more specific, does the following match what your envisioned?

1) when multiple streams are specified for dumping, the information
will be dumped to all streams IF the new dumping interfaces are used
(see below). For legacy code, the default dump file will still be used
and the user specified file (including stderr) will be silently
ignored. This is of course temporary until all dumping sites are
converted

2) Define the following new dumping interfaces which will honor all
streams specified

  dump_printf (const char*, );   // for raw string dumping
  dump_generic_expr (...); //  wrapper to print_generic_expr without
taking FILE*
  dump_node (...); // wrapper to print_node
  dump_gimple_stmt(...); // wrapper to print_gimple_stmt

  dump_inform (...); //wrapper to inform (..) method, but is aware of
of the dumping streams and modify global_dc appropriately.

3) Implement -fopt-info=N, and take -ftree-vectorizer-verbose as the
first guinea pig to use the new dumping interfaces.

After the infrastructure is ready, gradually deprecate the use of the
original dumper interfaces.

what do you think?

thanks,

David

On Fri, May 11, 2012 at 9:06 AM, Xinliang David Li davi...@google.com wrote:
 On Fri, May 11, 2012 at 1:49 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Thu, May 10, 2012 at 6:28 PM, Xinliang David Li davi...@google.com 
 wrote:
 I like your suggestion and support the end goal you have.  I don't
 like the -fopt-info behavior to interfere with regular -fdump-xxx
 options either.

 I think we should stage the changes in multiple steps as originally
 planned. Is Sharad's change good to be checked in for the first stage?

 Well - I don't think the change walks in the direction we want to go, so I 
 don't
 see a good reason to make that change.

 Is your direction misunderstood or you want everything to be changed
 in one single patch (including replacement of all fprintf (dump_file,
 ...)? If the former, please clarify. If the latter, it can be done.

 thanks,

 David


 After this one is checked in, the new dump interfaces will be worked
 on (and to allow multiple streams). Most of the remaining changes will
 be massive text replacement.

 thanks,

 David


 On Thu, May 10, 2012 at 1:18 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com 
 wrote:
 Bummer.  I was thinking to reserve '=' for selective  dumping:

 -fdump-tree-pre=func_list_regexp

 I guess this can be achieved via @

 -fdump-tree-pre@func_list

 -fdump-tree-pre=file_name@func_list


 Another issue -- I don't think the current precedence rule is correct.
 Consider that -fopt-info=2 will be mapped to

 -fdump-tree-all-transform-verbose2=stderr
 -fdump-rtl-all-transform-verbose2=stderr

 then

 the current precedence rule will cause surprise when the following is used

 -fopt-info -fdump-tree-pre

 The PRE dump will be emitted to stderr which is not what user wants.
 In short, special streams should be treated as 'weak' the same way as
 your previous implementation.

 Hm, this raises a similar concern I have with the -fvectorizer-verbose 
 flag.
 With -fopt-info -fdump-tree-pre I do not want some information to be
 present only on stderr or in the dump file!  I want it in _both_ places!
 (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less
 information :()

 Thus, the information where dumping goes has to be done differently
 (which is why I asked for some re-org originally, so that passes no
 longer explicitely reference dump_file - dump_file may be different
 for different kind of information it dumps!).  Passes should, instead of

  fprintf (dump_file, ..., ...)

 do

  dump_printf (TDF_scev, ..., ...)

 thus, specify the kind of information they dump (would be mostly
 TDF_details vs. 0 today I guess).  The dump_printf routine would
 then properly direct to one or more places to dump at.

 I realize this needs some more dispatchers for dumping expressions
 and statements (but it should not be too many).  Dumping to
 dump_file would in any case dump to the passes private dump file
 only (unqualified stuff would never be useful for -fopt-info).

 The perfect candidate to convert to this kind of scheme is obviously
 the vectorizer with its existing -fvectorizer-verbose.

 If the patch doesn't work towards this kind of end-result I'd rather
 not have it.

 Thanks,
 Richard.

 thanks,

 David



 On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for your suggestions/comments. I have updated the patch and
 documentation. It supports the following usage:

 gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
 -fdump-rtl-ira=ira.dump

 Here all tree dumps except the PRE are output into tree.dump, PRE dump
 goes to stdout and the IRA dump goes to ira.dump.

 Thanks,
 Sharad

 2012-05-09   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for the new option.
        * 

PATCH: PR target/53315: simple xtest program generates ICE

2012-05-11 Thread H.J. Lu
Hi,

This patch uses + in constraint and match_dup in xbegin_1.  OK for
trunk?

Andi, can you provide a run-time testcase patch with proper cpuid check?

Thanks.


H.J.
---
2012-05-11  Andrew Pinski  apin...@cavium.com
H.J. Lu  hongjiu...@intel.com

PR target/53315
* config/i386/i386.md (xbegin_1): Use + in constraint and
match_dup.
(xbegin): Updated.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c825579..b64d1d3 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18361,9 +18361,9 @@
 {
   rtx label = gen_label_rtx ();
 
-  operands[1] = force_reg (SImode, constm1_rtx);
+  operands[0] = force_reg (SImode, constm1_rtx);
 
-  emit_jump_insn (gen_xbegin_1 (operands[0], operands[1], label));
+  emit_jump_insn (gen_xbegin_1 (operands[0], label));
 
   emit_label (label);
   LABEL_NUSES (label) = 1;
@@ -18375,13 +18375,12 @@
   [(set (pc)
(if_then_else (ne (unspec [(const_int 0)] UNSPEC_XBEGIN_ABORT)
  (const_int 0))
- (label_ref (match_operand 2))
+ (label_ref (match_operand 1))
  (pc)))
-   (set (match_operand:SI 0 register_operand =a)
-   (unspec_volatile:SI [(match_operand:SI 1 register_operand 0)]
-   UNSPECV_XBEGIN))]
+   (set (match_operand:SI 0 register_operand +a)
+   (unspec_volatile:SI [(match_dup 0)] UNSPECV_XBEGIN))]
   TARGET_RTM
-  xbegin\t%l2
+  xbegin\t%l1
   [(set_attr type other)
(set_attr length 6)])
 


Re: [patch] support for multiarch systems

2012-05-11 Thread Matthias Klose
On 11.05.2012 12:51, Paolo Bonzini wrote:
 Il 11/05/2012 07:13, Matthias Klose ha scritto:
 ok, I did clarify it in the existing documentation of MULTIARCH_DIRNAME in
 fragments.texi, detailing the search order for the files. Should the search
 order be mentioned in some user documentation as well? if yes, where?
 
 Thanks!  I don't think the search order should be mentioned specially,
 since anyway *_INCLUDE_PATH and LIBRARY_PATH are mentioned.  However
 under MULTILIB_DIRNAMES I would add something like this:
 
 @code{MULTILIB_DIRNAMES} describes the multilib directories using GCC
 conventions and is applied to directories that are part of the GCC
 installation.  When multilib-enabled, the compiler will add a
 subdirectory of the form @var{prefix}/@var{multilib} before each
 directory in the search path for libraries and crt files.

done.

 +@findex MULTILIB_OSDIRNAMES
 +@item MULTILIB_OSDIRNAMES
 +If @code{MULTILIB_OPTIONS} is used, this variable specifies
 
 ... a list of subdirectory names, that are used to modify the search
 path depending on the chosen multilib.  Unlike @code{MULTILIB_DIRNAMES},
 @code{MULTILIB_OSDIRNAMES} describes the multilib directories using
 operating systems conventions, and is applied to the directories such as
 @code{lib} or those in the @env{LIBRARY_PATH} environment variable.
 
  The format is either the same as of
 +@code{MULTILIB_DIRNAMES}, or a set of mappings.  When it is the same
 +as @code{MULTILIB_DIRNAMES}, it describes the multilib directories
 +using OS conventions, rather than GCC conventions.
 
 s/OS/operating system/

done.

 When it is a set
 +of mappings of the form @var{gccdir}=@var{osdir}, the left side gives
 +the GCC convention and the right gives the equivalent OS defined
 +location.  If the @var{osdir} part begins with a @samp{!},
 
 ... GCC will not search in the non-multilib directory and use
 exclusively the multilib directory.  Otherwise, the compiler will
 examine the search path for libraries and crt files twice; the first
 time it will add @var{multilib} to each directory in the search path,
 the second it will not.
 
 Use the mapping when there is
 +no one-to-one equivalence between GCC levels and the OS.
 
 I'm not sure what you mean here?

The whole paragraph was taken from the genmultilib script. I left it out for
this version. I think I'll update the genmultilib script when these docs are
settled.

 +For multilib enabled configurations (see @code{MULTIARCH_DIRNAME})
 +below), the multilib name is appended to each directory name, separated
 +by a colon (e.g. @samp{../lib:x86_64-linux-gnu}).
 
 For configurations that support both multilib and multiarch,
 @code{MULTILIB_OSDIRNAMES} also encodes the multiarch name, thus
 subsuming @code{MULTIARCH_DIRNAME}.  The multiarch name is appended to
 each directory name, separated by a colon (e.g.
 @samp{../lib32:i386-linux-gnu}).
 
 Each multiarch subdirectory will be searched before the corresponding OS
 multilib directory, for example @samp{/lib/i386-linux-gnu} before
 @samp{/lib/..lib32}.  The multiarch name will also be used to modify the
 system header search path, as explained for @code{MULTIARCH_DIRNAME}.

done.

 +@findex MULTIARCH_DIRNAME
 +@item MULTIARCH_DIRNAME
 +If @code{MULTIARCH_DIRNAME} is used, this variable specifies the
 +multiarch name for this configuration.

 +Libraries and crt files are searched first in
 +@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix}
 +added by @code{add_prefix} or @code{add_sysrooted_prefix}.
 +System header files are searched first in
 +@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before
 +@code{LOCAL_INCLUDE_DIR}, and in
 +@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before
 +@code{NATIVE_SYSTEM_HEADER_DIR}.

 +@code{MULTIARCH_DIRNAME} is not used for multilib enabled
 +configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead.
 
 This sounds simpler, and doesn't refer to GCC implementation details
 such add_prefix/add_sysrooted_prefix:
 
 This variable specifies the multiarch name for configurations that are
 multiarch-enabled but not multilibbed configurations.
 
 The multiarch name is used to augment the search path for libraries, crt
 files and system header files with additional locations.  The compiler
 will add a multiarch subdirectory of the form
 @var{prefix}/@var{multiarch} before each directory in the library and
 crt search path.  It will also add two directories
 @code{LOCAL_INCLUDE_DIR}/@var{multiarch} and
 @code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch}) to the system header
 search path, respectively before @code{LOCAL_INCLUDE_DIR} and
 @code{NATIVE_SYSTEM_HEADER_DIR}.
 
 @code{MULTIARCH_DIRNAME} is not used for configurations that support
 both multilib and multiarch.  In that case, multiarch names are encoded
 in @code{MULTILIB_OSDIRNAMES} instead.

done. thanks for the wording.

 +The multiarch tuples are defined
 +in @uref{http://wiki.debian.org/Multiarch/Tuples}.
 
 Is this needed?  Aren't they defined 

Re: patch for PR53125

2012-05-11 Thread Steven Bosscher
 2012-05-10  Vladimir Makarovvmaka...@redhat.com

          PR rtl-optimization/53125
          * ira.c (ira): Call find_moveable_pseudos and
          move_unallocated_pseudos if only ira_conflicts_p is true.

And the attached patch fixes the reginfo slowness.

The reginfo pass was doing:

  for each bb:
for each insn in bb, from BB_END(bb) to BB_HEAD(bb):
  for each reg that died between BB_END(bb) and insn: // i.e.
register is live at insn
REG_LIVE_LENGTH(reg)++

With very large basic blocks, and the kind of code for the test case
of the PR results in many live registers for SPARC (for x86_64 there
are far fewer live registers). For SPARC, there are O(1e5) insns and
O(1e4) registers live for most of the basic block. That is effectively
almost quadratic behavior in the number of insns per basic block.

But the above loop is a bit silly. It is much easier and
computationally cheaper to just remember at what insn reg died (last
used), and add to REG_LIVE_LENGTH the distance from the insn that sets
reg to the insn that used reg.

It turns out that (before or after the patch) partial or conditional
sets never kill a register, so that REG_LIVE_LENGTH for registers set
by partial or conditional stores is not very accurate. I have a few
ideas for how to improve that situation a bit, but I'm not sure if
it's worth the trouble.

I also think that the computation of REG_FREQ and
REG_FREQ_CALLS_CROSSED has been wrong for some time. The maximum
REG_FREQ is REG_FREQ_MAX (see regs.h) but regstat.c just accumulates
REG_FREQ_FROM_BB. I've added a few lines to limit the REG_FREQ to
REG_FREQ_MAX, e.g.:

  REG_FREQ (dregno) += REG_FREQ_FROM_BB (bb);
+ REG_FREQ (dregno) =
+   MIN (REG_FREQ (dregno), REG_FREQ_MAX);

With this patch, the reginfo pass disappears from the time report for
the test case attached to PR53125, compiled at -O0.

Bootstrapped and tested on x86_64-unknown-linux-gnu, and I verified
for a few simple test cases that the computed REG_LIVE_LENGHTs are
unchanged. OK for trunk?

Ciao!
Steven


Re: patch for PR53125

2012-05-11 Thread Steven Bosscher
Now with patch

On Fri, May 11, 2012 at 8:42 PM, Steven Bosscher stevenb@gmail.com wrote:
 2012-05-10  Vladimir Makarovvmaka...@redhat.com

          PR rtl-optimization/53125
          * ira.c (ira): Call find_moveable_pseudos and
          move_unallocated_pseudos if only ira_conflicts_p is true.

 And the attached patch fixes the reginfo slowness.

 The reginfo pass was doing:

  for each bb:
    for each insn in bb, from BB_END(bb) to BB_HEAD(bb):
      for each reg that died between BB_END(bb) and insn: // i.e.
 register is live at insn
        REG_LIVE_LENGTH(reg)++

 With very large basic blocks, and the kind of code for the test case
 of the PR results in many live registers for SPARC (for x86_64 there
 are far fewer live registers). For SPARC, there are O(1e5) insns and
 O(1e4) registers live for most of the basic block. That is effectively
 almost quadratic behavior in the number of insns per basic block.

 But the above loop is a bit silly. It is much easier and
 computationally cheaper to just remember at what insn reg died (last
 used), and add to REG_LIVE_LENGTH the distance from the insn that sets
 reg to the insn that used reg.

 It turns out that (before or after the patch) partial or conditional
 sets never kill a register, so that REG_LIVE_LENGTH for registers set
 by partial or conditional stores is not very accurate. I have a few
 ideas for how to improve that situation a bit, but I'm not sure if
 it's worth the trouble.

 I also think that the computation of REG_FREQ and
 REG_FREQ_CALLS_CROSSED has been wrong for some time. The maximum
 REG_FREQ is REG_FREQ_MAX (see regs.h) but regstat.c just accumulates
 REG_FREQ_FROM_BB. I've added a few lines to limit the REG_FREQ to
 REG_FREQ_MAX, e.g.:

                  REG_FREQ (dregno) += REG_FREQ_FROM_BB (bb);
 +                 REG_FREQ (dregno) =
 +                   MIN (REG_FREQ (dregno), REG_FREQ_MAX);

 With this patch, the reginfo pass disappears from the time report for
 the test case attached to PR53125, compiled at -O0.

 Bootstrapped and tested on x86_64-unknown-linux-gnu, and I verified
 for a few simple test cases that the computed REG_LIVE_LENGHTs are
 unchanged. OK for trunk?

 Ciao!
 Steven


regstat_smarter_local_live_v2.diff
Description: Binary data


Re: [google/gcc-4_6] Port arm hardfp patch from Linaro (issue 6206055)

2012-05-11 Thread carrot

Please add ChangeLog and the trunk patch revision.

http://codereview.appspot.com/6206055/


Re: [patch, fortran] PR fortran/52537 Optimize string comparisons against empty strings

2012-05-11 Thread Thomas Koenig

Hi Tobias,


Hello Thomas,

below a very timely review - your patch is not even a month old
and was never pinged, besides, you have chosen an unlucky day.
(In other words: Sorry for the slow review.)


As a matter of fact, I had become 3/4 convinced that I had already
committed this patch, which is why I didn't ping it.  So, thanks
for remembering!


Thomas Koenig wrote on Fri, 13 Apr 2012:

this patch replaces  a != '' with len_trim(a) != 0, to
speed up the comparison.


I wonder how much it helps - especially for the real world
code. Let's see whether the bug reporter will report back.


We'll see.



Can you also check kind=4 string in the test case?


Added (in the second revision of the test case, I also had
a flag wrong).


Otherwise, the patch is OK.


The whitespace fixes you suggested have been applied.

Thanks for the review!

Thomas



Re: [RFC] PR 53063 encode group options in .opt files

2012-05-11 Thread Joseph S. Myers
On Fri, 11 May 2012, Manuel López-Ibáñez wrote:

  What cases do we have where a language-independent option enables another
  language-independent option only for some front ends?  That's the only
  case that should need a language-dependent generated function here.
 
  Wall enables Wunused in Ada Fortran and C family.

Is there a reason for it not to do so for all languages?  You only really 
need a front-end-specific generated function if there's something about 
the semantics that needs to be different in different front ends.

 But the case of lang-indep enables lang-dependent is far more common:
 Wextra enables some C-only options.

That shouldn't be a problem.  You can mark those options as enabled by 
-Wextra; at most you might need to ensure that enabling a wrong-language 
option this way doesn't generate the usual warning you get for using a 
wrong-language option on the command line.

There's nothing wrong with having separate autogenerated functions for 
each language if you want to split things out that way, but it would seem 
simpler just to have one function called somewhere central, whatever 
option it is (common or not) that is enabling other options.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [v3] fix libstdc++/53263

2012-05-11 Thread François Dumont

Attached patch applied to trunk.

2012-05-11  François Dumont fdum...@gcc.gnu.org

PR libstdc++/53263
* include/debug/safe_iterator.h (__gnu_debug::__base): Move...
* include/debug/functions.h: ... Here. Add debug function
overloads to perform checks on normal iterators when possible.
* include/debug/macros.h (__glibcxx_check_heap)
(__glibcxx_check_heap_pred): Use __gnu_debug::__base on iterator range.

I check my Bugzilla account Permissions page and it is written:

There are no permission bits set on your account.

I simply don't know who to ask permissions. Should I file a bugzilla 
entry for that ?


François

On 05/10/2012 11:18 PM, Paolo Carlini wrote:

Hi,

On 05/09/2012 11:02 PM, François Dumont wrote:

Here is a patch for PR 53263.

Tested under linux x86_64 debug mode.

Ok for trunk and 4.7 branch ?


Thanks. Considering that this isn't a regression and also that nobody 
reported the issue for so many years, the patch seems a bit largish to 
me to go into the branch. Thus, let's apply to mainline only and 
consider the issue closed. If people insist, seriously insist ;) we 
may reconsider for 4.7.2.


Thanks again!
Paolo.

PS: are you finally able to manage Bugzilla, yes?




Index: include/debug/functions.h
===
--- include/debug/functions.h	(revision 187292)
+++ include/debug/functions.h	(working copy)
@@ -31,7 +31,8 @@
 #define _GLIBCXX_DEBUG_FUNCTIONS_H 1
 
 #include bits/c++config.h
-#include bits/stl_iterator_base_types.h // for iterator_traits, categories
+#include bits/stl_iterator_base_types.h // for iterator_traits, categories and
+	  // _Iter_base
 #include bits/cpp_type_traits.h // for __is_integer
 #include debug/formatter.h
 
@@ -118,11 +119,8 @@
 inline bool
 __valid_range_aux(const _InputIterator __first,
 		  const _InputIterator __last, std::__false_type)
-  {
-typedef typename std::iterator_traits_InputIterator::iterator_category
-  _Category;
-return __valid_range_aux2(__first, __last, _Category());
-  }
+  { return __valid_range_aux2(__first, __last,
+			  std::__iterator_category(__first)); }
 
   /** Don't know what these iterators are, or if they are even
*  iterators (we may get an integral type for InputIterator), so
@@ -214,6 +212,15 @@
   return true;
 }
 
+  // For performance reason, as the iterator range has been validated, check on
+  // random access safe iterators is done using the base iterator.
+  templatetypename _Iterator, typename _Sequence
+inline bool
+__check_sorted_aux(const _Safe_iterator_Iterator, _Sequence __first,
+		   const _Safe_iterator_Iterator, _Sequence __last,
+		   std::random_access_iterator_tag __tag)
+  { return __check_sorted_aux(__first.base(), __last.base(), __tag); }
+
   // Can't check if an input iterator sequence is sorted, because we can't step
   // through the sequence.
   templatetypename _InputIterator, typename _Predicate
@@ -240,19 +247,28 @@
   return true;
 }
 
+  // For performance reason, as the iterator range has been validated, check on
+  // random access safe iterators is done using the base iterator.
+  templatetypename _Iterator, typename _Sequence,
+	   typename _Predicate
+inline bool
+__check_sorted_aux(const _Safe_iterator_Iterator, _Sequence __first,
+		   const _Safe_iterator_Iterator, _Sequence __last,
+		   _Predicate __pred,
+		   std::random_access_iterator_tag __tag)
+  { return __check_sorted_aux(__first.base(), __last.base(), __pred, __tag); }
+
   // Determine if a sequence is sorted.
   templatetypename _InputIterator
 inline bool
 __check_sorted(const _InputIterator __first, const _InputIterator __last)
 {
-  typedef typename std::iterator_traits_InputIterator::iterator_category
-_Category;
-
   // Verify that the  operator for elements in the sequence is a
   // StrictWeakOrdering by checking that it is irreflexive.
   __glibcxx_assert(__first == __last || !(*__first  *__first));
 
-  return __check_sorted_aux(__first, __last, _Category());
+  return __check_sorted_aux(__first, __last,
+std::__iterator_category(__first));
 }
 
   templatetypename _InputIterator, typename _Predicate
@@ -260,14 +276,12 @@
 __check_sorted(const _InputIterator __first, const _InputIterator __last,
_Predicate __pred)
 {
-  typedef typename std::iterator_traits_InputIterator::iterator_category
-_Category;
-
   // Verify that the predicate is StrictWeakOrdering by checking that it
   // is irreflexive.
   __glibcxx_assert(__first == __last || !__pred(*__first, *__first));
 
-  return __check_sorted_aux(__first, __last, __pred, _Category());
+  return __check_sorted_aux(__first, __last, __pred,
+std::__iterator_category(__first));
 }
 
   templatetypename _InputIterator
@@ -332,13 +346,11 @@
   return 

Re: [PATCH, Android] Stack protector enabling for Android target

2012-05-11 Thread Igor Zamyatin
Hi!

Please look at the modified patch in the attachment. ChangeLog remains the same.

Tested in android environment(x86_64-*-linux-android), also
bootstrapped on x86_64-unknown-linux-gnu.
I also started regtesting on linux. Is it ok after successfull regtesting?

Thanks,
Igor

On Mon, May 7, 2012 at 10:08 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, May 7, 2012 at 11:04 AM, Maxim Kuvyrkov ma...@codesourcery.com 
 wrote:
 On 6/05/2012, at 7:02 PM, Igor Zamyatin wrote:

 Hi!

 The patch enables stack protector for Android.
 Android targets don't contain necessary information in features.h so
 we explicitly enable stack protector for Android.

 Bootstrapped and regtested on x86_64. Ok to commit?

 Thanks,
 Igor

 2012-05-06  Igor Zamyatin  igor.zamya...@intel.com

        * configure.ac: Stack protector enabling for Android targets.
        * configure: Regenerate.


 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index 86b4bea..c1012d6 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -4545,6 +4545,8 @@ AC_CACHE_CHECK(__stack_chk_fail in target C library,
       gcc_cv_libc_provides_ssp,
       [gcc_cv_libc_provides_ssp=no
     case $target in
 +       *-android*)
 +         gcc_cv_libc_provides_ssp=yes;;
        *-*-linux* | *-*-kfreebsd*-gnu | *-*-knetbsd*-gnu)
       [# glibc 2.4 and later provides __stack_chk_fail and
       # either __stack_chk_guard, or TLS access to stack guard canary.


 What you really want is to enable stack protector for Bionic libc, which is 
 often synonymous with -android* target, but not always.  Let's enable ssp 
 based on whether __BIONIC__ is defined in the libc headers (i.e., add a grep 
 test for __BIONIC__ in sys/cdefs.h)

 Also please add a comment along the lines of all versions of Bionic support 
 stack protector.

 Which exact target did you test this on?  X86_64-*-* is a pretty broad 
 definition.


 We are working on x86_64-*-linux-android target,
 which uses x32.

 --
 H.J.


stack_protector.patch
Description: Binary data


Re: [PATCH, tree-optimization] Fix for PR 52868

2012-05-11 Thread Igor Zamyatin
Ping?

On Fri, Apr 27, 2012 at 4:42 PM, Igor Zamyatin izamya...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 6:41 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 4:32 PM, Igor Zamyatin izamya...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 1:14 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, Apr 25, 2012 at 10:56 AM, Igor Zamyatin izamya...@gmail.com 
 wrote:
 Hi all!

 I'd like to post for review the patch which makes some costs adjusting
 in get_computation_cost_at routine in ivopts part.
 As mentioned in the PR changes also fix the bwaves regression from PR 
 52272.
 Also changes introduce no degradations on spec2000/2006 and
 EEMBC1.1/2.0(this was measured on Atom) on x86


 Bootstrapped and regtested on x86. Ok to commit?

 I can't make sense of the patch and the comment does not help.

 +      diff_cost = cost.cost;
       cost.cost /= avg_loop_niter (data-current_loop);
 +      add_cost_val = add_cost (TYPE_MODE (ctype), data-speed);
 +      /* Do cost correction if address cost is small enough
 +         and difference cost is high enough.  */
 +      if (address_p  diff_cost  add_cost_val
 +           get_address_cost (symbol_present, var_present,
 +                               offset, ratio, cstepi,
 +                               TYPE_MODE (TREE_TYPE (utype)),
 +                               TYPE_ADDR_SPACE (TREE_TYPE (utype)),
 +                               speed, stmt_is_after_inc,
 +                               can_autoinc).cost = add_cost_val)
 +        cost.cost += add_cost_val;

 Please explain more thoroughly.  It also would seem to be better to add
 an extra case, as later code does

 For example for such code

   for (j=0; jM;j++) {
       for (i=0; iN; i++)
           sum += ptr-a[j][i] * ptr-c[k][i];
   }
  we currently have following gimple on x86 target (I provided a piece
 of all phase output):

           # ivtmp.13_30 = PHI ivtmp.13_31(3), ivtmp.13_33(7)
           D.1748_34 = (void *) ivtmp.13_30;
           D.1722_7 = MEM[base: D.1748_34, offset: 0B];
           D.1750_36 = ivtmp.27_28;
           D.1751_37 = D.1750_36 + ivtmp.13_30; -- we got
 non-invariant add which is not taken into account currently in cost
 model
           D.1752_38 = (void *) D.1751_37;
           D.1753_39 = (sizetype) k_8(D);
           D.1754_40 = D.1753_39 * 800;
           D.1723_9 = MEM[base: D.1752_38, index: D.1754_40, offset: 16000B];
           ...

  With proposed fix we produce:

           # ivtmp.14_30 = PHI ivtmp.14_31(3), 0(7)
           D.1749_34 = (struct S *) ivtmp.25_28;
           D.1722_7 = MEM[base: D.1749_34, index: ivtmp.14_30, offset: 0B];
           D.1750_35 = (sizetype) k_8(D);
           D.1751_36 = D.1750_35 * 800;
           D.1752_37 = ptr_6(D) + D.1751_36;
           D.1723_9 = MEM[base: D.1752_37, index: ivtmp.14_30, offset: 
 16000B];

 which is more effective on platforms where address cost is cheaper
 than cost of addition operation. That's basically what this adjustment
 is for.

 If we generally miss to account for the add then why is the adjustment
 conditional on diff_cost  add_cost and address_cost = add_cost?

 Is this a new heuristic or a fix for not accurately computing the cost for 
 the
 stmts we generate?

 I'd say this is closer to heuristic since diff_cost  add_cost is an
 attempt to catch the case with non-invariant add produced by pointer
 difference and address_cost =add_cost leaves the cases with cheap
 address operations


 Richard.

 So comment in the source code now looks as follows

 /* Do cost correction when address difference produces
   additional non-invariant add operation which is less
   profitable if address cost is cheaper than cost of add.  */


  /* Now the computation is in shape symbol + var1 + const + ratio * var2.
     (symbol/var1/const parts may be omitted).  If we are looking for an
     address, find the cost of addressing this.  */
  if (address_p)
    return add_costs (cost,
                      get_address_cost (symbol_present, var_present,
                                        offset, ratio, cstepi,
                                        TYPE_MODE (TREE_TYPE (utype)),
                                        TYPE_ADDR_SPACE (TREE_TYPE (utype)),
                                        speed, stmt_is_after_inc,
                                        can_autoinc));

 thus refactoring the code a bit would make it possible to CSE the
 get_address_cost
 call and eventually make it clearer what the code does.

 'offset' could be changed beetween two calls of get_address_cost so
 such refactoring looks useless.

 New patch (only the comment was changed) attached. Changelog was
 changed as well.


 Richard.


 Changelog:

  2012-04-26  Yuri Rumyantsev  yuri.rumyant...@intel.com

         * tree-ssa-loop-ivopts.c (get_computation_cost_at): Adjust
        cost model when address difference produces additional
        non-invariant add operation which is less profitable if
      

Re: [google/gcc-4_6] Backport arm hardfp patch from trunk (issue 6206055)

2012-05-11 Thread shenhan

On 2012/05/11 18:48:40, carrot wrote:

Please add ChangeLog and the trunk patch revision.


Thanks.

Trunk patch revision is 186859.

ChangeLog entry added.

 Patch started.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2f85855..588fa04 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2012-05-11  Han Shen  shen...@google.com
+
+   Backport from mainline.
+   2012-04-26  Michael Hope  michael.h...@linaro.org
+   Richard Earnshaw  rearn...@arm.com
+
+   * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_SOFT_FLOAT): Define.
+   (GLIBC_DYNAMIC_LINKER_HARD_FLOAT): Define.
+   (GLIBC_DYNAMIC_LINKER_DEFAULT): Define.
+   (GLIBC_DYNAMIC_LINKER): Redefine to use the hard float path.
+
 2012-04-30  Uros Bizjak  ubiz...@gmail.com

Backport from mainline
diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
index 3a32188..58d06c2 100644
--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -32,7 +32,8 @@
   while (false)

 /* We default to a soft-float ABI so that binaries can run on all
-   target hardware.  */
+   target hardware.  If you override this to use the hard-float ABI
then
+   change the setting of GLIBC_DYNAMIC_LINKER_DEFAULT as well.  */
 #undef  TARGET_DEFAULT_FLOAT_ABI
 #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_SOFT

@@ -59,10 +60,26 @@
 #undef  SUBTARGET_EXTRA_LINK_SPEC
 #define SUBTARGET_EXTRA_LINK_SPEC  -m  TARGET_LINKER_EMULATION

-/* Use ld-linux.so.3 so that it will be possible to run classic
-   GNU/Linux binaries on an EABI system.  */
+/* GNU/Linux on ARM currently supports three dynamic linkers:
+   - ld-linux.so.2 - for the legacy ABI
+   - ld-linux.so.3 - for the EABI-derived soft-float ABI
+   - ld-linux-armhf.so.3 - for the EABI-derived hard-float ABI.
+   All the dynamic linkers live in /lib.
+   We default to soft-float, but this can be overridden by changing
both
+   GLIBC_DYNAMIC_LINKER_DEFAULT and TARGET_DEFAULT_FLOAT_ABI.  */
+
 #undef  GLIBC_DYNAMIC_LINKER
-#define GLIBC_DYNAMIC_LINKER RUNTIME_ROOT_PREFIX /lib/ld-linux.so.3
+#define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT \
+  RUNTIME_ROOT_PREFIX /lib/ld-linux.so.3
+#define GLIBC_DYNAMIC_LINKER_HARD_FLOAT \
+  RUNTIME_ROOT_PREFIX /lib/ld-linux-armhf.so.3
+#define GLIBC_DYNAMIC_LINKER_DEFAULT \
+  RUNTIME_ROOT_PREFIX GLIBC_DYNAMIC_LINKER_SOFT_FLOAT
+
+#define GLIBC_DYNAMIC_LINKER \
+   %{mfloat-abi=hard: GLIBC_DYNAMIC_LINKER_HARD_FLOAT } \
+%{mfloat-abi=soft*: GLIBC_DYNAMIC_LINKER_SOFT_FLOAT } \
+%{!mfloat-abi=*: GLIBC_DYNAMIC_LINKER_DEFAULT }

 /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
use the GNU/Linux version, not the generic BPABI version.  */


http://codereview.appspot.com/6206055/


Re: [google/gcc-4_6] Backport arm hardfp patch from trunk (issue 6206055)

2012-05-11 Thread carrot


http://codereview.appspot.com/6206055/diff/4001/gcc/ChangeLog
File gcc/ChangeLog (right):

http://codereview.appspot.com/6206055/diff/4001/gcc/ChangeLog#newcode11
gcc/ChangeLog:11:
It seems you also merged in another patch r187012, please mention it.

http://codereview.appspot.com/6206055/diff/4001/gcc/config/arm/linux-eabi.h
File gcc/config/arm/linux-eabi.h (right):

http://codereview.appspot.com/6206055/diff/4001/gcc/config/arm/linux-eabi.h#newcode77
gcc/config/arm/linux-eabi.h:77: RUNTIME_ROOT_PREFIX
GLIBC_DYNAMIC_LINKER_SOFT_FLOAT

Here the RUNTIME_ROOT_PREFIX may be redundant.

http://codereview.appspot.com/6206055/


Re: [Patch / RFC] Improving more locations for binary expressions

2012-05-11 Thread Paolo Carlini

On 05/10/2012 04:28 PM, Jason Merrill wrote:

Looks good.

Thanks Jason. The below is the idea fully implemented.

The call.c bits are exactly in the form I had in mind a couple of days ago.

The parser.c bits, the ones I actually preliminarily posted, are now a 
bit different: I noticed that in the middle of the function we were 
reading / writing all the cp_parser_expression_stack_entry fields by 
hand (and now would be one more) thus tweaked the algorithm to use a 
cp_parser_expression_stack_entry as local variable. In my opinion this 
tidies the code quite a bit and also makes manifest that tree_type and 
loc are always updated at the same time, as they should be.


Booted and tested x86_64-linux. Ok?

Thanks,
Paolo.

//
/cp
2012-05-12

* parser.c (struct cp_parser_expression_stack_entry): Add location_t
field.
(cp_parser_binary_expression): Rework to always update at the same
time tree_type and loc.
* call.c (print_z_candidate): Add location_t parameter.
(print_z_candidates, convert_like_real, joust): Adjust.

/testsuite
2012-05-12

* g++.dg/parse/error47.C: New.
Index: testsuite/g++.dg/parse/error47.C
===
--- testsuite/g++.dg/parse/error47.C(revision 0)
+++ testsuite/g++.dg/parse/error47.C(revision 0)
@@ -0,0 +1,9 @@
+struct T { };
+
+T foo();
+
+void bar(int a, int b)
+{
+  if (foo()  a  b) // { dg-error 13:no match for 'operator' }
+;
+}
Index: cp/parser.c
===
--- cp/parser.c (revision 187414)
+++ cp/parser.c (working copy)
@@ -1621,6 +1621,8 @@ typedef struct cp_parser_expression_stack_entry
   enum tree_code tree_type;
   /* Precedence of the binary operation we are parsing.  */
   enum cp_parser_prec prec;
+  /* Location of the binary operation we are parsing.  */
+  location_t loc;
 } cp_parser_expression_stack_entry;
 
 /* The stack for storing partial expressions.  We only need NUM_PREC_VALUES
@@ -7275,30 +7277,33 @@ cp_parser_binary_expression (cp_parser* parser, bo
 {
   cp_parser_expression_stack stack;
   cp_parser_expression_stack_entry *sp = stack[0];
-  tree lhs, rhs;
+  cp_parser_expression_stack_entry sentry;
+  tree rhs;
   cp_token *token;
-  location_t loc;
-  enum tree_code tree_type, lhs_type, rhs_type;
+  enum tree_code rhs_type;
   enum cp_parser_prec new_prec, lookahead_prec;
   tree overload;
 
   /* Parse the first expression.  */
-  lhs = cp_parser_cast_expression (parser, /*address_p=*/false, cast_p, pidk);
-  lhs_type = ERROR_MARK;
+  sentry.lhs = cp_parser_cast_expression (parser, /*address_p=*/false,
+ cast_p, pidk);
+  sentry.lhs_type = ERROR_MARK;
+  sentry.prec = prec;
 
   for (;;)
 {
   /* Get an operator token.  */
   token = cp_lexer_peek_token (parser-lexer);
-  loc = token-location;
 
   if (warn_cxx0x_compat
token-type == CPP_RSHIFT
!parser-greater_than_is_operator_p)
 {
-  if (warning_at (loc, OPT_Wc__0x_compat, %% operator is treated
+  if (warning_at (token-location, OPT_Wc__0x_compat,
+ %% operator is treated
   as two right angle brackets in C++11))
-   inform (loc, suggest parentheses around %% expression);
+   inform (token-location,
+   suggest parentheses around %% expression);
 }
 
   new_prec = TOKEN_PRECEDENCE (token);
@@ -7310,7 +7315,7 @@ cp_parser_binary_expression (cp_parser* parser, bo
 - or, we found an operator which has lower priority.  This is the case
   where the recursive descent *ascends*, as in `3 * 4 + 5' after
   parsing `3 * 4'.  */
-  if (new_prec = prec)
+  if (new_prec = sentry.prec)
{
  if (sp == stack)
break;
@@ -7319,17 +7324,18 @@ cp_parser_binary_expression (cp_parser* parser, bo
}
 
  get_rhs:
-  tree_type = binops_by_token[token-type].tree_type;
+  sentry.tree_type = binops_by_token[token-type].tree_type;
+  sentry.loc = token-location;
 
   /* We used the operator token.  */
   cp_lexer_consume_token (parser-lexer);
 
   /* For false  x or true || x, x will never be executed;
 disable warnings while evaluating it.  */
-  if (tree_type == TRUTH_ANDIF_EXPR)
-   c_inhibit_evaluation_warnings += lhs == truthvalue_false_node;
-  else if (tree_type == TRUTH_ORIF_EXPR)
-   c_inhibit_evaluation_warnings += lhs == truthvalue_true_node;
+  if (sentry.tree_type == TRUTH_ANDIF_EXPR)
+   c_inhibit_evaluation_warnings += sentry.lhs == truthvalue_false_node;
+  else if (sentry.tree_type == TRUTH_ORIF_EXPR)
+   c_inhibit_evaluation_warnings += sentry.lhs == truthvalue_true_node;
 
   /* Extract another operand.  It may be the RHS of this expression
 or the LHS of a new, 

Re: [patch] Fix debug info of nested inline functions

2012-05-11 Thread Jason Merrill

On 03/02/2012 03:29 PM, Eric Botcazou wrote:

I notice that D.7 seems to suggest that if the nested function is not 
inlinable and shared between all instances of the containing function 
that we put a normal (non-abstract/concrete) instance of the nested 
function inside the abstract function for the containing function.  But 
I agree that it's cleaner your way: put an abstract instance there 
instead so that the abstract instance of the containing function is all 
abstract.



+  /* Emit an abstract instance of nested functions within an abstract instance
+ of their parent.  */
+  int declaration = ((decl != current_function_decl
+  !(DECL_INITIAL (decl) != NULL_TREE
+   DECL_ABSTRACT (decl)
+   current_function_decl
+   DECL_ABSTRACT (current_function_decl)))
 || class_or_namespace_scope_p (context_die));


Hmm, why isn't current_function_decl == decl when we're trying to emit 
the abstract instance of a nested function?



+  /* Do not emit concrete instances of abstracted nested functions without
+ actual instances.  */
+  else if (TREE_CODE (decl_or_origin) == FUNCTION_DECL
+   die
+   get_AT (die, DW_AT_inline))
+;


Should without actual instances be something like within concrete 
instances of containing functions?



- if (origin  origin-die_parent)
+ if (origin
+  origin-die_parent
+ /* Skip an abtract parent instance.  */
+  !(origin-die_parent-die_tag == DW_TAG_subprogram
+   get_AT (origin-die_parent, DW_AT_inline)))
add_child_die (origin-die_parent, die);


What if the immediate parent is a DW_TAG_lexical_block or some other 
thing nested inside an abstract subprogram?



and you suggested to iterate over DECL_CONTEXT instead of die_parent to find an
appropriate parent in order to attach the DIE on the limbo list to.


In my earlier comments I seem to have been wrong about the behavior of 
gen_subprogram_die; now I see that if there is an abstract instance the 
concrete out-of-line instance is not associated with the decl number. 
So I guess your earlier limbo handling code was fine apart from the 
lexical_block issue above.


Jason


Re: [C++ Patch] PR 53301

2012-05-11 Thread Jason Merrill

On 05/11/2012 06:03 AM, Paolo Carlini wrote:

if (TYPE_PTR_P (type)
  !TYPE_PTRFN_P (type)
- !TYPE_PTR_TO_MEMBER_P (type))
+ !TYPE_PTRMEM_P (type))


The check for !pointer to member is no longer necessary, since they 
don't use POINTER_TYPE.


OK with that change.

Jason



Re: [Patch / RFC] Improving more locations for binary expressions

2012-05-11 Thread Jason Merrill
I find the name sentry confusing; I don't see how it applies.  Perhaps 
current instead?


Otherwise, the patch is OK.

Jason