date:20110602

Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-02 Thread Xinliang David Li

This is the version of the patch that walks through pass lists.

Ok with this one?

David
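
For illustration only (this is not the posted patch): a minimal sketch of what "walking the pass lists at a single place" could look like. The struct below is a simplified, hypothetical stand-in for GCC's struct opt_pass, keeping only a name, a gate, a sub-pass list and a next pointer. Skipping names that start with '*' is an assumption made here to mirror the placeholder passes mentioned in the thread.

```c
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for GCC's struct opt_pass; only the fields
   needed for the walk are modelled here.  */
struct mini_pass
{
  const char *name;        /* pass name as used by -fenable/-fdisable */
  bool (*gate) (void);     /* NULL means "always on" */
  struct mini_pass *sub;   /* first sub-pass, or NULL */
  struct mini_pass *next;  /* next pass on the same level, or NULL */
};

/* Print every pass in LIST to OUT, indenting sub-passes by two extra
   columns per level, and return the number of passes printed.  Names
   starting with '*' are treated as placeholders and not printed, but
   their sub-passes are still walked.  */
int
dump_pass_list (FILE *out, const struct mini_pass *list, int indent)
{
  int count = 0;

  for (const struct mini_pass *p = list; p != NULL; p = p->next)
    {
      if (p->name != NULL && p->name[0] != '*')
        {
          bool on = p->gate == NULL || p->gate ();
          fprintf (out, "%*s%-20s : %s\n", indent, "", p->name,
                   on ? "ON" : "OFF");
          count++;
        }
      count += dump_pass_list (out, p->sub, indent + 2);
    }
  return count;
}
```

Because the gate is called outside of any function being compiled, a gate that depends on per-function state would make such a listing unreliable, which is the concern raised in the quoted discussion below.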

On Wed, Jun 1, 2011 at 12:45 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, Jun 1, 2011 at 12:29 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 6:16 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li davi...@google.com wrote:
 The following patch implements a new option that dumps the gcc PASS
 configuration. The sample output is attached.  There is one
 limitation: some placeholder passes that are named with '*xxx' are
 not registered, thus they are not listed. They are not important as
 they cannot be turned on/off anyway.

 The patch also enhances -fenable-xxx and -fdisable-xxx to allow a list
 of function assembler names to be specified.

 Ok for trunk?

 Please split the patch.

 I'm not too happy with how you dump the pass configuration.  Why not simply,
 at a _single_ place, walk the pass tree?  Instead of doing pieces of it
 at pass execution time when it's not already dumped - that really looks
 gross.

 Yes, that was the original plan -- but it has problems:
 1) the dumper needs to know the root pass lists -- which can change
 frequently -- it can be a long term maintenance burden;
 2) the centralized dumper needs to be done after option processing;
 3) not sure if gate functions have any side effects or have dependencies on cfun.

 The proposed solution IMHO is not that intrusive -- just three hooks
 to do the dumping and tracking indentation.

 Well, if you have a CU that is empty or optimized to nothing at some point
 you will not get a complete pass list.  I suppose optimize attributes might
 also confuse output.  Your solution might not be that intrusive
 but it is still ugly.  I don't see 1) as an issue; for 2) you can just call
 the dumping from toplev_main before calling do_compile (); 3) gate functions
 shouldn't have side-effects, but as they could gate on optimize_for_speed ()
 your option summary output will be bogus anyway.

 So - what is the output intended for if it isn't reliable?

 This needs to be cleaned up at some point -- the gate function should
 behave the same for all functions and per-function decisions need to
 be pushed down to the executor body.  I will try to rework the patch
 as you suggested to see if there are problems.

 David



 Richard.


 The documentation should also link this option to the -fenable/disable
 options as obviously the pass names in that dump are those to be
 used for those flags (and not readily available anywhere else).

 Ok.


 I also think that it would be way more useful to note in the individual
 dump files the functions (at the place they would usually appear) that
 have the pass explicitly enabled/disabled.

 Ok -- for ipa passes or tree/rtl passes where all functions are
 explicitly disabled.

 Thanks,

 David


 Richard.

 Thanks,

 David







dump-pass3.p
Description: Binary data


out
Description: Binary data

Ping: [Patch] Make libstdc++'s abi_check more robust against readelf output format

2011-06-02 Thread Simon Baldwin

Ping.


On 20 May 2011 17:05, Simon Baldwin sim...@google.com wrote:

 Make libstdc++'s abi_check more robust against readelf output format.

 libstdc++-abi/abi_check in the libstdc++-v3 testsuite relies on a fixed
 number of space separated fields in readelf output.  However, the field
 count for readelf output can vary where the library contains OS or processor
 specific bindings, or other unknown bindings.

 This patch replaces the strings that readelf outputs for such bindings
 with alternative strings that use underscores in place of space.  It
 preserves the count of fields for such cases, and allows the awk statement
 that follows to find the desired field correctly with $n.

 OK for trunk?

 libstdc++-v3/ChangeLog:
 2011-05-20  Simon Baldwin  sim...@google.com

        * scripts/extract_symvers.in: Handle processor/OS specific or
        unknown symbol binding strings from readelf.


 Index: libstdc++-v3/scripts/extract_symvers.in
 ===
 --- libstdc++-v3/scripts/extract_symvers.in     (revision 173951)
 +++ libstdc++-v3/scripts/extract_symvers.in     (working copy)
 @@ -52,6 +52,9 @@ SunOS)
   ${readelf} ${lib} |\
   sed -e 's/ \[other: [A-Fa-f0-9]*\] //' -e '/\.dynsym/,/^$/p;d' |\
   egrep -v ' (LOCAL|UND) ' |\
 +  sed -e 's/ processor specific: / processor_specific:_/g' |\
 +  sed -e 's/ OS specific: / OS_specific:_/g' |\
 +  sed -e 's/ unknown: / unknown:_/g' |\
   awk '{ if ($4 == "FUNC" || $4 == "NOTYPE")
            printf "%s:%s\n", $4, $8;
          else if ($4 == "OBJECT" || $4 == "TLS")




Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Andrew Stubbs


Ping 2.

On 20/04/11 16:27, Andrew Stubbs wrote:

This patch adds basic support for the Thumb ADDW and SUBW instructions.

The patch permits the compiler to use the new instructions for constants
that can be loaded with a single instruction (i.e. 16-bit unshifted),
but does not support use of addw with split-constants; I have a patch
for that coming soon.

This patch requires that my previously posted patch for MOVW is applied
first.

OK?

Andrew

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Ramana Radhakrishnan

 OK?


This is largely OK modulo the following.

Please remove the alternatives in the subsi3 pattern since that is just
unnecessary. Please make the constraints internal only.


cheers
Ramana


 Andrew

Re: [PATCH, ARM] Thumb-2 12-bit immediates in ADD and SUB instructions

2011-06-02 Thread Ramana Radhakrishnan




Would you include this in your patch? Or should we submit it as a
separate patch?



Could you submit this as a follow-up patch that touches the costs? I
would rather these changes also went in while we are looking at
this area.


cheers
Ramana

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Ira Rosen

On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote:

 Did you think about moving pass_optimize_widening_mul before
 loop optimizations?  Does that pass catch the cases you are
 teaching the pattern recognizer?  I think we should try to expose
 these more complicated instructions to loop optimizers.


 pass_optimize_widening_mul doesn't catch these cases, but I can try to
 teach it instead of the vectorizer.
 I am now testing

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,7 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +935,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 to see how it affects other loop optimizations (vectorizer pattern
 tests obviously fail).

Looks like it needs copy_prop and dce as well:

Index: passes.c
===
--- passes.c    (revision 174391)
+++ passes.c    (working copy)
@@ -870,6 +870,9 @@
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
+  NEXT_PASS (pass_copy_prop);
+  NEXT_PASS (pass_dce);
+  NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tree_loop);
{
  struct opt_pass **p = pass_tree_loop.pass.sub;
@@ -934,7 +937,6 @@
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
   NEXT_PASS (pass_fold_builtins);
-  NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tail_calls);
   NEXT_PASS (pass_rename_ssa_copies);
   NEXT_PASS (pass_uncprop);

otherwise I get (on x86_64-suse-linux)

FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

Ira


 Thanks.  I would hope that we eventually can get rid of the
 pattern recognizer ... at least for SSE there is also always
 a scalar variant instruction for each vectorized one.

 Richard.

Re: introduce --param max-vartrack-expr-depth

2011-06-02 Thread Jakub Jelinek

On Wed, Jun 01, 2011 at 07:25:39PM -0300, Alexandre Oliva wrote:
 Such as this one...

I'd appreciate if this could go in...

 Index: gcc/params.def
 ===
 --- gcc/params.def.orig   2011-05-31 18:28:05.348070586 -0300
 +++ gcc/params.def   2011-06-01 17:09:41.117140944 -0300
 @@ -845,7 +845,7 @@ DEFPARAM (PARAM_MAX_VARTRACK_SIZE,
  DEFPARAM (PARAM_MAX_VARTRACK_EXPR_DEPTH,
            "max-vartrack-expr-depth",
            "Max. recursion depth for expanding var tracking expressions",
 -   10, 0, 0)
 +   20, 0, 0)
  
  /* Set minimum insn uid for non-debug insns.  */
  
 Index: gcc/var-tracking.c
 ===
 --- gcc/var-tracking.c.orig   2011-05-31 20:06:25.604477956 -0300
 +++ gcc/var-tracking.c   2011-05-31 23:56:06.578450957 -0300
 @@ -5288,7 +5288,7 @@ reverse_op (rtx val, const_rtx expr)
arg = XEXP (src, 1);
if (!CONST_INT_P (arg) && GET_CODE (arg) != SYMBOL_REF)
   {
 -   arg = cselib_expand_value_rtx (arg, scratch_regs, EXPR_DEPTH);
 +   arg = cselib_expand_value_rtx (arg, scratch_regs, 5);
 if (arg == NULL_RTX)
   return NULL_RTX;
 if (!CONST_INT_P (arg) && GET_CODE (arg) != SYMBOL_REF)

Jakub

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Andrew Stubbs


On 02/06/11 09:23, Ramana Radhakrishnan wrote:

Please remove the alternatives in the subsi3 pattern since that is just
unnecessary. Please make the constraints internal only.


Is this better?

Andrew
2011-06-02  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* config/arm/arm-protos.h (const_ok_for_op): Add prototype.
	* config/arm/arm.c (const_ok_for_op): Add support for addw/subw.
	Remove prototype. Remove static function type.
	* config/arm/arm.md (*arm_addsi3): Add addw/subw support.
	Add arch attribute.
	* config/arm/constraints.md (Pj, PJ): New constraints.

--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -46,6 +46,7 @@ extern bool arm_vector_mode_supported_p (enum machine_mode);
 extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
 extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
 extern int const_ok_for_arm (HOST_WIDE_INT);
+extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
 			   HOST_WIDE_INT, rtx, rtx, int);
 extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *, rtx *);
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -82,7 +82,6 @@ inline static int thumb1_index_register_rtx_p (rtx, int);
 static bool arm_legitimate_address_p (enum machine_mode, rtx, bool);
 static int thumb_far_jump_used_p (void);
 static bool thumb_force_lr_save (void);
-static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 static rtx emit_sfm (int, int);
 static unsigned arm_size_return_regs (void);
 static bool arm_assemble_integer (rtx, unsigned int, int);
@@ -2149,7 +2148,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
 }
 
 /* Return true if I is a valid constant for the operation CODE.  */
-static int
+int
 const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
 {
   if (const_ok_for_arm (i))
@@ -2165,6 +2164,13 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
 	return 0;
 
 case PLUS:
+  /* See if we can use addw or subw.  */
+  if (TARGET_THUMB2
+	   ((i  0xf000) == 0
+	  || ((-i)  0xf000) == 0))
+	return 1;
+  /* else fall through.  */
+
 case COMPARE:
 case EQ:
 case NE:
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -707,21 +707,24 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split *arm_addsi3
-  [(set (match_operand:SI  0 s_register_operand =r, k,r,r, k,r)
-	(plus:SI (match_operand:SI 1 s_register_operand %rk,k,r,rk,k,rk)
-		 (match_operand:SI 2 reg_or_int_operand rI,rI,k,L, L,?n)))]
+  [(set (match_operand:SI  0 s_register_operand =r, k,r,r, k, r, k,r, k, r)
+	(plus:SI (match_operand:SI 1 s_register_operand %rk,k,r,rk,k, rk,k,rk,k, rk)
+		 (match_operand:SI 2 reg_or_int_operand rI,rI,k,Pj,Pj,L, L,PJ,PJ,?n)))]
   TARGET_32BIT
   @
add%?\\t%0, %1, %2
add%?\\t%0, %1, %2
add%?\\t%0, %2, %1
+   addw%?\\t%0, %1, %2
+   addw%?\\t%0, %1, %2
sub%?\\t%0, %1, #%n2
sub%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
#
   TARGET_32BIT
 GET_CODE (operands[2]) == CONST_INT
-!(const_ok_for_arm (INTVAL (operands[2]))
-|| const_ok_for_arm (-INTVAL (operands[2])))
+!const_ok_for_op (INTVAL (operands[2]), PLUS)
 (reload_completed || !arm_eliminable_register (operands[1]))
   [(clobber (const_int 0))]
   
@@ -730,8 +733,9 @@
 		  operands[1], 0);
   DONE;
   
-  [(set_attr length 4,4,4,4,4,16)
-   (set_attr predicable yes)]
+  [(set_attr length 4,4,4,4,4,4,4,4,4,16)
+   (set_attr predicable yes)
+   (set_attr arch *,*,*,t2,t2,*,*,t2,t2,*)]
 )
 
 (define_insn_and_split *thumb1_addsi3
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,7 +31,7 @@
 ;; The following multi-letter normal constraints have been used:
 ;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd
-;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
 
 ;; The following memory constraints have been used:
 ;; in ARM/Thumb-2 state: Q, Ut, Uv, Uy, Un, Um, Us
@@ -75,6 +75,18 @@
 	   (and (match_code const_int)
 (match_test (ival  0x) == 0)
 
+(define_constraint Pj
+ @internal A 12-bit constant suitable for an ADDW or SUBW instruction. (Thumb-2)
+ (and (match_code const_int)
+  (and (match_test TARGET_THUMB2)
+	   (match_test (ival  0xf000) == 0
+
+(define_constraint PJ
+ @internal A constant that satisfies the Pj constraint if negated.
+ (and (match_code const_int)
+  (and (match_test TARGET_THUMB2)
+	   (match_test ((-ival)  0xf000) == 0
+
 (define_register_constraint k STACK_REG
  @internal The stack register.)
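
For illustration only (not part of the patch): a small stand-alone sketch of the range test that the Pj and PJ descriptions imply, namely "fits in 12 bits" and "fits in 12 bits once negated". The function names and the explicitly written 12-bit mask are assumptions made for this sketch; the real check also depends on the target being Thumb-2, which is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VAL fits in an unsigned 12-bit immediate (0..4095).  */
bool
fits_imm12 (int32_t val)
{
  return ((uint32_t) val & ~UINT32_C (0xfff)) == 0;
}

/* True if adding VAL could use ADDW (VAL itself fits) or SUBW (the
   negated value fits).  The negation is done in unsigned arithmetic
   so that INT32_MIN does not overflow.  */
bool
addw_or_subw_ok (int32_t val)
{
  uint32_t neg = 0u - (uint32_t) val;

  return fits_imm12 (val) || (neg & ~UINT32_C (0xfff)) == 0;
}
```

Under these assumptions, constants from -4095 to 4095 are accepted, while 4096 and -4096 are rejected and would still need the split-constant path.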

Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-06-02 Thread Richard Earnshaw


On Thu, 2011-05-26 at 14:35 +0100, Andrew Stubbs wrote:
 On 25/05/11 14:47, Joseph S. Myers wrote:
  The shift must be by a positive constant amount, strictly less than the
  precision (GET_MODE_PRECISION) of the mode (of the value being shifted).
  If that applies, the relevant number of bits is the precision of the mode
  minus the number of bits of the shift.  For an extension, just take the
  number of bits in the inner mode.  Add the two numbers of bits; if the
  result does not exceed the number of bits in the mode (of the operands and
  the multiplication) then the multiplication won't overflow.
 
 I believe the attached should implement what you describe.
 
 Is the patch OK now?
 
 Andrew

OK.

R.
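
For illustration only (this is not the attached patch): a sketch of the bit-counting rule quoted above, written as plain C over a hypothetical operand description rather than over RTL. It assumes the shift is a right shift, and all type and function names are invented for this example.

```c
#include <stdbool.h>

/* How one multiplication operand was produced.  */
enum operand_kind { OP_EXTEND, OP_SHIFT };

struct operand
{
  enum operand_kind kind;
  int inner_bits;   /* OP_EXTEND: bits in the inner (narrower) mode */
  int shift;        /* OP_SHIFT: constant right-shift amount */
};

/* Return the number of significant bits OP can have in a mode with
   MODE_BITS bits of precision, or -1 if the rule does not apply.  */
int
significant_bits (const struct operand *op, int mode_bits)
{
  switch (op->kind)
    {
    case OP_EXTEND:
      return op->inner_bits;
    case OP_SHIFT:
      /* The shift must be positive and strictly less than the precision.  */
      if (op->shift <= 0 || op->shift >= mode_bits)
        return -1;
      return mode_bits - op->shift;
    }
  return -1;
}

/* True if A * B provably cannot overflow MODE_BITS bits: the two bit
   counts are added and must not exceed the precision of the mode.  */
bool
mult_cannot_overflow (const struct operand *a, const struct operand *b,
                      int mode_bits)
{
  int na = significant_bits (a, mode_bits);
  int nb = significant_bits (b, mode_bits);

  return na >= 0 && nb >= 0 && na + nb <= mode_bits;
}
```

With two 16-bit extensions in a 64-bit mode the sum is 32 bits, so the product is safe; a 32-bit extension multiplied by a value shifted right by only 16 gives 32 + 48 = 80 bits and is rejected.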

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Richard Guenther

On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote:

 Did you think about moving pass_optimize_widening_mul before
 loop optimizations?  Does that pass catch the cases you are
 teaching the pattern recognizer?  I think we should try to expose
 these more complicated instructions to loop optimizers.


 pass_optimize_widening_mul doesn't catch these cases, but I can try to
 teach it instead of the vectorizer.
 I am now testing

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,7 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +935,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 to see how it affects other loop optimizations (vectorizer pattern
 tests obviously fail).

 Looks like it needs copy_prop and dce as well:

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,9 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_copy_prop);
 +      NEXT_PASS (pass_dce);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +937,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 otherwise I get (on x86_64-suse-linux)

 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

Hmm.  I would have put the pass next to the sincos pass, but yes,
in principle a copyprop & dce pass after PRE makes sense
(the loop passes likely don't run because there are no loops in
those testcases - both copyprop and dce should be scheduled
more like TODOs, or even automatically by the pass manager
via PROPs ...).  Dead code can indeed confuse those matching
passes that look for single-use vars.

I'll think about a more elegant solution for this problem.

Would you mind checking if the next-to-sincos position makes
any difference?

Thanks,
Richard.

 Ira


 Thanks.  I would hope that we eventually can get rid of the
 pattern recognizer ... at least for SSE there is also always
 a scalar variant instruction for each vectorized one.

 Richard.

Re: introduce --param max-vartrack-expr-depth

2011-06-02 Thread Bernd Schmidt

On 06/02/2011 10:46 AM, Jakub Jelinek wrote:
 On Wed, Jun 01, 2011 at 07:25:39PM -0300, Alexandre Oliva wrote:
 Such as this one...
 
 I'd appreciate if this could go in...

Go on then.

Bernd

Re: Ping: [Patch] Make libstdc++'s abi_check more robust against readelf output format

2011-06-02 Thread Paolo Carlini

Hi,

 Ping.

Did Ian Taylor see this patch? If he likes it, I'm also fine with it.

Paolo

Re: Add missing ChangeLog entry

2011-06-02 Thread Nathan Sidwell


On 06/01/11 15:32, Ian Lance Taylor wrote:

I noticed that we have a --with-specs option in gcc/configure.ac, added
in revision 155208 with this e-mail message:
http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00132.html


sorry about that.  How's the attached documentation?


--
Nathan Sidwell

2011-06-02  Nathan Sidwell  nat...@codesourcery.com

	* doc/install.texi (Options specification): Document --with-specs.

Index: doc/install.texi
===
--- doc/install.texi	(revision 174559)
+++ doc/install.texi	(working copy)
@@ -771,6 +771,12 @@
 on other configuration options, and differs between cross and native
 configurations.
 
+@item --with-specs=@var{specs}
+Specify additional command line driver SPECS.  This can be useful if
+you to turn on a non-standard feature by default without modifying the
+compiler's source code, for instance
+@option{--with-specs=%@{!fcommon:%@{!fno-common:-fno-common@}@}}.
+
 @end table
 
 @item --program-prefix=@var{prefix}

Re: RFA: another patch to solve PR49154

2011-06-02 Thread Hans-Peter Nilsson

On Tue, 31 May 2011, Richard Sandiford wrote:

 Gah, seems like I'd forgotten the no subclasses bit by the time
 I started looking at code.  Sorry for the false alarm.

Still, the extra look made me realise that I should have
restricted that statement to allocatable registers.
(And I really do appreciate a look from a native speaker.)

Updated patch follows, checked dvi and info output:

* doc/tm.texi.in (Register Classes): Document rule for the narrowest
register classes.
* doc/tm.texi: Regenerate.

Index: doc/tm.texi.in
===
--- doc/tm.texi.in  (revision 174376)
+++ doc/tm.texi.in  (working copy)
@@ -2327,6 +2327,12 @@ constraints is through machine-dependent
 You can define such letters to correspond to various classes, then use
 them in operand constraints.

+You must define the narrowest register classes for allocatable
+registers, so that each class either has no subclasses, or that for
+some mode, the move cost between registers within the class is
+cheaper than moving a register in the class to or from memory
+(@pxref{Costs}).
+
 You should define a class for the union of two classes whenever some
 instruction allows both classes.  For example, if an instruction allows
 either a floating point (coprocessor) register or a general register for a

brgds, H-P

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Ramana Radhakrishnan

On 2 June 2011 10:03, Andrew Stubbs a...@codesourcery.com wrote:
 On 02/06/11 09:23, Ramana Radhakrishnan wrote:

 Please remove the alternatives in the subsi3 pattern since that is just
 unnecessary. Please make the constraints internal only.

 Is this better?


OK.

Ramana

 Andrew

Re: Add missing ChangeLog entry

2011-06-02 Thread Basile Starynkevitch

On Thu, 02 Jun 2011 11:12:12 +0100
Nathan Sidwell nat...@codesourcery.com wrote:

 On 06/01/11 15:32, Ian Lance Taylor wrote:
  I noticed that we have a --with-specs option in gcc/configure.ac, added
  in revision 155208 with this e-mail message:
  http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00132.html
 
 sorry about that.  How's the attached documentation?

[...]
+@item --with-specs=@var{specs}
+Specify additional command line driver SPECS.  This can be useful if
+you to turn on a non-standard feature by default without modifying the
+compiler's source code, for instance
+@option{--with-specs=%@{!fcommon:%@{!fno-common:-fno-common@}@}}.

I am not a native English speaker, and my English is bad. But perhaps it should be
"this can be useful if you *want* to turn on".
I feel that the word 'want' is missing...

Regards.

Remove SETJMP_VIA_SAVE_AREA support

2011-06-02 Thread Eric Botcazou

This removes the (undocumented) support for SETJMP_VIA_SAVE_AREA from the 
compiler.  This is a trick implemented on the SPARC exclusively to reuse the 
register save area present in all frames (because of the register windows) for 
part of the setjmp buffer.  The benefits are marginal and dwarfed by the usual 
drawbacks of using setjmp/longjmp (to implement exceptions for example).

This exposed a couple of similar bugs in cse.c and postreload-gcse.c: the code 
was effectively treating a basic block with a single, abnormal incoming edge 
as if the edge was normal.
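
For illustration only (not taken from the patch): a sketch of the kind of predicate the fix amounts to, using hypothetical miniature edge and block structures instead of GCC's CFG types. The flag name and the list layout are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for GCC's EDGE_ABNORMAL.  */
#define MINI_EDGE_ABNORMAL 0x1u

struct mini_edge
{
  unsigned int flags;            /* bit set of MINI_EDGE_* values */
  struct mini_edge *next_pred;   /* next incoming edge of the same block */
};

struct mini_block
{
  struct mini_edge *preds;       /* list of incoming edges, or NULL */
};

/* Return true if BB has exactly one incoming edge and that edge is a
   normal one, i.e. a path may safely be extended across it.  */
bool
single_normal_pred_p (const struct mini_block *bb)
{
  const struct mini_edge *e = bb->preds;

  if (e == NULL || e->next_pred != NULL)
    return false;                /* zero or several predecessors */
  return (e->flags & MINI_EDGE_ABNORMAL) == 0;
}
```

The point of the sketch is only that "single predecessor" alone is not a sufficient test: the kind of the edge has to be checked as well.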

Bootstrapped/regtested on x86_64-suse-linux and sparc-sun-solaris2.10.  I also 
verified that ACATS is clean with the SJLJ EH scheme.  Applied on the mainline.


2011-06-02  Eric Botcazou  ebotca...@adacore.com

* function.h (struct stack_usage): Remove dynamic_alloc_count field.
(current_function_dynamic_alloc_count): Delete.
* builtins.c (expand_builtin_setjmp_setup): Do not set calls_setjmp.
(expand_builtin_nonlocal_goto): Remove obsolete comment.
(expand_builtin_update_setjmp_buf): Remove dead code.
* cse.c (cse_find_path): Do not follow a single abnormal incoming edge.
* explow.c (allocate_dynamic_stack_space): Remove SETJMP_VIA_SAVE_AREA
support.
* function.c (instantiate_virtual_regs): Likewise.
* postreload-gcse.c (bb_has_well_behaved_predecessors): Return false
for a block with a single abnormal incoming edge.
* config/sparc/sparc.h (STACK_SAVEAREA_MODE): Define.
* config/sparc/sparc-protos.h (load_got_register): Declare.
* config/sparc/sparc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Define.
(load_got_register): Make global.
(sparc_frame_pointer_required): Add 'static'.
(sparc_can_eliminate): Likewise.  Call sparc_frame_pointer_required.
(sparc_builtin_setjmp_frame_value): New function.
* config/sparc/sparc.md (UNSPECV_SETJMP): Remove.
(save_stack_nonlocal): New expander.
(restore_stack_nonlocal): Likewise.
(nonlocal_goto): Remove modes, adjust predicates and reimplement.
(nonlocal_goto_internal): New insn.
(goto_handler_and_restore): Delete.
(builtin_setjmp_setup): Likewise.
(do_builtin_setjmp_setup): Likewise.
(setjmp): Likewise.
(builtin_setjmp_receiver): New expander.


-- 
Eric Botcazou
Index: function.h
===
--- function.h	(revision 174559)
+++ function.h	(working copy)
@@ -1,6 +1,6 @@
 /* Structure for saving state for a nested function.
Copyright (C) 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-   1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -476,9 +476,6 @@ struct GTY(()) stack_usage
  !ACCUMULATE_OUTGOING_ARGS, it contains the outgoing arguments.  */
   int pushed_stack_size;
 
-  /* # of dynamic allocations in the function.  */
-  unsigned int dynamic_alloc_count : 31;
-
   /* Nonzero if the amount of stack space allocated dynamically cannot
  be bounded at compile-time.  */
   unsigned int has_unbounded_dynamic_stack_size : 1;
@@ -487,7 +484,6 @@ struct GTY(()) stack_usage
 #define current_function_static_stack_size (cfun->su->static_stack_size)
 #define current_function_dynamic_stack_size (cfun->su->dynamic_stack_size)
 #define current_function_pushed_stack_size (cfun->su->pushed_stack_size)
-#define current_function_dynamic_alloc_count (cfun->su->dynamic_alloc_count)
 #define current_function_has_unbounded_dynamic_stack_size \
   (cfun->su->has_unbounded_dynamic_stack_size)
 #define current_function_allocates_dynamic_stack_space\
Index: builtins.c
===
--- builtins.c	(revision 174559)
+++ builtins.c	(working copy)
@@ -806,10 +806,6 @@ expand_builtin_setjmp_setup (rtx buf_add
 emit_insn (gen_builtin_setjmp_setup (buf_addr));
 #endif
 
-  /* Tell optimize_save_area_alloca that extra work is going to
- need to go on during alloca.  */
-  cfun->calls_setjmp = 1;
-
   /* We have a nonlocal label.   */
   cfun->has_nonlocal_label = 1;
 }
@@ -992,8 +988,8 @@ expand_builtin_nonlocal_goto (tree exp)
   r_label = convert_memory_address (Pmode, r_label);
   r_save_area = expand_normal (t_save_area);
   r_save_area = convert_memory_address (Pmode, r_save_area);
-  /* Copy the address of the save location to a register just in case it was based
-on the frame pointer.   */
+  /* Copy the address of the save location to a register just in case it was
+ based on the frame pointer.   */
   r_save_area = copy_to_reg (r_save_area);
   r_fp = gen_rtx_MEM (Pmode, r_save_area);
   r_sp = gen_rtx_MEM (STACK_SAVEAREA_MODE (SAVE_NONLOCAL),
@@ -1013,11 +1009,7 @@ expand_builtin_nonlocal_goto (tree exp)
   emit_clobber (gen_rtx_MEM

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Ira Rosen

On 2 June 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote:
 On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote:

 Did you think about moving pass_optimize_widening_mul before
 loop optimizations?  Does that pass catch the cases you are
 teaching the pattern recognizer?  I think we should try to expose
 these more complicated instructions to loop optimizers.


 pass_optimize_widening_mul doesn't catch these cases, but I can try to
 teach it instead of the vectorizer.
 I am now testing

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,7 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +935,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 to see how it affects other loop optimizations (vectorizer pattern
 tests obviously fail).

 Looks like it needs copy_prop and dce as well:

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,9 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_copy_prop);
 +      NEXT_PASS (pass_dce);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +937,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 otherwise I get (on x86_64-suse-linux)

 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

 Hmm.  I would have put the pass next to the sincos pass, but yes,
 in principle a copyprop & dce pass after PRE makes sense
 (the loop passes likely don't run because there are no loops in
 those testcases - both copyprop and dce should be scheduled
 more like TODOs, or even automatically by the pass manager
 via PROPs ...).  Dead code can indeed confuse those matching
 passes that look for single-use vars.

 I'll think about a more elegant solution for this problem.

 Would you mind checking if the next-to-sincos position makes
 any difference?

Before sincos we have

  D.2747_2 = __builtin_powf (a_1(D), 2.0e+0);
  D.2746_4 = D.2747_2 + c_3(D);

which is transformed by sincos to

  powmult.8_7 = a_1(D) * a_1(D);
  D.2747_2 = powmult.8_7;
  D.2746_4 = D.2747_2 + c_3(D);

but widening_mul is confused by D.2747_2 = powmult.8_7; and it needs
both copy_prop and dce to remove it:

  powmult.8_7 = a_1(D) * a_1(D);
  D.2746_4 = c_3(D) + powmult.8_7;

So moving widening_mul next to sincos doesn't help.
Maybe gimple_expand_builtin_pow() can be changed to generate the last
version by itself?
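
The effect described above can be sketched with a toy model. This is only an illustration under stated assumptions: the statement type, the pass functions and the matcher below are hypothetical stand-ins, not GCC's GIMPLE structures or the real widening_mul code. It shows why a matcher that wants a single-use multiply feeding an add succeeds only after both copy propagation and dead code elimination have run.

```c
#include <assert.h>
#include <stddef.h>

/* Toy three-address statements; these types are illustrative only and are
   not GCC's GIMPLE data structures.  Names are small positive integers.  */
enum toy_op { TOY_MUL, TOY_ADD, TOY_COPY };

struct toy_stmt {
  enum toy_op op;
  int dst;   /* name defined by the statement */
  int src1;  /* first operand (the only one for TOY_COPY) */
  int src2;  /* second operand, 0 for TOY_COPY */
  int dead;  /* set by toy_dce when the statement is removed */
};

/* Nonzero if the live statement S reads NAME.  */
static int
toy_uses (const struct toy_stmt *s, int name)
{
  if (s->dead)
    return 0;
  return s->src1 == name || (s->op != TOY_COPY && s->src2 == name);
}

/* Copy propagation: forward the source of each copy into later uses.  */
static void
toy_copy_prop (struct toy_stmt *s, size_t n)
{
  assert (s != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (s[i].op != TOY_COPY)
        continue;
      for (size_t j = i + 1; j < n; j++)
        {
          if (s[j].src1 == s[i].dst)
            s[j].src1 = s[i].src1;
          if (s[j].op != TOY_COPY && s[j].src2 == s[i].dst)
            s[j].src2 = s[i].src1;
        }
    }
}

/* Dead code elimination: walking backwards, mark statements whose result is
   neither LIVE_OUT nor used by a later live statement.  Returns the number
   of statements removed.  */
static size_t
toy_dce (struct toy_stmt *s, size_t n, int live_out)
{
  size_t removed = 0;
  assert (s != NULL);
  for (size_t i = n; i-- > 0;)
    {
      int used = s[i].dst == live_out;
      for (size_t j = i + 1; j < n && !used; j++)
        used = toy_uses (&s[j], s[i].dst);
      if (!used && !s[i].dead)
        {
          s[i].dead = 1;
          removed++;
        }
    }
  return removed;
}

/* Hypothetical matcher: the add at index ADD qualifies only if one operand
   is defined by a live multiply whose result has exactly one live use.  */
static int
toy_fma_candidate_p (const struct toy_stmt *s, size_t n, size_t add)
{
  assert (s != NULL && add < n && s[add].op == TOY_ADD);
  for (size_t i = 0; i < add; i++)
    {
      size_t uses = 0;
      if (s[i].dead || s[i].op != TOY_MUL || !toy_uses (&s[add], s[i].dst))
        continue;
      for (size_t j = i + 1; j < n; j++)
        uses += toy_uses (&s[j], s[i].dst);
      if (uses == 1)
        return 1;
    }
  return 0;
}
```

Feeding it the three statements from the sincos output (multiply, copy, add), the matcher fails on the raw sequence, still fails after toy_copy_prop alone because the dead copy keeps a second use of the multiply alive, and succeeds once toy_dce has removed that copy.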

Ira


 Thanks,
 Richard.

 Ira


 Thanks.  I would hope that we eventually can get rid of the
 pattern recognizer ... at least for SSE there is also always
 a scalar variant instruction for each vectorized one.

 Richard.

Re: Ping: [Patch] Make libstdc++'s abi_check more robust against readelf output format

2011-06-02 Thread Jonathan Wakely

On 2 June 2011 08:55, Simon Baldwin wrote:
 Index: libstdc++-v3/scripts/extract_symvers.in
 ===
 --- libstdc++-v3/scripts/extract_symvers.in     (revision 173951)
 +++ libstdc++-v3/scripts/extract_symvers.in     (working copy)
 @@ -52,6 +52,9 @@ SunOS)
   ${readelf} ${lib} |\
   sed -e 's/ \[other: [A-Fa-f0-9]*\] //' -e '/\.dynsym/,/^$/p;d' |\
   egrep -v ' (LOCAL|UND) ' |\
 +  sed -e 's/ processor specific: / processor_specific:_/g' |\
 +  sed -e 's/ OS specific: / OS_specific:_/g' |\
 +  sed -e 's/ unknown: / unknown:_/g' |\

Is there a reason to use three sed processes instead of one?
We already assume sed -e script -e script works earlier in that pipeline.

We could even replace the egrep with a sed 'd' command and combine it
all into a single sed, but that could be left for another day.

Re: Initialize INSN_COND

2011-06-02 Thread Bernd Schmidt

On 06/02/2011 01:29 PM, Alexander Monakov wrote:
 Bernd,
 
 The problem is INSN_COND should be reset when initializing a new deps
 structure, otherwise instructions may get stale conditions from other
 previously analyzed instructions.  Presuming that sd_init_insn is the
 proper place for that, I'll test the following patch.
 
 
 2011-06-02  Alexander Monakov  amona...@ispras.ru
 
   * sched-deps.c (sd_init_insn): Initialize INSN_COND.
   * sel-sched.c (move_op): Use correct type for 'res'.  Verify that
   code_motion_path_driver returned 0 or 1.

Ok. Although I wonder how sel-sched can end up reusing an entry in
h_d_i_d? How does it use this machinery? If it's not doing a normal
forward scan as in sched_analyze, the INSN_COND mechanism may break in
other ways.


Bernd

Re: [google] Improve locus information during if-conversion (issue4526101)

2011-06-02 Thread Diego Novillo

On Wed, Jun 1, 2011 at 21:03, Sharad Singhai sing...@google.com wrote:

 2011-06-01  Sharad Singhai  sing...@google.com

        Google Ref 39994
        * ifcvt.c (noce_try_cmove_arith): Use the locus information
        from the if-statement rather than the then path.

Could you elaborate on how it improves locus information?  Is there a
test case you can add to the testsuite?  Or an example code fragment
that shows how the locus is better now?

OK for google/main.


Diego.

[PATCH] make attribute((returns_twice)) actually work (PR tree-optimization/49243)

2011-06-02 Thread Mikael Pettersson

GCC has attribute((returns_twice)) which is supposed to allow the safe
use of alternate implementations of setjmp-like functions.  In particular,
a function that calls a setjmp-like function must itself not be inlined,
because that would enable unsafe optimizations.  This works for calls to
setjmp (a few alternate spellings are allowed), but not to e.g. my_setjmp
even if that function is declared with attribute((returns_twice)).  This
bug affects the entire gcc-4.x series, gcc-3.x worked; see PR49243.

A function that calls setjmp is marked non-inlinable because setjmp_call_p
is applied to the function position, and it deduces via special_function_p
that the callee is ECF_RETURNS_TWICE.  But special_function_p only looks at
the name, so setjmp_call_p fails to detect attribute((returns_twice)) callees.

The fix is to have setjmp_call_p also check if the returns_twice attribute
is present, via DECL_IS_RETURNS_TWICE.  It could call flags_from_decl_or_type
instead, but that would perform quite a bit of redundant work for this case.
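
A toy model of the before/after behaviour may make the fix easier to follow. It is a sketch only: struct toy_decl, the flag value and the list of names are assumptions made for illustration and do not reproduce GCC's tree type, ECF_RETURNS_TWICE or the name list in special_function_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a function declaration; not GCC's tree type.  */
struct toy_decl {
  const char *name;
  bool returns_twice_attr;  /* models __attribute__((returns_twice)) */
};

#define TOY_ECF_RETURNS_TWICE 0x1

/* Name-only detection, in the spirit of special_function_p.  The name list
   is an illustrative subset, not GCC's actual list.  */
static int
toy_special_function_p (const struct toy_decl *d)
{
  static const char *const names[] = { "setjmp", "_setjmp", "sigsetjmp" };
  assert (d != NULL && d->name != NULL);
  for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
    if (strcmp (d->name, names[i]) == 0)
      return TOY_ECF_RETURNS_TWICE;
  return 0;
}

/* Before the fix: only the callee's name is consulted.  */
static int
toy_setjmp_call_p_old (const struct toy_decl *d)
{
  return toy_special_function_p (d) & TOY_ECF_RETURNS_TWICE;
}

/* After the fix: the attribute is checked first, then the name.  */
static int
toy_setjmp_call_p_new (const struct toy_decl *d)
{
  assert (d != NULL);
  if (d->returns_twice_attr)
    return TOY_ECF_RETURNS_TWICE;
  return toy_special_function_p (d) & TOY_ECF_RETURNS_TWICE;
}
```

With a declaration named my_setjmp that carries the attribute, the old predicate returns 0 and the new one returns nonzero, while plain setjmp is detected by both.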

The test case uses -Winline to check that gcc refuses to inline a function
that calls a returns_twice callee.  This is sufficient to verify the fix, and
avoids the machine-specific code needed in the original runtime test case.

Tested w/o regressions with gcc trunk and 4.6 on x86_64-linux.  The added test
case does fail without the fix and pass with it.

OK for trunk, and perhaps 4.6?

(I don't have svn write access.)

/Mikael

gcc/

2011-06-02  Mikael Pettersson  mi...@it.uu.se

PR tree-optimization/49243
* calls.c (setjmp_call_p): Also check if fndecl has the
returns_twice attribute.

gcc/testsuite/

2011-06-02  Mikael Pettersson  mi...@it.uu.se

PR tree-optimization/49243
* gcc.dg/pr49243.c: New.

--- gcc-4.7-20110528/gcc/calls.c.~1~	2011-05-25 13:00:14.0 +0200
+++ gcc-4.7-20110528/gcc/calls.c	2011-06-02 12:55:32.0 +0200
@@ -554,6 +554,8 @@ special_function_p (const_tree fndecl, i
 int
 setjmp_call_p (const_tree fndecl)
 {
+  if (DECL_IS_RETURNS_TWICE (fndecl))
+    return ECF_RETURNS_TWICE;
   return special_function_p (fndecl, 0) & ECF_RETURNS_TWICE;
 }
 
--- gcc-4.7-20110528/gcc/testsuite/gcc.dg/pr49243.c.~1~	1970-01-01 01:00:00.0 +0100
+++ gcc-4.7-20110528/gcc/testsuite/gcc.dg/pr49243.c	2011-06-02 12:55:32.0 +0200
@@ -0,0 +1,25 @@
+/* PR tree-optimization/49243 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Winline" } */
+
+extern unsigned long jb[];
+extern int my_setjmp(unsigned long jb[]) __attribute__((returns_twice));
+extern int decode(const char*);
+
+static inline int wrapper(const char **s_ptr) /* { dg-warning "(inlining failed|function 'wrapper' can never be inlined because it uses setjmp)" } */
+{
+if (my_setjmp(jb) == 0) {
+   const char *s = *s_ptr;
+   while (decode(s) != 0)
+   *s_ptr = ++s;
+   return 0;
+} else
+   return -1;
+}
+
+void parse(const char *data)
+{
+const char *s = data;
+if (!(wrapper(&s) == -1 && (s - data) == 1)) /* { dg-warning "called from here" } */
+   __builtin_abort();
+}

Re: [google] Improve locus information during if-conversion (issue4526101)

2011-06-02 Thread Diego Novillo

On Thu, Jun 2, 2011 at 08:46, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, Jun 2, 2011 at 1:46 PM, Diego Novillo dnovi...@google.com wrote:
 On Wed, Jun 1, 2011 at 21:03, Sharad Singhai sing...@google.com wrote:

 2011-06-01  Sharad Singhai  sing...@google.com

        Google Ref 39994
        * ifcvt.c (noce_try_cmove_arith): Use the locus information
        from the if-statement rather than the then path.

 Could you elaborate on how it improves locus information?  Is there a
 test case you can add to the testsuite?  Or an example code fragment
 that shows how the locus is better now?

 OK for google/main.

 Why can't this patch just go onto the trunk?

Yes.  Every patch submitted for google/main also applies to trunk.  I
generally try to avoid approving patches that are not inside my
maintenance areas for trunk (unless the patch is obvious).

Sharad simply forgot to request trunk approval for this patch and I
forgot to remind him.  Sharad, could you mark future patches?

 (Idem for some other google/main patches -- is there a merge plan??)

The merge plan is to submit patches to trunk.  They are quickly moved
in google/main for scheduling reasons, but everything going to
google/main is automatically assumed to apply to trunk as well.

The only patches that we don't necessarily mean to apply to trunk are
the ones we put in google/integration (though some patches have
already been moved or approved for trunk).


Diego.

Re: [PATCH, ARM] Fix ABI for double-precision helpers on single-float-only CPUs

2011-06-02 Thread Paul Brook

 gcc/
 * config/arm/arm.c (arm_libcall_uses_aapcs_base)
 (arm_init_cumulative_args): Use correct ABI for double-precision
 helper functions in hard-float mode if only single-precision
 arithmetic is supported in hardware.

Ok, though I'd add a bit more explanation to the comments: Technically the 
same is true for the single precision helpers.  However all targets that 
support the hard-float ABI implement single-precision in hardware, so this 
never occurs in practice.

Paul

Re: [PATCH][all-langs] Defer size_t and sizetype setting to the middle-end

2011-06-02 Thread Janne Blomqvist

On Wed, Jun 1, 2011 at 14:34, Richard Guenther rguent...@suse.de wrote:

 This patch defers the control over size_t and sizetype to the
 middle-end which in turn consults the target.  This removes
 various inconsistencies for frontends that do not seem to care
 about size_t and will allow simplifying the global tree initialization.

 Bootstrapped on x86_64-unknown-linux-gnu for all languages, testing
 in progress.

 Ok for trunk?  (the change is worthwhile from an LTO and middle-end
 perspective and I'll apply leeway to frontends that appear to be
 unmaintained - hello Java)

Fortran parts are ok.

-- 
Janne Blomqvist

[gcc patch 0/3] libiberty: New DMGL_RET_DROP

2011-06-02 Thread Jan Kratochvil

Hi,

this series introduces DMGL_RET_DROP, which suppresses the return type that
the demangler prints from the linkage name for the toplevel function type.

DMGL_RET_POSTFIX is now in use only for DMGL_JAVA.  Besides Java, return types
are also present in the linkage name in C++, for function templates.

GDB since 7.2 provides a convenience alias for function template symbols
without the return type, so that for
00400523 T _Z4funcIdET_i
00400523 T double func<double>(int)
one can since GDB 7.2 easily find the template functions by name using:
(gdb) break fun<tab-completion>
instead of having to guess the return type first as in GDB 7.1:
(gdb) break 'double fun<tab-completion>

As demangler usage has been reintroduced for GDB 7.3 (to fix GDB PR 12506
and similar cases by using DW_AT_linkage_name again), the return type now
needs to be dropped by the demangler (instead of by the custom physname code
of GDB 7.2).

The return types in function template linkage names are a similar case to
DMGL_JAVA, which uses DMGL_RET_POSTFIX:
jmain.main(java.lang.String[])void
I believe both cases should use either DMGL_RET_POSTFIX or the new
DMGL_RET_DROP, the latter printing just:
jmain.main(java.lang.String[])
For C++ I (and also Tom Tromey) prefer DMGL_RET_DROP:
func<double>(int)
over DMGL_RET_POSTFIX:
func<double>(int)double
as in practice there are no two template function instances with the same name
+ parameters signature but different return type - one cannot overload a
function by return type in either C++ or Java.  G++ rejects compilation of
a CU containing two such instances; one can only link two different CUs
together to get the return type linkage name difference in a single file.
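
The three output shapes compared above can be sketched as plain string formatting. This is only a toy under stated assumptions: enum toy_ret_mode and toy_format are hypothetical names, they do not call the libiberty demangler, and the enum values are not the DMGL_* flag values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical modes mirroring the three behaviours discussed; these are
   not the libiberty DMGL_* flag values.  */
enum toy_ret_mode { TOY_RET_PREFIX, TOY_RET_POSTFIX, TOY_RET_DROP };

/* Format RET (the return type) and SIG (name plus parameters) into BUF
   according to MODE.  Returns the snprintf result, i.e. the length of the
   formatted string.  */
static int
toy_format (char *buf, size_t size, const char *ret, const char *sig,
            enum toy_ret_mode mode)
{
  assert (buf != NULL && ret != NULL && sig != NULL);
  switch (mode)
    {
    case TOY_RET_PREFIX:
      /* Default C++ style: "double func<double>(int)".  */
      return snprintf (buf, size, "%s %s", ret, sig);
    case TOY_RET_POSTFIX:
      /* Java style: "func<double>(int)double".  */
      return snprintf (buf, size, "%s%s", sig, ret);
    case TOY_RET_DROP:
    default:
      /* Proposed style: "func<double>(int)".  */
      return snprintf (buf, size, "%s", sig);
    }
}
```

Calling toy_format with "double" and "func<double>(int)" in each mode yields the prefix, postfix and dropped forms respectively, which is the whole difference between the options from the caller's point of view.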

After all, one can still reference the function by its original ELF symbol
'double func<double>(int)'.  I am proposing in a different GDB patch to use
DMGL_RET_DROP even for Java symbols, but I do not have any real need for it.

This patchset had no GCC regressions for Fedora gcc-4.6.0-8.fc15 (not tested
for GCC HEAD, hopefully OK).

The new testcases are based on a C++ source:

template <typename T>
char outer (int (*inner) (long)) { return 0; }
int outer2_ret (long) { return 0; }
template <typename T>
int (*outer2 (int (*inner) (long))) (long) { return outer2_ret; }
char outer (short (*inner) (int), long) { outer2<short> (0); return outer<short> (0);  }


Thanks,
Jan

[gcc patch 3/3] cp-demangle.c: Fix DMGL_RET_POSTFIX for inner func types

2011-06-02 Thread Jan Kratochvil

Hi,

I do not need this patch in any way, but I believe it should go in in case
anyone wants to use DMGL_RET_POSTFIX with C/C++.

Without this fix the new testcase would:
FAIL at line 3979, options --format=gnu-v3 --ret-postfix:
in:  _Z6outer2IsEPFilES1_
out: outer2<short>(int (*)(long))int (*(int (*)(long)))(long)
exp: outer2<short>(int (*)(long))int (*)(long)


Thanks,
Jan


libiberty/
2011-05-24  Jan Kratochvil  jan.kratoch...@redhat.com

* cp-demangle.c (d_print_comp) DEMANGLE_COMPONENT_FUNCTION_TYPE:
Suppress d_print_mod for DMGL_RET_POSTFIX.
* testsuite/demangle-expected: New testcases for --ret-postfix.

--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -3921,7 +3921,10 @@ d_print_comp (struct d_print_info *dpi, const struct demangle_component *dc,
 options & ~(DMGL_RET_POSTFIX | DMGL_RET_DROP));
 
/* Print return type if present */
-   if (d_left (dc) != NULL && (options & DMGL_RET_DROP) == 0)
+   if (d_left (dc) != NULL && (options & DMGL_RET_POSTFIX) != 0)
+ d_print_comp (dpi, d_left (dc),
+   options & ~(DMGL_RET_POSTFIX | DMGL_RET_DROP));
+   else if (d_left (dc) != NULL && (options & DMGL_RET_DROP) == 0)
  {
struct d_print_mod dpm;
 
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -3968,6 +3968,15 @@ outer(short (*)(int), long)
 --format=gnu-v3
 _Z6outer2IsEPFilES1_
 int (*outer2<short>(int (*)(long)))(long)
+--format=gnu-v3 --ret-postfix
+_Z5outerIsEcPFilE
+outer<short>(int (*)(long))char
+--format=gnu-v3 --ret-postfix
+_Z5outerPFsiEl
+outer(short (*)(int), long)
+--format=gnu-v3 --ret-postfix
+_Z6outer2IsEPFilES1_
+outer2<short>(int (*)(long))int (*)(long)
 --format=gnu-v3 --ret-drop
 _Z5outerIsEcPFilE
 outer<short>(int (*)(long))

[PATCH] PR fortran/49265 -- allow for double colon in module procedure statement

2011-06-02 Thread Steve Kargl

The attached patch allows for F2008's optional double
colon in a module procedure statement.  Built and
regression tested on trunk.  OK for trunk and 4.6?


Steven G. Kargl  ka...@gcc.gnu.org

   PR fortran/49265
   * decl.c (gfc_match_modproc):  Allow for a double colon in a module
   procedure statement.


2011-06-02  Steven G. Kargl  ka...@gcc.gnu.org

   PR fortran/49265
   * gfortran.dg/module_procedure_double_colon.f90:  New test.


-- 
Steve
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(revision 174566)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2011-06-02  Steven G. Kargl  ka...@gcc.gnu.org
+
+	PR fortran/49265
+	* decl.c (gfc_match_modproc):  Allow for a double colon in a module
+	procedure statement.
+ 
 2011-05-31  Tobias Burnus  bur...@net-b.de
 
 	PR fortran/18918
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c	(revision 174566)
+++ gcc/fortran/decl.c	(working copy)
@@ -7016,6 +7016,7 @@ gfc_match_modproc (void)
   char name[GFC_MAX_SYMBOL_LEN + 1];
   gfc_symbol *sym;
   match m;
+  locus old_locus;
   gfc_namespace *module_ns;
   gfc_interface *old_interface_head, *interface;
 
@@ -7044,10 +7045,22 @@ gfc_match_modproc (void)
  end up with a syntax error and need to recover.  */
   old_interface_head = gfc_current_interface_head ();
 
+  /* Check if the F2008 optional double colon appears.  */
+  old_locus = gfc_current_locus;
+  if (gfc_match (" :: ") == MATCH_YES)
+{
+  if (gfc_notify_std (GFC_STD_F2008, "Fortran 2008: double colon in "
+			 "MODULE PROCEDURE statement at %L", &old_locus)
+	  == FAILURE)
+	return MATCH_ERROR;
+}
+  else
+gfc_current_locus = old_locus;
+  
   for (;;)
 {
-  locus old_locus = gfc_current_locus;
   bool last = false;
+  old_locus = gfc_current_locus;
 
   m = gfc_match_name (name);
   if (m == MATCH_NO)
@@ -7059,6 +7072,7 @@ gfc_match_modproc (void)
 	 current namespace.  */
   if (gfc_match_eos () == MATCH_YES)
 	last = true;
+
   if (!last && gfc_match_char (',') != MATCH_YES)
 	goto syntax;
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 174566)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2011-06-02  Steven G. Kargl  ka...@gcc.gnu.org
+
+	PR fortran/49265
+	* gfortran.dg/module_procedure_double_colon.f90:  New test.
+
 2011-06-02  Eric Botcazou  ebotca...@adacore.com
 	Hans-Peter Nilsson  h...@axis.com
 
Index: gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90
===
--- gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90	(revision 0)
@@ -0,0 +1,16 @@
+! { dg-do compile }
+! { dg-options "-std=f95" }
+!
+! PR fortran/49265
+! Contributed by Erik Toussaint
+!
+module m1
+   implicit none
+   interface foo
+  module procedure :: bar ! { dg-error "double colon" }
+   end interface
+contains
+   subroutine bar
+   end subroutine
+end module
+! { dg-final { cleanup-modules "m1" } }

Re: fix left-over debug insns in DCE

2011-06-02 Thread Eric Botcazou

 One of the issues was that DCE removed an insn that set a REG in a
 certain mode, without adjusting a debug use of that REG.  This was in
 libstdc++, but I failed to take note of the affected file.  DF later
 attached that debug use to another SET to the same REG in a different,
 incompatible mode.  When that one was found to be dead by DF, we ended
 up ICEing as we attempted to emit the invalid SUBREGs.

 I reused some of the infrastructure to propagate dead DEFs into debug
 uses in DF to get DCE to emit debug temps and adjust debug uses as well,
 fixing this issue.  While at that, I improved the handling of unused
 DEFs in DF, that previously resulted in loss of debug information, so as
 to retain it as much as possible.

Why can't the problem be addressed purely within DF?  Starting to spill the DF 
logic to individual RTL passes doesn't look very appealing to me.

 This is the patch I ended up with.  Regstrapped on x86_64-linux-gnu and
 i686-linux-gnu.  Ok to install?

OK for the usual debug insn bookkeeping, i.e.

* dce.c (reset_unmarked_insns_debug_uses): New.
(delete_unmarked_insns): Skip debug insns.
(prescan_insns_for_dce): Likewise.
(rest_of_handle_ud_dce): Propagate debug uses.
* reg-stack.c (subst_stack_regs_in_debug_insn): Signal when no
active reg can be found.
(subst_all_stack_regs_in_debug_insn): New.  Reset debug insn then.
(convert_regs_1): Use it.

The rest needs further discussion IMO.

-- 
Eric Botcazou

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Richard Guenther

On Thu, Jun 2, 2011 at 1:08 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 2 June 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote:
 On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote:
 On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote:

 Did you think about moving pass_optimize_widening_mul before
 loop optimizations?  Does that pass catch the cases you are
 teaching the pattern recognizer?  I think we should try to expose
 these more complicated instructions to loop optimizers.


 pass_optimize_widening_mul doesn't catch these cases, but I can try to
 teach it instead of the vectorizer.
 I am now testing

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,7 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +935,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 to see how it affects other loop optimizations (vectorizer pattern
 tests obviously fail).

 Looks like it needs copy_prop and dce as well:

 Index: passes.c
 ===
 --- passes.c    (revision 174391)
 +++ passes.c    (working copy)
 @@ -870,6 +870,9 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
 +      NEXT_PASS (pass_copy_prop);
 +      NEXT_PASS (pass_dce);
 +      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = pass_tree_loop.pass.sub;
 @@ -934,7 +937,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
 -      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

 otherwise I get (on x86_64-suse-linux)

 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
 FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

 Hmm.  I would have put the pass next to the sincos pass, but yes,
  in principle a copyprop & dce pass after PRE makes sense
 (the loop passes likely don't run because there are no loops in
 those testcases - both copyprop and dce should be scheduled
 more like TODOs, or even automatically by the pass manager
 via PROPs ...).  Dead code can indeed confuse those matching
 passes that look for single-use vars.

 I'll think about a more elegant solution for this problem.

 Would you mind checking if the next-to-sincos position makes
 any difference?

 Before sincos we have

  D.2747_2 = __builtin_powf (a_1(D), 2.0e+0);
  D.2746_4 = D.2747_2 + c_3(D);

 which is transformed by sincos to

  powmult.8_7 = a_1(D) * a_1(D);
  D.2747_2 = powmult.8_7;
  D.2746_4 = D.2747_2 + c_3(D);

 but widening_mul is confused by D.2747_2 = powmult.8_7; and it needs
 both copy_prop and dce to remove it:

  powmult.8_7 = a_1(D) * a_1(D);
  D.2746_4 = c_3(D) + powmult.8_7;

 So moving widening_mul next to sincos doesn't help.
 Maybe gimple_expand_builtin_pow() can be changed to generate the last
 version by itself?

Yeah, I guess so.  I'll have a look.

Richard.

 Ira


 Thanks,
 Richard.

 Ira


 Thanks.  I would hope that we eventually can get rid of the
 pattern recognizer ... at least for SSE there is also always
 a scalar variant instruction for each vectorized one.

 Richard.

Re: [PATCH, ARM] Fix ABI for double-precision helpers on single-float-only CPUs

2011-06-02 Thread Richard Earnshaw


On Fri, 2011-05-27 at 17:32 +0100, Julian Brown wrote:
 The helper functions used to implement double-precision arithmetic on
 ARM processors that only support single-precision arithmetic in hardware
 should use the soft-float ABI (i.e. passing and returning floating-point
 arguments in core registers), even when -mfloat-abi=hard is in effect.
 This patch tweaks the ABI for the affected functions so that is true.
 
 Tested with cross to ARM EABI, and by manually observing compiler
 output. We've also been carrying this patch in our local tree for some
 time without issue.
 
 OK to apply?
 
 Thanks,
 
 Julian
 
 ChangeLog
 
 gcc/
 * config/arm/arm.c (arm_libcall_uses_aapcs_base)
 (arm_init_cumulative_args): Use correct ABI for double-precision
 helper functions in hard-float mode if only single-precision
 arithmetic is supported in hardware.

I see Paul has already approved this, but I've just spotted one
potential problem that might cause latent bugs sometime in the future.

The code to register the libcalls is only run once, the first time we
try to look up a libcall.  If we ever end up allowing dynamic changing
of CPU and optimization options, not registering the other libcalls will
lead to subtle problems at run time.  I suggest that these functions be
unconditionally added along with the other libcalls.

I also don't understand why all the tests are needed in
arm_init_cumulative_args?  Surely arm_libcall_uses_aapcs_base() will
already have run that test.

R.

[gc-improv] Fix all remaining C testsuite failures

2011-06-02 Thread Laurynas Biveinis

This patch
- Fixes PCH failures by re-initializing struct function after PCH read.
- Allocates a couple of global RTXes in the permanent memory.
- Fixes RTL copying by taking source and destination memory areas into account.
I.e. RTXes that would normally be shared are copied if the source is in the
permanent area and the destination is in the function area.  It asserts that
this does not happen in the cases where copying is meaningless.
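
The copy-versus-share rule stated in the last item can be written down as a small predicate. This is a sketch under assumptions: enum toy_area and toy_need_copy_p are hypothetical names for illustration, and the branch's real need_copy_p and allocated_in_function_mem_p are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory areas; the branch's real predicates are not
   reproduced here.  */
enum toy_area { TOY_PERMANENT, TOY_FUNCTION };

/* Decide whether an rtx must be copied rather than shared.
   NORMALLY_SHARED says whether this kind of rtx is shared by default.  */
static bool
toy_need_copy_p (bool normally_shared, enum toy_area src, enum toy_area dst)
{
  assert (src == TOY_PERMANENT || src == TOY_FUNCTION);
  assert (dst == TOY_PERMANENT || dst == TOY_FUNCTION);
  /* Kinds of rtx that are never shared are always copied.  */
  if (!normally_shared)
    return true;
  /* A shareable rtx is copied only when it lives in permanent memory and
     the copy is destined for the function area.  */
  return src == TOY_PERMANENT && dst == TOY_FUNCTION;
}
```

Under this reading, a shareable rtx stays shared when source and destination are in the same area, and is duplicated only for the permanent-to-function case.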

With this, the C testsuite achieves parity on x86_64-unknown-linux-gnu, but 
I've lagged significantly behind with merges from trunk. 

My next step is to implement poisoning of function memory area and also I will 
look into walking GC and checking for non-GTY((skip)) pointers pointing to RTL 
memory areas. I will do this before I do the next merge, as hopefully this will 
make merges easier.

This patch took me two months. At this pace, I'm not sure the branch will be 
ready for consideration for 4.7.

2011-06-02  Laurynas Biveinis  laurynas.bivei...@gmail.com

* varasm.c (make_decl_rtl): Allocate DECL_RTL in the permanent RTL
memory.

* rtl.c: (_obstack_allocated_p): Declare.
(allocated_in_function_mem_p): New.
(need_copy_p): New.
(copy_rtx): Re-enable sharing of CONST_VECTOR rtxes.  Use
need_copy_p to decide on copying vs. sharing of rtxes.

* function.c (reinit_struct_function): New.
(set_cfun, prepare_function_start): Call it.

* config/i386/i386.c (ix86_expand_split_stack_prologue): Allocate
split_stack_fn in the permanent RTL memory.
(ix86_expand_split_stack_prologue): Allocate split_stack_fn_large
in the permanent RTL memory.

Index: gcc/function.c
===
--- gcc/function.c	(revision 171651)
+++ gcc/function.c	(working copy)
@@ -151,6 +151,7 @@
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
 static void set_insn_locators (rtx, int) ATTRIBUTE_UNUSED;
+static void reinit_struct_function (void);
 
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -4316,6 +4317,7 @@
 {
   cfun = new_cfun;
   invoke_set_current_function_hook (new_cfun ? new_cfun->decl : NULL_TREE);
+  reinit_struct_function ();
 }
 }
 
@@ -4417,6 +4419,16 @@
   allocate_struct_function (fndecl, false);
 }
 
+/* Initialize those parts of struct function that are cleared during PCH read
+   and write.  */
+
+static void
+reinit_struct_function (void)
+{
+  if (cfun && !cfun->machine && init_machine_status)
+    cfun->machine = (*init_machine_status) ();
+}
+
 /* Reset crtl and other non-struct-function variables to defaults as
appropriate for emitting rtl at the start of a function.  */
 
@@ -4437,8 +4449,7 @@
 }
 
   /* cfun->machine is NULL after PCH read.  Initialize it.  */
-  if (!cfun->machine && init_machine_status)
-    cfun->machine = (*init_machine_status) ();
+  reinit_struct_function ();
 
   cse_not_expected = ! optimize;
 
Index: gcc/ChangeLog.gc-improv
===
--- gcc/ChangeLog.gc-improv	(revision 172076)
+++ gcc/ChangeLog.gc-improv	(working copy)
@@ -1,3 +1,22 @@
+2011-06-02  Laurynas Biveinis  laurynas.bivei...@gmail.com
+
+	* varasm.c (make_decl_rtl): Allocate DECL_RTL in the permanent RTL
+	memory.
+
+	* rtl.c: (_obstack_allocated_p): Declare.
+	(allocated_in_function_mem_p): New.
+	(need_copy_p): New.
+	(copy_rtx): Re-enable sharing of CONST_VECTOR rtxes.  Use
+	need_copy_p to decide on copying vs. sharing of rtxes.
+
+	* function.c (reinit_struct_function): New.
+	(set_cfun, prepare_function_start): Call it.
+
+	* config/i386/i386.c (ix86_expand_split_stack_prologue): Allocate
+	split_stack_fn in the permanent RTL memory.
+	(ix86_expand_split_stack_prologue): Allocate split_stack_fn_large
+	in the permanent RTL memory.
+
 2011-04-07  Laurynas Biveinis  laurynas.bivei...@gmail.com
 
 	* stmt.c (label_rtx): Allocate RTX in permanent RTL memory.
Index: gcc/varasm.c
===
--- gcc/varasm.c	(revision 171651)
+++ gcc/varasm.c	(working copy)
@@ -1238,6 +1238,9 @@
 		 optimization may eliminate reads and/or 
 		 writes to register variables);
 
+	  if (TREE_STATIC (decl))
+	use_rtl_permanent_mem ();
+
 	  /* If the user specified one of the eliminables registers here,
 	 e.g., FRAME_POINTER_REGNUM, we don't want to get this variable
 	 confused with that register and be eliminated.  This usage is
@@ -1248,6 +1251,9 @@
 	  REG_USERVAR_P (DECL_RTL (decl)) = 1;
 
 	  if (TREE_STATIC (decl))
+	use_rtl_function_mem ();
+
+	  if (TREE_STATIC (decl))
 	{
 	  /* Make this register global, so not usable for anything
 		 else.  */
Index: gcc/rtl.c
===
--- gcc/rtl.c	(revision 171651)
+++ gcc/rtl.c	(working copy)
@@ -150,6 +150,9 @@
 static

Re: [patch] Fix PR tree-optimization/49038

2011-06-02 Thread Ira Rosen

On 26 May 2011 10:52, Ira Rosen ira.ro...@linaro.org wrote:
 Hi,

 The vectorizer supports strided loads with gaps, e.g., when only a[4i]
 and a[4i+2] are accessed, it generates a vector load a[4i:4i+3], i.e.,
 creating an access to a[4i+3], which doesn't exist in the scalar code.
 This access may be invalid as described in the PR.

 This patch creates an epilogue loop (with at least one iteration) for
 such cases.

 Bootstrapped and tested on powerpc64-suse-linux.
 Applied to trunk. I'll prepare patches for 4.5 and 4.6 next week.
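
The iteration arithmetic behind "at least one epilogue iteration" can be sketched as follows. The function names are hypothetical, and the code only mirrors the ratio computation visible in the tree-vect-loop-manip.c hunk (subtract one iteration, then shift right by log2 of the vectorization factor); it is not the vectorizer's code.

```c
#include <assert.h>

/* Number of vector iterations for NI scalar iterations and a vectorization
   factor of 1 << LOG_VF.  With PEEL_FOR_GAPS, one iteration is subtracted
   first, as the patch does before computing RATIO.  */
static unsigned long
toy_vector_iters (unsigned long ni, unsigned log_vf, int peel_for_gaps)
{
  assert (ni >= 1);
  return (peel_for_gaps ? ni - 1 : ni) >> log_vf;
}

/* Scalar iterations left over for the epilogue loop.  */
static unsigned long
toy_epilogue_iters (unsigned long ni, unsigned log_vf, int peel_for_gaps)
{
  return ni - (toy_vector_iters (ni, log_vf, peel_for_gaps) << log_vf);
}
```

For 8 scalar iterations and a vectorization factor of 4, the plain computation gives 2 vector iterations and an empty epilogue, so the last vector load may touch the element past the final gap. With peeling for gaps it gives 1 vector iteration and 4 epilogue iterations, so the tail is handled by scalar code.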


Here are the patches. Bootstrapped and tested on x86_64-suse-linux
(4.5) and on powerpc64-suse-linux (4.6).
OK to apply?

Thanks,
Ira


4.6 ChangeLog:

 PR tree-optimization/49038
 * tree-vect-loop-manip.c (vect_generate_tmps_on_preheader):
 Ensure at least one epilogue iteration if required by data
 accesses with gaps.
 * tree-vectorizer.h (struct _loop_vec_info): Add new field
 to mark loops that require peeling for gaps.
 * tree-vect-loop.c (new_loop_vec_info): Initialize new field.
 (vect_get_known_peeling_cost): Take peeling for gaps into
 account.
 (vect_transform_loop): Generate epilogue if required by data
 access with gaps.
 * tree-vect-data-refs.c (vect_analyze_group_access): Mark the
 loop as requiring an epilogue if there are gaps in the end of
 the strided group.

4.5 ChangeLog:

 PR tree-optimization/49038
 * tree-vect-loop-manip.c (vect_generate_tmps_on_preheader):
 Ensure at least one epilogue iteration if required by data
 accesses with gaps.
 * tree-vectorizer.h (struct _loop_vec_info): Add new field
 to mark loops that require peeling for gaps.
 * tree-vect-loop.c (new_loop_vec_info): Initialize new field.
 (vect_estimate_min_profitable_iters): Take peeling for gaps into
 account.
 (vect_transform_loop): Generate epilogue if required by data
 access with gaps.
 * tree-vect-data-refs.c (vect_analyze_group_access): Mark the
 loop as requiring an epilogue if there are gaps in the end of
 the strided group.

4.6 and 4.5 testsuite/ChangeLog:

 PR tree-optimization/49038
 * gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c: New test.
 * gcc.dg/vect/pr49038.c: New test.
Index: tree-vect-loop-manip.c
===
--- tree-vect-loop-manip.c  (revision 174565)
+++ tree-vect-loop-manip.c  (working copy)
@@ -1516,7 +1516,7 @@ vect_generate_tmps_on_preheader (loop_ve
   edge pe;
   basic_block new_bb;
   gimple_seq stmts;
-  tree ni_name;
+  tree ni_name, ni_minus_gap_name;
   tree var;
   tree ratio_name;
   tree ratio_mult_vf_name;
@@ -1533,9 +1533,39 @@ vect_generate_tmps_on_preheader (loop_ve
   ni_name = vect_build_loop_niters (loop_vinfo, cond_expr_stmt_list);
   log_vf = build_int_cst (TREE_TYPE (ni), exact_log2 (vf));
 
+  /* If epilogue loop is required because of data accesses with gaps, we
+ subtract one iteration from the total number of iterations here for
+ correct calculation of RATIO.  */
+  if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
+{
+  ni_minus_gap_name = fold_build2 (MINUS_EXPR, TREE_TYPE (ni_name),
+  ni_name,
+  build_one_cst (TREE_TYPE (ni_name)));
+  if (!is_gimple_val (ni_minus_gap_name))
+   {
+ var = create_tmp_var (TREE_TYPE (ni), "ni_gap");
+  add_referenced_var (var);
+
+  stmts = NULL;
+  ni_minus_gap_name = force_gimple_operand (ni_minus_gap_name, &stmts,
+   true, var);
+  if (cond_expr_stmt_list)
+gimple_seq_add_seq (&cond_expr_stmt_list, stmts);
+  else
+{
+  pe = loop_preheader_edge (loop);
+  new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
+  gcc_assert (!new_bb);
+}
+}
+}
+  else
+ni_minus_gap_name = ni_name;
+
   /* Create: ratio = ni >> log2(vf) */
 
-  ratio_name = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ni_name), ni_name, log_vf);
+  ratio_name = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ni_minus_gap_name),
+   ni_minus_gap_name, log_vf);
   if (!is_gimple_val (ratio_name))
 {
   var = create_tmp_var (TREE_TYPE (ni), "bnd");
Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c
===
--- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c (revision 0)
+++

Re: [PATCH] PR fortran/49265 -- allow for double colon in module procedure statement

2011-06-02 Thread Steve Kargl

On Thu, Jun 02, 2011 at 05:59:29PM +0200, Thomas Koenig wrote:
 Hi Steve,
 
 it seems that, with your patch,
 
interface foo
   module procedure::bar
end interface
 
 is rejected, as is
 
   interface foo
  module procedure:: bar
   end interface
 
 Is this the way it is supposed to be?
 

Oh phew.  Good catch.  I wasn't dealing with
the possible white space issues.  Here's an
updated patch and testcase.

-- 
Steve
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(revision 174566)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,11 @@
+2011-06-02  Steven G. Kargl  ka...@gcc.gnu.org
+
+	PR fortran/49265
+	* decl.c (gfc_match_modproc):  Allow for a double colon in a module
+	procedure statement.
+	* parse.c ( decode_statement): Deal with whitespace around :: in
+	gfc_match_modproc.
+ 
 2011-05-31  Tobias Burnus  bur...@net-b.de
 
 	PR fortran/18918
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c	(revision 174566)
+++ gcc/fortran/decl.c	(working copy)
@@ -7016,6 +7016,7 @@ gfc_match_modproc (void)
   char name[GFC_MAX_SYMBOL_LEN + 1];
   gfc_symbol *sym;
   match m;
+  locus old_locus;
   gfc_namespace *module_ns;
   gfc_interface *old_interface_head, *interface;
 
@@ -7044,10 +7045,23 @@ gfc_match_modproc (void)
  end up with a syntax error and need to recover.  */
   old_interface_head = gfc_current_interface_head ();
 
+  /* Check if the F2008 optional double colon appears.  */
+  gfc_gobble_whitespace ();
+  old_locus = gfc_current_locus;
+  if (gfc_match ("::") == MATCH_YES)
+    {
+      if (gfc_notify_std (GFC_STD_F2008, "Fortran 2008: double colon in "
+			 "MODULE PROCEDURE statement at %L", &old_locus)
+	  == FAILURE)
+	return MATCH_ERROR;
+    }
+  else
+    gfc_current_locus = old_locus;
+  
   for (;;)
 {
-  locus old_locus = gfc_current_locus;
   bool last = false;
+  old_locus = gfc_current_locus;
 
   m = gfc_match_name (name);
   if (m == MATCH_NO)
@@ -7059,6 +7073,7 @@ gfc_match_modproc (void)
 	 current namespace.  */
   if (gfc_match_eos () == MATCH_YES)
 	last = true;
+
   if (!last && gfc_match_char (',') != MATCH_YES)
 	goto syntax;
 
Index: gcc/fortran/parse.c
===
--- gcc/fortran/parse.c	(revision 174566)
+++ gcc/fortran/parse.c	(working copy)
@@ -399,7 +399,7 @@ decode_statement (void)
   break;
 
 case 'm':
-  match ("module% procedure% ", gfc_match_modproc, ST_MODULE_PROC);
+  match ("module% procedure", gfc_match_modproc, ST_MODULE_PROC);
   match ("module", gfc_match_module, ST_MODULE);
   break;
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 174566)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2011-06-02  Steven G. Kargl  ka...@gcc.gnu.org
+
+	PR fortran/49265
+	* gfortran.dg/module_procedure_double_colon.f90:  New test.
+
 2011-06-02  Eric Botcazou  ebotca...@adacore.com
 	Hans-Peter Nilsson  h...@axis.com
 
Index: gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90
===
--- gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/module_procedure_double_colon.f90	(revision 0)
@@ -0,0 +1,24 @@
+! { dg-do compile }
+! { dg-options "-std=f95" }
+!
+! PR fortran/49265
+! Contributed by Erik Toussaint
+!
+module m1
+   implicit none
+   interface foo
+  module procedure::bar   ! { dg-error "double colon" }
+  module procedure ::bar_none ! { dg-error "double colon" }
+  module procedure:: none_bar ! { dg-error "double colon" }
+   end interface
+contains
+   subroutine bar
+   end subroutine
+   subroutine bar_none(i)
+ integer i
+   end subroutine
+   subroutine none_bar(x)
+ real x
+   end subroutine
+end module
+! { dg-final { cleanup-modules m1 } }
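
The whitespace cases exercised by the test above can be sketched in plain C. This is not gfortran's matcher: `skip_module_procedure_prefix` is a hypothetical helper, and it assumes lower-case free-form input in which blanks separate the two keywords. It accepts `module procedure`, an optional `::` with optional blanks on either side, and returns a pointer to the first procedure name, or NULL if the prefix does not match.

```c
#include <stddef.h>
#include <string.h>

/* Skip spaces and tabs.  */
static const char *
skip_blanks (const char *p)
{
  while (*p == ' ' || *p == '\t')
    p++;
  return p;
}

/* Hypothetical helper, not part of gfortran: return a pointer to the
   first name after "module procedure [::]", or NULL on a mismatch.  */
const char *
skip_module_procedure_prefix (const char *stmt)
{
  const char *p = skip_blanks (stmt);
  const char *q;

  if (strncmp (p, "module", 6) != 0)
    return NULL;
  p += 6;

  /* At least one blank must separate the two keywords.  */
  q = skip_blanks (p);
  if (q == p || strncmp (q, "procedure", 9) != 0)
    return NULL;
  p = q + 9;

  /* The double colon may touch either neighbour.  */
  q = skip_blanks (p);
  if (strncmp (q, "::", 2) == 0)
    return skip_blanks (q + 2);

  /* Without "::" a blank is still needed before the name.  */
  return q == p ? NULL : q;
}
```

Under these assumptions all three spellings from the test case (`procedure::bar`, `procedure ::bar_none`, `procedure:: none_bar`) reach the procedure name, while `module procedurebar` is rejected.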

Re: [PATCH] PR fortran/49265 -- allow for double colon in module procedure statement

2011-06-02 Thread Thomas Koenig


Hi Steve,



Oh phew.  Good catch.  I wasn't dealing with
the possible white space issues.  Here's an
updated patch and testcase.



OK for trunk.  Could you also add the test case a second time,
without -std=f95, to make sure it keeps passing?

Thanks for the patch!

Thomas

Re: [PATCH, ARM] Cortex-A5 tuning [2/2] - tweak instruction conditionalisation

2011-06-02 Thread Julian Brown

On Wed, 01 Jun 2011 17:00:30 +0100
Richard Earnshaw rearn...@arm.com wrote:

 
 On Wed, 2011-06-01 at 16:49 +0100, Julian Brown wrote:
  This patch tweaks the behaviour of arm_final_prescan_insn when
  tuning for Cortex-A5 cores, since branches are cheaper than long
  sequences of conditionalised instructions on those processors. As
  posted in the previous patch, this provides a measurable increase
  in performance on a popular embedded benchmark.
  
  (I didn't use the tuning infrastructure for this one, though it
  could easily be changed to do so, now I come to think of it.)

 I would much prefer that this was done through the tuning
 infrastructure.  If one core likes it this way, there's a strong
 chance of another one coming along that has similar preferences.

How does this version look? I've left the size-optimisation case the
same (max_insns_skipped=6), but added a tunable integer to the
tune_params structure allowing the speed-optimisation case to be varied
according to the chosen target tuning.

To maintain existing semantics, this means duplicating the fastmul
structure for the StrongARM (XScale also used the StrongARM
setting, but already has its own tuning structure).

Minimally re-tested. OK to apply?

Thanks,

Julian
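
A minimal sketch of the selection logic described above, assuming a trimmed-down stand-in for the tuning structure. The names `tune_sketch` and `choose_max_insns_skipped` are hypothetical; the values 6 (size optimisation), 5 (previous speed default) and 1 (Cortex-A5) are the ones quoted in this message, its ChangeLog and the patch.

```c
/* Hypothetical, trimmed-down stand-in for struct tune_params.  */
struct tune_sketch
{
  int max_insns_skipped;	/* Speed-optimisation limit for this core.  */
};

/* Previous default, and the Cortex-A5 value named in the ChangeLog.  */
const struct tune_sketch default_tune_sketch = { 5 };
const struct tune_sketch cortex_a5_tune_sketch = { 1 };

/* Return the maximum number of instructions to conditionalise.  When
   optimising for size the limit stays at 6 whatever the tuning.  */
int
choose_max_insns_skipped (const struct tune_sketch *tune, int optimize_size)
{
  return optimize_size ? 6 : tune->max_insns_skipped;
}
```

Keeping the limit in the per-core structure means a future core with similar preferences only needs its own table entry, which is the point Richard raised.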

ChangeLog

gcc/
* config/arm/arm-cores.def (strongarm, strongarm110, strongarm1100)
(strongarm1110): Use strongarm tuning.
* config/arm/arm-protos.h (tune_params): Add max_insns_skipped
field.
* config/arm/arm.c (arm_strongarm_tune): New.
(arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
(arm_v6t2_tune, arm_cortex_tune, arm_cortex_a5_tune)
(arm_cortex_a9_tune, arm_fa726te_tune): Add max_insns_skipped field
setting, using previous defaults or 1 for Cortex-A5.
(arm_option_override): Set max_insns_skipped from current tuning.

commit 2116062b95b55fc048d54321c8b41a4d83175430
Author: Julian Brown jul...@henry7.codesourcery.com
Date:   Fri May 27 11:26:57 2011 -0700

Tune max_insns_skipped for conditionalization for Cortex-A5.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 4ff2324..89697c0 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -70,10 +70,10 @@ ARM_CORE(arm7dmi,   arm7dmi,	3M,	FL_CO_PROC | FL_MODE26, fastmul)
 /* V4 Architecture Processors */
 ARM_CORE(arm8,  arm8,		4,	 FL_MODE26 | FL_LDSCHED, fastmul)
 ARM_CORE(arm810,arm810,	4,	 FL_MODE26 | FL_LDSCHED, fastmul)
-ARM_CORE(strongarm, strongarm,	4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE(strongarm110,  strongarm110,	4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE(strongarm1100, strongarm1100, 4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
-ARM_CORE(strongarm1110, strongarm1110, 4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, fastmul)
+ARM_CORE(strongarm, strongarm,	4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE(strongarm110,  strongarm110,	4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE(strongarm1100, strongarm1100, 4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
+ARM_CORE(strongarm1110, strongarm1110, 4,	 FL_MODE26 | FL_LDSCHED | FL_STRONG, strongarm)
 ARM_CORE(fa526, fa526,4,   FL_LDSCHED, fastmul)
 ARM_CORE(fa626, fa626,4,   FL_LDSCHED, fastmul)
 
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index c104d74..67aee46 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -221,6 +221,9 @@ struct tune_params
   bool (*rtx_costs) (rtx, RTX_CODE, RTX_CODE, int *, bool);
   bool (*sched_adjust_cost) (rtx, rtx, rtx, int *);
   int constant_limit;
+  /* Maximum number of instructions to conditionalise in
+ arm_final_prescan_insn.  */
+  int max_insns_skipped;
   int num_prefetch_slots;
   int l1_cache_size;
   int l1_cache_line_size;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index cd3f104..8f01202 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -857,6 +857,7 @@ const struct tune_params arm_slowmul_tune =
   arm_slowmul_rtx_costs,
   NULL,
   3,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,		/* Prefer constant pool.  */
   arm_default_branch_cost
@@ -867,6 +868,21 @@ const struct tune_params arm_fastmul_tune =
   arm_fastmul_rtx_costs,
   NULL,
   1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
+};
+
+/* StrongARM has early execution of branches, so a sequence that is worth
+   skipping is shorter.  Set max_insns_skipped to a lower value.  */
+
+const struct tune_params arm_strongarm_tune =
+{
+  arm_fastmul_rtx_costs,
+  NULL,
+  1,		/* Constant limit.  */
+

Re: [google] Improve locus information during if-conversion (issue4526101)

2011-06-02 Thread शरद सिंघई

This patch improves precision of the line number information during
coverage mode. Yes, I need to add an example/test case. I was planning
to do that before I propose this patch for trunk as well.

Thanks,
Sharad

On Thu, Jun 2, 2011 at 4:46 AM, Diego Novillo dnovi...@google.com wrote:

 On Wed, Jun 1, 2011 at 21:03, Sharad Singhai sing...@google.com wrote:

  2011-06-01  Sharad Singhai  sing...@google.com
 
         Google Ref 39994
         * ifcvt.c (noce_try_cmove_arith): Use the locus information
         from the if-statement rather than the then path.

 Could you elaborate how it improves locus information?  Is there a
 test case you can add to the testsuite?  Or an example code fragment
 that shows how is the locus better now?

 OK for google/main.


 Diego.

Re: [patch] testsuite: support board_info timeouts

2011-06-02 Thread DJ Delorie

I never got feedback from the testsuite maintainers on this one...

 Date: Mon, 9 Aug 2010 23:48:31 -0400
 From: DJ Delorie d...@redhat.com
 Mailing-List: contact gcc-patches-h...@gcc.gnu.org; run by ezmlm

 Is there any reason why we don't support board-level timeouts?  It's
 really hard to specify timeouts for sid-based embedded targets with
 lots of multilibs (or just one, sometimes).

 It's certainly better than really REALLY ugly which is the only
 other option at that point.

   * lib/timeout.exp (timeout): Add board_info support.

  2010-08-09  Thomas Koenig  tkoe...@gcc.gnu.org
 Index: lib/timeout.exp
 ===
 --- lib/timeout.exp   (revision 163048)
 +++ lib/timeout.exp   (working copy)
 @@ -43,12 +43,14 @@ proc timeout_value { args } {
  if [info exists individual_timeout] {
   set val $individual_timeout
  } elseif [info exists tool_timeout] {
   set val $tool_timeout
  } elseif [target_info exists gcc,timeout] {
   set val [target_info gcc,timeout]
 +} elseif [board_info target exists gcc,timeout] {
 + set val [board_info target gcc,timeout]
  } else {
   # This is really, REALLY ugly, but this is the default from
   # remote.exp deep within DejaGnu.
   set val 300
  }
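
The lookup order in the patched `timeout_value` can be modelled as a simple fallback chain. The sketch below is C rather than Tcl and is only a model: `timeout_sources` and `pick_timeout` are hypothetical names, and a negative value stands for "not set".

```c
/* Hypothetical model of the lookup order in timeout_value; a negative
   value stands for "not set".  */
struct timeout_sources
{
  int individual;	/* Per-test override.  */
  int tool;		/* tool_timeout.  */
  int target;		/* target_info gcc,timeout.  */
  int board;		/* board_info target gcc,timeout (new).  */
};

/* Return the first timeout that is set, in priority order.  */
int
pick_timeout (const struct timeout_sources *s)
{
  if (s->individual >= 0)
    return s->individual;
  if (s->tool >= 0)
    return s->tool;
  if (s->target >= 0)
    return s->target;
  if (s->board >= 0)
    return s->board;
  return 300;		/* Default used by the patch's final else branch.  */
}
```

The board-level value is consulted last before the default, so existing per-test, per-tool and target settings keep their precedence.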

Re: [PATCH] PR fortran/49265 -- allow for double colon in module procedure statement

2011-06-02 Thread Steve Kargl

On Thu, Jun 02, 2011 at 06:39:18PM +0200, Thomas Koenig wrote:
 Hi Steve,
 
 
 Oh phew.  Good catch.  I wasn't dealing with
 the possible white space issues.  Here's an
 updated patch and testcase.
 
 
 OK for trunk.  Could you also add the test case a second time,
 without -std=f95, to make sure it keeps passing?
 

Yes, I'll add a 2nd testcase.  Thanks for the review.

-- 
Steve

Re: Add missing ChangeLog entry

2011-06-02 Thread Ian Lance Taylor

Nathan Sidwell nat...@codesourcery.com writes:

 On 06/01/11 15:32, Ian Lance Taylor wrote:
 I noticed that we have a --with-specs option in gcc/configure.ac, added
 in revision 155208 with this e-mail message:
 http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00132.html

 sorry about that.  How's the attached documentation?

Works for me, but please add an @xref{Spec Files} (you might need a
document in there too, not sure) as a pointer to where the spec format
is documented.

Thanks.

Ian

Re: Ping: [Patch] Make libstdc++'s abi_check more robust against readelf output format

2011-06-02 Thread Ian Lance Taylor

On Thu, Jun 2, 2011 at 3:08 AM, Paolo Carlini pcarl...@gmail.com wrote:
 Hi,

 Ping.

 Did Ian Taylor see this patch? If he likes it, I'm also fine with it.

I think this patch is fine, with or without Jonathan's suggestion.

Ian

Re: [patch] testsuite: support board_info timeouts

2011-06-02 Thread Mike Stump

On Jun 2, 2011, at 9:48 AM, DJ Delorie wrote:
 I never got feedback from the testsuite maintainers on this one...

Ok.

Re: [PATCH] PR fortran/49265 -- allow for double colon in module procedure statement

2011-06-02 Thread Steve Kargl

On Thu, Jun 02, 2011 at 06:39:18PM +0200, Thomas Koenig wrote:
 Hi Steve,
 
 
 Oh phew.  Good catch.  I wasn't dealing with
 the possible white space issues.  Here's an
 updated patch and testcase.
 
 
 OK for trunk.  Could you also add the test case a second time,
 without -std=f95, to make sure it keeps passing?
 
 Thanks for the patch!
 

svn-commit.tmp: 14 lines, 440 characters.
Sending        fortran/ChangeLog
Sending        fortran/decl.c
Sending        fortran/parse.c
Sending        testsuite/ChangeLog
Adding         testsuite/gfortran.dg/module_procedure_double_colon_1.f90
Adding         testsuite/gfortran.dg/module_procedure_double_colon_2.f90
Transmitting file data ..
Committed revision 174569.

-- 
Steve

[PATCH, i386]: Introduce Y4 register constraint and merge SSE4_1 patterns

2011-06-02 Thread Uros Bizjak

Hello!

... and some unrelated cleanups involving simplifying a couple of
switch statements.

2011-06-02  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (standard_sse_constant_p) case 1:
Simplify switch statement.
* config/i386/i386.md (*movdf_internal_rex64) case 8,9,10: Ditto.
(*movdf_internal) case 6,7,8: Ditto.

* config/i386/constraints.md (Y4): New constraint.
* config/i386/sse.md (vec_setmode_0): Merge with
*vec_setmode_0_sse4_1 and *vec_setmode_0_sse2.
(*vec_extractv2di_1): Merge from *vec_extractv2di_1_sse2 and
*vec_extractv2di_1_sse.
(*vec_concatv2di_rex64): Merge from *vec_concatv2di_rex64_sse4_1
and *vec_concatv2di_rex64_sse.

testsuite/ChangeLog:

2011-06-02  Uros Bizjak  ubiz...@gmail.com

* gcc.target/i386/sse2-init-v2di-2: Update scan-assembler-times string.

Bootstrapped and regression tested on x86_64-pc-linux-gnu, committed
to mainline SVN.

Uros.
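
The switch simplification in the i386.md hunks below relies on case fallthrough. The following sketch, with hypothetical names and the output templates reduced to bare mnemonics, shows the old and new shapes side by side so their equivalence can be checked.

```c
/* Hypothetical stand-ins for the three vector modes involved.  */
enum mode_sketch { SK_V4SF, SK_V2DF, SK_TI };

/* Shape of the switch before the patch.  */
const char *
mov_before (enum mode_sketch mode, int packed_single_optimal)
{
  switch (mode)
    {
    case SK_V4SF:
      return "movaps";
    case SK_V2DF:
      return packed_single_optimal ? "movaps" : "movapd";
    case SK_TI:
      return packed_single_optimal ? "movaps" : "movdqa";
    }
  return "";
}

/* Shape after the patch: each case either returns its own mnemonic
   or falls through to the movaps case.  */
const char *
mov_after (enum mode_sketch mode, int packed_single_optimal)
{
  switch (mode)
    {
    case SK_TI:
      if (!packed_single_optimal)
	return "movdqa";
      /* FALLTHRU */
    case SK_V2DF:
      if (!packed_single_optimal)
	return "movapd";
      /* FALLTHRU */
    case SK_V4SF:
      return "movaps";
    }
  return "";
}
```

Both functions return the same mnemonic for every mode and flag combination; the second form simply states the shared `movaps` result once.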
Index: testsuite/gcc.target/i386/sse2-init-v2di-2.c
===
--- testsuite/gcc.target/i386/sse2-init-v2di-2.c(revision 174566)
+++ testsuite/gcc.target/i386/sse2-init-v2di-2.c(working copy)
@@ -10,4 +10,4 @@ test (long long b)
   return _mm_cvtsi64_si128 (b); 
 }
 
-/* { dg-final { scan-assembler-times \\*vec_concatv2di_rex64_sse4_1/4 1 } } 
*/
+/* { dg-final { scan-assembler-times \\*vec_concatv2di_rex64/4 1 } } */
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 174566)
+++ config/i386/i386.md (working copy)
@@ -2956,18 +2956,15 @@
 case 10:
   switch (get_attr_mode (insn))
{
-   case MODE_V4SF:
- return %vmovaps\t{%1, %0|%0, %1};
-   case MODE_V2DF:
- if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
- else
-   return %vmovapd\t{%1, %0|%0, %1};
case MODE_TI:
- if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
- else
+ if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
return %vmovdqa\t{%1, %0|%0, %1};
+   case MODE_V2DF:
+ if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+   return %vmovapd\t{%1, %0|%0, %1};
+   case MODE_V4SF:
+ return %vmovaps\t{%1, %0|%0, %1};
+
case MODE_DI:
  return %vmovq\t{%1, %0|%0, %1};
case MODE_DF:
@@ -3102,18 +3099,15 @@
 case 8:
   switch (get_attr_mode (insn))
{
-   case MODE_V4SF:
- return %vmovaps\t{%1, %0|%0, %1};
-   case MODE_V2DF:
- if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
- else
-   return %vmovapd\t{%1, %0|%0, %1};
case MODE_TI:
- if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
- else
+ if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
return %vmovdqa\t{%1, %0|%0, %1};
+   case MODE_V2DF:
+ if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+   return %vmovapd\t{%1, %0|%0, %1};
+   case MODE_V4SF:
+ return %vmovaps\t{%1, %0|%0, %1};
+
case MODE_DI:
  return %vmovq\t{%1, %0|%0, %1};
case MODE_DF:
Index: config/i386/constraints.md
===
--- config/i386/constraints.md  (revision 174566)
+++ config/i386/constraints.md  (working copy)
@@ -99,6 +99,9 @@
 (define_register_constraint Y2 TARGET_SSE2 ? SSE_REGS : NO_REGS
  @internal Any SSE register, when SSE2 is enabled.)
 
+(define_register_constraint Y4 TARGET_SSE4_1 ? SSE_REGS : NO_REGS
+ @internal Any SSE register, when SSE4_1 is enabled.)
+
 (define_register_constraint Yi
  TARGET_SSE2  TARGET_INTER_UNIT_MOVES ? SSE_REGS : NO_REGS
  @internal Any SSE register, when SSE2 and inter-unit moves are enabled.)
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 174566)
+++ config/i386/sse.md  (working copy)
@@ -3376,79 +3376,35 @@
 
 ;; Avoid combining registers from different units in a single alternative,
 ;; see comment above inline_secondary_memory_needed function in i386.c
-(define_insn *vec_setmode_0_sse4_1
+(define_insn vec_setmode_0
   [(set (match_operand:VI4F_128 0 nonimmediate_operand
- =x,x,x ,x,x,x  ,x  ,m,m,m)
+ =Y4,Y2,Y2,x,x,x,Y4 ,x  ,m,m,m)
(vec_merge:VI4F_128
  (vec_duplicate:VI4F_128
(match_operand:ssescalarmode 2 general_operand
-  x,m,*r,x,x,*rm,*rm,x,*r,fF))
+  Y4,m ,*r,m,x,x,*rm,*rm,x,*r,fF))
  (match_operand:VI4F_128 1 vector_move_operand
-  C,C,C ,0,x,0  ,x  ,0,0 ,0)
+  C ,C ,C ,C,0,x,0  ,x  ,0,0 ,0)
  (const_int 1)))]
-  TARGET_SSE4_1
+  TARGET_SSE
   @
%vinsertps\t{$0xe, %d2, %0|%0, %d2, 0xe}

Re: [PATCH] c-pragma: adding a data field to pragma_handler

2011-06-02 Thread Tom Tromey

 Pierre == Pierre  p.vit...@laposte.net writes:

Pierre I have changed this handler in order to accept a second parameter
Pierre which is a void *, allowing to give extra datas to the handler. I
Pierre think this data field might be of general use: we can have condition
Pierre or data at register time that we want to express in the handler. I
Pierre guess this is a common way to pass data to an handler function.

I can't approve or reject this patch, but the idea seems reasonable
enough to me.

Pierre I would like your opinion on this patch! Thanks!

It has a number of formatting issues.

Pierre +typedef void (*pragma_handler)(struct cpp_reader *, void * );

No space after the final *.

Pierre +/* Internally use to keep the data of the handler.  */
Pierre +struct internal_pragma_handler_d{

Space before the {.

Pierre +  pragma_handler handler;
Pierre +  void * data; 

No space.  Lots of instances of this.

Pierre  /* A vector of registered pragma callbacks.  */
Pierre +/*This is never freed as we need it during the whole execution */

Coalesce the two comments.  The comment formatting is wrong, see GNU
standards.

Pierre    ns_name.space = space;
Pierre    ns_name.name = name;
Pierre +  
Pierre    VEC_safe_push (pragma_ns_name, heap, registered_pp_pragmas, ns_name);

Gratuitous newline addition.

Pierre +  ihandler->handler = handler;
Pierre +  ihandler->data = data;

I didn't see anything that initialized ihandler.

Pierre +  VEC_safe_push (internal_pragma_handler, heap, registered_pragmas,
Pierre +ihandler);

I think you wanted just `internal_pragma_handler ihandler', no *, for
the definition.

Pierre +c_register_pragma (const char *space, const char *name, pragma_handler handler, 
Pierre +   void * data)

There are lots of calls to this that you did not update.
Do a recursive grep to see.

One way to avoid a massive change is to add a new overload that passes
in the data to c_register_pragma_1; and then change the legacy
functions to pass NULL.

I don't know if that approach is ok (it is typical in gdb...), so if
not, you have to update all callers.

Tom
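
A minimal sketch of the "new entry point plus legacy wrapper passing NULL" approach suggested above. Nothing here is GCC's actual API: every name (`pragma_handler_sketch`, `register_pragma_with_data`, `register_pragma_legacy`, `dispatch_pragma`, `counting_handler`) is hypothetical, and a fixed-size table stands in for the VEC.

```c
#include <stddef.h>

struct cpp_reader;	/* Opaque here; only pointers are passed around.  */

/* Handler type with the extra data argument.  */
typedef void (*pragma_handler_sketch) (struct cpp_reader *, void *);

/* One registered handler together with its data.  */
struct pragma_entry_sketch
{
  const char *name;
  pragma_handler_sketch handler;
  void *data;
};

#define MAX_PRAGMAS_SKETCH 8
static struct pragma_entry_sketch registry[MAX_PRAGMAS_SKETCH];
static size_t n_registered;

/* New entry point: stores DATA next to HANDLER.  Returns 0 on success,
   -1 when the table is full.  */
int
register_pragma_with_data (const char *name, pragma_handler_sketch handler,
			   void *data)
{
  if (n_registered == MAX_PRAGMAS_SKETCH)
    return -1;
  registry[n_registered].name = name;
  registry[n_registered].handler = handler;
  registry[n_registered].data = data;
  n_registered++;
  return 0;
}

/* Legacy entry point: existing callers are unchanged and get NULL data.  */
int
register_pragma_legacy (const char *name, pragma_handler_sketch handler)
{
  return register_pragma_with_data (name, handler, NULL);
}

/* Invoke the handler in slot IDX with its stored data.  */
void
dispatch_pragma (size_t idx, struct cpp_reader *reader)
{
  registry[idx].handler (reader, registry[idx].data);
}

/* Example handler: treats DATA as an int counter, if present.  */
void
counting_handler (struct cpp_reader *reader, void *data)
{
  (void) reader;
  if (data != NULL)
    ++*(int *) data;
}
```

The table holds the entries by value, which matches the remark above about declaring the handler record itself rather than a pointer to it.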

Re: [PATCH libcpp]: S_ISREG non-zero value does not always fit in a bool

2011-06-02 Thread Tom Tromey

 John == John Tytgat john.tyt...@aaug.net writes:

John 2011-05-29  John Tytgat  john.tyt...@aaug.net
John   * files.c (read_file_guts): Add test on non-zero value of S_ISREG.

It seems reasonable enough to me.  I am checking it in.

Out of curiosity, do you know of a platform where this is an issue?

Tom
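
For illustration, here is how a non-zero test result can fail to fit in a narrow boolean. The sketch makes two loud assumptions: `narrow_bool` is a hypothetical 8-bit boolean type, and `FAKE_S_ISREG` is a made-up macro that returns the masked mode bit instead of 0 or 1. It does not claim that any particular platform defines S_ISREG this way.

```c
/* Hypothetical stand-ins: a "bool" narrower than int, and an
   S_ISREG-like macro that returns the masked bit rather than 0 or 1.  */
typedef unsigned char narrow_bool;
#define FAKE_S_IFREG 0x8000
#define FAKE_S_ISREG(mode) ((mode) & FAKE_S_IFREG)

/* Storing the raw result: 0x8000 does not fit in 8 bits, so the
   conversion leaves 0.  */
narrow_bool
is_regular_truncating (unsigned int mode)
{
  return FAKE_S_ISREG (mode);
}

/* Testing against zero first always yields 0 or 1.  */
narrow_bool
is_regular_tested (unsigned int mode)
{
  return FAKE_S_ISREG (mode) != 0;
}
```

With 8-bit chars, a mode of 0x81a4 (octal 0100644) makes the first function report "not regular" while the second reports the expected 1.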

Re: [google] Use minimum cost circulation, not minimum cost flow to smooth profiles other minor fixes. (issue4536106)

2011-06-02 Thread Xinliang David Li

ok for google/main.

David

On Thu, Jun 2, 2011 at 11:00 AM, Martin Thuresson mart...@google.com wrote:
 This patch from Neil Vachharajani and Dehao Chen improves mcf by using
 minimum cost circulation instead of minimum cost flow to smooth profiles.
 It also introduces a parameter for controlling running time of the algorithm.
 This was what was originally presented in the academic work and handles
 certain cases where the function entry and exit have incorrect profile
 weights.

 For now, this is for google/main. Once I have collected performance results
 from SPEC I will propose this patch for trunk as well.

 Bootstraps and no test regressions. Ok for google/main?

 2011-06-02  Neil Vachharajani  nvach...@gmail.com, Dehao Chen
        daniel...@gmail.com

        * gcc/doc/invoke.texi (min-mcf-cancel-iters): Document.
        * gcc/mcf.c (MAX_ITER): Use new param PARAM_MIN_MCF_CANCEL_ITERS.
        (edge_type): Add SINK_SOURCE_EDGE.
        (dump_fixup_edge): Handle SINK_SOURCE_EDGE.
        (create_fixup_graph): Make problem minimum cost circulation.
        (cancel_negative_cycle): Update handling of infinite capacity.
        (compute_residual_flow): Update handling of infinite capacity.
        (find_max_flow): Update handling of infinite capacity.
        (modify_sink_source_capacity): New function.
        (find_minimum_cost_flow): Make problem minimum cost circulation.
        Use param PARAM_MIN_MCF_CANCEL_ITERS.
        * gcc/params.def (PARAM_MIN_MCF_CANCEL_ITERS): Define.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 174456)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -8341,6 +8341,12 @@ whether the result of a complex multipli

  The default is @option{-fno-cx-fortran-rules}.

 +@item min-mcf-cancel-iters
 +The minimum number of iterations of negative cycle cancellation during
 +MCF profile correction before early termination.  This parameter is
 +only useful when using @option{-fprofile-correction}.
 +
 +
  @end table

  The following options control optimizations that may improve
 Index: gcc/mcf.c
 ===
 --- gcc/mcf.c   (revision 174456)
 +++ gcc/mcf.c   (working copy)
 @@ -52,6 +52,8 @@ along with GCC; see the file COPYING3.
  #include langhooks.h
  #include tree.h
  #include gcov-io.h
 +#include params.h
 +#include diagnostic-core.h

  #include profile.h

 @@ -64,15 +66,18 @@ along with GCC; see the file COPYING3.
  #define COST(k, w)      ((k) / mcf_ln ((w) + 2))
  /* Limit the number of iterations for cancel_negative_cycles() to ensure
    reasonable compile time.  */
 -#define MAX_ITER(n, e)  10 + (100 / ((n) * (e)))
 +#define MAX_ITER(n, e)  (PARAM_VALUE (PARAM_MIN_MCF_CANCEL_ITERS) + \
 +                        (100 / ((n) * (e
 +
  typedef enum
  {
 -  INVALID_EDGE,
 +  INVALID_EDGE = 0,
   VERTEX_SPLIT_EDGE,       /* Edge to represent vertex with w(e) = w(v).  */
   REDIRECT_EDGE,           /* Edge after vertex transformation.  */
   REVERSE_EDGE,
   SOURCE_CONNECT_EDGE,     /* Single edge connecting to single source.  */
   SINK_CONNECT_EDGE,       /* Single edge connecting to single sink.  */
 +  SINK_SOURCE_EDGE,        /* Single edge connecting sink to source.  */
   BALANCE_EDGE,                    /* Edge connecting with source/sink: cp(e) 
 = 0.  */
   REDIRECT_NORMALIZED_EDGE, /* Normalized edge for a redirect edge.  */
   REVERSE_NORMALIZED_EDGE   /* Normalized edge for a reverse edge.  */
 @@ -250,6 +255,10 @@ dump_fixup_edge (FILE *file, fixup_graph
          fputs ( @SINK_CONNECT_EDGE, file);
          break;

 +       case SINK_SOURCE_EDGE:
 +         fputs ( @SINK_SOURCE_EDGE, file);
 +         break;
 +
        case REVERSE_EDGE:
          fputs ( @REVERSE_EDGE, file);
          break;
 @@ -465,7 +474,7 @@ create_fixup_graph (fixup_graph_type *fi
   double k_neg = 0;
   /* Vector to hold D(v) = sum_out_edges(v) - sum_in_edges(v).  */
   gcov_type *diff_out_in = NULL;
 -  gcov_type supply_value = 1, demand_value = 0;
 +  gcov_type supply_value = 0, demand_value = 0;
   gcov_type fcost = 0;
   int new_entry_index = 0, new_exit_index = 0;
   int i = 0, j = 0;
 @@ -486,14 +495,15 @@ create_fixup_graph (fixup_graph_type *fi
     fnum_vertices_after_transform + n_edges + n_basic_blocks + 2;

   /* In create_fixup_graph: Each basic block and edge can be split into 3
 -     edges. Number of balance edges = n_basic_blocks. So after
 -     create_fixup_graph:
 -     max_edges = 4 * n_basic_blocks + 3 * n_edges
 +     edges. Number of balance edges = n_basic_blocks - 1. And there is 1 edge
 +     connecting new_entry and new_exit, and 2 edges connecting new_entry to
 +     entry, and exit to new_exit. So after create_fixup_graph:
 +     max_edges = 4 * n_basic_blocks + 3 * n_edges + 2
      Accounting for residual flow edges
 -     max_edges = 2 * (4 * n_basic_blocks + 3 * n_edges)
 -     = 8 * n_basic_blocks + 6 *

Re: [patch] testsuite: support board_info timeouts

2011-06-02 Thread DJ Delorie


Thanks!  Committed.

Re: [google] Use minimum cost circulation, not minimum cost flow to smooth profiles other minor fixes. (issue4536106)

2011-06-02 Thread Xinliang David Li

Counter overflow?

David

On Thu, Jun 2, 2011 at 11:12 AM, Martin Thuresson mart...@google.com wrote:
 On Thu, Jun 2, 2011 at 11:05 AM, Xinliang David Li davi...@google.com wrote:

 Smoothing works for sample FDO and profile data from multi-threaded
 programs. You won't see any difference in SPEC.

 Dehao reported some performance improvements from the algorithmic improvements
 he added in terms of extra fixup edges and handling of infinite capacity.

 Martin





 David

 On Thu, Jun 2, 2011 at 11:00 AM, Martin Thuresson mart...@google.com wrote:
  This patch from Neil Vachharajani and Dehao Chen improves mcf by using
  minimum cost circulation instead of minimum cost flow to smooth profiles.
  It also introduces a parameter for controlling running time of the 
  algorithm.
  This was what was originally presented in the academic work and handles
  certain cases where the function entry and exit have incorrect profile
  weights.
 
  For now, this is for google/main. Once I have collected performance results
  from SPEC I will propose this patch for trunk as well.
 
  Bootstraps and no test regressions. Ok for google/main?
 
  2011-06-02  Neil Vachharajani  nvach...@gmail.com, Dehao Chen
         daniel...@gmail.com
 
         * gcc/doc/invoke.texi (min-mcf-cancel-iters): Document.
         * gcc/mcf.c (MAX_ITER): Use new param PARAM_MIN_MCF_CANCEL_ITERS.
         (edge_type): Add SINK_SOURCE_EDGE.
         (dump_fixup_edge): Handle SINK_SOURCE_EDGE.
         (create_fixup_graph): Make problem minimum cost circulation.
         (cancel_negative_cycle): Update handling of infinite capacity.
         (compute_residual_flow): Update handling of infinite capacity.
         (find_max_flow): Update handling of infinite capacity.
         (modify_sink_source_capacity): New function.
         (find_minimum_cost_flow): Make problem minimum cost circulation.
         Use param PARAM_MIN_MCF_CANCEL_ITERS.
         * gcc/params.def (PARAM_MIN_MCF_CANCEL_ITERS): Define.
 
  Index: gcc/doc/invoke.texi
  ===
  --- gcc/doc/invoke.texi (revision 174456)
  +++ gcc/doc/invoke.texi (working copy)
  @@ -8341,6 +8341,12 @@ whether the result of a complex multipli
 
   The default is @option{-fno-cx-fortran-rules}.
 
  +@item min-mcf-cancel-iters
  +The minimum number of iterations of negative cycle cancellation during
  +MCF profile correction before early termination.  This parameter is
  +only useful when using @option{-fprofile-correction}.
  +
  +
   @end table
 
   The following options control optimizations that may improve
  Index: gcc/mcf.c
  ===
  --- gcc/mcf.c   (revision 174456)
  +++ gcc/mcf.c   (working copy)
  @@ -52,6 +52,8 @@ along with GCC; see the file COPYING3.
   #include langhooks.h
   #include tree.h
   #include gcov-io.h
  +#include params.h
  +#include diagnostic-core.h
 
   #include profile.h
 
  @@ -64,15 +66,18 @@ along with GCC; see the file COPYING3.
   #define COST(k, w)      ((k) / mcf_ln ((w) + 2))
   /* Limit the number of iterations for cancel_negative_cycles() to ensure
     reasonable compile time.  */
  -#define MAX_ITER(n, e)  10 + (100 / ((n) * (e)))
  +#define MAX_ITER(n, e)  (PARAM_VALUE (PARAM_MIN_MCF_CANCEL_ITERS) + \
  +                        (100 / ((n) * (e
  +
   typedef enum
   {
  -  INVALID_EDGE,
  +  INVALID_EDGE = 0,
    VERTEX_SPLIT_EDGE,       /* Edge to represent vertex with w(e) = w(v).  
  */
    REDIRECT_EDGE,           /* Edge after vertex transformation.  */
    REVERSE_EDGE,
    SOURCE_CONNECT_EDGE,     /* Single edge connecting to single source.  */
    SINK_CONNECT_EDGE,       /* Single edge connecting to single sink.  */
  +  SINK_SOURCE_EDGE,        /* Single edge connecting sink to source.  */
    BALANCE_EDGE,                    /* Edge connecting with source/sink: 
  cp(e) = 0.  */
    REDIRECT_NORMALIZED_EDGE, /* Normalized edge for a redirect edge.  */
    REVERSE_NORMALIZED_EDGE   /* Normalized edge for a reverse edge.  */
  @@ -250,6 +255,10 @@ dump_fixup_edge (FILE *file, fixup_graph
           fputs ( @SINK_CONNECT_EDGE, file);
           break;
 
  +       case SINK_SOURCE_EDGE:
  +         fputs ( @SINK_SOURCE_EDGE, file);
  +         break;
  +
         case REVERSE_EDGE:
           fputs ( @REVERSE_EDGE, file);
           break;
  @@ -465,7 +474,7 @@ create_fixup_graph (fixup_graph_type *fi
    double k_neg = 0;
    /* Vector to hold D(v) = sum_out_edges(v) - sum_in_edges(v).  */
    gcov_type *diff_out_in = NULL;
  -  gcov_type supply_value = 1, demand_value = 0;
  +  gcov_type supply_value = 0, demand_value = 0;
    gcov_type fcost = 0;
    int new_entry_index = 0, new_exit_index = 0;
    int i = 0, j = 0;
  @@ -486,14 +495,15 @@ create_fixup_graph (fixup_graph_type *fi
      fnum_vertices_after_transform + n_edges + n_basic_blocks + 2;
 
    /* In create_fixup_graph: Each basic block and edge can be split

Fix for PR objc/48539 (Missing warning when messaging a forward-declared class)

2011-06-02 Thread Nicola Pero

This patch fixes PR objc/48539 (Missing warning when messaging a
forward-declared class).

The problem occurs when using @class, and then messaging class or
instance objects of the class.  It can happen both with class and
instance methods.

--

An example with class methods is

 @class MyClass;

 [MyClass method];

In this example, the compiler has no information on 'MyClass' and on
what methods it responds to.  So, there is no way to determine if
MyClass responds to the method +method, and what the method prototype
is (in this example it doesn't matter, but if there are arguments or
return values, it could matter a lot, potentially causing a crash at
runtime if the wrong method prototype is used).  Clang emits a warning
there, which seems very appropriate, and with this patch, GCC 4.7.0
emits a warning there too. :-)

--

Then there is the issue of what to do about instance methods, as in
the following example --

 @class MyClass;

 MyClass *x;
 [x method];

This is almost identical to the case above, and this patch adds a
similar warning. ;-)

Note that in this case, the current behaviour of the compiler is
substandard; the compiler silently throws away the MyClass * type,
silently casts x to id, and proceeds to accept it as a receiver of any
possible method of any class, without any warning (!!).  clang does
the same by the way.

We clearly do want to emit a warning there, instead.  The fact that
the programmer has explicitly declared x to be of type MyClass *
instead of id means she is expecting the compiler to use that
information to do the standard method lookup/check based on the class.
If the @interface is missing, it is most likely an error / slip in the
program, which is worthwhile for the compiler to warn about (in the
same way as we warn above for class methods!). ;-)

So this patch changes this behaviour and adds a warning here; if the programmer 
doesn't want the warning and
is happy with x being treated as an id, she can simply add a cast to id 
to clarify her mind, and the
warning will go away.  I.e., if you do

 @class MyClass;

 MyClass *x;
 [(id)x method];

you don't get any warning as you explicitly disabled type-based checks by 
casting to id.  But if you leave x to be
of type MyClass *, then you're asking for type-based checks / method lookup, 
the compiler will try to do the method lookup,
and if the @interface of MyClass was not found, will emit a warning because 
it can't do the requested type-based checks /
method lookup. ;-)

I tried this patch with gnustep core and it did find a number of slips in the 
code, which is good.  No major
bugs, but in each case someone had forgotten to #include the header with the 
@interface of a class, so
the compiler couldn't do the proper checks, but nobody would notice 
because of the current silent
behaviour where the missing @interface causes the variable to magically and 
silently become of type id.

In fact, looking at the examples is very convincing that we need this warning.  
Without it, @class NSArray; basically
makes NSArray * a typedef for id when doing method invocations.  So, in 
your code you may have methods or functions
taking (NSArray *) arguments, and then you call methods on these arguments, 
expecting the compiler to check that the methods
are appropriate for an NSArray.  Instead, the compiler is silently treating 
NSArray * as identical to id, and performing
no checks whatsoever, and not bothering to tell you anything about the fact! ;-)

--

Finally, there's the additional complication of deciding what to do when this 
is mixed with protocols, as in --

 @class MyClass;
 @protocol MyProtocol
 - (void) method;
 @end

 MyClass <MyProtocol> *x;
 [x method];

This is a weird/rare case, and I don't expect to see it much in practice, but 
it needs to be sorted out, and GCC already
has at least two existing testcases for it in the GCC testsuite.  At the 
moment, the compiler does the equivalent of
silently converting MyClass <MyProtocol> * to id <MyProtocol>.  That's not 
great, but I pondered about this for
a long time, tried a few variations, and then decided to make no changes. :-)

The reason is that if the method being called is part of the protocol, then the 
compiler can find the method prototype
(and do the type-checking) without needing any more information on the actual 
class.  Complaining about the @interface not
being available seems pointless nit-picking, since it's not required to do the 
type-based method lookup. ;-)

If the method being called is not part of the protocol, the compiler already 
emits a warning that the method could not be
found in the protocol.  I thought about adding a second warning about the 
@interface not being found, and for a while had
it in the patch, but in practice it seemed overkill and I removed it from the 
final version.

--

The warning message that I chose for GCC is --

  method-lookup-1.m:42:3: warning: @interface of class ‘NotKnown’

Re: Fix for PR objc/48539 (Missing warning when messaging a forward-declared class)

2011-06-02 Thread Mike Stump

On Jun 2, 2011, at 11:29 AM, Nicola Pero wrote:
 This patch fixes PR objc/48539 (Missing warning when messaging a 
 forward-declared class).

 Ok to commit to trunk ?

Ok.

Re: __sync_swap* with acq/rel/full memory barrier semantics

2011-06-02 Thread Aldy Hernandez


On 05/30/11 15:07, Andrew MacLeod wrote:


Aldy was just too excited about working on memory model I think :-)

I've been looking at this, and I propose we go this way :

http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen


Still overly excited, but now with a more thorough plan :).

I'm going to concentrate on the non controversial parts (the __sync 
builtins), while the details are ironed out.


The attached patch implements the exchange operation, with a 
parameter/enum for the type of memory model to use.  I have chosen to 
call the builtins __sync_mem_BLAH to keep them all consistent.


I am including documentation and a test, so folks can get an idea of 
where I'm headed with this.  Once I take everyone's input, we can 
implement the rest of the builtins, and take it from there.


I see no prior art in providing some sort of enum for a builtin 
parameter.  I can proceed down this path if advisable, but an easier 
path is to just declare the __SYNC_MEM_* enum as preprocessor macros as 
I do in this patch.  Suggestions welcome.
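
For readers who want to try the semantics described above without building the patch, here is a minimal sketch using standard C++11 std::atomic. It is an analogue, not the proposed builtin: __sync_mem_exchange and the __SYNC_MEM_* macros exist only in this patch, and the function names swap_in, swap_in_acq_rel and demo_exchange are made up for illustration.

```cpp
#include <atomic>

// Stand-in for the proposed
//   __sync_mem_exchange (ptr, value, __SYNC_MEM_SEQ_CST)
// Stores `value` and returns the previous contents, sequentially consistent.
inline int swap_in(std::atomic<int>& cell, int value) {
    return cell.exchange(value, std::memory_order_seq_cst);
}

// The same exchange under the weaker acquire/release model
// (roughly what __SYNC_MEM_ACQ_REL is meant to request).
inline int swap_in_acq_rel(std::atomic<int>& cell, int value) {
    return cell.exchange(value, std::memory_order_acq_rel);
}

// Hypothetical driver: encodes the two returned values and the final
// contents into one integer so the behaviour is easy to check.
inline int demo_exchange() {
    std::atomic<int> cell{1};
    int first = swap_in(cell, 2);           // previous contents: 1
    int second = swap_in_acq_rel(cell, 3);  // previous contents: 2
    return first * 100 + second * 10 + cell.load();
}
```

The memory model is passed as an ordinary argument here, which mirrors the "int memmodel" parameter in the documentation hunk below; whether the real builtin takes macros or an enum is exactly the open question raised above.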


How does this (lightly tested patch) look?
* doc/extend.texi (__sync_mem_exchange): Document.
* cppbuiltin.c (define__GNUC__): Define __SYNC_MEM*.
* c-family/c-common.c (BUILT_IN_MEM_EXCHANGE_N): Add case.
* optabs.c (expand_sync_mem_exchange): New.
* optabs.h (enum direct_optab_index): Add DOI_sync_mem* entries.
(sync_mem_exchange_*_optab): Define.
* genopinit.c: Add entries for sync_mem_exchange_*.
* tree.h (enum memmodel): New.
* builtins.c (get_memmodel): New.
(expand_builtin_mem_exchange): New.
(expand_builtin_synchronize): Remove static.
(expand_builtin): Add cases for BUILT_IN_MEM_EXCHANGE_*.
* sync-builtins.def: Add entries for BUILT_IN_MEM_EXCHANGE_*.
* builtin-types.def (BT_FN_I{1,2,4,8,16}_VPTR_I{1,2,4,8,16}_INT):
New.
* expr.h (expand_sync_mem_exchange): Declare.
(expand_builtin_synchronize): Same.
* config/i386/i386.md (UNSPECV_MEM_XCHG): New.
(sync_mem_exchange_seq_cstmode): New pattern.

Index: doc/extend.texi
===
--- doc/extend.texi (revision 173831)
+++ doc/extend.texi (working copy)
@@ -6728,6 +6728,22 @@ This builtin is not a full barrier, but 
 This means that all previous memory stores are globally visible, and all
 previous memory loads have been satisfied, but following memory reads
 are not prevented from being speculated to before the barrier.
+
+@item @var{type} __sync_mem_exchange (@var{type} *ptr, @var{type} value, int 
memmodel, ...)
+@findex __sync_mem_exchange
+This builtin implements an atomic exchange operation within the
+constraints of a memory model.  It writes @var{value} into
+@code{*@var{ptr}}, and returns the previous contents of
+@code{*@var{ptr}}.
+
+The valid memory model variants for this builtin are
+__SYNC_MEM_RELAXED, __SYNC_MEM_SEQ_CST, __SYNC_MEM_ACQUIRE,
+__SYNC_MEM_RELEASE, and __SYNC_MEM_ACQ_REL.  If the variant is not
+available for the given target, the compiler will fall back to the
+more restrictive memory model, the sequentially consistent model (if
+available).  If the sequentially consistent model is not implemented
+for the target, the compiler will implement the builtin with a compare
+and swap loop.
 @end table
 
 @node Object Size Checking
Index: cppbuiltin.c
===
--- cppbuiltin.c(revision 173831)
+++ cppbuiltin.c(working copy)
@@ -66,6 +66,12 @@ define__GNUC__ (cpp_reader *pfile)
   cpp_define_formatted (pfile, "__GNUC_MINOR__=%d", minor);
   cpp_define_formatted (pfile, "__GNUC_PATCHLEVEL__=%d", patchlevel);
   cpp_define_formatted (pfile, "__VERSION__=\"%s\"", version_string);
+  cpp_define_formatted (pfile, "__SYNC_MEM_RELAXED=%d", MEMMODEL_RELAXED);
+  cpp_define_formatted (pfile, "__SYNC_MEM_SEQ_CST=%d", MEMMODEL_SEQ_CST);
+  cpp_define_formatted (pfile, "__SYNC_MEM_ACQUIRE=%d", MEMMODEL_ACQUIRE);
+  cpp_define_formatted (pfile, "__SYNC_MEM_RELEASE=%d", MEMMODEL_RELEASE);
+  cpp_define_formatted (pfile, "__SYNC_MEM_ACQ_REL=%d", MEMMODEL_ACQ_REL);
+  cpp_define_formatted (pfile, "__SYNC_MEM_CONSUME=%d", MEMMODEL_CONSUME);
 }
 
 
Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 173831)
+++ c-family/c-common.c (working copy)
@@ -9035,6 +9035,7 @@ resolve_overloaded_builtin (location_t l
 case BUILT_IN_VAL_COMPARE_AND_SWAP_N:
 case BUILT_IN_LOCK_TEST_AND_SET_N:
 case BUILT_IN_LOCK_RELEASE_N:
+case BUILT_IN_MEM_EXCHANGE_N:
   {
int n = sync_resolve_size (function, params);
tree new_function, first_param, result;
Index: optabs.c
===
--- optabs.c(revision 173831)
+++ optabs.c(working copy)
@@ -6988,6 +6988,85 @@ expand_sync_lock_test_and_set (rtx mem,

Re: __sync_swap* with acq/rel/full memory barrier semantics

2011-06-02 Thread Jakub Jelinek

On Thu, Jun 02, 2011 at 02:12:38PM -0500, Aldy Hernandez wrote:

 +/* This function expands a fine grained atomic exchange operation:
 +   atomically store VAL in MEM and return the previous value in MEM.
 +
 +   MEMMODEL is the memory model variant to use.
 +   TARGET is an optional place to stick the return value.  */
 +
 +rtx
 +expand_sync_mem_exchange (enum memmodel model, rtx mem, rtx val, rtx target)
 +{
 +  enum machine_mode mode = GET_MODE (mem);
 +  enum insn_code icode;
 +  direct_optab op;
 +
 +  switch (model)
 +{
 +case MEMMODEL_RELAXED:
 +  /* ?? Eventually we should either just emit the atomic
 +  instruction without any barriers (and thus allow movements
 +  and transformations), or emit a relaxed builtin.
 +
 +  It is still not clear whether any transformations are
 +  permissible on the atomics (for example, CSE might break
 +  coherence), so we might need to emit a relaxed builtin.
 +
 + Until we figure this out, be conservative and fall
 + through.  */
 +case MEMMODEL_SEQ_CST:
 +  op = sync_mem_exchange_seq_cst_optab;
 +  break;
 +case MEMMODEL_ACQUIRE:
 +  op = sync_mem_exchange_acq_optab;
 +  break;
 +case MEMMODEL_RELEASE:
 +  op = sync_mem_exchange_rel_optab;
 +  break;
 +case MEMMODEL_ACQ_REL:
 +  op = sync_mem_exchange_acq_rel_optab;
 +  break;

Wouldn't it be better to pass the model (as an extra CONST_INT
operand) to the expanders?  Targets where atomic instructions always act
as full barriers could just ignore that argument, others could decide what
to do based on the value.

Jakub
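
The suggestion above (one expander that receives the model as an operand, instead of one optab per model) can be sketched roughly as follows. This is a toy model in C++, not GCC code: MemModel, both expand_* functions and every instruction string are hypothetical placeholders chosen only to show the shape of the idea.

```cpp
#include <string>

// Hypothetical mirror of the patch's enum memmodel.
enum class MemModel { Relaxed, SeqCst, Acquire, Release, AcqRel };

// A target whose atomic exchange instruction is always a full barrier
// can ignore the model operand entirely.
inline std::string expand_exchange_full_barrier_target(MemModel /*model*/) {
    return "xchg";  // placeholder mnemonic
}

// A weakly ordered target inspects the operand and picks barriers itself.
// The "fence" spellings are invented for illustration.
inline std::string expand_exchange_weak_target(MemModel model) {
    switch (model) {
    case MemModel::Relaxed: return "swap";
    case MemModel::Acquire: return "swap; fence.acq";
    case MemModel::Release: return "fence.rel; swap";
    case MemModel::AcqRel:
    case MemModel::SeqCst:  return "fence; swap; fence";
    }
    return "fence; swap; fence";  // conservative fallback
}
```

With this shape the middle end needs a single entry point per operation, and the per-model decision moves into each back end.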

[PATCH] [Bug c++/49118] fake template nesting for operator-> chain

2011-06-02 Thread David Krauss

This is my first frontend contribution. While it fixes the crash and produces 
an explanatory error message, the message isn't quite right. I don't understand 
the message generation system so I might need help. Or, it looks like there's 
an issue with template backtraces at the moment anyway, so there might be an 
interaction with another known bug.

The problem occurs when operator-> drill-down behavior is infinitely chained, 
for example with a template

template< int n >
t< n + 1 >
t< n >::operator->()

There is no cycle to signal endlessness, and no template nesting, as drill-down 
is implemented as a deep expression, not tail-calls. The result is that the 
compiler hangs.
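
To make the drill-down behavior concrete, here is a small finite sketch. It is not the test case from the PR: Payload, Link and read_through_chain are hypothetical names, and the chain counts down to a terminating specialization so that it compiles, whereas the problematic template counts up without end.

```cpp
// The object finally reached by the chain.
struct Payload { int value; };

// Each Link<N>::operator-> returns another class type, so the compiler
// applies operator-> again to the result (the "drill-down").
template <int N>
struct Link {
    Payload* target;
    Link<N - 1> operator->() const { return Link<N - 1>{target}; }
};

// The chain ends here because a raw pointer is returned.
template <>
struct Link<0> {
    Payload* target;
    Payload* operator->() const { return target; }
};

// One "->" in the source expands into four operator-> calls.
inline int read_through_chain() {
    Payload p{42};
    Link<3> head{&p};
    return head->value;
}
```

Replacing the return type with Link<N + 1> and dropping the specialization gives the unbounded case described above, where no instantiation depth limit applies.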

My solution is to pretend that there is template nesting, presuming the user 
will find this intuitive. There is the added benefit of the maximum chain 
length being configured by the template nesting limit.

Drill-down is implemented by build_x_arrow. If operator-> resolves, it calls it 
and uses the result type to look up another operator->. I'd like to re-open a 
template context related to operator-> after generating the call. The function 
push_tinst_level seems to relate only to diagnostics, with no semantic effect, 
so it seems a good candidate.

Optimally the re-opened context would be the preceding operator-> function 
itself, to create the illusion of nested calls. However, the result of 
build_new_op may be a target_expr or a call_expr. I'm not sure of the best way 
to recover the function declaration from this ambiguous tree, nor whether it 
would be a performance issue (i.e., too much work for the reward).

The identity of the class containing the *next* operator-> call is easy to 
recover, however, since it is the type of the expression from build_new_op. 
This introduces an off-by-one error, and gets us a class template rather than 
the more relevant function member. These problems shouldn't matter since this 
is all just for diagnostics. But perhaps the discrepancy between having a 
function type and a class type is interfering with message generation?

Thanks for the help and consideration!



endless_arrow.clog
Description: Binary data



endless_arrow.patch
Description: Binary data

Re: __sync_swap* with acq/rel/full memory barrier semantics

2011-06-02 Thread Aldy Hernandez


On 06/02/11 14:25, Jakub Jelinek wrote:


+case MEMMODEL_SEQ_CST:
+  op = sync_mem_exchange_seq_cst_optab;
+  break;
+case MEMMODEL_ACQUIRE:
+  op = sync_mem_exchange_acq_optab;
+  break;
+case MEMMODEL_RELEASE:
+  op = sync_mem_exchange_rel_optab;
+  break;
+case MEMMODEL_ACQ_REL:
+  op = sync_mem_exchange_acq_rel_optab;
+  break;


Wouldn't it be better to pass the model (as an extra CONST_INT
operand) to the expanders?  Targets where atomic instructions always act
as full barriers could just ignore that argument, others could decide what
to do based on the value.


*shrug* I don't care.  Whatever everyone agrees on.

Re: PING^2 [PATCH] Support for AMD64 targets running GNU/kFreeBSD

2011-06-02 Thread Robert Millan

Hi,

2011/5/21 Joseph S. Myers jos...@codesourcery.com:
 Please send a patch against *current trunk* and CC *relevant target
 architecture maintainers*.  linux*.h headers are no longer used on
 non-Linux targets (since my 2011-04-28 patch - on which I CC:ed you) so
 this patch version is no longer appropriate.  I think you'll want to make
 gnu-user64.h use GNU_USER_LINK_EMULATION32 and GNU_USER_LINK_EMULATION64
 similarly to how gnu-user.h uses GNU_USER_LINK_EMULATION.

Thanks for the tip.  Here's an update to current trunk.
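
The mechanism Joseph suggests relies on adjacent string literals being concatenated, so a per-OS header only has to define the emulation names. A minimal sketch, assuming simplified values for SPEC_64 and SPEC_32 (the real definitions depend on the default word size) and a shortened LINK_SPEC; link_spec() is a hypothetical helper added only so the result can be inspected:

```cpp
#include <string>

// Simplified assumptions, not the real gnu-user64.h definitions.
#define SPEC_64 "m64"
#define SPEC_32 "m32"

// What an OS-specific header such as kfreebsd-gnu64.h would supply.
#define GNU_USER_LINK_EMULATION32 "elf_i386_fbsd"
#define GNU_USER_LINK_EMULATION64 "elf_x86_64_fbsd"

// The shared header splices the names in instead of hardcoding them.
#define LINK_SPEC "%{" SPEC_64 ":-m " GNU_USER_LINK_EMULATION64 "} " \
                  "%{" SPEC_32 ":-m " GNU_USER_LINK_EMULATION32 "}"

// Hypothetical accessor for inspecting the expanded spec string.
inline std::string link_spec() { return LINK_SPEC; }
```

A Linux header would define the two emulation macros as "elf_i386" and "elf_x86_64" and get the previous spec back unchanged.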

-- 
Robert Millan
2011-06-02  Robert Millan  r...@gnu.org

* config/i386/kfreebsd-gnu.h: Resync with `config/i386/linux.h'.
* config/kfreebsd-gnu.h (GNU_USER_DYNAMIC_LINKER): Resync with
`config/linux.h'.

* config/i386/kfreebsd-gnu64.h: New file.
* config.gcc (x86_64-*-kfreebsd*-gnu): Replace `i386/kfreebsd-gnu.h'
with `i386/kfreebsd-gnu64.h'.

* config/i386/linux64.h (GNU_USER_LINK_EMULATION32)
(GNU_USER_LINK_EMULATION64): New macros.
* config/i386/gnu-user64.h (LINK_SPEC): Rely on
`GNU_USER_LINK_EMULATION32' and `GNU_USER_LINK_EMULATION64' instead
of hardcoding `elf_i386' and `elf_x86_64'.

Index: gcc/config/i386/kfreebsd-gnu64.h
===
--- gcc/config/i386/kfreebsd-gnu64.h(revision 0)
+++ gcc/config/i386/kfreebsd-gnu64.h(revision 0)
@@ -0,0 +1,26 @@
+/* Definitions for AMD x86-64 running kFreeBSD-based GNU systems with ELF 
format
+   Copyright (C) 2011
+   Free Software Foundation, Inc.
+   Contributed by Robert Millan.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#define GNU_USER_LINK_EMULATION32 "elf_i386_fbsd"
+#define GNU_USER_LINK_EMULATION64 "elf_x86_64_fbsd"
+
+#define GLIBC_DYNAMIC_LINKER32 "/lib/ld.so.1"
+#define GLIBC_DYNAMIC_LINKER64 "/lib64/ld-kfreebsd-x86-64.so.1"
Index: gcc/config/i386/kfreebsd-gnu.h
===
--- gcc/config/i386/kfreebsd-gnu.h  (revision 174566)
+++ gcc/config/i386/kfreebsd-gnu.h  (working copy)
@@ -1,5 +1,5 @@
 /* Definitions for Intel 386 running kFreeBSD-based GNU systems with ELF format
-   Copyright (C) 2004, 2007, 2011
+   Copyright (C) 2011
Free Software Foundation, Inc.
Contributed by Robert Millan.
 
@@ -19,11 +19,5 @@
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
-#undef GNU_USER_LINK_EMULATION
 #define GNU_USER_LINK_EMULATION "elf_i386_fbsd"
-
-#undef GNU_USER_DYNAMIC_LINKER32
-#define GNU_USER_DYNAMIC_LINKER32 "/lib/ld.so.1"
-
-#undef GNU_USER_DYNAMIC_LINKER64
-#define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-kfreebsd-x86-64.so.1"
+#define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
Index: gcc/config/i386/linux64.h
===
--- gcc/config/i386/linux64.h   (revision 174566)
+++ gcc/config/i386/linux64.h   (working copy)
@@ -24,6 +24,9 @@
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 http://www.gnu.org/licenses/.  */
 
+#define GNU_USER_LINK_EMULATION32 "elf_i386"
+#define GNU_USER_LINK_EMULATION64 "elf_x86_64"
+
 #define GLIBC_DYNAMIC_LINKER32 "/lib/ld-linux.so.2"
 #define GLIBC_DYNAMIC_LINKER64 "/lib64/ld-linux-x86-64.so.2"
 
Index: gcc/config/i386/gnu-user64.h
===
--- gcc/config/i386/gnu-user64.h(revision 174566)
+++ gcc/config/i386/gnu-user64.h(working copy)
@@ -69,7 +69,8 @@
  %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}
 
 #undef LINK_SPEC
-#define LINK_SPEC "%{" SPEC_64 ":-m elf_x86_64} %{" SPEC_32 ":-m elf_i386} \
+#define LINK_SPEC "%{" SPEC_64 ":-m " GNU_USER_LINK_EMULATION64 "} \
+                   %{" SPEC_32 ":-m " GNU_USER_LINK_EMULATION32 "} \
   %{shared:-shared} \
   %{!shared: \
 %{!static: \
Index: gcc/config/kfreebsd-gnu.h
===
--- gcc/config/kfreebsd-gnu.h   (revision 174566)
+++ gcc/config/kfreebsd-gnu.h   (working copy)
@@ -19,7 +19,6 @@
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
-#undef GNU_USER_TARGET_OS_CPP_BUILTINS
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()  \
   do   \
 {  \
@@ -31,5 +30,6 @@
 }

[PATCH, i386]: Introduce Y3 register constraint and merge SSE3 patterns

2011-06-02 Thread Uros Bizjak

Hello!

2011-06-02  Uros Bizjak  ubiz...@gmail.com

* config/i386/constraints.md (Y3): New register constraint.
* config/i386/sse.md (*vec_interleave_highv2df): Merge with
*sse3_interleave_highv2df and *sse2_interleave_highv2df.
(*vec_interleave_lowv2df): Merge with *sse3_interleave_lowv2df and
*sse2_interleave_lowv2df.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, committed to mainline SVN.

Uros.
Index: constraints.md
===
--- constraints.md  (revision 174570)
+++ constraints.md  (working copy)
@@ -99,6 +99,9 @@
 (define_register_constraint Y2 TARGET_SSE2 ? SSE_REGS : NO_REGS
  @internal Any SSE register, when SSE2 is enabled.)
 
+(define_register_constraint Y3 TARGET_SSE3 ? SSE_REGS : NO_REGS
+ @internal Any SSE register, when SSE3 is enabled.)
+
 (define_register_constraint Y4 TARGET_SSE4_1 ? SSE_REGS : NO_REGS
  @internal Any SSE register, when SSE4_1 is enabled.)
 
Index: sse.md
===
--- sse.md  (revision 174570)
+++ sse.md  (working copy)
@@ -3804,15 +3804,15 @@
 operands[2] = force_reg (V2DFmode, operands[2]);
 })
 
-(define_insn *sse3_interleave_highv2df
-  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,x,x,x,m)
+(define_insn *vec_interleave_highv2df
+  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,Y3,x,x,m)
(vec_select:V2DF
  (vec_concat:V4DF
-   (match_operand:V2DF 1 nonimmediate_operand  0,x,o,o,o,x)
-   (match_operand:V2DF 2 nonimmediate_operand  x,x,1,0,x,0))
+   (match_operand:V2DF 1 nonimmediate_operand  0,x,o ,o,o,x)
+   (match_operand:V2DF 2 nonimmediate_operand  x,x,1 ,0,x,0))
  (parallel [(const_int 1)
 (const_int 3)])))]
-  TARGET_SSE3  ix86_vec_interleave_v2df_operator_ok (operands, 1)
+  TARGET_SSE2  ix86_vec_interleave_v2df_operator_ok (operands, 1)
   @
unpckhpd\t{%2, %0|%0, %2}
vunpckhpd\t{%2, %1, %0|%0, %1, %2}
@@ -3826,23 +3826,6 @@
(set_attr prefix orig,vex,maybe_vex,orig,vex,maybe_vex)
(set_attr mode V2DF,V2DF,V2DF,V1DF,V1DF,V1DF)])
 
-(define_insn *sse2_interleave_highv2df
-  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,m)
-   (vec_select:V2DF
- (vec_concat:V4DF
-   (match_operand:V2DF 1 nonimmediate_operand  0,o,x)
-   (match_operand:V2DF 2 nonimmediate_operand  x,0,0))
- (parallel [(const_int 1)
-(const_int 3)])))]
-  TARGET_SSE2  ix86_vec_interleave_v2df_operator_ok (operands, 1)
-  @
-   unpckhpd\t{%2, %0|%0, %2}
-   movlpd\t{%H1, %0|%0, %H1}
-   movhpd\t{%1, %0|%0, %1}
-  [(set_attr type sselog,ssemov,ssemov)
-   (set_attr prefix_data16 *,1,1)
-   (set_attr mode V2DF,V1DF,V1DF)])
-
 ;; Recall that the 256-bit unpck insns only shuffle within their lanes.
 (define_expand avx_movddup256
   [(set (match_operand:V4DF 0 register_operand )
@@ -3923,15 +3906,15 @@
 operands[1] = force_reg (V2DFmode, operands[1]);
 })
 
-(define_insn *sse3_interleave_lowv2df
-  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,x,x,x,o)
+(define_insn *vec_interleave_lowv2df
+  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,Y3,x,x,o)
(vec_select:V2DF
  (vec_concat:V4DF
-   (match_operand:V2DF 1 nonimmediate_operand  0,x,m,0,x,0)
-   (match_operand:V2DF 2 nonimmediate_operand  x,x,1,m,m,x))
+   (match_operand:V2DF 1 nonimmediate_operand  0,x,m ,0,x,0)
+   (match_operand:V2DF 2 nonimmediate_operand  x,x,1 ,m,m,x))
  (parallel [(const_int 0)
 (const_int 2)])))]
-  TARGET_SSE3  ix86_vec_interleave_v2df_operator_ok (operands, 0)
+  TARGET_SSE2  ix86_vec_interleave_v2df_operator_ok (operands, 0)
   @
unpcklpd\t{%2, %0|%0, %2}
vunpcklpd\t{%2, %1, %0|%0, %1, %2}
@@ -3945,23 +3928,6 @@
(set_attr prefix orig,vex,maybe_vex,orig,vex,maybe_vex)
(set_attr mode V2DF,V2DF,V2DF,V1DF,V1DF,V1DF)])
 
-(define_insn *sse2_interleave_lowv2df
-  [(set (match_operand:V2DF 0 nonimmediate_operand =x,x,o)
-   (vec_select:V2DF
- (vec_concat:V4DF
-   (match_operand:V2DF 1 nonimmediate_operand  0,0,0)
-   (match_operand:V2DF 2 nonimmediate_operand  x,m,x))
- (parallel [(const_int 0)
-(const_int 2)])))]
-  TARGET_SSE2  ix86_vec_interleave_v2df_operator_ok (operands, 0)
-  @
-   unpcklpd\t{%2, %0|%0, %2}
-   movhpd\t{%2, %0|%0, %2}
-   movlpd\t{%2, %H0|%H0, %2}
-  [(set_attr type sselog,ssemov,ssemov)
-   (set_attr prefix_data16 *,1,1)
-   (set_attr mode V2DF,V1DF,V1DF)])
-
 (define_split
   [(set (match_operand:V2DF 0 memory_operand )
(vec_select:V2DF

Re: [google]Backport r174549 Fix 3 test cases incorrectly run in Thumb/Xscale (issue4524090)

2011-06-02 Thread Carrot Wei

OK for google/main.

thanks
Carrot

On Thu, Jun 2, 2011 at 12:51 PM, Jing Yu jin...@google.com wrote:
 http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00134.html
 Backport r174549 to fix three testcases that are specific to ARM mode
 and therefore should be skipped when compiling for thumb.

 Thanks,
 Jing

 2011-06-01  Jing Yu  jin...@google.com
        Backport r174549

        2011-06-01  Sofiane Naci  sofiane.n...@arm.com

        * gcc.target/arm/mmx-1.c: Skip test in -mthumb.
        * gcc.target/arm/g2.c: Skip test in -mthumb.
        Skip test unless cpu is xscale.
        * gcc.target/arm/scd42-2.c: Likewise.

 Index: gcc.target/arm/mmx-1.c
 ===
 --- gcc.target/arm/mmx-1.c      (revision 174299)
 +++ gcc.target/arm/mmx-1.c      (working copy)
 @@ -4,6 +4,7 @@
  /* { dg-skip-if Test is specific to the iWMMXt { arm*-*-* } { -mcpu=* } 
 { -mcpu=iwmmxt } } */
  /* { dg-skip-if Test is specific to the iWMMXt { arm*-*-* } { -mabi=* } 
 { -mabi=iwmmxt } } */
  /* { dg-skip-if Test is specific to the iWMMXt { arm*-*-* } { -march=* } 
 { -march=iwmmxt } } */
 +/* { dg-skip-if Test is specific to ARM mode { arm*-*-* } { -mthumb } { 
  } } */
  /* { dg-options -O -mno-apcs-frame -mcpu=iwmmxt -mabi=iwmmxt } */
  /* { dg-require-effective-target arm32 } */
  /* { dg-require-effective-target arm_iwmmxt_ok } */
 Index: gcc.target/arm/g2.c
 ===
 --- gcc.target/arm/g2.c (revision 174299)
 +++ gcc.target/arm/g2.c (working copy)
 @@ -2,6 +2,8 @@
  /* { dg-do compile } */
  /* { dg-options -mcpu=xscale -O2 } */
  /* { dg-skip-if Test is specific to the Xscale { arm*-*-* } { -march=* } 
 { -march=xscale } } */
 +/* { dg-skip-if Test is specific to the Xscale { arm*-*-* } { -mcpu=* } 
 { -mcpu=xscale } } */
 +/* { dg-skip-if Test is specific to ARM mode { arm*-*-* } { -mthumb } { 
  } } */
  /* { dg-require-effective-target arm32 } */

  /* Brett Gaines' test case. */
 Index: gcc.target/arm/scd42-2.c
 ===
 --- gcc.target/arm/scd42-2.c    (revision 174299)
 +++ gcc.target/arm/scd42-2.c    (working copy)
 @@ -2,6 +2,8 @@
  /* { dg-do compile } */
  /* { dg-options -mcpu=xscale -O } */
  /* { dg-skip-if Test is specific to the Xscale { arm*-*-* } { -march=* } 
 { -march=xscale } } */
 +/* { dg-skip-if Test is specific to the Xscale { arm*-*-* } { -mcpu=* } 
 { -mcpu=xscale } } */
 +/* { dg-skip-if Test is specific to ARM mode { arm*-*-* } { -mthumb } { 
  } } */
  /* { dg-require-effective-target arm32 } */

  unsigned load2(void) __attribute__ ((naked));

 --
 This patch is available for review at http://codereview.appspot.com/4524090

Re: Remove SETJMP_VIA_SAVE_AREA support

2011-06-02 Thread Eric Botcazou

 This exposed a couple of similar bugs in cse.c and postreload-gcse.c: the
 code was effectively treating a basic block with a single, abnormal
 incoming edge as if the edge was normal.

I've installed the following refined fix, after testing on i586-suse-linux and 
sparc-sun-solaris2.10.  Most EDGE_ABNORMAL edges can very likely be treated 
normally here, for example EH edges when call-saved registers are considered.
The only really problematic ones are EDGE_ABNORMAL_CALL edges when there is a 
non-local label in the function, because even call-saved registers are not 
guaranteed to be preserved in this case.
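
The refined condition can be restated as a small predicate. This is only a sketch: the flag constants below are hypothetical stand-ins for GCC's EDGE_ABNORMAL and EDGE_ABNORMAL_CALL bits, and the boolean parameter stands in for cfun->has_nonlocal_label.

```cpp
// Hypothetical flag bits; GCC's real EDGE_* values are defined elsewhere.
constexpr unsigned kEdgeAbnormal     = 1u << 0;
constexpr unsigned kEdgeAbnormalCall = 1u << 1;

// True when a pass that follows a single incoming edge must not treat the
// edge as normal: only abnormal call edges in a function that contains a
// non-local label, since call-saved registers may not be preserved there.
inline bool edge_must_be_avoided(unsigned edge_flags, bool has_nonlocal_label) {
    return (edge_flags & kEdgeAbnormalCall) != 0 && has_nonlocal_label;
}
```

Other abnormal edges, such as EH edges, fall through to the normal handling under this test, which is the relaxation relative to the earlier fix.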


2011-06-02  Eric Botcazou  ebotca...@adacore.com

* cse.c (cse_find_path): Refine change to exclude EDGE_ABNORMAL_CALL
edges only, when there is a non-local label in the function.
* postreload-gcse.c (bb_has_well_behaved_predecessors): Likewise.


-- 
Eric Botcazou
Index: cse.c
===
--- cse.c	(revision 174564)
+++ cse.c	(working copy)
@@ -6193,7 +6193,7 @@ cse_find_path (basic_block first_bb, str
 	e = NULL;
 
 	  if (e
-	      && (e->flags & EDGE_ABNORMAL) == 0
+	      && !((e->flags & EDGE_ABNORMAL_CALL) && cfun->has_nonlocal_label)
 	      && e->dest != EXIT_BLOCK_PTR
 	      && single_pred_p (e->dest)
 	  /* Avoid visiting basic blocks twice.  The large comment
Index: postreload-gcse.c
===
--- postreload-gcse.c	(revision 174564)
+++ postreload-gcse.c	(working copy)
@@ -912,12 +912,10 @@ get_avail_load_store_reg (rtx insn)
 static bool
 bb_has_well_behaved_predecessors (basic_block bb)
 {
-  unsigned int edge_count = EDGE_COUNT (bb->preds);
   edge pred;
   edge_iterator ei;
 
-  if (edge_count == 0
-      || (edge_count == 1 && (single_pred_edge (bb)->flags & EDGE_ABNORMAL)))
+  if (EDGE_COUNT (bb->preds) == 0)
     return false;
 
   FOR_EACH_EDGE (pred, ei, bb->preds)
@@ -925,6 +923,9 @@ bb_has_well_behaved_predecessors (basic_
       if ((pred->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (pred))
 	return false;
 
+      if ((pred->flags & EDGE_ABNORMAL_CALL) && cfun->has_nonlocal_label)
+	return false;
+
       if (JUMP_TABLE_DATA_P (BB_END (pred->src)))
 	return false;
 }

Re: [patch] add -Wdelete-non-virtual-dtor

2011-06-02 Thread Jonathan Wakely

On 2 June 2011 22:27, Jonathan Wakely wrote:
 -Wnon-virtual-dtor isn't always what you want, defining a polymorphic
 object without a virtual destructor is not necessarily a mistake. You
 may never delete such an object so instead of warning when the class
 is defined it's more useful to warn only when the class is deleted, as
 Clang does with -Wdelete-non-virtual-dtor

 This patch implements the same warning for G++.

That patch was the wrong one, with a typo in the new test. The correct
one, as tested, is attached, but only differs in the additional }
character in the testcase.
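
As a quick illustration of what the new warning is about, here is a minimal sketch with hypothetical class names. The risky pattern is kept in a comment so the code itself has no undefined behaviour; the compiled part shows the usual fix, a virtual destructor in the base class.

```cpp
// Counts how many times the derived destructor has run.
int g_derived_destroyed = 0;

// The shape the warning targets would look like this:
//   struct RiskyBase { virtual void f(); };  // polymorphic, non-virtual dtor
//   void g(RiskyBase* p) { delete p; }       // expected to warn
// If p pointed at a derived object, the derived destructor would be skipped.

struct SafeBase {
    virtual void f() {}
    virtual ~SafeBase() = default;  // virtual: delete via base is safe
};

struct SafeDerived : SafeBase {
    ~SafeDerived() override { ++g_derived_destroyed; }
};

// Deletes a derived object through a base pointer and reports how many
// derived destructors ran.
inline int delete_through_base() {
    g_derived_destroyed = 0;
    SafeBase* p = new SafeDerived;
    delete p;
    return g_derived_destroyed;
}
```

Because the warning fires at the delete expression and not at the class definition, a class like RiskyBase that is never deleted through a pointer stays silent, which is the difference from -Wnon-virtual-dtor described above.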
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 174539)
+++ c-family/c.opt  (working copy)
@@ -331,6 +331,10 @@ Wdeclaration-after-statement
 C ObjC Var(warn_declaration_after_statement) Warning
 Warn when a declaration is found after a statement
 
+Wdelete-non-virtual-dtor
+C++ ObjC++ Var(warn_delnonvdtor) Warning
+Warn about deleting polymorphic objects with non-virtual destructors
+
 Wdeprecated
 C C++ ObjC ObjC++ Var(warn_deprecated) Init(1) Warning
 Warn if a deprecated compiler feature, class, method, or field is used
Index: c-family/c-opts.c
===
--- c-family/c-opts.c   (revision 174539)
+++ c-family/c-opts.c   (working copy)
@@ -405,6 +405,7 @@ c_common_handle_option (size_t scode, co
   warn_sign_compare = value;
  warn_reorder = value;
   warn_cxx0x_compat = value;
+  warn_delnonvdtor = value;
}
 
   cpp_opts-warn_trigraphs = value;
Index: cp/init.c
===
--- cp/init.c   (revision 174539)
+++ cp/init.c   (working copy)
@@ -3421,6 +3421,31 @@ build_delete (tree type, tree addr, spec
}
  complete_p = false;
}
+  else if (warn_delnonvdtor && MAYBE_CLASS_TYPE_P (type)
+           && !CLASSTYPE_FINAL (type) && TYPE_POLYMORPHIC_P (type))
+   {
+ tree dtor;
+ dtor = CLASSTYPE_DESTRUCTORS (type);
+ if (!dtor || !DECL_VINDEX (dtor))
+   {
+ tree x;
+ bool abstract = false;
+ for (x = TYPE_METHODS (type); x; x = DECL_CHAIN (x))
+   if (DECL_PURE_VIRTUAL_P (x))
+ {
+   abstract = true;
+   break;
+ }
+ if (abstract)
+   warning(OPT_Wdelete_non_virtual_dtor, "deleting object of "
+           "abstract class type %qT which has non-virtual "
+           "destructor will cause undefined behaviour", type);
+ else
+   warning(OPT_Wdelete_non_virtual_dtor, "deleting object of "
+           "polymorphic class type %qT which has non-virtual "
+           "destructor may cause undefined behaviour", type);
+   }
+   }
}
   if (VOID_TYPE_P (type) || !complete_p || !MAYBE_CLASS_TYPE_P (type))
/* Call the builtin operator delete.  */
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 174539)
+++ doc/invoke.texi (working copy)
@@ -2331,6 +2331,15 @@ Warn when a class seems unusable because
 destructors in that class are private, and it has neither friends nor
 public static member functions.
 
+@item -Wdelete-non-virtual-dtor @r{(C++ and Objective-C++ only)}
+@opindex Wdelete-non-virtual-dtor
+@opindex Wno-delete-non-virtual-dtor
+Warn when @samp{delete} is used to destroy an instance of a class which
+has virtual functions and non-virtual destructor. It is unsafe to delete
+an instance of a derived class through a pointer to a base class if the
+base class does not have a virtual destructor.  This warning is enabled
+by @option{-Wall}.
+
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
 @opindex Wno-noexcept
Index: testsuite/g++.dg/warn/delete-non-virtual-dtor.C
===
--- testsuite/g++.dg/warn/delete-non-virtual-dtor.C (revision 0)
+++ testsuite/g++.dg/warn/delete-non-virtual-dtor.C (revision 0)
@@ -0,0 +1,44 @@
+// { dg-options -std=gnu++0x -Wdelete-non-virtual-dtor }
+// { dg-do compile }
+
+struct polyBase { virtual void f(); };
+
+void f(polyBase* p, polyBase* arr)
+{
+  delete p;  // { dg-warning non-virtual destructor may }
+  delete [] arr;
+}
+
+struct polyDerived : polyBase { };
+
+void f(polyDerived* p, polyDerived* arr)
+{
+  delete p;  // { dg-warning non-virtual destructor may }
+  delete [] arr;
+}
+
+struct absDerived : polyBase { virtual void g() = 0; };
+
+void f(absDerived* p, absDerived* arr)
+{
+  delete p;  // { dg-warning non-virtual destructor will }
+  delete [] arr;
+}
+
+struct finalDerived

[patch committed] Fix PR target/49163

2011-06-02 Thread Kaz Kojima

Hi,

The attached patch is to fix PR target/49163.
The problem occurs with an unrecognizable insn like

(insn 66 141 110 4 (set (reg:SI 4 r4)
(sign_extend:SI (subreg:QI (mem/s/v/u/c:DI (plus:SI (reg/f:SI 7 r7 
[192])
(const_int 12 [0xc])) [3 s2array[1][0].f0+0 S8 A32]) 
0))) iii.c:39 164 {*extendqisi2_compact}
 (expr_list:REG_DEAD (reg/f:SI 7 r7 [192])
(nil)))

which is an intermediate insn in reload.
SH treats a memory address of the form (plus (reg) (const_int)) as
invalid for HImode/QImode, because SH's mov.b instruction can take only
R0 as the other operand for such a memory reference and the compiler
does not handle that case well.  The patch makes the constraints for
some move insns stricter about invalid addresses of this type, so as to
avoid generating a problematic move insn.
The patch is tested on sh4-unknown-linux-gnu with no new failures,
and the new test is also tested on i686-pc-linux-gnu.
Applied on trunk.

Regards,
kaz
--
2011-06-02  Kaz Kojima  kkoj...@gcc.gnu.org

PR target/49163
* config/sh/predicates.md (general_movsrc_operand): Return 0
for memory and memory subreg of which address is an invalid
indexed address for QI and HImode.
(general_movdst_operand): Likewise.

[testsuite]
PR target/49163
* gcc.c-torture/compile/pr49163.c: New.

diff -uprN ORIG/trunk/gcc/config/sh/predicates.md trunk/gcc/config/sh/predicates.md
--- ORIG/trunk/gcc/config/sh/predicates.md  2010-04-12 09:52:36.0 +0900
+++ trunk/gcc/config/sh/predicates.md   2011-06-02 10:17:40.0 +0900
@@ -394,6 +394,18 @@
return 0;
 }
 
+  if ((mode == QImode || mode == HImode)
+      && (MEM_P (op)
+	  || (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))))
+    {
+      rtx x = XEXP ((MEM_P (op) ? op : SUBREG_REG (op)), 0);
+
+      if (GET_CODE (x) == PLUS
+	  && REG_P (XEXP (x, 0))
+	  && CONST_INT_P (XEXP (x, 1)))
+	return sh_legitimate_index_p (mode, XEXP (x, 1));
+    }
+
   if (TARGET_SHMEDIA
       && (GET_CODE (op) == PARALLEL || GET_CODE (op) == CONST_VECTOR)
       && sh_rep_vec (op, mode))
@@ -419,6 +431,18 @@
       && ! (high_life_started || reload_completed))
 return 0;
 
+  if ((mode == QImode || mode == HImode)
+      && (MEM_P (op)
+	  || (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)))))
+    {
+      rtx x = XEXP ((MEM_P (op) ? op : SUBREG_REG (op)), 0);
+
+      if (GET_CODE (x) == PLUS
+	  && REG_P (XEXP (x, 0))
+	  && CONST_INT_P (XEXP (x, 1)))
+	return sh_legitimate_index_p (mode, XEXP (x, 1));
+    }
+
   return general_operand (op, mode);
 })
 
diff -uprN ORIG/trunk/gcc/testsuite/gcc.c-torture/compile/pr49163.c trunk/gcc/testsuite/gcc.c-torture/compile/pr49163.c
--- ORIG/trunk/gcc/testsuite/gcc.c-torture/compile/pr49163.c	1970-01-01 09:00:00.0 +0900
+++ trunk/gcc/testsuite/gcc.c-torture/compile/pr49163.c 2011-06-02 20:58:31.0 +0900
@@ -0,0 +1,35 @@
+/* PR target/49163 */
+struct S1
+{
+ unsigned f0:18;
+ int f1;
+} __attribute__ ((packed));
+
+struct S2
+{
+  volatile long long f0;
+  int f1;
+};
+
+struct S1 s1;
+struct S2 s2;
+const struct S2 s2array[2][1] = { };
+
+struct S2 **sptr;
+
+extern int bar (char a, long long b, int * c, long long d, long long e);
+extern int baz (void);
+
+int i;
+int *ptr;
+
+void
+foo (int *arg)
+{
+  for (i = 0; i < 1; i = baz())
+{
+  *arg = *(int *)sptr;
+  *ptr = bar (*arg, s2.f1, ptr,
+ bar (s2array[1][0].f0, *arg, ptr, s1.f1, *ptr), *arg);
+}
+}

Re: Initialize INSN_COND (was: C6X port 5/11: Track predication conditions more accurately)

2011-06-02 Thread Steve Ellcey

On Thu, 2011-06-02 at 15:29 +0400, Alexander Monakov wrote:
 Bernd,
 
 The problem is INSN_COND should be reset when initializing a new deps
 structure, otherwise instructions may get stale conditions from other
 previously analyzed instructions.  Presuming that sd_init_insn is the
 proper place for that, I'll test the following patch.
 
 
 2011-06-02  Alexander Monakov  amona...@ispras.ru
 
   * sched-deps.c (sd_init_insn): Initialize INSN_COND.
   * sel-sched.c (move_op): Use correct type for 'res'.  Verify that
   code_motion_path_driver returned 0 or 1.

I tested this patch on my IA64 HP-UX box and did not see any
regressions.  It fixed the problem I was having.

Steve Ellcey
s...@cup.hp.com

libobjc: remove deprecated API (patch 1)

2011-06-02 Thread Nicola Pero

This patch removes a number of deprecated libobjc functions and methods, which
are part of the Traditional Objective-C API that was deprecated in GCC 4.6.x
and is to be removed in GCC 4.7.0.

It's the first of a long sequence of patches that do this removal one bit at
a time.  This one removes the deprecated objc_error(), objc_verror() and
objc_set_error_handler() functions, and all the deprecated Object methods whose
implementation used these functions.

Unfortunately, all of our testcases use the Traditional Objective-C API when
testing the GNU runtime, and they will need to be updated to use the Modern
Objective-C API because the Traditional Objective-C API is simply going away.
I'll update the relevant testcases with each patch.  This first patch requires
only a tiny update of a single testcase.

Committed to trunk.

Thanks

Index: libobjc/sendmsg.c
===
--- libobjc/sendmsg.c   (revision 174585)
+++ libobjc/sendmsg.c   (working copy)
@@ -977,16 +977,8 @@ __objc_forward (id object, SEL sel, arglist_t args
  : instance ),
  object->class_pointer->name, sel_getName (sel));
 
-/* TODO: support for error: is surely deprecated ? */
-err_sel = sel_get_any_uid ("error:");
-if (__objc_responds_to (object, err_sel))
-  {
-   imp = get_implementation (object, object->class_pointer, err_sel);
-   return (*imp) (object, sel_get_any_uid ("error:"), msg);
-  }
-
-/* The object doesn't respond to doesNotRecognize: or error:;
-   Therefore, a default action is taken.  */
+/* The object doesn't respond to doesNotRecognize:.  Therefore, a
+   default action is taken.  */
 _objc_abort ("%s\n", msg);
 
 return 0;
Index: libobjc/Makefile.in
===
--- libobjc/Makefile.in (revision 174585)
+++ libobjc/Makefile.in (working copy)
@@ -139,7 +139,6 @@ OBJC_DEPRECATED_H = \
   STR.h \
   hash.h \
   objc-list.h \
-  objc_error.h \
   objc_get_uninstalled_dtable.h \
   objc_malloc.h \
   objc_msg_sendv.h \
Index: libobjc/libobjc.def
===
--- libobjc/libobjc.def (revision 174585)
+++ libobjc/libobjc.def (working copy)
@@ -25,7 +25,6 @@ search_for_method_in_list
 objc_get_uninstalled_dtable
 objc_hash_is_key_in_hash
 hash_is_key_in_hash
-objc_verror
 _objc_load_callback
 objc_malloc
 objc_atomic_malloc
@@ -53,7 +52,6 @@ objc_thread_remove
 __objc_class_name_Object
 __objc_class_name_Protocol
 __objc_class_name_NXConstantString
-objc_error
 __objc_object_alloc
 __objc_object_copy
 __objc_object_dispose
Index: libobjc/error.c
===
--- libobjc/error.c (revision 174585)
+++ libobjc/error.c (working copy)
@@ -45,53 +45,3 @@ _objc_abort (const char *fmt, ...)
   abort ();
   va_end (ap);
 }
-
-/* The rest of the file is deprecated.  */
-#include "objc/objc-api.h" /* For objc_error_handler.  */
-
-/*
-** Error handler function
-** NULL so that default is to just print to stderr
-*/
-static objc_error_handler _objc_error_handler = NULL;
-
-/* Trigger an objc error */
-void
-objc_error (id object, int code, const char *fmt, ...)
-{
-  va_list ap;
-
-  va_start (ap, fmt);
-  objc_verror (object, code, fmt, ap);
-  va_end (ap);
-}
-
-/* Trigger an objc error */
-void
-objc_verror (id object, int code, const char *fmt, va_list ap)
-{
-  BOOL result = NO;
-
-  /* Call the error handler if its there
- Otherwise print to stderr */
-  if (_objc_error_handler)
-result = (*_objc_error_handler) (object, code, fmt, ap);
-  else
-vfprintf (stderr, fmt, ap);
-
-  /* Continue if the error handler says its ok
- Otherwise abort the program */
-  if (result)
-return;
-  else
-abort ();
-}
-
-/* Set the error handler */
-objc_error_handler
-objc_set_error_handler (objc_error_handler func)
-{
-  objc_error_handler temp = _objc_error_handler;
-  _objc_error_handler = func;
-  return temp;
-}
Index: libobjc/ChangeLog
===
--- libobjc/ChangeLog   (revision 174585)
+++ libobjc/ChangeLog   (working copy)
@@ -1,3 +1,19 @@
+2011-06-02  Nicola Pero  nicola.p...@meta-innovation.com
+
+   * Makefile.in (OBJC_DEPRECATED_H): Removed objc_error.h.
+   * objc/deprecated/objc_error.h: Removed.
+   * objc/objc-api.h: Do not include deprecated/objc_error.h.
+   * libobjc.def (objc_error, objc_verror): Removed.
+   * error.c (_objc_error_handler, objc_error, objc_verror,
+   objc_set_error_handler): Removed.
+   * Object.m ([-error:], [-perform:], [-perform:with:],
+   [-perform:with:with], [-subclassResponsibility:],
+   [-notImplemented:], [-shouldNotImplement:], [-doesNotRecognize:]):
+   Removed.
+   * objc/deprecated/Object.h: Removed the same methods.
+   * sendmsg.c (__objc_forward): Do not try to

Re: Ping: Re: Improve DSE in the presence of calls

2011-06-02 Thread Easwaran Raman

Ping.

On Sat, May 14, 2011 at 8:01 AM, Easwaran Raman era...@google.com wrote:
 http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00781.html

Re: fix left-over debug insns in DCE

2011-06-02 Thread Alexandre Oliva

On Jun  2, 2011, Eric Botcazou ebotca...@adacore.com wrote:

 Why can't the problem be addressed purely within DF?

Hmm...  Maybe it could, I'm not sure.  The problem is that DCE removes
insns, and then DF associates remaining uses in debug insns to earlier
DEFs.  Adjusting debug insns in DCE is right per the VTA design motto:
decide as if debug insns weren't there, adjust them as you would adjust
non-debug insns.  This code borrowed from DF into DCE is the “adjust”
bit.

 Starting to spill the DF 
 logic to individual RTL passes doesn't look very appealing to me.

Propagation of uses isn't DF-specific material, it just so happened that
it offered an adequate interface.  Other passes already have their own
propagation machinery, but it didn't look quite as suitable.

 This is the patch I ended up with.  Regstrapped on x86_64-linux-gnu and
 i686-linux-gnu.  Ok to install?

 OK for the usual debug insn bookkeeping, i.e.

Err...  These depend on the interface changes of functions defined
within DF to work.  Should they perhaps be moved out of DF-specific
files?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [PR debug/47590] rework md option overriding to delay var-tracking

2011-06-02 Thread Alexandre Oliva

On Jun  1, 2011, Bernd Schmidt ber...@codesourcery.com wrote:

 On 06/02/2011 12:47 AM, Alexandre Oliva wrote:
 On Jun  1, 2011, Bernd Schmidt ber...@codesourcery.com wrote:
 Looks ok, except I think you need to update tm.texi.in and tm.texi?
 
 Oh, I didn't realize updating tm.texi.in; AFAICT tm.texi is generated
 the same regardless.

 I *think* what one is supposed to do is to just add the @hook lines in
 tm.texi.in if the definition in target.def includes documentation.

Right you are, though it looks like leaving the @hook lines out makes no
difference.  Anyhow, here's the patch I'm checking in.
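
The idea behind the two new hooks can be sketched as follows.  This is
a minimal model, not GCC code: struct target_hooks, struct options and
both functions are hypothetical stand-ins, and the reorg-side test is
an assumption based on the "Test original variables" ChangeLog entries
below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target hook vector; only the two
   fields discussed in this thread are modelled.  */
struct target_hooks
{
  bool delay_sched2;
  bool delay_vartrack;
};

/* Hypothetical stand-in for the user-visible option flags.  */
struct options
{
  int optimize;
  bool schedule_insns_after_reload;
};

/* Gate for the generic sched2 pass: the user's flag is left untouched,
   and the target hook alone suppresses the pass at its normal place.  */
static bool
gate_sched2 (const struct options *opts, const struct target_hooks *t)
{
  assert (opts && t);
  return opts->optimize > 0
	 && opts->schedule_insns_after_reload
	 && !t->delay_sched2;
}

/* What a back end's machine-specific reorg might test (assumed): the
   original flag, now that it is no longer overridden, plus the hook.  */
static bool
reorg_wants_sched2 (const struct options *opts, const struct target_hooks *t)
{
  assert (opts && t);
  return opts->optimize > 0
	 && opts->schedule_insns_after_reload
	 && t->delay_sched2;
}
```

Under these assumptions exactly one of the two places runs the
scheduler for a given target, and the option variables keep the values
the user asked for.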

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/47590
	* target.def (delay_sched2, delay_vartrack): New.
	* doc/tm.texi.in: Update.
	* doc/tm.texi: Rebuild.
	* sched-rgn.c (gate_handle_sched2): Fail if delay_sched2.
	* var-tracking.c (gate_handle_var_tracking): Likewise.
	* config/bfin/bfin.c (bfin_flag_schedule_insns2): Drop.
	(bfin_flag_var_tracking): Drop.
	(output_file_start): Don't save and override flag_var_tracking.
	(bfin_option_override): Ditto flag_schedule_insns_after_reload.
	(bfin_reorg): Test original variables.
	(TARGET_DELAY_SCHED2, TARGET_DELAY_VARTRACK): Define.
	* config/ia64/ia64.c (ia64_flag_schedule_insns2): Drop.
	(ia64_flag_var_tracking): Drop.
	(TARGET_DELAY_SCHED2, TARGET_DELAY_VARTRACK): Define.
	(ia64_file_start): Don't save and override flag_var_tracking.
	(ia64_override_options_after_change): Ditto
	flag_schedule_insns_after_reload.
	(ia64_reorg): Test original variables.
	* config/picochip/picochip.c (picochip_flag_schedule_insns2): Drop.
	(picochip_flag_var_tracking): Drop.
	(TARGET_DELAY_SCHED2, TARGET_DELAY_VARTRACK): Define.
	(picochip_option_override): Don't save and override
	flag_schedule_insns_after_reload.
	(picochip_asm_file_start): Ditto flag_var_tracking.
	(picochip_reorg): Test original variables.
	* config/spu/spu.c (spu_flag_var_tracking): Drop.
	(TARGET_DELAY_VARTRACK): Define.
	(spu_var_tracking): New.
	(spu_machine_dependent_reorg): Call it.
	(asm_file_start): Don't save and override flag_var_tracking.

Index: gcc/target.def
===
--- gcc/target.def.orig	2011-05-30 03:53:29.0 -0300
+++ gcc/target.def	2011-05-31 17:50:09.733284971 -0300
@@ -2717,6 +2717,16 @@ DEFHOOKPOD
  in particular GDB does not use them.",
  bool, false)
 
+DEFHOOKPOD
+(delay_sched2, "True if sched2 is not to be run at its normal place.  \
+This usually means it will be run as part of machine-specific reorg.",
+bool, false)
+
+DEFHOOKPOD
+(delay_vartrack, "True if vartrack is not to be run at its normal place.  \
+This usually means it will be run as part of machine-specific reorg.",
+bool, false)
+
 /* Leave the boolean fields at the end.  */
 
 /* Close the 'struct gcc_target' definition.  */
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in.orig	2011-06-01 19:45:28.725386885 -0300
+++ gcc/doc/tm.texi.in	2011-06-01 19:53:47.394534907 -0300
@@ -9353,6 +9353,10 @@ tables, and hence is desirable if it wor
 
 @hook TARGET_WANT_DEBUG_PUB_SECTIONS
 
+@hook TARGET_DELAY_SCHED2
+
+@hook TARGET_DELAY_VARTRACK
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, @var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi.orig	2011-05-30 03:53:29.0 -0300
+++ gcc/doc/tm.texi	2011-06-01 19:54:09.126494927 -0300
@@ -9432,6 +9432,14 @@ tables, and hence is desirable if it wor
 True if the @code{.debug_pubtypes} and @code{.debug_pubnames} sections should be emitted.  These sections are not used on most platforms, and in particular GDB does not use them.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_DELAY_SCHED2
+True if sched2 is not to be run at its normal place.  This usually means it will be run as part of machine-specific reorg.
+@end deftypevr
+
+@deftypevr {Target Hook} bool TARGET_DELAY_VARTRACK
+True if vartrack is not to be run at its normal place.  This usually means it will be run as part of machine-specific reorg.
+@end deftypevr
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, @var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
Index: gcc/sched-rgn.c
===
--- gcc/sched-rgn.c.orig	2011-04-06 00:24:12.0 -0300
+++ gcc/sched-rgn.c	2011-05-31 17:43:02.584808465 -0300
@@ -3508,7 +3508,7 @@ gate_handle_sched2 (void)
 {
 #ifdef INSN_SCHEDULING
   return optimize > 0 && flag_schedule_insns_after_reload
-    && dbg_cnt (sched2_func);
+    && !targetm.delay_sched2 && dbg_cnt (sched2_func);
 #else
   return 0;
 #endif

Re: [PR48866] three alternative fixes

2011-06-02 Thread Alexandre Oliva

On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:

 On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:
 1. emit debug temps for replaceable DEFs that end up being referenced in
 debug insns.  We already have some code to try to deal with this, but it
 emits the huge expressions we'd rather avoid, and it may create
 unnecessary duplication.  This new approach emits a placeholder instead
 of skipping replaceable DEFs altogether, and then, if the DEF is
 referenced in a debug insn (perhaps during the late debug re-expansion of
 some other placeholder), it is expanded.  Placeholders that end up not
 being referenced are then thrown away.

 This is my favorite option, for it's safest: it doesn't change
 executable code at all (or should I say it *shouldn't* change it, for I
 haven't verified that it doesn't), retaining any register pressure
 benefits from TER.

This revised and retested version records expansions in an array indexed
on SSA version rather than a pointer_map, as suggested by Matz.
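
As an illustration of the array-based bookkeeping, here is a minimal
standalone sketch.  It is not the patch's code: expansion_t is a
hypothetical stand-in for rtx, and the function names are invented for
the example.  It only shows the technique of a zero-initialized table
with one slot per SSA version, addressed by index.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RTL expansion; GCC uses rtx here.  */
typedef const char *expansion_t;

/* One slot per SSA version; NULL means "not expanded yet".  */
static expansion_t *expansions;
static unsigned int n_versions;

/* Allocate a zeroed table with COUNT slots (COUNT must be nonzero).  */
static void
expansions_init (unsigned int count)
{
  assert (!expansions && count > 0);
  expansions = calloc (count, sizeof *expansions);
  if (!expansions)
    abort ();
  n_versions = count;
}

/* Return the address of the slot for VERSION, so callers can both
   read and record an expansion without any hashing.  */
static expansion_t *
expansion_slot (unsigned int version)
{
  assert (expansions && version < n_versions);
  return &expansions[version];
}

/* Release the table at the end of the function being expanded.  */
static void
expansions_fini (void)
{
  assert (expansions);
  free (expansions);
  expansions = NULL;
  n_versions = 0;
}
```

Since SSA versions are small dense integers, the lookup is a single
index operation, at the cost of allocating a slot for every version
whether or not it is replaceable.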

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/48866
	* cfgexpand.c (def_expansions): New.
	(def_expansions_init): New.
	(def_expansions_remove_placeholder, def_expansions_fini): New.
	(def_get_expansion_ptr): New.
	(expand_debug_expr): Create debug temps as needed.
	(expand_debug_insn): New, split out of...
	(expand_debug_locations): ... this.
	(gen_emit_debug_insn): New, split out of...
	(expand_gimple_basic_block): ... this.  Simplify expansion of
	debug stmts.  Emit placeholders for replaceable DEFs, rather
	than debug temps at last non-debug uses.
	(gimple_expand_cfg): Initialize and finalize expansions cache.

Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c.orig	2011-06-01 19:45:02.520428653 -0300
+++ gcc/cfgexpand.c	2011-06-01 20:20:02.014975168 -0300
@@ -2337,6 +2337,70 @@ convert_debug_memory_address (enum machi
   return x;
 }
 
+/* Mark debug insns that are placeholders for replaceable SSA_NAMEs
+   that have not been expanded yet.  */
+#define DEBUG_INSN_TOEXPAND(RTX)	\
+  (RTL_FLAG_CHECK1("DEBUG_INSN_TOEXPAND", (RTX), DEBUG_INSN)->used)
+
+/* Map replaceable SSA_NAMEs versions to their RTL expansions.  */
+static rtx *def_expansions;
+
+/* Initialize the def_expansions data structure.  This is to be called
+   before expansion of a function starts.  */
+
+static void
+def_expansions_init (void)
+{
+  gcc_checking_assert (!def_expansions);
+  def_expansions = XCNEWVEC (rtx, num_ssa_names);
+}
+
+/* Remove the DEBUG_INSN INSN if it still binds an SSA_NAME.  */
+
+static bool
+def_expansions_remove_placeholder (rtx insn)
+{
+  gcc_checking_assert (insn);
+
+  if (TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == SSA_NAME)
+{
+  gcc_assert (!DEBUG_INSN_TOEXPAND (insn));
+  remove_insn (insn);
+}
+
+  return true;
+}
+
+/* Finalize the def_expansions data structure.  This is to be called
+   at the end of the expansion of a function.  */
+
+static void
+def_expansions_fini (void)
+{
+  int i = num_ssa_names;
+
+  gcc_checking_assert (def_expansions);
+  while (i--)
+if (def_expansions[i])
+  def_expansions_remove_placeholder (def_expansions[i]);
+  XDELETEVEC (def_expansions);
+  def_expansions = NULL;
+}
+
+/* Return a pointer to the rtx expanded from EXP.  EXP must be a
+   replaceable SSA_NAME.  */
+
+static rtx *
+def_get_expansion_ptr (tree exp)
+{
+  gcc_checking_assert (def_expansions);
+  gcc_checking_assert (TREE_CODE (exp) == SSA_NAME);
+  gcc_checking_assert (bitmap_bit_p (SA.values, SSA_NAME_VERSION (exp)));
+  return &def_expansions[SSA_NAME_VERSION (exp)];
+}
+
+static void expand_debug_insn (rtx insn);
+
 /* Return an RTX equivalent to the value of the tree expression
EXP.  */
 
@@ -3131,7 +3195,30 @@ expand_debug_expr (tree exp)
 	gimple g = get_gimple_for_ssa_name (exp);
 	if (g)
 	  {
-	op0 = expand_debug_expr (gimple_assign_rhs_to_tree (g));
+	rtx insn = *def_get_expansion_ptr (exp);
+	tree vexpr;
+
+	/* If this still has the original SSA_NAME, emit a debug
+	   temp and compute the RTX value.  */
+	if (TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == SSA_NAME)
+	  {
+		tree var = SSA_NAME_VAR (INSN_VAR_LOCATION_DECL (insn));
+
+		vexpr = make_node (DEBUG_EXPR_DECL);
+		DECL_ARTIFICIAL (vexpr) = 1;
+		TREE_TYPE (vexpr) = TREE_TYPE (var);
+		DECL_MODE (vexpr) = DECL_MODE (var);
+		INSN_VAR_LOCATION_DECL (insn) = vexpr;
+
+		gcc_checking_assert (!DEBUG_INSN_TOEXPAND (insn));
+		DEBUG_INSN_TOEXPAND (insn) = 1;
+		expand_debug_insn (insn);
+	  }
+	else
+	  vexpr = INSN_VAR_LOCATION_DECL (insn);
+
+	op0 = expand_debug_expr (vexpr);
+
 	if (!op0)
 	  return NULL;
 	  }
@@ -3293,6 +3380,45 @@ expand_debug_expr (tree exp)
 }
 }
 
+/* Expand the LOC value of the debug insn INSN.  */
+
+static void
+expand_debug_insn (rtx insn)
+{
+  tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
+  rtx val;
+  enum machine_mode mode;
+
+

Re: [PR48866] three alternative fixes

2011-06-02 Thread Alexandre Oliva

On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:

 On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:
 3. expand dominators before dominated blocks, so that DEFs of
 replaceable SSA names are expanded before their uses.  Expand them when
 they're encountered, but not requiring a REG as a result.  Save the RTL
 expression that results from the expansion for use in debug insns and at
 the non-debug use.

 This patch addresses some of the problems in 2, avoiding expanding code
 out of order within a block, and (hopefully) ensuring that, expanding
 dominators before dominated blocks, DEFs are expanded before USEs.  There
 is a theoretical possibility that a USE may be expanded before a DEF,
 depending on internal details of out-of-ssa, but should this ever
 happen, we'll get a failed assertion, and then disabling TER will work
 around the problem.

I also posted the wrong patch upthread for this variant.  The one I
posted didn't work at all, because it contained a last-minute
optimization that changed the expansion of replaceable stmts from
EXPAND_NORMAL to EXPAND_SUM.  IIRC the former always yielded a pseudo,
whereas the latter enabled replacements, but it also exposed the need
for better handling of non-general_operands when the use expects one.

This revised and retested version also drops the reordering of the
expansion of basic blocks, that Matz pointed out was unnecessary, and
switches to an array rather than a pointer_map to record the expansions.

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/48866
	* cfgexpand.c (def_expansions): New.
	(def_expansion_recent_tree, def_expansion_recent_rtx): New.
	(def_expansions_init, def_expansions_fini): New.
	(def_has_expansion_ptr, def_get_expansion_ptr): New.
	(expand_debug_expr): Use recorded expansion if available.
	(expand_gimple_basic_block): Prepare to record expansion of
	replaceable defs.  Change return type to void.
	(gimple_expand_cfg): Initialize and finalize expansions cache.
	Expand dominator blocks before dominated.
	* expr.c (expand_expr_real_1): Use recorded expansion of
	replaceable defs.
	* expr.h (def_has_expansion_ptr): Declare.

Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c.orig	2011-06-01 20:39:58.244953408 -0300
+++ gcc/cfgexpand.c	2011-06-01 21:44:38.005879125 -0300
@@ -2337,6 +2337,42 @@ convert_debug_memory_address (enum machi
   return x;
 }
 
+/* Map replaceable SSA_NAMEs to their RTL expansions.  */
+static rtx *def_expansions;
+
+/* Initialize the def_expansions data structure.  This is to be called
+   before expansion of a function starts.  */
+
+static void
+def_expansions_init (void)
+{
+  gcc_checking_assert (!def_expansions);
+  def_expansions = XCNEWVEC (rtx, num_ssa_names);
+}
+
+/* Finalize the def_expansions data structure.  This is to be called
+   at the end of the expansion of a function.  */
+
+static void
+def_expansions_fini (void)
+{
+  gcc_checking_assert (def_expansions);
+  XDELETEVEC (def_expansions);
+  def_expansions = NULL;
+}
+
+/* Return a pointer to the rtx expanded from EXP.  EXP must be a
+   replaceable SSA_NAME.  */
+
+rtx *
+def_get_expansion_ptr (tree exp)
+{
+  gcc_checking_assert (def_expansions);
+  gcc_checking_assert (TREE_CODE (exp) == SSA_NAME);
+  gcc_checking_assert (bitmap_bit_p (SA.values, SSA_NAME_VERSION (exp)));
+  return &def_expansions[SSA_NAME_VERSION (exp)];
+}
+
 /* Return an RTX equivalent to the value of the tree expression
EXP.  */
 
@@ -3131,7 +3167,16 @@ expand_debug_expr (tree exp)
 	gimple g = get_gimple_for_ssa_name (exp);
 	if (g)
 	  {
-	op0 = expand_debug_expr (gimple_assign_rhs_to_tree (g));
+	rtx *xp = def_get_expansion_ptr (exp);
+
+	if (xp)
+	  op0 = copy_rtx (*xp);
+	else
+	  op0 = NULL;
+
+	if (!op0)
+	  op0 = expand_debug_expr (gimple_assign_rhs_to_tree (g));
+
 	if (!op0)
 	  return NULL;
 	  }
@@ -3618,20 +3663,38 @@ expand_gimple_basic_block (basic_block b
 	}
 	  else
 	{
+	  rtx *xp = NULL;
 	  def_operand_p def_p;
 	  def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
 
-	  if (def_p != NULL)
+	  /* Ignore this stmt if it is in the list of
+		 replaceable expressions.  */
+	  if (def_p != NULL
+	      && SA.values
+	      && bitmap_bit_p (SA.values,
+			       SSA_NAME_VERSION (DEF_FROM_PTR (def_p))))
 		{
-		  /* Ignore this stmt if it is in the list of
-		 replaceable expressions.  */
-		  if (SA.values
-		      && bitmap_bit_p (SA.values,
-				       SSA_NAME_VERSION (DEF_FROM_PTR (def_p))))
-		continue;
+		  tree def = DEF_FROM_PTR (def_p);
+		  gimple g = get_gimple_for_ssa_name (def);
+		  rtx retval;
+
+		  last = get_last_insn ();
+
+		  retval = expand_expr (gimple_assign_rhs_to_tree (g),
+	NULL_RTX, VOIDmode, EXPAND_SUM);
+
+		  xp = def_get_expansion_ptr (def);
+		  gcc_checking_assert (!*xp);
+		  *xp = retval;
 		}
-	  last = expand_gimple_stmt (stmt);
+	  else
+

Re: [PR48866] three alternative fixes

2011-06-02 Thread Alexandre Oliva

Ugh, failed to refresh the patch file, resending with the correct one.

On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:

 On May 30, 2011, Alexandre Oliva aol...@redhat.com wrote:
 2. emit placeholders for replaceable DEFs and, when the DEFs are
 expanded at their point of use, emit the expansion next to the
 placeholder, rather than at the current stream.  The result of the
 expansion is saved and used in debug insns that reference the
 replaceable DEF.  If the result is forced into a REG shortly thereafter,
 the code resulting from this is also emitted next to the placeholder,
 and the saved expansion is updated.  If the USE is expanded before the
 DEF, the insn stream resulting from the expansion is saved and emitted
 at the point of the DEF.

 IMHO this is the riskiest of the 3 patches, for shuffling expansions
 around isn't exactly something I'm comfortable with.  There's a very
 real risk that moving the expansion of sub-expressions to their
 definition points may end up moving uses before definitions.

Upthread, I posted the wrong patch: instead of the one that tolerated
expanding DEFs before or after USEs, I posted a simplifying experiment
that seemed to fail, but it looks like I misinterpreted the results.

This revised and retested patch also records expansions in an array
rather than a pointer_map, and it avoids re-expanding DEFs when a USE is
expanded for the second time.  Although replaceable DEFs can only have
one USE, when the single USE appears in a call stmt, it can be expanded
twice.  I'm not sure whether it would be better to expand it twice and
let RTL optimizations drop any redundancies, or reuse the result of the
first expansion, like this patch does.

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/48866
	* cfgexpand.c (def_expansions): New.
	(def_expansion_recent_tree, def_expansion_recent_rtx): New.
	(def_expansions_init): New.
	(def_expansions_remove_placeholder, def_expansions_fini): New.
	(def_get_expansion_ptr): New.
	(def_expansion_recent, def_expansion_record_recent): New.
	(def_expansion_add_insns): New.
	(expand_debug_expr): Use recorded expansion if available.
	(expand_gimple_basic_block): Prepare to record expansion of
	replaceable defs.  Reset recent expansions at the end of the
	block.
	(gimple_expand_cfg): Initialize and finalize expansions cache.
	* expr.c: Include gimple-pretty-print.h.
	(store_expr): Forget recent expansions upon nontemporal moves.
	(expand_expr_real_1): Reuse or record expansion of replaceable
	defs.
	* expr.h (def_get_expansion_ptr, def_expansion_recent): Declare.
	(def_expansion_record_recent, def_expansion_add_insns): Declare.
	* explow.c (force_recent): New.
	(force_reg): Use it.  Split into...
	(force_reg_1): ... this.
	* Makefile.in (expr.o): Depend on gimple-pretty-print.h.

Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c.orig	2011-06-02 16:43:03.596818720 -0300
+++ gcc/cfgexpand.c	2011-06-02 17:18:10.217974612 -0300
@@ -2337,6 +2337,144 @@ convert_debug_memory_address (enum machi
   return x;
 }
 
+/* Map replaceable SSA_NAMEs to NOTE_INSN_VAR_LOCATIONs that hold
+   their RTL expansions (once available) in their NOTE_VAR_LOCATIONs
+   (without a VAR_LOCATION rtx).  The SSA_NAME DEF is expanded before
+   its single USE, so the NOTE is inserted in the insn stream, marking
+   the location where the non-replaceable portion of the expansion is
+   to be inserted.  When the single USE is expanded, it will be
+   emitted before the NOTE.  */
+static rtx *def_expansions;
+
+/* The latest expanded SSA name, and its corresponding RTL expansion.
+   These are used to enable the insertion of the insn that stores the
+   expansion in a register at the end of the sequence expanded for the
+   SSA DEF.  */
+static tree def_expansion_recent_tree;
+static rtx def_expansion_recent_rtx;
+
+/* Initialize the def_expansions data structure.  This is to be called
+   before expansion of a function starts.  */
+
+static void
+def_expansions_init (void)
+{
+  gcc_checking_assert (!def_expansions);
+  def_expansions = XCNEWVEC (rtx, num_ssa_names);
+
+  gcc_checking_assert (!def_expansion_recent_tree);
+  gcc_checking_assert (!def_expansion_recent_rtx);
+}
+
+/* Remove the NOTE that marks the insertion location of the expansion
+   of a replaceable SSA note.  */
+
+static bool
+def_expansions_remove_placeholder (rtx note)
+{
+  if (!note)
+return true;
+
+  gcc_checking_assert (NOTE_P (note));
+  remove_insn (note);
+
+  return true;
+}
+
+/* Finalize the def_expansions data structure.  This is to be called
+   at the end of the expansion of a function.  */
+
+static void
+def_expansions_fini (void)
+{
+  int i = num_ssa_names;
+
+  gcc_checking_assert (def_expansions);
+
+  while (i--)
+if (def_expansions[i])
+  def_expansions_remove_placeholder (def_expansions[i]);
+  XDELETEVEC (def_expansions);
+  def_expansions = NULL;
+  def_expansion_recent_tree =

Re: introduce --param max-vartrack-expr-depth

2011-06-02 Thread Alexandre Oliva

On Jun  2, 2011, Bernd Schmidt ber...@codesourcery.com wrote:

 On 06/02/2011 10:46 AM, Jakub Jelinek wrote:
 On Wed, Jun 01, 2011 at 07:25:39PM -0300, Alexandre Oliva wrote:
 Such as this one...
 
 I'd appreciate if this could go in...

 Go on then.

Ok, here's what I've just installed.

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	* params.def (PARAM_MAX_VARTRACK_EXPR_DEPTH): Bump default to 20.
	* var-tracking.c (reverse_op): Limit recursion depth to 5.
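
The effect of a recursion-depth limit can be sketched with a toy
expander.  This is not cselib_expand_value_rtx; struct value and
expand_value are hypothetical and only show how a depth budget makes a
deep chain fail while a shallow one still expands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value node: either a constant leaf or an indirection
   to another value that must itself be expanded.  */
struct value
{
  const struct value *inner;	/* NULL for a leaf.  */
  int constant;			/* Meaningful only for a leaf.  */
};

/* Expand V, following at most MAX_DEPTH indirections.  Store the
   result in *OUT and return 1 on success; return 0 if the chain is
   deeper than the limit.  */
static int
expand_value (const struct value *v, int max_depth, int *out)
{
  assert (v && out);
  if (!v->inner)
    {
      *out = v->constant;
      return 1;
    }
  if (max_depth <= 0)
    return 0;
  return expand_value (v->inner, max_depth - 1, out);
}
```

A caller that passes a small fixed budget, as reverse_op now does with
5, gives up early on long chains regardless of how large the global
parameter is.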

Index: gcc/params.def
===
--- gcc/params.def.orig	2011-05-31 18:28:05.348070586 -0300
+++ gcc/params.def	2011-06-01 17:09:41.117140944 -0300
@@ -845,7 +845,7 @@ DEFPARAM (PARAM_MAX_VARTRACK_SIZE,
 DEFPARAM (PARAM_MAX_VARTRACK_EXPR_DEPTH,
 	  "max-vartrack-expr-depth",
 	  "Max. recursion depth for expanding var tracking expressions",
-	  10, 0, 0)
+	  20, 0, 0)
 
 /* Set minimum insn uid for non-debug insns.  */
 
Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c.orig	2011-05-31 20:06:25.604477956 -0300
+++ gcc/var-tracking.c	2011-05-31 23:56:06.578450957 -0300
@@ -5288,7 +5288,7 @@ reverse_op (rtx val, const_rtx expr)
   arg = XEXP (src, 1);
   if (!CONST_INT_P (arg) && GET_CODE (arg) != SYMBOL_REF)
 	{
-	  arg = cselib_expand_value_rtx (arg, scratch_regs, EXPR_DEPTH);
+	  arg = cselib_expand_value_rtx (arg, scratch_regs, 5);
 	  if (arg == NULL_RTX)
 	return NULL_RTX;
 	  if (!CONST_INT_P (arg) && GET_CODE (arg) != SYMBOL_REF)



Re: [patch] add -Wdelete-non-virtual-dtor

2011-06-02 Thread Jason Merrill


On 06/02/2011 06:02 PM, Jonathan Wakely wrote:

+ if (!dtor || !DECL_VINDEX (dtor))


Do we really want to warn about the case where the class has no/trivial
destructor?



+ bool abstract = false;
+ for (x = TYPE_METHODS (type); x; x = DECL_CHAIN (x))
+   if (DECL_PURE_VIRTUAL_P (x))
+ {
+   abstract = true;
+   break;
+ }
+ if (abstract)


Just check CLASSTYPE_PURE_VIRTUALS.

Jason

Re: [PR 48333] avoid -fcompare-debug errors from builtins in MEM attrs

2011-06-02 Thread Alexandre Oliva

Ping?  This fixes a case in which -g might change the executable code,
exposed with bootstrap-debug-lean.

On Apr  2, 2011, Alexandre Oliva aol...@redhat.com wrote:

   PR debug/48333
   * calls.c (emit_call_1): Prefer the __builtin declaration of
   builtin functions.

http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00114.html


Re: PING: PATCH: PR target/46770: Use .init_array/.fini_array sections

2011-06-02 Thread H.J. Lu

On Wed, May 18, 2011 at 8:57 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Apr 26, 2011 at 6:05 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Mar 31, 2011 at 7:57 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 21, 2011 at 11:40 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 14, 2011 at 12:28 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jan 27, 2011 at 2:40 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Thu, Jan 27, 2011 at 12:12 AM, H.J. Lu hongjiu...@intel.com wrote:
 On Tue, Dec 14, 2010 at 05:20:48PM -0800, H.J. Lu wrote:
 This patch uses .init_array/.fini_array sections instead of
 .ctors/.dtors sections if mixing .init_array/.fini_array and
 .ctors/.dtors sections with init_priority works.

 It removes .ctors/.dtors sections from executables and DSOs, which will
 remove one function call at startup time from each executable and DSO.
 It should reduce image size and improve system startup time.

 If a platform with working .init_array/.fini_array support needs a
 different .init_array/.fini_array implementation, it can set
 use_initfini_array to no.

 Since .init_array/.fini_array is a target feature,
 --enable-initfini-array defaults to no unless the native run-time
 test passes.

 To pass the native run-time test, a linker with SORT_BY_INIT_PRIORITY
 support is required.  The binutils patch is available at

 http://sourceware.org/ml/binutils/2010-12/msg00466.html

 Linker patch has been checked in.


 This patch passed 32bit/64bit regression test on Linux/x86-64.  Any
 comments?


 This updated patch fixes build on Linux/ia64 and should work on others.
 Any comments?

 Yes.  This is stage1 material.


 Here is the updated patch.  OK for trunk?

 Thanks.


 --
 H.J.
 
 2011-03-14  H.J. Lu  hongjiu...@intel.com

        PR target/46770
        * acinclude.m4 (gcc_AC_INITFINI_ARRAY): Removed.

        * config.gcc (use_initfini_array): New variable.
        Use initfini-array.o if supported.

        * crtstuff.c: Don't generate .ctors nor .dtors sections if
        NO_CTORS_DTORS_SECTIONS is defined.

        * configure.ac: Remove gcc_AC_INITFINI_ARRAY.  Add
        --enable-initfini-array and check if .init_array can be used with
        .ctors.

        * configure: Regenerated.

        * config/initfini-array.c: New.
        * config/initfini-array.h: Likewise.
        * config/t-initfini-array: Likewise.

        * config/arm/arm.c (arm_asm_init_sections): Call
        elf_initfini_array_init_sections if NO_CTORS_DTORS_SECTIONS
        is defined.
        * config/avr/avr.c (avr_asm_init_sections): Likewise.
        * config/ia64/ia64.c (ia64_asm_init_sections): Likewise.
        * config/mep/mep.c (mep_asm_init_sections): Likewise.
        * config/microblaze/microblaze.c
        (microblaze_elf_asm_init_sections): Likewise.
        * config/rs6000/rs6000.c (rs6000_elf_asm_init_sections): Likewise.
        * config/stormy16/stormy16.c (xstormy16_asm_init_sections):
        Likewise.
        * config/v850/v850.c (v850_asm_init_sections): Likewise.


 PING:

 http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00760.html


 Any comments?  Any objections?


 Here is the patch updated for the current trunk.  OK for trunk?


 PING.

Hi Richard,

You commented that my patch was stage 1 material:

http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01989.html

Is my patch:

http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00760.html

OK for trunk?

Thanks.


-- 
H.J.
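
As background for the init_priority requirement discussed in this thread, here is a minimal sketch of priority-ordered startup functions using GCC's constructor attribute (a GCC extension; lower priority numbers run first, and 0 to 100 are reserved). Whether the entries end up in .init_array or .ctors depends on how the toolchain was configured. The sketch only shows the ordering that has to be preserved in either case. The names g_order, record, init_early, init_mid, init_late and constructor_order are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstring>

// Zero-initialized statically, so both are valid before any constructor runs.
static char g_order[4];
static int g_count;

// Appends one tag to the startup log.
static void record(char tag) {
    assert(g_count < 3);
    g_order[g_count++] = tag;
}

// Declared out of order on purpose: the priority, not the source order,
// decides when each function runs before main.
__attribute__((constructor(300))) static void init_late()  { record('C'); }
__attribute__((constructor(101))) static void init_early() { record('A'); }
__attribute__((constructor(200))) static void init_mid()   { record('B'); }

// Returns the observed startup order as a NUL-terminated string.
const char *constructor_order() { return g_order; }
```

On a GCC-compatible ELF toolchain, constructor_order() is expected to report the tags in ascending priority order. Inspecting the object file's section headers (for example with readelf -S) shows which section the toolchain actually chose.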
