Re: [PATCH][1/n] LTO option handling reorg

2011-10-26 Thread Richard Guenther
On Tue, 25 Oct 2011, Joseph S. Myers wrote:

 On Tue, 25 Oct 2011, Richard Guenther wrote:
 
  Joseph, does this look like a sensible use of the common
  machinery?  Do we want the init from COLLECT_GCC_OPTIONS
  in opts-common.c instead?
 
 Certainly there should be a single function to process COLLECT_GCC_OPTIONS 
 into an array of strings, even if only this one place needs then to 
 convert them to logical options.  And that single function should actually 
 work properly with the quoting logic used in gcc.c to generate 
 COLLECT_GCC_OPTIONS - your code appears to ignore any possibility of '\'' 
 being used for single-quotes in strings, and collect2.c:extract_string, 
 which is rather closer to what I'd expect, fails to reset backquote to 0 
 after processing the \'.

Hm, yeah - the existing logic in lto-wrapper fails to properly handle
any escaping.  I'll see if I can come up with a reasonably clean
interface.
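
For reference, the quoting convention under discussion can be sketched as below. This is only an illustration, not the interface that was eventually committed: it assumes each option is wrapped in single quotes, options are separated by spaces, and a literal single quote inside an option is written as '\''. The names split_collect_options and free_options are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free a vector returned by split_collect_options (hypothetical helper).  */
void
free_options (char **argv)
{
  size_t i;

  if (argv == NULL)
    return;
  for (i = 0; argv[i] != NULL; ++i)
    free (argv[i]);
  free (argv);
}

/* Split S, a string of space-separated single-quoted options, into a
   NULL-terminated vector of newly allocated strings.  The sequence '\''
   inside an option stands for one literal quote.  Returns NULL on
   malformed input; *COUNT receives the number of options on success.  */
char **
split_collect_options (const char *s, size_t *count)
{
  size_t len, argc = 0;
  char **argv;
  const char *p;

  assert (s != NULL && count != NULL);
  len = strlen (s);
  /* Each option occupies at least two characters (its quotes), so
     len / 2 slots plus the terminating NULL are always enough.  */
  argv = calloc (len / 2 + 1, sizeof (char *));
  if (argv == NULL)
    return NULL;

  for (p = s; *p != '\0';)
    {
      char *buf;
      size_t n = 0;

      if (*p == ' ')
        {
          ++p;
          continue;
        }
      if (*p != '\'')
        goto fail;              /* text outside quotes */
      ++p;                      /* skip the opening quote */
      buf = malloc (len + 1);
      if (buf == NULL)
        goto fail;
      argv[argc++] = buf;
      for (;;)
        {
          if (*p == '\0')
            goto fail;          /* unterminated option */
          if (*p != '\'')
            buf[n++] = *p++;
          else if (p[1] == '\\' && p[2] == '\'' && p[3] == '\'')
            {
              buf[n++] = '\'';  /* '\'' encodes one literal quote */
              p += 4;
            }
          else
            {
              ++p;              /* closing quote */
              break;
            }
        }
      buf[n] = '\0';
    }
  *count = argc;
  return argv;

fail:
  free_options (argv);
  return NULL;
}
```

A caller that needs argv[0] in front, as decode_cmdline_options_to_array expects, would still have to prepend the program name (or NULL) itself.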

 Note that decode_cmdline_options_to_array expects the array to start with 
 argv[0], the program name (though it's OK to have NULL there).

Ah, ok - fixed.

I'm LTO-bootstrapping and testing the following and will go with
that for now.

Thanks,
Richard.

2011-10-25  Richard Guenther  rguent...@suse.de

PR lto/41844
* Makefile.in (lto-wrapper): Depend on and link against
opts-common.o.
(lto-wrapper.o): Depend on $(OPTS_H) and $(OPTIONS_H).
* lto-wrapper.c (get_options_from_collect_gcc_options): New function.
(run_gcc): Use it.  Filter out language specific options.

Index: gcc/Makefile.in
===
*** gcc/Makefile.in.orig2011-10-25 16:07:41.0 +0200
--- gcc/Makefile.in 2011-10-25 16:14:18.0 +0200
*** collect2-aix.o : collect2-aix.c $(CONFIG
*** 2069,2080 
  tlink.o: tlink.c $(DEMANGLE_H) $(HASHTAB_H) $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
  $(OBSTACK_H) collect2.h intl.h $(DIAGNOSTIC_CORE_H)
  
! lto-wrapper$(exeext): lto-wrapper.o $(LIBDEPS)
!   +$(LINKER) $(ALL_COMPILERFLAGS) $(LDFLAGS) -o T$@ lto-wrapper.o $(LIBS)
mv -f T$@ $@
  
  lto-wrapper.o: lto-wrapper.c $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h \
!   $(OBSTACK_H) $(DIAGNOSTIC_H)
  
  # Files used by all variants of C.
  c-family/c-common.o : c-family/c-common.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
\
--- 2069,2081 
  tlink.o: tlink.c $(DEMANGLE_H) $(HASHTAB_H) $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
  $(OBSTACK_H) collect2.h intl.h $(DIAGNOSTIC_CORE_H)
  
! lto-wrapper$(exeext): lto-wrapper.o ggc-none.o libcommon-target.a $(LIBDEPS)
!   +$(LINKER) $(ALL_COMPILERFLAGS) $(LDFLAGS) -o T$@ \
!   lto-wrapper.o ggc-none.o libcommon-target.a $(LIBS)
mv -f T$@ $@
  
  lto-wrapper.o: lto-wrapper.c $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h \
!   $(OBSTACK_H) $(DIAGNOSTIC_H) $(OPTS_H) $(OPTIONS_H)
  
  # Files used by all variants of C.
  c-family/c-common.o : c-family/c-common.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
\
Index: gcc/lto-wrapper.c
===
*** gcc/lto-wrapper.c.orig  2011-10-25 16:07:41.0 +0200
--- gcc/lto-wrapper.c   2011-10-26 10:00:50.0 +0200
*** along with GCC; see the file COPYING3.
*** 43,48 
--- 43,50 
  #include "intl.h"
  #include "diagnostic.h"
  #include "obstack.h"
+ #include "opts.h"
+ #include "options.h"
  
  int debug;/* true if -save-temps.  */
  int verbose;  /* true if -v.  */
*** fork_execute (char **argv)
*** 280,285 
--- 282,333 
  /* Template of LTRANS dumpbase suffix.  */
  #define DUMPBASE_SUFFIX ".ltrans18446744073709551615"
  
+ /* Create decoded options from the COLLECT_GCC and COLLECT_GCC_OPTIONS
+environment according to LANG_MASK.  */
+ 
+ static void
+ get_options_from_collect_gcc_options (const char *collect_gcc,
+                                       const char *collect_gcc_options,
+                                       unsigned int lang_mask,
+                                       struct cl_decoded_option **decoded_options,
+                                       unsigned int *decoded_options_count)
+ {
+   char *argv_storage;
+   const char **argv;
+   int i, j, argc;
+ 
+   /* Count arguments.  */
+   argc = 0;
+   for (j = 0; collect_gcc_options[j] != '\0'; ++j)
+     if (collect_gcc_options[j] == '\'')
+       ++argc;
+   if (argc % 2 != 0)
+     fatal ("malformed COLLECT_GCC_OPTIONS");
+ 
+   /* Copy the options to an argv-like array.  */
+   argc /= 2;
+   argv = (const char **) xmalloc ((argc + 2) * sizeof (char *));
+   argv[0] = collect_gcc;
+   argv_storage = xstrdup (collect_gcc_options);
+   for (i = 1, j = 0; argv_storage[j] != '\0'; ++j)
+     {
+       if (argv_storage[j] == '\'')
+         {
+           argv[i++] = &argv_storage[++j];
+           while (argv_storage[j] != '\'')
+             ++j;
+ argv_storage[j] = 

Re: new patches using -fopt-info (issue5294043)

2011-10-26 Thread Richard Guenther
On Tue, Oct 25, 2011 at 9:30 PM, Xinliang David Li davi...@google.com wrote:


 On Tue, Oct 25, 2011 at 1:02 AM, Richard Guenther
 richard.guent...@gmail.com wrote:

 On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li davi...@google.com
 wrote:
  Well, you seem to keep not reading what I write.  I am not opposed
  to adding -fopt-info/report nor to funnel messages to stdout/err.  What
  I am opposed is the way you want to introduce them.  I want you to
  fix what we dump into dump files, so that both -fopt-report and
  -fopt-info
  can be implemented by outputting selected pieces of the dump file
  to stdout/stderr.  We already have -fdump-*-stats which supposedly
  could match -fopt-report, and the default -fdump-* should be what
  goes to -fopt-info (minus the function bodies, of course).
 
  That sounds good. What you propose seems like
 
  -fdump-pass-[ir_only|transformation|debug]-stderr
 
  and -fopt-info is a short cut for
  -fdump-tree-all-transformations-stderr
  -fdump-ipa-all-transformations-stderr
  -fdump-rtl-all-transformations-stderr

 Yes.  Note that I don't like it the way the vectorizer does (with
 -fvectorizer-verbose=... the dump files are empty).  The dump
 file content should be unchanged when redirecting (parts) to
 stderr, so we have to arrange to duplicate messages in two places.
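
The "duplicate messages in two places" idea could look roughly like the sketch below. It is not GCC's dump infrastructure: struct dump_sinks, the MSG_* kinds and dump_message are all hypothetical names, chosen only to show that the dump file always receives the message while a second stream receives it only for selected kinds.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message kinds, loosely following the
   ir_only/transformation/debug split mentioned above.  */
enum msg_kind { MSG_IR = 1, MSG_TRANSFORMATION = 2, MSG_DEBUG = 4 };

/* Hypothetical description of where messages go.  */
struct dump_sinks
{
  FILE *dump_file;    /* the pass dump file, or NULL if not dumping */
  FILE *alt_stream;   /* e.g. stderr, or NULL */
  unsigned alt_mask;  /* kinds that are also copied to alt_stream */
};

/* Write one message.  The dump file always gets it, so its content does
   not depend on whether anything is redirected; ALT_STREAM gets a copy
   only if KIND is selected in ALT_MASK.  Returns the number of streams
   written to.  */
int
dump_message (const struct dump_sinks *sinks, unsigned kind,
              const char *fmt, ...)
{
  int written = 0;
  va_list ap;

  assert (sinks != NULL && fmt != NULL);
  if (sinks->dump_file != NULL)
    {
      va_start (ap, fmt);
      vfprintf (sinks->dump_file, fmt, ap);
      va_end (ap);
      ++written;
    }
  if (sinks->alt_stream != NULL && (sinks->alt_mask & kind) != 0)
    {
      va_start (ap, fmt);
      vfprintf (sinks->alt_stream, fmt, ap);
      va_end (ap);
      ++written;
    }
  return written;
}
```

With such a split, something like -fopt-info would only set alt_stream and alt_mask; the dump file path is untouched.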

 Vectorizer dump is a good candidate for clean up when the dumping
 infrastructure improvement is done.
 FYI, for now, we will implement the opt-info for some passes in the simple
 way in google branches, and later migrate to trunk when the dumping
 infrastructure is improved.

I was hoping you would volunteer to improve the dumping infrastructure.

Richard.


Re: [PATCH,PR50763] Follow-up patch to fix i686 bootstrap failure

2011-10-26 Thread Richard Guenther
On Tue, Oct 25, 2011 at 2:15 PM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,

 I have a patch for the i686 bootstrap problem reported in PR50763 comment 10.

 pr50763-2.c looks like this before tail_merge_optimize:
 ...
 std_canonical_va_list_type (union tree_node * typeD.1608)
 {
  _BoolD.1576 pretmp.6D.2739;
  union tree_node * pretmp.5D.2738;
  _BoolD.1576 pretmp.4D.2737;
  union tree_node * pretmp.3D.2736;
  intD.6 D.2734;
  _BoolD.1576 D.2733;
  union tree_node * D.2725;
  intD.6 D.2722;
  intD.6 D.2717;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # VUSE <.MEMD.2730_12(D)>
  D.2717_4 = typeD.1608_3(D)->baseD.1605.codeD.1597;
  if (D.2717_4 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;
  # SUCC: 3 [50.0%]  (true,exec) 4 [50.0%]  (false,exec)

  # BLOCK 3 freq:5000
  # PRED: 2 [50.0%]  (true,exec)
  # VUSE <.MEMD.2730_12(D)>
  # PT = anything
  typeD.1608_5 = typeD.1608_3(D)->typedD.1606.typeD.1600;
  goto <bb 6>;
  # SUCC: 6 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:5000
  # PRED: 2 [50.0%]  (false,exec)
  # VUSE <.MEMD.2730_12(D)>
  # PT = anything
  typeD.1608_6 = typeD.1608_3(D)->typedD.1606.typeD.1600;
  # VUSE <.MEMD.2730_12(D)>
  D.2722_7 = typeD.1608_6->baseD.1605.codeD.1597;
  if (D.2722_7 != 0)
    goto <bb 5>;
  else
    goto <bb 7>;
  # SUCC: 5 [50.0%]  (true,exec) 7 [50.0%]  (false,exec)

  # BLOCK 7 freq:2500
  # PRED: 4 [50.0%]  (false,exec)
  goto <bb 6>;
  # SUCC: 6 [100.0%]  (fallthru)

  # BLOCK 5 freq:2500
  # PRED: 4 [50.0%]  (true,exec)
  # SUCC: 6 [100.0%]  (fallthru,exec)

  # BLOCK 6 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 7 [100.0%]  (fallthru) 5 [100.0%]  (fallthru,exec)
  # PT = anything
  # typeD.1608_1 = PHI <typeD.1608_5(3), typeD.1608_3(D)(7), typeD.1608_6(5)>
  # VUSE <.MEMD.2730_12(D)>
  # PT = anything
  D.2725_9 = typeD.1608_1->type_commonD.1607.main_variantD.1604;
  D.2733_15 = D.2725_9 != 0B;
  D.2734_14 = (intD.6) D.2733_15;
  # VUSE <.MEMD.2730_12(D)>
  return D.2734_14;
  # SUCC: EXIT [100.0%]

 }
 ...

 tail_merge_optimize discovers that block 3 and 5 are identical, and removes 
 block 5.
 In replace_block_by, we end up with phi_vuse1 == NULL_TREE and
 phi_vuse2 == .MEMD.2730_12(D).

 After the original fix to PR50763, this forces us to update the vuses, while
 that is not necessary.

 Attached patch fixes this problem by determining more precisely when the vuses
 need to be updated.

 bootstrapped and reg-tested on x86_64 and i686.

 OK for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 - Tom

 2011-10-25  Tom de Vries  t...@codesourcery.com

        PR tree-optimization/50763
        * tree-ssa-tail-merge.c (replace_block_by): Update vops if phi_vuse1 or
        phi_vuse2 is NULL_TREE only if bb1 dominates or is dominated by bb2.

        * gcc.dg/pr50763-2.c: New test.



RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Jiangning Liu


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Michael Matz
 Sent: Tuesday, October 11, 2011 10:45 PM
 To: Kai Tietz
 Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard
 Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs
 
 Hi,
 
 On Tue, 11 Oct 2011, Kai Tietz wrote:
 
   Better make it a separate function that first tests your new
   conditions, and then calls simple_operand_p.
 
  Well, either I make it a new function and call it instead of
  simple_operand_p,
 
 That's what I meant, yes.
 
   @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
ll_arg, rl_arg),
build_int_cst (TREE_TYPE (ll_arg), 0));
   -
   -  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
   - {
   -   if (code != orig_code || lhs != orig_lhs || rhs !=
 orig_rhs)
   - return build2_loc (loc, code, truth_type, lhs, rhs);
   -   return NULL_TREE;
   - }
  
   Why do you remove this hunk?  Shouldn't you instead move the hunk
 you
   added to fold_truth_andor() here.  I realize this needs some TLC to
   fold_truth_andor_1, because right now it early-outs for non-
 comparisons,
   but it seems the better place.  I.e. somehow move the below code
 into the
   above branch, with the associated diddling on fold_truth_andor_1
 that it
   gets called.
 
  This hunk is removed, as it is vain to do here.
 
 There is a fallthrough now, that wasn't there before.  I don't know if
 it's harmless, I just wanted to mention it.
 

Yes, this part introduced different behavior for this small case,

int f(char *i, int j)
{
        if (*i && j!=2)
                return *i;
        else
                return j;
}

Before the fix, we have

  D.4710 = *i;
  D.4711 = D.4710 != 0;
  D.4712 = j != 2;
  D.4713 = D.4711 & D.4712;
  if (D.4713 != 0) goto <D.4714>; else goto <D.4715>;
  <D.4714>:
  D.4710 = *i;
  D.4716 = (int) D.4710;
  return D.4716;
  <D.4715>:
  D.4716 = j;
  return D.4716;

After the fix, we have

  D.4711 = *i;
  if (D.4711 != 0) goto <D.4712>; else goto <D.4710>;
  <D.4712>:
  if (j != 2) goto <D.4713>; else goto <D.4710>;
  <D.4713>:
  D.4711 = *i;
  D.4714 = (int) D.4711;
  return D.4714;
  <D.4710>:
  D.4714 = j;
  return D.4714;

Does this meet the original expectation? 

I find that the code below in function fold_truth_andor makes the difference,

  /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)   into (A OR B).
     For sequence point consistancy, we need to check for trapping,
     and side-effects.  */
  else if (code == icode && simple_operand_p_2 (arg0)
           && simple_operand_p_2 (arg1))
     return fold_build2_loc (loc, ncode, type, arg0, arg1);

For "*i != 0", simple_operand_p (*i) returns false. Originally this was not 
checked by the code, and I don't see that the patch originally intended to 
cover this. Can this be explained reasonably?

I'm not arguing that this patch made things worse; I only want to understand 
the rationale behind it and to confirm that it doesn't hurt anything else. 
On the contrary, I measured, and this change accidentally improved some 
benchmarks significantly, for other reasons.

Thanks,
-Jiangning

  Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is
  better done at a single place in fold_truth_andor only.
 
 As fold_truthop is called twice by fold_truth_andor, the latter might
 indeed be the better place.
 
 
 Ciao,
 Michael.





Re: [patch] dwarf2out crash: missing GTY? (PR 50806)

2011-10-26 Thread Alexandre Oliva
On Oct 21, 2011, Steve Ellcey s...@cup.hp.com wrote:

 On Fri, 2011-10-21 at 21:25 +0200, Jakub Jelinek wrote:
 My guess would be
 http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180194
 DEBUG_EXPR certainly shouldn't make it through into mem_loc_descriptor,
 var-tracking is supposed to resolve that.
 
 Jakub

 You are right, the bug started at r180194.  I have submitted a bug (PR
 50826) which includes a cut down test case for the problem.

Thanks for testing this patch on the affected platform.  I also
bootstrapped it on {x86_64,i686}-linux-gnu, with and without an
additional patch that looked for debug_exprs in any loc expr associated
with multi-part variables.  That didn't get any hits, so I ended up
finding out it is the use of a pseudo to hold the internal_arg_pointer
that leads to a chain of events that results in a debug temp bound to a
hard reg ending up used in MEM addresses of incoming args referenced in
the exprs of other debug temps.  (more details in the bug report)

Ok to install?

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/50826
	* var-tracking.c (rtx_debug_expr_p): New.
	(use_type): Don't use debug exprs to track non-VTA variables.

Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c.orig	2011-10-25 02:02:25.588540417 -0200
+++ gcc/var-tracking.c	2011-10-25 02:02:29.655480746 -0200
@@ -4907,6 +4907,18 @@ replace_expr_with_values (rtx loc)
 return cselib_subst_to_values (loc, VOIDmode);
 }
 
+/* Return true if *X is a DEBUG_EXPR.  Usable as an argument to
+   for_each_rtx to tell whether there are any DEBUG_EXPRs within
+   RTX.  */
+
+static int
+rtx_debug_expr_p (rtx *x, void *data ATTRIBUTE_UNUSED)
+{
+  rtx loc = *x;
+
+  return GET_CODE (loc) == DEBUG_EXPR;
+}
+
 /* Determine what kind of micro operation to choose for a USE.  Return
MO_CLOBBER if no micro operation is to be generated.  */
 
@@ -4988,7 +5000,13 @@ use_type (rtx loc, struct count_use_info
   else if (target_for_debug_bind (var_debug_decl (expr)))
 	return MO_CLOBBER;
   else if (track_loc_p (loc, expr, INT_MEM_OFFSET (loc),
-			false, modep, NULL))
+			false, modep, NULL)
+	   /* Multi-part variables shouldn't refer to one-part
+		  variable names such as VALUEs (never happens) or
+		  DEBUG_EXPRs (only happens in the presence of debug
+		  insns).  */
+	   && (!MAY_HAVE_DEBUG_INSNS
+		   || !for_each_rtx (XEXP (loc, 0), rtx_debug_expr_p, NULL)))
 	return MO_USE;
   else
 	return MO_CLOBBER;


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [patch] dwarf2out crash: missing GTY? (PR 50806)

2011-10-26 Thread Jakub Jelinek
On Wed, Oct 26, 2011 at 07:15:20AM -0200, Alexandre Oliva wrote:
 Ok to install?
 

 for  gcc/ChangeLog
 from  Alexandre Oliva  aol...@redhat.com
 
   PR debug/50826
   * var-tracking.c (rtx_debug_expr_p): New.
   (use_type): Don't use debug exprs to track non-VTA variables.

Yes, thanks.

Jakub


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Jiangning Liu jiangning@arm.com:


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Michael Matz
 Sent: Tuesday, October 11, 2011 10:45 PM
 To: Kai Tietz
 Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard
 Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs

 Hi,

 On Tue, 11 Oct 2011, Kai Tietz wrote:

   Better make it a separate function that first tests your new
   conditions, and then calls simple_operand_p.
 
  Well, either I make it a new function and call it instead of
  simple_operand_p,

 That's what I meant, yes.

   @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
                            build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
                                    ll_arg, rl_arg),
                            build_int_cst (TREE_TYPE (ll_arg), 0));
   -
   -      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
   -     {
   -       if (code != orig_code || lhs != orig_lhs || rhs !=
 orig_rhs)
   -         return build2_loc (loc, code, truth_type, lhs, rhs);
   -       return NULL_TREE;
   -     }
  
   Why do you remove this hunk?  Shouldn't you instead move the hunk
 you
   added to fold_truth_andor() here.  I realize this needs some TLC to
   fold_truth_andor_1, because right now it early-outs for non-
 comparisons,
   but it seems the better place.  I.e. somehow move the below code
 into the
   above branch, with the associated diddling on fold_truth_andor_1
 that it
   gets called.
 
  This hunk is removed, as it is vain to do here.

 There is a fallthrough now, that wasn't there before.  I don't know if
 it's harmless, I just wanted to mention it.


 Yes, this part introduced different behavior for this small case,

 int f(char *i, int j)
 {
        if (*i && j!=2)
                return *i;
        else
                return j;
 }

 Before the fix, we have

  D.4710 = *i;
  D.4711 = D.4710 != 0;
  D.4712 = j != 2;
  D.4713 = D.4711 & D.4712;
  if (D.4713 != 0) goto <D.4714>; else goto <D.4715>;
  <D.4714>:
  D.4710 = *i;
  D.4716 = (int) D.4710;
  return D.4716;
  <D.4715>:
  D.4716 = j;
  return D.4716;

 After the fix, we have

  D.4711 = *i;
  if (D.4711 != 0) goto <D.4712>; else goto <D.4710>;
  <D.4712>:
  if (j != 2) goto <D.4713>; else goto <D.4710>;
  <D.4713>:
  D.4711 = *i;
  D.4714 = (int) D.4711;
  return D.4714;
  <D.4710>:
  D.4714 = j;
  return D.4714;

 Does this meet the original expectation?

 I find that the code below in function fold_truth_andor makes the difference,

      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)   into (A OR B).
         For sequence point consistancy, we need to check for trapping,
         and side-effects.  */
      else if (code == icode && simple_operand_p_2 (arg0)
               && simple_operand_p_2 (arg1))
         return fold_build2_loc (loc, ncode, type, arg0, arg1);

 For "*i != 0", simple_operand_p (*i) returns false. Originally this was not 
 checked by the code, and I don't see that the patch originally intended to 
 cover this. Can this be explained reasonably?

 I'm not arguing that this patch made things worse; I only want to understand 
 the rationale behind it and to confirm that it doesn't hurt anything else. 
 On the contrary, I measured, and this change accidentally improved some 
 benchmarks significantly, for other reasons.

 Thanks,
 -Jiangning

  Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is
  better done at a single place in fold_truth_andor only.

 As fold_truthop is called twice by fold_truth_andor, the latter might
 indeed be the better place.


 Ciao,
 Michael.

Well, as far as I understand the C specification and sequence points, it
does indeed make a difference to do branch-cost optimization on
instructions that might cause a signal trap, as signals could be handled
in specific handlers. You need to consider here that the short-circuit
optimization assumes the associative law on operands, so the right-hand
side might be evaluated before the left-hand side.  And this indeed breaks,
IMHO, the sequencing mentioned in the C specification about conditions.
So a memory dereference (same as a floating-point trap) can never
be treated as simple, as far as I understand this.  So this patch,
besides improving the logic for branch-cost merging, fixes this latent
issue.

Regards,
Kai


Re: [PATCH, PR43814] Assume function arguments of pointer type are aligned.

2011-10-26 Thread Richard Guenther
On Tue, Oct 25, 2011 at 2:22 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 09/24/2011 01:42 PM, Richard Guenther wrote:
 On Sat, Sep 24, 2011 at 11:40 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Sat, Sep 24, 2011 at 11:31:25AM +0200, Richard Guenther wrote:
 In the end I'd probably say the patch is ok without the option (thus
 turned on by default), but if LC_GLOBAL_LOCALE is part of the
 glibc ABI then we clearly can't do this.

 Yes, LC_GLOBAL_LOCALE is part of glibc ABI.  I guess we could only assume
 the alignment if the pointer is actually dereferenced on the statement
 that checks the ABI or in some stmt that dominates the spot where you want
 to check the alignment.  It is IMHO quite common to pass arbitrary values
 in pointer types, then cast them back or just compare.

 Yeah (even if technically invoking undefined behavior in C).  Checking if
 there is a dereference post-dominating function entry with something like

   FOR_EACH_IMM_USE_STMT (... ptr ...)
      if (stmt_post_dominates_entry && contains dereference of ptr)
        alignment = TYPE_ALIGN (...);

 and otherwise not assuming anything about parameter alignment might work.
 Be careful to check the alignment of the dereference though,

 typedef int int_unaligned __attribute__((aligned(1)));
 int foo (int *p)
 {
   int_unaligned *q = p;
   return *q;
 }

 will be MEM[p] but with (well, hopefully ;)) TYPE_ALIGN of TREE_TYPE (MEM[p])
 being 1.  And yes, you'd have to look into handled-components as well.  I
 guess you'll face similar problems as we do with tree-sra.c
 tree_non_mode_aligned_mem_p (you need to assume eventually misaligned
 accesses the same way expansion does for the dereference, otherwise you'll
 run into issues on strict-align targets).

 As that dereference thing doesn't really fit the CCP propagation you
 won't be able to handle

 int foo (int *p)
 {
   int *q = (char *)p + 3;
   return *q;
 }

 and assume q is aligned (and p is misaligned by 1).

 That is, if the definition of a pointer is post-dominated by a dereference
 we could assume proper alignment for that pointer (as opposed to just
 special-casing its default definition).  Would be certainly interesting to
 see what kind of fallout we would get from that ;)


 I gave this a try in deduce_alignment_from_dereferences.

 The fall-out I got from this were unaligned dereferenced pointers in
 gcc.c-torture/unsorted/*{cmp,set}.c.

 Bootstrapped and reg-tested on x86_64. Built and reg-tested on MIPS and ARM.

 Ok for trunk?

Can you not do the get_value_from_alignment split (it doesn't look
necessary to me) and drop the

@@ -541,10 +550,18 @@ get_value_for_expr (tree expr, bool for_
   if (TREE_CODE (expr) == SSA_NAME)
     {
       val = *get_value (expr);
-      if (for_bits_p
-          && val.lattice_val == CONSTANT
+      if (!for_bits_p)
+        return val;
+
+      if (val.lattice_val == CONSTANT
           && TREE_CODE (val.value) == ADDR_EXPR)
         val = get_value_from_alignment (val.value);
+      else if (val.lattice_val == VARYING
+               && SSA_NAME_PTR_INFO (expr) != NULL
+               && SSA_NAME_PTR_INFO (expr)->align > 1
+               && SSA_NAME_PTR_INFO (expr)->misalign == 0)
+        val = get_align_value (SSA_NAME_PTR_INFO (expr)->align * BITS_PER_UNIT,
+                               TREE_TYPE (expr), 0);
     }

hunk?  I'm not sure why it is necessary at all - CCP is the only pass
computing alignment, so it should simply re-compute the info?

Anyway, it looks unrelated to the purpose of the patch in general.

The error reporting in deduce_alignment_from_dereferences is bogus,
the programs are undefined only at runtime, so you can at most
issue a warning.

+  /* Needs to be the successor of entry, for CDI_POST_DOMINATORS.  */
+  entry = single_succ (ENTRY_BLOCK_PTR);
+
+  FOR_EACH_BB (bb)
+{
+  gimple_stmt_iterator i;
+
+  if (!dominated_by_p (CDI_POST_DOMINATORS, entry, bb))
+   continue;

if you only consider post-dominators of the entry block then just walk
them directly (first_dom_son / next_dom_son).

+ align = TYPE_ALIGN (TREE_TYPE (memref)) / BITS_PER_UNIT;
+ if (align == 1)
+   continue;

I think you want to match what expand thinks of the alignment of this
memory reference, not just what TYPE_ALIGN says (and yes, that
needs to be split out somehow, SRA would need this as well).

+         while (TREE_CODE (ptr) == SSA_NAME)
+           {
+             pi = get_ptr_info (ptr);
+             if (pi->misalign != 0)
+               {
+                 error ("misaligned pointer dereferenced");
+                 break;
+               }

simply looking at pi->misalign is wrong.  pi->align may be bigger
than the align that you computed above, so pi->misalign % align != 0
would be the right check.

+             if (pi->align >= align)
+               break;
+             pi->align = align;

and then set pi->misalign to zero here.  But I would initialize the
CCP lattice with this, not set SSA_NAME_PTR_INFO, 

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Iain Sandoe

Hi Richard,

On 25 Oct 2011, at 01:17, Richard Henderson wrote:


The idea with this patch set is to rearrange vector permutation
so that it can be used to implement other patterns automatically.

In particular, Altivec, SPU currently have (and Sparc VIS would need)
a large amount of boilerplate code that transforms several higher
level tree codes into vector permutation inside the backend.  This
seems a bit unnecessary.

It's arguable that we could simply make VEC_PERM_EXPR the only code
that gets generated at the vectorizer level, and drop some of these
other patterns entirely.  I'm not 100% sure we should go that far,
but even if we did, I still think this is a good cleanup.

Bootstrapped and tested on x86_64-linux only.  I've only cross-tested
the ppc/spu patches.


unfortunately, the series breaks bootstrap for powerpc-darwin9:

http://gcc.gnu.org/ml/gcc-regression/2011-10/msg00539.html

the failure is in building libgfortran:

libtool: compile:  /GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7- 
trunk-build/./gcc/ -B/GCC/gcc-4-7-install/powerpc-apple-darwin9/bin/ - 
B/GCC/gcc-4-7-install/powerpc-apple-darwin9/lib/ -isystem /GCC/gcc-4-7- 
install/powerpc-apple-darwin9/include -isystem /GCC/gcc-4-7-install/ 
powerpc-apple-darwin9/sys-include -DHAVE_CONFIG_H -I. -I/GCC/gcc-live- 
trunk/libgfortran -iquote/GCC/gcc-live-trunk/libgfortran/io -I/GCC/gcc- 
live-trunk/libgfortran/../gcc -I/GCC/gcc-live-trunk/libgfortran/../gcc/ 
config -I../.././gcc -std=gnu99 -Wall -Wstrict-prototypes -Wmissing- 
prototypes -Wold-style-definition -Wextra -Wwrite-strings -fcx-fortran- 
rules -ffunction-sections -fdata-sections -ftree-vectorize -funroll- 
loops -g -O2 -MT matmul_c4.lo -MD -MP -MF .deps/matmul_c4.Tpo -c /GCC/ 
gcc-live-trunk/libgfortran/generated/matmul_c4.c  -fno-common -DPIC - 
o .libs/matmul_c4.o
/GCC/gcc-live-trunk/libgfortran/generated/matmul_c4.c: In function  
‘matmul_c4’:
/GCC/gcc-live-trunk/libgfortran/generated/matmul_c4.c:284:25: internal  
compiler error: in expand_vec_perm, at optabs.c:6998

Please submit a full bug report,
with preprocessed source if appropriate.

it is reproducible with a stage-1 compiler - so presumably off a cross  
too.


cheers
Iain



[PATCH] Fix lto-wrapper obstack handling

2011-10-26 Thread Richard Guenther

Seems only in some cases we fault freeing an uninitialized obstack.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-10-26  Richard Guenther  rguent...@suse.de

* lto-wrapper.c (run_gcc): Properly init/free obstack.
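
The init/free pairing named in the ChangeLog entry above can be illustrated with a stand-alone sketch. It assumes GNU libc's <obstack.h> and is not the lto-wrapper code itself: make_ltrans_names is a hypothetical function, and the results are copied out with malloc only so that they survive the obstack_free.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* The obstack macros expect these two names to be defined.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* For each of the N names in INPUTS (each ending in ".o"), store a
   malloc'ed copy with the suffix replaced by ".ltrans.o" in OUTPUTS.
   The obstack is initialized once before the loop and freed once
   after it, in the same scope, so it is never freed uninitialized.  */
void
make_ltrans_names (const char *const *inputs, size_t n, char **outputs)
{
  struct obstack ob;
  size_t i;

  obstack_init (&ob);
  for (i = 0; i < n; ++i)
    {
      size_t len = strlen (inputs[i]);
      char *name;

      assert (len >= 2);
      /* Copy everything but the ".o", then append the new suffix
         including its terminating NUL.  */
      obstack_grow (&ob, inputs[i], len - 2);
      obstack_grow (&ob, ".ltrans.o", sizeof (".ltrans.o"));
      name = obstack_finish (&ob);

      outputs[i] = malloc (strlen (name) + 1);
      assert (outputs[i] != NULL);
      strcpy (outputs[i], name);
    }
  obstack_free (&ob, NULL);
}
```
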

Index: gcc/lto-wrapper.c
===
*** gcc/lto-wrapper.c   (revision 180520)
--- gcc/lto-wrapper.c   (working copy)
*** run_gcc (unsigned argc, char *argv[])
*** 339,345 
char *list_option_full = NULL;
const char *linker_output = NULL;
const char *collect_gcc, *collect_gcc_options;
-   struct obstack env_obstack;
int parallel = 0;
int jobserver = 0;
bool no_partition = false;
--- 339,344 
*** run_gcc (unsigned argc, char *argv[])
*** 517,527 
--- 516,528 
  {
        FILE *stream = fopen (ltrans_output_file, "r");
        FILE *mstream = NULL;
+       struct obstack env_obstack;
  
        if (!stream)
          fatal_perror ("fopen: %s", ltrans_output_file);
  
        /* Parse the list of LTRANS inputs from the WPA stage.  */
+       obstack_init (&env_obstack);
        nr = 0;
for (;;)
{
*** cont:
*** 574,580 
  
            /* Replace the .o suffix with a .ltrans.o suffix and write
               the resulting name to the LTRANS output list.  */
-           obstack_init (&env_obstack);
            obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
            obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
            output_name = XOBFINISH (&env_obstack, char *);
--- 575,580 
*** cont:
*** 654,662 
free (output_names);
free (input_names);
free (list_option_full);
  }
- 
-   obstack_free (&env_obstack, NULL);
  }
  
  
--- 654,661 
free (output_names);
free (input_names);
free (list_option_full);
+   obstack_free (&env_obstack, NULL);
  }
  }
  
  


RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Jiangning Liu wrote:

-
-  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
- {
-   if (code != orig_code || lhs != orig_lhs || rhs !=
  orig_rhs)
- return build2_loc (loc, code, truth_type, lhs, rhs);
-   return NULL_TREE;
- }
   
Why do you remove this hunk?  Shouldn't you instead move the hunk
  you
added to fold_truth_andor() here.  I realize this needs some TLC to
fold_truth_andor_1, because right now it early-outs for non-
  comparisons,
but it seems the better place.  I.e. somehow move the below code
  into the
above branch, with the associated diddling on fold_truth_andor_1
  that it
gets called.
  
   This hunk is removed, as it is vain to do here.
  
  There is a fallthrough now, that wasn't there before.  I don't know if
  it's harmless, I just wanted to mention it.
  
 
 Yes, this part introduced different behavior for this small case,
 
   D.4710 = *i;
   D.4711 = D.4710 != 0;
   D.4712 = j != 2;
   D.4713 = D.4711 & D.4712;
   if (D.4713 != 0) goto <D.4714>; else goto <D.4715>;
 
 After the fix, we have
 
   D.4711 = *i;
   if (D.4711 != 0) goto <D.4712>; else goto <D.4710>;
   <D.4712>:
   if (j != 2) goto <D.4713>; else goto <D.4710>;

So, we have one more jump than originally, when the point of the patch 
was to emit less on targets with high branch costs.  So, as speculated, 
the hunk was not useless.  (It's nice that it caused a benchmark to 
improve significantly, but that should be done via a proper analysis and 
patch, not as a side effect of a supposed non-change).


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

  Yes, this part introduced different behavior for this small case,
 
  int f(char *i, int j)
  {
         if (*i && j!=2)
                 return *i;
         else
                 return j;
  }
 
 Well, as far as I understand the C specification and sequence points, it 
 does indeed make a difference to do branch-cost optimization on instructions 
 that might cause a signal trap, as signals could be handled in specific 
 handlers. You need to consider here that the short-circuit optimization 
 assumes the associative law on operands, so the right-hand side might be 
 evaluated before the left-hand side.  And this indeed breaks, IMHO, the 
 sequencing mentioned in the C specification about conditions.

I'm not sure in this specific case.  If it had been:

if (*a  *b) ...

the you'd be right.  The side-effect of dereferencing a must happen before 
*b, and hence transforming this into (*a!=0)(*b!=0) would be wrong.  But 
in the case at hand we only have one potentially problematic (aka 
detectable) side-effect (*i), so I don't think you could construct a 
program that detects the difference.  It would entail detecting that 
j!=2 was already evaluated, when (say) the segfault happens, but you 
can't as the variable is non-volatile.  It also can't happen that the 
side-effect *i does not occur (which also would be a problem).  So, I 
think there wasn't an actual problem before, and it had fewer jumps.
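
A minimal sketch of the two forms being compared, assuming i always points to a valid object.  The names f_short_circuit, f_merged and check_same are made up for illustration and do not appear in the patch; the merged form is written by hand here, it is not compiler output.

```c
#include <assert.h>

/* Source form: && only tests j when *i is nonzero, which normally
   costs two conditional branches.  */
static int f_short_circuit(const char *i, int j)
{
    if (*i && j != 2)
        return *i;
    else
        return j;
}

/* Merged form: both tests are evaluated unconditionally and combined
   with a bitwise AND, leaving a single conditional branch.  */
static int f_merged(const char *i, int j)
{
    int nonzero = (*i != 0);
    int not_two = (j != 2);

    if (nonzero & not_two)
        return *i;
    else
        return j;
}

/* Asserts that both forms agree for one valid input.  */
static void check_same(const char *i, int j)
{
    assert(f_short_circuit(i, j) == f_merged(i, j));
}
```

Both forms read *i exactly once before deciding, and j is a plain local, so for valid pointers no caller can observe which form was used.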


Ciao,
Michael.

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Rainer Orth
Ian,

 I committed this patch to mainline to try to fix these problems.  Thanks
 for testing.  For this patch I bootstrapped and ran Go testsuite on
 x86_64-unknown-linux-gnu.

with this patch, go and libgo results on Solaris 10 and 11/x86 are back
to normal, and Solaris 10 and 11/SPARC bootstraps are currently
running.  Thanks.

There's one problem left: with Solaris nawk, building libcalls.go fails
(from Solaris 8 to 11 inclusive):

nawk -f /vol/gcc/src/hg/trunk/local/libgo/go/syscall/mksyscall.awk ${files} > libcalls.go.tmp
nawk: syntax error at source line 47
 context is
if (match($0, //sys(nb)?[  ]*([a-zA-Z0-9_]+)\\(([^()]*)\\)   *(\
\(([^()]+)\\))?,   gosig) == 0) {
nawk: illegal statement at source line 47
nawk: syntax error at source line 58
missing }
make[4]: *** [s-libcalls] Error 2

I don't yet see what's wrong with the pattern, and gawk does accept it.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra
On Sun, Oct 16, 2011 at 02:51:01PM -0400, David Edelsohn wrote:
 The patch is okay, although I am not thrilled about the need to change
 the register allocation order.

Committed revision 180522.  It turns out that shrink-wrapping isn't as
effective as it used to be with the 20110915 based sources I was using
originally.  povray Ray_In_Bound no longer gets the benefit of shrink
wrap, likely due to some cfg optimization.  We end up with a simple
block that just does r3=1 then jumps to last_bb being reached from
blocks that need a prologue as well as blocks that don't.  That's
enough to kill our current shrink wrap implementation.  What we need
is something to duplicate these tail blocks..

Patch here for comment.  I haven't yet run benchmarks, but this passes
bootstrap and regression test (on rev 180286, current virgin mainline
fails bootstrap on powerpc-linux).

* function.c (thread_prologue_and_epilogue_insns): Delete
shadowing variables.  Don't do prologue register clobber tests
when shrink wrapping already failed.  Compute tail block
candidates for duplicating exit path.  Remove these from
antic set.  Duplicate tails when reached from both blocks
needing a prologue/epilogue and blocks not needing such.

Index: gcc/function.c
===
--- gcc/function.c  (revision 180467)
+++ gcc/function.c  (working copy)
@@ -5697,11 +5697,11 @@ thread_prologue_and_epilogue_insns (void
   HARD_REG_SET prologue_clobbered, prologue_used, live_on_edge;
   HARD_REG_SET set_up_by_prologue;
   rtx p_insn;
-
   VEC(basic_block, heap) *vec;
   basic_block bb;
   bitmap_head bb_antic_flags;
   bitmap_head bb_on_list;
+  bitmap_head bb_tail;
 
   if (dump_file)
     fprintf (dump_file, "Attempting shrink-wrapping optimization.\n");
@@ -5732,6 +5732,7 @@ thread_prologue_and_epilogue_insns (void
 
   bitmap_initialize (&bb_antic_flags, &bitmap_default_obstack);
   bitmap_initialize (&bb_on_list, &bitmap_default_obstack);
+  bitmap_initialize (&bb_tail, &bitmap_default_obstack);
 
   /* Find the set of basic blocks that require a stack frame.  */
 
@@ -5774,19 +5775,22 @@ thread_prologue_and_epilogue_insns (void
}
}
 
+  /* Save a copy of blocks that really need a prologue.  */
+  bitmap_copy (&bb_antic_flags, &bb_flags);
+
   /* For every basic block that needs a prologue, mark all blocks
 reachable from it, so as to ensure they are also seen as
 requiring a prologue.  */
   while (!VEC_empty (basic_block, vec))
{
  basic_block tmp_bb = VEC_pop (basic_block, vec);
- edge e;
- edge_iterator ei;
+
  FOR_EACH_EDGE (e, ei, tmp_bb->succs)
if (e->dest != EXIT_BLOCK_PTR
 && bitmap_set_bit (&bb_flags, e->dest->index))
  VEC_quick_push (basic_block, vec, e->dest);
}
+
   /* If the last basic block contains only a label, we'll be able
 to convert jumps to it to (potentially conditional) return
 insns later.  This means we don't necessarily need a prologue
@@ -5799,14 +5803,29 @@ thread_prologue_and_epilogue_insns (void
goto fail_shrinkwrap;
}
 
+  /* Find the set of basic blocks that need no prologue and only
+go to the exit.  */
+  bitmap_set_bit (&bb_tail, EXIT_BLOCK_PTR->index);
+  VEC_quick_push (basic_block, vec, EXIT_BLOCK_PTR);
+  while (!VEC_empty (basic_block, vec))
+   {
+ basic_block tmp_bb = VEC_pop (basic_block, vec);
+
+ FOR_EACH_EDGE (e, ei, tmp_bb->preds)
+   if (single_succ_p (e->src)
+       && !bitmap_bit_p (&bb_antic_flags, e->src->index)
+       && bitmap_set_bit (&bb_tail, e->src->index))
+ VEC_quick_push (basic_block, vec, e->src);
+   }
+
   /* Now walk backwards from every block that is marked as needing
-a prologue to compute the bb_antic_flags bitmap.  */
-  bitmap_copy (bb_antic_flags, bb_flags);
+a prologue to compute the bb_antic_flags bitmap.  Exclude
+tail blocks; They can be duplicated to be used on paths not
+needing a prologue.  */
+  bitmap_and_compl (&bb_antic_flags, &bb_flags, &bb_tail);
   FOR_EACH_BB (bb)
{
- edge e;
- edge_iterator ei;
- if (!bitmap_bit_p (&bb_flags, bb->index))
+ if (!bitmap_bit_p (&bb_antic_flags, bb->index))
continue;
  FOR_EACH_EDGE (e, ei, bb->preds)
if (!bitmap_bit_p (&bb_antic_flags, e->src->index)
@@ -5816,8 +5835,6 @@ thread_prologue_and_epilogue_insns (void
   while (!VEC_empty (basic_block, vec))
{
  basic_block tmp_bb = VEC_pop (basic_block, vec);
- edge e;
- edge_iterator ei;
  bool all_set = true;
 
  bitmap_clear_bit (&bb_on_list, tmp_bb->index);
@@ -5862,28 +5879,172 @@ thread_prologue_and_epilogue_insns (void
  

Re: [cxx-mem-model] Generic atomic functions

2011-10-26 Thread Michael Matz
Hi,

On Tue, 25 Oct 2011, Andrew MacLeod wrote:

 and new generic routines are provided as follows:
 
 void __atomic_load (T* object, T* return_value, memory_order m)
 void __atomic_store (T* object, T* new_value, memory_order m)
 void __atomic_exchange (T* object, T* new_value, T* return_value, memory_order
 m)
 void __atomic_compare_exchange (T* object, T* expected_value, T* new_value,
 bool weak, memory_order success, memory_order fail)
 
 When a generic routine is called with an object whose size maps to one 
 of the type specific built-ins, (ie sizeof (T) == 1,2,4,8 or 16) the 
 generic call is translated into the appropriate direct call.

Are the generic routines really generic?  In particular do they accept 
objects that aren't naturally aligned and are supposed to still work?  In 
that case you can't rewrite them into the type specific builtins after 
only checking the size, you also need to check alignment.


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

  Yes, this part introduced different behavior for this small case,
 
  int f(char *i, int j)
  {
         if (*i && j!=2)
                 return *i;
         else
                 return j;
  }

 Well, as far as I understand the C specification and sequence points, it
 does indeed make a difference to do branch-cost optimization on instructions
 that might cause a signal trap, as the signal could be handled in specific
 handlers. You need to consider here that short-circuit optimization
 assumes the associative law on operands.  So the right-hand side might be
 evaluated before the left-hand side.  And this indeed breaks, IMHO, the
 sequencing described in the C specification for conditions.

 I'm not sure in this specific case.  If it had been:

 if (*a && *b) ...

 then you'd be right.  The side-effect of dereferencing a must happen before
 *b, and hence transforming this into (*a!=0)&(*b!=0) would be wrong.  But
 in the case at hand we only have one potentially problematic (aka
 detectable) side-effect (*i), so I don't think you could construct a
 program that detects the difference.  It would entail detecting that
 j!=2 was already evaluated, when (say) the segfault happens, but you
 can't as the variable is non-volatile.  It also can't happen that the
 side-effect *i does not occur (which also would be a problem).  So, I
 think there wasn't an actual problem before, and it had fewer jumps.


 Ciao,
 Michael.

The case can be produced quite easily.

E.g.:

extern int global = 0;


  if (*a && global) ...
...

The issue follows from the C specification, see 5.1.2.2.3 about
program termination.  The important point is this:

Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete *** and no side
effects of subsequent evaluations shall have taken place ***

Annex C describes sequence-points as

1 The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND
&& (6.5.13); logical OR || (6.5.14); conditional ? (6.5.15); comma ,
(6.5.17).
...

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
I describe the sample more closely here

#include <signal.h>
#include <stddef.h>

extern int global = 0;
extern int *a = NULL;

void catchSigSegV( int sig )
{
  a = &global;
}

int foo (int j)
{
 signal (SIGSEGV, catchSigSegV);
 if (*a && global) return 2;
 return 0;
}

I admit that such a scenario is not common in most code.  This sample
seems to be a valid C program.  So the conditions in the IF shall be
evaluated strictly in order of sequence points, as the first argument
might trap.
It doesn't matter whether the second argument has side effects or not.
The point is the first one, and so it has to be separated from the other
conditions.

Regards,
Kai


Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Bernd Schmidt
On 10/26/11 14:27, Alan Modra wrote:
 Committed revision 180522.  It turns out that shrink-wrapping isn't as
 effective as it used to be with the 20110915 based sources I was using
 originally.  povray Ray_In_Bound no longer gets the benefit of shrink
 wrap, likely due to some cfg optimization.  We end up with a simple
 block that just does r3=1 then jumps to last_bb being reached from
 blocks that need a prologue as well as blocks that don't.  That's
 enough to kill our current shrink wrap implementation.  What we need
 is something to duplicate these tail blocks..

Would it work to insert the epilogue on some edges to this R3=1 block,
and not on the others? (How many edges of each kind are there?)

Now that we have an initial patch in the tree and it mostly seems to
work, we can think about making it a little stronger - the initial
implementation is really quite conservative.


Bernd


Re: [cxx-mem-model] Generic atomic functions

2011-10-26 Thread Andrew MacLeod

On 10/26/2011 08:29 AM, Michael Matz wrote:

Hi,

On Tue, 25 Oct 2011, Andrew MacLeod wrote:


and new generic routines are provided as follows:

void __atomic_load (T* object, T* return_value, memory_order m)
void __atomic_store (T* object, T* new_value, memory_order m)
void __atomic_exchange (T* object, T* new_value, T* return_value, memory_order
m)
void __atomic_compare_exchange (T* object, T* expected_value, T* new_value,
bool weak, memory_order success, memory_order fail)

When a generic routine is called with an object whose size maps to one
of the type specific built-ins, (ie sizeof (T) == 1,2,4,8 or 16) the
generic call is translated into the appropriate direct call.

Are the generic routines really generic?  In particular do they accept
objects that aren't naturally aligned and are supposed to still work?  In
that case you can't rewrite them into the type specific builtins after
only checking the size, you also need to check alignment.




Yes, I've been considering that.  At the moment, the claim is that it is 
well defined for properly aligned objects, undefined otherwise.


I've been considering what would be involved in defining it.  And I'm not 
sure if that should wait until the next release, or whether we should 
try now.


  - The 5 size-specific routines *must* be given properly aligned objects 
(i.e., an N-byte object must have the same alignment as the N-byte integral 
type).


for the generic routines:
  - if it can be determined at compile time that the object being 
pointed to is properly aligned, then it can be mapped into one of the 5 
size specific routines.  Otherwise it must remain a call into the 
runtime library.
  - The runtime library can look at the pointer value and determine if 
things are properly aligned based on the pointer value.  If properly 
aligned, then it could invoke one of the runtime lock-free routines.
 - If an improperly aligned pointer is passed in, the fall back would 
be to resort to what will probably be a locked implementation for the 
object.


  - the __atomic_is_lock_free(size) routine will have to have an 
additional object pointer parameter.  If it's NULL, it will return a value 
based on properly aligned objects of SIZE.  Otherwise, it will also 
check the value of the pointer to see if it is properly aligned and then 
return true or false based on whether one of the lock-free routines will 
be invoked.


So it would mean that an 8-byte object which is properly aligned could 
be lock free, while an unaligned 8-byte object would likely have a 
locked implementation.  They would all work though, just requiring the 
library.
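
The decision described above could be sketched roughly as follows.  This is not the libatomic or GCC implementation: sketch_is_lock_free and both helpers are hypothetical names, "properly aligned" is approximated as "aligned to the object size", and all five sizes are assumed to have lock-free routines on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when SIZE is one of the sizes with a type-specific routine.  */
static int has_sized_routine(size_t size)
{
    return size == 1 || size == 2 || size == 4 || size == 8 || size == 16;
}

/* Nonzero when PTR is aligned to SIZE bytes.  SIZE must be a power of
   two.  Assumption: natural alignment equals the object size.  */
static int is_naturally_aligned(const void *ptr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ((uintptr_t) ptr & (size - 1)) == 0;
}

/* Hypothetical stand-in for the proposed two-argument query: with a
   null PTR, answer for a properly aligned object of SIZE; otherwise
   also inspect the pointer value.  */
static int sketch_is_lock_free(size_t size, const void *ptr)
{
    if (!has_sized_routine(size))
        return 0;               /* would use the locked fallback */
    if (ptr == NULL)
        return 1;
    return is_naturally_aligned(ptr, size);
}
```

A runtime library could use the same test to pick between a lock-free routine and the locked fallback for each call.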


Those are my thoughts...  I'm tempted to implement the alignment 
checking...  It seems more complete, but it is more complex, so I waffle :-)


Any other opinions?




[PATCH][RFC] Re-write LTO option merging

2011-10-26 Thread Richard Guenther

This completely rewrites LTO option merging.  At compile (uselessly
now at WPA?) time we now stream a COLLECT_GCC_OPTIONS like string
as it comes from argv of the compiler binary.  Those options are
read in by the LTO driver (lto-wrapper), merged into a single
set (very simple merge function right now ;)) and given a place to
complain about incompatible arguments.  The merged set is then
prepended to the arguments from the linker driver line
(what we get in COLLECT_GCC_OPTIONS for lto-wrapper), thus the
linker command-line may override what the compiler command-line(s)
provided.

One visible change is that the absence of an optimization option on the
link line no longer means -O0, unless you explicitly specify -O0 at link
time.

There are probably more obscure differences, especially due to the
very simple merge and complain function ;))  But this is an RFC ...

If WPA partitioning at any point wants to do something clever with
a set of incompatible functions it can re-parse the options and
do that (we then have to arrange for lto-wrapper to let the options
slip through).

I'm LTO bootstrapping and testing this simple variant right now
(I believe we do not exercise funny option combinations right now).

I'll still implement a very simple merge/complain function.
Suggestions for that welcome (I'll probably simply compute the
intersection of options ... in the long run we'd want to annotate
our options as to whether they should be unioned/intersected).
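
A rough sketch of the intersect-then-prepend idea, not the lto-wrapper code.  The names has_option, intersect_options and prepend_merged are invented here, and comparing whole option strings is a simplification of real option decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero when OPT occurs in the N-element option list OPTS.  */
static int has_option(const char *const *opts, size_t n, const char *opt)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(opts[i], opt) == 0)
            return 1;
    return 0;
}

/* Keep in MERGED only the options that also occur in FILE_OPTS.
   Order is preserved; the new element count is returned.  */
static size_t intersect_options(const char **merged, size_t n_merged,
                                const char *const *file_opts, size_t n_file)
{
    size_t kept = 0;

    for (size_t i = 0; i < n_merged; i++)
        if (has_option(file_opts, n_file, merged[i]))
            merged[kept++] = merged[i];
    return kept;
}

/* Write the merged compile-time options followed by the link-line
   options into OUT, so that link-line options come last and can
   override.  Returns the total count; OUT must hold CAP entries.  */
static size_t prepend_merged(const char **out, size_t cap,
                             const char *const *merged, size_t n_merged,
                             const char *const *link_opts, size_t n_link)
{
    size_t n = 0;

    assert(n_merged + n_link <= cap);
    for (size_t i = 0; i < n_merged; i++)
        out[n++] = merged[i];
    for (size_t i = 0; i < n_link; i++)
        out[n++] = link_opts[i];
    return n;
}
```

Starting from the first file's options and calling intersect_options once per further input file gives the set common to all files.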

Any comments?

Thanks,
Richard.

2011-10-26  Richard Guenther  rguent...@suse.de

* lto-opts.c: Re-implement.
* lto-streamer.h (lto_register_user_option): Remove.
(lto_read_file_options): Likewise.
(lto_reissue_options): Likewise.
(lto_clear_user_options): Likewise.
(lto_clear_file_options): Likewise.
* opts-global.c (post_handling_callback): Remove LTO specific code.
(decode_options): Likewise.
* lto-wrapper.c (run_gcc): Read all input file options and
prepend a merged set before the linker driver options.

lto/
* lto-lang.c (lto_post_options): Do not read file options.
* lto.c (lto_read_all_file_options): Remove.
(lto_init): Call lto_set_in_hooks here.


Index: trunk/gcc/lto-opts.c
===
*** trunk.orig/gcc/lto-opts.c   2011-10-26 13:46:24.0 +0200
--- trunk/gcc/lto-opts.c2011-10-26 15:06:29.0 +0200
*** along with GCC; see the file COPYING3.
*** 33,280 
  #include "common/common-target.h"
  #include "diagnostic.h"
  #include "lto-streamer.h"
  
! /* When a file is initially compiled, the options used when generating
!the IL are not necessarily the same as those used when linking the
!objects into the final executable.  In general, most build systems
!will proceed with something along the lines of:
! 
!   $ gcc <cc-flags> -flto -c f1.c -o f1.o
!   $ gcc <cc-flags> -flto -c f2.c -o f2.o
!   ...
!   $ gcc <cc-flags> -flto -c fN.c -o fN.o
! 
!And the final link may or may not include the same <cc-flags> used
!to generate the initial object files:
! 
!   $ gcc <ld-flags> -flto -o prog f1.o ... fN.o
! 
!Since we will be generating final code during the link step, some
!of the flags used during the compile step need to be re-applied
!during the link step.  For instance, flags in the -m family.
! 
!The idea is to save a selected set of <cc-flags> in a special
!section of the initial object files.  This section is then read
!during linking and the options re-applied.
! 
!FIXME lto.  Currently the scheme is limited in that only the
!options saved on the first object file (f1.o) are read back during
!the link step.  This means that the options used to compile f1.o
!will be applied to ALL the object files in the final link step.
!More work needs to be done to implement a merging and validation
!mechanism, as this will not be enough for all cases.  */
! 
! /* Saved options hold the type of the option (currently CL_TARGET or
!CL_COMMON), and the code, argument, and value.  */
! 
! typedef struct GTY(()) opt_d
! {
!   unsigned int type;
!   size_t code;
!   char *arg;
!   int value;
! } opt_t;
! 
! DEF_VEC_O (opt_t);
! DEF_VEC_ALLOC_O (opt_t, heap);
! 
! 
! /* Options are held in two vectors, one for those registered by
!command line handling code, and the other for those read in from
!any LTO IL input.  */
! static VEC(opt_t, heap) *user_options = NULL;
! static VEC(opt_t, heap) *file_options = NULL;
! 
! /* Iterate FROM in reverse, writing option codes not yet in CODES into *TO.
!Mark each new option code encountered in CODES.  */
  
  static void
! reverse_iterate_options (VEC(opt_t, heap) *from, VEC(opt_t, heap) **to,
!bitmap codes)
! {
!   int i;
! 
!   for (i = VEC_length (opt_t, from); i > 0; i--)
! {
!   const opt_t *const o = VEC_index (opt_t, from, i - 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

   int f(char *i, int j)
   {
          if (*i && j!=2)
                  return *i;
          else
                  return j;
   }
 
 
 the case can be produced quite easily.
 
 extern int global = 0;
 
 
   if (*a && global) ...

See?  You had to change the program to prove the transformation to be 
invalid.  But my point was that the function under discussion was exactly 
as above.  It didn't have globals, or two loads, or a volatile, or 
anything else you can come up with where the transformation would be 
detectable (and hence invalid).  I claim that you can't construct a 
program that can distinguish between this function:

int f(char *i, int j)
{
  if (*i && j!=2)
    return *i;
  else
    return j;
}

and this one:

int f(char *i, int j)
{
  if (*i & j!=2)
    return *i;
  else
    return j;
}

And if you can't construct such a program, then the initial transformation 
before the fold-const.c change _for this specific situation_ was correct.


Ciao,
Michael.

Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 There's one problem left: with Solaris nawk, building libcalls.go fails
 (from Solaris 8 to 11 inclusive):

 nawk -f /vol/gcc/src/hg/trunk/local/libgo/go/syscall/mksyscall.awk ${files} > libcalls.go.tmp
 nawk: syntax error at source line 47
  context is
 if (match($0, //sys(nb)?[  ]*([a-zA-Z0-9_]+)\\(([^()]*)\\)   
 *(\
 \(([^()]+)\\))?,   gosig) == 0) {
 nawk: illegal statement at source line 47
 nawk: syntax error at source line 58
 missing }
 make[4]: *** [s-libcalls] Error 2

 I don't yet see what's wrong with the pattern, and gawk does accept it.

Hmmm, I don't have a copy of nawk.  Is it possible that it doesn't like
 * == space*?  You could try [ ]*.

Ian


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

   int f(char *i, int j)
   {
          if (*i && j!=2)
                  return *i;
          else
                  return j;
   }
 

 the case can be produced quite easily.

 extern int global = 0;

 
   if (*a && global) ...

 See?  You had to change the program to prove the transformation to be
 invalid.  But my point was that the function under discussion was exactly
 as above.  It didn't have globals, or two loads, or a volatile, or
 anything else you can come up with where the transformation would be
 detectable (and hence invalid).  I claim that you can't construct a
 program that can distinguish between this function:

 int f(char *i, int j)
 {
   if (*i && j!=2)
     return *i;
   else
     return j;
 }

 and this one:

 int f(char *i, int j)
 {
   if (*i & j!=2)
     return *i;
   else
     return j;
 }

 And if you can't construct such a program, then the initial transformation
 before the fold-const.c change _for this specific situation_ was correct.


 Ciao,
 Michael.

Well, if such a function is used inline and we know for it that j
has a value != 2, then we have a big difference here.  For your first
example, we still have to do the memory access to *i, even if we are
not interested in the result.  See point 4 of 5.1.2.3 of the C spec.
For your second sample we don't need to do that, as the & itself is no
sequence point and so we can eliminate the *i access without breaking
anything.

Regards,
Kai


Re: PR rtl-optimization/46603

2011-10-26 Thread Eric Botcazou
 Thanks!

You're welcome.  I have also installed the attached patch which makes minor 
tweaks and fixes various issues in comments which have bugged me for years, 
the most glaring one being:

 Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R where
 either M1 is not valid for R or M2 is wider than a word but we only
 need one word to store an M2-sized quantity in R.

which seems to be hinting at some relativistic effect on the size of M2...

No functional changes, tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* reload.c (reload_inner_reg_of_subreg): Change type of return value
and type of OUTPUT parameter to bool and adjust.  Document MODE and
OUTPUT parameters.  Use HARD_REGISTER_P.  Reorder final condition
and improve associated comment.
(push_reload): Clarify and update comments about reloading of subregs.
Adjust calls to reload_inner_reg_of_subreg.  Compute the class upfront
for the reloading of subregs in the out case as well.


-- 
Eric Botcazou
Index: reload.c
===
--- reload.c	(revision 180457)
+++ reload.c	(working copy)
@@ -256,7 +256,6 @@ static int push_secondary_reload (int, r
   enum insn_code *, secondary_reload_info *);
 static enum reg_class find_valid_class (enum machine_mode, enum machine_mode,
 	int, unsigned int);
-static int reload_inner_reg_of_subreg (rtx, enum machine_mode, int);
 static void push_replacement (rtx *, int, enum machine_mode);
 static void dup_replacements (rtx *, rtx *);
 static void combine_reloads (void);
@@ -791,39 +790,39 @@ find_reusable_reload (rtx *p_in, rtx out
   return n_reloads;
 }
 
-/* Return nonzero if X is a SUBREG which will require reloading of its
-   SUBREG_REG expression.  */
+/* Return true if X is a SUBREG that will need reloading of its SUBREG_REG
+   expression.  MODE is the mode that X will be used in.  OUTPUT is true if
+   the function is invoked for the output part of an enclosing reload.  */
 
-static int
-reload_inner_reg_of_subreg (rtx x, enum machine_mode mode, int output)
+static bool
+reload_inner_reg_of_subreg (rtx x, enum machine_mode mode, bool output)
 {
   rtx inner;
 
   /* Only SUBREGs are problematical.  */
   if (GET_CODE (x) != SUBREG)
-return 0;
+return false;
 
   inner = SUBREG_REG (x);
 
-  /* If INNER is a constant or PLUS, then INNER must be reloaded.  */
+  /* If INNER is a constant or PLUS, then INNER will need reloading.  */
   if (CONSTANT_P (inner) || GET_CODE (inner) == PLUS)
-return 1;
+return true;
 
-  /* If INNER is not a hard register, then INNER will not need to
- be reloaded.  */
-  if (!REG_P (inner)
-  || REGNO (inner) >= FIRST_PSEUDO_REGISTER)
-return 0;
+  /* If INNER is not a hard register, then INNER will not need reloading.  */
+  if (!(REG_P (inner) && HARD_REGISTER_P (inner)))
+return false;
 
   /* If INNER is not ok for MODE, then INNER will need reloading.  */
-  if (! HARD_REGNO_MODE_OK (subreg_regno (x), mode))
-return 1;
+  if (!HARD_REGNO_MODE_OK (subreg_regno (x), mode))
+return true;
 
-  /* If the outer part is a word or smaller, INNER larger than a
- word and the number of regs for INNER is not the same as the
- number of words in INNER, then INNER will need reloading.  */
-  return (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
-	  && output
+  /* If this is for an output, and the outer part is a word or smaller,
+ INNER is larger than a word and the number of registers in INNER is
+ not the same as the number of words in INNER, then INNER will need
+ reloading (with an in-out reload).  */
+  return (output
+	  && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
 	  && GET_MODE_SIZE (GET_MODE (inner)) > UNITS_PER_WORD
 	  && ((GET_MODE_SIZE (GET_MODE (inner)) / UNITS_PER_WORD)
 	      != (int) hard_regno_nregs[REGNO (inner)][GET_MODE (inner)]));
@@ -990,9 +989,9 @@ push_reload (rtx in, rtx out, rtx *inloc
  For machines that extend byte loads, do this for any SUBREG of a pseudo
  where both M1 and M2 are a word or smaller, M1 is wider than M2, and
  M2 is an integral mode that gets extended when loaded.
- Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R where
- either M1 is not valid for R or M2 is wider than a word but we only
- need one word to store an M2-sized quantity in R.
+ Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard register R
+ where either M1 is not valid for R or M2 is wider than a word but we
+ only need one register to store an M2-sized quantity in R.
  (However, if OUT is nonzero, we need to reload the reg *and*
  the subreg, so do nothing here, and let following statement handle it.)
 
@@ -1082,17 +1081,16 @@ push_reload (rtx in, rtx out, rtx *inloc
   inmode = GET_MODE (in);
 }
 
-  /* Similar issue for (SUBREG:M1 (REG:M2 ...) ...) for a hard 

Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra
On Wed, Oct 26, 2011 at 03:01:01PM +0200, Bernd Schmidt wrote:
 On 10/26/11 14:27, Alan Modra wrote:
  Committed revision 180522.  It turns out that shrink-wrapping isn't as
  effective as it used to be with the 20110915 based sources I was using
  originally.  povray Ray_In_Bound no longer gets the benefit of shrink
  wrap, likely due to some cfg optimization.  We end up with a simple
  block that just does r3=1 then jumps to last_bb being reached from
  blocks that need a prologue as well as blocks that don't.  That's
  enough to kill our current shrink wrap implementation.  What we need
  is something to duplicate these tail blocks..
 
 Would it work to insert the epilogue on some edges to this R3=1 block,
 and not on the others?

Wouldn't you need to modify all the target epilogue code?  Our
epilogues return.

 (How many edges of each kind are there?)
In the povray case there was one edge of each kind, but I have seen
other cases where there were 4 edges from blocks needing no prologue
and 2 edges from blocks needing a prologue.  I can't tell you what the
testcase was now;  It was something I looked at when ironing out bugs
in my code.  You wouldn't believe how many ways it is possible to
write buggy cfg manipulation code..

I guess the tradeoff between the classic shrink-wrap epilogue scheme
and my duplicate tail idea is whether duplicating tail blocks adds
more code than duplicating epilogues.  From what I've seen, the
duplicate tails are generally very small.  I guess I should dump out
some info so we can get a better idea.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH][ARM] Big Endian and Generic tuning

2011-10-26 Thread Andrew Stubbs

On 25/10/11 15:16, Richard Earnshaw wrote:

The difficulty on R profile is that although BE-32 mode is obsolete,
there is a config bit that can be set to make the CPU byte-swap the
instruction stream to make it behave largely like it is running in BE-32
mode.

I think really we should be pushing the R-profile code towards having
BE-8 as the default; but to do that we really need a compiler option
that can override this behaviour (probably -mbe-[8|32]).  Once we have
that option, we can fix the compiler to prefer BE-8 as there is then a
way to get back the legacy behaviour.


Here's an updated patch that makes no generalizations.

OK?

Andrew
2011-10-26  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* config/arm/bpabi.h (BE8_LINK_SPEC): Recognize generic-armv7 tuning.

--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -58,6 +58,7 @@
 #define BE8_LINK_SPEC \
%{mbig-endian:%{march=armv7-a|mcpu=cortex-a5	\
|mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15	\
+   |mcpu=generic-armv7-a\
|march=armv7-m|mcpu=cortex-m3			\
|march=armv7e-m|mcpu=cortex-m4			\
|march=armv6-m|mcpu=cortex-m0			\


Re: [PATCH][ARM] Big Endian and Generic tuning

2011-10-26 Thread Richard Earnshaw
On 26/10/11 14:54, Andrew Stubbs wrote:
 On 25/10/11 15:16, Richard Earnshaw wrote:
 The difficulty on R profile is that although BE-32 mode is obsolete,
 there is a config bit that can be set to make the CPU byte-swap the
 instruction stream to make it behave largely like it is running in BE-32
 mode.

 I think really we should be pushing the R-profile code towards having
 BE-8 as the default; but to do that we really need a compiler option
 that can override this behaviour (probably -mbe-[8|32]).  Once we have
 that option, we can fix the compiler to prefer BE-8 as there is then a
 way to get back the legacy behaviour.
 
 Here's an updated patch that makes no generalizations.
 
 OK?
 

Yep

R.

 Andrew
 
 
 be.patch
 
 
 2011-10-26  Andrew Stubbs  a...@codesourcery.com
 
   gcc/
   * config/arm/bpabi.h (BE8_LINK_SPEC): Recognize generic-armv7 tuning.
 
 --- a/gcc/config/arm/bpabi.h
 +++ b/gcc/config/arm/bpabi.h
 @@ -58,6 +58,7 @@
  #define BE8_LINK_SPEC \
 %{mbig-endian:%{march=armv7-a|mcpu=cortex-a5 \
 |mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15\
 +   |mcpu=generic-armv7-a \
 |march=armv7-m|mcpu=cortex-m3 \
 |march=armv7e-m|mcpu=cortex-m4\
 |march=armv6-m|mcpu=cortex-m0 \




Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

 well, if such a function is used as inline and we know for it that j has 
 value != 2, then we have here a big difference.  For your first example, 
 we still have to do the memory access to *i, even if we are not 
 interested in result.

Actually we don't have to preserve memory accesses.  The interesting case 
is if the pointer has an invalid value.  The behaviour of the access then 
is undefined, and it's okay to not do it at all.  In case the pointer does 
point to an object, the access (if its value isn't needed) also isn't 
necessary.  IOW: in void f(int *p) { int i = *p; } we can always remove 
the pointer read.  So, I still maintain that the transformation on the 
original example was okay.


Ciao,
Michael.


Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Ulrich Weigand
Richard Henderson wrote:

   Fix typos in the names of vec_extract & vec_interleave tree codes.
   Change vec_perm checking and expansion level.
   Implement interleave via permutation.
   spu: Remove vec_extract_even/odd and vec_interleave expanders.
   rs6000: Remove some vec_extract_even/odd expanders.
   i386: Delete the vec_extract_even/odd patterns.

This patch set causes a build failure on SPU:

/home/uweigand/fsf/gcc-head/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
/home/uweigand/fsf/gcc-head/libgfortran/generated/matmul_c4.c:284:25: internal compiler error: in expand_vec_perm, at optabs.c:6998

  /* Multiply each element by its byte size.  */
  if (u == 2)
    sel = expand_simple_binop (mode, PLUS, sel, sel, sel, 0, OPTAB_DIRECT);
  else
    sel = expand_simple_binop (mode, ASHIFT, sel, GEN_INT (exact_log2 (u)),
                               sel, 0, OPTAB_DIRECT);
  gcc_assert (sel != NULL);


This fails since for u == 4 and mode == V4SFmode it attempts to expand
a V4SFmode shift, which is unsupported.

Shouldn't this be using the mode of the selector rather than the mode
of the result in any case?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


[C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

Hi,

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while 
we were triaging PR50864. In short, in tsubst_copy_and_build, for 
COMPONENT_REF, we call tsubst_baselink with an object which in this case 
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE. I'm trying to fix 
this by using the first operand instead, as we normally do for 
ARROW_EXPRs elsewhere. Note that, assuming this makes sense, we probably 
want to audit also case COMPONENT_REF in tsubst_copy...


What do you think?

(Tested already x86_64-linux)

Thanks,
Paolo.

//
/cp
2011-10-26  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50870
* pt.c (tsubst_copy_and_build): When object is an ARROW_EXPR
pass to tsubst_baselink its first operand.

/testsuite
2011-10-26  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50870
* g++.dg/cpp0x/decltype34.C: New.
Index: testsuite/g++.dg/cpp0x/decltype34.C
===
--- testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
@@ -0,0 +1,19 @@
+// PR c++/50870
+// { dg-options "-std=gnu++0x" }
+
+struct impl
+{
+  template <class T> static T create();
+};
+
+template<class T, class U,
+         class = decltype(impl::create<T>()->impl::create<U>())>
+struct tester{};
+
+tester<impl*, int> ti;
+
+template<class T, class U,
+         class = decltype(impl::create<T>()->impl::create<U>())>
+int test() { return 0; }
+
+int i = test<impl*, int>();
Index: cp/pt.c
===
--- cp/pt.c (revision 180520)
+++ cp/pt.c (working copy)
@@ -13711,7 +13711,9 @@ tsubst_copy_and_build (tree t,
member = TREE_OPERAND (t, 1);
if (BASELINK_P (member))
  member = tsubst_baselink (member,
-   non_reference (TREE_TYPE (object)),
+   TREE_CODE (object) == ARROW_EXPR
+   ? TREE_OPERAND (object, 0)
+   : non_reference (TREE_TYPE (object)),
args, complain, in_decl);
else
  member = tsubst_copy (member, args, complain, in_decl);


Re: PowerPC shrink-wrap support 3 of 3

2011-10-26 Thread Alan Modra
On Wed, Oct 26, 2011 at 03:59:36PM +0200, Bernd Schmidt wrote:
 On 10/26/11 15:54, Alan Modra wrote:
  I guess the tradeoff between the classic shrink-wrap epilogue scheme
  and my duplicate tail idea is whether duplicating tail blocks adds
  more code than duplicating epilogues.  From what I've seen, the
  duplicate tails are generally very small.  I guess I should dump out
  some info so we can get a better idea.
 
 I suppose if one wanted to avoid inserting more than one epilogue for
 code-size reasons, one could make a new basic block containing the
 epilogue, and redirect edges as appropriate.

Suppose you have a function that returns r3=0 in one tail block and
r3=1 in another, and these blocks are reached both by paths needing
a prologue and by paths not needing a prologue.  Which seems a likely
common case.  I'm fairly certain that would require two copies of the
normal epilogue, or duplicating the tail blocks.  (But it's late here
and I'm ready to nod off so may not be thinking straight.)

-- 
Alan Modra
Australia Development Lab, IBM


Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Rainer Orth
Ian,

 Hmmm, I don't have a copy of nawk.  Is it possible that it doesn't like
  * == space*?  You could try [ ]*.

the problem is another one: using /usr/xpg4/bin/awk, I find:

/usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function m

nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
state:

`match(STRING, REGEXP [, ARRAY])'
[...]
 The ARRAY argument to `match' is a `gawk' extension.  In
 compatibility mode (*note Options::), using a third argument is a
 fatal error.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

 well, if such a function is used as inline and we know for it that j has
 value != 2, then we have here a big difference.  For your first example,
 we still have to do the memory access to *i, even if we are not
 interested in result.

 Actually we don't have to preserve memory accesses.  The interesting case
 is if the pointer has an invalid value.  The behaviour of the access then
 is undefined, and it's okay to not do it at all.  In case the pointer does
 point to an object the access (if its value isn't needed) also isn't
 necessary.  IOW: in void f(int *p) { int i = *p; } we can always remove
 the pointer read.  So, I still maintain that the transformation on the
 original example was okay.


 Ciao,
 Michael.

So you would mean that memory dereferencing shouldn't be considered as
side-effect at all?

So we would happily cause a crash with code like 'if (i && *i != 0)', as
memory-dereference has for you no side-effect?

In your special case it might be valid that, if first (and C-fold-const
doesn't know if the side-effect condition is really the first, as it
might be a sub-sequence of a condition) condition might trap or not,
to combine it.  But branching has to cover the general cases.  If you
find a way to determine that left-hand operand in fold_const's
branching code is really the left-most condition in chain, then we can
add such a special case, but I don't see here an easy way to determine
it.

Regards,
Kai

Hmm, not sure


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill

On 10/26/2011 10:39 AM, Paolo Carlini wrote:

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while
we were triaging PR50864. In short, in tsubst_copy_and_build, for
COMPONENT_REF, we call tsubst_baselink with an object which in this case
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE.


That's OK, we can pass a null type to tsubst_baselink.  Just fix 
non_reference to pass NULL_TREE through.


Jason


Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 the problem is another one: using /usr/xpg4/bin/awk, I find:

 /usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function m

 nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
 state:

 `match(STRING, REGEXP [, ARRAY])'
 [...]
  The ARRAY argument to `match' is a `gawk' extension.  In
  compatibility mode (*note Options::), using a third argument is a
  fatal error.

Ah.  I took a look at the nawk man page, and I don't see any evidence
that it supports submatches at all.  How annoying.

Ian


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

 So you would mean that memory dereferencing shouldn't be considered as 
 side-effect at all?

No.  I haven't said this at all.  Of course it's a side-effect, but we're 
allowed to remove existing ones (under some circumstances).  We're not 
allowed to introduce new ones, which means that this ...

 So we would happily cause by code 'if (i && *i != 0)' a crash, as 
 memory-dereference has for you no side-effect?

... is not allowed.  But in the original example the memread was on the 
left side, hence occurred always, therefore we can move it to the right 
side, even though it might occur less often.

 In your special case it might be valid that, if first (and C-fold-const 
 doesn't know if the side-effect condition is really the first, as it 
 might be a sub-sequence of a condition) condition might trap or not, to 
 combine it.  But branching has to cover the general cases.  If you find 
 a way to determine that left-hand operand in fold_const's branching code 
 is really the left-most condition in chain, then we can add such a 
 special case, but I don't see here an easy way to determine it.

Hmm?  I don't see why it's necessary to check if it's the left-most 
condition in a chain.  If the left hand of '&&' is a memread it can always 
be moved to the right side (or the operator transformed into '&' which can 
have the same effect), of course only if the original rhs is free of side 
effects, but then independent of whether the '&&' was part of a larger 
expression.  The memread will possibly be done fewer times than originally, 
but as said, that's okay.


Ciao,
Michael.
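
[Editorial illustration, not part of the original message.]  A minimal C++
sketch of the argument above, assuming the pointer is valid at every call.
The names original_order and swapped_order are invented for this sketch and
do not come from GCC.

```cpp
// Hypothetical illustration; not GCC code.

// The memory read sits on the left of '&&', so it always happens.
bool original_order(const int *p, int j)
{
  return *p != 0 && j == 2;
}

// The read moved to the right: it now happens only when j == 2.
// The other operand (j == 2) has no side effects, so for any valid p
// the result is the same; the load is merely performed less often.
bool swapped_order(const int *p, int j)
{
  return j == 2 && *p != 0;
}
```

The reverse rewrite is the one the thread rules out: in 'if (i && *i != 0)'
the load may not be hoisted in front of the null check, because that would
introduce a read the original program never performed.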


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

On 10/26/2011 05:27 PM, Jason Merrill wrote:

On 10/26/2011 10:39 AM, Paolo Carlini wrote:

I'm trying to fix this PR, ice on valid, which Daniel kindly filed while
we were triaging PR50864. In short, in tsubst_copy_and_build, for
COMPONENT_REF, we call tsubst_baselink with an object which in this case
is an ARROW_EXPR, thus its TREE_TYPE is NULL_TREE.


That's OK, we can pass a null type to tsubst_baselink.  Just fix 
non_reference to pass NULL_TREE through.
Ah, very good (just read the comment preceding tsubst_baselink, should 
have done it earlier today ;) Thus I'm finishing testing the below, Ok 
if it passes?


Thanks,
Paolo.

/

/cp
2011-10-26  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50870
* typeck.c (non_reference): Pass NULL_TREE through.

/testsuite
2011-10-26  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50870
* g++.dg/cpp0x/decltype34.C: New.
Index: testsuite/g++.dg/cpp0x/decltype34.C
===
--- testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype34.C (revision 0)
@@ -0,0 +1,19 @@
+// PR c++/50870
+// { dg-options "-std=gnu++0x" }
+
+struct impl
+{
+  template <class T> static T create();
+};
+
+template<class T, class U,
+         class = decltype(impl::create<T>()->impl::create<U>())>
+struct tester{};
+
+tester<impl*, int> ti;
+
+template<class T, class U,
+         class = decltype(impl::create<T>()->impl::create<U>())>
+int test() { return 0; }
+
+int i = test<impl*, int>();
Index: cp/typeck.c
===
--- cp/typeck.c (revision 180528)
+++ cp/typeck.c (working copy)
@@ -8322,7 +8322,7 @@ casts_away_constness (tree t1, tree t2)
 tree
 non_reference (tree t)
 {
-  if (TREE_CODE (t) == REFERENCE_TYPE)
+  if (t && TREE_CODE (t) == REFERENCE_TYPE)
 t = TREE_TYPE (t);
   return t;
 }
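
[Editorial illustration, not part of the original message.]  The one-line
change above is a null pass-through guard.  A rough standalone sketch of the
same shape follows; tree_node_sketch, tree_code_sketch and
non_reference_sketch are hypothetical stand-ins, not GCC's real tree types.

```cpp
// Hypothetical stand-ins for GCC's tree nodes; only what the sketch needs.
enum tree_code_sketch { INTEGER_TYPE_SKETCH, REFERENCE_TYPE_SKETCH };

struct tree_node_sketch {
  tree_code_sketch code;
  tree_node_sketch *type;   // referred-to type for a reference, else null
};

// Strip one level of reference, passing a null node straight through
// instead of dereferencing it.
tree_node_sketch *non_reference_sketch(tree_node_sketch *t)
{
  if (t && t->code == REFERENCE_TYPE_SKETCH)
    t = t->type;
  return t;
}
```

With the guard in place a caller can hand over the (null) type of an
ARROW_EXPR-like node without checking it first, which is what the review
comment asks for.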


Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Richard Henderson
On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
 This fails since for u == 4 and mode == V4SFmode it attempts to expand
 a V4SFmode shift, which is unsupported.
 
 Shouldn't this be using the mode of the selector rather than the mode
 of the result in any case?

Yes, it should use the mode of the selector.  And doing just that is
enough to fix the bug.  But I noticed that the actual results for a
constant permutation were much nastier than they ought to be.

Try this.

Iain, this might solve your case too; no I can't test myself off cross,
because libgfortran needs the whole cross-env -- system headers and everything.


r~
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 9afc911..736d826 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6912,7 +6912,7 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx v1, 
rtx sel, rtx target)
   enum insn_code icode;
   enum machine_mode qimode;
   unsigned int i, w, e, u;
-  rtx tmp, sel_qi;
+  rtx tmp, sel_qi = NULL;
   rtvec vec;
 
   if (!target || GET_MODE (target) != mode)
@@ -6946,23 +6946,23 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx 
v1, rtx sel, rtx target)
   /* Fall back to a constant byte-based permutation.  */
   if (qimode != VOIDmode)
{
- icode = direct_optab_handler (vec_perm_const_optab, qimode);
- if (icode != CODE_FOR_nothing)
+ vec = rtvec_alloc (w);
+ for (i = 0; i < e; ++i)
{
- vec = rtvec_alloc (w);
- for (i = 0; i < e; ++i)
-   {
- unsigned int j, this_e;
+ unsigned int j, this_e;
 
- this_e = INTVAL (XVECEXP (sel, 0, i));
- this_e &= 2 * e - 1;
- this_e *= u;
+ this_e = INTVAL (XVECEXP (sel, 0, i));
+ this_e &= 2 * e - 1;
+ this_e *= u;
 
- for (j = 0; j < u; ++j)
-   RTVEC_ELT (vec, i * u + j) = GEN_INT (this_e + j);
-   }
- sel_qi = gen_rtx_CONST_VECTOR (qimode, vec);
+ for (j = 0; j < u; ++j)
+   RTVEC_ELT (vec, i * u + j) = GEN_INT (this_e + j);
+   }
+ sel_qi = gen_rtx_CONST_VECTOR (qimode, vec);
 
+ icode = direct_optab_handler (vec_perm_const_optab, qimode);
+ if (icode != CODE_FOR_nothing)
+   {
  tmp = expand_vec_perm_1 (icode, gen_lowpart (qimode, target),
   gen_lowpart (qimode, v0),
   gen_lowpart (qimode, v1), sel_qi);
@@ -6989,47 +6989,53 @@ expand_vec_perm (enum machine_mode mode, rtx v0, rtx 
v1, rtx sel, rtx target)
   if (icode == CODE_FOR_nothing)
 return NULL_RTX;
 
-  /* Multiply each element by its byte size.  */
-  if (u == 2)
-sel = expand_simple_binop (mode, PLUS, sel, sel, sel, 0, OPTAB_DIRECT);
-  else
-sel = expand_simple_binop (mode, ASHIFT, sel, GEN_INT (exact_log2 (u)),
-  sel, 0, OPTAB_DIRECT);
-  gcc_assert (sel != NULL);
-
-  /* Broadcast the low byte each element into each of its bytes.  */
-  vec = rtvec_alloc (w);
-  for (i = 0; i < w; ++i)
-{
-  int this_e = i / u * u;
-  if (BYTES_BIG_ENDIAN)
-   this_e += u - 1;
-  RTVEC_ELT (vec, i) = GEN_INT (this_e);
-}
-  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-  sel = gen_lowpart (qimode, sel);
-  sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
-  gcc_assert (sel != NULL);
-
-  /* Add the byte offset to each byte element.  */
-  /* Note that the definition of the indicies here is memory ordering,
- so there should be no difference between big and little endian.  */
-  vec = rtvec_alloc (w);
-  for (i = 0; i < w; ++i)
-RTVEC_ELT (vec, i) = GEN_INT (i % u);
-  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-  sel = expand_simple_binop (qimode, PLUS, sel, tmp, sel, 0, OPTAB_DIRECT);
-  gcc_assert (sel != NULL);
+  if (sel_qi == NULL)
+{
+  /* Multiply each element by its byte size.  */
+  enum machine_mode selmode = GET_MODE (sel);
+  if (u == 2)
+   sel = expand_simple_binop (selmode, PLUS, sel, sel,
+  sel, 0, OPTAB_DIRECT);
+  else
+   sel = expand_simple_binop (selmode, ASHIFT, sel,
+  GEN_INT (exact_log2 (u)),
+  sel, 0, OPTAB_DIRECT);
+  gcc_assert (sel != NULL);
+
+  /* Broadcast the low byte each element into each of its bytes.  */
+  vec = rtvec_alloc (w);
+  for (i = 0; i < w; ++i)
+   {
+ int this_e = i / u * u;
+ if (BYTES_BIG_ENDIAN)
+   this_e += u - 1;
+ RTVEC_ELT (vec, i) = GEN_INT (this_e);
+   }
+  tmp = gen_rtx_CONST_VECTOR (qimode, vec);
+  sel = gen_lowpart (qimode, sel);
+  sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
+  gcc_assert (sel != NULL);
+
+  /* Add the byte offset to each byte element.  */
+  /* Note that the definition of the indicies here is memory 

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Ulrich Weigand
Richard Henderson wrote:
 On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
  This fails since for u == 4 and mode == V4SFmode it attempts to expand
  a V4SFmode shift, which is unsupported.
  
  Shouldn't this be using the mode of the selector rather than the mode
  of the result in any case?
 
 Yes, it should use the mode of the selector.  And doing just that is
 enough to fix the bug.  But I noticed that the actual results for a
 constant permutation were much nastier than they ought to be.
 
 Try this.

Thanks, this has indeed fixed the build problem.  Full test suite is
still running ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
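
[Editorial illustration, not part of the original message.]  The constant
case in the patch turns an element-level selector into a byte-level one.  A
standalone C++ sketch of that loop is given below, assuming the element
count is a power of two (so the mask works) and using plain vectors instead
of RTL; the helper name byte_selector is invented here.

```cpp
#include <vector>

// Expand an element-level permutation selector into a byte-level one.
//   sel : one index per element, selecting from the 2*e elements of v0:v1
//   u   : size of one element in bytes
// Hypothetical helper; it mirrors the constant-selector loop in the patch.
std::vector<unsigned> byte_selector(const std::vector<unsigned> &sel,
                                    unsigned u)
{
  const unsigned e = static_cast<unsigned>(sel.size()); // elements per vector
  std::vector<unsigned> bytes(e * u);                   // w = e * u bytes
  for (unsigned i = 0; i < e; ++i)
    {
      unsigned this_e = sel[i];
      this_e &= 2 * e - 1;              // wrap into the two input vectors
      this_e *= u;                      // element index -> byte offset
      for (unsigned j = 0; j < u; ++j)
        bytes[i * u + j] = this_e + j;  // consecutive bytes of that element
    }
  return bytes;
}
```

For example, a four-element selector {0, 4, 1, 5} with 4-byte elements
expands to bytes 0..3, 16..19, 4..7, 20..23.  The variable-selector path in
the patch computes the same thing at run time (scale, broadcast, add byte
offsets), which is why the scaling must be done in the selector's integer
mode rather than in the result's mode.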


Re: new patches using -fopt-info (issue5294043)

2011-10-26 Thread Xinliang David Li
I am hoping that too:) Yes, I will try to do it when I find some time.

David

On Wed, Oct 26, 2011 at 1:37 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Tue, Oct 25, 2011 at 9:30 PM, Xinliang David Li davi...@google.com wrote:


 On Tue, Oct 25, 2011 at 1:02 AM, Richard Guenther
 richard.guent...@gmail.com wrote:

 On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li davi...@google.com
 wrote:
  Well, you seem to keep not reading what I write.  I am not opposed
  to adding -fopt-info/report nor to funnel messages to stdout/err.  What
  I am opposed is the way you want to introduce them.  I want you to
  fix what we dump into dump files, so that both -fopt-report and
  -fopt-info
  can be implemented by outputting selected pieces of the dump file
  to stdout/stderr.  We already have -fdump-*-stats which supposedly
  could match -fopt-report, and the default -fdump-* should be what
  goes to -fopt-info (minus the function bodies, of course).
 
  That sounds good. What you propose seems like
 
  -fdump-pass-[ir_only|transformation|debug]-stderr
 
  and -fopt-info is a short cut for
  -fdump-tree-all-transformations-stderr
  -fdump-ipa-all-transformations-stderr
  -fdump-rtl-all-transformations-stderr

 Yes.  Note that I don't like it the way the vectorizer does (with
 -fvectorizer-verbose=... the dump files are empty).  The dump
 file content should be unchanged when redirecting (parts) to
 stderr, so we have to arrange to duplicate messages in two places.

 Vectorizer dump is a good candidate for clean up when the dumping
 infrastructure improvement is done.
 FYI, for now, we will implement the opt-info for some passes in the simple
 way in google branches, and later migrate to trunk when the dumping
 infrastructure is improved.

 I was hoping you would volunteer to improve the dumping infrastructure.

 Richard.



[C++ Patch] BASELINK_P clean up

2011-10-26 Thread Paolo Carlini

Hi,

we have an inconsistent mix: shall we do the below? Test in progress on 
x86_64-linux.


Thanks,
Paolo.

//
2011-10-26  Paolo Carlini  paolo.carl...@oracle.com

* typeck.c (cp_build_addr_expr_1): Use BASELINK_P.
* class.c (instantiate_type): Likewise.
* pt.c (convert_nontype_argument_function, uses_template_parms,
tsubst_copy, resolve_nondeduced_context, type_dependent_expression_p):
Likewise.
* semantics.c (finish_decltype_type): Likewise.
* decl2.c (mark_used): Likewise.
* name-lookup.c (arg_assoc): Likewise.

Index: typeck.c
===
--- typeck.c(revision 180528)
+++ typeck.c(working copy)
@@ -4946,7 +4946,7 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue
   if (TREE_CODE (arg) == OFFSET_REF)
PTRMEM_OK_P (val) = PTRMEM_OK_P (arg);
 }
-  else if (TREE_CODE (TREE_OPERAND (arg, 1)) == BASELINK)
+  else if (BASELINK_P (TREE_OPERAND (arg, 1)))
 {
   tree fn = BASELINK_FUNCTIONS (TREE_OPERAND (arg, 1));
 
Index: class.c
===
--- class.c (revision 180528)
+++ class.c (working copy)
@@ -6873,7 +6873,7 @@ instantiate_type (tree lhstype, tree rhs, tsubst_f
}
 }
 
-  if (TREE_CODE (rhs) == BASELINK)
+  if (BASELINK_P (rhs))
 {
   access_path = BASELINK_ACCESS_BINFO (rhs);
   rhs = BASELINK_FUNCTIONS (rhs);
Index: pt.c
===
--- pt.c(revision 180528)
+++ pt.c(working copy)
@@ -5277,7 +5277,7 @@ convert_nontype_argument_function (tree type, tree
   fn_no_ptr = fn;
   if (TREE_CODE (fn_no_ptr) == ADDR_EXPR)
 fn_no_ptr = TREE_OPERAND (fn_no_ptr, 0);
-  if (TREE_CODE (fn_no_ptr) == BASELINK)
+  if (BASELINK_P (fn_no_ptr))
 fn_no_ptr = BASELINK_FUNCTIONS (fn_no_ptr);
  
   /* [temp.arg.nontype]/1
@@ -7801,7 +7801,7 @@ uses_template_parms (tree t)
   || EXPR_P (t)
   || TREE_CODE (t) == TEMPLATE_PARM_INDEX
   || TREE_CODE (t) == OVERLOAD
-  || TREE_CODE (t) == BASELINK
+  || BASELINK_P (t)
   || TREE_CODE (t) == IDENTIFIER_NODE
   || TREE_CODE (t) == TRAIT_EXPR
   || TREE_CODE (t) == CONSTRUCTOR
@@ -11993,7 +11993,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 base, name,
 /*template_p=*/false);
  }
-   else if (TREE_CODE (name) == BASELINK)
+   else if (BASELINK_P (name))
  name = tsubst_baselink (name,
  non_reference (TREE_TYPE (object)),
  args, complain,
@@ -15197,7 +15197,7 @@ resolve_nondeduced_context (tree orig_expr)
   offset = expr;
   expr = TREE_OPERAND (expr, 1);
 }
-  if (TREE_CODE (expr) == BASELINK)
+  if (BASELINK_P (expr))
 {
   baselink = expr;
   expr = BASELINK_FUNCTIONS (expr);
@@ -19314,7 +19314,7 @@ type_dependent_expression_p (tree expression)
   if (TREE_CODE (expression) == SCOPE_REF)
return false;
 
-  if (TREE_CODE (expression) == BASELINK)
+  if (BASELINK_P (expression))
expression = BASELINK_FUNCTIONS (expression);
 
   if (TREE_CODE (expression) == TEMPLATE_ID_EXPR)
Index: semantics.c
===
--- semantics.c (revision 180528)
+++ semantics.c (working copy)
@@ -5115,7 +5115,7 @@ finish_decltype_type (tree expr, bool id_expressio
step.  */
 expr = TREE_OPERAND (expr, 1);
 
-  if (TREE_CODE (expr) == BASELINK)
+  if (BASELINK_P (expr))
 /* See through BASELINK nodes to the underlying function.  */
 expr = BASELINK_FUNCTIONS (expr);
 
Index: decl2.c
===
--- decl2.c (revision 180528)
+++ decl2.c (working copy)
@@ -4174,7 +4174,7 @@ mark_used (tree decl)
  like the DECL for the function.  Otherwise, if the BASELINK is
  for an overloaded function, we don't know which function was
  actually used until after overload resolution.  */
-  if (TREE_CODE (decl) == BASELINK)
+  if (BASELINK_P (decl))
 {
   decl = BASELINK_FUNCTIONS (decl);
   if (really_overloaded_fn (decl))
Index: name-lookup.c
===
--- name-lookup.c   (revision 180528)
+++ name-lookup.c   (working copy)
@@ -5329,7 +5329,7 @@ arg_assoc (struct arg_lookup *k, tree n)
 n = TREE_OPERAND (n, 1);
   while (TREE_CODE (n) == TREE_LIST)
 n = TREE_VALUE (n);
-  if (TREE_CODE (n) == BASELINK)
+  if (BASELINK_P (n))
 n = BASELINK_FUNCTIONS (n);
 
   if (TREE_CODE (n) == FUNCTION_DECL)


[PATCH 0/3] Predication support for selective scheduler

2011-10-26 Thread Alexander Monakov
Hello,

This patch series implements predication support for the selective scheduler.
It is implemented as a transformation similar to speculative execution, but
simpler in some aspects (e.g. does not require recovery code).  On ia64,
combining speculation and conditional execution is supported.  Combining of 
predicates is not supported.

The patches have been bootstrapped and regtested on x86_64, ia64 and arm with
selective scheduling (+ pipelining and predication) enabled at -O2.  OK for
trunk?

Alexander


Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Andrew MacLeod

On 10/26/2011 01:19 PM, Richard Henderson wrote:



!resolved to an instriction sequence.  */

instruction

no no, its a mashup of restricted instruction.
ok, maybe not.

!   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
!   TREE_OPERAND (addr, 0) = builtin_decl_explicit(ext_call);

I do wonder if we ought not put the original back after the expand.
Hum.  I suppose that wouldn't hurt.  I don't think it should make any 
difference since I'm not modifying what's in the table, but I can give it 
a shot.  If it bootstraps, which I presume it will, I'll check it in 
with the original decl replaced.


Andrew


Re: [PATCH 1/3] Transformation replay mechanism

2011-10-26 Thread Alexander Monakov
(Note, this is a prerequisite for predication support but also is an
 improvement on its own.  This patch can be installed separately.)

This patch implements a transformation replay mechanism (redo_transformations).
Currently, after choosing an expression to schedule and finding its original
expressions below the scheduling fence in move_op, we need to move the
original expressions up (e.g. to correctly generate bookkeeping copies).  That
moving up would call moveup_expr_cached.

The patch slightly optimizes this by performing lookups in the history vector
of the chosen expression, which must contain all transformations of all
original expressions.  Not many changes are needed: the most notable is
perhaps that for the speculation transformation, both the old and the new
speculative statuses are saved.

This patch is needed by the following predication patch, which uses
redo_transformations to correctly apply inverted/non-inverted predicate on
jumps without needing to remember the branch from which the expression
originated.

2011-10-26  Alexander Monakov  amona...@ispras.ru

* sel-sched-ir.c (free_history_vect): Export.
(insert_in_history_vect): Rename argument spec_ds to old_spec_ds, add
new argument (new_spec_ds).  Record it in the history vector.  Update
all callers.
(init_expr): Assert that initial history is empty.  Simplify code.
(copy_history_of_changes): Split out from copy_expr.
* sel-sched-ir.h (struct expr_history_def_1): Rename spec_ds to
old_spec_ds.  Add new member (new_spec_ds).
(copy_history_of_changes, free_history_vect): Declare.
(insert_in_history_vect): Adjust prototype.
* sel-sched.c (undo_transformations): Adjust.
(move_exprs_to_boundary): Discard history of expression to be
scheduled.
(move_op_orig_expr_found): Populate history vector of c_expr for use
in redo_transformations.
(redo_transformations): New.
(move_op_ascend): Use redo_transformations.

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 1bbc3ed..1a73308 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -149,7 +149,6 @@ static expr_t set_insn_init (expr_t, vinsn_t, int);
 
 static void cfg_preds (basic_block, insn_t **, int *);
 static void prepare_insn_expr (insn_t, int);
-static void free_history_vect (VEC (expr_history_def, heap) **);
 
 static void move_bb_info (basic_block, basic_block);
 static void remove_empty_bb (basic_block, bool);
@@ -1503,13 +1502,13 @@ find_in_history_vect (VEC(expr_history_def, heap) 
*vect, rtx insn,
 
 /* Insert new element in a sorted history vector pointed to by PVECT,
if it is not there already.  The element is searched using
-   UID/NEW_EXPR_VINSN pair.  TYPE, OLD_EXPR_VINSN and SPEC_DS save
-   the history of a transformation.  */
+   UID/NEW_EXPR_VINSN pair.  TYPE, OLD_EXPR_VINSN, OLD_SPEC_DS and
+   NEW_SPEC_DS save the history of a transformation.  */
 void
 insert_in_history_vect (VEC (expr_history_def, heap) **pvect,
 unsigned uid, enum local_trans_type type,
 vinsn_t old_expr_vinsn, vinsn_t new_expr_vinsn,
-ds_t spec_ds)
+ds_t old_spec_ds, ds_t new_spec_ds)
 {
   VEC(expr_history_def, heap) *vect = *pvect;
   expr_history_def temp;
@@ -1525,15 +1524,16 @@ insert_in_history_vect (VEC (expr_history_def, heap) 
**pvect,
   /* It is possible that speculation types of expressions that were
  propagated through different paths will be different here.  In this
  case, merge the status to get the correct check later.  */
-  if (phist->spec_ds != spec_ds)
-phist->spec_ds = ds_max_merge (phist->spec_ds, spec_ds);
+  if (phist->old_spec_ds != old_spec_ds)
+phist->old_spec_ds = ds_max_merge (phist->old_spec_ds, old_spec_ds);
   return;
 }
 
   temp.uid = uid;
   temp.old_expr_vinsn = old_expr_vinsn;
   temp.new_expr_vinsn = new_expr_vinsn;
-  temp.spec_ds = spec_ds;
+  temp.old_spec_ds = old_spec_ds;
+  temp.new_spec_ds = new_spec_ds;
   temp.type = type;
 
   vinsn_attach (old_expr_vinsn);
@@ -1543,7 +1543,7 @@ insert_in_history_vect (VEC (expr_history_def, heap) 
**pvect,
 }
 
 /* Free history vector PVECT.  */
-static void
+void
 free_history_vect (VEC (expr_history_def, heap) **pvect)
 {
   unsigned i;
@@ -1576,7 +1576,7 @@ merge_history_vect (VEC (expr_history_def, heap) **pvect,
   for (i = 0; VEC_iterate (expr_history_def, from, i, phist); i++)
 insert_in_history_vect (pvect, phist->uid, phist->type,
 phist->old_expr_vinsn, phist->new_expr_vinsn,
-phist->spec_ds);
+phist->old_spec_ds, phist->new_spec_ds);
 }
 
 /* Compare two vinsns as rhses if possible and as vinsns otherwise.  */
@@ -1632,10 +1632,8 @@ init_expr (expr_t expr, vinsn_t vi, int spec, int use, 
int priority,
   EXPR_SPEC_DONE_DS (expr) = spec_done_ds;
   

Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Richard Henderson
On 10/26/2011 10:25 AM, Andrew MacLeod wrote:
 hum. I suppose that wouldn't hurt. I don't think it should make any
 difference since Im not modifying whats in the table, but I can give
 it a shot. If it bootstraps, which I presume it will, I'll check it
 in with the original decl replaced.

The other thing that just occurred to me is that we're lying to cgraph
about the functions that wind up being referenced.

But since different function references can come in via libcalls, I
don't know how different this case is.

It'll be interesting to see what happens with LTO enabled here...


r~


Re: [C++ Patch] BASELINK_P clean up

2011-10-26 Thread Jason Merrill

On 10/26/2011 12:41 PM, Paolo Carlini wrote:

we have an inconsistent mix: shall we do the below?


Sure, why not.

Jason



Re: [cxx-mem-model] Change library call for __atomic_op_fetch routines

2011-10-26 Thread Richard Henderson
On 10/25/2011 06:45 PM, Andrew MacLeod wrote:
   * builtins.c (expand_builtin_atomic_fetch_op): External calls for
   'op_fetch' builtins need to instead call 'fetch_op' externals and issue 
   correction code.
   (expand_builtin): Provide proper builtin name for external call and
   ignored flag to expand_builtin_atomic_fetch_op.

Looks ok.

 !resolved to an instriction sequence.  */

instruction

 !   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
 !   TREE_OPERAND (addr, 0) = builtin_decl_explicit(ext_call);

I do wonder if we ought not put the original back after the expand.


r~


Re: [PATCH] rs6000: Delete the remaining vec_extract expanders.

2011-10-26 Thread Richard Henderson
On 10/25/2011 04:38 PM, Michael Meissner wrote:
 On Tue, Oct 25, 2011 at 03:01:37PM -0700, Richard Henderson wrote:
 Now that expand_binop handles lowering vec_extract_even to vec_perm,
 we can remove the last two unnecessary vec_extract patterns from the
 Altivec backend.

 Ok?
 
 Just to be sure, the lowering doesn't depend on -ftree-vectorize (i.e. in case
 people are using the builtins).  I assume it doesn't.

No, this lowering happens during expansion to rtl, similar to NEG x -> SUB 0, x
and other expansion variants that optabs.c applies.


r~


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill

OK.

Jason


PATCH RFC: Fix cross-compiler include path if not using --with-sysroot

2011-10-26 Thread Ian Lance Taylor
In my earlier patch to add support for --with-native-system-header-dir,
I messed up the case of a cross-compiler when not using --with-sysroot.
In that case the compiler was incorrectly searching /usr/include.  This
patch restores the logic to what it used to be before my patch.  As the
patch is straightforward and simply restores the code to what it used to
be before my earlier patch, I plan to go ahead and commit this after
bootstrap and testing passes.  Please let me know if you object.

Ian


2011-10-26  Ian Lance Taylor  i...@google.com

* cppdefault.c: Undef NATIVE_SYSTEM_HEADER_DIR if
CROSS_DIRECTORY_STRUCTURE is defined and TARGET_SYSTEM_ROOT is
not.
(cpp_include_defaults): Only use NATIVE_SYSTEM_HEADER_DIR if it is
defined.


Index: gcc/cppdefault.c
===
--- gcc/cppdefault.c	(revision 180530)
+++ gcc/cppdefault.c	(working copy)
@@ -32,6 +32,7 @@
 
 #if defined (CROSS_DIRECTORY_STRUCTURE) && !defined (TARGET_SYSTEM_ROOT)
 # undef LOCAL_INCLUDE_DIR
+# undef NATIVE_SYSTEM_HEADER_DIR
 #else
 # undef CROSS_INCLUDE_DIR
 #endif
@@ -84,8 +85,10 @@ const struct default_include cpp_include
 /* Another place the target system's headers might be.  */
 { TOOL_INCLUDE_DIR, BINUTILS, 0, 1, 0, 0 },
 #endif
+#ifdef NATIVE_SYSTEM_HEADER_DIR
 /* /usr/include comes dead last.  */
 { NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 0 },
+#endif
 { 0, 0, 0, 0, 0, 0 }
   };
 #endif /* no INCLUDE_DEFAULTS */


Re: [PATCH 2/3] Predication support

2011-10-26 Thread Alexander Monakov
This patch contains the implementation of predication support.

The selective scheduler is adjusted to recognize COND_EXEC instructions.  They
can be renamed like normal assignments.  The selective scheduler itself will add
predicated variants of instructions to availability sets when computing them
on CFG splits, using the predicate of the conditional jump.

Converting an instruction to a predicated form is treated similarly to other
transformations (speculation, substitution or renaming) in
{undo,redo}_transformation.  A predication cache is maintained to quickly
retrieve predicated variants of instructions.

In particular, predication support allows pipelining non-doloop loops on
ia64 more cleanly, as no recovery code for speculative loads is needed.  It
also allows pipelining loads on arm.

2011-10-26  Alexander Monakov  amona...@ispras.ru

* common.opt: Add -fsel-sched-predication option.
* config/ia64/ia64.c (get_mode_no_for_insn): Support conditional loads.
* rtl.h (COND_SET_SRC_PTR, COND_SET_SRC_PTR): New macros.
* sched-deps.c (conditions_mutex_with_rev_p): Rename from
conditions_mutex_p.  Adjust the caller.
(conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): New functions.
(sched_analyze_insn): Extract LHS/RHS for conditional assignments.
* sched-int.h (conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): Declare.
* sel-sched-ir.c (vinsn_equal_p): Support conditional insns.
(vinsn_equal_skip_cond_p): New function.
(init_expr): Add new argument (was_predicated).  Update all callers. 
(merge_expr_data): Assert equality of merged predicates.
(av_set_add_nocopy, join_distinct_sets): Export.
(av_set_tail, av_set_concat): New functions.
(av_set_union_and_live): Add parameter to_fallthru_p, generate
predicated insns.
(setup_id_lhs_rhs): Support conditional insns.
(setup_id_cond): New function.
(init_id_from_df, deps_init_id): Use it.
(get_dest_and_mode): Support conditional insns.
* sel-sched-ir.h (enum local_trans_type): Add TRANS_PREDICATION.
(struct _expr): Add new member (was_predicated).
(EXPR_COND, EXPR_WAS_PREDICATED): New accessor macros.
(struct idata_def): Add new member (cond).
(IDATA_COND, VINSN_COND, INSN_COND): New accessor macros.
(vinsn_equal_skip_cond_p, maybe_predicate_expr_into,
av_set_add_nocopy, av_set_intersect, join_distinct_sets,
av_set_concat): Declare.
(av_set_union_and_live): Change prototype.
(in_fallthru_bb_p): Add new parameter (skip_nops_p).  Update all
callers.
(can_substitute_through_p): Do not substitute through conditional
insns.
(substitute_reg_in_expr, replace_src_with_reg_ok_p): Support
substitution in conditional moves.
(create_insn_rtx_with_rhs, replace_dest_with_reg_ok_p,
replace_dest_with_reg_in_expr): Support renaming of conditional moves.
(create_insn_rtx_with_lhs): Delete.
(mark_unavailable_hard_regs, choose_best_reg_1,
choose_best_pseudo_reg, verify_target_availability): Support
conditional insns.
(undo_transformations): Add support for TRANS_PREDICATION.
(redo_transformations): Ditto.
(moveup_expr): Add support for predicated exprs.
(struct predication_cache, predication_cache_hash,
predication_cache_free, predication_cache_eq,
find_in_predication_cache, predicate_expr, maybe_predicate_expr_into,
populate_av_set_with_predicated, filter_av_set_by_predicate): New.
(compute_av_set_at_bb_end): Generate predicated instructions.
(sel_rank_for_schedule): Adjust heuristics.
(vinsn_vec_has_expr_p): Allow inexact match for predicated insns.
(move_op_orig_expr_not_found): Ignore insns with matching LHS if the
condition is mutually exclusive.
(code_motion_process_successors): Add new argument (cond).  Use it to
skip predicated-off paths.  Update the caller.
(av_set_common_cond): New.
(sel_region_init): Initialize predication cache.
(sel_region_finish): Free predication cache.


diff --git a/gcc/common.opt b/gcc/common.opt
index d1f286f..3a7114d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1695,6 +1695,10 @@ fsel-sched-reschedule-pipelined
 Common Report Var(flag_sel_sched_reschedule_pipelined) Init(0) Optimization
 Reschedule pipelined regions without pipelining
 
+fsel-sched-predication
+Common Report Var(flag_sel_sched_predication) Init(0) Optimization
+Perform predication in selective scheduling
+
 ; sched_stalled_insns means that insns can be moved prematurely from the queue
 ; of stalled insns into the ready list.
 fsched-stalled-insns
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 78d2441..94dbff2 100644
--- a/gcc/config/ia64/ia64.c
+++ 

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Steve Kargl
On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
 The attached patch reaps some code that is now dead
 due to my recent changes for ishftc in check.c.
 Regression tested on i686-*-freebsd.
 
 2011-10-22  Steven G. Kargl  ka...@gcc.gnu.org
 
   * simplify.c (gfc_simplify_ishftc): Reap dead code.

Here's a revised patch that does 2 things.  First, it adds
a check in gfc_check_nearest that the 2nd argument (if it
is a constant) is not zero.  Second, it reaps dead code
in several of the simplification functions.
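
[Editorial sketch, not part of the original message.]  The constraint being checked is that the S argument of NEAREST only supplies a direction, so a zero value is meaningless.  A rough C++ analogue, assuming `std::nextafter` as a stand-in for the intrinsic and a hypothetical function name `nearest`, rejects zero the same way:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Return the representable double adjacent to x in the direction given by
// the sign of s.  A zero s carries no direction, so it is rejected, which
// mirrors the new compile-time check for constant arguments.
double nearest(double x, double s) {
    if (s == 0.0)
        throw std::invalid_argument("Argument 'S' of NEAREST shall not be zero");
    const double inf = std::numeric_limits<double>::infinity();
    return std::nextafter(x, s > 0.0 ? inf : -inf);
}
```

The patch diagnoses the constant case in the front end; this sketch can only diagnose at run time, so it is an analogy rather than an equivalent.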


2011-10-26  Steven G. Kargl  ka...@gcc.gnu.org

* check.c (gfc_check_atan_2): Typo in comment.
(gfc_check_nearest): If 's' is constant, check that it is not 0.
* simplify.c (simplify_dshift, gfc_simplify_ibclr, gfc_simplify_ibits,
gfc_simplify_ibset, simplify_shift, gfc_simplify_ishftc,
gfc_simplify_nearest): Remove dead code.

2011-10-26  Steven G. Kargl  ka...@gcc.gnu.org

* gfortran.dg/nearest_5.f90: New test.

-- 
Steve
Index: fortran/check.c
===
--- fortran/check.c	(revision 180529)
+++ fortran/check.c	(working copy)
@@ -934,7 +934,7 @@ null_arg:
 gfc_try
 gfc_check_atan_2 (gfc_expr *y, gfc_expr *x)
 {
-  /* gfc_notify_std would be a wast of time as the return value
+  /* gfc_notify_std would be a waste of time as the return value
  is seemingly used only for the generic resolution.  The error
  will be: Too many arguments.  */
   if ((gfc_option.allow_std & GFC_STD_F2008) == 0)
@@ -2710,6 +2710,16 @@ gfc_check_nearest (gfc_expr *x, gfc_expr
   if (type_check (s, 1, BT_REAL) == FAILURE)
 return FAILURE;
 
+  if (s->expr_type == EXPR_CONSTANT)
+    {
+      if (mpfr_sgn (s->value.real) == 0)
+        {
+          gfc_error ("Argument 'S' of NEAREST at %L shall not be zero",
+                     &s->where);
+          return FAILURE;
+        }
+    }
+
   return SUCCESS;
 }
 
Index: fortran/simplify.c
===
--- fortran/simplify.c	(revision 180529)
+++ fortran/simplify.c	(working copy)
@@ -1899,13 +1899,7 @@ simplify_dshift (gfc_expr *arg1, gfc_exp
   k = gfc_validate_kind (BT_INTEGER, arg1->ts.kind, false);
   size = gfc_integer_kinds[k].bit_size;
 
-  if (gfc_extract_int (shiftarg, &shift) != NULL)
-    {
-      gfc_error ("Invalid SHIFT argument of DSHIFTL at %L", &shiftarg->where);
-      return &gfc_bad_expr;
-    }
-
-  gcc_assert (shift >= 0 && shift <= size);
+  gfc_extract_int (shiftarg, &shift);
 
   /* DSHIFTR(I,J,SHIFT) = DSHIFTL(I,J,SIZE-SHIFT).  */
   if (right)
@@ -2509,21 +2503,10 @@ gfc_simplify_ibclr (gfc_expr *x, gfc_exp
   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
     return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-    {
-      gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
-      return &gfc_bad_expr;
-    }
+  gfc_extract_int (y, &pos);
 
   k = gfc_validate_kind (x->ts.type, x->ts.kind, false);
 
-  if (pos >= gfc_integer_kinds[k].bit_size)
-    {
-      gfc_error ("Second argument of IBCLR exceeds bit size at %L",
-                 &y->where);
-      return &gfc_bad_expr;
-    }
-
   result = gfc_copy_expr (x);
 
   convert_mpz_to_unsigned (result->value.integer,
@@ -2551,17 +2534,8 @@ gfc_simplify_ibits (gfc_expr *x, gfc_exp
       || z->expr_type != EXPR_CONSTANT)
     return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-    {
-      gfc_error ("Invalid second argument of IBITS at %L", &y->where);
-      return &gfc_bad_expr;
-    }
-
-  if (gfc_extract_int (z, &len) != NULL || len < 0)
-    {
-      gfc_error ("Invalid third argument of IBITS at %L", &z->where);
-      return &gfc_bad_expr;
-    }
+  gfc_extract_int (y, &pos);
+  gfc_extract_int (z, &len);
 
   k = gfc_validate_kind (BT_INTEGER, x->ts.kind, false);
 
@@ -2614,21 +2588,10 @@ gfc_simplify_ibset (gfc_expr *x, gfc_exp
   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
     return NULL;
 
-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-    {
-      gfc_error ("Invalid second argument of IBSET at %L", &y->where);
-      return &gfc_bad_expr;
-    }
+  gfc_extract_int (y, &pos);
 
   k = gfc_validate_kind (x->ts.type, x->ts.kind, false);
 
-  if (pos >= gfc_integer_kinds[k].bit_size)
-    {
-      gfc_error ("Second argument of IBSET exceeds bit size at %L",
-                 &y->where);
-      return &gfc_bad_expr;
-    }
-
   result = gfc_copy_expr (x);
 
   convert_mpz_to_unsigned (result->value.integer,
@@ -3004,11 +2967,8 @@ simplify_shift (gfc_expr *e, gfc_expr *s
 
   if (e->expr_type != EXPR_CONSTANT || s->expr_type != EXPR_CONSTANT)
     return NULL;
-  if (gfc_extract_int (s, &shift) != NULL)
-    {
-      gfc_error ("Invalid second argument of %s at %L", name, &s->where);
-      return &gfc_bad_expr;
-    }
+
+  gfc_extract_int (s, &shift);
 
   k = gfc_validate_kind (BT_INTEGER, e->ts.kind, false);
   bitsize = gfc_integer_kinds[k].bit_size;
@@ -3146,11 +3106,7 @@ gfc_simplify_ishftc (gfc_expr *e, gfc_ex
   if (e->expr_type != EXPR_CONSTANT || 

Re: [PATCH 3/3, RFC] Fixup COND_EXECs before reload

2011-10-26 Thread Alexander Monakov
This RFC patch implements conversion of COND_EXEC instructions to control flow
for pre-RA selective scheduler.  Something like this is needed to employ
predication support before reload.

Each COND_EXEC is converted separately to a new basic block with the
unconditional variant of the instruction, and a conditional jump around it.
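
[Editorial sketch, not part of the original message.]  The transformation described above can be modelled on a toy instruction list: every predicated instruction becomes a jump on the reversed condition, the unpredicated instruction, and a label to jump to.  The `Insn` type, the label naming and the textual output are all invented for illustration; the real patch works on RTL and basic blocks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction: pred is empty for an unconditional instruction.
struct Insn {
    std::string pred;  // e.g. "p1"; empty means "always execute"
    std::string body;  // e.g. "r1 = r2 + r3"
};

// Expand each predicated instruction separately into a conditional jump
// around its unconditional variant, as the message describes.
std::vector<std::string> lower_cond_execs(const std::vector<Insn>& insns) {
    std::vector<std::string> out;
    int next_label = 0;
    for (const Insn& insn : insns) {
        if (insn.pred.empty()) {
            out.push_back(insn.body);
            continue;
        }
        const std::string label = "L" + std::to_string(next_label++);
        out.push_back("if (!" + insn.pred + ") goto " + label);  // reversed test
        out.push_back(insn.body);                                // unpredicated insn
        out.push_back(label + ":");                              // join point
    }
    return out;
}
```

Handling each COND_EXEC on its own keeps the rewrite simple, at the cost of one extra block per predicated instruction even when neighbours share a predicate.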

I'm not sure what would be an acceptable approach here.  I'm also not sure
about the recommended way to emit JUMPs.


2011-10-26  Sergey Grechanik  mouseent...@ispras.ru

* sel-sched.c (convert_cond_execs): New.  Use it...
(sel_global_finish): ... here.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index f5c6f8b..b8f2663 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtlhooks-def.h"
 #include "output.h"
 #include "emit-rtl.h"
+#include "cfghooks.h"
 
 #ifdef INSN_SCHEDULING
 #include "sel-sched-ir.h"
@@ -7978,6 +7979,60 @@ sel_global_init (void)
   init_hard_regs_data ();
 }
 
+/* Convert cond_execs to jumps before reload.  */
+static void
+convert_cond_execs (void)
+{
+  basic_block bb;
+  rtx insn;
+
+  if (reload_completed)
+    return;
+
+  FOR_EACH_BB (bb)
+    /* We don't need the safe variant because we break immediately after
+       removing the current instruction.  */
+    FOR_BB_INSNS (bb, insn)
+      if (INSN_P (insn) && GET_CODE (PATTERN (insn)) == COND_EXEC)
+        {
+          rtx jump;
+          rtx cond = COND_EXEC_TEST (PATTERN (insn));
+          rtx rcond = reversed_comparison (cond, GET_MODE (cond));
+          rtx unpredicated = copy_rtx (COND_EXEC_CODE (PATTERN (insn)));
+
+          /* Split bb into BB, NEW_BB, NEXT_BB (in that order).  */
+          edge e1 = split_block (bb, insn);
+          basic_block next_bb = e1->dest;
+          edge e2 = split_block (bb, insn);
+          basic_block new_bb = e2->dest;
+
+          /* Emit conditional jump at the end of bb.  */
+          rtx label = block_label (next_bb);
+
+          /* FIXME  There should be a better way.  */
+          rtx jump_pat
+            = gen_rtx_SET (GET_MODE (pc_rtx), pc_rtx,
+                           gen_rtx_IF_THEN_ELSE (GET_MODE (pc_rtx),
+                                                 rcond,
+                                                 gen_rtx_LABEL_REF (VOIDmode,
+                                                                    label),
+                                                 pc_rtx));
+
+          make_edge (bb, next_bb, 0);
+          jump = emit_jump_insn_after (jump_pat, BB_END (bb));
+          JUMP_LABEL (jump) = label;
+          LABEL_NUSES (label)++;
+
+          emit_insn_after_noloc (unpredicated, BB_HEAD (new_bb), new_bb);
+
+          delete_insn (insn);
+          break;
+        }
+#ifdef ENABLE_CHECKING
+  verify_flow_info ();
+#endif
+}
+
 /* Free the global data of the scheduler.  */
 static void
 sel_global_finish (void)
@@ -7998,6 +8053,8 @@ sel_global_finish (void)
 
   free_sched_pools ();
   free_dominance_info (CDI_DOMINATORS);
+
+  convert_cond_execs ();
 }
 
 /* Return true when we need to skip selective scheduling.  Used for debugging.  */




[PATCH, testsuite] Unxfail g++.dg/tree-ssa/pr45605.C

2011-10-26 Thread Martin Jambor
Hi,

I've noticed that a test in g++.dg/tree-ssa/pr45605.C is Xfailed, even
though early FRE now does the devirtualization that it tests for.
True, the current scan happens over the SSA dump (which implies the
front-end was somehow supposed to do it?) but I think that given we do
perform the optimization, checking that we continue to do so is
useful, hence the following change.

Tested by adding to a successful bootstrap and test run on x86_64.
Unless anyone objects, I will commit the patch tomorrow.

Thanks,

Martin

2011-10-25  Martin Jambor  mjam...@suse.cz

* g++.dg/tree-ssa/pr45605.C: Scan fre1 dump and unxfail.

Index: src/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
===
--- src.orig/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
+++ src/gcc/testsuite/g++.dg/tree-ssa/pr45605.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-ssa" } */
+/* { dg-options "-O1 -fdump-tree-fre1" } */
 extern "C" void abort(); 
 bool destructor_called = false; 
 
@@ -33,5 +33,5 @@ int main() {
 
 
 /* We should devirtualize call to D::Run */
-/* { dg-final { scan-tree-dump-times "D::Run \\(" 1 "ssa" { xfail *-*-* } } } */
-/* { dg-final { cleanup-tree-dump "ssa" } } */
+/* { dg-final { scan-tree-dump-times "D::Run \\(" 1 "fre1" } } */
+/* { dg-final { cleanup-tree-dump "fre1" } } */


[PATCH] Pass through jump functions for addressable (scalar) parameters

2011-10-26 Thread Martin Jambor
Hi,

Honza requested that we do attempt to produce pass-through jump
functions even when the actual formal parameter that is being passed
on is addressable - provided that we can prove it has not changed
value, of course.  The following patch does this, hopefully the
limitations of our parameter-not-modified mechanism are not too strict
for this purpose.

The patch is supposed to be applied on top of the one I have just sent
that renames all parm_infos to parm_ainfos.  I have successfully
bootstrapped and tested it on x86_64-linux.

OK for trunk?

Thanks,

Martin



2011-10-26  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (mark_modified): Moved up in the file.
(is_parm_modified_before_call): Renamed to
is_parm_modified_before_stmt, moved up in the file.
(load_from_unmodified_param): New function.
(compute_complex_assign_jump_func): Also attempt to create pass
through jump functions for values loaded from (addressable)
parameters.

* testsuite/gcc.dg/ipa/ipcp-4.c: New test.

Index: src/gcc/testsuite/gcc.dg/ipa/ipcp-4.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/ipcp-4.c
@@ -0,0 +1,68 @@
+/* Test that IPA-CP is able to produce a pass-through jump function for the
+   call of g1 and g2 even though a is addressable.  Also test that h is not
+   cloned.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp -fno-early-inlining"  } */
+/* { dg-add-options bind_pic_locally } */
+
+extern void use_stuff (int);
+extern void use_pointer (int *);
+
+static int
+h (int a, int b)
+{
+  int i;
+
+  for (i = 8; i <= b; i++)
+    use_stuff (a+8);
+}
+
+static int
+g1 (int a, int b)
+{
+  int i;
+
+  for (i = 0; i <= b; i++)
+    use_pointer (&a);
+  h (a, b);
+}
+
+static int
+g2 (int a, int b)
+{
+  int i;
+
+  for (i = 4; i <= b; i += 2)
+    use_stuff (a);
+}
+
+
+static void
+f (int a, int z)
+{
+  if (z > 1)
+    g1 (a, z);
+  else
+    g2 (a + 4, z);
+  use_pointer (&a);
+}
+
+int
+main (int argc, char *argv[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    f (7, argc);
+  return 0;
+}
+
+
+/* { dg-final { scan-ipa-dump "Creating a specialized node of g1.*for all known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump "Creating a specialized node of g2.*for all known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump-not "Creating a specialized node of h.*for all known contexts" "cp" } } */
+/* { dg-final { scan-ipa-dump-times "replacing param a with const 7" 2 "cp"  } } */
+/* { dg-final { scan-ipa-dump "replacing param a with const 11" "cp"  } } */
+/* { dg-final { cleanup-ipa-dump "cp" } } */
+
+
Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -419,31 +419,105 @@ detect_type_change_ssa (tree arg, gimple
   return detect_type_change (arg, arg, call, jfunc, 0);
 }
 
+/* Callback of walk_aliased_vdefs.  Flags that it has been invoked to the
+   boolean variable pointed to by DATA.  */
+
+static bool
+mark_modified (ao_ref *ao ATTRIBUTE_UNUSED, tree vdef ATTRIBUTE_UNUSED,
+void *data)
+{
+  bool *b = (bool *) data;
+  *b = true;
+  return true;
+}
+
+/* Return true if the formal parameter PARM might have been modified in this
+   function before reaching the statement STMT.  PARM_AINFO is a pointer to a
+   structure containing temporary information about PARM.  */
+
+static bool
+is_parm_modified_before_stmt (struct param_analysis_info *parm_ainfo,
+ gimple stmt, tree parm)
+{
+  bool modified = false;
+  ao_ref refd;
+
+  if (parm_ainfo->modified)
+    return true;
+
+  ao_ref_init (&refd, parm);
+  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified,
+                      &modified, &parm_ainfo->visited_statements);
+  if (modified)
+    {
+      parm_ainfo->modified = true;
+      return true;
+    }
+  return false;
+}
+
+/* If STMT is an assignment that loads a value from an parameter declaration,
+   return the index of the parameter in ipa_node_params which has not been
+   modified.  Otherwise return -1.  */
+
+static int
+load_from_unmodified_param (struct ipa_node_params *info,
+   struct param_analysis_info *parms_ainfo,
+   gimple stmt)
+{
+  int index;
+  tree op1;
+
+  if (!gimple_assign_single_p (stmt)
+  || gimple_assign_cast_p (stmt))
+return -1;
+
+  op1 = gimple_assign_rhs1 (stmt);
+  index = ipa_get_param_decl_index (info, op1);
+  if (index < 0
+      || is_parm_modified_before_stmt (&parms_ainfo[index], stmt, op1))
+    return -1;
+
+  return index;
+}
 
 /* Given that an actual argument is an SSA_NAME (given in NAME) and is a result
of an assignment statement STMT, try to find out whether NAME can be
described by a (possibly polynomial) pass-through jump-function or an
-   ancestor jump function and if so, write the appropriate function 

Re: [C++-11] User defined literals

2011-10-26 Thread Jason Merrill

On 10/26/2011 02:00 AM, Ed Smith-Rowland wrote:

The patch was bootstrapped and regtested on x86_64-linux-gnu.


Really?  I ran into a warning about the unused suffix parameter to 
interpret_integer.  So I've fixed that error.  I also added a couple of 
comments, and implemented the change to check_literal_operator_args that 
I wondered about a while back.  And checked it all in.


But we aren't quite done, I think: I notice that the lookup of operators 
doesn't match what's in 2.14.8.  For instance, I don't think this should 
be accepted:


double operator "" _foo (long long unsigned);
double d = 1.2_foo;

The lookup described in 2.14.8 involves looking through the overload set 
for a particular signature before doing normal overload resolution.
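
[Editorial sketch, not part of the original message.]  Under the C++11 rules referred to above, a floating literal with a suffix is matched against a cooked operator taking `long double` (or a raw/template operator), and an integer literal against one taking `unsigned long long`; the two are not interchangeable through conversions.  The suffix names and return values below are illustrative only.

```cpp
#include <cassert>

// Cooked literal operators, one per literal kind.
constexpr int operator""_foo(unsigned long long) { return 1; }  // integer literals
constexpr int operator""_foo(long double) { return 2; }         // floating literals

constexpr int from_integer = 12_foo;    // picks the unsigned long long operator
constexpr int from_floating = 1.2_foo;  // picks the long double operator

// A suffix with only the integer form.  Writing 1.2_bar here would be
// ill-formed, because no long double, raw, or template operator exists for
// _bar and the integer operator is not considered for a floating literal.
constexpr int operator""_bar(unsigned long long) { return 3; }
constexpr int only_integer = 7_bar;

static_assert(from_integer == 1, "integer literal uses the integer operator");
static_assert(from_floating == 2, "floating literal uses the floating operator");
```

This is why the declaration quoted in the message, which only provides the `long long unsigned` form, should not make `1.2_foo` valid.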


Also, we don't need to worry about argument-dependent lookup for these 
operators, since none of the arguments can have associated namespaces. 
So I think we can use lookup_name rather than lookup_function_nonclass, 
only look it up once in cp_userdef_numeric_literal, and then only build 
one call depending on the contents of the overload set.


Jason
commit e2fe821867bdc55a300f400fe8340a8d7168fb46
Author: Jason Merrill ja...@redhat.com
Date:   Wed Oct 26 13:38:05 2011 -0400

interpret_int

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 46c0340..baee8eb 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -44,7 +44,7 @@ static splay_tree file_info_tree;
 int pending_lang_change; /* If we need to switch languages - C++ only */
 int c_header_level;	 /* depth in C headers - C++ only */
 
-static tree interpret_integer (const cpp_token *, unsigned int, const char *);
+static tree interpret_integer (const cpp_token *, unsigned int);
 static tree interpret_float (const cpp_token *, unsigned int, const char *);
 static tree interpret_fixed (const cpp_token *, unsigned int);
 static enum integer_type_kind narrowest_unsigned_type
@@ -329,7 +329,7 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
 	   Set PURE_ZERO to pass this information to the C++ parser.  */
 	if (tok->val.str.len == 1 && *tok->val.str.text == '0')
 	  add_flags = PURE_ZERO;
-	*value = interpret_integer (tok, flags, suffix);
+	*value = interpret_integer (tok, flags);
 	break;
 
 	  case CPP_N_FLOATING:
@@ -584,11 +584,9 @@ narrowest_signed_type (unsigned HOST_WIDE_INT low,
   return itk_none;
 }
 
-/* Interpret TOKEN, an integer with FLAGS as classified by cpplib.
-   For C++0X SUFFIX may contain a user-defined literal suffix.  */
+/* Interpret TOKEN, an integer with FLAGS as classified by cpplib.  */
 static tree
-interpret_integer (const cpp_token *token, unsigned int flags,
-		   const char *suffix)
+interpret_integer (const cpp_token *token, unsigned int flags)
 {
   tree value, type;
   enum integer_type_kind itk;
commit 9c14fb264df67fdc3a28b47c67991345af634d85
Author: Jason Merrill ja...@redhat.com
Date:   Wed Oct 26 12:36:58 2011 -0400

build_string comments

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index baee8eb..7b220ab 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -344,6 +344,8 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
 	  {
 	tree suffix_id = get_identifier (suffix);
 	int len = tok->val.str.len - strlen (suffix);
+	/* If this is going to be used as a C string to pass to a
+	   raw literal operator, we need to add a trailing NUL.  */
 	tree num_string = build_string (len + 1,
 	(const char *) tok-val.str.text);
 	TREE_TYPE (num_string) = char_array_type_node;
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5ba5008..860556c 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8592,7 +8592,7 @@ grokdeclarator (const cp_declarator *declarator,
       error ("declaration of %qD as non-function", dname);
       return error_mark_node;
     }
- 
+
   if (dname
       && TREE_CODE (dname) == IDENTIFIER_NODE
       && UDLIT_OPER_P (dname)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 840a30d..090482c 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -3667,6 +3667,7 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
 /* Parse a user-defined string constant.  Returns a call to a user-defined
literal operator taking a character pointer and the length of the string
as arguments.  */
+
 static tree
 cp_parser_userdef_string_literal (cp_token *token)
 {
diff --git a/gcc/tree.c b/gcc/tree.c
index 64c4968..2cbd68b 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1525,6 +1525,7 @@ build_real_from_int_cst (tree type, const_tree i)
 
 /* Return a newly constructed STRING_CST node whose value is
the LEN characters at STR.
+   Note that for a C string literal, LEN should include the trailing NUL.
The TREE_TYPE is not initialized.  */
 
 tree
commit d083a0d7f94fb0fe3605d499366b1b637e169c17
Author: Jason Merrill ja...@redhat.com
Date:   Wed Oct 26 12:37:32 2011 -0400

	* typeck.c (check_literal_operator_args): 

Re: [PATCH 0/6] More vector permutation work

2011-10-26 Thread Iain Sandoe


On 26 Oct 2011, at 17:01, Richard Henderson wrote:


On 10/26/2011 07:30 AM, Ulrich Weigand wrote:
This fails since for u == 4 and mode == V4SFmode it attempts to expand
a V4SFmode shift, which is unsupported.

Shouldn't this be using the mode of the selector rather than the mode
of the result in any case?


Yes, it should use the mode of the selector.  And doing just that is
enough to fix the bug.  But I noticed that the actual results for a
constant permutation were much nastier than they ought to be.

Try this.

Iain, this might solve your case too;


yes - bootstrap incl fortran just finished (not the quickest machine ;))


no I can't test myself off cross,
because libgfortran needs the whole cross-env -- system headers and
everything.


Ah yes .. as does Ada .. and ObjC.

I recently (when trying to figure out what was broken w.ppc Ada) did
some Darwin crosses from Linux, not too too painful to set up (but
then I have the stuff to hand), but in essence prob. no worse than d/l
a bunch of pkg or whatever to set up a foreign linux cross-root.


cheers
Iain



Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Paul Richard Thomas
Dear Steve,

Reaping implies that there is something about it that you want to keep
:-)  Surely, weeding or herbicide spraying is better than reaping?

On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
s...@troutmask.apl.washington.edu wrote:
 On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
 The attached patch reaps some code that is now dead

I am sure that you are right but could you confirm that it is because
of previous patches?

 due to my recent changes for ishftc in check.c.
 Regression tested on i686-*-freebsd.

It looks to be OK, subject to the above, and is, I would guess,
obvious in any case.

Thanks for the patch

Paul


[C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini

Hi,

one more / RFC, for the ICE on invalid part of these issues with '->'.

The below tries to catch the problem very early, in 
cp_parser_postfix_dot_deref_expression and apparently works fine, passes 
the testsuite, etc. Is it too early? Is the check tight enough?


Thanks,
Paolo.

//
Index: testsuite/g++.dg/template/crash109.C
===
--- testsuite/g++.dg/template/crash109.C(revision 0)
+++ testsuite/g++.dg/template/crash109.C(revision 0)
@@ -0,0 +1,10 @@
+// PR c++/50864
+
+namespace impl
+{
+  template <class T> T create();
+}
+
+template <class T, class U, __SIZE_TYPE__
+ = sizeof(impl::create<T>() -> impl::create<U>())>  // { dg-error "not a class member" }
+struct foo;
Index: cp/parser.c
===
--- cp/parser.c (revision 180532)
+++ cp/parser.c (working copy)
@@ -5673,6 +5673,15 @@ cp_parser_postfix_dot_deref_expression (cp_parser
{
           if (name != error_mark_node && !BASELINK_P (name) && parser->scope)
             {
+              if (TREE_CODE (parser->scope) == NAMESPACE_DECL
+                  && TREE_CODE (postfix_expression) == ARROW_EXPR)
+                {
+                  error_at (token->location, "%<%D::%D%> is not a class member",
+                            parser->scope, name);
+                  parser->context->object_type = NULL_TREE;
+                  return error_mark_node;
+                }
+
               name = build_qualified_name (/*type=*/NULL_TREE,
                                            parser->scope,
   name,


Re: Re: [C++-11] User defined literals

2011-10-26 Thread 3dw4rd



Oct 26, 2011 03:39:09 PM, ja...@redhat.com wrote:

On 10/26/2011 02:00 AM, Ed Smith-Rowland wrote:
 The patch was bootstrapped and regtested on x86_64-linux-gnu.

Really?  I ran into a warning about the unused suffix parameter to 
interpret_integer.  So I've fixed that error.  I also added a couple of 
comments, and implemented the change to check_literal_operator_args that 
I wondered about a while back.  And checked it all in.

But we aren't quite done, I think: I notice that the lookup of operators 
doesn't match what's in 2.14.8.  For instance, I don't think this should 
be accepted:

double operator "" _foo (long long unsigned);
double d = 1.2_foo;

The lookup described in 2.14.8 involves looking through the overload set 
for a particular signature before doing normal overload resolution.

Also, we don't need to worry about argument-dependent lookup for these 
operators, since none of the arguments can have associated namespaces. 
So I think we can use lookup_name rather than lookup_function_nonclass, 
only look it up once in cp_userdef_numeric_literal, and then only build 
one call depending on the contents of the overload set.

Jason

Jason,

Thank you Jason and Tom for your help in getting this together and putting up 
with my slowness.
That warning about unused suffix didn't blow up the build.  I'm surprised 
Werror didn't kill it.

I'll check out these other issues as soon as I clean out all my work and 
rebuild on a clean tree.

I guess I need to look at what kind of number I get when processing a numeric 
literal.

Thanks again.



Re: Use of vector instructions in memmov/memset expanding

2011-10-26 Thread Michael Zolotukhin
Any questions on these patches? Are they ok for the trunk?

On 20 October 2011 12:37, Michael Zolotukhin
michael.v.zolotuk...@gmail.com wrote:
 And, finally, part with the tests.

 On 20 October 2011 12:36, Michael Zolotukhin
 michael.v.zolotuk...@gmail.com wrote:
 Back-end part of the patch is attached here.

 On 20 October 2011 12:35, Michael Zolotukhin
 michael.v.zolotuk...@gmail.com wrote:
 Middle-end part of the patch is attached.

 On 20 October 2011 12:34, Michael Zolotukhin
 michael.v.zolotuk...@gmail.com wrote:
 I fixed the tests, updated my branch, and fixed the bugs introduced
 during this process.
 Here is the fixed complete patch (the other parts will be sent in
 subsequent messages).

 The changes passed bootstrap and make check.

 On 29 September 2011 15:21, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 On Thu, Sep 29, 2011 at 03:14:40PM +0400, Michael Zolotukhin wrote:
 +/* { dg-options "-O2 -march=atom -mtune=atom -m64 -dp" } */

 The testcases are wrong, -m64 or -m32 should never appear in dg-options,
 instead if the testcase is specific to -m64, it should be guarded with
 /* { dg-do compile { target lp64 } } */
 resp. ia32 (or ilp32, depending on what exactly should be done for -mx32),
 if you have the same testcase for -m32 and -m64, but just want different
 scan-assembler for the two cases, then just guard the scan-assembler
 with lp64 resp. ia32/ilp32 target and add second one for the other target.

        Jakub

-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


Re: [C++-11] User defined literals

2011-10-26 Thread Jason Merrill

On 10/26/2011 04:11 PM, 3dw...@verizon.net wrote:

Thank you Jason and Tom for your help in getting this together and putting up 
with my slowness.


Thanks for all your work on this!

Jason



Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Jason Merrill

On 10/26/2011 04:04 PM, Paolo Carlini wrote:

The below tries to catch the problem very early, in
cp_parser_postfix_dot_deref_expression and apparently works fine, passes
the testsuite, etc. Is it too early? Is the check tight enough?


At a glance, it looks too early; it's valid to have namespace-qualified 
names after ->.


namespace A
{
  struct B
  {
    int i;
  };
};

A::B* b;
int i = b->A::B::i;

Jason


Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini

Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
    int i;
  };
};

A::B* b;
int i = b->A::B::i;
I was also trying to construct such an example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like.


Paolo.


[Ada] Missing error for invalid atomic component

2011-10-26 Thread Eric Botcazou
The compiler should issue the "atomic access cannot be guaranteed" error twice 
on the attached testcase, but it only issues it for the stand-alone variable.

Fixed thusly, tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/decl.c (gnat_to_gnu_field): Always check components
declared as atomic.  Move around conditionally executed code.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/specs/atomic1.ads: New test.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180423)
+++ gcc-interface/decl.c	(working copy)
@@ -6853,10 +6853,8 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 	}
 }
 
-  /* If we are packing the record and the field is BLKmode, round the
- size up to a byte boundary.  */
-  if (packed && TYPE_MODE (gnu_field_type) == BLKmode && gnu_size)
-gnu_size = round_up (gnu_size, BITS_PER_UNIT);
+  if (Is_Atomic (gnat_field))
+check_ok_for_atomic (gnu_field_type, gnat_field, false);
 
   if (Present (Component_Clause (gnat_field)))
 {
@@ -6946,9 +6944,6 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 	  gnu_pos = NULL_TREE;
 	}
 	}
-
-  if (Is_Atomic (gnat_field))
-	check_ok_for_atomic (gnu_field_type, gnat_field, false);
 }
 
   /* If the record has rep clauses and this is the tag field, make a rep
@@ -6961,7 +6956,14 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 }
 
   else
-gnu_pos = NULL_TREE;
+{
+  gnu_pos = NULL_TREE;
+
+  /* If we are packing the record and the field is BLKmode, round the
+	 size up to a byte boundary.  */
+  if (packed && TYPE_MODE (gnu_field_type) == BLKmode && gnu_size)
+	gnu_size = round_up (gnu_size, BITS_PER_UNIT);
+}
 
   /* We need to make the size the maximum for the type if it is
  self-referential and an unconstrained type.  In that case, we can't
-- { dg-do compile }

package Atomic1 is

  type Arr is array (Integer range <>) of Boolean;
  type UA is access all Arr;

  U : UA;
  pragma Atomic (U);  -- { dg-error "atomic access" }

  type R is record
    U : UA;
    pragma Atomic (U);  -- { dg-error "atomic access" }
  end record;

end Atomic1;


Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini

On 10/26/2011 10:30 PM, Paolo Carlini wrote:

Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
    int i;
  };
};

A::B* b;
int i = b->A::B::i;

I was also trying to construct that kind of example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like.

We have that parser->scope is a RECORD_TYPE and postfix_expression is an 
INDIRECT_REF.


Paolo.


[Ada] Internal error on unchecked union and representation clause

2011-10-26 Thread Eric Botcazou
The compiler aborts on a record type which contains a component of an unchecked 
union type whose size is smaller than 64 bits and which is subject to a 
representation clause that causes it not to start on a byte boundary.

We already have the machinery to handle this kind of composite component, but 
it was enabled only for regular record types.

Tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Record_Subtype>: Try to
make a packable type for fields of union types as well.
<is_type>: Use RECORD_OR_UNION_TYPE_P predicate.
(gnat_to_gnu_component_type): Try to make a packable type for fields
of union types as well.
(make_packable_type): Use RECORD_OR_UNION_TYPE_P predicate.
(maybe_pad_type): Try to make a packable type for fields of union types
as well.
(gnat_to_gnu_field): Likewise.
(is_variable_size): Use RECORD_OR_UNION_TYPE_P predicate.
(set_rm_size): Likewise.
(rm_size): Likewise.
* gcc-interface/misc.c (gnat_type_max_size): Likewise.
* gcc-interface/trans.c (add_decl_expr): Likewise.
* gcc-interface/utils.c (finish_record_type): Likewise.
* gcc-interface/utils2.c (build_simple_component_ref): Likewise.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/specs/unchecked_union.ads: Rename to...
* gnat.dg/specs/unchecked_union1.ads: ...this.
* gnat.dg/specs/unchecked_union2.ads: New test.


-- 
Eric Botcazou
-- { dg-do compile }

package Unchecked_Union2 is

   type Small_Int is range 0 .. 2**19 - 1;

   type R1 (B : Boolean := True) is record
      case B is
         when True  => Data1 : Small_Int;
         when False => Data2 : Small_Int;
      end case;
   end record;

   for R1 use record
      Data1 at 0 range 0 .. 18;
      Data2 at 0 range 0 .. 18;
   end record;
   for R1'Size use 24;

   pragma Unchecked_Union (R1);

   type R2 is record
      Data : R1;
   end record;

   for R2 use record
      Data at 0 range 3 .. 26;
   end record;

end Unchecked_Union2;
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180539)
+++ gcc-interface/decl.c	(working copy)
@@ -3302,7 +3302,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 			== INTEGER_CST)
 		  {
 			gnu_size = DECL_SIZE (gnu_old_field);
-			if (TREE_CODE (gnu_field_type) == RECORD_TYPE
+			if (RECORD_OR_UNION_TYPE_P (gnu_field_type)
 			    && !TYPE_FAT_POINTER_P (gnu_field_type)
 			    && host_integerp (TYPE_SIZE (gnu_field_type), 1))
 			  gnu_field_type
@@ -4645,13 +4645,11 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  tree size;
 
 	  /* If a size was specified, take it into account.  Otherwise
-		 use the RM size for records as the type size has already
-		 been adjusted to the alignment.  */
+		 use the RM size for records or unions as the type size has
+		 already been adjusted to the alignment.  */
 	  if (gnu_size)
 		size = gnu_size;
-	  else if ((TREE_CODE (gnu_type) == RECORD_TYPE
-			|| TREE_CODE (gnu_type) == UNION_TYPE
-			|| TREE_CODE (gnu_type) == QUAL_UNION_TYPE)
+	  else if (RECORD_OR_UNION_TYPE_P (gnu_type)
 		   && !TYPE_FAT_POINTER_P (gnu_type))
 		size = rm_size (gnu_type);
 	  else
@@ -5300,7 +5298,7 @@ gnat_to_gnu_component_type (Entity_Id gn
       && !Is_Bit_Packed_Array (gnat_array)
       && !Has_Aliased_Components (gnat_array)
       && !Strict_Alignment (gnat_type)
-      && TREE_CODE (gnu_type) == RECORD_TYPE
+      && RECORD_OR_UNION_TYPE_P (gnu_type)
       && !TYPE_FAT_POINTER_P (gnu_type)
       && host_integerp (TYPE_SIZE (gnu_type), 1))
 gnu_type = make_packable_type (gnu_type, false);
@@ -6357,9 +6355,7 @@ make_packable_type (tree type, bool in_r
   tree new_field_type = TREE_TYPE (old_field);
   tree new_field, new_size;
 
-  if ((TREE_CODE (new_field_type) == RECORD_TYPE
-	   || TREE_CODE (new_field_type) == UNION_TYPE
-	   || TREE_CODE (new_field_type) == QUAL_UNION_TYPE)
+  if (RECORD_OR_UNION_TYPE_P (new_field_type)
 	  && !TYPE_FAT_POINTER_P (new_field_type)
 	  && host_integerp (TYPE_SIZE (new_field_type), 1))
 	new_field_type = make_packable_type (new_field_type, true);
@@ -6369,9 +6365,7 @@ make_packable_type (tree type, bool in_r
 	 packable version of the record type, see finish_record_type.  */
   if (!DECL_CHAIN (old_field)
 	  && !TYPE_PACKED (type)
-	  && (TREE_CODE (new_field_type) == RECORD_TYPE
-	      || TREE_CODE (new_field_type) == UNION_TYPE
-	      || TREE_CODE (new_field_type) == QUAL_UNION_TYPE)
+	  && RECORD_OR_UNION_TYPE_P (new_field_type)
 	  && !TYPE_FAT_POINTER_P (new_field_type)
 	  && !TYPE_CONTAINS_TEMPLATE_P (new_field_type)
 	  && TYPE_ADA_SIZE (new_field_type))
@@ -6533,7 +6527,7 @@ maybe_pad_type (tree type, tree size, un
  between them and it might be hard to overcome afterwards, including
  at the RTL 

Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Steve Kargl
On Wed, Oct 26, 2011 at 09:55:09PM +0200, Paul Richard Thomas wrote:
 Dear Steve,
 
 Reaping implies that there is something about it that you want to keep
 :-)  Surely, weeding or herbicide spraying is better than reaping?

Surely, you have Halloween across the Pond, i.e., the Grim Reaper. :-)

 
 On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
 s...@troutmask.apl.washington.edu wrote:
  On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
  The attached patch reaps some code that is now dead
 
 I am sure that you are right but could you confirm that it is because
 of previous patches?

The tests in simplify.c can never be triggered because these
are already caught in check.c.  Consider,

  program a
  integer i
  i = 12234
  print *, ibclr(i, 123)
  end program a

laptop:kargl[207] gfc4x -o z a.f90
a.f90:4.20:

  print *, ibclr(i, 123)
                    1
Error: 'pos' at (1) must be less than BIT_SIZE('i')

This error message comes from check.c(gfc_check_bitfcn), and
it is the error message I get with 4.5.x without my dead.diff
patch.  The relevant lines of code in gfc_check_bitfcn are

  if (nonnegative_check ("pos", pos) == FAILURE)
    return FAILURE;

  if (less_than_bitsize1 ("i", i, "pos", pos, false) == FAILURE)
    return FAILURE;

This is the chunk of code I removed from simplify.c(gfc_simplify_ibclr) 

-  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
-    {
-      gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
-      return gfc_bad_expr;
-    }
+  gfc_extract_int (y, &pos);

The if-statement can never be true.  In fact, for the above Fortran
code, gfc_simplify_ibclr is never executed beyond its first line:

  if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
    return NULL;

Here, x is 'i' from the program and y is 'pos'.  So, both arguments
to ibclr must be constant for the code that I removed to even have
a chance to execute.

The other chunks of code I removed are similar in nature.  The 
checking done by code in check.c makes the checking done in
simplify.c useless bloat.  I'm just cleaning up the mess left
behind by the person who introduced gfc_check_bitfcn.
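
[Editorial sketch, not part of the patch.]  The argument above is about call
order: once a checking layer rejects bad arguments, a second guard in the
simplifier cannot fire.  A minimal C model of that ordering follows; the
names check_bit_position, simplify_ibclr, resolve_ibclr and the 32-bit
BIT_SIZE_INT are hypothetical stand-ins, not the gfortran functions.

```c
#include <stdbool.h>

/* Assumed bit size of the default integer kind in this sketch.  */
enum { BIT_SIZE_INT = 32 };

/* Stand-in for the argument checking done in check.c: reject a
   negative or too-large bit position before anything else runs.  */
bool check_bit_position(int pos)
{
    return pos >= 0 && pos < BIT_SIZE_INT;
}

/* Stand-in for the simplifier.  It stores the folded value in *result
   and returns true.  */
bool simplify_ibclr(int value, int pos, int *result)
{
    /* No "pos < 0" test here: callers only get this far after
       check_bit_position() succeeded, so such a test would be dead.  */
    *result = (int)((unsigned)value & ~(1u << pos));
    return true;
}

/* Hypothetical driver mirroring the order: check first, then simplify.  */
bool resolve_ibclr(int value, int pos, int *result)
{
    if (!check_bit_position(pos))
        return false;   /* the error is reported by the check layer */
    return simplify_ibclr(value, pos, result);
}
```

With this ordering, resolve_ibclr(12234, 123, &r) fails in the check layer
and never enters the simplifier, which is the situation the removed
if-statement was guarding against.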

laptop:kargl[217] svn log -r 160492 check.c |more

r160492 | kargl | 2010-06-09 09:24:59 -0700 (Wed, 09 Jun 2010) | 23 lines

2010-06-09  Steven G. Kargl  ka...@gcc.gnu.org

* fortran/intrinsic.c (add_functions): Change gfc_check_btest,
gfc_check_ibclr, and gfc_check_ibset to gfc_check_bitfcn.
* fortran/intrinsic.h: Remove prototypes for gfc_check_btest,
gfc_check_ibclr, and gfc_check_ibset.  Add prototype for
gfc_check_bitfcn.
* fortran/check.c (nonnegative_check, less_than_bitsize1, 
less_than_bitsize2): New functions.
(gfc_check_btest): Renamed to gfc_check_bitfcn.  Use
nonnegative_check and less_than_bitsize1.
(gfc_check_ibclr, gfc_check_ibset): Removed.
(gfc_check_ibits,gfc_check_mvbits): Use nonnegative_check and
less_than_bitsize1.

  due to my recent changes for ishftc in check.c.
  Regression tested on i686-*-freebsd.
 
 It looks to be OK, subject to the above, and is, I would guess,
 obvious in any case.
 
 Thanks for the patch

My regression test on i686-*-freebsd.

=== gfortran Summary ===

# of expected passes            39552
# of unexpected failures        8
# of unexpected successes       16
# of expected failures          41
# of unsupported tests          212

The 8 failures are due to entry_4.f90 and select_type_12.f03.

-- 
Steve


[Ada] Internal error on elaboration variable with -gnatct

2011-10-26 Thread Eric Botcazou
This is an internal error on the elaboration variable generated for an array of 
discriminated record type whose discriminant is constrained by a dynamic upper 
bound in -gnatct mode.  In this mode, the compiler doesn't generate code and 
gigi is invoked only to lay out types and back-annotate type information.

Tested on i586-suse-linux, applied on the mainline.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/decl.c (elaborate_expression_1): Add EXPR_PUBLIC_P local
variable.  Always create the elaboration variable, if any, as constant.


2011-10-26  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/specs/discr1.ads: New test.
* gnat.dg/specs/discr1_pkg.ads: New helper.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180540)
+++ gcc-interface/decl.c	(working copy)
@@ -6085,7 +6085,8 @@ static tree
 elaborate_expression_1 (tree gnu_expr, Entity_Id gnat_entity, tree gnu_name,
 			bool definition, bool need_debug)
 {
-  const bool expr_global_p = Is_Public (gnat_entity) || global_bindings_p ();
+  const bool expr_public_p = Is_Public (gnat_entity);
+  const bool expr_global_p = expr_public_p || global_bindings_p ();
   bool expr_variable_p, use_variable;
 
   /* In most cases, we won't see a naked FIELD_DECL because a discriminant
@@ -6153,11 +6154,10 @@ elaborate_expression_1 (tree gnu_expr, E
   if (use_variable || need_debug)
 {
   tree gnu_decl
-	= create_var_decl (create_concat_name (gnat_entity,
-	   IDENTIFIER_POINTER (gnu_name)),
-			   NULL_TREE, TREE_TYPE (gnu_expr), gnu_expr,
-			   !need_debug, Is_Public (gnat_entity),
-			   !definition, expr_global_p, NULL, gnat_entity);
+	= create_var_decl_1
+	  (create_concat_name (gnat_entity, IDENTIFIER_POINTER (gnu_name)),
+	   NULL_TREE, TREE_TYPE (gnu_expr), gnu_expr, true, expr_public_p,
+	   !definition, expr_global_p, !need_debug, NULL, gnat_entity);
 
   if (use_variable)
 	return gnu_decl;
-- { dg-do compile }
-- { dg-options "-gnatct" }

with Discr1_Pkg; use Discr1_Pkg;

package Discr1 is

  procedure Proc (V : Variable_String_Array);

end Discr1;
package Discr1_Pkg is

  Maximum_Length : Natural := 80;

  subtype String_Length is Natural range 0 .. Maximum_Length;

  type Variable_String (Length : String_Length := 0) is
    record
      S : String (1 .. Length);
    end record;

  type Variable_String_Array is array (Natural range <>) of Variable_String;

end Discr1_Pkg;


Re: [PATCH,fortran] Reap dead code

2011-10-26 Thread Paul Richard Thomas
Steve,

 Surely, you have Halloween across the Pond, i.e., the Grim Reaper. :-)

And what, pray, does the Grim Reaper hold???

The patch is OK.

Thanks

Paul



 On Wed, Oct 26, 2011 at 7:53 PM, Steve Kargl
 s...@troutmask.apl.washington.edu wrote:
  On Sat, Oct 22, 2011 at 01:16:14PM -0700, Steve Kargl wrote:
  The attached patch reaps some code that is now dead

 I am sure that you are right but could you confirm that it is because
 of previous patches?

 The tests in simplify.c can never be triggered because these
 are already caught in check.c.  Consider,

  program a
  integer i
  i = 12234
  print *, ibclr(i, 123)
  end program a

 laptop:kargl[207] gfc4x -o z a.f90
 a.f90:4.20:

  print *, ibclr(i, 123)
                    1
 Error: 'pos' at (1) must be less than BIT_SIZE('i')

 This error message comes from check.c(gfc_check_bitfcn), and
 it is the error message I get with 4.5.x without my dead.diff
 patch.  The relevant lines of code in gfc_check_bitfcn are

   if (nonnegative_check ("pos", pos) == FAILURE)
     return FAILURE;

   if (less_than_bitsize1 ("i", i, "pos", pos, false) == FAILURE)
     return FAILURE;

 This is the chunk of code I removed from simplify.c(gfc_simplify_ibclr)

 -  if (gfc_extract_int (y, &pos) != NULL || pos < 0)
 -    {
 -      gfc_error ("Invalid second argument of IBCLR at %L", &y->where);
 -      return gfc_bad_expr;
 -    }
 +  gfc_extract_int (y, &pos);

 The if-statement can never be true.  In fact, for the above Fortran
 code, gfc_simplify_ibclr is never executed beyond its first line:

   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
     return NULL;

 Here, x is 'i' from the program and y is 'pos'.  So, both arguments
 to ibclr must be constant for the code that I removed to even have
 a chance to execute.

 The other chunks of code I removed are similar in nature.  The
 checking done by code in check.c makes the checking done in
 simplify.c useless bloat.  I'm just cleaning up the mess left
 behind by the person who introduced gfc_check_bitfcn.

 laptop:kargl[217] svn log -r 160492 check.c |more
 
 r160492 | kargl | 2010-06-09 09:24:59 -0700 (Wed, 09 Jun 2010) | 23 lines

 2010-06-09  Steven G. Kargl  ka...@gcc.gnu.org

        * fortran/intrinsic.c (add_functions): Change gfc_check_btest,
        gfc_check_ibclr, and gfc_check_ibset to gfc_check_bitfcn.
        * fortran/intrinsic.h: Remove prototypes for gfc_check_btest,
        gfc_check_ibclr, and gfc_check_ibset.  Add prototype for
        gfc_check_bitfcn.
        * fortran/check.c (nonnegative_check, less_than_bitsize1,
        less_than_bitsize2): New functions.
        (gfc_check_btest): Renamed to gfc_check_bitfcn.  Use
        nonnegative_check and less_than_bitsize1.
        (gfc_check_ibclr, gfc_check_ibset): Removed.
        (gfc_check_ibits,gfc_check_mvbits): Use nonnegative_check and
        less_than_bitsize1.

  due to my recent changes for ishftc in check.c.
  Regression tested on i686-*-freebsd.

 It looks to be OK, subject to the above, and is, I would guess,
 obvious in any case.

 Thanks for the patch

 My regression test on i686-*-freebsd.

                === gfortran Summary ===

 # of expected passes            39552
 # of unexpected failures        8
 # of unexpected successes       16
 # of expected failures          41
 # of unsupported tests          212

 The 8 failures are due to entry_4.f90 and select_type_12.f03.

 --
 Steve




-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy


[trans-mem] fix problems with same body aliases

2011-10-26 Thread Aldy Hernandez
The merge broke base/complete dtor transactional clones that we 
originally implemented here:


http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00590.html

First, ipa_tm_execute() was putting __comp_dtor's into tm_callees, which 
wasn't happening pre-merge.  Handling this __comp_dtor caused an ICE in 
tree_function_versioning() because ENTRY_BLOCK_PTR_FOR_FUNCTION is 
invalid, since the CFG field for the original decl's cfun has not been set.


The patch below excludes such aliases from tm_callees since we can't 
handle them in ipa_tm_create_version.


Second, ipa_tm_create_version() marks aliases of the original decl as 
needed in callback_mark_needed().  Unfortunately, this needed bit does 
not persist until the end of compilation because whole program IPA will 
remove the bit here:


  /* Frontends and alias code marks nodes as needed before parsing is finished.
     We may end up marking as node external nodes where this flag is meaningless
     strip it.  */
  if (node->needed
      && (DECL_EXTERNAL (node->decl) || !node->analyzed))
    node->needed = 0;

Consequently, the alias will not show up in the .tm_clone_table.

The patch below sets the analyzed bit to keep this from happening. 
With it, g++.dg/tm/alias.C works again.
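
[Editorial sketch, not part of the patch.]  The interaction described above
can be modelled with three booleans.  The struct node_flags and the function
strip_meaningless_needed below are hypothetical reductions of the cgraph
node and of the quoted visibility-pass snippet; they only illustrate why
setting the analyzed bit keeps the needed bit alive.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for a call-graph node.  */
struct node_flags {
    bool needed;
    bool analyzed;
    bool external;   /* models DECL_EXTERNAL (node->decl) */
};

/* Mirrors the condition in the quoted snippet: the needed bit is
   dropped for external nodes and for nodes that were never analyzed.  */
void strip_meaningless_needed(struct node_flags *n)
{
    if (n->needed && (n->external || !n->analyzed))
        n->needed = false;
}
```

A node marked needed but not analyzed loses the bit; the same node with
analyzed set keeps it, which is the effect the patch relies on.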


No regressions elsewhere.

OK?
* trans-mem.c (ipa_tm_execute): Do not include aliases in
tm_callees.
(callback_mark_needed): Set analyzed bit.

Index: trans-mem.c
===
--- trans-mem.c (revision 180439)
+++ trans-mem.c (working copy)
@@ -4137,7 +4137,13 @@ callback_mark_needed (struct cgraph_node
   record_tm_clone_pair (node->decl, tm_alias);
 
   if (info->old_node->needed)
-   cgraph_mark_needed_node (cgraph_get_node (tm_alias));
+   {
+ struct cgraph_node *alias = cgraph_get_node (tm_alias);
+ cgraph_mark_needed_node (alias);
+ /* Needed so function_and_variable_visibility() won't reset
+the needed bit.  */
+ alias->analyzed = 1;
+   }
 }
   return false;
 }
@@ -4592,6 +4598,7 @@ ipa_tm_execute (void)
   /* For all local functions marked tm_callable, queue them.  */
   for (node = cgraph_nodes; node; node = node->next)
     if (is_tm_callable (node->decl)
+	&& !node->alias
 	&& cgraph_function_body_availability (node) >= AVAIL_OVERWRITABLE)
   {
d = get_cg_data (node);


Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Joseph S. Myers
The test gcc.dg/graphite/run-id-1.c requires more stack space than
Windows targets provide by default.  This patch arranges for the
appropriate -Wl,--stack option (as in config/mh-mingw and
config/mh-cygwin) to be used by this test on those targets.  Tested
with cross to i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  jos...@codesourcery.com

* gcc.dg/graphite/run-id-1.c: Pass -Wl,--stack,12582912 option for
Windows targets.

Index: gcc.dg/graphite/run-id-1.c
===
--- gcc.dg/graphite/run-id-1.c  (revision 180200)
+++ gcc.dg/graphite/run-id-1.c  (working copy)
@@ -1,3 +1,4 @@
+/* { dg-options "-Wl,--stack,12582912" { target *-*-mingw* *-*-cygwin* } } */
 /* { dg-require-effective-target size32plus } */
 
 void abort (void);

-- 
Joseph S. Myers
jos...@codesourcery.com


Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Joseph S. Myers
gcc.dg/tls/thr-cse-1.c tests that there are not two calls to
__emutls_get_address in the output.  Normally this just checks for
emutls_get_address.*emutls_get_address, but on some targets where the
compiler output declares the functions called in some way the testcase
has more specific tests.  i?86-*-mingw* produces a .def directive for
___emutls_get_address so also needs such a special case, which this
patch adds (only for i?86-*-mingw* as I don't know what the output
looks like for other Windows targets).  Tested with cross to
i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  jos...@codesourcery.com

* gcc.dg/tls/thr-cse-1.c: For i?86-*-mingw*, check for multiple
calls.

Index: gcc.dg/tls/thr-cse-1.c
===
--- gcc.dg/tls/thr-cse-1.c  (revision 180200)
+++ gcc.dg/tls/thr-cse-1.c  (working copy)
@@ -15,9 +15,10 @@
   return a;
 }
 
-/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" { target { ! { *-wrs-vxworks  *-*-darwin8  hppa*-*-hpux* spu-*-* } } } } } */
+/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_address.*" { target { ! { *-wrs-vxworks  *-*-darwin8  hppa*-*-hpux* spu-*-* i?86-*-mingw* } } } } } */
 /* { dg-final { scan-assembler-not "call\tL___emutls_get_address.stub.*call\tL___emutls_get_address.stub.*" { target *-*-darwin8 } } } */
 /* { dg-final { scan-assembler-not "(b,l|bl) __emutls_get_address.*(b,l|bl) __emutls_get_address.*" { target hppa*-*-hpux* } } } */
 /* { dg-final { scan-assembler-not "(brsl|brasl)\t__emutls_get_address.*(brsl|brasl)\t__emutls_get_address.*" { target spu-*-* } } } */
 /* { dg-final { scan-assembler-not "tls_lookup.*tls_lookup.*" { target *-wrs-vxworks } } } */
+/* { dg-final { scan-assembler-not "call\t___emutls_get_address.*call\t___emutls_get_address" { target i?86-*-mingw* } } } */
 

-- 
Joseph S. Myers
jos...@codesourcery.com


Skip gcc.target/i386/pad-4.c for Windows targets

2011-10-26 Thread Joseph S. Myers
gcc.target/i386/pad-4.c expects particular results from PIC code
generation - PIC in the sense of a GOT and a PLT, which is different
from the sense in which Windows code is always PIC.  This patch
arranges for it to be skipped on Windows targets, similarly to other
tests such as gcc.target/i386/pic-1.c.  Tested with cross to
i686-mingw32.  OK to commit?

2011-10-26  Joseph Myers  jos...@codesourcery.com

* gcc.target/i386/pad-4.c: Skip for Windows targets.

Index: gcc.target/i386/pad-4.c
===
--- gcc.target/i386/pad-4.c (revision 180200)
+++ gcc.target/i386/pad-4.c (working copy)
@@ -2,6 +2,7 @@
 /* { dg-require-effective-target ia32 } */
 /* { dg-require-effective-target fpic } */
 /* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } */
+/* { dg-skip-if "No Windows PIC" { *-*-mingw* *-*-cygwin } { "*" } { "" } } */
 /* { dg-options "-O2 -fomit-frame-pointer -march=atom -fPIC" } */
 /* { dg-final { scan-assembler-times "nop" 8 } } */
 /* { dg-final { scan-assembler-not "rep" } } */

-- 
Joseph S. Myers
jos...@codesourcery.com


Fix gcc.target/i386/pr37843-4.c for targets using leading underscores

2011-10-26 Thread Joseph S. Myers
gcc.target/i386/pr37843-4.c tests for the form calls to a function
foo take in the output.  On Windows targets this function has a
leading underscore and so the test fails.  This patch allows for that
underscore in the test.  Tested with cross to i686-mingw32.  OK to
commit?
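
[Editorial sketch, not part of the patch.]  The effect of adding the
optional underscore to the pattern can be tried outside the testsuite.
DejaGnu's scan-assembler uses Tcl regular expressions; the sketch below
uses POSIX regcomp/regexec as a stand-in, which is an assumption that holds
for a pattern this simple.  The helper name asm_matches is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when the POSIX extended regex PATTERN matches somewhere
   in TEXT.  A pattern that fails to compile counts as no match.  */
bool asm_matches(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool hit = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return hit;
}
```

For an assembler line such as "\tjmp\t_foo", asm_matches("jmp[\t ]*foo", ...)
finds nothing because of the underscore, while
asm_matches("jmp[\t ]*_?foo", ...) matches both "\tjmp\tfoo" and
"\tjmp\t_foo".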

2011-10-26  Joseph Myers  jos...@codesourcery.com

* gcc.target/i386/pr37843-4.c: Allow for leading underscores on
symbol names.

Index: gcc.target/i386/pr37843-4.c
===
--- gcc.target/i386/pr37843-4.c (revision 180200)
+++ gcc.target/i386/pr37843-4.c (working copy)
@@ -3,8 +3,8 @@
 /* { dg-options "-O2 -msse2 -mpreferred-stack-boundary=4 -mstackrealign" } */
 /* { dg-require-effective-target sse2 } */
 /* { dg-final { scan-assembler-not "andl\[\\t \]*\\$-16,\[\\t \]*%\[re\]?sp" } } */
-/* { dg-final { scan-assembler-not "call\[\\t \]*foo" } } */
-/* { dg-final { scan-assembler "jmp\[\\t \]*foo" } } */
+/* { dg-final { scan-assembler-not "call\[\\t \]*_?foo" } } */
+/* { dg-final { scan-assembler "jmp\[\\t \]*_?foo" } } */
 
 extern int foo (void);
 

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Canonicalize sparc movcc patterns such that operand 0 always appears in operand 4.

2011-10-26 Thread David Miller

The background is that I was putting together some test cases for
gcc.target/sparc that make sure the most optimal setcc sequences
are being generated.

When v9, and thus conditional moves, are enabled we sometimes fall
back to conditional moves even when the addx/subx sequence is more
efficient.  For example, we can do a = x <= y in 2 instructions when
x and y are both unsigned:

	cmp	y, x
	subx	%g0, -1, a

but with v9 enabled we get:

	cmp	x, y
	mov	0, a
	movleu	%icc, 1, a
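
[Editorial sketch, not part of the patch.]  The two sequences can be
compared with a small C model of the carry flag.  It assumes the usual
SPARC semantics: "cmp a, b" sets the carry when the unsigned subtraction
a - b borrows (a < b), and "subx rs1, imm, rd" computes rs1 - imm - carry.
Under those assumptions both sequences yield the unsigned comparison
x <= y.  The function names are hypothetical.

```c
#include <stdint.h>

/* Carry flag after "cmp a, b": set when a - b borrows, i.e. a < b
   for unsigned operands.  */
static unsigned carry_after_cmp(uint32_t a, uint32_t b)
{
    return a < b;
}

/* Model of "subx rs1, imm, rd": rd = rs1 - imm - carry (mod 2^32).  */
static uint32_t subx(uint32_t rs1, int32_t imm, unsigned carry)
{
    return rs1 - (uint32_t)imm - carry;
}

/* cmp y, x ; subx %g0, -1, a  */
uint32_t setcc_subx(uint32_t x, uint32_t y)
{
    unsigned c = carry_after_cmp(y, x);
    return subx(0, -1, c);       /* 0 - (-1) - c == 1 - c */
}

/* cmp x, y ; mov 0, a ; movleu %icc, 1, a  */
uint32_t setcc_movcc(uint32_t x, uint32_t y)
{
    uint32_t a = 0;
    if (x <= y)                  /* "leu": unsigned less-or-equal */
        a = 1;
    return a;
}
```

Both functions return the same value for every pair of inputs; the first
models two instructions, the second three.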

The reason is the permissive way in which we define the RTL
for conditional moves.

For the integer moves we mark both operands 3 and 4 using predicate
arith*_operand which accepts constants as well as registers.  The
constraints do in fact express how one of the two operands must be
equal to operand 0.  However, combine doesn't look at constraints to
determine if a combined instruction is recognized.

So combine thinks that insns of the form:

(set op0 (if_then_else (CMP X Y)
                       (const_int 0)
                       (const_int 1)))

are ok.  Reload later fixes things up by reloading one of the two
constants into operand 0, and that's how we end up with the above 3
instruction sequence.

Initially I tried to fix up the conditional move patterns by using a
match_dup 0 for operand 4 to show the combiner, in the RTL, that
operand 4 must be equal to operand 0.

But I learned that this is not legal RTL.  You can only use match_dup
in instructions after the first instruction of an expansion that
initially makes use of the operand.  Surprisingly this worked most of
the time. The failures I got were in a regrename def/use chain
assertion check when loop unrolling was enabled.

Anyways, instead what I do here is normalize all expansions of
conditional moves to be of the form:

(set op0 (if_then_else (cmp X Y)
                       op3
                       op0))

and in the instruction patterns I use register_operand and
constraint "0" for operand 4.

This is enough to keep the combiner from recognizing sequences
like the above.

Committed to trunk.

gcc/

* config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
* config/sparc/sparc.md (mov<I:mode>cc, mov<F:mode>cc): Call it.
(*mov<I:mode>_cc_v9): Normalize to expect operand 0 always in operand 4.
(*mov<I:mode>_cc_reg_sp64): Likewise.
(*movsf_cc_v9): Likewise.
(*movsf_cc_reg_sp64): Likewise.
(*movdf_cc_v9): Likewise.
(*movdf_cc_reg_sp64): Likewise.
(*movtf_cc_hq_v9): Likewise.
(*movtf_cc_reg_hq_sp64): Likewise.
(*movtf_cc_v9): Likewise.
(*movtf_cc_reg_sp64): Likewise.
* config/sparc/sparc.c (sparc_expand_conditional_move): New function.
(sparc_print_operand): Delete 'c' and 'd' handling, no longer used.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180542 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog                   |   17 
 gcc/config/sparc/sparc-protos.h |    1 +
 gcc/config/sparc/sparc.c        |   80 
 gcc/config/sparc/sparc.md       |  197 +--
 4 files changed, 144 insertions(+), 151 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cd1a236..d6d1382 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@
+2011-10-26  David S. Miller  da...@davemloft.net
+
+   * config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
+   * config/sparc/sparc.md (mov<I:mode>cc, mov<F:mode>cc): Call it.
+   (*mov<I:mode>_cc_v9): Normalize to expect operand 0 always in operand 4.
+   (*mov<I:mode>_cc_reg_sp64): Likewise.
+   (*movsf_cc_v9): Likewise.
+   (*movsf_cc_reg_sp64): Likewise.
+   (*movdf_cc_v9): Likewise.
+   (*movdf_cc_reg_sp64): Likewise.
+   (*movtf_cc_hq_v9): Likewise.
+   (*movtf_cc_reg_hq_sp64): Likewise.
+   (*movtf_cc_v9): Likewise.
+   (*movtf_cc_reg_sp64): Likewise.
+   * config/sparc/sparc.c (sparc_expand_conditional_move): New function.
+   (sparc_print_operand): Delete 'c' and 'd' handling, no longer used.
+
 2011-10-26  Eric Botcazou  ebotca...@adacore.com
 
* reload.c (reload_inner_reg_of_subreg): Change type of return value
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index bb6fb07..108e105 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -107,6 +107,7 @@ extern void sparc_expand_compare_and_swap_12 (rtx, rtx, rtx, rtx);
 extern const char *output_v8plus_mult (rtx, rtx *, const char *);
 extern void sparc_expand_vector_init (rtx, rtx);
 extern void sparc_expand_vec_perm_bmask(enum machine_mode, rtx);
+extern bool sparc_expand_conditional_move (enum machine_mode, rtx *);
 #endif /* RTX_CODE */
 
 #endif /* __SPARC_PROTOS_H__ */
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 

Re: Fix gcc.target/i386/pr37843-4.c for targets using leading underscores

2011-10-26 Thread Mike Stump
On Oct 26, 2011, at 2:18 PM, Joseph S. Myers wrote:
 gcc.target/i386/pr37843-4.c tests for the form calls to a function
 foo take in the output.  On Windows targets this function has a
 leading underscore and so the test fails.

 OK to commit?

Ok.


Re: Skip gcc.target/i386/pad-4.c for Windows targets

2011-10-26 Thread Mike Stump
On Oct 26, 2011, at 2:18 PM, Joseph S. Myers wrote:
 gcc.target/i386/pad-4.c expects particular results from PIC code
 generation - PIC in the sense of a GOT and a PLT, which is different
 from the sense in which Windows code is always PIC.  This patch
 arranges for it to be skipped on Windows targets, similarly to other
 tests such as gcc.target/i386/pic-1.c.  Tested with cross to
 i686-mingw32.  OK to commit?

Ok.  This one makes my head hurt, just a little.


Re: [C++ Patch / RFC] PR 50864

2011-10-26 Thread Paolo Carlini

On 10/26/2011 10:35 PM, Paolo Carlini wrote:

On 10/26/2011 10:30 PM, Paolo Carlini wrote:

Hi,
At a glance, it looks too early; it's valid to have 
namespace-qualified names after ->.


namespace A
{
  struct B
  {
    int i;
  };
};

A::B* b;
int i = b->A::B::i;

I was also trying to construct that kind of example myself... but my 
patch does not regress on the testcase you wrote down. I can tell you 
exactly why, if you like.

We have that parser->scope is a RECORD_TYPE and postfix_expression is 
an INDIRECT_REF.

In this case, for example (like PR50870):

namespace impl
{
  struct inner
  {
    template <class T> T create();
  };
}

template <class T, class U, __SIZE_TYPE__
  = sizeof(impl::inner::create<T>() -> impl::inner::create<U>())>
struct foo;

we are also OK: the code is accepted because name is a BASELINK and the new 
check isn't even reached (postfix_expression would be an ARROW_EXPR, but 
parser->scope is again a RECORD_TYPE).  More generally, in all the legal 
tests I tried by hand (outside the testsuite), parser->scope is always a 
RECORD_TYPE when we get there.


But if you feel more comfortable about performing the check elsewhere, I 
can try that of course.


Paolo.



[RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-26 Thread Jakub Jelinek
Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.

I chose to add 4 extra (internal only) gather builtins in addition to the
16 ones needed for the intrinsics, because the builtins using different
sizes of the index vs. src/mask/ret vectors would complicate the generic
code way too much (we don't have a VEC_SELECT_EXPR nor VEC_CONCAT_EXPR
and interleaving/extract even/odd is undesirable here).
With these 4 extra builtins the generic code always sees same-sized
src/mask/ret vs. index vectors.  Either they have the same number of units,
and then just one vgather* insn is needed; or the index has more elements
(int index and double/long long load), and then for one loaded index vector
there is one vgather* insn using the first half of the index vector and
one using the second half of that vector; or there is a long index with a
float/int load, and then two index vectors are processed by two vgather*
insns and the result is the concatenation of the first halves of both results.
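
[Editorial sketch, not part of the patch.]  For readers unfamiliar with the
term, the scalar loop shape that a gather load implements is shown below,
followed by a model of the "int index, double load" case in which one
8-element index vector feeds two 4-element gathers.  The lane counts assume
256-bit vectors; the function names are hypothetical and no vector
intrinsics are used.

```c
#include <stddef.h>

/* Scalar reference for a gather load: each output element is read
   through an index taken from a second array.  */
void gather_f64(double *restrict dst, const double *restrict src,
                const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}

/* Assumed lane counts for 256-bit vectors: 8 ints, 4 doubles.  */
enum { IDX_LANES = 8, DATA_LANES = 4 };

/* Model of the "index has more elements" case: one loaded index
   vector is consumed in two halves, one gather per half.  */
void gather_block(double dst[IDX_LANES], const double *src,
                  const int idx[IDX_LANES])
{
    for (int half = 0; half < 2; half++)
        for (int lane = 0; lane < DATA_LANES; lane++) {
            int k = half * DATA_LANES + lane;
            dst[k] = src[idx[k]];
        }
}
```

Both functions compute the same values; the second only makes the split
into two halves of the index vector explicit.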

All this is so far unconditional only, we'd need some tree representation
of conditional loads resp. conditional stores (and could already with AVX
use vmaskmov* insns for that).
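
For illustration, the simplest loop shape this is aimed at is an indexed
load.  The sketch below is not taken from the avx2-gather-*.c testcases; the
function name, the parameter names and the use of the GNU __restrict
qualifier (to rule out overlap with the store) are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Scalar form of a gather: each iteration loads src[idx[i]], so the loaded
// addresses are not contiguous.  With int indices and double data, a 256-bit
// index vector holds 8 elements but a 256-bit data vector only 4, which is
// the "index has more elements" case.
// __restrict (a GNU extension) says dst does not overlap the arrays read.
void gather_copy(double *__restrict dst, const double *__restrict src,
                 const int *__restrict idx, std::size_t n)
{
  assert(dst != nullptr && src != nullptr && idx != nullptr);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[idx[i]];
}
```

Whether a given compiler actually emits vgather* for such a loop depends on
the options in use and on the alias analysis succeeding.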

Bootstrapped/regtested on x86_64-linux and i686-linux, testcases tested
also under sde.  Ok for trunk?

2011-10-26  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/50789
* tree-vect-stmts.c (process_use): Add force argument, avoid
exist_non_indexing_operands_for_use_p check if true.
(vect_mark_stmts_to_be_vectorized): Adjust callers.  Handle
STMT_VINFO_GATHER_P.
(gen_perm_mask): New function.
(perm_mask_for_reverse): Use it.
(reverse_vec_element): Rename to...
(permute_vec_elements): ... this.  Add Y and MASK_VEC arguments,
generalize for any permutations.
(vectorizable_load): Adjust caller.  Handle STMT_VINFO_GATHER_P.
* target.def (TARGET_VECTORIZE_BUILTIN_GATHER): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_GATHER): Document it.
* doc/tm.texi: Regenerate.
* tree-data-ref.c (initialize_data_dependence_relation,
compute_self_dependence): No longer static.
* tree-data-ref.h (initialize_data_dependence_relation,
compute_self_dependence): New prototypes.
* tree-vect-data-refs.c (vect_check_gather): New function.
(vect_analyze_data_refs): Detect possible gather load data
refs.
* tree-vectorizer.h (struct _stmt_vec_info): Add gather_p field.
(STMT_VINFO_GATHER_P): Define.
(vect_check_gather): New prototype.
* config/i386/i386-builtin-types.def: Add types for alternate
gather builtins.
* config/i386/sse.md (AVXMODE48P_DI): Remove.
(VEC_GATHER_MODE): Rename mode_attr to...
(VEC_GATHER_IDXSI): ... this.
(VEC_GATHER_IDXDI, VEC_GATHER_SRCDI): New mode_attrs.
(avx2_gathersimode, *avx2_gathersimode): Use VEC_GATHER_IDXSI
instead of VEC_GATHER_MODE.
(avx2_gatherdimode): Use VEC_GATHER_IDXDI instead of
AVXMODE48P_DI and VEC_GATHER_SRCDI instead of VEC_GATHER_MODE
on src and mask operands.
(*avx2_gatherdimode): Likewise.  Use VEC_GATHER_MODE iterator
instead of AVXMODE48P_DI.
(avx2_gatherdimode256, *avx2_gatherdimode256): Removed.
* config/i386/i386.c (enum ix86_builtins): Add
IX86_BUILTIN_GATHERALTSIV4DF, IX86_BUILTIN_GATHERALTDIV8SF,
IX86_BUILTIN_GATHERALTSIV4DI and IX86_BUILTIN_GATHERALTDIV8SI.
(ix86_init_mmx_sse_builtins): Create those builtins.
(ix86_expand_builtin): Handle those builtins and adjust expansions
of other gather builtins.
(ix86_vectorize_builtin_gather): New function.
(TARGET_VECTORIZE_BUILTIN_GATHER): Define.

* gcc.target/i386/avx2-gather-1.c: New test.
* gcc.target/i386/avx2-gather-2.c: New test.
* gcc.target/i386/avx2-gather-3.c: New test.

--- gcc/tree-vect-stmts.c.jj2011-10-26 14:19:11.0 +0200
+++ gcc/tree-vect-stmts.c   2011-10-26 16:54:23.0 +0200
@@ -332,6 +332,8 @@ exist_non_indexing_operands_for_use_p (t
- LIVE_P, RELEVANT - enum values to be set in the STMT_VINFO of the stmt
  that defined USE.  This is done by calling mark_relevant and passing it
  the WORKLIST (to add DEF_STMT to the WORKLIST in case it is relevant).
+   - FORCE is true if exist_non_indexing_operands_for_use_p check shouldn't
+ be performed.
 
Outputs:
Generally, LIVE_P and RELEVANT are used to define the liveness and
@@ -351,7 +353,8 @@ exist_non_indexing_operands_for_use_p (t
 
 static bool
 process_use (gimple stmt, tree use, loop_vec_info loop_vinfo, bool live_p,
-enum vect_relevant relevant, VEC(gimple,heap) **worklist)
+enum vect_relevant relevant, 

Re: Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Kai Tietz
Hi,

patch looks fine for mingw 32-bit.  With a small nit new test passes
for 64-bit mingw version, too.  The cygwin-part looks like the mingw
one.

2011/10/26 Joseph S. Myers jos...@codesourcery.com:
 gcc.dg/tls/thr-cse-1.c tests that there are not two calls to
 __emutls_get_address in the output.  Normally this just checks for
 emutls_get_address.*emutls_get_address, but on some targets where the
 compiler output declares the functions called in some way the testcase
 has more specific tests.  i?86-*-mingw* produces a .def directive for
 ___emutls_get_address so also needs such a special case, which this
 patch adds (only for i?86-*-mingw* as I don't know what the output
 looks like for other Windows targets).  Tested with cross to
 i686-mingw32.  OK to commit?

 2011-10-26  Joseph Myers  jos...@codesourcery.com

        * gcc.dg/tls/thr-cse-1.c: For i?86-*-mingw*, check for multiple
        calls.

 Index: gcc.dg/tls/thr-cse-1.c
 ===
 --- gcc.dg/tls/thr-cse-1.c      (revision 180200)
 +++ gcc.dg/tls/thr-cse-1.c      (working copy)
 @@ -15,9 +15,10 @@
   return a;
  }

 -/* { dg-final { scan-assembler-not 
 emutls_get_address.*emutls_get_address.* { target { ! { *-wrs-vxworks  
 *-*-darwin8  hppa*-*-hpux* spu-*-* } } } } } */
 +/* { dg-final { scan-assembler-not 
 emutls_get_address.*emutls_get_address.* { target { ! { *-wrs-vxworks  
 *-*-darwin8  hppa*-*-hpux* spu-*-* i?86-*-mingw* } } } } } */
  /* { dg-final { scan-assembler-not 
 call\tL___emutls_get_address.stub.*call\tL___emutls_get_address.stub.* { 
 target *-*-darwin8 } } } */
  /* { dg-final { scan-assembler-not (b,l|bl) __emutls_get_address.*(b,l|bl) 
 __emutls_get_address.* { target hppa*-*-hpux* } } } */
  /* { dg-final { scan-assembler-not 
 (brsl|brasl)\t__emutls_get_address.*(brsl|brasl)\t__emutls_get_address.* { 
 target spu-*-* } } } */
  /* { dg-final { scan-assembler-not tls_lookup.*tls_lookup.* { target 
 *-wrs-vxworks } } } */

+/* { dg-final { scan-assembler-not "call\t___emutls_get_address.*call\t___emutls_get_address" { target { i?86-*-mingw* i?86-*-cygwin* } } } } */
+/* { dg-final { scan-assembler-not "call\t__emutls_get_address.*call\t__emutls_get_address" { target x86_64-*-mingw* } } } */

With this change the test should pass for all IA mingw targets and for cygwin, too.
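
What these scan-assembler-not directives check is, roughly, that the
assembler output does not contain two calls to the emutls helper.  A small
sketch of that check outside DejaGnu follows; count_occurrences and
has_duplicate_call are hypothetical helpers, and plain substring counting
stands in for the regexp matching the testsuite really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of needle in text.
std::size_t count_occurrences(const std::string &text, const std::string &needle)
{
  if (needle.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size()))
    ++count;
  return count;
}

// True if the call appears at least twice, i.e. the situation the
// "call.*call" pattern is written to reject.
bool has_duplicate_call(const std::string &asm_text, const std::string &call)
{
  assert(!call.empty());
  return count_occurrences(asm_text, call) >= 2;
}
```

Note that the 32-bit spelling has three leading underscores and the 64-bit
one has two, so the two needles do not match each other's output.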

Thanks,
Kai


Re: Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Kai Tietz
2011/10/26 Joseph S. Myers jos...@codesourcery.com:
 The test gcc.dg/graphite/run-id-1.c requires more stack space than
 Windows targets provide by default.  This patch arranges for the
 appropriate -Wl,--stack option (as in config/mh-mingw and
 config/mh-cygwin) to be used by this test on those targets.  Tested
 with cross to i686-mingw32.  OK to commit?

 2011-10-26  Joseph Myers  jos...@codesourcery.com

        * gcc.dg/graphite/run-id-1.c: Pass -Wl,--stack,12582912 option for
        Windows targets.

 Index: gcc.dg/graphite/run-id-1.c
 ===
 --- gcc.dg/graphite/run-id-1.c  (revision 180200)
 +++ gcc.dg/graphite/run-id-1.c  (working copy)
 @@ -1,3 +1,4 @@
 +/* { dg-options "-Wl,--stack,12582912" { target *-*-mingw* *-*-cygwin* } } */
  /* { dg-require-effective-target size32plus } */

  void abort (void);

 --
 Joseph S. Myers
 jos...@codesourcery.com

From my side patch is ok.

Thanks,
Kai


Re: [PATCH][RFC] Re-write LTO option merging

2011-10-26 Thread Mike Stump
On Oct 26, 2011, at 6:10 AM, Richard Guenther wrote:
 This completely rewrites LTO option merging.

 Any comments?

Wondering if it breaks darwin?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50876

We have:

   %{Zmultiply_defined*:-multiply_defined %*} 

in LINK_SPEC.

and:

; Various linker options have a -Z added so that they can get to specs
; processing without interference.  Note that an option name with a
; prefix that matches another option name, that also takes an
; argument, being mapped to a -Z linker option, needs to be modified
; so the prefix is different, otherwise a '*' after the shorter option
; will match with the longer one.

multiply_defined
Driver RejectNegative Separate Alias(Zmultiply_defined)

in the .opt file.  We may well be the only port that does these sorts of 
tricks.  Essentially, we want to allow linker arguments on the command line, 
and have them never hit any machine independent option processing code just 
because they match (prefix match) some other option.  We accomplish this by 
hiding them early, doing all the normal processing, and then way late, just 
before link, turning them back into the normal names.
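
The hide-early, restore-late idea can be modelled in a few lines.  This is
only a toy model of the renaming: in the real driver the mapping is done by
the Alias() entries in the .opt file and by LINK_SPEC, and
hide_linker_option / restore_linker_option are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Early: "-multiply_defined" -> "-Zmultiply_defined", so that generic option
// handling never sees the original spelling.
std::string hide_linker_option(const std::string &opt)
{
  assert(opt.size() > 1 && opt[0] == '-');
  return "-Z" + opt.substr(1);
}

// Late, just before the link: "-Zmultiply_defined" -> "-multiply_defined".
// Options without the -Z prefix are passed through unchanged.
std::string restore_linker_option(const std::string &opt)
{
  if (opt.compare(0, 2, "-Z") == 0)
    return "-" + opt.substr(2);
  return opt;
}
```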


Re: Fix gcc.dg/graphite/run-id-1.c for Windows targets

2011-10-26 Thread Mike Stump
On Oct 26, 2011, at 2:17 PM, Joseph S. Myers wrote:
 The test gcc.dg/graphite/run-id-1.c requires more stack space than
 Windows targets provide by default.  This patch arranges for the
 appropriate -Wl,--stack option (as in config/mh-mingw and
 config/mh-cygwin) to be used by this test on those targets.  Tested
 with cross to i686-mingw32.  OK to commit?

Yeah, I noticed these testcases (there were quite a few that hit me) on my port 
with a small stack.  I fixed it by increasing my stack space for testing, so 
that I didn't have to go hit up testcase after testcase.  I'm wondering if you 
would get better mileage by just having this in the CFLAGS bits in dejagnu 
someplace, then at least, you'd never have to do this again.

I'm fine with either fix you want to put in.


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

On 10/26/2011 06:27 PM, Jason Merrill wrote:

OK.
I re-opened this one because: 1- We may want to fix it in 4_6-branch 
too, it's a regression there too; 2- We are still handling incorrectly 
the template impl case. For the latter a variant of my old idea still 
works, fwiw.


Thanks,
Paolo.

/
Index: pt.c
===
--- pt.c(revision 180532)
+++ pt.c(working copy)
@@ -13706,12 +13706,13 @@ tsubst_copy_and_build (tree t,
/* Remember that there was a reference to this entity.  */
if (DECL_P (object))
  mark_used (object);
-   object_type = TREE_TYPE (object);
+   object_type = (TREE_CODE (object) == ARROW_EXPR
+  ? TREE_OPERAND (object, 0) : TREE_TYPE (object));
 
member = TREE_OPERAND (t, 1);
if (BASELINK_P (member))
  member = tsubst_baselink (member,
-   non_reference (TREE_TYPE (object)),
+   non_reference (object_type),
args, complain, in_decl);
else
  member = tsubst_copy (member, args, complain, in_decl);


Re: Fix gcc.dg/tls/thr-cse-1.c for MinGW target

2011-10-26 Thread Joseph S. Myers
On Wed, 26 Oct 2011, Kai Tietz wrote:

 Hi,
 
 patch looks fine for mingw 32-bit.  With a small nit new test passes
 for 64-bit mingw version, too.  The cygwin-part looks like the mingw
 one.

I've committed my original patch as what I can test - you may wish to 
commit followups after testing them on other Windows targets (and indeed 
more generally look at whatever Windows-specific test failures there are 
that I haven't fixed).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

... forgot the testcase, sorry.

Paolo.

//
// PR c++/50870
// { dg-options "-std=gnu++0x" }

template <class V>
  struct impl
  {
    template <class T> static T create();
  };

template <class T, class U, class V, class
	  = decltype(impl<V>::template create<T>()
		 -> impl<V>::template create<U>())>
struct tester { };

tester<impl<float>*, int, float> ti;

template <class T, class U, class V, class
	  = decltype(impl<V>::template create<T>()
		 -> impl<V>::template create<U>())>
int test() { return 0; }

int i = test<impl<float>*, int, float>();


Re: PR c++/30195

2011-10-26 Thread Jason Merrill

On 10/24/2011 05:53 PM, Fabien Chêne wrote:

After looking into it a bit more, I now see things differently. The
target_decl is in fact put into the CLASSTYPE_METHOD_VEC at
instantiation time, via handle_using_decl, with the correct access.


Great.


The problem is that while parsing the template (before instantiating),
it is not yet there and an error is immediately raised.


Ah, I see.  We need lookup_member to find the USING_DECL, but if we're 
skipping function USING_DECLs in lookup_field_1 and they aren't in 
CLASSTYPE_METHOD_VEC in templates, we don't find it either way.



I only see three solutions:
1) perform an additional lookup for using declarations that designate
functions in lookup_fnfields_slot to find something even if the
CLASSTYPE_METHOD_VEC is empty (what I implemented in the previous
patch)
2) inhibit the lookup error at some point.
3) put the decl into the CLASSTYPE_METHOD_VEC before instantiation



Honestly, I don't know if 2) is realistic, or how to implement it ...
3) seems complicated: in finish_member_declaration, we must put
the decl away into TYPE_FIELDS or TYPE_METHODS, but we would like to put
the target_decl into TYPE_METHODS (and call add_method), and at the
same time put its using decl into TYPE_FIELDS... I don't know if it
can theoretically work, but having tried it, I think it would need many
adjustments.


I think #3 is the way to go.  Isn't that what happens in non-templates? 
 We don't try to instantiate the CLASSTYPE_METHOD_VEC, we build up a 
whole new one from the decl list, so having it in both places shouldn't 
be a problem for instantiation.


Jason


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Jason Merrill

On 10/26/2011 06:20 PM, Paolo Carlini wrote:

I re-opened this one because: 1- We may want to fix it in 4_6-branch
too, it's a regression there too; 2- We are still handling incorrectly
the template impl case. For the latter a variant of my old idea still
works, fwiw.



-   object_type = TREE_TYPE (object);
+   object_type = (TREE_CODE (object) == ARROW_EXPR
+  ? TREE_OPERAND (object, 0) : TREE_TYPE (object));


This is still wrong.  Why not use the same patch for 4.6 as for trunk?

Jason



Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini

On 10/27/2011 12:48 AM, Jason Merrill wrote:

On 10/26/2011 06:20 PM, Paolo Carlini wrote:

I re-opened this one because: 1- We may want to fix it in 4_6-branch
too, it's a regression there too; 2- We are still handling incorrectly
the template impl case. For the latter a variant of my old idea still
works, fwiw.



-object_type = TREE_TYPE (object);
+object_type = (TREE_CODE (object) == ARROW_EXPR
+   ? TREE_OPERAND (object, 0) : TREE_TYPE (object));

This is still wrong.  Why not use the same patch for 4.6 as for trunk?
I don't understand, sorry: both mainline and 4_6-branch do not handle 
correctly both testcases, template impl and not. If you want to look 
into it personally, I can unassign myself, no problem, really.


Paolo.


Re: [C++ Patch / RFC] PR 50870

2011-10-26 Thread Paolo Carlini
.. maybe my message wasn't clear, sorry, I'm a bit tired (here it's 
late): I meant to say that the non_reference tweak fixes the 
non-template impl class case, but something more is needed for a 
template impl (thus the new testcase). And, additionally, this issue is 
a [4.6/4.7 Regression], thus, post 4.6.2, we may be interested in back 
porting something.


Let me know if you need additional details or anything else, or if you want 
me to look into a different way to attack the template case.


Paolo.


[PATCH, devirtualization] Detect the new type in type change detection

2011-10-26 Thread Martin Jambor
Hi,

I've been asked by Maxim Kuvyrkov to revive the following patch which
has not made it to 4.6.  Currently, when type based devirtualization
detects a potential type change, it simply gives up on gathering any
information on the object in question.  This patch adds an attempt to
actually detect the new type after the change.
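
In rough terms, the bookkeeping is: remember the type seen at a vtable
pointer store, and flag the result as unusable if a later store disagrees.
Below is a simplified model, with strings standing in for trees and an empty
string for NULL_TREE.  The field names mirror those in the patch, but
record_vtbl_store and new_type_known are hypothetical names, and the exact
condition under which the detected type is used is an assumption.

```cpp
#include <cassert>
#include <string>

struct type_change_info
{
  std::string known_current_type;          // "" models NULL_TREE
  bool type_maybe_changed = false;
  bool multiple_types_encountered = false;
};

// Called for each statement that may store to the vtable pointer.  TYPE is
// the type identified from the store, or "" if it could not be identified.
void record_vtbl_store(type_change_info &tci, const std::string &type)
{
  if (tci.type_maybe_changed && type != tci.known_current_type)
    tci.multiple_types_encountered = true;
  tci.known_current_type = type;
  tci.type_maybe_changed = true;
}

// Assumed usability rule: a change was seen, every store agreed, and the
// type could actually be identified.
bool new_type_known(const type_change_info &tci)
{
  return tci.type_maybe_changed && !tci.multiple_types_encountered
         && !tci.known_current_type.empty();
}
```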

Maxim claimed this (and another patch I'll post tomorrow) noticeably
improved performance of some real code.  I can only offer a rather
artificial example in the attachment.  When the constructors are
inlined but the function multiply_matrices is not, this patch makes
the produced executable run for only 7 seconds instead of about 20 on
my 4 year old i686 desktop (with -Ofast).

Anyway, the patch passes bootstrap and testsuite on x86_64-linux.
What do you think, is it a good idea for trunk now?

Thanks,

Martin


2011-10-21  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (type_change_info): New fields object, known_current_type
and multiple_types_encountered.
(extr_type_from_vtbl_ptr_store): New function.
(check_stmt_for_type_change): Use it, set multiple_types_encountered if
the result is different from the previous one.
(detect_type_change): Renamed to detect_type_change_1. New parameter
comp_type.  Set up new fields in tci, build known type jump
functions if the new type can be identified.
(detect_type_change): New function.
* tree.h (DECL_CONTEXT): Comment new use.

* testsuite/g++.dg/ipa/devirt-c-1.C: Add dump scans.
* testsuite/g++.dg/ipa/devirt-c-2.C: Likewise.
* testsuite/g++.dg/ipa/devirt-c-7.C: New test.


Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -271,8 +271,17 @@ ipa_print_all_jump_functions (FILE *f)
 
 struct type_change_info
 {
+  /* The declaration or SSA_NAME pointer of the base that we are checking for
+ type change.  */
+  tree object;
+  /* If we actually can tell the type that the object has changed to, it is
+ stored in this field.  Otherwise it remains NULL_TREE.  */
+  tree known_current_type;
   /* Set to true if dynamic type change has been detected.  */
   bool type_maybe_changed;
+  /* Set to true if multiple types have been encountered.  known_current_type
+ must be disregarded in that case.  */
+  bool multiple_types_encountered;
 };
 
 /* Return true if STMT can modify a virtual method table pointer.
@@ -338,6 +347,49 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
   return true;
 }
 
+/* If STMT can be proved to be an assignment to the virtual method table
+   pointer of ANALYZED_OBJ and the type associated with the new table
+   identified, return the type.  Otherwise return NULL_TREE.  */
+
+static tree
+extr_type_from_vtbl_ptr_store (gimple stmt, tree analyzed_obj)
+{
+  tree lhs, t, obj;
+
+  if (!is_gimple_assign (stmt))
+return NULL_TREE;
+
+  lhs = gimple_assign_lhs (stmt);
+
+  if (TREE_CODE (lhs) != COMPONENT_REF)
+return NULL_TREE;
+  obj = lhs;
+
+  if (!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
+return NULL_TREE;
+
+  do
+{
+  obj = TREE_OPERAND (obj, 0);
+}
+  while (TREE_CODE (obj) == COMPONENT_REF);
+  if (TREE_CODE (obj) == MEM_REF)
+obj = TREE_OPERAND (obj, 0);
+  if (TREE_CODE (obj) == ADDR_EXPR)
+obj = TREE_OPERAND (obj, 0);
+  if (obj != analyzed_obj)
+return NULL_TREE;
+
+  t = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (t) != ADDR_EXPR)
+return NULL_TREE;
+  t = get_base_address (TREE_OPERAND (t, 0));
+  if (!t || TREE_CODE (t) != VAR_DECL || !DECL_VIRTUAL_P (t))
+return NULL_TREE;
+
+  return DECL_CONTEXT (t);
+}
+
 /* Callback of walk_aliased_vdefs and a helper function for
detect_type_change to check whether a particular statement may modify
the virtual table pointer, and if possible also determine the new type of
@@ -352,6 +404,12 @@ check_stmt_for_type_change (ao_ref *ao A
 
   if (stmt_may_be_vtbl_ptr_store (stmt))
 {
+  tree type;
+      type = extr_type_from_vtbl_ptr_store (stmt, tci->object);
+      if (tci->type_maybe_changed
+          && type != tci->known_current_type)
+        tci->multiple_types_encountered = true;
+      tci->known_current_type = type;
       tci->type_maybe_changed = true;
   return true;
 }
@@ -359,19 +417,19 @@ check_stmt_for_type_change (ao_ref *ao A
 return false;
 }
 
-/* Detect whether the dynamic type of ARG has changed (before callsite CALL) by
-   looking for assignments to its virtual table pointer.  If it is, return true
-   and fill in the jump function JFUNC with relevant type information or set it
-   to unknown.  ARG is the object itself (not a pointer to it, unless
-   dereferenced).  BASE is the base of the memory access as returned by
-   get_ref_base_and_extent, as is the offset.  */
+
+
+/* Like detect_type_change but with extra argument COMP_TYPE which will become
+   the component type part of new 

Re: libstdc++/50862 fix deadlock in condition_variable_any

2011-10-26 Thread Jonathan Wakely
PR libstdc++/50862
* include/std/condition_variable (condition_variable_any::wait): Avoid
terminating if relocking user mutex throws during stack-unwinding.
* testsuite/30_threads/condition_variable_any/50862.cc: Add dg-require.

Fixes two more issues pointed out in the PR comments.

Tested x86_64-linux, committed to trunk.
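
For readers outside libstdc++, the same idea in standalone form: a scoped
unlock whose destructor swallows a re-lock failure only while another
exception is already propagating.  This is a sketch, not the library code;
scoped_unlock and unwinding are made-up names, and std::uncaught_exceptions
is used in place of uncaught_exception where the language level provides it.

```cpp
#include <cassert>
#include <exception>

// True while an exception is propagating (stack unwinding in progress).
inline bool unwinding()
{
#if __cplusplus >= 201703L
  return std::uncaught_exceptions() > 0;
#else
  return std::uncaught_exception();
#endif
}

// Unlocks in the constructor, re-locks in the destructor.
template <typename Lock>
class scoped_unlock
{
public:
  explicit scoped_unlock(Lock &lk) : lock_(lk) { lock_.unlock(); }

  // noexcept(false): on the normal path a failure to re-lock is reported to
  // the caller.  During unwinding a second exception would end in
  // std::terminate, so it is swallowed instead.
  ~scoped_unlock() noexcept(false)
  {
    if (unwinding())
      {
        try { lock_.lock(); } catch (...) { }
      }
    else
      lock_.lock();
  }

  scoped_unlock(const scoped_unlock &) = delete;
  scoped_unlock &operator=(const scoped_unlock &) = delete;

private:
  Lock &lock_;
};
```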
Index: include/std/condition_variable
===
--- include/std/condition_variable  (revision 180456)
+++ include/std/condition_variable  (working copy)
@@ -205,7 +205,13 @@
// scoped unlock - unlocks in ctor, re-locks in dtor
struct _Unlock {
  explicit _Unlock(_Lock& __lk) : _M_lock(__lk) { __lk.unlock(); }
- ~_Unlock() { _M_lock.lock(); }
+ ~_Unlock() noexcept(false)
+ {
+   if (uncaught_exception())
+ __try { _M_lock.lock(); } __catch(...) { }
+   else
+ _M_lock.lock();
+ }
  _Lock& _M_lock;
};
 
Index: testsuite/30_threads/condition_variable_any/50862.cc
===
--- testsuite/30_threads/condition_variable_any/50862.cc(revision 
180456)
+++ testsuite/30_threads/condition_variable_any/50862.cc(working copy)
@@ -4,6 +4,7 @@
 // { dg-options  -std=gnu++0x  { target *-*-cygwin *-*-darwin* } }
 // { dg-require-cstdint  }
 // { dg-require-gthreads  }
+// { dg-require-sched-yield  }
  
 // Copyright (C) 2011 Free Software Foundation, Inc.
 //


[PATCH] Improve sparc setcc generation and add testcases.

2011-10-26 Thread David Miller

This makes sure the addx/subx sequence is used even when v9 conditional
moves are available.

It also shows the compiler that the setcc instructions emitting
-1, 0, or 1 all extend to DImode.

I left all the *x_{plus,minus}_foo and related patterns alone for
now, I'll get to those and add appropriate test cases as I find time.
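
For reference, the addx/subx idiom can be modelled in portable code as
below.  The helper names are made up for this sketch, and the instruction
sequences in the comments are the usual SPARC idiom rather than text copied
from the patch.

```cpp
#include <cassert>
#include <cstdint>

// "subcc %g0, x, %g0": the unsigned subtraction 0 - x borrows exactly when
// x != 0.  The borrow is bit 32 of the widened difference.
inline std::uint32_t carry_of_zero_minus(std::uint32_t x)
{
  const std::uint64_t wide = std::uint64_t(0) - std::uint64_t(x);
  return static_cast<std::uint32_t>((wide >> 32) & 1u);
}

// "addx %g0, 0, rd": rd = 0 + 0 + carry, i.e. (x != 0) as 0 or 1.
inline std::int32_t sne(std::uint32_t x)
{
  return static_cast<std::int32_t>(carry_of_zero_minus(x));
}

// "subx %g0, 0, rd": rd = 0 - 0 - carry, i.e. -(x != 0) as 0 or -1.
inline std::int32_t neg_sne(std::uint32_t x)
{
  return -static_cast<std::int32_t>(carry_of_zero_minus(x));
}

// The 32-bit result is 0 or -1, so sign-extending it yields the same value
// in 64 bits; that is what the *_sign_extend patterns express.
inline std::int64_t neg_sne_di(std::uint32_t x)
{
  return static_cast<std::int64_t>(neg_sne(x));
}
```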

Committed to trunk.

gcc/

* config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
LEU/LTU/GEU/GTU is attempted.
* config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
and split.
(*neg_seqsi_sign_extend): Likewise.
(*sltu_extend_sp64, *neg_sltu_extend_sp64, *sgeu_extend_sp64,
*neg_sgeu_extend_sp64): New insns.

gcc/testsuite/

* gcc.target/sparc/setcc-1.c: New test.
* gcc.target/sparc/setcc-2.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180550 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog|8 
 gcc/config/sparc/sparc.c |   14 +++
 gcc/config/sparc/sparc.md|   59 ++
 gcc/testsuite/ChangeLog  |5 +++
 gcc/testsuite/gcc.target/sparc/setcc-1.c |   39 
 gcc/testsuite/gcc.target/sparc/setcc-2.c |   39 
 6 files changed, 156 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/setcc-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d6d1382..3a51510 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,13 @@
 2011-10-26  David S. Miller  da...@davemloft.net
 
+   * config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
+   LEU/LTU/GEU/GTU is attempted.
+   * config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
+   and split.
+   (*neg_seqsi_sign_extend): Likewise.
+   (*sltu_extend_sp64, *neg_sltu_extend_sp64, *sgeu_extend_sp64,
+   *neg_sgeu_extend_sp64): New insns.
+
* config/sparc/sparc-protos.h (sparc_expand_conditional_move): Declare.
* config/sparc/sparc.md (movI:modecc, movF:modecc): Call it.
(*movI:mode_cc_v9): Normalize to expect operand 0 always in operand 4.
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 964bcaf..80e05a6 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2541,14 +2541,6 @@ emit_scc_insn (rtx operands[])
 }
 }
 
-  /* For the rest, on v9 we can use conditional moves.  */
-
-  if (TARGET_V9)
-{
-  if (gen_v9_scc (operands[0], code, x, y))
-return true;
-}
-
   /* We can do LTU and GEU using the addx/subx instructions too.  And
  for GTU/LEU, if both operands are registers swap them and fall
  back to the easy case.  */
@@ -2573,6 +2565,12 @@ emit_scc_insn (rtx operands[])
   return true;
 }
 
+  /* All the posibilities to use addx/subx based sequences has been
+ exhausted, try for a 3 instruction sequence using v9 conditional
+ moves.  */
+  if (TARGET_V9 && gen_v9_scc (operands[0], code, x, y))
+return true;
+
   /* Nope, do branches.  */
   return false;
 }
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 2bae28e..2b4b2bb 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -713,6 +713,22 @@
   
   [(set_attr length 2)])
 
+(define_insn_and_split "*neg_snesi_sign_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (neg:DI (ne:DI (match_operand:SI 1 "register_operand" "r")
+                       (const_int 0))))
+   (clobber (reg:CC CC_REG))]
+  "TARGET_ARCH64"
+  "#"
+  "&& 1"
+  [(set (reg:CC_NOOV CC_REG) (compare:CC_NOOV (minus:SI (const_int 0)
+                                                        (match_dup 1))
+                                              (const_int 0)))
+   (set (match_dup 0) (sign_extend:DI (neg:SI (ltu:SI (reg:CC CC_REG)
+                                                      (const_int 0)))))]
+  ""
+  [(set_attr "length" "2")])
+
 (define_insn_and_split *snedi_zero
   [(set (match_operand:DI 0 register_operand =r)
 (ne:DI (match_operand:DI 1 register_operand r)
@@ -804,6 +820,21 @@
   
   [(set_attr length 2)])
 
+(define_insn_and_split "*neg_seqsi_sign_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (neg:DI (eq:DI (match_operand:SI 1 "register_operand" "r")
+                       (const_int 0))))
+   (clobber (reg:CC CC_REG))]
+  "TARGET_ARCH64"
+  "#"
+  "&& 1"
+  [(set (reg:CC_NOOV CC_REG) (compare:CC_NOOV (neg:SI (match_dup 1))
+                                              (const_int 0)))
+   (set (match_dup 0) (sign_extend:DI (neg:SI (geu:SI (reg:CC CC_REG)
+                                                      (const_int 0)))))]
+  ""
+  [(set_attr "length" "2")])
+
 (define_insn_and_split *seqdi_zero
   [(set (match_operand:DI 0 register_operand =r)
 (eq:DI (match_operand:DI 1 register_operand r)
@@ -928,6 +959,13 @@
   addx\t%%g0, 

[PATCH] Add sparc fmaf test.

2011-10-26 Thread David Miller

Committed to trunk.

gcc/testsuite/

* gcc.target/sparc/fmaf-1.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180551 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog |2 +
 gcc/testsuite/gcc.target/sparc/fmaf-1.c |   51 +++
 2 files changed, 53 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/fmaf-1.c

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2879dfe..6078fdd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2011-10-26  David S. Miller  da...@davemloft.net
 
+   * gcc.target/sparc/fmaf-1.c: New test.
+
* gcc.target/sparc/setcc-1.c: New test.
* gcc.target/sparc/setcc-2.c: New test.
 
diff --git a/gcc/testsuite/gcc.target/sparc/fmaf-1.c 
b/gcc/testsuite/gcc.target/sparc/fmaf-1.c
new file mode 100644
index 000..948b926
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/fmaf-1.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mfmaf" } */
+
+float fmadds (float a, float b, float c)
+{
+  return a * b + c;
+}
+
+float fmsubs (float a, float b, float c)
+{
+  return a * b - c;
+}
+
+float fnmadds (float a, float b, float c)
+{
+  return -(a * b + c);
+}
+
+float fnmsubs (float a, float b, float c)
+{
+  return -(a * b - c);
+}
+
+double fmaddd (double a, double b, double c)
+{
+  return a * b + c;
+}
+
+double fmsubd (double a, double b, double c)
+{
+  return a * b - c;
+}
+
+double fnmaddd (double a, double b, double c)
+{
+  return -(a * b + c);
+}
+
+double fnmsubd (double a, double b, double c)
+{
+  return -(a * b - c);
+}
+
+/* { dg-final { scan-assembler "fmadds\t%" } } */
+/* { dg-final { scan-assembler "fmsubs\t%" } } */
+/* { dg-final { scan-assembler "fnmadds\t%" } } */
+/* { dg-final { scan-assembler "fnmsubs\t%" } } */
+/* { dg-final { scan-assembler "fmaddd\t%" } } */
+/* { dg-final { scan-assembler "fmsubd\t%" } } */
+/* { dg-final { scan-assembler "fnmaddd\t%" } } */
+/* { dg-final { scan-assembler "fnmsubd\t%" } } */
-- 
1.7.6.401.g6a319



[pph] Fix some cgraph node merge problems (issue5302068)

2011-10-26 Thread Diego Novillo

This patch fixes some of the problems we had with cgraph nodes for
merged functions.  When merging a function that had a cgraph node emitted 
for it, we were ICEing during cgraph allocation because the reader
was not checking whether the node already existed for that function.

Additionally, the patch removes the computation of hash values for
include branches.  That was only papering over hashing problems.
Instead, I made the hashing key more descriptive by using
decl_as_string and the mangled name.

We still have several merging problems.  The x4tmplfun* and z4tmplfun*
failures are different ICEs in template instantiation.  I have not yet
looked into what those are.

Tested on x86_64.  Committed.


Diego.


* cp-tree.h (get_mangled_id): Declare.
* mangle.c (get_mangled_id): Factor out of ...
(mangle_decl): ... here.
* pph-streamer-in.c
(pph_get_include_path_hash): Remove.  Update all users.
* pph-streamer-out.c (pph_merge_name): Move from pph-streamer.c.
Change return type to char *.
Call get_mangled_id and decl_as_string.
Update users.

testsuite/ChangeLog.pph

* g++.dg/pph/c4inline.cc: Mark fixed.
* g++.dg/pph/x4tmplclass1.cc: Likewise.
* g++.dg/pph/x4tmplclass2.cc: Likewise.
* g++.dg/pph/z4tmplclass1.cc: Likewise.
* g++.dg/pph/z4tmplclass2.cc: Likewise.
* g++.dg/pph/x4tmplfuncinln.cc: Change expected failure.
* g++.dg/pph/x4tmplfuncninl.cc: Likewise.
* g++.dg/pph/z4tmplfuncinln.cc: Likewise.
* g++.dg/pph/z4tmplfuncninl.cc: Likewise.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 705c0a6..297a779 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5782,6 +5782,7 @@ extern tree merge_exception_specifiers(tree, 
tree, tree);
 /* in mangle.c */
 extern void init_mangle(void);
 extern void mangle_decl(tree);
+extern tree get_mangled_id (tree);
 extern const char *mangle_type_string  (tree);
 extern tree mangle_typeinfo_for_type   (tree);
 extern tree mangle_typeinfo_string_for_type(tree);
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 55851e6..38a1fcb 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3157,13 +3157,21 @@ mangle_decl_string (const tree decl)
   return result;
 }
 
+/* Return an identifier for the external mangled name of DECL.  */
+
+tree
+get_mangled_id (tree decl)
+{
+  tree id = mangle_decl_string (decl);
+  return targetm.mangle_decl_assembler_name (decl, id);
+}
+
 /* Create an identifier for the external mangled name of DECL.  */
 
 void
 mangle_decl (const tree decl)
 {
-  tree id = mangle_decl_string (decl);
-  id = targetm.mangle_decl_assembler_name (decl, id);
+  tree id = get_mangled_id (decl);
   SET_DECL_ASSEMBLER_NAME (decl, id);
 
   if (G.need_abi_warning)
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 6c8bec4..6aa301f 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -67,7 +67,7 @@ static VEC(char_p,heap) *string_tables = NULL;
 static int pph_loc_offset;
 
 /* Forward declarations to avoid circularity.  */
-static tree pph_in_merge_key_tree (pph_stream *, tree *, hashval_t);
+static tree pph_in_merge_key_tree (pph_stream *, tree *);
 
 /* stream initialization */
 
@@ -690,18 +690,17 @@ pph_in_chain (pph_stream *stream)
 
 
 /* Read a chain of AST merge keys from STREAM.  Merge each tree
-   into *CHAIN.  IPATH_HASH is the hash value of the include path
-   from STREAM to the root of the include tree.  */
+   into *CHAIN.  */
 
 static void
-pph_in_merge_key_chain (pph_stream *stream, tree *chain, hashval_t ipath_hash)
+pph_in_merge_key_chain (pph_stream *stream, tree *chain)
 {
   unsigned i;
   HOST_WIDE_INT count;
 
   count = pph_in_hwi (stream);
   for (i = 0; i < count; i++)
-pph_in_merge_key_tree (stream, chain, ipath_hash);
+pph_in_merge_key_tree (stream, chain);
 }
 
 
@@ -740,13 +739,6 @@ typedef struct {
 
   /* Name of the tree (from pph_merge_name).  */
   const char *name;
-
-  /* Hash value representing the include path starting at the image
- where EXPR resides up to the root of the include tree.  Objects
- found in any of these PPH images do not need to be merged.  They
- were already emitted as external references and merged when
- the PPH images were being generated.  */
-  hashval_t ipath_hash;
 } merge_toc_entry;
 
 
@@ -767,11 +759,6 @@ htab_merge_key_eq (const void *p1, const void *p2)
   const merge_toc_entry *key1 = (const merge_toc_entry *) p1;
   const merge_toc_entry *key2 = (const merge_toc_entry *) p2;
 
-  /* Matches inside the same include path are not interesting.  These
- symbols have already been merged.  */
-  if (key1->ipath_hash == key2->ipath_hash)
-return false;
-
   if (key1->context != key2->context)
 return false;
 
@@ -831,15 

Reload related segfaults

2011-10-26 Thread Alan Modra
Some recent patch has exposed a reload bug.  I'm seeing

libtool: compile:  /home/amodra/build/gcc-curr/./gcc/xgcc 
-B/home/amodra/build/gcc-curr/./gcc/ -B/home/amodra/gnu/powerpc-linux/bin/ 
-B/home/amodra/gnu/powerpc-linux/lib/ -isystem 
/home/amodra/gnu/powerpc-linux/include -isystem 
/home/amodra/gnu/powerpc-linux/sys-include -m64 -fPIC -mstrict-align 
-DHAVE_CONFIG_H -I. 
-I/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm -I../../include 
-fexceptions -fasynchronous-unwind-tables -g -O2 -m64 -fPIC -mstrict-align -MT 
sf_fabs.lo -MD -MP -MF .deps/sf_fabs.Tpo -c 
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c  -fPIC 
-DPIC -o .libs/sf_fabs.o
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c: In 
function 'fabsf':
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c:33:1: 
internal compiler error: Segmentation fault

(insn 11 9 15 2 (parallel [
(set (subreg:SI (reg:SF 123 [ retval ]) 0)
(and:SI (subreg:SI (reg:SF 33 1 [ x ]) 0)
(const_int 2147483647 [0x7fffffff])))
(clobber (scratch:CC))
]) 
/home/amodra/src/gcc-current/libjava/classpath/native/fdlibm/sf_fabs.c:32 146 
{andsi3_mc}
 (expr_list:REG_DEAD (reg:SF 33 1 [ x ])
(nil)))

Reloads for insn # 11
Reload 0: GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0), can't combine, 
secondary_reload_p
reload_reg_rtx: (reg:SF 8 8)
Reload 1: GENERAL_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, 
secondary_reload_p
reload_reg_rtx: (reg:SF 8 8)
Reload 2: reload_in (SF) = (reg:SF 33 1 [orig:123 retval ] [123])
reload_out (SF) = (reg:SF 33 1 [orig:123 retval ] [123])
NON_FLOAT_REGS, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg:SF 33 1 [orig:123 retval ] [123])
reload_out_reg: (reg:SF 33 1 [orig:123 retval ] [123])
reload_reg_rtx: (reg:SF 10 10)
secondary_in_reload = 0, secondary_out_reload = 1

Reload 3: reload_out (SI) = (subreg:SI (reg:SF 33 1 [orig:123 retval ] [123]) 
0)
GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
reload_out_reg: (subreg:SI (reg:SF 33 1 [orig:123 retval ] [123]) 0)
reload_reg_rtx: (reg:SI 9 9)
Reload 4: reload_in (SF) = (reg:SF 33 1 [ x ])
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1)
reload_in_reg: (reg:SF 33 1 [ x ])
reload_reg_rtx: (reg:SF 8 8)

Notice reload 3 has a subreg for reload_out.  We wind up in the
gen_reload code shown below, and try to use REGNO on a subreg, leading
to

(gdb) p debug_rtx(out)
(reg:DI -166922608)

Or at least, that's what you get after fixing print_rtx to not
segfault.  Bootstrap and regression test powerpc-linux in progress.
OK to apply, assuming no regressions?

* reload1.c (gen_reload): Don't use REGNO on SUBREGs.
* print-rtl.c (print_rtx): Don't segfault on negative regno.

Note that s/REGNO (in_rtx)/value/ in print_rtx is reasonable given
that REG in rtl.def has a format of i00, so XINT (in_rtx, i) is
always XINT (in_rtx, 0) for a reg, which is equivalent to REGNO apart
from signedness.
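
To illustrate the failure mode described above: REGNO reads operand 0
of an rtx as a register number, which is only meaningful for a REG; in
a SUBREG, operand 0 is the inner expression.  The following is a toy
model under stated assumptions, not GCC's rtx representation.  The
names toy_rtx, toy_regno and toy_reg_or_subregno are hypothetical, and
the last one only mirrors the idea of looking through a SUBREG before
reading the register number.

```c
#include <stddef.h>

/* Toy stand-in for an rtx: either a register or a subreg wrapping one.
   All names here are hypothetical; this is not GCC's data structure.  */
enum toy_code { TOY_REG, TOY_SUBREG };

struct toy_rtx
{
  enum toy_code code;
  union
  {
    unsigned int regno;            /* Meaningful only for TOY_REG.  */
    struct
    {
      const struct toy_rtx *inner; /* Meaningful only for TOY_SUBREG.  */
      unsigned int byte;
    } subreg;
  } u;
};

/* Checked accessor: returns 1 and stores the register number only when X
   really is a register, so a subreg's fields are never misread.  */
static int
toy_regno (const struct toy_rtx *x, unsigned int *regno)
{
  if (x == NULL || x->code != TOY_REG)
    return 0;
  *regno = x->u.regno;
  return 1;
}

/* Look through one level of subreg, then read the register number.  */
static int
toy_reg_or_subregno (const struct toy_rtx *x, unsigned int *regno)
{
  if (x != NULL && x->code == TOY_SUBREG)
    x = x->u.subreg.inner;
  return toy_regno (x, regno);
}
```

In this model the unchecked equivalent of REGNO on a subreg would read
the bytes of the inner pointer as an integer, which is roughly how a
value like the one in the gdb output above can arise.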

Index: gcc/reload1.c
===
--- gcc/reload1.c   (revision 180542)
+++ gcc/reload1.c   (working copy)
@@ -8588,28 +8588,28 @@ gen_reload (rtx out, rtx in, int opnum, 
           && reg_or_subregno (in) < FIRST_PSEUDO_REGISTER
           && (REG_P (out)
               || (GET_CODE (out) == SUBREG && REG_P (SUBREG_REG (out))))
           && reg_or_subregno (out) < FIRST_PSEUDO_REGISTER
           && SECONDARY_MEMORY_NEEDED (REGNO_REG_CLASS (reg_or_subregno (in)),
                                       REGNO_REG_CLASS (reg_or_subregno (out)),
                                       GET_MODE (out)))
 {
   /* Get the memory to use and rewrite both registers to its mode.  */
   rtx loc = get_secondary_mem (in, GET_MODE (out), opnum, type);
 
   if (GET_MODE (loc) != GET_MODE (out))
-   out = gen_rtx_REG (GET_MODE (loc), REGNO (out));
+   out = gen_rtx_REG (GET_MODE (loc), reg_or_subregno (out));
 
   if (GET_MODE (loc) != GET_MODE (in))
-   in = gen_rtx_REG (GET_MODE (loc), REGNO (in));
+   in = gen_rtx_REG (GET_MODE (loc), reg_or_subregno (in));
 
   gen_reload (loc, in, opnum, type);
   gen_reload (out, loc, opnum, type);
 }
 #endif
   else if (REG_P (out) && UNARY_P (in))
 {
   rtx insn;
   rtx op1;
   rtx out_moded;
   rtx set;
 
Index: gcc/print-rtl.c
===
--- gcc/print-rtl.c (revision 180542)
+++ gcc/print-rtl.c (working copy)
@@ -465,13 +465,12 @@ print_rtx (const_rtx in_rtx)
int value = XINT (in_rtx, i);
const char *name;
 
 #ifndef GENERATOR_FILE
-   if (REG_P (in_rtx) && value < FIRST_PSEUDO_REGISTER)
- fprintf (outfile, " %d %s", REGNO (in_rtx),
-  reg_names[REGNO (in_rtx)]);
+

Re: Reload related segfaults

2011-10-26 Thread David Miller
From: Alan Modra amo...@gmail.com
Date: Thu, 27 Oct 2011 13:29:56 +1030

> Some recent patch has exposed a reload bug.  I'm seeing

I think this might be a side effect of Eric's recent changes,
CC:'d.


[PATCH] Fix thinko in previous sparc setcc changes.

2011-10-26 Thread David Miller

More extensive testing showed that we have to force the usage of
v9 scc patterns when comparing DImode values with comparison codes
other than EQ and NE.

Soon we'll be able to add exceptions to this, because VIS3 has addxc
and addxccc instructions which test the 64-bit carry condition.
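
As a rough illustration of why a 64-bit carry test matters for the
unsigned codes: an unsigned "less than" is exactly the borrow out of
the top bit of x - y, and "greater or equal" is its complement.  The
sketch below is a plain C model of that arithmetic only.  It is not
SPARC code and not the logic in sparc.c, and the toy_* names are
hypothetical.

```c
#include <stdint.h>

/* Borrow out of bit 63 when computing x - y, derived bitwise from the
   operands and the wrapped difference (no comparison operator used).  */
static unsigned int
toy_borrow64 (uint64_t x, uint64_t y)
{
  uint64_t diff = x - y;
  return (unsigned int) (((~x & y) | (~(x ^ y) & diff)) >> 63);
}

/* LTU: x < y (unsigned) is the borrow itself, i.e. 0 + 0 + carry.  */
static unsigned int
toy_ltu64 (uint64_t x, uint64_t y)
{
  return 0 + 0 + toy_borrow64 (x, y);
}

/* GEU: the complement, i.e. 1 - carry.  */
static unsigned int
toy_geu64 (uint64_t x, uint64_t y)
{
  return 1 - toy_borrow64 (x, y);
}
```

With only a 32-bit carry available, the same trick covers SImode
operands but not DImode ones, which is consistent with forcing the v9
sequences for DImode here.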

Committed to trunk.

gcc/

* config/sparc/sparc.c (emit_scc_insn): Force attempt of v9 sequences
if we're comparing DImode and comparison is other than EQ or NE.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180558 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog|3 +++
 gcc/config/sparc/sparc.c |5 +
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3a51510..124b17c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2011-10-26  David S. Miller  da...@davemloft.net
 
+   * config/sparc/sparc.c (emit_scc_insn): Force attempt of v9 sequences
+   if we're comparing DImode and comparison is other than EQ or NE.
+
* config/sparc/sparc.c (emit_scc_insn): Do not try v9 sequences until
LEU/LTU/GEU/GTU is attempted.
* config/sparc/sparc.md (*neg_snesi_sign_extend): New 64-bit insn
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 80e05a6..ea9fdef 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2541,6 +2541,11 @@ emit_scc_insn (rtx operands[])
 }
 }
 
+  if (TARGET_V9
+      && GET_MODE (x) == DImode
+      && gen_v9_scc (operands[0], code, x, y))
+return true;
+
   /* We can do LTU and GEU using the addx/subx instructions too.  And
  for GTU/LEU, if both operands are registers swap them and fall
  back to the easy case.  */
-- 
1.7.6.401.g6a319



Re: Go patch committed: Implement new syscall package

2011-10-26 Thread Ian Lance Taylor
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> the problem is another one: using /usr/xpg4/bin/awk, I find:
>
> /usr/xpg4/bin/awk: line 47 (NR=32): wrong number of arguments to function m
>
> nawk(1) only documents match(s,ere) (i.e. two args), and the gawk docs
> state:
>
> `match(STRING, REGEXP [, ARRAY])'
> [...]
>  The ARRAY argument to `match' is a `gawk' extension.  In
>  compatibility mode (*note Options::), using a third argument is a
>  fatal error.

I committed this patch which should fix this problem.  Bootstrapped and
ran Go testsuite on x86_64-unknown-linux-gnu.

Ian

diff -r 5f4f4eae5cd9 libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk	Wed Oct 26 16:19:57 2011 -0700
+++ b/libgo/go/syscall/mksyscall.awk	Wed Oct 26 21:53:05 2011 -0700
@@ -44,33 +44,63 @@
 	blocking = 1
 }
 
-if (match($0, "//sys(nb)?[ 	]*([a-zA-Z0-9_]+)\\(([^()]*)\\) *(\\(([^()]+)\\))?", gosig) == 0) {
+line = $0
+
+if (match(line, "//sys(nb)?[ 	]*[a-zA-Z0-9_]+\\([^()]*\\) *(\\(([^()]+)\\))?") == 0) {
 	print "unmatched line:", $0 | "cat 1>&2"
 	status = 1
 	next
 }
 
-gofnname = gosig[2]
-gofnparams = gosig[3]
-gofnresults = gosig[5]
+# Sets a[1] = //sysnb, a[2] == function name.
+split(line, a, "[ 	(]*")
+gofnname = a[2]
+
+off = match(line, "\\([^()]*\\)")
+end = index(substr(line, off, length(line) - off + 1), ")")
+gofnparams = substr(line, off + 1, end - 2)
+
+line = substr(line, off + end, length(line) - (off + end) + 1)
+off = match(line, "\\([^()]*\\)")
+if (off == 0) {
+	gofnresults = ""
+} else {
+	end = index(substr(line, off, length(line) - off + 1), ")")
+	gofnresults = substr(line, off + 1, end - 2)
+}
 
 getline
+line = $0
 
-if (match($0, "//([a-zA-Z0-9_]+)\\(([^()]*)\\) *(.*)$", csig) == 0) {
+if (match(line, "//[a-zA-Z0-9_]+\\([^()]*\\)") == 0) {
 	print "unmatched C line", $0, "after", gofnname | "cat 1>&2"
 	status = 1
 	next
 }
 
-cfnname = csig[1]
-cfnparams = csig[2]
-cfnresult = csig[3]
+split(line, a, "[ 	(]*")
+cfnname = substr(a[1], 3, length(a[1]) - 2)
+
+off = match(line, "\\([^()]*\\)")
+end = index(substr(line, off, length(line) - off + 1), ")")
+cfnparams = substr(line, off + 1, end - 2)
+
+line = substr(line, off + end + 1, length(line) - (off + end) + 1)
+while (substr(line, 1, 1) == " ") {
+	line = substr(line, 2, length(line) - 1)
+}
+end = index(line, " ")
+if (end != 0) {
+	line = substr(line, 1, end)
+}
+cfnresult = line
 
 printf("// Automatically generated wrapper for %s/%s\n", gofnname, cfnname)
 printf("func c_%s(%s) %s%s__asm__(\"%s\")\n",
 	   cfnname, cfnparams, cfnresult, cfnresult == "" ? "" : " ", cfnname)
-printf("func %s(%s) %s%s{\n",
-	   gofnname, gofnparams, gosig[4], gosig[4] == "" ? "" : " ")
+printf("func %s(%s) %s%s%s%s{\n",
+	   gofnname, gofnparams, gofnresults == "" ? "" : "(", gofnresults,
+	   gofnresults == "" ? "" : ")", gofnresults == "" ? "" : " ")
 
 if (blocking) {
 	print "\tentersyscall()"
@@ -91,22 +121,22 @@
 	args = args ", "
 	}
 
-	if (match(goargs[goarg], "^([^ ]*) ([^ ]*)$", goparam) == 0) {
+	if (split(goargs[goarg], a) != 2) {
 	print loc, "bad parameter:", goargs[goarg] | "cat 1>&2"
 	status = 1
 	next
 	}
 
-	goname = goparam[1]
-	gotype = goparam[2]
+	goname = a[1]
+	gotype = a[2]
 
-	if (match(cargs[carg], "^([^ ]*) ([^ ]*)$", cparam) == 0) {
+	if (split(cargs[carg], a) != 2) {
 	print loc, "bad C parameter:", cargs[carg] | "cat 1>&2"
 	status = 1
 	next
 	}
 
-	ctype = cparam[2]
+	ctype = a[2]
 
 	if (gotype ~ /^\*/) {
 	if (gotype != ctype) {

