Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 Huh?  If they have DECL_FIELD_BIT_OFFSET of zero they are at a byte
 boundary, no?  Wait - the RECORD_TYPE itself is at non-zero
 DECL_FIELD_BIT_OFFSET and thus a zero DECL_FIELD_BIT_OFFSET for its
 fields does not mean anything?!

DECL_FIELD_BIT_OFFSET is relative to the enclosing record type.

 But how can DECL_OFFSET_ALIGN be still valid for such field?

Likewise, it is relative to DECL_FIELD_OFFSET, which is relative to the 
enclosing record type.

 Obviously if DECL_FIELD_OFFSET == 0, DECL_FIELD_BIT_OFFSET == 0 then the
 offset needs to be aligned to DECL_OFFSET_ALIGN.  Which then means
 DECL_OFFSET_ALIGN is a bit-alignment?

It's just a relative alignment, not an absolute one.

 Anyway, since we are trying to compute a nice mode to use for
 the bitfield representative we can give up in the second that
 we do not know how to reach BITS_PER_UNIT alignment.  Or we can
 simply only try to ensure MIN (BITS_PER_UNIT, DECL_OFFSET_ALIGN)
 alignment/size of the representative.  Of course the bitfield
 expansion code has to deal with non-byte-aligned representatives
 then, and we'd always have to use BLKmode for them.

But BLKmode forces byte alignment.  Anything not byte-aligned must use an 
integral mode.

 I think we can't change it to be on a byte-boundary, the same record
 may be used at different bit-positions, no?

Sure.

 Sure, we acknowledge it can cross a record boundary.  I just was not
 aware we cannot statically compute the bit-offset to the previous byte
 for a record type.

We can if we look into the entire object, not just the enclosing record.  The 
problem is that, while get_inner_reference and friends do the former, the new 
code in stor-layout.c only does the latter.

OK, I think we should enter some sort of a degraded mode for these non-byte 
aligned records, as they cannot occur in C/C++.  I'll try to come up with 
something.

-- 
Eric Botcazou


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 Btw, now checking with gdb, DECL_OFFSET_ALIGN is always 128 for
 all of the fields - that looks bogus.  DECL_ALIGN is 1, but that
 doesn't mean DECL_OFFSET_ALIGN should not be 1 as well, no?

DECL_OFFSET_ALIGN is set to BIGGEST_ALIGNMENT for the first field, see 
start_record_layout.  If DECL_FIELD_OFFSET doesn't change for the next fields, 
DECL_OFFSET_ALIGN will very likely not either.

-- 
Eric Botcazou


Re: [patch] Assert assemble_external is only called during or after expanding to RTL

2012-03-26 Thread Richard Guenther
On Sat, Mar 24, 2012 at 9:25 PM, Steven Bosscher stevenb@gmail.com wrote:
 Hello,

 This patch tightens the conditions on when assemble_external() may be
 called. It also removes a comment that most platforms do not define
 ASM_OUTPUT_EXTERNAL, because hasn't been true since r119764 added a
 definition of ASM_OUTPUT_EXTERNAL to elfos.h.

 This is the first step toward addressing PR17982 on the trunk for GCC
 4.8. The next step is to change pending_assemble_externals to
 pending_assemble_visibility, and fold assemble_external_real() back
 into assemble_external.

 But first, this patch. I don't think this is very risky, because GCC
 now always works in unit-at-a-time mode. But I think it would be good
 to have this on the trunk for a week or so before proceeding.

 Bootstrapped  tested on x86_64-unknown-linux-gnu. OK for trunk?

Ok.  Though I wonder why callers cannot simply push these decls to
the pending varpool queue and we might do a final run over it,
assembling things?  [or why we call assemble_external that late at all ...]

Richard.

 Ciao!
 Steven



        * varasm.c (assemble_external): Assert this function is only called
        during or after expanding to RTL.

 Index: varasm.c
 ===
 --- varasm.c    (revision 185762)
 +++ varasm.c    (working copy)
 @@ -2166,12 +2166,18 @@ static GTY(()) tree weak_decls;
  void
  assemble_external (tree decl ATTRIBUTE_UNUSED)
  {
 -  /* Because most platforms do not define ASM_OUTPUT_EXTERNAL, the
 -     main body of this code is only rarely exercised.  To provide some
 -     testing, on all platforms, we make sure that the ASM_OUT_FILE is
 -     open.  If it's not, we should not be calling this function.  */
 +  /*  Make sure that the ASM_OUT_FILE is open.
 +      If it's not, we should not be calling this function.  */
   gcc_assert (asm_out_file);

 +  /* This function should only be called if we are expanding, or have
 +     expanded, to RTL.
 +     Ideally, only final.c would be calling this function, but it is
 +     not clear whether that would break things somehow.  See PR 17982
 +     for further discussion.  */
 +  gcc_assert (cgraph_state == CGRAPH_STATE_EXPANSION
 +              || cgraph_state == CGRAPH_STATE_FINISHED);
 +
   if (!DECL_P (decl) || !DECL_EXTERNAL (decl) || !TREE_PUBLIC (decl))
     return;


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 I think we indeed can't really in stor-layout, so the only place is very
 likely get_bit_range.

Something like that for example.  I think the expmed.c hunk should be applied 
to the 4.7 branch as well, because the new code in store_bit_field is quite 
dangerous without it.


* expmed.c (store_bit_field): Assert that BITREGION_START is a multiple
of a unit before computing the offset in units.
* expr.c (get_bit_range): Return the null range if the enclosing record
is part of a larger bit field.


-- 
Eric Botcazou
Index: expr.c
===
--- expr.c	(revision 185767)
+++ expr.c	(working copy)
@@ -4462,6 +4462,25 @@ get_bit_range (unsigned HOST_WIDE_INT *b
   return;
 }
 
+  /* If we have a DECL_BIT_FIELD_REPRESENTATIVE but the enclosing record is
+ part of a larger bit field, then the representative does not serve any
+ useful purpose.  This can occur in Ada.  */
+  if (handled_component_p (TREE_OPERAND (exp, 0)))
+{
+  enum machine_mode rmode;
+  HOST_WIDE_INT rbitsize, rbitpos;
+  tree roffset;
+  int unsignedp;
+  int volatilep = 0;
+  get_inner_reference (TREE_OPERAND (exp, 0), rbitsize, rbitpos,
+			   roffset, rmode, unsignedp, volatilep, false);
+  if ((rbitpos % BITS_PER_UNIT) != 0)
+	{
+	  *bitstart = *bitend = 0;
+	  return;
+}
+}
+
   /* Compute the adjustment to bitpos from the offset of the field
  relative to the representative.  DECL_FIELD_OFFSET of field and
  repr are the same by construction if they are not constants,
Index: expmed.c
===
--- expmed.c	(revision 185570)
+++ expmed.c	(working copy)
@@ -828,8 +828,7 @@ store_bit_field (rtx str_rtx, unsigned H
   /* Under the C++0x memory model, we must not touch bits outside the
  bit region.  Adjust the address to start at the beginning of the
  bit region.  */
-  if (MEM_P (str_rtx)
-   bitregion_start  0)
+  if (MEM_P (str_rtx)  bitregion_start  0)
 {
   enum machine_mode bestmode;
   enum machine_mode op_mode;
@@ -839,6 +838,8 @@ store_bit_field (rtx str_rtx, unsigned H
   if (op_mode == MAX_MACHINE_MODE)
 	op_mode = VOIDmode;
 
+  gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
+
   offset = bitregion_start / BITS_PER_UNIT;
   bitnum -= bitregion_start;
   bitregion_end -= bitregion_start;


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 OTOH if DECL_ALIGN is absolute then if the first field of a record type has
 DECL_ALIGN that is a multiple of BITS_PER_UNIT we know we can use
 the present stor-layout code?  So we can check that and give up computing
 representatives at all for a record type if that does not hold.

I don't think so, DECL_ALIGN of a field is probably honored only if TYPE_ALIGN 
of the record is, and TYPE_ALIGN isn't honored for a bit field.

-- 
Eric Botcazou


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Richard Guenther
On Mon, Mar 26, 2012 at 10:17 AM, Eric Botcazou ebotca...@adacore.com wrote:
 OTOH if DECL_ALIGN is absolute then if the first field of a record type has
 DECL_ALIGN that is a multiple of BITS_PER_UNIT we know we can use
 the present stor-layout code?  So we can check that and give up computing
 representatives at all for a record type if that does not hold.

 I don't think so, DECL_ALIGN of a field is probably honored only if TYPE_ALIGN
 of the record is, and TYPE_ALIGN isn't honored for a bit field.

Uh.  When is a field a bit field though?  At least stor-layout.c
resets DECL_BIT_FIELD
when local relative alignment is proper and the filed has an integer
mode.  That's
overly optimistic if the record is placed at a bit position.  Of
course we still have
DECL_BIT_FIELD_TYPE, but I wonder what the DECL_BIT_FIELD flag is for if
we cannot really know whether the field in the end is a bitfield anyway...
(what value do the various field-decls in a component-ref chain have anyway if
the real interpretation is subject to the containing object, which
might be even only
specified by the type used for an indirect ref ...)

Richard.

 --
 Eric Botcazou


Fix PR rtl-optimization/52629

2012-03-26 Thread Eric Botcazou
This is an out-of-bounds access in count_spilled_pseudo: we can enter the 
function for a pseudo without hard reg but we immediately use the hard reg 
number as an index:

static void
count_spilled_pseudo (int spilled, int spilled_nregs, int reg)
{
  int freq = REG_FREQ (reg);
  int r = reg_renumber[reg];
  int nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)];

The negative R case is only checked a few lines below:

  /* Ignore spilled pseudo-registers which can be here only if IRA is
 used.  */
  if ((ira_conflicts_p  r  0)
  || REGNO_REG_SET_P (spilled_pseudos, reg)
  || spilled + spilled_nregs = r || r + nregs = spilled)
return;


It turns out that count_pseudo contains very similar (and correct) code so the 
patch just mimics that code.

Bootstrapped/regtested on x86_64-suse-linux, applied on mainline.  Should it be 
applied to the release branches as well?


2012-03-26  Eric Botcazou  ebotca...@adacore.com

PR rtl-optimization/52629
* reload1.c (count_pseudo): Short-circuit common case.
(count_spilled_pseudo): Return early for pseudos without hard regs.
Assert that the pseudo has got a hard reg before manipulating it.


-- 
Eric Botcazou
Index: reload1.c
===
--- reload1.c	(revision 185570)
+++ reload1.c	(working copy)
@@ -1746,11 +1746,12 @@ count_pseudo (int reg)
   int r = reg_renumber[reg];
   int nregs;
 
+  /* Ignore spilled pseudo-registers which can be here only if IRA is used.  */
+  if (ira_conflicts_p  r  0)
+return;
+
   if (REGNO_REG_SET_P (pseudos_counted, reg)
-  || REGNO_REG_SET_P (spilled_pseudos, reg)
-  /* Ignore spilled pseudo-registers which can be here only if IRA
-	 is used.  */
-  || (ira_conflicts_p  r  0))
+  || REGNO_REG_SET_P (spilled_pseudos, reg))
 return;
 
   SET_REGNO_REG_SET (pseudos_counted, reg);
@@ -1827,12 +1828,17 @@ count_spilled_pseudo (int spilled, int s
 {
   int freq = REG_FREQ (reg);
   int r = reg_renumber[reg];
-  int nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)];
+  int nregs;
+
+  /* Ignore spilled pseudo-registers which can be here only if IRA is used.  */
+  if (ira_conflicts_p  r  0)
+return;
+
+  gcc_assert (r = 0);
+
+  nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)];
 
-  /* Ignore spilled pseudo-registers which can be here only if IRA is
- used.  */
-  if ((ira_conflicts_p  r  0)
-  || REGNO_REG_SET_P (spilled_pseudos, reg)
+  if (REGNO_REG_SET_P (spilled_pseudos, reg)
   || spilled + spilled_nregs = r || r + nregs = spilled)
 return;
 


Re: [Patch V2] libgfortran: do not assume libm

2012-03-26 Thread Tristan Gingold

On Mar 22, 2012, at 11:06 AM, Paolo Bonzini wrote:

 Il 22/03/2012 09:30, Tristan Gingold ha scritto:
 Hi,
 
 this is version 2 of the patch.
 
 The initial problem is that libgfortran configure.ac used AC_CHECK_LIB([m]…) 
 to check wether several math functions are available.  That doesn't work on 
 VMS, because there is no such things as a libm.
 
 It seems to me that there are no autoconf macro to check wether a function 
 is available: AC_CHECK_FUNC[S] don't allow to specify additional include 
 files (necessary on VMS as some math functions are renamed through macros), 
 and AC_CHECK_DECL only checks the presence of the declaration (as pointed 
 out by Ralf).
 
 So I have finally learnt a little bit more about autoconf and added a new 
 file: config/math.m4 (build machinery maintainer cc:)
 libgfortran/configure.ac now uses the new GCC_CHECK_MATH_FUNC macro.
 
 Bootstrapped without gfortran regressions on x86_64-darwin.
 
 Ok for trunk ?
 
 (I will submit a follow-up change in libquadmath once this change is 
 approved).
 
 Tristan.
 
 config/
 2012-03-22  Tristan Gingold  ging...@adacore.com
 
  * math.m4: New file.
 
 libgfortran/
 2012-03-22  Tristan Gingold  ging...@adacore.com
 
  * configure.ac: Use GCC_CHECK_MATH_FUNC for math functions.
  * acinclude.m4: Include ../config/math.m4
  * configure: Regenerate.
  * Makefile.in: Regenerate.
 
 Ok.

Thanks, now committed.

Tristan.




Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Richard Guenther
On Mon, 26 Mar 2012, Eric Botcazou wrote:

  I think we indeed can't really in stor-layout, so the only place is very
  likely get_bit_range.
 
 Something like that for example.  I think the expmed.c hunk should be applied 
 to the 4.7 branch as well, because the new code in store_bit_field is quite 
 dangerous without it.
 
 
   * expmed.c (store_bit_field): Assert that BITREGION_START is a multiple
   of a unit before computing the offset in units.
   * expr.c (get_bit_range): Return the null range if the enclosing record
   is part of a larger bit field.

The patch looks reasonable - can we compute this backward from the
result of the outer get_inner_reference call and the outermost
field-decl though?  Or make get_inner_reference compute that while
analyzing the full reference and return a flag?  OTOH it shouldn't
be too expensive.

I agree the assert should go to the banch as well, though the code
only ever triggers there with --param allow-store-data-races=0.

Thanks,
Richard.


[VMS/committed]: consolidate ia64_start_function and alpha_start_function, rework crt0

2012-03-26 Thread Tristan Gingold
Hi,

there was duplicated code for dealing with -mdebug-main in ia64_start_function 
and alpha_start_function.  This patch consolidates that.

The pointer size is recorded for function 'main', so that crt0 now expand 
argv[] if needed; likewise of the return status code.

Manually tested on both ia64 and alpha openvms, committed on trunk.

Tristan.

libgcc/
2012-03-26  Tristan Gingold  ging...@adacore.com

* config/alpha/vms.h (LINK_SPEC): Simplify.
(STARTFILE_SPEC): Remove -mvms-return-codes handling.
(NAME__MAIN, SYMBOL__MAIN): Remove.
(VMS_DEBUG_MAIN_POINTER): Remove.
* config/ia64/vms.h: Likewise.
* config/alpha/alpha.c (alpha_start_function): Move vms_debug_main
code to vms.c.  Call vms_start_function.
* config/ia64/ia64.c (ia64_start_function): Likewise.
* config/vms/vms-protos.h (vms_start_function): Declare.
* config/vms/vms.c (vms_start_function): New function.
* config/vms/vms.h (MATH_LIBRARY): Define.
(VMS_DEBUG_MAIN_POINTER): Define.

gcc/
2012-03-26  Tristan Gingold  ging...@adacore.com

* config/alpha/vms.h (LINK_SPEC): Simplify.
(STARTFILE_SPEC): Remove -mvms-return-codes handling.
(NAME__MAIN, SYMBOL__MAIN): Remove.
(VMS_DEBUG_MAIN_POINTER): Remove.
* config/ia64/vms.h: Likewise.
* config/alpha/alpha.c (alpha_start_function): Move vms_debug_main
code to vms.c.  Call vms_start_function.
* config/ia64/ia64.c (ia64_start_function): Likewise.
* config/vms/vms-protos.h (vms_start_function): Declare.
* config/vms/vms.c (vms_start_function): New function.
* config/vms/vms.h (MATH_LIBRARY): Define.
(VMS_DEBUG_MAIN_POINTER): Define.



Re: [VMS/committed]: consolidate ia64_start_function and alpha_start_function, rework crt0

2012-03-26 Thread Tristan Gingold
And the corresponding path is below...

On Mar 26, 2012, at 11:37 AM, Tristan Gingold wrote:

 Hi,
 
 there was duplicated code for dealing with -mdebug-main in 
 ia64_start_function and alpha_start_function.  This patch consolidates that.
 
 The pointer size is recorded for function 'main', so that crt0 now expand 
 argv[] if needed; likewise of the return status code.
 
 Manually tested on both ia64 and alpha openvms, committed on trunk.
 
 Tristan.
 
 libgcc/
 2012-03-26  Tristan Gingold  ging...@adacore.com
 
* config/alpha/vms.h (LINK_SPEC): Simplify.
(STARTFILE_SPEC): Remove -mvms-return-codes handling.
(NAME__MAIN, SYMBOL__MAIN): Remove.
(VMS_DEBUG_MAIN_POINTER): Remove.
* config/ia64/vms.h: Likewise.
* config/alpha/alpha.c (alpha_start_function): Move vms_debug_main
code to vms.c.  Call vms_start_function.
* config/ia64/ia64.c (ia64_start_function): Likewise.
* config/vms/vms-protos.h (vms_start_function): Declare.
* config/vms/vms.c (vms_start_function): New function.
* config/vms/vms.h (MATH_LIBRARY): Define.
(VMS_DEBUG_MAIN_POINTER): Define.
 
 gcc/
 2012-03-26  Tristan Gingold  ging...@adacore.com
 
* config/alpha/vms.h (LINK_SPEC): Simplify.
(STARTFILE_SPEC): Remove -mvms-return-codes handling.
(NAME__MAIN, SYMBOL__MAIN): Remove.
(VMS_DEBUG_MAIN_POINTER): Remove.
* config/ia64/vms.h: Likewise.
* config/alpha/alpha.c (alpha_start_function): Move vms_debug_main
code to vms.c.  Call vms_start_function.
* config/ia64/ia64.c (ia64_start_function): Likewise.
* config/vms/vms-protos.h (vms_start_function): Declare.
* config/vms/vms.c (vms_start_function): New function.
* config/vms/vms.h (MATH_LIBRARY): Define.
(VMS_DEBUG_MAIN_POINTER): Define.
 

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 46c620b..c52fc50 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -7866,14 +7866,7 @@ alpha_start_function (FILE *file, const char *fnname,
   int i;
 
 #if TARGET_ABI_OPEN_VMS
-  if (vms_debug_main
-   strncmp (vms_debug_main, fnname, strlen (vms_debug_main)) == 0)
-{
-  targetm.asm_out.globalize_label (asm_out_file, VMS_DEBUG_MAIN_POINTER);
-  ASM_OUTPUT_DEF (asm_out_file, VMS_DEBUG_MAIN_POINTER, fnname);
-  switch_to_section (text_section);
-  vms_debug_main = NULL;
-}
+  vms_start_function (fnname);
 #endif
 
   alpha_fnname = fnname;
diff --git a/gcc/config/alpha/vms.h b/gcc/config/alpha/vms.h
index b1d46b5..8caec54 100644
--- a/gcc/config/alpha/vms.h
+++ b/gcc/config/alpha/vms.h
@@ -43,8 +43,6 @@ along with GCC; see the file COPYING3.  If not see
 builtin_define (__IEEE_FLOAT);   \
 } while (0)
 
-#define VMS_DEBUG_MAIN_POINTER TRANSFER$BREAK$GO
-
 #undef PCC_STATIC_STRUCT_RETURN
 
 #define MAX_OFILE_ALIGNMENT 524288  /* 8 x 2^16 by DEC Ada Test CD40VRA */
@@ -279,21 +277,14 @@ do {\
 #else
 /* Link with vms-dwarf2.o if -g (except -g0). This causes the
VMS link to pull all the dwarf2 debug sections together.  */
-#define LINK_SPEC %{g:-g vms-dwarf2.o%s} %{g0} %{g1:-g1 vms-dwarf2.o%s} \
-%{g2:-g2 vms-dwarf2.o%s} %{g3:-g3 vms-dwarf2.o%s} %{shared} %{v} %{map}
+#define LINK_SPEC %{g0} %{g*:-g vms-dwarf2.o%s} %{shared} %{v} %{map}
 #endif
 
 #undef STARTFILE_SPEC
-#define STARTFILE_SPEC \
-%{!shared:%{mvms-return-codes:vcrt0.o%s} %{!mvms-return-codes:pcrt0.o%s} \
-crtbegin.o%s} \
+#define STARTFILE_SPEC %{!shared:crt0.o%s crtbegin.o%s} \
  %{!static:%{shared:crtbeginS.o%s}}
 
-#define ENDFILE_SPEC \
-%{!shared:crtend.o%s} %{!static:%{shared:crtendS.o%s}}
-
-#define NAME__MAIN __gccmain
-#define SYMBOL__MAIN __gccmain
+#define ENDFILE_SPEC %{!shared:crtend.o%s} %{!static:%{shared:crtendS.o%s}}
 
 #define INIT_SECTION_ASM_OP \t.section LIB$INITIALIZE,GBL,NOWRT
 
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 5b2d748..98a6120 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -3649,16 +3649,8 @@ void
 ia64_start_function (FILE *file, const char *fnname,
 tree decl ATTRIBUTE_UNUSED)
 {
-#if VMS_DEBUGGING_INFO
-  if (vms_debug_main
-   debug_info_level  DINFO_LEVEL_NONE
-   strncmp (vms_debug_main, fnname, strlen (vms_debug_main)) == 0)
-{
-  targetm.asm_out.globalize_label (asm_out_file, VMS_DEBUG_MAIN_POINTER);
-  ASM_OUTPUT_DEF (asm_out_file, VMS_DEBUG_MAIN_POINTER, fnname);
-  dwarf2out_vms_debug_main_pointer ();
-  vms_debug_main = 0;
-}
+#if TARGET_ABI_OPEN_VMS
+  vms_start_function (fnname);
 #endif
 
   fputs (\t.proc , file);
diff --git a/gcc/config/ia64/vms.h b/gcc/config/ia64/vms.h
index 1908c47..3e0c653 100644
--- a/gcc/config/ia64/vms.h
+++ b/gcc/config/ia64/vms.h
@@ -30,8 +30,6 @@ along with GCC; see the file COPYING3.  If not see
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT 

Re: [PATCH][ARM] NEON DImode neg

2012-03-26 Thread Andrew Stubbs

On 28/02/12 17:45, Andrew Stubbs wrote:

Hi all,

This patch adds a DImode negate pattern for NEON.

Unfortunately, the NEON vneg instruction only supports vectors, not
singletons, so there's no direct way to do it in DImode, and the
compiler ends up moving the value back to core registers, negating it,
and returning to NEON afterwards:

fmrrd r2, r3, d16 @ int
negs r2, r2
sbc r3, r3, r3, lsl #1
fmdrr d16, r2, r3 @ int

The new patch does it entirely in NEON:

vmov.i32 d17, #0 @ di
vsub.i64 d16, d17, d16

(Note that this is the result when combined with my recent patch for
NEON DImode immediates. Without that you get a constant pool load.)


This updates fixes a bootstrap failure caused by an early clobber error. 
I've also got a native regression test running now.


OK?

Andrew
2012-03-26  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* config/arm/arm.md (negdi2): Use gen_negdi2_neon.
	* config/arm/neon.md (negdi2_neon): New insn.
	Also add splitters for core and NEON registers.

---
 gcc/config/arm/arm.md  |8 +++-
 gcc/config/arm/neon.md |   37 +
 2 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 751997f..f1dbbf7 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4048,7 +4048,13 @@
 	 (neg:DI (match_operand:DI 1 s_register_operand )))
 (clobber (reg:CC CC_REGNUM))])]
   TARGET_EITHER
-  
+  {
+if (TARGET_NEON)
+  {
+emit_insn (gen_negdi2_neon (operands[0], operands[1]));
+	DONE;
+  }
+  }
 )
 
 ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 3c88568..bf229a7 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -922,6 +922,43 @@
 (const_string neon_int_3)))]
 )
 
+(define_insn negdi2_neon
+  [(set (match_operand:DI 0 s_register_operand	 = w,?r,?r,?w)
+	(neg:DI (match_operand:DI 1 s_register_operand   w, 0,  r, w)))
+   (clobber (match_scratch:DI 2 =w, X,  X,w))
+   (clobber (reg:CC CC_REGNUM))]
+  TARGET_NEON
+  #
+  [(set_attr length 8)
+   (set_attr arch nota8,*,*,onlya8)]
+)
+
+; Split negdi2_neon for vfp registers
+(define_split
+  [(set (match_operand:DI 0 s_register_operand )
+	(neg:DI (match_operand:DI 1 s_register_operand )))
+   (clobber (match_scratch:DI 2 ))
+   (clobber (reg:CC CC_REGNUM))]
+  TARGET_NEON  reload_completed  IS_VFP_REGNUM (REGNO (operands[0]))
+  [(set (match_dup 2) (const_int 0))
+   (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1)))
+	  (clobber (reg:CC CC_REGNUM))])]
+  
+)
+
+; Split negdi2_neon for core registers
+(define_split
+  [(set (match_operand:DI 0 s_register_operand )
+	(neg:DI (match_operand:DI 1 s_register_operand )))
+   (clobber (match_scratch:DI 2 ))
+   (clobber (reg:CC CC_REGNUM))]
+  TARGET_32BIT  reload_completed
+arm_general_register_operand (operands[0], DImode)
+  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
+	  (clobber (reg:CC CC_REGNUM))])]
+  
+)
+
 (define_insn *uminmode3_neon
   [(set (match_operand:VDQIW 0 s_register_operand =w)
 	(umin:VDQIW (match_operand:VDQIW 1 s_register_operand w)


Re: [PATCH] Bug fix in store_bit_field_1 for big endian targets (issue 51893)

2012-03-26 Thread Aurelien Buhrig

 Passes bootstrap and regression test powerpc64-linux.
 
 Thanks a lot, Alan!
 
 So, Aurelien, you only need to adjust the formatting of the patch and post a 
 ChangeLog entry along with it.  TIA.
 

Thanks Alan!
Bootstrap and regression test for m68k-elf ok, but I have trouble cross
compiling trunk for powerpc64-linux target...

Changelog:
* expmed.c (store_bit_field_1): Fix wordnum value for big endian targets

Will this fix be backported to 4.6 branch for next release?

Aurélien


Index: gcc/expmed.c
===
--- gcc/expmed.c(revision 185732)
+++ gcc/expmed.c(working copy)
@@ -550,7 +550,10 @@
{
  /* If I is 0, use the low-order word in both field and target;
 if I is 1, use the next to lowest word; and so on.  */
- unsigned int wordnum = (backwards ? nwords - i - 1 : i);
+ unsigned int wordnum = (backwards
+ ? GET_MODE_SIZE (fieldmode) / UNITS_PER_WORD
+ - i - 1
+ : i);
  unsigned int bit_offset = (backwards
 ? MAX ((int) bitsize - ((int) i + 1)
* BITS_PER_WORD,


Re: [PATCH] Remove strict-alignment checks in SRA

2012-03-26 Thread Richard Guenther
On Fri, 23 Mar 2012, Martin Jambor wrote:

 Hi,
 
 since we now should be able to expand misaligned MEM_REFs properly and
 both SRA and IPA-SRA now tag the memory accesses with the appropriate
 alignment information, we should now be able to get rid off the SRA
 disabling in the face of potential strict-alignment expansion
 problems.  The patch below does that.
 
 When I applied it before applying the patches fixing misaligned
 expansion, testcases gcc.dg/torture/pr52402.c and
 gcc.dg/tree-ssa/pr49923.c failed on strict-alignment platforms.
 However, when applied to the current trunk, they pass bootstrap and
 testsuite on sparc64-linux, ia64-linux, x86_64-linux and i686-linux.
 OK for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 
 Martin
 
 
 2012-03-20  Martin Jambor  mjam...@suse.cz
 
   PR tree-optimization/50052
   * tree-sra.c (tree_non_aligned_mem_p): Removed.
   (tree_non_aligned_mem_for_access_p): Likewise.
   (build_accesses_from_assign): Removed strict alignment requirements
   checks.
   (access_precludes_ipa_sra_p): Likewise.
 
   * testsuite/gcc.dg/ipa/ipa-sra-2.c: Also run on strict-alignment
   platforms.
 
 
 Index: src/gcc/tree-sra.c
 ===
 --- src.orig/gcc/tree-sra.c
 +++ src/gcc/tree-sra.c
 @@ -1081,53 +1081,6 @@ disqualify_ops_if_throwing_stmt (gimple
return false;
  }
  
 -/* Return true if EXP is a memory reference less aligned than ALIGN.  This is
 -   invoked only on strict-alignment targets.  */
 -
 -static bool
 -tree_non_aligned_mem_p (tree exp, unsigned int align)
 -{
 -  unsigned int exp_align;
 -
 -  if (TREE_CODE (exp) == VIEW_CONVERT_EXPR)
 -exp = TREE_OPERAND (exp, 0);
 -
 -  if (TREE_CODE (exp) == SSA_NAME || is_gimple_min_invariant (exp))
 -return false;
 -
 -  /* get_object_alignment will fall back to BITS_PER_UNIT if it cannot
 - compute an explicit alignment.  Pretend that dereferenced pointers
 - are always aligned on strict-alignment targets.  */
 -  if (TREE_CODE (exp) == MEM_REF || TREE_CODE (exp) == TARGET_MEM_REF)
 -exp_align = get_object_or_type_alignment (exp);
 -  else
 -exp_align = get_object_alignment (exp);
 -
 -  if (exp_align  align)
 -return true;
 -
 -  return false;
 -}
 -
 -/* Return true if EXP is a memory reference less aligned than what the access
 -   ACC would require.  This is invoked only on strict-alignment targets.  */
 -
 -static bool
 -tree_non_aligned_mem_for_access_p (tree exp, struct access *acc)
 -{
 -  unsigned int acc_align;
 -
 -  /* The alignment of the access is that of its expression.  However, it may
 - have been artificially increased, e.g. by a local alignment promotion,
 - so we cap it to the alignment of the type of the base, on the grounds
 - that valid sub-accesses cannot be more aligned than that.  */
 -  acc_align = get_object_alignment (acc-expr);
 -  if (acc-base  acc_align  TYPE_ALIGN (TREE_TYPE (acc-base)))
 -acc_align = TYPE_ALIGN (TREE_TYPE (acc-base));
 -
 -  return tree_non_aligned_mem_p (exp, acc_align);
 -}
 -
  /* Scan expressions occuring in STMT, create access structures for all 
 accesses
 to candidates for scalarization and remove those candidates which occur in
 statements or expressions that prevent them from being split apart.  
 Return
 @@ -1154,11 +1107,7 @@ build_accesses_from_assign (gimple stmt)
lacc = build_access_from_expr_1 (lhs, stmt, true);
  
if (lacc)
 -{
 -  lacc-grp_assignment_write = 1;
 -  if (STRICT_ALIGNMENT  tree_non_aligned_mem_for_access_p (rhs, lacc))
 -lacc-grp_unscalarizable_region = 1;
 -}
 +lacc-grp_assignment_write = 1;
  
if (racc)
  {
 @@ -1166,8 +1115,6 @@ build_accesses_from_assign (gimple stmt)
if (should_scalarize_away_bitmap  !gimple_has_volatile_ops (stmt)
  !is_gimple_reg_type (racc-type))
   bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc-base));
 -  if (STRICT_ALIGNMENT  tree_non_aligned_mem_for_access_p (lhs, racc))
 -racc-grp_unscalarizable_region = 1;
  }
  
if (lacc  racc
 @@ -3835,10 +3782,6 @@ access_precludes_ipa_sra_p (struct acces
 || gimple_code (access-stmt) == GIMPLE_ASM))
  return true;
  
 -  if (STRICT_ALIGNMENT
 -   tree_non_aligned_mem_p (access-expr, TYPE_ALIGN (access-type)))
 -return true;
 -
return false;
  }
  
 Index: src/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c
 ===
 --- src.orig/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c
 +++ src/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c
 @@ -1,6 +1,5 @@
  /* { dg-do compile } */
  /* { dg-options -O2 -fipa-sra -fdump-tree-eipa_sra-details  } */
 -/* { dg-require-effective-target non_strict_align } */
  
  struct bovid
  {
 
 

-- 
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

[PATCH] Permanent Fix for PR46886

2012-03-26 Thread Razya Ladelsky
Hi,

This is (hopefully) a permanent fix  to pr46886.c
I removed the condition preventing parallelization of do_while loops, as 
it 
was blocking parallelizing important loops in spec-2006.
The patch fixes the number of iterations for cases where the body could 
appear in the latch, as in pr46886.c.

2012-03-26  Razya Ladelsky  ra...@il.ibm.com

 PR tree-optimization/46886
 * tree-parloops.c (transform_to_exit_first_loop):Set 
number of iterations correctly when the body may appear at the latch.
 (pallelize_loops): Remove the condition preventing 
do-while loops.
 


Bootstrap and testsuite psss successfully on power7 linux machine.
Ok to commit?
Thanks,



Index: tree-parloops.c
===
--- tree-parloops.c (revision 185775)
+++ tree-parloops.c (working copy)
@@ -1522,7 +1522,10 @@ transform_to_exit_first_loop (struct loop *loop, h
 the condition, moving the condition to the entry requires
 decrementing one iteration.  */
   exit_1 = EDGE_SUCC (exit-src, EDGE_SUCC (exit-src, 0) == exit); 
-  if (exit_1-dest == loop-latch)
+  
+  /* if the latch contains more than the one statemnt of control variable 
+ increment then it contains the body.  */
+  if (exit_1-dest == loop-latch  last_and_only_stmt (loop-latch))
 new_rhs = gimple_cond_rhs (cond_stmt);
   else
   {
@@ -2146,7 +2149,6 @@ parallelize_loops (void)
 return false;
   if (cfun-has_nonlocal_label)
 return false;
-
   gcc_obstack_init (parloop_obstack);
   reduction_list = htab_create (10, reduction_info_hash,
 reduction_info_eq, free);
@@ -2187,10 +2189,7 @@ parallelize_loops (void)
  || loop_has_blocks_with_irreducible_flag (loop)
  || (loop_preheader_edge (loop)-src-flags  BB_IRREDUCIBLE_LOOP)
  /* FIXME: the check for vector phi nodes could be removed.  */
- || loop_has_vector_phi_nodes (loop)
- /* FIXME: transform_to_exit_first_loop does not handle not
-header-copied loops correctly - see PR46886.  */
- || !do_while_loop_p (loop))
+ || loop_has_vector_phi_nodes (loop))
continue;
   estimated = max_stmt_executions_int (loop, false);
   /* FIXME: Bypass this check as graphite doesn't update the
@@ -2213,6 +2212,7 @@ parallelize_loops (void)
continue;
 
   changed = true;
+   
   if (dump_file  (dump_flags  TDF_DETAILS))
   {
if (loop-inner)
=

Re: [PATCH] Permanent Fix for PR46886

2012-03-26 Thread Richard Guenther
On Mon, 26 Mar 2012, Razya Ladelsky wrote:

 Hi,
 
 This is (hopefully) a permanent fix  to pr46886.c
 I removed the condition preventing parallelization of do_while loops, as 
 it 
 was blocking parallelizing important loops in spec-2006.
 The patch fixes the number of iterations for cases where the body could 
 appear in the latch, as in pr46886.c.
 
 2012-03-26  Razya Ladelsky  ra...@il.ibm.com
 
  PR tree-optimization/46886
  * tree-parloops.c (transform_to_exit_first_loop):Set 
 number of iterations correctly when the body may appear at the latch.
  (pallelize_loops): Remove the condition preventing 
 do-while loops.
  
 
 
 Bootstrap and testsuite psss successfully on power7 linux machine.
 Ok to commit?

+  
+  /* if the latch contains more than the one statemnt of control variable 
+ increment then it contains the body.  */
+  if (exit_1-dest == loop-latch  last_and_only_stmt (loop-latch))
 new_rhs = gimple_cond_rhs (cond_stmt);

please check what the comment suggests, thus,

last_and_only_stmt (loop-latch) == cond_stmt

tree-parloops.c is quite fragile in what it expects the IL to look like
and tests do not cover what later code expects.  Please do not add to 
that.

@@ -2146,7 +2149,6 @@ parallelize_loops (void)
 return false;
   if (cfun-has_nonlocal_label)
 return false;
-

spurious whitespace change.

@@ -2213,6 +2212,7 @@ parallelize_loops (void)
continue;
 
   changed = true;
+   
   if (dump_file  (dump_flags  TDF_DETAILS))
   {
if (loop-inner)

Likewise.

Ok with the above change.

Thanks,
Richard.


[C++ RFC / Patch] Implementing Deducing noexcept for destructors

2012-03-26 Thread Paolo Carlini
[sorry, I'm resending this because inadvertently I had some html and the 
message got rejected]


Hi,

thus, I have been working on c++/50043, which boils down to this:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3204.htm

My basic idea so far is very simple:

--- class.c (revision 185792)
+++ class.c (working copy)
@@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree using_dec
destructor,
type);
}
+ else if (cxx_dialect = cxx0x
+  !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method)))
+ TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method),
+ noexcept_true_spec);
}
else
{

thus, right before actually adding the method we check whether nothing 
has been deduced about it and we enforce the default noexcept. It's 
already enough to cover the testcase provided in c++/50043, which 
includes a good range of positive and negative tests. Does the idea make 
sense? As is, the patchlet leads to a few regressions, largely benign as 
far as I can see:


FAIL: g++.dg/cpp0x/noexcept01.C (test for excess errors)
FAIL: g++.dg/eh/ctor1.C -std=c++11 execution test
FAIL: g++.dg/eh/init-temp1.C -std=c++11 execution test
FAIL: g++.dg/tree-ssa/ehcleanup-1.C -std=gnu++11 scan-tree-dump-times 
ehcleanup1 Removing unreachable 4

FAIL: g++.old-deja/g++.eh/cleanup1.C -std=c++11 execution test

(the idea would changing the tests to be c++98 only and then adding 
c++11 counterparts)


Something the patchlet does not cover is:

struct B
{ ~B(); };

B::~B() { }

mailto:paolo.carl...@oracle.com
indeed the as an implicit declaration bits of the new wording in C++11 
doesn't guide so much about this, I guess it means something like:


--- decl.c (revision 185792)
+++ decl.c (working copy)
@@ -1144,7 +1144,10 @@ check_redeclaration_exception_specification (tree
if ((pedantic || ! DECL_IN_SYSTEM_HEADER (old_decl))
 ! DECL_IS_BUILTIN (old_decl)
 flag_exceptions
-  !comp_except_specs (new_exceptions, old_exceptions, ce_normal))
+  !comp_except_specs (new_exceptions, old_exceptions, ce_normal)
+  !(DECL_DESTRUCTOR_P (new_decl)
+  cxx_dialect = cxx0x
+  !new_exceptions  TYPE_NOEXCEPT_P (old_type)))
{
error (declaration of %qF has a different exception specifier,
new_decl);

does it make sense?

Another case which makes me nervous is when we add to the testcase in 
c++/50043 also a case for a virtual base class destructor, thus 
something like


struct True2 { virtual ~True2(); };
struct False { ~False() noexcept(false); };

template typename Base, typename Member
struct C : Base
{
Member mem;
};

SA(!noexcept(CTrue2, False()));

it doesn't compile at all because:

noexcept_PR50043.C:21:8: error: looser throw specifier for ‘virtual 
CTrue2, False::~C() noexcept (false)’
noexcept_PR50043.C:5:24: error: overriding ‘virtual True2::~True2() 
noexcept (true)’


is this expected? Maybe, but I'm not sure.

Thanks in advance for any tips!
Paolo.



[testsuite,avr,trunk+4.7,committed]: Add LTO tests

2012-03-26 Thread Georg-Johann Lay
http://gcc.gnu.org/viewcvs?view=revisionrevision=185793
http://gcc.gnu.org/viewcvs?view=revisionrevision=185794

This adds LTO tests to avr-torture.exp:

* gcc.target/avr/torture/avr-torture.exp (AVR_TORTURE_OPTIONS):
Add -Os -flto to list.

Johann


Re: [PATCH] PR bootstrap/52623 Disable libquadmath on AIX

2012-03-26 Thread Joseph S. Myers
On Fri, 23 Mar 2012, David Edelsohn wrote:

 The build process of libquadmath sometimes encounters problems on AIX
 due to multilib and LD_LIBRARY_PATH interfering with GCC's own library
 dependencies.  Libquadmath is not used on AIX, so this patch adds it
 to noconfigdirs.

Please include this explanation in a comment on the powerpc-*-aix* | 
rs6000-*-aix* case.  Existing directory-disabling cases may be 
uncommented, but new ones ought to have comments explaining the reason for 
disabling the library.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PATCH: Add MaskNeeded property to option handling

2012-03-26 Thread Joseph S. Myers
On Sun, 25 Mar 2012, H.J. Lu wrote:

 Hi Joseph,
 
 I need to support InverseMask(XXX) in options without the corresponding
 Mask(XXX) since XXX is never set directly via a command line option. This
 patch adds a MaskNeeded property which turns InverseMask(XXX) into
 the inverse version of Mask(XXX), which allocates a unique bit and defines
 the same set of macros as Mask(XXX).  Does it look OK?

I'd have thought that either Mask or InverseMask with a given mask name 
(or a standalone target mask record) should cause allocation (only once, 
no matter how many options use the same mask name), and MaskExists should 
be removed, rather than adding MaskNeeded - if I understood correctly the 
purpose for which you are adding MaskNeeded.

-- 
Joseph S. Myers
jos...@codesourcery.com


Fix PR tree-optimization/52686 (SLP crashes) (Re: Vectorizer patches for 4.8)

2012-03-26 Thread Ulrich Weigand
Hello,

one of Ira's vectorizer patches I recently committed seems to have exposed
a pre-existing bug in handling WIDEN_LSHIFT_EXPR, which now causes ICEs
in SLP due to out-of-bounds memory accesses.

The underlying cause is that vect_get_smallest_scalar_type does not handle
WIDEN_LSHIFT_EXPR as it does the other widening operations.  This causes
the SLP pass to operate on the wrong type and thus get confused about the
number of vector instructions it need to allocate space for.

The following patch fixes all those ICEs for me.  Manual checking shows
that widening shifts are now recognized correctly by SLP.

Tested with no regression on armv7l-linux-gnueabi and i686-linux-gnu.

OK for mainline?

Bye,
Ulrich


ChangeLog:

gcc/
PR tree-optimization/52686
* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Handle
WIDEN_LSHIFT_EXPR.

gcc/testsuite/
PR tree-optimization/52686
* gcc.target/arm/pr52686.c: New test.


Index: gcc/testsuite/gcc.target/arm/pr52686.c
===
--- gcc/testsuite/gcc.target/arm/pr52686.c  (revision 0)
+++ gcc/testsuite/gcc.target/arm/pr52686.c  (revision 0)
@@ -0,0 +1,19 @@
+/* PR target/52375 */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options -march=armv7-a -mfloat-abi=softfp -mfpu=neon -O 
-ftree-vectorize } */
+
+unsigned int output[4];
+
+void test (unsigned short *p)
+{
+  unsigned int x = *p;
+  if (x)
+{
+  output[0] = x  1;
+  output[1] = x  1;
+  output[2] = x  1;
+  output[3] = x  1;
+}
+}
+
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 185467)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -111,6 +111,7 @@
   if (is_gimple_assign (stmt)
(gimple_assign_cast_p (stmt)
   || gimple_assign_rhs_code (stmt) == WIDEN_MULT_EXPR
+  || gimple_assign_rhs_code (stmt) == WIDEN_LSHIFT_EXPR
   || gimple_assign_rhs_code (stmt) == FLOAT_EXPR))
 {
   tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [PATCH] Permanent Fix for PR46886

2012-03-26 Thread Razya Ladelsky
Richard Guenther rguent...@suse.de wrote on 26/03/2012 01:23:15 PM:

 From: Richard Guenther rguent...@suse.de
 To: Razya Ladelsky/Haifa/IBM@IBMIL
 Cc: gcc-patches@gcc.gnu.org
 Date: 26/03/2012 01:23 PM
 Subject: Re: [PATCH] Permanent Fix for PR46886
 
 On Mon, 26 Mar 2012, Razya Ladelsky wrote:
 
  Hi,
  
  This is (hopefully) a permanent fix  to pr46886.c
  I removed the condition preventing parallelization of do_while loops, 
as 
  it 
  was blocking parallelizing important loops in spec-2006.
  The patch fixes the number of iterations for cases where the body 
could 
  appear in the latch, as in pr46886.c.
  
  2012-03-26  Razya Ladelsky  ra...@il.ibm.com
  
   PR tree-optimization/46886
   * tree-parloops.c (transform_to_exit_first_loop):Set 
  number of iterations correctly when the body may appear at the latch.
   (pallelize_loops): Remove the condition preventing 
  do-while loops.
  
  
  
  Bootstrap and testsuite psss successfully on power7 linux machine.
  Ok to commit?
 
 + 
 +  /* if the latch contains more than the one statemnt of control 
variable 
 + increment then it contains the body.  */
 +  if (exit_1-dest == loop-latch  last_and_only_stmt (loop-latch))
  new_rhs = gimple_cond_rhs (cond_stmt);
 
 please check what the comment suggests, thus,
 
 last_and_only_stmt (loop-latch) == cond_stmt
 

Hi Richard,
The latch has at least one stmt for sure:
the control variable increment (not a cond_stmt) coming from the call to 
canonicalize_loop_ivs(loop, nit, true)  earlier in tree-parloops:

 
/* .  When BUMP_IN_LATCH is true, the induction
   variable is incremented in the loop latch, otherwise it is
   incremented in the loop header.  Return the induction variable that
   was created.  */

tree
canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch)



 tree-parloops.c is quite fragile in what it expects the IL to look like
 and tests do not cover what later code expects.  Please do not add to 
 that.


I agree.
I have also tested this code on spec2006 benchmarks.
(together with an upcoming patch, 5 of these show speedups with autopar on 
a linux-power7)




 
 @@ -2146,7 +2149,6 @@ parallelize_loops (void)
  return false;
if (cfun-has_nonlocal_label)
  return false;
 -
 
 spurious whitespace change.
 
 @@ -2213,6 +2212,7 @@ parallelize_loops (void)
 continue;
 
changed = true;
 + 
if (dump_file  (dump_flags  TDF_DETAILS))
{
 if (loop-inner)
 
 Likewise.
 
 Ok with the above change.

Ok, thanks,
Razya

 
 Thanks,
 Richard.
 



Re: Fix PR tree-optimization/52686 (SLP crashes) (Re: Vectorizer patches for 4.8)

2012-03-26 Thread Richard Guenther
On Mon, Mar 26, 2012 at 1:57 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
 Hello,

 one of Ira's vectorizer patches I recently committed seems to have exposed
 a pre-existing bug in handling WIDEN_LSHIFT_EXPR, which now causes ICEs
 in SLP due to out-of-bounds memory accesses.

 The underlying cause is that vect_get_smallest_scalar_type does not handle
 WIDEN_LSHIFT_EXPR as it does the other widening operations.  This causes
 the SLP pass to operate on the wrong type and thus get confused about the
 number of vector instructions it need to allocate space for.

 The following patch fixes all those ICEs for me.  Manual checking shows
 that widening shifts are now recognized correctly by SLP.

 Tested with no regression on armv7l-linux-gnueabi and i686-linux-gnu.

 OK for mainline?

Ok.

Thanks,
Richard.

 Bye,
 Ulrich


 ChangeLog:

        gcc/
        PR tree-optimization/52686
        * tree-vect-data-refs.c (vect_get_smallest_scalar_type): Handle
        WIDEN_LSHIFT_EXPR.

        gcc/testsuite/
        PR tree-optimization/52686
        * gcc.target/arm/pr52686.c: New test.


 Index: gcc/testsuite/gcc.target/arm/pr52686.c
 ===
 --- gcc/testsuite/gcc.target/arm/pr52686.c      (revision 0)
 +++ gcc/testsuite/gcc.target/arm/pr52686.c      (revision 0)
 @@ -0,0 +1,19 @@
 +/* PR target/52375 */
 +/* { dg-do compile } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options -march=armv7-a -mfloat-abi=softfp -mfpu=neon -O 
 -ftree-vectorize } */
 +
 +unsigned int output[4];
 +
 +void test (unsigned short *p)
 +{
 +  unsigned int x = *p;
 +  if (x)
 +    {
 +      output[0] = x  1;
 +      output[1] = x  1;
 +      output[2] = x  1;
 +      output[3] = x  1;
 +    }
 +}
 +
 Index: gcc/tree-vect-data-refs.c
 ===
 --- gcc/tree-vect-data-refs.c   (revision 185467)
 +++ gcc/tree-vect-data-refs.c   (working copy)
 @@ -111,6 +111,7 @@
   if (is_gimple_assign (stmt)
        (gimple_assign_cast_p (stmt)
           || gimple_assign_rhs_code (stmt) == WIDEN_MULT_EXPR
 +          || gimple_assign_rhs_code (stmt) == WIDEN_LSHIFT_EXPR
           || gimple_assign_rhs_code (stmt) == FLOAT_EXPR))
     {
       tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
 --
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



[PATCH] Fix PR52701

2012-03-26 Thread Richard Guenther

This fixes PR52701, we need to unconditionally remember the evolution
as irrelevant PHIs (apart from their final value) are not detected
as induction with variable step.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-03-26  Richard Guenther  rguent...@suse.de

PR tree-optimization/52701
* tree-vect-loop.c (vect_analyze_scalar_cycles_1): Always
compute and set the evolution part of PHI nodes.

* gfortran.dg/pr52701.f90: New testcase.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c(revision 185792)
--- gcc/tree-vect-loop.c(working copy)
*** vect_analyze_scalar_cycles_1 (loop_vec_i
*** 565,575 
/* Analyze the evolution function.  */
access_fn = analyze_scalar_evolution (loop, def);
if (access_fn)
-   STRIP_NOPS (access_fn);
-   if (access_fn  vect_print_dump_info (REPORT_DETAILS))
{
! fprintf (vect_dump, Access function of PHI: );
! print_generic_expr (vect_dump, access_fn, TDF_SLIM);
}
  
if (!access_fn
--- 565,579 
/* Analyze the evolution function.  */
access_fn = analyze_scalar_evolution (loop, def);
if (access_fn)
{
! STRIP_NOPS (access_fn);
! if (vect_print_dump_info (REPORT_DETAILS))
!   {
! fprintf (vect_dump, Access function of PHI: );
! print_generic_expr (vect_dump, access_fn, TDF_SLIM);
!   }
! STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo)
!   = evolution_part_in_loop_num (access_fn, loop-num);
}
  
if (!access_fn
*** vect_analyze_scalar_cycles_1 (loop_vec_i
*** 579,586 
  continue;
}
  
-   STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo)
-   = evolution_part_in_loop_num (access_fn, loop-num);
gcc_assert (STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo) != 
NULL_TREE);
  
if (vect_print_dump_info (REPORT_DETAILS))
--- 583,588 
Index: gcc/testsuite/gfortran.dg/pr52701.f90
===
*** gcc/testsuite/gfortran.dg/pr52701.f90   (revision 0)
--- gcc/testsuite/gfortran.dg/pr52701.f90   (revision 0)
***
*** 0 
--- 1,20 
+ ! { dg-do compile }
+ ! { dg-options -O3 }
+ function pr52701 (x, z, e, f, g, l)
+   integer a, b, c, d, e, f, g, i, j, l, pr52701
+   double precision x(e), z(e*e)
+   do i = l, f
+ do j = l, i
+   d = 0
+   do a = 1, g
+ c = a - g
+ do b = 1, g
+   d = d + 1
+   c = c + g
+   z(d) = z(d) / (x(i) + x(j) - x(f + a) - x(f + b))
+ end do
+   end do
+ end do
+   end do
+   pr52701 = c
+ end


[PATCH] Fix PR52721

2012-03-26 Thread Richard Guenther

This fixes PR52721, vect_init_vector is appearantly also used to
init scalars ... the following patch fixes the ICE and adjusts
the comments/implementation to not suggest we are dealing with
vectors only.

Bootstrapped and tested on x86_64-unknown-linux-gnu, appplied to trunk.

Richard.

2012-03-26  Richard Guenther  rguent...@suse.de

PR tree-optimization/52721
* tree-vect-stmts.c (vect_init_vector): Handle scalars.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 185792)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -1142,7 +1142,6 @@ vect_get_load_cost (struct data_referenc
 static void
 vect_init_vector_1 (gimple stmt, gimple new_stmt, gimple_stmt_iterator *gsi)
 {
-
   if (gsi)
 vect_finish_stmt_generation (stmt, new_stmt, gsi);
   else
@@ -1185,48 +1184,48 @@ vect_init_vector_1 (gimple stmt, gimple
 
 /* Function vect_init_vector.
 
-   Insert a new stmt (INIT_STMT) that initializes a new vector variable with
-   the vector elements of VECTOR_VAR.  Place the initialization at BSI if it
-   is not NULL.  Otherwise, place the initialization at the loop preheader.
+   Insert a new stmt (INIT_STMT) that initializes a new variable of type
+   TYPE with the value VAL.  If TYPE is a vector type and VAL does not have
+   vector type a vector with all elements equal to VAL is created first.
+   Place the initialization at BSI if it is not NULL.  Otherwise, place the
+   initialization at the loop preheader.
Return the DEF of INIT_STMT.
It will be used in the vectorization of STMT.  */
 
 tree
-vect_init_vector (gimple stmt, tree vector_var, tree vector_type,
- gimple_stmt_iterator *gsi)
+vect_init_vector (gimple stmt, tree val, tree type, gimple_stmt_iterator *gsi)
 {
   tree new_var;
   gimple init_stmt;
   tree vec_oprnd;
   tree new_temp;
 
-  if (TREE_CODE (TREE_TYPE (vector_var)) != VECTOR_TYPE)
+  if (TREE_CODE (type) == VECTOR_TYPE
+   TREE_CODE (TREE_TYPE (val)) != VECTOR_TYPE)
 {
-  if (!types_compatible_p (TREE_TYPE (vector_type),
-  TREE_TYPE (vector_var)))
+  if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
{
- if (CONSTANT_CLASS_P (vector_var))
-   vector_var = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type),
-vector_var);
+ if (CONSTANT_CLASS_P (val))
+   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
  else
{
- new_var = create_tmp_reg (TREE_TYPE (vector_type), NULL);
+ new_var = create_tmp_reg (TREE_TYPE (type), NULL);
  add_referenced_var (new_var);
  init_stmt = gimple_build_assign_with_ops (NOP_EXPR,
-   new_var, vector_var,
+   new_var, val,
NULL_TREE);
  new_temp = make_ssa_name (new_var, init_stmt);
  gimple_assign_set_lhs (init_stmt, new_temp);
  vect_init_vector_1 (stmt, init_stmt, gsi);
- vector_var = new_temp;
+ val = new_temp;
}
}
-  vector_var = build_vector_from_val (vector_type, vector_var);
+  val = build_vector_from_val (type, val);
 }
 
-  new_var = vect_get_new_vect_var (vector_type, vect_simple_var, cst_);
+  new_var = vect_get_new_vect_var (type, vect_simple_var, cst_);
   add_referenced_var (new_var);
-  init_stmt = gimple_build_assign  (new_var, vector_var);
+  init_stmt = gimple_build_assign  (new_var, val);
   new_temp = make_ssa_name (new_var, init_stmt);
   gimple_assign_set_lhs (init_stmt, new_temp);
   vect_init_vector_1 (stmt, init_stmt, gsi);


[Patch,avr]: PR target/52692: Imlpement TARGET_BUILTIN_DECL

2012-03-26 Thread Georg-Johann Lay
This implements TARGET_BUILTIN_DECL which is needed to make
LTO work with target-specific built-ins.

struct avr_builtin_description gets a new field .fndecl which is
initialized during avr_init_builtins and looked up in new hook
avr_builtin_decl.

The built-ins are initialized in such a way that there is always
  avr_bdesc[x].id == x
i.e. .id is superfluous and avr_expand_builtin can use id to access
the array and need no search loop to find built-in id.

Johann

Ok?

gcc/
PR target/52692
* config/avr/avr.c (TARGET_BUILTIN_DECL): New define.
(avr_builtin_decl): New static function.
(struct avr_builtin_description, avr_bdesc): Move up.
Add GTY marker. Add field fndecl. Remove redundant field id.
(avr_init_builtins): Initialize avr_bdesc[ID].fndecl.
(avr_expand_builtin): Code cleanup because .id is removed.

testsuite/
* gcc.target/avr/torture/builtins-2.c: New test.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 185792)
+++ config/avr/avr.c	(working copy)
@@ -10285,6 +10285,42 @@ enum avr_builtin_id
 AVR_BUILTIN_COUNT
   };
 
+struct GTY(()) avr_builtin_description
+{
+  enum insn_code icode;
+  const char *name;
+  int n_args;
+  tree fndecl;
+};
+
+
+/* Notice that avr_bdesc[] and avr_builtin_id are initialized in such a way
+   that a built-in's ID can be used to access the built-in by means of
+   avr_bdesc[ID]  */
+
+static GTY(()) struct avr_builtin_description
+avr_bdesc[AVR_BUILTIN_COUNT] =
+  {
+
+#define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, ICODE)  \
+{ ICODE, NAME, N_ARGS, NULL_TREE },
+#include builtins.def  
+#undef DEF_BUILTIN
+  };
+
+
+/* Implement `TARGET_BUILTIN_DECL'.  */
+
+static tree
+avr_builtin_decl (unsigned id, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  if (id  AVR_BUILTIN_COUNT)
+return avr_bdesc[id].fndecl;
+
+  return error_mark_node;
+}
+
+
 static void
 avr_init_builtin_int24 (void)
 {
@@ -10295,6 +10331,7 @@ avr_init_builtin_int24 (void)
   (*lang_hooks.types.register_builtin_type) (uint24_type, __uint24);
 }
 
+
 /* Implement `TARGET_INIT_BUILTINS' */
 /* Set up all builtin functions for this target.  */
 
@@ -10348,7 +10385,9 @@ avr_init_builtins (void)
   NULL);
 
 #define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, CODE)   \
-  add_builtin_function (NAME, TYPE, ID, BUILT_IN_MD, NULL, NULL_TREE);
+  gcc_assert (ID  AVR_BUILTIN_COUNT);  \
+  avr_bdesc[ID].fndecl  \
+= add_builtin_function (NAME, TYPE, ID, BUILT_IN_MD, NULL, NULL_TREE);
 #include builtins.def  
 #undef DEF_BUILTIN
   
@@ -10356,27 +10395,6 @@ avr_init_builtins (void)
 }
 
 
-struct avr_builtin_description
-{
-  enum insn_code icode;
-  const char *name;
-  enum avr_builtin_id id;
-  int n_args;
-};
-
-static const struct avr_builtin_description
-avr_bdesc[] =
-  {
-
-#define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, ICODE)  \
-{ ICODE, NAME, ID, N_ARGS },
-#include builtins.def  
-#undef DEF_BUILTIN
-
-{ CODE_FOR_nothing, NULL, 0, -1 }
-  };
-
-
 /* Subroutine of avr_expand_builtin to take care of unop insns.  */
 
 static rtx
@@ -10545,6 +10563,7 @@ avr_expand_triop_builtin (enum insn_code
 }
 
 
+/* Implement `TARGET_EXPAND_BUILTIN'.  */
 /* Expand an expression EXP that calls a built-in function,
with result going to TARGET if that's convenient
(and in mode MODE if that's convenient).
@@ -10557,13 +10576,15 @@ avr_expand_builtin (tree exp, rtx target
 enum machine_mode mode ATTRIBUTE_UNUSED,
 int ignore ATTRIBUTE_UNUSED)
 {
-  size_t i;
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   const char* bname = IDENTIFIER_POINTER (DECL_NAME (fndecl));
   unsigned int id = DECL_FUNCTION_CODE (fndecl);
+  const struct avr_builtin_description *d = avr_bdesc[id];
   tree arg0;
   rtx op0;
 
+  gcc_assert (id  AVR_BUILTIN_COUNT);
+
   switch (id)
 {
 case AVR_BUILTIN_NOP:
@@ -10597,29 +10618,22 @@ avr_expand_builtin (tree exp, rtx target
   }
 }
 
-  for (i = 0; avr_bdesc[i].name; i++)
+  /* No special treatment needed: vanilla expand.  */
+  
+  switch (d-n_args)
 {
-  const struct avr_builtin_description *d = avr_bdesc[i];
+case 0:
+  emit_insn ((GEN_FCN (d-icode)) (target));
+  return 0;
   
-  if (d-id == id)
-switch (d-n_args)
-  {
-  case 0:
-emit_insn ((GEN_FCN (d-icode)) (target));
-return 0;
-
-  case 1:
-return avr_expand_unop_builtin (d-icode, exp, target);
-
-  case 2:
-return avr_expand_binop_builtin (d-icode, exp, target);
-
-  case 3:
-return avr_expand_triop_builtin (d-icode, exp, target);
-
-  default:
-gcc_unreachable();
-}
+   

[PATCH] PR c++/52672

2012-03-26 Thread Meador Inge
Hi All,

This patch fixes an ICE that occurs when attempting to fold nested INDIRECT_REF
trees that have conversions in between the indirect references.  For example:

constexpr unsigned long b = *((ul_ptr)(*((ul_ptr)0x0)));

What happens is that 'cxx_fold_indirect_ref' gets the top indirect reference,
strips out all the conversions ending up with the nested indirect reference,
and then asserts that the type is a pointer type, which (obviously) it isn't.

This patch fixes the problem by exiting the fold for non-pointer sub-trees
instead of doing the assert.  'fold_indirect_ref_1' already does things this
way.

Tested on x86_64 GNU/Linux.  OK?

Also, This fixes the problem on trunk, but I have reproduced the issue back to
the 4.6 branch as well.  On the 4.6 branch I think it is OK to just remove the
'gcc_assert (POINTER_TYPE_P (subtype))' assert.  I tested this and saw no
regressions.

P.S.  If it is OK can some please commit for me?  I don't have write access.

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software
cp/
2012-03-26  Meador Inge  mead...@codesourcery.com

	PR c++/52672
	* gcc/cp/semantics.c (cxx_fold_indirect_ref): Don't attempt to fold
	stripped child trees that are not pointer types.

testsuite/
2012-03-26  Meador Inge  mead...@codesourcery.com

	PR c++/52672
	* g++.dg/cpp0x/constexpr-52672.C: New testcase.Index: gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C
===
--- gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C	(revision 0)
@@ -0,0 +1,8 @@
+// PR c++/52672
+// { dg-do compile }
+// { dg-options -std=c++11 }
+
+typedef unsigned long * ul_ptr;
+constexpr unsigned long a = *((ul_ptr)0x0); // { dg-error  }
+constexpr unsigned long b = *((ul_ptr)(*((ul_ptr)0x0))); // { dg-error  }
+constexpr unsigned long c = *((ul_ptr)*((ul_ptr)(*((ul_ptr)0x0; // { dg-error  }
Index: gcc/cp/semantics.c
===
--- gcc/cp/semantics.c	(revision 185750)
+++ gcc/cp/semantics.c	(working copy)
@@ -7202,7 +7202,8 @@ cxx_fold_indirect_ref (location_t loc, tree type,
   sub = op0;
   STRIP_NOPS (sub);
   subtype = TREE_TYPE (sub);
-  gcc_assert (POINTER_TYPE_P (subtype));
+  if (!POINTER_TYPE_P (subtype))
+return NULL_TREE;
 
   if (TREE_CODE (sub) == ADDR_EXPR)
 {


[PATCH H8300] Add function_vector attribute

2012-03-26 Thread Ajinkya Dhobale
Hi,

Please find the attached h8300-func-vect.patch that adds 
'function_vector' attribute for H8300 target.

I have taken the reference from previously posted patch for GCC-3.4.2:
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02529.html

I have updated functionality of the attribute to GCC-4.7.0 source. 
Can someone please review the patch and let me know if there should be 
any modifications in it?

I will briefly mention about 'function_vector' attribute.
For test code:

void foo (void)
{
}
void bar (void)
{
foo();
}

For regular function call, compiler generates code as:
_bar:
mov.w   r6,@-r7
mov.w   r7,r6
jsr @_foo   [length=4]
mov.w   @r7+,r6
rts

Using 'function_vector' attribute, code is generated as:
_bar:
mov.w   r6,@-r7
mov.w   r7,r6
jsr @@ vector_number:8 [length=2]
mov.w   @r7+,r6
rts

I have built the toolchain with 'c' language enabled and did the 
regression testing. Results are fine. I could not build the toolchain 
for 'c++' language as GCC-4.7.0 build fails for it for H8300 target.

==
ChangeLog:
2012-26-03 Ajinkya Dhobale ajinkya.dhob...@kpitcummins.com

* config/h8300/h8300.c
(h8300_handle_function_vector_attribute): New function.
(print_operand): 'B', new print_operand switch case for jsr
instruction.
(h8300_attrib): Redefined for argument and new function.

* config/h8300/h8300.md
(call): Changed assembly output to get processed by
print_operand function.
(call_value): Same as (call).

* doc/extend.texi
(function_vector): Added description for H8300 target.
===

Thanks  Regards,
Ajinkya Dhobale
KPIT Cummins Infosystems Ltd,
Pune (INDIA)



h8300-func-vect.patch
Description: h8300-func-vect.patch


Fw: [PATCH] Permanent Fix for PR46886

2012-03-26 Thread Razya Ladelsky
 + 
 +  /* if the latch contains more than the one statemnt of control 
variable 
 + increment then it contains the body.  */
 +  if (exit_1-dest == loop-latch  last_and_only_stmt (loop-latch))
  new_rhs = gimple_cond_rhs (cond_stmt);
 
 please check what the comment suggests, thus,
 
 last_and_only_stmt (loop-latch) == cond_stmt
 


Richard,
There's no simple way to do such a check,
as the control variable increment stmt is explicitly created in 
canonicalize_loop_ivs and not in tree-parloops.
What we do know is that it is put in the latch because we ask for it when 
we call canonicalize_loop_ivs from parloops.

One way to get the control variable increment stmt is to use 
canonicalize_loop_ivs() return value and look for the stmts that use it.

Do you think I should add all this functionality to the code in order to 
assert
the stmt that is in the latch? or can I skip it ?

Thanks,
Razya



Re: [PATCH] Permanent Fix for PR46886

2012-03-26 Thread Richard Guenther
On Mon, 26 Mar 2012, Razya Ladelsky wrote:

  + 
  +  /* if the latch contains more than the one statemnt of control 
 variable 
  + increment then it contains the body.  */
  +  if (exit_1-dest == loop-latch  last_and_only_stmt (loop-latch))
   new_rhs = gimple_cond_rhs (cond_stmt);
  
  please check what the comment suggests, thus,
  
  last_and_only_stmt (loop-latch) == cond_stmt
  
 
 
 Richard,
 There's no simple way to do such a check,
 as the control variable increment stmt is explicitly created in 
 canonicalize_loop_ivs.
 What we do know is that it is put in the latch because we ask for it when 
 we call 
 canonicalize_loop_ivs from parloops.
 
 One way to get the control variable increment stmt is to use
 canonicalize_loop_ivs return value and look for the stmts that use it.
 
 Do you think I should add all this functionality to the code in order to 
 assert
 the stmt that is in the latch? or can I skip it ?

No, you do not need to do that.  I see the check is basically
verifying the latch is empty (apart from the newly inserted
IV update).

Well, looking at the code simply checking for an empty latch
(before canonicalize_loop_ivs) is not going to make the case
correct when we do _not_ have an empty latch.  That is, the whole
logic in transform_to_exit_first_loop seems to assume that either
the latch is empty or the header is empty, but this does not seem
to be verified anywhere (AFAIU)?  We copy the loop anyway but
exempt the code in the latch (well, it's not exactly clear
how gimple_duplicate_sese_tail will end up turning the loop
into while (...) style to me).

It might be that the patch is a progression, so it's ok in its
original form.

Richard.


Re: PATCH: Add MaskNeeded property to option handling

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Sun, 25 Mar 2012, H.J. Lu wrote:

 Hi Joseph,

 I need to support InverseMask(XXX) in options without the corresponding
 Mask(XXX) since XXX is never set directly via a command line option. This
 patch adds a MaskNeeded property which turns InverseMask(XXX) into
 the inverse version of Mask(XXX), which allocates a unique bit and defines
 the same set of macros as Mask(XXX).  Does it look OK?

 I'd have thought that either Mask or InverseMask with a given mask name
 (or a standalone target mask record) should cause allocation (only once,
 no matter how many options use the same mask name), and MaskExists should
 be removed, rather than adding MaskNeeded - if I understood correctly the
 purpose for which you are adding MaskNeeded.


That is correct.  I will work on a patch to remove  MaskExists.

Thanks.

-- 
H.J.


Re: [PATCH H8300] Add function_vector attribute

2012-03-26 Thread Jeff Law

On 03/26/2012 08:18 AM, Ajinkya Dhobale wrote:

Hi,

Please find the attached h8300-func-vect.patch that adds
'function_vector' attribute for H8300 target.

I have taken the reference from previously posted patch for GCC-3.4.2:
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02529.html

I have updated functionality of the attribute to GCC-4.7.0 source.
Can someone please review the patch and let me know if there should be
any modifications in it?

I will briefly mention about 'function_vector' attribute.
For test code:

[ ... ]
How is this different than the function vector support that is already 
in GCC for the H8/300 series processors?


Jeff


Re: [Patch, libfortran] Combine get_mem and internal_malloc_size

2012-03-26 Thread Steve Kargl
On Mon, Mar 26, 2012 at 06:17:08PM +0300, Janne Blomqvist wrote:
 
 currently in libgfortran we have two malloc() wrappers, get_mem and
 internal_malloc_size, which abort the program if malloc fails.
 internal_malloc_size does
 
 if (size == 0)
 size = 1;
 
 and then calls get_mem, which doesn't have such a zero-size check.
 This is, however, a bug in get_mem, as malloc(0) is allowed to return
 NULL, and in that case get_mem would incorrectly abort the program.
 Thus having both these functions is unnecessary. The attached patch
 combines these two functions into one.  As I couldn't decide between
 get_mem and internal_malloc_size and didn't particularly like either
 of the names, I named the combined function xmalloc, as is common
 for such functions to be called.
 
 When developing the patch, I noticed that the implementations of the
 intrinsics IALL, IANY, IPARITY, NORM2, and PARITY weren't being
 regenerated from the m4 sources. The reason, it turned out, was that
 the dependencies were specified incorrectly in Makefile.am. The patch
 also fixes this.
 
 Regtested on x86_64-unknown-linux-gnu. While the patch is large, it's
 also mechanical, hence committed as obvious.
 

The patch looks ok to me, but I do have two questions.

1) xmalloc is used in other parts of gcc.  Is there a
   possibility of confusing the libgfortran version 
   with the other(s)?  Perhaps, a unique name is in 
   order, gfc_xmalloc().

2) Unless I've misread the patch (which is always a
   possibility), it seems that you've removed 
   GFC_CLEAR_MEMORY, which used calloc to zero the
   allocated memory.  Is there a situation where 
   libgfortran needs/assumes the memory is zeroed
   upon allocation?

-- 
Steve


Re: [Patch, libfortran] Combine get_mem and internal_malloc_size

2012-03-26 Thread Janne Blomqvist
On Mon, Mar 26, 2012 at 18:37, Steve Kargl
s...@troutmask.apl.washington.edu wrote:
 On Mon, Mar 26, 2012 at 06:17:08PM +0300, Janne Blomqvist wrote:

 currently in libgfortran we have two malloc() wrappers, get_mem and
 internal_malloc_size, which abort the program if malloc fails.
 internal_malloc_size does

 if (size == 0)
     size = 1;

 and then calls get_mem, which doesn't have such a zero-size check.
 This is, however, a bug in get_mem, as malloc(0) is allowed to return
 NULL, and in that case get_mem would incorrectly abort the program.
 Thus having both these functions is unnecessary. The attached patch
 combines these two functions into one.  As I couldn't decide between
 get_mem and internal_malloc_size and didn't particularly like either
 of the names, I named the combined function xmalloc, as is common
 for such functions to be called.

 When developing the patch, I noticed that the implementations of the
 intrinsics IALL, IANY, IPARITY, NORM2, and PARITY weren't being
 regenerated from the m4 sources. The reason, it turned out, was that
 the dependencies were specified incorrectly in Makefile.am. The patch
 also fixes this.

 Regtested on x86_64-unknown-linux-gnu. While the patch is large, it's
 also mechanical, hence committed as obvious.


 The patch looks ok to me, but I do have two questions.

 1) xmalloc is used in other parts of gcc.  Is there a
   possibility of confusing the libgfortran version
   with the other(s)?  Perhaps, a unique name is in
   order, gfc_xmalloc().

I though about that, and ultimately decided against it, as the symbol
name is mangled anyway (__gfortrani_xmalloc IIRC), and so shouldn't
cause any linking problems. Also, I don't think it's worth worrying
about similar named source identifiers in other subdirectories in the
GCC tree; it's easy enough to find the implementation that is used in
libgfortran with (git) grep.

 2) Unless I've misread the patch (which is always a
   possibility), it seems that you've removed
   GFC_CLEAR_MEMORY, which used calloc to zero the
   allocated memory.  Is there a situation where
   libgfortran needs/assumes the memory is zeroed
   upon allocation?

GFC_CLEAR_MEMORY was a macro for debugging purposes. Today, with tools
like valgrind available, replacing malloc with calloc is quite a
inefficient debugging tool. And as it's only a one-line change to
replace the malloc call with calloc inside xmalloc, if somebody wants
to do that it's no more effort than uncommenting the GFC_CLEAR_MEMORY
macro definition was before.

And yes, we nowadays have a corresponding xcalloc for those cases
where the caller requires zeroed memory; before xcalloc the caller
used memset to explicitly zero the memory, so GFC_CLEAR_MEMORY was
never needed from a correctness standpoint.

-- 
Janne Blomqvist


[pph] Adjust mangling when emitting PPH (issue5902061)

2012-03-26 Thread Diego Novillo
Mangle TRAIT_EXPR and CONST_DECLs in template parameters.

We use the mangler to generate unique strings to represent types and symbols
that need to be merged.  However, the C++ mangler does not handle all
possible combinations.  This patch works around this problem by using
a combination of the pretty printer and the mangler.  This only works
when a PPH image is being generated, so it does not affect the
standard mangler.

2012-03-25   Diego Novillo  dnovi...@google.com

* mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled.
(write_template_arg_literal): Handle CONST_DECLs that do not have
a constant DECL_INITIAL when PPH is enabled.
---
 gcc/cp/ChangeLog.pph |6 ++
 gcc/cp/mangle.c  |   39 ++-
 2 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 2c64375..1cf90b4 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,9 @@
+2012-03-25   Diego Novillo  dnovi...@google.com
+
+   * mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled.
+   (write_template_arg_literal): Handle CONST_DECLs that do not have
+   a constant DECL_INITIAL when PPH is enabled.
+
 2012-03-23   Diego Novillo  dnovi...@google.com
 
* name-lookup.c (pph_set_identifier_binding): If DECL is a USING_DECL,
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index a54200a..51ed8c5 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2754,6 +2754,30 @@ write_expression (tree expr)
 {
   write_unqualified_id (dependent_name (expr));
 }
+  else if (TREE_CODE (expr) == TRAIT_EXPR)
+{
+  /* FIXME pph.  This is almost certainly wrong as a mangled
+representation.
+
+We use the mangler to generate unique strings to represent
+types and symbols that need to be merged.  The C++ mangler
+does not handle ALL possible combinations, so we work around
+it by using the pretty printer.  We are not trying to
+generate a valid mangled name, but any string that will allow
+us to do merging.  */
+  gcc_assert (pph_writer_enabled_p ());
+  if (TRAIT_EXPR_TYPE1 (expr))
+   write_type (TRAIT_EXPR_TYPE1 (expr));
+  else
+   write_string(nil);
+  if (TRAIT_EXPR_TYPE2 (expr))
+   write_type (TRAIT_EXPR_TYPE2 (expr));
+  else
+   write_string(nil);
+  write_number ((unsigned HOST_WIDE_INT) TRAIT_EXPR_KIND (expr),
+   /*unsigned_p*/ 1,
+   /*base*/ 16);
+}
   else
 {
   int i, len;
@@ -2914,7 +2938,20 @@ write_template_arg_literal (const tree value)
 switch (TREE_CODE (value))
   {
   case CONST_DECL:
-   write_integer_cst (value);
+   if (CONSTANT_CLASS_P (DECL_INITIAL (value)))
+ write_integer_cst (value);
+   else
+ {
+   /* FIXME pph - We use the mangler to generate unique
+  strings to represent types and symbols that need
+  to be merged.  The C++ mangler does not handle
+  ALL possible combinations, so we work around it by
+  using the pretty printer.  We are not trying to
+  generate a valid mangled name, but any string that will
+  allow us to do merging.  */
+   gcc_assert (pph_writer_enabled_p ());
+   write_string (cxx_printable_name (value, 2));
+ }
break;
 
   case INTEGER_CST:
-- 
1.7.7.3


--
This patch is available for review at http://codereview.appspot.com/5902061


[pph] Fix check for empty statement list (issue5905065)

2012-03-26 Thread Diego Novillo
Fix check for empty statement list.

We do some sanity checks before starting to write a PPH image.  The
check was inconsistent: for the VEC scope_chain-x_stmt_tree.x_cur_stmt_list
we were asserting that it was NULL *and* empty.

We just need it to be empty.  It doesn't matter if it was allocated.

2012-03-25   Diego Novillo  dnovi...@google.com

* pph-out.c (pph_out_global_binding_keys): Fix check for empty
statement list.
---
 gcc/cp/ChangeLog.pph |5 +
 gcc/cp/pph-out.c |3 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 1cf90b4..f5b8281 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,5 +1,10 @@
 2012-03-25   Diego Novillo  dnovi...@google.com
 
+   * pph-out.c (pph_out_global_binding_keys): Fix check for empty
+   statement list.
+
+2012-03-25   Diego Novillo  dnovi...@google.com
+
* mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled.
(write_template_arg_literal): Handle CONST_DECLs that do not have
a constant DECL_INITIAL when PPH is enabled.
diff --git a/gcc/cp/pph-out.c b/gcc/cp/pph-out.c
index 7ba4ee6..1efac6b 100644
--- a/gcc/cp/pph-out.c
+++ b/gcc/cp/pph-out.c
@@ -2604,8 +2604,7 @@ pph_out_global_binding_keys (pph_stream *stream)
   || scope_chain-x_processing_template_decl
   || scope_chain-x_processing_specialization
   || scope_chain-x_processing_explicit_instantiation
-  || scope_chain-need_pop_function_context
-  || scope_chain-x_stmt_tree.x_cur_stmt_list)
+  || scope_chain-need_pop_function_context)
   VEC_empty (tree, scope_chain-x_stmt_tree.x_cur_stmt_list));
 
   /* We need to write a record for BL before emitting the merge keys
-- 
1.7.7.3


--
This patch is available for review at http://codereview.appspot.com/5905065


[pph] Add timers for PPH procesing (issue5902062)

2012-03-26 Thread Diego Novillo

This initial set of timers measures time spent in the major phases of
PPH processing.  We may want to add/remove timers as we measure
performance.

2012-03-26   Diego Novillo  dnovi...@google.com

cp/ChangeLog.pph
* pph-core.c (pph_include_handler): Use timer TV_PPH.
(pph_init): Likewise.
(pph_finish): Likewise.
* pph-in.c (pph_in_line_table_and_includes): Use timer
TV_PPH_RESTORE_LINE_TABLE.
(pph_in_replay): Use timer TV_PPH_RESTORE_REPLAY.
(pph_in_identifiers): Use timer TV_PPH_RESTORE_IDENTIFIERS.
(pph_in_global_binding_keys): Use timer
TV_PPH_RESTORE_MERGE_KEYS.
(pph_in_global_binding_bodies): Use timer
TV_PPH_RESTORE_MERGE_BODIES.
(pph_read_file_1): Use timer TV_PPH_RESTORE_MISC and timer
TV_PPH_VALIDATE_IDENTIFIERS.
(pph_read_file): Use timer TV_PPH_RESTORE.
* pph-out.c (pph_writer_init): Use timer TV_PPH_SAVE.
(pph_out_line_table_and_includes): Use timer
TV_PPH_SAVE_LINE_TABLE.
(pph_out_replay): Use timer TV_PPH_SAVE_REPLAY.
(pph_out_identifiers): Use timer TV_PPH_SAVE_IDENTIFIERS.
(pph_out_global_binding_keys): Use timer TV_PPH_SAVE_MERGE_KEYS.
(pph_out_global_binding_bodies): Use timer TV_PPH_SAVE_MERGE_BODIES.
(pph_write_file): Use timer TV_PPH_SAVE_MISC.
(pph_writer_finish): Use timer TV_PPH_SAVE.
* pt.c (pph_in_bodies_spec_entry_htab): Use timer
TV_PPH_RESTORE_MERGE_BODIES.
(pph_out_merge_key_template_state): Use timer TV_PPH_SAVE_MERGE_KEYS.
(pph_out_merge_body_template_state): Use timer TV_PPH_SAVE_MERGE_BODIES.
(pph_in_merge_key_template_state): Use timer TV_PPH_RESTORE_MERGE_KEYS.
(pph_in_merge_body_template_state): Use timer
TV_PPH_RESTORE_MERGE_BODIES.

ChangeLog.pph
* timevar.def (TV_PPH): New timer.
(TV_PPH_VALIDATE_IDENTIFIERS): New timer.
(TV_PPH_SAVE): New timer.
(TV_PPH_SAVE_LINE_TABLE): New timer.
(TV_PPH_SAVE_IDENTIFIERS): New timer.
(TV_PPH_SAVE_MERGE_KEYS): New timer.
(TV_PPH_SAVE_MERGE_BODIES): New timer.
(TV_PPH_SAVE_MISC): New timer.
(TV_PPH_SAVE_REPLAY): New timer.
(TV_PPH_RESTORE): New timer.
(TV_PPH_RESTORE_LINE_TABLE): New timer.
(TV_PPH_RESTORE_IDENTIFIERS): New timer.
(TV_PPH_RESTORE_MERGE_KEYS): New timer.
(TV_PPH_RESTORE_MERGE_BODIES): New timer.
(TV_PPH_RESTORE_MISC): New timer.
(TV_PPH_RESTORE_REPLAY): New timer.
---
 gcc/ChangeLog.pph|   19 +++
 gcc/cp/ChangeLog.pph |   33 +
 gcc/cp/pph-core.c|   12 
 gcc/cp/pph-in.c  |   29 -
 gcc/cp/pph-out.c |   31 ++-
 gcc/cp/pt.c  |   14 ++
 gcc/timevar.def  |   20 
 7 files changed, 156 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog.pph b/gcc/ChangeLog.pph
index 2bf6c9a..5d8a45c 100644
--- a/gcc/ChangeLog.pph
+++ b/gcc/ChangeLog.pph
@@ -1,3 +1,22 @@
+2012-03-26   Diego Novillo  dnovi...@google.com
+
+   * timevar.def (TV_PPH): New timer.
+   (TV_PPH_VALIDATE_IDENTIFIERS): New timer.
+   (TV_PPH_SAVE): New timer.
+   (TV_PPH_SAVE_LINE_TABLE): New timer.
+   (TV_PPH_SAVE_IDENTIFIERS): New timer.
+   (TV_PPH_SAVE_MERGE_KEYS): New timer.
+   (TV_PPH_SAVE_MERGE_BODIES): New timer.
+   (TV_PPH_SAVE_MISC): New timer.
+   (TV_PPH_SAVE_REPLAY): New timer.
+   (TV_PPH_RESTORE): New timer.
+   (TV_PPH_RESTORE_LINE_TABLE): New timer.
+   (TV_PPH_RESTORE_IDENTIFIERS): New timer.
+   (TV_PPH_RESTORE_MERGE_KEYS): New timer.
+   (TV_PPH_RESTORE_MERGE_BODIES): New timer.
+   (TV_PPH_RESTORE_MISC): New timer.
+   (TV_PPH_RESTORE_REPLAY): New timer.
+
 2012-03-04   Diego Novillo  dnovi...@google.com
 
* tree.c (type_hash_traverse): Declare.
diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index f5b8281..3f6f5bb 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,36 @@
+2012-03-26   Diego Novillo  dnovi...@google.com
+
+   * pph-core.c (pph_include_handler): Use timer TV_PPH.
+   (pph_init): Likewise.
+   (pph_finish): Likewise.
+   * pph-in.c (pph_in_line_table_and_includes): Use timer
+   TV_PPH_RESTORE_LINE_TABLE.
+   (pph_in_replay): Use timer TV_PPH_RESTORE_REPLAY.
+   (pph_in_identifiers): Use timer TV_PPH_RESTORE_IDENTIFIERS.
+   (pph_in_global_binding_keys): Use timer
+   TV_PPH_RESTORE_MERGE_KEYS.
+   (pph_in_global_binding_bodies): Use timer
+   TV_PPH_RESTORE_MERGE_BODIES.
+   (pph_read_file_1): Use timer TV_PPH_RESTORE_MISC and timer
+   TV_PPH_VALIDATE_IDENTIFIERS.
+   (pph_read_file): Use timer TV_PPH_RESTORE.
+   * pph-out.c (pph_writer_init): Use timer TV_PPH_SAVE.
+   (pph_out_line_table_and_includes): Use timer
+   

Re: [patch] Assert assemble_external is only called during or after expanding to RTL

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 9:25 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Sat, Mar 24, 2012 at 9:25 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Hello,

 This patch tightens the conditions on when assemble_external() may be
 called. It also removes a comment that most platforms do not define
 ASM_OUTPUT_EXTERNAL, because hasn't been true since r119764 added a
 definition of ASM_OUTPUT_EXTERNAL to elfos.h.

 This is the first step toward addressing PR17982 on the trunk for GCC
 4.8. The next step is to change pending_assemble_externals to
 pending_assemble_visibility, and fold assemble_external_real() back
 into assemble_external.

 But first, this patch. I don't think this is very risky, because GCC
 now always works in unit-at-a-time mode. But I think it would be good
 to have this on the trunk for a week or so before proceeding.

 Bootstrapped  tested on x86_64-unknown-linux-gnu. OK for trunk?

 Ok.  Though I wonder why callers cannot simply push these decls to
 the pending varpool queue and we might do a final run over it,
 assembling things?  [or why we call assemble_external that late at all ...]

The idea is to only call assemble_external as late as possible, i.e.
in final. See PR17982. The pending_assemble_external queue was a hack
/ work-around for a problem with c-decl calling DECL_RTL early. Before
that hack was introduced, any call to assemble_external would set
DECL_ASSEMBLER_NAME and DECL_RTL, and write out something to
asm_out_file, during parsing. I'm trying to clean up that
pending_assemble_external mess, because it causes some nasty compile
time problems (PR 52640).

Ciao!
Steven


Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 9:31 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 1:27 AM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Hello,

 This patch removes all calls to assemble_external from places other
 than final.c and MI-thunk generators.

 This is step 2 toward addressing PR17982 on the trunk for GCC 4.8. The
 next, and final, step will be to change pending_assemble_externals to
 pending_assemble_visibility, and fold assemble_external_real() back
 into assemble_external.

 Bootstrapped  tested all default languages on
 x86_64-unknown-linux-gnu and on powerpc64-unknown-linux-gnu.

 Also cross-built c and c++ to mips-elf and tested on mips-sim
 (mips-elf was the only target I could find that really emits something
 for its ASM_OUTPUT_EXTERNAL target macro and has a sim in gdb. I
 hard-coded mips_output_external, by replacing if
 (!TARGET_EXPLICIT_RELOCS  ... with if (1  ...).

 OK for trunk?

 Ok.  (I think the remaining TREE_USED sets look dubious and may not
 be needed, too?)

I'm not sure. I think some of them may still be necessary, because
they were changed relatively recently (r126144). So I left them in for
now.

Actually, the whole back end is full of cleanup opportunities, now
that the compiler always works in unit-at-a-time mode. But let's start
small.

Ciao!
Steven


Re: [google] Minor cleanup and test fixes for -mpatch-functions-for-instrumentation. (issue5877043)

2012-03-26 Thread Xinliang David Li
Ok for google branches (main and 4_7).

thanks,

David

On Wed, Mar 21, 2012 at 2:45 PM, Harshit Chopra hars...@google.com wrote:
 2012-03-21   Harshit Chopra  hars...@google.com

  Minor changes:
    i386.c: made check_should_patch_current_function C90 compatible.
    i386.md: Added '\t' to bytes generated by
             ix86_output_function_nops_prologue_epilogue for proper formatting
             of assembly.
    patch-functions-*.c: Fixed verification in tests. Added a test to verify
                         nop-bytes generated for sibling calls and another test
                         to verify a binary with nop-bytes runs properly.

        * gcc/config/i386/i386.c (check_should_patch_current_function):
        * gcc/config/i386/i386.md:
        * gcc/testsuite/gcc.target/i386/patch-functions-1.c (void foo):
        (int main):
        * gcc/testsuite/gcc.target/i386/patch-functions-2.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-3.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-4.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-5.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-6.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-7.c:
        * gcc/testsuite/gcc.target/i386/patch-functions-8.c (int foo):
        (int bar):
        * gcc/testsuite/gcc.target/i386/patch-functions-sibling-call.c:

 Testing method:
  make check-gcc RUNTESTFLAGS=i386.exp=patch-functions* 
 --target_board=\unix{-m32,}\

 Patch to be applied to google/main.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 08bd5f0..be1f7a4 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -10981,6 +10981,7 @@ check_should_patch_current_function (void)
   const char* func_name = NULL;
   struct loops loops;
   int num_loops = 0;
 +  int min_functions_instructions;

   /* Patch the function if it has at least a loop.  */
   if (!patch_functions_ignore_loops)
 @@ -11007,7 +11008,7 @@ check_should_patch_current_function (void)
       strcmp(main, func_name) == 0)
     return true;

 -  int min_functions_instructions =
 +  min_functions_instructions =
       PARAM_VALUE (PARAM_FUNCTION_PATCH_MIN_INSTRUCTIONS);
   if (min_functions_instructions  0)
     {
 diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
 index 08353ff..38a04ae 100644
 --- a/gcc/config/i386/i386.md
 +++ b/gcc/config/i386/i386.md
 @@ -11688,7 +11688,7 @@
       /* Emit 10 nop bytes after ret.  */
       if (ix86_output_function_nops_prologue_epilogue (asm_out_file,
                                                       
 FUNCTION_PATCH_EPILOGUE_SECTION,
 -                                                      ret,
 +                                                      \tret,
                                                       10))
        return ;
     }
 @@ -11712,7 +11712,7 @@
       /* Emit 9 nop bytes after rep;ret.  */
       if (ix86_output_function_nops_prologue_epilogue (asm_out_file,
                                                       
 FUNCTION_PATCH_EPILOGUE_SECTION,
 -                                                      rep\;ret,
 +                                                      \trep\;ret,
                                                       9))
        return ;
     }
 diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-1.c 
 b/gcc/testsuite/gcc.target/i386/patch-functions-1.c
 index 308e8c3..aa1f424 100644
 --- a/gcc/testsuite/gcc.target/i386/patch-functions-1.c
 +++ b/gcc/testsuite/gcc.target/i386/patch-functions-1.c
 @@ -1,5 +1,5 @@
  /* Verify -mpatch-functions-for-instrumentation works.  */
 -/* { dg-do run } */
 +/* { dg-do compile } */
  /* { dg-require-effective-target lp64 } */
  /* { dg-options -mpatch-functions-for-instrumentation } */

 @@ -8,13 +8,16 @@
  /* Check nop-bytes at end.  */
  /* { dg-final { scan-assembler ret(.*).byte\t0x90(.*).byte\t0x90 } } */

 -void foo() {
 +__attribute__ ((noinline))
 +void foo()
 +{
   /* Dummy loop.  */
   int x = 0;
   while (++x);
  }

 -int main() {
 +int main()
 +{
   foo();
   return 0;
  }
 diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-2.c 
 b/gcc/testsuite/gcc.target/i386/patch-functions-2.c
 index 6baad32..78de867 100644
 --- a/gcc/testsuite/gcc.target/i386/patch-functions-2.c
 +++ b/gcc/testsuite/gcc.target/i386/patch-functions-2.c
 @@ -1,4 +1,4 @@
 -/* { dg-do run } */
 +/* { dg-do compile } */
  /* { dg-require-effective-target lp64 } */
  /* { dg-options -mpatch-functions-for-instrumentation 
 -mno-patch-functions-main-always } */

 @@ -8,11 +8,14 @@
  /* { dg-final { scan-assembler-not .byte\t0xeb,0x09(.*).byte\t0x90 } } */
  /* { dg-final { scan-assembler-not ret(.*).byte\t0x90(.*).byte\t0x90 } } */

 -void foo() {
 +__attribute__ ((noinline))
 +void foo()
 +{
   int x = 0;
  }

 -int main() {
 +int main()
 +{
   foo();
   return 0;
  }
 diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-3.c 
 b/gcc/testsuite/gcc.target/i386/patch-functions-3.c
 index 

Re: [patch][PR52640] Fix quadratic behavior with many referenced extern functions

2012-03-26 Thread Steven Bosscher
On Wed, Mar 21, 2012 at 9:35 PM, Diego Novillo dnovi...@google.com wrote:
 On 3/21/12 3:30 PM, Steven Bosscher wrote:

 +/* FIXME: Trunk is at GCC 4.8 now and the above problem still hasn't been
 +   addressed properly.  This caused PR 52640 due to O(external_decls**2)
 +   lookups in the pending_assemble_externals queue in assemble_external.
 +   Paper over with this pointer set.  (And pending_assemble_externals
 even
 +   was a TREE_LIST before?!)  */
 +static struct pointer_set_t *pending_assemble_externals_set;


 Can you add some description on what this pointer set does?

Done, I hope you agree it's sufficiently clear now what this set is for.

 OK for all release branches (if they are open for fixes).

I've committed a version to the release branches where the
pending_assemble_externals remains a TREE_LIST. I saw no reason to
complicate the patch more than strictly necessary for the release
branches. For trunk I'm trying to make pending_assemble_externals go
away.

Ciao!
Steven


PATCH: Remove MaskExists property from option handling

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 7:47 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Sun, 25 Mar 2012, H.J. Lu wrote:

 Hi Joseph,

 I need to support InverseMask(XXX) in options without the corresponding
 Mask(XXX) since XXX is never set directly via a command line option. This
 patch adds a MaskNeeded property which turns InverseMask(XXX) into
 the inverse version of Mask(XXX), which allocates a unique bit and defines
 the same set of macros as Mask(XXX).  Does it look OK?

 I'd have thought that either Mask or InverseMask with a given mask name
 (or a standalone target mask record) should cause allocation (only once,
 no matter how many options use the same mask name), and MaskExists should
 be removed, rather than adding MaskNeeded - if I understood correctly the
 purpose for which you are adding MaskNeeded.


 That is correct.  I will work on a patch to remove  MaskExists.


Here is a patch to remove MaskExists.  The difference between the old
options.h and the new options.h on Linux/x86-64 are

--- /tmp/options.h  2012-03-26 09:49:09.590137047 -0700
+++ ./options.h 2012-03-26 10:07:56.940168938 -0700
@@ -3677,55 +3677,55 @@ extern void cl_target_option_print (FILE
 #endif

 #define MASK_128BIT_LONG_DOUBLE (1  0)
-#define OPTION_MASK_ISA_3DNOW (HOST_WIDE_INT_1  0)
-#define OPTION_MASK_ISA_3DNOW_A (HOST_WIDE_INT_1  1)
-#define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1  2)
+#define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1  0)
+#define OPTION_MASK_ISA_3DNOW (HOST_WIDE_INT_1  1)
+#define OPTION_MASK_ISA_3DNOW_A (HOST_WIDE_INT_1  2)
 #define MASK_80387 (1  1)
 #define MASK_USE_8BIT_IDIV (1  2)
 #define OPTION_MASK_ISA_ABM (HOST_WIDE_INT_1  3)
 #define MASK_ACCUMULATE_OUTGOING_ARGS (1  3)
 #define OPTION_MASK_ISA_AES (HOST_WIDE_INT_1  4)
 #define MASK_ALIGN_DOUBLE (1  4)
+#define MASK_NO_ALIGN_STRINGOPS (1  5)
 #define OPTION_MASK_ANDROID (1  0)
 #define OPTION_MASK_ISA_AVX (HOST_WIDE_INT_1  5)
 #define OPTION_MASK_ISA_AVX2 (HOST_WIDE_INT_1  6)
-#define MASK_AVX256_SPLIT_UNALIGNED_LOAD (1  5)
-#define MASK_AVX256_SPLIT_UNALIGNED_STORE (1  6)
+#define MASK_AVX256_SPLIT_UNALIGNED_LOAD (1  6)
+#define MASK_AVX256_SPLIT_UNALIGNED_STORE (1  7)
 #define OPTION_MASK_ISA_BMI (HOST_WIDE_INT_1  7)
 #define OPTION_MASK_ISA_BMI2 (HOST_WIDE_INT_1  8)
-#define MASK_CLD (1  7)
+#define MASK_CLD (1  8)
 #define OPTION_MASK_ISA_CRC32 (HOST_WIDE_INT_1  9)
 #define OPTION_MASK_ISA_CX16 (HOST_WIDE_INT_1  10)
 #define OPTION_MASK_ISA_F16C (HOST_WIDE_INT_1  11)
+#define MASK_NO_FANCY_MATH_387 (1  9)
 #define OPTION_MASK_ISA_FMA (HOST_WIDE_INT_1  12)
 #define OPTION_MASK_ISA_FMA4 (HOST_WIDE_INT_1  13)
-#define MASK_FLOAT_RETURNS (1  8)
+#define MASK_FLOAT_RETURNS (1  10)
 #define OPTION_MASK_ISA_FSGSBASE (HOST_WIDE_INT_1  14)
-#define MASK_IEEE_FP (1  9)
-#define MASK_INLINE_ALL_STRINGOPS (1  10)
-#define MASK_INLINE_STRINGOPS_DYNAMICALLY (1  11)
+#define MASK_IEEE_FP (1  11)
+#define MASK_INLINE_ALL_STRINGOPS (1  12)
+#define MASK_INLINE_STRINGOPS_DYNAMICALLY (1  13)
 #define OPTION_MASK_ISA_LWP (HOST_WIDE_INT_1  15)
 #define OPTION_MASK_ISA_LZCNT (HOST_WIDE_INT_1  16)
 #define OPTION_MASK_ISA_MMX (HOST_WIDE_INT_1  17)
 #define OPTION_MASK_ISA_MOVBE (HOST_WIDE_INT_1  18)
-#define MASK_MS_BITFIELD_LAYOUT (1  12)
-#define MASK_NO_ALIGN_STRINGOPS (1  13)
-#define MASK_NO_FANCY_MATH_387 (1  14)
+#define MASK_MS_BITFIELD_LAYOUT (1  14)
 #define MASK_NO_PUSH_ARGS (1  15)
 #define MASK_NO_RED_ZONE (1  16)
+#define OPTION_MASK_ISA_SSE4_1 (HOST_WIDE_INT_1  19)
 #define MASK_OMIT_LEAF_FRAME_POINTER (1  17)
-#define OPTION_MASK_ISA_PCLMUL (HOST_WIDE_INT_1  19)
-#define OPTION_MASK_ISA_POPCNT (HOST_WIDE_INT_1  20)
+#define OPTION_MASK_ISA_PCLMUL (HOST_WIDE_INT_1  20)
+#define OPTION_MASK_ISA_POPCNT (HOST_WIDE_INT_1  21)
 #define MASK_PREFER_AVX128 (1  18)
-#define OPTION_MASK_ISA_RDRND (HOST_WIDE_INT_1  21)
+#define OPTION_MASK_ISA_RDRND (HOST_WIDE_INT_1  22)
 #define MASK_RECIP (1  19)
 #define MASK_RTD (1  20)
-#define OPTION_MASK_ISA_RTM (HOST_WIDE_INT_1  22)
-#define OPTION_MASK_ISA_SAHF (HOST_WIDE_INT_1  23)
-#define OPTION_MASK_ISA_SSE (HOST_WIDE_INT_1  24)
-#define OPTION_MASK_ISA_SSE2 (HOST_WIDE_INT_1  25)
-#define OPTION_MASK_ISA_SSE3 (HOST_WIDE_INT_1  26)
-#define OPTION_MASK_ISA_SSE4_1 (HOST_WIDE_INT_1  27)
+#define OPTION_MASK_ISA_RTM (HOST_WIDE_INT_1  23)
+#define OPTION_MASK_ISA_SAHF (HOST_WIDE_INT_1  24)
+#define OPTION_MASK_ISA_SSE (HOST_WIDE_INT_1  25)
+#define OPTION_MASK_ISA_SSE2 (HOST_WIDE_INT_1  26)
+#define OPTION_MASK_ISA_SSE3 (HOST_WIDE_INT_1  27)
 #define OPTION_MASK_ISA_SSE4_2 (HOST_WIDE_INT_1  28)
 #define OPTION_MASK_ISA_SSE4A (HOST_WIDE_INT_1  29)
 #define MASK_SSEREGPARM (1  21)
@@ -3742,15 +3742,16 @@ extern void cl_target_option_print (FILE
 #endif

 #define TARGET_128BIT_LONG_DOUBLE ((target_flags 
MASK_128BIT_LONG_DOUBLE) != 0)
+#define OPTION_ISA_64BIT ((ix86_isa_flags  OPTION_MASK_ISA_64BIT) != 0)
 #define OPTION_ISA_3DNOW 

[PATCH, i386]: FIX PR 52698, reload failure with complex address

2012-03-26 Thread Uros Bizjak
Hello!

In a corner case of a reload, reload pass can generate partially
reloaded address, where not all registers get allocated to a hard reg.
When this address is checked with ix86_legitimate_address, it is
rejected, since in strict mode, pseudos are not valid address
registers. So, reload tries to legitimize following (partially
invalid) address:

 (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP)
   (reg:DI 97))
  (reg:DI 2 cx))

via generic way by pushing all components into a register, forming
(even more invalid) address that involves three registers (r8, r9 and
rcx):

(insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8)
(reg:DI 38 r9))
(reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8])
(reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal}
 (nil))

BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2...

Attached patch fixes this situation by (partially) reloading only
remaining pseudo(s), leaving UNSPEC in the address RTX.

2012-03-26  Uros Bizjak  ubiz...@gmail.com

PR target/52698
* config/i386/i386-protos.h (ix86_legitimize_reload_address):
New prototype.
* config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define.
* config/i386/i386.c: Include reload.h.
(ix86_legitimize_reload_address): New function.

testsuite/ChangeLog:

2012-03-26  Uros Bizjak  ubiz...@gmail.com
H.J. Lu  hongjiu...@intel.com

PR target/52698
* gcc.target/i386/pr52698.c: New test.

The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

Since fixing reload issues is some kind of black magic, I'd like to
ask Ulrich and Richard for their opinion on this approach.

H.J., can you please test this fix on x32 testsuite for both address modes?

Thanks,
Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 185807)
+++ config/i386/i386.c  (working copy)
@@ -47,6 +47,7 @@
 #include target-def.h
 #include common/common-target.h
 #include langhooks.h
+#include reload.h
 #include cgraph.h
 #include gimple.h
 #include dwarf2.h
@@ -12010,6 +12011,64 @@
   return false;
 }
 
+/* Our implementation of LEGITIMIZE_RELOAD_ADDRESS.  Returns a value to
+   replace the input X, or the original X if no replacement is called for.
+   The output parameter *WIN is 1 if the calling macro should goto WIN,
+   0 if it should not.  */
+
+bool
+ix86_legitimize_reload_address (rtx x,
+   enum machine_mode mode ATTRIBUTE_UNUSED,
+   int opnum, int type,
+   int ind_levels ATTRIBUTE_UNUSED)
+{
+  /* Reload can generate:
+
+ (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP)
+  (reg:DI 97))
+ (reg:DI 2 cx))
+
+ This RTX is rejected from ix86_legitimate_address_p due to
+ non-strictness of base register 97.  Following this rejection, 
+ reload pushes all three components into separate registers,
+ creating invalid memory address RTX.
+
+ Following code reloads only the invalid part of the
+ memory address RTX.  */
+
+  if (GET_CODE (x) == PLUS
+   REG_P (XEXP (x, 1))
+   GET_CODE (XEXP (x, 0)) == PLUS
+   REG_P (XEXP (XEXP (x, 0), 1)))
+{
+  rtx base, index;
+  bool something_reloaded = false;
+
+  base = XEXP (XEXP (x, 0), 1);  
+  if (!REG_OK_FOR_BASE_STRICT_P (base))
+   {
+ push_reload (base, NULL_RTX, XEXP (XEXP (x, 0), 1), NULL,
+  BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
+  opnum, (enum reload_type)type);
+ something_reloaded = true;
+   }
+
+  index = XEXP (x, 1);
+  if (!REG_OK_FOR_INDEX_STRICT_P (index))
+   {
+ push_reload (index, NULL_RTX, XEXP (x, 1), NULL,
+  INDEX_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
+  opnum, (enum reload_type)type);
+ something_reloaded = true;
+   }
+
+  gcc_assert (something_reloaded);
+  return true;
+}
+
+  return false;
+}
+
 /* Recognizes RTL expressions that are valid memory addresses for an
instruction.  The MODE argument is the machine mode for the MEM
expression that wants to use this address.
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 185807)
+++ config/i386/i386.h  (working copy)
@@ -1630,6 +1630,17 @@
 
 #define CONSTANT_ADDRESS_P(X)  constant_address_p (X)
 
+/* Try a machine-dependent way of reloading an illegitimate address
+   operand.  If we find one, push the reload and jump to WIN.  This
+   macro is used in only one place: `find_reloads_address' in reload.c.  */
+
+#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, INDL, WIN) \
+do { 

[pph] Adjust expected patterns in some tests (issue5905068)

2012-03-26 Thread Diego Novillo
Adjust expected patterns in some tests.

After the mangling adjustments I made earlier, these tests are
now passing and or failing in a different way.  Adjusted.

2012-03-26   Diego Novillo  dnovi...@google.com

* g++.dg/pph/x6dynarray4.cc: Adjust expected patterns.
* g++.dg/pph/x6dynarray5.h: Likewise.
* g++.dg/pph/x7dynarray5.cc: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph@185814 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog.pph |6 ++
 gcc/testsuite/g++.dg/pph/x6dynarray4.cc |   10 --
 gcc/testsuite/g++.dg/pph/x6dynarray5.h  |9 -
 gcc/testsuite/g++.dg/pph/x7dynarray5.cc |   19 +--
 4 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/ChangeLog.pph b/gcc/testsuite/ChangeLog.pph
index b510b55..ded2ef9 100644
--- a/gcc/testsuite/ChangeLog.pph
+++ b/gcc/testsuite/ChangeLog.pph
@@ -1,3 +1,9 @@
+2012-03-26   Diego Novillo  dnovi...@google.com
+
+   * g++.dg/pph/x6dynarray4.cc: Adjust expected patterns.
+   * g++.dg/pph/x6dynarray5.h: Likewise.
+   * g++.dg/pph/x7dynarray5.cc: Likewise.
+
 2012-03-23   Diego Novillo  dnovi...@google.com
 
* g++.dg/pph/x1mbstate_t.h: Mark fixed.
diff --git a/gcc/testsuite/g++.dg/pph/x6dynarray4.cc 
b/gcc/testsuite/g++.dg/pph/x6dynarray4.cc
index 74fb0d9..6e43954 100644
--- a/gcc/testsuite/g++.dg/pph/x6dynarray4.cc
+++ b/gcc/testsuite/g++.dg/pph/x6dynarray4.cc
@@ -1,6 +1,12 @@
-// {xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } }
+// { xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } }
+// { dg-bogus a0dynarray-dcl3.hi:11:60: error: call of overloaded .operator 
new ...sizetype.. is ambiguous  { xfail *-*-* } 0 }
+// { dg-bogus stl_algobase.h:625:18: error: .__value. is not a member of 
.std::__is_move_iterator  { xfail *-*-* } 0 }
+// { dg-bogus stl_algobase.h:276:12: error: .__value. is not a member of 
.std::__is_move_iterator  { xfail *-*-* } 0 }
+// { dg-bogus stl_algobase.h:625:18: error: no matching function for call to 
.__miter_base  { xfail *-*-* } 0 }
+// { dg-bogus stl_algobase.h:282:5: error: no type named .iterator_type. in 
.struct std::_Miter_base.  { xfail *-*-* } 0 }
+// { dg-bogus stl_algobase.h:139:7: error: .__value. is not a member of 
.std::__are_same  { xfail *-*-* } 0 }
 
-#include x6dynarray5.h // { dg-bogus cannot open PPH file x6dynarray5.pph 
 { xfail *-*-* } }
+#include x6dynarray5.h
 
 #include algorithm
 
diff --git a/gcc/testsuite/g++.dg/pph/x6dynarray5.h 
b/gcc/testsuite/g++.dg/pph/x6dynarray5.h
index 05634c1..69dbcb8 100644
--- a/gcc/testsuite/g++.dg/pph/x6dynarray5.h
+++ b/gcc/testsuite/g++.dg/pph/x6dynarray5.h
@@ -1,12 +1,3 @@
-// { dg-xfail-if UNIMPL TRAIT MANGLING { *-*-* } { -fpph-map=pph.map } }
-// { dg-bogus bits/stl_uninitialized.h:66:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/stl_uninitialized.h:124:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/stl_uninitialized.h:178:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/allocator.h:153:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/stl_construct.h:98:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/stl_tempbuf.h:183:12: sorry, unimplemented: mangling 
trait_expr  { xfail *-*-* } 0 }
-// { dg-bogus bits/cpp_type_traits.h:87:12: internal compiler error: tree 
check: expected integer_cst, have const_decl in tree_int_cst_sgn  { xfail 
*-*-* } 0 }
-
 #ifndef X6DYNARRAY5_H
 #define X6DYNARRAY5_H
 
diff --git a/gcc/testsuite/g++.dg/pph/x7dynarray5.cc 
b/gcc/testsuite/g++.dg/pph/x7dynarray5.cc
index cd55eab..628ea7c 100644
--- a/gcc/testsuite/g++.dg/pph/x7dynarray5.cc
+++ b/gcc/testsuite/g++.dg/pph/x7dynarray5.cc
@@ -1,8 +1,23 @@
-// {xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } }
+// { dg-xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } }
+// { dg-bogus streambuf_iterator.h:372:58: internal compiler error: canonical 
types differ for identical types.*  { xfail *-*-* } 0 }
 // { dg-bogus ext/atomicity.h:48:61: error: void value not ignored as it 
ought to be  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:111:63: error: no class template named .rebind. 
in .class std::allocatorchar.  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:111:63: error: no class template named .rebind. 
in .class std::allocatorwchar_t.  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:124:68: error: no class template named .rebind. 
in .class std::allocatorchar.  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:124:68: error: no class template named .rebind. 
in .class std::allocatorwchar_t.  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:126:61: error: no class template named .rebind. 
in .class std::allocatorchar.  { xfail *-*-* } 0 }
+// { dg-bogus basic_string.h:126:61: error: no class template named .rebind. 
in 

Re: [PATCH, i386]: FIX PR 52698, reload failure with complex address

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 In a corner case of a reload, reload pass can generate partially
 reloaded address, where not all registers get allocated to a hard reg.
 When this address is checked with ix86_legitimate_address, it is
 rejected, since in strict mode, pseudos are not valid address
 registers. So, reload tries to legitimize following (partially
 invalid) address:

     (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP)
                       (reg:DI 97))
              (reg:DI 2 cx))

 via generic way by pushing all components into a register, forming
 (even more invalid) address that involves three registers (r8, r9 and
 rcx):

 (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8)
                    (reg:DI 38 r9))
                (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8])
        (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal}
     (nil))

 BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2...

 Attached patch fixes this situation by (partially) reloading only
 remaining pseudo(s), leaving UNSPEC in the address RTX.

 2012-03-26  Uros Bizjak  ubiz...@gmail.com

        PR target/52698
        * config/i386/i386-protos.h (ix86_legitimize_reload_address):
        New prototype.
        * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define.
        * config/i386/i386.c: Include reload.h.
        (ix86_legitimize_reload_address): New function.

 testsuite/ChangeLog:

 2012-03-26  Uros Bizjak  ubiz...@gmail.com
            H.J. Lu  hongjiu...@intel.com

        PR target/52698
        * gcc.target/i386/pr52698.c: New test.

 The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu 
 {,-m32}.

 Since fixing reload issues is some kind of black magic, I'd like to
 ask Ulrich and Richard for their opinion on this approach.

 H.J., can you please test this fix on x32 testsuite for both address modes?


I am on it.

-- 
H.J.


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 Uh.  When is a field a bit field though?  At least stor-layout.c
 resets DECL_BIT_FIELD
 when local relative alignment is proper and the filed has an integer
 mode.  That's
 overly optimistic if the record is placed at a bit position.  Of
 course we still have
 DECL_BIT_FIELD_TYPE, but I wonder what the DECL_BIT_FIELD flag is for if
 we cannot really know whether the field in the end is a bitfield
 anyway... (what value do the various field-decls in a component-ref chain
 have anyway if the real interpretation is subject to the containing object,
 which might be even only
 specified by the type used for an indirect ref ...)

We're talking about a bit field with record type here, so anything calculated 
within the record type in isolation is essentially invalidated for the bit 
field in question.  Fortunately this only occurs in Ada, which doesn't use the 
very questionable C++ memory model (fingers crossed for the future. :-)

-- 
Eric Botcazou


Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere

2012-03-26 Thread Eric Botcazou
 The patch looks reasonable - can we compute this backward from the
 result of the outer get_inner_reference call and the outermost
 field-decl though?  Or make get_inner_reference compute that while
 analyzing the full reference and return a flag?  OTOH it shouldn't
 be too expensive.

There are a lot of calls to get_inner_reference throughout the compiler though.
And it is cheap if you compare it with the 4.7 code, which essentially does the 
same processing for _every_ field in the record (and even if the record object 
isn't a handled_component_p).

-- 
Eric Botcazou


Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 9:26 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Mon, Mar 26, 2012 at 9:31 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 1:27 AM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Hello,

 This patch removes all calls to assemble_external from places other
 than final.c and MI-thunk generators.

 This is step 2 toward addressing PR17982 on the trunk for GCC 4.8. The
 next, and final, step will be to change pending_assemble_externals to
 pending_assemble_visibility, and fold assemble_external_real() back
 into assemble_external.

 Bootstrapped  tested all default languages on
 x86_64-unknown-linux-gnu and on powerpc64-unknown-linux-gnu.

 Also cross-built c and c++ to mips-elf and tested on mips-sim
 (mips-elf was the only target I could find that really emits something
 for its ASM_OUTPUT_EXTERNAL target macro and has a sim in gdb. I
 hard-coded mips_output_external, by replacing if
 (!TARGET_EXPLICIT_RELOCS  ... with if (1  ...).

 OK for trunk?

 Ok.  (I think the remaining TREE_USED sets look dubious and may not
 be needed, too?)

 I'm not sure. I think some of them may still be necessary, because
 they were changed relatively recently (r126144). So I left them in for
 now.

 Actually, the whole back end is full of cleanup opportunities, now
 that the compiler always works in unit-at-a-time mode. But let's start
 small.


It may have caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730

-- 
H.J.


Re: PATCH: Remove MaskExists property from option handling

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 10:20 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 7:47 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Sun, 25 Mar 2012, H.J. Lu wrote:

 Hi Joseph,

 I need to support InverseMask(XXX) in options without the corresponding
 Mask(XXX) since XXX is never set directly via a command line option. This
 patch adds a MaskNeeded property which turns InverseMask(XXX) into
 the inverse version of Mask(XXX), which allocates a unique bit and defines
 the same set of macros as Mask(XXX).  Does it look OK?

 I'd have thought that either Mask or InverseMask with a given mask name
 (or a standalone target mask record) should cause allocation (only once,
 no matter how many options use the same mask name), and MaskExists should
 be removed, rather than adding MaskNeeded - if I understood correctly the
 purpose for which you are adding MaskNeeded.


 That is correct.  I will work on a patch to remove  MaskExists.


 Here is a patch to remove MaskExists.  The difference between the old
 options.h and the new options.h on Linux/x86-64 are

...
 The same set of macros are provided.  The only differences are the
 order of bits used and macro definitions.  I am doing a full bootstrap and
 test on Linux/x86-64.  OK for trunk if there are no regressions?

 I will submit a separate patch to remove MaskExists from all *.opt files after
 this patch is installed.

 Thanks.


 --
 H.J.
 
 2012-03-26  H.J. Lu  hongjiu...@intel.com

        * opth-gen.awk: Allocated a bit for Mask and InverseMask if it
        hasn't been allocated.  Define a target macro for Mask and
        InverseMask if it hasn't been defined.  Remove MaskExists
        handling.

        * doc/options.texi: Remove MaskNeeded.

There are no regressions on Linux/x86-64.  OK for trunk?

Thanks.

-- 
H.J.


Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 8:22 PM, H.J. Lu hjl.to...@gmail.com wrote:
 It may have caused:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730

It certainly seems so. Looking into it...

Ciao!
Steven


Re: [Patch,avr]: PR target/52692: Imlpement TARGET_BUILTIN_DECL

2012-03-26 Thread Denis Chertykov
2012/3/26 Georg-Johann Lay a...@gjlay.de:
 This implements TARGET_BUILTIN_DECL which is needed to make
 LTO work with target-specific built-ins.

 struct avr_builtin_description gets a new field .fndecl which is
 initialized during avr_init_builtins and looked up in new hook
 avr_builtin_decl.

 The built-ins are initialized in such a way that there is always
  avr_bdesc[x].id == x
 i.e. .id is superfluous and avr_expand_builtin can use id to access
 the array and need no search loop to find built-in id.

 Johann

 Ok?

 gcc/
        PR target/52692
        * config/avr/avr.c (TARGET_BUILTIN_DECL): New define.
        (avr_builtin_decl): New static function.
        (struct avr_builtin_description, avr_bdesc): Move up.
        Add GTY marker. Add field fndecl. Remove redundant field id.
        (avr_init_builtins): Initialize avr_bdesc[ID].fndecl.
        (avr_expand_builtin): Code cleanup because .id is removed.

 testsuite/
        * gcc.target/avr/torture/builtins-2.c: New test.

Please, apply.

Denis.


Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-26 Thread Richard Sandiford
Steven Bosscher stevenb@gmail.com writes:
 On Wed, Mar 21, 2012 at 1:13 AM, Ian Lance Taylor wrote:
 On Tue, Mar 20, 2012 at 2:06 PM, Steven Bosscher wrote:

 This patch splits a couple of pieces of cse_insn out to new functions.
 There are no functional changes, and no code generation differences as
 far as I could tell on x86_64 (-m64 and -m32).

 Likewise for the attached patch.

 The purpose of the patch is and, loto hopefully make cse_insn easier
 to understand. In a follow-up patch, I will make canonicalize_insn run
 only once per insn (it currently, i.e. before and after this patch,
 runs multiple times for CSE on extended basic blocks if a block is in
 multiple extended basic blocks).

 That is what the attached patch does.

 Bootstrappedtested on x86_64-unknown-linux-gnu.
 OK for trunk?

 Ciao!
 Steven

   * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to
   tag basic blocks that have already been traversed at least once,
   so that all insns have been canonicalized.
   (cse_insn): Call canonicalize_insn only if the basic block that
   contains insn is visited for the first time.
   (cse_extended_basic_block): After visiting all insns in a basic
   block, mark the block in cse_canonicalized_basic_blocks.
   (cse_main): Setup and destroy cse_canonicalized_basic_blocks.

OK, thanks (without the microoptimisation, as you say).

Out of curiosity, do you still see this bit as useful:

  /* We potentially will process this insn many times.  Therefore,
 drop the REG_EQUAL note if it is equal to the SET_SRC of the
 unique set in INSN.

 Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART,
 because cse_insn handles those specially.  */

?  Does many times mean in CSE, or later?

Richard


Re: [PATCH, i386]: FIX PR 52698, reload failure with complex address

2012-03-26 Thread H.J. Lu
On Mon, Mar 26, 2012 at 10:41 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 In a corner case of a reload, reload pass can generate partially
 reloaded address, where not all registers get allocated to a hard reg.
 When this address is checked with ix86_legitimate_address, it is
 rejected, since in strict mode, pseudos are not valid address
 registers. So, reload tries to legitimize following (partially
 invalid) address:

     (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP)
                       (reg:DI 97))
              (reg:DI 2 cx))

 via generic way by pushing all components into a register, forming
 (even more invalid) address that involves three registers (r8, r9 and
 rcx):

 (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8)
                    (reg:DI 38 r9))
                (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8])
        (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal}
     (nil))

 BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2...

 Attached patch fixes this situation by (partially) reloading only
 remaining pseudo(s), leaving UNSPEC in the address RTX.

 2012-03-26  Uros Bizjak  ubiz...@gmail.com

        PR target/52698
        * config/i386/i386-protos.h (ix86_legitimize_reload_address):
        New prototype.
        * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define.
        * config/i386/i386.c: Include reload.h.
        (ix86_legitimize_reload_address): New function.

 testsuite/ChangeLog:

 2012-03-26  Uros Bizjak  ubiz...@gmail.com
            H.J. Lu  hongjiu...@intel.com

        PR target/52698
        * gcc.target/i386/pr52698.c: New test.

 The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu 
 {,-m32}.

 Since fixing reload issues is some kind of black magic, I'd like to
 ask Ulrich and Richard for their opinion on this approach.

 H.J., can you please test this fix on x32 testsuite for both address modes?


 I am on it.

There are no regressions in gcc x32 testsuite nor glibc tests for
both address modes. But I didn't check code quality nor SPEC CPU
performance.

-- 
H.J.


Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 9:02 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
       * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to
       tag basic blocks that have already been traversed at least once,
       so that all insns have been canonicalized.
       (cse_insn): Call canonicalize_insn only if the basic block that
       contains insn is visited for the first time.
       (cse_extended_basic_block): After visiting all insns in a basic
       block, mark the block in cse_canonicalized_basic_blocks.
       (cse_main): Setup and destroy cse_canonicalized_basic_blocks.

 OK, thanks (without the microoptimisation, as you say).

Thanks, will commit!


 Out of curiosity, do you still see this bit as useful:

      /* We potentially will process this insn many times.  Therefore,
         drop the REG_EQUAL note if it is equal to the SET_SRC of the
         unique set in INSN.

         Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART,
         because cse_insn handles those specially.  */

 ?  Does many times mean in CSE, or later?

It means in CSE, yes. But with the patch to canonicalize only once,
I suppose this can go away again.
Having said that, I do believe it would be good if we avoid having
REG_EQUAL notes that are rtx_equal_p to the SET_SRC. This happens
quite frequently after CSE. I'm not sure how to clean them up.

Ciao!
Steven


Re: [C++ RFC / Patch] Implementing Deducing noexcept for destructors

2012-03-26 Thread Jason Merrill

On 03/26/2012 07:22 AM, Paolo Carlini wrote:

My basic idea so far is very simple:

--- class.c (revision 185792)
+++ class.c (working copy)
@@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree using_dec
destructor,
type);
}
+ else if (cxx_dialect = cxx0x
+  !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method)))
+ TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method),
+ noexcept_true_spec);
}


That would implement N1366, but implementing N3204 is a bit more 
involved.  You need to copy TYPE_RAISES_EXCEPTIONS from the result of 
implicitly_declare_fn; see defaulted_late_check for something similar.


Also, this is too early, since we can't know what the eh specification 
of the implicit declaration would be until the closing brace of the class.



struct True2 { virtual ~True2(); };
struct False { ~False() noexcept(false); };

template typename Base, typename Member
struct C : Base
{
Member mem;
};

SA(!noexcept(CTrue2, False()));

it doesn't compile at all because:

noexcept_PR50043.C:21:8: error: looser throw specifier for ‘virtual
CTrue2, False::~C() noexcept (false)’
noexcept_PR50043.C:5:24: error: overriding ‘virtual True2::~True2()
noexcept (true)’

is this expected? Maybe, but I'm not sure.


Yes.  Adding noexcept to ~True2 causes the same error without your patch.

Jason



Re: remove wrong code in immed_double_const

2012-03-26 Thread Richard Sandiford
Mike Stump mikest...@comcast.net writes:
 On Mar 23, 2012, at 3:01 AM, Richard Sandiford wrote:
 ...it doesn't mean that we interpret the value as a negative _rtx_.
 As with all rtx calculations, things like signedness and saturation are
 decided by the operation rather than the type (type == rtx mode).

 Ah...  [ light goes on ]  Let me adjust the documentation to be exceptionally 
 clear in this case.  Check out the new wording on const_int, const_double and 
 on immed_double_const.  I fixed simplify_const_unary_operation to match your 
 suggestion.

 Sorry for the rather rambling explanation :-)  I still think the
 version I suggested for this hunk is right though.

 I agree.  I now see what I had wrong.  Thanks for your patience and 
 explanations.  If you like the wording I used in the doc and on 
 immed_double_const, I think we have now resolved all issues.  The previous 
 version was bootstraped and regression tested on darwin, fortran, c, c++, 
 objective-c++...  I'll do one more run with the update for 
 simplify_const_unary_operation, as that can trip when before it would merely 
 return 0.

 Ok?


 Index: doc/rtl.texi
 ===
 --- doc/rtl.texi  (revision 185706)
 +++ doc/rtl.texi  (working copy)
 @@ -1479,8 +1479,13 @@ This type of expression represents the i
  is customarily accessed with the macro @code{INTVAL} as in
  @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 
 0)}.
  
 -Constants generated for modes with fewer bits than @code{HOST_WIDE_INT}
 -must be sign extended to full width (e.g., with @code{gen_int_mode}).
 +Constants generated for modes with fewer bits than in
 +@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with
 +@code{gen_int_mode}).  For constants for modes with more bits than in
 +@code{HOST_WIDE_INT} the implied high order bits of that constant are
 +copies of the top bit, however values are neither signed nor unsigned.
 +All operations defined on such constants define the signededness.

I'm not someone who should be wordsmithing, but I think:

...copies of the top bit.  Note however that values are neither inherently
signed nor inherently unsigned; where necessary, signedness is determined
by the rtl operation instead.

 @@ -1510,7 +1515,12 @@ Represents either a floating-point const
  integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT}
  bits but small enough to fit within twice that number of bits (GCC
  does not provide a mechanism to represent even larger constants).  In
 -the latter case, @var{m} will be @code{VOIDmode}.
 +the latter case, @var{m} will be @code{VOIDmode}.  For integral values
 +constants for modes with more bits than twice the number in
 +@code{HOST_WIDE_INT} the implied high order bits of that constant are
 +copies of the top bit of @code{CONST_DOUBLE_HIGH}, however integral
 +values are neither signed nor unsigned.  All operations defined on
 +such constants define the signededness.

Same idea here.

 +/* Return an rtx for the sum of X and the integer C, given that X has
 +   mode MODE.  This routine should be used instead of plus_constant
 +   when they want to ensure that addition happens in a particular
 +   mode, which is necessary when x can be a VOIDmode CONST_INT or

Sorry, just noticed, should be ...when X can be...

Looks good to me otherwise, thanks.  Richi?

Richard


Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 8:51 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Mon, Mar 26, 2012 at 8:22 PM, H.J. Lu hjl.to...@gmail.com wrote:
 It may have caused:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730

 It certainly seems so. Looking into it...

Java is the culprit. I'm going to commit the following patch to paper
over the issue for now.

Index: varasm.c
===
--- varasm.c(revision 185823)
+++ varasm.c(working copy)
@@ -2170,6 +2170,10 @@
   If it's not, we should not be calling this function.  */
   gcc_assert (asm_out_file);

+  /* In a perfect world, the following condition would be true.
+ Sadly, the Java and Go front ends emit assembly *from the front end*,
+ bypassing the call graph.  See PR52739.  Fix before GCC 4.8.  */
+#if 0
   /* This function should only be called if we are expanding, or have
  expanded, to RTL.
  Ideally, only final.c would be calling this function, but it is
@@ -2177,6 +2181,7 @@
  for further discussion.  */
   gcc_assert (cgraph_state == CGRAPH_STATE_EXPANSION
  || cgraph_state == CGRAPH_STATE_FINISHED);
+#endif

   if (!DECL_P (decl) || !DECL_EXTERNAL (decl) || !TREE_PUBLIC (decl))
 return;


Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread Eric Botcazou
 Index: varasm.c
 ===
 --- varasm.c  (revision 185823)
 +++ varasm.c  (working copy)
 @@ -2170,6 +2170,10 @@
If it's not, we should not be calling this function.  */
gcc_assert (asm_out_file);

 +  /* In a perfect world, the following condition would be true.
 + Sadly, the Java and Go front ends emit assembly *from the front end*,
 + bypassing the call graph.  See PR52739.  Fix before GCC 4.8.  */

There is no PR52739.

-- 
Eric Botcazou


[patch][RFC] bail out after front-end errors

2012-03-26 Thread Steven Bosscher
Hello,

This patch is one way to address PR44982. I see no good reason to
cgraph_finalize_compilation_unit if there were parse errors. As Richi
already pointed out, GCC traditionally has proceeded after parse
errors to preserve warnings and errors we generate from the middle-end
and during semantic analysis. But it seems to me that those warnings
are not very meaningful after parse errors (-Wuninitialized after a
parse error??), and errors from the middle end are mostly for exotic
code (involving asm()s and the like). Bailing out after parse errors
is therefore IMHO the right thing to do for the common case.

Thoughts? Comments?

If the consensus is that this patch goes in, I'll also have to do some
work on the test suite, because some warnings and errors disappear.
List attached below. A lot of errors and warnings from g++ disappear.
I suspect this is because they are only issued during gimplification.
That is something I'll have to address before this patch could go in.
Before I spend the effort, I'd like to know if there is consensus on
the general direction proposed here... ;-)

Ciao!
Steven



Index: toplev.c
===
--- toplev.c(revision 185813)
+++ toplev.c(working copy)
@@ -561,9 +561,14 @@ compile_file (void)
   /* Compilation is now finished except for writing
  what's left of the symbol table output.  */

-  if (flag_syntax_only || flag_wpa)
+  /* If all we have to do is syntax checking, or if there were parse
+ errors, stop here.  */
+  if (flag_syntax_only || seen_error ())
 return;

+  if (flag_wpa)
+return;
+
   timevar_start (TV_PHASE_GENERATE);

   ggc_protect_identifiers = false;
@@ -571,12 +576,6 @@ compile_file (void)
   /* This must also call cgraph_finalize_compilation_unit.  */
   lang_hooks.decls.final_write_globals ();

-  if (seen_error ())
-{
-  timevar_stop (TV_PHASE_GENERATE);
-  return;
-}
-
   /* Compilation unit is finalized.  When producing non-fat LTO object, we are
  basically finished.  */
   if (in_lto_p || !flag_lto || flag_fat_lto_objects)


New failing tests:
 FAIL: gcc.dg/asm-7.c  (test for errors, line 15)
 FAIL: gcc.dg/asm-7.c  (test for errors, line 16)
 FAIL: gcc.dg/declspec-10.c  (test for warnings, line 19)
 FAIL: gcc.dg/declspec-11.c  (test for warnings, line 19)
 FAIL: gcc.dg/declspec-9.c  (test for errors, line 20)
 FAIL: gcc.dg/gnu99-static-1.c  (test for errors, line 21)
 FAIL: gcc.dg/gnu99-static-1.c  (test for errors, line 25)
 FAIL: gcc.dg/pr48552-1.c  (test for errors, line 16)
 FAIL: gcc.dg/pr48552-1.c  (test for errors, line 40)
 FAIL: gcc.dg/pr48552-1.c  (test for errors, line 52)
 FAIL: gcc.dg/pr48552-2.c  (test for errors, line 16)
 FAIL: gcc.dg/pr48552-2.c  (test for errors, line 40)
 FAIL: gcc.dg/pr48552-2.c  (test for errors, line 52)
 FAIL: gcc.dg/redecl-10.c  (test for warnings, line 15)
 FAIL: gcc.dg/redecl-10.c  (test for warnings, line 29)
 FAIL: gcc.dg/gomp/block-2.c  (test for errors, line 14)
 FAIL: gcc.dg/gomp/block-2.c  (test for errors, line 16)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 9)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 10)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 11)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 15)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 16)
 FAIL: gcc.dg/gomp/block-7.c  (test for errors, line 17)
 FAIL: gcc.dg/gomp/pr27415.c  (test for errors, line 9)
 FAIL: gcc.dg/gomp/pr27415.c  (test for errors, line 28)
 FAIL: gcc.dg/gomp/pr27415.c  (test for errors, line 37)
 FAIL: c-c++-common/tm/safe-3.c (internal compiler error)
 FAIL: c-c++-common/tm/safe-3.c (test for excess errors)
 FAIL: gcc.dg/tm/pr52141.c (internal compiler error)
 FAIL: gcc.dg/tm/pr52141.c (test for excess errors)
 FAIL: g++.dg/cpp0x/constexpr-ex1.C  (test for warnings, line 17)
 FAIL: g++.dg/cpp0x/constexpr-function2.C  (test for warnings, line 46)
 FAIL: g++.dg/cpp0x/constexpr-neg1.C  (test for warnings, line 5)
 FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C  (test for warnings, line 15)
 FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C not an aggregate (test for 
 errors, line 16)
 FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C deleted default ctor (test for 
 errors, line 17)
 FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C deleted assignment op (test for 
 errors, line 18)
 FAIL: g++.dg/cpp0x/lambda/lambda-field-names.C no member named i (test for 
 errors, line 11)
 FAIL: g++.dg/cpp0x/noexcept15.C  (test for errors, line 16)
 FAIL: g++.dg/cpp0x/pr47416.C  (test for errors, line 187)
 FAIL: g++.dg/cpp0x/pr47416.C  (test for warnings, line 213)
 FAIL: g++.dg/cpp0x/pr47416.C  (test for warnings, line 223)
 FAIL: g++.dg/cpp0x/static_assert2.C  (test for errors, line 14)
 FAIL: g++.dg/cpp0x/union1.C  (test for errors, line 17)
 FAIL: g++.dg/cpp0x/union1.C  (test for errors, line 18)
 FAIL: g++.dg/cpp0x/union1.C  (test for errors, line 28)
 FAIL: g++.dg/cpp0x/union1.C  (test for errors, line 29)

Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 10:51 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Index: varasm.c
 ===
 --- varasm.c  (revision 185823)
 +++ varasm.c  (working copy)
 @@ -2170,6 +2170,10 @@
        If it's not, we should not be calling this function.  */
    gcc_assert (asm_out_file);

 +  /* In a perfect world, the following condition would be true.
 +     Sadly, the Java and Go front ends emit assembly *from the front end*,
 +     bypassing the call graph.  See PR52739.  Fix before GCC 4.8.  */

 There is no PR52739.

*sigh* PR52730. It's one of those days...


Re: [Patch V2] libgfortran: do not assume libm

2012-03-26 Thread Janne Blomqvist
On Mon, Mar 26, 2012 at 11:44, Tristan Gingold ging...@adacore.com wrote:

 On Mar 22, 2012, at 11:06 AM, Paolo Bonzini wrote:

 Il 22/03/2012 09:30, Tristan Gingold ha scritto:
 Hi,

 this is version 2 of the patch.

 The initial problem is that libgfortran configure.ac used 
 AC_CHECK_LIB([m]…) to check wether several math functions are available.  
 That doesn't work on VMS, because there is no such things as a libm.

 It seems to me that there are no autoconf macro to check wether a function 
 is available: AC_CHECK_FUNC[S] don't allow to specify additional include 
 files (necessary on VMS as some math functions are renamed through macros), 
 and AC_CHECK_DECL only checks the presence of the declaration (as pointed 
 out by Ralf).

 So I have finally learnt a little bit more about autoconf and added a new 
 file: config/math.m4 (build machinery maintainer cc:)
 libgfortran/configure.ac now uses the new GCC_CHECK_MATH_FUNC macro.

 Bootstrapped without gfortran regressions on x86_64-darwin.

 Ok for trunk ?

 (I will submit a follow-up change in libquadmath once this change is 
 approved).

 Tristan.

 config/
 2012-03-22  Tristan Gingold  ging...@adacore.com

      * math.m4: New file.

 libgfortran/
 2012-03-22  Tristan Gingold  ging...@adacore.com

      * configure.ac: Use GCC_CHECK_MATH_FUNC for math functions.
      * acinclude.m4: Include ../config/math.m4
      * configure: Regenerate.
      * Makefile.in: Regenerate.

 Ok.

 Thanks, now committed.

On x86_64-unknown-linux-gnu bootstrap I see the following:


make[1]: Entering directory
`/home/janne/src/gfortran/trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran'
make  check-am
make[2]: Entering directory
`/home/janne/src/gfortran/trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran'
(CDPATH=${ZSH_VERSION+.}:  cd ../../../trunk-git/libgfortran 
/bin/bash /home/janne/src/gfortran/trunk/trunk-git/missing --run
autoheader)
true  DO=all multi-do # make
make[1]: Entering directory `/home/janne/src/gfortran/trunk/objdir-git/gcc'
Making a new config file...
echo set tmpdir
/home/janne/src/gfortran/trunk/objdir-git/gcc/testsuite  ./site.tmp
autoheader: warning: missing template: HAVE_ACOS
autoheader: Use AC_DEFINE([HAVE_ACOS], [], [Description])
autoheader: warning: missing template: HAVE_ACOSF
autoheader: warning: missing template: HAVE_ACOSH
autoheader: warning: missing template: HAVE_ACOSHF
autoheader: warning: missing template: HAVE_ACOSHL
autoheader: warning: missing template: HAVE_ACOSL
autoheader: warning: missing template: HAVE_ASIN
autoheader: warning: missing template: HAVE_ASINF
autoheader: warning: missing template: HAVE_ASINH
autoheader: warning: missing template: HAVE_ASINHF
autoheader: warning: missing template: HAVE_ASINHL
autoheader: warning: missing template: HAVE_ASINL
autoheader: warning: missing template: HAVE_ATAN
autoheader: warning: missing template: HAVE_ATAN2
autoheader: warning: missing template: HAVE_ATAN2F
autoheader: warning: missing template: HAVE_ATAN2L
autoheader: warning: missing template: HAVE_ATANF
autoheader: warning: missing template: HAVE_ATANH
autoheader: warning: missing template: HAVE_ATANHF
autoheader: warning: missing template: HAVE_ATANHL
autoheader: warning: missing template: HAVE_ATANL
autoheader: warning: missing template: HAVE_CABS
autoheader: warning: missing template: HAVE_CABSF
autoheader: warning: missing template: HAVE_CABSL
autoheader: warning: missing template: HAVE_CACOS
autoheader: warning: missing template: HAVE_CACOSF
autoheader: warning: missing template: HAVE_CACOSH
autoheader: warning: missing template: HAVE_CACOSHF
autoheader: warning: missing template: HAVE_CACOSHL
autoheader: warning: missing template: HAVE_CACOSL
autoheader: warning: missing template: HAVE_CARG
autoheader: warning: missing template: HAVE_CARGF
autoheader: warning: missing template: HAVE_CARGL
autoheader: warning: missing template: HAVE_CASIN
autoheader: warning: missing template: HAVE_CASINF
autoheader: warning: missing template: HAVE_CASINH
autoheader: warning: missing template: HAVE_CASINHF
autoheader: warning: missing template: HAVE_CASINHL
autoheader: warning: missing template: HAVE_CASINL
autoheader: warning: missing template: HAVE_CATAN
autoheader: warning: missing template: HAVE_CATANF
autoheader: warning: missing template: HAVE_CATANH
autoheader: warning: missing template: HAVE_CATANHF
autoheader: warning: missing template: HAVE_CATANHL
autoheader: warning: missing template: HAVE_CATANL
autoheader: warning: missing template: HAVE_CCOS
autoheader: warning: missing template: HAVE_CCOSF
autoheader: warning: missing template: HAVE_CCOSH
autoheader: warning: missing template: HAVE_CCOSHF
autoheader: warning: missing template: HAVE_CCOSHL
autoheader: warning: missing template: HAVE_CCOSL
autoheader: warning: missing template: HAVE_CEIL
autoheader: warning: missing template: HAVE_CEILF
autoheader: warning: missing template: HAVE_CEILL
autoheader: warning: missing template: HAVE_CEXP
autoheader: warning: missing 

[SH] PR 50751 - rework displacement calculations

2012-03-26 Thread Oleg Endo
Hi,

The attached patch generalizes the move insn displacement calculations a
little bit.  Before, the same address rebasing code was present in
sh_legitimize_address as well as sh_legitimize_reload_address.  I've
pulled those out into a separate function as a preparation step for
adding HImode displacement addressing support.

Tested by doing 'make all' (c,c++), compiling newlib and by making sure
that for the following variants the CSiBE set code size did not change:

-m2a-single -mb -O2, -m2a-single -mb -Os,
-m4a-single -mb -O2, -m4a-single -mb -Os,
-m4a-single -ml -O2, -m4a-single -ml -Os,
-m4-single -mb -O2, -m4-single -mb -Os,
-m4-single -ml -O2, -m4-single -ml -Os

I'm now also running the usual reg tests on sh-sim, just in case.
Other than that, OK?

Cheers,
Oleg


ChangeLog:

PR target/50751
* config/sh/sh.c (sh_legitimize_address,
sh_legitimize_reload_address):  Rearrange conditional logic.
Move displacement address calculations to ...
(sh_find_mov_disp_adjust): ... this new function.
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 185768)
+++ gcc/config/sh/sh.c	(working copy)
@@ -9663,10 +9663,12 @@
 {
   if (MAYBE_BASE_REGISTER_RTX_P (x, strict))
 return true;
+
   else if ((GET_CODE (x) == POST_INC || GET_CODE (x) == PRE_DEC)
 	! TARGET_SHMEDIA
 	MAYBE_BASE_REGISTER_RTX_P (XEXP (x, 0), strict))
 return true;
+
   else if (GET_CODE (x) == PLUS
 	(mode != PSImode || reload_completed))
 {
@@ -9779,80 +9781,114 @@
   return orig;
 }
 
-/* Try machine-dependent ways of modifying an illegitimate address
-   to be legitimate.  If we find one, return the new, valid address.
-   Otherwise, return X.
+/* Given a (logical) mode size and an offset in bytes, try to find a the
+   appropriate displacement value for a mov insn.  On SH the displacements
+   are limited to max. 60 bytes for SImode, max. 30 bytes in HImode and max.
+   15 bytes in QImode.  To compensate this we create a new base address by
+   adding an adjustment value to it.
 
-   For the SH, if X is almost suitable for indexing, but the offset is
-   out of range, convert it into a normal form so that CSE has a chance
-   of reducing the number of address registers used.  */
+   If the originally requested offset is greater than 127 we prefer using
+   values 124..127 over 128..131 to increase opportunities to use the
+   add #imm, Rn insn.
 
+   In some cases it is possible that a requested offset might seem unaligned
+   or inappropriate for the mode size, like offset = 2 and mode size = 4.
+   This is compensated by adjusting the base address so that the effective
+   address of the displacement move insn will be aligned. 
+
+   This is not the best possible way of rebasing the base address, as it
+   does not look at other present displacement addressings around it.
+   In some cases this can create more base address adjustments than would
+   actually be necessary.  */
+
+struct disp_adjust
+{
+  rtx offset_adjust;
+  rtx mov_disp;
+  int max_mov_disp;
+};
+
+static struct disp_adjust
+sh_find_mov_disp_adjust (int mode_sz, HOST_WIDE_INT offset)
+{
+  struct disp_adjust res = { NULL_RTX, NULL_RTX, 0 };
+
+  /* The max. available mode for actual move insns is SImode.
+ Larger accesses will be split into multiple loads/stores.  */
+  const int max_mov_sz = GET_MODE_SIZE (SImode);
+
+  const int mov_insn_size = mode_sz = max_mov_sz ? max_mov_sz : mode_sz;
+  const HOST_WIDE_INT max_disp = 15 * mov_insn_size;
+  HOST_WIDE_INT align_modifier = offset  127 ? mov_insn_size : 0;
+
+  HOST_WIDE_INT offset_adjust;
+
+  /* In some cases this actually does happen and we must check for it.  */
+  if (mode_sz  1 || mode_sz  8)
+return res;
+
+  /* FIXME: HImode with displacement addressing is not supported yet.
+ Make it purposefully fail for now.  */
+  if (mov_insn_size == 2)
+return res;
+
+  /* Keeps the previous behavior for QImode displacement addressing.
+ This just decides how the offset is re-based.  Removing this special
+ case will result in slightly bigger code on average, but it's not that
+ bad actually.  */
+  if (mov_insn_size == 1)
+align_modifier = 0;
+
+  res.max_mov_disp = max_disp + mov_insn_size;
+
+  offset_adjust = ((offset + align_modifier)  ~max_disp) - align_modifier;
+
+  if (mode_sz + offset - offset_adjust = res.max_mov_disp)
+{
+  res.offset_adjust = GEN_INT (offset_adjust);
+  res.mov_disp = GEN_INT (offset - offset_adjust);
+}
+
+  return res;
+}
+
+/* Try to modify an illegitimate address and make it legitimate.
+   If we find one, return the new, valid address.
+   Otherwise, return the original address.  */
+
 static rtx
 sh_legitimize_address (rtx x, rtx oldx, enum machine_mode mode)
 {
   if (flag_pic)
 x = legitimize_pic_address (oldx, mode, NULL_RTX);
 
-  if (GET_CODE (x) == PLUS
-   (GET_MODE_SIZE (mode) == 4
-	  

Re: [C++ RFC / Patch] Implementing Deducing noexcept for destructors

2012-03-26 Thread Paolo Carlini

On 03/26/2012 09:31 PM, Jason Merrill wrote:

On 03/26/2012 07:22 AM, Paolo Carlini wrote:

My basic idea so far is very simple:

--- class.c (revision 185792)
+++ class.c (working copy)
@@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree 
using_dec

destructor,
type);
}
+ else if (cxx_dialect = cxx0x
+  !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method)))
+ TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method),
+ noexcept_true_spec);
}


That would implement N1366, but implementing N3204 is a bit more 
involved.  You need to copy TYPE_RAISES_EXCEPTIONS from the result of 
implicitly_declare_fn; see defaulted_late_check for something similar.


Also, this is too early, since we can't know what the eh specification 
of the implicit declaration would be until the closing brace of the class.

Thanks for the help Jason.

Paolo.


Re: remove wrong code in immed_double_const

2012-03-26 Thread Mike Stump
On Mar 26, 2012, at 1:03 PM, Richard Sandiford wrote:
 I think:
 
 ...copies of the top bit.  Note however that values are neither inherently
 signed nor inherently unsigned; where necessary, signedness is determined
 by the rtl operation instead.

Sounds good to me, changed.

 Same idea here.

Fixed.

 +/* Return an rtx for the sum of X and the integer C, given that X has
 +   mode MODE.  This routine should be used instead of plus_constant
 +   when they want to ensure that addition happens in a particular
 +   mode, which is necessary when x can be a VOIDmode CONST_INT or
 
 Sorry, just noticed, should be ...when X can be...

Fixed.

Richard has ok all his bits, now down to you for final ok.

Ok?

* doc/rtl.texi (const_double): Document as sign-extending.
* expmed.c (expand_mult): Ensure we don't use shift
incorrectly.
* emit-rtl.c (immed_double_int_const): Refine to state the
value is signed.
* simplify-rtx.c (mode_signbit_p): Add a fixme for wider than
CONST_DOUBLE integers.
(simplify_const_unary_operation, UNSIGNED_FLOAT): Ensure no
negative values are converted.  Fix conversions bigger than
HOST_BITS_PER_WIDE_INT.
(simplify_binary_operation_1): Ensure we don't use shift
incorrectly.
(simplify_immed_subreg): Sign-extend CONST_DOUBLEs.
* explow.c (plus_constant_mode): Add.
(plus_constant): Implement with plus_constant_mode.
* rtl.h (plus_constant_mode): Add.

Index: doc/rtl.texi
===
--- doc/rtl.texi(revision 185706)
+++ doc/rtl.texi(working copy)
@@ -1479,8 +1479,19 @@ This type of expression represents the i
 is customarily accessed with the macro @code{INTVAL} as in
 @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}.
 
-Constants generated for modes with fewer bits than @code{HOST_WIDE_INT}
-must be sign extended to full width (e.g., with @code{gen_int_mode}).
+Constants generated for modes with fewer bits than in
+@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with
+@code{gen_int_mode}).  For constants for modes with more bits than in
+@code{HOST_WIDE_INT} the implied high order bits of that constant are
+copies of the top bit.  Note however that values are neither
+inherently signed nor inherently unsigned; where necessary, signedness
+is determined by the rtl operation instead.
+
+
+, however values are neither signed nor unsigned.
+All operations defined on such constants define the signededness.
+
+
 
 @findex const0_rtx
 @findex const1_rtx
@@ -1510,7 +1521,13 @@ Represents either a floating-point const
 integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT}
 bits but small enough to fit within twice that number of bits (GCC
 does not provide a mechanism to represent even larger constants).  In
-the latter case, @var{m} will be @code{VOIDmode}.
+the latter case, @var{m} will be @code{VOIDmode}.  For integral values
+constants for modes with more bits than twice the number in
+@code{HOST_WIDE_INT} the implied high order bits of that constant are
+copies of the top bit of @code{CONST_DOUBLE_HIGH}.  Note however that
+integral values are neither inherently signed nor inherently unsigned;
+where necessary, signedness is determined by the rtl operation
+instead.
 
 @findex CONST_DOUBLE_LOW
 If @var{m} is @code{VOIDmode}, the bits of the value are stored in
Index: expmed.c
===
--- expmed.c(revision 185706)
+++ expmed.c(working copy)
@@ -3135,8 +3135,10 @@ expand_mult (enum machine_mode mode, rtx
{
  int shift = floor_log2 (CONST_DOUBLE_HIGH (op1))
  + HOST_BITS_PER_WIDE_INT;
- return expand_shift (LSHIFT_EXPR, mode, op0,
-  shift, target, unsignedp);
+ if (shift  2 * HOST_BITS_PER_WIDE_INT - 1
+ || GET_MODE_BITSIZE (mode) = 2 * HOST_BITS_PER_WIDE_INT)
+   return expand_shift (LSHIFT_EXPR, mode, op0,
+shift, target, unsignedp);
}
}
 
Index: emit-rtl.c
===
--- emit-rtl.c  (revision 185706)
+++ emit-rtl.c  (working copy)
@@ -517,8 +517,11 @@ immed_double_int_const (double_int i, en
 
 /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair
of ints: I0 is the low-order word and I1 is the high-order word.
-   Do not use this routine for non-integer modes; convert to
-   REAL_VALUE_TYPE and use CONST_DOUBLE_FROM_REAL_VALUE.  */
+   For values that are larger than 2*HOST_BITS_PER_WIDE_INT, the
+   implied upper bits are copies of the high bit of i1.  The value
+   itself is neither signed nor unsigned.  Do not use this routine for
+   non-integer modes; convert to REAL_VALUE_TYPE and use
+   

[PATCH] Allow printing of escaped curly braces in assembler directives with operands

2012-03-26 Thread Siddhesh Poyarekar
Hi,

An assembler directive with an operand is filtered through
output_asm_insn (or asm_fprintf for gcc internal asm() directives) to
expand the operand values in the assembler as well as to choose
dialects if present. This patch is concerned primarily with the
dialects, since their syntax prevent inclusion of assembler strings
with curly braces, causing them to be interpreted as dialects.

The attached patch allows printing of curly braces in assembler by
preceding them with a \\. So to print the following code into assembler:

.pushsection .foo; .asciz div { width : 50%% | height=10px }; .long
42; .popsection

The following code needs to be used with this patch:

void f()
{
  asm ( .pushsection \.foo\; .asciz \div \\{ width : 50%% |
height=10px \\} \; .long %c0; .popsection : : i(42) ); }

The other option to \\ (since it doesn't look as clean) was to use %
as an escape character, but I was not sure if that is a better looking
option or a worse looking one. I don't mind resubmitting the patch to
use %{ and %} to print curly braces if that is acceptable.

It is still possible to print curly braces in asm string literals
without operands since they do not undergo any transformation.

The patch does not introduce any regressions. I have tested this with
x86_64 and i686 and it works well with both of them.

Regards,
Siddhesh


gcc/ChangeLog:

2012-03-27  Siddhesh Poyarekar  siddh...@redhat.com

* final.c (output_asm_insn, asm_fprintf): Print curly braces if
preceded by an escaped backslash (\\).

testsuite/ChangeLog:

2012-03-27  Siddhesh Poyarekar  siddh...@redhat.com

* gcc.dg/asm-braces.c: New test case.
diff --git a/gcc/final.c b/gcc/final.c
index 718caf1..2393c0f 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -3444,6 +3444,12 @@ output_asm_insn (const char *templ, rtx *operands)
 	  output_operand_lossage (invalid %%-code);
 	break;
 
+  /* Escaped braces. Print them as is. */
+  case '\\':
+if (*p == '{' || *p == '}')
+  c = *p++;
+/* FALLTHROUGH */
+
   default:
 	putc (c, asm_out_file);
   }
@@ -3955,6 +3961,12 @@ asm_fprintf (FILE *file, const char *p, ...)
 	  }
 	break;
 
+  /* Escaped braces. Print them as is. */
+  case '\\':
+if (*p == '{' || *p == '}')
+  c = *p++;
+/* FALLTHROUGH */
+
   default:
 	putc (c, file);
   }
diff --git a/gcc/testsuite/gcc.dg/asm-braces.c b/gcc/testsuite/gcc.dg/asm-braces.c
new file mode 100644
index 000..4f428c8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asm-braces.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options  } */
+
+void f()
+{
+  asm ( .pushsection \.foo\; .asciz \div \\{ width : 50%% | height = 10px \\} \; .long %c0; .popsection : : i(42) );
+}
+
+/* { dg-final { scan-assembler div { width : 50%% | height = 10px } } } */
-- 
1.7.7.6