Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
Huh? If they have DECL_FIELD_BIT_OFFSET of zero they are at a byte boundary, no? Wait - the RECORD_TYPE itself is at non-zero DECL_FIELD_BIT_OFFSET and thus a zero DECL_FIELD_BIT_OFFSET for its fields does not mean anything?! DECL_FIELD_BIT_OFFSET is relative to the enclosing record type. But how can DECL_OFFSET_ALIGN be still valid for such field? Likewise, it is relative to DECL_FIELD_OFFSET, which is relative to the enclosing record type. Obviously if DECL_FIELD_OFFSET == 0, DECL_FIELD_BIT_OFFSET == 0 then the offset needs to be aligned to DECL_OFFSET_ALIGN. Which then means DECL_OFFSET_ALIGN is a bit-alignment? It's just a relative alignment, not an absolute one. Anyway, since we are trying to compute a nice mode to use for the bitfield representative we can give up in the second that we do not know how to reach BITS_PER_UNIT alignment. Or we can simply only try to ensure MIN (BITS_PER_UNIT, DECL_OFFSET_ALIGN) alignment/size of the representative. Of course the bitfield expansion code has to deal with non-byte-aligned representatives then, and we'd always have to use BLKmode for them. But BLKmode forces byte alignment. Anything not byte-aligned must use an integral mode. I think we can't change it to be on a byte-boundary, the same record may be used at different bit-positions, no? Sure. Sure, we acknowledge it can cross a record boundary. I just was not aware we cannot statically compute the bit-offset to the previous byte for a record type. We can if we look into the entire object, not just the enclosing record. The problem is that, while get_inner_reference and friends do the former, the new code in stor-layout.c only does the latter. OK, I think we should enter some sort of a degraded mode for these non-byte aligned records, as they cannot occur in C/C++. I'll try to come up with something. -- Eric Botcazou
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
Btw, now checking with gdb, DECL_OFFSET_ALIGN is always 128 for all of the fields - that looks bogus. DECL_ALIGN is 1, but that doesn't mean DECL_OFFSET_ALIGN should not be 1 as well, no? DECL_OFFSET_ALIGN is set to BIGGEST_ALIGNMENT for the first field, see start_record_layout. If DECL_FIELD_OFFSET doesn't change for the next fields, DECL_OFFSET_ALIGN will very likely not either. -- Eric Botcazou
Re: [patch] Assert assemble_external is only called during or after expanding to RTL
On Sat, Mar 24, 2012 at 9:25 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, This patch tightens the conditions on when assemble_external() may be called. It also removes a comment that most platforms do not define ASM_OUTPUT_EXTERNAL, because hasn't been true since r119764 added a definition of ASM_OUTPUT_EXTERNAL to elfos.h. This is the first step toward addressing PR17982 on the trunk for GCC 4.8. The next step is to change pending_assemble_externals to pending_assemble_visibility, and fold assemble_external_real() back into assemble_external. But first, this patch. I don't think this is very risky, because GCC now always works in unit-at-a-time mode. But I think it would be good to have this on the trunk for a week or so before proceeding. Bootstrapped tested on x86_64-unknown-linux-gnu. OK for trunk? Ok. Though I wonder why callers cannot simply push these decls to the pending varpool queue and we might do a final run over it, assembling things? [or why we call assemble_external that late at all ...] Richard. Ciao! Steven * varasm.c (assemble_external): Assert this function is only called during or after expanding to RTL. Index: varasm.c === --- varasm.c (revision 185762) +++ varasm.c (working copy) @@ -2166,12 +2166,18 @@ static GTY(()) tree weak_decls; void assemble_external (tree decl ATTRIBUTE_UNUSED) { - /* Because most platforms do not define ASM_OUTPUT_EXTERNAL, the - main body of this code is only rarely exercised. To provide some - testing, on all platforms, we make sure that the ASM_OUT_FILE is - open. If it's not, we should not be calling this function. */ + /* Make sure that the ASM_OUT_FILE is open. + If it's not, we should not be calling this function. */ gcc_assert (asm_out_file); + /* This function should only be called if we are expanding, or have + expanded, to RTL. + Ideally, only final.c would be calling this function, but it is + not clear whether that would break things somehow. See PR 17982 + for further discussion. */ + gcc_assert (cgraph_state == CGRAPH_STATE_EXPANSION + || cgraph_state == CGRAPH_STATE_FINISHED); + if (!DECL_P (decl) || !DECL_EXTERNAL (decl) || !TREE_PUBLIC (decl)) return;
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
I think we indeed can't really in stor-layout, so the only place is very likely get_bit_range. Something like that for example. I think the expmed.c hunk should be applied to the 4.7 branch as well, because the new code in store_bit_field is quite dangerous without it. * expmed.c (store_bit_field): Assert that BITREGION_START is a multiple of a unit before computing the offset in units. * expr.c (get_bit_range): Return the null range if the enclosing record is part of a larger bit field. -- Eric Botcazou Index: expr.c === --- expr.c (revision 185767) +++ expr.c (working copy) @@ -4462,6 +4462,25 @@ get_bit_range (unsigned HOST_WIDE_INT *b return; } + /* If we have a DECL_BIT_FIELD_REPRESENTATIVE but the enclosing record is + part of a larger bit field, then the representative does not serve any + useful purpose. This can occur in Ada. */ + if (handled_component_p (TREE_OPERAND (exp, 0))) +{ + enum machine_mode rmode; + HOST_WIDE_INT rbitsize, rbitpos; + tree roffset; + int unsignedp; + int volatilep = 0; + get_inner_reference (TREE_OPERAND (exp, 0), rbitsize, rbitpos, + roffset, rmode, unsignedp, volatilep, false); + if ((rbitpos % BITS_PER_UNIT) != 0) + { + *bitstart = *bitend = 0; + return; +} +} + /* Compute the adjustment to bitpos from the offset of the field relative to the representative. DECL_FIELD_OFFSET of field and repr are the same by construction if they are not constants, Index: expmed.c === --- expmed.c (revision 185570) +++ expmed.c (working copy) @@ -828,8 +828,7 @@ store_bit_field (rtx str_rtx, unsigned H /* Under the C++0x memory model, we must not touch bits outside the bit region. Adjust the address to start at the beginning of the bit region. */ - if (MEM_P (str_rtx) - bitregion_start 0) + if (MEM_P (str_rtx) bitregion_start 0) { enum machine_mode bestmode; enum machine_mode op_mode; @@ -839,6 +838,8 @@ store_bit_field (rtx str_rtx, unsigned H if (op_mode == MAX_MACHINE_MODE) op_mode = VOIDmode; + gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0); + offset = bitregion_start / BITS_PER_UNIT; bitnum -= bitregion_start; bitregion_end -= bitregion_start;
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
OTOH if DECL_ALIGN is absolute then if the first field of a record type has DECL_ALIGN that is a multiple of BITS_PER_UNIT we know we can use the present stor-layout code? So we can check that and give up computing representatives at all for a record type if that does not hold. I don't think so, DECL_ALIGN of a field is probably honored only if TYPE_ALIGN of the record is, and TYPE_ALIGN isn't honored for a bit field. -- Eric Botcazou
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
On Mon, Mar 26, 2012 at 10:17 AM, Eric Botcazou ebotca...@adacore.com wrote: OTOH if DECL_ALIGN is absolute then if the first field of a record type has DECL_ALIGN that is a multiple of BITS_PER_UNIT we know we can use the present stor-layout code? So we can check that and give up computing representatives at all for a record type if that does not hold. I don't think so, DECL_ALIGN of a field is probably honored only if TYPE_ALIGN of the record is, and TYPE_ALIGN isn't honored for a bit field. Uh. When is a field a bit field though? At least stor-layout.c resets DECL_BIT_FIELD when local relative alignment is proper and the filed has an integer mode. That's overly optimistic if the record is placed at a bit position. Of course we still have DECL_BIT_FIELD_TYPE, but I wonder what the DECL_BIT_FIELD flag is for if we cannot really know whether the field in the end is a bitfield anyway... (what value do the various field-decls in a component-ref chain have anyway if the real interpretation is subject to the containing object, which might be even only specified by the type used for an indirect ref ...) Richard. -- Eric Botcazou
Fix PR rtl-optimization/52629
This is an out-of-bounds access in count_spilled_pseudo: we can enter the function for a pseudo without hard reg but we immediately use the hard reg number as an index: static void count_spilled_pseudo (int spilled, int spilled_nregs, int reg) { int freq = REG_FREQ (reg); int r = reg_renumber[reg]; int nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)]; The negative R case is only checked a few lines below: /* Ignore spilled pseudo-registers which can be here only if IRA is used. */ if ((ira_conflicts_p r 0) || REGNO_REG_SET_P (spilled_pseudos, reg) || spilled + spilled_nregs = r || r + nregs = spilled) return; It turns out that count_pseudo contains very similar (and correct) code so the patch just mimics that code. Bootstrapped/regtested on x86_64-suse-linux, applied on mainline. Should it be applied to the release branches as well? 2012-03-26 Eric Botcazou ebotca...@adacore.com PR rtl-optimization/52629 * reload1.c (count_pseudo): Short-circuit common case. (count_spilled_pseudo): Return early for pseudos without hard regs. Assert that the pseudo has got a hard reg before manipulating it. -- Eric Botcazou Index: reload1.c === --- reload1.c (revision 185570) +++ reload1.c (working copy) @@ -1746,11 +1746,12 @@ count_pseudo (int reg) int r = reg_renumber[reg]; int nregs; + /* Ignore spilled pseudo-registers which can be here only if IRA is used. */ + if (ira_conflicts_p r 0) +return; + if (REGNO_REG_SET_P (pseudos_counted, reg) - || REGNO_REG_SET_P (spilled_pseudos, reg) - /* Ignore spilled pseudo-registers which can be here only if IRA - is used. */ - || (ira_conflicts_p r 0)) + || REGNO_REG_SET_P (spilled_pseudos, reg)) return; SET_REGNO_REG_SET (pseudos_counted, reg); @@ -1827,12 +1828,17 @@ count_spilled_pseudo (int spilled, int s { int freq = REG_FREQ (reg); int r = reg_renumber[reg]; - int nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)]; + int nregs; + + /* Ignore spilled pseudo-registers which can be here only if IRA is used. */ + if (ira_conflicts_p r 0) +return; + + gcc_assert (r = 0); + + nregs = hard_regno_nregs[r][PSEUDO_REGNO_MODE (reg)]; - /* Ignore spilled pseudo-registers which can be here only if IRA is - used. */ - if ((ira_conflicts_p r 0) - || REGNO_REG_SET_P (spilled_pseudos, reg) + if (REGNO_REG_SET_P (spilled_pseudos, reg) || spilled + spilled_nregs = r || r + nregs = spilled) return;
Re: [Patch V2] libgfortran: do not assume libm
On Mar 22, 2012, at 11:06 AM, Paolo Bonzini wrote: Il 22/03/2012 09:30, Tristan Gingold ha scritto: Hi, this is version 2 of the patch. The initial problem is that libgfortran configure.ac used AC_CHECK_LIB([m]…) to check wether several math functions are available. That doesn't work on VMS, because there is no such things as a libm. It seems to me that there are no autoconf macro to check wether a function is available: AC_CHECK_FUNC[S] don't allow to specify additional include files (necessary on VMS as some math functions are renamed through macros), and AC_CHECK_DECL only checks the presence of the declaration (as pointed out by Ralf). So I have finally learnt a little bit more about autoconf and added a new file: config/math.m4 (build machinery maintainer cc:) libgfortran/configure.ac now uses the new GCC_CHECK_MATH_FUNC macro. Bootstrapped without gfortran regressions on x86_64-darwin. Ok for trunk ? (I will submit a follow-up change in libquadmath once this change is approved). Tristan. config/ 2012-03-22 Tristan Gingold ging...@adacore.com * math.m4: New file. libgfortran/ 2012-03-22 Tristan Gingold ging...@adacore.com * configure.ac: Use GCC_CHECK_MATH_FUNC for math functions. * acinclude.m4: Include ../config/math.m4 * configure: Regenerate. * Makefile.in: Regenerate. Ok. Thanks, now committed. Tristan.
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
On Mon, 26 Mar 2012, Eric Botcazou wrote: I think we indeed can't really in stor-layout, so the only place is very likely get_bit_range. Something like that for example. I think the expmed.c hunk should be applied to the 4.7 branch as well, because the new code in store_bit_field is quite dangerous without it. * expmed.c (store_bit_field): Assert that BITREGION_START is a multiple of a unit before computing the offset in units. * expr.c (get_bit_range): Return the null range if the enclosing record is part of a larger bit field. The patch looks reasonable - can we compute this backward from the result of the outer get_inner_reference call and the outermost field-decl though? Or make get_inner_reference compute that while analyzing the full reference and return a flag? OTOH it shouldn't be too expensive. I agree the assert should go to the banch as well, though the code only ever triggers there with --param allow-store-data-races=0. Thanks, Richard.
[VMS/committed]: consolidate ia64_start_function and alpha_start_function, rework crt0
Hi, there was duplicated code for dealing with -mdebug-main in ia64_start_function and alpha_start_function. This patch consolidates that. The pointer size is recorded for function 'main', so that crt0 now expand argv[] if needed; likewise of the return status code. Manually tested on both ia64 and alpha openvms, committed on trunk. Tristan. libgcc/ 2012-03-26 Tristan Gingold ging...@adacore.com * config/alpha/vms.h (LINK_SPEC): Simplify. (STARTFILE_SPEC): Remove -mvms-return-codes handling. (NAME__MAIN, SYMBOL__MAIN): Remove. (VMS_DEBUG_MAIN_POINTER): Remove. * config/ia64/vms.h: Likewise. * config/alpha/alpha.c (alpha_start_function): Move vms_debug_main code to vms.c. Call vms_start_function. * config/ia64/ia64.c (ia64_start_function): Likewise. * config/vms/vms-protos.h (vms_start_function): Declare. * config/vms/vms.c (vms_start_function): New function. * config/vms/vms.h (MATH_LIBRARY): Define. (VMS_DEBUG_MAIN_POINTER): Define. gcc/ 2012-03-26 Tristan Gingold ging...@adacore.com * config/alpha/vms.h (LINK_SPEC): Simplify. (STARTFILE_SPEC): Remove -mvms-return-codes handling. (NAME__MAIN, SYMBOL__MAIN): Remove. (VMS_DEBUG_MAIN_POINTER): Remove. * config/ia64/vms.h: Likewise. * config/alpha/alpha.c (alpha_start_function): Move vms_debug_main code to vms.c. Call vms_start_function. * config/ia64/ia64.c (ia64_start_function): Likewise. * config/vms/vms-protos.h (vms_start_function): Declare. * config/vms/vms.c (vms_start_function): New function. * config/vms/vms.h (MATH_LIBRARY): Define. (VMS_DEBUG_MAIN_POINTER): Define.
Re: [VMS/committed]: consolidate ia64_start_function and alpha_start_function, rework crt0
And the corresponding path is below... On Mar 26, 2012, at 11:37 AM, Tristan Gingold wrote: Hi, there was duplicated code for dealing with -mdebug-main in ia64_start_function and alpha_start_function. This patch consolidates that. The pointer size is recorded for function 'main', so that crt0 now expand argv[] if needed; likewise of the return status code. Manually tested on both ia64 and alpha openvms, committed on trunk. Tristan. libgcc/ 2012-03-26 Tristan Gingold ging...@adacore.com * config/alpha/vms.h (LINK_SPEC): Simplify. (STARTFILE_SPEC): Remove -mvms-return-codes handling. (NAME__MAIN, SYMBOL__MAIN): Remove. (VMS_DEBUG_MAIN_POINTER): Remove. * config/ia64/vms.h: Likewise. * config/alpha/alpha.c (alpha_start_function): Move vms_debug_main code to vms.c. Call vms_start_function. * config/ia64/ia64.c (ia64_start_function): Likewise. * config/vms/vms-protos.h (vms_start_function): Declare. * config/vms/vms.c (vms_start_function): New function. * config/vms/vms.h (MATH_LIBRARY): Define. (VMS_DEBUG_MAIN_POINTER): Define. gcc/ 2012-03-26 Tristan Gingold ging...@adacore.com * config/alpha/vms.h (LINK_SPEC): Simplify. (STARTFILE_SPEC): Remove -mvms-return-codes handling. (NAME__MAIN, SYMBOL__MAIN): Remove. (VMS_DEBUG_MAIN_POINTER): Remove. * config/ia64/vms.h: Likewise. * config/alpha/alpha.c (alpha_start_function): Move vms_debug_main code to vms.c. Call vms_start_function. * config/ia64/ia64.c (ia64_start_function): Likewise. * config/vms/vms-protos.h (vms_start_function): Declare. * config/vms/vms.c (vms_start_function): New function. * config/vms/vms.h (MATH_LIBRARY): Define. (VMS_DEBUG_MAIN_POINTER): Define. diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c index 46c620b..c52fc50 100644 --- a/gcc/config/alpha/alpha.c +++ b/gcc/config/alpha/alpha.c @@ -7866,14 +7866,7 @@ alpha_start_function (FILE *file, const char *fnname, int i; #if TARGET_ABI_OPEN_VMS - if (vms_debug_main - strncmp (vms_debug_main, fnname, strlen (vms_debug_main)) == 0) -{ - targetm.asm_out.globalize_label (asm_out_file, VMS_DEBUG_MAIN_POINTER); - ASM_OUTPUT_DEF (asm_out_file, VMS_DEBUG_MAIN_POINTER, fnname); - switch_to_section (text_section); - vms_debug_main = NULL; -} + vms_start_function (fnname); #endif alpha_fnname = fnname; diff --git a/gcc/config/alpha/vms.h b/gcc/config/alpha/vms.h index b1d46b5..8caec54 100644 --- a/gcc/config/alpha/vms.h +++ b/gcc/config/alpha/vms.h @@ -43,8 +43,6 @@ along with GCC; see the file COPYING3. If not see builtin_define (__IEEE_FLOAT); \ } while (0) -#define VMS_DEBUG_MAIN_POINTER TRANSFER$BREAK$GO - #undef PCC_STATIC_STRUCT_RETURN #define MAX_OFILE_ALIGNMENT 524288 /* 8 x 2^16 by DEC Ada Test CD40VRA */ @@ -279,21 +277,14 @@ do {\ #else /* Link with vms-dwarf2.o if -g (except -g0). This causes the VMS link to pull all the dwarf2 debug sections together. */ -#define LINK_SPEC %{g:-g vms-dwarf2.o%s} %{g0} %{g1:-g1 vms-dwarf2.o%s} \ -%{g2:-g2 vms-dwarf2.o%s} %{g3:-g3 vms-dwarf2.o%s} %{shared} %{v} %{map} +#define LINK_SPEC %{g0} %{g*:-g vms-dwarf2.o%s} %{shared} %{v} %{map} #endif #undef STARTFILE_SPEC -#define STARTFILE_SPEC \ -%{!shared:%{mvms-return-codes:vcrt0.o%s} %{!mvms-return-codes:pcrt0.o%s} \ -crtbegin.o%s} \ +#define STARTFILE_SPEC %{!shared:crt0.o%s crtbegin.o%s} \ %{!static:%{shared:crtbeginS.o%s}} -#define ENDFILE_SPEC \ -%{!shared:crtend.o%s} %{!static:%{shared:crtendS.o%s}} - -#define NAME__MAIN __gccmain -#define SYMBOL__MAIN __gccmain +#define ENDFILE_SPEC %{!shared:crtend.o%s} %{!static:%{shared:crtendS.o%s}} #define INIT_SECTION_ASM_OP \t.section LIB$INITIALIZE,GBL,NOWRT diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index 5b2d748..98a6120 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -3649,16 +3649,8 @@ void ia64_start_function (FILE *file, const char *fnname, tree decl ATTRIBUTE_UNUSED) { -#if VMS_DEBUGGING_INFO - if (vms_debug_main - debug_info_level DINFO_LEVEL_NONE - strncmp (vms_debug_main, fnname, strlen (vms_debug_main)) == 0) -{ - targetm.asm_out.globalize_label (asm_out_file, VMS_DEBUG_MAIN_POINTER); - ASM_OUTPUT_DEF (asm_out_file, VMS_DEBUG_MAIN_POINTER, fnname); - dwarf2out_vms_debug_main_pointer (); - vms_debug_main = 0; -} +#if TARGET_ABI_OPEN_VMS + vms_start_function (fnname); #endif fputs (\t.proc , file); diff --git a/gcc/config/ia64/vms.h b/gcc/config/ia64/vms.h index 1908c47..3e0c653 100644 --- a/gcc/config/ia64/vms.h +++ b/gcc/config/ia64/vms.h @@ -30,8 +30,6 @@ along with GCC; see the file COPYING3. If not see #undef TARGET_DEFAULT #define TARGET_DEFAULT
Re: [PATCH][ARM] NEON DImode neg
On 28/02/12 17:45, Andrew Stubbs wrote: Hi all, This patch adds a DImode negate pattern for NEON. Unfortunately, the NEON vneg instruction only supports vectors, not singletons, so there's no direct way to do it in DImode, and the compiler ends up moving the value back to core registers, negating it, and returning to NEON afterwards: fmrrd r2, r3, d16 @ int negs r2, r2 sbc r3, r3, r3, lsl #1 fmdrr d16, r2, r3 @ int The new patch does it entirely in NEON: vmov.i32 d17, #0 @ di vsub.i64 d16, d17, d16 (Note that this is the result when combined with my recent patch for NEON DImode immediates. Without that you get a constant pool load.) This updates fixes a bootstrap failure caused by an early clobber error. I've also got a native regression test running now. OK? Andrew 2012-03-26 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. --- gcc/config/arm/arm.md |8 +++- gcc/config/arm/neon.md | 37 + 2 files changed, 44 insertions(+), 1 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..f1dbbf7 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4048,7 +4048,13 @@ (neg:DI (match_operand:DI 1 s_register_operand ))) (clobber (reg:CC CC_REGNUM))])] TARGET_EITHER - + { +if (TARGET_NEON) + { +emit_insn (gen_negdi2_neon (operands[0], operands[1])); + DONE; + } + } ) ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1). diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 3c88568..bf229a7 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -922,6 +922,43 @@ (const_string neon_int_3)))] ) +(define_insn negdi2_neon + [(set (match_operand:DI 0 s_register_operand = w,?r,?r,?w) + (neg:DI (match_operand:DI 1 s_register_operand w, 0, r, w))) + (clobber (match_scratch:DI 2 =w, X, X,w)) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON + # + [(set_attr length 8) + (set_attr arch nota8,*,*,onlya8)] +) + +; Split negdi2_neon for vfp registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON reload_completed IS_VFP_REGNUM (REGNO (operands[0])) + [(set (match_dup 2) (const_int 0)) + (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + +; Split negdi2_neon for core registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) + [(parallel [(set (match_dup 0) (neg:DI (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + (define_insn *uminmode3_neon [(set (match_operand:VDQIW 0 s_register_operand =w) (umin:VDQIW (match_operand:VDQIW 1 s_register_operand w)
Re: [PATCH] Bug fix in store_bit_field_1 for big endian targets (issue 51893)
Passes bootstrap and regression test powerpc64-linux. Thanks a lot, Alan! So, Aurelien, you only need to adjust the formatting of the patch and post a ChangeLog entry along with it. TIA. Thanks Alan! Bootstrap and regression test for m68k-elf ok, but I have trouble cross compiling trunk for powerpc64-linux target... Changelog: * expmed.c (store_bit_field_1): Fix wordnum value for big endian targets Will this fix be backported to 4.6 branch for next release? Aurélien Index: gcc/expmed.c === --- gcc/expmed.c(revision 185732) +++ gcc/expmed.c(working copy) @@ -550,7 +550,10 @@ { /* If I is 0, use the low-order word in both field and target; if I is 1, use the next to lowest word; and so on. */ - unsigned int wordnum = (backwards ? nwords - i - 1 : i); + unsigned int wordnum = (backwards + ? GET_MODE_SIZE (fieldmode) / UNITS_PER_WORD + - i - 1 + : i); unsigned int bit_offset = (backwards ? MAX ((int) bitsize - ((int) i + 1) * BITS_PER_WORD,
Re: [PATCH] Remove strict-alignment checks in SRA
On Fri, 23 Mar 2012, Martin Jambor wrote: Hi, since we now should be able to expand misaligned MEM_REFs properly and both SRA and IPA-SRA now tag the memory accesses with the appropriate alignment information, we should now be able to get rid off the SRA disabling in the face of potential strict-alignment expansion problems. The patch below does that. When I applied it before applying the patches fixing misaligned expansion, testcases gcc.dg/torture/pr52402.c and gcc.dg/tree-ssa/pr49923.c failed on strict-alignment platforms. However, when applied to the current trunk, they pass bootstrap and testsuite on sparc64-linux, ia64-linux, x86_64-linux and i686-linux. OK for trunk? Ok. Thanks, Richard. Thanks, Martin 2012-03-20 Martin Jambor mjam...@suse.cz PR tree-optimization/50052 * tree-sra.c (tree_non_aligned_mem_p): Removed. (tree_non_aligned_mem_for_access_p): Likewise. (build_accesses_from_assign): Removed strict alignment requirements checks. (access_precludes_ipa_sra_p): Likewise. * testsuite/gcc.dg/ipa/ipa-sra-2.c: Also run on strict-alignment platforms. Index: src/gcc/tree-sra.c === --- src.orig/gcc/tree-sra.c +++ src/gcc/tree-sra.c @@ -1081,53 +1081,6 @@ disqualify_ops_if_throwing_stmt (gimple return false; } -/* Return true if EXP is a memory reference less aligned than ALIGN. This is - invoked only on strict-alignment targets. */ - -static bool -tree_non_aligned_mem_p (tree exp, unsigned int align) -{ - unsigned int exp_align; - - if (TREE_CODE (exp) == VIEW_CONVERT_EXPR) -exp = TREE_OPERAND (exp, 0); - - if (TREE_CODE (exp) == SSA_NAME || is_gimple_min_invariant (exp)) -return false; - - /* get_object_alignment will fall back to BITS_PER_UNIT if it cannot - compute an explicit alignment. Pretend that dereferenced pointers - are always aligned on strict-alignment targets. */ - if (TREE_CODE (exp) == MEM_REF || TREE_CODE (exp) == TARGET_MEM_REF) -exp_align = get_object_or_type_alignment (exp); - else -exp_align = get_object_alignment (exp); - - if (exp_align align) -return true; - - return false; -} - -/* Return true if EXP is a memory reference less aligned than what the access - ACC would require. This is invoked only on strict-alignment targets. */ - -static bool -tree_non_aligned_mem_for_access_p (tree exp, struct access *acc) -{ - unsigned int acc_align; - - /* The alignment of the access is that of its expression. However, it may - have been artificially increased, e.g. by a local alignment promotion, - so we cap it to the alignment of the type of the base, on the grounds - that valid sub-accesses cannot be more aligned than that. */ - acc_align = get_object_alignment (acc-expr); - if (acc-base acc_align TYPE_ALIGN (TREE_TYPE (acc-base))) -acc_align = TYPE_ALIGN (TREE_TYPE (acc-base)); - - return tree_non_aligned_mem_p (exp, acc_align); -} - /* Scan expressions occuring in STMT, create access structures for all accesses to candidates for scalarization and remove those candidates which occur in statements or expressions that prevent them from being split apart. Return @@ -1154,11 +1107,7 @@ build_accesses_from_assign (gimple stmt) lacc = build_access_from_expr_1 (lhs, stmt, true); if (lacc) -{ - lacc-grp_assignment_write = 1; - if (STRICT_ALIGNMENT tree_non_aligned_mem_for_access_p (rhs, lacc)) -lacc-grp_unscalarizable_region = 1; -} +lacc-grp_assignment_write = 1; if (racc) { @@ -1166,8 +1115,6 @@ build_accesses_from_assign (gimple stmt) if (should_scalarize_away_bitmap !gimple_has_volatile_ops (stmt) !is_gimple_reg_type (racc-type)) bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc-base)); - if (STRICT_ALIGNMENT tree_non_aligned_mem_for_access_p (lhs, racc)) -racc-grp_unscalarizable_region = 1; } if (lacc racc @@ -3835,10 +3782,6 @@ access_precludes_ipa_sra_p (struct acces || gimple_code (access-stmt) == GIMPLE_ASM)) return true; - if (STRICT_ALIGNMENT - tree_non_aligned_mem_p (access-expr, TYPE_ALIGN (access-type))) -return true; - return false; } Index: src/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c === --- src.orig/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c +++ src/gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c @@ -1,6 +1,5 @@ /* { dg-do compile } */ /* { dg-options -O2 -fipa-sra -fdump-tree-eipa_sra-details } */ -/* { dg-require-effective-target non_strict_align } */ struct bovid { -- Richard Guenther rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
[PATCH] Permanent Fix for PR46886
Hi, This is (hopefully) a permanent fix to pr46886.c I removed the condition preventing parallelization of do_while loops, as it was blocking parallelizing important loops in spec-2006. The patch fixes the number of iterations for cases where the body could appear in the latch, as in pr46886.c. 2012-03-26 Razya Ladelsky ra...@il.ibm.com PR tree-optimization/46886 * tree-parloops.c (transform_to_exit_first_loop):Set number of iterations correctly when the body may appear at the latch. (pallelize_loops): Remove the condition preventing do-while loops. Bootstrap and testsuite psss successfully on power7 linux machine. Ok to commit? Thanks, Index: tree-parloops.c === --- tree-parloops.c (revision 185775) +++ tree-parloops.c (working copy) @@ -1522,7 +1522,10 @@ transform_to_exit_first_loop (struct loop *loop, h the condition, moving the condition to the entry requires decrementing one iteration. */ exit_1 = EDGE_SUCC (exit-src, EDGE_SUCC (exit-src, 0) == exit); - if (exit_1-dest == loop-latch) + + /* if the latch contains more than the one statemnt of control variable + increment then it contains the body. */ + if (exit_1-dest == loop-latch last_and_only_stmt (loop-latch)) new_rhs = gimple_cond_rhs (cond_stmt); else { @@ -2146,7 +2149,6 @@ parallelize_loops (void) return false; if (cfun-has_nonlocal_label) return false; - gcc_obstack_init (parloop_obstack); reduction_list = htab_create (10, reduction_info_hash, reduction_info_eq, free); @@ -2187,10 +2189,7 @@ parallelize_loops (void) || loop_has_blocks_with_irreducible_flag (loop) || (loop_preheader_edge (loop)-src-flags BB_IRREDUCIBLE_LOOP) /* FIXME: the check for vector phi nodes could be removed. */ - || loop_has_vector_phi_nodes (loop) - /* FIXME: transform_to_exit_first_loop does not handle not -header-copied loops correctly - see PR46886. */ - || !do_while_loop_p (loop)) + || loop_has_vector_phi_nodes (loop)) continue; estimated = max_stmt_executions_int (loop, false); /* FIXME: Bypass this check as graphite doesn't update the @@ -2213,6 +2212,7 @@ parallelize_loops (void) continue; changed = true; + if (dump_file (dump_flags TDF_DETAILS)) { if (loop-inner) =
Re: [PATCH] Permanent Fix for PR46886
On Mon, 26 Mar 2012, Razya Ladelsky wrote: Hi, This is (hopefully) a permanent fix to pr46886.c I removed the condition preventing parallelization of do_while loops, as it was blocking parallelizing important loops in spec-2006. The patch fixes the number of iterations for cases where the body could appear in the latch, as in pr46886.c. 2012-03-26 Razya Ladelsky ra...@il.ibm.com PR tree-optimization/46886 * tree-parloops.c (transform_to_exit_first_loop):Set number of iterations correctly when the body may appear at the latch. (pallelize_loops): Remove the condition preventing do-while loops. Bootstrap and testsuite psss successfully on power7 linux machine. Ok to commit? + + /* if the latch contains more than the one statemnt of control variable + increment then it contains the body. */ + if (exit_1-dest == loop-latch last_and_only_stmt (loop-latch)) new_rhs = gimple_cond_rhs (cond_stmt); please check what the comment suggests, thus, last_and_only_stmt (loop-latch) == cond_stmt tree-parloops.c is quite fragile in what it expects the IL to look like and tests do not cover what later code expects. Please do not add to that. @@ -2146,7 +2149,6 @@ parallelize_loops (void) return false; if (cfun-has_nonlocal_label) return false; - spurious whitespace change. @@ -2213,6 +2212,7 @@ parallelize_loops (void) continue; changed = true; + if (dump_file (dump_flags TDF_DETAILS)) { if (loop-inner) Likewise. Ok with the above change. Thanks, Richard.
[C++ RFC / Patch] Implementing Deducing noexcept for destructors
[sorry, I'm resending this because inadvertently I had some html and the message got rejected] Hi, thus, I have been working on c++/50043, which boils down to this: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3204.htm My basic idea so far is very simple: --- class.c (revision 185792) +++ class.c (working copy) @@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree using_dec destructor, type); } + else if (cxx_dialect = cxx0x + !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method))) + TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method), + noexcept_true_spec); } else { thus, right before actually adding the method we check whether nothing has been deduced about it and we enforce the default noexcept. It's already enough to cover the testcase provided in c++/50043, which includes a good range of positive and negative tests. Does the idea make sense? As is, the patchlet leads to a few regressions, largely benign as far as I can see: FAIL: g++.dg/cpp0x/noexcept01.C (test for excess errors) FAIL: g++.dg/eh/ctor1.C -std=c++11 execution test FAIL: g++.dg/eh/init-temp1.C -std=c++11 execution test FAIL: g++.dg/tree-ssa/ehcleanup-1.C -std=gnu++11 scan-tree-dump-times ehcleanup1 Removing unreachable 4 FAIL: g++.old-deja/g++.eh/cleanup1.C -std=c++11 execution test (the idea would changing the tests to be c++98 only and then adding c++11 counterparts) Something the patchlet does not cover is: struct B { ~B(); }; B::~B() { } mailto:paolo.carl...@oracle.com indeed the as an implicit declaration bits of the new wording in C++11 doesn't guide so much about this, I guess it means something like: --- decl.c (revision 185792) +++ decl.c (working copy) @@ -1144,7 +1144,10 @@ check_redeclaration_exception_specification (tree if ((pedantic || ! DECL_IN_SYSTEM_HEADER (old_decl)) ! DECL_IS_BUILTIN (old_decl) flag_exceptions - !comp_except_specs (new_exceptions, old_exceptions, ce_normal)) + !comp_except_specs (new_exceptions, old_exceptions, ce_normal) + !(DECL_DESTRUCTOR_P (new_decl) + cxx_dialect = cxx0x + !new_exceptions TYPE_NOEXCEPT_P (old_type))) { error (declaration of %qF has a different exception specifier, new_decl); does it make sense? Another case which makes me nervous is when we add to the testcase in c++/50043 also a case for a virtual base class destructor, thus something like struct True2 { virtual ~True2(); }; struct False { ~False() noexcept(false); }; template typename Base, typename Member struct C : Base { Member mem; }; SA(!noexcept(CTrue2, False())); it doesn't compile at all because: noexcept_PR50043.C:21:8: error: looser throw specifier for ‘virtual CTrue2, False::~C() noexcept (false)’ noexcept_PR50043.C:5:24: error: overriding ‘virtual True2::~True2() noexcept (true)’ is this expected? Maybe, but I'm not sure. Thanks in advance for any tips! Paolo.
[testsuite,avr,trunk+4.7,committed]: Add LTO tests
http://gcc.gnu.org/viewcvs?view=revisionrevision=185793 http://gcc.gnu.org/viewcvs?view=revisionrevision=185794 This adds LTO tests to avr-torture.exp: * gcc.target/avr/torture/avr-torture.exp (AVR_TORTURE_OPTIONS): Add -Os -flto to list. Johann
Re: [PATCH] PR bootstrap/52623 Disable libquadmath on AIX
On Fri, 23 Mar 2012, David Edelsohn wrote: The build process of libquadmath sometimes encounters problems on AIX due to multilib and LD_LIBRARY_PATH interfering with GCC's own library dependencies. Libquadmath is not used on AIX, so this patch adds it to noconfigdirs. Please include this explanation in a comment on the powerpc-*-aix* | rs6000-*-aix* case. Existing directory-disabling cases may be uncommented, but new ones ought to have comments explaining the reason for disabling the library. -- Joseph S. Myers jos...@codesourcery.com
Re: PATCH: Add MaskNeeded property to option handling
On Sun, 25 Mar 2012, H.J. Lu wrote: Hi Joseph, I need to support InverseMask(XXX) in options without the corresponding Mask(XXX) since XXX is never set directly via a command line option. This patch adds a MaskNeeded property which turns InverseMask(XXX) into the inverse version of Mask(XXX), which allocates a unique bit and defines the same set of macros as Mask(XXX). Does it look OK? I'd have thought that either Mask or InverseMask with a given mask name (or a standalone target mask record) should cause allocation (only once, no matter how many options use the same mask name), and MaskExists should be removed, rather than adding MaskNeeded - if I understood correctly the purpose for which you are adding MaskNeeded. -- Joseph S. Myers jos...@codesourcery.com
Fix PR tree-optimization/52686 (SLP crashes) (Re: Vectorizer patches for 4.8)
Hello, one of Ira's vectorizer patches I recently committed seems to have exposed a pre-existing bug in handling WIDEN_LSHIFT_EXPR, which now causes ICEs in SLP due to out-of-bounds memory accesses. The underlying cause is that vect_get_smallest_scalar_type does not handle WIDEN_LSHIFT_EXPR as it does the other widening operations. This causes the SLP pass to operate on the wrong type and thus get confused about the number of vector instructions it need to allocate space for. The following patch fixes all those ICEs for me. Manual checking shows that widening shifts are now recognized correctly by SLP. Tested with no regression on armv7l-linux-gnueabi and i686-linux-gnu. OK for mainline? Bye, Ulrich ChangeLog: gcc/ PR tree-optimization/52686 * tree-vect-data-refs.c (vect_get_smallest_scalar_type): Handle WIDEN_LSHIFT_EXPR. gcc/testsuite/ PR tree-optimization/52686 * gcc.target/arm/pr52686.c: New test. Index: gcc/testsuite/gcc.target/arm/pr52686.c === --- gcc/testsuite/gcc.target/arm/pr52686.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr52686.c (revision 0) @@ -0,0 +1,19 @@ +/* PR target/52375 */ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -march=armv7-a -mfloat-abi=softfp -mfpu=neon -O -ftree-vectorize } */ + +unsigned int output[4]; + +void test (unsigned short *p) +{ + unsigned int x = *p; + if (x) +{ + output[0] = x 1; + output[1] = x 1; + output[2] = x 1; + output[3] = x 1; +} +} + Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 185467) +++ gcc/tree-vect-data-refs.c (working copy) @@ -111,6 +111,7 @@ if (is_gimple_assign (stmt) (gimple_assign_cast_p (stmt) || gimple_assign_rhs_code (stmt) == WIDEN_MULT_EXPR + || gimple_assign_rhs_code (stmt) == WIDEN_LSHIFT_EXPR || gimple_assign_rhs_code (stmt) == FLOAT_EXPR)) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (stmt)); -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [PATCH] Permanent Fix for PR46886
Richard Guenther rguent...@suse.de wrote on 26/03/2012 01:23:15 PM: From: Richard Guenther rguent...@suse.de To: Razya Ladelsky/Haifa/IBM@IBMIL Cc: gcc-patches@gcc.gnu.org Date: 26/03/2012 01:23 PM Subject: Re: [PATCH] Permanent Fix for PR46886 On Mon, 26 Mar 2012, Razya Ladelsky wrote: Hi, This is (hopefully) a permanent fix to pr46886.c I removed the condition preventing parallelization of do_while loops, as it was blocking parallelizing important loops in spec-2006. The patch fixes the number of iterations for cases where the body could appear in the latch, as in pr46886.c. 2012-03-26 Razya Ladelsky ra...@il.ibm.com PR tree-optimization/46886 * tree-parloops.c (transform_to_exit_first_loop):Set number of iterations correctly when the body may appear at the latch. (pallelize_loops): Remove the condition preventing do-while loops. Bootstrap and testsuite psss successfully on power7 linux machine. Ok to commit? + + /* if the latch contains more than the one statemnt of control variable + increment then it contains the body. */ + if (exit_1-dest == loop-latch last_and_only_stmt (loop-latch)) new_rhs = gimple_cond_rhs (cond_stmt); please check what the comment suggests, thus, last_and_only_stmt (loop-latch) == cond_stmt Hi Richard, The latch has at least one stmt for sure: the control variable increment (not a cond_stmt) coming from the call to canonicalize_loop_ivs(loop, nit, true) earlier in tree-parloops: /* . When BUMP_IN_LATCH is true, the induction variable is incremented in the loop latch, otherwise it is incremented in the loop header. Return the induction variable that was created. */ tree canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch) tree-parloops.c is quite fragile in what it expects the IL to look like and tests do not cover what later code expects. Please do not add to that. I agree. I have also tested this code on spec2006 benchmarks. (together with an upcoming patch, 5 of these show speedups with autopar on a linux-power7) @@ -2146,7 +2149,6 @@ parallelize_loops (void) return false; if (cfun-has_nonlocal_label) return false; - spurious whitespace change. @@ -2213,6 +2212,7 @@ parallelize_loops (void) continue; changed = true; + if (dump_file (dump_flags TDF_DETAILS)) { if (loop-inner) Likewise. Ok with the above change. Ok, thanks, Razya Thanks, Richard.
Re: Fix PR tree-optimization/52686 (SLP crashes) (Re: Vectorizer patches for 4.8)
On Mon, Mar 26, 2012 at 1:57 PM, Ulrich Weigand uweig...@de.ibm.com wrote: Hello, one of Ira's vectorizer patches I recently committed seems to have exposed a pre-existing bug in handling WIDEN_LSHIFT_EXPR, which now causes ICEs in SLP due to out-of-bounds memory accesses. The underlying cause is that vect_get_smallest_scalar_type does not handle WIDEN_LSHIFT_EXPR as it does the other widening operations. This causes the SLP pass to operate on the wrong type and thus get confused about the number of vector instructions it need to allocate space for. The following patch fixes all those ICEs for me. Manual checking shows that widening shifts are now recognized correctly by SLP. Tested with no regression on armv7l-linux-gnueabi and i686-linux-gnu. OK for mainline? Ok. Thanks, Richard. Bye, Ulrich ChangeLog: gcc/ PR tree-optimization/52686 * tree-vect-data-refs.c (vect_get_smallest_scalar_type): Handle WIDEN_LSHIFT_EXPR. gcc/testsuite/ PR tree-optimization/52686 * gcc.target/arm/pr52686.c: New test. Index: gcc/testsuite/gcc.target/arm/pr52686.c === --- gcc/testsuite/gcc.target/arm/pr52686.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr52686.c (revision 0) @@ -0,0 +1,19 @@ +/* PR target/52375 */ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -march=armv7-a -mfloat-abi=softfp -mfpu=neon -O -ftree-vectorize } */ + +unsigned int output[4]; + +void test (unsigned short *p) +{ + unsigned int x = *p; + if (x) + { + output[0] = x 1; + output[1] = x 1; + output[2] = x 1; + output[3] = x 1; + } +} + Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 185467) +++ gcc/tree-vect-data-refs.c (working copy) @@ -111,6 +111,7 @@ if (is_gimple_assign (stmt) (gimple_assign_cast_p (stmt) || gimple_assign_rhs_code (stmt) == WIDEN_MULT_EXPR + || gimple_assign_rhs_code (stmt) == WIDEN_LSHIFT_EXPR || gimple_assign_rhs_code (stmt) == FLOAT_EXPR)) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (stmt)); -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
[PATCH] Fix PR52701
This fixes PR52701, we need to unconditionally remember the evolution as irrelevant PHIs (apart from their final value) are not detected as induction with variable step. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-03-26 Richard Guenther rguent...@suse.de PR tree-optimization/52701 * tree-vect-loop.c (vect_analyze_scalar_cycles_1): Always compute and set the evolution part of PHI nodes. * gfortran.dg/pr52701.f90: New testcase. Index: gcc/tree-vect-loop.c === *** gcc/tree-vect-loop.c(revision 185792) --- gcc/tree-vect-loop.c(working copy) *** vect_analyze_scalar_cycles_1 (loop_vec_i *** 565,575 /* Analyze the evolution function. */ access_fn = analyze_scalar_evolution (loop, def); if (access_fn) - STRIP_NOPS (access_fn); - if (access_fn vect_print_dump_info (REPORT_DETAILS)) { ! fprintf (vect_dump, Access function of PHI: ); ! print_generic_expr (vect_dump, access_fn, TDF_SLIM); } if (!access_fn --- 565,579 /* Analyze the evolution function. */ access_fn = analyze_scalar_evolution (loop, def); if (access_fn) { ! STRIP_NOPS (access_fn); ! if (vect_print_dump_info (REPORT_DETAILS)) ! { ! fprintf (vect_dump, Access function of PHI: ); ! print_generic_expr (vect_dump, access_fn, TDF_SLIM); ! } ! STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo) ! = evolution_part_in_loop_num (access_fn, loop-num); } if (!access_fn *** vect_analyze_scalar_cycles_1 (loop_vec_i *** 579,586 continue; } - STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo) - = evolution_part_in_loop_num (access_fn, loop-num); gcc_assert (STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_vinfo) != NULL_TREE); if (vect_print_dump_info (REPORT_DETAILS)) --- 583,588 Index: gcc/testsuite/gfortran.dg/pr52701.f90 === *** gcc/testsuite/gfortran.dg/pr52701.f90 (revision 0) --- gcc/testsuite/gfortran.dg/pr52701.f90 (revision 0) *** *** 0 --- 1,20 + ! { dg-do compile } + ! { dg-options -O3 } + function pr52701 (x, z, e, f, g, l) + integer a, b, c, d, e, f, g, i, j, l, pr52701 + double precision x(e), z(e*e) + do i = l, f + do j = l, i + d = 0 + do a = 1, g + c = a - g + do b = 1, g + d = d + 1 + c = c + g + z(d) = z(d) / (x(i) + x(j) - x(f + a) - x(f + b)) + end do + end do + end do + end do + pr52701 = c + end
[PATCH] Fix PR52721
This fixes PR52721, vect_init_vector is appearantly also used to init scalars ... the following patch fixes the ICE and adjusts the comments/implementation to not suggest we are dealing with vectors only. Bootstrapped and tested on x86_64-unknown-linux-gnu, appplied to trunk. Richard. 2012-03-26 Richard Guenther rguent...@suse.de PR tree-optimization/52721 * tree-vect-stmts.c (vect_init_vector): Handle scalars. Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 185792) +++ gcc/tree-vect-stmts.c (working copy) @@ -1142,7 +1142,6 @@ vect_get_load_cost (struct data_referenc static void vect_init_vector_1 (gimple stmt, gimple new_stmt, gimple_stmt_iterator *gsi) { - if (gsi) vect_finish_stmt_generation (stmt, new_stmt, gsi); else @@ -1185,48 +1184,48 @@ vect_init_vector_1 (gimple stmt, gimple /* Function vect_init_vector. - Insert a new stmt (INIT_STMT) that initializes a new vector variable with - the vector elements of VECTOR_VAR. Place the initialization at BSI if it - is not NULL. Otherwise, place the initialization at the loop preheader. + Insert a new stmt (INIT_STMT) that initializes a new variable of type + TYPE with the value VAL. If TYPE is a vector type and VAL does not have + vector type a vector with all elements equal to VAL is created first. + Place the initialization at BSI if it is not NULL. Otherwise, place the + initialization at the loop preheader. Return the DEF of INIT_STMT. It will be used in the vectorization of STMT. */ tree -vect_init_vector (gimple stmt, tree vector_var, tree vector_type, - gimple_stmt_iterator *gsi) +vect_init_vector (gimple stmt, tree val, tree type, gimple_stmt_iterator *gsi) { tree new_var; gimple init_stmt; tree vec_oprnd; tree new_temp; - if (TREE_CODE (TREE_TYPE (vector_var)) != VECTOR_TYPE) + if (TREE_CODE (type) == VECTOR_TYPE + TREE_CODE (TREE_TYPE (val)) != VECTOR_TYPE) { - if (!types_compatible_p (TREE_TYPE (vector_type), - TREE_TYPE (vector_var))) + if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val))) { - if (CONSTANT_CLASS_P (vector_var)) - vector_var = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (vector_type), -vector_var); + if (CONSTANT_CLASS_P (val)) + val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val); else { - new_var = create_tmp_reg (TREE_TYPE (vector_type), NULL); + new_var = create_tmp_reg (TREE_TYPE (type), NULL); add_referenced_var (new_var); init_stmt = gimple_build_assign_with_ops (NOP_EXPR, - new_var, vector_var, + new_var, val, NULL_TREE); new_temp = make_ssa_name (new_var, init_stmt); gimple_assign_set_lhs (init_stmt, new_temp); vect_init_vector_1 (stmt, init_stmt, gsi); - vector_var = new_temp; + val = new_temp; } } - vector_var = build_vector_from_val (vector_type, vector_var); + val = build_vector_from_val (type, val); } - new_var = vect_get_new_vect_var (vector_type, vect_simple_var, cst_); + new_var = vect_get_new_vect_var (type, vect_simple_var, cst_); add_referenced_var (new_var); - init_stmt = gimple_build_assign (new_var, vector_var); + init_stmt = gimple_build_assign (new_var, val); new_temp = make_ssa_name (new_var, init_stmt); gimple_assign_set_lhs (init_stmt, new_temp); vect_init_vector_1 (stmt, init_stmt, gsi);
[Patch,avr]: PR target/52692: Imlpement TARGET_BUILTIN_DECL
This implements TARGET_BUILTIN_DECL which is needed to make LTO work with target-specific built-ins. struct avr_builtin_description gets a new field .fndecl which is initialized during avr_init_builtins and looked up in new hook avr_builtin_decl. The built-ins are initialized in such a way that there is always avr_bdesc[x].id == x i.e. .id is superfluous and avr_expand_builtin can use id to access the array and need no search loop to find built-in id. Johann Ok? gcc/ PR target/52692 * config/avr/avr.c (TARGET_BUILTIN_DECL): New define. (avr_builtin_decl): New static function. (struct avr_builtin_description, avr_bdesc): Move up. Add GTY marker. Add field fndecl. Remove redundant field id. (avr_init_builtins): Initialize avr_bdesc[ID].fndecl. (avr_expand_builtin): Code cleanup because .id is removed. testsuite/ * gcc.target/avr/torture/builtins-2.c: New test. Index: config/avr/avr.c === --- config/avr/avr.c (revision 185792) +++ config/avr/avr.c (working copy) @@ -10285,6 +10285,42 @@ enum avr_builtin_id AVR_BUILTIN_COUNT }; +struct GTY(()) avr_builtin_description +{ + enum insn_code icode; + const char *name; + int n_args; + tree fndecl; +}; + + +/* Notice that avr_bdesc[] and avr_builtin_id are initialized in such a way + that a built-in's ID can be used to access the built-in by means of + avr_bdesc[ID] */ + +static GTY(()) struct avr_builtin_description +avr_bdesc[AVR_BUILTIN_COUNT] = + { + +#define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, ICODE) \ +{ ICODE, NAME, N_ARGS, NULL_TREE }, +#include builtins.def +#undef DEF_BUILTIN + }; + + +/* Implement `TARGET_BUILTIN_DECL'. */ + +static tree +avr_builtin_decl (unsigned id, bool initialize_p ATTRIBUTE_UNUSED) +{ + if (id AVR_BUILTIN_COUNT) +return avr_bdesc[id].fndecl; + + return error_mark_node; +} + + static void avr_init_builtin_int24 (void) { @@ -10295,6 +10331,7 @@ avr_init_builtin_int24 (void) (*lang_hooks.types.register_builtin_type) (uint24_type, __uint24); } + /* Implement `TARGET_INIT_BUILTINS' */ /* Set up all builtin functions for this target. */ @@ -10348,7 +10385,9 @@ avr_init_builtins (void) NULL); #define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, CODE) \ - add_builtin_function (NAME, TYPE, ID, BUILT_IN_MD, NULL, NULL_TREE); + gcc_assert (ID AVR_BUILTIN_COUNT); \ + avr_bdesc[ID].fndecl \ += add_builtin_function (NAME, TYPE, ID, BUILT_IN_MD, NULL, NULL_TREE); #include builtins.def #undef DEF_BUILTIN @@ -10356,27 +10395,6 @@ avr_init_builtins (void) } -struct avr_builtin_description -{ - enum insn_code icode; - const char *name; - enum avr_builtin_id id; - int n_args; -}; - -static const struct avr_builtin_description -avr_bdesc[] = - { - -#define DEF_BUILTIN(NAME, N_ARGS, ID, TYPE, ICODE) \ -{ ICODE, NAME, ID, N_ARGS }, -#include builtins.def -#undef DEF_BUILTIN - -{ CODE_FOR_nothing, NULL, 0, -1 } - }; - - /* Subroutine of avr_expand_builtin to take care of unop insns. */ static rtx @@ -10545,6 +10563,7 @@ avr_expand_triop_builtin (enum insn_code } +/* Implement `TARGET_EXPAND_BUILTIN'. */ /* Expand an expression EXP that calls a built-in function, with result going to TARGET if that's convenient (and in mode MODE if that's convenient). @@ -10557,13 +10576,15 @@ avr_expand_builtin (tree exp, rtx target enum machine_mode mode ATTRIBUTE_UNUSED, int ignore ATTRIBUTE_UNUSED) { - size_t i; tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); const char* bname = IDENTIFIER_POINTER (DECL_NAME (fndecl)); unsigned int id = DECL_FUNCTION_CODE (fndecl); + const struct avr_builtin_description *d = avr_bdesc[id]; tree arg0; rtx op0; + gcc_assert (id AVR_BUILTIN_COUNT); + switch (id) { case AVR_BUILTIN_NOP: @@ -10597,29 +10618,22 @@ avr_expand_builtin (tree exp, rtx target } } - for (i = 0; avr_bdesc[i].name; i++) + /* No special treatment needed: vanilla expand. */ + + switch (d-n_args) { - const struct avr_builtin_description *d = avr_bdesc[i]; +case 0: + emit_insn ((GEN_FCN (d-icode)) (target)); + return 0; - if (d-id == id) -switch (d-n_args) - { - case 0: -emit_insn ((GEN_FCN (d-icode)) (target)); -return 0; - - case 1: -return avr_expand_unop_builtin (d-icode, exp, target); - - case 2: -return avr_expand_binop_builtin (d-icode, exp, target); - - case 3: -return avr_expand_triop_builtin (d-icode, exp, target); - - default: -gcc_unreachable(); -} +
[PATCH] PR c++/52672
Hi All, This patch fixes an ICE that occurs when attempting to fold nested INDIRECT_REF trees that have conversions in between the indirect references. For example: constexpr unsigned long b = *((ul_ptr)(*((ul_ptr)0x0))); What happens is that 'cxx_fold_indirect_ref' gets the top indirect reference, strips out all the conversions ending up with the nested indirect reference, and then asserts that the type is a pointer type, which (obviously) it isn't. This patch fixes the problem by exiting the fold for non-pointer sub-trees instead of doing the assert. 'fold_indirect_ref_1' already does things this way. Tested on x86_64 GNU/Linux. OK? Also, This fixes the problem on trunk, but I have reproduced the issue back to the 4.6 branch as well. On the 4.6 branch I think it is OK to just remove the 'gcc_assert (POINTER_TYPE_P (subtype))' assert. I tested this and saw no regressions. P.S. If it is OK can some please commit for me? I don't have write access. -- Meador Inge CodeSourcery / Mentor Embedded http://www.mentor.com/embedded-software cp/ 2012-03-26 Meador Inge mead...@codesourcery.com PR c++/52672 * gcc/cp/semantics.c (cxx_fold_indirect_ref): Don't attempt to fold stripped child trees that are not pointer types. testsuite/ 2012-03-26 Meador Inge mead...@codesourcery.com PR c++/52672 * g++.dg/cpp0x/constexpr-52672.C: New testcase.Index: gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C === --- gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C (revision 0) +++ gcc/testsuite/g++.dg/cpp0x/constexpr-52672.C (revision 0) @@ -0,0 +1,8 @@ +// PR c++/52672 +// { dg-do compile } +// { dg-options -std=c++11 } + +typedef unsigned long * ul_ptr; +constexpr unsigned long a = *((ul_ptr)0x0); // { dg-error } +constexpr unsigned long b = *((ul_ptr)(*((ul_ptr)0x0))); // { dg-error } +constexpr unsigned long c = *((ul_ptr)*((ul_ptr)(*((ul_ptr)0x0; // { dg-error } Index: gcc/cp/semantics.c === --- gcc/cp/semantics.c (revision 185750) +++ gcc/cp/semantics.c (working copy) @@ -7202,7 +7202,8 @@ cxx_fold_indirect_ref (location_t loc, tree type, sub = op0; STRIP_NOPS (sub); subtype = TREE_TYPE (sub); - gcc_assert (POINTER_TYPE_P (subtype)); + if (!POINTER_TYPE_P (subtype)) +return NULL_TREE; if (TREE_CODE (sub) == ADDR_EXPR) {
[PATCH H8300] Add function_vector attribute
Hi, Please find the attached h8300-func-vect.patch that adds 'function_vector' attribute for H8300 target. I have taken the reference from previously posted patch for GCC-3.4.2: http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02529.html I have updated functionality of the attribute to GCC-4.7.0 source. Can someone please review the patch and let me know if there should be any modifications in it? I will briefly mention about 'function_vector' attribute. For test code: void foo (void) { } void bar (void) { foo(); } For regular function call, compiler generates code as: _bar: mov.w r6,@-r7 mov.w r7,r6 jsr @_foo [length=4] mov.w @r7+,r6 rts Using 'function_vector' attribute, code is generated as: _bar: mov.w r6,@-r7 mov.w r7,r6 jsr @@ vector_number:8 [length=2] mov.w @r7+,r6 rts I have built the toolchain with 'c' language enabled and did the regression testing. Results are fine. I could not build the toolchain for 'c++' language as GCC-4.7.0 build fails for it for H8300 target. == ChangeLog: 2012-26-03 Ajinkya Dhobale ajinkya.dhob...@kpitcummins.com * config/h8300/h8300.c (h8300_handle_function_vector_attribute): New function. (print_operand): 'B', new print_operand switch case for jsr instruction. (h8300_attrib): Redefined for argument and new function. * config/h8300/h8300.md (call): Changed assembly output to get processed by print_operand function. (call_value): Same as (call). * doc/extend.texi (function_vector): Added description for H8300 target. === Thanks Regards, Ajinkya Dhobale KPIT Cummins Infosystems Ltd, Pune (INDIA) h8300-func-vect.patch Description: h8300-func-vect.patch
Fw: [PATCH] Permanent Fix for PR46886
+ + /* if the latch contains more than the one statemnt of control variable + increment then it contains the body. */ + if (exit_1-dest == loop-latch last_and_only_stmt (loop-latch)) new_rhs = gimple_cond_rhs (cond_stmt); please check what the comment suggests, thus, last_and_only_stmt (loop-latch) == cond_stmt Richard, There's no simple way to do such a check, as the control variable increment stmt is explicitly created in canonicalize_loop_ivs and not in tree-parloops. What we do know is that it is put in the latch because we ask for it when we call canonicalize_loop_ivs from parloops. One way to get the control variable increment stmt is to use canonicalize_loop_ivs() return value and look for the stmts that use it. Do you think I should add all this functionality to the code in order to assert the stmt that is in the latch? or can I skip it ? Thanks, Razya
Re: [PATCH] Permanent Fix for PR46886
On Mon, 26 Mar 2012, Razya Ladelsky wrote: + + /* if the latch contains more than the one statemnt of control variable + increment then it contains the body. */ + if (exit_1-dest == loop-latch last_and_only_stmt (loop-latch)) new_rhs = gimple_cond_rhs (cond_stmt); please check what the comment suggests, thus, last_and_only_stmt (loop-latch) == cond_stmt Richard, There's no simple way to do such a check, as the control variable increment stmt is explicitly created in canonicalize_loop_ivs. What we do know is that it is put in the latch because we ask for it when we call canonicalize_loop_ivs from parloops. One way to get the control variable increment stmt is to use canonicalize_loop_ivs return value and look for the stmts that use it. Do you think I should add all this functionality to the code in order to assert the stmt that is in the latch? or can I skip it ? No, you do not need to do that. I see the check is basically verifying the latch is empty (apart from the newly inserted IV update). Well, looking at the code simply checking for an empty latch (before canonicalize_loop_ivs) is not going to make the case correct when we do _not_ have an empty latch. That is, the whole logic in transform_to_exit_first_loop seems to assume that either the latch is empty or the header is empty, but this does not seem to be verified anywhere (AFAIU)? We copy the loop anyway but exempt the code in the latch (well, it's not exactly clear how gimple_duplicate_sese_tail will end up turning the loop into while (...) style to me). It might be that the patch is a progression, so it's ok in its original form. Richard.
Re: PATCH: Add MaskNeeded property to option handling
On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers jos...@codesourcery.com wrote: On Sun, 25 Mar 2012, H.J. Lu wrote: Hi Joseph, I need to support InverseMask(XXX) in options without the corresponding Mask(XXX) since XXX is never set directly via a command line option. This patch adds a MaskNeeded property which turns InverseMask(XXX) into the inverse version of Mask(XXX), which allocates a unique bit and defines the same set of macros as Mask(XXX). Does it look OK? I'd have thought that either Mask or InverseMask with a given mask name (or a standalone target mask record) should cause allocation (only once, no matter how many options use the same mask name), and MaskExists should be removed, rather than adding MaskNeeded - if I understood correctly the purpose for which you are adding MaskNeeded. That is correct. I will work on a patch to remove MaskExists. Thanks. -- H.J.
Re: [PATCH H8300] Add function_vector attribute
On 03/26/2012 08:18 AM, Ajinkya Dhobale wrote: Hi, Please find the attached h8300-func-vect.patch that adds 'function_vector' attribute for H8300 target. I have taken the reference from previously posted patch for GCC-3.4.2: http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02529.html I have updated functionality of the attribute to GCC-4.7.0 source. Can someone please review the patch and let me know if there should be any modifications in it? I will briefly mention about 'function_vector' attribute. For test code: [ ... ] How is this different than the function vector support that is already in GCC for the H8/300 series processors? Jeff
Re: [Patch, libfortran] Combine get_mem and internal_malloc_size
On Mon, Mar 26, 2012 at 06:17:08PM +0300, Janne Blomqvist wrote: currently in libgfortran we have two malloc() wrappers, get_mem and internal_malloc_size, which abort the program if malloc fails. internal_malloc_size does if (size == 0) size = 1; and then calls get_mem, which doesn't have such a zero-size check. This is, however, a bug in get_mem, as malloc(0) is allowed to return NULL, and in that case get_mem would incorrectly abort the program. Thus having both these functions is unnecessary. The attached patch combines these two functions into one. As I couldn't decide between get_mem and internal_malloc_size and didn't particularly like either of the names, I named the combined function xmalloc, as is common for such functions to be called. When developing the patch, I noticed that the implementations of the intrinsics IALL, IANY, IPARITY, NORM2, and PARITY weren't being regenerated from the m4 sources. The reason, it turned out, was that the dependencies were specified incorrectly in Makefile.am. The patch also fixes this. Regtested on x86_64-unknown-linux-gnu. While the patch is large, it's also mechanical, hence committed as obvious. The patch looks ok to me, but I do have two questions. 1) xmalloc is used in other parts of gcc. Is there a possibility of confusing the libgfortran version with the other(s)? Perhaps, a unique name is in order, gfc_xmalloc(). 2) Unless I've misread the patch (which is always a possibility), it seems that you've removed GFC_CLEAR_MEMORY, which used calloc to zero the allocated memory. Is there a situation where libgfortran needs/assumes the memory is zeroed upon allocation? -- Steve
Re: [Patch, libfortran] Combine get_mem and internal_malloc_size
On Mon, Mar 26, 2012 at 18:37, Steve Kargl s...@troutmask.apl.washington.edu wrote: On Mon, Mar 26, 2012 at 06:17:08PM +0300, Janne Blomqvist wrote: currently in libgfortran we have two malloc() wrappers, get_mem and internal_malloc_size, which abort the program if malloc fails. internal_malloc_size does if (size == 0) size = 1; and then calls get_mem, which doesn't have such a zero-size check. This is, however, a bug in get_mem, as malloc(0) is allowed to return NULL, and in that case get_mem would incorrectly abort the program. Thus having both these functions is unnecessary. The attached patch combines these two functions into one. As I couldn't decide between get_mem and internal_malloc_size and didn't particularly like either of the names, I named the combined function xmalloc, as is common for such functions to be called. When developing the patch, I noticed that the implementations of the intrinsics IALL, IANY, IPARITY, NORM2, and PARITY weren't being regenerated from the m4 sources. The reason, it turned out, was that the dependencies were specified incorrectly in Makefile.am. The patch also fixes this. Regtested on x86_64-unknown-linux-gnu. While the patch is large, it's also mechanical, hence committed as obvious. The patch looks ok to me, but I do have two questions. 1) xmalloc is used in other parts of gcc. Is there a possibility of confusing the libgfortran version with the other(s)? Perhaps, a unique name is in order, gfc_xmalloc(). I though about that, and ultimately decided against it, as the symbol name is mangled anyway (__gfortrani_xmalloc IIRC), and so shouldn't cause any linking problems. Also, I don't think it's worth worrying about similar named source identifiers in other subdirectories in the GCC tree; it's easy enough to find the implementation that is used in libgfortran with (git) grep. 2) Unless I've misread the patch (which is always a possibility), it seems that you've removed GFC_CLEAR_MEMORY, which used calloc to zero the allocated memory. Is there a situation where libgfortran needs/assumes the memory is zeroed upon allocation? GFC_CLEAR_MEMORY was a macro for debugging purposes. Today, with tools like valgrind available, replacing malloc with calloc is quite a inefficient debugging tool. And as it's only a one-line change to replace the malloc call with calloc inside xmalloc, if somebody wants to do that it's no more effort than uncommenting the GFC_CLEAR_MEMORY macro definition was before. And yes, we nowadays have a corresponding xcalloc for those cases where the caller requires zeroed memory; before xcalloc the caller used memset to explicitly zero the memory, so GFC_CLEAR_MEMORY was never needed from a correctness standpoint. -- Janne Blomqvist
[pph] Adjust mangling when emitting PPH (issue5902061)
Mangle TRAIT_EXPR and CONST_DECLs in template parameters. We use the mangler to generate unique strings to represent types and symbols that need to be merged. However, the C++ mangler does not handle all possible combinations. This patch works around this problem by using a combination of the pretty printer and the mangler. This only works when a PPH image is being generated, so it does not affect the standard mangler. 2012-03-25 Diego Novillo dnovi...@google.com * mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled. (write_template_arg_literal): Handle CONST_DECLs that do not have a constant DECL_INITIAL when PPH is enabled. --- gcc/cp/ChangeLog.pph |6 ++ gcc/cp/mangle.c | 39 ++- 2 files changed, 44 insertions(+), 1 deletions(-) diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph index 2c64375..1cf90b4 100644 --- a/gcc/cp/ChangeLog.pph +++ b/gcc/cp/ChangeLog.pph @@ -1,3 +1,9 @@ +2012-03-25 Diego Novillo dnovi...@google.com + + * mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled. + (write_template_arg_literal): Handle CONST_DECLs that do not have + a constant DECL_INITIAL when PPH is enabled. + 2012-03-23 Diego Novillo dnovi...@google.com * name-lookup.c (pph_set_identifier_binding): If DECL is a USING_DECL, diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c index a54200a..51ed8c5 100644 --- a/gcc/cp/mangle.c +++ b/gcc/cp/mangle.c @@ -2754,6 +2754,30 @@ write_expression (tree expr) { write_unqualified_id (dependent_name (expr)); } + else if (TREE_CODE (expr) == TRAIT_EXPR) +{ + /* FIXME pph. This is almost certainly wrong as a mangled +representation. + +We use the mangler to generate unique strings to represent +types and symbols that need to be merged. The C++ mangler +does not handle ALL possible combinations, so we work around +it by using the pretty printer. We are not trying to +generate a valid mangled name, but any string that will allow +us to do merging. */ + gcc_assert (pph_writer_enabled_p ()); + if (TRAIT_EXPR_TYPE1 (expr)) + write_type (TRAIT_EXPR_TYPE1 (expr)); + else + write_string(nil); + if (TRAIT_EXPR_TYPE2 (expr)) + write_type (TRAIT_EXPR_TYPE2 (expr)); + else + write_string(nil); + write_number ((unsigned HOST_WIDE_INT) TRAIT_EXPR_KIND (expr), + /*unsigned_p*/ 1, + /*base*/ 16); +} else { int i, len; @@ -2914,7 +2938,20 @@ write_template_arg_literal (const tree value) switch (TREE_CODE (value)) { case CONST_DECL: - write_integer_cst (value); + if (CONSTANT_CLASS_P (DECL_INITIAL (value))) + write_integer_cst (value); + else + { + /* FIXME pph - We use the mangler to generate unique + strings to represent types and symbols that need + to be merged. The C++ mangler does not handle + ALL possible combinations, so we work around it by + using the pretty printer. We are not trying to + generate a valid mangled name, but any string that will + allow us to do merging. */ + gcc_assert (pph_writer_enabled_p ()); + write_string (cxx_printable_name (value, 2)); + } break; case INTEGER_CST: -- 1.7.7.3 -- This patch is available for review at http://codereview.appspot.com/5902061
[pph] Fix check for empty statement list (issue5905065)
Fix check for empty statement list. We do some sanity checks before starting to write a PPH image. The check was inconsistent: for the VEC scope_chain-x_stmt_tree.x_cur_stmt_list we were asserting that it was NULL *and* empty. We just need it to be empty. It doesn't matter if it was allocated. 2012-03-25 Diego Novillo dnovi...@google.com * pph-out.c (pph_out_global_binding_keys): Fix check for empty statement list. --- gcc/cp/ChangeLog.pph |5 + gcc/cp/pph-out.c |3 +-- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph index 1cf90b4..f5b8281 100644 --- a/gcc/cp/ChangeLog.pph +++ b/gcc/cp/ChangeLog.pph @@ -1,5 +1,10 @@ 2012-03-25 Diego Novillo dnovi...@google.com + * pph-out.c (pph_out_global_binding_keys): Fix check for empty + statement list. + +2012-03-25 Diego Novillo dnovi...@google.com + * mangle.c (write_expression): Handle TRAIT_EXPR when PPH is enabled. (write_template_arg_literal): Handle CONST_DECLs that do not have a constant DECL_INITIAL when PPH is enabled. diff --git a/gcc/cp/pph-out.c b/gcc/cp/pph-out.c index 7ba4ee6..1efac6b 100644 --- a/gcc/cp/pph-out.c +++ b/gcc/cp/pph-out.c @@ -2604,8 +2604,7 @@ pph_out_global_binding_keys (pph_stream *stream) || scope_chain-x_processing_template_decl || scope_chain-x_processing_specialization || scope_chain-x_processing_explicit_instantiation - || scope_chain-need_pop_function_context - || scope_chain-x_stmt_tree.x_cur_stmt_list) + || scope_chain-need_pop_function_context) VEC_empty (tree, scope_chain-x_stmt_tree.x_cur_stmt_list)); /* We need to write a record for BL before emitting the merge keys -- 1.7.7.3 -- This patch is available for review at http://codereview.appspot.com/5905065
[pph] Add timers for PPH procesing (issue5902062)
This initial set of timers measures time spent in the major phases of PPH processing. We may want to add/remove timers as we measure performance. 2012-03-26 Diego Novillo dnovi...@google.com cp/ChangeLog.pph * pph-core.c (pph_include_handler): Use timer TV_PPH. (pph_init): Likewise. (pph_finish): Likewise. * pph-in.c (pph_in_line_table_and_includes): Use timer TV_PPH_RESTORE_LINE_TABLE. (pph_in_replay): Use timer TV_PPH_RESTORE_REPLAY. (pph_in_identifiers): Use timer TV_PPH_RESTORE_IDENTIFIERS. (pph_in_global_binding_keys): Use timer TV_PPH_RESTORE_MERGE_KEYS. (pph_in_global_binding_bodies): Use timer TV_PPH_RESTORE_MERGE_BODIES. (pph_read_file_1): Use timer TV_PPH_RESTORE_MISC and timer TV_PPH_VALIDATE_IDENTIFIERS. (pph_read_file): Use timer TV_PPH_RESTORE. * pph-out.c (pph_writer_init): Use timer TV_PPH_SAVE. (pph_out_line_table_and_includes): Use timer TV_PPH_SAVE_LINE_TABLE. (pph_out_replay): Use timer TV_PPH_SAVE_REPLAY. (pph_out_identifiers): Use timer TV_PPH_SAVE_IDENTIFIERS. (pph_out_global_binding_keys): Use timer TV_PPH_SAVE_MERGE_KEYS. (pph_out_global_binding_bodies): Use timer TV_PPH_SAVE_MERGE_BODIES. (pph_write_file): Use timer TV_PPH_SAVE_MISC. (pph_writer_finish): Use timer TV_PPH_SAVE. * pt.c (pph_in_bodies_spec_entry_htab): Use timer TV_PPH_RESTORE_MERGE_BODIES. (pph_out_merge_key_template_state): Use timer TV_PPH_SAVE_MERGE_KEYS. (pph_out_merge_body_template_state): Use timer TV_PPH_SAVE_MERGE_BODIES. (pph_in_merge_key_template_state): Use timer TV_PPH_RESTORE_MERGE_KEYS. (pph_in_merge_body_template_state): Use timer TV_PPH_RESTORE_MERGE_BODIES. ChangeLog.pph * timevar.def (TV_PPH): New timer. (TV_PPH_VALIDATE_IDENTIFIERS): New timer. (TV_PPH_SAVE): New timer. (TV_PPH_SAVE_LINE_TABLE): New timer. (TV_PPH_SAVE_IDENTIFIERS): New timer. (TV_PPH_SAVE_MERGE_KEYS): New timer. (TV_PPH_SAVE_MERGE_BODIES): New timer. (TV_PPH_SAVE_MISC): New timer. (TV_PPH_SAVE_REPLAY): New timer. (TV_PPH_RESTORE): New timer. (TV_PPH_RESTORE_LINE_TABLE): New timer. (TV_PPH_RESTORE_IDENTIFIERS): New timer. (TV_PPH_RESTORE_MERGE_KEYS): New timer. (TV_PPH_RESTORE_MERGE_BODIES): New timer. (TV_PPH_RESTORE_MISC): New timer. (TV_PPH_RESTORE_REPLAY): New timer. --- gcc/ChangeLog.pph| 19 +++ gcc/cp/ChangeLog.pph | 33 + gcc/cp/pph-core.c| 12 gcc/cp/pph-in.c | 29 - gcc/cp/pph-out.c | 31 ++- gcc/cp/pt.c | 14 ++ gcc/timevar.def | 20 7 files changed, 156 insertions(+), 2 deletions(-) diff --git a/gcc/ChangeLog.pph b/gcc/ChangeLog.pph index 2bf6c9a..5d8a45c 100644 --- a/gcc/ChangeLog.pph +++ b/gcc/ChangeLog.pph @@ -1,3 +1,22 @@ +2012-03-26 Diego Novillo dnovi...@google.com + + * timevar.def (TV_PPH): New timer. + (TV_PPH_VALIDATE_IDENTIFIERS): New timer. + (TV_PPH_SAVE): New timer. + (TV_PPH_SAVE_LINE_TABLE): New timer. + (TV_PPH_SAVE_IDENTIFIERS): New timer. + (TV_PPH_SAVE_MERGE_KEYS): New timer. + (TV_PPH_SAVE_MERGE_BODIES): New timer. + (TV_PPH_SAVE_MISC): New timer. + (TV_PPH_SAVE_REPLAY): New timer. + (TV_PPH_RESTORE): New timer. + (TV_PPH_RESTORE_LINE_TABLE): New timer. + (TV_PPH_RESTORE_IDENTIFIERS): New timer. + (TV_PPH_RESTORE_MERGE_KEYS): New timer. + (TV_PPH_RESTORE_MERGE_BODIES): New timer. + (TV_PPH_RESTORE_MISC): New timer. + (TV_PPH_RESTORE_REPLAY): New timer. + 2012-03-04 Diego Novillo dnovi...@google.com * tree.c (type_hash_traverse): Declare. diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph index f5b8281..3f6f5bb 100644 --- a/gcc/cp/ChangeLog.pph +++ b/gcc/cp/ChangeLog.pph @@ -1,3 +1,36 @@ +2012-03-26 Diego Novillo dnovi...@google.com + + * pph-core.c (pph_include_handler): Use timer TV_PPH. + (pph_init): Likewise. + (pph_finish): Likewise. + * pph-in.c (pph_in_line_table_and_includes): Use timer + TV_PPH_RESTORE_LINE_TABLE. + (pph_in_replay): Use timer TV_PPH_RESTORE_REPLAY. + (pph_in_identifiers): Use timer TV_PPH_RESTORE_IDENTIFIERS. + (pph_in_global_binding_keys): Use timer + TV_PPH_RESTORE_MERGE_KEYS. + (pph_in_global_binding_bodies): Use timer + TV_PPH_RESTORE_MERGE_BODIES. + (pph_read_file_1): Use timer TV_PPH_RESTORE_MISC and timer + TV_PPH_VALIDATE_IDENTIFIERS. + (pph_read_file): Use timer TV_PPH_RESTORE. + * pph-out.c (pph_writer_init): Use timer TV_PPH_SAVE. + (pph_out_line_table_and_includes): Use timer +
Re: [patch] Assert assemble_external is only called during or after expanding to RTL
On Mon, Mar 26, 2012 at 9:25 AM, Richard Guenther richard.guent...@gmail.com wrote: On Sat, Mar 24, 2012 at 9:25 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, This patch tightens the conditions on when assemble_external() may be called. It also removes a comment that most platforms do not define ASM_OUTPUT_EXTERNAL, because hasn't been true since r119764 added a definition of ASM_OUTPUT_EXTERNAL to elfos.h. This is the first step toward addressing PR17982 on the trunk for GCC 4.8. The next step is to change pending_assemble_externals to pending_assemble_visibility, and fold assemble_external_real() back into assemble_external. But first, this patch. I don't think this is very risky, because GCC now always works in unit-at-a-time mode. But I think it would be good to have this on the trunk for a week or so before proceeding. Bootstrapped tested on x86_64-unknown-linux-gnu. OK for trunk? Ok. Though I wonder why callers cannot simply push these decls to the pending varpool queue and we might do a final run over it, assembling things? [or why we call assemble_external that late at all ...] The idea is to only call assemble_external as late as possible, i.e. in final. See PR17982. The pending_assemble_external queue was a hack / work-around for a problem with c-decl calling DECL_RTL early. Before that hack was introduced, any call to assemble_external would set DECL_ASSEMBLER_NAME and DECL_RTL, and write out something to asm_out_file, during parsing. I'm trying to clean up that pending_assemble_external mess, because it causes some nasty compile time problems (PR 52640). Ciao! Steven
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
On Mon, Mar 26, 2012 at 9:31 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Mar 26, 2012 at 1:27 AM, Steven Bosscher stevenb@gmail.com wrote: Hello, This patch removes all calls to assemble_external from places other than final.c and MI-thunk generators. This is step 2 toward addressing PR17982 on the trunk for GCC 4.8. The next, and final, step will be to change pending_assemble_externals to pending_assemble_visibility, and fold assemble_external_real() back into assemble_external. Bootstrapped tested all default languages on x86_64-unknown-linux-gnu and on powerpc64-unknown-linux-gnu. Also cross-built c and c++ to mips-elf and tested on mips-sim (mips-elf was the only target I could find that really emits something for its ASM_OUTPUT_EXTERNAL target macro and has a sim in gdb. I hard-coded mips_output_external, by replacing if (!TARGET_EXPLICIT_RELOCS ... with if (1 ...). OK for trunk? Ok. (I think the remaining TREE_USED sets look dubious and may not be needed, too?) I'm not sure. I think some of them may still be necessary, because they were changed relatively recently (r126144). So I left them in for now. Actually, the whole back end is full of cleanup opportunities, now that the compiler always works in unit-at-a-time mode. But let's start small. Ciao! Steven
Re: [google] Minor cleanup and test fixes for -mpatch-functions-for-instrumentation. (issue5877043)
Ok for google branches (main and 4_7). thanks, David On Wed, Mar 21, 2012 at 2:45 PM, Harshit Chopra hars...@google.com wrote: 2012-03-21 Harshit Chopra hars...@google.com Minor changes: i386.c: made check_should_patch_current_function C90 compatible. i386.md: Added '\t' to bytes generated by ix86_output_function_nops_prologue_epilogue for proper formatting of assembly. patch-functions-*.c: Fixed verification in tests. Added a test to verify nop-bytes generated for sibling calls and another test to verify a binary with nop-bytes runs properly. * gcc/config/i386/i386.c (check_should_patch_current_function): * gcc/config/i386/i386.md: * gcc/testsuite/gcc.target/i386/patch-functions-1.c (void foo): (int main): * gcc/testsuite/gcc.target/i386/patch-functions-2.c: * gcc/testsuite/gcc.target/i386/patch-functions-3.c: * gcc/testsuite/gcc.target/i386/patch-functions-4.c: * gcc/testsuite/gcc.target/i386/patch-functions-5.c: * gcc/testsuite/gcc.target/i386/patch-functions-6.c: * gcc/testsuite/gcc.target/i386/patch-functions-7.c: * gcc/testsuite/gcc.target/i386/patch-functions-8.c (int foo): (int bar): * gcc/testsuite/gcc.target/i386/patch-functions-sibling-call.c: Testing method: make check-gcc RUNTESTFLAGS=i386.exp=patch-functions* --target_board=\unix{-m32,}\ Patch to be applied to google/main. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 08bd5f0..be1f7a4 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -10981,6 +10981,7 @@ check_should_patch_current_function (void) const char* func_name = NULL; struct loops loops; int num_loops = 0; + int min_functions_instructions; /* Patch the function if it has at least a loop. */ if (!patch_functions_ignore_loops) @@ -11007,7 +11008,7 @@ check_should_patch_current_function (void) strcmp(main, func_name) == 0) return true; - int min_functions_instructions = + min_functions_instructions = PARAM_VALUE (PARAM_FUNCTION_PATCH_MIN_INSTRUCTIONS); if (min_functions_instructions 0) { diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 08353ff..38a04ae 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11688,7 +11688,7 @@ /* Emit 10 nop bytes after ret. */ if (ix86_output_function_nops_prologue_epilogue (asm_out_file, FUNCTION_PATCH_EPILOGUE_SECTION, - ret, + \tret, 10)) return ; } @@ -11712,7 +11712,7 @@ /* Emit 9 nop bytes after rep;ret. */ if (ix86_output_function_nops_prologue_epilogue (asm_out_file, FUNCTION_PATCH_EPILOGUE_SECTION, - rep\;ret, + \trep\;ret, 9)) return ; } diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-1.c b/gcc/testsuite/gcc.target/i386/patch-functions-1.c index 308e8c3..aa1f424 100644 --- a/gcc/testsuite/gcc.target/i386/patch-functions-1.c +++ b/gcc/testsuite/gcc.target/i386/patch-functions-1.c @@ -1,5 +1,5 @@ /* Verify -mpatch-functions-for-instrumentation works. */ -/* { dg-do run } */ +/* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ /* { dg-options -mpatch-functions-for-instrumentation } */ @@ -8,13 +8,16 @@ /* Check nop-bytes at end. */ /* { dg-final { scan-assembler ret(.*).byte\t0x90(.*).byte\t0x90 } } */ -void foo() { +__attribute__ ((noinline)) +void foo() +{ /* Dummy loop. */ int x = 0; while (++x); } -int main() { +int main() +{ foo(); return 0; } diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-2.c b/gcc/testsuite/gcc.target/i386/patch-functions-2.c index 6baad32..78de867 100644 --- a/gcc/testsuite/gcc.target/i386/patch-functions-2.c +++ b/gcc/testsuite/gcc.target/i386/patch-functions-2.c @@ -1,4 +1,4 @@ -/* { dg-do run } */ +/* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ /* { dg-options -mpatch-functions-for-instrumentation -mno-patch-functions-main-always } */ @@ -8,11 +8,14 @@ /* { dg-final { scan-assembler-not .byte\t0xeb,0x09(.*).byte\t0x90 } } */ /* { dg-final { scan-assembler-not ret(.*).byte\t0x90(.*).byte\t0x90 } } */ -void foo() { +__attribute__ ((noinline)) +void foo() +{ int x = 0; } -int main() { +int main() +{ foo(); return 0; } diff --git a/gcc/testsuite/gcc.target/i386/patch-functions-3.c b/gcc/testsuite/gcc.target/i386/patch-functions-3.c index
Re: [patch][PR52640] Fix quadratic behavior with many referenced extern functions
On Wed, Mar 21, 2012 at 9:35 PM, Diego Novillo dnovi...@google.com wrote: On 3/21/12 3:30 PM, Steven Bosscher wrote: +/* FIXME: Trunk is at GCC 4.8 now and the above problem still hasn't been + addressed properly. This caused PR 52640 due to O(external_decls**2) + lookups in the pending_assemble_externals queue in assemble_external. + Paper over with this pointer set. (And pending_assemble_externals even + was a TREE_LIST before?!) */ +static struct pointer_set_t *pending_assemble_externals_set; Can you add some description on what this pointer set does? Done, I hope you agree it's sufficiently clear now what this set is for. OK for all release branches (if they are open for fixes). I've committed a version to the release branches where the pending_assemble_externals remains a TREE_LIST. I saw no reason to complicate the patch more than strictly necessary for the release branches. For trunk I'm trying to make pending_assemble_externals go away. Ciao! Steven
PATCH: Remove MaskExists property from option handling
On Mon, Mar 26, 2012 at 7:47 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers jos...@codesourcery.com wrote: On Sun, 25 Mar 2012, H.J. Lu wrote: Hi Joseph, I need to support InverseMask(XXX) in options without the corresponding Mask(XXX) since XXX is never set directly via a command line option. This patch adds a MaskNeeded property which turns InverseMask(XXX) into the inverse version of Mask(XXX), which allocates a unique bit and defines the same set of macros as Mask(XXX). Does it look OK? I'd have thought that either Mask or InverseMask with a given mask name (or a standalone target mask record) should cause allocation (only once, no matter how many options use the same mask name), and MaskExists should be removed, rather than adding MaskNeeded - if I understood correctly the purpose for which you are adding MaskNeeded. That is correct. I will work on a patch to remove MaskExists. Here is a patch to remove MaskExists. The difference between the old options.h and the new options.h on Linux/x86-64 are --- /tmp/options.h 2012-03-26 09:49:09.590137047 -0700 +++ ./options.h 2012-03-26 10:07:56.940168938 -0700 @@ -3677,55 +3677,55 @@ extern void cl_target_option_print (FILE #endif #define MASK_128BIT_LONG_DOUBLE (1 0) -#define OPTION_MASK_ISA_3DNOW (HOST_WIDE_INT_1 0) -#define OPTION_MASK_ISA_3DNOW_A (HOST_WIDE_INT_1 1) -#define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1 2) +#define OPTION_MASK_ISA_64BIT (HOST_WIDE_INT_1 0) +#define OPTION_MASK_ISA_3DNOW (HOST_WIDE_INT_1 1) +#define OPTION_MASK_ISA_3DNOW_A (HOST_WIDE_INT_1 2) #define MASK_80387 (1 1) #define MASK_USE_8BIT_IDIV (1 2) #define OPTION_MASK_ISA_ABM (HOST_WIDE_INT_1 3) #define MASK_ACCUMULATE_OUTGOING_ARGS (1 3) #define OPTION_MASK_ISA_AES (HOST_WIDE_INT_1 4) #define MASK_ALIGN_DOUBLE (1 4) +#define MASK_NO_ALIGN_STRINGOPS (1 5) #define OPTION_MASK_ANDROID (1 0) #define OPTION_MASK_ISA_AVX (HOST_WIDE_INT_1 5) #define OPTION_MASK_ISA_AVX2 (HOST_WIDE_INT_1 6) -#define MASK_AVX256_SPLIT_UNALIGNED_LOAD (1 5) -#define MASK_AVX256_SPLIT_UNALIGNED_STORE (1 6) +#define MASK_AVX256_SPLIT_UNALIGNED_LOAD (1 6) +#define MASK_AVX256_SPLIT_UNALIGNED_STORE (1 7) #define OPTION_MASK_ISA_BMI (HOST_WIDE_INT_1 7) #define OPTION_MASK_ISA_BMI2 (HOST_WIDE_INT_1 8) -#define MASK_CLD (1 7) +#define MASK_CLD (1 8) #define OPTION_MASK_ISA_CRC32 (HOST_WIDE_INT_1 9) #define OPTION_MASK_ISA_CX16 (HOST_WIDE_INT_1 10) #define OPTION_MASK_ISA_F16C (HOST_WIDE_INT_1 11) +#define MASK_NO_FANCY_MATH_387 (1 9) #define OPTION_MASK_ISA_FMA (HOST_WIDE_INT_1 12) #define OPTION_MASK_ISA_FMA4 (HOST_WIDE_INT_1 13) -#define MASK_FLOAT_RETURNS (1 8) +#define MASK_FLOAT_RETURNS (1 10) #define OPTION_MASK_ISA_FSGSBASE (HOST_WIDE_INT_1 14) -#define MASK_IEEE_FP (1 9) -#define MASK_INLINE_ALL_STRINGOPS (1 10) -#define MASK_INLINE_STRINGOPS_DYNAMICALLY (1 11) +#define MASK_IEEE_FP (1 11) +#define MASK_INLINE_ALL_STRINGOPS (1 12) +#define MASK_INLINE_STRINGOPS_DYNAMICALLY (1 13) #define OPTION_MASK_ISA_LWP (HOST_WIDE_INT_1 15) #define OPTION_MASK_ISA_LZCNT (HOST_WIDE_INT_1 16) #define OPTION_MASK_ISA_MMX (HOST_WIDE_INT_1 17) #define OPTION_MASK_ISA_MOVBE (HOST_WIDE_INT_1 18) -#define MASK_MS_BITFIELD_LAYOUT (1 12) -#define MASK_NO_ALIGN_STRINGOPS (1 13) -#define MASK_NO_FANCY_MATH_387 (1 14) +#define MASK_MS_BITFIELD_LAYOUT (1 14) #define MASK_NO_PUSH_ARGS (1 15) #define MASK_NO_RED_ZONE (1 16) +#define OPTION_MASK_ISA_SSE4_1 (HOST_WIDE_INT_1 19) #define MASK_OMIT_LEAF_FRAME_POINTER (1 17) -#define OPTION_MASK_ISA_PCLMUL (HOST_WIDE_INT_1 19) -#define OPTION_MASK_ISA_POPCNT (HOST_WIDE_INT_1 20) +#define OPTION_MASK_ISA_PCLMUL (HOST_WIDE_INT_1 20) +#define OPTION_MASK_ISA_POPCNT (HOST_WIDE_INT_1 21) #define MASK_PREFER_AVX128 (1 18) -#define OPTION_MASK_ISA_RDRND (HOST_WIDE_INT_1 21) +#define OPTION_MASK_ISA_RDRND (HOST_WIDE_INT_1 22) #define MASK_RECIP (1 19) #define MASK_RTD (1 20) -#define OPTION_MASK_ISA_RTM (HOST_WIDE_INT_1 22) -#define OPTION_MASK_ISA_SAHF (HOST_WIDE_INT_1 23) -#define OPTION_MASK_ISA_SSE (HOST_WIDE_INT_1 24) -#define OPTION_MASK_ISA_SSE2 (HOST_WIDE_INT_1 25) -#define OPTION_MASK_ISA_SSE3 (HOST_WIDE_INT_1 26) -#define OPTION_MASK_ISA_SSE4_1 (HOST_WIDE_INT_1 27) +#define OPTION_MASK_ISA_RTM (HOST_WIDE_INT_1 23) +#define OPTION_MASK_ISA_SAHF (HOST_WIDE_INT_1 24) +#define OPTION_MASK_ISA_SSE (HOST_WIDE_INT_1 25) +#define OPTION_MASK_ISA_SSE2 (HOST_WIDE_INT_1 26) +#define OPTION_MASK_ISA_SSE3 (HOST_WIDE_INT_1 27) #define OPTION_MASK_ISA_SSE4_2 (HOST_WIDE_INT_1 28) #define OPTION_MASK_ISA_SSE4A (HOST_WIDE_INT_1 29) #define MASK_SSEREGPARM (1 21) @@ -3742,15 +3742,16 @@ extern void cl_target_option_print (FILE #endif #define TARGET_128BIT_LONG_DOUBLE ((target_flags MASK_128BIT_LONG_DOUBLE) != 0) +#define OPTION_ISA_64BIT ((ix86_isa_flags OPTION_MASK_ISA_64BIT) != 0) #define OPTION_ISA_3DNOW
[PATCH, i386]: FIX PR 52698, reload failure with complex address
Hello! In a corner case of a reload, reload pass can generate partially reloaded address, where not all registers get allocated to a hard reg. When this address is checked with ix86_legitimate_address, it is rejected, since in strict mode, pseudos are not valid address registers. So, reload tries to legitimize following (partially invalid) address: (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) (reg:DI 97)) (reg:DI 2 cx)) via generic way by pushing all components into a register, forming (even more invalid) address that involves three registers (r8, r9 and rcx): (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8) (reg:DI 38 r9)) (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8]) (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal} (nil)) BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2... Attached patch fixes this situation by (partially) reloading only remaining pseudo(s), leaving UNSPEC in the address RTX. 2012-03-26 Uros Bizjak ubiz...@gmail.com PR target/52698 * config/i386/i386-protos.h (ix86_legitimize_reload_address): New prototype. * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. * config/i386/i386.c: Include reload.h. (ix86_legitimize_reload_address): New function. testsuite/ChangeLog: 2012-03-26 Uros Bizjak ubiz...@gmail.com H.J. Lu hongjiu...@intel.com PR target/52698 * gcc.target/i386/pr52698.c: New test. The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Since fixing reload issues is some kind of black magic, I'd like to ask Ulrich and Richard for their opinion on this approach. H.J., can you please test this fix on x32 testsuite for both address modes? Thanks, Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 185807) +++ config/i386/i386.c (working copy) @@ -47,6 +47,7 @@ #include target-def.h #include common/common-target.h #include langhooks.h +#include reload.h #include cgraph.h #include gimple.h #include dwarf2.h @@ -12010,6 +12011,64 @@ return false; } +/* Our implementation of LEGITIMIZE_RELOAD_ADDRESS. Returns a value to + replace the input X, or the original X if no replacement is called for. + The output parameter *WIN is 1 if the calling macro should goto WIN, + 0 if it should not. */ + +bool +ix86_legitimize_reload_address (rtx x, + enum machine_mode mode ATTRIBUTE_UNUSED, + int opnum, int type, + int ind_levels ATTRIBUTE_UNUSED) +{ + /* Reload can generate: + + (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) + (reg:DI 97)) + (reg:DI 2 cx)) + + This RTX is rejected from ix86_legitimate_address_p due to + non-strictness of base register 97. Following this rejection, + reload pushes all three components into separate registers, + creating invalid memory address RTX. + + Following code reloads only the invalid part of the + memory address RTX. */ + + if (GET_CODE (x) == PLUS + REG_P (XEXP (x, 1)) + GET_CODE (XEXP (x, 0)) == PLUS + REG_P (XEXP (XEXP (x, 0), 1))) +{ + rtx base, index; + bool something_reloaded = false; + + base = XEXP (XEXP (x, 0), 1); + if (!REG_OK_FOR_BASE_STRICT_P (base)) + { + push_reload (base, NULL_RTX, XEXP (XEXP (x, 0), 1), NULL, + BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, + opnum, (enum reload_type)type); + something_reloaded = true; + } + + index = XEXP (x, 1); + if (!REG_OK_FOR_INDEX_STRICT_P (index)) + { + push_reload (index, NULL_RTX, XEXP (x, 1), NULL, + INDEX_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, + opnum, (enum reload_type)type); + something_reloaded = true; + } + + gcc_assert (something_reloaded); + return true; +} + + return false; +} + /* Recognizes RTL expressions that are valid memory addresses for an instruction. The MODE argument is the machine mode for the MEM expression that wants to use this address. Index: config/i386/i386.h === --- config/i386/i386.h (revision 185807) +++ config/i386/i386.h (working copy) @@ -1630,6 +1630,17 @@ #define CONSTANT_ADDRESS_P(X) constant_address_p (X) +/* Try a machine-dependent way of reloading an illegitimate address + operand. If we find one, push the reload and jump to WIN. This + macro is used in only one place: `find_reloads_address' in reload.c. */ + +#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, INDL, WIN) \ +do {
[pph] Adjust expected patterns in some tests (issue5905068)
Adjust expected patterns in some tests. After the mangling adjustments I made earlier, these tests are now passing and or failing in a different way. Adjusted. 2012-03-26 Diego Novillo dnovi...@google.com * g++.dg/pph/x6dynarray4.cc: Adjust expected patterns. * g++.dg/pph/x6dynarray5.h: Likewise. * g++.dg/pph/x7dynarray5.cc: Likewise. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph@185814 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/testsuite/ChangeLog.pph |6 ++ gcc/testsuite/g++.dg/pph/x6dynarray4.cc | 10 -- gcc/testsuite/g++.dg/pph/x6dynarray5.h |9 - gcc/testsuite/g++.dg/pph/x7dynarray5.cc | 19 +-- 4 files changed, 31 insertions(+), 13 deletions(-) diff --git a/gcc/testsuite/ChangeLog.pph b/gcc/testsuite/ChangeLog.pph index b510b55..ded2ef9 100644 --- a/gcc/testsuite/ChangeLog.pph +++ b/gcc/testsuite/ChangeLog.pph @@ -1,3 +1,9 @@ +2012-03-26 Diego Novillo dnovi...@google.com + + * g++.dg/pph/x6dynarray4.cc: Adjust expected patterns. + * g++.dg/pph/x6dynarray5.h: Likewise. + * g++.dg/pph/x7dynarray5.cc: Likewise. + 2012-03-23 Diego Novillo dnovi...@google.com * g++.dg/pph/x1mbstate_t.h: Mark fixed. diff --git a/gcc/testsuite/g++.dg/pph/x6dynarray4.cc b/gcc/testsuite/g++.dg/pph/x6dynarray4.cc index 74fb0d9..6e43954 100644 --- a/gcc/testsuite/g++.dg/pph/x6dynarray4.cc +++ b/gcc/testsuite/g++.dg/pph/x6dynarray4.cc @@ -1,6 +1,12 @@ -// {xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } } +// { xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } } +// { dg-bogus a0dynarray-dcl3.hi:11:60: error: call of overloaded .operator new ...sizetype.. is ambiguous { xfail *-*-* } 0 } +// { dg-bogus stl_algobase.h:625:18: error: .__value. is not a member of .std::__is_move_iterator { xfail *-*-* } 0 } +// { dg-bogus stl_algobase.h:276:12: error: .__value. is not a member of .std::__is_move_iterator { xfail *-*-* } 0 } +// { dg-bogus stl_algobase.h:625:18: error: no matching function for call to .__miter_base { xfail *-*-* } 0 } +// { dg-bogus stl_algobase.h:282:5: error: no type named .iterator_type. in .struct std::_Miter_base. { xfail *-*-* } 0 } +// { dg-bogus stl_algobase.h:139:7: error: .__value. is not a member of .std::__are_same { xfail *-*-* } 0 } -#include x6dynarray5.h // { dg-bogus cannot open PPH file x6dynarray5.pph { xfail *-*-* } } +#include x6dynarray5.h #include algorithm diff --git a/gcc/testsuite/g++.dg/pph/x6dynarray5.h b/gcc/testsuite/g++.dg/pph/x6dynarray5.h index 05634c1..69dbcb8 100644 --- a/gcc/testsuite/g++.dg/pph/x6dynarray5.h +++ b/gcc/testsuite/g++.dg/pph/x6dynarray5.h @@ -1,12 +1,3 @@ -// { dg-xfail-if UNIMPL TRAIT MANGLING { *-*-* } { -fpph-map=pph.map } } -// { dg-bogus bits/stl_uninitialized.h:66:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/stl_uninitialized.h:124:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/stl_uninitialized.h:178:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/allocator.h:153:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/stl_construct.h:98:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/stl_tempbuf.h:183:12: sorry, unimplemented: mangling trait_expr { xfail *-*-* } 0 } -// { dg-bogus bits/cpp_type_traits.h:87:12: internal compiler error: tree check: expected integer_cst, have const_decl in tree_int_cst_sgn { xfail *-*-* } 0 } - #ifndef X6DYNARRAY5_H #define X6DYNARRAY5_H diff --git a/gcc/testsuite/g++.dg/pph/x7dynarray5.cc b/gcc/testsuite/g++.dg/pph/x7dynarray5.cc index cd55eab..628ea7c 100644 --- a/gcc/testsuite/g++.dg/pph/x7dynarray5.cc +++ b/gcc/testsuite/g++.dg/pph/x7dynarray5.cc @@ -1,8 +1,23 @@ -// {xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } } +// { dg-xfail-if DEPENDENT { *-*-* } { -fpph-map=pph.map } } +// { dg-bogus streambuf_iterator.h:372:58: internal compiler error: canonical types differ for identical types.* { xfail *-*-* } 0 } // { dg-bogus ext/atomicity.h:48:61: error: void value not ignored as it ought to be { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:111:63: error: no class template named .rebind. in .class std::allocatorchar. { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:111:63: error: no class template named .rebind. in .class std::allocatorwchar_t. { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:124:68: error: no class template named .rebind. in .class std::allocatorchar. { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:124:68: error: no class template named .rebind. in .class std::allocatorwchar_t. { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:126:61: error: no class template named .rebind. in .class std::allocatorchar. { xfail *-*-* } 0 } +// { dg-bogus basic_string.h:126:61: error: no class template named .rebind. in
Re: [PATCH, i386]: FIX PR 52698, reload failure with complex address
On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak ubiz...@gmail.com wrote: Hello! In a corner case of a reload, reload pass can generate partially reloaded address, where not all registers get allocated to a hard reg. When this address is checked with ix86_legitimate_address, it is rejected, since in strict mode, pseudos are not valid address registers. So, reload tries to legitimize following (partially invalid) address: (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) (reg:DI 97)) (reg:DI 2 cx)) via generic way by pushing all components into a register, forming (even more invalid) address that involves three registers (r8, r9 and rcx): (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8) (reg:DI 38 r9)) (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8]) (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal} (nil)) BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2... Attached patch fixes this situation by (partially) reloading only remaining pseudo(s), leaving UNSPEC in the address RTX. 2012-03-26 Uros Bizjak ubiz...@gmail.com PR target/52698 * config/i386/i386-protos.h (ix86_legitimize_reload_address): New prototype. * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. * config/i386/i386.c: Include reload.h. (ix86_legitimize_reload_address): New function. testsuite/ChangeLog: 2012-03-26 Uros Bizjak ubiz...@gmail.com H.J. Lu hongjiu...@intel.com PR target/52698 * gcc.target/i386/pr52698.c: New test. The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Since fixing reload issues is some kind of black magic, I'd like to ask Ulrich and Richard for their opinion on this approach. H.J., can you please test this fix on x32 testsuite for both address modes? I am on it. -- H.J.
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
Uh. When is a field a bit field though? At least stor-layout.c resets DECL_BIT_FIELD when local relative alignment is proper and the filed has an integer mode. That's overly optimistic if the record is placed at a bit position. Of course we still have DECL_BIT_FIELD_TYPE, but I wonder what the DECL_BIT_FIELD flag is for if we cannot really know whether the field in the end is a bitfield anyway... (what value do the various field-decls in a component-ref chain have anyway if the real interpretation is subject to the containing object, which might be even only specified by the type used for an indirect ref ...) We're talking about a bit field with record type here, so anything calculated within the record type in isolation is essentially invalidated for the bit field in question. Fortunately this only occurs in Ada, which doesn't use the very questionable C++ memory model (fingers crossed for the future. :-) -- Eric Botcazou
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
The patch looks reasonable - can we compute this backward from the result of the outer get_inner_reference call and the outermost field-decl though? Or make get_inner_reference compute that while analyzing the full reference and return a flag? OTOH it shouldn't be too expensive. There are a lot of calls to get_inner_reference throughout the compiler though. And it is cheap if you compare it with the 4.7 code, which essentially does the same processing for _every_ field in the record (and even if the record object isn't a handled_component_p). -- Eric Botcazou
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
On Mon, Mar 26, 2012 at 9:26 AM, Steven Bosscher stevenb@gmail.com wrote: On Mon, Mar 26, 2012 at 9:31 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Mar 26, 2012 at 1:27 AM, Steven Bosscher stevenb@gmail.com wrote: Hello, This patch removes all calls to assemble_external from places other than final.c and MI-thunk generators. This is step 2 toward addressing PR17982 on the trunk for GCC 4.8. The next, and final, step will be to change pending_assemble_externals to pending_assemble_visibility, and fold assemble_external_real() back into assemble_external. Bootstrapped tested all default languages on x86_64-unknown-linux-gnu and on powerpc64-unknown-linux-gnu. Also cross-built c and c++ to mips-elf and tested on mips-sim (mips-elf was the only target I could find that really emits something for its ASM_OUTPUT_EXTERNAL target macro and has a sim in gdb. I hard-coded mips_output_external, by replacing if (!TARGET_EXPLICIT_RELOCS ... with if (1 ...). OK for trunk? Ok. (I think the remaining TREE_USED sets look dubious and may not be needed, too?) I'm not sure. I think some of them may still be necessary, because they were changed relatively recently (r126144). So I left them in for now. Actually, the whole back end is full of cleanup opportunities, now that the compiler always works in unit-at-a-time mode. But let's start small. It may have caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730 -- H.J.
Re: PATCH: Remove MaskExists property from option handling
On Mon, Mar 26, 2012 at 10:20 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 26, 2012 at 7:47 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 26, 2012 at 4:53 AM, Joseph S. Myers jos...@codesourcery.com wrote: On Sun, 25 Mar 2012, H.J. Lu wrote: Hi Joseph, I need to support InverseMask(XXX) in options without the corresponding Mask(XXX) since XXX is never set directly via a command line option. This patch adds a MaskNeeded property which turns InverseMask(XXX) into the inverse version of Mask(XXX), which allocates a unique bit and defines the same set of macros as Mask(XXX). Does it look OK? I'd have thought that either Mask or InverseMask with a given mask name (or a standalone target mask record) should cause allocation (only once, no matter how many options use the same mask name), and MaskExists should be removed, rather than adding MaskNeeded - if I understood correctly the purpose for which you are adding MaskNeeded. That is correct. I will work on a patch to remove MaskExists. Here is a patch to remove MaskExists. The difference between the old options.h and the new options.h on Linux/x86-64 are ... The same set of macros are provided. The only differences are the order of bits used and macro definitions. I am doing a full bootstrap and test on Linux/x86-64. OK for trunk if there are no regressions? I will submit a separate patch to remove MaskExists from all *.opt files after this patch is installed. Thanks. -- H.J. 2012-03-26 H.J. Lu hongjiu...@intel.com * opth-gen.awk: Allocated a bit for Mask and InverseMask if it hasn't been allocated. Define a target macro for Mask and InverseMask if it hasn't been defined. Remove MaskExists handling. * doc/options.texi: Remove MaskNeeded. There are no regressions on Linux/x86-64. OK for trunk? Thanks. -- H.J.
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
On Mon, Mar 26, 2012 at 8:22 PM, H.J. Lu hjl.to...@gmail.com wrote: It may have caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730 It certainly seems so. Looking into it... Ciao! Steven
Re: [Patch,avr]: PR target/52692: Imlpement TARGET_BUILTIN_DECL
2012/3/26 Georg-Johann Lay a...@gjlay.de: This implements TARGET_BUILTIN_DECL which is needed to make LTO work with target-specific built-ins. struct avr_builtin_description gets a new field .fndecl which is initialized during avr_init_builtins and looked up in new hook avr_builtin_decl. The built-ins are initialized in such a way that there is always avr_bdesc[x].id == x i.e. .id is superfluous and avr_expand_builtin can use id to access the array and need no search loop to find built-in id. Johann Ok? gcc/ PR target/52692 * config/avr/avr.c (TARGET_BUILTIN_DECL): New define. (avr_builtin_decl): New static function. (struct avr_builtin_description, avr_bdesc): Move up. Add GTY marker. Add field fndecl. Remove redundant field id. (avr_init_builtins): Initialize avr_bdesc[ID].fndecl. (avr_expand_builtin): Code cleanup because .id is removed. testsuite/ * gcc.target/avr/torture/builtins-2.c: New test. Please, apply. Denis.
Re: [patch] Split parts of cse_insn out to a few new functions
Steven Bosscher stevenb@gmail.com writes: On Wed, Mar 21, 2012 at 1:13 AM, Ian Lance Taylor wrote: On Tue, Mar 20, 2012 at 2:06 PM, Steven Bosscher wrote: This patch splits a couple of pieces of cse_insn out to new functions. There are no functional changes, and no code generation differences as far as I could tell on x86_64 (-m64 and -m32). Likewise for the attached patch. The purpose of the patch is and, loto hopefully make cse_insn easier to understand. In a follow-up patch, I will make canonicalize_insn run only once per insn (it currently, i.e. before and after this patch, runs multiple times for CSE on extended basic blocks if a block is in multiple extended basic blocks). That is what the attached patch does. Bootstrappedtested on x86_64-unknown-linux-gnu. OK for trunk? Ciao! Steven * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to tag basic blocks that have already been traversed at least once, so that all insns have been canonicalized. (cse_insn): Call canonicalize_insn only if the basic block that contains insn is visited for the first time. (cse_extended_basic_block): After visiting all insns in a basic block, mark the block in cse_canonicalized_basic_blocks. (cse_main): Setup and destroy cse_canonicalized_basic_blocks. OK, thanks (without the microoptimisation, as you say). Out of curiosity, do you still see this bit as useful: /* We potentially will process this insn many times. Therefore, drop the REG_EQUAL note if it is equal to the SET_SRC of the unique set in INSN. Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART, because cse_insn handles those specially. */ ? Does many times mean in CSE, or later? Richard
Re: [PATCH, i386]: FIX PR 52698, reload failure with complex address
On Mon, Mar 26, 2012 at 10:41 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 26, 2012 at 10:25 AM, Uros Bizjak ubiz...@gmail.com wrote: Hello! In a corner case of a reload, reload pass can generate partially reloaded address, where not all registers get allocated to a hard reg. When this address is checked with ix86_legitimate_address, it is rejected, since in strict mode, pseudos are not valid address registers. So, reload tries to legitimize following (partially invalid) address: (plus:DI (plus:DI (unspec:DI [(const_int 0 [0])] UNSPEC_TP) (reg:DI 97)) (reg:DI 2 cx)) via generic way by pushing all components into a register, forming (even more invalid) address that involves three registers (r8, r9 and rcx): (insn 52 78 53 5 (set (mem/j:QI (plus:DI (plus:DI (reg:DI 37 r8) (reg:DI 38 r9)) (reg:DI 2 cx [orig:98 D.1709 ] [98])) [0 foo S1 A8]) (reg:QI 1 dx [100])) /tmp/x.c:12 66 {*movqi_internal} (nil)) BTW: x86 declares MAX_REGISTER_PER_ADDRESS to 2... Attached patch fixes this situation by (partially) reloading only remaining pseudo(s), leaving UNSPEC in the address RTX. 2012-03-26 Uros Bizjak ubiz...@gmail.com PR target/52698 * config/i386/i386-protos.h (ix86_legitimize_reload_address): New prototype. * config/i386/i386.h (LEGITIMIZE_RELOAD_ADDRESS): New define. * config/i386/i386.c: Include reload.h. (ix86_legitimize_reload_address): New function. testsuite/ChangeLog: 2012-03-26 Uros Bizjak ubiz...@gmail.com H.J. Lu hongjiu...@intel.com PR target/52698 * gcc.target/i386/pr52698.c: New test. The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Since fixing reload issues is some kind of black magic, I'd like to ask Ulrich and Richard for their opinion on this approach. H.J., can you please test this fix on x32 testsuite for both address modes? I am on it. There are no regressions in gcc x32 testsuite nor glibc tests for both address modes. But I didn't check code quality nor SPEC CPU performance. -- H.J.
Re: [patch] Split parts of cse_insn out to a few new functions
On Mon, Mar 26, 2012 at 9:02 PM, Richard Sandiford rdsandif...@googlemail.com wrote: * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to tag basic blocks that have already been traversed at least once, so that all insns have been canonicalized. (cse_insn): Call canonicalize_insn only if the basic block that contains insn is visited for the first time. (cse_extended_basic_block): After visiting all insns in a basic block, mark the block in cse_canonicalized_basic_blocks. (cse_main): Setup and destroy cse_canonicalized_basic_blocks. OK, thanks (without the microoptimisation, as you say). Thanks, will commit! Out of curiosity, do you still see this bit as useful: /* We potentially will process this insn many times. Therefore, drop the REG_EQUAL note if it is equal to the SET_SRC of the unique set in INSN. Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART, because cse_insn handles those specially. */ ? Does many times mean in CSE, or later? It means in CSE, yes. But with the patch to canonicalize only once, I suppose this can go away again. Having said that, I do believe it would be good if we avoid having REG_EQUAL notes that are rtx_equal_p to the SET_SRC. This happens quite frequently after CSE. I'm not sure how to clean them up. Ciao! Steven
Re: [C++ RFC / Patch] Implementing Deducing noexcept for destructors
On 03/26/2012 07:22 AM, Paolo Carlini wrote: My basic idea so far is very simple: --- class.c (revision 185792) +++ class.c (working copy) @@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree using_dec destructor, type); } + else if (cxx_dialect = cxx0x + !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method))) + TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method), + noexcept_true_spec); } That would implement N1366, but implementing N3204 is a bit more involved. You need to copy TYPE_RAISES_EXCEPTIONS from the result of implicitly_declare_fn; see defaulted_late_check for something similar. Also, this is too early, since we can't know what the eh specification of the implicit declaration would be until the closing brace of the class. struct True2 { virtual ~True2(); }; struct False { ~False() noexcept(false); }; template typename Base, typename Member struct C : Base { Member mem; }; SA(!noexcept(CTrue2, False())); it doesn't compile at all because: noexcept_PR50043.C:21:8: error: looser throw specifier for ‘virtual CTrue2, False::~C() noexcept (false)’ noexcept_PR50043.C:5:24: error: overriding ‘virtual True2::~True2() noexcept (true)’ is this expected? Maybe, but I'm not sure. Yes. Adding noexcept to ~True2 causes the same error without your patch. Jason
Re: remove wrong code in immed_double_const
Mike Stump mikest...@comcast.net writes: On Mar 23, 2012, at 3:01 AM, Richard Sandiford wrote: ...it doesn't mean that we interpret the value as a negative _rtx_. As with all rtx calculations, things like signedness and saturation are decided by the operation rather than the type (type == rtx mode). Ah... [ light goes on ] Let me adjust the documentation to be exceptionally clear in this case. Check out the new wording on const_int, const_double and on immed_double_const. I fixed simplify_const_unary_operation to match your suggestion. Sorry for the rather rambling explanation :-) I still think the version I suggested for this hunk is right though. I agree. I now see what I had wrong. Thanks for your patience and explanations. If you like the wording I used in the doc and on immed_double_const, I think we have now resolved all issues. The previous version was bootstraped and regression tested on darwin, fortran, c, c++, objective-c++... I'll do one more run with the update for simplify_const_unary_operation, as that can trip when before it would merely return 0. Ok? Index: doc/rtl.texi === --- doc/rtl.texi (revision 185706) +++ doc/rtl.texi (working copy) @@ -1479,8 +1479,13 @@ This type of expression represents the i is customarily accessed with the macro @code{INTVAL} as in @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}. -Constants generated for modes with fewer bits than @code{HOST_WIDE_INT} -must be sign extended to full width (e.g., with @code{gen_int_mode}). +Constants generated for modes with fewer bits than in +@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with +@code{gen_int_mode}). For constants for modes with more bits than in +@code{HOST_WIDE_INT} the implied high order bits of that constant are +copies of the top bit, however values are neither signed nor unsigned. +All operations defined on such constants define the signededness. I'm not someone who should be wordsmithing, but I think: ...copies of the top bit. Note however that values are neither inherently signed nor inherently unsigned; where necessary, signedness is determined by the rtl operation instead. @@ -1510,7 +1515,12 @@ Represents either a floating-point const integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within twice that number of bits (GCC does not provide a mechanism to represent even larger constants). In -the latter case, @var{m} will be @code{VOIDmode}. +the latter case, @var{m} will be @code{VOIDmode}. For integral values +constants for modes with more bits than twice the number in +@code{HOST_WIDE_INT} the implied high order bits of that constant are +copies of the top bit of @code{CONST_DOUBLE_HIGH}, however integral +values are neither signed nor unsigned. All operations defined on +such constants define the signededness. Same idea here. +/* Return an rtx for the sum of X and the integer C, given that X has + mode MODE. This routine should be used instead of plus_constant + when they want to ensure that addition happens in a particular + mode, which is necessary when x can be a VOIDmode CONST_INT or Sorry, just noticed, should be ...when X can be... Looks good to me otherwise, thanks. Richi? Richard
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
On Mon, Mar 26, 2012 at 8:51 PM, Steven Bosscher stevenb@gmail.com wrote: On Mon, Mar 26, 2012 at 8:22 PM, H.J. Lu hjl.to...@gmail.com wrote: It may have caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52730 It certainly seems so. Looking into it... Java is the culprit. I'm going to commit the following patch to paper over the issue for now. Index: varasm.c === --- varasm.c(revision 185823) +++ varasm.c(working copy) @@ -2170,6 +2170,10 @@ If it's not, we should not be calling this function. */ gcc_assert (asm_out_file); + /* In a perfect world, the following condition would be true. + Sadly, the Java and Go front ends emit assembly *from the front end*, + bypassing the call graph. See PR52739. Fix before GCC 4.8. */ +#if 0 /* This function should only be called if we are expanding, or have expanded, to RTL. Ideally, only final.c would be calling this function, but it is @@ -2177,6 +2181,7 @@ for further discussion. */ gcc_assert (cgraph_state == CGRAPH_STATE_EXPANSION || cgraph_state == CGRAPH_STATE_FINISHED); +#endif if (!DECL_P (decl) || !DECL_EXTERNAL (decl) || !TREE_PUBLIC (decl)) return;
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
Index: varasm.c === --- varasm.c (revision 185823) +++ varasm.c (working copy) @@ -2170,6 +2170,10 @@ If it's not, we should not be calling this function. */ gcc_assert (asm_out_file); + /* In a perfect world, the following condition would be true. + Sadly, the Java and Go front ends emit assembly *from the front end*, + bypassing the call graph. See PR52739. Fix before GCC 4.8. */ There is no PR52739. -- Eric Botcazou
[patch][RFC] bail out after front-end errors
Hello, This patch is one way to address PR44982. I see no good reason to cgraph_finalize_compilation_unit if there were parse errors. As Richi already pointed out, GCC traditionally has proceeded after parse errors to preserve warnings and errors we generate from the middle-end and during semantic analysis. But it seems to me that those warnings are not very meaningful after parse errors (-Wuninitialized after a parse error??), and errors from the middle end are mostly for exotic code (involving asm()s and the like). Bailing out after parse errors is therefore IMHO the right thing to do for the common case. Thoughts? Comments? If the consensus is that this patch goes in, I'll also have to do some work on the test suite, because some warnings and errors disappear. List attached below. A lot of errors and warnings from g++ disappear. I suspect this is because they are only issued during gimplification. That is something I'll have to address before this patch could go in. Before I spend the effort, I'd like to know if there is consensus on the general direction proposed here... ;-) Ciao! Steven Index: toplev.c === --- toplev.c(revision 185813) +++ toplev.c(working copy) @@ -561,9 +561,14 @@ compile_file (void) /* Compilation is now finished except for writing what's left of the symbol table output. */ - if (flag_syntax_only || flag_wpa) + /* If all we have to do is syntax checking, or if there were parse + errors, stop here. */ + if (flag_syntax_only || seen_error ()) return; + if (flag_wpa) +return; + timevar_start (TV_PHASE_GENERATE); ggc_protect_identifiers = false; @@ -571,12 +576,6 @@ compile_file (void) /* This must also call cgraph_finalize_compilation_unit. */ lang_hooks.decls.final_write_globals (); - if (seen_error ()) -{ - timevar_stop (TV_PHASE_GENERATE); - return; -} - /* Compilation unit is finalized. When producing non-fat LTO object, we are basically finished. */ if (in_lto_p || !flag_lto || flag_fat_lto_objects) New failing tests: FAIL: gcc.dg/asm-7.c (test for errors, line 15) FAIL: gcc.dg/asm-7.c (test for errors, line 16) FAIL: gcc.dg/declspec-10.c (test for warnings, line 19) FAIL: gcc.dg/declspec-11.c (test for warnings, line 19) FAIL: gcc.dg/declspec-9.c (test for errors, line 20) FAIL: gcc.dg/gnu99-static-1.c (test for errors, line 21) FAIL: gcc.dg/gnu99-static-1.c (test for errors, line 25) FAIL: gcc.dg/pr48552-1.c (test for errors, line 16) FAIL: gcc.dg/pr48552-1.c (test for errors, line 40) FAIL: gcc.dg/pr48552-1.c (test for errors, line 52) FAIL: gcc.dg/pr48552-2.c (test for errors, line 16) FAIL: gcc.dg/pr48552-2.c (test for errors, line 40) FAIL: gcc.dg/pr48552-2.c (test for errors, line 52) FAIL: gcc.dg/redecl-10.c (test for warnings, line 15) FAIL: gcc.dg/redecl-10.c (test for warnings, line 29) FAIL: gcc.dg/gomp/block-2.c (test for errors, line 14) FAIL: gcc.dg/gomp/block-2.c (test for errors, line 16) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 9) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 10) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 11) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 15) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 16) FAIL: gcc.dg/gomp/block-7.c (test for errors, line 17) FAIL: gcc.dg/gomp/pr27415.c (test for errors, line 9) FAIL: gcc.dg/gomp/pr27415.c (test for errors, line 28) FAIL: gcc.dg/gomp/pr27415.c (test for errors, line 37) FAIL: c-c++-common/tm/safe-3.c (internal compiler error) FAIL: c-c++-common/tm/safe-3.c (test for excess errors) FAIL: gcc.dg/tm/pr52141.c (internal compiler error) FAIL: gcc.dg/tm/pr52141.c (test for excess errors) FAIL: g++.dg/cpp0x/constexpr-ex1.C (test for warnings, line 17) FAIL: g++.dg/cpp0x/constexpr-function2.C (test for warnings, line 46) FAIL: g++.dg/cpp0x/constexpr-neg1.C (test for warnings, line 5) FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C (test for warnings, line 15) FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C not an aggregate (test for errors, line 16) FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C deleted default ctor (test for errors, line 17) FAIL: g++.dg/cpp0x/lambda/lambda-ctor-neg.C deleted assignment op (test for errors, line 18) FAIL: g++.dg/cpp0x/lambda/lambda-field-names.C no member named i (test for errors, line 11) FAIL: g++.dg/cpp0x/noexcept15.C (test for errors, line 16) FAIL: g++.dg/cpp0x/pr47416.C (test for errors, line 187) FAIL: g++.dg/cpp0x/pr47416.C (test for warnings, line 213) FAIL: g++.dg/cpp0x/pr47416.C (test for warnings, line 223) FAIL: g++.dg/cpp0x/static_assert2.C (test for errors, line 14) FAIL: g++.dg/cpp0x/union1.C (test for errors, line 17) FAIL: g++.dg/cpp0x/union1.C (test for errors, line 18) FAIL: g++.dg/cpp0x/union1.C (test for errors, line 28) FAIL: g++.dg/cpp0x/union1.C (test for errors, line 29)
Re: [patch] Call assemble_external only from final.c and from MI-thunk hooks
On Mon, Mar 26, 2012 at 10:51 PM, Eric Botcazou ebotca...@adacore.com wrote: Index: varasm.c === --- varasm.c (revision 185823) +++ varasm.c (working copy) @@ -2170,6 +2170,10 @@ If it's not, we should not be calling this function. */ gcc_assert (asm_out_file); + /* In a perfect world, the following condition would be true. + Sadly, the Java and Go front ends emit assembly *from the front end*, + bypassing the call graph. See PR52739. Fix before GCC 4.8. */ There is no PR52739. *sigh* PR52730. It's one of those days...
Re: [Patch V2] libgfortran: do not assume libm
On Mon, Mar 26, 2012 at 11:44, Tristan Gingold ging...@adacore.com wrote: On Mar 22, 2012, at 11:06 AM, Paolo Bonzini wrote: Il 22/03/2012 09:30, Tristan Gingold ha scritto: Hi, this is version 2 of the patch. The initial problem is that libgfortran configure.ac used AC_CHECK_LIB([m]…) to check wether several math functions are available. That doesn't work on VMS, because there is no such things as a libm. It seems to me that there are no autoconf macro to check wether a function is available: AC_CHECK_FUNC[S] don't allow to specify additional include files (necessary on VMS as some math functions are renamed through macros), and AC_CHECK_DECL only checks the presence of the declaration (as pointed out by Ralf). So I have finally learnt a little bit more about autoconf and added a new file: config/math.m4 (build machinery maintainer cc:) libgfortran/configure.ac now uses the new GCC_CHECK_MATH_FUNC macro. Bootstrapped without gfortran regressions on x86_64-darwin. Ok for trunk ? (I will submit a follow-up change in libquadmath once this change is approved). Tristan. config/ 2012-03-22 Tristan Gingold ging...@adacore.com * math.m4: New file. libgfortran/ 2012-03-22 Tristan Gingold ging...@adacore.com * configure.ac: Use GCC_CHECK_MATH_FUNC for math functions. * acinclude.m4: Include ../config/math.m4 * configure: Regenerate. * Makefile.in: Regenerate. Ok. Thanks, now committed. On x86_64-unknown-linux-gnu bootstrap I see the following: make[1]: Entering directory `/home/janne/src/gfortran/trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran' make check-am make[2]: Entering directory `/home/janne/src/gfortran/trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran' (CDPATH=${ZSH_VERSION+.}: cd ../../../trunk-git/libgfortran /bin/bash /home/janne/src/gfortran/trunk/trunk-git/missing --run autoheader) true DO=all multi-do # make make[1]: Entering directory `/home/janne/src/gfortran/trunk/objdir-git/gcc' Making a new config file... echo set tmpdir /home/janne/src/gfortran/trunk/objdir-git/gcc/testsuite ./site.tmp autoheader: warning: missing template: HAVE_ACOS autoheader: Use AC_DEFINE([HAVE_ACOS], [], [Description]) autoheader: warning: missing template: HAVE_ACOSF autoheader: warning: missing template: HAVE_ACOSH autoheader: warning: missing template: HAVE_ACOSHF autoheader: warning: missing template: HAVE_ACOSHL autoheader: warning: missing template: HAVE_ACOSL autoheader: warning: missing template: HAVE_ASIN autoheader: warning: missing template: HAVE_ASINF autoheader: warning: missing template: HAVE_ASINH autoheader: warning: missing template: HAVE_ASINHF autoheader: warning: missing template: HAVE_ASINHL autoheader: warning: missing template: HAVE_ASINL autoheader: warning: missing template: HAVE_ATAN autoheader: warning: missing template: HAVE_ATAN2 autoheader: warning: missing template: HAVE_ATAN2F autoheader: warning: missing template: HAVE_ATAN2L autoheader: warning: missing template: HAVE_ATANF autoheader: warning: missing template: HAVE_ATANH autoheader: warning: missing template: HAVE_ATANHF autoheader: warning: missing template: HAVE_ATANHL autoheader: warning: missing template: HAVE_ATANL autoheader: warning: missing template: HAVE_CABS autoheader: warning: missing template: HAVE_CABSF autoheader: warning: missing template: HAVE_CABSL autoheader: warning: missing template: HAVE_CACOS autoheader: warning: missing template: HAVE_CACOSF autoheader: warning: missing template: HAVE_CACOSH autoheader: warning: missing template: HAVE_CACOSHF autoheader: warning: missing template: HAVE_CACOSHL autoheader: warning: missing template: HAVE_CACOSL autoheader: warning: missing template: HAVE_CARG autoheader: warning: missing template: HAVE_CARGF autoheader: warning: missing template: HAVE_CARGL autoheader: warning: missing template: HAVE_CASIN autoheader: warning: missing template: HAVE_CASINF autoheader: warning: missing template: HAVE_CASINH autoheader: warning: missing template: HAVE_CASINHF autoheader: warning: missing template: HAVE_CASINHL autoheader: warning: missing template: HAVE_CASINL autoheader: warning: missing template: HAVE_CATAN autoheader: warning: missing template: HAVE_CATANF autoheader: warning: missing template: HAVE_CATANH autoheader: warning: missing template: HAVE_CATANHF autoheader: warning: missing template: HAVE_CATANHL autoheader: warning: missing template: HAVE_CATANL autoheader: warning: missing template: HAVE_CCOS autoheader: warning: missing template: HAVE_CCOSF autoheader: warning: missing template: HAVE_CCOSH autoheader: warning: missing template: HAVE_CCOSHF autoheader: warning: missing template: HAVE_CCOSHL autoheader: warning: missing template: HAVE_CCOSL autoheader: warning: missing template: HAVE_CEIL autoheader: warning: missing template: HAVE_CEILF autoheader: warning: missing template: HAVE_CEILL autoheader: warning: missing template: HAVE_CEXP autoheader: warning: missing
[SH] PR 50751 - rework displacement calculations
Hi, The attached patch generalizes the move insn displacement calculations a little bit. Before, the same address rebasing code was present in sh_legitimize_address as well as sh_legitimize_reload_address. I've pulled those out into a separate function as a preparation step for adding HImode displacement addressing support. Tested by doing 'make all' (c,c++), compiling newlib and by making sure that for the following variants the CSiBE set code size did not change: -m2a-single -mb -O2, -m2a-single -mb -Os, -m4a-single -mb -O2, -m4a-single -mb -Os, -m4a-single -ml -O2, -m4a-single -ml -Os, -m4-single -mb -O2, -m4-single -mb -Os, -m4-single -ml -O2, -m4-single -ml -Os I'm now also running the usual reg tests on sh-sim, just in case. Other than that, OK? Cheers, Oleg ChangeLog: PR target/50751 * config/sh/sh.c (sh_legitimize_address, sh_legitimize_reload_address): Rearrange conditional logic. Move displacement address calculations to ... (sh_find_mov_disp_adjust): ... this new function. Index: gcc/config/sh/sh.c === --- gcc/config/sh/sh.c (revision 185768) +++ gcc/config/sh/sh.c (working copy) @@ -9663,10 +9663,12 @@ { if (MAYBE_BASE_REGISTER_RTX_P (x, strict)) return true; + else if ((GET_CODE (x) == POST_INC || GET_CODE (x) == PRE_DEC) ! TARGET_SHMEDIA MAYBE_BASE_REGISTER_RTX_P (XEXP (x, 0), strict)) return true; + else if (GET_CODE (x) == PLUS (mode != PSImode || reload_completed)) { @@ -9779,80 +9781,114 @@ return orig; } -/* Try machine-dependent ways of modifying an illegitimate address - to be legitimate. If we find one, return the new, valid address. - Otherwise, return X. +/* Given a (logical) mode size and an offset in bytes, try to find a the + appropriate displacement value for a mov insn. On SH the displacements + are limited to max. 60 bytes for SImode, max. 30 bytes in HImode and max. + 15 bytes in QImode. To compensate this we create a new base address by + adding an adjustment value to it. - For the SH, if X is almost suitable for indexing, but the offset is - out of range, convert it into a normal form so that CSE has a chance - of reducing the number of address registers used. */ + If the originally requested offset is greater than 127 we prefer using + values 124..127 over 128..131 to increase opportunities to use the + add #imm, Rn insn. + In some cases it is possible that a requested offset might seem unaligned + or inappropriate for the mode size, like offset = 2 and mode size = 4. + This is compensated by adjusting the base address so that the effective + address of the displacement move insn will be aligned. + + This is not the best possible way of rebasing the base address, as it + does not look at other present displacement addressings around it. + In some cases this can create more base address adjustments than would + actually be necessary. */ + +struct disp_adjust +{ + rtx offset_adjust; + rtx mov_disp; + int max_mov_disp; +}; + +static struct disp_adjust +sh_find_mov_disp_adjust (int mode_sz, HOST_WIDE_INT offset) +{ + struct disp_adjust res = { NULL_RTX, NULL_RTX, 0 }; + + /* The max. available mode for actual move insns is SImode. + Larger accesses will be split into multiple loads/stores. */ + const int max_mov_sz = GET_MODE_SIZE (SImode); + + const int mov_insn_size = mode_sz = max_mov_sz ? max_mov_sz : mode_sz; + const HOST_WIDE_INT max_disp = 15 * mov_insn_size; + HOST_WIDE_INT align_modifier = offset 127 ? mov_insn_size : 0; + + HOST_WIDE_INT offset_adjust; + + /* In some cases this actually does happen and we must check for it. */ + if (mode_sz 1 || mode_sz 8) +return res; + + /* FIXME: HImode with displacement addressing is not supported yet. + Make it purposefully fail for now. */ + if (mov_insn_size == 2) +return res; + + /* Keeps the previous behavior for QImode displacement addressing. + This just decides how the offset is re-based. Removing this special + case will result in slightly bigger code on average, but it's not that + bad actually. */ + if (mov_insn_size == 1) +align_modifier = 0; + + res.max_mov_disp = max_disp + mov_insn_size; + + offset_adjust = ((offset + align_modifier) ~max_disp) - align_modifier; + + if (mode_sz + offset - offset_adjust = res.max_mov_disp) +{ + res.offset_adjust = GEN_INT (offset_adjust); + res.mov_disp = GEN_INT (offset - offset_adjust); +} + + return res; +} + +/* Try to modify an illegitimate address and make it legitimate. + If we find one, return the new, valid address. + Otherwise, return the original address. */ + static rtx sh_legitimize_address (rtx x, rtx oldx, enum machine_mode mode) { if (flag_pic) x = legitimize_pic_address (oldx, mode, NULL_RTX); - if (GET_CODE (x) == PLUS - (GET_MODE_SIZE (mode) == 4 -
Re: [C++ RFC / Patch] Implementing Deducing noexcept for destructors
On 03/26/2012 09:31 PM, Jason Merrill wrote: On 03/26/2012 07:22 AM, Paolo Carlini wrote: My basic idea so far is very simple: --- class.c (revision 185792) +++ class.c (working copy) @@ -1001,6 +1001,10 @@ add_method (tree type, tree method, tree using_dec destructor, type); } + else if (cxx_dialect = cxx0x + !TYPE_RAISES_EXCEPTIONS (TREE_TYPE (method))) + TREE_TYPE (method) = build_exception_variant (TREE_TYPE (method), + noexcept_true_spec); } That would implement N1366, but implementing N3204 is a bit more involved. You need to copy TYPE_RAISES_EXCEPTIONS from the result of implicitly_declare_fn; see defaulted_late_check for something similar. Also, this is too early, since we can't know what the eh specification of the implicit declaration would be until the closing brace of the class. Thanks for the help Jason. Paolo.
Re: remove wrong code in immed_double_const
On Mar 26, 2012, at 1:03 PM, Richard Sandiford wrote: I think: ...copies of the top bit. Note however that values are neither inherently signed nor inherently unsigned; where necessary, signedness is determined by the rtl operation instead. Sounds good to me, changed. Same idea here. Fixed. +/* Return an rtx for the sum of X and the integer C, given that X has + mode MODE. This routine should be used instead of plus_constant + when they want to ensure that addition happens in a particular + mode, which is necessary when x can be a VOIDmode CONST_INT or Sorry, just noticed, should be ...when X can be... Fixed. Richard has ok all his bits, now down to you for final ok. Ok? * doc/rtl.texi (const_double): Document as sign-extending. * expmed.c (expand_mult): Ensure we don't use shift incorrectly. * emit-rtl.c (immed_double_int_const): Refine to state the value is signed. * simplify-rtx.c (mode_signbit_p): Add a fixme for wider than CONST_DOUBLE integers. (simplify_const_unary_operation, UNSIGNED_FLOAT): Ensure no negative values are converted. Fix conversions bigger than HOST_BITS_PER_WIDE_INT. (simplify_binary_operation_1): Ensure we don't use shift incorrectly. (simplify_immed_subreg): Sign-extend CONST_DOUBLEs. * explow.c (plus_constant_mode): Add. (plus_constant): Implement with plus_constant_mode. * rtl.h (plus_constant_mode): Add. Index: doc/rtl.texi === --- doc/rtl.texi(revision 185706) +++ doc/rtl.texi(working copy) @@ -1479,8 +1479,19 @@ This type of expression represents the i is customarily accessed with the macro @code{INTVAL} as in @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}. -Constants generated for modes with fewer bits than @code{HOST_WIDE_INT} -must be sign extended to full width (e.g., with @code{gen_int_mode}). +Constants generated for modes with fewer bits than in +@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with +@code{gen_int_mode}). For constants for modes with more bits than in +@code{HOST_WIDE_INT} the implied high order bits of that constant are +copies of the top bit. Note however that values are neither +inherently signed nor inherently unsigned; where necessary, signedness +is determined by the rtl operation instead. + + +, however values are neither signed nor unsigned. +All operations defined on such constants define the signededness. + + @findex const0_rtx @findex const1_rtx @@ -1510,7 +1521,13 @@ Represents either a floating-point const integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within twice that number of bits (GCC does not provide a mechanism to represent even larger constants). In -the latter case, @var{m} will be @code{VOIDmode}. +the latter case, @var{m} will be @code{VOIDmode}. For integral values +constants for modes with more bits than twice the number in +@code{HOST_WIDE_INT} the implied high order bits of that constant are +copies of the top bit of @code{CONST_DOUBLE_HIGH}. Note however that +integral values are neither inherently signed nor inherently unsigned; +where necessary, signedness is determined by the rtl operation +instead. @findex CONST_DOUBLE_LOW If @var{m} is @code{VOIDmode}, the bits of the value are stored in Index: expmed.c === --- expmed.c(revision 185706) +++ expmed.c(working copy) @@ -3135,8 +3135,10 @@ expand_mult (enum machine_mode mode, rtx { int shift = floor_log2 (CONST_DOUBLE_HIGH (op1)) + HOST_BITS_PER_WIDE_INT; - return expand_shift (LSHIFT_EXPR, mode, op0, - shift, target, unsignedp); + if (shift 2 * HOST_BITS_PER_WIDE_INT - 1 + || GET_MODE_BITSIZE (mode) = 2 * HOST_BITS_PER_WIDE_INT) + return expand_shift (LSHIFT_EXPR, mode, op0, +shift, target, unsignedp); } } Index: emit-rtl.c === --- emit-rtl.c (revision 185706) +++ emit-rtl.c (working copy) @@ -517,8 +517,11 @@ immed_double_int_const (double_int i, en /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair of ints: I0 is the low-order word and I1 is the high-order word. - Do not use this routine for non-integer modes; convert to - REAL_VALUE_TYPE and use CONST_DOUBLE_FROM_REAL_VALUE. */ + For values that are larger than 2*HOST_BITS_PER_WIDE_INT, the + implied upper bits are copies of the high bit of i1. The value + itself is neither signed nor unsigned. Do not use this routine for + non-integer modes; convert to REAL_VALUE_TYPE and use +
[PATCH] Allow printing of escaped curly braces in assembler directives with operands
Hi, An assembler directive with an operand is filtered through output_asm_insn (or asm_fprintf for gcc internal asm() directives) to expand the operand values in the assembler as well as to choose dialects if present. This patch is concerned primarily with the dialects, since their syntax prevent inclusion of assembler strings with curly braces, causing them to be interpreted as dialects. The attached patch allows printing of curly braces in assembler by preceding them with a \\. So to print the following code into assembler: .pushsection .foo; .asciz div { width : 50%% | height=10px }; .long 42; .popsection The following code needs to be used with this patch: void f() { asm ( .pushsection \.foo\; .asciz \div \\{ width : 50%% | height=10px \\} \; .long %c0; .popsection : : i(42) ); } The other option to \\ (since it doesn't look as clean) was to use % as an escape character, but I was not sure if that is a better looking option or a worse looking one. I don't mind resubmitting the patch to use %{ and %} to print curly braces if that is acceptable. It is still possible to print curly braces in asm string literals without operands since they do not undergo any transformation. The patch does not introduce any regressions. I have tested this with x86_64 and i686 and it works well with both of them. Regards, Siddhesh gcc/ChangeLog: 2012-03-27 Siddhesh Poyarekar siddh...@redhat.com * final.c (output_asm_insn, asm_fprintf): Print curly braces if preceded by an escaped backslash (\\). testsuite/ChangeLog: 2012-03-27 Siddhesh Poyarekar siddh...@redhat.com * gcc.dg/asm-braces.c: New test case. diff --git a/gcc/final.c b/gcc/final.c index 718caf1..2393c0f 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -3444,6 +3444,12 @@ output_asm_insn (const char *templ, rtx *operands) output_operand_lossage (invalid %%-code); break; + /* Escaped braces. Print them as is. */ + case '\\': +if (*p == '{' || *p == '}') + c = *p++; +/* FALLTHROUGH */ + default: putc (c, asm_out_file); } @@ -3955,6 +3961,12 @@ asm_fprintf (FILE *file, const char *p, ...) } break; + /* Escaped braces. Print them as is. */ + case '\\': +if (*p == '{' || *p == '}') + c = *p++; +/* FALLTHROUGH */ + default: putc (c, file); } diff --git a/gcc/testsuite/gcc.dg/asm-braces.c b/gcc/testsuite/gcc.dg/asm-braces.c new file mode 100644 index 000..4f428c8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/asm-braces.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options } */ + +void f() +{ + asm ( .pushsection \.foo\; .asciz \div \\{ width : 50%% | height = 10px \\} \; .long %c0; .popsection : : i(42) ); +} + +/* { dg-final { scan-assembler div { width : 50%% | height = 10px } } } */ -- 1.7.7.6