RE: [PATCH]Fix computation of offset in ivopt
-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Oleg Endo
Sent: Wednesday, September 25, 2013 1:41 AM
To: Richard Biener
Cc: Bin Cheng; GCC Patches
Subject: Re: [PATCH]Fix computation of offset in ivopt

> After reading overflow and ivopt, I was wondering whether
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55190 is somehow related.

No, that PR is unrelated to this patch.  But I do have some thoughts on PR55190 and will follow up in the bug entry.

Thanks.
bin
[PATCH] Fix PR58532
This fixes PR58532, a bootstrap-debug issue with -O3.  Debug stmts got in the way of adding abnormal edges during inlining.

Bootstrapped on x86_64-unknown-linux-gnu (with -O3 and default flags), committed to trunk.

Richard.

2013-09-30  Richard Biener  <rguent...@suse.de>

	PR middle-end/58532
	* tree-cfg.c (make_abnormal_goto_edges): Skip debug statements
	before looking for setjmp-like calls.

	* g++.dg/torture/pr58552.C: New testcase.

Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	(revision 202971)
+++ gcc/tree-cfg.c	(working copy)
@@ -1013,6 +1013,9 @@ make_abnormal_goto_edges (basic_block bb
 	      break;
 	    }
 	}
+      if (!gsi_end_p (gsi)
+	  && is_gimple_debug (gsi_stmt (gsi)))
+	gsi_next_nondebug (&gsi);
       if (!gsi_end_p (gsi))
 	{
 	  /* Make an edge to every setjmp-like call.  */
Index: gcc/testsuite/g++.dg/torture/pr58552.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr58552.C	(revision 0)
+++ gcc/testsuite/g++.dg/torture/pr58552.C	(working copy)
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-additional-options "-fcompare-debug" }
+
+extern void fancy_abort () __attribute__ ((__noreturn__));
+extern "C" {
+struct __jmp_buf_tag { };
+typedef struct __jmp_buf_tag jmp_buf[1];
+extern int _setjmp (struct __jmp_buf_tag __env[1]) throw ();
+}
+extern void *gfc_state_stack;
+static jmp_buf eof_buf;
+static void push_state ()
+{
+  if (!gfc_state_stack)
+    fancy_abort ();
+}
+bool gfc_parse_file (void)
+{
+  int seen_program = 0;
+  if (_setjmp (eof_buf))
+    return false;
+  if (seen_program)
+    goto duplicate_main;
+  seen_program = 1;
+  push_state ();
+  push_state ();
+duplicate_main:
+  return true;
+}
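The control-flow situation behind the fix can be sketched with a small standalone C program (illustrative only, not the attached testcase): every call that follows a setjmp-like call may return to it abnormally, which is why make_abnormal_goto_edges must still find such calls when debug statements precede them.

```c
#include <setjmp.h>

/* Minimal sketch of the abnormal control flow the patch is about:
   after setjmp, each later call may transfer control back to the
   setjmp receiver, so the CFG needs abnormal edges to it.  */
static jmp_buf buf;
static int hits;

static void
maybe_jump (void)
{
  if (++hits < 3)
    longjmp (buf, 1);   /* abnormal edge back to the setjmp receiver */
}

int
run (void)
{
  volatile int entered = 0;
  if (setjmp (buf))
    entered++;          /* reached again on each longjmp */
  maybe_jump ();
  return entered;
}
```

Here run () observes the receiver being re-entered twice before maybe_jump finally returns normally.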
Re: [PATCH v4 04/20] add configury
On 27/09/2013 21:45, Gerald Pfeifer wrote:
> I believe this may be breaking all my testers on FreeBSD
> (i386-unknown-freebsd10.0 for example).  The timing of when this patchset
> went in fits pretty much when my builds started to break and I am
> wondering about some code.
>
> Here is the failure mode:
>
> gmake[2]: Entering directory `/scratch/tmp/gerald/OBJ-0927-1848/gcc'
> g++ -c -DIN_GCC_FRONTEND -g -O2 -DIN_GCC -fno-exceptions -fno-rtti
>   -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
>   -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
>   -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common
>   -DHAVE_CONFIG_H -I. -Ic -I/scratch/tmp/gerald/gcc-HEAD/gcc
>   ...[-I options]... -o c/c-lang.o -MT c/c-lang.o -MMD -MP
>   -MF c/.deps/c-lang.TPo /scratch/tmp/gerald/gcc-HEAD/gcc/c/c-lang.c
> cc1plus: error: unrecognized command line option -Wno-narrowing
> gmake[2]: *** [c/c-lang.o] Error 1
> gmake[1]: *** [install-gcc] Error 2
> gmake: *** [install] Error 2
>
> The issue is the invocation of g++ (the old system compiler, not what we
> built) with -Wno-narrowing (a new option).

Why is install building anything?

Paolo
Re: [PATCH] Trivial cleanup
Hi,

On Sat, 28 Sep 2013, Andrew MacLeod wrote:
> My example in this form would look something like:
>
>   int unsignedsrcp = ptrvar.type().type().type_unsigned();
>   ...
>   GimpleType t1 = ptrvar.type ();
>   GimpleType t2 = t1.type ();

Stop that CamelCase dyslexia already, will you? ;-)

Ciao,
Michael.
Re: [RFC Patch, Aarch64] : Macros for profile code generation to enable gprof support
On 28 September 2013 11:57, Venkataramanan Kumar <venkataramanan.ku...@linaro.org> wrote:
> 2013-10-28  Venkataramanan Kumar  <venkataramanan.ku...@linaro.org>
>
> 	* config/aarch64/aarch64.h (MCOUNT_NAME): Define.
> 	(NO_PROFILE_COUNTERS): Likewise.
> 	(PROFILE_HOOK): Likewise.
> 	(FUNCTION_PROFILER): Likewise.
> 	* config/aarch64/aarch64.c (aarch64_function_profiler): Remove.

OK, Thank you.

/Marcus
[ARM, AArch64] Make aarch64-common.c files more robust.
Hi, Recently I've found myself getting a number of segfaults from code calling in to the arm_early_load/alu_dep functions in aarch64-common.c. These functions expect a particular form for the RTX patterns they work over, but some of them do not validate this form. This patch fixes that, removing segmentation faults I see when tuning for Cortex-A15 and Cortex-A7. Tested on aarch64-none-elf and arm-none-eabi with no regressions. OK? Thanks, James --- gcc/ 2013-09-30 James Greenhalgh james.greenha...@arm.com * config/arm/aarch-common.c (arm_early_load_addr_dep): Add sanity checking. (arm_no_early_alu_shift_dep): Likewise. (arm_no_early_alu_shift_value_dep): Likewise. (arm_no_early_mul_dep): Likewise. (arm_no_early_store_addr_dep): Likewise. diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c index 69366af..ea50848 100644 --- a/gcc/config/arm/aarch-common.c +++ b/gcc/config/arm/aarch-common.c @@ -44,7 +44,12 @@ arm_early_load_addr_dep (rtx producer, rtx consumer) value = COND_EXEC_CODE (value); if (GET_CODE (value) == PARALLEL) value = XVECEXP (value, 0, 0); + + if (GET_CODE (value) != SET) +return 0; + value = XEXP (value, 0); + if (GET_CODE (addr) == COND_EXEC) addr = COND_EXEC_CODE (addr); if (GET_CODE (addr) == PARALLEL) @@ -54,6 +59,10 @@ arm_early_load_addr_dep (rtx producer, rtx consumer) else addr = XVECEXP (addr, 0, 0); } + + if (GET_CODE (addr) != SET) +return 0; + addr = XEXP (addr, 1); return reg_overlap_mentioned_p (value, addr); @@ -74,21 +83,41 @@ arm_no_early_alu_shift_dep (rtx producer, rtx consumer) value = COND_EXEC_CODE (value); if (GET_CODE (value) == PARALLEL) value = XVECEXP (value, 0, 0); + + if (GET_CODE (value) != SET) +return 0; + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) op = COND_EXEC_CODE (op); if (GET_CODE (op) == PARALLEL) op = XVECEXP (op, 0, 0); + + if (GET_CODE (op) != SET) +return 0; + op = XEXP (op, 1); + if (!INSN_P (op)) +return 0; + early_op = XEXP (op, 0); + /* This is either an actual 
independent shift, or a shift applied to the first operand of another operation. We want the whole shift operation. */ if (REG_P (early_op)) early_op = op; - return !reg_overlap_mentioned_p (value, early_op); + if (GET_CODE (op) == ASHIFT + || GET_CODE (op) == ROTATE + || GET_CODE (op) == ASHIFTRT + || GET_CODE (op) == LSHIFTRT + || GET_CODE (op) == ROTATERT) +return !reg_overlap_mentioned_p (value, early_op); + else +return 0; } /* Return nonzero if the CONSUMER instruction (an ALU op) does not @@ -106,13 +135,25 @@ arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer) value = COND_EXEC_CODE (value); if (GET_CODE (value) == PARALLEL) value = XVECEXP (value, 0, 0); + + if (GET_CODE (value) != SET) +return 0; + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) op = COND_EXEC_CODE (op); if (GET_CODE (op) == PARALLEL) op = XVECEXP (op, 0, 0); + + if (GET_CODE (op) != SET) +return 0; + op = XEXP (op, 1); + if (!INSN_P (op)) +return 0; + early_op = XEXP (op, 0); /* This is either an actual independent shift, or a shift applied to @@ -121,7 +162,14 @@ arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer) if (!REG_P (early_op)) early_op = XEXP (early_op, 0); - return !reg_overlap_mentioned_p (value, early_op); + if (GET_CODE (op) == ASHIFT + || GET_CODE (op) == ROTATE + || GET_CODE (op) == ASHIFTRT + || GET_CODE (op) == LSHIFTRT + || GET_CODE (op) == ROTATERT) +return !reg_overlap_mentioned_p (value, early_op); + else +return 0; } /* Return nonzero if the CONSUMER (a mul or mac op) does not @@ -138,11 +186,20 @@ arm_no_early_mul_dep (rtx producer, rtx consumer) value = COND_EXEC_CODE (value); if (GET_CODE (value) == PARALLEL) value = XVECEXP (value, 0, 0); + + if (GET_CODE (value) != SET) +return 0; + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) op = COND_EXEC_CODE (op); if (GET_CODE (op) == PARALLEL) op = XVECEXP (op, 0, 0); + + if (GET_CODE (op) != SET) +return 0; + op = XEXP (op, 1); if (GET_CODE (op) == PLUS || GET_CODE (op) == 
MINUS) @@ -169,11 +226,20 @@ arm_no_early_store_addr_dep (rtx producer, rtx consumer) value = COND_EXEC_CODE (value); if (GET_CODE (value) == PARALLEL) value = XVECEXP (value, 0, 0); + + if (GET_CODE (value) != SET) +return 0; + value = XEXP (value, 0); + if (GET_CODE (addr) == COND_EXEC) addr = COND_EXEC_CODE (addr); if (GET_CODE (addr) == PARALLEL) addr = XVECEXP (addr, 0, 0); + + if (GET_CODE (addr) != SET) +return 0; + addr = XEXP (addr, 0); return !reg_overlap_mentioned_p (value, addr);
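The defensive pattern added throughout these functions can be sketched in plain C (a toy model with made-up types, not GCC's rtx representation): peel off the COND_EXEC/PARALLEL wrappers, then refuse to dig further unless what remains really is the SET the code expects, instead of blindly taking XEXP and faulting.

```c
/* Toy model of the sanity checking the patch adds.  The names rtx_code,
   rtx_node and extract_set are invented for illustration.  */
typedef enum { SET, PARALLEL, COND_EXEC, OTHER } rtx_code;

typedef struct rtx_node
{
  rtx_code code;
  struct rtx_node *inner;   /* stands in for COND_EXEC_CODE / XVECEXP */
} rtx_node;

static const rtx_node *
extract_set (const rtx_node *pat)
{
  if (pat->code == COND_EXEC)
    pat = pat->inner;
  if (pat->code == PARALLEL)
    pat = pat->inner;       /* first element, in this toy model */
  if (pat->code != SET)
    return 0;               /* the new early "return 0" exits */
  return pat;
}
```

The real functions do the same dance twice, once for the producer and once for the consumer, bailing out with 0 ("no special latency handling") on any unexpected shape.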
[Patch, PPC, committed] fix ppc build breakage.
Hi, My commit r203019 contained an oversight which is fixed by the obvious patch below. tested on cross to powerpc-linux-gnu and a build of cc1 for AIX-6.1.3 (and stage1 for powerpc-darwin9). applied as r203027 Apologies for the breakage, and that this slipped through my usual testing, Iain gcc: * config/rs6000/darwin.md (load_macho_picbase_si): Wrap machopic calls and defines in TARGET_MACHO conditional. (load_macho_picbase_di): Likewise. (reload_macho_picbase): Likewise. (reload_macho_picbase_si): Likewise. (reload_macho_picbase_di): Likewise. (nonlocal_goto_receiver): Likewise. Index: gcc/config/rs6000/darwin.md === --- gcc/config/rs6000/darwin.md (revision 203026) +++ gcc/config/rs6000/darwin.md (working copy) @@ -261,7 +261,11 @@ (pc)] UNSPEC_LD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic { +#if TARGET_MACHO machopic_should_output_picbase_label (); /* Update for new func. */ +#else + gcc_unreachable (); +#endif return bcl 20,31,%0\\n%0:; } [(set_attr type branch) @@ -273,7 +277,11 @@ (pc)] UNSPEC_LD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic TARGET_64BIT { +#if TARGET_MACHO machopic_should_output_picbase_label (); /* Update for new func. 
*/ +#else + gcc_unreachable (); +#endif return bcl 20,31,%0\\n%0:; } [(set_attr type branch) @@ -397,6 +405,7 @@ (pc)] UNSPEC_RELD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic { +#if TARGET_MACHO if (machopic_should_output_picbase_label ()) { static char tmp[64]; @@ -405,6 +414,9 @@ return tmp; } else +#else + gcc_unreachable (); +#endif return bcl 20,31,%0\\n%0:; } [(set_attr type branch) @@ -416,6 +428,7 @@ (pc)] UNSPEC_RELD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic TARGET_64BIT { +#if TARGET_MACHO if (machopic_should_output_picbase_label ()) { static char tmp[64]; @@ -424,6 +437,9 @@ return tmp; } else +#else + gcc_unreachable (); +#endif return bcl 20,31,%0\\n%0:; } [(set_attr type branch) @@ -438,6 +454,7 @@ reload_completed [(const_int 0)] { +#if TARGET_MACHO if (crtl-uses_pic_offset_table) { static unsigned n = 0; @@ -456,6 +473,8 @@ else /* Not using PIC reg, no reload needed. */ emit_note (NOTE_INSN_DELETED); - +#else + gcc_unreachable (); +#endif DONE; })
[PATCH] Enhance phiopt to handle BIT_AND_EXPR
Hi,

The patch enhances phiopt to handle cases like:

  if (a == 0 && (...))
    return 0;
  return a;

Bootstrap and no make check regression on X86-64 and ARM.  Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2013-09-30  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

	* tree-ssa-phiopt.c (operand_equal_for_phi_arg_p_1): New.
	(value_replacement): Move a check to operand_equal_for_phi_arg_p_1.

testsuite/ChangeLog:
2013-09-30  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

	* gcc.dg/tree-ssa/phi-opt-11.c: New test case.

phiopt.patch
Description: Binary data
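A hypothetical example of source code with the shape the enhanced phiopt targets (illustrative; the attached phi-opt-11.c is not reproduced here): when a == 0, returning 0 and returning a are equivalent, so the conditional return can collapse into a plain "return a" when the rest of the condition has no side effects.

```c
/* Sketch of a function the optimization can simplify: for every input,
   the early "return 0" and the fall-through "return a" agree whenever
   the condition holds, because a == 0 there.  */
int
value_or_zero (int a, int b)
{
  if (a == 0 && b > 0)
    return 0;
  return a;
}
```

Both paths always yield the value of a, which is what lets value_replacement fold the PHI.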
RE: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion
> 1. For cmp/test with rip-relative addressing mem operand, don't group
>    insns.  Bulldozer also doesn't support fusion for cmp/test with both
>    displacement MEM and immediate operand, while m_CORE_ALL doesn't
>    support fusion for cmp/test with MEM and immediate operand.  I simply
>    choose to use the more stringent constraint here (m_CORE_ALL's
>    constraint).

This suits Bulldozer's specification.  We don't see an issue with the proposed patch.

Regards
Ganesh

-----Original Message-----
From: H.J. Lu [mailto:hjl.to...@gmail.com]
Sent: Wednesday, September 25, 2013 2:12 AM
To: Wei Mi
Cc: Jan Hubicka; Alexander Monakov; Steven Bosscher; GCC Patches; David Li; Kirill Yukhin
Subject: Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

On Tue, Sep 24, 2013 at 12:06 PM, Wei Mi <w...@google.com> wrote:
> This is the updated patch2.
> Changed:
> 1. For cmp/test with rip-relative addressing mem operand, don't group
>    insns.  Bulldozer also doesn't support fusion for cmp/test with both
>    displacement MEM and immediate operand, while m_CORE_ALL doesn't
>    support fusion for cmp/test with MEM and immediate operand.  I simply
>    choose to use the more stringent constraint here (m_CORE_ALL's
>    constraint).
> 2. Add Bulldozer back and merge TARGET_FUSE_CMP_AND_BRANCH_64 and
>    TARGET_FUSE_CMP_AND_BRANCH_32.
>
> bootstrap and regression pass. ok for trunk?
>
> 2013-09-24  Wei Mi  <w...@google.com>
>
> 	* gcc/config/i386/i386.c (rip_relative_addr_p): New Function.
> 	(ix86_macro_fusion_p): Ditto.
> 	(ix86_macro_fusion_pair_p): Ditto.
> 	* gcc/config/i386/i386.h: Add new tune features about macro-fusion.
> 	* gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto.
> 	* gcc/doc/tm.texi: Generated.
> 	* gcc/doc/tm.texi.in: Ditto.
> 	* gcc/haifa-sched.c (try_group_insn): New Function.
> 	(group_insns_for_macro_fusion): Ditto.
> 	(sched_init): Call group_insns_for_macro_fusion.
> 	* gcc/sched-rgn.c (add_branch_dependences): Keep insns in
> 	a SCHED_GROUP at the end of BB to remain their location.
> 	* gcc/target.def: Add two hooks: macro_fusion_p and
> 	macro_fusion_pair_p.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 1fd3f60..4a04778 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -24856,6 +24856,167 @@ ia32_multipass_dfa_lookahead (void) } } +/* Extracted from ix86_print_operand_address. Check whether ADDR is a + rip-relative address. */ + +static bool +rip_relative_addr_p (rtx addr) +{ + struct ix86_address parts; + rtx base, index, disp; + int ok; + + if (GET_CODE (addr) == UNSPEC XINT (addr, 1) == UNSPEC_VSIBADDR) +{ + ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts); + parts.index = XVECEXP (addr, 0, 1); +} + else if (GET_CODE (addr) == UNSPEC XINT (addr, 1) == UNSPEC_LEA_ADDR) +ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts); else +ok = ix86_decompose_address (addr, parts); + + gcc_assert (ok); + base = parts.base; + index = parts.index; + disp = parts.disp; + + if (TARGET_64BIT !base !index) +{ + rtx symbol = disp; + + if (GET_CODE (disp) == CONST + GET_CODE (XEXP (disp, 0)) == PLUS + CONST_INT_P (XEXP (XEXP (disp, 0), 1))) + symbol = XEXP (XEXP (disp, 0), 0); + + if (GET_CODE (symbol) == LABEL_REF + || (GET_CODE (symbol) == SYMBOL_REF + SYMBOL_REF_TLS_MODEL (symbol) == 0)) + return true; +} + if (flag_pic !base !index) +{ + if (GET_CODE (disp) == CONST + GET_CODE (XEXP (disp, 0)) == UNSPEC + (XINT (XEXP (disp, 0), 1) == UNSPEC_PCREL + || XINT (XEXP (disp, 0), 1) == UNSPEC_GOTPCREL + || (TARGET_64BIT + XINT (XEXP (disp, 0), 1) == UNSPEC_GOTNTPOFF))) + return true; +} + return false; +} + It doesn't look right. IP relative address is only possible with TARGET_64BIT and 1. base == pc. Or 2. UUNSPEC_PCREL, UNSPEC_GOTPCREL, and NSPEC_GOTNTPOFF. -- H.J.
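The operand constraints being debated can be summarized as a toy predicate (an illustrative model of the rules described above, not the ix86_macro_fusion_pair_p code). Under the stricter m_CORE_ALL rule the patch adopts, a cmp/test that has both a memory and an immediate operand, or whose memory operand uses RIP-relative addressing, is not grouped with the following branch.

```c
/* Toy model of the macro-fusion constraint; the struct fields and the
   fusion_ok name are invented for illustration.  */
typedef struct
{
  int has_mem;              /* compare reads memory */
  int has_imm;              /* compare has an immediate operand */
  int mem_is_rip_relative;  /* memory operand is RIP-relative */
} cmp_insn;

static int
fusion_ok (const cmp_insn *c)
{
  if (c->has_mem && c->has_imm)
    return 0;               /* MEM + immediate: not fusible */
  if (c->mem_is_rip_relative)
    return 0;               /* RIP-relative MEM: not fusible */
  return 1;                 /* e.g. reg/reg or reg/imm compares */
}
```

The real test additionally depends on the tune flags (TARGET_FUSE_CMP_AND_BRANCH and friends) and on the branch's condition code.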
Re: [ARM, AArch64] Make aarch64-common.c files more robust.
On 30 September 2013 09:52, James Greenhalgh james.greenha...@arm.com wrote: Hi, aarch64-common.c. These functions expect a particular form You meant aarch-common.c here and in the title ;-) This is fine by me, but as a config/arm/ change needs OK from Ramana or Richard. /Marcus 2013-09-30 James Greenhalgh james.greenha...@arm.com * config/arm/aarch-common.c (arm_early_load_addr_dep): Add sanity checking. (arm_no_early_alu_shift_dep): Likewise. (arm_no_early_alu_shift_value_dep): Likewise. (arm_no_early_mul_dep): Likewise. (arm_no_early_store_addr_dep): Likewise.
Re: [PATCH]: Fix use of __builtin_eh_pointer in EH_ELSE
On Sep 24, 2013, at 8:51 PM, Richard Henderson <r...@redhat.com> wrote:
> On 09/03/2013 07:08 AM, Tristan Gingold wrote:
>> Hi,
>>
>> The field state->ehp_region wasn't updated before lowering constructs in
>> the eh path of EH_ELSE.  As a consequence, __builtin_eh_pointer is
>> lowered to 0 (or possibly to a wrong region number) in this path.  The
>> only user of EH_ELSE looks to be trans-mem.c:lower_transaction, and the
>> consequence of that is a memory leak.
>>
>> Furthermore, according to calls.c:flags_from_decl_or_type, the
>> transaction_pure attribute must be set on the function type, not on the
>> function declaration.  Hence the change to declare __builtin_eh_pointer.
>> (I don't understand the guard condition to set the attribute for it in
>> tree.c.  Why is 'builtin_decl_explicit_p (BUILT_IN_TM_LOAD_1)' needed in
>> addition to flag_tm ?)
>
> Clearly these are totally unrelated and should not be in the same patch.

This wasn't clear to me, as I got 'unsafe function call __builtin_eh_pointer in atomic transaction' before fixing the transaction_pure.

So here is the 'transaction_pure' part.  No check-host regressions on x86_64-linux-gnu.

Ok for trunk?

Tristan.

2013-09-03  Tristan Gingold  <ging...@adacore.com>

	* tree.c (set_call_expr_flags): Reject ECF_TM_PURE.
	(build_common_builtin_nodes): Set transaction_pure attribute on
	__builtin_eh_pointer function type (and not on its declaration).

diff --git a/gcc/tree.c b/gcc/tree.c
index f0ee309..e4be24d 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9817,9 +9817,11 @@ set_call_expr_flags (tree decl, int flags)
   if (flags & ECF_LEAF)
     DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("leaf"),
					NULL, DECL_ATTRIBUTES (decl));
-  if ((flags & ECF_TM_PURE) && flag_tm)
-    DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("transaction_pure"),
-					NULL, DECL_ATTRIBUTES (decl));
+
+  /* The transaction_pure attribute must be set on the function type, not
+     on the declaration.  */
+  gcc_assert (!(flags & ECF_TM_PURE));
+
   /* Looping const or pure is implied by noreturn.
      There is currently no way to declare looping const or looping
      pure alone.  */
  gcc_assert (!(flags & ECF_LOOPING_CONST_OR_PURE)
@@ -10018,8 +10020,9 @@ build_common_builtin_nodes (void)
				  integer_type_node, NULL_TREE);
   ecf_flags = ECF_PURE | ECF_NOTHROW | ECF_LEAF;
   /* Only use TM_PURE if we we have TM language support.  */
-  if (builtin_decl_explicit_p (BUILT_IN_TM_LOAD_1))
-    ecf_flags |= ECF_TM_PURE;
+  if (flag_tm && builtin_decl_explicit_p (BUILT_IN_TM_LOAD_1))
+    TYPE_ATTRIBUTES (ftype) = tree_cons (get_identifier ("transaction_pure"),
+					 NULL, TYPE_ATTRIBUTES (ftype));
   local_define_builtin ("__builtin_eh_pointer", ftype, BUILT_IN_EH_POINTER,
			 "__builtin_eh_pointer", ecf_flags);
Re: cost model patch
Hi Richard, David,

> In principle yes.  Note that it changes the behavior of -O2
> -ftree-vectorize as -ftree-vectorize does not imply changing the default
> cost model.  I am fine with that, but eventually this will have some
> testsuite fallout.

Indeed I am observing a regression with this patch on arm-none-eabi in gcc.dg/tree-ssa/gen-vect-26.c.  Seems that the cheap vectoriser model doesn't do unaligned stores (as expected I think?).  Is adding -fvect-cost-model=dynamic to the test options the correct approach?

Thanks,
Kyrill
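If so, the change would presumably be a one-line addition to the test (hypothetical edit; the exact directive depends on how gen-vect-26.c already sets its options):

```
/* { dg-additional-options "-fvect-cost-model=dynamic" } */
```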
[Patch,AArch64] Support SADDL/SSUBL/UADDL/USUBL
Hello, This patch adds support to generate SADDL/SSUBL/UADDL/USUBL. Part of the support is available already (supported for intrinsics). This patch extends this support to generate these instructions (and lane variations) in all scenarios and adds a testcase. Tested for aarch64-none-elf, aarch64_be-none-elf with no regressions. OK for trunk? Cheers VP ~~~ gcc/ChangeLog: 2013-09-30 Vidya Praveen vidyaprav...@arm.com * aarch64-simd.md (aarch64_ANY_EXTEND:suADDSUB:optabl2mode_internal): Rename to ... (aarch64_ANY_EXTEND:suADDSUB:optablmode_hi_internal): ... this; Insert '\t' to output template. (aarch64_ANY_EXTEND:suADDSUB:optablmode_lo_internal): New. (aarch64_saddl2mode, aarch64_uaddl2mode): Modify to call gen_aarch64_ANY_EXTEND:suADDSUB:optablmode_hi_internal() instead. (aarch64_ssubl2mode, aarch64_usubl2mode): Ditto. gcc/testsuite/ChangeLog: 2013-09-30 Vidya Praveen vidyaprav...@arm.com * gcc.target/aarch64/vect_saddl_1.c: New. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index f13cd5b..a0259b8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2586,7 +2586,7 @@ ;; suaddsublq. 
-(define_insn aarch64_ANY_EXTEND:suADDSUB:optabl2mode_internal +(define_insn aarch64_ANY_EXTEND:suADDSUB:optablmode_hi_internal [(set (match_operand:VWIDE 0 register_operand =w) (ADDSUB:VWIDE (ANY_EXTEND:VWIDE (vec_select:VHALF (match_operand:VQW 1 register_operand w) @@ -2595,11 +2595,26 @@ (match_operand:VQW 2 register_operand w) (match_dup 3)] TARGET_SIMD - ANY_EXTEND:suADDSUB:optabl2 %0.Vwtype, %1.Vtype, %2.Vtype + ANY_EXTEND:suADDSUB:optabl2\t%0.Vwtype, %1.Vtype, %2.Vtype [(set_attr simd_type simd_addl) (set_attr simd_mode MODE)] ) +(define_insn aarch64_ANY_EXTEND:suADDSUB:optablmode_lo_internal + [(set (match_operand:VWIDE 0 register_operand =w) + (ADDSUB:VWIDE (ANY_EXTEND:VWIDE (vec_select:VHALF + (match_operand:VQW 1 register_operand w) + (match_operand:VQW 3 vect_par_cnst_lo_half ))) + (ANY_EXTEND:VWIDE (vec_select:VHALF + (match_operand:VQW 2 register_operand w) + (match_dup 3)] + TARGET_SIMD + ANY_EXTEND:suADDSUB:optabl\t%0.Vwtype, %1.Vhalftype, %2.Vhalftype + [(set_attr simd_type simd_addl) + (set_attr simd_mode MODE)] +) + + (define_expand aarch64_saddl2mode [(match_operand:VWIDE 0 register_operand =w) (match_operand:VQW 1 register_operand w) @@ -2607,8 +2622,8 @@ TARGET_SIMD { rtx p = aarch64_simd_vect_par_cnst_half (MODEmode, true); - emit_insn (gen_aarch64_saddl2mode_internal (operands[0], operands[1], - operands[2], p)); + emit_insn (gen_aarch64_saddlmode_hi_internal (operands[0], operands[1], + operands[2], p)); DONE; }) @@ -2619,8 +2634,8 @@ TARGET_SIMD { rtx p = aarch64_simd_vect_par_cnst_half (MODEmode, true); - emit_insn (gen_aarch64_uaddl2mode_internal (operands[0], operands[1], - operands[2], p)); + emit_insn (gen_aarch64_uaddlmode_hi_internal (operands[0], operands[1], + operands[2], p)); DONE; }) @@ -2631,7 +2646,7 @@ TARGET_SIMD { rtx p = aarch64_simd_vect_par_cnst_half (MODEmode, true); - emit_insn (gen_aarch64_ssubl2mode_internal (operands[0], operands[1], + emit_insn (gen_aarch64_ssublmode_hi_internal (operands[0], operands[1], 
operands[2], p)); DONE; }) @@ -2643,7 +2658,7 @@ TARGET_SIMD { rtx p = aarch64_simd_vect_par_cnst_half (MODEmode, true); - emit_insn (gen_aarch64_usubl2mode_internal (operands[0], operands[1], + emit_insn (gen_aarch64_usublmode_hi_internal (operands[0], operands[1], operands[2], p)); DONE; }) diff --git a/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c b/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c new file mode 100644 index 000..ecbd8a8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c @@ -0,0 +1,315 @@ +/* { dg-do run } */ +/* { dg-options -O3 -fno-inline -save-temps -fno-vect-cost-model } */ + +typedef signed char S8_t; +typedef signed short S16_t; +typedef signed int S32_t; +typedef signed long long S64_t; + +typedef signed char *__restrict__ pS8_t; +typedef signed short *__restrict__ pS16_t; +typedef signed int *__restrict__ pS32_t; +typedef signed long long *__restrict__ pS64_t; + +typedef unsigned char U8_t; +typedef unsigned short U16_t; +typedef unsigned int U32_t; +typedef unsigned long long U64_t; + +typedef unsigned char *__restrict__ pU8_t; +typedef unsigned short *__restrict__ pU16_t; +typedef unsigned int *__restrict__ pU32_t; +typedef unsigned long long *__restrict__ pU64_t; + +extern void abort (); + +void +test_addl_S64_S32_4 (pS64_t a, pS32_t b, pS32_t c) +{ + int i; + for (i = 0; i 4; i++)
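For reference, a scalar loop of the shape the new patterns are meant to cover (loosely modelled on the vect_saddl_1.c functions above; simplified and hypothetical): each signed 32-bit element is widened to 64 bits before the addition, which is exactly the operation SADDL performs on the vector lanes.

```c
/* Widening signed add: when vectorized for AArch64, the lo/hi halves of
   this loop should map to saddl/saddl2.  The widening casts happen
   before the add, so 32-bit overflow cannot occur.  */
void
test_addl_S64_S32 (long long *__restrict__ a, int *__restrict__ b,
		   int *__restrict__ c, int n)
{
  int i;
  for (i = 0; i < n; i++)
    a[i] = (long long) b[i] + (long long) c[i];
}
```

The unsigned (UADDL) and subtracting (SSUBL/USUBL) variants follow the same shape with the obvious type and operator changes.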
Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips
Hi,

while looking into the schedules produced for Bulldozer and Core I noticed that they do not seem to match reality.  This is because ix86_issue_rate limits those CPUs to 3 instructions per cycle, while they are designed to do 4, and it somewhat confused ix86_adjust_cost.  I also added the stack engine to modern chips even though the scheduler doesn't really understand that multiple push operations can happen in one cycle.  At least it gets the stack updates right in sequences of push/pop operations.

I did not update Bulldozer issue rates yet.  The current scheduler model won't allow it to execute more than 3 instructions per cycle (and 2 for version 3).  I think bdver1.md/bdver3.md needs to be updated first.

I am testing x86_64-linux and will commit if there are no complaints.

Honza

	* i386.c (ix86_issue_rate): Pentium4/Nocona issue 2 instructions
	per cycle, Core/CoreI7/Haswell 4 instructions per cycle.
	(ix86_adjust_cost): Add stack engine to modern AMD chips; fix for
	core; remove Atom that mistakenly shared code with AMD.
Index: config/i386/i386.c === --- config/i386/i386.c (revision 203011) +++ config/i386/i386.c (working copy) @@ -24435,17 +24435,14 @@ ix86_issue_rate (void) case PROCESSOR_SLM: case PROCESSOR_K6: case PROCESSOR_BTVER2: +case PROCESSOR_PENTIUM4: +case PROCESSOR_NOCONA: return 2; case PROCESSOR_PENTIUMPRO: -case PROCESSOR_PENTIUM4: -case PROCESSOR_CORE2: -case PROCESSOR_COREI7: -case PROCESSOR_HASWELL: case PROCESSOR_ATHLON: case PROCESSOR_K8: case PROCESSOR_AMDFAM10: -case PROCESSOR_NOCONA: case PROCESSOR_GENERIC: case PROCESSOR_BDVER1: case PROCESSOR_BDVER2: @@ -24453,6 +24450,11 @@ ix86_issue_rate (void) case PROCESSOR_BTVER1: return 3; +case PROCESSOR_CORE2: +case PROCESSOR_COREI7: +case PROCESSOR_HASWELL: + return 4; + default: return 1; } @@ -24709,10 +24711,15 @@ ix86_adjust_cost (rtx insn, rtx link, rt case PROCESSOR_BDVER3: case PROCESSOR_BTVER1: case PROCESSOR_BTVER2: -case PROCESSOR_ATOM: case PROCESSOR_GENERIC: memory = get_attr_memory (insn); + /* Stack engine allows to execute pushpop instructions in parall. */ + if (((insn_type == TYPE_PUSH || insn_type == TYPE_POP) + (dep_insn_type == TYPE_PUSH || dep_insn_type == TYPE_POP)) + (ix86_tune != PROCESSOR_ATHLON ix86_tune != PROCESSOR_K8)) + return 0; + /* Show ability of reorder buffer to hide latency of load by executing in parallel with previous instruction in case previous instruction is not needed to compute the address. */ @@ -24737,6 +24744,29 @@ ix86_adjust_cost (rtx insn, rtx link, rt else cost = 0; } + break; + +case PROCESSOR_CORE2: +case PROCESSOR_COREI7: +case PROCESSOR_HASWELL: + memory = get_attr_memory (insn); + + /* Stack engine allows to execute pushpop instructions in parall. */ + if ((insn_type == TYPE_PUSH || insn_type == TYPE_POP) + (dep_insn_type == TYPE_PUSH || dep_insn_type == TYPE_POP)) + return 0; + + /* Show ability of reorder buffer to hide latency of load by executing +in parallel with previous instruction in case +previous instruction is not needed to compute the address. 
*/ + if ((memory == MEMORY_LOAD || memory == MEMORY_BOTH) + !ix86_agi_dependent (dep_insn, insn)) + { + if (cost = 4) + cost -= 4; + else + cost = 0; + } break; case PROCESSOR_SLM:
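The new cost adjustments boil down to two rules, sketched here as a toy C model (a simplified paraphrase with invented parameter names, not the real target hook): a push/pop that depends on another push/pop is free because the stack engine renames the stack-pointer updates away, and the reorder buffer hides up to 4 cycles of a load's latency unless the load's address depends on the producer (an AGI dependence).

```c
/* Toy paraphrase of the ix86_adjust_cost changes for Core-class chips;
   all parameters are illustrative flags, not rtx insns.  */
static int
adjust_cost_sketch (int insn_is_push_or_pop, int dep_is_push_or_pop,
		    int insn_is_load, int agi_dependent, int cost)
{
  /* Stack engine: dependent push/pop pairs execute in parallel.  */
  if (insn_is_push_or_pop && dep_is_push_or_pop)
    return 0;

  /* Reorder buffer hides load latency when the address does not
     depend on the producer.  */
  if (insn_is_load && !agi_dependent)
    return cost >= 4 ? cost - 4 : 0;

  return cost;
}
```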
Re: RFA: Use m_foo rather than foo_ for member variables
On Sun, Sep 29, 2013 at 11:08 AM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
> Michael Matz <m...@suse.de> writes:
>> Trevor Saunders <tsaund...@mozilla.com> writes:
>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>> Btw, I've come around multiple coding-styles in the past and I
>>>> definitely would prefer m_mode / m_count to mark members vs. mode_
>>>> and count_.  (and s_XXX for static members IIRC).
>>>
>>> I'd prefer m_/s_foo for members / static things too fwiw.
>>
>> Me as well.  It's still ugly, but not so unsymmetric as the trailing
>> underscore.
>
> Well, I'm not sure how I came to be the one writing these patches, but I
> suppose I prefer m_foo too.  So how about the attached?  The first patch
> has changes to the coding conventions.  I added some missing spaces while
> there.  The second patch has the mechanical code changes.  The reason for
> yesterday's mass adding of spaces was because the second patch would have
> been pretty inconsistent otherwise.
>
> Tested on x86_64-linux-gnu.
>
> Thanks,
> Richard

Ok.

Thanks,
Richard.
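For illustration, the convention being agreed on would make a member-heavy class read like this (a hypothetical example, not taken from the patches):

```cpp
// Sketch of the m_/s_ naming convention: m_ marks non-static members,
// s_ marks static members, and accessors keep the undecorated name.
class function_info
{
public:
  function_info (int count) : m_count (count) { ++s_instances; }
  int count () const { return m_count; }             // accessor: plain name
  static int instances () { return s_instances; }

private:
  static int s_instances;   // "s_" prefix for static members
  int m_count;              // "m_" prefix for non-static members
};

int function_info::s_instances = 0;
```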
[PATCH] Fix PR58554
This fixes PR58554, pattern recognition in loop distribution now needs to check whether all stmts are unconditionally executed.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-09-30  Richard Biener  <rguent...@suse.de>

	PR tree-optimization/58554
	* tree-loop-distribution.c (classify_partition): Require
	unconditionally executed stores for memcpy and memset recognition.
	(tree_loop_distribution): Calculate dominance info.

	* gcc.dg/torture/pr58554.c: New testcase.

Index: gcc/tree-loop-distribution.c
===================================================================
*** gcc/tree-loop-distribution.c	(revision 203028)
--- gcc/tree-loop-distribution.c	(working copy)
*************** classify_partition (loop_p loop, struct
*** 1206,1212 ****
	  && !SSA_NAME_IS_DEFAULT_DEF (rhs)
	  && flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (rhs))))
	return;
!       if (!adjacent_dr_p (single_store))
	return;
        partition->kind = PKIND_MEMSET;
        partition->main_dr = single_store;
--- 1206,1214 ----
	  && !SSA_NAME_IS_DEFAULT_DEF (rhs)
	  && flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (rhs))))
	return;
!       if (!adjacent_dr_p (single_store)
!	  || !dominated_by_p (CDI_DOMINATORS,
!			      loop->latch, gimple_bb (stmt)))
	return;
        partition->kind = PKIND_MEMSET;
        partition->main_dr = single_store;
*************** classify_partition (loop_p loop, struct
*** 1222,1228 ****
        if (!adjacent_dr_p (single_store)
	  || !adjacent_dr_p (single_load)
	  || !operand_equal_p (DR_STEP (single_store),
!			       DR_STEP (single_load), 0))
	return;
        /* Now check that if there is a dependence this dependence is
           of a suitable form for memmove.  */
--- 1224,1232 ----
        if (!adjacent_dr_p (single_store)
	  || !adjacent_dr_p (single_load)
	  || !operand_equal_p (DR_STEP (single_store),
!			       DR_STEP (single_load), 0)
!	  || !dominated_by_p (CDI_DOMINATORS,
!			      loop->latch, gimple_bb (store)))
	return;
        /* Now check that if there is a dependence this dependence is
           of a suitable form for memmove.
	 */
*************** out:
*** 1719,1724 ****
--- 1723,1729 ----
      {
        if (!cd)
	  {
+	    calculate_dominance_info (CDI_DOMINATORS);
	    calculate_dominance_info (CDI_POST_DOMINATORS);
	    cd = new control_dependences (create_edge_list ());
	    free_dominance_info (CDI_POST_DOMINATORS);
Index: gcc/testsuite/gcc.dg/torture/pr58554.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr58554.c	(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr58554.c	(working copy)
***************
*** 0 ****
--- 1,20 ----
+ /* { dg-do run } */
+
+ extern void abort (void);
+ void __attribute__((noinline,noclone))
+ clear_board(unsigned char *board, int board_size)
+ {
+   int k;
+   for (k = 0; k < 421; k++)
+     if (k < board_size)
+       board[k] = 3;
+ }
+ int main()
+ {
+   unsigned char board[421];
+   board[420] = 1;
+   clear_board (board, 420);
+   if (board[420] != 1)
+     abort ();
+   return 0;
+ }
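The distinction classify_partition now draws can be illustrated with two loops (a hedged sketch, not GCC code): only the first may be rewritten as a memset call, because its store executes on every iteration and therefore dominates the loop latch; the second store is guarded, like the one in the PR58554 testcase, and must stay a loop.

```c
/* Unconditional store: every iteration writes board[k], so the whole
   loop is equivalent to memset (board, 3, n) -- a legal candidate.  */
void
clear_all (unsigned char *board, int n)
{
  int k;
  for (k = 0; k < n; k++)
    board[k] = 3;
}

/* Conditional store: elements at or beyond LIMIT must be left alone,
   so rewriting this as memset over the whole range would be wrong.  */
void
clear_some (unsigned char *board, int n, int limit)
{
  int k;
  for (k = 0; k < n; k++)
    if (k < limit)
      board[k] = 3;
}
```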
[PATCH][i386] Enable vector_loop in memset expanding and merge expanders for memset and memmov
Hi Jan, Here is a patch we've talked about recently - it merges expanders of memset and memmov. As a natural side effect, this enables vector_loop in memset expanding as well. Though in some places merging movmem and setmem isn't so efficient (the original code in these versions differed a lot), I think it's worth combining them - in many cases that allows to remove code duplicates. I tried to keep the resultant code as close to the original as possible (except enabling vector_loop in setmem). Because of that, there are some places that could be IMHO merged, but not merged in this patch. The patch is bootstrapped and tested on i386/x86_64 (make check, and stability testing on Spec2k, Spec2k6). Is it ok? Michael --- gcc/config/i386/i386.c | 1018 .../gcc.target/i386/memset-vector_loop-1.c | 11 + .../gcc.target/i386/memset-vector_loop-2.c | 10 + 3 files changed, 406 insertions(+), 633 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/memset-vector_loop-1.c create mode 100644 gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 21fc531..9d5654f 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -22219,13 +22219,16 @@ expand_set_or_movmem_via_loop (rtx destmem, rtx srcmem, emit_label (out_label); } -/* Output rep; mov instruction. - Arguments have same meaning as for previous function */ +/* Output rep; mov or rep; stos instruction depending on ISSETMEM argument. + When ISSETMEM is true, arguments SRCMEM and SRCPTR are ignored. + When ISSETMEM is false, arguments VALUE and ORIG_VALUE are ignored. + Other arguments have same meaning as for previous function. 
*/
+
 static void
-expand_movmem_via_rep_mov (rtx destmem, rtx srcmem,
-			   rtx destptr, rtx srcptr,
+expand_movmem_or_setmem_via_rep (rtx destmem, rtx srcmem,
+				 rtx destptr, rtx srcptr,
+				 rtx value, rtx orig_value,
 			   rtx count,
-			   enum machine_mode mode)
+			   enum machine_mode mode, bool issetmem)
 {
   rtx destexp;
   rtx srcexp;
@@ -22233,82 +22236,65 @@ expand_movmem_via_rep_mov (rtx destmem, rtx srcmem,
   HOST_WIDE_INT rounded_count;
 
   /* If the size is known, it is shorter to use rep movs.  */
-  if (mode == QImode && CONST_INT_P (count)
+  if (!issetmem && mode == QImode && CONST_INT_P (count)
       && !(INTVAL (count) & 3))
     mode = SImode;
 
   if (destptr != XEXP (destmem, 0) || GET_MODE (destmem) != BLKmode)
     destmem = adjust_automodify_address_nv (destmem, BLKmode, destptr, 0);
-  if (srcptr != XEXP (srcmem, 0) || GET_MODE (srcmem) != BLKmode)
-    srcmem = adjust_automodify_address_nv (srcmem, BLKmode, srcptr, 0);
-  countreg = ix86_zero_extend_to_Pmode (scale_counter (count, GET_MODE_SIZE (mode)));
+
+  countreg = ix86_zero_extend_to_Pmode (scale_counter (count,
+						       GET_MODE_SIZE (mode)));
   if (mode != QImode)
     {
       destexp = gen_rtx_ASHIFT (Pmode, countreg,
				GEN_INT (exact_log2 (GET_MODE_SIZE (mode))));
       destexp = gen_rtx_PLUS (Pmode, destexp, destptr);
-      srcexp = gen_rtx_ASHIFT (Pmode, countreg,
-			       GEN_INT (exact_log2 (GET_MODE_SIZE (mode))));
-      srcexp = gen_rtx_PLUS (Pmode, srcexp, srcptr);
     }
   else
-    {
-      destexp = gen_rtx_PLUS (Pmode, destptr, countreg);
-      srcexp = gen_rtx_PLUS (Pmode, srcptr, countreg);
-    }
-  if (CONST_INT_P (count))
+    destexp = gen_rtx_PLUS (Pmode, destptr, countreg);
+  if ((!issetmem || orig_value == const0_rtx) && CONST_INT_P (count))
     {
       rounded_count = (INTVAL (count)
		       & ~((HOST_WIDE_INT) GET_MODE_SIZE (mode) - 1));
       destmem = shallow_copy_rtx (destmem);
-      srcmem = shallow_copy_rtx (srcmem);
       set_mem_size (destmem, rounded_count);
-      set_mem_size (srcmem, rounded_count);
-    }
-  else
-    {
-      if (MEM_SIZE_KNOWN_P (destmem))
-	clear_mem_size (destmem);
-      if (MEM_SIZE_KNOWN_P (srcmem))
-	clear_mem_size (srcmem);
     }
-  emit_insn (gen_rep_mov (destptr, destmem, srcptr, srcmem, countreg,
-			  destexp, srcexp));
-}
-
-/* Output "rep; stos" instruction.
-   Arguments have same meaning as for previous function */
-static void
-expand_setmem_via_rep_stos (rtx destmem, rtx destptr, rtx value,
-			    rtx count, enum machine_mode mode,
-			    rtx orig_value)
-{
-  rtx destexp;
-  rtx countreg;
-  HOST_WIDE_INT rounded_count;
+  else if (MEM_SIZE_KNOWN_P (destmem))
+    clear_mem_size (destmem);
 
-  if (destptr != XEXP (destmem, 0) || GET_MODE (destmem) != BLKmode)
-    destmem = adjust_automodify_address_nv (destmem, BLKmode, destptr, 0);
-  value = force_reg (mode, gen_lowpart (mode, value));
-  countreg =
Re: [PATCH] Fix libgfortran cross compile configury w.r.t newlib
On 27/09/13 17:08, Steve Ellcey wrote:
> On Thu, 2013-09-26 at 14:47 +0100, Marcus Shawcroft wrote:
>> I'm in two minds about whether further sticky tape of this form is the
>> right approach or whether the original patch should be reverted until a
>> proper fix that does not regress the tree can be found. Thoughts?
>>
>> 2013-09-26  Marcus Shawcroft  marcus.shawcr...@arm.com
>>
>> 	* configure.ac (AC_CHECK_FUNCS_ONCE): Make if statement
>> 	dependent on gcc_no_link.
>>
>> Cheers
>> /Marcus
>
> Well, I thought this patch would work for me, but it does not. It looks
> like gcc_no_link is set to 'no' on my target because, technically, I can
> link even if I don't use a linker script. I just can't find any functions.
>
> % cat x.c
> int main (void) { return 0; }
> % mips-mti-elf-gcc x.c -o x
> /local/home/sellcey/nightly/install-mips-mti-elf/lib/gcc/mips-mti-elf/4.9.0/../../../../mips-mti-elf/bin/ld: warning: cannot find entry symbol __start; defaulting to 00400098
> % echo $?
> 0
> % cat y.c
> int main (void) { exit (0); }
> % install-mips-mti-elf/bin/mips-mti-elf-gcc y.c -o y
> /local/home/sellcey/nightly/install-mips-mti-elf/lib/gcc/mips-mti-elf/4.9.0/../../../../mips-mti-elf/bin/ld: warning: cannot find entry symbol __start; defaulting to 00400098
> /tmp/ccdG78PN.o: In function `main':
> y.c:(.text+0x14): undefined reference to `exit'
> collect2: error: ld returned 1 exit status
> ubuntu-sellcey % echo $?
> 1

In which case, gating on gcc_no_link could be replaced with a test that looks to see if we can link with the library. Perhaps looking for exit() or some such function that might reasonably be expected to be present. For example:

AC_CHECK_FUNC(exit)
if test x${with_newlib} = xyes -a x${ac_cv_func_exit} = xno; then

/Marcus
Re: [gomp4] Library side of depend clause support
On 27 Sep 12:08, Jakub Jelinek wrote:

Looks like you forgot some files. I've checked
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=202968
and e.g. hashtab.h is missing. So currently the branch fails to build, with:

  task.c:46:21: fatal error: hashtab.h: No such file or directory

Here is what I've committed now; the incremental changes were really only using a structure with a flex array member for the dependers vectors, removing/making redundant the earlier !ent->is_in when adding !is_in into the chain, and the addition of new testcases. Let's improve it incrementally later.

2013-09-27  Jakub Jelinek  ja...@redhat.com

	* libgomp.h: Include stdlib.h.
	(struct gomp_task_depend_entry, struct gomp_dependers_vec): New types.
	(struct gomp_task): Add dependers, depend_hash, depend_count,
	num_dependees and depend fields.
	(struct gomp_taskgroup): Add num_children field.
	(gomp_finish_task): Free depend_hash if non-NULL.
	* libgomp_g.h (GOMP_task): Add depend argument.
	* hashtab.h: New file.
	* task.c: Include hashtab.h.
	(hash_entry_type): New typedef.
	(htab_alloc, htab_free, htab_hash, htab_eq): New inlines.
	(gomp_init_task): Clear dependers, depend_hash and depend_count
	fields.
	(GOMP_task): Add depend argument, handle depend clauses.  Increment
	num_children field in taskgroup.
	(gomp_task_run_pre): Don't increment task_running_count here, nor
	clear task_pending bit.
	(gomp_task_run_post_handle_depend_hash,
	gomp_task_run_post_handle_dependers,
	gomp_task_run_post_handle_depend): New functions.
	(gomp_task_run_post_remove_parent): Clear in_taskwait before
	signalling corresponding semaphore.
	(gomp_task_run_post_remove_taskgroup): Decrement num_children field
	and make the decrement to 0 MEMMODEL_RELEASE operation, rather than
	storing NULL to taskgroup->children.  Clear in_taskgroup_wait before
	signalling corresponding semaphore.
	(gomp_barrier_handle_tasks): Move task_running_count increment
	and task_pending bit clearing here.  Call
	gomp_task_run_post_handle_depend.  If more than one new task has
	been queued, wake other threads if needed.
	(GOMP_taskwait): Call gomp_task_run_post_handle_depend.  If more
	than one new task has been queued, wake other threads if needed.
	After waiting on taskwait_sem, enter critical section again.
	(GOMP_taskgroup_start): Initialize num_children field.
	(GOMP_taskgroup_end): Check num_children instead of children
	before critical section.  If children is NULL, but num_children
	is non-zero, wait on taskgroup_sem.  Call
	gomp_task_run_post_handle_depend.  If more than one new task has
	been queued, wake other threads if needed.  After waiting on
	taskgroup_sem, enter critical section again.
	* testsuite/libgomp.c/depend-1.c: New test.
	* testsuite/libgomp.c/depend-2.c: New test.
	* testsuite/libgomp.c/depend-3.c: New test.
	* testsuite/libgomp.c/depend-4.c: New test.
Re: libgo patch committed: Implement reflect.MakeFunc for 386
Ian Lance Taylor i...@google.com writes:

> Following up on my earlier patch, this patch implements the
> reflect.MakeFunc function for 386.
>
> Tom Tromey pointed out to me that the libffi closure support can
> probably be used for this. I was not aware of that support. It supports
> a lot more processors, and I should probably start using it. The
> approach I am using does have a couple of advantages: it's more
> efficient, and it doesn't require any type of writable executable
> memory. I can get away with that because indirect calls in Go always
> pass a closure value. So even when and if I do change to using libffi,
> I might still keep this code for amd64 and 386.

Unfortunately, this patch (and undoubtedly the corresponding amd64 one) breaks Solaris/x86 libgo bootstrap with native as:

Assembler: reflect/makefunc.lo
	"/var/tmp//cctly9hk.s", line 8 : Illegal mnemonic
	Near line: ".cfi_startproc"
	"/var/tmp//cctly9hk.s", line 8 : Syntax error
	Near line: ".cfi_startproc"
	"/var/tmp//cctly9hk.s", line 21 : Illegal mnemonic
	Near line: ".cfi_def_cfa_offset 8"
	"/var/tmp//cctly9hk.s", line 21 : Syntax error
	Near line: ".cfi_def_cfa_offset 8"
	"/var/tmp//cctly9hk.s", line 22 : Illegal mnemonic
	Near line: ".cfi_offset %ebp, -8"
	"/var/tmp//cctly9hk.s", line 22 : Syntax error
	Near line: ".cfi_offset %ebp, -8"
	"/var/tmp//cctly9hk.s", line 24 : Illegal mnemonic
	Near line: ".cfi_def_cfa_register %ebp"
	"/var/tmp//cctly9hk.s", line 24 : Syntax error
	Near line: ".cfi_def_cfa_register %ebp"
	"/var/tmp//cctly9hk.s", line 27 : Illegal mnemonic
	Near line: ".cfi_offset %ebx, -12"
	"/var/tmp//cctly9hk.s", line 27 : Syntax error
	Near line: ".cfi_offset %ebx, -12"
	"/var/tmp//cctly9hk.s", line 45 : Illegal mnemonic
	Near line: ".cfi_restore %ebx"
	"/var/tmp//cctly9hk.s", line 45 : Syntax error
	Near line: ".cfi_restore %ebx"
	"/var/tmp//cctly9hk.s", line 47 : Illegal mnemonic
	Near line: ".cfi_restore %ebp"
	"/var/tmp//cctly9hk.s", line 47 : Syntax error
	Near line: ".cfi_restore %ebp"
	"/var/tmp//cctly9hk.s", line 48 : Illegal mnemonic
	Near line: ".cfi_def_cfa %esp, 4"
	"/var/tmp//cctly9hk.s", line 48 : Syntax error
	Near line: ".cfi_def_cfa %esp, 4"
	"/var/tmp//cctly9hk.s", line 50 : Illegal mnemonic
	Near line: ".cfi_endproc"
	"/var/tmp//cctly9hk.s", line 50 : Syntax error
	Near line: ".cfi_endproc"
	"/var/tmp//cctly9hk.s", line 52 : Invalid section attribute
	"/var/tmp//cctly9hk.s", line 52 : Syntax error
	Near line: ".section .text.__x86.get_pc_thunk.bx,axG,@progbits,__x86.get_pc_thunk.bx,comdat"
	"/var/tmp//cctly9hk.s", line 57 : Illegal mnemonic
	Near line: ".cfi_startproc"
	"/var/tmp//cctly9hk.s", line 57 : Syntax error
	Near line: ".cfi_startproc"
	"/var/tmp//cctly9hk.s", line 60 : Illegal mnemonic
	Near line: ".cfi_endproc"
	"/var/tmp//cctly9hk.s", line 60 : Syntax error
	Near line: ".cfi_endproc"
	"/var/tmp//cctly9hk.s", line 62 : Syntax error
	Near line: ".section .note.GNU-stack,,@progbits"
	"/var/tmp//cctly9hk.s", line 63 : Syntax error
	Near line: ".section .note.GNU-split-stack,,@progbits"
	"/var/tmp//cctly9hk.s", line 64 : Syntax error
	Near line: ".section .note.GNU-no-split-stack,,@progbits"
	make[4]: *** [reflect/makefunc.lo] Error 1

AFAICS, this is just the .cfi_* directives and empty section flags.

	Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [gomp4] Library side of depend clause support
On Mon, Sep 30, 2013 at 05:04:23PM +0400, Ilya Tocar wrote:
> On 27 Sep 12:08, Jakub Jelinek wrote:
> Looks like you forgot some files. I've checked
> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=202968
> and e.g. hashtab.h is missing. So currently the branch fails to build, with:
> task.c:46:21: fatal error: hashtab.h: No such file or directory

Fixed now, sorry.

	Jakub
[PATCH][AARCH64]Replace gen_rtx_PLUS with plus_constant
Hello all,

Sorry for my last patch, which caused some test regressions. I have corrected it, and it has been tested for aarch64-none-elf on the model. This patch replaces explicit calls to gen_rtx_PLUS and GEN_INT with plus_constant.

OK for trunk?

Kind regards,
Renlin Li

gcc/ChangeLog:

2013-09-30  Renlin Li  renlin...@arm.com

	* config/aarch64/aarch64.c (aarch64_expand_prologue): Use
	plus_constant.
	(aarch64_expand_epilogue): Likewise.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e8ae20a..db56f19 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2065,9 +2065,9 @@ aarch64_expand_prologue (void)
 	  emit_insn (gen_add2_insn (stack_pointer_rtx, op0));
 	  aarch64_set_frame_expr (gen_rtx_SET
				  (Pmode, stack_pointer_rtx,
-				   gen_rtx_PLUS (Pmode,
-						 stack_pointer_rtx,
-						 GEN_INT (-frame_size))));
+				   plus_constant (Pmode,
+						  stack_pointer_rtx,
+						  -frame_size)));
 	}
       else if (frame_size > 0)
 	{
@@ -2151,9 +2151,9 @@ aarch64_expand_prologue (void)
					     GEN_INT (fp_offset)));
 	  aarch64_set_frame_expr (gen_rtx_SET
				  (Pmode, hard_frame_pointer_rtx,
-				   gen_rtx_PLUS (Pmode,
-						 stack_pointer_rtx,
-						 GEN_INT (fp_offset))));
+				   plus_constant (Pmode,
+						  stack_pointer_rtx,
+						  fp_offset)));
 	  RTX_FRAME_RELATED_P (insn) = 1;
 	  insn = emit_insn (gen_stack_tie (stack_pointer_rtx,
					   hard_frame_pointer_rtx));
@@ -2349,9 +2349,9 @@ aarch64_expand_epilogue (bool for_sibcall)
 	  emit_insn (gen_add2_insn (stack_pointer_rtx, op0));
 	  aarch64_set_frame_expr (gen_rtx_SET
				  (Pmode, stack_pointer_rtx,
-				   gen_rtx_PLUS (Pmode,
-						 stack_pointer_rtx,
-						 GEN_INT (frame_size))));
+				   plus_constant (Pmode,
+						  stack_pointer_rtx,
+						  frame_size)));
 	}
       else if (frame_size > 0)
 	{
@@ -2373,10 +2373,10 @@ aarch64_expand_epilogue (bool for_sibcall)
 	}
     }
 
-  aarch64_set_frame_expr (gen_rtx_SET (Pmode, stack_pointer_rtx,
-				       gen_rtx_PLUS (Pmode,
-						     stack_pointer_rtx,
-						     GEN_INT (offset))));
+  aarch64_set_frame_expr (gen_rtx_SET (Pmode, stack_pointer_rtx,
+				       plus_constant (Pmode,
+						      stack_pointer_rtx,
+						      offset)));
 
   emit_use (gen_rtx_REG (DImode, LR_REGNUM));
[PATCH][ARM]Replace gen_rtx_PLUS with plus_constant
Hello all,

Sorry for my last patch, which caused some test regressions. I have corrected it, and it has been tested for arm-none-eabi on the model. This patch replaces explicit calls to gen_rtx_PLUS and GEN_INT with plus_constant.

OK for trunk?

Kind regards,
Renlin Li

gcc/ChangeLog:

2013-09-30  Renlin Li  renlin...@arm.com

	* config/arm/arm.c (arm_output_mi_thunk): Use plus_constant.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 2166001..256de81 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25352,7 +25352,7 @@ arm_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
 	{
 	  /* Output ".word .LTHUNKn-7-.LTHUNKPCn".  */
 	  rtx tem = XEXP (DECL_RTL (function), 0);
-	  tem = gen_rtx_PLUS (GET_MODE (tem), tem, GEN_INT (-7));
+	  tem = plus_constant (GET_MODE (tem), tem, -7);
 	  tem = gen_rtx_MINUS (GET_MODE (tem), tem,
			       gen_rtx_SYMBOL_REF (Pmode,
[PATCH] Fix A < 0 ? sign bit of A : 0 optimization (PR middle-end/58564)
Hi!

Apparently sign_bit_p looks through both sign and zero extensions. That is just fine for the single-bit test optimization, where we know the constant is a power of two and it is an (A & power_of_two) == 0 (or != 0) test. sign_bit_p is also used to check for the minimum value of some integral type; because it is called with the same argument twice there, it must be INTEGER_CST if non-NULL is returned, and thus there are no zero extensions. But in the last spot where sign_bit_p is used, sign extensions would be fine - ((int) x) < 0 for, say, signed char or signed short x iff x < 0 - but if there is a zero extension, it is just a possible missed optimization on the comparison.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.8?

2013-09-30  Jakub Jelinek  ja...@redhat.com

	PR middle-end/58564
	* fold-const.c (fold_ternary_loc): For A < 0 ? sign bit of A : 0
	optimization, punt if sign_bit_p looked through any zero extension.

	* gcc.c-torture/execute/pr58564.c: New test.

--- gcc/fold-const.c.jj	2013-09-27 15:42:37.000000000 +0200
+++ gcc/fold-const.c	2013-09-30 11:19:06.333978484 +0200
@@ -14196,14 +14196,29 @@ fold_ternary_loc (location_t loc, enum t
 	  && integer_zerop (op2)
 	  && (tem = sign_bit_p (TREE_OPERAND (arg0, 0), arg1)))
 	{
+	  /* sign_bit_p looks through both zero and sign extensions,
+	     but for this optimization only sign extensions are
+	     usable.  */
+	  tree tem2 = TREE_OPERAND (arg0, 0);
+	  while (tem != tem2)
+	    {
+	      if (TREE_CODE (tem2) != NOP_EXPR
+		  || TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (tem2, 0))))
+		{
+		  tem = NULL_TREE;
+		  break;
+		}
+	      tem2 = TREE_OPERAND (tem2, 0);
+	    }
 	  /* sign_bit_p only checks ARG1 bits within A's precision.
 	     If sign bit of A has wider type than A, bits outside
 	     of A's precision in sign bit of A need to be checked.
 	     If they are all 0, this optimization needs to be done
 	     in unsigned A's type, if they are all 1 in signed A's type,
 	     otherwise this can't be done.
	     */
-	  if (TYPE_PRECISION (TREE_TYPE (tem))
-	      < TYPE_PRECISION (TREE_TYPE (arg1))
+	  if (tem
+	      && TYPE_PRECISION (TREE_TYPE (tem))
+		 < TYPE_PRECISION (TREE_TYPE (arg1))
 	      && TYPE_PRECISION (TREE_TYPE (tem)) < TYPE_PRECISION (type))
 	    {
--- gcc/testsuite/gcc.c-torture/execute/pr58564.c.jj	2013-09-30 11:09:38.691122488 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr58564.c	2013-09-30 11:09:14.000000000 +0200
@@ -0,0 +1,14 @@
+/* PR middle-end/58564 */
+
+extern void abort (void);
+int a, b;
+short *c, **d = &c;
+
+int
+main ()
+{
+  b = (0, 0 > ((&c == d) & (1 >> (a ^ 1)))) | 0U;
+  if (b != 0)
+    abort ();
+  return 0;
+}

	Jakub
[PATCH] Improve tree_unary_nonnegative_warnv_p (PR middle-end/58564)
Hi!

Related to the last patch, this handles BOOLEAN_TYPE and ENUMERAL_TYPE the same as INTEGER_TYPE in tree_unary_nonnegative_warnv_p, which means we e.g. properly fold (int) (x != 0 && y != 0) >= 0 when (x != 0 && y != 0) has BOOLEAN_TYPE.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-09-30  Jakub Jelinek  ja...@redhat.com

	PR middle-end/58564
	* fold-const.c (tree_unary_nonnegative_warnv_p): Use
	INTEGRAL_TYPE_P (t) instead of TREE_CODE (t) == INTEGER_TYPE.

--- gcc/fold-const.c.jj	2013-09-30 11:19:06.000000000 +0200
+++ gcc/fold-const.c	2013-09-30 11:47:40.984561868 +0200
@@ -15448,7 +15448,7 @@ tree_unary_nonnegative_warnv_p (enum tre
       if (TREE_CODE (inner_type) == REAL_TYPE)
	return tree_expr_nonnegative_warnv_p (op0,
					      strict_overflow_p);
-      if (TREE_CODE (inner_type) == INTEGER_TYPE)
+      if (INTEGRAL_TYPE_P (inner_type))
	{
	  if (TYPE_UNSIGNED (inner_type))
	    return true;
@@ -15456,12 +15456,12 @@ tree_unary_nonnegative_warnv_p (enum tre
					      strict_overflow_p);
	}
     }
-  else if (TREE_CODE (outer_type) == INTEGER_TYPE)
+  else if (INTEGRAL_TYPE_P (outer_type))
     {
       if (TREE_CODE (inner_type) == REAL_TYPE)
	return tree_expr_nonnegative_warnv_p (op0,
					      strict_overflow_p);
-      if (TREE_CODE (inner_type) == INTEGER_TYPE)
+      if (INTEGRAL_TYPE_P (inner_type))
	return TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type)
	       && TYPE_UNSIGNED (inner_type);
     }

	Jakub
Re: [PATCH] Improve tree_unary_nonnegative_warnv_p (PR middle-end/58564)
On Mon, 30 Sep 2013, Jakub Jelinek wrote:

> Hi!
>
> Related to the last patch, this handles BOOLEAN_TYPE and ENUMERAL_TYPE
> the same as INTEGER_TYPE in tree_unary_nonnegative_warnv_p, which means
> we e.g. properly fold (int) (x != 0 && y != 0) >= 0 when
> (x != 0 && y != 0) has BOOLEAN_TYPE.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2013-09-30  Jakub Jelinek  ja...@redhat.com
>
> 	PR middle-end/58564
> 	* fold-const.c (tree_unary_nonnegative_warnv_p): Use
> 	INTEGRAL_TYPE_P (t) instead of TREE_CODE (t) == INTEGER_TYPE.
>
> --- gcc/fold-const.c.jj	2013-09-30 11:19:06.000000000 +0200
> +++ gcc/fold-const.c	2013-09-30 11:47:40.984561868 +0200
> @@ -15448,7 +15448,7 @@ tree_unary_nonnegative_warnv_p (enum tre
>        if (TREE_CODE (inner_type) == REAL_TYPE)
> 	return tree_expr_nonnegative_warnv_p (op0,
> 					      strict_overflow_p);
> -      if (TREE_CODE (inner_type) == INTEGER_TYPE)
> +      if (INTEGRAL_TYPE_P (inner_type))
> 	{
> 	  if (TYPE_UNSIGNED (inner_type))
> 	    return true;
> @@ -15456,12 +15456,12 @@ tree_unary_nonnegative_warnv_p (enum tre
> 					      strict_overflow_p);
> 	}
>      }
> -  else if (TREE_CODE (outer_type) == INTEGER_TYPE)
> +  else if (INTEGRAL_TYPE_P (outer_type))
>      {
>        if (TREE_CODE (inner_type) == REAL_TYPE)
> 	return tree_expr_nonnegative_warnv_p (op0,
> 					      strict_overflow_p);
> -      if (TREE_CODE (inner_type) == INTEGER_TYPE)
> +      if (INTEGRAL_TYPE_P (inner_type))
> 	return TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type)
> 	       && TYPE_UNSIGNED (inner_type);
>      }
>
> 	Jakub

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Fix A < 0 ? sign bit of A : 0 optimization (PR middle-end/58564)
On Mon, 30 Sep 2013, Jakub Jelinek wrote:

> Hi!
>
> Apparently sign_bit_p looks through both sign and zero extensions. That
> is just fine for the single-bit test optimization, where we know the
> constant is a power of two and it is an (A & power_of_two) == 0 (or
> != 0) test. sign_bit_p is also used to check for the minimum value of
> some integral type; because it is called with the same argument twice
> there, it must be INTEGER_CST if non-NULL is returned, and thus there
> are no zero extensions. But in the last spot where sign_bit_p is used,
> sign extensions would be fine - ((int) x) < 0 for, say, signed char or
> signed short x iff x < 0 - but if there is a zero extension, it is just
> a possible missed optimization on the comparison.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk/4.8?

Ok.

Thanks,
Richard.

> 2013-09-30  Jakub Jelinek  ja...@redhat.com
>
> 	PR middle-end/58564
> 	* fold-const.c (fold_ternary_loc): For A < 0 ? sign bit of A : 0
> 	optimization, punt if sign_bit_p looked through any zero extension.
>
> 	* gcc.c-torture/execute/pr58564.c: New test.
>
> --- gcc/fold-const.c.jj	2013-09-27 15:42:37.000000000 +0200
> +++ gcc/fold-const.c	2013-09-30 11:19:06.333978484 +0200
> @@ -14196,14 +14196,29 @@ fold_ternary_loc (location_t loc, enum t
> 	  && integer_zerop (op2)
> 	  && (tem = sign_bit_p (TREE_OPERAND (arg0, 0), arg1)))
> 	{
> +	  /* sign_bit_p looks through both zero and sign extensions,
> +	     but for this optimization only sign extensions are
> +	     usable.  */
> +	  tree tem2 = TREE_OPERAND (arg0, 0);
> +	  while (tem != tem2)
> +	    {
> +	      if (TREE_CODE (tem2) != NOP_EXPR
> +		  || TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (tem2, 0))))
> +		{
> +		  tem = NULL_TREE;
> +		  break;
> +		}
> +	      tem2 = TREE_OPERAND (tem2, 0);
> +	    }
> 	  /* sign_bit_p only checks ARG1 bits within A's precision.
> 	     If sign bit of A has wider type than A, bits outside
> 	     of A's precision in sign bit of A need to be checked.
> 	     If they are all 0, this optimization needs to be done
> 	     in unsigned A's type, if they are all 1 in signed A's type,
> 	     otherwise this can't be done.  */
> -	  if (TYPE_PRECISION (TREE_TYPE (tem))
> -	      < TYPE_PRECISION (TREE_TYPE (arg1))
> +	  if (tem
> +	      && TYPE_PRECISION (TREE_TYPE (tem))
> +		 < TYPE_PRECISION (TREE_TYPE (arg1))
> 	      && TYPE_PRECISION (TREE_TYPE (tem)) < TYPE_PRECISION (type))
> 	    {
> --- gcc/testsuite/gcc.c-torture/execute/pr58564.c.jj	2013-09-30 11:09:38.691122488 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr58564.c	2013-09-30 11:09:14.000000000 +0200
> @@ -0,0 +1,14 @@
> +/* PR middle-end/58564 */
> +
> +extern void abort (void);
> +int a, b;
> +short *c, **d = &c;
> +
> +int
> +main ()
> +{
> +  b = (0, 0 > ((&c == d) & (1 >> (a ^ 1)))) | 0U;
> +  if (b != 0)
> +    abort ();
> +  return 0;
> +}
>
> 	Jakub

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: libgo patch committed: Implement reflect.MakeFunc for 386
On Mon, Sep 30, 2013 at 6:07 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: Following up on my earlier patch, this patch implements the reflect.MakeFunc function for 386. Tom Tromey pointed out to me that the libffi closure support can probably be used for this. I was not aware of that support. It supports a lot more processors, and I should probably start using it. The approach I am using does have a couple of advantages: it's more efficient, and it doesn't require any type of writable executable memory. I can get away with that because indirect calls in Go always pass a closure value. So even when and if I do change to using libffi, I might still keep this code for amd64 and 386. Unfortunately, this patch (and undoubtedly the corresponding amd64 one) break Solaris/x86 libgo bootstrap with native as: Unfortunately I think I'll have to somehow disable this functionality on systems with assemblers that do not understand the .cfi directives, as otherwise calling panic in a function created with MakeFunc will not work. Ian
Re: libgo patch committed: Implement reflect.MakeFunc for 386
Ian Lance Taylor i...@google.com writes: On Mon, Sep 30, 2013 at 6:07 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: Following up on my earlier patch, this patch implements the reflect.MakeFunc function for 386. Tom Tromey pointed out to me that the libffi closure support can probably be used for this. I was not aware of that support. It supports a lot more processors, and I should probably start using it. The approach I am using does have a couple of advantages: it's more efficient, and it doesn't require any type of writable executable memory. I can get away with that because indirect calls in Go always pass a closure value. So even when and if I do change to using libffi, I might still keep this code for amd64 and 386. Unfortunately, this patch (and undoubtedly the corresponding amd64 one) break Solaris/x86 libgo bootstrap with native as: Unfortunately I think I'll have to somehow disable this functionality on systems with assemblers that do not understand the .cfi directives, as otherwise calling panic in a function created with MakeFunc will not work. Alternatively, one could hand-craft the .eh_frame section for such systems along the lines of libffi/src/x86/sysv.S: ugly, but doable. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Trivial cleanup
On 09/30/2013 04:05 AM, Michael Matz wrote:
> Hi,
>
> On Sat, 28 Sep 2013, Andrew MacLeod wrote:
>> My example in this form would look something like:
>>   int unsignedsrcp = ptrvar.type().type().type_unsigned();
>>   ...
>>   GimpleType t1 = ptrvar.type ();
>>   GimpleType t2 = t1.type ();
>
> Stop that CamelCase dyslexia already, will you? ;-)

:-) I'm using it purely as a holding place during the prototyping :-) I'm guessing as a project we don't want it (I've grown accustomed to it, but I'm ambivalent about it). For now it does allow me to search/replace project-wide without getting false hits, since there is no other CamelCase. When I'm done with the header-file refactoring I was going to discuss what we want for names and conventions before really getting going :-)

Andrew
RFA: GCC Testsuite: Annotate compile tests that need at least 32-bit pointers/integers
Hi Guys,

Several tests in the gcc.c-torture/compile directory need a target with 32-bit integers and/or 32-bit pointers. The patch below adds "dg-require-effective-target int32plus" to these tests. This fixes ~200 unexpected failures for the MSP430, RL78 and XSTORMY16 targets.

Note - I have used "dg-require-effective-target int32plus" in preference to "dg-require-effective-target ptr32plus" even where it would appear that a pointer-size test is more appropriate. This is because the check_effective_target_ptr32plus test is broken for targets that use PSImode (e.g. the MSP430 in large mode). On the MSP430, for example, a PSImode pointer is 20 bits long, but when stored in memory it occupies 32 bits. So sizeof(void *) returns 4, but really a pointer cannot hold an entire 32-bit address.

Tested with no regressions on msp430-elf, rl78-elf and xstormy16-elf targets.

OK to apply?

Cheers
Nick

gcc/testsuite/ChangeLog
2013-09-30  Nick Clifton  ni...@redhat.com

	* gcc.c-torture/compile/20010327-1.c: Only run the test for
	int32plus targets.
	* gcc.c-torture/compile/990617-1.c: Likewise.
	* gcc.c-torture/compile/calls.c: Likewise.
	* gcc.c-torture/compile/limits-externdecl.c: Likewise.
	* gcc.c-torture/compile/pr41181.c: Likewise.
	* gcc.c-torture/compile/pr55955.c: Likewise.
Index: gcc/testsuite/gcc.c-torture/compile/20010327-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/20010327-1.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/20010327-1.c	(working copy)
@@ -1,7 +1,4 @@
-/* { dg-skip-if "non-SI pointers" { m32c-*-* } { "*" } { "" } } */
-/* { dg-skip-if "HI mode pointer for avr" { avr-*-* } { "*" } { "" } } */
-/* { dg-skip-if "HI mode pointer for pdp11" { pdp11-*-* } { "*" } { "" } } */
-/* { dg-skip-if "non-SI pointers for w64" { x86_64-*-mingw* } { "*" } { "" } } */
+/* { dg-require-effective-target int32plus } */
 
 /* This testcase tests whether GCC can produce static initialized data
    that references addresses of size 'unsigned long', even if that's not
Index: gcc/testsuite/gcc.c-torture/compile/20020604-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/20020604-1.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/20020604-1.c	(working copy)
@@ -1,7 +1,6 @@
 /* { dg-do assemble } */
-/* { dg-skip-if "The array is too big" { avr-*-* pdp11-*-* } { "*" } { "" } } */
+/* { dg-require-effective-target ptr32plus } */
 /* { dg-xfail-if "The array too big" { h8300-*-* } { "-mno-h" "-mn" } { "" } } */
-/* { dg-skip-if "" { m32c-*-* } { } { } } */
 
 /* PR c/6957
    This testcase ICEd at -O2 on IA-32, because
Index: gcc/testsuite/gcc.c-torture/compile/20080625-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/20080625-1.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/20080625-1.c	(working copy)
@@ -1,4 +1,5 @@
-/* { dg-skip-if "too much data" { avr-*-* m32c-*-* pdp11-*-* } { "*" } { "" } } */
+/* { dg-require-effective-target int32plus } */
+
 struct peakbufStruct {
     unsigned int lnum [5000];
     int lscan [5000][4000];
Index: gcc/testsuite/gcc.c-torture/compile/990617-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/990617-1.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/990617-1.c	(working copy)
@@ -1,7 +1,5 @@
-/* 0x7000 is too large a constant to become a pointer on
-   xstormy16.  */
 /* { dg-do assemble } */
-/* { dg-xfail-if "" { xstormy16-*-* } { "*" } { "" } } */
+/* { dg-require-effective-target int32plus } */
 
 int main()
 {
Index: gcc/testsuite/gcc.c-torture/compile/calls.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/calls.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/calls.c	(working copy)
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target int32plus } */
 typedef void *(*T)(void);
 f1 ()
 {
Index: gcc/testsuite/gcc.c-torture/compile/limits-externdecl.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/limits-externdecl.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/limits-externdecl.c	(working copy)
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target int32plus } */
 /* Inspired by the test case for PR middle-end/52640.  */
 
 typedef struct
@@ -52,4 +53,4 @@ REFERENCE references[] = {
   LIM5 (X)
   0
-};  /* { dg-error "size of array is too large" { target avr-*-* } } */
+};
Index: gcc/testsuite/gcc.c-torture/compile/pr41181.c
===================================================================
--- gcc/testsuite/gcc.c-torture/compile/pr41181.c	(revision 203032)
+++ gcc/testsuite/gcc.c-torture/compile/pr41181.c	(working copy)
@@ -1,3 +1,4 @@
+/* {
Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.
>>>>> Iain == Iain Sandoe i...@codesourcery.com writes:

Joseph> Do you need these compilation rules at all? Or could you change
Joseph> config.host to use paths such as config/host-darwin.o rather
Joseph> than just host-darwin.o, and so allow the generic rules to be
Joseph> used (my understanding was that the auto-deps patch series made
Joseph> lots of such changes to the locations of .o files in the build
Joseph> tree to avoid needing special compilation rules for particular
Joseph> files)?

Iain> I had a look at this, and it seems like a useful objective. However,
Iain> unless I'm missing a step, [following the template of
Iain> config.gcc:out_file] it seems to require a fair amount of modification
Iain> (introduction of common-object placeholders etc. in the configury and
Iain> Makefile.in) - plus application and testing of this on multiple
Iain> targets. Not something I can realistically volunteer to do in the
Iain> immediate future.

I think it can be done more simply using vpath. (But I haven't tried this.) It seems to me though that the out_file stuff is overly manual and perhaps predates the GNU make requirement. Something like:

    vpath %.c $(dir $(tmake_file))
    vpath %.c $(dir $(xmake_file))

This would let us keep the .o file in ".". Right now, adding a new directory in which .o files may appear is a bit of a pain, because configure.ac hard-codes the list of such directories. (This could also be moved into the Makefile, it just seemed more complicated that way...)

Maybe it could also be done by writing a pattern rule that looks in those directories; though this is more of a pain because tmake_file and xmake_file can each list multiple files.

Tom
Re: PING^3: Re: [patch] implement Cilk Plus simd loops on trunk
On 09/09/13 07:54, Aldy Hernandez wrote: PING^3 Hi guys! PING for both C and C++. Thanks. Original Message Subject: Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk Date: Tue, 27 Aug 2013 15:03:26 -0500 From: Aldy Hernandez al...@redhat.com To: Richard Henderson r...@redhat.com CC: jason merrill ja...@redhat.com, gcc-patches gcc-patches@gcc.gnu.org On 08/26/13 12:22, Richard Henderson wrote: +static tree +c_check_cilk_loop_incr (location_t loc, tree decl, tree incr) +{ + if (EXPR_HAS_LOCATION (incr)) +loc = EXPR_LOCATION (incr); + + if (!incr) +{ + error_at (loc, missing increment); + return error_mark_node; +} Either these tests are swapped, or the second one isn't really needed. Swapped. Fixed. + switch (TREE_CODE (incr)) +{ +case POSTINCREMENT_EXPR: +case PREINCREMENT_EXPR: +case POSTDECREMENT_EXPR: +case PREDECREMENT_EXPR: + if (TREE_OPERAND (incr, 0) != decl) +break; + + // Bah... canonicalize into whatever OMP_FOR_INCR needs. + if (POINTER_TYPE_P (TREE_TYPE (decl)) + TREE_OPERAND (incr, 1)) +{ + tree t = fold_convert_loc (loc, + sizetype, TREE_OPERAND (incr, 1)); + + if (TREE_CODE (incr) == POSTDECREMENT_EXPR + || TREE_CODE (incr) == PREDECREMENT_EXPR) +t = fold_build1_loc (loc, NEGATE_EXPR, sizetype, t); + t = fold_build_pointer_plus (decl, t); + incr = build2 (MODIFY_EXPR, void_type_node, decl, t); +} + return incr; Handling pointer types and pointer_plus_expr here (p++) ... +case MODIFY_EXPR: + { +tree rhs; + +if (TREE_OPERAND (incr, 0) != decl) + break; + +rhs = TREE_OPERAND (incr, 1); +if (TREE_CODE (rhs) == PLUS_EXPR + (TREE_OPERAND (rhs, 0) == decl +|| TREE_OPERAND (rhs, 1) == decl) + INTEGRAL_TYPE_P (TREE_TYPE (rhs))) + return incr; +else if (TREE_CODE (rhs) == MINUS_EXPR + TREE_OPERAND (rhs, 0) == decl + INTEGRAL_TYPE_P (TREE_TYPE (rhs))) + return incr; +// Otherwise fail because only PLUS_EXPR and MINUS_EXPR are +// allowed. +break; ... but not here (p += 1)? I should make the code more obvious. 
What I'm trying to do is generate what the gimplifier for OMP_FOR is expecting. OMP rewrites pointer increment/decrement expressions into a corresponding MODIFY_EXPR. I have abstracted the OMP code and shared it between both type checks, and I have also added an assert in the gimplifier just in case some future front-end extension generates OMP_FOR_INCR in a wonky way. +c_validate_cilk_plus_loop (tree *tp, int *walk_subtrees, void *data) +{ + if (!tp || !*tp) +return NULL_TREE; + + bool *valid = (bool *) data; + + switch (TREE_CODE (*tp)) +{ +case CALL_EXPR: + { +tree fndecl = CALL_EXPR_FN (*tp); + +if (TREE_CODE (fndecl) == ADDR_EXPR) + fndecl = TREE_OPERAND (fndecl, 0); +if (TREE_CODE (fndecl) == FUNCTION_DECL) + { +if (setjmp_call_p (fndecl)) + { +error_at (EXPR_LOCATION (*tp), + calls to setjmp are not allowed within loops + annotated with #pragma simd); +*valid = false; +*walk_subtrees = 0; + } + } +break; Why bother checking for setjmp? While I agree it makes little sense, there are plenty of other standard functions which also make no sense to use from within #pragma simd. What's likely to go wrong with a call to setjmp, as opposed to getcontext, pthread_create, or even printf? Sigh...the standard specifically disallows setjmp. + if (DECL_REGISTER (decl)) +{ + error_at (loc, induction variable cannot be declared register); + return false; +} Why? The standard :(. All of the actual gimple changes look good. You could commit those now if you like to reduce the size of the next patch. Ughh...got lazy on this round. How about I commit the gimple changes for the next round? How does this look?
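For readers unfamiliar with the canonicalization being discussed: for pointers, `p++`/`p--` is rewritten into an explicit assignment that adds a byte offset, with the offset negated for decrements (the NEGATE_EXPR in the patch). A rough stand-alone sketch of that offset computation; the names here are illustrative, not GCC's:

```cpp
#include <cstddef>

// Hypothetical mirror of the front-end canonicalization above: p++ / p--
// on a pointer becomes "p = p + step", where step is the element size in
// bytes, negated for decrements (analogue of fold_build1_loc (loc,
// NEGATE_EXPR, sizetype, t) in the patch).
enum incr_kind { PRE_INC, POST_INC, PRE_DEC, POST_DEC };

long canonical_pointer_step (incr_kind kind, std::size_t elem_size)
{
  long step = static_cast<long> (elem_size);
  if (kind == PRE_DEC || kind == POST_DEC)
    step = -step;   // decrement: negate the byte offset
  return step;
}
```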
Re: cost model patch
Yes, that will do. Can you do it for me? I can't do testing easily on arm
myself.

thanks,

David

On Mon, Sep 30, 2013 at 3:29 AM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:

> Hi Richard, David,
>
> > In principle yes. Note that it changes the behavior of -O2
> > -ftree-vectorize as -ftree-vectorize does not imply changing the default
> > cost model. I am fine with that, but eventually this will have some
> > testsuite fallout.
>
> Indeed I am observing a regression with this patch on arm-none-eabi in
> gcc.dg/tree-ssa/gen-vect-26.c. Seems that the cheap vectoriser model
> doesn't do unaligned stores (as expected I think?). Is adding
> -fvect-cost-model=dynamic to the test options the correct approach?
>
> Thanks,
> Kyrill
Re: libgo patch committed: Implement reflect.MakeFunc for 386
On Mon, Sep 30, 2013 at 7:07 AM, Rainer Orth <r...@cebitec.uni-bielefeld.de> wrote:

Ian Lance Taylor <i...@google.com> writes:

On Mon, Sep 30, 2013 at 6:07 AM, Rainer Orth <r...@cebitec.uni-bielefeld.de> wrote:

Ian Lance Taylor <i...@google.com> writes:

Following up on my earlier patch, this patch implements the
reflect.MakeFunc function for 386.

Tom Tromey pointed out to me that the libffi closure support can probably
be used for this.

I was not aware of that support. It supports a lot more processors, and I
should probably start using it. The approach I am using does have a couple
of advantages: it's more efficient, and it doesn't require any type of
writable executable memory. I can get away with that because indirect
calls in Go always pass a closure value. So even when and if I do change
to using libffi, I might still keep this code for amd64 and 386.

Unfortunately, this patch (and undoubtedly the corresponding amd64 one)
break Solaris/x86 libgo bootstrap with native as:

Unfortunately I think I'll have to somehow disable this functionality on
systems with assemblers that do not understand the .cfi directives, as
otherwise calling panic in a function created with MakeFunc will not work.

Alternatively, one could hand-craft the .eh_frame section for such systems
along the lines of libffi/src/x86/sysv.S: ugly, but doable.

Yeah. I'm not going to do that myself. But I would be happy to approve a
patch for that if somebody else wants to write it.

Ian
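The "closure value" convention Ian mentions is the reason no writable executable memory is needed: an indirect call receives a closure record (code pointer plus captured data) rather than a bare code address, so no per-closure trampoline has to be generated at runtime. A purely illustrative sketch of the idea, not libgo's actual representation:

```cpp
// A closure record: the code pointer travels together with its captured
// environment, and is passed to the callee as an implicit argument.
struct closure
{
  int (*code) (closure *self, int arg);
  int captured;
};

int add_captured (closure *self, int arg)
{
  return arg + self->captured;   // the environment rides along with the call
}

int call_indirect (closure *c, int arg)
{
  // Every indirect call passes the closure record itself; no trampoline
  // in writable+executable memory is required.
  return c->code (c, arg);
}
```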
[patch] Add tree-ssa-coalesce.h
Move the prototype for coalesce_ssa_name() out of tree-ssa-live.h and put it in a new tree-ssa-coalesce.h file. Include tree-ssa-coalesce.h from tree-outof-ssa.h as it forms part of the out-of-ssa module. Also move gimple_can_coalesce_p from tree-ssa-coalesce.c to gimple.h as it operates on gimple structures and is also used a couple of other places. The prototype is already in gimple.h. Bootstraps on build/x86_64-unknown-linux-gnu with no new regressions. OK? Andrew * tree-ssa-live.h (coalesce_ssa_name): Move Prototype to... * tree-ssa-coalesce.h: New. Move prototype to here. * tree-outof-ssa.h: Include tree-ssa-coalesce.h. * tree-ssa-coalesce.c: Include tree-outof-ssa.h. (gimple_can_coalesce_p): Move to... * gimple.c (gimple_can_coalesce_p): Here. *** a1/tree-ssa-live.h 2013-09-30 10:52:14.833172626 -0400 --- tree-ssa-live.h 2013-09-30 11:10:59.161620552 -0400 *** make_live_on_entry (tree_live_info_p liv *** 321,328 bitmap_set_bit (live-global, p); } - - /* From tree-ssa-coalesce.c */ - extern var_map coalesce_ssa_name (void); - #endif /* _TREE_SSA_LIVE_H */ --- 321,324 *** a1/tree-ssa-coalesce.h 2013-09-30 11:22:12.566295821 -0400 --- tree-ssa-coalesce.h 2013-09-30 11:10:59.162614148 -0400 *** *** 1 --- 1,26 + /* Header file for tree-ssa-coalesce.c exports. +Copyright (C) 2013 Free Software Foundation, Inc. + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it under + the terms of the GNU General Public License as published by the Free + Software Foundation; either version 3, or (at your option) any later + version. + + GCC is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + http://www.gnu.org/licenses/. 
*/ + + #ifndef GCC_TREE_SSA_COALESCE_H + #define GCC_TREE_SSA_COALESCE_H + + extern var_map coalesce_ssa_name (void); + + + #endif /* GCC_TREE_SSA_COALESCE_H */ *** a1/tree-outof-ssa.h 2013-09-30 10:52:14.827173820 -0400 --- tree-outof-ssa.h 2013-09-30 11:10:59.163608763 -0400 *** along with GCC; see the file COPYING3. *** 23,28 --- 23,29 #include tree-ssa-live.h #include tree-ssa-ter.h + #include tree-ssa-coalesce.h /* This structure (of which only a singleton SA exists) is used to pass around information between the outof-SSA functions, cfgexpand *** a1/tree-ssa-coalesce.c 2013-09-30 10:52:14.831172590 -0400 --- tree-ssa-coalesce.c 2013-09-30 11:14:41.572345761 -0400 *** along with GCC; see the file COPYING3. *** 29,35 #include dumpfile.h #include tree-ssa.h #include hash-table.h ! #include tree-ssa-live.h #include diagnostic-core.h --- 29,35 #include dumpfile.h #include tree-ssa.h #include hash-table.h ! #include tree-outof-ssa.h #include diagnostic-core.h *** coalesce_ssa_name (void) *** 1333,1374 return map; } - - /* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for -coalescing together, false otherwise. - -This must stay consistent with var_map_base_init in tree-ssa-live.c. */ - - bool - gimple_can_coalesce_p (tree name1, tree name2) - { - /* First check the SSA_NAME's associated DECL. We only want to - coalesce if they have the same DECL or both have no associated DECL. */ - tree var1 = SSA_NAME_VAR (name1); - tree var2 = SSA_NAME_VAR (name2); - var1 = (var1 (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; - var2 = (var2 (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; - if (var1 != var2) - return false; - - /* Now check the types. If the types are the same, then we should - try to coalesce V1 and V2. */ - tree t1 = TREE_TYPE (name1); - tree t2 = TREE_TYPE (name2); - if (t1 == t2) - return true; - - /* If the types are not the same, check for a canonical type match. 
This - (for example) allows coalescing when the types are fundamentally the - same, but just have different names. - - Note pointer types with different address spaces may have the same - canonical type. Those are rejected for coalescing by the - types_compatible_p check. */ - if (TYPE_CANONICAL (t1) -TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) -types_compatible_p (t1, t2)) - return true; - - return false; - } --- 1333,1335 *** a1/gimple.c 2013-09-30 10:52:14.721172704 -0400 --- gimple.c 2013-09-30 11:14:44.074494494 -0400 *** dump_decl_set (FILE *file, bitmap set) *** 4106,4109 --- 4106,4147 fprintf (file, NIL); } + /* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates
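The predicate being moved to gimple.c decides when two SSA names may be coalesced into one partition: same (or no) associated decl, then identical types, then a canonical-type match. A rough stand-alone model of that decision, with plain values standing in for GCC's decl and canonical-type pointers, and omitting the types_compatible_p address-space check the real code performs:

```cpp
#include <string>

// Illustrative stand-in for gimple_can_coalesce_p: the fields below model
// SSA_NAME_VAR, TREE_TYPE, and TYPE_CANONICAL, but are not GCC's types.
struct ssa_name
{
  const void *decl;        // associated decl, or nullptr if none
  std::string type;        // exact type
  std::string canonical;   // canonical type ("" if none)
};

bool can_coalesce (const ssa_name &n1, const ssa_name &n2)
{
  // Only coalesce names with the same decl (or both decl-less).
  if (n1.decl != n2.decl)
    return false;
  // Identical types always coalesce.
  if (n1.type == n2.type)
    return true;
  // Otherwise allow a canonical-type match (types that are fundamentally
  // the same but have different names).
  return !n1.canonical.empty () && n1.canonical == n2.canonical;
}
```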
[patch] move htab_iterator
Jakub didn't like removing this hash table iterator functionality even though it is currently unused in the compiler. So it seems reasonable to put it in tree-hasher.h since its purpose is Hash Table Helper for Trees? bootstraps on build/x86_64-unknown-linux-gnu with no new regressions. OK? Andrew * tree-flow.h (htab_iterator, FOR_EACH_HTAB_ELEMENT): Move from here. * tree-flow-inline.h (first_htab_element, end_htab_p, next_htab_element): Also move from here. * tree-hasher.h (htab_iterator, FOR_EACH_HTAB_ELEMENT, first_htab_element, end_htab_p, next_htab_element): Move to here. *** a1/tree-flow.h 2013-09-30 10:52:14.823172915 -0400 --- tree-flow.h 2013-09-30 11:18:59.790306890 -0400 *** struct GTY(()) gimple_df { *** 92,112 htab_t GTY ((param_is (struct tm_restart_node))) tm_restart; }; - - typedef struct - { - htab_t htab; - PTR *slot; - PTR *limit; - } htab_iterator; - - /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER, -storing each element in RESULT, which is of type TYPE. 
*/ - #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \ - for (RESULT = (TYPE) first_htab_element ((ITER), (HTAB)); \ - !end_htab_p ((ITER)); \ - RESULT = (TYPE) next_htab_element ((ITER))) - static inline int get_lineno (const_gimple); /*--- --- 92,97 *** a1/tree-flow-inline.h 2013-09-30 10:52:14.824172840 -0400 --- tree-flow-inline.h 2013-09-30 11:17:33.996171267 -0400 *** gimple_vop (const struct function *fun) *** 42,93 return fun-gimple_df-vop; } - /* Initialize the hashtable iterator HTI to point to hashtable TABLE */ - - static inline void * - first_htab_element (htab_iterator *hti, htab_t table) - { - hti-htab = table; - hti-slot = table-entries; - hti-limit = hti-slot + htab_size (table); - do - { - PTR x = *(hti-slot); - if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) - break; - } while (++(hti-slot) hti-limit); - - if (hti-slot hti-limit) - return *(hti-slot); - return NULL; - } - - /* Return current non-empty/deleted slot of the hashtable pointed to by HTI, -or NULL if we have reached the end. */ - - static inline bool - end_htab_p (const htab_iterator *hti) - { - if (hti-slot = hti-limit) - return true; - return false; - } - - /* Advance the hashtable iterator pointed to by HTI to the next element of the -hashtable. */ - - static inline void * - next_htab_element (htab_iterator *hti) - { - while (++(hti-slot) hti-limit) - { - PTR x = *(hti-slot); - if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) - return x; - }; - return NULL; - } - /* Get the number of the next statement uid to be allocated. 
*/ static inline unsigned int gimple_stmt_max_uid (struct function *fn) --- 42,47 *** a1/tree-hasher.h 2013-09-30 10:52:14.824172840 -0400 --- tree-hasher.h 2013-09-30 11:17:33.997171769 -0400 *** int_tree_hasher::equal (const value_type *** 52,55 --- 52,119 typedef hash_table int_tree_hasher int_tree_htab_type; + + typedef struct + { + htab_t htab; + PTR *slot; + PTR *limit; + } htab_iterator; + + /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER, +storing each element in RESULT, which is of type TYPE. */ + #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \ + for (RESULT = (TYPE) first_htab_element ((ITER), (HTAB)); \ + !end_htab_p ((ITER)); \ + RESULT = (TYPE) next_htab_element ((ITER))) + + + /* Initialize the hashtable iterator HTI to point to hashtable TABLE */ + + static inline void * + first_htab_element (htab_iterator *hti, htab_t table) + { + hti-htab = table; + hti-slot = table-entries; + hti-limit = hti-slot + htab_size (table); + do + { + PTR x = *(hti-slot); + if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) + break; + } while (++(hti-slot) hti-limit); + + if (hti-slot hti-limit) + return *(hti-slot); + return NULL; + } + + /* Return current non-empty/deleted slot of the hashtable pointed to by HTI, +or NULL if we have reached the end. */ + + static inline bool + end_htab_p (const htab_iterator *hti) + { + if (hti-slot = hti-limit) + return true; + return false; + } + + /* Advance the hashtable iterator pointed to by HTI to the next element of the +hashtable. */ + + static inline void * + next_htab_element (htab_iterator *hti) + { + while (++(hti-slot) hti-limit) + { + PTR x = *(hti-slot); + if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) + return x; + }; + return NULL; + } + + + #endif /* GCC_TREE_HASHER_H */
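The iterator being moved walks the hash table's slot array, skipping the empty and deleted sentinel entries. A simplified stand-alone model of that walk over a plain array (the real code iterates an htab_t; the names and sentinels here are illustrative):

```cpp
#include <cstddef>

// Sentinels standing in for HTAB_EMPTY_ENTRY / HTAB_DELETED_ENTRY.
#define SLOT_EMPTY   ((void *) 0)
#define SLOT_DELETED ((void *) 1)

struct iter { void **slot; void **limit; };

// Advance to the next live slot, or return nullptr at the end.
void *iter_next (iter *it)
{
  while (it->slot < it->limit)
    {
      void *x = *it->slot++;
      if (x != SLOT_EMPTY && x != SLOT_DELETED)
        return x;
    }
  return nullptr;
}

// Position the iterator at the start and return the first live element.
void *iter_first (iter *it, void **slots, std::size_t n)
{
  it->slot = slots;
  it->limit = slots + n;
  return iter_next (it);
}
```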
Re: [PATCH v4 04/20] add configury
On 30 Sep 2013, at 08:45, Paolo Bonzini wrote:

> On 27/09/2013 21:45, Gerald Pfeifer wrote:
>
> > I believe this may be breaking all my testers on FreeBSD
> > (i386-unknown-freebsd10.0 for example). The timing of when this patchset
> > went in fits pretty much when my builds started to break and I am
> > wondering about some code. Here is the failure mode:
> >
> >   gmake[2]: Entering directory `/scratch/tmp/gerald/OBJ-0927-1848/gcc'
> >   g++ -c -DIN_GCC_FRONTEND -g -O2 -DIN_GCC -fno-exceptions -fno-rtti
> >   -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
> >   -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
> >   -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common
> >   -DHAVE_CONFIG_H -I. -Ic -I/scratch/tmp/gerald/gcc-HEAD/gcc ...[-I options]...
> >   -o c/c-lang.o -MT c/c-lang.o -MMD -MP -MF c/.deps/c-lang.TPo
> >   /scratch/tmp/gerald/gcc-HEAD/gcc/c/c-lang.c
> >   cc1plus: error: unrecognized command line option "-Wno-narrowing"
> >   gmake[2]: *** [c/c-lang.o] Error 1
> >   gmake[1]: *** [install-gcc] Error 2
> >   gmake: *** [install] Error 2
> >
> > The issue is the invocation of g++ (the old system compiler, not what we
> > built) with -Wno-narrowing (a new option).
>
> Why is install building anything?

I don't know if the case above is related, but AFAICT, java always builds
ec1 at install time (and I wonder what the reason for that is).

Iain
Re: [PATCH] Enhance phiopt to handle BIT_AND_EXPR
On Mon, Sep 30, 2013 at 2:29 AM, Zhenqiang Chen <zhenqiang.c...@arm.com> wrote:

> Hi,
>
> The patch enhances phiopt to handle cases like:
>
>   if (a == 0 && (...))
>     return 0;
>   return a;
>
> Boot strap and no make check regression on X86-64 and ARM. Is it OK for
> trunk?

From someone who wrote a lot of this code (value_replacement in fact), this
looks good, though I would pull:

+  if (TREE_CODE (gimple_assign_rhs1 (def)) == SSA_NAME)
+    {
+      gimple def1 = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def));
+      if (is_gimple_assign (def1) && gimple_assign_rhs_code (def1) == EQ_EXPR)
+        {
+          tree op0 = gimple_assign_rhs1 (def1);
+          tree op1 = gimple_assign_rhs2 (def1);
+          if ((operand_equal_for_phi_arg_p (arg0, op0)
+               && operand_equal_for_phi_arg_p (arg1, op1))
+              || (operand_equal_for_phi_arg_p (arg0, op1)
+                  && operand_equal_for_phi_arg_p (arg1, op0)))
+            {
+              *code = gimple_assign_rhs_code (def1);
+              return 1;
+            }
+        }
+    }

out into its own function, since it is repeated again for
gimple_assign_rhs2 (def).

Also, what about cascading BIT_AND_EXPRs like:

  if ((a == 0) && (...) && (...))

I notice you don't handle that either.

Thanks,
Andrew Pinski

> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2013-09-30  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
>
>         * tree-ssa-phiopt.c (operand_equal_for_phi_arg_p_1): New.
>         (value_replacement): Move a check to operand_equal_for_phi_arg_p_1.
>
> testsuite/ChangeLog:
> 2013-09-30  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
>
>         * gcc.dg/tree-ssa/phi-opt-11.c: New test case.
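A source-level view of why the transformation is valid (this is the semantics of the rewrite, not GCC internals): when the guard includes `a == 0`, returning 0 on the taken path is the same as returning `a`, so the whole branch collapses away.

```cpp
// Before the phiopt enhancement: an explicit branch guarded by a == 0 && cond.
int before (int a, bool cond)
{
  if (a == 0 && cond)
    return 0;     // on this path a == 0, so this is really "return a"
  return a;
}

// After value_replacement: the branch and the PHI disappear entirely.
int after (int a, bool cond)
{
  (void) cond;
  return a;
}
```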
Re: [patch] move htab_iterator
"Andrew" == Andrew MacLeod <amacl...@redhat.com> writes:

Andrew> Jakub didn't like removing this hash table iterator functionality even
Andrew> though it is currently unused in the compiler.
Andrew> So it seems reasonable to put it in tree-hasher.h since its purpose is
Andrew> "Hash Table Helper for Trees"?

How about putting it into libiberty? That way other hashtab users, like
gdb, can use it.

Tom
Re: [patch] move htab_iterator
On 09/30/2013 12:08 PM, Tom Tromey wrote:

> "Andrew" == Andrew MacLeod <amacl...@redhat.com> writes:
>
> Andrew> Jakub didn't like removing this hash table iterator functionality even
> Andrew> though it is currently unused in the compiler.
> Andrew> So it seems reasonable to put it in tree-hasher.h since its purpose is
> Andrew> "Hash Table Helper for Trees"?
>
> How about putting it into libiberty? That way other hashtab users, like
> gdb, can use it.

I have no problem with that, but Jakub didn't seem to think it belonged
there. I just want to get it out of where it is :-) I'll be happy to move
it to wherever it will have a better home than the non-existence I
originally planned for it :-)

Andrew
Re: [PATCH] Trivial cleanup
On 09/30/13 02:05, Michael Matz wrote:

> Hi,
>
> On Sat, 28 Sep 2013, Andrew MacLeod wrote:
>
> > My example in this form would look something like:
> >
> >   int unsignedsrcp = ptrvar.type().type().type_unsigned();
> >   ...
> >   GimpleType t1 = ptrvar.type ();
> >   GimpleType t2 = t1.type ();
>
> Stop that CamelCase dyslexia already, will you? ;-)

:-) I don't think anyone is suggesting CamelCase for GCC; Andrew has been
using it a lot lately in the reorganizational work so that he can quickly
find things that will need changing again. A grep for "GimpleType" is a
lot more useful than a grep on "gimple" or "type" ;-)

jeff
Re: [PATCH] Trivial cleanup
On 09/28/13 08:31, Andrew MacLeod wrote:

> temps would be OK with me, but there are a couple of concerns.
>
> - I'd want to be able to declare the temps at the point of use, not the
>   top of the function. This would actually help with clarity, I think.
>   Not sure what the current coding standard says about that.

Point of use is fine for GCC now. From our coding conventions:

    Variable Definitions

    Variables should be defined at the point of first use, rather than at
    the top of the function. The existing code obviously does not follow
    that rule, so variables may be defined at the top of the function, as
    in C90. Variables may be simultaneously defined and tested in control
    expressions.

> - the compiler better do an awesome job of sharing stack space for user
>   variables in a function... I wouldn't want to blow up the stack with a
>   bazillion unrelated temps, each with their own location.

If the objects have the same type and disjoint lifetimes, they can be
easily shared. Things are more difficult if the types are different --
IIRC, the root of the problem is that the optimizers can interchange a
load of one type with a later store of the other -- the aliasing code says
"hey, they're different types, so they don't alias, feel free to move them
around as desired" and all hell breaks loose.

Jeff
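To make the stack-sharing point concrete: two temps with the same type and disjoint lifetimes can be given a single stack slot, and declaring them at the point of use keeps their lifetimes small. Whether the slots are actually merged is up to the compiler; this sketch only demonstrates the disjoint scopes:

```cpp
#include <cstring>

// buf1 and buf2 have the same type and non-overlapping lifetimes, so the
// compiler is free to place them in one stack slot rather than two.
int sum_twice (const int *data, int n)
{
  int total = 0;
  {
    int buf1[64];
    std::memcpy (buf1, data, n * sizeof (int));
    for (int i = 0; i < n; i++)
      total += buf1[i];
  }  // buf1's lifetime ends here...
  {
    int buf2[64];  // ...so buf2 may reuse the same stack slot
    std::memcpy (buf2, data, n * sizeof (int));
    for (int i = 0; i < n; i++)
      total += buf2[i];
  }
  return total;
}
```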
Re: [patch] move htab_iterator
Tom> How about putting it into libiberty?
Tom> That way other hashtab users, like gdb, can use it.

Andrew> I have no problem with that, but Jakub didn't seem to think it
Andrew> belonged there.

All I found was this:

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00721.html

Quoting from it:

    It doesn't belong to hashtab.h, because that is a libiberty API, this
    style of iterators is GCC specific.

I think that's an accurate assessment of the current code, but I don't see
why it has to continue to be that way.

My argument in favor of moving it to libiberty is that other programs can
then use it; and furthermore that, since it is tightly tied to the hashtab
implementation, it ought to be maintained there in order to preserve the
module boundary.

So, please reconsider.

thanks,
Tom
[v3] libstdc++/58437
Hi, thus the below are the patches which I'm applying to mainline and 4_7/4_8, respectively. Tested x86_64-linux. Thanks, Paolo. / 2013-09-30 Chris Jefferson ch...@bubblescope.net PR libstdc++/58437 * include/bits/stl_algo.h (__move_median_first): Rename to __move_median_to_first, change to take an addition argument. (__unguarded_partition_pivot): Adjust. * testsuite/performance/25_algorithms/sort.cc: New. * testsuite/performance/25_algorithms/sort_heap.cc: Likewise. * testsuite/performance/25_algorithms/stable_sort.cc: Likewise. Index: include/bits/stl_algo.h === --- include/bits/stl_algo.h (revision 203034) +++ include/bits/stl_algo.h (working copy) @@ -72,25 +72,27 @@ { _GLIBCXX_BEGIN_NAMESPACE_VERSION - /// Swaps the median value of *__a, *__b and *__c under __comp to *__a + /// Swaps the median value of *__a, *__b and *__c under __comp to *__result templatetypename _Iterator, typename _Compare void -__move_median_first(_Iterator __a, _Iterator __b, _Iterator __c, - _Compare __comp) +__move_median_to_first(_Iterator __result,_Iterator __a, _Iterator __b, + _Iterator __c, _Compare __comp) { if (__comp(__a, __b)) { if (__comp(__b, __c)) - std::iter_swap(__a, __b); + std::iter_swap(__result, __b); else if (__comp(__a, __c)) - std::iter_swap(__a, __c); + std::iter_swap(__result, __c); + else + std::iter_swap(__result, __a); } else if (__comp(__a, __c)) - return; + std::iter_swap(__result, __a); else if (__comp(__b, __c)) - std::iter_swap(__a, __c); + std::iter_swap(__result, __c); else - std::iter_swap(__a, __b); + std::iter_swap(__result, __b); } /// This is an overload used by find algos for the Input Iterator case. 
@@ -1915,7 +1917,8 @@ _RandomAccessIterator __last, _Compare __comp) { _RandomAccessIterator __mid = __first + (__last - __first) / 2; - std::__move_median_first(__first, __mid, (__last - 1), __comp); + std::__move_median_to_first(__first, __first + 1, __mid, (__last - 2), + __comp); return std::__unguarded_partition(__first + 1, __last, __first, __comp); } Index: testsuite/performance/25_algorithms/sort.cc === --- testsuite/performance/25_algorithms/sort.cc (revision 0) +++ testsuite/performance/25_algorithms/sort.cc (working copy) @@ -0,0 +1,65 @@ +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. 
+ +#include vector +#include algorithm +#include testsuite_performance.h + +int main() +{ + using namespace __gnu_test; + + time_counter time; + resource_counter resource; + + const int max_size = 1000; + + std::vectorint v(max_size); + + for (int i = 0; i max_size; ++i) +v[i] = -i; + + start_counters(time, resource); + std::sort(v.begin(), v.end()); + stop_counters(time, resource); + + report_performance(__FILE__, reverse, time, resource); + clear_counters(time, resource); + + for (int i = 0; i max_size; ++i) +v[i] = i; + + start_counters(time, resource); + std::sort(v.begin(), v.end()); + stop_counters(time, resource); + + report_performance(__FILE__, forwards, time, resource); + clear_counters(time, resource); + + // a simple psuedo-random series which does not rely on rand() and friends + v[0] = 0; + for (int i = 1; i max_size; ++i) +v[i] = (v[i-1] + 110211473) * 745988807; + + start_counters(time, resource); + std::sort(v.begin(), v.end()); + stop_counters(time, resource); + + report_performance(__FILE__, random, time, resource); + + return 0; +} Index: testsuite/performance/25_algorithms/sort_heap.cc === --- testsuite/performance/25_algorithms/sort_heap.cc(revision 0) +++ testsuite/performance/25_algorithms/sort_heap.cc(working copy) @@ -0,0 +1,73 @@ +// Copyright (C) 2013 Free Software Foundation, Inc. +// +//
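The core of the fix is the renamed helper, which moves the median of three elements to a caller-chosen destination instead of always swapping it into the first of the three. A stand-alone sketch of its case analysis, simplified to use `operator<` in place of the `__comp` functor (this mirrors the patch's logic but is not libstdc++'s code):

```cpp
#include <algorithm>

// Move the median of *a, *b, *c to *result.  Each branch is annotated with
// the ordering it covers.
template <typename It>
void move_median_to_first (It result, It a, It b, It c)
{
  if (*a < *b)
    {
      if (*b < *c)
        std::iter_swap (result, b);   // a < b < c  : median is b
      else if (*a < *c)
        std::iter_swap (result, c);   // a < c <= b : median is c
      else
        std::iter_swap (result, a);   // c <= a < b : median is a
    }
  else if (*a < *c)
    std::iter_swap (result, a);       // b <= a < c : median is a
  else if (*b < *c)
    std::iter_swap (result, c);       // b < c <= a : median is c
  else
    std::iter_swap (result, b);       // c <= b <= a: median is b
}
```

Selecting the median into a separate destination is what lets `__unguarded_partition_pivot` keep the first element as a sentinel while still using a median-of-three pivot.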
Re: [patch] move htab_iterator
On 09/30/2013 01:02 PM, Tom Tromey wrote: Tom How about putting it into libiberty? Tom That way other hashtab users, like gdb, can use it. Andrew I have no problem with that, but Jakub didn't seem to think it Andrew belonged there. All I found was this: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00721.html Quoting from it: It doesn't belong to hashtab.h, because that is a libiberty API, this style of iterators is GCC specific. I think that's an accurate assessment of the current code, but I don't see why it has to continue to be that way. My argument in favor of moving it to libiberty is that other programs can then use it; and furthermore that since it is tightly tied to the hashtab implementation, it ought to be maintained there in order to preserve the module boundary. So, please reconsider. Sure, how's this? And who has to approve the libiberty bits? Bootstrapping now... but since its unused I doubt that will be an issue :-)... Andrew gcc * tree-flow.h (htab_iterator, FOR_EACH_HTAB_ELEMENT): Move from here. * tree-flow-inline.h (first_htab_element, end_htab_p, next_htab_element): Also move from here. include * hashtab.h (htab_iterator, FOR_EACH_HTAB_ELEMENT, first_htab_element, end_htab_p, next_htab_element): Move to here. Change boolean to int and 0/1. Index: gcc/tree-flow.h === *** gcc/tree-flow.h (revision 203034) --- gcc/tree-flow.h (working copy) *** struct GTY(()) gimple_df { *** 92,112 htab_t GTY ((param_is (struct tm_restart_node))) tm_restart; }; - - typedef struct - { - htab_t htab; - PTR *slot; - PTR *limit; - } htab_iterator; - - /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER, -storing each element in RESULT, which is of type TYPE. 
*/ - #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \ - for (RESULT = (TYPE) first_htab_element ((ITER), (HTAB)); \ - !end_htab_p ((ITER)); \ - RESULT = (TYPE) next_htab_element ((ITER))) - /* It is advantageous to avoid things like life analysis for variables which do not need PHI nodes. This enum describes whether or not a particular variable may need a PHI node. */ --- 92,97 Index: gcc/tree-flow-inline.h === *** gcc/tree-flow-inline.h (revision 203034) --- gcc/tree-flow-inline.h (working copy) *** gimple_vop (const struct function *fun) *** 42,93 return fun-gimple_df-vop; } - /* Initialize the hashtable iterator HTI to point to hashtable TABLE */ - - static inline void * - first_htab_element (htab_iterator *hti, htab_t table) - { - hti-htab = table; - hti-slot = table-entries; - hti-limit = hti-slot + htab_size (table); - do - { - PTR x = *(hti-slot); - if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) - break; - } while (++(hti-slot) hti-limit); - - if (hti-slot hti-limit) - return *(hti-slot); - return NULL; - } - - /* Return current non-empty/deleted slot of the hashtable pointed to by HTI, -or NULL if we have reached the end. */ - - static inline bool - end_htab_p (const htab_iterator *hti) - { - if (hti-slot = hti-limit) - return true; - return false; - } - - /* Advance the hashtable iterator pointed to by HTI to the next element of the -hashtable. */ - - static inline void * - next_htab_element (htab_iterator *hti) - { - while (++(hti-slot) hti-limit) - { - PTR x = *(hti-slot); - if (x != HTAB_EMPTY_ENTRY x != HTAB_DELETED_ENTRY) - return x; - }; - return NULL; - } - /* Get the number of the next statement uid to be allocated. */ static inline unsigned int gimple_stmt_max_uid (struct function *fn) --- 42,47 Index: include/hashtab.h === *** include/hashtab.h (revision 203034) --- include/hashtab.h (working copy) *** extern hashval_t iterative_hash (const v *** 202,207 --- 202,270 /* Shorthand for hashing something with an intrinsic size. 
*/ #define iterative_hash_object(OB,INIT) iterative_hash (OB, sizeof (OB), INIT) + /* GCC style hash table iterator. */ + + typedef struct + { + htab_t htab; + PTR *slot; + PTR *limit; + } htab_iterator; + + /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER, +storing each element in RESULT, which is of type TYPE. */ + #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \ + for (RESULT = (TYPE) first_htab_element ((ITER), (HTAB)); \ + !end_htab_p ((ITER)); \ + RESULT = (TYPE) next_htab_element ((ITER))) + + + /* Initialize the hashtable iterator HTI to point to hashtable TABLE */ + + static inline void * + first_htab_element (htab_iterator *hti, htab_t table) + { + hti-htab = table; + hti-slot = table-entries; + hti-limit = hti-slot + htab_size (table); + do + { + PTR x =
Go patch committed: Use backend interface for variable expressions
This patch from Chris Manghane changes the Go frontend to use the backend interface for variable expressions. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.8 branch. Ian 2013-09-30 Chris Manghane cm...@google.com * go-gcc.cc (Backend::error_expression): New function. (Backend::var_expression): New function. (Backend::indirect_expression): New function. Index: gcc/go/go-gcc.cc === --- gcc/go/go-gcc.cc (revision 202753) +++ gcc/go/go-gcc.cc (working copy) @@ -208,6 +208,16 @@ class Gcc_backend : public Backend Bexpression* zero_expression(Btype*); + Bexpression* + error_expression() + { return this-make_expression(error_mark_node); } + + Bexpression* + var_expression(Bvariable* var, Location); + + Bexpression* + indirect_expression(Bexpression* expr, bool known_valid, Location); + // Statements. Bstatement* @@ -848,6 +858,30 @@ Gcc_backend::zero_expression(Btype* btyp return tree_to_expr(ret); } +// An expression that references a variable. + +Bexpression* +Gcc_backend::var_expression(Bvariable* var, Location) +{ + tree ret = var-get_tree(); + if (ret == error_mark_node) +return this-error_expression(); + return tree_to_expr(ret); +} + +// An expression that indirectly references an expression. + +Bexpression* +Gcc_backend::indirect_expression(Bexpression* expr, bool known_valid, + Location location) +{ + tree ret = build_fold_indirect_ref_loc(location.gcc_location(), + expr-get_tree()); + if (known_valid) +TREE_THIS_NOTRAP(ret) = 1; + return tree_to_expr(ret); +} + // An expression as a statement. 
Bstatement* Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc (revision 202753) +++ gcc/go/gofrontend/expressions.cc (working copy) @@ -978,22 +978,19 @@ Var_expression::do_get_tree(Translate_co { Bvariable* bvar = this-variable_-get_backend_variable(context-gogo(), context-function()); - tree ret = var_to_tree(bvar); - if (ret == error_mark_node) -return error_mark_node; bool is_in_heap; + Location loc = this-location(); if (this-variable_-is_variable()) is_in_heap = this-variable_-var_value()-is_in_heap(); else if (this-variable_-is_result_variable()) is_in_heap = this-variable_-result_var_value()-is_in_heap(); else go_unreachable(); + + Bexpression* ret = context-backend()-var_expression(bvar, loc); if (is_in_heap) -{ - ret = build_fold_indirect_ref_loc(this-location().gcc_location(), ret); - TREE_THIS_NOTRAP(ret) = 1; -} - return ret; +ret = context-backend()-indirect_expression(ret, true, loc); + return expr_to_tree(ret); } // Ast dump for variable expression. Index: gcc/go/gofrontend/backend.h === --- gcc/go/gofrontend/backend.h (revision 202753) +++ gcc/go/gofrontend/backend.h (working copy) @@ -231,6 +231,22 @@ class Backend virtual Bexpression* zero_expression(Btype*) = 0; + // Create an error expression. This is used for cases which should + // not occur in a correct program, in order to keep the compilation + // going without crashing. + virtual Bexpression* + error_expression() = 0; + + // Create a reference to a variable. + virtual Bexpression* + var_expression(Bvariable* var, Location) = 0; + + // Create an expression that indirects through the pointer expression EXPR + // (i.e., return the expression for *EXPR). KNOWN_VALID is true if the pointer + // is known to point to a valid memory location. + virtual Bexpression* + indirect_expression(Bexpression* expr, bool known_valid, Location) = 0; + // Statements. // Create an error statement. This is used for cases which should
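The pattern behind `error_expression` is worth spelling out: the backend hands out a distinguished error expression for bad input so that compilation can keep going without crashing, and later calls simply propagate it. A minimal model of that pattern with stand-in classes (the names mirror the patch, but these are not GCC's types):

```cpp
// Stand-ins for Bexpression / Bvariable: flags model "is the error
// expression" and "is an indirect reference".
struct Bexpression { bool error; bool indirect; };
struct Bvariable  { bool valid; };

class Backend
{
  Bexpression error_expr_{true, false};

public:
  // A single shared error expression, as in the patch's error_expression().
  Bexpression *error_expression () { return &error_expr_; }

  // A bad variable yields the error expression instead of crashing,
  // mirroring the error_mark_node check in var_expression.
  Bexpression *var_expression (Bvariable *var)
  {
    if (var == nullptr || !var->valid)
      return error_expression ();
    static Bexpression ok{false, false};
    return &ok;
  }

  // Wrap an expression in an indirection; errors propagate unchanged.
  Bexpression *indirect_expression (Bexpression *expr)
  {
    if (expr->error)
      return expr;
    static Bexpression ind{false, true};
    return &ind;
  }
};
```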
[Patch] Add missing profile updates to jump threading code
The jump threading handling in the case of a joiner block was not updating profile information (it was being updated in the non-joiner case). Added profile updates for the joiner case, in one place by commoning the handling between the joiner and non-joiner cases. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa

2013-09-30  Teresa Johnson  tejohn...@google.com

	* tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Update
	redirected out edge count in joiner case.
	(ssa_redirect_edges): Common the joiner and non-joiner cases
	so that joiner case gets profile updates.
	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c (expand_one_var):
	Update for additional dump message.

Index: tree-ssa-threadupdate.c
===
--- tree-ssa-threadupdate.c	(revision 202947)
+++ tree-ssa-threadupdate.c	(working copy)
@@ -403,6 +403,7 @@ ssa_fix_duplicate_block_edges (struct redirection_
	 threading through.  That's the edge we want to redirect.  */
       victim = find_edge (rd->dup_block, THREAD_TARGET (e)->dest);
       e2 = redirect_edge_and_branch (victim, THREAD_TARGET2 (e)->dest);
+      e2->count = THREAD_TARGET2 (e)->count;

       /* If we redirected the edge, then we need to copy PHI arguments
	 at the target.  If the edge already existed (e2 != victim case),
@@ -497,18 +498,8 @@ ssa_redirect_edges (struct redirection_data **slot
	  free (el);
	  thread_stats.num_threaded_edges++;

-	  /* If we are threading through a joiner block, then we have to
-	     find the edge we want to redirect and update some PHI nodes.  */
-	  if (THREAD_TARGET2 (e))
-	    {
-	      edge e2;
-	      /* We want to redirect the incoming edge to the joiner block (E)
-		 to instead reach the duplicate of the joiner block.  */
-	      e2 = redirect_edge_and_branch (e, rd->dup_block);
-	      flush_pending_stmts (e2);
-	    }
-	  else if (rd->dup_block)
+	  if (rd->dup_block)
	    {
	      edge e2;

@@ -522,9 +513,15 @@ ssa_redirect_edges (struct redirection_data **slot
		 the computation overflows.  */
	      if (rd->dup_block->frequency < BB_FREQ_MAX * 2)
		rd->dup_block->frequency += EDGE_FREQUENCY (e);
-	      EDGE_SUCC (rd->dup_block, 0)->count += e->count;
-	      /* Redirect the incoming edge to the appropriate duplicate
-		 block.  */
+
+	      /* In the case of threading through a joiner block, the outgoing
+		 edges from the duplicate block were updated when they were
+		 redirected during ssa_fix_duplicate_block_edges.  */
+	      if (!THREAD_TARGET2 (e))
+		EDGE_SUCC (rd->dup_block, 0)->count += e->count;
+
+	      /* Redirect the incoming edge (possibly to the joiner block) to
+		 the appropriate duplicate block.  */
	      e2 = redirect_edge_and_branch (e, rd->dup_block);
	      gcc_assert (e == e2);
	      flush_pending_stmts (e2);
Index: testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c
===
--- testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c	(revision 202947)
+++ testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c	(working copy)
@@ -42,7 +42,7 @@ expand_one_var (tree var, unsigned char toplevel,
     abort ();
 }
 /* We should thread the jump, through an intermediate block.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 1 "dom1"} } */
+/* { dg-final { scan-tree-dump-times "Threaded" 2 "dom1"} } */
 /* { dg-final { scan-tree-dump-times "Registering jump thread: \\(.*\\) incoming edge; \\(.*\\) joiner; \\(.*\\) nocopy;" 1 "dom1"} } */
 /* { dg-final { cleanup-tree-dump "dom1" } } */
--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Patch] Add missing profile updates to jump threading code
On 09/30/13 12:52, Teresa Johnson wrote: The jump threading handling in the case of a joiner block was not updating profile information (it was being updated in the non-joiner case). Added profile updates for the joiner case, in one place by commoning the handling between the joiner and non-joiner cases. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2013-09-30 Teresa Johnson tejohn...@google.com * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Update redirected out edge count in joiner case. (ssa_redirect_edges): Common the joiner and non-joiner cases so that joiner case gets profile updates. * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c (expand_one_var): Update for additional dump message. Thanks for the fix cleanup. OK for the trunk. Jeff
Re: RFA: GCC Testsuite: Annotate compile tests that need at least 32-bit pointers/integers
On Sep 30, 2013, at 7:23 AM, Nick Clifton ni...@redhat.com wrote: Several tests in the gcc.c-torture/compile directory need a target with 32-bit integers and/or 32-bit pointers. OK to apply ?

Ok. It may be reasonable to special case ptr32plus to say no on your platform; from check_effective_target_tls_native, we see code like:

proc check_effective_target_tls_native {} {
    # VxWorks uses emulated TLS machinery, but with non-standard helper
    # functions, so we fail to automatically detect it.
    if { [istarget *-*-vxworks*] } {
	return 0
    }
    return [check_no_messages_and_pattern tls_native "!emutls" assembly {
	__thread int i;
	int f (void) { return i; }
	void g (int j) { i = j; }
    }]
}

so, instead of:

proc check_effective_target_ptr32plus { } {
    return [check_no_compiler_messages ptr32plus object {
	int dummy[sizeof (void *) >= 4 ? 1 : -1];
    }]
}

you could do something like:

proc check_effective_target_ptr32plus { } {
    # msp430 never really has 32 or more bits in a pointer.
    if { [istarget msp430-*-*] } {
	return 0
    }
    return [check_no_compiler_messages ptr32plus object {
	int dummy[sizeof (void *) >= 4 ? 1 : -1];
    }]
}

Then you don't have to worry about people adding tests with this predicate and those test cases failing. I don't have a good handle on whether this is better or not, so I'll let you decide what you think is best.
Re: RFA: GCC Testsuite: Annotate compile tests that need at least 32-bit pointers/integers
On 09/30/13 13:42, Mike Stump wrote: On Sep 30, 2013, at 7:23 AM, Nick Clifton ni...@redhat.com wrote: Several tests in the gcc.c-torture/compile directory need a target with 32-bit integers and/or 32-bit pointers. OK to apply ? Ok. It may be reasonable to special case ptr32plus to say no on your platform, from check_effective_target_tls_native, we see code like: I'd tend to prefer this as well. It's really a failing that ptr32plus can't reasonably detect that when a target uses PSImode for pointers. Special casing the msp port in that code seems reasonable to me. jeff
Re: [PATCH]: Fix use of __builtin_eh_pointer in EH_ELSE
On 09/30/2013 03:24 AM, Tristan Gingold wrote: 2013-09-03 Tristan Gingold ging...@adacore.com * tree.c (set_call_expr_flags): Reject ECF_TM_PURE. (build_common_builtin_nodes): Set transaction_pure attribute on __builtin_eh_pointer function type (and not on its declaration). Ok. r~
Re: [Patch] Add missing profile updates to jump threading code
On 09/30/13 13:07, Jeff Law wrote: On 09/30/13 12:52, Teresa Johnson wrote: The jump threading handling in the case of a joiner block was not updating profile information (it was being updated in the non-joiner case). Added profile updates for the joiner case, in one place by commoning the handling between the joiner and non-joiner cases. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2013-09-30 Teresa Johnson tejohn...@google.com * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Update redirected out edge count in joiner case. (ssa_redirect_edges): Common the joiner and non-joiner cases so that joiner case gets profile updates. * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c (expand_one_var): Update for additional dump message. Thanks for the fix cleanup. OK for the trunk. BTW, I'm going to go ahead and check this in -- it's conflicts with a patch that I was trying to wrap up today... jeff
Re: [Patch] Add missing profile updates to jump threading code
Oh, I can do that right now if you want - let me know if you haven't hit the trigger yet. Thanks, Teresa On Mon, Sep 30, 2013 at 1:06 PM, Jeff Law l...@redhat.com wrote: On 09/30/13 13:07, Jeff Law wrote: On 09/30/13 12:52, Teresa Johnson wrote: The jump threading handling in the case of a joiner block was not updating profile information (it was being updated in the non-joiner case). Added profile updates for the joiner case, in one place by commoning the handling between the joiner and non-joiner cases. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2013-09-30 Teresa Johnson tejohn...@google.com * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Update redirected out edge count in joiner case. (ssa_redirect_edges): Common the joiner and non-joiner cases so that joiner case gets profile updates. * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c (expand_one_var): Update for additional dump message. Thanks for the fix cleanup. OK for the trunk. BTW, I'm going to go ahead and check this in -- it's conflicts with a patch that I was trying to wrap up today... jeff -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Patch] Add missing profile updates to jump threading code
Nevermind - see that you did this already. Thanks! Teresa On Mon, Sep 30, 2013 at 1:11 PM, Teresa Johnson tejohn...@google.com wrote: Oh, I can do that right now if you want - let me know if you haven't hit the trigger yet. Thanks, Teresa On Mon, Sep 30, 2013 at 1:06 PM, Jeff Law l...@redhat.com wrote: On 09/30/13 13:07, Jeff Law wrote: On 09/30/13 12:52, Teresa Johnson wrote: The jump threading handling in the case of a joiner block was not updating profile information (it was being updated in the non-joiner case). Added profile updates for the joiner case, in one place by commoning the handling between the joiner and non-joiner cases. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2013-09-30 Teresa Johnson tejohn...@google.com * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Update redirected out edge count in joiner case. (ssa_redirect_edges): Common the joiner and non-joiner cases so that joiner case gets profile updates. * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-3.c (expand_one_var): Update for additional dump message. Thanks for the fix cleanup. OK for the trunk. BTW, I'm going to go ahead and check this in -- it's conflicts with a patch that I was trying to wrap up today... jeff -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Patch] Add missing profile updates to jump threading code
On 09/30/13 14:11, Teresa Johnson wrote: Oh, I can do that right now if you want - let me know if you haven't hit the trigger yet. Thanks, Already done :-0 jeff
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
As per my previous comments on this patch, I will not approve the changes to the m32c backend, as they will cause real bugs in real hardware, and violate the hardware's ABI. The user may use -fno-strict-volatile-bitfields if they do not desire this behavior and understand the consequences. I am not a maintainer for the rx and h8300 ports, but they are in the same situation. To reiterate my core position: if the user defines a proper volatile int bitfield, and the compiler does anything other than an int-sized access, the compiler is WRONG. Any optimization that changes volatile accesses to something other than what the user specified is a bug that needs to be fixed before this option can be non-default.
[C++ Patch] PR 58563
Hi, this ICE seems easy to avoid: just check the return value of make_typename_type for error_mark_node, like we normally do everywhere else in the parser. Tested x86_64-linux. Thanks, Paolo.

///
/cp
2013-09-30  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/58563
	* parser.c (cp_parser_lookup_name): Check make_typename_type
	return value for error_mark_node.

/testsuite
2013-09-30  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/58563
	* g++.dg/cpp0x/pr58563.C: New.

Index: cp/parser.c
===
--- cp/parser.c	(revision 203037)
+++ cp/parser.c	(working copy)
@@ -21756,7 +21756,8 @@ cp_parser_lookup_name (cp_parser *parser, tree nam
	     is dependent.  */
	  type = make_typename_type (parser->scope, name, tag_type,
				     /*complain=*/tf_error);
-	  decl = TYPE_NAME (type);
+	  if (type != error_mark_node)
+	    decl = TYPE_NAME (type);
	}
      else if (is_template
	       && (cp_parser_next_token_ends_template_argument_p (parser)
Index: testsuite/g++.dg/cpp0x/pr58563.C
===
--- testsuite/g++.dg/cpp0x/pr58563.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr58563.C	(working copy)
@@ -0,0 +1,8 @@
+// PR c++/58563
+// { dg-do compile { target c++11 } }
+
+template<int> void foo()
+{
+  enum E {};
+  E().E::~T();	// { dg-error "not a class" }
+}
Re: RFA: Use m_foo rather than foo_ for member variables
Richard Biener richard.guent...@gmail.com writes: On Sun, Sep 29, 2013 at 11:08 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Michael Matz m...@suse.de writes: Trever Saunders tsaund...@mozilla.com writes: Richard Biener richard.guent...@gmail.com writes: Btw, I've come around multiple coding-styles in the past and I definitely would prefer m_mode / m_count to mark members vs. mode_ and count_. (and s_XXX for static members IIRC). I'd prefer m_/s_foo for members / static things too fwiw. Me as well. It's still ugly, but not so unsymmetric as the trailing underscore. Well, I'm not sure how I came to be the one writing these patches, but I suppose I prefer m_foo too. So how about the attached? The first patch has changes to the coding conventions. I added some missing spaces while there. The second patch has the mechanical code changes. The reason for yesterday's mass adding of spaces was because the second patch would have been pretty inconsistent otherwise. Tested on x86_64-linux-gnu. Ok. Applied, thanks. I was only looking for private and protected members, so I ended up missing vec. I installed the patch below as an obvious follow-up. Tested on x86_64-linux-gnu. There are some other uses foo_, but many of them seem to date from before the C++ switchover. Richard gcc/ * vec.h (vec_prefix, vec): Prefix member names with m_. * vec.c (vec_prefix::calculate_allocation): Update accordingly. 
Index: gcc/vec.c
===
--- gcc/vec.c	2013-09-27 09:16:58.010299213 +0100
+++ gcc/vec.c	2013-09-30 18:09:02.892316820 +0100
@@ -183,8 +183,8 @@ vec_prefix::calculate_allocation (vec_pr
   if (pfx)
     {
-      alloc = pfx->alloc_;
-      num = pfx->num_;
+      alloc = pfx->m_alloc;
+      num = pfx->m_num;
     }
   else if (!reserve)
     /* If there's no vector, and we've not requested anything, then we
Index: gcc/vec.h
===
--- gcc/vec.h	2013-09-30 18:06:22.236575959 +0100
+++ gcc/vec.h	2013-09-30 18:06:22.305576705 +0100
@@ -235,8 +235,8 @@ struct vec_prefix
   friend struct va_heap;
   friend struct va_stack;

-  unsigned alloc_;
-  unsigned num_;
+  unsigned m_alloc;
+  unsigned m_num;
 };

 template<typename, typename, typename> struct vec;
@@ -285,7 +285,7 @@ va_heap::reserve (vec<T, va_heap, vl_emb
		  MEM_STAT_DECL)
 {
   unsigned alloc
-    = vec_prefix::calculate_allocation (v ? &v->vecpfx_ : 0, reserve, exact);
+    = vec_prefix::calculate_allocation (v ? &v->m_vecpfx : 0, reserve, exact);
   if (!alloc)
     {
       release (v);
@@ -293,7 +293,7 @@ va_heap::reserve (vec<T, va_heap, vl_emb
     }

   if (GATHER_STATISTICS && v)
-    v->vecpfx_.release_overhead ();
+    v->m_vecpfx.release_overhead ();

   size_t size = vec<T, va_heap, vl_embed>::embedded_size (alloc);
   unsigned nelem = v ? v->length () : 0;
@@ -301,7 +301,7 @@ va_heap::reserve (vec<T, va_heap, vl_emb
   v->embedded_init (alloc, nelem);

   if (GATHER_STATISTICS)
-    v->vecpfx_.register_overhead (size FINAL_PASS_MEM_STAT);
+    v->m_vecpfx.register_overhead (size FINAL_PASS_MEM_STAT);
 }

@@ -315,7 +315,7 @@ va_heap::release (vec<T, va_heap, vl_emb
     return;

   if (GATHER_STATISTICS)
-    v->vecpfx_.release_overhead ();
+    v->m_vecpfx.release_overhead ();
   ::free (v);
   v = NULL;
 }
@@ -364,7 +364,7 @@ va_gc::reserve (vec<T, A, vl_embed> *&v,
		MEM_STAT_DECL)
 {
   unsigned alloc
-    = vec_prefix::calculate_allocation (v ? &v->vecpfx_ : 0, reserve, exact);
+    = vec_prefix::calculate_allocation (v ? &v->m_vecpfx : 0, reserve, exact);
   if (!alloc)
     {
       ::ggc_free (v);
@@ -433,9 +433,9 @@ void unregister_stack_vec (unsigned);
 va_stack::alloc (vec<T, va_stack, vl_ptr> &v, unsigned nelems,
		  vec<T, va_stack, vl_embed> *space)
 {
-  v.vec_ = space;
-  register_stack_vec (static_cast<void *> (v.vec_));
-  v.vec_->embedded_init (nelems, 0);
+  v.m_vec = space;
+  register_stack_vec (static_cast<void *> (v.m_vec));
+  v.m_vec->embedded_init (nelems, 0);
 }

@@ -462,16 +462,16 @@ va_stack::reserve (vec<T, va_stack, vl_e
     }

   /* Move VEC_ to the heap.  */
-  nelems += v->vecpfx_.num_;
+  nelems += v->m_vecpfx.m_num;
   vec<T, va_stack, vl_embed> *oldvec = v;
   v = NULL;
   va_heap::reserve (reinterpret_cast<vec<T, va_heap, vl_embed> *&> (v), nelems,
		     exact PASS_MEM_STAT);
   if (v && oldvec)
     {
-      v->vecpfx_.num_ = oldvec->length ();
-      memcpy (v->vecdata_,
-	      oldvec->vecdata_,
+      v->m_vecpfx.m_num = oldvec->length ();
+      memcpy (v->m_vecdata,
+	      oldvec->m_vecdata,
	      oldvec->length () * sizeof (T));
     }
 }
@@ -562,11 +562,11 @@ struct vnull
 struct GTY((user)) vec<T, A, vl_embed>
 {
 public:
-  unsigned allocated (void) const { return vecpfx_.alloc_; }
-  unsigned length (void) const { return vecpfx_.num_; }
-  bool is_empty (void) const { return
[PATCH, PowerPC] Change code generation for VSX loads and stores for little endian
This patch implements support for VSX vector loads and stores in little endian mode. VSX loads and stores permute the register image with respect to storage in a unique manner that is not truly little endian. This can cause problems (for example, when a vector appears in a union with a different type). This patch adds an explicit permute to each VSX load and store instruction so that the register image is true little endian. It is desirable to remove redundant pairs of permutes where legal to do so. This work is delayed to a later patch. This patch currently has no effect on generated code because -mvsx is disabled in little endian mode, pending fixes of additional problems with little endian code generation. I tested this by enabling -mvsx in little endian mode and running the regression bucket. Using a GCC code base from August 5, I observed that this patch corrected 187 failures and exposed a new regression. Investigation showed that the regression is not directly related to the patch. Unfortunately the results are not as good on current trunk. It appears we have introduced some more problems for little endian code generation since August 5th, which hides the effectiveness of the patch; most of the VSX vector tests still fail with the patch applied to current trunk. There are a handful of additional regressions, which again are not directly related to the patch. I feel that the patch is well-tested by the August 5 results, and would like to commit it before continuing to investigate the recently introduced problems. I also bootstrapped and tested the patch on a big-endian machine (powerpc64-unknown-linux-gnu) to verify that I introduced no regressions in that environment. Ok for trunk? Thanks, Bill gcc: 2013-09-30 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/vector.md (movmode): Emit permuted move sequences for LE VSX loads and stores at expand time. * config/rs6000/rs6000-protos.h (rs6000_emit_le_vsx_move): New prototype. 
* config/rs6000/rs6000.c (rs6000_const_vec): New. (rs6000_gen_le_vsx_permute): New. (rs6000_gen_le_vsx_load): New. (rs6000_gen_le_vsx_store): New. (rs6000_gen_le_vsx_move): New. * config/rs6000/vsx.md (*vsx_le_perm_load_v2di): New. (*vsx_le_perm_load_v4si): New. (*vsx_le_perm_load_v8hi): New. (*vsx_le_perm_load_v16qi): New. (*vsx_le_perm_store_v2di): New. (*vsx_le_perm_store_v4si): New. (*vsx_le_perm_store_v8hi): New. (*vsx_le_perm_store_v16qi): New. (*vsx_xxpermdi2_le_mode): New. (*vsx_xxpermdi4_le_mode): New. (*vsx_xxpermdi8_le_V8HI): New. (*vsx_xxpermdi16_le_V16QI): New. (*vsx_lxvd2x2_le_mode): New. (*vsx_lxvd2x4_le_mode): New. (*vsx_lxvd2x8_le_V8HI): New. (*vsx_lxvd2x16_le_V16QI): New. (*vsx_stxvd2x2_le_mode): New. (*vsx_stxvd2x4_le_mode): New. (*vsx_stxvd2x8_le_V8HI): New. (*vsx_stxvd2x16_le_V16QI): New. gcc/testsuite: 2013-09-30 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.target/powerpc/pr43154.c: Skip for ppc64 little endian. * gcc.target/powerpc/fusion.c: Likewise. Index: gcc/testsuite/gcc.target/powerpc/pr43154.c === --- gcc/testsuite/gcc.target/powerpc/pr43154.c (revision 203018) +++ gcc/testsuite/gcc.target/powerpc/pr43154.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile { target { powerpc*-*-* } } } */ /* { dg-skip-if { powerpc*-*-darwin* } { * } { } } */ +/* { dg-skip-if { powerpc*le-*-* } { * } { } } */ /* { dg-require-effective-target powerpc_vsx_ok } */ /* { dg-options -O2 -mcpu=power7 } */ Index: gcc/testsuite/gcc.target/powerpc/fusion.c === --- gcc/testsuite/gcc.target/powerpc/fusion.c (revision 203018) +++ gcc/testsuite/gcc.target/powerpc/fusion.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile { target { powerpc*-*-* } } } */ /* { dg-skip-if { powerpc*-*-darwin* } { * } { } } */ +/* { dg-skip-if { powerpc*le-*-* } { * } { } } */ /* { dg-require-effective-target powerpc_p8vector_ok } */ /* { dg-options -mcpu=power7 -mtune=power8 -O3 } */ Index: gcc/config/rs6000/vector.md === --- gcc/config/rs6000/vector.md (revision 203018) +++ 
gcc/config/rs6000/vector.md	(working copy)
@@ -88,7 +88,8 @@
    (smax "smax")])

-;; Vector move instructions.
+;; Vector move instructions.  Little-endian VSX loads and stores require
+;; special handling to circumvent element endianness.
 (define_expand "mov<mode>"
   [(set (match_operand:VEC_M 0 "nonimmediate_operand" "")
	(match_operand:VEC_M 1 "any_operand" ""))]
@@ -104,6 +105,15 @@
	  && !vlogical_operand (operands[1], <MODE>mode))
	operands[1] = force_reg (<MODE>mode, operands[1]);
     }
+
libgo patch committed: Fix reflect.Call passing function
There was a bug in the libgo implementation of reflect.Call when passing a function following a non-pointer type. This patch fixes the bug and adds a test. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.8 branch. Ian

diff -r 7c369498bb81 libgo/go/reflect/all_test.go
--- a/libgo/go/reflect/all_test.go	Mon Sep 30 11:06:06 2013 -0700
+++ b/libgo/go/reflect/all_test.go	Mon Sep 30 20:03:54 2013 -0700
@@ -2406,6 +2406,15 @@
 	}
 }

+func TestFuncArg(t *testing.T) {
+	f1 := func(i int, f func(int) int) int { return f(i) }
+	f2 := func(i int) int { return i + 1 }
+	r := ValueOf(f1).Call([]Value{ValueOf(100), ValueOf(f2)})
+	if r[0].Int() != 101 {
+		t.Errorf("function returned %d, want 101", r[0].Int())
+	}
+}
+
 var tagGetTests = []struct {
 	Tag   StructTag
 	Key   string
diff -r 7c369498bb81 libgo/go/reflect/value.go
--- a/libgo/go/reflect/value.go	Mon Sep 30 11:06:06 2013 -0700
+++ b/libgo/go/reflect/value.go	Mon Sep 30 20:03:54 2013 -0700
@@ -433,7 +433,7 @@
 	if v.flag&flagMethod != 0 {
 		nin++
 	}
-	firstPointer := len(in) > 0 && Kind(t.In(0).(*rtype).kind) != Ptr && v.flag&flagMethod == 0 && isMethod(v.typ)
+	firstPointer := len(in) > 0 && t.In(0).Kind() != Ptr && v.flag&flagMethod == 0 && isMethod(v.typ)
 	params := make([]unsafe.Pointer, nin)
 	off := 0
 	if v.flag&flagMethod != 0 {
@@ -497,8 +497,10 @@
 	sawRet := false
 	for i, c := range s {
 		if c == '(' {
+			if parens == 0 {
+				params++
+			}
 			parens++
-			params++
 		} else if c == ')' {
 			parens--
 		} else if parens == 0 && c == ' ' && s[i+1] != '(' && !sawRet {
[Patch] Fix interval quantifier in lookahead subexpr in regex
I forgot to make _M_clone and _M_eliminate_dummy follow _S_opcode_subexpr_lookahead, which also has _M_alt. Is it OK not to bootstrap it? Tested under -m64 and -m32. Thanks!
--
Tim Shen
a.patch
Description: Binary data
Re: [Patch] Fix interval quantifier in lookahead subexpr in regex
Hi, Tim Shen timshe...@gmail.com ha scritto: Is it OK not to bootstrap it ? Tested under -m64 and -m32. Ok, thanks. Paolo
Re: [PATCH, IRA] Fix ALLOCNO_MODE in the case of paradoxical subreg.
Probably the best place to add code for this is in lra-constraints.c::simplify_operand_subreg, by permitting subreg reload for paradoxical subregs whose hard regs are not fully in the allocno class of the inner pseudo. It needs good testing (I'd check that the generated code is not changed on a variety of benchmarks to see that the change has no impact on most programs' performance) and you need to add a good comment describing why this change is needed.

Vlad, thanks! I made another patch here following your guidance. Please check whether it is OK. Bootstrap and regression are OK. I am also verifying its performance effect on google applications (but most of them are 64-bit, so I cannot verify its performance effect on 32-bit apps).

The idea of the patch: for the following two types of paradoxical subreg, we insert a reload in simplify_operand_subreg:

1. If the op_type is OP_IN, and the hardreg cannot be paired with another hardreg to contain the outermode operand, for example R15 in x86-64 (checked by in_hard_reg_set_p), we need to insert a reload. If the hardreg allocated in IRA is R12, we don't need to insert a reload here, because the upper half of an rvalue paradoxical subreg is undefined, so it is OK for R13 to contain undefined data.

2. If the op_type is OP_OUT or OP_INOUT. (It is possible that we don't need to insert a reload for this case either, because the upper half of an lvalue paradoxical subreg is useless. If the assignment to the upper half of the subreg register is not generated by the rtl split4 stage, we don't need to insert a reload here. But I haven't got a testcase to verify this, so I keep the reload.)

Here is a paradoxical subreg example showing how the reload is generated:

(insn 5 4 7 2 (set (reg:TI 106 [ __comp ])
        (subreg:TI (reg:DI 107 [ __comp ]) 0)) {*movti_internal_rex64}

In IRA, reg107 is allocated to a DImode hardreg. If reg107 is assigned to hardreg R15, the compiler cannot find another hardreg to pair with R15 to contain TImode data.
So we insert a TImode reload pseudo reg180 for it. After the reload is inserted:

(insn 283 0 0 (set (subreg:DI (reg:TI 180 [orig:107 __comp ] [107]) 0)
        (reg:DI 107 [ __comp ])) -1

(insn 5 4 7 2 (set (reg:TI 106 [ __comp ])
        (subreg:TI (reg:TI 180 [orig:107 __comp ] [107]) 0)) {*movti_internal_rex64}

Two reload hard registers will be allocated to reg180 to hold the TImode operand in LRA assign. Thanks, Wei Mi.

2013-09-30  Wei Mi  w...@google.com

	* lra-constraints.c (insert_move_for_subreg): New function.
	(simplify_operand_subreg): Add reload for paradoxical subreg.

Index: lra-constraints.c
===
--- lra-constraints.c	(revision 201963)
+++ lra-constraints.c	(working copy)
@@ -1158,6 +1158,30 @@ process_addr_reg (rtx *loc, rtx *before,
   return true;
 }

+/* Insert move insn in simplify_operand_subreg.  BEFORE returns
+   the insn to be inserted before curr insn.  AFTER returns the
+   the insn to be inserted after curr insn.  ORIGREG and NEWREG
+   are the original reg and new reg for reload.  */
+static void
+insert_move_for_subreg (rtx *before, rtx *after, rtx origreg, rtx newreg)
+{
+  if (before)
+    {
+      push_to_sequence (*before);
+      lra_emit_move (newreg, origreg);
+      *before = get_insns ();
+      end_sequence ();
+    }
+  if (after)
+    {
+      start_sequence ();
+      lra_emit_move (origreg, newreg);
+      emit_insn (*after);
+      *after = get_insns ();
+      end_sequence ();
+    }
+}
+
 /* Make reloads for subreg in operand NOP with internal subreg mode
    REG_MODE, add new reloads for further processing.  Return true if
    any reload was generated.  */
@@ -1169,6 +1193,8 @@ simplify_operand_subreg (int nop, enum m
   enum machine_mode mode;
   rtx reg, new_reg;
   rtx operand = *curr_id->operand_loc[nop];
+  enum reg_class regclass;
+  enum op_type type;

   before = after = NULL_RTX;

@@ -1177,6 +1203,7 @@ simplify_operand_subreg (int nop, enum m
   mode = GET_MODE (operand);
   reg = SUBREG_REG (operand);
+  type = curr_static_id->operand[nop].type;
   /* If we change address for paradoxical subreg of memory, the
      address might violate the necessary alignment or the access might
      be slow.  So take this into consideration.  We should not worry
@@ -1215,13 +1242,9 @@ simplify_operand_subreg (int nop, enum m
	   && (hard_regno_nregs[hard_regno][GET_MODE (reg)]
	       >= hard_regno_nregs[hard_regno][mode])
	   && simplify_subreg_regno (hard_regno, GET_MODE (reg),
-				     SUBREG_BYTE (operand), mode) < 0
-	   /* Don't reload subreg for matching reload.  It is actually
-	      valid subreg in LRA.  */
-	   && ! LRA_SUBREG_P (operand))
+				     SUBREG_BYTE (operand), mode) < 0)
       || CONSTANT_P (reg) || GET_CODE (reg) == PLUS || MEM_P (reg))
     {
-      enum op_type type = curr_static_id->operand[nop].type;
       /* The class will be defined