Re: [bootstrap] Tentative fix for PR 54281
From: Diego Novillo dnovi...@google.com Date: Thu, 16 Aug 2012 21:12:56 +0200 On 12-08-16 15:00 , Diego Novillo wrote: This is the patch I'm currently testing. I need someone with a very old toolchain (4.1 or earlier) to also give this a try (the original problem does not occur in g++ more recent than 4.1). + PR bootstrap/54281 + * intl.h: Prevent libintl.h from being included when + ENABLE_NLS is not set. Regtested on a x86_64 F 8 system with gcc-4.1.2-33, no regressions at r190450 (with --disable-nls) compared to results at r190381 (without). brgds, H-P
[patch] more bitmap obstacks
Hello, This reduces peak memory usage for the small test case of PR54146 by 2GB. 24hrs ago peak memory was 9.4 GB in top, but with Richi's patches from yesterday and with this one, peak memory is now only 5.2GB. About 40% of that is in IRA and reload so there is still room for improvement, but cutting peak mem almost in half in one day is not a bad result :-) Bootstrappedtested on x86_64-unknown-linux-gnu. OK for trunk? Ciao! Steven PR54146_bitmap_obstacks.diff Description: Binary data
Re: Commit: BFIN: Fix use of VEC_last macro in bfin.c
Hi Diego, Thanks Nick. I made the wrong fix here, sorry about that. I will be making more changes to VEC_ shortly. What's a good way for me to test them? All I was doing was building a variety of targets, just to make sure that a local, generic patch of my own did not break anything. If you have the disk space then you could try doing the same yourself. With the two attached makefiles (and a little editing to suit your environment), I work like this: % make dirs % make config [These two are only needed the first time] % make That will build: i686-pc-linux-gnu \ x86_64-pc-linux-gnu \ am33_2.0-linux \ hppa-linux-gnu \ powerpc64-linux-gnu \ ppc-linux \ s390-linux \ alpha-netbsd \ frv-uclinux \ i686-pc-cygwin \ mingw32-pe \ vax-netbsdelf \ fr30-elf \ iq2000-elf \ lm32-elf \ mcore-elf \ spu-elf \ avr-elf \ sh64-elf \ xstormy16-elf \ epiphany-elf \ arm-eabi \ bfin-elf \ cris-elf \ frv-elf \ h8300-elf \ i386-elf \ ia64-elf \ m32c-elf \ m32r-elf \ mep-elf \ mips64vr-elf \ mipsisa32-elf \ mipsisa64-elf \ mmix-mmixware \ mn10300-elf \ powerpc-eabispe \ powerpc-elf \ rl78-elf \ rx-elf \ sh-elf \ tx39-elf \ v850e-elf Cheers Nick # These are a set of rules for building and testing devo toolchains. # You may need to edit some of these variables before they will # work in your environment. BUILD_DIR = /work/builds/gcc/current SOURCE_DIR = /work/sources/gcc/current BIN_DIRS = \ arc-elf \ cr16-elf \ crx-elf \ dlx-elf \ fido-elf \ i386-darwin \ i386-netware \ lm32-rtems4.0 \ m68hc12-elf \ microblaze-elf \ mcore-pe \ moxie-elf \ msp430-elf \ mt-elf \ openrisc-elf \ or32-elf \ s390x-ibm-tpf \ tic6x-elf \ x86_64-pc-mingw64 \ z8k-coff GCC_AND_LIBGCC_DIRS = \ i686-pc-linux-gnu \ x86_64-pc-linux-gnu \ \ am33_2.0-linux \ hppa-linux-gnu \ powerpc64-linux-gnu \ ppc-linux \ s390-linux \ \ alpha-netbsd \ frv-uclinux \ i686-pc-cygwin \ mingw32-pe \ vax-netbsdelf \ ALL_DIRS = \ fr30-elf \ iq2000-elf \ lm32-elf \ mcore-elf \ spu-elf \ \ avr-elf \ sh64-elf \ xstormy16-elf \ \ epiphany-elf \ \ arm-eabi \ bfin-elf \ cris-elf \ frv-elf \ h8300-elf \ i386-elf \ ia64-elf \ m32c-elf \ m32r-elf \ mep-elf \ mips64vr-elf \ mipsisa32-elf \ mipsisa64-elf \ mmix-mmixware \ mn10300-elf \ powerpc-eabispe \ powerpc-elf \ rl78-elf \ rx-elf \ sh-elf \ tx39-elf \ v850e-elf # Obsolete: # arm-elf \ # arm-wince-pe \ all: gcc-builds-noretry all-target-libgcc all-target-newlib checks check: gcc-checks rebuilds rebuild: gcc-rebuilds include /home/nickc/bin/scripts/builds-makefile # Tools = COUNT_FAILURES = /home/nickc/bin/scripts/count-failures CHECK_FOR_REGRESSIONS = /home/nickc/bin/scripts/check-for-regressions # TIMELIMIT = /home/nickc/bin/scripts/timelimit SIMPLE_MAKE = nice -3 make -s PARALLEL_MAKE = nice -3 make -s -j3 NOSTOP_MAKE = make -s -k # Build Rules BIN_CLEAN_RULES = echo -n Cleaning all of binutils in: `basename $$PWD` ... ; \ $(NOSTOP_MAKE) clean-gold /dev/null ; \ $(NOSTOP_MAKE) clean-gas clean-ld clean-binutils /dev/null \ echo success || echo FAILURE BIN_BUILD_NO_RETRY_RULES = echo -n Building Binutils in: `basename $$PWD` ... ; \ $(PARALLEL_MAKE) all-binutils all-gas all-ld all-gold ./make.out \ echo success || echo FAILURE BIN_BUILD_RULES = echo -n Building Binutils in: `basename $$PWD` ... ; \ $(PARALLEL_MAKE) all-binutils all-gas all-ld all-gold ./make.out \ echo success \ || (echo -n fail ... (removing cache files) ... ; \ cd bfd ; ./config.status /dev/null ; cd .. ; \ rm -f ld/e*.c `basename $$PWD`/newlib/config.cache ld/config.cache libiberty/config.cache bfd/config.cache gas/config.cache opcodes/config.cache; \ cd libiberty ; $(SIMPLE_MAKE) clean /dev/null ; ./config.status /dev/null ; cd .. \
Re: [bootstrap] Tentative fix for PR 54281
On Thu, 16 Aug 2012, Ian Lance Taylor wrote: On Thu, Aug 16, 2012 at 12:12 PM, Diego Novillo dnovi...@google.com wrote: This is the patch I'm currently testing. I need someone with a very old toolchain (4.1 or earlier) to also give this a try (the original problem does not occur in g++ more recent than 4.1). Thanks. Diego. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a8ff00d..8798b8f 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2012-08-16 Diego Novillo dnovi...@google.com + PR bootstrap/54281 + * intl.h: Prevent libintl.h from being included when + ENABLE_NLS is not set. It's hard to convince myself that this patch is portable. I don't think we can assume that every system that provides libintl.h uses _LIBINTL_H as the preprocessor guard. The issue is that we must not #include libintl.h after #defining those macros. So the fix is to #include libintl.h in all cases before #defining those macros. Your proposal of unconditionally doing #include libintl.h is not a good idea, because libintl.h is not available on every system. But we are compiling host files here, so we can just use autoconf. I recommend that you add libintl.h to the AC_CHECK_HEADERS list in gcc/configure.ac, and then change intl.h to do #ifdef HAVE_LIBINTL_H #include libintl.h #endif before the #ifdef ENABLE_NLS test. Yes. Still I think including system headers from random gcc headers is bogus (the gmp.h include from double-int.h), we have system.h exactly to be able to setup things that the includes will work before poisoning anything or for systems that require special care. The configure check including system.h should define GENERATOR_FILE eventually. Or we need to split system.h. Richard.
[PATCH] More leak plugs for PR54146
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-08-16 Richard Guenther rguent...@suse.de * tree-sra.c (modify_function): Free redirect_callers vector. * ipa-split.c (split_function): Free args_to_pass vector. * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Free body_cost_vec properly. (vect_enhance_data_refs_alignment): Likewise. * tree-vect-stmts.c (vectorizable_operation): Do not pre-allocate vec_oprnds. (new_stmt_vec_info): Do not pre-allocate STMT_VINFO_SAME_ALIGN_REFS. * tree-vect-slp.c (vect_free_slp_instance): Free the instance. (vect_analyze_slp_instance): Free everything. (destroy_bb_vec_info): Free the SLP instances. Index: gcc/ipa-split.c === --- gcc/ipa-split.c (revision 190447) +++ gcc/ipa-split.c (working copy) @@ -1230,6 +1230,7 @@ split_function (struct split_point *spli } call = gimple_build_call_vec (node-symbol.decl, args_to_pass); gimple_set_block (call, DECL_INITIAL (current_function_decl)); + VEC_free (tree, heap, args_to_pass); /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 190447) +++ gcc/tree-vect-data-refs.c (working copy) @@ -1368,6 +1368,7 @@ vect_peeling_hash_get_lowest_cost (void { min-inside_cost = inside_cost; min-outside_cost = outside_cost; + VEC_free (stmt_info_for_cost, heap, min-body_cost_vec); min-body_cost_vec = body_cost_vec; min-peel_info.dr = elem-dr; min-peel_info.npeel = elem-npeel; @@ -1880,7 +1881,10 @@ vect_enhance_data_refs_alignment (loop_v if (!stat) do_peeling = false; else -return stat; + { + VEC_free (stmt_info_for_cost, heap, body_cost_vec); + return stat; + } } if (do_peeling) @@ -1930,6 +1934,8 @@ vect_enhance_data_refs_alignment (loop_v gcc_assert (stat); return stat; } + else + VEC_free (stmt_info_for_cost, heap, body_cost_vec); } Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 190447) +++ gcc/tree-vect-stmts.c (working copy) @@ -3585,22 +3585,6 @@ vectorizable_operation (gimple stmt, gim /* Handle def. */ vec_dest = vect_create_destination_var (scalar_dest, vectype); - /* Allocate VECs for vector operands. In case of SLP, vector operands are - created in the previous stages of the recursion, so no allocation is - needed, except for the case of shift with scalar shift argument. In that - case we store the scalar operand in VEC_OPRNDS1 for every vector stmt to - be created to vectorize the SLP group, i.e., SLP_NODE-VEC_STMTS_SIZE. - In case of loop-based vectorization we allocate VECs of size 1. We - allocate VEC_OPRNDS1 only in case of binary operation. */ - if (!slp_node) -{ - vec_oprnds0 = VEC_alloc (tree, heap, 1); - if (op_type == binary_op || op_type == ternary_op) -vec_oprnds1 = VEC_alloc (tree, heap, 1); - if (op_type == ternary_op) -vec_oprnds2 = VEC_alloc (tree, heap, 1); -} - /* In case the vectorization factor (VF) is bigger than the number of elements that we can fit in a vectype (nunits), we have to generate more than one vector stmt - i.e - we need to unroll the @@ -5866,7 +5850,7 @@ new_stmt_vec_info (gimple stmt, loop_vec else STMT_VINFO_DEF_TYPE (res) = vect_internal_def; - STMT_VINFO_SAME_ALIGN_REFS (res) = VEC_alloc (dr_p, heap, 5); + STMT_VINFO_SAME_ALIGN_REFS (res) = NULL; STMT_SLP_TYPE (res) = loop_vect; GROUP_FIRST_ELEMENT (res) = NULL; GROUP_NEXT_ELEMENT (res) = NULL; Index: gcc/tree-sra.c === --- gcc/tree-sra.c (revision 190447) +++ gcc/tree-sra.c (working copy) @@ -4698,6 +4698,8 @@ modify_function (struct cgraph_node *nod new_node = cgraph_function_versioning (node, redirect_callers, NULL, NULL, false, NULL, NULL, isra); + VEC_free (cgraph_edge_p, heap, redirect_callers); + current_function_decl = new_node-symbol.decl; push_cfun (DECL_STRUCT_FUNCTION (new_node-symbol.decl)); Index: gcc/tree-vect-slp.c === --- gcc/tree-vect-slp.c (revision 190447) +++ gcc/tree-vect-slp.c (working copy) @@ -94,6 +94,7 @@ vect_free_slp_instance (slp_instance ins VEC_free (int, heap, SLP_INSTANCE_LOAD_PERMUTATION (instance)); VEC_free (slp_tree, heap, SLP_INSTANCE_LOADS (instance));
Re: [PATCH] Decrease integer-share-limit
On Thu, 16 Aug 2012, Jakub Jelinek wrote: On Thu, Aug 16, 2012 at 04:44:03PM +0200, Richard Guenther wrote: On Thu, 16 Aug 2012, Paolo Carlini wrote: On 08/16/2012 03:39 PM, Richard Guenther wrote: This decreases the integer-share-limit to make sure the TREE_VEC we allocate for the small cached integers has a reasonable size for our GC memory allocator. Out of curiosity (just in case you hav two spare minutes) do you have any idea why this is so? I mean, naively one would think that allowing for any 8 bit constant would be a nice idea; puzzlingly, however the comment in the code says just experimentation. I'm wondering if tweaking a bit the memory allocator itself could allow for the full 8 bit range without a big memory waste... The GC memory allocator works on pages, there are not pages of arbitrary size but only power-of-two sizes. It's hard to improve the allocator here. The allocations are either power-of-two, or a couple of extra listed sizes. So, the alternative would be to add an extra size for the default integer share limit vector. See extra_order_size_table in ggc-page.c. Of course only provided there are many of those vectors. I doubt that a special size of 2080 / 1040 would be a good thing. (I doubt we make very good use of most of the small integer share vectors anyway, but that's for another thing ...) I've now applied the patch. Richard.
Re: [PATCH] More leak plugs for PR54146
On Fri, Aug 17, 2012 at 09:46:35AM +0200, Richard Guenther wrote: --- gcc/tree-vect-slp.c (revision 190447) +++ gcc/tree-vect-slp.c (working copy) @@ -1858,8 +1862,11 @@ new_bb_vec_info (basic_block bb) static void destroy_bb_vec_info (bb_vec_info bb_vinfo) { + VEC (slp_instance, heap) *slp_instances; + slp_instance instance; basic_block bb; gimple_stmt_iterator si; + unsigned i; if (!bb_vinfo) return; @@ -1879,6 +1886,9 @@ destroy_bb_vec_info (bb_vec_info bb_vinf free_data_refs (BB_VINFO_DATAREFS (bb_vinfo)); free_dependence_relations (BB_VINFO_DDRS (bb_vinfo)); VEC_free (gimple, heap, BB_VINFO_GROUPED_STORES (bb_vinfo)); + slp_instances = BB_VINFO_SLP_INSTANCES (bb_vinfo); + FOR_EACH_VEC_ELT (slp_instance, slp_instances, i, instance) +vect_free_slp_instance (instance); VEC_free (slp_instance, heap, BB_VINFO_SLP_INSTANCES (bb_vinfo)); In this case you should use slp_instances instead of BB_VINFO_SLP_INSTANCES (bb_vinfo) also in the above VEC_free call. destroy_cost_data (BB_VINFO_TARGET_COST_DATA (bb_vinfo)); free (bb_vinfo); Jakub
Re: PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
On Thu, Aug 16, 2012 at 4:40 PM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Aug 16, 2012 at 3:09 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 16, 2012 at 1:50 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch replaces MEMBER_TYPE_FORCES_BLK with a target hook. I also pass the type to the target hook in addition to field, which will be used by i386 backend for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020 This patch doesn't change code generation. Tested on Linux/x86-64. I also tested cross compilers to ia64-hpux, powerpc-linux and xtensa-linux up to cc1. OK to install? Isn't the type just always DECL_FIELD_CONTEXT (field)? Btw, with C++ you no longer need ATTRIBUTE_UNUSED, just give the parameter no name. You are right. Now I don't need to replace MEMBER_TYPE_FORCES_BLK with a target hook for PR 20020. Since I already made the change, here is the updated patch. OK to install? If we don't want to convert MEMBER_TYPE_FORCES_BLK, I can submit a patch to use MEMBER_TYPE_FORCES_BLK in i386 backend. Please put the hook documentation in target.def. The macro - hook conversion is ok with that change. Any functional change has to go via target maintainers. Thanks, Richard. Thanks. -- H.J. --- 2012-08-16 H.J. Lu hongjiu...@intel.com * stor-layout.c (compute_record_mode): Replace MEMBER_TYPE_FORCES_BLK with targetm.member_type_forces_blk. (layout_type): Likewise. * system.h: Poison MEMBER_TYPE_FORCES_BLK. * target.def (member_type_forces_blk): New target hook. * targhooks.c (default_member_type_forces_blk): New. * targhooks.h (default_member_type_forces_blk): Likewise. * doc/tm.texi.in (MEMBER_TYPE_FORCES_BLK): Renamed to ... (TARGET_MEMBER_TYPE_FORCES_BLK): This. * doc/tm.texi: Regenerated. * config/ia64/hpux.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/ia64/ia64.c (ia64_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. (rs6000_member_type_forces_blk): New function. * config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/xtensa/xtensa.c (xtensa_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/xtensa/xtensa.h (MEMBER_TYPE_FORCES_BLK): Removed. diff --git a/gcc/config/ia64/hpux.h b/gcc/config/ia64/hpux.h index ad106b4..d9ae109 100644 --- a/gcc/config/ia64/hpux.h +++ b/gcc/config/ia64/hpux.h @@ -115,9 +115,6 @@ do { \ #define TARGET_DEFAULT \ (MASK_DWARF2_ASM | MASK_BIG_ENDIAN | MASK_ILP32) -/* ??? Might not be needed anymore. */ -#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) ((MODE) == TFmode) - /* ASM_OUTPUT_EXTERNAL_LIBCALL defaults to just a globalize_label call, but that doesn't put out the @function type information which causes shared library problems. */ diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index c7fb559..f2f02c8 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -319,6 +319,7 @@ static const char *ia64_invalid_binary_op (int, const_tree, const_tree); static enum machine_mode ia64_c_mode_for_suffix (char); static void ia64_trampoline_init (rtx, tree, rtx); static void ia64_override_options_after_change (void); +static bool ia64_member_type_forces_blk (const_tree, enum machine_mode); static tree ia64_builtin_decl (unsigned, bool); @@ -570,6 +571,9 @@ static const struct attribute_spec ia64_attribute_table[] = #undef TARGET_GET_RAW_ARG_MODE #define TARGET_GET_RAW_ARG_MODE ia64_get_reg_raw_mode +#undef TARGET_MEMBER_TYPE_FORCES_BLK +#define TARGET_MEMBER_TYPE_FORCES_BLK ia64_member_type_forces_blk + #undef TARGET_GIMPLIFY_VA_ARG_EXPR #define TARGET_GIMPLIFY_VA_ARG_EXPR ia64_gimplify_va_arg @@ -11153,6 +11157,15 @@ ia64_get_reg_raw_mode (int regno) return default_get_reg_raw_mode(regno); } +/* Implement TARGET_MEMBER_TYPE_FORCES_BLK. ??? Might not be needed + anymore. */ + +bool +ia64_member_type_forces_blk (const_tree, enum machine_mode mode) +{ + return TARGET_HPUX mode == TFmode; +} + /* Always default to .text section until HP-UX linker is fixed. */ ATTRIBUTE_UNUSED static section * diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 34948fb..57c4823 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1302,6 +1302,9 @@ static const struct attribute_spec rs6000_attribute_table[] = #undef TARGET_INIT_DWARF_REG_SIZES_EXTRA #define TARGET_INIT_DWARF_REG_SIZES_EXTRA rs6000_init_dwarf_reg_sizes_extra +#undef TARGET_MEMBER_TYPE_FORCES_BLK +#define TARGET_MEMBER_TYPE_FORCES_BLK rs6000_member_type_forces_blk + /* On rs6000, function
Re: [PATCH] More leak plugs for PR54146
On Fri, 17 Aug 2012, Jakub Jelinek wrote: On Fri, Aug 17, 2012 at 09:46:35AM +0200, Richard Guenther wrote: --- gcc/tree-vect-slp.c (revision 190447) +++ gcc/tree-vect-slp.c (working copy) @@ -1858,8 +1862,11 @@ new_bb_vec_info (basic_block bb) static void destroy_bb_vec_info (bb_vec_info bb_vinfo) { + VEC (slp_instance, heap) *slp_instances; + slp_instance instance; basic_block bb; gimple_stmt_iterator si; + unsigned i; if (!bb_vinfo) return; @@ -1879,6 +1886,9 @@ destroy_bb_vec_info (bb_vec_info bb_vinf free_data_refs (BB_VINFO_DATAREFS (bb_vinfo)); free_dependence_relations (BB_VINFO_DDRS (bb_vinfo)); VEC_free (gimple, heap, BB_VINFO_GROUPED_STORES (bb_vinfo)); + slp_instances = BB_VINFO_SLP_INSTANCES (bb_vinfo); + FOR_EACH_VEC_ELT (slp_instance, slp_instances, i, instance) +vect_free_slp_instance (instance); VEC_free (slp_instance, heap, BB_VINFO_SLP_INSTANCES (bb_vinfo)); In this case you should use slp_instances instead of BB_VINFO_SLP_INSTANCES (bb_vinfo) also in the above VEC_free call. I followed the destroy_loop_vec_info pattern which probably uses BB_VINFO_SLP_INSTANCES because it NULLs the vector in the place we actually store it. Which is not really important of course, as we free bb_vinfo immediately. Richard. destroy_cost_data (BB_VINFO_TARGET_COST_DATA (bb_vinfo)); free (bb_vinfo); Jakub -- Richard Guenther rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [patch] more bitmap obstacks
On Fri, Aug 17, 2012 at 9:03 AM, Steven Bosscher stevenb@gmail.com wrote: Hello, This reduces peak memory usage for the small test case of PR54146 by 2GB. 24hrs ago peak memory was 9.4 GB in top, but with Richi's patches from yesterday and with this one, peak memory is now only 5.2GB. About 40% of that is in IRA and reload so there is still room for improvement, but cutting peak mem almost in half in one day is not a bad result :-) Bootstrappedtested on x86_64-unknown-linux-gnu. OK for trunk? Ok. Thanks, Richard. Ciao! Steven
[PATCH, ARM] Don't unnecessarily clobber the flags for ADD Rd, Rd, Rm
Thumb2 add instructions of the form add Rd, Rm Are already 16-bit instructions, so turning them into addsRd, Rm is pointless. This patch adds support for this in the compiler and ensures that the conversion to ADDS is skipped in this case. The additional variant in arm.md is possibly redundant, but does mean that we correctly calculate the length of this case and may even encourage the compiler to try and reuse the source operation. Tested on arm-eabi and applied to trunk. R. --- arm.c (revision 190462) +++ arm.c (local) @@ -13309,6 +13746,13 @@ thumb2_reorg (void) switch (GET_CODE (src)) { case PLUS: + /* Adding two registers and storing the result +in the first source is already a 16-bit +operation. */ + if (rtx_equal_p (dst, op0) + register_operand (op1, SImode)) + break; + if (low_register_operand (op0, SImode)) { /* ADDS Rd,Rn,Rm */ --- arm.md (revision 190462) +++ arm.md (local) @@ -746,11 +746,12 @@ (define_peephole2 ;; (plus (reg rN) (reg sp)) into (reg rN). In this case reload will ;; put the duplicated register first, and not try the commutative version. (define_insn_and_split *arm_addsi3 - [(set (match_operand:SI 0 s_register_operand =r, k,r,r, k, r, k,k,r, k, r) - (plus:SI (match_operand:SI 1 s_register_operand %rk,k,r,rk,k, rk,k,r,rk,k, rk) -(match_operand:SI 2 reg_or_int_operand rI,rI,k,Pj,Pj,L, L,L,PJ,PJ,?n)))] + [(set (match_operand:SI 0 s_register_operand =rk, r,k, r,r, k, r, k,k,r, k, r) + (plus:SI (match_operand:SI 1 s_register_operand %0, rk,k, r,rk,k, rk,k,r,rk,k, rk) +(match_operand:SI 2 reg_or_int_operand rk, rI,rI,k,Pj,Pj,L, L,L,PJ,PJ,?n)))] TARGET_32BIT @ + add%?\\t%0, %0, %2 add%?\\t%0, %1, %2 add%?\\t%0, %1, %2 add%?\\t%0, %2, %1 @@ -773,9 +774,9 @@ (define_insn_and_split *arm_addsi3 operands[1], 0); DONE; - [(set_attr length 4,4,4,4,4,4,4,4,4,4,16) + [(set_attr length 2,4,4,4,4,4,4,4,4,4,4,16) (set_attr predicable yes) - (set_attr arch *,*,*,t2,t2,*,*,a,t2,t2,*)] + (set_attr arch t2,*,*,*,t2,t2,*,*,a,t2,t2,*)] ) (define_insn_and_split *thumb1_addsi3
[Patch,AVR]: Use CC to compile gen-avr-mmcu-texi.c
http://gcc.gnu.org/viewcvs?view=revisionrevision=190473 Committed this to compile gen-avr-mmcu-texi.c with the C compiler. Johann -- * config/avr/t-avr (gen-avr-mmcu-texi): Use $(CC) to compile gen-avr-mmcu-texi.c.
Commit: MEP: Use C++ to compile mep-pragma.
Hi DJ, I am applying the following patch as an obvious patch fix for building the MEP targeted port of gcc. It fixes the build rule for mep-pragma.o so that is uses the C++ compiler instead of the C compiler. Cheers Nick gcc/ChangeLog 2012-08-17 Nick Clifton ni...@redhat.com * config/mep/t-mep (mep-pragma.o): Use $(COMPILER) to compile mep-pragma.c. Index: gcc/config/mep/t-mep === --- gcc/config/mep/t-mep(revision 190466) +++ gcc/config/mep/t-mep(working copy) @@ -30,7 +30,7 @@ coretypes.h $(TM_H) $(TREE_H) $(RTL_H) $(C_PRAGMA_H) \ $(CPPLIB_H) hard-reg-set.h output.h $(srcdir)/config/mep/mep-protos.h \ function.h insn-config.h reload.h $(TARGET_H) - $(CC) -c $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $ + $(COMPILER) -c $(ALL_CFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $ MULTILIB_OPTIONS = mel mall-opts mfar MULTILIB_DIRNAMES = el allopt far
Re: Reproducible gcc builds, gfortran, and -grecord-gcc-switches
On 16 August 2012 21:28, Jakub Jelinek ja...@redhat.com wrote: On Thu, Aug 16, 2012 at 06:59:09PM +0200, Simon Baldwin wrote: On 16 August 2012 16:40, Michael Matz m...@suse.de wrote: ,,, Do you have considered to use a new option flag (usable in the .opt files) instead of a langhook? I.e. add a flag cl_dont_record to cl_option, a string Norecord for the .opt files, some handling for it in opt-functions.awk and the like? Adding lang-hooks used by debug producers make me twitch :) Okay. Below is an alternative approach. I've moved discussion to gcc-patches, since it's now more concrete than abstract. You could have just added case OPT_cpp_: to the switch in gen_producer_string, instead of all this. Thanks. I was under the impression, apparently mistaken, that OPT_cpp_ exists only if fortran is enabled. -- Google UK Limited | Registered Office: Belgrave House, 76 Buckingham Palace Road, London SW1W 9TQ | Registered in England Number: 3977902
Surprise! A new friend is looking for you.)
I am Kayleigh. I am 26 years and I really like spending nice time with my buddies. Yeah, it always brings me delight to go party, going to disco clubs, pubs and other places like these. It is always fun and interesting with me. And... I am so sexual! I even won the sexiest woman in our university contest about two or three times. So, if u are funny boy and like me, interested in relaxing and have great time together, then give me a message here :) Kayleigh
Commit: IQ2000: Remove modes from
Hi Guys, I am applying the patch below to fix some problems building an iq2000-elf toolchain. The main problem is that the conditional branch patterns were specifying a mode for the comparison operator, which then fails to match the rtl generated by the compiler. The other problem is that the two extend-and-compare patterns were running into reload problems when building libstdc++-v3. I have disabled them until I have time to investigate further. Cheers Nick gcc/ChangeLog 2012-08-17 Nick Clifton ni...@redhat.com * config/iq2000/iq2000.md (cbranchsi4): Remove mode from comparison and label. (branch_zero): Likewise. (branch_zero_inverted): Likewise. (branch_equality): Likewise. (branch_equality_inverted): Likewise. (extend-and-compare): Disable until reload issues can be resolved. * config/iq2000/iq2000.c (gen_conditional_branch): Use VOIDmode for comparison. (iq2000_function_arg_advance): Remove CONST_CAST2. Index: gcc/config/iq2000/iq2000.md === --- gcc/config/iq2000/iq2000.md (revision 190466) +++ gcc/config/iq2000/iq2000.md (working copy) @@ -1001,10 +1001,10 @@ (define_expand cbranchsi4 [(set (pc) (if_then_else - (match_operator:SI 0 ordered_comparison_operator -[(match_operand:SI 1 register_operand) - (match_operand:SI 2 reg_or_const_operand)]) -(label_ref (match_operand:SI 3 )) + (match_operator 0 ordered_comparison_operator +[(match_operand:SI 1 register_operand) + (match_operand:SI 2 reg_or_const_operand)]) +(label_ref (match_operand 3 )) (pc)))] @@ -1019,9 +1019,9 @@ (define_insn branch_zero [(set (pc) (if_then_else - (match_operator:SI 0 cmp_op - [(match_operand:SI 2 register_operand d) -(const_int 0)]) + (match_operator 0 cmp_op +[(match_operand:SI 2 register_operand d) + (const_int 0)]) (label_ref (match_operand 1 )) (pc)))] @@ -1040,9 +1040,9 @@ (define_insn branch_zero_inverted [(set (pc) (if_then_else - (match_operator:SI 0 cmp_op - [(match_operand:SI 2 register_operand d) -(const_int 0)]) + (match_operator 0 cmp_op +[(match_operand:SI 2 register_operand d) + (const_int 0)]) (pc) (label_ref (match_operand 1 ] @@ -1063,9 +1063,9 @@ (define_insn branch_equality [(set (pc) (if_then_else - (match_operator:SI 0 equality_op - [(match_operand:SI 2 register_operand d) -(match_operand:SI 3 register_operand d)]) + (match_operator 0 equality_op +[(match_operand:SI 2 register_operand d) + (match_operand:SI 3 register_operand d)]) (label_ref (match_operand 1 )) (pc)))] @@ -1084,9 +1084,9 @@ (define_insn branch_equality_inverted [(set (pc) (if_then_else - (match_operator:SI 0 equality_op - [(match_operand:SI 2 register_operand d) -(match_operand:SI 3 register_operand d)]) + (match_operator 0 equality_op +[(match_operand:SI 2 register_operand d) + (match_operand:SI 3 register_operand d)]) (pc) (label_ref (match_operand 1 ] @@ -1145,7 +1145,7 @@ (const_int 0)) (match_operand 2 pc_or_label_operand ) (match_operand 3 pc_or_label_operand )))] - + 0 bb%A2\\t%0(31-%1),%P2%P3 [(set_attr length 4) (set_attr type branch)]) @@ -1159,7 +1159,7 @@ (const_int 0)) (match_operand 2 pc_or_label_operand ) (match_operand 3 pc_or_label_operand )))] - + 0 bb%A3\\t%0(31-%1),%P2%P3 [(set_attr length 4) (set_attr type branch)]) Index: gcc/config/iq2000/iq2000.c === --- gcc/config/iq2000/iq2000.c (revision 190466) +++ gcc/config/iq2000/iq2000.c (working copy) @@ -1076,7 +1076,7 @@ emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, gen_rtx_IF_THEN_ELSE (VOIDmode, gen_rtx_fmt_ee (test_code, -mode, +VOIDmode, cmp0, cmp1), label1, label2))); } @@ -1140,7 +1140,7 @@
Commit: MCore: Fix building libgcc
Hi Guys, I am applying the patch below to fix a problem building libgcc for the mcore-elf target. The cbranchsi4 pattern was applying a mode to the comparison operator which was preventing it from matching rtl generated by the middle end. Cheers Nick gcc/ChangeLog 2012-08-17 Nick Clifton ni...@redhat.com * config/mcore/mcore.md (cbranchsi4): Remove mode from comparison. Index: gcc/config/mcore/mcore.md === --- gcc/config/mcore/mcore.md (revision 190466) +++ gcc/config/mcore/mcore.md (working copy) @@ -1502,7 +1502,7 @@ (define_expand cbranchsi4 [(set (pc) - (if_then_else (match_operator:SI 0 ordered_comparison_operator + (if_then_else (match_operator 0 ordered_comparison_operator [(match_operand:SI 1 mcore_compare_operand) (match_operand:SI 2 nonmemory_operand)]) (label_ref (match_operand 3 ))
Commit: FR30: Fix building libgcc
Hi Guys, I am applying the patch below to fix building libgcc for the fr30-elf target. The problem was that the comparison patterns were applying a mode to the operator which prevents them from matching rtl generated by the middle-end. Cheers Nick gcc/ChangeLog 2012-08-17 Nick Clifton ni...@redhat.com * config/fr30/fr30.md (cbranchsi4): Remove mode from comparison. (branch_true): Likewise. (branch_false): Likewise. Index: gcc/config/fr30/fr30.md === --- gcc/config/fr30/fr30.md (revision 190466) +++ gcc/config/fr30/fr30.md (working copy) @@ -940,7 +940,7 @@ (compare:CC (match_operand:SI 1 register_operand ) (match_operand:SI 2 nonmemory_operand ))) (set (pc) - (if_then_else (match_operator:CC 0 ordered_comparison_operator + (if_then_else (match_operator 0 ordered_comparison_operator [(reg:CC 16) (const_int 0)]) (label_ref (match_operand 3 )) (pc)))] @@ -980,9 +980,9 @@ (define_insn *branch_true [(set (pc) - (if_then_else (match_operator:CC 0 comparison_operator -[(reg:CC 16) - (const_int 0)]) + (if_then_else (match_operator 0 comparison_operator + [(reg:CC 16) + (const_int 0)]) (label_ref (match_operand 1 )) (pc)))] @@ -1034,9 +1034,9 @@ ;; branch occurs if the test is false, so the %B operator is used. (define_insn *branch_false [(set (pc) - (if_then_else (match_operator:CC 0 comparison_operator -[(reg:CC 16) - (const_int 0)]) + (if_then_else (match_operator 0 comparison_operator + [(reg:CC 16) + (const_int 0)]) (pc) (label_ref (match_operand 1 ]
RFA: LM32: Fix building libgcc
Hi Sebastien, Currently the lm32-elf target fails to configure libgcc because of a seg-fault whilst running the configuration checks: configure:3794: lm32-elf/gcc/xgcc [...] conftest.c xgcc: internal compiler error: Segmentation fault (program cc1) GCC goes into an infinite loop trying to generate a valid return instruction. The problem is the epilogue and return patterns are misnamed. The return is really a simple_return and the epilogue is really the (complex) return. Thus the patch below fixes the problem and allows the libgcc library to be configured and built. OK to apply ? Cheers Nick gcc/ChangeLog 2012-08-17 Nick Clifton ni...@redhat.com * config/lm32/lm32.md (return): Rename to simple_return. (epilogue): Rename to return. Index: gcc/config/lm32/lm32.md === --- gcc/config/lm32/lm32.md (revision 190466) +++ gcc/config/lm32/lm32.md (working copy) @@ -636,13 +636,14 @@ [(set_attr type uibranch)] ) -(define_insn return - [(return)] +(define_insn simple_return + [(simple_return)] lm32_can_use_return () ret [(set_attr type uibranch)] ) @@ -961,7 +962,7 @@ DONE; }) -(define_expand epilogue +(define_expand return [(return)]
Re: [rtl, i386] combine concat+permutation
Ping http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00205.html (Cc: the nice reviewer of the previous patch) On Fri, 3 Aug 2012, Marc Glisse wrote: Hello, this is a follow up to the patch applied after this discussion: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html It handles the -mavx __builtin_shuffle case mentioned there. It passes bootstrap (languages=c,c++) and regtest on x86_64. 2012-08-04 Marc Glisse marc.gli...@inria.fr gcc/ * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle of a concatenation. gcc/testsuite/ * gcc.target/i386/perm-concat.c: New test. -- Marc Glisse
[PATCH] Annotate GC bitmaps
This marks GTY bitmaps properly for most efficient marking. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-08-17 Richard Guenther rguent...@suse.de * bitmap.h (struct bitmap_element_def): GTY annotate next/prev. (struct bitmap_head_def): GTY skip current field. Index: gcc/bitmap.h === --- gcc/bitmap.h(revision 190471) +++ gcc/bitmap.h(working copy) @@ -167,9 +167,9 @@ typedef struct GTY (()) bitmap_obstack { bitmap_elt_clear_from to be implemented in unit time rather than linear in the number of elements to be freed. */ -typedef struct GTY(()) bitmap_element_def { - struct bitmap_element_def *next; /* Next element. */ - struct bitmap_element_def *prev; /* Previous element. */ +typedef struct GTY((chain_next (%h.next), chain_prev (%h.prev))) bitmap_element_def { + struct bitmap_element_def *next; /* Next element. */ + struct bitmap_element_def *prev; /* Previous element. */ unsigned int indx; /* regno/BITMAP_ELEMENT_ALL_BITS. */ BITMAP_WORD bits[BITMAP_ELEMENT_WORDS]; /* Bits that are set. */ } bitmap_element; @@ -177,15 +177,17 @@ typedef struct GTY(()) bitmap_element_de struct bitmap_descriptor; /* Head of bitmap linked list. gengtype ignores ifdefs, but for statistics we need to add a bitmap descriptor pointer. As it is - not collected, we can just GTY((skip)) it. */ + not collected, we can just GTY((skip())) it. Likewise current + points to something already pointed to by the chain started by first, + no need to walk it again. */ typedef struct GTY(()) bitmap_head_def { - bitmap_element *first; /* First element in linked list. */ - bitmap_element *current; /* Last element looked at. */ - unsigned int indx; /* Index of last element looked at. */ - bitmap_obstack *obstack; /* Obstack to allocate elements from. - If NULL, then use GGC allocation. */ - struct bitmap_descriptor GTY((skip)) *desc; + bitmap_element *first; /* First element in linked list. */ + bitmap_element * GTY((skip())) current; /* Last element looked at. */ + unsigned int indx; /* Index of last element looked at. */ + bitmap_obstack *obstack; /* Obstack to allocate elements from. + If NULL, then use GGC allocation. */ + struct bitmap_descriptor GTY((skip())) *desc; } bitmap_head; /* Global data */
Re: [rtl, i386] combine concat+permutation
On Fri, Aug 03, 2012 at 10:47:25PM +0200, Marc Glisse wrote: Hello, this is a follow up to the patch applied after this discussion: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html It handles the -mavx __builtin_shuffle case mentioned there. It passes bootstrap (languages=c,c++) and regtest on x86_64. Ok. Thanks. 2012-08-04 Marc Glisse marc.gli...@inria.fr gcc/ * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle of a concatenation. gcc/testsuite/ * gcc.target/i386/perm-concat.c: New test. Jakub
[PATCH][C++] Save memory and reallocations in name-lookup
This reduces the number of re-allocations and the amount of memory wasted by store_binding. Previously, on a cut-down testcase from PR54146 we can see (--enable-gather-detailed-mem-stats -fmem-report output): cp/name-lookup.c:5874 (store_binding) 12033504: 1.6% 8564032: 2.4% 0: 0.0%2398752:12.9% 30134 that's the GC VEC (re-)allocation which wastes 2MB and re-allocates 30134 times. After the patch we have cp/name-lookup.c:5911 (store_bindings) 1243648: 0.2% 352240: 0.1% 0: 0.0% 154064: 0.9% 9407 cp/name-lookup.c:5945 (store_class_bindings)9643632: 1.3% 120160: 0.0% 0: 0.0% 209376: 1.3% 10109 which means only 19516 reallocations and 350kb wasted memory. The compile-time effects are neutral (name-lookup is 1.0+-0.05s on the reduced testcase) - a slightly different version which omitted the temporary vector but walked things twice was consistently slower (about 1.2s). It seems we can end up with duplicates in the names chain of store_bindings (not sure if that is intended). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Ok if that passes? Thanks, Richard. 2012-08-17 Richard Guenther rguent...@suse.de cp/ * name-lookup.c (store_binding_p): New predicate, split out from ... (store_binding): ... here. Always store binding and require target vector with enough space. (store_bindings): Collect to store bindings and reserve space for them, then store them. (store_class_bindings): Likewise. Index: gcc/cp/name-lookup.c === *** gcc/cp/name-lookup.c(revision 190471) --- gcc/cp/name-lookup.c(working copy) *** pushtag (tree name, tree type, tag_scope *** 5855,5877 scope isn't enough, because more binding levels may be pushed. */ struct saved_scope *scope_chain; ! /* If ID has not already been marked, add an appropriate binding to !*OLD_BINDINGS. */ static void store_binding (tree id, VEC(cxx_saved_binding,gc) **old_bindings) { cxx_saved_binding *saved; ! if (!id || !IDENTIFIER_BINDING (id)) ! return; ! ! if (IDENTIFIER_MARKED (id)) ! return; IDENTIFIER_MARKED (id) = 1; ! saved = VEC_safe_push (cxx_saved_binding, gc, *old_bindings, NULL); saved-identifier = id; saved-binding = IDENTIFIER_BINDING (id); saved-real_type_value = REAL_IDENTIFIER_TYPE_VALUE (id); --- 5855,5887 scope isn't enough, because more binding levels may be pushed. */ struct saved_scope *scope_chain; ! /* Return true if ID has not already been marked. */ ! ! static inline bool ! store_binding_p (tree id) ! { ! if (!id || !IDENTIFIER_BINDING (id)) ! return false; ! ! if (IDENTIFIER_MARKED (id)) ! return false; ! ! return true; ! } ! ! /* Add an appropriate binding to *OLD_BINDINGS which needs to already !have enough space reserved. */ static void store_binding (tree id, VEC(cxx_saved_binding,gc) **old_bindings) { cxx_saved_binding *saved; ! gcc_checking_assert (store_binding_p (id)); IDENTIFIER_MARKED (id) = 1; ! saved = VEC_quick_push (cxx_saved_binding, *old_bindings, NULL); saved-identifier = id; saved-binding = IDENTIFIER_BINDING (id); saved-real_type_value = REAL_IDENTIFIER_TYPE_VALUE (id); *** store_binding (tree id, VEC(cxx_saved_bi *** 5881,5899 static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! tree t; bool subtime = timevar_cond_start (TV_NAME_LOOKUP); for (t = names; t; t = TREE_CHAIN (t)) { - tree id; - if (TREE_CODE (t) == TREE_LIST) id = TREE_PURPOSE (t); else id = DECL_NAME (t); ! store_binding (id, old_bindings); } timevar_cond_stop (TV_NAME_LOOKUP, subtime); } --- 5891,5922 static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; ! tree t, id; ! size_t i; bool subtime = timevar_cond_start (TV_NAME_LOOKUP); for (t = names; t; t = TREE_CHAIN (t)) { if (TREE_CODE (t) == TREE_LIST) id = TREE_PURPOSE (t); else id = DECL_NAME (t); ! if (store_binding_p (id)) ! VEC_safe_push(tree, heap, bindings_need_stored, id); ! } ! if (!VEC_empty (tree, bindings_need_stored)) ! { ! VEC_reserve_exact (cxx_saved_binding, gc, *old_bindings, !VEC_length (tree, bindings_need_stored)); ! for (i = 0; VEC_iterate(tree, bindings_need_stored, i, id); ++i) ! { ! /* We can appearantly have duplicates in NAMES. */ ! if (store_binding_p (id)) ! store_binding (id, old_bindings); ! } ! VEC_truncate (tree, bindings_need_stored, 0); }
Re: [Test] contrib/test_installed modified to set specific gcov
Ping one more time. 2012/8/10 Anna Tikhonova anna.m.tikhon...@gmail.com: Ping. 2012/8/8 Anna Tikhonova anna.m.tikhon...@gmail.com: Hi, while running check for Android NDK compiler (I've used contrib/test_installed for it) I've noticed that gcov name is hardcoded in g++.dg/gcov/gcov.exp. All NDK x86 tools have prefix like i686-linux-android-, so testing will fail to spawn gcov. The patch attached introduces --with-gcov flag of contrib/test_installed so one could set specific gcov for testing. As workaround we could create a wrapper script named 'gcov' pointing to specific gcov in directory where GCC_UNDER_TEST resides. What do you think of this patch? Do you find it usefull? Thanks in advance. Anna
[PATCH][C++] Get rid of TREE_CHAIN use for TREE_VEC (NON_DEFAULT_TEMPLATE_ARGS_COUNT)
This gets rid of this field, pushing it into a short int in tree_base (hopefully 2^16 non-defaulted template args are enough ...). The existing code is a bit confusing because of the differences with respect to ENABLE_CHECKING, it has very little testcase coverage as well (a single testcase, g++.dg/template/error39.C), so I am not sure I got the idea correct (especially as ENABLE_CHECKING vs. non-ENABLE_CHECKING looks to introduce subtle differences?). Anyway, a previous iteration that failed g++.dg/template/error39.C bootstrapped and tested all languages ok, the following version now passes g++.dg/template/error39.C and hopefully bootstrap and regtest as well. We build a _lot_ of TREE_VECs from the C++ frontend, so this 8 byte saving is quite visible. TREE_VEC is the most used TREE_CODE by far on PR54146: ... ssa_name 363597 mem_ref 403966 addr_expr1203839 tree_list1205721 tree_vec 2654415 which translates to 109615896 bytes allocated in TREE_VECs (yes, that includes the actual storage, the average size is 41.3 bytes, with this patch TREE_VEC itself shrinks from 40 bytes to 32 bytes which means the average TREE_VEC size is one!). Let's hope removing TREE_TYPE from TREE_VEC is as easy ;) Re-bootstrap and regtest running on x86_64-unknown-linux-gnu. Ok for the C++ parts? Thanks, Richard. 2012-08-17 Richard Guenther rguent...@suse.de * tree.h (struct tree_base): Put address_space into a union, alongside a new field non_default_template_args_count used by C++ TREE_VECs. (struct tree_vec): Derive from tree_typed instead of tree_common. (TYPE_ADDR_SPACE): Adjust. * tree.c (initialize_tree_contains_struct): Adjust tree type hierarchy. cp/ * cp-tree.h (NON_DEFAULT_TEMPLATE_ARGS_COUNT): Remove. (SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT, GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Use non_default_template_args_count in tree_base instead of TREE_CHAIN. * pt.c (expand_template_argument_pack): Adjust. (template_parm_to_arg): Likewise. (current_template_args): Likewise. (coerce_template_parameter_pack): Likewise. (coerce_template_parms): Likewise. (tsubst_template_args): Likewise. (type_unification_real): Likewise. * parser.c (cp_parser_template_argument_list): Likewise. Index: gcc/tree.h === *** gcc/tree.h (revision 190469) --- gcc/tree.h (working copy) *** struct GTY(()) tree_base { *** 453,465 unsigned packed_flag : 1; unsigned user_align : 1; unsigned nameless_flag : 1; ! unsigned spare : 12; ! ! /* This field is only used with type nodes; the only reason it is present ! in tree_base instead of tree_type is to save space. The size of the ! field must be large enough to hold addr_space_t values. */ ! unsigned address_space : 8; }; struct GTY(()) tree_typed { --- 453,473 unsigned packed_flag : 1; unsigned user_align : 1; unsigned nameless_flag : 1; + unsigned spare0 : 4; ! /* The following has to be at a 16-bit alignment boundary to be properly ! packed and not make tree_base larger than 64 bits. */ ! union tree_base_spare_bits { ! /* This field is only used with type nodes; the only reason it is present !in tree_base instead of tree_type is to save space. The size of the !field must be large enough to hold addr_space_t values. */ ! unsigned char address_space; ! ! /* This field is only used with tree_vec nodes by the C++ frontend; the !only reason it is present in tree_base instead of tree_vec is to !save space. */ ! unsigned short non_default_template_args_count; ! } GTY((skip())) spare; }; struct GTY(()) tree_typed { *** struct GTY(()) tree_list { *** 1483,1489 #define TREE_VEC_ELT(NODE,I) TREE_VEC_ELT_CHECK (NODE, I) struct GTY(()) tree_vec { ! struct tree_common common; int length; tree GTY ((length (TREE_VEC_LENGTH ((tree)%h a[1]; }; --- 1491,1497 #define TREE_VEC_ELT(NODE,I) TREE_VEC_ELT_CHECK (NODE, I) struct GTY(()) tree_vec { ! struct tree_typed typed; int length; tree GTY ((length (TREE_VEC_LENGTH ((tree)%h a[1]; }; *** extern enum machine_mode vector_type_mod *** 2182,2188 #define TYPE_NAMELESS(NODE) (TYPE_CHECK (NODE)-base.nameless_flag) /* The address space the type is in. */ ! #define TYPE_ADDR_SPACE(NODE) (TYPE_CHECK (NODE)-base.address_space) /* There is a TYPE_QUAL value for each type qualifier. They can be combined by bitwise-or to form the complete set of qualifiers for a --- 2190,2196 #define TYPE_NAMELESS(NODE) (TYPE_CHECK (NODE)-base.nameless_flag) /* The address space the type is in. */ ! #define TYPE_ADDR_SPACE(NODE) (TYPE_CHECK
Re: [PATCH][C++] Save memory and reallocations in name-lookup
On Fri, Aug 17, 2012 at 6:17 AM, Richard Guenther rguent...@suse.de wrote: This reduces the number of re-allocations and the amount of memory wasted by store_binding. Previously, on a cut-down testcase from PR54146 we can see (--enable-gather-detailed-mem-stats -fmem-report output): cp/name-lookup.c:5874 (store_binding) 12033504: 1.6% 8564032: 2.4% 0: 0.0%2398752:12.9% 30134 that's the GC VEC (re-)allocation which wastes 2MB and re-allocates 30134 times. After the patch we have cp/name-lookup.c:5911 (store_bindings) 1243648: 0.2% 352240: 0.1% 0: 0.0% 154064: 0.9% 9407 cp/name-lookup.c:5945 (store_class_bindings)9643632: 1.3% 120160: 0.0% 0: 0.0% 209376: 1.3% 10109 Nice saving. As original author of the name lookup refactoring and improvement, I am delighted to see we can do much better. I am however concerned with: static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; I would be more comfortable to see the cache be on per-scope (e.g. namespace scope) basis as opposed a blanket global cache stored in a global variable. -- Gaby
Re: [PATCH][C++] Save memory and reallocations in name-lookup
On Fri, Aug 17, 2012 at 6:17 AM, Richard Guenther rguent...@suse.de wrote: which means only 19516 reallocations and 350kb wasted memory. The compile-time effects are neutral (name-lookup is 1.0+-0.05s on the reduced testcase) - a slightly different version which omitted the temporary vector but walked things twice was consistently slower (about 1.2s). Funny; I found something similar 10 years ago :-/ It seems we can end up with duplicates in the names chain of store_bindings (not sure if that is intended). the chain is supposed to implement a stack, not a set. -- Gaby
[patch][RFC] 160-bits bitmap_element
Hello, On a 64-bits host we leave 32 bits of each bitmap_element unused. And actually, on a 32-bits host we do that too for GGC-allocated bitmaps (due to alignment). With this patch, we use those 32 bits by making BITMAP_WORD an unsigned int (instead of unsigned long) and having 5 elts per bitmap_element (instead of 2 or 4). The effect is that computing the bit position gets more expensive. Currently this only needs a div/mod by a power-of-two, but with this patch it's a div/mod 5. But because most bitmaps are sparse in blocks, bitmaps with 160 bits bitmap_elements have fewer searches. And I think that in general the call overhead to the bitmap functions (bitmap_bit_p/bitmap_set_bit are not inline functions, they're relatively large) and the pointer chasing is more expensive than requiring div/mod 5. Compile time for a set of cc1-i files doesn't change, but compile time for the small test case of PR54146 goes down by ~10%, and peak memory goes down a bit too (harder to measure so I don't know exactly by how much). Looking for your thoughts and comments, Ciao! Steven bitmap_160.diff Description: Binary data
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 1:46 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, On a 64-bits host we leave 32 bits of each bitmap_element unused. And actually, on a 32-bits host we do that too for GGC-allocated bitmaps (due to alignment). With this patch, we use those 32 bits by making BITMAP_WORD an unsigned int (instead of unsigned long) and having 5 elts per bitmap_element (instead of 2 or 4). The effect is that computing the bit position gets more expensive. Currently this only needs a div/mod by a power-of-two, but with this patch it's a div/mod 5. But because most bitmaps are sparse in blocks, bitmaps with 160 bits bitmap_elements have fewer searches. And I think that in general the call overhead to the bitmap functions (bitmap_bit_p/bitmap_set_bit are not inline functions, they're relatively large) and the pointer chasing is more expensive than requiring div/mod 5. Compile time for a set of cc1-i files doesn't change, but compile time for the small test case of PR54146 goes down by ~10%, and peak memory goes down a bit too (harder to measure so I don't know exactly by how much). Looking for your thoughts and comments, Well, another effect of reducing the size of BITMAP_WORD is that operations are not performed in a mode optimally using CPU regs (did you check code generation differences on a 64bit host?). I think another idea was to even actually increase BITMAP_WORD size to be a vector type where supported to benefit from that fact, eventually getting rid of all inner loops (though we unroll them completely anyway I hope). I suppose using vector registers could be achieved in by specializing as well, in your proposed case we'd have one xmm op and a 32bit reg op. I guess that's one vote in favor of not wasting those 32 bits which is a shame. Looking for other comments. Thanks, Richard. Ciao! Steven
Re: [PATCH][C++] Save memory and reallocations in name-lookup
On Fri, 17 Aug 2012, Gabriel Dos Reis wrote: On Fri, Aug 17, 2012 at 6:17 AM, Richard Guenther rguent...@suse.de wrote: This reduces the number of re-allocations and the amount of memory wasted by store_binding. Previously, on a cut-down testcase from PR54146 we can see (--enable-gather-detailed-mem-stats -fmem-report output): cp/name-lookup.c:5874 (store_binding) 12033504: 1.6% 8564032: 2.4% 0: 0.0%2398752:12.9% 30134 that's the GC VEC (re-)allocation which wastes 2MB and re-allocates 30134 times. After the patch we have cp/name-lookup.c:5911 (store_bindings) 1243648: 0.2% 352240: 0.1% 0: 0.0% 154064: 0.9% 9407 cp/name-lookup.c:5945 (store_class_bindings)9643632: 1.3% 120160: 0.0% 0: 0.0% 209376: 1.3% 10109 Nice saving. As original author of the name lookup refactoring and improvement, I am delighted to see we can do much better. I am however concerned with: static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; I would be more comfortable to see the cache be on per-scope (e.g. namespace scope) basis as opposed a blanket global cache stored in a global variable. It's a cache only to not need a malloc/free for each store_bindings, store_class_bindings invocation. I've made it function local for code cleaniness. We're using this kind of trick elsewhere in the compiler. It seems we can end up with duplicates in the names chain of store_bindings (not sure if that is intended). the chain is supposed to implement a stack, not a set. Ah, I see. That explains it. Richard.
Re: [PATCH][C++] Save memory and reallocations in name-lookup
On Fri, Aug 17, 2012 at 06:41:37AM -0500, Gabriel Dos Reis wrote: I am however concerned with: static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; I would be more comfortable to see the cache be on per-scope (e.g. namespace scope) basis as opposed a blanket global cache stored in a global variable. It is not any kind of cache. It could be in theory an automatic variable vector pointer, it is only used during that function. The reason why it is static variable instead is just to avoid constant allocation/deallocation of the vector, this way after the first call it will be already allocated (but, upon entry to store_bindings will always be empty). Jakub
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 2:04 PM, Steven Bosscher stevenb@gmail.com wrote: On Fri, Aug 17, 2012 at 1:54 PM, Richard Guenther richard.guent...@gmail.com wrote: Well, another effect of reducing the size of BITMAP_WORD is that operations are not performed in a mode optimally using CPU regs (did you check code generation differences on a 64bit host?). I did, on x86_64 and on powerpc64. The effect is not dramatic, most of these machines can perform 32 bits operations just fine (I think the only exception would be alpha, maybe?). I wonder how bad code gets when we unconditionally use GCCs generic vector support to do Index: gcc/bitmap.c === --- gcc/bitmap.c(revision 190469) +++ gcc/bitmap.c(working copy) @@ -1446,6 +1446,17 @@ bitmap_compl_and_into (bitmap a, const_b unsigned ix; BITMAP_WORD ior = 0; +#if BITMAP_ELEMENT_WORDS == 5 + typedef v4si unsigned int __attribute__((vector_size((16; + v4si cleared4 = *(v4si *)a_elt-bits[0] *(v4si *)b_elt-bits[0]; + BITMAP_WORD cleared5 = a_elt-bits[4] b_elt-bits[4]; + v4si r4 = *(v4si *)b_elt-bits[0] ^ cleared4; + BITMAP_WORD r5 = b_elt-bits[4] ^ cleared5; + *(v4si *)a_elt-bits[0] = r4; + a_elt-bits[4] = r5; + ior4 |= r4; + ior5 |= r5; +#else for (ix = 0; ix BITMAP_ELEMENT_WORDS; ix++) of course with a proper #if GCC_VERSION. The theory is we should be able to lower it to the same scalar code or even v2si operations where only those are available ... Just an idea and eventually an opportunity to improve generic vector lowering if the above really does not work out. Richard. Ciao! Steven
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 02:04:50PM +0200, Steven Bosscher wrote: On Fri, Aug 17, 2012 at 1:54 PM, Richard Guenther richard.guent...@gmail.com wrote: Well, another effect of reducing the size of BITMAP_WORD is that operations are not performed in a mode optimally using CPU regs (did you check code generation differences on a 64bit host?). I did, on x86_64 and on powerpc64. The effect is not dramatic, most of these machines can perform 32 bits operations just fine (I think the only exception would be alpha, maybe?). One this is testing a bit in a bitmap or setting it, except for old alpha I guess it shouldn't be noticeable. But then there are functions that iterate over the bitmap, where using larger type should be noticeable (I mean stuff like bitmap copying, logical operations on whole bitmap, etc.). Jakub
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 2:12 PM, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Aug 17, 2012 at 2:04 PM, Steven Bosscher stevenb@gmail.com wrote: On Fri, Aug 17, 2012 at 1:54 PM, Richard Guenther richard.guent...@gmail.com wrote: Well, another effect of reducing the size of BITMAP_WORD is that operations are not performed in a mode optimally using CPU regs (did you check code generation differences on a 64bit host?). I did, on x86_64 and on powerpc64. The effect is not dramatic, most of these machines can perform 32 bits operations just fine (I think the only exception would be alpha, maybe?). I wonder how bad code gets when we unconditionally use GCCs generic vector support to do Index: gcc/bitmap.c === --- gcc/bitmap.c(revision 190469) +++ gcc/bitmap.c(working copy) @@ -1446,6 +1446,17 @@ bitmap_compl_and_into (bitmap a, const_b unsigned ix; BITMAP_WORD ior = 0; +#if BITMAP_ELEMENT_WORDS == 5 + typedef v4si unsigned int __attribute__((vector_size((16; + v4si cleared4 = *(v4si *)a_elt-bits[0] *(v4si *)b_elt-bits[0]; + BITMAP_WORD cleared5 = a_elt-bits[4] b_elt-bits[4]; + v4si r4 = *(v4si *)b_elt-bits[0] ^ cleared4; + BITMAP_WORD r5 = b_elt-bits[4] ^ cleared5; + *(v4si *)a_elt-bits[0] = r4; + a_elt-bits[4] = r5; + ior4 |= r4; + ior5 |= r5; +#else for (ix = 0; ix BITMAP_ELEMENT_WORDS; ix++) of course with a proper #if GCC_VERSION. The theory is we should be able to lower it to the same scalar code or even v2si operations where only those are available ... Just an idea and eventually an opportunity to improve generic vector lowering if the above really does not work out. Or figure out if or why not the vectorizer does catch this (of course we do not enable that with -O2 which we eventually should in a very conservative mode). Richard. Richard. Ciao! Steven
Re: [bootstrap] Tentative fix for PR 54281
On 12-08-16 20:29 , Ian Lance Taylor wrote: I recommend that you add libintl.h to the AC_CHECK_HEADERS list in gcc/configure.ac Thanks. The attached patch implements the approach. Tested with --disable-nls and --enable-nls. Folks with 4.1 compilers, could you test it there? OK for trunk if testing succeeds? Thanks. Diego. commit 8583ba1f22ba310114bf64960f61f6bcc805e9c2 Author: Diego Novillo dnovi...@google.com Date: Thu Aug 16 14:27:49 2012 -0400 2012-08-17 Diego Novillo dnovi...@google.com PR bootstrap/54281 * configure.ac: Add libintl.h to AC_CHECK_HEADERS list. * config.in: Regenerate. * configure: Regenerate. * intl.h: Always include libintl.h if HAVE_LIBINTL_H is set. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9a81d1..43b0af7 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2012-08-17 Diego Novillo dnovi...@google.com + + PR bootstrap/54281 + * configure.ac: Add libintl.h to AC_CHECK_HEADERS list. + * config.in: Regenerate. + * configure: Regenerate. + * intl.h: Always include libintl.h if HAVE_LIBINTL_H is + set. + 2012-08-17 Richard Guenther rguent...@suse.de * bitmap.h (struct bitmap_element_def): GTY annotate next/prev. @@ -213,7 +222,7 @@ * config/tilegx/feedback.h: New file. * config/tilepro/feedback.h: New file. -2012-08-16 Diego Novillo dnovi...@google.com +2012-08-16 Diego Novillo dnovi...@google.com Revert diff --git a/gcc/config.in b/gcc/config.in index 6d986be..a9417df 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -1260,6 +1260,12 @@ #endif +/* Define to 1 if you have the libintl.h header file. */ +#ifndef USED_FOR_TARGET +#undef HAVE_LIBINTL_H +#endif + + /* Define to 1 if you have the limits.h header file. */ #ifndef USED_FOR_TARGET #undef HAVE_LIMITS_H diff --git a/gcc/configure b/gcc/configure index 1585bae..7f3489d 100755 --- a/gcc/configure +++ b/gcc/configure @@ -8248,7 +8248,7 @@ fi for ac_header in limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \ fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \ sys/resource.h sys/param.h sys/times.h sys/stat.h \ -direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h +direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h libintl.h do : as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh` ac_fn_c_check_header_preproc $LINENO $ac_header $as_ac_Header @@ -17742,7 +17742,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 17744 configure +#line 17745 configure #include confdefs.h #if HAVE_DLFCN_H @@ -17848,7 +17848,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 17850 configure +#line 17851 configure #include confdefs.h #if HAVE_DLFCN_H diff --git a/gcc/configure.ac b/gcc/configure.ac index 579d9a8..6bfbf35 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -912,7 +912,7 @@ AC_HEADER_SYS_WAIT AC_CHECK_HEADERS(limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \ fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \ sys/resource.h sys/param.h sys/times.h sys/stat.h \ -direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h) +direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h libintl.h) # Check for thread headers. AC_CHECK_HEADER(thread.h, [have_thread_h=yes], [have_thread_h=]) diff --git a/gcc/intl.h b/gcc/intl.h index c4db354..42ca873 100644 --- a/gcc/intl.h +++ b/gcc/intl.h @@ -27,8 +27,16 @@ # define setlocale(category, locale) (locale) #endif +/* If libintl.h is available, include it before testing for NLS. If we + are building with --disable-nls and another header file includes + libintl.h, the stubs defined down below will cause syntax errors + when parsing libintl.h. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54281 + for details. */ +#ifdef HAVE_LIBINTL_H +# include libintl.h +#endif + #ifdef ENABLE_NLS -#include libintl.h extern void gcc_init_libintl (void); extern size_t gcc_gettext_width (const char *); #else
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 02:16:12PM +0200, Richard Guenther wrote: Or figure out if or why not the vectorizer does catch this (of course we do not enable that with -O2 which we eventually should in a very conservative mode). It might be helpful if we for the BITMAP_ELEMENT_WORDS == 5 case reordered indx field after bits, so that bits array is 64-bit aligned. Jakub
Re: [bootstrap] Tentative fix for PR 54281
Hi! On Fri, Aug 17, 2012 at 08:23:02AM -0400, Diego Novillo wrote: --- a/gcc/intl.h +++ b/gcc/intl.h @@ -27,8 +27,16 @@ # define setlocale(category, locale) (locale) #endif +/* If libintl.h is available, include it before testing for NLS. If we + are building with --disable-nls and another header file includes + libintl.h, the stubs defined down below will cause syntax errors + when parsing libintl.h. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54281 + for details. */ +#ifdef HAVE_LIBINTL_H +# include libintl.h +#endif + #ifdef ENABLE_NLS -#include libintl.h extern void gcc_init_libintl (void); extern size_t gcc_gettext_width (const char *); #else Will that handle even the case where without --disable-nls intl/ creates its own libintl.h? Dunno which targets need that, but I'd guess configury wouldn't find it in that case. So perhaps it should be #if defined(HAVE_LIBINTL_H) || defined(ENABLE_NLS) ? Jakub
[PATCH] Fix m68k __sync_lock_test_and_set_1 definition
This fixes a warning about the definition not matching the builtin. Since the type is not actually used for accessing the data, this doesn't affect the generated code. Tested on m68k-linux and committed. Andreas. * config/m68k/linux-atomic.c (__sync_lock_test_and_set_1): Fix type. diff --git a/libgcc/config/m68k/linux-atomic.c b/libgcc/config/m68k/linux-atomic.c index 6e81d6b..a2bba59 100644 --- a/libgcc/config/m68k/linux-atomic.c +++ b/libgcc/config/m68k/linux-atomic.c @@ -1,5 +1,5 @@ /* Linux-specific atomic operations for m68k Linux. - Copyright (C) 2011 Free Software Foundation, Inc. + Copyright (C) 2011, 2012 Free Software Foundation, Inc. Based on code contributed by CodeSourcery for ARM EABI Linux. This file is part of GCC. @@ -207,5 +207,5 @@ SUBWORD_BOOL_CAS (unsigned char, 1) #define COMMA , WORD_SYNC_OP (test_and_set, , COMMA, oldval) -SUBWORD_SYNC_OP (test_and_set, , COMMA, unsigned short, 1, oldval) +SUBWORD_SYNC_OP (test_and_set, , COMMA, unsigned char, 1, oldval) SUBWORD_SYNC_OP (test_and_set, , COMMA, unsigned short, 2, oldval) -- 1.7.11.5 -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: Commit: BFIN: Fix use of VEC_last macro in bfin.c
On 12-08-17 03:10 , nick clifton wrote: Hi Diego, Thanks Nick. I made the wrong fix here, sorry about that. I will be making more changes to VEC_ shortly. What's a good way for me to test them? All I was doing was building a variety of targets, just to make sure that a local, generic patch of my own did not break anything. If you have the disk space then you could try doing the same yourself. This is great. Thanks! Often the problem I have is that I don't know what target triplets to use. We do not have a cheat sheet handy. Would it be too much imposition for you to put this on the wiki? This is useful for those patches that need to touch files that are only built for particular targets. Often, I don't really need to build the whole thing, I just need to know that the target files compile. Thanks. Diego.
Re: [PATCH][C++] Save memory and reallocations in name-lookup
On Fri, Aug 17, 2012 at 6:55 AM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Aug 17, 2012 at 06:41:37AM -0500, Gabriel Dos Reis wrote: I am however concerned with: static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; I would be more comfortable to see the cache be on per-scope (e.g. namespace scope) basis as opposed a blanket global cache stored in a global variable. It is not any kind of cache. It could be in theory an automatic variable vector pointer, it is only used during that function. The reason why it is static variable instead is just to avoid constant allocation/deallocation of the vector, this way after the first call it will be already allocated (but, upon entry to store_bindings will always be empty). Ah, OK. The patch is OK then. Thanks, -- Gaby
Re: [C PATCH] -Wsizeof-pointer-memaccess warning
On 08/16/2012 09:10 PM, Jakub Jelinek wrote: + switch (DECL_FUNCTION_CODE (callee)) +{ +case BUILT_IN_STRNCMP: +case BUILT_IN_STRNCASECMP: +case BUILT_IN_STRNCPY: +case BUILT_IN_STRNCAT: + strop = true; + /* FALLTHRU */ +case BUILT_IN_MEMCPY: +case BUILT_IN_MEMMOVE: +case BUILT_IN_MEMCMP: + if (VEC_length (tree, params) 3) + return; + src = VEC_index (tree, params, 1); + dest = VEC_index (tree, params, 0); + break; +case BUILT_IN_MEMSET: + if (VEC_length (tree, params) 3) + return; + dest = VEC_index (tree, params, 0); + break; +case BUILT_IN_STRNDUP: + src = VEC_index (tree, params, 0); + strop = true; + break; +default: + break; +} I think this would apply to memchr, snprintf, vsnprintf, stpncopy, bzero, bcopy, bcmp as well. Or are some of them already lowered at this point? -- Florian Weimer / Red Hat Product Security Team
Re: Commit: BFIN: Fix use of VEC_last macro in bfin.c
Hi Diego, Would it be too much imposition for you to put this on the wiki? I would be happy to. Where do you think would be appropriate ? I would guess that somewhere off the Getting Started page (http://gcc.gnu.org/wiki/GettingStarted) would be best. Maybe a new item in the Tutorials section, or an extra paragraph on the Testing_GCC page ? Cheers Nick
Re: [patch][RFC] 160-bits bitmap_element
On Fri, Aug 17, 2012 at 2:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Aug 17, 2012 at 02:16:12PM +0200, Richard Guenther wrote: Or figure out if or why not the vectorizer does catch this (of course we do not enable that with -O2 which we eventually should in a very conservative mode). It might be helpful if we for the BITMAP_ELEMENT_WORDS == 5 case reordered indx field after bits, so that bits array is 64-bit aligned. That's an excellent suggestion! With that change on top of my 5-int bitmap patch, I actually get a decent speed-up for this test case (5%, mostly due to less slow DF). I'm trying this idea with an otherwise unpatched compiler now. Ciao! Steven
Re: [C PATCH] -Wsizeof-pointer-memaccess warning
On Fri, Aug 17, 2012 at 02:52:34PM +0200, Florian Weimer wrote: +{ +case BUILT_IN_STRNCMP: +case BUILT_IN_STRNCASECMP: +case BUILT_IN_STRNCPY: +case BUILT_IN_STRNCAT: + strop = true; + /* FALLTHRU */ +case BUILT_IN_MEMCPY: +case BUILT_IN_MEMMOVE: +case BUILT_IN_MEMCMP: + if (VEC_length (tree, params) 3) +return; + src = VEC_index (tree, params, 1); + dest = VEC_index (tree, params, 0); + break; +case BUILT_IN_MEMSET: + if (VEC_length (tree, params) 3) +return; + dest = VEC_index (tree, params, 0); + break; +case BUILT_IN_STRNDUP: + src = VEC_index (tree, params, 0); + strop = true; + break; +default: + break; +} I think this would apply to memchr, snprintf, vsnprintf, stpncopy, bzero, bcopy, bcmp as well. Or are some of them already lowered at this point? s/stpncopy/stpncpy/, yeah, I guess, we could do that for those too, but 1) I'd prefer to wait for the C++ FE change to go in first 2) {,v}snprintf would be much harder than the rest, as the size argument then isn't the last argument to the function, and the FEs only provide the last one right now. The parser routine which parses arguments doesn't know which fn it will be for, recording sizeof_arg (including location_t) for all arguments would be an overkill, perhaps we could record instead the first sizeof argument + its position in the argument list. That would not catch say memset (ptr, sizeof (int), sizeof (ptr)); (as it would record 2nd argument, expr int and locus), but perhaps it is so rare to use sizeof on second argument of memset that it doesn't matter. On the other side it would help if somebody calls __builtin___memset_chk and similar explicitly (sizeof in that case isn't the last argument, but 3rd), the above switch then could handle also BUILT_IN_*CHK. That said, usually -D_FORTIFY_SOURCE is done through the string.h wrapper inlines and that is handled already (and tested in the testcases). Jakub
Re: [C PATCH] -Wsizeof-pointer-memaccess warning
On 08/17/2012 03:09 PM, Jakub Jelinek wrote: s/stpncopy/stpncpy/, yeah, I guess, we could do that for those too, but 1) I'd prefer to wait for the C++ FE change to go in first 2) {,v}snprintf would be much harder than the rest, as the size argument then isn't the last argument to the function, and the FEs only provide the last one right now. Could you pick the second argument for varargs functions? Incredibly hacky, but would do the trick for those two. Or does the FE not know at this point it is processing a varargs function? -- Florian Weimer / Red Hat Product Security Team
Re: [C PATCH] -Wsizeof-pointer-memaccess warning
On Fri, Aug 17, 2012 at 03:26:34PM +0200, Florian Weimer wrote: On 08/17/2012 03:09 PM, Jakub Jelinek wrote: s/stpncopy/stpncpy/, yeah, I guess, we could do that for those too, but 1) I'd prefer to wait for the C++ FE change to go in first 2) {,v}snprintf would be much harder than the rest, as the size argument then isn't the last argument to the function, and the FEs only provide the last one right now. Could you pick the second argument for varargs functions? Incredibly hacky, but would do the trick for those two. Or does the FE not know at this point it is processing a varargs function? Yeah, it doesn't know that. What could perhaps work for all the builtins that makes sense to warn for is to record the third argument if it is a sizeof, or, if that one isn't a sizeof, record the second argument if it is a sizeof. Jakub
Re: PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
On Fri, Aug 17, 2012 at 1:02 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 16, 2012 at 4:40 PM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Aug 16, 2012 at 3:09 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 16, 2012 at 1:50 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch replaces MEMBER_TYPE_FORCES_BLK with a target hook. I also pass the type to the target hook in addition to field, which will be used by i386 backend for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020 This patch doesn't change code generation. Tested on Linux/x86-64. I also tested cross compilers to ia64-hpux, powerpc-linux and xtensa-linux up to cc1. OK to install? Isn't the type just always DECL_FIELD_CONTEXT (field)? Btw, with C++ you no longer need ATTRIBUTE_UNUSED, just give the parameter no name. You are right. Now I don't need to replace MEMBER_TYPE_FORCES_BLK with a target hook for PR 20020. Since I already made the change, here is the updated patch. OK to install? If we don't want to convert MEMBER_TYPE_FORCES_BLK, I can submit a patch to use MEMBER_TYPE_FORCES_BLK in i386 backend. Please put the hook documentation in target.def. The macro - hook conversion is ok with that change. Any functional change has to go via target maintainers. Thanks, Richard. Here is the updated patch. Steve, David, Sterling, can you take a look at ia64, rs6000 and xtensa changes? Thanks. -- H.J. --- 2012-08-17 H.J. Lu hongjiu...@intel.com * stor-layout.c (compute_record_mode): Replace MEMBER_TYPE_FORCES_BLK with targetm.member_type_forces_blk. (layout_type): Likewise. * system.h: Poison MEMBER_TYPE_FORCES_BLK. * target.def (member_type_forces_blk): New target hook. * targhooks.c (default_member_type_forces_blk): New. * targhooks.h (default_member_type_forces_blk): Likewise. * doc/tm.texi.in (MEMBER_TYPE_FORCES_BLK): Removed. (TARGET_MEMBER_TYPE_FORCES_BLK): New hook. * doc/tm.texi: Regenerated. * config/ia64/hpux.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/ia64/ia64.c (ia64_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. (rs6000_member_type_forces_blk): New function. * config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/xtensa/xtensa.c (xtensa_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/xtensa/xtensa.h (MEMBER_TYPE_FORCES_BLK): Removed. gcc-blkmode-3.patch Description: Binary data
Re: PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
On Fri, Aug 17, 2012 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote: Here is the updated patch. Steve, David, Sterling, can you take a look at ia64, rs6000 and xtensa changes? * config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. (rs6000_member_type_forces_blk): New function. * config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed. The rs6000 part of the patch looks okay. Thanks, David
[patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
PR54295 shows that widening multiply-accumulate operations can end up with incorrect code. The path to the failure is as follows 1) Compiler detects a widening multiply operation and generates the correct widening-multiply operation. This involves stripping off the widening cast to leave the underlying operands. That is X = (c1) a * (c2) b = X = a w* b the type of w is picked based on the types of a and b; if both are signed a signed multiply-extend operation is used; if both are unsigned an unsigned multiply-extend is used. If they differ a signed/unsigned multiply is used, if it exists. If either a or b is a positive constant it is coerced, if possible, into the type of the other operand. 2) The compiler then notices that the result, X is used in an addition operation Y = X + d As X is a widening multiply, the compiler then looks inside it to try and generate a widening-multiply-accumulate operation. However, it passes the rewritten expression (X = a w* b) to is_widening_mult_p and that routine does not correctly check what type of multiply is being done: the code blindly tries to strip of an conversion operations. The failure happens when a is also the result of a cast operation, for example, being cast from an unsigned to an int. The compiler then re-formulates a signed multiply-extend into an unsigned multiply-extend. That is, if a = (int) e // typeof(e) == unsigned Then instead of Y = WIDEN_MULT_PLUS (a, b, d) we end up with Y = WIDEN_MULT_PLUS (e, b, d) The fix is to make is_widening_mult_p note that it has been called with a WIDEN_MULT_EXPR and rather than decompose the operands again, to simply extract the existing operands, which have already been formulated correctly for a widening multiply operation. PR tree-ssa/54295 * tree-ssa-math-opts.c (is_widening_mult_p): Short-circuit when the stmt is already a widening multiply. Testsuite PR tree-ssa/54295 * gcc.c-torture/execute/20120817-1.c: New test. OK to commit once testing has completed? --- tree-ssa-math-opts.c(revision 190462) +++ tree-ssa-math-opts.c(local) @@ -2039,6 +2039,25 @@ is_widening_mult_p (gimple stmt, TREE_CODE (type) != FIXED_POINT_TYPE) return false; + /* If we already have a widening multiply expression, simply extract the + operands. */ + if (gimple_assign_rhs_code (stmt) == WIDEN_MULT_EXPR) +{ + *rhs1_out = gimple_assign_rhs1 (stmt); + *rhs2_out = gimple_assign_rhs2 (stmt); + if (TREE_CODE (*rhs1_out) == INTEGER_CST) + *type1_out = TREE_TYPE (*rhs2_out); + else + *type1_out = TREE_TYPE (*rhs1_out); + + if (TREE_CODE (*rhs2_out) == INTEGER_CST) + *type2_out = TREE_TYPE (*rhs1_out); + else + *type2_out = TREE_TYPE (*rhs2_out); + + return true; +} + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out, rhs1_out)) return false; Index: 20120817-1.c === --- 20120817-1.c(revision 0) +++ 20120817-1.c(revision 0) @@ -0,0 +1,14 @@ +typedef unsigned long long u64; +unsigned long foo = 0; +u64 f() __attribute__((noinline)); + +u64 f() { + return ((u64)40) + ((u64) 24) * (int)(foo - 1); +} + +int main () +{ + if (f () != 16) +abort (); + exit (0); +}
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 15:04, Richard Earnshaw wrote: The fix is to make is_widening_mult_p note that it has been called with a WIDEN_MULT_EXPR and rather than decompose the operands again, to simply extract the existing operands, which have already been formulated correctly for a widening multiply operation. As long as the existing test cases work, I think the only problem with this idea is if some architecture has a wider range of widening multiply-and-accumulate than it does plain widening multiply. If no such architecture exists then this is fine with me, for whatever that's worth. IIRC, ARM is one of only two architectures (in GCC) with widening multiplies that widen more than twice the width, and, last I looked, the only one that has the patterns to use this code, so it's probably safe. Andrew
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 15:22, Andrew Stubbs wrote: On 17/08/12 15:04, Richard Earnshaw wrote: The fix is to make is_widening_mult_p note that it has been called with a WIDEN_MULT_EXPR and rather than decompose the operands again, to simply extract the existing operands, which have already been formulated correctly for a widening multiply operation. As long as the existing test cases work, I think the only problem with this idea is if some architecture has a wider range of widening multiply-and-accumulate than it does plain widening multiply. But surely in that case step1 wouldn't have applied and we'd then be looking at a MULT_EXPR not a WIDEN_MULT_EXPR. R.
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 15:31, Richard Earnshaw wrote: On 17/08/12 15:22, Andrew Stubbs wrote: On 17/08/12 15:04, Richard Earnshaw wrote: The fix is to make is_widening_mult_p note that it has been called with a WIDEN_MULT_EXPR and rather than decompose the operands again, to simply extract the existing operands, which have already been formulated correctly for a widening multiply operation. As long as the existing test cases work, I think the only problem with this idea is if some architecture has a wider range of widening multiply-and-accumulate than it does plain widening multiply. But surely in that case step1 wouldn't have applied and we'd then be looking at a MULT_EXPR not a WIDEN_MULT_EXPR. Not necessarily. For example, it will use a 32x32-64 signed widening multiply for 16 bit unsigned data if there is no unsigned 16x16-64 option. Hypothetically, if there happened to be an unsigned 16x16-64 multiply-and-accumulate then you'd end up masking it. Like I said though, cross that bridge if it ever comes to it. Andrew
Re: [bootstrap] Tentative fix for PR 54281
On Fri, Aug 17, 2012 at 5:32 AM, Jakub Jelinek ja...@redhat.com wrote: Will that handle even the case where without --disable-nls intl/ creates its own libintl.h? Dunno which targets need that, but I'd guess configury wouldn't find it in that case. So perhaps it should be #if defined(HAVE_LIBINTL_H) || defined(ENABLE_NLS) ? Makes sense to me. Ian
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 15:39, Andrew Stubbs wrote: On 17/08/12 15:31, Richard Earnshaw wrote: On 17/08/12 15:22, Andrew Stubbs wrote: On 17/08/12 15:04, Richard Earnshaw wrote: The fix is to make is_widening_mult_p note that it has been called with a WIDEN_MULT_EXPR and rather than decompose the operands again, to simply extract the existing operands, which have already been formulated correctly for a widening multiply operation. As long as the existing test cases work, I think the only problem with this idea is if some architecture has a wider range of widening multiply-and-accumulate than it does plain widening multiply. But surely in that case step1 wouldn't have applied and we'd then be looking at a MULT_EXPR not a WIDEN_MULT_EXPR. Not necessarily. For example, it will use a 32x32-64 signed widening multiply for 16 bit unsigned data if there is no unsigned 16x16-64 option. Hypothetically, if there happened to be an unsigned 16x16-64 multiply-and-accumulate then you'd end up masking it. Like I said though, cross that bridge if it ever comes to it. Andrew You've lost me. If we don't have a 16x16-64 mult operation then after step 1 we'll still have a MULT_EXPR, not a WIDEN_MULT_EXPR, so when we reach step2 there's nothing to short circuit. Unless, of course, you're expecting us to get step1 - 16x16-32 widen mult step2 - widen64(step1) + acc64 But even this is only safe iff widen64 is the same type of widening as the original widening step, and we determine that by looking at the operands of the widening mult, not at any casts on them. The key thing is that the type of multiply that we want is based on the types of the operands, not on the type of the result. So it's essential we don't strip type conversions beyond the widening conversion. R.
Re: Commit: BFIN: Fix use of VEC_last macro in bfin.c
On Fri, 17 Aug 2012, Diego Novillo wrote: On 12-08-17 03:10 , nick clifton wrote: Hi Diego, Thanks Nick. I made the wrong fix here, sorry about that. I will be making more changes to VEC_ shortly. What's a good way for me to test them? All I was doing was building a variety of targets, just to make sure that a local, generic patch of my own did not break anything. If you have the disk space then you could try doing the same yourself. This is great. Thanks! Often the problem I have is that I don't know what target triplets to use. We do not have a cheat sheet handy. The cheat sheet for testing lots of targets (testing that cc1 etc. build) is contrib/config-list.mk. The targets there are supposed to cover all significant variations in what target headers are used; even if people have added target variations without always adding to the list, it's a pretty good approximation. Note that when using this file you need to have a recent trunk build in your PATH so that --enable-werror-always works properly. The baseline state is not generally 100% clean; see bug 47093 for a meta-bug whose dependencies track known issues (some of those dependencies may of course be out of date - that is, issues that have been fixed). It would certainly be good to get things down to a clean state and have an autobuilder reporting regressions building any supported target. (Of course greater levels of testing are possible, e.g. building binutils and seeing if libgcc builds, but getting cc1 building cleanly would be a good first step.) -- Joseph S. Myers jos...@codesourcery.com
Re: Reproducible gcc builds, gfortran, and -grecord-gcc-switches
On Fri, 17 Aug 2012, Simon Baldwin wrote: You could have just added case OPT_cpp_: to the switch in gen_producer_string, instead of all this. Thanks. I was under the impression, apparently mistaken, that OPT_cpp_ exists only if fortran is enabled. OPT_* for Fortran options only exist when the Fortran front-end is in the source tree (whether or not enabled). I think we try to avoid knowingly breaking use cases where people remove some front ends from the source tree, although we don't actively test them and no longer provide split-up source tarballs. -- Joseph S. Myers jos...@codesourcery.com
Re: [bootstrap] Tentative fix for PR 54281
On 12-08-17 08:32 , Jakub Jelinek wrote: Hi! On Fri, Aug 17, 2012 at 08:23:02AM -0400, Diego Novillo wrote: --- a/gcc/intl.h +++ b/gcc/intl.h @@ -27,8 +27,16 @@ # define setlocale(category, locale) (locale) #endif +/* If libintl.h is available, include it before testing for NLS. If we + are building with --disable-nls and another header file includes + libintl.h, the stubs defined down below will cause syntax errors + when parsing libintl.h. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54281 + for details. */ +#ifdef HAVE_LIBINTL_H +# include libintl.h +#endif + #ifdef ENABLE_NLS -#include libintl.h extern void gcc_init_libintl (void); extern size_t gcc_gettext_width (const char *); #else Will that handle even the case where without --disable-nls intl/ creates its own libintl.h? Dunno which targets need that, but I'd guess configury wouldn't find it in that case. So perhaps it should be #if defined(HAVE_LIBINTL_H) || defined(ENABLE_NLS) ? Sounds reasonable. Amended patch attached. Re-tested with --disable-nls. OK for trunk? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9a81d1..43b0af7 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2012-08-17 Diego Novillo dnovi...@google.com + + PR bootstrap/54281 + * configure.ac: Add libintl.h to AC_CHECK_HEADERS list. + * config.in: Regenerate. + * configure: Regenerate. + * intl.h: Always include libintl.h if HAVE_LIBINTL_H is + set. + 2012-08-17 Richard Guenther rguent...@suse.de * bitmap.h (struct bitmap_element_def): GTY annotate next/prev. @@ -213,7 +222,7 @@ * config/tilegx/feedback.h: New file. * config/tilepro/feedback.h: New file. -2012-08-16 Diego Novillo dnovi...@google.com +2012-08-16 Diego Novillo dnovi...@google.com Revert diff --git a/gcc/config.in b/gcc/config.in index 6d986be..a9417df 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -1260,6 +1260,12 @@ #endif +/* Define to 1 if you have the libintl.h header file. */ +#ifndef USED_FOR_TARGET +#undef HAVE_LIBINTL_H +#endif + + /* Define to 1 if you have the limits.h header file. */ #ifndef USED_FOR_TARGET #undef HAVE_LIMITS_H diff --git a/gcc/configure b/gcc/configure index 1585bae..7f3489d 100755 --- a/gcc/configure +++ b/gcc/configure @@ -8248,7 +8248,7 @@ fi for ac_header in limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \ fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \ sys/resource.h sys/param.h sys/times.h sys/stat.h \ -direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h +direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h libintl.h do : as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh` ac_fn_c_check_header_preproc $LINENO $ac_header $as_ac_Header @@ -17742,7 +17742,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 17744 configure +#line 17745 configure #include confdefs.h #if HAVE_DLFCN_H @@ -17848,7 +17848,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 17850 configure +#line 17851 configure #include confdefs.h #if HAVE_DLFCN_H diff --git a/gcc/configure.ac b/gcc/configure.ac index 579d9a8..6bfbf35 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -912,7 +912,7 @@ AC_HEADER_SYS_WAIT AC_CHECK_HEADERS(limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \ fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \ sys/resource.h sys/param.h sys/times.h sys/stat.h \ -direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h) +direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h libintl.h) # Check for thread headers. AC_CHECK_HEADER(thread.h, [have_thread_h=yes], [have_thread_h=]) diff --git a/gcc/intl.h b/gcc/intl.h index c4db354..03be420 100644 --- a/gcc/intl.h +++ b/gcc/intl.h @@ -27,8 +27,16 @@ # define setlocale(category, locale) (locale) #endif +/* If libintl.h is available, include it before testing for NLS. If we + are building with --disable-nls and another header file includes + libintl.h, the stubs defined down below will cause syntax errors + when parsing libintl.h. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54281 + for details. */ +#if defined(HAVE_LIBINTL_H) || defined(ENABLE_NLS) +# include libintl.h +#endif + #ifdef ENABLE_NLS -#include libintl.h extern void gcc_init_libintl (void); extern size_t gcc_gettext_width (const char *); #else
Re: Commit: BFIN: Fix use of VEC_last macro in bfin.c
On 12-08-17 10:52 , Joseph S. Myers wrote: On Fri, 17 Aug 2012, Diego Novillo wrote: On 12-08-17 03:10 , nick clifton wrote: Hi Diego, Thanks Nick. I made the wrong fix here, sorry about that. I will be making more changes to VEC_ shortly. What's a good way for me to test them? All I was doing was building a variety of targets, just to make sure that a local, generic patch of my own did not break anything. If you have the disk space then you could try doing the same yourself. This is great. Thanks! Often the problem I have is that I don't know what target triplets to use. We do not have a cheat sheet handy. The cheat sheet for testing lots of targets (testing that cc1 etc. build) is contrib/config-list.mk. Thanks. We need a much better mechanism for documenting and advertising the stuff in contrib/. I've been wanting such a facility for ages (never quite got around to asking until now, though). Diego.
[AArch64] Use effective-target keyword to test for endianness
Hi, I've just committed the attached patch on the AArch64 branch to use effective-target keyword to test for endianness. Thanks Sofiane - r190482 | sofiane | 2012-08-17 16:02:20 +0100 (Fri, 17 Aug 2012) | 9 lines [AArch64] Use effective-target to check for big endian gcc/testsuite * lib/target-supports.exp (check_effective_target_aarch64_big_endian): New. * gcc.target/aarch64/aapcs64/func-ret-3.c: Add dg-require-effective-target. * gcc.target/aarch64/aapcs64/func-ret-4.c: Likewise. * gcc.target/aarch64/aapcs64/func-ret-3.x: Delete. * gcc.target/aarch64/aapcs64/func-ret-4.x: Delete. aarch64-be-effective-target.patch Description: Binary data
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 15:47, Richard Earnshaw wrote: If we don't have a 16x16-64 mult operation then after step 1 we'll still have a MULT_EXPR, not a WIDEN_MULT_EXPR, so when we reach step2 there's nothing to short circuit. Unless, of course, you're expecting us to get step1 - 16x16-32 widen mult step2 - widen64(step1) + acc64 No, given a u16xu16-u64 operation in the code, and that the arch doesn't have such an opcode, I'd expect to get step1 - (u32)u16 x (u32)u16 - u64 Likewise, 8x8-32 might give (16)8x(16)8-32. The code can't see that the widening operation is non-optimal without looking beyond into its inputs. Andrew
Re: [bootstrap] Tentative fix for PR 54281
On Fri, Aug 17, 2012 at 10:58:25AM -0400, Diego Novillo wrote: Sounds reasonable. Amended patch attached. OK for trunk? Yes, except for: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9a81d1..43b0af7 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -213,7 +222,7 @@ * config/tilegx/feedback.h: New file. * config/tilepro/feedback.h: New file. -2012-08-16 Diego Novillo dnovi...@google.com +2012-08-16 Diego Novillo dnovi...@google.com Revert the above hunk. Jakub
[AArch64] implement vmlsq_laneq_*
Hi, I've just committed the attached patch on the AArch64 branch to implement vmlsq_laneq_*. Thanks Sofiane - r190484 | sofiane | 2012-08-17 16:15:49 +0100 (Fri, 17 Aug 2012) | 11 lines Tejas Belagod tejas.bela...@arm.com [AArch64] Implement vmlsq_laneq_*. gcc/ * config/aarch64/arm_neon.h (vmlsq_laneq_f32, vmlsq_laneq_s16, vmlsq_laneq_u16, vmlsq_laneq_s32, vmlsq_laneq_u32): New. testsuite/ * gcc.target/aarch64/vmlsq_laneq.c: New. aarch64-vmlsq-laneq.patch Description: Binary data
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 16:06, Andrew Stubbs wrote: On 17/08/12 15:47, Richard Earnshaw wrote: If we don't have a 16x16-64 mult operation then after step 1 we'll still have a MULT_EXPR, not a WIDEN_MULT_EXPR, so when we reach step2 there's nothing to short circuit. Unless, of course, you're expecting us to get step1 - 16x16-32 widen mult step2 - widen64(step1) + acc64 No, given a u16xu16-u64 operation in the code, and that the arch doesn't have such an opcode, I'd expect to get step1 - (u32)u16 x (u32)u16 - u64 Hmm, I would have thought that would be more costly than (u64)(u16 x u16 - u32) Likewise, 8x8-32 might give (16)8x(16)8-32. The code can't see that the widening operation is non-optimal without looking beyond into its inputs. Ok, in which case we have to give is_widening_mult_rhs_p enough smarts to not strip (s32)u32 and return u32. I'll have another think about it. R.
[AArch64] implement FSQRT in RTL.
Hi, I've just committed the attached patch on the AArch64 branch to implement FSQRT in RTL. Thanks Sofiane - r190485 | sofiane | 2012-08-17 16:22:28 +0100 (Fri, 17 Aug 2012) | 12 lines 2012-08-17 Tejas Belagod tejas.bela...@arm.com [AArch64] Implement FSQRT in RTL. * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_data): Add sqrt to the list of intrinsic descriptors. * config/aarch64/aarch64-simd.md (sqrtmode2): Insn pattern for sqrt. (aarch64_sqrtmode): Builtin expansion. * config/aarch64/arm_neon.h: Remove asm implementations of vsqrt. Add builtin implementation of vsqrt. aarch64-fsqrt.patch Description: Binary data
[AArch64] Do not mix statements with declarations
Hi, I've just committed the attached patch on the AArch64 branch to fix a style issue related to mixing statements with declarations. Thanks Sofiane - r190486 | sofiane | 2012-08-17 16:26:47 +0100 (Fri, 17 Aug 2012) | 7 lines 2012-08-17 Marcus Shawcroft marcus.shawcr...@arm.com [AArch64] Do not mix statements and declarations. * config/aarch64/aarch64.c (aarch64_simd_lane_bounds): Do not mix statements with declarations. aarch64-do-not-mix-stmt-decl.patch Description: Binary data
Re: [bootstrap] Tentative fix for PR 54281
On Fri, Aug 17, 2012 at 11:05 AM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Aug 17, 2012 at 10:58:25AM -0400, Diego Novillo wrote: Sounds reasonable. Amended patch attached. OK for trunk? Yes, except for: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9a81d1..43b0af7 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -213,7 +222,7 @@ * config/tilegx/feedback.h: New file. * config/tilepro/feedback.h: New file. -2012-08-16 Diego Novillo dnovi...@google.com +2012-08-16 Diego Novillo dnovi...@google.com Revert the above hunk. Done. Committed. Thanks. Diego.
Re: PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
Here is the updated patch. Steve, David, Sterling, can you take a look at ia64, rs6000 and xtensa changes? Thanks. The ia64 part looks OK. Steve Ellcey sell...@mips.com
Re: [PATCH] Set correct source location for deallocator calls
On 2012-08-10 20:38, Dehao Chen wrote: + // { dg-final { scan-assembler 1 28 0 } } This test case isn't going to work except with dwarf2, and with gas. You can use -dA so that you can scan for file.c:line. There are other examples in the testsuite. This doesn't belong in guality. It belongs in g++.dg/debug/. It would be nice if you could add a java testcase to see that it doesn't regress. ! record_in_goto_queue_label (struct leh_tf_state *tf, treemple stmt, tree label, ! location_t location) BTW, for the future, please fix your mailer to not wrap lines. + /* For call expressions inside FINALL/CATCH block, if its location +is unknown, gimplify_call_expr will set it to input_location. +However, these calls are automatically generated to destructors. +And they may be cloned to many places. In this case, we will +set the location for them in tree-eh.c. But to ensure that EH +does the right job, we first need mark their location as +UNKNOWN_LOCATION. */ + input_location = UNKNOWN_LOCATION; I'll quibble with the wording here. It reads as if we want to force all calls to have UNKNOWN_LOC, whereas all we want is to prevent any calls that already have UNKNOWN_LOC from gaining input_loc via gimplify_call_expr. r~
Re: PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
On Fri, Aug 17, 2012 at 6:59 AM, David Edelsohn dje@gmail.com wrote: On Fri, Aug 17, 2012 at 9:39 AM, H.J. Lu hjl.to...@gmail.com wrote: Here is the updated patch. Steve, David, Sterling, can you take a look at ia64, rs6000 and xtensa changes? * config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. (rs6000_member_type_forces_blk): New function. * config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed. The rs6000 part of the patch looks okay. Thanks, David xtensa portions look ok.
[RFC] Warning for potentially unbound writes to function parameters
In some real-world code, I noticed a curious pattern: using the unsafe string functions on function parameter arguments. This leads to gets()-style unsafe APIs. I've looked at how to implement a warning for this, and came up with the attached patch. Do you think this makes sense? 1 #include string.h 2 3 const char *data (void); 4 5 void test (char *target) 6 { 7strcpy(target, data ()); 8 } 9 10 11 void test_2 (char *target) 12 { 13char *p = target; 14strcpy(p, data ()); 15 } 16 /tmp/t.c: In function ‘test’: /tmp/t.c:7:9: warning: potentially unbound write to function parameter ‘target’ [-Wunbound-parameter-write] strcpy(target, data ()); ^ /tmp/t.c: In function ‘test_2’: /tmp/t.c:14:9: warning: potentially unbound write to function parameter ‘target’ [-Wunbound-parameter-write] strcpy(p, data ()); ^ Obviously, the warning and its name need adjusting, and more functions need to be covered. But I want to check first if you think the warning makes sense at all, and if I've found the right place to implement it (this approach seems to require optimization, alas). -- Florian Weimer / Red Hat Product Security Team commit 324c7189c9cf871584da988f12d1a686df0d6e0c Author: Florian Weimer fwei...@redhat.com Date: Fri Aug 17 18:19:13 2012 +0200 Implement -Wunbound-parameter-write (proof of concept) diff --git a/gcc/builtins.c b/gcc/builtins.c index 4b177c4..dc90484 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3274,6 +3274,14 @@ expand_builtin_strcpy (tree exp, rtx target) { tree dest = CALL_EXPR_ARG (exp, 0); tree src = CALL_EXPR_ARG (exp, 1); + if (TREE_CODE (dest) == SSA_NAME) + { + tree dest_var = SSA_NAME_VAR (dest); + if (TREE_CODE (dest_var) == PARM_DECL) + warning_at (EXPR_LOCATION (exp), OPT_Wunbound_parameter_write, + potentially unbound write to function parameter %qD, + dest_var); + } return expand_builtin_strcpy_args (dest, src, target); } return NULL_RTX; diff --git a/gcc/common.opt b/gcc/common.opt index deb89e3..fe892b7 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -562,6 +562,10 @@ Wlarger-than= Common RejectNegative Joined UInteger Warning -Wlarger-than=number Warn if an object is larger than number bytes +Wunbound-parameter-write +Common Var(warn_unbound_parameter_write) Warning +Warn if a function without array bounds checking writes to a pointer passed as an parameter + Wunsafe-loop-optimizations Common Var(warn_unsafe_loop_optimizations) Warning Warn if the loop cannot be optimized due to nontrivial assumptions.
[PATCH, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
Hello! Attached patch fixes one of the critical problems in combine.c: combine pass blindly propagates hard registers into insn patterns and this way creates partially invalid instructions. Most of the times, reload is able to fix these inconsistencies, but operands with invalid hard registers block IRA/reload to allocate most appropriate registers from critically limited register sets, leading to spill failures. To counter these problems, x86 port invented operand predicates like reg_not_xmm0_operand and derivatives that prevented combine pass from propagating invalid hard regs. This approach in fact papered over real problem in the gcc infrastructure. Attached patch introduces operand constraint checks directly into recog_for_combine. Operands of valid instructions are checked against their constraints and combined insn is rejected if they doesn't fit. IMO, this patch is prerequisite for enabling scheduling pass on x86. Proposed priority scheduling of function operands patch will hopefully fix other issues with limited register set hard register allocations. 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * combine.c (recog_for_combine): Check operand constraints in case hard registers were propagater into insn pattern. testsuite/ChangeLog: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * gcc.target/i386/pr46829.c: New test. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. OK for mainline? Uros. Index: combine.c === --- combine.c (revision 190480) +++ combine.c (working copy) @@ -10507,6 +10507,7 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn int i; rtx notes = 0; rtx old_notes, old_pat; + int old_icode; /* If PAT is a PARALLEL, check to see if it contains the CLOBBER we use to indicate that something didn't match. If we find such a @@ -10566,6 +10567,7 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn print_rtl_single (dump_file, pat); } } + PATTERN (insn) = old_pat; REG_NOTES (insn) = old_notes; @@ -10607,6 +10609,86 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn pat = newpat; } + old_pat = PATTERN (insn); + old_notes = REG_NOTES (insn); + old_icode = INSN_CODE (insn); + PATTERN (insn) = pat; + REG_NOTES (insn) = notes; + + /* Check operand constraints in case hard registers were propagated + into insn pattern. This check prevents combine pass from + generating insn patterns with invalid hard register operands. + These invalid insns can eventually confuse reload to error out + with a spill failure. See also PR 46829. */ + if (insn_code_number = 0 + insn_code_number != NOOP_MOVE_INSN_CODE + (INSN_CODE (insn) = recog (PATTERN (insn), insn, 0)) = 0) +{ + extract_insn (insn); + preprocess_constraints (); + + for (i = 0; i recog_data.n_operands; i++) + { + rtx op = recog_data.operand[i]; + enum machine_mode mode = GET_MODE (op); + struct operand_alternative *op_alt; + int offset = 0; + bool win; + int j; + + /* A unary operator may be accepted by the predicate, but it +is irrelevant for matching constraints. */ + if (UNARY_P (op)) + op = XEXP (op, 0); + + if (GET_CODE (op) == SUBREG) + { + if (REG_P (SUBREG_REG (op)) + REGNO (SUBREG_REG (op)) FIRST_PSEUDO_REGISTER) + offset = subreg_regno_offset (REGNO (SUBREG_REG (op)), + GET_MODE (SUBREG_REG (op)), + SUBREG_BYTE (op), + GET_MODE (op)); + op = SUBREG_REG (op); + } + + if (!(REG_P (op) HARD_REGISTER_P (op))) + continue; + + op_alt = recog_op_alt[i]; + + win = false; + for (j = 0; j recog_data.n_alternatives; j++) + { + if (op_alt[j].anything_ok + || (op_alt[j].matches != -1 + rtx_equal_p (recog_data.operand[j], + recog_data.operand[op_alt[j].matches])) + || (reg_fits_class_p (op, op_alt[j].cl, offset, mode))) + { + win = true; + break; + } + } + + if (!win) + { + if (dump_file (dump_flags TDF_DETAILS)) + { + fputs (Operand failed to match constraints:\n, +dump_file); + print_rtl_single (dump_file, op); + } + insn_code_number = -1; + break; + } + } +} + + PATTERN (insn) = old_pat; + REG_NOTES (insn) = old_notes; + INSN_CODE (insn) = old_icode; +
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 16:20, Richard Earnshaw wrote: Ok, in which case we have to give is_widening_mult_rhs_p enough smarts to not strip (s32)u32 and return u32. I'll have another think about it. Take two. This version should address your concerns about handling (u32)u16 * (u32)u16 - u64 We now look at each operand directly, but when doing so we check whether the operand is the same size as the result or not. When it is, we can strip any conversion; when it isn't the conversion must preserve signedness of the inner operand and mustn't be a narrowing conversion. * tree-ssa-math-opts.c (widening_mult_conversion_strippable_p): New function. (is_widening_mult_rhs_p): Use it. Testing underway (again) OK? R. --- tree-ssa-math-opts.c(revision 190502) +++ tree-ssa-math-opts.c(local) @@ -1958,6 +1958,43 @@ struct gimple_opt_pass pass_optimize_bsw } }; +/* Return true if stmt is a type conversion operation that can be stripped + when used in a widening multiply operation. */ +static bool +widening_mult_conversion_strippable_p (tree result_type, gimple stmt) +{ + enum tree_code rhs_code = gimple_assign_rhs_code (stmt); + + if (TREE_CODE (result_type) == INTEGER_TYPE) +{ + tree op_type; + tree inner_op_type; + + if (!CONVERT_EXPR_CODE_P (rhs_code)) + return false; + + op_type = TREE_TYPE (gimple_assign_lhs (stmt)); + + /* If the type of OP has the same precision as the result, then +we can strip this conversion. The multiply operation will be +selected to create the correct extension as a by-product. */ + if (TYPE_PRECISION (result_type) == TYPE_PRECISION (op_type)) + return true; + + /* We can also strip a conversion if it preserves the signed-ness of +the operation and doesn't narrow the range. */ + inner_op_type = TREE_TYPE (gimple_assign_rhs1 (stmt)); + + if (TYPE_UNSIGNED (op_type) == TYPE_UNSIGNED (inner_op_type) + TYPE_PRECISION (op_type) TYPE_PRECISION (inner_op_type)) + return true; + + return false; +} + + return rhs_code == FIXED_CONVERT_EXPR; +} + /* Return true if RHS is a suitable operand for a widening multiplication, assuming a target type of TYPE. There are two cases: @@ -1982,9 +2019,7 @@ is_widening_mult_rhs_p (tree type, tree if (is_gimple_assign (stmt)) { rhs_code = gimple_assign_rhs_code (stmt); - if (TREE_CODE (type) == INTEGER_TYPE - ? !CONVERT_EXPR_CODE_P (rhs_code) - : rhs_code != FIXED_CONVERT_EXPR) + if (! widening_mult_conversion_strippable_p (type, stmt)) rhs1 = rhs; else {
Re: [google] Modification of gcov pmu format to reduce gcda size bloat (issue 6427063)
Hi Rong, Could you take a look at the patch I mailed to gcc-patches when you get a chance? It reduces the gcda size bloat issue by replacing gcov_pmu data with a filetag field that holds the position of the correct filename inside of the newly added string table. Thanks, Chris http://codereview.appspot.com/6427063/
Re: [AArch64] implement FSQRT in RTL.
On Fri, Aug 17, 2012 at 8:26 AM, Sofiane Naci sofiane.n...@arm.com wrote: Hi, I've just committed the attached patch on the AArch64 branch to implement FSQRT in RTL. Maybe add a testcase which tests that sqrt gets vectorized? Thanks, Andrew Thanks Sofiane - r190485 | sofiane | 2012-08-17 16:22:28 +0100 (Fri, 17 Aug 2012) | 12 lines 2012-08-17 Tejas Belagod tejas.bela...@arm.com [AArch64] Implement FSQRT in RTL. * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_data): Add sqrt to the list of intrinsic descriptors. * config/aarch64/aarch64-simd.md (sqrtmode2): Insn pattern for sqrt. (aarch64_sqrtmode): Builtin expansion. * config/aarch64/arm_neon.h: Remove asm implementations of vsqrt. Add builtin implementation of vsqrt.
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 08/15/2012 11:25 AM, Gabriel Dos Reis wrote: On Wed, Aug 15, 2012 at 1:21 PM, Tom Tromey tro...@redhat.com wrote: Gaby == Gabriel Dos Reis g...@integrable-solutions.net writes: Tom I asked Keith to resurrect his patch for this. [snip] I hope this will be acceptable all around. OK, that sounds reasonable. I have committed a patch which should allow you to do this and closed c++/13356. If there are any further problems/issues, please let me know. We've re-purposed an old, unused option, check type. This option does absolutely nothing in older versions of gdb, so setting it unconditionally is safe for all recent versions of gdb: set check type off. Keith
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Fri, Aug 17, 2012 at 10:45 AM, Keith Seitz kei...@redhat.com wrote: On 08/15/2012 11:25 AM, Gabriel Dos Reis wrote: On Wed, Aug 15, 2012 at 1:21 PM, Tom Tromey tro...@redhat.com wrote: Gaby == Gabriel Dos Reis g...@integrable-solutions.net writes: Tom I asked Keith to resurrect his patch for this. [snip] I hope this will be acceptable all around. OK, that sounds reasonable. I have committed a patch which should allow you to do this and closed c++/13356. If there are any further problems/issues, please let me know. We've re-purposed an old, unused option, check type. This option does absolutely nothing in older versions of gdb, so setting it unconditionally is safe for all recent versions of gdb: set check type off. Shouldn't it be added to GCC .gdbinit? -- H.J.
Re: [PATCH][C++] Get rid of TREE_CHAIN use for TREE_VEC (NON_DEFAULT_TEMPLATE_ARGS_COUNT)
On Aug 17, 2012, at 6:58 AM, Paolo Carlini wrote: On 08/17/2012 01:26 PM, Richard Guenther wrote: This gets rid of this field, pushing it into a short int in tree_base (hopefully 2^16 non-defaulted template args are enough ...). Honestly, I don't think it's a trivial issue. Love to hear from Jason, but, my take would be 2^16 should be enough for anyone. I think long before people hit that limit, they would merely aggregate arguments into classes and structures. I think in another 20-80 years, we might want to bump it back up to 32 bits, but... I think we can safely wait until we get a compelling bug report for it.
Re: [PATCH, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
On Fri, Aug 17, 2012 at 6:36 PM, Uros Bizjak ubiz...@gmail.com wrote: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * combine.c (recog_for_combine): Check operand constraints in case hard registers were propagater into insn pattern. testsuite/ChangeLog: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * gcc.target/i386/pr46829.c: New test. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Oh ... This part: + rtx_equal_p (recog_data.operand[j], + recog_data.operand[op_alt[j].matches])) should read: + rtx_equal_p (recog_data.operand[i], + recog_data.operand[op_alt[j].matches])) Note the j vs i array index difference in the first line. Correct patch attached, bootstrapped and re-tested on x86_64-pc-linux-gnu {,-m32}. Uros. Index: combine.c === --- combine.c (revision 190480) +++ combine.c (working copy) @@ -10507,6 +10507,7 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn int i; rtx notes = 0; rtx old_notes, old_pat; + int old_icode; /* If PAT is a PARALLEL, check to see if it contains the CLOBBER we use to indicate that something didn't match. If we find such a @@ -10566,6 +10567,7 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn print_rtl_single (dump_file, pat); } } + PATTERN (insn) = old_pat; REG_NOTES (insn) = old_notes; @@ -10607,6 +10609,86 @@ recog_for_combine (rtx *pnewpat, rtx insn, rtx *pn pat = newpat; } + old_pat = PATTERN (insn); + old_notes = REG_NOTES (insn); + old_icode = INSN_CODE (insn); + PATTERN (insn) = pat; + REG_NOTES (insn) = notes; + + /* Check operand constraints in case hard registers were propagated + into insn pattern. This check prevents combine pass from + generating insn patterns with invalid hard register operands. + These invalid insns can eventually confuse reload to error out + with a spill failure. See also PR 46829. */ + if (insn_code_number = 0 + insn_code_number != NOOP_MOVE_INSN_CODE + (INSN_CODE (insn) = recog (PATTERN (insn), insn, 0)) = 0) +{ + extract_insn (insn); + preprocess_constraints (); + + for (i = 0; i recog_data.n_operands; i++) + { + rtx op = recog_data.operand[i]; + enum machine_mode mode = GET_MODE (op); + struct operand_alternative *op_alt; + int offset = 0; + bool win; + int j; + + /* A unary operator may be accepted by the predicate, but it +is irrelevant for matching constraints. */ + if (UNARY_P (op)) + op = XEXP (op, 0); + + if (GET_CODE (op) == SUBREG) + { + if (REG_P (SUBREG_REG (op)) + REGNO (SUBREG_REG (op)) FIRST_PSEUDO_REGISTER) + offset = subreg_regno_offset (REGNO (SUBREG_REG (op)), + GET_MODE (SUBREG_REG (op)), + SUBREG_BYTE (op), + GET_MODE (op)); + op = SUBREG_REG (op); + } + + if (!(REG_P (op) HARD_REGISTER_P (op))) + continue; + + op_alt = recog_op_alt[i]; + + win = false; + for (j = 0; j recog_data.n_alternatives; j++) + { + if (op_alt[j].anything_ok + || (op_alt[j].matches != -1 + rtx_equal_p (recog_data.operand[i], + recog_data.operand[op_alt[j].matches])) + || (reg_fits_class_p (op, op_alt[j].cl, offset, mode))) + { + win = true; + break; + } + } + + if (!win) + { + if (dump_file (dump_flags TDF_DETAILS)) + { + fputs (Operand failed to match constraints:\n, +dump_file); + print_rtl_single (dump_file, op); + } + insn_code_number = -1; + break; + } + } +} + + PATTERN (insn) = old_pat; + REG_NOTES (insn) = old_notes; + INSN_CODE (insn) = old_icode; + *pnewpat = pat; *pnotes = notes; Index: testsuite/gcc.target/i386/pr46829.c === --- testsuite/gcc.target/i386/pr46829.c (revision 0) +++ testsuite/gcc.target/i386/pr46829.c (working copy) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fschedule-insns } */ + +struct S +{ + int i, j; +}; + +extern struct S s[]; + +extern void bar (int, ...); + +void +foo (int n) +{ + while (s[n].i) +bar (0, n, s[n].j, s, s[n].i / s[n].j); +}
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 16:20, Richard Earnshaw wrote: No, given a u16xu16-u64 operation in the code, and that the arch doesn't have such an opcode, I'd expect to get step1 - (u32)u16 x (u32)u16 - u64 Hmm, I would have thought that would be more costly than (u64)(u16 x u16 - u32) You might be right, but then extends are often free, especially with unsigned types, so it's hard to say for sure. Did you reproduce one? It's a long time since I last looked at this stuff, so I could be confused. Andrew
Re: [PATCH][C++] Get rid of TREE_CHAIN use for TREE_VEC (NON_DEFAULT_TEMPLATE_ARGS_COUNT)
On Fri, Aug 17, 2012 at 1:03 PM, Mike Stump mikest...@comcast.net wrote: On Aug 17, 2012, at 6:58 AM, Paolo Carlini wrote: On 08/17/2012 01:26 PM, Richard Guenther wrote: This gets rid of this field, pushing it into a short int in tree_base (hopefully 2^16 non-defaulted template args are enough ...). Honestly, I don't think it's a trivial issue. Love to hear from Jason, but, my take would be 2^16 should be enough for anyone. I think long before people hit that limit, they would merely aggregate arguments into classes and structures. I think in another 20-80 years, we might want to bump it back up to 32 bits, but... I think we can safely wait until we get a compelling bug report for it. C++11 says that an implementation should be able to handle at least 2^10 template parameters, 2^12 members declared in a single class. I believe that even for automatically generated programs, 2^16 is a good limit. I suspect that by the time that limit is a hindrance, C++ would have gone through several iterations and more importantly 128-bit integers would be common place, so by that time we would have plenty of spare bits -- if we haven't already restructured the tree data structures to use idiomatic C++ constructs that are both more space and time efficient. This is a very long of saying I am comfortable with the 2^16 restriction on the number of template parameters. The patch needs to document that in the usual .texi file. -- Gaby
Re: [RFC] Warning for potentially unbound writes to function parameters
On Fri, Aug 17, 2012 at 11:22 AM, Florian Weimer fwei...@redhat.com wrote: In some real-world code, I noticed a curious pattern: using the unsafe string functions on function parameter arguments. This leads to gets()-style unsafe APIs. I've looked at how to implement a warning for this, and came up with the attached patch. Do you think this makes sense? 1 #include string.h 2 3 const char *data (void); 4 5 void test (char *target) 6 { 7strcpy(target, data ()); 8 } 9 10 11 void test_2 (char *target) 12 { 13char *p = target; 14strcpy(p, data ()); 15 } 16 /tmp/t.c: In function ‘test’: /tmp/t.c:7:9: warning: potentially unbound write to function parameter ‘target’ [-Wunbound-parameter-write] strcpy(target, data ()); ^ /tmp/t.c: In function ‘test_2’: /tmp/t.c:14:9: warning: potentially unbound write to function parameter ‘target’ [-Wunbound-parameter-write] strcpy(p, data ()); ^ Obviously, the warning and its name need adjusting, and more functions need to be covered. But I want to check first if you think the warning makes sense at all, and if I've found the right place to implement it (this approach seems to require optimization, alas). -- Florian Weimer / Red Hat Product Security Team Hmm, I think it help a little bit if you could expand on where exactly the danger the patch is trying to prevent is, and where what does unbound parameter refer to or mean? (I don't know what an unbound parameter is) -- Gaby
Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
On 17/08/12 18:05, Richard Earnshaw wrote: Take two. This version should address your concerns about handling (u32)u16 * (u32)u16 - u64 We now look at each operand directly, but when doing so we check whether the operand is the same size as the result or not. When it is, we can strip any conversion; when it isn't the conversion must preserve signedness of the inner operand and mustn't be a narrowing conversion. So, if I understand correctly, this simply prevents it from stripping any conversions from the multiply's right-hand-side if they are not widening conversions? That seems fine to me. Not that I have authority to approve it, of course. Andrew
Re: PATCH: PR target/20020: 128 bit structs not targeted to TImode
On Thu, Aug 16, 2012 at 7:41 PM, H.J. Lu hongjiu...@intel.com wrote: My email sent to gcc-patches@gcc.gnu.org was bounced as spam. I am resending it. This patch defines both MAX_FIXED_MODE_SIZE and MEMBER_TYPE_FORCES_BLK for i386. MEMBER_TYPE_FORCES_BLK is needed so that we always put union with XFmode field in BLKmode. This patch doesn't change any ABI since both MAX_FIXED_MODE_SIZE and MEMBER_TYPE_FORCES_BLK aren't ABI macros. They only change code quality. Tested on Linux/x86-64. OK to install? gcc/ 2012-08-16 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * config/i386/i386.h (MAX_FIXED_MODE_SIZE): New macro. (MEMBER_TYPE_FORCES_BLK): Likewise. gcc/testsuite/ 2012-08-16 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * gcc.target/i386/pr20020-1.c: New test. * gcc.target/i386/pr20020-2.c: Likewise. * gcc.target/i386/pr20020-3.c: Likewise. OK, but please check for possible xmm vs. integer reg pair code regressions and similar for mm vs. int reg pair on 32bit targets. Thanks, Uros.
Re: PATCH: PR target/20020: 128 bit structs not targeted to TImode
On Fri, Aug 17, 2012 at 12:35 PM, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Aug 16, 2012 at 7:41 PM, H.J. Lu hongjiu...@intel.com wrote: My email sent to gcc-patches@gcc.gnu.org was bounced as spam. I am resending it. This patch defines both MAX_FIXED_MODE_SIZE and MEMBER_TYPE_FORCES_BLK for i386. MEMBER_TYPE_FORCES_BLK is needed so that we always put union with XFmode field in BLKmode. This patch doesn't change any ABI since both MAX_FIXED_MODE_SIZE and MEMBER_TYPE_FORCES_BLK aren't ABI macros. They only change code quality. Tested on Linux/x86-64. OK to install? gcc/ 2012-08-16 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * config/i386/i386.h (MAX_FIXED_MODE_SIZE): New macro. (MEMBER_TYPE_FORCES_BLK): Likewise. gcc/testsuite/ 2012-08-16 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * gcc.target/i386/pr20020-1.c: New test. * gcc.target/i386/pr20020-2.c: Likewise. * gcc.target/i386/pr20020-3.c: Likewise. OK, but please check for possible xmm vs. integer reg pair code regressions and similar for mm vs. int reg pair on 32bit targets. Thanks, Uros. This is the path I am checking in since MEMBER_TYPE_FORCES_BLK has been converted to a target hook. Tested on Linux/x86-64. Thanks. -- H.J. --- gcc/ 2012-08-17 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * config/i386/i386.c (ix86_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/i386/i386.h (MAX_FIXED_MODE_SIZE): New macro. gcc/testsuite/ 2012-08-17 H.J. Lu hongjiu...@intel.com Gary Funck g...@intrepid.com PR target/20020 * gcc.target/i386/pr20020-1.c: New test. * gcc.target/i386/pr20020-2.c: Likewise. * gcc.target/i386/pr20020-3.c: Likewise. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 976bbb4..5da4da2 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -7545,6 +7545,18 @@ ix86_promote_function_mode (const_tree type, enum machine_mode mode, for_return); } +/* Return true if a structure, union or array with MODE containing FIELD + should be accessed using BLKmode. */ + +static bool +ix86_member_type_forces_blk (const_tree field, enum machine_mode mode) +{ + /* Union with XFmode must be in BLKmode. */ + return (mode == XFmode + (TREE_CODE (DECL_FIELD_CONTEXT (field)) == UNION_TYPE + || TREE_CODE (DECL_FIELD_CONTEXT (field)) == QUAL_UNION_TYPE)); +} + rtx ix86_libcall_value (enum machine_mode mode) { @@ -40725,6 +40737,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val) #undef TARGET_PROMOTE_FUNCTION_MODE #define TARGET_PROMOTE_FUNCTION_MODE ix86_promote_function_mode +#undef TARGET_MEMBER_TYPE_FORCES_BLK +#define TARGET_MEMBER_TYPE_FORCES_BLK ix86_member_type_forces_blk + #undef TARGET_SECONDARY_RELOAD #define TARGET_SECONDARY_RELOAD ix86_secondary_reload diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 5ff82ab..11f79e3 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1816,6 +1816,10 @@ do { \ #define BRANCH_COST(speed_p, predictable_p) \ (!(speed_p) ? 2 : (predictable_p) ? 0 : ix86_branch_cost) +/* An integer expression for the size in bits of the largest integer machine + mode that should actually be used. We allow pairs of registers. */ +#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode) + /* Define this macro as a C expression which is nonzero if accessing less than a word of memory (i.e. a `char' or a `short') is no faster than accessing a word of memory, i.e., if such access diff --git a/gcc/testsuite/gcc.target/i386/pr20020-1.c b/gcc/testsuite/gcc.target/i386/pr20020-1.c new file mode 100644 index 000..3f10970 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr20020-1.c @@ -0,0 +1,26 @@ +/* Check that 128-bit struct's are represented as TImode values. */ +/* { dg-do compile { target int128 } } */ +/* { dg-options -O2 -fdump-rtl-expand } */ + +struct shared_ptr_struct +{ + unsigned long long phase:48; + unsigned short thread:16; + union +{ + void *addr; + unsigned long long pad; +}; +}; +typedef struct shared_ptr_struct sptr_t; + +sptr_t S; + +sptr_t +sptr_result (void) +{ + return S; +} +/* { dg-final { scan-rtl-dump \\\(set \\\(reg:TI \[0-9\]* \\\[ retval \\\]\\\) expand } } */ +/* { dg-final { scan-rtl-dump \\\(set \\\(reg/i:TI 0 ax\\\) expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr20020-2.c b/gcc/testsuite/gcc.target/i386/pr20020-2.c new file mode 100644 index 000..e8c5b3d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr20020-2.c @@ -0,0 +1,24 @@ +/* Check
Re: [RFC] Warning for potentially unbound writes to function parameters
On 08/17/2012 09:15 PM, Gabriel Dos Reis wrote: Hmm, I think it help a little bit if you could expand on where exactly the danger the patch is trying to prevent is, and where what does unbound parameter refer to or mean? (I don't know what an unbound parameter is) Sorry for being unclear. I haven't found good terminology yet. I have seen some interfaces which behave very similar to gets(char *). You pass in a buffer, and the called function writes data to it, and you don't know how much will be written. A common pattern among those functions was that they wrote to the argument buffer using strcpy or sprintf. It seems to me that this always warrants a warning because any false positive in the strcpy case could easily be optimized: if you check the size of the target buffer before writing to it, you know the length of the string you're going to write, so you can use memcpy instead of strcpy. For sprintf, you'd just use snprintf directly, removing the separate length check. Function pointers come into play because these are supplied by the caller. Ideally, we would flag any write to a pointer which does not escape to the caller (because it is a pre-existing buffer), but this is much more difficult to implement. (In C++ mode, the warning would be disabled for non-constant reference-to-pointer-to-char parameters.) Historically, there have been some APIs which had implied upper bounds on buffer arguments (mainly PATH_MAX), but these have not fared well can turn out very difficult to use correctly over time (readdir_r being an example). -- Florian Weimer / Red Hat Product Security Team
Re: [PATCH, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
Hi, On Fri, 2012-08-17 at 20:34 +0200, Uros Bizjak wrote: On Fri, Aug 17, 2012 at 6:36 PM, Uros Bizjak ubiz...@gmail.com wrote: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * combine.c (recog_for_combine): Check operand constraints in case hard registers were propagater into insn pattern. testsuite/ChangeLog: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * gcc.target/i386/pr46829.c: New test. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Oh ... This part: + rtx_equal_p (recog_data.operand[j], + recog_data.operand[op_alt[j].matches])) should read: + rtx_equal_p (recog_data.operand[i], + recog_data.operand[op_alt[j].matches])) Note the j vs i array index difference in the first line. Correct patch attached, bootstrapped and re-tested on x86_64-pc-linux-gnu {,-m32}. I've briefly checked your patch against rev 190459 for the SH target. The CSiBE result-size for '-m4 -ml -O2 -mpretend-cmove' shows quite some movements and improvements. However, the additional restrictions you've added effectively disable some of the combine patterns I've been adding recently to improve code quality on SH and I see some SH specific failures (haven't run the full test suite though): FAIL: gcc.target/sh/pr49263.c scan-assembler-not and FAIL: gcc.target/sh/pr51244-1.c scan-assembler-not movt|tst|negc|extu FAIL: gcc.target/sh/pr51244-10.c scan-assembler-not shll|subc|and FAIL: gcc.target/sh/pr51244-7.c scan-assembler-not cmp/hi FAIL: gcc.target/sh/pr51244-7.c scan-assembler-not mov\t#0 FAIL: gcc.target/sh/pr51244-8.c scan-assembler-not shad|neg FAIL: gcc.target/sh/pr51244-9.c scan-assembler-not mov\t#0 FAIL: gcc.target/sh/pr52933-1.c scan-assembler-times div0s 25 FAIL: gcc.target/sh/pr52933-2.c scan-assembler-times div0s 25 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times addc 3 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times subc 3 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times sett 4 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-not movt I've not checked every single case, but the first two seem to be the core issues. 1) FAIL: gcc.target/sh/pr49263.c scan-assembler-not and Test function: int test (uint8_t x) { return x 15 ? -40 : -9; } Pattern in sh.md that would originally match: (define_insn tstsi_t_zero_extract_eq [(set (reg:SI T_REG) (eq:SI (zero_extract:SI (match_operand 0 logical_operand z) (match_operand:SI 1 const_int_operand) (match_operand:SI 2 const_int_operand)) (const_int 0)))] ...) (z is a constraint for the r0 register on SH) Combine now says: Trying 11 - 12: Successfully matched this instruction: (set (reg:SI 147 t) (eq:SI (zero_extract:SI (reg:SI 4 r4 [ x ]) (const_int 4 [0x4]) (const_int 0 [0])) (const_int 0 [0]))) Operand failed to match constraints: (reg:SI 4 r4 [ x ]) The problem in this case is that on SH (and other targets, too) function args are passed in regs. In this case it's the hard reg r4, which is sort of pre-allocated already. Before your patch, the pattern would match and reload would stuff in a 'mov r4,r0' to satisfy the z constraint. 2) FAIL: gcc.target/sh/pr51244-1.c scan-assembler-not movt|tst|negc|extu Test function: int testfunc_01 (int a, int b, int c, int d) { return (a == b || a == d) ? b : c; } Pattern in sh.md that would originally match: (define_insn_and_split nott [(set (reg:SI T_REG) (xor:SI (match_operand:SI 0 t_reg_operand ) (const_int 1)))] TARGET_SH1 ...) Combine now says: Trying 14 - 15: Successfully matched this instruction: (set (reg:SI 147 t) (xor:SI (reg:SI 147 t) (const_int 1 [0x1]))) Operand failed to match constraints: (reg:SI 147 t) In this case the T_REG matching is already in the predicate (similar to the thing that's being done on x86 you've mentioned before). I've started using this kind of matching pattern a while ago to get simpler and better matching combine patterns around SH's T_REG. (T_REG is a fixed 1 bit hard reg on SH, treated as SImode throughout the MD. It is used to hold comparison results, carry/borrow bits etc). I'm not sure what would be the possible solutions for those cases. For 1) maybe pre-allocated regs of args should be moved to pseudos before combine? For 2) I could probably try to come up with some matching that would satisfy the new restrictions in combine and fix up the affected patterns throughout the MD. In the worst case, would it be an option to make this restriction in combine optional (target hook or something)? Cheers, Oleg
[PATCH] Fix -fcompare-debug failure in fwprop (PR rtl-optimization/54294)
Hi! As discussed in the PR, this patch, originally posted for PR42728, makes sure the shortcut in all_uses_available_at is used on the same insns in between -g and -g0, it is the second time a -fcompare-debug failure resulted from NEXT_INSN being something different with -g, so IMHO it is desirable to commit this. Bootstrapped/regtested on x86_64-linux and i686-linux, and in the PR Uros wrote he bootstrapped/regtested it on alphaev8-linux. Ok for trunk? Then the PR discussed what to do with the remaining issues, any feedback on that would be appreciated. 2012-08-17 Jakub Jelinek ja...@redhat.com PR rtl-optimization/54294 * fwprop.c (all_uses_available_at): Ignore debug insns in between def_insn and target_insn when checking whether the shortcut is possible. --- gcc/fwprop.c.jj 2012-08-15 10:55:21.0 +0200 +++ gcc/fwprop.c2012-08-17 11:15:55.624101951 +0200 @@ -799,13 +799,17 @@ all_uses_available_at (rtx def_insn, rtx df_ref *use_rec; struct df_insn_info *insn_info = DF_INSN_INFO_GET (def_insn); rtx def_set = single_set (def_insn); + rtx next; gcc_assert (def_set); /* If target_insn comes right after def_insn, which is very common - for addresses, we can use a quicker test. */ - if (NEXT_INSN (def_insn) == target_insn - REG_P (SET_DEST (def_set))) + for addresses, we can use a quicker test. Ignore debug insns + other than target insns for this. */ + next = NEXT_INSN (def_insn); + while (next next != target_insn DEBUG_INSN_P (next)) +next = NEXT_INSN (next); + if (next == target_insn REG_P (SET_DEST (def_set))) { rtx def_reg = SET_DEST (def_set); Jakub
Re: [PATCH, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
On Fri, Aug 17, 2012 at 9:58 PM, Oleg Endo oleg.e...@t-online.de wrote: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * combine.c (recog_for_combine): Check operand constraints in case hard registers were propagater into insn pattern. testsuite/ChangeLog: 2012-08-17 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/46829 * gcc.target/i386/pr46829.c: New test. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Oh ... This part: + rtx_equal_p (recog_data.operand[j], + recog_data.operand[op_alt[j].matches])) should read: + rtx_equal_p (recog_data.operand[i], + recog_data.operand[op_alt[j].matches])) Note the j vs i array index difference in the first line. Correct patch attached, bootstrapped and re-tested on x86_64-pc-linux-gnu {,-m32}. I've briefly checked your patch against rev 190459 for the SH target. The CSiBE result-size for '-m4 -ml -O2 -mpretend-cmove' shows quite some movements and improvements. However, the additional restrictions you've added effectively disable some of the combine patterns I've been adding recently to improve code quality on SH and I see some SH specific failures (haven't run the full test suite though): FAIL: gcc.target/sh/pr49263.c scan-assembler-not and FAIL: gcc.target/sh/pr51244-1.c scan-assembler-not movt|tst|negc|extu FAIL: gcc.target/sh/pr51244-10.c scan-assembler-not shll|subc|and FAIL: gcc.target/sh/pr51244-7.c scan-assembler-not cmp/hi FAIL: gcc.target/sh/pr51244-7.c scan-assembler-not mov\t#0 FAIL: gcc.target/sh/pr51244-8.c scan-assembler-not shad|neg FAIL: gcc.target/sh/pr51244-9.c scan-assembler-not mov\t#0 FAIL: gcc.target/sh/pr52933-1.c scan-assembler-times div0s 25 FAIL: gcc.target/sh/pr52933-2.c scan-assembler-times div0s 25 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times addc 3 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times subc 3 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times sett 4 FAIL: gcc.target/sh/pr54236-1.c scan-assembler-not movt I've not checked every single case, but the first two seem to be the core issues. 1) FAIL: gcc.target/sh/pr49263.c scan-assembler-not and Test function: int test (uint8_t x) { return x 15 ? -40 : -9; } Pattern in sh.md that would originally match: (define_insn tstsi_t_zero_extract_eq [(set (reg:SI T_REG) (eq:SI (zero_extract:SI (match_operand 0 logical_operand z) (match_operand:SI 1 const_int_operand) (match_operand:SI 2 const_int_operand)) (const_int 0)))] ...) (z is a constraint for the r0 register on SH) Combine now says: Trying 11 - 12: Successfully matched this instruction: (set (reg:SI 147 t) (eq:SI (zero_extract:SI (reg:SI 4 r4 [ x ]) (const_int 4 [0x4]) (const_int 0 [0])) (const_int 0 [0]))) Operand failed to match constraints: (reg:SI 4 r4 [ x ]) The problem in this case is that on SH (and other targets, too) function args are passed in regs. In this case it's the hard reg r4, which is sort of pre-allocated already. Before your patch, the pattern would match and reload would stuff in a 'mov r4,r0' to satisfy the z constraint. Yes, x86_64 also has register passing convention. So, i.e.: --cut here-- int foo (int a, int b, int c) { return a + b + c; } --cut here-- expands to: (note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 6 3 2 (set (reg/v:SI 62 [ a ]) (reg:SI 5 di [ a ])) t.c:2 -1 (nil)) (insn 3 2 4 2 (set (reg/v:SI 63 [ b ]) (reg:SI 4 si [ b ])) t.c:2 -1 (nil)) (insn 4 3 5 2 (set (reg/v:SI 64 [ c ]) (reg:SI 1 dx [ c ])) t.c:2 -1 (nil)) (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG) (insn 8 5 9 2 (parallel [ (set (reg:SI 66 [ D.1722 ]) (plus:SI (reg/v:SI 62 [ a ]) (reg/v:SI 63 [ b ]))) (clobber (reg:CC 17 flags)) ]) t.c:3 -1 (nil)) (...) So, we expand function arguments to pseudos first, and these survive up to the combine pass. If sh expanded func arguments to pseudos, then invalid hard register won't be propagated to insn pattern, leaving reload with much more freedom to choose correct register. I believe that code improvements you are seeing come from this added reload freedom. 2) FAIL: gcc.target/sh/pr51244-1.c scan-assembler-not movt|tst|negc|extu Test function: int testfunc_01 (int a, int b, int c, int d) { return (a == b || a == d) ? b : c; } Pattern in sh.md that would originally match: (define_insn_and_split nott [(set (reg:SI T_REG) (xor:SI (match_operand:SI 0 t_reg_operand ) (const_int 1)))] TARGET_SH1 ...) Combine now says: Trying 14 - 15: Successfully matched this instruction: (set (reg:SI 147 t) (xor:SI (reg:SI 147 t)
[PATCH] Fix AVR fallout
Hi! I see these warnings/errors right now: g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common -DHAVE_CONFIG_H -I. -I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. -I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include -I../../../../gcc/gcc/../libdecnumber -I../../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber-I. -I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. -I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include -I../../../../gcc/gcc/../libdecnumber -I../../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber ../../../../gcc/gcc/config/avr/avr-log.c cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wmissing-prototypes’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wold-style-definition’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wc++-compat’ is valid for C/ObjC but not for C++ [enabled by default] ../../../../gcc/gcc/config/avr/avr-log.c: In function ‘void avr_log_vadump(FILE*, const char*, va_list)’: ../../../../gcc/gcc/config/avr/avr-log.c:287:22: warning: ‘machine_mode’ is promoted to ‘int’ when passed through ‘...’ [enabled by default] ../../../../gcc/gcc/config/avr/avr-log.c:287:22: note: (so you should pass ‘int’ not ‘machine_mode’ to ‘va_arg’) ../../../../gcc/gcc/config/avr/avr-log.c:287:22: note: if this code is reached, the program will abort ../../../../gcc/gcc/config/avr/avr-log.c:291:31: warning: ‘rtx_code’ is promoted to ‘int’ when passed through ‘...’ [enabled by default] ../../../../gcc/gcc/config/avr/avr-log.c:291:31: note: if this code is reached, the program will abort ../../../../gcc/gcc/config/avr/avr-log.c:295:38: warning: ‘reg_class’ is promoted to ‘int’ when passed through ‘...’ [enabled by default] ../../../../gcc/gcc/config/avr/avr-log.c:295:38: note: if this code is reached, the program will abort [...] g++ -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common -DHAVE_CONFIG_H -I. -I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. -I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include -I../../../../gcc/gcc/../libdecnumber -I../../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber-I. -I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. -I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include -I../../../../gcc/gcc/../libdecnumber -I../../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber ../../../../gcc/gcc/config/avr/gen-avr-mmcu-texi.c -o gen-avr-mmcu-texi cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wmissing-prototypes’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wold-style-definition’ is valid for C/ObjC but not for C++ [enabled by default] cc1plus: warning: command line option ‘-Wc++-compat’ is valid for C/ObjC but not for C++ [enabled by default] ../../../../gcc/gcc/config/avr/gen-avr-mmcu-texi.c: In function ‘int main()’: ../../../../gcc/gcc/config/avr/gen-avr-mmcu-texi.c:53:24: error: invalid conversion from ‘int’ to ‘avr_arch’ [-fpermissive] ../../../../gcc/gcc/config/avr/gen-avr-mmcu-texi.c:75:23: error: invalid conversion from ‘int’ to ‘avr_arch’ [-fpermissive] make[3]: *** [gen-avr-mmcu-texi] Error 1 make[3]: Leaving directory `/mnt/devel/src/linux/build/avr/gcc-stage1/gcc' make[2]: *** [all-gcc] Error 2 make[2]: Leaving directory `/mnt/devel/src/linux/build/avr/gcc-stage1' I suggest this patch. Okay? 2012-08-17 Jan-Benedict Glaw jbg...@lug-owl.de gcc/Changelog: * config/avr/avr-log.c (avr_log_vadump): Properly use int-promoted enum values. * config/avr/avr.h (struct mcu_type_s): Change `arch' from int to enum avr_arch. * config/avr/gen-avr-mmcu-texi.c (main): Use correct initializer. diff --git a/gcc/config/avr/avr-log.c b/gcc/config/avr/avr-log.c index f054bb5..f86165d 100644 --- a/gcc/config/avr/avr-log.c +++ b/gcc/config/avr/avr-log.c @@ -284,15 +284,15 @@ avr_log_vadump (FILE *file, const char *fmt, va_list ap) break; case 'm': - fputs (GET_MODE_NAME (va_arg (ap, enum machine_mode)), file); + fputs (GET_MODE_NAME ((enum machine_mode) va_arg (ap, int)), file);
Re: RFA: Support infinity, NaN, and denormalized numbers in floatformat.c
On Thu, Aug 16, 2012 at 9:20 AM, Andreas Schwab sch...@linux-m68k.org wrote: * floatformat.c (floatformat_to_double): Correctly handle numbers between 1 and 2. Simplify handling of denormal number. (main): Test with 1.1. This is OK. Thanks. Ian
Re: [PATCH, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
On Fri, 2012-08-17 at 22:32 +0200, Uros Bizjak wrote: Yes, x86_64 also has register passing convention. So, i.e.: --cut here-- int foo (int a, int b, int c) { return a + b + c; } --cut here-- expands to: (note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 6 3 2 (set (reg/v:SI 62 [ a ]) (reg:SI 5 di [ a ])) t.c:2 -1 (nil)) (insn 3 2 4 2 (set (reg/v:SI 63 [ b ]) (reg:SI 4 si [ b ])) t.c:2 -1 (nil)) (insn 4 3 5 2 (set (reg/v:SI 64 [ c ]) (reg:SI 1 dx [ c ])) t.c:2 -1 (nil)) (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG) (insn 8 5 9 2 (parallel [ (set (reg:SI 66 [ D.1722 ]) (plus:SI (reg/v:SI 62 [ a ]) (reg/v:SI 63 [ b ]))) (clobber (reg:CC 17 flags)) ]) t.c:3 -1 (nil)) (...) So, we expand function arguments to pseudos first, and these survive up to the combine pass. If sh expanded func arguments to pseudos, then invalid hard register won't be propagated to insn pattern, leaving reload with much more freedom to choose correct register. I believe that code improvements you are seeing come from this added reload freedom. It seems on SH func args are expanded to pseudos, sorry for not checking this properly. int test (uint8_t x) { return x 15 ? -40 : -9; } expands to: (note 1 0 7 NOTE_INSN_DELETED) (note 7 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 7 3 2 (set (reg:SI 164) (reg:SI 4 r4 [ x ])) sh_tmp.cpp:9 -1 (nil)) (insn 3 2 4 2 (set (reg/v:SI 163 [ x ]) (zero_extend:SI (subreg:QI (reg:SI 164) 0))) sh_tmp.cpp:9 -1 (nil)) (note 4 3 9 2 NOTE_INSN_FUNCTION_BEG) (insn 9 4 10 2 (set (reg:SI 165) (and:SI (reg/v:SI 163 [ x ]) (const_int 15 [0xf]))) sh_tmp.cpp:10 -1 (nil)) ... Then in combine it goes like: Successfully matched this instruction: (set (reg/v:SI 163 [ x ]) (zero_extend:SI (reg:QI 4 r4 [ x ]))) deferring deletion of insn with uid = 2. modifying insn i3 3 r163:SI=zero_extend(r4:QI) REG_DEAD: r4:SI deferring rescan insn with uid = 3. [[note: the matched pattern above is *zero_extendqisi2_compact ]] Trying 3 - 9: Successfully matched this instruction: (set (reg:SI 165) (and:SI (reg:SI 4 r4 [ x ]) (const_int 15 [0xf]))) deferring deletion of insn with uid = 3. modifying insn i3 9 r165:SI=r4:SI0xf REG_DEAD: r4:SI deferring rescan insn with uid = 9. Trying 9 - 11: Successfully matched this instruction: (set (reg:SI 167) (and:SI (reg:SI 4 r4 [ x ]) (const_int 15 [0xf]))) deferring deletion of insn with uid = 9. modifying insn i311 r167:SI=r4:SI0xf REG_DEAD: r4:SI deferring rescan insn with uid = 11. Trying 11 - 12: Successfully matched this instruction: (set (reg:SI 147 t) (eq:SI (zero_extract:SI (reg:SI 4 r4 [ x ]) (const_int 4 [0x4]) (const_int 0 [0])) (const_int 0 [0]))) Operand failed to match constraints: (reg:SI 4 r4 [ x ]) Hm, I'd write this instruction as: (define_insn_and_split nott [(set (match_operand:SI 0 t_reg_operand t) (xor:SI (match_operand:SI 1 t_reg_operand 0) (const_int 1)))] TARGET_SH1 ...) This would match new combine limitation without problems, although empty constraint should be matched as well. Can you perhaps investigate a bit, why op_alt[j].anything.ok in + if (op_alt[j].anything_ok + || (op_alt[j].matches != -1 + rtx_equal_p (recog_data.operand[i], + recog_data.operand[op_alt[j].matches])) + || (reg_fits_class_p (op, op_alt[j].cl, offset, mode))) does not trigger for this operation? Maybe we need to add another condition to allow empty constraint. It seems that in this case there are no alternatives in recog_data at all. The 'for' never runs, 'win' remains 'false'. Doing - if (!win) + if (!win recog_data.n_alternatives 0) seems to help. IMO, a patch like this that touches many targets won't be applied over night. Rest assured that there will be a discussion about it. Thanks for testing the patch, I look forward to comments. Sure, no problem. Cheers, Oleg
Re: Inheritance of gfc_symbol / gfc_component
Following up on myself: On 2012-08-16 14:59, Tobias Schlüter wrote: A place where C++ inheritance is a trivial improvement is the red-black tree used for storing various objects (gfc_symtree, gfc_gsymbol, gfc_st_label, I think). This is currently implemented with macro-based inheritance. It is trivial to replace the macro with C++ inheritance, So I have a patch for this which passes the testsuite, but I'm not sure if it's valid C++ because of one issue: in io.c we have at file scope gfc_st_label format_asterisk = {0, NULL, NULL, -1, ST_LABEL_FORMAT, ST_LABEL_FORMAT, NULL, 0, {NULL, NULL}}; which doesn't work if I redefine gfc_st_label to inherit from a gfc_bbt which contains the balanced binary tree info, so I have struct gfc_bbt { int priority; gfc_bbt *left; gfc_bbt *right; }; and struct gfc_st_label : public gfc_bbt { int value; ... } Previously a macro made the first three elements of gfc_st_label what are now the elements of gfc_bbt. The problem is that the initialization of format_asterisk is not allowed in the C++ standard (according to the error message, C++11 actually allows it). I thought I could work around this problem without introducing a constructor by: 1) using 0 instead of -1 as value for this fake label (which is also not a valid value for a label, so it can't collide 2) setting ST_LABEL_FORMAT = 0 and then 3) not initializing at all, assuming that as for a C struct format_asterisk would end up in .bss and thus be zero initialized. When I was investigating whether this theory is sound, I understood that anything involving inheritance is not POD (plain old data), and therefore I can't assume zeroing to be guaranteed. Is that correct? If that is correct, I will submit the patch. Otherwise, I would be interested if there's an equally simple way of initializing format_asterisk once it uses inheritance. Aternatively, I don't see what speaks against allowing C++11 in frontends (well, maybe the famous binary incompatibility with C++03 ...) That said, I had to introduce lots of type-casts in the patch, because I didn't want to work with templates, while using typesafe interfaces wherever possible, so the functionally equivalent code actually became insignificantly larger with C++ replacing macro-based inheritance. Cheers, - Tobi
Re: [PATCH] Set correct source location for deallocator calls
Hi, Richard, Thanks for the review. I've addressed most of the issues except the java unittest (see comments below). The new patch is attached in the end of this email. Thanks, Dehao On Fri, Aug 17, 2012 at 8:47 AM, Richard Henderson r...@redhat.com wrote: On 2012-08-10 20:38, Dehao Chen wrote: + // { dg-final { scan-assembler 1 28 0 } } This test case isn't going to work except with dwarf2, and with gas. You can use -dA so that you can scan for file.c:line. There are other examples in the testsuite. Done. This doesn't belong in guality. It belongs in g++.dg/debug/. Done. It would be nice if you could add a java testcase to see that it doesn't regress. I spend a whole day working on this, but find it very difficult to add such a java test because: * First, libjava testsuits are all runtime tests, i.e., it compiles the byte code to native code, execute it, and compares the output to expected output. There is no way to scan the assembly. * Though there is a way to derive the line number at runtime in java (using Exception().getStackTrace()), this method only works on VM, and the gcj generated native code does not get the lineno. Any suggestions on this? ! record_in_goto_queue_label (struct leh_tf_state *tf, treemple stmt, tree label, ! location_t location) BTW, for the future, please fix your mailer to not wrap lines. Okay, I'll try. The problem is that we have to send mail in plain txt. And in plain text mode gmail wraps each line to 80 characters and wouldn't allow you change that... I'll quibble with the wording here. It reads as if we want to force all calls to have UNKNOWN_LOC, whereas all we want is to prevent any calls that already have UNKNOWN_LOC from gaining input_loc via gimplify_call_expr. Done. New Patch: gcc/ChangeLog * tree-eh.c (goto_queue_node): New field. (record_in_goto_queue): New parameter. (record_in_goto_queue_label): New parameter. (lower_try_finally_dup_block): New parameter. (maybe_record_in_goto_queue): Update source location. (lower_try_finally_copy): Likewise. (honor_protect_cleanup_actions): Likewise. * gimplify.c (gimplify_expr): Reset the location to unknown. gcc/testsuite/ChangeLog 2012-08-07 Dehao Chen de...@google.com * g++.dg/debug/dwarf2/deallocator.C: New test. Index: gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C === --- gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/deallocator.C (revision 0) @@ -0,0 +1,33 @@ +// Test that debug info generated for auto-inserted deallocator is +// correctly attributed. +// This patch scans for the lineno directly from assembly, which may +// differ between different architectures. Because it mainly tests +// FE generated debug info, without losing generality, only x86 +// assembly is scanned in this test. +// { dg-do compile { target { i?86-*-* x86_64-*-* } } } +// { dg-options -O2 -fno-exceptions -g -dA } + +struct t { + t (); + ~t (); + void foo(); + void bar(); +}; + +int bar(); + +void foo(int i) +{ + for (int j = 0; j 10; j++) +{ + t test; + test.foo(); + if (i + j) + { + test.bar(); + return; + } +} + return; +} +// { dg-final { scan-assembler 1 28 0 } } Index: gcc/tree-eh.c === --- gcc/tree-eh.c (revision 190209) +++ gcc/tree-eh.c (working copy) @@ -321,6 +321,7 @@ struct goto_queue_node { treemple stmt; + location_t location; gimple_seq repl_stmt; gimple cont_stmt; int index; @@ -560,7 +561,8 @@ record_in_goto_queue (struct leh_tf_state *tf, treemple new_stmt, int index, - bool is_label) + bool is_label, + location_t location) { size_t active, size; struct goto_queue_node *q; @@ -583,6 +585,7 @@ memset (q, 0, sizeof (*q)); q-stmt = new_stmt; q-index = index; + q-location = location; q-is_label = is_label; } @@ -590,7 +593,8 @@ TF is not null. */ static void -record_in_goto_queue_label (struct leh_tf_state *tf, treemple stmt, tree label) +record_in_goto_queue_label (struct leh_tf_state *tf, treemple stmt, tree label, + location_t location) { int index; treemple temp, new_stmt; @@ -629,7 +633,7 @@ since with a GIMPLE_COND we have an easy access to the then/else labels. */ new_stmt = stmt; - record_in_goto_queue (tf, new_stmt, index, true); + record_in_goto_queue (tf, new_stmt, index, true, location); } /* For any GIMPLE_GOTO or GIMPLE_RETURN, decide whether it leaves a try_finally @@ -649,19 +653,22 @@ { case GIMPLE_COND: new_stmt.tp = gimple_op_ptr (stmt, 2); - record_in_goto_queue_label (tf, new_stmt,
[google/gcc-4_7] Fix ICE in output_pubnames where union contains a ctor
This patch is for the google/gcc-4_7 branch. I'll submit it for trunk after the Fission patches have gone in. When adding names to the pubnames table (-gsplit-dwarf or -gpubnames), a method within a union may not get handled properly, sometimes resulting in an internal compiler error in output_pubnames(). This patch fixes the problem by using the existing predicate, class_scope_p, instead of is_class_die, which failed to test for a union type. 2012-08-17 Cary Coutant ccout...@google.com gcc/ * dwarf2out.c (is_class_die): Remove function. (add_pubname): Call class_scope_p instead. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 190490) +++ gcc/dwarf2out.c (working copy) @@ -7035,15 +7035,6 @@ is_namespace_die (dw_die_ref c) return c c-die_tag == DW_TAG_namespace; } -/* Returns true iff C is a class or structure DIE. */ - -static inline bool -is_class_die (dw_die_ref c) -{ - return c (c-die_tag == DW_TAG_class_type - || c-die_tag == DW_TAG_structure_type); -} - static char * gen_internal_sym (const char *prefix) { @@ -9314,7 +9305,7 @@ add_pubname (tree decl, dw_die_ref die) class_member, it will either be inside the class already, or will have just looked up the class to find the member. Either way, searching the class is faster than searching the index. */ - if ((TREE_PUBLIC (decl) !is_class_die (die-die_parent)) + if ((TREE_PUBLIC (decl) !class_scope_p (die-die_parent)) || is_cu_die (die-die_parent) || is_namespace_die (die-die_parent)) { const char *name = dwarf2_name (decl, 1);
Always define USE_PT_GNU_EH_FRAME in crtstuff.c for glibc
Bootstrapping cross toolchains using glibc is an unduly complicated process involving multiple GCC builds and sometimes multiple glibc builds or glibc header installs. Something like the ideal bootstrap process is described at http://sourceware.org/ml/libc-alpha/2012-03/msg00960.html. I put some changes into glibc (for 2.17) to get closer to this process (in particular, stopping the glibc build from depending on GCC's libgcc_s or libgcc_eh). This means it is now possible to build glibc using an initial static-only GCC build with inhibit_libc defined for target library builds. Such a glibc built with such an initial GCC, however, isn't quite identical to the glibc you get if you then build them full GCC against the first glibc, then rebuild glibc. It's desirable that it really is identical so that a second glibc build is completely redundant. One cause of differences is that when crtstuff.c is built in the first inhibit_libc GCC build, USE_PT_GNU_EH_FRAME does not get defined - and so USE_EH_FRAME_REGISTRY is defined, affecting the crt*.o contents (which then get linked into various glibc shared libraries). The code in crtstuff.c has no actual dependence on link.h, so no need for it to be conditional on inhibit_libc at all. Making USE_PT_GNU_EH_FRAME not depend on inhibit_libc, for glibc-using toolchains, does cause an inconsistency with unwind-dw2-fde-dip.c where the link.h contents are used so removing the inhibit_libc dependency is harder. However, I think this inconsistency is OK, given that inhibit_libc builds for glibc targets are only to be expected to be useful for bootstrapping, not as the final toolchain used once glibc has been built (for that, you'll want libgcc_s and libgcc_eh built against the glibc libraries and headers). This patch accordingly adds another case for defining USE_PT_GNU_EH_FRAME in crtstuff.c: all existing cases are kept as they are, but this new one is added as well. The logic of defined(__GLIBC__) || defined(__gnu_linux__) || defined(__GNU__) is that (a) while __GLIBC__ is usually defined in library headers (which we may not have at this point), kfreebsd-gnu, knetbsd-gnu and kopensolaris-gnu define it in GCC; (b) __gnu_linux__ covers exactly those cases with the Linux kernel where glibc is in use and (c) __GNU__ covers the GNU Hurd, the remaining case of use of glibc. Tested with cross to arm-none-linux-gnueabi that this does indeed fix various cases where previously rebuilding glibc with the second GCC as part of a bootstrap process would result in differences from the first-built glibc, and bootstrapped with no regressions on x86_64-unknown-linux-gnu as a sanity check. OK to commit? With this patch, all the glibc libraries proper - libraries against which one might link, such as libc.so.6 and libm.so.6 - have identical disassembly for both builds, whereas they did not all have identical disassembly before. There are, however, still differences in miscellaneous .so files used by glibc and programs installed by glibc, such as libmemusage.so. In that case, at least some differences result from functions in libgcc.a being hidden in the second GCC build but not the first static-only build. In my view, the logic for giving libgcc functions hidden visibility should be the same whether or not a shared libgcc is being built. (This does mean that in the static-only case, objects from .S files would need the _s.o versions built for use in generating the list of symbols to hide.) Comments? 2012-08-17 Joseph Myers jos...@codesourcery.com * crtstuff.c (USE_PT_GNU_EH_FRAME): Define for systems using glibc even if inhibit_libc. Index: libgcc/crtstuff.c === --- libgcc/crtstuff.c (revision 190491) +++ libgcc/crtstuff.c (working copy) @@ -1,7 +1,7 @@ /* Specialized bits of code needed to support construction and destruction of file-scope objects in C++ code. Copyright (C) 1991, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001 - 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, 2011 + 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, 2011, 2012 Free Software Foundation, Inc. Contributed by Ron Guilmette (r...@monkeys.com). @@ -113,6 +113,19 @@ # define USE_PT_GNU_EH_FRAME # endif #endif + +#if defined(OBJECT_FORMAT_ELF) \ + !defined(OBJECT_FORMAT_FLAT) \ + defined(HAVE_LD_EH_FRAME_HDR) \ + !defined(CRTSTUFFT_O) \ + (defined(__GLIBC__) || defined(__gnu_linux__) || defined(__GNU__)) +/* On systems using glibc, an inhibit_libc build of libgcc is only + part of a bootstrap process. Build the same crt*.o as would be + built with headers present, so that it is not necessary to build + glibc more than once for the bootstrap to converge. */ +# define USE_PT_GNU_EH_FRAME +#endif + #if defined(EH_FRAME_SECTION_NAME) !defined(USE_PT_GNU_EH_FRAME) # define USE_EH_FRAME_REGISTRY #endif -- Joseph S. Myers jos...@codesourcery.com
Re: [google/gcc-4_7] Fix ICE in output_pubnames where union contains a ctor
On Fri, Aug 17, 2012 at 3:09 PM, Cary Coutant ccout...@google.com wrote: This patch is for the google/gcc-4_7 branch. I'll submit it for trunk after the Fission patches have gone in. When adding names to the pubnames table (-gsplit-dwarf or -gpubnames), a method within a union may not get handled properly, sometimes resulting in an internal compiler error in output_pubnames(). This patch fixes the problem by using the existing predicate, class_scope_p, instead of is_class_die, which failed to test for a union type. OK for Google 4.7 Sterling
Re: [google/gcc-4_7] Fix ICE in output_pubnames where union contains a ctor
OK for Google 4.7 Thanks, committed at r190494. -cary
Re: Always define USE_PT_GNU_EH_FRAME in crtstuff.c for glibc
On Fri, Aug 17, 2012 at 3:14 PM, Joseph S. Myers jos...@codesourcery.com wrote: Bootstrapping cross toolchains using glibc is an unduly complicated process involving multiple GCC builds and sometimes multiple glibc builds or glibc header installs. Something like the ideal bootstrap process is described at http://sourceware.org/ml/libc-alpha/2012-03/msg00960.html. I put some changes into glibc (for 2.17) to get closer to this process (in particular, stopping the glibc build from depending on GCC's libgcc_s or libgcc_eh). This means it is now possible to build glibc using an initial static-only GCC build with inhibit_libc defined for target library builds. Such a glibc built with such an initial GCC, however, isn't quite identical to the glibc you get if you then build them full GCC against the first glibc, then rebuild glibc. It's desirable that it really is identical so that a second glibc build is completely redundant. One cause of differences is that when crtstuff.c is built in the first inhibit_libc GCC build, USE_PT_GNU_EH_FRAME does not get defined - and so USE_EH_FRAME_REGISTRY is defined, affecting the crt*.o contents (which then get linked into various glibc shared libraries). The code in crtstuff.c has no actual dependence on link.h, so no need for it to be conditional on inhibit_libc at all. Making USE_PT_GNU_EH_FRAME not depend on inhibit_libc, for glibc-using toolchains, does cause an inconsistency with unwind-dw2-fde-dip.c where the link.h contents are used so removing the inhibit_libc dependency is harder. However, I think this inconsistency is OK, given that inhibit_libc builds for glibc targets are only to be expected to be useful for bootstrapping, not as the final toolchain used once glibc has been built (for that, you'll want libgcc_s and libgcc_eh built against the glibc libraries and headers). This patch accordingly adds another case for defining USE_PT_GNU_EH_FRAME in crtstuff.c: all existing cases are kept as they are, but this new one is added as well. The logic of defined(__GLIBC__) || defined(__gnu_linux__) || defined(__GNU__) is that (a) while __GLIBC__ is usually defined in library headers (which we may not have at this point), kfreebsd-gnu, knetbsd-gnu and kopensolaris-gnu define it in GCC; (b) __gnu_linux__ covers exactly those cases with the Linux kernel where glibc is in use and (c) __GNU__ covers the GNU Hurd, the remaining case of use of glibc. Tested with cross to arm-none-linux-gnueabi that this does indeed fix various cases where previously rebuilding glibc with the second GCC as part of a bootstrap process would result in differences from the first-built glibc, and bootstrapped with no regressions on x86_64-unknown-linux-gnu as a sanity check. OK to commit? With this patch, all the glibc libraries proper - libraries against which one might link, such as libc.so.6 and libm.so.6 - have identical disassembly for both builds, whereas they did not all have identical disassembly before. There are, however, still differences in miscellaneous .so files used by glibc and programs installed by glibc, such as libmemusage.so. In that case, at least some differences result from functions in libgcc.a being hidden in the second GCC build but not the first static-only build. In my view, the logic for giving libgcc functions hidden visibility should be the same whether or not a shared libgcc is being built. (This does mean that in the static-only case, objects from .S files would need the _s.o versions built for use in generating the list of symbols to hide.) Comments? 2012-08-17 Joseph Myers jos...@codesourcery.com * crtstuff.c (USE_PT_GNU_EH_FRAME): Define for systems using glibc even if inhibit_libc. Index: libgcc/crtstuff.c === --- libgcc/crtstuff.c (revision 190491) +++ libgcc/crtstuff.c (working copy) @@ -1,7 +1,7 @@ /* Specialized bits of code needed to support construction and destruction of file-scope objects in C++ code. Copyright (C) 1991, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001 - 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, 2011 + 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, 2011, 2012 Free Software Foundation, Inc. Contributed by Ron Guilmette (r...@monkeys.com). @@ -113,6 +113,19 @@ # define USE_PT_GNU_EH_FRAME # endif #endif + +#if defined(OBJECT_FORMAT_ELF) \ + !defined(OBJECT_FORMAT_FLAT) \ + defined(HAVE_LD_EH_FRAME_HDR) \ + !defined(CRTSTUFFT_O) \ + (defined(__GLIBC__) || defined(__gnu_linux__) || defined(__GNU__)) +/* On systems using glibc, an inhibit_libc build of libgcc is only + part of a bootstrap process. Build the same crt*.o as would be + built with headers present, so that it is not necessary to build + glibc more than once for the bootstrap to converge. */ +# define USE_PT_GNU_EH_FRAME +#endif +
CXX conversion: min g++ version pre-requisite?
Paul Hargrove noted the following build failure on an older x86-32 Linux (Redhat 8.0) system. g++ -c -g -O2 -DIN_GCC -fno-exceptions -fno-rtti -W -Wall -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/build -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/../include -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/../libcpp/include -I/usr/local/pkg/gmp-4.2.4/include -I/usr/local/pkg/mpfr-2.4.2/include -I/usr/local/pkg/mpc-0.8/include -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/../libdecnumber -I/usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/../libdecnumber/bid -I../libdecnumber\ -o build/genoutput.o /usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/genoutput.c /usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/genoutput.c: In function `void note_constraint(rtx_def*, int)': /usr/local/pkg/upc/nightly/compiler/gupc-src/gcc/genoutput.c:1178: error: reinterpret_cast from type `const char (*)[1]' to type `char*' casts away constness make[2]: *** [build/genoutput.o] Error 1 make[2]: Leaving directory `/usr/local/pkg/upc/nightly/compiler/bld/gcc' make[1]: *** [all-gcc] Error 2 make[1]: Leaving directory `/usr/local/pkg/upc/nightly/compiler/bld' make: *** [all] Error 2 The g++ version is: g++ (GCC) 3.4.0 Currently, install.texi states: @heading Tools/packages necessary for building GCC @table @asis @item ISO C90 compiler Necessary to bootstrap GCC, although versions of GCC prior to 3.4 also allow bootstrapping with a traditional (KR) C compiler. To build all languages in a cross-compiler or other configuration where 3-stage bootstrap is not performed, you need to start with an existing GCC binary (version 2.95 or later) because source code for language frontends other than C might use GCC extensions. This appears to be out-of-date with respect to new C++ stage 1 build requirement. Please advise. Thanks, - Gary
Re: CXX conversion: min g++ version pre-requisite?
On Fri, 17 Aug 2012, Gary Funck wrote: Paul Hargrove noted the following build failure on an older x86-32 Linux (Redhat 8.0) system. wow. old. The g++ version is: g++ (GCC) 3.4.0 Currently, install.texi states: @heading Tools/packages necessary for building GCC @table @asis @item ISO C90 compiler Necessary to bootstrap GCC, although versions of GCC prior to 3.4 also allow bootstrapping with a traditional (KR) C compiler. This appears to be out-of-date with respect to new C++ stage 1 build requirement. Or rather, nobody until now built with such an old version to find all problems that need to be worked around if possible, or the new (lowest version to support) oldest version to give up on, where the maintenance yield is too low... People had problems finding systems with gcc as low as 4.1.x, for which problems have recently been solved. I wouldn't be surprised if a detailed bug report would help making 3.4.x supported at least for native bootstrap... Maybe person or persons with such access could help finding the right set of casts, which for gen* programs might be cold enough to minimize the maintenance burden from uglification. brgds, H-P