Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter
On Sun, Dec 14, 2014 at 3:08 PM, H.J. Lu hjl.to...@gmail.com wrote:

The enclosed testcase fails on x86 when compiled with -Os since we pass a byte parameter with a byte load in caller and read it as an int in callee. The reason it only shows up with -Os is that the x86 backend encodes a byte load with an int load if -Os isn't used. When a byte load is used, the upper 24 bits of the register have random values for non-WORD_REGISTER_OPERATIONS targets.

It happens because setup_incoming_promotions in combine.c has

  /* The mode and signedness of the argument before any promotions happen
     (equal to the mode of the pseudo holding it at that stage).  */
  mode1 = TYPE_MODE (TREE_TYPE (arg));
  uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

  /* The mode and signedness of the argument after any source language and
     TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
  mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
  uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

  /* The mode and signedness of the argument as it is actually passed,
     after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
  mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
                                 TREE_TYPE (cfun->decl), 0);

while they are actually passed in register by assign_parm_setup_reg in function.c:

  /* Store the parm in a pseudoregister during the function, but we may
     need to do it in a wider mode.  Using 2 here makes the result
     consistent with promote_decl_mode and thus expand_expr_real_1.  */
  promoted_nominal_mode
    = promote_function_mode (data->nominal_type, data->nominal_mode,
                             &unsignedp,
                             TREE_TYPE (current_function_decl), 2);

where nominal_type and nominal_mode are set up with TREE_TYPE (parm) and TYPE_MODE (nominal_type). TREE_TYPE here is

I think the bug is here, not in combine.c. Can you try going back in history for both snippets and see if they matched at some point?
The bug was introduced by

https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html

commit 5d93234932c3d8617ce92b77b7013ef6bede9508
Author: shinwell shinwell@138bc75d-0d04-0410-961f-82ee72b054a4
Date: Thu Sep 20 11:01:18 2007 +

    gcc/
    * combine.c: Include cgraph.h.
    (setup_incoming_promotions): Rework to allow more aggressive elimination of sign extensions when all call sites of the current function are known to lie within the current unit.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618 138bc75d-0d04-0410-961f-82ee72b054a4

Before this commit, combine.c had

  enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
  int uns = TYPE_UNSIGNED (TREE_TYPE (arg));

  mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
  if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
    {
      rtx x;
      x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
      x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
      record_value_for_reg (reg, first, x);
    }

It matched function.c:

  /* This is not really promoting for a call.  However we need to be
     consistent with assign_parm_find_data_types and expand_expr_real_1.  */
  promoted_nominal_mode
    = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1);

r128618 changed

  mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);

to

  mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1);

It breaks non-WORD_REGISTER_OPERATIONS targets.

Hmm, I think that DECL_ARG_TYPE makes a difference only for non-WORD_REGISTER_OPERATIONS targets. But yeah, in isolation the above change looks wrong. Your patch is ok for trunk if nobody objects within 24h and for branches after a week.

Thanks, Richard.

This patch caused PR64213. Here is the updated patch. The difference is

  mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns3,
                                 TREE_TYPE (cfun->decl), 0);

vs

  mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns1,
                                 TREE_TYPE (cfun->decl), 0);

I made a mistake in my previous patch where I shouldn't have changed uns3 to uns1.
We do want to update mode3 and uns3, not mode3 and uns1. It generates the same code on PR64213 testcase with a cross alpha-linux GCC. Uros, can you test it on Linux/alpha? OK for master, 4.9 and 4.8 branches if it works on Linux/alpha? Yes, this patch works OK [1] on linux/alpha mainline. [1] https://gcc.gnu.org/ml/gcc-testresults/2014-12/msg01867.html Uros.
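To make the failure mode discussed in this thread concrete, here is a minimal sketch in plain C; it is not GCC code. It models a 32-bit register whose low byte was written by a byte load while the upper 24 bits keep stale contents, and contrasts a callee that trusts the upper bits with one that sign-extends from the 8-bit nominal mode itself. The function names and the register model are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a byte load into a 32-bit register on a target
   without WORD_REGISTER_OPERATIONS: only the low 8 bits are written, the
   upper 24 bits keep whatever OLD_REG held before.  */
static uint32_t byte_load_into(uint32_t old_reg, uint8_t value)
{
    return (old_reg & 0xFFFFFF00u) | value;
}

/* Callee that wrongly assumes the caller already extended the argument
   to the full register width: it sees the stale upper bits.  */
static uint32_t callee_trusting_upper_bits(uint32_t reg)
{
    return reg;
}

/* Callee that only relies on the low 8 bits and sign-extends them itself,
   which is what a signed char parameter needs.  */
static int32_t callee_extending(uint32_t reg)
{
    int32_t low = (int32_t)(reg & 0xFFu);
    return (low ^ 0x80) - 0x80;   /* portable sign extension from 8 bits */
}
```

With stale contents 0x12345600 and an argument of 5, the trusting callee sees 0x12345605, while the extending callee sees 5. The sketch only illustrates why combine must not assume an extension the caller never performed.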
Re: [PATCH] Ensure __tsan_func_entry call isn't in a loop (PR sanitizer/64265)
On Fri, Dec 12, 2014 at 10:33 PM, Jakub Jelinek ja...@redhat.com wrote:

Hi!

This patch ensures that if successor of entry bb has multiple predecessors, we emit the __tsan_func_entry call on the edge from entry bb, so it can't be called inside a loop in the same function.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. (I suppose for branches as well)

Thanks, Richard.

2014-12-12 Jakub Jelinek ja...@redhat.com

	PR sanitizer/64265
	* tsan.c (instrument_func_entry): Insert __tsan_func_entry call on edge from entry block to single succ instead of after labels of single succ of entry block.

--- gcc/tsan.c.jj 2014-12-01 14:57:30.0 +0100
+++ gcc/tsan.c 2014-12-12 18:25:26.448608011 +0100
@@ -652,25 +652,24 @@ instrument_memory_accesses (void)
 static void
 instrument_func_entry (void)
 {
-  basic_block succ_bb;
-  gimple_stmt_iterator gsi;
   tree ret_addr, builtin_decl;
   gimple g;
-
-  succ_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
-  gsi = gsi_after_labels (succ_bb);
+  gimple_seq seq = NULL;
 
   builtin_decl = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS);
   g = gimple_build_call (builtin_decl, 1, integer_zero_node);
   ret_addr = make_ssa_name (ptr_type_node);
   gimple_call_set_lhs (g, ret_addr);
   gimple_set_location (g, cfun->function_start_locus);
-  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+  gimple_seq_add_stmt_without_update (&seq, g);
 
-  builtin_decl = builtin_decl_implicit (BUILT_IN_TSAN_FUNC_ENTRY);
+  builtin_decl = builtin_decl_implicit (BUILT_IN_TSAN_FUNC_ENTRY);
   g = gimple_build_call (builtin_decl, 1, ret_addr);
   gimple_set_location (g, cfun->function_start_locus);
-  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+  gimple_seq_add_stmt_without_update (&seq, g);
+
+  edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_insert_seq_on_edge_immediate (e, seq);
 }
 
 /* Instruments function exits.  */

	Jakub
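The difference between the two placements can be illustrated with a small plain C sketch, assuming a function whose first block is also a loop header. This is not tsan or GIMPLE code; the two functions and the counter standing in for the __tsan_func_entry call are hypothetical and exist only to show how often the hook would run in each case.

```c
/* Hook placed after the labels of the entry block's successor.  When that
   successor is also the loop header, the hook runs on every iteration.  */
static int calls_when_hook_in_loop_header(int iterations)
{
    int calls = 0;
    int i = 0;
header:
    calls++;                      /* stands in for __tsan_func_entry */
    if (++i < iterations)
        goto header;
    return calls;
}

/* Hook placed on the edge from the entry block: it runs exactly once,
   before control first reaches the loop header.  */
static int calls_when_hook_on_entry_edge(int iterations)
{
    int calls = 0;
    int i = 0;
    calls++;                      /* stands in for __tsan_func_entry */
header:
    if (++i < iterations)
        goto header;
    return calls;
}
```

For three iterations the first variant records three calls and the second records one, which is the behavior the patch is after.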
Re: Fix streaming of target optimization/option nodes
On Mon, 15 Dec 2014, Jan Hubicka wrote:

Hi, actually this patch breaks Fortran, I get a streaming error in:

lto1: internal compiler error: in streamer_get_pickled_tree

apparently picking error_mark_node for variable constructor results in reading integer_type... ?

Probably the default nodes are referenced by another builtin tree instead and you get inconsistent streaming between f951 and lto1. See the assert placed into record_common_node which you should extend to cover the optimization node trees.

Richard.

Honza

Hi, the testcase in PR ipa/61324 fails because it is compiled with -O0 and linked with -O2. This should not matter anymore if there wasn't the following problem in the streamer that makes us merge all default nodes across units.

Bootstrapped/regtested x86_64-linux, plan to commit it after more testing finishes (Firefox)

Honza

	PR ipa/61324
	* tree-streamer.c (preload_common_nodes): Do not consider optimization and target_option nodes as common nodes; they depend on flags.

Index: tree-streamer.c
===
--- tree-streamer.c (revision 218726)
+++ tree-streamer.c (working copy)
@@ -324,7 +324,14 @@ preload_common_nodes (struct streamer_tr
     /* Skip boolean type and constants, they are frontend dependent.  */
     if (i != TI_BOOLEAN_TYPE
 	&& i != TI_BOOLEAN_FALSE
-	&& i != TI_BOOLEAN_TRUE)
+	&& i != TI_BOOLEAN_TRUE
+	/* Skip optimization and target option nodes; they depend on flags.  */
+	&& i != TI_OPTIMIZATION_DEFAULT
+	&& i != TI_OPTIMIZATION_CURRENT
+	&& i != TI_TARGET_OPTION_DEFAULT
+	&& i != TI_TARGET_OPTION_CURRENT
+	&& i != TI_CURRENT_TARGET_PRAGMA
+	&& i != TI_CURRENT_OPTIMIZE_PRAGMA)
       record_common_node (cache, global_trees[i]);
 }

-- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
RE: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)
-Original Message-
From: Richard Henderson [mailto:r...@redhat.com]
Sent: Saturday, December 13, 2014 3:26 AM
To: Zhenqiang Chen
Cc: Marcus Shawcroft; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)

- tree lhs = gimple_assign_lhs (g);
  enum machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
  rtx target = gen_reg_rtx (mode);
+
+ start_sequence ();
  tmp = emit_cstore (target, icode, NE, cc_mode, cc_mode, 0, tmp, const0_rtx, 1, mode);
  if (tmp)
-   return tmp;
+   {
+     rtx seq = get_insns ();
+     end_sequence ();
+     emit_insn (prep_seq);
+     emit_insn (gen_seq);
+     emit_insn (seq);
+     return tmp;
+   }
+ end_sequence ();

Given that you're already doing delete_insns_since (last) at the end of this function, I don't think you need a new sequence around the emit_cstore. Just

  emit_insn (prep_seq);
  emit_insn (gen_seq);
  tmp = emit_cstore (...);
  if (tmp)
    return tmp;

Updated.

+ int unsignedp = code == LTU || code == LEU || code == GTU || code == GEU;

You don't need to examine the code, you can look at the argument: TYPE_UNSIGNED (TREE_TYPE (treeop0))

Updated.

+ op0 = prepare_operand (icode, op0, 2, op_mode, cmp_mode, unsignedp);
+ op1 = prepare_operand (icode, op1, 3, op_mode, cmp_mode, unsignedp);
+ if (!op0 || !op1)
+   {
+     end_sequence ();
+     return NULL_RTX;
+   }
+ *prep_seq = get_insns ();
+ end_sequence ();
+
+ cmp = gen_rtx_fmt_ee ((enum rtx_code) code, cmp_mode, op0, op1);
+ target = gen_rtx_REG (CCmode, CC_REGNUM);
+
+ create_output_operand (&ops[0], target, CCmode);
+ create_fixed_operand (&ops[1], cmp);
+ create_fixed_operand (&ops[2], op0);
+ create_fixed_operand (&ops[3], op1);

Hmm. With so many fixed operands, I think you may be better off not creating the cmpmode expander in the first place. Just inline the SELECT_CC_MODE and everything right here.

In the patch, I use prepare_operand (icode, op0, 2, ...) to do the operand MODE conversion (from HI/QI to SI), which needs a cmpmode expander.
Without it, I have to add additional code to do the conversion (as in the previous patch, which leads to PR64015).

Thanks!
-Zhenqiang

gen-ccmp-v2.patch Description: Binary data
Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure
Hi,

On 12 Dec 09:17, Jakub Jelinek wrote:
On Fri, Dec 12, 2014 at 09:14:28AM +0100, Thomas Schwinge wrote:
On Tue, 30 Sep 2014 13:16:37 +0200, I wrote:
On Fri, 26 Sep 2014 16:36:21 +0400, Ilya Verbin iver...@gmail.com wrote:

--- a/configure.ac
+++ b/configure.ac
@@ -286,6 +286,24 @@ case ${with_newlib} in
   yes) skipdirs=`echo ${skipdirs} | sed -e 's/ target-newlib / /'` ;;
 esac
 
+AC_ARG_ENABLE(as-accelerator-for,
+[AS_HELP_STRING([--enable-as-accelerator-for=ARG],
+		[build as offload target compiler.
+		Specify offload host triple by ARG])],
+ENABLE_AS_ACCELERATOR_FOR=$enableval,
+ENABLE_AS_ACCELERATOR_FOR=no)

I don't see $ENABLE_AS_ACCELERATOR_FOR being used anywhere, so this can probably be removed?

On Wed, 1 Oct 2014 20:05:45 +0400, Ilya Verbin iver...@gmail.com wrote:

It will be used in one of the upcoming patches.

OK, but why do you need the all-uppercase variant? The lowercase enable_as_accelerator_for already is (automatically) populated by Autoconf, and used in other places? Here is an untested cleanup patch; could you please test this?

	* configure.ac (--enable-as-accelerator-for): Don't set ENABLE_AS_ACCELERATOR_FOR. Update all users.
	* configure: Regenerate.

Ok if it works.

The patch doesn't regress MIC offloading.

-- Thanks, K Jakub
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
Eric Botcazou ebotca...@adacore.com writes:

FWIW I agree this is the right approach, although I can't approve it. The assert above is guarding code that deals with a very general case, including some unusual combinations, so I don't think it would be a good idea to try to remove it entirely.

Yes, but the patch is a bit of a kludge since it short-circuits the meat of the function:

  /* This should always pass, otherwise we don't know how to verify
     the constraint.  These conditions may be relaxed but
     subreg_regno_offset would need to be redesigned.  */
  gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0);
  gcc_assert ((nregs_xmode % nregs_ymode) == 0);

So what would it take to do things properly here, i.e. relax the conditions and adjust downstream?

What do you think we should relax it to though? Obviously there's a balance here between relaxing things enough and not relaxing them too far (so that the EImode AArch64 thing I mentioned is still a noisy failure, for example). ISTM the patch deals with the only significant case that is obviously safe for modes that are not a power of 2 in size.

If you're saying that the condition itself is OK, but that the code should be further down in the function, then I don't think that would gain much. We already have early-outs for the simple cases, such as:

  /* Paradoxical subregs are otherwise valid.  */
  if (!rknown
      && offset == 0
      && GET_MODE_PRECISION (ymode) > GET_MODE_PRECISION (xmode))
    {
      info->representable_p = true;
      /* If this is a big endian paradoxical subreg, which uses more
         actual hard registers than the original register, we must
         return a negative offset so that we find the proper highpart
         of the register.  */
      if (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
          ? REG_WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN)
        info->offset = nregs_xmode - nregs_ymode;
      else
        info->offset = 0;
      info->nregs = nregs_ymode;
      return;
    }

  [...]

  /* Lowpart subregs are otherwise valid.  */
  if (!rknown && offset == subreg_lowpart_offset (ymode, xmode))
    {
      info->representable_p = true;
      rknown = true;

      if (offset == 0 || nregs_xmode == nregs_ymode)
        {
          info->offset = 0;
          info->nregs = nregs_ymode;
          return;
        }
    }

which also come before the assert.

Thanks, Richard
Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits
Thanks for the comments. Patch is updated.

diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..2e865d7 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1603,6 +1603,28 @@ setup_incoming_promotions (rtx_insn *first)
     }
 }
 
+/* Update RSP from INSN's REG_EQUAL note and SRC.  */
+
+static void
+update_rsp_from_reg_equal (reg_stat_type *rsp, rtx_insn *insn, rtx src, rtx x)
+{
+  rtx reg_equal = insn ? find_reg_equal_equiv_note (insn) : NULL_RTX;
+  unsigned HOST_WIDE_INT bits = nonzero_bits (src, nonzero_bits_mode);

Note that src has taken the SHORT_IMMEDIATES_SIGN_EXTEND path here.

+  if (reg_equal)
+    {
+      unsigned int num = num_sign_bit_copies (XEXP (reg_equal, 0),
+                                              GET_MODE (x));
+      bits &= nonzero_bits (XEXP (reg_equal, 0), nonzero_bits_mode);

But XEXP (reg_equal, 0) hasn't here. If we want to treat the datum of the REG_EQUAL or REG_EQUIV note as equivalent to the SET_SRC (set), and I think we should (see for example combine.c:9650), there is a problem.

So I think we should create a new function, something along the lines of:

  /* If MODE has a precision lower than PREC and SRC is a non-negative constant
     that would appear negative in MODE, sign-extend SRC for use in nonzero_bits
     because some machines (maybe most) will actually do the sign-extension and
     this is the conservative approach.

     ??? For 2.5, try to tighten up the MD files in this regard instead of this
     kludge.  */

  rtx
  sign_extend_short_imm (rtx src, machine_mode mode, unsigned int prec)
  {
    if (GET_MODE_PRECISION (mode) < prec
        && CONST_INT_P (src)
        && INTVAL (src) > 0
        && val_signbit_known_set_p (mode, INTVAL (src)))
      src = GEN_INT (INTVAL (src) | ~GET_MODE_MASK (mode));

    return src;
  }

and call it from combine.c:1702

  #ifdef SHORT_IMMEDIATES_SIGN_EXTEND
    src = sign_extend_short_imm (src, GET_MODE (x), BITS_PER_WORD);
  #endif

and from combine.c:9650

  #ifdef SHORT_IMMEDIATES_SIGN_EXTEND
    tem = sign_extend_short_imm (tem, GET_MODE (x), GET_MODE_PRECISION (mode));
  #endif

Once this is done, the same thing needs to be applied to XEXP (reg_equal, 0) before it is sent to nonzero_bits.

-  /* Don't call nonzero_bits if it cannot change anything.  */
-  if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
-    rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
   num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x));
   if (rsp->sign_bit_copies == 0 || rsp->sign_bit_copies > num)
     rsp->sign_bit_copies = num;
+
+  /* Don't call nonzero_bits if it cannot change anything.  */
+  if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
+    update_rsp_from_reg_equal (rsp, insn, src, x);

Can't we improve on this? rsp->sign_bit_copies is modified both here and in update_rsp_from_reg_equal, but rsp->nonzero_bits is modified only in the latter function. There is no reason for this discrepancy, so they ought to be handled the same way, either entirely here or entirely in the function.

-- Eric Botcazou
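The arithmetic behind the proposed sign_extend_short_imm can be tried outside of GCC with a minimal sketch on plain 64-bit integers. Everything here is an assumption for illustration: the function name, the use of a bit count in place of a machine_mode, and int64_t in place of an rtx constant. It only mirrors the condition and the OR with the inverted mode mask from the function quoted above.

```c
#include <stdint.h>

/* Hypothetical stand-in for sign_extend_short_imm on plain integers.
   MODE_BITS plays the role of GET_MODE_PRECISION (mode) and PREC is the
   precision the value will be used in.  */
static int64_t sign_extend_short_imm_sketch(int64_t src, unsigned mode_bits,
                                            unsigned prec)
{
    /* Keep the sketch well-defined: only handle modes of 1..63 bits.  */
    if (mode_bits == 0 || mode_bits >= 64)
        return src;

    /* Equivalent of GET_MODE_MASK: the low MODE_BITS bits set.  */
    int64_t mask = (int64_t)((UINT64_C(1) << mode_bits) - 1);

    /* A positive constant whose sign bit in the narrow mode is set would
       look negative in that mode, so extend it conservatively.  */
    if (mode_bits < prec && src > 0 && ((src >> (mode_bits - 1)) & 1))
        src |= ~mask;

    return src;
}
```

For example, 0x8000 in a 16-bit mode used at 32-bit precision becomes -32768, while 0x7FFF is left alone, and nothing changes when the mode is already as wide as the requested precision.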
[PATCH 2/2] RTEMS: Use MULTILIB_REQUIRED for ARM
This patch should be applied to GCC mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/arm/t-rtems: Use MULTILIB_REQUIRED instead of MULTILIB_EXCEPTIONS. --- gcc/config/arm/t-rtems | 173 - 1 file changed, 13 insertions(+), 160 deletions(-) diff --git a/gcc/config/arm/t-rtems b/gcc/config/arm/t-rtems index 92c4dcb..3b62181 100644 --- a/gcc/config/arm/t-rtems +++ b/gcc/config/arm/t-rtems @@ -1,4 +1,4 @@ -# Custom RTEMS EABI multilibs +# Custom RTEMS multilibs for ARM MULTILIB_OPTIONS = mbig-endian mthumb march=armv6-m/march=armv7-a/march=armv7-r/march=armv7-m mfpu=neon/mfpu=vfpv3-d16/mfpu=fpv4-sp-d16 mfloat-abi=hard MULTILIB_DIRNAMES = eb thumb armv6-m armv7-a armv7-r armv7-m neon vfpv3-d16 fpv4-sp-d16 hard @@ -6,162 +6,15 @@ MULTILIB_DIRNAMES = eb thumb armv6-m armv7-a armv7-r armv7-m neon vfpv3-d16 fpv4 # Enumeration of multilibs MULTILIB_EXCEPTIONS = -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=fpv4-sp-d16/mfloat-abi=hard 
-MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-a -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=neon -# MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r/mfloat-abi=hard -# MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-r -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv7-m -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/mthumb/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/mthumb -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=neon/mfloat-abi=hard 
-MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfpu=fpv4-sp-d16 -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv6-m -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=neon/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=neon -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=vfpv3-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=vfpv3-d16 -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=fpv4-sp-d16/mfloat-abi=hard -MULTILIB_EXCEPTIONS += mbig-endian/march=armv7-a/mfpu=fpv4-sp-d16
[PATCH 1/2] RTEMS: Rename ARM target config files
Now that we only have the EABI configuration for RTEMS rename the files to match the pattern used for the other RTEMS targets. This patch should be applied to GCC mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/arm/t-rtems-eabi: Rename to... * config/arm/t-rtems: ...this. * config/arm/rtems-eabi.h: Rename to... * config/arm/rtems.h: ...this. * config.gcc (arm*-*-rtems*): Reflect changes above. --- gcc/config.gcc | 4 +- gcc/config/arm/rtems-eabi.h | 29 gcc/config/arm/rtems.h | 29 gcc/config/arm/t-rtems | 167 gcc/config/arm/t-rtems-eabi | 167 5 files changed, 198 insertions(+), 198 deletions(-) delete mode 100644 gcc/config/arm/rtems-eabi.h create mode 100644 gcc/config/arm/rtems.h create mode 100644 gcc/config/arm/t-rtems delete mode 100644 gcc/config/arm/t-rtems-eabi diff --git a/gcc/config.gcc b/gcc/config.gcc index 8541274..e49bcbd 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1057,8 +1057,8 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*) use_gcc_stdint=wrap ;; arm*-*-rtems*) - tm_file=${tm_file} rtems.h arm/rtems-eabi.h newlib-stdint.h - tmake_file=${tmake_file} arm/t-bpabi arm/t-rtems-eabi + tm_file=${tm_file} rtems.h arm/rtems.h newlib-stdint.h + tmake_file=${tmake_file} arm/t-bpabi arm/t-rtems ;; arm*-*-symbianelf*) tm_file=${tm_file} arm/symbian.h diff --git a/gcc/config/arm/rtems-eabi.h b/gcc/config/arm/rtems-eabi.h deleted file mode 100644 index 4bdcf0d..000 --- a/gcc/config/arm/rtems-eabi.h +++ /dev/null @@ -1,29 +0,0 @@ -/* Definitions for RTEMS based ARM systems using EABI. - Copyright (C) 2011-2014 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published - by the Free Software Foundation; either version 3, or (at your - option) any later version. 
- - GCC is distributed in the hope that it will be useful, but WITHOUT - ANY WARRANTY; without even the implied warranty of MERCHANTABILITY - or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public - License for more details. - - You should have received a copy of the GNU General Public License - along with GCC; see the file COPYING3. If not see - http://www.gnu.org/licenses/. */ - -#define HAS_INIT_SECTION - -#undef TARGET_OS_CPP_BUILTINS -#define TARGET_OS_CPP_BUILTINS() \ -do { \ - builtin_define (__rtems__); \ - builtin_define (__USE_INIT_FINI__); \ - builtin_assert (system=rtems);\ - TARGET_BPABI_CPP_BUILTINS();\ -} while (0) diff --git a/gcc/config/arm/rtems.h b/gcc/config/arm/rtems.h new file mode 100644 index 000..4bdcf0d --- /dev/null +++ b/gcc/config/arm/rtems.h @@ -0,0 +1,29 @@ +/* Definitions for RTEMS based ARM systems using EABI. + Copyright (C) 2011-2014 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + http://www.gnu.org/licenses/. 
*/ + +#define HAS_INIT_SECTION + +#undef TARGET_OS_CPP_BUILTINS +#define TARGET_OS_CPP_BUILTINS() \ +do { \ + builtin_define (__rtems__); \ + builtin_define (__USE_INIT_FINI__); \ + builtin_assert (system=rtems);\ + TARGET_BPABI_CPP_BUILTINS();\ +} while (0) diff --git a/gcc/config/arm/t-rtems b/gcc/config/arm/t-rtems new file mode 100644 index 000..92c4dcb --- /dev/null +++ b/gcc/config/arm/t-rtems @@ -0,0 +1,167 @@ +# Custom RTEMS EABI multilibs + +MULTILIB_OPTIONS = mbig-endian mthumb march=armv6-m/march=armv7-a/march=armv7-r/march=armv7-m mfpu=neon/mfpu=vfpv3-d16/mfpu=fpv4-sp-d16 mfloat-abi=hard +MULTILIB_DIRNAMES = eb thumb armv6-m armv7-a armv7-r armv7-m neon vfpv3-d16 fpv4-sp-d16 hard + +# Enumeration of multilibs + +MULTILIB_EXCEPTIONS = +MULTILIB_EXCEPTIONS += mbig-endian/mthumb/march=armv6-m/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS +=
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On Fri, Dec 12, 2014 at 10:14:21AM -0800, Richard Henderson wrote:
On 12/12/2014 04:06 AM, Dominik Vogt wrote:

I'm not sure I've posted the missing patch anywhere yet, so it's attached to this message. At the moment it enables FFI_TYPE_COMPLEX only for s390[x], but eventually this should be used unconditionally.

Thanks for that. I'd been meaning to get around to that. I'll change the test to use FFI_TARGET_HAS_COMPLEX_TYPE and apply it to my branch.

Good. I'm not sure whether it's a good idea to expose FFI_TARGET_HAS_COMPLEX_TYPE as part of the libffi interface though. It was meant as a temporary thing to be removed once all platforms supported by libffi have implemented complex support. A while ago I posted a patch to change the macro's name to begin with an underscore to make that clearer.

(This still leaves the dynamic linking issue if we do not use libffi for reflection calls with x86* and s390[x]. Is the plan to remove the platform specific abi code for the few platforms that have it? I see no way to make them work with the static chain patch anyway.)

Well, the x86 paths were updated to work with the static chain, but indeed that required assembly rather than cheating and using C as you did. But removing all of that was always my goal. Indeed, my branch now has a patch to remove all of the target-specific code.

Fine with that. I wouldn't have written the s390-specific ABI code in Go if libffi had been an option back then.

Ciao

Dominik ^_^ ^_^

-- Dominik Vogt IBM Germany
RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.
-Original Message- From: Christophe Lyon [mailto:christophe.l...@linaro.org] Sent: 11 December 2014 13:47 To: David Sherwood Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; Tejas Belagod; Richard Sandiford Subject: Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. On 11 December 2014 at 11:16, David Sherwood david.sherw...@arm.com wrote: Hi Christophe, Sorry to bother you again. After my clarification email below are you now happy for these patches to go in? Kind Regards, David Sherwood. -Original Message- From: David Sherwood [mailto:david.sherw...@arm.com] Sent: 27 November 2014 14:53 To: 'Christophe Lyon' Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; 'Tejas Belagod'; Richard Sandiford Subject: RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. On 18 November 2014 10:14, David Sherwood david.sherw...@arm.com wrote: Hi Christophe, Ah sorry. My mistake - it fixes this in bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810 I did look at that PR, but since it has no testcase attached, I was unsure. And I am still not :-) PR 59810 is [AArch64] LDn/STn implementations are not ABI-conformant for bigendian. but the advsimd-intrinsics/vldX.c and vldX_lane.c now PASS with Alan's patches on aarch64_be, so I thought Alan's patches solve PR59810. What am I missing? Hi Christophe, I think probably this is our fault for making our lives way too difficult and artificially splitting all these patches up. :) Alan's patch: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html fixes some issues on aarch64_be, but also causes regressions. For example, Tests that now fail, but worked before: aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects execution test aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c execution test aarch64_be-elf-aem: gcc.dg/vect/vect-over-widen-1-big-array.c -flto -ffat-lto-objects execution test ... 
Tests that now work, but didn't before: aarch64_be-elf-aem: gcc.dg/vect/fast-math-vect-complex-3.c execution test aarch64_be-elf-aem: gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c execution test aarch64_be-elf-aem: gcc.dg/vect/no-scevccp-outer-10a.c execution test ... I didn't notice that because I tested Alan's patch only against the advsimd-intrinsics tests. In this respect, I don't understand why your ChangeLog entry says * config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i, vec_load_lanes(o/c/x)i): Fixed to work for Big Endian. since the existing advsimd-intrinsics tests already pass with Alan's patch alone or is vld1_lane still broken (for which I haven't posted a test yet)? Yes, I think the change log is unclear and I will change it. The only thing that was broken was not adhering to the ABI, but we don't have any specific regression tests that prove this. His patch is only half of the story and must be applied at the same time as the [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. patch. With both patches applied the result looks much healthier: # Comparing 1 common sum files ## /bin/sh ./src/gcc/contrib/compare_tests /tmp/gxx-sum1.10051 /tmp/gxx-sum2.10051 Tests that now work, but didn't before: aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer execution test aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-all-loops -finline- functions execution test aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops execution test ... with no new regressions. After applying both patches the aarch64_be gcc testsuite is on a parity with the aarch64 testsuite. 
Furthermore, after applying both of these patches:

[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe
[AArch64] [BE] Fix vector load/stores to not use ld1/st1

it then becomes safe for us to remove the CCMC macro, which is the cause of unnecessary spills to the stack for certain auto-vectorised code. So really I suppose when I posted my second patch [AArch64] [BE] [2/2] Make large opaque integer modes endianness-safe I should have really just called this [AArch64] [BE] Remove CCMC for aarch64 in order to make it clear exactly what the purpose of these patches is.

well, not yet since this very patch does not remove it :-)

Again, this is my fault as I made a mistake in the change log. If you look at the actual patch the CCMC macro is removed. Let me re-post corrected, more sensible change logs for both of those changes here:

[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe

ChangeLog:

gcc/:
2014-10-10 David Sherwood david.sherw...@arm.com
2014-10-10 Tejas
Re: [PATCH 2/4] Add Visium support to libgcc
Do you have a reason for using fp-bit instead of soft-fp? Apart from the obvious historical reason, probably not, but recently added ports (Blackfin, Epiphany) also use it so I'm not sure we want to change it. libgcc files are generally GPL+exception, not LGPL without exception with a very old FSF address (config/visium/div64.c, mod64.c, set_trampoline_parity.c, udiv64.c, udivmod64.c, umod64.c) Files whose copyright/origin is clear are already GPL+exception, but these 6 files were originally imported from Glibc so they aren't in the same basket. I guess I can reuse the copyright notice of soft-fp for them. -- Eric Botcazou
[PATCH] Small i?86 testsuite tweaks (PR target/64210)
Hi! As mentioned in the PR, some tests fail with -fpic. The problem is that they are expecting a 32-bit GPR must start with %e, but %r8d or %r15d are 32-bit GPRs too. The other problem is that PIC code has some loads/stores different from non-pic code, so the counts looking e.g. for loads with ( right after tab don't match the expected values, etc. Regtested on x86_64-linux and i686-linux, ok for trunk? 2014-12-10 Jakub Jelinek ja...@redhat.com PR target/64210 * gcc.target/i386/avx512f-broadcast-gpr-1.c: Use %(?:e|r\[0-9\]+d) instead of %e in regexps trying to match 32-bit GPR. * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise. * gcc.target/i386/avx512vl-vpbroadcastd-1.c: Likewise. * gcc.target/i386/avx512vl-vmovdqa64-1.c: Restrict some scan-assembler-times lines to nonpic targets only. Fix up \[^\n^x^y\] to \[^\nxy\]. --- gcc/testsuite/gcc.target/i386/avx512f-vpbroadcastd-1.c.jj 2014-12-03 16:33:53.0 +0100 +++ gcc/testsuite/gcc.target/i386/avx512f-vpbroadcastd-1.c 2014-12-10 15:23:07.611650110 +0100 @@ -3,9 +3,9 @@ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 2 } } */ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 2 } } */ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 2 } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 1 } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 } } 
*/ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 1 } } */ #include immintrin.h --- gcc/testsuite/gcc.target/i386/avx512f-broadcast-gpr-1.c.jj 2014-12-03 16:33:53.0 +0100 +++ gcc/testsuite/gcc.target/i386/avx512f-broadcast-gpr-1.c 2014-12-10 15:23:42.132036283 +0100 @@ -1,7 +1,7 @@ /* { dg-do compile } */ /* { dg-options -mavx512f -O2 } */ /* { dg-final { scan-assembler-times vpbroadcastq\[ \\t\]+%r\[^\n\]+%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]+%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]+%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ /* { dg-final { scan-assembler-times vpbroadcastq\[ \\t\]+\[^\n\]+%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 { target ia32 } } } */ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\n\]+%zmm\[0-9\]+(?:\n|\[ \\t\]+#) 1 { target ia32 } } } */ --- gcc/testsuite/gcc.target/i386/avx512vl-vpbroadcastd-1.c.jj 2014-12-03 16:33:54.0 +0100 +++ gcc/testsuite/gcc.target/i386/avx512vl-vpbroadcastd-1.c 2014-12-10 15:20:36.394339145 +0100 @@ -4,10 +4,10 @@ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 2 } } */ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 2 } } */ /* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 2 } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 1 { target { ! 
{ ia32 } } } } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ -/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%e\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[ \\t\]+%(?:e|r\[0-9\]+d)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#) 1 { target { ! { ia32 } } } } } */ +/* { dg-final { scan-assembler-times vpbroadcastd\[
Re: [Patch, Fortran] PR 63674: procedure pointer and non/pure procedure
2014-12-15 7:34 GMT+01:00 Tobias Burnus bur...@net-b.de: Can you change non-pure to impure? That would better match the Fortran naming, where impure is the default unless pure or elemental is used. (It was added to permit impure elemental procedures.) Yes, sure. I have committed this change as r218738. Cheers, Janus
[committed] Fix thread_local10.C on the 4.8 branch
Hi! The thread_local10.C fails, because if plugin.exp is sourced before tls.exp on the 4.8 branch, DEFAULT_CXXFLAGS includes -ansi and the thread_local10.C test, which is C++11, but has no dg-options, fails with: /usr/src/gcc-4.8/gcc/testsuite/g++.dg/tls/thread_local10.C:11:10: error: 'thread_local' does not name a type /usr/src/gcc-4.8/gcc/testsuite/g++.dg/tls/thread_local10.C:13:14: error: 's' was not declared in this scope /usr/src/gcc-4.8/gcc/testsuite/g++.dg/tls/thread_local10.C:17:23: error: 'thread_local' does not name a type because of that. This has been fixed in 4.9+ by dropping the -ansi for C++ tests. Acked by Jason in the PR, committed to 4.8 branch. 2014-12-15 Jakub Jelinek ja...@redhat.com PR middle-end/58624 Backported from mainline 2014-03-07 Jason Merrill ja...@redhat.com * g++.dg/plugin/plugin.exp (DEFAULT_CXXFLAGS): Remove -ansi. --- gcc/testsuite/g++.dg/plugin/plugin.exp (revision 218737) +++ gcc/testsuite/g++.dg/plugin/plugin.exp (working copy) @@ -31,7 +31,7 @@ if { ![info exists TESTING_IN_BUILD_TREE # If a testcase doesn't have special options, use these. global DEFAULT_CXXFLAGS if ![info exists DEFAULT_CXXFLAGS] then { -set DEFAULT_CXXFLAGS -ansi -pedantic-errors -Wno-long-long +set DEFAULT_CXXFLAGS -pedantic-errors -Wno-long-long } # The procedures in plugin-support.exp need these parameters. Jakub
Re: [PATCH] Small i?86 testsuite tweaks (PR target/64210)
On Wed, Dec 10, 2014 at 9:08 PM, Jakub Jelinek ja...@redhat.com wrote: As mentioned in the PR, some tests fail with -fpic. The problem is that they are expecting a 32-bit GPR must start with %e, but %r8d or %r15d are 32-bit GPRs too. The other problem is that PIC code has some loads/stores different from non-pic code, so the counts looking e.g. for loads with ( right after tab don't match the expected values, etc. Regtested on x86_64-linux and i686-linux, ok for trunk? 2014-12-10 Jakub Jelinek ja...@redhat.com PR target/64210 * gcc.target/i386/avx512f-broadcast-gpr-1.c: Use %(?:e|r\[0-9\]+d) instead of %e in regexps trying to match 32-bit GPR. * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise. * gcc.target/i386/avx512vl-vpbroadcastd-1.c: Likewise. * gcc.target/i386/avx512vl-vmovdqa64-1.c: Restrict some scan-assembler-times lines to nonpic targets only. Fix up \[^\n^x^y\] to \[^\nxy\]. OK. Thanks, Uros.
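For readers following along, here is a rough C sketch of why the widened pattern matters. It is not part of the patch: the function names are made up for illustration, and it uses POSIX regcomp/regexec with a plain group, because POSIX extended regular expressions have no (?:...) syntax like the Tcl patterns in the testsuite do.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if OPERAND starts with a 32-bit GPR
   name in AT&T syntax, either the legacy %e.. form or %r8d..%r15d.  */
int
starts_with_gpr32 (const char *operand)
{
  regex_t re;
  int matched;

  /* Same idea as %(?:e|r\[0-9\]+d) in the dg-final lines, written as
     a POSIX ERE with an ordinary group.  */
  if (regcomp (&re, "^%(e|r[0-9]+d)", REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  matched = regexec (&re, operand, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* Hypothetical helper modelling the pre-patch expectation: only
   register names that begin with %e are accepted.  */
int
starts_with_percent_e (const char *operand)
{
  return operand[0] == '%' && operand[1] == 'e';
}
```

With these, "%eax" and "%r8d" both pass starts_with_gpr32, while starts_with_percent_e rejects "%r8d", which is the kind of mismatch the -fpic runs exposed.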
Re: [PATCH, x86][PIC] Making check for PIC register in address cost calculation only on RTL level
On Fri, Dec 12, 2014 at 1:21 PM, Zamyatin, Igor igor.zamya...@intel.com wrote: When adding checks for the PIC register in address cost calculation (http://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00411.html) it was meant to affect only RTL passes. Since !pic_offset_table_rtx is not enough for that (I see that pic_offset_table_rtx is enabled at the GIMPLE level), the following change explicitly adds this restriction. Bootstrapped and regtested with RUNTESTFLAGS=--target_board='unix{-m32,-fpic}' Is it ok for trunk? 2014-12-12 Igor Zamyatin igor.zamya...@intel.com * config/i386/i386.c (ix86_address_cost): Add explicit restriction to RTL level for the check for PIC register. OK. Thanks, Uros.
Re: [patch c++]: Fix PR/63996
... committed as obvious the below. Paolo. / 2014-12-15 Paolo Carlini paolo.carl...@oracle.com * g++.dg/cpp1y/pr63996.C: Fix. Index: g++.dg/cpp1y/pr63996.C === --- g++.dg/cpp1y/pr63996.C (revision 218737) +++ g++.dg/cpp1y/pr63996.C (working copy) @@ -3,7 +3,7 @@ constexpr int foo (int i) { - int a[i] = { }; + int a[i] = { }; // { dg-error forbids variable length } } constexpr int j = foo (1); // { dg-error is not a constant expression }
[committed] Fix ipa/pr63551.c testcase
Hi! This testcase fails on the 4.9 branch (where C89 is the default) on 32-bit targets, because of warning: this decimal constant is unsigned only in ISO C90 warning. On the trunk it doesn't warn because we default to C11. In any case, the testcase also fails the same way before the r218205 fix and succeeds with it if I add U suffix, so I've committed this fix to trunk and 4.9 branches as obvious. 2014-12-15 Jakub Jelinek ja...@redhat.com PR tree-optimization/63551 * gcc.dg/ipa/pr63551.c (fn2): Use 4294967286U instead of 4294967286 to avoid warnings. --- gcc/testsuite/gcc.dg/ipa/pr63551.c.jj 2014-12-01 14:57:16.0 +0100 +++ gcc/testsuite/gcc.dg/ipa/pr63551.c 2014-12-15 11:41:27.866596259 +0100 @@ -21,7 +21,7 @@ void fn2 () { d = 0; - union U b = { 4294967286 }; + union U b = { 4294967286U }; fn1 (b); } Jakub
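As a side note, the type difference behind that warning can be shown with a small C11 sketch. The macro and function names below are invented for illustration and are not from the testcase. Under C99 and later rules an unsuffixed decimal constant only ever gets a signed type (int, long, or long long, whichever fits first), whereas under C90 rules on a target with 32-bit long the same constant becomes unsigned long, which is what the warning points out. The U suffix gives an unsigned type in every dialect.

```c
/* Evaluates to 1 if the expression has an unsigned standard integer
   type of at least int rank, 0 otherwise.  Needs C11 for _Generic.  */
#define HAS_UNSIGNED_TYPE(x) \
  _Generic ((x), unsigned int: 1, unsigned long: 1, \
            unsigned long long: 1, default: 0)

/* Unsuffixed decimal constant: signed under C99/C11 rules.  */
int
unsuffixed_is_unsigned (void)
{
  return HAS_UNSIGNED_TYPE (4294967286);
}

/* With the U suffix the constant is unsigned in every dialect.  */
int
suffixed_is_unsigned (void)
{
  return HAS_UNSIGNED_TYPE (4294967286U);
}
```

Since _Generic is C11-only, this sketch cannot itself be compiled in C90 mode; it only shows the C99-and-later half of the story.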
Re: [Patch, libstdc++/64302, libstdc++/64303] Fix match_results iterators and regex_token_iterator
On 14/12/14 15:23 -0800, Tim Shen wrote: Bootstrapped and tested :) I'm not sure if it's safe to directly backport it to 4.9, since one inline function (_M_normalize_result) is added; but at least we can manually inline them by copy paste. It's OK for trunk and 4.9, without changes. Code compiled against 4.9.2 that inlines some of the regex code won't be affected by the added function. Code that doesn't inline the code will start calling the new function. So it won't hurt and might help.
[PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
If combine is given a parallel I2, it splits it into separate I1 and I2 insns in some cases (one case since the beginning of time; more cases since my r218248). It gives the new I1 the same UID as the I2 already has; there is a comment that this works fine because the I1 never is added to the insn stream.

When combine eventually replaces the insns with something new, it calls SET_INSN_DELETED on those it wants gone. Since r125624 (when DF was added, back in 2007) SET_INSN_DELETED uses the UID of the insn it is called for to do the deletion for dataflow. So since then, when such a split I1 is deleted, DF thinks I2 is deleted as well. This of course explodes if I2 is in fact needed (but only if that I2 still exists at the end of combine, i.e. the insn was not combined to further down; that might explain why it wasn't noticed before).

This should be fixed properly, but it is somewhat involved. This patch simply disallows all combinations coming from a split parallel I2 if it needs a new I2. This fixes PR target/64268.

Bootstrapped on powerpc64-linux and powerpc-linux; the fails are gone. Regtested on powerpc64-linux -m32,-m32/-mpowerpc64,-m64,-m64/-mlra.

Is this okay for mainline?

Segher

2014-12-15  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
	* combine.c (try_combine): Don't try to split newpat if we started with I2 a PARALLEL that we split.

---
 gcc/combine.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 8995c1d3..de2e49f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3471,6 +3471,17 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   if (insn_code_number < 0)
     insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
 
+  /* If we got I1 and I2 from splitting the original (parallel) I2 into two,
+     I1 and I2 have the same UID, which means that DF ends up deleting I2
+     when it is asked to delete I1.  So only allow this if we want I2 deleted,
+     that is, if we get only one insn as combine result; don't try to split
+     off a new I2 if it doesn't match yet.  */
+  if (i1 && insn_code_number < 0 && INSN_UID (i1) == INSN_UID (i2))
+    {
+      undo_all ();
+      return 0;
+    }
+
   /* If we were combining three insns and the result is a simple SET
      with no ASM_OPERANDS that wasn't recognized, try to split it into two
      insns.  There are two ways to do this.  It can be split using a
-- 
1.8.1.4
Re: patches for libstdc++ in #64271 (bootstrap on NetBSD)
On 11/12/14 19:22 +0100, Kai-Uwe Eckhardt wrote: --- libstdc++-v3/config/os/bsd/netbsd/ctype_base.h.orig 2014-12-10 22:18:50.0 +0100 +++ libstdc++-v3/config/os/bsd/netbsd/ctype_base.h 2014-12-10 22:20:31.0 +0100 @@ -43,9 +43,22 @@ // NB: Offsets into ctype<char>::_M_table force a particular size // on the mask type. Because of this, we don't use an enum. -typedef unsigned char mask; -#ifndef _CTYPE_U +#if defined(_CTYPE_BL) What is _CTYPE_BL? If it corresponds to the blank character class then I would expect the ctype_base::blank mask to be changed by this patch as well. +typedef unsigned short mask; As I said in the Bugzilla comments, changing this type alters the ABI for NetBSD. Are C++ binaries compiled with NetBSD 5.0 expected to run unchanged on NetBSD 7.0? Or is an ABI break acceptable for the target? The other changes are OK, although they probably don't solve the problem in isolation without the ctype_base.h changes.
[PATCH] rs6000: Do not allow GPR0 for addic. if it is split
If an addic. is split to addi+cmp (because RA didn't give it CR0), it will do the wrong thing if the input reg is GPR0 (addi X,0,N is li X,N). So don't allow such an input. Spotted visually while investigating PR64268. Tested etc.; okay for mainline? Segher 2014-12-15 Segher Boessenkool seg...@kernel.crashing.org gcc/ * gcc/config/rs6000/rs6000.md (*addmode3_imm_dot, *addmode3_imm_dot2): Change the constraint for the second alternative for operand 1 from r to b. --- gcc/config/rs6000/rs6000.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index bb9ab0f..36e6182 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1569,7 +1569,7 @@ (define_insn_and_split *addmode3_dot2 (define_insn_and_split *addmode3_imm_dot [(set (match_operand:CC 3 cc_reg_operand =x,?y) - (compare:CC (plus:GPR (match_operand:GPR 1 gpc_reg_operand %r,r) + (compare:CC (plus:GPR (match_operand:GPR 1 gpc_reg_operand %r,b) (match_operand:GPR 2 short_cint_operand I,I)) (const_int 0))) (clobber (match_scratch:GPR 0 =r,r)) @@ -1592,7 +1592,7 @@ (define_insn_and_split *addmode3_imm_dot (define_insn_and_split *addmode3_imm_dot2 [(set (match_operand:CC 3 cc_reg_operand =x,?y) - (compare:CC (plus:GPR (match_operand:GPR 1 gpc_reg_operand %r,r) + (compare:CC (plus:GPR (match_operand:GPR 1 gpc_reg_operand %r,b) (match_operand:GPR 2 short_cint_operand I,I)) (const_int 0))) (set (match_operand:GPR 0 gpc_reg_operand =r,r) -- 1.8.1.4
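To spell out the GPR0 problem, here is a toy C model of the two instructions. It is only a sketch: the function names and the regs[] array are invented for illustration, and carry and CR0 updates are ignored. The one fact it relies on is the one stated in the message: in addi, an RA field of 0 means the literal value 0 rather than the contents of GPR0, while addic. reads the register normally.

```c
#include <stdint.h>

/* Toy model of "addi RT,RA,SI": regs[] stands in for the 32 GPRs.
   An RA field of 0 selects the constant 0, not GPR0.  */
int64_t
model_addi (const int64_t regs[32], unsigned ra, int16_t si)
{
  int64_t base = (ra == 0) ? 0 : regs[ra];
  return base + si;
}

/* Toy model of "addic. RT,RA,SI": always reads the register, GPR0
   included.  Carry and CR0 side effects are left out.  */
int64_t
model_addic (const int64_t regs[32], unsigned ra, int16_t si)
{
  return regs[ra] + si;
}
```

For any RA other than 0 the two models agree, so the split is harmless there; with RA = 0 the addi form drops the register value entirely. The "b" constraint (base registers, which exclude GPR0) keeps the register allocator from picking that case for the alternative that gets split.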
Re: [PATCH 3/4] Add Visium support to gcc
Use of `%s' in diagnostics is long obsoleted by %qs (in this case, using %qE with the identifier directly, rather than using IDENTIFIER_POINTER, is preferred).

Only occurrence fixed by mimicking the i386 port.

INTVAL / UINTVAL return HOST_WIDE_INT / unsigned HOST_WIDE_INT, not long / unsigned long. You have lots of uses of fprintf that presume they return long / unsigned long.

Changed into HOST_WIDE_INT_PRINT_DEC/HOST_WIDE_INT_PRINT_UNSIGNED.

As you have the interrupt attribute, you need to add this port to the list in extend.texi of ports with this attribute. (Generally, check the checklist of pieces in sourcebuild.texi to update for a new port.)

Done. But there is a bit of a contradiction in sourcebuild.texi:

* Entries in `gcc/doc/install.texi' for all target triplets supported with this target architecture, giving details of any special notes about installation for this target, or saying that there are no special notes if there are none.

But gcc/doc/install.texi has:

Note that this list of install notes is @emph{not} a list of supported hosts or targets. Not all supported hosts and targets are listed here, only the ones that require host-specific or target-specific information have to.

I have nevertheless added visium-*-elf to gcc/doc/install.texi. I'll note that libbacktrace, libcc1, libcilkrts, liboffloadmic, libsanitizer and libvtv are not documented in there.

At least one target for this port should be added to contrib/config-list.mk

This is documented in sourcebuild.texi, I'll take the 5 steps covered by

If the back end is added to the official GCC source repository, the following are also necessary:

when the premise is fulfilled.

(and you should verify that the port builds cleanly with --enable-werror-always, for both 32-bit and 64-bit hosts, when building using current trunk GCC).

Do you mean a bootstrap of the cross-compiler with --enable-werror-always on a 32-bit and a 64-bit host? OK, I'll do that.

Thanks for the review.

-- Eric Botcazou
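The INTVAL / fprintf point is the usual "format must match the integer's real width" issue. A portable analogy in plain C, offered as a sketch and not as GCC code: the typedef and function name are invented, int64_t merely stands in for HOST_WIDE_INT, and PRId64 plays the role that HOST_WIDE_INT_PRINT_DEC plays inside GCC.

```c
#include <inttypes.h>
#include <stdio.h>

/* Stand-in for HOST_WIDE_INT; an assumption made for this sketch.  */
typedef int64_t wide_int_t;

/* Formats VALUE into BUF with a conversion that matches its width on
   every host, instead of hard-coding %ld and hoping long is 64 bits.
   Returns what snprintf returns.  */
int
format_wide (char *buf, size_t size, wide_int_t value)
{
  return snprintf (buf, size, "%" PRId64, value);
}
```

On a host where long is 32 bits, passing such a value to a plain %ld conversion would be undefined behaviour, which is why the width-aware macro is the safer choice.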
Re: Do not build callgraph for external functions when inlining
Jan Hubicka hubi...@ucw.cz writes:

* cgraphunit.c (analyze_functions): Do not analyze extern inline functions when not optimizing; skip comdat locals.

FAIL: g++.dg/torture/pr60854.C -O0 (test for excess errors)
Excess errors:
/usr/local/gcc/gcc-20141215/gcc/testsuite/g++.dg/torture/pr60854.C:5:46: error: inlining failed in call to always_inline 'MyClass<T>::MyClass() [with T = double]': function body not available
/usr/local/gcc/gcc-20141215/gcc/testsuite/g++.dg/torture/pr60854.C:12:19: error: called from here

Andreas.

-- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: [patch c++]: Fix PR/63996
2014-12-15 11:48 GMT+01:00 Paolo Carlini paolo.carl...@oracle.com: ... committed as obvious the below. Paolo. / Thanks Kai
[PATCH] More TYPE_OVERFLOW_* fallout (PR middle-end/64292)
m68k revealed another missing check before TYPE_OVERFLOW_WRAPS. I think we should use INTEGRAL_TYPE_P, and not ANY_INTEGRAL_TYPE_P, because the switch handles VECTOR_CST and COMPLEX_CST.

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

2014-12-15  Marek Polacek  pola...@redhat.com

	PR middle-end/64292
	* fold-const.c (negate_expr_p): Add INTEGRAL_TYPE_P check.

diff --git gcc/fold-const.c gcc/fold-const.c
index d71fa94..07da71a 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -400,7 +400,7 @@ negate_expr_p (tree t)
   switch (TREE_CODE (t))
     {
     case INTEGER_CST:
-      if (TYPE_OVERFLOW_WRAPS (type))
+      if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type))
         return true;
 
       /* Check that -CST will not overflow type.  */

	Marek
[PATCH 3/3] RTEMS: Add e6500 multilibs for PowerPC
Use 32-bit instructions only since currently there is no demand for a larger address space. Provide one multilib with FPU and AltiVec support and one without. This patch should be applied to GCC 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/rs6000/t-rtems: Add e6500 multilibs. --- gcc/config/rs6000/t-rtems | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems index e935947..128ea3e 100644 --- a/gcc/config/rs6000/t-rtems +++ b/gcc/config/rs6000/t-rtems @@ -24,14 +24,17 @@ MULTILIB_MATCHES = MULTILIB_EXCEPTIONS = MULTILIB_REQUIRED = -MULTILIB_OPTIONS += mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540 -MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 +MULTILIB_OPTIONS += mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540/mcpu=e6500 +MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 me6500 MULTILIB_OPTIONS += msoft-float/mfloat-gprs=double MULTILIB_DIRNAMES += nof gprsdouble -MULTILIB_OPTIONS += mno-spe -MULTILIB_DIRNAMES += nospe +MULTILIB_OPTIONS += mno-spe/mno-altivec +MULTILIB_DIRNAMES += nospe noaltivec + +MULTILIB_OPTIONS += m32 +MULTILIB_DIRNAMES += m32 MULTILIB_MATCHES += ${MULTILIB_MATCHES_ENDIAN} MULTILIB_MATCHES += ${MULTILIB_MATCHES_SYSV} @@ -72,3 +75,5 @@ MULTILIB_REQUIRED += mcpu=8540 MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double MULTILIB_REQUIRED += mcpu=860 +MULTILIB_REQUIRED += mcpu=e6500/m32 +MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec -- 1.8.4.5
[PATCH 2/3] RTEMS: Fix MPC8540 multilibs for PowerPC
GCC generates SPE instructions even if -msoft-float is specified. Explicitly add -mno-spe to prevent generation of SPE instructions. This multilib variant must not lead to a usage of the SPE. This patch should be applied to GCC 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/rs6000/t-rtems: Add -mno-spe to soft-float multilib for MPC8540. --- gcc/config/rs6000/t-rtems | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems index 3ebcfaa..e935947 100644 --- a/gcc/config/rs6000/t-rtems +++ b/gcc/config/rs6000/t-rtems @@ -30,6 +30,9 @@ MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 MULTILIB_OPTIONS += msoft-float/mfloat-gprs=double MULTILIB_DIRNAMES += nof gprsdouble +MULTILIB_OPTIONS += mno-spe +MULTILIB_DIRNAMES += nospe + MULTILIB_MATCHES += ${MULTILIB_MATCHES_ENDIAN} MULTILIB_MATCHES += ${MULTILIB_MATCHES_SYSV} # Map 405 to 403 @@ -66,6 +69,6 @@ MULTILIB_REQUIRED += mcpu=604/msoft-float MULTILIB_REQUIRED += mcpu=7400 MULTILIB_REQUIRED += mcpu=7400/msoft-float MULTILIB_REQUIRED += mcpu=8540 -MULTILIB_REQUIRED += mcpu=8540/msoft-float +MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double MULTILIB_REQUIRED += mcpu=860 -- 1.8.4.5
[PATCH 1/3] RTEMS: Use MULTILIB_REQUIRED for PowerPC
This patch should be applied to GCC 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/rs6000/t-rtems: Use MULTILIB_REQUIRED instead of MULTILIB_EXCEPTIONS. --- gcc/config/rs6000/t-rtems | 65 +-- 1 file changed, 24 insertions(+), 41 deletions(-) diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems index 426f75a..3ebcfaa 100644 --- a/gcc/config/rs6000/t-rtems +++ b/gcc/config/rs6000/t-rtems @@ -18,16 +18,18 @@ # along with GCC; see the file COPYING3. If not see # http://www.gnu.org/licenses/. -MULTILIB_OPTIONS = \ -mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540 \ -msoft-float/mfloat-gprs=double +MULTILIB_OPTIONS = +MULTILIB_DIRNAMES = +MULTILIB_MATCHES = +MULTILIB_EXCEPTIONS = +MULTILIB_REQUIRED = + +MULTILIB_OPTIONS += mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540 +MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 -MULTILIB_DIRNAMES = \ -m403 m505 m603e m604 m860 m7400 m8540 \ -nof gprsdouble +MULTILIB_OPTIONS += msoft-float/mfloat-gprs=double +MULTILIB_DIRNAMES += nof gprsdouble -# MULTILIB_MATCHES = ${MULTILIB_MATCHES_FLOAT} -MULTILIB_MATCHES = MULTILIB_MATCHES += ${MULTILIB_MATCHES_ENDIAN} MULTILIB_MATCHES += ${MULTILIB_MATCHES_SYSV} # Map 405 to 403 @@ -52,37 +54,18 @@ MULTILIB_MATCHES+= mcpu?8540=mcpu?8548 # (mfloat-gprs=single is implicit default) MULTILIB_MATCHES += mcpu?8540=mcpu?8540/mfloat-gprs?single -# Soft-float only, default implies msoft-float -# NOTE: Must match with MULTILIB_MATCHES_FLOAT and MULTILIB_MATCHES -MULTILIB_SOFTFLOAT_ONLY = \ -*mcpu=401/*msoft-float* \ -*mcpu=403/*msoft-float* \ -*mcpu=405/*msoft-float* \ -*mcpu=801/*msoft-float* \ -*mcpu=821/*msoft-float* \ -*mcpu=823/*msoft-float* \ -*mcpu=860/*msoft-float* - -# Hard-float only, take out msoft-float -MULTILIB_HARDFLOAT_ONLY = \ -*mcpu=505/*msoft-float* - -# Targets which do not support gprs 
-MULTILIB_NOGPRS = \ -mfloat-gprs=* \ -*mcpu=403/*mfloat-gprs=* \ -*mcpu=505/*mfloat-gprs=* \ -*mcpu=603e/*mfloat-gprs=* \ -*mcpu=604/*mfloat-gprs=* \ -*mcpu=860/*mfloat-gprs=* \ -*mcpu=7400/*mfloat-gprs=* - -MULTILIB_EXCEPTIONS = - -# Disallow -Dppc and -Dmpc without other options -MULTILIB_EXCEPTIONS+= Dppc* Dmpc* +# Enumeration of multilibs -MULTILIB_EXCEPTIONS+= \ -${MULTILIB_SOFTFLOAT_ONLY} \ -${MULTILIB_HARDFLOAT_ONLY} \ -${MULTILIB_NOGPRS} +MULTILIB_REQUIRED += msoft-float +MULTILIB_REQUIRED += mcpu=403 +MULTILIB_REQUIRED += mcpu=505 +MULTILIB_REQUIRED += mcpu=603e +MULTILIB_REQUIRED += mcpu=603e/msoft-float +MULTILIB_REQUIRED += mcpu=604 +MULTILIB_REQUIRED += mcpu=604/msoft-float +MULTILIB_REQUIRED += mcpu=7400 +MULTILIB_REQUIRED += mcpu=7400/msoft-float +MULTILIB_REQUIRED += mcpu=8540 +MULTILIB_REQUIRED += mcpu=8540/msoft-float +MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double +MULTILIB_REQUIRED += mcpu=860 -- 1.8.4.5
Re: [PATCH] More TYPE_OVERFLOW_* fallout (PR middle-end/64292)
On Mon, 15 Dec 2014, Marek Polacek wrote:

m68k revealed another missing check before TYPE_OVERFLOW_WRAPS. I think we should use INTEGRAL_TYPE_P, and not ANY_INTEGRAL_TYPE_P because the switch handles VECTOR_CST and COMPLEX_CST.

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

Ok.

Thanks,
Richard.

2014-12-15  Marek Polacek  pola...@redhat.com

	PR middle-end/64292
	* fold-const.c (negate_expr_p): Add INTEGRAL_TYPE_P check.

diff --git gcc/fold-const.c gcc/fold-const.c
index d71fa94..07da71a 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -400,7 +400,7 @@ negate_expr_p (tree t)
   switch (TREE_CODE (t))
     {
     case INTEGER_CST:
-      if (TYPE_OVERFLOW_WRAPS (type))
+      if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type))
         return true;
 
       /* Check that -CST will not overflow type.  */

	Marek

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic
On Mon, Nov 17, 2014 at 05:35:23PM +, Kyrill Tkachov wrote: Hi all, This patch implements the vsqrt_f64 intrinsic in arm_neon.h. There's not much to it, we can reuse __builtin_sqrt. It's a fairly straightforward and self-contained patch, do we still want it at this stage? A new execute test is added. Tested aarch64-none-elf. Thanks, Kyrill 2014-11-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/arm_neon.h (vsqrt_f64): New intrinsic. 2014-11-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * gcc.target/aarch64/simd/vsqrt_f64_1.c commit d9e42debe2655287eef7b8c3ecf29bbdd11e6425 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Mon Nov 17 15:02:01 2014 + [AArch64] Implement vsqrt_f64 intrinsic diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index b3b80b8..c58213a 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -22792,6 +22792,12 @@ vsqrtq_f32 (float32x4_t a) return __builtin_aarch64_sqrtv4sf (a); } +__extension__ static __inline float64x1_t __attribute__ ((__always_inline__)) +vsqrt_f64 (float64x1_t a) +{ + return (float64x1_t) { __builtin_sqrt (a[0]) }; +} Hi Kyrill, Does this introduce an implicit need to link against a maths library if we want arm_neon.h to work correctly? If so, I think we need to take a different approach. At O0 I've started to see: undefined reference to `sqrt' When checking a large arm_neon.h testcase. It does seem strange that the mid-end would convert a __builtin_sqrt back to a library call at O0 when the target has an optab for it, so perhaps there is a bug there to go hunt? Thanks, James
[PATCH v2] RTEMS: Add e6500 multilibs for PowerPC
Use 32-bit instructions only since currently there is no demand for a larger address space. Provide one multilib with FPU and AltiVec support and one without. This patch should be applied to GCC 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. v2: Move MULTILIB_OPTIONS += m32 to right position in the file. gcc/ChangeLog 2014-12-15 Sebastian Huber sebastian.hu...@embedded-brains.de * config/rs6000/t-rtems: Add e6500 multilibs. --- gcc/config/rs6000/t-rtems | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems index e935947..eadda0d 100644 --- a/gcc/config/rs6000/t-rtems +++ b/gcc/config/rs6000/t-rtems @@ -24,14 +24,17 @@ MULTILIB_MATCHES = MULTILIB_EXCEPTIONS = MULTILIB_REQUIRED = -MULTILIB_OPTIONS += mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540 -MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 +MULTILIB_OPTIONS += mcpu=403/mcpu=505/mcpu=603e/mcpu=604/mcpu=860/mcpu=7400/mcpu=8540/mcpu=e6500 +MULTILIB_DIRNAMES += m403 m505 m603e m604 m860 m7400 m8540 me6500 + +MULTILIB_OPTIONS += m32 +MULTILIB_DIRNAMES += m32 MULTILIB_OPTIONS += msoft-float/mfloat-gprs=double MULTILIB_DIRNAMES += nof gprsdouble -MULTILIB_OPTIONS += mno-spe -MULTILIB_DIRNAMES += nospe +MULTILIB_OPTIONS += mno-spe/mno-altivec +MULTILIB_DIRNAMES += nospe noaltivec MULTILIB_MATCHES += ${MULTILIB_MATCHES_ENDIAN} MULTILIB_MATCHES += ${MULTILIB_MATCHES_SYSV} @@ -72,3 +75,5 @@ MULTILIB_REQUIRED += mcpu=8540 MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double MULTILIB_REQUIRED += mcpu=860 +MULTILIB_REQUIRED += mcpu=e6500/m32 +MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec -- 1.8.4.5
[PATCH] Fix PR64246
The following robustifies mark_loop_for_removal against multiple invocations on the same loop. Bootstrapped and tested on x86_64-unknown-linux-ngu, applied. Richard. 2014-12-15 Richard Biener rguent...@suse.de PR middle-end/64246 * cfgloop.c (mark_loop_for_removal): Make safe against multiple invocations on the same loop. * gnat.dg/opt46.adb: New testcase. * gnat.dg/opt46.ads: Likewise. * gnat.dg/opt46_pkg.adb: Likewise. * gnat.dg/opt46_pkg.ads: Likewise. Index: gcc/cfgloop.c === --- gcc/cfgloop.c (revision 218733) +++ gcc/cfgloop.c (working copy) @@ -1928,9 +1928,10 @@ bb_loop_depth (const_basic_block bb) void mark_loop_for_removal (loop_p loop) { + if (loop-header == NULL) +return; loop-former_header = loop-header; loop-header = NULL; loop-latch = NULL; loops_state_set (LOOPS_NEED_FIXUP); } - Index: gcc/testsuite/gnat.dg/opt46.adb === --- gcc/testsuite/gnat.dg/opt46.adb (revision 0) +++ gcc/testsuite/gnat.dg/opt46.adb (revision 0) @@ -0,0 +1,45 @@ +-- { dg-do compile } +-- { dg-options -O } + +with Ada.Unchecked_Deallocation; + +with Opt46_Pkg; + +package body Opt46 is + + type Pattern is abstract tagged null record; + + type Pattern_Access is access Pattern'Class; + + procedure Free is new Ada.Unchecked_Deallocation + (Pattern'Class, Pattern_Access); + + type Action is abstract tagged null record; + + type Action_Access is access Action'Class; + + procedure Free is new Ada.Unchecked_Deallocation + (Action'Class, Action_Access); + + type Pattern_Action is record + Pattern : Pattern_Access; + Action : Action_Access; + end record; + + package Pattern_Action_Table is new Opt46_Pkg (Pattern_Action, Natural, 1); + + type Session_Data is record + Filters : Pattern_Action_Table.Instance; + end record; + + procedure Close (Session : Session_Type) is + Filters : Pattern_Action_Table.Instance renames Session.Data.Filters; + begin + for F in 1 .. 
Pattern_Action_Table.Last (Filters) loop + Free (Filters.Table (F).Pattern); + Free (Filters.Table (F).Action); + end loop; + + end Close; + +end Opt46; Index: gcc/testsuite/gnat.dg/opt46.ads === --- gcc/testsuite/gnat.dg/opt46.ads (revision 0) +++ gcc/testsuite/gnat.dg/opt46.ads (revision 0) @@ -0,0 +1,16 @@ +package Opt46 is + + type Session_Type is limited private; + + procedure Close (Session : Session_Type); + +private + + type Session_Data; + type Session_Data_Access is access Session_Data; + + type Session_Type is record + Data : Session_Data_Access; + end record; + +end Opt46; Index: gcc/testsuite/gnat.dg/opt46_pkg.adb === --- gcc/testsuite/gnat.dg/opt46_pkg.adb (revision 0) +++ gcc/testsuite/gnat.dg/opt46_pkg.adb (revision 0) @@ -0,0 +1,8 @@ +package body Opt46_Pkg is + + function Last (T : Instance) return Table_Index_Type is + begin + return Table_Index_Type (T.P.Last_Val); + end Last; + +end Opt46_Pkg; Index: gcc/testsuite/gnat.dg/opt46_pkg.ads === --- gcc/testsuite/gnat.dg/opt46_pkg.ads (revision 0) +++ gcc/testsuite/gnat.dg/opt46_pkg.ads (revision 0) @@ -0,0 +1,31 @@ +generic + type Table_Component_Type is private; + type Table_Index_Type is range ; + + Table_Low_Bound : Table_Index_Type; + +package Opt46_Pkg is + + type Table_Type is + array (Table_Index_Type range ) of Table_Component_Type; + subtype Big_Table_Type is + Table_Type (Table_Low_Bound .. Table_Index_Type'Last); + + type Table_Ptr is access all Big_Table_Type; + + type Table_Private is private; + + type Instance is record + Table : aliased Table_Ptr := null; + P : Table_Private; + end record; + + function Last (T : Instance) return Table_Index_Type; + +private + + type Table_Private is record + Last_Val : Integer; + end record; + +end Opt46_Pkg;
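The effect of the guard added to mark_loop_for_removal in the cfgloop.c hunk above can be sketched outside of GCC. The following is only a toy model: struct toy_loop, toy_needs_fixup and toy_mark_loop_for_removal are hypothetical stand-ins, not GCC's own types or functions. It shows why the early return matters: without it, a second call would overwrite former_header with the already-cleared header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for GCC's struct loop.  */
struct toy_loop
{
  void *header;         /* NULL once the loop is marked for removal */
  void *former_header;  /* remembers the header the loop used to have */
  void *latch;
};

/* Stand-in for loops_state_set (LOOPS_NEED_FIXUP).  */
int toy_needs_fixup;

/* Mark LOOP for removal.  Calling this twice on the same loop is a
   no-op the second time, so former_header keeps its original value.  */
void
toy_mark_loop_for_removal (struct toy_loop *loop)
{
  assert (loop != NULL);
  if (loop->header == NULL)
    return;
  loop->former_header = loop->header;
  loop->header = NULL;
  loop->latch = NULL;
  toy_needs_fixup = 1;
}
```

Calling toy_mark_loop_for_removal twice on the same object leaves former_header pointing at the original header; dropping the early return would reset it to NULL on the second call.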
Re: [match-and-simplify] print capture name
On Mon, 15 Dec 2014, Prathamesh Kulkarni wrote: On 15 December 2014 at 02:03, Prathamesh Kulkarni prathamesh.kulka...@linaro.org wrote: Print name of capture in patterns written by user. * genmatch.c (capture::name): New. (capture::capture): New default argument. (parse_capture): Pass id to capture::capture. (print_operand): Print name of capture if available. oops, forgot to attach the patch. Applied - thanks, Richard. Thanks, Prathamesh -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [match-and-simplify] set simplify::capture_max to 0 if pattern contains no captures
On Mon, 15 Dec 2014, Prathamesh Kulkarni wrote: Caused segfault for pattern containing no captures at: info.safe_grow_cleared(capture_max + 1); in capture_info::capture_info artificial test-case: (define_predicates integer_zerop) (simplify (bit_not integer_zerop) { build_zero_cst (type); }) * genmatch.c (simplify::simplify): Set simplify::capture_max to 0 if pattern contains no captures. Hmm, I think I've seen this before and I think that vec should be more robust. In fact vec<int> v = vNULL; v.quick_grow (0); will segfault as called via vec<T, va_heap, vl_ptr>::safe_grow (unsigned len MEM_STAT_DECL) { unsigned oldlen = length (); gcc_checking_assert (oldlen <= len); reserve_exact (len - oldlen PASS_MEM_STAT); m_vec->quick_grow (len); in fact it is documented here: /* Allocator for heap memory. Ensure there are at least RESERVE free slots in V. If EXACT is true, grow exactly, else grow exponentially. As a special case, if the vector had not been allocated and and RESERVE is 0, no vector will be created. */ template<typename T> inline void va_heap::reserve (vec<T, va_heap, vl_embed> *&v, unsigned reserve, bool exact MEM_STAT_DECL) { if we don't want to do that we should simply guard the safe_grow_cleared call, not artificially increase the number of elements. But I think the following is the best and also matches vec::truncate. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-12-15 Richard Biener rguent...@suse.de * vec.h (vec::safe_grow): Guard against a grow to zero size. Index: gcc/vec.h === --- gcc/vec.h (revision 218746) +++ gcc/vec.h (working copy) @@ -1574,7 +1574,10 @@ vec<T, va_heap, vl_ptr>::safe_grow (unsi unsigned oldlen = length (); gcc_checking_assert (oldlen <= len); reserve_exact (len - oldlen PASS_MEM_STAT); - m_vec->quick_grow (len); + if (m_vec) +m_vec->quick_grow (len); + else +gcc_checking_assert (len == 0); }
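To make the failure mode concrete, here is a minimal C sketch of the same idea, written against a made-up vector type rather than GCC's vec.h. The names toy_storage, toy_vec, toy_reserve_exact and toy_safe_grow are assumptions for illustration only. As in the quoted comment, reserving zero slots on a never-allocated vector creates no storage, so the grow step has to tolerate a NULL storage pointer when the requested length is 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical embedded storage: a header followed by the elements.  */
struct toy_storage
{
  unsigned num;    /* elements in use */
  unsigned alloc;  /* elements allocated */
  int elts[];      /* flexible array member */
};

/* Hypothetical handle; m_vec stays NULL until something is reserved.  */
struct toy_vec
{
  struct toy_storage *m_vec;
};

static unsigned
toy_length (const struct toy_vec *v)
{
  return v->m_vec ? v->m_vec->num : 0;
}

/* Ensure room for EXTRA more elements.  Special case: if nothing has
   been allocated yet and EXTRA is 0, no storage is created.  */
static void
toy_reserve_exact (struct toy_vec *v, unsigned extra)
{
  unsigned num = toy_length (v);
  unsigned have = v->m_vec ? v->m_vec->alloc : 0;
  if (num + extra <= have)
    return;
  struct toy_storage *s
    = realloc (v->m_vec, sizeof *s + (num + extra) * sizeof (int));
  if (s == NULL)
    abort ();
  s->num = num;
  s->alloc = num + extra;
  v->m_vec = s;
}

/* Grow V to LEN elements (new elements are left uninitialized).  The
   NULL check mirrors the guard in the patch: growing an unallocated
   vector to length 0 must not dereference m_vec.  */
void
toy_safe_grow (struct toy_vec *v, unsigned len)
{
  unsigned oldlen = toy_length (v);
  assert (oldlen <= len);
  toy_reserve_exact (v, len - oldlen);
  if (v->m_vec)
    v->m_vec->num = len;
  else
    assert (len == 0);
}
```

Without the `if (v->m_vec)` check, the call toy_safe_grow (&v, 0) on a fresh vector would write through a NULL pointer, which is the segfault described in the thread.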
[PATCH] Fix PR64295
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-12-15 Richard Biener rguent...@suse.de PR middle-end/64295 * match.pd (X / CST -> X * (1 / CST): Use const_binop instead of fold_binary to compute the constant to multiply with. * gcc.dg/pr64295.c: New testcase. Index: gcc/match.pd === --- gcc/match.pd(revision 218668) +++ gcc/match.pd(working copy) @@ -186,7 +186,7 @@ (define_operator_list inverted_tcc_compa (if (flag_reciprocal_math && !real_zerop (@1)) (with - { tree tem = fold_binary (RDIV_EXPR, type, build_one_cst (type), @1); } + { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @1); } (if (tem) (mult @0 { tem; } (if (cst != COMPLEX_CST) Index: gcc/testsuite/gcc.dg/pr64295.c === --- gcc/testsuite/gcc.dg/pr64295.c (revision 0) +++ gcc/testsuite/gcc.dg/pr64295.c (working copy) @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-O -frounding-math -funsafe-math-optimizations" } */ + +double +f (double g) +{ + return g / 3; +}
[Patch, Fortran] PR 63727: Checks missing for proc-pointer components: Usage as actual argument when elemental
Hi all, here is another small diagnostic enhancement for procedure pointer components. Regtested on x86_64-unknown-linux-gnu. Ok for trunk? (I'm not adding a dedicated test case, since coarray_collectives_14 already includes this case as a FIXME, which I now transformed into a dg-error.) Cheers, Janus 2014-12-15 Janus Weil ja...@gcc.gnu.org PR fortran/63727 * resolve.c (resolve_actual_arglist): Check for elemental procedure pointer components. 2014-12-15 Janus Weil ja...@gcc.gnu.org PR fortran/63727 * gfortran.dg/coarray_collectives_14.f90: Address FIXME item. Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (Revision 218721) +++ gcc/fortran/resolve.c (Arbeitskopie) @@ -1740,6 +1740,7 @@ resolve_actual_arglist (gfc_actual_arglist *arg, p gfc_symbol *sym; gfc_symtree *parent_st; gfc_expr *e; + gfc_component *comp; int save_need_full_assumed_size; bool return_value = false; bool actual_arg_sav = actual_arg, first_actual_arg_sav = first_actual_arg; @@ -1967,6 +1968,14 @@ resolve_actual_arglist (gfc_actual_arglist *arg, p } } + comp = gfc_get_proc_ptr_comp(e); + if (comp comp-attr.elemental) + { + gfc_error (ELEMENTAL procedure pointer component %qs is not + allowed as an actual argument at %L, comp-name, + e-where); + } + /* Fortran 2008, C1237. */ if (e-expr_type == EXPR_VARIABLE gfc_is_coindexed (e) gfc_has_ultimate_pointer (e)) Index: gcc/testsuite/gfortran.dg/coarray_collectives_14.f90 === --- gcc/testsuite/gfortran.dg/coarray_collectives_14.f90(Revision 218721) +++ gcc/testsuite/gfortran.dg/coarray_collectives_14.f90(Arbeitskopie) @@ -62,7 +62,7 @@ program test call co_reduce(caf, arg3) ! { dg-error shall have two arguments } call co_reduce(caf, dt%arg3) ! { dg-error shall have two arguments } call co_reduce(caf, elem) ! { dg-error ELEMENTAL non-INTRINSIC procedure 'elem' is not allowed as an actual argument } - call co_reduce(caf, dt%elem) ! 
{ FIXME: ELEMENTAL non-INTRINSIC procedure 'elem' is not allowed as an actual argument } + call co_reduce(caf, dt%elem) ! { dg-error ELEMENTAL procedure pointer component 'elem' is not allowed as an actual argument } call co_reduce(caf, realo) ! { dg-error A argument at .1. has type INTEGER.4. but the function passed as OPERATOR at .2. returns REAL.4. } call co_reduce(caf, dt%realo) ! { dg-error A argument at .1. has type INTEGER.4. but the function passed as OPERATOR at .2. returns REAL.4. } call co_reduce(caf, int8) ! { dg-error A argument at .1. has type INTEGER.4. but the function passed as OPERATOR at .2. returns INTEGER.8. }
Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.
On 15 December 2014 at 10:56, David Sherwood david.sherw...@arm.com wrote: -Original Message- From: Christophe Lyon [mailto:christophe.l...@linaro.org] Sent: 11 December 2014 13:47 To: David Sherwood Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; Tejas Belagod; Richard Sandiford Subject: Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. On 11 December 2014 at 11:16, David Sherwood david.sherw...@arm.com wrote: Hi Christophe, Sorry to bother you again. After my clarification email below are you now happy for these patches to go in? Kind Regards, David Sherwood. -Original Message- From: David Sherwood [mailto:david.sherw...@arm.com] Sent: 27 November 2014 14:53 To: 'Christophe Lyon' Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; 'Tejas Belagod'; Richard Sandiford Subject: RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. On 18 November 2014 10:14, David Sherwood david.sherw...@arm.com wrote: Hi Christophe, Ah sorry. My mistake - it fixes this in bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810 I did look at that PR, but since it has no testcase attached, I was unsure. And I am still not :-) PR 59810 is [AArch64] LDn/STn implementations are not ABI-conformant for bigendian. but the advsimd-intrinsics/vldX.c and vldX_lane.c now PASS with Alan's patches on aarch64_be, so I thought Alan's patches solve PR59810. What am I missing? Hi Christophe, I think probably this is our fault for making our lives way too difficult and artificially splitting all these patches up. :) Alan's patch: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html fixes some issues on aarch64_be, but also causes regressions. 
For example, Tests that now fail, but worked before: aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects execution test aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c execution test aarch64_be-elf-aem: gcc.dg/vect/vect-over-widen-1-big-array.c -flto -ffat-lto-objects execution test ... Tests that now work, but didn't before: aarch64_be-elf-aem: gcc.dg/vect/fast-math-vect-complex-3.c execution test aarch64_be-elf-aem: gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c execution test aarch64_be-elf-aem: gcc.dg/vect/no-scevccp-outer-10a.c execution test ... I didn't notice that because I tested Alan's patch only against the advsimd-intrinsics tests. In this respect, I don't understand why your ChangeLog entry says * config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i, vec_load_lanes(o/c/x)i): Fixed to work for Big Endian. since the existing advsimd-intrinsics tests already pass with Alan's patch alone or is vld1_lane still broken (for which I haven't posted a test yet)? Yes, I think the change log is unclear and I will change it. The only thing that was broken was not adhering to the ABI, but we don't have any specific regression tests that prove this. OK thanks for the clarification. His patch is only half of the story and must be applied at the same time as the [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe. patch. With both patches applied the result looks much healthier: # Comparing 1 common sum files ## /bin/sh ./src/gcc/contrib/compare_tests /tmp/gxx-sum1.10051 /tmp/gxx-sum2.10051 Tests that now work, but didn't before: aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer execution test aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-all-loops -finline- functions execution test aarch64_be-elf-aem: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops execution test ... with no new regressions. 
After applying both patches the aarch64_be gcc testsuite is on a parity with the aarch64 testsuite. Furthermore, after applying both of these patches: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe [AArch64] [BE] Fix vector load/stores to not use ld1/st1 it then becomes safe for us to remove the CCMC macro, which is the cause of unnecessary spills to the stack for certain auto-vectorised code. So really I suppose when I posted my second patch [AArch64] [BE] [2/2] Make large opaque integer modes endianness-safe I should have really just called this [AArch64] [BE] Remove CCMC for aarch64 in order to make it clear exactly what the purpose of these patches is. well, not yet since this very does not remove it :-) Again, this is my fault as I made a mistake in the change log. If you look at the actual patch the CCMC macro is removed. Let me re-post corrected, more sensible change logs for both of those changes here: [AArch64] [BE] [1/2] Make large opaque
Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484
On 13 December 2014 at 05:06, Yangfei (Felix) felix.y...@huawei.com wrote: Thanks for reviewing the patch. See my comments inlined: This patch fix this two issues. Three changes: 1. vfma_f32, vfmaq_f32, vfms_f32, vfmsq_f32 are only available for arm*-*-* target with the FMA feature, we take care of this through the macro __ARM_FEATURE_FMA. 2. vfma_n_f32 and vfmaq_n_f32 are only available for aarch64 target, we take care of this through the macro __aarch64__. 3. vfmaq_f64, vfmaq_n_f64 and vfmsq_f64 are only available for aarch64 target, we just exclude test for them to keep the testcases clean. (Note: They also pass on aarch64 aarch64_be target and we can add test for them if needed). I would prefer to have all the available variants tested. OK, the v2 patch attached have all the available variants added. +#ifdef __aarch64__ /* Expected results. */ VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d }; VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 }; -VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x408906e1532b8520, 0x40890ee1532b8520 }; Why do you remove this one? We need to make some changes to the header files for this test. Initially, I don't want to touch the header files, so I reduced this testcase to a minimal one. int main (void) { +#ifdef __ARM_FEATURE_FMA exec_vfms (); +#endif return 0; } In the other tests, I try to put as much code in common as possible, between the 'a' and 's' variants (e.g. vmla/vmls). Maybe you can do that as a follow-up? Yes, I think we can handle this with a follow-on patch. The v2 patch is tested on armeb-linux-gnueabi, arm-linux-gnueabi, aarch64-linux-gnu and aarch64_be-linux-gnu. How about this one? Thanks. It looks better, thanks. Minor comment below. 
Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h (working copy) @@ -142,6 +142,10 @@ VECT_VAR_DECL_INIT(buffer, poly, 16, 8); PAD(buffer_pad, poly, 16, 8); VECT_VAR_DECL_INIT(buffer, float, 32, 4); PAD(buffer_pad, float, 32, 4); +#ifdef __aarch64__ +VECT_VAR_DECL_INIT(buffer, float, 64, 2); +PAD(buffer_pad, float, 64, 2); +#endif /* The tests for vld1_dup and vdup expect at least 4 entries in the input buffer, so force 1- and 2-elements initializers to have 4 Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (working copy) @@ -2,6 +2,7 @@ #include arm-neon-ref.h #include compute-ref-data.h +#if defined(__aarch64__) defined(__ARM_FEATURE_FMA) /* Expected results. */ VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d }; VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 }; @@ -9,28 +10,29 @@ VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x40890 #define VECT_VAR_ASSIGN(S,Q,T1,W) S##Q##_##T1##W #define ASSIGN(S, Q, T, W, V) T##W##_t S##Q##_##T##W = V -#define TEST_MSG VFMA/VFMAQ +#define TEST_MSG VFMA_N/VFMAQ_N + void exec_vfma_n (void) { /* Basic test: v4=vfma_n(v1,v2), then store the result. 
*/ -#define TEST_VFMA(Q, T1, T2, W, N) \ +#define TEST_VFMA_N(Q, T1, T2, W, N) \ VECT_VAR(vector_res, T1, W, N) = \ vfma##Q##_n_##T2##W(VECT_VAR(vector1, T1, W, N), \ - VECT_VAR(vector2, T1, W, N), \ - VECT_VAR_ASSIGN(Scalar, Q, T1, W)); \ + VECT_VAR(vector2, T1, W, N),\ + VECT_VAR_ASSIGN(scalar, Q, T1, W)); \ vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N)) -#define CHECK_VFMA_RESULTS(test_name,comment) \ +#define CHECK_VFMA_N_RESULTS(test_name,comment) \ {\ CHECK_FP(test_name, float, 32, 2, PRIx32, expected, comment); \ CHECK_FP(test_name, float, 32, 4, PRIx32, expected, comment); \ - CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \ - } +CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \ + } #define DECL_VABD_VAR(VAR) \ be careful with your cut and paste. VABD should probably be VFMA_N here,
Re: [Patch, Fortran] PR 63727: Checks missing for proc-pointer components: Usage as actual argument when elemental
Janus Weil wrote: here is another small diagnostic enhancement for procedure pointer components. Regtested on x86_64-unknown-linux-gnu. Ok for trunk? Looks good to me. Thanks for the patch! Tobias 2014-12-15 Janus Weil ja...@gcc.gnu.org PR fortran/63727 * resolve.c (resolve_actual_arglist): Check for elemental procedure pointer components. 2014-12-15 Janus Weil ja...@gcc.gnu.org PR fortran/63727 * gfortran.dg/coarray_collectives_14.f90: Address FIXME item.
Re: patch to fix PR64110
Hi, After this commit, GCC build fails for ARM targets: --target=arm-none-eabi --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon /obj-arm-none-eabi/gcc1/./gcc/xgcc -B/obj-arm-none-eabi/gcc1/./gcc/ -B/tools/arm-none-eabi/bin/ -B/tools/arm-none-eabi/lib/ -isystem /tools/arm-none-eabi/include -isystem /tools/arm-none-eabi/sys-include -g -O2 -mfloat-abi=hard -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fno-inline -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -Dinhibit_libc -fno-inline -I. -I. -I../../.././gcc -I/trunk/libgcc -I/trunk/libgcc/. -I/trunk/libgcc/../gcc -I/trunk/libgcc/../include -DHAVE_CC_TLS -o _mulhelperDQ.o -MT _mulhelperDQ.o -MD -MP -MF _mulhelperDQ.dep -DL_mulhelper -DDQ_MODE -c /trunk/libgcc/fixed-bit.c -fvisibility=hidden -DHIDE_EXPORTS /trunk/libgcc/fixed-bit.c: In function '__gnu_mulhelperdq': /trunk/libgcc/fixed-bit.c:371:1: error: unable to generate reloads for: } ^ (insn 55 63 59 2 (set (reg:DI 124 [ D.7630 ]) (mult:DI (zero_extend:DI (subreg:SI (reg:DI 112 [ D.7628 ]) 4)) (zero_extend:DI (subreg:SI (reg:DI 111 [ D.7628 ]) 4 /trunk/libgcc/fixed-bit.c:307 54 {*umulsidi3_v6} (nil)) /trunk/libgcc/fixed-bit.c:371:1: internal compiler error: in curr_insn_transform, at lra-constraints.c:3383 0x9d4875 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /trunk/gcc/rtl-error.c:110 0x904ec4 curr_insn_transform /trunk/gcc/lra-constraints.c:3383 0x9057ce lra_constraints(bool) /trunk/gcc/lra-constraints.c:4324 0x8f5eb1 lra(_IO_FILE*) /trunk/gcc/lra.c:2277 0x8b4959 do_reload /trunk/gcc/ira.c:5391 0x8b4959 execute /trunk/gcc/ira.c:5561 Please submit a full bug report, Can you have a look? Thanks, Christophe. 
On 12 December 2014 at 21:12, Vladimir Makarov vmaka...@redhat.com wrote: The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 The patch was successfully tested and bootstrapped on x86/x86-64. Committed as rev. 218688. 2014-12-12 Vladimir Makarov vmaka...@redhat.com PR target/64110 * lra-constraints.c (process_alt_operands): Refuse alternative when reload pseudo of given class can not hold value of given mode. 2014-12-12 Vladimir Makarov vmaka...@redhat.com PR target/64110 * gcc.target/i386/pr64110.c: New.
Re: [PATCH] rs6000: Do not allow GPR0 for addic. if it is split
On Mon, Dec 15, 2014 at 6:11 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: If an addic. is split to addi+cmp (because RA didn't give it CR0), it will do the wrong thing if the input reg is GPR0 (addi X,0,N is li X,N). So don't allow such an input. Spotted visually while investigating PR64268. Tested etc.; okay for mainline? Segher 2014-12-15 Segher Boessenkool seg...@kernel.crashing.org gcc/ * gcc/config/rs6000/rs6000.md (*add<mode>3_imm_dot, *add<mode>3_imm_dot2): Change the constraint for the second alternative for operand 1 from "r" to "b". Okay. thanks, David
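The hazard described above can be modelled in a few lines of C. This is only a toy register-file model; toy_cpu, toy_addi and toy_addic are hypothetical names, and the carry and CR0 updates are left out. It illustrates that addi treats an RA field of 0 as the literal value 0, while addic reads the contents of GPR0, so replacing one with the other is only safe when the input register is not GPR0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of the 32 general purpose registers.  */
struct toy_cpu
{
  int64_t gpr[32];
};

/* addi RT,RA,SI: an RA field of 0 means the constant 0, not GPR0,
   which is why "addi X,0,N" behaves as "li X,N".  */
void
toy_addi (struct toy_cpu *cpu, int rt, int ra, int16_t si)
{
  assert (rt >= 0 && rt < 32 && ra >= 0 && ra < 32);
  int64_t base = (ra == 0) ? 0 : cpu->gpr[ra];
  cpu->gpr[rt] = base + si;
}

/* addic RT,RA,SI: always reads the register, including GPR0.
   The carry output (and CR0 for the dot form) is not modelled.  */
void
toy_addic (struct toy_cpu *cpu, int rt, int ra, int16_t si)
{
  assert (rt >= 0 && rt < 32 && ra >= 0 && ra < 32);
  cpu->gpr[rt] = cpu->gpr[ra] + si;
}
```

With GPR0 holding 100, toy_addic into another register with an immediate of 5 gives 105, while toy_addi with the same operands gives 5. For any other input register the two agree, which matches the intent of excluding GPR0 for the alternative that gets split.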
Re: [C++ patch] do not make extern inlines as needed when not optimizing
On 12/14/2014 06:14 PM, Jan Hubicka wrote: DECL_EXTERNAL (decl) && !DECL_NOT_REALLY_EXTERN (decl) This is DECL_REALLY_EXTERN. OK with that change (in both places). Jason
Re: [C++ Patch] PR 58882
OK. Jason
Re: patch to fix PR64110
On 2014-12-15 9:14 AM, Christophe Lyon wrote: Hi, After this commit, GCC build fails for ARM targets: --target=arm-none-eabi --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon full bug report, Can you have a look? Sure. Sorry for the inconvenience. LRA/reload bug fixing is a complicated thing in most cases.
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
On 04/12/14 19:32, Jiong Wang wrote: On 04/12/14 11:07, Richard Biener wrote: On Thu, Dec 4, 2014 at 12:07 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Dec 4, 2014 at 12:00 PM, Jiong Wang jiong.w...@arm.com wrote: which means re-associate the constant imm with the virtual frame pointer. transform RA - fixed_reg + RC RD - MEM (RA + const_offset) into: RA - fixed_reg + const_offset RD - MEM (RA + RC) then RA - fixed_reg + const_offset is actually loop invariant, so the later RTL GCSE PRE pass could catch it and do the hoisting, and thus ameliorate what tree level ivopts could not sort out. There is a LIM pass after gimple ivopts - if the invariantness is already visible there why not handle it there similar to the special-cases in rewrite_bittest and rewrite_reciprocal? maybe, needs further check. And of course similar tricks could be applied on the RTL level to RTL invariant motion? Thanks. I just checked the code, yes, loop invariant motion pass is the natural place to integrate such multi-insns invariant analysis trick. those code could be integrated into loop-invariant.c cleanly, but then I found although the invariant detected successfully but it's not moved out of loop because of cost issue, and looks like the patch committed to fix PR33928 is trying to prevent such cheap address be hoisted to reduce register pressure. 805 /* ??? Try to determine cheapness of address computation. Unfortunately 806 the address cost is only a relative measure, we can't really compare 807 it with any absolute number, but only with other address costs. 808 But here we don't have any other addresses, so compare with a magic 809 number anyway. It has to be large enough to not regress PR33928 810 (by avoiding to move reg+8,reg+16,reg+24 invariants), but small 811 enough to not regress 410.bwaves either (by still moving reg+reg 812 invariants). 813 See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html . 
*/ 814 inv-cheap_address = address_cost (SET_SRC (set), word_mode, 815 ADDR_SPACE_GENERIC, speed) 3; I think that maybe necessary for x86 which is short of register, while for RISC, it may not be that necessary, especially the whole register pressure is not big. currently, what I could think of is for this transformation below, we should increase the costs: A == RA - virtual_stack_var + RC RD - MEM (RA + const_offset) into: B == RA - virtual_stack_var + const_offset --B RD - MEM (RA + RC) because the cost is not that cheap, if there is not re-assocation of virtual_stack_var with const_offset, then lra elimination will create another instruction to hold the elimination result, so format A will actually be RT - real_stack_pointer + elimination_offset RA - RT + RC RD - MEM (RA + const_offset) so, the re-assocation and later hoisting of invariant B could actually save two instructions in the loop, this is why there are 15% perf gap for bzip2 under some situation. updated patch. moved this instruction shuffling trick to rtl loop invariant pass. as described above, this patch tries to transform A to B, so that after the transformation: * RA - virtual_stack_var + const_offset could be hoisted out of the loop * easy the work of lra elimination as virtual_stack_var is associated with const_offset that the elimination offset could be combined with const_offset automatically. current rtl loop invariant pass treat reg - reg + off as cheap address, while although reg - virtual_stack_var + offset fall into the same format, but it's not that cheap as we could also save one lra elimination instruction. so this patch will mark reg - virtual_stack_var + offset transformed from A to be expensive, so that it could be hoisted later. after patch, pr62173 is fixed on powerpc64, while *still not on aarch64*. because there are one glitch in aarch64_legitimize_address which cause unnecessary complex instructions sequences generated when legitimize some addresses. 
and if we fix that, we will get cheaper address for those cases which is generally good, and the cheaper address will cause tree level IVOPT do more IVOPT optimization which is generally good also, but from the speck2k result, there are actually around 1% code size regression on two cases, the reason is for target support post-index address, doing IVOPT may not always be the best choice because we lost the advantage of using post-index addressing. on aarch64, for the following testcase, the ivopted version is complexer than not ivopted version. while (oargc--) *(nargv++) = *(oargv++); so, I sent the generic fix here only, as it's an independent patch, and could benefit other targets like powerpc64 although the issue on aarch64 is still not resolved
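The re-association described in this message (form A versus form B) can be sketched in plain C. This is an illustration of the arithmetic only, not of the RTL pass: the function names are hypothetical, fixed_reg stands for the frame base, rc for the loop-variant part and const_offset for the constant. Both forms compute the same address, but in form B the first addition does not depend on rc and can be hoisted out of a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Form A: RA <- fixed_reg + RC; address = RA + const_offset.
   RA changes whenever RC changes, so nothing can be hoisted.  */
uintptr_t
toy_addr_form_a (uintptr_t fixed_reg, uintptr_t rc, uintptr_t const_offset)
{
  uintptr_t ra = fixed_reg + rc;
  return ra + const_offset;
}

/* Form B: RA <- fixed_reg + const_offset; address = RA + RC.
   RA is now independent of RC, i.e. loop invariant.  */
uintptr_t
toy_addr_form_b (uintptr_t fixed_reg, uintptr_t rc, uintptr_t const_offset)
{
  uintptr_t ra = fixed_reg + const_offset;
  return ra + rc;
}

/* Loop written in form A: two additions per iteration.  */
long
toy_sum_form_a (const unsigned char *frame, size_t const_offset, size_t n)
{
  assert (frame != NULL);
  long sum = 0;
  for (size_t rc = 0; rc < n; rc++)
    {
      const unsigned char *ra = frame + rc;
      sum += ra[const_offset];
    }
  return sum;
}

/* Loop written in form B: the invariant part is computed once.  */
long
toy_sum_form_b (const unsigned char *frame, size_t const_offset, size_t n)
{
  assert (frame != NULL);
  const unsigned char *ra = frame + const_offset;  /* hoisted */
  long sum = 0;
  for (size_t rc = 0; rc < n; rc++)
    sum += ra[rc];
  return sum;
}
```

A C compiler will usually perform this hoisting itself at the source level; the sketch only shows why the transformed RTL exposes an invariant that the original form hides.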
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
On 15/12/14 15:28, Jiong Wang wrote: On 04/12/14 19:32, Jiong Wang wrote: On 04/12/14 11:07, Richard Biener wrote: On Thu, Dec 4, 2014 at 12:07 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Dec 4, 2014 at 12:00 PM, Jiong Wang jiong.w...@arm.com wrote: which means re-associate the constant imm with the virtual frame pointer. transform RA - fixed_reg + RC RD - MEM (RA + const_offset) into: RA - fixed_reg + const_offset RD - MEM (RA + RC) then RA - fixed_reg + const_offset is actually loop invariant, so the later RTL GCSE PRE pass could catch it and do the hoisting, and thus ameliorate what tree level ivopts could not sort out. There is a LIM pass after gimple ivopts - if the invariantness is already visible there why not handle it there similar to the special-cases in rewrite_bittest and rewrite_reciprocal? maybe, needs further check. And of course similar tricks could be applied on the RTL level to RTL invariant motion? Thanks. I just checked the code, yes, loop invariant motion pass is the natural place to integrate such multi-insns invariant analysis trick. those code could be integrated into loop-invariant.c cleanly, but then I found although the invariant detected successfully but it's not moved out of loop because of cost issue, and looks like the patch committed to fix PR33928 is trying to prevent such cheap address be hoisted to reduce register pressure. 805 /* ??? Try to determine cheapness of address computation. Unfortunately 806 the address cost is only a relative measure, we can't really compare 807 it with any absolute number, but only with other address costs. 808 But here we don't have any other addresses, so compare with a magic 809 number anyway. It has to be large enough to not regress PR33928 810 (by avoiding to move reg+8,reg+16,reg+24 invariants), but small 811 enough to not regress 410.bwaves either (by still moving reg+reg 812 invariants). 813 See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html . 
*/ 814 inv-cheap_address = address_cost (SET_SRC (set), word_mode, 815 ADDR_SPACE_GENERIC, speed) 3; I think that maybe necessary for x86 which is short of register, while for RISC, it may not be that necessary, especially the whole register pressure is not big. currently, what I could think of is for this transformation below, we should increase the costs: A == RA - virtual_stack_var + RC RD - MEM (RA + const_offset) into: B == RA - virtual_stack_var + const_offset --B RD - MEM (RA + RC) because the cost is not that cheap, if there is not re-assocation of virtual_stack_var with const_offset, then lra elimination will create another instruction to hold the elimination result, so format A will actually be RT - real_stack_pointer + elimination_offset RA - RT + RC RD - MEM (RA + const_offset) so, the re-assocation and later hoisting of invariant B could actually save two instructions in the loop, this is why there are 15% perf gap for bzip2 under some situation. updated patch. moved this instruction shuffling trick to rtl loop invariant pass. as described above, this patch tries to transform A to B, so that after the transformation: * RA - virtual_stack_var + const_offset could be hoisted out of the loop * easy the work of lra elimination as virtual_stack_var is associated with const_offset that the elimination offset could be combined with const_offset automatically. current rtl loop invariant pass treat reg - reg + off as cheap address, while although reg - virtual_stack_var + offset fall into the same format, but it's not that cheap as we could also save one lra elimination instruction. so this patch will mark reg - virtual_stack_var + offset transformed from A to be expensive, so that it could be hoisted later. after patch, pr62173 is fixed on powerpc64, while *still not on aarch64*. because there are one glitch in aarch64_legitimize_address which cause unnecessary complex instructions sequences generated when legitimize some addresses. 
and if we fix that, we will get cheaper address for those cases which is generally good, and the cheaper address will cause tree level IVOPT do more IVOPT optimization which is generally good also, but from the speck2k result, there are actually around 1% code size regression on two cases, the reason is for target support post-index address, doing IVOPT may not always be the best choice because we lost the advantage of using post-index addressing. on aarch64, for the following testcase, the ivopted version is complexer than not ivopted version. while (oargc--) *(nargv++) = *(oargv++); so, I sent the generic fix here only, as it's an independent patch, and could benefit other
[PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber
The discussion at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01949.html exposed another problem: an unnecessary clobber of register x19. Since x19 is a callee-saved register, this leads to unnecessary push/pop in the prologue/epilogue. The cause is the following patterns: (define_insn "tb<optab><mode>1" (define_insn "cb<optab><mode>1" They always declare (clobber (match_scratch:DI 3 "=r")), while that register is used only when get_attr_length (insn) == 8. We could clobber the CC register instead of a scratch register to avoid wasting a general purpose register. This patch does that, and gives a slight improvement on spec2k. Bootstrap OK, no regressions on aarch64 bare-metal. OK for trunk? The testcase included in the patch is for verification purposes only. It exercises the long branch situation, but because the code is very big it takes a couple of seconds to compile, so I will not commit it. gcc/ 2014-12-15 Ramana Radhakrishnan ramana.radhakrish...@arm.com Jiong Wang jiong.w...@arm.com * config/aarch64/aarch64.md (tb<optab><mode>1): Clobber CC reg instead of scratch reg. (cb<optab><mode>1): Likewise. 
* config/aarch64/iterators.md (bcond): New define_code_attr.diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 597ff8c..abf8e3f 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -466,13 +466,20 @@ (const_int 0)) (label_ref (match_operand 2 )) (pc))) - (clobber (match_scratch:DI 3 =r))] + (clobber (reg:CC CC_REGNUM))] - * - if (get_attr_length (insn) == 8) -return \ubfx\\t%w3, %w0, %1, #1\;cbz\\t%w3, %l2\; - return \tbz\\t%w0, %1, %l2\; - + { +if (get_attr_length (insn) == 8) + { + char buf[64]; + uint64_t val = ((uint64_t) 1) UINTVAL (operands[1]); + sprintf (buf, tst\t%%w0, %PRId64, val); + output_asm_insn (buf, operands); + return bcond\t%l2; + } +else + return tbz\t%w0, %1, %l2; + } [(set_attr type branch) (set (attr length) (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768)) @@ -486,13 +493,21 @@ (const_int 0)) (label_ref (match_operand 1 )) (pc))) - (clobber (match_scratch:DI 2 =r))] + (clobber (reg:CC CC_REGNUM))] - * - if (get_attr_length (insn) == 8) -return \ubfx\\t%w2, %w0, sizem1, #1\;cbz\\t%w2, %l1\; - return \tbz\\t%w0, sizem1, %l1\; - + { +if (get_attr_length (insn) == 8) + { + char buf[64]; + uint64_t val = ((uint64_t ) 1) + (GET_MODE_SIZE (MODEmode) * BITS_PER_UNIT - 1); + sprintf (buf, tst\t%%w0, %PRId64, val); + output_asm_insn (buf, operands); + return bcond\t%l1; + } +else + return tbz\t%w0, sizem1, %l1; + } [(set_attr type branch) (set (attr length) (if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768)) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 7dd3917..bd144f9 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -823,6 +823,9 @@ (smax s) (umax u) (smin s) (umin u)]) +;; Emit conditional branch instructions. +(define_code_attr bcond [(eq beq) (ne bne) (lt bne) (ge beq)]) + ;; Emit cbz/cbnz depending on comparison type. 
(define_code_attr cbz [(eq cbz) (ne cbnz) (lt cbnz) (ge cbz)]) diff --git a/gcc/testsuite/gcc.target/aarch64/long_range_bit_test_branch_1.c b/gcc/testsuite/gcc.target/aarch64/long_range_bit_test_branch_1.c new file mode 100644 index 000..d4782e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/long_range_bit_test_branch_1.c @@ -0,0 +1,166 @@ +int dec (int); + +#define CASE_ENTRY(n) \ + case n: \ +sum = a / n; \ +sum = sum * (n - 1); \ +sum = dec (sum); \ +sum = sum / (n + 1); \ +sum = dec (sum); \ +sum = sum / (n + 2); \ +sum = dec (sum); \ +sum = sum / (n + 3); \ +sum = dec (sum); \ +sum = sum / (n + 4); \ +sum = dec (sum); \ +sum = sum / (n + 5); \ +sum = dec (sum); \ +sum = sum / (n + 6); \ +sum = dec (sum); \ +sum = sum / (n + 7); \ +sum = dec (sum); \ +sum = sum / (n + 8); \ +sum = dec (sum); \ +sum = sum / (n + 9); \ +sum = dec (sum); \ +sum = sum / (n + 10); \ +sum = dec (sum); \ +sum = sum / (n + 11); \ +sum = dec (sum); \ +sum = sum / (n + 12); \ +sum = dec (sum); \ +sum = sum / (n + 13); \ +sum = dec (sum); \ +sum = sum / (n + 14); \ +sum = dec (sum); \ +sum = sum / (n + 15); \ +sum = dec (sum); \ +sum = sum / (n + 16); \ +sum = dec (sum); \ +sum = sum / (n + 17); \ +sum = dec (sum); \ +sum = sum / (n + 18); \ +sum = dec (sum); \ +sum = sum / (n + 19); \ +sum = dec (sum); \ +sum = sum / (n + 20); \ +sum = dec (sum); \ +sum = sum / (n + 21); \ +sum = dec (sum); \ +sum = sum / (n + 22); \ +sum = dec (sum); \ +sum = sum / (n + 23); \ +sum = dec (sum); \ +sum = sum / (n + 24); \ +
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On 15/12/2014 12:05, Segher Boessenkool wrote:
> If combine is given a parallel I2, it splits it into separate I1 and
> I2 insns in some cases (one case since the beginning of time; more
> cases since my r218248).  It gives the new I1 the same UID as the I2
> already has; there is a comment that this works fine because the I1
> never is added to the insn stream.
>
> When combine eventually replaces the insns with something new, it
> calls SET_INSN_DELETED on those it wants gone.  Since r125624 (when
> DF was added, back in 2007) SET_INSN_DELETED uses the UID of the insn
> it is called for to do the deletion for dataflow.
>
> So since then, when such a split I1 is deleted, DF thinks I2 is
> deleted as well.  This of course explodes if I2 is in fact needed
> (but only if that I2 still exists at the end of combine, i.e. the
> insn was not combined to further down; that might explain why it
> wasn't noticed before).
>
> This should be fixed properly, but it is somewhat involved.  This
> patch simply disallows all combinations coming from a split parallel
> I2 if it needs a new I2.
>
> This fixes PR target/64268.  Bootstrapped on powerpc64-linux and
> powerpc-linux; the fails are gone.  Regtested on powerpc64-linux
> -m32,-m32/-mpowerpc64,-m64,-m64/-mlra.
>
> Is this okay for mainline?
>
> Segher
>
> 2014-12-15  Segher Boessenkool  seg...@kernel.crashing.org
>
> gcc/
>         * combine.c (try_combine): Don't try to split newpat if we
>         started with I2 a PARALLEL that we split.
>
> ---
>  gcc/combine.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 8995c1d3..de2e49f 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -3471,6 +3471,17 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>    if (insn_code_number < 0)
>      insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
>
> +  /* If we got I1 and I2 from splitting the original (parallel) I2 into two,
> +     I1 and I2 have the same UID, which means that DF ends up deleting I2
> +     when it is asked to delete I1.  So only allow this if we want I2 deleted,
> +     that is, if we get only one insn as combine result; don't try to split
> +     off a new I2 if it doesn't match yet.  */
> +  if (i1 && insn_code_number < 0 && INSN_UID (i1) == INSN_UID (i2))
> +    {
> +      undo_all ();
> +      return 0;
> +    }
> +
>    /* If we were combining three insns and the result is a simple SET
>       with no ASM_OPERANDS that wasn't recognized, try to split it into two
>       insns.  There are two ways to do this.  It can be split using a

Random questions:

1) did you check that it never triggers on e.g. an x86 bootstrap, and that it doesn't trigger too often on PPC64?

2) if it triggers rarely, should combine just try and give a new UID to i1? What makes that hard? Or should it just skip the SET_INSN_DELETED on i1 unless it is added to the instruction stream?

That said, if (1) is true, it looks like this fix is good enough for 5 and open branches.

Thanks David for pointing this out to me.

Paolo
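The shared-UID hazard described above can be illustrated with a minimal sketch. This is not GCC code: the `struct insn`, the `df_live` table, and every function name below are hypothetical stand-ins, chosen only to show why deleting by UID also removes a second insn that reuses the same UID, and what a guard of the kind the patch adds looks like.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_UID 16

/* Hypothetical stand-in for an insn: only the UID matters here.  */
struct insn { int uid; };

/* Hypothetical per-UID dataflow record: true while the insn is live.  */
bool df_live[MAX_UID];

void df_register (const struct insn *i) { df_live[i->uid] = true; }

/* Deletion is keyed on the UID, not on the insn object itself, so any
   other insn sharing that UID is marked deleted as well.  */
void set_insn_deleted (const struct insn *i) { df_live[i->uid] = false; }

bool df_is_live (const struct insn *i) { return df_live[i->uid]; }

/* Guard modelled on the condition in the patch: refuse the combination
   when I1 exists, the result was not recognized as a single insn
   (code number < 0), and I1 shares its UID with I2.  */
bool combination_allowed (const struct insn *i1, const struct insn *i2,
                          int insn_code_number)
{
  if (i1 != NULL && insn_code_number < 0 && i1->uid == i2->uid)
    return false;
  return true;
}
```

In this model, registering an `i2` with UID 7 and then deleting an `i1` that copies that UID leaves `i2` marked dead, which is the failure mode the guard avoids by rejecting the combination up front.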
Re: [Patch, Fortran] PR 63727: Checks missing for proc-pointer components: Usage as actual argument when elemental
2014-12-15 14:49 GMT+01:00 Tobias Burnus tobias.bur...@physik.fu-berlin.de:
> Janus Weil wrote:
>> here is another small diagnostic enhancement for procedure pointer
>> components. Regtested on x86_64-unknown-linux-gnu. Ok for trunk?
>
> Looks good to me. Thanks for the patch!

Thanks for the review. Committed as r218751.

Cheers,
Janus
[PING] Enhance array types debug info. for Ada
Ping for https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00058.html.

On 12/01/2014 05:40 PM, Pierre-Marie de Rodat wrote:

While I agree this might trigger compatibility issues with old debuggers, I don't know what to do assuming this change is not acceptable: should we add a kludge in add_scalar_info in order to force unsignedness when generating debugging information for Fortran?

Here is a data point: I tried to debug gfortran.dg/array_function_2.f90, built with my patched compiler for x86_64-linux, with a GDB from an old GNAT Pro release (5.03a1, from 2005):

  (gdb) b array_function_2.f90:24
  (gdb) r
  (gdb) ptype q_in
  type = real*8 (0:-1,-6:-1)

With a recent GDB, I have instead:

  (gdb) ptype q_in
  type = real(kind=8) (0:*,-6:*)

Given that the only thing that my patches changed in the debug information for this example is the encoding of the arrays' lower bounds, everything looks fine here.

-- 
Pierre-Marie de Rodat
Re: [PING] Enhance array types debug info. for Ada
On Mon, Dec 15, 2014 at 05:21:07PM +0100, Pierre-Marie de Rodat wrote:
> Ping for https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00058.html.

Ok for trunk then.

> On 12/01/2014 05:40 PM, Pierre-Marie de Rodat wrote:
>
> While I agree this might trigger compatibility issues with old
> debuggers, I don't know what to do assuming this change is not
> acceptable: should we add a kludge in add_scalar_info in order to
> force unsignedness when generating debugging information for Fortran?
>
> Here is a data point: I tried to debug
> gfortran.dg/array_function_2.f90, built with my patched compiler for
> x86_64-linux, with a GDB from an old GNAT Pro release (5.03a1, from
> 2005):
>
>   (gdb) b array_function_2.f90:24
>   (gdb) r
>   (gdb) ptype q_in
>   type = real*8 (0:-1,-6:-1)
>
> With a recent GDB, I have instead:
>
>   (gdb) ptype q_in
>   type = real(kind=8) (0:*,-6:*)
>
> Given that the only thing that my patches changed in the debug
> information for this example is the encoding of the arrays' lower
> bounds, everything looks fine here.
>
> --
> Pierre-Marie de Rodat

Jakub
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On Mon, Dec 15, 2014 at 04:51:14PM +0100, Paolo Bonzini wrote:
> Random questions:
>
> 1) did you check that it never triggers on e.g. an x86 bootstrap, and
> that it doesn't trigger too often on PPC64?

I have checked on my largish collection of tests for the carry insns on PowerPC, and only two (related) transforms are disabled, and they aren't too important anyway. Well, and the bad transforms are disabled, only two of them but much more frequent (long long x; x--;).

I haven't checked on x86, but it's a bugfix: don't do things that blow up!

It is amazing to me that it didn't show up before. One theory is that instructions that set the condition code as well as a GPR will never combine with a later insn to two insns, always to just one. But nothing made this explicit, so AFAICS it is just an accident that it worked before.

I'll do an instrumented x86 bootstrap.

[ That testcase, -m32:

long long addSH(long long a, unsigned long b)
{
        return a + ((unsigned long long)b << 32);
}

results in

        addic 4,4,0 ; addze 3,5

while it could just be

        add 3,5
]

> 2) if it triggers rarely, should combine just try and give a new UID
> to i1?

That should in principle work, sure. Seems too dangerous for stage3 though. combine cannot create UIDs as things stand now.

> What makes that hard? Or should it just skip the SET_INSN_DELETED on
> i1 unless it is added to the instruction stream?

I have tried that first, but it blew up in other ways. I didn't look too closely, sorry.

It really is quite dangerous to allow combining just two original insns to two other insns; there are arguments why it cannot loop in this case (it always moves a parallel down, for example) but that is quite fragile, and also not documented anywhere. It also _did_ loop with some of my (perhaps broken) patch attempts. So I opted for the heavy hammer approach.

> That said, if (1) is true, this looks like this fix is good enough
> for 5 and open branches.
>
> Thanks David for pointing this out to me.
I need to remember to CC: people, sorry :-/ Thanks for looking, Segher
Re: Fix streaming of target optimization/option nodes
On Mon, 15 Dec 2014, Jan Hubicka wrote: Hi, actually this patch break fortran, I get streaming error in: lto1: internal compiler error: in streamer_get_pickled_tree apparently picking error_mark_node for variable constructor results in reading integer_type... ? Probably the default nodes are referenced by another builtin tree instead and you get inconsistent streaming between f951 and lto1. See the assert placed into record_common_node which you should extend to cover the optimization node trees. Yep, I looked into that other assert and added check for optimization nodes, it does not seem to trigger + error mark node is 0. Will try to look into this more today. Honza Richard. Honza Hi, the testcase in PR ipa/61324 fails because it is compiled with -O0 and linked with -O2. This should not matter anymore if there wasn't the following problem in streamer that makes us to merge all default nodes across units. Bootstrapped/regtested x86_64-linux, plan to commit it after more testing finishes (Firefox) Honza PR ipa/61324 * tree-streamer.c (preload_common_nodes): Do not ocnsider optimizatoin nad target_option nodes as common nodes; they depend on flags. Index: tree-streamer.c === --- tree-streamer.c (revision 218726) +++ tree-streamer.c (working copy) @@ -324,7 +324,14 @@ preload_common_nodes (struct streamer_tr /* Skip boolean type and constants, they are frontend dependent. */ if (i != TI_BOOLEAN_TYPE i != TI_BOOLEAN_FALSE - i != TI_BOOLEAN_TRUE) + i != TI_BOOLEAN_TRUE + /* Skip optimization and target option nodes; they depend on flags. */ + i != TI_OPTIMIZATION_DEFAULT + i != TI_OPTIMIZATION_CURRENT + i != TI_TARGET_OPTION_DEFAULT + i != TI_TARGET_OPTION_CURRENT + i != TI_CURRENT_TARGET_PRAGMA + i != TI_CURRENT_OPTIMIZE_PRAGMA) record_common_node (cache, global_trees[i]); } -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Go patch committed: Parse select case *v, *ok = <-c
The gofrontend parser had a bug that caused it to fail to parse a select case like *v, *ok = <-c, in which the receivers of the channel were expressions rather than identifiers. This patch from Chris Manghane fixes the problem. This is GCC PR 61253. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r fbb82148b328 go/parse.cc
--- a/go/parse.cc       Sun Dec 14 11:37:40 2014 -0800
+++ b/go/parse.cc       Mon Dec 15 09:08:30 2014 -0800
@@ -5031,6 +5031,16 @@
       e = Expression::make_receive(*channel, (*channel)->location());
     }
 
+  if (!saw_comma && this->peek_token()->is_op(OPERATOR_COMMA))
+    {
+      this->advance_token();
+      // case v, e = <-c:
+      if (!e->is_sink_expression())
+        *val = e;
+      e = this->expression(PRECEDENCE_NORMAL, true, true, NULL, NULL);
+      saw_comma = true;
+    }
+
   if (this->peek_token()->is_op(OPERATOR_EQ))
     {
       if (!this->advance_token()->is_op(OPERATOR_CHANOP))
Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter
On Mon, Dec 15, 2014 at 9:29 AM, Uros Bizjak ubiz...@gmail.com wrote: The enclosed testcase fails on x86 when compiled with -Os since we pass a byte parameter with a byte load in caller and read it as an int in callee. The reason it only shows up with -Os is x86 backend encodes a byte load with an int load if -O isn't used. When a byte load is used, the upper 24 bits of the register have random value for none WORD_REGISTER_OPERATIONS targets. It happens because setup_incoming_promotions in combine.c has /* The mode and signedness of the argument before any promotions happen (equal to the mode of the pseudo holding it at that stage). */ mode1 = TYPE_MODE (TREE_TYPE (arg)); uns1 = TYPE_UNSIGNED (TREE_TYPE (arg)); /* The mode and signedness of the argument after any source language and TARGET_PROMOTE_PROTOTYPES-driven promotions. */ mode2 = TYPE_MODE (DECL_ARG_TYPE (arg)); uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg)); /* The mode and signedness of the argument as it is actually passed, after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions. */ mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3, TREE_TYPE (cfun-decl), 0); while they are actually passed in register by assign_parm_setup_reg in function.c: /* Store the parm in a pseudoregister during the function, but we may need to do it in a wider mode. Using 2 here makes the result consistent with promote_decl_mode and thus expand_expr_real_1. */ promoted_nominal_mode = promote_function_mode (data-nominal_type, data-nominal_mode, unsignedp, TREE_TYPE (current_function_decl), 2); where nominal_type and nominal_mode are set up with TREE_TYPE (parm) and TYPE_MODE (nominal_type). TREE_TYPE here is I think the bug is here, not in combine.c. Can you try going back in history for both snippets and see if they matched at some point? 
The bug was introduced by https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html commit 5d93234932c3d8617ce92b77b7013ef6bede9508 Author: shinwell shinwell@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu Sep 20 11:01:18 2007 + gcc/ * combine.c: Include cgraph.h. (setup_incoming_promotions): Rework to allow more aggressive elimination of sign extensions when all call sites of the current function are known to lie within the current unit. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618 138bc75d-0d04-0410-961f-82ee72b054a4 Before this commit, combine.c has enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg)); int uns = TYPE_UNSIGNED (TREE_TYPE (arg)); mode = promote_mode (TREE_TYPE (arg), mode, uns, 1); if (mode == GET_MODE (reg) mode != DECL_MODE (arg)) { rtx x; x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx); x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x); record_value_for_reg (reg, first, x); } It matches function.c: /* This is not really promoting for a call. However we need to be consistent with assign_parm_find_data_types and expand_expr_real_1. */ promoted_nominal_mode = promote_mode (data-nominal_type, data-nominal_mode, unsignedp, 1); r128618 changed mode = promote_mode (TREE_TYPE (arg), mode, uns, 1); to mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, uns3, 1); It breaks none WORD_REGISTER_OPERATIONS targets. Hmm, I think that DECL_ARG_TYPE makes a difference only for non-WORD_REGISTER_OPERATIONS targets. But yeah, isolated the above change looks wrong. Your patch is ok for trunk if nobody objects within 24h and for branches after a week. Thanks, Richard. This patch caused PR64213. Here is the updated patch. The difference is mode3 = promote_function_mode (TREE_TYPE (arg), mode1, uns3, TREE_TYPE (cfun-decl), 0); vs mode3 = promote_function_mode (TREE_TYPE (arg), mode1, uns1, TREE_TYPE (cfun-decl), 0); I made a mistake in my previous patch where I shouldn't have changed uns3 to uns1. 
We do want to update mode3 and uns3, not mode3 and uns1. It generates the same code on PR64213 testcase with a cross alpha-linux GCC. Uros, can you test it on Linux/alpha? OK for master, 4.9 and 4.8 branches if it works on Linux/alpha? Yes, this patch works OK [1] on linux/alpha mainline. ... and 4.9 branch [2]. [1] https://gcc.gnu.org/ml/gcc-testresults/2014-12/msg01867.html [2] https://gcc.gnu.org/ml/gcc-testresults/2014-12/msg01919.html Uros.
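A minimal sketch, not taken from the patch, of why a callee cannot assume that a byte parameter arrives already promoted on a target where a narrow load leaves the upper register bits unspecified. The 32-bit "register" model and all function names here are hypothetical; the point is only that the extension has to be derived from the low byte and its signedness, which is what choosing the right mode and `uns` value encodes.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit argument register: the low byte holds
   the parameter, the upper 24 bits hold whatever was there before.  */
uint32_t make_reg (uint8_t low, uint32_t garbage)
{
  return (garbage & 0xffffff00u) | low;
}

/* Wrong on such a target: the callee trusts the upper bits.  */
uint32_t read_trusting (uint32_t reg)
{
  return reg;
}

/* Zero extension from the low byte (unsigned parameter).  */
int read_zero_extended (uint32_t reg)
{
  return (int) (reg & 0xffu);
}

/* Sign extension from the low byte (signed parameter), written without
   relying on implementation-defined narrowing conversions.  */
int read_sign_extended (uint32_t reg)
{
  int v = (int) (reg & 0xffu);
  return v >= 128 ? v - 256 : v;
}
```

With a low byte of 0xfe and arbitrary upper bits, the two extending readers give 254 and -2 respectively, while the trusting reader returns a value polluted by the stale upper bits.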
[PATCH, libgcc]: Fix PR 63832, crtstuff.c:400:19: warning: array subscript is above array bounds
Hello!

Attached patch fixes PR 63832, where the code tries to access what the compiler thinks is an out-of-bounds element.

2014-12-15  Uros Bizjak  ubiz...@gmail.com

        PR libgcc/63832
        * crtstuff.c (__do_global_dtors_aux) [HIDDEN_DTOR_LIST_END]: Use
        func_ptr *dtor_list temporary variable to avoid "array subscript
        is above array bounds" warnings.

Bootstrapped and regression tested on x86_64-linux-gnu {,m32} CentOS 5.11.

OK for mainline?

Uros.

Index: crtstuff.c
===
--- crtstuff.c  (revision 218733)
+++ crtstuff.c  (working copy)
@@ -393,13 +393,11 @@ __do_global_dtors_aux (void)
     extern func_ptr __DTOR_END__[] __attribute__((visibility ("hidden")));
     static size_t dtor_idx;
     const size_t max_idx = __DTOR_END__ - __DTOR_LIST__ - 1;
-    func_ptr f;
+    func_ptr *dtor_list;
 
+    __asm ("" : "=g" (dtor_list) : "0" (__DTOR_LIST__));
     while (dtor_idx < max_idx)
-      {
-        f = __DTOR_LIST__[++dtor_idx];
-        f ();
-      }
+      dtor_list[++dtor_idx] ();
   }
 #else /* !defined (FINI_ARRAY_SECTION_ASM_OP) */
   {
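A minimal sketch of the technique the patch relies on, assuming GCC-style extended asm is available: an empty asm statement with a matching constraint copies a pointer while hiding from the optimizer which object it points into, so bounds-based warnings no longer apply to accesses through the copy. The `table`, `bump`, `hide_origin`, and `run_entries` names are hypothetical and are not part of crtstuff.c.

```c
#include <stddef.h>

typedef void (*func_ptr) (void);

int calls;

static void bump (void) { calls++; }

/* Hypothetical table standing in for a destructor list: slot 0 is a
   header entry that is never called, the last slot is a terminator.  */
func_ptr table[4] = { 0, bump, bump, 0 };

/* Returns a pointer equal to P.  The empty asm ties the output to the
   input via the "0" matching constraint, so no instruction is emitted,
   but the compiler can no longer relate the result to P's object.  */
func_ptr *hide_origin (func_ptr *p)
{
  func_ptr *q;
  __asm ("" : "=g" (q) : "0" (p));
  return q;
}

/* Calls entries 1..MAX_IDX through the laundered pointer and returns
   the index of the last entry called.  */
size_t run_entries (func_ptr *list, size_t max_idx)
{
  func_ptr *hidden = hide_origin (list);
  size_t idx = 0;

  while (idx < max_idx)
    hidden[++idx] ();
  return idx;
}
```

The trade-off is that the compiler also loses the ability to diagnose genuine out-of-bounds accesses through the laundered pointer, so the index limit has to be correct by construction.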
Go patch committed: Avoid type check failure for converted recover call
The Go frontend converts recover calls to add an argument passed down to the middle end. In some cases a recover call would be type checked even after that argument is added, causing an incorrect error. This patch from Chris Manghane fixes the problem. This is GCC PR 61248. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 86515da29da6 go/expressions.cc
--- a/go/expressions.cc Mon Dec 15 09:10:59 2014 -0800
+++ b/go/expressions.cc Mon Dec 15 09:30:12 2014 -0800
@@ -6627,6 +6627,8 @@
   // Used to stop endless loops when the length of an array uses len
   // or cap of the array itself.
   mutable bool seen_;
+  // Whether the argument is set for calls to BUILTIN_RECOVER.
+  bool recover_arg_is_set_;
 };
 
 Builtin_call_expression::Builtin_call_expression(Gogo* gogo,
@@ -6635,7 +6637,8 @@
                                                  bool is_varargs,
                                                  Location location)
   : Call_expression(fn, args, is_varargs, location),
-    gogo_(gogo), code_(BUILTIN_INVALID), seen_(false)
+    gogo_(gogo), code_(BUILTIN_INVALID), seen_(false),
+    recover_arg_is_set_(false)
 {
   Func_expression* fnexp = this->fn()->func_expression();
   go_assert(fnexp != NULL);
@@ -6701,6 +6704,7 @@
   Expression_list* new_args = new Expression_list();
   new_args->push_back(arg);
   this->set_args(new_args);
+  this->recover_arg_is_set_ = true;
 }
 
 // Lower a builtin call expression.  This turns new and make into
@@ -7841,7 +7845,9 @@
       break;
 
     case BUILTIN_RECOVER:
-      if (this->args() != NULL && !this->args()->empty())
+      if (this->args() != NULL
+          && !this->args()->empty()
+          && !this->recover_arg_is_set_)
         this->report_error(_("too many arguments"));
       break;
Re: [PATCH, libgcc]: Fix PR 63832, crtstuff.c:400:19: warning: array subscript is above array bounds
On Mon, Dec 15, 2014 at 06:25:04PM +0100, Uros Bizjak wrote:
> Hello!
>
> Attached patch fixes PR 63832, where the code tries to access what
> the compiler thinks is an out-of-bounds element.
>
> 2014-12-15  Uros Bizjak  ubiz...@gmail.com
>
>         PR libgcc/63832
>         * crtstuff.c (__do_global_dtors_aux) [HIDDEN_DTOR_LIST_END]: Use
>         func_ptr *dtor_list temporary variable to avoid "array subscript
>         is above array bounds" warnings.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,m32} CentOS 5.11.
>
> OK for mainline?

Ok, thanks.

> Index: crtstuff.c
> ===
> --- crtstuff.c        (revision 218733)
> +++ crtstuff.c        (working copy)
> @@ -393,13 +393,11 @@ __do_global_dtors_aux (void)
>      extern func_ptr __DTOR_END__[] __attribute__((visibility ("hidden")));
>      static size_t dtor_idx;
>      const size_t max_idx = __DTOR_END__ - __DTOR_LIST__ - 1;
> -    func_ptr f;
> +    func_ptr *dtor_list;
>
> +    __asm ("" : "=g" (dtor_list) : "0" (__DTOR_LIST__));
>      while (dtor_idx < max_idx)
> -      {
> -        f = __DTOR_LIST__[++dtor_idx];
> -        f ();
> -      }
> +      dtor_list[++dtor_idx] ();
>    }
>  #else /* !defined (FINI_ARRAY_SECTION_ASM_OP) */
>    {

        Jakub
C++ PATCH for C++14 sized deallocation
This patch implements the last remaining language feature for C++14, global sized deallocation. C++ has always had sized deallocation at class scope, but didn't for deletes that use the global operator delete. The support can be controlled separately from the -std level with the -fsized-deallocation flag (same as clang). The compiler will warn about the unsized variant being defined without the sized variant (or vice versa) with the -Wsized-deallocation flag, which is also enabled by -Wextra. This patch also adds -Wc++14-compat, which currently only warns about a deallocation function with a second size_t parameter changing from being a placement delete to a usual deallocation function. Tested x86_64-pc-linux-gnu, applying to trunk. commit 294f8f12f574592269513ba6f814e564c140b7b5 Author: Jason Merrill ja...@redhat.com Date: Fri Dec 12 23:43:31 2014 -0500 N3778: Sized Deallocation gcc/c-family/ * c.opt (-fsized-deallocation, -Wc++14-compat): New. (-Wsized-deallocation): New. * c-opts.c (c_common_post_options): -fsized-deallocation defaults to on in C++14 and up. gcc/cp/ * call.c (non_placement_deallocation_fn_p): A global sized operator delete is not a usual deallocation function until C++14. (build_op_delete_call): Choose the global sized op delete if we know the size. * cp-tree.h: Declare non_placement_deallocation_fn_p. (enum cp_tree_index): Remove CPTI_GLOBAL_DELETE_FNDECL. (global_delete_fndecl): Remove. * decl.c (cxx_init_decl_processing): Also declare sized op deletes. (grok_op_properties): Warn about sized dealloc without the flag. * init.c (build_builtin_delete_call): Remove. (build_vec_delete_1, build_delete): Don't call it. * decl2.c (maybe_warn_sized_delete): New. (cp_write_global_declarations): Call it. libstdc++-v3/ * libsupc++/del_ops.cc: New. * libsupc++/del_opvs.cc: New. * libsupc++/Makefile.am: Add them. * libsupc++/Makefile.in: Regenerate. * config/abi/pre/gnu.ver: Export _ZdlPvm and _ZdaPvm. 
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 08a36f0..dbb9912 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -889,6 +889,10 @@ c_common_post_options (const char **pfilename) else if (warn_narrowing == -1) warn_narrowing = 0; + /* Global sized deallocation is new in C++14. */ + if (flag_sized_deallocation == -1) +flag_sized_deallocation = (cxx_dialect = cxx14); + if (flag_extern_tls_init) { #if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index b9f7c65..1676f65 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -315,6 +315,10 @@ Wc++11-compat C++ ObjC++ Warning Alias(Wc++0x-compat) Warn about C++ constructs whose meaning differs between ISO C++ 1998 and ISO C++ 2011 +Wc++14-compat +C++ ObjC++ Var(warn_cxx14_compat) Warning LangEnabledBy(C++ ObjC++,Wall) +Warn about C++ constructs whose meaning differs between ISO C++ 2011 and ISO C++ 2014 + Wcast-qual C ObjC C++ ObjC++ Var(warn_cast_qual) Warning Warn about casts which discard qualifiers @@ -554,6 +558,10 @@ Wmissing-field-initializers C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning EnabledBy(Wextra) Warn about missing fields in struct initializers +Wsized-deallocation +C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra) +Warn about missing sized deallocation functions + Wsizeof-pointer-memaccess C ObjC C++ ObjC++ Var(warn_sizeof_pointer_memaccess) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about suspicious length parameters to certain string functions if the argument uses sizeof @@ -1341,6 +1349,10 @@ fsigned-char C ObjC C++ ObjC++ LTO Var(flag_signed_char) Make \char\ signed by default +fsized-deallocation +C++ ObjC++ Var(flag_sized_deallocation) Init(-1) +Enable C++14 sized deallocation support + fsquangle C++ ObjC++ Ignore Warn(switch %qs is no longer supported) diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 312dfdf..86c78ab 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ 
-5733,7 +5733,7 @@ build_new_op (location_t loc, enum tree_code code, int flags, /* Returns true iff T, an element of an OVERLOAD chain, is a usual deallocation function (3.7.4.2 [basic.stc.dynamic.deallocation]). */ -static bool +bool non_placement_deallocation_fn_p (tree t) { /* A template instance is never a usual deallocation function, @@ -5749,9 +5749,11 @@ non_placement_deallocation_fn_p (tree t) function named operator delete with exactly two parameters, the second of which has type std::size_t (18.2), then this function is a usual deallocation function. */ + bool global = DECL_NAMESPACE_SCOPE_P (t); t = FUNCTION_ARG_CHAIN (t); if (t == void_list_node || (t same_type_p (TREE_VALUE (t), size_type_node) + (!global || flag_sized_deallocation) TREE_CHAIN (t) == void_list_node)) return
[PATCH] Fix PR64312
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-12-15 Richard Biener rguent...@suse.de PR tree-optimization/64312 * tree-ssa-sccvn.c (vn_reference_lookup_pieces): Use vuse_ssa_val as callback to walk_non_aliased_vuses. (vn_reference_lookup): Likewise. * g++.dg/torture/pr64312.C: New testcase. Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c(revision 218749) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -2161,7 +2161,7 @@ vn_reference_lookup_pieces (tree vuse, a (vn_reference_t)walk_non_aliased_vuses (r, vr1.vuse, vn_reference_lookup_2, vn_reference_lookup_3, - vn_valueize, vr1); + vuse_ssa_val, vr1); gcc_checking_assert (vr1.operands == shared_lookup_references); } @@ -2214,7 +2214,7 @@ vn_reference_lookup (tree op, tree vuse, (vn_reference_t)walk_non_aliased_vuses (r, vr1.vuse, vn_reference_lookup_2, vn_reference_lookup_3, - vn_valueize, vr1); + vuse_ssa_val, vr1); gcc_checking_assert (vr1.operands == shared_lookup_references); if (wvnresult) { Index: gcc/testsuite/g++.dg/torture/pr64312.C === --- gcc/testsuite/g++.dg/torture/pr64312.C (revision 0) +++ gcc/testsuite/g++.dg/torture/pr64312.C (working copy) @@ -0,0 +1,123 @@ +// { dg-do compile } + +template typename C struct A +{ + typedef typename C::iterator type; +}; +template typename T2 struct B +{ + typedef T2 type; +}; +template typename F2 struct L +{ + typedef typename BF2::type::type type; +}; +template typename C struct M +{ + typedef typename LAC ::type type; +}; +class C +{ +public: + typedef int iterator; +}; +template class IteratorT class D +{ +public: + typedef IteratorT iterator; + template class Iterator D (Iterator p1, Iterator) : m_Begin (p1), m_End (0) + { + } + IteratorT m_Begin; + IteratorT m_End; +}; +template class IteratorT class I : public DIteratorT +{ +protected: + template class Iterator + I (Iterator p1, Iterator p2) + : DIteratorT (p1, p2) + { + } +}; +class F +{ +public: + int elems[]; + int * + m_fn1 () + { +return elems; + } +}; +class G +{ 
+public: + void * + m_fn2 (int) + { +return m_buffer.m_fn1 (); + } + F m_buffer; +}; +struct any_incrementable_iterator_interface +{ + virtual ~any_incrementable_iterator_interface () {} +}; +class J : public any_incrementable_iterator_interface +{ +public: + J (int) : m_it () {} + int m_it; +}; +void *operator new(unsigned long, void *p2) { return p2; } +template class T typename MT::type begin (T) { return 0; } +template class T typename MT::type end (T) {} +template class class any_iterator +{ +public: + template class WrappedIterator any_iterator (WrappedIterator) + { +void *ptr = m_buffer.m_fn2 (0); +m_impl = new (ptr) J (0); + } + ~any_iterator () + { +if (m_impl) + m_impl-~any_incrementable_iterator_interface (); + } + G m_buffer; + any_incrementable_iterator_interface *m_impl; +}; +template class Reference class K : public Iany_iteratorReference +{ +public: + template class WrappedRange + K (WrappedRange p1) + : Iany_iteratorReference (begin (p1), end (p1)) + { + } +}; +template class Reference struct H +{ + typedef KReference type; +}; +template class, class, class, class, class, class TargetReference +void +mix_values_impl () +{ + C test_data; + Hint::type source_data (test_data); + typename HTargetReference::type t2 = source_data; +} +template class +void +mix_values_driver () +{ + mix_values_implint, int, int, int, int, int (); +} +void +mix_values () +{ + mix_values_driverint (); +}
[PATCH] Fix simplify_relational_operation_1 (PR rtl-optimization/64316)
Hi!

This patch fixes an ICE when cmp_mode is a vector mode: creating a comparison of a vector with the scalar const0_rtx is a bad idea.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-12-15  Jakub Jelinek  ja...@redhat.com

        PR rtl-optimization/64316
        * simplify-rtx.c (simplify_relational_operation_1): For
        (eq/ne (and x y) x) and (eq/ne (and x y) y) optimizations use
        CONST0_RTX instead of const0_rtx.

        * gcc.dg/pr64316.c: New test.

--- gcc/simplify-rtx.c.jj       2014-12-12 13:39:50.0 +0100
+++ gcc/simplify-rtx.c  2014-12-15 16:40:33.371447749 +0100
@@ -4561,7 +4561,8 @@ simplify_relational_operation_1 (enum rt
       rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode);
       rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
 
-      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
+      return simplify_gen_relational (code, mode, cmp_mode, lhs,
+                                      CONST0_RTX (cmp_mode));
     }
 
   /* Likewise for (eq/ne (and x y) y).  */
@@ -4573,7 +4574,8 @@ simplify_relational_operation_1 (enum rt
       rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode);
       rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
 
-      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
+      return simplify_gen_relational (code, mode, cmp_mode, lhs,
+                                      CONST0_RTX (cmp_mode));
     }
 
   /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped.  */
--- gcc/testsuite/gcc.dg/pr64316.c.jj   2014-12-15 16:46:47.428982539 +0100
+++ gcc/testsuite/gcc.dg/pr64316.c      2014-12-15 16:46:29.0 +0100
@@ -0,0 +1,42 @@
+/* PR rtl-optimization/64316 */
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
+
+struct S
+{
+  unsigned int s;
+  unsigned long w[];
+};
+
+struct S **s;
+
+int
+foo (struct S *x, struct S *y, struct S *z)
+{
+  unsigned int i;
+  unsigned long *a, *b, *c;
+  int r = 0;
+  for (a = x->w, b = y->w, c = z->w, i = 0; i < x->s; i++, a++)
+    {
+      unsigned long d = *b++ & *c++;
+      if (*a != d)
+        {
+          r = 1;
+          *a = d;
+        }
+    }
+  return r;
+}
+
+void
+bar (int x)
+{
+  int p = x - 1;
+  do
+    {
+      foo (s[x], s[x], s[p]);
+      p--;
+    }
+  while (p > 0);
+}

        Jakub
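The rewrite being fixed rests on the identity (x & y) == x if and only if (~y & x) == 0. A minimal sketch of that identity follows, in scalar form and in a lane-wise form where the right-hand side is a zero of the same shape as the operands. The `vec4` type and all function names are hypothetical; the lane-wise zero only mirrors the idea behind using CONST0_RTX (cmp_mode) rather than the scalar const0_rtx, it is not how RTL represents vectors.

```c
#include <stdint.h>

#define LANES 4

/* Hypothetical 4-lane vector, modelled as a plain struct.  */
typedef struct { uint32_t lane[LANES]; } vec4;

/* Direct form: every bit set in X is also set in Y.  */
int subset_direct (uint32_t x, uint32_t y)
{
  return (x & y) == x;
}

/* Rewritten form: no bit of X survives masking with ~Y.  */
int subset_rewritten (uint32_t x, uint32_t y)
{
  return (~y & x) == 0;
}

/* Lane-wise rewritten form.  Each lane is compared against the
   corresponding lane of a zero vector of the same shape, and the
   result lane is all-ones for true and zero for false.  */
vec4 vec_subset_rewritten (vec4 x, vec4 y)
{
  const vec4 zero = { { 0, 0, 0, 0 } };
  vec4 r;

  for (int i = 0; i < LANES; i++)
    r.lane[i] = ((~y.lane[i] & x.lane[i]) == zero.lane[i]) ? UINT32_MAX : 0;
  return r;
}
```

Comparing a whole vector against a single scalar zero has no well-defined shape, which is the mismatch the patch removes.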
[committed] Add testcase from PR63804
Hi! This PR got fixed some time ago, this patch adds a testcase for it. 2014-12-15 Jakub Jelinek ja...@redhat.com PR rtl-optimization/63804 * gcc.dg/pr63804.c: New test. --- gcc/testsuite/gcc.dg/pr63804.c.jj 2014-12-15 17:35:31.978451234 +0100 +++ gcc/testsuite/gcc.dg/pr63804.c 2014-12-15 17:35:16.0 +0100 @@ -0,0 +1,52 @@ +/* PR rtl-optimization/63804 */ +/* { dg-do compile } */ +/* { dg-options -O2 -g } */ + +struct A { int gen; } e; +int a, d; +long b; +enum B { C }; +struct D +{ + enum B type : 1; + int nr : 1; + struct { unsigned ud; } dw1; +}; +enum B c; + +void +fn1 (int p1) +{ + b = p1 a; +} + +int fn2 (); +void fn3 (); + +void +fn4 (struct D p1, unsigned p2, int p3) +{ + struct D f, g, h, j = p1, l, m = l; + struct A i = e; + if (i.gen) +p2 = 0; + j.type = c; + g = j; + p1 = g; + fn3 (); + int k = p2, v = p1.nr, p = v; + m.dw1.ud = k; + f = m; + h = f; + struct D n = h; + fn3 (n); + { +d = fn2 (); +int o = d; +fn1 (o); + } + if (i.gen) +fn3 (p1); + b = p a; + fn3 (p3); +} Jakub
Re: [patch] PR fortran/61669
On Sat, Aug 23, 2014 at 06:59:14PM +0200, Paul Richard Thomas wrote:
> Dear Steven,
>
> I am constantly amazed that data statement bugs keep turning up:-)
>
> Anyway, your fix is fine for trunk and, if you feel so inclined, 4.8 and 4.9.

Steven didn't respond to my ping about this PR, so I've updated the testcase, bootstrapped/regtested it on x86_64-linux and i686-linux and committed to trunk. I'll put it into 4.9/4.8 later on.

2014-12-15  Steven Bosscher  ste...@gcc.gnu.org
            Jakub Jelinek  ja...@redhat.com

        PR fortran/61669
        * gfortran.h (struct gfc_namespace): Add OLD_DATA field.
        * decl.c (gfc_reject_data): New function.
        * parse.c (use_modules): Record roll-back point.
        (next_statement): Likewise.
        (reject_statement): Roll back to last accepted DATA.

        * gfortran.dg/pr61669.f90: New test.

--- gcc/fortran/gfortran.h (revision 214350)
+++ gcc/fortran/gfortran.h (working copy)
@@ -1625,7 +1625,7 @@ typedef struct gfc_namespace
   gfc_st_label *st_labels;
   /* This list holds information about all the data initializers in
      this namespace.  */
-  struct gfc_data *data;
+  struct gfc_data *data, *old_data;
 
   gfc_charlen *cl_list, *old_cl_list;
 
@@ -2941,6 +2941,7 @@ void gfc_free_omp_namelist (gfc_omp_namelist *);
 void gfc_free_equiv (gfc_equiv *);
 void gfc_free_equiv_until (gfc_equiv *, gfc_equiv *);
 void gfc_free_data (gfc_data *);
+void gfc_reject_data (gfc_namespace *);
 void gfc_free_case_list (gfc_case *);
 
 /* matchexp.c -- FIXME too?  */
--- gcc/fortran/decl.c (revision 214350)
+++ gcc/fortran/decl.c (working copy)
@@ -178,7 +178,21 @@ gfc_free_data_all (gfc_namespace *ns)
     }
 }
 
+/* Reject data parsed since the last restore point was marked.  */
+void
+gfc_reject_data (gfc_namespace *ns)
+{
+  gfc_data *d;
+
+  while (ns->data && ns->data != ns->old_data)
+    {
+      d = ns->data->next;
+      free (ns->data);
+      ns->data = d;
+    }
+}
+
 static match var_element (gfc_data_variable *);
 
 /* Match a list of variables terminated by an iterator and a right
--- gcc/fortran/parse.c (revision 214350)
+++ gcc/fortran/parse.c (working copy)
@@ -118,6 +118,7 @@ use_modules (void)
   gfc_warning_check ();
   gfc_current_ns->old_cl_list = gfc_current_ns->cl_list;
   gfc_current_ns->old_equiv = gfc_current_ns->equiv;
+  gfc_current_ns->old_data = gfc_current_ns->data;
   last_was_use_stmt = false;
 }
 
@@ -1097,6 +1098,7 @@ next_statement (void)
 
   gfc_current_ns->old_cl_list = gfc_current_ns->cl_list;
   gfc_current_ns->old_equiv = gfc_current_ns->equiv;
+  gfc_current_ns->old_data = gfc_current_ns->data;
   for (;;)
     {
       gfc_statement_label = NULL;
@@ -2045,6 +2047,8 @@ reject_statement (void)
   gfc_free_equiv_until (gfc_current_ns->equiv, gfc_current_ns->old_equiv);
   gfc_current_ns->equiv = gfc_current_ns->old_equiv;
 
+  gfc_reject_data (gfc_current_ns);
+
   gfc_new_block = NULL;
   gfc_undo_symbols ();
   gfc_clear_warning ();
--- gcc/testsuite/gfortran.dg/pr61669.f90 (revision 0)
+++ gcc/testsuite/gfortran.dg/pr61669.f90 (working copy)
@@ -0,0 +1,7 @@
+! { dg-do compile }
+      write (*,"(a)") char(12)
+      CHARACTER*80 A /"A"/      ! { dg-error "Unexpected data declaration statement" }
+      REAL*4 B                  ! { dg-error "Unexpected data declaration statement" }
+      write (*,"(a)") char(12)
+      DATA B / 0.02 /           ! { dg-warning "Obsolescent feature: DATA statement" }
+      END

        Jakub
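The roll-back scheme in the patch above (remember the list head at a restore point, then free everything pushed since) can be sketched outside the compiler. The following is only an illustration under assumed names: Node, Namespace, push, mark, reject and length are hypothetical stand-ins and not gfortran code.

```cpp
#include <cassert>

// Hypothetical stand-in for gfc_data: a singly linked list node.
struct Node {
    int value;
    Node *next;
};

// Hypothetical stand-in for the data/old_data pair in gfc_namespace.
struct Namespace {
    Node *data = nullptr;      // head of the list, newest entry first
    Node *old_data = nullptr;  // head as it was at the last restore point
};

// Push a new entry at the head of the list.
void push(Namespace &ns, int value) {
    ns.data = new Node{value, ns.data};
}

// Record the current head as the roll-back point.
void mark(Namespace &ns) {
    ns.old_data = ns.data;
}

// Free every entry pushed since the last mark(), newest first.
void reject(Namespace &ns) {
    while (ns.data && ns.data != ns.old_data) {
        Node *next = ns.data->next;
        delete ns.data;
        ns.data = next;
    }
}

// Count the entries currently on the list.
int length(const Namespace &ns) {
    int n = 0;
    for (const Node *p = ns.data; p; p = p->next)
        ++n;
    return n;
}
```

Because new entries are always added at the head, walking from the head until the saved pointer is reached visits exactly the entries added after the restore point, so calling reject twice in a row is harmless.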
[PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? On: #include pthread.h int v; int foo (int x) { if (x 99) throw x; v++; return x; } void * tf (void *) { for (int i = 0; i 100; i++) try { foo (i); } catch (int) {} return NULL; } int main () { pthread_t th; if (pthread_create (th, NULL, tf, NULL)) return 0; v++; pthread_join (th, NULL); return 0; } I used to get without this patch: == WARNING: ThreadSanitizer: data race (pid=20449) Read of size 4 at 0x006020e0 by thread T1: #0 foo(int) /usr/src/gcc/obj/gcc/ts.C:10 (ts+0x00400cb9) #1 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #2 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #3 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #4 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #5 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #6 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #7 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #8 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #9 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #10 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #11 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #12 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #13 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #14 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #15 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #16 
tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #17 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #18 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #19 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #20 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #21 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #22 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #23 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #24 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #25 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #26 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #27 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #28 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #29 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #30 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #31 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #32 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #33 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #34 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #35 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #36 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #37 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #38 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #39 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #40 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #41 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #42 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #43 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #44 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #45 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #46 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #47 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #48 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #49 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 
(ts+0x00400d13) #50 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #51 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #52 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #53 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #54 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #55 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #56 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #57 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18
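
The repeated tf frames in the report above are consistent with an exit hook that is skipped whenever a function is left by an exception. As a user-level analogy only (the patch itself works inside the compiler during gimplification), the sketch below contrasts an exit call placed on the normal return path with one tied to scope exit. All names here (shadow_stack, func_entry, func_exit, FuncGuard, leaky, guarded, depth_after_throws) are hypothetical and not part of the tsan runtime.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-thread shadow call stack.
static std::vector<const char *> shadow_stack;

// Hypothetical entry/exit hooks, named only for illustration.
void func_entry(const char *name) { shadow_stack.push_back(name); }
void func_exit() { shadow_stack.pop_back(); }

// Guard whose destructor runs on normal return and during unwinding.
struct FuncGuard {
    explicit FuncGuard(const char *name) { func_entry(name); }
    ~FuncGuard() { func_exit(); }
};

// Exit hook only on the normal path: it is skipped when throwing.
int leaky(int x) {
    func_entry("leaky");
    if (x < 0)
        throw x;
    func_exit();
    return x;
}

// Exit hook tied to scope exit: it also runs when throwing.
int guarded(int x) {
    FuncGuard g("guarded");
    if (x < 0)
        throw x;
    return x;
}

// Call f with a throwing argument n times and report how many
// stale frames are left on the shadow stack afterwards.
template <typename F>
std::size_t depth_after_throws(F f, int n) {
    shadow_stack.clear();
    for (int i = 0; i < n; ++i) {
        try {
            f(-1);
        } catch (int) {
        }
    }
    return shadow_stack.size();
}
```

With leaky, every throw leaves one stale frame behind, so the depth grows with the number of iterations; with guarded it stays at zero.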
addition to the patch for PR64110
My last patch for PR64110 results in LRA crash on ARM on compilation of some programs. There is no PR for it. Here is the patch fixing the occurred problem. The patch was tested and bootstrapped on x86/x86-64 and ARM. Committed as rev. 218760. 2014-12-15 Vladimir Makarov vmaka...@redhat.com * ira-int.h (ira_prohibited_class_mode_regs): Remove. (struct target_ira_int): Move x_ira_prohibited_class_mode_regs to ... * ira.h (struct target_ira): ... here. (ira_prohibited_class_mode_regs): Define. * lra-constraints.c (process_alt_operands): Add one more condition to refuse alternative when reload pseudo of given class can not hold value of given mode. Index: ira.h === --- ira.h (revision 218685) +++ ira.h (working copy) @@ -110,6 +110,11 @@ /* Function specific hard registers can not be used for the register allocation. */ HARD_REG_SET x_ira_no_alloc_regs; + + /* Array whose values are hard regset of hard registers available for + the allocation of given register class whose HARD_REGNO_MODE_OK + values for given mode are zero. */ + HARD_REG_SET x_ira_prohibited_class_mode_regs[N_REG_CLASSES][NUM_MACHINE_MODES]; }; extern struct target_ira default_target_ira; @@ -155,6 +160,8 @@ (this_target_ira-x_ira_class_singleton) #define ira_no_alloc_regs \ (this_target_ira-x_ira_no_alloc_regs) +#define ira_prohibited_class_mode_regs \ + (this_target_ira-x_ira_prohibited_class_mode_regs) /* Major structure describing equivalence info for a pseudo. */ struct ira_reg_equiv_s Index: ira-int.h === --- ira-int.h (revision 218685) +++ ira-int.h (working copy) @@ -843,11 +843,6 @@ unavailable for the allocation. */ short x_ira_class_hard_reg_index[N_REG_CLASSES][FIRST_PSEUDO_REGISTER]; - /* Array whose values are hard regset of hard registers available for - the allocation of given register class whose HARD_REGNO_MODE_OK - values for given mode are zero. 
*/ - HARD_REG_SET x_ira_prohibited_class_mode_regs[N_REG_CLASSES][NUM_MACHINE_MODES]; - /* Index [CL][M] contains R if R appears somewhere in a register of the form: (reg:M R'), R' not in x_ira_prohibited_class_mode_regs[CL][M] @@ -939,8 +934,6 @@ (this_target_ira_int-x_ira_non_ordered_class_hard_regs) #define ira_class_hard_reg_index \ (this_target_ira_int-x_ira_class_hard_reg_index) -#define ira_prohibited_class_mode_regs \ - (this_target_ira_int-x_ira_prohibited_class_mode_regs) #define ira_useful_class_mode_regs \ (this_target_ira_int-x_ira_useful_class_mode_regs) #define ira_important_classes_num \ Index: lra-constraints.c === --- lra-constraints.c (revision 218688) +++ lra-constraints.c (working copy) @@ -2269,17 +2269,25 @@ /* Alternative loses if it required class pseudo can not hold value of required mode. Such insns can be -described by insn definitions with mode iterators. -Don't use ira_prohibited_class_mode_regs here as it -is common practice for constraints to use a class -which does not have actually enough regs to hold the -value (e.g. x86 AREG for mode requiring more one -general reg). */ +described by insn definitions with mode iterators. */ if (GET_MODE (*curr_id-operand_loc[nop]) != VOIDmode ! hard_reg_set_empty_p (this_alternative_set) + /* It is common practice for constraints to use a +class which does not have actually enough regs to +hold the value (e.g. x86 AREG for mode requiring +more one general reg). Therefore we have 2 +conditions to check that the reload pseudo can +not hold the mode value. */ ! HARD_REGNO_MODE_OK (ira_class_hard_regs [this_alternative][0], - GET_MODE (*curr_id-operand_loc[nop]))) + GET_MODE (*curr_id-operand_loc[nop])) + /* The above condition is not enough as the first +reg in ira_class_hard_regs can be not aligned for +multi-words mode values. 
*/ + hard_reg_set_subset_p (this_alternative_set, + ira_prohibited_class_mode_regs + [this_alternative] + [GET_MODE (*curr_id-operand_loc[nop])])) { if (lra_dump_file != NULL) fprintf
[Patch, Fortran, OOP] PR 64244: [4.8/4.9/5 Regression] ICE at class.c:236 when using non_overridable
Hi all,

here is a regression fix for a problem with the NON_OVERRIDABLE attribute. For non-overridable type-bound procedures we do not have to generate a call to the vtable, but can just translate it to a simple ('non-virtual') function call.

In this particular case, an additional generic binding was present, which fooled the compiler into believing that the call goes to an overridable procedure, so it tried to generate a call to a vtable entry which did not exist.

The trick is simply to take the NON_OVERRIDABLE attribute from the specific procedure, not the generic one (which means the generic call has to be resolved to a specific one first).

The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk? And for 4.8/4.9 after some time?

Cheers,
Janus

2014-12-15  Janus Weil  ja...@gcc.gnu.org

        PR fortran/64244
        * resolve.c (resolve_typebound_call): New argument to pass out the
        non-overridable attribute of the specific procedure.
        (resolve_typebound_subroutine): Get overridable flag from
        resolve_typebound_call.

2014-12-15  Janus Weil  ja...@gcc.gnu.org

        PR fortran/64244
        * gfortran.dg/typebound_call_26.f90: New.

Index: gcc/fortran/resolve.c
===================================================================
--- gcc/fortran/resolve.c (revision 218751)
+++ gcc/fortran/resolve.c (working copy)
@@ -5676,7 +5676,7 @@ success:
 /* Resolve a call to a type-bound subroutine.  */
 
 static bool
-resolve_typebound_call (gfc_code* c, const char **name)
+resolve_typebound_call (gfc_code* c, const char **name, bool *overridable)
 {
   gfc_actual_arglist* newactual;
   gfc_symtree* target;
@@ -5700,6 +5700,10 @@ static bool
   if (!resolve_typebound_generic_call (c->expr1, name))
     return false;
 
+  /* Pass along the NON_OVERRIDABLE attribute of the specific TBP.  */
+  if (overridable)
+    *overridable = !c->expr1->value.compcall.tbp->non_overridable;
+
   /* Transform into an ordinary EXEC_CALL for now.  */
 
   if (!resolve_typebound_static (c->expr1, &target, &newactual))
@@ -5959,7 +5963,7 @@ resolve_typebound_subroutine (gfc_code *code)
       if (c->ts.u.derived == NULL)
         c->ts.u.derived = gfc_find_derived_vtab (declared);
 
-      if (!resolve_typebound_call (code, &name))
+      if (!resolve_typebound_call (code, &name, NULL))
         return false;
 
       /* Use the generic name if it is there.  */
@@ -5991,7 +5995,7 @@ resolve_typebound_subroutine (gfc_code *code)
     }
 
   if (st == NULL)
-    return resolve_typebound_call (code, NULL);
+    return resolve_typebound_call (code, NULL, NULL);
 
   if (!resolve_ref (code->expr1))
     return false;
@@ -6004,10 +6008,10 @@ resolve_typebound_subroutine (gfc_code *code)
       || (!class_ref && st->n.sym->ts.type != BT_CLASS))
     {
       gfc_free_ref_list (new_ref);
-      return resolve_typebound_call (code, NULL);
+      return resolve_typebound_call (code, NULL, NULL);
     }
 
-  if (!resolve_typebound_call (code, &name))
+  if (!resolve_typebound_call (code, &name, &overridable))
     {
       gfc_free_ref_list (new_ref);
       return false;

! { dg-do compile }
!
! PR 64244: [4.8/4.9/5 Regression] ICE at class.c:236 when using non_overridable
!
! Contributed by Ondřej Čertík ondrej.cer...@gmail.com

module m
  implicit none

  type :: A
  contains
    generic :: f => g
    procedure, non_overridable :: g
  end type

contains

  subroutine g(this)
    class(A), intent(in) :: this
  end subroutine

end module

program test_non_overridable
  use m, only: A
  implicit none
  class(A), allocatable :: h
  call h%f()
end
Re: [PATCH] Fix simplify_relational_operation_1 (PR rtl-optimization/64316)
On December 15, 2014 7:39:52 PM CET, Jakub Jelinek ja...@redhat.com wrote: Hi! This patch fixes ICE when cmp_mode is some vector mode, creating comparison of a vector with scalar const0_rtx is a bad idea. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK Thanks, Richard. 2014-12-15 Jakub Jelinek ja...@redhat.com PR rtl-optimization/64316 * simplify-rtx.c (simplify_relational_operation_1): For (eq/ne (and x y) x) and (eq/ne (and x y) y) optimizations use CONST0_RTX instead of const0_rtx. * gcc.dg/pr64316.c: New test. --- gcc/simplify-rtx.c.jj 2014-12-12 13:39:50.0 +0100 +++ gcc/simplify-rtx.c 2014-12-15 16:40:33.371447749 +0100 @@ -4561,7 +4561,8 @@ simplify_relational_operation_1 (enum rt rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode); rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0)); - return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx); + return simplify_gen_relational (code, mode, cmp_mode, lhs, +CONST0_RTX (cmp_mode)); } /* Likewise for (eq/ne (and x y) y). */ @@ -4573,7 +4574,8 @@ simplify_relational_operation_1 (enum rt rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode); rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1)); - return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx); + return simplify_gen_relational (code, mode, cmp_mode, lhs, +CONST0_RTX (cmp_mode)); } /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped. 
*/ --- gcc/testsuite/gcc.dg/pr64316.c.jj 2014-12-15 16:46:47.428982539 +0100 +++ gcc/testsuite/gcc.dg/pr64316.c 2014-12-15 16:46:29.0 +0100 @@ -0,0 +1,42 @@ +/* PR rtl-optimization/64316 */ +/* { dg-do compile } */ +/* { dg-options -O3 } */ +/* { dg-additional-options -mavx2 { target { i?86-*-* x86_64-*-* } } } */ + +struct S +{ + unsigned int s; + unsigned long w[]; +}; + +struct S **s; + +int +foo (struct S *x, struct S *y, struct S *z) +{ + unsigned int i; + unsigned long *a, *b, *c; + int r = 0; + for (a = x-w, b = y-w, c = z-w, i = 0; i x-s; i++, a++) +{ + unsigned long d = *b++ *c++; + if (*a != d) + { +r = 1; +*a = d; + } +} + return r; +} + +void +bar (int x) +{ + int p = x - 1; + do +{ + foo (s[x], s[x], s[p]); + p--; +} + while (p 0); +} Jakub
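The rewrite touched by this patch turns (eq/ne (and x y) x) into a comparison of (and (not y) x) against zero. A minimal sketch of that scalar identity follows, checked exhaustively over 8-bit values. It does not model RTL or vector modes (the point of the fix is that the zero must have the comparison mode), and the function names are made up for illustration.

```cpp
#include <cstdint>

// Original form of the comparison: (eq (and x y) x).
bool and_eq_x(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(x & y) == x;
}

// Rewritten form: (eq (and (not y) x) 0).
bool andnot_eq_zero(std::uint8_t x, std::uint8_t y) {
    return static_cast<std::uint8_t>(~y & x) == 0;
}

// Compare both forms over all pairs of 8-bit operands.
bool forms_agree() {
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned y = 0; y < 256; ++y) {
            const std::uint8_t a = static_cast<std::uint8_t>(x);
            const std::uint8_t b = static_cast<std::uint8_t>(y);
            if (and_eq_x(a, b) != andnot_eq_zero(a, b))
                return false;
        }
    }
    return true;
}
```

Both forms ask whether every bit set in x is also set in y, which is why the second one needs a zero of the same shape as its left-hand side.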
patch to fix PR62642
The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62642

The patch was successfully bootstrapped on x86-64. It is difficult for me to make a testcase to check the right code generation. So the patch has no test.

Committed as rev. 218761.

2014-12-15  Vladimir Makarov  vmaka...@redhat.com

        PR target/62642
        * ira.c (rtx_moveable_p): Prevent UNSPEC_VOLATILE moves.

Index: ira.c
===================================================================
--- ira.c (revision 218685)
+++ ira.c (working copy)
@@ -4358,6 +4358,12 @@ rtx_moveable_p (rtx *loc, enum op_type t
     case CLOBBER:
       return rtx_moveable_p (&SET_DEST (x), OP_OUT);
 
+    case UNSPEC_VOLATILE:
+      /* It is a bad idea to consider insns with with such rtl
+         as moveable ones.  The insn scheduler also considers them as barrier
+         for a reason.  */
+      return false;
+
     default:
       break;
     }
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On 12/15/2014 03:42 AM, Dominik Vogt wrote:
> On Fri, Dec 12, 2014 at 10:14:21AM -0800, Richard Henderson wrote:
>> On 12/12/2014 04:06 AM, Dominik Vogt wrote:
>>> I'm not sure I've posted the missing patch anywhere yet, so it's attached to this message. At the moment it enables FFI_TYPE_COMPLEX only for s390[x], but eventually this should be used unconditionally.
>>
>> Thanks for that. I'd been meaning to get around to that. I'll change the test to use FFI_TARGET_HAS_COMPLEX_TYPE and apply it to my branch.
>
> Good. I'm not sure whether it's a good idea to expose FFI_TARGET_HAS_COMPLEX_TYPE as part of the libffi interface though. It was meant as a temporary thing to be removed once all platforms supported by libffi have implemented complex support. A while ago I've posted a patch to change the macro's name to begin with an underscore to make that clearer.

It's our copy of libffi -- I think we can assume any internals we like. Similarly, when I finish writing the bits that allow libffi to handle empty structures, I don't plan to conditionalize libgo, I simply plan to assume it works.

r~
Re: [PATCH] Fix simplify_relational_operation_1 (PR rtl-optimization/64316)
> 2014-12-15  Jakub Jelinek  ja...@redhat.com
>
>         PR rtl-optimization/64316
>         * simplify-rtx.c (simplify_relational_operation_1): For
>         (eq/ne (and x y) x) and (eq/ne (and x y) y) optimizations use
>         CONST0_RTX instead of const0_rtx.
>
>         * gcc.dg/pr64316.c: New test.

OK, thanks.

-- 
Eric Botcazou
C++ PATCH for c++/64297 (TYPE_CANONICAL clash with ref-qualifier)
In this testcase the problem was that we end up with a non-ref-qualified type which has a ref-qualified TYPE_CANONICAL because the middle-end check_qualified_type doesn't know about ref-qualifiers. So apply_memfn_quals needs to set it up on TYPE_CANONICAL as well.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 478c1d31bb46a032d07839c4d1679907a1b5bcf8
Author: Jason Merrill ja...@redhat.com
Date:   Mon Dec 15 14:31:12 2014 -0500

    PR c++/64297
    * typeck.c (apply_memfn_quals): Correct wrong TYPE_CANONICAL.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 7b39816..9368b49 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8945,6 +8945,12 @@ apply_memfn_quals (tree type, cp_cv_quals memfn_quals, cp_ref_qualifier rqual)
   /* This should really have a different TYPE_MAIN_VARIANT, but that gets
      complex.  */
   tree result = build_qualified_type (type, memfn_quals);
+  if (tree canon = TYPE_CANONICAL (result))
+    if (canon != result)
+      /* check_qualified_type doesn't check the ref-qualifier, so make sure
+         TYPE_CANONICAL is correct.  */
+      TYPE_CANONICAL (result)
+        = build_ref_qualified_type (canon, type_memfn_rqual (result));
   result = build_exception_variant (result, TYPE_RAISES_EXCEPTIONS (type));
   return build_ref_qualified_type (result, rqual);
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/ref-qual16.C b/gcc/testsuite/g++.dg/cpp0x/ref-qual16.C
new file mode 100644
index 000..7409418
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/ref-qual16.C
@@ -0,0 +1,13 @@
+// PR c++/64297
+// { dg-do compile { target c++11 } }
+
+struct A {
+  typedef int X;
+  template <int> X m_fn1() const;
+};
+template <typename> struct is_function {};
+is_function<int() const> i;
+struct D {
+  template <typename Y, typename = is_function<Y> > D(Y);
+} b(&A::m_fn1<0>);
+
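
As background for why the canonical type must carry the ref-qualifier, the small check below shows that in C++11 function types differing only in their ref-qualifier are distinct types. This is a stand-alone illustration, not code from the patch, and the alias and function names are invented for the example.

```cpp
#include <type_traits>

// Function types that differ only in the ref-qualifier.
using Plain  = int() const;
using LvalRq = int() const &;
using RvalRq = int() const &&;

// True if the three types are pairwise distinct.
constexpr bool ref_qualifiers_make_distinct_types() {
    return !std::is_same<Plain, LvalRq>::value
        && !std::is_same<Plain, RvalRq>::value
        && !std::is_same<LvalRq, RvalRq>::value;
}

static_assert(ref_qualifiers_make_distinct_types(),
              "the ref-qualifier is part of the function type");
```

A canonical type that ignored the qualifier would make two of these types compare as equivalent, which is the kind of clash the commit message describes.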
Go patch committed: copying call should copy varargs state
Copying a call expression has to copy the state of whether varargs have been lowered. Otherwise the compiler can crash on valid code like

    *f() += 1

when f is a varargs function. This patch from Chris Manghane fixes the problem. This is GCC PR 61255. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 67dd2c649e07 go/expressions.cc
--- a/go/expressions.cc Mon Dec 15 09:32:24 2014 -0800
+++ b/go/expressions.cc Mon Dec 15 12:15:03 2014 -0800
@@ -6552,13 +6552,7 @@
   do_check_types(Gogo*);
 
   Expression*
-  do_copy()
-  {
-    return new Builtin_call_expression(this->gogo_, this->fn()->copy(),
-                                       this->args()->copy(),
-                                       this->is_varargs(),
-                                       this->location());
-  }
+  do_copy();
 
   Bexpression*
   do_get_backend(Translate_context*);
@@ -7986,6 +7980,20 @@
     }
 }
 
+Expression*
+Builtin_call_expression::do_copy()
+{
+  Call_expression* bce =
+    new Builtin_call_expression(this->gogo_, this->fn()->copy(),
+                                this->args()->copy(),
+                                this->is_varargs(),
+                                this->location());
+
+  if (this->varargs_are_lowered())
+    bce->set_varargs_are_lowered();
+  return bce;
+}
+
 // Return the backend representation for a builtin function.
 
 Bexpression*
@@ -9126,6 +9134,21 @@
     }
 }
 
+Expression*
+Call_expression::do_copy()
+{
+  Call_expression* call =
+    Expression::make_call(this->fn_->copy(),
+                          (this->args_ == NULL
+                           ? NULL
+                           : this->args_->copy()),
+                          this->is_varargs_, this->location());
+
+  if (this->varargs_are_lowered_)
+    call->set_varargs_are_lowered();
+  return call;
+}
+
 // Return whether we have to use a temporary variable to ensure that
 // we evaluate this call expression in order.  If the call returns no
 // results then it will inevitably be executed last.
diff -r 67dd2c649e07 go/expressions.h
--- a/go/expressions.h Mon Dec 15 09:32:24 2014 -0800
+++ b/go/expressions.h Mon Dec 15 12:15:03 2014 -0800
@@ -1683,6 +1683,11 @@
   is_varargs() const
   { return this->is_varargs_; }
 
+  // Return whether varargs have already been lowered.
+  bool
+  varargs_are_lowered() const
+  { return this->varargs_are_lowered_; }
+
   // Note that varargs have already been lowered.
   void
   set_varargs_are_lowered()
@@ -1738,14 +1743,7 @@
   do_check_types(Gogo*);
 
   Expression*
-  do_copy()
-  {
-    return Expression::make_call(this->fn_->copy(),
-                                 (this->args_ == NULL
-                                  ? NULL
-                                  : this->args_->copy()),
-                                 this->is_varargs_, this->location());
-  }
+  do_copy();
 
   bool
   do_must_eval_in_order() const;
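
The failure mode described in this message (a copy that is rebuilt from constructor arguments and so forgets that varargs were already lowered) can be shown with a much simplified sketch. CallExpr and its members are hypothetical stand-ins, not the gofrontend classes, and lower() here only flips a flag.

```cpp
// Hypothetical, much simplified stand-in for a call expression node.
class CallExpr {
public:
    explicit CallExpr(bool is_varargs) : is_varargs_(is_varargs) {}

    bool varargs_are_lowered() const { return varargs_are_lowered_; }
    void set_varargs_are_lowered() { varargs_are_lowered_ = true; }

    // Lowering must happen at most once per call.
    // Returns true if this invocation actually performed the lowering.
    bool lower() {
        if (!is_varargs_ || varargs_are_lowered_)
            return false;
        set_varargs_are_lowered();
        return true;
    }

    // Copy rebuilt from the constructor arguments only:
    // the lowered flag is silently dropped.
    CallExpr copy_dropping_state() const { return CallExpr(is_varargs_); }

    // Copy that also carries over the lowered flag.
    CallExpr copy_keeping_state() const {
        CallExpr c(is_varargs_);
        if (varargs_are_lowered_)
            c.set_varargs_are_lowered();
        return c;
    }

private:
    bool is_varargs_;
    bool varargs_are_lowered_ = false;
};
```

A copy made with copy_dropping_state would be lowered a second time, whereas the copy made with copy_keeping_state is left alone, mirroring the two do_copy changes in the patch.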
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
> What do you think we should relax it to though? Obviously there's a balance here between relaxing things enough and not relaxing them too far (so that the EImode AArch64 thing I mentioned is still a noisy failure, for example). ISTM the patch deals with the only significant case that is obviously safe for modes that are not a power of 2 in size.

Apparently the change wants to accept general subregs with not only modes whose sizes are multiple of each other but also whose sizes are multiple of a common large enough value. That clearly goes against:

  /* This should always pass, otherwise we don't know how to verify
     the constraint.  These conditions may be relaxed but
     subreg_regno_offset would need to be redesigned.  */
  gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0);
  gcc_assert ((nregs_xmode % nregs_ymode) == 0);

so I think that we should formulate the new requirement and implement it in the main part of the function, instead of adding it as a kludge.

> If you're saying that the condition itself is OK, but that the code should be further down in the function, then I don't think that would gain much. We already have early-outs for the simple cases, such as:

Right, but they are more of special cases and this one is not.

-- 
Eric Botcazou
Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.
Bin.Cheng wrote:
> do we have some compilation time benchmarks for GCC?

I'm using the llvm test-suite to see compile time differences:

$ git clone http://llvm.org/git/test-suite.git /path/to/test-suite
$ /path/to/test-suite/configure --without-llvmsrc --without-llvmobj --with-externals=/path/to/spec
$ make -k TEST=simple TARGET_LLVMGCC=/path/to/gcc TARGET_CXX=/path/to/g++ TARGET_CC=/path/to/gcc TARGET_LLVMGXX=/path/to/g++ CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= USE_REFERENCE_OUTPUT=1 CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS=-O3 LLC_OPTFLAGS=-O3 ENABLE_OPTIMIZED=1 ARCH=AArch64 ENABLE_HASHED_PROGRAM_OUTPUT=1 DISABLE_JIT=1 report report.simple.csv
$ head -1 report.simple.csv
Program,CC,CC_Time,CC_Real_Time,Exec,Exec_Time,Exec_Real_Time
$ awk -F, '{print $1, $3 }' report.simple.csv

Here is how to get benchmark code size:

$ make -k TEST=codesize TARGET_LLVMGCC=/path/to/gcc TARGET_CXX=/path/to/g++ TARGET_CC=/path/to/gcc TARGET_LLVMGXX=/path/to/g++ TARGET_FLAGS= USE_REFERENCE_OUTPUT=1 CC_UNDER_TEST_TARGET_IS_AARCH64=1 CC_UNDER_TEST_IS_CLANG=1 OPTFLAGS=-O3 LLC_OPTFLAGS=-O3 ENABLE_OPTIMIZED=1 ARCH=AArch64 ENABLE_HASHED_PROGRAM_OUTPUT=1 DISABLE_JIT=1 2>/dev/null | grep ^size: test.codesize.txt
[PATCH, i386]: Move TARGET_CAN_SPLIT_STACK to config/i386/gnu-user-common.h
Hello!

2014-12-15  Uros Bizjak  ubiz...@gmail.com

        * config/i386/gnu-user.h (TARGET_CAN_SPLIT_STACK): Move from here ...
        * config/i386/gnu-user64.h (TARGET_CAN_SPLIT_STACK): ... and here ...
        * config/i386/gnu-user-common.h (TARGET_CAN_SPLIT_STACK): ... to here.

Bootstrapped and regtested on x86_64-linux-gnu {,-m32}, including go.

Will commit tomorrow to mainline if there are no objections.

Uros.

Index: config/i386/gnu-user-common.h
===================================================================
--- config/i386/gnu-user-common.h (revision 218753)
+++ config/i386/gnu-user-common.h (working copy)
@@ -64,3 +64,9 @@
 
 /* Static stack checking is supported by means of probes.  */
 #define STACK_CHECK_STATIC_BUILTIN 1
+
+/* We only build the -fsplit-stack support in libgcc if the
+   assembler has full support for the CFI directives.  */
+#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE
+#define TARGET_CAN_SPLIT_STACK
+#endif
Index: config/i386/gnu-user.h
===================================================================
--- config/i386/gnu-user.h (revision 218753)
+++ config/i386/gnu-user.h (working copy)
@@ -154,11 +154,6 @@
 /* i386 glibc provides __stack_chk_guard in %gs:0x14.  */
 #define TARGET_THREAD_SSP_OFFSET 0x14
 
-/* We only build the -fsplit-stack support in libgcc if the
-   assembler has full support for the CFI directives.  */
-#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE
-#define TARGET_CAN_SPLIT_STACK
-#endif
 /* We steal the last transactional memory word.  */
 #define TARGET_THREAD_SPLIT_STACK_OFFSET 0x30
 #endif
Index: config/i386/gnu-user64.h
===================================================================
--- config/i386/gnu-user64.h (revision 218753)
+++ config/i386/gnu-user64.h (working copy)
@@ -85,11 +85,6 @@
 #define TARGET_THREAD_SSP_OFFSET \
   (TARGET_64BIT ? (TARGET_X32 ? 0x18 : 0x28) : 0x14)
 
-/* We only build the -fsplit-stack support in libgcc if the
-   assembler has full support for the CFI directives.  */
-#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE
-#define TARGET_CAN_SPLIT_STACK
-#endif
 /* We steal the last transactional memory word.  */
 #define TARGET_THREAD_SPLIT_STACK_OFFSET \
   (TARGET_64BIT ? (TARGET_X32 ? 0x40 : 0x70) : 0x30)
Re: [Patch] Improving jump-thread pass for PR 54742
Richard Biener wrote:

On the llvm test-suite, I have seen one ICE with my fsm jump-thread patch. This patch fixes the problem:

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 12f83ba..f8c736e 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2564,6 +2564,7 @@ thread_through_all_blocks (bool may_peel_loop_headers)
   FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
     {
       if (!loop->header
+          || !loop_latch_edge (loop)
           || !bitmap_bit_p (threaded_blocks, loop->header->index))
         continue;

       retval |= thread_through_loop_header (loop, may_peel_loop_headers);

Ok to commit after regstrap?

This seems to be indicating that we have with no edge from the latch block to the header block. I'd like to know better how we got into that state.

It Also returns null for loops with multiple latches. So the patch looks OK for me.

The bug I was seeing has been fixed by the patch for:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64284

Thanks,
Sebastian
Re: [PATCH][SPARC] default with_cpu to ultrasparc in sparc64-*-linux* targets
> If no --with-cpu is specified at configure time gcc/config.gcc sets the cpu option in configure_default_options to `v9' in sparc64 targets. This leads to the usage of the following spec by the driver:
>
>   %{!m32:%{!mcpu=*:-mcpu=v9}}
>
> Which in turn triggers the usage of -Av9 by default when invoking the assembler. This leads to failures when VIS instructions are used in inline assembly or .s files:
>
>   [jemarch@install2 gcc]$ echo 'int main () { asm ("fzero %f0"); return 0; }' | gcc -xc -
>   /tmp/cc1F9iJm.s: Assembler messages:
>   /tmp/cc1F9iJm.s:11: Error: Architecture mismatch on fzero.
>   /tmp/cc1F9iJm.s:11: (Requires v9a|v9b; requested architecture is v9.)

I think that passing -mcpu=ultrasparc -mvis is a reasonable expectation when VIS instructions are used.

> This prevents building upstream glibc with a gcc configured with no --with-cpu option, for example.

Certainly annoying.

> I think it would be reasonable to have gcc targeting ultrasparc extensions by default in sparc64-*-linux*. WDYT?

No strong opinion. FreeBSD and OpenBSD already do it so why not? DaveM, any opinion?

-- 
Eric Botcazou
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
Eric Botcazou ebotca...@adacore.com writes:
>> What do you think we should relax it to though? Obviously there's a balance here between relaxing things enough and not relaxing them too far (so that the EImode AArch64 thing I mentioned is still a noisy failure, for example). ISTM the patch deals with the only significant case that is obviously safe for modes that are not a power of 2 in size.
>
> Apparently the change wants to accept general subregs with not only modes whose sizes are multiple of each other but also whose sizes are multiple of a common large enough value. That clearly goes against:
>
>   /* This should always pass, otherwise we don't know how to verify
>      the constraint.  These conditions may be relaxed but
>      subreg_regno_offset would need to be redesigned.  */
>   gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0);
>   gcc_assert ((nregs_xmode % nregs_ymode) == 0);
>
> so I think that we should formulate the new requirement and implement it in the main part of the function, instead of adding it as a kludge.

Please be more specific though. If you don't think the patch is correct, what do you think the requirement should be and how should it be integrated into the existing checks?

E.g. the assert is there because the main calculation is based on:

  /* Size of ymode must not be greater than the size of xmode.  */
  mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
  gcc_assert (mode_multiple != 0);

which clearly isn't a useful value if the division isn't exact. Do you mean that, since mode_multiple isn't correct for the DI-of-a-CI case, we should reformulate the end of the function to avoid using mode_multiple at all?

Thanks,
Richard
Re: Fix streaming of target optimization/option nodes
> On Mon, 15 Dec 2014, Jan Hubicka wrote:
>
> > Hi,
> > actually this patch breaks Fortran, I get a streaming error in:
> > lto1: internal compiler error: in streamer_get_pickled_tree
> > apparently picking error_mark_node for variable constructor results in reading integer_type... ?
>
> Probably the default nodes are referenced by another builtin tree instead and you get inconsistent streaming between f951 and lto1. See the assert placed into record_common_node which you should extend to cover the optimization node trees.

It seems that the whole common node preloading is a major can of worms ;(. Anyway the problem here is that record_common_node replaces every NULL by error_mark_node. It thus matters what is the last NULL pointer recorded. It used to be TI_CURRENT_OPTIMIZE_PRAGMA but now it is TI_PID_TYPE in some cases, TI_MAIN_IDENTIFIER in others and the real error_mark_node in the rest of the cases.

I am testing the following.

Honza

Index: tree-streamer.c
===================================================================
--- tree-streamer.c (revision 218726)
+++ tree-streamer.c (working copy)
@@ -324,7 +324,18 @@ preload_common_nodes (struct streamer_tr
     /* Skip boolean type and constants, they are frontend dependent.  */
     if (i != TI_BOOLEAN_TYPE
         && i != TI_BOOLEAN_FALSE
-        && i != TI_BOOLEAN_TRUE)
+        && i != TI_BOOLEAN_TRUE
+        /* MAIN_IDENTIFIER is not always initialized by Fortran FE.  */
+        && i != TI_MAIN_IDENTIFIER
+        /* PID_TYPE is initialized only by C family front-ends.  */
+        && i != TI_PID_TYPE
+        /* Skip optimization and target option nodes; they depend on flags.  */
+        && i != TI_OPTIMIZATION_DEFAULT
+        && i != TI_OPTIMIZATION_CURRENT
+        && i != TI_TARGET_OPTION_DEFAULT
+        && i != TI_TARGET_OPTION_CURRENT
+        && i != TI_CURRENT_TARGET_PRAGMA
+        && i != TI_CURRENT_OPTIMIZE_PRAGMA)
       record_common_node (cache, global_trees[i]);
 }
Re: Do not build callgraph for external functions when inlining
Jan Hubicka hubi...@ucw.cz writes:

        * cgraphunit.c (analyze_functions): Do not analyze extern inline
        functions when not optimizing; skip comdat locals.

FAIL: g++.dg/torture/pr60854.C -O0 (test for excess errors)
Excess errors:
/usr/local/gcc/gcc-20141215/gcc/testsuite/g++.dg/torture/pr60854.C:5:46: error: inlining failed in call to always_inline 'MyClass<T>::MyClass() [with T = double]': function body not available
/usr/local/gcc/gcc-20141215/gcc/testsuite/g++.dg/torture/pr60854.C:12:19: error: called from here

Hi, this should be fixed by the patch to handle extern aliases correctly that I committed later that day. Does the problem still reproduce for you?

Honza

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
And now for something completely different.
[C++ Patch] PR 58650
Hi,

avoid crashing later in build_this_parm during error recovery. Tested x86_64-linux.

Thanks,
Paolo.

/cp
2014-12-15  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58650
        * decl.c (grokdeclarator): Avoid crashing on an initialized
        non-static data member wrongly declared friend.

/testsuite
2014-12-15  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58650
        * g++.dg/parse/friend12.C: New.

Index: cp/decl.c
===================================================================
--- cp/decl.c (revision 218764)
+++ cp/decl.c (working copy)
@@ -10803,6 +10803,7 @@ grokdeclarator (const cp_declarator *declarator,
                   error ("%qE is neither function nor member function; "
                          "cannot be declared friend", unqualified_id);
                   friendp = 0;
+                  type = error_mark_node;
                 }
               decl = NULL_TREE;
             }
Index: testsuite/g++.dg/parse/friend12.C
===================================================================
--- testsuite/g++.dg/parse/friend12.C (revision 0)
+++ testsuite/g++.dg/parse/friend12.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/58650
+
+struct A
+{
+  friend int i = 0;  // { dg-error "cannot be declared friend" }
+// { dg-error "non-static data member" "" { target { ! c++11 } } 5 }
+};
[Patch, Fortran] PR54687 - Fortran options cleanup (part 2)
This patch is a follow up to https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01068.html and converts more flags to the common diagnostic handling. I think the rest can only be converted by modifying the *.opt syntax, but I might be wrong.

The patch is relative to the Fortran-part approved patch https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01068.html - and fixes some minor issues I found there compared to the old version (e.g. missing ToLower in *opt or vs. for fmax-array-constructor).

Built and currently regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

2014-12-15  Tobias Burnus  bur...@net-b.de

        * lang.opt (fsecond-underscore, frecord-marker=8, frecord-marker=4,
        frealloc-lhs, freal-8-real-16, freal-8-real-10, freal-8-real-4,
        freal-4-real-16, freal-4-real-10, freal-4-real-8, fprotect-parens,
        fstack-arrays, fmax-stack-var-size=, fmax-subrecord-length=,
        ffrontend-optimize, ffree-line-length-, ffixed-line-length-,
        finteger-4-integer-8, fdefault-real-8, fdefault-integer-8,
        fdefault-double-8): Add Var() and Init().
        (finit-real=): Add ToLower.
        * gfortran.h (gfc_option_t): Remove moved flags.
        * options.c (gfc_init_options, gfc_handle_option): Ditto.
        (gfc_post_options): Update for name change.
        * decl.c (gfc_match_old_kind_spec, gfc_match_kind_spec): Handle
        flag-name change.
        * frontend-passes.c (gfc_run_passes): Ditto.
        * module.c (use_iso_fortran_env_module): Ditto.
        * primary.c (match_integer_constant, match_real_constant): Ditto.
        * resolve.c (resolve_ordinary_assign): Ditto.
        * scanner.c (gfc_next_char_literal, load_line): Ditto.
        * trans-array.c (gfc_trans_allocate_array_storage,
        gfc_conv_resolve_dependencies, gfc_trans_auto_array_allocation,
        gfc_conv_ss_startstride): Ditto.
        * trans-common.c (gfc_sym_mangled_common_id): Ditto.
        * trans-decl.c (gfc_sym_mangled_function_id,
        create_main_function): Ditto.
        * trans-expr.c (gfc_conv_expr_op, gfc_conv_procedure_call,
        arrayfunc_assign_needs_temporary, gfc_trans_arrayfunc_assign,
        gfc_trans_assignment_1): Ditto.
        * trans-stmt.c (gfc_trans_allocate): Ditto.
        * trans-types.c (gfc_init_kinds): Ditto.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index f33d65c..8d01c45 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -2140,28 +2140,28 @@ gfc_match_old_kind_spec (gfc_typespec *ts)
     }
 
-  if (ts->type == BT_INTEGER && ts->kind == 4 && gfc_option.flag_integer4_kind == 8)
+  if (ts->type == BT_INTEGER && ts->kind == 4 && flag_integer4_kind == 8)
     ts->kind = 8;
 
   if (ts->type == BT_REAL || ts->type == BT_COMPLEX)
     {
       if (ts->kind == 4)
         {
-          if (gfc_option.flag_real4_kind == 8)
+          if (flag_real4_kind == 8)
             ts->kind = 8;
-          if (gfc_option.flag_real4_kind == 10)
+          if (flag_real4_kind == 10)
             ts->kind = 10;
-          if (gfc_option.flag_real4_kind == 16)
+          if (flag_real4_kind == 16)
             ts->kind = 16;
         }
 
       if (ts->kind == 8)
         {
-          if (gfc_option.flag_real8_kind == 4)
+          if (flag_real8_kind == 4)
             ts->kind = 4;
-          if (gfc_option.flag_real8_kind == 10)
+          if (flag_real8_kind == 10)
             ts->kind = 10;
-          if (gfc_option.flag_real8_kind == 16)
+          if (flag_real8_kind == 16)
             ts->kind = 16;
         }
     }
@@ -2311,28 +2311,28 @@ kind_expr:
   if(m == MATCH_ERROR)
      gfc_current_locus = where;
 
-  if (ts->type == BT_INTEGER && ts->kind == 4 && gfc_option.flag_integer4_kind == 8)
+  if (ts->type == BT_INTEGER && ts->kind == 4 && flag_integer4_kind == 8)
     ts->kind = 8;
 
   if (ts->type == BT_REAL || ts->type == BT_COMPLEX)
     {
       if (ts->kind == 4)
         {
-          if (gfc_option.flag_real4_kind == 8)
+          if (flag_real4_kind == 8)
             ts->kind = 8;
-          if (gfc_option.flag_real4_kind == 10)
+          if (flag_real4_kind == 10)
             ts->kind = 10;
-          if (gfc_option.flag_real4_kind == 16)
+          if (flag_real4_kind == 16)
             ts->kind = 16;
         }
 
       if (ts->kind == 8)
         {
-          if (gfc_option.flag_real8_kind == 4)
+          if (flag_real8_kind == 4)
             ts->kind = 4;
-          if (gfc_option.flag_real8_kind == 10)
+          if (flag_real8_kind == 10)
             ts->kind = 10;
-          if (gfc_option.flag_real8_kind == 16)
+          if (flag_real8_kind == 16)
             ts->kind = 16;
         }
     }
diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index 02f8e89..7d59f2e 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -104,7 +104,7 @@ gfc_run_passes (gfc_namespace *ns)
   doloop_warn (ns);
   doloop_list.release ();
 
-  if (gfc_option.flag_frontend_optimize)
+  if (flag_frontend_optimize)
     {
       optimize_namespace (ns);
       optimize_reduction (ns);
@@ -376,7 +376,7 @@ cfe_register_funcs (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
      temporary variable to hold the intermediate result, but only if
      allocation on assignment is active.  */
 
-  if ((*e)->rank > 0 && (*e)->shape == NULL && !gfc_option.flag_realloc_lhs)
+  if ((*e)->rank > 0 && (*e)->shape == NULL && !flag_realloc_lhs)
     return 0;
 
   /* Skip the test for pure functions if -faggressive-function-elimination
diff --git
patch to fix PR63397
The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63397

The patch was successfully bootstrapped on x86-64.

Committed as rev. 218766.

2014-12-15  Vladimir Makarov  vmaka...@redhat.com

        PR rtl-optimization/63397
        * ira-int.h (ira_overall_cost, ira_reg_cost, ira_mem_cost): Use
        int64_t.
        (ira_load_cost, ira_store_cost, ira_shuffle_cost): Ditto.
        * ira.c (ira_overall_cost, ira_overall_cost_before): Ditto.
        (ira_reg_cost, ira_mem_cost): Ditto.
        (ira_load_cost, ira_store_cost, ira_shuffle_cost): Ditto.
        (calculate_allocation_cost, do_reload): Use the right
        format for int64_t values.

Index: ira.c
===================================================================
--- ira.c (revision 218761)
+++ ira.c (working copy)
@@ -431,9 +431,9 @@ struct ira_spilled_reg_stack_slot *ira_s
    the allocnos assigned to memory, cost of loads, stores and register
    move insns generated for pseudo-register live range splitting (see
    ira-emit.c).  */
-int ira_overall_cost, overall_cost_before;
-int ira_reg_cost, ira_mem_cost;
-int ira_load_cost, ira_store_cost, ira_shuffle_cost;
+int64_t ira_overall_cost, overall_cost_before;
+int64_t ira_reg_cost, ira_mem_cost;
+int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost;
 int ira_move_loops_num, ira_additional_jumps_num;
 
 /* All registers that can be eliminated.  */
@@ -2489,10 +2489,15 @@ calculate_allocation_cost (void)
   if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file,
-               "+++Costs: overall %d, reg %d, mem %d, ld %d, st %d, move %d\n",
+               "+++Costs: overall %" PRId64
+               ", reg %" PRId64
+               ", mem %" PRId64
+               ", ld %" PRId64
+               ", st %" PRId64
+               ", move %" PRId64,
                ira_overall_cost, ira_reg_cost, ira_mem_cost,
                ira_load_cost, ira_store_cost, ira_shuffle_cost);
-      fprintf (ira_dump_file, "+++ move loops %d, new jumps %d\n",
+      fprintf (ira_dump_file, "\n+++ move loops %d, new jumps %d\n",
                ira_move_loops_num, ira_additional_jumps_num);
     }
 
@@ -5422,7 +5427,8 @@ do_reload (void)
 
   if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL
       && overall_cost_before != ira_overall_cost)
-    fprintf (ira_dump_file, "+++Overall after reload %d\n", ira_overall_cost);
+    fprintf (ira_dump_file, "+++Overall after reload %" PRId64 "\n",
+             ira_overall_cost);
 
   flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots;
 
Index: ira-int.h
===================================================================
--- ira-int.h (revision 218760)
+++ ira-int.h (working copy)
@@ -620,9 +620,9 @@ extern struct ira_spilled_reg_stack_slot
    allocnos assigned to hard-registers, cost of the allocnos assigned
    to memory, cost of loads, stores and register move insns generated
    for pseudo-register live range splitting (see ira-emit.c).  */
-extern int ira_overall_cost;
-extern int ira_reg_cost, ira_mem_cost;
-extern int ira_load_cost, ira_store_cost, ira_shuffle_cost;
+extern int64_t ira_overall_cost;
+extern int64_t ira_reg_cost, ira_mem_cost;
+extern int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost;
 extern int ira_move_loops_num, ira_additional_jumps_num;
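For readers unfamiliar with the PRId64 idiom used in the patch above, here is a small standalone C sketch. The helper names cost_sum and format_overall are hypothetical and not part of IRA; the sketch only assumes a hosted C99 environment with <inttypes.h>. It shows a total that exceeds the range of a 32-bit int being accumulated in an int64_t and then formatted with the matching conversion macro.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sum N individual costs into a 64-bit total, so that the sum cannot
   wrap the way a plain int accumulator could.  Hypothetical helper.  */
int64_t
cost_sum (const int *costs, size_t n)
{
  assert (costs != NULL || n == 0);
  int64_t total = 0;
  for (size_t i = 0; i < n; i++)
    total += costs[i];
  return total;
}

/* Format TOTAL into BUF in the style of the dump line, using PRId64 so
   the conversion specifier matches int64_t on every host.  Returns the
   value returned by snprintf.  */
int
format_overall (char *buf, size_t size, int64_t total)
{
  return snprintf (buf, size, "+++Overall after reload %" PRId64 "\n", total);
}
```

PRId64 expands to a string literal, which is why the format string in the patch is split into adjacent pieces that the compiler concatenates.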
Re: [C++ Patch] PR 58650
Why does error recovery fail? I would expect to be able to just drop the 'friend' and treat it as a normal non-static data member. Jason
Re: Fix streaming of target optimization/option nodes
Hi,

this is the final version I committed. 20110201-1_0.c actually tests that we optimize cabs in a function compiled with -O0, which is no longer supposed to happen.

Honza

        PR lto/64043
        * gcc.dg/lto/20110201-1_0.c: New testcase.
        * tree-streamer.c (preload_common_nodes): Skip preloading
        of main_identifier_node, pid_type and optimization/option nodes.

Index: testsuite/gcc.dg/lto/20110201-1_0.c
===================================================================
--- testsuite/gcc.dg/lto/20110201-1_0.c (revision 218726)
+++ testsuite/gcc.dg/lto/20110201-1_0.c (working copy)
@@ -1,6 +1,5 @@
 /* { dg-lto-do run } */
 /* { dg-lto-options { { -O0 -flto } } } */
-/* { dg-extra-ld-options "-O2 -ffast-math -fuse-linker-plugin" } */
 /* { dg-require-linker-plugin "" } */
 
 /* We require a linker plugin because otherwise we'd need to link
@@ -9,7 +8,7 @@
    which does not have folded cabs.  */
 
 double cabs(_Complex double);
-double __attribute__((used))
+double __attribute__((used)) __attribute__ ((optimize ("O2,fast-math")))
 foo (_Complex double x, int b)
 {
   if (b)
Index: tree-streamer.c
===================================================================
--- tree-streamer.c (revision 218726)
+++ tree-streamer.c (working copy)
@@ -324,7 +324,18 @@ preload_common_nodes (struct streamer_tr
     /* Skip boolean type and constants, they are frontend dependent.  */
     if (i != TI_BOOLEAN_TYPE
         && i != TI_BOOLEAN_FALSE
-        && i != TI_BOOLEAN_TRUE)
+        && i != TI_BOOLEAN_TRUE
+        /* MAIN_IDENTIFIER is not always initialized by Fortran FE.  */
+        && i != TI_MAIN_IDENTIFIER
+        /* PID_TYPE is initialized only by C family front-ends.  */
+        && i != TI_PID_TYPE
+        /* Skip optimization and target option nodes; they depend on flags.  */
+        && i != TI_OPTIMIZATION_DEFAULT
+        && i != TI_OPTIMIZATION_CURRENT
+        && i != TI_TARGET_OPTION_DEFAULT
+        && i != TI_TARGET_OPTION_CURRENT
+        && i != TI_CURRENT_TARGET_PRAGMA
+        && i != TI_CURRENT_OPTIMIZE_PRAGMA)
       record_common_node (cache, global_trees[i]);
 }
[PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
This weekend I was running GDB's testsuite with many options enabled, and I noticed that, for some specific configurations (specifically when testing gdbserver), I was getting the following error:

  dg-extract-results.sh: sum files are for multiple tools, specify a tool

I remembered seeing this a lot before, so I spent some time investigating the cause...

First, I found the line in dg-extract-results.sh that printed this error message. The code does:

  CNT=`grep '=== .* tests ===' $SUM_FILES --text | $AWK '{ print $3 }' | sort -u | wc -l`
  if [ $CNT -eq 1 ]; then
    TOOL=`grep '=== .* tests ===' $FIRST_SUM --text | $AWK '{ print $2 }'`
  else
    msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
    msg ""
    usage
    exit 1
  fi

So, the first thing to do was to identify why $CNT was not 1. When I ran the command that generated the result for CNT, I found:

  $ grep '=== .* tests ===' `find outputs -name gdb.log -print` \
      | awk '{ print $3 }' | sort -u | wc -l
  7

Hm, strange. So, removing the wc command, the output was:

  gdb
  outputs/gdb.base/gdb-sigterm/gdb.log
  outputs/gdb.threads/non-ldr-exc-1/gdb.log
  outputs/gdb.threads/non-ldr-exc-2/gdb.log
  outputs/gdb.threads/non-ldr-exc-3/gdb.log
  outputs/gdb.threads/non-ldr-exc-4/gdb.log
  outputs/gdb.threads/thread-execl/gdb.log

And, when I used only the grep command, without the awk and the sort, I saw that the majority of the lines were like this:

  outputs/gdb.trace/tfind/gdb.log:=== gdb tests ===

Which would generate the first line in the output above, gdb. But, for the other 6 files above, I saw:

  Binary file outputs/gdb.base/gdb-sigterm/gdb.log matches

Right, the problem is that grep is assuming those 6 files are binary, not text. This happens because of this code, in grep:

  http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n526

  static enum textbin
  buffer_textbin (char *buf, size_t size)
  {
    if (eolbyte && memchr (buf, '\0', size))
      return TEXTBIN_BINARY;
    ...

If one looks at those 6 files, one will find that they contain the NUL byte. It is printed in the same message in all of them, by gdbserver's code:

  input_interrupt, count = 0 c = 0 ('^@')

(The ^@ above is the NUL byte.)

Maybe the right fix would be to improve input_interrupt in gdbserver/remote-utils.c (see PR server/16359), but I decided to go the easier route and adjust dg-extract-results.sh to be more robust when dealing with the sum and log files. To do that, I suggest passing the '--text' option to grep, which overrides grep's machinery to identify if the file is binary and forces it to treat every file as text. For me, it makes sense to do that because sum and log files will always be text, no matter what happens.

It is also worth noticing that the Python version of dg-extract-results already takes care of binary files.

OK to apply?

2014-12-14  Sergio Durigan Junior  sergi...@redhat.com

        * dg-extract-results.sh: Pass '--text' option to grep when filtering
        .{sum,log} files, which may contain binary data.

---
 contrib/dg-extract-results.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/dg-extract-results.sh b/contrib/dg-extract-results.sh
index a83c8e8..2a85ad4 100755
--- a/contrib/dg-extract-results.sh
+++ b/contrib/dg-extract-results.sh
@@ -131,9 +131,9 @@ if [ -z "$TOOL" ]; then
   # If no tool was specified, all specified summary files must be for
   # the same tool.
 
-  CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
+  CNT=`grep '=== .* tests ===' $SUM_FILES --text | $AWK '{ print $3 }' | sort -u | wc -l`
   if [ $CNT -eq 1 ]; then
-    TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
+    TOOL=`grep '=== .* tests ===' $FIRST_SUM --text | $AWK '{ print $2 }'`
   else
     msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
     msg ""
@@ -144,7 +144,7 @@ else
   # Ignore the specified summary files that are not for this tool.  This
   # should keep the relevant files in the same order.
 
-  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES`
+  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES --text`
   if test -z "$SUM_FILES" ; then
     msg "${PROGNAME}: none of the specified files are results for $TOOL"
     exit 1
@@ -233,7 +233,7 @@ else
   VARIANTS=""
   for VAR in $VARS
   do
-    grep "Running target $VAR" $SUM_FILES > /dev/null && VARIANTS="$VARIANTS $VAR"
+    grep "Running target $VAR" $SUM_FILES --text > /dev/null && VARIANTS="$VARIANTS $VAR"
   done
 fi
 
--
1.9.3
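The NUL-byte heuristic quoted from grep in the message above can be sketched in a few lines of C. This is a simplified illustration, not grep's buffer_textbin: the name looks_binary is hypothetical, and the eolbyte condition and the encoding checks that the real function goes on to perform are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if the SIZE bytes at BUF contain a NUL byte, which is
   the part of the heuristic that makes a log file count as binary.
   Hypothetical helper; an empty buffer is treated as text.  */
int
looks_binary (const char *buf, size_t size)
{
  assert (buf != NULL || size == 0);
  return size != 0 && memchr (buf, '\0', size) != NULL;
}
```

Under this check a buffer holding "=== gdb tests ===" is text, while one holding the gdbserver message with its embedded NUL is classified as binary, which matches the "Binary file ... matches" output reported above.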
[rl78] Remove unneeded SHORT_IMMEDIATES_SIGN_EXTEND
The code that this macro enables will never do anything anyway on RL78. Applied.

2014-12-15  DJ Delorie  d...@redhat.com

        * config/rl78/rl78.h: Remove SHORT_IMMEDIATES_SIGN_EXTEND.

Index: config/rl78/rl78.h
===================================================================
--- config/rl78/rl78.h (revision 218766)
+++ config/rl78/rl78.h (working copy)
@@ -141,13 +141,12 @@
 #define MOVE_RATIO(SPEED) ((SPEED) ? 24 : 16)
 #define SLOW_BYTE_ACCESS 0
 
 #define STORE_FLAG_VALUE 1
 #define LOAD_EXTEND_OP(MODE) ZERO_EXTEND
-#define SHORT_IMMEDIATES_SIGN_EXTEND 0
 
 
 /* The RL78 has four register banks.  Normal operation uses RB0 as
    real registers, RB1 and RB2 as virtual registers (because we know
    they'll be there, and not used as variables), and RB3 is reserved
    for interrupt handlers.  The virtual registers are accessed as
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Mon, Dec 15, 2014 at 05:37:03PM -0500, Sergio Durigan Junior wrote:
> 2014-12-14  Sergio Durigan Junior  sergi...@redhat.com
>
>         * dg-extract-results.sh: Pass '--text' option to grep when filtering
>         .{sum,log} files, which may contain binary data.

I'd be surprised if all versions of grep supported --text option (e.g. POSIX doesn't mention the -a nor --text options), guess you'd need to check for that first (early in the script) and add it only if it works.

Also, supposedly the options should come before the regexp and list of files.

Why isn't the python version used in your case btw?

        Jakub
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Monday, December 15 2014, Jakub Jelinek wrote:

> I'd be surprised if all versions of grep supported --text option (e.g. POSIX doesn't mention the -a nor --text options), guess you'd need to check for that first (early in the script) and add it only if it works.

Thanks for the review, Jakub.

Right, it makes sense to check that indeed. Take a look at the attached patch please.

> Also, supposedly the options should come before the regexp and list of files.

Right, fixed.

> Why isn't the python version used in your case btw?

Because GDB has not yet merged the new Python version into our codebase. I am working on it now, too.

--
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

diff --git a/contrib/dg-extract-results.sh b/contrib/dg-extract-results.sh
index a83c8e8..ebf93bf 100755
--- a/contrib/dg-extract-results.sh
+++ b/contrib/dg-extract-results.sh
@@ -127,13 +127,20 @@ do
 done
 test $ERROR -eq 0 || exit 1
 
+# Check if grep supports the '--text' option.
+
+GREP_TEXT_OPT="--text"
+if grep --text 2>&1 | grep "unrecognized option" > /dev/null 2>&1 ; then
+  GREP_TEXT_OPT=""
+fi
+
 if [ -z "$TOOL" ]; then
   # If no tool was specified, all specified summary files must be for
   # the same tool.
 
-  CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
+  CNT=`grep $GREP_TEXT_OPT '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l`
   if [ $CNT -eq 1 ]; then
-    TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
+    TOOL=`grep $GREP_TEXT_OPT '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'`
   else
     msg "${PROGNAME}: sum files are for multiple tools, specify a tool"
     msg ""
@@ -144,7 +151,7 @@ else
   # Ignore the specified summary files that are not for this tool.  This
   # should keep the relevant files in the same order.
 
-  SUM_FILES=`grep -l "=== $TOOL" $SUM_FILES`
+  SUM_FILES=`grep $GREP_TEXT_OPT -l "=== $TOOL" $SUM_FILES`
   if test -z "$SUM_FILES" ; then
     msg "${PROGNAME}: none of the specified files are results for $TOOL"
     exit 1
@@ -233,7 +240,7 @@ else
   VARIANTS=""
   for VAR in $VARS
   do
-    grep "Running target $VAR" $SUM_FILES > /dev/null && VARIANTS="$VARIANTS $VAR"
+    grep $GREP_TEXT_OPT "Running target $VAR" $SUM_FILES > /dev/null && VARIANTS="$VARIANTS $VAR"
   done
 fi
Re: [PATCH][SPARC] default with_cpu to ultrasparc in sparc64-*-linux* targets
From: Eric Botcazou ebotca...@adacore.com
Date: Mon, 15 Dec 2014 22:27:38 +0100

I think it would be reasonable to have gcc targeting ultrasparc extensions by default in sparc64-*-linux*. WDYT?

No strong opinion. FreeBSD and OpenBSD already do it so why not? DaveM, any opinion?

Keep in mind that some early Niagara chips lacked some portion of the VIS instructions and they are thus emulated in software.

The other problem is that the instruction scheduling for ultrasparc is either completely unnecessary or wrong for all of the Niagara variants. In fact, 'ultrasparc' is the most complicated and resource intensive of all of the instruction schedulers we have, yet it applies to a very small percentage of the actual chips still in use.

The ultrasparc3 scheduler is much smaller and much more efficient, and provides a schedule that works well across all chip variants.
Merge from trunk to gccgo branch
I've merged GCC trunk to the gccgo branch, again. Ian
RE: patch to fix PR64110
Hi,

This commit causes another GCC build failure for ARM targets. The details are described in the following Bugzilla link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64323. Could you help me to have a look?

Thanks,
Hale.

-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Vladimir Makarov
Sent: Saturday, December 13, 2014 4:12 AM
To: GCC Patches
Subject: patch to fix PR64110

The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110

The patch was successfully tested and bootstrapped on x86/x86-64.

Committed as rev. 218688.

2014-12-12  Vladimir Makarov  vmaka...@redhat.com

        PR target/64110
        * lra-constraints.c (process_alt_operands): Refuse alternative
        when reload pseudo of given class can not hold value of given mode.

2014-12-12  Vladimir Makarov  vmaka...@redhat.com

        PR target/64110
        * gcc.target/i386/pr64110.c: New.
[PATCH] Fix PR64217
Hi, all,

In nds32 port, there is a wrong design in casesi_internal pattern. Since clobber always discards the previous value, it should have constraint modifier '=' so that LRA is able to correctly handle the register live info.

So we have this patch to fix the issue. Committed as Rev.218774.

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 218773)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2014-12-16  Chung-Ju Wu  jasonw...@gmail.com
+
+        PR target/64217
+        * config/nds32/nds32.md (casesi_internal): Add '=r' for clobber
+        register constraint.
+
 2014-12-15  DJ Delorie  d...@redhat.com
 
         * config/rl78/rl78.h: Remove SHORT_IMMEDIATES_SIGN_EXTEND.
Index: gcc/config/nds32/nds32.md
===================================================================
--- gcc/config/nds32/nds32.md (revision 218773)
+++ gcc/config/nds32/nds32.md (working copy)
@@ -2178,7 +2178,7 @@
 (const_int 4))
 (label_ref (match_operand 1 )
 (use (label_ref (match_dup 1)))
- (clobber (match_operand:SI 2 "register_operand" ""))
+ (clobber (match_operand:SI 2 "register_operand" "=r"))
 (clobber (reg:SI TA_REGNUM))])]
 {

Best regards,
jasonwucj
Re: [Patch, Fortran, OOP] PR 64244: [4.8/4.9/5 Regression] ICE at class.c:236 when using non_overridable
Hi Janus, hi all,

Janus Weil wrote:
> here is a regression fix for a problem with the NON_OVERRIDABLE attribute. For non-overridable type-bound procedures we do not have to generate a call to the vtable, but can just translate it to a simple ('non-virtual') function call. In this particular case, an additional generic binding was present, which fooled the compiler into believing that the call goes to an overridable procedure, so it tried to generate a call to a vtable entry which did not exist.
>
> The trick is simply to take the NON_OVERRIDABLE attribute from the specific procedure, not the generic one (which means the generic call has to be resolved to a specific one first).
>
> The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk? And for 4.8/4.9 after some time?

OK. Thanks for the patch!

Tobias

2014-12-15  Janus Weil  ja...@gcc.gnu.org

        PR fortran/64244
        * resolve.c (resolve_typebound_call): New argument to pass out the
        non-overridable attribute of the specific procedure.
        (resolve_typebound_subroutine): Get overridable flag from
        resolve_typebound_call.

2014-12-15  Janus Weil  ja...@gcc.gnu.org

        PR fortran/64244
        * gfortran.dg/typebound_call_26.f90: New.