Re: [Patch, gcc/flag-types.h + Fortran] PR54687 - Fortran options cleanup
OK, I think, from the Fortran POV. I hope I didn't miss some logic issue in the middle of the trivial stuff. Thanks for doing that work! FX On 12 Dec 2014 at 08:43, Tobias Burnus bur...@net-b.de wrote: This patch cleans up Fortran's option handling and moves it closer to the common way of option handling. That's a nice cleanup and, as Manuel points out in the PR, there are a couple of further reasons why this makes sense. I have not yet touched all options but one has to start somewhere. Built and currently regtesting on x86-64-gnu-linux. OK for the trunk? Tobias opt2.diff
Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure
Hi! On Tue, 30 Sep 2014 13:16:37 +0200, I wrote: On Fri, 26 Sep 2014 16:36:21 +0400, Ilya Verbin iver...@gmail.com wrote: --- a/configure.ac +++ b/configure.ac @@ -286,6 +286,24 @@ case ${with_newlib} in yes) skipdirs=`echo ${skipdirs} | sed -e 's/ target-newlib / /'` ;; esac +AC_ARG_ENABLE(as-accelerator-for, +[AS_HELP_STRING([--enable-as-accelerator-for=ARG], + [build as offload target compiler. + Specify offload host triple by ARG])], +ENABLE_AS_ACCELERATOR_FOR=$enableval, +ENABLE_AS_ACCELERATOR_FOR=no) I don't see $ENABLE_AS_ACCELERATOR_FOR being used anywhere, so this can probably be removed? On Wed, 1 Oct 2014 20:05:45 +0400, Ilya Verbin iver...@gmail.com wrote: It will be used in one of the upcoming patches. OK, but why do you need the all-uppercase variant? The lowercase enable_as_accelerator_for already is (automatically) populated by Autoconf, and used in other places? Here is a untested cleanup patch; could you please test this? * configure.ac (--enable-as-accelerator-for): Don't set ENABLE_AS_ACCELERATOR_FOR. Update all users. * configure: Regenerate. diff --git configure configure index 297f38e..1804198 100755 --- configure +++ configure @@ -2893,9 +2893,7 @@ esac # Check whether --enable-as-accelerator-for was given. 
if test ${enable_as_accelerator_for+set} = set; then : - enableval=$enable_as_accelerator_for; ENABLE_AS_ACCELERATOR_FOR=$enableval -else - ENABLE_AS_ACCELERATOR_FOR=no + enableval=$enable_as_accelerator_for; fi @@ -3094,7 +3092,7 @@ if test ${enable_liboffloadmic+set} = set; then : as_fn_error --enable-liboffloadmic=no/host/target $LINENO 5 ;; esac else - if test ${ENABLE_AS_ACCELERATOR_FOR} != no; then + if test x$enable_as_accelerator_for != x; then case ${target} in *-intelmic-* | *-intelmicemul-*) enable_liboffloadmic=target diff --git configure.ac configure.ac index fd1bdf0..91c9a72 100644 --- configure.ac +++ configure.ac @@ -289,9 +289,7 @@ esac AC_ARG_ENABLE(as-accelerator-for, [AS_HELP_STRING([--enable-as-accelerator-for=ARG], [build as offload target compiler. - Specify offload host triple by ARG])], -ENABLE_AS_ACCELERATOR_FOR=$enableval, -ENABLE_AS_ACCELERATOR_FOR=no) + Specify offload host triple by ARG])]) AC_ARG_ENABLE(offload-targets, [AS_HELP_STRING([--enable-offload-targets=LIST], @@ -470,7 +468,7 @@ AC_HELP_STRING([[--enable-liboffloadmic[=ARG]]], *) AC_MSG_ERROR([--enable-liboffloadmic=no/host/target]) ;; esac], -[if test ${ENABLE_AS_ACCELERATOR_FOR} != no; then +[if test x$enable_as_accelerator_for != x; then case ${target} in *-intelmic-* | *-intelmicemul-*) enable_liboffloadmic=target Grüße, Thomas signature.asc Description: PGP signature
Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure
On Fri, Dec 12, 2014 at 09:14:28AM +0100, Thomas Schwinge wrote: On Tue, 30 Sep 2014 13:16:37 +0200, I wrote: On Fri, 26 Sep 2014 16:36:21 +0400, Ilya Verbin iver...@gmail.com wrote: --- a/configure.ac +++ b/configure.ac @@ -286,6 +286,24 @@ case ${with_newlib} in yes) skipdirs=`echo ${skipdirs} | sed -e 's/ target-newlib / /'` ;; esac +AC_ARG_ENABLE(as-accelerator-for, +[AS_HELP_STRING([--enable-as-accelerator-for=ARG], + [build as offload target compiler. + Specify offload host triple by ARG])], +ENABLE_AS_ACCELERATOR_FOR=$enableval, +ENABLE_AS_ACCELERATOR_FOR=no) I don't see $ENABLE_AS_ACCELERATOR_FOR being used anywhere, so this can probably be removed? On Wed, 1 Oct 2014 20:05:45 +0400, Ilya Verbin iver...@gmail.com wrote: It will be used in one of the upcoming patches. OK, but why do you need the all-uppercase variant? The lowercase enable_as_accelerator_for already is (automatically) populated by Autoconf, and used in other places? Here is a untested cleanup patch; could you please test this? * configure.ac (--enable-as-accelerator-for): Don't set ENABLE_AS_ACCELERATOR_FOR. Update all users. * configure: Regenerate. Ok if it works. Jakub
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
-Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Eric Botcazou Sent: Monday, November 24, 2014 5:41 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits Thanks for the comments. I will compare the two nonzero_bits from src and REG_EQUAL. Then select the smaller one. They are masks so you can simply AND them before ORing the result. Do you know why it use SET_SRC (set) other than src for num_sign_bit_copies? If it is src, I should do the same for num_sign_bit_copies with REG_EQUAL info. Probably historical reasons, let's not try to change that now. You can apply the same treatment to num_sign_bit_copies (you will need a comparison here) while preserving the src vs SET_SRC (set) discrepancy. Thanks for the comments. Patch is updated. diff --git a/gcc/combine.c b/gcc/combine.c index 1808f97..2e865d7 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -1603,6 +1603,28 @@ setup_incoming_promotions (rtx_insn *first) } } +/* Update RSP from INSN's REG_EQUAL note and SRC. */ + +static void +update_rsp_from_reg_equal (reg_stat_type *rsp, rtx_insn *insn, rtx src, rtx x) +{ + rtx reg_equal = insn ? find_reg_equal_equiv_note (insn) : NULL_RTX; + unsigned HOST_WIDE_INT bits = nonzero_bits (src, nonzero_bits_mode); + + if (reg_equal) +{ + unsigned int num = num_sign_bit_copies (XEXP (reg_equal, 0), + GET_MODE (x)); + bits = nonzero_bits (XEXP (reg_equal, 0), nonzero_bits_mode); + rsp-nonzero_bits |= bits; + + if (rsp-sign_bit_copies num) + rsp-sign_bit_copies = num; +} + else +rsp-nonzero_bits |= bits; +} + /* Called via note_stores. If X is a pseudo that is narrower than HOST_BITS_PER_WIDE_INT and is being set, record what bits are known zero. @@ -1698,13 +1720,14 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set, void *data) src = GEN_INT (INTVAL (src) | ~GET_MODE_MASK (GET_MODE (x))); #endif - /* Don't call nonzero_bits if it cannot change anything. 
*/ - if (rsp-nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) - rsp-nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x)); if (rsp-sign_bit_copies == 0 || rsp-sign_bit_copies num) rsp-sign_bit_copies = num; + + /* Don't call nonzero_bits if it cannot change anything. */ + if (rsp-nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) + update_rsp_from_reg_equal (rsp, insn, src, x); } else {
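Eric's two suggestions can be sketched outside GCC as follows. This is only an illustration, not the combine.c code: `RegStat`, `merge_estimates` and the plain integer types are stand-ins invented for the example, and the way the per-set result is folded into the running value is an assumption. Two independent estimates of "which bits may be nonzero" are both upper bounds, so their AND is still valid and at least as tight; the result is then ORed into the per-register accumulator. For sign-bit copies both estimates are lower bounds, so the larger one is kept, which needs a comparison rather than a bit operation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GCC's per-register statistics record.
struct RegStat {
  std::uint64_t nonzero_bits = 0;  // bits that may be nonzero in some set
  unsigned sign_bit_copies = 0;    // 0 means "nothing recorded yet"
};

// Merge two estimates for one assignment into the running statistics.
// src_* describes the assigned expression, note_* an equivalent expression
// (the role the REG_EQUAL note plays in the thread).
inline void merge_estimates(RegStat &rsp,
                            std::uint64_t src_bits, std::uint64_t note_bits,
                            unsigned src_copies, unsigned note_copies) {
  assert(src_copies >= 1 && note_copies >= 1);

  // Both masks over-approximate the same value, so their AND still does.
  rsp.nonzero_bits |= (src_bits & note_bits);

  // Both counts under-approximate, so keep the larger one for this set...
  unsigned best = std::max(src_copies, note_copies);
  // ...and the smallest across all sets of the register (assumed policy).
  if (rsp.sign_bit_copies == 0 || rsp.sign_bit_copies > best)
    rsp.sign_bit_copies = best;
}
```

The AND can only clear bits, so a register that is set from a wide expression with a narrow known-equal value ends up with the narrow mask.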
Patch ping
Hi! I'd like to ping 3 patches: http://gcc.gnu.org/ml/gcc-patches/2014-12/msg00546.html PR63831 - P1 - fix __has_attribute/__has_cpp_attribute handling http://gcc.gnu.org/ml/gcc-patches/2014-12/msg00568.html PR64023 - P3 - fix flags passed to non-bootstrapped host modules during bootstrap http://gcc.gnu.org/ml/gcc-patches/2014-12/msg00297.html -fsanitize=vptr support, 3rd iteration Jakub
Re: [PATCH][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model
Kyrill Tkachov kyrylo.tkac...@arm.com writes: * lib/target-utils.exp: New file. ERROR: Couldn't find library file target-utils.exp. make[4]: *** [check-DEJAGNU] Error 1 make[4]: Leaving directory `/usr/local/gcc/gcc-20141212/Build/ia64-suse-linux/libgomp/testsuite' make[3]: *** [check-am] Error 2 Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: r218609 - in /trunk/gcc: ChangeLog common.opt d...
hubi...@gcc.gnu.org writes: Author: hubicka Date: Wed Dec 10 21:17:28 2014 New Revision: 218609 URL: https://gcc.gnu.org/viewcvs?rev=218609root=gccview=rev Log: * doc/invoke.texi: (-devirtualize-at-ltrans): Document. * lto-cgraph.c (lto_output_varpool_node): Mark initializer as removed when it is not streamed to the given ltrans. (compute_ltrans_boundary): Make code adding all polymorphic call targets conditional with !flag_wpa || flag_ltrans_devirtualize. * common.opt (fdevirtualize-at-ltrans): New flag. /usr/local/gcc/gcc-20141211/gcc/testsuite/g++.dg/ipa/pr64059.C:56:1: internal compiler error: Segmentation fault. 0x40df742f crash_signal. ../../gcc/toplev.c:358. 0x412f2c9f get_binfo_at_offset(tree_node*, long, tree_node*). ../../gcc/tree.c:11922. 0x40a0d75f possible_polymorphic_call_targets(tree_node*, long, ipa_polymorphic_call_context, bool*, void**, bool). ../../gcc/ipa-devirt.c:2404. 0x40b6f2ef possible_polymorphic_call_targets(cgraph_edge*, bool*, void**, bool). ../../gcc/ipa-utils.h:109. 0x40b6f2ef compute_ltrans_boundary(lto_symtab_encoder_d*). ../../gcc/lto-cgraph.c:952. 0x40c40f2f ipa_write_summaries(bool). ../../gcc/passes.c:2511. 0x406584ff ipa_passes. ../../gcc/cgraphunit.c:2091. 0x406584ff symbol_table::compile(). ../../gcc/cgraphunit.c:2187. 0x4065be1f symbol_table::finalize_compilation_unit(). ../../gcc/cgraphunit.c:2340. 0x4029bcef cp_write_global_declarations(). ../../gcc/cp/decl2.c:4688. Please submit a full bug report,. with preprocessed source if appropriate.. Please include the complete backtrace with any bug report.. See http://gcc.gnu.org/bugs.html for instructions.. FAIL: g++.dg/ipa/pr64059.C -std=gnu++11 (internal compiler error) The problem here is that BINFO is not streamed because devirt is disabled. 
I have committed the following patch (which I have had in my local tree for some time). Honza Index: ChangeLog === --- ChangeLog (revision 218658) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-12 Jan Hubicka hubi...@ucw.cz + + * ipa-devirt.c (possible_polymorphic_call_targets): Return early + if otr_type has no BINFO. + 2014-12-12 Zhenqiang Chen zhenqiang.c...@arm.com PR rtl-optimization/63917 Index: ipa-devirt.c === --- ipa-devirt.c (revision 218639) +++ ipa-devirt.c (working copy) @@ -2239,7 +2239,7 @@ possible_polymorphic_call_targets (tree /* If ODR is not initialized or the constext is invalid, return empty incomplete list. */ - if (!odr_hash || context.invalid) + if (!odr_hash || context.invalid || !TYPE_BINFO (otr_type)) { if (completep) *completep = context.invalid;
Re: Overload HONOR_INFINITIES, etc macros
On Thu, Dec 11, 2014 at 9:47 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, after HONOR_NANS, I am turning the other HONOR_* macros into functions. As a reminder, the goal is both to make uses shorter and to fix the answer for non-native vector types. Bootstrap+testsuite on x86_64-linux-gnu. Ok. Thanks, Richard. 2014-12-12 Marc Glisse marc.gli...@inria.fr * real.h (HONOR_SNANS, HONOR_INFINITIES, HONOR_SIGNED_ZEROS, HONOR_SIGN_DEPENDENT_ROUNDING): Replace macros with 3 overloaded declarations. * real.c (HONOR_NANS): Fix indentation. (HONOR_SNANS, HONOR_INFINITIES, HONOR_SIGNED_ZEROS, HONOR_SIGN_DEPENDENT_ROUNDING): Define three overloads. * builtins.c (fold_builtin_cproj, fold_builtin_signbit, fold_builtin_fmin_fmax, fold_builtin_classify): Simplify argument of HONOR_*. * fold-const.c (operand_equal_p, fold_comparison, fold_binary_loc): Likewise. * gimple-fold.c (gimple_val_nonnegative_real_p): Likewise. * ifcvt.c (noce_try_move, noce_try_minmax, noce_try_abs): Likewise. * omp-low.c (omp_reduction_init): Likewise. * rtlanal.c (may_trap_p_1): Likewise. * simplify-rtx.c (simplify_const_relational_operation): Likewise. * tree-ssa-dom.c (record_equality, record_edge_info): Likewise. * tree-ssa-phiopt.c (value_replacement, abs_replacement): Likewise. * tree-ssa-reassoc.c (eliminate_using_constants): Likewise. * tree-ssa-uncprop.c (associate_equivalences_with_edges): Likewise. -- Marc Glisse Index: gcc/builtins.c === --- gcc/builtins.c (revision 218639) +++ gcc/builtins.c (working copy) @@ -7671,21 +7671,21 @@ build_complex_cproj (tree type, bool neg return type. Return NULL_TREE if no simplification can be made. */ static tree fold_builtin_cproj (location_t loc, tree arg, tree type) { if (!validate_arg (arg, COMPLEX_TYPE) || TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) != REAL_TYPE) return NULL_TREE; /* If there are no infinities, return arg. */ - if (! HONOR_INFINITIES (TYPE_MODE (TREE_TYPE (type + if (! 
HONOR_INFINITIES (type)) return non_lvalue_loc (loc, arg); /* Calculate the result when the argument is a constant. */ if (TREE_CODE (arg) == COMPLEX_CST) { const REAL_VALUE_TYPE *real = TREE_REAL_CST_PTR (TREE_REALPART (arg)); const REAL_VALUE_TYPE *imag = TREE_REAL_CST_PTR (TREE_IMAGPART (arg)); if (real_isinf (real) || real_isinf (imag)) return build_complex_cproj (type, imag-sign); @@ -8942,21 +8942,21 @@ fold_builtin_signbit (location_t loc, tr return (REAL_VALUE_NEGATIVE (c) ? build_one_cst (type) : build_zero_cst (type)); } /* If ARG is non-negative, the result is always zero. */ if (tree_expr_nonnegative_p (arg)) return omit_one_operand_loc (loc, type, integer_zero_node, arg); /* If ARG's format doesn't have signed zeros, return arg 0.0. */ - if (!HONOR_SIGNED_ZEROS (TYPE_MODE (TREE_TYPE (arg + if (!HONOR_SIGNED_ZEROS (arg)) return fold_convert (type, fold_build2_loc (loc, LT_EXPR, boolean_type_node, arg, build_real (TREE_TYPE (arg), dconst0))); return NULL_TREE; } /* Fold function call to builtin copysign, copysignf or copysignl with arguments ARG1 and ARG2. Return NULL_TREE if no simplification can be made. */ @@ -9136,26 +9136,26 @@ fold_builtin_fmin_fmax (location_t loc, tree res = do_mpfr_arg2 (arg0, arg1, type, (max ? mpfr_max : mpfr_min)); if (res) return res; /* If either argument is NaN, return the other one. Avoid the transformation if we get (and honor) a signalling NaN. Using omit_one_operand() ensures we create a non-lvalue. */ if (TREE_CODE (arg0) == REAL_CST real_isnan (TREE_REAL_CST (arg0)) - (! HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0))) + (! HONOR_SNANS (arg0) || ! TREE_REAL_CST (arg0).signalling)) return omit_one_operand_loc (loc, type, arg1, arg0); if (TREE_CODE (arg1) == REAL_CST real_isnan (TREE_REAL_CST (arg1)) - (! HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg1))) + (! HONOR_SNANS (arg1) || ! TREE_REAL_CST (arg1).signalling)) return omit_one_operand_loc (loc, type, arg0, arg1); /* Transform fmin/fmax(x,x) - x. 
*/ if (operand_equal_p (arg0, arg1, OEP_PURE_SAME)) return omit_one_operand_loc (loc, type, arg0, arg1); /* Convert fmin/fmax to MIN_EXPR/MAX_EXPR. C99 requires these functions to return the numeric arg if the other one is NaN. These tree codes don't honor that, so only transform if @@ -9552,21 +9552,21 @@
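The "three overloaded declarations" from the ChangeLog can be pictured with a toy model. Everything here (`Mode`, `Type`, `Expr`, `honor_infinities`) is a simplified stand-in invented for this note, not the real.h interface. The point is only that callers may pass a mode, a type or an expression, and the overloads do the unwrapping that call sites previously spelled out as `TYPE_MODE (TREE_TYPE (...))`.

```cpp
#include <cassert>

// Toy stand-ins; GCC's machine_mode, tree and rtx are far richer.
struct Mode { bool has_infinities; };
struct Type { Mode mode; };
struct Expr { Type type; };

// One predicate, three entry points. The expression and type overloads
// just forward to the mode overload, so the logic lives in one place.
inline bool honor_infinities(Mode m) { return m.has_infinities; }
inline bool honor_infinities(const Type &t) { return honor_infinities(t.mode); }
inline bool honor_infinities(const Expr &e) { return honor_infinities(e.type); }
```

With this shape a call site shrinks from `honor_infinities(e.type.mode)` to `honor_infinities(e)`, and a special case (such as the non-native vector types mentioned above) would only need handling inside the type overload.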
Re: Remove unused arguments of builtin_unreachable
On Thu, Dec 11, 2014 at 7:16 PM, Jan Hubicka hubi...@ucw.cz wrote: On Thu, Dec 11, 2014 at 06:06:55PM +0100, Jan Hubicka wrote: Hi, in firefox .optimized dumps one can see a few places where __builtin_unreachable is called (as a result of devirtualization code proving the code path to be undefined). There is usually some argument setup for the parameters of __builtin_unreachable that are dead. This patch makes it somewhat better so now we get: <bb 30>: # prephitmp_222 = PHI <_52(27), pretmp_245(29)> _57 = prephitmp_222 + 2; pool_40(D)->ptr = _57; __builtin_unreachable (); Why does DSE not eliminate the stores prior to a noreturn const function? Probably because it has a very special special-casing of function exit and because pool_40(D)->ptr is a global store which cannot be eliminated (it doesn't special case noreturn const function exits). Bootstrapped/regtested x86_64-linux, OK? Honza * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Remove dead parameters of BUILT_IN_UNREACHABLE Shouldn't this be done when you actually change the call to __builtin_unreachable ()? I mean, __builtin_unreachable () has no arguments, so leaving any arguments there is broken IL, even if you clean it up during the next DCE. Hmm, I thought there was some reason to not do so because of in-place folding and memory-SSA. Well, reducing the number of used ops is fine for in-place folding. memory-SSA shouldn't be an issue. Richard. I can try to update all the places where we can put builtin_unreachable into the IL. (I wonder if that also includes standard constant propagation.) Honza --- tree-ssa-dce.c (revision 218610) +++ tree-ssa-dce.c (working copy) @@ -250,6 +250,15 @@ mark_stmt_if_obviously_necessary (gimple case BUILT_IN_ALLOCA: case BUILT_IN_ALLOCA_WITH_ALIGN: return; + case BUILT_IN_UNREACHABLE: + /* All parameters of BUILT_IN_UNREACHABLE are dead. Remove them + from the stmt, so we can remove their definitions. 
*/ + if (gimple_call_num_args (stmt)) + { + gimple_set_num_ops (stmt, 3); + update_stmt (stmt); + } + break; default:; } Jakub
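The effect of `gimple_set_num_ops (stmt, 3)` in the hunk above can be modelled with a plain vector. The sketch assumes that a call statement keeps three fixed operands ahead of its arguments, which is what the constant 3 in the patch suggests; `CallStmt`, `kFixedOps` and `drop_call_args` are made-up names for illustration and not GIMPLE API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed number of operands that precede the arguments of a call.
constexpr std::size_t kFixedOps = 3;

// Toy call statement: a fixed header of operands, then the arguments.
struct CallStmt {
  std::vector<std::string> ops;

  std::size_t num_args() const { return ops.size() - kFixedOps; }
};

// Drop every argument so that nothing keeps their definitions alive.
// Returns true if the statement changed and would need to be updated.
inline bool drop_call_args(CallStmt &stmt) {
  assert(stmt.ops.size() >= kFixedOps);
  if (stmt.num_args() == 0)
    return false;
  stmt.ops.resize(kFixedOps);  // truncate to the header only
  return true;
}
```

Once the arguments are gone, the statements that computed them (the `_57 = ...` setup in the dump quoted earlier) have no remaining uses and an ordinary dead-code pass can delete them.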
Re: [patch] Fix ICE on unaligned record field
On Thu, Dec 11, 2014 at 10:52 PM, Eric Botcazou ebotca...@adacore.com wrote: Note that I think the place of the check is unfortunate as you for example will not remove the argument if it is unused. In fact I'm not yet sure what transform exactly we are disabling. I am guessing we are passing an aggregate by value that resides at a bit-aligned offset of some outer object: foo (x.aggr); and the function then does foo (Aggr a) { int i = a.foo; ... } thus uses only a part of the aggregate. Then IPA SRA would like to pass x.aggr.foo instead of x.aggr and thus tries to materialize a load from x.aggr.foo at all callers but fails to do that in a valid way. Right, it's the usual MEM_EXPR business creating ADDR_EXPRs out of nowhere and miserably failing on something not addressable. Well, I call it a convenience that MEM_EXPR, unlike INDIRECT_REF, can be used to encapsulate an arbitrary byte-offset and view-conversion. Of course it's still a dereference of an address so that convenience doesn't work on something non-addressable. Eric's fix did, at all callers Aggr tem = x.aggr; foo (tem.foo); ? Yes, because the code wants to take &tem afterwards. While we should be able to simply do foo (BIT_FIELD_REF x.aggr, .) with the appropriate bit offset and size? (if that's of register type you need to do the load in a separate stmt of course). Thus similar to Eric's fix but avoiding the aggregate copy. Yes, that should be doable, but I'm not sure it's worth the hassle. I'll leave that to you two to decide - Martin's patch is ok if you are fine with disabling the optimization (also removing an unused parameter). Thanks, Richard. -- Eric Botcazou
Re: [PATCH 3/4] Add libgomp plugin for Intel MIC
Hi! On Mon, 10 Nov 2014 17:30:38 +0300, Ilya Verbin iver...@gmail.com wrote: --- /dev/null +++ b/liboffloadmic/plugin/Makefile.am @@ -0,0 +1,123 @@ +# Plugin for offload execution on Intel MIC devices. +libgomp_src_dir = $(top_srcdir)/../../libgomp +libgomp_dir = $(build_dir)/../../libgomp Hmm, I'm not too happy about external (to libgomp) files using (for example, #include) stuff from libgomp, for the reason given in http://news.gmane.org/find-root.php?message_id=%3C87ioishf5z.fsf%40kepler.schwinge.homeip.net%3E: it can then easily happen that any such files depend on, for example, Autoconf definitions which are provided in only one of the instances. That said, libgomp_target.h as well as omp.h currently are self-contained (the latter file after having been created from omp.h.in by libgomp's configure script), so this currently is not an actual problem. + AM_LDFLAGS = -L$(liboffload_dir)/.libs -L$(libgomp_dir)/.libs -loffloadmic_target -lcoi_device -lmyo-service -lgomp -rdynamic Given that this plugin wishes to link against libgomp, don't we have to make sure that libgomp has actually been built before that is attempted, and the following (untested) patch would be required? diff --git Makefile.def Makefile.def index 7c8761a..f0a3a91 100644 --- Makefile.def +++ Makefile.def @@ -550,7 +550,7 @@ dependencies = { module=configure-target-libvtv; on=all-target-libstdc++-v3; }; // generated by the libgomp configure. Unfortunately, due to the use of // recursive make, we can't be that specific. 
dependencies = { module=all-target-libstdc++-v3; on=configure-target-libgomp; }; -dependencies = { module=all-target-liboffloadmic; on=configure-target-libgomp; }; +dependencies = { module=all-target-liboffloadmic; on=all-target-libgomp; }; dependencies = { module=install-target-libgo; on=install-target-libatomic; }; dependencies = { module=install-target-libgfortran; on=install-target-libquadmath; }; diff --git Makefile.in Makefile.in index ba5ae4c..8c060b9 100644 --- Makefile.in +++ Makefile.in @@ -48884,7 +48884,7 @@ all-stage3-target-libstdc++-v3: maybe-configure-stage3-target-libgomp all-stage4-target-libstdc++-v3: maybe-configure-stage4-target-libgomp all-stageprofile-target-libstdc++-v3: maybe-configure-stageprofile-target-libgomp all-stagefeedback-target-libstdc++-v3: maybe-configure-stagefeedback-target-libgomp -all-target-liboffloadmic: maybe-configure-target-libgomp +all-target-liboffloadmic: maybe-all-target-libgomp install-target-libgo: maybe-install-target-libatomic install-target-libgfortran: maybe-install-target-libquadmath install-target-libgfortran: maybe-install-target-libgcc Grüße, Thomas signature.asc Description: PGP signature
Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp
Hi! I know, I'm a little late, but: On Mon, 6 Oct 2014 19:53:17 +0400, Ilya Verbin iver...@gmail.com wrote: This patch adds plugin support to libgomp, as well as memory mapping and interaction with target devices through the plugin's interface. libgomp/ * libgomp_target.h: New file. --- /dev/null +++ b/libgomp/libgomp_target.h @@ -0,0 +1,44 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + + This file is part of the GNU OpenMP Library (libgomp). +#ifndef LIBGOMP_TARGET_H +#define LIBGOMP_TARGET_H 1 + +/* Type of offload target device. */ +enum offload_target_type +{ + OFFLOAD_TARGET_TYPE_HOST, + OFFLOAD_TARGET_TYPE_INTEL_MIC +}; + +/* Auxiliary struct, used for transferring a host-target address range mapping + from plugin to libgomp. */ +struct mapping_table +{ + uintptr_t host_start; + uintptr_t host_end; + uintptr_t tgt_start; + uintptr_t tgt_end; +}; + +#endif /* LIBGOMP_TARGET_H */ Doesn't this file conceptually serve the same purpose as the [top-level]/include/libgomp-constants.h file that we began using on gomp-4_0-branch, https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=blob;f=include/gomp-constants.h;hb=refs/remotes/gomp-4_0-branch -- that is, share stuff (constants, data structures -- so the libgomp-constants.h name also isn't totally appropriate...) between the compiler proper and libgomp (including offloading plugins living elsewhere)? I think we should settle on one such file. For the reason of encapsulation, http://news.gmane.org/find-root.php?message_id=%3C87k31x4321.fsf%40kepler.schwinge.homeip.net%3E, I'd prefer this to live outside of libgomp, so what about a generic [top-level]/include/libgomp.h file? Grüße, Thomas
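To make the role of `struct mapping_table` a little more concrete, here is one way such a pair of ranges could be consumed. The struct layout is copied from the patch; `host_to_target` is a hypothetical helper written for this note and is not part of libgomp. Treating the ranges as half-open and equally long is an assumption of the sketch, not something the header states.

```cpp
#include <cassert>
#include <cstdint>

// Same fields as in the quoted libgomp_target.h.
struct mapping_table {
  std::uintptr_t host_start;
  std::uintptr_t host_end;
  std::uintptr_t tgt_start;
  std::uintptr_t tgt_end;
};

// Hypothetical helper: translate a host address into the matching target
// address. Returns false if the address lies outside [host_start, host_end).
inline bool host_to_target(const mapping_table &m, std::uintptr_t host,
                           std::uintptr_t &tgt) {
  // Assumption of this sketch: both ranges have the same length.
  assert(m.host_end - m.host_start == m.tgt_end - m.tgt_start);
  if (host < m.host_start || host >= m.host_end)
    return false;
  tgt = m.tgt_start + (host - m.host_start);
  return true;
}
```

Whichever header ends up holding the struct, both the plugin that fills it in and the library that reads it need to agree on these conventions, which is part of the argument for a single shared file.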
Re: [PATCH][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model
On 12/12/14 08:34, Andreas Schwab wrote: Kyrill Tkachov kyrylo.tkac...@arm.com writes: * lib/target-utils.exp: New file. ERROR: Couldn't find library file target-utils.exp. make[4]: *** [check-DEJAGNU] Error 1 make[4]: Leaving directory `/usr/local/gcc/gcc-20141212/Build/ia64-suse-linux/libgomp/testsuite' make[3]: *** [check-am] Error 2 Ugh, sorry for that, reproduced. DejaGNU (Tcl?) doesn't do recursive loads :(. The quick solution is to load target-utils.exp explicitly in libgomp.exp I'll post a patch shortly. Kyrill Andreas.
Re: [PATCH][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model
On 12/12/14 10:22, Kyrill Tkachov wrote: On 12/12/14 08:34, Andreas Schwab wrote: Kyrill Tkachov kyrylo.tkac...@arm.com writes: * lib/target-utils.exp: New file. ERROR: Couldn't find library file target-utils.exp. make[4]: *** [check-DEJAGNU] Error 1 make[4]: Leaving directory `/usr/local/gcc/gcc-20141212/Build/ia64-suse-linux/libgomp/testsuite' make[3]: *** [check-am] Error 2 Ugh, sorry for that, reproduced. DejaGNU (Tcl?) doesn't do recursive loads :(. The quick solution is to load target-utils.exp explicitly in libgomp.exp. I'll post a patch shortly. Here it is. Committed as r218662 to get things going again. 2014-12-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * testsuite/lib/libgomp.exp: Load target-utils.exp. Move load of target-supports.exp earlier. diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp index a154684..ff22f10 100644 --- a/libgomp/testsuite/lib/libgomp.exp +++ b/libgomp/testsuite/lib/libgomp.exp @@ -17,9 +17,10 @@ load_lib dg.exp load_gcc_lib prune.exp load_gcc_lib target-libpath.exp load_gcc_lib wrapper.exp +load_gcc_lib target-supports.exp +load_gcc_lib target-utils.exp load_gcc_lib gcc-defs.exp load_gcc_lib timeout.exp -load_gcc_lib target-supports.exp load_gcc_lib file-format.exp load_gcc_lib target-supports-dg.exp load_gcc_lib scanasm.exp
Re: [PATCH 2/4] Add liboffloadmic
Hi! On Tue, 21 Oct 2014 21:20:34 +0400, Ilya Verbin iver...@gmail.com wrote: This patch contains liboffloadmic library. liboffloadmic/ Initial commit. Imported from upstream: https://www.openmprtl.org/sites/default/files/liboffload_oss.tgz * Makefile.am: New file. * Makefile.in: New file, generated by automake. * aclocal.m4: New file, generated by aclocal. * configure: New file, generated by autoconf. * configure.ac: New file. contrib/gcc_update:files_and_dependencies needs to be updated for those build machinery files as well as those added in liboffloadmic/plugin/ later on. Grüße, Thomas
Re: [patch] Fix ICE on unaligned record field
Well, I call it a convenience that MEM_EXPR, unlike INDIRECT_REF, can be used to encapsulate an arbitrary byte-offset and view-conversion. Of course it's still a dereference of an address so that convenience doesn't work on something non-addressable. I'm not disputing the merits of MEM_EXPR vs INDIRECT_REF, only the pertinence of creating ADDR_EXPRs out of nowhere just to use them. I'll leave that to you two to decide - Martin's patch is ok if you are fine with disabling the optimization (also removing an unused parameter). I'm fine with disabling it: the aggregate is passed directly so it's probably small and, in the case at hand, the optimized caller would do 2 extractions instead of only 1 so the gain is not obvious. -- Eric Botcazou
[patch c++]: Fix PR/63996
Hi, In C++14 mode, evaluation of a loop expression never terminates when its statement list is not a constant expression. Bug 63996 - Infinite loop in invalid C++14 constexpr fn ChangeLog 2014-12-12 Kai Tietz kti...@redhat.com PR c++/63996 * constexpr.c (cxx_eval_loop_expr): Don't loop endlessly on a non-constant expression. 2014-12-12 Kai Tietz kti...@redhat.com PR c++/63996 * g++.dg/cpp1y/pr63996.C: New file. Tested for x86_64-w64-mingw32. OK to apply? Regards, Kai New testcase in g++.dg/cpp1y as pr63996.C // { dg-do compile { target c++14 } } constexpr int foo (int i) { int a[i] = { }; } constexpr int j = foo (1); // { dg-error "is not a constant expression" } Index: constexpr.c === --- constexpr.c (revision 218570) +++ constexpr.c (working copy) @@ -2841,7 +2870,7 @@ cxx_eval_loop_expr (const constexpr_ctx *ctx, tree { cxx_eval_statement_list (ctx, body, non_constant_p, overflow_p, jump_target); - if (returns (jump_target) || breaks (jump_target)) + if (returns (jump_target) || breaks (jump_target) || *non_constant_p) break; } if (breaks (jump_target))
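The one-line fix follows a general pattern for evaluators that report failure through an out-flag: any loop that re-evaluates a body has to stop when the flag is set, otherwise a body that never produces a jump keeps the loop spinning. A minimal sketch of that pattern, with `Jump`, `eval_loop` and the callback signature invented for the example (the real code is `cxx_eval_loop_expr` in constexpr.c and looks different):

```cpp
#include <cassert>
#include <functional>

enum class Jump { None, Break, Return };

// Evaluate `body` repeatedly until it breaks, returns, or turns out not to
// be a constant expression. Returns the number of iterations performed.
inline int eval_loop(const std::function<Jump(bool &non_constant)> &body,
                     bool &non_constant) {
  int iterations = 0;
  while (true) {
    Jump j = body(non_constant);
    ++iterations;
    // Without the non_constant test, a body that only ever sets the flag
    // and never jumps would keep this loop running forever.
    if (j == Jump::Return || j == Jump::Break || non_constant)
      break;
  }
  return iterations;
}
```

In the test case from the patch the body (the declaration of `a[i]`) is rejected on the first evaluation, so with the extra condition the loop ends after one iteration and the error is reported instead of the compiler hanging.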
[PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
[Cleaning this thread up to submit patch again, with better explanation] This patch causes subreg_get_info() to exit early in the simple cases where we are extracting a whole register from a multi-register value. In aarch64 for Big Endian we were producing a subreg of an OImode (256 bits) from a CImode (384 bits). This would hit the following assert in subreg_get_info: gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0); This is a rule we should be able to relax a little - if the subreg we want fits into a whole register then this is a valid result and can be easily detected earlier in the function. This has the bonus that we should be slightly reducing the execution time for more common cases, for example a subreg of 64 bits from 256 bits. This patch is required for the second part of the patch, which is aarch64 specific, and fixes up aarch64 Big Endian movoi/ci/xi. This second part has already been approved. This patch will apply cleanly by itself and no regressions were seen when testing aarch64 and x86_64 on make check. Cheers, Alan Changelog: 2014-11-14 Alan Hayward alan.hayw...@arm.com * rtlanal.c (subreg_get_info): Exit early for simple and common cases --- gcc/rtlanal.c | 13 + 1 file changed, 13 insertions(+) diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c index c9bf69c..a3f7b78 100644 --- a/gcc/rtlanal.c +++ b/gcc/rtlanal.c @@ -3561,6 +3561,19 @@ subreg_get_info (unsigned int xregno, machine_mode xmode, info->offset = offset / regsize_xmode; return; } + /* Quick exit for the simple and common case of extracting whole + subregisters from a multiregister value. */ + if (!rknown + && WORDS_BIG_ENDIAN == REG_WORDS_BIG_ENDIAN + && regsize_xmode == regsize_ymode + && (offset % regsize_ymode) == 0) + { + info->representable_p = true; + info->nregs = nregs_ymode; + info->offset = offset / regsize_ymode; + gcc_assert (info->offset + info->nregs <= nregs_xmode); + return; + } } /* Lowpart subregs are otherwise valid. */ -- 1.9.1
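The arithmetic behind the failing assertion and the new early exit can be checked by hand. The sketch below is a standalone model, not the rtlanal.c code: sizes are in bytes, `SubregInfo` and `quick_subreg_info` are invented names, a 16-byte register is assumed for the AArch64 vector registers, and the `rknown` and endianness conditions from the patch are left out. With OImode at 32 bytes and CImode at 48 bytes, 48 % 32 is 16, so the old assertion fires even though two of the three registers can be picked out cleanly.

```cpp
#include <cassert>

struct SubregInfo {
  unsigned offset;  // first register, relative to the outer value
  unsigned nregs;   // number of registers covered
};

// Model of the quick exit: succeed when the inner value consists of whole
// registers of the same size as the registers of the outer value.
inline bool quick_subreg_info(unsigned xsize, unsigned nregs_x,
                              unsigned ysize, unsigned nregs_y,
                              unsigned byte_offset, SubregInfo &info) {
  assert(nregs_x > 0 && nregs_y > 0);
  unsigned regsize_x = xsize / nregs_x;
  unsigned regsize_y = ysize / nregs_y;
  if (regsize_x != regsize_y || byte_offset % regsize_y != 0)
    return false;
  unsigned first = byte_offset / regsize_y;
  if (first + nregs_y > nregs_x)
    return false;  // the patch asserts this condition instead
  info.offset = first;
  info.nregs = nregs_y;
  return true;
}
```

For the case from the description, `quick_subreg_info(48, 3, 32, 2, 16, info)` selects registers 1 and 2 of the three, while a byte offset of 8 is rejected because it does not start on a register boundary.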
Re: [PATCH] Fix PR 61225
On Fri, Dec 12, 2014 at 03:27:17PM +0800, Zhenqiang Chen wrote: Presumably you're thinking about a PARALLEL that satisfies single_set_p? No. It has nothing to do with single_set_p. I just want to reuse the code to match the instruction pattern. In common, the new PARALLEL is like Parallel newpat from I3 newpat from I2 // if have newpat from I1 // if have newpat from I0 // if have For to_combined_insn, i0 is NULL and there should have no newpat from I1 When handling I1-I2-I3, with normal order, it will get Parallel newpat from I3 After I2- to_combined_insn, the parallel will be Parallel newpat from I3 newpat from to_combined_insn. But this can not match the insn pattern. So I swap the order to. Parallel newpat from to_combined_insn. newpat from I3 Maybe I wasn't clear, sorry. My concern is you only handle a SET as newpat, not a PARALLEL. It can be a PARALLEL just fine, even if it satisfies single_set (it can have a clobber, it can have multiple sets, all but one dead). Thanks for the other changes, much appreciated. Segher
Re: [PATCH][AArch64] Use std::swap instead of manually swapping
Ping. Marcus: Uros pointed out to me that these kinds of changes are considered obvious (with precedent at https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00309.html) but did you have some concerns about backporting to other branches? Kyrill On 05/12/14 16:40, Kyrill Tkachov wrote: Ping. https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01426.html Thanks, Kyrill On 27/11/14 15:37, Kyrill Tkachov wrote: Ping. Thanks, Kyrill On 13/11/14 09:42, Kyrill Tkachov wrote: Hi all, Following the trend in i386 and alpha, this patch uses std::swap to perform swapping of values in the aarch64 backend instead of declaring temporaries. Tested and bootstrapped on aarch64-linux. Ok for trunk? Thanks, Kyrill 2014-11-13 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64.c (aarch64_evpc_ext): Use std::swap instead of manual swapping implementation. (aarch64_expand_vec_perm_const_1): Likewise.
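For readers who have not seen the other backends' cleanups, the change has roughly this shape. The function below is a made-up example, not code from aarch64.c; it only contrasts a hand-written swap through a temporary with `std::swap` from `<utility>`.

```cpp
#include <cassert>
#include <utility>

// Order a pair of operands so that the smaller one comes first.
inline void order_operands(int &lo, int &hi) {
  if (lo > hi) {
    // Before such a cleanup this would typically read:
    //   int tmp = lo; lo = hi; hi = tmp;
    std::swap(lo, hi);
  }
}
```

The behaviour is identical; the gain is one statement instead of three and no temporary whose type has to be kept in sync with the operands.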
Re: [PATCH][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model
Here it is. Committed as r218662 to get things going again. The same change should be done for libitm and libatomic. TIA Dominique
[WWW] Update index.html and gcc-5/changes.html to reflect offloading changes.
Hello, This change adds a mention of OpenMP 4.0 offloading support in GCC, both in the release notes and in the news section of the main page. Index: htdocs/index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.942 diff -p -r1.942 index.html *** htdocs/index.html 17 Nov 2014 08:59:33 - 1.942 --- htdocs/index.html 12 Dec 2014 11:39:56 - *** mission statement/a./p *** 52,57 --- 52,83 dl class=news + dtspanOpenMP 4.0 offloading support in GCC/span + span class=date[2014-12-12]/span/dt + dda href=http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf; + OpenMP 4.0/a offloading features support was added to GCC. Generic changes: + liGeneric infrastructure (suitable for any vendor)./li + liTestsuite which covers offloading from + a href=http://openmp.org/mp-documents/OpenMP4.0.0.Examples.pdf; + OpenMP 4.0 Examples/a document./li + Specific for upcoming Intel MIC products: + liRuntime library./li + liCard emulator./li +Contributed by Jakub Jelinek (RedHat), Thomas Schwinge (CodeSourcery), + Bernd Schmidt (CodeSourcery), Andrey Turetskiy (Intel), + Ilya Verbin (Intel) and Kirill Yukhin (Intel)./dd/dt + dtspana href=gcc-4.9/GCC 4.9.2/a released/span span class=date[2014-10-30]/span/dt dd/dd Index: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.50 diff -p -r1.50 changes.html *** htdocs/gcc-5/changes.html 10 Dec 2014 00:28:18 - 1.50 --- htdocs/gcc-5/changes.html 12 Dec 2014 11:39:56 - *** *** 83,88 --- 83,93 /ul h2 id=languagesNew Languages and Language specific improvements/h2 + ul + lia href=http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf; OpenMP 4.0 + specification/a offloading features are now supported in C/C++ and + Fortran compiler/li + /ul Is it ok to commit? -- Thanks, K
[WWW] Update index.html and gcc-5/changes.html to AVX-512* changes.
Hello, This change mentions support for the new AVX-512* instructions in GCC, in the news section of index.html and in gcc-5/changes.html. Index: htdocs/index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.942 diff -p -r1.942 index.html *** htdocs/index.html 17 Nov 2014 08:59:33 - 1.942 --- htdocs/index.html 12 Dec 2014 11:39:56 - *** mission statement/a./p *** 52,57 --- 52,83 dl class=news + dtspanIntel Skylake Server AVX-512 extensions support/span + span class=date[2014-12-12]/span/dt + ddNew ISA extensions support + a href=https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf; + AVX-512{BW,DQ,VL,IFMA,VBMI}/a was added to GCC. That includes inline + assembly support, new intrinsics, and basic autovectorization. + Code was contributed by Sergey Guriev, Alexander Ivchenko, + Maxim Kuznetsov, Sergey Lega, Anna Tikhonova, Ilya Tocar, + Andrey Turetskiy, Ilya Verbin, Kirill Yukhin and + Michael Zolotukhin of Intel, Corp./dd/dt Index: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.50 diff -p -r1.50 changes.html *** htdocs/gcc-5/changes.html 10 Dec 2014 00:28:18 - 1.50 --- htdocs/gcc-5/changes.html 12 Dec 2014 11:39:56 - *** constexpr int i = f(42); // i is 42/pre *** 426,431 --- 431,449 h3 id=x86IA-32/x86-64/h3 ul + liNew ISA extensions support + a href=https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf; + AVX-512{BW,DQ,VL,IFMA,VBMI}/a of Intel's CPU + codenamed Skylake Server was added to GCC. That includes inline + assembly support, new intrinsics, and basic autovectorization. 
These + new AVX-512 extensions are available via + the following GCC switches: AVX-512 Vector Length EVEX feature: + code-mavx512vl/code, AVX-512 Byte and Word instructions: + code-mavx512bw/code, AVX-512 Dword and Qword instructions: + code-mavx512dq/code, AVX-512 FMA-52 instructions: + code-mavx512ifma/code and for AVX-512 Vector Bit Manipulation + Instructions: code-mavx512vbmi/code. + /li liThe new code-mrecord-mcount/code option for code-pg/code generates a Linux kernel style table of pointers to mcount or __fentry__ calls at the beginning of functions. The new Is it ok to install? -- Thanks, K
[PATCH] Fix PR64280
The following fixes PR64280 by properly guarding the assert with whether we are going to change SSA_NAME_OCCURS_IN_ABNORMAL_PHI. Recent changes in what we allow to propagate otherwise happily will trigger it. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-12-12 Richard Biener rguent...@suse.de PR middle-end/64280 * tree-cfg.c (replace_uses_by): Guard assert properly. * g++.dg/torture/pr64280.C: New testcase. Index: gcc/tree-cfg.c === --- gcc/tree-cfg.c (revision 218661) +++ gcc/tree-cfg.c (working copy) @@ -1781,7 +1781,8 @@ replace_uses_by (tree name, tree val) { e = gimple_phi_arg_edge (as_a gphi * (stmt), PHI_ARG_INDEX_FROM_USE (use)); - if (e-flags EDGE_ABNORMAL) + if (e-flags EDGE_ABNORMAL + !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)) { /* This can only occur for virtual operands, since for the real ones SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name)) Index: gcc/testsuite/g++.dg/torture/pr64280.C === --- gcc/testsuite/g++.dg/torture/pr64280.C (revision 0) +++ gcc/testsuite/g++.dg/torture/pr64280.C (working copy) @@ -0,0 +1,42 @@ +// { dg-do compile } + +class A +{ +public: + A (); +}; +class B +{ +public: + B (int); + operator void *() { return m_fn1 () ? 0 : this; } + int m_fn1 (); +}; +typedef int jmp_buf[]; +struct C +{ + jmp_buf cond_; +}; +class F +{ + C what_; + bool m_fn2 (); +}; +int _setjmp (int[]); +void longjmp (); +class D +{ +public: + D () { longjmp (); } +}; +bool +F::m_fn2 () +{ + B a (0); + if (a) +if (_setjmp (what_.cond_)) + return 0; +else + D (); + A b; +}
[PATCH] Fix PR64284
The following patch fixes PR64284 by removing loops we copied the header for with FSM threading. There may be better solutions, but at least for the testcase it looks difficult to update loops within the constraints of the calling passes. Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu. Richard. 2014-12-12 Richard Biener rguent...@suse.de PR tree-optimization/64284 * tree-ssa-threadupdate.c (duplicate_seme_region): Mark the loop for removal if we copied the loop header. * gcc.dg/torture/pr64284.c: New testcase. Index: gcc/tree-ssa-threadupdate.c === --- gcc/tree-ssa-threadupdate.c (revision 218621) +++ gcc/tree-ssa-threadupdate.c (working copy) @@ -2364,7 +2364,7 @@ duplicate_seme_region (edge entry, edge basic_block *region_copy) { unsigned i; - bool free_region_copy = false, copying_header = false; + bool free_region_copy = false; struct loop *loop = entry-dest-loop_father; edge exit_copy; edge redirected; @@ -2388,10 +2388,7 @@ duplicate_seme_region (edge entry, edge initialize_original_copy_tables (); - if (copying_header) -set_loop_copy (loop, loop_outer (loop)); - else -set_loop_copy (loop, loop); + set_loop_copy (loop, loop); if (!region_copy) { @@ -2453,6 +2450,8 @@ duplicate_seme_region (edge entry, edge } /* Redirect the entry and add the phi node arguments. */ + if (entry-dest == loop-header) +mark_loop_for_removal (loop); redirected = redirect_edge_and_branch (entry, get_bb_copy (entry-dest)); gcc_assert (redirected != NULL); flush_pending_stmts (entry); Index: gcc/testsuite/gcc.dg/torture/pr64284.c === --- gcc/testsuite/gcc.dg/torture/pr64284.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr64284.c (working copy) @@ -0,0 +1,21 @@ +/* { dg-do compile } */ + +int *a; +int b; +int +fn1() { +enum { QSTRING } c = 0; +while (1) { + switch (*a) { + case '\'': + c = 0; + default: + switch (c) + case 0: + if (b) + return 0; + c = 1; + } + a++; +} +}
Re: [PATCH][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model
On 12/12/14 11:38, Dominique Dhumieres wrote: Here it is. Committed as r218662 to get things going again. The same change should be done for libitm and libatomic. Committed as r218664. I grepped for where else we include gcc-defs.exp and included the file there as well. That was libvtv and libgo. Sorry for the trouble. Kyrill 2014-12-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * testsuite/lib/libatomic.exp: Load target-utils.exp 2014-12-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * testsuite/lib/libitm.exp: Load target-utils.exp. Move load of target-supports.exp earlier. 2014-12-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * testsuite/lib/libvtv.exp: Load target-utils.exp [I didn't find a libgo ChangeLog, what are the rules there?] TIA Dominique diff --git a/libatomic/testsuite/lib/libatomic.exp b/libatomic/testsuite/lib/libatomic.exp index 23c3b08..28cbaa8 100644 --- a/libatomic/testsuite/lib/libatomic.exp +++ b/libatomic/testsuite/lib/libatomic.exp @@ -25,6 +25,7 @@ proc load_gcc_lib { filename } { load_lib dg.exp load_gcc_lib file-format.exp load_gcc_lib target-supports.exp +load_gcc_lib target-utils.exp load_gcc_lib target-supports-dg.exp load_gcc_lib scanasm.exp load_gcc_lib scandump.exp diff --git a/libgo/testsuite/lib/libgo.exp b/libgo/testsuite/lib/libgo.exp index a8fe4e0..7031f63 100644 --- a/libgo/testsuite/lib/libgo.exp +++ b/libgo/testsuite/lib/libgo.exp @@ -42,6 +42,8 @@ proc load_gcc_lib { filename } { load_gcc_lib prune.exp load_gcc_lib target-libpath.exp load_gcc_lib wrapper.exp +load_gcc_lib target-supports.exp +load_gcc_lib target-utils.exp load_gcc_lib gcc-defs.exp load_gcc_lib timeout.exp load_gcc_lib go.exp diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp index 669ed90..1361d56 100644 --- a/libitm/testsuite/lib/libitm.exp +++ b/libitm/testsuite/lib/libitm.exp @@ -31,9 +31,10 @@ load_lib dg.exp load_gcc_lib prune.exp load_gcc_lib target-libpath.exp load_gcc_lib wrapper.exp +load_gcc_lib target-supports.exp +load_gcc_lib 
target-utils.exp load_gcc_lib gcc-defs.exp load_gcc_lib timeout.exp -load_gcc_lib target-supports.exp load_gcc_lib file-format.exp load_gcc_lib target-supports-dg.exp load_gcc_lib scanasm.exp diff --git a/libvtv/testsuite/lib/libvtv.exp b/libvtv/testsuite/lib/libvtv.exp index 83674be..c473b0a 100644 --- a/libvtv/testsuite/lib/libvtv.exp +++ b/libvtv/testsuite/lib/libvtv.exp @@ -26,6 +26,7 @@ load_lib dg.exp load_gcc_lib file-format.exp load_gcc_lib target-supports.exp load_gcc_lib target-supports-dg.exp +load_gcc_lib target-utils.exp load_gcc_lib scanasm.exp load_gcc_lib scandump.exp load_gcc_lib scanrtl.exp
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On Thu, Dec 11, 2014 at 11:56:00AM -0800, Richard Henderson wrote: On 12/11/2014 04:25 AM, Dominik Vogt wrote: Update: If I disable the custom s390x code and switch to the implementation just using libffi for reflection calls, the same crash occurs with the testing/quick libgo test case. The called function sees a bogus value written by the dynamic linker as the closure pointer, for example with this line in the test code: CheckEqual(fComplex64, fComplex64, nil) Is the s390 port somehow putting the address of a PLT entry here? Digging through the test program with the debugger reveals that the register corruption is not caused by dynamic linking. Instead, libgo lacks a patch that is necessary for complex support. Without that, ffi_prep_args treats _Complex like a struct with two elements (which it is not on s390[x]) and messes up the layout of the stack arguments, eventually loading the wrong values into the registers when the test function is called. It turns out that the bad value in r0 was just a red herring in this case. I'm not sure I've posted the missing patch anywhere yet, so it's attached to this message. At the moment it enables FFI_TYPE_COMPLEX only for s390[x], but eventually this should be used unconditionally. -- (This still leaves the dynamic linking issue if we do not use libffi for reflection calls with x86* and s390[x]. Is the plan to remove the platform-specific ABI code for the few platforms that have it? I see no way to make them work with the static chain patch anyway.) Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany From 84235d9e7ba8a55dea182adc4007bfab6a35fb1f Mon Sep 17 00:00:00 2001 From: Dominik Vogt v...@de.ibm.com Date: Wed, 29 Oct 2014 09:08:01 +0100 Subject: [PATCH] libgo: Enable complex number support from libffi. 
--- libgo/runtime/go-ffi.c | 21 + 1 file changed, 21 insertions(+) diff --git a/libgo/runtime/go-ffi.c b/libgo/runtime/go-ffi.c index 21879b9..42462d0 100644 --- a/libgo/runtime/go-ffi.c +++ b/libgo/runtime/go-ffi.c @@ -150,11 +150,26 @@ go_complex_to_ffi (ffi_type *float_type) ffi_type *ret; ret = (ffi_type *) __go_alloc (sizeof (ffi_type)); + /* Use libffi with complex type support for targets that have it. This should + be the case for all targets eventually, so the #else branch should then be + removed. */ +#if defined (__s390__) defined (FFI_TYPE_COMPLEX) + ret-type = FFI_TYPE_COMPLEX; + ret-size = 2 * float_type-size; + ret-alignment = float_type-alignment; + ret-elements = (ffi_type **) __go_alloc (2 * sizeof (ffi_type *)); + ret-elements[0] = float_type; + ret-elements[1] = NULL; +#else + /* Warning: This works only on platforms that define C _Complex types like + structures in their Abi. */ ret-type = FFI_TYPE_STRUCT; ret-elements = (ffi_type **) __go_alloc (3 * sizeof (ffi_type *)); ret-elements[0] = float_type; ret-elements[1] = float_type; ret-elements[2] = NULL; +#endif + return ret; } @@ -184,6 +199,9 @@ go_type_to_ffi (const struct __go_type_descriptor *descriptor) #ifdef __alpha__ runtime_throw(the libffi library does not support Complex64 type with reflect.Call or runtime.SetFinalizer); +#elif defined(__s390__) !defined(FFI_TYPE_COMPLEX) + runtime_throw(the libffi library does not support Complex64 type with + reflect.Call or runtime.SetFinalizer); #else if (sizeof (float) == 4) return go_complex_to_ffi (ffi_type_float); @@ -193,6 +211,9 @@ go_type_to_ffi (const struct __go_type_descriptor *descriptor) #ifdef __alpha__ runtime_throw(the libffi library does not support Complex128 type with reflect.Call or runtime.SetFinalizer); +#elif defined(__s390__) !defined(FFI_TYPE_COMPLEX) + runtime_throw(the libffi library does not support Complex128 type with + reflect.Call or runtime.SetFinalizer); #else if (sizeof (double) == 8) return go_complex_to_ffi 
(ffi_type_double); -- 1.8.4.2
[PATCH, x86][PIC] Making check for PIC register in address cost calculation only on RTL level
Hi! When adding checks for PIC register in address cost calculation (http://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00411.html) it was meant to affect only RTL passes. Since !pic_offset_table_rtx is not enough for it (I see that pic_offset_table_rtx enabled on GIMPLE level) following change explicitly adds this restriction. Bootstrapped and regtested with RUNTESTFLAGS=--target_board='unix{-m32,-fpic}' Is it ok for trunk? Thanks, Igor Changelog: 2014-12-12 Igor Zamyatin igor.zamya...@intel.com * config/i386/i386.c (ix86_address_cost): Add explicit restriction to RTL level for the check for PIC register. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index fffddfc..799411c 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -12802,12 +12802,14 @@ ix86_address_cost (rtx x, machine_mode, addr_space_t, bool) Therefore only pic_offset_table_rtx could be hoisted out, which is not profitable for x86. */ if (parts.base - (!pic_offset_table_rtx - || REGNO (pic_offset_table_rtx) != REGNO(parts.base)) + (current_pass-type == GIMPLE_PASS + || (!pic_offset_table_rtx + || REGNO (pic_offset_table_rtx) != REGNO(parts.base))) (!REG_P (parts.base) || REGNO (parts.base) = FIRST_PSEUDO_REGISTER) parts.index - (!pic_offset_table_rtx - || REGNO (pic_offset_table_rtx) != REGNO(parts.index)) + (current_pass-type == GIMPLE_PASS + || (!pic_offset_table_rtx + || REGNO (pic_offset_table_rtx) != REGNO(parts.index))) (!REG_P (parts.index) || REGNO (parts.index) = FIRST_PSEUDO_REGISTER) parts.base != parts.index) cost++;
[committed] Add two testcases for recently fixed PRs
Hi! I've committed these two testcases as obvious to trunk, they have been fixed with the PR63917 r218658 fix. 2014-12-12 Jakub Jelinek ja...@redhat.com PR rtl-optimization/64255 * gcc.c-torture/execute/pr64255.c: New test. PR rtl-optimization/64260 * gcc.c-torture/execute/pr64260.c: New test. --- gcc/testsuite/gcc.c-torture/execute/pr64255.c.jj2014-12-12 09:12:31.940647593 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr64255.c 2014-12-12 09:12:27.0 +0100 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/64255 */ + +__attribute__((noinline, noclone)) void +bar (long i, unsigned long j) +{ + if (i != 1 || j != 1) +__builtin_abort (); +} + +__attribute__((noinline, noclone)) void +foo (long i) +{ + unsigned long j; + + if (!i) +return; + j = i = 0 ? (unsigned long) i : - (unsigned long) i; + if ((i = 0 ? (unsigned long) i : - (unsigned long) i) != j) +__builtin_abort (); + bar (i, j); +} + +int +main () +{ + foo (1); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr64260.c.jj2014-12-12 09:11:27.847784037 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr64260.c 2014-12-12 09:11:19.0 +0100 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/64260 */ + +int a = 1, b; + +void +foo (char p) +{ + int t = 0; + for (; b 1; b++) +{ + int *s = a; + if (--t) + *s = p; + *s = 1; +} +} + +int +main () +{ + foo (0); + if (a != 0) +__builtin_abort (); + return 0; +} Jakub
[PATCH] Fix simplify_builtin_call forwprop ICE (PR tree-optimization/64269)
Hi! This testcase ICEs because I wasn't checking for overflow in the size computation. Only max (diff + len2, len1) <= 1024 cases are considered, so this patch gives up if either len2 or diff is > 1024. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9/4.8? 2014-12-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/64269 * tree-ssa-forwprop.c (simplify_builtin_call): Bail out if len2 or diff are too large. * gcc.c-torture/compile/pr64269.c: New test. --- gcc/tree-ssa-forwprop.c.jj 2014-12-01 14:57:30.0 +0100 +++ gcc/tree-ssa-forwprop.c 2014-12-12 09:46:05.790053928 +0100 @@ -1288,7 +1288,8 @@ simplify_builtin_call (gimple_stmt_itera use_operand_p use_p; if (!tree_fits_shwi_p (val2) - || !tree_fits_uhwi_p (len2)) + || !tree_fits_uhwi_p (len2) + || compare_tree_int (len2, 1024) == 1) break; if (is_gimple_call (stmt1)) { @@ -1354,7 +1355,8 @@ simplify_builtin_call (gimple_stmt_itera is not constant, or is bigger than memcpy length, bail out. */ if (diff == NULL || !tree_fits_uhwi_p (diff) - || tree_int_cst_lt (len1, diff)) + || tree_int_cst_lt (len1, diff) + || compare_tree_int (diff, 1024) == 1) break; /* Use maximum of difference plus memset length and memcpy length --- gcc/testsuite/gcc.c-torture/compile/pr64269.c.jj2014-12-12 09:47:04.795015479 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr64269.c 2014-12-12 09:46:51.0 +0100 @@ -0,0 +1,9 @@ +/* PR tree-optimization/64269 */ + +void +foo (char *p) +{ + __SIZE_TYPE__ s = ~(__SIZE_TYPE__)0; + *p = 0; + __builtin_memset (p + 1, 0, s); +} Jakub
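The reasoning behind the bail-out can be illustrated outside of GCC. The sketch below is an assumption-laden stand-in, not the forwprop code: `kMaxMergedLen` and `merged_length` are hypothetical names, and plain `std::size_t` values replace GCC's tree constants. The point it tries to show is that bounding each operand first means the later addition cannot wrap around.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical cap mirroring the 1024-byte limit mentioned in the patch.
constexpr std::size_t kMaxMergedLen = 1024;

// Computes max(diff + len2, len1) into `out` and returns true when the
// result is within the cap. Returns false when the caller should give up.
// Checking diff and len2 individually first guarantees that
// diff + len2 cannot overflow std::size_t.
inline bool merged_length(std::size_t len1, std::size_t diff,
                          std::size_t len2, std::size_t &out) {
    if (len2 > kMaxMergedLen || diff > kMaxMergedLen)
        return false;  // too large: bail out before adding
    std::size_t merged = std::max(diff + len2, len1);
    if (merged > kMaxMergedLen)
        return false;
    out = merged;
    return true;
}
```

With a length of `~(size_t)0`, as in the testcase, the first check rejects the input before any arithmetic is done on it.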
[committed, moxie] Add use of zex instruction
I've committed the following patch for the moxie port... 2014-12-12 Anthony Green gr...@moxielogic.com * config/moxie/moxie.md: Add use of zex instruction. Index: gcc/config/moxie/moxie.md === --- gcc/config/moxie/moxie.md (revision 218664) +++ gcc/config/moxie/moxie.md (working copy) @@ -241,10 +241,10 @@ (define_insn_and_split zero_extendqisi2 [(set (match_operand:SI 0 register_operand =r,r,r,r) - (zero_extend:SI (match_operand:QI 1 nonimmediate_operand 0,W,A,B)))] + (zero_extend:SI (match_operand:QI 1 nonimmediate_operand r,W,A,B)))] @ - ; + zex.b %0, %1 ld.b %0, %1 lda.b %0, %1 ldo.b %0, %1 @@ -254,14 +254,14 @@ { operands[2] = gen_lowpart (QImode, operands[0]); } - [(set_attr length 0,2,6,6)]) + [(set_attr length 2,2,6,6)]) (define_insn_and_split zero_extendhisi2 [(set (match_operand:SI 0 register_operand =r,r,r,r) - (zero_extend:SI (match_operand:HI 1 nonimmediate_operand 0,W,A,B)))] + (zero_extend:SI (match_operand:HI 1 nonimmediate_operand r,W,A,B)))] @ - ; + zex.s %0, %1 ld.s %0, %1 lda.s %0, %1 ldo.s %0, %1 @@ -271,7 +271,7 @@ { operands[2] = gen_lowpart (HImode, operands[0]); } - [(set_attr length 0,2,6,6)]) + [(set_attr length 2,2,6,6)]) (define_insn extendqisi2 [(set (match_operand:SI 0 register_operand =r)
Re: [PATCH] Fix simplify_builtin_call forwprop ICE (PR tree-optimization/64269)
On Fri, 12 Dec 2014, Jakub Jelinek wrote: Hi! This testcase ICEs because I wasn't checking for overflow in the size computation. Only max (diff + len2, len1) = 1024 cases are considered, so this patch gives up if either len2 or diff is 1024. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9/4.8? Ok. Thanks, Richard. 2014-12-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/64269 * tree-ssa-forwprop.c (simplify_builtin_call): Bail out if len2 or diff are too large. * gcc.c-torture/compile/pr64269.c: New test. --- gcc/tree-ssa-forwprop.c.jj2014-12-01 14:57:30.0 +0100 +++ gcc/tree-ssa-forwprop.c 2014-12-12 09:46:05.790053928 +0100 @@ -1288,7 +1288,8 @@ simplify_builtin_call (gimple_stmt_itera use_operand_p use_p; if (!tree_fits_shwi_p (val2) - || !tree_fits_uhwi_p (len2)) + || !tree_fits_uhwi_p (len2) + || compare_tree_int (len2, 1024) == 1) break; if (is_gimple_call (stmt1)) { @@ -1354,7 +1355,8 @@ simplify_builtin_call (gimple_stmt_itera is not constant, or is bigger than memcpy length, bail out. */ if (diff == NULL || !tree_fits_uhwi_p (diff) - || tree_int_cst_lt (len1, diff)) + || tree_int_cst_lt (len1, diff) + || compare_tree_int (diff, 1024) == 1) break; /* Use maximum of difference plus memset length and memcpy length --- gcc/testsuite/gcc.c-torture/compile/pr64269.c.jj 2014-12-12 09:47:04.795015479 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr64269.c 2014-12-12 09:46:51.0 +0100 @@ -0,0 +1,9 @@ +/* PR tree-optimization/64269 */ + +void +foo (char *p) +{ + __SIZE_TYPE__ s = ~(__SIZE_TYPE__)0; + *p = 0; + __builtin_memset (p + 1, 0, s); +} Jakub -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
[patch] LWG DR 2285: std::make_reverse_iterator
This is a trivial addition, implementing http://cplusplus.github.io/LWG/lwg-defects.html#2285 I've also added the __cpp_lib_tuple_element_t macro missing from the commit that implemented that. Tested x86_64-linux, committed to trunk. commit 363e23c37b089661c7533476cc7b3e5b05f20f9f Author: Jonathan Wakely jwak...@redhat.com Date: Fri Dec 12 13:12:23 2014 + * include/bits/stl_iterator.h (make_reverse_iterator): LWG DR 2285. * include/std/tuple: Add feature-test macro. * testsuite/24_iterators/reverse_iterator/make.cc: New. diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h index f4522a4..15014d5 100644 --- a/libstdc++-v3/include/bits/stl_iterator.h +++ b/libstdc++-v3/include/bits/stl_iterator.h @@ -388,6 +388,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return __y.base() - __x.base(); } //@} +#if __cplusplus 201103L +#define __cpp_lib_make_reverse_iterator 201402 + + // _GLIBCXX_RESOLVE_LIB_DEFECTS + // DR 2285. make_reverse_iterator + /// Generator function for reverse_iterator. + templatetypename _Iterator +inline reverse_iterator_Iterator +make_reverse_iterator(_Iterator __i) +{ return reverse_iterator_Iterator(__i); } +#endif + // 24.4.2.2.1 back_insert_iterator /** * @brief Turns assignment into insertion. 
diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple index 6be7f23..01ab5fe 100644 --- a/libstdc++-v3/include/std/tuple +++ b/libstdc++-v3/include/std/tuple @@ -687,6 +687,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; #if __cplusplus 201103L +#define __cpp_lib_tuple_element_t 201402 + templatestd::size_t __i, typename _Tp using tuple_element_t = typename tuple_element__i, _Tp::type; #endif diff --git a/libstdc++-v3/testsuite/24_iterators/reverse_iterator/make.cc b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/make.cc new file mode 100644 index 000..a0f70de --- /dev/null +++ b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/make.cc @@ -0,0 +1,35 @@ +// Copyright (C) 2014 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +// { dg-options -std=gnu++14 } + +#include iterator +#include testsuite_hooks.h + +void +test01() +{ + int a[2]{ 1, 2 }; + auto b = std::make_reverse_iterator(a); + VERIFY( b == std::reverse_iteratorint*(a) ); +} + +int +main() +{ + test01(); +}
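As a usage illustration of the new generator function (requires C++14 or later), here is a small sketch. The helper names `last_element` and `reversed_copy` are made up for this example and are not part of the patch or of libstdc++.

```cpp
#include <iterator>
#include <vector>

// Returns the last element of a non-empty vector.
// Precondition (assumed by this sketch): v is not empty.
inline int last_element(const std::vector<int> &v) {
    // make_reverse_iterator deduces the iterator type, so there is no
    // need to spell out std::reverse_iterator<std::vector<int>::const_iterator>.
    auto rbegin = std::make_reverse_iterator(v.end());
    return *rbegin;
}

// Builds a reversed copy by iterating from end() back to begin().
inline std::vector<int> reversed_copy(const std::vector<int> &v) {
    return std::vector<int>(std::make_reverse_iterator(v.end()),
                            std::make_reverse_iterator(v.begin()));
}
```

This mirrors what `rbegin()`/`rend()` give for containers; the generator is mostly useful with iterators that do not come from a container member function.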
[PATCH] Fix TYPE_OVERFLOW_* cleanup fallout
On ARM we ICE on fixed-point-exec.c testcase because TYPE_OVERFLOW_WRAPS was missing the ANY_INTEGRAL_TYPE_P check. Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk? 2014-12-12 Marek Polacek pola...@redhat.com PR middle-end/64274 * fold-const.c (fold_binary_loc): Add ANY_INTEGRAL_TYPE_P check. diff --git gcc/fold-const.c gcc/fold-const.c index ec5ad98..d71fa94 100644 --- gcc/fold-const.c +++ gcc/fold-const.c @@ -10082,7 +10082,8 @@ fold_binary_loc (location_t loc, /* Reassociate (plus (plus (mult) (foo)) (mult)) as (plus (plus (mult) (mult)) (foo)) so that we can take advantage of the factoring cases below. */ - if (TYPE_OVERFLOW_WRAPS (type) + if (ANY_INTEGRAL_TYPE_P (type) + && TYPE_OVERFLOW_WRAPS (type) && (((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR) && TREE_CODE (arg1) == MULT_EXPR) Marek
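The shape of this fix, checking the type category before asking an integer-only question, can be sketched in isolation. Everything below is hypothetical: `Kind`, `Type`, `any_integral`, `overflow_wraps` and `can_reassociate` are illustrative stand-ins, and the assumption that the overflow query asserts on non-integral types is inferred from the ICE described above rather than taken from the GCC sources.

```cpp
#include <cassert>

// Hypothetical type categories; GCC's real tree codes are far richer.
enum class Kind { Integer, FixedPoint, Float };

struct Type {
    Kind kind;
    bool is_unsigned;
};

// Stand-in for the ANY_INTEGRAL_TYPE_P style of predicate.
inline bool any_integral(const Type &t) {
    return t.kind == Kind::Integer;
}

// Stand-in for an overflow query that is only defined for integral
// types and aborts otherwise (assumed behaviour for this sketch).
inline bool overflow_wraps(const Type &t) {
    assert(any_integral(t));
    return t.is_unsigned;
}

// The guarded form: short-circuit evaluation keeps the integer-only
// query from ever seeing a fixed-point or floating type.
inline bool can_reassociate(const Type &t) {
    return any_integral(t) && overflow_wraps(t);
}
```

Calling `can_reassociate` on a fixed-point type simply returns false instead of tripping the assertion.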
Re: [PATCH] Fix TYPE_OVERFLOW_* cleanup fallout
On Fri, 12 Dec 2014, Marek Polacek wrote: On ARM we ICE on fixed-point-exec.c testcase because TYPE_OVERFLOW_WRAPS was missing the ANY_INTEGRAL_TYPE_P check. Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk? Ok. Thanks, Richard. 2014-12-12 Marek Polacek pola...@redhat.com PR middle-end/64274 * fold-const.c (fold_binary_loc): Add ANY_INTEGRAL_TYPE_P check. diff --git gcc/fold-const.c gcc/fold-const.c index ec5ad98..d71fa94 100644 --- gcc/fold-const.c +++ gcc/fold-const.c @@ -10082,7 +10082,8 @@ fold_binary_loc (location_t loc, /* Reassociate (plus (plus (mult) (foo)) (mult)) as (plus (plus (mult) (mult)) (foo)) so that we can take advantage of the factoring cases below. */ - if (TYPE_OVERFLOW_WRAPS (type) + if (ANY_INTEGRAL_TYPE_P (type) + && TYPE_OVERFLOW_WRAPS (type) && (((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR) && TREE_CODE (arg1) == MULT_EXPR) Marek -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH][ARM] Make issue rate part of per-core tuning structs
Ping (after the macro fusion patch)... https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02706.html Thanks, Kyrill On 20/11/14 16:48, Kyrill Tkachov wrote: I should say that the patch context depends on the macro fusion hook implementation posted here: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00958.html Kyrill On 20/11/14 16:43, Kyrill Tkachov wrote: Hi all, This patch makes the arm_issue_rate function look up the issue rate of the process from the tuning structs. This makes it look more like the aarch64 mechanism and centralises a processor-specific construct in the tuning structs, so that we no longer have to remember to update the arm_issue_rate function every time a new core is added. A new tuning struct is added for the marvell-pj4 in order to decouple it from the 9e tuning struct and enable us to set its correct issue rate to 2. Bootstrapped and tested on arm-none-gnueabihf. Ok for trunk? Thanks, Kyrill 2014-11-19 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/arm-protos.h (struct tune_params): Add issue_rate field. * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value. (arm_issue_rate): Look up issue rate from tuning structs. Remove large switch statement. (arm_marvell_pj4_tune): New struct. * config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune struct.
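The design described in this message, storing the issue rate in a per-core record rather than in a switch statement, can be sketched roughly as follows. `TuneParams`, `kTunings`, `find_tuning` and the core names are hypothetical and the values are illustrative only; the real tune_params struct in the ARM backend has many more fields.

```cpp
#include <cstring>

// Hypothetical, heavily simplified tuning record.
struct TuneParams {
    const char *core_name;
    int issue_rate;  // instructions the core can issue per cycle
};

// Illustrative entries only; not taken from the patch.
static const TuneParams kTunings[] = {
    {"example-single-issue", 1},
    {"example-dual-issue", 2},
};

// The lookup becomes a plain field read on the selected record, so adding
// a core only needs a new table entry, not a new switch case.
inline int issue_rate(const TuneParams &tune) {
    return tune.issue_rate;
}

// Small helper to find a record by name; returns nullptr if unknown.
inline const TuneParams *find_tuning(const char *name) {
    for (const TuneParams &t : kTunings)
        if (std::strcmp(t.core_name, name) == 0)
            return &t;
    return nullptr;
}
```

The trade-off is that every tuning record must now specify the field, which is why the ChangeLog touches each existing struct.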
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On Thu, Dec 11, 2014 at 07:51:44PM +1030, Alan Modra wrote: I was worried about exactly the same problem on powerpc with r11 being used for the static chain and also destroyed in linkage stubs. It turns out we don't traverse any linkage stubs. See https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00446.html. I've written a small test suite that tests reflection calls over module boundaries (see attachment). Build with make and then just run ./main. The program must not crash; it does not check consistency of the function arguments. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany closure_reflect_tests.tgz Description: application/gtar-compressed
Re: [PATCH] Fix TYPE_OVERFLOW_* cleanup fallout
On 12/12/14 13:48, Marek Polacek wrote: On ARM we ICE on fixed-point-exec.c testcase because TYPE_OVERFLOW_WRAPS was missing the ANY_INTEGRAL_TYPE_P check. Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk? Also arm-none-linux-gnueabihf bootstrap is in late stage3 and looks fine. Thanks, Kyril 2014-12-12 Marek Polacek pola...@redhat.com PR middle-end/64274 * fold-const.c (fold_binary_loc): Add ANY_INTEGRAL_TYPE_P check. diff --git gcc/fold-const.c gcc/fold-const.c index ec5ad98..d71fa94 100644 --- gcc/fold-const.c +++ gcc/fold-const.c @@ -10082,7 +10082,8 @@ fold_binary_loc (location_t loc, /* Reassociate (plus (plus (mult) (foo)) (mult)) as (plus (plus (mult) (mult)) (foo)) so that we can take advantage of the factoring cases below. */ - if (TYPE_OVERFLOW_WRAPS (type) + if (ANY_INTEGRAL_TYPE_P (type) + && TYPE_OVERFLOW_WRAPS (type) && (((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR) && TREE_CODE (arg1) == MULT_EXPR) Marek
Re: [PATCH 0/4] [AARCH64,SIMD] PR63870 Improve error messages for single lane load/store
On 10 December 2014 at 10:34, Alan Lawrence alan.lawre...@arm.com wrote:

Thanks, Charles. A couple of thoughts.

I think the approach in patches 2+3+4 of using __builtin_aarch64_im_lane_boundsi is justified and works quite neatly. Modulo the question of argument ordering and __AARCH64_LANE_CHECK, those patches look good.

However, the SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX seems a lot of infrastructure to introduce if we are only going to use it in one place, and I think I might argue in favour of using ...__im_lane_bound or AARCH64_LANE_CHECK there also.

Of course all of this palaver stems from using the same builtins for both D- and Q-reg intrinsics, and I suspect some cleanup may be due to those intrinsics *at some point*, but probably not in time for gcc 5.0.

However, this does mean that if I use a D-reg intrinsic with a lane index that's out of bounds for the Q-reg too, I get a double error message: e.g. for testcase

int8x8x4_t
f_vld4_lane (int8_t * p, int8x8x4_t v)
{
  int8x8x4_t res;
  return vld4_lane_s8 (p, v, 18);
}

I get output:

In file included from gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c:5:0:
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h: In function 'f_vld4_lane':
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h:18123:1: error: lane 18 out of range 0 - 7
 __LD4_LANE_FUNC (int8x8x4_t, int8x8_t, int8x16x4_t, int8_t, v16qi, qi, s8,
 ^
In function 'vld4_lane_s8',
    inlined from 'f_vld4_lane' at gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c:12:7:
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h:18123:1: error: lane 18 out of range 0 - 15
 __LD4_LANE_FUNC (int8x8x4_t, int8x8_t, int8x16x4_t, int8_t, v16qi, qi, s8,
 ^

which (although not serious) could be mildly confusing.

Oh dear, this is rather sad. Aesthetically, I think the builtins should protect themselves from direct misuse, but I can't think of a clean way to prevent this. It could be done like this, but I don't think the end result really justifies it.

  __o = __builtin_aarch64_ld4_lane##mode (
          (__builtin_aarch64_simd_##ptrmode *) __ptr, __o,
          __c & (__NUMBER_OF_LANES(__b.val[0]) - 1));
[hsa] HSA: function comments are added.
Hello.

In this small patch I add missing comments.

Thanks,
Martin

gcc/ChangeLog:

2014-12-12  Martin Liska  mli...@suse.cz

        * hsa-brig.c: Function comments are added.
        * hsa-gen.c: Likewise.
---
 gcc/hsa-brig.c | 5 +
 gcc/hsa-gen.c | 19 ++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 3e6ed68..45972b6 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -917,6 +917,8 @@ emit_code_ref_operand (hsa_op_code_ref *ref)
   brig_operand.add (&out, sizeof (out));
 }
 
+/* Emit a code list operand CODE_LIST. */
+
 static void
 emit_code_list_operand (hsa_op_code_list *code_list)
 {
@@ -1288,6 +1290,9 @@ emit_arg_block (bool is_start)
   brig_insn_count++;
 }
 
+/* Emit call instruction INSN, where this instruction must be closed
+   within a call block instruction. */
+
 static void
 emit_call_insn (hsa_insn_basic *insn)
 {
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ac51eb0..b59a0b5 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -623,6 +623,9 @@ hsa_alloc_addr_op (hsa_symbol *sym, hsa_op_reg *reg, HOST_WIDE_INT offset)
   return addr;
 }
 
+/* Allocate and set up a new code list operands with given number
+   of ELEMENTS. */
+
 static hsa_op_code_list *
 hsa_alloc_code_list_op (unsigned elements)
 {
@@ -1092,6 +1095,10 @@ gen_hsa_addr (tree ref, hsa_bb *hbb, vec <hsa_op_reg_p> ssa_map)
   return hsa_alloc_addr_op (symbol, reg, offset);
 }
 
+/* Generate HSA address for a function call argument of given TYPE.
+   INDEX is used to generate corresponding name of the arguments.
+   Special value -1 represents fact that result value is created. */
+
 static hsa_op_address *
 gen_hsa_addr_for_arg (tree tree_type, int index)
 {
@@ -1663,7 +1670,7 @@ gen_hsa_insns_for_operation_assignment (gimple assign, hsa_bb *hbb,
   hsa_append_insn (hbb, insn);
 }
 
-/* Generate HSA instructions for a given gimple condition statemet COND.
+/* Generate HSA instructions for a given gimple condition statement COND.
    Instructions will be apended to HBB, which also needs to be the
    corresponding structure to the basic_block of COND. SSA_MAP maps gimple SSA
    names to HSA pseudo registers. */
@@ -1685,6 +1692,11 @@ gen_hsa_insns_for_cond_stmt (gimple cond, hsa_bb *hbb,
   hsa_append_insn (hbb, cbr);
 }
 
+/* Generate HSA instructions for a direct call isntruction.
+   Instructions will be apended to HBB, which also needs to be the
+   corresponding structure to the basic_block of STMT. SSA_MAP maps gimple SSA
+   names to HSA pseudo registers. */
+
 static void
 gen_hsa_insns_for_direct_call (gimple stmt, hsa_bb *hbb,
                                vec <hsa_op_reg_p> ssa_map)
@@ -1759,6 +1771,11 @@ gen_hsa_insns_for_direct_call (gimple stmt, hsa_bb *hbb,
   hsa_append_insn (hbb, call_block_insn);
 }
 
+/* Generate HSA instructions for a return value isntruction.
+   Instructions will be apended to HBB, which also needs to be the
+   corresponding structure to the basic_block of STMT. SSA_MAP maps gimple SSA
+   names to HSA pseudo registers. */
+
 static void
 gen_hsa_insns_for_return (gimple stmt, hsa_bb *hbb,
                           vec <hsa_op_reg_p> ssa_map)
--
2.1.2
[hsa] HSA: memory leaks are fixed.
Hello.

Attached patch removes all memory leaks which come from HSA-related source files.

Thanks,
Martin

gcc/ChangeLog:

2014-12-05  Martin Liska  mli...@suse.cz

        * hsa-brig.c (brig_string_slot_hasher::remove): Memory free is added.
        * hsa-gen.c (hsa_deinit_data_for_cfun): Destructors are called for
        operands and instructions that need to deallocate a data.
        (hsa_alloc_reg_op): Object is added to list of items that are destructed.
        (hsa_alloc_code_list_op): Likewise.
        (hsa_alloc_call_insn): Likewise.
        (hsa_alloc_call_block_insn): Likewise.
        (wrap_hsa): Products of asprintf are freed.
        * hsa.h (struct hsa_op_reg): New destructor.
        (struct hsa_op_code_list): Likewise.
        (struct hsa_insn_call): Likewise.
        (struct hsa_insn_call_block): Likewise.
---
 gcc/hsa-brig.c | 1 +
 gcc/hsa-gen.c | 32 +++-
 gcc/hsa.h | 25 -
 3 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 61eadf9..3e6ed68 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -275,6 +275,7 @@ inline void
 brig_string_slot_hasher::remove (value_type *ds)
 {
   free (const_cast<char*> (ds->s));
+  free (ds);
 }
 
 static hash_table<brig_string_slot_hasher> *brig_string_htab;
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 5029d26..ac51eb0 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -84,6 +84,13 @@ static alloc_pool hsa_allocp_inst_call_block;
 static alloc_pool hsa_allocp_bb;
 static alloc_pool hsa_allocp_symbols;
 
+/* Vectors with selected instructions and operands that need
+   a destruction. */
+static vec <hsa_op_code_list *> hsa_list_operand_code_list;
+static vec <hsa_op_reg *> hsa_list_operand_reg;
+static vec <hsa_insn_call_block *> hsa_list_insn_call_block;
+static vec <hsa_insn_call *> hsa_list_insn_call;
+
 /* Hash function to lookup a symbol for a decl. */
 hash_table <hsa_free_symbol_hasher> *hsa_global_variable_symbols;
 
@@ -194,6 +201,23 @@ hsa_deinit_data_for_cfun (void)
   FOR_EACH_BB_FN (bb, cfun)
     bb->aux = NULL;
 
+  for (unsigned int i = 0; i < hsa_list_operand_code_list.length (); i++)
+    hsa_list_operand_code_list[i]->~hsa_op_code_list ();
+
+  for (unsigned int i = 0; i < hsa_list_operand_reg.length (); i++)
+    hsa_list_operand_reg[i]->~hsa_op_reg ();
+
+  for (unsigned int i = 0; i < hsa_list_insn_call_block.length (); i++)
+    hsa_list_insn_call_block[i]->~hsa_insn_call_block ();
+
+  for (unsigned int i = 0; i < hsa_list_insn_call.length (); i++)
+    hsa_list_insn_call[i]->~hsa_insn_call ();
+
+  hsa_list_operand_code_list.release ();
+  hsa_list_operand_reg.release ();
+  hsa_list_insn_call_block.release ();
+  hsa_list_insn_call.release ();
+
   free_alloc_pool (hsa_allocp_operand_address);
   free_alloc_pool (hsa_allocp_operand_immed);
   free_alloc_pool (hsa_allocp_operand_reg);
@@ -572,11 +596,11 @@ hsa_alloc_reg_op (void)
   hsa_op_reg *hreg;
 
   hreg = (hsa_op_reg *) pool_alloc (hsa_allocp_operand_reg);
+  hsa_list_operand_reg.safe_push (hreg);
   memset (hreg, 0, sizeof (hsa_op_reg));
   hreg->kind = BRIG_KIND_OPERAND_REG;
   /* TODO: Try removing later on. I suppose this is not necessary but I'd
      rather avoid surprises. */
-  hreg->uses = vNULL;
   hreg->order = hsa_cfun.reg_count++;
   return hreg;
 }
@@ -604,6 +628,7 @@ hsa_alloc_code_list_op (unsigned elements)
 {
   hsa_op_code_list *list;
   list = (hsa_op_code_list *) pool_alloc (hsa_allocp_operand_code_list);
+  hsa_list_operand_code_list.safe_push (list);
 
   memset (list, 0, sizeof (hsa_op_code_list));
   list->kind = BRIG_KIND_OPERAND_CODE_LIST;
@@ -748,6 +773,7 @@ hsa_alloc_call_insn (void)
   hsa_insn_call *call;
 
   call = (hsa_insn_call *) pool_alloc (hsa_allocp_inst_call);
+  hsa_list_insn_call.safe_push (call);
   memset (call, 0, sizeof (hsa_insn_call));
   return call;
 }
@@ -760,6 +786,7 @@ hsa_alloc_call_block_insn (void)
   hsa_insn_call_block *call_block;
 
   call_block = (hsa_insn_call_block *) pool_alloc (hsa_allocp_inst_call_block);
+  hsa_list_insn_call_block.safe_push (call_block);
 
   memset (call_block, 0, sizeof (hsa_insn_call_block));
   call_block->opcode = HSA_OPCODE_CALL_BLOCK;
@@ -2332,7 +2359,9 @@ wrap_hsa (void)
               strcpy (extension, "\0");
               asprintf (&extension, "%s", ".o\0");
               strcat (filename, extension);
+              free (extension);
               str = build_string_literal (strlen(filename)+1,filename);
+              free (filename);
             }
         }
       CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, str);
@@ -2345,6 +2374,7 @@ wrap_hsa (void)
       sanitize_hsa_name (tmpname + 1);
 
       str = build_string_literal (slen + 2, tmpname);
+      free (tmpname);
       CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, str);
       int discard_arguents;
       int num_args = gimple_call_num_args (call_stmt);
diff --git
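
For readers who want the idea without the GCC plumbing: the hsa-gen.c hunks keep a side list of every pool-allocated object that owns heap memory and call its destructor explicitly before the pool is freed. Below is a minimal standalone sketch of that pattern. It is not the HSA code: `Operand`, `Pool`, `live_count` and `leaked_after_release` are hypothetical names, `std::malloc` and `std::vector` stand in for GCC's alloc_pool and vec, and placement new is used where the patch uses memset.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for an operand that owns heap memory.
struct Operand
{
  static int live_count;   // constructed but not yet destroyed objects
  std::vector<int> uses;   // owns heap storage, so the destructor must run
  Operand () { ++live_count; }
  ~Operand () { --live_count; }
};
int Operand::live_count = 0;

// Hypothetical pool: hands out raw storage and never runs destructors itself.
struct Pool
{
  std::vector<void *> blocks;    // raw storage handed out so far
  std::vector<Operand *> live;   // objects that need explicit destruction

  Operand *alloc ()
  {
    void *raw = std::malloc (sizeof (Operand));
    assert (raw != nullptr);
    blocks.push_back (raw);
    Operand *op = new (raw) Operand ();   // construct inside pool storage
    live.push_back (op);                  // remember it for teardown
    return op;
  }

  void release ()
  {
    for (Operand *op : live)
      op->~Operand ();                    // explicit destructor call
    live.clear ();
    for (void *raw : blocks)
      std::free (raw);                    // only now return the storage
    blocks.clear ();
  }
};

// Allocate N operands, tear the pool down, report how many objects remain.
int leaked_after_release (int n)
{
  Pool pool;
  for (int i = 0; i < n; i++)
    pool.alloc ()->uses.push_back (i);
  pool.release ();
  return Operand::live_count;
}
```

Dropping the destructor loop from `release` would leave the `uses` buffers unreachable, which is the kind of leak the patch addresses.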
[hsa] HSA: support for direct function call is introduced.
Hello.

Following patch introduces support for direct call instructions for HSAIL.

Thanks,
Martin

gcc/c-family/ChangeLog:

2014-12-05  Martin Liska  mli...@suse.cz

        * c-common.c: New 'hsafunc' attribute is added.

gcc/ChangeLog:

2014-12-05  Martin Liska  mli...@suse.cz

        * hsa-brig.c (struct function_linkage_pair): New data structure.
        (hsa_brig_section::get_ptr_by_offset): New function.
        (emit_directive_variable): Linkage is retrieved by symbol.
        (emit_function_directives): Emitted function is added to map with offsets.
        (enqueue_op): New operand type handling added.
        (emit_code_ref_operand): Created from emit_label_operand.
        (emit_code_list_operand): New function.
        (emit_queued_operands): New operand type handling added.
        (emit_segment_insn): BRIG_KIND_INST_SEG is changed to BRIG_KIND_INST_SEG_CVT.
        (emit_cvt_insn): Undefined behavior fixed by wrong array bounds.
        (emit_arg_block): New function.
        (emit_call_insn): Likewise.
        (emit_call_block_insn): Likewise.
        (emit_insn): New instructions are handled.
        (hsa_output_brig): Function offsets for call instructions are resolved.
        * hsa-dump.c (static void indent_stream): New function.
        (dump_hsa_insn): Added support for call instruction.
        * hsa-gen.c (hsa_init_data_for_cfun): New flag for hsa_cfun is parsed.
        (hsa_deinit_data_for_cfun): New pools are deallocated.
        (get_symbol_for_decl): Symbol's linkage is set up.
        (hsa_get_spill_symbol): Likewise.
        (hsa_alloc_code_list_op): New function.
        (hsa_alloc_call_insn): Likewise.
        (hsa_alloc_call_block_insn): Likewise.
        (gen_hsa_addr_for_arg): Likewise.
        (gen_hsa_insns_for_direct_call): Likewise.
        (gen_hsa_insns_for_return): Likewise.
        (gen_hsa_insns_for_call): Likewise.
        (gen_hsa_insns_for_gimple_stmt): GIMPLE labels with non-taken address are supported.
        (gen_function_parameters): Linkage condition is introduced.
        (generate_hsa): kern_p flag is parsed.
        (wrap_hsa): Likewise.
        (pass_gen_hsail::execute): Likewise.
        (struct hsa_op_reg::verify): New function.
        * hsa.h (struct hsa_symbol): Linkage member is added.
        (struct hsa_op_code_ref): Created from existing hsa_op_label_ref.
        (struct hsa_op_code_list): New operand is added.
        (struct hsa_insn_call): New instruction.
        (struct hsa_insn_call_block): Likewise.
        (struct hsa_function_representation): kern_p attribute is introduced.
        (struct hsa_op_reg::verify): New function.
---
 gcc/c-family/c-common.c | 2 +
 gcc/hsa-brig.c | 222 +
 gcc/hsa-dump.c | 47 +-
 gcc/hsa-gen.c | 234 +---
 gcc/hsa.h | 114 +--
 5 files changed, 583 insertions(+), 36 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7e348d3..74024a8 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -660,6 +660,8 @@ const struct attribute_spec c_common_attribute_table[] =
                               handle_hsa_attribute, false },
   { "hsakernel",              0, 0, true, false, false,
                               handle_hsa_attribute, false },
+  { "hsafunc",                0, 0, true, false, false,
+                              handle_hsa_attribute, false },
   { "leaf",                   0, 0, true, false, false,
                               handle_leaf_attribute, false },
   { "always_inline",          0, 0, true, false, false,
diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 13b4aaa..61eadf9 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see
 #include "vec.h"
 #include "gimple-pretty-print.h"
 #include "diagnostic-core.h"
+#include "hash-map.h"
 
 #define BRIG_SECTION_DATA_NAME    "hsa_data"
 #define BRIG_SECTION_CODE_NAME    "hsa_code"
@@ -80,12 +81,31 @@ public:
   void output ();
   unsigned add (const void *data, unsigned len);
   void round_size_up (int factor);
+  void *get_ptr_by_offset (unsigned int offset);
 };
 
 static struct hsa_brig_section brig_data, brig_code, brig_operand;
 static uint32_t brig_insn_count;
 static bool brig_initialized = false;
 
+/* Mapping between emitted HSA functions and their offset in code segment. */
+static hash_map <tree, BrigCodeOffset32_t> function_offsets;
+
+struct function_linkage_pair
+{
+  function_linkage_pair (tree decl, unsigned int off):
+    function_decl (decl), offset (off) {}
+
+  /* Declaration of called function. */
+  tree function_decl;
+
+  /* Offset in operand section. */
+  unsigned int offset;
+};
+
+/* Vector of function calls where we need to resolve function offsets. */
+static auto_vec <function_linkage_pair> function_call_linkage;
+
 /* Add a new
[PATCH][AArch64] Generalize code alignment
This patch generalizes the code alignment and lets each CPU set function, jump and loop alignment independently. The defaults for A53/A57 are based on the original patch by James Greenhalgh.

OK for trunk?

ChangeLog:

2014-12-13  Wilco Dijkstra  wdijk...@arm.com

        * gcc/config/aarch64/aarch64-protos.h (tune-params): Add code alignment tuning parameters.
        * gcc/config/aarch64/aarch64.c (generic_tunings): Add code alignment tuning parameters.
        (cortexa53_tunings): Likewise.
        (cortexa57_tunings): Likewise.
        (thunderx_tunings): Likewise.
        (aarch64_override_options): Use new alignment tunings.
---
 gcc/config/aarch64/aarch64-protos.h | 4 +++-
 gcc/config/aarch64/aarch64.c | 22 +++---
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 234efcb..f22573b 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -170,8 +170,10 @@ struct tune_params
   const struct cpu_vector_cost *const vec_costs;
   const int memmov_cost;
   const int issue_rate;
-  const int align;
   const unsigned int fuseable_ops;
+  const int function_align;
+  const int jump_align;
+  const int loop_align;
   const int int_reassoc_width;
   const int fp_reassoc_width;
   const int vec_reassoc_width;
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5f51b97..db42164 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -323,8 +323,10 @@ static const struct tune_params generic_tunings =
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
   NAMED_PARAM (issue_rate, 2),
-  NAMED_PARAM (align, 4),
   NAMED_PARAM (fuseable_ops, AARCH64_FUSE_NOTHING),
+  8, /* function_align. */
+  8, /* jump_align. */
+  4, /* loop_align. */
   2, /* int_reassoc_width. */
   4, /* fp_reassoc_width. */
   1  /* vec_reassoc_width. */
@@ -338,9 +340,11 @@ static const struct tune_params cortexa53_tunings =
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
   NAMED_PARAM (issue_rate, 2),
-  NAMED_PARAM (align, 8),
   NAMED_PARAM (fuseable_ops, (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
                               | AARCH64_FUSE_MOVK_MOVK | AARCH64_FUSE_ADRP_LDR)),
+  8, /* function_align. */
+  8, /* jump_align. */
+  4, /* loop_align. */
   2, /* int_reassoc_width. */
   4, /* fp_reassoc_width. */
   1  /* vec_reassoc_width. */
@@ -354,8 +358,10 @@ static const struct tune_params cortexa57_tunings =
   &cortexa57_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
   NAMED_PARAM (issue_rate, 3),
-  NAMED_PARAM (align, 8),
   NAMED_PARAM (fuseable_ops, (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD | AARCH64_FUSE_MOVK_MOVK)),
+  16, /* function_align. */
+  8, /* jump_align. */
+  4, /* loop_align. */
   2, /* int_reassoc_width. */
   4, /* fp_reassoc_width. */
   1  /* vec_reassoc_width. */
@@ -369,8 +375,10 @@ static const struct tune_params thunderx_tunings =
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 6),
   NAMED_PARAM (issue_rate, 2),
-  NAMED_PARAM (align, 8),
   NAMED_PARAM (fuseable_ops, AARCH64_FUSE_CMP_BRANCH),
+  8, /* function_align. */
+  8, /* jump_align. */
+  8, /* loop_align. */
   2, /* int_reassoc_width. */
   4, /* fp_reassoc_width. */
   1  /* vec_reassoc_width. */
@@ -6773,11 +6781,11 @@ aarch64_override_options (void)
   if (!optimize_size)
     {
       if (align_loops <= 0)
-        align_loops = aarch64_tune_params->align;
+        align_loops = aarch64_tune_params->loop_align;
       if (align_jumps <= 0)
-        align_jumps = aarch64_tune_params->align;
+        align_jumps = aarch64_tune_params->jump_align;
       if (align_functions <= 0)
-        align_functions = aarch64_tune_params->align;
+        align_functions = aarch64_tune_params->function_align;
     }
 
   aarch64_override_options_after_change ();
--
1.9.1
[patch] libstdc++/64241 fix std::make_exception_ptr() for -fno-exceptions
It's not clear to me that std::make_exception_ptr() is actually useful with -fno-exceptions, but we might as well return something well-defined rather than garbage that might crash the program.

Tested x86_64-linux, committed to trunk.

commit ab57ab82fb2f5eec3bbb081f63d7a53de26bc7c8
Author: Jonathan Wakely jwak...@redhat.com
Date:   Fri Dec 12 14:33:04 2014 +

    PR libstdc++/64241
    * libsupc++/exception_ptr.h: Return empty object when exceptions are disabled.
    * testsuite/18_support/exception_ptr/64241.cc: New.

diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 9ba0de4..8b27359 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -168,16 +168,18 @@ namespace std
     exception_ptr
     make_exception_ptr(_Ex __ex) _GLIBCXX_USE_NOEXCEPT
     {
-      __try
-        {
 #ifdef __EXCEPTIONS
+      try
+        {
           throw __ex;
-#endif
         }
-      __catch(...)
+      catch(...)
         {
           return current_exception();
         }
+#else
+      return exception_ptr();
+#endif
     }
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
diff --git a/libstdc++-v3/testsuite/18_support/exception_ptr/64241.cc b/libstdc++-v3/testsuite/18_support/exception_ptr/64241.cc
new file mode 100644
index 000..c7e1433
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/exception_ptr/64241.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// http://www.gnu.org/licenses/.
+
+// { dg-options "-std=gnu++11 -fno-exceptions -O0" }
+
+#include <exception>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  bool test __attribute__((unused)) = true;
+  {
+    // Put some non-zero bytes on the stack
+    void* p __attribute__((unused)) = &test;
+  }
+  std::exception_ptr p = std::make_exception_ptr(1);
+  VERIFY( p == nullptr );
+}
+
+int
+main()
+{
+  test01();
+}
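
To make the intended behaviour concrete outside the library: a default-constructed std::exception_ptr is the null value, so after the change a -fno-exceptions build gets an object that compares equal to nullptr. The sketch below mirrors the patched control flow in user code. `make_ptr_sketch`, `exceptions_enabled` and `demo` are hypothetical names, not part of libstdc++, and the code assumes the compiler defines `__EXCEPTIONS` when exceptions are enabled, as GCC does.

```cpp
#include <cassert>
#include <exception>

// Hypothetical user-level mirror of the patched make_exception_ptr.
template <typename Ex>
std::exception_ptr make_ptr_sketch (Ex ex) noexcept
{
#ifdef __EXCEPTIONS
  try
    {
      throw ex;
    }
  catch (...)
    {
      return std::current_exception ();
    }
#else
  (void) ex;
  return std::exception_ptr ();   // null: compares equal to nullptr
#endif
}

// True when the build has exceptions enabled (GCC-style macro, an assumption).
bool exceptions_enabled ()
{
#ifdef __EXCEPTIONS
  return true;
#else
  return false;
#endif
}

// Usage: the result is non-null exactly when exceptions are enabled.
void demo ()
{
  std::exception_ptr p = make_ptr_sketch (1);
  assert ((p != nullptr) == exceptions_enabled ());
}
```

Built with -fno-exceptions the `#else` branch is taken and `p == nullptr` holds, which is what the new testcase checks for the real function.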
[PATCH][AArch64] Add TARGET_MIN_DIVISIONS_FOR_RECIP_MUL
Add an override for TARGET_MIN_DIVISIONS_FOR_RECIP_MUL and set the minimum number of divisions to 2. This gives ~0.5% speedup on SPECFP2000/2006.

OK for trunk?

ChangeLog:

2014-12-13  Wilco Dijkstra  wdijk...@arm.com

        * gcc/config/aarch64/aarch64.c (TARGET_MIN_DIVISIONS_FOR_RECIP_MUL): Define.
        (aarch64_min_divisions_for_recip_mul): New function.
---
 gcc/config/aarch64/aarch64.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f2d390b..8c23064 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -462,6 +462,12 @@ static const char * const aarch64_condition_codes[] =
   "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"
 };
 
+static unsigned int
+aarch64_min_divisions_for_recip_mul (enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return 2;
+}
+
 static int
 aarch64_reassociation_width (unsigned opc ATTRIBUTE_UNUSED,
                              enum machine_mode mode)
@@ -11026,6 +11032,9 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool load,
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST aarch64_memory_move_cost
 
+#undef TARGET_MIN_DIVISIONS_FOR_RECIP_MUL
+#define TARGET_MIN_DIVISIONS_FOR_RECIP_MUL aarch64_min_divisions_for_recip_mul
+
 #undef TARGET_MUST_PASS_IN_STACK
 #define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
 
--
1.9.1
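
As background for the threshold: the hook tells GCC how many divisions by the same divisor must be present before, under -ffast-math style options, it replaces them with one reciprocal and several multiplications. Returning 2 means a pair of divisions already qualifies. A hand-written sketch of the before and after forms follows; `sum_div` and `sum_recip` are hypothetical names, and the rewrite is done by hand here rather than by the compiler.

```cpp
#include <cassert>

// As written in the source: two divisions by the same divisor.
double sum_div (double a, double b, double d)
{
  assert (d != 0.0);
  return a / d + b / d;
}

// After the transformation: one division, two multiplications.
double sum_recip (double a, double b, double d)
{
  assert (d != 0.0);
  double r = 1.0 / d;   // the single remaining division
  return a * r + b * r;
}
```

The two forms can differ in the last bit when 1/d is not exactly representable, which is why the compiler only does this when reciprocal math is allowed. With d a power of two, as in `sum_div (3.0, 5.0, 4.0)`, both give exactly 2.0.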
Re: [Patch, Fortran] Convert gfc_notify_std to common diagnostics
PING - https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00731.html

OK.
[PATCH][AArch64] Fix PR 64263: Do not try to split constants when destination is SIMD reg
Hi all,

Since the movsi_aarch64 and movdi_aarch64 patterns became splitters we want to make sure that the splitting happens only when we deal with GP registers. This patch guards the splitting part by GP_REGNUM_P rather than trying to complicate aarch64_expand_mov_immediate too much to try and handle the SIMD registers case.

A testcase is added. Bootstrap on aarch64-none-linux-gnu and testing on aarch64-none-elf were successful.

Ok for trunk?

Thanks,
Kyrill

2014-12-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com
            Ramana Radhakrishnan  ramana.radhakrish...@arm.com

        PR target/64263
        * config/aarch64/aarch64.md (*movsi_aarch64): Don't split if the destination is not a GP reg.
        (*movdi_aarch64): Likewise.

2014-12-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

        PR target/64263
        * gcc.target/aarch64/pr64263_1.c: New test.

commit befb68e4b1b10e1052748202fcb4a83637fae234
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu Dec 11 12:17:30 2014 +

    [AArch64] Fix PR target/64263

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 97d7009..693369f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -854,7 +854,8 @@ (define_insn_and_split "*movsi_aarch64"
    fmov\\t%s0, %w1
    fmov\\t%w0, %s1
    fmov\\t%s0, %s1"
-   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)"
+   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
+    && GP_REGNUM_P (REGNO (operands[0]))"
    [(const_int 0)]
    "{
        aarch64_expand_mov_immediate (operands[0], operands[1]);
@@ -886,7 +887,8 @@ (define_insn_and_split "*movdi_aarch64"
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
    movi\\t%d0, %1"
-   "(CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), DImode))"
+   "(CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), DImode))
+    && GP_REGNUM_P (REGNO (operands[0]))"
    [(const_int 0)]
    "{
        aarch64_expand_mov_immediate (operands[0], operands[1]);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr64263_1.c b/gcc/testsuite/gcc.target/aarch64/pr64263_1.c
new file mode 100644
index 000..047e623
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr64263_1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+#include "arm_neon.h"
+
+extern long int vget_lane_s64_1 (int64x1_t, const int);
+
+void
+foo ()
+{
+  int8x8_t val14;
+  int8x8_t val15;
+  uint8x8_t val16;
+  uint32x4_t val40;
+  val14 = vcreate_s8 (0xff0080f6807f807fUL);
+  val15 = vcreate_s8 (0x10807fff7f808080UL);
+  val16 = vcgt_s8 (val14, val15);
+  val40 = vreinterpretq_u32_u64 (
+    vdupq_n_u64 (
+     vget_lane_s64_1 (
+     vreinterpret_s64_u8 (val16), 0)
+    ));
+}
Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484
On 11 December 2014 at 08:50, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, We find that the committed patch is not correctly generated from our local branch. This caused some code necessary for the testcases missing. As pointed out by Christophe in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00778.html, we need to rework the testcases so that it can work for AArch32 target too. This patch fix this two issues. Three changes: 1. vfma_f32, vfmaq_f32, vfms_f32, vfmsq_f32 are only available for arm*-*-* target with the FMA feature, we take care of this through the macro __ARM_FEATURE_FMA. 2. vfma_n_f32 and vfmaq_n_f32 are only available for aarch64 target, we take care of this through the macro __aarch64__. 3. vfmaq_f64, vfmaq_n_f64 and vfmsq_f64 are only available for aarch64 target, we just exclude test for them to keep the testcases clean. (Note: They also pass on aarch64 aarch64_be target and we can add test for them if needed). I would prefer to have all the available variants tested. Tested on armeb-linux-gnueabi, arm-linux-gnueabi, aarch64-linux-gnu and aarch64_be-linux-gnu. OK for the trunk? Sorry if this cause you guys any trouble, we will be more carefull in our future work. Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (working copy) @@ -2,35 +2,34 @@ #include arm-neon-ref.h #include compute-ref-data.h +#ifdef __aarch64__ /* Expected results. */ VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d }; VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 }; -VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x408906e1532b8520, 0x40890ee1532b8520 }; Why do you remove this one? 
#define VECT_VAR_ASSIGN(S,Q,T1,W) S##Q##_##T1##W #define ASSIGN(S, Q, T, W, V) T##W##_t S##Q##_##T##W = V -#define TEST_MSG VFMA/VFMAQ +#define TEST_MSG VFMA_N/VFMAQ_N + void exec_vfma_n (void) { /* Basic test: v4=vfma_n(v1,v2), then store the result. */ #define TEST_VFMA(Q, T1, T2, W, N) \ VECT_VAR(vector_res, T1, W, N) = \ vfma##Q##_n_##T2##W(VECT_VAR(vector1, T1, W, N), \ - VECT_VAR(vector2, T1, W, N), \ - VECT_VAR_ASSIGN(Scalar, Q, T1, W)); \ + VECT_VAR(vector2, T1, W, N),\ + VECT_VAR_ASSIGN(scalar, Q, T1, W)); \ vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N)) #define CHECK_VFMA_RESULTS(test_name,comment) \ {\ CHECK_FP(test_name, float, 32, 2, PRIx32, expected, comment); \ CHECK_FP(test_name, float, 32, 4, PRIx32, expected, comment); \ - CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \ - } + } #define DECL_VABD_VAR(VAR) \ DECL_VARIABLE(VAR, float, 32, 2);\ - DECL_VARIABLE(VAR, float, 32, 4);\ - DECL_VARIABLE(VAR, float, 64, 2); + DECL_VARIABLE(VAR, float, 32, 4); DECL_VABD_VAR(vector1); DECL_VABD_VAR(vector2); @@ -42,28 +41,27 @@ void exec_vfma_n (void) /* Initialize input vector1 from buffer. */ VLOAD(vector1, buffer, , float, f, 32, 2); VLOAD(vector1, buffer, q, float, f, 32, 4); - VLOAD(vector1, buffer, q, float, f, 64, 2); /* Choose init value arbitrarily. */ VDUP(vector2, , float, f, 32, 2, 9.3f); VDUP(vector2, q, float, f, 32, 4, 29.7f); - VDUP(vector2, q, float, f, 64, 2, 15.8f); /* Choose init value arbitrarily. */ - ASSIGN(Scalar, , float, 32, 81.2f); - ASSIGN(Scalar, q, float, 32, 36.8f); - ASSIGN(Scalar, q, float, 64, 51.7f); + ASSIGN(scalar, , float, 32, 81.2f); + ASSIGN(scalar, q, float, 32, 36.8f); /* Execute the tests. 
*/ TEST_VFMA(, float, f, 32, 2); TEST_VFMA(q, float, f, 32, 4); - TEST_VFMA(q, float, f, 64, 2); CHECK_VFMA_RESULTS (TEST_MSG, ); } +#endif int main (void) { +#ifdef __aarch64__ exec_vfma_n (); +#endif return 0; } Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma.c === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma.c (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma.c (working copy) @@ -2,12 +2,13 @@ #include arm-neon-ref.h #include compute-ref-data.h +#ifdef __ARM_FEATURE_FMA /* Expected results. */ VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d };
Re: [PATCH, Libatomic, Darwin] Initial libatomic port for *darwin*.
Iain,

What is the status of this patch?

Jack

On Thu, Nov 13, 2014 at 3:34 PM, Iain Sandoe i...@codesourcery.com wrote:

Hello Richard, Joseph,

Thanks for your reviews,

On 13 Nov 2014, at 07:40, Richard Henderson wrote:

On 11/12/2014 10:18 PM, Iain Sandoe wrote:

# ifndef USE_ATOMIC
#define USE_ATOMIC 1
# endif

Why would USE_ATOMIC be defined previously?

This was left-over from a mode where I allowed the User to jam the mode to OSSpinLocks to test the performance. I apologise, [weak excuse follows] with the turbulence of Darwin on trunk, my trunk version of the patch had got behind my 4.9 one (most of the work has been hammered out there while we try to get bootstrap restored). Re-synced and retested with a patched trunk that bootstraps with some band-aid.

inline static void LockUnlock(uint32_t *l)
{
  __atomic_store_4((_Atomic(uint32_t)*)l, 0, __ATOMIC_RELEASE);
}

Gnu coding style, please. All through the file here.

Fixed.

# define LOCK_SIZE sizeof(uint32_t)
# define NLOCKS (PAGE_SIZE / LOCK_SIZE)
static uint32_t locks[NLOCKS];

Um, surely not LOCK_SIZE, but CACHELINE_SIZE. It's the granularity of the target region that's at issue, not the size of the lock itself.

The algorithm I've used is intentionally different from the pthreads-based posix one, here's the rationale, as I see it:

/* Algorithm motivations.

   Layout Assumptions:
   o Darwin has a number of sub-targets with common atomic types that have no
     'native' in-line handling, but are smaller than a cache-line.
     E.G. PPC32 needs locking for >= 8byte quantities, X86/m32 for >=16.
   o The _Atomic alignment of a natural type is no greater than the type size.
   o There are no special guarantees about the alignment of _Atomic aggregates
     other than those determined by the psABI.
   o There are no guarantees that placement of an entity won't cause it to
     straddle a cache-line boundary.
   o Realistic User code will likely place several _Atomic qualified types in
     close proximity (such that they fall within the same cache-line).
     Similarly, arrays of _Atomic qualified items.

   Performance Assumptions:
   o Collisions of address hashes for items (which make up the lock keys)
     constitute the largest performance issue.
   o We want to avoid unnecessary flushing of [lock table] cache-line(s) when
     items are accessed.

   Implementation:
   We maintain a table of locks, each lock being 4 bytes (at present).
   This choice of lock size gives some measure of equality in hash-collision
   statistics between the 'atomic' and 'OSSpinLock' implementations, since the
   lock size is fixed at 4 bytes for the latter.

   The table occupies one physical page, and we attempt to align it to a
   page boundary, appropriately.

   For entities that need a lock, with sizes < one cache line:
   Each entity that requires a lock, chooses the lock to use from the table on
   the basis of a hash determined by its size and address. The lower log2(size)
   address bits are discarded on the assumption that the alignment of entities
   will not be smaller than their size.
   CHECKME: this is not verified for aggregates; it might be something that
   could/should be enforced from the front ends (since _Atomic types are
   allowed to have increased alignment c.f. 'normal').

   For entities that need a lock, with sizes >= one cacheline_size:
   We assume that the entity alignment >= log2(cacheline_size) and discard
   log2(cacheline_size) bits from the address.
   We then apply size/cacheline_size locks to cover the entity.

   The idea is that this will typically result in distinct hash keys for items
   placed close together. The keys are mangled further such that the size is
   included in the hash.

   Finally, to attempt to make it such that the lock table entries are accessed
   in a scattered manner, to avoid repeated cacheline flushes, the hash is
   rearranged to attempt to maximise the most noise in the upper bits.
*/

NOTE that the CHECKME above doesn't put us in any worse position than the pthreads implementation (likely slightly better since we have a smaller granularity with the current scheme).

#if USE_ATOMIC
  LockLock (&locks[addr_hash (ptr, 1)]);
#else
  OSSpinLockLock(&locks[addr_hash (ptr, 1)]);
#endif

Better to #define LockLock OSSpinLockLock within the area above, so as to avoid the ifdefs here.

done.

Thoughts on the rationale - or OK now?

thanks
Iain

I'm not aware of any other PRs that relate, but will do a final scan through and ask around the darwin folks.

libatomic:

        PR target/59305
        * config/darwin/host-config.h New.
        * config/darwin/lock.c New.
        * configure.tgt (DEFAULT_X86_CPU): New, (target): New entry for darwin.
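
For illustration only, the lock selection described in the rationale (a one-page table of 4-byte locks, indexed by a hash of address and size with the low log2(size) address bits dropped) could look roughly like the sketch below. This is not the code from the patch: `addr_hash_sketch`, `log2_floor`, the 4096-byte page size and the multiplicative constant used to mix the size in are all assumptions made for the example, and the final "noise in the upper bits" rearrangement is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPageSize = 4096;                    // assumed page size
constexpr std::size_t kLockSize = sizeof (std::uint32_t);  // 4-byte locks
constexpr std::size_t kNLocks = kPageSize / kLockSize;     // 1024 table entries

// floor(log2(n)) for n >= 1.
unsigned log2_floor (std::size_t n)
{
  assert (n >= 1);
  unsigned bits = 0;
  while (n >>= 1)
    bits++;
  return bits;
}

// Hypothetical hash: drop the low log2(size) address bits, mix the size
// into the key, then fold the key into the table.
std::size_t addr_hash_sketch (std::uintptr_t addr, std::size_t size)
{
  std::uintptr_t key = addr >> log2_floor (size);
  key ^= static_cast<std::uintptr_t> (size) * 0x9E3779B1u;  // include size
  return static_cast<std::size_t> (key % kNLocks);
}
```

With this scheme two adjacent 8-byte objects, say at 0x1000 and 0x1008, land on different table entries, which is the collision behaviour the rationale is after for items placed close together.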
Re: PR64182: Fix rounding division and modulus
Richard Biener richard.guent...@gmail.com writes:

On Thu, Dec 11, 2014 at 1:26 PM, Richard Sandiford richard.sandif...@arm.com wrote:

As pointed out in PR 64182, wide-int rounded division gets the ties-away-from-zero case wrong for odd-numbered divisors, while double_int gets the unsigned case wrong by unconditionally treating a divisor or remainder with the top bit set as negative. As Jakub says, the test used in double_int might also have overflow problems.

This patch uses:

  abs (remainder) >= abs (divisor) - abs (remainder)

for both wide-int and double_int and fixes the unsigned case in double_int. I didn't know how to test the double_int change using input code so resorted to doing some double_int arithmetic at the start of main. Thanks to Joseph for the testcase.

Tested on x86_64-linux-gnu. OK to install?

Can you add a testcase? You can follow the gcc.dg/plugin/sreal_plugin.c example, maybe even make it a generic host_test_plugin.c with separate files containing the actual tests. Otherwise ok.

Ah, hadn't realised we could do that. Much neater than changing main :-)

Here's what I committed after retesting on x86_64-linux-gnu. As well as the testcase, I changed x - y to the more general wi::sub (x, y).

Thanks,
Richard

gcc/
        PR middle-end/64182
        * wide-int.h (wi::div_round, wi::mod_round): Fix rounding of tied cases.
        * double-int.c (div_and_round_double): Fix handling of unsigned cases. Use same rounding approach as wide-int.h.

gcc/testsuite/
2014-xx-xx  Richard Sandiford  richard.sandif...@arm.com
            Joseph Myers  jos...@codesourcery.com

        PR middle-end/64182
        * gcc.dg/plugin/wide-int-test-1.c, gcc.dg/plugin/wide-int_plugin.c: New test.
        * gcc.dg/plugin/plugin.exp: Register it.
        * gnat.dg/round_div.adb: New test.
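
A small sketch of the two rounding tests on plain 64-bit integers may help when reading the patch. It is only an illustration under simplifying assumptions: `div_round_old` and `div_round_new` are hypothetical names, ordinary integers stand in for wide-int values, and the INT64_MIN / -1 overflow case is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Absolute value as an unsigned quantity, safe for INT64_MIN.
static std::uint64_t uabs (std::int64_t v)
{
  return v < 0 ? 0 - static_cast<std::uint64_t> (v)
               : static_cast<std::uint64_t> (v);
}

// Old test: round away when |rem| >= |y| >> 1.  The shift truncates,
// so for odd |y| a remainder just below half is rounded as well.
std::int64_t div_round_old (std::int64_t x, std::int64_t y)
{
  assert (y != 0);
  std::int64_t q = x / y, r = x % y;
  if (r != 0 && uabs (r) >= (uabs (y) >> 1))
    return ((x < 0) != (y < 0)) ? q - 1 : q + 1;
  return q;
}

// New test: round away when |rem| >= |y| - |rem|, i.e. 2 * |rem| >= |y|
// without the doubling that could overflow.
std::int64_t div_round_new (std::int64_t x, std::int64_t y)
{
  assert (y != 0);
  std::int64_t q = x / y, r = x % y;
  if (r != 0 && uabs (r) >= uabs (y) - uabs (r))
    return ((x < 0) != (y < 0)) ? q - 1 : q + 1;
  return q;
}
```

For 2 / 5 the old test gives 1 (2 >= 5 >> 1), while the new one gives 0 (2 < 5 - 2). Real ties such as 5 / 2 and -5 / 2 still round away from zero, to 3 and -3.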
Index: gcc/wide-int.h === --- gcc/wide-int.h 2014-12-11 14:32:17.708138315 + +++ gcc/wide-int.h 2014-12-11 14:32:17.704138366 + @@ -2616,8 +2616,8 @@ wi::div_round (const T1 x, const T2 y, { if (sgn == SIGNED) { - if (wi::ges_p (wi::abs (remainder), -wi::lrshift (wi::abs (y), 1))) + WI_BINARY_RESULT (T1, T2) abs_remainder = wi::abs (remainder); + if (wi::geu_p (abs_remainder, wi::sub (wi::abs (y), abs_remainder))) { if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn)) return quotient - 1; @@ -2627,7 +2627,7 @@ wi::div_round (const T1 x, const T2 y, } else { - if (wi::geu_p (remainder, wi::lrshift (y, 1))) + if (wi::geu_p (remainder, wi::sub (y, remainder))) return quotient + 1; } } @@ -2784,8 +2784,8 @@ wi::mod_round (const T1 x, const T2 y, { if (sgn == SIGNED) { - if (wi::ges_p (wi::abs (remainder), -wi::lrshift (wi::abs (y), 1))) + WI_BINARY_RESULT (T1, T2) abs_remainder = wi::abs (remainder); + if (wi::geu_p (abs_remainder, wi::sub (wi::abs (y), abs_remainder))) { if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn)) return remainder + y; @@ -2795,7 +2795,7 @@ wi::mod_round (const T1 x, const T2 y, } else { - if (wi::geu_p (remainder, wi::lrshift (y, 1))) + if (wi::geu_p (remainder, wi::sub (y, remainder))) return remainder - y; } } Index: gcc/double-int.c === --- gcc/double-int.c2014-12-11 14:32:17.708138315 + +++ gcc/double-int.c2014-12-11 14:32:17.700138416 + @@ -569,24 +569,23 @@ div_and_round_double (unsigned code, int { unsigned HOST_WIDE_INT labs_rem = *lrem; HOST_WIDE_INT habs_rem = *hrem; - unsigned HOST_WIDE_INT labs_den = lden, ltwice; - HOST_WIDE_INT habs_den = hden, htwice; + unsigned HOST_WIDE_INT labs_den = lden, lnegabs_rem, ldiff; + HOST_WIDE_INT habs_den = hden, hnegabs_rem, hdiff; /* Get absolute values. */ - if (*hrem 0) + if (!uns *hrem 0) neg_double (*lrem, *hrem, labs_rem, habs_rem); - if (hden 0) + if (!uns hden 0) neg_double (lden, hden, labs_den, habs_den); - /* If (2 * abs (lrem) = abs (lden)), adjust the quotient. 
*/ - mul_double ((HOST_WIDE_INT) 2, (HOST_WIDE_INT) 0, - labs_rem, habs_rem, ltwice, htwice); + /* If abs(rem) = abs(den) - abs(rem), adjust the quotient. */ + neg_double (labs_rem, habs_rem, lnegabs_rem, hnegabs_rem); + add_double (labs_den, habs_den, lnegabs_rem, hnegabs_rem, + ldiff, hdiff); - if (((unsigned HOST_WIDE_INT) habs_den - (unsigned HOST_WIDE_INT) htwice) - || (((unsigned HOST_WIDE_INT) habs_den -== (unsigned
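For readers following the rounding discussion, here is a minimal standalone sketch of the two tests on plain unsigned 64-bit integers. It is not the wide-int or double_int code; the function names are hypothetical and a nonzero divisor is assumed. It only illustrates why comparing the remainder against "divisor minus remainder" handles odd divisors correctly, while comparing against "divisor shifted right by one" does not.

```cpp
#include <cassert>
#include <cstdint>

// Round-to-nearest unsigned division, ties rounded up.
// Hypothetical helper; assumes den != 0.
inline std::uint64_t div_round_sub(std::uint64_t num, std::uint64_t den)
{
  std::uint64_t quotient = num / den;
  std::uint64_t remainder = num % den;
  // remainder < den, so den - remainder cannot wrap around, whereas
  // computing 2 * remainder could overflow for large operands.
  if (remainder >= den - remainder)
    ++quotient;
  return quotient;
}

// The shift-based test, kept only for comparison.
// Hypothetical helper; assumes den != 0.
inline std::uint64_t div_round_shift(std::uint64_t num, std::uint64_t den)
{
  std::uint64_t quotient = num / den;
  std::uint64_t remainder = num % den;
  // For odd den, den >> 1 rounds the halfway point down, so a remainder
  // just below one half is treated as a tie.
  if (remainder >= (den >> 1))
    ++quotient;
  return quotient;
}

// Small self-check covering an odd divisor and a genuine tie.
inline void check_div_round()
{
  assert(div_round_sub(1, 3) == 0);    // 0.33 rounds to 0
  assert(div_round_shift(1, 3) == 1);  // shift-based test rounds it up
  assert(div_round_sub(3, 2) == 2);    // 1.5 is a tie, rounds up
}
```

Calling check_div_round() exercises the one-third case, where the two tests disagree.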
Re: [patch c++]: Fix PR/63996
OK. Jason
[patch] libstdc++/64276 replace __EXCEPTIONS and __GXX_RTTI with SD-6 macros
This replaces the GCC-specific macros with the portable feature-testing macros that are now supported by GCC. Tested x86_64-linux, committed to trunk. commit 6565657776c8c9ebe4055510f2485ccc695e23ef Author: Jonathan Wakely jwak...@redhat.com Date: Fri Dec 12 15:11:39 2014 + PR libstdc++/64276 * doc/doxygen/user.cfg.in: Define __cpp_exceptions and __cpp_rtti. * doc/html/manual/using_exceptions.html: Regenerate. * doc/xml/manual/using_exceptions.xml: Use SD-6 feature-testing macros, __cpp_exceptions and __cpp_rtti, instead of __EXCEPTIONS and __GXX_RTTI. * include/bits/c++config: Likewise. * include/bits/locale_classes.tcc: Likewise. * include/bits/shared_ptr.h: Likewise. * include/bits/shared_ptr_base.h: Likewise. * include/debug/formatter.h: Likewise. * include/experimental/any: Likewise. * include/ext/rope: Likewise. * include/ext/ropeimpl.h: Likewise. * include/std/functional: Likewise. * include/tr1/functional: Likewise. * include/tr1/shared_ptr.h: Likewise. * libsupc++/eh_call.cc: Likewise. * libsupc++/eh_personality.cc: Likewise. * libsupc++/exception_defines.h: Likewise. * libsupc++/exception_ptr.h: Likewise. * libsupc++/guard.cc: Likewise. * libsupc++/pbase_type_info.cc: Likewise. * libsupc++/pointer_type_info.cc: Likewise. * libsupc++/vterminate.cc: Likewise. * src/c++11/thread.cc: Likewise. 
diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in b/libstdc++-v3/doc/doxygen/user.cfg.in index 7ec91a1..019462e 100644 --- a/libstdc++-v3/doc/doxygen/user.cfg.in +++ b/libstdc++-v3/doc/doxygen/user.cfg.in @@ -2142,8 +2142,8 @@ PREDEFINED = __cplusplus=201103L \ _GLIBCXX_USE_C99_STDINT_TR1 \ _GLIBCXX_USE_SCHED_YIELD \ _GLIBCXX_USE_NANOSLEEP \ - __EXCEPTIONS \ - __GXX_RTTI \ + __cpp_exceptions \ + __cpp_rtti \ ATOMIC_INT_LOCK_FREE \ PB_DS_DATA_TRUE_INDICATOR \ PB_DS_STATIC_ASSERT=// \ diff --git a/libstdc++-v3/doc/html/manual/using_exceptions.html b/libstdc++-v3/doc/html/manual/using_exceptions.html index 83e4ba6..f1dd099 100644 --- a/libstdc++-v3/doc/html/manual/using_exceptions.html +++ b/libstdc++-v3/doc/html/manual/using_exceptions.html @@ -151,7 +151,7 @@ exception neutrality and exception safety. and code class=literal__throw_exception_again/code. They are defined as follows. /ppre class=programlisting -#ifdef __EXCEPTIONS +#if __cpp_exceptions # define __try try # define __catch(X) catch(X) # define __throw_exception_again throw @@ -165,7 +165,7 @@ exception neutrality and exception safety. class code class=classnameexception/code, there exists a corresponding function with C language linkage. An example: /ppre class=programlisting -#ifdef __EXCEPTIONS +#if __cpp_exceptions void __throw_bad_exception(void) { throw bad_exception(); } #else @@ -310,4 +310,4 @@ is called. a class=link href=http://gcc.gnu.org/PR25191; target=_top GCC Bug 25191: exception_defines.h #defines try/catch /a - /em. 
/span/p/div/div/divdiv class=navfooterhr /table width=100% summary=Navigation footertrtd width=40% align=lefta accesskey=p href=using_concurrency.htmlPrev/a??/tdtd width=20% align=centera accesskey=u href=using.htmlUp/a/tdtd width=40% align=right??a accesskey=n href=debug.htmlNext/a/td/trtrtd width=40% align=left valign=topConcurrency??/tdtd width=20% align=centera accesskey=h href=../index.htmlHome/a/tdtd width=40% align=right valign=top??Debugging Support/td/tr/table/div/body/html \ No newline at end of file + /em. /span/p/div/div/divdiv class=navfooterhr /table width=100% summary=Navigation footertrtd width=40% align=lefta accesskey=p href=using_concurrency.htmlPrev/a??/tdtd width=20% align=centera accesskey=u href=using.htmlUp/a/tdtd width=40% align=right??a accesskey=n href=debug.htmlNext/a/td/trtrtd width=40% align=left valign=topConcurrency??/tdtd width=20% align=centera accesskey=h href=../index.htmlHome/a/tdtd width=40% align=right valign=top??Debugging Support/td/tr/table/div/body/html diff --git a/libstdc++-v3/doc/xml/manual/using_exceptions.xml b/libstdc++-v3/doc/xml/manual/using_exceptions.xml index 698b2fb..840c12b 100644 --- a/libstdc++-v3/doc/xml/manual/using_exceptions.xml +++ b/libstdc++-v3/doc/xml/manual/using_exceptions.xml @@ -251,7 +251,7 @@ exception neutrality and exception safety. /para programlisting -#ifdef __EXCEPTIONS +#if __cpp_exceptions # define __try try # define __catch(X) catch(X) # define __throw_exception_again throw @@ -269,7 +269,7 @@ exception neutrality and exception safety. /para programlisting -#ifdef __EXCEPTIONS +#if __cpp_exceptions void __throw_bad_exception(void) { throw bad_exception(); } #else
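As an illustration of the macro switch described in this patch, the following sketch tests the SD-6 macros with #if rather than #ifdef, in the same way the documentation hunks do. The DEMO_ macro names and both functions are hypothetical and not part of libstdc++; the sketch assumes a compiler new enough to define __cpp_exceptions and __cpp_rtti when the features are enabled.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical macros modelled on the __try/__catch pair shown in the
// documentation diff; an undefined __cpp_exceptions evaluates to 0 in #if.
#if __cpp_exceptions
# define DEMO_TRY try
# define DEMO_CATCH(X) catch (X)
#else
# define DEMO_TRY if (true)
# define DEMO_CATCH(X) if (false)
#endif

// Parse an integer, falling back to a default when parsing throws.
// With exceptions disabled the handler body is compiled out.
inline int parse_or_default(const std::string &text, int fallback)
{
  DEMO_TRY
    {
      return std::stoi(text);
    }
  DEMO_CATCH(const std::exception &)
    {
      return fallback;
    }
  return fallback;
}

// Report whether RTTI is available in this translation unit.
inline bool rtti_enabled()
{
#if __cpp_rtti
  return true;
#else
  return false;
#endif
}
```

In a default build, parse_or_default("abc", -1) takes the handler path and returns the fallback.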
[gomp4] Merge trunk r217500 (2014-11-13) into gomp-4_0-branch
Hi!

In r218677, I have committed a merge from trunk r217500 (2014-11-13) into gomp-4_0-branch. This spans the offloading commits in trunk (plus two interleaved non-offloading commits).

Regards,
Thomas
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
Alan Hayward alan.hayw...@arm.com writes: [Cleaning this thread up to submit patch again, with better explanation] This patch causes subreg_get_info() to exit early in the simple cases where we are extracting a whole register from a multi register. In aarch64 for Big Endian we were producing a subreg of a OImode (256bits) from a CImode (384bits) This would hit the following assert in subreg_get_info: gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0); This is a rule we should be able to relax a little - if the subreg we want fits into a whole register then this is a valid result and can be easily detected earlier in the function. This has the bonus that we should be slightly reducing the execution time for more common cases, for example a subreg of 64bits from 256bits. This patch is required for the second part of the patch, which is aarch64 specific, and fixes up aarch64 Big Endian movoi/ci/xi. This second part has already been approved. This patch will apply cleanly by itself and no regressions were seen when testing aarch64 and x86_64 on make check. FWIW I agree this is the right approach, although I can't approve it. The assert above is guarding code that deals with a very general case, including some unusual combinations, so I don't think it would be a good idea to try to remove it entirely. E.g. Tejas hit the same assert because we were trying to create subregs of EImode SIMD registers on AArch64. EImode is 24 bytes, so it's one-*and-a-half* SIMD registers. Taking subregs of something like that is very dangerous and I think we want the assert to continue to trigger there. This patch deals with a much simpler and more obvious case. Thanks, Richard
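The size arithmetic in this thread can be checked with a toy model. The sketch below is not subreg_get_info; every name in it is hypothetical, sizes are in bytes, and the "whole register" condition is only an assumed reading of the simple case Alan describes. It shows that an OImode (32-byte) subreg of a CImode (48-byte) value trips the divisibility assert even though it covers whole 16-byte registers, while a 24-byte EImode value does not qualify.

```cpp
#include <cassert>

// Toy model only. REG_BYTES models a 16-byte AArch64 SIMD register.
constexpr unsigned REG_BYTES = 16;

// The condition guarded by the existing assert:
// GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode) == 0.
constexpr bool outer_is_multiple_of_inner(unsigned xsize, unsigned ysize)
{
  return xsize % ysize == 0;
}

// Assumed form of the "simple case": both values occupy whole registers
// and the subreg starts on a register boundary inside the outer value.
constexpr bool selects_whole_registers(unsigned xsize, unsigned ysize,
                                       unsigned offset)
{
  return xsize % REG_BYTES == 0 && ysize % REG_BYTES == 0
         && offset % REG_BYTES == 0 && offset + ysize <= xsize;
}

// Self-check using the modes mentioned in the thread.
inline void check_subreg_model()
{
  assert(!outer_is_multiple_of_inner(48, 32));  // CImode vs OImode: 48 % 32 = 16
  assert(selects_whole_registers(48, 32, 0));   // first two registers
  assert(selects_whole_registers(48, 32, 16));  // last two registers
  assert(!selects_whole_registers(24, 16, 0));  // EImode is 1.5 registers
}
```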
Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp
On Fri, Dec 12, 2014 at 10:58:39AM +0100, Thomas Schwinge wrote: --- /dev/null +++ b/libgomp/libgomp_target.h @@ -0,0 +1,44 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + + This file is part of the GNU OpenMP Library (libgomp). +#ifndef LIBGOMP_TARGET_H +#define LIBGOMP_TARGET_H 1 + +/* Type of offload target device. */ +enum offload_target_type +{ + OFFLOAD_TARGET_TYPE_HOST, + OFFLOAD_TARGET_TYPE_INTEL_MIC +}; Maybe this. +/* Auxiliary struct, used for transferring a host-target address range mapping + from plugin to libgomp. */ +struct mapping_table +{ + uintptr_t host_start; + uintptr_t host_end; + uintptr_t tgt_start; + uintptr_t tgt_end; +}; But this IMHO doesn't belong to include/libgomp-constants.h, that file is for stuff shared between the compiler and libgomp, while the above is for communication between libgomp plugins and libgomp, there is no point to put it outside of libgomp and the compiler is not interested in this. Jakub
C++ PATCH for c++/61402
This patch fixes the ICE in 61402, though I'll leave it open for the unsequenced execution warning issue.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 05beed857dc4e01061a38b764c26f1ff857788dd
Author: Jason Merrill ja...@redhat.com
Date: Fri Dec 12 10:43:59 2014 -0500

    PR c++/61402
    * lambda.c (add_capture): Don't pass a dependent type to variably_modified_type_p.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 9eb9200..3da28e5 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -483,7 +483,8 @@ add_capture (tree lambda, tree id, tree orig_init, bool by_reference_p,
                                           NULL_TREE, array_type_nelts (type));
       type = vla_capture_type (type);
     }
-  else if (variably_modified_type_p (type, NULL_TREE))
+  else if (!dependent_type_p (type)
+           && variably_modified_type_p (type, NULL_TREE))
     {
       error ("capture of variable-size type %qT that is not an N3639 array "
              "of runtime bound", type);
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-init11.C b/gcc/testsuite/g++.dg/cpp1y/lambda-init11.C
new file mode 100644
index 000..f7525d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-init11.C
@@ -0,0 +1,20 @@
+// PR c++/61402
+// { dg-do run { target c++14 } }
+
+extern "C" void abort();
+
+template<typename T>
+void foo(T t) {
+  auto test = [ i = ++t ](T v) {
+    if (i != v)
+      abort();
+  };
+  test(t);
+}
+
+int main(){
+  foo(3.14f);
+  foo(0);
+  foo('a');
+  foo(false);
+}
[PATCH] Fix for PR ipa/64278
Hello.

This is a patch for PR ipa/64278, where I replace the ambiguous std::abs with absu_hwi. The patch bootstraps on ppc64-linux and no new regressions were seen.

Ready for trunk?

Thanks,
Martin

From 03a15009e5c9a9045669a4987588d8abf8cc67f1 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Fri, 12 Dec 2014 16:42:57 +0100
Subject: [PATCH] Fix for PR ipa/64278.

gcc/ChangeLog:

2014-12-12  Martin Liska  mli...@suse.cz

    PR ipa/64278
    * sreal.c (sreal::operator*): Call to std::abs can be ambiguous and is replaced with absu_hwi.
---
 gcc/sreal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/sreal.c b/gcc/sreal.c
index bc3af23..0bbc729 100644
--- a/gcc/sreal.c
+++ b/gcc/sreal.c
@@ -251,7 +251,7 @@ sreal
 sreal::operator* (const sreal &other) const
 {
   sreal r;
-  if (std::abs (m_sig) < SREAL_MIN_SIG || std::abs (other.m_sig) < SREAL_MIN_SIG)
+  if (absu_hwi (m_sig) < SREAL_MIN_SIG || absu_hwi (other.m_sig) < SREAL_MIN_SIG)
     {
       r.m_sig = 0;
       r.m_exp = -SREAL_MAX_EXP;
--
2.1.2
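To illustrate the idea behind swapping std::abs for a dedicated helper, here is a minimal sketch of an absolute-value function with exactly one signature, so overload resolution cannot become ambiguous. The name abs_unsigned and the use of fixed-width types are assumptions made for the example; GCC's own absu_hwi works on HOST_WIDE_INT and its implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an absu_hwi-style helper: one fixed signature,
// unsigned result, and no undefined behaviour for the most negative input.
inline std::uint64_t abs_unsigned(std::int64_t x)
{
  // Negation is done in unsigned arithmetic, where wrap-around is defined.
  return x >= 0 ? static_cast<std::uint64_t>(x)
                : std::uint64_t{0} - static_cast<std::uint64_t>(x);
}

// Self-check, including the value whose signed negation would overflow.
inline void check_abs_unsigned()
{
  assert(abs_unsigned(-5) == 5u);
  assert(abs_unsigned(std::numeric_limits<std::int64_t>::min())
         == 9223372036854775808ULL);
}
```

Returning an unsigned type also means the result of negating the minimum value is representable.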
Re: [PING][PATCH V2] plugin event for C/C++ function definitions
PING, request for maintainer please 2014-12-02 15:15 GMT-03:00 Andres Tiraboschi andres.tirabos...@tallertechnologies.com: Hi, this patch adds a new plugin event PLUGIN_START_PARSE_FUNCTION and PLUGIN_FINISH_PARSE_FUNCTION that are invoked at start_function and finish_function respectively in the C and C++ frontends. PLUGIN_START_PARSE_FUNCTION is called before parsing the function body. PLUGIN_FINISH_PARSE_FUNCTION is called after parsing a function definition. Since I have no write privileges please commit this for me if ok. changelog: gcc/c/c-decl.c: Invoke callbacks in start_function and finish_function. gcc/cp/decl.c: Invoke callbacks in start_function and finish_function. gcc/doc/plugins.texi: Add documentation about PLUGIN_START_FUNCTION and PLUGIN_FINISH_FUNCTION gcc/plugin.def: Add events for start_function and finish_function. gcc/plugin.c (register_callback, invoke_plugin_callbacks): Same. gcc/testsuite/g++.dg/plugin/def_plugin.c: New test plugin. gcc/testsuite/g++.dg/plugin/def-plugin-test.C: Testcase for above plugin. gcc/testsuite/g++.dg/plugin/plugin.exp diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c index 6413e6f..d9d922c 100644 --- a/gcc/c/c-decl.c +++ b/gcc/c/c-decl.c @@ -4449,6 +4449,7 @@ start_decl (struct c_declarator *declarator, struct c_declspecs *declspecs, decl = grokdeclarator (declarator, declspecs, NORMAL, initialized, NULL, attributes, expr, NULL, deprecated_state); + invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl); if (!decl) return 0; @@ -9031,6 +9032,7 @@ finish_function (void) It's still in DECL_STRUCT_FUNCTION, and we'll restore it in tree_rest_of_compilation. 
*/ set_cfun (NULL); + invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, current_function_decl); current_function_decl = NULL; } diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 716ab5f..6adf2e4 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -13671,6 +13671,7 @@ start_function (cp_decl_specifier_seq *declspecs, tree decl1; decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, 1, attrs); + invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1); if (decl1 == error_mark_node) return false; /* If the declarator is not suitable for a function definition, @@ -14301,6 +14302,7 @@ finish_function (int flags) vec_free (deferred_mark_used_calls); } + invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, fndecl); return fndecl; } diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi index 4a839b8..1c9e074 100644 --- a/gcc/doc/plugins.texi +++ b/gcc/doc/plugins.texi @@ -174,6 +174,8 @@ Callbacks can be invoked at the following pre-determined events: @smallexample enum plugin_event @{ + PLUGIN_START_PARSE_FUNCTION, /* Called before parsing the body of a function. */ + PLUGIN_FINISH_PARSE_FUNCTION, /* After finishing parsing a function. */ PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager. */ PLUGIN_FINISH_TYPE, /* After finishing parsing a type. */ PLUGIN_FINISH_DECL, /* After finishing parsing a declaration. */ diff --git a/gcc/plugin.c b/gcc/plugin.c index 8debc09..f7a8b64 100644 --- a/gcc/plugin.c +++ b/gcc/plugin.c @@ -433,6 +433,8 @@ register_callback (const char *plugin_name, return; } /* Fall through. */ + case PLUGIN_START_PARSE_FUNCTION: + case PLUGIN_FINISH_PARSE_FUNCTION: case PLUGIN_FINISH_TYPE: case PLUGIN_FINISH_DECL: case PLUGIN_START_UNIT: @@ -511,6 +513,8 @@ invoke_plugin_callbacks_full (int event, void *gcc_data) gcc_assert (event = PLUGIN_EVENT_FIRST_DYNAMIC); gcc_assert (event event_last); /* Fall through. 
*/ + case PLUGIN_START_PARSE_FUNCTION: + case PLUGIN_FINISH_PARSE_FUNCTION: case PLUGIN_FINISH_TYPE: case PLUGIN_FINISH_DECL: case PLUGIN_START_UNIT: diff --git a/gcc/plugin.def b/gcc/plugin.def index df5d383..4b7f6ef 100644 --- a/gcc/plugin.def +++ b/gcc/plugin.def @@ -17,6 +17,11 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ +/* Called before parsing the body of a function. */ +DEFEVENT (PLUGIN_START_PARSE_FUNCTION) + +/* After finishing parsing a function. */ +DEFEVENT (PLUGIN_FINISH_PARSE_FUNCTION) /* To hook into pass manager. */ DEFEVENT (PLUGIN_PASS_MANAGER_SETUP) diff --git a/gcc/testsuite/g++.dg/plugin/def-plugin-test.C b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C new file mode 100644 index 000..b7f2d3d --- /dev/null +++ b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C @@ -0,0 +1,13 @@ +int global = 12; + +int function1(void); + +int function2(int a) // { dg-warning Start fndef function2 } +{ + return function1() + a; +} // {
Re: [PATCH] Fix for PR ipa/64146
Martin, Your test g++.dg/ipa/pr64146.C fails on darwin: grep bind pr64146.C.051i.icf returns nothing, so the first scan fails, while the second one succeeds. Dominique
C++ PATCH for N3922 (direct-initialization from { x })
N3922 changed Unicorn initialization such that while copy-list-initialization of an auto variable still deduces std::initializer_listT, direct-list-initialization from a list with a single element deduces the type of the element and from a list with multiple elements is ill-formed. I've made the ill-formed case a permerror to help with transition for affected code (if there is any). Tested x86_64-pc-linux-gnu, applying to trunk. commit bd12f74cb9c8258307883bf07daeeae5305cad34 Author: Jason Merrill ja...@redhat.com Date: Fri Jun 20 14:22:26 2014 +0200 N3922 * pt.c (do_auto_deduction): In direct-init context, { x } deduces from x. diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index efc2001..5ed9b2c 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -5546,6 +5546,8 @@ reshape_init_r (tree type, reshape_iter *d, bool first_initializer_p, of g++.old-deja/g++.mike/p7626.C: a pointer-to-member constant is a CONSTRUCTOR (with a record type). */ if (TREE_CODE (init) == CONSTRUCTOR + /* Don't complain about a capture-init. */ + !CONSTRUCTOR_IS_DIRECT_INIT (init) BRACE_ENCLOSED_INITIALIZER_P (init)) /* p7626.C */ { if (SCALAR_TYPE_P (type)) diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d8a9c5b..8a663d9 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -22117,7 +22117,21 @@ do_auto_deduction (tree type, tree init, tree auto_node) initializer is a braced-init-list (8.5.4), with std::initializer_listU. */ if (BRACE_ENCLOSED_INITIALIZER_P (init)) -type = listify_autos (type, auto_node); +{ + if (!DIRECT_LIST_INIT_P (init)) + type = listify_autos (type, auto_node); + else if (CONSTRUCTOR_NELTS (init) == 1) + init = CONSTRUCTOR_ELT (init, 0)-value; + else + { + if (permerror (input_location, direct-list-initialization of + %auto% requires exactly one element)) + inform (input_location, + for deduction to %std::initializer_list%, use copy- + list-initialization (i.e. 
add %=% before the %{%)); + type = listify_autos (type, auto_node); + } +} init = resolve_nondeduced_context (init); diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-n3922.C b/gcc/testsuite/g++.dg/cpp0x/initlist-n3922.C new file mode 100644 index 000..4dadfc9 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/initlist-n3922.C @@ -0,0 +1,15 @@ +// N3922 +// { dg-do compile { target c++11 } } + +#include initializer_list +template class T, class U struct same_type; +template class T struct same_typeT,T {}; + +auto x1 = { 1, 2 }; // decltype(x1) is std::initializer_listint +same_typedecltype(x1),std::initializer_listint s1; +auto x4 = { 3 }; // decltype(x4) is std::initializer_listint +same_typedecltype(x4),std::initializer_listint s4; +auto x5{ 3 }; // decltype(x5) is int +same_typedecltype(x5),int s5; +auto x2 = { 1, 2.0 }; // { dg-error initializer_list } cannot deduce element type +auto x3{ 1, 2 }; // { dg-error one element } not a single element diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-init5.C b/gcc/testsuite/g++.dg/cpp1y/lambda-init5.C index 5976de0..1b287a1 100644 --- a/gcc/testsuite/g++.dg/cpp1y/lambda-init5.C +++ b/gcc/testsuite/g++.dg/cpp1y/lambda-init5.C @@ -6,5 +6,5 @@ int main() { if ([x(42)]{ return x; }() != 42) __builtin_abort(); - if ([x{1,2}]{ return x.begin()[0]; }() != 1) __builtin_abort(); + if ([x{24}]{ return x; }() != 24) __builtin_abort(); }
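The deduction rule described in this message can be checked in a few lines. This is a small sketch rather than part of the patch; the variable names are made up, and it assumes a compiler that implements N3922.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Copy-list-initialization still deduces std::initializer_list<int>.
auto demo_copy_list = { 3 };

// Direct-list-initialization with a single element deduces the element type.
auto demo_direct_list{ 3 };

static_assert(std::is_same<decltype(demo_copy_list),
                           std::initializer_list<int>>::value,
              "copy-list-init deduces initializer_list");
static_assert(std::is_same<decltype(demo_direct_list), int>::value,
              "direct-list-init with one element deduces int");

// Runtime self-check of the deduced objects.
inline void check_n3922()
{
  assert(demo_copy_list.size() == 1);
  assert(demo_direct_list == 3);
}
```

A direct-list-initialization with more than one element, such as the x3 case in the new test, is the form that the patch diagnoses.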
C++ PATCH for abi_tag attribute on namespaces
By treating a namespace name as an ABI tag we can avoid most of the need for tagging class types. Tested x86_64-pc-linux-gnu, applying to trunk. commit 65b7f772bb3240688aae6af41d0a1200972885ae Author: Jason Merrill ja...@redhat.com Date: Tue Dec 2 17:11:37 2014 -0500 * cp-tree.h (NAMESPACE_ABI_TAG): New. * name-lookup.c (handle_namespace_attrs): Set it. * class.c (check_tag): Split out from find_abi_tags_r. (find_abi_tags_r): Also check namespace tags. (mark_type_abi_tags): Also mark namespace tags. diff --git a/gcc/cp/class.c b/gcc/cp/class.c index c83c8ad..07bbc69 100644 --- a/gcc/cp/class.c +++ b/gcc/cp/class.c @@ -1352,18 +1352,73 @@ handle_using_decl (tree using_decl, tree t) alter_access (t, decl, access); } +/* Data structure for find_abi_tags_r, below. */ + +struct abi_tag_data +{ + tree t; // The type that we're checking for missing tags. + tree subob; // The subobject of T that we're getting tags from. + tree tags; // error_mark_node for diagnostics, or a list of missing tags. +}; + +/* Subroutine of find_abi_tags_r. Handle a single TAG found on the class TP + in the context of P. TAG can be either an identifier (the DECL_NAME of + a tag NAMESPACE_DECL) or a STRING_CST (a tag attribute). */ + +static void +check_tag (tree tag, tree *tp, abi_tag_data *p) +{ + tree id; + + if (TREE_CODE (tag) == STRING_CST) +id = get_identifier (TREE_STRING_POINTER (tag)); + else +{ + id = tag; + tag = NULL_TREE; +} + + if (!IDENTIFIER_MARKED (id)) +{ + if (!tag) + tag = build_string (IDENTIFIER_LENGTH (id) + 1, + IDENTIFIER_POINTER (id)); + if (p-tags != error_mark_node) + { + /* We're collecting tags from template arguments. */ + p-tags = tree_cons (NULL_TREE, tag, p-tags); + ABI_TAG_IMPLICIT (p-tags) = true; + + /* Don't inherit this tag multiple times. */ + IDENTIFIER_MARKED (id) = true; + } + + /* Otherwise we're diagnosing missing tags. 
*/ + else if (TYPE_P (p-subob)) + { + if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag + that base %qT has, p-t, tag, p-subob)) + inform (location_of (p-subob), %qT declared here, + p-subob); + } + else + { + if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag + that %qT (used in the type of %qD) has, + p-t, tag, *tp, p-subob)) + { + inform (location_of (p-subob), %qD declared here, + p-subob); + inform (location_of (*tp), %qT declared here, *tp); + } + } +} +} + /* walk_tree callback for check_abi_tags: if the type at *TP involves any types with abi tags, add the corresponding identifiers to the VEC in *DATA and set IDENTIFIER_MARKED. */ -struct abi_tag_data -{ - tree t; - tree subob; - // error_mark_node to get diagnostics; otherwise collect missing tags here - tree tags; -}; - static tree find_abi_tags_r (tree *tp, int *walk_subtrees, void *data) { @@ -1374,48 +1429,21 @@ find_abi_tags_r (tree *tp, int *walk_subtrees, void *data) anyway, but let's make sure of it. */ *walk_subtrees = false; + abi_tag_data *p = static_caststruct abi_tag_data*(data); + + for (tree ns = decl_namespace_context (*tp); + ns != global_namespace; + ns = CP_DECL_CONTEXT (ns)) +if (NAMESPACE_ABI_TAG (ns)) + check_tag (DECL_NAME (ns), tp, p); + if (tree attributes = lookup_attribute (abi_tag, TYPE_ATTRIBUTES (*tp))) { - struct abi_tag_data *p = static_caststruct abi_tag_data*(data); for (tree list = TREE_VALUE (attributes); list; list = TREE_CHAIN (list)) { tree tag = TREE_VALUE (list); - tree id = get_identifier (TREE_STRING_POINTER (tag)); - if (!IDENTIFIER_MARKED (id)) - { - if (p-tags != error_mark_node) - { - /* We're collecting tags from template arguments. */ - tree str = build_string (IDENTIFIER_LENGTH (id), - IDENTIFIER_POINTER (id)); - p-tags = tree_cons (NULL_TREE, str, p-tags); - ABI_TAG_IMPLICIT (p-tags) = true; - - /* Don't inherit this tag multiple times. */ - IDENTIFIER_MARKED (id) = true; - } - - /* Otherwise we're diagnosing missing tags. 
*/ - else if (TYPE_P (p-subob)) - { - if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag - that base %qT has, p-t, tag, p-subob)) - inform (location_of (p-subob), %qT declared here, - p-subob); - } - else - { - if (warning (OPT_Wabi_tag, %qT does not have the %E abi tag - that %qT (used in the type of %qD) has, - p-t, tag, *tp, p-subob)) - { - inform (location_of (p-subob), %qD declared here, - p-subob); - inform (location_of (*tp), %qT declared here, *tp); - } - } - } + check_tag (tag, tp, p); } } return NULL_TREE; @@ -1427,6 +1455,12 @@ find_abi_tags_r (tree *tp, int *walk_subtrees, void *data) static void mark_type_abi_tags (tree t, bool val) { + for (tree ns =
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On 12/12/2014 04:06 AM, Dominik Vogt wrote: I'm not sure I've posted the missing patch anywhere yet, so it's attached to this message. At the moment it enables FFI_TYPE_COMPLEX only for s390[x], but eventually this should be used unconditionally. Thanks for that. I'd been meaning to get around to that. I'll change the test to use FFI_TARGET_HAS_COMPLEX_TYPE and apply it to my branch. (This still leaves the dynamic linking issue if we do not use libffi for reflection calls with x86* and s390[x]. Is the plan to remove the platform specific abi code for the few platforms that have it? I see no way to make them work with the static chain patch anyway.) Well, the x86 paths were updated to work with the static chain, but indeed that required assembly rather than cheating and using C as you did. But removing all of that was always my goal. Indeed, my branch now has a patch to remove all of the target-specific code. Tested only on x86_64 so far, but I plan to test i686 today. r~
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On 12/12/2014 05:57 AM, Dominik Vogt wrote:

On Thu, Dec 11, 2014 at 07:51:44PM +1030, Alan Modra wrote:

I was worried about exactly the same problem on powerpc with r11 being used for the static chain and also destroyed in linkage stubs. It turns out we don't traverse any linkage stubs. See https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00446.html.

I've written a small test suite that tests reflection calls over module boundaries (see attachment). Build with make and then just run ./main. The program must not crash; it does not check consistency of the function arguments.

Oh, that's interesting. You've found a bug in the x86_64 linking:

gccgo -g3 -O3 -Wall -Werror -c -fPIC q.go -o q.o
gccgo -shared -Wl,-soname,libq.so -o libq.so q.o
gccgo -g3 -O3 -Wall -Werror -c -fPIC p.go -o p.o
gccgo -shared -Wl,-soname,libp.so -o libp.so p.o
gccgo -g3 -O3 -Wall -Werror -o main main.go libq.so libp.so
/usr/bin/ld: main: hidden symbol `__morestack' in /usr/lib/gcc/x86_64-redhat-linux/4.8.3/libgcc.a(morestack.o) is referenced by DSO
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
make: *** [main] Error 1

Sure enough, both shared libraries failed to pull __morestack from the static libgcc.

$ nm libq.so | grep more
                 U __morestack

I guess __morestack is included in the wrong portion of libgcc? Ian?

r~
Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)
- tree lhs = gimple_assign_lhs (g); enum machine_mode mode = TYPE_MODE (TREE_TYPE (lhs)); rtx target = gen_reg_rtx (mode); + + start_sequence (); tmp = emit_cstore (target, icode, NE, cc_mode, cc_mode, 0, tmp, const0_rtx, 1, mode); if (tmp) - return tmp; + { + rtx seq = get_insns (); + end_sequence (); + emit_insn (prep_seq); + emit_insn (gen_seq); + emit_insn (seq); + return tmp; + } + end_sequence (); Given that you're already doing delete_insns_since (last) at the end of this function, I don't think you need a new sequence around the emit_cstore. Just emit_insn (prep_seq); emit_insn (gen_seq); tmp = emit_cstore (...); if (tmp) return tmp; + int unsignedp = code == LTU || code == LEU || code == GTU || code == GEU; You don't need to examine the code, you can look at the argument: TYPE_UNSIGNED (TREE_TYPE (treeop0)) + op0 = prepare_operand (icode, op0, 2, op_mode, cmp_mode, unsignedp); + op1 = prepare_operand (icode, op1, 3, op_mode, cmp_mode, unsignedp); + if (!op0 || !op1) +{ + end_sequence (); + return NULL_RTX; +} + *prep_seq = get_insns (); + end_sequence (); + + cmp = gen_rtx_fmt_ee ((enum rtx_code) code, cmp_mode, op0, op1); + target = gen_rtx_REG (CCmode, CC_REGNUM); + + create_output_operand (ops[0], target, CCmode); + create_fixed_operand (ops[1], cmp); + create_fixed_operand (ops[2], op0); + create_fixed_operand (ops[3], op1); Hmm. With so many fixed operands, I think you may be better off not creating the cmpmode expander in the first place. Just inline the SELECT_CC_MODE and everything right here. r~
[Patch][testsuite] Fix a few test cases
Hi, Here are a few test tweaks. In 921202-1.c, if STACK_SIZE is used then VLEN will blow the stack with 64bit longs. e.g. if STACK_SIZE == 512K then 3 arrays of 32767 longs means at a minimum 767K of stack will be used at -O0. In pr51447.c, the rbx global register is clobbering the rbx of main's caller, which can cause test case crashes on return. 2014-12-12 Ryan Mansfield rmansfi...@qnx.com * gcc.c-torture/execute/921202-1.c: Adjust VLEN. * gcc.c-torture/execute/pr51447.c: Restore rbx for x86-64. * gcc.dg/cpp/trad/include.c: Exclude QNX targets. OK? Regards, Ryan Mansfield Index: gcc/testsuite/gcc.c-torture/execute/921202-1.c === --- gcc/testsuite/gcc.c-torture/execute/921202-1.c (revision 218685) +++ gcc/testsuite/gcc.c-torture/execute/921202-1.c (working copy) @@ -2,7 +2,7 @@ #ifndef STACK_SIZE #define VLEN 2055 #else -#define VLEN ((STACK_SIZE/16) - 1) +#define VLEN ((STACK_SIZE/32) - 1) #endif main () { Index: gcc/testsuite/gcc.c-torture/execute/pr51447.c === --- gcc/testsuite/gcc.c-torture/execute/pr51447.c (revision 218685) +++ gcc/testsuite/gcc.c-torture/execute/pr51447.c (working copy) @@ -14,6 +14,10 @@ main (void) { __label__ nonlocal_lab; +#ifdef __x86_64__ + void *rbx __asm (rbx); + void *saved_rbx = rbx; +#endif __attribute__((noinline, noclone)) void bar (void *func) { @@ -21,9 +25,15 @@ goto nonlocal_lab; } bar (nonlocal_lab); +#ifdef __x86_64__ + rbx = saved_rbx; +#endif return 1; nonlocal_lab: if (ptr != nonlocal_lab) abort (); +#ifdef __x86_64__ + rbx = saved_rbx; +#endif return 0; } Index: gcc/testsuite/gcc.dg/cpp/trad/include.c === --- gcc/testsuite/gcc.dg/cpp/trad/include.c (revision 218685) +++ gcc/testsuite/gcc.dg/cpp/trad/include.c (working copy) @@ -1,11 +1,11 @@ /* Copyright (c) 2002 Free Software Foundation Inc. */ -/* Test that macros are not expanded in the quotes of #inlcude. */ +/* Test that macros are not expanded in the quotes of #include. 
*/ /* vxWorksCommon.h uses the # operator to construct the name of an include file, thus making the file incompatible with -traditional-cpp. Newlib uses ## when including stdlib.h as of 2007-09-07. */ -/* { dg-do preprocess { target { { ! vxworks_kernel } { ! newlib } } } } */ +/* { dg-do preprocess { target { { ! vxworks_kernel } { ! newlib } { ! *-*-qnx* }} } } */ #define __STDC__ 1 /* Stop complaints about non-ISO compilers. */ #define stdlib 1
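The stack arithmetic behind the 921202-1.c change can be verified directly. The sketch below uses the figures from the message (a 512K STACK_SIZE, three arrays, 8-byte longs); the function names are hypothetical and the 8-byte element size is the 64-bit assumption stated in the message.

```cpp
#include <cassert>
#include <cstddef>

// VLEN as computed by the test for a given divisor.
constexpr std::size_t vlen_for(std::size_t stack_size, std::size_t divisor)
{
  return stack_size / divisor - 1;
}

// Bytes needed by the three arrays of VLEN elements.
constexpr std::size_t three_arrays_bytes(std::size_t vlen, std::size_t elem_size)
{
  return 3 * vlen * elem_size;
}

constexpr std::size_t STACK_512K = 512 * 1024;

// Old divisor: 32767 elements per array, about 767K in total.
static_assert(vlen_for(STACK_512K, 16) == 32767, "old VLEN");
static_assert(three_arrays_bytes(32767, 8) > STACK_512K, "old VLEN overflows");

// New divisor: 16383 elements per array, 384K in total.
static_assert(vlen_for(STACK_512K, 32) == 16383, "new VLEN");
static_assert(three_arrays_bytes(16383, 8) < STACK_512K, "new VLEN fits");

// Runtime self-check of the same figures.
inline void check_vlen()
{
  assert(three_arrays_bytes(vlen_for(STACK_512K, 16), 8) == 786408);
  assert(three_arrays_bytes(vlen_for(STACK_512K, 32), 8) == 393192);
}
```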
Re: [Patch]: Check __gthread_setspecific return
On 14-12-05 05:53 PM, Jeff Law wrote:

On 12/02/14 10:53, Ryan Mansfield wrote:

Hi,

Underlying pthread_setspecific can return non-zero with ENOMEM or EINVAL.

2014-12-02  Ryan Mansfield  rmansfi...@qnx.com

    * emutls.c (__emutls_get_address): Check __gthread_setspecific returns.

OK?

OK.

Thanks Jeff. Could someone please apply on my behalf?

Regards,
Ryan Mansfield
[patch c++]: Fix 61228 - noexcept(expression) causes internal compiler error
Hi,

The following patch fixes the reported issue. Tested for x86_64-w64-mingw32. OK to apply?

Regards,
Kai

ChangeLog

2014-12-12  Kai Tietz  kti...@redhat.com

    PR c++/61228
    * call.c (set_flags_from_callee): Assume no throw by deferred noexcept.

2014-12-12  Kai Tietz  kti...@redhat.com

    PR c++/61228
    * g++.dg/cpp0x/pr61228.C: New file.

Testcase in g++.dg/cpp0x as pr61228.C:

// { dg-do run { target c++11 } }
#include <cctype>
#include <algorithm>

template<int (&F)(int)>
constexpr int safeCtype(unsigned char c) noexcept(noexcept(F(c)))
{ return F(c); }

int main()
{
  const char t[] = "a";
  std::find_if(t, t + 1, safeCtype<std::isspace>);
  return 0;
}

Index: call.c
===================================================================
--- call.c (revision 218681)
+++ call.c (working copy)
@@ -335,11 +335,17 @@ set_flags_from_callee (tree call)
 {
   int nothrow;
   tree decl = get_callee_fndecl (call);
+  tree spec;

   /* We check both the decl and the type; a function may be known not to
      throw without being declared throw().  */
-  nothrow = ((decl && TREE_NOTHROW (decl))
-             || TYPE_NOTHROW_P (TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (call)))));
+  nothrow = (decl && TREE_NOTHROW (decl));
+  if (!nothrow)
+    {
+      spec = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (call)));
+      nothrow = (!DEFERRED_NOEXCEPT_SPEC_P (TYPE_RAISES_EXCEPTIONS (spec))
+                 && TYPE_NOTHROW_P (spec));
+    }

   if (!nothrow && at_function_scope_p () && cfun && cp_function_chain)
     cp_function_chain->can_throw = 1;
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On Fri, Dec 12, 2014 at 10:49 AM, Richard Henderson r...@redhat.com wrote:
>
> Oh, that's interesting. You've found a bug in the x86_64 linking:
>
> gccgo -g3 -O3 -Wall -Werror -c -fPIC q.go -o q.o
> gccgo -shared -Wl,-soname,libq.so -o libq.so q.o
> gccgo -g3 -O3 -Wall -Werror -c -fPIC p.go -o p.o
> gccgo -shared -Wl,-soname,libp.so -o libp.so p.o
> gccgo -g3 -O3 -Wall -Werror -o main main.go libq.so libp.so
> /usr/bin/ld: main: hidden symbol `__morestack' in /usr/lib/gcc/x86_64-redhat-linux/4.8.3/libgcc.a(morestack.o) is referenced by DSO
> /usr/bin/ld: final link failed: Bad value
> collect2: error: ld returned 1 exit status
> make: *** [main] Error 1
>
> Sure enough, both shared libraries failed to pull __morestack from the
> static libgcc.
>
> $ nm libq.so | grep more
>          U __morestack
>
> I guess __morestack is included in the wrong portion of libgcc?

My intent was that __morestack would be included in each shared library that needs it, because going through a PLT stub to call __morestack would blow out the stack. That is why the symbol is hidden.

So we need to link against -lgcc. It looks like when I link with gcc -shared it does link against -lgcc. When I link with gccgo -shared it does not. I could not figure out why that is in 30 seconds of looking at the code.

Ian
Re: [patch c++]: Fix 61228 - noexcept(expression) causes internal compiler error
Hi,

On 12/12/2014 08:45 PM, Kai Tietz wrote:
> #include <cctype>
> #include <algorithm>

I would recommend reducing the testcase further, <algorithm> is very large.

Thanks,
Paolo.
Re: OpenACC GIMPLE_OACC_* -- or not?
Hi! On Wed, 10 Dec 2014 11:10:47 +0100, Jakub Jelinek ja...@redhat.com wrote: On Wed, Dec 10, 2014 at 11:07:37AM +0100, Thomas Schwinge wrote: ..., I noticed that GIMPLE_OMP_TARGET doesn't walk the child_fn and data_arg. Is that intentional, or should that be done? If the latter (but this doesn't seem to cause any ill effects -- why?), OK to commit the following to trunk? Ok with proper ChangeLog. gcc/gimple-walk.c | 8 1 file changed, 8 insertions(+) diff --git gcc/gimple-walk.c gcc/gimple-walk.c index bfa3532..1330c04 100644 --- gcc/gimple-walk.c +++ gcc/gimple-walk.c @@ -416,6 +416,14 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, pset); if (ret) return ret; + ret = walk_tree (gimple_omp_target_child_fn_ptr (stmt), callback_op, wi, + pset); + if (ret) + return ret; + ret = walk_tree (gimple_omp_target_data_arg_ptr (stmt), callback_op, wi, + pset); + if (ret) + return ret; break; case GIMPLE_OMP_TEAMS: Committed to trunk in r218686: commit c1277edd4b50623bae89bea8cba84def9b308e77 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri Dec 12 20:01:18 2014 + A bit of walk_gimple_op maintenance. * gimple-walk.c (walk_gimple_op) GIMPLE_OMP_FOR: Also check intermediate walk_tree results for for_incr. GIMPLE_OMP_TARGET: Walk child_fn and data_arg, too. GIMPLE_OMP_CRITICAL, GIMPLE_OMP_ATOMIC_STORE: Pretty printing. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@218686 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 7 +++ gcc/gimple-walk.c | 49 +++-- 2 files changed, 38 insertions(+), 18 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index bf9571b..3a20032 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,10 @@ +2014-12-12 Thomas Schwinge tho...@codesourcery.com + + * gimple-walk.c (walk_gimple_op) GIMPLE_OMP_FOR: Also check + intermediate walk_tree results for for_incr. + GIMPLE_OMP_TARGET: Walk child_fn and data_arg, too. + GIMPLE_OMP_CRITICAL, GIMPLE_OMP_ATOMIC_STORE: Pretty printing. 
+ 2014-12-12 Richard Sandiford richard.sandif...@arm.com PR middle-end/64182 diff --git gcc/gimple-walk.c gcc/gimple-walk.c index 48fa05d..959d68e 100644 --- gcc/gimple-walk.c +++ gcc/gimple-walk.c @@ -321,11 +321,13 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, break; case GIMPLE_OMP_CRITICAL: - ret = walk_tree (gimple_omp_critical_name_ptr ( -as_a gomp_critical * (stmt)), - callback_op, wi, pset); - if (ret) - return ret; + { + gomp_critical *omp_stmt = as_a gomp_critical * (stmt); + ret = walk_tree (gimple_omp_critical_name_ptr (omp_stmt), +callback_op, wi, pset); + if (ret) + return ret; + } break; case GIMPLE_OMP_FOR: @@ -349,9 +351,9 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, return ret; ret = walk_tree (gimple_omp_for_incr_ptr (stmt, i), callback_op, wi, pset); + if (ret) + return ret; } - if (ret) - return ret; break; case GIMPLE_OMP_PARALLEL: @@ -404,7 +406,6 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, wi, pset); if (ret) return ret; - ret = walk_tree (gimple_omp_sections_control_ptr (stmt), callback_op, wi, pset); if (ret) @@ -420,10 +421,21 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, break; case GIMPLE_OMP_TARGET: - ret = walk_tree (gimple_omp_target_clauses_ptr (stmt), callback_op, wi, - pset); - if (ret) - return ret; + { + gomp_target *omp_stmt = as_a gomp_target * (stmt); + ret = walk_tree (gimple_omp_target_clauses_ptr (omp_stmt), +callback_op, wi, pset); + if (ret) + return ret; + ret = walk_tree (gimple_omp_target_child_fn_ptr (omp_stmt), +callback_op, wi, pset); + if (ret) + return ret; + ret = walk_tree (gimple_omp_target_data_arg_ptr (omp_stmt), +callback_op, wi, pset); + if (ret) + return ret; + } break; case GIMPLE_OMP_TEAMS: @@ -440,7 +452,6 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op, callback_op, wi, pset); if (ret) return ret; - ret = walk_tree (gimple_omp_atomic_load_rhs_ptr (omp_stmt), callback_op, wi, pset); if (ret) @@ -449,11 +460,13 @@ walk_gimple_op (gimple stmt, 
walk_tree_fn callback_op, break; case GIMPLE_OMP_ATOMIC_STORE: -
Re: Nested OpenACC/OpenMP constructs
Hi! On Wed, 10 Dec 2014 11:16:08 +0100, Jakub Jelinek ja...@redhat.com wrote: On Wed, Dec 10, 2014 at 11:10:19AM +0100, Thomas Schwinge wrote: --- /dev/null +++ gcc/testsuite/c-c++-common/gomp/nesting-1.c @@ -0,0 +1,77 @@ +void +f_omp_parallel (void) +{ +#pragma omp parallel + { +int i; Can you please use a global variable declared outside of f_omp_parallel instead? + +#pragma omp parallel +; + +#pragma omp target +; + +#pragma omp target data +; + +#pragma omp target update to(i) The thing is, if GCC tried harder, it could complain here, because i can't really be mapped at this point and thus it would be always undefined behavior. If the var is global, it is possible somebody uses #pragma omp target map(i) f_omp_parallel (); and then it would be valid. That makes sense, thanks. Similarly in other tests. Will change on gomp-4_0-branch. Otherwise LGTM. Committed to trunk in r218687: commit 4c37888fdc6548eba74aa0d652e37b33dd097aea Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri Dec 12 20:01:29 2014 + OpenMP target nesting tests. gcc/testsuite/ * c-c++-common/gomp/nesting-1.c: New file. * c-c++-common/gomp/nesting-warn-1.c: Likewise. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@218687 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/testsuite/ChangeLog | 5 ++ gcc/testsuite/c-c++-common/gomp/nesting-1.c | 75 gcc/testsuite/c-c++-common/gomp/nesting-warn-1.c | 23 3 files changed, 103 insertions(+) diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog index 8e5b424..f2502ff 100644 --- gcc/testsuite/ChangeLog +++ gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-12-12 Thomas Schwinge tho...@codesourcery.com + + * c-c++-common/gomp/nesting-1.c: New file. + * c-c++-common/gomp/nesting-warn-1.c: Likewise. 
+ 2014-12-12 Kai Tietz kti...@redhat.com PR c++/63996 diff --git gcc/testsuite/c-c++-common/gomp/nesting-1.c gcc/testsuite/c-c++-common/gomp/nesting-1.c new file mode 100644 index 000..711ff8e --- /dev/null +++ gcc/testsuite/c-c++-common/gomp/nesting-1.c @@ -0,0 +1,75 @@ +extern int i; + +void +f_omp_parallel (void) +{ +#pragma omp parallel + { +#pragma omp parallel +; + +#pragma omp target +; + +#pragma omp target data +; + +#pragma omp target update to(i) + +#pragma omp target data +{ +#pragma omp parallel + ; + +#pragma omp target + ; + +#pragma omp target data + ; + +#pragma omp target update to(i) +} + } +} + +void +f_omp_target (void) +{ +#pragma omp target + { +#pragma omp parallel +; + } +} + +void +f_omp_target_data (void) +{ +#pragma omp target data + { +#pragma omp parallel +; + +#pragma omp target +; + +#pragma omp target data +; + +#pragma omp target update to(i) + +#pragma omp target data +{ +#pragma omp parallel + ; + +#pragma omp target + ; + +#pragma omp target data + ; + +#pragma omp target update to(i) +} + } +} diff --git gcc/testsuite/c-c++-common/gomp/nesting-warn-1.c gcc/testsuite/c-c++-common/gomp/nesting-warn-1.c new file mode 100644 index 000..c39dd49 --- /dev/null +++ gcc/testsuite/c-c++-common/gomp/nesting-warn-1.c @@ -0,0 +1,23 @@ +extern int i; + +void +f_omp_target (void) +{ +#pragma omp target + { +#pragma omp target /* { dg-warning target construct inside of target region } */ +; +#pragma omp target data /* { dg-warning target data construct inside of target region } */ +; +#pragma omp target update to(i) /* { dg-warning target update construct inside of target region } */ + +#pragma omp parallel +{ +#pragma omp target /* { dg-warning target construct inside of target region } */ + ; +#pragma omp target data /* { dg-warning target data construct inside of target region } */ + ; +#pragma omp target update to(i) /* { dg-warning target update construct inside of target region } */ +} + } +} Grüße, Thomas pgpwm7c0zTP77.pgp Description: 
PGP signature
patch to fix PR64110
The following patch fixes

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110

The patch was successfully tested and bootstrapped on x86/x86-64.

Committed as rev. 218688.

2014-12-12  Vladimir Makarov  vmaka...@redhat.com

    PR target/64110
    * lra-constraints.c (process_alt_operands): Refuse alternative
    when reload pseudo of given class can not hold value of given
    mode.

2014-12-12  Vladimir Makarov  vmaka...@redhat.com

    PR target/64110
    * gcc.target/i386/pr64110.c: New.

Index: lra-constraints.c
===================================================================
--- lra-constraints.c (revision 218685)
+++ lra-constraints.c (working copy)
@@ -2267,6 +2267,29 @@ process_alt_operands (int only_alternati
                   goto fail;
                 }

+              /* Alternative loses if it required class pseudo can not
+                 hold value of required mode.  Such insns can be
+                 described by insn definitions with mode iterators.
+                 Don't use ira_prohibited_class_mode_regs here as it
+                 is common practice for constraints to use a class
+                 which does not have actually enough regs to hold the
+                 value (e.g. x86 AREG for mode requiring more one
+                 general reg).  */
+              if (GET_MODE (*curr_id->operand_loc[nop]) != VOIDmode
+                  && ! hard_reg_set_empty_p (this_alternative_set)
+                  && ! HARD_REGNO_MODE_OK (ira_class_hard_regs
+                                           [this_alternative][0],
+                                           GET_MODE (*curr_id->operand_loc[nop])))
+                {
+                  if (lra_dump_file != NULL)
+                    fprintf
+                      (lra_dump_file,
+                       "alt=%d: reload pseudo for op %d "
+                       "can not hold the mode value -- refuse\n",
+                       nalt, nop);
+                  goto fail;
+                }
+
               /* Check strong discouragement of reload of non-constant
                  into class THIS_ALTERNATIVE.  */
               if (! CONSTANT_P (op) && ! no_regs_p
Index: testsuite/gcc.target/i386/pr64110.c
===================================================================
--- testsuite/gcc.target/i386/pr64110.c (revision 0)
+++ testsuite/gcc.target/i386/pr64110.c (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=core-avx2" } */
+
+int foo (void);
+int a;
+short *b;
+
+void
+bar (short x)
+{
+  while (a--)
+    {
+      int i, j = foo ();
+      for (i = 0; i < j; ++i)
+        *b++ = x;
+    }
+}
Re: [patch c++]: Fix 61228 - noexcept(expression) causes internal compiler error
I think it would be better to call maybe_instantiate_noexcept so that we can have a definite answer. Jason
Re: The nvptx port [10/11+] Target files
Hi! On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt ber...@codesourcery.com wrote: I've now committed it, in the following form. --- /dev/null +++ b/gcc/config/nvptx/nvptx.h @@ -0,0 +1,356 @@ +#define ASM_OUTPUT_ALIGN(FILE, POWER) Committed to trunk in r218689: commit 61f8a1bd770ded96fcff88f3cbc426a23c413992 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri Dec 12 20:14:10 2014 + nvptx: Define valid ASM_OUTPUT_ALIGN. gcc/ * config/nvptx/nvptx.h (ASM_OUTPUT_ALIGN): Define as a C statment. gcc/doc/tm.texi:@defmac ASM_OUTPUT_ALIGN (@var{stream}, @var{power}) gcc/doc/tm.texi-A C statement to output to the stdio stream @var{stream} an assembler gcc/doc/tm.texi-command to advance the location counter to a multiple of 2 to the gcc/doc/tm.texi-@var{power} bytes. @var{power} will be a C expression of type @code{int}. gcc/doc/tm.texi-@end defmac gcc/config/nvptx/nvptx.h:#define ASM_OUTPUT_ALIGN(FILE, POWER) Empty is not a C statement, and so in code such as: gcc/dwarf2out.c- if (lsda_encoding == DW_EH_PE_aligned) gcc/dwarf2out.c:ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (PTR_SIZE)); gcc/dwarf2out.c- dw2_asm_output_data (size_of_encoded_value (lsda_encoding), 0, gcc/dwarf2out.c- Language Specific Data Area (none)); gcc/varasm.c- if (align BITS_PER_UNIT) gcc/varasm.c:ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); gcc/varasm.c- assemble_variable_contents (decl, name, dont_output_data); gcc/varasm.c- if (align 0) gcc/varasm.c:ASM_OUTPUT_ALIGN (asm_out_file, align); gcc/varasm.c- gcc/varasm.c- targetm.asm_out.internal_label (asm_out_file, LTRAMP, 0); gcc/varasm.c- if (align BITS_PER_UNIT) gcc/varasm.c:ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); gcc/varasm.c- assemble_constant_contents (exp, XSTR (symbol, 0), align); ..., GCC warns: [...]/source-gcc/gcc/dwarf2out.c: In function 'void output_fde(dw_fde_ref, bool, bool, char*, int, char*, bool, int)': [...]/source-gcc/gcc/dwarf2out.c:665:3: warning: suggest 
braces around empty body in an 'if' statement [-Wempty-body] ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (PTR_SIZE)); ^ [...]/source-gcc/gcc/varasm.c: In function 'void assemble_variable(tree, int, int, int)': [...]/source-gcc/gcc/varasm.c:2217:2: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); ^ [...]/source-gcc/gcc/varasm.c: In function 'rtx_def* assemble_trampoline_template()': [...]/source-gcc/gcc/varasm.c:2603:5: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] ASM_OUTPUT_ALIGN (asm_out_file, align); ^ [...]/source-gcc/gcc/varasm.c: In function 'void output_constant_def_contents(rtx)': [...]/source-gcc/gcc/varasm.c:3413:2: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); ^ Also, use the values, to get rid of that one: [...]/source-gcc/gcc/final.c: In function 'rtx_insn* final_scan_insn(rtx_insn*, FILE*, int, int, int*)': [...]/source-gcc/gcc/final.c:2450:12: warning: variable 'log_align' set but not used [-Wunused-but-set-variable] int log_align; ^ git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@218689 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog| 4 gcc/config/nvptx/nvptx.h | 10 +- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git gcc/ChangeLog gcc/ChangeLog index 689c4fd..e5de2c6 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,7 @@ +2014-12-12 Thomas Schwinge tho...@codesourcery.com + + * config/nvptx/nvptx.h (ASM_OUTPUT_ALIGN): Define as a C statment. 
+ 2014-12-12 Vladimir Makarov vmaka...@redhat.com PR target/64110 diff --git gcc/config/nvptx/nvptx.h gcc/config/nvptx/nvptx.h index c222375..5f08ba7 100644 --- gcc/config/nvptx/nvptx.h +++ gcc/config/nvptx/nvptx.h @@ -281,9 +281,17 @@ struct GTY(()) machine_function } \ while (0) -#define ASM_OUTPUT_ALIGN(FILE, POWER) +#define ASM_OUTPUT_ALIGN(FILE, POWER) \ + do \ +{ \ + (void) (FILE); \ + (void) (POWER); \ +}
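The difference between an empty macro and one that expands to a C statement can be sketched in isolation. NOOP_ALIGN below is a hypothetical stand-in written for this illustration, not the nvptx.h definition, and align_then_count is an invented caller.

```c
#include <stdio.h>

/* Hypothetical stand-in for a target macro that has nothing to emit.
   The do/while (0) wrapper makes the expansion a single statement, and
   the casts mark both arguments as used.  */
#define NOOP_ALIGN(FILE, POWER) \
  do                            \
    {                           \
      (void) (FILE);            \
      (void) (POWER);           \
    }                           \
  while (0)

/* Use the macro as the unbraced body of an 'if', the pattern that
   triggered -Wempty-body when the macro expanded to nothing.
   Returns 1 when the align branch was taken, 0 otherwise.  */
int
align_then_count (FILE *out, int align)
{
  int emitted = 0;

  if (align > 0)
    NOOP_ALIGN (out, align);
  else
    emitted = -1;
  emitted += 1;
  return emitted;
}
```

Because the arguments are evaluated and cast to void, a variable computed only to be passed to the macro no longer counts as set but unused.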
Re: [patch c++]: Fix PR/63996
Hi,

On 12/12/2014 11:58 AM, Kai Tietz wrote:
> New testcase in g++.dg/cpp1y as pr63996.C
>
> // { dg-do compile { target c++14 } }
>
> constexpr int
> foo (int i)
> {
>   int a[i] = { };
> }
>
> constexpr int j = foo (1); // { dg-error "is not a constant expression" }

The testcase fails spuriously because of Jason's VLA reversion commit, please adjust it.

Thanks,
Paolo.
Re: [patch c++]: Fix 61228 - noexcept(expression) causes internal compiler error
2014-12-12 21:15 GMT+01:00 Jason Merrill ja...@redhat.com:
> I think it would be better to call maybe_instantiate_noexcept so that we can
> have a definite answer.
>
> Jason

Hmm, for the case that decl != NULL_TREE, this is OK. But what if decl is NULL_TREE?

Kai
[patch] Fix std::shared_ptr FAILs with -fno-rtti
A couple of small fixes for shared_ptr tests that fail with -fno-rtti. Tested x86_64-linux, committed to trunk. commit ec3619710af701b1030a6eb45862a41dc3e18ad8 Author: Jonathan Wakely jwak...@redhat.com Date: Fri Dec 12 16:17:45 2014 + PR libstdc++/58594 * include/bits/shared_ptr_base.h: Cast away cv-quals. * testsuite/20_util/shared_ptr/creation/58594-no-rtti.cc: New. * testsuite/20_util/shared_ptr/creation/private.cc: Make allocator rebindable so test passes with -fno-rtti. diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index 737a1a2..3ef783f 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -1120,7 +1120,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : _M_ptr(), _M_refcount() { typedef typename allocator_traits_Alloc::template - rebind_traits_Tp __traits; + rebind_traitstypename std::remove_cv_Tp::type __traits; _Deletertypename __traits::allocator_type __del = { __a }; auto __guard = std::__allocate_guarded(__del._M_alloc); _M_ptr = __guard.get(); diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594-no-rtti.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594-no-rtti.cc new file mode 100644 index 000..2eb8b95 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594-no-rtti.cc @@ -0,0 +1,27 @@ +// { dg-options -std=gnu++11 -fno-rtti } +// { dg-do compile } + +// Copyright (C) 2014 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include memory + +// libstdc++/58594 +void test01() +{ + std::make_sharedconst int(); +} diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/creation/private.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/private.cc index 46487bb..63ab555 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/creation/private.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/private.cc @@ -37,8 +37,17 @@ public: }; templatetypename T -struct MyAlloc : std::allocatorPrivate +struct MyAlloc : std::allocatorT { + templatetypename U +struct rebind { typedef MyAllocU other; }; + + MyAlloc() = default; + MyAlloc(const MyAlloc) = default; + + templatetypename U +MyAlloc(const MyAllocU) { } + void construct(T* p) { ::new((void*)p) T(); } void destroy(T* p) { p-~T(); } }; @@ -49,4 +58,3 @@ int main() auto p = std::allocate_sharedPrivate(a); return p-get(); } -
Re: [Patch, libstdc++/64239] Fix regex_iterator copying
On 11/12/14 00:27 -0800, Tim Shen wrote:
> As discussed in Bugzilla.
>
> Bootstrapped and tested.

OK for trunk.

> Is it Ok to backport it to 4.9 branch, with _M_in_iterator kept unused?

Yes, that's OK as well. Thanks.
Re: [PATCH 0/4] GCC port for the Visium
On Dec 11, 2014, at 4:05 PM, Eric Botcazou ebotca...@adacore.com wrote: on behalf of Controls and Data Services, AdaCore would like to contribute a port of the GCC to the Visium. OK for the mainline? The test suite bits are usual and customary. I’ll Ok them explicitly anyway. I’m fine with new ports going in with usual and customary test suite things.
[C++ Patch] PR 59628
Hi,

I think that to avoid ICE-ing during error recovery when DECL_SAVED_TREE is NULL_TREE we want to simply bail out and return true to the caller. Tested x86_64-linux.

Thanks,
Paolo.

//

/cp
2014-12-12  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/59628
    * semantics.c (finish_omp_reduction_clause): Early return true if
    DECL_SAVED_TREE (id) is NULL_TREE.

/testsuite
2014-12-12  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/59628
    * g++.dg/gomp/pr59628.C: New.

Index: cp/semantics.c
===================================================================
--- cp/semantics.c (revision 218683)
+++ cp/semantics.c (working copy)
@@ -5138,6 +5138,8 @@ finish_omp_reduction_clause (tree c, bool *need_de
           id = OVL_CURRENT (id);
         mark_used (id);
         tree body = DECL_SAVED_TREE (id);
+        if (!body)
+          return true;
         if (TREE_CODE (body) == STATEMENT_LIST)
           {
             tree_stmt_iterator tsi;
Index: testsuite/g++.dg/gomp/pr59628.C
===================================================================
--- testsuite/g++.dg/gomp/pr59628.C (revision 0)
+++ testsuite/g++.dg/gomp/pr59628.C (working copy)
@@ -0,0 +1,13 @@
+// PR c++/59628
+// { dg-do compile }
+// { dg-options "-fopenmp" }
+
+struct A { int i; };
+
+void foo()
+{
+  A a;
+  #pragma omp declare reduction (+: A: omp_out.i +: omp_in.i) // { dg-error "expected" }
+  #pragma omp parallel reduction (+: a)
+  ;
+}
Re: [C++ Patch] PR 59628
On Fri, Dec 12, 2014 at 10:13:39PM +0100, Paolo Carlini wrote:
> Hi,
>
> I think that to avoid ICE-ing during error recovery when DECL_SAVED_TREE is
> NULL_TREE we want to simply bail out and return true to the caller. Tested
> x86_64-linux.
>
> Thanks,
> Paolo.
>
> //
>
> /cp
> 2014-12-12  Paolo Carlini  paolo.carl...@oracle.com
>
>     PR c++/59628
>     * semantics.c (finish_omp_reduction_clause): Early return true if
>     DECL_SAVED_TREE (id) is NULL_TREE.
>
> /testsuite
> 2014-12-12  Paolo Carlini  paolo.carl...@oracle.com
>
>     PR c++/59628
>     * g++.dg/gomp/pr59628.C: New.

Ok, thanks.

    Jakub
Re: [PATCH][AArch64] Use std::swap instead of manually swapping
On Dec 12, 2014, at 3:36 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
> Marcus: Uros pointed out to me that these kinds of changes are considered obvious (with precedent at https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00309.html) but did you have some concerns about backporting to other branches?

So in general, there is no need to backport this to a previous release branch. Unless it fixes a compelling bug in that release branch, or is incidental to some other fix that does, it should not be backported. On a private vendor release branch or a personal branch, well, they can do what they want; the owner would have to weigh in.
Re: [Patch][testsuite] Fix a few test cases
On Dec 12, 2014, at 11:38 AM, Ryan Mansfield rmansfi...@qnx.com wrote:
> Here are a few test tweaks.
>
> In 921202-1.c, if STACK_SIZE is used then VLEN will blow the stack with 64-bit longs. E.g. if STACK_SIZE == 512K then 3 arrays of 32767 longs means at a minimum 767K of stack will be used at -O0.
>
> In pr51447.c, the rbx global register is clobbering the rbx of main's caller, which can cause test case crashes on return.
>
> 2014-12-12  Ryan Mansfield  rmansfi...@qnx.com
>
>     * gcc.c-torture/execute/921202-1.c: Adjust VLEN.
>     * gcc.c-torture/execute/pr51447.c: Restore rbx for x86-64.
>     * gcc.dg/cpp/trad/include.c: Exclude QNX targets.
>
> OK?

Ok for the first and third parts. The second one, I would like an rbx/x86_64 person to review.
[PATCH] Fix -fsanitize=float-cast-overflow with C FE (PR sanitizer/64289)
Hi!

-fsanitize=float-cast-overflow sanitization is done in convert.c and calls save_expr there. Unfortunately, save_expr is a no-go for the C FE; we need c_save_expr, but as convert.c is shared by all FEs, the only way to arrange that would be a new langhook. This patch attempts to fix it the same way as PR54428 did (the other save_expr in c-convert.c).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-12-12  Jakub Jelinek  ja...@redhat.com

    PR sanitizer/64289
    * c-convert.c: Include ubsan.h.
    (convert): For real -> integral casts and
    -fsanitize=float-cast-overflow don't call convert_to_integer, but
    instead instrument the float cast directly.

    * c-c++-common/ubsan/pr64289.c: New test.

--- gcc/c/c-convert.c.jj 2014-10-10 08:19:21.0 +0200
+++ gcc/c/c-convert.c 2014-12-12 16:57:34.514316301 +0100
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.
 #include "c-tree.h"
 #include "langhooks.h"
 #include "target.h"
+#include "ubsan.h"

 /* Change of width--truncation and extension of integers or reals--
    is represented with NOP_EXPR.  Proper functioning of many things
@@ -109,6 +110,20 @@ convert (tree type, tree expr)

     case INTEGER_TYPE:
     case ENUMERAL_TYPE:
+      if (flag_sanitize & SANITIZE_FLOAT_CAST
+          && TREE_CODE (TREE_TYPE (expr)) == REAL_TYPE
+          && COMPLETE_TYPE_P (type)
+          && current_function_decl != NULL_TREE
+          && !lookup_attribute ("no_sanitize_undefined",
+                                DECL_ATTRIBUTES (current_function_decl)))
+        {
+          expr = c_save_expr (expr);
+          tree check = ubsan_instrument_float_cast (loc, type, expr);
+          expr = fold_build1 (FIX_TRUNC_EXPR, type, expr);
+          if (check == NULL)
+            return expr;
+          return fold_build2 (COMPOUND_EXPR, TREE_TYPE (expr), check, expr);
+        }
       ret = convert_to_integer (type, e);
       goto maybe_fold;

--- gcc/testsuite/c-c++-common/ubsan/pr64289.c.jj 2014-12-12 17:12:35.419638432 +0100
+++ gcc/testsuite/c-c++-common/ubsan/pr64289.c 2014-12-12 17:11:39.0 +0100
@@ -0,0 +1,9 @@
+/* PR sanitizer/64289 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=float-cast-overflow" } */
+
+int
+foo (int a)
+{
+  return (int) (0 ? 0 : a ? a : 0.5);
+}

    Jakub
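For readers unfamiliar with what a float-cast-overflow check guards against, here is a hand-written sketch of the range test for a double to int conversion. It is not what ubsan_instrument_float_cast generates; the function names are hypothetical, and the bounds assume a 32-bit int so that both limits are exactly representable as double.

```c
#include <limits.h>

/* Hypothetical helper: return 1 if X can be converted to int without
   overflow.  Conversion truncates toward zero, so any value strictly
   between INT_MIN - 1 and INT_MAX + 1 is fine.  NaN fails both
   comparisons and is rejected.  Assumes a 32-bit int.  */
int
double_fits_in_int (double x)
{
  return x > (double) INT_MIN - 1.0 && x < (double) INT_MAX + 1.0;
}

/* Convert X to int only when it is in range.  Returns 1 and stores the
   result in *OUT on success; returns 0 and leaves *OUT alone otherwise.  */
int
checked_double_to_int (double x, int *out)
{
  if (!double_fits_in_int (x))
    return 0;
  *out = (int) x;
  return 1;
}
```

The operand is evaluated once and then used for both the check and the conversion, which is the reason the patch needs a saved expression in the first place.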
[PATCH] Ensure __tsan_func_entry call isn't in a loop (PR sanitizer/64265)
Hi!

This patch ensures that if the successor of the entry bb has multiple predecessors, we emit the __tsan_func_entry call on the edge from the entry bb, so it can't be called inside a loop in the same function.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-12-12  Jakub Jelinek  ja...@redhat.com

    PR sanitizer/64265
    * tsan.c (instrument_func_entry): Insert __tsan_func_entry
    call on edge from entry block to single succ instead
    of after labels of single succ of entry block.

--- gcc/tsan.c.jj 2014-12-01 14:57:30.0 +0100
+++ gcc/tsan.c 2014-12-12 18:25:26.448608011 +0100
@@ -652,25 +652,24 @@ instrument_memory_accesses (void)
 static void
 instrument_func_entry (void)
 {
-  basic_block succ_bb;
-  gimple_stmt_iterator gsi;
   tree ret_addr, builtin_decl;
   gimple g;
-
-  succ_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
-  gsi = gsi_after_labels (succ_bb);
+  gimple_seq seq = NULL;

   builtin_decl = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS);
   g = gimple_build_call (builtin_decl, 1, integer_zero_node);
   ret_addr = make_ssa_name (ptr_type_node);
   gimple_call_set_lhs (g, ret_addr);
   gimple_set_location (g, cfun->function_start_locus);
-  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+  gimple_seq_add_stmt_without_update (&seq, g);

-  builtin_decl = builtin_decl_implicit (BUILT_IN_TSAN_FUNC_ENTRY);
+  builtin_decl = builtin_decl_implicit (BUILT_IN_TSAN_FUNC_ENTRY);
   g = gimple_build_call (builtin_decl, 1, ret_addr);
   gimple_set_location (g, cfun->function_start_locus);
-  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+  gimple_seq_add_stmt_without_update (&seq, g);
+
+  edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_insert_seq_on_edge_immediate (e, seq);
 }

 /* Instruments function exits.  */

    Jakub
RE: [PATCHv2,MIPS 1/2] MIPS64r6 support
Hi Matthew, -Original Message- From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] Sent: Friday, November 14, 2014 6:07 PM Overall, this patch looks really good. It took me a while to get through it, but I only have a couple of minor comments. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 02268f3..7797b31 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -11896,6 +12052,15 @@ mips_hard_regno_mode_ok_p (unsigned int regno, machine_mode mode) if (TARGET_O32_FP64A_ABI size = 4 (regno 1) != 0) return false; + /* Prevent the use of odd-numbered registers for CCFmode with the +o32 FPXX ABI, otherwise allow them. +The FPXX ABI does not permit double-precision data to be placed +in odd-numbered registers and double-precision compares write +them as 64-bit values. Without this restriction the R6 FPXX +ABI would not be able to execute in FR=1 FRE=1 mode. */ + if (mode == CCFmode ISA_HAS_CCF) + return !(TARGET_FLOATXX (regno 1) != 0); + /* Allow 64-bit vector modes for Loongson-2E/2F. */ if (TARGET_LOONGSON_VECTORS (mode == V2SImode I don't think we ever have CCFmode when ISA_HAS_CCF is false. Maybe just check for CCFmode? If there really is a need for both conditions, then the order of the checks needs to be reversed. Also, the comment is hard to follow. How about: /* The FPXX ABI requires double-precision values to be placed in even-numbered registers. Disallow odd-numbered registers with CCFmode because CCF mode double-precision compares will write a 64-bit value to a register. */ Did I get that right? diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 8a38829..c110b5e 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h -/* ISA supports instructions MULT and MULTU. - This is always true, but the macro is needed for ISA_HAS_DMULT - in mips.md. */ -#define ISA_HAS_MULT (1) +/* ISA supports instructions MULT and MULTU. 
*/
+#define ISA_HAS_MULT ISA_HAS_HILO
+

I preferred the definition in your original patch:

#define ISA_HAS_MULT mips_isa_rev <= 5

Would you mind switching it back?

Okay with those changes.

Catherine
RE: [PATCHv2,MIPS 2/2] Add new triplets for vendor 'img'
-Original Message- From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] Sent: Friday, November 14, 2014 6:20 PM To: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) Cc: Moore, Catherine; Rich Fuhler; Steve Ellcey; Richard Sandiford Subject: [PATCHv2,MIPS 2/2] Add new triplets for vendor 'img' This patch adds new triplets: mips*-img-linux* and mips*-img-elf* The purpose of these triplets is essentially to provide a clear separation between tools which support mips32r5 and below and tools which support mips32r6 and above. Thanks, Matthew / * configure.ac: Add mips-img-elf triplet. gcc/ * config.gcc: Support mips*-img-linux* and mips*-img-elf*. * config/mips/mti-linux.h: Support mips32r6 as being the default arch. * config/mips/t-img-elf: New. * config/mips/t-img-linux: New. This patch is OK to commit. Thanks, Catherine
RE: [PATCHv2,MIPS 1/2] MIPS64r6 support
Moore, Catherine catherine_mo...@mentor.com writes:

Hi Matthew,

-----Original Message-----
From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
Sent: Friday, November 14, 2014 6:07 PM

Overall, this patch looks really good. It took me a while to get through it, but I only have a couple of minor comments.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 02268f3..7797b31 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -11896,6 +12052,15 @@ mips_hard_regno_mode_ok_p (unsigned int regno, machine_mode mode)
       if (TARGET_O32_FP64A_ABI && size <= 4 && (regno & 1) != 0)
         return false;
 
+      /* Prevent the use of odd-numbered registers for CCFmode with the
+         o32 FPXX ABI, otherwise allow them.
+         The FPXX ABI does not permit double-precision data to be placed
+         in odd-numbered registers and double-precision compares write
+         them as 64-bit values.  Without this restriction the R6 FPXX
+         ABI would not be able to execute in FR=1 FRE=1 mode.  */
+      if (mode == CCFmode && ISA_HAS_CCF)
+        return !(TARGET_FLOATXX && (regno & 1) != 0);
+
       /* Allow 64-bit vector modes for Loongson-2E/2F.  */
       if (TARGET_LOONGSON_VECTORS && (mode == V2SImode

I don't think we ever have CCFmode when ISA_HAS_CCF is false. Maybe just check for CCFmode?

Good point. I'll remove the ISA_HAS_CCF part.

If there really is a need for both conditions, then the order of the checks needs to be reversed. Also, the comment is hard to follow. How about:

/* The FPXX ABI requires double-precision values to be placed in
   even-numbered registers.  Disallow odd-numbered registers with
   CCFmode because CCF mode double-precision compares will write a
   64-bit value to a register.  */

Did I get that right?

I think that captures it, thanks.

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 8a38829..c110b5e 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
-/* ISA supports instructions MULT and MULTU.
-   This is always true, but the macro is needed for ISA_HAS_DMULT
-   in mips.md.  */
-#define ISA_HAS_MULT (1)
+/* ISA supports instructions MULT and MULTU.  */
+#define ISA_HAS_MULT ISA_HAS_HILO
+

I preferred the definition in your original patch:

#define ISA_HAS_MULT mips_isa_rev <= 5

Would you mind switching it back?

Fine with me. I'll sync the work with trunk and do some testsuite runs to check for regressions in the interim and commit early next week probably.

Thanks for the review,
Matthew
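For readers following the register check, the rule agreed on in this thread reduces to a small predicate. Below is a minimal sketch in Go, not the GCC source: ccfRegOK and its parameters are hypothetical names, and it assumes the ISA_HAS_CCF test is dropped as discussed, so only the FPXX flag and the register parity matter.

```go
package main

import "fmt"

// ccfRegOK models the proposed CCFmode rule: under the FPXX ABI
// (floatXX true) odd-numbered registers are rejected, otherwise any
// register is accepted. Hypothetical helper, not GCC code.
func ccfRegOK(regno uint, floatXX bool) bool {
	return !(floatXX && regno&1 != 0)
}

func main() {
	for _, regno := range []uint{0, 1, 2, 3} {
		fmt.Printf("regno=%d fpxx=%v non-fpxx=%v\n",
			regno, ccfRegOK(regno, true), ccfRegOK(regno, false))
	}
}
```

The predicate is written in the same shape as the return statement in the patch, so the negation covers both the ABI flag and the parity test.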
Re: [PATCH] PR other/63613: Add fixincludes for dejagnu.h
On Dec 11, 2014, at 5:07 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

David Malcolm dmalc...@redhat.com writes:

* I don't consider this a critical issue that cannot work without current releases. We're already working around several upstream DejaGnu issues in our codebase, and I don't consider this particular one important enough to require everyone to upgrade to a not-a-release version.

... a DejaGnu 1.6 release would only address one part of my concern: I still don't believe this minor issue warrants demanding that all gcc testers upgrade to a newer DejaGnu release. I'd like my fellow testsuite maintainers to weigh in, though.

I’m fine with how this is being done. If the jit people want to fixincludes it, fine. If they submit fixes to dejagnu and want to recommend a newer dejagnu for jit testing, that’s fine. A tester should be free to use the current dejagnu they use and it should result in no regressions in the test suite for the non-jit parts. They should also be free to update to the latest dejagnu and use it and see no regressions across the non-jit suite. If the jit people want to simplify their lives and return early from their top-level if the dejagnu version is too old, essentially requiring a tester to use a newer dejagnu if they want to see jit test suite results, even that is fine with me. We can even update the recommended version of dejagnu while still keeping the current recommended version working; I’m not opposed to that.
Re: [patch] Fix std::shared_ptr FAILs with -fno-rtti
On 12/12/14 21:06 +0000, Jonathan Wakely wrote:

A couple of small fixes for shared_ptr tests that fail with -fno-rtti. Tested x86_64-linux, committed to trunk.

Huh, somehow I committed that when the new test wasn't even passing. This fixes it properly. Tested again and committed to trunk.

commit b9fd3adf17cf8b03df10288f053615d7411d5fb5
Author: redi <redi@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Sat Dec 13 00:44:17 2014 +0000

    PR libstdc++/58594
    * include/bits/shared_ptr_base.h: Real fix for cv-qualified types.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@218698 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 3ef783f..ad68c23 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1106,7 +1106,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       template<typename _Alloc>
         struct _Deleter
         {
-          void operator()(_Tp* __ptr)
+          void operator()(typename _Alloc::value_type* __ptr)
           {
             __allocated_ptr<_Alloc> __guard{ _M_alloc, __ptr };
             allocator_traits<_Alloc>::destroy(_M_alloc, __guard.get());
@@ -1123,14 +1123,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
             rebind_traits<typename std::remove_cv<_Tp>::type> __traits;
           _Deleter<typename __traits::allocator_type> __del = { __a };
           auto __guard = std::__allocate_guarded(__del._M_alloc);
-          _M_ptr = __guard.get();
+          auto __ptr = __guard.get();
           // _GLIBCXX_RESOLVE_LIB_DEFECTS
           // 2070. allocate_shared should use allocator_traits<A>::construct
-          __traits::construct(__del._M_alloc, _M_ptr,
+          __traits::construct(__del._M_alloc, __ptr,
                               std::forward<_Args>(__args)...);
           __guard = nullptr;
-          __shared_count<_Lp> __count(_M_ptr, __del, __del._M_alloc);
+          __shared_count<_Lp> __count(__ptr, __del, __del._M_alloc);
           _M_refcount._M_swap(__count);
+          _M_ptr = __ptr;
           __enable_shared_from_this_helper(_M_refcount, _M_ptr, _M_ptr);
         }
 #endif
libgo patch committed: Avoid GC crash with callbacks in new thread
There is a private program that crashes when using gccgo in a rather complex scenario. Newly created C threads call into Go code, forcing the Go code to allocate new M and G structures. While executing Go code, the stack is split. The Go code then returns. Returning from a Go callback is treated as entering a system call, so the G gcstack field is set to point to the Go stack. In this case, though, we were called from a newly created C thread, so we drop the extra M and G structures. The C thread then exits.

Then a new C thread calls into Go code, reusing the previously created M and G. The Go code requires a larger stack frame, causing the old stack segment to be unmapped and a new stack segment allocated. At this point the gcstack field is pointing to the old stack segment.

Then a garbage collection occurs. The garbage collector sees that the gcstack field is not nil, so it scans it as the first stack segment. Unfortunately it points to memory that was unmapped. So the program crashes.

The fix is simple: when handling extra G structures created for callbacks from new C threads, clear the gcstack field. This patch implements that. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 1bed87327b5c libgo/runtime/proc.c
--- a/libgo/runtime/proc.c	Wed Dec 10 12:35:39 2014 -0800
+++ b/libgo/runtime/proc.c	Fri Dec 12 16:24:41 2014 -0800
@@ -1150,6 +1150,7 @@
 	__splitstack_getcontext(&g->stack_context[0]);
 #else
 	g->gcinitial_sp = &mp;
+	g->gcstack = nil;
 	g->gcstack_size = 0;
 	g->gcnext_sp = &mp;
 #endif
@@ -1251,6 +1252,8 @@
 	runtime_setmg(nil, nil);
 
 	mp->curg->status = Gdead;
+	mp->curg->gcstack = nil;
+	mp->curg->gcnext_sp = nil;
 
 	mnext = lockextra(true);
 	mp->schedlink = mnext;
libgo patch committed: Add testing.MainStart
In Go 1.4 the go test command uses the new function testing.MainStart. That function will be normally brought into libgo when we upgrade it to the 1.4 library. However, for convenience for people using gccgo with the Go 1.4 gc release, I've committed this patch to bring it into gccgo now. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 9ac141d2f527 libgo/go/testing/testing.go
--- a/libgo/go/testing/testing.go	Fri Dec 12 16:49:18 2014 -0800
+++ b/libgo/go/testing/testing.go	Fri Dec 12 16:57:00 2014 -0800
@@ -117,6 +117,26 @@
 // The entire test file is presented as the example when it contains a single
 // example function, at least one other function, type, variable, or constant
 // declaration, and no test or benchmark functions.
+//
+// Main
+//
+// It is sometimes necessary for a test program to do extra setup or teardown
+// before or after testing. It is also sometimes necessary for a test to control
+// which code runs on the main thread. To support these and other cases,
+// if a test file contains a function:
+//
+//	func TestMain(m *testing.M)
+//
+// then the generated test will call TestMain(m) instead of running the tests
+// directly. TestMain runs in the main goroutine and can do whatever setup
+// and teardown is necessary around a call to m.Run. It should then call
+// os.Exit with the result of m.Run.
+//
+// The minimal implementation of TestMain is:
+//
+//	func TestMain(m *testing.M) { os.Exit(m.Run()) }
+//
+// In effect, that is the implementation used when no TestMain is explicitly defined.
 package testing
 
 import (
@@ -426,23 +446,49 @@
 // An internal function but exported because it is cross-package; part of the implementation
 // of the go test command.
 func Main(matchString func(pat, str string) (bool, error), tests []InternalTest, benchmarks []InternalBenchmark, examples []InternalExample) {
+	os.Exit(MainStart(matchString, tests, benchmarks, examples).Run())
+}
+
+// M is a type passed to a TestMain function to run the actual tests.
+type M struct {
+	matchString func(pat, str string) (bool, error)
+	tests       []InternalTest
+	benchmarks  []InternalBenchmark
+	examples    []InternalExample
+}
+
+// MainStart is meant for use by tests generated by 'go test'.
+// It is not meant to be called directly and is not subject to the Go 1 compatibility document.
+// It may change signature from release to release.
+func MainStart(matchString func(pat, str string) (bool, error), tests []InternalTest, benchmarks []InternalBenchmark, examples []InternalExample) *M {
+	return &M{
+		matchString: matchString,
+		tests:       tests,
+		benchmarks:  benchmarks,
+		examples:    examples,
+	}
+}
+
+// Run runs the tests. It returns an exit code to pass to os.Exit.
+func (m *M) Run() int {
 	flag.Parse()
 	parseCpuList()
 
 	before()
 	startAlarm()
-	haveExamples = len(examples) > 0
-	testOk := RunTests(matchString, tests)
-	exampleOk := RunExamples(matchString, examples)
+	haveExamples = len(m.examples) > 0
+	testOk := RunTests(m.matchString, m.tests)
+	exampleOk := RunExamples(m.matchString, m.examples)
 	stopAlarm()
 	if !testOk || !exampleOk {
 		fmt.Println("FAIL")
 		after()
-		os.Exit(1)
+		return 1
 	}
 	fmt.Println("PASS")
-	RunBenchmarks(matchString, benchmarks)
+	RunBenchmarks(m.matchString, m.benchmarks)
 	after()
+	return 0
 }
 
 func (t *T) report() {
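The core of the change above is that Run returns an exit code instead of calling os.Exit itself, so a caller can do teardown before exiting. The following is a minimal standalone sketch of that pattern, not the testing package: runner and start are hypothetical stand-ins for testing.M and MainStart, and the checks are plain functions rather than real tests.

```go
package main

import (
	"fmt"
	"os"
)

// runner is a hypothetical stand-in for testing.M: it holds the work
// to do and reports the outcome as an exit code.
type runner struct {
	checks []func() bool
}

// start mirrors the role of MainStart: it only packages its arguments.
func start(checks ...func() bool) *runner {
	return &runner{checks: checks}
}

// Run executes every check and returns 0 on success or 1 on the first
// failure, leaving the decision to exit to the caller.
func (r *runner) Run() int {
	for _, check := range r.checks {
		if !check() {
			fmt.Println("FAIL")
			return 1
		}
	}
	fmt.Println("PASS")
	return 0
}

func main() {
	// Setup would go here.
	code := start(func() bool { return 1+1 == 2 }).Run()
	// Teardown would go here; it still runs because Run returned.
	os.Exit(code)
}
```

In a real test file the equivalent is a TestMain(m *testing.M) function that wraps m.Run() the same way, as described in the documentation comment added by the patch.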
Go patch committed: Don't move nil subexpressions into temporaries
This patch from Chris Manghane changes the Go frontend to avoid moving nil subexpressions into temporary variables. This fixes GCC PR 61254, a compiler crash on some rather unlikely, but valid, code. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r f557f41e0008 go/statements.cc
--- a/go/statements.cc	Fri Dec 12 17:00:18 2014 -0800
+++ b/go/statements.cc	Fri Dec 12 17:40:13 2014 -0800
@@ -677,7 +677,8 @@
 {
   if (this->skip_ > 0)
     --this->skip_;
-  else if ((*pexpr)->temporary_reference_expression() == NULL)
+  else if ((*pexpr)->temporary_reference_expression() == NULL
+           && !(*pexpr)->is_nil_expression())
     {
       Location loc = (*pexpr)->location();
       Temporary_statement* temp = Statement::make_temporary(NULL, *pexpr, loc);
Re: [Patch][testsuite] Fix a few test cases
On 14-12-12 04:29 PM, Mike Stump wrote:

On Dec 12, 2014, at 11:38 AM, Ryan Mansfield rmansfi...@qnx.com wrote:

Here are a few test tweaks. In 921202-1.c, if STACK_SIZE is used then VLEN will blow the stack with 64bit longs. e.g. if STACK_SIZE == 512K then 3 arrays of 32767 longs means at a minimum 767K of stack will be used at -O0. In pr51447.c, the rbx global register is clobbering the rbx of main's caller, which can cause test case crashes on return.

2014-12-12 Ryan Mansfield rmansfi...@qnx.com

* gcc.c-torture/execute/921202-1.c: Adjust VLEN.
* gcc.c-torture/execute/pr51447.c: Restore rbx for x86-64.
* gcc.dg/cpp/trad/include.c: Exclude QNX targets.

OK?

Ok for first and third part. The second one, I would like an rbx/x86_64 person to review. Thanks.

rbx is callee saved, and it's being clobbered. e.g. on Linux x86-64

Breakpoint 1, main () at /home/ryan/gnu/gcc/trunk/gcc/testsuite/gcc.c-torture/execute/pr51447.c:13
13 {
1: x/i $pc
=> 0x40054e <main>: push %rbp
(gdb) info reg rbx
rbx 0x0 0
(gdb) s
26 return 0;
1: x/i $pc
=> 0x400596 <main+72>: mov $0x0,%eax
(gdb) info reg rbx
rbx 0x400586 4195718

The global register var docs say:

A function that can alter the value of a global register variable cannot safely be called from a function compiled without this variable, because it could clobber the value the caller expects to find there on return. Therefore, the function that is the entry point into the part of the program that uses the global register variable must explicitly save and restore the value that belongs to its caller.

The updated diff switches the test to use inline asm to save/restore rbx instead of the local reg vars, but maybe there's a better way.

Regards,
Ryan Mansfield

Index: gcc/testsuite/gcc.c-torture/execute/921202-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/921202-1.c	(revision 218685)
+++ gcc/testsuite/gcc.c-torture/execute/921202-1.c	(working copy)
@@ -2,7 +2,7 @@
 #ifndef STACK_SIZE
 #define VLEN 2055
 #else
-#define VLEN ((STACK_SIZE/16) - 1)
+#define VLEN ((STACK_SIZE/32) - 1)
 #endif
 main ()
 {
Index: gcc/testsuite/gcc.c-torture/execute/pr51447.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/pr51447.c	(revision 218685)
+++ gcc/testsuite/gcc.c-torture/execute/pr51447.c	(working copy)
@@ -14,6 +14,10 @@
 main (void)
 {
   __label__ nonlocal_lab;
+#ifdef __x86_64__
+  void *saved_rbx;
+  asm volatile ("movq %%rbx, %0" : "=r" (saved_rbx) : : );
+#endif
   __attribute__((noinline, noclone)) void
     bar (void *func)
       {
@@ -21,9 +25,15 @@
         goto nonlocal_lab;
       }
   bar (&&nonlocal_lab);
+#ifdef __x86_64__
+  asm volatile ("movq %0, %%rbx" : : "r" (saved_rbx) : "rbx" );
+#endif
   return 1;
 nonlocal_lab:
   if (ptr != &&nonlocal_lab)
     abort ();
+#ifdef __x86_64__
+  asm volatile ("movq %0, %%rbx" : : "r" (saved_rbx) : "rbx" );
+#endif
   return 0;
 }
Index: gcc/testsuite/gcc.dg/cpp/trad/include.c
===================================================================
--- gcc/testsuite/gcc.dg/cpp/trad/include.c	(revision 218685)
+++ gcc/testsuite/gcc.dg/cpp/trad/include.c	(working copy)
@@ -1,11 +1,11 @@
 /* Copyright (c) 2002 Free Software Foundation Inc.  */
 
-/* Test that macros are not expanded in the quotes of #inlcude.  */
+/* Test that macros are not expanded in the quotes of #include.  */
 
 /* vxWorksCommon.h uses the # operator to construct the name of an
    include file, thus making the file incompatible with -traditional-cpp.
    Newlib uses ## when including stdlib.h as of 2007-09-07.  */
-/* { dg-do preprocess { target { { ! vxworks_kernel } && { ! newlib } } } } */
+/* { dg-do preprocess { target { { ! vxworks_kernel } && { ! newlib } && { ! *-*-qnx* } } } } */
 
 #define __STDC__ 1 /* Stop complaints about non-ISO compilers.  */
 #define stdlib 1
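The stack arithmetic behind the VLEN change in 921202-1.c can be checked with a few lines. This is a rough sketch only, assuming three arrays of VLEN elements and an 8-byte long as described in the message; stackBytes is a hypothetical helper, and it ignores any other stack usage in the test.

```go
package main

import "fmt"

// stackBytes returns the bytes needed for `arrays` arrays of vlen
// elements, each elemSize bytes wide.
func stackBytes(arrays, vlen, elemSize int) int {
	return arrays * vlen * elemSize
}

func main() {
	const stackSize = 512 * 1024 // STACK_SIZE from the example in the message
	const longSize = 8           // assumed: 64-bit long

	oldVlen := stackSize/16 - 1 // old definition of VLEN
	newVlen := stackSize/32 - 1 // definition after the patch

	fmt.Println(oldVlen, stackBytes(3, oldVlen, longSize), stackSize)
	fmt.Println(newVlen, stackBytes(3, newVlen, longSize), stackSize)
}
```

With these assumptions the old definition needs 786408 bytes (about 767K) against a 524288-byte stack, while the new one needs 393192 bytes, which fits.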
libgo patch committed: Permit delete from map with zero length key
A map with a zero length key is useless, but valid. gccgo was incorrectly crashing when deleting an entry from such a map. This patch by Chris Manghane fixes the problem. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 28f37e5c57fe libgo/runtime/go-map-delete.c
--- a/libgo/runtime/go-map-delete.c	Fri Dec 12 17:40:24 2014 -0800
+++ b/libgo/runtime/go-map-delete.c	Fri Dec 12 17:55:34 2014 -0800
@@ -35,7 +35,10 @@
   key_descriptor = descriptor->__map_descriptor->__key_type;
   key_offset = descriptor->__key_offset;
   key_size = key_descriptor->__size;
-  __go_assert (key_size != 0 && key_size != -1UL);
+  if (key_size == 0)
+    return;
+
+  __go_assert (key_size != -1UL);
   equalfn = key_descriptor->__equalfn;
 
   key_hash = key_descriptor->__hashfn (key, key_size);
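For illustration, here is the kind of Go program that exercises a zero-size key type. It is a sketch written for this note, not the reproducer from the bug report; deleteEmptyKey is a hypothetical name, and struct{} is used as one example of a key type whose size is zero.

```go
package main

import "fmt"

// deleteEmptyKey stores and then deletes an entry whose key type has
// size zero, returning the map length before and after the delete.
func deleteEmptyKey() (before, after int) {
	m := map[struct{}]string{}
	m[struct{}{}] = "only possible entry"
	before = len(m)
	delete(m, struct{}{})
	after = len(m)
	return before, after
}

func main() {
	before, after := deleteEmptyKey()
	fmt.Println(before, after)
}
```

Such a map can hold at most one entry, since every key compares equal, which is why it is useless in practice yet still valid code.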
Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484
Thanks for reviewing the patch. See my comments inlined:

This patch fixes these two issues. Three changes:

1. vfma_f32, vfmaq_f32, vfms_f32, vfmsq_f32 are only available for arm*-*-* target with the FMA feature, we take care of this through the macro __ARM_FEATURE_FMA.
2. vfma_n_f32 and vfmaq_n_f32 are only available for aarch64 target, we take care of this through the macro __aarch64__.
3. vfmaq_f64, vfmaq_n_f64 and vfmsq_f64 are only available for aarch64 target, we just exclude test for them to keep the testcases clean. (Note: They also pass on aarch64 and aarch64_be targets and we can add tests for them if needed.)

I would prefer to have all the available variants tested.

OK, the attached v2 patch has all the available variants added.

+#ifdef __aarch64__
 /* Expected results.  */
 VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d };
 VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 };
-VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x408906e1532b8520, 0x40890ee1532b8520 };

Why do you remove this one?

We need to make some changes to the header files for this test. Initially, I didn't want to touch the header files, so I reduced this testcase to a minimal one.

 int main (void)
 {
+#ifdef __ARM_FEATURE_FMA
   exec_vfms ();
+#endif
   return 0;
 }

In the other tests, I try to put as much code in common as possible, between the 'a' and 's' variants (e.g. vmla/vmls). Maybe you can do that as a follow-up?

Yes, I think we can handle this with a follow-on patch. The v2 patch is tested on armeb-linux-gnueabi, arm-linux-gnueabi, aarch64-linux-gnu and aarch64_be-linux-gnu. How about this one? Thanks.
Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h
===================================================================
--- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h	(revision 218582)
+++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h	(working copy)
@@ -142,6 +142,10 @@ VECT_VAR_DECL_INIT(buffer, poly, 16, 8);
 PAD(buffer_pad, poly, 16, 8);
 VECT_VAR_DECL_INIT(buffer, float, 32, 4);
 PAD(buffer_pad, float, 32, 4);
+#ifdef __aarch64__
+VECT_VAR_DECL_INIT(buffer, float, 64, 2);
+PAD(buffer_pad, float, 64, 2);
+#endif
 
 /* The tests for vld1_dup and vdup expect at least 4 entries in the
    input buffer, so force 1- and 2-elements initializers to have 4
Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c	(revision 218582)
+++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c	(working copy)
@@ -2,6 +2,7 @@
 #include "arm-neon-ref.h"
 #include "compute-ref-data.h"
 
+#if defined(__aarch64__) && defined(__ARM_FEATURE_FMA)
 /* Expected results.  */
 VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d };
 VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 };
@@ -9,28 +10,29 @@ VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x40890
 #define VECT_VAR_ASSIGN(S,Q,T1,W) S##Q##_##T1##W
 #define ASSIGN(S, Q, T, W, V) T##W##_t S##Q##_##T##W = V
-#define TEST_MSG "VFMA/VFMAQ"
+#define TEST_MSG "VFMA_N/VFMAQ_N"
+
 void exec_vfma_n (void)
 {
   /* Basic test: v4=vfma_n(v1,v2), then store the result.  */
-#define TEST_VFMA(Q, T1, T2, W, N) \
+#define TEST_VFMA_N(Q, T1, T2, W, N) \
   VECT_VAR(vector_res, T1, W, N) = \
     vfma##Q##_n_##T2##W(VECT_VAR(vector1, T1, W, N), \
-                        VECT_VAR(vector2, T1, W, N), \
-                        VECT_VAR_ASSIGN(Scalar, Q, T1, W)); \
+                        VECT_VAR(vector2, T1, W, N), \
+                        VECT_VAR_ASSIGN(scalar, Q, T1, W)); \
   vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N))
 
-#define CHECK_VFMA_RESULTS(test_name,comment) \
+#define CHECK_VFMA_N_RESULTS(test_name,comment) \
   { \
     CHECK_FP(test_name, float, 32, 2, PRIx32, expected, comment); \
     CHECK_FP(test_name, float, 32, 4, PRIx32, expected, comment); \
-    CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \
-  }
+    CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \
+  }
 
 #define DECL_VABD_VAR(VAR) \
   DECL_VARIABLE(VAR, float, 32, 2); \
   DECL_VARIABLE(VAR, float, 32, 4); \
-  DECL_VARIABLE(VAR, float, 64, 2);
+  DECL_VARIABLE(VAR, float, 64, 2);
 
 DECL_VABD_VAR(vector1);
 DECL_VABD_VAR(vector2);
@@ -50,20 +52,23
Re: [patch c++]: Fix 61228 - noexcept(expression) causes internal compiler error
On 12/12/2014 03:58 PM, Kai Tietz wrote:

2014-12-12 21:15 GMT+01:00 Jason Merrill ja...@redhat.com:

I think it would be better to call maybe_instantiate_noexcept so that we can have a definite answer.

Jason

Hmm, for case that decl != NULL_TREE, this is ok. But what if decl is NULL_TREE?

We shouldn't have a deferred noexcept in that case.

Jason