Re: [PATCH 199/236] Introduce rtx_insn_list subclass of rtx_def
On 08/07/14 09:33, David Malcolm wrote:
> On Wed, 2014-08-06 at 21:29 -0400, Trevor Saunders wrote:
>> On Wed, Aug 06, 2014 at 01:22:58PM -0400, David Malcolm wrote:
>>> +class GTY(()) rtx_insn_list : public rtx_def
>>> +{
>>> +  /* No extra fields, but adds invariant: (GET_CODE (X) == INSN_LIST).
>>
>> some nice future work would be to see if these can stop being rtxen at all and just have a insn and next pointer.
>
> Or some other data structures; see https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00825.html for an example I tried. [I don't know if it's a *good* example though :) ]

In the case of forced_labels, I believe the only things we ever do are prepend to the list and iterate over the list performing some action on each item in the list. Order on the list doesn't matter IIRC, nor do we ever do something like "give me element 3 in the list" or "find this element in the list".

Thus from an efficiency standpoint I don't see a big win for either vec or EXPR_LIST over the other. vec is probably better for iterating and access, but loses when we have to reallocate/copy the vector when we add elements to it. Space efficiency is probably better for vec.

Where I think vec shines is that anyone with a basic background in standard C++ libraries is going to know what a vector is (or a forward_list if folks really didn't want to go with a vec implementation). Old farts such as myself just know that EXPR_LIST is a forward list implemented using rtx nodes with the implied properties noted above. However, it's not something a newbie is going to just know -- thus they're going to have to dig a bit to come to those conclusions.

Changing to a vec or forward_list makes things clearer to someone casually reading the code and also carries to the reader some of those implied properties. And *that* is the reason why I think changing EXPR_LIST and INSN_LIST to be standard containers is a good move.
The change for forced_labels looks quite reasonable to me and I'd look favorably upon submitting that as an RFA once the bootstrap and testing is done. jeff
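The access pattern described above (add an element, later visit every element, order irrelevant) can be sketched with standard containers. This is only an illustration in standard C++, not GCC's own vec; `Label`, `ForcedLabelsList`, `ForcedLabelsVec`, `visit_all` and `count_labels` are hypothetical names that do not appear in the patch.

```cpp
#include <forward_list>
#include <string>
#include <vector>

// Hypothetical stand-in for an insn/label; not a GCC type.
struct Label {
    std::string name;
};

// Forward-list flavour: mirrors the INSN_LIST idiom of prepending a node.
struct ForcedLabelsList {
    std::forward_list<Label> items;
    void add(const Label &l) { items.push_front(l); }
};

// Vector flavour: appends instead, which is fine because order is irrelevant.
struct ForcedLabelsVec {
    std::vector<Label> items;
    void add(const Label &l) { items.push_back(l); }
};

// The only other operation the use case needs: visit every element once.
template <typename Container, typename Fn>
void visit_all(const Container &c, Fn fn) {
    for (const Label &l : c.items)
        fn(l);
}

// Small helper built on visit_all: count the stored labels.
template <typename Container>
int count_labels(const Container &c) {
    int n = 0;
    visit_all(c, [&n](const Label &) { ++n; });
    return n;
}
```

Either container supports both operations; the choice mainly changes what a casual reader can infer from the type, which is the point made in the message.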
Re: TAGs for variables created through common.opt
On 08/21/14 11:53, Aldy Hernandez wrote: Well, whadayaknow... Tom Tromey pointed me at --regex which we can use to add patterns for not only the .opt files, but for a bunch of other files/languages we define in GCC. The following patch adds support for common.opt, rtl.def, tree.def, and gimple.def. Now you can use your editor to tag things like GIMPLE_NOP, and PLUS_EXPR, which means I'll get lost less often. Kinda neat, IMO. Tested by inspecting TAGS.sub manually, as well as searching for random stuff with both vi and emacs. OK for mainline? OK. jeff
Re: [patch] propagate INSTALL Makefile variables down from gcc/
On 08/21/14 09:49, Olivier Hainque wrote: Hello, Experiments with custom install programs exposed that the INSTALL series of Makefile variables aren't propagated down from the gcc subdir. This patch fixes this. Checked that it addressed the unexpected behavior we were observing + bootstrapped regtested on x86_64-linux-gnu. OK to commit ? Thanks in advance for your feedback, With Kind Regards, Olivier 2014-08-21 Nicolas Roche ro...@adacore.com * Makefile.in (FLAGS_TO_PASS): Propagate INSTALL, INSTALL_DATA, INSTALL_SCRIPT and INSTALL_PROGRAM as well. OK. Jeff
Re: [PATCH] Avoid redundant indirect_info computation during indirect edge cloning
On 08/18/14 06:07, Ilya Enkovich wrote:
> On 15 Aug 23:08, Jan Hubicka wrote:
>>> Hi,
>>>
>>> I get a segfault in decl_maybe_in_construction_p during function versioning. We have the following steps in clone creation (e.g. as in create_version_clone_with_body):
>>>
>>> 1. Create function decl
>>> 2. Create clone of cgraph node
>>> 3. Copy function body
>>>
>>> After the first step there is no body attached to the function and DECL_STRUCT_FUNCTION for the new decl is NULL. It is initialized in the third step. But in the second step get_polymorphic_call_info may be called for the new function; it calls decl_maybe_in_construction_p, which assumes DECL_STRUCT_FUNCTION already exists.
>>>
>>> I first wanted to fix decl_maybe_in_construction_p but then realized cgraph_clone_edge copies indirect_info from the original edge anyway and therefore its computation is not required at all. The following patch removes the redundant indirect_info computation.
>>>
>>> Bootstrapped and regtested on linux-x86_64. Does it look OK for trunk?
>>
>> OK, please also add the testcase from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61800
>>
>> Thanks,
>> Honza
>
> Here is a version with testcase to be committed.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-08-18 Ilya Enkovich ilya.enkov...@intel.com
>
> 	PR ipa/61800
> 	* cgraph.h (cgraph_node::create_indirect_edge): Add compute_indirect_info param.
> 	* cgraph.c (cgraph_node::create_indirect_edge): Compute indirect_info only when it is required.
> 	* cgraphclones.c (cgraph_clone_edge): Do not recompute indirect_info for cloned indirect edge.

Just to be 100% clear, this version is OK as well.

jeff
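The ordering problem and the fix can be modelled with a toy example. This is a sketch under stated assumptions, not the cgraph code: `Node`, `Edge`, `FunctionBody`, `IndirectInfo`, `analyze_call` and `clone_edge` are hypothetical stand-ins, and only the idea of a `compute_indirect_info` parameter is taken from the ChangeLog.

```cpp
#include <memory>
#include <stdexcept>

// Toy stand-ins; none of these are GCC's real types.
struct FunctionBody { int n_stmts; };
struct IndirectInfo { bool polymorphic; };

struct Node {
    std::unique_ptr<FunctionBody> body;  // null until the body is copied
};

struct Edge {
    IndirectInfo info;
};

// An analysis that needs the body, playing the role of the call chain
// that ended in decl_maybe_in_construction_p in the report.
IndirectInfo analyze_call(const Node &caller) {
    if (!caller.body)
        throw std::logic_error("caller has no body yet");
    return IndirectInfo{caller.body->n_stmts > 0};
}

// The extra flag lets callers that already have the info skip the analysis.
Edge create_indirect_edge(const Node &caller, bool compute_indirect_info = true) {
    Edge e{IndirectInfo{false}};
    if (compute_indirect_info)
        e.info = analyze_call(caller);
    return e;
}

// Cloning copies the info from the original edge, so nothing is recomputed
// and the clone's still-missing body is never inspected.
Edge clone_edge(const Edge &orig, const Node &new_caller) {
    Edge e = create_indirect_edge(new_caller, /*compute_indirect_info=*/false);
    e.info = orig.info;
    return e;
}
```

In this model, calling `create_indirect_edge` with the default flag on a node whose body has not been copied yet fails, while `clone_edge` succeeds on the same node.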
Re: [PATCH] Drop user_defined_section_attribute, directly check DECL_SECTION_NAME instead
On 08/27/14 16:30, Yi Yang wrote: Ping On Mon, Aug 11, 2014 at 3:10 PM, Yi Yang ahyan...@google.com wrote: Sorry, it is a typo :( Patch v2: -- 2014-08-11 Yi Yang ahyan...@google.com gcc: * bb-reorder.c (pass_partition_blocks::gate): Replace check. * c-family/c-common.c (handle_section_attribute): Remove user_defined_section_attribute * final.c (rest_of_handle_final): ditto * toplev.c (user_defined_section_attribute): ditto * toplev.h (user_defined_section_attribute): ditto OK. Jeff
Re: [PATCH i386 AVX512] [31/n] Update float unspec namely storeu,rcp14,rsqrt14,scalef,getexp,fixupimm,rndscale,getmant.
On Fri, Aug 29, 2014 at 3:23 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, Patch in the bottom updates few UNSPEC insn patterns w/ new mode iterator. Additionally names were slightly changed. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/i386.c (avx512f_getmantv2df_round): Rename to ... (avx512f_vgetmantv2df_round): this. (avx512f_getmantv4sf_round): Rename to ... (avx512f_vgetmantv4sf_round): this. (ix86_expand_args_builtin): Handle avx512vl_getmantv8sf_mask, avx512vl_getmantv4df_mask, avx512vl_getmantv4sf_mask, avx512vl_getmantv2df_mask. (ix86_expand_round_builtin): Handle avx512f_vgetmantv2df_round, avx512f_vgetmantv4sf_round. * config/i386/sse.md (define_insn avx512f_storeussemodesuffix512_mask): Delete. (define_insn avx512_storeussemodesuffixavxsizesuffix_mask): New. (define_insn mask_codeforrcp14modemask_name): Use VF_AVX512VL. (define_insn mask_codeforrsqrt14modemask_name): Ditto. (define_insn avx512f_scalefmodemask_nameround_name): Delete. (define_insn avx512_scalefmodemask_nameround_name): New. (define_insn avx512f_getexpmodemask_nameround_saeonly_name): Delete. (define_insn avx512_getexpmodemask_nameround_saeonly_name): New. (define_expand avx512f_fixupimmmode_maskzround_saeonly_expand_name): Delete. (define_expand avx512_fixupimmmode_maskzround_saeonly_expand_name): New. (define_insn avx512f_fixupimmmodesd_maskz_nameround_saeonly_name): Delete. (define_insn avx512_fixupimmmodesd_maskz_nameround_saeonly_name): New. (define_insn avx512f_fixupimmmode_maskround_saeonly_name): Delete. (define_insn avx512_fixupimmmode_maskround_saeonly_name): New. (define_insn avx512f_rndscalemodemask_nameround_saeonly_name): Delete. (define_insn avx512_rndscalemodemask_nameround_saeonly_name): New. (define_insn avx512f_getmantmodemask_nameround_saeonly_name): Delete. (define_insn avx512_getmantmodemask_nameround_saeonly_name): New. (define_insn avx512f_getmantmoderound_saeonly_name): Rename to ... 
(define_insn avx512f_vgetmantmoderound_saeonly_name): this. Please change ChangeLog entries to mention that these patterns are *renamed*, not *deleted*. So, something like: (new pattern): Rename from old pattern and use VF_AVX512VL mode iterator. Otherwise, nice patch that actually shows the power of mode iterators and mode attributes! OK with updated ChangeLog. Thanks, Uros.
Re: [PATCH i386 AVX512] [32/n] Add reduce,range,fpclass.
On Fri, Aug 29, 2014 at 3:55 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Patch in the bottom adds support for reduce,range,fpclass. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/i386.c (ix86_expand_args_builtin): Handle avx512dq_rangepv8df_mask_round, avx512dq_rangepv16sf_mask_round, avx512dq_rangepv4df_mask, avx512dq_rangepv8sf_mask, avx512dq_rangepv2df_mask, avx512dq_rangepv4sf_mask. * config/i386/sse.md (define_c_enum unspec): Add UNSPEC_REDUCE, UNSPEC_FPCLASS, UNSPEC_FPCLASS_SCALAR, UNSPEC_RANGE, UNSPEC_RANGE_SCALAR. (define_insn mask_codeforreducepmodemask_name): New. (define_insn reducesmode): Ditto. (define_insn avx512dq_rangepmodemask_nameround_saeonly_name): Ditto. (define_insn avx512dq_rangesmoderound_saeonly_name): Ditto. (define_insn avx512dq_fpclassmodemask_scalar_merge_name): Ditto. (define_insn avx512dq_vmfpclassmode): Ditto. -- Thanks, K diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ff37ffe..15cdb5e 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -34114,6 +34114,12 @@ ix86_expand_args_builtin (const struct builtin_description *d, case CODE_FOR_avx512vl_getmantv4df_mask: case CODE_FOR_avx512vl_getmantv4sf_mask: case CODE_FOR_avx512vl_getmantv2df_mask: + case CODE_FOR_avx512dq_rangepv8df_mask_round: + case CODE_FOR_avx512dq_rangepv16sf_mask_round: + case CODE_FOR_avx512dq_rangepv4df_mask: + case CODE_FOR_avx512dq_rangepv8sf_mask: + case CODE_FOR_avx512dq_rangepv2df_mask: + case CODE_FOR_avx512dq_rangepv4sf_mask: error (the last argument must be a 4-bit immediate); return const0_rtx; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d85f9a4..c505526 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -128,6 +128,13 @@ UNSPEC_SHA256MSG1 UNSPEC_SHA256MSG2 UNSPEC_SHA256RNDS2 + + ;; For AVX512DQ support + UNSPEC_REDUCE + UNSPEC_FPCLASS + UNSPEC_FPCLASS_SCALAR + UNSPEC_RANGE + UNSPEC_RANGE_SCALAR ]) It looks to 
me that the _SCALAR unspecs are redundant, and it should be possible to use UNSPEC_REDUCE for all patterns without unwanted matching. Uros.
Re: [PATCH i386 AVX512] [33/n] Add patterns for compress, expand.
On Fri, Aug 29, 2014 at 4:00 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Patch in the bottom extends support of compress and expand insns.
>
> Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator.
>
> Is it ok for trunk?
>
> gcc/
> 	* config/i386/sse.md (define_mode_iterator VI48F): New.
> 	(define_insn avx512f_compressmode_mask): Delete.
> 	(define_insn avx512_compressmode_mask): New.
> 	(define_insn avx512f_compressstoremode_mask): Delete.
> 	(define_insn avx512_compressstoremode_mask): New.
> 	(define_expand avx512f_expandmode_maskz): Delete.
> 	(define_expand avx512_expandmode_maskz): New.
> 	(define_insn avx512f_expandmode_mask): Delete.
> 	(define_insn avx512_expandmode_mask): New.

Again, rename instead of delete/new.

The patch is OK. Please note that you can use UNSPEC_COMPRESS everywhere; there is no need for UNSPEC_COMPRESS_STORE, so it can be deleted.

On a related note, it looks to me that the UNSPEC_COMPRESS patterns can be merged together by using nonimmediate operand 0 and matched memory. Please see the last constraint of sse_movhlps for an example. This can be a follow-up patch.

Thanks,
Uros.
Re: [Patch, Fortran] CAF dep (1/3): PR62278 - improve dependency.c's gfc_check_dependency's check (missed-optimization)
Dear Tobias,

Given Dominique's news that this fixes a golden oldie that drove me to madness and PR60593, this is OK for trunk, 4.8 and 4.9.

Many thanks for the patch

Paul

On 27 August 2014 22:59, Tobias Burnus bur...@net-b.de wrote:
> The current gfc_check_dependency check always looked at the pointer attribute - and assumed the worst if it was set on either the LHS or the RHS. Thus, it claimed that a and p alias for the following definition: integer, pointer :: p; integer :: a. However, as a has no target (or pointer) attribute, that's not possible. Additionally, class(t) :: a has internally the pointer attribute (but CLASS_DATA(sym)->attr.class_pointer == 0); however, in the Fortran sense, a is not a pointer and cannot alias.
>
> I do not have a good example for the test case, except for a similar one as above using a[i] = p and looking at the dump; but that requires patch 3/3 of this series.
>
> Build and regtested on x86-64-gnu-linux. (I do get a failure for gfortran.dg/graphite/pr42393.f90, but only with -O1 -fgraphite-identity and also without the patch.)
>
> OK for the trunk?
>
> Tobias
> --
> The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
Make many more options use CPP()
This converts almost all remaining CPP options to use CPP() in the *.opt files. Following your comment here (https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02499.html) several options now use explicit Init(). The only test I needed to tweak was gcc.dg/cpp/endif-pedantic2.c, which tests that -Wno-endif-labels -pedantic-errors gives errors for -Wendif-labels warnings. This might be the original intention (https://gcc.gnu.org/ml/gcc-patches/2002-03/msg01732.html) but it is at odds with how other -Wno-* flags work. The current behavior enforced by the automatic machinery is that more specific options have priority over more general options, independently of the order. Although this is not documented in the manual, it has been the consensus of several recent discussions (and not so recent ones https://gcc.gnu.org/ml/gcc/2007-05/msg00720.html). It would be a burden to special-case -Wno-endif-labels just for keeping backwards compatibility. The affected users are likely very few (if any): those that expect -Wno-endif-labels -pedantic to give -Wendif-labels warnings. The effect on those users would be that they won't get the warnings they expect until they remove -Wno-endif-labels or append an explicit -Wendif-labels. The two remaining options are a bit problematic: 1) -Wall enables warn_sign_change, which does not have a -W* flag. I could simply add one but perhaps it would be better under an already existing flag. 2) -Wnormalized requires a bit more of special handling. I will try to use the Enum() facility in the *.opt file if possible. Bootstrapped and regression tested on x86_64-linux-gnu. OK? gcc/ChangeLog: 2014-08-30 Manuel López-Ibáñez m...@gcc.gnu.org * doc/options.texi: Document that Var and Init are required if CPP is given. * optc-gen.awk: Require Var and Init if CPP is given. * common.opt (Wpedantic): Use Init. libcpp/ChangeLog: 2014-08-30 Manuel López-Ibáñez m...@gcc.gnu.org * macro.c (replace_args): Use cpp_pedwarning, cpp_warning and CPP_W flags. 
* include/cpplib.h: Add CPP_W_C90_C99_COMPAT and CPP_W_PEDANTIC. * init.c (cpp_create_reader): Do not init to -1 here. * expr.c (num_binary_op): Use cpp_pedwarning. gcc/c-family/ChangeLog: 2014-08-30 Manuel López-Ibáñez m...@gcc.gnu.org * c.opt (Wc90-c99-compat,Wc++-compat,Wcomment,Wendif-labels, Winvalid-pch,Wlong-long,Wmissing-include-dirs,Wmultichar,Wpedantic, (Wdate-time,Wtraditional,Wundef,Wvariadic-macros): Add CPP, Var and Init. * c-opts.c (c_common_handle_option): Do not handle here. (sanitize_cpp_opts): Likewise. * c-common.c (struct reason_option_codes_t): Handle CPP_W_C90_C99_COMPAT and CPP_W_PEDANTIC. gcc/testsuite/ChangeLog: 2014-08-30 Manuel López-Ibáñez m...@gcc.gnu.org * gcc.dg/cpp/endif-pedantic2.c: More general options do not override specific ones, but specific ones do. Index: gcc/doc/options.texi === --- gcc/doc/options.texi(revision 214735) +++ gcc/doc/options.texi(working copy) @@ -483,8 +483,9 @@ The option is omitted from the producer Even if this is a target option, this option will not be recorded / compared to determine if a precompiled header file matches. @item CPP(@var{var}) The state of this option should be kept in sync with the preprocessor -option @var{var}. +option @var{var}. If this property is set, then properties @code{Var} +and @code{Init} must be set as well. 
@end table Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt (revision 214735) +++ gcc/c-family/c.opt (working copy) @@ -294,19 +294,19 @@ Warn about boolean expression compared w Wbuiltin-macro-redefined C ObjC C++ ObjC++ CPP(warn_builtin_macro_redefined) Var(cpp_warn_builtin_macro_redefined) Init(1) Warning Warn when a built-in preprocessor macro is undefined or redefined Wc90-c99-compat -C ObjC Var(warn_c90_c99_compat) Init(-1) Warning +C ObjC CPP(cpp_warn_c90_c99_compat) Var(warn_c90_c99_compat) Init(-1) Warning Warn about features not present in ISO C90, but present in ISO C99 Wc99-c11-compat C ObjC Var(warn_c99_c11_compat) Init(-1) Warning Warn about features not present in ISO C99, but present in ISO C11 Wc++-compat -C ObjC Var(warn_cxx_compat) Warning +C ObjC Var(warn_cxx_compat) CPP(warn_cxx_operator_names) Init(0) Warning Warn about C constructs that are not in the common subset of C and C++ Wc++0x-compat C++ ObjC++ Var(warn_cxx0x_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Deprecated in favor of -Wc++11-compat @@ -326,11 +326,11 @@ Warn about subscripts whose type is \ch Wclobbered C ObjC C++ ObjC++ Var(warn_clobbered) Warning EnabledBy(Wextra) Warn about variables that might be changed by \longjmp\ or \vfork\ Wcomment -C ObjC C++ ObjC++ CPP(warn_comments) Var(cpp_warn_comment) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +C ObjC C++ ObjC++ CPP(warn_comments)
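The priority rule described in the message (an explicit specific option wins over a more general one, regardless of command-line order) can be sketched as below. This is a toy model, not the generated options machinery: the `Options` struct and `parse` function are hypothetical, and the fallback to the value of -pedantic is an assumption of the sketch that does not model GCC's real default for -Wendif-labels. The -1 value plays the role that Init(-1) plays in the .opt entries, namely "not set explicitly".

```cpp
#include <string>
#include <vector>

// -1 means "not set explicitly by the user".
struct Options {
    int pedantic = 0;
    int warn_endif_labels = -1;
};

// Pass 1 records only what the user wrote explicitly, in any order.
// Pass 2 lets the general option fill in whatever is still unset.
Options parse(const std::vector<std::string> &args) {
    Options o;
    for (const std::string &a : args) {
        if (a == "-pedantic")
            o.pedantic = 1;
        else if (a == "-Wendif-labels")
            o.warn_endif_labels = 1;
        else if (a == "-Wno-endif-labels")
            o.warn_endif_labels = 0;
    }
    // Assumption of this sketch: the general option supplies the default.
    if (o.warn_endif_labels == -1)
        o.warn_endif_labels = o.pedantic;
    return o;
}
```

Because the general option is applied only after all explicit settings are known, `-Wno-endif-labels -pedantic` and `-pedantic -Wno-endif-labels` give the same result in this model.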
Re: [PATCH v2] Re: PR62304 (was Re: (Still) ICE for cris-elf at r214710)
On Fri, 2014-08-29 at 23:41 -0600, Jeff Law wrote: On 08/29/14 12:07, David Malcolm wrote: Yes: I made various mistakes in reorg.c and resource.c where I assumed that a JUMP_LABEL(insn) was an insn, whereas the existing code is set up to handle RETURN nodes. Well, it would seem to me that reorg is being totally braindead in mixing and matching these two nodes. In particular whatever code is passing around a RETURN rtx into places that normally accept some kind of INSN would appear to be broken. It eliminates all uses of JUMP_LABEL_AS_INSN from reorg.c, and indeed after that there are only 6 uses in the tree (including config subdirs). Good to some extent as I see JUMP_LABEL_AS_INSN as papering over bugs elsewhere, but this patch is also a step backwards as we're papering over a mess in reorg.c. 2014-08-29 David Malcolm dmalc...@redhat.com PR bootstrap/62304 * gcc/reorg.c (skip_consecutive_labels): Convert return type and param back from rtx_insn * to rtx. Rename param from label to label_or_return, reintroducing label as an rtx_insn * after we've ensured it's not a RETURN. (first_active_target_insn): Likewise for return type and param; add a checked cast to rtx_insn * once we've ensured insn is not a RETURN. (steal_delay_list_from_target): Convert param pnew_thread back from rtx_insn ** to rtx *. Replace use of JUMP_LABEL_AS_INSN with JUMP_LABEL. (own_thread_p): Convert param thread back from an rtx_insn * to an rtx. Introduce local rtx_insn * thread_insn with a checked cast once we've established we're not dealing with a RETURN, renaming subsequent uses of thread to thread_insn. (fill_simple_delay_slots): Convert uses of JUMP_LABEL_AS_INSN back to JUMP_LABEL. (follow_jumps): Convert return type and param label from rtx_insn * back to rtx. Move initialization of value to after the handling for ANY_RETURN_P, adding a checked cast there to rtx_insn *. 
Convert local rtx_insn * this_label to an rtx and rename to this_label_or_return, reintroducing this_label as an rtx_insn * once we've handled the case where it could be an ANY_RETURN_P. (fill_slots_from_thread): Rename param thread to thread_or_return, converting from an rtx_insn * back to an rtx. Reintroduce name thread as an rtx_insn * local with a checked cast once we've handled the case of it being an ANY_RETURN_P. Convert local new_thread from an rtx_insn * back to an rtx. Add a checked cast when assigning to trial from new_thread. Convert use of JUMP_LABEL_AS_INSN back to JUMP_LABEL. Add a checked cast to rtx_insn * from new_thread when invoking get_label_before. (fill_eager_delay_slots): Convert locals target_label, insn_at_target from rtx_insn * back to rtx. Convert uses of JUMP_LABEL_AS_INSN back to JUMP_LABEL. (relax_delay_slots): Convert locals trial, target_label from rtx_insn * back to rtx. Convert uses of JUMP_LABEL_AS_INSN back to JUMP_LABEL. Add a checked cast to rtx_insn * on trial when invoking update_block. (dbr_schedule): Convert use of JUMP_LABEL_AS_INSN back to JUMP_LABEL; this removes all JUMP_LABEL_AS_INSN from reorg.c. * resource.h (mark_target_live_regs): Undo erroneous conversion of second param of r214693, converting it back from rtx_insn * to rtx, since it could be a RETURN. * resource.c (find_dead_or_set_registers): Similarly, convert param jump_target back from an rtx_insn ** to an rtx *, as we could be writing back a RETURN. Rename local rtx_insn * next to next_insn, and introduce lab_or_return as a local rtx, handling the case where JUMP_LABEL (this_jump_insn) is a RETURN. (mark_target_live_regs): Undo erroneous conversion of second param of r214693, converting it back from rtx_insn * to rtx, since it could be a RETURN. Rename it from target to target_maybe_return, reintroducing the name target as a local rtx_insn * with a checked cast, after we've handled the case of ANY_RETURN_P. 
I'll OK as a means to restore the trunk to working order, but let's add a follow-up item to track down places where we're passing things like a RETURN rtx in places where we really are expecting insns. Thanks; bootstrapped on x86_64 and ppc (gcc110); committed to trunk as r214752. I plan to have a close look at everywhere that JUMP_LABEL is not an insn, though I may wait to after current stage1 to do that; for this stage1 my primary objective for rtx-classes is to use them to document the existing status quo, and I hope to context-switch back to trying to merge the JIT branch in a week or two. Hope that sounds reasonable Dave
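The recurring shape in the ChangeLog above (accept the wide type, handle the RETURN case, then narrow with a checked cast) can be sketched as follows. This is a toy hierarchy for illustration only: `Rtx`, `RtxInsn`, `any_return_p` and `as_insn` are hypothetical stand-ins for rtx_def, rtx_insn, ANY_RETURN_P and the checked cast, and the loop is a simplification rather than the logic in reorg.c.

```cpp
#include <cassert>

// Toy node codes; GCC's real rtx codes are far more numerous.
enum class Code { Insn, CodeLabel, Return, SimpleReturn };

struct Rtx {
    Code code;
    explicit Rtx(Code c) : code(c) {}
};

struct RtxInsn : Rtx {
    RtxInsn *next = nullptr;
    explicit RtxInsn(Code c) : Rtx(c) {}
};

// Analogue of ANY_RETURN_P: true for nodes that are not insns.
inline bool any_return_p(const Rtx *x) {
    return x->code == Code::Return || x->code == Code::SimpleReturn;
}

// Analogue of a checked cast: assert the invariant, then downcast.
inline RtxInsn *as_insn(Rtx *x) {
    assert(!any_return_p(x));
    return static_cast<RtxInsn *>(x);
}

// Accept the wide type, deal with the RETURN case first, and only
// then narrow to an insn for the label-skipping loop.
Rtx *skip_consecutive_labels(Rtx *label_or_return) {
    if (any_return_p(label_or_return))
        return label_or_return;

    RtxInsn *label = as_insn(label_or_return);
    while (label->next && label->next->code == Code::CodeLabel)
        label = label->next;
    return label;
}
```

The parameter name `label_or_return` follows the renaming described in the ChangeLog; it tells the reader that the value is not yet known to be an insn.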
[committed] Update libstdc++ baseline symbols on hppa-linux
We now have support for future on hppa-linux. The attached change updates the baseline symbols for it. Tested on hppa-unknown-linux-gnu. Committed to trunk. Dave -- John David Anglin dave.ang...@bell.net 2014-08-30 John David Anglin dang...@gcc.gnu.org * config/abi/post/hppa-linux-gnu/baseline_symbols.txt: Update. Index: config/abi/post/hppa-linux-gnu/baseline_symbols.txt === --- config/abi/post/hppa-linux-gnu/baseline_symbols.txt (revision 214556) +++ config/abi/post/hppa-linux-gnu/baseline_symbols.txt (working copy) @@ -384,6 +384,9 @@ FUNC:_ZNKSt14error_category10equivalentERKSt10error_codei@@GLIBCXX_3.4.11 FUNC:_ZNKSt14error_category10equivalentEiRKSt15error_condition@@GLIBCXX_3.4.11 FUNC:_ZNKSt14error_category23default_error_conditionEi@@GLIBCXX_3.4.11 +FUNC:_ZNKSt15__exception_ptr13exception_ptr20__cxa_exception_typeEv@@CXXABI_1.3.3 +FUNC:_ZNKSt15__exception_ptr13exception_ptrcvMS0_FvvEEv@@CXXABI_1.3.3 +FUNC:_ZNKSt15__exception_ptr13exception_ptrntEv@@CXXABI_1.3.3 FUNC:_ZNKSt15basic_streambufIcSt11char_traitsIcEE4gptrEv@@GLIBCXX_3.4 FUNC:_ZNKSt15basic_streambufIcSt11char_traitsIcEE4pptrEv@@GLIBCXX_3.4 FUNC:_ZNKSt15basic_streambufIcSt11char_traitsIcEE5ebackEv@@GLIBCXX_3.4 @@ -1207,6 +1210,7 @@ FUNC:_ZNSt11range_errorD1Ev@@GLIBCXX_3.4 FUNC:_ZNSt11range_errorD2Ev@@GLIBCXX_3.4.15 FUNC:_ZNSt11regex_errorC1ENSt15regex_constants10error_typeE@@GLIBCXX_3.4.20 +FUNC:_ZNSt11regex_errorC2ENSt15regex_constants10error_typeE@@GLIBCXX_3.4.21 FUNC:_ZNSt11regex_errorD0Ev@@GLIBCXX_3.4.15 FUNC:_ZNSt11regex_errorD1Ev@@GLIBCXX_3.4.15 FUNC:_ZNSt11regex_errorD2Ev@@GLIBCXX_3.4.15 @@ -1291,6 +1295,17 @@ FUNC:_ZNSt12system_errorD0Ev@@GLIBCXX_3.4.11 FUNC:_ZNSt12system_errorD1Ev@@GLIBCXX_3.4.11 FUNC:_ZNSt12system_errorD2Ev@@GLIBCXX_3.4.11 +FUNC:_ZNSt13__future_base11_State_baseD0Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base11_State_baseD1Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base11_State_baseD2Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base12_Result_baseC1Ev@@GLIBCXX_3.4.15 
+FUNC:_ZNSt13__future_base12_Result_baseC2Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base12_Result_baseD0Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base12_Result_baseD1Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base12_Result_baseD2Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt13__future_base19_Async_state_commonD0Ev@@GLIBCXX_3.4.17 +FUNC:_ZNSt13__future_base19_Async_state_commonD1Ev@@GLIBCXX_3.4.17 +FUNC:_ZNSt13__future_base19_Async_state_commonD2Ev@@GLIBCXX_3.4.17 FUNC:_ZNSt13bad_exceptionD0Ev@@GLIBCXX_3.4 FUNC:_ZNSt13bad_exceptionD1Ev@@GLIBCXX_3.4 FUNC:_ZNSt13bad_exceptionD2Ev@@GLIBCXX_3.4 @@ -1586,6 +1601,18 @@ FUNC:_ZNSt15_List_node_base7reverseEv@@GLIBCXX_3.4 FUNC:_ZNSt15_List_node_base8transferEPS_S0_@@GLIBCXX_3.4 FUNC:_ZNSt15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14 +FUNC:_ZNSt15__exception_ptr13exception_ptr4swapERS0_@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC1EMS0_FvvE@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC1ERKS0_@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC1Ev@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC2EMS0_FvvE@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC2ERKS0_@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC2Ev@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrD1Ev@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrD2Ev@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptraSERKS0_@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptreqERKNS_13exception_ptrES2_@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptrneERKNS_13exception_ptrES2_@@CXXABI_1.3.3 FUNC:_ZNSt15basic_streambufIcSt11char_traitsIcEE10pubseekoffExSt12_Ios_SeekdirSt13_Ios_Openmode@@GLIBCXX_3.4 FUNC:_ZNSt15basic_streambufIcSt11char_traitsIcEE10pubseekposESt4fposI11__mbstate_tESt13_Ios_Openmode@@GLIBCXX_3.4 FUNC:_ZNSt15basic_streambufIcSt11char_traitsIcEE12__safe_gbumpEi@@GLIBCXX_3.4.16 @@ -1769,6 +1796,9 @@ FUNC:_ZNSt16invalid_argumentD0Ev@@GLIBCXX_3.4 FUNC:_ZNSt16invalid_argumentD1Ev@@GLIBCXX_3.4 
FUNC:_ZNSt16invalid_argumentD2Ev@@GLIBCXX_3.4.15 +FUNC:_ZNSt16nested_exceptionD0Ev@@CXXABI_1.3.5 +FUNC:_ZNSt16nested_exceptionD1Ev@@CXXABI_1.3.5 +FUNC:_ZNSt16nested_exceptionD2Ev@@CXXABI_1.3.5 FUNC:_ZNSt17__timepunct_cacheIcEC1Ej@@GLIBCXX_3.4 FUNC:_ZNSt17__timepunct_cacheIcEC2Ej@@GLIBCXX_3.4 FUNC:_ZNSt17__timepunct_cacheIcED0Ev@@GLIBCXX_3.4 @@ -1963,6 +1993,7 @@ FUNC:_ZNSt6localeD2Ev@@GLIBCXX_3.4 FUNC:_ZNSt6localeaSERKS_@@GLIBCXX_3.4 FUNC:_ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEE@@GLIBCXX_3.4.11 +FUNC:_ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEEPFvvE@@GLIBCXX_3.4.21 FUNC:_ZNSt6thread20hardware_concurrencyEv@@GLIBCXX_3.4.17 FUNC:_ZNSt6thread4joinEv@@GLIBCXX_3.4.11 FUNC:_ZNSt6thread6detachEv@@GLIBCXX_3.4.11 @@ -2214,6 +2245,8 @@ FUNC:_ZSt17__copy_streambufsIwSt11char_traitsIwEEiPSt15basic_streambufIT_T0_ES6_@@GLIBCXX_3.4.6
[committed] Don't request function descriptors when generating fast indirect calls on hppa
The attached change fixes a rather old regression. The code that is generated with fast indirect calls assumes that a function pointer points directly at the function being called, not a procedure descriptor.

Tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11. Committed to trunk, 4.9 and 4.8.

Dave
--
John David Anglin dave.ang...@bell.net

2014-08-30 John David Anglin dang...@gcc.gnu.org

	* config/pa/pa.c (pa_assemble_integer): Don't add PLABEL relocation prefix to function labels when generating fast indirect calls.

Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c (revision 214400)
+++ config/pa/pa.c (working copy)
@@ -3217,7 +3217,12 @@
       && aligned_p
       && function_label_operand (x, VOIDmode))
     {
-      fputs (size == 8? "\t.dword\tP%" : "\t.word\tP%", asm_out_file);
+      fputs (size == 8? "\t.dword\t" : "\t.word\t", asm_out_file);
+
+      /* We don't want an OPD when generating fast indirect calls.  */
+      if (!TARGET_FAST_INDIRECT_CALLS)
+	fputs ("P%", asm_out_file);
+
       output_addr_const (asm_out_file, x);
       fputc ('\n', asm_out_file);
       return true;
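The effect of the change on the emitted directive can be modelled in isolation. This is a sketch only: `function_label_directive` is a hypothetical helper that builds a string instead of writing to asm_out_file, and the symbol is passed as plain text rather than printed through output_addr_const.

```cpp
#include <string>

// Builds the directive text for a function-label word: the P% (PLABEL)
// prefix is added only when fast indirect calls are not in use.
std::string function_label_directive(int size, bool fast_indirect_calls,
                                     const std::string &symbol) {
    std::string out = (size == 8) ? "\t.dword\t" : "\t.word\t";
    if (!fast_indirect_calls)
        out += "P%";
    out += symbol;
    out += '\n';
    return out;
}
```

With fast indirect calls enabled the word holds the plain function address, which matches the assumption stated in the message that the pointer refers directly to the function being called.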
Re: [Patch, Fortran] CAF dep (2/3): Move code around, prepare for more locking support
Dear Tobias, Obviously there is no problem with this - OK for trunk Cheers Paul On 27 August 2014 23:37, Tobias Burnus bur...@net-b.de wrote: I claim that it is part 2 of 3 of the CAF dep series, but the patch has nothing to do with it, except that it is in the way. Technically, it just moves code from trans-intrinsic.c to trans-expr.c and makes it available. Additionally, I support the case offset == NULL_TREE, which is supposed to be used with lock variables, where we know that the coarray offset is always zero. That's used by my incomplete local lock patch. Build and regtested (as part of the series) on x86-64-gnu-linux. OK for the trunk? Tobias -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
Re: [PING^3] Re: [PATCH 1/2] Add -B support to gcc-ar/ranlib/nm
Hi Richard,

On Thu, Aug 28, 2014 at 10:18:22AM +0200, Richard Biener wrote:
> This also matches joined -B/foo
>
> +{
> +  const char *arg = av[i] + 2;
> +  const char *end;
> +
> +  memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i));
> +  ac--;
> +  if (*arg == 0)
> +    {
> +      arg = av[i + 1];
> +      if (!arg)
> +        {
>
> But this doesn't handle it? common.opt has -B as Joined Separate option thus allowing both.

I believe it handles both cases. For the joined case (*arg == 0) is false and the earlier (arg = av[i] + 2) assignment is used.

> +          fprintf (stderr, "Usage: gcc-ar [-B prefix] ar arguments ...\n");
> +          exit (EXIT_FAILURE);
> +        }
> +      memmove (av + i, av + i + 1, sizeof (char *) * ((ac + 1) - i));
> +      ac--;
> +      i++;
> +    }
> +
> +  for (end = arg; *end; end++)
> +    ;
> +  end--;
> +  if (end > arg && *end != '/')
> +    {
> +      char *newarg = (char *)xmalloc (strlen(arg) + 2);
> +
> +      strcpy (newarg, arg);
> +      strcat (newarg, "/");
> +      arg = newarg;
> +    }
>
> Why the above? And why open-coded instead of using strlen?

I assume you mean the for loop. I always had strange errors later if the paths were not ending with /, so I'm force adding it.

> +
> +  add_prefix (&path, arg);
> +  add_prefix (&target_path, arg);
>
> This adds the -B path to the _end_ of the prefix list. Does that match gcc driver behavior? The gcc driver uses PREFIX_PRIORITY_B_OPT as argument to add_prefix which ends up adding -B prefixes to the beginning of the prefix list.

Ok.

-andi
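The two accepted spellings (joined -B/foo and separate -B /foo) and the forced trailing slash can be sketched with standard containers. This is an illustration only, not the gcc-ar code: `take_b_prefixes` is a hypothetical helper, it works on std::vector and std::string instead of argv with memmove, and it reports a dangling -B through a flag instead of printing usage and exiting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Removes every -B option from args and returns the prefixes, each
// ending in '/'. Sets ok to false if a trailing "-B" has no value.
std::vector<std::string> take_b_prefixes(std::vector<std::string> &args, bool &ok) {
    std::vector<std::string> prefixes;
    ok = true;
    for (std::size_t i = 0; i < args.size();) {
        if (args[i].compare(0, 2, "-B") != 0) {
            ++i;
            continue;
        }
        std::string value = args[i].substr(2);  // joined form: -B/foo
        args.erase(args.begin() + i);
        if (value.empty()) {                     // separate form: -B /foo
            if (i >= args.size()) {
                ok = false;
                return prefixes;
            }
            value = args[i];
            args.erase(args.begin() + i);
        }
        // Always end the prefix with a directory separator.
        if (!value.empty() && value.back() != '/')
            value += '/';
        prefixes.push_back(value);
    }
    return prefixes;
}
```

The joined case never enters the `value.empty()` branch, which is the same reasoning given in the reply for why the patch handles both forms.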
[RFC/PATCH] More precise diagnostic locations: dynamic locations for columns vs explicit offset
In some situations, we would like to point to a location which was not encoded when tokenizing. This happens, for example, in two prominent cases:

1) To get precise locations within strings (https://gcc.gnu.org/PR52952), for example, for Wformat warnings.

2) In the Fortran FE, which gives quite precise location information by tracking the characters that it wants to warn about instead of relying on the line-map machinery.

The most straightforward way to implement this is by adding variants of diagnostic functions that take an explicit offset argument and pass this offset through the whole diagnostics machinery. This is what I implemented in the patch format_offset.diff attached. The downside is that we would need to add even more variants (with/without offset) of various diagnostic functions and track the offset/no-offset cases explicitly.

The nicer/cleaner alternative is to somehow (re)compute a single location value from a given location plus the new offset. This is what I implemented in patch fortran-diagnostics-part3.diff in linemap_redo_position_for_column(). As far as I understand, this method only works reliably if the location+offset does not jump to a different line map, that is, if to_column < (1u << map->d.ordinary.column_bits). Otherwise, we may need to recompute all successive line-maps to accommodate the new location. The best way to do the latter (or to work around that issue) is not clear to me at the moment.

Thus, I am putting forward these two alternative implementations and seeking comments/advice/help in deciding what would be the best way to fix this key missing piece of GCC diagnostics.

Related to this, perhaps I should make a more general call for help. Despite the heroic, constant torrent of diagnostic fixes by Paolo, Marek and others, I have not seen much progress on the key infrastructure issues in the roadmap (https://gcc.gnu.org/wiki/Better_Diagnostics).
We have had at least one major item per release since GCC 4.5, but I don't see any particular item being tackled for GCC 5.0. Are you planning to tackle any of them? I have a simple patch to implement Fix-it hints but it needs more work. Unfortunately, I have very little free time to dedicate to GCC nowadays, so I'm afraid I might not even be able to finish this in time. Any item in that list would be a nice major feature for GCC 5.0. Perhaps we need to ask for help in gcc/gcc-help or some other forum.

Cheers,

Manuel.

Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c (revision 197155)
+++ gcc/c-family/c-format.c (working copy)
@@ -377,10 +377,11 @@ typedef struct format_wanted_type
   int format_length;
   /* The actual parameter to check against the wanted type. */
   tree param;
   /* The argument number of that parameter. */
   int arg_num;
+  int offset_loc;
   /* The next type to check for this format conversion, or NULL if none. */
   struct format_wanted_type *next;
 } format_wanted_type;
 
 /* Convenience macro for format_length_info meaning unused. */
@@ -903,10 +904,11 @@ typedef struct
   /* Number of leaves of the format argument that were unterminated
      strings. */
   int number_unterminated;
   /* Number of leaves of the format argument that were not counted above. */
   int number_other;
+  location_t loc;
 } format_check_results;
 
 typedef struct
 {
   format_check_results *res;
@@ -996,11 +998,11 @@ check_function_format (tree attrs, int n
     {
       if (is_attribute_p ("format", TREE_PURPOSE (a)))
         {
           /* Yup; check it. */
           function_format_info info;
-          decode_format_attr (TREE_VALUE (a), &info, 1);
+          decode_format_attr (TREE_VALUE (a), &info, /*validated=*/true);
           if (warn_format)
             {
               /* FIXME: Rewrite all the internal functions in this file
                  to use the ARGARRAY directly instead of constructing this
                  temporary list. */
@@ -1465,10 +1467,11 @@ check_format_arg (void *ctx, tree format
   if (TREE_CODE (format_tree) != ADDR_EXPR)
     {
       res->number_non_literal++;
       return;
     }
+  res->loc = EXPR_LOCATION(format_tree);
   format_tree = TREE_OPERAND (format_tree, 0);
   if (format_types[info->format_type].flags
       & (int) FMT_FLAG_PARSE_ARG_CONVERT_EXTERNAL)
     {
       bool objc_str = (info->format_type == gcc_objc_string_format_type);
@@ -1637,11 +1640,13 @@ check_format_info_main (format_check_res
       if (*format_chars++ != '%')
         continue;
       if (*format_chars == 0)
         {
-          warning (OPT_Wformat_, "spurious trailing %<%%%> in format");
+          warning_at (res->loc ? res->loc : input_location,
+                      format_chars - orig_format_chars,
+                      OPT_Wformat_, "spurious trailing %<%%%> in format");
           continue;
         }
       if (*format_chars == '%')
         {
           ++format_chars;
@@ -2449,10 +2454,11 @@ format_type_warning
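The column-bits constraint mentioned in the message above can be illustrated with a small sketch. It assumes a much simplified model of an ordinary line map, in which a location is the map's start location plus the line delta shifted by column_bits plus the column. SimpleLineMap, encode_location and offset_location are hypothetical names for illustration only; they are not libcpp's data structures or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for an ordinary line map (not libcpp's
// real struct): a location packs the line delta above `column_bits` bits
// that hold the column.
struct SimpleLineMap {
    std::uint32_t start_location;  // location of (to_line, column 0)
    std::uint32_t to_line;         // first source line covered by this map
    unsigned column_bits;          // number of low bits reserved for the column
};

// Encode (line, column) as a single location. Fails if the column does not
// fit in the bits the map reserved for it.
bool encode_location(const SimpleLineMap& map, std::uint32_t line,
                     std::uint32_t column, std::uint32_t* out) {
    assert(map.column_bits < 32 && line >= map.to_line);
    if (column >= (1u << map.column_bits))
        return false;
    *out = map.start_location
           + ((line - map.to_line) << map.column_bits) + column;
    return true;
}

// Derive a new location `offset` columns to the right of `loc`. This only
// works while the new column still fits in column_bits; otherwise the sum
// would alias a column on a later line, so failure is reported instead.
bool offset_location(const SimpleLineMap& map, std::uint32_t loc,
                     std::uint32_t offset, std::uint32_t* out) {
    assert(map.column_bits < 32 && loc >= map.start_location);
    const std::uint32_t mask = (1u << map.column_bits) - 1;
    const std::uint64_t column = (loc - map.start_location) & mask;
    if (column + offset > mask)
        return false;
    *out = loc + offset;
    return true;
}
```

In this model, adding an offset is a plain integer addition as long as the check passes. When it fails, the caller would have to fall back to something else, such as carrying the offset explicitly or rebuilding the maps, which is the open question raised in the message.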
Re: [patch] No allocation for empty unordered containers
Any news on my patch proposals?

Regarding documentation of the default minimum number of buckets, I don't know where it has been documented, but why do we need to document it separately? Could it be taken care of by Doxygen? Can't it get the default value from the code itself? If not, we could document it ourselves next to the code rather than in a distinct file.

François

On 14/08/2014 21:22, François Dumont wrote:

On 13/08/2014 11:50, Jonathan Wakely wrote:

Yes you can, it's conforming to replace a (non-virtual) member function with default arguments by two or more member functions. We do it all the time. See 17.6.5.5 [member.functions] p2.

You should have said so sooner! But of course no-one is supposed to ignore the Standard :-).

Then here is the patch to introduce a default constructor with compiler-computed noexcept qualification. Note that I also made the allocator-aware default constructor allocation free; however, its noexcept qualification has to be written manually, which I find quite a burden. Do you think we shall do so now?

2014-08-14 François Dumont fdum...@gcc.gnu.org

* include/bits/hashtable_policy.h (_Prime_rehash_policy): Qualify
constructor noexcept.
(_Hash_code_base): All specialization default constructible if possible.
(_Hashtable_base): Likewise.
* include/bits/hashtable.h (_Hashtable()): Implementation defaulted.
* include/bits/unordered_map.h (unordered_map::unordered_map()): New,
implementation defaulted.
(unordered_multimap::unordered_multimap()): Likewise.
* include/bits/unordered_set.h (unordered_set::unordered_set()): Likewise.
(unordered_multiset::unordered_multiset()): Likewise.
* include/debug/unordered_map: Likewise.
* include/debug/unordered_set: Likewise.
* testsuite/23_containers/unordered_map/allocator/noexcept.cc
(test04()): New.
* testsuite/23_containers/unordered_multimap/allocator/noexcept.cc
(test04()): New.
* testsuite/23_containers/unordered_set/allocator/noexcept.cc
(test04()): New.
* testsuite/23_containers/unordered_multiset/allocator/noexcept.cc
(test04()): New.

I am preparing a patch for profile mode, so I will submit the modifications for this mode with that big patch.

Tested under Linux x86_64.

Ok to commit?

François
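The idea discussed above, replacing a constructor that has default arguments with a separate defaulted default constructor so that the compiler computes the noexcept qualification, can be sketched as follows. TinyTable, QuietHash and NoisyHash are hypothetical names for illustration; this is not the libstdc++ _Hashtable code from the patch.

```cpp
#include <cstddef>
#include <type_traits>

// All names here are hypothetical; this is not libstdc++'s _Hashtable.
template <typename Hash>
class TinyTable {
public:
    // Defaulted: the compiler derives the exception specification from
    // the default constructors of the members (here, Hash).
    TinyTable() = default;

    // Replaces a former single constructor of the form
    // "TinyTable(std::size_t n = 10, const Hash& h = Hash())".
    explicit TinyTable(std::size_t bucket_hint, const Hash& hash = Hash())
        : hash_(hash), bucket_hint_(bucket_hint) {}

    std::size_t bucket_hint() const noexcept { return bucket_hint_; }

private:
    Hash hash_;
    std::size_t bucket_hint_ = 1;  // placeholder state; this sketch never allocates
};

// Implicit default constructor, so default construction cannot throw.
struct QuietHash {
    std::size_t operator()(int v) const noexcept {
        return static_cast<std::size_t>(v);
    }
};

// Default constructor explicitly declared as potentially throwing.
struct NoisyHash {
    NoisyHash() noexcept(false) {}
    std::size_t operator()(int v) const noexcept {
        return static_cast<std::size_t>(v);
    }
};

// The defaulted constructor is noexcept exactly when Hash allows it.
static_assert(std::is_nothrow_default_constructible<TinyTable<QuietHash>>::value,
              "defaulted constructor should be noexcept for QuietHash");
static_assert(!std::is_nothrow_default_constructible<TinyTable<NoisyHash>>::value,
              "noexcept must follow the Hash default constructor");
```

With a user-provided constructor the same result would need a hand-written noexcept(...) expression listing every member, which is the burden mentioned for the allocator-aware constructor.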
Re: [PATCH] Add support for GNU/Hurd in gnat-4.9
On 20.08.2014 at 22:12, Svante Signell wrote:

On Wed, 2014-05-21 at 10:48 +0200, Samuel Thibault wrote:

Svante Signell, on Wed 21 May 2014 10:44:54 +0200, wrote:

On Wed, 2014-05-21 at 10:33 +0200, Arnaud Charlet wrote:

I think the majority of work has been done. Now that patch will change slightly for every missing feature added to Hurd.

Then it's all good, it's a matter of what I said above. Don't forget also the part where general changes are done in GNAT which require updates to target-specific files: these typically require someone to regularly test each port to detect any missing update, and report/fix them, even if GNU/Hurd hasn't changed itself.

With the help of the debian-ada people, especially Ludovic Brenta, gnat has been running on GNU/Hurd in Debian since gcc/gnat-4.6.3.

Then it's all good. Just say that you wish to continue maintaining things like this, and upstream will be happy.

Attached is the updated ada-hurd.diff patch for GNU/Hurd. It builds fine with gnat-4.9.1-1 and gcc-4.9.1-7. This patch now has the same number of files as the kFreeBSD patch. Hopefully it can be material for upstream, but at least in Debian the Hurd port would be on par with kFreeBSD. Regarding remaining code commented out or irrelevant comments in the new file s-osinte-gnu.ads, please help me to iron out the left-overs.

The patch is at least missing the ChangeLog entry. Also, it is only tested using the Debian package based on the 4.9 branch, which includes a bunch of local ada patches that have not been forwarded upstream for years. Please prepare and test your patch with current trunk for upstream submission.

Ludovic, can you consider using this file as ada-hurd.diff for the next upload of Debian gnat-4.9? For clarity: I wish to continue to maintain the Ada port for Hurd, with the help of the Debian Ada and Hurd people, with or without being imported upstream.

I disagree. Debian's current local ada patches are a mess, and no effort is made to clean these up and forward them upstream. If the Debian Ada people can't do this, please do it yourself.

Matthias
Re: [Patch, Fortran] CAF dep (3/3): coarrays - pass may_require_tmp information for CAF_get/send/sendget to the library
Dear Tobias,

This looks fine to me - OK for trunk. Thanks for this massive effort!

Paul

On 28 August 2014 08:13, Tobias Burnus bur...@net-b.de wrote:

This patch is based on 1/2 and 2/2 of the series. When the patch is approved, OpenCoarrays needs to be adapted; however, as surplus arguments of the callee are ignored, no immediate action is required. (And some delay avoids issues with compilers being older than the library.)

The issue comes up in the context of having a coarray access on the same image, e.g. a[this_image()] = a, where alias questions play a role. While one can leave the general handling to the library - such as switching to memmove in case of local memory access - this patch tries to help the library to decide whether it has to create a temporary variable or not. For that reason, it passes a may_require_temporary argument to the library.

may_require_temporary is false if the source and target variables are disjoint, or if they overlap such that walking them in element order will not require a temporary (special case: identical). If the compiler cannot tell at compile time, the value is always true.

Of course, if the memory access is for a different image than the current image (or for sendget: when the two image indexes are for different images), the library can ignore the argument may_require_temporary and use the normal remote memory access.

Built and regtested on x86-64-gnu-linux.

OK for the trunk?
Tobias

PS: I imagine code like the following in the library:

if (image_index == this_image)
  {
    if (contiguous LHS and RHS)
      use memmove  // With special case: LHS and RHS identical
    else if (!may_require_temporary)
      for-loop assigning LHS = RHS in element order
    else
      {
        tmp = malloc()
        if (RHS contiguous or scalar)
          tmp = memcpy(RHS)
        else
          for loop assigning RHS to tmp
        if (LHS contiguous)
          LHS = memcpy(tmp)
        else
          for loop assigning tmp to LHS
      }
  }
else
  {
    do normal remote-image assignment
  }

--
The knack of flying is learning how to throw yourself at the ground and miss.
  --Hitchhikers Guide to the Galaxy
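The same-image branch of the pseudocode in the PS above can be sketched in C++ as follows. This is only an illustration under simplifying assumptions: one-dimensional arrays of double described by a base pointer and an element stride. local_assign and its parameters are hypothetical and are not the OpenCoarrays interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not the OpenCoarrays API): assign n elements from a
// strided source to a strided destination that live on the same image.
// Strides are in elements and may be negative.
void local_assign(double* dst, std::ptrdiff_t dst_stride,
                  const double* src, std::ptrdiff_t src_stride,
                  std::size_t n, bool may_require_temporary) {
    assert(n == 0 || (dst != nullptr && src != nullptr));
    const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);

    // Both sides contiguous: memmove copes with any overlap by itself.
    if (dst_stride == 1 && src_stride == 1) {
        std::memmove(dst, src, n * sizeof(double));
        return;
    }

    // The compiler proved that element-order copying is safe.
    if (!may_require_temporary) {
        for (std::ptrdiff_t i = 0; i < count; ++i)
            dst[i * dst_stride] = src[i * src_stride];
        return;
    }

    // Possible harmful overlap: gather into a temporary, then scatter.
    std::vector<double> tmp(n);
    for (std::ptrdiff_t i = 0; i < count; ++i)
        tmp[static_cast<std::size_t>(i)] = src[i * src_stride];
    for (std::ptrdiff_t i = 0; i < count; ++i)
        dst[i * dst_stride] = tmp[static_cast<std::size_t>(i)];
}
```

A reversal in place, such as assigning a(4:1:-1) to a(1:4), is a case where the temporary matters: copying in element order without it would read elements that have already been overwritten.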
[Committed] Add testcase for some miscompile in older versions of GCC
Hi,

In some versions of GCC with AARCH64 backported, I got a miscompile of a shift that involved a load which had a post increment of the address. This adds the testcase I created for that case.

Committed after a quick test on x86_64-linux-gnu of the testcase. I had meant to commit this two days ago, which is why the testcase is dated two days ago.

Thanks,
Andrew Pinski

ChangeLog:
* gcc.c-torture/execute/20140828-1.c: New testcase.

Index: gcc.c-torture/execute/20140828-1.c
===
--- gcc.c-torture/execute/20140828-1.c (revision 0)
+++ gcc.c-torture/execute/20140828-1.c (revision 0)
@@ -0,0 +1,22 @@
+short *f(short *a, int b, int *d) __attribute__((noinline,noclone));
+
+short *f(short *a, int b, int *d)
+{
+  short c = *a;
+  a++;
+  c = b << c;
+  *d = c;
+  return a;
+}
+
+int main(void)
+{
+  int d;
+  short a[2];
+  a[0] = 0;
+  if (f(a, 1, &d) != &a[1])
+    __builtin_abort ();
+  if (d != 1)
+    __builtin_abort ();
+  return 0;
+}
Re: [PATCH] GCC/test: Disable loop-19.c for classic FPU Power
On Fri, Aug 29, 2014 at 10:46 PM, Maciej W. Rozycki ma...@codesourcery.com wrote:

Hi,

The loop-19.c test case has regressed from 4.8 to 4.9 and trunk on classic FPU Power targets; these failures are now seen:

FAIL: gcc.dg/tree-ssa/loop-19.c scan-tree-dump-times optimized "MEM.(base: &|symbol: )a," 2
FAIL: gcc.dg/tree-ssa/loop-19.c scan-tree-dump-times optimized "MEM.(base: &|symbol: )c," 2

However, upon inspection of the generated code it is obvious that its quality has improved: the autoincrement rather than indexed addressing mode is now used in the loop produced, reducing the number of instructions in the loop from 4 to 3 and also removing another instruction from outside the loop, i.e. (new code):

        .globl  tuned_STREAM_Copy
        .type   tuned_STREAM_Copy, @function
tuned_STREAM_Copy:
        lis 8,0x1e
        lis 10,a-8@ha
        ori 8,8,33920
        lis 9,c-8@ha
        mtctr 8
        la 10,a-8@l(10)
        la 9,c-8@l(9)
.L2:
        lfdu 0,8(10)
        stfdu 0,8(9)
        bdnz .L2
        blr
        .size   tuned_STREAM_Copy, .-tuned_STREAM_Copy

vs (old code):

        .globl  tuned_STREAM_Copy
        .type   tuned_STREAM_Copy, @function
tuned_STREAM_Copy:
        lis 7,0x1e
        ori 7,7,33920
        mtctr 7
        lis 8,c@ha
        lis 10,a@ha
        li 9,0
        la 8,c@l(8)
        la 10,a@l(10)
.L3:
        lfdx 0,10,9
        stfdx 0,8,9
        addi 9,9,8
        bdnz .L3
        blr
        .size   tuned_STREAM_Copy,.-tuned_STREAM_Copy

The only Power targets that still pass this test are e500v2 ones such as `-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe' that use the SPE unit for FP operations, because the indexed mode is still used (there's no autoincrement addressing mode available for the memory access instructions concerned):

        .globl  tuned_STREAM_Copy
        .type   tuned_STREAM_Copy, @function
tuned_STREAM_Copy:
        lis 10,0x1e
        lis 7,c@ha
        lis 8,a@ha
        ori 10,10,0x8480
        li 9,0
        la 7,c@l(7)
        la 8,a@l(8)
        mtctr 10
.L2:
        evlddx 10,8,9
        evstddx 10,7,9
        addi 9,9,8
        bdnz .L2
        blr
        .size   tuned_STREAM_Copy,.-tuned_STREAM_Copy

[I have removed -fno-common from the current test flags for the purpose of this consideration to compare apples to apples; 4.8 didn't have it. The presence or absence of this flag does not appear to make a difference for this test case for Power targets.]

The obvious reason for the failure is the offset of -8 now seen in new classic FP code for preinitialising the pointers before entering the loop. The initial offset is needed so that it is cancelled by the offset of 8 used in the loop itself to autoincrement these pointers. So the new code not only is better, but it actually has to use these offsets as well or autoincrementation would not work.

Therefore I think at this point the test case is invalid for classic FP Power, so I propose that we exclude it from testing here, only leaving SPE FP Power for whatever value the test case may have for it, and especially x86 variants where there's an actual code size penalty for using an immediate offset (displacement) in addition to a base register.

For the record here are the optimization dumps examined by the test case, for the old generated code that passes:

;; Function tuned_STREAM_Copy (tuned_STREAM_Copy, funcdef_no=0, decl_uid=1382, cgraph_uid=0)

tuned_STREAM_Copy ()
{
  sizetype ivtmp.10;
  double _4;

  <bb 2>:

  <bb 3>:
  # ivtmp.10_8 = PHI <ivtmp.10_2(4), 0(2)>
  _4 = MEM[symbol: a, index: ivtmp.10_8, offset: 0B];
  MEM[symbol: c, index: ivtmp.10_8, offset: 0B] = _4;
  ivtmp.10_2 = ivtmp.10_8 + 8;
  if (ivtmp.10_2 != 1600)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  goto <bb 3>;

  <bb 5>:
  return;

}

and for the new code that fails:

;; Function tuned_STREAM_Copy (tuned_STREAM_Copy, funcdef_no=0, decl_uid=2191, symbol_order=2)

Removing basic block 5
tuned_STREAM_Copy ()
{
  unsigned int ivtmp.13;
  unsigned int ivtmp.9;
  double _4;
  void * _15;
  void * _16;
  unsigned int _17;

  <bb 2>:
  ivtmp.9_11 = (unsigned int) &MEM[(void *)&a + 4294967288B];
  ivtmp.13_14 = (unsigned int) &MEM[(void *)&c + 4294967288B];
  _17 = (unsigned int) &MEM[(void *)&a + 1592B];

  <bb 3>:
  # ivtmp.9_8 = PHI <ivtmp.9_2(3), ivtmp.9_11(2)>
  # ivtmp.13_12 = PHI <ivtmp.13_13(3), ivtmp.13_14(2)>
  ivtmp.9_2 = ivtmp.9_8 + 8;
  _15 = (void *) ivtmp.9_2;
  _4 = MEM[base: _15, offset: 0B];
  ivtmp.13_13 = ivtmp.13_12 + 8;
  _16 = (void *) ivtmp.13_13;
  MEM[base: _16, offset: 0B] = _4;
  if (ivtmp.9_2 != _17)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 4>:
  return;

}

Tested with the following powerpc-gnu-linux multilibs with the respective results noted on the right:

-mcpu=603e UNSUPPORTED
-mcpu=603e -msoft-float
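The role of the initial -8 bias described in this message can be sketched in C++. The first function mirrors the indexed form of the loop and the second mirrors the pre-increment form, keeping the addresses as integers much like the ivtmp variables in the failing dump. Both function names are hypothetical, and the round trip from integer to pointer is implementation-defined, so this is an illustration of the addressing pattern rather than portable code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Indexed form: one shared index, base addresses stay unbiased.
void copy_indexed(double* c, const double* a, std::size_t n) {
    for (std::size_t i = 0; i != n; ++i)
        c[i] = a[i];
}

// Pre-increment form: each address is bumped by 8 *before* the access, so
// both must start one element (8 bytes) below the arrays for the first
// access to hit element 0. Addresses are kept as integers, as in the dump.
void copy_preincrement(double* c, const double* a, std::size_t n) {
    assert(n > 0 && sizeof(double) == 8);
    std::uintptr_t src = reinterpret_cast<std::uintptr_t>(a) - sizeof(double);
    std::uintptr_t dst = reinterpret_cast<std::uintptr_t>(c) - sizeof(double);
    // Address of the last element, used as the loop exit test.
    const std::uintptr_t last =
        reinterpret_cast<std::uintptr_t>(a) + (n - 1) * sizeof(double);
    do {
        src += sizeof(double);  // corresponds to the update in lfdu
        dst += sizeof(double);  // corresponds to the update in stfdu
        *reinterpret_cast<double*>(dst) = *reinterpret_cast<const double*>(src);
    } while (src != last);
}
```

Dropping the bias in the second function would make the first access land on element 1, which is why the new code cannot avoid the offset that the test pattern rejects.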
Re: [PATCH, rs6000] A few more vector builtins
On Fri, Aug 29, 2014 at 2:28 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

Hi,

This is the last in the current series of new vector built-ins. This group adds vec_ctf, vec_cts, and vec_ctu for vector double and vector long long. Additionally, it adds documentation for the built-ins added in my last patch, since I forgot to add it then... /oops

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Ok for trunk?

Thanks,
Bill

[gcc]

2014-08-29 Bill Schmidt wschm...@linux.vnet.ibm.com

* config/rs6000/rs6000-builtin.def (XVCVSXDDP_SCALE): New built-in
definition.
(XVCVUXDDP_SCALE): Likewise.
(XVCVDPSXDS_SCALE): Likewise.
(XVCVDPUXDS_SCALE): Likewise.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add entries
for VSX_BUILTIN_XVCVSXDDP_SCALE, VSX_BUILTIN_XVCVUXDDP_SCALE,
VSX_BUILTIN_XVCVDPSXDS_SCALE, and VSX_BUILTIN_XVCVDPUXDS_SCALE.
* config/rs6000/rs6000-protos.h (rs6000_scale_v2df): New prototype.
* config/rs6000/rs6000.c (real.h): New include.
(rs6000_scale_v2df): New function.
* config/rs6000/vsx.md (UNSPEC_VSX_XVCVSXDDP): New unspec.
(UNSPEC_VSX_XVCVUXDDP): Likewise.
(UNSPEC_VSX_XVCVDPSXDS): Likewise.
(UNSPEC_VSX_XVCVDPUXDS): Likewise.
(vsx_xvcvsxddp_scale): New define_expand.
(vsx_xvcvsxddp): New define_insn.
(vsx_xvcvuxddp_scale): New define_expand.
(vsx_xvcvuxddp): New define_insn.
(vsx_xvcvdpsxds_scale): New define_expand.
(vsx_xvcvdpsxds): New define_insn.
(vsx_xvcvdpuxds_scale): New define_expand.
(vsx_xvcvdpuxds): New define_insn.
* doc/extend.texi (vec_ctf): Add new prototypes.
(vec_cts): Likewise.
(vec_ctu): Likewise.
(vec_splat): Likewise.
(vec_div): Likewise.
(vec_mul): Likewise.

[gcc/testsuite]

2014-08-29 Bill Schmidt wschm...@linux.vnet.ibm.com

* gcc.target/powerpc/builtins-1.c: Add tests for vec_ctf, vec_cts,
and vec_ctu.
* gcc.target/powerpc/builtins-2.c: Likewise.

Okay.

Thanks,
David