[PATCH] luoxhu - backport from trunk r255555, r257253 and r258137
From: Xiong Hu Luo

This is a backport of r25, r257253 and r258137 of trunk to gcc-7-branch. The patches were already on trunk before GCC 8 forked. In total, 5 files need manual resolution due to code changes for r25. r257253 and r258137 also need to be merged to avoid regressions, because the dependent testcases require VSX support.

The discussion for the patch r25 that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00394.html

VSX support for patch r257253 and r258137:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02391.html
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01506.html

gcc/ChangeLog:

2019-01-14  Luo Xiong Hu

	Backport from trunk.

	Manually resolve 3 files:
	* config/rs6000/altivec.h (vec_extract_fp32_from_shorth, vec_extract_fp32_from_shortl): Resolve new #defines.
	* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_SLD): Resolve new expansions.
	* doc/extend.texi (vec_sld, vec_sll, vec_srl, vec_sro, vec_unpackh, vec_unpackl, test_vsi_packsu_vssi_vssi, vec_packsu, vec_cmpne): Resolve new documentation.

	2017-12-11  Carl Love

	* config/rs6000/altivec.h (vec_extract_fp32_from_shorth, vec_extract_fp32_from_shortl): Add #defines.
	* config/rs6000/rs6000-builtin.def (VSLDOI_2DI): Add macro expansion.
	* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_UNPACKH, ALTIVEC_BUILTIN_VEC_UNPACKL, ALTIVEC_BUILTIN_VEC_AND, ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VEC_SRL, ALTIVEC_BUILTIN_VEC_SRO, ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VEC_SLL): Add expansions.
	* doc/extend.texi: Add documentation for the added builtins.

gcc/testsuite/ChangeLog:

2019-01-14  Luo Xiong Hu

	Backport from trunk r25.

	Manually resolve 2 files:
	* gcc.target/powerpc/builtins-3-p8.c (test_vsi_packs_vusi, test_vsi_packsu-vssi, test_vsi_packsu-vusi, test_vsi_packsu-vsll, test_vsi_packsu-vull, test_vsi_packsu-vsi, test_vsi_packsu-vui): Resolve new cases.
* gcc.target/powerpc/builtins-3.c (test_sll_vsc_vsc_vsuc, test_sll_vuc_vuc, test_sll_vsi_vsi_vuc, test_sll_vui_vui_vuc, test_sll_vbll_vull, test_sll_vbll_vbll_vus, test_sll_vp_vp_vuc, test_sll_vssi_vssi_vuc, test_sll_vusi_vusi_vuc, test_slo_vsc_vsc_vsc, test_slo_vuc_vuc_vsc, test_slo_vsi_vsi_vsc, test_slo_vsi_vsi_vuc, test_slo_vui_vui_vsc, test_slo_vui_vui_vuc, test_slo_vp_vp_vsc, test_slo_vp_vp_vuc, test_slo_vssi_vssi_vsc, test_slo_vssi_vssi_vuc, test_slo_vusi_vusi_vsc, test_slo_vusi_vusi_vuc, test_slo_vusi_vusi_vuc, test_slo_vf_vf_vsc, test_slo_vf_vf_vuc, test_cmpb_float): Resolve new cases. 2017-12-11 Carl Love * gcc.target/powerpc/altivec-7.c: Renamed altivec-7.h. * gcc.target/powerpc/altivec-7.h (main): Add testcases for vec_unpackl. Add dg-final tests for the instructions generated. * gcc.target/powerpc/altivec-7-be.c: New file to test on big endian. * gcc.target/powerpc/altivec-7-le.c: New file to test on little endian. * gcc.target/powerpc/altivec-13.c (foo): Add vec_sld, vec_srl, vec_sro testcases. Add dg-final tests for the instructions generated. * gcc.target/powerpc/builtins-3-p8.c (test_vsi_packs_vui, test_vsi_packs_vsi, test_vsi_packs_vssi, test_vsi_packs_vusi, test_vsi_packsu-vssi, test_vsi_packsu-vusi, test_vsi_packsu-vsll, test_vsi_packsu-vull, test_vsi_packsu-vsi, test_vsi_packsu-vui): Add testcases. Add dg-final tests for new instructions. * gcc.target/powerpc/p8vector-builtin-2.c (vbschar_eq, vbchar_eq, vuchar_eq, vbint_eq, vsint_eq, viint_eq, vuint_eq, vbool_eq, vbint_ne, vsint_ne, vuint_ne, vbool_ne, vsign_ne, vuns_ne, vbshort_ne): Add tests. Add dg-final instruction tests. * gcc.target/powerpc/vsx-vector-6.c: Renamed vsx-vector-6.h. * gcc.target/powerpc/vsx-vector-6.h (vec_andc,vec_nmsub, vec_nmadd, vec_or, vec_nor, vec_andc, vec_or, vec_andc, vec_msums): Add tests. Add dg-final tests for the generated instructions. 
* gcc.target/powerpc/builtins-3.c (test_sll_vsc_vsc_vsuc, test_sll_vuc_vuc, test_sll_vsi_vsi_vuc, test_sll_vui_vui_vuc, test_sll_vbll_vull, test_sll_vbll_vbll_vus, test_sll_vp_vp_vuc, test_sll_vssi_vssi_vuc, test_sll_vusi_vusi_vuc, test_slo_vsc_vsc_vsc, test_slo_vuc_vuc_vsc, test_slo_vsi_vsi_vsc, test_slo_vsi_vsi_vuc, test_slo_vui_vui_vsc, test_slo_vui_vui_vuc, test_slo_vsll_slo_vsll_vsc, test_slo_vsll_slo_vsll_vuc, test_slo_vull_slo_vull_vsc, test_slo_vull_slo_vull_vuc, test_slo_vp_vp_vsc, test_slo_vp_vp_vuc, test_slo_vssi_vssi_vsc, test_slo_vssi_vssi_vuc, test_slo_vusi_vusi_vsc, test_slo_vusi_vusi_vuc, test_slo_vusi_vusi_vuc, test_slo_vf_vf_vsc, test_slo_vf_vf_vuc, test_cmpb_float): Add tests. Backport
Re: [C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)
On Mon, Feb 18, 2019 at 04:04:15PM -1000, Jason Merrill wrote: > > --- gcc/cp/constexpr.c.jj 2019-02-17 17:09:47.113351897 +0100 > > +++ gcc/cp/constexpr.c 2019-02-18 19:34:57.995136395 +0100 > > @@ -1269,6 +1301,49 @@ cxx_eval_builtin_function_call (const co > > return t; > > } > > + if (fndecl_built_in_p (fun, BUILT_IN_NORMAL)) > > +switch (DECL_FUNCTION_CODE (fun)) > > + { > > + case BUILT_IN_ADD_OVERFLOW: > > + case BUILT_IN_SADD_OVERFLOW: > > + case BUILT_IN_SADDL_OVERFLOW: > > + case BUILT_IN_SADDLL_OVERFLOW: > > + case BUILT_IN_UADD_OVERFLOW: > > + case BUILT_IN_UADDL_OVERFLOW: > > + case BUILT_IN_UADDLL_OVERFLOW: > > + case BUILT_IN_SUB_OVERFLOW: > > + case BUILT_IN_SSUB_OVERFLOW: > > + case BUILT_IN_SSUBL_OVERFLOW: > > + case BUILT_IN_SSUBLL_OVERFLOW: > > + case BUILT_IN_USUB_OVERFLOW: > > + case BUILT_IN_USUBL_OVERFLOW: > > + case BUILT_IN_USUBLL_OVERFLOW: > > + case BUILT_IN_MUL_OVERFLOW: > > + case BUILT_IN_SMUL_OVERFLOW: > > + case BUILT_IN_SMULL_OVERFLOW: > > + case BUILT_IN_SMULLL_OVERFLOW: > > + case BUILT_IN_UMUL_OVERFLOW: > > + case BUILT_IN_UMULL_OVERFLOW: > > + case BUILT_IN_UMULLL_OVERFLOW: > > + /* These builtins will fold into > > + (cast) > > +((something = __real__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>), > > + __imag__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>) > > + which fails is_constant_expression. */ > > + if (TREE_CODE (args[0]) != INTEGER_CST > > + || TREE_CODE (args[1]) != INTEGER_CST > > + || !potential_constant_expression (args[2])) > > + { > > + if (!*non_constant_p && !ctx->quiet) > > + error ("%q+E is not a constant expression", new_call); > > + *non_constant_p = true; > > + return t; > > + } > > + return cxx_eval_constant_expression (_ctx, new_call, lval, > > +non_constant_p, overflow_p); > > + default: > > + break; > > + } > > What is this for? Won't this recursive cxx_eval_constant_expression come > back to this function again? If the expression is constant, shouldn't it > have been folded by fold_builtin_call_array? 
This is for the constexpr-arith-overflow.C testcase. The arguments are INTEGER_CST, INTEGER_CST and ADDR_EXPR of a VAR_DECL or PARM_DECL, and fold_builtin_call_array returns new_call: (z = REALPART_EXPR >;, (bool) IMAGPART_EXPR >;); where this doesn't pass is_constant_expression because of the z store. cxx_eval_constant_expression is able to evaluate this, as z = 0; false; in this case. I guess builtins.c folding could be improved and simplify it to (z = 0; (bool) false;); but that still doesn't pass is_constant_expression check. For C++14 it passes potential_constant_expression though, and that is what I've used for these builtins in the first iteration, but the testcase happened to pass even for C++11 and potential_constant_expression is false here. Though, perhaps we are going too far for C++11 here and should reject it, after all, people have the possibility to use __builtin_*_overflow_p now which should be usable even in C++11. The reason why it passed with C++11 is that when parsing we saw a __builtin_add_overflow (0, 0, ) call and potential_constant_expression said it is ok, then folded it into that (z = REALPART_EXPR >;, (bool) IMAGPART_EXPR >;); which is not potential_constant_expression, but nothing called it again and cxx_eval_constant_expression can handle it. > > @@ -1358,6 +1433,9 @@ cxx_bind_parameters_in_call (const const > > x = ctx->object; > > x = build_address (x); > > } > > + if (TREE_ADDRESSABLE (type) && TYPE_REF_P (TREE_TYPE (x))) > > + /* Undo convert_for_arg_passing work here. */ > > + x = build_fold_indirect_ref_loc (EXPR_LOCATION (x), x); > > Not convert_from_reference? Will change. > > @@ -4036,6 +4113,10 @@ label_matches (const constexpr_ctx *ctx, > > } > > break; > > +case BREAK_STMT: > > +case CONTINUE_STMT: > > + break; > > + > > Let's add a comment that these are handled directly in cxx_eval_loop_expr. Ok, will do. Jakub
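For readers following the thread, here is a minimal sketch of the kind of code under discussion: an arithmetic overflow builtin called with two integer constants and the address of a local variable, evaluated inside a constexpr function. The helper name `add_overflows` is made up for illustration and is not taken from constexpr-arith-overflow.C. The sketch assumes C++14 or later, because the function body declares a local variable.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not from the GCC testsuite): reports whether a + b
// overflows int.  The third argument of __builtin_add_overflow is the
// address of a local variable, which corresponds to the
// ADDR_EXPR-of-a-VAR_DECL case described above.
constexpr bool add_overflows(int a, int b)
{
  int z = 0;
  return __builtin_add_overflow(a, b, &z);
}

// Both calls are evaluated at compile time.
static_assert(!add_overflows(0, 0), "0 + 0 does not overflow");
static_assert(add_overflows(INT_MAX, 1), "INT_MAX + 1 overflows");
```

In C++11, where a constexpr function body cannot declare such a local, the `__builtin_*_overflow_p` variants mentioned above are probably the more natural fit, since they take no result pointer.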
Re: [PATCH, GCC] PR target/86487: fix the way 'uses_hard_regs_p' handles paradoxical subregs
On 2019-02-15 6:35 a.m., Andre Vieira (lists) wrote:
> Hi Vlad,
>
> On 13/02/2019 16:46, Vladimir Makarov wrote:
> > On 2019-02-13 5:54 a.m., Andre Vieira (lists) wrote:
> > > PING.
> > >
> > > Since Jeff is away can another maintainer have a look at this please?
> >
> > I see the following patch
>
> Yeah I uploaded the wrong patch... sorry. See attached, including a testcase, currently only fails on GCC-8 and previous though.

It happens. The new version is ok to commit. Thank you for working on the patch.
Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests
On 07/02/19 23:39 -0500, Tom Honermann wrote: On 2/7/19 4:54 AM, Jonathan Wakely wrote: On 23/12/18 21:27 -0500, Tom Honermann wrote: Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include: - Updated the value of the __cpp_char8_t feature test macro to 201811. Tested on x86_64-linux. There are quite a few additional changes needed to make the testsuite pass cleanly with non-default options, e.g. when running it with RUNTESTFLAGS=--target_board=unix/-fchar8_t/-fno-inline I see these failures: I remember thinking that I had to deal with this at one point. It seems I then forgot about it. FAIL: 21_strings/basic_string/literals/types.cc (test for excess errors) FAIL: 21_strings/basic_string/literals/values.cc (test for excess errors) UNRESOLVED: 21_strings/basic_string/literals/values.cc compilation failed to produce executable FAIL: 21_strings/basic_string_view/literals/types.cc (test for excess errors) FAIL: 21_strings/basic_string_view/literals/values.cc (test for excess errors) UNRESOLVED: 21_strings/basic_string_view/literals/values.cc compilation failed to produce executable FAIL: 22_locale/codecvt/char16_t.cc (test for excess errors) UNRESOLVED: 22_locale/codecvt/char16_t.cc compilation failed to produce executable FAIL: 22_locale/codecvt/char32_t.cc (test for excess errors) UNRESOLVED: 22_locale/codecvt/char32_t.cc compilation failed to produce executable FAIL: 22_locale/codecvt/codecvt_utf8/79980.cc (test for excess errors) UNRESOLVED: 22_locale/codecvt/codecvt_utf8/79980.cc compilation failed to produce executable FAIL: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc (test for excess errors) UNRESOLVED: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc compilation failed to produce executable FAIL: 22_locale/codecvt/utf8.cc (test for excess errors) UNRESOLVED: 22_locale/codecvt/utf8.cc compilation failed to produce executable FAIL: 22_locale/conversions/string/2.cc (test for excess errors) UNRESOLVED: 
22_locale/conversions/string/2.cc compilation failed to produce executable FAIL: 22_locale/conversions/string/3.cc (test for excess errors) UNRESOLVED: 22_locale/conversions/string/3.cc compilation failed to produce executable FAIL: experimental/string_view/literals/types.cc (test for excess errors) FAIL: experimental/string_view/literals/values.cc (test for excess errors) UNRESOLVED: experimental/string_view/literals/values.cc compilation failed to produce executable There would be similar errors running all the tests with -std=c++2a, which is definitely something I do often and so want the tests to be clean. Absolutely, agreed. We can either disable those tests when char8_t is enabled (because we already have alternative tests checking the char8_t versions of string_view etc.) or make them work either way, which the attached patch begins doing (more changes are needed). Since most of these tests exercise functionality that is not u8/char8_t specific, I think we should make them work. I expect a different set of failures for -fno-char8_t (which is probably a less important case to support that enabling char8_t in older standards, but maybe still worth testing now and then). I'm not sure it is less important. -fno-char8_t may be an important tool for some code bases during their initial testing of, and migration to, C++20. Tom. I committed your patch for library tests unchanged, and also committed the attached one to fix the failures when running the existing tests with -std=gnu++2a or -fchar8_t. commit 1c32dfd748cc225a02cb729943eb9586eda8d7fd Author: Jonathan Wakely Date: Thu Feb 7 09:04:11 2019 + Adjust C++11/C++14 tests to work with -fchar8_t * testsuite/21_strings/basic_string/literals/types.cc [_GLIBCXX_USE_CHAR8_T]: Adjust expected string type for u8 literal. * testsuite/21_strings/basic_string/literals/values.cc [_GLIBCXX_USE_CHAR8_T]: Likewise. * testsuite/22_locale/codecvt/char16_t.cc: Adjust for u8 literals potentially having different type. 
* testsuite/22_locale/codecvt/char32_t.cc: Likewise. * testsuite/22_locale/codecvt/codecvt_utf8/79980.cc: Cast u8 literal to char. * testsuite/22_locale/codecvt/codecvt_utf8/wchar_t/1.cc: Likewise. * testsuite/22_locale/codecvt/utf8.cc: Likewise. * testsuite/22_locale/conversions/string/2.cc: Remove u8 prefix from string literals only using basic character set. * testsuite/22_locale/conversions/string/3.cc: Likewise. Cast other u8 literals to char. * testsuite/29_atomics/headers/atomic/macros.cc [_GLIBCXX_USE_CHAR8_T]: Test ATOMIC_CHAR8_T_LOCK_FREE. Add missing #error to ATOMIC_CHAR16_T_LOCK_FREE test. * testsuite/29_atomics/headers/atomic/types_std_c++0x.cc [_GLIBCXX_USE_CHAR8_T]: Check for std::atomic_char8_t.
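The test adjustments listed in this commit come down to one fact: the element type of a u8 literal is `char8_t` when char8_t is enabled and `char` otherwise. A rough sketch of that difference, and of the "cast u8 literal to char" workaround, might look like the following. The names `u8_elem` and `as_char` are invented for this illustration and do not appear in the committed tests.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Element type of a u8 literal: char8_t when the feature is enabled
// (-fchar8_t or -std=c++2a), plain char otherwise.
#ifdef __cpp_char8_t
typedef char8_t u8_elem;
#else
typedef char u8_elem;
#endif

static_assert(std::is_same<std::decay<decltype(u8"a"[0])>::type, u8_elem>::value,
              "u8 literal element type depends on char8_t support");

// Hypothetical helper: view a u8 literal as const char* in either mode, so
// code written against char-based interfaces keeps compiling.
inline const char* as_char(const u8_elem* s)
{
  return reinterpret_cast<const char*>(s);
}
```

For literals that use only the basic character set, dropping the u8 prefix (as the commit does for some tests) avoids the cast entirely.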
Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support
On 08/02/19 12:56 +, Jonathan Wakely wrote: On 07/02/19 23:35 -0500, Tom Honermann wrote: On 2/7/19 4:44 AM, Jonathan Wakely wrote: On 23/12/18 21:27 -0500, Tom Honermann wrote: Attached is a revised patch that addresses changes in P0482R6. Changes from the prior patch include: - Updated the value of the __cpp_char8_t feature test macro to 201811. Tested on x86_64-linux. Thanks, Tom, this is great work! The front-end changes for char8_t went in recently, and I'm finally ready to commit the library parts. Great! There's one big problem I found in this patch, which is that the new numeric_limits specialization uses constexpr unconditionally. That fails if is compiled using options like -std=c++98 -fno-char8_t because the specialization will be used, but the constexpr keyword isn't allowed. That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR. Hmm, the code for the char8_t specialization was copied from the char16_t specialization which also uses constexpr unconditionally (but is guarded by a C++11+ requirement). That can use it unconditionally, because there's no -fchar16_t switch to enable char16_t prior to C++11. The char8_t specialization must be elided when the compiler is invoked with -std=c++98 -fno-char8_t (since the char8_t type doesn't exist then). The _GLIBCXX_USE_CHAR8_T guard doesn't suffice for this? _GLIBCXX_USE_CHAR8_T should only be defined if __cpp_char8_t is defined; and that should only be defined if -fchar8_t or -std=c++2a is specified. Or perhaps you intended -std=c++98 -fchar8_t? I agree in that case that use of _GLIBCXX_CONSTEXPR is necessary. Yes sorry, that's a typo above, I meant -std=c++98 -fchar8_t. The -std=c++98 -fno-char8_t case works fine, as expected (because -fno-char8_t is the default for -std=c++98 anyway). The other way to solve that problem would be for the compiler to give an error if -fchar8_t is used with C++98, but I see no fundamental reason that combination of options shouldn't be allowed. 
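As a rough illustration of the `_GLIBCXX_CONSTEXPR` point, the sketch below uses a made-up macro, `SKETCH_CONSTEXPR`, that expands to `constexpr` only when the language mode has the keyword. It is an assumption-laden stand-in, not the libstdc++ source: `limits_sketch` is a hypothetical trait, and `unsigned char` replaces `char8_t` so the example compiles whether or not char8_t is enabled.

```cpp
#include <cassert>
#include <climits>

// SKETCH_CONSTEXPR is a made-up stand-in for a macro like
// _GLIBCXX_CONSTEXPR: it expands to constexpr only when the language mode
// supports the keyword, and to nothing otherwise.
#if __cplusplus >= 201103L
# define SKETCH_CONSTEXPR constexpr
#else
# define SKETCH_CONSTEXPR
#endif

// Hypothetical trait, loosely modelled on std::numeric_limits.
template<typename T> struct limits_sketch;

// unsigned char stands in for char8_t so that this compiles in any mode.
template<> struct limits_sketch<unsigned char>
{
  static SKETCH_CONSTEXPR unsigned char min() { return 0; }
  static SKETCH_CONSTEXPR unsigned char max() { return UCHAR_MAX; }
};
```

With this shape, a specialization that is visible in a C++98 mode still compiles, because the keyword simply disappears there.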
We can support it in the library by using the macro. Agreed. As discussed in San Diego, the other change needed is to add the abi_tag attribute to the new versions of path::u8string and path::generic_u8string, so that the mangling is different when its return type is different: #ifdef _GLIBCXX_USE_CHAR8_T __attribute__((__abi_tag__("__u8"))) std::u8string u8string() const; #else std::string u8string() const; #endif // _GLIBCXX_USE_CHAR8_T Otherwise we get ODR violations when linking objects compiled with -fchar8_t enabled to objects with it disabled (e.g. linking -std=c++17 objects to -std=c++2a objects, which needs to work). Are ODR violations bad? :) Only when they make people send us bug reports ;-) I suggest "__u8" as the name of the ABI tag, but I'm open to other suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be consistent with "__cxx11" used for the new ABI introduced in GCC 5 but it regularly confuses people who think it is coupled to the -std=c++11 option (and so don't understand why they still see it for -std=c++14). I have no preference or alternative suggestions here. Had I recognized the issue, I would have asked you what to do about it :) Also, I see that you've made changes to (to add the experimental::u8string_view typedef) and to std::experimental::path (to change the return type of u8string and generic_u8string). The former change is fairly harmless; it only adds a typedef, albeit one which is not a reserved name in C++14/C++17 and so should be available for users to define as a macro. Maybe prior to C++2a we should only define it when GNU extensions are enabled (i.e. when using -std=gnu++14 not -std=c++14): #if defined _GLIBCXX_USE_CHAR8_T \ && (__cplusplus > 201703L || !defined __STRICT_ANSI__) using u8string_view = basic_string_view; #endif That makes sense. 
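The abi_tag suggestion above can be tried out on a small stand-in class. In the sketch below, `path_sketch` is a hypothetical type, not `std::filesystem::path`, and the guard uses the compiler's `__cpp_char8_t` macro rather than the library-internal `_GLIBCXX_USE_CHAR8_T`, which is an assumption made to keep the example independent of libstdc++ internals.

```cpp
#include <cassert>
#include <string>

// path_sketch is a hypothetical stand-in for a path class.
struct path_sketch
{
#ifdef __cpp_char8_t
  // The return type of an ordinary member function is not part of its
  // mangled name, so the tag is what keeps this variant distinct from the
  // std::string one at link time.
  __attribute__((__abi_tag__("__u8")))
  std::u8string u8string() const { return u8"a/b"; }
#else
  std::string u8string() const { return u8"a/b"; }
#endif
};
```

Comparing the symbol names produced with and without -fchar8_t (for example with `nm -C`) should show the tag on the char8_t variant only.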
Actually I was thinking about this further, and if somebody explicitly uses -fchar8_t then they're asking for a non-standard dialect of C++ anyway, and so they can't complain about some extra non-standard names. So I think it's fine to declare std::u8string_view whenever char8_t is enabled. Changing the return type of experimental::path members concerns me more. That's a published TS which is not going to be revised, and it's not obvious to me that users would want the change in semantics. If somebody is still using the Filesystem TS in C++2a code, they're probably not expecting it to change. If they need to update their code for C++2a they might as well just use std::filesystem, and so having char8_t support in std::experimental::filesystem isn't clearly useful. I agree. I added the support to the experimental implementations more out of a desire to be complete and to remove any potential barriers to use of -fchar8_t than because I felt the changes were really necessary. I would be perfectly fine with skipping the updates to the experimental libraries completely. OK, let's leave them
Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)
On 18/02/19 21:22 +0100, Jakub Jelinek wrote:
On Mon, Feb 18, 2019 at 09:15:39PM +0100, Rainer Orth wrote:

2019-02-15  Rainer Orth

	* g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to bad_weak_ptr_.

Ok, thanks. If needed, guess we could rename much more (or rename the namespace in which most of it is from std to my_std, though we'd need to check for stuff that needs to be in std namespace).

I think that whole testcase could be in some non-std namespace. I don't think there are any magic functions or types that need to be in namespace std to work correctly.

# HG changeset patch
# Parent 056fe4093ce40dc462c6b50c3ae49df032a92230
Fix g++.dg/torture/pr89303.C with Solaris ld

diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C b/gcc/testsuite/g++.dg/torture/pr89303.C
--- a/gcc/testsuite/g++.dg/torture/pr89303.C
+++ b/gcc/testsuite/g++.dg/torture/pr89303.C
@@ -350,11 +350,11 @@ namespace std {
       return static_cast(_M_addr()); }
   };
 
-  class bad_weak_ptr { };
+  class bad_weak_ptr_ { };
 
   inline void
   __throw_bad_weak_ptr()
-  { (throw (bad_weak_ptr())); }
+  { (throw (bad_weak_ptr_())); }
 
   class _Sp_counted_base
   {

Jakub
PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)
Please let me know what it will take to get the fix for these two issues approved. I've answered the questions so I don't know what else I'm expected to do here.

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00793.html

On 2/11/19 12:20 PM, Martin Sebor wrote:
> This is a repost of a patch for PR 88383 updated to also fix the just reported PR 89288 (the original patch only partially handles this case). The review of the first patch was derailed by questions about the design of the built-in so the fix for the ICE was never approved. I think the ICEs should be fixed for GCC 9 and any open design questions should be dealt with independently.
>
> Martin
>
> The patch for PR 88383 was originally posted last December:
> https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00337.html
Re: [C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)
On 2/18/19 12:45 PM, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, we've regressed on the trunk in diagnostics of some invalid constexpr evaluations. The problem is that the constexpr evaluation is effectively done on post-cp_fold_function bodies/arguments and cp_fold optimizes away some important trees for constexpr diagnostics, either itself, or through using GENERIC match.pd (on the testcase in particular diagnostics about reinterpret_cast). While we save on constexpr call hash table bodies of the functions pre-cp_fold_function, due to sharing and cp_fold_r the STATEMENT_LIST statements etc. are modified directly and genericization modifies it as well.

The following patch uses copy_fn, which we have already been using for the recursive constexpr cases, also to make a copy of the constexpr function before cp_fold_function clobbers it. I had to implement cxx_eval_conditional_expression handling of various C++ FE statements that are replaced during genericization.

Bootstrapped/regtested on x86_64-linux and i686-linux (98,11,14,17,2a), ok for trunk?

2019-02-18  Jakub Jelinek

	PR c++/89285
	* constexpr.c (struct constexpr_fundef): Add parms and result members.
	(retrieve_constexpr_fundef): Adjust for the above change.
	(register_constexpr_fundef): Save constexpr body with copy_fn, temporarily set DECL_CONTEXT on DECL_RESULT before that.
	(get_fundef_copy): Change FUN argument to FUNDEF with constexpr_fundef * type, grab body and parms/result out of constexpr_fundef struct and temporarily change it for copy_fn calls too.
	(cxx_eval_builtin_function_call): For __builtin_FUNCTION temporarily adjust current_function_decl from ctx->call context. For arith overflow builtins, don't test is_constant_expression on the result, instead test if arguments are suitable constant expressions.
	(cxx_bind_parameters_in_call): Grab parameters from new_call. Undo convert_for_arg_passing changes for TREE_ADDRESSABLE type passing.
	(cxx_eval_call_expression): Adjust get_fundef_copy caller.
(cxx_eval_conditional_expression): For IF_STMT, allow then or else operands to be NULL. (label_matches): Handle BREAK_STMT and CONTINUE_STMT. (cxx_eval_loop_expr): Add support for FOR_STMT, WHILE_STMT and DO_STMT. (cxx_eval_switch_expr): Add support for SWITCH_STMT. (cxx_eval_constant_expression): Handle IF_STMT, FOR_STMT, WHILE_STMT, DO_STMT, CONTINUE_STMT, SWITCH_STMT, BREAK_STMT and CONTINUE_STMT. For SIZEOF_EXPR, recurse on the result of fold_sizeof_expr. Ignore DECL_EXPR with USING_DECL operand. * lambda.c (maybe_add_lambda_conv_op): Build thisarg using build_int_cst to make it a valid constant expression. * g++.dg/ubsan/vptr-4.C: Expect reinterpret_cast errors. * g++.dg/cpp1y/constexpr-84192.C (f2): Adjust expected diagnostics. * g++.dg/cpp1y/constexpr-70265-2.C (foo): Adjust expected line of diagnostics. * g++.dg/cpp1y/constexpr-89285.C: New test. --- gcc/cp/constexpr.c.jj 2019-02-17 17:09:47.113351897 +0100 +++ gcc/cp/constexpr.c 2019-02-18 19:34:57.995136395 +0100 @@ -1269,6 +1301,49 @@ cxx_eval_builtin_function_call (const co return t; } + if (fndecl_built_in_p (fun, BUILT_IN_NORMAL)) +switch (DECL_FUNCTION_CODE (fun)) + { + case BUILT_IN_ADD_OVERFLOW: + case BUILT_IN_SADD_OVERFLOW: + case BUILT_IN_SADDL_OVERFLOW: + case BUILT_IN_SADDLL_OVERFLOW: + case BUILT_IN_UADD_OVERFLOW: + case BUILT_IN_UADDL_OVERFLOW: + case BUILT_IN_UADDLL_OVERFLOW: + case BUILT_IN_SUB_OVERFLOW: + case BUILT_IN_SSUB_OVERFLOW: + case BUILT_IN_SSUBL_OVERFLOW: + case BUILT_IN_SSUBLL_OVERFLOW: + case BUILT_IN_USUB_OVERFLOW: + case BUILT_IN_USUBL_OVERFLOW: + case BUILT_IN_USUBLL_OVERFLOW: + case BUILT_IN_MUL_OVERFLOW: + case BUILT_IN_SMUL_OVERFLOW: + case BUILT_IN_SMULL_OVERFLOW: + case BUILT_IN_SMULLL_OVERFLOW: + case BUILT_IN_UMUL_OVERFLOW: + case BUILT_IN_UMULL_OVERFLOW: + case BUILT_IN_UMULLL_OVERFLOW: + /* These builtins will fold into + (cast) +((something = __real__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>), + __imag__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>) + which fails 
is_constant_expression. */ + if (TREE_CODE (args[0]) != INTEGER_CST + || TREE_CODE (args[1]) != INTEGER_CST + || !potential_constant_expression (args[2])) + { + if (!*non_constant_p && !ctx->quiet) + error ("%q+E is not a constant expression", new_call); + *non_constant_p = true; + return t; + } + return cxx_eval_constant_expression (_ctx, new_call, lval, +non_constant_p, overflow_p); + default: + break; +
Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")
On 2/18/19 3:15 PM, Paolo Carlini wrote: Hi, On 19/02/19 00:52, Jason Merrill wrote: On 2/18/19 12:14 PM, Paolo Carlini wrote: Hi Jason, On 18/02/19 19:28, Jason Merrill wrote: On 2/18/19 5:31 AM, Paolo Carlini wrote: Hi Jason, On 18/02/19 10:20, Jason Merrill wrote: On 2/17/19 6:58 AM, Paolo Carlini wrote: Hi, here, when we don't see an initializer we believe we are surely dealing with a case of C++17 template argument deduction for class templates, but, in fact, it's just an ill-formed C++14 template variable specialization. Conveniently, we can use here too the predicate variable_template_specialization_p. Not 100% sure about the exact wording of the error message, I added '#' to %qD to explicitly print the auto-using type too. I guess we should change the assert to a test, so that we give the error if we aren't dealing with a class template placeholder. Variable templates don't seem to be important to test for. Thanks, simpler patch. This error is also pretty poor for this testcase, where there is an initializer. Well, implementation-wise, certainly init == NULL_TREE and only when we have an empty pack this specific issue occurs. In practice, clang simply talks about an empty initializer (during instantiation, etc, like we do), whereas EDG explicitly says that pack expansion produces an empty list of expressions. I don't think that in cp_finish_decl it would be easy for us to do exactly the same, we simply see a NULL_TREE as second argument. Or we could just *assume* that we are dealing with the outcome of a pack expansion, say something like EDG even if we don't have details beyond the fact that init == NULL_TREE. I believe that without a variadic template the problem cannot occur, because we catch the empty initializer much earlier, in grokdeclarator - indeed using a !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? Again "instantiated for an empty pack" or something similar? 
Perhaps we could complain in the code for empty pack expansion handling in tsubst_init? Ah, thanks Jason. In fact, however, tsubst_init isn't currently involved at all, because, at the end of regenerate_decl_from_template we call by hand tsubst_expr and assign the result to DECL_INITIAL. Simply changing that avoids the ICE. However, the error we issue - likewise for the existing cpp0x/auto31.C - is the rather user-unfriendly "value-initialization of incomplete type ‘auto’", as produced by build_value_init. Thus a simple additional test along the lines already discussed, which now becomes much more simple to implement in a precise way. Again, wording only tentative. I'm also a little puzzled that, otherwise, we could get away with tubst_expr instead of tsubst_init... + if (type_uses_auto (TREE_TYPE (decl))) + { + if (complain & tf_error) + error ("initializer for %q#D expands to an empty list " + "of expressions", decl); + return error_mark_node; + } This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case. And yes, we mustn't call build_value_init for a dependent type; if the type is dependent, we should just return the NULL_TREE. Good. Then I'm finishing testing the below (currently in libstdc++). + if (tree auto_node = type_uses_auto (type)) + if (!CLASS_PLACEHOLDER_TEMPLATE (auto_node)) + { + if (complain & tf_error) + error ("initializer for %q#D expands to an empty list " +"of expressions", decl); + return error_mark_node; + } + + if (!dependent_type_p (type)) This should probably be 'else if', since we can have auto outside of a template and dependent_type_p will always return false outside of a template. Jason
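The well-formed side of what tsubst_init handles can be shown with a small sketch: a variable whose parenthesized initializer is a pack expansion, which turns into value-initialization when the pack is empty. The function name `init_from` is made up for this illustration; the ill-formed `auto` case from the PR is deliberately not shown as running code, since it is expected to be rejected.

```cpp
#include <cassert>

// Hypothetical example: the initializer of x is a pack expansion.
template<typename... Args>
int init_from(Args... args)
{
  // With an empty pack the initializer instantiates to nothing, and the
  // object is value-initialized (zero for int).  With exactly one argument
  // it is ordinary direct-initialization.
  int x(args...);
  return x;
}
```

With a declared type of `int` there is something to value-initialize; with a plain `auto` there is no type to deduce from an empty list, which is the situation the proposed error message describes.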
Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")
Hi, On 19/02/19 00:52, Jason Merrill wrote: On 2/18/19 12:14 PM, Paolo Carlini wrote: Hi Jason, On 18/02/19 19:28, Jason Merrill wrote: On 2/18/19 5:31 AM, Paolo Carlini wrote: Hi Jason, On 18/02/19 10:20, Jason Merrill wrote: On 2/17/19 6:58 AM, Paolo Carlini wrote: Hi, here, when we don't see an initializer we believe we are surely dealing with a case of C++17 template argument deduction for class templates, but, in fact, it's just an ill-formed C++14 template variable specialization. Conveniently, we can use here too the predicate variable_template_specialization_p. Not 100% sure about the exact wording of the error message, I added '#' to %qD to explicitly print the auto-using type too. I guess we should change the assert to a test, so that we give the error if we aren't dealing with a class template placeholder. Variable templates don't seem to be important to test for. Thanks, simpler patch. This error is also pretty poor for this testcase, where there is an initializer. Well, implementation-wise, certainly init == NULL_TREE and only when we have an empty pack this specific issue occurs. In practice, clang simply talks about an empty initializer (during instantiation, etc, like we do), whereas EDG explicitly says that pack expansion produces an empty list of expressions. I don't think that in cp_finish_decl it would be easy for us to do exactly the same, we simply see a NULL_TREE as second argument. Or we could just *assume* that we are dealing with the outcome of a pack expansion, say something like EDG even if we don't have details beyond the fact that init == NULL_TREE. I believe that without a variadic template the problem cannot occur, because we catch the empty initializer much earlier, in grokdeclarator - indeed using a !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? Again "instantiated for an empty pack" or something similar? Perhaps we could complain in the code for empty pack expansion handling in tsubst_init? Ah, thanks Jason. 
In fact, however, tsubst_init isn't currently involved at all, because at the end of regenerate_decl_from_template we call tsubst_expr by hand and assign the result to DECL_INITIAL. Simply changing that avoids the ICE. However, the error we issue - likewise for the existing cpp0x/auto31.C - is the rather user-unfriendly "value-initialization of incomplete type ‘auto’", as produced by build_value_init. Thus a simple additional test along the lines already discussed, which now becomes much simpler to implement in a precise way. Again, wording only tentative. I'm also a little puzzled that, otherwise, we could get away with tsubst_expr instead of tsubst_init...

+      if (type_uses_auto (TREE_TYPE (decl)))
+        {
+          if (complain & tf_error)
+            error ("initializer for %q#D expands to an empty list "
+                   "of expressions", decl);
+          return error_mark_node;
+        }

This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case. And yes, we mustn't call build_value_init for a dependent type; if the type is dependent, we should just return the NULL_TREE.

Good. Then I'm finishing testing the below (currently in libstdc++).

Thanks, Paolo.

Index: cp/pt.c
===
--- cp/pt.c (revision 268997)
+++ cp/pt.c (working copy)
@@ -15422,21 +15422,34 @@ tsubst_init (tree init, tree decl, tree args,

   init = tsubst_expr (init, args, complain, in_decl, false);

-  if (!init && TREE_TYPE (decl) != error_mark_node)
+  tree type = TREE_TYPE (decl);
+
+  if (!init && type != error_mark_node)
     {
-      /* If we had an initializer but it
-         instantiated to nothing,
-         value-initialize the object.  This will
-         only occur when the initializer was a
-         pack expansion where the parameter packs
-         used in that expansion were of length
-         zero.  */
-      init = build_value_init (TREE_TYPE (decl),
-                               complain);
-      if (TREE_CODE (init) == AGGR_INIT_EXPR)
-        init = get_target_expr_sfinae (init, complain);
-      if (TREE_CODE (init) == TARGET_EXPR)
-        TARGET_EXPR_DIRECT_INIT_P (init) = true;
+      if (tree auto_node = type_uses_auto (type))
+        if (!CLASS_PLACEHOLDER_TEMPLATE (auto_node))
+          {
+            if (complain & tf_error)
+              error ("initializer for %q#D expands to an empty list "
+                     "of expressions", decl);
+            return error_mark_node;
+          }
+
+      if (!dependent_type_p (type))
+        {
+          /* If we had an initializer but it
+             instantiated to nothing,
+             value-initialize the object.  This will
+             only occur when the initializer was a
+             pack expansion where the parameter packs
+             used in that expansion were of length
+             zero.  */
+          init = build_value_init (type, complain);
+          if
Re: [C++ PATCH] Fix cxx_eval_store_expression (PR c++/89336)
On 2/17/19 3:34 AM, Jakub Jelinek wrote: On Sat, Feb 16, 2019 at 08:51:33AM -1000, Jason Merrill wrote: The likely case is still that nothing has changed in between, so this patch just quickly verifies if that is the case (by comparing CONSTRUCTOR_ELT (ctor, 0) with the previously saved value of that and by checking if at the spot in the vector is the expected index). If that is the case, it doesn't do anything else, otherwise it updates the valp pointer. For scalar types, as in all your testcases, we can evaluate the initializer before the target, as C++17 wants. We probably still need your patch for when type is a class. If you are ok that the scalar vs. aggregate case will be handled differently, I'm all for your patch, though I guess instead of that second hunk it should change: if (AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type)) into: if (!preeval) and move the init = cxx_eval_constant_expression ... call into the body of that if. I guess that means the scalar store will be handled right even for unions then. Just wonder if similar to if (*non_constant_p) return t; after target = cxx_eval_... we shouldn't have that for (both) init = cxx_eval_... cases too. Thanks, done. The testcases can be all changed to work with say struct Z { int z; }; instead of int (or any other aggregate) and I think my patch or something similar is needed. But they would still be doing assignment, rather than initialization, so they would still be preevaluated and work. With unions, I think the most nasty case is when the union member to which we want to store is active before an assignment, but is then made inactive and later active again. 
struct Z { int x, y; };
union W { Z a; long long w; };
W w {};
w.a = { 5, 0 }; // w.a becomes the active member
w.a = { (int) (w.w = 17LL + w.a.x), 2 };

So, if we don't preevaluate init, we look up w.a as { 5, 0 } active member and try to store that, but in the meantime the init evaluation changes active member to something different, which should invalidate w.a.

Here also we're looking at assignment. Here's a modification that still breaks with my patch:

struct Z { int x, y; };
union W { Z a; long long w; constexpr W(): a({int(this->w = 42), 2}) {} };
constexpr W w {};
static_assert (w.a.x == 42);

But it's not clear to me that the standard actually allows this. I don't think changing the active member of a union in the mem-initializer for another member is reasonable.

So, I'm going to apply this:

commit b5aa6e87a705496c38639b697317b0bd764dab30
Author: Jason Merrill
Date:   Fri Feb 15 13:09:33 2019 -1000

    PR c++/89336 - multiple stores in constexpr stmt.

    If we evaluate the RHS in the context of the LHS, that evaluation might
    change the LHS in ways that mess with being able to store the value later.
    So for assignment or scalar values, evaluate the RHS first.

            * constexpr.c (cxx_eval_store_expression): Preevaluate scalar or
            assigned value.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index d946a797999..d413c6b9b27 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3634,6 +3634,18 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
   maybe_simplify_trivial_copy (target, init);

   tree type = TREE_TYPE (target);
+  bool preeval = SCALAR_TYPE_P (type) || TREE_CODE (t) == MODIFY_EXPR;
+  if (preeval)
+    {
+      /* Evaluate the value to be stored without knowing what object it will be
+         stored in, so that any side-effects happen first.  */
+      if (!SCALAR_TYPE_P (type))
+        new_ctx.ctor = new_ctx.object = NULL_TREE;
+      init = cxx_eval_constant_expression (&new_ctx, init, false,
+                                           non_constant_p, overflow_p);
+      if (*non_constant_p)
+        return t;
+    }
   target = cxx_eval_constant_expression (ctx, target,
                                          true,
                                          non_constant_p, overflow_p);
@@ -3834,7 +3846,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
     }
   release_tree_vector (refs);

-  if (AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type))
+  if (!preeval)
     {
       /* Create a new CONSTRUCTOR in case evaluation of the initializer
          wants to modify it.  */
@@ -3843,21 +3855,20 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
           *valp = build_constructor (type, NULL);
           CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
         }
-      else if (TREE_CODE (*valp) == PTRMEM_CST)
-        *valp = cplus_expand_constant (*valp);
       new_ctx.ctor = *valp;
       new_ctx.object = target;
+      init = cxx_eval_constant_expression (&new_ctx, init, false,
+                                           non_constant_p, overflow_p);
+      if (target == object)
+        /* The hash table might have moved since the get earlier.  */
+        valp = ctx->values->get (object);
     }

-  init = cxx_eval_constant_expression (&new_ctx, init, false,
-                                       non_constant_p, overflow_p);
   /* Don't share a CONSTRUCTOR that might be changed later.  */
   init = unshare_constructor (init);
-  if (target == object)
-
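For readers following the thread, here is a minimal sketch of the kind of code where evaluating the right-hand side before the store matters. It is not one of the PR's testcases: the names chained_stores and swapped are made up for illustration, and the sketch assumes a C++14 or later compiler.

```cpp
// Illustrative only; these functions are not taken from the GCC testsuite.
struct Z { int x, y; };

// Several stores in one statement: each right-hand side is itself an
// assignment whose value feeds the next store to the left.
constexpr int chained_stores() {
  int a[3] = {};
  a[0] = a[1] = a[2] = 7;
  return a[0] + a[1] + a[2];
}

// Class-type assignment whose right-hand side reads the object being
// assigned: the temporary is built from the old value of z before z is
// overwritten.
constexpr Z swapped() {
  Z z{1, 2};
  z = Z{z.y, z.x};
  return z;
}

static_assert(chained_stores() == 21, "all three elements receive 7");
static_assert(swapped().x == 2 && swapped().y == 1, "fields are swapped");
```

Both functions use assignment rather than initialization, which is the case the commit message says is now preevaluated.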
Re: [C++ PATCH] Fix maybe_generic_this_capture ICE on USING_DECL (PR c++/89387)
On 2/18/19 1:02 PM, Jakub Jelinek wrote: Hi! On the following testcase, id_expr is false and TREE_CODE (*iter) is USING_DECL (and the following one is FUNCTION_DECL). Since the USING_DECL changes, this ICEs because DECL_NONSTATIC_MEMBER_FUNCTION_P uses TREE_TYPE which can't be used here. Previously, I believe DECL_NONSTATIC_MEMBER_FUNCTION_P would be never true for USING_DECLs. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or should it use != USING_DECL instead (what should be DECL_NONSTATIC_MEMBER_FUNCTION_P checked on other than FUNCTION_DECL/TEMPLATE_DECL)? It only applies if DECL_DECLARES_FUNCTION_P. But the only other thing we should encounter is USING_DECL. So let's use != USING_DECL like the other places Alex changed. Jason
Re: [C++ PATCH] Avoid ICE on void to type&& reinterpret_cast (PR c++/89391)
On 2/18/19 12:58 PM, Jakub Jelinek wrote: Hi! The if (TYPE_REF_IS_RVALUE (type)) code has been added recently, but build_target_expr_with_type asserts that the expression doesn't have void type. Fixed by using the old handling in that case (the expression is not lvalue in that case and diagnostics is emitted if complain). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2019-02-18 Jakub Jelinek PR c++/89391 * typeck.c (build_reinterpret_cast_1): Don't handle void to && conversion go through build_target_expr_with_type. OK. Jason
Re: [C++ PATCH] Don't ICE on invalid scoped enum E::~E (PR c++/89390)
On 2/18/19 12:50 PM, Jakub Jelinek wrote: Hi! On the following testcase we ICE because name is BIT_NOT_EXPR and suggest_alternative_in_scoped_enum assumes it is called on IDENTIFIER_NODE only. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. There is another issue, starting with 7.x we don't use sensible location in the diagnostics, 6.x emitted pr89390.C: In function ‘void foo()’: pr89390.C:9:3: error: ‘~A’ is not a member of ‘A’ A::~A (); // { dg-error "'~A' is not a member of 'A'" } ^ but 7.x and later emits: In function ‘void foo()’: cc1plus: error: ‘~A’ is not a member of ‘A’ This patch doesn't deal with that, but would be nice to provide location, dunno if it is enough to just use location of ~, or if we need to spend memory and build ~A as combined range with caret on ~. I think having a range for a destructor id-expression would be appropriate. Jason
Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")
On 2/18/19 12:14 PM, Paolo Carlini wrote: Hi Jason, On 18/02/19 19:28, Jason Merrill wrote: On 2/18/19 5:31 AM, Paolo Carlini wrote: Hi Jason, On 18/02/19 10:20, Jason Merrill wrote: On 2/17/19 6:58 AM, Paolo Carlini wrote: Hi, here, when we don't see an initializer we believe we are surely dealing with a case of C++17 template argument deduction for class templates, but, in fact, it's just an ill-formed C++14 template variable specialization. Conveniently, we can use here too the predicate variable_template_specialization_p. Not 100% sure about the exact wording of the error message, I added '#' to %qD to explicitly print the auto-using type too. I guess we should change the assert to a test, so that we give the error if we aren't dealing with a class template placeholder. Variable templates don't seem to be important to test for. Thanks, simpler patch. This error is also pretty poor for this testcase, where there is an initializer. Well, implementation-wise, certainly init == NULL_TREE and only when we have an empty pack this specific issue occurs. In practice, clang simply talks about an empty initializer (during instantiation, etc, like we do), whereas EDG explicitly says that pack expansion produces an empty list of expressions. I don't think that in cp_finish_decl it would be easy for us to do exactly the same, we simply see a NULL_TREE as second argument. Or we could just *assume* that we are dealing with the outcome of a pack expansion, say something like EDG even if we don't have details beyond the fact that init == NULL_TREE. I believe that without a variadic template the problem cannot occur, because we catch the empty initializer much earlier, in grokdeclarator - indeed using a !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? Again "instantiated for an empty pack" or something similar? Perhaps we could complain in the code for empty pack expansion handling in tsubst_init? Ah, thanks Jason. 
In fact, however, tsubst_init isn't currently involved at all, because, at the end of regenerate_decl_from_template we call by hand tsubst_expr and assign the result to DECL_INITIAL. Simply changing that avoids the ICE. However, the error we issue - likewise for the existing cpp0x/auto31.C - is the rather user-unfriendly "value-initialization of incomplete type ‘auto’", as produced by build_value_init. Thus a simple additional test along the lines already discussed, which now becomes much more simple to implement in a precise way. Again, wording only tentative. I'm also a little puzzled that, otherwise, we could get away with tubst_expr instead of tsubst_init... + if (type_uses_auto (TREE_TYPE (decl))) + { + if (complain & tf_error) + error ("initializer for %q#D expands to an empty list " + "of expressions", decl); + return error_mark_node; + } This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case. And yes, we mustn't call build_value_init for a dependent type; if the type is dependent, we should just return the NULL_TREE. Jason
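As an illustration of the empty pack expansion case discussed above, here is a small sketch. The helper make_int is hypothetical and unrelated to the testcases in the PR; it only shows that, for a non-placeholder type, an initializer that expands to nothing amounts to value-initialization.

```cpp
// Hypothetical helper for illustration; not part of the patch or its tests.
template <typename... Ts>
constexpr int make_int(Ts... ts) {
  // With an empty pack this becomes int(), i.e. value-initialization to 0.
  // With exactly one argument it is an ordinary conversion to int.
  return int(ts...);
}

static_assert(make_int() == 0, "empty pack value-initializes");
static_assert(make_int(7) == 7, "single argument converts");

// By contrast, a variable declared with plain `auto` has nothing to deduce
// its type from when the pack is empty, which is the situation the proposed
// "expands to an empty list of expressions" error is meant to describe.
```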
Re: [PATCH, libphobos] Detect if qsort_r is available (PR d/88127)
On Sat, 2 Feb 2019 at 11:01, Johannes Pfau wrote: > > Adds a configure test for qsort_r and use the fallback code path if > it's not available. Fixes d/88127. rt/qsort.d changes have been > pushed upstream and reviewed there: > https://github.com/dlang/druntime/pull/2480 > Bootstrapped & ran D test suite on x86_64_linux with a recent glibc, > checked that Have_Qsort_R is set correctly in config.d. > > libphobos/ChangeLog: > > 2019-02-02 Johannes Pfau > > * m4/druntime/libraries.m4: Add check for qsort_r as > DRUNTIME_LIBRARIES_CLIB. > * configure.ac: Use qsort_r check. > * libdruntime/gcc/config.d.in: Add Have_Qsort_R to store check result. > * libdruntime/rt/qsort.d: Check Have_Qsort_R before using qsort_r. > * Makefile.in: Regenerate. > * aclocal.m4: Regenerate. > * configure: Regenerate. > * libdruntime/Makefile.in: Regenerate. > * src/Makefile.in: Regenerate. > * testsuite/Makefile.in: Regenerate. > > --- > libphobos/Makefile.in | 7 +++-- > libphobos/aclocal.m4 | 40 +-- > libphobos/configure | 26 +++-- > libphobos/configure.ac| 1 + > libphobos/libdruntime/Makefile.in | 7 +++-- > libphobos/libdruntime/gcc/config.d.in | 3 ++ > libphobos/libdruntime/rt/qsort.d | 18 > libphobos/m4/druntime/libraries.m4| 12 > libphobos/src/Makefile.in | 5 ++-- > libphobos/testsuite/Makefile.in | 5 ++-- > 10 files changed, 92 insertions(+), 32 deletions(-) > Adjusted the changelog entry to fit within 80 characters. Committed as r268999. Thanks, -- Iain
Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)
On Sat, 16 Feb 2019, Jakub Jelinek wrote: > Hi! > > Both the C and C++ standard guarantee that the argc argument to main is > non-negative, the following patch sets (or adjusts) the corresponding > SSA_NAME_RANGE_INFO. While main is just one, with IPA VRP it can also > propagate etc. I had to change one testcase because it started optimizing > it better (the test has been folded away), so no sinking was done. In C, unlike in C++, it's valid to call main recursively. I think the requirements on argc and argv must be understood to apply only to their values on entry to the initial call to main, not to any recursive calls. So I don't think this optimization is valid for C (in the absence of whole-program information that can prove the absence of any recursive calls). -- Joseph S. Myers jos...@codesourcery.com
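A rough sketch of the concern raised above, under stated assumptions: C++ does not permit calling main, so the function entry below is a hypothetical stand-in for a C main that calls itself. The point is only that a range derived from the startup call (argc >= 0) need not hold for a recursive call.

```cpp
// `entry` is a made-up stand-in for a C `main`; the value 42 is arbitrary.
constexpr int entry(int argc) {
  // If an optimizer assumed argc >= 0 for every call, it could fold the
  // comparison to false, even though the recursive call below passes -1.
  return argc < 0 ? 42 : entry(-1);
}

static_assert(entry(1) == 42, "the recursive call sees a negative argc");
static_assert(entry(-1) == 42, "a direct negative argument takes the same branch");
```

In C the analogous call would be main (-1, 0) from within main itself, which is why the range assumption would need whole-program knowledge to be safe there.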
Re: Fix libphobos testsuite failures on Solaris
On Tue, 29 Jan 2019 at 15:44, Rainer Orth wrote:
>
> Yet another trivial fix for a Solaris libphobos testsuite failure:
>
> FAIL: libphobos.shared/load.d -shared-libphobos -ldl (test for excess errors)
> Excess errors:
> /vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared/load.d:9:
> error: static assert "unimplemented"
>
> I guess this is obvious? Tested on i386-pc-solaris2.11. Ok for
> mainline?
>

Looks ok.

As the OS-specific bindings are only imported for RTLD_NOLOAD, this could be made explicit in the static assert.

---
import core.sys.posix.dlfcn;
version (DragonFlyBSD) import core.sys.dragonflybsd.dlfcn : RTLD_NOLOAD;
version (FreeBSD) import core.sys.freebsd.dlfcn : RTLD_NOLOAD;
version (linux) import core.sys.linux.dlfcn : RTLD_NOLOAD;
version (NetBSD) import core.sys.netbsd.dlfcn : RTLD_NOLOAD;
version (OSX) import core.sys.darwin.dlfcn : RTLD_NOLOAD;
version (Solaris) import core.sys.solaris.dlfcn : RTLD_NOLOAD;

static assert(__traits(compiles, RTLD_NOLOAD), "unimplemented");
---

--
Iain
Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)
On Mon, Feb 18, 2019 at 11:55:56PM +0100, Jakub Jelinek wrote: > On Mon, Feb 18, 2019 at 04:47:57PM -0600, Segher Boessenkool wrote: > > On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote: > > > Both the C and C++ standard guarantee that the argc argument to main is > > > non-negative, the following patch sets (or adjusts) the corresponding > > > SSA_NAME_RANGE_INFO. > > > > I think this should test for flag_hosted somehow? And check that this is > > Why? Does -ffreestanding mean it can violate the C/C++ requirements? No, but nothing is required of the arguments to the main function in a freestanding implementation. C11 5.1.2.1/1. > AFAIK we don't guard other MAIN_NAME_P uses in the compiler with C/C++ > checks. E.g. "Nothing escapes by returning from main though." in > tree-ssa-structalias.c, various other spots. GCC hasn't historically required "int" for the first argument of the main function, as far as I know. This is separate from saying the main function is called "main". Segher
[C++ PATCH] Fix maybe_generic_this_capture ICE on USING_DECL (PR c++/89387)
Hi! On the following testcase, id_expr is false and TREE_CODE (*iter) is USING_DECL (and the following one is FUNCTION_DECL). Since the USING_DECL changes, this ICEs because DECL_NONSTATIC_MEMBER_FUNCTION_P uses TREE_TYPE which can't be used here. Previously, I believe DECL_NONSTATIC_MEMBER_FUNCTION_P would be never true for USING_DECLs. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or should it use != USING_DECL instead (what should be DECL_NONSTATIC_MEMBER_FUNCTION_P checked on other than FUNCTION_DECL/TEMPLATE_DECL)? 2019-02-18 Jakub Jelinek PR c++/89387 * lambda.c (maybe_generic_this_capture): Don't check DECL_NONSTATIC_MEMBER_FUNCTION_P on USING_DECLs. * g++.dg/cpp0x/lambda/lambda-89387.C: New test. --- gcc/cp/lambda.c.jj 2019-02-18 20:48:32.112741017 +0100 +++ gcc/cp/lambda.c 2019-02-18 21:49:23.319629179 +0100 @@ -941,7 +941,8 @@ maybe_generic_this_capture (tree object, fns = TREE_OPERAND (fns, 0); for (lkp_iterator iter (fns); iter; ++iter) - if ((!id_expr || TREE_CODE (*iter) == TEMPLATE_DECL) + if (((!id_expr && TREE_CODE (*iter) == FUNCTION_DECL) + || TREE_CODE (*iter) == TEMPLATE_DECL) && DECL_NONSTATIC_MEMBER_FUNCTION_P (*iter)) { /* Found a non-static member. Capture this. */ --- gcc/testsuite/g++.dg/cpp0x/lambda/lambda-89387.C.jj 2019-02-18 21:56:46.410339001 +0100 +++ gcc/testsuite/g++.dg/cpp0x/lambda/lambda-89387.C2019-02-18 21:55:58.869119054 +0100 @@ -0,0 +1,11 @@ +// PR c++/89387 +// { dg-do compile { target c++11 } } + +template class T> +struct S { + using A = int; + using B = T; + using B::foo; + void bar () { [&] { foo (); }; } + void foo (); +}; Jakub
[C++ PATCH] Avoid ICE on void to type&& reinterpret_cast (PR c++/89391)
Hi! The if (TYPE_REF_IS_RVALUE (type)) code has been added recently, but build_target_expr_with_type asserts that the expression doesn't have void type. Fixed by using the old handling in that case (the expression is not lvalue in that case and diagnostics is emitted if complain). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2019-02-18 Jakub Jelinek PR c++/89391 * typeck.c (build_reinterpret_cast_1): Don't handle void to && conversion go through build_target_expr_with_type. * g++.dg/cpp0x/reinterpret_cast2.C: New test. --- gcc/cp/typeck.c.jj 2019-01-30 08:35:46.990055278 +0100 +++ gcc/cp/typeck.c 2019-02-18 21:19:09.727590300 +0100 @@ -7477,7 +7477,7 @@ build_reinterpret_cast_1 (tree type, tre reinterpret_cast. */ if (TYPE_REF_P (type)) { - if (TYPE_REF_IS_RVALUE (type)) + if (TYPE_REF_IS_RVALUE (type) && !VOID_TYPE_P (intype)) { if (!obvalue_p (expr)) /* Perform the temporary materialization conversion. */ --- gcc/testsuite/g++.dg/cpp0x/reinterpret_cast2.C.jj 2019-02-18 21:27:24.844391776 +0100 +++ gcc/testsuite/g++.dg/cpp0x/reinterpret_cast2.C 2019-02-18 21:27:05.261723238 +0100 @@ -0,0 +1,10 @@ +// PR c++/89391 +// { dg-do compile { target c++11 } } + +struct S { }; + +void +foo () +{ + auto a = reinterpret_cast(foo ()); // { dg-error "invalid cast of an rvalue expression of type 'void' to type" } +} Jakub
Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)
On Mon, Feb 18, 2019 at 04:47:57PM -0600, Segher Boessenkool wrote: > On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote: > > Both the C and C++ standard guarantee that the argc argument to main is > > non-negative, the following patch sets (or adjusts) the corresponding > > SSA_NAME_RANGE_INFO. > > I think this should test for flag_hosted somehow? And check that this is Why? Does -ffreestanding mean it can violate the C/C++ requirements? AFAIK we don't guard other MAIN_NAME_P uses in the compiler with C/C++ checks. E.g. "Nothing escapes by returning from main though." in tree-ssa-structalias.c, various other spots. > a C-like language anyway? The patch used to do that check, but I think we should be able to avoid that. I think in other languages main is just a C wrapper or compiler generated C-like wrapper that actually calls the main program's subroutine and so the C requirements apply to it too. Jakub
[C++ PATCH] Don't ICE on invalid scoped enum E::~E (PR c++/89390)
Hi!

On the following testcase we ICE because name is BIT_NOT_EXPR and suggest_alternative_in_scoped_enum assumes it is called on IDENTIFIER_NODE only.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

There is another issue, starting with 7.x we don't use sensible location in the diagnostics, 6.x emitted

pr89390.C: In function ‘void foo()’:
pr89390.C:9:3: error: ‘~A’ is not a member of ‘A’
   A::~A (); // { dg-error "'~A' is not a member of 'A'" }
   ^

but 7.x and later emits:

In function ‘void foo()’:
cc1plus: error: ‘~A’ is not a member of ‘A’

This patch doesn't deal with that, but would be nice to provide location, dunno if it is enough to just use location of ~, or if we need to spend memory and build ~A as combined range with caret on ~.

2019-02-18  Jakub Jelinek

        PR c++/89390
        * error.c (qualified_name_lookup_error): Only call
        suggest_alternative_in_scoped_enum if name is IDENTIFIER_NODE.

        * g++.dg/diagnostic/pr89390.C: New test.

--- gcc/cp/error.c.jj 2019-01-17 09:03:11.486787567 +0100
+++ gcc/cp/error.c 2019-02-18 20:56:48.047604338 +0100
@@ -4276,7 +4276,7 @@ qualified_name_lookup_error (tree scope,
   else
     {
       name_hint hint;
-      if (SCOPED_ENUM_P (scope))
+      if (SCOPED_ENUM_P (scope) && TREE_CODE (name) == IDENTIFIER_NODE)
         hint = suggest_alternative_in_scoped_enum (name, scope);
       if (const char *suggestion = hint.suggestion ())
         {
--- gcc/testsuite/g++.dg/diagnostic/pr89390.C.jj 2019-02-18 20:58:47.358646700 +0100
+++ gcc/testsuite/g++.dg/diagnostic/pr89390.C 2019-02-18 20:58:13.746198205 +0100
@@ -0,0 +1,10 @@
+// PR c++/89390
+// { dg-do compile { target c++11 } }
+
+enum class A { B, C };
+
+void
+foo ()
+{
+  A::~A (); // { dg-error "'~A' is not a member of 'A'" "" { target *-*-* } 0 }
+}

        Jakub
Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)
Hi Jakub, On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote: > Both the C and C++ standard guarantee that the argc argument to main is > non-negative, the following patch sets (or adjusts) the corresponding > SSA_NAME_RANGE_INFO. I think this should test for flag_hosted somehow? And check that this is a C-like language anyway? Segher
[PR fortran/89266, patch] - ICE with TRANSFER of len=0 character array constructor
The issue in the PR is caused during simplification in the frontend because it does not properly differentiate between expressions of size 0 (e.g. arrays of length 0 or character strings of len=0) and failure. The attached patch tries to solve this problem by modifying the helper functions gfc_element_size and gfc_target_expr_size to return a bool when simplification fails. All users of these functions needed adjustment, most of which was more or less mechanical. There was one case left (in check.c) where I am unsure if I got the logic right. In the worst case it should produce a new bug for code that would have generated an ICE before. Since the above fix also works for non-character arrays of length 0, I added a suitable test. Regtested on x86_64-pc-linux-gnu. OK for trunk? Or rather wait for post-9.1? Thanks, Harald 2019-02-18 Harald Anlauf PR fortran/89266 * target-memory.c (gfc_element_size): Return false if element size cannot be determined; element size is returned separately. (gfc_target_expr_size): Return false if expression size cannot be determined; expression size is returned separately. * target-memory.h: Adjust prototypes. * check.c (gfc_calculate_transfer_sizes): Adjust references to gfc_target_expr_size, gfc_element_size. * arith.c (hollerith2representation): Likewise. * class.c (find_intrinsic_vtab): Likewise. * simplify.c (gfc_simplify_sizeof): Likewise. 2019-02-18 Harald Anlauf PR fortran/89266 * gfortran.dg/pr89266.f90: New test. 
Index: gcc/fortran/arith.c
===
--- gcc/fortran/arith.c (revision 268993)
+++ gcc/fortran/arith.c (working copy)
@@ -2548,10 +2548,10 @@
 static void
 hollerith2representation (gfc_expr *result, gfc_expr *src)
 {
-  int src_len, result_len;
+  size_t src_len, result_len;

   src_len = src->representation.length - src->ts.u.pad;
-  result_len = gfc_target_expr_size (result);
+  gfc_target_expr_size (result, &result_len);

   if (src_len > result_len)
     {
Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c (revision 268993)
+++ gcc/fortran/check.c (working copy)
@@ -5480,16 +5480,15 @@
     return false;

   /* Calculate the size of the source.  */
-  *source_size = gfc_target_expr_size (source);
-  if (*source_size == 0)
+  if (!gfc_target_expr_size (source, source_size))
     return false;

   /* Determine the size of the element.  */
-  result_elt_size = gfc_element_size (mold);
-  if (result_elt_size == 0)
+  if (!gfc_element_size (mold, &result_elt_size))
     return false;

-  if (mold->expr_type == EXPR_ARRAY || mold->rank || size)
+  if ((result_elt_size > 0 && (mold->expr_type == EXPR_ARRAY || mold->rank))
+      || size)
     {
       int result_length;

Index: gcc/fortran/class.c
===
--- gcc/fortran/class.c (revision 268993)
+++ gcc/fortran/class.c (working copy)
@@ -2666,6 +2666,7 @@
           gfc_namespace *sub_ns;
           gfc_namespace *contained;
           gfc_expr *e;
+          size_t e_size;

           gfc_get_symbol (name, ns, &vtype);
           if (!gfc_add_flavor (&vtype->attr, FL_DERIVED, NULL,
@@ -2700,11 +2701,13 @@
               e = gfc_get_expr ();
               e->ts = *ts;
               e->expr_type = EXPR_VARIABLE;
+              if (ts->type == BT_CHARACTER)
+                e_size = ts->kind;
+              else
+                gfc_element_size (e, &e_size);
               c->initializer = gfc_get_int_expr (gfc_size_kind,
                                                  NULL,
-                                                 ts->type == BT_CHARACTER
-                                                 ? ts->kind
-                                                 : gfc_element_size (e));
+                                                 e_size);
               gfc_free_expr (e);

               /* Add component _extends.  */
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c (revision 268993)
+++ gcc/fortran/simplify.c (working copy)
@@ -7379,6 +7379,7 @@
 {
   gfc_expr *result = NULL;
   mpz_t array_size;
+  size_t res_size;

   if (x->ts.type == BT_CLASS || x->ts.deferred)
     return NULL;
@@ -7394,7 +7395,8 @@

   result = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
                                   &x->where);
-  mpz_set_si (result->value.integer, gfc_target_expr_size (x));
+  gfc_target_expr_size (x, &res_size);
+  mpz_set_si (result->value.integer, res_size);

   return result;
 }
@@ -7408,6 +7410,7 @@
 {
   gfc_expr *result = NULL;
   int k;
+  size_t siz;

   if (x->ts.type == BT_CLASS || x->ts.deferred)
     return NULL;
@@ -7423,7 +7426,8 @@

   result = gfc_get_constant_expr (BT_INTEGER, k, &x->where);
-  mpz_set_si
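The interface change described in this message (report success through the return value, hand the size back separately) can be sketched in isolation. Everything below is hypothetical: Kind, Expr, element_size and size_or_failure are made-up names, the 4-byte integer size is an arbitrary assumption, and none of it is the gfortran code. It only illustrates why a size of 0 and a failure need different encodings.

```cpp
#include <cstddef>

// Made-up miniature of an expression node; not a gfortran data structure.
enum class Kind { Integer, Character, Unknown };
struct Expr { Kind kind; std::size_t length; };

// Returns false when the size cannot be determined. On success the size is
// written through `size`, and 0 is a perfectly valid result.
constexpr bool element_size(const Expr &e, std::size_t *size) {
  if (e.kind == Kind::Integer) {
    *size = 4;  // assumed width, for illustration only
    return true;
  }
  if (e.kind == Kind::Character) {
    *size = e.length;  // may legitimately be 0 for a len=0 string
    return true;
  }
  return false;
}

// Small wrapper so the two outcomes can be compared in one expression.
constexpr long size_or_failure(const Expr &e) {
  std::size_t n = 0;
  return element_size(e, &n) ? static_cast<long>(n) : -1;
}

static_assert(size_or_failure(Expr{Kind::Character, 0}) == 0, "len=0 is a size, not an error");
static_assert(size_or_failure(Expr{Kind::Unknown, 0}) == -1, "unknown kind is a failure");
```

With the old convention of returning the size directly, both of the cases checked above would have produced 0 and been indistinguishable.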
[C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)
Hi!

As mentioned in the PR, we've regressed on the trunk in diagnostics of some invalid constexpr evaluations. The problem is that the constexpr evaluation is effectively done on post-cp_fold_function bodies/arguments and cp_fold optimizes away some important trees for constexpr diagnostics, either itself, or through using GENERIC match.pd (on the testcase in particular diagnostics about reinterpret_cast). While we save the pre-cp_fold_function bodies of the functions in the constexpr call hash table, due to sharing and cp_fold_r the STATEMENT_LIST statements etc. are modified directly, and genericization modifies them as well.

The following patch uses copy_fn, which we have already been using for the recursive constexpr cases, to also make a copy of the constexpr function before cp_fold_function clobbers it. I had to implement cxx_eval_conditional_expression handling of various C++ FE statements that are replaced during genericization.

Bootstrapped/regtested on x86_64-linux and i686-linux (98,11,14,17,2a), ok for trunk?

2019-02-18  Jakub Jelinek

        PR c++/89285
        * constexpr.c (struct constexpr_fundef): Add parms and result members.
        (retrieve_constexpr_fundef): Adjust for the above change.
        (register_constexpr_fundef): Save constexpr body with copy_fn,
        temporarily set DECL_CONTEXT on DECL_RESULT before that.
        (get_fundef_copy): Change FUN argument to FUNDEF with
        constexpr_fundef * type, grab body and parms/result out of
        constexpr_fundef struct and temporarily change it for copy_fn calls
        too.
        (cxx_eval_builtin_function_call): For __builtin_FUNCTION temporarily
        adjust current_function_decl from ctx->call context.  For arith
        overflow builtins, don't test is_constant_expression on the result,
        instead test if arguments are suitable constant expressions.
        (cxx_bind_parameters_in_call): Grab parameters from new_call.  Undo
        convert_for_arg_passing changes for TREE_ADDRESSABLE type passing.
        (cxx_eval_call_expression): Adjust get_fundef_copy caller.
(cxx_eval_conditional_expression): For IF_STMT, allow then or else operands to be NULL. (label_matches): Handle BREAK_STMT and CONTINUE_STMT. (cxx_eval_loop_expr): Add support for FOR_STMT, WHILE_STMT and DO_STMT. (cxx_eval_switch_expr): Add support for SWITCH_STMT. (cxx_eval_constant_expression): Handle IF_STMT, FOR_STMT, WHILE_STMT, DO_STMT, CONTINUE_STMT, SWITCH_STMT, BREAK_STMT and CONTINUE_STMT. For SIZEOF_EXPR, recurse on the result of fold_sizeof_expr. Ignore DECL_EXPR with USING_DECL operand. * lambda.c (maybe_add_lambda_conv_op): Build thisarg using build_int_cst to make it a valid constant expression. * g++.dg/ubsan/vptr-4.C: Expect reinterpret_cast errors. * g++.dg/cpp1y/constexpr-84192.C (f2): Adjust expected diagnostics. * g++.dg/cpp1y/constexpr-70265-2.C (foo): Adjust expected line of diagnostics. * g++.dg/cpp1y/constexpr-89285.C: New test. --- gcc/cp/constexpr.c.jj 2019-02-17 17:09:47.113351897 +0100 +++ gcc/cp/constexpr.c 2019-02-18 19:34:57.995136395 +0100 @@ -139,6 +139,8 @@ ensure_literal_type_for_constexpr_object struct GTY((for_user)) constexpr_fundef { tree decl; tree body; + tree parms; + tree result; }; struct constexpr_fundef_hasher : ggc_ptr_hash @@ -176,11 +178,10 @@ constexpr_fundef_hasher::hash (constexpr static constexpr_fundef * retrieve_constexpr_fundef (tree fun) { - constexpr_fundef fundef = { NULL, NULL }; if (constexpr_fundef_table == NULL) return NULL; - fundef.decl = fun; + constexpr_fundef fundef = { fun, NULL, NULL, NULL }; return constexpr_fundef_table->find (); } @@ -897,8 +898,19 @@ register_constexpr_fundef (tree fun, tre = hash_table::create_ggc (101); entry.decl = fun; - entry.body = body; + tree saved_fn = current_function_decl; + bool clear_ctx = false; + current_function_decl = fun; + if (DECL_RESULT (fun) && DECL_CONTEXT (DECL_RESULT (fun)) == NULL_TREE) +{ + clear_ctx = true; + DECL_CONTEXT (DECL_RESULT (fun)) = fun; +} + entry.body = copy_fn (fun, entry.parms, entry.result); + current_function_decl = saved_fn; 
slot = constexpr_fundef_table->find_slot (, INSERT); + if (clear_ctx) +DECL_CONTEXT (DECL_RESULT (fun)) = NULL_TREE; gcc_assert (*slot == NULL); *slot = ggc_alloc (); @@ -1114,27 +1126,40 @@ maybe_initialize_fundef_copies_table () is parms, TYPE is result. */ static tree -get_fundef_copy (tree fun) +get_fundef_copy (constexpr_fundef *fundef) { maybe_initialize_fundef_copies_table (); tree copy; bool existed; - tree *slot = _copies_table->get_or_insert (fun, ); + tree *slot = _copies_table->get_or_insert (fundef->decl, ); if (!existed) { /* There is no cached function available, or in use. We can use the function directly. That
Re: [libphobos, build] Enable libphobos on Solaris 11/x86
On Tue, 29 Jan 2019 at 13:35, Rainer Orth wrote: > > With the set of libphobos Solaris patches just posted, it would become > possible to enable libphobos on Solaris 11/x86 by default. > > This is what this patch does. > > * It uses a LIBPHOBOS_SUPPORTED variable both in toplevel configure and > libphobos/configure.tgt, following what libvtv does. > > * It's necessary to disable libphobos when Solaris as is in use: it has > a relatively low line length limit of 10240 which is exceeded in a few > libphobos files. > > Bootstrapped without regressions on i386-pc-solaris2.11 (as and gas, gas > and gld, Solaris 11.3/11.4/11.5) on top of the previous set of patches. > > Also tested manually that explicit > --enable-libphobos/--disable-libphobos give the desired results > (i.e. override the defaults). > OK. -- Iain
Re: [build] Fix libgphobos linking on Solaris 11
On Tue, 27 Nov 2018 at 23:28, Rainer Orth wrote: > > As mentioned in passing in PR d/87864, libgphobos.so currently fails to > link before Solaris 11.4. Until then, you needed to link with -lsocket > -lnsl for the networking functions, in S11.4 they were merged into libc. > > To fix this, I've adapted the check from libgo/configure.ac, for the > moment just moving it into an autoconf macro, reindenting it, renaming > the variables for the new location, and removing the check for sendfile > which isn't used in libphobos. > > With that patch (and the one from PR d/87864 to provide > __start_minfo/__stop_minfo when ld does not), I could bootstrap with > --enable-libphobos on i386-pc-solaris2.11 with gas and > sparc-sun-solaris2.11 with as on both S11.3 and S11.4. On the former, > libsocket and libnsl were properly detected and linked into > libgdruntime.so and libgphobos.so, leaving no undefined symbols, while > on the latter nothing more than libc is needed. > > Ok for mainline? > Hi, Sorry, somehow I missed this and other libphobos related patches. I see no problem with this if still needed, thanks. -- Iain
Re: [PATCH, RFC] Avoid the -D option which is not available install-sh
On Sat, 16 Feb 2019 at 13:58, Bernd Edlinger wrote: > > On 2/9/19 7:21 PM, Bernd Edlinger wrote: > > On 2/9/19 7:18 PM, Jakub Jelinek wrote: > >> On Sat, Feb 09, 2019 at 06:11:00PM +, Bernd Edlinger wrote: > >>> --- libphobos/libdruntime/Makefile.am (revision 268614) > >>> +++ libphobos/libdruntime/Makefile.am (working copy) > >>> @@ -140,10 +140,12 @@ clean-local: > >>> # Handles generated files as well > >>> install-data-local: > >>> for file in $(ALL_DRUNTIME_INSTALL_DSOURCES); do \ > >>> + $(MKDIR_P) `echo $(DESTDIR)$(gdc_include_dir)/$$file \ > >>> + | sed -e 's:/[^/]*$$::'` ; \ > >> > >> Perhaps better `dirname $(DESTDIR)$(gdc_include_dir)/$$file` ? > >> > > > > Ah, yes, good point. > > > > Consider it changed. > > > > > > So here is the latest version with the requested change. > Looks ok to me. > How is the procedure with libpobos patches? > Can we check them into the gcc svn, or will Ian have to > push them first into the upstream? > See libphobos/README.gcc regarding what sources are part of upstream. Anything else that isn't listed is local to gcc svn, and can be committed directly. -- Iain
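As a side illustration of the dirname suggestion quoted above (not part of the patch), here is a minimal Python sketch that models both ways of computing the directory part of an install path. The example path and the function names are made up for this sketch, and it only covers paths that contain a directory component, which should be the case for $(DESTDIR)$(gdc_include_dir)/$$file.

```python
import posixpath
import re


def dir_via_sed(path):
    """Model of sed -e 's:/[^/]*$::': drop the last '/' and what follows it."""
    return re.sub(r"/[^/]*$", "", path)


def dir_via_dirname(path):
    """Model of dirname(1) for a path that has a directory part."""
    return posixpath.dirname(path)


# Hypothetical install path, chosen only for illustration.
example = "/tmp/destdir/usr/include/d/core/sys/posix/unistd.d"
print(dir_via_sed(example))
print(dir_via_dirname(example))
```

For paths of this shape both functions return the same directory, which is why the shorter dirname form can stand in for the sed expression.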
Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")
Hi Jason, On 18/02/19 19:28, Jason Merrill wrote: On 2/18/19 5:31 AM, Paolo Carlini wrote: Hi Jason, On 18/02/19 10:20, Jason Merrill wrote: On 2/17/19 6:58 AM, Paolo Carlini wrote: Hi, here, when we don't see an initializer we believe we are surely dealing with a case of C++17 template argument deduction for class templates, but, in fact, it's just an ill-formed C++14 template variable specialization. Conveniently, we can use here too the predicate variable_template_specialization_p. Not 100% sure about the exact wording of the error message, I added '#' to %qD to explicitly print the auto-using type too. I guess we should change the assert to a test, so that we give the error if we aren't dealing with a class template placeholder. Variable templates don't seem to be important to test for. Thanks, simpler patch. This error is also pretty poor for this testcase, where there is an initializer. Well, implementation-wise, certainly init == NULL_TREE and only when we have an empty pack this specific issue occurs. In practice, clang simply talks about an empty initializer (during instantiation, etc, like we do), whereas EDG explicitly says that pack expansion produces an empty list of expressions. I don't think that in cp_finish_decl it would be easy for us to do exactly the same, we simply see a NULL_TREE as second argument. Or we could just *assume* that we are dealing with the outcome of a pack expansion, say something like EDG even if we don't have details beyond the fact that init == NULL_TREE. I believe that without a variadic template the problem cannot occur, because we catch the empty initializer much earlier, in grokdeclarator - indeed using a !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? Again "instantiated for an empty pack" or something similar? Perhaps we could complain in the code for empty pack expansion handling in tsubst_init? Ah, thanks Jason. 
In fact, however, tsubst_init isn't currently involved at all because, at the end of regenerate_decl_from_template, we call tsubst_expr by hand and assign the result to DECL_INITIAL. Simply changing that avoids the ICE. However, the error we issue - likewise for the existing cpp0x/auto31.C - is the rather user-unfriendly "value-initialization of incomplete type ‘auto’", as produced by build_value_init. Thus I added a simple additional test along the lines already discussed, which now becomes much simpler to implement in a precise way. Again, the wording is only tentative. I'm also a little puzzled that, otherwise, we could get away with tsubst_expr instead of tsubst_init... Thanks, Paolo. // Index: cp/pt.c === --- cp/pt.c (revision 268995) +++ cp/pt.c (working copy) @@ -15424,6 +15424,14 @@ tsubst_init (tree init, tree decl, tree args, if (!init && TREE_TYPE (decl) != error_mark_node) { + if (type_uses_auto (TREE_TYPE (decl))) + { + if (complain & tf_error) + error ("initializer for %q#D expands to an empty list " + "of expressions", decl); + return error_mark_node; + } + /* If we had an initializer but it instantiated to nothing, value-initialize the object. This will @@ -24053,9 +24061,8 @@ regenerate_decl_from_template (tree decl, tree tmp { start_lambda_scope (decl); DECL_INITIAL (decl) = - tsubst_expr (DECL_INITIAL (code_pattern), args, -tf_error, DECL_TI_TEMPLATE (decl), -/*integral_constant_expression_p=*/false); + tsubst_init (DECL_INITIAL (code_pattern), decl, args, +tf_error, DECL_TI_TEMPLATE (decl)); finish_lambda_scope (); if (VAR_HAD_UNKNOWN_BOUND (decl)) TREE_TYPE (decl) = tsubst (TREE_TYPE (code_pattern), args, Index: testsuite/g++.dg/cpp1y/var-templ60.C === --- testsuite/g++.dg/cpp1y/var-templ60.C(nonexistent) +++ testsuite/g++.dg/cpp1y/var-templ60.C(working copy) @@ -0,0 +1,9 @@ +// PR c++/84536 +// { dg-do compile { target c++14 } } + +template auto foo(N...); // { dg-error "initializer" } + +void bar() +{ + foo<>(); +}
[patch, fortran] Fix PR 89384
Hello world, this patch fixes the 9 regression in C interop with contiguous arguments recently reported by Reinhold Bader. ChangeLog and patch say it all. I hope I didn't overlook any obvious things here (Paul, maybe you can take a look). Regression-tested. OK for trunk? Regards Thomas 2019-02-18 Thomas Koenig PR fortran/89384 * trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): If the dummy argument is contiguous and the actual argument may not be, use gfc_conv_subref_array_arg. 2019-02-18 Thomas Koenig PR fortran/89384 * gfortran.dg/ISO_Fortran_binding_4.f90 Index: trans-expr.c === --- trans-expr.c (Revision 268992) +++ trans-expr.c (Arbeitskopie) @@ -4944,7 +4944,12 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc if (e->rank != 0) { - gfc_conv_expr_descriptor (parmse, e); + if (fsym->attr.contiguous + && !gfc_is_simply_contiguous (e, false, true)) + gfc_conv_subref_array_arg (parmse, e, false, fsym->attr.intent, + fsym->attr.pointer); + else + gfc_conv_expr_descriptor (parmse, e); if (POINTER_TYPE_P (TREE_TYPE (parmse->expr))) parmse->expr = build_fold_indirect_ref_loc (input_location, ! { dg-do run } ! PR fortran/89384 - this used to give a wrong results ! with contiguous. ! Test case by Reinhold Bader. module mod_ctg implicit none contains subroutine ctg(x) BIND(C) real, contiguous :: x(:) if (any(abs(x - [2.,4.,6.]) > 1.e-6)) then write(*,*) 'FAIL' else write(*,*) 'OK' end if end subroutine end module program p use mod_ctg implicit none real :: x(6) integer :: i x = [ (real(i), i=1, size(x)) ] call ctg(x(2::2)) end program
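To make the contiguity issue easier to picture, here is a small Python model. It is only a sketch under stated assumptions: the function names are invented, the "memory" is a plain list, and the copy-in/copy-out step is one plausible reading of what routing the argument through gfc_conv_subref_array_arg achieves, not a description of the actual generated code.

```python
def callee_assuming_contiguous(memory, base, n):
    """A callee with a 'contiguous' dummy: it reads n adjacent elements."""
    return memory[base:base + n]


def call_without_packing(memory, start, stride, n):
    # Hands over the address of the first element and ignores the stride,
    # so the callee walks over the wrong elements when stride != 1.
    return callee_assuming_contiguous(memory, start, n)


def call_with_packing(memory, start, stride, n):
    # Copy-in: gather the strided section into a contiguous temporary.
    temp = [memory[start + i * stride] for i in range(n)]
    result = callee_assuming_contiguous(temp, 0, n)
    # Copy-out: scatter the temporary back into the original section.
    for i in range(n):
        memory[start + i * stride] = temp[i]
    return result


# Mirrors the test case: x = [1..6], actual argument x(2::2).
x = [float(i) for i in range(1, 7)]
print(call_without_packing(x, 1, 2, 3))  # [2.0, 3.0, 4.0]
print(call_with_packing(x, 1, 2, 3))     # [2.0, 4.0, 6.0]
```

Only the packed call hands the callee the values [2, 4, 6] that the test case checks for.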
Re: [PATCH, RFC] Avoid the -D option which is not available install-sh
Hi Bernd, On 16.02.19 at 13:58, Bernd Edlinger wrote: > So here is the latest version with the requested change. > How is the procedure with libphobos patches? > Can we check them into the gcc svn, or will Iain have to > push them first into the upstream? Most phobos/druntime changes should be upstreamed first: This ensures that we do not unintentionally revert changes on the next merge with upstream. High-priority fixes can probably also be pushed to gdc before they're merged upstream. However, in this case we can push this without any upstream interaction either way: Upstream does not use the autoconf/automake build system. They use plain Makefiles that are completely unrelated to the build system we use here. Patch looks good to me, but Iain has to approve this. Best regards, Johannes
Re: C++ PATCH to fix eb82.C
On 2/17/19 11:54 AM, Marek Polacek wrote: On Sat, Feb 16, 2019 at 03:54:21PM -0500, Marek Polacek wrote: I noticed this test fails in c++2a since the implementation of P0846 landed in r265734. Since it's in g++.old-deja/, I never noticed the failure (but I don't see any others). This patch tweaks a dg-error in order to make it pass in c++2a also. Tested on x86_64-linux, ok for trunk? 2019-02-16 Marek Polacek * g++.old-deja/g++.robertl/eb82.C: Tweak dg-error. diff --git gcc/testsuite/g++.old-deja/g++.robertl/eb82.C gcc/testsuite/g++.old-deja/g++.robertl/eb82.C index 9bf0398cd0a..fc2bf7866fe 100644 --- gcc/testsuite/g++.old-deja/g++.robertl/eb82.C +++ gcc/testsuite/g++.old-deja/g++.robertl/eb82.C @@ -9,5 +9,5 @@ double val () // { dg-error "" } bogus code int main () { - printf ("%d\n", val<(int)3> ()); // { dg-error "" } val undeclared + printf ("%d\n", val<(int)3> ()); // { dg-error "" "" { target c++17_down } } val undeclared } Actually I'll just go ahead with this, should be obvious anyway. I had also noticed this test failing, and when investigating noticed that the remaining error strangely talked about a partial specialization. This patch fixes that: commit 848fa7b9ab2a55d4d3bbf791c828fc3ce60d61fa Author: Jason Merrill Date: Mon Feb 18 10:05:31 2019 -1000 Improve diagnostic for redundant template arguments in declaration. * pt.c (check_explicit_specialization): If the declarator is a template-id, only check whether the arguments are dependent.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 48cbf3d9892..d8be92ddca4 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -2849,7 +2849,7 @@ check_explicit_specialization (tree declarator, /* This case handles bogus declarations like template <> template void f(); */ - if (!uses_template_parms (declarator)) + if (!uses_template_parms (TREE_OPERAND (declarator, 1))) error ("template-id %qD in declaration of primary template", declarator); else if (variable_template_p (TREE_OPERAND (declarator, 0))) diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C index fc2bf7866fe..d4c5985cd8c 100644 --- a/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C +++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C @@ -2,7 +2,8 @@ #include template -double val () // { dg-error "" } bogus code +double val () // { dg-error "expected" "" { target c++17_down } } bogus code +// { dg-error "template-id .val. in declaration of primary template" "" { target c++2a } .-1 } { return (double) n1; }
Re: [PATCH] correct __clear_cache signature
Martin Sebor writes: > Recent libgcc builds have been triggering -Wbuiltin-declaration-mismatch > due to the declaration of the __clear_cache built-in being incompatible > with how GCC declares it internally. The attached patch adjusts > the libgcc declaration and the one in the manual to match what GCC > expects. > > Tested on x86_64-linux. OK, thanks. Richard
Re: Trivial doc typos
Sharon Dvir writes: > Description: fixed a couple of typos in testsuite/README. > Testing: make dvi, make info, although I doubt needed. Applied, thanks. Richard
Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER
On Mon, Feb 18, 2019 at 10:30 PM Thomas Koenig wrote: > Hi Janne, > > > I'm not really sure if there is any good reason why GFortran occasionally > > generates these varargs declarations, hence my suggestion to get rid of > > them. Unless the middle-end is planning to get rid of untyped function > > decls? > > Are they still being generated after the patch went in? I haven't checked whether your patch fixes all such cases. How do we even conclusively prove it, except by just getting rid of that code path? :) > I'm not sure, > but because I wanted to change as little as possible, I did not try > to change that aspect of the code. > I fully agree, this close to the release we shouldn't do any surgery which isn't absolutely necessary. -- Janne Blomqvist
Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER
Hi Janne, I'm not really sure if there is any good reason why GFortran occasionally generates these varargs declarations, hence my suggestion to get rid of them. Unless the middle-end is planning to get rid of untyped function decls? Are they still being generated after the patch went in? I'm not sure, but because I wanted to change as little as possible, I did not try to change that aspect of the code. Regards Thomas
Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)
On Mon, Feb 18, 2019 at 09:15:39PM +0100, Rainer Orth wrote: > 2019-02-15 Rainer Orth > > * g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to > bad_weak_ptr_. Ok, thanks. If needed, guess we could rename much more (or rename the namespace in which most of it is from std to my_std, though we'd need to check for stuff that needs to be in std namespace). > # HG changeset patch > # Parent 056fe4093ce40dc462c6b50c3ae49df032a92230 > Fix g++.dg/torture/pr89303.C with Solaris ld > > diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C > b/gcc/testsuite/g++.dg/torture/pr89303.C > --- a/gcc/testsuite/g++.dg/torture/pr89303.C > +++ b/gcc/testsuite/g++.dg/torture/pr89303.C > @@ -350,11 +350,11 @@ namespace std >{ return static_cast(_M_addr()); } > }; > > - class bad_weak_ptr { }; > + class bad_weak_ptr_ { }; > >inline void >__throw_bad_weak_ptr() > - { (throw (bad_weak_ptr())); } > + { (throw (bad_weak_ptr_())); } > > class _Sp_counted_base > { Jakub
Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER
On Mon, Feb 18, 2019 at 7:25 PM Segher Boessenkool < seg...@kernel.crashing.org> wrote:
> On Mon, Feb 18, 2019 at 10:48:35AM +0200, Janne Blomqvist wrote:
> > I wonder if we shouldn't exorcise all the varargs stuff, it seems to
> > cause more problems than benefits? But not in stage4 if we can avoid
> > it..
>
> On the Power ABIs at least, unprototyped functions (a K&R thing for C) are
> handled the same as varargs (with zero fixed arguments). How does this
> tie in with Fortran requirements?

Varargs don't exist in Fortran. But we need some kind of support for so-called "implicit interfaces" (which was the only thing available before Fortran 90), which I guess are pretty similar to the K&R unprototyped functions. E.g. something like

    subroutine foo
      call bar(1, 2, 3.0)
    end subroutine foo

is perfectly valid code, even though discouraged by modern programming practice. Here the compiler can only deduce from the syntax that bar must be a subroutine that takes (int, int, float) arguments. And bar can be in another translation unit, so we have no idea what its actual interface is; the onus is on the programmer to make sure they match.

Similarly, from

    subroutine foo
      f = bar(1, 2)
      print *, f
    end subroutine foo

the compiler can deduce that bar is a function that takes (int, int) arguments, and returns a float (due to implicit typing rules).

However, as previously mentioned in this thread

    subroutine foo
      call bar(1, 2)
      f = bar(1, 2)
      print *, f
    end subroutine foo

is invalid since bar cannot be both a subroutine and a function.

Also, getting back to my first statement

    subroutine foo
      call bar(1, 2)
      call bar(1, 2, 3)
    end subroutine foo

is invalid since Fortran doesn't have vararg functions (well, with the newer "explicit interfaces", optional arguments are possible, but that's still not the same thing as varargs).

I'm not really sure if there is any good reason why GFortran occasionally generates these varargs declarations, hence my suggestion to get rid of them. Unless the middle-end is planning to get rid of untyped function decls?

-- Janne Blomqvist
Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)
Hi Jakub, >> The following testcase is miscompiled on x86_64-linux (-m32 and -m64) at >> -O1, as a pointer has two vars in points-to set, the first one is escaped >> heap var and the second one is escaped non-heap var, and in the end the last >> var that sets vars_contains_escaped won and overwrote >> vars_contains_escaped_heap rather than oring into it. >> >> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, >> preapproved by Richard on IRC, committed to trunk. >> Will test 8.x backport tonight and commit to 8.3 if that succeeds. >> >> 2019-02-13 Jakub Jelinek >> >> PR middle-end/89303 >> * tree-ssa-structalias.c (set_uids_in_ptset): Or in vi->is_heap_var >> into pt->vars_contains_escaped_heap instead of setting >> pt->vars_contains_escaped_heap to it. >> >> 2019-02-13 Jonathan Wakely >> Jakub Jelinek >> >> PR middle-end/89303 >> * g++.dg/torture/pr89303.C: New test. > > the new testcase FAILs on Solaris: > > +FAIL: g++.dg/torture/pr89303.C -O0 (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -O1 (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -O2 (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -O2 -flto (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -O2 -flto -flto-partition=none (test for > excess errors) > +FAIL: g++.dg/torture/pr89303.C -O3 -fomit-frame-pointer -funroll-loops > -fpeel-loops -ftracer -finline-functions (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -O3 -g (test for excess errors) > +FAIL: g++.dg/torture/pr89303.C -Os (test for excess errors) > > Excess errors: > ld: warning: symbol 'typeinfo for std::bad_weak_ptr' has differing sizes: > (file /var/tmp//ccB1o8Ya.o value=0x8; file > /var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/./libstdc++-v3/src/.libs/libstdc++.so > value=0xc); > /var/tmp//ccB1o8Ya.o definition taken > > I suspect the class can just be renamed in pr89303.C to avoid the > conflict with include/bits/shared_ptr_base.h? 
the following patch does this. I've verified that it still FAILs on x86_64-pc-linux-gnu before your patch and PASSes afterwards, as well as avoiding the linker warning on i386-pc-solaris2.11. Ok for mainline? Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University 2019-02-15 Rainer Orth * g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to bad_weak_ptr_. # HG changeset patch # Parent 056fe4093ce40dc462c6b50c3ae49df032a92230 Fix g++.dg/torture/pr89303.C with Solaris ld diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C b/gcc/testsuite/g++.dg/torture/pr89303.C --- a/gcc/testsuite/g++.dg/torture/pr89303.C +++ b/gcc/testsuite/g++.dg/torture/pr89303.C @@ -350,11 +350,11 @@ namespace std { return static_cast(_M_addr()); } }; - class bad_weak_ptr { }; + class bad_weak_ptr_ { }; inline void __throw_bad_weak_ptr() - { (throw (bad_weak_ptr())); } + { (throw (bad_weak_ptr_())); } class _Sp_counted_base {
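For readers skimming the quoted description of the original fix, the difference between overwriting a flag and or-ing into it can be sketched in a few lines of Python. This is a simplified model, not the code in tree-ssa-structalias.c: the Var class and both function names are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Var:
    escaped: bool
    is_heap_var: bool


def escaped_heap_overwrite(points_to):
    """Buggy variant: the last escaped variable decides the flag."""
    flag = False
    for var in points_to:
        if var.escaped:
            flag = var.is_heap_var
    return flag


def escaped_heap_or(points_to):
    """Fixed variant: the flag stays set once any escaped heap var is seen."""
    flag = False
    for var in points_to:
        if var.escaped:
            flag |= var.is_heap_var
    return flag


# Same shape as the described case: escaped heap var first,
# escaped non-heap var second.
pts = [Var(escaped=True, is_heap_var=True),
       Var(escaped=True, is_heap_var=False)]
print(escaped_heap_overwrite(pts))  # False
print(escaped_heap_or(pts))         # True
```

With the overwrite variant the result depends on the order of the variables in the set, which is the kind of behaviour the description attributes to the miscompilation.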
Re: Move -Wmaybe-uninitialized to -Wextra
On 2/4/19 3:52 PM, Martin Jambor wrote: > Hi, > > On Mon, Feb 04 2019, Marc Glisse wrote: >> On Mon, 4 Feb 2019, Martin Sebor wrote: >>> > > ... > >>> You're right that this is hard to imagine without first hand experience >>> with the problem. If this is a common pattern with the warning in C++ >>> class templates in general, a representative test case would help get >>> a better appreciation of the problem and might also give us an idea >>> of a better solution. (If there is one in Bugzilla please point me >>> at it.) >> >> Looking for "optional" and "-Wmaybe-uninitialized" shows >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78044 >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80635 >> >> Google also gives >> https://www.boost.org/doc/libs/1_69_0/libs/optional/doc/html/boost_optional/tutorial/gotchas/false_positive_with__wmaybe_uninitialized.html >> https://sourceware.org/ml/gdb-patches/2017-05/msg00130.html >> etc >> >> And that's just for using a type called 'optional' (3 implementations of >> it). > > from my very quick reading of the first googled testcase, I assume the > instance of the optional class got SRAed and a warning was generated for > what originally was a class member, which indeed is not easy to > initialize on its own in order to avoid the warning. > > Would it perhaps make sense to split the -Wmaybe-uninitialized warning > into two, one for scalars that are scalars in the original code and one > for SRA-created scalars and move only the latter to -Wextra? I could support that. It fits in with the general sense that we're not handling aggregates and addressables as well as we could. Jeff
Re: [PATCH] Handle timeout warnings in dg-extract-results
Hi Christophe, > dg-extract-results currently moves lines like > WARNING: program timed out > at the end of each .exp section when it generates .sum files. > > This is because it sorts its output based on the 2nd field, which is > normally the testname as in: > FAIL: gcc.c-torture/execute/20020129-1.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > > As you can notice 'program' comes after > gcc.c-torture/execute/20020129-1.c alphabetically, and generally after > most (all?) GCC testnames. > > This is a bit of a pain when trying to handle transient test failures > because you can no longer match such a WARNING line to its FAIL > counterpart. > > The attached patch changes this behavior by replacing the line > WARNING: program timed out > with > WARNING: gcc.c-torture/execute/20020129-1.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test program > timed out > > The effect is that this line will now appear immediately above the > FAIL: gcc.c-torture/execute/20020129-1.c -O2 -flto > -fno-use-linker-plugin -flto-partition=none execution test > so that it's easier to match them. > > > I'm not sure how much people depend on the .sum format, I also > considered emitting > WARNING: program timed out gcc.c-torture/execute/20020129-1.c -O2 > -flto -fno-use-linker-plugin -flto-partition=none execution test > > I also restricted the patch to handling only 'program timed out' > cases, to avoid breaking other things. > > I considered fixing this in Dejagnu, but it seemed more complicated, > and would delay adoption in GCC anyway. > > What do people think about this? I just had a case where your patch broke the generation of go.sum. 
This is on Solaris 11.5 with python 2.7.15: ro@colima 68 > /bin/ksh /vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.sh testsuite/go*/*.sum.sep > testsuite/go/go.sum Traceback (most recent call last): File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 605, in Prog().main() File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 569, in main self.parse_file (filename, file) File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 427, in parse_file num_variations) File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 311, in parse_run first_key = key UnboundLocalError: local variable 'key' referenced before assignment Before your patch, key cannot have been undefined, now it is. I've verified this by removing the WARNING: lines from the two affected go.sum.sep files and now go.sum creation just works fine. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
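The two behaviours discussed in this message can be modelled with a short Python sketch. It is not the code of contrib/dg-extract-results.py: the function names, the sample lines, and the way the test name is taken from the following result line are all assumptions made for illustration.

```python
TIMEOUT = "WARNING: program timed out"


def sort_by_second_field(lines):
    """Stable sort on the second whitespace-separated field."""
    return sorted(lines, key=lambda line: line.split()[1])


def tag_timeouts(lines):
    """Rewrite a bare timeout warning to carry the next line's test name."""
    out = []
    for i, line in enumerate(lines):
        if line == TIMEOUT and i + 1 < len(lines):
            test_name = lines[i + 1].split(": ", 1)[1]
            out.append("WARNING: " + test_name + " program timed out")
        else:
            out.append(line)
    return out


def first_key(lines):
    """Hypothetical loop that binds 'key' only for non-WARNING lines."""
    for line in lines:
        if line.startswith("WARNING:"):
            continue
        key = line.split()[1]
        break
    return key  # unbound if every line was skipped


log = [
    "PASS: gcc.dg/zzz.c (test for excess errors)",
    TIMEOUT,
    "FAIL: gcc.c-torture/execute/20020129-1.c -O2 execution test",
]
for line in sort_by_second_field(log):
    print(line)
print()
for line in sort_by_second_field(tag_timeouts(log)):
    print(line)
```

In the first listing the warning ends up last because "program" sorts after the test names; in the second it sits directly above its FAIL line. Calling first_key on a list that contains only WARNING lines raises UnboundLocalError, which is the same class of failure as in the traceback above, although the real script may reach it by a different path.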
Re: Move -Wmaybe-uninitialized to -Wextra
On 2/14/19 7:23 AM, Tom Tromey wrote: >> "Marc" == Marc Glisse writes: > >>> Lastly, in the case of uninitialized variables, the usual solution >>> of initializing them is trivial and always safe (some coding styles >>> even require it). > > Marc> Here it shows that we don't work with the same type of code at all. If > Marc> I am using a boost::optional, i.e. a class with a buffer and a boolean > Marc> that says if the buffer is initialized, how do I initialize the > Marc> (private) buffer? Or should boost itself zero out the buffer whenever > Marc> the boolean is set to false? > > This is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80635 (I know you > know, but maybe others on the thread don't). > > I think in this specific case (std::optional and similar classes), GCC > should provide a way for the class to indicate that > -Wmaybe-uninitialized should not apply to the payload. > >>> A shared definition of a false positive should be one of the very >>> first steps to coming closer to a consensus. Real world (as opposed >>> to anecdotal) data on the rates of actual rates of false positives >>> and negatives vs true positives would be also most helpful, as would >>> some consensus of the severity of the bugs the true positives >>> expose, as well as some objective measure of the ease of >>> suppression. There probably are others but these would be a start. > > Marc> This data is going to be super hard to get. Most projects have been > Marc> compiling for years and tweaking their code to avoid some warnings. We > Marc> do not get to see the code that people originally write, we can only > Marc> see what they commit. > > gdb has gone through this over the years -- it turns on many warnings > and sometimes false positives show up. Most of the time there's a > comment, for -Wmaybe-uninitialized grep for "init.*gcc" in the source. > Unfortunately the comment isn't standardized; but I only get ~20 hits > for this in gdb, so it isn't really so bad in practice. 
Yea, in retrospect we should have had a consistent marker for GCC as well. I suspect a goodly number of those initializations that went in early in the process are no longer needed. Jeff
Re: Go patch committed: Harmonize types referenced by both C and Go
On Mon, Feb 18, 2019 at 2:48 AM Rainer Orth wrote: > > > The code was already calling syscall, it was just doing it in a way > > that the types didn't necessarily match the C declaration. This is > > the implementation of Go's syscall.Syscall function, so there isn't > > really anything else we can do. > > I feared as much. Some time ago when debugging another issue I saw > libgo using syscall() directly, certainly unexpected in that particular > case. Those cases--where libgo calls syscall.Syscall--we can clean up where appropriate. What we can't clean up is user written Go code that calls syscall.Syscall directly. Ian
Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].
On 2/18/19 7:44 AM, Martin Liška wrote: PING^1 On 11/30/18 11:26 AM, Martin Liška wrote: Hi Jason. Just small nits I noticed for: cat test4.C int a, b, c; void __attribute__((noinline)) bar() { if (a == 123) [[likely]] c = 5; else [[likely]] b = 77; } int main() { bar (); return 0; } $ g++ test4.C -c test4.C: In function ‘void bar()’: test4.C:8:16: warning: both branches of ‘if’ statement marked as ‘hot label’ [-Wattributes] 8 | [[likely]] c = 5; |^ 9 | else 10 | [[likely]] b = 77; |~ 1) I would expect 'likely' instead of 'hot label' 2) maybe we can put the carousel to the attribute instead of the first statement in the block? Fixed thus: commit 4f0e3ea77fd14dc9931cade9add07f1aa70e8ef4 Author: Jason Merrill Date: Mon Feb 18 08:49:49 2019 -1000 Improve duplicate [[likely]] diagnostic. * parser.c (cp_parser_statement): Make attrs_loc a range. Pass it to process_stmt_hotness_attribute. * cp-gimplify.c (process_stmt_hotness_attribute): Take attrs_loc. (genericize_if_stmt): Use likely/unlikely instead of predictor_name. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 60ca1366cf6..ac3654467ac 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7576,7 +7576,7 @@ extern tree cp_fully_fold (tree); extern tree cp_fully_fold_init (tree); extern void clear_fold_cache (void); extern tree lookup_hotness_attribute (tree); -extern tree process_stmt_hotness_attribute (tree); +extern tree process_stmt_hotness_attribute (tree, location_t); /* in name-lookup.c */ extern tree strip_using_decl(tree); diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c index 33111bd14bf..56f717de85d 100644 --- a/gcc/cp/cp-gimplify.c +++ b/gcc/cp/cp-gimplify.c @@ -206,7 +206,7 @@ genericize_if_stmt (tree *stmt_p) richloc.add_range (EXPR_LOC_OR_LOC (fe, locus)); warning_at (, OPT_Wattributes, "both branches of % statement marked as %qs", - predictor_name (pr)); + pr == PRED_HOT_LABEL ? "likely" : "unlikely"); } } @@ -2765,7 +2765,7 @@ remove_hotness_attribute (tree list) PREDICT_EXPR. 
*/ tree -process_stmt_hotness_attribute (tree std_attrs) +process_stmt_hotness_attribute (tree std_attrs, location_t attrs_loc) { if (std_attrs == error_mark_node) return std_attrs; @@ -2776,7 +2776,7 @@ process_stmt_hotness_attribute (tree std_attrs) || is_attribute_p ("likely", name)); tree pred = build_predict_expr (hot ? PRED_HOT_LABEL : PRED_COLD_LABEL, hot ? TAKEN : NOT_TAKEN); - SET_EXPR_LOCATION (pred, input_location); + SET_EXPR_LOCATION (pred, attrs_loc); add_stmt (pred); if (tree other = lookup_hotness_attribute (TREE_CHAIN (attr))) warning (OPT_Wattributes, "ignoring attribute %qE after earlier %qE", diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index ffecce4e29b..adb5f6f27a1 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -11060,7 +11060,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr, { tree statement, std_attrs = NULL_TREE; cp_token *token; - location_t statement_location, attrs_location; + location_t statement_location, attrs_loc; restart: if (if_p != NULL) @@ -11069,13 +11069,19 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr, statement = NULL_TREE; saved_token_sentinel saved_tokens (parser->lexer); - attrs_location = cp_lexer_peek_token (parser->lexer)->location; + attrs_loc = cp_lexer_peek_token (parser->lexer)->location; if (c_dialect_objc ()) /* In obj-c++, seeing '[[' might be the either the beginning of c++11 attributes, or a nested objc-message-expression. So let's parse the c++11 attributes tentatively. 
*/ cp_parser_parse_tentatively (parser); std_attrs = cp_parser_std_attribute_spec_seq (parser); + if (std_attrs) +{ + location_t end_loc + = cp_lexer_previous_token (parser->lexer)->location; + attrs_loc = make_location (attrs_loc, attrs_loc, end_loc); +} if (c_dialect_objc ()) { if (!cp_parser_parse_definitely (parser)) @@ -11107,14 +3,14 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr, case RID_IF: case RID_SWITCH: - std_attrs = process_stmt_hotness_attribute (std_attrs); + std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc); statement = cp_parser_selection_statement (parser, if_p, chain); break; case RID_WHILE: case RID_DO: case RID_FOR: - std_attrs = process_stmt_hotness_attribute (std_attrs); + std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc); statement = cp_parser_iteration_statement (parser, if_p, false, 0); break; @@ -11122,7 +11128,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr, case RID_CONTINUE: case RID_RETURN: case RID_GOTO: - std_attrs =
Trivial doc typos
Description: fixed a couple of typos in testsuite/README. Testing: make dvi, make info, although I doubt needed. svn diff (with -up) yields: Index: gcc/testsuite/README === --- gcc/testsuite/README(revision 268955) +++ gcc/testsuite/README(working copy) @@ -8,7 +8,7 @@ These tests are included "as is". If any of them fails, do not report a bug. Bug reports for DejaGnu can go to bug-deja...@gnu.org. Discussion and comments about this testsuite should be sent to -g...@gcc.gnu.org; additions and changes to should go to sent to +g...@gcc.gnu.org; additions and changes should be sent to gcc-patches@gcc.gnu.org. The entire testsuite is invoked by `make check` at the top level of @@ -48,7 +48,7 @@ where runtest Is the name used to invoke DejaGnu. If DejaGnu is not - install this will be the relative path name for runtest. + installed this will be the relative path name for runtest. --tool This tells DejaGnu which tool you are testing. It is mainly used to find the testsuite directories for a
[PATCH 29/41] i386: Emulate MMX ssse3_phdv2si3 with SSE
Emulate MMX ssse3_phdv2si3 with SSE by moving bits 64:95 to bits 32:63 in SSE register. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_phdv2si3): Changed to define_insn_and_split to support SSE emulation. --- gcc/config/i386/sse.md | 34 ++ 1 file changed, 26 insertions(+), 8 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 5f29f2c3595..551a1cb1eb2 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15367,26 +15367,44 @@ (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) -(define_insn "ssse3_phdv2si3" - [(set (match_operand:V2SI 0 "register_operand" "=y") +(define_insn_and_split "ssse3_phdv2si3" + [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv") (vec_concat:V2SI (plusminus:SI (vec_select:SI - (match_operand:V2SI 1 "register_operand" "0") + (match_operand:V2SI 1 "register_operand" "0,0,Yv") (parallel [(const_int 0)])) (vec_select:SI (match_dup 1) (parallel [(const_int 1)]))) (plusminus:SI (vec_select:SI - (match_operand:V2SI 2 "nonimmediate_operand" "ym") + (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv") (parallel [(const_int 0)])) (vec_select:SI (match_dup 2) (parallel [(const_int 1)])] - "TARGET_SSSE3" - "phd\t{%2, %0|%0, %2}" - [(set_attr "type" "sseiadd") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + phd\t{%2, %0|%0, %2} + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(const_int 0)] +{ + /* Generate SSE version of the operation. 
*/ + rtx op0 = lowpart_subreg (V4SImode, operands[0], + GET_MODE (operands[0])); + rtx op1 = lowpart_subreg (V4SImode, operands[1], + GET_MODE (operands[1])); + rtx op2 = lowpart_subreg (V4SImode, operands[2], + GET_MODE (operands[2])); + emit_insn (gen_ssse3_phdv4si3 (op0, op1, op2)); + ix86_move_vector_high_sse_to_mmx (op0); + DONE; +} + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sseiadd") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "avx2_pmaddubsw256" [(set (match_operand:V16HI 0 "register_operand" "=x,v") -- 2.20.1
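The "moving bits 64:95 to bits 32:63" step in the description above can be illustrated with a scalar model. This is a minimal sketch in portable C, assuming the add variant of the plusminus iterator; the helper names `phaddd128` and `emulate_phaddd64` are hypothetical and are not GCC functions.

```c
#include <stdint.h>

/* Scalar model of the 128-bit SSSE3 horizontal add on four 32-bit
   lanes: dst = { a0+a1, a2+a3, b0+b1, b2+b3 }, wrapping on overflow.  */
void
phaddd128 (uint32_t dst[4], const uint32_t a[4], const uint32_t b[4])
{
  uint32_t r[4] = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
  for (int i = 0; i < 4; i++)
    dst[i] = r[i];
}

/* Hypothetical model of the emulation: each 64-bit MMX value sits in
   lanes 0-1 of a 128-bit register, lanes 2-3 hold don't-care data.  */
void
emulate_phaddd64 (uint32_t dst[2], const uint32_t a[2], const uint32_t b[2])
{
  /* The upper halves are arbitrary junk chosen for illustration.  */
  uint32_t a128[4] = { a[0], a[1], 0xdeadbeefu, 0xcafef00du };
  uint32_t b128[4] = { b[0], b[1], 0x12345678u, 0x9abcdef0u };
  uint32_t r[4];

  phaddd128 (r, a128, b128);

  /* a0+a1 is already in lane 0.  b0+b1 landed in lane 2 (bits 64:95),
     so copy it down to lane 1 (bits 32:63).  */
  r[1] = r[2];

  dst[0] = r[0];
  dst[1] = r[1];
}
```

The junk sums in lanes 1 and 3 of the 128-bit result are simply discarded, which is why only the single lane move is needed after the wide operation.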
[PATCH 23/41] i386: Emulate MMX mmx_uavgv4hi3 with SSE
Emulate MMX mmx_uavgv4hi3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_uavgv4hi3): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (*mmx_uavgv4hi3): Add SSE emulation. --- gcc/config/i386/mmx.md | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 8866354dea9..d647dc28baa 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1736,33 +1736,39 @@ (plus:V4SI (plus:V4SI (zero_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand")) + (match_operand:V4HI 1 "register_mmxmem_operand")) (zero_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand"))) + (match_operand:V4HI 2 "register_mmxmem_operand"))) (const_vector:V4SI [(const_int 1) (const_int 1) (const_int 1) (const_int 1)])) (const_int 1] - "TARGET_SSE || TARGET_3DNOW_A" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" "ix86_fixup_binary_operands_no_copy (PLUS, V4HImode, operands);") (define_insn "*mmx_uavgv4hi3" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (truncate:V4HI (lshiftrt:V4SI (plus:V4SI (plus:V4SI (zero_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand" "%0")) + (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")) (zero_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand" "ym"))) + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv"))) (const_vector:V4SI [(const_int 1) (const_int 1) (const_int 1) (const_int 1)])) (const_int 1] - "(TARGET_SSE || TARGET_3DNOW_A) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ix86_binary_operator_ok (PLUS, V4HImode, operands)" - "pavgw\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxshft") - (set_attr "mode" "DI")]) + "@ + pavgw\t{%2, %0|%0, %2} + pavgw\t{%2, %0|%0, %2} + vpavgw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" 
"mmxshft,sseiadd,sseiadd") + (set_attr "mode" "DI,TI,TI")]) (define_insn "mmx_psadbw" [(set (match_operand:V1DI 0 "register_operand" "=y") -- 2.20.1
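The RTL for `*mmx_uavgv4hi3` above widens both operands, adds them plus a rounding constant of 1, shifts right by one, and truncates. A minimal scalar sketch of that arithmetic in C follows; the function name `uavg_v4hi` is hypothetical.

```c
#include <stdint.h>

/* Scalar model of the V4HI unsigned rounding average described by the
   RTL: truncate ((zext (a) + zext (b) + 1) >> 1) for each lane.  */
void
uavg_v4hi (uint16_t dst[4], const uint16_t a[4], const uint16_t b[4])
{
  for (int i = 0; i < 4; i++)
    {
      /* Widen to 32 bits so the sum cannot overflow.  */
      uint32_t wide = (uint32_t) a[i] + (uint32_t) b[i] + 1u;
      dst[i] = (uint16_t) (wide >> 1);
    }
}
```

Because the operation is lane-wise, running it on a 128-bit register leaves the low 64 bits with exactly the MMX result, so no fix-up move is required for this pattern.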
[PATCH 28/41] i386: Emulate MMX ssse3_phwv4hi3 with SSE
Emulate MMX ssse3_phwv4hi3 with SSE by moving bits 64:95 to bits 32:63 in SSE register. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_phwv4hi3): Changed to define_insn_and_split to support SSE emulation. --- gcc/config/i386/sse.md | 34 ++ 1 file changed, 26 insertions(+), 8 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 3135ce4eace..5f29f2c3595 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15243,13 +15243,13 @@ (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) -(define_insn "ssse3_phwv4hi3" - [(set (match_operand:V4HI 0 "register_operand" "=y") +(define_insn_and_split "ssse3_phwv4hi3" + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (vec_concat:V4HI (vec_concat:V2HI (ssse3_plusminus:HI (vec_select:HI - (match_operand:V4HI 1 "register_operand" "0") + (match_operand:V4HI 1 "register_operand" "0,0,Yv") (parallel [(const_int 0)])) (vec_select:HI (match_dup 1) (parallel [(const_int 1)]))) (ssse3_plusminus:HI @@ -15258,19 +15258,37 @@ (vec_concat:V2HI (ssse3_plusminus:HI (vec_select:HI - (match_operand:V4HI 2 "nonimmediate_operand" "ym") + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv") (parallel [(const_int 0)])) (vec_select:HI (match_dup 2) (parallel [(const_int 1)]))) (ssse3_plusminus:HI (vec_select:HI (match_dup 2) (parallel [(const_int 2)])) (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))] - "TARGET_SSSE3" - "phw\t{%2, %0|%0, %2}" - [(set_attr "type" "sseiadd") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + phw\t{%2, %0|%0, %2} + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(const_int 0)] +{ + /* Generate SSE version of the operation. 
*/ + rtx op0 = lowpart_subreg (V8HImode, operands[0], + GET_MODE (operands[0])); + rtx op1 = lowpart_subreg (V8HImode, operands[1], + GET_MODE (operands[1])); + rtx op2 = lowpart_subreg (V8HImode, operands[2], + GET_MODE (operands[2])); + emit_insn (gen_ssse3_phwv8hi3 (op0, op1, op2)); + ix86_move_vector_high_sse_to_mmx (op0); + DONE; +} + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sseiadd") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "avx2_phdv8si3" [(set (match_operand:V8SI 0 "register_operand" "=x") -- 2.20.1
Re: [PATCH] document __builtin_is_constant_evaluated
On 2/15/19 9:01 PM, Sandra Loosemore wrote:
> On 2/13/19 4:33 PM, Martin Sebor wrote:
>
>> Index: gcc/doc/extend.texi
>> ===
>> --- gcc/doc/extend.texi (revision 268856)
>> +++ gcc/doc/extend.texi (working copy)
>> @@ -12890,6 +12890,22 @@ built-in in this case, because it has no opportuni
>>  optimization.
>>  @end deftypefn
>>
>> +@deftypefn {Built-in Function} bool __builtin_is_constant_evaluated ()
>> +The @code{__builtin_is_constant_evaluated} function is available only
>> +in C++. Its main use case is to determine whether a @code{constexpr}
>> +function is being called in a @code{constexpr} context. A call to
>> +the function evaluates to a core constant expression with the value
>> +@code{true} if and only if it occurs within the evaluation of an expression
>> +or conversion that is manifestly constant-evaluated as defined in the C++
>> +standard. Manifestly constant-evaluated contexts include constant-expressions,
>> +the conditions of @code{constexpr if} statements, constraint-expresions, and
>
> s/expresions/expressions/
>
>> +initializers of variables usable in constant expressions. The built-in is
>> +intended to be used by implementations of the @code{std::is_constant_evaluated}
>> +C++ function. Programs should make use of the latter function rather than
>> +invoking the built-in directly. For more details refer to the latest revision
>> +of the C++ standard.
>> +@end deftypefn
>> +
>>  @deftypefn {Built-in Function} long __builtin_expect (long @var{exp}, long @var{c})
>>  @opindex fprofile-arcs
>>  You may use @code{__builtin_expect} to provide the compiler with
>
> I think this is generally reasonable (and I agree with the rationale for
> documenting this at all), but I'd like to see this rearranged and
> rephrased to put the most important point (it's an internal hook to
> implement std::is_constant_evaluated and shouldn't be called directly)
> before the technical details, with a paragraph break in between.

Attached is a revision with this rearrangement.

Martin

gcc/ChangeLog:

	* doc/extend.texi (Other Builtins): Add __builtin_is_constant_evaluated.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 268992)
+++ gcc/doc/extend.texi (working copy)
@@ -12890,6 +12890,23 @@ built-in in this case, because it has no opportuni
 optimization.
 @end deftypefn
 
+@deftypefn {Built-in Function} bool __builtin_is_constant_evaluated ()
+The @code{__builtin_is_constant_evaluated} function is available only
+in C++. The built-in is intended to be used by implementations of
+the @code{std::is_constant_evaluated} C++ function. Programs should make
+use of the latter function rather than invoking the built-in directly.
+
+The main use case of the built-in is to determine whether a @code{constexpr}
+function is being called in a @code{constexpr} context. A call to
+the function evaluates to a core constant expression with the value
+@code{true} if and only if it occurs within the evaluation of an expression
+or conversion that is manifestly constant-evaluated as defined in the C++
+standard. Manifestly constant-evaluated contexts include constant-expressions,
+the conditions of @code{constexpr if} statements, constraint-expressions, and
+initializers of variables usable in constant expressions. For more details
+refer to the latest revision of the C++ standard.
+@end deftypefn
+
 @deftypefn {Built-in Function} long __builtin_expect (long @var{exp}, long @var{c})
 @opindex fprofile-arcs
 You may use @code{__builtin_expect} to provide the compiler with
[PATCH 37/41] i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE
PR target/89021 * config/i386/mmx.md (MMXMODE:mov): Also allow TARGET_MMX_WITH_SSE. (MMXMODE:*mov_internal): Likewise. (MMXMODE:movmisalign): Likewise. --- gcc/config/i386/mmx.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index c48d42c7d59..b230dee521f 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -70,7 +70,7 @@ (define_expand "mov" [(set (match_operand:MMXMODE 0 "nonimmediate_operand") (match_operand:MMXMODE 1 "nonimmediate_operand"))] - "TARGET_MMX" + "TARGET_MMX || TARGET_MMX_WITH_SSE" { ix86_expand_vector_move (mode, operands); DONE; @@ -81,7 +81,7 @@ "=r ,o ,r,r ,m ,?!y,!y,?!y,m ,r ,?!y,v,v,v,m,r,v,!y,*x") (match_operand:MMXMODE 1 "nonimm_or_0_operand" "rCo,rC,C,rm,rC,C ,!y,m ,?!y,?!y,r ,C,v,m,v,v,r,*x,!y"))] - "TARGET_MMX + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && !(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) @@ -207,7 +207,7 @@ (define_expand "movmisalign" [(set (match_operand:MMXMODE 0 "nonimmediate_operand") (match_operand:MMXMODE 1 "nonimmediate_operand"))] - "TARGET_MMX" + "TARGET_MMX || TARGET_MMX_WITH_SSE" { ix86_expand_vector_move (mode, operands); DONE; -- 2.20.1
[PATCH 30/41] i386: Emulate MMX ssse3_pmaddubsw with SSE
Emulate MMX ssse3_pmaddubsw with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_pmaddubsw): Add SSE emulation. --- gcc/config/i386/sse.md | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 551a1cb1eb2..e8d9bec9766 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -1,17 +1,17 @@ (set_attr "mode" "TI")]) (define_insn "ssse3_pmaddubsw" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (ss_plus:V4HI (mult:V4HI (zero_extend:V4HI (vec_select:V4QI - (match_operand:V8QI 1 "register_operand" "0") + (match_operand:V8QI 1 "register_operand" "0,0,Yv") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))) (sign_extend:V4HI (vec_select:V4QI - (match_operand:V8QI 2 "nonimmediate_operand" "ym") + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)] (mult:V4HI @@ -15577,13 +15577,17 @@ (vec_select:V4QI (match_dup 2) (parallel [(const_int 1) (const_int 3) (const_int 5) (const_int 7)]))] - "TARGET_SSSE3" - "pmaddubsw\t{%2, %0|%0, %2}" - [(set_attr "type" "sseiadd") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + pmaddubsw\t{%2, %0|%0, %2} + pmaddubsw\t{%2, %0|%0, %2} + vpmaddubsw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sseiadd") (set_attr "atom_unit" "simul") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_mode_iterator PMULHRSW [V8HI (V16HI "TARGET_AVX2")]) -- 2.20.1
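The `ssse3_pmaddubsw` RTL above pairs zero-extended bytes of operand 1 with sign-extended bytes of operand 2 and combines the even and odd products with a saturating add. A minimal scalar sketch in C, with hypothetical helper names `sat16` and `pmaddubsw_v8qi`:

```c
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range (models ss_plus).  */
int16_t
sat16 (int32_t x)
{
  if (x > INT16_MAX)
    return INT16_MAX;
  if (x < INT16_MIN)
    return INT16_MIN;
  return (int16_t) x;
}

/* Scalar model of the V8QI form: A is treated as unsigned bytes
   (zero_extend), B as signed bytes (sign_extend).  Each output lane is
   the saturated sum of one even and one odd byte product.  */
void
pmaddubsw_v8qi (int16_t dst[4], const uint8_t a[8], const int8_t b[8])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t even = (int32_t) a[2 * i] * (int32_t) b[2 * i];
      int32_t odd = (int32_t) a[2 * i + 1] * (int32_t) b[2 * i + 1];
      dst[i] = sat16 (even + odd);
    }
}
```

Each output lane depends only on the two input bytes at the same position, so the low 64 bits of the 128-bit instruction already hold the MMX result, which is consistent with this pattern staying a plain `define_insn`.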
[PATCH 40/41] i386: Enable TM MMX intrinsics with SSE2
This patch enables TM MMX intrinsics with SSE2 when MMX is disabled.

	PR target/89021
	* config/i386/i386.c (bdesc_tm): Enable MMX intrinsics with SSE2.
---
 gcc/config/i386/i386.c | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 93769003a4a..a28a3f04129 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31078,13 +31078,13 @@ static const struct builtin_description bdesc_##kind[] = \
    we're lazy. Add casts to make them fit. */
 static const struct builtin_description bdesc_tm[] =
 {
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WM64", (enum ix86_builtins) BUILT_IN_TM_STORE_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WaWM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAW_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RaRM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAR_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WM64", (enum ix86_builtins) BUILT_IN_TM_STORE_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WaWM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAW_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RaRM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAR_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
 
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_WM128", (enum ix86_builtins) BUILT_IN_TM_STORE_M128, UNKNOWN, VOID_FTYPE_PV4SF_V4SF },
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM128", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M128, UNKNOWN, VOID_FTYPE_PV4SF_V4SF },
@@ -31102,7 +31102,7 @@ static const struct builtin_description bdesc_tm[] =
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM256", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M256, UNKNOWN, V8SF_FTYPE_PCV8SF },
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM256", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M256, UNKNOWN, V8SF_FTYPE_PCV8SF },
 
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_LM64", (enum ix86_builtins) BUILT_IN_TM_LOG_M64, UNKNOWN, VOID_FTYPE_PCVOID },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_LM64", (enum ix86_builtins) BUILT_IN_TM_LOG_M64, UNKNOWN, VOID_FTYPE_PCVOID },
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_LM128", (enum ix86_builtins) BUILT_IN_TM_LOG_M128, UNKNOWN, VOID_FTYPE_PCVOID },
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_LM256", (enum ix86_builtins) BUILT_IN_TM_LOG_M256, UNKNOWN, VOID_FTYPE_PCVOID },
 };
--
2.20.1
[PATCH 18/41] i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE
Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_v4hi3): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (mmx_v8qi3): Likewise. (smaxmin:v4hi3): New. (umaxmin:v8qi3): Likewise. (smaxmin:*mmx_v4hi3): Add SSE emulation. (umaxmin:*mmx_v8qi3): Likewise. --- gcc/config/i386/mmx.md | 68 +- 1 file changed, 48 insertions(+), 20 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index dea2be1d8e2..edfb8623701 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -923,40 +923,68 @@ (define_expand "mmx_v4hi3" [(set (match_operand:V4HI 0 "register_operand") (smaxmin:V4HI - (match_operand:V4HI 1 "nonimmediate_operand") - (match_operand:V4HI 2 "nonimmediate_operand")))] - "TARGET_SSE || TARGET_3DNOW_A" + (match_operand:V4HI 1 "register_mmxmem_operand") + (match_operand:V4HI 2 "register_mmxmem_operand")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);") + +(define_expand "v4hi3" + [(set (match_operand:V4HI 0 "register_operand") +(smaxmin:V4HI + (match_operand:V4HI 1 "register_operand") + (match_operand:V4HI 2 "register_operand")))] + "TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);") (define_insn "*mmx_v4hi3" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (smaxmin:V4HI - (match_operand:V4HI 1 "nonimmediate_operand" "%0") - (match_operand:V4HI 2 "nonimmediate_operand" "ym")))] - "(TARGET_SSE || TARGET_3DNOW_A) + (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv") + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ix86_binary_operator_ok (, V4HImode, operands)" - "pw\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + "@ + pw\t{%2, %0|%0, %2} + 
pw\t{%2, %0|%0, %2} + vpw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxadd,sseiadd,sseiadd") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_v8qi3" [(set (match_operand:V8QI 0 "register_operand") (umaxmin:V8QI - (match_operand:V8QI 1 "nonimmediate_operand") - (match_operand:V8QI 2 "nonimmediate_operand")))] - "TARGET_SSE || TARGET_3DNOW_A" + (match_operand:V8QI 1 "register_mmxmem_operand") + (match_operand:V8QI 2 "register_mmxmem_operand")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);") + +(define_expand "v8qi3" + [(set (match_operand:V8QI 0 "register_operand") +(umaxmin:V8QI + (match_operand:V8QI 1 "register_operand") + (match_operand:V8QI 2 "register_operand")))] + "TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);") (define_insn "*mmx_v8qi3" - [(set (match_operand:V8QI 0 "register_operand" "=y") + [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (umaxmin:V8QI - (match_operand:V8QI 1 "nonimmediate_operand" "%0") - (match_operand:V8QI 2 "nonimmediate_operand" "ym")))] - "(TARGET_SSE || TARGET_3DNOW_A) + (match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yv") + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ix86_binary_operator_ok (, V8QImode, operands)" - "pb\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + "@ + pb\t{%2, %0|%0, %2} + pb\t{%2, %0|%0, %2} + vpb\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxadd,sseiadd,sseiadd") + (set_attr "mode" "DI,TI,TI")]) (define_insn "mmx_ashr3" [(set (match_operand:MMXMODE24 0 "register_operand" "=y,x,Yv") -- 2.20.1
[PATCH 21/41] i386: Emulate MMX maskmovq with SSE2 maskmovdqu
Emulate MMX maskmovq with SSE2 maskmovdqu for TARGET_MMX_WITH_SSE by zero-extending source and mask operands to 128 bits. Handle unmapped bits 64:127 at memory address by adjusting source and mask operands together with memory address. PR target/89021 * config/i386/xmmintrin.h: Emulate MMX maskmovq with SSE2 maskmovdqu for __MMX_WITH_SSE__. --- gcc/config/i386/xmmintrin.h | 61 + 1 file changed, 61 insertions(+) diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h index 58284378514..a915f6c87d7 100644 --- a/gcc/config/i386/xmmintrin.h +++ b/gcc/config/i386/xmmintrin.h @@ -1165,7 +1165,68 @@ _m_pshufw (__m64 __A, int const __N) extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P) { +#ifdef __MMX_WITH_SSE__ + /* Emulate MMX maskmovq with SSE2 maskmovdqu and handle unmapped bits + 64:127 at address __P. */ + typedef long long __v2di __attribute__ ((__vector_size__ (16))); + typedef char __v16qi __attribute__ ((__vector_size__ (16))); + /* Zero-extend __A and __N to 128 bits. */ + __v2di __A128 = __extension__ (__v2di) { ((__v1di) __A)[0], 0 }; + __v2di __N128 = __extension__ (__v2di) { ((__v1di) __N)[0], 0 }; + + /* Check the alignment of __P. */ + __SIZE_TYPE__ offset = ((__SIZE_TYPE__) __P) & 0xf; + if (offset) +{ + /* If the misalignment of __P > 8, subtract __P by 8 bytes. +Otherwise, subtract __P by the misalignment. */ + if (offset > 8) + offset = 8; + __P = (char *) (((__SIZE_TYPE__) __P) - offset); + + /* Shift __A128 and __N128 to the left by the adjustment. 
*/ + switch (offset) + { + case 1: + __A128 = __builtin_ia32_pslldqi128 (__A128, 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 8); + break; + case 2: + __A128 = __builtin_ia32_pslldqi128 (__A128, 2 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 2 * 8); + break; + case 3: + __A128 = __builtin_ia32_pslldqi128 (__A128, 3 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 3 * 8); + break; + case 4: + __A128 = __builtin_ia32_pslldqi128 (__A128, 4 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 4 * 8); + break; + case 5: + __A128 = __builtin_ia32_pslldqi128 (__A128, 5 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 5 * 8); + break; + case 6: + __A128 = __builtin_ia32_pslldqi128 (__A128, 6 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 6 * 8); + break; + case 7: + __A128 = __builtin_ia32_pslldqi128 (__A128, 7 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 7 * 8); + break; + case 8: + __A128 = __builtin_ia32_pslldqi128 (__A128, 8 * 8); + __N128 = __builtin_ia32_pslldqi128 (__N128, 8 * 8); + break; + default: + break; + } +} + __builtin_ia32_maskmovdqu ((__v16qi)__A128, (__v16qi)__N128, __P); +#else __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P); +#endif } extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -- 2.20.1
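The address adjustment in `_mm_maskmove_si64` above can be checked with a byte-level model. This is a sketch under stated assumptions: memory is modeled as a plain byte array indexed by address, and the helpers `masked_store` and `maskmove64_via_128` are hypothetical names, not GCC or intrinsic functions. Placing the 8 data bytes at index `offset` of a zeroed 16-byte buffer plays the role of the `pslldq` byte shift.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a byte-masked store of LEN bytes at MEM[ADDR]: byte i is
   written only when the top bit of MASK[i] is set.  */
void
masked_store (uint8_t *mem, size_t addr, const uint8_t *src,
	      const uint8_t *mask, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (mask[i] & 0x80)
      mem[addr + i] = src[i];
}

/* Hypothetical model of the emulation: perform the 8-byte masked store
   at ADDR with a single 16-byte masked store.  Returns the adjusted
   base address of the 16-byte access.  */
size_t
maskmove64_via_128 (uint8_t *mem, size_t addr, const uint8_t a[8],
		    const uint8_t n[8])
{
  uint8_t a128[16] = { 0 };	/* zero-extended data */
  uint8_t n128[16] = { 0 };	/* zero-extended mask */
  size_t offset = addr & 0xf;	/* misalignment of the address */

  /* Move the base back by the misalignment, but by at most 8 bytes.  */
  if (offset > 8)
    offset = 8;

  /* Shift data and mask up by the same number of bytes.  */
  memcpy (a128 + offset, a, 8);
  memcpy (n128 + offset, n, 8);

  masked_store (mem, addr - offset, a128, n128, 16);
  return addr - offset;
}
```

With a misalignment of at most 8 the 16-byte access starts on a 16-byte boundary. With a larger misalignment it covers the 8 bytes before the address and the 8 bytes the original `maskmovq` would address, so no byte beyond the original 8-byte range is accessed.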
[PATCH 32/41] i386: Emulate MMX pshufb with SSE version
Emulate MMX version of pshufb with SSE version by masking out the bit 3 of the shuffle control byte. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_pshufbv8qi3): Changed to define_insn_and_split. Also allow TARGET_MMX_WITH_SSE. Add SSE emulation. --- gcc/config/i386/sse.md | 46 +- 1 file changed, 37 insertions(+), 9 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index b08a577d1e4..79b35d95424 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15728,17 +15728,45 @@ (set_attr "btver2_decode" "vector") (set_attr "mode" "")]) -(define_insn "ssse3_pshufbv8qi3" - [(set (match_operand:V8QI 0 "register_operand" "=y") - (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0") - (match_operand:V8QI 2 "nonimmediate_operand" "ym")] -UNSPEC_PSHUFB))] - "TARGET_SSSE3" - "pshufb\t{%2, %0|%0, %2}"; - [(set_attr "type" "sselog1") +(define_insn_and_split "ssse3_pshufbv8qi3" + [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") + (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] +UNSPEC_PSHUFB)) + (clobber (match_scratch:V4SI 3 "=X,x,Yv"))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + pshufb\t{%2, %0|%0, %2} + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(set (match_dup 3) (match_dup 5)) + (set (match_dup 3) + (and:V4SI (match_dup 3) (match_dup 2))) + (set (match_dup 0) + (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))] +{ + /* Emulate MMX version of pshufb with SSE version by masking out the + bit 3 of the shuffle control byte. 
*/ + operands[0] = lowpart_subreg (V16QImode, operands[0], + GET_MODE (operands[0])); + operands[1] = lowpart_subreg (V16QImode, operands[1], + GET_MODE (operands[1])); + operands[2] = lowpart_subreg (V4SImode, operands[2], + GET_MODE (operands[2])); + operands[4] = lowpart_subreg (V16QImode, operands[3], + GET_MODE (operands[3])); + rtvec par = gen_rtvec (4, GEN_INT (0xf7f7f7f7), +GEN_INT (0xf7f7f7f7), +GEN_INT (0xf7f7f7f7), +GEN_INT (0xf7f7f7f7)); + rtx vec_const = gen_rtx_CONST_VECTOR (V4SImode, par); + operands[5] = force_const_mem (V4SImode, vec_const); +} + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "_psign3" [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x") -- 2.20.1
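Why clearing bit 3 of each control byte is sufficient can be seen from a scalar model of both shuffle widths. The 64-bit form indexes with the low 3 bits of the control byte, the 128-bit form with the low 4 bits, and both zero the lane when bit 7 is set. A minimal sketch in C; the names `pshufb64`, `pshufb128` and `emulate_pshufb64` are hypothetical.

```c
#include <stdint.h>

/* Scalar model of the 64-bit (MMX) byte shuffle.  */
void
pshufb64 (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t r[8];
  for (int i = 0; i < 8; i++)
    r[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  for (int i = 0; i < 8; i++)
    dst[i] = r[i];
}

/* Scalar model of the 128-bit (SSE) byte shuffle.  */
void
pshufb128 (uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t r[16];
  for (int i = 0; i < 16; i++)
    r[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  for (int i = 0; i < 16; i++)
    dst[i] = r[i];
}

/* Hypothetical model of the emulation: AND every control byte with
   0xf7 (clear bit 3, keep bit 7), run the 128-bit shuffle, and keep
   the low 8 bytes.  The upper source bytes are arbitrary junk.  */
void
emulate_pshufb64 (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t src128[16], ctl128[16], r[16];

  for (int i = 0; i < 16; i++)
    {
      src128[i] = i < 8 ? src[i] : 0xa5;
      ctl128[i] = (uint8_t) ((i < 8 ? ctl[i] : 0x5a) & 0xf7);
    }

  pshufb128 (r, src128, ctl128);

  for (int i = 0; i < 8; i++)
    dst[i] = r[i];
}
```

Without the mask, a control byte with bit 3 set would select one of the junk bytes 8 to 15 of the wide register instead of wrapping to bytes 0 to 7 as the MMX instruction does.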
[PATCH 20/41] i386: Emulate MMX mmx_umulv4hi3_highpart with SSE
Emulate MMX mmx_umulv4hi3_highpart with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_umulv4hi3_highpart): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (*mmx_umulv4hi3_highpart): Add SSE emulation. --- gcc/config/i386/mmx.md | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 5ae04de205d..5a342256cbc 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -781,28 +781,34 @@ (lshiftrt:V4SI (mult:V4SI (zero_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand")) + (match_operand:V4HI 1 "register_mmxmem_operand")) (zero_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand"))) + (match_operand:V4HI 2 "register_mmxmem_operand"))) (const_int 16] - "TARGET_SSE || TARGET_3DNOW_A" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);") (define_insn "*mmx_umulv4hi3_highpart" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (truncate:V4HI (lshiftrt:V4SI (mult:V4SI (zero_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand" "%0")) + (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")) (zero_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand" "ym"))) + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv"))) (const_int 16] - "(TARGET_SSE || TARGET_3DNOW_A) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ix86_binary_operator_ok (MULT, V4HImode, operands)" - "pmulhuw\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxmul") - (set_attr "mode" "DI")]) + "@ + pmulhuw\t{%2, %0|%0, %2} + pmulhuw\t{%2, %0|%0, %2} + vpmulhuw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxmul,ssemul,ssemul") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_pmaddwd" [(set (match_operand:V2SI 0 "register_operand") -- 
2.20.1
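The `*mmx_umulv4hi3_highpart` RTL above zero-extends both operands, multiplies, shifts right by 16 and truncates. A minimal scalar sketch of that computation in C, with the hypothetical name `umul_highpart_v4hi`:

```c
#include <stdint.h>

/* Scalar model of the V4HI unsigned high-part multiply:
   truncate ((zext (a) * zext (b)) >> 16) for each lane.  */
void
umul_highpart_v4hi (uint16_t dst[4], const uint16_t a[4],
		    const uint16_t b[4])
{
  for (int i = 0; i < 4; i++)
    {
      /* The full product of two 16-bit values fits in 32 bits.  */
      uint32_t product = (uint32_t) a[i] * (uint32_t) b[i];
      dst[i] = (uint16_t) (product >> 16);
    }
}
```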
[PATCH 17/41] i386: Emulate MMX mmx_pinsrw with SSE
Emulate MMX mmx_pinsrw with SSE. Only SSE register destination operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_pinsrw): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (*mmx_pinsrw): Add SSE emulation. --- gcc/config/i386/mmx.md | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 28725f48282..dea2be1d8e2 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1282,32 +1282,45 @@ (match_operand:SI 2 "nonimmediate_operand")) (match_operand:V4HI 1 "register_operand") (match_operand:SI 3 "const_0_to_3_operand")))] - "TARGET_SSE || TARGET_3DNOW_A" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" { operands[2] = gen_lowpart (HImode, operands[2]); operands[3] = GEN_INT (1 << INTVAL (operands[3])); }) (define_insn "*mmx_pinsrw" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (vec_merge:V4HI (vec_duplicate:V4HI -(match_operand:HI 2 "nonimmediate_operand" "rm")) - (match_operand:V4HI 1 "register_operand" "0") +(match_operand:HI 2 "nonimmediate_operand" "rm,rm,rm")) + (match_operand:V4HI 1 "register_operand" "0,0,Yv") (match_operand:SI 3 "const_int_operand")))] - "(TARGET_SSE || TARGET_3DNOW_A) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ((unsigned) exact_log2 (INTVAL (operands[3])) < GET_MODE_NUNITS (V4HImode))" { operands[3] = GEN_INT (exact_log2 (INTVAL (operands[3]))); - if (MEM_P (operands[2])) -return "pinsrw\t{%3, %2, %0|%0, %2, %3}"; + if (TARGET_MMX_WITH_SSE && TARGET_AVX) +{ + if (MEM_P (operands[2])) + return "vpinsrw\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + else + return "vpinsrw\t{%3, %k2, %1, %0|%0, %1, %k2, %3}"; +} else -return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}"; +{ + if (MEM_P (operands[2])) + return "pinsrw\t{%3, %2, %0|%0, %2, %3}"; + else + return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}"; +} } - [(set_attr "type" "mmxcvt") + [(set_attr 
"mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxcvt,sselog,sselog") (set_attr "length_immediate" "1") - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "mmx_pextrw" [(set (match_operand:SI 0 "register_operand" "=r,r") -- 2.20.1
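The `mmx_pinsrw` expander above turns the lane index into a one-hot `vec_merge` mask with `1 << index`, and the insn converts it back with `exact_log2` and rejects values outside the four lanes. A minimal scalar sketch of that round trip in C; `exact_log2_u` is a hypothetical stand-in that mirrors the documented behavior of GCC's `exact_log2` (returning -1 for non-powers of two), and `vec_merge_v4hi` is a hypothetical model of the RTL operation.

```c
#include <stdint.h>

/* Return log2 (X) when X is a power of two, otherwise -1.  */
int
exact_log2_u (unsigned x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x >>= 1)
    n++;
  return n;
}

/* Model of (vec_merge:V4HI (vec_duplicate VAL) OLD MASK): lane i takes
   VAL when bit i of MASK is set and OLD[i] otherwise.  */
void
vec_merge_v4hi (uint16_t dst[4], uint16_t val, const uint16_t old[4],
		unsigned mask)
{
  for (int i = 0; i < 4; i++)
    dst[i] = (mask & (1u << i)) ? val : old[i];
}
```

For example, inserting into lane 2 uses the mask `1 << 2`, and `exact_log2_u (1 << 2)` recovers the immediate 2 that the `pinsrw` or `vpinsrw` template prints.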
[PATCH 38/41] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE
	PR target/89021
	* config/i386/mmx.md (*vec_dupv2sf): Changed to
	define_insn_and_split to support SSE emulation.
	(*vec_extractv2sf_0): Likewise.
	(*vec_extractv2sf_1): Likewise.
	(*vec_extractv2si_0): Likewise.
	(*vec_extractv2si_1): Likewise.
	(*vec_extractv2si_zext_mem): Likewise.
	(vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
	(vec_extractv2sf_1 splitter): Likewise.
	(vec_extractv2sfsf): Likewise.
	(vec_setv2si): Likewise.
	(vec_extractv2si_1 splitter): Likewise.
	(vec_extractv2sisi): Likewise.
	(vec_setv4hi): Likewise.
	(vec_extractv4hihi): Likewise.
	(vec_setv8qi): Likewise.
	(vec_extractv8qiqi): Likewise.
	(vec_extractv2sfsf): Also allow TARGET_MMX_WITH_SSE.  Pass
	TARGET_MMX_WITH_SSE to ix86_expand_vector_extract.
	(vec_extractv2sisi): Likewise.
	(vec_extractv4hihi): Likewise.
	(vec_extractv8qiqi): Likewise.
	(vec_initv2sfsf): Also allow TARGET_MMX_WITH_SSE.  Pass
	TARGET_MMX_WITH_SSE to ix86_expand_vector_init.
	(vec_initv2sisi): Likewise.
	(vec_initv4hihi): Likewise.
	(vec_initv8qiqi): Likewise.
	(vec_setv2si): Also allow TARGET_MMX_WITH_SSE.  Pass
	TARGET_MMX_WITH_SSE to ix86_expand_vector_set.
	(vec_setv4hi): Likewise.
	(vec_setv8qi): Likewise.
--- gcc/config/i386/mmx.md | 110 - 1 file changed, 66 insertions(+), 44 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index b230dee521f..479568aa322 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -555,14 +555,23 @@ (set_attr "prefix_extra" "1") (set_attr "mode" "V2SF")]) -(define_insn "*vec_dupv2sf" - [(set (match_operand:V2SF 0 "register_operand" "=y") +(define_insn_and_split "*vec_dupv2sf" + [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv") (vec_duplicate:V2SF - (match_operand:SF 1 "register_operand" "0")))] - "TARGET_MMX" - "punpckldq\t%0, %0" - [(set_attr "type" "mmxcvt") - (set_attr "mode" "DI")]) + (match_operand:SF 1 "register_operand" "0,0,Yv")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + punpckldq\t%0, %0 + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(set (match_dup 0) + (vec_duplicate:V4SF (match_dup 1)))] + "operands[0] = lowpart_subreg (V4SFmode, operands[0], +GET_MODE (operands[0]));" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxcvt,ssemov,ssemov") + (set_attr "mode" "DI,TI,TI")]) (define_insn "*mmx_concatv2sf" [(set (match_operand:V2SF 0 "register_operand" "=y,y") @@ -580,9 +589,9 @@ [(match_operand:V2SF 0 "register_operand") (match_operand:SF 1 "register_operand") (match_operand 2 "const_int_operand")] - "TARGET_MMX" + "TARGET_MMX || TARGET_MMX_WITH_SSE" { - ix86_expand_vector_set (false, operands[0], operands[1], + ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1], INTVAL (operands[2])); DONE; }) @@ -594,11 +603,13 @@ (vec_select:SF (match_operand:V2SF 1 "nonimmediate_operand" " xm,x,ym,y,m,m") (parallel [(const_int 0)])))] - "TARGET_MMX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 1))] - "operands[1] = gen_lowpart (SFmode, operands[1]);") + "operands[1] = gen_lowpart (SFmode, 
operands[1]);" + [(set_attr "mmx_isa" "*,*,native,native,*,*")]) ;; Avoid combining registers from different units in a single alternative, ;; see comment above inline_secondary_memory_needed function in i386.c @@ -607,7 +618,8 @@ (vec_select:SF (match_operand:V2SF 1 "nonimmediate_operand" " 0,x,x,o,o,o,o") (parallel [(const_int 1)])))] - "TARGET_MMX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ punpckhdq\t%0, %0 %vmovshdup\t{%1, %0|%0, %1} @@ -617,6 +629,7 @@ # #" [(set_attr "isa" "*,sse3,noavx,*,*,*,*") + (set_attr "mmx_isa" "native,*,*,native,*,*,*") (set_attr "type" "mmxcvt,sse,sseshuf1,mmxmov,ssemov,fmov,imov") (set (attr "length_immediate") (if_then_else (eq_attr "alternative" "2") @@ -634,7 +647,7 @@ (vec_select:SF (match_operand:V2SF 1 "memory_operand") (parallel [(const_int 1)])))] - "TARGET_MMX && reload_completed" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && reload_completed" [(set (match_dup 0) (match_dup 1))] "operands[1] = adjust_address (operands[1], SFmode, 4);") @@ -642,19 +655,20 @@ [(match_operand:SF 0 "register_operand") (match_operand:V2SF 1 "register_operand") (match_operand 2 "const_int_operand")] -
[PATCH 25/41] i386: Emulate MMX movntq with SSE2 movntidi
Emulate MMX movntq with SSE2 movntidi.  Only register source operand
is allowed.

	PR target/89021
	* config/i386/mmx.md (sse_movntq): Add SSE2 emulation.
---
 gcc/config/i386/mmx.md | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 098e41e19c3..b06f0af984a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -214,12 +214,16 @@
 })
 
 (define_insn "sse_movntq"
-  [(set (match_operand:DI 0 "memory_operand" "=m")
-	(unspec:DI [(match_operand:DI 1 "register_operand" "y")]
+  [(set (match_operand:DI 0 "memory_operand" "=m,m")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "y,r")]
 		   UNSPEC_MOVNTQ))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "movntq\t{%1, %0|%0, %1}"
-  [(set_attr "type" "mmxmov")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   movntq\t{%1, %0|%0, %1}
+   movnti\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "mmxmov,ssemov")
    (set_attr "mode" "DI")])
 
 ;
-- 
2.20.1
[PATCH 36/41] Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE
From: Uros Bizjak 2019-02-18 Uroš Bizjak PR target/89021 * config/i386/i386.md (*zero_extendsidi2): Add mmx_isa attribute. * config/i386/sse.md (sse2_cvtpi2pd): Ditto. (sse2_cvtpd2pi): Ditto. (sse2_cvttpd2pi): Ditto. (*vec_concatv2sf_sse4_1): Ditto. (*vec_concatv2sf_sse): Ditto. (*vec_concatv2si_sse4_1): Ditto. (*vec_concatv2si): Ditto. (*vec_concatv4si_0): Ditto. (*vec_concatv2di_0): Ditto. --- gcc/config/i386/i386.md | 4 gcc/config/i386/sse.md | 25 - 2 files changed, 24 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 04ec0eeaa57..4cbbd4cf685 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -3683,6 +3683,10 @@ (const_string "avx512bw") ] (const_string "*"))) + (set (attr "mmx_isa") + (if_then_else (eq_attr "alternative" "5,6") + (const_string "native") + (const_string "*"))) (set (attr "type") (cond [(eq_attr "alternative" "0,1,2,4") (const_string "multi") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 97ec3795b82..96d4e5001d8 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -4971,7 +4971,8 @@ "@ %vcvtdq2pd\t{%1, %0|%0, %1} cvtpi2pd\t{%1, %0|%0, %1}" - [(set_attr "type" "ssecvt") + [(set_attr "mmx_isa" "*,native") + (set_attr "type" "ssecvt") (set_attr "unit" "*,mmx") (set_attr "prefix_data16" "*,1") (set_attr "prefix" "maybe_vex,*") @@ -4985,7 +4986,8 @@ "@ * return TARGET_AVX ? \"vcvtpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvtpd2dq\t{%1, %0|%0, %1}\"; cvtpd2pi\t{%1, %0|%0, %1}" - [(set_attr "type" "ssecvt") + [(set_attr "mmx_isa" "*,native") + (set_attr "type" "ssecvt") (set_attr "unit" "*,mmx") (set_attr "amdfam10_decode" "double") (set_attr "athlon_decode" "vector") @@ -5001,7 +5003,8 @@ "@ * return TARGET_AVX ? 
\"vcvttpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvttpd2dq\t{%1, %0|%0, %1}\"; cvttpd2pi\t{%1, %0|%0, %1}" - [(set_attr "type" "ssecvt") + [(set_attr "mmx_isa" "*,native") + (set_attr "type" "ssecvt") (set_attr "unit" "*,mmx") (set_attr "amdfam10_decode" "double") (set_attr "athlon_decode" "vector") @@ -7209,6 +7212,10 @@ (const_string "mmxmov") ] (const_string "sselog"))) + (set (attr "mmx_isa") + (if_then_else (eq_attr "alternative" "7,8") + (const_string "native") + (const_string "*"))) (set (attr "prefix_data16") (if_then_else (eq_attr "alternative" "3,4") (const_string "1") @@ -7244,7 +7251,8 @@ movss\t{%1, %0|%0, %1} punpckldq\t{%2, %0|%0, %2} movd\t{%1, %0|%0, %1}" - [(set_attr "type" "sselog,ssemov,mmxcvt,mmxmov") + [(set_attr "mmx_isa" "*,*,native,native") + (set_attr "type" "sselog,ssemov,mmxcvt,mmxmov") (set_attr "mode" "V4SF,SF,DI,DI")]) (define_insn "*vec_concatv4sf" @@ -14520,6 +14528,10 @@ punpckldq\t{%2, %0|%0, %2} movd\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx,avx512dq,noavx,noavx,avx,*,*,*") + (set (attr "mmx_isa") + (if_then_else (eq_attr "alternative" "8,9") + (const_string "native") + (const_string "*"))) (set (attr "type") (cond [(eq_attr "alternative" "7") (const_string "ssemov") @@ -14557,6 +14569,7 @@ punpckldq\t{%2, %0|%0, %2} movd\t{%1, %0|%0, %1}" [(set_attr "isa" "sse2,sse2,*,*,*,*") + (set_attr "mmx_isa" "*,*,*,*,native,native") (set_attr "type" "sselog,ssemov,sselog,ssemov,mmxcvt,mmxmov") (set_attr "mode" "TI,TI,V4SF,SF,DI,DI")]) @@ -14586,7 +14599,8 @@ "@ %vmovq\t{%1, %0|%0, %1} movq2dq\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + [(set_attr "mmx_isa" "*,native") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex,orig") (set_attr "mode" "TI")]) @@ -14661,6 +14675,7 @@ %vmovq\t{%1, %0|%0, %1} movq2dq\t{%1, %0|%0, %1}" [(set_attr "isa" "x64,*,*") + (set_attr "mmx_isa" "*,*,native") (set_attr "type" "ssemov") (set_attr "prefix_rex" "1,*,*") (set_attr "prefix" "maybe_vex,maybe_vex,orig") -- 2.20.1
[PATCH 35/41] i386: Emulate MMX abs2 with SSE
Emulate MMX abs2 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (abs2): Add SSE emulation. --- gcc/config/i386/sse.md | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index b69a467291c..97ec3795b82 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15973,16 +15973,19 @@ }) (define_insn "abs2" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv") (abs:MMXMODEI - (match_operand:MMXMODEI 1 "nonimmediate_operand" "ym")))] - "TARGET_SSSE3" - "pabs\t{%1, %0|%0, %1}"; - [(set_attr "type" "sselog1") + (match_operand:MMXMODEI 1 "register_mmxmem_operand" "ym,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + pabs\t{%1, %0|%0, %1} + %vpabs\t{%1, %0|%0, %1}" + [(set_attr "mmx_isa" "native,x64") + (set_attr "type" "sselog1") (set_attr "prefix_rep" "0") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI")]) ; ;; -- 2.20.1
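As background for the operation this pattern describes, here is a minimal scalar sketch of a packed absolute value over one 64-bit vector, assuming byte-sized lanes (the MMXMODEI iterator presumably covers other element widths too). The helper name `model_pabsb` is hypothetical and is not part of GCC or of this patch; it only illustrates the per-lane arithmetic.

```c
#include <stdint.h>

/* Scalar model of a packed 8-bit absolute value (hypothetical helper).
   Lane 0 is the least significant byte of V.  */
uint64_t model_pabsb(uint64_t v)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(v >> (8 * lane));
        uint8_t r;
        if (x & 0x80)
            /* Negative lane: negate with 8-bit wraparound, so the most
               negative value 0x80 maps back to 0x80.  */
            r = (uint8_t)(0u - x);
        else
            r = x;
        result |= (uint64_t)r << (8 * lane);
    }
    return result;
}
```

The wraparound case is the one worth remembering: the absolute value of -128 does not fit in a signed byte, so the lane keeps the bit pattern 0x80.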
[PATCH 22/41] i386: Emulate MMX mmx_uavgv8qi3 with SSE
Emulate MMX mmx_uavgv8qi3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_uavgv8qi3): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (*mmx_uavgv8qi3): Add SSE emulation. --- gcc/config/i386/mmx.md | 25 +++-- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 5a342256cbc..8866354dea9 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1679,50 +1679,55 @@ (plus:V8HI (plus:V8HI (zero_extend:V8HI - (match_operand:V8QI 1 "nonimmediate_operand")) + (match_operand:V8QI 1 "register_mmxmem_operand")) (zero_extend:V8HI - (match_operand:V8QI 2 "nonimmediate_operand"))) + (match_operand:V8QI 2 "register_mmxmem_operand"))) (const_vector:V8HI [(const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1)])) (const_int 1] - "TARGET_SSE || TARGET_3DNOW" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" "ix86_fixup_binary_operands_no_copy (PLUS, V8QImode, operands);") (define_insn "*mmx_uavgv8qi3" - [(set (match_operand:V8QI 0 "register_operand" "=y") + [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv") (truncate:V8QI (lshiftrt:V8HI (plus:V8HI (plus:V8HI (zero_extend:V8HI - (match_operand:V8QI 1 "nonimmediate_operand" "%0")) + (match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yv")) (zero_extend:V8HI - (match_operand:V8QI 2 "nonimmediate_operand" "ym"))) + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv"))) (const_vector:V8HI [(const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1) (const_int 1)])) (const_int 1] - "(TARGET_SSE || TARGET_3DNOW) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A) && ix86_binary_operator_ok (PLUS, V8QImode, operands)" { /* These two instructions have the same operation, but their encoding is different. Prefer the one that is de facto standard. 
*/ - if (TARGET_SSE || TARGET_3DNOW_A) + if (TARGET_MMX_WITH_SSE && TARGET_AVX) +return "vpavgb\t{%2, %1, %0|%0, %1, %2}"; + else if (TARGET_SSE || TARGET_3DNOW_A) return "pavgb\t{%2, %0|%0, %2}"; else return "pavgusb\t{%2, %0|%0, %2}"; } - [(set_attr "type" "mmxshft") + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxshft,sseiadd,sseiadd") (set (attr "prefix_extra") (if_then_else (not (ior (match_test "TARGET_SSE") (match_test "TARGET_3DNOW_A"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_uavgv4hi3" [(set (match_operand:V4HI 0 "register_operand") -- 2.20.1
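The RTL in this pattern spells out the rounding average step by step: both inputs are zero-extended to 16 bits, added, incremented by one, shifted right by one, and truncated back to 8 bits. A minimal scalar sketch of the same arithmetic follows; `model_pavgb` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Scalar model of the unsigned rounding byte average described by the
   RTL above (hypothetical helper).  Lane 0 is the least significant
   byte.  */
uint64_t model_pavgb(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        /* Widen first so that x + y + 1 cannot overflow a byte.  */
        unsigned x = (uint8_t)(a >> (8 * lane));
        unsigned y = (uint8_t)(b >> (8 * lane));
        unsigned avg = (x + y + 1) >> 1;
        result |= (uint64_t)(uint8_t)avg << (8 * lane);
    }
    return result;
}
```

The widening matters: averaging 0xFF with 0xFF must give 0xFF, which only works if the intermediate sum is kept in more than 8 bits.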
[PATCH 24/41] i386: Emulate MMX mmx_psadbw with SSE
Emulate MMX mmx_psadbw with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_psadbw): Add SSE emulation. --- gcc/config/i386/mmx.md | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index d647dc28baa..098e41e19c3 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1771,14 +1771,19 @@ (set_attr "mode" "DI,TI,TI")]) (define_insn "mmx_psadbw" - [(set (match_operand:V1DI 0 "register_operand" "=y") -(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0") - (match_operand:V8QI 2 "nonimmediate_operand" "ym")] + [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yv") +(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0,0,Yv") + (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")] UNSPEC_PSADBW))] - "TARGET_SSE || TARGET_3DNOW_A" - "psadbw\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxshft") - (set_attr "mode" "DI")]) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "@ + psadbw\t{%2, %0|%0, %2} + psadbw\t{%2, %0|%0, %2} + vpsadbw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxshft,sseiadd,sseiadd") + (set_attr "mode" "DI,TI,TI")]) (define_insn_and_split "mmx_pmovmskb" [(set (match_operand:SI 0 "register_operand" "=r,r") -- 2.20.1
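The pattern keeps psadbw as an opaque unspec, so the arithmetic is not visible in the RTL. As a rough illustration, assuming the usual definition of the instruction (sum of absolute differences of eight unsigned bytes, zero-extended into the 64-bit result), a scalar model could look like this. `model_psadbw` is a hypothetical helper, not GCC code.

```c
#include <stdint.h>

/* Scalar model of the 64-bit psadbw (hypothetical helper): the sum of
   the absolute differences of the eight unsigned byte lanes.  */
uint64_t model_psadbw(uint64_t a, uint64_t b)
{
    unsigned sum = 0;
    for (int lane = 0; lane < 8; lane++) {
        unsigned x = (uint8_t)(a >> (8 * lane));
        unsigned y = (uint8_t)(b >> (8 * lane));
        sum += x > y ? x - y : y - x;
    }
    /* At most 8 * 255 = 2040, so the value fits in the low 16 bits.  */
    return sum;
}
```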
[PATCH 34/41] i386: Emulate MMX ssse3_palignrdi with SSE
Emulate MMX version of palignrq with SSE version by concatenating 2 64-bit MMX operands into a single 128-bit SSE operand, followed by SSE psrldq. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_palignrdi): Changed to define_insn_and_split to support SSE emulation. --- gcc/config/i386/sse.md | 58 ++ 1 file changed, 48 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 1d90af0a4b0..b69a467291c 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15855,23 +15855,61 @@ (set_attr "prefix" "orig,vex,evex") (set_attr "mode" "")]) -(define_insn "ssse3_palignrdi" - [(set (match_operand:DI 0 "register_operand" "=y") - (unspec:DI [(match_operand:DI 1 "register_operand" "0") - (match_operand:DI 2 "nonimmediate_operand" "ym") - (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n")] +(define_insn_and_split "ssse3_palignrdi" + [(set (match_operand:DI 0 "register_operand" "=y,x,Yv") + (unspec:DI [(match_operand:DI 1 "register_operand" "0,0,Yv") + (match_operand:DI 2 "register_mmxmem_operand" "ym,x,Yv") + (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n,n,n")] UNSPEC_PALIGNR))] - "TARGET_SSSE3" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" { - operands[3] = GEN_INT (INTVAL (operands[3]) / 8); - return "palignr\t{%3, %2, %0|%0, %2, %3}"; + switch (which_alternative) +{ +case 0: + operands[3] = GEN_INT (INTVAL (operands[3]) / 8); + return "palignr\t{%3, %2, %0|%0, %2, %3}"; +case 1: +case 2: + return "#"; +default: + gcc_unreachable (); +} } - [(set_attr "type" "sseishft") + "TARGET_MMX_WITH_SSE && reload_completed" + [(set (match_dup 0) + (lshiftrt:V1TI (match_dup 0) (match_dup 3)))] +{ + /* Emulate MMX palignrdi with SSE psrldq. */ + rtx op0 = lowpart_subreg (V2DImode, operands[0], + GET_MODE (operands[0])); + rtx insn; + if (TARGET_AVX) +insn = gen_vec_concatv2di (op0, operands[2], operands[1]); + else +{ + /* NB: SSE can only concatenate OP0 and OP1 to OP0. 
*/ + insn = gen_vec_concatv2di (op0, operands[1], operands[2]); + emit_insn (insn); + /* Swap bits 0:63 with bits 64:127. */ + rtx mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (4, GEN_INT (2), + GEN_INT (3), + GEN_INT (0), + GEN_INT (1))); + rtx op1 = lowpart_subreg (V4SImode, op0, GET_MODE (op0)); + rtx op2 = gen_rtx_VEC_SELECT (V4SImode, op1, mask); + insn = gen_rtx_SET (op1, op2); +} + emit_insn (insn); + operands[0] = lowpart_subreg (V1TImode, op0, GET_MODE (op0)); +} + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sseishft") (set_attr "atom_unit" "sishuf") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) ;; Mode iterator to handle singularity w/ absence of V2DI and V4DI ;; modes for abs instruction on pre AVX-512 targets. -- 2.20.1
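The emulation described in this patch (concatenate the two 64-bit operands into one 128-bit value, then shift right by whole bytes and keep the low half) can be sketched in plain C. This is only a model under stated assumptions: `model_palignr64` is a hypothetical helper, `hi` stands for operand 1, `lo` for operand 2, and the shift is given in bytes, whereas operand 3 in the RTL is a bit count.

```c
#include <stdint.h>

/* Scalar model of the 64-bit palignr (hypothetical helper):
   concatenate HI:LO into a 128-bit value, shift it right by NBYTES
   bytes, and keep the low 64 bits.  */
uint64_t model_palignr64(uint64_t hi, uint64_t lo, unsigned nbytes)
{
    if (nbytes == 0)
        return lo;
    if (nbytes < 8)
        /* Low part of the result comes from LO, the rest from HI.  */
        return (lo >> (8 * nbytes)) | (hi << (64 - 8 * nbytes));
    if (nbytes == 8)
        return hi;
    if (nbytes < 16)
        /* LO has been shifted out completely.  */
        return hi >> (8 * (nbytes - 8));
    /* Everything has been shifted out.  */
    return 0;
}
```

The special cases for 0 and 8 bytes avoid shifting a 64-bit value by 64, which is undefined in C.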
[PATCH 26/41] i386: Emulate MMX umulv1siv1di3 with SSE2
Emulate MMX umulv1siv1di3 with SSE2. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (sse2_umulv1siv1di3): Add SSE emulation support. (*sse2_umulv1siv1di3): Add SSE2 emulation. --- gcc/config/i386/mmx.md | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index b06f0af984a..f27513f7f2c 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -905,30 +905,36 @@ (mult:V1DI (zero_extend:V1DI (vec_select:V1SI - (match_operand:V2SI 1 "nonimmediate_operand") + (match_operand:V2SI 1 "register_mmxmem_operand") (parallel [(const_int 0)]))) (zero_extend:V1DI (vec_select:V1SI - (match_operand:V2SI 2 "nonimmediate_operand") + (match_operand:V2SI 2 "register_mmxmem_operand") (parallel [(const_int 0)])] - "TARGET_SSE2" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE2" "ix86_fixup_binary_operands_no_copy (MULT, V2SImode, operands);") (define_insn "*sse2_umulv1siv1di3" - [(set (match_operand:V1DI 0 "register_operand" "=y") + [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yv") (mult:V1DI (zero_extend:V1DI (vec_select:V1SI - (match_operand:V2SI 1 "nonimmediate_operand" "%0") + (match_operand:V2SI 1 "register_mmxmem_operand" "%0,0,Yv") (parallel [(const_int 0)]))) (zero_extend:V1DI (vec_select:V1SI - (match_operand:V2SI 2 "nonimmediate_operand" "ym") + (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv") (parallel [(const_int 0)])] - "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V2SImode, operands)" - "pmuludq\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxmul") - (set_attr "mode" "DI")]) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && TARGET_SSE2 + && ix86_binary_operator_ok (MULT, V2SImode, operands)" + "@ + pmuludq\t{%2, %0|%0, %2} + pmuludq\t{%2, %0|%0, %2} + vpmuludq\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxmul,ssemul,ssemul") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_v4hi3" [(set 
(match_operand:V4HI 0 "register_operand") -- 2.20.1
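The RTL for this pattern selects element 0 of each V2SI operand, zero-extends both to 64 bits, and multiplies them. A minimal scalar sketch of that computation, with the hypothetical helper name `model_pmuludq64` and element 0 taken to be the low 32 bits:

```c
#include <stdint.h>

/* Scalar model of the V1DI unsigned widening multiply (hypothetical
   helper): multiply the low 32-bit elements of A and B as unsigned
   values and return the full 64-bit product.  */
uint64_t model_pmuludq64(uint64_t a, uint64_t b)
{
    uint32_t a0 = (uint32_t)a;   /* element 0 of the first operand  */
    uint32_t b0 = (uint32_t)b;   /* element 0 of the second operand */
    return (uint64_t)a0 * b0;    /* widen before multiplying        */
}
```

The upper 32 bits of each input are ignored, which is why the pattern uses vec_select before the zero_extend.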
[PATCH 33/41] i386: Emulate MMX ssse3_psign3 with SSE
Emulate MMX ssse3_psign3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_psign3): Add SSE emulation. --- gcc/config/i386/sse.md | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 79b35d95424..1d90af0a4b0 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15786,17 +15786,21 @@ (set_attr "mode" "")]) (define_insn "ssse3_psign3" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (unspec:MMXMODEI - [(match_operand:MMXMODEI 1 "register_operand" "0") - (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")] + [(match_operand:MMXMODEI 1 "register_operand" "0,0,Yv") + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")] UNSPEC_PSIGN))] - "TARGET_SSSE3" - "psign\t{%2, %0|%0, %2}"; - [(set_attr "type" "sselog1") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" + "@ + psign\t{%2, %0|%0, %2} + psign\t{%2, %0|%0, %2} + vpsign\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "_palignr_mask" [(set (match_operand:VI1_AVX512 0 "register_operand" "=v") -- 2.20.1
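Since psign is also an unspec, here is a small scalar sketch of what the operation computes, assuming byte-sized lanes and the usual definition: each lane of the first operand is negated, zeroed, or kept depending on whether the matching lane of the second operand is negative, zero, or positive. `model_psignb` is a hypothetical helper for illustration only.

```c
#include <stdint.h>

/* Scalar model of a packed 8-bit psign (hypothetical helper).
   Lane 0 is the least significant byte.  */
uint64_t model_psignb(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t control = (uint8_t)(b >> (8 * lane));
        uint8_t r;
        if (control == 0)
            r = 0;                    /* zero control lane clears x */
        else if (control & 0x80)
            r = (uint8_t)(0u - x);    /* negative: negate, wrapping */
        else
            r = x;                    /* positive: pass through     */
        result |= (uint64_t)r << (8 * lane);
    }
    return result;
}
```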
Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)
On 2/16/19 12:12 AM, Jakub Jelinek wrote: Hi! Both the C and C++ standard guarantee that the argc argument to main is non-negative, the following patch sets (or adjusts) the corresponding SSA_NAME_RANGE_INFO. While main is just one, with IPA VRP it can also propagate etc. I had to change one testcase because it started optimizing it better (the test has been folded away), so no sinking was done. If/when this goes in it might make sense to also set argv and argv[0] to nonnull. Martin Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2019-02-16 Jakub Jelinek PR tree-optimization/89350 * gimple-ssa-evrp.c: Include tree-dfa.h and langhooks.h. (maybe_set_main_argc_range): New function. (execute_early_vrp): Call it. * gcc.dg/tree-ssa/vrp122.c: New test. * gcc.dg/tree-ssa/ssa-sink-3.c (main): Rename to ... (bar): ... this. --- gcc/gimple-ssa-evrp.c.jj2019-01-01 12:37:15.712998659 +0100 +++ gcc/gimple-ssa-evrp.c 2019-02-15 09:49:56.768534668 +0100 @@ -41,6 +41,8 @@ along with GCC; see the file COPYING3. #include "tree-cfgcleanup.h" #include "vr-values.h" #include "gimple-ssa-evrp-analyze.h" +#include "tree-dfa.h" +#include "langhooks.h" class evrp_folder : public substitute_and_fold_engine { @@ -291,6 +293,39 @@ evrp_dom_walker::cleanup (void) evrp_folder.vr_values->cleanup_edges_and_switches (); } +/* argc in main in C/C++ is guaranteed to be non-negative. Adjust the + range info for it. 
*/ + +static void +maybe_set_main_argc_range (void) +{ + if (!DECL_ARGUMENTS (current_function_decl) + || !(lang_GNU_C () || lang_GNU_CXX () || lang_GNU_OBJC ())) +return; + + tree argc = DECL_ARGUMENTS (current_function_decl); + if (TYPE_MAIN_VARIANT (TREE_TYPE (argc)) != integer_type_node) +return; + + argc = ssa_default_def (cfun, argc); + if (argc == NULL_TREE) +return; + + wide_int min, max; + value_range_kind kind = get_range_info (argc, &min, &max); + if (kind == VR_VARYING) +{ + min = wi::zero (TYPE_PRECISION (integer_type_node)); + max = wi::to_wide (TYPE_MAX_VALUE (integer_type_node)); +} + else if (kind == VR_RANGE && wi::neg_p (min) && !wi::neg_p (max)) +min = wi::zero (TYPE_PRECISION (integer_type_node)); + else +return; + + set_range_info (argc, VR_RANGE, min, max); +} + /* Main entry point for the early vrp pass which is a simplified non-iterative version of vrp where basic blocks are visited in dominance order. Value ranges discovered in early vrp will also be used by ipa-vrp. */ @@ -307,6 +342,10 @@ execute_early_vrp () scev_initialize (); calculate_dominance_info (CDI_DOMINATORS); + /* argc in main in C/C++ is guaranteed to be non-negative. */ + if (MAIN_NAME_P (DECL_NAME (current_function_decl))) +maybe_set_main_argc_range (); + /* Walk stmts in dominance order and propagate VRP. 
*/ evrp_dom_walker walker; walker.walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); --- gcc/testsuite/gcc.dg/tree-ssa/vrp122.c.jj 2019-02-15 09:54:07.016357759 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/vrp122.c 2019-02-15 09:53:59.299486561 +0100 @@ -0,0 +1,14 @@ +/* PR tree-optimization/89350 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump-not "link_error \\\(" "optimized" } } */ + +extern void link_error (void); + +int +main (int argc, const char *argv[]) +{ + if (argc < 0) +link_error (); + return 0; +} --- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c.jj 2015-05-29 15:03:44.947546711 +0200 +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c 2019-02-16 08:04:29.951126611 +0100 @@ -2,7 +2,7 @@ /* { dg-options "-O2 -fdump-tree-sink-stats" } */ extern void foo(int a); int -main (int argc) +bar (int argc) { int a; a = argc + 1; Jakub
Re: [PATCH] document __has_attribute and __has_include
On 2/15/19 8:30 PM, Sandra Loosemore wrote: On 2/13/19 2:46 PM, Martin Sebor wrote: The attached patch adds documentation for the __has_attribute (and __has_cpp_attribute) and __has_include operators added in r215752. Thanks! I was a little unsure where to add this, whether the preprocessor manual or the GCC manual, or both. It seems that it belongs in the preprocessor manual but since more users read the GCC manual, it's likely to be overlooked there. I think the preprocessor manual is the right place. A while back I brought up the idea of consolidating the preprocessor docs into the GCC manual but the consensus seemed to be for retaining a separate preprocessor manual. My comments on this patch are mostly trivial markup things. @@ -3422,6 +3425,99 @@ condition succeeds after the original @samp{#if} a @samp{#else} is allowed after any number of @samp{#elif} directives, but @samp{#elif} may not follow @samp{#else}. +@node __has_attribute +@subsection __has_attribute Please use @code markup in the @subsection. Done. I also changed @node and the corresponding menu item. I wasn't sure what the convention here was: the other subsections (like If and Elif) use capitalization and no @code even their names are keywords. Should they be changed as well? +@cindex @code{__has_attribute} + +The special operator @code{__has_attribute (operand)} may be used in @code{__has_attribute (@var{operand})} Done. +@samp{#if} and @samp{#elif} expressions to test whether the attribute Another question: should these use @code instead? (Again, I'm not entirely sure what the convention is in the CPP manual. It seems consistent in using @samp for directives like #if but then it uses @code for bigger snippets like @code{#if 0} or @code{#pragma GCC poison} where (IIUC) the TexInfo manual suggests @samp might be preferable). +referenced by its argument is recognized by GCC. Using the operator +in other contexts is not valid. In C code, @var{operand} must be +a valid identifier. 
In C++ code, @var{operand} may be optionally +introduced by the @code{attribute-scope::} prefix. I think "attribute-scope" is not a literal part of the prefix, so @code{@var{attribute-scope}::} +The @code{attribute-scope} prefix identifies the ``namespace'' within And @var markup here, too. +which the attribute is recognized. The scope of GCC attributes is +@samp{gnu} or @samp{__gnu__}. The operator by itself, without any The @code{__has_attribute} operator by itself Done. +@var{operand} or parentheses, acts as a predefined macro so that support +for it can be tested in portable code. Thus, the recommended use of +the operator is as follows: + +@smallexample +#if defined __has_attribute +# if __has_attribute (nonnull) +# define ATTR_NONNULL __attribute__ ((nonnull)) +# endif +#endif +@end smallexample + +The first @samp{#if} test succeeds only when the operator is supported +by the version of GCC (or another compiler) being used. Only when that +test succeeds is it valid to use @code{__has_attribute} as a preprocessor +operator. As a result, combining the two tests into a single expression as +shown below would only be valid with a compiler that supports the operator +but not with others that don't. + +@smallexample +#if defined __has_attribute && __has_attribute (nonnull) /* not portable */ +@dots{} +#endif +@end smallexample + +@node __has_cpp_attribute +@subsection __has_cpp_attribute @code markup in the @subsection title, again. +@cindex @code{__has_cpp_attribute} + +The special operator @code{__has_cpp_attribute (operand)} may be used @var{operand} markup again. Done. +in @samp{#if} and @samp{#elif} expressions in C++ code to test whether +the attribute referenced by its argument is recognized by GCC. +@code{__has_cpp_attribute (operand)} is equivalent to +@code{__has_attribute (operand)} except that when @code{operand} The 3 instances above too. Done. 
+designates a supported standard attribute it evaluates to an integer +constant of the form @code{MM} indicating the year and month when +the attribute was first introduced into the C++ standard. For additional +information including the dates of the introduction of current standard +attributes, see @w{@uref{https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations/, +SD-6: SG10 Feature Test Recommendations}}. + +@node __has_include +@subsection __has_include @code markup in title again Done. +@cindex @code{__has_include} + > +The special operator @code{__has_include (operand)} may be used in @samp{#if} @var{operand} Done. +and @samp{#elif} expressions to test whether the header referenced by its +@var{operand} can be included using the @samp{#include} directive. Using +the operator in other contexts is not valid. The @var{operand} takes +the same form as the file in the @samp{#include} directive (@xref{Include +Syntax}) and evaluates to a nonzero
[PATCH 39/41] i386: Allow MMX intrinsic emulation with SSE
Allow MMX intrinsic emulation with SSE/SSE2/SSSE3. Don't enable MMX ISA by default with TARGET_MMX_WITH_SSE. For pr82483-1.c and pr82483-2.c, "-mssse3 -mno-mmx" compiles in 64-bit mode since MMX intrinsics can be emulated with SSE. gcc/ PR target/89021 * config/i386/i386-builtin.def: Enable MMX intrinsics with SSE/SSE2/SSSE3. * config/i386/i386.c (ix86_init_mmx_sse_builtins): Likewise. (ix86_expand_builtin): Allow SSE/SSE2/SSSE3 to emulate MMX intrinsics with TARGET_MMX_WITH_SSE. * config/i386/mmintrin.h: Only require SSE2 if __MMX_WITH_SSE__ is defined. gcc/testsuite/ PR target/89021 * gcc.target/i386/pr82483-1.c: Error only on ia32. * gcc.target/i386/pr82483-2.c: Likewise. --- gcc/config/i386/i386-builtin.def | 126 +++--- gcc/config/i386/i386.c| 29 - gcc/config/i386/mmintrin.h| 12 ++- gcc/testsuite/gcc.target/i386/pr82483-1.c | 2 +- gcc/testsuite/gcc.target/i386/pr82483-2.c | 2 +- 5 files changed, 101 insertions(+), 70 deletions(-) diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 88005f4687f..10a9d631f29 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -100,7 +100,7 @@ BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", IX86_BUILTIN_FNSTSW, UNKN BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID) /* MMX */ -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID) +BDESC (OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID) /* 3DNow! 
*/ BDESC (OPTION_MASK_ISA_3DNOW, 0, CODE_FOR_mmx_femms, "__builtin_ia32_femms", IX86_BUILTIN_FEMMS, UNKNOWN, (int) VOID_FTYPE_VOID) @@ -442,68 +442,68 @@ BDESC (0, 0, CODE_FOR_rotrqi3, "__builtin_ia32_rorqi", IX86_BUILTIN_RORQI, UNKNO BDESC (0, 0, CODE_FOR_rotrhi3, "__builtin_ia32_rorhi", IX86_BUILTIN_RORHI, UNKNOWN, (int) UINT16_FTYPE_UINT16_INT) /* MMX */ -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv8qi3, "__builtin_ia32_paddb", IX86_BUILTIN_PADDB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv4hi3, "__builtin_ia32_paddw", IX86_BUILTIN_PADDW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv2si3, "__builtin_ia32_paddd", IX86_BUILTIN_PADDD, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv8qi3, "__builtin_ia32_psubb", IX86_BUILTIN_PSUBB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv4hi3, "__builtin_ia32_psubw", IX86_BUILTIN_PSUBW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv2si3, "__builtin_ia32_psubd", IX86_BUILTIN_PSUBD, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI) - -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv8qi3, "__builtin_ia32_paddsb", IX86_BUILTIN_PADDSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv4hi3, "__builtin_ia32_paddsw", IX86_BUILTIN_PADDSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv8qi3, "__builtin_ia32_psubsb", IX86_BUILTIN_PSUBSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv4hi3, "__builtin_ia32_psubsw", IX86_BUILTIN_PSUBSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv8qi3, "__builtin_ia32_paddusb", IX86_BUILTIN_PADDUSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv4hi3, "__builtin_ia32_paddusw", IX86_BUILTIN_PADDUSW, UNKNOWN, (int) 
V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv8qi3, "__builtin_ia32_psubusb", IX86_BUILTIN_PSUBUSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv4hi3, "__builtin_ia32_psubusw", IX86_BUILTIN_PSUBUSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) - -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_mulv4hi3, "__builtin_ia32_pmullw", IX86_BUILTIN_PMULLW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_smulv4hi3_highpart, "__builtin_ia32_pmulhw", IX86_BUILTIN_PMULHW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI) - -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_andv2si3, "__builtin_ia32_pand", IX86_BUILTIN_PAND, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_andnotv2si3, "__builtin_ia32_pandn", IX86_BUILTIN_PANDN, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_iorv2si3, "__builtin_ia32_por", IX86_BUILTIN_POR, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI) -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_xorv2si3, "__builtin_ia32_pxor", IX86_BUILTIN_PXOR, UNKNOWN, (int)
[PATCH 27/41] i386: Make _mm_empty () as NOP without MMX
With SSE emulation of MMX intrinsics, we should make _mm_empty () as NOP without MMX. PR target/89021 * config/i386/mmx.md (mmx_): Renamed to ... (*mmx_): This. (mmx_): New expander. --- gcc/config/i386/mmx.md | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index f27513f7f2c..c48d42c7d59 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1849,7 +1849,35 @@ [(UNSPECV_EMMS "emms") (UNSPECV_FEMMS "femms")]) -(define_insn "mmx_" +(define_expand "mmx_" + [(parallel +[(unspec_volatile [(const_int 0)] EMMS) + (clobber (reg:XF ST0_REG)) + (clobber (reg:XF ST1_REG)) + (clobber (reg:XF ST2_REG)) + (clobber (reg:XF ST3_REG)) + (clobber (reg:XF ST4_REG)) + (clobber (reg:XF ST5_REG)) + (clobber (reg:XF ST6_REG)) + (clobber (reg:XF ST7_REG)) + (clobber (reg:DI MM0_REG)) + (clobber (reg:DI MM1_REG)) + (clobber (reg:DI MM2_REG)) + (clobber (reg:DI MM3_REG)) + (clobber (reg:DI MM4_REG)) + (clobber (reg:DI MM5_REG)) + (clobber (reg:DI MM6_REG)) + (clobber (reg:DI MM7_REG))])] + "TARGET_MMX || TARGET_MMX_WITH_SSE" +{ + if (!TARGET_MMX) + { + emit_insn (gen_nop ()); + DONE; + } +}) + +(define_insn "*mmx_" [(unspec_volatile [(const_int 0)] EMMS) (clobber (reg:XF ST0_REG)) (clobber (reg:XF ST1_REG)) -- 2.20.1
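For context, this is the kind of user code that calls `_mm_empty ()`: a short MMX intrinsic sequence followed by a state reset before returning to ordinary code. The sketch below is an assumption-laden illustration, not a test from the patch. The function name `add_packed_bytes`, the `__MMX__` guard, and the scalar fallback are all choices made here so that the example also builds where `<mmintrin.h>` is unavailable.

```c
#include <stdint.h>
#include <string.h>

#ifdef __MMX__
#include <mmintrin.h>
#endif

/* Add eight packed bytes with wraparound (hypothetical helper).  */
uint64_t add_packed_bytes(uint64_t a, uint64_t b)
{
#ifdef __MMX__
    __m64 va, vb, vr;
    uint64_t r;
    /* Move the integer bit patterns into __m64 values.  */
    memcpy(&va, &a, sizeof va);
    memcpy(&vb, &b, sizeof vb);
    vr = _mm_add_pi8(va, vb);
    memcpy(&r, &vr, sizeof r);
    /* Reset MMX state before returning.  With native MMX this is an
       emms; under the SSE emulation discussed above the intent is
       that it costs nothing.  */
    _mm_empty();
    return r;
#else
    /* Portable fallback with the same per-byte wraparound.  */
    uint64_t r = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        r |= (uint64_t)(uint8_t)(x + y) << (8 * lane);
    }
    return r;
#endif
}
```

Source written this way does not need to change when the intrinsics are emulated, which is presumably why the expander keeps `_mm_empty ()` valid rather than rejecting it.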
[PATCH 19/41] i386: Emulate MMX mmx_pmovmskb with SSE
Emulate MMX mmx_pmovmskb with SSE by zero-extending result of SSE pmovmskb from QImode to SImode. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_pmovmskb): Changed to define_insn_and_split to support SSE emulation. --- gcc/config/i386/mmx.md | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index edfb8623701..5ae04de205d 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1763,14 +1763,30 @@ [(set_attr "type" "mmxshft") (set_attr "mode" "DI")]) -(define_insn "mmx_pmovmskb" - [(set (match_operand:SI 0 "register_operand" "=r") - (unspec:SI [(match_operand:V8QI 1 "register_operand" "y")] +(define_insn_and_split "mmx_pmovmskb" + [(set (match_operand:SI 0 "register_operand" "=r,r") + (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")] UNSPEC_MOVMSK))] - "TARGET_SSE || TARGET_3DNOW_A" - "pmovmskb\t{%1, %0|%0, %1}" - [(set_attr "type" "mmxcvt") - (set_attr "mode" "DI")]) + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "@ + pmovmskb\t{%1, %0|%0, %1} + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(set (match_dup 0) +(unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)) + (set (match_dup 0) + (zero_extend:SI (match_dup 2)))] +{ + /* Generate SSE pmovmskb and zero-extend from QImode to SImode. */ + operands[1] = lowpart_subreg (V16QImode, operands[1], + GET_MODE (operands[1])); + operands[2] = lowpart_subreg (QImode, operands[0], + GET_MODE (operands[0])); +} + [(set_attr "mmx_isa" "native,x64") + (set_attr "type" "mmxcvt,ssemov") + (set_attr "mode" "DI,TI")]) (define_expand "mmx_maskmovq" [(set (match_operand:V8QI 0 "memory_operand") -- 2.20.1
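The zero-extension step described in this patch can be illustrated with a scalar model. This is only a sketch in portable C, not GCC code: the function names are hypothetical, and it assumes the 64-bit MMX value sits in the low 8 bytes of the 128-bit register while the upper 8 bytes hold arbitrary data.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of SSE2 pmovmskb on a 128-bit register: bit i of the
   result is the sign bit of byte i.  */
unsigned int
model_sse_pmovmskb (const uint8_t v[16])
{
  unsigned int mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (unsigned int) (v[i] >> 7) << i;
  return mask;
}

/* Model of the emulated MMX pmovmskb: run the 128-bit form, then keep
   only the low 8 bits (the QImode low part, zero-extended), so that
   bytes 8..15 of the register cannot leak into the result.  */
unsigned int
model_mmx_pmovmskb_via_sse (const uint8_t xmm[16])
{
  return model_sse_pmovmskb (xmm) & 0xffu;
}

/* Small self-check: the upper 8 bytes are all 0xff on purpose.  */
void
check_pmovmskb_model (void)
{
  const uint8_t x[16] = { 0x80, 0, 0xff, 0, 0, 0, 0, 0x80,
                          0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
  assert (model_sse_pmovmskb (x) == 0xff85u);
  assert (model_mmx_pmovmskb_via_sse (x) == 0x85u);
}
```

Without the final masking step, the upper eight mask bits would reflect whatever happens to be in the high half of the XMM register, which is why the split adds the explicit zero_extend.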
[PATCH 13/41] i386: Emulate MMX pshufw with SSE
Emulate MMX pshufw with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_pshufw): Also check TARGET_MMX and TARGET_MMX_WITH_SSE. (mmx_pshufw_1): Add SSE emulation. (*vec_dupv4hi): Changed to define_insn_and_split and also allow TARGET_MMX_WITH_SSE to support SSE emulation. --- gcc/config/i386/mmx.md | 81 +- 1 file changed, 65 insertions(+), 16 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index b441f36dfc6..09e78ac5f74 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1323,9 +1323,10 @@ (define_expand "mmx_pshufw" [(match_operand:V4HI 0 "register_operand") - (match_operand:V4HI 1 "nonimmediate_operand") + (match_operand:V4HI 1 "register_mmxmem_operand") (match_operand:SI 2 "const_int_operand")] - "TARGET_SSE || TARGET_3DNOW_A" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" { int mask = INTVAL (operands[2]); emit_insn (gen_mmx_pshufw_1 (operands[0], operands[1], @@ -1337,14 +1338,15 @@ }) (define_insn "mmx_pshufw_1" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,Yv") (vec_select:V4HI - (match_operand:V4HI 1 "nonimmediate_operand" "ym") + (match_operand:V4HI 1 "register_mmxmem_operand" "ym,Yv") (parallel [(match_operand 2 "const_0_to_3_operand") (match_operand 3 "const_0_to_3_operand") (match_operand 4 "const_0_to_3_operand") (match_operand 5 "const_0_to_3_operand")])))] - "TARGET_SSE || TARGET_3DNOW_A" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" { int mask = 0; mask |= INTVAL (operands[2]) << 0; @@ -1353,11 +1355,20 @@ mask |= INTVAL (operands[5]) << 6; operands[2] = GEN_INT (mask); - return "pshufw\t{%2, %1, %0|%0, %1, %2}"; + switch (which_alternative) +{ +case 0: + return "pshufw\t{%2, %1, %0|%0, %1, %2}"; +case 1: + return "%vpshuflw\t{%2, %1, %0|%0, %1, %2}"; +default: + gcc_unreachable (); +} } - [(set_attr "type" "mmxcvt") + [(set_attr "mmx_isa" 
"native,x64") + (set_attr "type" "mmxcvt,sselog") (set_attr "length_immediate" "1") - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI")]) (define_insn "mmx_pswapdv2si2" [(set (match_operand:V2SI 0 "register_operand" "=y") @@ -1370,16 +1381,54 @@ (set_attr "prefix_extra" "1") (set_attr "mode" "DI")]) -(define_insn "*vec_dupv4hi" - [(set (match_operand:V4HI 0 "register_operand" "=y") +(define_insn_and_split "*vec_dupv4hi" + [(set (match_operand:V4HI 0 "register_operand" "=y,Yv,Yw") (vec_duplicate:V4HI (truncate:HI - (match_operand:SI 1 "register_operand" "0"] - "TARGET_SSE || TARGET_3DNOW_A" - "pshufw\t{$0, %0, %0|%0, %0, 0}" - [(set_attr "type" "mmxcvt") - (set_attr "length_immediate" "1") - (set_attr "mode" "DI")]) + (match_operand:SI 1 "register_operand" "0,Yv,r"] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "@ + pshufw\t{$0, %0, %0|%0, %0, 0} + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(const_int 0)] +{ + rtx op; + operands[0] = lowpart_subreg (V8HImode, operands[0], + GET_MODE (operands[0])); + if (TARGET_AVX2) +{ + operands[1] = lowpart_subreg (HImode, operands[1], + GET_MODE (operands[1])); + op = gen_rtx_VEC_DUPLICATE (V8HImode, operands[1]); +} + else +{ + operands[1] = lowpart_subreg (V8HImode, operands[1], + GET_MODE (operands[1])); + rtx mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (8, + GEN_INT (0), + GEN_INT (0), + GEN_INT (0), + GEN_INT (0), + GEN_INT (4), + GEN_INT (5), + GEN_INT (6), + GEN_INT (7))); + + op = gen_rtx_VEC_SELECT (V8HImode, operands[1], mask); +} + rtx insn = gen_rtx_SET (operands[0], op); + emit_insn (insn); + DONE; +} + [(set_attr "mmx_isa" "native,x64,x64_avx") + (set_attr "type" "mmxcvt,sselog1,ssemov") + (set_attr "length_immediate" "1,1,0") + (set_attr "mode" "DI,TI,TI")]) (define_insn_and_split "*vec_dupv2si" [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv,Yw") -- 2.20.1
[PATCH 31/41] i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE
Emulate MMX ssse3_pmulhrswv4hi3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (ssse3_pmulhrswv4hi3): Require TARGET_MMX or TARGET_MMX_WITH_SSE. (*ssse3_pmulhrswv4hi3): Add SSE emulation. --- gcc/config/i386/sse.md | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index e8d9bec9766..b08a577d1e4 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15670,38 +15670,44 @@ (lshiftrt:V4SI (mult:V4SI (sign_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand")) + (match_operand:V4HI 1 "register_mmxmem_operand")) (sign_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand"))) + (match_operand:V4HI 2 "register_mmxmem_operand"))) (const_int 14)) (match_dup 3)) (const_int 1] - "TARGET_SSSE3" + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" { operands[3] = CONST1_RTX(V4HImode); ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands); }) (define_insn "*ssse3_pmulhrswv4hi3" - [(set (match_operand:V4HI 0 "register_operand" "=y") + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") (truncate:V4HI (lshiftrt:V4SI (plus:V4SI (lshiftrt:V4SI (mult:V4SI (sign_extend:V4SI - (match_operand:V4HI 1 "nonimmediate_operand" "%0")) + (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")) (sign_extend:V4SI - (match_operand:V4HI 2 "nonimmediate_operand" "ym"))) + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv"))) (const_int 14)) (match_operand:V4HI 3 "const1_operand")) (const_int 1] - "TARGET_SSSE3 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" - "pmulhrsw\t{%2, %0|%0, %2}" - [(set_attr "type" "sseimul") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && TARGET_SSSE3 + && !(MEM_P (operands[1]) && MEM_P (operands[2]))" + "@ + pmulhrsw\t{%2, %0|%0, %2} + pmulhrsw\t{%2, %0|%0, %2} + vpmulhrsw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set 
(attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) (define_insn "_pshufb3" [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x,v") -- 2.20.1
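For readers unfamiliar with the RTL above, the per-element arithmetic it encodes (multiply, shift right by 14, add 1, shift right by 1, truncate) can be written as a scalar model. This is a sketch with hypothetical function names, not code from the patch. It uses unsigned arithmetic for the shifts, which matches the logical shifts in the pattern, and it relies on the usual wrap-around behavior when converting an out-of-range value to int16_t.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of pmulhrsw: rounded high part of a signed 16x16 multiply.  */
int16_t
model_pmulhrsw_elem (int16_t a, int16_t b)
{
  /* The signed product fits in 32 bits (at most 2^30 in magnitude).  */
  uint32_t prod = (uint32_t) ((int32_t) a * (int32_t) b);
  /* Shift, round, shift, then truncate to the low 16 bits.  */
  return (int16_t) (uint16_t) (((prod >> 14) + 1) >> 1);
}

/* Four lanes, as in the V4HI (MMX-sized) pattern.  The SSE form does
   the same work on eight lanes; only the low four matter here.  */
void
model_pmulhrsw_v4hi (int16_t dst[4], const int16_t a[4], const int16_t b[4])
{
  for (int i = 0; i < 4; i++)
    dst[i] = model_pmulhrsw_elem (a[i], b[i]);
}

/* Small self-check with hand-computed values.  */
void
check_pmulhrsw_model (void)
{
  assert (model_pmulhrsw_elem (16384, 16384) == 8192);
  assert (model_pmulhrsw_elem (1, 1) == 0);
  assert (model_pmulhrsw_elem (-1, 16384) == 0);
  assert (model_pmulhrsw_elem (-32768, -32768) == -32768);
}
```

Because each lane is computed independently, running the 128-bit instruction on registers whose low 64 bits hold the MMX operands yields the MMX result in the low 64 bits, whatever the upper lanes contain.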
[PATCH 14/41] i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE. PR target/89021 * config/i386/sse.md (sse_cvtps2pi): Add SSE emulation. (sse_cvttps2pi): Likewise. --- gcc/config/i386/sse.md | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 30bf7e23122..dd3a3d9ba67 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -4582,26 +4582,32 @@ (set_attr "mode" "V4SF")]) (define_insn "sse_cvtps2pi" - [(set (match_operand:V2SI 0 "register_operand" "=y") + [(set (match_operand:V2SI 0 "register_operand" "=y,Yv") (vec_select:V2SI - (unspec:V4SI [(match_operand:V4SF 1 "nonimmediate_operand" "xm")] + (unspec:V4SI [(match_operand:V4SF 1 "register_mmxmem_operand" "xm,YvBm")] UNSPEC_FIX_NOTRUNC) (parallel [(const_int 0) (const_int 1)])))] - "TARGET_SSE" - "cvtps2pi\t{%1, %0|%0, %q1}" - [(set_attr "type" "ssecvt") - (set_attr "unit" "mmx") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE" + "@ + cvtps2pi\t{%1, %0|%0, %q1} + %vcvtps2dq\t{%1, %0|%0, %1}" + [(set_attr "mmx_isa" "native,x64") + (set_attr "type" "ssecvt") + (set_attr "unit" "mmx,*") (set_attr "mode" "DI")]) (define_insn "sse_cvttps2pi" - [(set (match_operand:V2SI 0 "register_operand" "=y") + [(set (match_operand:V2SI 0 "register_operand" "=y,Yv") (vec_select:V2SI - (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm")) + (fix:V4SI (match_operand:V4SF 1 "register_mmxmem_operand" "xm,YvBm")) (parallel [(const_int 0) (const_int 1)])))] - "TARGET_SSE" - "cvttps2pi\t{%1, %0|%0, %q1}" - [(set_attr "type" "ssecvt") - (set_attr "unit" "mmx") + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE" + "@ + cvttps2pi\t{%1, %0|%0, %q1} + %vcvttps2dq\t{%1, %0|%0, %1}" + [(set_attr "mmx_isa" "native,x64") + (set_attr "type" "ssecvt") + (set_attr "unit" "mmx,*") (set_attr "prefix_rep" "0") (set_attr "mode" "SF")]) -- 2.20.1
[PATCH 16/41] i386: Emulate MMX mmx_pextrw with SSE
Emulate MMX mmx_pextrw with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_pextrw): Add SSE emulation. --- gcc/config/i386/mmx.md | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 09e78ac5f74..28725f48282 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1310,16 +1310,20 @@ (set_attr "mode" "DI")]) (define_insn "mmx_pextrw" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,r") (zero_extend:SI (vec_select:HI - (match_operand:V4HI 1 "register_operand" "y") - (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]] - "TARGET_SSE || TARGET_3DNOW_A" - "pextrw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "mmxcvt") + (match_operand:V4HI 1 "register_operand" "y,Yv") + (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n")]] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && (TARGET_SSE || TARGET_3DNOW_A)" + "@ + pextrw\t{%2, %1, %0|%0, %1, %2} + %vpextrw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64") + (set_attr "type" "mmxcvt,sselog1") (set_attr "length_immediate" "1") - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI")]) (define_expand "mmx_pshufw" [(match_operand:V4HI 0 "register_operand") -- 2.20.1
[PATCH 05/41] i386: Emulate MMX mulv4hi3 with SSE
Emulate MMX mulv4hi3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_mulv4hi3): Also allow TARGET_MMX_WITH_SSE. (mulv4hi3): New. (*mmx_mulv4hi3): Also allow TARGET_MMX_WITH_SSE. Add SSE support. --- gcc/config/i386/mmx.md | 32 ++-- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 587e31b299e..fd0189eae60 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -716,19 +716,31 @@ (define_expand "mmx_mulv4hi3" [(set (match_operand:V4HI 0 "register_operand") -(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand") - (match_operand:V4HI 2 "nonimmediate_operand")))] - "TARGET_MMX" +(mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand") + (match_operand:V4HI 2 "register_mmxmem_operand")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);") + +(define_expand "mulv4hi3" + [(set (match_operand:V4HI 0 "register_operand") +(mult:V4HI (match_operand:V4HI 1 "register_operand") + (match_operand:V4HI 2 "register_operand")))] + "TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);") (define_insn "*mmx_mulv4hi3" - [(set (match_operand:V4HI 0 "register_operand" "=y") -(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand" "%0") - (match_operand:V4HI 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)" - "pmullw\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxmul") - (set_attr "mode" "DI")]) + [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv") +(mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv") + (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && ix86_binary_operator_ok (MULT, V4HImode, operands)" + "@ + pmullw\t{%2, %0|%0, %2} + pmullw\t{%2, %0|%0, %2} + vpmullw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + 
(set_attr "type" "mmxmul,ssemul,ssemul") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_smulv4hi3_highpart" [(set (match_operand:V4HI 0 "register_operand") -- 2.20.1
[PATCH 15/41] i386: Emulate MMX sse_cvtpi2ps with SSE
Emulate MMX sse_cvtpi2ps with SSE2 cvtdq2ps, preserving upper 64 bits of destination XMM register. Only SSE register source operand is allowed. PR target/89021 * config/i386/sse.md (sse_cvtpi2ps): Changed to define_insn_and_split. Also allow TARGET_MMX_WITH_SSE. Add SSE emulation. --- gcc/config/i386/sse.md | 64 -- 1 file changed, 56 insertions(+), 8 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index dd3a3d9ba67..3135ce4eace 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -4569,16 +4569,64 @@ ;; ; -(define_insn "sse_cvtpi2ps" - [(set (match_operand:V4SF 0 "register_operand" "=x") +(define_insn_and_split "sse_cvtpi2ps" + [(set (match_operand:V4SF 0 "register_operand" "=x,x,Yv") (vec_merge:V4SF (vec_duplicate:V4SF - (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand" "ym"))) - (match_operand:V4SF 1 "register_operand" "0") - (const_int 3)))] - "TARGET_SSE" - "cvtpi2ps\t{%2, %0|%0, %2}" - [(set_attr "type" "ssecvt") + (float:V2SF (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv"))) + (match_operand:V4SF 1 "register_operand" "0,0,Yv") + (const_int 3))) + (clobber (match_scratch:V4SF 3 "=X,x,Yv"))] + "TARGET_SSE || TARGET_MMX_WITH_SSE" + "@ + cvtpi2ps\t{%2, %0|%0, %2} + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(const_int 0)] +{ + rtx op2 = lowpart_subreg (V4SImode, operands[2], + GET_MODE (operands[2])); + /* Generate SSE2 cvtdq2ps. */ + rtx insn = gen_floatv4siv4sf2 (operands[3], op2); + emit_insn (insn); + + /* Merge operands[3] with operands[0]. */ + rtx mask, op1; + if (TARGET_AVX) +{ + mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (4, GEN_INT (0), GEN_INT (1), + GEN_INT (6), GEN_INT (7))); + op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[3], operands[1]); + op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask); + insn = gen_rtx_SET (operands[0], op2); +} + else +{ + /* NB: SSE can only concatenate OP0 and OP3 to OP0. 
*/ + mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (4, GEN_INT (2), GEN_INT (3), + GEN_INT (4), GEN_INT (5))); + op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[0], operands[3]); + op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask); + insn = gen_rtx_SET (operands[0], op2); + emit_insn (insn); + + /* Swap bits 0:63 with bits 64:127. */ + mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (4, GEN_INT (2), GEN_INT (3), + GEN_INT (0), GEN_INT (1))); + rtx dest = lowpart_subreg (V4SImode, operands[0], +GET_MODE (operands[0])); + op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask); + insn = gen_rtx_SET (dest, op1); +} + emit_insn (insn); + DONE; +} + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "ssecvt") (set_attr "mode" "V4SF")]) (define_insn "sse_cvtps2pi" -- 2.20.1
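The two merge sequences in the splitter (one for AVX, one for plain SSE) should produce the same value: converted elements in lanes 0 and 1, and the original destination preserved in lanes 2 and 3. The scalar model below is a sketch to check that reading of the index vectors. The helper names are hypothetical, and the lane indices are taken from the gen_rtvec calls in the patch above.

```c
#include <assert.h>
#include <string.h>

/* Model of vec_select over vec_concat (lo, hi): concatenate two
   4-element vectors and pick four elements by index (0..7).  */
void
model_concat_select (float dst[4], const float lo[4], const float hi[4],
                     const int idx[4])
{
  float cat[8], tmp[4];
  memcpy (cat, lo, 4 * sizeof (float));
  memcpy (cat + 4, hi, 4 * sizeof (float));
  for (int i = 0; i < 4; i++)
    tmp[i] = cat[idx[i]];
  memcpy (dst, tmp, sizeof tmp);
}

/* AVX path: select {0,1,6,7} from concat (converted, old_dest).  */
void
model_merge_avx (float dst[4], const float conv[4], const float old_dest[4])
{
  static const int idx[4] = { 0, 1, 6, 7 };
  model_concat_select (dst, conv, old_dest, idx);
}

/* SSE path: DST holds the old destination on entry.  First select
   {2,3,4,5} from concat (dst, converted), then swap the two 64-bit
   halves with {2,3,0,1}.  */
void
model_merge_sse (float dst[4], const float conv[4])
{
  static const int idx1[4] = { 2, 3, 4, 5 };
  static const int idx2[4] = { 2, 3, 0, 1 };
  model_concat_select (dst, dst, conv, idx1);
  model_concat_select (dst, dst, dst, idx2);
}

/* Self-check: both paths must agree and keep lanes 2 and 3.  */
void
check_cvtpi2ps_merge (void)
{
  const float conv[4] = { 7.0f, -8.0f, 99.0f, 100.0f };
  const float old_dest[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
  float a[4], b[4];
  model_merge_avx (a, conv, old_dest);
  memcpy (b, old_dest, sizeof b);
  model_merge_sse (b, conv);
  assert (memcmp (a, b, sizeof a) == 0);
  assert (a[0] == 7.0f && a[1] == -8.0f && a[2] == 3.0f && a[3] == 4.0f);
}
```

The extra swap in the SSE path follows from the "NB" comment in the patch: without AVX's three-operand form, the concatenation has to use the destination as its first input, which leaves the two halves in the wrong order.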
[PATCH 10/41] i386: Emulate MMX mmx_andnot3 with SSE
Emulate MMX mmx_andnot3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_andnot3): Also allow TARGET_MMX_WITH_SSE. Add SSE support. --- gcc/config/i386/mmx.md | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 33f6c2aa774..b3df46dd563 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1049,14 +1049,18 @@ ; (define_insn "mmx_andnot3" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (and:MMXMODEI - (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0")) - (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX" - "pandn\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0,0,Yv")) + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + pandn\t{%2, %0|%0, %2} + pandn\t{%2, %0|%0, %2} + vpandn\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxadd,sselog,sselog") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_3" [(set (match_operand:MMXMODEI 0 "register_operand") -- 2.20.1
[PATCH 08/41] i386: Emulate MMX ashr3/3 with SSE
Emulate MMX ashr3/3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_ashr3): Also allow TARGET_MMX_WITH_SSE. Add SSE emulation. (mmx_3): Likewise. (ashr3): New. (3): Likewise. --- gcc/config/i386/mmx.md | 50 ++ 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index fe746a487d1..6af05a1881e 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -959,32 +959,54 @@ (set_attr "mode" "DI")]) (define_insn "mmx_ashr3" - [(set (match_operand:MMXMODE24 0 "register_operand" "=y") + [(set (match_operand:MMXMODE24 0 "register_operand" "=y,x,Yv") (ashiftrt:MMXMODE24 - (match_operand:MMXMODE24 1 "register_operand" "0") - (match_operand:DI 2 "nonmemory_operand" "yN")))] - "TARGET_MMX" - "psra\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxshft") + (match_operand:MMXMODE24 1 "register_operand" "0,0,Yv") + (match_operand:DI 2 "nonmemory_operand" "yN,xN,YvN")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + psra\t{%2, %0|%0, %2} + psra\t{%2, %0|%0, %2} + vpsra\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxshft,sseishft,sseishft") (set (attr "length_immediate") (if_then_else (match_operand 2 "const_int_operand") (const_string "1") (const_string "0"))) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) + +(define_expand "ashr3" + [(set (match_operand:MMXMODE24 0 "register_operand") +(ashiftrt:MMXMODE24 + (match_operand:MMXMODE24 1 "register_operand") + (match_operand:DI 2 "nonmemory_operand")))] + "TARGET_MMX_WITH_SSE") (define_insn "mmx_3" - [(set (match_operand:MMXMODE248 0 "register_operand" "=y") + [(set (match_operand:MMXMODE248 0 "register_operand" "=y,x,Yv") (any_lshift:MMXMODE248 - (match_operand:MMXMODE248 1 "register_operand" "0") - (match_operand:DI 2 "nonmemory_operand" "yN")))] - "TARGET_MMX" - "p\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxshft") + (match_operand:MMXMODE248 1 "register_operand" 
"0,0,Yv") + (match_operand:DI 2 "nonmemory_operand" "yN,xN,YvN")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + p\t{%2, %0|%0, %2} + p\t{%2, %0|%0, %2} + vp\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxshft,sseishft,sseishft") (set (attr "length_immediate") (if_then_else (match_operand 2 "const_int_operand") (const_string "1") (const_string "0"))) - (set_attr "mode" "DI")]) + (set_attr "mode" "DI,TI,TI")]) + +(define_expand "3" + [(set (match_operand:MMXMODE248 0 "register_operand") +(any_lshift:MMXMODE248 + (match_operand:MMXMODE248 1 "register_operand") + (match_operand:DI 2 "nonmemory_operand")))] + "TARGET_MMX_WITH_SSE") ; ;; -- 2.20.1
[PATCH 12/41] i386: Emulate MMX vec_dupv2si with SSE
Emulate MMX vec_dupv2si with SSE. Add the "Yw" constraint to allow broadcast from integer register for AVX512BW with TARGET_AVX512VL. Only SSE register source operand is allowed. PR target/89021 * config/i386/constraints.md (Yw): New constraint. * config/i386/mmx.md (*vec_dupv2si): Changed to define_insn_and_split and also allow TARGET_MMX_WITH_SSE to support SSE emulation. --- gcc/config/i386/constraints.md | 6 ++ gcc/config/i386/mmx.md | 24 +--- 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index 16075b4acf3..c546b20d9dc 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -110,6 +110,8 @@ ;; v any EVEX encodable SSE register for AVX512VL target, ;; otherwise any SSE register ;; h EVEX encodable SSE register with number factor of four +;; w any EVEX encodable SSE register for AVX512BW with TARGET_AVX512VL +;; target. (define_register_constraint "Yz" "TARGET_SSE ? SSE_FIRST_REG : NO_REGS" "First SSE register (@code{%xmm0}).") @@ -146,6 +148,10 @@ "TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS" "@internal For AVX512VL, any EVEX encodable SSE register (@code{%xmm0-%xmm31}), otherwise any SSE register.") +(define_register_constraint "Yw" + "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : NO_REGS" + "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW with TARGET_AVX512VL target.") + ;; We use the B prefix to denote any number of internal operands: ;; f FLAGS_REG ;; g GOT memory operand. 
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index aeebb4f5741..b441f36dfc6 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1381,14 +1381,24 @@ (set_attr "length_immediate" "1") (set_attr "mode" "DI")]) -(define_insn "*vec_dupv2si" - [(set (match_operand:V2SI 0 "register_operand" "=y") +(define_insn_and_split "*vec_dupv2si" + [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv,Yw") (vec_duplicate:V2SI - (match_operand:SI 1 "register_operand" "0")))] - "TARGET_MMX" - "punpckldq\t%0, %0" - [(set_attr "type" "mmxcvt") - (set_attr "mode" "DI")]) + (match_operand:SI 1 "register_operand" "0,0,Yv,r")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + punpckldq\t%0, %0 + # + # + #" + "TARGET_MMX_WITH_SSE && reload_completed" + [(set (match_dup 0) + (vec_duplicate:V4SI (match_dup 1)))] + "operands[0] = lowpart_subreg (V4SImode, operands[0], +GET_MODE (operands[0]));" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx,x64_avx") + (set_attr "type" "mmxcvt,ssemov,ssemov,ssemov") + (set_attr "mode" "DI,TI,TI,TI")]) (define_insn "*mmx_concatv2si" [(set (match_operand:V2SI 0 "register_operand" "=y,y") -- 2.20.1
[PATCH 09/41] i386: Emulate MMX 3 with SSE
Emulate MMX 3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (any_logic:mmx_3): Also allow TARGET_MMX_WITH_SSE. (any_logic:3): New. (any_logic:*mmx_3): Also allow TARGET_MMX_WITH_SSE. Add SSE support. --- gcc/config/i386/mmx.md | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 6af05a1881e..33f6c2aa774 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1061,20 +1061,33 @@ (define_expand "mmx_3" [(set (match_operand:MMXMODEI 0 "register_operand") (any_logic:MMXMODEI - (match_operand:MMXMODEI 1 "nonimmediate_operand") - (match_operand:MMXMODEI 2 "nonimmediate_operand")))] - "TARGET_MMX" + (match_operand:MMXMODEI 1 "register_mmxmem_operand") + (match_operand:MMXMODEI 2 "register_mmxmem_operand")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "ix86_fixup_binary_operands_no_copy (, mode, operands);") + +(define_expand "3" + [(set (match_operand:MMXMODEI 0 "register_operand") + (any_logic:MMXMODEI + (match_operand:MMXMODEI 1 "register_operand") + (match_operand:MMXMODEI 2 "register_operand")))] + "TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (, mode, operands);") (define_insn "*mmx_3" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (any_logic:MMXMODEI - (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0") - (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)" - "p\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,Yv") + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && ix86_binary_operator_ok (, mode, operands)" + "@ + p\t{%2, %0|%0, %2} + p\t{%2, %0|%0, %2} + vp\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + 
(set_attr "type" "mmxadd,sselog,sselog") + (set_attr "mode" "DI,TI,TI")]) ; ;; -- 2.20.1
[PATCH 11/41] i386: Emulate MMX mmx_eq/mmx_gt3 with SSE
Emulate MMX mmx_eq/mmx_gt3 with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (mmx_eq3): Also allow TARGET_MMX_WITH_SSE. (*mmx_eq3): Also allow TARGET_MMX_WITH_SSE. Add SSE support. (mmx_gt3): Likewise. --- gcc/config/i386/mmx.md | 43 +- 1 file changed, 26 insertions(+), 17 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index b3df46dd563..aeebb4f5741 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1017,30 +1017,39 @@ (define_expand "mmx_eq3" [(set (match_operand:MMXMODEI 0 "register_operand") (eq:MMXMODEI - (match_operand:MMXMODEI 1 "nonimmediate_operand") - (match_operand:MMXMODEI 2 "nonimmediate_operand")))] - "TARGET_MMX" + (match_operand:MMXMODEI 1 "register_mmxmem_operand") + (match_operand:MMXMODEI 2 "register_mmxmem_operand")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (EQ, mode, operands);") (define_insn "*mmx_eq3" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (eq:MMXMODEI - (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0") - (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX && ix86_binary_operator_ok (EQ, mode, operands)" - "pcmpeq\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxcmp") - (set_attr "mode" "DI")]) + (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,Yv") + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && ix86_binary_operator_ok (EQ, mode, operands)" + "@ + pcmpeq\t{%2, %0|%0, %2} + pcmpeq\t{%2, %0|%0, %2} + vpcmpeq\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxcmp,ssecmp,ssecmp") + (set_attr "mode" "DI,TI,TI")]) (define_insn "mmx_gt3" - [(set (match_operand:MMXMODEI 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (gt:MMXMODEI - (match_operand:MMXMODEI 1 
"register_operand" "0") - (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX" - "pcmpgt\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxcmp") - (set_attr "mode" "DI")]) + (match_operand:MMXMODEI 1 "register_operand" "0,0,Yv") + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "@ + pcmpgt\t{%2, %0|%0, %2} + pcmpgt\t{%2, %0|%0, %2} + vpcmpgt\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxcmp,ssecmp,ssecmp") + (set_attr "mode" "DI,TI,TI")]) ; ;; -- 2.20.1
[PATCH 00/41] V9: Emulate MMX intrinsics with SSE
On x86-64, since __m64 is returned and passed in XMM registers, we can
emulate MMX intrinsics with SSE instructions.  To support it, we added

  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)

  ;; Define instruction set of MMX instructions
  (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
    (const_string "base"))

    (eq_attr "mmx_isa" "native")
      (symbol_ref "!TARGET_MMX_WITH_SSE")
    (eq_attr "mmx_isa" "x64")
      (symbol_ref "TARGET_MMX_WITH_SSE")
    (eq_attr "mmx_isa" "x64_avx")
      (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
    (eq_attr "mmx_isa" "x64_noavx")
      (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")

We added SSE emulation to MMX patterns and disabled MMX alternatives with
TARGET_MMX_WITH_SSE.

Most MMX instructions have equivalent SSE versions, and the results of
some SSE versions need to be reshuffled into the right order for MMX.
There are a couple of tricky cases:

1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
mask operand, and we handle unmapped bits 64:127 at the memory address by
adjusting the source and mask operands together with the memory address.

2. MMX movntq is emulated with SSE2 DImode movnti, which is available
in 64-bit mode.

3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
SSE emulation must clear bit 3 (zero-based, i.e. the fourth bit) of each
byte in the shuffle control mask.

4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
the upper 64 bits of the destination XMM register.

Tests are also added to check each SSE emulation of MMX intrinsics.

There are no regressions on i686 and x86-64.  For x86-64, GCC is also
tested with

--with-arch=native --with-cpu=native

on AVX2 and AVX512F machines.

PS: We may be able to enable partial SSE emulation of MMX intrinsics in
32-bit mode later.

H.J. Lu (40):
  i386: Allow MMX register modes in SSE registers
  i386: Emulate MMX packsswb/packssdw/packuswb with SSE2
  i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX
  i386: Emulate MMX plusminus/sat_plusminus with SSE
  i386: Emulate MMX mulv4hi3 with SSE
  i386: Emulate MMX smulv4hi3_highpart with SSE
  i386: Emulate MMX mmx_pmaddwd with SSE
  i386: Emulate MMX ashr3/3 with SSE
  i386: Emulate MMX 3 with SSE
  i386: Emulate MMX mmx_andnot3 with SSE
  i386: Emulate MMX mmx_eq/mmx_gt3 with SSE
  i386: Emulate MMX vec_dupv2si with SSE
  i386: Emulate MMX pshufw with SSE
  i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
  i386: Emulate MMX sse_cvtpi2ps with SSE
  i386: Emulate MMX mmx_pextrw with SSE
  i386: Emulate MMX mmx_pinsrw with SSE
  i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE
  i386: Emulate MMX mmx_pmovmskb with SSE
  i386: Emulate MMX mmx_umulv4hi3_highpart with SSE
  i386: Emulate MMX maskmovq with SSE2 maskmovdqu
  i386: Emulate MMX mmx_uavgv8qi3 with SSE
  i386: Emulate MMX mmx_uavgv4hi3 with SSE
  i386: Emulate MMX mmx_psadbw with SSE
  i386: Emulate MMX movntq with SSE2 movntidi
  i386: Emulate MMX umulv1siv1di3 with SSE2
  i386: Make _mm_empty () as NOP without MMX
  i386: Emulate MMX ssse3_phwv4hi3 with SSE
  i386: Emulate MMX ssse3_phdv2si3 with SSE
  i386: Emulate MMX ssse3_pmaddubsw with SSE
  i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE
  i386: Emulate MMX pshufb with SSE version
  i386: Emulate MMX ssse3_psign3 with SSE
  i386: Emulate MMX ssse3_palignrdi with SSE
  i386: Emulate MMX abs2 with SSE
  i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE
  i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE
  i386: Allow MMX intrinsic emulation with SSE
  i386: Enable TM MMX intrinsics with SSE2
  i386: Add tests for MMX intrinsic emulations with SSE

Uros Bizjak (1):
  Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE

 gcc/config/i386/constraints.md              |    6 +
 gcc/config/i386/i386-builtin.def            |  126 +-
 gcc/config/i386/i386-c.c                    |    2 +
 gcc/config/i386/i386-protos.h               |    4 +
 gcc/config/i386/i386.c                      |  181 ++-
 gcc/config/i386/i386.h                      |    2 +
 gcc/config/i386/i386.md                     |   17 +
 gcc/config/i386/mmintrin.h                  |   12 +-
 gcc/config/i386/mmx.md                      | 1028 +++--
 gcc/config/i386/predicates.md               |    7 +
 gcc/config/i386/sse.md                      |  368 --
 gcc/config/i386/xmmintrin.h                 |   61 +
 gcc/testsuite/gcc.target/i386/mmx-vals.h    |   77 ++
 gcc/testsuite/gcc.target/i386/pr82483-1.c   |    2 +-
 gcc/testsuite/gcc.target/i386/pr82483-2.c   |    2 +-
 gcc/testsuite/gcc.target/i386/sse2-mmx-10.c |   43 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-11.c |   39 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-12.c |   42 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-13.c |   40 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-14.c |   31 +
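Of the tricky cases listed in the cover letter, the pshufb index-width difference is easy to show with a scalar model. The sketch below is illustrative only. The function names are hypothetical, and it assumes that the bit to clear is the one with value 0x08 in each control byte, which is the lowest index bit that the 64-bit form ignores and the 128-bit form honors.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit (MMX) pshufb: bit 7 of a control byte zeroes the lane,
   otherwise the low 3 bits index the 8 source bytes.  */
void
model_mmx_pshufb (uint8_t dst[8], const uint8_t src[8], const uint8_t ctrl[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 7];
}

/* 128-bit (SSE) pshufb: same rule, but the low 4 bits index 16 bytes.  */
void
model_sse_pshufb (uint8_t dst[16], const uint8_t src[16],
                  const uint8_t ctrl[16])
{
  for (int i = 0; i < 16; i++)
    dst[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 15];
}

/* Emulation sketch: clear the 0x08 bit of every control byte so the
   128-bit form can only select from source bytes 0..7.  */
void
model_mmx_pshufb_via_sse (uint8_t dst[16], const uint8_t src[16],
                          const uint8_t ctrl[16])
{
  uint8_t fixed[16];
  for (int i = 0; i < 16; i++)
    fixed[i] = ctrl[i] & 0xf7;
  model_sse_pshufb (dst, src, fixed);
}

/* Self-check: the low 8 result bytes must match the MMX model.  */
void
check_pshufb_model (void)
{
  uint8_t src[16], ctrl[16] = { 0x0f, 0x08, 0x80, 0x03, 7, 9, 0x8f, 0 };
  uint8_t mmx[8], sse[16], raw[16];
  for (int i = 0; i < 16; i++)
    src[i] = (uint8_t) (100 + i);
  model_mmx_pshufb (mmx, src, ctrl);
  model_mmx_pshufb_via_sse (sse, src, ctrl);
  model_sse_pshufb (raw, src, ctrl);
  for (int i = 0; i < 8; i++)
    assert (mmx[i] == sse[i]);
  /* Without the fix, index 0x0f reaches into the upper 8 bytes.  */
  assert (raw[0] == 115 && mmx[0] == 107);
}
```

The zeroing bit (0x80) is left untouched by the 0xf7 mask, so lanes that the MMX form would clear are still cleared in the emulated form.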
[PATCH 07/41] i386: Emulate MMX mmx_pmaddwd with SSE
Emulate MMX pmaddwd with SSE.  Only SSE register source operand is
allowed.

        PR target/89021
        * config/i386/mmx.md (mmx_pmaddwd): Also allow
        TARGET_MMX_WITH_SSE.
        (*mmx_pmaddwd): Also allow TARGET_MMX_WITH_SSE.  Add SSE support.
---
 gcc/config/i386/mmx.md | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 01c80602b5b..fe746a487d1 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -810,11 +810,11 @@
           (mult:V2SI
             (sign_extend:V2SI
               (vec_select:V2HI
-                (match_operand:V4HI 1 "nonimmediate_operand")
+                (match_operand:V4HI 1 "register_mmxmem_operand")
                 (parallel [(const_int 0) (const_int 2)])))
             (sign_extend:V2SI
               (vec_select:V2HI
-                (match_operand:V4HI 2 "nonimmediate_operand")
+                (match_operand:V4HI 2 "register_mmxmem_operand")
                 (parallel [(const_int 0) (const_int 2)]))))
           (mult:V2SI
             (sign_extend:V2SI
@@ -823,20 +823,20 @@
             (sign_extend:V2SI
               (vec_select:V2HI (match_dup 2)
                 (parallel [(const_int 1) (const_int 3)]))))))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_pmaddwd"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
         (plus:V2SI
           (mult:V2SI
             (sign_extend:V2SI
               (vec_select:V2HI
-                (match_operand:V4HI 1 "nonimmediate_operand" "%0")
+                (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")
                 (parallel [(const_int 0) (const_int 2)])))
             (sign_extend:V2SI
               (vec_select:V2HI
-                (match_operand:V4HI 2 "nonimmediate_operand" "ym")
+                (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")
                 (parallel [(const_int 0) (const_int 2)]))))
           (mult:V2SI
             (sign_extend:V2SI
@@ -845,10 +845,15 @@
             (sign_extend:V2SI
               (vec_select:V2HI (match_dup 2)
                 (parallel [(const_int 1) (const_int 3)]))))))]
-  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmaddwd\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
+  "@
+   pmaddwd\t{%2, %0|%0, %2}
+   pmaddwd\t{%2, %0|%0, %2}
+   vpmaddwd\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_pmulhrwv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1
[PATCH 03/41] i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX
Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX.  For MMX punpckhXX,
move bits 64:127 to bits 0:63 in SSE register.  Only SSE register source
operand is allowed.

        PR target/89021
        * config/i386/i386-protos.h (ix86_split_mmx_punpck): New
        prototype.
        * config/i386/i386.c (ix86_split_mmx_punpck): New function.
        * config/i386/mmx.md (mmx_punpckhbw): Changed to
        define_insn_and_split to support SSE emulation.
        (mmx_punpcklbw): Likewise.
        (mmx_punpckhwd): Likewise.
        (mmx_punpcklwd): Likewise.
        (mmx_punpckhdq): Likewise.
        (mmx_punpckldq): Likewise.
---
 gcc/config/i386/i386-protos.h | 1 +
 gcc/config/i386/i386.c | 77 +++
 gcc/config/i386/mmx.md | 138 ++
 3 files changed, 168 insertions(+), 48 deletions(-)

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index a53b48438ec..37581837a32 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -204,6 +204,7 @@ extern rtx ix86_split_stack_guard (void);
 
 extern void ix86_move_vector_high_sse_to_mmx (rtx);
 extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
+extern void ix86_split_mmx_punpck (rtx[], bool);
 
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 563bc9aec69..3db41555462 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -20275,6 +20275,83 @@ ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
   ix86_move_vector_high_sse_to_mmx (op0);
 }
 
+/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  */
+
+void
+ix86_split_mmx_punpck (rtx operands[], bool high_p)
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
+  machine_mode mode = GET_MODE (op0);
+  rtx mask;
+  /* The corresponding SSE mode.  */
+  machine_mode sse_mode, double_sse_mode;
+
+  switch (mode)
+    {
+    case E_V8QImode:
+      sse_mode = V16QImode;
+      double_sse_mode = V32QImode;
+      mask = gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (16,
+                                          GEN_INT (0), GEN_INT (16),
+                                          GEN_INT (1), GEN_INT (17),
+                                          GEN_INT (2), GEN_INT (18),
+                                          GEN_INT (3), GEN_INT (19),
+                                          GEN_INT (4), GEN_INT (20),
+                                          GEN_INT (5), GEN_INT (21),
+                                          GEN_INT (6), GEN_INT (22),
+                                          GEN_INT (7), GEN_INT (23)));
+      break;
+
+    case E_V4HImode:
+      sse_mode = V8HImode;
+      double_sse_mode = V16HImode;
+      mask = gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (8,
+                                          GEN_INT (0), GEN_INT (8),
+                                          GEN_INT (1), GEN_INT (9),
+                                          GEN_INT (2), GEN_INT (10),
+                                          GEN_INT (3), GEN_INT (11)));
+      break;
+
+    case E_V2SImode:
+      sse_mode = V4SImode;
+      double_sse_mode = V8SImode;
+      mask = gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (4,
+                                          GEN_INT (0), GEN_INT (4),
+                                          GEN_INT (1), GEN_INT (5)));
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  /* Generate SSE punpcklXX.  */
+  rtx dest = lowpart_subreg (sse_mode, op0, GET_MODE (op0));
+  op1 = lowpart_subreg (sse_mode, op1, GET_MODE (op1));
+  op2 = lowpart_subreg (sse_mode, op2, GET_MODE (op2));
+
+  op1 = gen_rtx_VEC_CONCAT (double_sse_mode, op1, op2);
+  op2 = gen_rtx_VEC_SELECT (sse_mode, op1, mask);
+  rtx insn = gen_rtx_SET (dest, op2);
+  emit_insn (insn);
+
+  if (high_p)
+    {
+      /* Move bits 64:127 to bits 0:63.  */
+      mask = gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (4, GEN_INT (2), GEN_INT (3),
+                                          GEN_INT (0), GEN_INT (0)));
+      dest = lowpart_subreg (V4SImode, dest, GET_MODE (dest));
+      op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
+      insn = gen_rtx_SET (dest, op1);
+      emit_insn (insn);
+    }
+}
+
 /* Helper function of ix86_fixup_binary_operands to canonicalize
    operand order.  Returns true if the operands should be swapped.  */
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 63a390923b6..0aa793395fb 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1064,87 +1064,129 @@
    (set_attr "type" "mmxshft,sselog,sselog")
    (set_attr "mode" "DI,TI,TI")])
 
-(define_insn "mmx_punpckhbw"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
+(define_insn_and_split "mmx_punpckhbw"
+  [(set (match_operand:V8QI 0
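A minimal sketch, in plain C rather than RTL, of why one SSE punpcklbw covers both MMX variants: with the 8-byte MMX values held in the low halves of 16-byte registers, the interleave of the low halves yields the MMX punpcklbw result in bits 0:63 and the MMX punpckhbw result in bits 64:127. All function names below are hypothetical models for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of SSE punpcklbw: interleave the low 8 bytes of two 16-byte
   registers.  Hypothetical helper, not a GCC function.  */
void model_sse_punpcklbw(uint8_t dst[16], const uint8_t a[16],
                         const uint8_t b[16])
{
  uint8_t out[16];
  for (int i = 0; i < 8; i++) {
    out[2 * i] = a[i];
    out[2 * i + 1] = b[i];
  }
  memcpy(dst, out, 16);
}

/* Emulated MMX punpck{l,h}bw: do the SSE interleave, then for the
   "high" variant take bits 64:127 instead of bits 0:63.  */
void emulated_mmx_punpckbw(uint8_t dst[8], const uint8_t a[8],
                           const uint8_t b[8], int high_p)
{
  uint8_t xa[16] = {0}, xb[16] = {0}, xr[16];
  memcpy(xa, a, 8);   /* MMX value lives in the low half */
  memcpy(xb, b, 8);
  model_sse_punpcklbw(xr, xa, xb);
  memcpy(dst, high_p ? xr + 8 : xr, 8);
}

/* Reference MMX semantics: interleave the low (or high) 4 bytes.  */
void reference_mmx_punpckbw(uint8_t dst[8], const uint8_t a[8],
                            const uint8_t b[8], int high_p)
{
  int base = high_p ? 4 : 0;
  for (int i = 0; i < 4; i++) {
    dst[2 * i] = a[base + i];
    dst[2 * i + 1] = b[base + i];
  }
}
```

Comparing the emulated and reference functions for both values of high_p on a few inputs is a quick way to convince oneself that the extra move is only needed for the punpckhXX forms.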
[PATCH 06/41] i386: Emulate MMX smulv4hi3_highpart with SSE
Emulate MMX smulv4hi3_highpart with SSE.  Only SSE register source
operand is allowed.

        PR target/89021
        * config/i386/mmx.md (mmx_smulv4hi3_highpart): Also allow
        TARGET_MMX_WITH_SSE.
        (*mmx_smulv4hi3_highpart): Also allow TARGET_MMX_WITH_SSE.  Add
        SSE support.
---
 gcc/config/i386/mmx.md | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index fd0189eae60..01c80602b5b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -748,27 +748,32 @@
           (lshiftrt:V4SI
             (mult:V4SI
               (sign_extend:V4SI
-                (match_operand:V4HI 1 "nonimmediate_operand"))
+                (match_operand:V4HI 1 "register_mmxmem_operand"))
               (sign_extend:V4SI
-                (match_operand:V4HI 2 "nonimmediate_operand")))
+                (match_operand:V4HI 2 "register_mmxmem_operand")))
             (const_int 16))))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_smulv4hi3_highpart"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
         (truncate:V4HI
           (lshiftrt:V4SI
             (mult:V4SI
               (sign_extend:V4SI
-                (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
+                (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv"))
               (sign_extend:V4SI
-                (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
+                (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))
             (const_int 16))))]
-  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmulhw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
+  "@
+   pmulhw\t{%2, %0|%0, %2}
+   pmulhw\t{%2, %0|%0, %2}
+   vpmulhw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,ssemul,ssemul")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_umulv4hi3_highpart"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1
[PATCH 02/41] i386: Emulate MMX packsswb/packssdw/packuswb with SSE2
Emulate MMX packsswb/packssdw/packuswb with SSE packsswb/packssdw/packuswb plus moving bits 64:95 to bits 32:63 in SSE register. Only SSE register source operand is allowed. 2019-02-08 H.J. Lu Uros Bizjak PR target/89021 * config/i386/i386-protos.h (ix86_move_vector_high_sse_to_mmx): New prototype. (ix86_split_mmx_pack): Likewise. * config/i386/i386.c (ix86_move_vector_high_sse_to_mmx): New function. (ix86_split_mmx_pack): Likewise. * config/i386/i386.md (mmx_isa): New. (enabled): Also check mmx_isa. * config/i386/mmx.md (any_s_truncate): New code iterator. (s_trunsuffix): New code attr. (mmx_packsswb): Removed. (mmx_packssdw): Likewise. (mmx_packuswb): Likewise. (mmx_packswb): New define_insn_and_split to emulate MMX packsswb/packuswb with SSE2. (mmx_packssdw): Likewise. * config/i386/predicates.md (register_mmxmem_operand): New. --- gcc/config/i386/i386-protos.h | 3 ++ gcc/config/i386/i386.c| 54 gcc/config/i386/i386.md | 13 +++ gcc/config/i386/mmx.md| 67 +++ gcc/config/i386/predicates.md | 7 5 files changed, 114 insertions(+), 30 deletions(-) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 27f5cc13abf..a53b48438ec 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -202,6 +202,9 @@ extern void ix86_expand_vecop_qihi (enum rtx_code, rtx, rtx, rtx); extern rtx ix86_split_stack_guard (void); +extern void ix86_move_vector_high_sse_to_mmx (rtx); +extern void ix86_split_mmx_pack (rtx[], enum rtx_code); + #ifdef TREE_CODE extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int); #endif /* TREE_CODE */ diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index eb642165264..563bc9aec69 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -20221,6 +20221,60 @@ ix86_expand_vector_move_misalign (machine_mode mode, rtx operands[]) gcc_unreachable (); } +/* Move bits 64:95 to bits 32:63. 
*/ + +void +ix86_move_vector_high_sse_to_mmx (rtx op) +{ + rtx mask = gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (4, GEN_INT (0), GEN_INT (2), + GEN_INT (0), GEN_INT (0))); + rtx dest = lowpart_subreg (V4SImode, op, GET_MODE (op)); + op = gen_rtx_VEC_SELECT (V4SImode, dest, mask); + rtx insn = gen_rtx_SET (dest, op); + emit_insn (insn); +} + +/* Split MMX pack with signed/unsigned saturation with SSE/SSE2. */ + +void +ix86_split_mmx_pack (rtx operands[], enum rtx_code code) +{ + rtx op0 = operands[0]; + rtx op1 = operands[1]; + rtx op2 = operands[2]; + + machine_mode dmode = GET_MODE (op0); + machine_mode smode = GET_MODE (op1); + machine_mode inner_dmode = GET_MODE_INNER (dmode); + machine_mode inner_smode = GET_MODE_INNER (smode); + + /* Get the corresponding SSE mode for destination. */ + int nunits = 16 / GET_MODE_SIZE (inner_dmode); + machine_mode sse_dmode = mode_for_vector (GET_MODE_INNER (dmode), + nunits).require (); + machine_mode sse_half_dmode = mode_for_vector (GET_MODE_INNER (dmode), +nunits / 2).require (); + + /* Get the corresponding SSE mode for source. */ + nunits = 16 / GET_MODE_SIZE (inner_smode); + machine_mode sse_smode = mode_for_vector (GET_MODE_INNER (smode), + nunits).require (); + + /* Generate SSE pack with signed/unsigned saturation. */ + rtx dest = lowpart_subreg (sse_dmode, op0, GET_MODE (op0)); + op1 = lowpart_subreg (sse_smode, op1, GET_MODE (op1)); + op2 = lowpart_subreg (sse_smode, op2, GET_MODE (op2)); + + op1 = gen_rtx_fmt_e (code, sse_half_dmode, op1); + op2 = gen_rtx_fmt_e (code, sse_half_dmode, op2); + rtx insn = gen_rtx_SET (dest, gen_rtx_VEC_CONCAT (sse_dmode, + op1, op2)); + emit_insn (insn); + + ix86_move_vector_high_sse_to_mmx (op0); +} + /* Helper function of ix86_fixup_binary_operands to canonicalize operand order. Returns true if the operands should be swapped. 
*/ diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 240384917df..04ec0eeaa57 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -792,6 +792,10 @@ avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw" (const_string "base")) +;; Define instruction set of MMX instructions +(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx" + (const_string "base")) + (define_attr "enabled" "" (cond [(eq_attr "isa" "x64") (symbol_ref "TARGET_64BIT") (eq_attr "isa" "x64_sse2") @@ -830,6 +834,15 @@ (eq_attr "isa" "noavx512dq") (symbol_ref "!TARGET_AVX512DQ") (eq_attr "isa" "avx512vl")
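A minimal sketch, in plain C rather than RTL, of the fix-up step described in this patch: a 128-bit packsswb leaves the packed second operand in bits 64:95, so the emulation copies that dword down to bits 32:63, which is the V4SI selection {0, 2, 0, 0} used by ix86_move_vector_high_sse_to_mmx. The function names are hypothetical models, and the upper source lanes are assumed to be zero here only to keep the example deterministic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed saturation from 16 to 8 bits.  */
int8_t saturate_s8(int16_t v)
{
  if (v > INT8_MAX) return INT8_MAX;
  if (v < INT8_MIN) return INT8_MIN;
  return (int8_t)v;
}

/* Model of SSE2 packsswb on two 8 x 16-bit registers.  */
void model_sse_packsswb(int8_t dst[16], const int16_t a[8],
                        const int16_t b[8])
{
  for (int i = 0; i < 8; i++) {
    dst[i] = saturate_s8(a[i]);
    dst[8 + i] = saturate_s8(b[i]);
  }
}

/* Model of the dword shuffle {0, 2, 0, 0}: bits 64:95 land in 32:63.  */
void model_move_high_to_mmx(int8_t reg[16])
{
  static const int select[4] = {0, 2, 0, 0};
  int8_t old[16];
  memcpy(old, reg, 16);
  for (int k = 0; k < 4; k++)
    memcpy(reg + 4 * k, old + 4 * select[k], 4);
}

/* Emulated MMX packsswb on two 4 x 16-bit operands.  */
void emulated_mmx_packsswb(int8_t dst[8], const int16_t a[4],
                           const int16_t b[4])
{
  int16_t xa[8] = {0}, xb[8] = {0};
  int8_t xr[16];
  memcpy(xa, a, 4 * sizeof(int16_t));
  memcpy(xb, b, 4 * sizeof(int16_t));
  model_sse_packsswb(xr, xa, xb);
  model_move_high_to_mmx(xr);
  memcpy(dst, xr, 8);   /* only bits 0:63 are the MMX result */
}
```

Without the shuffle, bytes 4..7 of the result would hold the packed upper lanes of the first operand rather than the packed second operand.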
[PATCH 04/41] i386: Emulate MMX plusminus/sat_plusminus with SSE
Emulate MMX plusminus/sat_plusminus with SSE. Only SSE register source operand is allowed. PR target/89021 * config/i386/mmx.md (MMXMODEI8): Require TARGET_SSE2 for V1DI. (plusminus:mmx_3): Check TARGET_MMX_WITH_SSE. (sat_plusminus:mmx_3): Likewise. (3): New. (*mmx_3): Add SSE emulation. (*mmx_3): Likewise. --- gcc/config/i386/mmx.md | 59 +++--- 1 file changed, 38 insertions(+), 21 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 0aa793395fb..587e31b299e 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -45,7 +45,7 @@ ;; 8 byte integral modes handled by MMX (and by extension, SSE) (define_mode_iterator MMXMODEI [V8QI V4HI V2SI]) -(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI V1DI]) +(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")]) ;; All 8-byte vector modes handled by MMX (define_mode_iterator MMXMODE [V8QI V4HI V2SI V1DI V2SF]) @@ -663,39 +663,56 @@ (define_expand "mmx_3" [(set (match_operand:MMXMODEI8 0 "register_operand") (plusminus:MMXMODEI8 - (match_operand:MMXMODEI8 1 "nonimmediate_operand") - (match_operand:MMXMODEI8 2 "nonimmediate_operand")))] - "TARGET_MMX || (TARGET_SSE2 && mode == V1DImode)" + (match_operand:MMXMODEI8 1 "register_mmxmem_operand") + (match_operand:MMXMODEI8 2 "register_mmxmem_operand")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" + "ix86_fixup_binary_operands_no_copy (, mode, operands);") + +(define_expand "3" + [(set (match_operand:MMXMODEI 0 "register_operand") + (plusminus:MMXMODEI + (match_operand:MMXMODEI 1 "register_operand") + (match_operand:MMXMODEI 2 "register_operand")))] + "TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (, mode, operands);") (define_insn "*mmx_3" - [(set (match_operand:MMXMODEI8 0 "register_operand" "=y") + [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,x,Yv") (plusminus:MMXMODEI8 - (match_operand:MMXMODEI8 1 "nonimmediate_operand" "0") - (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym")))] - "(TARGET_MMX || 
(TARGET_SSE2 && mode == V1DImode)) + (match_operand:MMXMODEI8 1 "register_mmxmem_operand" "0,0,Yv") + (match_operand:MMXMODEI8 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) && ix86_binary_operator_ok (, mode, operands)" - "p\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + "@ + p\t{%2, %0|%0, %2} + p\t{%2, %0|%0, %2} + vp\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxadd,sseadd,sseadd") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_3" [(set (match_operand:MMXMODE12 0 "register_operand") (sat_plusminus:MMXMODE12 - (match_operand:MMXMODE12 1 "nonimmediate_operand") - (match_operand:MMXMODE12 2 "nonimmediate_operand")))] - "TARGET_MMX" + (match_operand:MMXMODE12 1 "register_mmxmem_operand") + (match_operand:MMXMODE12 2 "register_mmxmem_operand")))] + "TARGET_MMX || TARGET_MMX_WITH_SSE" "ix86_fixup_binary_operands_no_copy (, mode, operands);") (define_insn "*mmx_3" - [(set (match_operand:MMXMODE12 0 "register_operand" "=y") + [(set (match_operand:MMXMODE12 0 "register_operand" "=y,x,Yv") (sat_plusminus:MMXMODE12 - (match_operand:MMXMODE12 1 "nonimmediate_operand" "0") - (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym")))] - "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)" - "p\t{%2, %0|%0, %2}" - [(set_attr "type" "mmxadd") - (set_attr "mode" "DI")]) + (match_operand:MMXMODE12 1 "register_mmxmem_operand" "0,0,Yv") + (match_operand:MMXMODE12 2 "register_mmxmem_operand" "ym,x,Yv")))] + "(TARGET_MMX || TARGET_MMX_WITH_SSE) + && ix86_binary_operator_ok (, mode, operands)" + "@ + p\t{%2, %0|%0, %2} + p\t{%2, %0|%0, %2} + vp\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx") + (set_attr "type" "mmxadd,sseadd,sseadd") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_mulv4hi3" [(set (match_operand:V4HI 0 "register_operand") -- 2.20.1
[PATCH 01/41] i386: Allow MMX register modes in SSE registers
In 64-bit mode, SSE2 can be used to emulate MMX instructions without
3DNOW.  We can use SSE2 to support MMX register modes.

        PR target/89021
        * config/i386/i386-c.c (ix86_target_macros_internal): Define
        __MMX_WITH_SSE__ for TARGET_MMX_WITH_SSE.
        * config/i386/i386.c (ix86_set_reg_reg_cost): Add support for
        TARGET_MMX_WITH_SSE with VALID_MMX_REG_MODE.
        (ix86_vector_mode_supported_p): Likewise.
        * config/i386/i386.h (TARGET_MMX_WITH_SSE): New.
---
 gcc/config/i386/i386-c.c | 2 ++
 gcc/config/i386/i386.c | 5 +++--
 gcc/config/i386/i386.h | 2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 5e7e46fcebe..213e1b56c6b 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -548,6 +548,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__CLDEMOTE__");
   if (isa_flag2 & OPTION_MASK_ISA_PTWRITE)
     def_or_undef (parse_in, "__PTWRITE__");
+  if (TARGET_MMX_WITH_SSE)
+    def_or_undef (parse_in, "__MMX_WITH_SSE__");
   if (TARGET_IAMCU)
     {
       def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0df792a41d1..eb642165264 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40503,7 +40503,8 @@ ix86_set_reg_reg_cost (machine_mode mode)
           || (TARGET_AVX && VALID_AVX256_REG_MODE (mode))
           || (TARGET_SSE2 && VALID_SSE2_REG_MODE (mode))
           || (TARGET_SSE && VALID_SSE_REG_MODE (mode))
-          || (TARGET_MMX && VALID_MMX_REG_MODE (mode)))
+          || ((TARGET_MMX || TARGET_MMX_WITH_SSE)
+              && VALID_MMX_REG_MODE (mode)))
         units = GET_MODE_SIZE (mode);
     }
 
@@ -44329,7 +44330,7 @@ ix86_vector_mode_supported_p (machine_mode mode)
     return true;
   if (TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode))
     return true;
-  if (TARGET_MMX && VALID_MMX_REG_MODE (mode))
+  if ((TARGET_MMX ||TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode))
     return true;
   if (TARGET_3DNOW && VALID_MMX_REG_MODE_3DNOW (mode))
     return true;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 4fd8bc40a34..91b233022c2 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -201,6 +201,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define TARGET_16BIT TARGET_CODE16
 #define TARGET_16BIT_P(x) TARGET_CODE16_P(x)
 
+#define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
+
 #include "config/vxworks-dummy.h"
 
 #include "config/i386/i386-opts.h"
-- 
2.20.1
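A small sketch of how user code could observe the macro this patch defines. It assumes only what the patch states, namely that __MMX_WITH_SSE__ is predefined when TARGET_MMX_WITH_SSE holds; the function name is hypothetical, and on compilers without this change the function simply returns 0.

```c
#include <assert.h>

/* Returns 1 when the compiler advertises MMX emulation with SSE via the
   __MMX_WITH_SSE__ macro from this patch, 0 otherwise.  Hypothetical
   helper for illustration only.  */
int mmx_is_emulated_with_sse(void)
{
#ifdef __MMX_WITH_SSE__
  return 1;
#else
  return 0;
#endif
}
```

Code that needs to know whether the MMX state is actually touched (for example before deciding if _mm_empty () matters) could branch on such a check, under the assumption that the macro keeps this meaning.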
Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER
I have now committed the patch as r268992.

Janne and Richard, thanks for the review and the comments.

On 18.02.19 at 13:50, Richard Biener wrote:
> On Sun, Feb 17, 2019 at 7:19 PM Thomas Koenig wrote:
>> Regression tests turned up a few ICEs (now fixed), plus two very
>> invalid test cases, which I think are not worth saving.
>
> They were added to verify we don't ICE for such invalid testcases.
> How do they fail now?

They failed with an LTO type mismatch.  Instead of deleting them, I have
now added -Wno-lto-type-mismatch to the options in the committed version.

> I wonder how the frontend handles the 2nd call to doesntwork_p8 for
>
> program main
>   implicit none
>   character :: c
>   character(len=20) :: res, doesntwork_p8
>   external doesntwork_p8
>   c = 'o'
>   res = doesntwork_p8(c,1,2,3,4,5,6)
>   res = doesntwork_p8(1,2)
>   if (res /= 'foo') stop 3
> end program main

This is invalid Fortran.  I think we should be able to diagnose this,
but the committed version does not check it.

Regards

        Thomas
Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)
On 2/18/19 10:41 AM, Alexander Monakov wrote:
> On Mon, 18 Feb 2019, Aaron Sawdey wrote:
>
>> The code in emit_case_dispatch_table() will very clearly always emit
>> branch/label/jumptable_data/barrier so this does need to be handled.
>> So, yes tablejump always looks like this, and also yes it seems to be
>> ripe ground for logic bugs, we have 88308, 88347, 88423 all related
>> to it.
>>
>> In the long term it might be nice to use a general mechanism
>> (SCHED_GROUP_P?) for handling the label and jump table data that
>> follow a case branch using jump table.
>>
>> But for now in stage 4, I think the right way to fix this is with the
>> patch that Segher posted earlier.
>> If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?
>
> How making an assert more permissive is "the right way" here?
> As already mentioned, without the assert we'd move a USE of the
> register with function return value to an unreachable block, which
> would be incorrect.
>
> Do you anticipate issues with the sched-deps patch?

Alexander,
I see you are allowing it to see the barrier as if it were right after
the tablejump.

Are you saying that the motion of the tablejump is happening because the
scheduler does not see the barrier (because it does not follow
immediately after) and thus decides it can move instructions to the
other side of the tablejump?  I agree that is incorrect and is asking
for other hidden problems.

It would be nice if the tablejump, jump table label, jump table data,
and barrier were all one indivisible unit somehow.

In the meantime, can someone approve Alexander's patch?

Thanks,
Aaron

--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")
On 2/18/19 5:31 AM, Paolo Carlini wrote:
> Hi Jason,
>
> On 18/02/19 10:20, Jason Merrill wrote:
>> On 2/17/19 6:58 AM, Paolo Carlini wrote:
>>> Hi,
>>>
>>> here, when we don't see an initializer we believe we are surely
>>> dealing with a case of C++17 template argument deduction for class
>>> templates, but, in fact, it's just an ill-formed C++14 template
>>> variable specialization.  Conveniently, we can use here too the
>>> predicate variable_template_specialization_p.  Not 100% sure about
>>> the exact wording of the error message, I added '#' to %qD to
>>> explicitly print the auto-using type too.
>>
>> I guess we should change the assert to a test, so that we give the
>> error if we aren't dealing with a class template placeholder.
>> Variable templates don't seem to be important to test for.
>
> Thanks, simpler patch.
>
>> This error is also pretty poor for this testcase, where there is an
>> initializer.
>
> Well, implementation-wise, certainly init == NULL_TREE and only when we
> have an empty pack this specific issue occurs.
>
> In practice, clang simply talks about an empty initializer (during
> instantiation, etc, like we do), whereas EDG explicitly says that pack
> expansion produces an empty list of expressions.  I don't think that in
> cp_finish_decl it would be easy for us to do exactly the same, we
> simply see a NULL_TREE as second argument.  Or we could just *assume*
> that we are dealing with the outcome of a pack expansion, say something
> like EDG even if we don't have details beyond the fact that
> init == NULL_TREE.  I believe that without a variadic template the
> problem cannot occur, because we catch the empty initializer much
> earlier, in grokdeclarator - indeed using a
> !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check.  What do you think?
> Again "instantiated for an empty pack" or something similar?

Perhaps we could complain in the code for empty pack expansion handling
in tsubst_init?

Jason
Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER
On Mon, Feb 18, 2019 at 10:48:35AM +0200, Janne Blomqvist wrote:
> I wonder if we shouldn't exorcise all the varargs stuff, it seems to
> cause more problems than benefits? But not in stage4 if we can avoid
> it..

On the Power ABIs at least, unprototyped functions (a K&R thing for C)
are handled the same as varargs (with zero fixed arguments).  How does
this tie in with Fortran requirements?

Segher
PING [PATCH] correct __clear_cache signature
Ping: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00361.html On 2/6/19 5:28 PM, Martin Sebor wrote: Recent libgcc builds have been triggering -Wbuiltin-declaration-mismatch due to the declaration of the __clear_cache built-in being incompatible with how GCC declares it internally. The attached patch adjusts the libgcc declaration and the one in the manual to match what GCC expects. Tested on x86_64-linux. Martin
Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)
On Mon, 18 Feb 2019, Aaron Sawdey wrote:

> The code in emit_case_dispatch_table() will very clearly always emit
> branch/label/jumptable_data/barrier so this does need to be handled.
> So, yes tablejump always looks like this, and also yes it seems to be
> ripe ground for logic bugs, we have 88308, 88347, 88423 all related to
> it.
>
> In the long term it might be nice to use a general mechanism
> (SCHED_GROUP_P?) for handling the label and jump table data that follow
> a case branch using jump table.
>
> But for now in stage 4, I think the right way to fix this is with the
> patch that Segher posted earlier.
> If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?

How making an assert more permissive is "the right way" here?
As already mentioned, without the assert we'd move a USE of the register
with function return value to an unreachable block, which would be
incorrect.

Do you anticipate issues with the sched-deps patch?

Alexander
Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)
The code in emit_case_dispatch_table() will very clearly always emit
branch/label/jumptable_data/barrier so this does need to be handled.
So, yes tablejump always looks like this, and also yes it seems to be
ripe ground for logic bugs, we have 88308, 88347, 88423 all related to
it.

In the long term it might be nice to use a general mechanism
(SCHED_GROUP_P?) for handling the label and jump table data that follow
a case branch using jump table.

But for now in stage 4, I think the right way to fix this is with the
patch that Segher posted earlier.
If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?

2019-02-18  Aaron Sawdey

        PR rtl-optimization/88347
        * sched-ebb.c (begin_move_insn): Apply Segher's patch to handle
        a jump table before the barrier.

On 1/24/19 9:43 AM, Alexander Monakov wrote:
> On Wed, 23 Jan 2019, Alexander Monakov wrote:
>
>> It appears that sched-deps tries to take notice of a barrier after a
>> jump, but similarly to sched-ebb doesn't anticipate that for a
>> tablejump the barrier will appear after two more insns (a code_label
>> and a jump_table_data).
>>
>> If so, it needs a fixup just like the posted change for the assert.
>> I'll fire up a bootstrap/regtest.
>
> Updated patch below (now taking into account that NEXT_INSN may give
> NULL) passes bootstrap/regtest on x86_64, also with
> -fsched2-use-superblocks.
>
> I'm surprised to learn that a tablejump may be not the final insn in
> its containing basic block.  It certainly seems like a ripe ground for
> logic bugs like this one.  Is it really intentional?
>
> OK for trunk?
>
> Thanks.
> Alexander
>
>       PR rtl-optimization/88347
>       PR rtl-optimization/88423
>       * sched-deps.c (sched_analyze_insn): Take into account that for
>       tablejumps the barrier appears after a label and a
>       jump_table_data.
>
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -3005,6 +3005,11 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx_insn *insn)
>    if (JUMP_P (insn))
>      {
>        rtx_insn *next = next_nonnote_nondebug_insn (insn);
> +      /* ??? For tablejumps, the barrier may appear not immediately after
> +         the jump, but after a label and a jump_table_data insn.  */
> +      if (next && LABEL_P (next) && NEXT_INSN (next)
> +          && JUMP_TABLE_DATA_P (NEXT_INSN (next)))
> +        next = NEXT_INSN (NEXT_INSN (next));
>        if (next && BARRIER_P (next))
>          reg_pending_barrier = MOVE_BARRIER;
>        else
>

--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
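A minimal sketch of the look-ahead in the quoted sched-deps.c change, using a toy insn chain instead of GCC's rtx_insn. The struct, enum and function names are hypothetical, and the model ignores notes and debug insns, which the real code skips with next_nonnote_nondebug_insn.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn kinds standing in for JUMP_P, LABEL_P, JUMP_TABLE_DATA_P
   and BARRIER_P.  Hypothetical model, not GCC data structures.  */
enum insn_kind {
  INSN_OTHER,
  INSN_JUMP,
  INSN_LABEL,
  INSN_JUMP_TABLE_DATA,
  INSN_BARRIER
};

struct insn {
  enum insn_kind kind;
  struct insn *next;
};

/* Returns 1 if the jump is followed by a barrier, either directly or,
   as for tablejumps, after a label and a jump_table_data insn.  */
int jump_ends_in_barrier(const struct insn *jump)
{
  const struct insn *next = jump->next;

  /* Skip the label/jump-table pair; both links are checked for NULL.  */
  if (next && next->kind == INSN_LABEL && next->next
      && next->next->kind == INSN_JUMP_TABLE_DATA)
    next = next->next->next;

  return next != NULL && next->kind == INSN_BARRIER;
}
```

In this model, a chain jump, label, jump_table_data, barrier is treated the same as jump, barrier, which is the case where the quoted patch sets reg_pending_barrier to MOVE_BARRIER.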
Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE
On Mon, Feb 18, 2019 at 6:37 AM Uros Bizjak wrote:
>
> On Mon, Feb 18, 2019 at 3:22 PM H.J. Lu wrote:
> > > > > > > > > > > > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > > > > > > > > > > > emulate MMX intrinsics with SSE instructions. To support it, we added
> > > > > > > > > > > >
> > > > > > > > > > > > #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> > > > > > > > > > > >
> > > > > > > > > > > > ;; Define instruction set of MMX instructions
> > > > > > > > > > > > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> > > > > > > > > > > >   (const_string "base"))
> > > > > > > > > > > >
> > > > > > > > > > > >          (eq_attr "mmx_isa" "native")
> > > > > > > > > > > >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> > > > > > > > > > > >          (eq_attr "mmx_isa" "x64")
> > > > > > > > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE")
> > > > > > > > > > > >          (eq_attr "mmx_isa" "x64_avx")
> > > > > > > > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> > > > > > > > > > > >          (eq_attr "mmx_isa" "x64_noavx")
> > > > > > > > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> > > > > > > > > > > >
> > > > > > > > > > > > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > > > > > > > > > > > TARGET_MMX_WITH_SSE.
> > > > > > > > > > > >
> > > > > > > > > > > > Most of MMX instructions have equivalent SSE versions and results of some
> > > > > > > > > > > > SSE versions need to be reshuffled to the right order for MMX. There are
> > > > > > > > > > > > a couple of tricky cases:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent. We emulate MMX
> > > > > > > > > > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > > > > > > > > > > > mask operand and handle unmapped bits 64:127 at memory address by
> > > > > > > > > > > > adjusting source and mask operands together with memory address.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > > > > > > > > > > > in 64-bit mode.
> > > > > > > > > > > >
> > > > > > > > > > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > > > > > > > > > > > SSE emulation must clear the bit 4 in the shuffle control mask.
> > > > > > > > > > > >
> > > > > > > > > > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > > > > > > > > > > > the upper 64 bits of destination XMM register.
> > > > > > > > > > > >
> > > > > > > > > > > > Tests are also added to check each SSE emulation of MMX intrinsics.
> > > > > > > > > > > >
> > > > > > > > > > > > There are no regressions on i686 and x86-64. For x86-64, GCC is also
> > > > > > > > > > > > tested with
> > > > > > > > > > > >
> > > > > > > > > > > > --with-arch=native --with-cpu=native
> > > > > > > > > > > >
> > > > > > > > > > > > on AVX2 and AVX512F machines.
> > > > > > > > > > >
> > > > > > > > > > > An idea that would take patch a step further also on 32 bit targets:
> > > > > > > > > > >
> > > > > > > > > > > *Assuming* that operations on XMM registers are as fast (or perhaps
> > > > > > > > > > > faster) than operations on MMX registers, we can change mmx_isa
> > > > > > > > > > > attribute in e.g.
> > > > > > > > > > >
> > > > > > > > > > > + "@
> > > > > > > > > > > + p\t{%2, %0|%0, %2}
> > > > > > > > > > > + p\t{%2, %0|%0, %2}
> > > > > > > > > > > + vp\t{%2, %1, %0|%0, %1, %2}"
> > > > > > > > > > > + [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > > > > > > > > >
> > > > > > > > > > > to:
> > > > > > > > > > >
> > > > > > > > > > > [(set_attr "isa" "*,noavx,avx")
> > > > > > > > > > > (set_attr "mmx_isa" "native,*,*")]
> > > > > > > > > > >
> > > > > > > > > > > So, for x86_64 everything stays the same, but for x86_32 we now allow
> > > > > > > > > > > intrinsics to use xmm registers in addition to mmx registers. We can't
> > > > > > > > > > > disable MMX for x86_32 anyway due to ISA constraints (and some tricky
> > > > > > > > > > > cases, e.g. movnti that works only for 64bit targets and e.g. maskmovq
> > > > > > > > > > > & similar, which are more efficient with MMX regs), but RA has much
> > > > > > > > > > > more
[Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions
Handle stack pointer with SUBS/ADDS instructions.

In general the stack pointer was not handled for many SUBS/ADDS patterns in
aarch64.md. Both the "extended register" and "immediate" forms allow the
stack pointer to be used as the source register, while no form allows the
stack pointer for the destination register.

The define_insn patterns generating ADDS/SUBS did not allow the stack
pointer for any operand, while the define_peephole2 patterns that generated
RTX to be matched by these patterns allowed the stack pointer for any
operand.

The patterns are fixed by adding the 'k' constraint for the first source
operand to all define_insns that generate the ADDS/SUBS "extended register"
and "immediate" forms (but not the "shifted register" form).

In peephole optimizations, constraint strings are ignored (see "(gccint) C
Constraint Interface" info node in the documentation), so the decision to
act or not is based solely on the predicate and condition. This patch
introduces a new predicate "aarch64_general_reg" to be used in
define_peephole2 patterns where only GENERAL_REGS registers are acceptable
and uses that predicate in the peepholes that generate patterns for
ADDS/SUBS.

Additionally, this patch contains two tidy-ups (happy to remove them or put
in a separate patch if people want):

We change the condition of sub3_compare1_imm pattern from checking
"UINTVAL (operands[2]) == -UINTVAL (operands[3])" to checking
"INTVAL (operands[2]) == -INTVAL (operands[3])" for clarity, since the
values checked are signed integers, there are negations involved in the
check, and the condition used by the corresponding peepholes also uses
INTVAL.

The superfluous iterator in the assembly template for add3_compareV_imm is
removed -- it was applied to an operand that is known to be a const_int.

Full bootstrap and regtest done on aarch64-none-linux-gnu.
Regression tests done on aarch64-none-linux-gnu and aarch64-none-elf cross
compiler.

OK for trunk?
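
As a rough illustration of the INTVAL/UINTVAL tidy-up described above, the following standalone sketch compares the two forms of the condition. It assumes HOST_WIDE_INT is a 64-bit two's-complement type and models INTVAL and UINTVAL as plain casts; the names `hwi`, `uhwi`, `cond_uintval`, `cond_intval` and `count_mismatches` are hypothetical and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the value types behind GCC's INTVAL/UINTVAL, assuming
   HOST_WIDE_INT is 64 bits wide.  These are not the real macros.  */
typedef int64_t hwi;
typedef uint64_t uhwi;

/* Old form of the condition: compare after unsigned negation.  */
static int cond_uintval(hwi op2, hwi op3)
{
    return (uhwi)op2 == -(uhwi)op3;
}

/* New form of the condition: compare after signed negation.
   Negating INT64_MIN would overflow, so this sketch rejects it.  */
static int cond_intval(hwi op2, hwi op3)
{
    assert(op3 != INT64_MIN);
    return op2 == -op3;
}

/* Count sample pairs on which the two forms disagree.  */
static int count_mismatches(void)
{
    static const hwi samples[] = { 0, 1, -1, 42, -42, 4095, -4095 };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (cond_uintval(samples[i], samples[j])
                != cond_intval(samples[i], samples[j]))
                mismatches++;
    return mismatches;
}
```

On the small immediates sampled here the two forms agree on every pair, which is consistent with the description of this change as a clarity tidy-up rather than a functional fix.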

NOTE: I have included a bunch of RTL testcases that I used in development,
these don't exercise much of the compiler and are pretty specific to the
backend as it currently is, so I'm not sure they give much value. I'd
appreciate feedback on whether this is in general considered useful.

gcc/ChangeLog:

2019-02-18  Matthew Malcomson

        PR target/89324
        * config/aarch64/aarch64.md: Use aarch64_general_reg predicate on
        destination register in peepholes generating patterns for ADDS/SUBS.
        (add3_compare0, *addsi3_compare0_uxtw, add3_compareC,
        add3_compareV_imm, add3_compareV, *adds__, *subs__,
        *adds__shift_, *subs__shift_, *adds__multp2, *subs__multp2,
        *sub3_compare0, *subsi3_compare0_uxtw, sub3_compare1): Allow stack
        pointer for source register.
        * config/aarch64/predicates.md (aarch64_general_reg): New predicate.

gcc/testsuite/ChangeLog:

2019-02-18  Matthew Malcomson

        PR target/89324
        * gcc.dg/rtl/aarch64/subs_adds_sp.c: New test.
        * gfortran.fortran-torture/compile/pr89324.f90: New test.

### Attachment also inlined for ease of reply ###

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b7f6fe0f1354f7aa19076a946ed2c633b9b9b8da..0d5754a21e31b0c53afb320bdf574fa4a43c7573 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1985,7 +1985,7 @@ (define_expand "uaddvti4"
 (define_insn "add3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
         (compare:CC_NZ
-         (plus:GPI (match_operand:GPI 1 "register_operand" "%r,r,r")
+         (plus:GPI (match_operand:GPI 1 "register_operand" "%rk,rk,rk")
                    (match_operand:GPI 2 "aarch64_plus_operand" "r,I,J"))
          (const_int 0)))
    (set (match_operand:GPI 0 "register_operand" "=r,r,r")
@@ -2002,7 +2002,7 @@ (define_insn "add3_compare0"
 (define_insn "*addsi3_compare0_uxtw"
   [(set (reg:CC_NZ CC_REGNUM)
         (compare:CC_NZ
-         (plus:SI (match_operand:SI 1 "register_operand" "%r,r,r")
+         (plus:SI (match_operand:SI 1 "register_operand" "%rk,rk,rk")
                   (match_operand:SI 2 "aarch64_plus_operand" "r,I,J"))
          (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r,r,r")
@@ -2034,7 +2034,7 @@ (define_insn "add3_compareC"
   [(set (reg:CC_C CC_REGNUM)
         (compare:CC_C
           (plus:GPI
-            (match_operand:GPI 1 "register_operand" "r,r,r")
+            (match_operand:GPI 1 "register_operand" "rk,rk,rk")
             (match_operand:GPI 2 "aarch64_plus_operand" "r,I,J"))
           (match_dup 1)))
    (set (match_operand:GPI 0 "register_operand" "=r,r,r")
@@ -2081,7 +2081,7 @@ (define_insn "add3_compareV_imm"
         (compare:CC_V
           (plus:
             (sign_extend:
-              (match_operand:GPI 1 "register_operand" "r,r"))
+              (match_operand:GPI 1 "register_operand" "rk,rk"))
             (match_operand:GPI 2
Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE
On Mon, Feb 18, 2019 at 3:22 PM H.J. Lu wrote:
> > > > > > > > > > > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > > > > > > > > > > emulate MMX intrinsics with SSE instructions. To support it, we added
> > > > > > > > > > >
> > > > > > > > > > > #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> > > > > > > > > > >
> > > > > > > > > > > ;; Define instruction set of MMX instructions
> > > > > > > > > > > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> > > > > > > > > > >   (const_string "base"))
> > > > > > > > > > >
> > > > > > > > > > > (eq_attr "mmx_isa" "native")
> > > > > > > > > > >   (symbol_ref "!TARGET_MMX_WITH_SSE")
> > > > > > > > > > > (eq_attr "mmx_isa" "x64")
> > > > > > > > > > >   (symbol_ref "TARGET_MMX_WITH_SSE")
> > > > > > > > > > > (eq_attr "mmx_isa" "x64_avx")
> > > > > > > > > > >   (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> > > > > > > > > > > (eq_attr "mmx_isa" "x64_noavx")
> > > > > > > > > > >   (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> > > > > > > > > > >
> > > > > > > > > > > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > > > > > > > > > > TARGET_MMX_WITH_SSE.
> > > > > > > > > > >
> > > > > > > > > > > Most MMX instructions have equivalent SSE versions, and results of some
> > > > > > > > > > > SSE versions need to be reshuffled to the right order for MMX. There are
> > > > > > > > > > > a couple of tricky cases:
> > > > > > > > > > >
> > > > > > > > > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent. We emulate MMX
> > > > > > > > > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > > > > > > > > > > mask operand and handle unmapped bits 64:127 at memory address by
> > > > > > > > > > > adjusting source and mask operands together with memory address.
> > > > > > > > > > >
> > > > > > > > > > > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > > > > > > > > > > in 64-bit mode.
> > > > > > > > > > >
> > > > > > > > > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > > > > > > > > > > SSE emulation must clear the bit 4 in the shuffle control mask.
> > > > > > > > > > >
> > > > > > > > > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > > > > > > > > > > the upper 64 bits of destination XMM register.
> > > > > > > > > > >
> > > > > > > > > > > Tests are also added to check each SSE emulation of MMX intrinsics.
> > > > > > > > > > >
> > > > > > > > > > > There are no regressions on i686 and x86-64. For x86-64, GCC is also
> > > > > > > > > > > tested with
> > > > > > > > > > >
> > > > > > > > > > > --with-arch=native --with-cpu=native
> > > > > > > > > > >
> > > > > > > > > > > on AVX2 and AVX512F machines.
> > > > > > > > > >
> > > > > > > > > > An idea that would take the patch a step further also on 32 bit targets:
> > > > > > > > > >
> > > > > > > > > > *Assuming* that operations on XMM registers are as fast (or perhaps
> > > > > > > > > > faster) than operations on MMX registers, we can change mmx_isa
> > > > > > > > > > attribute in e.g.
> > > > > > > > > >
> > > > > > > > > > +  "@
> > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > +   vp\t{%2, %1, %0|%0, %1, %2}"
> > > > > > > > > > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > > > > > > > >
> > > > > > > > > > to:
> > > > > > > > > >
> > > > > > > > > > [(set_attr "isa" "*,noavx,avx")
> > > > > > > > > >  (set_attr "mmx_isa" "native,*,*")]
> > > > > > > > > >
> > > > > > > > > > So, for x86_64 everything stays the same, but for x86_32 we now allow
> > > > > > > > > > intrinsics to use xmm registers in addition to mmx registers. We can't
> > > > > > > > > > disable MMX for x86_32 anyway due to ISA constraints (and some tricky
> > > > > > > > > > cases, e.g. movnti that works only for 64bit targets and e.g. maskmovq
> > > > > > > > > > & similar, which are more efficient with MMX regs), but RA has much
> > > > > > > > > > more freedom to allocate the most effective register set even for
> > > > > > > > > > 32bit targets.
> > > > > > > > > >
> > > > > > > > > > WDYT?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Since MMX registers are used to pass and return __m64 values, we