date:20190218

[PATCH] luoxhu - backport from trunk r255555, r257253 and r258137

2019-02-18 Thread luoxhu

From: Xiong Hu Luo 

This is a backport of r255555, r257253 and r258137 of trunk to gcc-7-branch.
The patches were already on trunk before GCC 8 forked. In total, 5 files need
manual resolution due to code changes for r255555. r257253 and r258137 are
dependent testcases that require VSX support and need to be merged to avoid
regressions.

The discussion for the patch r255555 that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00394.html
VSX support for patch r257253 and r258137:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02391.html
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01506.html

gcc/ChangeLog:

2019-01-14  Luo Xiong Hu  

Backport from trunk. Manually resolve 3 files:
* config/rs6000/altivec.h (vec_extract_fp32_from_shorth,
vec_extract_fp32_from_shortl): Resolve new #defines.
* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_SLD): Resolve
new expansions.
* doc/extend.texi (vec_sld, vec_sll, vec_srl, vec_sro,
vec_unpackh, vec_unpackl, test_vsi_packsu_vssi_vssi, vec_packsu,
vec_cmpne): Resolve new documentation.

2017-12-11  Carl Love  

* config/rs6000/altivec.h (vec_extract_fp32_from_shorth,
vec_extract_fp32_from_shortl): Add #defines.
* config/rs6000/rs6000-builtin.def (VSLDOI_2DI): Add macro expansion.
* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_UNPACKH,
ALTIVEC_BUILTIN_VEC_UNPACKL, ALTIVEC_BUILTIN_VEC_AND,
ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VEC_SRL,
ALTIVEC_BUILTIN_VEC_SRO, ALTIVEC_BUILTIN_VEC_SLD,
ALTIVEC_BUILTIN_VEC_SLL): Add expansions.
* doc/extend.texi: Add documentation for the added builtins.

gcc/testsuite/ChangeLog:

2019-01-14  Luo Xiong Hu  

Backport from trunk r255555. Manually resolve 2 files:
* gcc.target/powerpc/builtins-3-p8.c (test_vsi_packs_vusi,
test_vsi_packsu-vssi, test_vsi_packsu-vusi, test_vsi_packsu-vsll,
test_vsi_packsu-vull, test_vsi_packsu-vsi, test_vsi_packsu-vui):
Resolve new cases.
* gcc.target/powerpc/builtins-3.c (test_sll_vsc_vsc_vsuc,
test_sll_vuc_vuc, test_sll_vsi_vsi_vuc, test_sll_vui_vui_vuc,
test_sll_vbll_vull, test_sll_vbll_vbll_vus, test_sll_vp_vp_vuc,
test_sll_vssi_vssi_vuc, test_sll_vusi_vusi_vuc, test_slo_vsc_vsc_vsc,
test_slo_vuc_vuc_vsc, test_slo_vsi_vsi_vsc, test_slo_vsi_vsi_vuc,
test_slo_vui_vui_vsc, test_slo_vui_vui_vuc, test_slo_vp_vp_vsc,
test_slo_vp_vp_vuc, test_slo_vssi_vssi_vsc, test_slo_vssi_vssi_vuc,
test_slo_vusi_vusi_vsc, test_slo_vusi_vusi_vuc, test_slo_vusi_vusi_vuc,
test_slo_vf_vf_vsc, test_slo_vf_vf_vuc, test_cmpb_float): Resolve
new cases.
2017-12-11  Carl Love  

* gcc.target/powerpc/altivec-7.c: Renamed altivec-7.h.
* gcc.target/powerpc/altivec-7.h (main): Add testcases for vec_unpackl.
Add dg-final tests for the instructions generated.
* gcc.target/powerpc/altivec-7-be.c: New file to test on big endian.
* gcc.target/powerpc/altivec-7-le.c: New file to test on little endian.
* gcc.target/powerpc/altivec-13.c (foo): Add vec_sld, vec_srl,
 vec_sro testcases. Add dg-final tests for the instructions generated.
* gcc.target/powerpc/builtins-3-p8.c (test_vsi_packs_vui,
test_vsi_packs_vsi, test_vsi_packs_vssi, test_vsi_packs_vusi,
test_vsi_packsu-vssi, test_vsi_packsu-vusi, test_vsi_packsu-vsll,
test_vsi_packsu-vull, test_vsi_packsu-vsi, test_vsi_packsu-vui): Add
testcases. Add dg-final tests for new instructions.
* gcc.target/powerpc/p8vector-builtin-2.c (vbschar_eq, vbchar_eq,
vuchar_eq, vbint_eq, vsint_eq, viint_eq, vuint_eq, vbool_eq, vbint_ne,
vsint_ne, vuint_ne, vbool_ne, vsign_ne, vuns_ne, vbshort_ne): Add
tests.
Add dg-final instruction tests.
* gcc.target/powerpc/vsx-vector-6.c: Renamed vsx-vector-6.h.
* gcc.target/powerpc/vsx-vector-6.h (vec_andc, vec_nmsub, vec_nmadd,
vec_or, vec_nor, vec_andc, vec_or, vec_andc, vec_msums): Add tests.
Add dg-final tests for the generated instructions.
* gcc.target/powerpc/builtins-3.c (test_sll_vsc_vsc_vsuc,
test_sll_vuc_vuc, test_sll_vsi_vsi_vuc, test_sll_vui_vui_vuc,
test_sll_vbll_vull, test_sll_vbll_vbll_vus, test_sll_vp_vp_vuc,
test_sll_vssi_vssi_vuc, test_sll_vusi_vusi_vuc, test_slo_vsc_vsc_vsc,
test_slo_vuc_vuc_vsc, test_slo_vsi_vsi_vsc, test_slo_vsi_vsi_vuc,
test_slo_vui_vui_vsc, test_slo_vui_vui_vuc, test_slo_vsll_slo_vsll_vsc,
test_slo_vsll_slo_vsll_vuc, test_slo_vull_slo_vull_vsc,
test_slo_vull_slo_vull_vuc, test_slo_vp_vp_vsc, test_slo_vp_vp_vuc,
test_slo_vssi_vssi_vsc, test_slo_vssi_vssi_vuc, test_slo_vusi_vusi_vsc,
test_slo_vusi_vusi_vuc, test_slo_vusi_vusi_vuc, test_slo_vf_vf_vsc,
test_slo_vf_vf_vuc, test_cmpb_float): Add tests.

Backport

Re: [C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)

2019-02-18 Thread Jakub Jelinek

On Mon, Feb 18, 2019 at 04:04:15PM -1000, Jason Merrill wrote:
> > --- gcc/cp/constexpr.c.jj   2019-02-17 17:09:47.113351897 +0100
> > +++ gcc/cp/constexpr.c  2019-02-18 19:34:57.995136395 +0100
> > @@ -1269,6 +1301,49 @@ cxx_eval_builtin_function_call (const co
> > return t;
> >   }
> > +  if (fndecl_built_in_p (fun, BUILT_IN_NORMAL))
> > +switch (DECL_FUNCTION_CODE (fun))
> > +  {
> > +  case BUILT_IN_ADD_OVERFLOW:
> > +  case BUILT_IN_SADD_OVERFLOW:
> > +  case BUILT_IN_SADDL_OVERFLOW:
> > +  case BUILT_IN_SADDLL_OVERFLOW:
> > +  case BUILT_IN_UADD_OVERFLOW:
> > +  case BUILT_IN_UADDL_OVERFLOW:
> > +  case BUILT_IN_UADDLL_OVERFLOW:
> > +  case BUILT_IN_SUB_OVERFLOW:
> > +  case BUILT_IN_SSUB_OVERFLOW:
> > +  case BUILT_IN_SSUBL_OVERFLOW:
> > +  case BUILT_IN_SSUBLL_OVERFLOW:
> > +  case BUILT_IN_USUB_OVERFLOW:
> > +  case BUILT_IN_USUBL_OVERFLOW:
> > +  case BUILT_IN_USUBLL_OVERFLOW:
> > +  case BUILT_IN_MUL_OVERFLOW:
> > +  case BUILT_IN_SMUL_OVERFLOW:
> > +  case BUILT_IN_SMULL_OVERFLOW:
> > +  case BUILT_IN_SMULLL_OVERFLOW:
> > +  case BUILT_IN_UMUL_OVERFLOW:
> > +  case BUILT_IN_UMULL_OVERFLOW:
> > +  case BUILT_IN_UMULLL_OVERFLOW:
> > +   /* These builtins will fold into
> > +  (cast)
> > +((something = __real__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>),
> > + __imag__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>)
> > +  which fails is_constant_expression.  */
> > +   if (TREE_CODE (args[0]) != INTEGER_CST
> > +   || TREE_CODE (args[1]) != INTEGER_CST
> > +   || !potential_constant_expression (args[2]))
> > + {
> > +   if (!*non_constant_p && !ctx->quiet)
> > + error ("%q+E is not a constant expression", new_call);
> > +   *non_constant_p = true;
> > +   return t;
> > + }
> > +   return cxx_eval_constant_expression (&new_ctx, new_call, lval,
> > +non_constant_p, overflow_p);
> > +  default:
> > +   break;
> > +  }
> 
> What is this for?  Won't this recursive cxx_eval_constant_expression come
> back to this function again?  If the expression is constant, shouldn't it
> have been folded by fold_builtin_call_array?

This is for the constexpr-arith-overflow.C testcase.
The arguments are INTEGER_CST, INTEGER_CST and ADDR_EXPR of a VAR_DECL or
PARM_DECL, and fold_builtin_call_array returns new_call:
(z = REALPART_EXPR <SAVE_EXPR <.ADD_OVERFLOW (0, 0)>>;, (bool) IMAGPART_EXPR
<SAVE_EXPR <.ADD_OVERFLOW (0, 0)>>;);
where this doesn't pass is_constant_expression because of the z store.
cxx_eval_constant_expression is able to evaluate this, as
z = 0;
false;
in this case.
I guess builtins.c folding could be improved and simplify it to
(z = 0; (bool) false;);
but that still doesn't pass is_constant_expression check.
For C++14 it passes potential_constant_expression though, and that
is what I've used for these builtins in the first iteration, but
the testcase happened to pass even for C++11 and
potential_constant_expression is false here.  Though, perhaps we are going
too far for C++11 here and should reject it, after all, people have the
possibility to use __builtin_*_overflow_p now which should be usable even in
C++11.  The reason why it passed with C++11 is that when parsing we saw
a __builtin_add_overflow (0, 0, &z) call and potential_constant_expression
said it is ok, then folded it into that
(z = REALPART_EXPR <SAVE_EXPR <.ADD_OVERFLOW (0, 0)>>;, (bool) IMAGPART_EXPR
<SAVE_EXPR <.ADD_OVERFLOW (0, 0)>>;);
which is not potential_constant_expression, but nothing called it again
and cxx_eval_constant_expression can handle it.
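
For readers following along, here is a minimal sketch of the two builtin
families discussed above. The function names add_overflows and
add_overflows_p are made up for illustration and are not taken from the
constexpr-arith-overflow.C testcase. The first form stores through its third
argument (the "z store"), so it is written as a C++14 constexpr function; the
second is the store-free __builtin_*_overflow_p form, which is specific to
GCC and is guarded accordingly.

```cpp
#include <cassert>
#include <climits>

// C++14 constexpr wrapper around the type-generic builtin.  The third
// argument is the address of a local variable; the folded call stores the
// wrapped result there and yields the overflow flag.
constexpr bool
add_overflows (int a, int b)
{
  int z = 0;
  return __builtin_add_overflow (a, b, &z);
}

static_assert (!add_overflows (1, 2), "1 + 2 fits in int");
static_assert (add_overflows (INT_MAX, 1), "INT_MAX + 1 overflows");

#if defined (__GNUC__) && !defined (__clang__)
// GCC-only predicate form: the third argument only supplies the result
// type, nothing is stored, so a single return statement is enough.
constexpr bool
add_overflows_p (int a, int b)
{
  return __builtin_add_overflow_p (a, b, (int) 0);
}

static_assert (!add_overflows_p (1, 2), "1 + 2 fits in int");
static_assert (add_overflows_p (INT_MAX, 1), "INT_MAX + 1 overflows");
#endif
```

Both forms can also be called at run time; the static_asserts only show that
the compiler accepts them in constant expressions.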

> > @@ -1358,6 +1433,9 @@ cxx_bind_parameters_in_call (const const
> >   x = ctx->object;
> >   x = build_address (x);
> > }
> > +  if (TREE_ADDRESSABLE (type) && TYPE_REF_P (TREE_TYPE (x)))
> > +   /* Undo convert_for_arg_passing work here.  */
> > +   x = build_fold_indirect_ref_loc (EXPR_LOCATION (x), x);
> 
> Not convert_from_reference?

Will change.

> > @@ -4036,6 +4113,10 @@ label_matches (const constexpr_ctx *ctx,
> > }
> > break;
> > +case BREAK_STMT:
> > +case CONTINUE_STMT:
> > +  break;
> > +
> 
> Let's add a comment that these are handled directly in cxx_eval_loop_expr.

Ok, will do.

Jakub

Re: [PATCH, GCC] PR target/86487: fix the way 'uses_hard_regs_p' handles paradoxical subregs

2019-02-18 Thread Vladimir Makarov




On 2019-02-15 6:35 a.m., Andre Vieira (lists) wrote:

Hi Vlad,

On 13/02/2019 16:46, Vladimir Makarov wrote:


On 2019-02-13 5:54 a.m., Andre Vieira (lists) wrote:

PING.

Since Jeff is away can another maintainer have a look at this please?



I see the following patch


Yeah I uploaded the wrong patch... sorry. See attached, including a 
testcase, currently only fails on GCC-8 and previous though.


It happens.  The new version is ok to commit.  Thank you for working on 
the patch.

Re: [REVISED PATCH 7/9]: C++ P0482R5 char8_t: New standard library tests

2019-02-18 Thread Jonathan Wakely


On 07/02/19 23:39 -0500, Tom Honermann wrote:

On 2/7/19 4:54 AM, Jonathan Wakely wrote:

On 23/12/18 21:27 -0500, Tom Honermann wrote:
Attached is a revised patch that addresses changes in P0482R6.  
Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.


There are quite a few additional changes needed to make the testsuite
pass cleanly with non-default options, e.g. when running it with
RUNTESTFLAGS=--target_board=unix/-fchar8_t/-fno-inline I see these
failures:
I remember thinking that I had to deal with this at one point.  It 
seems I then forgot about it.


FAIL: 21_strings/basic_string/literals/types.cc (test for excess errors)
FAIL: 21_strings/basic_string/literals/values.cc (test for excess errors)
UNRESOLVED: 21_strings/basic_string/literals/values.cc compilation 
failed to produce executable
FAIL: 21_strings/basic_string_view/literals/types.cc (test for 
excess errors)
FAIL: 21_strings/basic_string_view/literals/values.cc (test for 
excess errors)
UNRESOLVED: 21_strings/basic_string_view/literals/values.cc 
compilation failed to produce executable

FAIL: 22_locale/codecvt/char16_t.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/char16_t.cc compilation failed to 
produce executable

FAIL: 22_locale/codecvt/char32_t.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/char32_t.cc compilation failed to 
produce executable

FAIL: 22_locale/codecvt/codecvt_utf8/79980.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/codecvt_utf8/79980.cc compilation 
failed to produce executable
FAIL: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc (test for excess 
errors)
UNRESOLVED: 22_locale/codecvt/codecvt_utf8/wchar_t/1.cc compilation 
failed to produce executable

FAIL: 22_locale/codecvt/utf8.cc (test for excess errors)
UNRESOLVED: 22_locale/codecvt/utf8.cc compilation failed to produce 
executable

FAIL: 22_locale/conversions/string/2.cc (test for excess errors)
UNRESOLVED: 22_locale/conversions/string/2.cc compilation failed to 
produce executable

FAIL: 22_locale/conversions/string/3.cc (test for excess errors)
UNRESOLVED: 22_locale/conversions/string/3.cc compilation failed to 
produce executable

FAIL: experimental/string_view/literals/types.cc (test for excess errors)
FAIL: experimental/string_view/literals/values.cc (test for excess 
errors)
UNRESOLVED: experimental/string_view/literals/values.cc compilation 
failed to produce executable


There would be similar errors running all the tests with -std=c++2a,
which is definitely something I do often and so want the tests to be
clean.

Absolutely, agreed.

We can either disable those tests when char8_t is enabled
(because we already have alternative tests checking the char8_t
versions of string_view etc.) or make them work either way, which the
attached patch begins doing (more changes are needed).
Since most of these tests exercise functionality that is not 
u8/char8_t specific, I think we should make them work.


I expect a different set of failures for -fno-char8_t (which is
probably a less important case to support than enabling char8_t in
older standards, but maybe still worth testing now and then).

I'm not sure it is less important.  -fno-char8_t may be an important 
tool for some code bases during their initial testing of, and 
migration to, C++20.


Tom.


I committed your patch for library tests unchanged, and also committed
the attached one to fix the failures when running the existing tests
with -std=gnu++2a or -fchar8_t.


commit 1c32dfd748cc225a02cb729943eb9586eda8d7fd
Author: Jonathan Wakely 
Date:   Thu Feb 7 09:04:11 2019 +

Adjust C++11/C++14 tests to work with -fchar8_t

* testsuite/21_strings/basic_string/literals/types.cc
[_GLIBCXX_USE_CHAR8_T]: Adjust expected string type for u8 literal.
* testsuite/21_strings/basic_string/literals/values.cc
[_GLIBCXX_USE_CHAR8_T]: Likewise.
* testsuite/22_locale/codecvt/char16_t.cc: Adjust for u8 literals
potentially having different type.
* testsuite/22_locale/codecvt/char32_t.cc: Likewise.
* testsuite/22_locale/codecvt/codecvt_utf8/79980.cc: Cast u8 literal
to char.
* testsuite/22_locale/codecvt/codecvt_utf8/wchar_t/1.cc: Likewise.
* testsuite/22_locale/codecvt/utf8.cc: Likewise.
* testsuite/22_locale/conversions/string/2.cc: Remove u8 prefix from
string literals only using basic character set.
* testsuite/22_locale/conversions/string/3.cc: Likewise. Cast other
u8 literals to char.
* testsuite/29_atomics/headers/atomic/macros.cc [_GLIBCXX_USE_CHAR8_T]:
Test ATOMIC_CHAR8_T_LOCK_FREE.
Add missing #error to ATOMIC_CHAR16_T_LOCK_FREE test.
* testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
[_GLIBCXX_USE_CHAR8_T]: Check for std::atomic_char8_t.
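
As an illustration of why these tests needed adjusting, the sketch below
shows how the element type of a u8 literal follows the __cpp_char8_t feature
test macro, and how casting to char keeps char-based code working either
way. The names u8_char_type, literal_char_type and as_char are hypothetical
and do not come from the committed tests.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// __cpp_char8_t is predefined by the compiler when char8_t is enabled;
// otherwise u8 literals are arrays of plain char.
#ifdef __cpp_char8_t
typedef char8_t u8_char_type;
#else
typedef char u8_char_type;
#endif

// Element type of a u8 string literal with the reference and const removed.
typedef std::remove_cv<
  std::remove_reference<decltype (u8"x"[0])>::type>::type literal_char_type;

static_assert (std::is_same<literal_char_type, u8_char_type>::value,
	       "u8 literal element type tracks __cpp_char8_t");

// Hypothetical helper: view UTF-8 code units as char for char-based APIs.
// With char8_t disabled this is an identity conversion.
inline const char *
as_char (const u8_char_type *s)
{
  return reinterpret_cast<const char *> (s);
}
```

A test written this way compiles unchanged with and without -fchar8_t,
which is the property the adjusted tests aim for.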

Re: [REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support

2019-02-18 Thread Jonathan Wakely


On 08/02/19 12:56 +, Jonathan Wakely wrote:

On 07/02/19 23:35 -0500, Tom Honermann wrote:

On 2/7/19 4:44 AM, Jonathan Wakely wrote:

On 23/12/18 21:27 -0500, Tom Honermann wrote:
Attached is a revised patch that addresses changes in P0482R6.  
Changes from the prior patch include:

- Updated the value of the __cpp_char8_t feature test macro to 201811.

Tested on x86_64-linux.


Thanks, Tom, this is great work!

The front-end changes for char8_t went in recently, and I'm finally
ready to commit the library parts.

Great!

There's one big problem I found in
this patch, which is that the new numeric_limits<char8_t>
specialization uses constexpr unconditionally. That fails if <limits>
is compiled using options like -std=c++98 -fno-char8_t because the
specialization will be used, but the constexpr keyword isn't allowed.
That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR.
Hmm, the code for the char8_t specialization was copied from the 
char16_t specialization which also uses constexpr unconditionally 
(but is guarded by a C++11+ requirement).


That can use it unconditionally, because there's no -fchar16_t switch
to enable char16_t prior to C++11.

The char8_t specialization must be elided when the compiler is 
invoked with -std=c++98 -fno-char8_t (since the char8_t type doesn't 
exist then).  The _GLIBCXX_USE_CHAR8_T guard doesn't suffice for 
this? _GLIBCXX_USE_CHAR8_T should only be defined if __cpp_char8_t 
is defined; and that should only be defined if -fchar8_t or 
-std=c++2a is specified.  Or perhaps you intended -std=c++98 
-fchar8_t?  I agree in that case that use of _GLIBCXX_CONSTEXPR is 
necessary.


Yes sorry, that's a typo above, I meant -std=c++98 -fchar8_t.

The -std=c++98 -fno-char8_t case works fine, as expected (because
-fno-char8_t is the default for -std=c++98 anyway).


The other way to solve that problem would be for the compiler to give
an error if -fchar8_t is used with C++98, but I see no fundamental
reason that combination of options shouldn't be allowed. We can
support it in the library by using the macro.
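
A minimal sketch of the macro approach, assuming a made-up DEMO_CONSTEXPR
macro and demo_limits class in place of _GLIBCXX_CONSTEXPR and
numeric_limits: the keyword is only emitted when the language standard
allows it, so the same specialization stays valid for -std=c++98.

```cpp
#include <cassert>

// DEMO_CONSTEXPR is a hypothetical stand-in for _GLIBCXX_CONSTEXPR: it
// expands to constexpr for C++11 and later and to nothing for C++98.
#if __cplusplus >= 201103L
# define DEMO_CONSTEXPR constexpr
#else
# define DEMO_CONSTEXPR
#endif

// Hypothetical traits class, primary template.
template <typename T>
struct demo_limits
{
  static DEMO_CONSTEXPR bool is_known () { return false; }
};

// Specialization that compiles in every language mode because it never
// spells the constexpr keyword directly.
template <>
struct demo_limits<unsigned char>
{
  static DEMO_CONSTEXPR bool is_known () { return true; }
  static DEMO_CONSTEXPR unsigned char min_value () { return 0; }
  static DEMO_CONSTEXPR unsigned char max_value ()
  { return static_cast<unsigned char> (-1); }
};
```

In C++11 and later the members are usable in constant expressions; in C++98
they are ordinary inline functions with the same results.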

Agreed.


As discussed in San Diego, the other change needed is to add the
abi_tag attribute to the new versions of path::u8string and
path::generic_u8string, so that the mangling is different when its
return type is different:

#ifdef _GLIBCXX_USE_CHAR8_T
   __attribute__((__abi_tag__("__u8")))
   std::u8string  u8string() const;
#else
   std::string    u8string() const;
#endif // _GLIBCXX_USE_CHAR8_T

Otherwise we get ODR violations when linking objects compiled
with -fchar8_t enabled to objects with it disabled (e.g. linking
-std=c++17 objects to -std=c++2a objects, which needs to work).
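
The same idea as a self-contained sketch, using invented names
(DEMO_WIDE_RESULT, demo_path, the "demo_wide" tag) rather than the real
libstdc++ ones. The return type of a non-template member function is not
part of its mangled name, so without the tag both variants of text() would
get the same symbol; the tag is what keeps them apart.

```cpp
#include <cassert>
#include <string>

// DEMO_WIDE_RESULT is a hypothetical switch playing the role of
// _GLIBCXX_USE_CHAR8_T.  Define it to 0 before this point to select the
// other variant.
#ifndef DEMO_WIDE_RESULT
# define DEMO_WIDE_RESULT 1
#endif

struct demo_path
{
#if DEMO_WIDE_RESULT
  // The tag is added to the mangled name of this member, so it cannot be
  // mistaken at link time for the std::string-returning variant.
  __attribute__ ((__abi_tag__ ("demo_wide")))
  std::wstring text () const { return L"abc"; }
#else
  std::string text () const { return "abc"; }
#endif
};
```

Comparing the output of nm for the two settings of the switch should show
two different symbols for demo_path::text.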


Are ODR violations bad? :)


Only when they make people send us bug reports ;-)



I suggest "__u8" as the name of the ABI tag, but I'm open to other
suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be
consistent with "__cxx11" used for the new ABI introduced in GCC 5 but
it regularly confuses people who think it is coupled to the -std=c++11
option (and so don't understand why they still see it for -std=c++14).
I have no preference or alternative suggestions here.  Had I 
recognized the issue, I would have asked you what to do about it :)


Also, I see that you've made changes to <experimental/string_view> (to
add the experimental::u8string_view typedef) and to
std::experimental::path (to change the return type of u8string and
generic_u8string).

The former change is fairly harmless; it only adds a typedef, albeit
one which is not a reserved name in C++14/C++17 and so should be
available for users to define as a macro. Maybe prior to C++2a we
should only define it when GNU extensions are enabled (i.e. when using
-std=gnu++14 not -std=c++14):

#if defined _GLIBCXX_USE_CHAR8_T \
 && (__cplusplus > 201703L || !defined __STRICT_ANSI__)
 using u8string_view = basic_string_view<char8_t>;
#endif

That makes sense.


Actually I was thinking about this further, and if somebody explicitly
uses -fchar8_t then they're asking for a non-standard dialect of C++
anyway, and so they can't complain about some extra non-standard
names. So I think it's fine to declare std::u8string_view whenever
char8_t is enabled.


Changing the return type of experimental::path members concerns me
more. That's a published TS which is not going to be revised, and it's
not obvious to me that users would want the change in semantics. If
somebody is still using the Filesystem TS in C++2a code, they're
probably not expecting it to change. If they need to update their code
for C++2a they might as well just use std::filesystem, and so having
char8_t support in std::experimental::filesystem isn't clearly useful.

I agree.  I added the support to the experimental implementations 
more out of a desire to be complete and to remove any potential 
barriers to use of -fchar8_t than because I felt the changes were 
really necessary.  I would be perfectly fine with skipping the 
updates to the experimental libraries completely.


OK, let's leave them

Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)

2019-02-18 Thread Jonathan Wakely


On 18/02/19 21:22 +0100, Jakub Jelinek wrote:

On Mon, Feb 18, 2019 at 09:15:39PM +0100, Rainer Orth wrote:

2019-02-15  Rainer Orth  

* g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to
bad_weak_ptr_.


Ok, thanks.
If needed, guess we could rename much more (or rename the namespace in which
most of it is from std to my_std, though we'd need to check for stuff that
needs to be in std namespace).


I think that whole testcase could be in some non-std namespace. I
don't think there are any magic functions or types that need to be in
namespace std to work correctly.


# HG changeset patch
# Parent  056fe4093ce40dc462c6b50c3ae49df032a92230
Fix g++.dg/torture/pr89303.C with Solaris ld

diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C b/gcc/testsuite/g++.dg/torture/pr89303.C
--- a/gcc/testsuite/g++.dg/torture/pr89303.C
+++ b/gcc/testsuite/g++.dg/torture/pr89303.C
@@ -350,11 +350,11 @@ namespace std
   { return static_cast(_M_addr()); }
 };

-  class bad_weak_ptr { };
+  class bad_weak_ptr_ { };

   inline void
   __throw_bad_weak_ptr()
-  { (throw (bad_weak_ptr())); }
+  { (throw (bad_weak_ptr_())); }

 class _Sp_counted_base
 {



Jakub

PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-02-18 Thread Martin Sebor


Please let me know what it will take to get the fix for these two
issues approved.  I've answered the questions so I don't know what
else I'm expected to do here.

  https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00793.html

On 2/11/19 12:20 PM, Martin Sebor wrote:

This is a repost of a patch for PR 88383 updated to also fix the just
reported PR 89288 (the original patch only partially handles this case).
The review of the first patch was derailed by questions about the design
of the built-in so the fix for the ICE was never approved.  I think
the ICEs should be fixed for GCC 9 and any open design questions should
be dealt with independently.

Martin

The patch for PR 88383 was originally posted last December:
   https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00337.html

Re: [C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)

2019-02-18 Thread Jason Merrill


On 2/18/19 12:45 PM, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, we've regressed on the trunk in diagnostics of some
invalid constexpr evaluations.  The problem is that the constexpr evaluation
is effectively done on post-cp_fold_function bodies/arguments and cp_fold
optimizes away some important trees for constexpr diagnostics, either
itself, or through using GENERIC match.pd (on the testcase in particular
diagnostics about reinterpret_cast).
While we save on constexpr call hash table bodies of the functions
pre-cp_fold_function, due to sharing and cp_fold_r the STATEMENT_LIST
statements etc. are modified directly and genericization modifies it as
well.

The following patch uses copy_fn, which we have already been using for the
recursive constexpr cases, also to make a copy of the constexpr function
before cp_fold_function clobbers it.
I had to implement cxx_eval_conditional_expression handling of various
C++ FE statements that are replaced during genericization.

Bootstrapped/regtested on x86_64-linux and i686-linux (98,11,14,17,2a), ok
for trunk?

2019-02-18  Jakub Jelinek  

PR c++/89285
* constexpr.c (struct constexpr_fundef): Add parms and result members.
(retrieve_constexpr_fundef): Adjust for the above change.
(register_constexpr_fundef): Save constexpr body with copy_fn,
temporarily set DECL_CONTEXT on DECL_RESULT before that.
(get_fundef_copy): Change FUN argument to FUNDEF with
constexpr_fundef * type, grab body and parms/result out of
constexpr_fundef struct and temporarily change it for copy_fn calls
too.
(cxx_eval_builtin_function_call): For __builtin_FUNCTION temporarily
adjust current_function_decl from ctx->call context.  For arith
overflow builtins, don't test is_constant_expression on the result,
instead test if arguments are suitable constant expressions.
(cxx_bind_parameters_in_call): Grab parameters from new_call.  Undo
convert_for_arg_passing changes for TREE_ADDRESSABLE type passing.
(cxx_eval_call_expression): Adjust get_fundef_copy caller.
(cxx_eval_conditional_expression): For IF_STMT, allow then or else
operands to be NULL.
(label_matches): Handle BREAK_STMT and CONTINUE_STMT.
(cxx_eval_loop_expr): Add support for FOR_STMT, WHILE_STMT and DO_STMT.
(cxx_eval_switch_expr): Add support for SWITCH_STMT.
(cxx_eval_constant_expression): Handle IF_STMT, FOR_STMT, WHILE_STMT,
DO_STMT, SWITCH_STMT, BREAK_STMT and CONTINUE_STMT.
For SIZEOF_EXPR, recurse on the result of fold_sizeof_expr.  Ignore
DECL_EXPR with USING_DECL operand.
* lambda.c (maybe_add_lambda_conv_op): Build thisarg using
build_int_cst to make it a valid constant expression.

* g++.dg/ubsan/vptr-4.C: Expect reinterpret_cast errors.
* g++.dg/cpp1y/constexpr-84192.C (f2): Adjust expected diagnostics.
* g++.dg/cpp1y/constexpr-70265-2.C (foo): Adjust expected line of
diagnostics.
* g++.dg/cpp1y/constexpr-89285.C: New test.
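
To make the list of statement kinds above concrete, here is a small sketch
of a C++14 constexpr function that uses for, while, do, switch, break and
continue in one body. It is an illustration only, with a made-up function
name, and is not the new constexpr-89285.C test.

```cpp
#include <cassert>

// Every loop and jump statement below has to be evaluated by the constexpr
// machinery when the function is used in a constant expression.
constexpr int
sum_selected (int n)
{
  int total = 0;
  for (int i = 0; i < n; ++i)
    {
      if (i % 2 == 0)
	continue;		// skip even values
      switch (i)
	{
	case 7:
	  total += 100;
	  break;		// leaves the switch, not the for loop
	default:
	  total += i;
	  break;
	}
    }
  int k = 0;
  while (true)
    {
      if (++k == 3)
	break;			// leaves the while loop with k == 3
    }
  do
    total += k;			// adds 3, then 2, then 1
  while (--k > 0);
  return total;
}

static_assert (sum_selected (0) == 6, "only the do loop contributes");
static_assert (sum_selected (10) == 124, "1 + 3 + 5 + 100 + 9 + 6");
```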

--- gcc/cp/constexpr.c.jj   2019-02-17 17:09:47.113351897 +0100
+++ gcc/cp/constexpr.c  2019-02-18 19:34:57.995136395 +0100
@@ -1269,6 +1301,49 @@ cxx_eval_builtin_function_call (const co
return t;
  }
  
+  if (fndecl_built_in_p (fun, BUILT_IN_NORMAL))

+switch (DECL_FUNCTION_CODE (fun))
+  {
+  case BUILT_IN_ADD_OVERFLOW:
+  case BUILT_IN_SADD_OVERFLOW:
+  case BUILT_IN_SADDL_OVERFLOW:
+  case BUILT_IN_SADDLL_OVERFLOW:
+  case BUILT_IN_UADD_OVERFLOW:
+  case BUILT_IN_UADDL_OVERFLOW:
+  case BUILT_IN_UADDLL_OVERFLOW:
+  case BUILT_IN_SUB_OVERFLOW:
+  case BUILT_IN_SSUB_OVERFLOW:
+  case BUILT_IN_SSUBL_OVERFLOW:
+  case BUILT_IN_SSUBLL_OVERFLOW:
+  case BUILT_IN_USUB_OVERFLOW:
+  case BUILT_IN_USUBL_OVERFLOW:
+  case BUILT_IN_USUBLL_OVERFLOW:
+  case BUILT_IN_MUL_OVERFLOW:
+  case BUILT_IN_SMUL_OVERFLOW:
+  case BUILT_IN_SMULL_OVERFLOW:
+  case BUILT_IN_SMULLL_OVERFLOW:
+  case BUILT_IN_UMUL_OVERFLOW:
+  case BUILT_IN_UMULL_OVERFLOW:
+  case BUILT_IN_UMULLL_OVERFLOW:
+   /* These builtins will fold into
+  (cast)
+((something = __real__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>),
+ __imag__ SAVE_EXPR <.???_OVERFLOW (cst1, cst2)>)
+  which fails is_constant_expression.  */
+   if (TREE_CODE (args[0]) != INTEGER_CST
+   || TREE_CODE (args[1]) != INTEGER_CST
+   || !potential_constant_expression (args[2]))
+ {
+   if (!*non_constant_p && !ctx->quiet)
+ error ("%q+E is not a constant expression", new_call);
+   *non_constant_p = true;
+   return t;
+ }
+   return cxx_eval_constant_expression (&new_ctx, new_call, lval,
+non_constant_p, overflow_p);
+  default:
+   break;
+

Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")

2019-02-18 Thread Jason Merrill


On 2/18/19 3:15 PM, Paolo Carlini wrote:

Hi,

On 19/02/19 00:52, Jason Merrill wrote:

On 2/18/19 12:14 PM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 19:28, Jason Merrill wrote:

On 2/18/19 5:31 AM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 10:20, Jason Merrill wrote:

On 2/17/19 6:58 AM, Paolo Carlini wrote:

Hi,

here, when we don't see an initializer we believe we are surely 
dealing with a case of C++17 template argument deduction for 
class templates, but, in fact, it's just an ill-formed C++14 
template variable specialization. Conveniently, we can use here 
too the predicate variable_template_specialization_p. Not 100% 
sure about the exact wording of the error message, I added '#' to 
%qD to explicitly print the auto-using type too.


I guess we should change the assert to a test, so that we give the 
error if we aren't dealing with a class template placeholder. 
Variable templates don't seem to be important to test for.

Thanks, simpler patch.
This error is also pretty poor for this testcase, where there is 
an initializer.


Well, implementation-wise, certainly init == NULL_TREE and only 
when we have an empty pack this specific issue occurs.


In practice, clang simply talks about an empty initializer (during 
instantiation, etc, like we do), whereas EDG explicitly says that 
pack expansion produces an empty list of expressions. I don't think 
that in cp_finish_decl it would be easy for us to do exactly the 
same, we simply see a NULL_TREE as second argument. Or we could 
just *assume* that we are dealing with the outcome of a pack 
expansion, say something like EDG even if we don't have details 
beyond the fact that init == NULL_TREE. I believe that without a 
variadic template the problem cannot occur, because we catch the 
empty initializer much earlier, in grokdeclarator - indeed using a 
!CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? 
Again "instantiated for an empty pack" or something similar?


Perhaps we could complain in the code for empty pack expansion 
handling in tsubst_init?


Ah, thanks Jason. In fact, however, tsubst_init isn't currently 
involved at all, because, at the end of regenerate_decl_from_template 
we call by hand tsubst_expr and assign the result to DECL_INITIAL. 
Simply changing that avoids the ICE. However, the error we issue - 
likewise for the existing cpp0x/auto31.C - is the rather 
user-unfriendly "value-initialization of incomplete type ‘auto’", as 
produced by build_value_init. Thus a simple additional test along the 
lines already discussed, which now becomes much more simple to 
implement in a precise way. Again, wording only tentative. I'm also a 
little puzzled that, otherwise, we could get away with tsubst_expr 
instead of tsubst_init...



+  if (type_uses_auto (TREE_TYPE (decl)))
+    {
+  if (complain & tf_error)
+    error ("initializer for %q#D expands to an empty list "
+   "of expressions", decl);
+  return error_mark_node;
+    }


This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case.

And yes, we mustn't call build_value_init for a dependent type; if the 
type is dependent, we should just return the NULL_TREE.


Good. Then I'm finishing testing the below (currently in libstdc++).



+  if (tree auto_node = type_uses_auto (type))
+   if (!CLASS_PLACEHOLDER_TEMPLATE (auto_node))
+ {
+   if (complain & tf_error)
+ error ("initializer for %q#D expands to an empty list "
+"of expressions", decl);
+   return error_mark_node;
+ }
+
+  if (!dependent_type_p (type))


This should probably be 'else if', since we can have auto outside of a 
template and dependent_type_p will always return false outside of a 
template.
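
For context, a minimal sketch of the well-formed case that the
value-initialization path in tsubst_init exists for, using a made-up
make_value helper: an initializer that is a pack expansion and instantiates
to nothing still declares a variable, which is then value-initialized. With
a declared type of plain auto there is nothing to deduce from, which is the
ill-formed situation the new error is meant to diagnose.

```cpp
#include <cassert>

// Hypothetical helper: direct-initialize a T from a parameter pack.
template <typename T, typename... Args>
T
make_value (Args... args)
{
  // When the pack is empty this is still a variable declaration, not a
  // function declaration, and the variable is value-initialized.
  T value (args...);
  return value;
}
```

Calling make_value<int> () therefore yields 0, while make_value<int> (5)
yields 5.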


Jason

Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")

2019-02-18 Thread Paolo Carlini


Hi,

On 19/02/19 00:52, Jason Merrill wrote:

On 2/18/19 12:14 PM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 19:28, Jason Merrill wrote:

On 2/18/19 5:31 AM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 10:20, Jason Merrill wrote:

On 2/17/19 6:58 AM, Paolo Carlini wrote:

Hi,

here, when we don't see an initializer we believe we are surely 
dealing with a case of C++17 template argument deduction for 
class templates, but, in fact, it's just an ill-formed C++14 
template variable specialization. Conveniently, we can use here 
too the predicate variable_template_specialization_p. Not 100% 
sure about the exact wording of the error message, I added '#' to 
%qD to explicitly print the auto-using type too.


I guess we should change the assert to a test, so that we give the 
error if we aren't dealing with a class template placeholder. 
Variable templates don't seem to be important to test for.

Thanks, simpler patch.
This error is also pretty poor for this testcase, where there is 
an initializer.


Well, implementation-wise, certainly init == NULL_TREE and only 
when we have an empty pack this specific issue occurs.


In practice, clang simply talks about an empty initializer (during 
instantiation, etc, like we do), whereas EDG explicitly says that 
pack expansion produces an empty list of expressions. I don't think 
that in cp_finish_decl it would be easy for us to do exactly the 
same, we simply see a NULL_TREE as second argument. Or we could 
just *assume* that we are dealing with the outcome of a pack 
expansion, say something like EDG even if we don't have details 
beyond the fact that init == NULL_TREE. I believe that without a 
variadic template the problem cannot occur, because we catch the 
empty initializer much earlier, in grokdeclarator - indeed using a 
!CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? 
Again "instantiated for an empty pack" or something similar?


Perhaps we could complain in the code for empty pack expansion 
handling in tsubst_init?


Ah, thanks Jason. In fact, however, tsubst_init isn't currently 
involved at all, because, at the end of regenerate_decl_from_template 
we call by hand tsubst_expr and assign the result to DECL_INITIAL. 
Simply changing that avoids the ICE. However, the error we issue - 
likewise for the existing cpp0x/auto31.C - is the rather 
user-unfriendly "value-initialization of incomplete type ‘auto’", as 
produced by build_value_init. Thus a simple additional test along the 
lines already discussed, which now becomes much simpler to 
implement in a precise way. Again, wording only tentative. I'm also a 
little puzzled that, otherwise, we could get away with tsubst_expr 
instead of tsubst_init...



+  if (type_uses_auto (TREE_TYPE (decl)))
+    {
+  if (complain & tf_error)
+    error ("initializer for %q#D expands to an empty list "
+   "of expressions", decl);
+  return error_mark_node;
+    }


This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case.

And yes, we mustn't call build_value_init for a dependent type; if the 
type is dependent, we should just return the NULL_TREE.


Good. Then I'm finishing testing the below (currently in libstdc++).

Thanks, Paolo.

//

Index: cp/pt.c
===
--- cp/pt.c (revision 268997)
+++ cp/pt.c (working copy)
@@ -15422,21 +15422,34 @@ tsubst_init (tree init, tree decl, tree args,
 
   init = tsubst_expr (init, args, complain, in_decl, false);
 
-  if (!init && TREE_TYPE (decl) != error_mark_node)
+  tree type = TREE_TYPE (decl);
+
+  if (!init && type != error_mark_node)
 {
-  /* If we had an initializer but it
-instantiated to nothing,
-value-initialize the object.  This will
-only occur when the initializer was a
-pack expansion where the parameter packs
-used in that expansion were of length
-zero.  */
-  init = build_value_init (TREE_TYPE (decl),
-  complain);
-  if (TREE_CODE (init) == AGGR_INIT_EXPR)
-   init = get_target_expr_sfinae (init, complain);
-  if (TREE_CODE (init) == TARGET_EXPR)
-   TARGET_EXPR_DIRECT_INIT_P (init) = true;
+  if (tree auto_node = type_uses_auto (type))
+   if (!CLASS_PLACEHOLDER_TEMPLATE (auto_node))
+ {
+   if (complain & tf_error)
+ error ("initializer for %q#D expands to an empty list "
+"of expressions", decl);
+   return error_mark_node;
+ }
+
+  if (!dependent_type_p (type))
+   {
+ /* If we had an initializer but it
+instantiated to nothing,
+value-initialize the object.  This will
+only occur when the initializer was a
+pack expansion where the parameter packs
+used in that expansion were of length
+zero.  */
+ init = build_value_init (type, complain);
+ if

Re: [C++ PATCH] Fix cxx_eval_store_expression (PR c++/89336)

2019-02-18 Thread Jason Merrill


On 2/17/19 3:34 AM, Jakub Jelinek wrote:

On Sat, Feb 16, 2019 at 08:51:33AM -1000, Jason Merrill wrote:

The likely case is still that nothing has changed in between, so this patch
just quickly verifies if that is the case (by comparing
CONSTRUCTOR_ELT (ctor, 0) with the previously saved value of that and by
checking if at the spot in the vector is the expected index).  If that is
the case, it doesn't do anything else, otherwise it updates the valp
pointer.


For scalar types, as in all your testcases, we can evaluate the initializer
before the target, as C++17 wants.  We probably still need your patch for
when type is a class.
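
A minimal sketch of the scalar case mentioned here, assuming C++14 or later. The names Arr and fill are hypothetical and this is not the testcase from the PR; it only shows a store whose right-hand side is itself a store into the same object, so the value has to be evaluated before the outer target is looked up:

```cpp
#include <cassert>

// Hypothetical aggregate; not taken from the GCC testsuite.
struct Arr {
  int a[16];
  constexpr int &operator[](int i) { return a[i]; }
  constexpr const int &operator[](int i) const { return a[i]; }
};

constexpr Arr fill() {
  Arr r{};
  for (int i = 0; i < 6; ++i)
    // The RHS `r[i] = i + 1` modifies r before the outer store to r[i + 8].
    r[i + 8] = r[i] = i + 1;
  return r;
}

constexpr Arr filled = fill();
static_assert(filled[0] == 1 && filled[8] == 1, "first pair stored");
static_assert(filled[5] == 6 && filled[13] == 6, "last pair stored");
static_assert(filled[6] == 0 && filled[14] == 0, "untouched elements stay zero");

// Runtime check that the same function agrees with the constant evaluation.
inline void check_fill() {
  Arr r = fill();
  assert(r[3] == 4 && r[11] == 4);
}
```

The elements are scalars, so under the approach described above the inner store would be preevaluated in both the assignment and the initialization case.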


If you are ok that the scalar vs. aggregate case will be handled
differently, I'm all for your patch, though I guess instead of that second
hunk it should change:
   if (AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type))
into:
   if (!preeval)
and move the init = cxx_eval_constant_expression ... call into the
body of that if.  I guess that means the scalar store will be handled right
even for unions then.  Just wonder if similar to
   if (*non_constant_p)
 return t;
after target = cxx_eval_... we shouldn't have that for (both) init =
cxx_eval_... cases too.


Thanks, done.


The testcases can be all changed to work with say struct Z { int z; };
instead of int (or any other aggregate) and I think my patch or something
similar is needed.


But they would still be doing assignment, rather than initialization, so 
they would still be preevaluated and work.



With unions, I think the most nasty case is when the union member to which
we want to store is active before an assignment, but is then made inactive
and later active again.
struct Z { int x, y; };
union W { Z a; long long w; };
W w {};
w.a = { 5, 0 }; // w.a becomes the active member
w.a = { (int) (w.w = 17LL + w.a.x), 2 };
So, if we don't preevaluate init, we look up w.a as { 5, 0 } active member
and try to store that, but in the meantime the init evaluation changes
active member to something different, which should invalidate w.a.


Here also we're looking at assignment.  Here's a modification that still 
breaks with my patch:


struct Z { int x, y; };
union W {
  Z a; long long w;
  constexpr W(): a({int(this->w = 42), 2}) {}
};
constexpr W w {};
static_assert (w.a.x == 42);

But it's not clear to me that the standard actually allows this.  I 
don't think changing the active member of a union in the mem-initializer 
for another member is reasonable.


So, I'm going to apply this:
commit b5aa6e87a705496c38639b697317b0bd764dab30
Author: Jason Merrill 
Date:   Fri Feb 15 13:09:33 2019 -1000

PR c++/89336 - multiple stores in constexpr stmt.

If we evaluate the RHS in the context of the LHS, that evaluation might
change the LHS in ways that mess with being able to store the value later.
So for assignment or scalar values, evaluate the RHS first.

* constexpr.c (cxx_eval_store_expression): Preevaluate scalar or
assigned value.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index d946a797999..d413c6b9b27 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3634,6 +3634,18 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
   maybe_simplify_trivial_copy (target, init);
 
   tree type = TREE_TYPE (target);
+  bool preeval = SCALAR_TYPE_P (type) || TREE_CODE (t) == MODIFY_EXPR;
+  if (preeval)
+{
+  /* Evaluate the value to be stored without knowing what object it will be
+	 stored in, so that any side-effects happen first.  */
+  if (!SCALAR_TYPE_P (type))
+	new_ctx.ctor = new_ctx.object = NULL_TREE;
+  init = cxx_eval_constant_expression (&new_ctx, init, false,
+	   non_constant_p, overflow_p);
+  if (*non_constant_p)
+	return t;
+}
   target = cxx_eval_constant_expression (ctx, target,
 	 true,
 	 non_constant_p, overflow_p);
@@ -3834,7 +3846,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 }
   release_tree_vector (refs);
 
-  if (AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type))
+  if (!preeval)
 {
   /* Create a new CONSTRUCTOR in case evaluation of the initializer
 	 wants to modify it.  */
@@ -3843,21 +3855,20 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 	  *valp = build_constructor (type, NULL);
 	  CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
 	}
-  else if (TREE_CODE (*valp) == PTRMEM_CST)
-	*valp = cplus_expand_constant (*valp);
   new_ctx.ctor = *valp;
   new_ctx.object = target;
+  init = cxx_eval_constant_expression (&new_ctx, init, false,
+	   non_constant_p, overflow_p);
+  if (target == object)
+	/* The hash table might have moved since the get earlier.  */
+	valp = ctx->values->get (object);
 }
 
-  init = cxx_eval_constant_expression (&new_ctx, init, false,
-   non_constant_p, overflow_p);
   /* Don't share a CONSTRUCTOR that might be changed later.  */
   init = unshare_constructor (init);
-  if (target == object)
-

Re: [C++ PATCH] Fix maybe_generic_this_capture ICE on USING_DECL (PR c++/89387)

2019-02-18 Thread Jason Merrill


On 2/18/19 1:02 PM, Jakub Jelinek wrote:

Hi!

On the following testcase, id_expr is false and TREE_CODE (*iter)
is USING_DECL (and the following one is FUNCTION_DECL).
Since the USING_DECL changes, this ICEs because
DECL_NONSTATIC_MEMBER_FUNCTION_P uses TREE_TYPE which can't be used here.
Previously, I believe DECL_NONSTATIC_MEMBER_FUNCTION_P would never be true
for USING_DECLs.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

Or should it use != USING_DECL instead (what should be
DECL_NONSTATIC_MEMBER_FUNCTION_P checked on other than
FUNCTION_DECL/TEMPLATE_DECL)?


It only applies if DECL_DECLARES_FUNCTION_P.  But the only other thing 
we should encounter is USING_DECL.  So let's use != USING_DECL like the 
other places Alex changed.


Jason

Re: [C++ PATCH] Avoid ICE on void to type&& reinterpret_cast (PR c++/89391)

2019-02-18 Thread Jason Merrill


On 2/18/19 12:58 PM, Jakub Jelinek wrote:

Hi!

The if (TYPE_REF_IS_RVALUE (type)) code has been added recently,
but build_target_expr_with_type asserts that the expression doesn't have
void type.  Fixed by using the old handling in that case (the expression is
not an lvalue in that case and a diagnostic is emitted if complain is set).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-18  Jakub Jelinek  

PR c++/89391
* typeck.c (build_reinterpret_cast_1): Don't let a void to
&& conversion go through build_target_expr_with_type.


OK.

Jason

Re: [C++ PATCH] Don't ICE on invalid scoped enum E::~E (PR c++/89390)

2019-02-18 Thread Jason Merrill


On 2/18/19 12:50 PM, Jakub Jelinek wrote:

Hi!

On the following testcase we ICE because name is BIT_NOT_EXPR and
suggest_alternative_in_scoped_enum assumes it is called on IDENTIFIER_NODE
only.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?


OK.


There is another issue: starting with 7.x we don't use a sensible location in
the diagnostics.  6.x emitted
pr89390.C: In function ‘void foo()’:
pr89390.C:9:3: error: ‘~A’ is not a member of ‘A’
A::~A (); // { dg-error "'~A' is not a member of 'A'" }
^
but 7.x and later emits:
In function ‘void foo()’:
cc1plus: error: ‘~A’ is not a member of ‘A’

This patch doesn't deal with that, but it would be nice to provide a location;
dunno if it is enough to just use the location of ~, or if we need to spend
memory and build ~A as a combined range with the caret on ~.


I think having a range for a destructor id-expression would be appropriate.

Jason

Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")

2019-02-18 Thread Jason Merrill


On 2/18/19 12:14 PM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 19:28, Jason Merrill wrote:

On 2/18/19 5:31 AM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 10:20, Jason Merrill wrote:

On 2/17/19 6:58 AM, Paolo Carlini wrote:

Hi,

here, when we don't see an initializer we believe we are surely 
dealing with a case of C++17 template argument deduction for class 
templates, but, in fact, it's just an ill-formed C++14 template 
variable specialization. Conveniently, we can use here too the 
predicate variable_template_specialization_p. Not 100% sure about 
the exact wording of the error message, I added '#' to %qD to 
explicitly print the auto-using type too.


I guess we should change the assert to a test, so that we give the 
error if we aren't dealing with a class template placeholder. 
Variable templates don't seem to be important to test for.

Thanks, simpler patch.
This error is also pretty poor for this testcase, where there is an 
initializer.


Well, implementation-wise, certainly init == NULL_TREE and only when 
we have an empty pack this specific issue occurs.


In practice, clang simply talks about an empty initializer (during 
instantiation, etc, like we do), whereas EDG explicitly says that 
pack expansion produces an empty list of expressions. I don't think 
that in cp_finish_decl it would be easy for us to do exactly the 
same, we simply see a NULL_TREE as second argument. Or we could just 
*assume* that we are dealing with the outcome of a pack expansion, 
say something like EDG even if we don't have details beyond the fact 
that init == NULL_TREE. I believe that without a variadic template 
the problem cannot occur, because we catch the empty initializer much 
earlier, in grokdeclarator - indeed using a 
!CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? 
Again "instantiated for an empty pack" or something similar?


Perhaps we could complain in the code for empty pack expansion 
handling in tsubst_init?


Ah, thanks Jason. In fact, however, tsubst_init isn't currently involved 
at all, because, at the end of regenerate_decl_from_template we call by 
hand tsubst_expr and assign the result to DECL_INITIAL. Simply changing 
that avoids the ICE. However, the error we issue - likewise for the 
existing cpp0x/auto31.C - is the rather user-unfriendly 
"value-initialization of incomplete type ‘auto’", as produced by 
build_value_init. Thus a simple additional test along the lines already 
discussed, which now becomes much simpler to implement in a precise 
way. Again, wording only tentative. I'm also a little puzzled that, 
otherwise, we could get away with tsubst_expr instead of tsubst_init...



+  if (type_uses_auto (TREE_TYPE (decl)))
+   {
+ if (complain & tf_error)
+   error ("initializer for %q#D expands to an empty list "
+  "of expressions", decl);
+ return error_mark_node;
+   }


This needs to allow the CLASS_PLACEHOLDER_TEMPLATE case.

And yes, we mustn't call build_value_init for a dependent type; if the 
type is dependent, we should just return the NULL_TREE.


Jason
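
A minimal sketch of the empty pack expansion case discussed in this thread, under the assumption that the variable has a concrete (non-auto) type. The helper name make is hypothetical. When the pack is empty the initializer instantiates to nothing and the object is value-initialized, which is the behaviour the comment in tsubst_init describes:

```cpp
#include <cassert>

// Hypothetical helper: direct-initializes a T from a (possibly empty) pack.
template <typename T, typename... Args>
T make(Args... args) {
  // With an empty pack the initializer expands to an empty list of
  // expressions; this is still a variable declaration, and obj is
  // value-initialized.
  T obj(args...);
  return obj;
}

// By contrast, `auto obj(args...);` has nothing to deduce from when the pack
// is empty, so that instantiation is ill-formed.  That is the case the
// diagnostic discussed above is aimed at; it is left out so the sketch compiles.

inline void check_make() {
  assert(make<int>() == 0);   // empty pack: value-initialization
  assert(make<int>(7) == 7);  // one argument: ordinary direct-initialization
}
```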

Re: [PATCH, libphobos] Detect if qsort_r is available (PR d/88127)

2019-02-18 Thread Iain Buclaw

On Sat, 2 Feb 2019 at 11:01, Johannes Pfau  wrote:
>
> Adds a configure test for qsort_r and uses the fallback code path if
> it's not available. Fixes d/88127. rt/qsort.d changes have been
> pushed upstream and reviewed there: 
> https://github.com/dlang/druntime/pull/2480
> Bootstrapped & ran D test suite on x86_64_linux with a recent glibc,
> checked that Have_Qsort_R is set correctly in config.d.
>
> libphobos/ChangeLog:
>
> 2019-02-02  Johannes Pfau  
>
> * m4/druntime/libraries.m4: Add check for qsort_r as 
> DRUNTIME_LIBRARIES_CLIB.
> * configure.ac: Use qsort_r check.
> * libdruntime/gcc/config.d.in: Add Have_Qsort_R to store check result.
> * libdruntime/rt/qsort.d: Check Have_Qsort_R before using qsort_r.
> * Makefile.in: Regenerate.
> * aclocal.m4: Regenerate.
> * configure: Regenerate.
> * libdruntime/Makefile.in: Regenerate.
> * src/Makefile.in: Regenerate.
> * testsuite/Makefile.in: Regenerate.
>
> ---
>  libphobos/Makefile.in |  7 +++--
>  libphobos/aclocal.m4  | 40 +--
>  libphobos/configure   | 26 +++--
>  libphobos/configure.ac|  1 +
>  libphobos/libdruntime/Makefile.in |  7 +++--
>  libphobos/libdruntime/gcc/config.d.in |  3 ++
>  libphobos/libdruntime/rt/qsort.d  | 18 
>  libphobos/m4/druntime/libraries.m4| 12 
>  libphobos/src/Makefile.in |  5 ++--
>  libphobos/testsuite/Makefile.in   |  5 ++--
>  10 files changed, 92 insertions(+), 32 deletions(-)
>

Adjusted the changelog entry to fit within 80 characters.

Committed as r268999.

Thanks,
-- 
Iain

Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)

2019-02-18 Thread Joseph Myers

On Sat, 16 Feb 2019, Jakub Jelinek wrote:

> Hi!
> 
> Both the C and C++ standards guarantee that the argc argument to main is
> non-negative, the following patch sets (or adjusts) the corresponding
> SSA_NAME_RANGE_INFO.  While main is just one, with IPA VRP it can also
> propagate etc.  I had to change one testcase because it started optimizing
> it better (the test has been folded away), so no sinking was done.

In C, unlike in C++, it's valid to call main recursively.  I think the 
requirements on argc and argv must be understood to apply only to their 
values on entry to the initial call to main, not to any recursive calls.  
So I don't think this optimization is valid for C (in the absence of 
whole-program information that can prove the absence of any recursive 
calls).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Fix libphobos testsuite failures on Solaris

2019-02-18 Thread Iain Buclaw

On Tue, 29 Jan 2019 at 15:44, Rainer Orth  wrote:
>
> Yet another trivial fix for a Solaris libphobos testsuite failure:
>
> FAIL: libphobos.shared/load.d -shared-libphobos -ldl (test for excess errors)
> Excess errors:
> /vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared/load.d:9: 
> error: static assert  "unimplemented"
>
> I guess this is obvious?  Tested on i386-pc-solaris2.11.  Ok for
> mainline?
>

Looks ok.

As the OS-specific bindings are only imported for RTLD_NOLOAD, this
could be made explicit in the static assert.

---
import core.sys.posix.dlfcn;

version (DragonFlyBSD) import core.sys.dragonflybsd.dlfcn : RTLD_NOLOAD;
version (FreeBSD) import core.sys.freebsd.dlfcn : RTLD_NOLOAD;
version (linux) import core.sys.linux.dlfcn : RTLD_NOLOAD;
version (NetBSD) import core.sys.netbsd.dlfcn : RTLD_NOLOAD;
version (OSX) import core.sys.darwin.dlfcn : RTLD_NOLOAD;
version (Solaris) import core.sys.solaris.dlfcn : RTLD_NOLOAD;

static assert(__traits(compiles, RTLD_NOLOAD), "unimplemented");
---


-- 
Iain

Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)

2019-02-18 Thread Segher Boessenkool

On Mon, Feb 18, 2019 at 11:55:56PM +0100, Jakub Jelinek wrote:
> On Mon, Feb 18, 2019 at 04:47:57PM -0600, Segher Boessenkool wrote:
> > On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote:
> > > Both the C and C++ standards guarantee that the argc argument to main is
> > > non-negative, the following patch sets (or adjusts) the corresponding
> > > SSA_NAME_RANGE_INFO.
> > 
> > I think this should test for flag_hosted somehow?  And check that this is
> 
> Why?  Does -ffreestanding mean it can violate the C/C++ requirements?

No, but nothing is required of the arguments to the main function in a
freestanding implementation.  C11 5.1.2.1/1.

> AFAIK we don't guard other MAIN_NAME_P uses in the compiler with C/C++
> checks.  E.g. "Nothing escapes by returning from main though." in
> tree-ssa-structalias.c, various other spots.

GCC hasn't historically required "int" for the first argument of the main
function, as far as I know.  This is separate from saying the main function
is called "main".


Segher

[C++ PATCH] Fix maybe_generic_this_capture ICE on USING_DECL (PR c++/89387)

2019-02-18 Thread Jakub Jelinek

Hi!

On the following testcase, id_expr is false and TREE_CODE (*iter)
is USING_DECL (and the following one is FUNCTION_DECL).
Since the USING_DECL changes, this ICEs because
DECL_NONSTATIC_MEMBER_FUNCTION_P uses TREE_TYPE which can't be used here.
Previously, I believe DECL_NONSTATIC_MEMBER_FUNCTION_P would never be true
for USING_DECLs.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

Or should it use != USING_DECL instead (what should be
DECL_NONSTATIC_MEMBER_FUNCTION_P checked on other than
FUNCTION_DECL/TEMPLATE_DECL)?

2019-02-18  Jakub Jelinek  

PR c++/89387
* lambda.c (maybe_generic_this_capture): Don't check
DECL_NONSTATIC_MEMBER_FUNCTION_P on USING_DECLs.

* g++.dg/cpp0x/lambda/lambda-89387.C: New test.

--- gcc/cp/lambda.c.jj  2019-02-18 20:48:32.112741017 +0100
+++ gcc/cp/lambda.c 2019-02-18 21:49:23.319629179 +0100
@@ -941,7 +941,8 @@ maybe_generic_this_capture (tree object,
  fns = TREE_OPERAND (fns, 0);
 
for (lkp_iterator iter (fns); iter; ++iter)
- if ((!id_expr || TREE_CODE (*iter) == TEMPLATE_DECL)
+ if (((!id_expr && TREE_CODE (*iter) == FUNCTION_DECL)
+  || TREE_CODE (*iter) == TEMPLATE_DECL)
  && DECL_NONSTATIC_MEMBER_FUNCTION_P (*iter))
{
  /* Found a non-static member.  Capture this.  */
--- gcc/testsuite/g++.dg/cpp0x/lambda/lambda-89387.C.jj 2019-02-18 21:56:46.410339001 +0100
+++ gcc/testsuite/g++.dg/cpp0x/lambda/lambda-89387.C    2019-02-18 21:55:58.869119054 +0100
@@ -0,0 +1,11 @@
+// PR c++/89387
+// { dg-do compile { target c++11 } }
+
+template  class T>
+struct S {
+  using A = int;
+  using B = T;
+  using B::foo;
+  void bar () { [&] { foo (); }; }
+  void foo ();
+};

Jakub
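
For context, a small well-formed sketch of the situation this loop deals with: a lambda inside a member function calls a member that was introduced by a using-declaration, so `this` has to be captured. The types Base and S below are hypothetical and are not the testcase from the patch:

```cpp
#include <cassert>

// Hypothetical base class providing the member function.
struct Base {
  int foo() { return 42; }
};

template <typename T>
struct S : T {
  using T::foo;  // dependent using-declaration naming a member of the base
  int bar() {
    // foo is a non-static member function, so the call needs `this`;
    // the [&] capture-default captures it implicitly.
    return [&] { return foo(); }();
  }
};

inline void check_capture() {
  S<Base> s;
  assert(s.bar() == 42);
}
```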

[C++ PATCH] Avoid ICE on void to type&& reinterpret_cast (PR c++/89391)

2019-02-18 Thread Jakub Jelinek

Hi!

The if (TYPE_REF_IS_RVALUE (type)) code has been added recently,
but build_target_expr_with_type asserts that the expression doesn't have
void type.  Fixed by using the old handling in that case (the expression is
not an lvalue in that case and a diagnostic is emitted if complain is set).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-18  Jakub Jelinek  

PR c++/89391
* typeck.c (build_reinterpret_cast_1): Don't let a void to
&& conversion go through build_target_expr_with_type.

* g++.dg/cpp0x/reinterpret_cast2.C: New test.

--- gcc/cp/typeck.c.jj  2019-01-30 08:35:46.990055278 +0100
+++ gcc/cp/typeck.c 2019-02-18 21:19:09.727590300 +0100
@@ -7477,7 +7477,7 @@ build_reinterpret_cast_1 (tree type, tre
  reinterpret_cast.  */
   if (TYPE_REF_P (type))
 {
-  if (TYPE_REF_IS_RVALUE (type))
+  if (TYPE_REF_IS_RVALUE (type) && !VOID_TYPE_P (intype))
{
  if (!obvalue_p (expr))
/* Perform the temporary materialization conversion.  */
--- gcc/testsuite/g++.dg/cpp0x/reinterpret_cast2.C.jj   2019-02-18 21:27:24.844391776 +0100
+++ gcc/testsuite/g++.dg/cpp0x/reinterpret_cast2.C  2019-02-18 21:27:05.261723238 +0100
@@ -0,0 +1,10 @@
+// PR c++/89391
+// { dg-do compile { target c++11 } }
+
+struct S { };
+
+void
+foo ()
+{
+  auto a = reinterpret_cast<S&&>(foo ());  // { dg-error "invalid cast of an rvalue expression of type 'void' to type" }
+}

Jakub

Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)

2019-02-18 Thread Jakub Jelinek

On Mon, Feb 18, 2019 at 04:47:57PM -0600, Segher Boessenkool wrote:
> On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote:
> > Both the C and C++ standards guarantee that the argc argument to main is
> > non-negative, the following patch sets (or adjusts) the corresponding
> > SSA_NAME_RANGE_INFO.
> 
> I think this should test for flag_hosted somehow?  And check that this is

Why?  Does -ffreestanding mean it can violate the C/C++ requirements?
AFAIK we don't guard other MAIN_NAME_P uses in the compiler with C/C++
checks.  E.g. "Nothing escapes by returning from main though." in
tree-ssa-structalias.c, various other spots.

> a C-like language anyway?

The patch used to do that check, but I think we should be able to avoid
that.  I think in other languages main is just a C wrapper or compiler
generated C-like wrapper that actually calls the main program's subroutine
and so the C requirements apply to it too.

Jakub

[C++ PATCH] Don't ICE on invalid scoped enum E::~E (PR c++/89390)

2019-02-18 Thread Jakub Jelinek

Hi!

On the following testcase we ICE because name is BIT_NOT_EXPR and
suggest_alternative_in_scoped_enum assumes it is called on IDENTIFIER_NODE
only.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

There is another issue: starting with 7.x we don't use a sensible location in
the diagnostics.  6.x emitted
pr89390.C: In function ‘void foo()’:
pr89390.C:9:3: error: ‘~A’ is not a member of ‘A’
   A::~A (); // { dg-error "'~A' is not a member of 'A'" }
   ^
but 7.x and later emits:
In function ‘void foo()’:
cc1plus: error: ‘~A’ is not a member of ‘A’

This patch doesn't deal with that, but it would be nice to provide a location;
dunno if it is enough to just use the location of ~, or if we need to spend
memory and build ~A as a combined range with the caret on ~.

2019-02-18  Jakub Jelinek  

PR c++/89390
* error.c (qualified_name_lookup_error): Only call
suggest_alternative_in_scoped_enum if name is IDENTIFIER_NODE.

* g++.dg/diagnostic/pr89390.C: New test.

--- gcc/cp/error.c.jj   2019-01-17 09:03:11.486787567 +0100
+++ gcc/cp/error.c  2019-02-18 20:56:48.047604338 +0100
@@ -4276,7 +4276,7 @@ qualified_name_lookup_error (tree scope,
   else
{
  name_hint hint;
- if (SCOPED_ENUM_P (scope))
+ if (SCOPED_ENUM_P (scope) && TREE_CODE (name) == IDENTIFIER_NODE)
hint = suggest_alternative_in_scoped_enum (name, scope);
  if (const char *suggestion = hint.suggestion ())
{
--- gcc/testsuite/g++.dg/diagnostic/pr89390.C.jj    2019-02-18 20:58:47.358646700 +0100
+++ gcc/testsuite/g++.dg/diagnostic/pr89390.C   2019-02-18 20:58:13.746198205 +0100
@@ -0,0 +1,10 @@
+// PR c++/89390
+// { dg-do compile { target c++11 } }
+
+enum class A { B, C };
+
+void
+foo ()
+{
+  A::~A (); // { dg-error "'~A' is not a member of 'A'" "" { target *-*-* } 0 }
+}

Jakub

Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)

2019-02-18 Thread Segher Boessenkool

Hi Jakub,

On Sat, Feb 16, 2019 at 08:12:34AM +0100, Jakub Jelinek wrote:
> Both the C and C++ standards guarantee that the argc argument to main is
> non-negative, the following patch sets (or adjusts) the corresponding
> SSA_NAME_RANGE_INFO.

I think this should test for flag_hosted somehow?  And check that this is
a C-like language anyway?

Segher

[PR fortran/89266, patch] - ICE with TRANSFER of len=0 character array constructor

2019-02-18 Thread Harald Anlauf

The issue in the PR is caused during simplification in the frontend
because it does not properly differentiate between expressions of
size 0 (e.g. arrays of length 0 or character strings of len=0)
and failure.

The attached patch tries to solve this problem by modifying the
helper functions gfc_element_size and gfc_target_expr_size to
return a bool when simplification fails.  All users of these
functions needed adjustment, most of which was more or less
mechanical.
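
The interface change described above can be sketched in isolation as follows. This is only an illustration of the pattern, written in C++; Expr, expr_size_old and expr_size are hypothetical names and do not correspond to the gfortran data structures or to the actual signatures in target-memory.c:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a front-end expression; not a gfortran type.
struct Expr {
  bool size_known;
  std::size_t elements;
  std::size_t element_bytes;
};

// Old convention: a return value of 0 means either "empty" or "unknown".
std::size_t expr_size_old(const Expr &e) {
  return e.size_known ? e.elements * e.element_bytes : 0;
}

// New convention: the return value reports success and the size is passed
// back through the out-parameter, so a size of 0 is no longer ambiguous.
bool expr_size(const Expr &e, std::size_t *size) {
  if (!e.size_known)
    return false;
  *size = e.elements * e.element_bytes;
  return true;
}

inline void check_sizes() {
  Expr empty{true, 0, 4};     // zero-length array with a known element size
  Expr unknown{false, 0, 0};  // size cannot be determined
  // The old interface cannot tell these two apart.
  assert(expr_size_old(empty) == expr_size_old(unknown));
  std::size_t n = 123;
  assert(expr_size(empty, &n) && n == 0);
  assert(!expr_size(unknown, &n));
}
```

A caller that used to test the result against 0 now tests the boolean instead, which is the kind of adjustment the check.c hunk makes.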

There was one case left (in check.c) where I am unsure if I got
the logic right.  In the worst case it should produce a new bug
for code that would have generated an ICE before.

Since the above fix also works for non-character arrays of length 0,
I added a suitable test.

Regtested on x86_64-pc-linux-gnu.

OK for trunk?  Or rather wait for post-9.1?

Thanks,
Harald

2019-02-18  Harald Anlauf  

PR fortran/89266
* target-memory.c (gfc_element_size): Return false if element size
cannot be determined; element size is returned separately.
(gfc_target_expr_size): Return false if expression size cannot be
determined; expression size is returned separately.
* target-memory.h: Adjust prototypes.
* check.c (gfc_calculate_transfer_sizes): Adjust references to
gfc_target_expr_size, gfc_element_size.
* arith.c (hollerith2representation): Likewise.
* class.c (find_intrinsic_vtab): Likewise.
* simplify.c (gfc_simplify_sizeof): Likewise.

2019-02-18  Harald Anlauf  

PR fortran/89266
* gfortran.dg/pr89266.f90: New test.

Index: gcc/fortran/arith.c
===
--- gcc/fortran/arith.c (revision 268993)
+++ gcc/fortran/arith.c (working copy)
@@ -2548,10 +2548,10 @@
 static void
 hollerith2representation (gfc_expr *result, gfc_expr *src)
 {
-  int src_len, result_len;
+  size_t src_len, result_len;
 
   src_len = src->representation.length - src->ts.u.pad;
-  result_len = gfc_target_expr_size (result);
+  gfc_target_expr_size (result, &result_len);
 
   if (src_len > result_len)
 {
Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c (revision 268993)
+++ gcc/fortran/check.c (working copy)
@@ -5480,16 +5480,15 @@
 return false;
 
   /* Calculate the size of the source.  */
-  *source_size = gfc_target_expr_size (source);
-  if (*source_size == 0)
+  if (!gfc_target_expr_size (source, source_size))
 return false;
 
   /* Determine the size of the element.  */
-  result_elt_size = gfc_element_size (mold);
-  if (result_elt_size == 0)
+  if (!gfc_element_size (mold, &result_elt_size))
 return false;
 
-  if (mold->expr_type == EXPR_ARRAY || mold->rank || size)
+  if ((result_elt_size > 0 && (mold->expr_type == EXPR_ARRAY || mold->rank))
+  || size)
 {
   int result_length;
 
Index: gcc/fortran/class.c
===
--- gcc/fortran/class.c (revision 268993)
+++ gcc/fortran/class.c (working copy)
@@ -2666,6 +2666,7 @@
  gfc_namespace *sub_ns;
  gfc_namespace *contained;
  gfc_expr *e;
+ size_t e_size;
 
  gfc_get_symbol (name, ns, );
  if (!gfc_add_flavor (>attr, FL_DERIVED, NULL,
@@ -2700,11 +2701,13 @@
  e = gfc_get_expr ();
  e->ts = *ts;
  e->expr_type = EXPR_VARIABLE;
+ if (ts->type == BT_CHARACTER)
+   e_size = ts->kind;
+ else
+   gfc_element_size (e, &e_size);
  c->initializer = gfc_get_int_expr (gfc_size_kind,
 NULL,
-ts->type == BT_CHARACTER
-? ts->kind
-: gfc_element_size (e));
+e_size);
  gfc_free_expr (e);
 
  /* Add component _extends.  */
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c  (revision 268993)
+++ gcc/fortran/simplify.c  (working copy)
@@ -7379,6 +7379,7 @@
 {
   gfc_expr *result = NULL;
   mpz_t array_size;
+  size_t res_size;
 
   if (x->ts.type == BT_CLASS || x->ts.deferred)
 return NULL;
@@ -7394,7 +7395,8 @@
 
   result = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
   &x->where);
-  mpz_set_si (result->value.integer, gfc_target_expr_size (x));
+  gfc_target_expr_size (x, &res_size);
+  mpz_set_si (result->value.integer, res_size);
 
   return result;
 }
@@ -7408,6 +7410,7 @@
 {
   gfc_expr *result = NULL;
   int k;
+  size_t siz;
 
   if (x->ts.type == BT_CLASS || x->ts.deferred)
 return NULL;
@@ -7423,7 +7426,8 @@
 
   result = gfc_get_constant_expr (BT_INTEGER, k, &x->where);
 
-  mpz_set_si

[C++ PATCH] Ensure constexpr evaluation is done on pre-cp_fold_function bodies (PR c++/89285)

2019-02-18 Thread Jakub Jelinek

Hi!

As mentioned in the PR, we've regressed on the trunk in diagnostics of some
invalid constexpr evaluations.  The problem is that the constexpr evaluation
is effectively done on post-cp_fold_function bodies/arguments and cp_fold
optimizes away some important trees for constexpr diagnostics, either
itself, or through using GENERIC match.pd (on the testcase in particular
diagnostics about reinterpret_cast).
While we save on constexpr call hash table bodies of the functions
pre-cp_fold_function, due to sharing and cp_fold_r the STATEMENT_LIST
statements etc. are modified directly and genericization modifies it as
well.

The following patch uses copy_fn, which we have already been using for the
recursive constexpr cases, also to make a copy of the constexpr function
before cp_fold_function clobbers it.
I had to implement cxx_eval_conditional_expression handling of various
C++ FE statements that are replaced during genericization.
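
As an illustration of the statement kinds involved, here is a small sketch, assuming C++14 or later, of constexpr functions whose bodies use the C++ front-end loop, switch, break and continue statements. The function names are hypothetical and this is not the new testcase; the comments map each construct to the tree code named in the ChangeLog below:

```cpp
#include <cassert>

constexpr int sum_odd_below(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i) {  // FOR_STMT
    if (i % 2 == 0)              // IF_STMT without an else operand
      continue;                  // CONTINUE_STMT
    total += i;
  }
  return total;
}

constexpr int classify(int v) {
  int r = 0;
  switch (v) {                   // SWITCH_STMT with BREAK_STMTs
    case 0: r = 10; break;
    case 1: r = 20; break;
    default: r = 30; break;
  }
  int k = 0;
  while (k < 3)                  // WHILE_STMT
    ++k;
  do {                           // DO_STMT
    ++r;
  } while (false);
  return r + k;
}

static_assert(sum_odd_below(6) == 9, "1 + 3 + 5");
static_assert(classify(1) == 24, "20 + 1 + 3");

// Runtime check that ordinary evaluation agrees with constant evaluation.
inline void check_runtime() {
  assert(sum_odd_below(6) == 9);
  assert(classify(1) == 24);
}
```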

Bootstrapped/regtested on x86_64-linux and i686-linux (98,11,14,17,2a), ok
for trunk?

2019-02-18  Jakub Jelinek  

PR c++/89285
* constexpr.c (struct constexpr_fundef): Add parms and result members.
(retrieve_constexpr_fundef): Adjust for the above change.
(register_constexpr_fundef): Save constexpr body with copy_fn,
temporarily set DECL_CONTEXT on DECL_RESULT before that.
(get_fundef_copy): Change FUN argument to FUNDEF with
constexpr_fundef * type, grab body and parms/result out of
constexpr_fundef struct and temporarily change it for copy_fn calls
too.
(cxx_eval_builtin_function_call): For __builtin_FUNCTION temporarily
adjust current_function_decl from ctx->call context.  For arith
overflow builtins, don't test is_constant_expression on the result,
instead test if arguments are suitable constant expressions.
(cxx_bind_parameters_in_call): Grab parameters from new_call.  Undo
convert_for_arg_passing changes for TREE_ADDRESSABLE type passing.
(cxx_eval_call_expression): Adjust get_fundef_copy caller.
(cxx_eval_conditional_expression): For IF_STMT, allow then or else
operands to be NULL.
(label_matches): Handle BREAK_STMT and CONTINUE_STMT.
(cxx_eval_loop_expr): Add support for FOR_STMT, WHILE_STMT and DO_STMT.
(cxx_eval_switch_expr): Add support for SWITCH_STMT.
(cxx_eval_constant_expression): Handle IF_STMT, FOR_STMT, WHILE_STMT,
DO_STMT, SWITCH_STMT, BREAK_STMT and CONTINUE_STMT.
For SIZEOF_EXPR, recurse on the result of fold_sizeof_expr.  Ignore
DECL_EXPR with USING_DECL operand.
* lambda.c (maybe_add_lambda_conv_op): Build thisarg using
build_int_cst to make it a valid constant expression.

* g++.dg/ubsan/vptr-4.C: Expect reinterpret_cast errors.
* g++.dg/cpp1y/constexpr-84192.C (f2): Adjust expected diagnostics.
* g++.dg/cpp1y/constexpr-70265-2.C (foo): Adjust expected line of
diagnostics.
* g++.dg/cpp1y/constexpr-89285.C: New test.

--- gcc/cp/constexpr.c.jj   2019-02-17 17:09:47.113351897 +0100
+++ gcc/cp/constexpr.c  2019-02-18 19:34:57.995136395 +0100
@@ -139,6 +139,8 @@ ensure_literal_type_for_constexpr_object
 struct GTY((for_user)) constexpr_fundef {
   tree decl;
   tree body;
+  tree parms;
+  tree result;
 };
 
 struct constexpr_fundef_hasher : ggc_ptr_hash<constexpr_fundef>
@@ -176,11 +178,10 @@ constexpr_fundef_hasher::hash (constexpr
 static constexpr_fundef *
 retrieve_constexpr_fundef (tree fun)
 {
-  constexpr_fundef fundef = { NULL, NULL };
   if (constexpr_fundef_table == NULL)
 return NULL;
 
-  fundef.decl = fun;
+  constexpr_fundef fundef = { fun, NULL, NULL, NULL };
   return constexpr_fundef_table->find (&fundef);
 }
 
@@ -897,8 +898,19 @@ register_constexpr_fundef (tree fun, tre
   = hash_table<constexpr_fundef_hasher>::create_ggc (101);
 
   entry.decl = fun;
-  entry.body = body;
+  tree saved_fn = current_function_decl;
+  bool clear_ctx = false;
+  current_function_decl = fun;
+  if (DECL_RESULT (fun) && DECL_CONTEXT (DECL_RESULT (fun)) == NULL_TREE)
+{
+  clear_ctx = true;
+  DECL_CONTEXT (DECL_RESULT (fun)) = fun;
+}
+  entry.body = copy_fn (fun, entry.parms, entry.result);
+  current_function_decl = saved_fn;
   slot = constexpr_fundef_table->find_slot (&entry, INSERT);
+  if (clear_ctx)
+DECL_CONTEXT (DECL_RESULT (fun)) = NULL_TREE;
 
   gcc_assert (*slot == NULL);
   *slot = ggc_alloc<constexpr_fundef> ();
@@ -1114,27 +1126,40 @@ maybe_initialize_fundef_copies_table ()
is parms, TYPE is result.  */
 
 static tree
-get_fundef_copy (tree fun)
+get_fundef_copy (constexpr_fundef *fundef)
 {
   maybe_initialize_fundef_copies_table ();
 
   tree copy;
   bool existed;
-  tree *slot = &fundef_copies_table->get_or_insert (fun, &existed);
+  tree *slot = &fundef_copies_table->get_or_insert (fundef->decl, &existed);
 
   if (!existed)
 {
   /* There is no cached function available, or in use.  We can use
 the function directly.  That

Re: [libphobos, build] Enable libphobos on Solaris 11/x86

2019-02-18 Thread Iain Buclaw

On Tue, 29 Jan 2019 at 13:35, Rainer Orth  wrote:
>
> With the set of libphobos Solaris patches just posted, it would become
> possible to enable libphobos on Solaris 11/x86 by default.
>
> This is what this patch does.
>
> * It uses a LIBPHOBOS_SUPPORTED variable both in toplevel configure and
>   libphobos/configure.tgt, following what libvtv does.
>
> * It's necessary to disable libphobos when Solaris as is in use: it has
>   a relatively low line length limit of 10240 which is exceeded in a few
>   libphobos files.
>
> Bootstrapped without regressions on i386-pc-solaris2.11 (as and gas, gas
> and gld, Solaris 11.3/11.4/11.5) on top of the previous set of patches.
>
> Also tested manually that explicit
> --enable-libphobos/--disable-libphobos give the desired results
> (i.e. override the defaults).
>

OK.

-- 
Iain

Re: [build] Fix libgphobos linking on Solaris 11

2019-02-18 Thread Iain Buclaw

On Tue, 27 Nov 2018 at 23:28, Rainer Orth  wrote:
>
> As mentioned in passing in PR d/87864, libgphobos.so currently fails to
> link before Solaris 11.4.  Until then, you needed to link with -lsocket
> -lnsl for the networking functions, in S11.4 they were merged into libc.
>
> To fix this, I've adapted the check from libgo/configure.ac, for the
> moment just moving it into an autoconf macro, reindenting it, renaming
> the variables for the new location, and removing the check for sendfile
> which isn't used in libphobos.
>
> With that patch (and the one from PR d/87864 to provide
> __start_minfo/__stop_minfo when ld does not), I could bootstrap with
> --enable-libphobos on i386-pc-solaris2.11 with gas and
> sparc-sun-solaris2.11 with as on both S11.3 and S11.4.  On the former,
> libsocket and libnsl were properly detected and linked into
> libgdruntime.so and libgphobos.so, leaving no undefined symbols, while
> on the latter nothing more than libc is needed.
>
> Ok for mainline?
>

Hi,

Sorry, somehow I missed this and other libphobos related patches.

I see no problem with this if still needed, thanks.

-- 
Iain

Re: [PATCH, RFC] Avoid the -D option which is not available install-sh

2019-02-18 Thread Iain Buclaw

On Sat, 16 Feb 2019 at 13:58, Bernd Edlinger  wrote:
>
> On 2/9/19 7:21 PM, Bernd Edlinger wrote:
> > On 2/9/19 7:18 PM, Jakub Jelinek wrote:
> >> On Sat, Feb 09, 2019 at 06:11:00PM +, Bernd Edlinger wrote:
> >>> --- libphobos/libdruntime/Makefile.am   (revision 268614)
> >>> +++ libphobos/libdruntime/Makefile.am   (working copy)
> >>> @@ -140,10 +140,12 @@ clean-local:
> >>>  # Handles generated files as well
> >>>  install-data-local:
> >>> for file in $(ALL_DRUNTIME_INSTALL_DSOURCES); do \
> >>> + $(MKDIR_P) `echo $(DESTDIR)$(gdc_include_dir)/$$file \
> >>> + | sed -e 's:/[^/]*$$::'` ; \
> >>
> >> Perhaps better `dirname $(DESTDIR)$(gdc_include_dir)/$$file` ?
> >>
> >
> > Ah, yes, good point.
> >
> > Consider it changed.
> >
> >
>
> So here is the latest version with the requested change.
>

Looks ok to me.

> What is the procedure for libphobos patches?
> Can we check them into the gcc svn, or will Iain have to
> push them upstream first?
>

See libphobos/README.gcc regarding what sources are part of upstream.
Anything else that isn't listed is local to gcc svn, and can be
committed directly.

-- 
Iain

Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")

2019-02-18 Thread Paolo Carlini


Hi Jason,

On 18/02/19 19:28, Jason Merrill wrote:

On 2/18/19 5:31 AM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 10:20, Jason Merrill wrote:

On 2/17/19 6:58 AM, Paolo Carlini wrote:

Hi,

here, when we don't see an initializer we believe we are surely 
dealing with a case of C++17 template argument deduction for class 
templates, but, in fact, it's just an ill-formed C++14 template 
variable specialization. Conveniently, we can use here too the 
predicate variable_template_specialization_p. Not 100% sure about 
the exact wording of the error message, I added '#' to %qD to 
explicitly print the auto-using type too.


I guess we should change the assert to a test, so that we give the 
error if we aren't dealing with a class template placeholder. 
Variable templates don't seem to be important to test for.

Thanks, simpler patch.
This error is also pretty poor for this testcase, where there is an 
initializer.


Well, implementation-wise, certainly init == NULL_TREE and only when 
we have an empty pack this specific issue occurs.


In practice, clang simply talks about an empty initializer (during 
instantiation, etc, like we do), whereas EDG explicitly says that 
pack expansion produces an empty list of expressions. I don't think 
that in cp_finish_decl it would be easy for us to do exactly the 
same, we simply see a NULL_TREE as second argument. Or we could just 
*assume* that we are dealing with the outcome of a pack expansion, 
say something like EDG even if we don't have details beyond the fact 
that init == NULL_TREE. I believe that without a variadic template 
the problem cannot occur, because we catch the empty initializer much 
earlier, in grokdeclarator - indeed using a 
!CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do you think? 
Again "instantiated for an empty pack" or something similar?


Perhaps we could complain in the code for empty pack expansion 
handling in tsubst_init?


Ah, thanks Jason. In fact, however, tsubst_init isn't currently involved 
at all, because, at the end of regenerate_decl_from_template we call by 
hand tsubst_expr and assign the result to DECL_INITIAL. Simply changing 
that avoids the ICE. However, the error we issue - likewise for the 
existing cpp0x/auto31.C - is the rather user-unfriendly 
"value-initialization of incomplete type ‘auto’", as produced by 
build_value_init. Thus the patch adds a simple additional test along the 
lines already discussed, which now becomes much simpler to implement 
precisely. Again, the wording is only tentative. I'm also a little puzzled 
that, otherwise, we could get away with tsubst_expr instead of tsubst_init...


Thanks, Paolo.

//


Index: cp/pt.c
===
--- cp/pt.c (revision 268995)
+++ cp/pt.c (working copy)
@@ -15424,6 +15424,14 @@ tsubst_init (tree init, tree decl, tree args,
 
   if (!init && TREE_TYPE (decl) != error_mark_node)
 {
+  if (type_uses_auto (TREE_TYPE (decl)))
+   {
+ if (complain & tf_error)
+   error ("initializer for %q#D expands to an empty list "
+  "of expressions", decl);
+ return error_mark_node;
+   }
+
   /* If we had an initializer but it
 instantiated to nothing,
 value-initialize the object.  This will
@@ -24053,9 +24061,8 @@ regenerate_decl_from_template (tree decl, tree tmp
 {
   start_lambda_scope (decl);
   DECL_INITIAL (decl) =
-   tsubst_expr (DECL_INITIAL (code_pattern), args,
-tf_error, DECL_TI_TEMPLATE (decl),
-/*integral_constant_expression_p=*/false);
+   tsubst_init (DECL_INITIAL (code_pattern), decl, args,
+tf_error, DECL_TI_TEMPLATE (decl));
   finish_lambda_scope ();
   if (VAR_HAD_UNKNOWN_BOUND (decl))
TREE_TYPE (decl) = tsubst (TREE_TYPE (code_pattern), args,
Index: testsuite/g++.dg/cpp1y/var-templ60.C
===
--- testsuite/g++.dg/cpp1y/var-templ60.C(nonexistent)
+++ testsuite/g++.dg/cpp1y/var-templ60.C(working copy)
@@ -0,0 +1,9 @@
+// PR c++/84536
+// { dg-do compile { target c++14 } }
+
+template<int... N> auto foo(N...);  // { dg-error "initializer" }
+
+void bar()
+{
+  foo<>();
+}

[patch, fortran] Fix PR 89384

2019-02-18 Thread Thomas Koenig


Hello world,

this patch fixes the 9 regression in C interop with contiguous
arguments recently reported by Reinhold Bader.

ChangeLog and patch say it all.  I hope I didn't overlook any
obvious things here (Paul, maybe you can take a look).

Regression-tested. OK for trunk?

Regards

Thomas

2019-02-18  Thomas Koenig  

PR fortran/89384
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): If the dummy
argument is contiguous and the actual argument may not be,
use gfc_conv_subref_array_arg.

2019-02-18  Thomas Koenig  

PR fortran/89384
* gfortran.dg/ISO_Fortran_binding_4.f90
Index: trans-expr.c
===
--- trans-expr.c	(Revision 268992)
+++ trans-expr.c	(Arbeitskopie)
@@ -4944,7 +4944,12 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc
 
   if (e->rank != 0)
 {
-  gfc_conv_expr_descriptor (parmse, e);
+  if (fsym->attr.contiguous
+	  && !gfc_is_simply_contiguous (e, false, true))
+	gfc_conv_subref_array_arg (parmse, e, false, fsym->attr.intent,
+   fsym->attr.pointer);
+  else
+	gfc_conv_expr_descriptor (parmse, e);
 
   if (POINTER_TYPE_P (TREE_TYPE (parmse->expr)))
 	parmse->expr = build_fold_indirect_ref_loc (input_location,
! { dg-do  run }
! PR fortran/89384 - this used to give wrong results
! with contiguous.
! Test case by Reinhold Bader.
module mod_ctg
  implicit none
contains
  subroutine ctg(x) BIND(C)
real, contiguous :: x(:)

if (any(abs(x - [2.,4.,6.]) > 1.e-6)) then
   write(*,*) 'FAIL'
else
   write(*,*) 'OK'
end if
  end subroutine
end module
program p
  use mod_ctg
  implicit none
  real :: x(6)
  integer :: i

  x = [ (real(i), i=1, size(x)) ]
  call ctg(x(2::2))

end program
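
To illustrate why a contiguous dummy needs special handling when the actual argument is a strided section such as x(2::2), here is a conceptual sketch of packing the selected elements into a contiguous temporary before the call. It is written in C++ only for illustration; strided_view, pack and section_2_step_2 are hypothetical names, and nothing here is claimed to match what gfc_conv_subref_array_arg generates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a strided array section: BASE points at the
// first selected element, STRIDE is measured in elements.
struct strided_view
{
  const float *base;
  std::size_t count;
  std::size_t stride;
};

// Copy the selected elements into a contiguous temporary (the "copy-in"
// step); a callee that assumes unit stride can then read it safely.
std::vector<float> pack (const strided_view &v)
{
  std::vector<float> tmp;
  tmp.reserve (v.count);
  for (std::size_t i = 0; i < v.count; ++i)
    tmp.push_back (v.base[i * v.stride]);
  return tmp;
}

// Mirrors the section x(2::2) from the test case: start at the second
// element and take every second one.
std::vector<float> section_2_step_2 (const std::vector<float> &x)
{
  assert (x.size () >= 2);
  strided_view v = { x.data () + 1, x.size () / 2, 2 };
  return pack (v);
}
```

With x holding 1 through 6, the packed temporary holds 2, 4, 6, which is what the ctg subroutine in the test compares against.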

Re: [PATCH, RFC] Avoid the -D option which is not available install-sh

2019-02-18 Thread Johannes Pfau


Hi Bernd,

Am 16.02.19 um 13:58 schrieb Bernd Edlinger:


So here is the latest version with the requested change.

What is the procedure for libphobos patches?
Can we check them into the gcc svn, or will Iain have to
push them upstream first?


Most phobos/druntime changes should be upstreamed first: This ensures 
that we do not unintentionally revert changes on the next merge with 
upstream. High priority fixes can probably also be pushed to gdc before 
they're merged in upstream.


However, in this case we can push this without any upstream interaction 
either way: Upstream does not use the autoconf/automake build system. 
They use plain Makefiles completely unrelated to the build system we use 
here.


Patch looks good to me, but Iain has to approve this.

Best regards,
Johannes

Re: C++ PATCH to fix eb82.C

2019-02-18 Thread Jason Merrill


On 2/17/19 11:54 AM, Marek Polacek wrote:

On Sat, Feb 16, 2019 at 03:54:21PM -0500, Marek Polacek wrote:

I noticed this test fails in c++2a since the implementation of P0846
landed in r265734.  Since it's in g++.old-deja/, I never noticed the
failure (but I don't see any others).  This patch tweaks a dg-error in
order to make it pass in c++2a also.

Tested on x86_64-linux, ok for trunk?

2019-02-16  Marek Polacek  

* g++.old-deja/g++.robertl/eb82.C: Tweak dg-error.

diff --git gcc/testsuite/g++.old-deja/g++.robertl/eb82.C gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
index 9bf0398cd0a..fc2bf7866fe 100644
--- gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
+++ gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
@@ -9,5 +9,5 @@ double val  () // { dg-error "" } bogus code
  
  int main ()

  {
-   printf ("%d\n", val<(int)3> ()); // { dg-error "" } val undeclared
+   printf ("%d\n", val<(int)3> ()); // { dg-error "" "" { target c++17_down } } val undeclared
  }


Actually I'll just go ahead with this, should be obvious anyway.


I had also noticed this test failing, and when investigating noticed 
that the remaining error strangely talked about a partial 
specialization.  This patch fixes that:



commit 848fa7b9ab2a55d4d3bbf791c828fc3ce60d61fa
Author: Jason Merrill 
Date:   Mon Feb 18 10:05:31 2019 -1000

Improve diagnostic for redundant template arguments in declaration.

* pt.c (check_explicit_specialization): If the declarator is a
template-id, only check whether the arguments are dependent.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 48cbf3d9892..d8be92ddca4 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -2849,7 +2849,7 @@ check_explicit_specialization (tree declarator,
 	  /* This case handles bogus declarations like template <>
 	 template  void f(); */
 
-	  if (!uses_template_parms (declarator))
+	  if (!uses_template_parms (TREE_OPERAND (declarator, 1)))
 	error ("template-id %qD in declaration of primary template",
 		   declarator);
 	  else if (variable_template_p (TREE_OPERAND (declarator, 0)))
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
index fc2bf7866fe..d4c5985cd8c 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb82.C
@@ -2,7 +2,8 @@
 #include 
 
 template 
-double val  () // { dg-error "" } bogus code
+double val  () // { dg-error "expected" "" { target c++17_down } } bogus code
+// { dg-error "template-id .val. in declaration of primary template" "" { target c++2a } .-1 }
 {  
return (double) n1;
 }

Re: [PATCH] correct __clear_cache signature

2019-02-18 Thread Richard Sandiford

Martin Sebor  writes:
> Recent libgcc builds have been triggering -Wbuiltin-declaration-mismatch
> due to the declaration of the __clear_cache built-in being incompatible
> with how GCC declares it internally.  The attached patch adjusts
> the libgcc declaration and the one in the manual to match what GCC
> expects.
>
> Tested on x86_64-linux.

OK, thanks.

Richard

Re: Trivial doc typos

2019-02-18 Thread Richard Sandiford

Sharon Dvir  writes:
> Description: fixed a couple of typos in testsuite/README.
> Testing: make dvi, make info, although I doubt needed.

Applied, thanks.

Richard

Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER

2019-02-18 Thread Janne Blomqvist

On Mon, Feb 18, 2019 at 10:30 PM Thomas Koenig 
wrote:

> Hi Janne,
>
> > I'm not really sure if there is any good reason why GFortran occasionally
> > generates these varargs declarations, hence my suggestion to get rid of
> > them. Unless the middle-end is planning to get rid of untyped function
> > decls?
>
> Are they still being generated after the patch went in?

I haven't checked whether your patch fixes all such cases. How do we even
conclusively prove it, except by just getting rid of that code path? :)

>   I'm not sure,
> but because I wanted to change as little as possible, I did not try
> to change that aspect of the code.
>

I fully agree, this close to the release we shouldn't do any surgery which
isn't absolutely necessary.

-- 
Janne Blomqvist

Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER

2019-02-18 Thread Thomas Koenig


Hi Janne,


I'm not really sure if there is any good reason why GFortran occasionally
generates these varargs declarations, hence my suggestion to get rid of
them. Unless the middle-end is planning to get rid of untyped function
decls?


Are they still being generated after the patch went in?  I'm not sure,
but because I wanted to change as little as possible, I did not try
to change that aspect of the code.

Regards

Thomas

Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)

2019-02-18 Thread Jakub Jelinek

On Mon, Feb 18, 2019 at 09:15:39PM +0100, Rainer Orth wrote:
> 2019-02-15  Rainer Orth  
> 
>   * g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to
>   bad_weak_ptr_.

Ok, thanks.
If needed, guess we could rename much more (or rename the namespace in which
most of it is from std to my_std, though we'd need to check for stuff that
needs to be in std namespace).

> # HG changeset patch
> # Parent  056fe4093ce40dc462c6b50c3ae49df032a92230
> Fix g++.dg/torture/pr89303.C with Solaris ld
> 
> diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C 
> b/gcc/testsuite/g++.dg/torture/pr89303.C
> --- a/gcc/testsuite/g++.dg/torture/pr89303.C
> +++ b/gcc/testsuite/g++.dg/torture/pr89303.C
> @@ -350,11 +350,11 @@ namespace std
>{ return static_cast(_M_addr()); }
>  };
>  
> -  class bad_weak_ptr { };
> +  class bad_weak_ptr_ { };
>  
>inline void
>__throw_bad_weak_ptr()
> -  { (throw (bad_weak_ptr())); }
> +  { (throw (bad_weak_ptr_())); }
>  
>  class _Sp_counted_base
>  {


Jakub

Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER

2019-02-18 Thread Janne Blomqvist

On Mon, Feb 18, 2019 at 7:25 PM Segher Boessenkool <
seg...@kernel.crashing.org> wrote:

> On Mon, Feb 18, 2019 at 10:48:35AM +0200, Janne Blomqvist wrote:
> > I wonder if we shouldn't exorcise all the varargs stuff, it seems to
> > cause more problems than benefits? But not in stage4 if we can avoid
> > it..
>
> On the Power ABIs at least, unprototyped functions (a K&R thing for C) are
> handled the same as varargs (with zero fixed arguments).  How does this
> tie in with Fortran requirements?
>

Varargs don't exist in Fortran.  But we need some kind of support for
so-called "implicit interfaces" (which was the only thing available before
Fortran 90), which I guess are pretty similar to the K&R unprototyped
functions. E.g. something like

subroutine foo
call bar(1, 2, 3.0)
end subroutine foo

is perfectly valid code, even though discouraged by modern programming
practice. Here the compiler can only deduce from the syntax that bar must
be a subroutine that takes (int, int, float) arguments. And bar can be in
another translation unit, so we have no idea what its actual interface is;
the onus is on the programmer to make sure they match. Similarly, from

subroutine foo
f = bar(1, 2)
print *, f
end subroutine foo

the compiler can deduce that bar is a function that takes (int, int)
arguments, and returns a float (due to implicit typing rules). However, as
previously mentioned in this thread

subroutine foo
call bar(1, 2)
f = bar(1, 2)
print *, f
end subroutine foo

is invalid since bar cannot be both a subroutine and a function. Also,
getting back to my first statement

subroutine foo
call bar(1, 2)
call bar(1, 2, 3)
end subroutine foo

is invalid since Fortran doesn't have vararg functions (well, with the
newer "explicit interfaces", optional arguments are possible, but that's
still not the same thing as varargs).

I'm not really sure if there is any good reason why GFortran occasionally
generates these varargs declarations, hence my suggestion to get rid of
them. Unless the middle-end is planning to get rid of untyped function
decls?

-- 
Janne Blomqvist

Re: [committed] Fix set_uids_in_ptset (PR middle-end/89303)

2019-02-18 Thread Rainer Orth

Hi Jakub,

>> The following testcase is miscompiled on x86_64-linux (-m32 and -m64) at
>> -O1, as a pointer has two vars in points-to set, the first one is escaped
>> heap var and the second one is escaped non-heap var, and in the end the last
>> var that sets vars_contains_escaped won and overwrote
>> vars_contains_escaped_heap rather than oring into it.
>>
>> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
>> preapproved by Richard on IRC, committed to trunk.
>> Will test 8.x backport tonight and commit to 8.3 if that succeeds.
>>
>> 2019-02-13  Jakub Jelinek  
>>
>>  PR middle-end/89303
>>  * tree-ssa-structalias.c (set_uids_in_ptset): Or in vi->is_heap_var
>>  into pt->vars_contains_escaped_heap instead of setting
>>  pt->vars_contains_escaped_heap to it.
>>
>> 2019-02-13  Jonathan Wakely  
>>  Jakub Jelinek  
>>
>>  PR middle-end/89303
>>  * g++.dg/torture/pr89303.C: New test.
>
> the new testcase FAILs on Solaris:
>
> +FAIL: g++.dg/torture/pr89303.C   -O0  (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C   -O1  (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C   -O2  (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C   -O2 -flto  (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C -O2 -flto -flto-partition=none (test for
> excess errors)
> +FAIL: g++.dg/torture/pr89303.C -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C   -O3 -g  (test for excess errors)
> +FAIL: g++.dg/torture/pr89303.C   -Os  (test for excess errors)
>
> Excess errors:
> ld: warning: symbol 'typeinfo for std::bad_weak_ptr' has differing sizes:
> (file /var/tmp//ccB1o8Ya.o value=0x8; file
> /var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/./libstdc++-v3/src/.libs/libstdc++.so
> value=0xc);
> /var/tmp//ccB1o8Ya.o definition taken
>
> I suspect the class can just be renamed in pr89303.C to avoid the
> conflict with include/bits/shared_ptr_base.h?

the following patch does this.  I've verified that it still FAILs on
x86_64-pc-linux-gnu before your patch and PASSes afterwards, as well as
avoiding the linker warning on i386-pc-solaris2.11.

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2019-02-15  Rainer Orth  

* g++.dg/torture/pr89303.C (bad_weak_ptr): Rename to
bad_weak_ptr_.

# HG changeset patch
# Parent  056fe4093ce40dc462c6b50c3ae49df032a92230
Fix g++.dg/torture/pr89303.C with Solaris ld

diff --git a/gcc/testsuite/g++.dg/torture/pr89303.C b/gcc/testsuite/g++.dg/torture/pr89303.C
--- a/gcc/testsuite/g++.dg/torture/pr89303.C
+++ b/gcc/testsuite/g++.dg/torture/pr89303.C
@@ -350,11 +350,11 @@ namespace std
   { return static_cast(_M_addr()); }
 };
 
-  class bad_weak_ptr { };
+  class bad_weak_ptr_ { };
 
   inline void
   __throw_bad_weak_ptr()
-  { (throw (bad_weak_ptr())); }
+  { (throw (bad_weak_ptr_())); }
 
 class _Sp_counted_base
 {
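
For context on the original miscompilation quoted at the top of this message (the last variable that set vars_contains_escaped overwrote vars_contains_escaped_heap instead of or-ing into it), here is a minimal sketch of the difference between the two update styles. The structs and function names are simplified, hypothetical stand-ins; the real structures in tree-ssa-structalias.c have many more members and the real loop has more conditions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the per-variable info and the
// points-to summary flags named in the ChangeLog.
struct var_info
{
  bool escaped;
  bool is_heap_var;
};

struct pt_flags
{
  bool vars_contains_escaped = false;
  bool vars_contains_escaped_heap = false;
};

// Buggy shape: every escaped variable overwrites the heap flag, so the
// last escaped variable in the set decides the result.
pt_flags summarize_overwrite (const std::vector<var_info> &vars)
{
  pt_flags pt;
  for (const var_info &vi : vars)
    if (vi.escaped)
      {
        pt.vars_contains_escaped = true;
        pt.vars_contains_escaped_heap = vi.is_heap_var;
      }
  return pt;
}

// Fixed shape: the heap flag accumulates, so one escaped heap variable
// anywhere in the set is enough to keep it set.
pt_flags summarize_or (const std::vector<var_info> &vars)
{
  pt_flags pt;
  for (const var_info &vi : vars)
    if (vi.escaped)
      {
        pt.vars_contains_escaped = true;
        pt.vars_contains_escaped_heap |= vi.is_heap_var;
      }
  return pt;
}
```

With an escaped heap variable followed by an escaped non-heap variable, the first function reports no escaped heap variable while the second reports one, which is the ordering described in the quoted report.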

Re: Move -Wmaybe-uninitialized to -Wextra

2019-02-18 Thread Jeff Law

On 2/4/19 3:52 PM, Martin Jambor wrote:
> Hi,
> 
> On Mon, Feb 04 2019, Marc Glisse wrote:
>> On Mon, 4 Feb 2019, Martin Sebor wrote:
>>>
> 
> ...
> 
>>> You're right that this is hard to imagine without first hand experience
>>> with the problem.  If this is a common pattern with the warning in C++
>>> class templates in general, a representative test case would help get
>>> a better appreciation of the problem and might also give us an idea
>>> of a better solution.  (If there is one in Bugzilla please point me
>>> at it.)
>>
>> Looking for "optional" and "-Wmaybe-uninitialized" shows
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78044
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80635
>>
>> Google also gives
>> https://www.boost.org/doc/libs/1_69_0/libs/optional/doc/html/boost_optional/tutorial/gotchas/false_positive_with__wmaybe_uninitialized.html
>> https://sourceware.org/ml/gdb-patches/2017-05/msg00130.html
>> etc
>>
>> And that's just for using a type called 'optional' (3 implementations of 
>> it).
> 
> from my very quick reading of the first googled testcase, I assume the
> instance of the optional class got SRAed and a warning was generated for
> what originally was a class member, which indeed is not easy to
> initialize on its own in order to avoid the warning.
> 
> Would it perhaps make sense to split the -Wmaybe-uninitialized warning
> into two, one for scalars that are scalars in the original code and one
> for SRA-created scalars and move only the latter to -Wextra?
I could support that.   It fits in with the general sense that we're not
handling aggregates and addressables as well as we could.

Jeff

Re: [PATCH] Handle timeout warnings in dg-extract-results

2019-02-18 Thread Rainer Orth

Hi Christophe,

> dg-extract-results currently moves lines like
> WARNING: program timed out
> at the end of each .exp section when it generates .sum files.
>
> This is because it sorts its output based on the 2nd field, which is
> normally the testname as in:
> FAIL: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  execution test
>
> As you can notice 'program' comes after
> gcc.c-torture/execute/20020129-1.c alphabetically, and generally after
> most (all?) GCC testnames.
>
> This is a bit of a pain when trying to handle transient test failures
> because you can no longer match such a WARNING line to its FAIL
> counterpart.
>
> The attached patch changes this behavior by replacing the line
> WARNING: program timed out
> with
> WARNING: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  execution test program
> timed out
>
> The effect is that this line will now appear immediately above the
> FAIL: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  execution test
> so that it's easier to match them.
>
>
> I'm not sure how much people depend on the .sum format, I also
> considered emitting
> WARNING: program timed out gcc.c-torture/execute/20020129-1.c   -O2
> -flto -fno-use-linker-plugin -flto-partition=none  execution test
>
> I also restricted the patch to handling only 'program timed out'
> cases, to avoid breaking other things.
>
> I considered fixing this in Dejagnu, but it seemed more complicated,
> and would delay adoption in GCC anyway.
>
> What do people think about this?

I just had a case where your patch broke the generation of go.sum.
This is on Solaris 11.5 with python 2.7.15:

ro@colima 68 > /bin/ksh 
/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.sh 
testsuite/go*/*.sum.sep > testsuite/go/go.sum
Traceback (most recent call last):
  File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 605, in <module>
    Prog().main()
  File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 569, in main
    self.parse_file (filename, file)
  File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 427, in parse_file
    num_variations)
  File "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 311, in parse_run
    first_key = key
UnboundLocalError: local variable 'key' referenced before assignment

Before your patch, key cannot have been undefined, now it is.  I've
verified this by removing the WARNING: lines from the two affected
go.sum.sep files and now go.sum creation just works fine.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
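
The ordering effect described in the quoted proposal (a "WARNING: program timed out" line drifting to the end of its section because the second field is used as the sort key) can be sketched as follows. This is a simplified illustration in C++, not the logic of dg-extract-results.py; second_field and sort_by_second_field are hypothetical helpers, and the real script keys on more than one field.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the second whitespace-separated field of
// LINE, or an empty string if there is none.
std::string second_field (const std::string &line)
{
  std::istringstream in (line);
  std::string first, second;
  in >> first >> second;
  return second;
}

// Sort result lines by their second field only.  The sort is stable, so
// lines with equal keys keep their original relative order.
std::vector<std::string> sort_by_second_field (std::vector<std::string> lines)
{
  std::stable_sort (lines.begin (), lines.end (),
                    [] (const std::string &a, const std::string &b)
                    { return second_field (a) < second_field (b); });
  return lines;
}
```

Under this simplified key, "program" compares greater than any test name starting with "gcc.", so the bare warning sorts last; once the test name is inserted as the second field, the warning shares its key with the matching FAIL line and stays next to it.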

Re: Move -Wmaybe-uninitialized to -Wextra

2019-02-18 Thread Jeff Law

On 2/14/19 7:23 AM, Tom Tromey wrote:
>> "Marc" == Marc Glisse  writes:
> 
>>> Lastly, in the case of uninitialized variables, the usual solution
>>> of initializing them is trivial and always safe (some coding styles
>>> even require it).
> 
> Marc> Here it shows that we don't work with the same type of code at all. If
> Marc> I am using a boost::optional, i.e. a class with a buffer and a boolean
> Marc> that says if the buffer is initialized, how do I initialize the
> Marc> (private) buffer? Or should boost itself zero out the buffer whenever
> Marc> the boolean is set to false?
> 
> This is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80635 (I know you
> know, but maybe others on the thread don't).
> 
> I think in this specific case (std::optional and similar classes), GCC
> should provide a way for the class to indicate that
> -Wmaybe-uninitialized should not apply to the payload.
> 
>>> A shared definition of a false positive should be one of the very
>>> first steps to coming closer to a consensus.  Real world (as opposed
>>> to anecdotal) data on the actual rates of false positives
>>> and negatives vs true positives would be also most helpful, as would
>>> some consensus of the severity of the bugs the true positives
>>> expose, as well as some objective measure of the ease of
>>> suppression.  There probably are others but these would be a start.
> 
> Marc> This data is going to be super hard to get. Most projects have been
> Marc> compiling for years and tweaking their code to avoid some warnings. We
> Marc> do not get to see the code that people originally write, we can only
> Marc> see what they commit.
> 
> gdb has gone through this over the years -- it turns on many warnings
> and sometimes false positives show up.  Most of the time there's a
> comment, for -Wmaybe-uninitialized grep for "init.*gcc" in the source.
> Unfortunately the comment isn't standardized; but I only get ~20 hits
> for this in gdb, so it isn't really so bad in practice.
Yea, in retrospect we should have had a consistent marker for GCC as
well.  I suspect a goodly number of those initializations that went in
early in the process are no longer needed.

Jeff
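
The kind of class Marc describes in the quoted text (a buffer plus a boolean saying whether the buffer is initialized) can be sketched roughly as below. tiny_optional and value_or are hypothetical names, and this is a deliberately minimal illustration, not the implementation in Boost or libstdc++. The point is that the payload storage is intentionally left uninitialized while the flag is false, and users of the class have no way to initialize it themselves.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical minimal optional-like class: storage for one T plus a flag.
template <typename T>
class tiny_optional
{
  union { T payload_; };   // raw storage; no T is constructed by default
  bool engaged_;

public:
  tiny_optional () : engaged_ (false) {}
  tiny_optional (const tiny_optional &) = delete;
  tiny_optional &operator= (const tiny_optional &) = delete;
  ~tiny_optional () { reset (); }

  // Construct a T in place and mark the storage as valid.
  template <typename... Args>
  T &emplace (Args &&... args)
  {
    reset ();
    ::new (static_cast<void *> (&payload_)) T (std::forward<Args> (args)...);
    engaged_ = true;
    return payload_;
  }

  // Destroy the payload if there is one.
  void reset ()
  {
    if (engaged_)
      {
        payload_.~T ();
        engaged_ = false;
      }
  }

  bool has_value () const { return engaged_; }

  const T &value () const
  {
    assert (engaged_);
    return payload_;
  }
};

// Hypothetical caller: the payload is read only when the flag says it is
// valid, a correlation the compiler may lose track of after optimization.
int value_or (const tiny_optional<int> &o, int fallback)
{
  return o.has_value () ? o.value () : fallback;
}
```

Every read of the payload here is guarded by the flag, so a warning on the payload would be a false positive of the sort discussed in this thread; whether a given GCC version actually warns on this sketch is not something the sketch asserts.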

Re: Go patch committed: Harmonize types referenced by both C and Go

2019-02-18 Thread Ian Lance Taylor

On Mon, Feb 18, 2019 at 2:48 AM Rainer Orth  
wrote:
>
> > The code was already calling syscall, it was just doing it in a way
> > that the types didn't necessarily match the C declaration.  This is
> > the implementation of Go's syscall.Syscall function, so there isn't
> > really anything else we can do.
>
> I feared as much.  Some time ago when debugging another issue I saw
> libgo using syscall() directly, certainly unexpected in that particular
> case.

Those cases--where libgo calls syscall.Syscall--we can clean up where
appropriate.  What we can't clean up is user written Go code that
calls syscall.Syscall directly.

Ian

Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].

2019-02-18 Thread Jason Merrill


On 2/18/19 7:44 AM, Martin Liška wrote:

PING^1

On 11/30/18 11:26 AM, Martin Liška wrote:

Hi Jason.

Just small nits I noticed for:

cat test4.C
int a, b, c;

void
__attribute__((noinline))
bar()
{
   if (a == 123)
 [[likely]] c = 5;
   else
 [[likely]] b = 77;
}

int main()
{
   bar ();
   return 0;
}

$ g++ test4.C -c
test4.C: In function ‘void bar()’:
test4.C:8:16: warning: both branches of ‘if’ statement marked as ‘hot label’ [-Wattributes]
 8 | [[likely]] c = 5;
   |^
 9 |   else
10 | [[likely]] b = 77;
   |~

1) I would expect 'likely' instead of 'hot label'
2) maybe we can put the caret on the attribute instead of the first 
statement in the block?


Fixed thus:


commit 4f0e3ea77fd14dc9931cade9add07f1aa70e8ef4
Author: Jason Merrill 
Date:   Mon Feb 18 08:49:49 2019 -1000

Improve duplicate [[likely]] diagnostic.

* parser.c (cp_parser_statement): Make attrs_loc a range.  Pass it
to process_stmt_hotness_attribute.
* cp-gimplify.c (process_stmt_hotness_attribute): Take attrs_loc.
(genericize_if_stmt): Use likely/unlikely instead of predictor_name.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 60ca1366cf6..ac3654467ac 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7576,7 +7576,7 @@ extern tree cp_fully_fold			(tree);
 extern tree cp_fully_fold_init			(tree);
 extern void clear_fold_cache			(void);
 extern tree lookup_hotness_attribute		(tree);
-extern tree process_stmt_hotness_attribute	(tree);
+extern tree process_stmt_hotness_attribute	(tree, location_t);
 
 /* in name-lookup.c */
 extern tree strip_using_decl(tree);
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 33111bd14bf..56f717de85d 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -206,7 +206,7 @@ genericize_if_stmt (tree *stmt_p)
 	  richloc.add_range (EXPR_LOC_OR_LOC (fe, locus));
 	  warning_at (&richloc, OPT_Wattributes,
 		  "both branches of %<if%> statement marked as %qs",
-		  predictor_name (pr));
+		  pr == PRED_HOT_LABEL ? "likely" : "unlikely");
 	}
 }
 
@@ -2765,7 +2765,7 @@ remove_hotness_attribute (tree list)
PREDICT_EXPR.  */
 
 tree
-process_stmt_hotness_attribute (tree std_attrs)
+process_stmt_hotness_attribute (tree std_attrs, location_t attrs_loc)
 {
   if (std_attrs == error_mark_node)
 return std_attrs;
@@ -2776,7 +2776,7 @@ process_stmt_hotness_attribute (tree std_attrs)
 		  || is_attribute_p ("likely", name));
   tree pred = build_predict_expr (hot ? PRED_HOT_LABEL : PRED_COLD_LABEL,
   hot ? TAKEN : NOT_TAKEN);
-  SET_EXPR_LOCATION (pred, input_location);
+  SET_EXPR_LOCATION (pred, attrs_loc);
   add_stmt (pred);
   if (tree other = lookup_hotness_attribute (TREE_CHAIN (attr)))
 	warning (OPT_Wattributes, "ignoring attribute %qE after earlier %qE",
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index ffecce4e29b..adb5f6f27a1 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -11060,7 +11060,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 {
   tree statement, std_attrs = NULL_TREE;
   cp_token *token;
-  location_t statement_location, attrs_location;
+  location_t statement_location, attrs_loc;
 
  restart:
   if (if_p != NULL)
@@ -11069,13 +11069,19 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
   statement = NULL_TREE;
 
   saved_token_sentinel saved_tokens (parser->lexer);
-  attrs_location = cp_lexer_peek_token (parser->lexer)->location;
+  attrs_loc = cp_lexer_peek_token (parser->lexer)->location;
   if (c_dialect_objc ())
 /* In obj-c++, seeing '[[' might be the either the beginning of
c++11 attributes, or a nested objc-message-expression.  So
let's parse the c++11 attributes tentatively.  */
 cp_parser_parse_tentatively (parser);
   std_attrs = cp_parser_std_attribute_spec_seq (parser);
+  if (std_attrs)
+{
+  location_t end_loc
+	= cp_lexer_previous_token (parser->lexer)->location;
+  attrs_loc = make_location (attrs_loc, attrs_loc, end_loc);
+}
   if (c_dialect_objc ())
 {
   if (!cp_parser_parse_definitely (parser))
@@ -11107,14 +11113,14 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 
 	case RID_IF:
 	case RID_SWITCH:
-	  std_attrs = process_stmt_hotness_attribute (std_attrs);
+	  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
 	  statement = cp_parser_selection_statement (parser, if_p, chain);
 	  break;
 
 	case RID_WHILE:
 	case RID_DO:
 	case RID_FOR:
-	  std_attrs = process_stmt_hotness_attribute (std_attrs);
+	  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
 	  statement = cp_parser_iteration_statement (parser, if_p, false, 0);
 	  break;
 
@@ -11122,7 +11128,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	case RID_CONTINUE:
 	case RID_RETURN:
 	case RID_GOTO:
-	  std_attrs =

Trivial doc typos

2019-02-18 Thread Sharon Dvir

Description: fixed a couple of typos in testsuite/README.
Testing: make dvi, make info, although I doubt it was needed.
svn diff (with -up) yields:

Index: gcc/testsuite/README
===
--- gcc/testsuite/README(revision 268955)
+++ gcc/testsuite/README(working copy)
@@ -8,7 +8,7 @@
 These tests are included "as is". If any of them fails, do not report
 a bug.  Bug reports for DejaGnu can go to bug-deja...@gnu.org.
 Discussion and comments about this testsuite should be sent to
-g...@gcc.gnu.org; additions and changes to should go to sent to
+g...@gcc.gnu.org; additions and changes should be sent to
 gcc-patches@gcc.gnu.org.
 
 The entire testsuite is invoked by `make check` at the top level of
@@ -48,7 +48,7 @@
 where 
 
runtest Is the name used to invoke DejaGnu.   If DejaGnu is not
-   install this will be the relative path name for runtest.
+   installed this will be the relative path name for runtest.
 
--tool  This tells DejaGnu which tool you are testing. It is
mainly used to find the testsuite directories for a

[PATCH 29/41] i386: Emulate MMX ssse3_phdv2si3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ssse3_phdv2si3 with SSE by moving bits
64:95 to bits 32:63 in SSE register.  Only SSE register source operand
is allowed.
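
A scalar sketch of why the move from bits 64:95 to bits 32:63 is needed,
shown for the add variant only.  It models the 128-bit horizontal add on
operands whose upper 64 bits are "don't care" and then picks the two
meaningful lanes.  All names below (V2SI/V4SI as std::array aliases,
phaddd128, phaddd64_via_sse, the junk value) are illustrative assumptions,
not code from the patch.

```cpp
#include <array>
#include <cstdint>

using V2SI = std::array<std::uint32_t, 2>;  // 64-bit MMX-sized vector
using V4SI = std::array<std::uint32_t, 4>;  // 128-bit SSE vector

// Scalar model of the 128-bit horizontal add: pairs of adjacent lanes
// of A fill the low half of the result, pairs of B fill the high half.
constexpr V4SI phaddd128(const V4SI &a, const V4SI &b)
{
  return V4SI{a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Model of the emulation: the 64-bit operands live in the low halves
// of SSE registers; the upper halves hold junk (0xdeadbeef here).
constexpr V2SI phaddd64_via_sse(const V2SI &a, const V2SI &b)
{
  const V4SI a128 = {a[0], a[1], 0xdeadbeefu, 0xdeadbeefu};
  const V4SI b128 = {b[0], b[1], 0xdeadbeefu, 0xdeadbeefu};
  const V4SI r = phaddd128(a128, b128);
  // r = {a0+a1, junk, b0+b1, junk}.  The wanted 64-bit result is
  // {a0+a1, b0+b1}, so lane 2 (bits 64:95) moves to lane 1 (bits 32:63).
  return V2SI{r[0], r[2]};
}
```

In the patch this final step is what the call to
ix86_move_vector_high_sse_to_mmx performs on the real register.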

PR target/89021
* config/i386/sse.md (ssse3_phdv2si3):
Changed to define_insn_and_split to support SSE emulation.
---
 gcc/config/i386/sse.md | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 5f29f2c3595..551a1cb1eb2 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15367,26 +15367,44 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "TI")])
 
-(define_insn "ssse3_phdv2si3"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+(define_insn_and_split "ssse3_phdv2si3"
+  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
(vec_concat:V2SI
  (plusminus:SI
(vec_select:SI
- (match_operand:V2SI 1 "register_operand" "0")
+ (match_operand:V2SI 1 "register_operand" "0,0,Yv")
  (parallel [(const_int 0)]))
(vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
  (plusminus:SI
(vec_select:SI
- (match_operand:V2SI 2 "nonimmediate_operand" "ym")
+ (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv")
  (parallel [(const_int 0)]))
(vec_select:SI (match_dup 2) (parallel [(const_int 1)])]
-  "TARGET_SSSE3"
-  "phd\t{%2, %0|%0, %2}"
-  [(set_attr "type" "sseiadd")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   phd\t{%2, %0|%0, %2}
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(const_int 0)]
+{
+  /* Generate SSE version of the operation.  */
+  rtx op0 = lowpart_subreg (V4SImode, operands[0],
+   GET_MODE (operands[0]));
+  rtx op1 = lowpart_subreg (V4SImode, operands[1],
+   GET_MODE (operands[1]));
+  rtx op2 = lowpart_subreg (V4SImode, operands[2],
+   GET_MODE (operands[2]));
+  emit_insn (gen_ssse3_phdv4si3 (op0, op1, op2));
+  ix86_move_vector_high_sse_to_mmx (op0);
+  DONE;
+}
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "avx2_pmaddubsw256"
   [(set (match_operand:V16HI 0 "register_operand" "=x,v")
-- 
2.20.1

[PATCH 23/41] i386: Emulate MMX mmx_uavgv4hi3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_uavgv4hi3 with SSE.  Only SSE register source operand is
allowed.
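
As a reading aid for the RTL pattern in this patch, here is a scalar sketch
of one lane of the unsigned average: zero-extend both inputs, add them, add
1, shift right by one, and truncate.  The helper name pavgw_lane is an
assumption made for this illustration.

```cpp
#include <cstdint>

// One 16-bit lane of the rounding unsigned average, computed in a
// wider type so the intermediate sum cannot overflow.
constexpr std::uint16_t pavgw_lane(std::uint16_t a, std::uint16_t b)
{
  return static_cast<std::uint16_t>(
      (static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b) + 1u) >> 1);
}
```

Since each lane is computed independently, the same instruction on a wider
SSE register produces the right values in the low 64 bits whatever the
upper lanes contain.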

PR target/89021
* config/i386/mmx.md (mmx_uavgv4hi3): Also check TARGET_MMX and
TARGET_MMX_WITH_SSE.
(*mmx_uavgv4hi3): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 8866354dea9..d647dc28baa 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1736,33 +1736,39 @@
(plus:V4SI
  (plus:V4SI
(zero_extend:V4SI
- (match_operand:V4HI 1 "nonimmediate_operand"))
+ (match_operand:V4HI 1 "register_mmxmem_operand"))
(zero_extend:V4SI
- (match_operand:V4HI 2 "nonimmediate_operand")))
+ (match_operand:V4HI 2 "register_mmxmem_operand")))
  (const_vector:V4SI [(const_int 1) (const_int 1)
  (const_int 1) (const_int 1)]))
(const_int 1]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
   "ix86_fixup_binary_operands_no_copy (PLUS, V4HImode, operands);")
 
 (define_insn "*mmx_uavgv4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(truncate:V4HI
  (lshiftrt:V4SI
(plus:V4SI
  (plus:V4SI
(zero_extend:V4SI
- (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
+ (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv"))
(zero_extend:V4SI
- (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
+ (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))
  (const_vector:V4SI [(const_int 1) (const_int 1)
  (const_int 1) (const_int 1)]))
(const_int 1]
-  "(TARGET_SSE || TARGET_3DNOW_A)
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ix86_binary_operator_ok (PLUS, V4HImode, operands)"
-  "pavgw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxshft")
-   (set_attr "mode" "DI")])
+  "@
+   pavgw\t{%2, %0|%0, %2}
+   pavgw\t{%2, %0|%0, %2}
+   vpavgw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxshft,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "mmx_psadbw"
   [(set (match_operand:V1DI 0 "register_operand" "=y")
-- 
2.20.1

[PATCH 28/41] i386: Emulate MMX ssse3_phwv4hi3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ssse3_phwv4hi3 with SSE by moving bits
64:95 to bits 32:63 in SSE register.  Only SSE register source operand
is allowed.

PR target/89021
* config/i386/sse.md (ssse3_phwv4hi3):
Changed to define_insn_and_split to support SSE emulation.
---
 gcc/config/i386/sse.md | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3135ce4eace..5f29f2c3595 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15243,13 +15243,13 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "TI")])
 
-(define_insn "ssse3_phwv4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+(define_insn_and_split "ssse3_phwv4hi3"
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(vec_concat:V4HI
  (vec_concat:V2HI
(ssse3_plusminus:HI
  (vec_select:HI
-   (match_operand:V4HI 1 "register_operand" "0")
+   (match_operand:V4HI 1 "register_operand" "0,0,Yv")
(parallel [(const_int 0)]))
  (vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
(ssse3_plusminus:HI
@@ -15258,19 +15258,37 @@
  (vec_concat:V2HI
(ssse3_plusminus:HI
  (vec_select:HI
-   (match_operand:V4HI 2 "nonimmediate_operand" "ym")
+   (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")
(parallel [(const_int 0)]))
  (vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
(ssse3_plusminus:HI
  (vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
  (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))]
-  "TARGET_SSSE3"
-  "phw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "sseiadd")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   phw\t{%2, %0|%0, %2}
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(const_int 0)]
+{
+  /* Generate SSE version of the operation.  */
+  rtx op0 = lowpart_subreg (V8HImode, operands[0],
+   GET_MODE (operands[0]));
+  rtx op1 = lowpart_subreg (V8HImode, operands[1],
+   GET_MODE (operands[1]));
+  rtx op2 = lowpart_subreg (V8HImode, operands[2],
+   GET_MODE (operands[2]));
+  emit_insn (gen_ssse3_phwv8hi3 (op0, op1, op2));
+  ix86_move_vector_high_sse_to_mmx (op0);
+  DONE;
+}
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "avx2_phdv8si3"
   [(set (match_operand:V8SI 0 "register_operand" "=x")
-- 
2.20.1

Re: [PATCH] document __builtin_is_constant_evaluated

2019-02-18 Thread Martin Sebor


On 2/15/19 9:01 PM, Sandra Loosemore wrote:

On 2/13/19 4:33 PM, Martin Sebor wrote:


Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi    (revision 268856)
+++ gcc/doc/extend.texi    (working copy)
@@ -12890,6 +12890,22 @@ built-in in this case, because it has no opportuni
 optimization.
 @end deftypefn

+@deftypefn {Built-in Function} bool __builtin_is_constant_evaluated ()
+The @code{__builtin_is_constant_evaluated} function is available only
+in C++.  Its main use case is to determine whether a @code{constexpr}
+function is being called in a @code{constexpr} context.  A call to
+the function evaluates to a core constant expression with the value
+@code{true} if and only if it occurs within the evaluation of an expression
+or conversion that is manifestly constant-evaluated as defined in the C++
+standard.  Manifestly constant-evaluated contexts include constant-expressions,
+the conditions of @code{constexpr if} statements, constraint-expresions, and


s/expresions/expressions/

+initializers of variables usable in constant expressions.  The built-in is
+intended to be used by implementations of the @code{std::is_constant_evaluated}
+C++ function.  Programs should make use of the latter function rather than
+invoking the built-in directly.  For more details refer to the latest revision
+of the C++ standard.
+@end deftypefn
+
 @deftypefn {Built-in Function} long __builtin_expect (long @var{exp}, long @var{c})
 @opindex fprofile-arcs
 You may use @code{__builtin_expect} to provide the compiler with


I think this is generally reasonable (and I agree with the rationale for 
documenting this at all), but I'd like to see this rearranged and 
rephrased to put the most important point (it's an internal hook to 
implement std::is_constant_evaluated and shouldn't be called directly) 
before the technical details, with a paragraph break in between.


Attached is a revision with this rearrangement.

Martin
gcc/ChangeLog:

	* doc/extend.texi (Other Builtins): Add
	__builtin_is_constant_evaluated.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 268992)
+++ gcc/doc/extend.texi	(working copy)
@@ -12890,6 +12890,23 @@ built-in in this case, because it has no opportuni
 optimization.
 @end deftypefn
 
+@deftypefn {Built-in Function} bool __builtin_is_constant_evaluated ()
+The @code{__builtin_is_constant_evaluated} function is available only
+in C++.  The built-in is intended to be used by implementations of
+the @code{std::is_constant_evaluated} C++ function.  Programs should make
+use of the latter function rather than invoking the built-in directly.
+
+The main use case of the built-in is to determine whether a @code{constexpr}
+function is being called in a @code{constexpr} context.  A call to
+the function evaluates to a core constant expression with the value
+@code{true} if and only if it occurs within the evaluation of an expression
+or conversion that is manifestly constant-evaluated as defined in the C++
+standard.  Manifestly constant-evaluated contexts include constant-expressions,
+the conditions of @code{constexpr if} statements, constraint-expressions, and
+initializers of variables usable in constant expressions.   For more details
+refer to the latest revision of the C++ standard.
+@end deftypefn
+
 @deftypefn {Built-in Function} long __builtin_expect (long @var{exp}, long @var{c})
 @opindex fprofile-arcs
 You may use @code{__builtin_expect} to provide the compiler with
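
To illustrate the documented behaviour, a small sketch follows.  It calls
the built-in directly only to keep the example free of library
dependencies; as the text says, programs should normally use
std::is_constant_evaluated instead.  The helper names evaluation_kind and
at_run_time are hypothetical, and the snippet assumes a compiler that
provides the built-in.

```cpp
#include <cassert>

// Hypothetical helper: returns 1 when the call is manifestly
// constant-evaluated and 2 otherwise.
constexpr int evaluation_kind()
{
  return __builtin_is_constant_evaluated() ? 1 : 2;
}

// The initializer of a constexpr variable is manifestly
// constant-evaluated, so the built-in yields true here.
constexpr int at_compile_time = evaluation_kind();
static_assert(at_compile_time == 1, "constant-evaluated context");

// The initializer of an ordinary, non-const local variable is not
// manifestly constant-evaluated, so the built-in yields false.
int at_run_time()
{
  int kind = evaluation_kind();
  assert(kind == 2);
  return kind;
}
```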

[PATCH 37/41] i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE

2019-02-18 Thread H.J. Lu

PR target/89021
* config/i386/mmx.md (MMXMODE:mov): Also allow
TARGET_MMX_WITH_SSE.
(MMXMODE:*mov_internal): Likewise.
(MMXMODE:movmisalign): Likewise.
---
 gcc/config/i386/mmx.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index c48d42c7d59..b230dee521f 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -70,7 +70,7 @@
 (define_expand "mov"
   [(set (match_operand:MMXMODE 0 "nonimmediate_operand")
(match_operand:MMXMODE 1 "nonimmediate_operand"))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
   ix86_expand_vector_move (mode, operands);
   DONE;
@@ -81,7 +81,7 @@
 "=r ,o ,r,r ,m ,?!y,!y,?!y,m  ,r  ,?!y,v,v,v,m,r,v,!y,*x")
(match_operand:MMXMODE 1 "nonimm_or_0_operand"
 "rCo,rC,C,rm,rC,C  ,!y,m  ,?!y,?!y,r  ,C,v,m,v,v,r,*x,!y"))]
-  "TARGET_MMX
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
@@ -207,7 +207,7 @@
 (define_expand "movmisalign"
   [(set (match_operand:MMXMODE 0 "nonimmediate_operand")
(match_operand:MMXMODE 1 "nonimmediate_operand"))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
   ix86_expand_vector_move (mode, operands);
   DONE;
-- 
2.20.1

[PATCH 30/41] i386: Emulate MMX ssse3_pmaddubsw with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ssse3_pmaddubsw with SSE.  Only SSE register source operand
is allowed.
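
A scalar sketch of one result lane of the operation, following the RTL in
this patch: bytes of operand 1 are zero-extended, bytes of operand 2 are
sign-extended, the even and odd products are formed, and the two products
are added with signed saturation.  The helper name pmaddubsw_lane and its
parameter names are assumptions made for this illustration.

```cpp
#include <cstdint>

// One 16-bit result lane: unsigned bytes from operand 1 times signed
// bytes from operand 2, even and odd products summed with saturation.
constexpr std::int16_t pmaddubsw_lane(std::uint8_t a_even, std::uint8_t a_odd,
                                      std::int8_t b_even, std::int8_t b_odd)
{
  return static_cast<std::int16_t>(
      (a_even * b_even + a_odd * b_odd) > 32767
          ? 32767
          : (a_even * b_even + a_odd * b_odd) < -32768
                ? -32768
                : (a_even * b_even + a_odd * b_odd));
}
```

Result lane i depends only on source bytes 2i and 2i+1, so the low four
lanes of the 128-bit instruction match the 64-bit result regardless of the
upper halves of the inputs.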

PR target/89021
* config/i386/sse.md (ssse3_pmaddubsw): Add SSE emulation.
---
 gcc/config/i386/sse.md | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 551a1cb1eb2..e8d9bec9766 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1,17 +1,17 @@
(set_attr "mode" "TI")])
 
 (define_insn "ssse3_pmaddubsw"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(ss_plus:V4HI
  (mult:V4HI
(zero_extend:V4HI
  (vec_select:V4QI
-   (match_operand:V8QI 1 "register_operand" "0")
+   (match_operand:V8QI 1 "register_operand" "0,0,Yv")
(parallel [(const_int 0) (const_int 2)
   (const_int 4) (const_int 6)])))
(sign_extend:V4HI
  (vec_select:V4QI
-   (match_operand:V8QI 2 "nonimmediate_operand" "ym")
+   (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")
(parallel [(const_int 0) (const_int 2)
   (const_int 4) (const_int 6)]
  (mult:V4HI
@@ -15577,13 +15577,17 @@
  (vec_select:V4QI (match_dup 2)
(parallel [(const_int 1) (const_int 3)
   (const_int 5) (const_int 7)]))]
-  "TARGET_SSSE3"
-  "pmaddubsw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "sseiadd")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   pmaddubsw\t{%2, %0|%0, %2}
+   pmaddubsw\t{%2, %0|%0, %2}
+   vpmaddubsw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_mode_iterator PMULHRSW
   [V8HI (V16HI "TARGET_AVX2")])
-- 
2.20.1

[PATCH 40/41] i386: Enable TM MMX intrinsics with SSE2

2019-02-18 Thread H.J. Lu

This patch enables TM MMX intrinsics with SSE2 when MMX is disabled.

PR target/89021
* config/i386/i386.c (bdesc_tm): Enable MMX intrinsics with
SSE2.
---
 gcc/config/i386/i386.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 93769003a4a..a28a3f04129 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31078,13 +31078,13 @@ static const struct builtin_description bdesc_##kind[] =  \
we're lazy.  Add casts to make them fit.  */
 static const struct builtin_description bdesc_tm[] =
 {
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WM64", (enum ix86_builtins) BUILT_IN_TM_STORE_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_WaWM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAW_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RaRM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAR_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WM64", (enum ix86_builtins) BUILT_IN_TM_STORE_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_WaWM64", (enum ix86_builtins) BUILT_IN_TM_STORE_WAW_M64, UNKNOWN, VOID_FTYPE_PV2SI_V2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RaRM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAR_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM64", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M64, UNKNOWN, V2SI_FTYPE_PCV2SI },
 
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_WM128", (enum ix86_builtins) BUILT_IN_TM_STORE_M128, UNKNOWN, VOID_FTYPE_PV4SF_V4SF },
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_WaRM128", (enum ix86_builtins) BUILT_IN_TM_STORE_WAR_M128, UNKNOWN, VOID_FTYPE_PV4SF_V4SF },
@@ -31102,7 +31102,7 @@ static const struct builtin_description bdesc_tm[] =
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_RaWM256", (enum ix86_builtins) BUILT_IN_TM_LOAD_RAW_M256, UNKNOWN, V8SF_FTYPE_PCV8SF },
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_RfWM256", (enum ix86_builtins) BUILT_IN_TM_LOAD_RFW_M256, UNKNOWN, V8SF_FTYPE_PCV8SF },
 
-  { OPTION_MASK_ISA_MMX, 0, CODE_FOR_nothing, "__builtin__ITM_LM64", (enum ix86_builtins) BUILT_IN_TM_LOG_M64, UNKNOWN, VOID_FTYPE_PCVOID },
+  { OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_nothing, "__builtin__ITM_LM64", (enum ix86_builtins) BUILT_IN_TM_LOG_M64, UNKNOWN, VOID_FTYPE_PCVOID },
   { OPTION_MASK_ISA_SSE, 0, CODE_FOR_nothing, "__builtin__ITM_LM128", (enum ix86_builtins) BUILT_IN_TM_LOG_M128, UNKNOWN, VOID_FTYPE_PCVOID },
   { OPTION_MASK_ISA_AVX, 0, CODE_FOR_nothing, "__builtin__ITM_LM256", (enum ix86_builtins) BUILT_IN_TM_LOG_M256, UNKNOWN, VOID_FTYPE_PCVOID },
 };
-- 
2.20.1

[PATCH 18/41] i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE.  Only SSE register source
operand is allowed.

PR target/89021
* config/i386/mmx.md (mmx_v4hi3): Also check TARGET_MMX
and TARGET_MMX_WITH_SSE.
(mmx_v8qi3): Likewise.
(smaxmin:v4hi3): New.
(umaxmin:v8qi3): Likewise.
(smaxmin:*mmx_v4hi3): Add SSE emulation.
(umaxmin:*mmx_v8qi3): Likewise.
---
 gcc/config/i386/mmx.md | 68 +-
 1 file changed, 48 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index dea2be1d8e2..edfb8623701 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -923,40 +923,68 @@
 (define_expand "mmx_v4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
 (smaxmin:V4HI
- (match_operand:V4HI 1 "nonimmediate_operand")
- (match_operand:V4HI 2 "nonimmediate_operand")))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+ (match_operand:V4HI 1 "register_mmxmem_operand")
+ (match_operand:V4HI 2 "register_mmxmem_operand")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
+
+(define_expand "v4hi3"
+  [(set (match_operand:V4HI 0 "register_operand")
+(smaxmin:V4HI
+ (match_operand:V4HI 1 "register_operand")
+ (match_operand:V4HI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
 
 (define_insn "*mmx_v4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
 (smaxmin:V4HI
- (match_operand:V4HI 1 "nonimmediate_operand" "%0")
- (match_operand:V4HI 2 "nonimmediate_operand" "ym")))]
-  "(TARGET_SSE || TARGET_3DNOW_A)
+ (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")
+ (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ix86_binary_operator_ok (, V4HImode, operands)"
-  "pw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+  "@
+   pw\t{%2, %0|%0, %2}
+   pw\t{%2, %0|%0, %2}
+   vpw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_v8qi3"
   [(set (match_operand:V8QI 0 "register_operand")
 (umaxmin:V8QI
- (match_operand:V8QI 1 "nonimmediate_operand")
- (match_operand:V8QI 2 "nonimmediate_operand")))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+ (match_operand:V8QI 1 "register_mmxmem_operand")
+ (match_operand:V8QI 2 "register_mmxmem_operand")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
+
+(define_expand "v8qi3"
+  [(set (match_operand:V8QI 0 "register_operand")
+(umaxmin:V8QI
+ (match_operand:V8QI 1 "register_operand")
+ (match_operand:V8QI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
 
 (define_insn "*mmx_v8qi3"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
+  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
 (umaxmin:V8QI
- (match_operand:V8QI 1 "nonimmediate_operand" "%0")
- (match_operand:V8QI 2 "nonimmediate_operand" "ym")))]
-  "(TARGET_SSE || TARGET_3DNOW_A)
+ (match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yv")
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ix86_binary_operator_ok (, V8QImode, operands)"
-  "pb\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+  "@
+   pb\t{%2, %0|%0, %2}
+   pb\t{%2, %0|%0, %2}
+   vpb\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "mmx_ashr3"
   [(set (match_operand:MMXMODE24 0 "register_operand" "=y,x,Yv")
-- 
2.20.1

[PATCH 21/41] i386: Emulate MMX maskmovq with SSE2 maskmovdqu

2019-02-18 Thread H.J. Lu

Emulate MMX maskmovq with SSE2 maskmovdqu for TARGET_MMX_WITH_SSE by
zero-extending source and mask operands to 128 bits.  Handle unmapped
bits 64:127 at memory address by adjusting source and mask operands
together with memory address.
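
The address adjustment can be sketched on plain integers.  The model below
computes, for a given destination address, the base address handed to the
16-byte masked store and the number of bytes by which data and mask are
shifted left.  The struct StoreWindow and the function
adjust_for_maskmovdqu are hypothetical names for this illustration; they
do not appear in the patch.

```cpp
#include <cstdint>

struct StoreWindow {
  std::uintptr_t base;       // address passed to the 16-byte masked store
  unsigned shift_bytes;      // left shift applied to both data and mask
};

// Mirror of the adjustment in the patch: take the misalignment within a
// 16-byte block, cap it at 8, and move the address back by that amount.
constexpr StoreWindow adjust_for_maskmovdqu(std::uintptr_t addr)
{
  return StoreWindow{
      addr - ((addr & 0xf) > 8 ? 8 : (addr & 0xf)),
      static_cast<unsigned>((addr & 0xf) > 8 ? 8 : (addr & 0xf))};
}
```

With a misalignment of at most 8 the window is exactly the aligned 16-byte
block containing the address; with a larger misalignment the window ends at
the last byte the original 8-byte store would have touched.  In both cases
the 16-byte access does not reach past the memory the 8-byte store could
legitimately address.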

PR target/89021
* config/i386/xmmintrin.h: Emulate MMX maskmovq with SSE2
maskmovdqu for __MMX_WITH_SSE__.
---
 gcc/config/i386/xmmintrin.h | 61 +
 1 file changed, 61 insertions(+)

diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
index 58284378514..a915f6c87d7 100644
--- a/gcc/config/i386/xmmintrin.h
+++ b/gcc/config/i386/xmmintrin.h
@@ -1165,7 +1165,68 @@ _m_pshufw (__m64 __A, int const __N)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P)
 {
+#ifdef __MMX_WITH_SSE__
+  /* Emulate MMX maskmovq with SSE2 maskmovdqu and handle unmapped bits
+ 64:127 at address __P.  */
+  typedef long long __v2di __attribute__ ((__vector_size__ (16)));
+  typedef char __v16qi __attribute__ ((__vector_size__ (16)));
+  /* Zero-extend __A and __N to 128 bits.  */
+  __v2di __A128 = __extension__ (__v2di) { ((__v1di) __A)[0], 0 };
+  __v2di __N128 = __extension__ (__v2di) { ((__v1di) __N)[0], 0 };
+
+  /* Check the alignment of __P.  */
+  __SIZE_TYPE__ offset = ((__SIZE_TYPE__) __P) & 0xf;
+  if (offset)
+{
+  /* If the misalignment of __P > 8, subtract __P by 8 bytes.
+Otherwise, subtract __P by the misalignment.  */
+  if (offset > 8)
+   offset = 8;
+  __P = (char *) (((__SIZE_TYPE__) __P) - offset);
+
+  /* Shift __A128 and __N128 to the left by the adjustment.  */
+  switch (offset)
+   {
+   case 1:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 8);
+ break;
+   case 2:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 2 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 2 * 8);
+ break;
+   case 3:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 3 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 3 * 8);
+ break;
+   case 4:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 4 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 4 * 8);
+ break;
+   case 5:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 5 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 5 * 8);
+ break;
+   case 6:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 6 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 6 * 8);
+ break;
+   case 7:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 7 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 7 * 8);
+ break;
+   case 8:
+ __A128 = __builtin_ia32_pslldqi128 (__A128, 8 * 8);
+ __N128 = __builtin_ia32_pslldqi128 (__N128, 8 * 8);
+ break;
+   default:
+ break;
+   }
+}
+  __builtin_ia32_maskmovdqu ((__v16qi)__A128, (__v16qi)__N128, __P);
+#else
   __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
+#endif
 }
 
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-- 
2.20.1

[PATCH 32/41] i386: Emulate MMX pshufb with SSE version

2019-02-18 Thread H.J. Lu

Emulate MMX version of pshufb with SSE version by masking out the bit 3
of the shuffle control byte.  Only SSE register source operand is allowed.
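
A scalar sketch of why clearing bit 3 of each control byte is enough.  The
64-bit shuffle indexes with the low 3 bits of the control byte, the 128-bit
one with the low 4 bits, so masking with 0xf7 keeps the index within the
low 8 source bytes while preserving the zeroing bit 7.  The type aliases
and function names below are assumptions made for this illustration.

```cpp
#include <array>
#include <cstdint>

using V8QI = std::array<std::uint8_t, 8>;
using V16QI = std::array<std::uint8_t, 16>;

// One result byte of the 64-bit shuffle: bit 7 of the control byte
// zeroes the lane, otherwise the low 3 bits select a source byte.
constexpr std::uint8_t pshufb64_lane(const V8QI &src, std::uint8_t ctrl)
{
  return (ctrl & 0x80) ? 0 : src[ctrl & 0x07];
}

// One result byte of the 128-bit shuffle: the low 4 bits select.
constexpr std::uint8_t pshufb128_lane(const V16QI &src, std::uint8_t ctrl)
{
  return (ctrl & 0x80) ? 0 : src[ctrl & 0x0f];
}

// Emulation: widen the source (upper 8 bytes are junk, 0xAA here) and
// clear bit 3 of the control byte before the 128-bit shuffle.
constexpr std::uint8_t emulated_lane(const V8QI &s, std::uint8_t ctrl)
{
  return pshufb128_lane(
      V16QI{s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7],
            0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA},
      static_cast<std::uint8_t>(ctrl & 0xf7));
}

// Compare the emulation with the 64-bit model for every control byte.
constexpr bool emulation_matches()
{
  const V8QI src = {10, 11, 12, 13, 14, 15, 16, 17};
  for (int c = 0; c < 256; ++c)
    if (pshufb64_lane(src, static_cast<std::uint8_t>(c))
        != emulated_lane(src, static_cast<std::uint8_t>(c)))
      return false;
  return true;
}
```

The 0xf7f7f7f7 constants in the splitter below are this per-byte mask
replicated across a V4SI vector.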

PR target/89021
* config/i386/sse.md (ssse3_pshufbv8qi3): Changed to
define_insn_and_split.  Also allow TARGET_MMX_WITH_SSE.  Add
SSE emulation.
---
 gcc/config/i386/sse.md | 46 +-
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b08a577d1e4..79b35d95424 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15728,17 +15728,45 @@
(set_attr "btver2_decode" "vector")
(set_attr "mode" "")])
 
-(define_insn "ssse3_pshufbv8qi3"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
-   (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0")
- (match_operand:V8QI 2 "nonimmediate_operand" "ym")]
-UNSPEC_PSHUFB))]
-  "TARGET_SSSE3"
-  "pshufb\t{%2, %0|%0, %2}";
-  [(set_attr "type" "sselog1")
+(define_insn_and_split "ssse3_pshufbv8qi3"
+  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
+   (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv")
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")]
+UNSPEC_PSHUFB))
+   (clobber (match_scratch:V4SI 3 "=X,x,Yv"))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   pshufb\t{%2, %0|%0, %2}
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(set (match_dup 3) (match_dup 5))
+   (set (match_dup 3)
+   (and:V4SI (match_dup 3) (match_dup 2)))
+   (set (match_dup 0)
+   (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))]
+{
+  /* Emulate MMX version of pshufb with SSE version by masking out the
+ bit 3 of the shuffle control byte.  */
+  operands[0] = lowpart_subreg (V16QImode, operands[0],
+   GET_MODE (operands[0]));
+  operands[1] = lowpart_subreg (V16QImode, operands[1],
+   GET_MODE (operands[1]));
+  operands[2] = lowpart_subreg (V4SImode, operands[2],
+   GET_MODE (operands[2]));
+  operands[4] = lowpart_subreg (V16QImode, operands[3],
+   GET_MODE (operands[3]));
+  rtvec par = gen_rtvec (4, GEN_INT (0xf7f7f7f7),
+GEN_INT (0xf7f7f7f7),
+GEN_INT (0xf7f7f7f7),
+GEN_INT (0xf7f7f7f7));
+  rtx vec_const = gen_rtx_CONST_VECTOR (V4SImode, par);
+  operands[5] = force_const_mem (V4SImode, vec_const);
+}
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "_psign3"
   [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x")
-- 
2.20.1
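
As a reading aid for the pshufb emulation above, here is a minimal scalar sketch in C of why clearing bit 3 of each shuffle control byte (the 0xf7 mask) makes the 128-bit shuffle reproduce the 64-bit one in the low 8 bytes. The function names are hypothetical and are not part of GCC. The models only assume the documented pshufb rule: bit 7 zeroes the byte, and the low index bits select a source byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the 64-bit (MMX) pshufb: bit 7 of a control byte zeroes
   the result byte, otherwise bits 0..2 select one of 8 source bytes.  */
static void
model_pshufb64 (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t tmp[8];
  for (int i = 0; i < 8; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy (dst, tmp, sizeof tmp);
}

/* Scalar model of the 128-bit (SSE) pshufb: same rule, but bits 0..3
   select one of 16 source bytes.  */
static void
model_pshufb128 (uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t tmp[16];
  for (int i = 0; i < 16; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  memcpy (dst, tmp, sizeof tmp);
}

/* Emulation: AND every control byte with 0xf7 (clears bit 3, keeps bit 7),
   run the 128-bit shuffle, keep the low 8 bytes.  The upper halves are
   filled with junk to show that they cannot leak into the result.  */
static void
model_pshufb64_via_128 (uint8_t dst[8], const uint8_t src[8],
                        const uint8_t ctl[8])
{
  uint8_t s[16], c[16], d[16];
  memset (s, 0xaa, sizeof s);
  memset (c, 0x55, sizeof c);
  memcpy (s, src, 8);
  memcpy (c, ctl, 8);
  for (int i = 0; i < 16; i++)
    c[i] &= 0xf7;
  model_pshufb128 (d, s, c);
  memcpy (dst, d, 8);
}

/* Returns 1 if both models agree for a sweep of control byte values.  */
static int
pshufb_models_agree (void)
{
  const uint8_t src[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  for (int v = 0; v < 256; v++)
    {
      uint8_t ctl[8], a[8], b[8];
      for (int i = 0; i < 8; i++)
        ctl[i] = (uint8_t) (v + 37 * i);
      model_pshufb64 (a, src, ctl);
      model_pshufb64_via_128 (b, src, ctl);
      if (memcmp (a, b, 8) != 0)
        return 0;
    }
  return 1;
}
```

With bit 3 cleared, the 4-bit index of the 128-bit form can never reach bytes 8..15, so whatever sits in the upper half of the SSE register is irrelevant.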

[PATCH 20/41] i386: Emulate MMX mmx_umulv4hi3_highpart with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_umulv4hi3_highpart with SSE.  Only SSE register source
operand is allowed.

PR target/89021
* config/i386/mmx.md (mmx_umulv4hi3_highpart): Also check
TARGET_MMX and TARGET_MMX_WITH_SSE.
(*mmx_umulv4hi3_highpart): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 5ae04de205d..5a342256cbc 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -781,28 +781,34 @@
  (lshiftrt:V4SI
(mult:V4SI
  (zero_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand"))
+   (match_operand:V4HI 1 "register_mmxmem_operand"))
  (zero_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand")))
+   (match_operand:V4HI 2 "register_mmxmem_operand")))
(const_int 16))))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_umulv4hi3_highpart"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(truncate:V4HI
  (lshiftrt:V4SI
(mult:V4SI
  (zero_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
+   (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv"))
  (zero_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
+   (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))
  (const_int 16))))]
-  "(TARGET_SSE || TARGET_3DNOW_A)
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmulhuw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "@
+   pmulhuw\t{%2, %0|%0, %2}
+   pmulhuw\t{%2, %0|%0, %2}
+   vpmulhuw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,ssemul,ssemul")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_pmaddwd"
   [(set (match_operand:V2SI 0 "register_operand")
-- 
2.20.1
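
The RTL for mmx_umulv4hi3_highpart above spells out the operation lane by lane: zero-extend, multiply, shift right by 16, truncate. A small C sketch of that arithmetic follows. The helper names are made up for illustration and do not exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane model of the RTL: zero-extend both 16-bit inputs to 32 bits,
   multiply, shift right by 16, truncate back to 16 bits.  */
static uint16_t
model_umul_highpart16 (uint16_t a, uint16_t b)
{
  uint32_t wide = (uint32_t) a * (uint32_t) b;
  return (uint16_t) (wide >> 16);
}

/* Four-lane (V4HI) version, one call per element.  */
static void
model_umulv4hi3_highpart (uint16_t dst[4], const uint16_t a[4],
                          const uint16_t b[4])
{
  for (int i = 0; i < 4; i++)
    dst[i] = model_umul_highpart16 (a[i], b[i]);
}
```

Because each lane is computed independently, running the same operation on the low four lanes of a 128-bit register gives the same four results, which is what the SSE alternatives rely on.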

[PATCH 17/41] i386: Emulate MMX mmx_pinsrw with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_pinsrw with SSE.  Only SSE register destination operand
is allowed.

PR target/89021
* config/i386/mmx.md (mmx_pinsrw): Also check TARGET_MMX and
TARGET_MMX_WITH_SSE.
(*mmx_pinsrw): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 28725f48282..dea2be1d8e2 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1282,32 +1282,45 @@
 (match_operand:SI 2 "nonimmediate_operand"))
  (match_operand:V4HI 1 "register_operand")
   (match_operand:SI 3 "const_0_to_3_operand")))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
 {
   operands[2] = gen_lowpart (HImode, operands[2]);
   operands[3] = GEN_INT (1 << INTVAL (operands[3]));
 })
 
 (define_insn "*mmx_pinsrw"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
 (vec_merge:V4HI
   (vec_duplicate:V4HI
-(match_operand:HI 2 "nonimmediate_operand" "rm"))
- (match_operand:V4HI 1 "register_operand" "0")
+(match_operand:HI 2 "nonimmediate_operand" "rm,rm,rm"))
+ (match_operand:V4HI 1 "register_operand" "0,0,Yv")
   (match_operand:SI 3 "const_int_operand")))]
-  "(TARGET_SSE || TARGET_3DNOW_A)
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ((unsigned) exact_log2 (INTVAL (operands[3]))
< GET_MODE_NUNITS (V4HImode))"
 {
   operands[3] = GEN_INT (exact_log2 (INTVAL (operands[3])));
-  if (MEM_P (operands[2]))
-return "pinsrw\t{%3, %2, %0|%0, %2, %3}";
+  if (TARGET_MMX_WITH_SSE && TARGET_AVX)
+{
+  if (MEM_P (operands[2]))
+   return "vpinsrw\t{%3, %2, %1, %0|%0, %1, %2, %3}";
+  else
+   return "vpinsrw\t{%3, %k2, %1, %0|%0, %1, %k2, %3}";
+}
   else
-return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}";
+{
+  if (MEM_P (operands[2]))
+   return "pinsrw\t{%3, %2, %0|%0, %2, %3}";
+  else
+   return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}";
+}
 }
-  [(set_attr "type" "mmxcvt")
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxcvt,sselog,sselog")
(set_attr "length_immediate" "1")
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "mmx_pextrw"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
-- 
2.20.1
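
The mmx_pinsrw patterns above store the element index as a one-bit vec_merge mask (1 << index) in the expander and turn it back into an index with exact_log2 when printing the instruction. The following C sketch models that round trip. All names are hypothetical stand-ins, and model_exact_log2 only imitates the behaviour relied on here (log2 for powers of two, -1 otherwise).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for exact_log2: log2 of X if X is a power of two, else -1.  */
static int
model_exact_log2 (unsigned x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* vec_merge semantics: lane I comes from the duplicated scalar when bit I
   of MASK is set, otherwise from OLD.  */
static void
model_vec_merge_v4hi (uint16_t dst[4], uint16_t scalar,
                      const uint16_t old[4], unsigned mask)
{
  for (int i = 0; i < 4; i++)
    dst[i] = ((mask >> i) & 1) ? scalar : old[i];
}

/* Insert SCALAR at element INDEX (INDEX < 32).  Returns the immediate the
   insn would print, or -1 if the insn condition would reject the mask.  */
static int
model_pinsrw (uint16_t dst[4], const uint16_t old[4], uint16_t scalar,
              unsigned index)
{
  unsigned mask = 1u << index;         /* what the expander stores */
  int imm = model_exact_log2 (mask);   /* what the insn recovers */
  if ((unsigned) imm >= 4)
    return -1;
  model_vec_merge_v4hi (dst, scalar, old, mask);
  return imm;
}
```

The cast to unsigned in the range check mirrors the insn condition: a mask that is not a power of two yields -1, which becomes a large unsigned value and fails the comparison.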

[PATCH 38/41] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-18 Thread H.J. Lu

PR target/89021
* config/i386/mmx.md (*vec_dupv2sf): Changed to
define_insn_and_split to support SSE emulation.
(*vec_extractv2sf_0): Likewise.
(*vec_extractv2sf_1): Likewise.
(*vec_extractv2si_0): Likewise.
(*vec_extractv2si_1): Likewise.
(*vec_extractv2si_zext_mem): Likewise.
(vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
(vec_extractv2sf_1 splitter): Likewise.
(vec_extractv2sfsf): Likewise.
(vec_setv2si): Likewise.
(vec_extractv2si_1 splitter): Likewise.
(vec_extractv2sisi): Likewise.
(vec_setv4hi): Likewise.
(vec_extractv4hihi): Likewise.
(vec_setv8qi): Likewise.
(vec_extractv8qiqi): Likewise.
(vec_extractv2sfsf): Also allow TARGET_MMX_WITH_SSE.  Pass
TARGET_MMX_WITH_SSE to ix86_expand_vector_extract.
(vec_extractv2sisi): Likewise.
(vec_extractv4hihi): Likewise.
(vec_extractv8qiqi): Likewise.
(vec_initv2sfsf): Also allow TARGET_MMX_WITH_SSE.  Pass
TARGET_MMX_WITH_SSE to ix86_expand_vector_init.
(vec_initv2sisi): Likewise.
(vec_initv4hihi): Likewise.
(vec_initv8qiqi): Likewise.
(vec_setv2si): Also allow TARGET_MMX_WITH_SSE.  Pass
TARGET_MMX_WITH_SSE to ix86_expand_vector_set.
(vec_setv4hi): Likewise.
(vec_setv8qi): Likewise.
---
 gcc/config/i386/mmx.md | 110 -
 1 file changed, 66 insertions(+), 44 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index b230dee521f..479568aa322 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -555,14 +555,23 @@
(set_attr "prefix_extra" "1")
(set_attr "mode" "V2SF")])
 
-(define_insn "*vec_dupv2sf"
-  [(set (match_operand:V2SF 0 "register_operand" "=y")
+(define_insn_and_split "*vec_dupv2sf"
+  [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv")
(vec_duplicate:V2SF
- (match_operand:SF 1 "register_operand" "0")))]
-  "TARGET_MMX"
-  "punpckldq\t%0, %0"
-  [(set_attr "type" "mmxcvt")
-   (set_attr "mode" "DI")])
+ (match_operand:SF 1 "register_operand" "0,0,Yv")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   punpckldq\t%0, %0
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(set (match_dup 0)
+   (vec_duplicate:V4SF (match_dup 1)))]
+  "operands[0] = lowpart_subreg (V4SFmode, operands[0],
+GET_MODE (operands[0]));"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxcvt,ssemov,ssemov")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "*mmx_concatv2sf"
   [(set (match_operand:V2SF 0 "register_operand" "=y,y")
@@ -580,9 +589,9 @@
   [(match_operand:V2SF 0 "register_operand")
(match_operand:SF 1 "register_operand")
(match_operand 2 "const_int_operand")]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
-  ix86_expand_vector_set (false, operands[0], operands[1],
+  ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
  INTVAL (operands[2]));
   DONE;
 })
@@ -594,11 +603,13 @@
(vec_select:SF
  (match_operand:V2SF 1 "nonimmediate_operand" " xm,x,ym,y,m,m")
  (parallel [(const_int 0)])))]
-  "TARGET_MMX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 1))]
-  "operands[1] = gen_lowpart (SFmode, operands[1]);")
+  "operands[1] = gen_lowpart (SFmode, operands[1]);"
+  [(set_attr "mmx_isa" "*,*,native,native,*,*")])
 
 ;; Avoid combining registers from different units in a single alternative,
 ;; see comment above inline_secondary_memory_needed function in i386.c
@@ -607,7 +618,8 @@
(vec_select:SF
  (match_operand:V2SF 1 "nonimmediate_operand" " 0,x,x,o,o,o,o")
  (parallel [(const_int 1)])))]
-  "TARGET_MMX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
   "@
punpckhdq\t%0, %0
%vmovshdup\t{%1, %0|%0, %1}
@@ -617,6 +629,7 @@
#
#"
   [(set_attr "isa" "*,sse3,noavx,*,*,*,*")
+   (set_attr "mmx_isa" "native,*,*,native,*,*,*")
(set_attr "type" "mmxcvt,sse,sseshuf1,mmxmov,ssemov,fmov,imov")
(set (attr "length_immediate")
  (if_then_else (eq_attr "alternative" "2")
@@ -634,7 +647,7 @@
(vec_select:SF
  (match_operand:V2SF 1 "memory_operand")
  (parallel [(const_int 1)])))]
-  "TARGET_MMX && reload_completed"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && reload_completed"
   [(set (match_dup 0) (match_dup 1))]
   "operands[1] = adjust_address (operands[1], SFmode, 4);")
 
@@ -642,19 +655,20 @@
   [(match_operand:SF 0 "register_operand")
(match_operand:V2SF 1 "register_operand")
(match_operand 2 "const_int_operand")]
-

[PATCH 25/41] i386: Emulate MMX movntq with SSE2 movntidi

2019-02-18 Thread H.J. Lu

Emulate MMX movntq with SSE2 movntidi.  Only register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (sse_movntq): Add SSE2 emulation.
---
 gcc/config/i386/mmx.md | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 098e41e19c3..b06f0af984a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -214,12 +214,16 @@
 })
 
 (define_insn "sse_movntq"
-  [(set (match_operand:DI 0 "memory_operand" "=m")
-   (unspec:DI [(match_operand:DI 1 "register_operand" "y")]
+  [(set (match_operand:DI 0 "memory_operand" "=m,m")
+   (unspec:DI [(match_operand:DI 1 "register_operand" "y,r")]
   UNSPEC_MOVNTQ))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "movntq\t{%1, %0|%0, %1}"
-  [(set_attr "type" "mmxmov")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   movntq\t{%1, %0|%0, %1}
+   movnti\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "mmxmov,ssemov")
(set_attr "mode" "DI")])
 
 ;
-- 
2.20.1

[PATCH 36/41] Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE

2019-02-18 Thread H.J. Lu

From: Uros Bizjak 

2019-02-18  Uroš Bizjak  

PR target/89021
* config/i386/i386.md (*zero_extendsidi2): Add mmx_isa attribute.
* config/i386/sse.md (sse2_cvtpi2pd): Ditto.
(sse2_cvtpd2pi): Ditto.
(sse2_cvttpd2pi): Ditto.
(*vec_concatv2sf_sse4_1): Ditto.
(*vec_concatv2sf_sse): Ditto.
(*vec_concatv2si_sse4_1): Ditto.
(*vec_concatv2si): Ditto.
(*vec_concatv4si_0): Ditto.
(*vec_concatv2di_0): Ditto.
---
 gcc/config/i386/i386.md |  4 
 gcc/config/i386/sse.md  | 25 -
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 04ec0eeaa57..4cbbd4cf685 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3683,6 +3683,10 @@
  (const_string "avx512bw")
   ]
   (const_string "*")))
+   (set (attr "mmx_isa")
+ (if_then_else (eq_attr "alternative" "5,6")
+  (const_string "native")
+  (const_string "*")))
(set (attr "type")
  (cond [(eq_attr "alternative" "0,1,2,4")
  (const_string "multi")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 97ec3795b82..96d4e5001d8 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4971,7 +4971,8 @@
   "@
%vcvtdq2pd\t{%1, %0|%0, %1}
cvtpi2pd\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecvt")
+  [(set_attr "mmx_isa" "*,native")
+   (set_attr "type" "ssecvt")
(set_attr "unit" "*,mmx")
(set_attr "prefix_data16" "*,1")
(set_attr "prefix" "maybe_vex,*")
@@ -4985,7 +4986,8 @@
   "@
* return TARGET_AVX ? \"vcvtpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvtpd2dq\t{%1, %0|%0, %1}\";
cvtpd2pi\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecvt")
+  [(set_attr "mmx_isa" "*,native")
+   (set_attr "type" "ssecvt")
(set_attr "unit" "*,mmx")
(set_attr "amdfam10_decode" "double")
(set_attr "athlon_decode" "vector")
@@ -5001,7 +5003,8 @@
   "@
* return TARGET_AVX ? \"vcvttpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvttpd2dq\t{%1, %0|%0, %1}\";
cvttpd2pi\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecvt")
+  [(set_attr "mmx_isa" "*,native")
+   (set_attr "type" "ssecvt")
(set_attr "unit" "*,mmx")
(set_attr "amdfam10_decode" "double")
(set_attr "athlon_decode" "vector")
@@ -7209,6 +7212,10 @@
  (const_string "mmxmov")
   ]
   (const_string "sselog")))
+   (set (attr "mmx_isa")
+ (if_then_else (eq_attr "alternative" "7,8")
+  (const_string "native")
+  (const_string "*")))
(set (attr "prefix_data16")
  (if_then_else (eq_attr "alternative" "3,4")
   (const_string "1")
@@ -7244,7 +7251,8 @@
movss\t{%1, %0|%0, %1}
punpckldq\t{%2, %0|%0, %2}
movd\t{%1, %0|%0, %1}"
-  [(set_attr "type" "sselog,ssemov,mmxcvt,mmxmov")
+  [(set_attr "mmx_isa" "*,*,native,native")
+   (set_attr "type" "sselog,ssemov,mmxcvt,mmxmov")
(set_attr "mode" "V4SF,SF,DI,DI")])
 
 (define_insn "*vec_concatv4sf"
@@ -14520,6 +14528,10 @@
punpckldq\t{%2, %0|%0, %2}
movd\t{%1, %0|%0, %1}"
   [(set_attr "isa" "noavx,noavx,avx,avx512dq,noavx,noavx,avx,*,*,*")
+   (set (attr "mmx_isa")
+ (if_then_else (eq_attr "alternative" "8,9")
+  (const_string "native")
+  (const_string "*")))
(set (attr "type")
  (cond [(eq_attr "alternative" "7")
  (const_string "ssemov")
@@ -14557,6 +14569,7 @@
punpckldq\t{%2, %0|%0, %2}
movd\t{%1, %0|%0, %1}"
   [(set_attr "isa" "sse2,sse2,*,*,*,*")
+   (set_attr "mmx_isa" "*,*,*,*,native,native")
(set_attr "type" "sselog,ssemov,sselog,ssemov,mmxcvt,mmxmov")
(set_attr "mode" "TI,TI,V4SF,SF,DI,DI")])
 
@@ -14586,7 +14599,8 @@
   "@
%vmovq\t{%1, %0|%0, %1}
movq2dq\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
+  [(set_attr "mmx_isa" "*,native")
+   (set_attr "type" "ssemov")
(set_attr "prefix" "maybe_vex,orig")
(set_attr "mode" "TI")])
 
@@ -14661,6 +14675,7 @@
%vmovq\t{%1, %0|%0, %1}
movq2dq\t{%1, %0|%0, %1}"
   [(set_attr "isa" "x64,*,*")
+   (set_attr "mmx_isa" "*,*,native")
(set_attr "type" "ssemov")
(set_attr "prefix_rex" "1,*,*")
(set_attr "prefix" "maybe_vex,maybe_vex,orig")
-- 
2.20.1

[PATCH 35/41] i386: Emulate MMX abs2 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX abs2 with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/sse.md (abs2): Add SSE emulation.
---
 gcc/config/i386/sse.md | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b69a467291c..97ec3795b82 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15973,16 +15973,19 @@
 })
 
 (define_insn "abs2"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv")
(abs:MMXMODEI
- (match_operand:MMXMODEI 1 "nonimmediate_operand" "ym")))]
-  "TARGET_SSSE3"
-  "pabs\t{%1, %0|%0, %1}";
-  [(set_attr "type" "sselog1")
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand" "ym,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   pabs\t{%1, %0|%0, %1}
+   %vpabs\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "sselog1")
(set_attr "prefix_rep" "0")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI")])
 
 ;
 ;;
-- 
2.20.1

[PATCH 22/41] i386: Emulate MMX mmx_uavgv8qi3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_uavgv8qi3 with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (mmx_uavgv8qi3): Also check TARGET_MMX
and TARGET_MMX_WITH_SSE.
(*mmx_uavgv8qi3): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 5a342256cbc..8866354dea9 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1679,50 +1679,55 @@
(plus:V8HI
  (plus:V8HI
(zero_extend:V8HI
- (match_operand:V8QI 1 "nonimmediate_operand"))
+ (match_operand:V8QI 1 "register_mmxmem_operand"))
(zero_extend:V8HI
- (match_operand:V8QI 2 "nonimmediate_operand")))
+ (match_operand:V8QI 2 "register_mmxmem_operand")))
  (const_vector:V8HI [(const_int 1) (const_int 1)
  (const_int 1) (const_int 1)
  (const_int 1) (const_int 1)
  (const_int 1) (const_int 1)]))
(const_int 1))))]
-  "TARGET_SSE || TARGET_3DNOW"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
   "ix86_fixup_binary_operands_no_copy (PLUS, V8QImode, operands);")
 
 (define_insn "*mmx_uavgv8qi3"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
+  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
(truncate:V8QI
  (lshiftrt:V8HI
(plus:V8HI
  (plus:V8HI
(zero_extend:V8HI
- (match_operand:V8QI 1 "nonimmediate_operand" "%0"))
+ (match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yv"))
(zero_extend:V8HI
- (match_operand:V8QI 2 "nonimmediate_operand" "ym")))
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")))
  (const_vector:V8HI [(const_int 1) (const_int 1)
  (const_int 1) (const_int 1)
  (const_int 1) (const_int 1)
  (const_int 1) (const_int 1)]))
(const_int 1))))]
-  "(TARGET_SSE || TARGET_3DNOW)
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)
&& ix86_binary_operator_ok (PLUS, V8QImode, operands)"
 {
   /* These two instructions have the same operation, but their encoding
  is different.  Prefer the one that is de facto standard.  */
-  if (TARGET_SSE || TARGET_3DNOW_A)
+  if (TARGET_MMX_WITH_SSE && TARGET_AVX)
+return "vpavgb\t{%2, %1, %0|%0, %1, %2}";
+  else if (TARGET_SSE || TARGET_3DNOW_A)
 return "pavgb\t{%2, %0|%0, %2}";
   else
 return "pavgusb\t{%2, %0|%0, %2}";
 }
-  [(set_attr "type" "mmxshft")
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxshft,sseiadd,sseiadd")
(set (attr "prefix_extra")
  (if_then_else
(not (ior (match_test "TARGET_SSE")
 (match_test "TARGET_3DNOW_A")))
(const_string "1")
(const_string "*")))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_uavgv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1
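
The mmx_uavgv8qi3 RTL above describes a rounding unsigned average computed in a wider type: add both inputs, add 1, shift right by one, truncate. A hedged C sketch of one lane and of the eight-lane form follows. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of the RTL: widen to 16 bits so the sum cannot overflow,
   add the rounding constant 1, shift right by 1, truncate to 8 bits.  */
static uint8_t
model_uavg8 (uint8_t a, uint8_t b)
{
  uint16_t sum = (uint16_t) ((uint16_t) a + (uint16_t) b + 1u);
  return (uint8_t) (sum >> 1);
}

/* Eight-lane (V8QI) version.  */
static void
model_uavgv8qi3 (uint8_t dst[8], const uint8_t a[8], const uint8_t b[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = model_uavg8 (a[i], b[i]);
}
```

The widening step matters: 255 + 255 + 1 does not fit in 8 bits, which is why the pattern is written with zero_extend to V8HI before the additions.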

[PATCH 24/41] i386: Emulate MMX mmx_psadbw with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_psadbw with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (mmx_psadbw): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index d647dc28baa..098e41e19c3 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1771,14 +1771,19 @@
(set_attr "mode" "DI,TI,TI")])
 
 (define_insn "mmx_psadbw"
-  [(set (match_operand:V1DI 0 "register_operand" "=y")
-(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0")
- (match_operand:V8QI 2 "nonimmediate_operand" "ym")]
+  [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yv")
+(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0,0,Yv")
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")]
 UNSPEC_PSADBW))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "psadbw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxshft")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   psadbw\t{%2, %0|%0, %2}
+   psadbw\t{%2, %0|%0, %2}
+   vpsadbw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxshft,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn_and_split "mmx_pmovmskb"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
-- 
2.20.1

[PATCH 34/41] i386: Emulate MMX ssse3_palignrdi with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX version of palignrq with SSE version by concatenating 2
64-bit MMX operands into a single 128-bit SSE operand, followed by
SSE psrldq.  Only SSE register source operand is allowed.

PR target/89021
* config/i386/sse.md (ssse3_palignrdi): Changed to
define_insn_and_split to support SSE emulation.
---
 gcc/config/i386/sse.md | 58 ++
 1 file changed, 48 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1d90af0a4b0..b69a467291c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15855,23 +15855,61 @@
(set_attr "prefix" "orig,vex,evex")
(set_attr "mode" "")])
 
-(define_insn "ssse3_palignrdi"
-  [(set (match_operand:DI 0 "register_operand" "=y")
-   (unspec:DI [(match_operand:DI 1 "register_operand" "0")
-   (match_operand:DI 2 "nonimmediate_operand" "ym")
-   (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n")]
+(define_insn_and_split "ssse3_palignrdi"
+  [(set (match_operand:DI 0 "register_operand" "=y,x,Yv")
+   (unspec:DI [(match_operand:DI 1 "register_operand" "0,0,Yv")
+   (match_operand:DI 2 "register_mmxmem_operand" "ym,x,Yv")
+   (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n,n,n")]
   UNSPEC_PALIGNR))]
-  "TARGET_SSSE3"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
 {
-  operands[3] = GEN_INT (INTVAL (operands[3]) / 8);
-  return "palignr\t{%3, %2, %0|%0, %2, %3}";
+  switch (which_alternative)
+{
+case 0:
+  operands[3] = GEN_INT (INTVAL (operands[3]) / 8);
+  return "palignr\t{%3, %2, %0|%0, %2, %3}";
+case 1:
+case 2:
+  return "#";
+default:
+  gcc_unreachable ();
+}
 }
-  [(set_attr "type" "sseishft")
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(set (match_dup 0)
+   (lshiftrt:V1TI (match_dup 0) (match_dup 3)))]
+{
+  /* Emulate MMX palignrdi with SSE psrldq.  */
+  rtx op0 = lowpart_subreg (V2DImode, operands[0],
+   GET_MODE (operands[0]));
+  rtx insn;
+  if (TARGET_AVX)
+insn = gen_vec_concatv2di (op0, operands[2], operands[1]);
+  else
+{
+  /* NB: SSE can only concatenate OP0 and OP1 to OP0.  */
+  insn = gen_vec_concatv2di (op0, operands[1], operands[2]);
+  emit_insn (insn);
+  /* Swap bits 0:63 with bits 64:127.  */
+  rtx mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (2),
+ GEN_INT (3),
+ GEN_INT (0),
+ GEN_INT (1)));
+  rtx op1 = lowpart_subreg (V4SImode, op0, GET_MODE (op0));
+  rtx op2 = gen_rtx_VEC_SELECT (V4SImode, op1, mask);
+  insn = gen_rtx_SET (op1, op2);
+}
+  emit_insn (insn);
+  operands[0] = lowpart_subreg (V1TImode, op0, GET_MODE (op0));
+}
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sseishft")
(set_attr "atom_unit" "sishuf")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 ;; Mode iterator to handle singularity w/ absence of V2DI and V4DI
 ;; modes for abs instruction on pre AVX-512 targets.
-- 
2.20.1
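
The palignr emulation above boils down to: place the two 64-bit operands side by side in a 128-bit value (source in the low half, destination in the high half), shift the whole value right by a byte count, and keep the low 64 bits. The C sketch below checks that description against a direct 64-bit model. The names are hypothetical, and the byte count here corresponds to operands[3] divided by 8.

```c
#include <assert.h>
#include <stdint.h>

/* Direct model of the 64-bit palignr: the result is the low 64 bits of
   (DST:SRC) shifted right by NBYTES bytes.  */
static uint64_t
model_palignr64 (uint64_t dst, uint64_t src, unsigned nbytes)
{
  if (nbytes == 0)
    return src;
  if (nbytes < 8)
    return (src >> (8 * nbytes)) | (dst << (64 - 8 * nbytes));
  if (nbytes < 16)
    return dst >> (8 * (nbytes - 8));
  return 0;
}

/* Emulation: concatenate into a 16-byte buffer (SRC low, DST high),
   do a psrldq-style byte shift, keep the low 8 bytes.  */
static uint64_t
model_palignr64_via_psrldq (uint64_t dst, uint64_t src, unsigned nbytes)
{
  uint8_t wide[16], shifted[16];
  for (int i = 0; i < 8; i++)
    {
      wide[i] = (uint8_t) (src >> (8 * i));
      wide[8 + i] = (uint8_t) (dst >> (8 * i));
    }
  for (unsigned i = 0; i < 16; i++)
    shifted[i] = (i + nbytes < 16) ? wide[i + nbytes] : 0;
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    r |= (uint64_t) shifted[i] << (8 * i);
  return r;
}

/* Returns 1 if both models agree for byte counts 0..31.  */
static int
palignr_models_agree (void)
{
  const uint64_t dst = 0x0123456789abcdefULL;
  const uint64_t src = 0xfedcba9876543210ULL;
  for (unsigned n = 0; n < 32; n++)
    if (model_palignr64 (dst, src, n) != model_palignr64_via_psrldq (dst, src, n))
      return 0;
  return 1;
}
```

In the non-AVX path of the patch the concatenation first lands in the opposite order and the two halves are then swapped. The sketch models only the final layout.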

[PATCH 26/41] i386: Emulate MMX umulv1siv1di3 with SSE2

2019-02-18 Thread H.J. Lu

Emulate MMX umulv1siv1di3 with SSE2.  Only SSE register source operand
is allowed.

PR target/89021
* config/i386/mmx.md (sse2_umulv1siv1di3): Add SSE emulation
support.
(*sse2_umulv1siv1di3): Add SSE2 emulation.
---
 gcc/config/i386/mmx.md | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index b06f0af984a..f27513f7f2c 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -905,30 +905,36 @@
 (mult:V1DI
  (zero_extend:V1DI
(vec_select:V1SI
- (match_operand:V2SI 1 "nonimmediate_operand")
+ (match_operand:V2SI 1 "register_mmxmem_operand")
  (parallel [(const_int 0)])))
  (zero_extend:V1DI
(vec_select:V1SI
- (match_operand:V2SI 2 "nonimmediate_operand")
+ (match_operand:V2SI 2 "register_mmxmem_operand")
  (parallel [(const_int 0)])))))]
-  "TARGET_SSE2"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE2"
   "ix86_fixup_binary_operands_no_copy (MULT, V2SImode, operands);")
 
 (define_insn "*sse2_umulv1siv1di3"
-  [(set (match_operand:V1DI 0 "register_operand" "=y")
+  [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yv")
 (mult:V1DI
  (zero_extend:V1DI
(vec_select:V1SI
- (match_operand:V2SI 1 "nonimmediate_operand" "%0")
+ (match_operand:V2SI 1 "register_mmxmem_operand" "%0,0,Yv")
  (parallel [(const_int 0)])))
  (zero_extend:V1DI
(vec_select:V1SI
- (match_operand:V2SI 2 "nonimmediate_operand" "ym")
+ (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv")
  (parallel [(const_int 0)])))))]
-  "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V2SImode, operands)"
-  "pmuludq\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && TARGET_SSE2
+   && ix86_binary_operator_ok (MULT, V2SImode, operands)"
+  "@
+   pmuludq\t{%2, %0|%0, %2}
+   pmuludq\t{%2, %0|%0, %2}
+   vpmuludq\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,ssemul,ssemul")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_v4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 33/41] i386: Emulate MMX ssse3_psign3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ssse3_psign3 with SSE.  Only SSE register source operand
is allowed.

PR target/89021
* config/i386/sse.md (ssse3_psign3): Add SSE emulation.
---
 gcc/config/i386/sse.md | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 79b35d95424..1d90af0a4b0 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15786,17 +15786,21 @@
(set_attr "mode" "")])
 
 (define_insn "ssse3_psign3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
(unspec:MMXMODEI
- [(match_operand:MMXMODEI 1 "register_operand" "0")
-  (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")]
+ [(match_operand:MMXMODEI 1 "register_operand" "0,0,Yv")
+  (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")]
  UNSPEC_PSIGN))]
-  "TARGET_SSSE3"
-  "psign\t{%2, %0|%0, %2}";
-  [(set_attr "type" "sselog1")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   psign\t{%2, %0|%0, %2}
+   psign\t{%2, %0|%0, %2}
+   vpsign\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "_palignr_mask"
   [(set (match_operand:VI1_AVX512 0 "register_operand" "=v")
-- 
2.20.1

Re: [PATCH] Teach evrp that main's argc argument is always non-negative for C family (PR tree-optimization/89350)

2019-02-18 Thread Martin Sebor


On 2/16/19 12:12 AM, Jakub Jelinek wrote:

Hi!

Both the C and C++ standards guarantee that the argc argument to main is
non-negative; the following patch sets (or adjusts) the corresponding
SSA_NAME_RANGE_INFO.  Although main is only one function, IPA VRP can
propagate the range further.  I had to change one testcase because it is now
optimized better (the test has been folded away), so no sinking was done.


If/when this goes in it might make sense to also set argv and argv[0]
to nonnull.

Martin



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-16  Jakub Jelinek  

PR tree-optimization/89350
* gimple-ssa-evrp.c: Include tree-dfa.h and langhooks.h.
(maybe_set_main_argc_range): New function.
(execute_early_vrp): Call it.

* gcc.dg/tree-ssa/vrp122.c: New test.
* gcc.dg/tree-ssa/ssa-sink-3.c (main): Rename to ...
(bar): ... this.

--- gcc/gimple-ssa-evrp.c.jj   2019-01-01 12:37:15.712998659 +0100
+++ gcc/gimple-ssa-evrp.c   2019-02-15 09:49:56.768534668 +0100
@@ -41,6 +41,8 @@ along with GCC; see the file COPYING3.
  #include "tree-cfgcleanup.h"
  #include "vr-values.h"
  #include "gimple-ssa-evrp-analyze.h"
+#include "tree-dfa.h"
+#include "langhooks.h"
  
  class evrp_folder : public substitute_and_fold_engine
  {
@@ -291,6 +293,39 @@ evrp_dom_walker::cleanup (void)
evrp_folder.vr_values->cleanup_edges_and_switches ();
  }
  
+/* argc in main in C/C++ is guaranteed to be non-negative.  Adjust the
+   range info for it.  */
+
+static void
+maybe_set_main_argc_range (void)
+{
+  if (!DECL_ARGUMENTS (current_function_decl)
+  || !(lang_GNU_C () || lang_GNU_CXX () || lang_GNU_OBJC ()))
+return;
+
+  tree argc = DECL_ARGUMENTS (current_function_decl);
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (argc)) != integer_type_node)
+return;
+
+  argc = ssa_default_def (cfun, argc);
+  if (argc == NULL_TREE)
+return;
+
+  wide_int min, max;
+  value_range_kind kind = get_range_info (argc, &min, &max);
+  if (kind == VR_VARYING)
+{
+  min = wi::zero (TYPE_PRECISION (integer_type_node));
+  max = wi::to_wide (TYPE_MAX_VALUE (integer_type_node));
+}
+  else if (kind == VR_RANGE && wi::neg_p (min) && !wi::neg_p (max))
+min = wi::zero (TYPE_PRECISION (integer_type_node));
+  else
+return;
+
+  set_range_info (argc, VR_RANGE, min, max);
+}
+
  /* Main entry point for the early vrp pass which is a simplified non-iterative
 version of vrp where basic blocks are visited in dominance order.  Value
 ranges discovered in early vrp will also be used by ipa-vrp.  */
@@ -307,6 +342,10 @@ execute_early_vrp ()
scev_initialize ();
calculate_dominance_info (CDI_DOMINATORS);
  
+  /* argc in main in C/C++ is guaranteed to be non-negative.  */
+  if (MAIN_NAME_P (DECL_NAME (current_function_decl)))
+maybe_set_main_argc_range ();
+
/* Walk stmts in dominance order and propagate VRP.  */
evrp_dom_walker walker;
walker.walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
--- gcc/testsuite/gcc.dg/tree-ssa/vrp122.c.jj   2019-02-15 09:54:07.016357759 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp122.c  2019-02-15 09:53:59.299486561 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/89350 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "link_error \\\(" "optimized" } } */
+
+extern void link_error (void);
+
+int
+main (int argc, const char *argv[])
+{
+  if (argc < 0)
+link_error ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c.jj   2015-05-29 15:03:44.947546711 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c  2019-02-16 08:04:29.951126611 +0100
@@ -2,7 +2,7 @@
  /* { dg-options "-O2 -fdump-tree-sink-stats" } */
  extern void foo(int a);
  int
-main (int argc)
+bar (int argc)
  {
int a;
a = argc + 1;

Jakub
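
For readers skimming the thread, the range adjustment in maybe_set_main_argc_range can be summarised with a small stand-alone C model. This is only a sketch: the enum and function names are hypothetical, plain int replaces wide_int, and it mirrors just the three cases visible in the patch (varying, a range that straddles zero, and everything else left alone).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the value_range_kind values the patch tests.  */
enum model_range_kind { MODEL_VARYING, MODEL_RANGE, MODEL_ANTI_RANGE };

/* Mirror of the decision logic: returns 1 and updates *MIN / *MAX when a
   new [min, max] range would be recorded for argc, 0 when the existing
   information is left untouched.  */
static int
model_adjust_argc_range (enum model_range_kind kind, int *min, int *max)
{
  if (kind == MODEL_VARYING)
    {
      /* Nothing known yet: record [0, INT_MAX].  */
      *min = 0;
      *max = INT_MAX;
    }
  else if (kind == MODEL_RANGE && *min < 0 && *max >= 0)
    /* Known range straddles zero: clamp the lower bound.  */
    *min = 0;
  else
    return 0;
  return 1;
}
```

An existing range that is already non-negative, or one that is entirely negative, is not modified by this logic.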

Re: [PATCH] document __has_attribute and __has_include

2019-02-18 Thread Martin Sebor


On 2/15/19 8:30 PM, Sandra Loosemore wrote:

On 2/13/19 2:46 PM, Martin Sebor wrote:

The attached patch adds documentation for the __has_attribute (and
__has_cpp_attribute) and __has_include operators added in r215752.


Thanks!


I was a little unsure where to add this, whether the preprocessor
manual or the GCC manual, or both.  It seems that it belongs in
the preprocessor manual but since more users read the GCC manual,
it's likely to be overlooked there.


I think the preprocessor manual is the right place.  A while back I 
brought up the idea of consolidating the preprocessor docs into the GCC 
manual but the consensus seemed to be for retaining a separate 
preprocessor manual.


My comments on this patch are mostly trivial markup things.


@@ -3422,6 +3425,99 @@ condition succeeds after the original @samp{#if} a
 @samp{#else} is allowed after any number of @samp{#elif} directives, but
 @samp{#elif} may not follow @samp{#else}.

+@node __has_attribute
+@subsection __has_attribute


Please use @code markup in the @subsection.


Done.  I also changed @node and the corresponding menu item.

I wasn't sure what the convention here was: the other subsections
(like If and Elif) use capitalization and no @code even though their
names are keywords.  Should they be changed as well?




+@cindex @code{__has_attribute}
+
+The special operator @code{__has_attribute (operand)} may be used in


@code{__has_attribute (@var{operand})}


Done.




+@samp{#if} and @samp{#elif} expressions to test whether the attribute


Another question: should these use @code instead?  (Again, I'm not
entirely sure what the convention is in the CPP manual.  It seems
consistent in using @samp for directives like #if but then it uses
@code for bigger snippets like @code{#if 0} or @code{#pragma GCC
poison} where (IIUC) the Texinfo manual suggests @samp might be
preferable).


+referenced by its argument is recognized by GCC.  Using the operator
+in other contexts is not valid.  In C code, @var{operand} must be
+a valid identifier.  In C++ code, @var{operand} may be optionally
+introduced by the @code{attribute-scope::} prefix.


I think "attribute-scope" is not a literal part of the prefix, so

@code{@var{attribute-scope}::}


+The @code{attribute-scope} prefix identifies the ``namespace'' within


And @var markup here, too.


+which the attribute is recognized.  The scope of GCC attributes is
+@samp{gnu} or @samp{__gnu__}.  The operator by itself, without any


The @code{__has_attribute} operator by itself


Done.




+@var{operand} or parentheses, acts as a predefined macro so that support
+for it can be tested in portable code.  Thus, the recommended use of
+the operator is as follows:
+
+@smallexample
+#if defined __has_attribute
+#  if __has_attribute (nonnull)
+#    define ATTR_NONNULL __attribute__ ((nonnull))
+#  endif
+#endif
+@end smallexample
+
+The first @samp{#if} test succeeds only when the operator is supported
+by the version of GCC (or another compiler) being used.  Only when that
+test succeeds is it valid to use @code{__has_attribute} as a preprocessor
+operator.  As a result, combining the two tests into a single expression as
+shown below would only be valid with a compiler that supports the operator
+but not with others that don't.
+
+@smallexample
+#if defined __has_attribute && __has_attribute (nonnull)   /* not portable */
+@dots{}
+#endif
+@end smallexample
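
For reference, a minimal sketch of the recommended nested form as a complete C translation unit.  The #ifndef fallback and the first_char and check_first_char helpers are illustrative additions and are not part of the patch; the sketch only assumes that a compiler without the operator should see an empty macro.

```c
#include <assert.h>

/* Test for the operator first; evaluate it only when it exists.  */
#if defined __has_attribute
#  if __has_attribute (nonnull)
#    define ATTR_NONNULL __attribute__ ((nonnull))
#  endif
#endif

/* Fallback added for this sketch: expand to nothing elsewhere.  */
#ifndef ATTR_NONNULL
#  define ATTR_NONNULL
#endif

/* Hypothetical helper; the declaration carries the attribute.  */
static char first_char (const char *s) ATTR_NONNULL;

static char
first_char (const char *s)
{
  return s[0];
}

/* Usage check: the code is valid whether or not the macro expanded
   to an attribute.  */
static void
check_first_char (void)
{
  assert (first_char ("gcc") == 'g');
}
```

Because the inner test is only reached when the outer one succeeds, a preprocessor that does not know __has_attribute never has to parse the operator's argument list.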
+
+@node __has_cpp_attribute
+@subsection __has_cpp_attribute


@code markup in the @subsection title, again.


+@cindex @code{__has_cpp_attribute}
+
+The special operator @code{__has_cpp_attribute (operand)} may be used


@var{operand} markup again.


Done.




+in @samp{#if} and @samp{#elif} expressions in C++ code to test whether
+the attribute referenced by its argument is recognized by GCC.
+@code{__has_cpp_attribute (operand)} is equivalent to
+@code{__has_attribute (operand)} except that when @code{operand}


The 3 instances above too.


Done.




+designates a supported standard attribute it evaluates to an integer
+constant of the form @code{YYYYMM} indicating the year and month when
+the attribute was first introduced into the C++ standard.  For additional
+information including the dates of the introduction of current standard
+attributes, see
+@w{@uref{https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations/,
+SD-6: SG10 Feature Test Recommendations}}.
+
+@node __has_include
+@subsection __has_include


@code markup in title again


Done.




+@cindex @code{__has_include}
+
+The special operator @code{__has_include (operand)} may be used in @samp{#if}

@var{operand}


Done.



+and @samp{#elif} expressions to test whether the header referenced by its
+@var{operand} can be included using the @samp{#include} directive.  Using
+the operator in other contexts is not valid.  The @var{operand} takes
+the same form as the file in the @samp{#include} directive (@xref{Include
+Syntax}) and evaluates to a nonzero

[PATCH 39/41] i386: Allow MMX intrinsic emulation with SSE

2019-02-18 Thread H.J. Lu

Allow MMX intrinsic emulation with SSE/SSE2/SSSE3.  Don't enable MMX ISA
by default with TARGET_MMX_WITH_SSE.

For pr82483-1.c and pr82483-2.c, "-mssse3 -mno-mmx" compiles in 64-bit
mode since MMX intrinsics can be emulated with SSE.
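
A hedged sketch of what this means for user code: the same MMX intrinsic call should build whether the compiler provides native MMX or the SSE emulation.  The __MMX_WITH_SSE__ macro name is taken from the mmintrin.h ChangeLog entry below; the add_pi16 wrapper, the HAVE_MMX_INTRINSICS macro, and the scalar fallback for other targets are assumptions made only for this illustration.

```c
#include <stdint.h>
#include <string.h>

#if defined(__MMX__) || defined(__MMX_WITH_SSE__)
#  include <mmintrin.h>
#  define HAVE_MMX_INTRINSICS 1
#else
#  define HAVE_MMX_INTRINSICS 0
#endif

/* Add four 16-bit lanes with wraparound, as _mm_add_pi16 (paddw) does.  */
static void
add_pi16 (const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
#if HAVE_MMX_INTRINSICS
  __m64 va, vb, vr;
  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);
  vr = _mm_add_pi16 (va, vb);
  memcpy (out, &vr, sizeof vr);
  _mm_empty ();   /* Leave MMX state before any x87 code runs.  */
#else
  /* Portable fallback for targets without MMX intrinsics.  */
  for (int i = 0; i < 4; i++)
    out[i] = (uint16_t) (a[i] + b[i]);
#endif
}
```

The caller does not need to know which path was taken; the lane-wise result is the same in both cases.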

gcc/

PR target/89021
* config/i386/i386-builtin.def: Enable MMX intrinsics with
SSE/SSE2/SSSE3.
* config/i386/i386.c (ix86_init_mmx_sse_builtins): Likewise.
(ix86_expand_builtin): Allow SSE/SSE2/SSSE3 to emulate MMX
intrinsics with TARGET_MMX_WITH_SSE.
* config/i386/mmintrin.h: Only require SSE2 if __MMX_WITH_SSE__
is defined.

gcc/testsuite/

PR target/89021
* gcc.target/i386/pr82483-1.c: Error only on ia32.
* gcc.target/i386/pr82483-2.c: Likewise.
---
 gcc/config/i386/i386-builtin.def  | 126 +++---
 gcc/config/i386/i386.c|  29 -
 gcc/config/i386/mmintrin.h|  12 ++-
 gcc/testsuite/gcc.target/i386/pr82483-1.c |   2 +-
 gcc/testsuite/gcc.target/i386/pr82483-2.c |   2 +-
 5 files changed, 101 insertions(+), 70 deletions(-)

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 88005f4687f..10a9d631f29 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -100,7 +100,7 @@ BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", IX86_BUILTIN_FNSTSW, UNKN
 BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* MMX */
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
+BDESC (OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* 3DNow! */
 BDESC (OPTION_MASK_ISA_3DNOW, 0, CODE_FOR_mmx_femms, "__builtin_ia32_femms", IX86_BUILTIN_FEMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
@@ -442,68 +442,68 @@ BDESC (0, 0, CODE_FOR_rotrqi3, "__builtin_ia32_rorqi", IX86_BUILTIN_RORQI, UNKNO
 BDESC (0, 0, CODE_FOR_rotrhi3, "__builtin_ia32_rorhi", IX86_BUILTIN_RORHI, UNKNOWN, (int) UINT16_FTYPE_UINT16_INT)
 
 /* MMX */
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv8qi3, "__builtin_ia32_paddb", IX86_BUILTIN_PADDB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv4hi3, "__builtin_ia32_paddw", IX86_BUILTIN_PADDW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv2si3, "__builtin_ia32_paddd", IX86_BUILTIN_PADDD, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv8qi3, "__builtin_ia32_psubb", IX86_BUILTIN_PSUBB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv4hi3, "__builtin_ia32_psubw", IX86_BUILTIN_PSUBW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv2si3, "__builtin_ia32_psubd", IX86_BUILTIN_PSUBD, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
-
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv8qi3, "__builtin_ia32_paddsb", IX86_BUILTIN_PADDSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv4hi3, "__builtin_ia32_paddsw", IX86_BUILTIN_PADDSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv8qi3, "__builtin_ia32_psubsb", IX86_BUILTIN_PSUBSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv4hi3, "__builtin_ia32_psubsw", IX86_BUILTIN_PSUBSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv8qi3, "__builtin_ia32_paddusb", IX86_BUILTIN_PADDUSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv4hi3, "__builtin_ia32_paddusw", IX86_BUILTIN_PADDUSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv8qi3, "__builtin_ia32_psubusb", IX86_BUILTIN_PSUBUSB, UNKNOWN, (int) V8QI_FTYPE_V8QI_V8QI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv4hi3, "__builtin_ia32_psubusw", IX86_BUILTIN_PSUBUSW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_mulv4hi3, "__builtin_ia32_pmullw", IX86_BUILTIN_PMULLW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_smulv4hi3_highpart, "__builtin_ia32_pmulhw", IX86_BUILTIN_PMULHW, UNKNOWN, (int) V4HI_FTYPE_V4HI_V4HI)
-
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_andv2si3, "__builtin_ia32_pand", IX86_BUILTIN_PAND, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_andnotv2si3, "__builtin_ia32_pandn", IX86_BUILTIN_PANDN, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_iorv2si3, "__builtin_ia32_por", IX86_BUILTIN_POR, UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
-BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_xorv2si3, "__builtin_ia32_pxor", IX86_BUILTIN_PXOR, UNKNOWN, (int)

[PATCH 27/41] i386: Make _mm_empty () as NOP without MMX

2019-02-18 Thread H.J. Lu

With SSE emulation of MMX intrinsics, _mm_empty () should become a NOP
when MMX is not enabled.

PR target/89021
* config/i386/mmx.md (mmx_<emms>): Renamed to ...
(*mmx_<emms>): This.
(mmx_<emms>): New expander.
---
 gcc/config/i386/mmx.md | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index f27513f7f2c..c48d42c7d59 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1849,7 +1849,35 @@
   [(UNSPECV_EMMS "emms")
(UNSPECV_FEMMS "femms")])
 
-(define_insn "mmx_<emms>"
+(define_expand "mmx_<emms>"
+  [(parallel
+[(unspec_volatile [(const_int 0)] EMMS)
+  (clobber (reg:XF ST0_REG))
+  (clobber (reg:XF ST1_REG))
+  (clobber (reg:XF ST2_REG))
+  (clobber (reg:XF ST3_REG))
+  (clobber (reg:XF ST4_REG))
+  (clobber (reg:XF ST5_REG))
+  (clobber (reg:XF ST6_REG))
+  (clobber (reg:XF ST7_REG))
+  (clobber (reg:DI MM0_REG))
+  (clobber (reg:DI MM1_REG))
+  (clobber (reg:DI MM2_REG))
+  (clobber (reg:DI MM3_REG))
+  (clobber (reg:DI MM4_REG))
+  (clobber (reg:DI MM5_REG))
+  (clobber (reg:DI MM6_REG))
+  (clobber (reg:DI MM7_REG))])]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+{
+   if (!TARGET_MMX)
+ {
+   emit_insn (gen_nop ());
+   DONE;
+ }
+})
+
+(define_insn "*mmx_<emms>"
   [(unspec_volatile [(const_int 0)] EMMS)
(clobber (reg:XF ST0_REG))
(clobber (reg:XF ST1_REG))
-- 
2.20.1

[PATCH 19/41] i386: Emulate MMX mmx_pmovmskb with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_pmovmskb with SSE by zero-extending result of SSE pmovmskb
from QImode to SImode.  Only SSE register source operand is allowed.
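
A scalar sketch of why the zero-extension from QImode is needed, assuming the 64-bit operand sits in the low half of an XMM register whose upper half holds unrelated data.  The function names are hypothetical and only model the instruction semantics; they are not GCC internals.

```c
#include <stdint.h>

/* Model of SSE2 pmovmskb: one mask bit per byte of a 128-bit vector,
   taken from the top bit of each byte.  */
static uint32_t
sse_pmovmskb (const uint8_t v[16])
{
  uint32_t mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (uint32_t) (v[i] >> 7) << i;
  return mask;
}

/* Emulated MMX pmovmskb: only the low 8 mask bits belong to the 64-bit
   operand.  Taking the 8-bit low part and zero-extending it discards
   the bits that came from the upper half of the register.  */
static uint32_t
mmx_pmovmskb_emulated (const uint8_t xmm[16])
{
  uint8_t low = (uint8_t) sse_pmovmskb (xmm);  /* QImode low part.  */
  return (uint32_t) low;                       /* zero_extend:SI.  */
}
```

Without the final step, bits 8 to 15 of the result would depend on whatever the upper eight bytes of the XMM register happen to contain.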

PR target/89021
* config/i386/mmx.md (mmx_pmovmskb): Changed to
define_insn_and_split to support SSE emulation.
---
 gcc/config/i386/mmx.md | 30 +++---
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index edfb8623701..5ae04de205d 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1763,14 +1763,30 @@
   [(set_attr "type" "mmxshft")
(set_attr "mode" "DI")])
 
-(define_insn "mmx_pmovmskb"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-   (unspec:SI [(match_operand:V8QI 1 "register_operand" "y")]
+(define_insn_and_split "mmx_pmovmskb"
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+   (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")]
   UNSPEC_MOVMSK))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "pmovmskb\t{%1, %0|%0, %1}"
-  [(set_attr "type" "mmxcvt")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   pmovmskb\t{%1, %0|%0, %1}
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(set (match_dup 0)
+(unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))
+   (set (match_dup 0)
+   (zero_extend:SI (match_dup 2)))]
+{
+  /* Generate SSE pmovmskb and zero-extend from QImode to SImode.  */
+  operands[1] = lowpart_subreg (V16QImode, operands[1],
+   GET_MODE (operands[1]));
+  operands[2] = lowpart_subreg (QImode, operands[0],
+   GET_MODE (operands[0]));
+}
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "mmxcvt,ssemov")
+   (set_attr "mode" "DI,TI")])
 
 (define_expand "mmx_maskmovq"
   [(set (match_operand:V8QI 0 "memory_operand")
-- 
2.20.1

[PATCH 13/41] i386: Emulate MMX pshufw with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX pshufw with SSE.  Only SSE register source operand is allowed.
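
A scalar sketch of the substitution, assuming pshuflw is applied to the XMM register that holds the 64-bit value in its low half.  shuffle_imm follows the mask computation visible in mmx_pshufw_1 below; the shifts for the two middle selectors fall outside the quoted hunk and are assumed to be 2 and 4.  All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 2-bit lane selectors into the 8-bit shuffle immediate.  */
static unsigned
shuffle_imm (unsigned s0, unsigned s1, unsigned s2, unsigned s3)
{
  assert (s0 < 4 && s1 < 4 && s2 < 4 && s3 < 4);
  return s0 | (s1 << 2) | (s2 << 4) | (s3 << 6);
}

/* Model of SSE2 pshuflw: shuffle the low four halfwords and copy the
   upper four unchanged.  A temporary keeps src == dst safe.  */
static void
pshuflw (const uint16_t src[8], unsigned imm, uint16_t dst[8])
{
  uint16_t tmp[8];
  for (int i = 0; i < 4; i++)
    tmp[i] = src[(imm >> (2 * i)) & 3];
  for (int i = 4; i < 8; i++)
    tmp[i] = src[i];
  for (int i = 0; i < 8; i++)
    dst[i] = tmp[i];
}
```

An immediate of 0 broadcasts halfword 0 across the low four lanes, which corresponds to the selection mask (0, 0, 0, 0, 4, 5, 6, 7) built in the non-AVX2 branch of the *vec_dupv4hi splitter.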

PR target/89021
* config/i386/mmx.md (mmx_pshufw): Also check TARGET_MMX and
TARGET_MMX_WITH_SSE.
(mmx_pshufw_1): Add SSE emulation.
(*vec_dupv4hi): Changed to define_insn_and_split and also allow
TARGET_MMX_WITH_SSE to support SSE emulation.
---
 gcc/config/i386/mmx.md | 81 +-
 1 file changed, 65 insertions(+), 16 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index b441f36dfc6..09e78ac5f74 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1323,9 +1323,10 @@
 
 (define_expand "mmx_pshufw"
   [(match_operand:V4HI 0 "register_operand")
-   (match_operand:V4HI 1 "nonimmediate_operand")
+   (match_operand:V4HI 1 "register_mmxmem_operand")
(match_operand:SI 2 "const_int_operand")]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
 {
   int mask = INTVAL (operands[2]);
   emit_insn (gen_mmx_pshufw_1 (operands[0], operands[1],
@@ -1337,14 +1338,15 @@
 })
 
 (define_insn "mmx_pshufw_1"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv")
 (vec_select:V4HI
-  (match_operand:V4HI 1 "nonimmediate_operand" "ym")
+  (match_operand:V4HI 1 "register_mmxmem_operand" "ym,Yv")
   (parallel [(match_operand 2 "const_0_to_3_operand")
  (match_operand 3 "const_0_to_3_operand")
  (match_operand 4 "const_0_to_3_operand")
  (match_operand 5 "const_0_to_3_operand")])))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
 {
   int mask = 0;
   mask |= INTVAL (operands[2]) << 0;
@@ -1353,11 +1355,20 @@
   mask |= INTVAL (operands[5]) << 6;
   operands[2] = GEN_INT (mask);
 
-  return "pshufw\t{%2, %1, %0|%0, %1, %2}";
+  switch (which_alternative)
+{
+case 0:
+  return "pshufw\t{%2, %1, %0|%0, %1, %2}";
+case 1:
+  return "%vpshuflw\t{%2, %1, %0|%0, %1, %2}";
+default:
+  gcc_unreachable ();
+}
 }
-  [(set_attr "type" "mmxcvt")
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "mmxcvt,sselog")
(set_attr "length_immediate" "1")
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI")])
 
 (define_insn "mmx_pswapdv2si2"
   [(set (match_operand:V2SI 0 "register_operand" "=y")
@@ -1370,16 +1381,54 @@
(set_attr "prefix_extra" "1")
(set_attr "mode" "DI")])
 
-(define_insn "*vec_dupv4hi"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+(define_insn_and_split "*vec_dupv4hi"
+  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv,Yw")
(vec_duplicate:V4HI
  (truncate:HI
-   (match_operand:SI 1 "register_operand" "0"))))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "pshufw\t{$0, %0, %0|%0, %0, 0}"
-  [(set_attr "type" "mmxcvt")
-   (set_attr "length_immediate" "1")
-   (set_attr "mode" "DI")])
+   (match_operand:SI 1 "register_operand" "0,Yv,r"))))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   pshufw\t{$0, %0, %0|%0, %0, 0}
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(const_int 0)]
+{
+  rtx op;
+  operands[0] = lowpart_subreg (V8HImode, operands[0],
+   GET_MODE (operands[0]));
+  if (TARGET_AVX2)
+{
+  operands[1] = lowpart_subreg (HImode, operands[1],
+   GET_MODE (operands[1]));
+  op = gen_rtx_VEC_DUPLICATE (V8HImode, operands[1]);
+}
+  else
+{
+  operands[1] = lowpart_subreg (V8HImode, operands[1],
+   GET_MODE (operands[1]));
+  rtx mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (8,
+ GEN_INT (0),
+ GEN_INT (0),
+ GEN_INT (0),
+ GEN_INT (0),
+ GEN_INT (4),
+ GEN_INT (5),
+ GEN_INT (6),
+ GEN_INT (7)));
+
+  op = gen_rtx_VEC_SELECT (V8HImode, operands[1], mask);
+}
+  rtx insn = gen_rtx_SET (operands[0], op);
+  emit_insn (insn);
+  DONE;
+}
+  [(set_attr "mmx_isa" "native,x64,x64_avx")
+   (set_attr "type" "mmxcvt,sselog1,ssemov")
+   (set_attr "length_immediate" "1,1,0")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn_and_split "*vec_dupv2si"
   [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv,Yw")
-- 
2.20.1

[PATCH 31/41] i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ssse3_pmulhrswv4hi3 with SSE.  Only SSE register source
operand is allowed.
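
For readers who do not have the instruction in mind, a one-lane scalar model of pmulhrsw that follows the RTL in the pattern below step by step.  The function name is hypothetical; the arithmetic is the same for the MMX and SSE forms, only the number of lanes differs.

```c
#include <stdint.h>
#include <string.h>

/* One lane of pmulhrsw: sign-extend both inputs, multiply, shift right
   logically by 14, add 1, shift right logically by 1, then truncate
   to 16 bits.  */
static int16_t
pmulhrsw_lane (int16_t a, int16_t b)
{
  uint32_t prod = (uint32_t) ((int32_t) a * (int32_t) b);
  uint32_t rounded = ((prod >> 14) + 1u) >> 1;
  uint16_t bits = (uint16_t) rounded;
  int16_t out;
  memcpy (&out, &bits, sizeof out);   /* Reinterpret as signed.  */
  return out;
}
```

The logical shifts are harmless for negative products because only the low 16 bits survive the final truncation.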

PR target/89021
* config/i386/sse.md (ssse3_pmulhrswv4hi3): Require TARGET_MMX
or TARGET_MMX_WITH_SSE.
(*ssse3_pmulhrswv4hi3): Add SSE emulation.
---
 gcc/config/i386/sse.md | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e8d9bec9766..b08a577d1e4 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15670,38 +15670,44 @@
  (lshiftrt:V4SI
(mult:V4SI
  (sign_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand"))
+   (match_operand:V4HI 1 "register_mmxmem_operand"))
  (sign_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand")))
+   (match_operand:V4HI 2 "register_mmxmem_operand")))
(const_int 14))
  (match_dup 3))
(const_int 1))))]
-  "TARGET_SSSE3"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
 {
   operands[3] = CONST1_RTX(V4HImode);
   ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);
 })
 
 (define_insn "*ssse3_pmulhrswv4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(truncate:V4HI
  (lshiftrt:V4SI
(plus:V4SI
  (lshiftrt:V4SI
(mult:V4SI
  (sign_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
+   (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv"))
  (sign_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
+   (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))
(const_int 14))
  (match_operand:V4HI 3 "const1_operand"))
(const_int 1))))]
-  "TARGET_SSSE3 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
-  "pmulhrsw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "sseimul")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && TARGET_SSSE3
+   && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
+  "@
+   pmulhrsw\t{%2, %0|%0, %2}
+   pmulhrsw\t{%2, %0|%0, %2}
+   vpmulhrsw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "sseimul")
(set_attr "prefix_extra" "1")
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "_pshufb3"
   [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x,v")
-- 
2.20.1

[PATCH 14/41] i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE.

PR target/89021
* config/i386/sse.md (sse_cvtps2pi): Add SSE emulation.
(sse_cvttps2pi): Likewise.
---
 gcc/config/i386/sse.md | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 30bf7e23122..dd3a3d9ba67 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4582,26 +4582,32 @@
(set_attr "mode" "V4SF")])
 
 (define_insn "sse_cvtps2pi"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+  [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
(vec_select:V2SI
- (unspec:V4SI [(match_operand:V4SF 1 "nonimmediate_operand" "xm")]
+ (unspec:V4SI [(match_operand:V4SF 1 "register_mmxmem_operand" "xm,YvBm")]
   UNSPEC_FIX_NOTRUNC)
  (parallel [(const_int 0) (const_int 1)])))]
-  "TARGET_SSE"
-  "cvtps2pi\t{%1, %0|%0, %q1}"
-  [(set_attr "type" "ssecvt")
-   (set_attr "unit" "mmx")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
+  "@
+   cvtps2pi\t{%1, %0|%0, %q1}
+   %vcvtps2dq\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "ssecvt")
+   (set_attr "unit" "mmx,*")
(set_attr "mode" "DI")])
 
 (define_insn "sse_cvttps2pi"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+  [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
(vec_select:V2SI
- (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm"))
+ (fix:V4SI (match_operand:V4SF 1 "register_mmxmem_operand" "xm,YvBm"))
  (parallel [(const_int 0) (const_int 1)])))]
-  "TARGET_SSE"
-  "cvttps2pi\t{%1, %0|%0, %q1}"
-  [(set_attr "type" "ssecvt")
-   (set_attr "unit" "mmx")
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
+  "@
+   cvttps2pi\t{%1, %0|%0, %q1}
+   %vcvttps2dq\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "ssecvt")
+   (set_attr "unit" "mmx,*")
(set_attr "prefix_rep" "0")
(set_attr "mode" "SF")])
 
-- 
2.20.1

[PATCH 16/41] i386: Emulate MMX mmx_pextrw with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_pextrw with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (mmx_pextrw): Add SSE emulation.
---
 gcc/config/i386/mmx.md | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 09e78ac5f74..28725f48282 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1310,16 +1310,20 @@
(set_attr "mode" "DI")])
 
 (define_insn "mmx_pextrw"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
 (zero_extend:SI
  (vec_select:HI
-   (match_operand:V4HI 1 "register_operand" "y")
-   (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))))]
-  "TARGET_SSE || TARGET_3DNOW_A"
-  "pextrw\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "mmxcvt")
+   (match_operand:V4HI 1 "register_operand" "y,Yv")
+   (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n")]))))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   pextrw\t{%2, %1, %0|%0, %1, %2}
+   %vpextrw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64")
+   (set_attr "type" "mmxcvt,sselog1")
(set_attr "length_immediate" "1")
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI")])
 
 (define_expand "mmx_pshufw"
   [(match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 05/41] i386: Emulate MMX mulv4hi3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mulv4hi3 with SSE.  Only SSE register source operand is
allowed.
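
A scalar sketch of the emulation idea, assuming the V4HI operands live in the low 64 bits of XMM registers and the upper four lanes are don't-care.  Both function names are hypothetical, and the zero fill of the upper lanes is only there to make the model deterministic.

```c
#include <stdint.h>
#include <string.h>

/* Model of SSE2 pmullw on a 128-bit register: keep the low 16 bits of
   each of the eight 16-bit products.  */
static void
sse_pmullw (const int16_t a[8], const int16_t b[8], int16_t out[8])
{
  for (int i = 0; i < 8; i++)
    {
      uint32_t prod = (uint32_t) ((int32_t) a[i] * (int32_t) b[i]);
      uint16_t low = (uint16_t) prod;
      memcpy (&out[i], &low, sizeof low);   /* Reinterpret as signed.  */
    }
}

/* Emulated V4HI multiply: run the 128-bit operation and keep only
   lanes 0..3 of the result.  */
static void
mulv4hi3_emulated (const int16_t a[4], const int16_t b[4], int16_t out[4])
{
  int16_t xa[8] = { 0 }, xb[8] = { 0 }, xr[8];
  memcpy (xa, a, 4 * sizeof (int16_t));
  memcpy (xb, b, 4 * sizeof (int16_t));
  sse_pmullw (xa, xb, xr);
  memcpy (out, xr, 4 * sizeof (int16_t));
}
```

Since each lane is computed independently, the contents of the upper lanes cannot affect the four lanes that form the MMX result.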

PR target/89021
* config/i386/mmx.md (mmx_mulv4hi3): Also allow
TARGET_MMX_WITH_SSE.
(mulv4hi3): New.
(*mmx_mulv4hi3): Also allow TARGET_MMX_WITH_SSE.  Add SSE
support.
---
 gcc/config/i386/mmx.md | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 587e31b299e..fd0189eae60 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -716,19 +716,31 @@
 
 (define_expand "mmx_mulv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand")
-  (match_operand:V4HI 2 "nonimmediate_operand")))]
-  "TARGET_MMX"
+(mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand")
+  (match_operand:V4HI 2 "register_mmxmem_operand")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
+
+(define_expand "mulv4hi3"
+  [(set (match_operand:V4HI 0 "register_operand")
+(mult:V4HI (match_operand:V4HI 1 "register_operand")
+  (match_operand:V4HI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_mulv4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
-(mult:V4HI (match_operand:V4HI 1 "nonimmediate_operand" "%0")
-  (match_operand:V4HI 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmullw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
+(mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")
+  (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
+  "@
+   pmullw\t{%2, %0|%0, %2}
+   pmullw\t{%2, %0|%0, %2}
+   vpmullw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,ssemul,ssemul")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_smulv4hi3_highpart"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 15/41] i386: Emulate MMX sse_cvtpi2ps with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX sse_cvtpi2ps with SSE2 cvtdq2ps, preserving upper 64 bits of
destination XMM register.  Only SSE register source operand is allowed.
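
The two merge sequences in the splitter below can be modelled on plain float arrays.  This is a sketch under the assumption that the upper two lanes of the converted vector are don't-care; the function names are hypothetical and the zero fill is only for determinism.

```c
#include <stdint.h>

/* Non-AVX merge: dst holds operand 0 (tied to operand 1), conv holds
   the cvtdq2ps result (operand 3).  */
static void
merge_sse (float dst[4], const float conv[4])
{
  /* vec_select {2,3,4,5} of the concatenation (dst, conv).  */
  float step1[4] = { dst[2], dst[3], conv[0], conv[1] };
  /* Swap the two 64-bit halves: vec_select {2,3,0,1}.  */
  dst[0] = step1[2];
  dst[1] = step1[3];
  dst[2] = step1[0];
  dst[3] = step1[1];
}

/* AVX merge: vec_select {0,1,6,7} of the concatenation (conv, src).  */
static void
merge_avx (float dst[4], const float conv[4], const float src[4])
{
  float r[4] = { conv[0], conv[1], src[2], src[3] };
  for (int i = 0; i < 4; i++)
    dst[i] = r[i];
}

/* Emulated cvtpi2ps via the non-AVX path: convert, then merge so the
   upper two lanes of dst are preserved.  */
static void
cvtpi2ps_emulated (float dst[4], const int32_t src[2])
{
  float conv[4] = { (float) src[0], (float) src[1], 0.0f, 0.0f };
  merge_sse (dst, conv);
}
```

Both paths produce the converted values in lanes 0 and 1 and leave lanes 2 and 3 of the destination as they were; the non-AVX path needs the extra swap because its two-operand shuffle can only write the concatenation in the other order.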

PR target/89021
* config/i386/sse.md (sse_cvtpi2ps): Changed to
define_insn_and_split.  Also allow TARGET_MMX_WITH_SSE.  Add
SSE emulation.
---
 gcc/config/i386/sse.md | 64 --
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index dd3a3d9ba67..3135ce4eace 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4569,16 +4569,64 @@
 ;;
 ;
 
-(define_insn "sse_cvtpi2ps"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_insn_and_split "sse_cvtpi2ps"
+  [(set (match_operand:V4SF 0 "register_operand" "=x,x,Yv")
(vec_merge:V4SF
  (vec_duplicate:V4SF
-   (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand" "ym")))
- (match_operand:V4SF 1 "register_operand" "0")
- (const_int 3)))]
-  "TARGET_SSE"
-  "cvtpi2ps\t{%2, %0|%0, %2}"
-  [(set_attr "type" "ssecvt")
+   (float:V2SF (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv")))
+ (match_operand:V4SF 1 "register_operand" "0,0,Yv")
+ (const_int 3)))
+   (clobber (match_scratch:V4SF 3 "=X,x,Yv"))]
+  "TARGET_SSE || TARGET_MMX_WITH_SSE"
+  "@
+   cvtpi2ps\t{%2, %0|%0, %2}
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(const_int 0)]
+{
+  rtx op2 = lowpart_subreg (V4SImode, operands[2],
+   GET_MODE (operands[2]));
+  /* Generate SSE2 cvtdq2ps.  */
+  rtx insn = gen_floatv4siv4sf2 (operands[3], op2);
+  emit_insn (insn);
+
+  /* Merge operands[3] with operands[0].  */
+  rtx mask, op1;
+  if (TARGET_AVX)
+{
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (0), GEN_INT (1),
+ GEN_INT (6), GEN_INT (7)));
+  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[3], operands[1]);
+  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
+  insn = gen_rtx_SET (operands[0], op2);
+}
+  else
+{
+  /* NB: SSE can only concatenate OP0 and OP3 to OP0.  */
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
+ GEN_INT (4), GEN_INT (5)));
+  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[0], operands[3]);
+  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
+  insn = gen_rtx_SET (operands[0], op2);
+  emit_insn (insn);
+
+  /* Swap bits 0:63 with bits 64:127.  */
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
+ GEN_INT (0), GEN_INT (1)));
+  rtx dest = lowpart_subreg (V4SImode, operands[0],
+GET_MODE (operands[0]));
+  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
+  insn = gen_rtx_SET (dest, op1);
+}
+  emit_insn (insn);
+  DONE;
+}
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "ssecvt")
(set_attr "mode" "V4SF")])
 
 (define_insn "sse_cvtps2pi"
-- 
2.20.1

[PATCH 10/41] i386: Emulate MMX mmx_andnot3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_andnot3 with SSE.  Only SSE register source operand
is allowed.
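
A one-line scalar model of the operation, for reference.  The function name is hypothetical; the operand order follows (and (not op1) op2) in the pattern below.

```c
#include <stdint.h>

/* pandn on a 64-bit value: complement the first operand, then AND it
   with the second.  */
static uint64_t
pandn64 (uint64_t op1, uint64_t op2)
{
  return ~op1 & op2;
}
```

The operation is not commutative, so the operand order matters in the three-operand AVX alternative, where the complemented operand is no longer tied to the destination.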

PR target/89021
* config/i386/mmx.md (mmx_andnot3): Also allow
TARGET_MMX_WITH_SSE.  Add SSE support.
---
 gcc/config/i386/mmx.md | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 33f6c2aa774..b3df46dd563 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1049,14 +1049,18 @@
 ;
 
 (define_insn "mmx_andnot3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
(and:MMXMODEI
- (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0"))
- (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX"
-  "pandn\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+ (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0,0,Yv"))
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   pandn\t{%2, %0|%0, %2}
+   pandn\t{%2, %0|%0, %2}
+   vpandn\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sselog,sselog")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_3"
   [(set (match_operand:MMXMODEI 0 "register_operand")
-- 
2.20.1

[PATCH 08/41] i386: Emulate MMX ashr<mode>3/<shift_insn><mode>3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX ashr<mode>3/<shift_insn><mode>3 with SSE.  Only SSE register
source operand is allowed.

PR target/89021
* config/i386/mmx.md (mmx_ashr<mode>3): Also allow
TARGET_MMX_WITH_SSE.  Add SSE emulation.
(mmx_<shift_insn><mode>3): Likewise.
(ashr<mode>3): New.
(<shift_insn><mode>3): Likewise.
---
 gcc/config/i386/mmx.md | 50 ++
 1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index fe746a487d1..6af05a1881e 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -959,32 +959,54 @@
(set_attr "mode" "DI")])
 
 (define_insn "mmx_ashr3"
-  [(set (match_operand:MMXMODE24 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODE24 0 "register_operand" "=y,x,Yv")
 (ashiftrt:MMXMODE24
- (match_operand:MMXMODE24 1 "register_operand" "0")
- (match_operand:DI 2 "nonmemory_operand" "yN")))]
-  "TARGET_MMX"
-  "psra\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxshft")
+ (match_operand:MMXMODE24 1 "register_operand" "0,0,Yv")
+ (match_operand:DI 2 "nonmemory_operand" "yN,xN,YvN")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   psra\t{%2, %0|%0, %2}
+   psra\t{%2, %0|%0, %2}
+   vpsra\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxshft,sseishft,sseishft")
(set (attr "length_immediate")
  (if_then_else (match_operand 2 "const_int_operand")
(const_string "1")
(const_string "0")))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
+
+(define_expand "ashr3"
+  [(set (match_operand:MMXMODE24 0 "register_operand")
+(ashiftrt:MMXMODE24
+ (match_operand:MMXMODE24 1 "register_operand")
+ (match_operand:DI 2 "nonmemory_operand")))]
+  "TARGET_MMX_WITH_SSE")
 
 (define_insn "mmx_3"
-  [(set (match_operand:MMXMODE248 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODE248 0 "register_operand" "=y,x,Yv")
 (any_lshift:MMXMODE248
- (match_operand:MMXMODE248 1 "register_operand" "0")
- (match_operand:DI 2 "nonmemory_operand" "yN")))]
-  "TARGET_MMX"
-  "p\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxshft")
+ (match_operand:MMXMODE248 1 "register_operand" "0,0,Yv")
+ (match_operand:DI 2 "nonmemory_operand" "yN,xN,YvN")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   p\t{%2, %0|%0, %2}
+   p\t{%2, %0|%0, %2}
+   vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxshft,sseishft,sseishft")
(set (attr "length_immediate")
  (if_then_else (match_operand 2 "const_int_operand")
(const_string "1")
(const_string "0")))
-   (set_attr "mode" "DI")])
+   (set_attr "mode" "DI,TI,TI")])
+
+(define_expand "3"
+  [(set (match_operand:MMXMODE248 0 "register_operand")
+(any_lshift:MMXMODE248
+ (match_operand:MMXMODE248 1 "register_operand")
+ (match_operand:DI 2 "nonmemory_operand")))]
+  "TARGET_MMX_WITH_SSE")
 
 ;
 ;;
-- 
2.20.1

[PATCH 12/41] i386: Emulate MMX vec_dupv2si with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX vec_dupv2si with SSE.  Add the "Yw" constraint to allow
broadcast from integer register for AVX512BW with TARGET_AVX512VL.
Only SSE register source operand is allowed.

PR target/89021
* config/i386/constraints.md (Yw): New constraint.
* config/i386/mmx.md (*vec_dupv2si): Changed to
define_insn_and_split and also allow TARGET_MMX_WITH_SSE to
support SSE emulation.
---
 gcc/config/i386/constraints.md |  6 ++
 gcc/config/i386/mmx.md | 24 +---
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 16075b4acf3..c546b20d9dc 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -110,6 +110,8 @@
 ;;  v  any EVEX encodable SSE register for AVX512VL target,
 ;; otherwise any SSE register
 ;;  h  EVEX encodable SSE register with number factor of four
+;;  w  any EVEX encodable SSE register for AVX512BW with TARGET_AVX512VL
+;; target.
 
 (define_register_constraint "Yz" "TARGET_SSE ? SSE_FIRST_REG : NO_REGS"
  "First SSE register (@code{%xmm0}).")
@@ -146,6 +148,10 @@
  "TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
 "@internal For AVX512VL, any EVEX encodable SSE register (@code{%xmm0-%xmm31}), otherwise any SSE register.")
 
+(define_register_constraint "Yw"
+ "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : NO_REGS"
+ "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW with TARGET_AVX512VL target.")
+
 ;; We use the B prefix to denote any number of internal operands:
 ;;  f  FLAGS_REG
 ;;  g  GOT memory operand.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index aeebb4f5741..b441f36dfc6 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1381,14 +1381,24 @@
(set_attr "length_immediate" "1")
(set_attr "mode" "DI")])
 
-(define_insn "*vec_dupv2si"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+(define_insn_and_split "*vec_dupv2si"
+  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv,Yw")
(vec_duplicate:V2SI
- (match_operand:SI 1 "register_operand" "0")))]
-  "TARGET_MMX"
-  "punpckldq\t%0, %0"
-  [(set_attr "type" "mmxcvt")
-   (set_attr "mode" "DI")])
+ (match_operand:SI 1 "register_operand" "0,0,Yv,r")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   punpckldq\t%0, %0
+   #
+   #
+   #"
+  "TARGET_MMX_WITH_SSE && reload_completed"
+  [(set (match_dup 0)
+   (vec_duplicate:V4SI (match_dup 1)))]
+  "operands[0] = lowpart_subreg (V4SImode, operands[0],
+GET_MODE (operands[0]));"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx,x64_avx")
+   (set_attr "type" "mmxcvt,ssemov,ssemov,ssemov")
+   (set_attr "mode" "DI,TI,TI,TI")])
 
 (define_insn "*mmx_concatv2si"
   [(set (match_operand:V2SI 0 "register_operand" "=y,y")
-- 
2.20.1

[PATCH 09/41] i386: Emulate MMX 3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX 3 with SSE.  Only SSE register source
operand is allowed.

PR target/89021
* config/i386/mmx.md (any_logic:mmx_3): Also allow
TARGET_MMX_WITH_SSE.
(any_logic:3): New.
(any_logic:*mmx_3): Also allow TARGET_MMX_WITH_SSE.
Add SSE support.
---
 gcc/config/i386/mmx.md | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 6af05a1881e..33f6c2aa774 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1061,20 +1061,33 @@
 (define_expand "mmx_3"
   [(set (match_operand:MMXMODEI 0 "register_operand")
(any_logic:MMXMODEI
- (match_operand:MMXMODEI 1 "nonimmediate_operand")
- (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
-  "TARGET_MMX"
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_expand "3"
+  [(set (match_operand:MMXMODEI 0 "register_operand")
+   (any_logic:MMXMODEI
+ (match_operand:MMXMODEI 1 "register_operand")
+ (match_operand:MMXMODEI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*mmx_3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
 (any_logic:MMXMODEI
- (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0")
- (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)"
-  "p\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,Yv")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (, mode, operands)"
+  "@
+   p\t{%2, %0|%0, %2}
+   p\t{%2, %0|%0, %2}
+   vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sselog,sselog")
+   (set_attr "mode" "DI,TI,TI")])
 
 ;
 ;;
-- 
2.20.1

[PATCH 11/41] i386: Emulate MMX mmx_eq/mmx_gt3 with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX mmx_eq/mmx_gt3 with SSE.  Only SSE register source
operand is allowed.

PR target/89021
* config/i386/mmx.md (mmx_eq3): Also allow
TARGET_MMX_WITH_SSE.
(*mmx_eq3): Also allow TARGET_MMX_WITH_SSE.  Add SSE
support.
(mmx_gt3): Likewise.
---
 gcc/config/i386/mmx.md | 43 +-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index b3df46dd563..aeebb4f5741 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1017,30 +1017,39 @@
 (define_expand "mmx_eq3"
   [(set (match_operand:MMXMODEI 0 "register_operand")
 (eq:MMXMODEI
- (match_operand:MMXMODEI 1 "nonimmediate_operand")
- (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
-  "TARGET_MMX"
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (EQ, mode, operands);")
 
 (define_insn "*mmx_eq3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
 (eq:MMXMODEI
- (match_operand:MMXMODEI 1 "nonimmediate_operand" "%0")
- (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX && ix86_binary_operator_ok (EQ, mode, operands)"
-  "pcmpeq\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxcmp")
-   (set_attr "mode" "DI")])
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,Yv")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (EQ, mode, operands)"
+  "@
+   pcmpeq\t{%2, %0|%0, %2}
+   pcmpeq\t{%2, %0|%0, %2}
+   vpcmpeq\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxcmp,ssecmp,ssecmp")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_insn "mmx_gt3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
 (gt:MMXMODEI
- (match_operand:MMXMODEI 1 "register_operand" "0")
- (match_operand:MMXMODEI 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX"
-  "pcmpgt\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxcmp")
-   (set_attr "mode" "DI")])
+ (match_operand:MMXMODEI 1 "register_operand" "0,0,Yv")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "@
+   pcmpgt\t{%2, %0|%0, %2}
+   pcmpgt\t{%2, %0|%0, %2}
+   vpcmpgt\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxcmp,ssecmp,ssecmp")
+   (set_attr "mode" "DI,TI,TI")])
 
 ;
 ;;
-- 
2.20.1

[PATCH 00/41] V9: Emulate MMX intrinsics with SSE

2019-02-18 Thread H.J. Lu

On x86-64, since __m64 is returned and passed in XMM registers, we can
emulate MMX intrinsics with SSE instructions. To support it, we added

 #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)

;; Define instruction set of MMX instructions
(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
  (const_string "base"))

 (eq_attr "mmx_isa" "native")
   (symbol_ref "!TARGET_MMX_WITH_SSE")
 (eq_attr "mmx_isa" "x64")
   (symbol_ref "TARGET_MMX_WITH_SSE")
 (eq_attr "mmx_isa" "x64_avx")
   (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
 (eq_attr "mmx_isa" "x64_noavx")
   (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")

We added SSE emulation to MMX patterns and disabled MMX alternatives with
TARGET_MMX_WITH_SSE.
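
As a rough illustration of how the "mmx_isa" values above gate the
alternatives, here is a small C model.  It is only a sketch: the struct
and function names are hypothetical and are not part of the patch, and
"base" is assumed to impose no restriction.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags the patterns test.  */
struct target_flags
{
  bool is_64bit;   /* models TARGET_64BIT */
  bool has_sse2;   /* models TARGET_SSE2 */
  bool has_avx;    /* models TARGET_AVX */
};

/* Values of the "mmx_isa" attribute, in the order listed above.  */
enum mmx_isa
{
  MMX_ISA_BASE,
  MMX_ISA_NATIVE,
  MMX_ISA_X64,
  MMX_ISA_X64_NOAVX,
  MMX_ISA_X64_AVX
};

/* Models TARGET_MMX_WITH_SSE.  */
static bool
target_mmx_with_sse (const struct target_flags *t)
{
  return t->is_64bit && t->has_sse2;
}

/* Return true if an alternative tagged with ISA may be used.
   Assumption: "base" never disables an alternative.  */
static bool
mmx_isa_enabled (enum mmx_isa isa, const struct target_flags *t)
{
  bool emulate = target_mmx_with_sse (t);

  switch (isa)
    {
    case MMX_ISA_NATIVE:
      return !emulate;
    case MMX_ISA_X64:
      return emulate;
    case MMX_ISA_X64_AVX:
      return emulate && t->has_avx;
    case MMX_ISA_X64_NOAVX:
      return emulate && !t->has_avx;
    default:
      return true;
    }
}
```

With this model, a 64-bit SSE2+AVX target enables only the x64 and
x64_avx alternatives, while a 32-bit target keeps the native MMX one.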

Most MMX instructions have equivalent SSE versions, and the results of
some SSE versions need to be reshuffled to the right order for MMX.
There are a couple of tricky cases:

1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
mask operand and handle unmapped bits 64:127 at memory address by
adjusting source and mask operands together with memory address.

2. MMX movntq is emulated with SSE2 DImode movnti, which is available
in 64-bit mode.

3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
SSE emulation must clear the bit 4 in the shuffle control mask.

4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
the upper 64 bits of destination XMM register.
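
To make case 3 concrete, here is a scalar C model of both pshufb
variants and of the masking step.  It is a sketch, not the code in the
patch: the function names are made up, and it assumes that "bit 4" is
counted from one, i.e. the bit with value 0x08 in each control byte.

```c
#include <stdint.h>
#include <string.h>

/* Reference MMX pshufb: 8 lanes, 3-bit index, bit 7 zeroes the lane.  */
static void
mmx_pshufb (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t out[8];
  for (int i = 0; i < 8; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy (dst, out, 8);
}

/* Reference SSE pshufb: 16 lanes, 4-bit index, bit 7 zeroes the lane.  */
static void
sse_pshufb (uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t out[16];
  for (int i = 0; i < 16; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  memcpy (dst, out, 16);
}

/* Emulation: widen to 16 bytes, clear 0x08 in every control byte so
   that no lane can index the upper half, shuffle, keep the low 8
   bytes.  */
static void
emulated_mmx_pshufb (uint8_t dst[8], const uint8_t src[8],
		     const uint8_t ctl[8])
{
  uint8_t wide_src[16], wide_ctl[16], wide_dst[16];

  memset (wide_src + 8, 0xee, 8);	/* upper half holds junk */
  memset (wide_ctl + 8, 0, 8);
  memcpy (wide_src, src, 8);
  for (int i = 0; i < 8; i++)
    wide_ctl[i] = ctl[i] & 0xf7;

  sse_pshufb (wide_dst, wide_src, wide_ctl);
  memcpy (dst, wide_dst, 8);
}
```

Without the 0xf7 mask, a control byte such as 0x0f would select byte 15
of the XMM register instead of byte 7 of the 64-bit operand.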

Tests are also added to check each SSE emulation of MMX intrinsics.

There are no regressions on i686 and x86-64.  For x86-64, GCC is also
tested with

--with-arch=native --with-cpu=native

on AVX2 and AVX512F machines.

PS: We may be able to enable partial SSE emulation of MMX intrinsics in
32-bit mode later.

H.J. Lu (40):
  i386: Allow MMX register modes in SSE registers
  i386: Emulate MMX packsswb/packssdw/packuswb with SSE2
  i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX
  i386: Emulate MMX plusminus/sat_plusminus with SSE
  i386: Emulate MMX mulv4hi3 with SSE
  i386: Emulate MMX smulv4hi3_highpart with SSE
  i386: Emulate MMX mmx_pmaddwd with SSE
  i386: Emulate MMX ashr3/3 with SSE
  i386: Emulate MMX 3 with SSE
  i386: Emulate MMX mmx_andnot3 with SSE
  i386: Emulate MMX mmx_eq/mmx_gt3 with SSE
  i386: Emulate MMX vec_dupv2si with SSE
  i386: Emulate MMX pshufw with SSE
  i386: Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE
  i386: Emulate MMX sse_cvtpi2ps with SSE
  i386: Emulate MMX mmx_pextrw with SSE
  i386: Emulate MMX mmx_pinsrw with SSE
  i386: Emulate MMX V4HI smaxmin/V8QI umaxmin with SSE
  i386: Emulate MMX mmx_pmovmskb with SSE
  i386: Emulate MMX mmx_umulv4hi3_highpart with SSE
  i386: Emulate MMX maskmovq with SSE2 maskmovdqu
  i386: Emulate MMX mmx_uavgv8qi3 with SSE
  i386: Emulate MMX mmx_uavgv4hi3 with SSE
  i386: Emulate MMX mmx_psadbw with SSE
  i386: Emulate MMX movntq with SSE2 movntidi
  i386: Emulate MMX umulv1siv1di3 with SSE2
  i386: Make _mm_empty () as NOP without MMX
  i386: Emulate MMX ssse3_phwv4hi3 with SSE
  i386: Emulate MMX ssse3_phdv2si3 with SSE
  i386: Emulate MMX ssse3_pmaddubsw with SSE
  i386: Emulate MMX ssse3_pmulhrswv4hi3 with SSE
  i386: Emulate MMX pshufb with SSE version
  i386: Emulate MMX ssse3_psign3 with SSE
  i386: Emulate MMX ssse3_palignrdi with SSE
  i386: Emulate MMX abs2 with SSE
  i386: Allow MMXMODE moves with TARGET_MMX_WITH_SSE
  i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE
  i386: Allow MMX intrinsic emulation with SSE
  i386: Enable TM MMX intrinsics with SSE2
  i386: Add tests for MMX intrinsic emulations with SSE

Uros Bizjak (1):
  Prevent allocation of MMX registers with TARGET_MMX_WITH_SSE

 gcc/config/i386/constraints.md|6 +
 gcc/config/i386/i386-builtin.def  |  126 +-
 gcc/config/i386/i386-c.c  |2 +
 gcc/config/i386/i386-protos.h |4 +
 gcc/config/i386/i386.c|  181 ++-
 gcc/config/i386/i386.h|2 +
 gcc/config/i386/i386.md   |   17 +
 gcc/config/i386/mmintrin.h|   12 +-
 gcc/config/i386/mmx.md| 1028 +++--
 gcc/config/i386/predicates.md |7 +
 gcc/config/i386/sse.md|  368 --
 gcc/config/i386/xmmintrin.h   |   61 +
 gcc/testsuite/gcc.target/i386/mmx-vals.h  |   77 ++
 gcc/testsuite/gcc.target/i386/pr82483-1.c |2 +-
 gcc/testsuite/gcc.target/i386/pr82483-2.c |2 +-
 gcc/testsuite/gcc.target/i386/sse2-mmx-10.c   |   43 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-11.c   |   39 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-12.c   |   42 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-13.c   |   40 +
 gcc/testsuite/gcc.target/i386/sse2-mmx-14.c   |   31 +

[PATCH 07/41] i386: Emulate MMX mmx_pmaddwd with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX pmaddwd with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (mmx_pmaddwd): Also allow TARGET_MMX_WITH_SSE.
(*mmx_pmaddwd): Also allow TARGET_MMX_WITH_SSE.  Add SSE support.
---
 gcc/config/i386/mmx.md | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 01c80602b5b..fe746a487d1 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -810,11 +810,11 @@
  (mult:V2SI
(sign_extend:V2SI
  (vec_select:V2HI
-   (match_operand:V4HI 1 "nonimmediate_operand")
+   (match_operand:V4HI 1 "register_mmxmem_operand")
(parallel [(const_int 0) (const_int 2)])))
(sign_extend:V2SI
  (vec_select:V2HI
-   (match_operand:V4HI 2 "nonimmediate_operand")
+   (match_operand:V4HI 2 "register_mmxmem_operand")
(parallel [(const_int 0) (const_int 2)]
  (mult:V2SI
(sign_extend:V2SI
@@ -823,20 +823,20 @@
(sign_extend:V2SI
  (vec_select:V2HI (match_dup 2)
(parallel [(const_int 1) (const_int 3)]))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_pmaddwd"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
+  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
 (plus:V2SI
  (mult:V2SI
(sign_extend:V2SI
  (vec_select:V2HI
-   (match_operand:V4HI 1 "nonimmediate_operand" "%0")
+   (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv")
(parallel [(const_int 0) (const_int 2)])))
(sign_extend:V2SI
  (vec_select:V2HI
-   (match_operand:V4HI 2 "nonimmediate_operand" "ym")
+   (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")
(parallel [(const_int 0) (const_int 2)]
  (mult:V2SI
(sign_extend:V2SI
@@ -845,10 +845,15 @@
(sign_extend:V2SI
  (vec_select:V2HI (match_dup 2)
(parallel [(const_int 1) (const_int 3)]))]
-  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmaddwd\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
+  "@
+   pmaddwd\t{%2, %0|%0, %2}
+   pmaddwd\t{%2, %0|%0, %2}
+   vpmaddwd\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,sseiadd,sseiadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_pmulhrwv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 03/41] i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX

2019-02-18 Thread H.J. Lu

Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX.  For MMX punpckhXX,
move bits 64:127 to bits 0:63 in SSE register.  Only SSE register source
operand is allowed.
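
The idea can be checked with a scalar C model of the byte variant.  This
is a sketch under my own naming (none of these functions exist in GCC);
it only models the data movement, not the RTL that ix86_split_mmx_punpck
generates.

```c
#include <stdint.h>
#include <string.h>

/* Reference MMX punpcklbw (HIGH_P == 0) or punpckhbw (HIGH_P != 0).  */
static void
mmx_punpck_bw (uint8_t dst[8], const uint8_t a[8], const uint8_t b[8],
	       int high_p)
{
  uint8_t out[8];
  int base = high_p ? 4 : 0;

  for (int i = 0; i < 4; i++)
    {
      out[2 * i] = a[base + i];
      out[2 * i + 1] = b[base + i];
    }
  memcpy (dst, out, 8);
}

/* Reference SSE punpcklbw: interleave the low 8 bytes of A and B.  */
static void
sse_punpcklbw (uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
  uint8_t out[16];

  for (int i = 0; i < 8; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
  memcpy (dst, out, 16);
}

/* Emulation: one SSE punpcklbw yields the MMX low result in bits 0:63
   and the MMX high result in bits 64:127.  */
static void
emulated_mmx_punpck_bw (uint8_t dst[8], const uint8_t a[8],
			const uint8_t b[8], int high_p)
{
  uint8_t wide_a[16] = { 0 }, wide_b[16] = { 0 }, wide_dst[16];

  memcpy (wide_a, a, 8);
  memcpy (wide_b, b, 8);
  sse_punpcklbw (wide_dst, wide_a, wide_b);

  if (high_p)
    memmove (wide_dst, wide_dst + 8, 8);  /* bits 64:127 -> bits 0:63 */
  memcpy (dst, wide_dst, 8);
}
```

The same reasoning applies to the word and doubleword variants with 4
and 2 elements per 64-bit operand.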

PR target/89021
* config/i386/i386-protos.h (ix86_split_mmx_punpck): New
prototype.
* config/i386/i386.c (ix86_split_mmx_punpck): New function.
* config/i386/mmx.md (mmx_punpckhbw): Changed to
define_insn_and_split to support SSE emulation.
(mmx_punpcklbw): Likewise.
(mmx_punpckhwd): Likewise.
(mmx_punpcklwd): Likewise.
(mmx_punpckhdq): Likewise.
(mmx_punpckldq): Likewise.
---
 gcc/config/i386/i386-protos.h |   1 +
 gcc/config/i386/i386.c|  77 +++
 gcc/config/i386/mmx.md| 138 ++
 3 files changed, 168 insertions(+), 48 deletions(-)

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index a53b48438ec..37581837a32 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -204,6 +204,7 @@ extern rtx ix86_split_stack_guard (void);
 
 extern void ix86_move_vector_high_sse_to_mmx (rtx);
 extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
+extern void ix86_split_mmx_punpck (rtx[], bool);
 
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 563bc9aec69..3db41555462 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -20275,6 +20275,83 @@ ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
   ix86_move_vector_high_sse_to_mmx (op0);
 }
 
+/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  */
+
+void
+ix86_split_mmx_punpck (rtx operands[], bool high_p)
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
+  machine_mode mode = GET_MODE (op0);
+  rtx mask;
+  /* The corresponding SSE mode.  */
+  machine_mode sse_mode, double_sse_mode;
+
+  switch (mode)
+{
+case E_V8QImode:
+  sse_mode = V16QImode;
+  double_sse_mode = V32QImode;
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (16,
+ GEN_INT (0), GEN_INT (16),
+ GEN_INT (1), GEN_INT (17),
+ GEN_INT (2), GEN_INT (18),
+ GEN_INT (3), GEN_INT (19),
+ GEN_INT (4), GEN_INT (20),
+ GEN_INT (5), GEN_INT (21),
+ GEN_INT (6), GEN_INT (22),
+ GEN_INT (7), GEN_INT (23)));
+  break;
+
+case E_V4HImode:
+  sse_mode = V8HImode;
+  double_sse_mode = V16HImode;
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (8,
+ GEN_INT (0), GEN_INT (8),
+ GEN_INT (1), GEN_INT (9),
+ GEN_INT (2), GEN_INT (10),
+ GEN_INT (3), GEN_INT (11)));
+  break;
+
+case E_V2SImode:
+  sse_mode = V4SImode;
+  double_sse_mode = V8SImode;
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4,
+ GEN_INT (0), GEN_INT (4),
+ GEN_INT (1), GEN_INT (5)));
+  break;
+
+default:
+  gcc_unreachable ();
+}
+
+  /* Generate SSE punpcklXX.  */
+  rtx dest = lowpart_subreg (sse_mode, op0, GET_MODE (op0));
+  op1 = lowpart_subreg (sse_mode, op1, GET_MODE (op1));
+  op2 = lowpart_subreg (sse_mode, op2, GET_MODE (op2));
+
+  op1 = gen_rtx_VEC_CONCAT (double_sse_mode, op1, op2);
+  op2 = gen_rtx_VEC_SELECT (sse_mode, op1, mask);
+  rtx insn = gen_rtx_SET (dest, op2);
+  emit_insn (insn);
+
+  if (high_p)
+{
+  /* Move bits 64:127 to bits 0:63.  */
+  mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
+ GEN_INT (0), GEN_INT (0)));
+  dest = lowpart_subreg (V4SImode, dest, GET_MODE (dest));
+  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
+  insn = gen_rtx_SET (dest, op1);
+  emit_insn (insn);
+}
+}
+
 /* Helper function of ix86_fixup_binary_operands to canonicalize
operand order.  Returns true if the operands should be swapped.  */
 
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 63a390923b6..0aa793395fb 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1064,87 +1064,129 @@
(set_attr "type" "mmxshft,sselog,sselog")
(set_attr "mode" "DI,TI,TI")])
 
-(define_insn "mmx_punpckhbw"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
+(define_insn_and_split "mmx_punpckhbw"
+  [(set (match_operand:V8QI 0

[PATCH 06/41] i386: Emulate MMX smulv4hi3_highpart with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX smulv4hi3_highpart with SSE.  Only SSE register source operand is
allowed.

PR target/89021
* config/i386/mmx.md (mmx_smulv4hi3_highpart): Also allow
TARGET_MMX_WITH_SSE.
(*mmx_smulv4hi3_highpart): Also allow TARGET_MMX_WITH_SSE. Add
SSE support.
---
 gcc/config/i386/mmx.md | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index fd0189eae60..01c80602b5b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -748,27 +748,32 @@
  (lshiftrt:V4SI
(mult:V4SI
  (sign_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand"))
+   (match_operand:V4HI 1 "register_mmxmem_operand"))
  (sign_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand")))
+   (match_operand:V4HI 2 "register_mmxmem_operand")))
(const_int 16]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
 
 (define_insn "*mmx_smulv4hi3_highpart"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
(truncate:V4HI
  (lshiftrt:V4SI
(mult:V4SI
  (sign_extend:V4SI
-   (match_operand:V4HI 1 "nonimmediate_operand" "%0"))
+   (match_operand:V4HI 1 "register_mmxmem_operand" "%0,0,Yv"))
  (sign_extend:V4SI
-   (match_operand:V4HI 2 "nonimmediate_operand" "ym")))
+   (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")))
(const_int 16]
-  "TARGET_MMX && ix86_binary_operator_ok (MULT, V4HImode, operands)"
-  "pmulhw\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxmul")
-   (set_attr "mode" "DI")])
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (MULT, V4HImode, operands)"
+  "@
+   pmulhw\t{%2, %0|%0, %2}
+   pmulhw\t{%2, %0|%0, %2}
+   vpmulhw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxmul,ssemul,ssemul")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_umulv4hi3_highpart"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 02/41] i386: Emulate MMX packsswb/packssdw/packuswb with SSE2

2019-02-18 Thread H.J. Lu

Emulate MMX packsswb/packssdw/packuswb with SSE packsswb/packssdw/packuswb
plus moving bits 64:95 to bits 32:63 in SSE register.  Only SSE register
source operand is allowed.
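
Why the extra move is needed can be seen in a scalar C model of
packsswb.  This is only a sketch with hypothetical function names; it
models the data layout, not the RTL emitted by ix86_split_mmx_pack.

```c
#include <stdint.h>
#include <string.h>

/* Saturate a signed 16-bit value to the signed 8-bit range.  */
static int8_t
sat_s8 (int16_t v)
{
  return v > 127 ? 127 : v < -128 ? -128 : (int8_t) v;
}

/* Reference MMX packsswb: 4 words from A, then 4 words from B.  */
static void
mmx_packsswb (int8_t dst[8], const int16_t a[4], const int16_t b[4])
{
  for (int i = 0; i < 4; i++)
    {
      dst[i] = sat_s8 (a[i]);
      dst[4 + i] = sat_s8 (b[i]);
    }
}

/* Reference SSE2 packsswb: 8 words from A, then 8 words from B.  */
static void
sse_packsswb (int8_t dst[16], const int16_t a[8], const int16_t b[8])
{
  for (int i = 0; i < 8; i++)
    {
      dst[i] = sat_s8 (a[i]);
      dst[8 + i] = sat_s8 (b[i]);
    }
}

/* Emulation: after the SSE2 pack, the bytes that come from B sit in
   bits 64:95, so they are moved down to bits 32:63.  */
static void
emulated_mmx_packsswb (int8_t dst[8], const int16_t a[4],
		       const int16_t b[4])
{
  int16_t wide_a[8] = { 0 }, wide_b[8] = { 0 };
  int8_t wide_dst[16];

  memcpy (wide_a, a, 4 * sizeof (int16_t));
  memcpy (wide_b, b, 4 * sizeof (int16_t));
  sse_packsswb (wide_dst, wide_a, wide_b);

  memcpy (wide_dst + 4, wide_dst + 8, 4);  /* bits 64:95 -> bits 32:63 */
  memcpy (dst, wide_dst, 8);
}
```

The unsigned-saturation and word-to-doubleword variants differ only in
the saturation function and element size.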

2019-02-08  H.J. Lu  
Uros Bizjak  

PR target/89021
* config/i386/i386-protos.h (ix86_move_vector_high_sse_to_mmx):
New prototype.
(ix86_split_mmx_pack): Likewise.
* config/i386/i386.c (ix86_move_vector_high_sse_to_mmx): New
function.
(ix86_split_mmx_pack): Likewise.
* config/i386/i386.md (mmx_isa): New.
(enabled): Also check mmx_isa.
* config/i386/mmx.md (any_s_truncate): New code iterator.
(s_trunsuffix): New code attr.
(mmx_packsswb): Removed.
(mmx_packssdw): Likewise.
(mmx_packuswb): Likewise.
(mmx_packswb): New define_insn_and_split to emulate
MMX packsswb/packuswb with SSE2.
(mmx_packssdw): Likewise.
* config/i386/predicates.md (register_mmxmem_operand): New.
---
 gcc/config/i386/i386-protos.h |  3 ++
 gcc/config/i386/i386.c| 54 
 gcc/config/i386/i386.md   | 13 +++
 gcc/config/i386/mmx.md| 67 +++
 gcc/config/i386/predicates.md |  7 
 5 files changed, 114 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 27f5cc13abf..a53b48438ec 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -202,6 +202,9 @@ extern void ix86_expand_vecop_qihi (enum rtx_code, rtx, rtx, rtx);
 
 extern rtx ix86_split_stack_guard (void);
 
+extern void ix86_move_vector_high_sse_to_mmx (rtx);
+extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
+
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
 #endif /* TREE_CODE  */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index eb642165264..563bc9aec69 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -20221,6 +20221,60 @@ ix86_expand_vector_move_misalign (machine_mode mode, rtx operands[])
 gcc_unreachable ();
 }
 
+/* Move bits 64:95 to bits 32:63.  */
+
+void
+ix86_move_vector_high_sse_to_mmx (rtx op)
+{
+  rtx mask = gen_rtx_PARALLEL (VOIDmode,
+  gen_rtvec (4, GEN_INT (0), GEN_INT (2),
+ GEN_INT (0), GEN_INT (0)));
+  rtx dest = lowpart_subreg (V4SImode, op, GET_MODE (op));
+  op = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
+  rtx insn = gen_rtx_SET (dest, op);
+  emit_insn (insn);
+}
+
+/* Split MMX pack with signed/unsigned saturation with SSE/SSE2.  */
+
+void
+ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
+
+  machine_mode dmode = GET_MODE (op0);
+  machine_mode smode = GET_MODE (op1);
+  machine_mode inner_dmode = GET_MODE_INNER (dmode);
+  machine_mode inner_smode = GET_MODE_INNER (smode);
+
+  /* Get the corresponding SSE mode for destination.  */
+  int nunits = 16 / GET_MODE_SIZE (inner_dmode);
+  machine_mode sse_dmode = mode_for_vector (GET_MODE_INNER (dmode),
+   nunits).require ();
+  machine_mode sse_half_dmode = mode_for_vector (GET_MODE_INNER (dmode),
+nunits / 2).require ();
+
+  /* Get the corresponding SSE mode for source.  */
+  nunits = 16 / GET_MODE_SIZE (inner_smode);
+  machine_mode sse_smode = mode_for_vector (GET_MODE_INNER (smode),
+   nunits).require ();
+
+  /* Generate SSE pack with signed/unsigned saturation.  */
+  rtx dest = lowpart_subreg (sse_dmode, op0, GET_MODE (op0));
+  op1 = lowpart_subreg (sse_smode, op1, GET_MODE (op1));
+  op2 = lowpart_subreg (sse_smode, op2, GET_MODE (op2));
+
+  op1 = gen_rtx_fmt_e (code, sse_half_dmode, op1);
+  op2 = gen_rtx_fmt_e (code, sse_half_dmode, op2);
+  rtx insn = gen_rtx_SET (dest, gen_rtx_VEC_CONCAT (sse_dmode,
+   op1, op2));
+  emit_insn (insn);
+
+  ix86_move_vector_high_sse_to_mmx (op0);
+}
+
 /* Helper function of ix86_fixup_binary_operands to canonicalize
operand order.  Returns true if the operands should be swapped.  */
 
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 240384917df..04ec0eeaa57 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -792,6 +792,10 @@
avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
   (const_string "base"))
 
+;; Define instruction set of MMX instructions
+(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
+  (const_string "base"))
+
 (define_attr "enabled" ""
   (cond [(eq_attr "isa" "x64") (symbol_ref "TARGET_64BIT")
 (eq_attr "isa" "x64_sse2")
@@ -830,6 +834,15 @@
 (eq_attr "isa" "noavx512dq") (symbol_ref "!TARGET_AVX512DQ")
 (eq_attr "isa" "avx512vl")

[PATCH 04/41] i386: Emulate MMX plusminus/sat_plusminus with SSE

2019-02-18 Thread H.J. Lu

Emulate MMX plusminus/sat_plusminus with SSE.  Only SSE register source
operand is allowed.

PR target/89021
* config/i386/mmx.md (MMXMODEI8): Require TARGET_SSE2 for V1DI.
(plusminus:mmx_3): Check
TARGET_MMX_WITH_SSE.
(sat_plusminus:mmx_3): Likewise.
(3): New.
(*mmx_3): Add SSE emulation.
(*mmx_3): Likewise.
---
 gcc/config/i386/mmx.md | 59 +++---
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 0aa793395fb..587e31b299e 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -45,7 +45,7 @@
 
 ;; 8 byte integral modes handled by MMX (and by extension, SSE)
 (define_mode_iterator MMXMODEI [V8QI V4HI V2SI])
-(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI V1DI])
+(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")])
 
 ;; All 8-byte vector modes handled by MMX
 (define_mode_iterator MMXMODE [V8QI V4HI V2SI V1DI V2SF])
@@ -663,39 +663,56 @@
 (define_expand "mmx_3"
   [(set (match_operand:MMXMODEI8 0 "register_operand")
(plusminus:MMXMODEI8
- (match_operand:MMXMODEI8 1 "nonimmediate_operand")
- (match_operand:MMXMODEI8 2 "nonimmediate_operand")))]
-  "TARGET_MMX || (TARGET_SSE2 && mode == V1DImode)"
+ (match_operand:MMXMODEI8 1 "register_mmxmem_operand")
+ (match_operand:MMXMODEI8 2 "register_mmxmem_operand")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_expand "3"
+  [(set (match_operand:MMXMODEI 0 "register_operand")
+   (plusminus:MMXMODEI
+ (match_operand:MMXMODEI 1 "register_operand")
+ (match_operand:MMXMODEI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*mmx_3"
-  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,x,Yv")
 (plusminus:MMXMODEI8
- (match_operand:MMXMODEI8 1 "nonimmediate_operand" "0")
- (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym")))]
-  "(TARGET_MMX || (TARGET_SSE2 && mode == V1DImode))
+ (match_operand:MMXMODEI8 1 "register_mmxmem_operand" "0,0,Yv")
+ (match_operand:MMXMODEI8 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
&& ix86_binary_operator_ok (, mode, operands)"
-  "p\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+  "@
+   p\t{%2, %0|%0, %2}
+   p\t{%2, %0|%0, %2}
+   vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseadd,sseadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_3"
   [(set (match_operand:MMXMODE12 0 "register_operand")
(sat_plusminus:MMXMODE12
- (match_operand:MMXMODE12 1 "nonimmediate_operand")
- (match_operand:MMXMODE12 2 "nonimmediate_operand")))]
-  "TARGET_MMX"
+ (match_operand:MMXMODE12 1 "register_mmxmem_operand")
+ (match_operand:MMXMODE12 2 "register_mmxmem_operand")))]
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*mmx_3"
-  [(set (match_operand:MMXMODE12 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODE12 0 "register_operand" "=y,x,Yv")
 (sat_plusminus:MMXMODE12
- (match_operand:MMXMODE12 1 "nonimmediate_operand" "0")
- (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX && ix86_binary_operator_ok (, mode, operands)"
-  "p\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+ (match_operand:MMXMODE12 1 "register_mmxmem_operand" "0,0,Yv")
+ (match_operand:MMXMODE12 2 "register_mmxmem_operand" "ym,x,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (, mode, operands)"
+  "@
+   p\t{%2, %0|%0, %2}
+   p\t{%2, %0|%0, %2}
+   vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseadd,sseadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_mulv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
-- 
2.20.1

[PATCH 01/41] i386: Allow MMX register modes in SSE registers

2019-02-18 Thread H.J. Lu

In 64-bit mode, SSE2 can be used to emulate MMX instructions without
3DNOW.  We can use SSE2 to support MMX register modes.
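
User code could test the new predefined macro as sketched below.  The
helper name is made up for illustration; only __MMX_WITH_SSE__ comes
from this patch.

```c
/* Return 1 if the compiler reports that MMX register modes are
   handled with SSE2 (the macro is defined for TARGET_MMX_WITH_SSE),
   0 otherwise.  The function name is hypothetical.  */
static int
mmx_emulated_with_sse (void)
{
#ifdef __MMX_WITH_SSE__
  return 1;
#else
  return 0;
#endif
}
```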

PR target/89021
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__MMX_WITH_SSE__ for TARGET_MMX_WITH_SSE.
* config/i386/i386.c (ix86_set_reg_reg_cost): Add support for
TARGET_MMX_WITH_SSE with VALID_MMX_REG_MODE.
(ix86_vector_mode_supported_p): Likewise.
* config/i386/i386.h (TARGET_MMX_WITH_SSE): New.
---
 gcc/config/i386/i386-c.c | 2 ++
 gcc/config/i386/i386.c   | 5 +++--
 gcc/config/i386/i386.h   | 2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 5e7e46fcebe..213e1b56c6b 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -548,6 +548,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 def_or_undef (parse_in, "__CLDEMOTE__");
   if (isa_flag2 & OPTION_MASK_ISA_PTWRITE)
 def_or_undef (parse_in, "__PTWRITE__");
+  if (TARGET_MMX_WITH_SSE)
+def_or_undef (parse_in, "__MMX_WITH_SSE__");
   if (TARGET_IAMCU)
 {
   def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0df792a41d1..eb642165264 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40503,7 +40503,8 @@ ix86_set_reg_reg_cost (machine_mode mode)
  || (TARGET_AVX && VALID_AVX256_REG_MODE (mode))
  || (TARGET_SSE2 && VALID_SSE2_REG_MODE (mode))
  || (TARGET_SSE && VALID_SSE_REG_MODE (mode))
- || (TARGET_MMX && VALID_MMX_REG_MODE (mode)))
+ || ((TARGET_MMX || TARGET_MMX_WITH_SSE)
+ && VALID_MMX_REG_MODE (mode)))
units = GET_MODE_SIZE (mode);
 }
 
@@ -44329,7 +44330,7 @@ ix86_vector_mode_supported_p (machine_mode mode)
 return true;
   if (TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode))
 return true;
-  if (TARGET_MMX && VALID_MMX_REG_MODE (mode))
+  if ((TARGET_MMX || TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode))
 return true;
   if (TARGET_3DNOW && VALID_MMX_REG_MODE_3DNOW (mode))
 return true;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 4fd8bc40a34..91b233022c2 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -201,6 +201,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_16BIT   TARGET_CODE16
 #define TARGET_16BIT_P(x)  TARGET_CODE16_P(x)
 
+#define TARGET_MMX_WITH_SSE	(TARGET_64BIT && TARGET_SSE2)
+
 #include "config/vxworks-dummy.h"
 
 #include "config/i386/i386-opts.h"
-- 
2.20.1
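
For readers skimming the hunks above, the effect of the new condition can be sketched with a toy model. This is only an illustration, not GCC code: the struct toy_target and the toy_* functions are made-up names standing in for the TARGET_* macros and for the VALID_MMX_REG_MODE check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target flags the real macros consult.  */
struct toy_target
{
  bool is_64bit;   /* models TARGET_64BIT */
  bool has_sse2;   /* models TARGET_SSE2 */
  bool has_mmx;    /* models TARGET_MMX */
};

/* Models: #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)  */
bool
toy_mmx_with_sse (const struct toy_target *t)
{
  return t->is_64bit && t->has_sse2;
}

/* Models the old test: TARGET_MMX && VALID_MMX_REG_MODE (mode).  */
bool
toy_mmx_mode_ok_before (const struct toy_target *t, bool valid_mmx_mode)
{
  return t->has_mmx && valid_mmx_mode;
}

/* Models the new test:
   (TARGET_MMX || TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode).  */
bool
toy_mmx_mode_ok_after (const struct toy_target *t, bool valid_mmx_mode)
{
  return (t->has_mmx || toy_mmx_with_sse (t)) && valid_mmx_mode;
}
```

Under this model, a 64-bit SSE2 target with MMX turned off is rejected by the old test and accepted by the new one, while a 32-bit target without MMX is rejected by both.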

Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER

2019-02-18 Thread Thomas Koenig


I have now committed the patch as r268992.  Janne and Richard, thanks
for the review and the comments.

Am 18.02.19 um 13:50 schrieb Richard Biener:

On Sun, Feb 17, 2019 at 7:19 PM Thomas Koenig  wrote:

Regression tests turned up a few ICEs (now fixed), plus two
very invalid test cases, which I think are not worth saving.


They were added to verify we don't ICE for such invalid testcases.
How do they fail now?


They failed with an LTO type mismatch. Instead of deleting them,
I have now added -Wno-lto-type-mismatch to the options in the
committed version.


I wonder how the frontend handles
the 2nd call to doesntwork_p8 for

   program main
 implicit none
 character :: c
 character(len=20) :: res, doesntwork_p8
 external doesntwork_p8
 c = 'o'
 res = doesntwork_p8(c,1,2,3,4,5,6)
 res = doesntwork_p8(1,2)
 if (res /= 'foo') stop 3
   end program main


This is invalid Fortran.  I think we should be able to diagnose this,
but the committed version does not check it.

Regards

Thomas

Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)

2019-02-18 Thread Aaron Sawdey

On 2/18/19 10:41 AM, Alexander Monakov wrote:
> On Mon, 18 Feb 2019, Aaron Sawdey wrote:
> 
>> The code in emit_case_dispatch_table() will very clearly always emit
>> branch/label/jumptable_data/barrier so this does need to be handled. So,
>> yes tablejump always looks like this, and also yes it seems to be ripe
>> ground for logic bugs, we have 88308, 88347, 88423 all related to it.
>>
>> In the long term it might be nice to use a general mechanism
>> (SCHED_GROUP_P?) for handling the label and jump table data that follow
>> a case branch using jump table.
>>
>> But for now in stage 4, I think the right way to fix this is with the
>> patch that Segher posted earlier.
>> If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?
> 
> How is making an assert more permissive "the right way" here?
> As already mentioned, without the assert we'd move a USE of the register with
> function return value to an unreachable block, which would be incorrect.
> 
> Do you anticipate issues with the sched-deps patch?

Alexander,
 I see you are allowing it to see the barrier as if it were right after the
tablejump.

Are you saying that the motion of the tablejump is happening because the
scheduler does not see the barrier (because it does not follow immediately
after) and thus decides it can move instructions to the other side of the
tablejump? I agree that is incorrect and is asking for other hidden problems.

It would be nice if the tablejump, jump table label, jump table data, and
barrier were all one indivisible unit somehow.

In the meantime, can someone approve Alexander's patch?

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Re: [C++ Patch] PR 84536 ("[7/8/9 Regression] ICE with non-type template parameter")

2019-02-18 Thread Jason Merrill


On 2/18/19 5:31 AM, Paolo Carlini wrote:

Hi Jason,

On 18/02/19 10:20, Jason Merrill wrote:

On 2/17/19 6:58 AM, Paolo Carlini wrote:

Hi,

here, when we don't see an initializer we believe we are surely 
dealing with a case of C++17 template argument deduction for class 
templates, but, in fact, it's just an ill-formed C++14 template 
variable specialization. Conveniently, we can use here too the 
predicate variable_template_specialization_p. Not 100% sure about the 
exact wording of the error message, I added '#' to %qD to explicitly 
print the auto-using type too.


I guess we should change the assert to a test, so that we give the 
error if we aren't dealing with a class template placeholder. Variable 
templates don't seem to be important to test for.

Thanks, simpler patch.
This error is also pretty poor for this testcase, where there is an 
initializer.


Well, implementation-wise, certainly init == NULL_TREE and only when we 
have an empty pack this specific issue occurs.


In practice, clang simply talks about an empty initializer (during 
instantiation, etc, like we do), whereas EDG explicitly says that pack 
expansion produces an empty list of expressions. I don't think that in 
cp_finish_decl it would be easy for us to do exactly the same, we simply 
see a NULL_TREE as second argument. Or we could just *assume* that we 
are dealing with the outcome of a pack expansion, say something like EDG 
even if we don't have details beyond the fact that init == NULL_TREE. I 
believe that without a variadic template the problem cannot occur, 
because we catch the empty initializer much earlier, in grokdeclarator - 
indeed using a !CLASS_PLACEHOLDER_TEMPLATE (auto_node) check. What do 
you think? Again "instantiated for an empty pack" or something similar?


Perhaps we could complain in the code for empty pack expansion handling 
in tsubst_init?


Jason

Re: [patch, fortran] Fix PR 87689, wrong decls / ABI violation on POWER

2019-02-18 Thread Segher Boessenkool

On Mon, Feb 18, 2019 at 10:48:35AM +0200, Janne Blomqvist wrote:
> I wonder if we shouldn't exorcise all the varargs stuff, it seems to
> cause more problems than benefits? But not in stage4 if we can avoid
> it..

On the Power ABIs at least, unprototyped functions (a K&R thing for C) are
handled the same as varargs (with zero fixed arguments).  How does this
tie in with Fortran requirements?

Segher

PING [PATCH] correct __clear_cache signature

2019-02-18 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00361.html

On 2/6/19 5:28 PM, Martin Sebor wrote:

Recent libgcc builds have been triggering -Wbuiltin-declaration-mismatch
due to the declaration of the __clear_cache built-in being incompatible
with how GCC declares it internally.  The attached patch adjusts
the libgcc declaration and the one in the manual to match what GCC
expects.

Tested on x86_64-linux.

Martin

Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)

2019-02-18 Thread Alexander Monakov

On Mon, 18 Feb 2019, Aaron Sawdey wrote:

> The code in emit_case_dispatch_table() will very clearly always emit
> branch/label/jumptable_data/barrier so this does need to be handled. So,
> yes tablejump always looks like this, and also yes it seems to be ripe
> ground for logic bugs, we have 88308, 88347, 88423 all related to it.
>
> In the long term it might be nice to use a general mechanism
> (SCHED_GROUP_P?) for handling the label and jump table data that follow
> a case branch using jump table.
>
> But for now in stage 4, I think the right way to fix this is with the
> patch that Segher posted earlier.
> If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?

How is making an assert more permissive "the right way" here?
As already mentioned, without the assert we'd move a USE of the register with
function return value to an unreachable block, which would be incorrect.

Do you anticipate issues with the sched-deps patch?

Alexander

Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)

2019-02-18 Thread Aaron Sawdey

The code in emit_case_dispatch_table() will very clearly always emit
branch/label/jumptable_data/barrier so this does need to be handled. So, yes
tablejump always looks like this, and also yes it seems to be ripe ground for
logic bugs, we have 88308, 88347, 88423 all related to it.

In the long term it might be nice to use a general mechanism (SCHED_GROUP_P?)
for handling the label and jump table data that follow a case branch using
jump table.

But for now in stage 4, I think the right way to fix this is with the patch
that Segher posted earlier.
If regtest passes (x86_64 and ppc64/ppc32), ok for trunk?

2019-02-18  Aaron Sawdey  

PR rtl-optimization/88347
* sched-ebb.c (begin_move_insn): Apply Segher's patch to handle
a jump table before the barrier.


On 1/24/19 9:43 AM, Alexander Monakov wrote:
> On Wed, 23 Jan 2019, Alexander Monakov wrote:
> 
>> It appears that sched-deps tries to take notice of a barrier after a jump,
>> but similarly to sched-ebb doesn't anticipate that for a tablejump the
>> barrier will appear after two more insns (a code_label and a
>> jump_table_data).
>>
>> If so, it needs a fixup just like the posted change for the assert. I'll
>> fire up a bootstrap/regtest.
> 
> Updated patch below (now taking into account that NEXT_INSN may give NULL)
> passes bootstrap/regtest on x86_64, also with -fsched2-use-superblocks.
> 
> I'm surprised to learn that a tablejump may be not the final insn in its
> containing basic block.  It certainly seems like a ripe ground for logic
> bugs like this one.  Is it really intentional?
> 
> OK for trunk?
> 
> Thanks.
> Alexander
> 
>   PR rtl-optimization/88347
>   PR rtl-optimization/88423
>   * sched-deps.c (sched_analyze_insn): Take into account that for
>   tablejumps the barrier appears after a label and a jump_table_data.
> 
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -3005,6 +3005,11 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx_insn *insn)
>if (JUMP_P (insn))
>  {
>rtx_insn *next = next_nonnote_nondebug_insn (insn);
> +  /* ??? For tablejumps, the barrier may appear not immediately after
> + the jump, but after a label and a jump_table_data insn.  */
> +  if (next && LABEL_P (next) && NEXT_INSN (next)
> +   && JUMP_TABLE_DATA_P (NEXT_INSN (next)))
> + next = NEXT_INSN (NEXT_INSN (next));
>if (next && BARRIER_P (next))
>   reg_pending_barrier = MOVE_BARRIER;
>else
> 

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
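
The lookahead in the quoted sched-deps.c hunk can be sketched outside of GCC as follows. This is a toy model only: struct toy_insn, enum toy_kind and jump_ends_at_barrier are hypothetical names, and the model has no notes or debug insns, so a plain next pointer stands in for both next_nonnote_nondebug_insn and NEXT_INSN.

```c
#include <stddef.h>

/* Hypothetical insn kinds; the real code uses JUMP_P, LABEL_P,
   JUMP_TABLE_DATA_P and BARRIER_P on rtx_insn objects.  */
enum toy_kind
{
  TOY_PLAIN,
  TOY_JUMP,
  TOY_LABEL,
  TOY_JUMP_TABLE_DATA,
  TOY_BARRIER
};

struct toy_insn
{
  enum toy_kind kind;
  struct toy_insn *next;   /* NULL at the end of the chain */
};

/* Return nonzero if the jump is followed by a barrier, either directly
   or after a label plus jump table data, as in the tablejump layout
   branch/label/jumptable_data/barrier described in this thread.  */
int
jump_ends_at_barrier (const struct toy_insn *jump)
{
  const struct toy_insn *next = jump->next;

  /* Skip the label and the jump table data, checking for NULL at each
     step just as the updated patch does.  */
  if (next && next->kind == TOY_LABEL && next->next
      && next->next->kind == TOY_JUMP_TABLE_DATA)
    next = next->next->next;

  return next != NULL && next->kind == TOY_BARRIER;
}
```

Without the skip, a chain of the form jump, label, jump table data, barrier would be treated like a jump that is not followed by a barrier, which is the situation the thread describes.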

Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE

2019-02-18 Thread H.J. Lu

On Mon, Feb 18, 2019 at 6:37 AM Uros Bizjak  wrote:
>
> On Mon, Feb 18, 2019 at 3:22 PM H.J. Lu  wrote:
>
> > > > > > > > > > > > On x86-64, since __m64 is returned and passed in XMM 
> > > > > > > > > > > > registers, we can
> > > > > > > > > > > > emulate MMX intrinsics with SSE instructions. To 
> > > > > > > > > > > > support it, we added
> > > > > > > > > > > >
> > > > > > > > > > > >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && 
> > > > > > > > > > > > TARGET_SSE2)
> > > > > > > > > > > >
> > > > > > > > > > > > ;; Define instruction set of MMX instructions
> > > > > > > > > > > > (define_attr "mmx_isa" 
> > > > > > > > > > > > "base,native,x64,x64_noavx,x64_avx"
> > > > > > > > > > > >   (const_string "base"))
> > > > > > > > > > > >
> > > > > > > > > > > >  (eq_attr "mmx_isa" "native")
> > > > > > > > > > > >(symbol_ref "!TARGET_MMX_WITH_SSE")
> > > > > > > > > > > >  (eq_attr "mmx_isa" "x64")
> > > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE")
> > > > > > > > > > > >  (eq_attr "mmx_isa" "x64_avx")
> > > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE && 
> > > > > > > > > > > > TARGET_AVX")
> > > > > > > > > > > >  (eq_attr "mmx_isa" "x64_noavx")
> > > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE && 
> > > > > > > > > > > > !TARGET_AVX")
> > > > > > > > > > > >
> > > > > > > > > > > > We added SSE emulation to MMX patterns and disabled MMX 
> > > > > > > > > > > > alternatives with
> > > > > > > > > > > > TARGET_MMX_WITH_SSE.
> > > > > > > > > > > >
> > > > > > > > > > > > Most of MMX instructions have equivalent SSE versions 
> > > > > > > > > > > > and results of some
> > > > > > > > > > > > SSE versions need to be reshuffled to the right order 
> > > > > > > > > > > > > for MMX.  There are
> > > > > > > > > > > > > a couple of tricky cases:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  
> > > > > > > > > > > > We emulate MMX
> > > > > > > > > > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 
> > > > > > > > > > > > 64 bits of the
> > > > > > > > > > > > mask operand and handle unmapped bits 64:127 at memory 
> > > > > > > > > > > > address by
> > > > > > > > > > > > adjusting source and mask operands together with memory 
> > > > > > > > > > > > address.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. MMX movntq is emulated with SSE2 DImode movnti, 
> > > > > > > > > > > > which is available
> > > > > > > > > > > > in 64-bit mode.
> > > > > > > > > > > >
> > > > > > > > > > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb 
> > > > > > > > > > > > takes a 4-bit index.
> > > > > > > > > > > > SSE emulation must clear the bit 4 in the shuffle 
> > > > > > > > > > > > control mask.
> > > > > > > > > > > >
> > > > > > > > > > > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must 
> > > > > > > > > > > > properly preserve
> > > > > > > > > > > > the upper 64 bits of destination XMM register.
> > > > > > > > > > > >
> > > > > > > > > > > > Tests are also added to check each SSE emulation of MMX 
> > > > > > > > > > > > intrinsics.
> > > > > > > > > > > >
> > > > > > > > > > > > There are no regressions on i686 and x86-64.  For 
> > > > > > > > > > > > x86-64, GCC is also
> > > > > > > > > > > > tested with
> > > > > > > > > > > >
> > > > > > > > > > > > --with-arch=native --with-cpu=native
> > > > > > > > > > > >
> > > > > > > > > > > > on AVX2 and AVX512F machines.
> > > > > > > > > > >
> > > > > > > > > > > An idea that would take patch a step further also on 32 
> > > > > > > > > > > bit targets:
> > > > > > > > > > >
> > > > > > > > > > > *Assuming* that operations on XMM registers are as fast 
> > > > > > > > > > > (or perhaps
> > > > > > > > > > > faster) than operations on MMX registers, we can change 
> > > > > > > > > > > mmx_isa
> > > > > > > > > > > attribute in e.g.
> > > > > > > > > > >
> > > > > > > > > > > +  "@
> > > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > > +   vp\t{%2, %1, %0|%0, %1, %2}"
> > > > > > > > > > > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > > > > > > > > >
> > > > > > > > > > > to:
> > > > > > > > > > >
> > > > > > > > > > > [(set_attr "isa" "*,noavx,avx")
> > > > > > > > > > >  (set_attr "mmx_isa" "native,*,*")]
> > > > > > > > > > >
> > > > > > > > > > > So, for x86_64 everything stays the same, but for x86_32 
> > > > > > > > > > > we now allow
> > > > > > > > > > > intrinsics to use xmm registers in addition to mmx 
> > > > > > > > > > > registers. We can't
> > > > > > > > > > > > disable MMX for x86_32 anyway due to ISA constraints (and 
> > > > > > > > > > > > some tricky
> > > > > > > > > > > > cases, e.g. movnti that works only for 64bit targets and 
> > > > > > > > > > > e.g. maskmovq
> > > > > > > > > > > & similar, which are more efficient with MMX regs), but 
> > > > > > > > > > > RA has much
> > > > > > > > > > > more
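
Tricky case 3 in the quoted cover letter (pshufb index width) can be illustrated with a scalar model. This is a sketch under assumptions, not the code from the patch series: the function names are made up, the two helper functions model the architectural byte-shuffle semantics rather than any GCC pattern, and the 0xF7 mask is one possible way of clearing the index bit with value 8 in each control byte (the cover letter calls it bit 4) while keeping the zeroing bit 0x80.

```c
#include <stdint.h>
#include <string.h>

/* Model of the 64-bit (MMX register) form: 3-bit index per byte.  */
void
toy_pshufb64 (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t out[8];
  for (int i = 0; i < 8; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy (dst, out, 8);
}

/* Model of the 128-bit (XMM register) form: 4-bit index per byte.  */
void
toy_pshufb128 (uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t out[16];
  for (int i = 0; i < 16; i++)
    out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0F];
  memcpy (dst, out, 16);
}

/* Emulate the 64-bit form with the 128-bit one.  The upper half of the
   wide source is filled with junk to show that it is never selected
   once the extra index bit has been cleared.  */
void
toy_pshufb64_via_128 (uint8_t dst[8], const uint8_t src[8],
                      const uint8_t ctl[8])
{
  uint8_t wide_src[16], wide_ctl[16], wide_dst[16];

  memset (wide_src, 0xEE, sizeof wide_src);
  memcpy (wide_src, src, 8);
  memset (wide_ctl, 0, sizeof wide_ctl);
  for (int i = 0; i < 8; i++)
    wide_ctl[i] = ctl[i] & 0xF7;   /* keep bit 7, drop the 4th index bit */

  toy_pshufb128 (wide_dst, wide_src, wide_ctl);
  memcpy (dst, wide_dst, 8);
}

/* Return nonzero if the emulation agrees with the 64-bit model.  */
int
toy_pshufb_emulation_matches (const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t want[8], got[8];
  toy_pshufb64 (want, src, ctl);
  toy_pshufb64_via_128 (got, src, ctl);
  return memcmp (want, got, 8) == 0;
}
```

If the control bytes were passed through unmasked, an index such as 8 or 15 would select from the junk upper half of the wide source instead of wrapping to bytes 0 or 7.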

[Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions

2019-02-18 Thread Matthew Malcomson

Handle stack pointer with SUBS/ADDS instructions.

In general the stack pointer was not handled for many SUBS/ADDS patterns in
aarch64.md.
Both the "extended register" and "immediate" forms allow the stack pointer to be
used as the source register, while no form allows the stack pointer for the
destination register.

The define_insn patterns generating ADDS/SUBS did not allow the stack pointer
for any operand, while the define_peephole2 patterns that generated RTX to be
matched by these patterns allowed the stack pointer for any operand.

The patterns are fixed by adding the 'k' constraint for the first source operand
to all define_insns that generate the ADDS/SUBS "extended register" and
"immediate" forms (but not the "shifted register" form).

In peephole optimizations, constraint strings are ignored (see "(gccint) C
Constraint Interface" info node in the documentation), so the decision to act or
not is based solely on the predicate and condition.
This patch introduces a new predicate "aarch64_general_reg" to be used in
define_peephole2 patterns where only GENERAL_REGS registers are acceptable and
uses that predicate in the peepholes that generate patterns for ADDS/SUBS.

Additionally, this patch contains two tidy-ups (happy to remove them or put in
a separate patch if people want):
We change the condition of sub3_compare1_imm pattern from checking
"UINTVAL (operands[2]) == -UINTVAL (operands[3])"
to checking
"INTVAL (operands[2]) == -INTVAL (operands[3])"
for clarity, since the values checked are signed integers, there are negations
involved in the check, and the condition used by the corresponding peepholes
also uses INTVAL.

The superfluous  iterator in the assembly template for
add3_compareV_imm is removed -- it was applied to an operand that is
known to be a const_int.
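
As a rough illustration of the first tidy-up, the two spellings of the condition can be compared in plain C. This is a sketch only: it assumes a 64-bit HOST_WIDE_INT, models INTVAL with int64_t and UINTVAL with a cast to uint64_t, and the function names are hypothetical. The guard for INT64_MIN is an addition needed to keep the toy signed version free of overflow; it is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models "UINTVAL (operands[2]) == -UINTVAL (operands[3])".
   Unsigned negation wraps modulo 2^64, so this is always defined.  */
bool
toy_negated_pair_unsigned (int64_t op2, int64_t op3)
{
  return (uint64_t) op2 == -(uint64_t) op3;
}

/* Models "INTVAL (operands[2]) == -INTVAL (operands[3])".  */
bool
toy_negated_pair_signed (int64_t op2, int64_t op3)
{
  /* -INT64_MIN is not representable, so no int64_t can equal it.
     This corner is the only place where the two toy functions differ.  */
  if (op3 == INT64_MIN)
    return false;
  return op2 == -op3;
}
```

For values such as (5, -5) or (-7, 7) both functions return true, and for (3, 4) both return false, so in this model the signed spelling states the same check more directly.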

Full bootstrap and regtest done on aarch64-none-linux-gnu.
Regression tests done on aarch64-none-linux-gnu and aarch64-none-elf cross
compiler.

OK for trunk?


NOTE: I have included a bunch of RTL testcases that I used in development.
These don't exercise much of the compiler and are pretty specific to the
backend as it currently is, so I'm not sure they give much value. I'd
appreciate feedback on whether this is in general considered useful.


gcc/ChangeLog:

2019-02-18  Matthew Malcomson  

PR target/89324
* config/aarch64/aarch64.md: Use aarch64_general_reg predicate on
destination register in peepholes generating patterns for ADDS/SUBS.
(add3_compare0,
*addsi3_compare0_uxtw, add3_compareC,
add3_compareV_imm, add3_compareV,
*adds__,
*subs__,
*adds__shift_,
*subs__shift_,
*adds__multp2, *subs__multp2,
*sub3_compare0, *subsi3_compare0_uxtw,
sub3_compare1): Allow stack pointer for source register.
* config/aarch64/predicates.md (aarch64_general_reg): New predicate.


gcc/testsuite/ChangeLog:

2019-02-18  Matthew Malcomson  

PR target/89324
* gcc.dg/rtl/aarch64/subs_adds_sp.c: New test.
* gfortran.fortran-torture/compile/pr89324.f90: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b7f6fe0f1354f7aa19076a946ed2c633b9b9b8da..0d5754a21e31b0c53afb320bdf574fa4a43c7573 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1985,7 +1985,7 @@ (define_expand "uaddvti4"
 (define_insn "add3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ
-(plus:GPI (match_operand:GPI 1 "register_operand" "%r,r,r")
+(plus:GPI (match_operand:GPI 1 "register_operand" "%rk,rk,rk")
   (match_operand:GPI 2 "aarch64_plus_operand" "r,I,J"))
 (const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r,r,r")
@@ -2002,7 +2002,7 @@ (define_insn "add3_compare0"
 (define_insn "*addsi3_compare0_uxtw"
   [(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ
-(plus:SI (match_operand:SI 1 "register_operand" "%r,r,r")
+(plus:SI (match_operand:SI 1 "register_operand" "%rk,rk,rk")
  (match_operand:SI 2 "aarch64_plus_operand" "r,I,J"))
 (const_int 0)))
(set (match_operand:DI 0 "register_operand" "=r,r,r")
@@ -2034,7 +2034,7 @@ (define_insn "add3_compareC"
   [(set (reg:CC_C CC_REGNUM)
(compare:CC_C
  (plus:GPI
-   (match_operand:GPI 1 "register_operand" "r,r,r")
+   (match_operand:GPI 1 "register_operand" "rk,rk,rk")
(match_operand:GPI 2 "aarch64_plus_operand" "r,I,J"))
  (match_dup 1)))
(set (match_operand:GPI 0 "register_operand" "=r,r,r")
@@ -2081,7 +2081,7 @@ (define_insn "add3_compareV_imm"
(compare:CC_V
  (plus:
(sign_extend:
- (match_operand:GPI 1 "register_operand" "r,r"))
+ (match_operand:GPI 1 "register_operand" "rk,rk"))
(match_operand:GPI 2

Re: [PATCH 00/41] V8: Emulate MMX intrinsics with SSE

2019-02-18 Thread Uros Bizjak

On Mon, Feb 18, 2019 at 3:22 PM H.J. Lu  wrote:

> > > > > > > > > > > On x86-64, since __m64 is returned and passed in XMM 
> > > > > > > > > > > registers, we can
> > > > > > > > > > > emulate MMX intrinsics with SSE instructions. To support 
> > > > > > > > > > > it, we added
> > > > > > > > > > >
> > > > > > > > > > >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> > > > > > > > > > >
> > > > > > > > > > > ;; Define instruction set of MMX instructions
> > > > > > > > > > > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> > > > > > > > > > >   (const_string "base"))
> > > > > > > > > > >
> > > > > > > > > > >  (eq_attr "mmx_isa" "native")
> > > > > > > > > > >(symbol_ref "!TARGET_MMX_WITH_SSE")
> > > > > > > > > > >  (eq_attr "mmx_isa" "x64")
> > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE")
> > > > > > > > > > >  (eq_attr "mmx_isa" "x64_avx")
> > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE && 
> > > > > > > > > > > TARGET_AVX")
> > > > > > > > > > >  (eq_attr "mmx_isa" "x64_noavx")
> > > > > > > > > > >(symbol_ref "TARGET_MMX_WITH_SSE && 
> > > > > > > > > > > !TARGET_AVX")
> > > > > > > > > > >
> > > > > > > > > > > We added SSE emulation to MMX patterns and disabled MMX 
> > > > > > > > > > > alternatives with
> > > > > > > > > > > TARGET_MMX_WITH_SSE.
> > > > > > > > > > >
> > > > > > > > > > > Most of MMX instructions have equivalent SSE versions and 
> > > > > > > > > > > results of some
> > > > > > > > > > > SSE versions need to be reshuffled to the right order for 
> > > > > > > > > > > > MMX.  There are
> > > > > > > > > > > > a couple of tricky cases:
> > > > > > > > > > >
> > > > > > > > > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  
> > > > > > > > > > > We emulate MMX
> > > > > > > > > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 
> > > > > > > > > > > bits of the
> > > > > > > > > > > mask operand and handle unmapped bits 64:127 at memory 
> > > > > > > > > > > address by
> > > > > > > > > > > adjusting source and mask operands together with memory 
> > > > > > > > > > > address.
> > > > > > > > > > >
> > > > > > > > > > > 2. MMX movntq is emulated with SSE2 DImode movnti, which 
> > > > > > > > > > > is available
> > > > > > > > > > > in 64-bit mode.
> > > > > > > > > > >
> > > > > > > > > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes 
> > > > > > > > > > > a 4-bit index.
> > > > > > > > > > > SSE emulation must clear the bit 4 in the shuffle control 
> > > > > > > > > > > mask.
> > > > > > > > > > >
> > > > > > > > > > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must 
> > > > > > > > > > > properly preserve
> > > > > > > > > > > the upper 64 bits of destination XMM register.
> > > > > > > > > > >
> > > > > > > > > > > Tests are also added to check each SSE emulation of MMX 
> > > > > > > > > > > intrinsics.
> > > > > > > > > > >
> > > > > > > > > > > There are no regressions on i686 and x86-64.  For x86-64, 
> > > > > > > > > > > GCC is also
> > > > > > > > > > > tested with
> > > > > > > > > > >
> > > > > > > > > > > --with-arch=native --with-cpu=native
> > > > > > > > > > >
> > > > > > > > > > > on AVX2 and AVX512F machines.
> > > > > > > > > >
> > > > > > > > > > An idea that would take patch a step further also on 32 bit 
> > > > > > > > > > targets:
> > > > > > > > > >
> > > > > > > > > > *Assuming* that operations on XMM registers are as fast (or 
> > > > > > > > > > perhaps
> > > > > > > > > > faster) than operations on MMX registers, we can change 
> > > > > > > > > > mmx_isa
> > > > > > > > > > attribute in e.g.
> > > > > > > > > >
> > > > > > > > > > +  "@
> > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > +   p\t{%2, %0|%0, %2}
> > > > > > > > > > +   vp\t{%2, %1, %0|%0, %1, %2}"
> > > > > > > > > > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > > > > > > > >
> > > > > > > > > > to:
> > > > > > > > > >
> > > > > > > > > > [(set_attr "isa" "*,noavx,avx")
> > > > > > > > > >  (set_attr "mmx_isa" "native,*,*")]
> > > > > > > > > >
> > > > > > > > > > So, for x86_64 everything stays the same, but for x86_32 we 
> > > > > > > > > > now allow
> > > > > > > > > > intrinsics to use xmm registers in addition to mmx 
> > > > > > > > > > registers. We can't
> > > > > > > > > > > disable MMX for x86_32 anyway due to ISA constraints (and 
> > > > > > > > > > > some tricky
> > > > > > > > > > > cases, e.g. movnti that works only for 64bit targets and 
> > > > > > > > > > e.g. maskmovq
> > > > > > > > > > & similar, which are more efficient with MMX regs), but RA 
> > > > > > > > > > has much
> > > > > > > > > > more freedom to allocate the most effective register set 
> > > > > > > > > > even for
> > > > > > > > > > 32bit targets.
> > > > > > > > > >
> > > > > > > > > > WDYT?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Since MMX registers are used to pass and return __m64 values,
> > > > > > > > > we
