date:20190304

New Finnish PO file for 'gcc' (version 9.1-b20190203)

2019-03-04 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Finnish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/fi.po

(This file, 'gcc-9.1-b20190203.fi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot; it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[PATCH] backport r268834 from mainline to gcc-8-branch

2019-03-04 Thread Xiong Hu Luo

Backport r268834 of "Add support for the vec_sbox_be, vec_cipher_be etc."
from mainline to gcc-8-branch.

Regression-tested on Linux POWER8 LE. OK for gcc-8-branch?
PS: Is a backport to gcc-7-branch also needed?

gcc/ChangeLog:
2019-03-05  Xiong Hu Luo  

Backport of r268834 from mainline to gcc-8-branch.
2019-02-13  Xiong Hu Luo  

* config/rs6000/altivec.h (vec_sbox_be, vec_cipher_be,
vec_cipherlast_be, vec_ncipher_be, vec_ncipherlast_be): New #defines.
* config/rs6000/crypto.md (CR_vqdi): New define_mode_iterator.
(crypto_vsbox_, crypto__): New define_insns.
* config/rs6000/rs6000-builtin.def (VSBOX_BE): New BU_CRYPTO_1.
(VCIPHER_BE, VCIPHERLAST_BE, VNCIPHER_BE, VNCIPHERLAST_BE):
New BU_CRYPTO_2.
* config/rs6000/rs6000.c (builtin_function_type)
: New switch options.
* doc/extend.texi (vec_sbox_be, vec_cipher_be, vec_cipherlast_be,
vec_ncipher_be, vec_ncipherlast_be): New builtin functions.

gcc/testsuite/ChangeLog:
2019-03-05  Xiong Hu Luo  

Backport of r268834 from mainline to gcc-8-branch.
2019-01-23  Xiong Hu Luo  

* gcc.target/powerpc/crypto-builtin-1.c
(crypto1_be, crypto2_be, crypto3_be, crypto4_be, crypto5_be):
New testcases.
---
 gcc/config/rs6000/altivec.h|  5 +++
 gcc/config/rs6000/crypto.md| 17 ++
 gcc/config/rs6000/rs6000-builtin.def   | 19 ---
 gcc/config/rs6000/rs6000.c |  5 +++
 gcc/doc/extend.texi| 13 
 .../gcc.target/powerpc/crypto-builtin-1.c  | 38 ++
 6 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 5a34162..6c5757e 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -418,6 +418,11 @@
 #define vec_vupkhsw __builtin_vec_vupkhsw
 #define vec_vupklsw __builtin_vec_vupklsw
 #define vec_revb __builtin_vec_revb
+#define vec_sbox_be __builtin_crypto_vsbox_be
+#define vec_cipher_be __builtin_crypto_vcipher_be
+#define vec_cipherlast_be __builtin_crypto_vcipherlast_be
+#define vec_ncipher_be __builtin_crypto_vncipher_be
+#define vec_ncipherlast_be __builtin_crypto_vncipherlast_be
 #endif
 
 #ifdef __POWER9_VECTOR__
diff --git a/gcc/config/rs6000/crypto.md b/gcc/config/rs6000/crypto.md
index 0f34e14..5dc5699 100644
--- a/gcc/config/rs6000/crypto.md
+++ b/gcc/config/rs6000/crypto.md
@@ -48,6 +48,9 @@
 ;; Iterator for VSHASIGMAD/VSHASIGMAW
 (define_mode_iterator CR_hash [V4SI V2DI])
 
+;; Iterator for VSBOX/VCIPHER/VNCIPHER/VCIPHERLAST/VNCIPHERLAST
+(define_mode_iterator CR_vqdi [V16QI V2DI])
+
 ;; Iterator for the other crypto functions
 (define_int_iterator CR_code   [UNSPEC_VCIPHER
UNSPEC_VNCIPHER
@@ -60,10 +63,10 @@
  (UNSPEC_VNCIPHERLAST "vncipherlast")])
 
 ;; 2 operand crypto instructions
-(define_insn "crypto_"
-  [(set (match_operand:V2DI 0 "register_operand" "=v")
-   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "v")
- (match_operand:V2DI 2 "register_operand" "v")]
+(define_insn "crypto__"
+  [(set (match_operand:CR_vqdi 0 "register_operand" "=v")
+   (unspec:CR_vqdi [(match_operand:CR_vqdi 1 "register_operand" "v")
+ (match_operand:CR_vqdi 2 "register_operand" "v")]
 CR_code))]
   "TARGET_CRYPTO"
   " %0,%1,%2"
@@ -90,9 +93,9 @@
   [(set_attr "type" "vecperm")])
 
 ;; 1 operand crypto instruction
-(define_insn "crypto_vsbox"
-  [(set (match_operand:V2DI 0 "register_operand" "=v")
-   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "v")]
+(define_insn "crypto_vsbox_"
+  [(set (match_operand:CR_vqdi 0 "register_operand" "=v")
+   (unspec:CR_vqdi [(match_operand:CR_vqdi 1 "register_operand" "v")]
 UNSPEC_VSBOX))]
   "TARGET_CRYPTO"
   "vsbox %0,%1"
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 5abbd3e..d2896fc 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2442,13 +2442,22 @@ BU_P9_OVERLOAD_2 (CMPRB2,   "byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,  "byte_in_set")
 
 /* 1 argument crypto functions.  */
-BU_CRYPTO_1 (VSBOX,"vsbox",  CONST, crypto_vsbox)
+BU_CRYPTO_1 (VSBOX,"vsbox",  CONST, crypto_vsbox_v2di)
+BU_CRYPTO_1 (VSBOX_BE, "vsbox_be",   CONST, crypto_vsbox_v16qi)
 
 /* 2 argument crypto functions.  */
-BU_CRYPTO_2 (VCIPHER,  "vcipher",CONST, crypto_vcipher)
-BU_CRYPTO_2 (VCIPHERLAST,  "vcipherlast",CONST, crypto_vcipherlast)
-BU_CRYPTO_2 (VNCIPHER, "vncipher",   CONST, crypto_vncipher)
-BU_CRYPTO_2 (VNCIPHERLAST, "vncipherlast",   CONST, crypto_vncipherlast)
+BU_CRYPTO_2 (VCIPHER,  "vcipher",

[PATCH v3] luoxhu - backport r250477, r255555, r257253 and r258137

2019-03-04 Thread luoxhu

From: Xiong Hu Luo 

This is a backport of r250477, r255555, r257253 and r258137 from trunk to
gcc-7-branch to support the built-in functions
vec_extract_fp_from_shorth, vec_extract_fp_from_shortl,
vec_extract_fp32_from_shorth and vec_extract_fp32_from_shortl, etc.
These patches were already on trunk before GCC 8 branched.  r257253 and
r258137 adjust dependent testcases that require VSX support; they need to
be merged as well to avoid regressions.

The discussion for the patch r250477 that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00624.html
The discussion for the patch r255555 that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00394.html
VSX support for patch r257253 and r258137:
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02391.html
https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01506.html

Regression-tested on Linux POWER8 LE.

2019-02-28  Xiong Hu Luo 

Backport from trunk r250477.

2017-07-24  Carl Love  

* config/rs6000/rs6000-c.c: Add support for built-in functions
vector float vec_extract_fp32_from_shorth (vector unsigned short);
vector float vec_extract_fp32_from_shortl (vector unsigned short);
* config/rs6000/altivec.h (vec_extract_fp_from_shorth,
vec_extract_fp_from_shortl): Add defines for the two builtins.
* config/rs6000/rs6000-builtin.def (VEXTRACT_FP_FROM_SHORTH,
VEXTRACT_FP_FROM_SHORTL): Add BU_P9V_OVERLOAD_1 and BU_P9V_VSX_1
new builtins.
* config/rs6000/vsx.md (vsx_xvcvhpsp): Add define_insn.
(vextract_fp_from_shorth, vextract_fp_from_shortl): Add define_expands.
* doc/extend.texi: Update the built-in documentation file for the
new built-in function.

Backport from trunk r255555.

2017-12-11  Carl Love  

* config/rs6000/altivec.h (vec_extract_fp32_from_shorth,
vec_extract_fp32_from_shortl): Add #defines.
* config/rs6000/rs6000-builtin.def (VSLDOI_2DI): Add macro expansion.
* config/rs6000/rs6000-c.c (ALTIVEC_BUILTIN_VEC_UNPACKH,
ALTIVEC_BUILTIN_VEC_UNPACKL, ALTIVEC_BUILTIN_VEC_AND,
ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VEC_SRL,
ALTIVEC_BUILTIN_VEC_SRO, ALTIVEC_BUILTIN_VEC_SLD,
ALTIVEC_BUILTIN_VEC_SLL): Add expansions.
* doc/extend.texi: Add documentation for the added builtins.

gcc/testsuite/ChangeLog:

2019-02-28  Xiong Hu Luo 

Backport from trunk r250477.

2017-07-24  Carl Love  

* gcc.target/powerpc/builtins-3-p9-runnable.c: Add new test file for
the new built-ins.

Backport from trunk r255555.

2017-12-11  Carl Love  
* gcc.target/powerpc/altivec-7.c: Renamed altivec-7.h.
* gcc.target/powerpc/altivec-7.h (main): Add testcases for vec_unpackl.
Add dg-final tests for the instructions generated.
* gcc.target/powerpc/altivec-7-be.c: New file to test on big endian.
* gcc.target/powerpc/altivec-7-le.c: New file to test on little endian.
* gcc.target/powerpc/altivec-13.c (foo): Add vec_sld, vec_srl,
 vec_sro testcases. Add dg-final tests for the instructions generated.
* gcc.target/powerpc/builtins-3-p8.c (test_vsi_packs_vui,
test_vsi_packs_vsi, test_vsi_packs_vssi, test_vsi_packs_vusi,
test_vsi_packsu-vssi, test_vsi_packsu-vusi, test_vsi_packsu-vsll,
test_vsi_packsu-vull, test_vsi_packsu-vsi, test_vsi_packsu-vui): Add
testcases. Add dg-final tests for new instructions.
* gcc.target/powerpc/p8vector-builtin-2.c (vbschar_eq, vbchar_eq,
vuchar_eq, vbint_eq, vsint_eq, viint_eq, vuint_eq, vbool_eq, vbint_ne,
vsint_ne, vuint_ne, vbool_ne, vsign_ne, vuns_ne, vbshort_ne): Add
tests.
Add dg-final instruction tests.
* gcc.target/powerpc/vsx-vector-6.c: Renamed vsx-vector-6.h.
* gcc.target/powerpc/vsx-vector-6.h (vec_andc,vec_nmsub, vec_nmadd,
vec_or, vec_nor, vec_andc, vec_or, vec_andc, vec_msums): Add tests.
Add dg-final tests for the generated instructions.
* gcc.target/powerpc/builtins-3.c (test_sll_vsc_vsc_vsuc,
test_sll_vuc_vuc, test_sll_vsi_vsi_vuc, test_sll_vui_vui_vuc,
test_sll_vbll_vull, test_sll_vbll_vbll_vus, test_sll_vp_vp_vuc,
test_sll_vssi_vssi_vuc, test_sll_vusi_vusi_vuc, test_slo_vsc_vsc_vsc,
test_slo_vuc_vuc_vsc, test_slo_vsi_vsi_vsc, test_slo_vsi_vsi_vuc,
test_slo_vui_vui_vsc, test_slo_vui_vui_vuc, test_slo_vsll_slo_vsll_vsc,
test_slo_vsll_slo_vsll_vuc, test_slo_vull_slo_vull_vsc,
test_slo_vull_slo_vull_vuc, test_slo_vp_vp_vsc, test_slo_vp_vp_vuc,
test_slo_vssi_vssi_vsc, test_slo_vssi_vssi_vuc, test_slo_vusi_vusi_vsc,
test_slo_vusi_vusi_vuc, test_slo_vusi_vusi_vuc, test_slo_vf_vf_vsc,
test_slo_vf_vf_vuc, test_cmpb_float): Add tests.

Backport from trunk r257253.

2018-01-31  Will Schmidt  

* gcc.target/powerpc/altivec-13.c: Remove VSX-requiring

Merge to gccgo branch

2019-03-04 Thread Ian Lance Taylor

I've merged trunk revision 269372 to the gccgo branch.

Ian

Re: [PATCH, rs6000] Fix PR88845: ICE in lra_set_insn_recog_data

2019-03-04 Thread Peter Bergner

On 3/4/19 4:24 PM, Peter Bergner wrote:
> On 3/4/19 4:16 PM, Peter Bergner wrote:
>> Index: gcc/config/rs6000/rs6000.c
>> ===
>> --- gcc/config/rs6000/rs6000.c   (revision 269028)
>> +++ gcc/config/rs6000/rs6000.c   (working copy)
>> @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac
>>  static bool
>>  rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode)
>>  {
>> -  if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed
>> +  if (TARGET_DIRECT_MOVE_64BIT && !reload_completed
>>&& (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode))
>>&& SUBREG_P (source) && sf_subreg_operand (source, mode))
>>  {
>> @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest,
>>  
>>if (mode == SFmode && inner_mode == SImode)
>>  {
>> -  emit_insn (gen_movsf_from_si (dest, inner_source));
>> +  rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source));
>> +  if (lra_in_progress)
>> +remove_scratches_1 (insn);
>>return true;
>>  }
>>  }
> 
> But maybe the call to remove_scratches_1() should move to lra_emit_move(),
> which is how we get to this code in the first place?  Who knows what other
> generic move patterns might need scratches too?

Like this.  This bootstraps and regtests with no regressions.  Do you prefer
this instead?  If so, we'll need Vlad or Jeff or ... to approve the LRA
changes.

Vlad and Jeff,

The original problem and patch are described here:

https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00061.html

The short answer is that, after enabling an rs6000 move pattern we need for
spilling, we ICE when spilling, because the move pattern uses a scratch
register and scratch registers are replaced early on during LRA
initialization.  The patch below just extracts the code that fixes up one
insn into a function of its own.  I then changed lra_emit_move() to call
that function after generating the move insn, so as to replace the scratch
register the move pattern generated.  Thoughts on this patch compared to
the rs6000 only patch linked above?
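
The refactoring described above (one per-insn fix-up shared by the
up-front pass and by late move emission) can be sketched with a toy
model.  This is only an illustration: ToyInsn, ToyAllocator, kScratch and
every function name below are hypothetical stand-ins, not GCC's LRA data
structures or API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-ins; none of these names come from GCC.
constexpr int kScratch = -1;          // placeholder operand, like a SCRATCH

struct ToyInsn {
  std::vector<int> operands;          // register numbers, or kScratch
};

struct ToyAllocator {
  std::vector<ToyInsn> insns;
  int next_pseudo = 100;              // next fresh pseudo register number

  // Per-instruction fix-up: swap every scratch for a fresh pseudo.
  // Returns true if the instruction changed.
  bool replace_scratches_in(ToyInsn &insn) {
    bool changed = false;
    for (int &op : insn.operands)
      if (op == kScratch) {
        op = next_pseudo++;
        changed = true;
      }
    return changed;
  }

  // Up-front pass over everything that exists at initialization time.
  void replace_all_scratches() {
    for (ToyInsn &insn : insns)
      replace_scratches_in(insn);
  }

  // Late emission: the up-front pass has already run, so the new
  // instruction is fixed up here before it joins the stream.
  void emit_move(ToyInsn insn) {
    replace_scratches_in(insn);
    insns.push_back(std::move(insn));
  }

  // True if any instruction still carries a scratch placeholder.
  bool has_scratch() const {
    for (const ToyInsn &insn : insns)
      for (int op : insn.operands)
        if (op == kScratch)
          return true;
    return false;
  }
};

// A move emitted after initialization must not leave a scratch behind.
void demo_late_emit() {
  ToyAllocator ra;
  ra.insns.push_back(ToyInsn{{1, kScratch}});
  ra.replace_all_scratches();
  ra.emit_move(ToyInsn{{2, 3, kScratch}});
  assert(!ra.has_scratch());
}
```

If emit_move() skipped the fix-up, the second instruction would keep its
placeholder, which is the toy analogue of the situation the patch
addresses.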

Peter


gcc/
PR rtl-optimization/88845
* config/rs6000/rs6000.c (rs6000_emit_move_si_sf_subreg): Enable during
LRA.
* lra.c (remove_scratches_1): New function.
(remove_scratches): Use it.
(lra_emit_move): Likewise.

gcc/testsuite/
PR rtl-optimization/88845
* gcc.target/powerpc/pr88845.c: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 269263)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac
 static bool
 rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode)
 {
-  if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed
+  if (TARGET_DIRECT_MOVE_64BIT && !reload_completed
   && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode))
   && SUBREG_P (source) && sf_subreg_operand (source, mode))
 {
Index: gcc/lra.c
===
--- gcc/lra.c   (revision 269263)
+++ gcc/lra.c   (working copy)
@@ -159,6 +159,7 @@ static void invalidate_insn_recog_data (
 static int get_insn_freq (rtx_insn *);
 static void invalidate_insn_data_regno_info (lra_insn_recog_data_t,
 rtx_insn *, int);
+static void remove_scratches_1 (rtx_insn *);
 
 /* Expand all regno related info needed for LRA.  */
 static void
@@ -494,7 +495,11 @@ lra_emit_move (rtx x, rtx y)
   if (rtx_equal_p (x, y))
return;
   old = max_reg_num ();
-  emit_move_insn (x, y);
+  rtx_insn *insn = emit_move_insn (x, y);
+  /* The move pattern may require scratch registers, so convert them
+into real registers now.  */
+  if (insn != NULL_RTX)
+   remove_scratches_1 (insn);
   if (REG_P (x))
lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num;
   /* Function emit_move can create pseudos -- so expand the pseudo
@@ -2077,47 +2082,53 @@ lra_register_new_scratch_op (rtx_insn *i
   add_reg_note (insn, REG_UNUSED, op);
 }
 
-/* Change scratches onto pseudos and save their location.  */
+/* Change INSN's scratches into pseudos and save their location.  */
 static void
-remove_scratches (void)
+remove_scratches_1 (rtx_insn *insn)
 {
   int i;
   bool insn_changed_p;
-  basic_block bb;
-  rtx_insn *insn;
   rtx reg;
   lra_insn_recog_data_t id;
   struct lra_static_insn_data *static_id;
 
+  id = lra_get_insn_recog_data (insn);
+  static_id = id->insn_static_data;
+  insn_changed_p = false;
+  for (i = 0; i < static_id->n_operands; i++)
+if (GET_CODE (*id->operand_loc[i]) == SCRATCH
+   && GET_MODE (*id->operand_loc[i]) != VOIDmode)
+  {
+   insn_changed_p = true;
+

[wwwdocs] add gcc 9 changes

2019-03-04 Thread Martin Sebor


Attached is a patch with (mostly) my changes for GCC 9.  To make
things easier to find, I grouped related changes together within
the sections I changed: warnings go under one bullet, and likewise
built-ins and attributes.

Martin
Index: gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.49
diff -u -r1.49 changes.html
--- gcc-9/changes.html	28 Feb 2019 21:49:05 -	1.49
+++ gcc-9/changes.html	5 Mar 2019 00:18:18 -
@@ -60,8 +60,17 @@
 
 
 General Improvements
+The following GCC command line options have been introduced or improved.
 
   
+All command line options that take a byte-size argument accept
+64-bit integers as well as standard SI and IEC suffixes such as
+kb and KiB, MB and MiB,
+or GB and GiB denoting the corresponding
+multiples of bytes.  See
+https://gcc.gnu.org/onlinedocs/gcc/Invoking-GCC.html#Invoking-GCC;>Invoking GCC for more.
+  
+  
 A new option, https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flive-patching;>-flive-patching=[inline-only-static|inline-clone], has been
 introduced to provide a safe compilation for live-patching. At the same
 time, provides multiple-level control on the enabled IPA optimizations.
@@ -79,9 +88,41 @@
   alignment (e.g. -falign-loops=n:m:n2:m2).
   
   
-  A new built-in function, https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect_005fwith_005fprobability;>__builtin_expect_with_probability,
-  has been added.
+  New pair of profiling options (https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-filter-files;>-fprofile-filter-files
+  and https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-exclude-files;>-fprofile-exclude-files) has been added.
+  The options help to filter which source files are instrumented.
+  
+  
+  AddressSanitizer generates more compact red-zones for automatic variables.
+  That helps to reduce memory footprint of a sanitized binary.
   
+
+The following built-in functions have been introduced.
+
+  
+https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect_005fwith_005fprobability;>__builtin_expect_with_probability to provide branch prediction probability hints to
+the optimizer.
+  
+  
+https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fhas_005fattribute-1;>__builtin_has_attribute determines whether a function, type, or variable has been declared with some
+attribute.
+  
+  
+https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fspeculation_005fsafe_005fvalue-1;>__builtin_speculation_safe_value can be used to help mitigate against unsafe speculative
+execution.
+  
+
+The following attributes have been introduced.
+
+  
+The https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-copy-function-attribute;>copy function attribute has been
+added.  The attribute can also be applied to type definitions and to
+variable declarations.
+  
+
+A large number of improvements to code generation have been made,
+  including but not limited to the following.
+
   
   Switch expansion has been improved by using a different strategy
   (jump table, bit test, decision tree) for a subset of switch cases.
@@ -106,6 +147,10 @@
   can be transformed into 100 * how + 5 (for values defined
   in the switch statement).
   
+
+The following improvements to the gcov command line utility
+  have been made.
+
   
   The gcov tool received a new option https://gcc.gnu.org/onlinedocs/gcc/Invoking-Gcov.html#Invoking-Gcov;>--use-hotness-colors
   (-q) that can provide perf-like coloring of hot functions.
@@ -113,15 +158,6 @@
   
   The gcov tool has changed its intermediate format to a new JSON format.
   
-  
-  New pair of profiling options (https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-filter-files;>-fprofile-filter-files
-  and https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-exclude-files;>-fprofile-exclude-files) has been added.
-  The options help to filter which source files are instrumented.
-  
-  
-  AddressSanitizer generates more compact red-zones for automatic variables.
-  That helps to reduce memory footprint of a sanitized binary.
-  
 
 
 
@@ -139,7 +175,6 @@
   not supported in the GCC 9 release see
   https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00628.html;>this mail.
   
-
   New extensions:
   
 https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html#index-_005f_005fbuiltin_005fconvertvector;>__builtin_convertvector built-in for vector conversions
@@ -152,7 +187,40 @@
   warns about an unaligned pointer value from the address of a packed
   member of a struct or union.
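
As a small illustration of the __builtin_expect_with_probability entry
listed in the changes above, here is a hedged sketch of how such a hint
might be used.  The macro names HAVE_EXPECT_WITH_PROBABILITY and
LIKELY_90 and the function count_positive are made up for this example,
and the 0.9 probability is an arbitrary assumption.  The detection logic
falls back to the plain condition on compilers that do not provide the
built-in.

```cpp
#include <cassert>

// Detect the built-in.  __has_builtin may be missing on older compilers,
// hence the nested test and the GCC version fallback.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
#    define HAVE_EXPECT_WITH_PROBABILITY 1
#  endif
#endif
#if !defined(HAVE_EXPECT_WITH_PROBABILITY) && defined(__GNUC__) \
    && !defined(__clang__) && !defined(__INTEL_COMPILER) && __GNUC__ >= 9
#  define HAVE_EXPECT_WITH_PROBABILITY 1
#endif

// Hypothetical helper: "this condition is true about 90% of the time".
// The third argument must be a compile-time constant in [0.0, 1.0].
#ifdef HAVE_EXPECT_WITH_PROBABILITY
#  define LIKELY_90(cond) \
     __builtin_expect_with_probability(!!(cond), 1, 0.9)
#else
#  define LIKELY_90(cond) (!!(cond))
#endif

// Counts the strictly positive elements; the hint only affects how the
// optimizer lays out the branch, never the result.
int count_positive(const int *values, int n) {
  int count = 0;
  for (int i = 0; i < n; ++i)
    if (LIKELY_90(values[i] > 0))
      ++count;
  return count;
}

void demo_count_positive() {
  const int v[] = {3, -1, 4, 0, 5};
  assert(count_positive(v, 5) == 3);
}
```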

C++ PATCH for c++/87378 - bogus -Wredundant-move warning

2019-03-04 Thread Marek Polacek

This patch fixes a bogus -Wredundant-move warning.  In the test in the PR
the std::move call isn't redundant; the testcase won't actually compile
without that call, as per the resolution of bug 87150.

Before this patch, we'd issue the redundant-move warning anytime
treat_lvalue_as_rvalue_p is true for a std::move's argument.  But we also
need to make sure that convert_for_initialization works even without the
std::move call; if it doesn't, the call is not redundant.

Trouble arises when the argument is const.  Then it might be the case that
the implicit rvalue fails because it uses a const copy constructor, or
that the type of the returned object and the type of the selected ctor's
parameter aren't the same.  To handle various corner cases (e.g. when the
std::move actually makes a difference between const T& and const T&&) I had
to use another convert_for_initialization, but the first one is the important
one and should handle most of the cases.
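
To make the non-redundant case concrete, here is a minimal sketch modeled
on the S1/S2 pair in the new test.  Base, Derived, to_base and the moves
counter are illustrative names, not part of the patch.  The assumption is
the C++11/14 return rule in force at the time: the implicit rvalue
treatment applies only if the selected constructor takes an rvalue
reference to the returned object's own type.  Here the constructor takes
Base&& while the object is a Derived, so without std::move the lookup
would fall back to the deleted copy constructor.

```cpp
#include <cassert>
#include <utility>

struct Base {
  int moves = 0;                       // how many move constructions so far
  Base() = default;
  // Declaring a move constructor leaves the copy constructor deleted.
  Base(Base &&other) noexcept : moves(other.moves + 1) {}
};

struct Derived : Base {};

// Returning a by-value Derived parameter as a Base.  The std::move is
// what selects Base(Base&&) here, so a warning calling it redundant
// would be bogus.
Base to_base(Derived d) {
  return std::move(d);
}

void demo_to_base() {
  Derived d;
  Base b = to_base(std::move(d));
  // One move into the parameter, one out of the return statement; a
  // compiler that does not elide the final copy may add a third.
  assert(b.moves >= 2);
}
```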

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-03-04  Marek Polacek  

PR c++/87378 - bogus -Wredundant-move warning.
* typeck.c (maybe_warn_pessimizing_move): See if the maybe-rvalue
overload resolution would actually succeed.

* g++.dg/cpp0x/Wredundant-move7.C: New test.
* g++.dg/cpp0x/Wredundant-move8.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 1bf9ad88141..7a43ba70010 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -9429,10 +9429,43 @@ maybe_warn_pessimizing_move (tree retval, tree functype)
 do maybe-rvalue overload resolution even without std::move.  */
  else if (treat_lvalue_as_rvalue_p (arg, /*parm_ok*/true))
{
- auto_diagnostic_group d;
- if (warning_at (loc, OPT_Wredundant_move,
- "redundant move in return statement"))
-   inform (loc, "remove %<std::move%> call");
+ /* Make sure that the overload resolution would actually succeed
+if we removed the std::move call.  */
+ tree moved = move (arg);
+ tree t = convert_for_initialization (NULL_TREE, functype, moved,
+  (LOOKUP_NORMAL
+   | LOOKUP_ONLYCONVERTING
+   | LOOKUP_PREFER_RVALUE),
+  ICR_RETURN, NULL_TREE, 0,
+  tf_none);
+ /* If this worked, implicit rvalue would work, so the call to
+std::move is redundant.  */
+ if (t == error_mark_node
+ && CP_TYPE_CONST_P (TREE_TYPE (arg)))
+   {
+ /* But if ARG is const, it may be the case of using
+const T& instead of T&&.  */
+ t = convert_for_initialization (NULL_TREE, functype, moved,
+ (LOOKUP_NORMAL
+  | LOOKUP_ONLYCONVERTING),
+ ICR_RETURN, NULL_TREE, 0,
+ tf_none);
+ /* Unless, rarely, using std::move would actually choose
+const T && over const T&.  In that case std::move isn't
+redundant.  */
+ if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+ tree call = cp_get_callee_fndecl_nofold (t);
+ if (call && DECL_MOVE_CONSTRUCTOR_P (call))
+   t = error_mark_node;
+   }
+ if (t != error_mark_node)
+   {
+ auto_diagnostic_group d;
+ if (warning_at (loc, OPT_Wredundant_move,
+ "redundant move in return statement"))
+   inform (loc, "remove %<std::move%> call");
+   }
}
}
 }
diff --git gcc/testsuite/g++.dg/cpp0x/Wredundant-move7.C gcc/testsuite/g++.dg/cpp0x/Wredundant-move7.C
new file mode 100644
index 00000000000..015d7c4f7a4
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/Wredundant-move7.C
@@ -0,0 +1,59 @@
+// PR c++/87378
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wredundant-move" }
+
+// Define std::move.
+namespace std {
+  template<typename _Tp>
+struct remove_reference
+{ typedef _Tp   type; };
+
+  template<typename _Tp>
+struct remove_reference<_Tp&>
+{ typedef _Tp   type; };
+
+  template<typename _Tp>
+struct remove_reference<_Tp&&>
+{ typedef _Tp   type; };
+
+  template<typename _Tp>
+constexpr typename std::remove_reference<_Tp>::type&&
+move(_Tp&& __t) noexcept
+{ return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }
+}
+
+struct S1 { S1(S1 &&); };
+struct S2 : S1 {};
+
+S1
+f (S2 s)
+{
+  return std::move(s); // { dg-bogus "redundant move in return statement" }
+}
+
+struct R1 {
+  R1(R1 &&);
+  R1(const R1 &&);
+};
+struct R2 : R1 {};
+
+R1

[PATCH] Fix -gdwarf-5 -gsplit-dwarf ICEs (PR debug/89498)

2019-03-04 Thread Jakub Jelinek

Hi!

output_view_list_offset does:
  if (dwarf_split_debug_info)
dw2_asm_output_delta (DWARF_OFFSET_SIZE, sym, loc_section_label,
  "%s", dwarf_attr_name (a->dw_attr));
  else
dw2_asm_output_offset (DWARF_OFFSET_SIZE, sym, debug_loc_section,
   "%s", dwarf_attr_name (a->dw_attr));
while output_loc_list_offset does:
  if (!dwarf_split_debug_info)
dw2_asm_output_offset (DWARF_OFFSET_SIZE, sym, debug_loc_section,
   "%s", dwarf_attr_name (a->dw_attr));
  else if (dwarf_version >= 5)
{
  gcc_assert (AT_loc_list (a)->num_assigned);
  dw2_asm_output_data_uleb128 (AT_loc_list (a)->hash, "%s (%s)",
   dwarf_attr_name (a->dw_attr),
   sym);
}
  else
dw2_asm_output_delta (DWARF_OFFSET_SIZE, sym, loc_section_label,
  "%s", dwarf_attr_name (a->dw_attr));
but size_of_die and value_format both handle view lists the same way as
loc_list, so with -gdwarf-5 -gsplit-dwarf we just ICE, because e.g.
AT_loc_list is not valid on a view list.

Assuming output_view_list_offset is correct, the following patch adjusts
size_of_die/value_format accordingly.
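
The underlying invariant is that the sizing pass must reserve exactly
what the output routine writes.  The following toy model illustrates
that; ToyClass, ToyConfig and all function names are invented for the
sketch and are not dwarf2out.c code.  In GCC the pre-patch failure shows
up as an ICE rather than as a silent size mismatch, so the "unpatched"
branch below is only an analogy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-ins for two attribute value classes.
enum class ToyClass { LocList, ViewList };

struct ToyConfig {
  bool split_debug_info;
  int dwarf_version;
  std::size_t offset_size;            // 4 or 8 bytes
};

// Number of bytes needed to encode VALUE as unsigned LEB128.
std::size_t uleb128_size(std::uint64_t value) {
  std::size_t n = 0;
  do {
    value >>= 7;
    ++n;
  } while (value != 0);
  return n;
}

// What the output side writes: only location lists switch to a ULEB128
// index under split DWARF 5; view lists are always a fixed-size offset.
std::size_t emitted_size(ToyClass c, const ToyConfig &cfg,
                         std::uint64_t index) {
  if (c == ToyClass::LocList && cfg.split_debug_info
      && cfg.dwarf_version >= 5)
    return uleb128_size(index);
  return cfg.offset_size;
}

// What the sizing side reserves.  With PATCHED false, view lists are
// lumped together with location lists, as before the fix.
std::size_t reserved_size(ToyClass c, const ToyConfig &cfg,
                          std::uint64_t index, bool patched) {
  const bool as_loc_list = (c == ToyClass::LocList) || !patched;
  if (as_loc_list && cfg.split_debug_info && cfg.dwarf_version >= 5)
    return uleb128_size(index);
  return cfg.offset_size;
}

void demo_view_list_size() {
  const ToyConfig cfg{true, 5, 4};    // models -gdwarf-5 -gsplit-dwarf
  assert(reserved_size(ToyClass::ViewList, cfg, 3, true)
         == emitted_size(ToyClass::ViewList, cfg, 3));
  assert(reserved_size(ToyClass::ViewList, cfg, 3, false)
         != emitted_size(ToyClass::ViewList, cfg, 3));
}
```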

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-04  Jakub Jelinek  

PR debug/89498
* dwarf2out.c (size_of_die): For dw_val_class_view_list always use
DWARF_OFFSET_SIZE.
(value_format): For dw_val_class_view_list never use DW_FORM_loclistx.

--- gcc/dwarf2out.c.jj  2019-03-01 09:04:15.440751912 +0100
+++ gcc/dwarf2out.c 2019-03-04 17:58:59.501542373 +0100
@@ -9351,7 +9351,6 @@ size_of_die (dw_die_ref die)
  }
  break;
case dw_val_class_loc_list:
-   case dw_val_class_view_list:
  if (dwarf_split_debug_info && dwarf_version >= 5)
{
  gcc_assert (AT_loc_list (a)->num_assigned);
@@ -9360,6 +9359,9 @@ size_of_die (dw_die_ref die)
   else
 size += DWARF_OFFSET_SIZE;
  break;
+   case dw_val_class_view_list:
+ size += DWARF_OFFSET_SIZE;
+ break;
case dw_val_class_range_list:
  if (value_format (a) == DW_FORM_rnglistx)
{
@@ -9733,12 +9735,12 @@ value_format (dw_attr_node *a)
  gcc_unreachable ();
}
 case dw_val_class_loc_list:
-case dw_val_class_view_list:
   if (dwarf_split_debug_info
  && dwarf_version >= 5
  && AT_loc_list (a)->num_assigned)
return DW_FORM_loclistx;
   /* FALLTHRU */
+case dw_val_class_view_list:
 case dw_val_class_range_list:
   /* For range lists in DWARF 5, use DW_FORM_rnglistx from .debug_info.dwo
 but in .debug_info use DW_FORM_sec_offset, which is shorter if we

Jakub

[PATCH] Guard binary/ternary match.pd patterns to IFN_COND_* with IFN_COND_* availability (PR tree-optimization/89570)

2019-03-04 Thread Jakub Jelinek

Hi!

As the following testcase shows, these match.pd patterns create temporary
GIMPLE stmts even when they aren't going to result in anything useful
(which is the case on all targets except aarch64 right now).  Besides
wasting compile-time memory, this is bad with -fno-tree-dce, because those
stmts might not even be valid for the target and we might ICE during
expansion.

Fixed by guarding them with a vectorized_internal_fn_supported_p test.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I have no idea how to test this on aarch64; Richard S., can you please
do that?  Thanks.

2019-03-04  Jakub Jelinek  

PR tree-optimization/89570
* match.pd (vec_cond into cond_op simplification): Guard with
vectorized_internal_fn_supported_p test and #if GIMPLE.

* gcc.dg/pr89570.c: New test.

--- gcc/match.pd.jj 2019-01-16 09:35:08.421259263 +0100
+++ gcc/match.pd 2019-03-04 13:00:02.884284658 +0100
@@ -5177,17 +5177,24 @@ (define_operator_list COND_TERNARY
if the target can do it in one go.  This makes the operation conditional
on c, so could drop potentially-trapping arithmetic, but that's a valid
simplification if the result of the operation isn't needed.  */
+#if GIMPLE
 (for uncond_op (UNCOND_BINARY)
  cond_op (COND_BINARY)
  (simplify
   (vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3)
-  (with { tree op_type = TREE_TYPE (@4); }
-   (if (element_precision (type) == element_precision (op_type))
+  (with { tree op_type = TREE_TYPE (@4); 
+ internal_fn cond_fn = get_conditional_internal_fn (uncond_op); }
+   (if (cond_fn != IFN_LAST
+   && vectorized_internal_fn_supported_p (cond_fn, op_type)
+   && element_precision (type) == element_precision (op_type))
 (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3))))))
  (simplify
   (vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3)))
-  (with { tree op_type = TREE_TYPE (@4); }
-   (if (element_precision (type) == element_precision (op_type))
+  (with { tree op_type = TREE_TYPE (@4);
+ internal_fn cond_fn = get_conditional_internal_fn (uncond_op); }
+   (if (cond_fn != IFN_LAST
+   && vectorized_internal_fn_supported_p (cond_fn, op_type)
+   && element_precision (type) == element_precision (op_type))
 (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1)))))))
 
 /* Same for ternary operations.  */
@@ -5195,15 +5202,24 @@ (define_operator_list COND_TERNARY
  cond_op (COND_TERNARY)
  (simplify
   (vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4)
-  (with { tree op_type = TREE_TYPE (@5); }
-   (if (element_precision (type) == element_precision (op_type))
+  (with { tree op_type = TREE_TYPE (@5);
+ internal_fn cond_fn
+   = get_conditional_internal_fn (as_internal_fn (uncond_op)); }
+   (if (cond_fn != IFN_LAST
+   && vectorized_internal_fn_supported_p (cond_fn, op_type)
+   && element_precision (type) == element_precision (op_type))
 (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4))))))
  (simplify
   (vec_cond @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)))
-  (with { tree op_type = TREE_TYPE (@5); }
-   (if (element_precision (type) == element_precision (op_type))
+  (with { tree op_type = TREE_TYPE (@5);
+ internal_fn cond_fn
+   = get_conditional_internal_fn (as_internal_fn (uncond_op)); }
+   (if (cond_fn != IFN_LAST
+   && vectorized_internal_fn_supported_p (cond_fn, op_type)
+   && element_precision (type) == element_precision (op_type))
 (view_convert (cond_op (bit_not @0) @2 @3 @4
  (view_convert:op_type @1)))))))
+#endif
 
 /* Detect cases in which a VEC_COND_EXPR effectively replaces the
"else" value of an IFN_COND_*.  */
--- gcc/testsuite/gcc.dg/pr89570.c.jj   2019-03-04 13:04:00.459544926 +0100
+++ gcc/testsuite/gcc.dg/pr89570.c  2019-03-04 13:03:44.157801534 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/89570 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -ftree-vectorize -fno-trapping-math -fno-tree-dce -fno-tree-dominator-opts" } */
+/* { dg-additional-options "-mvsx" { target powerpc_vsx_ok } } */
+
+void
+foo (double *x, double *y, double *z)
+{
+  int i;
+  for (i = 0; i < 7; i += 2)
+{
+  x[i] = y[i] ? z[i] / 2.0 : z[i];
+  x[i + 1] = y[i + 1] ? z[i + 1] / 2.0 : z[i + 1];
+}
+}

Jakub

[PATCH] Fix gimple-ssa-sprintf ICE (PR tree-optimization/89566)

2019-03-04 Thread Jakub Jelinek

Hi!

Before the PR87041 changes, sprintf_dom_walker::handle_gimple_call
would punt if gimple_call_builtin_p (which did all the needed call argument
checking) failed, but it doesn't punt anymore because it wants to handle
the format attribute.  That is fine, but if gimple_call_builtin_p failed, we
shouldn't handle such calls as builtins, just possibly as format string
argument functions.  Note, info.func might even be a backend or FE builtin,
and DECL_FUNCTION_CODE in those cases means something completely different
anyway.

So, the following patch makes sure to only set info.fncode if
gimple_call_builtin_p succeeds, and for the format attribute does at least
minimal verification: that the call actually has an argument at the
idx_format position which also has a pointer-ish type, and that idx_args
isn't out of bounds either (that one can be equal to the number of
arguments, which represents zero varargs arguments).
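
The minimal verification described above boils down to three bounds and
type checks.  A hedged standalone sketch follows; ToyArg and
user_format_call_ok are hypothetical names, and a real call statement is
replaced by a plain vector of argument descriptions.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Toy description of one actual argument at a call site.
struct ToyArg {
  bool is_pointer;
};

// Returns true if a call to a user function with a format attribute is
// safe to analyze.  IDX_FORMAT is the position of the format string
// (UINT_MAX when no attribute was found) and IDX_ARGS the position of
// the first variadic argument.
bool user_format_call_ok(unsigned idx_format, unsigned idx_args,
                         const std::vector<ToyArg> &args) {
  const std::size_t nargs = args.size();
  if (idx_format == UINT_MAX)
    return false;                     // no format attribute at all
  if (idx_format >= nargs)
    return false;                     // format argument not provided
  if (idx_args > nargs)
    return false;                     // == nargs is fine: zero varargs
  return args[idx_format].is_pointer; // format must be pointer-ish
}

void demo_format_checks() {
  // Like the new test: an fprintf-style function called with no
  // arguments through a cast.
  const std::vector<ToyArg> none;
  assert(!user_format_call_ok(1, 2, none));

  // Stream and format only, no variadic arguments: still acceptable.
  const std::vector<ToyArg> two = {{true}, {true}};
  assert(user_format_call_ok(1, 2, two));
}
```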

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-04  Jakub Jelinek  

PR tree-optimization/89566
* gimple-ssa-sprintf.c (sprintf_dom_walker::handle_gimple_call):
Set info.fncode to BUILT_IN_NONE if gimple_call_builtin_p failed.
Punt if get_user_idx_format succeeds, but idx_format argument is
not provided or doesn't have pointer type, or if idx_args is above
number of provided arguments.

* c-c++-common/pr89566.c: New test.

--- gcc/gimple-ssa-sprintf.c.jj 2019-02-24 20:18:01.348754080 +0100
+++ gcc/gimple-ssa-sprintf.c    2019-03-04 10:52:32.867295908 +0100
@@ -3858,16 +3858,21 @@ sprintf_dom_walker::handle_gimple_call (
   if (!info.func)
 return false;
 
-  info.fncode = DECL_FUNCTION_CODE (info.func);
-
   /* Format string argument number (valid for all functions).  */
   unsigned idx_format = UINT_MAX;
-  if (!gimple_call_builtin_p (info.callstmt, BUILT_IN_NORMAL))
+  if (gimple_call_builtin_p (info.callstmt, BUILT_IN_NORMAL))
+info.fncode = DECL_FUNCTION_CODE (info.func);
+  else
 {
   unsigned idx_args;
   idx_format = get_user_idx_format (info.func, &idx_args);
-  if (idx_format == UINT_MAX)
+  if (idx_format == UINT_MAX
+ || idx_format >= gimple_call_num_args (info.callstmt)
+ || idx_args > gimple_call_num_args (info.callstmt)
+ || !POINTER_TYPE_P (TREE_TYPE (gimple_call_arg (info.callstmt,
+ idx_format))))
return false;
+  info.fncode = BUILT_IN_NONE;
   info.argidx = idx_args;
 }
 
--- gcc/testsuite/c-c++-common/pr89566.c.jj 2019-03-04 10:56:10.060730886 +0100
+++ gcc/testsuite/c-c++-common/pr89566.c    2019-03-04 10:55:54.334989014 +0100
@@ -0,0 +1,15 @@
+/* PR tree-optimization/89566 */
+/* { dg-do compile } */
+
+typedef struct FILE { int i; } FILE;
+#ifdef __cplusplus
+extern "C"
+#endif
+int fprintf (FILE *, const char *, ...);
+
+int
+main ()
+{
+  ((void (*)()) fprintf) ();   // { dg-warning "function called through a non-compatible type" "" { target c } }
+  return 0;
+}

Jakub

Re: [PATCH, rs6000] Fix PR88845: ICE in lra_set_insn_recog_data

2019-03-04 Thread Peter Bergner

On 3/4/19 4:16 PM, Peter Bergner wrote:
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 269028)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac
>  static bool
>  rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode)
>  {
> -  if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed
> +  if (TARGET_DIRECT_MOVE_64BIT && !reload_completed
>&& (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode))
>&& SUBREG_P (source) && sf_subreg_operand (source, mode))
>  {
> @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest,
>  
>if (mode == SFmode && inner_mode == SImode)
>   {
> -   emit_insn (gen_movsf_from_si (dest, inner_source));
> +   rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source));
> +   if (lra_in_progress)
> + remove_scratches_1 (insn);
> return true;
>   }
>  }

But maybe the call to remove_scratches_1() should move to lra_emit_move(),
which is how we get to this code in the first place?  Who knows what other
generic move patterns might need scratches?

Peter

Re: [PATCH, rs6000] Fix PR88845: ICE in lra_set_insn_recog_data

2019-03-04 Thread Peter Bergner

On 3/4/19 1:27 PM, Segher Boessenkool wrote:
>> +  /* If LRA is generating a direct move from a GPR to a FPR,
>> + then the splitter is going to need a scratch register.  */
>> +  rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]);
>> +  XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode);
>> +  emit_insn (insn);
>> +  DONE;
> 
> This part isn't so great, needing detailed knowledge of the RTL generated
> by some other pattern.  Maybe there already exists some function that
> generates a register for every scratch in an insn, or you can make such
> a function?

A function that updates one insn does not exist.  There is remove_scratches(),
but that works on the entire cfg.  As part of my earlier attempts, I did split
remove_scratches() into a function that traverses the cfg and another that
replaces the scratches in one insn.  I've included it below.  I went with the
current patch, because that doesn't touch anything outside of the port.
If you prefer the patch below, we can go with that instead.  Let me know which
you prefer.
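
The split described above (a per-insn worker plus a driver that only walks the
cfg) can be sketched in isolation as follows. This is a toy model under stated
assumptions: `ToyOperand`, `ToyInsn`, `ToyBlock` and both function names are
hypothetical and stand in for the LRA data structures, which are not
reproduced here.

```cpp
#include <vector>

// Toy model: an operand is either a SCRATCH placeholder or a numbered pseudo.
struct ToyOperand {
  bool is_scratch;
  int pseudo;  // meaningful only when !is_scratch
};
struct ToyInsn {
  std::vector<ToyOperand> operands;
};
using ToyBlock = std::vector<ToyInsn>;

// Per-insn worker: replace every scratch in INSN with a fresh pseudo.
// Returns true if the insn changed, so the caller can rescan it.
bool replace_scratches_in_insn(ToyInsn &insn, int &next_pseudo) {
  bool changed = false;
  for (ToyOperand &op : insn.operands)
    if (op.is_scratch) {
      op = ToyOperand{false, next_pseudo++};
      changed = true;
    }
  return changed;
}

// Whole-function driver: only walks the blocks and delegates to the worker.
// Returns the number of insns that were changed.
int replace_all_scratches(std::vector<ToyBlock> &blocks, int &next_pseudo) {
  int changed_insns = 0;
  for (ToyBlock &bb : blocks)
    for (ToyInsn &insn : bb)
      if (replace_scratches_in_insn(insn, next_pseudo))
        ++changed_insns;
  return changed_insns;
}
```

The point of the split is that a caller which emits a single new insn late can
invoke the worker directly, without re-walking every block.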


>> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
>> +/* { dg-options "-mcpu=power8 -O2" } */
> 
> These two lines should now be just
> 
> /* { dg-options "-mdejagnu-cpu=power8 -O2" } */

Ok, will update that.

Peter



Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 269028)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac
 static bool
 rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode)
 {
-  if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed
+  if (TARGET_DIRECT_MOVE_64BIT && !reload_completed
   && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode))
   && SUBREG_P (source) && sf_subreg_operand (source, mode))
 {
@@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest,
 
   if (mode == SFmode && inner_mode == SImode)
{
- emit_insn (gen_movsf_from_si (dest, inner_source));
+ rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source));
+ if (lra_in_progress)
+   remove_scratches_1 (insn);
  return true;
}
 }

Index: gcc/lra.c
===
--- gcc/lra.c   (revision 269028)
+++ gcc/lra.c   (working copy)
@@ -2077,7 +2077,40 @@ lra_register_new_scratch_op (rtx_insn *i
   add_reg_note (insn, REG_UNUSED, op);
 }
 
-/* Change scratches onto pseudos and save their location.  */
+/* Change INSN's scratches into pseudos and save their location.  */
+void
+remove_scratches_1 (rtx_insn *insn)
+{
+  int i;
+  bool insn_changed_p;
+  rtx reg;
+  lra_insn_recog_data_t id;
+  struct lra_static_insn_data *static_id;
+
+  id = lra_get_insn_recog_data (insn);
+  static_id = id->insn_static_data;
+  insn_changed_p = false;
+  for (i = 0; i < static_id->n_operands; i++)
+if (GET_CODE (*id->operand_loc[i]) == SCRATCH
+   && GET_MODE (*id->operand_loc[i]) != VOIDmode)
+  {
+   insn_changed_p = true;
+   *id->operand_loc[i] = reg
+ = lra_create_new_reg (static_id->operand[i].mode,
+   *id->operand_loc[i], ALL_REGS, NULL);
+   lra_register_new_scratch_op (insn, i, id->icode);
+   if (lra_dump_file != NULL)
+ fprintf (lra_dump_file,
+  "Removing SCRATCH in insn #%u (nop %d)\n",
+  INSN_UID (insn), i);
+  }
+  if (insn_changed_p)
+/* Because we might use DF right after caller-saves sub-pass
+   we need to keep DF info up to date.  */
+df_insn_rescan (insn);
+}
+
+/* Change scratches into pseudos and save their location.  */
 static void
 remove_scratches (void)
 {
@@ -2095,29 +2128,7 @@ remove_scratches (void)
   FOR_EACH_BB_FN (bb, cfun)
 FOR_BB_INSNS (bb, insn)
 if (INSN_P (insn))
-  {
-   id = lra_get_insn_recog_data (insn);
-   static_id = id->insn_static_data;
-   insn_changed_p = false;
-   for (i = 0; i < static_id->n_operands; i++)
- if (GET_CODE (*id->operand_loc[i]) == SCRATCH
- && GET_MODE (*id->operand_loc[i]) != VOIDmode)
-   {
- insn_changed_p = true;
- *id->operand_loc[i] = reg
-   = lra_create_new_reg (static_id->operand[i].mode,
- *id->operand_loc[i], ALL_REGS, NULL);
- lra_register_new_scratch_op (insn, i, id->icode);
- if (lra_dump_file != NULL)
-   fprintf (lra_dump_file,
-"Removing SCRATCH in insn #%u (nop %d)\n",
-INSN_UID (insn), i);
-   }
-   if (insn_changed_p)
- /* Because we might use DF right after caller-saves sub-pass
-we need to keep DF info up to date.  */
- df_insn_rescan (insn);
-

Re: A bug in vrp_meet?

2019-03-04 Thread Qing Zhao

Hi, Richard,

> On Mar 4, 2019, at 5:45 AM, Richard Biener  wrote:
>> 
>> It looks like DOM fails to visit stmts generated by simplification. Can you 
>> open a bug report with a testcase?
>> 
>> 
>> The problem is that it took me quite some time to come up with a small
>> and independent testcase for this problem;
>> even a small change made the error disappear.
>> 
>> Do you have any suggestions on this?  Or can you give me a hint on how to
>> fix this in DOM, so that I can try the fix on my side?
> 
> I remember running into similar issues in the past where I tried to
> extract temporary nonnull ranges from divisions.
> I have there
> 
> @@ -1436,11 +1436,16 @@ dom_opt_dom_walker::before_dom_children
>   m_avail_exprs_stack->pop_to_marker ();
> 
>   edge taken_edge = NULL;
> -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> -{
> -  evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
> -  taken_edge = this->optimize_stmt (bb, gsi);
> -}
> +  gsi = gsi_start_bb (bb);
> +  if (!gsi_end_p (gsi))
> +while (1)
> +  {
> +   evrp_range_analyzer.record_def_ranges_from_stmt (gsi_stmt (gsi), false);
> +   taken_edge = this->optimize_stmt (bb, &gsi);
> +   if (gsi_end_p (gsi))
> + break;
> +   evrp_range_analyzer.record_use_ranges_from_stmt (gsi_stmt (gsi));
> +  }
> 
>   /* Now prepare to process dominated blocks.  */
>   record_edge_info (bb);
> 
> OTOH the issue in your case is that fold emits new stmts before gsi but the
> above loop will never look at them.  See tree-ssa-forwprop.c for code how
> to deal with this (setting a pass-local flag on stmts visited and walking back
> to unvisited, newly inserted ones).  The fold_stmt interface could in theory
> also be extended to insert new stmts on a sequence passed to it so the
> caller would be responsible for inserting them into the IL and could then
> more easily revisit them (but that's a bigger task).
> 
> So, does the following help?

Yes, this change fixed the error on my side.  Now, in the dump file for pass
dom3:


Visiting statement:
i_49 = _98 > 0 ? k_105 : 0;
Meeting
  [0, 65535]
and
  [0, 0]
to
  [0, 65535]
Intersecting
  [0, 65535]
and
  [0, 65535]
to
  [0, 65535]
Optimizing statement i_49 = _98 > 0 ? k_105 : 0;
  Replaced 'k_105' with variable '_98'
gimple_simplified to _152 = MAX_EXPR <_98, 0>;
i_49 = _152;
  Folded to: i_49 = _152;
LKUP STMT i_49 = _152
 ASGN i_49 = _152

Visiting statement:
_152 = MAX_EXPR <_98, 0>;

Visiting statement:
i_49 = _152;
Intersecting
  [0, 65535]  EQUIVALENCES: { _152 } (1 elements)
and
  [0, 65535]
to
  [0, 65535]  EQUIVALENCES: { _152 } (1 elements)


We can clearly see from the above that all the new stmts generated by fold are
visited now.
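
To make the idea concrete, here is a small standalone sketch of a block walk
that also visits statements inserted before the current one. It is a
simplified model, not the DOM pass: statements are plain strings in a
`std::list`, and `walk_block` and the `optimize` callback are hypothetical
names.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <string>
#include <vector>

using Stmts = std::list<std::string>;

// Walk BLOCK once.  OPTIMIZE may rewrite the statement it is given and may
// insert new statements immediately before it; those are recorded too,
// followed by the rewritten statement, before the walk moves on.
// Returns the statements in the order they were recorded.
std::vector<std::string>
walk_block(Stmts &block,
           const std::function<void(Stmts &, Stmts::iterator)> &optimize) {
  std::vector<std::string> recorded;
  for (auto it = block.begin(); it != block.end(); ++it) {
    // Remember the statement before the current one (if any).
    const bool at_front = (it == block.begin());
    auto prev = at_front ? block.end() : std::prev(it);

    recorded.push_back(*it);
    optimize(block, it);

    // Anything now sitting between PREV and IT was inserted by OPTIMIZE.
    auto first_new = at_front ? block.begin() : std::next(prev);
    if (first_new != it) {
      for (auto n = first_new; n != it; ++n)
        recorded.push_back(*n);
      recorded.push_back(*it);  // look at the rewritten statement again
    }
  }
  return recorded;
}
```

A `std::list` is used because its iterators stay valid across insertions,
which loosely mirrors how a statement iterator keeps pointing at the same
statement while new ones are emitted in front of it.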

I also confirmed that the runtime error caused by this bug is gone with
this fix.

So, what’s the next step for this issue?

Will you commit this fix to gcc9 and gcc8 (we need it in gcc8)?

Or should I test this fix on my side and commit it to both gcc9 and gcc8?

Thanks.

Qing

> 
> Index: gcc/tree-ssa-dom.c
> ===
> --- gcc/tree-ssa-dom.c  (revision 269361)
> +++ gcc/tree-ssa-dom.c  (working copy)
> @@ -1482,8 +1482,25 @@ dom_opt_dom_walker::before_dom_children
>   edge taken_edge = NULL;
>   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> {
> +  gimple_stmt_iterator pgsi = gsi;
> +  gsi_prev (&pgsi);
>   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
>   taken_edge = this->optimize_stmt (bb, gsi);
> +  gimple_stmt_iterator npgsi = gsi;
> +  gsi_prev (&npgsi);
> +  /* Walk new stmts eventually inserted by DOM.  gsi_stmt (gsi) itself
> +while it may be changed should not have gotten a new definition.  */
> +  if (gsi_stmt (pgsi) != gsi_stmt (npgsi))
> +   do
> + {
> +   if (gsi_end_p (pgsi))
> + pgsi = gsi_start_bb (bb);
> +   else
> + gsi_next (&pgsi);
> +   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (pgsi),
> +false);
> + }
> +   while (gsi_stmt (pgsi) != gsi_stmt (gsi));
> }
> 
>   /* Now prepare to process dominated blocks.  */
> 
> 
> Richard.
> 
>> Thanks a lot.
>> 
>> Qing
>> 
>> 
>> 
>> Richard.
>> 
>>

Re: [PATCH] PR libstdc++/88996 Implement P0439R0 - C++20 - Make std::memory_order a scoped enumeration.

2019-03-04 Thread Ed Smith-Rowland


On 3/4/19 3:05 PM, Jonathan Wakely wrote:

On 04/02/19 14:26 +, Jonathan Wakely wrote:

On 24/01/19 14:50 -0500, Ed Smith-Rowland wrote:

PR libstdc++/88996 Implement P0439R0
Make std::memory_order a scoped enumeration.
* include/bits/atomic_base.h: For C++20 make memory_order a scoped enum,
add assignments for the old constants.  Adjust calls.


I think it would be more accurate to say "variables for the old
enumerators" rather than "assignments for the old constants".

As this only affects C++2a it's OK for trunk now, thanks.

Somehow I missed this.

Hi Ed, it looks like this wasn't committed yet. Could you please go
ahead and commit it to trunk (ideally with the changelog tweak I
suggested above).

Thanks!


Done.

Thank you.

Ed



Re: RFA: PATCH to gimple-fold.c for c++/80916, bogus "static but not defined" warning

2019-03-04 Thread Jason Merrill

On Thu, Feb 28, 2019 at 12:18 PM Jason Merrill  wrote:
> On Thu, Feb 28, 2019 at 11:58 AM Jan Hubicka  wrote:
> > Sorry for the late reply - I did not identify it as a patch to the symbol
> > table.  Indeed, can_refer_decl_in_current_unit_p is a good place to test
> > this.  Is there a reason to restrict this to functions with no body?
>
> If the function has a definition, then of course we can refer to it in
> its own unit.  Am I missing something?

Ah, yes, I was.  You mean, why do we care about DECL_INITIAL if
DECL_EXTERNAL is set?  I think I added that check out of caution.

This would be a more straightforward change:
commit 6af927c40585a4ff75a83b7cdabe8f9074a8d391
Author: Jason Merrill 
Date:   Fri Jan 25 09:09:17 2019 -0500

PR c++/80916 - spurious "static but not defined" warning.

Nothing can refer to an internal decl with no definition, so we shouldn't
treat such a decl as a possible devirtualization target.

* gimple-fold.c (can_refer_decl_in_current_unit_p): Return false
for an internal function with no definition.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 7ef5004f5f9..62d2e0abc26 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -121,9 +121,12 @@ can_refer_decl_in_current_unit_p (tree decl, tree from_decl)
   || !VAR_OR_FUNCTION_DECL_P (decl))
 return true;
 
-  /* Static objects can be referred only if they was not optimized out yet.  */
-  if (!TREE_PUBLIC (decl) && !DECL_EXTERNAL (decl))
+  /* Static objects can be referred only if they are defined and not optimized
+ out yet.  */
+  if (!TREE_PUBLIC (decl))
 {
+  if (DECL_EXTERNAL (decl))
+	return false;
   /* Before we start optimizing unreachable code we can be sure all
 	 static objects are defined.  */
   if (symtab->function_flags_ready)
diff --git a/gcc/testsuite/g++.dg/warn/unused-fn1.C b/gcc/testsuite/g++.dg/warn/unused-fn1.C
new file mode 100644
index 000..aabc01b3f44
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/unused-fn1.C
@@ -0,0 +1,16 @@
+// PR c++/80916
+// { dg-options "-Os -Wunused" }
+
+struct j {
+  virtual void dispatch(void *) {}
+};
+template <typename>
+struct i : j {
+  void dispatch(void *) {} // warning: 'void i< <template-parameter-1-1> >::dispatch(void*) [with <template-parameter-1-1> = {anonymous}::l]' declared 'static' but never defined [-Wunused-function]
+};
+namespace {
+  struct l : i<l> {};
+}
+void f(j *k) {
+  k->dispatch(0);
+}

Re: [PATCH] PR libstdc++/88996 Implement P0439R0 - C++20 - Make std::memory_order a scoped enumeration.

2019-03-04 Thread Jonathan Wakely


On 04/02/19 14:26 +, Jonathan Wakely wrote:

On 24/01/19 14:50 -0500, Ed Smith-Rowland wrote:

PR libstdc++/88996 Implement P0439R0
Make std::memory_order a scoped enumeration.
* include/bits/atomic_base.h: For C++20 make memory_order a scoped enum,
add assignments for the old constants.  Adjust calls.


I think it would be more accurate to say "variables for the old
enumerators" rather than "assignments for the old constants".
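
For readers unfamiliar with the P0439R0 shape, the "variables for the old
enumerators" wording corresponds to something like the sketch below. It lives
in a made-up namespace `sketch` so that it does not clash with the real
`<atomic>` header, it is not the libstdc++ source, and the helper
`is_sequentially_consistent` is purely illustrative. Inline variables assume
C++17 or later.

```cpp
namespace sketch {

// C++20 shape: a scoped enumeration...
enum class memory_order : int {
  relaxed, consume, acquire, release, acq_rel, seq_cst
};

// ...plus one variable per old unscoped enumerator name, so that existing
// code spelling memory_order_seq_cst keeps compiling.
inline constexpr memory_order memory_order_relaxed = memory_order::relaxed;
inline constexpr memory_order memory_order_acquire = memory_order::acquire;
inline constexpr memory_order memory_order_release = memory_order::release;
inline constexpr memory_order memory_order_seq_cst = memory_order::seq_cst;

// Example consumer; the name is hypothetical.
constexpr bool is_sequentially_consistent(memory_order m) {
  return m == memory_order::seq_cst;
}

}  // namespace sketch
```

Because the enumeration is scoped, its values no longer convert implicitly to
int, which is why the patch also has to adjust calls that relied on that.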

As this only affects C++2a it's OK for trunk now, thanks.


Hi Ed, it looks like this wasn't committed yet. Could you please go
ahead and commit it to trunk (ideally with the changelog tweak I
suggested above).

Thanks!

[PATCH] PR ada/89583, GNAT.Sockets.Bind_Socket fails with IPv4 address

2019-03-04 Thread Simon Wright

With GCC9, GNAT.Sockets includes support for IPv6. Sockaddr is an 
Unchecked_Union, which now includes IPv6 fields, bringing the total possible 
size to 28 bytes. The code in Bind_Socket currently calculates the length of 
the struct sockaddr to be passed to bind(2) as this size, which (at any rate on 
Darwin x86_64) results in failure (EINVAL).

This patch provides the required length explicitly from the socket's family.
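
The same idea expressed against the C socket API, as a minimal sketch: pick
the length from the address family rather than from the size of the largest
variant. The function name `sockaddr_length_for` is hypothetical, and the
sketch assumes a POSIX system providing `<netinet/in.h>`; it is not a
translation of the Ada patch.

```cpp
#include <cstddef>
#include <netinet/in.h>
#include <sys/socket.h>

// Length to pass to bind(2) for an address of the given family, or 0 for a
// family this sketch does not handle.
std::size_t sockaddr_length_for(int family) {
  switch (family) {
    case AF_INET:  return sizeof(sockaddr_in);
    case AF_INET6: return sizeof(sockaddr_in6);
    default:       return 0;
  }
}
```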

Tested by rebuilding the compiler with --disable-bootstrap and re-running the 
reproducer.

gcc/ada/Changelog:

2019-03-04 Simon Wright 

PR ada/89583
* libgnat/g-socket.adb (Bind_Socket): Calculate Len (the significant
  length of the Sockaddr) using the Family of the Address parameter.




pr89583.diff
Description: Binary data

Re: [PATCH, rs6000] Fix PR88845: ICE in lra_set_insn_recog_data

2019-03-04 Thread Segher Boessenkool

Hi Peter,

On Fri, Mar 01, 2019 at 01:33:27PM -0600, Peter Bergner wrote:
> PR88845 shows a problem where LRA spilled an input operand of an inline
> asm statement by calling our generic movsf pattern which ended up generating
> an insn we don't have a pattern for, so we ICE.  The insn was:
> 
>   (insn (set (reg:SF 125)
>(subreg:SF (reg:SI 124) 0)))
> 
> The problem is that rs6000_emit_move_si_sf_subreg() is disabled for LRA
> and so wasn't able to call gen_movsf_from_si() which generates the correct
> pattern for moving a 32-bit value from a GPR to a FPR.  The patch below
> fixes the issue by allowing rs6000_emit_move_si_sf_subreg() to be called
> during LRA as well as creating an expander so that when it is called during
> LRA, we can create the scratch register that is required for its associated
> splitter.  We have to do this, since LRA has already converted all of the
> scratches into real registers before it does any spilling.

> +  /* If LRA is generating a direct move from a GPR to a FPR,
> +  then the splitter is going to need a scratch register.  */
> +  rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]);
> +  XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode);
> +  emit_insn (insn);
> +  DONE;

This part isn't so great, needing detailed knowledge of the RTL generated
by some other pattern.  Maybe there already exists some function that
generates a register for every scratch in an insn, or you can make such
a function?

Okay for trunk with or without such an improvement.  Also backports, if
you want those.  But note (on trunk):

> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
> +/* { dg-options "-mcpu=power8 -O2" } */

These two lines should now be just

/* { dg-options "-mdejagnu-cpu=power8 -O2" } */


Thanks!


Segher

Re: [libstc++] Don't throw in std::assoc_legendre for m > l

2019-03-04 Thread Ed Smith-Rowland


This is actually PR libstdc++/86655.

Thank you for reminding me Andre.

I removed the throw for m > l and just return 0.  This is also done for
sph_legendre.
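
For reference, the behaviour can be sketched with a small standalone
recurrence. This is not the libstdc++ implementation: `assoc_legendre_sketch`
is a hypothetical helper, it omits the Condon-Shortley phase, and it is only
meant for |x| <= 1.

```cpp
#include <cmath>
#include <limits>

// P_l^m(x) without the Condon-Shortley phase, via upward recurrence on l.
// Hypothetical helper for illustration; intended for |x| <= 1 only.
double assoc_legendre_sketch(unsigned l, unsigned m, double x) {
  if (m > l)
    return 0.0;  // order above degree: the function is identically zero
  if (std::isnan(x))
    return std::numeric_limits<double>::quiet_NaN();

  // P_m^m(x) = (2m-1)!! (1 - x^2)^{m/2}
  double p_mm = 1.0;
  const double root = std::sqrt((1.0 - x) * (1.0 + x));
  double odd = 1.0;
  for (unsigned i = 1; i <= m; ++i) {
    p_mm *= odd * root;
    odd += 2.0;
  }
  if (l == m)
    return p_mm;

  // P_{m+1}^m(x) = x (2m+1) P_m^m(x)
  double p_prev = p_mm;
  double p_curr = x * (2.0 * m + 1.0) * p_mm;
  // (ll - m) P_ll^m = (2 ll - 1) x P_{ll-1}^m - (ll + m - 1) P_{ll-2}^m
  for (unsigned ll = m + 2; ll <= l; ++ll) {
    const double p_next =
        ((2.0 * ll - 1.0) * x * p_curr - (ll + m - 1.0) * p_prev) / (ll - m);
    p_prev = p_curr;
    p_curr = p_next;
  }
  return p_curr;
}
```

Returning 0 for m > l is consistent with the definition quoted in the patch:
differentiating a degree-l polynomial more than l times gives zero.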


This builds and tests clean on x86_64-linux.

OK?

Ed


2019-03-04  Edward Smith-Rowland  <3dw...@verizon.net>

PR libstdc++/86655 - std::assoc_legendre should not constrain
the value of m (or x).
* include/tr1/legendre_function.tcc (__assoc_legendre_p,
__sph_legendre): If order > degree, don't throw, return 0.
(__poly_legendre_p, __assoc_legendre_p): Don't constrain x either.
* testsuite/special_functions/02_assoc_legendre/pr86655.cc: New test.
* testsuite/special_functions/20_sph_legendre/pr86655.cc: New test.
* testsuite/tr1/5_numerical_facilities/special_functions/
02_assoc_legendre/pr86655.cc: New test.
* testsuite/tr1/5_numerical_facilities/special_functions/
22_sph_legendre/pr86655.cc: New test.

Index: include/tr1/legendre_function.tcc
===
--- include/tr1/legendre_function.tcc   (revision 269347)
+++ include/tr1/legendre_function.tcc   (working copy)
@@ -82,10 +82,7 @@
 __poly_legendre_p(unsigned int __l, _Tp __x)
 {
 
-  if ((__x < _Tp(-1)) || (__x > _Tp(+1)))
-std::__throw_domain_error(__N("Argument out of range"
-  " in __poly_legendre_p."));
-  else if (__isnan(__x))
+  if (__isnan(__x))
 return std::numeric_limits<_Tp>::quiet_NaN();
   else if (__x == +_Tp(1))
 return +_Tp(1);
@@ -126,11 +123,11 @@
  *   @f[
  * P_l^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x)
  *   @f]
+ *   @note @f$ P_l^m(x) = 0 @f$ if @f$ m > l @f$.
  * 
  *   @param  l  The degree of the associated Legendre function.
  *  @f$ l >= 0 @f$.
  *   @param  m  The order of the associated Legendre function.
- *  @f$ m <= l @f$.
  *   @param  x  The argument of the associated Legendre function.
  *  @f$ |x| <= 1 @f$.
  *   @param  phase  The phase of the associated Legendre function.
@@ -142,12 +139,8 @@
   _Tp __phase = _Tp(+1))
 {
 
-  if (__x < _Tp(-1) || __x > _Tp(+1))
-std::__throw_domain_error(__N("Argument out of range"
-  " in __assoc_legendre_p."));
-  else if (__m > __l)
-std::__throw_domain_error(__N("Degree out of range"
-  " in __assoc_legendre_p."));
+  if (__m > __l)
+return _Tp(0);
   else if (__isnan(__x))
 return std::numeric_limits<_Tp>::quiet_NaN();
   else if (__m == 0)
@@ -209,12 +202,12 @@
  *   and so this function is stable for larger differences of @f$ l @f$
  *   and @f$ m @f$.
  *   @note Unlike the case for __assoc_legendre_p the Condon-Shortley
- *   phase factor @f$ (-1)^m @f$ is present here.
+ * phase factor @f$ (-1)^m @f$ is present here.
+ *   @note @f$ Y_l^m(\theta) = 0 @f$ if @f$ m > l @f$.
  * 
  *   @param  l  The degree of the spherical associated Legendre function.
  *  @f$ l >= 0 @f$.
  *   @param  m  The order of the spherical associated Legendre function.
- *  @f$ m <= l @f$.
  *   @param  theta  The radian angle argument of the spherical associated
  *  Legendre function.
  */
@@ -227,11 +220,8 @@
 
   const _Tp __x = std::cos(__theta);
 
-  if (__l < __m)
-{
-  std::__throw_domain_error(__N("Bad argument "
-"in __sph_legendre."));
-}
+  if (__m > __l)
+return _Tp(0);
   else if (__m == 0)
 {
   _Tp __P = __poly_legendre_p(__l, __x);
@@ -284,7 +274,7 @@
   _Tp __y_lm = _Tp(0);
 
   // Compute Y_l^m, l > m+1, upward recursion on l.
-  for (unsigned int __ll = __m + 2; __ll <= __l; ++__ll)
+  for (int __ll = __m + 2; __ll <= __l; ++__ll)
 {
   const _Tp __rat1 = _Tp(__ll - __m) / _Tp(__ll + __m);
   const _Tp __rat2 = _Tp(__ll - __m - 1) / _Tp(__ll + __m - 1);
Index: testsuite/special_functions/02_assoc_legendre/pr86655.cc
===
--- testsuite/special_functions/02_assoc_legendre/pr86655.cc(nonexistent)
+++ testsuite/special_functions/02_assoc_legendre/pr86655.cc(working copy)
@@ -0,0 +1,56 @@
+// { dg-do run { target c++11 } }
+// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__ -ffp-contract=off" }
+
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or

Re: [PATCH] rs6000: Add -mdejagnu-cpu=

2019-03-04 Thread Segher Boessenkool

On Fri, Mar 01, 2019 at 07:33:27PM +0100, Jakub Jelinek wrote:
> We are talking about the
> http://git.savannah.gnu.org/cgit/dejagnu.git/commit/?id=5256bd82343000c76bc0e48139003f90b6184347
> change, right?

That's the patch I think, yes.

One thing I didn't mention is my patch fixed some ten failures, mostly
code that set things like

/* { dg-options "-maltivec -mcpu=power8" } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */

which does not work as you might expect.  (My patch removed all such
dg-skip-if lines).

Another reason is that we currently use -mpower8-vector (etc.) to select
power8, not just the power8 vector extensions, which should not even _have_
a user-accessible option.  To clean up those options we need to make -mcpu=
in the testsuite work better.

It's all by no means perfect.  But it's a clear improvement, in my mind.

Segher

Re: [LTO PATCH RFA] PR c++/88049 - ICE with undefined destructor and anon namespace.

2019-03-04 Thread Christophe Lyon

On Mon, 4 Mar 2019 at 17:37, Jason Merrill  wrote:
>
> On Mon, Mar 4, 2019 at 8:41 AM Christophe Lyon
>  wrote:
> > On Wed, 20 Feb 2019 at 02:58, Jason Merrill  wrote:
> > >
> > > A type in an anonymous namespace can never be merged with one from
> > > another translation unit, so a member of such a type is always its own
> > > prevailing decl.
> > >
> > > I don't really understand the LTO concept of prevailing decl, or why we
> > > don't get here if the destructor is defined, but this seems reasonable
> > > and fixes the bug.
> > >
> > > Tested x86_64-pc-linux-gnu.  OK for trunk?
> > >
> > > * lto-symtab.c (lto_symtab_prevailing_virtual_decl): Return early
> > > for a type in an anonymous namespace.
> > > ---
> > >  gcc/lto/lto-symtab.c |  8 ++--
> > >  gcc/testsuite/g++.dg/lto/pr88049_0.C | 16 
> > >  gcc/lto/ChangeLog|  6 ++
> > >  3 files changed, 28 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/g++.dg/lto/pr88049_0.C
> > >
> > > diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
> > > index 22da4c78b8c..343915c3cec 100644
> > > --- a/gcc/lto/lto-symtab.c
> > > +++ b/gcc/lto/lto-symtab.c
> > > @@ -1085,8 +1085,12 @@ lto_symtab_prevailing_virtual_decl (tree decl)
> > >  {
> > >if (DECL_ABSTRACT_P (decl))
> > >  return decl;
> > > -  gcc_checking_assert (!type_in_anonymous_namespace_p (DECL_CONTEXT (decl))
> > > -  && DECL_ASSEMBLER_NAME_SET_P (decl));
> > > +
> > > +  if (type_in_anonymous_namespace_p (DECL_CONTEXT (decl)))
> > > +/* There can't be any other declarations.  */
> > > +return decl;
> > > +
> > > +  gcc_checking_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
> > >
> > >symtab_node *n = symtab_node::get_for_asmname
> > >  (DECL_ASSEMBLER_NAME (decl));
> > > diff --git a/gcc/testsuite/g++.dg/lto/pr88049_0.C b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> > > new file mode 100644
> > > index 000..7ac3618c2c8
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> > > @@ -0,0 +1,16 @@
> > > +// PR c++/88049
> > > +// { dg-lto-do link }
> > > +// { dg-lto-options {{ -flto -O2 -w }} }
> > > +// { dg-extra-ld-options -r }
> > > +
> > > +template  class a;
> > > +class b {};
> > > +template  a d(char);
> > > +template  class a : public b {
> > > +public:
> > > +  virtual ~a();
> > > +};
> > > +namespace {
> > > +  class f;
> > > +  b c = d(int());
> > > +} // namespace
> >
> >
> > Hi Jason,
> >
> > On bare-metal targets (arm, aarch64 using newlib), I'm using dejagnu's
> > testglue, which makes this new test fail because it also uses
> > g++_tg.o, leading to:
> > /arm-none-eabi/bin/ld: warning: incremental linking of LTO and non-LTO
> > objects; using -flinker-output=nolto-rel which will bypass whole
> > program optimization
> >
> > Is there a way to avoid that (besides not using testglue) ?
>
> Does adding
>
> // { dg-require-effective-target lto_incremental }
>
> to the testcase help?
>

Yes, it does, thanks. I was unaware of this effective-target.

> Jason

Re: [C++ PATCH] Further fix for designated-initializer-list handling in overload resolution (PR c++/71446)

2019-03-04 Thread Jason Merrill


On 3/4/19 4:21 AM, Jakub Jelinek wrote:

On Sat, Mar 02, 2019 at 10:30:36AM +0100, Jakub Jelinek wrote:

I'm not really sure what to do for foo.  Perhaps if we find that case just
require that the order is ok already during build_aggr_conv and fail the
conversion otherwise?  We are outside of the standard in that case anyway.


Actually, seems if the designators can match corresponding type,
reshape_init* already fills in the ce->index even on elements without
original designators.  So, the following works fine for all the testcases I
came up with.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-04  Jakub Jelinek  

PR c++/71446
* call.c (field_in_pset): New function.
(build_aggr_conv): Handle CONSTRUCTOR_IS_DESIGNATED_INIT correctly.


OK.

Jason

V2 [PATCH] Optimize vector constructor

2019-03-04 Thread H.J. Lu

On Mon, Mar 04, 2019 at 12:55:04PM +0100, Richard Biener wrote:
> On Sun, Mar 3, 2019 at 10:13 PM H.J. Lu  wrote:
> >
> > On Sun, Mar 03, 2019 at 06:40:09AM -0800, Andrew Pinski wrote:
> > > On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu  wrote:
> > > >
> > > > For vector init constructor:
> > > >
> > > > ---
> > > > typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> > > >
> > > > __v4sf
> > > > foo (__v4sf x, float f)
> > > > {
> > > >   __v4sf y = { f, x[1], x[2], x[3] };
> > > >   return y;
> > > > }
> > > > ---
> > > >
> > > > we can optimize vector init constructor with vector copy or permute
> > > > followed by a single scalar insert:

> and you want to advance to the _1 = BIT_INSERT_EXPR here.  The easiest way
> is to emit a new stmt for _2 = copy ...; and do the set_rhs with the
> BIT_INSERT_EXPR.

Thanks for BIT_INSERT_EXPR suggestion.  I am testing this patch.
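
At source level, the equivalence the optimization relies on can be sketched as
follows, using the GCC/Clang vector extension from the testcase. The function
names are made up for illustration, and the sketch says nothing about which
instructions a given compiler actually emits.

```cpp
// GCC/Clang vector extension, as in the testcase quoted above.
typedef float v4sf __attribute__((vector_size(16)));

// Source-level form of the constructor being optimized: three elements are
// taken unchanged from X, one comes from a scalar.
v4sf build_from_elements(v4sf x, float f) {
  v4sf y = {f, x[1], x[2], x[3]};
  return y;
}

// Same value expressed as a whole-vector copy plus one scalar insert.
v4sf copy_then_insert(v4sf x, float f) {
  v4sf y = x;
  y[0] = f;
  return y;
}
```

Both functions produce the same vector for every input; the second form
corresponds to the copy followed by a single BIT_INSERT_EXPR.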


H.J.
---
We can optimize vector constructor with vector copy or permute followed
by a single scalar insert:

  __v4sf y;
  __v4sf D.1930;
  float _1;
  float _2;
  float _3;

   :
  _1 = BIT_FIELD_REF ;
  _2 = BIT_FIELD_REF ;
  _3 = BIT_FIELD_REF ;
  y_6 = {f_5(D), _3, _2, _1};
  return y_6;

with

 __v4sf y;
  __v4sf D.1930;
  float _1;
  float _2;
  float _3;
  vector(4) float _8;

   :
  _1 = BIT_FIELD_REF ;
  _2 = BIT_FIELD_REF ;
  _3 = BIT_FIELD_REF ;
  _8 = x_9(D);
  y_6 = BIT_INSERT_EXPR ;
  return y_6;

gcc/

PR tree-optimization/88828
* tree-ssa-forwprop.c (simplify_vector_constructor): Optimize
vector init constructor with vector copy or permute followed
by a single scalar insert.

gcc/testsuite/

PR tree-optimization/88828
* gcc.target/i386/pr88828-1a.c: New test.
* gcc.target/i386/pr88828-1b.c: Likewise.
* gcc.target/i386/pr88828-2.c: Likewise.
* gcc.target/i386/pr88828-3a.c: Likewise.
* gcc.target/i386/pr88828-3b.c: Likewise.
* gcc.target/i386/pr88828-3c.c: Likewise.
* gcc.target/i386/pr88828-3d.c: Likewise.
* gcc.target/i386/pr88828-4a.c: Likewise.
* gcc.target/i386/pr88828-4b.c: Likewise.
* gcc.target/i386/pr88828-5a.c: Likewise.
* gcc.target/i386/pr88828-5b.c: Likewise.
* gcc.target/i386/pr88828-6a.c: Likewise.
* gcc.target/i386/pr88828-6b.c: Likewise.
* gcc.target/i386/pr88828-7.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr88828-1a.c | 16 +
 gcc/testsuite/gcc.target/i386/pr88828-1b.c | 22 ++
 gcc/testsuite/gcc.target/i386/pr88828-2.c  | 17 +
 gcc/testsuite/gcc.target/i386/pr88828-3a.c | 16 +
 gcc/testsuite/gcc.target/i386/pr88828-3b.c | 18 +
 gcc/testsuite/gcc.target/i386/pr88828-3c.c | 22 ++
 gcc/testsuite/gcc.target/i386/pr88828-3d.c | 24 +++
 gcc/testsuite/gcc.target/i386/pr88828-4a.c | 17 +
 gcc/testsuite/gcc.target/i386/pr88828-4b.c | 20 ++
 gcc/testsuite/gcc.target/i386/pr88828-5a.c | 16 +
 gcc/testsuite/gcc.target/i386/pr88828-5b.c | 18 +
 gcc/testsuite/gcc.target/i386/pr88828-6a.c | 17 +
 gcc/testsuite/gcc.target/i386/pr88828-6b.c | 19 +
 gcc/testsuite/gcc.target/i386/pr88828-7.c  | 22 ++
 gcc/tree-ssa-forwprop.c| 84 +++---
 15 files changed, 338 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3d.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-7.c

diff --git a/gcc/testsuite/gcc.target/i386/pr88828-1a.c b/gcc/testsuite/gcc.target/i386/pr88828-1a.c
new file mode 100644
index 000..4ef1feab389
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88828-1a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4" } */
+/* { dg-final { scan-assembler "movss" } } */
+/* { dg-final { scan-assembler-not "movaps" } } */
+/* { dg-final { scan-assembler-not "movlhps" } } */
+/* { dg-final { scan-assembler-not "unpcklps" } } */
+/* { dg-final { scan-assembler-not "shufps" } } */
+
+typedef float __v4sf __attribute__ ((__vector_size__ (16)));
+
+__v4sf
+foo (__v4sf x, float f)
+{
+  __v4sf y = { f, x[1], x[2], x[3] };
+  return y;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr88828-1b.c 
b/gcc/testsuite/gcc.target/i386/pr88828-1b.c
new file mode 100644
index 000..2cddf4263f3
---

Re: [C++ Patch] PR 84605 ("[7/8/9 Regression] internal compiler error: in xref_basetypes, at cp/decl.c:13818")

2019-03-04 Thread Jason Merrill


On 3/4/19 6:03 AM, Paolo Carlini wrote:

Hi,

this error recovery regression too is rather easy to explain: since 
Jason's fix for c++/79580 (r245587) when defining a type from within an 
expression we pass ts_within_enclosing_non_class to xref_tag when we 
call it from cp_parser_class_head. Thus, in the ill-formed testcases at 
issue, cp_parser_class_head is called twice for the same 'type' returned 
by xref_tag, and the second time TYPE_BINFO is already set while 
TYPE_SIZE is still zero, thus the gcc_assert in xref_basetypes triggers. 
A rather straightforward way to give an error message again instead of 
crashing is rejecting TYPE_BEING_DEFINED too, in addition to 
COMPLETE_TYPE_P, in the check placed between the xref_tag and the 
xref_basetypes calls. The wording of the error message is probably a tad 
suboptimal in the TYPE_BEING_DEFINED case, but I'm not sure it's worth 
spending time and code on that, the issue appears anyway to be rather 
rare and all the testcases I have are error-recovery ones. Tested 
x86_64-linux.


OK.

Jason

Re: [PATCH] Fix PR89497

2019-03-04 Thread Richard Biener

On March 4, 2019 5:23:42 PM GMT+01:00, Christophe Lyon 
 wrote:
>On Fri, 1 Mar 2019 at 10:20, Richard Biener  wrote:
>>
>> On Wed, 27 Feb 2019, Richard Biener wrote:
>>
>> >
>> > CFG cleanup is now set up to perform trivial unreachable code
>> > elimination before doing anything that would require up-to-date
>> > SSA form.  Unfortunately a pending SSA update still will cause
>> > breakage to stmt folding triggered for example by basic-block
>> > merging.
>> >
>> > Fortunately it's now easy to properly "interleave" CFG cleanup
>> > and SSA update.
>> >
>> > Done as follows, bootstrap & regtest running on
>x86_64-unknown-linux-gnu.
>>
>> Testing went OK, two testcases need adjustments though.
>>
>> FAIL: gcc.dg/tree-ssa/reassoc-43.c scan-tree-dump-not reassoc2 "0 !=
>0"
>>
>> here we now end up with if (_20 != 0) matching.
>>
>> FAIL: g++.dg/tree-prof/devirt.C scan-tree-dump-times dom3 "folding
>virtual
>> function call to virtual unsigned int mozPersonalDictionary::AddRef"
>1
>> FAIL: g++.dg/tree-prof/devirt.C scan-tree-dump-times dom3 "folding
>virtual
>> function call to virtual unsigned int mozPersonalDictionary::_ZThn16"
>1
>>
>> here both foldings now already happen one pass earlier (during tracer
>> triggered CFG cleanup).  Previously the folding didn't happen because
>> the SSA names were marked for SSA update.
>>
>> Committed as follows.
>>
>
>Hi Richard,
>
>I've noticed a regression after you committed this patch:
>FAIL: gcc.dg/uninit-pred-8_b.c bogus warning (test for bogus messages,
>line 20)
>FAIL: gcc.dg/uninit-pred-8_b.c bogus warning (test for bogus messages,
>line 39)
>FAIL: gcc.dg/uninit-pred-8_b.c warning (test for warnings, line 42)
>
>It's unusual because I see that on arm-none-linux-gnueabihf
>--with-cpu cortex-a5
>--with-fpu vfpv3-d16-fp16
>
>but the same test still passes on the same target
>--with-cpu cortex-a9
>--with-fpu neon-fp16
>
>Any idea?

See PR89551. 

Richard. 

>Thanks,
>
>Christophe
>
>> Richard.
>>
>> 2019-03-01  Richard Biener  
>>
>> PR middle-end/89497
>> * tree-cfgcleanup.h (cleanup_tree_cfg): Add SSA update flags
>> argument, defaulted to zero.
>> * passes.c (execute_function_todo): Pass down SSA update
>flags
>> to cleanup_tree_cfg.
>> * tree-cfgcleanup.c: Include tree-into-ssa.h and
>tree-cfgcleanup.h.
>> (cleanup_tree_cfg_noloop): After cleanup_control_flow_pre
>update SSA
>> form if requested.
>> (cleanup_tree_cfg): Get and pass down SSA update flags.
>>
>> * gcc.dg/tree-ssa/reassoc-43.c: Avoid false match in regex.
>> * g++.dg/tree-prof/devirt.C: Scan tracer dump for foldings
>> that happen now earlier.
>>
>> Index: gcc/tree-cfgcleanup.h
>> ===
>> --- gcc/tree-cfgcleanup.h   (revision 269251)
>> +++ gcc/tree-cfgcleanup.h   (working copy)
>> @@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.
>>
>>  /* In tree-cfgcleanup.c  */
>>  extern bitmap cfgcleanup_altered_bbs;
>> -extern bool cleanup_tree_cfg (void);
>> +extern bool cleanup_tree_cfg (unsigned = 0);
>>  extern bool fixup_noreturn_call (gimple *stmt);
>>  extern bool delete_unreachable_blocks_update_callgraph (cgraph_node
>*dst_node,
>> bool
>update_clones);
>> Index: gcc/passes.c
>> ===
>> --- gcc/passes.c(revision 269251)
>> +++ gcc/passes.c(working copy)
>> @@ -1927,7 +1927,7 @@ execute_function_todo (function *fn, voi
>>/* Always cleanup the CFG before trying to update SSA.  */
>>if (flags & TODO_cleanup_cfg)
>>  {
>> -  cleanup_tree_cfg ();
>> +  cleanup_tree_cfg (flags & TODO_update_ssa_any);
>>
>>/* When cleanup_tree_cfg merges consecutive blocks, it may
>>  perform some simplistic propagation when removing single
>> Index: gcc/tree-cfgcleanup.c
>> ===
>> --- gcc/tree-cfgcleanup.c   (revision 269251)
>> +++ gcc/tree-cfgcleanup.c   (working copy)
>> @@ -44,6 +44,9 @@ along with GCC; see the file COPYING3.
>>  #include "gimple-fold.h"
>>  #include "tree-ssa-loop-niter.h"
>>  #include "cgraph.h"
>> +#include "tree-into-ssa.h"
>> +#include "tree-cfgcleanup.h"
>> +
>>
>>  /* The set of blocks in that at least one of the following changes
>happened:
>> -- the statement at the end of the block was changed
>> @@ -943,7 +946,7 @@ mfb_keep_latches (edge e)
>> Return true if the flowgraph was modified, false otherwise.  */
>>
>>  static bool
>> -cleanup_tree_cfg_noloop (void)
>> +cleanup_tree_cfg_noloop (unsigned ssa_update_flags)
>>  {
>>timevar_push (TV_TREE_CLEANUP_CFG);
>>
>> @@ -1023,6 +1026,8 @@ cleanup_tree_cfg_noloop (void)
>>
>>/* After doing the above SSA form should be valid (or an update
>SSA
>>   should be required).  */
>> +  if (ssa_update_flags)
>> +update_ssa

Re: [LTO PATCH RFA] PR c++/88049 - ICE with undefined destructor and anon namespace.

2019-03-04 Thread Jason Merrill

On Mon, Mar 4, 2019 at 8:41 AM Christophe Lyon
 wrote:
> On Wed, 20 Feb 2019 at 02:58, Jason Merrill  wrote:
> >
> > A type in an anonymous namespace can never be merged with one from
> > another translation unit, so a member of such a type is always its own
> > prevailing decl.
> >
> > I don't really understand the LTO concept of prevailing decl, or why we 
> > don't
> > get here if the destructor is defined, but this seems reasonable and fixes 
> > the
> > bug.
> >
> > Tested x86_64-pc-linux-gnu.  OK for trunk?
> >
> > * lto-symtab.c (lto_symtab_prevailing_virtual_decl): Return early
> > for a type in an anonymous namespace.
> > ---
> >  gcc/lto/lto-symtab.c |  8 ++--
> >  gcc/testsuite/g++.dg/lto/pr88049_0.C | 16 
> >  gcc/lto/ChangeLog|  6 ++
> >  3 files changed, 28 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/lto/pr88049_0.C
> >
> > diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
> > index 22da4c78b8c..343915c3cec 100644
> > --- a/gcc/lto/lto-symtab.c
> > +++ b/gcc/lto/lto-symtab.c
> > @@ -1085,8 +1085,12 @@ lto_symtab_prevailing_virtual_decl (tree decl)
> >  {
> >if (DECL_ABSTRACT_P (decl))
> >  return decl;
> > -  gcc_checking_assert (!type_in_anonymous_namespace_p (DECL_CONTEXT (decl))
> > -  && DECL_ASSEMBLER_NAME_SET_P (decl));
> > +
> > +  if (type_in_anonymous_namespace_p (DECL_CONTEXT (decl)))
> > +/* There can't be any other declarations.  */
> > +return decl;
> > +
> > +  gcc_checking_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
> >
> >symtab_node *n = symtab_node::get_for_asmname
> >  (DECL_ASSEMBLER_NAME (decl));
> > diff --git a/gcc/testsuite/g++.dg/lto/pr88049_0.C 
> > b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> > new file mode 100644
> > index 000..7ac3618c2c8
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> > @@ -0,0 +1,16 @@
> > +// PR c++/88049
> > +// { dg-lto-do link }
> > +// { dg-lto-options {{ -flto -O2 -w }} }
> > +// { dg-extra-ld-options -r }
> > +
> > +template  class a;
> > +class b {};
> > +template  a d(char);
> > +template  class a : public b {
> > +public:
> > +  virtual ~a();
> > +};
> > +namespace {
> > +  class f;
> > +  b c = d(int());
> > +} // namespace
>
>
> Hi Jason,
>
> On bare-metal targets (arm, aarch64 using newlib), I'm using dejagnu's
> testglue, which makes this new test fail because it also uses
> g++_tg.o, leading to:
> /arm-none-eabi/bin/ld: warning: incremental linking of LTO and non-LTO
> objects; using -flinker-output=nolto-rel which will bypass whole
> program optimization
>
> Is there a way to avoid that (besides not using testglue) ?

Does adding

// { dg-require-effective-target lto_incremental }

to the testcase help?

Jason

RE: [committed][PATCH][GCC][AArch64] Make test options_set_10.c not run on native

2019-03-04 Thread Tamar Christina

Oh dear. I hadn't noticed that at all.

Thanks,
Corrected it.

gcc/testsuite/ChangeLog:

2019-03-04  Tamar Christina  

PR target/88530
* gcc.target/aarch64/options_set_10.c: Add native.

> -Original Message-
> From: Jakub Jelinek 
> Sent: Monday, March 4, 2019 15:52
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; James Greenhalgh
> ; Richard Earnshaw
> ; Marcus Shawcroft
> 
> Subject: Re: [committed][PATCH][GCC][AArch64] Make test
> options_set_10.c not run on native
> 
> On Mon, Mar 04, 2019 at 03:49:54PM +, Tamar Christina wrote:
> > Hi All,
> >
> > The test options_set_10.c shouldn't run when cross compiled.
> > In addition to gating it on linux I'm also gating it on native now.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > Cross compiled and regtested on aarch64-none-linux-gnu and no issues.
> >
> > Committed under the GCC obvious rules.
> >
> > Thanks,
> > Tamar
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2019-03-04  Tamar Christina  
> >
> > PR target/88530
> > * gcc.target/aarch64/options_set_10.c:
> 
> Missing description of what you've changed.
> 
>   Jakub

Re: [PATCH] Fix PR89497

2019-03-04 Thread Christophe Lyon

On Fri, 1 Mar 2019 at 10:20, Richard Biener  wrote:
>
> On Wed, 27 Feb 2019, Richard Biener wrote:
>
> >
> > CFG cleanup is now set up to perform trivial unreachable code
> > elimination before doing anything that would require up-to-date
> > SSA form.  Unfortunately a pending SSA update still will cause
> > breakage to stmt folding triggered for example by basic-block
> > merging.
> >
> > Fortunately it's now easy to properly "interleave" CFG cleanup
> > and SSA update.
> >
> > Done as follows, bootstrap & regtest running on x86_64-unknown-linux-gnu.
>
> Testing went OK, two testcases need adjustments though.
>
> FAIL: gcc.dg/tree-ssa/reassoc-43.c scan-tree-dump-not reassoc2 "0 != 0"
>
> here we now end up with if (_20 != 0) matching.
>
> FAIL: g++.dg/tree-prof/devirt.C scan-tree-dump-times dom3 "folding virtual
> function call to virtual unsigned int mozPersonalDictionary::AddRef" 1
> FAIL: g++.dg/tree-prof/devirt.C scan-tree-dump-times dom3 "folding virtual
> function call to virtual unsigned int mozPersonalDictionary::_ZThn16" 1
>
> here both foldings now already happen one pass earlier (during tracer
> triggered CFG cleanup).  Previously the folding didn't happen because
> the SSA names were marked for SSA update.
>
> Committed as follows.
>

Hi Richard,

I've noticed a regression after you committed this patch:
FAIL: gcc.dg/uninit-pred-8_b.c bogus warning (test for bogus messages, line 20)
FAIL: gcc.dg/uninit-pred-8_b.c bogus warning (test for bogus messages, line 39)
FAIL: gcc.dg/uninit-pred-8_b.c warning (test for warnings, line 42)

It's unusual because I see that on arm-none-linux-gnueabihf
--with-cpu cortex-a5
--with-fpu vfpv3-d16-fp16

but the same test still passes on the same target
--with-cpu cortex-a9
--with-fpu neon-fp16

Any idea?

Thanks,

Christophe

> Richard.
>
> 2019-03-01  Richard Biener  
>
> PR middle-end/89497
> * tree-cfgcleanup.h (cleanup_tree_cfg): Add SSA update flags
> argument, defaulted to zero.
> * passes.c (execute_function_todo): Pass down SSA update flags
> to cleanup_tree_cfg.
> * tree-cfgcleanup.c: Include tree-into-ssa.h and tree-cfgcleanup.h.
> (cleanup_tree_cfg_noloop): After cleanup_control_flow_pre update SSA
> form if requested.
> (cleanup_tree_cfg): Get and pass down SSA update flags.
>
> * gcc.dg/tree-ssa/reassoc-43.c: Avoid false match in regex.
> * g++.dg/tree-prof/devirt.C: Scan tracer dump for foldings
> that happen now earlier.
>
> Index: gcc/tree-cfgcleanup.h
> ===
> --- gcc/tree-cfgcleanup.h   (revision 269251)
> +++ gcc/tree-cfgcleanup.h   (working copy)
> @@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.
>
>  /* In tree-cfgcleanup.c  */
>  extern bitmap cfgcleanup_altered_bbs;
> -extern bool cleanup_tree_cfg (void);
> +extern bool cleanup_tree_cfg (unsigned = 0);
>  extern bool fixup_noreturn_call (gimple *stmt);
>  extern bool delete_unreachable_blocks_update_callgraph (cgraph_node 
> *dst_node,
> bool update_clones);
> Index: gcc/passes.c
> ===
> --- gcc/passes.c(revision 269251)
> +++ gcc/passes.c(working copy)
> @@ -1927,7 +1927,7 @@ execute_function_todo (function *fn, voi
>/* Always cleanup the CFG before trying to update SSA.  */
>if (flags & TODO_cleanup_cfg)
>  {
> -  cleanup_tree_cfg ();
> +  cleanup_tree_cfg (flags & TODO_update_ssa_any);
>
>/* When cleanup_tree_cfg merges consecutive blocks, it may
>  perform some simplistic propagation when removing single
> Index: gcc/tree-cfgcleanup.c
> ===
> --- gcc/tree-cfgcleanup.c   (revision 269251)
> +++ gcc/tree-cfgcleanup.c   (working copy)
> @@ -44,6 +44,9 @@ along with GCC; see the file COPYING3.
>  #include "gimple-fold.h"
>  #include "tree-ssa-loop-niter.h"
>  #include "cgraph.h"
> +#include "tree-into-ssa.h"
> +#include "tree-cfgcleanup.h"
> +
>
>  /* The set of blocks in that at least one of the following changes happened:
> -- the statement at the end of the block was changed
> @@ -943,7 +946,7 @@ mfb_keep_latches (edge e)
> Return true if the flowgraph was modified, false otherwise.  */
>
>  static bool
> -cleanup_tree_cfg_noloop (void)
> +cleanup_tree_cfg_noloop (unsigned ssa_update_flags)
>  {
>timevar_push (TV_TREE_CLEANUP_CFG);
>
> @@ -1023,6 +1026,8 @@ cleanup_tree_cfg_noloop (void)
>
>/* After doing the above SSA form should be valid (or an update SSA
>   should be required).  */
> +  if (ssa_update_flags)
> +update_ssa (ssa_update_flags);
>
>/* Compute dominator info which we need for the iterative process below.  
> */
>if (!dom_info_available_p (CDI_DOMINATORS))
> @@ -1125,9 +1130,9 @@ repair_loop_structures (void)
>  /* Cleanup cfg

Re: A bug in vrp_meet?

2019-03-04 Thread Qing Zhao

Richard,

thanks a lot for your suggested fix. 

I will try it.

Qing
> On Mar 4, 2019, at 5:45 AM, Richard Biener  wrote:
> 
> On Fri, Mar 1, 2019 at 10:02 PM Qing Zhao  wrote:
>> 
>> 
>> On Mar 1, 2019, at 1:25 PM, Richard Biener  
>> wrote:
>> 
>> On March 1, 2019 6:49:20 PM GMT+01:00, Qing Zhao  
>> wrote:
>> 
>> Jeff,
>> 
>> thanks a lot for the reply.
>> 
>> this is really helpful.
>> 
>> I double checked the dumped intermediate file for pass “dom3", and
>> located the following for _152:
>> 
>> BEFORE the pass “dom3”, there is no _152, the corresponding Block
>> looks like:
>> 
>>  [local count: 12992277]:
>> _98 = (int) ufcMSR_52(D);
>> k_105 = (sword) ufcMSR_52(D);
>> i_49 = _98 > 0 ? k_105 : 0;
>> 
>> ***During the pass “dom3”,  _152 is generated as follows:
>> 
>> Optimizing block #4
>> ….
>> Visiting statement:
>> i_49 = _98 > 0 ? k_105 : 0;
>> Meeting
>> [0, 65535]
>> and
>> [0, 0]
>> to
>> [0, 65535]
>> Intersecting
>> [0, 65535]
>> and
>> [0, 65535]
>> to
>> [0, 65535]
>> Optimizing statement i_49 = _98 > 0 ? k_105 : 0;
>> Replaced 'k_105' with variable '_98'
>> gimple_simplified to _152 = MAX_EXPR <_98, 0>;
>> i_49 = _152;
>> Folded to: i_49 = _152;
>> LKUP STMT i_49 = _152
>>  ASGN i_49 = _152
>> 
>> then bb 4 becomes:
>> 
>>  [local count: 12992277]:
>> _98 = (int) ufcMSR_52(D);
>> k_105 = _98;
>> _152 = MAX_EXPR <_98, 0>;
>> i_49 = _152;
>> 
>> and all the i_49 are replaced with _152.
>> 
>> However, the value range info for _152 does not reflect the one for
>> i_49; it stays UNDEFINED.
>> 
>> is this the root problem?
>> 
>> 
>> It looks like DOM fails to visit stmts generated by simplification. Can you 
>> open a bug report with a testcase?
>> 
>> 
>> The problem is, it took me quite some time to come up with a small 
>> and independent testcase for this problem;
>> even a small change made the error disappear.
>> 
>> do you have any suggestion on this?  or can you give me some hint on how to 
>> fix this in DOM?  then I can try the fix on my side?
> 
> I remember running into similar issues in the past where I tried to
> extract temporary nonnull ranges from divisions.
> I have there
> 
> @@ -1436,11 +1436,16 @@ dom_opt_dom_walker::before_dom_children
>   m_avail_exprs_stack->pop_to_marker ();
> 
>   edge taken_edge = NULL;
> -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
> -{
> -  evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
> -  taken_edge = this->optimize_stmt (bb, gsi);
> -}
> +  gsi = gsi_start_bb (bb);
> +  if (!gsi_end_p (gsi))
> +while (1)
> +  {
> +   evrp_range_analyzer.record_def_ranges_from_stmt (gsi_stmt (gsi), 
> false);
> +   taken_edge = this->optimize_stmt (bb, );
> +   if (gsi_end_p (gsi))
> + break;
> +   evrp_range_analyzer.record_use_ranges_from_stmt (gsi_stmt (gsi));
> +  }
> 
>   /* Now prepare to process dominated blocks.  */
>   record_edge_info (bb);
> 
> OTOH the issue in your case is that fold emits new stmts before gsi but the
> above loop will never look at them.  See tree-ssa-forwprop.c for code how
> to deal with this (setting a pass-local flag on stmts visited and walking back
> to unvisited, newly inserted ones).  The fold_stmt interface could in theory
> also be extended to insert new stmts on a sequence passed to it so the
> caller would be responsible for inserting them into the IL and could then
> more easily revisit them (but that's a bigger task).
> 
> So, does the following help?
> 
> Index: gcc/tree-ssa-dom.c
> ===
> --- gcc/tree-ssa-dom.c  (revision 269361)
> +++ gcc/tree-ssa-dom.c  (working copy)
> @@ -1482,8 +1482,25 @@ dom_opt_dom_walker::before_dom_children
>   edge taken_edge = NULL;
>   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
> {
> +  gimple_stmt_iterator pgsi = gsi;
> +  gsi_prev ();
>   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
>   taken_edge = this->optimize_stmt (bb, gsi);
> +  gimple_stmt_iterator npgsi = gsi;
> +  gsi_prev ();
> +  /* Walk new stmts eventually inserted by DOM.  gsi_stmt (gsi) itself
> +while it may be changed should not have gotten a new definition.  */
> +  if (gsi_stmt (pgsi) != gsi_stmt (npgsi))
> +   do
> + {
> +   if (gsi_end_p (pgsi))
> + pgsi = gsi_start_bb (bb);
> +   else
> + gsi_next ();
> +   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (pgsi),
> +false);
> + }
> +   while (gsi_stmt (pgsi) != gsi_stmt (gsi));
> }
> 
>   /* Now prepare to process dominated blocks.  */
> 
> 
> Richard.
> 
>> Thanks a lot.
>> 
>> Qing
>> 
>> 
>> 
>> Richard.
>> 
>>
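
For readers of the archive: the idea Richard describes above (statements that folding inserts before the current iterator are never seen by a plain forward loop, so the walker has to step back to them) can be modelled outside GCC roughly as follows. This is only a sketch under stated assumptions: a std::list of ints stands in for the statement sequence of a basic block, the `transform` callback stands in for optimize_stmt and is assumed to insert only before the current position and never to remove anything, and the names `Stmts` and `walk_with_revisit` are invented here. It is not a transcription of the proposed tree-ssa-dom.c change.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for the statements of one basic block.
using Stmts = std::list<int>;

// Walk BLOCK front to back.  TRANSFORM may insert new elements directly
// before the current one (assumption: it never erases anything).  Every
// element, including freshly inserted ones, is "recorded" exactly once;
// the recording order is returned.
std::vector<int>
walk_with_revisit (Stmts &block,
                   const std::function<void (Stmts &, Stmts::iterator)> &transform)
{
  std::vector<int> recorded;
  for (auto it = block.begin (); it != block.end (); ++it)
    {
      // Remember the predecessor before the transform runs, so that
      // anything inserted between it and the current element can be found.
      const bool had_prev = it != block.begin ();
      auto prev = had_prev ? std::prev (it) : block.end ();

      recorded.push_back (*it);
      transform (block, it);

      // std::list iterators stay valid across insertions, so the first
      // possibly-new element is the one right after the old predecessor.
      auto first_new = had_prev ? std::next (prev) : block.begin ();
      for (auto p = first_new; p != it; ++p)
        recorded.push_back (*p);
    }
  return recorded;
}
```

With a block {1, 2, 3} and a transform that inserts 20 before the element 2, a plain forward loop would record only 1, 2, 3, whereas the walk above also records 20.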

Re: [committed][PATCH][GCC][AArch64] Make test options_set_10.c not run on native

2019-03-04 Thread Jakub Jelinek

On Mon, Mar 04, 2019 at 03:49:54PM +, Tamar Christina wrote:
> Hi All,
> 
> The test options_set_10.c shouldn't run when cross compiled.
> In addition to gating it on linux I'm also gating it on native now.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Cross compiled and regtested on aarch64-none-linux-gnu and no issues.
> 
> Committed under the GCC obvious rules.
> 
> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-03-04  Tamar Christina  
> 
>   PR target/88530
>   * gcc.target/aarch64/options_set_10.c:

Missing description of what you've changed.

Jakub

[committed][PATCH][GCC][AArch64] Make test options_set_10.c not run on native

2019-03-04 Thread Tamar Christina

Hi All,

The test options_set_10.c shouldn't run when cross compiled.
In addition to gating it on linux I'm also gating it on native now.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Cross compiled and regtested on aarch64-none-linux-gnu and no issues.

Committed under the GCC obvious rules.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

2019-03-04  Tamar Christina  

PR target/88530
* gcc.target/aarch64/options_set_10.c:

-- 
diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_10.c b/gcc/testsuite/gcc.target/aarch64/options_set_10.c
index 5ffe83c199165dd4129814674297056bdf27cd83..1fc8aa86fd6ef7a7a8f502be149f07514091eccd 100644
--- a/gcc/testsuite/gcc.target/aarch64/options_set_10.c
+++ b/gcc/testsuite/gcc.target/aarch64/options_set_10.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target "aarch64*-*-linux*" } } */
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
 /* { dg-additional-options "-mcpu=native" } */
 
 int main ()

Re: [PATCH] Handle timeout warnings in dg-extract-results

2019-03-04 Thread Christophe Lyon

On Tue, 19 Feb 2019 at 11:29, Christophe Lyon
 wrote:
>
> On Tue, 19 Feb 2019 at 10:28, Christophe Lyon
>  wrote:
> >
> > On Mon, 18 Feb 2019 at 21:12, Rainer Orth  
> > wrote:
> > >
> > > Hi Christophe,
> > >
> > > > dg-extract-results currently moves lines like
> > > > WARNING: program timed out
> > > > at the end of each .exp section when it generates .sum files.
> > > >
> > > > This is because it sorts its output based on the 2nd field, which is
> > > > normally the testname as in:
> > > > FAIL: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> > > > -fno-use-linker-plugin -flto-partition=none  execution test
> > > >
> > > > As you can notice 'program' comes after
> > > > gcc.c-torture/execute/20020129-1.c alphabetically, and generally after
> > > > most (all?) GCC testnames.
> > > >
> > > > This is a bit of a pain when trying to handle transient test failures
> > > > because you can no longer match such a WARNING line to its FAIL
> > > > counterpart.
> > > >
> > > > The attached patch changes this behavior by replacing the line
> > > > WARNING: program timed out
> > > > with
> > > > WARNING: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> > > > -fno-use-linker-plugin -flto-partition=none  execution test program
> > > > timed out
> > > >
> > > > The effect is that this line will now appear immediately above the
> > > > FAIL: gcc.c-torture/execute/20020129-1.c   -O2 -flto
> > > > -fno-use-linker-plugin -flto-partition=none  execution test
> > > > so that it's easier to match them.
> > > >
> > > >
> > > > I'm not sure how much people depend on the .sum format, I also
> > > > considered emitting
> > > > WARNING: program timed out gcc.c-torture/execute/20020129-1.c   -O2
> > > > -flto -fno-use-linker-plugin -flto-partition=none  execution test
> > > >
> > > > I also restricted the patch to handling only 'program timed out'
> > > > cases, to avoid breaking other things.
> > > >
> > > > I considered fixing this in Dejagnu, but it seemed more complicated,
> > > > and would delay adoption in GCC anyway.
> > > >
> > > > What do people think about this?
> > >
> > > I just had a case where your patch broke the generation of go.sum.
> > > This is on Solaris 11.5 with python 2.7.15:
> > >
> > > ro@colima 68 > /bin/ksh 
> > > /vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.sh 
> > > testsuite/go*/*.sum.sep > testsuite/go/go.sum
> > > Traceback (most recent call last):
> > >   File 
> > > "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 
> > > 605, in 
> > > Prog().main()
> > >   File 
> > > "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 
> > > 569, in main
> > > self.parse_file (filename, file)
> > >   File 
> > > "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 
> > > 427, in parse_file
> > > num_variations)
> > >   File 
> > > "/vol/gcc/src/hg/trunk/local/gcc/../contrib/dg-extract-results.py", line 
> > > 311, in parse_run
> > > first_key = key
> > > UnboundLocalError: local variable 'key' referenced before assignment
> > >
> > > Before your patch, key cannot have been undefined, now it is.  I've
> > > verified this by removing the WARNING: lines from the two affected
> > > go.sum.sep files and now go.sum creation just works fine.
> > >
> >
> > Sorry for the breakage.
> >
> > Can you send me the .sum that causes the problem so that I can reproduce it?
> >
>
> So the problem happens when a WARNING is the first result of a new harness.
> This is fixed by the attached dg-extract-results.patch2.txt.
>
> While looking at it, I noticed that the ordering wasn't right with the
> shell version,
> though I did test it before sending the previous patch.
> The attached dg-extract-results.patch1.txt makes sure the WARNING: line
> appears before the following testcase with the shell version too.
>
> Are both OK?
>

Ping?
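
For anyone picking up the thread here, the ordering issue from the original message can be modelled with a short sketch. It assumes a simplified sort key, namely the second whitespace-separated field of each line; the real script's key is not reproduced here, and the helper names `second_field` and `sort_by_second_field` are made up for the illustration.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Return the second whitespace-separated field of a .sum line
// (normally the test name), or an empty string if there is none.
std::string
second_field (const std::string &line)
{
  std::istringstream in (line);
  std::string status, name;
  in >> status >> name;
  return name;
}

// Stable-sort result lines by their second field only.  Lines with equal
// keys keep their relative input order.
std::vector<std::string>
sort_by_second_field (std::vector<std::string> lines)
{
  std::stable_sort (lines.begin (), lines.end (),
                    [] (const std::string &a, const std::string &b)
                    { return second_field (a) < second_field (b); });
  return lines;
}
```

In this model "WARNING: program timed out" has the key "program", which sorts after any key starting with "gcc.", so the warning drifts away from its FAIL line. Once the warning line carries the test name as its second field, both lines share the same key and the stable sort keeps the WARNING immediately before the FAIL.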


> Christophe
>
>
> > Thanks
> >
> > Christophe
> >
> > > Rainer
> > >
> > > --
> > > -
> > > Rainer Orth, Center for Biotechnology, Bielefeld University

[PING][PATCH, asmcons] Fix PR rtl-optimization/89313: ICE in process_alt_operands, at lra-constraints.c:2962

2019-03-04 Thread Peter Bergner

I'd like to ping the following patch:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01728.html

Peter


gcc/
PR rtl-optimization/89313
* function.c (matching_constraint_num): New static function.
(match_asm_constraints_1): Use it.  Fixup white space and comment.
Don't replace inputs with non-matching constraints which conflict
with early clobber outputs.

gcc/testsuite/
PR rtl-optimization/89313
* gcc.dg/pr89313.c: New test.

Re: V2 [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-04 Thread Uros Bizjak

On Mon, Mar 4, 2019 at 2:54 PM H.J. Lu  wrote:
>
> On Sun, Mar 03, 2019 at 10:34:29PM +0100, Uros Bizjak wrote:
> > On Sun, Mar 3, 2019 at 10:18 PM H.J. Lu  wrote:
> > >
> > > On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak  wrote:
> > > >
> > > > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu  wrote:
> > > > >
> > > > > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > > > > when 32-bit indices are used as addresses, like in
> > > > >
> > > > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> > > > >
> > > > > 32-bit indices, 0xf7fa3010, is sign-extended to 0xf7fa3010 
> > > > > which
> > > > > is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
> > > > > for x32 if there is no base register nor symbol.
> > > > >
> > > > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> > > > >
> > > > > -Ofast -funroll-loops -march=haswell
> > > >
> > > > 1. Testcases 2 to 9 fail on fedora-29 with:
> > > >
> > > > In file included from /usr/include/features.h:452,
> > > >  from /usr/include/bits/libc-header-start.h:33,
> > > >  from /usr/include/stdlib.h:25,
> > > >  from 
> > > > /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
> > > >  from 
> > > > /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
> > > >  from 
> > > > /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
> > > >  from
> > > > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> > > > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > > > file or directory
> > >
> > > I will update tests to remove  "#include immintrin.h"
> > >
> > > > 2. Does the patch work with -maddress-mode={short,long}?
> > >
> > > Yes.
> > >
> > > > 3. The implementation is wrong. You should use operand substitution
> > > > with VSIB address as operand, not substitution without operand.
> > >
> > > How can I add an addr32 prefix with operand substitution?  This is
> > > very similar to "%^".  My updated patch will use "%^".
> >
> > Yes, using %^ is what I think would be the optimal solution. Other
> > than that, in your proposed patch, operand-less %_ scans the entire
> > current_output_insn to dig to the UNSPEC_VSIBADDR. You can just use
> > operand substitution, and do e.g. "%X2vgatherpf0..." where 'X'
> > processes operand 2 (vsib_address_operand) and conditionally outputs
> > addr32.
> >
> > BTW: In a new version of the patch, please specify what is changed
> > from the previous version. Otherwise, review of a new version is more
> > or less guesswork as to what changed.
> >
>
> Here is the updated patch.  The change is
>
> return "%P5vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";
>
> instead of
>
> return "%^vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";

Did I miss some version of the patch that introduced %^? You used %_
in your previous patch. Did you try with %^?

> We can't use the %X5 since %X5 is used on operands.

So, please introduce some other modifier ("X" was not to be taken
literally, but *some* letter). Why are you overloading 'P'?

I don't know why you are using operand 5 here; you can use operand 2 directly.

Uros.

> I also added a test for -maddress-mode=long.
>
>
> H.J.
> ---
> 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> when 32-bit indices are used as addresses, like in
>
> vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
>
> 32-bit indices, 0xf7fa3010, is sign-extended to 0xf7fa3010 which
> is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
> for x32 if there is no base register nor symbol.
>
> This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
>
> -Ofast -funroll-loops -march=haswell
>
> gcc/
>
> PR target/89523
> * config/i386/i386.c (ix86_print_operand): Handle UNSPEC_VSIBADDR
> instructions for '%P' to add addr32 prefix if required.
> * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend
> "%P5" to opcode.
> (*avx512pf_gatherpfdf_mask): Likewise.
> (*avx512pf_scatterpfsf_mask): Likewise.
> (*avx512pf_scatterpfdf_mask): Likewise.
> (*avx2_gathersi): Prepend "%P7" to opcode.
> (*avx2_gathersi_2): Prepend "%P6" to opcode.
> (*avx2_gatherdi): Prepend "%P7" to opcode.
> (*avx2_gatherdi_2): Prepend "%P6" to opcode.
> (*avx2_gatherdi_3): Prepend "%P7" to opcode.
> (*avx2_gatherdi_4): Prepend "%P6" to opcode.
> (*avx512f_gathersi): Prepend "%P5" to opcode.
> (*avx512f_gathersi_2): Prepend "%P6" to opcode.
> (*avx512f_gatherdi): Prepend "%P5" to opcode.
> (*avx512f_gatherdi_2): Likewise.
> (*avx512f_scattersi): Likewise.
> (*avx512f_scatterdi): Likewise.
>
> gcc/testsuite/
>
> PR target/89523
> * gcc.target/i386/pr89523-1a.c: New test.
> * gcc.target/i386/pr89523-1b.c: Likewise.
> * gcc.target/i386/pr89523-2.c: Likewise.
> *

V2 [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-04 Thread H.J. Lu

On Sun, Mar 03, 2019 at 10:34:29PM +0100, Uros Bizjak wrote:
> On Sun, Mar 3, 2019 at 10:18 PM H.J. Lu  wrote:
> >
> > On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak  wrote:
> > >
> > > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu  wrote:
> > > >
> > > > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > > > when 32-bit indices are used as addresses, like in
> > > >
> > > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> > > >
> > > > 32-bit indices, 0xf7fa3010, is sign-extended to 0xf7fa3010 which
> > > > is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
> > > > for x32 if there is no base register nor symbol.
> > > >
> > > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> > > >
> > > > -Ofast -funroll-loops -march=haswell
> > >
> > > 1. Testcases 2 to 9 fail on fedora-29 with:
> > >
> > > In file included from /usr/include/features.h:452,
> > >  from /usr/include/bits/libc-header-start.h:33,
> > >  from /usr/include/stdlib.h:25,
> > >  from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
> > >  from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
> > >  from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
> > >  from
> > > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> > > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > > file or directory
> >
> > I will update tests to remove  "#include immintrin.h"
> >
> > > 2. Does the patch work with -maddress-mode={short,long}?
> >
> > Yes.
> >
> > > 3. The implementation is wrong. You should use operand substitution
> > > with VSIB address as operand, not substitution without operand.
> >
> > How can I add an addr32 prefix with operand substitution?  This is
> > very similar to "%^".  My updated patch will use "%^".
> 
> Yes, using %^ is what I think would be the optimal solution. Other
> than that, in your proposed patch, operand-less %_ scans the entire
> current_output_insn to dig to the UNSPEC_VSIBADDR. You can just use
> operand substitution, and do e.g. "%X2vgatherpf0..." where 'X'
> processes operand 2 (vsib_address_operand) and conditionally outputs
> addr32.
> 
> BTW: In a new version of the patch, please specify what is changed
> from the previous version. Otherwise, review of a new version is more
> or less a guesswork what changed.
> 

Here is the updated patch.  The change is

return "%P5vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";

instead of

return "%^vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";

We can't use the %X5 since %X5 is used on operands.

I also added a test for -maddress-mode=long.


H.J.
---
32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

The 32-bit index 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010, which
is an invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
for x32 if there is neither a base register nor a symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

-Ofast -funroll-loops -march=haswell
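
To make the failure mode concrete, here is a small standalone sketch of the
address arithmetic (it is not part of the patch).  The function names are
hypothetical, and the second function assumes that the addr32 prefix makes the
address computation 32 bits wide, so the upper 32 bits of the result are zero.

```cpp
#include <cstdint>

// Effective address of one VSIB element when there is no base register,
// the scale is 1 and the displacement is 0: the 32-bit index is
// sign-extended to 64 bits before it is used as an address.
inline std::uint64_t vsib_address_64(std::uint32_t index)
{
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(index)));
}

// Assumed effect of the addr32 prefix (hypothetical model): the address is
// computed with a 32-bit address size, so only the low 32 bits survive.
inline std::uint64_t vsib_address_addr32(std::uint32_t index)
{
  return vsib_address_64(index) & 0xffffffffu;
}
```

With the index from the description, vsib_address_64 (0xf7fa3010) yields
0xfffffffff7fa3010, which is outside the x32 address space, while
vsib_address_addr32 yields 0xf7fa3010 again.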

gcc/

PR target/89523
* config/i386/i386.c (ix86_print_operand): Handle UNSPEC_VSIBADDR
instructions for '%P' to add addr32 prefix if required.
* config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend
"%P5" to opcode.
(*avx512pf_gatherpfdf_mask): Likewise.
(*avx512pf_scatterpfsf_mask): Likewise.
(*avx512pf_scatterpfdf_mask): Likewise.
(*avx2_gathersi): Prepend "%P7" to opcode.
(*avx2_gathersi_2): Prepend "%P6" to opcode.
(*avx2_gatherdi): Prepend "%P7" to opcode.
(*avx2_gatherdi_2): Prepend "%P6" to opcode.
(*avx2_gatherdi_3): Prepend "%P7" to opcode.
(*avx2_gatherdi_4): Prepend "%P6" to opcode.
(*avx512f_gathersi): Prepend "%P5" to opcode.
(*avx512f_gathersi_2): Prepend "%P6" to opcode.
(*avx512f_gatherdi): Prepend "%P5" to opcode.
(*avx512f_gatherdi_2): Likewise.
(*avx512f_scattersi): Likewise.
(*avx512f_scatterdi): Likewise.

gcc/testsuite/

PR target/89523
* gcc.target/i386/pr89523-1a.c: New test.
* gcc.target/i386/pr89523-1b.c: Likewise.
* gcc.target/i386/pr89523-2.c: Likewise.
* gcc.target/i386/pr89523-3.c: Likewise.
* gcc.target/i386/pr89523-4.c: Likewise.
* gcc.target/i386/pr89523-5.c: Likewise.
* gcc.target/i386/pr89523-6.c: Likewise.
* gcc.target/i386/pr89523-7.c: Likewise.
* gcc.target/i386/pr89523-8.c: Likewise.
* gcc.target/i386/pr89523-9.c: Likewise.
---
 gcc/config/i386/i386.c | 35 +++-
 gcc/config/i386/sse.md | 46 +++---
 gcc/testsuite/gcc.target/i386/pr89523-1a.c | 24 +++
 gcc/testsuite/gcc.target/i386/pr89523-1b.c |  7

Re: [PATCH] Fix PR89437

2019-03-04 Thread Richard Biener

On Mon, Mar 4, 2019 at 2:39 PM Wilco Dijkstra  wrote:
>
> Hi Richard,
>
> >On Thu, Feb 21, 2019 at 6:09 PM Wilco Dijkstra  
> >wrote:
> >>
> >> Hi Richard,
> >>
> >> >>Fix an issue with sinl (atanl (sqrtl (LDBL_MAX))) returning 0.0
> >> >>instead of 1.0 by using x < sqrtl (LDBL_MAX) in match.pd.
> >> >
> >> > Wasn't that an intermediate problem with the mpfr exponent range limiting?
> >> > Please check whether that's still needed.
> >>
> >> I tested it with trunk about an hour ago, and it included Jakub's patch.
> >> Are there other fixes outstanding which haven't been committed yet?
> >
> > Not that I know of.  Did we root-cause the bogus folding to 0.0?  Because
> > I don't really understand why using < can "fix" this...
>
> Yes, the underlying issue is that build_sinatan_real returns the first value
> that does overflow when squared. Maybe that wasn't intended, but using
> less-than on the first value that does overflow works. With my patch (now
> committed) the test passes in all rounding modes.
>
> Like I mentioned, in the future this check could use a much smaller value
> based on the size of the mantissa - that's safer since you're not close to
> infinity.
>
> > Latest trunk also still gives an assertion failure in mpc with the
> > gcc.dg/torture/builtin-math-5.c
> > which started at the same time as the other mpc/mpfr related issues:
> >
> > build/src/mpc/src/pow.c:631: MPC assertion failed: z_imag || mpfr_number_p (MPC_RE(u))
> > build/src/gcc/gcc/testsuite/gcc.dg/torture/builtin-math-5.c:95:3: internal compiler error: Aborted
> > 0x6725ab crash_signal
> > build/src/gcc/gcc/toplev.c:326
> >
> > Ick.  Is there a PR about this?
>
> This happens when using an old mpc (0.8.2). It's valid according to the
> configure check, however it works with the 1.0.3 version that
> download-prerequisites uses. Maybe we should increase the minimum mpc
> version in configure?

I guess it might be enough to adjust the recommended version and note the
caveats of using older ones in install.texi (IIRC we already do that to some
extent).

Richard.

> Wilco

re: add tsv110 pipeline scheduling

2019-03-04 Thread wuyuan (E)

Hi James,
Have you seen the patch I submitted last week? If the problems with the patch
have been fixed, I hope it can go into trunk soon. I look forward to your reply.
Thank you.


Best Regards,

wuyuan 


-Original Message-
From: wuyuan (E) 
Sent: February 23, 2019 21:28
To: 'James Greenhalgh' 
Cc: Kyrill Tkachov ; gcc-patches@gcc.gnu.org; 
Zhangyichao (AB) ; Zhanghaijian (A) 
; n...@arm.com; wufeng (O) ; 
Yangfei (Felix) 
Subject: Re: add tsv110 pipeline scheduling

Hi James,
Sorry for not responding to your email in time; I was delayed by the Chinese
New Year holiday and urgent work. The three questions in your last email came
from my misunderstanding of the pipeline.
On the first question: these instructions occupy both the tsv110_ls* and
tsv110_fsu* pipelines at the same time.
The reservation is rewritten as follows:
(define_insn_reservation
  "tsv110_neon_ld4_lane" 9
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_load4_all_lanes,neon_load4_all_lanes_q,\
   neon_load4_one_lane,neon_load4_one_lane_q"))
  "(tsv110_ls1 + tsv110_fsu1)|(tsv110_ls1 + tsv110_fsu2)|(tsv110_ls2 + tsv110_fsu1)|(tsv110_ls2 + tsv110_fsu2)")

On the second question: these instructions use either the tsv110_fsu1 or the
tsv110_fsu2 pipeline.
The reservation is rewritten as follows:
(define_insn_reservation  "tsv110_neon_abd_aba" 4
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_abd,neon_arith_acc"))
  "tsv110_fsu1|tsv110_fsu2")

On the third question: these instructions use either the tsv110_fsu1 or the
tsv110_fsu2 pipeline.
The reservation is rewritten as follows:
(define_insn_reservation  "tsv110_neon_abd_aba_q" 4
  (and (eq_attr "tune" "tsv110")
   (eq_attr "type" "neon_arith_acc_q"))
  "tsv110_fsu1|tsv110_fsu2")

In addition to the above changes, I asked hardware engineers and colleagues to
review my patch and fixed some errors. The detailed patch is as follows:

  * config/aarch64/aarch64-cores.def (tsv110): Change scheduling model.
  * config/aarch64/aarch64.md: Include "tsv110.md".
  * config/aarch64/tsv110.md: New file.

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index ed56e5e..82d91d6
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -105,7 +105,7 @@ AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_  AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 
8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa53, 0x41, 0xd4a, -1)
 
 /* HiSilicon ('H') cores. */
-AARCH64_CORE("tsv110",  tsv110, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,  
 0x48, 0xd01, -1)
+AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,  
 0x48, 0xd01, -1)
 
 /* ARMv8.4-A Architecture Processors.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md 
index b7cd9fc..861f059 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -361,6 +361,7 @@
 (include "thunderx.md")
 (include "../arm/xgene1.md")
 (include "thunderx2t99.md")
+(include "tsv110.md")
 
 ;; ---
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/aarch64/tsv110.md b/gcc/config/aarch64/tsv110.md
new file mode 100644
index 000..9d12839
--- /dev/null
+++ b/gcc/config/aarch64/tsv110.md
@@ -0,0 +1,708 @@
+;; tsv110 pipeline description
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+(define_automaton "tsv110")
+
+(define_attr "tsv110_neon_type"
+  "neon_arith_acc, neon_arith_acc_q,
+   neon_arith_basic, neon_arith_complex,
+   neon_reduc_add_acc, neon_multiply, neon_multiply_q,
+   neon_multiply_long, neon_mla, neon_mla_q, neon_mla_long,
+   neon_sat_mla_long, neon_shift_acc, neon_shift_imm_basic,
+

RE: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-03-04 Thread Tamar Christina

> -Original Message-
> From: Jakub Jelinek 
> Sent: Monday, March 4, 2019 13:38
> To: Christophe Lyon 
> Cc: Tamar Christina ; James Greenhalgh
> ; Kyrill Tkachov
> ; gcc-patches@gcc.gnu.org; nd
> ; Richard Earnshaw ; Marcus
> Shawcroft 
> Subject: Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored
> during native feature detection
> 
> On Mon, Mar 04, 2019 at 02:31:57PM +0100, Christophe Lyon wrote:
> > The new test fails with a cross-compiler, because:
> > FAIL: gcc.target/aarch64/options_set_10.c (test for excess errors)
> > Excess errors:
> > cc1: error: unknown value 'native' for -mcpu
> >
> > I don't know how to restrict tests to native compilers only.
> 
> { target native }
> perhaps?

Oh, I didn't know about that one; I only knew about isnative. I'll give it a
try and see. Thanks!

Regards,
Tamar

> 
>   Jakub

Re: [LTO PATCH RFA] PR c++/88049 - ICE with undefined destructor and anon namespace.

2019-03-04 Thread Christophe Lyon

On Wed, 20 Feb 2019 at 02:58, Jason Merrill  wrote:
>
> A type in an anonymous namespace can never be merged with one from
> another translation unit, so a member of such a type is always its own
> prevailing decl.
>
> I don't really understand the LTO concept of prevailing decl, or why we don't
> get here if the destructor is defined, but this seems reasonable and fixes the
> bug.
>
> Tested x86_64-pc-linux-gnu.  OK for trunk?
>
> * lto-symtab.c (lto_symtab_prevailing_virtual_decl): Return early
> for a type in an anonymous namespace.
> ---
>  gcc/lto/lto-symtab.c |  8 ++--
>  gcc/testsuite/g++.dg/lto/pr88049_0.C | 16 
>  gcc/lto/ChangeLog|  6 ++
>  3 files changed, 28 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/lto/pr88049_0.C
>
> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
> index 22da4c78b8c..343915c3cec 100644
> --- a/gcc/lto/lto-symtab.c
> +++ b/gcc/lto/lto-symtab.c
> @@ -1085,8 +1085,12 @@ lto_symtab_prevailing_virtual_decl (tree decl)
>  {
>if (DECL_ABSTRACT_P (decl))
>  return decl;
> -  gcc_checking_assert (!type_in_anonymous_namespace_p (DECL_CONTEXT (decl))
> -  && DECL_ASSEMBLER_NAME_SET_P (decl));
> +
> +  if (type_in_anonymous_namespace_p (DECL_CONTEXT (decl)))
> +/* There can't be any other declarations.  */
> +return decl;
> +
> +  gcc_checking_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
>
>symtab_node *n = symtab_node::get_for_asmname
>  (DECL_ASSEMBLER_NAME (decl));
> diff --git a/gcc/testsuite/g++.dg/lto/pr88049_0.C b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> new file mode 100644
> index 000..7ac3618c2c8
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lto/pr88049_0.C
> @@ -0,0 +1,16 @@
> +// PR c++/88049
> +// { dg-lto-do link }
> +// { dg-lto-options {{ -flto -O2 -w }} }
> +// { dg-extra-ld-options -r }
> +
> +template  class a;
> +class b {};
> +template  a d(char);
> +template  class a : public b {
> +public:
> +  virtual ~a();
> +};
> +namespace {
> +  class f;
> +  b c = d(int());
> +} // namespace


Hi Jason,

On bare-metal targets (arm, aarch64 using newlib), I'm using dejagnu's
testglue, which makes this new test fail because it also uses
g++_tg.o, leading to:
/arm-none-eabi/bin/ld: warning: incremental linking of LTO and non-LTO
objects; using -flinker-output=nolto-rel which will bypass whole
program optimization

Is there a way to avoid that (besides not using testglue) ?

Thanks,

Christophe

> diff --git a/gcc/lto/ChangeLog b/gcc/lto/ChangeLog
> index 6b183df3b0f..71a2a109e64 100644
> --- a/gcc/lto/ChangeLog
> +++ b/gcc/lto/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-02-18  Jason Merrill  
> +
> +   PR c++/88049 - ICE with undefined destructor and anon namespace.
> +   * lto-symtab.c (lto_symtab_prevailing_virtual_decl): Return early
> +   for a type in an anonymous namespace.
> +
>  2019-01-09  Sandra Loosemore  
>
> PR other/16615
>
> base-commit: 79ae32275d4a19a1fc6ffebec9ac15a8c94b0b8f
> --
> 2.20.1
>

Re: [PATCH PR89487]Avoid taking address of register variable in loop list

2019-03-04 Thread Jakub Jelinek

On Mon, Mar 04, 2019 at 05:33:41AM -0800, H.J. Lu wrote:
> > > PR tree-optimization/89487
> > > * gcc/testsuite/gcc.dg/tree-ssa/pr89487.c: New test.
> 
> gcc.dg/tree-ssa/pr89487.c:
> 
> /* { dg-do compile } */
> /* { dg-options "-O2 -ftree-loop-distribution" } */
> 
> void
> caml_interprete (void)
> {
>   register int *pc asm("%r15");   These are valid only for x86-64.
>   register int *sp asm("%r14");
>   int i;
> 
>   for (i = 0; i < 3; ++i)
>     *--sp = pc[i];
> }

It could perhaps #include "../pr87600.h", be guarded with
/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* s390*-*-* x86_64-*-* } } */
and use REG1/REG2 instead.

Jakub

Re: [PATCH] Fix PR89437

2019-03-04 Thread Wilco Dijkstra

Hi Richard,

>On Thu, Feb 21, 2019 at 6:09 PM Wilco Dijkstra  wrote:
>>
>> Hi Richard,
>>
>> >>Fix an issue with sinl (atanl (sqrtl (LDBL_MAX))) returning 0.0
>> >>instead of 1.0 by using x < sqrtl (LDBL_MAX) in match.pd.
>> >
>> > Wasn't that an intermediate problem with the mpfr exponent range limiting?
>> > Please check whether that's still needed.
>>
>> I tested it with trunk about an hour ago, and it included Jakub's patch.
>> Are there other fixes outstanding which haven't been committed yet?
>
> Not that I know of.  Did we root-cause the bogus folding to 0.0?  Because
> I don't really understand why using < can "fix" this...

Yes, the underlying issue is that build_sinatan_real returns the first value
that does overflow when squared. Maybe that wasn't intended, but using
less-than on the first value that does overflow works. With my patch (now
committed) the test passes in all rounding modes.

Like I mentioned, in the future this check could use a much smaller value
based on the size of the mantissa - that's safer since you're not close to
infinity.
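
For anyone following along, here is a minimal sketch of why the bogus 0.0
appears and why a strict comparison avoids it.  It is not the match.pd code;
the function names and the 'limit' parameter are hypothetical, and it assumes
the usual rewrite of sin (atan (x)) as x / sqrt (x*x + 1).

```cpp
#include <cmath>
#include <limits>

// Algebraic rewrite with no guard: for huge x, x * x overflows to +Inf,
// the square root is +Inf, and the quotient collapses to 0.
inline long double sin_atan_unguarded(long double x)
{
  return x / std::sqrt(x * x + 1.0L);
}

// Guarded form: only use the quotient strictly below 'limit' (a
// hypothetical cutoff chosen by the caller); at or above it, return +-1.
inline long double sin_atan_guarded(long double x, long double limit)
{
  if (std::fabs(x) < limit)
    return x / std::sqrt(x * x + 1.0L);
  return std::copysign(1.0L, x);
}
```

Called with LDBL_MAX, the unguarded version returns 0 while the guarded one
returns 1.  Any x that is not strictly below the cutoff takes the saturated
branch, which is the property the less-than comparison provides.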

> Latest trunk also still gives an assertion failure in mpc with the
> gcc.dg/torture/builtin-math-5.c
> which started at the same time as the other mpc/mpfr related issues:
>
> build/src/mpc/src/pow.c:631: MPC assertion failed: z_imag || mpfr_number_p (MPC_RE(u))
> build/src/gcc/gcc/testsuite/gcc.dg/torture/builtin-math-5.c:95:3: internal compiler error: Aborted
> 0x6725ab crash_signal
> build/src/gcc/gcc/toplev.c:326
>
> Ick.  Is there a PR about this?

This happens when using an old mpc (0.8.2). It's valid according to the
configure check, however it works with the 1.0.3 version that
download-prerequisites uses. Maybe we should increase the minimum mpc
version in configure?

Wilco

Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-03-04 Thread Jakub Jelinek

On Mon, Mar 04, 2019 at 02:31:57PM +0100, Christophe Lyon wrote:
> The new test fails with a cross-compiler, because:
> FAIL: gcc.target/aarch64/options_set_10.c (test for excess errors)
> Excess errors:
> cc1: error: unknown value 'native' for -mcpu
> 
> I don't know how to restrict tests to native compilers only.

{ target native }
perhaps?

Jakub

RE: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-03-04 Thread Tamar Christina

Hi Christophe,

> -Original Message-
> From: Christophe Lyon 
> Sent: Monday, March 4, 2019 13:32
> To: Tamar Christina 
> Cc: James Greenhalgh ; Jakub Jelinek
> ; Kyrill Tkachov ; gcc-
> patc...@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> 
> Subject: Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored
> during native feature detection
> 
> On Wed, 27 Feb 2019 at 18:32, Tamar Christina 
> wrote:
> >
> > Hi James,
> >
> > > -Original Message-
> > > From: James Greenhalgh 
> > > Sent: Wednesday, February 27, 2019 17:22
> > > To: Tamar Christina 
> > > Cc: Jakub Jelinek ; Kyrill Tkachov
> > > ; gcc-patches@gcc.gnu.org; nd
> > > ; Richard Earnshaw ;
> Marcus
> > > Shawcroft 
> > > Subject: Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored
> > > during native feature detection
> > >
> > > On Thu, Feb 07, 2019 at 04:43:24AM -0600, Tamar Christina wrote:
> > > > Hi All,
> > > >
> > > > Since this hasn't been reviewed yet anyway I've updated this patch
> > > > to also fix the memory leaks etc.
> > > >
> > > > --
> > > >
> > > > This patch makes the feature detection code for AArch64 GCC not
> > > > add features automatically when the feature had no hwcaps string
> > > > to match against.
> > > >
> > > > This means that -mcpu=native no longer adds feature flags such as
> > > > +profile.
> > > > The behavior wasn't noticed before because at the time +profile
> > > > was added a bug was preventing any feature bits from being added
> > > > by native detections.
> > > >
> > > > The loop has also been changed as Jakub specified in order to
> > > > avoid a memory leak that was present in the existing code and to
> > > > be slightly more efficient.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for trunk?
> > >
> > > OK. Is this also desirable for a backport?
> >
> > Yes I believe we have this problem in GCC8 as well the profile extensions.
> >
> > Kind regards,
> > Tamar
> >
> 
> Hi Tamar,
> 
> The new test fails with a cross-compiler, because:
> FAIL: gcc.target/aarch64/options_set_10.c (test for excess errors)
> Excess errors:
> cc1: error: unknown value 'native' for -mcpu
> 
> I don't know how to restrict tests to native compilers only.

Ah, thanks, I only cross-tested the elf builds. I'll fix it up with a new
testsuite directive that checks whether -mcpu=native compiles something.

Regards,
Tamar

> 
> Christophe
> 
> > >
> > > Thanks,
> > > James
> > >
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2019-02-07  Tamar Christina  
> > > >
> > > > PR target/88530
> > > > * config/aarch64/aarch64-option-extensions.def: Document it.
> > > > > * config/aarch64/driver-aarch64.c (host_detect_local_cpu):
> > > > > Skip feature if empty hwcaps.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2019-02-07  Tamar Christina  
> > > >
> > > > PR target/88530
> > > > * gcc.target/aarch64/options_set_10.c: New test.
> > > >

Re: [PATCH PR89487]Avoid taking address of register variable in loop list

2019-03-04 Thread H.J. Lu

On Fri, Mar 1, 2019 at 4:44 AM Richard Biener
 wrote:
>
> On Fri, Mar 1, 2019 at 7:54 AM bin.cheng  wrote:
> >
> > Hi,
> > This patch fixes PR89487 by following the comments in the PR.  It simply
> > avoids checking for runtime aliasing by versioning in loop distribution if
> > the address of a register variable may need to be taken.
> >
> > One thing I am not sure about is whether we should avoid generating the
> > data reference in the first place:
> > Creating dr for pc
> > analyze_innermost: success.
> > base_address: 
> > offset from base address: 0
> > constant offset from base address: 0
> > step: 0
> > base alignment: 8
> > base misalignment: 0
> > offset alignment: 128
> > step alignment: 128
> > base_object: pc
> > Here 'pc' is the register variable.
>
> Hm, I think the DR is ok-ish, we are generating DRs dependent on
> storage-order as well.
>
> > Bootstrap and test on x86_64, any comment?
>
> Patch is OK.
>
> Thanks,
> Richard.
>
> > Thanks,
> > bin
> > 2019-02-28  Bin Cheng  
> >
> > PR tree-optimization/89487
> > * tree-loop-distribution.c (has_nonaddressable_dataref_p): New.
> > (create_rdg_vertices): Compute has_nonaddressable_dataref_p.
> > (distribute_loop): Don't do runtime alias check if there is non-
> > addressable data reference.
> > * tree-ssa-loop-ivopts.c (may_be_nonaddressable_p): Check if VAR_DECL
> > is a register variable.
> >
> > 2018-02-28  Bin Cheng  
> >
> > PR tree-optimization/89487
> > * gcc/testsuite/gcc.dg/tree-ssa/pr89487.c: New test.

gcc.dg/tree-ssa/pr89487.c:

/* { dg-do compile } */
/* { dg-options "-O2 -ftree-loop-distribution" } */

void
caml_interprete (void)
{
  register int *pc asm("%r15");   These are valid only for x86-64.
  register int *sp asm("%r14");
  int i;

  for (i = 0; i < 3; ++i)
    *--sp = pc[i];
}


-- 
H.J.

Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-03-04 Thread Christophe Lyon

On Wed, 27 Feb 2019 at 18:32, Tamar Christina  wrote:
>
> Hi James,
>
> > -Original Message-
> > From: James Greenhalgh 
> > Sent: Wednesday, February 27, 2019 17:22
> > To: Tamar Christina 
> > Cc: Jakub Jelinek ; Kyrill Tkachov
> > ; gcc-patches@gcc.gnu.org; nd
> > ; Richard Earnshaw ; Marcus
> > Shawcroft 
> > Subject: Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored
> > during native feature detection
> >
> > On Thu, Feb 07, 2019 at 04:43:24AM -0600, Tamar Christina wrote:
> > > Hi All,
> > >
> > > Since this hasn't been reviewed yet anyway I've updated this patch to also
> > > fix the memory leaks etc.
> > >
> > > --
> > >
> > > This patch makes the feature detection code for AArch64 GCC not add
> > > features automatically when the feature had no hwcaps string to match
> > > against.
> > >
> > > This means that -mcpu=native no longer adds feature flags such as
> > > +profile.
> > > The behavior wasn't noticed before because at the time +profile was
> > > added a bug was preventing any feature bits from being added by native
> > > detections.
> > >
> > > The loop has also been changed as Jakub specified in order to avoid a
> > > memory leak that was present in the existing code and to be slightly more
> > > efficient.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for trunk?
> >
> > OK. Is this also desirable for a backport?
>
> Yes I believe we have this problem in GCC8 as well the profile extensions.
>
> Kind regards,
> Tamar
>

Hi Tamar,

The new test fails with a cross-compiler, because:
FAIL: gcc.target/aarch64/options_set_10.c (test for excess errors)
Excess errors:
cc1: error: unknown value 'native' for -mcpu

I don't know how to restrict tests to native compilers only.

Christophe

> >
> > Thanks,
> > James
> >
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 2019-02-07  Tamar Christina  
> > >
> > > PR target/88530
> > > * config/aarch64/aarch64-option-extensions.def: Document it.
> > > > * config/aarch64/driver-aarch64.c (host_detect_local_cpu): Skip
> > > > feature if empty hwcaps.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 2019-02-07  Tamar Christina  
> > >
> > > PR target/88530
> > > * gcc.target/aarch64/options_set_10.c: New test.
> > >

[PATCH] Remove redundant dg-do directive from test

2019-03-04 Thread Jonathan Wakely


* testsuite/26_numerics/bit/bitops.rot/rotl.cc: Remove bogus dg-do
directive.

Tested x86_64-linux, committed to trunk.

commit b265a758836e14c4bf0da54beecfb3ce24c022ec
Author: Jonathan Wakely 
Date:   Mon Mar 4 13:18:03 2019 +

Remove redundant dg-do directive from test

* testsuite/26_numerics/bit/bitops.rot/rotl.cc: Remove bogus dg-do
directive.

diff --git a/libstdc++-v3/testsuite/26_numerics/bit/bitops.rot/rotl.cc b/libstdc++-v3/testsuite/26_numerics/bit/bitops.rot/rotl.cc
index a7666fbd103..94be65c3d13 100644
--- a/libstdc++-v3/testsuite/26_numerics/bit/bitops.rot/rotl.cc
+++ b/libstdc++-v3/testsuite/26_numerics/bit/bitops.rot/rotl.cc
@@ -15,8 +15,6 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do run { target c++11 } }
-
 // { dg-options "-std=gnu++2a" }
 // { dg-do compile { target c++2a } }

[PATCH] Fix PR89572

2019-03-04 Thread Richard Biener



Bootstrapped & tested on x86_64-unknown-linux-gnu, applied.

Richard.

2019-03-04  Richard Biener  

PR middle-end/89572
* tree-scalar-evolution.c (get_loop_exit_condition): Use
safe_dyn_cast.

* gcc.dg/torture/pr89572.c: New testcase.
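
A short sketch of the difference, for readers not familiar with is-a.h.  As I
understand it, dyn_cast must not be handed a null pointer while safe_dyn_cast
tolerates one, and last_stmt can return null for a block without statements.
The types and helper names below are hypothetical stand-ins, not GCC's gimple
classes.

```cpp
// Hypothetical statement hierarchy with a kind tag, standing in for gimple.
enum stmt_kind { KIND_COND, KIND_ASSIGN };

struct stmt
{
  explicit stmt(stmt_kind k) : kind(k) {}
  stmt_kind kind;
};

struct cond_stmt : stmt
{
  cond_stmt() : stmt(KIND_COND) {}
};

// Checked downcast in the style of dyn_cast: the argument must be non-null,
// because the kind tag is read unconditionally.
inline cond_stmt *dyn_cast_cond(stmt *s)
{
  return s->kind == KIND_COND ? static_cast<cond_stmt *>(s) : nullptr;
}

// Null-tolerant variant in the style of safe_dyn_cast: a null argument
// simply yields a null result.
inline cond_stmt *safe_dyn_cast_cond(stmt *s)
{
  return s ? dyn_cast_cond(s) : nullptr;
}
```

Passing a null statement to safe_dyn_cast_cond returns null, whereas
dyn_cast_cond would dereference it.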

Index: gcc/tree-scalar-evolution.c
===
--- gcc/tree-scalar-evolution.c (revision 269361)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -910,7 +910,7 @@ get_loop_exit_condition (const struct lo
   gimple *stmt;
 
   stmt = last_stmt (exit_edge->src);
-  if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
+  if (gcond *cond_stmt = safe_dyn_cast <gcond *> (stmt))
res = cond_stmt;
 }
 
Index: gcc/testsuite/gcc.dg/torture/pr89572.c
===
--- gcc/testsuite/gcc.dg/torture/pr89572.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89572.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-finline-functions" } */
+
+int vh, it, k1;
+
+void
+vn (void)
+{
+  ++vh;
+  if (vh == 0 && it == 0)
+k1 = -k1;
+}
+
+__attribute__ ((returns_twice)) void
+ef (int *uw)
+{
+  while (uw != (void *) 0)
+{
+  vn ();
+  *uw = 0;
+}
+}
+
+void
+gu (int *uw)
+{
+  ef (uw);
+}

[PATCH] Implement polymorphic_allocator for C++20 (P0339R6)

2019-03-04 Thread Jonathan Wakely


* include/std/memory_resource (polymorphic_allocator): Add default
template argument for C++20.
(polymorphic_allocator::allocate_bytes)
(polymorphic_allocator::deallocate_bytes)
(polymorphic_allocator::allocate_object)
(polymorphic_allocator::deallocate_object)
(polymorphic_allocator::new_object)
(polymorphic_allocator::delete_object): New member functions for
C++20.
* testsuite/20_util/polymorphic_allocator/allocate_object.cc: New
test.

Another piece of C++20, tested powerpc64le-linux, committed to trunk.
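
The overflow guard that allocate_object applies before multiplying the element
count by the element size can be illustrated on its own.  The helper below is
a hypothetical standalone sketch, not a libstdc++ function, and it throws
std::length_error directly where the library uses its internal throw helper.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: return n * sizeof(T), or throw if the product
// would not fit in std::size_t.
template<typename T>
  std::size_t
  checked_array_bytes(std::size_t n)
  {
    // Dividing the maximum by the element size gives the largest count
    // whose byte size is still representable.
    if (std::numeric_limits<std::size_t>::max() / sizeof(T) < n)
      throw std::length_error("checked_array_bytes: size overflow");
    return n * sizeof(T);
  }
```

The division happens before the multiplication, so the check itself cannot
wrap around.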

commit 3b59880300e00772f3d18c26544e0422332efd92
Author: Jonathan Wakely 
Date:   Mon Mar 4 11:42:44 2019 +

Implement polymorphic_allocator for C++20 (P0339R6)

* include/std/memory_resource (polymorphic_allocator): Add default
template argument for C++20.
(polymorphic_allocator::allocate_bytes)
(polymorphic_allocator::deallocate_bytes)
(polymorphic_allocator::allocate_object)
(polymorphic_allocator::deallocate_object)
(polymorphic_allocator::new_object)
(polymorphic_allocator::delete_object): New member functions for
C++20.
* testsuite/20_util/polymorphic_allocator/allocate_object.cc: New
test.

diff --git a/libstdc++-v3/include/std/memory_resource b/libstdc++-v3/include/std/memory_resource
index a212bccc9b1..7f1f0ca5e91 100644
--- a/libstdc++-v3/include/std/memory_resource
+++ b/libstdc++-v3/include/std/memory_resource
@@ -33,11 +33,13 @@
 
 #if __cplusplus >= 201703L
 
+#include   // numeric_limits
 #include   // align, allocator_arg_t, __uses_alloc
 #include  // pair, index_sequence
 #include   // vector
-#include  // size_t, max_align_t
+#include  // size_t, max_align_t, byte
 #include // shared_mutex
+#include 
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -55,8 +57,13 @@ namespace pmr
 
   class memory_resource;
 
+#if __cplusplus == 201703L
   template
 class polymorphic_allocator;
+#else // C++20
+  template
+class polymorphic_allocator;
+#endif
 
   // Global memory resources
   memory_resource* new_delete_resource() noexcept;
@@ -170,7 +177,59 @@ namespace pmr
   __attribute__((__nonnull__))
   { _M_resource->deallocate(__p, __n * sizeof(_Tp), alignof(_Tp)); }
 
-#if __cplusplus <= 201703L
+#if __cplusplus > 201703L
+  void*
+  allocate_bytes(size_t __nbytes,
+size_t __alignment = alignof(max_align_t))
+  { return _M_resource->allocate(__nbytes, __alignment); }
+
+  void
+  deallocate_bytes(void* __p, size_t __nbytes,
+  size_t __alignment = alignof(max_align_t))
+  { _M_resource->deallocate(__p, __nbytes, __alignment); }
+
+  template
+   _Up*
+   allocate_object(size_t __n = 1)
+   {
+ if ((std::numeric_limits::max() / sizeof(_Up)) < __n)
+   __throw_length_error("polymorphic_allocator::allocate_object");
+ return static_cast<_Up*>(allocate_bytes(__n * sizeof(_Up),
+ alignof(_Up)));
+   }
+
+  template
+   void
+   deallocate_object(_Up* __p, size_t __n = 1)
+   { deallocate_bytes(__p, __n * sizeof(_Up), alignof(_Up)); }
+
+  template
+   _Up*
+   new_object(_CtorArgs&&... __ctor_args)
+   {
+ _Up* __p = allocate_object<_Up>();
+ __try
+   {
+ construct(__p, std::forward<_CtorArgs>(__ctor_args)...);
+   }
+ __catch (...)
+   {
+ deallocate_object(__p);
+ __throw_exception_again;
+   }
+ return __p;
+   }
+
+  template
+   void
+   delete_object(_Up* __p)
+   {
+ destroy(__p);
+ deallocate_object(__p);
+   }
+#endif // C++2a
+
+#if __cplusplus == 201703L
   template
__attribute__((__nonnull__))
typename __not_pair<_Tp1>::type
diff --git a/libstdc++-v3/testsuite/20_util/polymorphic_allocator/allocate_object.cc b/libstdc++-v3/testsuite/20_util/polymorphic_allocator/allocate_object.cc
new file mode 100644
index 000..cbaccf6f5b0
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/polymorphic_allocator/allocate_object.cc
@@ -0,0 +1,80 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or

Re: [PATCH] Optimize vector init constructor

2019-03-04 Thread Richard Biener

On Sun, Mar 3, 2019 at 10:13 PM H.J. Lu  wrote:
>
> On Sun, Mar 03, 2019 at 06:40:09AM -0800, Andrew Pinski wrote:
> > On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu  wrote:
> > >
> > > For vector init constructor:
> > >
> > > ---
> > > typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> > >
> > > __v4sf
> > > foo (__v4sf x, float f)
> > > {
> > >   __v4sf y = { f, x[1], x[2], x[3] };
> > >   return y;
> > > }
> > > ---
> > >
> > > we can optimize vector init constructor with vector copy or permute
> > > followed by a single scalar insert:
> > >
> > >   __v4sf D.1912;
> > >   __v4sf D.1913;
> > >   __v4sf D.1914;
> > >   __v4sf y;
> > >
> > >   x.0_1 = x;
> > >   D.1912 = x.0_1;
> > >   _2 = D.1912;
> > >   D.1913 = _2;
> > >   BIT_FIELD_REF  = f;
> > >   y = D.1913;
> > >   D.1914 = y;
> > >   return D.1914;
> > >
> > > instead of
> > >
> > >   __v4sf D.1962;
> > >   __v4sf y;
> > >
> > >   _1 = BIT_FIELD_REF <x, 32, 32>;
> > >   _2 = BIT_FIELD_REF <x, 32, 64>;
> > >   _3 = BIT_FIELD_REF <x, 32, 96>;
> > >   y = {f, _1, _2, _3};
> > >   D.1962 = y;
> > >   return D.1962;
> > >
> > > gcc/
> > >
> > > PR tree-optimization/88828
> > > * gimplify.c (gimplify_init_constructor): Optimize vector init
> > > constructor with vector copy or permute followed by a single
> > > scalar insert.
> >
> >
> > Doing this here does not catch things like:
> > typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> >
> >
> > __v4sf
> > vector_init (float f0,float f1, float f2,float f3)
> > {
> >   __v4sf y = { f0, f1, f2, f3 };
> >return y;
> > }
> >
> > __v4sf
> > foo (__v4sf x, float f)
> > {
> >   return vector_init (f, x[1], x[2], x[3]) ;
> > }
> >
>
> Here is a patch for simplify_vector_constructor to optimize vector init
> constructor with vector copy or permute followed by a single scalar
> insert.

That's the correct place to fix this indeed.
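
For reference, the equivalence the patch relies on can be checked outside
the compiler.  The following is only a sketch: it assumes the GCC/Clang
vector_size extension, and the names init_elementwise, copy_then_insert,
same_lanes and self_check are made up for illustration.  It says nothing
about what simplify_vector_constructor actually emits.

```cpp
#include <cassert>

// GCC/Clang vector extension, as in the testcase from the thread.
typedef float v4sf __attribute__ ((__vector_size__ (16)));

// Element-wise constructor: the form the patch wants to simplify.
v4sf init_elementwise (v4sf x, float f)
{
  v4sf y = { f, x[1], x[2], x[3] };
  return y;
}

// Hand-written equivalent: vector copy followed by one scalar insert.
v4sf copy_then_insert (v4sf x, float f)
{
  v4sf y = x;
  y[0] = f;
  return y;
}

// Hypothetical helper: compare the four lanes one by one.
bool same_lanes (v4sf a, v4sf b)
{
  for (int i = 0; i < 4; ++i)
    if (a[i] != b[i])
      return false;
  return true;
}

// Small self-check; call it from main () to exercise both forms.
void self_check ()
{
  v4sf x = { 1.0f, 2.0f, 3.0f, 4.0f };
  assert (same_lanes (init_elementwise (x, 9.0f),
		      copy_then_insert (x, 9.0f)));
}
```

Both functions should produce the same four lanes for any input, which is
why a copy (or permute) plus a single insert is a valid replacement for
the CONSTRUCTOR.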

> But this doesn't work correctly:
>
> [hjl@gnu-cfl-2 pr88828]$ cat bar.i
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
>
> static __v4sf
> vector_init (float f0,float f1, float f2,float f3)
> {
>   __v4sf y = { f0, f1, f2, f3 };
>return y;
> }
>
> __v4sf
> foo (__v4sf x, float f)
> {
>   return vector_init (f, x[1], x[2], x[3]) ;
> }
> [hjl@gnu-cfl-2 pr88828]$ make bar.s
> /export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/xgcc 
> -B/export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/ -O2 -S 
> bar.i
> [hjl@gnu-cfl-2 pr88828]$ cat bar.s
> .file   "bar.i"
> .text
> .p2align 4
> .globl  foo
> .type   foo, @function
> foo:
> .LFB1:
> .cfi_startproc
> ret
> .cfi_endproc
> .LFE1:
> .size   foo, .-foo
> .ident  "GCC: (GNU) 9.0.1 20190303 (experimental)"
> .section.note.GNU-stack,"",@progbits
> [hjl@gnu-cfl-2 pr88828]$
>
> Scalar insert is missing.
> ---
>  gcc/tree-ssa-forwprop.c | 77 -
>  1 file changed, 69 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
> index eeb6281c652..b10cfccf7b8 100644
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -2008,7 +2008,7 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
>unsigned elem_size, i;
>unsigned HOST_WIDE_INT nelts;
>enum tree_code code, conv_code;
> -  constructor_elt *elt;
> +  constructor_elt *ce;
>bool maybe_ident;
>
>gcc_checking_assert (gimple_assign_rhs_code (stmt) == CONSTRUCTOR);
> @@ -2027,18 +2027,41 @@ simplify_vector_constructor (gimple_stmt_iterator 
> *gsi)
>orig[1] = NULL;
>conv_code = ERROR_MARK;
>maybe_ident = true;
> -  FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, elt)
> +
> +  tree rhs_vector = NULL;
> +  /* The single scalar element.  */
> +  tree scalar_element = NULL;
> +  unsigned int scalar_idx = 0;
> +  bool insert = false;
> +  unsigned int nscalars = 0;
> +  unsigned int nvectors = 0;
> +  FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, ce)
>  {
>tree ref, op1;
>
>if (i >= nelts)
> return false;
>
> -  if (TREE_CODE (elt->value) != SSA_NAME)
> +  if (TREE_CODE (ce->value) != SSA_NAME)
> return false;
> -  def_stmt = get_prop_source_stmt (elt->value, false, NULL);
> +  def_stmt = get_prop_source_stmt (ce->value, false, NULL);
>if (!def_stmt)
> -   return false;
> +   {
> + if ( gimple_nop_p (SSA_NAME_DEF_STMT (ce->value)))
> +   {
> + /* Only allow one single scalar insert.  */
> + if (nscalars != 0)
> +   return false;
> +
> + nscalars = 1;
> + insert = true;
> + scalar_idx = i;
> + scalar_element = ce->value;
> + continue;
> +   }
> + else
> +   return false;
> +   }
>code = gimple_assign_rhs_code (def_stmt);
>if (code == FLOAT_EXPR
>   || code ==

Re: A bug in vrp_meet?

2019-03-04 Thread Richard Biener

On Fri, Mar 1, 2019 at 10:02 PM Qing Zhao  wrote:
>
>
> On Mar 1, 2019, at 1:25 PM, Richard Biener  wrote:
>
> On March 1, 2019 6:49:20 PM GMT+01:00, Qing Zhao  wrote:
>
> Jeff,
>
> thanks a lot for the reply.
>
> this is really helpful.
>
> I double checked the dumped intermediate file for pass “dom3", and
> located the following for _152:
>
> BEFORE the pass “dom3”, there is no _152, the corresponding Block
> looks like:
>
> <bb 4> [local count: 12992277]:
> _98 = (int) ufcMSR_52(D);
> k_105 = (sword) ufcMSR_52(D);
> i_49 = _98 > 0 ? k_105 : 0;
>
> ***During the pass “dom3”, _152 is generated as follows:
>
> Optimizing block #4
> ….
> Visiting statement:
> i_49 = _98 > 0 ? k_105 : 0;
> Meeting
> [0, 65535]
> and
> [0, 0]
> to
> [0, 65535]
> Intersecting
> [0, 65535]
> and
> [0, 65535]
> to
> [0, 65535]
> Optimizing statement i_49 = _98 > 0 ? k_105 : 0;
> Replaced 'k_105' with variable '_98'
> gimple_simplified to _152 = MAX_EXPR <_98, 0>;
> i_49 = _152;
> Folded to: i_49 = _152;
> LKUP STMT i_49 = _152
>  ASGN i_49 = _152
>
> then bb 4 becomes:
>
> <bb 4> [local count: 12992277]:
> _98 = (int) ufcMSR_52(D);
> k_105 = _98;
> _152 = MAX_EXPR <_98, 0>;
> i_49 = _152;
>
> and all the i_49 are replaced with _152.
>
> However, the value range info for _152 does not reflect the one for
> i_49; it stays UNDEFINED.
>
> is this the root problem?
>
>
> It looks like DOM fails to visit stmts generated by simplification. Can you 
> open a bug report with a testcase?
>
>
> The problem is, It took me quite some time in order to come up with a small 
> and independent testcase for this problem,
> a little bit change made the error disappear.
>
> do you have any suggestion on this?  or can you give me some hint on how to 
> fix this in DOM?  then I can try the fix on my side?

I remember running into similar issues in the past where I tried to
extract temporary nonnull ranges from divisions.
I have there

@@ -1436,11 +1436,16 @@ dom_opt_dom_walker::before_dom_children
   m_avail_exprs_stack->pop_to_marker ();

   edge taken_edge = NULL;
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-{
-  evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
-  taken_edge = this->optimize_stmt (bb, gsi);
-}
+  gsi = gsi_start_bb (bb);
+  if (!gsi_end_p (gsi))
+while (1)
+  {
+   evrp_range_analyzer.record_def_ranges_from_stmt (gsi_stmt (gsi), false);
+   taken_edge = this->optimize_stmt (bb, );
+   if (gsi_end_p (gsi))
+ break;
+   evrp_range_analyzer.record_use_ranges_from_stmt (gsi_stmt (gsi));
+  }

   /* Now prepare to process dominated blocks.  */
   record_edge_info (bb);

OTOH the issue in your case is that fold emits new stmts before gsi but the
above loop will never look at them.  See tree-ssa-forwprop.c for code how
to deal with this (setting a pass-local flag on stmts visited and walking back
to unvisited, newly inserted ones).  The fold_stmt interface could in theory
also be extended to insert new stmts on a sequence passed to it so the
caller would be responsible for inserting them into the IL and could then
more easily revisit them (but that's a bigger task).
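
To make the problem concrete, here is a toy model of the walk, not GCC
code.  A basic block is modelled as a std::list of strings, and optimize ()
is a hypothetical stand-in for optimize_stmt that inserts a new statement
before the current one, the way fold does.  All names are invented for
the sketch.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

using Block = std::list<std::string>;

// Hypothetical stand-in for optimize_stmt: simplifying the COND_EXPR
// emits a new statement *before* the current one and rewrites it.
void optimize (Block &bb, Block::iterator it)
{
  if (*it == "i = a > 0 ? a : 0")
    {
      bb.insert (it, "t = MAX (a, 0)");
      *it = "i = t";
    }
}

// Plain forward walk: statements inserted before IT are never seen.
std::vector<std::string> naive_walk (Block bb)
{
  std::vector<std::string> recorded;
  for (auto it = bb.begin (); it != bb.end (); ++it)
    {
      recorded.push_back (*it);	// models record_ranges_from_stmt
      optimize (bb, it);
    }
  return recorded;
}

// Remember the predecessor, then revisit whatever appeared between it
// and the current statement after optimizing.
std::vector<std::string> revisiting_walk (Block bb)
{
  std::vector<std::string> recorded;
  for (auto it = bb.begin (); it != bb.end (); ++it)
    {
      bool at_front = (it == bb.begin ());
      auto prev = at_front ? bb.end () : std::prev (it);
      recorded.push_back (*it);
      optimize (bb, it);
      auto p = at_front ? bb.begin () : std::next (prev);
      for (; p != it; ++p)
	recorded.push_back (*p);	// newly inserted statements
    }
  return recorded;
}

// Small self-check; call it from main () to compare the two walks.
void self_check ()
{
  Block bb = { "a = (int) x", "i = a > 0 ? a : 0" };
  assert (naive_walk (bb).size () == 2);
  assert (revisiting_walk (bb).size () == 3);
}
```

The naive walk records two statements and never sees the MAX temporary,
which corresponds to _152 staying UNDEFINED; the second walk also records
the inserted statement.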

So, does the following help?

Index: gcc/tree-ssa-dom.c
===
--- gcc/tree-ssa-dom.c  (revision 269361)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -1482,8 +1482,25 @@ dom_opt_dom_walker::before_dom_children
   edge taken_edge = NULL;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 {
+  gimple_stmt_iterator pgsi = gsi;
+  gsi_prev (&pgsi);
   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
   taken_edge = this->optimize_stmt (bb, gsi);
+  gimple_stmt_iterator npgsi = gsi;
+  gsi_prev (&npgsi);
+  /* Walk new stmts eventually inserted by DOM.  gsi_stmt (gsi) itself
+while it may be changed should not have gotten a new definition.  */
+  if (gsi_stmt (pgsi) != gsi_stmt (npgsi))
+   do
+ {
+   if (gsi_end_p (pgsi))
+ pgsi = gsi_start_bb (bb);
+   else
+ gsi_next (&pgsi);
+   evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (pgsi),
+false);
+ }
+   while (gsi_stmt (pgsi) != gsi_stmt (gsi));
 }

   /* Now prepare to process dominated blocks.  */


Richard.

> Thanks a lot.
>
> Qing
>
>
>
> Richard.
>
>

[C++ Patch] PR 84605 ("[7/8/9 Regression] internal compiler error: in xref_basetypes, at cp/decl.c:13818")

2019-03-04 Thread Paolo Carlini


Hi,

this error recovery regression too is rather easy to explain: since 
Jason's fix for c++/79580 (r245587) when defining a type from within an 
expression we pass ts_within_enclosing_non_class to xref_tag when we 
call it from cp_parser_class_head. Thus, in the ill-formed testcases at 
issue, cp_parser_class_head is called twice for the same 'type' returned 
by xref_tag, and the second time TYPE_BINFO is already set while 
TYPE_SIZE is still zero, thus the gcc_assert in xref_basetypes triggers. 
A rather straightforward way to give again an error message instead of 
crashing is rejecting TYPE_BEING_DEFINED too, in addition to 
COMPLETE_TYPE_P, in the check placed between the xref_tag and the 
xref_basetypes calls. The wording of the error message is probably a tad 
suboptimal in the TYPE_BEING_DEFINED case, but I'm not sure it's worth 
spending time and code on that, the issue appears anyway to be rather 
rare and all the testcases I have are error-recovery ones. Tested 
x86_64-linux.


Thanks, Paolo.

//

/cp
2019-03-04  Paolo Carlini  

PR c++/84605
* parser.c (cp_parser_class_head): Reject TYPE_BEING_DEFINED too.

/testsuite
2019-03-04  Paolo Carlini  

PR c++/84605
* g++.dg/parse/crash69.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 269342)
+++ cp/parser.c (working copy)
@@ -24021,8 +24021,11 @@ cp_parser_class_head (cp_parser* parser,
   cp_parser_check_class_key (class_key, type);
 
   /* If this type was already complete, and we see another definition,
- that's an error.  */
-  if (type != error_mark_node && COMPLETE_TYPE_P (type))
+ that's an error.  Likewise if the type is already being defined:
+ this can happen, eg, when it's defined from within an expression 
+ (c++/84605).  */
+  if (type != error_mark_node
+  && (COMPLETE_TYPE_P (type) || TYPE_BEING_DEFINED (type)))
 {
   error_at (type_start_token->location, "redefinition of %q#T",
type);
Index: testsuite/g++.dg/parse/crash69.C
===
--- testsuite/g++.dg/parse/crash69.C(nonexistent)
+++ testsuite/g++.dg/parse/crash69.C(working copy)
@@ -0,0 +1,11 @@
+// PR c++/84605
+
+struct b {
+  int x(((struct b {})));  // { dg-error "expected|redefinition" }
+};
+
+struct c {
+  struct d {
+int x(((struct c {})));  // { dg-error "expected|redefinition" }
+  };
+};

[PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-04 Thread Richard Biener



The CFG cleanup change made us remove an extra forwarder which somehow
makes VRP jump threading go berserk.  Fortunately only on
logical-op-non-short-circuit=0 targets so the easy way to fix the
testcase is to force that our way.

Tested on powerpc64le-linux-gnu and x86_64-linux-gnu.

I'll hold off applying this for a bit in case Jeff wants to
analyze why/how we're doing extra jump threading just because
of the lack of that extra forwarder...

Richard.

2019-03-04  Richard Biener  

PR testsuite/89551
* gcc.dg/uninit-pred-8_b.c: Force logical-op-non-short-circuit
the way that makes the testcase PASS.

Index: gcc/testsuite/gcc.dg/uninit-pred-8_b.c
===
--- gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (revision 269361)
+++ gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (working copy)
@@ -1,6 +1,7 @@
-
 /* { dg-do compile } */
-/* { dg-options "-Wuninitialized -O2" } */
+/* ???  Jump threading makes a mess of the logical-op-non-short-circuit=0 case
+   so force it our way.  */
+/* { dg-options "-Wuninitialized -O2 --param logical-op-non-short-circuit=1" } */
 
 int g;
 void bar();

Re: [PATCH] C++2a Utility functions to implement uses-allocator construction (P0591R4)

2019-03-04 Thread Jonathan Wakely


On 04/03/19 09:14 +, Jonathan Wakely wrote:

On 01/03/19 14:06 +, Jonathan Wakely wrote:

On 01/03/19 13:50 +, Jonathan Wakely wrote:

* include/std/memory (uses_allocator_construction_args): New set of
overloaded functions.
(make_obj_using_allocator, uninitialized_construct_using_allocator):
New functions.
* include/std/memory_resource (polymorphic_allocator::construct)
[__cplusplus > 201703l]: Replace all overloads with a single function
using uses_allocator_construction_args.
* testsuite/20_util/polymorphic_allocator/construct_c++2a.cc: New
test.
* testsuite/20_util/uses_allocator/make_obj.cc: New test.


If we don't care about providing the exact signatures from the C++2a
draft, we could do this and use it in C++17 as well ...


[...]


+ if constexpr (sizeof...(__args) == 0)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a),
+ std::__uses_alloc_args<_Tp2>(__a));
+   }
+ else if constexpr (sizeof...(__args) == 1)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a,
+   std::forward<_Args>(__args).first...),
+ std::__uses_alloc_args<_Tp2>(__a,
+   std::forward<_Args>(__args).second...));
+   }
+ else if constexpr (sizeof...(__args) == 2)
+   {
+ return [&](auto&& __arg1, auto&& __arg2)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a,
+   std::forward<decltype(__arg1)>(__arg1)),
+ std::__uses_alloc_args<_Tp2>(__a,
+   std::forward<decltype(__arg2)>(__arg2)));
+   }(std::forward<_Args>(__args)...);
+   }


I tried replacing this lambda with:

  using _Targs = tuple<_Args&&...>;
  _Targs __targs{std::forward<_Args>(__args)...};

using _Args_0 = tuple_element_t<0, _Targs>;
using _Args_1 = tuple_element_t<1, _Targs>;

return std::make_tuple(piecewise_construct,
std::__uses_alloc_args<_Tp1>(__a,
  std::forward<_Args_0>(std::get<0>(__targs))),
std::__uses_alloc_args<_Tp2>(__a,
  std::forward<_Args_1>(std::get<1>(__targs))));

And similarly for the sizeof...(__args) == 3 case.  That seems more
straightforward, but unfortunately it compiles measurably slower and
uses more memory. The optimized code is the same size, but unoptimized
the lambda version is a bit smaller.

The current code on trunk compiles fastest, by quite a big margin.
That surprises me as I thought a single function using if-constexpr
would outperform several overloads constrained via SFINAE.

Being able to use __uses_alloc_args in C++17 might be worth the extra
compile-time cost though. I'll keep thinking about it.
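
For anybody who wants to compare the two techniques in isolation, here is
a reduced C++17 sketch.  It is not the libstdc++ code: pair_args_lambda
and pair_args_get are invented names, and std::forward_as_tuple stands in
for __uses_alloc_args so that the example is self-contained.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Variant 1: an immediately-invoked generic lambda gives names to the
// two elements of the pack.
template<typename... Args>
auto pair_args_lambda (Args&&... args)
{
  static_assert (sizeof...(args) == 2);
  return [] (auto&& a1, auto&& a2)
    {
      return std::make_tuple (
	std::forward_as_tuple (std::forward<decltype (a1)> (a1)),
	std::forward_as_tuple (std::forward<decltype (a2)> (a2)));
    } (std::forward<Args> (args)...);
}

// Variant 2: bind the pack to a tuple of references and pull the
// elements back out with std::get.
template<typename... Args>
auto pair_args_get (Args&&... args)
{
  static_assert (sizeof...(args) == 2);
  using Targs = std::tuple<Args&&...>;
  Targs targs{std::forward<Args> (args)...};
  using A0 = std::tuple_element_t<0, Targs>;
  using A1 = std::tuple_element_t<1, Targs>;
  return std::make_tuple (
    std::forward_as_tuple (std::forward<A0> (std::get<0> (targs))),
    std::forward_as_tuple (std::forward<A1> (std::get<1> (targs))));
}

// Small self-check; call it from main ().  Both variants must yield
// the same type and refer to the same objects.
void self_check ()
{
  int x = 1;
  std::string s = "hi";
  auto t1 = pair_args_lambda (x, s);
  auto t2 = pair_args_get (x, s);
  static_assert (std::is_same_v<decltype (t1), decltype (t2)>);
  assert (&std::get<0> (std::get<0> (t1)) == &x);
  assert (&std::get<0> (std::get<1> (t2)) == &s);
}
```

Both variants return the same type; the difference discussed above is
only in how much work the compiler does to instantiate them.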


In case anybody wants to try it out, here's the complete patch
(relative to r269312 on trunk) using std::get to extract elements
from the pack, instead of using lambdas.

diff --git a/libstdc++-v3/include/std/memory b/libstdc++-v3/include/std/memory
index 00a85eef25e..d96dd02b6df 100644
--- a/libstdc++-v3/include/std/memory
+++ b/libstdc++-v3/include/std/memory
@@ -168,7 +168,7 @@ get_pointer_safety() noexcept { return pointer_safety::relaxed; }
 }
 #endif // C++2a
 
-#if __cplusplus > 201703L
+#if __cplusplus >= 201703L
   template
 struct __is_pair : false_type { };
   template
@@ -176,174 +176,111 @@ get_pointer_safety() noexcept { return pointer_safety::relaxed; }
   template
 struct __is_pair> : true_type { };
 
-  template>>,
-	   typename _Alloc, typename... _Args>
+  // Equivalent of uses_allocator_construction_args for internal use in C++17
+  template
 constexpr auto
 __uses_alloc_args(const _Alloc& __a, _Args&&... __args) noexcept
 {
-  if constexpr (uses_allocator_v, _Alloc>)
+  if constexpr (!__is_pair<_Tp>::value)
 	{
-	  if constexpr (is_constructible_v<_Tp, allocator_arg_t,
-	   const _Alloc&, _Args...>)
+	  if constexpr (uses_allocator_v, _Alloc>)
 	{
-	  return tuple(
-		  allocator_arg, __a, std::forward<_Args>(__args)...);
+	  if constexpr (is_constructible_v<_Tp, allocator_arg_t,
+		  const _Alloc&, _Args...>)
+		{
+		  return tuple(
+		  allocator_arg, __a, std::forward<_Args>(__args)...);
+		}
+	  else
+		{
+		  static_assert(
+		  is_constructible_v<_Tp, _Args..., const _Alloc&>);
+
+		  return tuple<_Args&&..., const _Alloc&>(
+		  std::forward<_Args>(__args)..., __a);
+		}
 	}
 	  else
 	{
-	  static_assert(is_constructible_v<_Tp, _Args..., const _Alloc&>);
+	  static_assert(is_constructible_v<_Tp, _Args...>);
 
-	  return tuple<_Args&&..., const _Alloc&>(
-		  std::forward<_Args>(__args)..., __a);
+	  return

[C++ PATCH] Further fix for designated-initializer-list handling in overload resolution (PR c++/71446)

2019-03-04 Thread Jakub Jelinek

On Sat, Mar 02, 2019 at 10:30:36AM +0100, Jakub Jelinek wrote:
> I'm not really sure what to do for foo.  Perhaps if we find that case just
> require that the order is ok already during build_aggr_conv and fail the
> conversion otherwise?  We are outside of the standard in that case anyway.

Actually, seems if the designators can match corresponding type,
reshape_init* already fills in the ce->index even on elements without
original designators.  So, the following works fine for all the testcases I
came up with.
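
As a reminder of what this area of overload resolution is about, here is
a minimal illustration.  It is not taken from desig12.C or desig13.C; the
types A and B and the function pick are made up, and it assumes a
compiler that accepts designated initializers in C++ and implements the
C++2a rule that a designated-initializer-list only converts to an
aggregate whose members match the designators.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };

// Two overloads that differ only in the aggregate they take.
int pick (A) { return 1; }
int pick (B) { return 2; }

// The designator decides which aggregate conversion is viable.
int call_with_a () { return pick ({ .a = 0 }); }
int call_with_b () { return pick ({ .b = 0 }); }

// Small self-check; call it from main ().
void self_check ()
{
  assert (call_with_a () == 1);
  assert (call_with_b () == 2);
}
```

Without the designators the call would be ambiguous, since { 0 } converts
equally well to A and to B.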

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-04  Jakub Jelinek  

PR c++/71446
* call.c (field_in_pset): New function.
(build_aggr_conv): Handle CONSTRUCTOR_IS_DESIGNATED_INIT correctly.

* g++.dg/cpp2a/desig12.C: New test.
* g++.dg/cpp2a/desig13.C: New test.

--- gcc/cp/call.c.jj2019-03-02 09:05:44.442524338 +0100
+++ gcc/cp/call.c   2019-03-02 18:57:12.221904541 +0100
@@ -902,6 +902,28 @@ can_convert_array (tree atype, tree ctor
   return true;
 }
 
+/* Helper for build_aggr_conv.  Return true if FIELD is in PSET, or if
+   FIELD has ANON_AGGR_TYPE_P and any initializable field in there recursively
+   is in PSET.  */
+
+static bool
+field_in_pset (hash_set<tree> *pset, tree field)
+{
+  if (pset->contains (field))
+return true;
+  if (ANON_AGGR_TYPE_P (TREE_TYPE (field)))
+for (field = TYPE_FIELDS (TREE_TYPE (field));
+field; field = DECL_CHAIN (field))
+  {
+   field = next_initializable_field (field);
+   if (field == NULL_TREE)
+ break;
+   if (field_in_pset (pset, field))
+ return true;
+  }
+  return false;
+}
+
 /* Represent a conversion from CTOR, a braced-init-list, to TYPE, an
aggregate class, if such a conversion is possible.  */
 
@@ -912,6 +934,7 @@ build_aggr_conv (tree type, tree ctor, i
   conversion *c;
   tree field = next_initializable_field (TYPE_FIELDS (type));
   tree empty_ctor = NULL_TREE;
+  hash_set<tree> *pset = NULL;
 
   /* We already called reshape_init in implicit_conversion.  */
 
@@ -919,26 +942,69 @@ build_aggr_conv (tree type, tree ctor, i
  context; they're always simple copy-initialization.  */
   flags = LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING;
 
+  /* For designated initializers, verify that each initializer is convertible
+ to corresponding TREE_TYPE (ce->index) and mark those FIELD_DECLs as
+ visited.  In the following loop then ignore already visited
+ FIELD_DECLs.  */
+  if (CONSTRUCTOR_IS_DESIGNATED_INIT (ctor))
+{
+  tree idx, val;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, idx, val)
+   {
+ if (idx && TREE_CODE (idx) == FIELD_DECL)
+   {
+ tree ftype = TREE_TYPE (idx);
+ bool ok;
+
+ if (TREE_CODE (ftype) == ARRAY_TYPE
+ && TREE_CODE (val) == CONSTRUCTOR)
+   ok = can_convert_array (ftype, val, flags, complain);
+ else
+   ok = can_convert_arg (ftype, TREE_TYPE (val), val, flags,
+ complain);
+
+ if (!ok)
+   goto fail;
+ /* For unions, there should be just one initializer.  */
+ if (TREE_CODE (type) == UNION_TYPE)
+   {
+ field = NULL_TREE;
+ i = 1;
+ break;
+   }
+ if (pset == NULL)
+   pset = new hash_set<tree>;
+ pset->add (idx);
+   }
+ else
+   goto fail;
+   }
+}
+
   for (; field; field = next_initializable_field (DECL_CHAIN (field)))
 {
   tree ftype = TREE_TYPE (field);
   tree val;
   bool ok;
 
+  if (pset && field_in_pset (pset, field))
+   continue;
   if (i < CONSTRUCTOR_NELTS (ctor))
-   val = CONSTRUCTOR_ELT (ctor, i)->value;
+   {
+ val = CONSTRUCTOR_ELT (ctor, i)->value;
+ ++i;
+   }
   else if (DECL_INITIAL (field))
val = get_nsdmi (field, /*ctor*/false, complain);
   else if (TYPE_REF_P (ftype))
/* Value-initialization of reference is ill-formed.  */
-   return NULL;
+   goto fail;
   else
{
  if (empty_ctor == NULL_TREE)
empty_ctor = build_constructor (init_list_type_node, NULL);
  val = empty_ctor;
}
-  ++i;
 
   if (TREE_CODE (ftype) == ARRAY_TYPE
  && TREE_CODE (val) == CONSTRUCTOR)
@@ -948,15 +1014,22 @@ build_aggr_conv (tree type, tree ctor, i
  complain);
 
   if (!ok)
-   return NULL;
+   goto fail;
 
   if (TREE_CODE (type) == UNION_TYPE)
break;
 }
 
   if (i < CONSTRUCTOR_NELTS (ctor))
-return NULL;
+{
+fail:
+  if (pset)
+   delete pset;
+  return NULL;
+}
 
+  if (pset)
+delete pset;
   c = alloc_conversion (ck_aggr);
   c->type = type;
   c->rank = cr_exact;
--- gcc/testsuite/g++.dg/cpp2a/desig12.C.jj

Re: [PATCH] C++2a Utility functions to implement uses-allocator construction (P0591R4)

2019-03-04 Thread Jonathan Wakely


On 01/03/19 14:06 +, Jonathan Wakely wrote:

On 01/03/19 13:50 +, Jonathan Wakely wrote:

* include/std/memory (uses_allocator_construction_args): New set of
overloaded functions.
(make_obj_using_allocator, uninitialized_construct_using_allocator):
New functions.
* include/std/memory_resource (polymorphic_allocator::construct)
[__cplusplus > 201703l]: Replace all overloads with a single function
using uses_allocator_construction_args.
* testsuite/20_util/polymorphic_allocator/construct_c++2a.cc: New
test.
* testsuite/20_util/uses_allocator/make_obj.cc: New test.


If we don't care about providing the exact signatures from the C++2a
draft, we could do this and use it in C++17 as well ...


[...]


+ if constexpr (sizeof...(__args) == 0)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a),
+ std::__uses_alloc_args<_Tp2>(__a));
+   }
+ else if constexpr (sizeof...(__args) == 1)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a,
+   std::forward<_Args>(__args).first...),
+ std::__uses_alloc_args<_Tp2>(__a,
+   std::forward<_Args>(__args).second...));
+   }
+ else if constexpr (sizeof...(__args) == 2)
+   {
+ return [&](auto&& __arg1, auto&& __arg2)
+   {
+ return std::make_tuple(piecewise_construct,
+ std::__uses_alloc_args<_Tp1>(__a,
+   std::forward<decltype(__arg1)>(__arg1)),
+ std::__uses_alloc_args<_Tp2>(__a,
+   std::forward<decltype(__arg2)>(__arg2)));
+   }(std::forward<_Args>(__args)...);
+   }


I tried replacing this lambda with:

  using _Targs = tuple<_Args&&...>;
  _Targs __targs{std::forward<_Args>(__args)...};

 using _Args_0 = tuple_element_t<0, _Targs>;
 using _Args_1 = tuple_element_t<1, _Targs>;

 return std::make_tuple(piecewise_construct,
 std::__uses_alloc_args<_Tp1>(__a,
   std::forward<_Args_0>(std::get<0>(__targs))),
 std::__uses_alloc_args<_Tp2>(__a,
   std::forward<_Args_1>(std::get<1>(__targs))));

And similarly for the sizeof...(__args) == 3 case.  That seems more
straightforward, but unfortunately it compiles measurably slower and
uses more memory. The optimized code is the same size, but unoptimized
the lambda version is a bit smaller.

The current code on trunk compiles fastest, by quite a big margin.
That surprises me as I thought a single function using if-constexpr
would outperform several overloads constrained via SFINAE.

Being able to use __uses_alloc_args in C++17 might be worth the extra
compile-time cost though. I'll keep thinking about it.

[PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)

2019-03-04 Thread Jakub Jelinek

On Fri, Mar 01, 2019 at 03:41:33PM +, Wilco Dijkstra wrote:
> > and regtest revealed two code size
> > regressions because of that.  Is -1 vs. 1 the only case of immediate
> > valid for both "I" and "L" constraints where the former is longer than the
> > latter?
> 
> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 
> should
> be disallowed by the I constraint (or even better, the underlying query). 
> That way
> it will work correctly for all add/sub patterns, not just this one.

So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
following two possible follow-ups which handle the -1 and 1 cases right
(prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
for others use subs if both constraints match, otherwise adds.
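
The reason 0 and INT_MIN need subs can be reproduced with a small model
of the flags.  This is a sketch, not compiler code: Flags, adds_flags,
subs_flags and interchangeable are invented names, and the model only
encodes the usual ARM rule that subs sets C when there is no borrow.

```cpp
#include <cassert>
#include <cstdint>

// The four condition flags set by adds/subs.
struct Flags { bool n, z, c, v; };

bool operator== (const Flags &a, const Flags &b)
{
  return a.n == b.n && a.z == b.z && a.c == b.c && a.v == b.v;
}

// Flags after "adds rd, rn, #b": C is the carry out of bit 31,
// V is signed overflow of the addition.
Flags adds_flags (uint32_t a, uint32_t b)
{
  uint64_t wide = uint64_t (a) + b;
  uint32_t r = uint32_t (wide);
  bool c = (wide >> 32) != 0;
  bool v = ((~(a ^ b) & (a ^ r)) >> 31) != 0;
  return Flags { (r >> 31) != 0, r == 0, c, v };
}

// Flags after "subs rd, rn, #b": C is set when no borrow occurs,
// V is signed overflow of the subtraction.
Flags subs_flags (uint32_t a, uint32_t b)
{
  uint32_t r = a - b;
  bool c = a >= b;
  bool v = (((a ^ b) & (a ^ r)) >> 31) != 0;
  return Flags { (r >> 31) != 0, r == 0, c, v };
}

// True if "subs #imm" and "adds #-imm" leave identical flags for A.
bool interchangeable (uint32_t a, uint32_t imm)
{
  return subs_flags (a, imm) == adds_flags (a, 0u - imm);
}

// Small self-check; call it from main ().
void self_check ()
{
  assert (interchangeable (7, 5));
  assert (!interchangeable (0, 0));		// C differs
  assert (!interchangeable (0, 0x80000000u));	// V differs
}
```

For every other immediate the two instructions agree on all four flags,
so the choice between them can be made purely on encoding size.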

The first one uses constraints and no C code in the output, I believe it is
actually more expensive for compile time, because if one just reads what
constrain_operands needs to do for another constraint, it is quite a lot.
I've tried to at least not introduce new constraints for this, there is no
constraint for number 1 (or for number -1).
The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
I constraint for the negation of it, i.e. -8..-1, only -1 from this is
valid for I.  If that matches, we emit adds with #1, otherwise just prefer
subs over adds.

The other swaps the alternatives similarly to the above, but for the special
case of desirable adds with #1 uses C code instead of another alternative.

Ok for trunk (which one)?

Jakub
2019-03-04  Jakub Jelinek  

PR target/89506
* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
another alternative with I constraint for operands[2] and Pu
for operands[3] and emit adds in that case, don't use C code to
emit the instruction.

--- gcc/config/arm/arm.md.jj2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md   2019-03-02 17:08:13.036725812 +0100
@@ -857,31 +857,31 @@ (define_insn "*compare_negsi_si"
(set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
+;; For 0 and INT_MIN it is essential that we use subs, as adds will result
+;; in different condition codes (like cmn rather than like cmp), so that
+;; alternative comes first.  Both I and L constraints can match for any
+;; 0x??00 where except for 0 and INT_MIN it doesn't matter what we choose,
+;; and also for -1 and 1 with TARGET_THUMB2, in that case prefer instruction
+;; with #1 as it is shorter.  The first alternative will use adds ?, ?, #1 over
+;; subs ?, ?, #-1, the second alternative will use subs for #0 or #2147483648
+;; or any other case where both I and L constraints match.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
(compare:CC
-(match_operand:SI 1 "s_register_operand" "r,r")
-(match_operand:SI 2 "arm_addimm_operand" "L,I")))
-   (set (match_operand:SI 0 "s_register_operand" "=r,r")
+(match_operand:SI 1 "s_register_operand" "r,r,r")
+(match_operand:SI 2 "arm_addimm_operand" "I,I,L")))
+   (set (match_operand:SI 0 "s_register_operand" "=r,r,r")
(plus:SI (match_dup 1)
-(match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+(match_operand:SI 3 "arm_addimm_operand" "Pu,L,I")))]
   "TARGET_32BIT
&& (INTVAL (operands[2])
== trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
-{
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
- will result in different condition codes (like cmn rather than
- like cmp).  For other immediates, we should choose whatever
- will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-  || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
-  || which_alternative == 1)
-return "subs%?\\t%0, %1, #%n3";
-  else
-return "adds%?\\t%0, %1, %3";
-}
+  "@
+   adds%?\\t%0, %1, %3
+   subs%?\\t%0, %1, #%n3
+   adds%?\\t%0, %1, %3"
   [(set_attr "conds" "set")
(set_attr "type" "alus_sreg")]
 )
2019-03-04  Jakub Jelinek  

PR target/89506
* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
subs for the first alternative except when operands[3] is 1.

--- gcc/config/arm/arm.md.jj2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md   2019-03-02 09:41:03.501404694 +0100
@@ -857,27 +857,27 @@ (define_insn "*compare_negsi_si"
(set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
(compare:CC
 (match_operand:SI 1 "s_register_operand" "r,r")
-(match_operand:SI 2 "arm_addimm_operand" "L,I")))
+(match_operand:SI 2
