Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-06-01 Thread Kewen.Lin via Gcc-patches
Hi Richard,

Thanks for the comments!

on 2020/6/2 1:59 AM, Richard Sandiford wrote:
> Could you go into more detail about this choice of cost calculation?
> It looks like we first calculate per-group flags, which are true only if
> the unrolled offsets are valid for all uses in the group.  Then we create
> per-candidate flags when associating candidates with groups.
> 

Sure.  It checks every address-type IV group to determine whether the
group is valid for the reg offset addressing mode.  Here we only need to
check the first use and the last one, since the intermediate ones should
have been handled by split_address_groups.  With unrolling, the
displacement of the address can be offset by up to (UF-1)*step, so we
check whether the address is still valid with this max offset.  If the
check finds it's valid to use reg offset mode for the whole group, we
flag this group.  Later, when we create an IV candidate for a flagged
address group, we flag the candidate too.  This flag is mainly for iv
cand costing: we don't need to scale up the iv cand's step cost for this
kind of candidate.

Imagine this loop being unrolled: all the statements will be
duplicated UF times.  For the cost modeling of an iv group, it scales
up the cost by UF (here I simply excluded the compare_type, since in
most cases it's for the loop-ending check).  For the cost modeling of
an iv candidate, it focuses on step costs: an iv candidate we flagged
before is charged the step cost once, while for the others the step
cost is scaled up, since unrolling makes the step calculation happen
UF times.

This cost modeling tries to simulate the cost change after unrolling,
scaling up the costs accordingly.  There are some things to be
improved, such as distinguishing the loop-ending compare from other
compares, or whether the other costs need tweaking since the scaling up
probably unbalances the existing cost framework, but during
benchmarking I didn't find that these matter, so I kept it as simple as
possible for now.
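
The check and the cost scaling described above can be sketched in C as
follows.  This is only an illustration: the function names, the fixed
signed 16-bit displacement range and the plain integer costs are all
assumptions made for the example, not the code in the patch, which asks
the target whether an address is valid.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed displacement range of a reg+offset address (signed 16-bit).
   Hypothetical; the real check queries the target.  */
#define DISP_MIN (-32768L)
#define DISP_MAX 32767L

static bool
disp_valid_p (long disp)
{
  return disp >= DISP_MIN && disp <= DISP_MAX;
}

/* A group qualifies if its first and last offsets are still valid after
   adding the largest extra displacement from unrolling, (uf - 1) * step.  */
static bool
group_reg_offset_p (long first_off, long last_off, long step, unsigned uf)
{
  assert (uf >= 1);
  long extra = (long) (uf - 1) * step;
  return disp_valid_p (first_off + extra) && disp_valid_p (last_off + extra);
}

/* Cost of the uses in a group: every use is duplicated uf times.  */
static unsigned
group_use_cost (unsigned use_cost, unsigned uf)
{
  return use_cost * uf;
}

/* Step cost of a candidate: paid once if the candidate is flagged,
   otherwise once per unrolled copy.  */
static unsigned
cand_step_cost (unsigned step_cost, unsigned uf, bool flagged)
{
  return flagged ? step_cost : step_cost * uf;
}
```

With these assumptions, a group with offsets 0..24, step 8 and UF 4
qualifies, while one whose last offset is already close to the end of
the range does not, and its candidate pays the step cost UF times.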


> Instead, couldn't we take this into account in get_address_cost,
> which calculates the cost of an address use for a given candidate?
> E.g. after the main if-else at the start of the function,
> perhaps it would make sense to add the worst-case offset to
> the address in “parts”, check whether that too is a valid address,
> and if not, increase var_cost by the cost of one add instruction.
> 

IIUC, what you suggest is to tweak the iv group cost: if we find
one address group is valid for reg offset mode, we charge more for
the pairs between this group and other non-address-based iv cands.
The question is how we decide this add-on cost.  For the test
case I was working on initially, adding the cost of one add doesn't
work; the normal iv still won.  We can charge more, like two,
but what's the justification for this value, heuristics?

> I guess there are two main sources of inexactness if we do that:
> 
> (1) It might underestimate the cost because it assumes that vuse[0]
> stands for all vuses in the group.
> 

Do you mean we don't need a check function like mark_reg_offset_groups?
Without it, vuse[0] might not be enough, since we can't ensure the
others are fine with the additional displacement from unrolling.  If we
still have it, I think it's fine to just use vuse[0].

> (2) It might overestimate the cost because it treats all unrolled
> iterations as having the cost of the final unrolled iteration.
>
> (1) could perhaps be avoided by adding a flag to the iv_use to say
> whether it wants this treatment.  I think the flag approach suffers
> from (2) too, and I'd be surprised if it makes a difference in practice.
> 

Sorry, I don't have the whole picture of how to deal with uf in your
proposal.  But the flag approach considers uf in the iv group cost
calculation as well as in the iv cand step cost calculation.

BR,
Kewen

> Thanks,
> Richard
> 



RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-06-01 Thread Yangfei (Felix)
Hi,

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Monday, June 1, 2020 4:47 PM
> To: Yangfei (Felix) 
> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub
> Jelinek ; Hongtao Liu ; H.J. Lu
> 
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length

Snip...
 
> Sounds good.  Maybe at this point the x_inner and y_inner code is getting
> complicated enough to put into a lambda too:
> 
>   x_inner = ... (x);
>   y_inner = ... (y);
> 
> Just a suggestion though.

Yes, that's a good suggestion.  I see the code becomes cleaner with
another lambda.
 
> Yeah, looks good.
> 
> Formatting nit though: multi-line conditions should be wrapped in (...),
> i.e.:
> 
> return (...
> && ...
> && ...);
> 

Done.  v6 patch is based on trunk 20200601.
Bootstrapped and tested on aarch64-linux-gnu. 
Also bootstrapped on x86-64-linux-gnu with --enable-multilib (for building -m32 
x86 libgcc).
Regression test on x86-64-linux-gnu looks good except for the following
failures, which have been confirmed by x86 devs:

> FAIL: gcc.target/i386/avx512f-vcvtps2ph-2.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512f-vcvtps2ph-2.c compilation failed to 
> produce executable

Thanks,
Felix



pr95254-v6.diff


Re: [PATCH] testsuite: Disable colorization for ubsan test

2020-06-01 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Mon, Jun 1, 2020 at 4:10 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Mon, Jun 01, 2020 at 03:43:00PM +0800, Kito Cheng wrote:
> > ping
> >
> >
> > On Wed, May 20, 2020 at 3:01 PM Kito Cheng  wrote:
> > >
> > >  - Running the gcc testsuite with qemu prints ASCII color codes for
> > >    ubsan-related testcases, but several testcases don't account for
> > >    that, so disable colorization to prevent such problems and simplify
> > >    the process of adding testcases in future.
> > >
> > >  - Verified on native X86 and RISC-V qemu full system mode and user mode.
> > >
> > > ChangeLog:
> > >
> > > gcc/testsuite/
> > >
> > > Kito Cheng  
> > >
> > > * ubsan-dg.exp (orig_ubsan_options_saved): New
> > > (orig_ubsan_options): Ditto.
> > > (ubsan_init): Store UBSAN_OPTIONS and set UBSAN_OPTIONS.
> > > (ubsan_finish): Restore UBSAN_OPTIONS.
>
> Ok, thanks.
>
> Jakub
>


Re: [PATCH 1/2] Introduce flag_cunroll_grow_size for cunroll

2020-06-01 Thread Jiufu Guo via Gcc-patches
Jiufu Guo  writes:

Hi,

I updated the patch just a little accordingly.  Thanks!

diff --git a/gcc/common.opt b/gcc/common.opt
index 4464049fc1f..570e2aa53c8 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2856,6 +2856,10 @@ funroll-all-loops
 Common Report Var(flag_unroll_all_loops) Optimization
 Perform loop unrolling for all loops.
 
+funroll-completely-grow-size
+Common Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization
+; Internal undocumented flag, allow size growth during complete unrolling
+
 ; Nonzero means that loop optimizer may assume that the induction variables
 ; that control loops do not overflow and that the loops with nontrivial
 ; exit condition are not infinite
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 96316fbd23b..8d52358efdd 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1474,6 +1474,10 @@ process_options (void)
   if (flag_unroll_all_loops)
 flag_unroll_loops = 1;
 
+  /* Allow cunroll to grow size accordingly.  */
+  if (flag_cunroll_grow_size == AUTODETECT_VALUE)
+flag_cunroll_grow_size = flag_unroll_loops || flag_peel_loops;
+
   /* web and rename-registers help when run after loop unrolling.  */
   if (flag_web == AUTODETECT_VALUE)
 flag_web = flag_unroll_loops;
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index 8ab6ab3330c..298ab215530 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun)
  re-peeling the same loop multiple times.  */
   if (flag_peel_loops)
 peeled_loops = BITMAP_ALLOC (NULL);
-  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
-  || flag_peel_loops
-  || optimize >= 3, true);
+  unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size, 
+  true);
   if (peeled_loops)
 {
   BITMAP_FREE (peeled_loops);

BR,
Jiufu

> Richard Biener  writes:
>

>>> >> From: Jiufu Guo 
>>> >>
>>> >> Currently the GIMPLE complete unroller (cunroll) checks
>>> >> flag_unroll_loops and flag_peel_loops to see whether to allow size
>>> >> growth.  Besides affecting cunroll, flag_unroll_loops also controls
>>> >> the RTL unroller.  To have more freedom to control cunroll and the
>>> >> RTL unroller, this patch introduces flag_cunroll_grow_size.  With
>>> >> this patch, we can control cunroll and the RTL unroller independently.
>>> >>
>>> >> Bootstrap/regtest pass on powerpc64le. OK for trunk? And backport to
>>> >> GCC10 after week?
>>> >>
>>> >>
>>> >> +funroll-completely-grow-size
>>> >> +Var(flag_cunroll_grow_size) Init(2)
>>> >> +; Control cunroll to allow size growth during complete unrolling
>>> >> +
>>> >
>>
>> It won't work without adjusting the awk scripts.  So go with
>>
>> funroll-completely-grow-size
>> Undocumented Optimization Var(flag_cunroll_grow_size)
>> EnabledBy(funroll-loops || fpeel-loops)
>> ; ...
>>
> EnabledBy(funroll-loops || fpeel-loops) does not work as we expected:
> "-funroll-loops -fno-peel-loops" turns off flag_cunroll_grow_size.
>
> Through "EnabledBy", a flag can be turned on, and can also be turned off,
> by the "EnabledBy option", but only if the flag is not specified on the
> command line.
>
>> and enable it at O3+.  AUTODETECT_VALUE doesn't make sense for
>> an option not supposed to be set by users?
>>
>
> global_options_set.x_flagxxx can be used to check if an option is set by
> the user.  But it does not work well here either, because we also care
> whether the flag is overridden by OPTION_OPTIMIZATION_TABLE or
> OPTION_OVERRIDE.
>
> AUTODETECT_VALUE (value is 2) is used for some flags like flag_web,
> flag_rename_registers, flag_var_tracking, flag_tree_cselim...
> This way we can check whether the flag is effective (on/off), whether
> it was explicitly set on the command line or implicitly set through
> OPTION_OVERRIDE or OPTION_OPTIMIZATION_TABLE.
> So, I use it here.
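
The tri-state idea behind AUTODETECT_VALUE can be sketched in C as
below.  This is a simplified stand-in, not GCC's option machinery: the
macro AUTODETECT and the function resolve_grow_size are hypothetical
names for this example, and the value 2 is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Sentinel meaning "not decided yet" (the text gives the value 2).  */
#define AUTODETECT 2

/* Resolve the flag after all other option processing is done: an
   explicit 0 or 1 is kept, and only the sentinel is replaced by the
   value derived from the unroll and peel flags.  */
static int
resolve_grow_size (int grow_size, bool unroll_loops, bool peel_loops)
{
  assert (grow_size == 0 || grow_size == 1 || grow_size == AUTODETECT);
  if (grow_size == AUTODETECT)
    return unroll_loops || peel_loops;
  return grow_size;
}
```

Under this scheme the case that EnabledBy gets wrong,
"-funroll-loops -fno-peel-loops", resolves to 1 because the flag is
still at the sentinel when it is finally examined.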




Re: PowerPC new instructions for -mcpu=future

2020-06-01 Thread will schmidt via Gcc-patches
On Mon, 2020-06-01 at 16:01 -0400, Michael Meissner via Gcc-patches
wrote:
> These 3 patches add support for some new instructions in the 'future'
> processor.
> 
> The first patch adds support for the new byte swap instructions that
> byte swap
> valies in the GPRs.

values

> 
> The second patch renames some functions from _p9 to
> _hw.  This is
> in preparation for the third patch that adds support for IEEE 128-bit 
> minimum,
> maximum, and set compare masks.
> 
> The third patch implements the new instructions.
> 
> I have built bootstrap compilers with/without the patches, and there
> are no
> regressions.  I verified that the two new tests pass.  Can I check
> these into
> the master branch?


A couple cosmetic nits, in a followup email.
otherwise this series lgtm.
Thanks
-Will




Re: [PATCH 3/3] PowerPC future: Add IEEE 128-bit min, max, compare.

2020-06-01 Thread will schmidt via Gcc-patches
On Mon, 2020-06-01 at 16:01 -0400, Michael Meissner via Gcc-patches wrote:
> Add support for the new IEEE 128-bit minimum, maximum, and set compare mask
> instructions when -mcpu=future was used.
> 
> gcc/
> 2020-06-01  Michael Meissner  
> 
>   * config/rs6000/rs6000.c (rs6000_emit_hw_fp_minmax): Update
>   comment.
>   (rs6000_emit_hw_fp_cmove): Update comment.
>   (rs6000_emit_cmove): Add support for IEEE 128-bit min, max, and
>   comparisons with -mcpu=future.
>   (rs6000_emit_minmax): Add support for IEEE 128-bit min/max with
>   -mcpu=future.
>   * config/rs6000/rs6000.md (s3, IEEE128 iterator):
>   New insns for IEEE 128-bit min/max.
>   (movcc, IEEE128 iterator): New insns for IEEE 128-bit
>   conditional move.
>   (movcc_future, IEEE128 iterator): New insns for IEEE 128-bit
>   conditional move.
>   (movcc_invert_future, IEEE128 iterator): New insns for IEEE
>   128-bit conditional move.
>   (fpmask, IEEE128 iterator): New insns for IEEE 128-bit
>   conditional move.

Include the leading wildcard here?  
(*fpmask ...
and missing an entry for this one:
(*xxsel ...


> 
> testsuite/
> 2020-06-01  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-minmax-2.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c |  26 -
>  gcc/config/rs6000/rs6000.md| 121 
> +
>  .../gcc.target/powerpc/float128-minmax-2.c |  70 
>  3 files changed, 214 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0921328..bbba8f1 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -14847,7 +14847,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, 
> rtx op_false,
>  /* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
> for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the 
> last
> comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if 
> the
> -   hardware has no such operation.  */
> +   hardware has no such operation.
> +
> +   Under FUTURE, also handle IEEE 128-bit floating point.  */

> 
>  static int
>  rs6000_emit_hw_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> @@ -14889,7 +14891,9 @@ rs6000_emit_hw_fp_minmax (rtx dest, rtx op, rtx 
> true_cond, rtx false_cond)
>  /* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
> XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
> operands of the last comparison is nonzero/true, FALSE_COND if it is
> -   zero/false.  Return 0 if the hardware has no such operation.  */
> +   zero/false.  Return 0 if the hardware has no such operation.
> +
> +   Under FUTURE, also handle IEEE 128-bit conditional moves.  */
> 
>  static int
>  rs6000_emit_hw_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> @@ -14981,6 +14985,21 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, 
> rtx false_cond)
>   return 1;
>  }
> 
> +  /* See if we can use the FUTURE min/max/compare instructions for IEEE 
> 128-bit
> + floating point.  At present, don't worry about doing conditional moves
> + with different types for the comparison and movement (unlike SF/DF, 
> where
> + you can do a conditional test between double and use float as the 
> if/then
> + parts. */

Why don't we worry about that now?  Should this be a 'future todo'
comment here?


Beyond those nits and questions, lgtm, 
Thanks,
-Will




Re: [committed 3/3] libstdc++: Refactor filesystem::path string conversions

2020-06-01 Thread Jonathan Wakely via Gcc-patches

On 23/05/20 09:44 +0100, Jonathan Wakely wrote:

This simplifies the logic of converting Source arguments and pairs of
InputIterator arguments into the native string format. For any input
that is a contiguous range of path::value_type (or char8_t for POSIX)
a string view can be created and the conversion can be done directly,
with no intermediate allocation. Previously some cases created a
basic_string unnecessarily, for example construction from a pair of
path::string_type::iterators, or a pair of non-const value_type*
pointers.

   * include/bits/fs_path.h (__detail::_S_range_begin)
   (__detail::_S_range_end, path::_S_string_from_iter): Replace with
   overloaded function template __detail::__effective_range.
   (__detail::__effective_range): New overloaded function template to
   create a basic_string or basic_string_view for an effective range.
   (__detail::__value_type_is_char): Use __detail::__effective_range.
   Do not use remove_const on value type.
   (__detail::__value_type_is_char_or_char8_t): Likewise.
   (path::path(const Source&, format))
   (path::path(const Source&, const locale&))
   (path::operator/=(const Source&), path::append(const Source&))
   (path::concat(const Source&)): Use __detail::__effective_range.
   (path::_S_to_string(InputIterator, InputIterator)): New function
   template to create a string view if possible, or string otherwise.
   (path::_S_convert): Add overloads that convert a string returned
   by __detail::__effective_range. Use if-constexpr to inline conversion
   logic from all overloads of _Cvt::_S_convert.
   (path::_S_convert_loc): Add overload that converts a string. Use
   _S_to_string to avoid allocation when possible.
   (path::_Cvt): Remove.
   (path::operator+=(CharT)): Remove indirection through path::concat.
   * include/experimental/bits/fs_path.h (path::_S_convert_loc): Add
   overload for non-const pointers, to avoid constructing a std::string.
   * src/c++17/fs_path.cc (path::_S_convert_loc): Replace conditional
   compilation with call to _S_convert.


This commit broke *-*-mingw* bootstrap. Fixed with the attached patch.

Tested powerpc64le-linux and x86_64-w64-mingw32, committed to master.

commit cd3f067b82a1331f5fb695879ba5c3d9bb2cca3a
Author: Jonathan Wakely 
Date:   Tue Jun 2 00:07:05 2020 +0100

libstdc++: Fix filesystem::u8path for mingw targets (PR 95392)

When I refactored filesystem::path string conversions in
r11-587-584d52b088f9fcf78704b504c3f1f07e17c1cded I failed to update the
mingw-specific code in filesystem::u8path, causing a bootstrap failure.

This fixes it, and further refactors the mingw-specific code along the
same lines as the previous commit. All conversions from UTF-8 strings to
wide strings now use the same helper function, __wstr_from_utf8.

PR libstdc++/95392
* include/bits/fs_path.h (path::_S_to_string): Move to
namespace-scope and rename to ...
(__detail::__string_from_range): ... this.
[WINDOWS] (__detail::__wstr_from_utf8): New function template to
convert a char sequence containing UTF-8 to wstring.
(path::_S_convert(Iter, Iter)): Adjust call to _S_to_string.
(path::_S_convert_loc(Iter, Iter, const locale&)): Likewise.
(u8path(InputIterator, InputIterator)) [WINDOWS]: Use
__string_from_range to obtain a contiguous range and
__wstr_from_utf8 to obtain a wide string.
(u8path(const Source&)) [WINDOWS]: Use __effective_range to
obtain a contiguous range and __wstr_from_utf8 to obtain a wide
string.
(path::_S_convert(const _EcharT*, const _EcharT)) [WINDOWS]:
Use __wstr_from_utf8.

diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index 2d2766ec62e..26ddf0afec4 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -211,6 +211,51 @@ namespace __detail
 #endif
 			 , _Val>;
 
+  // Create a string or string view from an iterator range.
+  template<typename _InputIterator>
+inline auto
+__string_from_range(_InputIterator __first, _InputIterator __last)
+{
+  using _EcharT
+	= typename std::iterator_traits<_InputIterator>::value_type;
+  static_assert(__is_encoded_char<_EcharT>);
+
+#if __cpp_lib_concepts
+  constexpr bool __contiguous = std::contiguous_iterator<_InputIterator>;
+#else
+  constexpr bool __contiguous
+	= is_pointer_v<decltype(std::__niter_base(__first))>;
+#endif
+  if constexpr (__contiguous)
+	{
+	  // For contiguous iterators we can just return a string view.
+	  const auto* __f = std::__to_address(std::__niter_base(__first));
+	  const auto* __l = std::__to_address(std::__niter_base(__last));
+	  return basic_string_view<_EcharT>(__f, __l - __f);
+	}
+  else
+	// Conversion requires contiguous characters, so create a string.
+	

Re: [PATCH 1/3] PowerPC future: Add byte swap insns

2020-06-01 Thread will schmidt via Gcc-patches
On Mon, 2020-06-01 at 16:01 -0400, Michael Meissner via Gcc-patches
wrote:
> Add support for generating BRH/BRW/BRD when -mcpu=future is used.
> 

Hi,


> gcc/
> 2020-06-01  Michael Meissner  
> 
>   * config/rs6000/rs6000.md (bswaphi2_reg): If -mcpu=future,
>   generate the BRH instruction.
>   (bswapsi2_reg): If -mcpu=future, generate the BRW instruction.
>   (bswapdi2): Rename bswapdi2_xxbrd to bswapdi2_hw.
>   (bswapdi2_hw): Rename from bswapdi2_xxbrd.  If -mcpu=future,
>   generate the BRD instruction.

The "If -mcpu=future" blurbs there could probably be dropped.

> 
> testsuite/
> 2020-06-01  Michael Meissner  
> 
>   * gcc.target/powerpc/bswap64-5.c: New test.
> ---
>  gcc/config/rs6000/rs6000.md  | 44 
> +++-
>  gcc/testsuite/gcc.target/powerpc/bswap64-5.c | 42 ++
>  2 files changed, 66 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bswap64-5.c
> 



> diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-5.c
> b/gcc/testsuite/gcc.target/powerpc/bswap64-5.c
> new file mode 100644
> index 000..9183e16
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/bswap64-5.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-require-effective-target powerpc_future_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=future" } */
> +
> +/* This tests whether -mcpu=future generates the new byte swap
> +   instructions (brd, brw, brh).  */

s/new//
(It's only new until it's not).

Aside from those nits, lgtm.
thanks
-Will




Re: [PATCH] c++: Reject some further reinterpret casts in constexpr [PR82304, PR95307]

2020-06-01 Thread Jakub Jelinek via Gcc-patches
On Fri, May 29, 2020 at 01:26:32PM -0400, Jason Merrill via Gcc-patches wrote:
> This is a diagnostic quality regression, moving the error message away from
> the line where the actual problem is.
> 
> Maybe use error_at (loc, ...)?

That works fine, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?

2020-06-02  Jakub Jelinek  

PR c++/82304
PR c++/95307
* constexpr.c (cxx_eval_constant_expression): Diagnose CONVERT_EXPR
conversions from pointer types to arithmetic types here...
(cxx_eval_outermost_constant_expr): ... instead of here.

* g++.dg/template/pr79650.C: Expect different diagnostics and expect
it on all lines that do pointer to integer casts.
* g++.dg/cpp1y/constexpr-shift1.C: Expect different diagnostics.
* g++.dg/cpp1y/constexpr-82304.C: New test.
* g++.dg/cpp0x/constexpr-95307.C: New test.

--- gcc/cp/constexpr.c.jj   2020-05-29 23:49:25.479087388 +0200
+++ gcc/cp/constexpr.c  2020-06-01 12:53:30.348337388 +0200
@@ -6210,6 +6210,18 @@ cxx_eval_constant_expression (const cons
if (VOID_TYPE_P (type))
  return void_node;
 
+   if (TREE_CODE (t) == CONVERT_EXPR
+   && ARITHMETIC_TYPE_P (type)
+   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+ {
+   if (!ctx->quiet)
+ error_at (loc,
+   "conversion from pointer type %qT to arithmetic type "
+   "%qT in a constant expression", TREE_TYPE (op), type);
+   *non_constant_p = true;
+   return t;
+ }
+
if (TREE_CODE (op) == PTRMEM_CST && !TYPE_PTRMEM_P (type))
  op = cplus_expand_constant (op);
 
@@ -6811,19 +6823,6 @@ cxx_eval_outermost_constant_expr (tree t
   non_constant_p = true;
 }
 
-  /* Technically we should check this for all subexpressions, but that
- runs into problems with our internal representation of pointer
- subtraction and the 5.19 rules are still in flux.  */
-  if (CONVERT_EXPR_CODE_P (TREE_CODE (r))
-  && ARITHMETIC_TYPE_P (TREE_TYPE (r))
-  && TREE_CODE (TREE_OPERAND (r, 0)) == ADDR_EXPR)
-{
-  if (!allow_non_constant)
-   error ("conversion from pointer type %qT "
-  "to arithmetic type %qT in a constant expression",
-  TREE_TYPE (TREE_OPERAND (r, 0)), TREE_TYPE (r));
-  non_constant_p = true;
-}
 
   if (!non_constant_p && overflow_p)
 non_constant_p = true;
--- gcc/testsuite/g++.dg/template/pr79650.C.jj  2020-05-29 23:49:19.040183088 
+0200
+++ gcc/testsuite/g++.dg/template/pr79650.C 2020-06-01 12:53:30.348337388 
+0200
@@ -11,10 +11,10 @@ foo ()
   static int a, b;
 lab1:
 lab2:
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> c; // { dg-error "not a 
constant integer" }
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> d;
-  A<(intptr_t) - (intptr_t)> e;// { dg-error "is not a 
constant expression" }
-  A<(intptr_t) - (intptr_t)> f;
-  A<(intptr_t)sizeof(a) + (intptr_t)> g; // { dg-error "not a 
constant integer" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> c; // { dg-error 
"conversion from pointer type" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> d; // { dg-error 
"conversion from pointer type" }
+  A<(intptr_t) - (intptr_t)> e;// { dg-error 
"conversion from pointer type" }
+  A<(intptr_t) - (intptr_t)> f;// { dg-error 
"conversion from pointer type" }
+  A<(intptr_t)sizeof(a) + (intptr_t)> g; // { dg-error 
"conversion from pointer type" }
   A<(intptr_t)> h;   // { dg-error 
"conversion from pointer type" }
 }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C.jj2020-05-29 
23:49:19.036183148 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C   2020-06-01 
13:55:22.607594689 +0200
@@ -3,7 +3,7 @@
 constexpr int p = 1;
 constexpr __PTRDIFF_TYPE__ bar (int a)
 {
-  return ((__PTRDIFF_TYPE__) ) << a; // { dg-error "is not a constant 
expression" }
+  return ((__PTRDIFF_TYPE__) ) << a; // { dg-error "conversion from pointer" 
}
 }
 constexpr __PTRDIFF_TYPE__ r = bar (2); // { dg-message "in .constexpr. 
expansion of" }
-constexpr __PTRDIFF_TYPE__ s = bar (0); // { dg-error "conversion from 
pointer" }
+constexpr __PTRDIFF_TYPE__ s = bar (0);
--- gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C.jj 2020-06-01 
12:53:30.349337373 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C2020-06-01 
13:03:40.668227604 +0200
@@ -0,0 +1,14 @@
+// PR c++/82304
+// { dg-do compile { target c++14 } }
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+constexpr const char *
+foo (const char *p)
+{
+  auto l = reinterpret_cast<uintptr_t>(p); // { dg-error "conversion from pointer" }
+  ++l;
+  return reinterpret_cast<const char *>(l);
+}
+
+constexpr auto s = foo ("Hello");
--- gcc/testsuite/g++.dg/cpp0x/constexpr-95307.C.jj 2020-06-01 
12:53:30.349337373 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-95307.C2020-06-01 

Re: PowerPC tests for -mcpu=future

2020-06-01 Thread will schmidt via Gcc-patches
On Mon, 2020-06-01 at 15:53 -0400, Michael Meissner via Gcc-patches
wrote:
> This thread adds seven patches to add tests for the -mcpu=future code
> generation.  These patches are an update to the patches I sent out in
> April.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544653.html
> 
> I have done bootstrap builds with/without the patches on a little end
> power9
> box, and there were no regressions with any of the tests ran.  I
> verified that
> these tests do run and succeed.  Can I check them into the master
> branch?
> 


One nit in #6, mentioned separately. 
Otherwise this patch series lgtm.
thanks




Re: [PATCH 6/7] PowerPC tests: Add PC-relative tests.

2020-06-01 Thread will schmidt via Gcc-patches
On Mon, 2020-06-01 at 15:53 -0400, Michael Meissner via Gcc-patches wrote:
> These tests make sure that PC-relative variant is generated for -mcpu=future 
> on
> systems that support PC-relative addressing.
> 
> 2020-06-01  Michael Meissner  
> 
>   * gcc.target/powerpc/prefix-pcrel-dd.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-df.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-di.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-hi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-kf.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-qi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-sd.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-sf.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-si.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-udi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-usi.c: New test.
>   * gcc.target/powerpc/prefix-pcrel-v2df.c: New test.
>   * gcc.target/powerpc/prefix-pcrel.h: Include file for new tests.
> ---
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c | 16 +++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c | 13 ++
>  .../gcc.target/powerpc/prefix-pcrel-udi.c  | 13 ++
>  .../gcc.target/powerpc/prefix-pcrel-uhi.c  | 13 ++
>  .../gcc.target/powerpc/prefix-pcrel-uqi.c  | 13 ++
>  .../gcc.target/powerpc/prefix-pcrel-usi.c  | 13 ++
>  .../gcc.target/powerpc/prefix-pcrel-v2df.c | 13 ++
>  gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h| 52 
> ++
>  15 files changed, 237 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c 
> b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
> new file mode 100644
> index 000..f100c24
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_pcrel } */
> +/* { dg-options "-O2 -mdejagnu-cpu=future" } */
> +
> +/* Tests for prefixed instructions testing whether pc-relative prefixed
> +   instructions are generated for the _Decimal64 type.  */


Similar/same comment as was made in Apr.  I recommend something like

"Test whether pc-relative prefixed instructions
are generated for the _Decimal64 type." 





[PATCH] Practical Improvement to Double Precision Complex Divide

2020-06-01 Thread Patrick McGehearty via Gcc-patches
The following patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default double precision complex divide routine when dealing
with very large or very small exponents.

The current code correctly implements Smith's method (1962) [1]
further modified by c99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than 512 (i.e. 2.0^512) or less than -512 (i.e. 2.0^-512),
results are substantially different from the answers provided by quad
precision more than 1% of the time. Since the allowed exponent range
for double precision numbers is -1076 to +1023, the error rate may be
unacceptable for many applications. The proposed method reduces the
frequency of "substantially different" answers by more than 99% at a
modest cost of performance.

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

NOTATION

For all of the following, the notation is:
Input complex values:
  a+bi  (a= 64bit real part, b= 64bit imaginary part)
  c+di
Output complex value:
  e+fi = (a+bi)/(c+di)

DESCRIPTIONS of different complex divide methods:

NAIVE COMPUTATION (-fcx-limited-range):
  e = (a*c + b*d)/(c*c + d*d)
  f = (b*c - a*d)/(c*c + d*d)

Note that c*c and d*d will overflow or underflow if either
c or d is outside the range 2^-538 to 2^512.

This method is available in gcc when the switch -fcx-limited-range is
used. That switch is also enabled by -ffast-math. Only one who has a
clear understanding of the maximum range of intermediate values
generated by a computation should consider using this switch.
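
To make the overflow concrete, here is a minimal sketch of the naive
formulas in plain C.  The function name naive_cdiv and its return
convention are made up for illustration; this is not the libgcc code.

```c
#include <assert.h>
#include <math.h>

/* Naive quotient e+fi = (a+bi)/(c+di), using the formulas above.
   Returns 1 when both parts of the result are finite, 0 otherwise.
   Hypothetical helper for illustration only.  */
static int
naive_cdiv (double a, double b, double c, double d, double *e, double *f)
{
  assert (e && f);
  double denom = c * c + d * d;   /* c*c or d*d may overflow or underflow */
  *e = (a * c + b * d) / denom;
  *f = (b * c - a * d) / denom;
  return isfinite (*e) && isfinite (*f);
}
```

With a = b = c = d = 1e200 the exact quotient is 1+0i, but c*c already
overflows to infinity, so both parts of the naive result come out as NaN.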

SMITH's METHOD (current libgcc):
  if(fabs(c) RBIG) || (FABS (a) > RBIG) || (FABS (b) > RBIG) ) {
    a = a * 0.5;
    b = b * 0.5;
    c = c * 0.5;
    d = d * 0.5;
  }
  /* minimize overflow/underflow issues when c and d are small */
  else if (FABS (d) < RMIN2) {
    a = a * RMINSCAL;
    b = b * RMINSCAL;
    c = c * RMINSCAL;
    d = d * RMINSCAL;
  }
  r = c/d;
  denom = (c*r) + d;
  if( r > RMIN ) {
    e = (a*r + b) / denom;
    f = (b*r - a) / denom;
  } else {
    e = (c * (a/d) + b) / denom;
    f = (c * (b/d) - a) / denom;
  }
[ only presenting the fabs(c) < fabs(d) case here, full code in patch. ]

Before any computation of the answer, the code checks for near maximum or
near minimum inputs and scales the inputs to move all values away from
the extremes. If the complex divide can be computed at all without
generating infinities, these scalings will not affect the accuracy
since they are by a power of 2.  Values that are over RBIG are
relatively rare but it is easy to test for them and required to avoid
unnecessary overflows.

Testing for RMIN2 reveals when both c and d are less than 2^-512.  By
scaling all values by 2^510, the code avoids many underflows in
intermediate computations that otherwise might occur. If scaling a and
b by 2^510 causes either to overflow, then the computation will overflow
whatever method is used.

Next, r (the ratio of c to d) is checked for being near zero. Baudin
and Smith checked r for zero. Checking for values less than DBL_MIN
covers more cases and improves overall accuracy. If r is near zero,
then when it is used in a multiplication, there is a high chance that
the result will underflow to zero, losing significant accuracy. That
underflow can be avoided if the computation is done in a different
order.  When r is subnormal, the code replaces a*r (= a*(c/d)) with
((a/d)*c) which is mathematically the same but avoids the unnecessary
underflow.
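
The effect of the reordering can be seen in a small standalone sketch.
The two helper names below are hypothetical, and the volatile
temporaries are only there to force each intermediate to be rounded to
double; the patch itself does not need them.

```c
#include <assert.h>

/* Computes a*(c/d): the ratio is formed first and may underflow.  */
static double
product_ratio_first (double a, double c, double d)
{
  assert (d != 0.0);
  volatile double r = c / d;   /* rounded to double, may become 0 */
  return a * r;
}

/* Computes (a/d)*c: mathematically the same value, but the
   intermediate a/d stays in range when a and d are both large.  */
static double
product_divide_first (double a, double c, double d)
{
  assert (d != 0.0);
  volatile double q = a / d;
  return q * c;
}
```

For a = 1e300, c = 1e-300, d = 1e300 the first form returns 0 because
c/d is far below the smallest subnormal, while the second form returns
1e-300.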

TEST Data

Two sets of data are presented to test these methods.  Both sets
contain 10 million pairs of 64bit complex values.  The exponents and
mantissas are generated using multiple calls to random() and then
combining the results. Only values which give results to complex
divide that are representable in 64-bits after being computed in quad
precision are used.

The first data set is labeled "moderate exponents".
The exponent range is limited to -512 to +511.
The second data set is labeled "full exponents".
The exponent range is -1076 to +1024.

ACCURACY Test results:

Note: All results are based on use of fused multiply-add. If
fused multiply-add is not used, the error rate increases slightly
for the 2 ulp and 8 ulp cases.

The complex divide methods are evaluated by determining what
percentage of values exceed different ulp (units in last place)
levels.  If a "2 ulp" test result shows 1%, that would mean that 1% of
10,000,000 values (100,000) have either a real or imaginary part that
had a greater than 2 bit difference from the quad precision result.
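
One common way to count ulp differences between two doubles is to
compare their bit patterns.  The sketch below assumes IEEE 754 binary64
doubles and compares two double values directly; how the test program
actually rounds the quad precision reference and counts the difference
is not shown in this mail, so the helper names and the method are
assumptions.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Map a finite double onto a signed integer line so that adjacent
   doubles differ by exactly 1 and -0.0 and +0.0 both map to 0.  */
static int64_t
ordered_bits (double x)
{
  int64_t i;
  memcpy (&i, &x, sizeof i);
  return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles between X and Y (hypothetical
   helper; both arguments must be finite).  */
static uint64_t
ulp_distance (double x, double y)
{
  assert (isfinite (x) && isfinite (y));
  int64_t ix = ordered_bits (x);
  int64_t iy = ordered_bits (y);
  return ix > iy ? (uint64_t) ix - (uint64_t) iy
                 : (uint64_t) iy - (uint64_t) ix;
}
```

A "2 ulp" test in this style would flag a result whenever
ulp_distance (computed, reference) is at least 2 for either the real or
the imaginary part.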

Results are reported for differences greater than or equal to 2 ulp, 8
ulp, 16 ulp, 24 ulp, and 52 ulp.  Even when the patch avoids overflows and
underflows, some input values are expected to have errors due to
normal limitations of floating point 

[committed] Fix pr92085-2.c regressions on msp430-elf

2020-06-01 Thread Jeff Law via Gcc-patches
msp430-elf has had regressions for a while.  There was other instability at the
time the regression started, so I waited to see if it'd correct itself, but it
didn't and I finally took a looksie.

We're processing this in lower-subreg:

(insn 30 64 42 6 (set (subreg:SI (reg/v:HI 33 [ oz ]) 0)
(concatn:SI [
(reg:HI 45 [ _5 ])
(reg:HI 46 [ _5+2 ])
])) "j.c":22:19 14 {movsi_x}
 (nil))


Note the paradoxical subreg destination.  There's nothing inherently wrong with
that.  But if lower-subreg wants to decompose it, it'll use
simplify_gen_subreg_concatn which has this gem:

  /* If we see an insn like (set (reg:DI) (subreg:DI (reg:SI) 0)) then
     resolve_simple_move will ask for the high part of the paradoxical
     subreg, which does not have a value.  Just return a zero.  */
  if (ret == NULL_RTX
      && paradoxical_subreg_p (op))
    return CONST0_RTX (outermode);

That's fine and good for the source of a set, but it's not good for the
destination of a set.  For the destination we might as well just not emit
anything.  The bits we're setting are don't cares and leaving them uninitialized
should be fine.

And that's exactly what this patch does.  When simplify_gen_subreg_concatn
returns CONST0_RTX for a destination operand, we assume that we need not
actually assign anything to the destination and leave it as-is.

This fixed the regression on the msp430-elf port and has bootstrapped and
successfully regression tested on x86_64-linux-gnu.  It's built on a few other
targets as well, but I haven't tried to enumerate them -- I just knew there
aren't new failures since dropping this patch into my tester :-)


Installing on the trunk,

Jeff
commit c7969df1c5d3785c0b409f97e7682a6f0d2637ec
Author: Jeff Law 
Date:   Mon Jun 1 17:14:50 2020 -0400

Fix 92085-2.c ICE due to having (const_int 0) as the destination of a set.

gcc/
* lower-subreg.c (resolve_simple_move): If simplify_gen_subreg_concatn
returns (const_int 0) for the destination, then emit nothing.

diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index a11e535b5bf..abe7180c686 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -1087,12 +1087,21 @@ resolve_simple_move (rtx set, rtx_insn *insn)
emit_clobber (dest);
 
   for (i = 0; i < words; ++i)
-   emit_move_insn (simplify_gen_subreg_concatn (word_mode, dest,
-dest_mode,
-i * UNITS_PER_WORD),
-   simplify_gen_subreg_concatn (word_mode, src,
-orig_mode,
-i * UNITS_PER_WORD));
+   {
+ rtx t = simplify_gen_subreg_concatn (word_mode, dest,
+  dest_mode,
+  i * UNITS_PER_WORD);
+ /* simplify_gen_subreg_concatn can return (const_int 0) for
+some sub-objects of paradoxical subregs.  As a source operand,
+that's fine.  As a destination it must be avoided.  Those are
+supposed to be don't care bits, so we can just drop that store
+on the floor.  */
+ if (t != CONST0_RTX (word_mode))
+   emit_move_insn (t,
+   simplify_gen_subreg_concatn (word_mode, src,
+orig_mode,
+i * UNITS_PER_WORD));
+   }
 }
 
   if (real_dest != NULL_RTX)


[pushed] c++: vptr ubsan and object of known type [PR95466]

2020-06-01 Thread Jason Merrill via Gcc-patches
Another case where we can't find the OBJ_TYPE_REF_OBJECT in the
OBJ_TYPE_REF_EXPR.  So let's just evaluate the sanitize call first.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95466
PR c++/95311
PR c++/95221
* class.c (build_vfn_ref): Revert 95311 change.
* cp-ubsan.c (cp_ubsan_maybe_instrument_member_call): Build a
COMPOUND_EXPR.

gcc/testsuite/ChangeLog:

PR c++/95466
* g++.dg/ubsan/vptr-17.C: New test.
---
 gcc/cp/class.c   |  8 ++--
 gcc/cp/cp-ubsan.c| 17 -
 gcc/testsuite/g++.dg/ubsan/vptr-17.C | 15 +++
 3 files changed, 25 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/vptr-17.C

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index c818826a108..757e010b6b7 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -729,13 +729,9 @@ build_vtbl_ref (tree instance, tree idx)
 tree
 build_vfn_ref (tree instance_ptr, tree idx)
 {
-  tree obtype = TREE_TYPE (TREE_TYPE (instance_ptr));
+  tree aref;
 
-  /* Leave the INDIRECT_REF unfolded so cp_ubsan_maybe_instrument_member_call
- can find instance_ptr.  */
-  tree ind = build1 (INDIRECT_REF, obtype, instance_ptr);
-
-  tree aref = build_vtbl_ref (ind, idx);
+  aref = build_vtbl_ref (cp_build_fold_indirect_ref (instance_ptr), idx);
 
   /* When using function descriptors, the address of the
  vtable entry is treated as a function pointer.  */
diff --git a/gcc/cp/cp-ubsan.c b/gcc/cp/cp-ubsan.c
index c40dac72b42..183bd238aff 100644
--- a/gcc/cp/cp-ubsan.c
+++ b/gcc/cp/cp-ubsan.c
@@ -125,16 +125,11 @@ cp_ubsan_maybe_instrument_member_call (tree stmt)
 {
   /* Virtual function call: Sanitize the use of the object pointer in the
 OBJ_TYPE_REF, since the vtable reference will SEGV otherwise (95221).
-OBJ_TYPE_REF_EXPR is ptr->vptr[N] and OBJ_TYPE_REF_OBJECT is ptr.  */
+OBJ_TYPE_REF_EXPR is ptr->vptr[N] and OBJ_TYPE_REF_OBJECT is ptr.  But
+we can't be sure of finding OBJ_TYPE_REF_OBJECT in OBJ_TYPE_REF_EXPR
+if the latter has been optimized, so we use a COMPOUND_EXPR below.  */
   opp = &OBJ_TYPE_REF_EXPR (fn);
   op = OBJ_TYPE_REF_OBJECT (fn);
-  while (*opp != op)
-   {
- if (TREE_CODE (*opp) == COMPOUND_EXPR)
-   opp = &TREE_OPERAND (*opp, 1);
- else
-   opp = &TREE_OPERAND (*opp, 0);
-   }
 }
   else
 {
@@ -150,7 +145,11 @@ cp_ubsan_maybe_instrument_member_call (tree stmt)
   op = cp_ubsan_maybe_instrument_vptr (EXPR_LOCATION (stmt), op,
   TREE_TYPE (TREE_TYPE (op)),
   true, UBSAN_MEMBER_CALL);
-  if (op)
+  if (!op)
+/* No change.  */;
+  else if (fn && TREE_CODE (fn) == OBJ_TYPE_REF)
+*opp = cp_build_compound_expr (op, *opp, tf_none);
+  else
 *opp = op;
 }
 
diff --git a/gcc/testsuite/g++.dg/ubsan/vptr-17.C b/gcc/testsuite/g++.dg/ubsan/vptr-17.C
new file mode 100644
index 000..b7f6a4cb4df
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/vptr-17.C
@@ -0,0 +1,15 @@
+// PR c++/95466
+// { dg-additional-options -fsanitize=vptr }
+
+class A {
+  virtual void m_fn1();
+};
+class C {
+public:
+  virtual void m_fn2();
+};
+class B : A, public C {};
+int main() {
+  B b;
+  static_cast<C *>(&b)->m_fn2();
+}

base-commit: 88f48e2967ead9be262483618238efa9c7c842ec
-- 
2.18.1



[committed] i386: Add __attribute__ ((gcc_struct)) to struct fenv [PR95418]

2020-06-01 Thread Uros Bizjak via Gcc-patches
The Windows ABI (MinGW) differs from the Linux ABI when bit-fields are involved.
The following patch adds __attribute__ ((gcc_struct)) to struct fenv in order
to match the layout of x87 state image in memory.
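
As a rough illustration of why the attribute matters, the sketch below
uses a cut-down struct with the same shape as the relevant part of
struct fenv: a 16-bit field followed by unsigned int bit-fields.  The
struct and function names are hypothetical, and the sizes asserted are
what the GCC (non-MS) layout is expected to give on x86; on targets
that do not support gcc_struct the attribute may simply be ignored with
a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment shaped like part of the x87 state image.
   With the GCC layout the two bit-fields share the 32-bit unit that
   also holds SELECTOR, so DATA_OFFSET lands at byte 4.  */
struct x87_image_sketch
{
  unsigned short selector;
  unsigned int opcode : 11;
  unsigned int unused : 5;
  unsigned int data_offset;
} __attribute__ ((gcc_struct));

/* Checks the layout this sketch relies on.  */
static void
check_layout (void)
{
  assert (offsetof (struct x87_image_sketch, data_offset) == 4);
  assert (sizeof (struct x87_image_sketch) == 8);
}
```

Without the attribute, a MinGW build using the MS bit-field rules would
be expected to start a fresh unsigned int unit for the bit-fields, which
shifts every later field and no longer matches the hardware image.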

2020-06-01  Uroš Bizjak  

libatomic/ChangeLog:
* config/x86/fenv.c (struct fenv): Add __attribute__ ((gcc_struct)).

libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (struct fenv):
Add __attribute__ ((gcc_struct)).

libgfortran/ChangeLog:
PR libfortran/95418
* config/fpu-387.h (struct fenv): Add __attribute__ ((gcc_struct)).

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}, and as
stated in the PR, also tested by Markus on MinGW.

Uros.
diff --git a/libatomic/config/x86/fenv.c b/libatomic/config/x86/fenv.c
index 88622c613f3..138a67ff217 100644
--- a/libatomic/config/x86/fenv.c
+++ b/libatomic/config/x86/fenv.c
@@ -45,7 +45,7 @@ struct fenv
   unsigned int __data_offset;
   unsigned short int __data_selector;
   unsigned short int __unused5;
-};
+} __attribute__ ((gcc_struct));
 
 #ifdef __SSE_MATH__
 # define __math_force_eval_div(x, y) \
diff --git a/libgcc/config/i386/sfp-exceptions.c b/libgcc/config/i386/sfp-exceptions.c
index 72cb0f4d3bb..3aed0af7c46 100644
--- a/libgcc/config/i386/sfp-exceptions.c
+++ b/libgcc/config/i386/sfp-exceptions.c
@@ -39,7 +39,7 @@ struct fenv
   unsigned int __data_offset;
   unsigned short int __data_selector;
   unsigned short int __unused5;
-};
+} __attribute__ ((gcc_struct));
 
 #ifdef __SSE_MATH__
 # define __math_force_eval_div(x, y) \
diff --git a/libgfortran/config/fpu-387.h b/libgfortran/config/fpu-387.h
index 8b5e758c2ca..7ff5acdc933 100644
--- a/libgfortran/config/fpu-387.h
+++ b/libgfortran/config/fpu-387.h
@@ -85,7 +85,7 @@ struct fenv
   unsigned short int __data_selector;
   unsigned short int __unused5;
   unsigned int __mxcsr;
-};
+} __attribute__ ((gcc_struct));
 
 /* Check we can actually store the FPU state in the allocated size.  */
 _Static_assert (sizeof(struct fenv) <= (size_t) GFC_FPE_STATE_BUFFER_SIZE,


Re: [PATCH] Prefer simple case changes in spelling suggestions

2020-06-01 Thread Tom Tromey
> Did the full DejaGnu testsuite get run?  There are a lot of tests in it
> that make use of this code.

I did "make check" and only saw some XFAILs.

Here's v2 of the patch, which I think addresses your comments.  I did
not add a new test of get_edit_distance, because as I mentioned earlier,
an existing test already does what you asked for.

Tom

commit e897a99dada8d3935343ebf7b14ad7ec36515b3d
Author: Tom Tromey 
Date:   Fri May 29 10:46:57 2020 -0600

Prefer simple case changes in spelling suggestions

I got this error message when editing gcc and recompiling:

../../gcc/gcc/ada/gcc-interface/decl.c:7714:39: error: ‘DWARF_GNAT_ENCODINGS_all’ was not declared in this scope; did you mean ‘DWARF_GNAT_ENCODINGS_GDB’?
 7714 | = debug_info && gnat_encodings == DWARF_GNAT_ENCODINGS_all;
  |   ^~~~
  |   DWARF_GNAT_ENCODINGS_GDB

This suggestion could be improved -- what happened here is that I
failed to upper-case the word, and DWARF_GNAT_ENCODINGS_ALL was the
correct spelling.

This patch changes gcc's spell checker to prefer simple case changes
when possible.

I tested this using the self-tests.  A new self-test is also included.
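
To show how the weighting changes the ranking, here is a simplified
standalone sketch of a case-aware edit distance.  It reuses the
CASE_COST and BASE_COST values from the patch, but the function
case_aware_distance is hypothetical, uses two rows instead of gcc's
three, and leaves out transpositions, so it is not the spellcheck.c
implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define CASE_COST 1   /* Cost of a pure case change, as in the patch.  */
#define BASE_COST 2   /* Cost of any other single-character edit.  */

static unsigned
min3 (unsigned a, unsigned b, unsigned c)
{
  unsigned m = a < b ? a : b;
  return m < c ? m : c;
}

/* Hypothetical helper: weighted edit distance between S and T.
   Transpositions are deliberately left out to keep the sketch short.  */
static unsigned
case_aware_distance (const char *s, const char *t)
{
  size_t len_s = strlen (s);
  size_t len_t = strlen (t);
  unsigned *prev = malloc ((len_s + 1) * sizeof *prev);
  unsigned *cur = malloc ((len_s + 1) * sizeof *cur);
  assert (prev != NULL && cur != NULL);

  /* Row for the empty prefix of T: delete every character of S.  */
  for (size_t j = 0; j <= len_s; j++)
    prev[j] = (unsigned) j * BASE_COST;

  for (size_t i = 0; i < len_t; i++)
    {
      /* Column for the empty prefix of S: insert i + 1 characters.  */
      cur[0] = (unsigned) (i + 1) * BASE_COST;
      for (size_t j = 0; j < len_s; j++)
        {
          unsigned char cs = (unsigned char) s[j];
          unsigned char ct = (unsigned char) t[i];
          unsigned cost;
          if (cs == ct)
            cost = 0;
          else if (tolower (cs) == tolower (ct))
            cost = CASE_COST;
          else
            cost = BASE_COST;
          cur[j + 1] = min3 (cur[j] + BASE_COST,       /* deletion */
                             prev[j + 1] + BASE_COST,  /* insertion */
                             prev[j] + cost);          /* substitution */
        }
      /* The row just built becomes the previous row.  */
      unsigned *tmp = prev;
      prev = cur;
      cur = tmp;
    }

  unsigned result = prev[len_s];
  free (prev);
  free (cur);
  return result;
}
```

With these weights DWARF_GNAT_ENCODINGS_all is at distance 3 from
DWARF_GNAT_ENCODINGS_ALL (three case changes) but at distance 6 from
DWARF_GNAT_ENCODINGS_GDB (three ordinary substitutions), whereas a
uniform cost of 1 per edit would rate both candidates equally.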

gcc/ChangeLog:

* spellcheck.c (CASE_COST): New define.
(BASE_COST): New define.
(get_edit_distance): Recognize case changes.
(get_edit_distance_cutoff): Update.
(test_edit_distances): Update.
(get_old_cutoff): Update.
(test_find_closest_string): Add case sensitivity test.

diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c
index 7891260a258..9f7351f364f 100644
--- a/gcc/spellcheck.c
+++ b/gcc/spellcheck.c
@@ -25,14 +25,22 @@ along with GCC; see the file COPYING3.  If not see
 #include "spellcheck.h"
 #include "selftest.h"
 
+/* Cost of a case transformation.  */
+#define CASE_COST 1
+
+/* Cost of another kind of edit.  */
+#define BASE_COST 2
+
 /* Get the edit distance between the two strings: the minimal
number of edits that are needed to change one string into another,
where edits can be one-character insertions, removals, or substitutions,
or transpositions of two adjacent characters (counting as one "edit").
 
-   This implementation uses the Wagner-Fischer algorithm for the
-   Damerau-Levenshtein distance; specifically, the "optimal string alignment
-   distance" or "restricted edit distance" variant.  */
+   This implementation uses a modified variant of the Wagner-Fischer
+   algorithm for the Damerau-Levenshtein distance; specifically, the
+   "optimal string alignment distance" or "restricted edit distance"
+   variant.  This implementation has been further modified to take
+   case into account.  */
 
 edit_distance_t
 get_edit_distance (const char *s, int len_s,
@@ -47,9 +55,9 @@ get_edit_distance (const char *s, int len_s,
 }
 
   if (len_s == 0)
-return len_t;
+return BASE_COST * len_t;
   if (len_t == 0)
-return len_s;
+return BASE_COST * len_s;
 
   /* We effectively build a matrix where each (i, j) contains the
  distance between the prefix strings s[0:j] and t[0:i].
@@ -67,7 +75,7 @@ get_edit_distance (const char *s, int len_s,
   /* The first row is for the case of an empty target string, which
  we can reach by deleting every character in the source string.  */
   for (int i = 0; i < len_s + 1; i++)
-v_one_ago[i] = i;
+v_one_ago[i] = i * BASE_COST;
 
   /* Build successive rows.  */
   for (int i = 0; i < len_t; i++)
@@ -83,21 +91,28 @@ get_edit_distance (const char *s, int len_s,
   /* The initial column is for the case of an empty source string; we
 can reach prefixes of the target string of length i
 by inserting i characters.  */
-  v_next[0] = i + 1;
+  v_next[0] = (i + 1) * BASE_COST;
 
   /* Build the rest of the row by considering neighbors to
 the north, west and northwest.  */
   for (int j = 0; j < len_s; j++)
{
- edit_distance_t cost = (s[j] == t[i] ? 0 : 1);
- edit_distance_t deletion = v_next[j] + 1;
- edit_distance_t insertion= v_one_ago[j + 1] + 1;
+ edit_distance_t cost;
+
+ if (s[j] == t[i])
+   cost = 0;
+ else if (TOLOWER (s[j]) == TOLOWER (t[i]))
+   cost = CASE_COST;
+ else
+   cost = BASE_COST;
+ edit_distance_t deletion = v_next[j] + BASE_COST;
+ edit_distance_t insertion= v_one_ago[j + 1] + BASE_COST;
  edit_distance_t substitution = v_one_ago[j] + cost;
  edit_distance_t cheapest = MIN (deletion, insertion);
  cheapest = MIN (cheapest, substitution);
  if (i > 0 && j > 0 && s[j] == t[i - 1] && s[j - 1] == t[i])
{
- edit_distance_t transposition = v_two_ago[j - 1] + 1;
+ edit_distance_t 

[PATCH 2/3] PowerPC future: Rename some p9 hardware functions.

2020-06-01 Thread Michael Meissner via Gcc-patches
This patch renames some functions that were added for power9 support that are
named '_p9' to be '_hw'.  This is preparation for the next patch that wants to
extend these functions for -mcpu=power support.

2020-06-01  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_emit_hw_fp_minmax): Rename from
rs6000_emit_p9_fp_minmax.
(rs6000_emit_hw_fp_cmove): Rename from rs6000_emit_p9_fp_cmove.
(rs6000_emit_cmove): Update calls to rs6000_emit_hw_fp_minmax and
rs6000_emit_hw_fp_cmove.
---
 gcc/config/rs6000/rs6000.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8435bc1..0921328 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -14850,7 +14850,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
hardware has no such operation.  */
 
 static int
-rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+rs6000_emit_hw_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -14892,7 +14892,7 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
zero/false.  Return 0 if the hardware has no such operation.  */
 
 static int
-rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+rs6000_emit_hw_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -14974,10 +14974,10 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   && (compare_mode == SFmode || compare_mode == DFmode)
   && (result_mode == SFmode || result_mode == DFmode))
 {
-  if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
+  if (rs6000_emit_hw_fp_minmax (dest, op, true_cond, false_cond))
return 1;
 
-  if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
+  if (rs6000_emit_hw_fp_cmove (dest, op, true_cond, false_cond))
return 1;
 }
 
-- 
1.8.3.1



[PATCH 1/3] PowerPC future: Add byte swap insns

2020-06-01 Thread Michael Meissner via Gcc-patches
Add support for generating BRH/BRW/BRD when -mcpu=future is used.
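
For readers who want to see what source the new patterns apply to, the
sketch below shows the three byte swap widths using GCC's
__builtin_bswap16/32/64.  The function names are hypothetical and this
is not the body of bswap64-5.c; whether the compiler actually selects
brh, brw and brd depends on the target options and register allocation.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit swap: a candidate for BRH when the value lives in a GPR.  */
static uint16_t
swap16 (uint16_t x)
{
  return __builtin_bswap16 (x);
}

/* 32-bit swap: a candidate for BRW.  */
static uint32_t
swap32 (uint32_t x)
{
  return __builtin_bswap32 (x);
}

/* 64-bit swap: a candidate for BRD.  */
static uint64_t
swap64 (uint64_t x)
{
  return __builtin_bswap64 (x);
}

/* Swapping twice must give back the original value.  */
static void
check_round_trip (uint64_t x)
{
  assert (swap64 (swap64 (x)) == x);
}
```

The semantics are the same on every target; only the instruction
selection differs.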

gcc/
2020-06-01  Michael Meissner  

* config/rs6000/rs6000.md (bswaphi2_reg): If -mcpu=future,
generate the BRH instruction.
(bswapsi2_reg): If -mcpu=future, generate the BRW instruction.
(bswapdi2): Rename bswapdi2_xxbrd to bswapdi2_hw.
(bswapdi2_hw): Rename from bswapdi2_xxbrd.  If -mcpu=future,
generate the BRD instruction.

testsuite/
2020-06-01  Michael Meissner  

* gcc.target/powerpc/bswap64-5.c: New test.
---
 gcc/config/rs6000/rs6000.md  | 44 +++-
 gcc/testsuite/gcc.target/powerpc/bswap64-5.c | 42 ++
 2 files changed, 66 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bswap64-5.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0aa5265..3310b4b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2585,15 +2585,16 @@ (define_insn "bswap2_store"
   [(set_attr "type" "store")])
 
 (define_insn_and_split "bswaphi2_reg"
-  [(set (match_operand:HI 0 "gpc_reg_operand" "=&r,wa")
+  [(set (match_operand:HI 0 "gpc_reg_operand" "=r,&r,wa")
(bswap:HI
-(match_operand:HI 1 "gpc_reg_operand" "r,wa")))
-   (clobber (match_scratch:SI 2 "=&r,X"))]
+(match_operand:HI 1 "gpc_reg_operand" "r,r,wa")))
+   (clobber (match_scratch:SI 2 "=X,&r,X"))]
   ""
   "@
+   brh %0,%1
#
xxbrh %x0,%x1"
-  "reload_completed && int_reg_operand (operands[0], HImode)"
+  "reload_completed && !TARGET_FUTURE && int_reg_operand (operands[0], HImode)"
   [(set (match_dup 3)
(and:SI (lshiftrt:SI (match_dup 4)
 (const_int 8))
@@ -2609,21 +2610,22 @@ (define_insn_and_split "bswaphi2_reg"
   operands[3] = simplify_gen_subreg (SImode, operands[0], HImode, 0);
   operands[4] = simplify_gen_subreg (SImode, operands[1], HImode, 0);
 }
-  [(set_attr "length" "12,4")
-   (set_attr "type" "*,vecperm")
-   (set_attr "isa" "*,p9v")])
+  [(set_attr "length" "4,12,4")
+   (set_attr "type" "shift,*,vecperm")
+   (set_attr "isa" "fut,*,p9v")])
 
 ;; We are always BITS_BIG_ENDIAN, so the bit positions below in
 ;; zero_extract insns do not change for -mlittle.
 (define_insn_and_split "bswapsi2_reg"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=&r,wa")
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,&r,wa")
(bswap:SI
-(match_operand:SI 1 "gpc_reg_operand" "r,wa")))]
+(match_operand:SI 1 "gpc_reg_operand" "r,r,wa")))]
   ""
   "@
+   brw %0,%1
#
xxbrw %x0,%x1"
-  "reload_completed && int_reg_operand (operands[0], SImode)"
+  "reload_completed && !TARGET_FUTURE && int_reg_operand (operands[0], SImode)"
   [(set (match_dup 0)  ; DABC
(rotate:SI (match_dup 1)
   (const_int 24)))
@@ -2640,9 +2642,9 @@ (define_insn_and_split "bswapsi2_reg"
(and:SI (match_dup 0)
(const_int -256]
   ""
-  [(set_attr "length" "12,4")
-   (set_attr "type" "*,vecperm")
-   (set_attr "isa" "*,p9v")])
+  [(set_attr "length" "4,12,4")
+   (set_attr "type" "shift,*,vecperm")
+   (set_attr "isa" "fut,*,p9v")])
 
 ;; On systems with LDBRX/STDBRX generate the loads/stores directly, just like
 ;; we do for L{H,W}BRX and ST{H,W}BRX above.  If not, we have to generate more
@@ -2675,7 +2677,7 @@ (define_expand "bswapdi2"
  emit_insn (gen_bswapdi2_store (dest, src));
 }
   else if (TARGET_P9_VECTOR)
-   emit_insn (gen_bswapdi2_xxbrd (dest, src));
+   emit_insn (gen_bswapdi2_hw (dest, src));
   else
emit_insn (gen_bswapdi2_reg (dest, src));
   DONE;
@@ -2706,13 +2708,15 @@ (define_insn "bswapdi2_store"
   "stdbrx %1,%y0"
   [(set_attr "type" "store")])
 
-(define_insn "bswapdi2_xxbrd"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=wa")
-   (bswap:DI (match_operand:DI 1 "gpc_reg_operand" "wa")))]
+(define_insn "bswapdi2_hw"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,wa")
+   (bswap:DI (match_operand:DI 1 "gpc_reg_operand" "r,wa")))]
   "TARGET_P9_VECTOR"
-  "xxbrd %x0,%x1"
-  [(set_attr "type" "vecperm")
-   (set_attr "isa" "p9v")])
+  "@
+   brd %0,%1
+   xxbrd %x0,%x1"
+  [(set_attr "type" "shift,vecperm")
+   (set_attr "isa" "fut,p9v")])
 
 (define_insn "bswapdi2_reg"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-5.c b/gcc/testsuite/gcc.target/powerpc/bswap64-5.c
new file mode 100644
index 000..9183e16
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/bswap64-5.c
@@ -0,0 +1,42 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* This tests whether -mcpu=future generates the new byte swap
+   instructions (brd, brw, brh).  */
+
+unsigned short
+bswap_short (unsigned short a)
+{
+  

PowerPC new instructions for -mcpu=future

2020-06-01 Thread Michael Meissner via Gcc-patches
These 3 patches add support for some new instructions in the 'future'
processor.

The first patch adds support for the new byte swap instructions that byte swap
values in the GPRs.

The second patch renames some functions from _p9 to _hw.  This is
in preparation for the third patch that adds support for IEEE 128-bit minimum,
maximum, and set compare masks.

The third patch implements the new instructions.

I have built bootstrap compilers with/without the patches, and there are no
regressions.  I verified that the two new tests pass.  Can I check these into
the master branch?



[PATCH 3/3] PowerPC future: Add IEEE 128-bit min, max, compare.

2020-06-01 Thread Michael Meissner via Gcc-patches
Add support for the new IEEE 128-bit minimum, maximum, and set compare mask
instructions when -mcpu=future is used.

gcc/
2020-06-01  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_emit_hw_fp_minmax): Update
comment.
(rs6000_emit_hw_fp_cmove): Update comment.
(rs6000_emit_cmove): Add support for IEEE 128-bit min, max, and
comparisons with -mcpu=future.
(rs6000_emit_minmax): Add support for IEEE 128-bit min/max with
-mcpu=future.
* config/rs6000/rs6000.md (s3, IEEE128 iterator):
New insns for IEEE 128-bit min/max.
(movcc, IEEE128 iterator): New insns for IEEE 128-bit
conditional move.
(movcc_future, IEEE128 iterator): New insns for IEEE 128-bit
conditional move.
(movcc_invert_future, IEEE128 iterator): New insns for IEEE
128-bit conditional move.
(fpmask, IEEE128 iterator): New insns for IEEE 128-bit
conditional move.

testsuite/
2020-06-01  Michael Meissner  

* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c |  26 -
 gcc/config/rs6000/rs6000.md| 121 +
 .../gcc.target/powerpc/float128-minmax-2.c |  70 
 3 files changed, 214 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0921328..bbba8f1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -14847,7 +14847,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
 /* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
-   hardware has no such operation.  */
+   hardware has no such operation.
+
+   Under FUTURE, also handle IEEE 128-bit floating point.  */
 
 static int
 rs6000_emit_hw_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
@@ -14889,7 +14891,9 @@ rs6000_emit_hw_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 /* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
operands of the last comparison is nonzero/true, FALSE_COND if it is
-   zero/false.  Return 0 if the hardware has no such operation.  */
+   zero/false.  Return 0 if the hardware has no such operation.
+
+   Under FUTURE, also handle IEEE 128-bit conditional moves.  */
 
 static int
 rs6000_emit_hw_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
@@ -14981,6 +14985,21 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
return 1;
 }
 
+  /* See if we can use the FUTURE min/max/compare instructions for IEEE 128-bit
+ floating point.  At present, don't worry about doing conditional moves
+ with different types for the comparison and movement (unlike SF/DF, where
+ you can do a conditional test between double and use float as the if/then
+ parts. */
+  if (TARGET_FUTURE && FLOAT128_IEEE_P (compare_mode)
+  && compare_mode == result_mode)
+{
+  if (rs6000_emit_hw_fp_minmax (dest, op, true_cond, false_cond))
+   return 1;
+
+  if (rs6000_emit_hw_fp_cmove (dest, op, true_cond, false_cond))
+   return 1;
+}
+
   /* Don't allow using floating point comparisons for integer results for
  now.  */
   if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
@@ -15204,7 +15223,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
   && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
- || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode
+ || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+ || (TARGET_FUTURE && FLOAT128_IEEE_P (mode
 {
   emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
   return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3310b4b..ef82f11 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14645,6 +14645,127 @@ (define_insn "*cmp_hw"
"xscmpuqp %0,%1,%2"
   [(set_attr "type" "veccmp")
(set_attr "size" "128")])
+
+;; IEEE 128-bit min/max
+(define_insn "s3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+   (fp_minmax:IEEE128
+(match_operand:IEEE128 1 "altivec_register_operand" "v")
+(match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_FUTURE && FLOAT128_IEEE_P (mode)"
+  "xscqp %0,%1,%2"
+  [(set_attr "type" "fp")
+   (set_attr "size" "128")])
+
+;; IEEE 128-bit conditional move.  At present, don't worry about doing
+;; conditional moves with different types 

[PATCH 6/7] PowerPC tests: Add PC-relative tests.

2020-06-01 Thread Michael Meissner via Gcc-patches
These tests make sure that the PC-relative variant is generated for
-mcpu=future on systems that support PC-relative addressing.

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-pcrel-dd.c: New test.
* gcc.target/powerpc/prefix-pcrel-df.c: New test.
* gcc.target/powerpc/prefix-pcrel-di.c: New test.
* gcc.target/powerpc/prefix-pcrel-hi.c: New test.
* gcc.target/powerpc/prefix-pcrel-kf.c: New test.
* gcc.target/powerpc/prefix-pcrel-qi.c: New test.
* gcc.target/powerpc/prefix-pcrel-sd.c: New test.
* gcc.target/powerpc/prefix-pcrel-sf.c: New test.
* gcc.target/powerpc/prefix-pcrel-si.c: New test.
* gcc.target/powerpc/prefix-pcrel-udi.c: New test.
* gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
* gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
* gcc.target/powerpc/prefix-pcrel-usi.c: New test.
* gcc.target/powerpc/prefix-pcrel-v2df.c: New test.
* gcc.target/powerpc/prefix-pcrel.h: Include file for new tests.
---
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c | 16 +++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c | 13 ++
 .../gcc.target/powerpc/prefix-pcrel-udi.c  | 13 ++
 .../gcc.target/powerpc/prefix-pcrel-uhi.c  | 13 ++
 .../gcc.target/powerpc/prefix-pcrel-uqi.c  | 13 ++
 .../gcc.target/powerpc/prefix-pcrel-usi.c  | 13 ++
 .../gcc.target/powerpc/prefix-pcrel-v2df.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h| 52 ++
 15 files changed, 237 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
new file mode 100644
index 000..f100c24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the _Decimal64 type.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
new file mode 100644
index 000..a9a0711
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the double type.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
new file mode 100644
index 000..850c28b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the 

[PATCH 7/7] PowerPC test: Add prefixed stack protect test

2020-06-01 Thread Michael Meissner via Gcc-patches
Test that stack protection generates prefixed stack instructions if you are
using a large stack frame with -mcpu=future.

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-stack-protect.c: New test.
---
 .../gcc.target/powerpc/prefix-stack-protect.c| 20 
 1 file changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c
new file mode 100644
index 000..d0d291b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future -fstack-protector-strong" } */
+
+/* Test that we can handle large stack frames with -fstack-protector-strong and
+   prefixed addressing.  This was originally discovered when trying to build
+   glibc with -mcpu=future, and vfwprintf.c failed because it used
+   -fstack-protector-strong.  */
+
+extern long foo (char *);
+
+long
+bar (void)
+{
+  char buffer[0x2];
+  return foo (buffer) + 1;
+}
+
+/* { dg-final { scan-assembler {\mpld\M}  } } */
+/* { dg-final { scan-assembler {\mpstd\M} } } */
-- 
1.8.3.1



[PATCH 4/7] PowerPC test: Add prefixed no update test

2020-06-01 Thread Michael Meissner via Gcc-patches
This test makes sure we do not generate a prefixed instruction with an update
form.

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-no-update.c: New test.
---
 .../gcc.target/powerpc/prefix-no-update.c  | 50 ++
 1 file changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-no-update.c

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-no-update.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-no-update.c
new file mode 100644
index 000..e3c2e5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-no-update.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't generate a prefixed form of the load and store with
+   update instructions (i.e. instead of generating LWZU we have to generate
+   PLWZ plus a PADDI).  */
+
+#ifndef SIZE
+#define SIZE 5
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;   /* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;   /* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;   /* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;   /* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mlwz\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mpaddi\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mplwzu\M}    } } */
+/* { dg-final { scan-assembler-not   {\mpstwu\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}     } } */
-- 
1.8.3.1



[PATCH 5/7] PowerPC tests: Prefixed insn with large offsets

2020-06-01 Thread Michael Meissner via Gcc-patches
Add tests to make sure that, for -mcpu=future, prefixed load/store instructions
are generated if the offset does not fit in 16 bits.
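
As a rough illustration of the ranges involved, here is a minimal sketch.  It
is not code from GCC: fits_signed and needs_prefixed_offset are hypothetical
helper names, and the sketch assumes that a traditional load/store takes a
signed 16-bit displacement while a prefixed one takes a signed 34-bit
displacement (the "34-bit offset" mentioned in the test comments).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if VALUE is representable as a signed
// two's-complement integer of BITS bits (1 <= BITS <= 63).
constexpr bool fits_signed(std::int64_t value, unsigned bits)
{
  return value >= -(std::int64_t{1} << (bits - 1))
         && value <= (std::int64_t{1} << (bits - 1)) - 1;
}

// Hypothetical helper: true if OFFSET is too large for a signed 16-bit
// displacement but still fits the signed 34-bit displacement assumed for
// a prefixed load/store.
constexpr bool needs_prefixed_offset(std::int64_t offset)
{
  return !fits_signed(offset, 16) && fits_signed(offset, 34);
}
```

Under these assumptions, needs_prefixed_offset (8) is false, whereas a byte
offset such as 0x12345 * 4 falls in the range where only the prefixed form can
encode the displacement directly.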

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-large-dd.c: New test.
* gcc.target/powerpc/prefix-large-df.c: New test.
* gcc.target/powerpc/prefix-large-di.c: New test.
* gcc.target/powerpc/prefix-large-hi.c: New test.
* gcc.target/powerpc/prefix-large-kf.c: New test.
* gcc.target/powerpc/prefix-large-qi.c: New test.
* gcc.target/powerpc/prefix-large-sd.c: New test.
* gcc.target/powerpc/prefix-large-sf.c: New test.
* gcc.target/powerpc/prefix-large-si.c: New test.
* gcc.target/powerpc/prefix-large-udi.c: New test.
* gcc.target/powerpc/prefix-large-uhi.c: New test.
* gcc.target/powerpc/prefix-large-uqi.c: New test.
* gcc.target/powerpc/prefix-large-usi.c: New test.
* gcc.target/powerpc/prefix-large-v2df.c: New test.
* gcc.target/powerpc/prefix-large.h: Include file for new tests.
---
 gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-df.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-di.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c | 16 +++
 gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large-si.c | 13 ++
 .../gcc.target/powerpc/prefix-large-udi.c  | 14 ++
 .../gcc.target/powerpc/prefix-large-uhi.c  | 14 ++
 .../gcc.target/powerpc/prefix-large-uqi.c  | 14 ++
 .../gcc.target/powerpc/prefix-large-usi.c  | 14 ++
 .../gcc.target/powerpc/prefix-large-v2df.c | 13 ++
 gcc/testsuite/gcc.target/powerpc/prefix-large.h| 51 ++
 15 files changed, 240 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-large.h

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
new file mode 100644
index 000..2000fdd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for _Decimal64 objects.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-large-df.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
new file mode 100644
index 000..48c497b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for double objects.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-large-di.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
new file mode 100644
index 000..aeb879e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed 

[PATCH 2/7] PowerPC tests: Add PLI/PADDI tests.

2020-06-01 Thread Michael Meissner via Gcc-patches
Add tests for -mcpu=future that test the generation of PADDI (and PLI which
becomes PADDI).

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-add.c: New test.
* gcc.target/powerpc/prefix-si-constant.c: New test.
* gcc.target/powerpc/prefix-di-constant.c: New test.
---
 gcc/testsuite/gcc.target/powerpc/prefix-add.c | 14 ++
 gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c | 13 +
 gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c   |  0
 gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c | 12 
 4 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-add.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-add.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-add.c
new file mode 100644
index 000..26ef23e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-add.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
+/* { dg-final { scan-assembler-not {\maddi\M}  } } */
+/* { dg-final { scan-assembler-not {\maddis\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
new file mode 100644
index 000..389fdaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long long
+large (void)
+{
+  return 0x12345678ULL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
new file mode 100644
index 000..e69de29
diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c
new file mode 100644
index 000..269fc0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
-- 
1.8.3.1



[PATCH 1/7] PowerPC tests: Add prefixed/pcrel tests.

2020-06-01 Thread Michael Meissner via Gcc-patches
2020-06-01  Michael Meissner  

* lib/target-supports.exp (check_effective_target_powerpc_pcrel):
New.
(check_effective_target_powerpc_prefixed_addr): New.
---
 gcc/testsuite/lib/target-supports.exp | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index b335108..9d880f4 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2163,6 +2163,25 @@ proc check_p9vector_hw_available { } {
 }]
 }
 
+# Return 1 if the target generates PC-relative instructions automatically for
+# the PowerPC 'future' machine.
+proc check_effective_target_powerpc_pcrel { } {
+return [check_no_messages_and_pattern powerpc_pcrel \
+   {\mpla\M} assembly {
+   static unsigned short s;
+   unsigned short *p_foo (void) { return &s; }
+   } {-O2 -mcpu=future}]
+}
+
+# Return 1 if the target generates prefixed instructions automatically for the
+# PowerPC 'future' machine.
+proc check_effective_target_powerpc_prefixed_addr { } {
+return [check_no_messages_and_pattern powerpc_prefixed_addr \
+   {\mplwz\M} assembly {
+   unsigned int foo (unsigned int *p) { return p[0x12345]; }
+   } {-O2 -mcpu=future}]
+}
+
 # Return 1 if the target supports executing power9 modulo instructions, 0
 # otherwise.  Cache the result.
 
-- 
1.8.3.1



[PATCH 3/7] PowerPC tests: Add prefixed vs. DS/DQ instruction tests.

2020-06-01 Thread Michael Meissner via Gcc-patches
Add a test to make sure prefixed load/store instructions are generated if the
offset would not fit in the DS/DQ encodings.
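
The constraint being tested can be sketched as follows.  This is only an
illustration, not GCC code: OffsetForm, required_multiple and needs_prefixed
are hypothetical names.  It assumes that D-form instructions (e.g. LWZ, LFD)
accept any signed 16-bit byte offset, DS-form instructions (e.g. LWA, LD, STD)
need the offset to be a multiple of 4, and DQ-form instructions (e.g. LXV,
STXV) need a multiple of 16.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of the non-prefixed displacement encodings.
enum class OffsetForm { D, DS, DQ };

// Alignment the displacement must satisfy for each form: any byte offset
// for D-form, a multiple of 4 for DS-form, a multiple of 16 for DQ-form.
constexpr std::int64_t required_multiple(OffsetForm form)
{
  return form == OffsetForm::DS ? 4 : form == OffsetForm::DQ ? 16 : 1;
}

// True if OFFSET cannot be encoded in the traditional instruction, either
// because it is outside the signed 16-bit range or because it violates
// the alignment of the DS/DQ field.
constexpr bool needs_prefixed(OffsetForm form, std::int64_t offset)
{
  return offset < -32768 || offset > 32767
         || offset % required_multiple(form) != 0;
}
```

With the packed structure used in the test, this matches the expected output:
the word load at offset 41 can stay LWZ (D-form), while the sign-extending
load at offset 49 (DS-form LWA) and the vector load at offset 73 (DQ-form LXV)
have to become PLWA and PLXV.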

2020-06-01  Michael Meissner  

* gcc.target/powerpc/prefix-ds-dq.c: New test.
---
 gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c | 159 
 1 file changed, 159 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
index e69de29..68fbad3 100644
--- a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
@@ -0,0 +1,159 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we generate a prefixed load/store operation for addresses that
+   don't meet DS/DQ offset constraints.  */
+
+struct packed_struct
+{
+  long long pad;                /* offset  0 bytes.  */
+  unsigned char pad_uc;         /* offset  8 bytes.  */
+  unsigned char uc;             /* offset  9 bytes.  */
+
+  unsigned char pad_sc[sizeof (long long) - sizeof (unsigned char)];
+  unsigned char sc;             /* offset  17 bytes.  */
+
+  unsigned char pad_us[sizeof (long long) - sizeof (signed char)];
+  unsigned short us;            /* offset  25 bytes.  */
+
+  unsigned char pad_ss[sizeof (long long) - sizeof (unsigned short)];
+  short ss;                     /* offset 33 bytes.  */
+
+  unsigned char pad_ui[sizeof (long long) - sizeof (short)];
+  unsigned int ui;              /* offset 41 bytes.  */
+
+  unsigned char pad_si[sizeof (long long) - sizeof (unsigned int)];
+  unsigned int si;              /* offset 49 bytes.  */
+
+  unsigned char pad_f[sizeof (long long) - sizeof (int)];
+  float f;                      /* offset 57 bytes.  */
+
+  unsigned char pad_d[sizeof (long long) - sizeof (float)];
+  double d;                     /* offset 65 bytes.  */
+  __float128 f128;              /* offset 73 bytes.  */
+} __attribute__((packed));
+
+unsigned char
+load_uc (struct packed_struct *p)
+{
+  return p->uc;/* LBZ 3,9(3).  */
+}
+
+signed char
+load_sc (struct packed_struct *p)
+{
+  return p->sc;/* LBZ 3,17(3) + EXTSB 3,3.  */
+}
+
+unsigned short
+load_us (struct packed_struct *p)
+{
+  return p->us;/* LHZ 3,25(3).  */
+}
+
+short
+load_ss (struct packed_struct *p)
+{
+  return p->ss;/* LHA 3,33(3).  */
+}
+
+unsigned int
+load_ui (struct packed_struct *p)
+{
+  return p->ui;/* LWZ 3,41(3).  */
+}
+
+int
+load_si (struct packed_struct *p)
+{
+  return p->si;/* PLWA 3,49(3).  */
+}
+
+float
+load_float (struct packed_struct *p)
+{
+  return p->f; /* LFS 1,57(3).  */
+}
+
+double
+load_double (struct packed_struct *p)
+{
+  return p->d; /* LFD 1,65(3).  */
+}
+
+__float128
+load_float128 (struct packed_struct *p)
+{
+  return p->f128;  /* PLXV 34,73(3).  */
+}
+
+void
+store_uc (struct packed_struct *p, unsigned char uc)
+{
+  p->uc = uc;  /* STB 4,9(3).  */
+}
+
+void
+store_sc (struct packed_struct *p, signed char sc)
+{
+  p->sc = sc;  /* STB 4,17(3).  */
+}
+
+void
+store_us (struct packed_struct *p, unsigned short us)
+{
+  p->us = us;  /* STH 4,25(3).  */
+}
+
+void
+store_ss (struct packed_struct *p, signed short ss)
+{
+  p->ss = ss;  /* STH 4,33(3).  */
+}
+
+void
+store_ui (struct packed_struct *p, unsigned int ui)
+{
+  p->ui = ui;  /* STW 4,41(3).  */
+}
+
+void
+store_si (struct packed_struct *p, signed int si)
+{
+  p->si = si;  /* STW 4,49(3).  */
+}
+
+void
+store_float (struct packed_struct *p, float f)
+{
+  p->f = f;/* STFS 1,57(3).  */
+}
+
+void
+store_double (struct packed_struct *p, double d)
+{
+  p->d = d;/* STFD 1,65(3).  */
+}
+
+void
+store_float128 (struct packed_struct *p, __float128 f128)
+{
+  p->f128 = f128;  /* PSTXV 34,1(3).  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */

PowerPC tests for -mcpu=future

2020-06-01 Thread Michael Meissner via Gcc-patches
This thread contains seven patches that add tests for the -mcpu=future code
generation.  These patches are an update to the patches I sent out in April.

https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544653.html

I have done bootstrap builds with/without the patches on a little-endian power9
box, and there were no regressions in any of the tests run.  I verified that
these tests do run and succeed.  Can I check them into the master branch?



Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-06-01 Thread Richard Sandiford
Could you go into more detail about this choice of cost calculation?
It looks like we first calculate per-group flags, which are true only if
the unrolled offsets are valid for all uses in the group.  Then we create
per-candidate flags when associating candidates with groups.

Instead, couldn't we take this into account in get_address_cost,
which calculates the cost of an address use for a given candidate?
E.g. after the main if-else at the start of the function,
perhaps it would make sense to add the worst-case offset to
the address in “parts”, check whether that too is a valid address,
and if not, increase var_cost by the cost of one add instruction.

I guess there are two main sources of inexactness if we do that:

(1) It might underestimate the cost because it assumes that vuse[0]
stands for all vuses in the group.

(2) It might overestimate the cost because it treats all unrolled
iterations as having the cost of the final unrolled iteration.

(1) could perhaps be avoided by adding a flag to the iv_use to say
whether it wants this treatment.  I think the flag approach suffers
from (2) too, and I'd be surprised if it makes a difference in practice.
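
To make the suggestion concrete, here is a rough sketch of the idea.  It is
not ivopts code: offset_is_valid and address_cost_with_unroll are hypothetical
names, the signed 16-bit range merely stands in for whatever the target
accepts as a displacement, and the worst-case offset is taken to be
(UF-1)*step past the original one, as described earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the target query "is this displacement a
// valid address offset?".  A signed 16-bit range is assumed here.
inline bool offset_is_valid(std::int64_t offset)
{
  return offset >= -32768 && offset <= 32767;
}

// Hypothetical model of the adjustment: start from the cost of the
// address use and add the cost of one add instruction if the offset of
// the last unrolled copy, base_offset + (unroll_factor - 1) * step, is
// no longer a valid displacement.
inline int address_cost_with_unroll(int var_cost, std::int64_t base_offset,
                                    std::int64_t step, int unroll_factor,
                                    int add_cost)
{
  const std::int64_t worst_offset = base_offset + (unroll_factor - 1) * step;
  if (!offset_is_valid(worst_offset))
    var_cost += add_cost;
  return var_cost;
}
```

For example, with a step of 8 and an unroll factor of 4, a use at offset 0
keeps its cost, while a use at offset 32760 would be charged one extra add
because its last unrolled copy falls outside the assumed range.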

Thanks,
Richard


Re: [PATCH] c++: constrained lambda inside template [PR92633]

2020-06-01 Thread Jason Merrill via Gcc-patches

On 6/1/20 12:47 PM, Patrick Palka wrote:

When regenerating a constrained lambda during instantiation of an
enclosing template, we are forgetting to substitute into the lambda's
constraints.  Fix this by substituting through the constraints during
tsubst_lambda_expr.

Passes 'make check-c++', and also tested by building the testsuites of
cmcstl2 and range-v3.  Does this look OK to commit to master and to the
10 branch after a full bootstrap and regtest?


OK for both.


gcc/cp/ChangeLog:

PR c++/92633
PR c++/92838
* pt.c (tsubst_function_decl): Don't do set_constraints when
regenerating a lambda.
(tsubst_lambda_expr): Substitute into the lambda's constraints
and do set_constraints here.

gcc/testsuite/ChangeLog:

PR c++/92633
PR c++/92838
* g++.dg/cpp2a/concepts-lambda11.C: New test.
* g++.dg/cpp2a/concepts-lambda12.C: New test.
---
  gcc/cp/pt.c| 16 +++-
  gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C | 17 +
  gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C | 15 +++
  3 files changed, 47 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index df647af7b46..907ca879c73 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13854,7 +13854,10 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
   don't substitute through the constraints; that's only done when
   they are checked.  */
if (tree ci = get_constraints (t))
-set_constraints (r, ci);
+/* Unless we're regenerating a lambda, in which case we'll set the
+   lambda's constraints in tsubst_lambda_expr.  */
+if (!lambda_fntype)
+  set_constraints (r, ci);
  
if (DECL_FRIEND_P (t) && DECL_FRIEND_CONTEXT (t))

  SET_DECL_FRIEND_CONTEXT (r,
@@ -19029,6 +19032,17 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  finish_member_declaration (fn);
}
  
+  if (tree ci = get_constraints (oldfn))

+   {
+ /* Substitute into the lambda's constraints.  */
+ if (oldtmpl)
+   ++processing_template_decl;
+ ci = tsubst_constraint_info (ci, args, complain, in_decl);
+ if (oldtmpl)
+   --processing_template_decl;
+ set_constraints (fn, ci);
+   }
+
/* Let finish_function set this.  */
DECL_DECLARED_CONSTEXPR_P (fn) = false;
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C

new file mode 100644
index 000..dd9cd4e2344
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C
@@ -0,0 +1,17 @@
+// PR c++/92838
+// { dg-do compile { target c++20 } }
+
+template
+auto foo()
+{
+  [] () requires (N != 0) { }(); // { dg-error "no match" }
+  [] () requires (N == 0) { }();
+
+  []  () requires (N == M) { }(); // { dg-error "no match" }
+  []  () requires (N != M) { }();
+}
+
+void bar()
+{
+  foo<0>();
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C
new file mode 100644
index 000..2bc9fd0bb25
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C
@@ -0,0 +1,15 @@
+// PR c++/92633
+// { dg-do compile { target c++20 } }
+
+template
+concept different_than = !__is_same_as(A, B);
+
+template
+auto diff(B) {
+return [](different_than auto a) {};
+}
+
+int main() {
+diff(42)("");
+diff(42)(42); // { dg-error "no match" }
+}





Re: [PATCH] c++: premature requires-expression folding [PR95020]

2020-06-01 Thread Jason Merrill via Gcc-patches

On 5/30/20 12:37 AM, Patrick Palka wrote:

On Wed, 13 May 2020, Jason Merrill wrote:


On 5/11/20 6:43 PM, Patrick Palka wrote:

In the testcase below we're prematurely folding away the
requires-expression to 'true' after substituting in the function's
template arguments, but before substituting in the lambda's deduced
template arguments.

This happens because during the first tsubst_requires_expr,
processing_template_decl is 1 but 'args' is just {void} and therefore
non-dependent, so we end up folding away the requires-expression to
boolean_true_node before we could substitute in the lambda's template
arguments and determine that '*v' is ill-formed.

This patch removes the uses_template_parms check when deciding in
tsubst_requires_expr whether to keep around a new requires-expression.
Regardless of whether the template arguments are dependent, there still
might be more template parameters to later substitute in -- as in the
testcase below -- and even if not, tsubst_expr doesn't perform full
semantic processing unless !processing_template_decl, so it seems we
should wait until then to fold away the requires-expression.

Passes 'make check-c++', does this look OK to commit after a full
bootstrap/regtest?


OK.


Would the same patch be OK to backport to the GCC 10 branch?


Yes.




gcc/cp/ChangeLog:

PR c++/95020
* constraint.c (tsubst_requires_expr): Produce a new
requires-expression when processing_template_decl, even if
template arguments are not dependent.

gcc/testsuite/ChangeLog:

PR c++/95020
* g++/cpp2a/concepts-lambda7.C: New test.
---
   gcc/cp/constraint.cc  |  4 +---
   gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C | 14 ++
   2 files changed, 15 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 4ad17f3b7d8..8ee347cae60 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2173,9 +2173,7 @@ tsubst_requires_expr (tree t, tree args,
 if (reqs == error_mark_node)
   return boolean_false_node;
   -  /* In certain cases, produce a new requires-expression.
- Otherwise the value of the expression is true.  */
-  if (processing_template_decl && uses_template_parms (args))
+  if (processing_template_decl)
   return finish_requires_expr (cp_expr_location (t), parms, reqs);
   return boolean_true_node;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
new file mode 100644
index 000..50746b777a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
@@ -0,0 +1,14 @@
+// PR c++/95020
+// { dg-do compile { target c++2a } }
+
+template
+void foo() {
+  auto t = [](auto v) {
+static_assert(requires { *v; }); // { dg-error "static assertion failed" }
+  };
+  t(0);
+}
+
+void bar() {
+  foo();
+}


Re: [committed] libstdc++: Update/streamline Valgrind references

2020-06-01 Thread Jonathan Wakely via Gcc-patches

On 01/06/20 17:06 +0200, Gerald Pfeifer wrote:

Like many sites over the last year(s) valgrind.org has now moved to
https.  While there, replace the second of two links in the same vicinity
by a purely textual reference -- easier to maintain, and in particular
also better from a user experience perspective.


Thanks.

I've also committed a couple more doc improvements, as attached.

commit 258059d91bd0e27cc335312f4558e1b339a2e77d
Author: Jonathan Wakely 
Date:   Mon Jun 1 16:43:01 2020 +0100

libstdc++: Document API changes in GCC 10

* doc/xml/manual/evolution.xml: Document deprecation of
__is_nullptr_t and removal of std::allocator members.
* doc/html/manual/api.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/evolution.xml b/libstdc++-v3/doc/xml/manual/evolution.xml
index ab04c1ad272..623d53e7faf 100644
--- a/libstdc++-v3/doc/xml/manual/evolution.xml
+++ b/libstdc++-v3/doc/xml/manual/evolution.xml
@@ -955,11 +955,23 @@ now defaults to zero.
 
 
 
+
+  The non-standard std::__is_nullptr_t type trait
+  was deprecated.
+
+
 
   The std::packaged_task constructors taking
   an allocator argument are only defined for C++11 and C++14.
 
 
+
+  Several members of std::allocator were removed
+  for C++20 mode. The removed functionality has been provided by
+  std::allocator_traits since C++11 and that should
+  be used instead.
+
+
 
 
 

commit a1ffe9b6f4d0e2dd9493c5bd669fc5a2ea24a6f9
Author: Jonathan Wakely 
Date:   Mon Jun 1 16:40:13 2020 +0100

libstdc++: Fix incorrect Docbook links

    The <xref> element creates the link text automatically from the link
    target, rather than using the text node child of the element. This can
    be changed by using an endterm attribute, but it's simpler to just use
    the <link> element instead.
    
    * doc/xml/manual/containers.xml: Replace <xref> with <link>.
* doc/xml/manual/evolution.xml: Likewise.
* doc/html/manual/api.html: Regenerate.
* doc/html/manual/containers.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/containers.xml b/libstdc++-v3/doc/xml/manual/containers.xml
index 5c9854efbdd..6d568164b47 100644
--- a/libstdc++-v3/doc/xml/manual/containers.xml
+++ b/libstdc++-v3/doc/xml/manual/containers.xml
@@ -25,8 +25,8 @@
   list::size() is O(n)
 

- Yes it is, at least using the old
- ABI, and that's okay.  This is a decision that we preserved
+ Yes it is, at least using the old
+ ABI, and that's okay.  This is a decision that we preserved
  when we imported SGI's STL implementation.  The following is
  quoted from http://www.w3.org/1999/xlink; xlink:href="https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/FAQ.html;>their FAQ:

diff --git a/libstdc++-v3/doc/xml/manual/evolution.xml b/libstdc++-v3/doc/xml/manual/evolution.xml
index 1bd7bb1bb9f..ab04c1ad272 100644
--- a/libstdc++-v3/doc/xml/manual/evolution.xml
+++ b/libstdc++-v3/doc/xml/manual/evolution.xml
@@ -784,8 +784,8 @@ now defaults to zero.
 
 
   Assertions to check function preconditions can be enabled by defining the
-  _GLIBCXX_ASSERTIONS
-  macro.
+  _GLIBCXX_ASSERTIONS
+  macro.
   The initial set of assertions are a subset of the checks enabled by
   the Debug Mode, but without the ABI changes and changes to algorithmic
   complexity that are caused by enabling the full Debug Mode.


[committed] libstdc++: Fix __gnu_test::input_iterator_wrapper::operator++(int)

2020-06-01 Thread Jonathan Wakely via Gcc-patches
I noticed recently that our input_iterator_wrapper utility for writing
tests has the following post-increment operator:

void
operator++(int)
{
  ++*this;
}

That fails to meet the Cpp17InputIterator requirement that *r++ is
valid. This change makes it return a non-void proxy type that can be
dereferenced to produce another proxy, which is convertible to the
value_type. The second proxy converts to const T& to ensure it can't be
written to.
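
The shape of the fix can be shown on a stripped-down iterator.  This is only a
sketch: simple_input_iterator is a hypothetical stand-in for
input_iterator_wrapper without its bounds tracking, although the two proxy
types mirror the ones added by the patch.

```cpp
#include <cassert>

// Simplified, hypothetical input iterator over an array of T.
template<typename T>
struct simple_input_iterator
{
  // Returned by post-increment; the only thing it supports is '*'.
  struct post_inc_proxy
  {
    // Returned by dereferencing post_inc_proxy; converts to const T&
    // so the old element can be read but not written.
    struct deref_proxy
    {
      T* ptr;
      operator const T&() const { return *ptr; }
    } p;

    deref_proxy operator*() const { return p; }
  };

  T* ptr;

  T& operator*() const { return *ptr; }

  simple_input_iterator& operator++() { ++ptr; return *this; }

  post_inc_proxy operator++(int)
  {
    post_inc_proxy tmp = { { ptr } };  // remember the old position
    ++*this;                           // advance the real iterator
    return tmp;                        // '*tmp' still reads the old element
  }
};
```

With this in place, an expression such as "int x = *it++;" compiles and reads
the element the iterator pointed to before the increment, whereas with a
void-returning operator++(int) it would be ill-formed.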

* testsuite/util/testsuite_iterators.h:
(input_iterator_wrapper::operator++(int)): Return proxy object.

Tested powerpc64le-linux, committed to master.

commit 118158b646d402b0fb5d760e4827611b731fe6f3
Author: Jonathan Wakely 
Date:   Mon Jun 1 18:30:47 2020 +0100

libstdc++: Fix __gnu_test::input_iterator_wrapper::operator++(int)

I noticed recently that our input_iterator_wrapper utility for writing
tests has the following post-increment operator:

void
operator++(int)
{
  ++*this;
}

That fails to meet the Cpp17InputIterator requirement that *r++ is
valid. This change makes it return a non-void proxy type that can be
    dereferenced to produce another proxy, which is convertible to the
value_type. The second proxy converts to const T& to ensure it can't be
written to.

* testsuite/util/testsuite_iterators.h:
(input_iterator_wrapper::operator++(int)): Return proxy object.

diff --git a/libstdc++-v3/testsuite/util/testsuite_iterators.h 
b/libstdc++-v3/testsuite/util/testsuite_iterators.h
index 5be47f47915..71b672c85fa 100644
--- a/libstdc++-v3/testsuite/util/testsuite_iterators.h
+++ b/libstdc++-v3/testsuite/util/testsuite_iterators.h
@@ -208,6 +208,17 @@ namespace __gnu_test
   : public std::iterator::type,
 std::ptrdiff_t, T*, T&>
   {
+struct post_inc_proxy
+{
+  struct deref_proxy
+  {
+   T* ptr;
+   operator const T&() const { return *ptr; }
+  } p;
+
+  deref_proxy operator*() const { return p; }
+};
+
   protected:
 input_iterator_wrapper() : ptr(0), SharedInfo(0)
 { }
@@ -266,10 +277,12 @@ namespace __gnu_test
   return *this;
 }
 
-void
+post_inc_proxy
 operator++(int)
 {
+  post_inc_proxy tmp = { { ptr } };
   ++*this;
+  return tmp;
 }
 
 #if __cplusplus >= 201103L


[PATCH] contrib: Improve comments and error text

2020-06-01 Thread Jonathan Wakely via Gcc-patches
* gcc-changelog/git_commit.py (GitCommit.check_mentioned_files):
Improve error text.

OK for master?

commit b0eb103fc6a8b12905ce8feea299e02048b7f820
Author: Jonathan Wakely 
Date:   Mon Jun 1 18:28:35 2020 +0100

contrib: Improve comments and error text

* gcc-changelog/git_commit.py (GitCommit.check_mentioned_files):
Improve error text.

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 4f82b58f64b..7fbc029408d 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -176,7 +176,7 @@ class Error:
 class ChangeLogEntry:
 def __init__(self, folder, authors, prs):
 self.folder = folder
-# Python2 has not 'copy' function
+# The 'list.copy()' function is not available before Python 3.3
 self.author_lines = list(authors)
 self.initial_prs = list(prs)
 self.prs = list(prs)
@@ -209,7 +209,7 @@ class ChangeLogEntry:
 line = line[:line.index(':')]
 in_location = False
 
-# At this point, all that 's left is a list of filenames
+# At this point, all that's left is a list of filenames
 # separated by commas and whitespaces.
 for file in line.split(','):
 file = file.strip()
@@ -503,7 +503,7 @@ class GitCommit:
 mentioned_files = set()
 for entry in self.changelog_entries:
 if not entry.files:
-msg = 'ChangeLog must contain a file entry'
+msg = 'ChangeLog must contain at least one file entry'
 self.errors.append(Error(msg, entry.folder))
 assert not entry.folder.endswith('/')
 for file in entry.files:


Re: [IMPORTANT] ChangeLog related changes

2020-06-01 Thread Jonathan Wakely via Gcc-patches
On Mon, 25 May 2020 at 23:50, Jakub Jelinek via Gcc  wrote:
>
> Hi!
>
> I've turned the strict mode of Martin Liška's hook changes,
> which means that from now on no commits to the trunk or release branches
> should be changing any ChangeLog files together with the other files,
> ChangeLog entry should be solely in the commit message.
> The DATESTAMP bumping script will be updating the ChangeLog files for you.
> If somebody makes a mistake in that, please wait 24 hours (at least until
> after 00:16 UTC after your commit) so that the script will create the
> ChangeLog entries, and afterwards it can be fixed by adjusting the ChangeLog
> files.  But you can only touch the ChangeLog files in that case (and
> shouldn't write a ChangeLog entry for that in the commit message).
>
> If anything goes wrong, please let me, other RMs and Martin Liška know.

The libstdc++ manual is written in Docbook XML, but we commit both the
XML and generated HTML pages to Git. Sometimes a small XML change can
result in dozens of mechanical changes to the generated HTML files,
which we record in the ChangeLog as:

* doc/html/*: Regenerated.

With the new checks we need to name every generated file individually.

If we add that directory to the ignored_prefixes list, we won't need
to name them. But then the doc/html/* entry will give an error, and
changes to the HTML files can be committed without any ChangeLog
entry. Should we just stop mentioning the HTML in the ChangeLog?

We could do something like the attached patch, but it seems overkill
for this one special case.
diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 4f82b58f64b..add0defaeed 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -501,6 +501,7 @@ class GitCommit:
 assert folder_count == len(self.changelog_entries)
 
 mentioned_files = set()
+libstdcxx_html_regenerated = False
 for entry in self.changelog_entries:
 if not entry.files:
 msg = 'ChangeLog must contain a file entry'
@@ -508,16 +509,33 @@ class GitCommit:
 assert not entry.folder.endswith('/')
 for file in entry.files:
 if not self.is_changelog_filename(file):
-mentioned_files.add(os.path.join(entry.folder, file))
+file = os.path.join(entry.folder, file)
+if file == 'libstdc++-v3/doc/html/*':
+libstdcxx_html_regenerated = True
+else:
+mentioned_files.add(file)
 
 cand = [x[0] for x in self.modified_files
 if not self.is_changelog_filename(x[0])]
 changed_files = set(cand)
+if libstdcxx_html_regenerated:
+libstdcxx_html_regenerated = False
+for c in changed_files:
+if c.startswith('libstdc++-v3/doc/html/'):
+libstdcxx_html_regenerated = True
+break
+if not libstdcxx_html_regenerated:
+self.errors.append(Error('No libstdc++ HTML changes found'))
+
 for file in sorted(mentioned_files - changed_files):
 self.errors.append(Error('file not changed in a patch', file))
 for file in sorted(changed_files - mentioned_files):
 if not self.in_ignored_location(file):
-if file in self.new_files:
+if file.startswith('libstdc++-v3/doc/html/'):
+if not libstdcxx_html_regenerated:
+msg = 'libstdc++ HTML changes not in ChangeLog'
+self.errors.append(Error(msg, file))
+elif file in self.new_files:
 changelog_location = self.get_changelog_by_path(file)
 # Python2: we cannot use next(filter(...))
 entries = filter(lambda x: x.folder == changelog_location,


Re: [PATCH] coroutines: co_returns are statements, not expressions.

2020-06-01 Thread Nathan Sidwell

On 6/1/20 4:56 AM, Iain Sandoe wrote:

Hi

This corrects a typo in the CO_RETURN_EXPR tree class.

Although it doesn’t fix any PR or regression, it seems to me that it would be
sensible to apply this to 10.2 as well as master (or it’s an accident waiting to
happen).

OK for master?
10.2 (after some bake)?
thanks
Iain

gcc/cp/ChangeLog:

* cp-tree.def (CO_RETURN_EXPR): Correct the class
to use tcc_statement.


ok


---
  gcc/cp/cp-tree.def | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index 1454802bf68..99851eb780f 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -594,9 +594,9 @@ DEFTREECODE (CO_YIELD_EXPR, "co_yield", tcc_expression, 2)
  /* The co_return expression is used to support coroutines.
  
 Op0 is the original expr, can be void (for use in diagnostics)

-   Op2 is the promise return_ call for Op0. */
+   Op1 is the promise return_ call for for the expression given. */
  
-DEFTREECODE (CO_RETURN_EXPR, "co_return", tcc_expression, 2)

+DEFTREECODE (CO_RETURN_EXPR, "co_return", tcc_statement, 2)
  
  /*

  Local variables:




--
Nathan Sidwell


Re: [PATCH] coroutines: Fix missed ramp function return copy elision [PR95346].

2020-06-01 Thread Nathan Sidwell

On 6/1/20 4:46 AM, Iain Sandoe wrote:

Hi

Confusingly, "get_return_object ()" can do two things:
- Firstly it can provide the return object for the ramp function (as
   the name suggests).
- Secondly if the type of the ramp function is different from that
   of the get_return_object call, this is used as a single parameter
   to a CTOR for the ramp's return type.

In the first case we can rely on finish_return_stmt () to do the
necessary processing for copy elision.
In the second case, we should have passed a prvalue to the CTOR as
per the standard comment, but I had omitted the rvalue () call.  Fixed
thus.

tested on x86_64-darwin, x86_64-linux, powerpc64-linux
OK for master?
OK for 10.2?


ok for both, but I think there's an existing nit ...


thanks
Iain

gcc/cp/ChangeLog:

PR c++/95346
* coroutines.cc (morph_fn_to_coro): Ensure that the get-
return-object is constructed correctly; When it is not the
final return value, pass it to the CTOR of the return type
as an rvalue, per the standard comment.

gcc/testsuite/ChangeLog:

PR c++/95346
* g++.dg/coroutines/pr95346.C: New test.
---
  gcc/cp/coroutines.cc  | 70 +++
  gcc/testsuite/g++.dg/coroutines/pr95346.C | 26 +
  2 files changed, 71 insertions(+), 25 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95346.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 7afa550037c..d1c2b437ade 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc




{
- args = make_tree_vector_single (gro);
- arglist = 
+ vec *args = NULL;
+ vec **arglist = NULL;
+ if (!gro_is_void_p)
+   {
+ args = make_tree_vector_single (r);
+ arglist = 
+   }
+ r = build_special_member_call (NULL_TREE,
+complete_ctor_identifier, arglist,
+fn_return_type, LOOKUP_NORMAL,
+tf_warning_or_error);
+ r = build_cplus_new (fn_return_type, r, tf_warning_or_error);


missing release_tree_vector (arg) call here?

--
Nathan Sidwell


Cleanup global decl stream reference streaming, part 2

2020-06-01 Thread Jan Hubicka
Hi,
this patch removes unnecessary ref tags and replaces them by one tag
for all references to the global stream.

lto-bootstrapped/regtested x86_64-linux, committed.

Honza

gcc/ChangeLog:

2020-06-01  Jan Hubicka  

* lto-streamer.h (enum LTO_tags): Remove LTO_field_decl_ref,
LTO_function_decl_ref, LTO_label_decl_ref, LTO_namespace_decl_ref,
LTO_result_decl_ref, LTO_type_decl_ref, LTO_type_ref,
LTO_const_decl_ref, LTO_imported_decl_ref,
LTO_translation_unit_decl_ref, LTO_global_decl_ref and
LTO_namelist_decl_ref; add LTO_global_stream_ref.
* lto-streamer-in.c (lto_input_tree_ref): Simplify.
(lto_input_scc): Update.
(lto_input_tree_1): Update.
* lto-streamer-out.c (lto_indexable_tree_ref): Simplify.
* lto-streamer.c (lto_tag_name): Update.

diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index d77b4f5e9ff..5eaba7d16d4 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -316,34 +316,17 @@ lto_input_tree_ref (class lto_input_block *ib, class data_in *data_in,
   unsigned HOST_WIDE_INT ix_u;
   tree result = NULL_TREE;
 
-  lto_tag_check_range (tag, LTO_field_decl_ref, LTO_namelist_decl_ref);
-
-  switch (tag)
+  if (tag == LTO_ssa_name_ref)
 {
-case LTO_ssa_name_ref:
   ix_u = streamer_read_uhwi (ib);
   result = (*SSANAMES (fn))[ix_u];
-  break;
-
-case LTO_type_ref:
-case LTO_field_decl_ref:
-case LTO_function_decl_ref:
-case LTO_type_decl_ref:
-case LTO_namespace_decl_ref:
-case LTO_global_decl_ref:
-case LTO_result_decl_ref:
-case LTO_const_decl_ref:
-case LTO_imported_decl_ref:
-case LTO_label_decl_ref:
-case LTO_translation_unit_decl_ref:
-case LTO_namelist_decl_ref:
+}
+  else
+{
+  gcc_checking_assert (tag == LTO_global_stream_ref);
   ix_u = streamer_read_uhwi (ib);
   result = (*data_in->file_data->current_decl_state
->streams[LTO_DECL_STREAM])[ix_u];
-  break;
-
-default:
-  gcc_unreachable ();
 }
 
   gcc_assert (result);
@@ -1485,7 +1468,7 @@ lto_input_scc (class lto_input_block *ib, class data_in *data_in,
{
  enum LTO_tags tag = streamer_read_record_start (ib);
  if (tag == LTO_null
- || (tag >= LTO_field_decl_ref && tag <= LTO_global_decl_ref)
+ || tag == LTO_global_stream_ref
  || tag == LTO_tree_pickle_reference
  || tag == LTO_integer_cst
  || tag == LTO_tree_scc
@@ -1549,7 +1532,7 @@ lto_input_tree_1 (class lto_input_block *ib, class data_in *data_in,
 
   if (tag == LTO_null)
 result = NULL_TREE;
-  else if (tag >= LTO_field_decl_ref && tag <= LTO_namelist_decl_ref)
+  else if (tag == LTO_global_stream_ref || tag == LTO_ssa_name_ref)
 {
   /* If TAG is a reference to an indexable tree, the next value
 in IB is the index into the table where we expect to find
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index a44ed0037ee..dfc4603d7ae 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -252,84 +252,18 @@ static void
 lto_indexable_tree_ref (struct output_block *ob, tree expr,
enum LTO_tags *tag, unsigned *index)
 {
-  enum tree_code code;
-  enum lto_decl_stream_e_t encoder;
-
   gcc_checking_assert (tree_is_indexable (expr));
 
-  if (TYPE_P (expr))
+  if (TREE_CODE (expr) == SSA_NAME)
 {
-  *tag = LTO_type_ref;
-  encoder = LTO_DECL_STREAM;
+  *tag = LTO_ssa_name_ref;
+  *index = SSA_NAME_VERSION (expr);
 }
   else
 {
-  code = TREE_CODE (expr);
-  switch (code)
-   {
-   case SSA_NAME:
- *tag = LTO_ssa_name_ref;
- *index = SSA_NAME_VERSION (expr);
- return;
- break;
-
-   case FIELD_DECL:
- *tag = LTO_field_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case FUNCTION_DECL:
- *tag = LTO_function_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case VAR_DECL:
-   case DEBUG_EXPR_DECL:
- gcc_checking_assert (decl_function_context (expr) == NULL
-  || TREE_STATIC (expr));
- /* FALLTHRU */
-   case PARM_DECL:
- *tag = LTO_global_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case CONST_DECL:
- *tag = LTO_const_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case TYPE_DECL:
- *tag = LTO_type_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case NAMESPACE_DECL:
- *tag = LTO_namespace_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case LABEL_DECL:
- *tag = LTO_label_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case RESULT_DECL:
- *tag = LTO_result_decl_ref;
- encoder = LTO_DECL_STREAM;
- break;
-
-   case 

Re: [PATCH ping] ppc64 check for incompatible setting of minimal-toc

2020-06-01 Thread Douglas B Rupp

Greetings,

Curious if you've had a chance to look at this patch yet?

--Doug

On 5/18/20 4:02 PM, Douglas B Rupp wrote:

Greetings,

The attached patch is proposed for rs6000/linux64.h.

The problem it addresses is that the current check only tests whether
the switch is present, not whether its setting is actually incompatible.


For example:

$ powerpc64-linux-gnu-gcc -mcmodel=medium -mminimal-toc foo.c
is an incompatible set of switches

however

$ powerpc64-linux-gnu-gcc -mcmodel=medium -mno-minimal-toc foo.c
is ok.

Currently both are reported as incompatible.
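
In other words, the check should look at the effective value of the
switch, not merely at whether some form of it was given.  A rough sketch
of that logic, written as standalone C++ and not as the spec machinery
in linux64.h; the function name and the "last occurrence wins" handling
are assumptions made purely for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not GCC code: returns true only when the medium
// code model is selected AND minimal-toc is actually enabled.
bool switches_conflict(const std::vector<std::string>& args)
{
  bool medium_model = false;
  bool minimal_toc = false;

  for (std::size_t i = 0; i < args.size(); ++i)
    {
      const std::string& a = args[i];
      if (a == "-mcmodel=medium")
        medium_model = true;
      else if (a.compare(0, 9, "-mcmodel=") == 0)
        medium_model = false;      // some other code model was chosen
      else if (a == "-mminimal-toc")
        minimal_toc = true;
      else if (a == "-mno-minimal-toc")
        minimal_toc = false;       // explicitly disabled: no conflict
    }

  return medium_model && minimal_toc;
}
```

A presence-only test would flag both command lines above; a value-based
test like this one flags only the first.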

--Douglas Rupp, AdaCore



Re: [PATCH] Fix unrecognised -mcpu target: armv7-a on arm-wrs-vxworks7 (PR95420)

2020-06-01 Thread Olivier Hainque
Hello Iain,

> On 01 Jun 2020, at 00:40, Iain Buclaw  wrote:
> 
> Hi,
> 
> In the removal of arm-wrs-vxworks, the default cpu was updated from arm8
> to armv7-a, but this is not recognized as a valid -mcpu target.  There
> is however generic-armv7-a, which was likely the intended cpu that
> should have been used instead.

Yes, indeed.

> Tested by building a cross-compiler targeting arm-wrs-vxworks7, running
> make all-gcc and ensuring it succeeds.
> 
> OK?

Yes, OK.

>  This affects release/gcc-10 branch as well, so should be
> backported too.

Certainly. Could you please?

Thanks!

Olivier



[PATCH] c++: constrained lambda inside template [PR92633]

2020-06-01 Thread Patrick Palka via Gcc-patches
When regenerating a constrained lambda during instantiation of an
enclosing template, we are forgetting to substitute into the lambda's
constraints.  Fix this by substituting through the constraints during
tsubst_lambda_expr.

Passes 'make check-c++', and also tested by building the testsuites of
cmcstl2 and range-v3.  Does this look OK to commit to master and to the
10 branch after a full bootstrap and regtest?

gcc/cp/ChangeLog:

PR c++/92633
PR c++/92838
* pt.c (tsubst_function_decl): Don't do set_constraints when
regenerating a lambda.
(tsubst_lambda_expr): Substitute into the lambda's constraints
and do set_constraints here.

gcc/testsuite/ChangeLog:

PR c++/92633
PR c++/92838
* g++.dg/cpp2a/concepts-lambda11.C: New test.
* g++.dg/cpp2a/concepts-lambda12.C: New test.
---
 gcc/cp/pt.c| 16 +++-
 gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C | 17 +
 gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C | 15 +++
 3 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index df647af7b46..907ca879c73 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13854,7 +13854,10 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
  don't substitute through the constraints; that's only done when
  they are checked.  */
   if (tree ci = get_constraints (t))
-set_constraints (r, ci);
+/* Unless we're regenerating a lambda, in which case we'll set the
+   lambda's constraints in tsubst_lambda_expr.  */
+if (!lambda_fntype)
+  set_constraints (r, ci);
 
   if (DECL_FRIEND_P (t) && DECL_FRIEND_CONTEXT (t))
 SET_DECL_FRIEND_CONTEXT (r,
@@ -19029,6 +19032,17 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
  finish_member_declaration (fn);
}
 
+  if (tree ci = get_constraints (oldfn))
+   {
+ /* Substitute into the lambda's constraints.  */
+ if (oldtmpl)
+   ++processing_template_decl;
+ ci = tsubst_constraint_info (ci, args, complain, in_decl);
+ if (oldtmpl)
+   --processing_template_decl;
+ set_constraints (fn, ci);
+   }
+
   /* Let finish_function set this.  */
   DECL_DECLARED_CONSTEXPR_P (fn) = false;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C
new file mode 100644
index 000..dd9cd4e2344
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda11.C
@@ -0,0 +1,17 @@
+// PR c++/92838
+// { dg-do compile { target c++20 } }
+
+template
+auto foo()
+{
+  [] () requires (N != 0) { }(); // { dg-error "no match" }
+  [] () requires (N == 0) { }();
+
+  []  () requires (N == M) { }(); // { dg-error "no match" }
+  []  () requires (N != M) { }();
+}
+
+void bar()
+{
+  foo<0>();
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C
new file mode 100644
index 000..2bc9fd0bb25
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda12.C
@@ -0,0 +1,15 @@
+// PR c++/92633
+// { dg-do compile { target c++20 } }
+
+template
+concept different_than = !__is_same_as(A, B);
+
+template
+auto diff(B) {
+return [](different_than auto a) {};
+}
+
+int main() {
+diff(42)("");
+diff(42)(42); // { dg-error "no match" }
+}
-- 
2.27.0.rc1.5.gae92ac8ae3



Re: [PATCH] Fix unrecognised -mcpu target: armv7-a on arm-wrs-vxworks7 (PR95420)

2020-06-01 Thread Richard Earnshaw
On 31/05/2020 23:40, Iain Buclaw via Gcc-patches wrote:
> Hi,
> 
> In the removal of arm-wrs-vxworks, the default cpu was updated from arm8
> to armv7-a, but this is not recognized as a valid -mcpu target.  There
> is however generic-armv7-a, which was likely the intended cpu that
> should have been used instead.
> 
> Tested by building a cross-compiler targeting arm-wrs-vxworks7, running
> make all-gcc and ensuring it succeeds.
> 
> OK?  This affects release/gcc-10 branch as well, so should be
> backported too.
> 
> Regards
> Iain.
> 
> 
> gcc/ChangeLog:
> 
>   PR target/95420
>   * config.gcc (arm-wrs-vxworks7*): Set default cpu to generic-armv7-a.
> ---
>  gcc/config.gcc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index f544932fc39..06ad813ad39 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1193,7 +1193,7 @@ arm-wrs-vxworks7*)
>   tmake_file="${tmake_file} arm/t-arm arm/t-vxworks arm/t-bpabi"
>   tm_file="elfos.h arm/elf.h arm/bpabi.h arm/aout.h ${tm_file}"
>   tm_file="${tm_file} vx-common.h vxworks.h arm/vxworks.h"
> - target_cpu_cname="armv7-a"
> + target_cpu_cname="generic-armv7-a"
>   need_64bit_hwint=yes
>   ;;
>  arm*-*-freebsd*)# ARM FreeBSD EABI
> 

OK all.

Thanks,

R.


Re: [PATCH] favor bcrypt over wincrypt for the random generator on Windows

2020-06-01 Thread Richard Sandiford
Steve Lhomme  writes:
> Hello,
>
> Any update on this ? This prevents libssp from being usable in UWP apps.
>
> (BTW the name of the old API is not wincrypt, the header, but CryptoAPI 
> or CAPI)

Sorry for the slow review.  I fear most global reviewers would have
no idea whether the patch is right or not.  Maybe Jon (cc:ed) could
comment.

Thanks,
Richard

>
> On 2020-04-21 9:48, Steve Lhomme wrote:
>> BCrypt is more modern and supported in Universal Apps, Wincrypt is not and
>> CryptGenRandom is deprecated:
>> https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
>> 
>> BCrypt is available since Vista
>> https://docs.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptopenalgorithmprovider
>> 
>> It requires linking with bcrypt rather than advapi32 for wincrypt.
>> ---
>>   libssp/configure.ac | 16 
>>   libssp/ssp.c| 20 
>>   2 files changed, 36 insertions(+)
>> 
>> diff --git a/libssp/configure.ac b/libssp/configure.ac
>> index f30f81c54f6..a39d9e9c992 100644
>> --- a/libssp/configure.ac
>> +++ b/libssp/configure.ac
>> @@ -158,6 +158,22 @@ else
>>   fi
>>   AC_SUBST(ssp_have_usable_vsnprintf)
>>   
>> +AC_ARG_ENABLE(bcrypt,
>> +AS_HELP_STRING([--disable-bcrypt],
>> +  [use bcrypt for random generator on Windows (otherwise wincrypt)]),
>> +  use_win_bcrypt=$enableval,
>> +  use_win_bcrypt=yes)
>> +if test "x$use_win_bcrypt" != xno; then
>> +  case "$target_os" in
>> +win32 | pe | mingw32*)
>> +  AC_CHECK_TYPES([BCRYPT_ALG_HANDLE],[
>> +  LDFLAGS="$LDFLAGS -lbcrypt"
>> +],[],[#include 
>> +#include ])
>> +;;
>> +  esac
>> +fi
>> +
>>   AM_PROG_LIBTOOL
>>   ACX_LT_HOST_FLAGS
>>   AC_SUBST(enable_shared)
>> diff --git a/libssp/ssp.c b/libssp/ssp.c
>> index 28f3e9cc64a..f07cc41fd4f 100644
>> --- a/libssp/ssp.c
>> +++ b/libssp/ssp.c
>> @@ -56,7 +56,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>  to the console using  "CONOUT$"   */
>>   #if defined (_WIN32) && !defined (__CYGWIN__)
>>   #include 
>> +#ifdef HAVE_BCRYPT_ALG_HANDLE
>> +#include 
>> +#else
>>   #include 
>> +#endif
>>   # define _PATH_TTY "CONOUT$"
>>   #else
>>   # define _PATH_TTY "/dev/tty"
>> @@ -77,6 +81,21 @@ __guard_setup (void)
>>   return;
>>   
>>   #if defined (_WIN32) && !defined (__CYGWIN__)
>> +#ifdef HAVE_BCRYPT_ALG_HANDLE
>> +  BCRYPT_ALG_HANDLE algo = 0;
>> +  NTSTATUS err = BCryptOpenAlgorithmProvider(, BCRYPT_RNG_ALGORITHM,
>> + NULL, 0);
>> +  if (BCRYPT_SUCCESS(err))
>> +{
>> +  if (BCryptGenRandom(algo, (BYTE *)&__stack_chk_guard,
>> +  sizeof (__stack_chk_guard), 0) && __stack_chk_guard != 0)
>> +{
>> +   BCryptCloseAlgorithmProvider(algo, 0);
>> +   return;
>> +}
>> +  BCryptCloseAlgorithmProvider(algo, 0);
>> +}
>> +#else /* !HAVE_BCRYPT_ALG_HANDLE */
>> HCRYPTPROV hprovider = 0;
>> if (CryptAcquireContext(, NULL, NULL, PROV_RSA_FULL,
>> CRYPT_VERIFYCONTEXT | CRYPT_SILENT))
>> @@ -89,6 +108,7 @@ __guard_setup (void)
>>   }
>> CryptReleaseContext(hprovider, 0);
>>   }
>> +#endif /* !HAVE_BCRYPT_ALG_HANDLE */
>>   #else
>> int fd = open ("/dev/urandom", O_RDONLY);
>> if (fd != -1)
>> -- 
>> 2.17.1
>> 


Re: [PATCH] Prefer simple case changes in spelling suggestions

2020-06-01 Thread Tom Tromey
> "David" == David Malcolm  writes:

>> I tested this using the self-tests.  A new self-test is also
>> included.

> Did the full DejaGnu testsuite get run?  There are a lot of tests in it
> that make use of this code.

I didn't try it, but I can.

> The patch should probably update the leading comment to
> get_edit_distance.

Will do.

>> test_get_edit_distance_both_ways ("foo", "FOO", 3);
[...]

> If I'm reading things correctly, the patch here updates the existing
> tests to apply the BASE_COST scale factor, but I don't think it adds
> any direct checks of the cost of case-conversion.  It would be good to
> add those.

It isn't obvious but the foo/FOO test did change.
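
To make that concrete, here is a small sketch of the cost model.  It is
a plain Levenshtein distance, not the real get_edit_distance; the
function name and the values 1 and 2 for the two costs are assumptions
chosen so that foo/FOO comes out as 3, matching the test quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed cost values (not quoted from the patch): a case-only change is
// cheaper than any other edit.
static const unsigned CASE_COST = 1;
static const unsigned BASE_COST = 2;

// True if A and B are the same letter apart from case.
static bool differ_only_in_case(char a, char b)
{
  return a != b
    && std::tolower(static_cast<unsigned char>(a))
       == std::tolower(static_cast<unsigned char>(b));
}

// Hypothetical simplified distance: insertions, deletions and ordinary
// substitutions cost BASE_COST, a pure case change costs CASE_COST.
unsigned sketch_edit_distance(const std::string& s, const std::string& t)
{
  std::vector<unsigned> prev(t.size() + 1), cur(t.size() + 1);

  for (std::size_t j = 0; j <= t.size(); ++j)
    prev[j] = static_cast<unsigned>(j) * BASE_COST;

  for (std::size_t i = 1; i <= s.size(); ++i)
    {
      cur[0] = static_cast<unsigned>(i) * BASE_COST;
      for (std::size_t j = 1; j <= t.size(); ++j)
        {
          unsigned subst = 0;
          if (differ_only_in_case(s[i - 1], t[j - 1]))
            subst = CASE_COST;
          else if (s[i - 1] != t[j - 1])
            subst = BASE_COST;

          unsigned del = prev[j] + BASE_COST;
          unsigned ins = cur[j - 1] + BASE_COST;
          unsigned rep = prev[j - 1] + subst;
          cur[j] = std::min(rep, std::min(del, ins));
        }
      std::swap(prev, cur);
    }

  return prev[t.size()];
}
```

Under these assumed costs, three case changes (3) rank ahead of two
ordinary substitutions (4), which is the preference the patch is after.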

Tom


[PATCH 3/6] rs6000, Add vector replace builtin support

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

The following patch adds support for builtins vec_replace_elt and
vec_replace_unaligned.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

---

gcc/ChangeLog

2020-05-30 Carl Love  

* config/rs6000/altivec.h: Add define for vec_replace_elt and
vec_replace_unaligned.
* config/rs6000/vsx.md: Add unspec UNSPEC_REPLACE_ELT and
UNSPEC_REPLACE_UN.
Add mode iterator REPLACE_ELT.
Add mode attributes REPLACE_ELT_atr, REPLACE_ELT_inst,
REPLACE_ELT_char, REPLACE_ELT_sh, REPLACE_ELT_max.
Add define_expand vreplace_elt_, mode REPLACE_ELT.
Add define_expand vreplace_un_, mode REPLACE_ELT.
Add define_insn vreplace_elt__inst, mode REPLACE_ELT.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_3): Add
VREPLACE_ELT_V4SI, VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF,
VREPLACE_ELT_UV2DI, VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI,
VREPLACE_UN_UV4SI, VREPLACE_UN_V4SF, VREPLACE_UN_V2DI,
VREPLACE_UN_UV2DI, VREPLACE_UN_V2DF.
(BU_FUTURE_OVERLOAD_3): Add REPLACE_ELT, REPLACE_UN.
* config/rs6000/rs6000-call.c: Add
FUTURE_BUILTIN_VEC_REPLACE_ELT,
FUTURE_BUILTIN_VEC_REPLACE_UN specifications.
(rs6000_expand_ternop_builtin): Add 3rd argument checks for
CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf,
CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf.
(builtin_function_type): Add case statements for
FUTURE_BUILTIN_VREPLACE_ELT_UV4SI,
FUTURE_BUILTIN_VREPLACE_ELT_UV2DI,
FUTURE_BUILTIN_VREPLACE_UN_UV4SI,
FUTURE_BUILTIN_VREPLACE_UN_UV2DI.
* doc/extend.texi: Add description for vec_replace_elt and
vec_replace_unaligned builtins.
* testsuite/gcc.target/powerpc/vec-replace-word.c: Add new
test.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/rs6000-builtin.def  |  16 +
 gcc/config/rs6000/rs6000-call.c   |  59 
 gcc/config/rs6000/vsx.md  |  61 
 gcc/doc/extend.texi   |  50 +++
 .../powerpc/vec-replace-word-runnable.c   | 288 ++
 6 files changed, 476 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 936aeb1ee09..435ffb8158f 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -701,6 +701,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_extracth(a, b, c)  __builtin_vec_extracth (a, b, c)
 #define vec_insertl(a, b, c)   __builtin_vec_insertl (a, b, c)
 #define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
+#define vec_replace_elt(a, b, c)   __builtin_vec_replace_elt (a, b, c)
+#define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index c5bd4f86555..91821f29a6f 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2643,6 +2643,20 @@ BU_FUTURE_V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, vinsertvr_v16qi)
 BU_FUTURE_V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi)
 BU_FUTURE_V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si)
 
+BU_FUTURE_V_3 (VREPLACE_ELT_V4SI, "vreplace_v4si", CONST, vreplace_elt_v4si)
+BU_FUTURE_V_3 (VREPLACE_ELT_UV4SI, "vreplace_uv4si", CONST, vreplace_elt_v4si)
+BU_FUTURE_V_3 (VREPLACE_ELT_V4SF, "vreplace_v4sf", CONST, vreplace_elt_v4sf)
+BU_FUTURE_V_3 (VREPLACE_ELT_V2DI, "vreplace_v2di", CONST, vreplace_elt_v2di)
+BU_FUTURE_V_3 (VREPLACE_ELT_UV2DI, "vreplace_uv2di", CONST, vreplace_elt_v2di)
+BU_FUTURE_V_3 (VREPLACE_ELT_V2DF, "vreplace_v2df", CONST, vreplace_elt_v2df)
+
+BU_FUTURE_V_3 (VREPLACE_UN_V4SI, "vreplace_un_v4si", CONST, vreplace_un_v4si)
+BU_FUTURE_V_3 (VREPLACE_UN_UV4SI, "vreplace_un_uv4si", CONST, vreplace_un_v4si)
+BU_FUTURE_V_3 (VREPLACE_UN_V4SF, "vreplace_un_v4sf", CONST, vreplace_un_v4sf)
+BU_FUTURE_V_3 (VREPLACE_UN_V2DI, "vreplace_un_v2di", CONST, vreplace_un_v2di)
+BU_FUTURE_V_3 (VREPLACE_UN_UV2DI, "vreplace_un_uv2di", CONST, vreplace_un_v2di)
+BU_FUTURE_V_3 (VREPLACE_UN_V2DF, "vreplace_un_v2df", CONST, vreplace_un_v2df)
+
 BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi)
 BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi)
 BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi)
@@ -2664,6 +2678,8 @@ BU_FUTURE_OVERLOAD_3 (EXTRACTL, "extractl")
 BU_FUTURE_OVERLOAD_3 (EXTRACTH, "extracth")
 BU_FUTURE_OVERLOAD_3 (INSERTL, "insertl")
 BU_FUTURE_OVERLOAD_3 (INSERTH, "inserth")
+BU_FUTURE_OVERLOAD_3 

[PATCH 1/6] rs6000, Update support for vec_extract

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

Move the existing vector extract support in altivec.md to vsx.md
so all of the vector insert and extract support is in the same file.

The patch also updates the name of the builtins and descriptions for the
builtins in the documentation file so they match the approved builtin
names and descriptions.

The patch does not make any functional changes.

Please let me know if the changes are acceptable for the mainline branch.  
Thanks.

  Carl Love

--

gcc/ChangeLog

2020-05-30  Carl Love  

* config/rs6000/altivec.md: Move UNSPEC_EXTRACTL, UNSPEC_EXTRACTR
declarations to gcc/config/rs6000/vsx.md.
(define_expand): Move vextractl and vextractr to
gcc/config/rs6000/vsx.md.
(define_insn): Move vextractl_internal and vextractr_internal
to gcc/config/rs6000/vsx.md.
* config/rs6000/vsx.md: Code moved from file config/rs6000/altivec.md.
* gcc/doc/extend.texi: Update documentation for vec_extractl.
Replace builtin name vec_extractr with vec_extracth.  Update description
of vec_extracth.
---
 gcc/config/rs6000/altivec.md | 64 ---
 gcc/config/rs6000/vsx.md | 66 
 gcc/doc/extend.texi  | 73 +---
 3 files changed, 101 insertions(+), 102 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 792ca4f488e..2fadb442eca 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -171,8 +171,6 @@
UNSPEC_XXEVAL
UNSPEC_VSTRIR
UNSPEC_VSTRIL
-   UNSPEC_EXTRACTL
-   UNSPEC_EXTRACTR
 ])
 
 (define_c_enum "unspecv"
@@ -183,8 +181,6 @@
UNSPECV_DSS
   ])
 
-;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
-(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
 ;; Short vec int modes
 (define_mode_iterator VIshort [V8HI V16QI])
 ;; Longer vec int modes for rotate/mask ops
@@ -785,66 +781,6 @@
   DONE;
 })
 
-(define_expand "vextractl"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-   (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
- (match_operand:VI2 2 "altivec_register_operand")
- (match_operand:SI 3 "register_operand")]
-UNSPEC_EXTRACTL))]
-  "TARGET_FUTURE"
-{
-  if (BYTES_BIG_ENDIAN)
-{
-  emit_insn (gen_vextractl_internal (operands[0], operands[1],
-  operands[2], operands[3]));
-  emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-}
-  else
-emit_insn (gen_vextractr_internal (operands[0], operands[2],
-operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractl_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-   (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
- (match_operand:VEC_I 2 "altivec_register_operand" "v")
- (match_operand:SI 3 "register_operand" "r")]
-UNSPEC_EXTRACTL))]
-  "TARGET_FUTURE"
-  "vextvlx %0,%1,%2,%3"
-  [(set_attr "type" "vecsimple")])
-
-(define_expand "vextractr"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-   (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
- (match_operand:VI2 2 "altivec_register_operand")
- (match_operand:SI 3 "register_operand")]
-UNSPEC_EXTRACTR))]
-  "TARGET_FUTURE"
-{
-  if (BYTES_BIG_ENDIAN)
-{
-  emit_insn (gen_vextractr_internal (operands[0], operands[1],
-  operands[2], operands[3]));
-  emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-}
-  else
-emit_insn (gen_vextractl_internal (operands[0], operands[2],
-operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractr_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-   (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
- (match_operand:VEC_I 2 "altivec_register_operand" "v")
- (match_operand:SI 3 "register_operand" "r")]
-UNSPEC_EXTRACTR))]
-  "TARGET_FUTURE"
-  "vextvrx %0,%1,%2,%3"
-  [(set_attr "type" "vecsimple")])
-
 (define_expand "vstrir_"
   [(set (match_operand:VIshort 0 "altivec_register_operand")
(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a28215ac5b..51ffe2d2000 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -344,8 +344,13 @@
UNSPEC_VSX_FIRST_MISMATCH_INDEX
UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
UNSPEC_XXGENPCV
+   UNSPEC_EXTRACTL
+   UNSPEC_EXTRACTR
   ])
 
+;; Like VI, defined in 

[PATCH 6/6] rs6000 Add vector blend, permute builtin support

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

The following patch adds support for the vec_blendv and vec_permx
builtins.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

The test cases were compiled on a Power 9 system and then tested on
Mambo.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

---
rs6000 RFC2609 vector blend, permute instructions

gcc/ChangeLog

2020-05-30  Carl Love  

* config/rs6000/altivec.h: Add define for vec_blendv and
vec_permx.
* config/rs6000/altivec.md: Add unspec UNSPEC_XXBLEND,
UNSPEC_XXPERMX.
New define_mode VM3.
New define_attr VM3_char.
New define_insn xxblend_ mode is VM3.
New define_expand xxpermx.
New define_insn xxpermx_inst.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_3): New
definitions
VXXBLEND_V16QI, VXXBLEND_V8HI, VXXBLEND_V4SI, VXXBLEND_V2DI,
VXXBLEND_V4SF, VXXBLEND_V2DF.
(BU_FUTURE_OVERLOAD_3): New definition XXBLEND.
(BU_FUTURE_OVERLOAD_4): New definition XXPERMX.
* config/rs6000/rs6000-c.c:
(altivec_resolve_overloaded_builtin):
Add if case support for FUTURE_BUILTIN_VXXPERMX.
* config/rs6000/rs6000-call.c: Define overloaded arguments for
FUTURE_BUILTIN_VXXBLEND_V16QI, FUTURE_BUILTIN_VXXBLEND_V8HI,
FUTURE_BUILTIN_VXXBLEND_V4SI, FUTURE_BUILTIN_VXXBLEND_V2DI,
FUTURE_BUILTIN_VXXBLEND_V4SF, FUTURE_BUILTIN_VXXBLEND_V2DF,
FUTURE_BUILTIN_VXXPERMX.
(rs6000_expand_quaternop_builtin): Add if case for
CODE_FOR_xxpermx.
(builtin_quaternary_function_type): Add v16uqi_type and
xxpermx_type
variables, case for FUTURE_BUILTIN_VXXPERMX.
(builtin_function_type): Add case for
FUTURE_BUILTIN_VXXBLEND_V16QI,
FUTURE_BUILTIN_VXXBLEND_V8HI, FUTURE_BUILTIN_VXXBLEND_V4SI,
FUTURE_BUILTIN_VXXBLEND_V2DI.
* doc/extend.texi: Add documentation for vec_blendv and
vec_permx.
* testsuite/gcc.target/powerpc/vec-blend-runnable.c: New test.
* testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c: New
test.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/altivec.md  |  82 +
 gcc/config/rs6000/rs6000-builtin.def  |  13 +
 gcc/config/rs6000/rs6000-c.c  |  25 +-
 gcc/config/rs6000/rs6000-call.c   |  94 ++
 gcc/doc/extend.texi   |  62 
 .../gcc.target/powerpc/vec-blend-runnable.c   | 276 
 .../powerpc/vec-permute-ext-runnable.c| 294 ++
 8 files changed, 843 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 9ed41b1cbf1..1b532effebe 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -708,6 +708,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_splati(a)  __builtin_vec_xxspltiw (a)
 #define vec_splatid(a) __builtin_vec_xxspltid (a)
 #define vec_splati_ins(a, b, c)__builtin_vec_xxsplti32dx (a,
b, c)
+#define vec_blendv(a, b, c)__builtin_vec_xxblend (a, b, c)
+#define vec_permx(a, b, c, d)  __builtin_vec_xxpermx (a, b, c, d)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/altivec.md
b/gcc/config/rs6000/altivec.md
index 47e8148029b..92c52eae78d 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -176,6 +176,8 @@
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTID
UNSPEC_XXSPLTI32DX
+   UNSPEC_XXBLEND
+   UNSPEC_XXPERMX
 ])
 
 (define_c_enum "unspecv"
@@ -218,6 +220,21 @@
   (KF "FLOAT128_VECTOR_P (KFmode)")
   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
+;; Like VM2, just do char, short, int, long, float and double
+(define_mode_iterator VM3 [V4SI
+   V8HI
+   V16QI
+   V4SF
+   V2DF
+   V2DI])
+
+(define_mode_attr VM3_char [(V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")
+   (V2DF  "d")
+   (V4SF  "w")])
+
 ;; Map the Vector convert single precision to double precision for
integer
 ;; versus floating point
 (define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
@@ -908,6 +925,71 @@
   "xxsplti32dx %x0,%1,%2"
[(set_attr "type" "vecsimple")])
 
+(define_insn "xxblend_"
+  [(set (match_operand:VM3 0 "register_operand" "=wa")
+   (unspec:VM3 [(match_operand:VM3 1 "register_operand" "wa")
+

[PATCH 5/6] rs6000, Add vector splat builtin support

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

The following patch adds support for the vec_splati, vec_splatid and
vec_splati_ins builtins.

Note that this patch adds support for instructions that take a 32-bit
immediate value representing a floating-point value.  It adds new
predicates and a support function to handle the immediate value
properly.
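
To illustrate what handling such an immediate involves, here is a
minimal sketch in portable C++ of reinterpreting a float constant as
its 32-bit pattern.  The name float_bits is hypothetical, and the sketch
assumes an IEEE-754 binary32 float; it is not the body of
rs6000_constF32toI32, which operates on RTL constants and is not shown
in this excerpt.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is the 32-bit IEEE-754 binary32 format.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch expects a 32-bit float");

// Hypothetical helper: return the bit pattern of F as a 32-bit integer.
// memcpy is used so the reinterpretation has defined behavior.
std::uint32_t float_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Usage: well-known binary32 encodings.
void float_bits_demo() {
  assert(float_bits(1.0f) == 0x3F800000u);
  assert(float_bits(-2.0f) == 0xC0000000u);
  assert(float_bits(0.0f) == 0u);
}
```

The resulting integer is the kind of value an instruction pattern can
then accept as an ordinary 32-bit immediate operand.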

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

The test case was compiled on a Power 9 system and then tested on
Mambo.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

gcc/ChangeLog

2020-05-30  Carl Love  

* config/rs6000/altivec.h: Add define for vec_splati,
vec_splatid
and vec_splati_ins.
* config/rs6000/vsx.md: Add UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID
and UNSPEC_XXSPLTI32DX.
(define_insn): Add vxxspltiw_v4si, vxxspltiw_v4sf_inst,
vxxspltidp_v2df_inst, vxxsplti32dx_v4si_inst, and
vxxsplti32dx_v4sf_inst.
(define_expand): vxxspltiw_v4sf, vxxspltidp_v2df,
vxxsplti32dx_v4si,
vxxsplti32dx_v4sf.
* config/rs6000/predicates.md: Add predicates u1bit_cint_operand,
s32bit_cint_operand, c32bit_cint_operand, and
f32bit_const_operand.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_1): Add
definitions
for VXXSPLTIW_V4SI, VXXSPLTIW_V4SF and VXXSPLTID.
(BU_FUTURE_V_3): Add definitions for VXXSPLTI32DX_V4SI and
VXXSPLTI32DX_V4SF.
(BU_FUTURE_OVERLOAD_1): Add definitions XXSPLTIW and XXSPLTID.
(BU_FUTURE_OVERLOAD_3): Add definition XXSPLTI32DX.
* config/rs6000/rs6000-call.c: Add overloaded definitions for
FUTURE_BUILTIN_VEC_XXSPLTIW, FUTURE_BUILTIN_VEC_XXSPLTID and
FUTURE_BUILTIN_VEC_XXSPLTI32DX.
* config/rs6000/rs6000-protos.h: Add prototype definition for
rs6000_constF32toI32.
(builtin_function_type): Add cases for
FUTURE_BUILTIN_VXXSPLTI32DX_V4SI
and FUTURE_BUILTIN_VXXSPLTI32DX_V4SF.
* config/rs6000/rs6000.c: Add function rs6000_constF32toI32.
* doc/extend.texi: Add documentation for vec_splati,
vec_splatid, and vec_splati_ins.
* testsuite/gcc.target/powerpc/vec-splati-runnable.c: New test.
---
 gcc/config/rs6000/altivec.h   |   3 +
 gcc/config/rs6000/altivec.md  | 109 +
 gcc/config/rs6000/predicates.md   |  20 +++
 gcc/config/rs6000/rs6000-builtin.def  |  13 ++
 gcc/config/rs6000/rs6000-call.c   |  19 +++
 gcc/config/rs6000/rs6000-protos.h |   1 +
 gcc/config/rs6000/rs6000.c|  16 ++
 gcc/doc/extend.texi   |  35 +
 .../gcc.target/powerpc/vec-splati-runnable.c  | 145 ++
 9 files changed, 361 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 0be68892aad..9ed41b1cbf1 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -705,6 +705,9 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b,
c)
 #define vec_sldb(a, b, c)  __builtin_vec_sldb (a, b, c)
 #define vec_srdb(a, b, c)  __builtin_vec_srdb (a, b, c)
+#define vec_splati(a)  __builtin_vec_xxspltiw (a)
+#define vec_splatid(a) __builtin_vec_xxspltid (a)
+#define vec_splati_ins(a, b, c)__builtin_vec_xxsplti32dx (a,
b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/altivec.md
b/gcc/config/rs6000/altivec.md
index de79ae22fd4..47e8148029b 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -173,6 +173,9 @@
UNSPEC_VSTRIL
UNSPEC_SLDB
UNSPEC_SRDB
+   UNSPEC_XXSPLTIW
+   UNSPEC_XXSPLTID
+   UNSPEC_XXSPLTI32DX
 ])
 
 (define_c_enum "unspecv"
@@ -799,6 +802,112 @@
   "vsdbi %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "vxxspltiw_v4si"
+  [(set (match_operand:V4SI 0 "register_operand" "=wa")
+   (unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")]
+UNSPEC_XXSPLTIW))]
+ "TARGET_FUTURE"
+ "xxspltiw %x0,%1"
+ [(set_attr "type" "vecsimple")])
+
+(define_expand "vxxspltiw_v4sf"
+  [(set (match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF [(match_operand:SF 1 "f32bit_const_operand" "n")]
+UNSPEC_XXSPLTIW))]
+ "TARGET_FUTURE"
+{
+  long long value = rs6000_constF32toI32 (operands[1]);
+  emit_insn (gen_vxxspltiw_v4sf_inst (operands[0], GEN_INT (value)));
+  DONE;
+})
+
+(define_insn "vxxspltiw_v4sf_inst"
+  [(set (match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF [(match_operand:SI 1 "c32bit_cint_operand" "n")]
+UNSPEC_XXSPLTIW))]
+ 

[PATCH 4/6] rs6000, Add vector shift double builtin support

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

The following patch adds support for the vector shift double builtins
for RFC2609.
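
As a rough scalar model of what a "shift double" computes, the sketch
below treats each vector as one 128-bit integer, concatenates two of
them, shifts by a bit count, and keeps one 128-bit half.  The type
U128, the function names, and the 0..7 shift range are assumptions made
for illustration; element ordering and endianness are ignored, and this
is not the code added by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value: hi holds the most significant 64 bits.
struct U128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Concatenate a:b (a most significant), shift left by sh bits and
// keep the most significant 128 bits of the 256-bit result.
U128 shift_left_double(U128 a, U128 b, unsigned sh) {
  assert(sh < 8);            // assumed range of the bit count
  if (sh == 0) return a;     // avoid an undefined 64-bit shift
  U128 r;
  r.hi = (a.hi << sh) | (a.lo >> (64 - sh));
  r.lo = (a.lo << sh) | (b.hi >> (64 - sh));
  return r;
}

// Concatenate a:b, shift right by sh bits and keep the least
// significant 128 bits of the 256-bit result.
U128 shift_right_double(U128 a, U128 b, unsigned sh) {
  assert(sh < 8);
  if (sh == 0) return b;
  U128 r;
  r.hi = (b.hi >> sh) | (a.lo << (64 - sh));
  r.lo = (b.lo >> sh) | (b.hi << (64 - sh));
  return r;
}

// Usage: one bit crosses the boundary between the two inputs.
void shift_double_demo() {
  U128 l = shift_left_double(U128{0, 1}, U128{0x8000000000000000ull, 0}, 1);
  assert(l.hi == 0 && l.lo == 3);
  U128 r = shift_right_double(U128{0, 1}, U128{0, 2}, 1);
  assert(r.hi == 0x8000000000000000ull && r.lo == 1);
}
```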

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and Mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

---

gcc/ChangeLog

2020-05-30  Carl Love  

* config/rs6000/altivec.h: Add define for vec_sldb and
vec_srdb.
* config/rs6000/altivec.md: Add unspec definitions UNSPEC_SLDB
and
UNSPEC_SRDB.
(define_int_attr): Add SLDB_LR attribute.
(define_int_iterator): Add VSHIFT_DBL_LR iterator.
(define_insn): Add vsdb_.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_3): Add
definitions
for VSLDB_V16QI, VSLDB_V8HI, VSLDB_V4SI, VSLDB_V2DI,
VSRDB_V16QI,
VSRDB_V8HI, VSRDB_V4SI and VSRDB_V2DI.
(BU_FUTURE_OVERLOAD_3): Add overload definitions for SLDB and
SRDB.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Add
entries for FUTURE_BUILTIN_VEC_SLDB and
FUTURE_BUILTIN_VEC_SRDB.
(rs6000_expand_ternop_builtin): Add else if clause for
CODE_FOR_vsldb_v16qi, CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si,
CODE_FOR_vsldb_v2di, CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi,
CODE_FOR_vsrdb_v4si, CODE_FOR_vsrdb_v2di.
* doc/extend.texi: Add description for vec_sldb and vec_srdb.
* testsuite/gcc.target/powerpc/vec-shift-double-runnable.c: Add
runnable test case.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/altivec.md  |  18 +
 gcc/config/rs6000/rs6000-builtin.def  |  11 +
 gcc/config/rs6000/rs6000-call.c   |  70 
 gcc/doc/extend.texi   |  53 +++
 .../powerpc/vec-shift-double-runnable.c   | 384 ++
 6 files changed, 538 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 435ffb8158f..0be68892aad 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -703,6 +703,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
 #define vec_replace_elt(a, b, c)   __builtin_vec_replace_elt (a,
b, c)
 #define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b,
c)
+#define vec_sldb(a, b, c)  __builtin_vec_sldb (a, b, c)
+#define vec_srdb(a, b, c)  __builtin_vec_srdb (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/altivec.md
b/gcc/config/rs6000/altivec.md
index 2fadb442eca..de79ae22fd4 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -171,6 +171,8 @@
UNSPEC_XXEVAL
UNSPEC_VSTRIR
UNSPEC_VSTRIL
+   UNSPEC_SLDB
+   UNSPEC_SRDB
 ])
 
 (define_c_enum "unspecv"
@@ -781,6 +783,22 @@
   DONE;
 })
 
+;; Map UNSPEC_SLDB to "l" and  UNSPEC_SRDB to "r".
+(define_int_attr SLDB_LR [(UNSPEC_SLDB "l")
+ (UNSPEC_SRDB "r")])
+
+(define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])
+
+(define_insn "vsdb_"
+ [(set (match_operand:VI2 0 "register_operand" "=v")
+  (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
+  (match_operand:VI2 2 "register_operand" "v")
+  (match_operand:QI 3 "const_0_to_12_operand" "n")]
+ VSHIFT_DBL_LR))]
+  "TARGET_FUTURE"
+  "vsdbi %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")])
+
 (define_expand "vstrir_"
   [(set (match_operand:VIshort 0 "altivec_register_operand")
(unspec:VIshort 

[PATCH 2/6] rs6000 Add vector insert builtin support

2020-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

This patch adds support for vec_insertl and vec_inserth builtins.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch.

Thanks.

 Carl Love

--
gcc/ChangeLog

2020-05-30  Carl Love  

* config/rs6000/altivec.h: Add define vec_insertl, vec_inserth.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_3): Add definition for
VINSERTGPRBL, VINSERTGPRHL, VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL,
VINSERTVPRHL, VINSERTVPRWL, VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR,
VINSERTGPRDR, VINSERTVPRBR, VINSERTVPRHR, VINSERTVPRWR.
(BU_FUTURE_OVERLOAD_3): Add definition for INSERTL, INSERTH.
* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_INSERTL): Add 
overloaded
argument declarations.
(FUTURE_BUILTIN_VEC_INSERTH): Add overloaded argument declarations.
(builtin_function_type): Add case entries for 
FUTURE_BUILTIN_VINSERTGPRBL,
FUTURE_BUILTIN_VINSERTGPRHL, FUTURE_BUILTIN_VINSERTGPRWL,
FUTURE_BUILTIN_VINSERTGPRDL, FUTURE_BUILTIN_VINSERTVPRBL,
FUTURE_BUILTIN_VINSERTVPRHL, FUTURE_BUILTIN_VINSERTVPRWL.
* config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL, 
UNSPEC_INSERTR.
(define_expand): Add vinsertvl_, vinsertvr_, 
vinsertgl_
vinsertgr_, mode is VI2.
(define_insn): vinsertvl_internal_, vinsertvr_internal_,
vinsertgl_internal_, vinsertgr_internal_, mode VEC_I.
* doc/extend.texi: Add documentation for vec_insertl, vec_inserth.
* gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c: New
test case.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/rs6000-builtin.def  |  18 +
 gcc/config/rs6000/rs6000-call.c   |  51 +++
 gcc/config/rs6000/vsx.md  | 110 ++
 gcc/doc/extend.texi   |  68 
 .../powerpc/vec-insert-word-runnable.c| 345 ++
 6 files changed, 594 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 0a7e8ab3647..936aeb1ee09 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -699,6 +699,8 @@ __altivec_scalar_pred(vec_any_nle,
 /* Overloaded built-in functions for future architecture.  */
 #define vec_extractl(a, b, c)  __builtin_vec_extractl (a, b, c)
 #define vec_extracth(a, b, c)  __builtin_vec_extracth (a, b, c)
+#define vec_insertl(a, b, c)   __builtin_vec_insertl (a, b, c)
+#define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 8b1ddb00045..c5bd4f86555 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2627,6 +2627,22 @@ BU_FUTURE_V_3 (VEXTRACTHR, "vextduhvhx", CONST, 
vextractrv8hi)
 BU_FUTURE_V_3 (VEXTRACTWR, "vextduwvhx", CONST, vextractrv4si)
 BU_FUTURE_V_3 (VEXTRACTDR, "vextddvhx", CONST, vextractrv2di)
 
+BU_FUTURE_V_3 (VINSERTGPRBL, "vinsgubvlx", CONST, vinsertgl_v16qi)
+BU_FUTURE_V_3 (VINSERTGPRHL, "vinsguhvlx", CONST, vinsertgl_v8hi)
+BU_FUTURE_V_3 (VINSERTGPRWL, "vinsguwvlx", CONST, vinsertgl_v4si)
+BU_FUTURE_V_3 (VINSERTGPRDL, "vinsgudvlx", CONST, vinsertgl_v2di)
+BU_FUTURE_V_3 (VINSERTVPRBL, "vinsvubvlx", CONST, vinsertvl_v16qi)
+BU_FUTURE_V_3 (VINSERTVPRHL, "vinsvuhvlx", CONST, vinsertvl_v8hi)
+BU_FUTURE_V_3 (VINSERTVPRWL, "vinsvuwvlx", CONST, vinsertvl_v4si)
+
+BU_FUTURE_V_3 (VINSERTGPRBR, "vinsgubvrx", CONST, vinsertgr_v16qi)
+BU_FUTURE_V_3 (VINSERTGPRHR, "vinsguhvrx", CONST, vinsertgr_v8hi)
+BU_FUTURE_V_3 (VINSERTGPRWR, "vinsguwvrx", CONST, vinsertgr_v4si)
+BU_FUTURE_V_3 (VINSERTGPRDR, "vinsgudvrx", CONST, vinsertgr_v2di)
+BU_FUTURE_V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, vinsertvr_v16qi)
+BU_FUTURE_V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi)
+BU_FUTURE_V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si)
+
 BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi)
 BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi)
 BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi)
@@ -2646,6 +2662,8 @@ BU_FUTURE_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm")
 
 BU_FUTURE_OVERLOAD_3 (EXTRACTL, "extractl")
 BU_FUTURE_OVERLOAD_3 (EXTRACTH, "extracth")
+BU_FUTURE_OVERLOAD_3 (INSERTL, "insertl")
+BU_FUTURE_OVERLOAD_3 (INSERTH, "inserth")
 
 BU_FUTURE_OVERLOAD_1 (VSTRIR, "strir")
 BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril")
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 0ac8054d030..a265e30d1d9 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ 

[PATCH 0/6] Permute Class Operations

2020-06-01 Thread Carl Love via Gcc-patches
GCC maintainers:

The following patch set adds builtins for the various Permute Class
Operations specified in IBM RFC 2609.

Based on previous IBM internal reviews of the patch set, the desire is
for all of the vector insert and extract support to be in vsx.md as
there is a longer term plan to re-work this support for PPC.

The first patch moves the existing extract support in altivec.md to
vsx.md.

Additionally, the documentation for the existing vector extract
builtins has been updated to match the latest documentation and builtin
names in the code. Specifically, the builtin name vec_extractr has been
changed to vec_extracth.  The description of the two builtins has been
changed to match the latest description of the builtins with a few
minor edits to address typos in the descriptions.

The subsequent patches add vector insert, vector replace, vector
shift, vector splat, and vector blend builtin support.

   Carl Love




[committed] libstdc++: Update/streamline Valgrind references

2020-06-01 Thread Gerald Pfeifer
Like many sites over the last year(s) valgrind.org has now moved to 
https.  While there, replace the second of two links in the same vicinity 
by a purely textual reference -- easier to maintain, and in particular
also better from a user experience perspective.

Gerald

* doc/xml/faq.xml: Adjust Valgrind reference and remove another.
* doc/html/faq.html: Regenerate.
---
 libstdc++-v3/doc/html/faq.html | 4 ++--
 libstdc++-v3/doc/xml/faq.xml   | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index 18407225d7a..967e5f5f348 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -700,7 +700,7 @@
 of a few dozen kilobytes on startup. This pool is used to ensure it's
 possible to throw exceptions (such as bad_alloc)
 even when malloc is unable to allocate any more 
memory.
-With some versions of http://valgrind.org/; 
target="_top">valgrind
+With some versions of https://valgrind.org; 
target="_top">valgrind
 this pool will be shown as "still reachable" when the process exits, e.g.
 still reachable: 72,704 bytes in 1 blocks.
 This memory is not a leak, because it's still in use by libstdc++,
@@ -710,7 +710,7 @@
 
 In the past, a few people reported that the standard containers appear
 to leak memory when tested with memory checkers such as
-http://valgrind.org/; target="_top">valgrind.
+valgrind.
 Under some (non-default) configurations the library's allocators keep
 free memory in a
 pool for later reuse, rather than deallocating it with delete
diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index cf8684e1cea..e419d3c22a0 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -993,7 +993,7 @@
 of a few dozen kilobytes on startup. This pool is used to ensure it's
 possible to throw exceptions (such as bad_alloc)
 even when malloc is unable to allocate any more memory.
-With some versions of http://www.w3.org/1999/xlink; 
xlink:href="http://valgrind.org/;>valgrind
+With some versions of http://www.w3.org/1999/xlink; 
xlink:href="https://valgrind.org;>valgrind
 this pool will be shown as "still reachable" when the process exits, e.g.
 still reachable: 72,704 bytes in 1 blocks.
 This memory is not a leak, because it's still in use by libstdc++,
@@ -1004,7 +1004,7 @@
 
 In the past, a few people reported that the standard containers appear
 to leak memory when tested with memory checkers such as
-http://www.w3.org/1999/xlink; 
xlink:href="http://valgrind.org/;>valgrind.
+valgrind.
 Under some (non-default) configurations the library's allocators keep
 free memory in a
 pool for later reuse, rather than deallocating it with delete
-- 
2.26.2


Re: [PATCH] coroutines: Allow parameter packs in co_await/yield expressions [PR95345]

2020-06-01 Thread Nathan Sidwell

On 6/1/20 4:09 AM, Iain Sandoe wrote:

Hi

This corrects a pasto, where I copied the constraint on bare
parameter packs from the co_return to co_yield/await without
properly reviewing it.
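
For readers unfamiliar with the distinction, a parameter pack is only
"bare" when it appears outside a pack expansion.  The sketch below is a
hypothetical example in plain C++17 without coroutines (the name
sum_of_squares is not from the patch); it uses the same shape as the new
test, an immediately invoked lambda applied to the pack inside a fold
expression, where the pack is expanded and therefore not bare.

```cpp
#include <cassert>

// Hypothetical example: the pack I is used only inside a fold
// expression, so every use of I is part of a pack expansion.
template <int... I>
int sum_of_squares() {
  // Each element of I is passed to an immediately invoked lambda;
  // the binary fold supplies 0 for an empty pack.
  return (0 + ... + [](int x) { return x * x; }(I));
}

// Usage: 1 + 4 + 9 for <1, 2, 3>, and 0 for the empty pack.
void fold_demo() {
  assert((sum_of_squares<1, 2, 3>() == 14));
  assert(sum_of_squares<>() == 0);
}
```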

tested on x86_64,powerpc64-linux, x86_64-darwin
OK for master?
OK for 10.2?


ok for both


thanks
Iain

gcc/cp/ChangeLog:

PR c++/95345
* coroutines.cc (finish_co_await_expr): Revise to allow for
parameter packs.
(finish_co_yield_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95345
* g++.dg/coroutines/pr95345.C: New test.
---
  gcc/cp/coroutines.cc  | 45 +++
  gcc/testsuite/g++.dg/coroutines/pr95345.C | 32 
  2 files changed, 53 insertions(+), 24 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95345.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index cc685ca73b2..7afa550037c 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -851,19 +851,18 @@ finish_co_await_expr (location_t kw, tree expr)
/* The current function has now become a coroutine, if it wasn't already.  
*/
DECL_COROUTINE_P (current_function_decl) = 1;
  
-  if (processing_template_decl)

-{
-  current_function_returns_value = 1;
-
-  if (check_for_bare_parameter_packs (expr))
-   return error_mark_node;
+  /* This function will appear to have no return statement, even if it
+ is declared to return non-void (most likely).  This is correct - we
+ synthesize the return for the ramp in the compiler.  So suppress any
+ extraneous warnings during substitution.  */
+  TREE_NO_WARNING (current_function_decl) = true;
  
-  /* If we don't know the promise type, we can't proceed.  */

-  tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
-  NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
-}
+  /* If we don't know the promise type, we can't proceed, build the
+ co_await with the expression unchanged.  */
+  tree functype = TREE_TYPE (current_function_decl);
+  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
+  NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
  
/* We must be able to look up the "await_transform" method in the scope of

   the promise type, and obtain its return type.  */
@@ -928,19 +927,17 @@ finish_co_yield_expr (location_t kw, tree expr)
/* The current function has now become a coroutine, if it wasn't already.  
*/
DECL_COROUTINE_P (current_function_decl) = 1;
  
-  if (processing_template_decl)

-{
-  current_function_returns_value = 1;
-
-  if (check_for_bare_parameter_packs (expr))
-   return error_mark_node;
+  /* This function will appear to have no return statement, even if it
+ is declared to return non-void (most likely).  This is correct - we
+ synthesize the return for the ramp in the compiler.  So suppress any
+ extraneous warnings during substitution.  */
+  TREE_NO_WARNING (current_function_decl) = true;
  
-  tree functype = TREE_TYPE (current_function_decl);

-  /* If we don't know the promise type, we can't proceed.  */
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr,
-  NULL_TREE);
-}
+  /* If we don't know the promise type, we can't proceed, build the
+ co_await with the expression unchanged.  */
+  tree functype = TREE_TYPE (current_function_decl);
+  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr, NULL_TREE);
  
if (!coro_promise_type_found_p (current_function_decl, kw))

  /* We must be able to look up the "yield_value" method in the scope of
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95345.C 
b/gcc/testsuite/g++.dg/coroutines/pr95345.C
new file mode 100644
index 000..90e946d91c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95345.C
@@ -0,0 +1,32 @@
+#if __has_include (<coroutine>)
+#include <coroutine>
+using namespace std;
+#elif defined (__clang__) && __has_include (<experimental/coroutine>)
+#include <experimental/coroutine>
+using namespace std::experimental;
+#endif
+
+struct dummy_coro
+{
+  using promise_type = dummy_coro;
+  bool await_ready() { return false; }
+  void await_suspend(std::coroutine_handle<>) { }
+  void await_resume() { }
+  dummy_coro get_return_object() { return {}; }
+  dummy_coro initial_suspend() { return {}; }
+  dummy_coro final_suspend() { return {}; }
+  void return_void() { }
+  void unhandled_exception() { }
+};
+
+template <int... I>
+dummy_coro
+foo()
+{
+ ((co_await [](int){ return std::suspend_never{}; }(I)), ...);
+  co_return;
+}
+
+void bar() {
+  foo<1>();
+}




--
Nathan 

Re: [PATCH] coroutines: Correct handling of references in parm copies [PR95350].

2020-06-01 Thread Nathan Sidwell

On 6/1/20 3:59 AM, Iain Sandoe wrote:

(resending, this didn’t appear to make it to the list)

Hi,

I had implemented a move out of rvalue refs for such ramp values (since
these are most likely to be dangling references).  However this does cause
a divergence with the clang implementation - and the patch fixes that.

tested on x86_64,powerpc64-linux, x86_64-darwin
OK for master?
OK for 10.2?


ok for both



Iain

---

Adjust to handle rvalue refs the same way as clang, and to correct
the handling of moves when a copy CTOR is present.  This is one area
where we could make things easier for the end-user (as was implemented
before this change), however there needs to be agreement about when the
full statement containing a coroutine call ends (i.e. when the ramp
terminates or when the coroutine terminates).
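
The lifetime concern can be pictured without coroutines at all.  The
following is a hypothetical miniature (MiniFrame and capture are
illustrative names, not GCC internals): a by-value parameter is copied
into the frame, while a reference parameter is saved as a pointer to
the caller's object, so it stays valid only as long as that object
lives.

```cpp
#include <cassert>

// Hypothetical miniature of a coroutine frame's parameter copies.
struct MiniFrame {
  const int *ref_parm;  // reference parameter: pointer to the original
  int val_parm;         // by-value parameter: copied into the frame
};

// Build the frame the way the description suggests: keep the address
// of the referenced object, copy the by-value argument.
MiniFrame capture(const int &r, int v) { return MiniFrame{&r, v}; }

// Usage: the reference tracks the caller's object, the copy does not.
void frame_demo() {
  int x = 1;
  MiniFrame f = capture(x, x);
  x = 5;
  assert(*f.ref_parm == 5);
  assert(f.val_parm == 1);
}
```

If the object behind ref_parm were a temporary that ended before the
frame was used again, dereferencing the pointer would be undefined
behavior, which is the dangling case the test adjustments refer to.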

gcc/cp/ChangeLog:

PR c++/95350
* coroutines.cc (struct param_info): Remove rv_ref field.
(build_actor_fn): Remove special rvalue ref handling.
(morph_fn_to_coro): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95350
* g++.dg/coroutines/torture/func-params-08.C: Adjust test to
reflect that all rvalue refs are dangling.
* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C:
Likewise.
* g++.dg/coroutines/pr95350.C: New test.
---
gcc/cp/coroutines.cc  | 41 +--
gcc/testsuite/g++.dg/coroutines/pr95350.C | 28 +
.../coroutines/torture/func-params-08.C   | 11 ++---
.../torture/func-params-09-awaitable-parms.C  | 11 ++---
4 files changed, 50 insertions(+), 41 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95350.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 969f4a66f2f..8746927577a 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1807,7 +1807,6 @@ struct param_info
   tree frame_type;   /* The type used to represent this parm in the frame.  */
   tree orig_type;/* The original type of the parm (not as passed).  */
   bool by_ref;   /* Was passed by reference.  */
-  bool rv_ref;   /* Was an rvalue reference.  */
   bool pt_ref;   /* Was a pointer to object.  */
   bool trivial_dtor; /* The frame type has a trivial DTOR.  */
   bool this_ptr; /* Is 'this' */
@@ -2077,12 +2076,6 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
  if (parm.pt_ref)
fld_idx = build1_loc (loc, CONVERT_EXPR, TREE_TYPE (arg), fld_idx);

- /* We expect an rvalue ref. here.  */
- if (parm.rv_ref)
-   fld_idx = convert_to_reference (DECL_ARG_TYPE (arg), fld_idx,
-   CONV_STATIC, LOOKUP_NORMAL,
-   NULL_TREE, tf_warning_or_error);
-
  int i;
  tree *puse;
  FOR_EACH_VEC_ELT (*parm.body_uses, i, puse)
@@ -3770,15 +3763,8 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
  if (actual_type == NULL_TREE)
actual_type = error_mark_node;
  parm.orig_type = actual_type;
- parm.by_ref = parm.rv_ref = parm.pt_ref = false;
- if (TREE_CODE (actual_type) == REFERENCE_TYPE
- && TYPE_REF_IS_RVALUE (DECL_ARG_TYPE (arg)))
-   {
- parm.rv_ref = true;
- actual_type = TREE_TYPE (actual_type);
- parm.frame_type = actual_type;
-   }
- else if (TREE_CODE (actual_type) == REFERENCE_TYPE)
+ parm.by_ref = parm.pt_ref = false;
+ if (TREE_CODE (actual_type) == REFERENCE_TYPE)
{
  /* If the user passes by reference, then we will save the
 pointer to the original.  As noted in
@@ -3786,16 +3772,12 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
 referenced item ends and then the coroutine is resumed,
 we have UB; well, the user asked for it.  */
  actual_type = build_pointer_type (TREE_TYPE (actual_type));
- parm.frame_type = actual_type;
  parm.pt_ref = true;
}
  else if (TYPE_REF_P (DECL_ARG_TYPE (arg)))
-   {
- parm.by_ref = true;
- parm.frame_type = actual_type;
-   }
- else
-   parm.frame_type = actual_type;
+   parm.by_ref = true;
+
+ parm.frame_type = actual_type;

  parm.this_ptr = is_this_parameter (arg);
  if (lambda_p)
@@ -4170,17 +4152,16 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
}
  else if (parm.by_ref)
vec_safe_push (promise_args, fld_idx);
- else if (parm.rv_ref)
-   vec_safe_push (promise_args, rvalue (fld_idx));
  else
vec_safe_push (promise_args, arg);

  if (TYPE_NEEDS_CONSTRUCTING (parm.frame_type))
{
  vec *p_in;
- if (parm.by_ref
-   

Re: [PATCH] coroutines: Wrap co_await in a target expr where needed [PR95050]

2020-06-01 Thread Nathan Sidwell

On 6/1/20 3:55 AM, Iain Sandoe wrote:

Hi,

Since the co_await expression is mostly opaque to the existing
machinery, we were hiding the details of the await_resume return
value.  If that needs to be wrapped in a target expression, then
emulate this with the whole co_await.  Similarly, if the await
expression we build in response to co_await p.yield_value (e)
is wrapped in a target expression, then we need to transfer that
wrapper to the resultant CO_YIELD_EXPR (which is, itself, just
a proxy for the underlying co_await).

tested on x86_64,powerpc64-linux, x86_64-darwin
OK for master?
OK for 10.2?
thanks
Iain


ok for both


gcc/cp/ChangeLog:

PR c++/95050
* coroutines.cc (build_co_await): Wrap the co_await expression
in a TARGET_EXPR, where needed.
(finish_co_yield_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95050
* g++.dg/coroutines/pr95050.C: New test.
---
  gcc/cp/coroutines.cc  | 29 +-
  gcc/testsuite/g++.dg/coroutines/pr95050.C | 49 +++
  2 files changed, 76 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95050.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 8746927577a..cc685ca73b2 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -816,6 +816,12 @@ build_co_await (location_t loc, tree a, suspend_point_kind 
suspend_kind)
tree awaiter_calls = make_tree_vec (3);
TREE_VEC_ELT (awaiter_calls, 0) = awrd_call; /* await_ready().  */
TREE_VEC_ELT (awaiter_calls, 1) = awsp_call; /* await_suspend().  */
+  tree te = NULL_TREE;
+  if (TREE_CODE (awrs_call) == TARGET_EXPR)
+{
+  te = awrs_call;
+  awrs_call = TREE_OPERAND (awrs_call, 1);
+}
TREE_VEC_ELT (awaiter_calls, 2) = awrs_call; /* await_resume().  */
  
tree await_expr = build5_loc (loc, CO_AWAIT_EXPR,

@@ -823,7 +829,13 @@ build_co_await (location_t loc, tree a, suspend_point_kind 
suspend_kind)
a, e_proxy, o, awaiter_calls,
build_int_cst (integer_type_node,
   (int) suspend_kind));
-  return convert_from_reference (await_expr);
+  if (te)
+{
+  TREE_OPERAND (te, 1) = await_expr;
+  await_expr = te;
+}
+  tree t = convert_from_reference (await_expr);
+  return t;
  }
  
  tree

@@ -960,8 +972,21 @@ finish_co_yield_expr (location_t kw, tree expr)
tree op = build_co_await (kw, yield_call, CO_YIELD_SUSPEND_POINT);
if (op != error_mark_node)
  {
-  op = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (op), expr, op);
+  if (REFERENCE_REF_P (op))
+   op = TREE_OPERAND (op, 0);
+  /* If the await expression is wrapped in a TARGET_EXPR, then transfer
+that wrapper to the CO_YIELD_EXPR, since this is just a proxy for
+its contained await.  Otherwise, just build the CO_YIELD_EXPR.  */
+  if (TREE_CODE (op) == TARGET_EXPR)
+   {
+ tree t = TREE_OPERAND (op, 1);
+ t = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (t), expr, t);
+ TREE_OPERAND (op, 1) = t;
+   }
+  else
+   op = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (op), expr, op);
TREE_SIDE_EFFECTS (op) = 1;
+  op = convert_from_reference (op);
  }
  
return op;

diff --git a/gcc/testsuite/g++.dg/coroutines/pr95050.C 
b/gcc/testsuite/g++.dg/coroutines/pr95050.C
new file mode 100644
index 000..fd1516d32f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95050.C
@@ -0,0 +1,49 @@
+#if __has_include (<coroutine>)
+#include <coroutine>
+using namespace std;
+#elif defined (__clang__) && __has_include (<experimental/coroutine>)
+#include <experimental/coroutine>
+using namespace std::experimental;
+#endif
+#include 
+
+struct ret_type
+{
+  ret_type () = default;
+  ret_type (const ret_type&) = delete;
+  //ret_type (ret_type&&) = default;
+  ~ret_type() {}
+};
+
+struct task
+{
+  struct promise_type
+  {
+auto get_return_object () -> task  { return {}; }
+auto initial_suspend () -> suspend_always { return {}; }
+auto final_suspend () -> suspend_always { return {}; }
+void return_void () {}
+void unhandled_exception () { }
+void thing (ret_type x) {}
+  };
+};
+
+struct awaiter
+{
+  bool await_ready() const { return true; }
+  void await_suspend (coroutine_handle<>) {}
+  ret_type await_resume() { return {}; }
+};
+
+task
+my_coro ()
+{
+  ret_type r2{co_await awaiter{}};
+  //ret_type r3 (std::move(r2));
+}
+
+int main()
+{
+ auto x = my_coro ();
+ return 0;
+}




--
Nathan Sidwell


Re: [PATCH] coroutines: Correct handling of references in parm copies [PR95350].

2020-06-01 Thread Nathan Sidwell

On 6/1/20 3:44 AM, Iain Sandoe wrote:


Hi,

I had implemented a move out of rvalue refs for such ramp values (since
these are most likely to be dangling references).  However this does cause
a divergence with the clang implementation - and the patch fixes that.



ok for both

tested on x86_64,powerpc64-linux, x86_64-darwin
OK for master?
OK for 10.2?
thanks
Iain

---

Adjust to handle rvalue refs the same way as clang, and to correct
the handling of moves when a copy CTOR is present.  This is one area
where we could make things easier for the end-user (as was implemented
before this change), however there needs to be agreement about when the
full statement containing a coroutine call ends (i.e. when the ramp
terminates or when the coroutine terminates).

gcc/cp/ChangeLog:

PR c++/95350
* coroutines.cc (struct param_info): Remove rv_ref field.
(build_actor_fn): Remove special rvalue ref handling.
(morph_fn_to_coro): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95350
* g++.dg/coroutines/torture/func-params-08.C: Adjust test to
reflect that all rvalue refs are dangling.
* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C:
Likewise.
* g++.dg/coroutines/pr95350.C: New test.
---
gcc/cp/coroutines.cc  | 41 +--
gcc/testsuite/g++.dg/coroutines/pr95350.C | 28 +
.../coroutines/torture/func-params-08.C   | 11 ++---
.../torture/func-params-09-awaitable-parms.C  | 11 ++---
4 files changed, 50 insertions(+), 41 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95350.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 969f4a66f2f..8746927577a 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1807,7 +1807,6 @@ struct param_info
   tree frame_type;   /* The type used to represent this parm in the frame.  */
   tree orig_type;    /* The original type of the parm (not as passed).  */
   bool by_ref;   /* Was passed by reference.  */
-  bool rv_ref;   /* Was an rvalue reference.  */
   bool pt_ref;   /* Was a pointer to object.  */
   bool trivial_dtor; /* The frame type has a trivial DTOR.  */
   bool this_ptr; /* Is 'this' */
@@ -2077,12 +2076,6 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,

  if (parm.pt_ref)
    fld_idx = build1_loc (loc, CONVERT_EXPR, TREE_TYPE (arg), fld_idx);

- /* We expect an rvalue ref. here.  */
- if (parm.rv_ref)
-   fld_idx = convert_to_reference (DECL_ARG_TYPE (arg), fld_idx,
-   CONV_STATIC, LOOKUP_NORMAL,
-   NULL_TREE, tf_warning_or_error);
-
  int i;
  tree *puse;
  FOR_EACH_VEC_ELT (*parm.body_uses, i, puse)
@@ -3770,15 +3763,8 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
  if (actual_type == NULL_TREE)
    actual_type = error_mark_node;
  parm.orig_type = actual_type;
- parm.by_ref = parm.rv_ref = parm.pt_ref = false;
- if (TREE_CODE (actual_type) == REFERENCE_TYPE
- && TYPE_REF_IS_RVALUE (DECL_ARG_TYPE (arg)))
-   {
- parm.rv_ref = true;
- actual_type = TREE_TYPE (actual_type);
- parm.frame_type = actual_type;
-   }
- else if (TREE_CODE (actual_type) == REFERENCE_TYPE)
+ parm.by_ref = parm.pt_ref = false;
+ if (TREE_CODE (actual_type) == REFERENCE_TYPE)
    {
  /* If the user passes by reference, then we will save the
pointer to the original.  As noted in
@@ -3786,16 +3772,12 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
referenced item ends and then the coroutine is resumed,
we have UB; well, the user asked for it.  */
  actual_type = build_pointer_type (TREE_TYPE (actual_type));
- parm.frame_type = actual_type;
  parm.pt_ref = true;
    }
  else if (TYPE_REF_P (DECL_ARG_TYPE (arg)))
-   {
- parm.by_ref = true;
- parm.frame_type = actual_type;
-   }
- else
-   parm.frame_type = actual_type;
+   parm.by_ref = true;
+
+ parm.frame_type = actual_type;

  parm.this_ptr = is_this_parameter (arg);
  if (lambda_p)
@@ -4170,17 +4152,16 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
    }
  else if (parm.by_ref)
    vec_safe_push (promise_args, fld_idx);
- else if (parm.rv_ref)
-   vec_safe_push (promise_args, rvalue (fld_idx));
  else
    vec_safe_push (promise_args, arg);

  if (TYPE_NEEDS_CONSTRUCTING (parm.frame_type))
    {
  vec *p_in;
- if (parm.by_ref
- && classtype_has_non_deleted_move_ctor (parm.frame_type)
- && !classtype_has_non_deleted_copy_ctor (parm.frame_type))
+ if (CLASS_TYPE_P (parm.frame_type)
+ && classtype_has_non_deleted_move_ctor (parm.frame_type))
+p_in = make_tree_vector_single (move (arg));
+ else if (lvalue_p (arg))
p_in = make_tree_vector_single (rvalue (arg));
  else
p_in = make_tree_vector_single (arg);
@@ -4193,9 +4174,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
    }
  else
    {
- if (parm.rv_ref)
-r = convert_from_reference (arg);
- else if (!same_type_p (parm.frame_type, DECL_ARG_TYPE (arg)))
+ if (!same_type_p (parm.frame_type, 

Cleanup global decl stream reference streaming, part 1

2020-06-01 Thread Jan Hubicka
Hi,
this patch further simplifies the way we refer to the global stream.  Every function
section has a vector of references to global trees, which is populated during
streaming.  This vector is for some reason divided into field_decls, fn_decls,
type_decls, types, namespace_decls, labels_decls and var_decls, which also
contains other things.

There is no benefit to this split except perhaps for making the indexes
a bit smaller and possibly better encodable by ulebs.  This however does not
pay off and makes things unnecessarily complex.
We may want to re-add multiple tables if we start streaming something other than
trees into the global stream, but that would not work with the current
infrastructure anyway.

The patch drops the different streams.  I checked that it results in a reduction of
the global stream and apparently a very small increase in function streams, but that may
be just because I updated the tree in between the tests.  This will be fixed by an
incremental patch.

[WPA] Compression: 86220483 input bytes, 217762146 uncompressed bytes (ratio: 2.525643)
[WPA] Compression: 111735464 input bytes, 297410918 uncompressed bytes (ratio: 2.661741)
[WPA] Size of mmap'd section decls: 86220483 bytes
[WPA] Size of mmap'd section function_body: 14353447 bytes

to:

[WPA] Compression: 85754594 input bytes, 216006049 uncompressed bytes (ratio: 2.518886)
[WPA] Compression: 111370381 input bytes, 295746052 uncompressed bytes (ratio: 2.655518)
[WPA] Size of mmap'd section decls: 85754594 bytes
[WPA] Size of mmap'd section function_body: 14447946 bytes

The patch also removes some of the ugly macro generators of accessor functions and
makes it easier to further optimize the way we stream references to trees, which
I plan to do incrementally.

I also made the API for streaming references symmetric.  I.e. you
stream out by
  lto_output_var_decl_ref
and stream in by
  lto_input_var_decl_ref

instead of streaming out by
  lto_output_var_decl_index
and streaming in by
  decl_index = streamer_read_uhwi (ib);
  lto_file_decl_data_get_fn_decl (file_data, decl_index);
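
To illustrate the symmetry, here is a minimal stand-alone sketch in plain C.  It is
not the LTO streamer code: `ref_table`, `stream`, `output_ref` and `input_ref` are
hypothetical names made up for this example.  The only point is that with a single
table the writer emits one index through one call and the reader resolves it
through the matching call, without the caller ever touching the raw index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REFS 16

/* Hypothetical stand-in for the single per-function table of
   references to global trees.  */
struct ref_table
{
  const char *refs[MAX_REFS];
  size_t n;
};

/* Hypothetical stand-in for a block of streamed indexes.  */
struct stream
{
  unsigned buf[MAX_REFS];
  size_t len;	/* Number of indexes written so far.  */
  size_t pos;	/* Current read position.  */
};

/* Writer side: look NAME up in TABLE, adding it if it is not there yet,
   and stream out its index.  */
static void
output_ref (struct stream *s, struct ref_table *table, const char *name)
{
  size_t i;
  for (i = 0; i < table->n; i++)
    if (strcmp (table->refs[i], name) == 0)
      break;
  if (i == table->n)
    {
      assert (table->n < MAX_REFS);
      table->refs[table->n++] = name;
    }
  assert (s->len < MAX_REFS);
  s->buf[s->len++] = (unsigned) i;
}

/* Reader side: read one index and resolve it through the same table.  */
static const char *
input_ref (struct stream *s, const struct ref_table *table)
{
  assert (s->pos < s->len);
  unsigned idx = s->buf[s->pos++];
  assert (idx < table->n);
  return table->refs[idx];
}
```

With the old shape the reader had to read the index itself and then pick the
right per-kind accessor; in the sketch that knowledge lives in `input_ref` alone.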

lto-bootstrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

2020-06-01  Jan Hubicka  

* ipa-reference.c (stream_out_bitmap): Use lto_output_var_decl_ref.
(ipa_reference_read_optimization_summary): Use lto_input_var_decl_ref.
* lto-cgraph.c (lto_output_node): Likewise.
(lto_output_varpool_node): Likewise.
(output_offload_tables): Likewise.
(input_node): Likewise.
(input_varpool_node): Likewise.
(input_offload_tables): Likewise.
* lto-streamer-in.c (lto_input_tree_ref): Declare.
(lto_input_var_decl_ref): Declare.
(lto_input_fn_decl_ref): Declare.
* lto-streamer-out.c (lto_indexable_tree_ref): Use only one decl stream.
(lto_output_var_decl_index): Rename to ..
(lto_output_var_decl_ref): ... this.
(lto_output_fn_decl_index): Rename to ...
(lto_output_fn_decl_ref): ... this.
* lto-streamer.h (enum lto_decl_stream_e_t): Remove per-type streams.
(DEFINE_DECL_STREAM_FUNCS): Remove.
(lto_output_var_decl_index): Remove.
(lto_output_fn_decl_index): Remove.
(lto_output_var_decl_ref): Declare.
(lto_output_fn_decl_ref): Declare.
(lto_input_var_decl_ref): Declare.
(lto_input_fn_decl_ref): Declare.

diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 6ab0505c3fd..fc7e4312420 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -249,38 +249,12 @@ enum lto_section_type
 /* Indices to the various function, type and symbol streams. */
 enum lto_decl_stream_e_t
 {
-  LTO_DECL_STREAM_TYPE = 0,/* Must be first. */
-  LTO_DECL_STREAM_FIELD_DECL,
-  LTO_DECL_STREAM_FN_DECL,
-  LTO_DECL_STREAM_VAR_DECL,
-  LTO_DECL_STREAM_TYPE_DECL,
-  LTO_DECL_STREAM_NAMESPACE_DECL,
-  LTO_DECL_STREAM_LABEL_DECL,
+  LTO_DECL_STREAM = 0, /* Must be first.  */
   LTO_N_DECL_STREAMS
 };
 
 typedef enum ld_plugin_symbol_resolution ld_plugin_symbol_resolution_t;
 
-
-/* Macro to define convenience functions for type and decl streams
-   in lto_file_decl_data.  */
-#define DEFINE_DECL_STREAM_FUNCS(UPPER_NAME, name) \
-static inline tree \
-lto_file_decl_data_get_ ## name (struct lto_file_decl_data *data, \
-unsigned int idx) \
-{ \
-  struct lto_in_decl_state *state = data->current_decl_state; \
-   return (*state->streams[LTO_DECL_STREAM_## UPPER_NAME])[idx]; \
-} \
-\
-static inline unsigned int \
-lto_file_decl_data_num_ ## name ## s (struct lto_file_decl_data *data) \
-{ \
-  struct lto_in_decl_state *state = data->current_decl_state; \
-  return vec_safe_length (state->streams[LTO_DECL_STREAM_## UPPER_NAME]); \
-}
-
-
 /* Return a char pointer to the start of a data stream for an lto pass
or function.  The first parameter is the file data that contains
the information.  The second parameter is the type of information
@@ -908,10 +882,12 @@ extern 

Re: [PATCH] More c++ math reject macros

2020-06-01 Thread Douglas B Rupp

Apologies!  Will do so in the future.

Doug

On 6/1/20 6:15 AM, Jonathan Wakely wrote:

Please CC libstd...@gcc.gnu.org on all libstdc++ patches, even if the
approval is coming from a target port maintainer, not from a libstdc++
maintainer.

Thanks.



Discussion about the medium code model in aarch64

2020-06-01 Thread bule
Hi,

I reported a PR in GCC Bugzilla about the medium code model on aarch64. A 
solution has been proposed and some discussion has been posted.

The details of the discussion can be found here : 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285

Wilco suggested that I make a PIC 48-bit code model by adding a new relocation 
type "high32_47" combined with the ADRP instruction, which I think is feasible and 
more efficient than my solution. But this kind of relocation hasn't been 
defined in Arm's ABI. He also doubts the necessity of the medium or 
large-pic code model.

My solution, on the other hand, only uses the existing relocation types 
R__MOVW_PREL_G0-3, which is also how LLVM solves similar problems. Although 
it is less efficient, it is currently easier to implement. As for the necessity 
concern, I need to optimize CESM in my work, and for that I need 
this kind of large-pic code model. The abstracted test case is also provided in 
the bug report.
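
To make the trade-off a bit more concrete, here is a small sketch in plain C.  It
assumes that the four MOVW_PREL_G0-3 groups each carry 16 bits of the
PC-relative offset (bits 15:0, 31:16, 47:32 and 63:48) and that ADRP reaches
roughly +/-4GiB.  The helper names `movw_group`, `rebuild_offset` and
`fits_adrp_range` are made up for the example; this is not linker or GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Return 16-bit group G (0..3) of a PC-relative OFFSET, i.e. bits
   [16*G+15 : 16*G].  */
static uint16_t
movw_group (int64_t offset, unsigned g)
{
  return (uint16_t) (((uint64_t) offset >> (16 * g)) & 0xffff);
}

/* Rebuild the offset from its four groups, the way a sequence of
   move-wide instructions would fill a register 16 bits at a time.  */
static int64_t
rebuild_offset (const uint16_t group[4])
{
  uint64_t v = 0;
  for (unsigned g = 0; g < 4; g++)
    v |= (uint64_t) group[g] << (16 * g);
  return (int64_t) v;
}

/* Nonzero if OFFSET lies within the assumed +/-4GiB reach of a single
   ADRP, so that no extra high part is needed.  */
static int
fits_adrp_range (int64_t offset)
{
  return offset >= -((int64_t) 1 << 32) && offset < ((int64_t) 1 << 32);
}
```

An offset that fails `fits_adrp_range` is the case both proposals are about: the
move-wide approach pays for up to four instructions, while the "high32_47" idea
would only add the missing bits 32 to 47 on top of ADRP.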

I would very much like to know your opinion on this issue.

Which solution do you think is more appropriate for the current situation? 

Regarding the necessity question, I admit it is not a critical issue. But 
some applications in the HPC field do need this code model. 
Personally, I think it doesn't hurt for us to upstream a prototype first for 
customers to use. Later, if Arm publishes an official document regarding this code 
model, we can then implement the standard model.
What's your opinion regarding this necessity question?

Thanks a lot.

Regards,
Bu Le (Bruce)




Re: [PATCH] Add missing store in emission of asan_stack_free.

2020-06-01 Thread Martin Liška

On 6/1/20 2:52 PM, Jakub Jelinek wrote:

On Mon, Jun 01, 2020 at 02:28:51PM +0200, Martin Liška wrote:

--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1598,8 +1598,24 @@ asan_emit_stack_protection (rtx base, rtx pbase, 
unsigned int alignb,
if (use_after_return_class < 5
  && can_store_by_pieces (sz, builtin_memset_read_str, ,
  BITS_PER_UNIT, true))
-   store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
-BITS_PER_UNIT, true, RETURN_BEGIN);
+   {
+ /* Emit:
+  memset(ShadowBase, kAsanStackAfterReturnMagic, ShadowSize);
+  **SavedFlagPtr(FakeStack) = 0
+ */
+ store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
+  BITS_PER_UNIT, true, RETURN_BEGIN);
+
+ unsigned HOST_WIDE_INT offset
+   = (1 << (use_after_return_class + 6));
+ offset -= GET_MODE_SIZE (ptr_mode);
+ mem = adjust_address (mem, Pmode, offset);
+ mem = gen_rtx_MEM (ptr_mode, mem);
+ rtx tmp_reg = gen_reg_rtx (Pmode);
+ emit_move_insn (tmp_reg, mem);
+ mem = adjust_address (mem, QImode, 0);
+ emit_move_insn (mem, const0_rtx);


This doesn't look correct to me.
I'd think the first adjust_address should be
  mem = adjust_address (mem, ptr_mode, offset);
which will give you a MEM with ptr_mode which has SavedFlagPtr(FakeStack)
address, i.e. *SavedFlagPtr(FakeStack).
Next, you want to load that into some temporary, so e.g.
  rtx addr = gen_reg_rtx (ptr_mode);
  emit_move_insn (addr, mem);
next you need to convert that ptr_mode to Pmode if needed, so something like
  addr = convert_memory_address (Pmode, addr);
and finally:
  mem = gen_rtx_MEM (QImode, addr);
  emit_move_insn (mem, const0_rtx);
Completely untested.


This is not correct. With your suggestion I have:

int foo(int index)
{
  int a[100];
  return a[index];
}

$ diff -u before.s after.s
--- before.s    2020-06-01 15:15:22.634337654 +0200
+++ after.s     2020-06-01 15:16:32.205711511 +0200
@@ -81,8 +81,7 @@
        movq    %rdi, 2147450920(%rax)
        movq    %rsi, 2147450928(%rax)
        movq    %rdi, 2147450936(%rax)
-       movq    504(%rbx), %rax
-       movb    $0, (%rax)
+       movb    $0, 504(%rbx)
        jmp     .L3
 .L2:
        movq    $0, 2147450880(%rax)

One level of dereference is missing. Looking at clang:

        movq    %rsi, 2147450928(%rax)
        movq    %rdi, 2147450936(%rax)
        movq    504(%rbx), %rax
        movb    $0, (%rax)
        jmp     .L3
.L2:

It does the same as my patch.
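
For reference, the difference between the two sequences can be modelled in plain
C.  This is only a sketch under the assumption that the last pointer-sized slot
of a fake frame holds the address of a one-byte flag, which is how I read the
`**SavedFlagPtr(FakeStack) = 0` comment in the patch.  All function names below
are made up for the example.  For class 3 on a 64-bit target the slot offset is
512 - 8 = 504, the same 504 that shows up in the listings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size in bytes of a fake-stack frame of class CLASS_ID, mirroring the
   (1 << (use_after_return_class + 6)) expression in the patch.  */
static size_t
fake_frame_size (unsigned class_id)
{
  return (size_t) 1 << (class_id + 6);
}

/* Offset of the last pointer-sized slot in the frame.  */
static size_t
saved_flag_ptr_offset (unsigned class_id)
{
  return fake_frame_size (class_id) - sizeof (uint8_t *);
}

/* Assumed allocation-time setup: remember where the flag byte lives.  */
static void
save_flag_ptr (unsigned char *frame, unsigned class_id, uint8_t *flag)
{
  memcpy (frame + saved_flag_ptr_offset (class_id), &flag, sizeof flag);
}

/* Two levels: load the saved pointer, then store 0 through it.  */
static void
release_frame (unsigned char *frame, unsigned class_id)
{
  uint8_t *flag;
  memcpy (&flag, frame + saved_flag_ptr_offset (class_id), sizeof flag);
  *flag = 0;
}

/* One level short: the store hits the slot itself and the flag byte
   is never cleared.  */
static void
release_frame_one_level_short (unsigned char *frame, unsigned class_id)
{
  frame[saved_flag_ptr_offset (class_id)] = 0;
}
```

`release_frame` corresponds to the movq/movb pair, while
`release_frame_one_level_short` corresponds to the single `movb $0, 504(%rbx)`.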
Martin



Jakub





Re: [PATCH] More c++ math reject macros

2020-06-01 Thread Jonathan Wakely via Gcc-patches
Please CC libstd...@gcc.gnu.org on all libstdc++ patches, even if the
approval is coming from a target port maintainer, not from a libstdc++
maintainer.

Thanks.




Re: [PATCH] Add missing store in emission of asan_stack_free.

2020-06-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Jun 01, 2020 at 02:28:51PM +0200, Martin Liška wrote:
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -1598,8 +1598,24 @@ asan_emit_stack_protection (rtx base, rtx pbase, 
> unsigned int alignb,
>if (use_after_return_class < 5
> && can_store_by_pieces (sz, builtin_memset_read_str, ,
> BITS_PER_UNIT, true))
> - store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
> -  BITS_PER_UNIT, true, RETURN_BEGIN);
> + {
> +   /* Emit:
> +memset(ShadowBase, kAsanStackAfterReturnMagic, ShadowSize);
> +**SavedFlagPtr(FakeStack) = 0
> +   */
> +   store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
> +BITS_PER_UNIT, true, RETURN_BEGIN);
> +
> +   unsigned HOST_WIDE_INT offset
> + = (1 << (use_after_return_class + 6));
> +   offset -= GET_MODE_SIZE (ptr_mode);
> +   mem = adjust_address (mem, Pmode, offset);
> +   mem = gen_rtx_MEM (ptr_mode, mem);
> +   rtx tmp_reg = gen_reg_rtx (Pmode);
> +   emit_move_insn (tmp_reg, mem);
> +   mem = adjust_address (mem, QImode, 0);
> +   emit_move_insn (mem, const0_rtx);

This doesn't look correct to me.
I'd think the first adjust_address should be
  mem = adjust_address (mem, ptr_mode, offset);
which will give you a MEM with ptr_mode which has SavedFlagPtr(FakeStack)
address, i.e. *SavedFlagPtr(FakeStack).
Next, you want to load that into some temporary, so e.g.
  rtx addr = gen_reg_rtx (ptr_mode);
  emit_move_insn (addr, mem);
next you need to convert that ptr_mode to Pmode if needed, so something like
  addr = convert_memory_address (Pmode, addr);
and finally:
  mem = gen_rtx_MEM (QImode, addr);
  emit_move_insn (mem, const0_rtx);
Completely untested.

Jakub



Re: [PATCH] Add missing store in emission of asan_stack_free.

2020-06-01 Thread Martin Liška

On 5/20/20 1:03 PM, Franz Sirl wrote:

Am 2020-05-19 um 21:05 schrieb Martin Liška:

Hi.

We emit the code for asan_emit_stack_protection directly for smaller stacks.
That's fine, but we're missing the piece that marks the stack as released,
so we run out of pre-allocated stacks. I also included some stack-related
constants that were used in asan.c.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-05-19  Martin Liska  

 PR sanitizer/94910
 * asan.c (asan_emit_stack_protection): Emit
 also **SavedFlagPtr(FakeStack) = 0 in order to release
 a stack frame.
 * asan.h (ASAN_MIN_STACK_FRAME_SIZE_LOG): New.
 (ASAN_MAX_STACK_FRAME_SIZE_LOG): Likewise.
 (ASAN_MIN_STACK_FRAME_SIZE): Likewise.
 (ASAN_MAX_STACK_FRAME_SIZE): Likewise.
---
  gcc/asan.c | 26 ++
  gcc/asan.h |  8 
  2 files changed, 30 insertions(+), 4 deletions(-)




 >-  if (asan_frame_size > 32 && asan_frame_size <= 65536 && pbase
 >+  if (asan_frame_size >= ASAN_MIN_STACK_FRAME_SIZE

Hi,

is the change from > to >= and from 32 to 64 for ASAN_MIN_STACK_FRAME_SIZE 
intentional? Just asking because it doesn't look obvious from Changelog or patch.
Also a few lines below the "5" in

   use_after_return_class = floor_log2 (asan_frame_size - 1) - 5;

looks like it may be related to ASAN_MIN_STACK_FRAME_SIZE_LOG.


Hello.

Thank you very much for the useful feedback. I did get the refactoring
wrong.

I suggest changing only the emission in asan_emit_stack_protection.
Tested locally with the asan.exp file.

Ready for master?
Thanks,
Martin



regards,
Franz


>From 5d0c64b2f4028af3ed575934ecc0c3378cca3de1 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 19 May 2020 16:57:56 +0200
Subject: [PATCH] Add missing store in emission of asan_stack_free.

gcc/ChangeLog:

2020-05-19  Martin Liska  

	PR sanitizer/94910
	* asan.c (asan_emit_stack_protection): Emit
	also **SavedFlagPtr(FakeStack) = 0 in order to release
	a stack frame.
---
 gcc/asan.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index c9872f1b007..e8d2a25ff79 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1598,8 +1598,24 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (use_after_return_class < 5
 	  && can_store_by_pieces (sz, builtin_memset_read_str, ,
   BITS_PER_UNIT, true))
-	store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
-			 BITS_PER_UNIT, true, RETURN_BEGIN);
+	{
+	  /* Emit:
+	   memset(ShadowBase, kAsanStackAfterReturnMagic, ShadowSize);
+	   **SavedFlagPtr(FakeStack) = 0
+	  */
+	  store_by_pieces (shadow_mem, sz, builtin_memset_read_str, ,
+			   BITS_PER_UNIT, true, RETURN_BEGIN);
+
+	  unsigned HOST_WIDE_INT offset
+	= (1 << (use_after_return_class + 6));
+	  offset -= GET_MODE_SIZE (ptr_mode);
+	  mem = adjust_address (mem, Pmode, offset);
+	  mem = gen_rtx_MEM (ptr_mode, mem);
+	  rtx tmp_reg = gen_reg_rtx (Pmode);
+	  emit_move_insn (tmp_reg, mem);
+	  mem = adjust_address (mem, QImode, 0);
+	  emit_move_insn (mem, const0_rtx);
+	}
   else if (use_after_return_class >= 5
 	   || !set_storage_via_setmem (shadow_mem,
 	   GEN_INT (sz),
-- 
2.26.2



Re: [PATCH 2/2] x86: Add cmpmemsi for -minline-all-stringops

2020-06-01 Thread H.J. Lu via Gcc-patches
On Mon, Jun 1, 2020 at 3:19 AM Alexander Monakov  wrote:
>
> On Sun, 31 May 2020, H.J. Lu via Gcc-patches wrote:
>
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -7656,6 +7656,90 @@ ix86_expand_set_or_cpymem (rtx dst, rtx src, rtx 
> > count_exp, rtx val_exp,
> >return true;
> >  }
> >
> > +/* Expand cmpstrn or memcmp.  */
> > +
> > +bool
> > +ix86_expand_cmpstrn_or_cmpmem (rtx result, rtx src1, rtx src2,
> > +rtx length, rtx align, bool is_cmpstrn)
> > +{
> > +  if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
> > +return false;
> > +
> > +  /* Can't use this if the user has appropriated ecx, esi or edi.  */
> > +  if (fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])
> > +return false;
> > +
> > +  if (is_cmpstrn)
> > +{
> > +  /* For strncmp, length is the maximum length, which can be larger
> > +  than actual string lengths.  We can expand the cmpstrn pattern
> > +  to "repz cmpsb" only if one of the strings is a constant so
> > +  that expand_builtin_strncmp() can write the length argument to
> > +  be the minimum of the const string length and the actual length
> > +  argument.  Otherwise, "repz cmpsb" may pass the 0 byte.  */
> > +  tree t1 = MEM_EXPR (src1);
> > +  tree t2 = MEM_EXPR (src2);
> > +  if (!((t1 && TREE_CODE (t1) == MEM_REF
> > +  && TREE_CODE (TREE_OPERAND (t1, 0)) == ADDR_EXPR
> > +  && (TREE_CODE (TREE_OPERAND (TREE_OPERAND (t1, 0), 0))
> > +  == STRING_CST))
> > + || (t2 && TREE_CODE (t2) == MEM_REF
> > + && TREE_CODE (TREE_OPERAND (t2, 0)) == ADDR_EXPR
> > + && (TREE_CODE (TREE_OPERAND (TREE_OPERAND (t2, 0), 0))
> > + == STRING_CST))))
> > + return false;
> > +}
> > +  else
> > +{
> > +  /* Expand memcmp to "repz cmpsb" only for -minline-all-stringops
> > +  since "repz cmpsb" can be much slower than memcmp function
> > +  implemented with vector instructions, see
> > +
> > +  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > +   */
> > +  if (!TARGET_INLINE_ALL_STRINGOPS)
> > + return false;
> > +}
>
> This check seems to be misplaced, "rep cmps" is slower than either memcmp or
> strcmp. The test for TARGET_INLINE_ALL_STRINGOPS should happen regardless of
> is_cmpstrn, so it should go earlier in the function.
>

My patch doesn't change strncmp at all.   I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95458
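
For anyone following along, the concern in the quoted comment about
"repz cmpsb" passing the 0 byte can be shown with a small plain C model.  This is
only a sketch: `cmp_bytes` is a made-up stand-in for the instruction's behaviour
and `clamp_length` is a made-up stand-in for the length adjustment the comment
attributes to expand_builtin_strncmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of "repz cmpsb": compare up to N bytes and stop only at the
   first mismatch, never at a NUL.  Returns <0, 0 or >0.  */
static int
cmp_bytes (const char *a, const char *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return (unsigned char) a[i] < (unsigned char) b[i] ? -1 : 1;
  return 0;
}

/* Clamp the strncmp length N to the length of the constant string
   CONST_STR including its terminating NUL.  */
static size_t
clamp_length (size_t n, const char *const_str)
{
  size_t limit = strlen (const_str) + 1;
  return n < limit ? n : limit;
}
```

With two buffers that agree up to the terminator and differ after it, strncmp
reports equality while the unclamped byte compare does not; once the length is
clamped to the constant string, the two agree again.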


-- 
H.J.


Re: [PATCH 1/2] Provide diagnostic hints for missing C inttypes.h string constants.

2020-06-01 Thread Mark Wielaard
On Sun, May 24, 2020 at 02:30:13AM +0200, Mark Wielaard wrote:
> This adds a flag to c_parser so we know when we were trying to
> construct a string literal. If there is a parse error and we were
> constructing a string literal, and the next token is an unknown
> identifier name, and we know there is a standard header that defines
> that name as a string literal, then add a missing header hint to
> the error message.
> 
> The list of macro names is also used when providing a hint for
> missing identifiers.

Ping. Note the followup patch that introduces the same functionality
for the C++ parser was already approved. This patch (as attached) only
needs review/approval from a C-frontend maintainer for some of the
gcc/c/c-parser.c bits.

Thanks,

Mark
>From 1aceca275a73b4c7991a6fbde45f4d6da1a9daf5 Mon Sep 17 00:00:00 2001
From: Mark Wielaard 
Date: Fri, 22 May 2020 01:10:50 +0200
Subject: [PATCH] Provide diagnostic hints for missing C inttypes.h string
 constants.

This adds a flag to c_parser so we know when we were trying to
construct a string literal. If there is a parse error and we were
constructing a string literal, and the next token is an unknown
identifier name, and we know there is a standard header that defines
that name as a string literal, then add a missing header hint to
the error message.

The list of macro names is also used when providing a hint for
missing identifiers.

gcc/c-family/ChangeLog:

	* known-headers.cc (get_string_macro_hint): New function.
	(get_stdlib_header_for_name): Use get_string_macro_hint.
	(get_c_stdlib_header_for_string_macro_name): New function.
	* known-headers.h (get_c_stdlib_header_for_string_macro_name):
	New function declaration.

gcc/c/ChangeLog:

	* c-parser.c (struct c_parser): Add seen_string_literal
	bitfield.
	(c_parser_consume_token): Reset seen_string_literal.
	(c_parser_error_richloc): Add name_hint if seen_string_literal
	and next token is a CPP_NAME and we have a missing header
	suggestion for the name.
	(c_parser_string_literal): Set seen_string_literal.

gcc/testsuite/ChangeLog:

	* gcc.dg/spellcheck-inttypes.c: New test.
	* g++.dg/spellcheck-inttypes.C: Likewise.
---
 gcc/c-family/known-headers.cc  | 53 ++-
 gcc/c-family/known-headers.h   |  2 +
 gcc/c/c-parser.c   | 29 
 gcc/testsuite/g++.dg/spellcheck-inttypes.C | 41 
 gcc/testsuite/gcc.dg/spellcheck-inttypes.c | 78 ++
 5 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-inttypes.C
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-inttypes.c

diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
index 1e2bf49c439a..c07cfd1db815 100644
--- a/gcc/c-family/known-headers.cc
+++ b/gcc/c-family/known-headers.cc
@@ -46,6 +46,49 @@ struct stdlib_hint
   const char *header[NUM_STDLIBS];
 };
 
+/* Given non-NULL NAME, return the header name defining it (as literal
+   string) within either the standard library (with '<' and '>'), or
+   NULL.
+
+   Only handle string macros, so that this can be used for
+   get_stdlib_header_for_name and
+   get_c_stdlib_header_for_string_macro_name.  */
+
+static const char *
+get_string_macro_hint (const char *name, enum stdlib lib)
+{
+  /* <inttypes.h> and <cinttypes>.  */
+  static const char *c99_cxx11_macros[] =
+{ "PRId8", "PRId16", "PRId32", "PRId64",
+  "PRIi8", "PRIi16", "PRIi32", "PRIi64",
+  "PRIo8", "PRIo16", "PRIo32", "PRIo64",
+  "PRIu8", "PRIu16", "PRIu32", "PRIu64",
+  "PRIx8", "PRIx16", "PRIx32", "PRIx64",
+  "PRIX8", "PRIX16", "PRIX32", "PRIX64",
+
+  "PRIdPTR", "PRIiPTR", "PRIoPTR", "PRIuPTR", "PRIxPTR", "PRIXPTR",
+
+  "SCNd8", "SCNd16", "SCNd32", "SCNd64",
+  "SCNi8", "SCNi16", "SCNi32", "SCNi64",
+  "SCNo8", "SCNo16", "SCNo32", "SCNo64",
+  "SCNu8", "SCNu16", "SCNu32", "SCNu64",
+  "SCNx8", "SCNx16", "SCNx32", "SCNx64",
+
+  "SCNdPTR", "SCNiPTR", "SCNoPTR", "SCNuPTR", "SCNxPTR" };
+
+  if ((lib == STDLIB_C && flag_isoc99)
+  || (lib == STDLIB_CPLUSPLUS && cxx_dialect >= cxx11 ))
+{
+  const size_t num_c99_cxx11_macros
+	= sizeof (c99_cxx11_macros) / sizeof (c99_cxx11_macros[0]);
+  for (size_t i = 0; i < num_c99_cxx11_macros; i++)
+	if (strcmp (name, c99_cxx11_macros[i]) == 0)
+	  return lib == STDLIB_C ? "<inttypes.h>" : "<cinttypes>";
+}
+
+  return NULL;
+}
+
 /* Given non-NULL NAME, return the header name defining it within either
the standard library (with '<' and '>'), or NULL.
Only handles a subset of the most common names within the stdlibs.  */
@@ -196,7 +239,7 @@ get_stdlib_header_for_name (const char *name, enum stdlib lib)
   if (strcmp (name, c99_cxx11_hints[i].name) == 0)
 	return c99_cxx11_hints[i].header[lib];
 
-  return NULL;
+  return get_string_macro_hint (name, lib);
 }
 
 /* Given non-NULL NAME, return the header name defining it within the C
@@ -217,6 +260,14 @@ get_cp_stdlib_header_for_name (const char *name)
   

Re: [PATCH 2/2] Tune memcpy and memset for Zen cores.

2020-06-01 Thread Martin Liška

Adding Honza as Uros recommended him for a review.

Martin

On 6/1/20 1:35 PM, Martin Liška wrote:

Based on the collected numbers in PR95435, I suggest the following
tuning changes:

gcc/ChangeLog:

 PR target/95435
 * config/i386/x86-tune-costs.h: Use libcall for large sizes for
 -m32. Start using libcall from 128+ bytes.
---
  gcc/config/i386/x86-tune-costs.h | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 1169178433f..3207404e514 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1314,20 +1314,20 @@ static stringop_algs znver1_memcpy[2] = {
    /* 32-bit tuning.  */
    {libcall, {{6, loop, false},
   {14, unrolled_loop, false},
- {-1, rep_prefix_4_byte, false}}},
+ {-1, libcall, false}}},
    /* 64-bit tuning.  */
    {libcall, {{16, loop, false},
- {8192, rep_prefix_8_byte, false},
+ {128, rep_prefix_8_byte, false},
   {-1, libcall, false}}}};
  static stringop_algs znver1_memset[2] = {
    /* 32-bit tuning.  */
    {libcall, {{8, loop, false},
   {24, unrolled_loop, false},
- {2048, rep_prefix_4_byte, false},
+ {128, rep_prefix_4_byte, false},
   {-1, libcall, false}}},
    /* 64-bit tuning.  */
    {libcall, {{48, unrolled_loop, false},
- {8192, rep_prefix_8_byte, false},
+ {128, rep_prefix_8_byte, false},
   {-1, libcall, false}}}};
  struct processor_costs znver1_cost = {
    {
@@ -1460,7 +1460,7 @@ static stringop_algs znver2_memcpy[2] = {
    /* 32-bit tuning.  */
    {libcall, {{6, loop, false},
   {14, unrolled_loop, false},
- {-1, rep_prefix_4_byte, false}}},
+ {-1, libcall, false}}},
    /* 64-bit tuning.  */
    {libcall, {{16, loop, false},
   {64, rep_prefix_4_byte, false},
@@ -1469,7 +1469,7 @@ static stringop_algs znver2_memset[2] = {
    /* 32-bit tuning.  */
    {libcall, {{8, loop, false},
   {24, unrolled_loop, false},
- {2048, rep_prefix_4_byte, false}
+ {128, rep_prefix_4_byte, false},
   {-1, libcall, false}}},
    /* 64-bit tuning.  */
    {libcall, {{24, rep_prefix_4_byte, false},




[PATCH 2/2] Tune memcpy and memset for Zen cores.

2020-06-01 Thread Martin Liška

Based on the collected numbers in PR95435, I suggest the following
tuning changes:

gcc/ChangeLog:

PR target/95435
* config/i386/x86-tune-costs.h: Use libcall for large sizes for
-m32. Start using libcall from 128+ bytes.
---
 gcc/config/i386/x86-tune-costs.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 1169178433f..3207404e514 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1314,20 +1314,20 @@ static stringop_algs znver1_memcpy[2] = {
   /* 32-bit tuning.  */
   {libcall, {{6, loop, false},
 {14, unrolled_loop, false},
-{-1, rep_prefix_4_byte, false}}},
+{-1, libcall, false}}},
   /* 64-bit tuning.  */
   {libcall, {{16, loop, false},
-{8192, rep_prefix_8_byte, false},
+{128, rep_prefix_8_byte, false},
 {-1, libcall, false}}}};
 static stringop_algs znver1_memset[2] = {
   /* 32-bit tuning.  */
   {libcall, {{8, loop, false},
 {24, unrolled_loop, false},
-{2048, rep_prefix_4_byte, false},
+{128, rep_prefix_4_byte, false},
 {-1, libcall, false}}},
   /* 64-bit tuning.  */
   {libcall, {{48, unrolled_loop, false},
-{8192, rep_prefix_8_byte, false},
+{128, rep_prefix_8_byte, false},
 {-1, libcall, false}}}};
 struct processor_costs znver1_cost = {
   {
@@ -1460,7 +1460,7 @@ static stringop_algs znver2_memcpy[2] = {
   /* 32-bit tuning.  */
   {libcall, {{6, loop, false},
 {14, unrolled_loop, false},
-{-1, rep_prefix_4_byte, false}}},
+{-1, libcall, false}}},
   /* 64-bit tuning.  */
   {libcall, {{16, loop, false},
 {64, rep_prefix_4_byte, false},
@@ -1469,7 +1469,7 @@ static stringop_algs znver2_memset[2] = {
   /* 32-bit tuning.  */
   {libcall, {{8, loop, false},
 {24, unrolled_loop, false},
-{2048, rep_prefix_4_byte, false}
+{128, rep_prefix_4_byte, false},
 {-1, libcall, false}}},
   /* 64-bit tuning.  */
   {libcall, {{24, rep_prefix_4_byte, false},
--
2.26.2



[PATCH 1/2] Re-format zen memcpy/memset costs.

2020-06-01 Thread Martin Liška

The patch improves readability of the memcpy and memset
expansion strategies.
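
For readers unfamiliar with these tables, here is a minimal sketch of how a
size-bucket table of this shape can be read.  It is a simplified model, not
the i386 backend code: the enumerators, struct bucket and pick_alg are
hypothetical names, the upper bound is assumed to be inclusive, and the
noalign flag is left out.  The bucket values are the 64-bit znver1 memcpy
entries as they appear in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the stringop algorithm enumerators.  */
enum alg { ALG_LOOP, ALG_UNROLLED_LOOP, ALG_REP_4, ALG_REP_8, ALG_LIBCALL };

/* One size bucket: use ALG for blocks of at most MAX bytes.
   MAX == -1 marks the catch-all final bucket.  */
struct bucket { long max; enum alg alg; };

/* The 64-bit znver1 memcpy buckets as formatted by this patch.  */
static const struct bucket znver1_memcpy_64[] = {
  { 16, ALG_LOOP },
  { 8192, ALG_REP_8 },
  { -1, ALG_LIBCALL },
};

/* Return the algorithm of the first bucket that covers SIZE.
   Assumption: the bound is inclusive.  */
static enum alg
pick_alg (const struct bucket *table, size_t n, long size)
{
  assert (n > 0 && table[n - 1].max == -1);
  for (size_t i = 0; i < n; i++)
    if (table[i].max == -1 || size <= table[i].max)
      return table[i].alg;
  return ALG_LIBCALL; /* Not reached: the last bucket is a catch-all.  */
}
```

Under this reading, one bucket per line makes each threshold easy to spot,
which is the point of the re-formatting.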

gcc/ChangeLog:

* config/i386/x86-tune-costs.h: Change code formatting.
---
 gcc/config/i386/x86-tune-costs.h | 38 +++-
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index c73917e5a62..1169178433f 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1311,14 +1311,23 @@ const struct processor_costs bdver_cost = {
 very small blocks it is better to use loop.  For large blocks, libcall
 can do nontemporary accesses and beat inline considerably.  */
 static stringop_algs znver1_memcpy[2] = {
-  {libcall, {{6, loop, false}, {14, unrolled_loop, false},
+  /* 32-bit tuning.  */
+  {libcall, {{6, loop, false},
+{14, unrolled_loop, false},
 {-1, rep_prefix_4_byte, false}}},
-  {libcall, {{16, loop, false}, {8192, rep_prefix_8_byte, false},
+  /* 64-bit tuning.  */
+  {libcall, {{16, loop, false},
+{8192, rep_prefix_8_byte, false},
 {-1, libcall, false}}}};
 static stringop_algs znver1_memset[2] = {
-  {libcall, {{8, loop, false}, {24, unrolled_loop, false},
-{2048, rep_prefix_4_byte, false}, {-1, libcall, false}}},
-  {libcall, {{48, unrolled_loop, false}, {8192, rep_prefix_8_byte, false},
+  /* 32-bit tuning.  */
+  {libcall, {{8, loop, false},
+{24, unrolled_loop, false},
+{2048, rep_prefix_4_byte, false},
+{-1, libcall, false}}},
+  /* 64-bit tuning.  */
+  {libcall, {{48, unrolled_loop, false},
+{8192, rep_prefix_8_byte, false},
 {-1, libcall, false}}}};
 struct processor_costs znver1_cost = {
   {
@@ -1448,14 +1457,23 @@ struct processor_costs znver1_cost = {
 very small blocks it is better to use loop.  For large blocks, libcall
 can do nontemporary accesses and beat inline considerably.  */
 static stringop_algs znver2_memcpy[2] = {
-  {libcall, {{6, loop, false}, {14, unrolled_loop, false},
+  /* 32-bit tuning.  */
+  {libcall, {{6, loop, false},
+{14, unrolled_loop, false},
 {-1, rep_prefix_4_byte, false}}},
-  {libcall, {{16, loop, false}, {64, rep_prefix_4_byte, false},
+  /* 64-bit tuning.  */
+  {libcall, {{16, loop, false},
+{64, rep_prefix_4_byte, false},
 {-1, libcall, false}}}};
 static stringop_algs znver2_memset[2] = {
-  {libcall, {{8, loop, false}, {24, unrolled_loop, false},
-{2048, rep_prefix_4_byte, false}, {-1, libcall, false}}},
-  {libcall, {{24, rep_prefix_4_byte, false}, {128, rep_prefix_8_byte, false},
+  /* 32-bit tuning.  */
+  {libcall, {{8, loop, false},
+{24, unrolled_loop, false},
+{2048, rep_prefix_4_byte, false}
+{-1, libcall, false}}},
+  /* 64-bit tuning.  */
+  {libcall, {{24, rep_prefix_4_byte, false},
+{128, rep_prefix_8_byte, false},
 {-1, libcall, false}}}};
 
 struct processor_costs znver2_cost = {

--
2.26.2




Re: [PATCH] Fix some improper debug dump in clone materialization

2020-06-01 Thread Martin Jambor
Hi Feng,

On Mon, Jun 01 2020, Feng Xue OS wrote:
> Clone materialization might produce some improper debug output as:
>
> Original--
>
> cloning foo/271 to foo.constprop/334
>replace map: 0 -> xxx1->yyy
> m_always_copy_start: 1
> IPA adjusted parameters: foo (...)
> {
> ...
> }
>
> And a better output could be:
>
> cloning foo/271 to foo.constprop/334
> replace map: 0 -> xxx, 1->yyy /* separate 1 with xxx,  */
> m_always_copy_start: 1   /* Align with replace map */
> IPA adjusted parameters:/* If no adjusted parameter, 
> start a new line or omit this line */
> foo (...)
> {
> ...
> }
>
> Feng
> ---
> 2020-06-01  Feng Xue  
>
> gcc/
>   * cgraphclones.c (materialize_all_clones): Adjust replace map dump.
>   * ipa-param-manipulation.c (ipa_dump_adjusted_parameters): Do not
>   dump information if there is no adjusted parameter.
>   * (ipa_param_adjustments::dump): Adjust prefix spaces for dump string.

This is OK, thank you.

Martin


Re: [PATCH 2/2] x86: Add cmpmemsi for -minline-all-stringops

2020-06-01 Thread Alexander Monakov via Gcc-patches
On Sun, 31 May 2020, H.J. Lu via Gcc-patches wrote:

> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -7656,6 +7656,90 @@ ix86_expand_set_or_cpymem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
>return true;
>  }
>  
> +/* Expand cmpstrn or memcmp.  */
> +
> +bool
> +ix86_expand_cmpstrn_or_cmpmem (rtx result, rtx src1, rtx src2,
> +rtx length, rtx align, bool is_cmpstrn)
> +{
> +  if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
> +return false;
> +
> +  /* Can't use this if the user has appropriated ecx, esi or edi.  */
> +  if (fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])
> +return false;
> +
> +  if (is_cmpstrn)
> +{
> +  /* For strncmp, length is the maximum length, which can be larger
> +  than actual string lengths.  We can expand the cmpstrn pattern
> +  to "repz cmpsb" only if one of the strings is a constant so
> +  that expand_builtin_strncmp() can write the length argument to
> +  be the minimum of the const string length and the actual length
> +  argument.  Otherwise, "repz cmpsb" may pass the 0 byte.  */
> +  tree t1 = MEM_EXPR (src1);
> +  tree t2 = MEM_EXPR (src2);
> +  if (!((t1 && TREE_CODE (t1) == MEM_REF
> +  && TREE_CODE (TREE_OPERAND (t1, 0)) == ADDR_EXPR
> +  && (TREE_CODE (TREE_OPERAND (TREE_OPERAND (t1, 0), 0))
> +  == STRING_CST))
> + || (t2 && TREE_CODE (t2) == MEM_REF
> + && TREE_CODE (TREE_OPERAND (t2, 0)) == ADDR_EXPR
> + && (TREE_CODE (TREE_OPERAND (TREE_OPERAND (t2, 0), 0))
> + == STRING_CST))))
> + return false;
> +}
> +  else
> +{
> +  /* Expand memcmp to "repz cmpsb" only for -minline-all-stringops
> +  since "repz cmpsb" can be much slower than memcmp function
> +  implemented with vector instructions, see
> +
> +  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> +   */
> +  if (!TARGET_INLINE_ALL_STRINGOPS)
> + return false;
> +}

This check seems to be misplaced: "rep cmps" is slower than either memcmp or
strcmp, so the test for TARGET_INLINE_ALL_STRINGOPS should happen regardless
of is_cmpstrn and should go earlier in the function.
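
To make the suggested ordering concrete, here is a small standalone model of
the control flow.  It is only a sketch of one reading of the comment above:
struct expand_ctx and may_expand_inline are hypothetical names, and the
boolean fields stand in for the real target queries rather than calling them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the conditions the expander consults.  */
struct expand_ctx
{
  bool inline_all_stringops; /* models TARGET_INLINE_ALL_STRINGOPS */
  bool string_regs_fixed;    /* models a fixed ecx, esi or edi */
  bool is_cmpstrn;           /* strncmp rather than memcmp */
  bool has_const_string;     /* one operand is a string constant */
};

static bool
may_expand_inline (const struct expand_ctx *ctx)
{
  assert (ctx != 0);

  /* Hoisted check: applies to strncmp and memcmp alike.  */
  if (!ctx->inline_all_stringops)
    return false;

  if (ctx->string_regs_fixed)
    return false;

  /* strncmp additionally needs a constant operand to bound the length.  */
  if (ctx->is_cmpstrn && !ctx->has_const_string)
    return false;

  return true;
}
```

With the check hoisted, the else branch that only tested
TARGET_INLINE_ALL_STRINGOPS for memcmp is no longer needed in this model.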

Alexander


[PATCH] Fix some improper debug dump in clone materialization

2020-06-01 Thread Feng Xue OS via Gcc-patches
Clone materialization might produce some improper debug output as:

Original--

cloning foo/271 to foo.constprop/334
   replace map: 0 -> xxx1->yyy
m_always_copy_start: 1
IPA adjusted parameters: foo (...)
{
...
}

And a better output could be:

cloning foo/271 to foo.constprop/334
replace map: 0 -> xxx, 1->yyy /* separate 1 with xxx,  */
m_always_copy_start: 1   /* Align with replace map */
IPA adjusted parameters: /* If no adjusted parameter, start a new line or omit this line */
foo (...)
{
...
}
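
The separator handling behind the improved "replace map" line can be sketched
in isolation as follows.  This is an illustration only: struct map_entry and
format_replace_map are hypothetical, and the real code prints trees to the
dump file rather than strings into a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one replace-map entry.  */
struct map_entry { int parm_num; const char *new_tree; };

/* Format MAP into BUF, writing "," before every entry except the first.
   Returns the number of characters written, excluding the terminator.  */
static size_t
format_replace_map (char *buf, size_t size,
                    const struct map_entry *map, size_t n)
{
  size_t used = 0;
  int w = snprintf (buf, size, "replace map:");
  assert (w >= 0 && (size_t) w < size);
  used += (size_t) w;

  for (size_t i = 0; i < n; i++)
    {
      w = snprintf (buf + used, size - used, "%s %i -> %s",
                    i ? "," : "", map[i].parm_num, map[i].new_tree);
      assert (w >= 0 && (size_t) w < size - used);
      used += (size_t) w;
    }
  return used;
}
```

Emitting the separator before each entry except the first avoids both the
fused "xxx1" of the old output and a trailing comma.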

Feng
---
2020-06-01  Feng Xue  

gcc/
* cgraphclones.c (materialize_all_clones): Adjust replace map dump.
* ipa-param-manipulation.c (ipa_dump_adjusted_parameters): Do not
dump information if there is no adjusted parameter.
* (ipa_param_adjustments::dump): Adjust prefix spaces for dump string.
---
 gcc/cgraphclones.c   | 6 +++---
 gcc/ipa-param-manipulation.c | 5 -
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index e4f1c1d4b5e..db61c218297 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1160,15 +1160,15 @@ symbol_table::materialize_all_clones (void)
  if (node->clone.tree_map)
{
  unsigned int i;
- fprintf (symtab->dump_file, "   replace map: ");
+ fprintf (symtab->dump_file, "replace map:");
  for (i = 0;
   i < vec_safe_length (node->clone.tree_map);
   i++)
{
  ipa_replace_map *replace_info;
  replace_info = (*node->clone.tree_map)[i];
- fprintf (symtab->dump_file, "%i -> ",
-  (*node->clone.tree_map)[i]->parm_num);
+ fprintf (symtab->dump_file, "%s %i -> ",
+  i ? "," : "", replace_info->parm_num);
  print_generic_expr (symtab->dump_file,
  replace_info->new_tree);
}
diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 978916057f0..2cc4bc79dc1 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -111,6 +111,9 @@ ipa_dump_adjusted_parameters (FILE *f,
   unsigned i, len = vec_safe_length (adj_params);
   bool first = true;
 
+  if (!len)
+return;
+
   fprintf (f, "IPA adjusted parameters: ");
   for (i = 0; i < len; i++)
 {
@@ -899,7 +902,7 @@ ipa_param_adjustments::dump (FILE *f)
   fprintf (f, "m_always_copy_start: %i\n", m_always_copy_start);
   ipa_dump_adjusted_parameters (f, m_adj_params);
   if (m_skip_return)
-fprintf (f, " Will SKIP return.\n");
+fprintf (f, "Will SKIP return.\n");
 }
 
 /* Dump information contained in the object in textual form to stderr.  */
-- 

[PATCH] coroutines: co_returns are statements, not expressions.

2020-06-01 Thread Iain Sandoe
Hi

This corrects a typo in the CO_RETURN_EXPR tree class.

Although it doesn’t fix any PR or regression, it seems to me that it would be
sensible to apply this to 10.2 as well as master (or it’s an accident waiting to
happen).

OK for master?
10.2 (after some bake)?
thanks
Iain

gcc/cp/ChangeLog:

* cp-tree.def (CO_RETURN_EXPR): Correct the class
to use tcc_statement.
---
 gcc/cp/cp-tree.def | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index 1454802bf68..99851eb780f 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -594,9 +594,9 @@ DEFTREECODE (CO_YIELD_EXPR, "co_yield", tcc_expression, 2)
 /* The co_return expression is used to support coroutines.
 
Op0 is the original expr, can be void (for use in diagnostics)
-   Op2 is the promise return_ call for Op0. */
+   Op1 is the promise return_ call for for the expression given. */
 
-DEFTREECODE (CO_RETURN_EXPR, "co_return", tcc_expression, 2)
+DEFTREECODE (CO_RETURN_EXPR, "co_return", tcc_statement, 2)
 
 /*
 Local variables:
-- 
2.24.1




Re: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-06-01 Thread Richard Sandiford
"Yangfei (Felix)"  writes:
> Hi,
>
>> -Original Message-
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: Sunday, May 31, 2020 12:01 AM
>> To: Yangfei (Felix) 
>> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak ; Jakub
>> Jelinek ; Hongtao Liu ; H.J. Lu
>> 
>> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
>> fixed sve vector length
>> 
>
> Snip...
>
>> >
>> > The v5 patch attached addressed this issue.
>> >
>> > There are two added changes compared with the v4 patch:
>> > 1. In candidate_mem_p, mov_optab for innermode should be available.
>> >  In this case, mov_optab for SDmode is not there and subregs are added
>> > back by emit_move_insn_1.  So we won't get the benefit with the patch.
>> 
>> I agree we should have this check.  I think the rule applies to all of the
>> transforms though, not just the mem one, so we should add the check to the
>> register and constant cases too.
>
> OK.  I changed to make this an extra condition for calculating x_inner &
> y_inner.

Sounds good.  Maybe at this point the x_inner and y_inner code is
getting complicated enough to put into a lambda too:

  x_inner = ... (x);
  y_inner = ... (y);

Just a suggestion though.

>> > 2. Instead of using adjust_address, I changed to use adjust_address_nv to
>> > avoid emitting the invalid insn 13.
>> > The later call to validize_mem() in emit_move_insn will take care of the
>> > address for us.
>> 
>> The validation performed by validize_mem is the same as that performed by
>> adjust_address, so the only case this should make a difference is for
>> push_operands:
>
> True.
>
>>   /* If X or Y are memory references, verify that their addresses are valid
>>  for the machine.  */
>>   if (MEM_P (x)
>>   && (! memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
>>   MEM_ADDR_SPACE (x))
>>&& ! push_operand (x, GET_MODE (x))))
>> x = validize_mem (x);
>> 
>>   if (MEM_P (y)
>>   && ! memory_address_addr_space_p (GET_MODE (y), XEXP (y, 0),
>>  MEM_ADDR_SPACE (y)))
>> y = validize_mem (y);
>> 
>> So I think the fix is to punt on push_operands instead (and continue to use
>> adjust_address rather than adjust_address_nv).
>
> Not sure if I understand it correctly.
> Do you mean excluding push_operand in candidate_mem_p? Like:
>
> auto candidate_mem_p = [&](machine_mode innermode, rtx mem) {
>   return !targetm.can_change_mode_class (innermode, GET_MODE (mem), ALL_REGS)
>          && !push_operand (mem, GET_MODE (mem))
>          /* Not a candidate if innermode requires too much alignment.  */
>          && (MEM_ALIGN (mem) >= GET_MODE_ALIGNMENT (innermode)
>              || targetm.slow_unaligned_access (GET_MODE (mem),
>                                                MEM_ALIGN (mem))
>              || !targetm.slow_unaligned_access (innermode, MEM_ALIGN (mem)));
> };

Yeah, looks good.

Formatting nit though: multi-line conditions should be wrapped in (...),
i.e.:

    return (...
            && ...
            && ...);
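
As a concrete instance of that layout, here is a tiny standalone function.
The struct and its fields are hypothetical placeholders for the real target
queries; only the shape of the return statement matters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for the real target queries.  */
struct mem_info { bool mode_change_ok; bool is_push; unsigned align; };

static bool
candidate_p (const struct mem_info *mem, unsigned required_align)
{
  assert (mem != 0);

  /* The whole multi-line condition sits inside one pair of parentheses,
     so each continuation line aligns under the first operand.  */
  return (mem->mode_change_ok
          && !mem->is_push
          && mem->align >= required_align);
}
```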

Thanks,
Richard


[PATCH] coroutines: Fix missed ramp function return copy elision [PR95346].

2020-06-01 Thread Iain Sandoe
Hi

Confusingly, "get_return_object ()" can do two things:
- Firstly it can provide the return object for the ramp function (as
  the name suggests).
- Secondly if the type of the ramp function is different from that
  of the get_return_object call, this is used as a single parameter
  to a CTOR for the ramp's return type.

In the first case we can rely on finish_return_stmt () to do the
necessary processing for copy elision.
In the second case, we should have passed a prvalue to the CTOR as
per the standard comment, but I had omitted the rvalue () call.  Fixed
thus.

tested on x86_64-darwin, x86_64-linux, powerpc64-linux
OK for master?
OK for 10.2?
thanks
Iain

gcc/cp/ChangeLog:

PR c++/95346
* coroutines.cc (morph_fn_to_coro): Ensure that the
get-return-object is constructed correctly; when it is not the
final return value, pass it to the CTOR of the return type
as an rvalue, per the standard comment.

gcc/testsuite/ChangeLog:

PR c++/95346
* g++.dg/coroutines/pr95346.C: New test.
---
 gcc/cp/coroutines.cc  | 70 +++
 gcc/testsuite/g++.dg/coroutines/pr95346.C | 26 +
 2 files changed, 71 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95346.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 7afa550037c..d1c2b437ade 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4279,7 +4279,8 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 }
 
   tree gro_context_body = push_stmt_list ();
-  bool gro_is_void_p = VOID_TYPE_P (TREE_TYPE (get_ro));
+  tree gro_type = TREE_TYPE (get_ro);
+  bool gro_is_void_p = VOID_TYPE_P (gro_type);
 
   tree gro = NULL_TREE;
   tree gro_bind_vars = NULL_TREE;
@@ -4289,17 +4290,23 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 finish_expr_stmt (get_ro);
   else
 {
-  gro = build_lang_decl (VAR_DECL, get_identifier ("coro.gro"),
- TREE_TYPE (get_ro));
+  gro = build_lang_decl (VAR_DECL, get_identifier ("coro.gro"), gro_type);
   DECL_CONTEXT (gro) = current_scope ();
   DECL_ARTIFICIAL (gro) = true;
   DECL_IGNORED_P (gro) = true;
   add_decl_expr (gro);
   gro_bind_vars = gro;
-
-  r = build2_loc (fn_start, INIT_EXPR, TREE_TYPE (gro), gro, get_ro);
-  r = coro_build_cvt_void_expr_stmt (r, fn_start);
-  add_stmt (r);
+  if (TYPE_NEEDS_CONSTRUCTING (gro_type))
+   {
+ vec<tree, va_gc> *arg = make_tree_vector_single (get_ro);
+ r = build_special_member_call (gro, complete_ctor_identifier,
+&arg, gro_type, LOOKUP_NORMAL,
+tf_warning_or_error);
+ release_tree_vector (arg);
+   }
+  else
+   r = build2_loc (fn_start, INIT_EXPR, gro_type, gro, get_ro);
+  finish_expr_stmt (r);
 }
 
   /* Initialize the resume_idx_name to 0, meaning "not started".  */
@@ -4333,28 +4340,41 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   /* Switch to using 'input_location' as the loc, since we're now more
  logically doing things related to the end of the function.  */
 
-  /* The ramp is done, we just need the return value.  */
-  if (!same_type_p (TREE_TYPE (get_ro), fn_return_type))
+  /* The ramp is done, we just need the return value.
+ [dcl.fct.def.coroutine] / 7
+ The expression promise.get_return_object() is used to initialize the
+ glvalue result or prvalue result object of a call to a coroutine.
+
+ If the 'get return object' is non-void, then we built it before the
+ promise was constructed.  We now supply a reference to that var,
+ either as the return value (if it's the same type) or to the CTOR
+ for an object of the return type.  */
+  if (gro_is_void_p)
+r = NULL_TREE;
+  else
+r = rvalue (gro);
+
+  if (!same_type_p (gro_type, fn_return_type))
 {
-  /* construct the return value with a single GRO param, if it's not
-void.  */
-  vec<tree, va_gc> *args = NULL;
-  vec<tree, va_gc> **arglist = NULL;
-  if (!gro_is_void_p)
+  /* The return object is , even if the gro is void.  */
+  if (CLASS_TYPE_P (fn_return_type))
{
- args = make_tree_vector_single (gro);
- arglist = &args;
+ vec<tree, va_gc> *args = NULL;
+ vec<tree, va_gc> **arglist = NULL;
+ if (!gro_is_void_p)
+   {
+ args = make_tree_vector_single (r);
+ arglist = &args;
+   }
+ r = build_special_member_call (NULL_TREE,
+complete_ctor_identifier, arglist,
+fn_return_type, LOOKUP_NORMAL,
+tf_warning_or_error);
+ r = build_cplus_new (fn_return_type, r, tf_warning_or_error);
}
-  r = build_special_member_call (NULL_TREE,
-complete_ctor_identifier, 

[patch] Make memory copy functions scalar storage order barriers

2020-06-01 Thread Eric Botcazou
Hi,

this addresses the issue raised by Andrew a few weeks ago about the usage of 
memory copy functions to toggle the scalar storage order.  Recall that you 
cannot take the address of a scalar which is stored in reverse order (the 
compiler errors out), but you can do it for the enclosing aggregate type, which 
means that you can also pass it to the memory copy functions.  In this case, 
the optimizer may rewrite the copy into a scalar copy, which is a no-no.
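
The reason a byte copy and a scalar copy are not interchangeable here can be
shown with a portable model.  This is only a sketch: it does not use the
scalar_storage_order attribute, struct be32 and the helper functions are
hypothetical, and the byte array simply models a scalar kept in big-endian
order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable model of an aggregate whose scalar is kept in big-endian order.  */
struct be32 { unsigned char v[4]; };

/* Value-preserving store: writes X most significant byte first.  */
static struct be32
be32_store (uint32_t x)
{
  struct be32 r = { { (unsigned char) (x >> 24), (unsigned char) (x >> 16),
                      (unsigned char) (x >> 8), (unsigned char) x } };
  return r;
}

/* Value-preserving load: the counterpart of be32_store.  */
static uint32_t
be32_load (const struct be32 *p)
{
  return ((uint32_t) p->v[0] << 24) | ((uint32_t) p->v[1] << 16)
         | ((uint32_t) p->v[2] << 8) | (uint32_t) p->v[3];
}

/* Byte copy into a native scalar: preserves the representation, so the
   value read back is byte-swapped on a little-endian host.  */
static uint32_t
raw_copy_to_native (const struct be32 *p)
{
  uint32_t native;
  memcpy (&native, p->v, sizeof native);
  return native;
}

/* Nonzero if the host stores the most significant byte first.  */
static int
host_is_big_endian (void)
{
  const uint32_t one = 1;
  unsigned char first;
  memcpy (&first, &one, 1);
  return first == 0;
}
```

A value-preserving access and a representation-preserving copy agree only
when the two storage orders match, so replacing one with the other changes
what the program observes.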

The patch also contains an unrelated hunk for the tree pretty printer.

Tested on x86-64/Linux, OK for the mainline?


2020-06-01  Eric Botcazou  

* gimple-fold.c (gimple_fold_builtin_memory_op): Do not replace with a
scalar copy if either type has reverse scalar storage order.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Do not propagate through a
memory copy if either type has reverse scalar storage order.

* tree-pretty-print.c (dump_generic_node) : Print quals.


2020-06-01  Eric Botcazou  

* gcc.c-torture/execute/sso-1.c: New test.

-- 
Eric Botcazou
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 4e3de95d2d2..64a9221f8cf 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -741,7 +741,8 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 }
   else
 {
-  tree srctype, desttype;
+  tree srctype = TREE_TYPE (TREE_TYPE (src));
+  tree desttype = TREE_TYPE (TREE_TYPE (dest));
   unsigned int src_align, dest_align;
   tree off0;
   const char *tmp_str;
@@ -767,7 +768,11 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	 hack can be removed.  */
 	  && !c_strlen (src, 1)
 	  && !((tmp_str = c_getstr (src, &tmp_len)) != NULL
-	   && memchr (tmp_str, 0, tmp_len) == NULL))
+	   && memchr (tmp_str, 0, tmp_len) == NULL)
+	  && !(AGGREGATE_TYPE_P (srctype)
+	   && TYPE_REVERSE_STORAGE_ORDER (srctype))
+	  && !(AGGREGATE_TYPE_P (desttype)
+	   && TYPE_REVERSE_STORAGE_ORDER (desttype)))
 	{
 	  unsigned ilen = tree_to_uhwi (len);
 	  if (pow2p_hwi (ilen))
@@ -957,10 +962,15 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	 but that only gains us that the destination and source possibly
 	 no longer will have their address taken.  */
   srctype = TREE_TYPE (TREE_TYPE (src));
+  desttype = TREE_TYPE (TREE_TYPE (dest));
+  if ((AGGREGATE_TYPE_P (srctype)
+	   && TYPE_REVERSE_STORAGE_ORDER (srctype))
+	  || (AGGREGATE_TYPE_P (desttype)
+	  && TYPE_REVERSE_STORAGE_ORDER (desttype)))
+	return false;
   if (TREE_CODE (srctype) == ARRAY_TYPE
 	  && !tree_int_cst_equal (TYPE_SIZE_UNIT (srctype), len))
 	srctype = TREE_TYPE (srctype);
-  desttype = TREE_TYPE (TREE_TYPE (dest));
   if (TREE_CODE (desttype) == ARRAY_TYPE
 	  && !tree_int_cst_equal (TYPE_SIZE_UNIT (desttype), len))
 	desttype = TREE_TYPE (desttype);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index f04fd65091a..7d581214022 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1899,8 +1899,16 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
 
 case ARRAY_TYPE:
   {
+	unsigned int quals = TYPE_QUALS (node);
 	tree tmp;
 
+	if (quals & TYPE_QUAL_ATOMIC)
+	  pp_string (pp, "atomic ");
+	if (quals & TYPE_QUAL_CONST)
+	  pp_string (pp, "const ");
+	if (quals & TYPE_QUAL_VOLATILE)
+	  pp_string (pp, "volatile ");
+
 	/* Print the innermost component type.  */
 	for (tmp = TREE_TYPE (node); TREE_CODE (tmp) == ARRAY_TYPE;
 	 tmp = TREE_TYPE (tmp))
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 4b3f31c12cb..17867b65ecb 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -3275,6 +3275,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
 	}
   if (TREE_CODE (lhs) == ADDR_EXPR)
 	{
+	  if (AGGREGATE_TYPE_P (TREE_TYPE (TREE_TYPE (lhs)))
+	  && TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (TREE_TYPE (lhs))))
+	return (void *)-1;
 	  tree tem = get_addr_base_and_unit_offset (TREE_OPERAND (lhs, 0),
 		_offset);
 	  if (!tem)
@@ -3303,6 +3306,9 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
 	rhs = vn_valueize (rhs);
   if (TREE_CODE (rhs) == ADDR_EXPR)
 	{
+	  if (AGGREGATE_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
+	  && TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (TREE_TYPE (rhs))))
+	return (void *)-1;
 	  tree tem = get_addr_base_and_unit_offset (TREE_OPERAND (rhs, 0),
 		_offset);
 	  if (!tem)
typedef unsigned char uint8_t;
typedef unsigned int uint32_t;

#define __big_endian__ scalar_storage_order("big-endian")
#define __little_endian__ scalar_storage_order("little-endian")

typedef union
{
  uint32_t val;
  uint8_t v[4];
} __attribute__((__big_endian__)) upal_u32be_t;

typedef union
{
  uint32_t val;
  uint8_t v[4];
} __attribute__((__little_endian__)) upal_u32le_t;

static inline uint32_t native_to_big_endian(uint32_t t)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  return t;

[PATCH] coroutines: Allow parameter packs in co_await/yield expressions [PR95345]

2020-06-01 Thread Iain Sandoe
Hi

This corrects a pasto, where I copied the constraint on bare
parameter packs from the co_return to co_yield/await without
properly reviewing it.

tested on x86_64,powerpc64-linux, x86_64-darwin
OK for master?
OK for 10.2?
thanks
Iain

gcc/cp/ChangeLog:

PR c++/95345
* coroutines.cc (finish_co_await_expr): Revise to allow for
parameter packs.
(finish_co_yield_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95345
* g++.dg/coroutines/pr95345.C: New test.
---
 gcc/cp/coroutines.cc  | 45 +++
 gcc/testsuite/g++.dg/coroutines/pr95345.C | 32 
 2 files changed, 53 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95345.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index cc685ca73b2..7afa550037c 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -851,19 +851,18 @@ finish_co_await_expr (location_t kw, tree expr)
   /* The current function has now become a coroutine, if it wasn't already.  */
   DECL_COROUTINE_P (current_function_decl) = 1;
 
-  if (processing_template_decl)
-{
-  current_function_returns_value = 1;
-
-  if (check_for_bare_parameter_packs (expr))
-   return error_mark_node;
+  /* This function will appear to have no return statement, even if it
+ is declared to return non-void (most likely).  This is correct - we
+ synthesize the return for the ramp in the compiler.  So suppress any
+ extraneous warnings during substitution.  */
+  TREE_NO_WARNING (current_function_decl) = true;
 
-  /* If we don't know the promise type, we can't proceed.  */
-  tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
-  NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
-}
+  /* If we don't know the promise type, we can't proceed, build the
+ co_await with the expression unchanged.  */
+  tree functype = TREE_TYPE (current_function_decl);
+  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
+  NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
 
   /* We must be able to look up the "await_transform" method in the scope of
  the promise type, and obtain its return type.  */
@@ -928,19 +927,17 @@ finish_co_yield_expr (location_t kw, tree expr)
   /* The current function has now become a coroutine, if it wasn't already.  */
   DECL_COROUTINE_P (current_function_decl) = 1;
 
-  if (processing_template_decl)
-{
-  current_function_returns_value = 1;
-
-  if (check_for_bare_parameter_packs (expr))
-   return error_mark_node;
+  /* This function will appear to have no return statement, even if it
+ is declared to return non-void (most likely).  This is correct - we
+ synthesize the return for the ramp in the compiler.  So suppress any
+ extraneous warnings during substitution.  */
+  TREE_NO_WARNING (current_function_decl) = true;
 
-  tree functype = TREE_TYPE (current_function_decl);
-  /* If we don't know the promise type, we can't proceed.  */
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr,
-  NULL_TREE);
-}
+  /* If we don't know the promise type, we can't proceed, build the
+ co_await with the expression unchanged.  */
+  tree functype = TREE_TYPE (current_function_decl);
+  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr, NULL_TREE);
 
   if (!coro_promise_type_found_p (current_function_decl, kw))
 /* We must be able to look up the "yield_value" method in the scope of
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95345.C b/gcc/testsuite/g++.dg/coroutines/pr95345.C
new file mode 100644
index 000..90e946d91c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95345.C
@@ -0,0 +1,32 @@
+#if __has_include (<coroutine>)
+#include <coroutine>
+using namespace std;
+#elif defined (__clang__) && __has_include (<experimental/coroutine>)
+#include <experimental/coroutine>
+using namespace std::experimental;
+#endif
+
+struct dummy_coro
+{
+  using promise_type = dummy_coro;
+  bool await_ready() { return false; }
+  void await_suspend(std::coroutine_handle<>) { }
+  void await_resume() { }
+  dummy_coro get_return_object() { return {}; }
+  dummy_coro initial_suspend() { return {}; }
+  dummy_coro final_suspend() { return {}; }
+  void return_void() { }
+  void unhandled_exception() { }
+};
+
+template <int ...I>
+dummy_coro
+foo()
+{
+ ((co_await [](int){ return std::suspend_never{}; }(I)), ...);
+  co_return;
+}
+
+void bar() {
+  foo<1>();
+}
-- 
2.24.1



[PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

2020-06-01 Thread Feng Xue OS via Gcc-patches
This patch adds match rules to simplify the following patterns:

o. (pointer + offset_a) - (pointer + offset_b)   ->   (ptrdiff_t) (offset_a - offset_b)
o. (pointer_a + offset) - (pointer_b + offset)   ->   (pointer_a - pointer_b)
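
At the source level, the two patterns correspond to expressions of the
following shape.  This is just an illustration of the identities, not the
match.pd rules themselves; same_base and same_offset are hypothetical helper
names, and all pointers are assumed to stay within one object.

```c
#include <assert.h>
#include <stddef.h>

/* Same base, different offsets: the result depends only on the offsets.  */
static ptrdiff_t
same_base (char *p, size_t a, size_t b)
{
  return (p + a) - (p + b);
}

/* Different bases, same offset: the offset cancels out.  */
static ptrdiff_t
same_offset (char *p, char *q, size_t off)
{
  return (p + off) - (q + off);
}
```

The first function can be reduced to a subtraction of the offsets and the
second to p - q, without forming either intermediate pointer.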

Bootstrapped/regtested on x86_64-linux and aarch64-linux.

Feng
---
2020-06-01  Feng Xue  

gcc/
PR tree-optimization/94234
* match.pd ((PTR + A) - (PTR + B)) -> (ptrdiff_t)(A - B): New
simplification.
* ((PTR_A + O) - (PTR_B + O)) -> (PTR_A - PTR_B): New simplification.

gcc/testsuite/
PR tree-optimization/94234
* gcc.dg/pr94234.c: New test.
---
 gcc/match.pd   | 19 +--
 gcc/testsuite/gcc.dg/pr94234.c | 24 
 2 files changed, 33 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr94234.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 33ee1a920bf..6553be4822e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2515,16 +2515,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 && TREE_CODE (@2) == INTEGER_CST
 && tree_int_cst_sign_bit (@2) == 0))
  (minus (convert @1) (convert @2)
-   (simplify
-(pointer_diff (pointer_plus @@0 @1) (pointer_plus @0 @2))
-/* The second argument of pointer_plus must be interpreted as signed, and
-   thus sign-extended if necessary.  */
-(with { tree stype = signed_type_for (TREE_TYPE (@1)); }
- /* Use view_convert instead of convert here, as POINTER_PLUS_EXPR
-   second arg is unsigned even when we need to consider it as signed,
-   we don't want to diagnose overflow here.  */
- (minus (convert (view_convert:stype @1))
-   (convert (view_convert:stype @2)))
+  (simplify
+   (pointer_diff (pointer_plus@3 @0 @1) (pointer_plus @0 @2))
+(if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3)))
+  (convert (minus @1 @2
+  (simplify
+   (pointer_diff (pointer_plus@3 @0 @2) (pointer_plus @1 @2))
+(if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3))
+&& !integer_zerop (@2))
+ (pointer_diff @0 @1)
 
 /* (A * C) +- (B * C) -> (A+-B) * C and (A * C) +- A -> A * (C+-1).
 Modeled after fold_plusminus_mult_expr.  */
diff --git a/gcc/testsuite/gcc.dg/pr94234.c b/gcc/testsuite/gcc.dg/pr94234.c
new file mode 100644
index 000..ef9076c80da
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr94234.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1" } */ 
+
+typedef __SIZE_TYPE__ size_t;
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+
+ptrdiff_t foo (char *a, size_t n)
+{
+  char *b1 = a + 8 * n;
+  char *b2 = a + 8 * (n - 1);
+
+  return b1 - b2;
+}
+
+ptrdiff_t goo (char *a, size_t n, size_t m)
+{
+  char *b1 = a + 8 * n;
+  char *b2 = a + 8 * (n + 1);
+
+  return (b1 + m) - (b2 + m);
+}
+
+/* { dg-final { scan-tree-dump-times "return 8;" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "return -8;" 1 "ccp1" } } */
From 160eaeb151197844005837dc4b8e1e27bb6dfadf Mon Sep 17 00:00:00 2001
From: Feng Xue 
Date: Mon, 1 Jun 2020 11:57:35 +0800
Subject: [PATCH] tree-optimization/94234 - add ptr-diff pattern for addresses
 with same base or offset


[PATCH] coroutines: Correct handling of references in parm copies [PR95350].

2020-06-01 Thread Iain Sandoe
(resending, this didn’t appear to make it to the list)

Hi,

I had implemented a move out of rvalue refs for such ramp values (since
these are most likely to be dangling references).  However this does cause
a divergence with the clang implementation - and the patch fixes that.

Tested on x86_64-linux, powerpc64-linux and x86_64-darwin.
OK for master?
OK for 10.2?
thanks
Iain

---

Adjust to handle rvalue refs the same way as clang, and to correct
the handling of moves when a copy CTOR is present.  This is one area
where we could make things easier for the end-user (as was implemented
before this change), however there needs to be agreement about when the
full statement containing a coroutine call ends (i.e. when the ramp
terminates or when the coroutine terminates).

gcc/cp/ChangeLog:

PR c++/95350
* coroutines.cc (struct param_info): Remove rv_ref field.
(build_actor_fn): Remove special rvalue ref handling.
(morph_fn_to_coro): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95350
* g++.dg/coroutines/torture/func-params-08.C: Adjust test to
reflect that all rvalue refs are dangling.
* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C:
Likewise.
* g++.dg/coroutines/pr95350.C: New test.
---
gcc/cp/coroutines.cc  | 41 +--
gcc/testsuite/g++.dg/coroutines/pr95350.C | 28 +
.../coroutines/torture/func-params-08.C   | 11 ++---
.../torture/func-params-09-awaitable-parms.C  | 11 ++---
4 files changed, 50 insertions(+), 41 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95350.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 969f4a66f2f..8746927577a 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1807,7 +1807,6 @@ struct param_info
  tree frame_type;   /* The type used to represent this parm in the frame.  */
  tree orig_type;/* The original type of the parm (not as passed).  */
  bool by_ref;   /* Was passed by reference.  */
-  bool rv_ref;   /* Was an rvalue reference.  */
  bool pt_ref;   /* Was a pointer to object.  */
  bool trivial_dtor; /* The frame type has a trivial DTOR.  */
  bool this_ptr; /* Is 'this' */
@@ -2077,12 +2076,6 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
  if (parm.pt_ref)
fld_idx = build1_loc (loc, CONVERT_EXPR, TREE_TYPE (arg), fld_idx);

- /* We expect an rvalue ref. here.  */
- if (parm.rv_ref)
-   fld_idx = convert_to_reference (DECL_ARG_TYPE (arg), fld_idx,
-   CONV_STATIC, LOOKUP_NORMAL,
-   NULL_TREE, tf_warning_or_error);
-
  int i;
  tree *puse;
  FOR_EACH_VEC_ELT (*parm.body_uses, i, puse)
@@ -3770,15 +3763,8 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
  if (actual_type == NULL_TREE)
actual_type = error_mark_node;
  parm.orig_type = actual_type;
- parm.by_ref = parm.rv_ref = parm.pt_ref = false;
- if (TREE_CODE (actual_type) == REFERENCE_TYPE
- && TYPE_REF_IS_RVALUE (DECL_ARG_TYPE (arg)))
-   {
- parm.rv_ref = true;
- actual_type = TREE_TYPE (actual_type);
- parm.frame_type = actual_type;
-   }
- else if (TREE_CODE (actual_type) == REFERENCE_TYPE)
+ parm.by_ref = parm.pt_ref = false;
+ if (TREE_CODE (actual_type) == REFERENCE_TYPE)
{
  /* If the user passes by reference, then we will save the
 pointer to the original.  As noted in
@@ -3786,16 +3772,12 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
 referenced item ends and then the coroutine is resumed,
 we have UB; well, the user asked for it.  */
  actual_type = build_pointer_type (TREE_TYPE (actual_type));
- parm.frame_type = actual_type;
  parm.pt_ref = true;
}
  else if (TYPE_REF_P (DECL_ARG_TYPE (arg)))
-   {
- parm.by_ref = true;
- parm.frame_type = actual_type;
-   }
- else
-   parm.frame_type = actual_type;
+   parm.by_ref = true;
+
+ parm.frame_type = actual_type;

  parm.this_ptr = is_this_parameter (arg);
  if (lambda_p)
@@ -4170,17 +4152,16 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
}
  else if (parm.by_ref)
vec_safe_push (promise_args, fld_idx);
- else if (parm.rv_ref)
-   vec_safe_push (promise_args, rvalue (fld_idx));
  else
vec_safe_push (promise_args, arg);

  if (TYPE_NEEDS_CONSTRUCTING (parm.frame_type))
{
  vec<tree, va_gc> *p_in;
- if (parm.by_ref
- && classtype_has_non_deleted_move_ctor 

[PATCH] coroutines: Wrap co_await in a target expr where needed [PR95050]

2020-06-01 Thread Iain Sandoe
Hi,

Since the co_await expression is mostly opaque to the existing
machinery, we were hiding the details of the await_resume return
value.  If that needs to be wrapped in a target expression, then
emulate this with the whole co_await.  Similarly, if the await
expression we build in response to co_await p.yield_value (e)
is wrapped in a target expression, then we need to transfer that
wrapper to the resultant CO_YIELD_EXPR (which is, itself, just
a proxy for the underlying co_await).

Tested on x86_64-linux, powerpc64-linux and x86_64-darwin.
OK for master?
OK for 10.2?
thanks
Iain

gcc/cp/ChangeLog:

PR c++/95050
* coroutines.cc (build_co_await): Wrap the co_await expression
in a TARGET_EXPR, where needed.
(finish_co_yield_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/95050
* g++.dg/coroutines/pr95050.C: New test.
---
 gcc/cp/coroutines.cc  | 29 +-
 gcc/testsuite/g++.dg/coroutines/pr95050.C | 49 +++
 2 files changed, 76 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95050.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 8746927577a..cc685ca73b2 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -816,6 +816,12 @@ build_co_await (location_t loc, tree a, suspend_point_kind suspend_kind)
   tree awaiter_calls = make_tree_vec (3);
   TREE_VEC_ELT (awaiter_calls, 0) = awrd_call; /* await_ready().  */
   TREE_VEC_ELT (awaiter_calls, 1) = awsp_call; /* await_suspend().  */
+  tree te = NULL_TREE;
+  if (TREE_CODE (awrs_call) == TARGET_EXPR)
+{
+  te = awrs_call;
+  awrs_call = TREE_OPERAND (awrs_call, 1);
+}
   TREE_VEC_ELT (awaiter_calls, 2) = awrs_call; /* await_resume().  */
 
   tree await_expr = build5_loc (loc, CO_AWAIT_EXPR,
@@ -823,7 +829,13 @@ build_co_await (location_t loc, tree a, suspend_point_kind suspend_kind)
a, e_proxy, o, awaiter_calls,
build_int_cst (integer_type_node,
   (int) suspend_kind));
-  return convert_from_reference (await_expr);
+  if (te)
+{
+  TREE_OPERAND (te, 1) = await_expr;
+  await_expr = te;
+}
+  tree t = convert_from_reference (await_expr);
+  return t;
 }
 
 tree
@@ -960,8 +972,21 @@ finish_co_yield_expr (location_t kw, tree expr)
   tree op = build_co_await (kw, yield_call, CO_YIELD_SUSPEND_POINT);
   if (op != error_mark_node)
 {
-  op = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (op), expr, op);
+  if (REFERENCE_REF_P (op))
+   op = TREE_OPERAND (op, 0);
+  /* If the await expression is wrapped in a TARGET_EXPR, then transfer
+that wrapper to the CO_YIELD_EXPR, since this is just a proxy for
+its contained await.  Otherwise, just build the CO_YIELD_EXPR.  */
+  if (TREE_CODE (op) == TARGET_EXPR)
+   {
+ tree t = TREE_OPERAND (op, 1);
+ t = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (t), expr, t);
+ TREE_OPERAND (op, 1) = t;
+   }
+  else
+   op = build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (op), expr, op);
   TREE_SIDE_EFFECTS (op) = 1;
+  op = convert_from_reference (op);
 }
 
   return op;
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95050.C b/gcc/testsuite/g++.dg/coroutines/pr95050.C
new file mode 100644
index 000..fd1516d32f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95050.C
@@ -0,0 +1,49 @@
+#if __has_include (<coroutine>)
+#include <coroutine>
+using namespace std;
+#elif defined (__clang__) && __has_include (<experimental/coroutine>)
+#include <experimental/coroutine>
+using namespace std::experimental;
+#endif
+#include 
+
+struct ret_type 
+{
+  ret_type () = default;
+  ret_type (const ret_type&) = delete;
+  //ret_type (ret_type&&) = default;
+  ~ret_type() {}
+};
+
+struct task
+{
+  struct promise_type
+  {
+auto get_return_object () -> task  { return {}; }
+auto initial_suspend () -> suspend_always { return {}; }
+auto final_suspend () -> suspend_always { return {}; }
+void return_void () {} 
+void unhandled_exception () { }
+void thing (ret_type x) {} 
+  };
+};
+
+struct awaiter
+{
+  bool await_ready() const { return true; }
+  void await_suspend (coroutine_handle<>) {}
+  ret_type await_resume() { return {}; }
+};
+
+task
+my_coro ()
+{
+  ret_type r2{co_await awaiter{}};
+  //ret_type r3 (std::move(r2));
+}
+
+int main()
+{
+ auto x = my_coro ();
+ return 0;
+}
-- 
2.24.1



Re: [PATCH] testsuite: Disable colorization for ubsan test

2020-06-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Jun 01, 2020 at 03:43:00PM +0800, Kito Cheng wrote:
> ping
> 
> 
> On Wed, May 20, 2020 at 3:01 PM Kito Cheng  wrote:
> >
> >  - Running the GCC testsuite under qemu prints ASCII color codes for
> >    ubsan-related testcases, but several testcases do not account for
> >    that; disabling colorization prevents the problem and simplifies
> >    adding testcases in the future.
> >
> >  - Verified on native X86 and RISC-V qemu full system mode and user mode.
> >
> > ChangeLog:
> >
> > gcc/testsuite/
> >
> > Kito Cheng  
> >
> > * ubsan-dg.exp (orig_ubsan_options_saved): New
> > (orig_ubsan_options): Ditto.
> > (ubsan_init): Store UBSAN_OPTIONS and set UBSAN_OPTIONS.
> > (ubsan_finish): Restore UBSAN_OPTIONS.

Ok, thanks.

Jakub



Re: [PATCH] testsuite: Disable colorization for ubsan test

2020-06-01 Thread Kito Cheng
ping


On Wed, May 20, 2020 at 3:01 PM Kito Cheng  wrote:
>
>  - Running the GCC testsuite under qemu prints ASCII color codes for
>    ubsan-related testcases, but several testcases do not account for
>    that; disabling colorization prevents the problem and simplifies
>    adding testcases in the future.
>
>  - Verified on native X86 and RISC-V qemu full system mode and user mode.
>
> ChangeLog:
>
> gcc/testsuite/
>
> Kito Cheng  
>
> * ubsan-dg.exp (orig_ubsan_options_saved): New
> (orig_ubsan_options): Ditto.
> (ubsan_init): Store UBSAN_OPTIONS and set UBSAN_OPTIONS.
> (ubsan_finish): Restore UBSAN_OPTIONS.
> ---
>  gcc/testsuite/lib/ubsan-dg.exp | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/gcc/testsuite/lib/ubsan-dg.exp b/gcc/testsuite/lib/ubsan-dg.exp
> index 015601cd404..f4ab29e2add 100644
> --- a/gcc/testsuite/lib/ubsan-dg.exp
> +++ b/gcc/testsuite/lib/ubsan-dg.exp
> @@ -17,6 +17,9 @@
>  # Return 1 if compilation with -fsanitize=undefined is error-free for trivial
>  # code, 0 otherwise.
>
> +set orig_ubsan_options_saved 0
> +set orig_ubsan_options 0
> +
>  proc check_effective_target_fsanitize_undefined {} {
>  return [check_runtime fsanitize_undefined {
> int main (void) { return 0; }
> @@ -74,6 +77,17 @@ proc ubsan_init { args } {
>  global TOOL_OPTIONS
>  global ubsan_saved_TEST_ALWAYS_FLAGS
>  global ubsan_saved_ALWAYS_CXXFLAGS
> +global orig_ubsan_options_saved
> +global orig_ubsan_options
> +
> +if { $orig_ubsan_options_saved == 0 } {
> +   # Save the original environment.
> +   if [info exists env(UBSAN_OPTIONS)] {
> +   set orig_ubsan_options "$env(UBSAN_OPTIONS)"
> +   set orig_ubsan_options_saved 1
> +   }
> +}
> +setenv UBSAN_OPTIONS color=never
>
>  set link_flags ""
>  if ![is_remote host] {
> @@ -109,6 +123,14 @@ proc ubsan_finish { args } {
>  global ubsan_saved_ALWAYS_CXXFLAGS
>  global ubsan_saved_library_path
>  global ld_library_path
> +global orig_ubsan_options_saved
> +global orig_ubsan_options
> +
> +if { $orig_ubsan_options_saved } {
> +   setenv UBSAN_OPTIONS "$orig_ubsan_options"
> +} elseif [info exists env(UBSAN_OPTIONS)] {
> +   unsetenv UBSAN_OPTIONS
> +}
>
>  if [info exists ubsan_saved_ALWAYS_CXXFLAGS ] {
> set ALWAYS_CXXFLAGS $ubsan_saved_ALWAYS_CXXFLAGS
> --
> 2.26.2
>