Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Hongtao Liu via Gcc-patches
On Wed, May 27, 2020 at 8:01 PM Uros Bizjak  wrote:
>
> On Wed, May 27, 2020 at 8:02 AM Hongtao Liu  wrote:
> >
> > On Mon, May 25, 2020 at 8:41 PM Uros Bizjak  wrote:
> > >
> > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu  wrote:
> > > >
> > > >   According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> > > > memory_operand instead of 128-bit one which exists in current
> > > > implementation. Also for other vpmov instructions which have
> > > > memory_operand narrower than 128bits.
> > > >
> > > >   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
> > >
> > >
> > > > +  [(set (match_operand:HI 0 "memory_operand" "=m")
> > > > +        (subreg:HI (any_truncate:V2QI
> > > > +          (match_operand:V2DI 1 "register_operand" "v")) 0))]
> > >
> > > This should store in V2QImode, subregs are not allowed in insn patterns.
> > >
> > > You need a pre-reload splitter to split from register_operand to a
> > > memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so
> > > perhaps he can give some additional advice here.
> > >
> >
> > Like this?
> > ---
> > (define_insn "*avx512vl_v2div2qi2_store"
> >   [(set (match_operand:V2QI 0 "memory_operand" "=m")
> >         (any_truncate:V2QI
> >           (match_operand:V2DI 1 "register_operand" "v")))]
> >   "TARGET_AVX512VL"
> >   "vpmovqb\t{%1, %0|%0, %1}"
> >   [(set_attr "type" "ssemov")
> >    (set_attr "memory" "store")
> >    (set_attr "prefix" "evex")
> >    (set_attr "mode" "TI")])
> >
> > (define_insn_and_split "*avx512vl_v2div2qi2_store"
> >   [(set (match_operand:HI 0 "memory_operand")
> >         (subreg:HI
> >           (any_truncate:V2QI
> >             (match_operand:V2DI 1 "register_operand")) 0))]
> >   "TARGET_AVX512VL && ix86_pre_reload_split ()"
> >   "#"
> >   "&& 1"
> >   [(set (match_dup 0)
> >         (any_truncate:V2QI (match_dup 1)))]
> >   "operands[0] = adjust_address_nv (operands[0], V2QImode, 0);")
>
> Yes, assuming that scalar subregs are some artefact of middle-end processing.
>
> BTW: Please name these insn ..._1 and ..._2.
>
> Uros.

Update patch.

-- 
BR,
Hongtao
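
For readers without the SDM to hand, the operand-size point can be
illustrated with a scalar model.  This is only a sketch in plain C++,
not GCC code and not the intrinsic itself; the names
model_vpmovqb_store and store_into_guarded_buffer are made up for
illustration.  It mimics the unmasked, truncating form of VPMOVQB with
an xmm source: two 64-bit lanes are narrowed to two bytes and exactly
16 bits of memory are written.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of "VPMOVQB m16, xmm" (truncating form,
// no masking): keep the low byte of each 64-bit lane and store only
// those two bytes.
inline void model_vpmovqb_store(void* dst,
                                const std::array<std::uint64_t, 2>& src)
{
  const std::array<std::uint8_t, 2> narrowed{
    static_cast<std::uint8_t>(src[0]),
    static_cast<std::uint8_t>(src[1])};
  std::memcpy(dst, narrowed.data(), narrowed.size());  // 2 bytes, not 16
}

// Fill a 16-byte buffer with a guard value, run the model on it and
// return the buffer so the untouched bytes can be inspected.
inline std::array<std::uint8_t, 16> store_into_guarded_buffer()
{
  std::array<std::uint8_t, 16> buf;
  buf.fill(0xAA);
  const std::array<std::uint64_t, 2> lanes{0x1122334455667788ULL,
                                           0x99AABBCCDDEEFF01ULL};
  model_vpmovqb_store(buf.data(), lanes);
  return buf;
}
```

Only the first two bytes of the buffer change; a pattern that
describes the destination as a 128-bit memory_operand would claim to
write the other fourteen as well.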
From 332140cc36dba9ebe9348c4dd08e3203c0228de0 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Mon, 25 May 2020 16:10:06 +0800
Subject: [PATCH] Fix nonconforming memory_operand for
 vpmovq{d,w,b}/vpmovd{w,b}/vpmovwb.

According to the Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has a 16-bit
memory_operand, not the 128-bit one used by the current
implementation.  The same applies to the other vpmov instructions whose
memory_operand is narrower than 128 bits.

2020-05-25  Hongtao Liu  

gcc/ChangeLog

	* config/i386/sse.md (*avx512vl_v2div2qi2_store_1): Rename
	from *avx512vl_v2div2qi_store and refine memory size of
	the pattern.
	(*avx512vl_v2div2qi2_mask_store_1): Ditto.
	(*avx512vl_v4qi2_store_1): Ditto.
	(*avx512vl_v4qi2_mask_store_1): Ditto.
	(*avx512vl_v8qi2_store_1): Ditto.
	(*avx512vl_v8qi2_mask_store_1): Ditto.
	(*avx512vl_v4hi2_store_1): Ditto.
	(*avx512vl_v4hi2_mask_store_1): Ditto.
	(*avx512vl_v2div2hi2_store_1): Ditto.
	(*avx512vl_v2div2hi2_mask_store_1): Ditto.
	(*avx512vl_v2div2si2_store_1): Ditto.
	(*avx512vl_v2div2si2_mask_store_1): Ditto.
	(*avx512f_v8div16qi2_store_1): Ditto.
	(*avx512f_v8div16qi2_mask_store_1): Ditto.
	(*avx512vl_v2div2qi2_store_2): New define_insn_and_split.
	(*avx512vl_v2div2qi2_mask_store_2): Ditto.
	(*avx512vl_v4qi2_store_2): Ditto.
	(*avx512vl_v4qi2_mask_store_2): Ditto.
	(*avx512vl_v8qi2_store_2): Ditto.
	(*avx512vl_v8qi2_mask_store_2): Ditto.
	(*avx512vl_v4hi2_store_2): Ditto.
	(*avx512vl_v4hi2_mask_store_2): Ditto.
	(*avx512vl_v2div2hi2_store_2): Ditto.
	(*avx512vl_v2div2hi2_mask_store_2): Ditto.
	(*avx512vl_v2div2si2_store_2): Ditto.
	(*avx512vl_v2div2si2_mask_store_2): Ditto.
	(*avx512f_v8div16qi2_store_2): Ditto.
	(*avx512f_v8div16qi2_mask_store_2): Ditto.
	* config/i386/i386-builtin-types.def: Adjust builtin type.
	* config/i386/i386-expand.c: Ditto.
	* config/i386/i386-builtin.def: Adjust builtin.
	* config/i386/avx512fintrin.h: Ditto.
	* config/i386/avx512vlbwintrin.h: Ditto.
	* config/i386/avx512vlintrin.h: Ditto.
---
 gcc/config/i386/avx512fintrin.h|   7 +-
 gcc/config/i386/avx512vlbwintrin.h |   6 +-
 gcc/config/i386/avx512vlintrin.h   |  49 +--
 gcc/config/i386/i386-builtin-types.def |  20 +-
 gcc/config/i386/i386-builtin.def   |  60 +--
 gcc/config/i386/i386-expand.c  |  20 +-
 gcc/config/i386/sse.md | 542 -
 7 files changed, 421 insertions(+), 283 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 012cf4eb31e..4bcd697387a 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -5613,7 +5613,8 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtepi64_storeu_epi8 (void * __P, __mmask8 __M, __m512i __A)
 {
-  __builtin_ia32_pmovqb512mem_mask ((__v16qi *) __P, (__v8di) __A, __M);
+  

Ping^1 [PATCH 2/4 V3] Add target hook stride_dform_valid_p

2020-05-27 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping patches as below:

1/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540171.html
2/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541387.html
3/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545643.html

Or shall I ping them separately?

Thanks!
Kewen

on 2020/5/13 下午1:50, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> I'd like to ping this patch as well as its siblings.  Thanks in advance.
> 
> 1/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540171.html
> 2/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541387.html
> 3/4 v3 https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545643.html
> 
> BR,
> Kewen
> 
> on 2020/3/3 下午8:25, Kewen.Lin wrote:
>> Hi Richard,
>>
>> Thanks for your comments!  It's a good idea to use param due to the
>> flexibility.  And yes, it sounds good to have more targets to try and
>> make it better.  But I have some concern about turning it on by default.
>> Since it relies on unroll factor estimation, as part 1/4 shows, it
>> calls targetm.loop_unroll_adjust if the target supports it, which used to
>> work on the RTL level.  To avoid a possible ICE, I intend to turn it
>> off for those targets (s390 & i386) with that hook, since without a good
>> understanding of those targets, it's hard for me to extend them with
>> gimple level support.  Does it make sense?
>>
>> The updated patch has been attached.
>>
>> BR,
>> Kewen
>> -
>>
>> gcc/ChangeLog
>>
>> 2020-03-03  Kewen Lin  
>>
>>  * doc/invoke.texi (iv-consider-reg-offset-for-unroll): Document new 
>> option.
>>  * params.opt (iv-consider-reg-offset-for-unroll): New.
>>  * config/s390/s390.c (s390_option_override_internal): Disable parameter
>>  iv-consider-reg-offset-for-unroll by default.
>>  * config/i386/i386-options.c (ix86_option_override_internal): Likewise.
>>


Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-27 Thread dongjianqiang (A)
Thanks for reviewing this. Could you please help install this patch? I am not a
gcc committer.

Regards,
Dong JianQiang
> 
> On 5/27/20 12:35 PM, dongjianqiang (A) wrote:
> > Thanks for your comments, I add the ChangeLog in the patch.
> 
> Thanks. That patch is fine, please install it.
> 
> Martin


Re: [PATCH] gcc: xtensa: delegitimize UNSPEC_PLT

2020-05-27 Thread Max Filippov via Gcc-patches
On Wed, May 27, 2020 at 4:35 PM augustine.sterl...@gmail.com
 wrote:
>
> On Tue, May 26, 2020 at 11:43 AM Max Filippov  wrote:
> >
> > This fixes 'non-delegitimized UNSPEC 3 found in variable location' notes
> > issued when building libraries which interferes with running tests.
> >
> > 2020-05-24  Max Filippov  
> > gcc/
> > * config/xtensa/xtensa.c (xtensa_delegitimize_address): New
> > function.
> > (TARGET_DELEGITIMIZE_ADDRESS): New macro.
>
> This is OK.

Thanks, applied to master.

-- Max


Re: [PATCH] Port libgccjit to Windows.

2020-05-27 Thread Nicolas Bértolo via Gcc-patches
Hi,

> Do you have commit/push access to the gcc repository?

No I don't.

> BTW, why isn't it necessary to use --enable-host-shared in Windows?
> Can we document that?

That's because all code is position independent in Windows.

> On the subject of nitpicking, I find myself getting distracted by the
> indentation in the patch; there seem to be a lot of mismatches.

> What editor are you using, and does it have options to
> (a) show visible whitespace, and
> (b) to apply a formatting convention?

> I use Emacs, and it takes care of this for me.  I haven't used it, but
> there's a contrib/clang-format file in the gcc source tree which
> presumably describes GCC's coding conventions, if that helps for the
> new code.

The problem seems to be that I was writing tabs but since I have set up my
editor to show them as 2 spaces I couldn't see what was wrong.

> Am I right in thinking that this installs the libgccjit.a file on Windows?
> Why is this done?

That is the file libgccjit.dll.a

It is the import library for gccjit. It is part of the way Windows handles
dynamic libraries.

> New C++ source files should have a .cc extension.
> I hope that at some point we'll rename all the existing .c ones
> accordingly.

I just couldn't get Make to generate jit-w32.o from jit-w32.cc.
It looks for jit-w32.c.

I had to leave it with the .c extension.

> Does this call generate a directory that's only accessible to the
> current user?
> Otherwise there could be a risk of a hostile user on the same machine
> clobbering the contents and injecting code into this process.

I changed the code to generate a directory that can only be accessed by the
current user.

I've attached a new version. It contains a rewrite of the code that creates
temporary directories.

Nico


0001-Port-libgccjit-to-Windows.patch
Description: Binary data


Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-27 Thread Kewen.Lin via Gcc-patches
on 2020/5/27 下午6:02, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi Richard,
>>
>> Thanks for your comments!
>>
>> on 2020/5/26 下午8:49, Richard Sandiford wrote:
>>> "Kewen.Lin"  writes:
>>>> @@ -626,6 +645,12 @@ public:
>>>>    /* True if have decided to use a fully-masked loop.  */
>>>>    bool fully_masked_p;
>>>>  
>>>> +  /* Records whether we still have the option of using a length access loop.  */
>>>> +  bool can_with_length_p;
>>>> +
>>>> +  /* True if have decided to use length access for the loop fully.  */
>>>> +  bool fully_with_length_p;
>>>
>>> Rather than duplicate the flags like this, I think we should have
>>> three bits of information:
>>>
>>> (1) Can the loop operate on partial vectors?  Starts off optimistically
>>> assuming "yes", gets set to "no" when we find a counter-example.
>>>
>>> (2) If we do decide to use partial vectors, will we need loop masks?
>>>
>>> (3) If we do decide to use partial vectors, will we need lengths?
>>>
>>> Vectorisation using partial vectors succeeds if (1) && ((2) != (3))
>>>
>>> LOOP_VINFO_CAN_FULLY_MASK_P currently tracks (1) and
>>> LOOP_VINFO_MASKS currently tracks (2).  In pathological cases it's
>>> already possible to have (1) && !(2), see r9-6240 for an example.
>>>
>>> With the new support, LOOP_VINFO_LENS tracks (3).
>>>
>>> So I don't think we need the can_with_length_p.  What is now
>>> LOOP_VINFO_CAN_FULLY_MASK_P can continue to track (1) for both
>>> approaches, with the final choice of approach only being made
>>> at the end.  Maybe it would be worth renaming it to something
>>> more generic though, now that we have two approaches to partial
>>> vectorisation.
>>
>> I like this idea!  I could be wrong, but I'm afraid that we
>> can not have one common flag to be shared for both approaches,
>> the check criteria could be different for both approaches, one
>> counter example for length could be acceptable for masking, such
>> as length can only allow CONTIGUOUS related modes, but masking
>> can support more.  When we see acceptable VMAT_LOAD_STORE_LANES,
>> we leave LOOP_VINFO_CAN_FULLY_MASK_P true, later should length
>> checking turn it to false?  I guess no, assuming still true, then 
>> LOOP_VINFO_CAN_FULLY_MASK_P will mean only partial vectorization
>> for masking, not for both.  We can probably clean LOOP_VINFO_LENS
>> when the length checking is false, but we just know the vec is empty,
>> not sure we are unable to do partial vectorization with length,
>> when we see LOOP_VINFO_CAN_FULLY_MASK_P true, we could still
>> record length into it if possible.
> 
> Let's call the flag in (1) CAN_USE_PARTIAL_VECTORS_P rather than
> CAN_FULLY_MASK_P to (try to) avoid any confusion from the current name.
> 
> What I meant is that each vectorizable_* routine has the responsibility
> of finding a way of coping with partial vectorisation, or setting
> CAN_USE_PARTIAL_VECTORS_P to false if it can't.
> 
> vectorizable_load chooses the VMAT first, and then decides based on that
> whether partial vectorisation is supported.  There's no influence in
> the other direction (partial vectorisation doesn't determine the VMAT).
> 
> So once it has chosen a VMAT, vectorizable_load needs to try to find a way
> of handling the operation with partial vectorisation.  Currently the only
> way of doing that for VMAT_LOAD_STORE_LANES is using masks.  So at the
> moment there are two possible outcomes:
> 
> - The target supports the necessary IFN_MASK_LOAD_LANES function.
>   If so, we can use partial vectorisation for the statement, so we
>   leave CAN_USE_PARTIAL_VECTORS_P true and record the necessary masks
>   in LOOP_VINFO_MASKS.
> 
> - The target doesn't support the necessary IFN_MASK_LOAD_LANES function.
>   If so, we can't use partial vectorisation, so we clear
>   CAN_USE_PARTIAL_VECTORS_P.
> 
> That's how things work at the moment.  It would work in the same way
> for lengths if we ever supported IFN_LEN_LOAD_LANES: we'd check whether
> IFN_LEN_LOAD_LANES is available and record the length in LOOP_VINFO_LENS
> if so.  If partial vectorisation isn't supported (via masks or lengths),
> we'd continue to clear CAN_USE_PARTIAL_VECTORS_P.
> 
> But equally, if we never add support for IFN_LEN_LOAD_LANES, the current
> code continues to work with length-based approaches.  We'll continue to
> clear CAN_USE_PARTIAL_VECTORS_P for VMAT_LOAD_STORE_LANES when the
> target provides no IFN_MASK_LOAD_LANES function.
> 
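
The three pieces of state discussed above could be modelled roughly as
follows.  This is a toy sketch, not vectorizer code: the struct and
its member names are invented, and preferring masks when a statement
could use either approach is an arbitrary choice made for the sketch.
Only the success condition (1) && ((2) != (3)) and the rule "clear the
flag when no approach is available" are taken from the discussion.

```cpp
#include <cassert>

// Toy model (hypothetical names) of the loop-level state.
struct PartialVectorState {
  bool can_use_partial_vectors = true;  // (1): optimistic until a counter-example
  bool needs_masks = false;             // (2): something was recorded as a mask
  bool needs_lens = false;              // (3): something was recorded as a length

  // Called once per statement by a vectorizable_*-style check.
  void record_statement(bool mask_supported, bool len_supported)
  {
    if (mask_supported)
      needs_masks = true;               // assumption: masks win if both work
    else if (len_supported)
      needs_lens = true;
    else
      can_use_partial_vectors = false;  // no way to cope with partial vectors
  }

  // Partial vectorisation is usable only if nothing ruled it out and
  // exactly one of the two approaches was requested.
  bool partial_vectorisation_succeeds() const
  {
    return can_use_partial_vectors && (needs_masks != needs_lens);
  }
};
```

In this model a loop where one statement records a mask and another
records a length does not succeed, which is the mixed case the reply
below is concerned about.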

Thanks a lot for your detailed explanation!  This proposal looks good
based on the current implementation of both masking and length.  I may
be overthinking this, but I have a concern, described below, about
targets that support both masking and length in future, for example if
ppc adds masking support like SVE.

I assumed that you meant each vectorizable_* routine should record the
objs for any available partial vectorisation approaches.  If one target
supports both, we would have both recorded but decide not to do partial
vectorisation finally since 

Re: Broken build

2020-05-27 Thread Alexandre Oliva
On May 27, 2020, Hans-Peter Nilsson  wrote:

>> I ask because this error suggests an empty argument passed to
>> GCC.

> And ignored before your rewrite?

Or absent.  It turned out my massaging of ldflags et al turned
consecutive blanks into empty arguments.  I posted a patch for that and
most of the other fallout at
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546659.html

Jeff Law has already approved it, and I'll probably check it in very
shortly, as soon as I get a few more test results.  Please be sure to
let me know in case it does not fix the problem for you.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


Re: Broken build

2020-05-27 Thread Anthony Green
Hans-Peter Nilsson via Gcc-patches  writes:

> And here's an improper bug report.
>
> One of the commits between cfdff3eeb90..5c8344e7289 caused every
> single *linked* test to fail for cris-elf, like:

I can confirm that the moxie-elf test cases don't link either.

It looks like setting ldscript in the board description file doesn't
work.  In my case that means "-Tsim.ld" isn't being passed through and
we can't link anymore.  Here's moxie-sim.exp:

https://github.com/moxielogic/moxie-test-gcc/blob/7c707e187f101922e3ef7f6e23dbbd1890f9e8dd/moxie-sim.exp#L42

Thanks for looking, Alex.

AG



Re: [PATCH] gcc: xtensa: delegitimize UNSPEC_PLT

2020-05-27 Thread augustine.sterling--- via Gcc-patches
On Tue, May 26, 2020 at 11:43 AM Max Filippov  wrote:
>
> This fixes 'non-delegitimized UNSPEC 3 found in variable location' notes
> issued when building libraries which interferes with running tests.
>
> 2020-05-24  Max Filippov  
> gcc/
> * config/xtensa/xtensa.c (xtensa_delegitimize_address): New
> function.
> (TARGET_DELEGITIMIZE_ADDRESS): New macro.

This is OK.


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Jeff Law via Gcc-patches
On Wed, 2020-05-27 at 19:05 -0300, Alexandre Oliva wrote:
> outputs.exp: no lto, linker default output, cdtor temps, empty args
> 
> From: Alexandre Oliva 
> 
> This patch fixes various issues in the testsuite that came up after
> the dump/aux output revamp, namely:
> 
> - many outputs.exp tests used -flto without checking that LTO was
> supported, getting lots of failures.  With this patch, we test for LTO
> support, and skip -flto tests on platforms that do not support it.
> 
> - some linkers error out if an output file is not named, and the
> a.{out,exe} construct that we used throughout outputs.exp to match the
> default linker output would trigger a bug in tcl globbing.  With this
> patch, we detect the default linker output early.  If none is found,
> we arrange to pass -o a.out explicitly in tests that used to test the
> default linker output.  We now look for the detected default, or for
> explicitly-specified output.
> 
> - collect2 will leave .cdtor.* files behind in -save-temps
> tests.  Ignore them.
> 
> - The prepending of -Wl, to file names in ldflags et al was done in a
> way that introduced empty arguments when consecutive blanks appeared
> in these board configuration knobs.  Skip the empty strings between
> consecutive blanks to avoid this problem.
> 
> Tested so far on x86_64-linux-gnu and powerpc-aix7.  Ok to install?
> 
> 
> gcc/testsuite/ChangeLog:
> 
>   * lib/gcc-defs.exp: Avoid introducing empty arguments between
>   consecutive blanks in board linking options.
>   * gcc.misc-tests/outputs.exp: Likewise.  Document
>   -gsplit-dwarf testing, skip LTO tests if -flto is not
>   supported, detect the default linker output name, cope with
>   the need for an explicit executable output.
OK.  Thanks for jumping on it quickly.  I'll re-enable the tester once the patch
is committed.

jeff
> 



[committed] libstdc++: Fix atomic::load (PR 95282)

2020-05-27 Thread Jonathan Wakely via Gcc-patches
PR libstdc++/95282
* include/bits/atomic_base.h (__atomic_impl::load): Add
cv-qualifiers to parameter so that _Tp is deduced as the
unqualified type.
* testsuite/29_atomics/atomic_float/95282.cc: New test.

Tested powerpc64le-linux, committed to master.

Backport to gcc-10 to follow.

I've added this testcase to my out-of-testsuite clang tests.
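
The deduction issue named in the ChangeLog can be seen in isolation
with a small sketch.  The names below (deduced, probe_before,
probe_after) are invented for illustration and are not part of
libstdc++; the sketch only shows how the parameter's spelling changes
what the template parameter is deduced as.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Tag type that exposes whatever T was deduced as.
template<typename T>
struct deduced { using type = T; };

// Parameter spelled like the old signature: the pointee's
// cv-qualifiers become part of T.
template<typename T>
deduced<T> probe_before(T*) { return {}; }

// Parameter spelled like the new signature: the qualifiers are matched
// by the parameter itself, so T is the unqualified type.
template<typename T>
deduced<T> probe_after(const volatile T*) { return {}; }

constexpr bool before_keeps_volatile =
  std::is_same<
    decltype(probe_before(std::declval<volatile float*>()))::type,
    volatile float>::value;

constexpr bool after_is_unqualified =
  std::is_same<
    decltype(probe_after(std::declval<volatile float*>()))::type,
    float>::value;

static_assert(before_keeps_volatile, "T picks up volatile");
static_assert(after_is_unqualified, "T is plain float");
```

With the second spelling the same template also accepts plain and
const-qualified pointers, still deducing the unqualified type.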


commit bbaec68c86f8e89a3460cc022c75d4c4179bfb0a
Author: Jonathan Wakely 
Date:   Wed May 27 22:55:21 2020 +0100

libstdc++: Fix atomic::load (PR 95282)

PR libstdc++/95282
* include/bits/atomic_base.h (__atomic_impl::load): Add
cv-qualifiers to parameter so that _Tp is deduced as the
unqualified type.
* testsuite/29_atomics/atomic_float/95282.cc: New test.

diff --git a/libstdc++-v3/include/bits/atomic_base.h 
b/libstdc++-v3/include/bits/atomic_base.h
index 3b66b040976..01f77a0f372 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -871,7 +871,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 template<typename _Tp>
   _GLIBCXX_ALWAYS_INLINE _Tp
-  load(_Tp* __ptr, memory_order __m) noexcept
+  load(const volatile _Tp* __ptr, memory_order __m) noexcept
   {
alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
_Tp* __dest = reinterpret_cast<_Tp*>(__buf);
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/95282.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_float/95282.cc
new file mode 100644
index 000..2de751c6ad4
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/95282.cc
@@ -0,0 +1,35 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include <atomic>
+
+float
+test01()
+{
+  std::atomic<float> a;
+  return a.load();
+}
+
+float
+test02()
+{
+  volatile std::atomic<float> a;
+  return a.load();
+}


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Alexandre Oliva
outputs.exp: no lto, linker default output, cdtor temps, empty args

From: Alexandre Oliva 

This patch fixes various issues in the testsuite that came up after
the dump/aux output revamp, namely:

- many outputs.exp tests used -flto without checking that LTO was
supported, getting lots of failures.  With this patch, we test for LTO
support, and skip -flto tests on platforms that do not support it.

- some linkers error out if an output file is not named, and the
a.{out,exe} construct that we used throughout outputs.exp to match the
default linker output would trigger a bug in tcl globbing.  With this
patch, we detect the default linker output early.  If none is found,
we arrange to pass -o a.out explicitly in tests that used to test the
default linker output.  We now look for the detected default, or for
explicitly-specified output.

- collect2 will leave .cdtor.* files behind in -save-temps
tests.  Ignore them.

- The prepending of -Wl, to file names in ldflags et al was done in a
way that introduced empty arguments when consecutive blanks appeared
in these board configuration knobs.  Skip the empty strings between
consecutive blanks to avoid this problem.
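
The empty-argument problem in the last item is easy to reproduce
outside Tcl.  The sketch below is C++ rather than the testsuite's Tcl,
and both function names are invented: split_on_blank imitates
splitting on every single blank, so consecutive blanks yield empty
strings, and non_empty_args applies the "skip the empty strings" fix.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split on every single ' ', keeping empty fields between
// consecutive blanks.  An empty input yields no fields at all.
inline std::vector<std::string> split_on_blank(const std::string& s)
{
  std::vector<std::string> out;
  if (s.empty())
    return out;
  std::string::size_type start = 0;
  for (;;) {
    const std::string::size_type pos = s.find(' ', start);
    if (pos == std::string::npos) {
      out.push_back(s.substr(start));
      break;
    }
    out.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  return out;
}

// Same split, but drop the empty fields so that none of them is
// passed on as an empty command-line argument.
inline std::vector<std::string> non_empty_args(const std::string& flags)
{
  std::vector<std::string> out;
  for (const std::string& opt : split_on_blank(flags))
    if (!opt.empty())
      out.push_back(opt);
  return out;
}
```

For a board setting such as "-L/x  -lfoo" (two blanks), the plain
split produces three fields, the middle one empty, while the filtered
version produces the two intended options.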

Tested so far on x86_64-linux-gnu and powerpc-aix7.  Ok to install?


gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp: Avoid introducing empty arguments between
consecutive blanks in board linking options.
* gcc.misc-tests/outputs.exp: Likewise.  Document
-gsplit-dwarf testing, skip LTO tests if -flto is not
supported, detect the default linker output name, cope with
the need for an explicit executable output.
---
 gcc/testsuite/gcc.misc-tests/outputs.exp |  163 +-
 gcc/testsuite/lib/gcc-defs.exp   |4 +
 2 files changed, 119 insertions(+), 48 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
b/gcc/testsuite/gcc.misc-tests/outputs.exp
index 9823710..c3c6c2d 100644
--- a/gcc/testsuite/gcc.misc-tests/outputs.exp
+++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -30,6 +30,9 @@ if {![gcc_parallel_test_run_p $b] || [is_remote host]} {
 }
 gcc_parallel_test_enable 0
 
+# Check for -gsplit-dwarf support.  The outest proc will check that
+# gsplit_dwarf is empty if a .dwo file is missing before deciding
+# that's a fail.
 set gsplit_dwarf "-gsplit-dwarf"
 if ![check_no_compiler_messages gsplitdwarf object {
 void foo (void) { }
@@ -37,6 +40,10 @@ if ![check_no_compiler_messages gsplitdwarf object {
 set gsplit_dwarf ""
 }
 
+# Check for -flto support.  We explicitly test the result to skip
+# tests that use -flto.
+set skip_lto ![check_effective_target_lto]
+
 # Prepare additional options to be used for linking.
 # We do not compile to an executable, because that requires naming an output.
 set link_options ""
@@ -45,7 +52,9 @@ foreach i { ldflags libs ldscripts } {
 if {[board_info $dest exists $i]} {
set skip ""
foreach opt [split [board_info $dest $i]] {
-   if { $skip != "" } then {
+   if { $opt == "" } then {
+   continue
+   } elseif { $skip != "" } then {
set skip ""
} elseif { $opt == "-Xlinker" } then {
set skip $opt
@@ -73,9 +82,10 @@ if {[board_info $dest exists output_format]} {
 # double dash, or a dash followed by a period, the first dash is
 # replaced with $b-$b; names starting with "a--" or "a-." have "$b"
 # inserted after the first dash.  The glob pattern may expand to more
-# than one file, but then the test will pass when there any number of
-# matches.  So, it's safe to use for a.{out,exe}, but .{i,s,o} and
-# .[iso] will pass even if only the .o is present.
+# than one file, but then the test will pass for any number of
+# matches, i.e., it would be safe to use for a.{out,exe} (if it
+# weren't for https://core.tcl-lang.org/tcl/tktview?name=5bbd044812),
+# but .{i,s,o} and .[iso] will pass even if only the .o is present.
 proc outest { test sources opts dirs outputs } {
 global b
 global srcdir
@@ -120,6 +130,9 @@ proc outest { test sources opts dirs outputs } {
set o "a-$b-[string range $og 3 end]"
} elseif { [string range $og 0 2] == "a-." } then {
set o "a-$b.[string range $og 3 end]"
+   } elseif { "$og" == "\$aout" } then {
+   global aout
+   set o "$aout"
} else {
set o "$og"
}
@@ -148,17 +161,23 @@ proc outest { test sources opts dirs outputs } {
}
 }
 
+set outb {}
 foreach f $outs {
file delete $f
+   # collect2 may create .cdtor* files in -save-temps link tests,
+   # ??? without regard to aux output naming conventions.
+   if ![string match "*.cdtor.*" $f] then {
+   lappend outb $f
+   }
 }
 foreach d $dirs {
file delete -force $d
 }
 
-if { [llength $outs] == 0 } then {
+if { [llength $outb] == 0 } then {

Re: [PATCH] c++: constexpr RANGE_EXPR ctor indexes [PR95241]

2020-05-27 Thread Patrick Palka via Gcc-patches
On Wed, 27 May 2020, Patrick Palka wrote:

> On Wed, 27 May 2020, Patrick Palka wrote:
> 
> > In the testcase below, the CONSTRUCTOR for 'field' contains a
> > RANGE_EXPR index:
> > 
> >   {aggr_init_expr<...>, [1...2]={.off=1}}
> > 
> > but get_or_insert_ctor_field isn't prepared to handle RANGE_EXPR
> > indexes.
> > 
> > This patch adds limited support for RANGE_EXPR indexes to
> > get_or_insert_ctor_field.  The limited scope of this patch should make
> > it more suitable for backporting, and support for more access patterns
> > would be needed only to handle self-modifying CONSTRUCTORs containing a
> > RANGE_EXPR index, but I haven't yet been able to come up with a testcase
> > that exhibits such a CONSTRUCTOR.
> > 
> > Passes 'make check-c++', does this look OK to commit to master and to
> > the GCC 10 branch after a full bootstrap and regtest?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/95241
> > * constexpr.c (get_or_insert_ctor_field): Add limited support
> > for RANGE_EXPR indexes.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/95241
> > * g++.dg/cpp0x/constexpr-array25.C: New test.
> > ---
> >  gcc/cp/constexpr.c| 12 +++
> >  .../g++.dg/cpp0x/constexpr-array25.C  | 21 +++
> >  2 files changed, 33 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
> > 
> > diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> > index 4e441ac8d2f..6f9bafbe8d8 100644
> > --- a/gcc/cp/constexpr.c
> > +++ b/gcc/cp/constexpr.c
> > @@ -3301,6 +3301,18 @@ get_or_insert_ctor_field (tree ctor, tree index, int 
> > pos_hint = -1)
> >  }
> >else if (TREE_CODE (type) == ARRAY_TYPE || TREE_CODE (type) == 
> > VECTOR_TYPE)
> >  {
> > +  if (TREE_CODE (index) == RANGE_EXPR)
> > +   {
> > + /* Our support for RANGE_EXPR indexes is limited to accessing an
> > +existing one via POS_HINT, and appending a new one to the end of
> > +CTOR.  ??? Support for other access patterns might be needed.  */
> > + tree lo = TREE_OPERAND (index, 0);
> > + auto *elts = CONSTRUCTOR_ELTS (ctor);
> > + gcc_assert (vec_safe_is_empty (elts)
> > + || array_index_cmp (lo, elts->last().index) > 0);
> > + return vec_safe_push (elts, {index, NULL_TREE});
> > +   }
> > +
> 
> Oops, it just occurred to me that the use of C++11 features here would
> make this patch unsuitable for backporting.  C++98-compatible patch
> incoming...

Here it is.  Does the following look OK to commit to master and to the
GCC 10 branch after a full bootstrap and regtest?

-- >8 --

Subject: [PATCH] c++: constexpr RANGE_EXPR ctor indexes [PR95241]

In the testcase below, the CONSTRUCTOR for 'field' contains a
RANGE_EXPR index:

  {aggr_init_expr<...>, [1...2]={.off=1}}

but get_or_insert_ctor_field isn't prepared to handle RANGE_EXPR
indexes.

This patch adds limited support for RANGE_EXPR indexes to
get_or_insert_ctor_field.  The limited scope of this patch should make
it more suitable for backporting, and support for more access patterns
would be needed only to handle self-modifying CONSTRUCTORs containing a
RANGE_EXPR index, but I haven't yet been able to come up with a testcase
that exhibits such a CONSTRUCTOR.
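
As a rough illustration of the "append only" restriction, here is a
toy model in standalone C++.  It is not the GCC data structure: ToyElt
and append_range are invented names, indexes are plain integers, and
the sketch returns a null pointer where the patch uses an assertion.

```cpp
#include <cassert>
#include <vector>

// Toy constructor element: covers indexes [lo, hi]; a single index
// has lo == hi.
struct ToyElt {
  long lo;
  long hi;
  int value;
};

// Append a new range element.  As in the limited support described
// above, the new range must sort after everything already present;
// any other access pattern is not handled (nullptr here).
inline ToyElt* append_range(std::vector<ToyElt>& elts, long lo, long hi)
{
  if (!elts.empty() && lo <= elts.back().hi)
    return nullptr;
  elts.push_back(ToyElt{lo, hi, 0});
  return &elts.back();
}
```

Starting from a single element at index 0, appending the range [1, 2]
succeeds, while a later attempt to add [2, 3] is rejected because it
overlaps the existing tail.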

gcc/cp/ChangeLog:

PR c++/95241
* constexpr.c (get_or_insert_ctor_field): Add limited support
for RANGE_EXPR indexes.

gcc/testsuite/ChangeLog:

PR c++/95241
* g++.dg/cpp0x/constexpr-array25.C: New test.
---
 gcc/cp/constexpr.c| 15 +
 .../g++.dg/cpp0x/constexpr-array25.C  | 21 +++
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 4e441ac8d2f..32f2ef96fc7 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3301,6 +3301,21 @@ get_or_insert_ctor_field (tree ctor, tree index, int 
pos_hint = -1)
 }
   else if (TREE_CODE (type) == ARRAY_TYPE || TREE_CODE (type) == VECTOR_TYPE)
 {
+  if (TREE_CODE (index) == RANGE_EXPR)
+   {
+ /* ??? Support for RANGE_EXPR indexes is currently limited to
+accessing one via POS_HINT, or appending a new one to the end
+of CTOR.  Support for other access patterns may be needed.  */
+ vec<constructor_elt, va_gc> *elts = CONSTRUCTOR_ELTS (ctor);
+ if (vec_safe_length (elts))
+   {
+ tree lo = TREE_OPERAND (index, 0);
+ gcc_assert (array_index_cmp (lo, elts->last().index) > 0);
+   }
+ CONSTRUCTOR_APPEND_ELT (elts, index, NULL_TREE);
+ return &elts->last();
+   }
+
   HOST_WIDE_INT i = find_array_ctor_elt (ctor, index, /*insert*/true);
   gcc_assert (i >= 0);
   constructor_elt *cep = CONSTRUCTOR_ELT (ctor, i);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
new file mode 100644

[committed] libstdc++: Fix view adaptors for mixed-const sentinels and iterators (PR 95322)

2020-05-27 Thread Jonathan Wakely via Gcc-patches
The bug report is that transform_view's sentinel<false> cannot be
compared to its iterator<true>.  The comparison is supposed to use
operator==(iterator<Const>, sentinel<Const>) after converting
sentinel<false> to sentinel<true>. However, the operator== is a hidden
friend so is not a candidate when comparing iterator<true> with
sentinel<false>. The required conversion would only happen if we'd found
the operator, but we can't find the operator until after the conversion
happens.

A new LWG issue has been reported, but not yet assigned a number.  The
solution suggested by Casey Carter is to make the hidden friends of the
sentinel types work with iterators of any const-ness, so that no
conversions are required.

Patrick Palka observed that join_view has a similar problem and a
similar fix is used for its sentinel.

PR libstdc++/95322
* include/std/ranges (transform_view::_Sentinel): Allow hidden
friends to work with _Iterator<true> and _Iterator<false>.
(join_view::_Sentinel): Likewise.
* testsuite/std/ranges/adaptors/95322.cc: New test.

Tested powerpc64le-linux, committed to master.

I intend to backport this to gcc-10 as well.
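
The lookup problem can be reproduced with a much smaller model than
the real view adaptors.  Everything below is an invented sketch (Iter,
OldSentinel, NewSentinel and comparable are not libstdc++ names); it
assumes C++17 for std::void_t.  OldSentinel has a hidden friend for
the matching constness only, NewSentinel makes the hidden friend a
template over the iterator's constness.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template<bool Const>
struct Iter { int pos = 0; };

// Old shape: converting constructor from the non-const sentinel plus
// a hidden friend that only accepts Iter<Const>.
template<bool Const>
struct OldSentinel {
  int end = 0;
  OldSentinel() = default;
  template<bool Other, std::enable_if_t<Const && !Other, int> = 0>
  OldSentinel(const OldSentinel<Other>& s) : end(s.end) {}

  friend bool operator==(const Iter<Const>& i, const OldSentinel& s)
  { return i.pos == s.end; }
};

// New shape: the hidden friend works with iterators of any constness,
// so no conversion of the sentinel is needed.
template<bool Const>
struct NewSentinel {
  int end = 0;

  template<bool Const2>
  friend bool operator==(const Iter<Const2>& i, const NewSentinel& s)
  { return i.pos == s.end; }
};

// Detects whether "I == S" is a valid expression.
template<typename I, typename S, typename = void>
struct comparable : std::false_type {};
template<typename I, typename S>
struct comparable<I, S,
    std::void_t<decltype(std::declval<const I&>()
                         == std::declval<const S&>())>>
  : std::true_type {};

constexpr bool conversion_exists =
  std::is_convertible<OldSentinel<false>, OldSentinel<true>>::value;
constexpr bool old_mixed_comparable =
  comparable<Iter<true>, OldSentinel<false>>::value;
constexpr bool new_mixed_comparable =
  comparable<Iter<true>, NewSentinel<false>>::value;
```

In this model the conversion from the non-const to the const sentinel
exists, yet the mixed comparison is rejected for OldSentinel because
the friend that would need the conversion is never found; with
NewSentinel the mixed comparison is accepted directly.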

commit 6c2582c0406250c66e2eb3651f8e8638796b7f53
Author: Jonathan Wakely 
Date:   Wed May 27 22:08:15 2020 +0100

libstdc++: Fix view adaptors for mixed-const sentinels and iterators (PR 95322)

The bug report is that transform_view's sentinel<false> cannot be
compared to its iterator<true>.  The comparison is supposed to use
operator==(iterator<Const>, sentinel<Const>) after converting
sentinel<false> to sentinel<true>. However, the operator== is a hidden
friend so is not a candidate when comparing iterator<true> with
sentinel<false>. The required conversion would only happen if we'd found
the operator, but we can't find the operator until after the conversion
happens.

A new LWG issue has been reported, but not yet assigned a number.  The
solution suggested by Casey Carter is to make the hidden friends of the
sentinel types work with iterators of any const-ness, so that no
conversions are required.

Patrick Palka observed that join_view has a similar problem and a
similar fix is used for its sentinel.

PR libstdc++/95322
* include/std/ranges (transform_view::_Sentinel): Allow hidden
friends to work with _Iterator<true> and _Iterator<false>.
(join_view::_Sentinel): Likewise.
* testsuite/std/ranges/adaptors/95322.cc: New test.

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 0c602c7200f..b8023e67c9f 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1853,7 +1853,7 @@ namespace views
  { return ranges::iter_swap(__x._M_current, __y._M_current); }
 
  friend _Iterator<!_Const>;
- friend _Sentinel<_Const>;
+ template<bool> friend struct _Sentinel;
};
 
   template<bool _Const>
@@ -1863,13 +1863,15 @@ namespace views
  using _Parent = __detail::__maybe_const_t<_Const, transform_view>;
  using _Base = __detail::__maybe_const_t<_Const, _Vp>;
 
- constexpr range_difference_t<_Base>
- __distance_from(const _Iterator<_Const>& __i) const
- { return _M_end - __i._M_current; }
+ template<bool _Const2>
+   constexpr range_difference_t<_Base>
+   __distance_from(const _Iterator<_Const2>& __i) const
+   { return _M_end - __i._M_current; }
 
- constexpr bool
- __equal(const _Iterator<_Const>& __i) const
- { return __i._M_current == _M_end; }
+ template<bool _Const2>
+   constexpr bool
+   __equal(const _Iterator<_Const2>& __i) const
+   { return __i._M_current == _M_end; }
 
  sentinel_t<_Base> _M_end = sentinel_t<_Base>();
 
@@ -1892,19 +1894,26 @@ namespace views
  base() const
  { return _M_end; }
 
- friend constexpr bool
- operator==(const _Iterator<_Const>& __x, const _Sentinel& __y)
- { return __y.__equal(__x); }
+ template<bool _Const2>
+   requires sentinel_for<sentinel_t<_Base>,
+  iterator_t<__detail::__maybe_const_t<_Const2, _Vp>>>
+   friend constexpr bool
+   operator==(const _Iterator<_Const2>& __x, const _Sentinel& __y)
+   { return __y.__equal(__x); }
 
- friend constexpr range_difference_t<_Base>
- operator-(const _Iterator<_Const>& __x, const _Sentinel& __y)
-   requires sized_sentinel_for<sentinel_t<_Base>, iterator_t<_Base>>
- { return -__y.__distance_from(__x); }
+ template<bool _Const2>
+   requires sized_sentinel_for<sentinel_t<_Base>,
+  iterator_t<__detail::__maybe_const_t<_Const2, _Vp>>>
+   friend constexpr range_difference_t<_Base>
+   operator-(const _Iterator<_Const2>& __x, const _Sentinel& __y)
+   { return -__y.__distance_from(__x); }
 
- friend constexpr range_difference_t<_Base>
- operator-(const _Sentinel& __y, const _Iterator<_Const>& __x)
-   requires sized_sentinel_for<sentinel_t<_Base>, iterator_t<_Base>>
- 

[committed] libstdc++: Fix std::reverse_iterator comparisons (PR 94354)

2020-05-27 Thread Jonathan Wakely via Gcc-patches
The std::reverse_iterator comparisons have always been implemented only
in terms of equality and less than. In C++98 that made no difference for
reasonable code, because when the underlying iterators are the same type
they are required to support all comparisons anyway.

But since LWG 280 it's possible to compare reverse_iterator<X> and
reverse_iterator<Y>, and comparisons between X and Y might not support
the full set of equality and relational operators. This means that it
matters whether we implement operator!= as x.base() != y.base() or
!(x.base() == y.base()), and the current implementation is
non-conforming.

This was already fixed in GCC 10.1 for C++20; this change also fixes it
for all other -std modes.
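
To make the difference concrete, here is a small sketch under stated
assumptions: X, Y and Rev are hypothetical types made up for this
example (Rev is a bare-bones stand-in, not std::reverse_iterator). X and
Y can be compared with each other only via != and >, so the comparisons
below compile only because each operator forwards to the matching
operation on base(), as the patch does.

```cpp
#include <cassert>

// Hypothetical iterator-like types: a mixed X/Y comparison supports
// only != and >, nothing else (no ==, no <).
struct X { int p; };
struct Y { int p; };
constexpr bool operator!=(X a, Y b) { return a.p != b.p; }
constexpr bool operator>(X a, Y b)  { return a.p > b.p; }

// Minimal stand-in for a reverse iterator adaptor.
template<typename It>
struct Rev
{
  It cur;
  constexpr It base() const { return cur; }
};

// Forward directly to != on the bases.  Writing this as
// !(x.base() == y.base()) would not compile for Rev<X> vs Rev<Y>,
// because X == Y does not exist.
template<typename L, typename R>
  constexpr bool
  operator!=(const Rev<L>& x, const Rev<R>& y)
  { return x.base() != y.base(); }

// Reversed ordering: x < y for the adaptors means x.base() > y.base().
// Writing this as y.base() < x.base() would need Y < X instead.
template<typename L, typename R>
  constexpr bool
  operator<(const Rev<L>& x, const Rev<R>& y)
  { return x.base() > y.base(); }

static_assert((Rev<X>{X{5}} != Rev<Y>{Y{4}}), "");
static_assert((Rev<X>{X{5}} < Rev<Y>{Y{4}}), "");
```

Both static_asserts depend only on the two operators that X and Y
actually provide, which is the property the conforming implementation
is meant to have.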

PR libstdc++/94354
* include/bits/stl_iterator.h (reverse_iterator): Fix comparison
operators to use the correct operations on the underlying
iterators.
* testsuite/24_iterators/reverse_iterator/rel_ops.cc: New test.

Tested powerpc64le-linux, committed to master.


commit 979e89a9a94f66241fa8355e2b2e8f4a680c83e1
Author: Jonathan Wakely 
Date:   Wed May 27 21:58:56 2020 +0100

libstdc++: Fix std::reverse_iterator comparisons (PR 94354)

The std::reverse_iterator comparisons have always been implemented only
in terms of equality and less than. In C++98 that made no difference for
reasonable code, because when the underlying iterators are the same type
they are required to support all comparisons anyway.

But since LWG 280 it's possible to compare reverse_iterator<X> and
reverse_iterator<Y>, and comparisons between X and Y might not support
the full set of equality and relational operators. This means that it
matters whether we implement operator!= as x.base() != y.base() or
!(x.base() == y.base()), and the current implementation is
non-conforming.

This was already fixed in GCC 10.1 for C++20; this change also fixes it
for all other -std modes.

PR libstdc++/94354
* include/bits/stl_iterator.h (reverse_iterator): Fix comparison
operators to use the correct operations on the underlying
iterators.
* testsuite/24_iterators/reverse_iterator/rel_ops.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
index 19b1d53f781..b0f45499aec 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -393,6 +393,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // DR 280. Comparison of reverse_iterator to const reverse_iterator.
+
   template<typename _IteratorL, typename _IteratorR>
 inline _GLIBCXX17_CONSTEXPR bool
 operator==(const reverse_iterator<_IteratorL>& __x,
@@ -403,31 +404,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 inline _GLIBCXX17_CONSTEXPR bool
 operator<(const reverse_iterator<_IteratorL>& __x,
  const reverse_iterator<_IteratorR>& __y)
-{ return __y.base() < __x.base(); }
+{ return __x.base() > __y.base(); }
 
   template<typename _IteratorL, typename _IteratorR>
 inline _GLIBCXX17_CONSTEXPR bool
 operator!=(const reverse_iterator<_IteratorL>& __x,
   const reverse_iterator<_IteratorR>& __y)
-{ return !(__x == __y); }
+{ return __x.base() != __y.base(); }
 
   template<typename _IteratorL, typename _IteratorR>
 inline _GLIBCXX17_CONSTEXPR bool
 operator>(const reverse_iterator<_IteratorL>& __x,
  const reverse_iterator<_IteratorR>& __y)
-{ return __y < __x; }
+{ return __x.base() < __y.base(); }
 
   template<typename _IteratorL, typename _IteratorR>
 inline _GLIBCXX17_CONSTEXPR bool
 operator<=(const reverse_iterator<_IteratorL>& __x,
   const reverse_iterator<_IteratorR>& __y)
-{ return !(__y < __x); }
+{ return __x.base() >= __y.base(); }
 
   template<typename _IteratorL, typename _IteratorR>
 inline _GLIBCXX17_CONSTEXPR bool
 operator>=(const reverse_iterator<_IteratorL>& __x,
   const reverse_iterator<_IteratorR>& __y)
-{ return !(__x < __y); }
+{ return __x.base() <= __y.base(); }
 #else // C++20
   template<typename _IteratorL, typename _IteratorR>
 constexpr bool
diff --git a/libstdc++-v3/testsuite/24_iterators/reverse_iterator/rel_ops.cc b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/rel_ops.cc
new file mode 100644
index 000..4f2675f471b
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/rel_ops.cc
@@ -0,0 +1,99 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General 

Re: [PATCH] c++: constexpr RANGE_EXPR ctor indexes [PR95241]

2020-05-27 Thread Patrick Palka via Gcc-patches
On Wed, 27 May 2020, Patrick Palka wrote:

> In the testcase below, the CONSTRUCTOR for 'field' contains a
> RANGE_EXPR index:
> 
>   {aggr_init_expr<...>, [1...2]={.off=1}}
> 
> but get_or_insert_ctor_field isn't prepared to handle RANGE_EXPR
> indexes.
> 
> This patch adds limited support for RANGE_EXPR indexes to
> get_or_insert_ctor_field.  The limited scope of this patch should make
> it more suitable for backporting, and support for more access patterns
> would be needed only to handle self-modifying CONSTRUCTORs containing a
> RANGE_EXPR index, but I haven't yet been able to come up with a testcase
> that exhibits such a CONSTRUCTOR.
> 
> Passes 'make check-c++', does this look OK to commit to master and to
> the GCC 10 branch after a full bootstrap and regtest?
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/95241
>   * constexpr.c (get_or_insert_ctor_field): Add limited support
>   for RANGE_EXPR indexes.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/95241
>   * g++.dg/cpp0x/constexpr-array25.C: New test.
> ---
>  gcc/cp/constexpr.c| 12 +++
>  .../g++.dg/cpp0x/constexpr-array25.C  | 21 +++
>  2 files changed, 33 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
> 
> diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> index 4e441ac8d2f..6f9bafbe8d8 100644
> --- a/gcc/cp/constexpr.c
> +++ b/gcc/cp/constexpr.c
> @@ -3301,6 +3301,18 @@ get_or_insert_ctor_field (tree ctor, tree index, int pos_hint = -1)
>  }
>else if (TREE_CODE (type) == ARRAY_TYPE || TREE_CODE (type) == VECTOR_TYPE)
>  {
> +  if (TREE_CODE (index) == RANGE_EXPR)
> + {
> +   /* Our support for RANGE_EXPR indexes is limited to accessing an
> +  existing one via POS_HINT, and appending a new one to the end of
> +  CTOR.  ??? Support for other access patterns might be needed.  */
> +   tree lo = TREE_OPERAND (index, 0);
> +   auto *elts = CONSTRUCTOR_ELTS (ctor);
> +   gcc_assert (vec_safe_is_empty (elts)
> +   || array_index_cmp (lo, elts->last().index) > 0);
> +   return vec_safe_push (elts, {index, NULL_TREE});
> + }
> +

Oops, it just occurred to me that the use of C++11 features here would
make this patch unsuitable for backporting.  C++98-compatible patch
incoming...

>HOST_WIDE_INT i = find_array_ctor_elt (ctor, index, /*insert*/true);
>gcc_assert (i >= 0);
>constructor_elt *cep = CONSTRUCTOR_ELT (ctor, i);
> diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
> new file mode 100644
> index 000..9162943249f
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
> @@ -0,0 +1,21 @@
> +// PR c++/95241
> +// { dg-do compile { target c++11 } }
> +
> +struct Fragment
> +{
> +  int off;
> +  constexpr Fragment(int _off) : off(_off) { }
> +  constexpr Fragment() : Fragment(1) { }
> +};
> +
> +struct Field
> +{
> +  Fragment fragments[3];
> +  constexpr Field(int off) : fragments{{off}} { }
> +};
> +
> +constexpr Field field{0};
> +
> +static_assert(field.fragments[0].off == 0
> +   && field.fragments[1].off == 1
> +   && field.fragments[2].off == 1, "");
> -- 
> 2.27.0.rc1.5.gae92ac8ae3
> 
> 



[PATCH] c++: constexpr RANGE_EXPR ctor indexes [PR95241]

2020-05-27 Thread Patrick Palka via Gcc-patches
In the testcase below, the CONSTRUCTOR for 'field' contains a
RANGE_EXPR index:

  {aggr_init_expr<...>, [1...2]={.off=1}}

but get_or_insert_ctor_field isn't prepared to handle RANGE_EXPR
indexes.

This patch adds limited support for RANGE_EXPR indexes to
get_or_insert_ctor_field.  The limited scope of this patch should make
it more suitable for backporting, and support for more access patterns
would be needed only to handle self-modifying CONSTRUCTORs containing a
RANGE_EXPR index, but I haven't yet been able to come up with a testcase
that exhibits such a CONSTRUCTOR.
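
As a rough illustration of the "append only" restriction, here is a toy
model. Everything in it is an assumption made for the example: Elt,
append_range and build_field_ctor are hypothetical names, a std::vector
stands in for the CONSTRUCTOR's element vector, and a plain lo/hi pair
stands in for a RANGE_EXPR index. It is not GCC code.

```cpp
#include <cassert>
#include <vector>

// Toy constructor element: covers indexes [lo, hi]; a single index
// has lo == hi.  'value' stands in for the element's initializer.
struct Elt { long lo; long hi; int value; };

// Append a new range element.  As in the patch, the only supported
// insertion is past the current last element; anything else asserts.
inline Elt*
append_range(std::vector<Elt>& elts, long lo, long hi)
{
  assert(elts.empty() || lo > elts.back().hi);
  elts.push_back(Elt{lo, hi, 0});
  return &elts.back();
}

// Builds the shape from the testcase: {[0]=0, [1...2]=1}.
inline std::vector<Elt>
build_field_ctor()
{
  std::vector<Elt> elts;
  append_range(elts, 0, 0)->value = 0;   // fragments[0].off == 0
  append_range(elts, 1, 2)->value = 1;   // fragments[1..2].off == 1
  return elts;
}
```

In this model a request to insert a range before or inside an existing
element would trip the assertion, which corresponds to the access
patterns the patch leaves unsupported.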

Passes 'make check-c++', does this look OK to commit to master and to
the GCC 10 branch after a full bootstrap and regtest?

gcc/cp/ChangeLog:

PR c++/95241
* constexpr.c (get_or_insert_ctor_field): Add limited support
for RANGE_EXPR indexes.

gcc/testsuite/ChangeLog:

PR c++/95241
* g++.dg/cpp0x/constexpr-array25.C: New test.
---
 gcc/cp/constexpr.c| 12 +++
 .../g++.dg/cpp0x/constexpr-array25.C  | 21 +++
 2 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 4e441ac8d2f..6f9bafbe8d8 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3301,6 +3301,18 @@ get_or_insert_ctor_field (tree ctor, tree index, int pos_hint = -1)
 }
   else if (TREE_CODE (type) == ARRAY_TYPE || TREE_CODE (type) == VECTOR_TYPE)
 {
+  if (TREE_CODE (index) == RANGE_EXPR)
+   {
+ /* Our support for RANGE_EXPR indexes is limited to accessing an
+existing one via POS_HINT, and appending a new one to the end of
+CTOR.  ??? Support for other access patterns might be needed.  */
+ tree lo = TREE_OPERAND (index, 0);
+ auto *elts = CONSTRUCTOR_ELTS (ctor);
+ gcc_assert (vec_safe_is_empty (elts)
+ || array_index_cmp (lo, elts->last().index) > 0);
+ return vec_safe_push (elts, {index, NULL_TREE});
+   }
+
   HOST_WIDE_INT i = find_array_ctor_elt (ctor, index, /*insert*/true);
   gcc_assert (i >= 0);
   constructor_elt *cep = CONSTRUCTOR_ELT (ctor, i);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
new file mode 100644
index 000..9162943249f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array25.C
@@ -0,0 +1,21 @@
+// PR c++/95241
+// { dg-do compile { target c++11 } }
+
+struct Fragment
+{
+  int off;
+  constexpr Fragment(int _off) : off(_off) { }
+  constexpr Fragment() : Fragment(1) { }
+};
+
+struct Field
+{
+  Fragment fragments[3];
+  constexpr Field(int off) : fragments{{off}} { }
+};
+
+constexpr Field field{0};
+
+static_assert(field.fragments[0].off == 0
+ && field.fragments[1].off == 1
+ && field.fragments[2].off == 1, "");
-- 
2.27.0.rc1.5.gae92ac8ae3



[committed] i386: Fix V2SF horizontal add/subtract insns

2020-05-27 Thread Uros Bizjak via Gcc-patches
PFPNACC insn is incorrectly modelled to perform addition and subtraction
of two operands, but in reality it performs horizontal addition and
subtraction:

Instruction: PFPNACC dest,src

Description:
dest[31:0] <- dest[31:0] - dest[63:32];
dest[63:32] <- src[31:0] + src[63:32];

Also, it is not possible to directly replace PFACC with HADDPS and PFNACC
with HSUBPS, because operands in the second word do not match.

PFACC does:

dest[31..0] <- dest[31..0] + dest[63..32];
dest[63..32] <- src[31..0] + src [63..32];

while HADDPS does:

dest[31..0] <-  dest[31..0]  +  dest[63..32];
dest[63..32] <- dest[127..96] + dest[95..64];
dest[95..64] <- src [31..0]  +  src [63..32];
dest[127..96] <- src [127..96] + src [95..64];
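
The mismatch in the second word can be checked with a scalar model of
the three instructions. This is only a sketch that follows the
descriptions given above; the function names and the widen/low_half
helpers are hypothetical, and nothing here uses real intrinsics.

```cpp
#include <array>
#include <cassert>

using V2 = std::array<float, 2>;   // 64-bit 3DNow! vector, element 0 is bits [31..0]
using V4 = std::array<float, 4>;   // 128-bit SSE vector

// PFACC dest,src
constexpr V2 pfacc(const V2& dest, const V2& src)
{ return { dest[0] + dest[1], src[0] + src[1] }; }

// PFPNACC dest,src: horizontal subtract in the low word, add in the high word
constexpr V2 pfpnacc(const V2& dest, const V2& src)
{ return { dest[0] - dest[1], src[0] + src[1] }; }

// HADDPS dest,src
constexpr V4 haddps(const V4& dest, const V4& src)
{
  return { dest[0] + dest[1], dest[2] + dest[3],
           src[0] + src[1],   src[2] + src[3] };
}

// Place a V2 in the low half of a V4 (upper half zeroed for the example).
constexpr V4 widen(const V2& v) { return { v[0], v[1], 0.0f, 0.0f }; }

// Take the low 64 bits of a V4.
constexpr V2 low_half(const V4& v) { return { v[0], v[1] }; }
```

With dest = {1, 2} and src = {10, 20}, pfacc yields {3, 30}, while the
low half of haddps on the widened operands is {3, 0}: the second word
comes from the upper half of dest rather than from src, which is why the
SSE alternatives could not simply be substituted.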

2020-05-27  Uroš Bizjak  

gcc/ChangeLog:
* config/i386/mmx.md (*mmx_haddv2sf3): Remove SSE alternatives.
(mmx_hsubv2sf3): Ditto.
(mmx_haddsubv2sf3): New expander.
(*mmx_haddsubv2sf3): Rename from mmx_addsubv2sf3. Correct
RTL template to model horizontal subtraction and addition.
* config/i386/i386-builtin.def (IX86_BUILTIN_PFPNACC):
Update for rename.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index b873498f3ab..134981a798f 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -555,7 +555,7 @@ BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_pi2fw, "__builtin_ia32_pi2fw", I
 BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_pswapdv2si2, "__builtin_ia32_pswapdsi", IX86_BUILTIN_PSWAPDSI, UNKNOWN, (int) V2SI_FTYPE_V2SI)
 BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_pswapdv2sf2, "__builtin_ia32_pswapdsf", IX86_BUILTIN_PSWAPDSF, UNKNOWN, (int) V2SF_FTYPE_V2SF)
 BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_hsubv2sf3, "__builtin_ia32_pfnacc", IX86_BUILTIN_PFNACC, UNKNOWN, (int) V2SF_FTYPE_V2SF_V2SF)
-BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_addsubv2sf3, "__builtin_ia32_pfpnacc", IX86_BUILTIN_PFPNACC, UNKNOWN, (int) V2SF_FTYPE_V2SF_V2SF)
+BDESC (OPTION_MASK_ISA_3DNOW_A, 0, CODE_FOR_mmx_haddsubv2sf3, "__builtin_ia32_pfpnacc", IX86_BUILTIN_PFPNACC, UNKNOWN, (int) V2SF_FTYPE_V2SF_V2SF)
 
 /* SSE */
 BDESC (OPTION_MASK_ISA_SSE, 0, CODE_FOR_sse_movmskps, "__builtin_ia32_movmskps", IX86_BUILTIN_MOVMSKPS, UNKNOWN, (int) INT_FTYPE_V4SF)
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 271c1c2e833..7c9640d4f9f 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -552,32 +552,27 @@
   "TARGET_3DNOW")
 
 (define_insn "*mmx_haddv2sf3"
-  [(set (match_operand:V2SF 0 "register_operand" "=y,x,x")
+  [(set (match_operand:V2SF 0 "register_operand" "=y")
(vec_concat:V2SF
  (plus:SF
(vec_select:SF
- (match_operand:V2SF 1 "register_operand" "0,0,x")
+ (match_operand:V2SF 1 "register_operand" "0")
  (parallel [(match_operand:SI 3 "const_0_to_1_operand")]))
(vec_select:SF (match_dup 1)
(parallel [(match_operand:SI 4 "const_0_to_1_operand")])))
  (plus:SF
 (vec_select:SF
- (match_operand:V2SF 2 "nonimmediate_operand" "ym,x,x")
+ (match_operand:V2SF 2 "nonimmediate_operand" "ym")
  (parallel [(match_operand:SI 5 "const_0_to_1_operand")]))
(vec_select:SF (match_dup 2)
(parallel [(match_operand:SI 6 "const_0_to_1_operand")])]
   "TARGET_3DNOW
&& INTVAL (operands[3]) != INTVAL (operands[4])
&& INTVAL (operands[5]) != INTVAL (operands[6])"
-  "@
-   pfacc\t{%2, %0|%0, %2}
-   haddps\t{%2, %0|%0, %2}
-   vhaddps\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "*,sse3_noavx,avx")
-   (set_attr "type" "mmxadd,sseadd,sseadd")
-   (set_attr "prefix_extra" "1,*,*")
-   (set_attr "prefix" "*,orig,vex")
-   (set_attr "mode" "V2SF,V4SF,V4SF")])
+  "pfacc\t{%2, %0|%0, %2}"
+  [(set_attr "type" "mmxadd")
+   (set_attr "prefix_extra" "1")
+   (set_attr "mode" "V2SF")])
 
 (define_insn "*mmx_haddv2sf3_low"
   [(set (match_operand:SF 0 "register_operand" "=x,x")
@@ -599,28 +594,23 @@
(set_attr "mode" "V4SF")])
 
 (define_insn "mmx_hsubv2sf3"
-  [(set (match_operand:V2SF 0 "register_operand" "=y,x,x")
+  [(set (match_operand:V2SF 0 "register_operand" "=y")
(vec_concat:V2SF
  (minus:SF
(vec_select:SF
- (match_operand:V2SF 1 "register_operand" "0,0,x")
+ (match_operand:V2SF 1 "register_operand" "0")
  (parallel [(const_int  0)]))
(vec_select:SF (match_dup 1) (parallel [(const_int 1)])))
  (minus:SF
 (vec_select:SF
- (match_operand:V2SF 2 "register_mmxmem_operand" "ym,x,x")
+ (match_operand:V2SF 2 "nonimmediate_operand" "ym")
  (parallel [(const_int  0)]))
(vec_select:SF (match_dup 2) (parallel [(const_int 1)])]
   "TARGET_3DNOW_A"
-  "@
-   pfnacc\t{%2, %0|%0, %2}
-   hsubps\t{%2, %0|%0, %2}
-   

Re: [PATCH, committed] [9/10/11 Regression] PR fortran/95104 - Segfault on a legal WAIT statement

2020-05-27 Thread Harald Anlauf
Hi Thomas,

thanks for the hint:

> From: "Thomas Koenig" 
> On 26.05.20 at 23:33, Harald Anlauf wrote:
> > Will backport in a few days, when I figure out how to do it now.
>
> The way to backport now is to first run contrib/gcc-git-customization.sh
> from current master, and then change to the branch you want to
> backport this to and run
>
> git gcc-backport r11-646-g56f03cd12be26828788a27f6f3c250041a958e45 .
>
> (or what your revision may be).
>
> I just tried it, and it works well.

This almost worked for me.

I am using git worktree for the branches.  I had to checkout releases/gcc-10
again, otherwise the worktree was in a detached state.  After that, everything
worked fine.

Thanks,
Harald



[committed] i386: Remove %q modifier from two pmov insn templates [PR95355]

2020-05-27 Thread Uros Bizjak via Gcc-patches
2020-05-27  Uroš Bizjak  

gcc/ChangeLog:
PR target/95355
* config/i386/sse.md
(avx512f_v16qiv16si2):
Remove %q operand modifier from insn template.
(avx512f_v8hiv8di2): Ditto.

gcc/testsuite/ChangeLog:
PR target/95355
* gcc.target/i386/pr95355.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index fde65391d7d..1cf1b8cea3b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17559,7 +17559,7 @@
(any_extend:V16SI
  (match_operand:V16QI 1 "nonimmediate_operand" "vm")))]
   "TARGET_AVX512F"
-  "vpmovbd\t{%1, %0|%0, %q1}"
+  "vpmovbd\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
@@ -17935,7 +17935,7 @@
(any_extend:V8DI
  (match_operand:V8HI 1 "nonimmediate_operand" "vm")))]
   "TARGET_AVX512F"
-  "vpmovwq\t{%1, %0|%0, %q1}"
+  "vpmovwq\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
diff --git a/gcc/testsuite/gcc.target/i386/pr95355.c b/gcc/testsuite/gcc.target/i386/pr95355.c
new file mode 100644
index 000..3e4faba19f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95355.c
@@ -0,0 +1,20 @@
+/* PR target/95355 */
+/* { dg-do assemble { target avx512dq } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target masm_intel } */
+/* { dg-options "-O -fno-tree-dominator-opts -fno-tree-fre -ftree-slp-vectorize -fno-tree-ter -mavx512dq -masm=intel" } */
+
+typedef int __attribute__((__vector_size__(64))) U;
+typedef __int128 __attribute__((__vector_size__(32))) V;
+
+U i;
+V j;
+
+int
+foo(unsigned char l)
+{
+  V m = j % 999;
+  U n = l <= i;
+  V o = ((union { U a; V b[2]; }) n).b[0] + m;
+  return o[0];
+}


[committed] jit: use deep unsharing of trees [PR 95314]

2020-05-27 Thread David Malcolm via Gcc-patches
PR jit/95314 reports an internal error inside verify_gimple, which
turned out to be due to reusing the result of
gcc_jit_lvalue_get_address in several functions, leading to tree nodes
shared between multiple function bodies.

This patch fixes the issue by adopting the "Deep unsharing" strategy
described in the comment in gimplify.c preceding mostly_copy_tree_r:
to mark all of the jit "frontend"'s expression tree nodes with
TREE_VISITED, and to set LANG_HOOKS_DEEP_UNSHARING, so that "they are
unshared on the first reference within functions when the regular
unsharing algorithm runs".

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to master as r11-668-gc98bd673ef93836f03491201f1c63929ea429cd6.

gcc/jit/ChangeLog:
PR jit/95314
* dummy-frontend.c (LANG_HOOKS_DEEP_UNSHARING): Define to be true.
* jit-playback.h (gcc::jit::playback::rvalue): Mark tree node with
TREE_VISITED.

gcc/testsuite/ChangeLog:
PR jit/95314
* jit.dg/all-non-failing-tests.h: Add test-pr95314-rvalue-reuse.c.
* jit.dg/test-pr95314-rvalue-reuse.c: New test.
---
 gcc/jit/dummy-frontend.c  |  3 +
 gcc/jit/jit-playback.h|  7 ++-
 gcc/testsuite/jit.dg/all-non-failing-tests.h  | 10 
 .../jit.dg/test-pr95314-rvalue-reuse.c| 56 +++
 4 files changed, 75 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/jit.dg/test-pr95314-rvalue-reuse.c

diff --git a/gcc/jit/dummy-frontend.c b/gcc/jit/dummy-frontend.c
index 27fe9d3db96..6c7b7992a4d 100644
--- a/gcc/jit/dummy-frontend.c
+++ b/gcc/jit/dummy-frontend.c
@@ -269,6 +269,9 @@ jit_langhook_getdecls (void)
 #undef LANG_HOOKS_GETDECLS
 #define LANG_HOOKS_GETDECLSjit_langhook_getdecls
 
+#undef  LANG_HOOKS_DEEP_UNSHARING
+#define LANG_HOOKS_DEEP_UNSHARING  true
+
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 
 #include "gt-jit-dummy-frontend.h"
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index 074434a9f6b..f9b3e675368 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -576,7 +576,12 @@ public:
   rvalue (context *ctxt, tree inner)
 : m_ctxt (ctxt),
   m_inner (inner)
-  {}
+  {
+/* Pre-mark tree nodes with TREE_VISITED so that they can be
+   deeply unshared during gimplification (including across
+   functions); this requires LANG_HOOKS_DEEP_UNSHARING to be true.  */
+TREE_VISITED (inner) = 1;
+  }
 
   rvalue *
   as_rvalue () { return this; }
diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h b/gcc/testsuite/jit.dg/all-non-failing-tests.h
index babcd3979b7..ca8d3df4193 100644
--- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
+++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
@@ -234,6 +234,13 @@
 #undef create_code
 #undef verify_code
 
+/* test-pr95314-rvalue-reuse.c.  */
+#define create_code create_code_pr95314_rvalue_reuse
+#define verify_code verify_code_pr95314_rvalue_reuse
+#include "test-pr95314-rvalue-reuse.c"
+#undef create_code
+#undef verify_code
+
 /* test-reading-struct.c */
 #define create_code create_code_reading_struct
 #define verify_code verify_code_reading_struct
@@ -401,6 +408,9 @@ const struct testcase testcases[] = {
   {"pr95306_builtin_types",
create_code_pr95306_builtin_types,
verify_code_pr95306_builtin_types},
+  {"pr95314_rvalue_reuse",
+   create_code_pr95314_rvalue_reuse,
+   verify_code_pr95314_rvalue_reuse},
   {"reading_struct ",
create_code_reading_struct ,
verify_code_reading_struct },
diff --git a/gcc/testsuite/jit.dg/test-pr95314-rvalue-reuse.c b/gcc/testsuite/jit.dg/test-pr95314-rvalue-reuse.c
new file mode 100644
index 000..6bed0bc52a4
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-pr95314-rvalue-reuse.c
@@ -0,0 +1,56 @@
+#include 
+#include "harness.h"
+
+void create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  gcc_jit_type *t_int =  gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+  gcc_jit_type *t_void = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_VOID);
+  gcc_jit_type *t_const_char_ptr
+= gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_CONST_CHAR_PTR);
+  gcc_jit_lvalue *global
+= gcc_jit_context_new_global (ctxt, NULL, GCC_JIT_GLOBAL_INTERNAL,
+ t_const_char_ptr, "pr95314_global");
+
+  gcc_jit_rvalue *global_ref = gcc_jit_lvalue_get_address(global, NULL);
+
+  gcc_jit_param *param_string
+= gcc_jit_context_new_param (ctxt, NULL, t_const_char_ptr, "string");
+  gcc_jit_function *puts_func
+= gcc_jit_context_new_function (ctxt, NULL, GCC_JIT_FUNCTION_IMPORTED,
+   t_int, "puts", 1, &param_string, 0);
+
+#define NUM_INNER_FNS 3
+  gcc_jit_function *inner_fns[NUM_INNER_FNS];
+  for (int i = 0; i < NUM_INNER_FNS; i++)
+{
+  char fnname[128];
+  sprintf (fnname, "pr95314_inner_%i", i);
+  inner_fns[i]
+   = gcc_jit_context_new_function (ctxt, NULL, GCC_JIT_FUNCTION_INTERNAL,
+ 

[committed] jit: fix libgccjit.info entry [PR 91330]

2020-05-27 Thread David Malcolm via Gcc-patches
2020-05-27  Tom Tromey  

gcc/jit/ChangeLog:
PR jit/91330
* docs/conf.py (texinfo_documents): Set description.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
---
 gcc/jit/docs/_build/texinfo/libgccjit.texi | 2 +-
 gcc/jit/docs/conf.py   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/jit/docs/conf.py b/gcc/jit/docs/conf.py
index 9dcc88e9d52..796e16cdd74 100644
--- a/gcc/jit/docs/conf.py
+++ b/gcc/jit/docs/conf.py
@@ -244,7 +244,7 @@ man_pages = [
 #  dir menu entry, description, category)
 texinfo_documents = [
   ('index', 'libgccjit', u'libgccjit Documentation',
-   u'David Malcolm', 'libgccjit', 'One line description of project.',
+   u'David Malcolm', 'libgccjit', 'GCC-based Just In Time compiler library.',
'Miscellaneous'),
 ]
 
-- 
2.21.0



[pushed] c++: Handle multiple aggregate overloads [PR95319].

2020-05-27 Thread Jason Merrill via Gcc-patches
Here, when considering the two 'insert' overloads, we look for aggregate
conversions from the same initializer-list to B<3> or
initializer_list<B<3>>.  But since my fix for reshape_init overhead on the
PR14179 testcase we reshaped the initializer-list directly, leading to an
error when we then tried to reshape it differently for the second overload.

Tested x86_64-pc-linux-gnu, applying to trunk and 10.

gcc/cp/ChangeLog:

PR c++/95319
* decl.c (reshape_init_array_1): Don't reuse in overload context.

gcc/testsuite/ChangeLog:

PR c++/95319
* g++.dg/cpp0x/initlist-array12.C: New test.
---
 gcc/cp/decl.c |  4 +++-
 gcc/testsuite/g++.dg/cpp0x/initlist-array12.C | 24 +++
 2 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array12.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5476965996b..56571e39570 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6029,8 +6029,10 @@ reshape_init_array_1 (tree elt_type, tree max_index, reshape_iter *d,
 
   /* The initializer for an array is always a CONSTRUCTOR.  If this is the
  outermost CONSTRUCTOR and the element type is non-aggregate, we don't need
- to build a new one.  */
+ to build a new one.  But don't reuse if not complaining; if this is
+ tentative, we might also reshape to another type (95319).  */
   bool reuse = (first_initializer_p
+   && (complain & tf_error)
&& !CP_AGGREGATE_TYPE_P (elt_type)
&& !TREE_SIDE_EFFECTS (first_initializer_p));
   if (reuse)
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-array12.C b/gcc/testsuite/g++.dg/cpp0x/initlist-array12.C
new file mode 100644
index 000..b012e7295d5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-array12.C
@@ -0,0 +1,24 @@
+// PR c++/95319
+// { dg-do compile { target c++11 } }
+
+namespace std {
+template  class initializer_list {
+  int *_M_array;
+  unsigned long _M_len;
+};
+template  struct A { typedef int _Type[_Nm]; };
+template  struct B { typename A<_Nm>::_Type _M_elems; };
+class C {
+public:
+  void insert(int, B<3>);
+  void insert(int, initializer_list<B<3>>);
+};
+} // namespace std
+int a;
+int
+main() {
+  using ArrayVector = std::C;
+  auto b = ArrayVector();
+  b.insert(a, {{2}});
+  return 0;
+}

base-commit: a7fd43c38f7469a3ef5ee30e889d60e1376d4dfc
-- 
2.18.1



[PATCH] mklog: support renaming of files

2020-05-27 Thread Martin Liška

Hi.

This patch uses rename detection newly added in unidiff 0.6.0.
Given a patch that renames a file, such as the one below, mklog now
generates 'Moved to...' and '...here.' entries:

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf2.c
similarity index 100%
rename from gcc/ipa-icf.c
rename to gcc/ipa-icf2.c

$ ./contrib/mklog.py 0001-test.patch
gcc/ChangeLog:

* ipa-icf.c: Moved to...
* ipa-icf2.c: ...here.


The support is optional and detected at run time.

Thoughts?
Martin
From 8d970b9a57ee373cacbbd2aa29cdbe1c29df4081 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 27 May 2020 20:03:50 +0200
Subject: [PATCH] mklog: support renaming of files

contrib/ChangeLog:

	* mklog.py: Support renaming of files.
	One needs unidiff 0.6.0+.
	* test_mklog.py: Test it.
---
 contrib/mklog.py  |  8 
 contrib/test_mklog.py | 26 ++
 2 files changed, 34 insertions(+)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index fb58661b5eb..243edbb15c5 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -173,6 +173,14 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
                 out += '\t* %s: %s.\n' % (relative_path, msg)
             elif file.is_removed_file:
                 out += '\t* %s: Removed.\n' % (relative_path)
+            elif hasattr(file, 'is_rename') and file.is_rename:
+                out += '\t* %s: Moved to...\n' % (relative_path)
+                new_path = file.target_file[2:]
+                # A file can be theoretically moved to a location that
+                # belongs to a different ChangeLog.  Let user fix it.
+                if new_path.startswith(changelog):
+                    new_path = new_path[len(changelog):].lstrip('/')
+                out += '\t* %s: ...here.\n' % (new_path)
             else:
                 if not no_functions:
                     for hunk in file:
diff --git a/contrib/test_mklog.py b/contrib/test_mklog.py
index ef7f2b1a594..344b7a2c771 100755
--- a/contrib/test_mklog.py
+++ b/contrib/test_mklog.py
@@ -30,6 +30,11 @@ import unittest
 
 from mklog import generate_changelog
 
+import unidiff
+
+unidiff_supports_renaming = hasattr(unidiff.PatchedFile(), 'is_rename')
+
+
 PATCH1 = '''\
 diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
 index 567c23380fe..e6209ede9d6 100644
@@ -379,6 +384,21 @@ gcc/testsuite/ChangeLog:
 
 '''
 
+PATCH8 = '''\
+diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf2.c
+similarity index 100%
+rename from gcc/ipa-icf.c
+rename to gcc/ipa-icf2.c
+'''
+
+EXPECTED8 = '''\
+gcc/ChangeLog:
+
+	* ipa-icf.c: Moved to...
+	* ipa-icf2.c: ...here.
+
+'''
+
 class TestMklog(unittest.TestCase):
 def test_macro_definition(self):
 changelog = generate_changelog(PATCH1)
@@ -411,3 +431,9 @@ class TestMklog(unittest.TestCase):
 def test_dr_detection_in_test_case(self):
 changelog = generate_changelog(PATCH7)
 assert changelog == EXPECTED7
+
+@unittest.skipIf(not unidiff_supports_renaming,
+ 'Newer version of unidiff is needed (0.6.0+)')
+def test_renaming(self):
+changelog = generate_changelog(PATCH8)
+assert changelog == EXPECTED8
-- 
2.26.2



Re: [PATCH] gcc-changelog: enhance handling of renamings

2020-05-27 Thread Martin Liška

On 5/27/20 7:50 PM, Martin Liška wrote:

We'll need here a skip based on version of unidiff. So something like:
@pytest.mark.skipif
?


I believe something like:

diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 6e42629cf07..afa6771c7fe 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -54,7 +54,7 @@ class GitEmail(GitCommit):
 t = 'A'
 elif f.is_removed_file:
 t = 'D'
-elif f.is_rename:
+elif hasattr(f, 'is_rename') and f.is_rename:
 # Consider that renamed files are two operations: the deletion
 # of the original name and the addition of the new one.
 modified_files.append((f.target_file[2:], 'A'))
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index c188fe9b276..f174f08d15b 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -22,10 +22,15 @@ import unittest
 
 from git_email import GitEmail
 
+import unidiff
+
 
 script_path = os.path.dirname(os.path.realpath(__file__))
 
 
+unidiff_supports_renaming = hasattr(unidiff.PatchedFile, 'is_rename')
+
+
 class TestGccChangelog(unittest.TestCase):
 def setUp(self):
 self.patches = {}
@@ -296,6 +301,8 @@ class TestGccChangelog(unittest.TestCase):
 'sem_ch8.adb', 'sem_elab.adb', 'sem_type.adb',
 'sem_util.adb'])
 
+@unittest.skipIf(not unidiff_supports_renaming,
+ 'Newer version of unidiff is needed (0.6.0+)')
 def test_renamed_file(self):
 email = self.from_patch_glob(
 '0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch')
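
For reference, the guard used in the hunks above can be tried out in isolation. This is a minimal sketch with stand-in classes, assuming nothing about unidiff beyond the attribute names that appear in the patch; `OldPatchedFile`, `NewPatchedFile` and `classify` are made-up names, and the `'R'` tag is only for the demonstration (the patch itself records a deletion plus an addition).

```python
import unittest


class OldPatchedFile:
    """Stand-in for a PatchedFile from a release without rename support."""
    is_added_file = False
    is_removed_file = False


class NewPatchedFile(OldPatchedFile):
    """Stand-in for a PatchedFile that exposes is_rename."""
    is_rename = True


def classify(f):
    """Return 'A', 'D', 'R' or 'M' without assuming is_rename exists."""
    if f.is_added_file:
        return 'A'
    if f.is_removed_file:
        return 'D'
    # getattr with a default behaves like "hasattr(f, ...) and f.is_rename".
    if getattr(f, 'is_rename', False):
        return 'R'
    return 'M'


# Feature detection is done once, at import time, as in the patch.
supports_renaming = hasattr(NewPatchedFile, 'is_rename')


class TestClassify(unittest.TestCase):
    @unittest.skipIf(not supports_renaming, 'rename support is missing')
    def test_rename(self):
        self.assertEqual(classify(NewPatchedFile()), 'R')

    def test_old_release_falls_back(self):
        self.assertEqual(classify(OldPatchedFile()), 'M')
```

With an old release the rename test is reported as skipped rather than failing with an AttributeError.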


Re: [PATCH] gcc-changelog: enhance handling of renamings

2020-05-27 Thread Martin Liška

On 5/27/20 4:29 PM, Pierre-Marie de Rodat wrote:

So far, we expect from a commit that renames a file to contain a
changelog entry only for the new name. For example, after the following
commit:

$ git mv foo bar
$ git commit

We expect the following changelog:

* bar: Renamed from foo.

Git does not keep track of renamings, only file deletions and additions.
The display of patches then uses heuristics (with config-dependent
parameters) to try to match deleted and added files in the same commit.
It is thus brittle to rely on this information.

This commit modifies changelog processing so that renames are considered
as a deletion of a file plus an addition of another file. The following
changelog is now expected for the above example:


Hello.

Thank you very much for working on this! It's a good idea that's currently
not supported.



* foo: Move...
* bar: Here.

contrib/

* gcc-changelog/git_email.py (GitEmail.__init__): Interpret file
renamings as a file deletion plus a file addition.
* gcc-changelog/git_repository.py (parse_git_revisions):
Likewise.
* gcc-changelog/test_email.py: New testcase.
* gcc-changelog/test_patches.txt: New testcase.
---
  contrib/gcc-changelog/git_email.py  |   5 +
  contrib/gcc-changelog/git_repository.py |   5 +
  contrib/gcc-changelog/test_email.py |   5 +
  contrib/gcc-changelog/test_patches.txt  | 153 
  4 files changed, 168 insertions(+)

diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 8c9df293a66..6e42629cf07 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -54,6 +54,11 @@ class GitEmail(GitCommit):
  t = 'A'
  elif f.is_removed_file:
  t = 'D'
+elif f.is_rename:
+# Consider that renamed files are two operations: the deletion
+# of the original name and the addition of the new one.
+modified_files.append((f.target_file[2:], 'A'))
+t = 'D'


However, this is available in the unidiff package only starting from
version 0.6.0.  With a slightly older release I see:

t = 'D'

  elif f.is_rename:

E   AttributeError: 'PatchedFile' object has no attribute 'is_rename'

Which is a minor limitation, as git_email.py is supposed to be used only
for tests.


  else:
  t = 'M'
  modified_files.append((f.path, t))
diff --git a/contrib/gcc-changelog/git_repository.py b/contrib/gcc-changelog/git_repository.py
index 0473fe73fba..e3b6c4d7a38 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -47,6 +47,11 @@ def parse_git_revisions(repo_path, revisions, strict=False):
  t = 'A'
  elif file.deleted_file:
  t = 'D'
+elif file.renamed_file:
+# Consider that renamed files are two operations: the deletion
+# of the original name and the addition of the new one.
+modified_files.append((file.a_path, 'D'))
+t = 'A'


Can you please align both previous hunks, I mean doing in both:

modified_files.append(..., 'A')
t = 'D'


  else:
  t = 'M'
  modified_files.append((file.b_path, t))
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 3d2c8ff2412..c188fe9b276 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -295,3 +295,8 @@ class TestGccChangelog(unittest.TestCase):
  'sem_ch12.adb', 'sem_ch4.adb', 'sem_ch7.adb',
  'sem_ch8.adb', 'sem_elab.adb', 'sem_type.adb',
  'sem_util.adb'])


We'll need here a skip based on version of unidiff. So something like:
@pytest.mark.skipif
?

I'm going to prepare a counterpart for mklog that can also handle file
renaming.

Martin


+
+def test_renamed_file(self):
+email = self.from_patch_glob(
+'0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch')
+assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 06869bff504..cc81fcd32b8 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -2741,3 +2741,156 @@ index b980b4c..c1b1d9e 100644
  --
  2.1.4
  
+=== 0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch ===

+From ed248d9bc3b72b6888a1b9cd84a8ef26809249f0 Mon Sep 17 00:00:00 2001
+From: Arnaud Charlet 
+Date: Thu, 23 Apr 2020 05:46:29 -0400
+Subject: [PATCH] [Ada] Add support for XDR streaming in the default runtime
+
+--!# FROM: /homes/derodat/tron/gnat2fsf/gnat
+--!# COMMIT: 5ad4cabb9f70114eb61c025e91406d4fba253f95
+--!# Change-Id: I21f92cad27933747495cdfa544a048f62f944cbd
+--!# TN: 

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-27 Thread Richard Biener via Gcc-patches
On May 27, 2020 6:13:24 PM GMT+02:00, Richard Sandiford 
 wrote:
>Martin Liška  writes:
>> On 5/26/20 12:15 PM, Richard Sandiford wrote:
>>> So longer-term, I think we should replace VCOND(U) with individual
>ifns,
>>> like for VCONDEQ.  We could reduce the number of optabs needed by
>>> canonicalising greater-based tests to lesser-based tests.
>>
>> Hello.
>>
>> Thanks for the feedback. So would it be possible to go with something
>> like DEF_INTERNAL_OPTAB_CAN_FAIL (see the attachment)?
>
>It doesn't look like this will solve the problem.  The reason that we
>don't allow optabs for directly-mapped IFNs to FAIL is that:
>
>  expand_insn (icode, 6, ops);
>
>will (deliberately) ICE when the pattern FAILs.  Code that copes with
>FAILing optabs instead needs to do:
>
>rtx_insn *watermark = get_last_insn (); <-- position whether it should
>go.
>  ...
>  if (maybe_expand_insn (icode, 6, ops))
>{
>  ...Success...;
>}
>
>  delete_insns_since (watermark);
>  ...fallback code that implements the IFN without optab support...
>
>At this point the IFN isn't really directly-mapped in the intended
>sense:
>the optab is “just” a way of optimising the IFN.
>
>So I think the effect of the patch will be to suppress the build
>failure,
>but instead ICE for PowerPC when the FAIL condition is hit.  It might
>be quite difficult to trigger though.  (That's why the static checking
>is there. :-))
>
>I think instead we should treat VCOND(U) as not directly-mapped,
>as Richard suggested (IIRC).  The internal-fn.c code should then handle
>the case in which we have an IFN_VCOND(U) call and the associated
>optab fails.  Of course, this is only going to be exercised on targets
>like powerpc* that have failing patterns, so it'll need testing
>there.
>
>What I meant by the quote above is that I think this shows the flaw in
>using IFN_VCOND(U) rather than splitting it up further.  Longer term,
>we should have a separate IFN_VCOND* and optab for each necessary
>condition.  There would then be no need (IMO) to allow the patterns
>to FAIL, and we could use directly-mapped IFNs with no fallback.
>There'd also be no need for the tree comparison operand to the IFN.

That might be indeed a good idea. 

Richard. 

>Thanks,
>Richard



Re: [PATCH 1/2] make vect_finish_stmt_generation work w/o stmt_vec_info

2020-05-27 Thread Richard Biener
On May 27, 2020 5:40:30 PM GMT+02:00, Richard Sandiford 
 wrote:
>Richard Biener  writes:
>> This makes the call chain below vec_init_vector happy with a NULL
>> stmt_vec_info which is used as "context".
>>
>> 2020-05-27  Richard Biener  
>>
>>  * tree-vect-stmts.c (vect_finish_stmt_generation_1):
>>  Conditionalize stmt_info use, assert the new stmt cannot throw
>>  when not specified.
>>  (vect_finish_stmt_generation): Adjust assert.
>
>Wasn't sure from this patch in isolation: when's it valid to pass a
>null
>stmt_info?  Felt weird that we suddenly needed this now, when we
>already
>have so many callers that follow the existing interface.

It should always be valid, but there's this weird EH thing there for which
I need to find a testcase.

>Or is this because you want to remove the stmt_info argument entirely
>at some point, and this is a step towards that?

Yes. I don't have a stmt for each SLP node anymore... 

Richard. 

>Thanks,
>Richard
>
>> ---
>>  gcc/tree-vect-stmts.c | 21 +
>>  1 file changed, 13 insertions(+), 8 deletions(-)
>>
>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>> index 35043ecd0f9..901999be058 100644
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -1668,14 +1668,19 @@ vect_finish_stmt_generation_1 (vec_info
>*vinfo,
>>if (dump_enabled_p ())
>>  dump_printf_loc (MSG_NOTE, vect_location, "add new stmt: %G",
>vec_stmt);
>>  
>> -  gimple_set_location (vec_stmt, gimple_location (stmt_info->stmt));
>> +  if (stmt_info)
>> +{
>> +  gimple_set_location (vec_stmt, gimple_location
>(stmt_info->stmt));
>>  
>> -  /* While EH edges will generally prevent vectorization, stmt might
>> - e.g. be in a must-not-throw region.  Ensure newly created stmts
>> - that could throw are part of the same region.  */
>> -  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
>> -  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
>> -add_stmt_to_eh_lp (vec_stmt, lp_nr);
>> +  /* While EH edges will generally prevent vectorization, stmt
>might
>> + e.g. be in a must-not-throw region.  Ensure newly created stmts
>> + that could throw are part of the same region.  */
>> +  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
>> +  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
>> +add_stmt_to_eh_lp (vec_stmt, lp_nr);
>> +}
>> +  else
>> +gcc_assert (!stmt_could_throw_p (cfun, vec_stmt));
>>  
>>return vec_stmt_info;
>>  }
>> @@ -1705,7 +1710,7 @@ vect_finish_stmt_generation (vec_info *vinfo,
>>   stmt_vec_info stmt_info, gimple *vec_stmt,
>>   gimple_stmt_iterator *gsi)
>>  {
>> -  gcc_assert (gimple_code (stmt_info->stmt) != GIMPLE_LABEL);
>> +  gcc_assert (!stmt_info || gimple_code (stmt_info->stmt) !=
>GIMPLE_LABEL);
>>  
>>if (!gsi_end_p (*gsi)
>>&& gimple_has_mem_ops (vec_stmt))



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-27 Thread Richard Sandiford
Martin Liška  writes:
> On 5/26/20 12:15 PM, Richard Sandiford wrote:
>> So longer-term, I think we should replace VCOND(U) with individual ifns,
>> like for VCONDEQ.  We could reduce the number of optabs needed by
>> canonicalising greater-based tests to lesser-based tests.
>
> Hello.
>
> Thanks for the feedback. So would it be possible to go with something
> like DEF_INTERNAL_OPTAB_CAN_FAIL (see the attachment)?

It doesn't look like this will solve the problem.  The reason that we
don't allow optabs for directly-mapped IFNs to FAIL is that:

  expand_insn (icode, 6, ops);

will (deliberately) ICE when the pattern FAILs.  Code that copes with
FAILing optabs instead needs to do:

  rtx_insn *watermark = get_last_insn (); <-- position whether it should go.
  ...
  if (maybe_expand_insn (icode, 6, ops))
{
  ...Success...;
}

  delete_insns_since (watermark);
  ...fallback code that implements the IFN without optab support...

At this point the IFN isn't really directly-mapped in the intended sense:
the optab is “just” a way of optimising the IFN.

So I think the effect of the patch will be to suppress the build failure,
but instead ICE for PowerPC when the FAIL condition is hit.  It might
be quite difficult to trigger though.  (That's why the static checking
is there. :-))

I think instead we should treat VCOND(U) as not directly-mapped,
as Richard suggested (IIRC).  The internal-fn.c code should then handle
the case in which we have an IFN_VCOND(U) call and the associated
optab fails.  Of course, this is only going to be exercised on targets
like powerpc* that have failing patterns, so it'll need testing there.

What I meant by the quote above is that I think this shows the flaw in
using IFN_VCOND(U) rather than splitting it up further.  Longer term,
we should have a separate IFN_VCOND* and optab for each necessary
condition.  There would then be no need (IMO) to allow the patterns
to FAIL, and we could use directly-mapped IFNs with no fallback.
There'd also be no need for the tree comparison operand to the IFN.
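
As an aside for readers less familiar with the expander, the watermark-and-fallback shape described above can be modelled in a few lines. This is only a toy sketch in Python: `InsnStream`, `maybe_expand` and `expand_with_fallback` are invented for the illustration and merely borrow the names of the GCC routines they imitate.

```python
class InsnStream:
    """Toy instruction list; not GCC's real data structure."""

    def __init__(self):
        self.insns = []

    def get_last_insn(self):
        # The watermark is simply the current length of the stream.
        return len(self.insns)

    def delete_insns_since(self, watermark):
        del self.insns[watermark:]


def maybe_expand(stream, pattern_ok):
    """Model of a pattern that may FAIL after emitting partial output."""
    stream.insns.append('scratch-setup')
    if not pattern_ok:
        return False
    stream.insns.append('vcond')
    return True


def expand_with_fallback(stream, pattern_ok):
    watermark = stream.get_last_insn()
    if maybe_expand(stream, pattern_ok):
        return 'optab'
    # Roll back whatever the failed pattern emitted, then open-code it.
    stream.delete_insns_since(watermark)
    stream.insns.extend(['compare', 'select'])
    return 'fallback'
```

The point of the model is that the caller, not the pattern, owns the recovery path; a directly-mapped IFN has no such caller-side fallback, which is why its optab must not FAIL.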

Thanks,
Richard


Re: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-27 Thread Richard Sandiford
"Yangfei (Felix)"  writes:
>> > +
>> > +{
>> > +  x = x_inner;
>> > +}
>> > +  else if (x_inner != NULL_RTX && MEM_P (y)
>> > + && known_eq (GET_MODE_SIZE (x_inner_mode),
>> GET_MODE_SIZE (mode))
>> > + && ! targetm.can_change_mode_class (x_inner_mode, mode,
>> ALL_REGS)
>> > + && (! targetm.slow_unaligned_access (x_inner_mode,
>> MEM_ALIGN (y))
>> > + || MEM_ALIGN (y) >= GET_MODE_ALIGNMENT
>> (x_inner_mode)))
>> 
>> What is the last condition protecting against?  Seems worth a comment.
>
> Comment added.  Here I intend to avoid generating a slow unaligned
> memory access.
> Machine modes like VNx2HImode may have a smaller alignment than modes like
> V4HI.
> For the given test case, SLP forces the alignment of memory access of mode
> VNx2HImode to be 32 bytes.
> In theory, we may have other cases where the alignment of the inner mode is
> bigger than that of the outer mode.

Ah, OK.  But in that case, shouldn't we allow the change if the
original unaligned MEM was also “slow”?

I guess there might be cases in which both modes are slow enough
for the hook to return true for them, but one is worse than the other.
But I don't think there's much we can do about that as things stand:
changing the mode might move from a slow mode to a slower mode,
but it might move in the other direction too.

> +2020-05-27  Felix Yang  
> +   Richard Sandiford  

I appreciate the gesture, but I don't think it's appropriate
to list me as an author.  I haven't written any of the code,
I've just reviewed it. :-)

> diff --git a/gcc/expr.c b/gcc/expr.c
> index dfbeae71518..3035791c764 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -3814,6 +3814,69 @@ emit_move_insn (rtx x, rtx y)
>gcc_assert (mode != BLKmode
> && (GET_MODE (y) == mode || GET_MODE (y) == VOIDmode));
>  
> +  /* If we have a copy which looks like one of the following patterns:

s/which/that/ (I think)

> +   (set (subreg:M1 (reg:M2 ...)) (subreg:M1 (reg:M2 ...)))
> +   (set (subreg:M1 (reg:M2 ...)) (mem:M1 ADDR))
> +   (set (mem:M1 ADDR) (subreg:M1 (reg:M2 ...)))
> +   (set (subreg:M1 (reg:M2 ...)) (constant C))
> + where mode M1 is equal in size to M2 and target hook can_change_mode_class
> + (M1, M2, ALL_REGS) returns false, try to remove the subreg.  This avoids
> + an implicit round trip through memory.  */

How about:

 where mode M1 is equal in size to M2, try to detect whether the
 mode change involves an implicit round trip through memory.
 If so, see if we can avoid that by removing the subregs and
 doing the move in mode M2 instead.  */

> +  else if (x_inner != NULL_RTX
> +&& MEM_P (y)
> +&& ! targetm.can_change_mode_class (GET_MODE (x_inner),
> +mode, ALL_REGS)
> +/* Stop if the inner mode requires too much alignment.  */
> +&& (! targetm.slow_unaligned_access (GET_MODE (x_inner),
> + MEM_ALIGN (y))
> +|| MEM_ALIGN (y) >= GET_MODE_ALIGNMENT (GET_MODE (x_inner

It's better to check the alignment first, since it's cheaper.
So taking the comment above into account, I think this ends up as:

   && (MEM_ALIGN (y) >= GET_MODE_ALIGNMENT (GET_MODE (x_inner))
       || targetm.slow_unaligned_access (mode, MEM_ALIGN (y))
       || !targetm.slow_unaligned_access (GET_MODE (x_inner),
                                          MEM_ALIGN (y)))

(Note: no space after "!", although the sources aren't as consistent
about that as they could be.)

TBH I think it would be good to avoid duplicating such a complicated
condition in both directions, so at the risk of getting flamed, how
about using a lambda?

  auto candidate_mem_p = [&](machine_mode inner_mode, rtx mem) {
    return ...;
  };

with ... containing everything after the MEM_P check?
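
To check that the three-way alignment condition behaves as intended, here is a small Python model of such a predicate. It is a sketch under stated assumptions: the `Mode` tuple, `make_candidate_mem_p`, the mode names `M1`/`M2`, and the two toy hooks are all hypothetical stand-ins for the target hooks, not GCC code.

```python
from collections import namedtuple

Mode = namedtuple('Mode', 'name alignment')  # alignment in bits


def make_candidate_mem_p(outer_mode, can_change_mode_class,
                         slow_unaligned_access):
    """Build the predicate once so both move directions can share it."""
    def candidate_mem_p(inner_mode, mem_align):
        return (not can_change_mode_class(inner_mode, outer_mode)
                # Cheap test first: the MEM is aligned enough for the inner mode.
                and (mem_align >= inner_mode.alignment
                     # The original access was already slow, so nothing is lost.
                     or slow_unaligned_access(outer_mode, mem_align)
                     # The access in the inner mode is not slow either.
                     or not slow_unaligned_access(inner_mode, mem_align)))
    return candidate_mem_p


# Toy hooks: mode changes are never allowed, and any under-aligned access
# counts as slow (roughly what a strict-alignment target would report).
def never_change(inner, outer):
    return False


def strict_slow(mode, align):
    return align < mode.alignment


M1 = Mode('M1', 16)    # outer mode, small alignment
M2 = Mode('M2', 128)   # inner mode, large alignment
pred = make_candidate_mem_p(M1, never_change, strict_slow)
```

With these toy hooks, `pred(M2, 128)` and `pred(M2, 8)` hold (aligned, and already slow, respectively) while `pred(M2, 64)` does not, since the change would turn a fast access into a slow one.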

Looks good otherwise, thanks,

Richard


Re: [PATCH v2] RS6000, add VSX mask manipulation support

2020-05-27 Thread Carl Love via Gcc-patches


GCC maintainers:

I have addressed the following comments on the patch from Will:

  - ChangeLog: fixed name/symbol order;
changed reference from rs6000-c.c to rs6000-builtin.def.

  - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
updated comment.

  - vsx_mask-runnable.c: divided it up into four smaller test cases,
vsx_mask-count-runnable.c, vsx_mask-expand-runnable.c,
vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.

Please let me know if there are additional concerns.  Thanks.

   Carl Love

---
RS6000 RFC 2629, add VSX mask manipulation support

The following patch adds support for builtins vec_genbm(),  vec_genhm(),
vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
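
For readers who have not met these instructions, the "expand mask" operation can be sketched as follows. This is only my reading of the operation (each element's most significant bit is replicated across the whole element) and is not taken from the patch; `expand_mask` is a hypothetical helper, and the ISA documentation remains the authority.

```python
def expand_mask(elements, width):
    """Replicate the most significant bit of each element across its width.

    elements: list of unsigned integers, each smaller than 2**width.
    width:    element size in bits (8 for bytes, 16 for halfwords, ...).
    """
    ones = (1 << width) - 1
    return [ones if (e >> (width - 1)) & 1 else 0 for e in elements]


print([hex(x) for x in expand_mask([0x80, 0x7f, 0xff, 0x00], 8)])
```

The same helper covers the halfword, word and doubleword variants by changing `width`.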

The patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for inclusion in the pu
branch.  Thanks.

   Carl Love
---

RS6000 RFC 2629, add VSX mask manipulation support

gcc/ChangeLog

2020-05-27  Carl Love  

* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
(VSX_MM4): New define_mode_iterator.
(VSX_MM_SUFFIX4): New define_mode_attr.
(vec_mtvsrbm): New define_expand.
(vec_mtvsrbmi): New define_insn.
(vec_mtvsr_): New define_insn.
(vec_cntmb_): New define_insn.
(vec_extract_): New define_insn.
(vec_expand_): New define_insn.
(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
* config/rs6000/altivec.h ( vec_genbm, vec_genhm, vec_genwm,
vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add defines.
* config/rs6000/rs6000-builtin.c: Add defines BU_FUTURE_2, BU_FUTURE_1.
(BU_FUTURE_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd, vexpandmq,
vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
(BU_FUTURE_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
(BU_FUTURE_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
(BU_FUTURE_OVERLOAD_2): Add definition for cntm.
* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
(altivec_overloaded_builtins): Add overloaded argument entries for
FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM, 
FUTURE_BUILTIN_VEC_MTVSRWM,
FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_VEC_MTVSRQM, 
FUTURE_BUILTIN_VEC_VCNTMBB,
FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH, FUTURE_BUILTIN_VCNTMBW,
FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB, 
FUTURE_BUILTIN_VEXPANDMH,
FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD, 
FUTURE_BUILTIN_VEXPANDMQ,
FUTURE_BUILTIN_VEXTRACTMB, FUTURE_BUILTIN_VEXTRACTMH, 
FUTURE_BUILTIN_VEXTRACTMW,
FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.
(builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD, 
FUTURE_BUILTIN_VEXPANDMB,
FUTURE_BUILTIN_VEXPANDMH, FUTURE_BUILTIN_VEXPANDMW, 
FUTURE_BUILTIN_VEXPANDMD,
FUTURE_BUILTIN_VEXPANDMQ.
* config/rs6000/rs6000-builtin.def (altivec_overloaded_builtins): Add 
entries
for MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VCNTM, VEXPANDM, 
VEXTRACTM.
* testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c: New test case.
* testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c: New test case.
* testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c: New test case.
* testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c: New test case.
---
 gcc/config/rs6000/altivec.h   |  10 +
 gcc/config/rs6000/rs6000-builtin.def  |  45 
 gcc/config/rs6000/rs6000-call.c   |  66 -
 gcc/config/rs6000/vsx.md  |  68 ++
 .../powerpc/vsx_mask-count-runnable.c | 149 
 .../powerpc/vsx_mask-expand-runnable.c| 194 +++
 .../powerpc/vsx_mask-extract-runnable.c   | 162 +
 .../powerpc/vsx_mask-move-runnable.c  | 225 ++
 8 files changed, 918 insertions(+), 1 deletion(-)
 create mode 100644 

Re: [PATCH 1/2] make vect_finish_stmt_generation work w/o stmt_vec_info

2020-05-27 Thread Richard Sandiford
Richard Biener  writes:
> This makes the call chain below vec_init_vector happy with a NULL
> stmt_vec_info which is used as "context".
>
> 2020-05-27  Richard Biener  
>
>   * tree-vect-stmts.c (vect_finish_stmt_generation_1):
>   Conditionalize stmt_info use, assert the new stmt cannot throw
>   when not specified.
>   (vect_finish_stmt_generation): Adjust assert.

Wasn't sure from this patch in isolation: when's it valid to pass a null
stmt_info?  Felt weird that we suddenly needed this now, when we already
have so many callers that follow the existing interface.

Or is this because you want to remove the stmt_info argument entirely
at some point, and this is a step towards that?
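
To spell out the contract being asked about, here is a toy Python model of "context is optional, but then the new statement must not throw". It is a sketch only: statements are plain dictionaries and `finish_stmt_generation` is a made-up stand-in, not the vectorizer routine.

```python
def finish_stmt_generation(new_stmt, context=None):
    """Attach location/EH info from context, or insist that none is needed."""
    if context is not None:
        new_stmt['location'] = context['location']
        # Keep a possibly-throwing statement in the same EH region.
        if context.get('eh_region') and new_stmt.get('could_throw'):
            new_stmt['eh_region'] = context['eh_region']
    else:
        # Without a context there is no region to copy, so the caller
        # must only pass statements that cannot throw.
        assert not new_stmt.get('could_throw'), 'throwing stmt needs a context'
    return new_stmt
```

Callers that pass a context see exactly the old behaviour; only the context-free path gains the assertion.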

Thanks,
Richard

> ---
>  gcc/tree-vect-stmts.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 35043ecd0f9..901999be058 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1668,14 +1668,19 @@ vect_finish_stmt_generation_1 (vec_info *vinfo,
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "add new stmt: %G", vec_stmt);
>  
> -  gimple_set_location (vec_stmt, gimple_location (stmt_info->stmt));
> +  if (stmt_info)
> +{
> +  gimple_set_location (vec_stmt, gimple_location (stmt_info->stmt));
>  
> -  /* While EH edges will generally prevent vectorization, stmt might
> - e.g. be in a must-not-throw region.  Ensure newly created stmts
> - that could throw are part of the same region.  */
> -  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
> -  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
> -add_stmt_to_eh_lp (vec_stmt, lp_nr);
> +  /* While EH edges will generally prevent vectorization, stmt might
> +  e.g. be in a must-not-throw region.  Ensure newly created stmts
> +  that could throw are part of the same region.  */
> +  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
> +  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
> + add_stmt_to_eh_lp (vec_stmt, lp_nr);
> +}
> +  else
> +gcc_assert (!stmt_could_throw_p (cfun, vec_stmt));
>  
>return vec_stmt_info;
>  }
> @@ -1705,7 +1710,7 @@ vect_finish_stmt_generation (vec_info *vinfo,
>stmt_vec_info stmt_info, gimple *vec_stmt,
>gimple_stmt_iterator *gsi)
>  {
> -  gcc_assert (gimple_code (stmt_info->stmt) != GIMPLE_LABEL);
> +  gcc_assert (!stmt_info || gimple_code (stmt_info->stmt) != GIMPLE_LABEL);
>  
>if (!gsi_end_p (*gsi)
>&& gimple_has_mem_ops (vec_stmt))


[committed] i386: Implement V2SF shuffles

2020-05-27 Thread Uros Bizjak via Gcc-patches
2020-05-27  Uroš Bizjak  

gcc/ChangeLog:
* config/i386/mmx.md (mmx_pswapdv2sf2): Add SSE alternatives.
Enable insn pattern for TARGET_MMX_WITH_SSE.
(*mmx_movshdup): New insn pattern.
(*mmx_movsldup): Ditto.
(*mmx_movss): Ditto.
* config/i386/i386-expand.c (ix86_vectorize_vec_perm_const):
Handle E_V2SFmode.
(expand_vec_perm_movs): Handle E_V2SFmode.
(expand_vec_perm_even_odd): Ditto.
(expand_vec_perm_broadcast_1): Assert that E_V2SFmode
is already handled by standard shuffle patterns.

gcc/testsuite/ChangeLog:
* gcc.target/i386/vperm-v2sf.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
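
For anyone checking the shufps immediates used in the mmx.md patterns of this patch (0xe1, 0xe5, 0xe0), the selection can be replayed with a small Python model. This is an illustrative sketch, with `shufps` written here as a plain function over four-element lists; a V2SF value is assumed to live in lanes 0 and 1 of the SSE register.

```python
def shufps(dst, src, imm):
    """Model of SHUFPS: two lanes picked from dst, then two from src."""
    return [dst[imm & 3],
            dst[(imm >> 2) & 3],
            src[(imm >> 4) & 3],
            src[(imm >> 6) & 3]]


reg = ['a', 'b', 'c', 'd']  # lanes 0..3; the V2SF payload is lanes 0 and 1

# Both operands are the same register in the patterns above.
print(shufps(reg, reg, 0xe1))  # swap of the two low lanes (pswapd)
print(shufps(reg, reg, 0xe5))  # lane 1 duplicated (movshdup)
print(shufps(reg, reg, 0xe0))  # lane 0 duplicated (movsldup)
```

In all three immediates the top four bits are 0b1110, which leaves lanes 2 and 3 where they were.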

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 338b4f7cf4f..96f70ae5aaa 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -16319,6 +16319,7 @@ expand_vec_perm_movs (struct expand_vec_perm_d *d)
 return false;
 
   if (!(TARGET_SSE && vmode == V4SFmode)
+  && !(TARGET_MMX_WITH_SSE && vmode == V2SFmode)
   && !(TARGET_SSE2 && vmode == V2DFmode))
 return false;
 
@@ -18639,6 +18640,13 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
   /* These are always directly implementable by expand_vec_perm_1.  */
   gcc_unreachable ();
 
+case E_V2SFmode:
+  gcc_assert (TARGET_MMX_WITH_SSE);
+  /* We have no suitable instructions.  */
+  if (d->testing_p)
+   return false;
+  break;
+
 case E_V4HImode:
   if (d->testing_p)
break;
@@ -18834,8 +18842,9 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d)
   gcc_unreachable ();
 
 case E_V2DFmode:
-case E_V2DImode:
+case E_V2SFmode:
 case E_V4SFmode:
+case E_V2DImode:
 case E_V2SImode:
 case E_V4SImode:
   /* These are always implementable using standard shuffle patterns.  */
@@ -19329,6 +19338,7 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0,
   if (d.testing_p && TARGET_SSSE3)
return true;
   break;
+case E_V2SFmode:
 case E_V2SImode:
 case E_V4HImode:
   if (!TARGET_MMX_WITH_SSE)
@@ -19367,7 +19377,7 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0,
 
   /* Implementable with shufps or pshufd.  */
   if (d.one_operand_p
- && (d.vmode == V4SFmode
+ && (d.vmode == V4SFmode || d.vmode == V2SFmode
  || d.vmode == V4SImode || d.vmode == V2SImode))
return true;
 
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 215162dedb5..271c1c2e833 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -938,32 +938,85 @@
 ;
 
 (define_insn "mmx_pswapdv2sf2"
-  [(set (match_operand:V2SF 0 "register_operand" "=y")
-   (vec_select:V2SF (match_operand:V2SF 1 "nonimmediate_operand" "ym")
-(parallel [(const_int 1) (const_int 0)])))]
-  "TARGET_3DNOW_A"
-  "pswapd\t{%1, %0|%0, %1}"
-  [(set_attr "type" "mmxcvt")
-   (set_attr "prefix_extra" "1")
-   (set_attr "mode" "V2SF")])
+  [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv")
+   (vec_select:V2SF
+ (match_operand:V2SF 1 "register_mmxmem_operand" "ym,0,Yv")
+ (parallel [(const_int 1) (const_int 0)])))]
+  "TARGET_3DNOW_A || TARGET_MMX_WITH_SSE"
+  "@
+   pswapd\t{%1, %0|%0, %1}
+   shufps\t{$0xe1, %1, %0|%0, %1, 0xe1}
+   vshufps\t{$0xe1, %1, %1, %0|%0, %1, %1, 0xe1}"
+  [(set_attr "isa" "*,sse_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
+   (set_attr "type" "mmxcvt,ssemov,ssemov")
+   (set_attr "prefix_extra" "1,*,*")
+   (set_attr "mode" "V2SF,V4SF,V4SF")])
+
+(define_insn "*mmx_movshdup"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,x")
+   (vec_select:V2SF
+ (match_operand:V2SF 1 "register_operand" "v,0")
+ (parallel [(const_int 1) (const_int 1)])))]
+  "TARGET_MMX_WITH_SSE"
+  "@
+   %vmovshdup\t{%1, %0|%0, %1}
+   shufps\t{$0xe5, %0, %0|%0, %0, 0xe5}"
+  [(set_attr "isa" "sse3,*")
+   (set_attr "type" "sse,sseshuf1")
+   (set_attr "length_immediate" "*,1")
+   (set_attr "prefix_rep" "1,*")
+   (set_attr "prefix" "maybe_vex,orig")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "*mmx_movsldup"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,x")
+   (vec_select:V2SF
+ (match_operand:V2SF 1 "register_operand" "v,0")
+ (parallel [(const_int 0) (const_int 0)])))]
+  "TARGET_MMX_WITH_SSE"
+  "@
+   %vmovsldup\t{%1, %0|%0, %1}
+   shufps\t{$0xe0, %0, %0|%0, %0, 0xe0}"
+  [(set_attr "isa" "sse3,*")
+   (set_attr "type" "sse,sseshuf1")
+   (set_attr "length_immediate" "*,1")
+   (set_attr "prefix_rep" "1,*")
+   (set_attr "prefix" "maybe_vex,orig")
+   (set_attr "mode" "V4SF")])
 
 (define_insn "*vec_dupv2sf"
-  [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv")
+  [(set (match_operand:V2SF 0 "register_operand" "=y,Yv,x")

Re: Broken build

2020-05-27 Thread Hans-Peter Nilsson via Gcc-patches
> From: Alexandre Oliva 
> Date: Wed, 27 May 2020 16:30:07 +0200

> On May 26, 2020, Hans-Peter Nilsson  wrote:
> 
> >> Here's a proper patch submission.
> 
> > And here's an improper bug report.
> 
> :-)
> 
> Thanks, H-P,
> 
> > xgcc: error: : No such file or directory
> 
> Interesting...  If you cut the command line that you included in
> your so-called improper bug report ;-) do you get this error?

Aha: no.

> I ask because this error suggests an empty argument passed to
> GCC.

And ignored before your rewrite?  Maybe.  If so, I'm guessing
there are lots of build systems out there carrying this kind of
gem.

>  This
> test is the very kind of scenario (with multiple compiler inputs) in
> which the testsuite would implicitly arrange to pass "-dumpbase ''" to
> the compiler driver.  I don't see -dumpbase in the command line, though,
> but there might somehow be a '' not visible in gcc.log.
> 
> Could I possibly ask what build and host systems you've used, and what
> your dejagnu board configuration file looks like?

Debian 9, x86_64.  (Nothing Canadian.)

Oops, I'm using a local baseboard file, but it looks mostly like
the cris-sim.exp baseboard file in Debian's dejagnu-1.6-1.1 (I'd
guess also the official dejagnu-1.6; I doubt there are
CRIS-specific distro diffs).  Diffs are in the copyright header,
some ";" cleanups, and some target selector supposed-cleanup.

To wit, the lines
 set cris_ldopt "-sim3"
and
 set_board_info ldflags "[libgloss_link_flags] [newlib_link_flags] $cris_ldopt"
are the same.

> > -L/netapp/hp3_storage/hp/autotest/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib
> > -sim3 -lm -o
> 
> Is it correct to assume that -sim3 is NOT a separate command-line
> option,

No, see above: it's a separate command-line option.

> and that there is a directory named newlib-sim3 ?

No, see cris.h:LIB_SPEC and STARTFILE_SPEC.

> > Can you please have a look?
> 
> Sure, thanks!  Sorry about this undesirable surprise.

Thanks for looking.

brgds, H-P


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Andreas Schwab
On May 27 2020, Alexandre Oliva wrote:

> On May 27, 2020, Andreas Schwab  wrote:
>
>> Looks like tcl 8.5.5 has a bug:
>
> Ugh, how unfortunate.

In fact, that bug exists in all versions.

https://core.tcl-lang.org/tcl/tktview?name=5bbd044812

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH, committed] [9/10/11 Regression] PR fortran/95104 - Segfault on a legal WAIT statement

2020-05-27 Thread Thomas Koenig via Gcc-patches

On 26.05.20 at 23:33, Harald Anlauf wrote:

Committed as obvious.

The invalid NULL pointer dereference was discovered by Steve Kargl.

Will backport in a few days, when I figure out how to do it now.


Thanks for committing this.

The way to backport now is to first run contrib/gcc-git-customization.sh
from current master, and then change to the branch you want to
backport this to and run

git gcc-backport r11-646-g56f03cd12be26828788a27f6f3c250041a958e45 .

(or what your revision may be).

I just tried it, and it works well.

Regards

Thomas


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Alexandre Oliva
On May 27, 2020, Andreas Schwab  wrote:

> Looks like tcl 8.5.5 has a bug:

Ugh, how unfortunate.


> % glob -nocomplain -path {} -- {a.{out,exe}}
> % glob -nocomplain -path {} -- {a.{out,exe}*}
> a.out

Thanks for tracking that down, I'll put in some work around for that.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


[PATCH] gcc-changelog: enhance handling of renamings

2020-05-27 Thread Pierre-Marie de Rodat
So far, we expect from a commit that renames a file to contain a
changelog entry only for the new name. For example, after the following
commit:

   $ git mv foo bar
   $ git commit

We expect the following changelog:

   * bar: Renamed from foo.

Git does not keep track of renamings, only file deletions and additions.
The display of patches then uses heuristics (with config-dependent
parameters) to try to match deleted and added files in the same commit.
It is thus brittle to rely on this information.

This commit modifies changelog processing so that renames are considered
as a deletion of a file plus an addition of another file. The following
changelog is now expected for the above example:

   * foo: Move...
   * bar: Here.
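
As a rough illustration of the new expectation (a standalone sketch, not the
code in this patch; the function name `expand_renames` and the
`(old_path, new_path, kind)` input tuples are hypothetical), a rename can be
expanded into a deletion plus an addition like this:

```python
def expand_renames(changes):
    """Turn (old_path, new_path, kind) tuples into (path, type) pairs.

    type is 'A' (added), 'D' (deleted) or 'M' (modified).  The input shape,
    with kind in {'add', 'delete', 'rename', 'modify'}, is an assumption
    made for this sketch only.
    """
    modified_files = []
    for old_path, new_path, kind in changes:
        if kind == 'add':
            modified_files.append((new_path, 'A'))
        elif kind == 'delete':
            modified_files.append((old_path, 'D'))
        elif kind == 'rename':
            # A rename counts as two operations: the old name is deleted
            # and the new name is added.
            modified_files.append((old_path, 'D'))
            modified_files.append((new_path, 'A'))
        else:
            modified_files.append((new_path, 'M'))
    return modified_files
```

With this model, renaming foo to bar yields one 'D' entry for foo and one 'A'
entry for bar, so both names need a changelog line.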

contrib/

* gcc-changelog/git_email.py (GitEmail.__init__): Interpret file
renamings as a file deletion plus a file addition.
* gcc-changelog/git_repository.py (parse_git_revisions):
Likewise.
* gcc-changelog/test_email.py: New testcase.
* gcc-changelog/test_patches.txt: New testcase.
---
 contrib/gcc-changelog/git_email.py  |   5 +
 contrib/gcc-changelog/git_repository.py |   5 +
 contrib/gcc-changelog/test_email.py |   5 +
 contrib/gcc-changelog/test_patches.txt  | 153 
 4 files changed, 168 insertions(+)

diff --git a/contrib/gcc-changelog/git_email.py 
b/contrib/gcc-changelog/git_email.py
index 8c9df293a66..6e42629cf07 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -54,6 +54,11 @@ class GitEmail(GitCommit):
 t = 'A'
 elif f.is_removed_file:
 t = 'D'
+elif f.is_rename:
+# Consider that renamed files are two operations: the deletion
+# of the original name and the addition of the new one.
+modified_files.append((f.target_file[2:], 'A'))
+t = 'D'
 else:
 t = 'M'
 modified_files.append((f.path, t))
diff --git a/contrib/gcc-changelog/git_repository.py 
b/contrib/gcc-changelog/git_repository.py
index 0473fe73fba..e3b6c4d7a38 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -47,6 +47,11 @@ def parse_git_revisions(repo_path, revisions, strict=False):
 t = 'A'
 elif file.deleted_file:
 t = 'D'
+elif file.renamed_file:
+# Consider that renamed files are two operations: the deletion
+# of the original name and the addition of the new one.
+modified_files.append((file.a_path, 'D'))
+t = 'A'
 else:
 t = 'M'
 modified_files.append((file.b_path, t))
diff --git a/contrib/gcc-changelog/test_email.py 
b/contrib/gcc-changelog/test_email.py
index 3d2c8ff2412..c188fe9b276 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -295,3 +295,8 @@ class TestGccChangelog(unittest.TestCase):
 'sem_ch12.adb', 'sem_ch4.adb', 'sem_ch7.adb',
 'sem_ch8.adb', 'sem_elab.adb', 'sem_type.adb',
 'sem_util.adb'])
+
+def test_renamed_file(self):
+email = self.from_patch_glob(
+'0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch')
+assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt 
b/contrib/gcc-changelog/test_patches.txt
index 06869bff504..cc81fcd32b8 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -2741,3 +2741,156 @@ index b980b4c..c1b1d9e 100644
 -- 
 2.1.4
 
+=== 0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch ===
+From ed248d9bc3b72b6888a1b9cd84a8ef26809249f0 Mon Sep 17 00:00:00 2001
+From: Arnaud Charlet 
+Date: Thu, 23 Apr 2020 05:46:29 -0400
+Subject: [PATCH] [Ada] Add support for XDR streaming in the default runtime
+
+--!# FROM: /homes/derodat/tron/gnat2fsf/gnat
+--!# COMMIT: 5ad4cabb9f70114eb61c025e91406d4fba253f95
+--!# Change-Id: I21f92cad27933747495cdfa544a048f62f944cbd
+--!# TN: T423-014
+
+Currently we provide a separate implementation of Stream_Attributes via
+s-stratt__xdr.adb which needs to be recompiled manually.
+
+This change introduces instead a new binder switch to choose at bind
+time which stream implementation to use and replaces s-stratt__xdr.adb
+by a new unit System.Stream_Attributes.XDR.
+
+2020-05-04  Arnaud Charlet  
+
+gcc/ada/
+
+   * Makefile.rtl: Add s-statxd.o.
+   * bindgen.adb (Gen_Adainit): Add support for XDR_Stream.
+   * bindusg.adb (Display): Add mention of -xdr.
+   * gnatbind.adb: Process -xdr switch.
+   * init.c (__gl_xdr_stream): New.
+   * opt.ads (XDR_Stream): New.
+   * libgnat/s-stratt__xdr.adb: Rename to...
+   * libgnat/s-statxd.adb: this and adjust.
+   * libgnat/s-statxd.ads: New.
+   * 

Re: Broken build

2020-05-27 Thread Alexandre Oliva
On May 26, 2020, Hans-Peter Nilsson  wrote:

>> Here's a proper patch submission.

> And here's an improper bug report.

:-)

Thanks, H-P,

> xgcc: error: : No such file or directory

Interesting...  If you cut the command line that you included in
your so-called improper bug report ;-) do you get this error?

I ask because this error suggests an empty argument passed to GCC.  This
test is the very kind of scenario (with multiple compiler inputs) in
which the testsuite would implicitly arrange to pass "-dumpbase ''" to
the compiler driver.  I don't see -dumpbase in the command line, though,
but there might somehow be a '' not visible in gcc.log.


Could I possibly ask what build and host systems you've used, and what
your dejagnu board configuration file looks like?


> -L/netapp/hp3_storage/hp/autotest/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib
> -sim3 -lm -o

Is it correct to assume that -sim3 is NOT a separate command-line
option, and that there is a directory named newlib-sim3 ?

> Can you please have a look?

Sure, thanks!  Sorry about this undesirable surprise.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


[pushed] c++: operator<=> and -Wzero-as-null-pointer-constant [PR95242]

2020-05-27 Thread Jason Merrill via Gcc-patches
In C++20, if there is no viable operator< available, lhs < rhs gets
rewritten to (lhs <=> rhs) < 0, where operator< for the comparison
categories is intended to accept literal 0 on the RHS but not other
integers.  We don't want this to produce a warning from
 -Wzero-as-null-pointer-constant.

Tested x86_64-pc-linux-gnu, applying to trunk and 10.

gcc/cp/ChangeLog:

* call.c (build_new_op_1): Suppress
warn_zero_as_null_pointer_constant across comparison of <=> result
to 0.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-synth2.C: Add
-Wzero-as-null-pointer-constant.
---
 gcc/cp/call.c | 1 +
 gcc/testsuite/g++.dg/cpp2a/spaceship-synth2.C | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d8582883917..a51ebb5d9e3 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6410,6 +6410,7 @@ build_new_op_1 (const op_location_t , enum tree_code 
code, int flags,
tree rhs = integer_zero_node;
if (cand->reversed ())
  std::swap (lhs, rhs);
+   warning_sentinel ws (warn_zero_as_null_pointer_constant);
result = build_new_op (loc, code,
   LOOKUP_NORMAL|LOOKUP_REWRITTEN,
   lhs, rhs, NULL_TREE,
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-synth2.C 
b/gcc/testsuite/g++.dg/cpp2a/spaceship-synth2.C
index e6401d29ef0..9b6cfa081d1 100644
--- a/gcc/testsuite/g++.dg/cpp2a/spaceship-synth2.C
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-synth2.C
@@ -1,6 +1,9 @@
 // Test with only spaceship defaulted.
 // { dg-do run { target c++20 } }
 
+// Add this warning to test PR c++/95242
+// { dg-additional-options -Wzero-as-null-pointer-constant }
+
 #include 
 
 struct D

base-commit: ac9face8d26ea4b6aa72902ecc22e89ef00763c5
-- 
2.18.1



[pushed] c++: Fix stdcall attribute in template. [PR95222]

2020-05-27 Thread Jason Merrill via Gcc-patches
Another case that breaks with my fix for PR90750: we shouldn't move type
attributes in TYPENAME context either, as there's no decl for them to move
to.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95222
* decl.c (grokdeclarator): Don't shift attributes in TYPENAME
context.

gcc/testsuite/ChangeLog:

PR c++/95222
* g++.dg/ext/tmplattr10.C: New test.
---
 gcc/cp/decl.c |  2 +-
 gcc/testsuite/g++.dg/ext/tmplattr10.C | 52 +++
 2 files changed, 53 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/tmplattr10.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 2e1390837e8..5476965996b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -11951,7 +11951,7 @@ grokdeclarator (const cp_declarator *declarator,
  if (declarator->kind == cdk_array)
attr_flags |= (int) ATTR_FLAG_ARRAY_NEXT;
  tree late_attrs = NULL_TREE;
- if (decl_context != PARM)
+ if (decl_context != PARM && decl_context != TYPENAME)
/* Assume that any attributes that get applied late to
   templates will DTRT when applied to the declaration
   as a whole.  */
diff --git a/gcc/testsuite/g++.dg/ext/tmplattr10.C 
b/gcc/testsuite/g++.dg/ext/tmplattr10.C
new file mode 100644
index 000..3fb8c21ccbe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/tmplattr10.C
@@ -0,0 +1,52 @@
+// PR c++/95222
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
+
+#if defined(_MSC_VER)
+#define CC_FASTCALL __fastcall
+#define CC_STDCALL __stdcall
+#else
+#define CC_FASTCALL __attribute__((fastcall))
+#define CC_STDCALL __attribute__((stdcall))
+#endif
+
+template 
+struct FuncResult;
+
+template 
+struct FuncResult
+{
+using type = R;
+};
+
+template 
+struct FuncResult
+{
+using type = R;
+};
+
+template 
+struct FuncResult
+{
+using type = R;
+};
+
+template 
+auto wrap(FuncT f) -> typename FuncResult::type
+{
+return f(1, 2, 3);
+}
+
+int CC_FASTCALL func1(int x, int y, int z)
+{
+return x + y + z;
+}
+
+int CC_STDCALL func2(int x, int y, int z)
+{
+return x + y + z;
+}
+
+int main()
+{
+return wrap() + wrap();
+}

base-commit: ac9face8d26ea4b6aa72902ecc22e89ef00763c5
-- 
2.18.1



Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Segher Boessenkool
On Wed, May 27, 2020 at 09:25:43AM +0200, Richard Biener wrote:
> On Tue, 26 May 2020, Segher Boessenkool wrote:
> > On Tue, May 26, 2020 at 01:29:30PM +0100, Richard Sandiford wrote:
> > > FWIW, I agree adding .LEN_LOAD and .LEN_STORE seems like a good
> > > approach.  I think it'll be more maintainable in the long run than
> > > trying to have .MASK_LOADs and .MASK_STOREs that need a special mask
> > > operand.  (That would be too similar to VEC_COND_EXPR :-))
> > > 
> > > Not sure yet what the exact semantics wrt out-of-range values for
> > > the IFN/optab though.  Maybe we should instead have some kind of
> > > abstract, target-specific cookie created by a separate intrinsic.
> > > Haven't thought much about it yet...
> > 
> > Or maybe only support 0..N with N the length of the vector?  It is
> > pretty important to support 0 and N, but greater than N isn't as
> > important (it is useful for tricky hand-written code, but not as much
> > for compiler-generate code -- we only support an 8-bit number here on
> > Power, maybe that is why ;-) )
> 
> The question is one of semantics - if power masks the length to an
> 8 bit number it's important to preprocess the IV.

In the instructions it *is* an 8 bit number (it is the top 8 bits of a
GPR).

> As with my
> other suggestion the question is what to expose to the IL (to GIMPLE)
> here.

Yes, I understand that.  Hence my answer :-)

Only multiples of element size would be fine as well of course.

> Exposing as much as possible will help IV selection but
> will eventually require IFN variations for different semantics.
> 
> So yes, 0..N sounds about right here and we'll require a MIN ()
> operation and likely need to teach IV selection about this to at least
> possibly get an IV with the byte size multiplication factored.

Maybe we should have a hook to say which lengths are allowed for which
element type?

And, how does this work for variable lengths (the usual case!)
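
For concreteness, the 0..N scheme with a MIN () on the remaining length
amounts to something like the following toy model (Python for brevity, not
GCC code; `chunk_lengths` is a made-up name, and lengths are counted in bytes
here purely as an assumption):

```python
def chunk_lengths(total_bytes, vector_bytes):
    """Return the length passed to each load/store when walking
    total_bytes of data with vectors of vector_bytes bytes.

    Every length is clamped to 0..vector_bytes with min(), so the
    operation never sees an out-of-range value.
    """
    if vector_bytes <= 0:
        raise ValueError("vector_bytes must be positive")
    lengths = []
    remaining = total_bytes
    while remaining > 0:
        length = min(remaining, vector_bytes)  # the MIN () operation
        lengths.append(length)
        remaining -= length
    return lengths
```

Only the last iteration gets a partial length; whether that length should be
a byte count or an element count is exactly the open question above.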


Segher


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-27 Thread Martin Liška

On 5/26/20 12:15 PM, Richard Sandiford wrote:

So longer-term, I think we should replace VCOND(U) with individual ifns,
like for VCONDEQ.  We could reduce the number of optabs needed by
canonicalising greater-based tests to lesser-based tests.


Hello.

Thanks for the feedback. So would it be possible to go with something
like DEF_INTERNAL_OPTAB_CAN_FAIL (see the attachment)?

I'm sending the complete patch that survives bootstrap and regression
tests on x86_64-linux-gnu and ppc64le-linux-gnu.
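
The idea behind the new macro can be modelled roughly as follows (a toy
sketch in Python, not the generator code itself; `build_nofail_table` and the
entry tuples are hypothetical): entries are processed in file order, and a
later "can fail" entry overrides the default set by the optab function entry.

```python
def build_nofail_table(entries):
    """entries is an ordered list of (kind, optab) pairs, where kind is
    'optab_fn' or 'can_fail'.  Later entries win, mirroring how the
    definitions would be expanded in file order."""
    nofail = {}
    for kind, optab in entries:
        # An optab function marks its optab as "must not fail";
        # a can_fail entry clears that mark again.
        nofail[optab] = (kind == "optab_fn")
    return nofail
```

So an optab such as vcond ends up marked as allowed to fail as long as its
"can fail" entry comes after the corresponding function definition.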

Martin
diff --git a/gcc/genemit.c b/gcc/genemit.c
index 84d07d388ee..23c89dbf4e9 100644
--- a/gcc/genemit.c
+++ b/gcc/genemit.c
@@ -857,6 +857,9 @@ main (int argc, const char **argv)
 
 #define DEF_INTERNAL_OPTAB_FN(NAME, FLAGS, OPTAB, TYPE) \
   nofail_optabs[OPTAB##_optab] = true;
+
+#define DEF_INTERNAL_OPTAB_CAN_FAIL(OPTAB) \
+  nofail_optabs[OPTAB##_optab] = false;
 #include "internal-fn.def"
 
   /* Assign sequential codes to all entries in the machine description
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 0c6fc371190..373273de2c2 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
    UNSIGNED_OPTAB, TYPE)
  DEF_INTERNAL_FLT_FN (NAME, FLAGS, OPTAB, TYPE)
  DEF_INTERNAL_INT_FN (NAME, FLAGS, OPTAB, TYPE)
+ DEF_INTERNAL_OPTAB_CAN_FAIL (OPTAB)
 
where NAME is the name of the function, FLAGS is a set of
ECF_* flags and FNSPEC is a string describing functions fnspec.
@@ -86,7 +87,10 @@ along with GCC; see the file COPYING3.  If not see
 
where STMT is the statement that performs the call.  These are generated
automatically for optab functions and call out to a function or macro
-   called expand__optab_fn.  */
+   called expand__optab_fn.
+
+   DEF_INTERNAL_OPTAB_CAN_FAIL defines tables that are used for GIMPLE
+   instruction selection and do not map directly to instructions.  */
 
 #ifndef DEF_INTERNAL_FN
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC)
@@ -118,6 +122,10 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_CAN_FAIL
+#define DEF_INTERNAL_OPTAB_CAN_FAIL(OPTAB)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -141,6 +149,11 @@ DEF_INTERNAL_OPTAB_FN (VCONDU, 0, vcondu, vec_condu)
 DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_condeq)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
 
+DEF_INTERNAL_OPTAB_CAN_FAIL (vcond)
+DEF_INTERNAL_OPTAB_CAN_FAIL (vcondu)
+DEF_INTERNAL_OPTAB_CAN_FAIL (vcondeq)
+DEF_INTERNAL_OPTAB_CAN_FAIL (vcond_mask)
+
 DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
 DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW,
 		   check_raw_ptrs, check_ptrs)
@@ -385,4 +398,5 @@ DEF_INTERNAL_FN (NOP, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 #undef DEF_INTERNAL_FLT_FLOATN_FN
 #undef DEF_INTERNAL_SIGNED_OPTAB_FN
 #undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_OPTAB_CAN_FAIL
 #undef DEF_INTERNAL_FN
>From 981952917dea9aaef3be13375fbf452566d926b3 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 9 Mar 2020 13:23:03 +0100
Subject: [PATCH] Lower VEC_COND_EXPR into internal functions.

gcc/ChangeLog:

2020-03-30  Martin Liska  

	* expr.c (expand_expr_real_2): Put gcc_unreachable, we should not
	reach this path.
	(do_store_flag): Likewise here.
	* internal-fn.c (vec_cond_mask_direct): New.
	(vec_cond_direct): Likewise.
	(vec_condu_direct): Likewise.
	(vec_condeq_direct): Likewise.
	(expand_vect_cond_optab_fn): Move from optabs.c.
	(expand_vec_cond_optab_fn): New alias.
	(expand_vec_condu_optab_fn): Likewise.
	(expand_vec_condeq_optab_fn): Likewise.
	(expand_vect_cond_mask_optab_fn): Moved from optabs.c.
	(expand_vec_cond_mask_optab_fn): New alias.
	(direct_vec_cond_mask_optab_supported_p): New.
	(direct_vec_cond_optab_supported_p): Likewise.
	(direct_vec_condu_optab_supported_p): Likewise.
	(direct_vec_condeq_optab_supported_p): Likewise.
	* internal-fn.def (DEF_INTERNAL_OPTAB_CAN_FAIL):
	(VCOND): New internal optab function.
	(VCONDU): Likewise.
	(VCONDEQ): Likewise.
	(VCOND_MASK): Likewise.
	* optabs.c (expand_vec_cond_mask_expr): Removed.
	(expand_vec_cond_expr): Likewise.
	* optabs.h (expand_vec_cond_expr): Likewise.
	(vector_compare_rtx): Likewise.
	* passes.def: Add pass_gimple_isel.
	* tree-cfg.c (verify_gimple_assign_ternary): Add new
	GIMPLE check.
	* tree-pass.h (make_pass_gimple_isel): New.
	* tree-ssa-forwprop.c (pass_forwprop::execute): Do not forward
	to already lowered VEC_COND_EXPR.
	* tree-vect-generic.c (expand_vector_divmod): Expand to SSA_NAME.
	(expand_vector_condition): Expand tcc_comparison of a VEC_COND_EXPR
	into a SSA_NAME.
	(gimple_expand_vec_cond_expr): New.
	(gimple_expand_vec_cond_exprs): New.
	(class pass_gimple_isel): New.
	

Re: [patch] Add support for __builtin_bswap128

2020-05-27 Thread Richard Biener via Gcc-patches
On Wed, May 27, 2020 at 3:33 PM Eric Botcazou  wrote:
>
> > Please use int128 effective target rather than lp64 in the tests that need
> > __int128 type.
>
> OK, thanks, adjusted locally.

OK.

Thanks,
Richard.

> --
> Eric Botcazou


[PR c++/95263] Revert alias template change

2020-05-27 Thread Nathan Sidwell
Sadly my attempt to make some alias template construction immutable 
doesn't always apply.  Reverting that patch and I guess more work needed 
on modules :(


nathan
--
Nathan Sidwell
2020-05-27  Nathan Sidwell  

	PR c++/95263, revert 74744bb1f2847b5b9ce3e97e0fec9c23bb0e499f
	* pt.c (lookup_template_class_1): Restore alias template mutation.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index c17a038c6d0..4d9651acee6 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -10062,21 +10062,8 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
 	}
 	}
 
-  /* Build template info for the new specialization.  This can
-	 overwrite the existing TEMPLATE_INFO for T (that points to
-	 its instantiated TEMPLATE_DECL), with this one that points to
-	 the most general template, but that's what we want.  */
-
-  if (TYPE_ALIAS_P (t))
-	{
-	  /* This should already have been constructed during
-	 instantiation of the alias decl.  */
-	  tree ti = DECL_TEMPLATE_INFO (TYPE_NAME (t));
-	  gcc_checking_assert (template_args_equal (TI_ARGS (ti), arglist)
-			   && TI_TEMPLATE (ti) == found);
-	}
-  else
-	SET_TYPE_TEMPLATE_INFO (t, build_template_info (found, arglist));
+  // Build template info for the new specialization.
+  SET_TYPE_TEMPLATE_INFO (t, build_template_info (found, arglist));
 
   elt.spec = t;
   slot = type_specializations->find_slot_with_hash (, hash, INSERT);
diff --git c/gcc/testsuite/g++.dg/template/pr95263.C w/gcc/testsuite/g++.dg/template/pr95263.C
new file mode 100644
index 000..08a1b8730c0
--- /dev/null
+++ w/gcc/testsuite/g++.dg/template/pr95263.C
@@ -0,0 +1,23 @@
+// { dg-do compile { target c++11 } }
+// PR C++/95263
+// ICE on alias template instantiation
+
+template  class TPL {
+  template  using INT = int;
+};
+
+template  class Klass
+{
+public:
+  template  using ALIAS = typename TPL::INT;
+
+  template  static void FUNC (); // OK
+
+  template  static ALIAS FUNC (); // SFINAE ICE
+};
+
+void Fn ()
+{
+  Klass::FUNC<0> ();
+}
+


[PATCH 2/2] Code generate externals/invariants during the SLP graph walk

2020-05-27 Thread Richard Biener


This reaps the benefit of having the correct vector types on invariant
SLP nodes.  Bootstrap / regtest of this small series is underway on
x86_64-unknown-linux-gnu.  Comments welcome.

Thanks,
Richard.


This generates vector defs for externals and invariants during the SLP
walk rather than as part of getting vectorized defs when vectorizing
the users.  This is a requirement to make sharing of external/invariant
nodes be reflected in actual code generation.

This temporarily adds a SLP_TREE_VEC_DEFS vector alongside the
SLP_TREE_VEC_STMTS one.  Eventually the latter can go away.
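
A toy model of the intended effect (Python rather than GCC code; `Node` and
`schedule` are hypothetical stand-ins, not the real SLP types): when defs are
created during the graph walk and stored on the node, an invariant node that
is shared by several users is code-generated only once.

```python
class Node:
    """Toy stand-in for an SLP node (not the GCC data structure)."""

    def __init__(self, name, invariant=False, children=()):
        self.name = name
        self.invariant = invariant
        self.children = list(children)
        self.vec_defs = None  # filled in once, during the walk


def schedule(node, visited, log):
    """Post-order walk that code-generates each node exactly once."""
    if id(node) in visited:
        return
    visited.add(id(node))
    for child in node.children:
        schedule(child, visited, log)
    if node.invariant:
        # Externals/invariants get their vector defs here, not lazily
        # when each user is vectorized.
        node.vec_defs = ["vect_cst_" + node.name]
        log.append("defs:" + node.name)
    else:
        log.append("stmts:" + node.name)
```

With two users sharing one invariant child, the log contains a single "defs"
entry for it, and both users read the same stored defs.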

2020-05-27  Richard Biener  

* tree-vectorizer.h (_slp_tree::vec_defs): Add.
(SLP_TREE_VEC_DEFS): Likewise.
* tree-vect-slp.c (_slp_tree::_slp_tree): Adjust.
(_slp_tree::~_slp_tree): Likewise.
(vect_mask_constant_operand_p): Remove unused function.
(vect_get_constant_vectors): Rename to...
(vect_create_constant_vectors): ... this.  Take the
invariant node as argument and code generate it.  Remove
dead code, remove temporary asserts.  Pass a NULL stmt_info
to vect_init_vector.
(vect_get_slp_defs): Simplify.
(vect_schedule_slp_instance): Code-generate externals and
invariants using vect_create_constant_vectors.
---
 gcc/tree-vect-slp.c   | 158 +++---
 gcc/tree-vectorizer.h |   2 +
 2 files changed, 42 insertions(+), 118 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index a6c5a9d9dc4..aa95c0a7f75 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -53,7 +53,8 @@ _slp_tree::_slp_tree ()
 {
   SLP_TREE_SCALAR_STMTS (this) = vNULL;
   SLP_TREE_SCALAR_OPS (this) = vNULL;
-  SLP_TREE_VEC_STMTS (this).create (0);
+  SLP_TREE_VEC_STMTS (this) = vNULL;
+  SLP_TREE_VEC_DEFS (this) = vNULL;
   SLP_TREE_NUMBER_OF_VEC_STMTS (this) = 0;
   SLP_TREE_CHILDREN (this) = vNULL;
   SLP_TREE_LOAD_PERMUTATION (this) = vNULL;
@@ -72,6 +73,7 @@ _slp_tree::~_slp_tree ()
   SLP_TREE_SCALAR_STMTS (this).release ();
   SLP_TREE_SCALAR_OPS (this).release ();
   SLP_TREE_VEC_STMTS (this).release ();
+  SLP_TREE_VEC_DEFS (this).release ();
   SLP_TREE_LOAD_PERMUTATION (this).release ();
 }
 
@@ -3480,56 +3482,6 @@ vect_slp_bb (basic_block bb)
 }
 
 
-/* Return 1 if vector type STMT_VINFO is a boolean vector.  */
-
-static bool
-vect_mask_constant_operand_p (vec_info *vinfo,
- stmt_vec_info stmt_vinfo, unsigned op_num)
-{
-  enum tree_code code = gimple_expr_code (stmt_vinfo->stmt);
-  tree op, vectype;
-  enum vect_def_type dt;
-
-  /* For comparison and COND_EXPR type is chosen depending
- on the non-constant other comparison operand.  */
-  if (TREE_CODE_CLASS (code) == tcc_comparison)
-{
-  gassign *stmt = as_a  (stmt_vinfo->stmt);
-  op = gimple_assign_rhs1 (stmt);
-
-  if (!vect_is_simple_use (op, vinfo, , ))
-   gcc_unreachable ();
-
-  return !vectype || VECTOR_BOOLEAN_TYPE_P (vectype);
-}
-
-  if (code == COND_EXPR)
-{
-  gassign *stmt = as_a  (stmt_vinfo->stmt);
-  tree cond = gimple_assign_rhs1 (stmt);
-
-  if (TREE_CODE (cond) == SSA_NAME)
-   {
- if (op_num > 0)
-   return VECTOR_BOOLEAN_TYPE_P (STMT_VINFO_VECTYPE (stmt_vinfo));
- op = cond;
-   }
-  else
-   {
- if (op_num > 1)
-   return VECTOR_BOOLEAN_TYPE_P (STMT_VINFO_VECTYPE (stmt_vinfo));
- op = TREE_OPERAND (cond, 0);
-   }
-
-  if (!vect_is_simple_use (op, vinfo, , ))
-   gcc_unreachable ();
-
-  return !vectype || VECTOR_BOOLEAN_TYPE_P (vectype);
-}
-
-  return VECTOR_BOOLEAN_TYPE_P (STMT_VINFO_VECTYPE (stmt_vinfo));
-}
-
 /* Build a variable-length vector in which the elements in ELTS are repeated
to a fill NRESULTS vectors of type VECTOR_TYPE.  Store the vectors in
RESULTS and add any new instructions to SEQ.
@@ -3644,18 +3596,13 @@ duplicate_and_interleave (vec_info *vinfo, gimple_seq 
*seq, tree vector_type,
 }
 
 
-/* For constant and loop invariant defs of SLP_NODE this function returns
-   (vector) defs (VEC_OPRNDS) that will be used in the vectorized stmts.
-   OP_NODE determines the node for the operand containing the scalar
-   operands.  */
+/* For constant and loop invariant defs in OP_NODE this function creates
+   vector defs that will be used in the vectorized stmts and stores them
+   to SLP_TREE_VEC_DEFS of OP_NODE.  */
 
 static void
-vect_get_constant_vectors (vec_info *vinfo,
-  slp_tree slp_node, unsigned op_num,
-   vec *vec_oprnds)
+vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
 {
-  slp_tree op_node = SLP_TREE_CHILDREN (slp_node)[op_num];
-  stmt_vec_info stmt_vinfo = SLP_TREE_SCALAR_STMTS (slp_node)[0];
   unsigned HOST_WIDE_INT nunits;
   tree vec_cst;
   unsigned j, number_of_places_left_in_vector;
@@ -3665,29 +3612,14 @@ vect_get_constant_vectors (vec_info 

[PATCH 1/2] make vect_finish_stmt_generation work w/o stmt_vec_info

2020-05-27 Thread Richard Biener
This makes the call chain below vect_init_vector happy with a NULL
stmt_vec_info which is used as "context".
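
In rough terms the control flow becomes the following (a Python toy model of
the logic only, with statements as plain dicts; the key names are invented
for the sketch and none of this is the GCC API):

```python
def finish_stmt_generation(vec_stmt, stmt_info=None):
    """Finish a newly created statement, with an optional context.

    vec_stmt and stmt_info are dicts in this model.  With a context, the
    new statement inherits its location and, if it could throw, its EH
    region.  Without one, the new statement must not be able to throw.
    """
    if stmt_info is not None:
        vec_stmt["location"] = stmt_info["location"]
        lp_nr = stmt_info.get("eh_lp", 0)
        if lp_nr != 0 and vec_stmt.get("could_throw", False):
            vec_stmt["eh_lp"] = lp_nr
    else:
        assert not vec_stmt.get("could_throw", False)
    return vec_stmt
```

A caller with no originating statement, such as code emitting an invariant
vector, can then pass no context as long as the new statement cannot throw.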

2020-05-27  Richard Biener  

* tree-vect-stmts.c (vect_finish_stmt_generation_1):
Conditionalize stmt_info use, assert the new stmt cannot throw
when not specified.
(vect_finish_stmt_generation): Adjust assert.
---
 gcc/tree-vect-stmts.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 35043ecd0f9..901999be058 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1668,14 +1668,19 @@ vect_finish_stmt_generation_1 (vec_info *vinfo,
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "add new stmt: %G", vec_stmt);
 
-  gimple_set_location (vec_stmt, gimple_location (stmt_info->stmt));
+  if (stmt_info)
+{
+  gimple_set_location (vec_stmt, gimple_location (stmt_info->stmt));
 
-  /* While EH edges will generally prevent vectorization, stmt might
- e.g. be in a must-not-throw region.  Ensure newly created stmts
- that could throw are part of the same region.  */
-  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
-  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
-add_stmt_to_eh_lp (vec_stmt, lp_nr);
+  /* While EH edges will generally prevent vectorization, stmt might
+e.g. be in a must-not-throw region.  Ensure newly created stmts
+that could throw are part of the same region.  */
+  int lp_nr = lookup_stmt_eh_lp (stmt_info->stmt);
+  if (lp_nr != 0 && stmt_could_throw_p (cfun, vec_stmt))
+   add_stmt_to_eh_lp (vec_stmt, lp_nr);
+}
+  else
+gcc_assert (!stmt_could_throw_p (cfun, vec_stmt));
 
   return vec_stmt_info;
 }
@@ -1705,7 +1710,7 @@ vect_finish_stmt_generation (vec_info *vinfo,
 stmt_vec_info stmt_info, gimple *vec_stmt,
 gimple_stmt_iterator *gsi)
 {
-  gcc_assert (gimple_code (stmt_info->stmt) != GIMPLE_LABEL);
+  gcc_assert (!stmt_info || gimple_code (stmt_info->stmt) != GIMPLE_LABEL);
 
   if (!gsi_end_p (*gsi)
   && gimple_has_mem_ops (vec_stmt))
-- 
2.26.1



Re: zstd not found if installed in non-system prefix

2020-05-27 Thread Martin Liška

On 5/20/20 9:32 PM, Michael Kuhn wrote:

Hi,

when specifying a non-system prefix with --with-zstd, the build fails
because the header and library cannot be found (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95005).

The attached patch fixes the problem and is what we use in Spack to
make GCC build with zstd support.


Hello.

I support the patch, but we need to wait for an approval of a maintainer.

Martin




Best regards,
Michael





Re: [PATCH] PR fortran/95090 - ICE: identifier overflow

2020-05-27 Thread Thomas Koenig via Gcc-patches

On 26.05.20 at 23:16, Harald Anlauf wrote:

Yet another obvious case of insufficient size of a temporary buffer.

OK for master?


Yes.

Thanks a lot!

Regards

Thomas


Re: [PATCH] PR94397 the compiler consider "type is( real(kind(1.)) )" as a syntax error

2020-05-27 Thread Thomas Koenig via Gcc-patches

Hi Mark,


ping


the patch looks good to me.

Regards

Thomas


RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-27 Thread Yangfei (Felix)
Hi,

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, May 26, 2020 11:58 PM
> To: Yangfei (Felix) 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length
> 
> Sorry for the slow reply, was off for a few days.
> 
> I think the new code ought to happen earlier in emit_move_insn, before:
> 
>   if (CONSTANT_P (y))
> {
> 
> That way, all the canonicalisation happens on the mode we actually want the
> move to have.

OK. That’s a good point.

> "Yangfei (Felix)"  writes:
> > diff --git a/gcc/expr.c b/gcc/expr.c
> > index dfbeae71518..4442fb83367 100644
> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -3852,6 +3852,62 @@ emit_move_insn (rtx x, rtx y)
> >
> >gcc_assert (mode != BLKmode);
> >
> > +  rtx x_inner = NULL_RTX;
> > +  rtx y_inner = NULL_RTX;
> > +  machine_mode x_inner_mode, y_inner_mode;
> > +
> > +  if (SUBREG_P (x)
> > +  && REG_P (SUBREG_REG (x))
> > +  && known_eq (SUBREG_BYTE (x), 0))
> > +{
> > +  x_inner = SUBREG_REG (x);
> > +  x_inner_mode = GET_MODE (x_inner);
> > +}
> > +  if (SUBREG_P (y)
> > +  && REG_P (SUBREG_REG (y))
> > +  && known_eq (SUBREG_BYTE (y), 0))
> > +{
> > +  y_inner = SUBREG_REG (y);
> > +  y_inner_mode = GET_MODE (y_inner);
> > +}
> 
> The later code is only interested in SUBREG_REGs that are the same size as
> "mode", so I think it would make sense to check that in the "if"s above
> instead of checking SUBREG_BYTE.  (SUBREG_BYTE is always zero when the
> modes are the same size, but the reverse is not true.)
> 
> It might also be better to avoid [xy]_inner_mode and just use GET_MODE
> where necessary.
> 
> It would be good to have a block comment above the code to explain what
> we're doing.

Good suggestion. Done.
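
To spell out why the size check is the stronger test, here is a deliberately
simplified model (Python, ignoring endianness and register-class rules;
`possible_subreg_bytes` is a made-up helper, not a GCC function):

```python
def possible_subreg_bytes(outer_size, inner_size):
    """Byte offsets a subreg could have in this simplified model.

    If the outer mode is at least as large as the inner one, the only
    offset is 0.  A narrower outer mode can select any aligned chunk of
    the inner register.
    """
    if outer_size >= inner_size:
        return [0]
    return list(range(0, inner_size, outer_size))
```

Equal sizes force offset 0, but offset 0 also occurs for a 4-byte subreg of
an 8-byte register, so checking the offset alone does not imply equal sizes.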

> > +  if (x_inner != NULL_RTX
> > +  && y_inner != NULL_RTX
> > +  && x_inner_mode == y_inner_mode
> > +  && known_eq (GET_MODE_SIZE (x_inner_mode), GET_MODE_SIZE
> (mode))
> > +  && ! targetm.can_change_mode_class (x_inner_mode, mode,
> ALL_REGS))
> > +{
> > +  x = x_inner;
> > +  y = y_inner;
> > +}
> > +  else if (x_inner != NULL_RTX && CONSTANT_P (y)
> 
> Formatting nit: one subcondition per line when the condition spans multiple
> lines.

OK.

> > +  && known_eq (GET_MODE_SIZE (x_inner_mode),
> GET_MODE_SIZE (mode))
> > +  && ! targetm.can_change_mode_class (x_inner_mode, mode,
> ALL_REGS)
> > +  && targetm.legitimate_constant_p (x_inner_mode, y))
> 
> This call isn't valid, since the mode has to match the rtx.  ("y" still has 
> mode
> "mode" at this point.)  I think instead we should just do:
> 
>  && (y_inner = simplify_subreg (GET_MODE (x_inner), y, mode, 0))
> 
> to convert the constant, and use it if the result is nonnull.
> The existing CONSTANT_P emit_move_insn code will handle cases in which
> the new constant isn't legitimate.

Good catch. Done.

> > +
> > +{
> > +  x = x_inner;
> > +}
> > +  else if (x_inner != NULL_RTX && MEM_P (y)
> > +  && known_eq (GET_MODE_SIZE (x_inner_mode), GET_MODE_SIZE (mode))
> > +  && ! targetm.can_change_mode_class (x_inner_mode, mode, ALL_REGS)
> > +  && (! targetm.slow_unaligned_access (x_inner_mode, MEM_ALIGN (y))
> > +  || MEM_ALIGN (y) >= GET_MODE_ALIGNMENT (x_inner_mode)))
> 
> What is the last condition protecting against?  Seems worth a comment.

Comment added.  My intention here is to avoid generating a slow unaligned
memory access.
Machine modes like VNx2HImode may have a smaller alignment than modes like V4HI.
For the given test case, SLP forces the alignment of the memory access of mode
VNx2HImode to be 32 bytes.
In theory, there may be other cases where the alignment of the inner mode is
bigger than that of the outer mode.
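
To make the intent of that last condition concrete, here is a minimal sketch of the predicate in isolation.  It is not the code from the patch: ModeInfo and inner_mode_access_ok are hypothetical names, and the real check calls targetm.slow_unaligned_access (which takes both the mode and the alignment) and GET_MODE_ALIGNMENT rather than reading fields from a struct.

```cpp
// Hypothetical stand-in for what the target knows about a machine mode.
// In GCC the equivalent information comes from GET_MODE_ALIGNMENT and the
// targetm.slow_unaligned_access hook; this struct is only an illustration.
struct ModeInfo
{
  unsigned align_bits;       // natural alignment of the mode, in bits
  bool slow_when_unaligned;  // is an under-aligned access slow on this target?
};

// Return true if a memory reference whose known alignment is MEM_ALIGN_BITS
// can be accessed in the inner mode without risking a slow unaligned access:
// either unaligned accesses are cheap, or the memory is aligned enough.
constexpr bool
inner_mode_access_ok (ModeInfo inner, unsigned mem_align_bits)
{
  return !inner.slow_when_unaligned || mem_align_bits >= inner.align_bits;
}
```

The subreg is only stripped from the register side of the move when this kind of check passes, so the rewritten move never trades a cheap access for an expensive one.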

Attached please find the v3 patch.  Bootstrapped and tested on 
aarch64-linux-gnu.
Does it look better?

gcc/ChangeLog:
+2020-05-27  Felix Yang  
+   Richard Sandiford  
+
+   PR target/95254
+   * expr.c (emit_move_insn): If we have a copy of which src and/or dest
+   is a subreg, try to remove the subreg when innermode and outermode are
+   equal in size and targetm.can_change_mode_class (outermode, innermode,
+   ALL_REGS) returns false.

testsuite/ChangeLog:
+2020-05-27  Felix Yang  
+   Richard Sandiford  
+
+   PR target/95254
+   * gcc.target/aarch64/pr95254.c: New test.

Thanks,
Felix




pr95254-v3.diff
Description: pr95254-v3.diff


[PATCH] tree-optimization/95295 - fix sinking after path merging in new SM code

2020-05-27 Thread Richard Biener
This fixes a missed sinking of remat stores across unrelated stores
after merging from different paths.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

2020-05-27  Richard Biener  

PR tree-optimization/95295
* tree-ssa-loop-im.c (sm_seq_valid_bb): Fix sinking after
merging stores from paths.

* gcc.dg/torture/pr95295-3.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr95295-3.c | 16 
 gcc/tree-ssa-loop-im.c   |  8 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95295-3.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr95295-3.c 
b/gcc/testsuite/gcc.dg/torture/pr95295-3.c
new file mode 100644
index 000..a506af9a63f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95295-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+extern short var_15, var_20;
+extern int var_18, var_21, var_23;
+extern _Bool arr_2[];
+extern long arr_3[];
+void test()
+{
+  var_20 = 1;
+  for (int a = 0; a < 12; a += 2)
+for (short b = 0; b < 8; b += 2) {
+  arr_2[b] = var_21 = var_18 ? var_15 : 0;
+  arr_3[b] = 8569;
+}
+  var_23 = -1096835496;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index b399bd0f729..d33f5335e2b 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2447,12 +2447,16 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  unsigned id = first_edge_seq[i].first;
  seq.safe_push (first_edge_seq[i]);
  unsigned new_idx;
- if (first_edge_seq[i].second == sm_ord
+ if ((first_edge_seq[i].second == sm_ord
+  || (first_edge_seq[i].second == sm_other
+  && first_edge_seq[i].from != NULL_TREE))
  && !sm_seq_push_down (seq, seq.length () - 1, &new_idx))
{
- bitmap_set_bit (refs_not_supported, id);
+ if (first_edge_seq[i].second == sm_ord)
+   bitmap_set_bit (refs_not_supported, id);
  /* Mark it sm_other.  */
  seq[new_idx].second = sm_other;
+ seq[new_idx].from = NULL_TREE;
}
}
  return 1;
-- 
2.25.1


Re: [PATCH] Fortran : ICE in gfc_trans_label_assign PR50392

2020-05-27 Thread Thomas Koenig via Gcc-patches

Hi Mark,


ping


Looks good.

Thanks!


[committed] libstdc++: Add new testcase for comparison category types

2020-05-27 Thread Jonathan Wakely via Gcc-patches
Comparing a comparison category type to anything except a literal 0 is
undefined. This verifies that at least some misuses are diagnosed at
compile time.

* testsuite/18_support/comparisons/categories/zero_neg.cc: New test.

Tested x86_64-linux, committed to master.

commit 116e3cfc7b8ab8afc4bdbc03db6b194413218af7
Author: Jonathan Wakely 
Date:   Wed May 27 13:13:19 2020 +0100

libstdc++: Add new testcase for comparison category types

Comparing a comparison category type to anything except a literal 0 is
undefined. This verifies that at least some misuses are diagnosed at
compile time.

* testsuite/18_support/comparisons/categories/zero_neg.cc: New test.

diff --git 
a/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc 
b/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc
new file mode 100644
index 000..0a8af41acb9
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/comparisons/categories/zero_neg.cc
@@ -0,0 +1,46 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include <compare>
+
+// C++20 [cmp.categories.pre]
+// "an argument other than a literal 0 is undefined"
+
+void
+test01()
+{
+  std::partial_ordering::equivalent == 0; // OK
+  std::weak_ordering::equivalent == 0;// OK
+  std::strong_ordering::equivalent == 0;  // OK
+
+  std::partial_ordering::equivalent == 1; // { dg-error "invalid conversion" }
+  std::weak_ordering::equivalent == 1;// { dg-error "invalid conversion" }
+  std::strong_ordering::equivalent == 1;  // { dg-error "invalid conversion" }
+
+  constexpr void* p = nullptr;
+  std::partial_ordering::equivalent == p; // { dg-error "invalid conversion" }
+  std::weak_ordering::equivalent == p;// { dg-error "invalid conversion" }
+  std::strong_ordering::equivalent == p;  // { dg-error "invalid conversion" }
+
+  // Ideally these would be ill-formed, but the current code accepts it.
+  std::partial_ordering::equivalent == nullptr;
+  std::weak_ordering::equivalent == nullptr;
+  std::strong_ordering::equivalent == nullptr;
+}


[PATCH] tree-optimization/95356 - fix vectorizable_shift vector types

2020-05-27 Thread Richard Biener
This makes sure to always use the same vector type for the shift
operand as for the shifted operand.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-27  Richard Biener  

PR tree-optimization/95356
* tree-vect-stmts.c (vectorizable_shift): Adjust vector
type for the shift operand.
---
 gcc/tree-vect-stmts.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 225a9dc98ac..35043ecd0f9 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5791,7 +5791,7 @@ vectorizable_shift (vec_info *vinfo,
   if (slp_node
  && (!vect_maybe_update_slp_op_vectype (slp_op0, vectype)
  || (!scalar_shift_arg
- && !vect_maybe_update_slp_op_vectype (slp_op1, op1_vectype))))
+ && !vect_maybe_update_slp_op_vectype (slp_op1, vectype))))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-- 
2.12.3


Re: [PATCH] gcc/Makefile.in: move SELFTEST_DEPS before including language makefile fragments

2020-05-27 Thread Romain Naour via Gcc-patches
Hi All,

On 22/05/2020 at 00:13, David Malcolm wrote:
> On Thu, 2020-05-21 at 17:35 +0200, Romain Naour wrote:
>> As reported by several Buildroot users [1][2][3], the gcc build
>> may fail while running selftests makefile target.
>>
>> The problem only occurs when ccache is used with gcc 9 and 10,
>> probably due to a race condition.
>>
>> While debugging with "make -p" we can notice that the s-selftest-c target
>> contains only "cc1" as a dependency instead of cc1 and SELFTEST_DEPS
>> [4].
>>
>>   s-selftest-c: cc1
>>
>> While the build is failing, the s-selftest-c dependencies recipe is
>> still running and reported as a bug by make.
>>
>>   "Dependencies recipe running (THIS IS A BUG)."
>>
>> A change [5] in gcc 9 seems to introduce the problem since we can't
>> reproduce this problem with gcc 8.
> 
> Sorry about introducing the breakage.  The patch looks sane to me,
> though I don't know if I can formally approve it (and I'm now doubting
> my "make" skills...)

No problem, the issue is not obvious at all.
My intention is to help people backport the patch to previous gcc versions.

Best regards,
Romain


> 
> Dave
> 



Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Uros Bizjak via Gcc-patches
On Wed, May 27, 2020 at 8:02 AM Hongtao Liu  wrote:
>
> On Mon, May 25, 2020 at 8:41 PM Uros Bizjak  wrote:
> >
> > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu  wrote:
> > >
> > >   According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> > > memory_operand instead of 128-bit one which exists in current
> > > implementation. Also for other vpmov instructions which have
> > > memory_operand narrower than 128bits.
> > >
> > >   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
> >
> >
> > +  [(set (match_operand:HI 0 "memory_operand" "=m")
> > +(subreg:HI (any_truncate:V2QI
> > + (match_operand:V2DI 1 "register_operand" "v")) 0))]
> >
> > This should store in V2QImode, subregs are not allowed in insn patterns.
> >
> > You need a pre-reload splitter to split from register_operand to a
> > memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so
> > perhaps he can give some additional advice here.
> >
>
> Like this?
> ---
> (define_insn "*avx512vl_v2div2qi2_store"
>   [(set (match_operand:V2QI 0 "memory_operand" "=m")
> (any_truncate:V2QI
>   (match_operand:V2DI 1 "register_operand" "v")))]
>   "TARGET_AVX512VL"
>   "vpmovqb\t{%1, %0|%0, %1}"
>   [(set_attr "type" "ssemov")
>(set_attr "memory" "store")
>(set_attr "prefix" "evex")
>(set_attr "mode" "TI")])
>
> (define_insn_and_split "*avx512vl_v2div2qi2_store"
>   [(set (match_operand:HI 0 "memory_operand")
> (subreg:HI
>   (any_truncate:V2QI
> (match_operand:V2DI 1 "register_operand")) 0))]
>   "TARGET_AVX512VL && ix86_pre_reload_split ()"
>   "#"
>   "&& 1"
>   [(set (match_dup 0)
> (any_truncate:V2QI (match_dup 1)))]
>   "operands[0] = adjust_address_nv (operands[0], V2QImode, 0);")

Yes, assuming that scalar subregs are some artefact of middle-end processing.

BTW: Please name these insns ..._1 and ..._2.

Uros.


Re: [PATCH v2] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-05-27 Thread Christophe Lyon via Gcc-patches
Ping?

On Thu, 14 May 2020 at 16:57, Christophe Lyon wrote:
>
> The interrupt attribute does not guarantee that the FP registers are
> saved, which can result in problems difficult to debug.
>
> Saving the FP registers and status registers can be a large penalty,
> so it's probably not desirable to do that all the time.
>
> If the handler calls other functions, we'd likely need to save all of
> them, for lack of knowledge of which registers they actually clobber.
>
> This is even more obscure for the end-user when the compiler inserts
> calls to helper functions such as memcpy (some multilibs do use FP
> registers to speed it up).
>
> In the PR, we discussed adding routines in libgcc to save the FP
> context and saving only locally-clobbered FP registers, but this seems
> to be too much work for the purpose, given that in general such
> handlers try to avoid this kind of penalty.
> I suspect we would also want new attributes to instruct the compiler
> that saving the FP context is not needed.
>
> In the meantime, emit a warning to suggest re-compiling with
> -mgeneral-regs-only. Note that this can lead to errors if the code
> uses floating-point and -mfloat-abi=hard, e.g.:
> argument of type 'double' not permitted with -mgeneral-regs-only
>
> This can be troublesome for the user, but at least this would make
> him aware of the latent issue.
>
> The patch adds several testcases:
>
> - pr94743-1-hard.c checks that a warning is emitted when using
>   -mfloat-abi=hard. Function IRQ_HDLR_Test can make implicit calls to
>   runtime floating-point routines (or direct use of FP instructions),
>   IRQ_HDLR_Test2 doesn't. We emit a warning in both cases, though.
>
> - pr94743-1-softfp.c: same as above with -mfloat-abi=softfp.
>
> - pr94743-1-soft.c checks that no warning is emitted when using
>   -mfloat-abi=soft with the same code as above.
>
> - pr94743-2.c checks that no warning is emitted when using
>   -mgeneral-regs-only.
>
> - pr94743-3.c checks that no warning is emitted when using
>   -mgeneral-regs-only even when using floating-point data.
>
> 2020-05-14  Christophe Lyon  
>
> PR target/94743
> gcc/
> * config/arm/arm.c (arm_handle_isr_attribute): Warn if
> -mgeneral-regs-only is not used.
>
> gcc/testsuite/
> * gcc.misc-tests/arm-isr.c: Add -mgeneral-regs-only.
> * gcc.target/arm/empty_fiq_handler.c: Add -mgeneral-regs-only.
> * gcc.target/arm/interrupt-1.c: Add -mgeneral-regs-only.
> * gcc.target/arm/interrupt-2.c: Add -mgeneral-regs-only.
> * gcc.target/arm/pr70830.c: Add -mgeneral-regs-only.
> * gcc.target/arm/pr94743-1-hard.c: New test.
> * gcc.target/arm/pr94743-1-soft.c: New test.
> * gcc.target/arm/pr94743-1-softfp.c: New test.
> * gcc.target/arm/pr94743-2.c: New test.
> * gcc.target/arm/pr94743-3.c: New test.
> ---
>  gcc/config/arm/arm.c |  5 
>  gcc/testsuite/gcc.misc-tests/arm-isr.c   |  2 ++
>  gcc/testsuite/gcc.target/arm/empty_fiq_handler.c |  1 +
>  gcc/testsuite/gcc.target/arm/interrupt-1.c   |  2 +-
>  gcc/testsuite/gcc.target/arm/interrupt-2.c   |  2 +-
>  gcc/testsuite/gcc.target/arm/pr70830.c   |  2 +-
>  gcc/testsuite/gcc.target/arm/pr94743-1-hard.c| 29 
> 
>  gcc/testsuite/gcc.target/arm/pr94743-1-soft.c| 27 ++
>  gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c  | 29 
> 
>  gcc/testsuite/gcc.target/arm/pr94743-2.c | 22 ++
>  gcc/testsuite/gcc.target/arm/pr94743-3.c | 23 +++
>  11 files changed, 141 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-hard.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-soft.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-3.c
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index dda8771..c88de3e 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7232,6 +7232,11 @@ arm_handle_isr_attribute (tree *node, tree name, tree 
> args, int flags,
>name);
>   *no_add_attrs = true;
> }
> +  else if (TARGET_VFP_BASE)
> +   {
> + warning (OPT_Wattributes, "FP registers might be clobbered despite 
> %qE attribute: compile with -mgeneral-regs-only",
> +  name);
> +   }
>/* FIXME: the argument if any is checked for type attributes;
>  should it be checked for decl ones?  */
>  }
> diff --git a/gcc/testsuite/gcc.misc-tests/arm-isr.c 
> b/gcc/testsuite/gcc.misc-tests/arm-isr.c
> index 737f9ff..9eff52c 100644
> --- a/gcc/testsuite/gcc.misc-tests/arm-isr.c
> +++ b/gcc/testsuite/gcc.misc-tests/arm-isr.c
> @@ -1,3 +1,5 @@
> +/* { dg-options "-mgeneral-regs-only" } */

Re: [patch] Add support for __builtin_bswap128

2020-05-27 Thread Eric Botcazou
> Please use int128 effective target rather than lp64 in the tests that need
> __int128 type.

OK, thanks, adjusted locally.

-- 
Eric Botcazou


[PATCH] tree-optimization/95335 - fix SLP nodes dropped to invariant

2020-05-27 Thread Richard Biener
When we drop a SLP node to invariant because we cannot vectorize it
we have to make sure to revisit it in the users.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

2020-05-27  Richard Biener  

PR tree-optimization/95335
* tree-vect-slp.c (vect_slp_analyze_node_operations): Reset
lvisited for nodes made external.

* gcc.dg/vect/bb-slp-pr95335.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr95335.c | 13 +
 gcc/tree-vect-slp.c|  7 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr95335.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr95335.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95335.c
new file mode 100644
index 000..42a70222e12
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95335.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+float *a;
+float b;
+void
+fn1(float p1[][3])
+{
+  float c, d, e, f;
+  f = a[1] * a[1] * d;
+  b = a[1] * a[2] * d;
+  p1[1][1] = f + c;
+  p1[1][2] = b + e;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index c0c9afd0bd2..a6c5a9d9dc4 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2915,7 +2915,12 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
   /* If this node can't be vectorized, try pruning the tree here rather
  than felling the whole thing.  */
   if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
-res = true;
+{
+  /* We'll need to revisit this for invariant costing and number
+of vectorized stmt setting.   */
+  lvisited.remove (node);
+  res = true;
+}
 
   return res;
 }
-- 
2.25.1


Re: [patch] Add support for __builtin_bswap128

2020-05-27 Thread Jakub Jelinek via Gcc-patches
On Wed, May 27, 2020 at 11:23:32AM +0200, Eric Botcazou wrote:
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/builtin-bswap-10.c: New test.
>   * gcc.dg/builtin-bswap-11.c: Likewise.
>   * gcc.dg/builtin-bswap-12.c: Likewise.
>   * gcc.target/i386/builtin-bswap-5.c: Likewise.

Please use int128 effective target rather than lp64 in the tests that need
__int128 type.

Jakub



Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-27 Thread Martin Liška

On 5/27/20 12:35 PM, dongjianqiang (A) wrote:

Thanks for your comments, I have added the ChangeLog to the patch.


Thanks. That patch is fine, please install it.

Martin


Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-27 Thread dongjianqiang (A)
Hi Martin,

Thanks for your comments, I have added the ChangeLog to the patch.

Thanks,
Dong JianQiang

On 5/27/20 4:47 PM, Martin Liška wrote:
> 
> On 5/27/20 5:00 AM, dongjianqiang (A) wrote:
> > Hi GCC maintainers,
> >
> > Proposed patch to PR95332 - gcov-tool merge:"not a gcov data file"
> >
> > This error occurs when using gcov-tool merge dir1 dir2, where dir1 and dir2
> > are the directories containing the .gcda files which were generated by
> > machines of different endianness.
> >
> > Any suggestions? Thanks.
> 
> Hello.
> 
> Please write a ChangeLog for the patch.
> And please use plain-text email encoding instead of HTML.
> 
> Thanks,
> Martin
> 
> >
> > Regards,
> >
> > Dong JianQiang
> >



pr95332_v1.patch
Description: pr95332_v1.patch
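
For readers without the attachment, the failure described in the PR comes down to the first word of a .gcda file looking wrong when it was written on a machine of the opposite endianness.  The sketch below only illustrates that idea and is not taken from the attached patch: byte_swap32, classify_magic and MagicKind are hypothetical names, and the constant is assumed to match GCOV_DATA_MAGIC ("gcda") from gcov-io.h.

```cpp
#include <cstdint>

// Assumed to match GCOV_DATA_MAGIC in gcov-io.h: the bytes "gcda".
const std::uint32_t kDataMagic = 0x67636461u;

// Reverse the byte order of a 32-bit word.
inline std::uint32_t
byte_swap32 (std::uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
	 | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

enum class MagicKind { Native, Swapped, NotGcov };

// Classify the first word read from a data file.  A file produced on a
// machine of the opposite endianness shows the magic byte-reversed, which
// tells the reader to swap every subsequent word instead of rejecting the
// file as "not a gcov data file".
inline MagicKind
classify_magic (std::uint32_t first_word)
{
  if (first_word == kDataMagic)
    return MagicKind::Native;
  if (byte_swap32 (first_word) == kDataMagic)
    return MagicKind::Swapped;
  return MagicKind::NotGcov;
}
```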


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Andreas Schwab
Looks like tcl 8.5.5 has a bug:

% glob -nocomplain -path {} -- {a.{out,exe}}
% glob -nocomplain -path {} -- {a.{out,exe}*}
a.out

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [stage1][PATCH] Make TOPN counter dynamically allocated.

2020-05-27 Thread Martin Liška

PING^2

On 5/15/20 11:57 AM, Martin Liška wrote:

We're in stage1: PING^




Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-27 Thread Richard Sandiford
"Kewen.Lin"  writes:
> Hi Richard,
>
> Thanks for your comments!
>
> on 2020/5/26 8:49, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> @@ -626,6 +645,12 @@ public:
>>>/* True if have decided to use a fully-masked loop.  */
>>>bool fully_masked_p;
>>>  
>>> +  /* Records whether we still have the option of using a length access loop.  */
>>> +  bool can_with_length_p;
>>> +
>>> +  /* True if have decided to use length access for the loop fully.  */
>>> +  bool fully_with_length_p;
>> 
>> Rather than duplicate the flags like this, I think we should have
>> three bits of information:
>> 
>> (1) Can the loop operate on partial vectors?  Starts off optimistically
>> assuming "yes", gets set to "no" when we find a counter-example.
>> 
>> (2) If we do decide to use partial vectors, will we need loop masks?
>> 
>> (3) If we do decide to use partial vectors, will we need lengths?
>> 
>> Vectorisation using partial vectors succeeds if (1) && ((2) != (3))
>> 
>> LOOP_VINFO_CAN_FULLY_MASK_P currently tracks (1) and
>> LOOP_VINFO_MASKS currently tracks (2).  In pathological cases it's
>> already possible to have (1) && !(2), see r9-6240 for an example.
>> 
>> With the new support, LOOP_VINFO_LENS tracks (3).
>> 
>> So I don't think we need the can_with_length_p.  What is now
>> LOOP_VINFO_CAN_FULLY_MASK_P can continue to track (1) for both
>> approaches, with the final choice of approach only being made
>> at the end.  Maybe it would be worth renaming it to something
>> more generic though, now that we have two approaches to partial
>> vectorisation.
>
> I like this idea!  I could be wrong, but I'm afraid that we
> cannot have one common flag shared by both approaches, since
> the check criteria could differ between them: a counter-example
> for length could be acceptable for masking.  For example, length
> only allows CONTIGUOUS related modes, but masking can support more.
> When we see an acceptable VMAT_LOAD_STORE_LANES,
> we leave LOOP_VINFO_CAN_FULLY_MASK_P true; should the later length
> check turn it to false?  I guess not.  Assuming it stays true, then
> LOOP_VINFO_CAN_FULLY_MASK_P will mean partial vectorization
> for masking only, not for both.  We can probably clear LOOP_VINFO_LENS
> when the length check fails, but then we only know the vec is empty,
> not whether we are unable to do partial vectorization with length;
> when we see LOOP_VINFO_CAN_FULLY_MASK_P true, we could still
> record length into it if possible.

Let's call the flag in (1) CAN_USE_PARTIAL_VECTORS_P rather than
CAN_FULLY_MASK_P to (try to) avoid any confusion from the current name.

What I meant is that each vectorizable_* routine has the responsibility
of finding a way of coping with partial vectorisation, or setting
CAN_USE_PARTIAL_VECTORS_P to false if it can't.

vectorizable_load chooses the VMAT first, and then decides based on that
whether partial vectorisation is supported.  There's no influence in
the other direction (partial vectorisation doesn't determine the VMAT).

So once it has chosen a VMAT, vectorizable_load needs to try to find a way
of handling the operation with partial vectorisation.  Currently the only
way of doing that for VMAT_LOAD_STORE_LANES is using masks.  So at the
moment there are two possible outcomes:

- The target supports the necessary IFN_MASK_LOAD_LANES function.
  If so, we can use partial vectorisation for the statement, so we
  leave CAN_USE_PARTIAL_VECTORS_P true and record the necessary masks
  in LOOP_VINFO_MASKS.

- The target doesn't support the necessary IFN_MASK_LOAD_LANES function.
  If so, we can't use partial vectorisation, so we clear
  CAN_USE_PARTIAL_VECTORS_P.

That's how things work at the moment.  It would work in the same way
for lengths if we ever supported IFN_LEN_LOAD_LANES: we'd check whether
IFN_LEN_LOAD_LANES is available and record the length in LOOP_VINFO_LENS
if so.  If partial vectorisation isn't supported (via masks or lengths),
we'd continue to clear CAN_USE_PARTIAL_VECTORS_P.

But equally, if we never add support for IFN_LEN_LOAD_LANES, the current
code continues to work with length-based approaches.  We'll continue to
clear CAN_USE_PARTIAL_VECTORS_P for VMAT_LOAD_STORE_LANES when the
target provides no IFN_MASK_LOAD_LANES function.

As I say, this is all predicated on the assumption that we don't need
to mix both masks and lengths in the same loop, and so can decide not
to use partial vectorisation when both masks and lengths have been
recorded.  This is a check that would happen at the end, after all
statements have been analysed.

(There's no reason in principle why we *couldn't* support both
approaches in the same loop, but it's not worth adding the code
for that until there's a use case.)
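
As a rough illustration of the scheme described above, the end-of-analysis decision could be modelled as below.  This is only a sketch under the stated assumption that masks and lengths are never mixed in one loop; PartialVectorState, PartialStyle and choose_partial_style are made-up names, not GCC's data structures.

```cpp
// Simplified model of the three pieces of information discussed above.
// All names here are illustrative, not GCC's actual data structures.
struct PartialVectorState
{
  bool can_use_partial_vectors_p = true; // (1) optimistic until disproved
  bool masks_recorded = false;           // (2) some statement needs a loop mask
  bool lens_recorded = false;            // (3) some statement needs a length
};

enum class PartialStyle { None, Masks, Lengths };

// Final decision, made once all statements have been analysed.
inline PartialStyle
choose_partial_style (const PartialVectorState &s)
{
  if (!s.can_use_partial_vectors_p)
    return PartialStyle::None;
  // Mixing masks and lengths in one loop is not supported, and a loop
  // that recorded neither has nothing to control the partial vectors with,
  // so the choice only succeeds when exactly one of (2) and (3) holds.
  if (s.masks_recorded == s.lens_recorded)
    return PartialStyle::None;
  return s.masks_recorded ? PartialStyle::Masks : PartialStyle::Lengths;
}
```

Each vectorizable_* routine would either set one of the two "recorded" fields or clear can_use_partial_vectors_p; nothing needs to know which style wins until the final call.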

Thanks,
Richard


[PATCH] c++: Fix bogus "does not declare anything" warning (PR 66159)

2020-05-27 Thread Jonathan Wakely via Gcc-patches
G++ gives a bogus warning for 'struct A; using B = struct ::A;'
complaining that the elaborated-type-specifier doesn't declare anything.
That's true, but it's not trying to declare struct ::A, just refer to it
unambiguously. Do not emit the warning unless we're actually parsing a
declaration.

This also makes the relevant warning depend on -Wredundant-decls (which
is not part of -Wall or -Wextra) so it can be disabled on the command
line or by using #pragma. This means the warning will no longer be given
by default, so some tests need -Wredundant-decls added.
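
For reference, a minimal standalone version of the construct in question looks like the following.  It is a sketch for illustration rather than one of the testsuite files, and the static_assert is only there to show that the alias names the already-declared type.

```cpp
#include <type_traits>

struct A;              // forward declaration: this is what declares ::A

// The elaborated-type-specifier 'struct ::A' merely refers to the existing
// type unambiguously; it is not an attempt to declare anything new.
using B = struct ::A;

static_assert (std::is_same<A, B>::value, "B is an alias for ::A");
```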

gcc/cp/ChangeLog:

PR c++/66159
* parser.c (cp_parser_elaborated_type_specifier): Do not warn
unless in a declaration. Make warning depend on
OPT_Wredundant_decls.

gcc/testsuite/ChangeLog:

PR c++/66159
* g++.dg/parse/specialization1.C: Remove dg-warning.
* g++.dg/warn/forward-inner.C: Add -Wredundant-decls. Check
alias-declaration using elaborated-type-specifier.
* g++.dg/warn/pr36999.C: Add -Wredundant-decls.


Is it OK to make this warning no longer emitted by default, and not
even with -Wall -Wextra?

Would it be better to add a new option for this specific warning,
which would be enabled by -Wall and also by -Wredundant-decls? Maybe
-Wredundant-decls-elaborated-type or something.


commit c254d7cb3977484fd4737b973a87c1df98c30e01
Author: Jonathan Wakely 
Date:   Wed May 27 10:40:38 2020 +0100

c++: Fix bogus "does not declare anything" warning (PR 66159)

G++ gives a bogus warning for 'struct A; using B = struct ::A;'
complaining that the elaborated-type-specifier doesn't declare anything.
That's true, but it's not trying to declare struct ::A, just refer to it
unambiguously. Do not emit the warning unless we're actually parsing a
declaration.

This also makes the relevant warning depend on -Wredundant-decls (which
is not part of -Wall or -Wextra) so it can be disabled on the command
line or by using #pragma. This means the warning will no longer be given
by default, so some tests need -Wredundant-decls added.

gcc/cp/ChangeLog:

PR c++/66159
* parser.c (cp_parser_elaborated_type_specifier): Do not warn
unless in a declaration. Make warning depend on
OPT_Wredundant_decls.

gcc/testsuite/ChangeLog:

PR c++/66159
* g++.dg/parse/specialization1.C: Remove dg-warning.
* g++.dg/warn/forward-inner.C: Add -Wredundant-decls. Check
alias-declaration using elaborated-type-specifier.
* g++.dg/warn/pr36999.C: Add -Wredundant-decls.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 54ca875ce54..5287ab34752 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -18917,8 +18917,10 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
  here.  */
 
   if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)
-  && !is_friend && !processing_explicit_instantiation)
-warning (0, "declaration %qD does not declare anything", decl);
+ && !is_friend && is_declaration
+ && !processing_explicit_instantiation)
+   warning (OPT_Wredundant_decls,
+"declaration %qD does not declare anything", decl);
 
  type = TREE_TYPE (decl);
}
diff --git a/gcc/testsuite/g++.dg/parse/specialization1.C 
b/gcc/testsuite/g++.dg/parse/specialization1.C
index 44a98baa2f4..6d83bc4f254 100644
--- a/gcc/testsuite/g++.dg/parse/specialization1.C
+++ b/gcc/testsuite/g++.dg/parse/specialization1.C
@@ -4,4 +4,3 @@
 
 template  class A;
 template  class A::B; // { dg-error "declaration" "err" }
-// { dg-warning "declaration" "warn" { target *-*-* } .-1 }
diff --git a/gcc/testsuite/g++.dg/warn/forward-inner.C 
b/gcc/testsuite/g++.dg/warn/forward-inner.C
index 5336d4ed946..1c10ec44a54 100644
--- a/gcc/testsuite/g++.dg/warn/forward-inner.C
+++ b/gcc/testsuite/g++.dg/warn/forward-inner.C
@@ -1,5 +1,6 @@
 // Check that the compiler warns about inner-style forward declarations in
 // contexts where they're not actually illegal, but merely useless.
+// { dg-options "-Wredundant-decls" }
 
 // Verify warnings for and within classes, and by extension, struct and union.
 class C1;
@@ -70,7 +71,7 @@ template class TC6::TC7;  // Valid explicit 
instantiation, no warning
 
 
 // Verify that friend declarations, also easy to confuse with forward
-// declrations, are similarly not warned about.
+// declarations, are similarly not warned about.
 class C8 {
  public:
   class C9 { };
@@ -79,3 +80,10 @@ class C10 {
  public:
   friend class C8::C9; // Valid friend declaration, no warning
 };
+
+#if __cplusplus >= 201103L
+// Verify that alias-declarations using an elaborated-type-specifier and
+// nested-name-specifier are not warned about (PR c++/66159).
+struct C11;
+using A1 = struct ::C11; // Valid alias-decl, no warning
+#endif
diff --git 

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-05-27 Thread Andreas Schwab
FAIL: outputs exe default 1: a.{out,exe}
FAIL: outputs exe default 1: extra
a.out
FAIL: outputs exe default 2: a.{out,exe}
FAIL: outputs exe default 2: extra
a.out
FAIL: outputs exe savetmp unnamed1: a.{out,exe}
FAIL: outputs exe savetmp unnamed1: extra
a.out
FAIL: outputs exe savetmp unnamed2: a.{out,exe}
FAIL: outputs exe savetmp unnamed2: extra
a.out
FAIL: outputs exe savecwd unnamed1: a.{out,exe}
FAIL: outputs exe savecwd unnamed1: extra
a.out
FAIL: outputs exe savecwd unnamed2: a.{out,exe}
FAIL: outputs exe savecwd unnamed2: extra
a.out
FAIL: outputs exe saveobj unnamed1: a.{out,exe}
FAIL: outputs exe saveobj unnamed1: extra
a.out
FAIL: outputs exe saveobj unnamed2: a.{out,exe}
FAIL: outputs exe saveobj unnamed2: extra
a.out
FAIL: outputs exe auxdump unnamed1: a.{out,exe}
FAIL: outputs exe auxdump unnamed1: extra
a.out
FAIL: outputs exe auxdump unnamed2: a.{out,exe}
FAIL: outputs exe auxdump unnamed2: extra
a.out
FAIL: outputs exe auxdmps unnamed1: a.{out,exe}
FAIL: outputs exe auxdmps unnamed1: extra
a.out
FAIL: outputs exe auxdmps unnamed2: a.{out,exe}
FAIL: outputs exe auxdmps unnamed2: extra
a.out
FAIL: outputs exe dumpdir unnamed1: a.{out,exe}
FAIL: outputs exe dumpdir unnamed1: extra
a.out
FAIL: outputs exe dumpdir unnamed2: a.{out,exe}
FAIL: outputs exe dumpdir unnamed2: extra
a.out
FAIL: outputs exe dbsovrdd unnamed1: a.{out,exe}
FAIL: outputs exe dbsovrdd unnamed1: extra
a.out
FAIL: outputs exe dbsovrdd unnamed2: a.{out,exe}
FAIL: outputs exe dbsovrdd unnamed2: extra
a.out
FAIL: outputs exe dbswthdd unnamed1: a.{out,exe}
FAIL: outputs exe dbswthdd unnamed1: extra
a.out
FAIL: outputs exe dbswthdd unnamed2: a.{out,exe}
FAIL: outputs exe dbswthdd unnamed2: extra
a.out
FAIL: outputs exe dbwoutdd unnamed1: a.{out,exe}
FAIL: outputs exe dbwoutdd unnamed1: extra
a.out
FAIL: outputs exe dbwoutdd unnamed2: a.{out,exe}
FAIL: outputs exe dbwoutdd unnamed2: extra
a.out
FAIL: outputs lto sing unnamed: a.{out,exe}
FAIL: outputs lto sing unnamed: extra
a.out
FAIL: outputs lto mult unnamed: a.{out,exe}
FAIL: outputs lto mult unnamed: extra
a.out
FAIL: outputs lto sing dumpbase unnamed: a.{out,exe}
FAIL: outputs lto sing dumpbase unnamed: extra
a.out
FAIL: outputs lto mult dumpbase unnamed: a.{out,exe}
FAIL: outputs lto mult dumpbase unnamed: extra
a.out
FAIL: outputs lto sing dumpdir unnamed: a.{out,exe}
FAIL: outputs lto sing dumpdir unnamed: extra
a.out
FAIL: outputs lto mult dumpdir unnamed: a.{out,exe}
FAIL: outputs lto mult dumpdir unnamed: extra
a.out
FAIL: outputs lto dbswthdd sing unnamed: a.{out,exe}
FAIL: outputs lto dbswthdd sing unnamed: extra
a.out
FAIL: outputs lto dbswthdd mult unnamed: a.{out,exe}
FAIL: outputs lto dbswthdd mult unnamed: extra
a.out
FAIL: outputs lto dbsovrdd sing unnamed: a.{out,exe}
FAIL: outputs lto dbsovrdd sing unnamed: extra
a.out
FAIL: outputs lto dbsovrdd mult unnamed: a.{out,exe}
FAIL: outputs lto dbsovrdd mult unnamed: extra
a.out
FAIL: outputs lto sing empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto sing empty dumpbase unnamed: extra
a.out
FAIL: outputs lto mult empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto mult empty dumpbase unnamed: extra
a.out
FAIL: outputs lto sing empty dumpbase dumpdir unnamed: a.{out,exe}
FAIL: outputs lto sing empty dumpbase dumpdir unnamed: extra
a.out
FAIL: outputs lto mult empty dumpbase dumpdir unnamed: a.{out,exe}
FAIL: outputs lto mult empty dumpbase dumpdir unnamed: extra
a.out
FAIL: outputs lto sing empty dumpdir empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto sing empty dumpdir empty dumpbase unnamed: extra
a.out
FAIL: outputs lto mult empty dumpdir empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto mult empty dumpdir empty dumpbase unnamed: extra
a.out
FAIL: outputs lto st sing empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto st sing empty dumpbase unnamed: extra
a.out
FAIL: outputs lto st mult empty dumpbase unnamed: a.{out,exe}
FAIL: outputs lto st mult empty dumpbase unnamed: extra
a.out
FAIL: outputs lto st sing unnamed: a.{out,exe}
FAIL: outputs lto st sing unnamed: extra
a.out
FAIL: outputs lto st mult unnamed: a.{out,exe}
FAIL: outputs lto st mult unnamed: extra
a.out

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Ada: Bump version to 11.

2020-05-27 Thread Arnaud Charlet
> I'm packaging a new gcc11 package and I noticed this needs to be bumped.
> 
> Ready for master?

Yes, thanks.

> gcc/ada/ChangeLog:
> 
>   * gnatvsn.ads: Bump Library_Version to 11.
> ---
>  gcc/ada/gnatvsn.ads | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/ada/gnatvsn.ads b/gcc/ada/gnatvsn.ads
> index f80d6cef010..aacbc228460 100644
> --- a/gcc/ada/gnatvsn.ads
> +++ b/gcc/ada/gnatvsn.ads
> @@ -38,7 +38,7 @@ package Gnatvsn is
> --  Static string identifying this version, that can be used as an 
> argument
> --  to e.g. pragma Ident.
> -   Library_Version : constant String := "10";
> +   Library_Version : constant String := "11";
> --  Library version. It needs to be updated whenever the major version
> --  number is changed.
> --
> -- 
> 2.26.2
> 


[PATCH] Ada: Bump version to 11.

2020-05-27 Thread Martin Liška

Hello.

I'm packaging a new gcc11 package and I noticed this needs to be bumped.

Ready for master?
Thanks,
Martin

gcc/ada/ChangeLog:

* gnatvsn.ads: Bump Library_Version to 11.
---
 gcc/ada/gnatvsn.ads | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/gnatvsn.ads b/gcc/ada/gnatvsn.ads
index f80d6cef010..aacbc228460 100644
--- a/gcc/ada/gnatvsn.ads
+++ b/gcc/ada/gnatvsn.ads
@@ -38,7 +38,7 @@ package Gnatvsn is
--  Static string identifying this version, that can be used as an argument
--  to e.g. pragma Ident.
 
-   Library_Version : constant String := "10";
+   Library_Version : constant String := "11";
--  Library version. It needs to be updated whenever the major version
--  number is changed.
--
--
2.26.2



[PATCH] aarch64: add support for unpacked EOR, ORR and AND

2020-05-27 Thread Joe Ramsay
Hi!

This patch improves code generation for EOR, ORR and AND on unpacked vectors 
with SVE. The following function:
void f (unsigned int *x, unsigned short *y, unsigned short *z) {
  for (int i = 0; i < 7; ++i)
    x[i] = (unsigned short) (y[i] & z[i]);
}

previously compiled to:

ptrue   p1.d, vl3
ld1h    z0.d, p1/z, [x1, #1, mul vl]
ptrue   p0.b, vl32
st1h    z0.d, p0, [sp, #1, mul vl]
ld1h    z0.d, p1/z, [x2, #1, mul vl]
st1h    z0.d, p0, [sp]
ldr     x3, [x2]
ldp     x4, x2, [sp]
ldr     x1, [x1]
and     x1, x3, x1
and     x2, x2, x4
str     x2, [sp]
ld1h    z0.d, p0/z, [sp]
str     x1, [sp]
uxth    z0.s, p0/m, z0.s
st1w    z0.d, p1, [x0, #1, mul vl]
ld1h    z0.d, p0/z, [sp]
uxth    z0.s, p0/m, z0.s
st1w    z0.d, p0, [x0]
add     sp, sp, 16
ret

and now compiles to:

ptrue   p0.s, vl7
ptrue   p1.b, vl32
ld1h    z1.s, p0/z, [x1]
ld1h    z0.s, p0/z, [x2]
and     z0.d, z0.d, z1.d
uxth    z0.s, p1/m, z0.s
st1w    z0.s, p0, [x0]
ret



Tested on aarch64-linux-gnu and x86_64-linux-gnu hosts.

Thanks,
Joe


2020-05-20  Joe Ramsay

* config/aarch64/aarch64-sve.md (3):
Add support for unpacked EOR, ORR, AND.

gcc/testsuite/ChangeLog

2020-05-20  Joe Ramsay

* gcc.target/aarch64/sve/logical_unpacked_and_1.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_2.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_3.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_4.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_5.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_6.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_and_7.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_1.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_2.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_3.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_4.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_5.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_6.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_eor_7.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_1.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_2.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_3.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_4.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_5.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_6.c: New test.
* gcc.target/aarch64/sve/logical_unpacked_orr_7.c: New test.
---

diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index f7a0893..8f0944c 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -4211,10 +4211,10 @@
 ;; Unpredicated integer binary logical operations.
(define_insn "3"
-  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w, ?w, w")
-   (LOGICAL:SVE_FULL_I
- (match_operand:SVE_FULL_I 1 "register_operand" "%0, w, w")
- (match_operand:SVE_FULL_I 2 "aarch64_sve_logical_operand" 
"vsl, vsl, w")))]
+  [(set (match_operand:SVE_I 0 "register_operand" "=w, ?w, w")
+  (LOGICAL:SVE_I
+(match_operand:SVE_I 1 "register_operand" "%0, w, w")
+(match_operand:SVE_I 2 "aarch64_sve_logical_operand" "vsl, 
vsl, w")))]
   "TARGET_SVE"
   "@
\t%0., %0., #%C2
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_1.c
new file mode 100644
index 000..7840355
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_1.c
@@ -0,0 +1,16 @@
+/* { dg-options "-O3 -msve-vector-bits=256" } */
+
+#include 
+
+void
+f (uint32_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
+{
+  for (int i = 0; i < 7; ++i)
+dst[i] = (uint16_t) (src1[i] & src2[i]);
+}
+
+/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.s,} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.s,} 1 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, z[0-9]+\.d, 
z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.s,} 1 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c
new file mode 100644
index 000..08b2745
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c
@@ -0,0 +1,17 @@
+/* { dg-options "-O3 

[patch] Add support for __builtin_bswap128

2020-05-27 Thread Eric Botcazou
Hi,

this patch introduces a new builtin named __builtin_bswap128 on targets where 
TImode is supported, i.e. 64-bit targets only in practice.  The implementation 
simply reuses the existing double word path in optab, so no routine is added 
to libgcc (which means that you get two calls to _bswapdi2 in the worst case).
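
For illustration only, the double-word strategy can be sketched in plain C.
This is not the code from the patch: the helper name bswap128_via_halves is
hypothetical, and the sketch assumes a compiler that provides
unsigned __int128 and __builtin_bswap64.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: byte-swap a 128-bit value
   by byte-swapping each 64-bit half and exchanging the two halves.  */
static unsigned __int128
bswap128_via_halves (unsigned __int128 x)
{
  uint64_t lo = (uint64_t) x;
  uint64_t hi = (uint64_t) (x >> 64);

  /* The swapped low half becomes the new high half and vice versa.  */
  return ((unsigned __int128) __builtin_bswap64 (lo) << 64)
	 | __builtin_bswap64 (hi);
}
```

Applying the helper twice should give back the original value, which is a
cheap sanity check for any such implementation.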

Tested on x86-64/Linux, OK for the mainline?


2020-05-27  Eric Botcazou  

gcc/ChangeLog:

* builtin-types.def (BT_UINT128): New primitive type.
(BT_FN_UINT128_UINT128): New function type.
* builtins.def (BUILT_IN_BSWAP128): New GCC builtin.
* doc/extend.texi (__builtin_bswap128): Document it.
* builtins.c (expand_builtin): Deal with BUILT_IN_BSWAP128.
(is_inexpensive_builtin): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
* fold-const.c (tree_call_nonnegative_warnv_p): Likewise.
* tree-ssa-ccp.c (evaluate_stmt): Likewise.
* tree-vect-stmts.c (vect_get_data_ptr_increment): Likewise.
(vectorizable_call): Likewise.
* optabs.c (expand_unop): Always use the double word path for it.
* tree-core.h (enum tree_index): Add TI_UINT128_TYPE.
* tree.h (uint128_type_node): New global type.
* tree.c (build_common_tree_nodes): Build it if TImode is supported.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-bswap-10.c: New test.
* gcc.dg/builtin-bswap-11.c: Likewise.
* gcc.dg/builtin-bswap-12.c: Likewise.
* gcc.target/i386/builtin-bswap-5.c: Likewise.

-- 
Eric Botcazou
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index c7aa691b243..c46b1bc5cbd 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -73,6 +73,9 @@ DEF_PRIMITIVE_TYPE (BT_UINT8, unsigned_char_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT16, uint16_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT32, uint32_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT64, uint64_type_node)
+DEF_PRIMITIVE_TYPE (BT_UINT128, uint128_type_node
+? uint128_type_node
+: error_mark_node)
 DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 1))
 DEF_PRIMITIVE_TYPE (BT_UNWINDWORD, (*lang_hooks.types.type_for_mode)
 (targetm.unwind_word_mode (), 1))
@@ -300,6 +303,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_UINT8_FLOAT, BT_UINT8, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, BT_UINT16)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_UINT128_UINT128, BT_UINT128, BT_UINT128)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_FLOAT, BT_UINT64, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_INT, BT_BOOL, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_PTR, BT_BOOL, BT_PTR)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 53bae599d3e..f7bb87e4690 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7988,6 +7988,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 case BUILT_IN_BSWAP16:
 case BUILT_IN_BSWAP32:
 case BUILT_IN_BSWAP64:
+case BUILT_IN_BSWAP128:
   target = expand_builtin_bswap (target_mode, exp, target, subtarget);
   if (target)
 	return target;
@@ -11704,6 +11705,7 @@ is_inexpensive_builtin (tree decl)
   case BUILT_IN_BSWAP16:
   case BUILT_IN_BSWAP32:
   case BUILT_IN_BSWAP64:
+  case BUILT_IN_BSWAP128:
   case BUILT_IN_CLZ:
   case BUILT_IN_CLZIMAX:
   case BUILT_IN_CLZL:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index fa8b0641ab1..ee67ac15d5c 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -834,6 +834,8 @@ DEF_GCC_BUILTIN(BUILT_IN_APPLY_ARGS, "apply_args", BT_FN_PTR_VAR, ATTR_L
 DEF_GCC_BUILTIN(BUILT_IN_BSWAP16, "bswap16", BT_FN_UINT16_UINT16, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_BSWAP32, "bswap32", BT_FN_UINT32_UINT32, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_BSWAP64, "bswap64", BT_FN_UINT64_UINT64, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_BSWAP128, "bswap128", BT_FN_UINT128_UINT128, ATTR_CONST_NOTHROW_LEAF_LIST)
+
 DEF_EXT_LIB_BUILTIN(BUILT_IN_CLEAR_CACHE, "__clear_cache", BT_FN_VOID_PTR_PTR, ATTR_NOTHROW_LEAF_LIST)
 /* [trans-mem]: Adjust BUILT_IN_TM_CALLOC if BUILT_IN_CALLOC is changed.  */
 DEF_LIB_BUILTIN(BUILT_IN_CALLOC, "calloc", BT_FN_PTR_SIZE_SIZE, ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST)
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a2ebef8cf8c..cced19d2018 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13784,14 +13784,20 @@ exactly 8 bits.
 
 @deftypefn {Built-in Function} uint32_t __builtin_bswap32 (uint32_t x)
 Similar to @code{__builtin_bswap16}, except the argument and return types
-are 32 bit.
+are 32-bit.
 @end deftypefn
 
 @deftypefn {Built-in Function} uint64_t __builtin_bswap64 (uint64_t x)
 Similar to @code{__builtin_bswap32}, except the argument and return types
-are 64 bit.
+are 64-bit.
 @end deftypefn
 

Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Kewen.Lin via Gcc-patches
on 2020/5/27 3:25 PM, Richard Biener wrote:
> On Tue, 26 May 2020, Segher Boessenkool wrote:
> 
>> Hi!
>>
>> On Tue, May 26, 2020 at 01:29:30PM +0100, Richard Sandiford wrote:
>>> FWIW, I agree adding .LEN_LOAD and .LEN_STORE seems like a good
>>> approach.  I think it'll be more maintainable in the long run than
>>> trying to have .MASK_LOADs and .MASK_STOREs that need a special mask
>>> operand.  (That would be too similar to VEC_COND_EXPR :-))
>>>
>>> Not sure yet what the exact semantics wrt out-of-range values for
>>> the IFN/optab though.  Maybe we should instead have some kind of
>>> abstract, target-specific cookie created by a separate intrinsic.
>>> Haven't thought much about it yet...
>>
>> Or maybe only support 0..N with N the length of the vector?  It is
>> pretty important to support 0 and N, but greater than N isn't as
>> important (it is useful for tricky hand-written code, but not as much
>> for compiler-generate code -- we only support an 8-bit number here on
>> Power, maybe that is why ;-) )
> 
> The question is one of semantics - if power masks the length to an
> 8 bit number it's important to preprocess the IV.  As with my
> other suggestion the question is what to expose to the IL (to GIMPLE)
> here.  Exposing as much as possible will help IV selection but
> will eventually require IFN variations for different semantics.
> 

In the current implementation, we don't use an IFN for the length computation;
the generated code looks something like:

  ivtmp_28 = ivtmp_27 + 16;
  _39 = MIN_EXPR ;  // _32 is the limit
  _40 = _32 - _39; // get the zero bytes for the ending
  _41 = MIN_EXPR <_40, 16>;// check for vector size
  if (ivtmp_28 < _32)

My initial thought was that the len load/store IFNs accept any length
(any value that the length mode can hold).  Since a length larger than the
vector size makes no sense, the hardware can treat it as saturated to the
vector size; if the hardware masks some bits of the length, as ppc does, we
can add a hook to guard the MIN requirement for length generation.  For now,
the MIN is mandatory since ppc is the only user.

FWIW, if we mostly adopt this for epilogues or small loops (iteration
count < VF), the range can be analyzed at compile time, so these MIN
computations can theoretically be optimized away.
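
As a rough sketch of the per-iteration length clamping discussed here (not
the vectorizer's actual output), the loop can be modelled in scalar C.  The
names VEC_BYTES, clamped_len and copy_with_length are hypothetical, the
16-byte vector size is an assumption, and memcpy merely stands in for a
length-controlled vector load/store pair.

```c
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16	/* Assumed vector size in bytes.  */

/* Hypothetical stand-in for the length computation: the number of bytes
   still to process, clamped to the vector size (the MIN step).  */
static size_t
clamped_len (size_t done, size_t limit)
{
  size_t remaining = limit - done;
  return remaining < VEC_BYTES ? remaining : VEC_BYTES;
}

/* Copy LIMIT bytes, emulating a length-controlled load/store pair with
   memcpy.  Returns the number of loop iterations executed.  */
static size_t
copy_with_length (unsigned char *dst, const unsigned char *src, size_t limit)
{
  size_t iters = 0;
  for (size_t done = 0; done < limit; done += VEC_BYTES)
    {
      /* Only the final iteration can see a length below VEC_BYTES.  */
      size_t len = clamped_len (done, limit);
      memcpy (dst + done, src + done, len);
      ++iters;
    }
  return iters;
}
```

When the iteration count is known to be smaller than the vectorization
factor, the clamp is a no-op, which is why the MIN could in principle be
removed for epilogues and very small loops.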

> So yes, 0..N sounds about right here and we'll require a MIN ()
> operation and likely need to teach IV selection about this to at least
> possibly get an IV with the byte size multiplication factored.
> 

FWIW, in the current implementation, the step/limit are already multiplied by
the lane size in bytes, so the IV computation does not need that multiplication.

BR,
Kewen

> Richard.
> 
>>
>> Segher
>>
> 



[PATCH] c++: Try to complete decomp types [PR95328]

2020-05-27 Thread Jakub Jelinek via Gcc-patches
Hi!

Two years ago Paolo added the
  else if (processing_template_decl && !COMPLETE_TYPE_P (type))
pedwarn (...);
lines into cp_finish_decomp.  For a type-dependent decl we punt much earlier,
but even for types which aren't type-dependent COMPLETE_TYPE_P might be
false, as this testcase shows, so this patch tries complete_type first
(it is written that way because it is then followed by another
else if, and if complete_type returns error_mark_node we shouldn't report
anything, as an error should have been reported already).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-05-27  Jakub Jelinek  

PR c++/95328
* decl.c (cp_finish_decomp): Call complete_type before checking
COMPLETE_TYPE_P.

* g++.dg/cpp1z/decomp53.C: New test.

--- gcc/cp/decl.c.jj2020-05-22 11:07:21.884215758 +0200
+++ gcc/cp/decl.c   2020-05-26 15:21:25.039880747 +0200
@@ -8392,6 +8392,8 @@ cp_finish_decomp (tree decl, tree first,
   error_at (loc, "cannot decompose lambda closure type %qT", type);
   goto error_out;
 }
+  else if (processing_template_decl && complete_type (type) == error_mark_node)
+goto error_out;
   else if (processing_template_decl && !COMPLETE_TYPE_P (type))
 pedwarn (loc, 0, "structured binding refers to incomplete class type %qT",
 type);
--- gcc/testsuite/g++.dg/cpp1z/decomp53.C.jj2020-05-26 15:25:01.397644953 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp53.C   2020-05-26 15:24:37.764998398 
+0200
@@ -0,0 +1,22 @@
+// PR c++/95328
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+template 
+struct S
+{
+  int a, b;
+};
+
+template 
+void
+foo ()
+{
+  auto [a, b] = S();  // { dg-warning "structured bindings only 
available with" "" { target c++14_down } }
+}
+
+int
+main ()
+{
+  foo ();
+}

Jakub



Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-27 Thread Martin Liška

On 5/27/20 5:00 AM, dongjianqiang (A) wrote:

Hi GCC maintainers,

Proposed patch to PR95332 - gcov-tool merge:"not a gcov data file"

This error occurs when using gcov-tool merge dir1 dir2, where dir1 and dir2 are
the directories containing the .gcda files, which were generated on machines of
different endianness.

Any suggestions? Thanks.


Hello.

Please write a ChangeLog for the patch.
And please use plain-text email encoding instead of a HTML.

Thanks,
Martin



Regards,

Dong JianQiang





[committed] openmp: Fix up omp_declare_variant{s, _alt} htab handling [PR95315]

2020-05-27 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch fixes a GC ICE.  During debugging, I've found that during
gimplification we can actually call omp_resolve_declare_variant multiple
times, and it would create a new magic declare_variant_alt FUNCTION_DECL
each time, which is undesirable; once we have such a decl, we should just
use that.  The other problem is that there was no cgraph node removal hook.
As the omp_declare_variants htab is used just early during gimplification,
we can just clear the whole htab, rather than trying to lookup and remove
a particular entry.  The other hash table is used later as well and that
one uses just DECL_UID as hash, so in that case the patch removes the elt.
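
A toy model of the "register the removal hook only once" idea, for
illustration: it does not use GCC's cgraph API, and every name in it
(hook_entry, add_removal_hook, on_node_removed, resolve, remove_node) is
hypothetical.

```c
#include <stddef.h>

typedef void (*removal_hook_fn) (int node_id, void *data);

struct hook_entry
{
  removal_hook_fn fn;
  void *data;
};

#define MAX_HOOKS 4
static struct hook_entry hooks[MAX_HOOKS];
static size_t n_hooks;

/* Toy registry: return a handle to the new entry, or NULL if full.  */
static struct hook_entry *
add_removal_hook (removal_hook_fn fn, void *data)
{
  if (n_hooks == MAX_HOOKS)
    return NULL;
  hooks[n_hooks].fn = fn;
  hooks[n_hooks].data = data;
  return &hooks[n_hooks++];
}

/* Stands in for a cache that must be dropped when a node goes away.  */
static int cache_valid = 1;

static void
on_node_removed (int node_id, void *data)
{
  (void) node_id;
  (void) data;
  cache_valid = 0;
}

/* May be called many times; the static holder keeps the hook from
   being registered more than once.  */
static void
resolve (void)
{
  static struct hook_entry *holder;
  if (!holder)
    holder = add_removal_hook (on_node_removed, NULL);
}

/* Run every registered hook for the node being removed.  */
static void
remove_node (int node_id)
{
  for (size_t i = 0; i < n_hooks; ++i)
    hooks[i].fn (node_id, hooks[i].data);
}
```

The static holder starts out null, so the registration must be guarded by
a test for a null holder; otherwise the hook would never be installed.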

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-05-27  Jakub Jelinek  

PR middle-end/95315
* omp-general.c (omp_declare_variant_remove_hook): New function.
(omp_resolve_declare_variant): Always return base if it is already
declare_variant_alt magic decl itself.  Register
omp_declare_variant_remove_hook as cgraph node removal hook.

* gcc.dg/gomp/pr95315.c: New test.

--- gcc/omp-general.c.jj2020-05-26 09:35:12.222438412 +0200
+++ gcc/omp-general.c   2020-05-26 13:06:03.381443260 +0200
@@ -1695,6 +1695,28 @@ omp_resolve_late_declare_variant (tree a
   return varentry2->variant->decl;
 }
 
+/* Hook to adjust hash tables on cgraph_node removal.  */
+
+static void
+omp_declare_variant_remove_hook (struct cgraph_node *node, void *)
+{
+  if (!node->declare_variant_alt)
+return;
+
+  /* Drop this hash table completely.  */
+  omp_declare_variants = NULL;
+  /* And remove node from the other hash table.  */
+  if (omp_declare_variant_alt)
+{
+  omp_declare_variant_base_entry entry;
+  entry.base = NULL;
+  entry.node = node;
+  entry.variants = NULL;
+  omp_declare_variant_alt->remove_elt_with_hash (&entry,
+DECL_UID (node->decl));
+}
+}
+
 /* Try to resolve declare variant, return the variant decl if it should
be used instead of base, or base otherwise.  */
 
@@ -1715,6 +1737,11 @@ omp_resolve_declare_variant (tree base)
break;
   if (TREE_CODE (TREE_PURPOSE (TREE_VALUE (attr))) != FUNCTION_DECL)
continue;
+  cgraph_node *node = cgraph_node::get (base);
+  /* If this is already a magic decl created by this function,
+don't process it again.  */
+  if (node && node->declare_variant_alt)
+   return base;
   switch (omp_context_selector_matches (TREE_VALUE (TREE_VALUE (attr))))
{
case 0:
@@ -1823,6 +1850,12 @@ omp_resolve_declare_variant (tree base)
}
}
 
+  static struct cgraph_node_hook_list *node_removal_hook_holder;
+  if (!node_removal_hook_holder)
+   node_removal_hook_holder
+ = symtab->add_cgraph_removal_hook (omp_declare_variant_remove_hook,
+NULL);
+
   if (omp_declare_variants == NULL)
omp_declare_variants
  = hash_table::create_ggc (64);
--- gcc/testsuite/gcc.dg/gomp/pr95315.c.jj  2020-05-26 13:42:42.990567525 
+0200
+++ gcc/testsuite/gcc.dg/gomp/pr95315.c 2020-05-26 13:42:35.748672050 +0200
@@ -0,0 +1,5 @@
+/* PR middle-end/95315 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp --param ggc-min-heapsize=0" } */
+
+#include "../../c-c++-common/gomp/declare-variant-5.c"

Jakub



Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-27 Thread Kewen.Lin via Gcc-patches
Hi Richard,

Thanks for your comments!

on 2020/5/26 8:49 PM, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> @@ -626,6 +645,12 @@ public:
>>/* True if have decided to use a fully-masked loop.  */
>>bool fully_masked_p;
>>  
>> +  /* Records whether we still have the option of using a length access 
>> loop.  */
>> +  bool can_with_length_p;
>> +
>> +  /* True if have decided to use length access for the loop fully.  */
>> +  bool fully_with_length_p;
> 
> Rather than duplicate the flags like this, I think we should have
> three bits of information:
> 
> (1) Can the loop operate on partial vectors?  Starts off optimistically
> assuming "yes", gets set to "no" when we find a counter-example.
> 
> (2) If we do decide to use partial vectors, will we need loop masks?
> 
> (3) If we do decide to use partial vectors, will we need lengths?
> 
> Vectorisation using partial vectors succeeds if (1) && ((2) != (3))
> 
> LOOP_VINFO_CAN_FULLY_MASK_P currently tracks (1) and
> LOOP_VINFO_MASKS currently tracks (2).  In pathological cases it's
> already possible to have (1) && !(2), see r9-6240 for an example.
> 
> With the new support, LOOP_VINFO_LENS tracks (3).
> 
> So I don't think we need the can_with_length_p.  What is now
> LOOP_VINFO_CAN_FULLY_MASK_P can continue to track (1) for both
> approaches, with the final choice of approach only being made
> at the end.  Maybe it would be worth renaming it to something
> more generic though, now that we have two approaches to partial
> vectorisation.

I like this idea!  I could be wrong, but I'm afraid that we
cannot share one common flag between both approaches, because the
check criteria can differ: a counter-example for length could be
acceptable for masking.  For instance, length can only allow
CONTIGUOUS-related modes, while masking can support more.  When we
see an acceptable VMAT_LOAD_STORE_LANES, we leave
LOOP_VINFO_CAN_FULLY_MASK_P true; should the later length check
turn it to false?  I guess not, and if it stays true, then
LOOP_VINFO_CAN_FULLY_MASK_P will mean partial vectorization
for masking only, not for both.  We could probably clear LOOP_VINFO_LENS
when the length check fails, but then we only know that the vec is empty,
not that we are unable to do partial vectorization with length; on
seeing LOOP_VINFO_CAN_FULLY_MASK_P true, we could still
record lengths into it if possible.
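
For reference, the rule quoted above, (1) && ((2) != (3)), can be written
down as a small predicate.  This is only a sketch of the quoted proposal:
the struct and field names are hypothetical and are not the names used in
the vectorizer.

```c
#include <stdbool.h>

/* Hypothetical summary of the three pieces of information in the quoted
   proposal; the field names are illustrative, not GCC's.  */
struct partial_vect_state
{
  bool can_use_partial_vectors;	/* (1) */
  bool needs_masks;		/* (2) */
  bool needs_lengths;		/* (3) */
};

/* Partial vectorisation succeeds only if it is still possible and exactly
   one of the two approaches (masks or lengths) is required.  */
static bool
partial_vectors_ok_p (const struct partial_vect_state *s)
{
  return s->can_use_partial_vectors
	 && (s->needs_masks != s->needs_lengths);
}
```

Under this rule a loop needing both masks and lengths is rejected, which
matches the suggestion to punt on asymmetrical cases.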

> 
> I think we can assume for now that no arch will be asymmetrical,
> and require (say) loop masks for loads and lengths for stores.
> So if that does happen (i.e. if (2) && (3) ends up being true)
> we should just be able to punt on partial vectorisation.
> 

Agreed, the current implementation prefers masking:
if the loop is fully masked, we disable vector access with length.

> Some of the new length code looks like it's copied and adjusted from the
> corresponding mask code.  It would be good to share the code instead
> where possible, e.g. when deciding whether an IV can overflow.
> 

Yes, some refactoring can be done; it was on my to-do list, and I have
given it priority following your comments.

V2 attached with some changes against V1:
  1) use rgroup_objs for both mask and length
  2) merge both mask and length handlings into 
 vect_set_loop_condition_partial which is renamed and extended
 from vect_set_loop_condition_masked.
  3) renamed and updated vect_set_loop_masks_directly to 
 vect_set_loop_objs_directly.
  4) renamed vect_set_loop_condition_unmasked to 
 vect_set_loop_condition_normal
  5) factored out min_prec_for_max_niters.
  6) added macro LOOP_VINFO_PARTIAL_VECT_P since a few places need
 to check (LOOP_VINFO_FULLY_MASKED_P || LOOP_VINFO_FULLY_WITH_LENGTH_P) 

Tested with ppc64le test cases, will update with changelog if everything
goes well.

BR,
Kewen
---
 gcc/doc/invoke.texi|   7 +
 gcc/params.opt |   4 +
 gcc/tree-vect-loop-manip.c | 266 ++-
 gcc/tree-vect-loop.c   | 311 -
 gcc/tree-vect-stmts.c  | 152 ++
 gcc/tree-vectorizer.h  |  57 +--
 6 files changed, 639 insertions(+), 158 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8b9935dfe65..ac765feab13 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13079,6 +13079,13 @@ by the copy loop headers pass.
 @item vect-epilogues-nomask
 Enable loop epilogue vectorization using smaller vector size.
 
+@item vect-with-length-scope
+Control the scope of vector memory access with length exploitation.  0 means we
+don't exploit any vector memory access with length, 1 means we only exploit
+vector memory access with length for those loops whose iteration count is
+less than VF, such as very small loops or epilogues, 2 means we want to exploit
+vector memory access with length for any loops if possible.
+
 @item slp-max-insns-in-bb
 Maximum number of instructions in basic block to be
 considered for SLP vectorization.
diff --git 

Re: New mklog script

2020-05-27 Thread Martin Liška

On 5/26/20 5:38 PM, Martin Sebor wrote:

By the way, it's nice that the existing gcc- aliases are documented
on https://gcc.gnu.org/gitwrite.html.  I would suggest to add this
one there as well.


Yes. I added the documentation bit and pushed to master.

Martin
From 035bdc56110914329c860870e338463793fb5597 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 27 May 2020 10:22:01 +0200
Subject: [PATCH] Document gcc-commit-mklog hook.

---
 htdocs/gitwrite.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index e3a1305b..b9bcb768 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -436,6 +436,8 @@ repository:
   gcc-verify - verify ChangeLog format for a particular commit
   gcc-backport - alias for git cherry-pick -x
   gcc-mklog - generate a ChangeLog template for a patch
+  gcc-commit-mklog - commit a git revision with a pre-filled
+  ChangeLog template
 
 
 The final customization that the script makes is to add a diff rule so
-- 
2.26.2



Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

2020-05-27 Thread Richard Biener via Gcc-patches
On Wed, May 27, 2020 at 6:36 AM Jiufu Guo  wrote:
>
> Segher Boessenkool  writes:
>
> > Hi!
> >
> > On Tue, May 26, 2020 at 08:58:13AM +0200, Richard Biener wrote:
> >> On Mon, May 25, 2020 at 7:44 PM Segher Boessenkool
> >>  wrote:
> >> > Yes, cunroll does not have its own option, and that is a problem.  But
> >> > that is easy to fix!  Either with an option, or just with params (the
> >> > option wouldn't do more than set params anyway?)
> >>
> >> Well, given coming up with different names for essentially the same
> >> transform is going to be challenging how about sth like
> >>
> >> -funroll-loops={early,late,static,dynamic}[insert better names here]
> >
> > User interface is hard :-)  I think luckily we don't need to change
> > anything there yet, just have an internal flag?
> >
> > But complete unrolling is something quite different, so it should have
> > its own flag anyway (at least internally).
> >
> >> note there's also -fpeel-loops which may match the transform
> >> done on GIMPLE better?
> >
> > Peeling and unrolling are the same thing, if doing complete unrolling
> > (or complete peeling), followed by DCE in both cases.  Peeling is a
> > nicer name here I think, yeah.
> >
> >> I'm not sure which are the technically
> >> correct terms for unrollings that elide the loop (the backedge).
> >
> > I don't know a better term than "complete", I don't remember ever seeing
> > something else either.
>
> How about "Var(flag_cunroll_grow_size) EnabledBy(funroll-loops ||
> funroll-all-loops || fpeel-loops)" Or flag_cunroll_allow_grow_size?
>
> And then using this flags as:
>   unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size
>|| optimize >= 3, true);
>
> And we do not need to enable this flag at -O2.

Sure, this works for me.  Note I'd make funroll-loops enabled by
funroll-all-loops so you could simplify the above.

Richard.

> Thanks for all your helpful comments again!
>
> Jiufu
>
> >
> >> We're doing such kind of unrolling even if we cannot statically
> >> decide which of a set of possible exits we take (and internally
> >> call that peeling, if we can statically decide we call it complete
> >> unrolling).
> >
> > "Peeling" is placing some copies of the loop before the loop;
> > "unrolling" is placing a few copies of the loop inside the loop body.
> > Does that match usage here?
> >
> >> The RTL side OTOH only performs classical unrolling,
> >> preserving the backedge with various strategies for the
> >> remaining iterations.
> >
> > And if you do complete unrolling that way, the backedge can be removed,
> > since it can be shown never to be taken.
> >
> >> As said, for the regression on the 10 branch with ppc I'd add
> >> [a hidden] flag that controls the RTL unroller, also set by
> >> -funroll-loops and triggered by the ppc specific heuristics.
> >
> > But the problem is in cunroll?  This is so backwards...  Because some
> > other transform abuses the unroller flags, adding a second level flag
> > with the same meaning :-(  It will work for fixing the regression,
> > sure, and it is slightly less code as well.
> >
> >
> > Segher


Re: New mklog script

2020-05-27 Thread Martin Liška

On 5/26/20 8:06 PM, Jason Merrill wrote:

gcc-ci-log would be better if you want something short; I'd prefer 
gcc-commit-mklog, and let people define their own shorter aliases as desired.


Hello.

There's a rename I'm going to install.

Martin
From b423f910dcc2a58a86b61cc5b966a81066abbf12 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 27 May 2020 10:16:21 +0200
Subject: [PATCH] Rename commit-mklog alias to gcc-commit-mklog.

contrib/ChangeLog:

	* gcc-git-customization.sh: Rename
	commit-mklog to gcc-commit-mklog.
---
 contrib/gcc-git-customization.sh | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index dcc42683fa6..0e56dcf9873 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -27,10 +27,8 @@ git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream)
 
 git config alias.gcc-verify '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'
 git config alias.gcc-backport '!f() { rev=$1; git cherry-pick -x $@; } ; f'
-
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
-
-git config alias.commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
+git config alias.gcc-commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
 
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
-- 
2.26.2



Re: [PATCH RFC] gcc-git: Add prepare-commit-msg hook.

2020-05-27 Thread Martin Liška

On 5/26/20 8:02 PM, Jason Merrill wrote:

Maybe these environment variables should start with GCC_GIT, not just GCC, to 
give more indication what they're for.


All right, I renamed it and pushed to master.

Martin


Re: [PATCH] Rewrite maintainer-scripts/update_version_git

2020-05-27 Thread Richard Biener
On Wed, 27 May 2020, Jakub Jelinek wrote:

> Hi!
> 
> This patch rewrites update_version_git to be just a thin wrapper around
> Martin's new python script.  This just arranges to check out the gcc
> repo in a temporary directory, copy out the contrib scripts so that
> the running script doesn't change with branch checkouts and runs the script.
> 
> I've run it today manually but hopefully we can do it from cron again
> from tomorrow.
> 
> Ok for trunk?

OK.

Richard.
> 
> 2020-05-27  Jakub Jelinek  
> 
>   * update_version_git: Rewrite using
>   contrib/gcc-changelog/git_update_version.py.
> 
> --- maintainer-scripts/update_version_git.jj
> +++ maintainer-scripts/update_version_git
> @@ -1,85 +1,28 @@
>  #!/bin/sh
>  #
> -# Update the current version date in all files in the tree containing
> -# it.  Consider all single-component-version release branches except
> -# those matching the regular expression in $IGNORE_BRANCHES, and also
> -# consider those branches listed in the space separated list in
> -# $ADD_BRANCHES.
> +# Update the current version date in DATESTAMP files and generate
> +# ChangeLog file entries since the last DATESTAMP update from the
> +# commit messages.
>  
>  GITROOT=${GITROOT:-"/git/gcc.git"}
> -IGNORE_BRANCHES='releases/gcc-(.*\..*|5|6|7)'
> -ADD_BRANCHES='master'
>  
>  # Run this from /tmp.
>  export GITROOT
> -BASEDIR=/tmp/$$
> -/bin/rm -rf "$BASEDIR"
> -/bin/mkdir "$BASEDIR"
> +BASEDIR=`mktemp -d`
>  cd "$BASEDIR"
>  
>  GIT=${GIT:-/usr/local/bin/git}
>  
> -# Compute the branches which we should update.
> -BRANCHES=`(cd $GITROOT \
> -&& ${GIT} for-each-ref --format='%(refname)' \
> -  'refs/heads/releases/gcc-*') \
> -   | sed -e 's/refs\/heads\///' \
> -  | egrep -v $IGNORE_BRANCHES`
> -# Always update the mainline.
> -BRANCHES="${ADD_BRANCHES} ${BRANCHES}"
> -
> -# This is put into the datestamp files.
> -CURR_DATE=`/bin/date +"%Y%m%d"`
> -
> -datestamp_FILES="gcc/DATESTAMP"
> -
> -
>  # Assume all will go well.
> -RESULT=0
>  SUBDIR=$BASEDIR/gcc
> -for BRANCH in $BRANCHES; do
> -  echo "Working on \"$BRANCH\"."
> -  # Check out the files on the branch.
> -  if [ -d "$SUBDIR" ]; then
> -cd "$SUBDIR"
> -${GIT} pull -q
> -${GIT} checkout -q "$BRANCH"
> -  else
> -${GIT} clone -q -b "$BRANCH" "$GITROOT" "$SUBDIR"
> -  fi
> -
> -  # There are no files to commit yet.
> -  COMMIT_FILES=""
> -
> -  cd "$SUBDIR"
> -  for file in $datestamp_FILES; do
> -if test -f $file; then
> -  echo "${CURR_DATE}" > $file.new
> +${GIT} clone -q -b master "$GITROOT" "$SUBDIR"
>  
> -  if /usr/bin/cmp -s $file $file.new; then
> - rm -f $file.new
> -  else
> - mv -f $file.new $file
> -COMMIT_FILES="$COMMIT_FILES $file"
> -  fi
> -fi
> -  done
> +cp -a $SUBDIR/contrib/gcc-changelog $BASEDIR/gcc-changelog
> +cd "$SUBDIR"
> +python3 ../gcc-changelog/git_update_version.py -p
> +RESULT=$?
>  
> -  if test -n "$COMMIT_FILES"; then
> -for i in $COMMIT_FILES; do
> -echo "Attempting to commit $i"
> -if ${GIT} commit -m "Daily bump." $i; then
> -  if ! ${GIT} push origin "$BRANCH"; then
> -# If we could not push the files, indicate failure.
> -RESULT=1
> -  fi
> -else
> -  # If we could not commit the files, indicate failure.
> -  RESULT=1
> -fi
> -done
> -  fi
> -done
> +cd /tmp
>  
>  /bin/rm -rf $BASEDIR
>  exit $RESULT
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] Rewrite maintainer-scripts/update_version_git

2020-05-27 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch rewrites update_version_git to be just a thin wrapper around
Martin's new python script.  This just arranges to check out the gcc
repo in a temporary directory, copy out the contrib scripts so that
the running script doesn't change with branch checkouts and runs the script.
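
The "copy the scripts out first" idea can be pictured with a small Python
model.  This is only a sketch under assumptions: the helper name
run_from_snapshot and its parameters are made up for illustration, and it
takes an already existing checkout instead of cloning one, which the real
shell wrapper does itself.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_from_snapshot(checkout, scripts_rel, script_name, args=()):
    """Copy checkout/scripts_rel aside, run script_name from that copy with
    the checkout as working directory, and return the script's exit status.

    Hypothetical helper, not part of maintainer-scripts."""
    basedir = Path(tempfile.mkdtemp())            # like BASEDIR=`mktemp -d`
    try:
        snapshot = basedir / Path(scripts_rel).name
        # Like `cp -a`: the copy stays fixed even if the checkout later
        # switches branches and the in-tree scripts change underneath us.
        shutil.copytree(Path(checkout) / scripts_rel, snapshot)
        proc = subprocess.run(
            [sys.executable, str(snapshot / script_name), *args],
            cwd=checkout,
        )
        return proc.returncode                    # like RESULT=$?
    finally:
        # Like `/bin/rm -rf $BASEDIR`: always remove the temporary copy.
        shutil.rmtree(basedir, ignore_errors=True)
```

The exit status of the copied script is what gets propagated, matching the
RESULT=$? / exit $RESULT pair in the patch.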

I've run it today manually but hopefully we can do it from cron again
from tomorrow.

Ok for trunk?

2020-05-27  Jakub Jelinek  

* update_version_git: Rewrite using
contrib/gcc-changelog/git_update_version.py.

--- maintainer-scripts/update_version_git.jj
+++ maintainer-scripts/update_version_git
@@ -1,85 +1,28 @@
 #!/bin/sh
 #
-# Update the current version date in all files in the tree containing
-# it.  Consider all single-component-version release branches except
-# those matching the regular expression in $IGNORE_BRANCHES, and also
-# consider those branches listed in the space separated list in
-# $ADD_BRANCHES.
+# Update the current version date in DATESTAMP files and generate
+# ChangeLog file entries since the last DATESTAMP update from the
+# commit messages.
 
 GITROOT=${GITROOT:-"/git/gcc.git"}
-IGNORE_BRANCHES='releases/gcc-(.*\..*|5|6|7)'
-ADD_BRANCHES='master'
 
 # Run this from /tmp.
 export GITROOT
-BASEDIR=/tmp/$$
-/bin/rm -rf "$BASEDIR"
-/bin/mkdir "$BASEDIR"
+BASEDIR=`mktemp -d`
 cd "$BASEDIR"
 
 GIT=${GIT:-/usr/local/bin/git}
 
-# Compute the branches which we should update.
-BRANCHES=`(cd $GITROOT \
-  && ${GIT} for-each-ref --format='%(refname)' \
-'refs/heads/releases/gcc-*') \
- | sed -e 's/refs\/heads\///' \
-  | egrep -v $IGNORE_BRANCHES`
-# Always update the mainline.
-BRANCHES="${ADD_BRANCHES} ${BRANCHES}"
-
-# This is put into the datestamp files.
-CURR_DATE=`/bin/date +"%Y%m%d"`
-
-datestamp_FILES="gcc/DATESTAMP"
-
-
 # Assume all will go well.
-RESULT=0
 SUBDIR=$BASEDIR/gcc
-for BRANCH in $BRANCHES; do
-  echo "Working on \"$BRANCH\"."
-  # Check out the files on the branch.
-  if [ -d "$SUBDIR" ]; then
-cd "$SUBDIR"
-${GIT} pull -q
-${GIT} checkout -q "$BRANCH"
-  else
-${GIT} clone -q -b "$BRANCH" "$GITROOT" "$SUBDIR"
-  fi
-
-  # There are no files to commit yet.
-  COMMIT_FILES=""
-
-  cd "$SUBDIR"
-  for file in $datestamp_FILES; do
-if test -f $file; then
-  echo "${CURR_DATE}" > $file.new
+${GIT} clone -q -b master "$GITROOT" "$SUBDIR"
 
-  if /usr/bin/cmp -s $file $file.new; then
-   rm -f $file.new
-  else
-   mv -f $file.new $file
-COMMIT_FILES="$COMMIT_FILES $file"
-  fi
-fi
-  done
+cp -a $SUBDIR/contrib/gcc-changelog $BASEDIR/gcc-changelog
+cd "$SUBDIR"
+python3 ../gcc-changelog/git_update_version.py -p
+RESULT=$?
 
-  if test -n "$COMMIT_FILES"; then
-for i in $COMMIT_FILES; do
-echo "Attempting to commit $i"
-if ${GIT} commit -m "Daily bump." $i; then
-  if ! ${GIT} push origin "$BRANCH"; then
-# If we could not push the files, indicate failure.
-RESULT=1
-  fi
-else
-  # If we could not commit the files, indicate failure.
-  RESULT=1
-fi
-done
-  fi
-done
+cd /tmp
 
 /bin/rm -rf $BASEDIR
 exit $RESULT

Jakub



Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, 26 May 2020, Jim Wilson wrote:
>
>> On Tue, May 26, 2020 at 12:12 AM Richard Biener  wrote:
>> > From a look at the series description below you seem to add a new way
>> > of doing loads for this.  Did you review other ISAs (those I'm not
>> > familiar with myself too much are SVE, RISC-V and GCN) in GCC whether
>> > they have similar support and whether your approach can be supported
>> > there?  ISTR SVE must have some similar support - what's the reason
>> > you do not piggy-back on that?
>> 
>> There isn't any RISC-V Vector support in GCC yet.  The RVV spec is
>> still in draft and still occasionally changing in incompatible ways.
>> We've done some experimenting with gcc patches, but all we have are
>> intrinsics.  We haven't implemented any auto vectorization support, so
>> we haven't defined tree representations for anything yet, other than
>> the types we need for intrinsics support.  But if it looks OK for SVE
>> then it probably will be OK for RVV.
>
> Btw, I'm specifically looking for other load/store with length
> implementations and as to whether they agree on taking bytes for
> the length rather than, for example the number of lanes.  I guess
> exposing this detail on GIMPLE can help IV selection but if we'd
> ever get a differing semantics ISA we'd have to add another set
> of IFNs, so maybe the PPC ones should be named in a more specific
> way like _WITH_BYTES or _BYTES or _WITH_BYTE_LENGTH or so to
> allow _WITH_LANES?

Maybe that detail is another thing that a cookie could hide.  We'd then
potentially need one IFN per approach to calculating the length parameter
(bytes vs. elements, self-capping vs. explicit capping, etc.), but it would
only be one IFN per approach, rather than the combinatorial explosion
we'd get from one IFN per approach*load/store-kind.

It doesn't make much difference when we only have one LOAD and one STORE
per approach.  But I imagine this will be useful for MVE, and there we'll
want extending loads, truncating stores, gathers and scatters too.

Thanks,
Richard


Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Richard Biener
On Tue, 26 May 2020, Segher Boessenkool wrote:

> Hi!
> 
> On Tue, May 26, 2020 at 01:29:30PM +0100, Richard Sandiford wrote:
> > FWIW, I agree adding .LEN_LOAD and .LEN_STORE seems like a good
> > approach.  I think it'll be more maintainable in the long run than
> > trying to have .MASK_LOADs and .MASK_STOREs that need a special mask
> > operand.  (That would be too similar to VEC_COND_EXPR :-))
> > 
> > Not sure yet what the exact semantics should be wrt out-of-range values for
> > the IFN/optab though.  Maybe we should instead have some kind of
> > abstract, target-specific cookie created by a separate intrinsic.
> > Haven't thought much about it yet...
> 
> Or maybe only support 0..N with N the length of the vector?  It is
> pretty important to support 0 and N, but greater than N isn't as
> important (it is useful for tricky hand-written code, but not as much
> for compiler-generated code -- we only support an 8-bit number here on
> Power, maybe that is why ;-) )

The question is one of semantics - if Power masks the length to an
8-bit number, it's important to preprocess the IV.  As with my
other suggestion the question is what to expose to the IL (to GIMPLE)
here.  Exposing as much as possible will help IV selection but
will eventually require IFN variations for different semantics.

So yes, 0..N sounds about right here and we'll require a MIN ()
operation and likely need to teach IV selection about this to at least
possibly get an IV with the byte size multiplication factored.
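
A scalar model may make the 0..N semantics and the MIN () concrete.  This
is a sketch only: len_load, len_store and VEC_BYTES are hypothetical names
for illustration, not the proposed IFNs, and zero-filling the inactive
bytes on load is an assumption the thread does not settle.

```python
VEC_BYTES = 16  # assumed vector size in bytes (the "N" above)


def len_load(mem, offset, nbytes):
    """Load nbytes (0..VEC_BYTES) from mem at offset into a vector image.
    Bytes past nbytes are zero-filled here (an assumption)."""
    assert 0 <= nbytes <= VEC_BYTES
    return bytes(mem[offset:offset + nbytes]) + bytes(VEC_BYTES - nbytes)


def len_store(mem, offset, vec, nbytes):
    """Store only the first nbytes (0..VEC_BYTES) of vec to mem at offset."""
    assert 0 <= nbytes <= VEC_BYTES
    mem[offset:offset + nbytes] = vec[:nbytes]


def copy_bytes(dst, src, total):
    """Copy total bytes using length-controlled vector accesses."""
    done = 0
    while done < total:
        # The MIN (): cap the remaining byte count to the vector size so
        # the length operand always stays within 0..VEC_BYTES.
        n = min(total - done, VEC_BYTES)
        len_store(dst, done, len_load(src, done, n), n)
        done += n
```

With the cap made explicit in the loop, the length operand never depends on
how a particular target treats values above N.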

Richard.

> 
> Segher
> 



Re: [PATCH 0/7] Support vector load/store with length

2020-05-27 Thread Richard Biener
On Tue, 26 May 2020, Jim Wilson wrote:

> On Tue, May 26, 2020 at 12:12 AM Richard Biener  wrote:
> > From a look at the series description below you seem to add a new way
> > of doing loads for this.  Did you review other ISAs (those I'm not
> > familiar with myself too much are SVE, RISC-V and GCN) in GCC whether
> > they have similar support and whether your approach can be supported
> > there?  ISTR SVE must have some similar support - what's the reason
> > you do not piggy-back on that?
> 
> There isn't any RISC-V Vector support in GCC yet.  The RVV spec is
> still in draft and still occasionally changing in incompatible ways.
> We've done some experimenting with gcc patches, but all we have are
> intrinsics.  We haven't implemented any auto vectorization support, so
> we haven't defined tree representations for anything yet, other than
> the types we need for intrinsics support.  But if it looks OK for SVE
> then it probably will be OK for RVV.

Btw, I'm specifically looking for other load/store with length
implementations and as to whether they agree on taking bytes for
the length rather than, for example the number of lanes.  I guess
exposing this detail on GIMPLE can help IV selection but if we'd
ever get a differing semantics ISA we'd have to add another set
of IFNs, so maybe the PPC ones should be named in a more specific
way like _WITH_BYTES or _BYTES or _WITH_BYTE_LENGTH or so to
allow _WITH_LANES?
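
To illustrate the naming concern, here is a sketch of how the same
remaining work would turn into different length operands depending on the
unit.  The function length_operand and its parameters are hypothetical and
only model the arithmetic; no existing ISA or IFN is implied.

```python
def length_operand(remaining_elems, elem_size, vec_bytes, unit):
    """Return the length operand for one vector iteration.

    remaining_elems: elements still to process
    elem_size:       bytes per element
    vec_bytes:       vector size in bytes
    unit:            "bytes" or "lanes" (the two semantics being compared)
    """
    lanes = min(remaining_elems, vec_bytes // elem_size)
    if unit == "lanes":
        return lanes
    if unit == "bytes":
        return lanes * elem_size
    raise ValueError("unknown unit: " + unit)
```

For three remaining 4-byte elements and a 16-byte vector, the lane-based
operand is 3 while the byte-based operand is 12, so one IFN cannot serve
both conventions without saying which one it means.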

Richard.

> Jim
> 



Re: [PATCH] Add debug (slp_tree) and dump infrastructure for this

2020-05-27 Thread Richard Biener
On Tue, 26 May 2020, David Malcolm wrote:

> On Mon, 2020-05-25 at 16:56 +0200, Richard Biener wrote:
> > This adds an alternate debug_dump_context similar to the one for
> > selftests but for interactive debugging routines.  This allows
> > sharing code between user-visible dumping via the dump_* API
> > and those debugging routines.  The primary driver was SLP node
> > dumping which wasn't accessible from inside a gdb session up to
> > now.
> > 
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > 
> > David, does this look OK?
> 
> Overall, seems sane to me; a couple of items inline below.
> 
> > Thanks,
> > Richard.
> > 
> > 2020-05-25  Richard Biener  
> > 
> > * dump-context.h (debug_dump_context): New class.
> > (dump_context): Make it friend.
> > * dumpfile.c (debug_dump_context::debug_dump_context):
> > Implement.
> > (debug_dump_context::~debug_dump_context): Likewise.
> > * tree-vect-slp.c: Include dump-context.h.
> > (vect_print_slp_tree): Dump a single SLP node.
> > (debug): New overload for slp_tree.
> > (vect_print_slp_graph): Rename from vect_print_slp_tree and
> > use that.
> > (vect_analyze_slp_instance): Adjust.
> > ---
> >  gcc/dump-context.h  | 19 ++
> >  gcc/dumpfile.c  | 26 +
> >  gcc/tree-vect-slp.c | 47 +
> > 
> >  3 files changed, 80 insertions(+), 12 deletions(-)
> > 
> > diff --git a/gcc/dump-context.h b/gcc/dump-context.h
> > index 347477f331e..6d51eaf31ad 100644
> > --- a/gcc/dump-context.h
> > +++ b/gcc/dump-context.h
> > @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
> >  
> >  class optrecord_json_writer;
> >  namespace selftest { class temp_dump_context; }
> > +class debug_dump_context;
> >  
> >  /* A class for handling the various dump_* calls.
> >  
> > @@ -42,6 +43,7 @@ namespace selftest { class temp_dump_context; }
> >  class dump_context
> >  {
> >friend class selftest::temp_dump_context;
> > +  friend class debug_dump_context;
> >  
> >   public:
> > >static dump_context &get () { return *s_current; }
> > @@ -195,6 +197,23 @@ private:
> > >auto_vec<stashed_item> m_stashed_items;
> >  };
> >  
> > +/* An RAII-style class for use in debug dumpers for temporarily using a
> > +   different dump_context.  */
> > +
> > +class debug_dump_context
> 
> (Bikeshed Alert)  The name might be confusing in that this class isn't
> a dump_context itself.  Some of our existing RAII classes have an
> "auto_" prefix; would that be an idea?
> Maybe "auto_dump_everything"???
> 
> But I don't have a strong objection to the name as-is.

I kept it as-is but improved the class comment.

> [...snip...]
> 
> 
> > diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
> > index 54718784fd4..0c0c076d890 100644
> > --- a/gcc/dumpfile.c
> > +++ b/gcc/dumpfile.c
> > @@ -2078,6 +2078,32 @@ enable_rtl_dump_file (void)
> >return num_enabled > 0;
> >  }
> >  
> > +/* debug_dump_context's ctor.  Temporarily override the dump_context
> > +   (to forcibly enable output to stderr).  */
> > +
> > +debug_dump_context::debug_dump_context ()
> > +: m_context (),
> > +  m_saved (&dump_context::get ()),
> > +  m_saved_flags (dump_flags),
> > +  m_saved_file (dump_file)
> > +{
> > +  set_dump_file (stderr);
> > +  dump_context::s_current = &m_context;
> > +  pflags = dump_flags = MSG_ALL_KINDS | MSG_ALL_PRIORITIES;
> > +  dump_context::get ().refresh_dumps_are_enabled ();
> > +}
> > +
> > +/* debug_dump_context's dtor.  Restore the saved dump_context.  */
> > +
> > +debug_dump_context::~debug_dump_context ()
> > +{
> > +  set_dump_file (m_saved_file);
> > +  dump_context::s_current = m_saved;
> > +  pflags = dump_flags = m_saved_flags;
> > +  dump_context::get ().refresh_dumps_are_enabled ();
> > +}
> 
> I notice that the code saves dump_flags, and later restores both
> dump_flags and pflags to the same value.  I'm a little hazy on this,
> but is there any guarantee they had the same value?  Should the value
> of pflags be saved separately from dump_flags?

Hum, right.  Better be safe than sorry.
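
The point about saving the two values separately can be sketched in Python
with a context manager standing in for the RAII class.  The dictionary
keys mirror the variable names from the patch, but debug_dump_override and
the dictionary itself are hypothetical stand-ins, not GCC code.

```python
from contextlib import contextmanager


@contextmanager
def debug_dump_override(state, forced_flags):
    """Temporarily force both flag words to forced_flags, then restore each
    one to its own saved value (they need not have been equal)."""
    saved_dump_flags = state["dump_flags"]
    saved_pflags = state["pflags"]      # saved separately from dump_flags
    state["dump_flags"] = state["pflags"] = forced_flags
    try:
        yield state
    finally:
        state["dump_flags"] = saved_dump_flags
        state["pflags"] = saved_pflags
```

Restoring both from a single saved value would silently overwrite pflags
whenever the two differed on entry; keeping two saved copies avoids that.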

Re-testing the following, will commit after that succeeds.

Richard.

From 73ddc25a76088571675aeccdd0537ebb2831b863 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Mon, 25 May 2020 16:10:12 +0200
Subject: [PATCH] Add debug (slp_tree) and dump infrastructure for this

This adds an alternate debug_dump_context similar to the one for
selftests but for interactive debugging routines.  This allows
sharing code between user-visible dumping via the dump_* API
and those debugging routines.  The primary driver was SLP node
dumping which wasn't accessible from inside a gdb session up to
now.

2020-05-27  Richard Biener  

* dump-context.h (debug_dump_context): New class.
(dump_context): Make it friend.
* dumpfile.c (debug_dump_context::debug_dump_context):
Implement.
(debug_dump_context::~debug_dump_context): Likewise.
* 

Re: [PATCH] PR94397 the compiler consider "type is( real(kind(1.)) )" as a syntax error

2020-05-27 Thread Mark Eggleston

ping

On 13/05/2020 18:19, Mark Eggleston wrote:

Please find attached a patch for PR94397.

Commit message:

Fortran  : "type is( real(kind(1.)) )" spurious syntax error PR94397

Based on a patch in the comments of the PR. That patch fixed this problem
but caused the test cases for PR93484 to fail. Changed to reduce
initialisation expressions if the expression is not EXPR_VARIABLE and not
EXPR_CONSTANT.

2020-05-13  Steven G. Kargl  
            Mark Eggleston 

gcc/fortran/

    PR fortran/94397
    * match.c (gfc_match_type_spec): New variable ok initialised
    to true. Set ok with the return value of gfc_reduce_init_expr
    called only if the expression is not EXPR_CONSTANT and is not
    EXPR_VARIABLE. Add !ok to the check for type not being integer
    or the rank being greater than zero.

2020-05-13  Mark Eggleston 

gcc/testsuite/

    PR fortran/94397
    * gfortran.dg/pr94397.F90: New test.

The formatting with tabs and date will be corrected prior to commit.

Tested on x86_64 for master, releases/gcc-9, releases/gcc-10 branches. 
OK to commit and backport?



--
https://www.codethink.co.uk/privacy.html



Re: [PATCH] Fortran : ICE in gfc_trans_label_assign PR50392

2020-05-27 Thread Mark Eggleston

ping

On 19/05/2020 08:49, Mark Eggleston wrote:

Please find attached patch for PR50392.

This patch was extracted from the comments in the PR and was written 
back in 2011! I have verified that it fixes the PR on master, gcc-8, 
gcc-9 and gcc-10.


Commit message:

Fortran  : ICE in gfc_trans_label_assign PR50392

A function may contain an assigned goto.  If the return variable
is an integer a statement can be assigned to it.  Prior to this fix
this resulted in an ICE.

2020-05-19  Tobias Burnus 

gcc/fortran/

    * trans-decl.c (gfc_get_symbol_decl): Remove unnecessary block
    delimiters.  Add auxiliary variables if a label is assigned to
    a return variable. (gfc_gat_fake_result): ???

2020-05-19  Mark Eggleston 

gcc/testsuite/

    * gfortran.dg/pr50392.f: New test.

As can be seen there is a sequence of question marks in the 
description of the changes made. I don't know how to describe the 
inserted code, Tobias Burnus: you may be able to help as it is your code.


Tested on x86_64 using make check-fortran for master, gcc-8, gcc-9 and
gcc-10.


Once the addition to the description of the code changes is complete,
will it be OK to commit to master and backport to gcc-8, 9 and 10?






Re: [PATCH] Fix nonconforming memory_operand for vpmov instructions which has memory operand narrow than 128 bits [avx512f]

2020-05-27 Thread Hongtao Liu via Gcc-patches
On Mon, May 25, 2020 at 8:41 PM Uros Bizjak  wrote:
>
> On Mon, May 25, 2020 at 2:21 PM Hongtao Liu  wrote:
> >
> >   According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> > memory_operand instead of 128-bit one which exists in current
> > > implementation. The same applies to other vpmov instructions whose
> > > memory_operand is narrower than 128 bits.
> >
> >   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
>
>
> +  [(set (match_operand:HI 0 "memory_operand" "=m")
> +(subreg:HI (any_truncate:V2QI
> + (match_operand:V2DI 1 "register_operand" "v")) 0))]
>
> This should store in V2QImode, subregs are not allowed in insn patterns.
>
> You need a pre-reload splitter to split from register_operand to a
> memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so
> perhaps he can give some additional advice here.
>

Like this?
---
(define_insn "*avx512vl_v2div2qi2_store"
  [(set (match_operand:V2QI 0 "memory_operand" "=m")
(any_truncate:V2QI
  (match_operand:V2DI 1 "register_operand" "v")))]
  "TARGET_AVX512VL"
  "vpmovqb\t{%1, %0|%0, %1}"
  [(set_attr "type" "ssemov")
   (set_attr "memory" "store")
   (set_attr "prefix" "evex")
   (set_attr "mode" "TI")])

(define_insn_and_split "*avx512vl_v2div2qi2_store"
  [(set (match_operand:HI 0 "memory_operand")
(subreg:HI
  (any_truncate:V2QI
(match_operand:V2DI 1 "register_operand")) 0))]
  "TARGET_AVX512VL && ix86_pre_reload_split ()"
  "#"
  "&& 1"
  [(set (match_dup 0)
(any_truncate:V2QI (match_dup 1)))]
  "operands[0] = adjust_address_nv (operands[0], V2QImode, 0);")
---
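
As a sanity check on the 16-bit figure, a tiny Python model of the store
may help.  It is a sketch only: vpmovqb_store is a made-up name, and it
models plain truncation of the two V2DI lanes, not the saturating forms.

```python
def vpmovqb_store(qwords):
    """Truncate each 64-bit lane to its low byte and return the bytes that
    would be written to memory, lowest lane first."""
    return bytes(q & 0xFF for q in qwords)


# Two 64-bit lanes (V2DI) become two bytes (V2QI), i.e. a 16-bit store.
stored = vpmovqb_store([0x1122334455667788, 0x01FF])
stored_bits = len(stored) * 8
```

Two lanes of one byte each give exactly the m16 operand from the SDM, which
is why the store pattern wants a V2QImode memory operand.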

> Uros.
>
>
> > gcc/ChangeLog
> >
> > * config/i386/sse.md (*avx512vl_v2div2qi2_store): Refine
> > size of memory_operand according to Intel SDM.
> > (avx512vl_v2div2qi2_mask_store): Ditto.
> > (*avx512vl_v4qi2_store): Ditto.
> > (avx512vl_v4qi2_mask_store): Ditto.
> > (*avx512vl_v8qi2_store): Ditto.
> > (avx512vl_v8qi2_mask_store): Ditto.
> > (*avx512vl_v4hi2_store): Ditto.
> > (avx512vl_v4hi2_mask_store): Ditto.
> > (*avx512vl_v2div2hi2_store): Ditto.
> > (avx512vl_v2div2hi2_mask_store): Ditto.
> > (*avx512vl_v2div2si2_store): Ditto.
> > (avx512vl_v2div2si2_mask_store): Ditto.
> > (*avx512f_v8div16qi2_store): Ditto.
> > (avx512f_v8div16qi2_mask_store): Ditto.
> > * config/i386/i386-builtin-types.def: Adjust builtin type.
> > * config/i386/i386-expand.c: Ditto.
> > * config/i386/i386-builtin.def: Adjust builtin.
> > * config/i386/avx512fintrin.h: Ditto.
> > * config/i386/avx512vlbwintrin.h: Ditto.
> > * config/i386/avx512vlintrin.h: Ditto.
> >
> >   I think the code I changed is already covered by existing intrinsics
> > tests, so I didn't add any new tests.
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao