date:20200624

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Andrew Waterman

On Wed, Jun 24, 2020 at 5:56 PM Jim Wilson  wrote:
>
> On Wed, Jun 24, 2020 at 1:35 AM Richard Sandiford
>  wrote:
> > Richard Biener  writes:
> > > AVX512 would have V16SImode and SImode because the mask would have
> > > an integer mode?  Likewise I could imagine RISC-V using V4SImode and 
> > > V4QImode
> > > or however their mask registers look like.
>
> RISC-V has 7 mask modes, currently.  Masks are defined to always fit
> in one register.  So if you have a register group of 8 (mlen=8) and an
> element length of 8-bits (elen=8) then you have 1 bit of mask per
> element which completely fills one register.  If you have a register
> group of 1 (mlen=1) and an element length of 64-bits (elen=64) then
> you have 64-bits of mask per element which completely fills one
> register.  So we need a mode for each possible way of tiling a
> register with mask bits, which is 2**x for x=(0..6).

The more recent (and hopefully final) ISA design has element i masked
by mask bit i, irrespective of element size--i.e., mlen=1
unconditionally.  Sorry about the churn.

>
> > But what I mean is, once you know the vector mode, there should only
> > be one “choice” of mask mode.
>
> This is true for RISC-V also.  The mask mode is determined by the
> number of registers in the group and the element width, so there is
> only one valid mask mode for each vector mode.

And of course this is still true.

>
> Jim

[PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread y2s1982 via Gcc-patches

This patch adds some unit tests for omp-tools.h header. It also adds some simple
functions related to OMPD API versions. It also partially defines the OMPD
initialization function. More OMPD configuration is also added into Makefile.am.

2020-06-24  Tony Sim  

libgomp/ChangeLog:

* Makefile.am(toolexeclib_LTLIBRARIES): Add libgompd.la.
(libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script,
libgompd.ver-sun, libgompd.ver, libgompd_version_info): Set.
* Makefile.in: Regnerate.
* config/darwin/plugin-suffix.h (SONAME_SUFFIX): Remove ().
* config/hpux/plugin-suffix.h (SONAME_SUFFIX): Remove ().
* config/posix/plugin-suffix.h (SONAME_SUFFIX): Remove ().
* omp-tools.h: Comment out ompd_dll_locations.
* testsuite/Makefile.in: Regnerate.
* libgompd.h: New file.
* libgompd.map: New file.
* libgompd.s: New file.
* ompd-lib.c: New file.
* testsuite/libgomp.ompd/header-1.c: New test.
* testsuite/libgomp.ompd/header-order-1.c: New test.
* testsuite/libgomp.ompd/header-order-2.c: New test.
* testsuite/libgomp.ompd/ompd.exp: New test.

---
 libgomp/Makefile.am   | 29 +++-
 libgomp/Makefile.in   | 39 ++-
 libgomp/config/darwin/plugin-suffix.h |  2 +-
 libgomp/config/hpux/plugin-suffix.h   |  2 +-
 libgomp/config/posix/plugin-suffix.h  |  2 +-
 libgomp/libgompd.h| 37 +++
 libgomp/libgompd.map  | 49 ++
 libgomp/libgompd.s|  1 +
 libgomp/omp-tools.h   |  2 +-
 libgomp/ompd-lib.c| 66 +++
 libgomp/testsuite/Makefile.in |  1 +
 libgomp/testsuite/libgomp.ompd/header-1.c | 10 +++
 .../testsuite/libgomp.ompd/header-order-1.c   | 11 
 .../testsuite/libgomp.ompd/header-order-2.c   | 11 
 libgomp/testsuite/libgomp.ompd/ompd.exp   | 38 +++
 15 files changed, 291 insertions(+), 9 deletions(-)
 create mode 100644 libgomp/libgompd.h
 create mode 100644 libgomp/libgompd.map
 create mode 100644 libgomp/libgompd.s
 create mode 100644 libgomp/ompd-lib.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-1.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-order-1.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-order-2.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/ompd.exp

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 4d31f4cef46..e15a838e55c 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -20,7 +20,7 @@ AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 
-toolexeclib_LTLIBRARIES = libgomp.la
+toolexeclib_LTLIBRARIES = libgomp.la libgompd.la
 nodist_toolexeclib_HEADERS = libgomp.spec
 
 if LIBGOMP_BUILD_VERSIONED_SHLIB
@@ -31,14 +31,21 @@ PREPROCESS = $(subst -Wc$(comma), , $(COMPILE)) -E
 libgomp.ver: $(top_srcdir)/libgomp.map
$(EGREP) -v '#(#| |$$)' $< | \
  $(PREPROCESS) -P -include config.h - > $@ || (rm -f $@ ; exit 1)
+libgompd.ver: $(top_srcdir)/libgompd.map
+   $(EGREP) -v '#(#| |$$)' $< | \
+ $(PREPROCESS) -P -include config.h - > $@ || (rm -f $@ ; exit 1)
 
 if LIBGOMP_BUILD_VERSIONED_SHLIB_GNU
 libgomp_version_script = -Wl,--version-script,libgomp.ver
+libgompd_version_script = -Wl,--version-script,libgompd.ver
 libgomp_version_dep = libgomp.ver
+libgompd_version_dep = libgompd.ver
 endif
 if LIBGOMP_BUILD_VERSIONED_SHLIB_SUN
 libgomp_version_script = -Wl,-M,libgomp.ver-sun
+libgompd_version_script = -Wl,-M,libgompd.ver-sun
 libgomp_version_dep = libgomp.ver-sun
+libgompd_version_dep = libgompd.ver-sun
 libgomp.ver-sun : libgomp.ver \
$(top_srcdir)/../contrib/make_sunver.pl \
$(libgomp_la_OBJECTS) $(libgomp_la_LIBADD)
@@ -48,16 +55,32 @@ libgomp.ver-sun : libgomp.ver \
 `echo $(libgomp_la_LIBADD) | \
sed 's,/\([^/.]*\)\.la,/.libs/\1.a,g'` \
 > $@ || (rm -f $@ ; exit 1)
+libgompd.ver-sun : libgompd.ver \
+   $(top_srcdir)/../contrib/make_sunver.pl \
+   $(libgompd_la_OBJECTS) $(libgompd_la_LIBADD)
+   perl $(top_srcdir)/../contrib/make_sunver.pl \
+ libgompd.ver \
+ $(libgompd_la_OBJECTS:%.lo=.libs/%.o) \
+`echo $(libgompd_la_LIBADD) | \
+   sed 's,/\([^/.]*\)\.la,/.libs/\1.a,g'` \
+> $@ || (rm -f $@ ; exit 1)
 endif
 else
 libgomp_version_script =
+libgompd_version_script =
 libgomp_version_dep =
+libgompd_version_dep =
 endif
 libgomp_version_info = -version-info $(libtool_VERSION)
+libgompd_version_info = -version-info $(libtool_VERSION)
 libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
-

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Jim Wilson

On Wed, Jun 24, 2020 at 1:35 AM Richard Sandiford
 wrote:
> Richard Biener  writes:
> > AVX512 would have V16SImode and SImode because the mask would have
> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and 
> > V4QImode
> > or however their mask registers look like.

RISC-V has 7 mask modes, currently.  Masks are defined to always fit
in one register.  So if you have a register group of 8 (mlen=8) and an
element length of 8-bits (elen=8) then you have 1 bit of mask per
element which completely fills one register.  If you have a register
group of 1 (mlen=1) and an element length of 64-bits (elen=64) then
you have 64-bits of mask per element which completely fills one
register.  So we need a mode for each possible way of tiling a
register with mask bits, which is 2**x for x=(0..6).

> But what I mean is, once you know the vector mode, there should only
> be one “choice” of mask mode.

This is true for RISC-V also.  The mask mode is determined by the
number of registers in the group and the element width, so there is
only one valid mask mode for each vector mode.

Jim

Re: [PATCH] Add TARGET_UPDATE_DECL_ALIGNMENT [PR95237]

2020-06-24 Thread Sunil Pandey via Gcc-patches

On Wed, Jun 24, 2020 at 12:30 AM Richard Biener
 wrote:
>
> On Tue, Jun 23, 2020 at 5:31 PM Sunil K Pandey via Gcc-patches
>  wrote:
> >
> > From: Sunil K Pandey 
> >
> > Default for this hook is NOP. For x86, in 32 bit mode, this hook
> > sets alignment of long long on stack to 32 bits if preferred stack
> > boundary is 32 bits.
> >
> >  - This patch fixes
> > gcc.target/i386/pr69454-2.c
> > gcc.target/i386/stackalign/longlong-1.c
> >  - Regression test on x86-64, no new fail introduced.
>
> I think the name is badly chosen, TARGET_LOWER_LOCAL_DECL_ALIGNMENT

Yes, I can change the target hook name.

> would be better suited (and then asks for LOCAL_DECL_ALIGNMENT to be
> renamed to INCREASE_LOCAL_DECL_ALIGNMENT).

It seems like LOCAL_DECL_ALIGNMENT macro documentation is incorrect.
It increases as well as decreases alignment based on condition(-m32
-mpreferred-stack-boundary=2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95885

>
> You're calling it from do_type_align which IMHO is dangerous since that's
> invoked from FIELD_DECL layout as well.  Instead invoke it from
> layout_decl itself where we do
>
>   if (code != FIELD_DECL)
> /* For non-fields, update the alignment from the type.  */
> do_type_align (type, decl);
>
> and invoke the hook _after_ do_type_align.  Also avoid
> invoking the hook on globals or hard regs and only
> invoke it on VAR_DECLs, thus only
>
>   if (VAR_P (decl) && !is_global_var (decl) && !DECL_HARD_REGISTER (decl))

It seems like decl property is not fully populated at this point call
to is_global_var (decl) on global variable return false.

$ cat foo.c
long long x;
int main()
{
if (__alignof__(x) != 8)
  __builtin_abort();
return 0;
}

Breakpoint 1, layout_decl (decl=0x77ffbb40, known_align=0)
at /local/skpandey/gccwork/gccwork/gcc/gcc/stor-layout.c:674
674 do_type_align (type, decl);
Missing separate debuginfos, use: dnf debuginfo-install
gmp-6.1.2-10.fc31.x86_64 isl-0.16.1-9.fc31.x86_64
libmpc-1.1.0-4.fc31.x86_64 mpfr-3.1.6-5.fc31.x86_64
zlib-1.2.11-20.fc31.x86_64
(gdb) call debug_tree(decl)
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1
canonical-type 0x7fffea801888 precision:64 min  max 
pointer_to_this >
DI foo.c:1:11 size  unit-size

align:1 warn_if_not_align:0>

(gdb) p is_global_var(decl)
$1 = false
(gdb)


What about calling hook here

 603 do_type_align (tree type, tree decl)
 604 {
 605   if (TYPE_ALIGN (type) > DECL_ALIGN (decl))
 606 {
 607   SET_DECL_ALIGN (decl, TYPE_ALIGN (type));
 608   if (TREE_CODE (decl) == FIELD_DECL)
 609 DECL_USER_ALIGN (decl) = TYPE_USER_ALIGN (type);
 610   else
 611 /* Lower local decl alignment */
 612 if (VAR_P (decl)
 613 && !is_global_var (decl)
 614 && !DECL_HARD_REGISTER (decl)
 615 && cfun != NULL)
 616   targetm.lower_local_decl_alignment (decl);
 617 }

>
> Comments on the hook itself below.
>
> > Tested on x86-64.
> >
> > gcc/ChangeLog:
> >
> > PR target/95237
> > * config/i386/i386.c (ix86_update_decl_alignment): New
> > function.
> > (TARGET_UPDATE_DECL_ALIGNMENT): Define.
> > * doc/tm.texi: Regenerate.
> > * doc/tm.texi.in (TARGET_UPDATE_DECL_ALIGNMENT): New hook.
> > * stor-layout.c (do_type_align): Call target hook to update
> > decl alignment.
> > * target.def (update_decl_alignment): New hook.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/95237
> > * gcc.target/i386/pr95237-1.c: New test.
> > * gcc.target/i386/pr95237-2.c: New test.
> > * gcc.target/i386/pr95237-3.c: New test.
> > * gcc.target/i386/pr95237-4.c: New test.
> > * gcc.target/i386/pr95237-5.c: New test.
> > ---
> >  gcc/config/i386/i386.c| 22 ++
> >  gcc/doc/tm.texi   |  5 +
> >  gcc/doc/tm.texi.in|  2 ++
> >  gcc/stor-layout.c |  2 ++
> >  gcc/target.def|  7 +++
> >  gcc/testsuite/gcc.target/i386/pr95237-1.c | 16 
> >  gcc/testsuite/gcc.target/i386/pr95237-2.c | 10 ++
> >  gcc/testsuite/gcc.target/i386/pr95237-3.c | 10 ++
> >  gcc/testsuite/gcc.target/i386/pr95237-4.c | 10 ++
> >  gcc/testsuite/gcc.target/i386/pr95237-5.c | 16 
> >  10 files changed, 100 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-5.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 37aaa49996d..bcd9abd5303 100644
> > --- a/gcc/config/i386/i386.c
> > +++

Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-06-24 Thread Asher Gordon via Gcc-patches

Martin Sebor  writes:

> I have no experience with changing tree nodes but I wouldn't be
> surprised if there were assumptions baked into code that made it
> a non-trivial exercise.
>
> There's also lots of sharing of data in GCC so I'm not sure it
> makes sense for an identifier to have an associated location.
> I imagine two different entities with the same name might share
> the same identifier.  It should be easy to verify.  For example
> with this test case:
>
>   void f (int i) { }
>   void g (int i) { }
>
> and a breakpoint in finish_decl() in c/c-decl.c, the debugger
> will stop twice, once for the i in f and then again for the one
> in g.  They are two different arguments (with different addresses)
> but they both have the same DECL_NAME().

I see. So perhaps this isn't the best way to go about implementing
attribute locations. What do you think would be a better way? Perhaps
using a DECL_MINIMAL for attributes?

Thanks,
Asher

-- 
By necessity, by proclivity, and by delight, we all quote.  In fact, it is as
difficult to appropriate the thoughts of others as it is to invent.
-- R. Emerson
-- Quoted from a fortune cookie program
(whose author claims, "Actually, stealing IS easier.")
[to which I reply, "You think it's easy for me to
misconstrue all these misquotations?!?"  Ed.]
   
I prefer to send and receive mail encrypted. Please send me your
public key, and if you do not have my public key, please let me
know. Thanks.

GPG fingerprint: 38F3 975C D173 4037 B397  8095 D4C9 C4FC 5460 8E68


signature.asc
Description: PGP signature

Re: [PATCH 3/6 ver 3] rs6000, Add vector replace builtin support

2020-06-24 Thread Segher Boessenkool

Hi!

On Thu, Jun 18, 2020 at 03:20:10PM -0700, Carl Love wrote:
> +The programmer is responsible for understanding the endianness issues 
> involved
> +with the first argument and the result.

:-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c
> @@ -0,0 +1,289 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target powerpc_future_hw } */

This now is "power10_hw".  Similarly, "power10_ok" in other testcases.

> +/* { dg-options "-mdejagnu-cpu=future" } */

And this "-mdejagnu-cpu=power10".

Okay for trunk with those changes.  Thanks!


Segher

Re: [PATCH 2/6 ver 3] rs6000 Add vector insert builtin support

2020-06-24 Thread Segher Boessenkool

Hi!

On Thu, Jun 18, 2020 at 03:20:05PM -0700, Carl Love wrote:
>   * config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.
>   * config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
>   VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
>   VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
>   VINSERTVPRHR, VINSERTVPRWR): New builtins.
>   (INSERTL, INSERTH): New builtins.
>   * config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_INSERTL,
>   FUTURE_BUILTIN_VEC_INSERTH):  New Overloaded definitions.

(You have two spaces here, and a stray capital.)

>   (FUTURE_BUILTIN_VINSERTGPRBL, FUTURE_BUILTIN_VINSERTGPRHL,
>   FUTURE_BUILTIN_VINSERTGPRWL, FUTURE_BUILTIN_VINSERTGPRDL,
>   FUTURE_BUILTIN_VINSERTVPRBL, FUTURE_BUILTIN_VINSERTVPRHL,
>   FUTURE_BUILTIN_VINSERTVPRWL): Add case entries.
>   * config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL,
>   UNSPEC_INSERTR.
>   (define_expand): Add vinsertvl_, vinsertvr_,
>   vinsertgl_, vinsertgr_, mode is VI2.
>   (define_ins): vinsertvl_internal_, vinsertvr_internal_,
>   vinsertgl_internal_, vinsertgr_internal_, mode VEC_I.
>   * doc/extend.texi: Add documentation for vec_insertl, vec_inserth.

Okay for trunk.  Thanks!


Segher

Re: [PATCH 1/6 ver 3] rs6000, Update support for vec_extract

2020-06-24 Thread Segher Boessenkool

Hi!

On Thu, Jun 18, 2020 at 03:20:01PM -0700, Carl Love wrote:
> 2020-06-18  Carl Love  
> 
>   * config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
>   (vextractl, vextractr)
>   (vextractl_internal, vextractr_internal)

Please say what the iterator for the  is?  Just adding " for VI2"
is plenty.

>   (VI2): Move to ...
>   * config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
>   (vextractl, vextractr)
>   (vextractl_internal, vextractr_internal)
>   (VI2):  ..here.
>   * gcc/doc/extend.texi: Update documentation for vec_extractl.
>   Replace builtin name vec_extractr with vec_extracth.  Update description
>   of vec_extracth.

Okay for trunk.  Thanks!


Segher

Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-24 Thread Segher Boessenkool

Hi!

On Tue, Jun 23, 2020 at 01:20:53PM +0100, Richard Sandiford wrote:
> SVE supports integer division btw. :-)

So does Power (ISA 3.1, power10).

> In summary, I'm not saying we should never define the inactive values
> to be zero.  I just think that we should leave it until it matters.
> And I don't think it does/should matter for the current patch series.

I am perfectly happy with that.  Thanks for looking at it!

> IFN_MASK_LOAD has been around for quite a long time now and we've never
> had to define the values of inactive lanes there.

Yeah, but typically the insns that consume the values loaded will use
the same masks again, so that may not be such a strong point.

Segher

Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-06-24 Thread Martin Sebor via Gcc-patches


On 6/24/20 2:15 PM, Asher Gordon wrote:

Hi Martin,

Asher Gordon  writes:


Martin Sebor  writes:


Unfortunately, attributes don't have locations (yet).


Hmm, well maybe I could implement that. I'm not very familiar with the
GCC source (this is my first patch), but I'll see if I can figure it
out. It would probably be useful for other warnings/errors too.


Well, I've been trying to implement source locations for
IDENTIFIER_NODEs. But I ran into some very strange problems. If I add a
'location_t' for 'struct tree_identifier' like this:


everything works fine. However, if I then try to use the 'location_t' I
just added, like this:


then lots of ICEs occur. Or, if I don't use IDENTIFIER_SOURCE_LOCATION,
but add 'locus' before 'id' instead of after it, like this:


then, again, lots of ICEs occur.

My best guess for why the ICEs occur, is that perhaps there is a
'struct' with a similar structure to 'struct tree_identifier', and
pointers to these 'struct's are being casted to each other. Something
like this, sort of:


Do you think something like that is possible? And if so, where might it
occur? Or maybe the wrong member of 'union tree_node' is being used
somewhere? But that seems unlikely, since I configured with
--enable-checking...


I have no experience with changing tree nodes but I wouldn't be
surprised if there were assumptions baked into code that made it
a non-trivial exercise.

There's also lots of sharing of data in GCC so I'm not sure it
makes sense for an identifier to have an associated location.
I imagine two different entities with the same name might share
the same identifier.  It should be easy to verify.  For example
with this test case:

  void f (int i) { }
  void g (int i) { }

and a breakpoint in finish_decl() in c/c-decl.c, the debugger
will stop twice, once for the i in f and then again for the one
in g.  They are two different arguments (with different addresses)
but they both have the same DECL_NAME().

Martin



Thanks,
Asher

[PATCH] c++: Improve checking of decls with trailing return type [PR95820]

2020-06-24 Thread Marek Polacek via Gcc-patches

This is an ICE-on-invalid but I've been seeing it when reducing
various testcases, so it's more important for me than usually.

splice_late_return_type now checks that if we've seen a late return
type, the function return type was auto.  That's a fair assumption
but grokdeclarator/cdk_function wasn't giving errors for function
pointers and similar.  So we want to perform various checks not only
when funcdecl_p || inner_declarator == NULL.  But only give the
!late_return_type errors when funcdecl_p, to accept e.g.

auto (*fp)() = f;

in C++11.  Here's a diff -w to ease the review:

--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12102,14 +12102,9 @@ grokdeclarator (const cp_declarator *declarator,

/* Handle a late-specified return type.  */
tree late_return_type = declarator->u.function.late_return_type;
-   if (funcdecl_p
-   /* This is the case e.g. for
-  using T = auto () -> int.  */
-   || inner_declarator == NULL)
- {
if (tree auto_node = type_uses_auto (type))
  {
-   if (!late_return_type)
+   if (!late_return_type && funcdecl_p)
  {
if (current_class_type
&& LAMBDA_TYPE_P (current_class_type))
@@ -12201,7 +12196,6 @@ grokdeclarator (const cp_declarator *declarator,
"type specifier", name);
return error_mark_node;
  }
- }
type = splice_late_return_type (type, late_return_type);
if (type == error_mark_node)
  return error_mark_node;

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/95820
* decl.c (grokdeclarator) : Check also
pointers/references/... to functions.

gcc/testsuite/ChangeLog:

PR c++/95820
* g++.dg/cpp1y/auto-fn58.C: New test.
---
 gcc/cp/decl.c  | 166 -
 gcc/testsuite/g++.dg/cpp1y/auto-fn58.C |  13 ++
 2 files changed, 93 insertions(+), 86 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn58.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 3afad5ca805..a9ec328c498 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12102,106 +12102,100 @@ grokdeclarator (const cp_declarator *declarator,
 
/* Handle a late-specified return type.  */
tree late_return_type = declarator->u.function.late_return_type;
-   if (funcdecl_p
-   /* This is the case e.g. for
-  using T = auto () -> int.  */
-   || inner_declarator == NULL)
+   if (tree auto_node = type_uses_auto (type))
  {
-   if (tree auto_node = type_uses_auto (type))
+   if (!late_return_type && funcdecl_p)
  {
-   if (!late_return_type)
+   if (current_class_type
+   && LAMBDA_TYPE_P (current_class_type))
+ /* OK for C++11 lambdas.  */;
+   else if (cxx_dialect < cxx14)
  {
-   if (current_class_type
-   && LAMBDA_TYPE_P (current_class_type))
- /* OK for C++11 lambdas.  */;
-   else if (cxx_dialect < cxx14)
- {
-   error_at (typespec_loc, "%qs function uses "
- "% type specifier without "
- "trailing return type", name);
-   inform (typespec_loc,
-   "deduced return type only available "
-   "with %<-std=c++14%> or %<-std=gnu++14%>");
- }
-   else if (virtualp)
- {
-   error_at (typespec_loc, "virtual function "
- "cannot have deduced return type");
-   virtualp = false;
- }
+   error_at (typespec_loc, "%qs function uses "
+ "% type specifier without "
+ "trailing return type", name);
+   inform (typespec_loc,
+   "deduced return type only available "
+   "with %<-std=c++14%> or %<-std=gnu++14%>");
  }
-   else if (!is_auto (type) && sfk != sfk_conversion)
+   else if (virtualp)
  {
-   error_at (typespec_loc, "%qs function with trailing "
- "return type has %qT as its type rather "
- "than plain %", name, type);
-   return error_mark_node;
+   error_at

[PATCH] PR fortran/95828 - Buffer overflows with SELECT RANK

2020-06-24 Thread Harald Anlauf

Another case of buffer overflow, this time coming in pairs.

Regtested on x86_64-pc-linux-gnu.

OK for master?

Thanks,
Harald


PR fortran/95828 - Buffer overflows with SELECT RANK

With SELECT RANK, name mangling results in long internal symbols that
overflows internal buffers.  Fix that.

gcc/fortran/
PR fortran/95828
* match.c (select_rank_set_tmp): Enlarge internal buffer used in
generating a mangled name.
* resolve.c (resolve_select_rank): Likewise.
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 8063fcad295..b011634792e 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -6496,7 +6496,7 @@ static void
 select_rank_set_tmp (gfc_typespec *ts, int *case_value)
 {
   char name[2 * GFC_MAX_SYMBOL_LEN];
-  char tname[GFC_MAX_SYMBOL_LEN];
+  char tname[GFC_MAX_SYMBOL_LEN + 7];
   gfc_symtree *tmp;
   gfc_symbol *selector = select_type_stack->selector;
   gfc_symbol *sym;
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index c53b312f7ed..cc8676b3e03 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -9638,7 +9638,7 @@ resolve_select_rank (gfc_code *code, gfc_namespace *old_ns)
   gfc_namespace *ns;
   gfc_code *body, *new_st, *tail;
   gfc_case *c;
-  char tname[GFC_MAX_SYMBOL_LEN];
+  char tname[GFC_MAX_SYMBOL_LEN + 7];
   char name[2 * GFC_MAX_SYMBOL_LEN];
   gfc_symtree *st;
   gfc_expr *selector_expr = NULL;
diff --git a/gcc/testsuite/gfortran.dg/pr95828.f90 b/gcc/testsuite/gfortran.dg/pr95828.f90
new file mode 100644
index 000..e85b2f11869
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95828.f90
@@ -0,0 +1,21 @@
+! { dg-do compile }
+! { dg-options "-fsecond-underscore" }
+! PR fortran/95828 - ICE in resolve_select_rank, at fortran/resolve.c:9774
+
+module m2345678901234567890123456789012345678901234567890123456789_123
+  type t2345678901234567890123456789012345678901234567890123456789_123
+  end type
+contains
+  subroutine s2345678901234567890123456789012345678901234567890123456789_123 &
+(x2345678901234567890123456789012345678901234567890123456789_123)
+type(t2345678901234567890123456789012345678901234567890123456789_123) :: &
+ x2345678901234567890123456789012345678901234567890123456789_123(..)
+
+select rank (y2345678901234567890123456789012345678901234567890123456789_123 &
+  => x2345678901234567890123456789012345678901234567890123456789_123)
+rank (2)
+rank (3)
+rank default
+end select
+  end
+end

Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-06-24 Thread Asher Gordon via Gcc-patches

Hi Martin,

Asher Gordon  writes:

> Martin Sebor  writes:
>
>> Unfortunately, attributes don't have locations (yet).
>
> Hmm, well maybe I could implement that. I'm not very familiar with the
> GCC source (this is my first patch), but I'll see if I can figure it
> out. It would probably be useful for other warnings/errors too.

Well, I've been trying to implement source locations for
IDENTIFIER_NODEs. But I ran into some very strange problems. If I add a
'location_t' for 'struct tree_identifier' like this:
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 8c5a2e3c404..b3c46d3c0d3 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1423,6 +1423,7 @@ struct GTY(()) tree_poly_int_cst {
 struct GTY(()) tree_identifier {
   struct tree_common common;
   struct ht_identifier id;
+  location_t locus;
 };
 
 struct GTY(()) tree_list {
everything works fine. However, if I then try to use the 'location_t' I
just added, like this:
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index b1cef2345f4..d2d86c5f78a 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -465,6 +465,7 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
 
 case CPP_NAME:
   *value = HT_IDENT_TO_GCC_IDENT (HT_NODE (tok->val.node.node));
+  IDENTIFIER_SOURCE_LOCATION (*value) = *loc;
   break;
 
 case CPP_NUMBER:
diff --git a/gcc/tree.h b/gcc/tree.h
index a74872f5f3e..1e5a4fba0ed 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1069,6 +1069,16 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define IDENTIFIER_HASH_VALUE(NODE) \
   (IDENTIFIER_NODE_CHECK (NODE)->identifier.id.hash_value)
 
+/* The source location in which the identifier appears.  */
+#define IDENTIFIER_SOURCE_LOCATION(NODE) \
+  (IDENTIFIER_NODE_CHECK (NODE)->identifier.locus)
+#define IDENTIFIER_SOURCE_FILE(NODE)\
+  LOCATION_FILE (IDENTIFIER_SOURCE_LOCATION (NODE))
+#define IDENTIFIER_SOURCE_LINE(NODE)\
+  LOCATION_LINE (IDENTIFIER_SOURCE_LOCATION (NODE))
+#define IDENTIFIER_SOURCE_COLUMN(NODE)\
+  LOCATION_COLUMN (IDENTIFIER_SOURCE_LOCATION (NODE))
+
 /* Translate a hash table identifier pointer to a tree_identifier
pointer, and vice versa.  */
 
then lots of ICEs occur. Or, if I don't use IDENTIFIER_SOURCE_LOCATION,
but add 'locus' before 'id' instead of after it, like this:
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 8c5a2e3c404..b3c46d3c0d3 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1423,6 +1423,7 @@ struct GTY(()) tree_poly_int_cst {
 struct GTY(()) tree_identifier {
   struct tree_common common;
+  location_t locus;
   struct ht_identifier id;
 };
 
 struct GTY(()) tree_list {
then, again, lots of ICEs occur.

My best guess for why the ICEs occur, is that perhaps there is a
'struct' with a similar structure to 'struct tree_identifier', and
pointers to these 'struct's are being casted to each other. Something
like this, sort of:
/* This is analogous to the unknown structure, to and from which
   'struct tree_identifier' is being casted. */
struct foo {
  int i;
  float f;
};

/* This is analogous to the modified 'struct tree_identifier' with the
   'location_t' added at the end. */
struct end {
  int i;
  float f;
  const char *s; /* Analogous to 'locus' in 'struct tree_identifier' */
};

/* This is analogous to the modified 'struct tree_identifier' with the
   'location_t' added in the middle. */
struct middle {
  int i;
  const char *s; /* Analogous to 'locus' in 'struct tree_identifier' */
  float f;
};

int
main (void)
{
  struct foo foo, *foo_p = 
  struct end *end_p;
  struct middle *middle_p;

  /* This works, as long as 's' is not used. */
  end_p = (struct end *) foo_p;
  /* But as soon as 's' is used, it invokes UB. This is analogous to
 the IDENTIFIER_SOURCE_LOCATION in c_lex_with_flags. */
  end_p->s = "hello";

  /* This works, as long as neither 'f' (analogous to 'id') nor 's' is
 used. */
  middle_p = (struct middle *) foo_p;
  /* But, doubtless, 'f' ('id') will be used somewhere... */
  middle_p->f = 42.0;

  return 0; /* if we're lucky... */
}
Do you think something like that is possible? And if so, where might it
occur? Or maybe the wrong member of 'union tree_node' is being used
somewhere? But that seems unlikely, since I configured with
--enable-checking...

Thanks,
Asher

-- 
One picture is worth 128K words.
   
I prefer to send and receive mail encrypted. Please send me your
public key, and if you do not have my public key, please let me
know. Thanks.

GPG fingerprint: 38F3 975C D173 4037 B397  8095 D4C9 C4FC 5460 8E68


signature.asc
Description: PGP signature

Re: [PATCH] PR 83938 Reduce memory consumption in stable_sort/inplace_merge

2020-06-24 Thread François Dumont via Gcc-patches


On 24/06/20 7:39 pm, Jonathan Wakely wrote:

On 11/06/20 08:32 +0200, FranÃ§ois Dumont via Libstdc++ wrote:

As we are on patching algos we still have this old one.

Â Â Â  From the original patch I only kept the memory optimization 
part as the new performance test was not showing good result for the 
other part to change pivot value. I also kept the small change in 
get_temporary_buffer even if I don't have strong feeling about it, it 
just make sure that we'll try to allocate 1 element as a last chance 
allocation.


Â Â Â  Note that there is still place for an improvement. If we miss 
memory on the heap we then use a recursive implementation which then 
rely on stack memory. I would be surprise that a system which miss 
heap memory would have no problem to allocate about the same on the 
stack so we will surely end up in a stack overflow. I still have this 
on my todo even if I already made several tries with no satisfying 
result in terms of performance.


Â Â Â  Tested under Linux x86_64.

Commit message:

Â Â Â  libstdc++: Limit memory allocation in 
stable_sort/inplace_merge (PR 83938)


Â Â Â  Reduce memory consumption in stable_sort/inplace_merge to what 
is used.


Â Â Â  Co-authored-by: FranÃ§ois DumontÂ 

Â Â Â  libstdc++-v3/ChangeLog:

Â Â Â  2020-06-11Â  John ChangÂ  
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  FranÃ§ois DumontÂ 

Â Â Â Â Â Â Â Â Â Â Â  PR libstdc++/83938
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_tempbuf.h 
(get_temporary_buffer): Change __len

Â Â Â Â Â Â Â Â Â Â Â  computation in the loop.
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_algo.h:
Â Â Â Â Â Â Â Â Â Â Â  (__inplace_merge): Take temporary buffer 
length from smallest range.

Â Â Â Â Â Â Â Â Â Â Â  (__stable_sort): Limit temporary buffer length.
Â Â Â Â Â Â Â Â Â Â Â  * testsuite/25_algorithms/inplace_merge/1.cc 
(test03): Test different

Â Â Â Â Â Â Â Â Â Â Â  pivot positions.
Â Â Â Â Â Â Â Â Â Â Â  * 
testsuite/performance/25_algorithms/stable_sort.cc: Test stable_sort

Â Â Â Â Â Â Â Â Â Â Â  under different heap memory conditions.
Â Â Â Â Â Â Â Â Â Â Â  * 
testsuite/performance/25_algorithms/inplace_merge.cc: New.


Ok to commit ?


I'm very nervous about changes to sort algos that aren't absolutely
necessary for correctness. It needs careful review and lots of
testing. Please be patient.


Sure, just note that there is no change to the algo logic in any way. It 
is only to limit the temporary buffer allocation to what is being used.


This kind of situation illustrates also why PR management would be nice. 
I wouldn't wonder if the patch hasn't been forgotten. I know that you 
are for it, I hope you'll make your point one day.


Thanks

[pushed] coroutines, testsuite: Update tests for get-return-object errors.

2020-06-24 Thread Iain Sandoe

Hi,

Just an improvement to test-coverage,
tested on x86_64-darwin,linux powerpc64-linux

applied to master,

thanks
Iain

=

We updated the handling of the errors for cases when the
ramp return cannot be constructed from the user's provided
get-return-object method.  This updates the testcases to
cover this.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/void-gro-non-class-coro.C: Moved to...
* g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C: ...here.
* g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C: New test.
---
 .../coro-bad-gro-00-class-gro-scalar-return.C | 65 +++
 ...coro-bad-gro-01-void-gro-non-class-coro.C} |  2 +-
 2 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C
 rename gcc/testsuite/g++.dg/coroutines/{void-gro-non-class-coro.C => 
coro-bad-gro-01-void-gro-non-class-coro.C} (99%)

diff --git 
a/gcc/testsuite/g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C 
b/gcc/testsuite/g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C
new file mode 100644
index 000..bd9dec6c75f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C
@@ -0,0 +1,65 @@
+// Test handling of the case where we have a class g-r-o and a non-void
+// and non-class-type ramp return.
+
+#include "coro.h"
+
+int g_promise = -1;
+
+struct Thing {
+  double x;
+  Thing () : x(0.0) {}
+  ~Thing () {}
+};
+
+template
+struct std::coroutine_traits {
+struct promise_type {
+promise_type (HandleRef h, T ...args)
+{ h = std::coroutine_handle::from_promise (*this);
+  PRINT ("Created Promise");
+  g_promise = 1;
+}
+   ~promise_type () { PRINT ("Destroyed Promise"); g_promise = 0;}
+Thing get_return_object() { return {}; }
+
+auto initial_suspend() {
+  return std::suspend_always{};
+ }
+auto final_suspend() { return std::suspend_never{}; }
+
+void return_void() {}
+void unhandled_exception() {}
+};
+};
+
+int
+my_coro (std::coroutine_handle<>& h)
+{
+  PRINT ("coro1: about to return");
+  co_return;
+} // { dg-error {'struct Thing' used where a 'int' was expected} }
+
+int main ()
+{
+  std::coroutine_handle<> h;
+  int t = my_coro (h);
+
+  if (h.done())
+{
+  PRINT ("main: apparently was already done...");
+  abort ();
+}
+
+  // initial suspend.
+  h.resume ();
+
+  // The coro should have self-destructed.
+  if (g_promise)
+{
+  PRINT ("main: apparently we did not complete...");
+  abort ();
+}
+
+  PRINT ("main: returning");
+  return t;
+}
diff --git a/gcc/testsuite/g++.dg/coroutines/void-gro-non-class-coro.C 
b/gcc/testsuite/g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C
similarity index 99%
rename from gcc/testsuite/g++.dg/coroutines/void-gro-non-class-coro.C
rename to 
gcc/testsuite/g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C
index 8176c8a10af..c31fcb58b2e 100644
--- a/gcc/testsuite/g++.dg/coroutines/void-gro-non-class-coro.C
+++ b/gcc/testsuite/g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C
@@ -55,5 +55,5 @@ int main ()
 }
 
   PRINT ("main: returning");
-  return 0;
+  return t;
 }
-- 
2.24.1

Re: [PATCHv8] c++:Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic[PR96752]

2020-06-24 Thread Jason Merrill via Gcc-patches


I'm applying your patch with some tweaks.

First, I shortened your subject line; we try to keep them under 50 chars 
so they look good in git log --oneline.  Going over isn't a big deal, 
but that's the goal.


On 6/23/20 3:47 PM, Nicholas Krause wrote:

From: Nicholas Krause 

This fixes the PR95672 by adding the missing TYPE_PACK_EXPANSION case in
cxx_incomplete_type_diagnostic in order to avoid ICES on diagnosing
incomplete template pack expansion cases. In v2, add the missing required
test case for all new patches. v3 Fixes both the test case to compile in
C++11 mode and the message to print out only the type. v4 fixes the testcase
to only target C++11. v5 and v6 fix the test case properly. v7 fixes running
the testcase. v8 fixes grammar errors. Tested on  powerpc64le-unknown-linux-gnu.

gcc/cp/ChangeLog:



A PR number goes before the change description, like

PR c++/95672


* typeck2.c (cxx_incomplete_type_diagnostic):
  Add missing TYPE_EXPANSION_PACK check for
  diagnosing incomplete types in
  cxx_incomplete_type_diagnostic to fix
  c++/PR95672.


Not here.  Also, your later lines should line up with the *, not the t.


gcc/testsuite/ChangeLog:

* g++.dg/template/PR95672.C: New test.


I renamed the testcase to use lowercase "pr" to match other tests.


Signed-off-by: Nicholas Krause 
---
  gcc/cp/typeck2.c| 5 +
  gcc/testsuite/g++.dg/template/PR95672.C | 2 ++
  2 files changed, 7 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/template/PR95672.C

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5fd3b82fa89..dac135a2e11 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -552,6 +552,11 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree 
value,
   TYPE_NAME (type));
break;
  
+case TYPE_PACK_EXPANSION:

+  emit_diagnostic (diag_kind, loc, 0,
+  "invalid use of pack expansion %qT", type);
+  break;
+
  case TYPENAME_TYPE:
  case DECLTYPE_TYPE:
emit_diagnostic (diag_kind, loc, 0,
diff --git a/gcc/testsuite/g++.dg/template/PR95672.C 
b/gcc/testsuite/g++.dg/template/PR95672.C
new file mode 100644
index 000..104e125287f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/PR95672.C
@@ -0,0 +1,2 @@


And I added the PR number within the file as well.


+// { dg-do compile { target c++14 } }
+struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack 
expansion" }


So here's what I'm applying.  Thanks for the patch!

Jason


commit a11dd6a7605fae2c2a9941ce3e749349a468a73c
Author: Nicholas Krause 
Date:   Tue Jun 23 15:47:37 2020 -0400

c++: Handle bad pack expansion in base list. [PR96752]

This fixes PR95672 by adding the missing TYPE_PACK_EXPANSION case in
cxx_incomplete_type_diagnostic in order to avoid ICEs on diagnosing
incomplete template pack expansion cases.

Tested on powerpc64le-unknown-linux-gnu.

gcc/cp/ChangeLog:

PR c++/95672
* typeck2.c (cxx_incomplete_type_diagnostic): Add missing
TYPE_EXPANSION_PACK check for diagnosing incomplete types in
cxx_incomplete_type_diagnostic.

gcc/testsuite/ChangeLog:

PR c++/95672
* g++.dg/template/pr95672.C: New test.

Signed-off-by: Nicholas Krause 

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5fd3b82fa89..dac135a2e11 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -552,6 +552,11 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree value,
 		   TYPE_NAME (type));
   break;
 
+case TYPE_PACK_EXPANSION:
+  emit_diagnostic (diag_kind, loc, 0,
+		   "invalid use of pack expansion %qT", type);
+  break;
+
 case TYPENAME_TYPE:
 case DECLTYPE_TYPE:
   emit_diagnostic (diag_kind, loc, 0,
diff --git a/gcc/testsuite/g++.dg/template/pr95672.C b/gcc/testsuite/g++.dg/template/pr95672.C
new file mode 100644
index 000..c752b4a2c08
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/pr95672.C
@@ -0,0 +1,3 @@
+// PR c++/95672
+// { dg-do compile { target c++14 } }
+struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack expansion" }

Re: Fortran: Fix character-kind=4 substring resolution (PR95837)

2020-06-24 Thread Thomas Koenig via Gcc-patches


Hi Tobias,

could you review the second patch instead? I have sent the wrong patch 
(early draft) and corrected it half an hour later!


Sorry, I missed that.  Here's the review of the real patch :-)

So, the first part is

+  if (ts)
+e->ts.kind = ts->kind;

Ok, I unerstand that - ts has been set earlier for a component.

But this part

+  else if (e->ts.type != BT_CHARACTER)
+e->ts.kind = gfc_default_character_kind;

I do not quite understand.  How can the type of an expression involving
a substring not be BT_CHARACTER when gfc_resolve_substring_charlen is
called?  Or is it BT_UNKNOWN, and a check for that might be better?
And, if it is indeed BT_UNKNOWN, how do we know it isn't a
CHARACTER(KIND=4)?

Best Regards

Thomas

[pushed] c++: Fix ICE with using and virtual function [PR95719]

2020-06-24 Thread Jason Merrill via Gcc-patches

Looking in the vtable for cand->conversion_path doesn't work because 
it's the base where we found the function, not the base where the 
function is defined.  If those are different, look farther.  Applying to 
trunk and 10.


The second patch uses that same base lookup for the actual conversion as 
well as finding the final overrider.  Applying only to trunk.


Tested x86_64-pc-linux-gnu.
commit 2c8346f6b30039be0ea898ce94a3bef85a8783cc
Author: Jason Merrill 
Date:   Tue Jun 23 21:25:21 2020 -0400

c++: Fix ICE with using and virtual function [PR95719]

conversion_path points to the base where we found the using-declaration, not
where the function is actually a member; look up the actual base.  And then
maybe look back to the derived class if the base is primary.

gcc/cp/ChangeLog:

PR c++/95719
* call.c (build_over_call): Look up the overrider in base_binfo.
* class.c (lookup_vfn_in_binfo): Look through BINFO_PRIMARY_P.

gcc/testsuite/ChangeLog:

PR c++/95719
* g++.dg/tree-ssa/final4.C: New test.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 2b39a3700fc..fe68fda1364 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8694,7 +8694,11 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
   if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0
 	  && resolves_to_fixed_type_p (arg))
 	{
-	  fn = lookup_vfn_in_binfo (DECL_VINDEX (fn), cand->conversion_path);
+	  tree binfo = cand->conversion_path;
+	  if (BINFO_TYPE (binfo) != DECL_CONTEXT (fn))
+	binfo = lookup_base (binfo, DECL_CONTEXT (fn), ba_unique,
+ NULL, complain);
+	  fn = lookup_vfn_in_binfo (DECL_VINDEX (fn), binfo);
 	  flags |= LOOKUP_NONVIRTUAL;
 	}
 
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 629d27da894..94a95854e25 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -2455,6 +2455,10 @@ lookup_vfn_in_binfo (tree idx, tree binfo)
   int ix = tree_to_shwi (idx);
   if (TARGET_VTABLE_USES_DESCRIPTORS)
 ix /= MAX (TARGET_VTABLE_USES_DESCRIPTORS, 1);
+  while (BINFO_PRIMARY_P (binfo))
+/* BINFO_VIRTUALS in a primary base isn't accurate, find the derived
+   class that actually owns the vtable.  */
+binfo = BINFO_INHERITANCE_CHAIN (binfo);
   tree virtuals = BINFO_VIRTUALS (binfo);
   return TREE_VALUE (chain_index (ix, virtuals));
 }
diff --git a/gcc/testsuite/g++.dg/tree-ssa/final4.C b/gcc/testsuite/g++.dg/tree-ssa/final4.C
new file mode 100644
index 000..387d4bb26c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/final4.C
@@ -0,0 +1,12 @@
+// PR c++/95719
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-final { scan-tree-dump "S2::f" "gimple" } }
+
+struct S1 { virtual ~S1(); };
+struct S2 {
+virtual ~S2();
+virtual void f();
+};
+struct S3 final: S1, S2 { using S2::f; };
+void g(S3 & s) { s.f(); }
commit 3d9fe61f07abb2653412259a974811917149b3b8
Author: Jason Merrill 
Date:   Wed Jun 24 01:49:06 2020 -0400

c++: Simplify build_over_call a bit.

It occurred to me that if we're looking up the defining base within the
conversion_path binfo, we could use the result for the conversion as well
instead of doing two separate conversions.

gcc/cp/ChangeLog:

* call.c (build_over_call): Only call build_base_path once.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index fe68fda1364..d8923be1d68 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8658,13 +8658,10 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
   /* Bypass access control for 'this' parameter.  */
   else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
 {
-  tree parmtype = TREE_VALUE (parm);
   tree arg = build_this (first_arg != NULL_TREE
 			 ? first_arg
 			 : (*args)[arg_index]);
   tree argtype = TREE_TYPE (arg);
-  tree converted_arg;
-  tree base_binfo;
 
   if (arg == error_mark_node)
 	return error_mark_node;
@@ -8683,38 +8680,20 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 	return error_mark_node;
 	}
 
+  /* The class where FN is defined.  */
+  tree ctx = DECL_CONTEXT (fn);
+
   /* See if the function member or the whole class type is declared
 	 final and the call can be devirtualized.  */
-  if (DECL_FINAL_P (fn)
-	  || CLASSTYPE_FINAL (TYPE_METHOD_BASETYPE (TREE_TYPE (fn
+  if (DECL_FINAL_P (fn) || CLASSTYPE_FINAL (ctx))
 	flags |= LOOKUP_NONVIRTUAL;
 
-  /* If we know the dynamic type of the object, look up the final overrider
-	 in the BINFO.  */
-  if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0
-	  && resolves_to_fixed_type_p (arg))
-	{
-	  tree binfo = cand->conversion_path;
-	  if (BINFO_TYPE (binfo) != DECL_CONTEXT (fn))
-	binfo = lookup_base (binfo, DECL_CONTEXT (fn), ba_unique,
- NULL, complain);
-	  fn = lookup_vfn_in_binfo

Re: [PATCH] c++: Fix CTAD for aggregates in template [PR95568]

2020-06-24 Thread Jason Merrill via Gcc-patches


On 6/23/20 6:58 PM, Marek Polacek wrote:

95568 complains that CTAD for aggregates doesn't work within
requires-clause and it turned out that it doesn't work when we try
the deduction in a template.  The reason is that maybe_aggr_guide
creates a guide that can look like this

   template X(decltype (X::x))-> X


Then that's the bug; there's no reason the guide should be different 
just because we're trying to do the deduction in a template.


I'm not sure why I used finish_decltype_type there.

Jason

Re: [PATCH 0/3, v2] rs6000: Add support for Matrix-Multiply Assist (MMA) built-in functions.

2020-06-24 Thread Segher Boessenkool

Hi!

On Wed, Jun 24, 2020 at 02:28:00PM -0500, Peter Bergner via Gcc-patches wrote:
> On 6/18/20 3:42 PM, Peter Bergner wrote:
> > POWER ISA 3.1 added new Matrix-Multiply Assist (MMA) instructions.
> > The following patch set adds support for generating these instructions
> > through built-in functions which are enabled with the -mmma option.
> > 
> > The patch1 and patch1+patch2+patch3 have been bootstrapped and regtested on
> > powerpc64le-linux with no regressions.  In addition, patch1+patch2+patch3
> > has been bootstrapped and regtested on powerpc64-linux (BE), also without
> > regressions.  I'll note that I split the testsuite changes into their own
> > patch for review purposes, but I plan on committing patch2 and patch3 
> > together.
> > 
> > Changes since v1:
> >   Patch 1/3:
> > - Modified verbiage in mma.md per Will's suggestion.
> > - Modified rs6000_split_multireg_move to correctly handle BE PXImode
> >   and POImode moves.
> >   Patch 2/3:
> > - Updated ChangeLog entry per Segher's suggestion.
> > - Updated doc/extend.texi with correct built-in names for
> >   __builtin_vsx_xvcvspbf16 and __builtin_vsx_xvcvbf16sp.
> >   Patch 3/3:
> > - No changes.
> 
> The committed patches don't seem to have caused any bootstrap issues on trunk
> and are pretty independent of the rest of the rs6000 backend, so I'd like
> permission to back port the two commits to GCC 10.  Patch 2 makes used of
> the u8bit_cint_operand predicate which Kelvin added, but that isn't in GCC 10,
> so I need to back port that change too.
> 
> The back ports of the MMA patches/commits was straight forward and I'm
> currently bootstrapping/regtesting the backports on both powerpc64le-linux
> and powerpc64-linux.  Is this (including the hunk below) ok for GCC 10
> release branch assuming the tests come back clean?

Yes, all are okay for 10 as well (incl. Kelvin's backport).  Thanks!

> rs6000: Backport u8bit_cint_operand predicate.

(No dot at the end of the subject please.)


Segher

[PATCH 1/2] simplify-rtx: Parity of parity is parity

2020-06-24 Thread Segher Boessenkool

From: Roger Sayle 

2020-06-24  Roger Sayle  

* simplify-rtx.c (simplify_unary_operation_1): Simplify
(parity (parity x)) as (parity x), i.e. PARITY is idempotent.
---
 gcc/simplify-rtx.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 28c2dc6..65008ea 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1391,6 +1391,10 @@ simplify_unary_operation_1 (enum rtx_code code, 
machine_mode mode, rtx op)
   GET_MODE (XEXP (op, 0)));
  break;
 
+   case PARITY:
+ /* (parity (parity x)) -> parity (x).  */
+ return op;
+
default:
  break;
}
-- 
1.8.3.1

[PATCH 2/2] simplify-rtx: Simplify rotates by zero

2020-06-24 Thread Segher Boessenkool

From: Roger Sayle 

2020-06-24  Roger Sayle  
Segher Boessenkool  

* simplify-rtx.c (simplify_unary_operation_1): Simplify rotates by 0.
---
 gcc/simplify-rtx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 65008ea..3e913b5 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3645,6 +3645,8 @@ simplify_binary_operation_1 (enum rtx_code code, 
machine_mode mode,
 
 case ROTATERT:
 case ROTATE:
+  if (trueop1 == CONST0_RTX (mode))
+   return op0;
   /* Canonicalize rotates by constant amount.  If op1 is bitsize / 2,
 prefer left rotation, if op1 is from bitsize / 2 + 1 to
 bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1
-- 
1.8.3.1

Re: [PATCH 0/3, v2] rs6000: Add support for Matrix-Multiply Assist (MMA) built-in functions.

2020-06-24 Thread Peter Bergner via Gcc-patches

On 6/18/20 3:42 PM, Peter Bergner wrote:
> POWER ISA 3.1 added new Matrix-Multiply Assist (MMA) instructions.
> The following patch set adds support for generating these instructions
> through built-in functions which are enabled with the -mmma option.
> 
> The patch1 and patch1+patch2+patch3 have been bootstrapped and regtested on
> powerpc64le-linux with no regressions.  In addition, patch1+patch2+patch3
> has been bootstrapped and regtested on powerpc64-linux (BE), also without
> regressions.  I'll note that I split the testsuite changes into their own
> patch for review purposes, but I plan on committing patch2 and patch3 
> together.
> 
> Changes since v1:
>   Patch 1/3:
> - Modified verbiage in mma.md per Will's suggestion.
> - Modified rs6000_split_multireg_move to correctly handle BE PXImode
>   and POImode moves.
>   Patch 2/3:
> - Updated ChangeLog entry per Segher's suggestion.
> - Updated doc/extend.texi with correct built-in names for
>   __builtin_vsx_xvcvspbf16 and __builtin_vsx_xvcvbf16sp.
>   Patch 3/3:
> - No changes.

The committed patches don't seem to have caused any bootstrap issues on trunk
and are pretty independent of the rest of the rs6000 backend, so I'd like
permission to back port the two commits to GCC 10.  Patch 2 makes used of
the u8bit_cint_operand predicate which Kelvin added, but that isn't in GCC 10,
so I need to back port that change too.

The back ports of the MMA patches/commits was straight forward and I'm
currently bootstrapping/regtesting the backports on both powerpc64le-linux
and powerpc64-linux.  Is this (including the hunk below) ok for GCC 10
release branch assuming the tests come back clean?

Peter

rs6000: Backport u8bit_cint_operand predicate.

2020-05-11  Kelvin Nilsen  
Backported from master
* config/rs6000/predicates.md (u8bit_cint_operand): New predicate.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index bf04e4d431f..529c2beb773 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -234,6 +234,11 @@ (define_predicate "u7bit_cint_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 127)")))

+;; Return 1 if op is an unsigned 8-bit constant integer.
+(define_predicate "u8bit_cint_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
 ;; Return 1 if op is a signed 8-bit constant integer.
 ;; Integer multiplication complete more quickly
 (define_predicate "s8bit_cint_operand"

Re: Fortran: Fix character-kind=4 substring resolution (PR95837)

2020-06-24 Thread Tobias Burnus


Hi Thomas,

could you review the second patch instead? I have sent the wrong patch 
(early draft) and corrected it half an hour later!


https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548779.html

Tobias

On 6/24/20 8:01 PM, Thomas Koenig via Fortran wrote:

Hi Tobias,


OK for the trunk?


I just checked, and this gets a segfault for

program main
  character (len=3), parameter :: x = 'abc'
  print *, x(2:2)
end program main


+  if (ts)
+    e->kind = ts->kind;
+  else if (e->symtree->n.sym->ts.type == BT_CHARACTER)
+    e->kind = ts->kind;

There are two potential problems: e->symtree could be NULL,
and (if the second assignment is reached) ts is NULL.
You may have meant

e->kind = e->symtree->n.sym->ts.kind;

Could you correct that, and resubmit?

Thanks for working on this!

Best regards

Thomas

Re: Fortran: Fix character-kind=4 substring resolution (PR95837)

2020-06-24 Thread Thomas Koenig via Gcc-patches


Hi Tobias,


OK for the trunk?


I just checked, and this gets a segfault for

program main
  character (len=3), parameter :: x = 'abc'
  print *, x(2:2)
end program main


+  if (ts)
+e->kind = ts->kind;
+  else if (e->symtree->n.sym->ts.type == BT_CHARACTER)
+e->kind = ts->kind;

There are two potential problems: e->symtree could be NULL,
and (if the second assignment is reached) ts is NULL.
You may have meant

e->kind = e->symtree->n.sym->ts.kind;

Could you correct that, and resubmit?

Thanks for working on this!

Best regards

Thomas

Re: [PATCH] PR 83938 Reduce memory consumption in stable_sort/inplace_merge

2020-06-24 Thread Jonathan Wakely via Gcc-patches


On 11/06/20 08:32 +0200, FranÃ§ois Dumont via Libstdc++ wrote:

As we are on patching algos we still have this old one.

Â Â Â  From the original patch I only kept the memory optimization part 
as the new performance test was not showing good result for the other 
part to change pivot value. I also kept the small change in 
get_temporary_buffer even if I don't have strong feeling about it, it 
just make sure that we'll try to allocate 1 element as a last chance 
allocation.


Â Â Â  Note that there is still place for an improvement. If we miss 
memory on the heap we then use a recursive implementation which then 
rely on stack memory. I would be surprise that a system which miss 
heap memory would have no problem to allocate about the same on the 
stack so we will surely end up in a stack overflow. I still have this 
on my todo even if I already made several tries with no satisfying 
result in terms of performance.


Â Â Â  Tested under Linux x86_64.

Commit message:

Â Â Â  libstdc++: Limit memory allocation in stable_sort/inplace_merge 
(PR 83938)


Â Â Â  Reduce memory consumption in stable_sort/inplace_merge to what is used.

Â Â Â  Co-authored-by: FranÃ§ois DumontÂ  

Â Â Â  libstdc++-v3/ChangeLog:

Â Â Â  2020-06-11Â  John ChangÂ  
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  FranÃ§ois DumontÂ  

Â Â Â Â Â Â Â Â Â Â Â  PR libstdc++/83938
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_tempbuf.h (get_temporary_buffer): 
Change __len

Â Â Â Â Â Â Â Â Â Â Â  computation in the loop.
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_algo.h:
Â Â Â Â Â Â Â Â Â Â Â  (__inplace_merge): Take temporary buffer length from 
smallest range.

Â Â Â Â Â Â Â Â Â Â Â  (__stable_sort): Limit temporary buffer length.
Â Â Â Â Â Â Â Â Â Â Â  * testsuite/25_algorithms/inplace_merge/1.cc (test03): 
Test different

Â Â Â Â Â Â Â Â Â Â Â  pivot positions.
Â Â Â Â Â Â Â Â Â Â Â  * testsuite/performance/25_algorithms/stable_sort.cc: Test 
stable_sort

Â Â Â Â Â Â Â Â Â Â Â  under different heap memory conditions.
Â Â Â Â Â Â Â Â Â Â Â  * testsuite/performance/25_algorithms/inplace_merge.cc: 
New.

Ok to commit ?


I'm very nervous about changes to sort algos that aren't absolutely
necessary for correctness. It needs careful review and lots of
testing. Please be patient.

Re: [PATCH] powerpc: Restore bootstrap for PPC Darwin.

2020-06-24 Thread Segher Boessenkool

Hi!

On Wed, Jun 24, 2020 at 04:15:53PM +0100, Iain Sandoe wrote:
> Darwin has signed chars and the fields in the insn_data
> struct are const char, which leads to a bootstrap fail with
> 
> "error: comparison of integer expressions of different signedness: 'unsigned 
> int' and 'const char' [-Werror=sign-compare]”
> 
> OK for master?

Of course, pretty much trivial :-)  Thanks Iain!


Segher

[Patch] [OpenMP, Fortran] Add structure/derived-type element mapping

2020-06-24 Thread Tobias Burnus


(OpenMP 5 extends this a lot, but this is about OpenMP 4.5.
It touches code which is also used by OpenACC's attach/detach.)

@OpenACC/Julian: I think the character attach/detach for
deferred-length strings does not work properly with OpenACC;
I did not touch this code – but I think it needs some love.

This code adds support for
  map(dt%comp, dt%comp2)
where "comp" can be either a nonpointer, nonallocatable element
scalar, array or array section. Or it can be a pointer - where
character strings are one complication as for deferred-length
ones, the length is stored in an extra DT component.

While testing, I encountered two bugs, one relating to kind=4
character string (patch pending review; PR95837)
not part of testcase) and one related to deferred-length
character strings (commented in the test case; larger issue;
PR95868).

Like always, some more tests/testcase probably would not harm.

Regarding the patch:

(a) openmp.c:
This enabled component matching for 'map(' and
piggybacks on the OpenACC code for the checks. I think that
some additional checks might be useful – and I hope that no
check is too strict.
The "depend" clause was excluded as one otherwise gets a
testsuite fails due to the is-contiguous check.

(b) trans-openmp.c:
- gfc_trans_omp_clauses now has a "bool openacc".
- GOMP_MAP_ATTACH_DETACH is replaced by GOMP_MAP_ALWAYS_POINTER
- For arrays, the mapping of the descriptor is squeezed before
  "node" which contains the data transfer (var.desc.data mapping
  followed by the always_pointer for the mapping).
  In this array case, the latter gets a pointless cast in order
  to prevent that for both var.desc and var.desc.data memory gets
  allocated in the struct.
  → That's also the reason the big switch table is moved up.
- For deferred-length strings, the string-length is in an extra
  struct element (derived-type component) and will be mapped in
  addition.
- Bugs in the previous version:
  * gfc_trans_omp_array_section for "element == true", the size
of a pointer instead of the size of the element was mapped.
  * For string variables (with constant length) the kind=4 was
not properly handled.
  * Allocatable scalars were not handled – missing second clause
for the always_pointer (and attach_detach, I assume)

Comments, remarks, suggestions?
Otherwise: OK for the trunk?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
[OpenMP, Fortran] Add structure/derived-type element mapping

gcc/fortran/ChangeLog:

	* openmp.c (gfc_match_omp_clauses): Match also derived-type
	component refs in OMP_CLAUSE_MAP.
	(resolve_omp_clauses): Resolve those.
	* trans-openmp.c (gfc_trans_omp_array_section, gfc_trans_omp_clauses):
	Handle OpenMP structure-element mapping.
	(gfc_trans_oacc_construct, gfc_trans_oacc_executable_directive,
	(gfc_trans_oacc_combined_directive, gfc_trans_oacc_declare): Update
	add openacc=true in gfc_trans_omp_clauses call.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/finalize-1.f: Update dump scan pattern.
	* gfortran.dg/gomp/map-1.f90: Update dg-error.
	* gfortran.dg/gomp/map-2.f90: New test.


libgomp/ChangeLog:

	* testsuite/libgomp.fortran/struct-elem-map-1.f90: New test.

 gcc/fortran/openmp.c   |   5 +-
 gcc/fortran/trans-openmp.c | 332 +++--
 gcc/testsuite/gfortran.dg/goacc/finalize-1.f   |   4 +-
 gcc/testsuite/gfortran.dg/gomp/map-1.f90   |  35 +--
 gcc/testsuite/gfortran.dg/gomp/map-2.f90   |   6 +
 .../libgomp.fortran/struct-elem-map-1.f90  | 331 
 6 files changed, 595 insertions(+), 118 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index e681903c7c2..7de2f6e1b1d 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1464,7 +1464,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  head = NULL;
 	  if (gfc_match_omp_variable_list ("", >lists[OMP_LIST_MAP],
 	   false, NULL, ,
-	   true) == MATCH_YES)
+	   true, true) == MATCH_YES)
 		{
 		  gfc_omp_namelist *n;
 		  for (n = *head; n; n = n->next)
@@ -4553,7 +4553,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 
 		/* Look through component refs to find last array
 		   reference.  */
-		if (openacc && resolved)
+		if (resolved)
 		  {
 			/* The "!$acc cache" directive allows rectangular
 			   subarrays to be specified, with some restrictions
@@ -4563,6 +4563,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 			   arr(-n:n,-n:n) could be contiguous even if it looks
 			   like it may not be.  */
 			if (list != OMP_LIST_CACHE
+			&& list != OMP_LIST_DEPEND
 			&& !gfc_is_simply_contiguous (n->expr, false, true)
 			&& gfc_is_not_contiguous (n->expr))
 			  gfc_error ("Array is not

Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-24 Thread Tobias Burnus


Hi Kwok,

the TODO is fixed by the attached patch; I would be happy if you could handle 
this patch,
e.g. together with your patch – or as follow up.

(Lightly tested only, i.e. it fixes the ICE but I did not
do a full testsuite run. But I regard it as obvious.)

Tobias

On 6/24/20 6:47 PM, Kwok Cheung Yeung wrote:


+  ! TODO: This currently fails with an internal compiler error
+  ! (PR 95869)
+  !subroutine test_target_parallel
+  !  do j = 1, N
+  !!$omp target parallel if(j .lt. LIMIT) map(tofrom: a(1:N))
+  !do i = 1, N
+  !  a(i) = a(i) + 1
+  !end do
+  !!$omp end target parallel
+  !   end do
+  !end subroutine


At least with my build (w/o your patch but with other patches),
I see in the original dump:

  D.4049 = j <= 59;
  #pragma omp target map( ...)
{
  {
logical(kind=4) D.4049;

#pragma omp parallel private(i) if(D.4049)
  {
Namely, the assignment is in a different scope than the declaration of the 
variable.
At the moment, I do not see why this fails – at a glance, it l

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
	* trans-openmp.c (gfc_trans_omp_target): Use correct scoping block.

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 3f4f06375ef..38a0d87d726 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -5488,7 +5488,7 @@ gfc_trans_omp_target (gfc_code *code)
 	pushlevel ();
 	gfc_start_block ();
 	tree inner_clauses
-	  = gfc_trans_omp_clauses (, [GFC_OMP_SPLIT_PARALLEL],
+	  = gfc_trans_omp_clauses (, [GFC_OMP_SPLIT_PARALLEL],
    code->loc);
 	stmt = gfc_trans_omp_code (code->block->next, true);
 	stmt = build2_loc (input_location, OMP_PARALLEL, void_type_node, stmt,

Re: [PATCH] PR fortran/95826 - Buffer overflows with PDTs and long symbols

2020-06-24 Thread Harald Anlauf

Hi Thomas,

> Is the memory you allocate freed anywhere?  We have a large
> number of memory leaks in the front end, but maybe we shouldn't
> add another one :-)

my reading of the definition of XALLOCAVEC

include/libiberty.h:#define XALLOCAVEC(T, N)((T *) alloca (sizeof (T) * 
(N)))

and alloca(3):

ALLOCA(3)   Linux Programmer's Manual  
ALLOCA(3)

NAME
   alloca - allocate memory that is automatically freed

told me to ignore questions like your's.  OK, I didn't.

Well, the pointer (char *name) is local and does not escape,
so I presumed that this is the way to go.

Anybody who knows better please correct me.  The above should be
mapped to a gcc builtin anyway.

Thanks,
Harald

Re: [PATCH] PR fortran/95827 - Buffer overflows with PDTs and long symbols

2020-06-24 Thread Thomas Koenig via Gcc-patches


Hi Haraldd,


here's another case with a buffer that did overflow.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?


OK. Thanks for the patch!

Best regards

Thomas

Re: [PATCH] PR fortran/95826 - Buffer overflows with PDTs and long symbols

2020-06-24 Thread Thomas Koenig via Gcc-patches


Hi Harald,


here's another fix for a buffer overflow with long symbols.


Is the memory you allocate freed anywhere?  We have a large
number of memory leaks in the front end, but maybe we shouldn't
add another one :-)

Best regards

Thomas

Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 05:47:06PM +0100, Kwok Cheung Yeung wrote:
> There appears to be a bug in the handling of the 'if' clause (without a
> directive name modifier) for combined OpenMP constructs in the Fortran
> front-end:
> 
> static void
> gfc_split_omp_clauses (gfc_code *code,
>gfc_omp_clauses clausesa[GFC_OMP_SPLIT_NUM])
> {
> ...
>  if (code->ext.omp_clauses != NULL)
>{
>  if (mask & GFC_OMP_MASK_TARGET)
>{
>  /* And this is copied to all.  */
>  clausesa[GFC_OMP_SPLIT_PARALLEL].if_expr
>= code->ext.omp_clauses->if_expr;
>}

Yeah, looks like a pasto.

> Currently, if 'target' is in the combined contruct, then the 'if' is applied
> to the 'parallel' construct, but there are combined constructs with 'target'
> but not 'parallel' (e.g. target teams distribute), which result in the 'if'
> not getting applied at all. This is also redundant, as the unmodified if is
> always applied to the 'parallel' construct if there is one.
> 
> The patch changes the behaviour to match what the common C/C++ FE does,
> which is to apply the 'if' to every applicable sub-construct in the combined
> construct. I have included a testcase to check that the if clauses have been
> applied correctly by the time it gets to the ME. I have also found a case
> that results in an ICE (using 'target parallel' with an 'if' clause) - I
> have commented out this out for now and filed it as PR 95869.
> 
> I have tested for regressions in the gfortran and libgomp testsuites. Okay
> for master/OG10?

Ok, thanks.
Though, while you are at it, if you'd like to also add the related, but
still missing support for if (simd: ...), or if (cancel: ...) that C/C++
already do support, it would be greatly appreciated.

Jakub

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 06:03:26PM +0200, Tobias Burnus wrote:
> On 6/24/20 5:26 PM, Jakub Jelinek via Gcc-patches wrote:
> 
> > Just build the whole gcc tree.
> > If you are at the toplevel of gcc tree (i.e. directory that contains libgomp
> > subdirectory), do
> > mkdir obj
> > cd obj
> > ../configure --disable-bootstrap --enable-languages=c,c++,fortran
> > make -j16 # or how many cores or threads you have
> 
> To save some additional time, use: --disable-multilib --disable-pch
> 
> (But this matters more if you keep building all of GCC and not only
> libgomp. pch = pre-compiled header files.)
> 
> See also https://gcc.gnu.org/install/configure.html and for completeness:
> https://gcc.gnu.org/install/prerequisites.html
> https://gcc.gnu.org/install/build.html

--disable-multilib perhaps just initially, for the OMPD project you'll want
later to test how a 64-bit GDB loads the libgompd.so.1 library and
communicates with 32-bit libgomp.so.1 too.

And, you don't really need to build it from scratch each time you update,
for the libgomp development you just do make, or
./config.status --recheck; ./config.status
make
in the libgomp build directory, and
make check RUNTESTFLAGS=ompd.exp
to test.

Jakub

[PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-24 Thread Kwok Cheung Yeung


Hello

There appears to be a bug in the handling of the 'if' clause (without a 
directive name modifier) for combined OpenMP constructs in the Fortran front-end:


static void
gfc_split_omp_clauses (gfc_code *code,
   gfc_omp_clauses clausesa[GFC_OMP_SPLIT_NUM])
{
...
 if (code->ext.omp_clauses != NULL)
   {
 if (mask & GFC_OMP_MASK_TARGET)
   {
 /* And this is copied to all.  */
 clausesa[GFC_OMP_SPLIT_PARALLEL].if_expr
   = code->ext.omp_clauses->if_expr;
   }

Currently, if 'target' is in the combined contruct, then the 'if' is applied to 
the 'parallel' construct, but there are combined constructs with 'target' but 
not 'parallel' (e.g. target teams distribute), which result in the 'if' not 
getting applied at all. This is also redundant, as the unmodified if is always 
applied to the 'parallel' construct if there is one.


The patch changes the behaviour to match what the common C/C++ FE does, which is 
to apply the 'if' to every applicable sub-construct in the combined construct. I 
have included a testcase to check that the if clauses have been applied 
correctly by the time it gets to the ME. I have also found a case that results 
in an ICE (using 'target parallel' with an 'if' clause) - I have commented out 
this out for now and filed it as PR 95869.


I have tested for regressions in the gfortran and libgomp testsuites. Okay for 
master/OG10?


Thanks

Kwok

commit 052993de7457af85d5749b2ab119ffcc65e341e5
Author: Kwok Cheung Yeung 
Date:   Thu Jun 18 12:40:16 2020 -0700

libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP 
constructs

The unmodified 'if' clause should be applied to all the sub-constructs that
accept an 'if' clause in a combined OpenMP construct, and not just to the
'parallel' sub-construct.

2020-06-24  Kwok Cheung Yeung  

gcc/fortran/
* trans-openmp.c (gfc_split_omp_clauses): Add if clause
to target and simd sub-constructs.

gcc/testsuite/
* gfortran.dg/gomp/combined-if.f90: New.

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 7e2f625..67b7094 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -4748,7 +4748,7 @@ gfc_split_omp_clauses (gfc_code *code,
  clausesa[GFC_OMP_SPLIT_TARGET].if_exprs[OMP_IF_TARGET]
= code->ext.omp_clauses->if_exprs[OMP_IF_TARGET];
  /* And this is copied to all.  */
- clausesa[GFC_OMP_SPLIT_PARALLEL].if_expr
+ clausesa[GFC_OMP_SPLIT_TARGET].if_expr
= code->ext.omp_clauses->if_expr;
}
   if (mask & GFC_OMP_MASK_TEAMS)
@@ -4832,6 +4832,9 @@ gfc_split_omp_clauses (gfc_code *code,
  /* Duplicate collapse.  */
  clausesa[GFC_OMP_SPLIT_SIMD].collapse
= code->ext.omp_clauses->collapse;
+ /* And this is copied to all.  */
+ clausesa[GFC_OMP_SPLIT_SIMD].if_expr
+   = code->ext.omp_clauses->if_expr;
}
   if (mask & GFC_OMP_MASK_TASKLOOP)
{
diff --git a/gcc/testsuite/gfortran.dg/gomp/combined-if.f90 
b/gcc/testsuite/gfortran.dg/gomp/combined-if.f90
new file mode 100644
index 000..383086c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/combined-if.f90
@@ -0,0 +1,110 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-omplower" }
+
+module combined_if
+  implicit none
+
+  integer, parameter :: N = 100
+  integer, parameter :: LIMIT = 60
+  integer :: i, j
+  integer, dimension(N) :: a = (/ (i, i = 1,N) /)
+contains
+  subroutine test_parallel_loop_simd
+do j = 1, N
+  !$omp parallel do simd if(j .lt. LIMIT)
+  do i = 1, N
+a(i) = a(i) + 1
+  end do
+end do
+  end subroutine
+
+  ! TODO: This currently fails with an internal compiler error
+  ! (PR 95869)
+  !subroutine test_target_parallel
+  !  do j = 1, N
+  !!$omp target parallel if(j .lt. LIMIT) map(tofrom: a(1:N))
+  !do i = 1, N
+  !  a(i) = a(i) + 1
+  !end do
+  !!$omp end target parallel
+  !   end do
+  !end subroutine
+
+  subroutine test_target_parallel_loop
+do j = 1, N
+  !$omp target parallel do if(j .lt. LIMIT) map(tofrom: a(1:N))
+  do i = 1, N
+a(i) = a(i) + 1
+  end do
+end do
+  end subroutine
+
+  subroutine test_target_parallel_loop_simd
+do j = 1, N
+  !$omp target parallel do simd if(j .lt. LIMIT) map(tofrom: a(1:N))
+  do i = 1, N
+a(i) = a(i) + 1
+  end do
+end do
+  end subroutine
+
+  subroutine test_target_simd
+do j = 1, N
+  !$omp target simd if(j .lt. LIMIT) map(tofrom: a(1:N))
+  do i = 1, N
+a(i) = a(i) + 1
+  end do
+end do
+  end subroutine
+
+  subroutine test_target_teams
+do j = 1, N
+  !$omp target teams if(j .lt. LIMIT) map(tofrom: a(1:N))
+  do i = 1, N
+a(i) = a(i) + 1
+  end do
+  !$omp end target teams
+end do
+  end subroutine
+
+  subroutine

[PATCH] x96: Remove PTA_CLWB from PTA_ICELAKE_CLIENT

2020-06-24 Thread H.J. Lu via Gcc-patches

CLWB isn't supported on Ice Lake client.  But Ice Lake server and Tiger
Lake support it.  Move PTA_CLWB to PTA_ICELAKE_SERVER and PTA_TIGERLAKE.

PR target/95874
* config/i386/i386.h (PTA_ICELAKE_CLIENT): Remove PTA_CLWB.
(PTA_ICELAKE_SERVER): Add PTA_CLWB.
(PTA_TIGERLAKE): Add PTA_CLWB.
---
 gcc/config/i386/i386.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e1775ff0b5d..d6b57562a53 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2475,11 +2475,11 @@ const wide_int_bitmask PTA_CANNONLAKE = PTA_SKYLAKE | 
PTA_AVX512F
   | PTA_AVX512VBMI | PTA_AVX512IFMA | PTA_SHA;
 const wide_int_bitmask PTA_ICELAKE_CLIENT = PTA_CANNONLAKE | PTA_AVX512VNNI
   | PTA_GFNI | PTA_VAES | PTA_AVX512VBMI2 | PTA_VPCLMULQDQ | PTA_AVX512BITALG
-  | PTA_RDPID | PTA_CLWB | PTA_AVX512VPOPCNTDQ;
+  | PTA_RDPID | PTA_AVX512VPOPCNTDQ;
 const wide_int_bitmask PTA_ICELAKE_SERVER = PTA_ICELAKE_CLIENT | PTA_PCONFIG
-  | PTA_WBNOINVD;
+  | PTA_WBNOINVD | PTA_CLWB;
 const wide_int_bitmask PTA_TIGERLAKE = PTA_ICELAKE_CLIENT | PTA_MOVDIRI
-  | PTA_MOVDIR64B | PTA_AVX512VP2INTERSECT;
+  | PTA_MOVDIR64B | PTA_CLWB | PTA_AVX512VP2INTERSECT;
 const wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER
   | PTA_AVX512F | PTA_AVX512CD;
 const wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE;
-- 
2.26.2

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread Tobias Burnus


On 6/24/20 5:26 PM, Jakub Jelinek via Gcc-patches wrote:


Just build the whole gcc tree.
If you are at the toplevel of gcc tree (i.e. directory that contains libgomp
subdirectory), do
mkdir obj
cd obj
../configure --disable-bootstrap --enable-languages=c,c++,fortran
make -j16 # or how many cores or threads you have


To save some additional time, use: --disable-multilib --disable-pch

(But this matters more if you keep building all of GCC and not only
libgomp. pch = pre-compiled header files.)

See also https://gcc.gnu.org/install/configure.html and for completeness:
https://gcc.gnu.org/install/prerequisites.html
https://gcc.gnu.org/install/build.html

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 11:11:47AM -0400, y2s1982 . wrote:
> > And introduce libgompd.map version script which will have the exported
> > symbols in for now OMPD_5.0 symbol version.
> > So
> > OMPD_5.0 {
> >   global:
> > ...;
> >
> >   local:
> > *;
> > };
> >
> 
> Oh I wasn't sure how everything worked yet, so I thought to borrow libomp
> scripts as a placeholder for now.
> I will change them to libompd scripts.
> As for mapping, could you point me to an example that has global: and
> local: object as above?

See libgomp.map or
info ld
and search for VERSION command.

> > So, what actually includes that header?  Otherwise it can't compile.
> >
> I forgot to uncomment that. I put that in there to see other compilation
> errors.  I am still having trouble generating compilation errors using make.
> I tried creating a separate directory outside of the source, calling
> ../gcc/configure, and trying make libgomp.
> It did not generate any output other than "make: Nothing to do be for
> '../gcc/libgomp'".  I wasn't sure what else to try.

Just build the whole gcc tree.
If you are at the toplevel of gcc tree (i.e. directory that contains libgomp
subdirectory), do
mkdir obj
cd obj
../configure --disable-bootstrap --enable-languages=c,c++,fortran
make -j16 # or how many cores or threads you have

later on you can just
cd x86_64-*-linux*/libgomp
and in there
make -j16
make -k check

> Thank you. I will try harder to follow the formatting.  When you say
> indentation level, do you mean the number of spaces?

Yes, except that 8 consecutive spaces should be replaced by tab character.
So like:
{
  if (...)
{
  if (...)
{
  if (...)
{
  ...
}
  ...
}
  ...
}
}

> Thank you for the clarification. Should I change the declaration and put
> the definition of ompd_dll_locations variable in the omp-tools.h?

Variables shouldn't be defined in header files, they should be just declared
there (but I think ompd_dll_locations actually isn't required to be declared
in a header, all one needs to ensure is that it is exported from the shared
library (i.e. it is not static, and libgompd.map has ompd_dll_locations;
entry for it in global: section).

Jakub

[PATCH] powerpc: Restore bootstrap for PPC Darwin.

2020-06-24 Thread Iain Sandoe

Hi,

Darwin has signed chars and the fields in the insn_data
struct are const char, which leads to a bootstrap fail with

"error: comparison of integer expressions of different signedness: 'unsigned 
int' and 'const char' [-Werror=sign-compare]”

OK for master?
Iain

gcc/ChangeLog:

* config/rs6000/rs6000-call.c (mma_init_builtins): Cast
the insn_data n_operands value to unsigned.
---
 gcc/config/rs6000/rs6000-call.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 81816a5..8d9be44 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13435,7 +13435,7 @@ mma_init_builtins (void)
{
  /* This is a normal MMA built-in function.  */
  unsigned j = (attr & RS6000_BTC_QUAD) ? 1 : 0;
- for (; j < insn_data[icode].n_operands; j++)
+ for (; j < (unsigned) insn_data[icode].n_operands; j++)
{
  machine_mode mode = insn_data[icode].operand[j].mode;
  if (gimple_func && mode == PXImode)
-- 
2.8.1

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread y2s1982 . via Gcc-patches

Hello,

Thank you for the tips. I have some follow up questions. :)

On Wed, Jun 24, 2020 at 5:25 AM Jakub Jelinek  wrote:

> > libgomp/ChangeLog:
> >
> >   * Makefile.am (toolexeclib_LTLIBRARIES and related): Add
> libgompd.la.
>
> Please spell out all the changes.  I.e.
> * Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la.
> (libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
> libgompd_la_SOURCES): Set.
>
Okay. Thank you for the clarification.

> +libgompd_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
> > +$(lt_host_flags)
> > +libgompd_la_DEPENDENCIES = $(libgomp_version_dep)
> > +libgompd_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)
>
> You actually want to use libgompd_la_LDFLAGS, libgompd_version_dep,
> libgompd_version_script, libgompd_version_info etc. (and set those
> variables
> next to the libgomp ones).
> And introduce libgompd.map version script which will have the exported
> symbols in for now OMPD_5.0 symbol version.
> So
> OMPD_5.0 {
>   global:
> ...;
>
>   local:
> *;
> };
>

Oh I wasn't sure how everything worked yet, so I thought to borrow libomp
scripts as a placeholder for now.
I will change them to libompd scripts.
As for mapping, could you point me to an example that has global: and
local: object as above?


>
> > +libgompd_la_SOURCES = libgompd.c
>
> Not sure if you want to call the source file libgompd.c, that would make
> sense only if you want to have all of OMPD implemented in a single file.
> If you need multiple, I'd suggest ompd-*.c where the * would be something
> short/descriptive to name a set of related functions.
>
Okay. I will start splitting them up.


>
> > +++ b/libgomp/libgompd.c
> > @@ -0,0 +1,46 @@
> > +#include 
> > +#include 
> > +#include "omp-tools.h"
> > +#include "libgompd.h"
> > +//#include "plugin-suffix.h"
>
> So, what actually includes that header?  Otherwise it can't compile.
>
I forgot to uncomment that. I put that in there to see other compilation
errors.  I am still having trouble generating compilation errors using make.
I tried creating a separate directory outside of the source, calling
../gcc/configure, and trying make libgomp.
It did not generate any output other than "make: Nothing to do be for
'../gcc/libgomp'".  I wasn't sure what else to try.



> > +
> > +ompd_rc_t ompd_get_api_version(ompd_word_t *version)
> > +{
> > +*version = OMPD_VERSION;
> > +return ompd_rc_ok;
> > +}
>
> Formatting.  The GNU Coding Conventions say it should be
> ompd_rc_t
> ompd_get_api_version (ompd_word_t *version)
> {
>   *version = OMPD_VERSION;
>   return ompd_rc_ok;
> }

(i.e. function name should be at the start of line for easy grepping,
> there should be a single space before (, no space after the ( or
> before ) as you have in other functions, indentation level is 2
> (see indent program defaults).
>

Thank you. I will try harder to follow the formatting.  When you say
indentation level, do you mean the number of spaces?


>
> > +
> > +ompd_rc_t ompd_get_version_string(const char **string)
> > +{
> > +string = str(OMPD_VERSION);
> > +return ompd_rc_ok;
> > +}
>
> This doesn't do anything outside of the function.  You need
>   *string =

and put there something more descriptive than just the version,
> perhaps
>   *string = "GNU OpenMP Runtime implementing OpenMP 5.0 " str
> (OMPD_VERSION);
> I don't see a str macro defined, and it should have a different name, str
> is
> too generic.
> libgomp.h defines (conditionally):
> # define ialias_str1(x) ialias_str2(x)
> # define ialias_str2(x) #x
> which does what you want, but you need it unconditionally and call it some
> other way (stringify and stringify1?, define right above the function?).
>
> > +
> > +ompd_rc_t ompd_initialize ( ompd_word_t api_version, const
> ompd_callbacks_t *callbacks )
> > +{
> > +/* initialized flag */
> > +static int ompd_initialized = 0;
> > +
> > +if (ompd_initialized)
> > +return ompd_rc_error;
> > +
> > +/* compute library name and locations */
> > +const char *prefix = "libgompd.";
> > +const char *suffix = SONAME_SUFFIX (1);
> > +size_t prefix_len, suffix_len;
> > +prefix_len = strlen(prefix);
> > +suffix_len = strlen(suffix);
>
> Formatting (in addition to what has been said above, e.g. space before (.
> But, this doesn't really belong here anyway, ompd_dll_locations needs to be
> initialized not in ompd_initialize, but far before that.
> Either it should be just a const variable, which I think with
> const char *ompd_dll_locations[2] = { "libgompd" SONAME_SUFFIX (1), NULL };
> is possible, or some library constructor needs to initializa it and then
> pass through (or call) the mandated breakpoint so that the debugger can be
> sure it is initialized.
> Then the debugger will use the variable, load the libgompd library and only
> after that actually call ompd_initialize.
>

Thank you for the clarification. Should I change the

Re: [PATCH] x86: Remove brand ID check for Intel processors

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 5:01 PM H.J. Lu  wrote:
>
> Brand ID was a feature that briefly existed in some Pentium III and
> Pentium 4 CPUs.  The CPUs that had non-zero brand ID still have had
> valid family/model.  Brand ID just gives a marketing name for the CPU.
> Remove the extra code for brand ID check.
>
> gcc/
>
> PR target/95660
> * common/config/i386/cpuinfo.h (get_intel_cpu): Remove brand_id.
> (cpu_indicator_init): Likewise.
> * config/i386/driver-i386.c (host_detect_local_cpu): Updated.
>
> gcc/testsuite/
>
> PR target/95660
> * gcc.target/i386/builtin_target.c (check_detailed): Updated.

LGTM.

Thanks,
Uros.

> ---
>  gcc/common/config/i386/cpuinfo.h   | 12 +---
>  gcc/config/i386/driver-i386.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/builtin_target.c |  2 +-
>  3 files changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h 
> b/gcc/common/config/i386/cpuinfo.h
> index 27b7b4d8581..3eda53240f6 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -254,13 +254,12 @@ get_amd_cpu (struct __processor_model *cpu_model,
>  static inline const char *
>  get_intel_cpu (struct __processor_model *cpu_model,
>struct __processor_model2 *cpu_model2,
> -  unsigned int *cpu_features2,
> -  unsigned int brand_id)
> +  unsigned int *cpu_features2)
>  {
>const char *cpu = NULL;
>
> -  /* Parse family and model only for brand ID 0 and model 6. */
> -  if (brand_id != 0 || cpu_model2->__cpu_family != 0x6)
> +  /* Parse family and model only for model 6. */
> +  if (cpu_model2->__cpu_family != 0x6)
>  return cpu;
>
>switch (cpu_model2->__cpu_model)
> @@ -758,7 +757,7 @@ cpu_indicator_init (struct __processor_model *cpu_model,
>
>int max_level;
>unsigned int vendor;
> -  unsigned int model, family, brand_id;
> +  unsigned int model, family;
>unsigned int extended_model, extended_family;
>
>/* This function needs to run just once.  */
> @@ -791,7 +790,6 @@ cpu_indicator_init (struct __processor_model *cpu_model,
>
>model = (eax >> 4) & 0x0f;
>family = (eax >> 8) & 0x0f;
> -  brand_id = ebx & 0xff;
>extended_model = (eax >> 12) & 0xf0;
>extended_family = (eax >> 20) & 0xff;
>
> @@ -813,7 +811,7 @@ cpu_indicator_init (struct __processor_model *cpu_model,
>get_available_features (cpu_model, cpu_model2, cpu_features2,
>   ecx, edx);
>/* Get CPU type.  */
> -  get_intel_cpu (cpu_model, cpu_model2, cpu_features2, brand_id);
> +  get_intel_cpu (cpu_model, cpu_model2, cpu_features2);
>cpu_model->__cpu_vendor = VENDOR_INTEL;
>  }
>else if (vendor == signature_AMD_ebx)
> diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> index e9e4d6ed023..6da2a15c1b0 100644
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -555,7 +555,7 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
> cpu = "pentium";
>break;
>  case PROCESSOR_PENTIUMPRO:
> -  cpu = get_intel_cpu (_model, _model2, cpu_features2, 0);
> +  cpu = get_intel_cpu (_model, _model2, cpu_features2);
>if (cpu == NULL)
> {
>   if (arch)
> diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c 
> b/gcc/testsuite/gcc.target/i386/builtin_target.c
> index e87f262a775..aa9680544d8 100644
> --- a/gcc/testsuite/gcc.target/i386/builtin_target.c
> +++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
> @@ -46,7 +46,7 @@ check_detailed ()
>  {
>  case VENDOR_INTEL:
>assert (__builtin_cpu_is ("intel"));
> -  get_intel_cpu (_model, _model2, cpu_features2, 0);
> +  get_intel_cpu (_model, _model2, cpu_features2);
>break;
>  case VENDOR_AMD:
>assert (__builtin_cpu_is ("amd"));
> --
> 2.26.2
>

[PATCH] x86: Remove brand ID check for Intel processors

2020-06-24 Thread H.J. Lu via Gcc-patches

Brand ID was a feature that briefly existed in some Pentium III and
Pentium 4 CPUs.  The CPUs that had non-zero brand ID still have had
valid family/model.  Brand ID just gives a marketing name for the CPU.
Remove the extra code for brand ID check.

gcc/

PR target/95660
* common/config/i386/cpuinfo.h (get_intel_cpu): Remove brand_id.
(cpu_indicator_init): Likewise.
* config/i386/driver-i386.c (host_detect_local_cpu): Updated.

gcc/testsuite/

PR target/95660
* gcc.target/i386/builtin_target.c (check_detailed): Updated.
---
 gcc/common/config/i386/cpuinfo.h   | 12 +---
 gcc/config/i386/driver-i386.c  |  2 +-
 gcc/testsuite/gcc.target/i386/builtin_target.c |  2 +-
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 27b7b4d8581..3eda53240f6 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -254,13 +254,12 @@ get_amd_cpu (struct __processor_model *cpu_model,
 static inline const char *
 get_intel_cpu (struct __processor_model *cpu_model,
   struct __processor_model2 *cpu_model2,
-  unsigned int *cpu_features2,
-  unsigned int brand_id)
+  unsigned int *cpu_features2)
 {
   const char *cpu = NULL;
 
-  /* Parse family and model only for brand ID 0 and model 6. */
-  if (brand_id != 0 || cpu_model2->__cpu_family != 0x6)
+  /* Parse family and model only for model 6. */
+  if (cpu_model2->__cpu_family != 0x6)
 return cpu;
 
   switch (cpu_model2->__cpu_model)
@@ -758,7 +757,7 @@ cpu_indicator_init (struct __processor_model *cpu_model,
 
   int max_level;
   unsigned int vendor;
-  unsigned int model, family, brand_id;
+  unsigned int model, family;
   unsigned int extended_model, extended_family;
 
   /* This function needs to run just once.  */
@@ -791,7 +790,6 @@ cpu_indicator_init (struct __processor_model *cpu_model,
 
   model = (eax >> 4) & 0x0f;
   family = (eax >> 8) & 0x0f;
-  brand_id = ebx & 0xff;
   extended_model = (eax >> 12) & 0xf0;
   extended_family = (eax >> 20) & 0xff;
 
@@ -813,7 +811,7 @@ cpu_indicator_init (struct __processor_model *cpu_model,
   get_available_features (cpu_model, cpu_model2, cpu_features2,
  ecx, edx);
   /* Get CPU type.  */
-  get_intel_cpu (cpu_model, cpu_model2, cpu_features2, brand_id);
+  get_intel_cpu (cpu_model, cpu_model2, cpu_features2);
   cpu_model->__cpu_vendor = VENDOR_INTEL;
 }
   else if (vendor == signature_AMD_ebx)
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index e9e4d6ed023..6da2a15c1b0 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -555,7 +555,7 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
cpu = "pentium";
   break;
 case PROCESSOR_PENTIUMPRO:
-  cpu = get_intel_cpu (_model, _model2, cpu_features2, 0);
+  cpu = get_intel_cpu (_model, _model2, cpu_features2);
   if (cpu == NULL)
{
  if (arch)
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c 
b/gcc/testsuite/gcc.target/i386/builtin_target.c
index e87f262a775..aa9680544d8 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -46,7 +46,7 @@ check_detailed ()
 {
 case VENDOR_INTEL:
   assert (__builtin_cpu_is ("intel"));
-  get_intel_cpu (_model, _model2, cpu_features2, 0);
+  get_intel_cpu (_model, _model2, cpu_features2);
   break;
 case VENDOR_AMD:
   assert (__builtin_cpu_is ("amd"));
-- 
2.26.2

Re: [PATCH] x86: Add Cooper Lake detection with AVX512BF16

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 3:12 PM H.J. Lu  wrote:
>
> All Sky Lake family processors have the same CPUID model number, 0x55.
> The differences are Cascade Lake has AVX512VNNI and Cooper Lake has
> AVX512VNNI + AVX512BF16.  Check AVX512BF16 for Cooper Lake.
>
> PR target/95774
> * common/config/i386/cpuinfo.h (get_intel_cpu): Add Cooper Lake
> detection with AVX512BF16.

OK.

Thanks,
Uros.

> ---
>  gcc/common/config/i386/cpuinfo.h | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h 
> b/gcc/common/config/i386/cpuinfo.h
> index 0a0e88011bc..3eda53240f6 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -397,7 +397,15 @@ get_intel_cpu (struct __processor_model *cpu_model,
>CHECK___builtin_cpu_is ("corei7");
>cpu_model->__cpu_type = INTEL_COREI7;
>if (has_cpu_feature (cpu_model, cpu_features2,
> -  FEATURE_AVX512VNNI))
> +  FEATURE_AVX512BF16))
> +   {
> + /* Cooper Lake.  */
> + cpu = "cooperlake";
> + CHECK___builtin_cpu_is ("cooperlake");
> + cpu_model->__cpu_subtype = INTEL_COREI7_COOPERLAKE;
> +   }
> +  else if (has_cpu_feature (cpu_model, cpu_features2,
> +   FEATURE_AVX512VNNI))
> {
>   /* Cascade Lake.  */
>   cpu = "cascadelake";
> --
> 2.26.2
>

Re: V2 [PATCH] x86: Share _isa_names_table and use cpuinfo.h

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 4:38 PM H.J. Lu  wrote:
>
> On Wed, Jun 24, 2020 at 03:46:25PM +0200, Uros Bizjak wrote:
> > On Wed, Jun 24, 2020 at 3:37 PM Uros Bizjak  wrote:
> > >
> > > On Wed, Jun 24, 2020 at 3:06 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Jun 24, 2020 at 02:43:43PM +0200, Uros Bizjak wrote:
> > > > > >
> > > > > > Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> > > > > > is unchanged.
> > > > >
> > > > > Thanks. We should change driver-i386.c very carefully and in an
> > > > > independent way from this patch. It is a complex and interwoven web of
> > > > > name, model and features check. I propose that we first convert
> > > > > various has_xxx checks to a new interface, in as trivial way as
> > > > > possible.
> > > > >
> > > >
> > > > Here is the patch to share isa_names_table and use cpuinfo.h to check
> > > > ISAs as well as detect newer Intel/AMD processors.
> > > >
> > > > OK for master?
> > >
> > > No. Don't use get_intel_cpu and get_amd_cpu in driver-i386.c.
> >
> > +  cpu = get_amd_cpu (_model, _model2, cpu_features2);
> > +  if (cpu == NULL)
> > +   {
> > + if (name == signature_NSC_ebx)
> > +   processor = PROCESSOR_GEODE;
> > + else if (has_feature (FEATURE_SSE2)
> >
> > Here is where your patch fails. AMD part has early bypass for
> > signature_NSC_ebx, so it is detected as NSC processor regardless of
> > what is detected by generic code.
> >
> > Uros.
>
> Here is the updated patch not to use get_amd_cpu in driver-i386.c.
> I believe the usage of get_intel_cpu in driver-i386.c is correct.
>
> OK for master?

OK, let's go with this version, but look out for fallout.

I hope there is some interest from AMD folks to unify and further
improve their part of the detection code.

Thanks,
Uros.

> Thanks.
>
> H.J.
> ---
> Both driver-i386.c and libgcc use CPUID to detect the processor name
> as well as available ISAs.  To detect the same processor or ISAs, the
> same detection logic is duplicated in 2 places.  Sometimes only one place
> was up to date or got it right.  Sometimes both places got it wrong.
>
> 1. Add common/config/i386/i386-isas.h to define _isa_names_table.
> 2. Use isa_names_table to auto-generate ISA command-line options.
> 3. Use isa_names_table to auto-generate __builtin_cpu_supports tests.
> 4. Use common/config/i386/cpuinfo.h to check available ISAs and detect
> newer Intel processors in driver-i386.c and builtin_target.c.
> 5. Detection of AMD processors and older processors in driver-i386.c is
> unchanged.
>
> gcc/
>
> PR target/95843
> * common/config/i386/i386-isas.h: New file.  Extracted from
> gcc/config/i386/i386-builtins.c.
> (_isa_names_table): Add option.
> (ISA_NAMES_TABLE_START): New.
> (ISA_NAMES_TABLE_END): Likewise.
> (ISA_NAMES_TABLE_ENTRY): Likewise.
> (isa_names_table): Defined with ISA_NAMES_TABLE_START,
> ISA_NAMES_TABLE_END and ISA_NAMES_TABLE_ENTRY.  Add more ISAs
> from enum processor_features.
> * config/i386/driver-i386.c: Include
> "common/config/i386/cpuinfo.h" and
> "common/config/i386/i386-isas.h".
> (has_feature): New macro.
> (host_detect_local_cpu): Call cpu_indicator_init to get CPU
> features.  Use has_feature to detect processor features.  Call
> Call get_intel_cpu to get the newer Intel CPU name.  Use
> isa_names_table to generate command-line options.
> * config/i386/i386-builtins.c: Include
> "common/config/i386/i386-isas.h".
> (_arch_names_table): Removed.
> (isa_names_table): Likewise.
>
> gcc/testsuite/
>
> PR target/95843
> * gcc.target/i386/builtin_target.c: Include ,
> ../../../common/config/i386/i386-cpuinfo.h and
> ../../../common/config/i386/cpuinfo.h.
> (check_amd_cpu_model): Removed.
> (check_intel_cpu_model): Likewise,
> (CHECK___builtin_cpu_is): New.
> (gcc_assert): New.  Defined as assert.
> (gcc_unreachable): New.  Defined as abort.
> (inline): New.  Defined as empty.
> (ISA_NAMES_TABLE_START): Likewise.
> (ISA_NAMES_TABLE_END): Likewise.
> (ISA_NAMES_TABLE_ENTRY): New.
> (check_features): Include
> "../../../common/config/i386/i386-isas.h".
> (check_detailed): Call cpu_indicator_init.  Always call
> check_features.  Call get_amd_cpu instead of check_amd_cpu_model.
> Call get_intel_cpu instead of check_intel_cpu_model.
> ---
>  gcc/common/config/i386/i386-isas.h| 163 +
>  gcc/config/i386/driver-i386.c | 644 +++---
>  gcc/config/i386/i386-builtins.c   |  52 +-
>  .../gcc.target/i386/builtin_target.c  | 355 +-
>  4 files changed, 306 insertions(+), 908 deletions(-)
>  create mode 100644 gcc/common/config/i386/i386-isas.h
>
> diff --git a/gcc/common/config/i386/i386-isas.h

V2 [PATCH] x86: Share _isa_names_table and use cpuinfo.h

2020-06-24 Thread H.J. Lu via Gcc-patches

On Wed, Jun 24, 2020 at 03:46:25PM +0200, Uros Bizjak wrote:
> On Wed, Jun 24, 2020 at 3:37 PM Uros Bizjak  wrote:
> >
> > On Wed, Jun 24, 2020 at 3:06 PM H.J. Lu  wrote:
> > >
> > > On Wed, Jun 24, 2020 at 02:43:43PM +0200, Uros Bizjak wrote:
> > > > >
> > > > > Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> > > > > is unchanged.
> > > >
> > > > Thanks. We should change driver-i386.c very carefully and in an
> > > > independent way from this patch. It is a complex and interwoven web of
> > > > name, model and features check. I propose that we first convert
> > > > various has_xxx checks to a new interface, in as trivial way as
> > > > possible.
> > > >
> > >
> > > Here is the patch to share isa_names_table and use cpuinfo.h to check
> > > ISAs as well as detect newer Intel/AMD processors.
> > >
> > > OK for master?
> >
> > No. Don't use get_intel_cpu and get_amd_cpu in driver-i386.c.
> 
> +  cpu = get_amd_cpu (_model, _model2, cpu_features2);
> +  if (cpu == NULL)
> +   {
> + if (name == signature_NSC_ebx)
> +   processor = PROCESSOR_GEODE;
> + else if (has_feature (FEATURE_SSE2)
> 
> Here is where your patch fails. AMD part has early bypass for
> signature_NSC_ebx, so it is detected as NSC processor regardless of
> what is detected by generic code.
> 
> Uros.

Here is the updated patch not to use get_amd_cpu in driver-i386.c.
I believe the usage of get_intel_cpu in driver-i386.c is correct.

OK for master?

Thanks.

H.J.
---
Both driver-i386.c and libgcc use CPUID to detect the processor name
as well as available ISAs.  To detect the same processor or ISAs, the
same detection logic is duplicated in 2 places.  Sometimes only one place
was up to date or got it right.  Sometimes both places got it wrong.

1. Add common/config/i386/i386-isas.h to define _isa_names_table.
2. Use isa_names_table to auto-generate ISA command-line options.
3. Use isa_names_table to auto-generate __builtin_cpu_supports tests.
4. Use common/config/i386/cpuinfo.h to check available ISAs and detect
newer Intel processors in driver-i386.c and builtin_target.c.
5. Detection of AMD processors and older processors in driver-i386.c is
unchanged.

gcc/

PR target/95843
* common/config/i386/i386-isas.h: New file.  Extracted from
gcc/config/i386/i386-builtins.c.
(_isa_names_table): Add option.
(ISA_NAMES_TABLE_START): New.
(ISA_NAMES_TABLE_END): Likewise.
(ISA_NAMES_TABLE_ENTRY): Likewise.
(isa_names_table): Defined with ISA_NAMES_TABLE_START,
ISA_NAMES_TABLE_END and ISA_NAMES_TABLE_ENTRY.  Add more ISAs
from enum processor_features.
* config/i386/driver-i386.c: Include
"common/config/i386/cpuinfo.h" and
"common/config/i386/i386-isas.h".
(has_feature): New macro.
(host_detect_local_cpu): Call cpu_indicator_init to get CPU
features.  Use has_feature to detect processor features.  Call
Call get_intel_cpu to get the newer Intel CPU name.  Use
isa_names_table to generate command-line options.
* config/i386/i386-builtins.c: Include
"common/config/i386/i386-isas.h".
(_arch_names_table): Removed.
(isa_names_table): Likewise.

gcc/testsuite/

PR target/95843
* gcc.target/i386/builtin_target.c: Include ,
../../../common/config/i386/i386-cpuinfo.h and
../../../common/config/i386/cpuinfo.h.
(check_amd_cpu_model): Removed.
(check_intel_cpu_model): Likewise,
(CHECK___builtin_cpu_is): New.
(gcc_assert): New.  Defined as assert.
(gcc_unreachable): New.  Defined as abort.
(inline): New.  Defined as empty.
(ISA_NAMES_TABLE_START): Likewise.
(ISA_NAMES_TABLE_END): Likewise.
(ISA_NAMES_TABLE_ENTRY): New.
(check_features): Include
"../../../common/config/i386/i386-isas.h".
(check_detailed): Call cpu_indicator_init.  Always call
check_features.  Call get_amd_cpu instead of check_amd_cpu_model.
Call get_intel_cpu instead of check_intel_cpu_model.
---
 gcc/common/config/i386/i386-isas.h| 163 +
 gcc/config/i386/driver-i386.c | 644 +++---
 gcc/config/i386/i386-builtins.c   |  52 +-
 .../gcc.target/i386/builtin_target.c  | 355 +-
 4 files changed, 306 insertions(+), 908 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-isas.h

diff --git a/gcc/common/config/i386/i386-isas.h 
b/gcc/common/config/i386/i386-isas.h
new file mode 100644
index 000..08c9dbecc76
--- /dev/null
+++ b/gcc/common/config/i386/i386-isas.h
@@ -0,0 +1,163 @@
+/* i386 ISA table.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version

Re: [PATCH] Make contrib/download_prerequisites work on AIX and OpenBSD

2020-06-24 Thread David Edelsohn via Gcc-patches

On Wed, Jun 24, 2020 at 3:33 AM Richard Biener
 wrote:
>
> On Tue, Jun 23, 2020 at 10:37 PM Ilya Leoshkevich via Gcc-patches
>  wrote:
> >
> > Hello,
> >
> > I needed to test [1] on AIX and OpenBSD and noticed
> > download_prerequisites doesn't work there. The attached patch fixes
> > it.
> >
> > OK for master?
>
> OK if David acks the AIX part.

The AIX part is correct.  Thanks for the fix!

- David

>
> Thanks,
> Richard.
>
> > Best regards,
> > Ilya
> >
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548182.html
> >
> > ---
> >
> > contrib/ChangeLog:
> >
> > 2020-06-11  Ilya Leoshkevich  
> >
> > * download_prerequisites: Support AIX and OpenBSD unames.
> > Pipe `{gzip,bzip2} -d` to `tar -xf -`.
> > ---
> >  contrib/download_prerequisites | 18 --
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
> > index aa0356e6266..da19913f9ab 100755
> > --- a/contrib/download_prerequisites
> > +++ b/contrib/download_prerequisites
> > @@ -47,9 +47,12 @@ force=0
> >  OS=$(uname)
> >
> >  case $OS in
> > -  "Darwin"|"FreeBSD"|"DragonFly")
> > +  "Darwin"|"FreeBSD"|"DragonFly"|"AIX")
> >  chksum='shasum -a 512 --check'
> >;;
> > +  "OpenBSD")
> > +chksum='sha512 -c'
> > +  ;;
> >*)
> >  chksum='sha512sum -c'
> >;;
> > @@ -242,8 +245,19 @@ for ar in $(echo_archives)
> >  do
> >  package="${ar%.tar*}"
> >  if [ ${force} -gt 0 ]; then rm -rf "${directory}/${package}"; fi
> > +case $ar in
> > +*.gz)
> > +   uncompress='gzip -d'
> > +   ;;
> > +*.bz2)
> > +   uncompress='bzip2 -d'
> > +   ;;
> > +*)
> > +   uncompress='cat'
> > +   ;;
> > +esac
> >  [ -e "${directory}/${package}" ]   
> >\
> > -|| ( cd "${directory}" && tar -xf "${ar}" )
> >\
> > +|| ( cd "${directory}" && $uncompress <"${ar}" | tar -xf - )   
> >\
> >  || die "Cannot extract package from ${ar}"
> >  unset package
> >  done
> > --
> > 2.25.4
> >

Re: [PATCH PR95700] Use nullptr instead of NULL as a sentinel value

2020-06-24 Thread Richard Biener via Gcc-patches

On Wed, Jun 24, 2020 at 4:14 PM Jonathan Wakely via Gcc-patches
 wrote:
>
> On 24/06/20 12:31 +0200, Ilya Leoshkevich wrote:
> >Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
> >and s390x-redhat-linux.
> >
> >Ok for master?
> >
> >---
> >
> >Bootstrap with musl libc fails with numerous "missing sentinel in
> >function call" errors.  This is because musl defines NULL as 0L for C++,
> >but gcc requires sentinel value to be a pointer or __null.
> >
> >Jonathan Wakely says:
> >
> >To be really safe during stage 1, GCC should not use NULL as a
> >pointer sentinel in C++ code anyway.
> >
> >The bootstrap compiler could define it to 0 or 0u, neither of which
> >is guaranteed to be OK to pass as a varargs sentinel where a null
> >pointer is expected.  Any of (void*)0 or (void*)NULL or nullptr
> >would be safe.
> >
> >Therefore, fix by replacing NULL sentinels with nullptr.
>
> For some additional context, the C++ standard guarantees that passing
> nullptr to a varargs function will convert to (void*)0. That has been
> true since nullptr was added in C++11.

Is there a diagnostic option that we can turn on so no NULLs will creep
in in such position?   Without that we'll bitrot quickly?

Richard.

>

Re: [PATCH] VEC_COND_EXPR: clean up first argument

2020-06-24 Thread Richard Biener via Gcc-patches

On Wed, Jun 24, 2020 at 11:27 AM Martin Liška  wrote:
>
> On 6/24/20 11:09 AM, Richard Biener wrote:
> > On Wed, Jun 24, 2020 at 10:49 AM Martin Liška  wrote:
> >>
> >> On 6/24/20 9:43 AM, Richard Biener wrote:
> >>> Hmm, can you instead use simple_dce_from_worklist and simply
> >>> record all SSA_NAMEs you end up "forwarding" as possibly dead
> >>> in a bitmap?  At least that hashmap traversal looks dangerous
> >>> with respect to address-space randomization and gsi_remove
> >>> inserting debug stmts and thus eventually allocating debug decls.
> >>
> >> Sure, done in the updated patch.
> >
> > You can simplify the patch by eliding the num_imm_uses checks
>
> Really? How can I be sure that a SSA_NAME is not shared among different
> VEC_COND_EXPR statements (or even by some other statements)?

The bitmap DCE does this check for you.

> > and by using auto_bitmap.
>
> Oh yeah!
>
> >  Why is it necessary to update
> > the veclower pass btw?  Is that just to avoid useless isels
> > on dead code?
>
> Yes:
>
>_10 = _9 != { 0, 0, 0, 0 };
>_11 = *a_16(D);
>_12 = *b_17(D);
>_13 = _11 + _12;
>_14 = VEC_COND_EXPR <_10, _13, { 3.0e+0, 3.0e+0, 3.0e+0, 3.0e+0 }>;
>
> is expanded by vectlower to something like:
>
>_10 = _9 != { 0, 0, 0, 0 };
>_11 = *a_16(D);
>_12 = *b_17(D);
>_67 = BIT_FIELD_REF <_11, 32, 0>;
>_68 = BIT_FIELD_REF <_12, 32, 0>;
>_69 = _67 + _68;
> ...
>_14 = {_80, _82, _84, _86};
>*a_16(D) = _14;
>
> So one needs to remove: _10 = _9 != { 0, 0, 0, 0 };
> Note the ICE happens without an optimization level.

Ah, OK.  That makes sense.

>
> >
> > You also updated veclower "nicely" but still have the hashmap
> > walk in isel - you should know when you "merged" a condition
> > into a cond and set the bit there.
>
> Isn't the same as before as the first argument can be actually shared in 
> between
> multiple GIMPLE statements?

As said above the bitmap DCE is built for lazy consumption.

Richard.

> Thanks,
> Martin
>
> >
> > Richard.
> >
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >> Thanks,
> >> Martin
> >>
>

Re: [PATCH PR95700] Use nullptr instead of NULL as a sentinel value

2020-06-24 Thread Jonathan Wakely via Gcc-patches


On 24/06/20 12:31 +0200, Ilya Leoshkevich wrote:

Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.

Ok for master?

---

Bootstrap with musl libc fails with numerous "missing sentinel in
function call" errors.  This is because musl defines NULL as 0L for C++,
but gcc requires sentinel value to be a pointer or __null.

Jonathan Wakely says:

   To be really safe during stage 1, GCC should not use NULL as a
   pointer sentinel in C++ code anyway.

   The bootstrap compiler could define it to 0 or 0u, neither of which
   is guaranteed to be OK to pass as a varargs sentinel where a null
   pointer is expected.  Any of (void*)0 or (void*)NULL or nullptr
   would be safe.

Therefore, fix by replacing NULL sentinels with nullptr.


For some additional context, the C++ standard guarantees that passing
nullptr to a varargs function will convert to (void*)0. That has been
true since nullptr was added in C++11.

[PATCH] tree-optimization/95866 - avoid vectorizing uniform SLP subgraphs

2020-06-24 Thread Richard Biener

This avoids vectorizing SLP subgraphs that just compute uniform
operations on all-same operands.  That fixes the less interesting
(but most embarrasing) part of the testcase in the PR.  On the
way it also fixed a missing matches[0] reset in the last
refactoring touching that place.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

2020-06-24  Richard Biener  

PR tree-optimization/95866
* tree-vect-slp.c (vect_slp_tree_uniform_p): New.
(vect_build_slp_tree_2): Properly reset matches[0],
ignore uniform constants.

* gcc.target/i386/pr95866-1.c: New testcase.
---
 gcc/testsuite/gcc.target/i386/pr95866-1.c | 18 
 gcc/tree-vect-slp.c   | 26 +--
 2 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95866-1.c

diff --git a/gcc/testsuite/gcc.target/i386/pr95866-1.c 
b/gcc/testsuite/gcc.target/i386/pr95866-1.c
new file mode 100644
index 000..991370cf669
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95866-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-slp2-details -msse2" } */
+
+int x[4];
+void foo(int i)
+{
+  int j = (i+1) & 31;
+  x[0] = (x[0] << j) + j;
+  x[1] = (x[1] << j) + j;
+  x[2] = (x[2] << j) + j;
+  x[3] = (x[3] << j) + j;
+}
+
+/* We should not use vector operations for i + 1 and (i + 1) & 31 but
+   instead use { j, j, j, j }.  */ 
+/* { dg-final { scan-tree-dump-times "Building parent vector operands from 
scalars" 2 "slp2" } } */
+/* { dg-final { scan-tree-dump-not " = \{i_" "slp2" } } */
+/* { dg-final { scan-tree-dump-times " = \{j_" 1 "slp2" } } */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index e7a260877a9..b2792c76ad2 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -238,6 +238,26 @@ vect_contains_pattern_stmt_p (vec stmts)
   return false;
 }
 
+/* Return true when all lanes in the external or constant NODE have
+   the same value.  */
+
+static bool
+vect_slp_tree_uniform_p (slp_tree node)
+{
+  gcc_assert (SLP_TREE_DEF_TYPE (node) == vect_constant_def
+ || SLP_TREE_DEF_TYPE (node) == vect_external_def);
+
+  unsigned i;
+  tree op, first = NULL_TREE;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
+if (!first)
+  first = op;
+else if (!operand_equal_p (first, op, 0))
+  return false;
+
+  return true;
+}
+
 /* Find the place of the data-ref in STMT_INFO in the interleaving chain
that starts from FIRST_STMT_INFO.  Return -1 if the data-ref is not a part
of the chain.  */
@@ -1439,7 +1459,7 @@ fail:
   vect_free_oprnd_info (oprnds_info);
 
   /* If we have all children of a non-unary child built up from
- scalars then just throw that away, causing it built up
+ uniform scalars then just throw that away, causing it built up
  from scalars.  */
   if (nops > 1
   && is_a  (vinfo)
@@ -1451,11 +1471,13 @@ fail:
   slp_tree child;
   unsigned j;
   FOR_EACH_VEC_ELT (children, j, child)
-   if (SLP_TREE_DEF_TYPE (child) != vect_external_def)
+   if (SLP_TREE_DEF_TYPE (child) == vect_internal_def
+   || !vect_slp_tree_uniform_p (child))
  break;
   if (!child)
{
  /* Roll back.  */
+ matches[0] = false;
  FOR_EACH_VEC_ELT (children, j, child)
vect_free_slp_tree (child, false);
 
-- 
2.26.2

Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 03:36:25PM +0200, Tobias Burnus wrote:
> > What about 'gcc/config/i386/intelmic-mkoffload.c'?  I see that one
> > unconditionally passes '-fPIC' to some things -- is that doing the right
> > thing for your case, too?
> 
> I have not setup intelmic to test, but it uses -fPIC -share throughout
> and in particular for the sister function to the one to those patched.
> Hence, I expect that it will simply work. – But I wouldn't mind if you
> could test that it indeed does work.

The intelmic model was that both the main program and shared libraries
(which need offloading) contain in data section a shared library, and a
small main program always the same (if I remember well) dlopens those shared
libraries at runtime.
While not very common today, it is possible multiple shared libraries or
executable and one or more shared libraries have offloading code in them
for the same offloading target.
And one can deal with that only at runtime (== dynamic linking).

Jakub

Re: [PATCH] x86: Share _isa_names_table and use cpuinfo.h

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 3:37 PM Uros Bizjak  wrote:
>
> On Wed, Jun 24, 2020 at 3:06 PM H.J. Lu  wrote:
> >
> > On Wed, Jun 24, 2020 at 02:43:43PM +0200, Uros Bizjak wrote:
> > > >
> > > > Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> > > > is unchanged.
> > >
> > > Thanks. We should change driver-i386.c very carefully and in an
> > > independent way from this patch. It is a complex and interwoven web of
> > > name, model and features check. I propose that we first convert
> > > various has_xxx checks to a new interface, in as trivial way as
> > > possible.
> > >
> >
> > Here is the patch to share isa_names_table and use cpuinfo.h to check
> > ISAs as well as detect newer Intel/AMD processors.
> >
> > OK for master?
>
> No. Don't use get_intel_cpu and get_amd_cpu in driver-i386.c.

+  cpu = get_amd_cpu (_model, _model2, cpu_features2);
+  if (cpu == NULL)
+   {
+ if (name == signature_NSC_ebx)
+   processor = PROCESSOR_GEODE;
+ else if (has_feature (FEATURE_SSE2)

Here is where your patch fails. AMD part has early bypass for
signature_NSC_ebx, so it is detected as NSC processor regardless of
what is detected by generic code.

Uros.

Re: [PATCH] x86: Share _isa_names_table and use cpuinfo.h

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 3:06 PM H.J. Lu  wrote:
>
> On Wed, Jun 24, 2020 at 02:43:43PM +0200, Uros Bizjak wrote:
> > >
> > > Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> > > is unchanged.
> >
> > Thanks. We should change driver-i386.c very carefully and in an
> > independent way from this patch. It is a complex and interwoven web of
> > name, model and features check. I propose that we first convert
> > various has_xxx checks to a new interface, in as trivial way as
> > possible.
> >
>
> Here is the patch to share isa_names_table and use cpuinfo.h to check
> ISAs as well as detect newer Intel/AMD processors.
>
> OK for master?

No. Don't use get_intel_cpu and get_amd_cpu in driver-i386.c.

Uros.

> Thanks.
>
> H.J.
> ---
> Both driver-i386.c and libgcc use CPUID to detect the processor name
> as well as available ISAs.  To detect the same processor or ISAs, the
> same detection logic is duplicated in 2 places.  Sometimes only one place
> was up to date or got it right.  Sometimes both places got it wrong.
>
> 1. Add common/config/i386/i386-isas.h to define _isa_names_table.
> 2. Use isa_names_table to auto-generate ISA command-line options.
> 3. Use isa_names_table to auto-generate __builtin_cpu_supports tests.
> 4. Use common/config/i386/cpuinfo.h to check available ISAs and detect
> newer Intel/AMD processors in driver-i386.c and builtin_target.c.
> 5. Detection of older processors in driver-i386.c is unchanged.
>
> gcc/
>
> PR target/95843
> * common/config/i386/i386-isas.h: New file.  Extracted from
> gcc/config/i386/i386-builtins.c.
> (_isa_names_table): Add option.
> (ISA_NAMES_TABLE_START): New.
> (ISA_NAMES_TABLE_END): Likewise.
> (ISA_NAMES_TABLE_ENTRY): Likewise.
> (isa_names_table): Defined with ISA_NAMES_TABLE_START,
> ISA_NAMES_TABLE_END and ISA_NAMES_TABLE_ENTRY.  Add more ISAs
> from enum processor_features.
> * config/i386/driver-i386.c: Include
> "common/config/i386/cpuinfo.h" and
> "common/config/i386/i386-isas.h".
> (has_feature): New macro.
> (host_detect_local_cpu): Call cpu_indicator_init to get CPU
> features.  Use has_feature to detect processor features.  Call
> get_amd_cpu to get the newer AMD CPU name.  Call get_intel_cpu
> to get the newer Intel CPU name.  Use isa_names_table to generate
> command-line options.
> * config/i386/i386-builtins.c: Include
> "common/config/i386/i386-isas.h".
> (_arch_names_table): Removed.
> (isa_names_table): Likewise.
>
> gcc/testsuite/
>
> PR target/95843
> * gcc.target/i386/builtin_target.c: Include ,
> ../../../common/config/i386/i386-cpuinfo.h and
> ../../../common/config/i386/cpuinfo.h.
> (check_amd_cpu_model): Removed.
> (check_intel_cpu_model): Likewise,
> (CHECK___builtin_cpu_is): New.
> (gcc_assert): New.  Defined as assert.
> (gcc_unreachable): New.  Defined as abort.
> (inline): New.  Defined as empty.
> (ISA_NAMES_TABLE_START): Likewise.
> (ISA_NAMES_TABLE_END): Likewise.
> (ISA_NAMES_TABLE_ENTRY): New.
> (check_features): Include
> "../../../common/config/i386/i386-isas.h".
> (check_detailed): Call cpu_indicator_init.  Always call
> check_features.  Call get_amd_cpu instead of check_amd_cpu_model.
> Call get_intel_cpu instead of check_intel_cpu_model.
> ---
>  gcc/common/config/i386/i386-isas.h| 163 +
>  gcc/config/i386/driver-i386.c | 680 +++---
>  gcc/config/i386/i386-builtins.c   |  52 +-
>  .../gcc.target/i386/builtin_target.c  | 355 +
>  4 files changed, 303 insertions(+), 947 deletions(-)
>  create mode 100644 gcc/common/config/i386/i386-isas.h
>
> diff --git a/gcc/common/config/i386/i386-isas.h 
> b/gcc/common/config/i386/i386-isas.h
> new file mode 100644
> index 000..08c9dbecc76
> --- /dev/null
> +++ b/gcc/common/config/i386/i386-isas.h
> @@ -0,0 +1,163 @@
> +/* i386 ISA table.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* These are the target attribute strings for which a dispatcher is
> +   available, from

Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-24 Thread Tobias Burnus


On 6/23/20 9:36 PM, Thomas Schwinge wrote:


(*) Thomas mentioned that this is supposed to work also in more
complex cases than the one I outlined, although, that is probably
currently the most common one.

Static linking is another such case that we've seen in the wild


What I meant was: single library (static or dynamic) contains all
offloading code – and rest of the code is free of offloading code.
I think those two are the second most common – while a code where
all offloading bits are in the program and not the library is by
far the #1. – Other variants are much less likely, but who knows
what users are doing...


I don't think I can approve, but seems fine if this works (as you've
confirmed) -- it's one incremental step forward!

Or, should this instead be handled in the LTO wrapper (?) options merging
etc. machinery?  I'd have to dig in further.  (Jakub?)


The produced code cannot be much optimized (some tables), hence,
some fancy flags are not needed (contrary to LTO). However, there
might be other ABI relevant options besides -fPIC/-fpic (some
-m... flag? For GCN, -march= is crucial, some ARM platform might
have the same issue?)


What about 'gcc/config/i386/intelmic-mkoffload.c'?  I see that one
unconditionally passes '-fPIC' to some things -- is that doing the right
thing for your case, too?


I have not setup intelmic to test, but it uses -fPIC -share throughout
and in particular for the sister function to the one to those patched.
Hence, I expect that it will simply work. – But I wouldn't mind if you
could test that it indeed does work.

Cheers,

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

Re: [PATCH v2] coroutines: Copy attributes to the outlined functions [PR95518, PR95813]

2020-06-24 Thread Nathan Sidwell


On 6/24/20 7:00 AM, Iain Sandoe wrote:

Hi Nathan,

Nathan Sidwell  wrote:


On 6/11/20 3:53 PM, Iain Sandoe wrote:

Hi
We had omitted the copying of function attributes (including the
'used' status).  Mark the outlined functions as artificial, since
they are; some diagnostic processing tests this.


Do we do the right thing for say attribute((section("bob"))?


I’ve made sure of that in the attached patch (also user-defined alignment).


what if the user tries attribute((alias("bob")),
presumable we don't want both ramp and actor to alias bob?  I think this might 
be tricky.


As we discussed off-list, there might be some attributes that are inappropiate 
for coroutines.
We can diagnose those (when we understand what they are) when building the ramp 
/ splitting the actor out (as a separate patch, when it becomes required; i.e. 
not part of this patch).

The attached patch copies all attributes from the original function (and those 
that are already on the decl) onto the outlined ones - this fixes 95518 and 
95813 (which is almost a dup).

OK now for master / 10.2?


ok, thanks for checking!


thanks,
Iain

  coroutines: Copy attributes to the outlined functions
  [PR95518,PR95813]

We had omitted the copying of function attributes, we now copy
the used, alignment, section values from the original decal and
the complete set of function attributes.  It is likely that some
function attributes don't really make sense for coroutines, but
that can be disgnosed separately.
Also mark the outlined functions as artificial, since they are; and
some diagnostic processing tests this.

gcc/cp/ChangeLog:

PR c++/95518
PR c++/95813
* coroutines.cc (act_des_fn): Copy function
attributes onto the outlined coroutine helpers.

gcc/testsuite/ChangeLog:

PR c++/95518
PR c++/95813
* g++.dg/coroutines/pr95518.C: New test.
 * g++.dg/coroutines/pr95813.C: New test.
---
  gcc/cp/coroutines.cc  | 12 ++
  gcc/testsuite/g++.dg/coroutines/pr95518.C | 28 ++
  gcc/testsuite/g++.dg/coroutines/pr95813.C | 46 +++
  3 files changed, 86 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95518.C
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95813.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 9cdb0c591d5..64b97535c8d 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3530,12 +3530,24 @@ act_des_fn (tree orig, tree fn_type, tree 
coro_frame_ptr, const char* name)
tree fn_name = get_fn_local_identifier (orig, name);
tree fn = build_lang_decl (FUNCTION_DECL, fn_name, fn_type);
DECL_CONTEXT (fn) = DECL_CONTEXT (orig);
+  DECL_ARTIFICIAL (fn) = true;
DECL_INITIAL (fn) = error_mark_node;
tree id = get_identifier ("frame_ptr");
tree fp = build_lang_decl (PARM_DECL, id, coro_frame_ptr);
DECL_CONTEXT (fp) = fn;
DECL_ARG_TYPE (fp) = type_passed_as (coro_frame_ptr);
DECL_ARGUMENTS (fn) = fp;
+  /* Copy selected attributes from the original function.  */
+  TREE_USED (fn) = TREE_USED (orig);
+  if (DECL_SECTION_NAME (orig))
+set_decl_section_name (fn, DECL_SECTION_NAME (orig));
+  /* Copy any alignment that the FE added.  */
+  if (DECL_ALIGN (orig))
+SET_DECL_ALIGN (fn, DECL_ALIGN (orig));
+  /* Copy any alignment the user added.  */
+  DECL_USER_ALIGN (fn) = DECL_USER_ALIGN (orig);
+  /* Apply attributes from the original fn.  */
+  DECL_ATTRIBUTES (fn) = copy_list (DECL_ATTRIBUTES (orig));
return fn;
  }
  
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95518.C b/gcc/testsuite/g++.dg/coroutines/pr95518.C

new file mode 100644
index 000..b1717677810
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95518.C
@@ -0,0 +1,28 @@
+// { dg-additional-options "-O -Wunused-function" }
+
+#if __has_include ()
+#include 
+using namespace std;
+#elif defined (__clang__) && __has_include ()
+#include 
+namespace std { using namespace experimental; }
+#endif
+
+struct dummy
+{
+struct promise_type
+{
+dummy get_return_object() const noexcept { return {}; }
+std::suspend_never initial_suspend() const noexcept { return {}; }
+std::suspend_never final_suspend() const noexcept { return {}; }
+void return_void() const noexcept {}
+void unhandled_exception() const noexcept {}
+};
+};
+
+// This checks that the attribute is passed on to the outlined coroutine
+// functions (so that there should be no diagnostic).
+[[maybe_unused]] static dummy foo()
+{
+co_return;
+}
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95813.C 
b/gcc/testsuite/g++.dg/coroutines/pr95813.C
new file mode 100644
index 000..445cdf1f7ef
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95813.C
@@ -0,0 +1,46 @@
+//  { dg-additional-options  "-Wall -O" }
+
+// This should complete without any diagnostic.
+
+#include 
+#include 
+
+template 
+class lazy {
+T _v = 0;
+public:
+lazy() {}
+

[PATCH] x86: Add Cooper Lake detection with AVX512BF16

2020-06-24 Thread H.J. Lu via Gcc-patches

All Sky Lake family processors have the same CPUID model number, 0x55.
The differences are Cascade Lake has AVX512VNNI and Cooper Lake has
AVX512VNNI + AVX512BF16.  Check AVX512BF16 for Cooper Lake.

PR target/95774
* common/config/i386/cpuinfo.h (get_intel_cpu): Add Cooper Lake
detection with AVX512BF16.
---
 gcc/common/config/i386/cpuinfo.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 0a0e88011bc..3eda53240f6 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -397,7 +397,15 @@ get_intel_cpu (struct __processor_model *cpu_model,
   CHECK___builtin_cpu_is ("corei7");
   cpu_model->__cpu_type = INTEL_COREI7;
   if (has_cpu_feature (cpu_model, cpu_features2,
-  FEATURE_AVX512VNNI))
+  FEATURE_AVX512BF16))
+   {
+ /* Cooper Lake.  */
+ cpu = "cooperlake";
+ CHECK___builtin_cpu_is ("cooperlake");
+ cpu_model->__cpu_subtype = INTEL_COREI7_COOPERLAKE;
+   }
+  else if (has_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_AVX512VNNI))
{
  /* Cascade Lake.  */
  cpu = "cascadelake";
-- 
2.26.2

[PATCH] x86: Share _isa_names_table and use cpuinfo.h

2020-06-24 Thread H.J. Lu via Gcc-patches

On Wed, Jun 24, 2020 at 02:43:43PM +0200, Uros Bizjak wrote:
> >
> > Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> > is unchanged.
> 
> Thanks. We should change driver-i386.c very carefully and in an
> independent way from this patch. It is a complex and interwoven web of
> name, model and features check. I propose that we first convert
> various has_xxx checks to a new interface, in as trivial way as
> possible.
> 

Here is the patch to share isa_names_table and use cpuinfo.h to check
ISAs as well as detect newer Intel/AMD processors.

OK for master?

Thanks.

H.J.
---
Both driver-i386.c and libgcc use CPUID to detect the processor name
as well as available ISAs.  To detect the same processor or ISAs, the
same detection logic is duplicated in 2 places.  Sometimes only one place
was up to date or got it right.  Sometimes both places got it wrong.

1. Add common/config/i386/i386-isas.h to define _isa_names_table.
2. Use isa_names_table to auto-generate ISA command-line options.
3. Use isa_names_table to auto-generate __builtin_cpu_supports tests.
4. Use common/config/i386/cpuinfo.h to check available ISAs and detect
newer Intel/AMD processors in driver-i386.c and builtin_target.c.
5. Detection of older processors in driver-i386.c is unchanged.

gcc/

PR target/95843
* common/config/i386/i386-isas.h: New file.  Extracted from
gcc/config/i386/i386-builtins.c.
(_isa_names_table): Add option.
(ISA_NAMES_TABLE_START): New.
(ISA_NAMES_TABLE_END): Likewise.
(ISA_NAMES_TABLE_ENTRY): Likewise.
(isa_names_table): Defined with ISA_NAMES_TABLE_START,
ISA_NAMES_TABLE_END and ISA_NAMES_TABLE_ENTRY.  Add more ISAs
from enum processor_features.
* config/i386/driver-i386.c: Include
"common/config/i386/cpuinfo.h" and
"common/config/i386/i386-isas.h".
(has_feature): New macro.
(host_detect_local_cpu): Call cpu_indicator_init to get CPU
features.  Use has_feature to detect processor features.  Call
get_amd_cpu to get the newer AMD CPU name.  Call get_intel_cpu
to get the newer Intel CPU name.  Use isa_names_table to generate
command-line options.
* config/i386/i386-builtins.c: Include
"common/config/i386/i386-isas.h".
(_arch_names_table): Removed.
(isa_names_table): Likewise.

gcc/testsuite/

PR target/95843
* gcc.target/i386/builtin_target.c: Include ,
../../../common/config/i386/i386-cpuinfo.h and
../../../common/config/i386/cpuinfo.h.
(check_amd_cpu_model): Removed.
(check_intel_cpu_model): Likewise,
(CHECK___builtin_cpu_is): New.
(gcc_assert): New.  Defined as assert.
(gcc_unreachable): New.  Defined as abort.
(inline): New.  Defined as empty.
(ISA_NAMES_TABLE_START): Likewise.
(ISA_NAMES_TABLE_END): Likewise.
(ISA_NAMES_TABLE_ENTRY): New.
(check_features): Include
"../../../common/config/i386/i386-isas.h".
(check_detailed): Call cpu_indicator_init.  Always call
check_features.  Call get_amd_cpu instead of check_amd_cpu_model.
Call get_intel_cpu instead of check_intel_cpu_model.
---
 gcc/common/config/i386/i386-isas.h| 163 +
 gcc/config/i386/driver-i386.c | 680 +++---
 gcc/config/i386/i386-builtins.c   |  52 +-
 .../gcc.target/i386/builtin_target.c  | 355 +
 4 files changed, 303 insertions(+), 947 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-isas.h

diff --git a/gcc/common/config/i386/i386-isas.h 
b/gcc/common/config/i386/i386-isas.h
new file mode 100644
index 000..08c9dbecc76
--- /dev/null
+++ b/gcc/common/config/i386/i386-isas.h
@@ -0,0 +1,163 @@
+/* i386 ISA table.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+/* These are the target attribute strings for which a dispatcher is
+   available, from fold_builtin_cpu.  */
+struct _isa_names_table
+{
+  const char *const name;
+  const enum processor_features feature;
+  const enum feature_priority priority;
+  const char *const option;
+};
+
+/* NB: isa_names_table is shared by i386-builtins.c, driver-i386.c and
+   gcc.target/i386/builtin_target.c.  isa_names_table is a static const
+   array

Re: V7 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-06-24 Thread Uros Bizjak via Gcc-patches

On Wed, Jun 24, 2020 at 2:27 PM H.J. Lu  wrote:
>
> On Mon, Jun 22, 2020 at 04:25:46PM -0700, H.J. Lu wrote:
> > On Sun, Jun 21, 2020 at 10:22 AM H.J. Lu  wrote:
> > >
> > > On Sun, Jun 21, 2020 at 10:18 AM Uros Bizjak  wrote:
> > > >
> > > > On Sat, Jun 20, 2020 at 3:40 PM H.J. Lu  wrote:
> > > >
> > > > > > > > >> 2) can we automatically deduce option name:
> > > > > > > > >>
> > > > > > > > >>> +  ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, 
> > > > > > > > >>> "-mrdpid")
> > > > > > > > >>> +  ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, 
> > > > > > > > >>> "-mrdrnd")
> > > > > > > > >>
> > > > > > > > >> I mean "-m" + "rdrnd" == "-mrdrnd" ?
> > > > > > > > >
> > > > > > > > > The new option field serves 2 purposes:
> > > > > > > > >
> > > > > > > > > 1. Not all features have a corresponding command-line option
> > > > > > > > >
> > > > > > > > > ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL)
> > > > > > > > >
> > > > > > > > >   for (i = 0; i < ARRAY_SIZE (isa_names_table); i++)
> > > > > > > > >  if (isa_names_table[i].option)
> > > > > > > > >
> > > > > > > > > 2. Some feature has a different name in the command-line 
> > > > > > > > > option.
> > > > > > > > >
> > > > > > > > >ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, 
> > > > > > > > > "-mfxsr")
> > > > > > > >
> > > > > > > > I noticed that, one can theoretically use "" for an option that 
> > > > > > > > does not
> > > > > > > > have a flag. And NULL for these which have option equal to "-m" 
> > > > > > > > + name.
> > > > > > > > Anyway, that's a nit.
> > > > > > > >
> > > > > > > > I support the patch!
> > > > > > > > Martin
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Here is the updated patch.   OK for master?
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > PING:
> > > > > > >
> > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html
> > > > > > >
> > > > > >
> > > > > > PING.
> > > > >
> > > > > Hi,
> > > > >
> > > > > We have patches like
> > > > >
> > > > > https://gcc.gnu.org/pipermail/gcc-bugs/2020-June/705851.html
> > > > >
> > > > > queued up because of this prerequisite patch.   Are there any 
> > > > > objections
> > > > > to this patch?
> > > >
> > > > Yes, there are my objections.
> > > >
> > > > As explained before, I support unifying libgcc and core gcc handling,
> > > > but _NOT_ unifying with driver-i386.c. Unifying libgcc and core gcc
> > > > handling would have benefit to avoid desynchronisation between the two
> > > > (which happened multiple times in the past, resulting in various API
> > > > issues). OTOH, unifying with driver-i386.c would result in quite messy
> > > > approach, because driver-i386 handles more targets beside relatively
> > > > recent 64bit Intel and AMD targets, not to mention heuristics to
> > > > determine the most appropriate target when standard detection fails
> > > > (e.g. emulators).
> > >
> > > Only the duplicated parts of driver-i386.c should be unified.  The only 
> > > impact
> > > should be removing code duplications.  If it isn't the case, it is a bug 
> > > in my
> > > implementation.
> >
> > Here is the updated patch to remove FEATURE_OSPKE and use bit_OSPKE
> > for FEATURE_PKU.  It uses the same get_amd_cpu and get_intel_cpu in
> > libgcc and driver-i386.c:
> >a. Detect the processor name for newer Intel and AMD processors.
> >   Since the older processor names are only supported by driver-i386.c,
> >   not by libgcc, detection for the older processors remains in
> >   driver-i386.c.
> >b. Detect available ISAs for all Intel and AMD processors.
> >
>
> Here is the updated patch for x86 backend and libgcc.  driver-i386.c
> is unchanged.

Thanks. We should change driver-i386.c very carefully and in an
independent way from this patch. It is a complex and interwoven web of
name, model and features check. I propose that we first convert
various has_xxx checks to a new interface, in as trivial way as
possible.

> OK for master?

Yes, this part looks OK to me.

Thanks,
Uros.

>
> Thannks.
>
> H.J.
> ---
> Both x86 backend and libgcc define enum processor_features.  libgcc sets
> enum processor_feature and x86 backend checks enum processor_feature.
> They are very easy out of sync and it has happened multiple times in the
> past.
>
> 1. Move cpuinfo.h from libgcc to common/config/i386 so that we can share
> the same enum processor_features in x86 backend and libgcc.
> 2. Change __cpu_features2 to an array to support more processor features.
> 3. Add more processor features to enum processor_features.
>
> gcc/
>
> PR target/95259
> * common/config/i386/cpuinfo.h: New file.
> (__processor_model): Moved from libgcc/config/i386/cpuinfo.h.
> (__processor_model2): New.
> (CHECK___builtin_cpu_is): New.  Defined as empty if not defined.
> (has_cpu_feature): New function.
>

[committed] libstdc++: Fix std::from_chars to ignore leading zeros in base 2

2020-06-24 Thread Jonathan Wakely via Gcc-patches

The parser for binary numbers returned an error if the entire string
contains more digits than the result type. Leading zeros should be
ignored.

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_binary): Ignore leading zeros.
* testsuite/20_util/from_chars/1.cc: Check "0x1" for all bases,
not just 10 and 16.
* testsuite/20_util/from_chars/3.cc: New test.

Tested powerpc64le-linux, committed to master.


commit eb0ff770e29715deb8b2e6f5da736e0c1e8f8d07
Author: Jonathan Wakely 
Date:   Wed Jun 24 11:45:01 2020 +0100

libstdc++: Fix std::from_chars to ignore leading zeros in base 2

The parser for binary numbers returned an error if the entire string
contains more digits than the result type. Leading zeros should be
ignored.

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_binary): Ignore leading zeros.
* testsuite/20_util/from_chars/1.cc: Check "0x1" for all bases,
not just 10 and 16.
* testsuite/20_util/from_chars/3.cc: New test.

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 8fbf64058ee..b725e5d2afd 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -417,7 +417,11 @@ namespace __detail
   static_assert(is_unsigned<_Tp>::value, "implementation bug");
 
   const ptrdiff_t __len = __last - __first;
-  int __i = 0;
+  ptrdiff_t __i = 0;
+  while (__i < __len && __first[__i] == '0')
+   ++__i;
+  const ptrdiff_t __leading_zeroes = __i;
+
   while (__i < __len)
{
  const unsigned char __c = (unsigned)__first[__i] - '0';
@@ -428,7 +432,7 @@ namespace __detail
  __i++;
}
   __first += __i;
-  return __i <= __detail::__int_limits<_Tp>::digits;
+  return (__i - __leading_zeroes) <= __detail::__int_limits<_Tp>::digits;
 }
 
   /// std::from_chars implementation for integers in bases 3 to 10.
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1.cc
index 916025bc7c6..ad5d50e67b3 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1.cc
@@ -31,7 +31,8 @@ check_from_chars(I expected, std::string s, int base = 0, 
char term = '\0')
   std::from_chars_result r = base == 0
 ? std::from_chars(begin, end, val)
 : std::from_chars(begin, end, val, base);
-  return r.ec == std::errc{} && (r.ptr == end || *r.ptr == term) && val == 
expected;
+  return r.ec == std::errc{} && (r.ptr == end || *r.ptr == term)
+&& val == expected;
 }
 
 #include 
@@ -52,10 +53,18 @@ void
 test02()
 {
   // "0x" parsed as "0" not as hex prefix:
-  VERIFY( check_from_chars(0, "0x1", 10, 'x') );
-  VERIFY( check_from_chars(0, "0X1", 10, 'X') );
-  VERIFY( check_from_chars(0, "0x1", 16, 'x') );
-  VERIFY( check_from_chars(0, "0X1", 16, 'X') );
+  for (int base = 2; base < 34; ++base)
+  {
+VERIFY( check_from_chars(0, "0x1", base, 'x') );
+VERIFY( check_from_chars(0, "0X1", base, 'X') );
+  }
+
+  VERIFY( check_from_chars(1123, "0x1", 34) );
+  VERIFY( check_from_chars(1123, "0X1", 34) );
+  VERIFY( check_from_chars(1156, "0x1", 35) );
+  VERIFY( check_from_chars(1156, "0X1", 35) );
+  VERIFY( check_from_chars(1189, "0x1", 36) );
+  VERIFY( check_from_chars(1189, "0X1", 36) );
 
   VERIFY( check_from_chars(1155, "xx", 34) );
   VERIFY( check_from_chars(1155, "XX", 34) );
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/3.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/3.cc
new file mode 100644
index 000..9d4a77f5c31
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/from_chars/3.cc
@@ -0,0 +1,79 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++14 } }
+
+#include 
+#include 
+#include 
+
+#ifdef DEBUG
+#include 
+#endif
+
+long long
+read(const char* first, const char* last, int base)
+{
+  long long val = 0;
+  long long place = 1;
+  while (last > first)
+  {
+val += (*--last - '0') * place;
+place *= base;
+  }
+  return val;
+}
+
+void
+test01()
+{
+  std::from_chars_result res;
+  long long val;
+  for (auto s : { "10001", "10010", "10011", "10101", "10110",

[committed] libstdc++: Fix warnings with -Wsystem-headers

2020-06-24 Thread Jonathan Wakely via Gcc-patches

libstdc++-v3/ChangeLog:

* include/bits/stl_algobase.h (__find_if): Add FALLTHRU markers.
* include/std/charconv (__detail::__to_chars): Avoid
-Wsign-compare warning.

Tested powerpc64le-linux, committed to master.


commit 25920dd18ad12ea501309b1487366e22f35db631
Author: Jonathan Wakely 
Date:   Wed Jun 24 12:34:17 2020 +0100

libstdc++: Fix warnings with -Wsystem-headers

libstdc++-v3/ChangeLog:

* include/bits/stl_algobase.h (__find_if): Add FALLTHRU markers.
* include/std/charconv (__detail::__to_chars): Avoid
-Wsign-compare warning.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 4fc8850d707..ff5b4505a08 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -2092,14 +2092,17 @@ _GLIBCXX_END_NAMESPACE_ALGO
  if (__pred(__first))
return __first;
  ++__first;
+ // FALLTHRU
case 2:
  if (__pred(__first))
return __first;
  ++__first;
+ // FALLTHRU
case 1:
  if (__pred(__first))
return __first;
  ++__first;
+ // FALLTHRU
case 0:
default:
  return __last;
diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 77a72d40dd0..8fbf64058ee 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -142,7 +142,7 @@ namespace __detail
'u', 'v', 'w', 'x', 'y', 'z'
   };
 
-  while (__val >= __base)
+  while (__val >= (unsigned)__base)
{
  auto const __quo = __val / __base;
  auto const __rem = __val % __base;

[committed] libstdc++: Fix std::to_chars buffer overflow (PR 95851)

2020-06-24 Thread Jonathan Wakely via Gcc-patches

The __detail::__to_chars_2 function assumes it won't be called with zero
values. However, when the output buffer is empty the caller doesn't
handle zero values correctly, and calls __to_chars_2 with a zero value,
resulting in an overflow of the empty buffer.

The __detail::__to_chars_i function should just return immediately for
an empty buffer, and otherwise ensure zero values are handled properly.

libstdc++-v3/ChangeLog:

PR libstdc++/95851
* include/std/charconv (__to_chars_i): Check for zero-sized
buffer unconditionally.
* testsuite/20_util/to_chars/95851.cc: New test.

Tested powerpc64le-linux, committed to master.

commit be50843754b4c4d47f0d628a84b3dbf2a4145a43
Author: Jonathan Wakely 
Date:   Tue Jun 23 22:47:58 2020 +0100

libstdc++: Fix std::to_chars buffer overflow (PR 95851)

The __detail::__to_chars_2 function assumes it won't be called with zero
values. However, when the output buffer is empty the caller doesn't
handle zero values correctly, and calls __to_chars_2 with a zero value,
resulting in an overflow of the empty buffer.

The __detail::__to_chars_i function should just return immediately for
an empty buffer, and otherwise ensure zero values are handled properly.

libstdc++-v3/ChangeLog:

PR libstdc++/95851
* include/std/charconv (__to_chars_i): Check for zero-sized
buffer unconditionally.
* testsuite/20_util/to_chars/95851.cc: New test.

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 3caa0f8ac10..77a72d40dd0 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -327,7 +327,10 @@ namespace __detail
   using _Up = __detail::__unsigned_least_t<_Tp>;
   _Up __unsigned_val = __value;
 
-  if (__value == 0 && __first != __last)
+  if (__first == __last) [[__unlikely__]]
+   return { __last, errc::value_too_large };
+
+  if (__value == 0)
{
  *__first = '0';
  return { __first + 1, errc{} };
diff --git a/libstdc++-v3/testsuite/20_util/to_chars/95851.cc 
b/libstdc++-v3/testsuite/20_util/to_chars/95851.cc
new file mode 100644
index 000..5f0daf3b30b
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/to_chars/95851.cc
@@ -0,0 +1,36 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++14 } }
+
+#include 
+#include 
+
+void
+test01()
+{
+  char* p = nullptr;
+  auto res = std::to_chars(p, p, 0, 2); // PR libstdc++/95851
+  VERIFY( res.ptr == p );
+  VERIFY( res.ec == std::errc::value_too_large );
+}
+
+int
+main()
+{
+  test01();
+}

V7 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-06-24 Thread H.J. Lu via Gcc-patches

On Mon, Jun 22, 2020 at 04:25:46PM -0700, H.J. Lu wrote:
> On Sun, Jun 21, 2020 at 10:22 AM H.J. Lu  wrote:
> >
> > On Sun, Jun 21, 2020 at 10:18 AM Uros Bizjak  wrote:
> > >
> > > On Sat, Jun 20, 2020 at 3:40 PM H.J. Lu  wrote:
> > >
> > > > > > > >> 2) can we automatically deduce option name:
> > > > > > > >>
> > > > > > > >>> +  ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, 
> > > > > > > >>> "-mrdpid")
> > > > > > > >>> +  ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, 
> > > > > > > >>> "-mrdrnd")
> > > > > > > >>
> > > > > > > >> I mean "-m" + "rdrnd" == "-mrdrnd" ?
> > > > > > > >
> > > > > > > > The new option field serves 2 purposes:
> > > > > > > >
> > > > > > > > 1. Not all features have a corresponding command-line option
> > > > > > > >
> > > > > > > > ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL)
> > > > > > > >
> > > > > > > >   for (i = 0; i < ARRAY_SIZE (isa_names_table); i++)
> > > > > > > >  if (isa_names_table[i].option)
> > > > > > > >
> > > > > > > > 2. Some feature has a different name in the command-line option.
> > > > > > > >
> > > > > > > >ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, 
> > > > > > > > "-mfxsr")
> > > > > > >
> > > > > > > I noticed that, one can theoretically use "" for an option that 
> > > > > > > does not
> > > > > > > have a flag. And NULL for these which have option equal to "-m" + 
> > > > > > > name.
> > > > > > > Anyway, that's a nit.
> > > > > > >
> > > > > > > I support the patch!
> > > > > > > Martin
> > > > > > >
> > > > > > > >
> > > > > > > > Here is the updated patch.   OK for master?
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > > PING:
> > > > > >
> > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html
> > > > > >
> > > > >
> > > > > PING.
> > > >
> > > > Hi,
> > > >
> > > > We have patches like
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-bugs/2020-June/705851.html
> > > >
> > > > queued up because of this prerequisite patch.   Are there any objections
> > > > to this patch?
> > >
> > > Yes, there are my objections.
> > >
> > > As explained before, I support unifying libgcc and core gcc handling,
> > > but _NOT_ unifying with driver-i386.c. Unifying libgcc and core gcc
> > > handling would have benefit to avoid desynchronisation between the two
> > > (which happened multiple times in the past, resulting in various API
> > > issues). OTOH, unifying with driver-i386.c would result in quite messy
> > > approach, because driver-i386 handles more targets beside relatively
> > > recent 64bit Intel and AMD targets, not to mention heuristics to
> > > determine the most appropriate target when standard detection fails
> > > (e.g. emulators).
> >
> > Only the duplicated parts of driver-i386.c should be unified.  The only 
> > impact
> > should be removing code duplications.  If it isn't the case, it is a bug in 
> > my
> > implementation.
> 
> Here is the updated patch to remove FEATURE_OSPKE and use bit_OSPKE
> for FEATURE_PKU.  It uses the same get_amd_cpu and get_intel_cpu in
> libgcc and driver-i386.c:
>a. Detect the processor name for newer Intel and AMD processors.
>   Since the older processor names are only supported by driver-i386.c,
>   not by libgcc, detection for the older processors remains in
>   driver-i386.c.
>b. Detect available ISAs for all Intel and AMD processors.
> 

Here is the updated patch for x86 backend and libgcc.  driver-i386.c
is unchanged.

OK for master?

Thannks.

H.J.
---
Both x86 backend and libgcc define enum processor_features.  libgcc sets
enum processor_feature and x86 backend checks enum processor_feature.
They are very easy out of sync and it has happened multiple times in the
past.

1. Move cpuinfo.h from libgcc to common/config/i386 so that we can share
the same enum processor_features in x86 backend and libgcc.
2. Change __cpu_features2 to an array to support more processor features.
3. Add more processor features to enum processor_features.

gcc/

PR target/95259
* common/config/i386/cpuinfo.h: New file.
(__processor_model): Moved from libgcc/config/i386/cpuinfo.h.
(__processor_model2): New.
(CHECK___builtin_cpu_is): New.  Defined as empty if not defined.
(has_cpu_feature): New function.
(set_cpu_feature): Likewise.
(get_amd_cpu): Moved from libgcc/config/i386/cpuinfo.c.  Use
CHECK___builtin_cpu_is.  Return AMD CPU name.
(get_intel_cpu): Moved from libgcc/config/i386/cpuinfo.c.  Use
Use CHECK___builtin_cpu_is.  Return Intel CPU name.
(get_available_features): Moved from libgcc/config/i386/cpuinfo.c.
Also check FEATURE_3DNOW, FEATURE_3DNOWP, FEATURE_ADX,
FEATURE_ABM, FEATURE_CLDEMOTE, FEATURE_CLFLUSHOPT, FEATURE_CLWB,
FEATURE_CLZERO, FEATURE_CMPXCHG16B, FEATURE_CMPXCHG8B,
FEATURE_ENQCMD, FEATURE_F16C, FEATURE_FSGSBASE, FEATURE_FXSAVE,

Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin

2020-06-24 Thread Rainer Orth

Hi Alexandre,

> On Jun  8, 2020, Alexandre Oliva  wrote:
>
>>  * outputs.exp (skip_lto): Set when missing the linker plugin.
>
> I withdraw the above work-around patch in favor of the fix proper below.
>
> It's also supposed to fix the FAILs caused by .dSYM directories on
> platforms that create them; I'd appreciate confirmation that it does.
>
> Tested on x86_64-linux-gnu, both with and without LTO plugin.  Ok to
> install?

nice.  I tested it on i386-pc-solaris2.11 with ld (no lto plugin) and
gld (with plugin).

Ok.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH PR95854] ICE in find_bswap_or_nop_1 of pass store-merging

2020-06-24 Thread Richard Biener

On Wed, 24 Jun 2020, zhoukaipeng (A) wrote:

> Hi,
> 
> This is a fix for pr95854.
> 
> Only add a judgement to make sure operand1 and operand2 are both INTEGER_CST.
> 
> Bootstrap and tested on aarch64 Linux platform. No new regression witnessed.
> 
> Is it ok to be merged?

Please use tree_fits_uhwi_p () instead of the INTEGER_CST tree code check.

OK with that change.

Thanks,
Richard.

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-24 Thread Kwok Cheung Yeung

On 23/06/2020 5:44 pm, Thomas Schwinge wrote:

Hi!

On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung  wrote:

This patch adds support on nvptx for __sync_val_compare_and_swap operations on
1- and 2-byte values.

Is this a thorough review that these are the only functions missing, or
did you just implement what you found missing for some test case you've
been looking into?  Other architectures' similar libgcc files seem to be
defining more of such related functions.

This was to fix a particular test case that used logical reduction operators on 
subword values. Support for other operators (e.g. arithmetic add) would need 
additional support.

Other architectures also define atomic operations for subwords in libgcc (e.g. 
in config/arm/linux-atomic.c for Arm, config/riscv/atomic.c for RISC-V, in 
config/m68k/linux-atomic.c for Motorola 68000s etc.).

The implementation is a straight copy of the version for
AMD GCN.

(Should thus be generalized?  That can be done later, as far as I'm
concerned -- there already seems to be quite some code duplication in
libgcc.)

I have not verified the algorithm.

It seems a bit unfortunate to have such a thing outlined in a separate
function, given we're talking about performance-critical code here?  Even
more so for GCN, where there's no JIT compiler that can inline it later,
as it's the case for nvptx?

You have verified that GCC itself shouldn't/can't synthesize such
replacement code, inline?

The GCN/nvptx libgcc code actually seems simple enough so that GCC could
synthesize that internally -- or is there a reason not to?  (Just
curious.)

Without the additional __sync_val_compare_and_swap_* functions, the testcase 
fails with:

unresolved symbol __sync_val_compare_and_swap_2

when the accelerator compiler is run.

The algorithm is generic and fairly simple, so it is more that GCC doesn't, 
rather than shouldn't or can't. i.e. No one has implemented it as a fallback in 
lieu of native support so far.

I see 'gcc/doc/generic.texi', "OMP_ATOMIC" state:

 The gimplifier tries
 three alternative code generation strategies.  Whenever possible,
 an atomic update built-in is used.  If that fails, a
 compare-and-swap loop is attempted.  If that also fails, a
 regular critical section around the expression is used.

..., and I see 'gcc/omp-expand.c:expand_omp_atomic_pipeline' synthesize
that "compare-and-swap loop" code, which looks vaguely similar to your
libgcc implementation.

> ..., and 'gcc/optabs.c:expand_compare_and_swap_loop' etc. also look
> similar/related?

These create loops that repeatedly do the compare-and-swap operation until it 
succeeds. They cannot work to express a smaller compare-and-swap in terms of a 
larger one since they lack the necessary shifts and masking.

The exit condition is also different - since we are implementing 
compare-and-swap, the loop can exit even if the swap fails. The loop exits if 
the subword read from memory is different from the expected value (i.e. the 
compare-and-swap fails) or if the word containing the subword is equal to the 
expected value (i.e. it succeeds). The loop is there to cope with the case where 
the compare-and-swap fails due to part of the larger word changing that is not 
part of the subword that we are interested in.

..., or maybe even 'gcc/builtins.c:expand_builtin_compare_and_swap'
etc. could be doing such things?

This delegates to expand_atomic_compare_and_swap (in optabs.c), which would be 
the logical place to synthesize the sub-word loop. All it does ATM is look for 
atomic_compare_and_swap op handlers, then for sync_compare_and_swap handlers, 
then library functions for __sync_val_compare_and_swap.

Kwok

Re: [PATCH] x86: Fold arch_names_table into processor_alias_table

2020-06-24 Thread H.J. Lu via Gcc-patches

On Wed, Jun 24, 2020 at 12:31 AM Uros Bizjak  wrote:
>
> [CC list removed, it is not shown in archives]
>
> > In i386-builtins.c, arch_names_table is used to to map architecture name
> > string to internal model.  A switch statement is used to map internal
> > processor name to architecture name string and internal priority.
> >
> > model and priority are added to processor_alias_table so that a single
> > entry contains architecture name string, internal processor name,
> > internal model and internal priority.  6 entries are appended for
> > i386-builtins.c, which have special architecture name strings: amd,
> > amdfam10h, amdfam15h, amdfam17h, shanghai and istanbul, and pta_size is
> > adjusted to exclude them.  Entries which are not used by i386-builtins.c
>
> We already have an "amdfam10" entry, so I think these should later be
> extended by AMD folks to include correct processor_type, schedule
> model and flags. Having to manually exclude generic entries feels a
> bit hackish, though, but the patch is certainly an improvement.
>
> > have internal model 0.  P_PROC_DYNAMIC is added to internal priority to
> > make entries with dynamic architecture name string or priority.
>
> BTW: Can you please change P_ZERO to P_NONE?

Fixed.
> > PR target/95842
> > * common/config/i386/i386-common.c (processor_alias_table): Add
> > processor model and priority to each entry.
> > (pta_size): Updated with -6.
> > (num_arch_names): New.
> > * common/config/i386/i386-cpuinfo.h: New file.
> > * config/i386/i386-builtins.c (feature_priority): Removed.
> > (processor_model): Likewise.
> > (_arch_names_table): Likewise.
> > (arch_names_table): Likewise.
> > (get_builtin_code_for_version): Use processor_alias_table.
> > (fold_builtin_cpu): Replace arch_names_table with
> > processor_alias_table.
> > * config/i386/i386.h: Include "common/config/i386/i386-cpuinfo.h".
> > (pta): Add model and priority.
> > (num_arch_names): New.
>
> LGTM.
>
> Thanks,
> Uros.

This is the patch I am checking in.

Thanks.

-- 
H.J.
From e4d03220e495f8b316d088b49a9143a13b8180d4 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 23 Jun 2020 12:49:32 -0700
Subject: [PATCH] x86: Fold arch_names_table into processor_alias_table

In i386-builtins.c, arch_names_table is used to to map architecture name
string to internal model.  A switch statement is used to map internal
processor name to architecture name string and internal priority.

model and priority are added to processor_alias_table so that a single
entry contains architecture name string, internal processor name,
internal model and internal priority.  6 entries are appended for
i386-builtins.c, which have special architecture name strings: amd,
amdfam10h, amdfam15h, amdfam17h, shanghai and istanbul, and pta_size is
adjusted to exclude them.  Entries which are not used by i386-builtins.c
have internal model 0.  P_PROC_DYNAMIC is added to internal priority to
make entries with dynamic architecture name string or priority.

	PR target/95842
	* common/config/i386/i386-common.c (processor_alias_table): Add
	processor model and priority to each entry.
	(pta_size): Updated with -6.
	(num_arch_names): New.
	* common/config/i386/i386-cpuinfo.h: New file.
	* config/i386/i386-builtins.c (feature_priority): Removed.
	(processor_model): Likewise.
	(_arch_names_table): Likewise.
	(arch_names_table): Likewise.
	(_isa_names_table): Replace P_ZERO with P_NONE.
	(get_builtin_code_for_version): Replace P_ZERO with P_NONE.  Use
	processor_alias_table.
	(fold_builtin_cpu): Replace arch_names_table with
	processor_alias_table.
	* config/i386/i386.h: Include "common/config/i386/i386-cpuinfo.h".
	(pta): Add model and priority.
	(num_arch_names): New.
---
 gcc/common/config/i386/i386-common.c  | 239 ++---
 gcc/common/config/i386/i386-cpuinfo.h | 134 ++
 gcc/config/i386/i386-builtins.c   | 371 ++
 gcc/config/i386/i386.h|   5 +
 4 files changed, 366 insertions(+), 383 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-cpuinfo.h

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 42d14c29f16..654df68d688 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1619,164 +1619,202 @@ STATIC_ASSERT (ARRAY_SIZE (processor_names) == PROCESSOR_max);
 
 const pta processor_alias_table[] =
 {
-  {"i386", PROCESSOR_I386, CPU_NONE, 0},
-  {"i486", PROCESSOR_I486, CPU_NONE, 0},
-  {"i586", PROCESSOR_PENTIUM, CPU_PENTIUM, 0},
-  {"pentium", PROCESSOR_PENTIUM, CPU_PENTIUM, 0},
-  {"lakemont", PROCESSOR_LAKEMONT, CPU_PENTIUM, PTA_NO_80387},
-  {"pentium-mmx", PROCESSOR_PENTIUM, CPU_PENTIUM, PTA_MMX},
-  {"winchip-c6", PROCESSOR_I486, CPU_NONE, PTA_MMX},
-  {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
-  {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
-  {"samuel-2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
+  {"i386", PROCESSOR_I386, CPU_NONE, 0, 0,

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Tue, Jun 23, 2020 at 07:25:22PM -0400, y2s1982 via Gcc-patches wrote:
> --- /dev/null
> +++ b/libgomp/libgompd.c
> @@ -0,0 +1,46 @@

Oh, and one more thing, this file should start with the standard comment
with license etc., like you've added it to omp-tools.h.

> --- /dev/null
> +++ b/libgomp/libgompd.h
> @@ -0,0 +1,7 @@
> +

And this one too.
Testcases don't need it unless they are really large/complex that it is
worth to make it explicit.

Jakub

[PATCH v2] coroutines: Copy attributes to the outlined functions [PR95518, PR95813]

2020-06-24 Thread Iain Sandoe

Hi Nathan,

Nathan Sidwell  wrote:

> On 6/11/20 3:53 PM, Iain Sandoe wrote:
>> Hi
>> We had omitted the copying of function attributes (including the
>> 'used' status).  Mark the outlined functions as artificial, since
>> they are; some diagnostic processing tests this.
> 
> Do we do the right thing for say attribute((section("bob"))?  

I’ve made sure of that in the attached patch (also user-defined alignment).

> what if the user tries attribute((alias("bob")),
> presumable we don't want both ramp and actor to alias bob?  I think this 
> might be tricky.

As we discussed off-list, there might be some attributes that are inappropiate 
for coroutines.
We can diagnose those (when we understand what they are) when building the ramp 
/ splitting the actor out (as a separate patch, when it becomes required; i.e. 
not part of this patch).

The attached patch copies all attributes from the original function (and those 
that are already on the decl) onto the outlined ones - this fixes 95518 and 
95813 (which is almost a dup).

OK now for master / 10.2?
thanks,
Iain

 coroutines: Copy attributes to the outlined functions
 [PR95518,PR95813]

We had omitted the copying of function attributes, we now copy
the used, alignment, section values from the original decal and
the complete set of function attributes.  It is likely that some
function attributes don't really make sense for coroutines, but
that can be disgnosed separately.
Also mark the outlined functions as artificial, since they are; and
some diagnostic processing tests this.

gcc/cp/ChangeLog:

PR c++/95518
PR c++/95813
* coroutines.cc (act_des_fn): Copy function
attributes onto the outlined coroutine helpers.

gcc/testsuite/ChangeLog:

PR c++/95518
PR c++/95813
* g++.dg/coroutines/pr95518.C: New test.
* g++.dg/coroutines/pr95813.C: New test.
---
 gcc/cp/coroutines.cc  | 12 ++
 gcc/testsuite/g++.dg/coroutines/pr95518.C | 28 ++
 gcc/testsuite/g++.dg/coroutines/pr95813.C | 46 +++
 3 files changed, 86 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95518.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95813.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 9cdb0c591d5..64b97535c8d 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3530,12 +3530,24 @@ act_des_fn (tree orig, tree fn_type, tree 
coro_frame_ptr, const char* name)
   tree fn_name = get_fn_local_identifier (orig, name);
   tree fn = build_lang_decl (FUNCTION_DECL, fn_name, fn_type);
   DECL_CONTEXT (fn) = DECL_CONTEXT (orig);
+  DECL_ARTIFICIAL (fn) = true;
   DECL_INITIAL (fn) = error_mark_node;
   tree id = get_identifier ("frame_ptr");
   tree fp = build_lang_decl (PARM_DECL, id, coro_frame_ptr);
   DECL_CONTEXT (fp) = fn;
   DECL_ARG_TYPE (fp) = type_passed_as (coro_frame_ptr);
   DECL_ARGUMENTS (fn) = fp;
+  /* Copy selected attributes from the original function.  */
+  TREE_USED (fn) = TREE_USED (orig);
+  if (DECL_SECTION_NAME (orig))
+set_decl_section_name (fn, DECL_SECTION_NAME (orig));
+  /* Copy any alignment that the FE added.  */
+  if (DECL_ALIGN (orig))
+SET_DECL_ALIGN (fn, DECL_ALIGN (orig));
+  /* Copy any alignment the user added.  */
+  DECL_USER_ALIGN (fn) = DECL_USER_ALIGN (orig);
+  /* Apply attributes from the original fn.  */
+  DECL_ATTRIBUTES (fn) = copy_list (DECL_ATTRIBUTES (orig));
   return fn;
 }
 
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95518.C 
b/gcc/testsuite/g++.dg/coroutines/pr95518.C
new file mode 100644
index 000..b1717677810
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95518.C
@@ -0,0 +1,28 @@
+// { dg-additional-options "-O -Wunused-function" }
+
+#if __has_include ()
+#include 
+using namespace std;
+#elif defined (__clang__) && __has_include ()
+#include 
+namespace std { using namespace experimental; }
+#endif
+
+struct dummy
+{
+struct promise_type
+{
+dummy get_return_object() const noexcept { return {}; }
+std::suspend_never initial_suspend() const noexcept { return {}; }
+std::suspend_never final_suspend() const noexcept { return {}; }
+void return_void() const noexcept {}
+void unhandled_exception() const noexcept {}
+};
+};
+
+// This checks that the attribute is passed on to the outlined coroutine
+// functions (so that there should be no diagnostic).
+[[maybe_unused]] static dummy foo()
+{ 
+co_return;
+}
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95813.C 
b/gcc/testsuite/g++.dg/coroutines/pr95813.C
new file mode 100644
index 000..445cdf1f7ef
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr95813.C
@@ -0,0 +1,46 @@
+//  { dg-additional-options  "-Wall -O" }
+
+// This should complete without any diagnostic.
+
+#include 
+#include 
+
+template 
+class lazy {
+T _v = 0;
+public:
+lazy() {}
+bool await_ready() {return true;}
+void await_suspend(auto x)

[PATCH] emit SLP vectorized loads earlier

2020-06-24 Thread Richard Biener

This makes sure to emit SLP vectorized loads where the first scalar
load is.  This makes SLP dependence checking more powerful because
hoisting loads can use TBAA and it increases the freedom for
vector placement when there are constraints from live lanes.

Vectorized shifts block inserting vectorized stmts always after
vectorized defs because it ends up using the original scalar
operand even when the SLP graph indicates the shift operand
is vectorized (and we actually emit and cost those stmts).

vect_slp_analyze_and_verify_node_alignment shows we need alignment
for too many places, this is a temporary solution and my plan
is to have a single meta-info for a dataref group instead
(also getting rid of DR_GROUP_FIRST/NEXT_ELEMENT).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

2020-06-24  Richard Biener  

* tree-vectorizer.h (vect_find_first_scalar_stmt_in_slp):
Declare.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p):
Simplify for new position of vectorized SLP loads.
(vect_slp_analyze_node_dependences): Adjust for it.
(vect_slp_analyze_and_verify_node_alignment): Compute alignment
for the first stmts dataref.
* tree-vect-slp.c (vect_find_first_scalar_stmt_in_slp): New.
(vect_schedule_slp_instance): Emit loads before the
first scalar stmt.
* tree-vect-stmts.c (vectorizable_load): Do what the comment
says and use vect_find_first_scalar_stmt_in_slp.
---
 gcc/tree-vect-data-refs.c | 267 --
 gcc/tree-vect-slp.c   |  51 +++-
 gcc/tree-vect-stmts.c |   3 +-
 gcc/tree-vectorizer.h |   1 +
 4 files changed, 213 insertions(+), 109 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 2365a3925bb..eb8288e7a85 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -232,61 +232,43 @@ vect_preserves_scalar_order_p (dr_vec_info *dr_info_a, 
dr_vec_info *dr_info_b)
   && !STMT_VINFO_GROUPED_ACCESS (stmtinfo_b))
 return true;
 
-  /* STMT_A and STMT_B belong to overlapping groups.  All loads in a
- SLP group are emitted at the position of the last scalar load and
- all loads in an interleaving group are emitted at the position
- of the first scalar load.
+  /* STMT_A and STMT_B belong to overlapping groups.  All loads are
+ emitted at the position of the first scalar load.
  Stores in a group are emitted at the position of the last scalar store.
  Compute that position and check whether the resulting order matches
- the current one.
- We have not yet decided between SLP and interleaving so we have
- to conservatively assume both.  */
-  stmt_vec_info il_a;
-  stmt_vec_info last_a = il_a = DR_GROUP_FIRST_ELEMENT (stmtinfo_a);
-  if (last_a)
+ the current one.  */
+  stmt_vec_info il_a = DR_GROUP_FIRST_ELEMENT (stmtinfo_a);
+  if (il_a)
 {
-  for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (last_a); s;
-  s = DR_GROUP_NEXT_ELEMENT (s))
-   last_a = get_later_stmt (last_a, s);
-  if (!DR_IS_WRITE (STMT_VINFO_DATA_REF (stmtinfo_a)))
-   {
- for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_a); s;
-  s = DR_GROUP_NEXT_ELEMENT (s))
-   if (get_later_stmt (il_a, s) == il_a)
- il_a = s;
-   }
-  else
-   il_a = last_a;
+  if (DR_IS_WRITE (STMT_VINFO_DATA_REF (stmtinfo_a)))
+   for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_a); s;
+s = DR_GROUP_NEXT_ELEMENT (s))
+ il_a = get_later_stmt (il_a, s);
+  else /* DR_IS_READ */
+   for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_a); s;
+s = DR_GROUP_NEXT_ELEMENT (s))
+ if (get_later_stmt (il_a, s) == il_a)
+   il_a = s;
 }
   else
-last_a = il_a = stmtinfo_a;
-  stmt_vec_info il_b;
-  stmt_vec_info last_b = il_b = DR_GROUP_FIRST_ELEMENT (stmtinfo_b);
-  if (last_b)
+il_a = stmtinfo_a;
+  stmt_vec_info il_b = DR_GROUP_FIRST_ELEMENT (stmtinfo_b);
+  if (il_b)
 {
-  for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (last_b); s;
-  s = DR_GROUP_NEXT_ELEMENT (s))
-   last_b = get_later_stmt (last_b, s);
-  if (!DR_IS_WRITE (STMT_VINFO_DATA_REF (stmtinfo_b)))
-   {
- for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_b); s;
-  s = DR_GROUP_NEXT_ELEMENT (s))
-   if (get_later_stmt (il_b, s) == il_b)
- il_b = s;
-   }
-  else
-   il_b = last_b;
+  if (DR_IS_WRITE (STMT_VINFO_DATA_REF (stmtinfo_b)))
+   for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_b); s;
+s = DR_GROUP_NEXT_ELEMENT (s))
+ il_b = get_later_stmt (il_b, s);
+  else /* DR_IS_READ */
+   for (stmt_vec_info s = DR_GROUP_NEXT_ELEMENT (il_b); s;
+s = DR_GROUP_NEXT_ELEMENT (s))
+ if (get_later_stmt (il_b, s) == il_b)
+   il_b = s;
 }
   else
-last_b = il_b = stmtinfo_b;

[PATCH] tree-optimization/95856 fix vect_stmt_dominates_stmt_p at BB region boundary

2020-06-24 Thread Richard Biener

The following adjusts vect_stmt_dominates_stmt_p to honor out-of-region
stmts we run into which have UID -1u.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-06-24  Richard Biener  

PR tree-optimization/95856
* tree-vectorizer.c (vect_stmt_dominates_stmt_p): Honor
region marker -1u.

* gcc.dg/vect/pr95856.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr95856.c | 20 
 gcc/tree-vectorizer.c   |  4 
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr95856.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr95856.c 
b/gcc/testsuite/gcc.dg/vect/pr95856.c
new file mode 100644
index 000..19a86a62d95
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr95856.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+typedef struct {
+  float xmin, xmax;
+} rctf;
+
+typedef struct {
+  rctf tot;
+} View2D;
+
+View2D graph_main_area_draw_v2d;
+
+void get_graph_keyframe_extents();
+
+void
+graph_main_area_draw() {
+  get_graph_keyframe_extents();
+  graph_main_area_draw_v2d.tot.xmin -= 10.0f;
+  graph_main_area_draw_v2d.tot.xmax += 10.0f;
+}
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index e262ba0580e..78d9da689c8 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -752,6 +752,8 @@ vect_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
   if (gsi_stmt (gsi1) == s2)
return true;
 }
+  if (gimple_uid (gsi_stmt (gsi1)) == -1u)
+return false;
 
   gimple_stmt_iterator gsi2 = gsi_for_stmt (s2);
   while (gimple_uid (gsi_stmt (gsi2)) == 0)
@@ -762,6 +764,8 @@ vect_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
   if (gsi_stmt (gsi2) == s1)
return true;
 }
+  if (gimple_uid (gsi_stmt (gsi2)) == -1u)
+return false;
 
   if (gimple_uid (gsi_stmt (gsi1)) <= gimple_uid (gsi_stmt (gsi2)))
 return true;
-- 
2.26.2

[PATCH PR95700] Use nullptr instead of NULL as a sentinel value

2020-06-24 Thread Ilya Leoshkevich via Gcc-patches

Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.

Ok for master?

---

Bootstrap with musl libc fails with numerous "missing sentinel in
function call" errors.  This is because musl defines NULL as 0L for C++,
but gcc requires sentinel value to be a pointer or __null.

Jonathan Wakely says:

To be really safe during stage 1, GCC should not use NULL as a
pointer sentinel in C++ code anyway.

The bootstrap compiler could define it to 0 or 0u, neither of which
is guaranteed to be OK to pass as a varargs sentinel where a null
pointer is expected.  Any of (void*)0 or (void*)NULL or nullptr
would be safe.

Therefore, fix by replacing NULL sentinels with nullptr.

gcc/ChangeLog:

2020-06-17  Ilya Leoshkevich  

PR bootstrap/95700
* builtins.c (do_mpfr_remquo): Use nullptr as a sentinel.
* collect-utils.c (collect_execute): Likewise.
* collect2.c (main): Likewise.
(write_c_file_stat): Likewise.
* config/i386/driver-i386.c (describe_cache): Likewise.
(host_detect_local_cpu): Likewise.
* config/i386/i386.c (x86_64_elf_unique_section): Likewise.
* coverage.c (coverage_init): Likewise.
* dbxout.c (dbxout_init): Likewise.
* dumpfile.c (get_dump_file_name): Likewise.
* dwarf2out.c (add_top_level_skeleton_die_attrs): Likewise.
(add_enumerator_pubname): Likewise.
(add_pubtype): Likewise.
(gen_subprogram_die): Likewise.
* expr.c (build_personality_function): Likewise.
* fold-const-call.c (do_mpfr_sincos): Likewise.
(do_mpfr_arg2): Likewise.
(do_mpfr_arg3): Likewise.
* gcc-ar.c (setup_prefixes): Likewise.
(main): Likewise.
* gcc.c (set_spec): Likewise.
(close_at_file): Likewise.
(for_each_path): Likewise.
(add_sysrooted_prefix): Likewise.
(add_sysrooted_hdrs_prefix): Likewise.
(process_command): Likewise.
(do_spec_1): Likewise.
(driver::set_up_specs): Likewise.
(driver::maybe_run_linker): Likewise.
(driver::detect_jobserver): Likewise.
(set_multilib_dir): Likewise.
(find_plugindir_spec_function): Likewise.
(compare_debug_dump_opt_spec_function): Likewise.
(compare_debug_self_opt_spec_function): Likewise.
(pass_through_libs_spec_func): Likewise.
(dumps_spec_func): Likewise.
(find_fortran_preinclude_file): Likewise.
* genattrtab.c (evaluate_eq_attr): Likewise.
* gencfn-macros.c (is_group): Likewise.
* genconstants.c (print_enum_type): Likewise.
* gengtype-state.c (write_state): Likewise.
(read_state_fileloc): Likewise.
* gengtype.c (close_output_files): Likewise.
(write_field_root): Likewise.
(write_root): Likewise.
* genmatch.c (get_operator): Likewise.
* hsa-brig.c (brig_init): Likewise.
* incpath.c (add_standard_paths): Likewise.
(add_sysroot_to_chain): Likewise.
* lto-streamer.c (lto_get_section_name): Likewise.
* lto-wrapper.c (debug_objcopy): Likewise.
(run_gcc): Likewise.
* omp-low.c (lower_omp_critical): Likewise.
* opt-suggestions.c (option_proposer::build_option_suggestions): 
Likewise.
* optinfo-emit-json.cc (optrecord_json_writer::write): Likewise.
* opts.c (print_filtered_help): Likewise.
(option_name): Likewise.
(get_option_url): Likewise.
* passes.c (pass_manager::register_one_dump_file): Likewise.
* plugin.c: Likewise.
* prefix.c (translate_name): Likewise.
(update_path): Likewise.
* print-tree.c (print_decl_identifier): Likewise.
* read-md.c (md_reader::join_c_conditions): Likewise.
(md_reader::handle_enum): Likewise.
(md_reader::handle_include): Likewise.
* read-rtl.c (get_mode_token): Likewise.
* real.c (build_sinatan_real): Likewise.
* selftest.c (locate_file): Likewise.
* targhooks.c (default_mangle_assembler_name): Likewise.
* toplev.c (compile_file): Likewise.
(open_auxiliary_file): Likewise.
* trans-mem.c (tm_mangle): Likewise.
* tree-complex.c (create_one_component_var): Likewise.
* tree-nested.c (get_frame_type): Likewise.
* tree-profile.c (gimple_init_gcov_profiler): Likewise.
* tree-vect-data-refs.c (vect_get_new_vect_var): Likewise.
(vect_get_new_ssa_name): Likewise.
* tree.c (build_common_builtin_nodes): Likewise.
* varasm.c (get_named_text_section): Likewise.
(default_unique_section): Likewise.
(default_asm_output_ident_directive): Likewise.

gcc/analyzer/ChangeLog:

2020-06-17  Ilya Leoshkevich  

PR bootstrap/95700
* region-model.cc (region::dump_to_pp): Use nullptr as a sentinel.

gcc/c-family/ChangeLog:

2020-06-17  Ilya Leoshkevich  

PR

Re: [PATCH v4] arm: Implement Armv8.1-M low overhead loops

2020-06-24 Thread Andrea Corallo

Andrea Corallo  writes:

> Hi all,
>
> here the latest version of the patch to enable Armv8.1-M Mainline
> LOB (low overhead branch) extension low overhead loops (LOL) feature
> using the 'loop-doloop' pass.
>
> I posted a previous version of it during stage 4.
>
> Given the following function:
>
> void
> loop (int *a)
> {
>   for (int i = 0; i < 1000; i++)
> a[i] = i;
> }
>
> 'doloop_begin' and 'doloop_end' patterns translates into 'dls' and 'le'
> giving:
>
>  loop:
>  movwr2, #1
>  movsr3, #0
>  subsr0, r0, #4
>  push{lr}
>  dls lr, r2
>  .L2:
>  str r3, [r0, #4]!
>  addsr3, r3, #1
>  le  lr, .L2
>  ldr pc, [sp], #4
>
> bootstrapped arm-none-linux-gnueabihf, does not introduce testsuite 
> regressions.
>
> Andrea

Hi,

double checking I spotted another test impacted by the patch that I
missed as first.

Here the updated patch waving the check.

  Andrea

gcc/ChangeLog

2020-06-18  Andrea Corallo  
Mihail-Calin Ionescu  
Iain Apreotesei  

* config/arm/arm-protos.h (arm_target_insn_ok_for_lob): New
prototype.
* config/arm/arm.c (TARGET_INVALID_WITHIN_DOLOOP): Define.
(arm_invalid_within_doloop): Implement invalid_within_doloop hook.
(arm_target_insn_ok_for_lob): New function.
* config/arm/arm.h (TARGET_HAVE_LOB): Define macro.
* config/arm/thumb2.md (*doloop_end_internal, doloop_begin)
(dls_insn): Add new patterns.
(doloop_end): Modify to select LR when LOB is available.
* config/arm/unspecs.md: Add new unspec.
* doc/sourcebuild.texi (arm_v8_1_lob_ok)
(arm_thumb2_ok_no_arm_v8_1_lob): Document new target supports
options.

gcc/testsuite/ChangeLog

2020-06-18  Andrea Corallo  
Mihail-Calin Ionescu  
Iain Apreotesei  

* gcc.target/arm/lob.h: New header.
* gcc.target/arm/lob1.c: New testcase.
* gcc.target/arm/lob2.c: Likewise.
* gcc.target/arm/lob3.c: Likewise.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Likewise.
* gcc.target/arm/lob6.c: Likewise.
* gcc.target/arm/unsigned-extend-2.c: Do not run when generating
low loop overhead.
* gcc.target/arm/ivopts.c: Fix check for low loop overhead.
* lib/target-supports.exp (check_effective_target_arm_v8_1_lob)
(check_effective_target_arm_thumb2_ok_no_arm_v8_1_lob): New procs.

>From b88fbcf3f421fbd134f3ad69f8fb4a7a3ec20bbc Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Tue, 26 May 2020 17:47:13 +0100
Subject: [PATCH] arm: Implement Armv8.1-M low overhead loops

gcc/ChangeLog

2020-06-18  Andrea Corallo  
Mihail-Calin Ionescu  
Iain Apreotesei  

	* config/arm/arm-protos.h (arm_target_insn_ok_for_lob): New
	prototype.
* config/arm/arm.c (TARGET_INVALID_WITHIN_DOLOOP): Define.
(arm_invalid_within_doloop): Implement invalid_within_doloop hook.
	(arm_target_insn_ok_for_lob): New function.
* config/arm/arm.h (TARGET_HAVE_LOB): Define macro.
* config/arm/thumb2.md (*doloop_end_internal, doloop_begin)
	(dls_insn): Add new patterns.
	(doloop_end): Modify to select LR when LOB is available.
* config/arm/unspecs.md: Add new unspec.
* doc/sourcebuild.texi (arm_v8_1_lob_ok)
	(arm_thumb2_ok_no_arm_v8_1_lob): Document new target supports
	options.

gcc/testsuite/ChangeLog

2020-06-18  Andrea Corallo  
Mihail-Calin Ionescu  
Iain Apreotesei  

* gcc.target/arm/lob.h: New header.
* gcc.target/arm/lob1.c: New testcase.
* gcc.target/arm/lob2.c: Likewise.
* gcc.target/arm/lob3.c: Likewise.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Likewise.
* gcc.target/arm/lob6.c: Likewise.
	* gcc.target/arm/unsigned-extend-2.c: Do not run when generating
	low loop overhead.
	* gcc.target/arm/ivopts.c: Fix check for low loop overhead.
* lib/target-supports.exp (check_effective_target_arm_v8_1_lob)
	(check_effective_target_arm_thumb2_ok_no_arm_v8_1_lob): New procs.
---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/config/arm/arm.c  | 37 +++
 gcc/config/arm/arm.h  |  3 +
 gcc/config/arm/thumb2.md  | 49 +-
 gcc/config/arm/unspecs.md |  1 +
 gcc/doc/sourcebuild.texi  | 11 +++
 gcc/testsuite/gcc.target/arm/ivopts.c |  2 +-
 gcc/testsuite/gcc.target/arm/lob.h| 15 +++
 gcc/testsuite/gcc.target/arm/lob1.c   | 85 
 gcc/testsuite/gcc.target/arm/lob2.c   | 32 ++
 gcc/testsuite/gcc.target/arm/lob3.c   | 27 ++
 gcc/testsuite/gcc.target/arm/lob4.c   | 34 +++
 gcc/testsuite/gcc.target/arm/lob5.c   | 35 +++

Re: PING Re: testsuite: clarify scan-dump file globbing behavior

2020-06-24 Thread Thomas Schwinge

Hi!

Given that nobody is available to review/approve this patch, and it
cannot really cause any harm, will my (old) review/"non-formal approval"
be sufficient for Frederik to push this?  Or, in other words: Frederik,
please push if nobody objects within the next week.


Grüße
 Thomas


On 2020-06-03T07:37:01+0200, Frederik Harwath  wrote:
> Frederik Harwath  writes:
>
> ping :-)
>
>> Frederik Harwath  writes:
>>
>> Hi Rainer, hi Mike,
>> ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545803.html
>>
>> Best regards,
>> Frederik
>>
>>> Hi Thomas,
>>>
>>> Thomas Schwinge  writes:
>>>
 I can't formally approve testsuite patches, but did a review anyway:
>>>
>>> Thanks for the review!
>>>
 On 2020-05-15T12:31:54+0200, Frederik Harwath  
 wrote:
>>>
> The dump
> scanning procedures are changed to make the test unresolved
> if globbing matches more than one file.

 (The code changes look good, but I have not tested that specific aspect.)
>>>
>>> We do not have automated tests for the testsuite commands :-), but I
>>> have of course tested this manually.
>>>
 As I said, not an approval, and minor comments (see below), but still:

 Reviewed-by: Thomas Schwinge 

 Do we have to similarly also audit/alter other testsuite infrastructure
 files, anything that uses '[glob [...]]'?  (..., and then generalize
 'glob-dump-file' into 'glob-one-file', or similar.)  That can be done
 incrementally, as far as I'm concerned.
>>>
>>> I also think it would make sense to adapt similar test commands as well.
>>>
 May also make this more useful/explicit:

 This is useful if, for example, if a pass has several static
 instances [correct terminology?], and depending on torture testing
 command-line flags, a different instance executes and produces a dump
 file, and so in the test case you can use a generic [put example
 here] to scan the varying dump files names.

 (Or similar.)
>>>
>>> I have moved the explanation below the description of the individual
>>> commands and added an example. See the attached revised patch.
>>>
>>> Best regards,
>>> Frederik
>>>
>>> From 2a17749d6dbcac690d698323240438722d6119ef Mon Sep 17 00:00:00 2001
>>> From: Frederik Harwath 
>>> Date: Fri, 15 May 2020 10:35:48 +0200
>>> Subject: [PATCH] testsuite: clarify scan-dump file globbing behavior
>>>
>>> The test commands for scanning optimization dump files
>>> perform globbing on the argument that specifies the suffix
>>> of the dump files to be scanned.  This behavior is currently
>>> undocumented.  Furthermore, the current implementation of
>>> "scan-dump" and similar procedures yields an error whenever
>>> the file name globbing matches more than one file (due to an
>>> attempt to call "open" on multiple files) while a failure to
>>> match any file results in an unresolved test.
>>>
>>> This commit documents the globbing behavior.  The dump
>>> scanning procedures are changed to make the test unresolved
>>> if globbing matches more than one file.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2020-05-19  Frederik Harwath  
>>>
>>> * doc/sourcebuild.texi: Describe globbing of the
>>> dump file scanning commands "suffix" argument.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2020-05-19  Frederik Harwath  
>>>
>>> * lib/scandump.exp (glob-dump-file): New proc.
>>> (scan-dump): Use glob-dump-file for file name expansion.
>>> (scan-dump-times): Likewise.
>>> (scan-dump-dem): Likewise.
>>> (scan-dump-dem-not): Likewise.
>>>
>>> Reviewed-by: Thomas Schwinge 
>>> ---
>>>  gcc/doc/sourcebuild.texi   | 13 
>>>  gcc/testsuite/lib/scandump.exp | 54 +++---
>>>  2 files changed, 56 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
>>> index 240d6e4b08e..9df4b06d460 100644
>>> --- a/gcc/doc/sourcebuild.texi
>>> +++ b/gcc/doc/sourcebuild.texi
>>> @@ -2911,6 +2911,19 @@ Passes if @var{regex} does not match demangled text 
>>> in the dump file with
>>>  suffix @var{suffix}.
>>>  @end table
>>>
>>> +The @var{suffix} argument which describes the dump file to be scanned
>>> +may contain a glob pattern that must expand to exactly one file
>>> +name. This is useful if, e.g., different pass instances are executed
>>> +depending on torture testing command-line flags, producing dump files
>>> +whose names differ only in their pass instance number suffix.  For
>>> +example, to scan instances 1, 2, 3 of a tree pass ``mypass'' for
>>> +occurrences of the string ``code has been optimized'', use:
>>> +@smallexample
>>> +/* @{ dg-options "-fdump-tree-mypass" @} */
>>> +/* @{ dg-final @{ scan-tree-dump "code has been optimized" "mypass\[1-3\]" 
>>> @} @} */
>>> +@end smallexample
>>> +
>>> +
>>>  @subsubsection Check for output files
>>>
>>>  @table @code
>>> diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
>>> index

Re: [PATCH PR95854] ICE in find_bswap_or_nop_1 of pass store-merging

2020-06-24 Thread Jakub Jelinek via Gcc-patches

Oh, and the testcase is wrong too, you can't use AArch64 specific
compilation options in generic tests, so either the test needs to be moved
into gcc.target/aarch64/ and match in style how such -march option is
specified in other tests there, or should not have that -m* in dg-options,
but have /* { dg-additional-options "-march=..." { target aarch64*-*-* } } */

Jakub

Re: [PATCH PR95854] ICE in find_bswap_or_nop_1 of pass store-merging

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 09:15:57AM +, zhoukaipeng (A) wrote:
> Hi,
> 
> This is a fix for pr95854.
> 
> Only add a judgement to make sure operand1 and operand2 are both INTEGER_CST.

So what are those two operands then when not INTEGER_CSTs?  POLY_INT_CSTs?

The documentation says:
   operand 1 is a tree giving the constant number of bits being referenced;
   operand 2 is a tree giving the constant position of the first referenced bit.

The ChangeLog is incorrect, it should say:
* gimple-ssa-store-merging.c (find_bswap_or_nop_1): Return NULL
if operand 1 or 2 of a BIT_FIELD_REF are not INTEGER_CSTs.
and patch should not be modifying ChangeLog files, the ChangeLog entry
belongs now solely into the commit message from which a script adds it to
ChangeLog files later.

Jakub

Re: [PATCH] VEC_COND_EXPR: clean up first argument

2020-06-24 Thread Martin Liška


On 6/24/20 11:09 AM, Richard Biener wrote:

On Wed, Jun 24, 2020 at 10:49 AM Martin Liška  wrote:


On 6/24/20 9:43 AM, Richard Biener wrote:

Hmm, can you instead use simple_dce_from_worklist and simply
record all SSA_NAMEs you end up "forwarding" as possibly dead
in a bitmap?  At least that hashmap traversal looks dangerous
with respect to address-space randomization and gsi_remove
inserting debug stmts and thus eventually allocating debug decls.


Sure, done in the updated patch.


You can simplify the patch by eliding the num_imm_uses checks


Really? How can I be sure that a SSA_NAME is not shared among different
VEC_COND_EXPR statements (or even by some other statements)?


and by using auto_bitmap.


Oh yeah!


 Why is it necessary to update
the veclower pass btw?  Is that just to avoid useless isels
on dead code?


Yes:

  _10 = _9 != { 0, 0, 0, 0 };
  _11 = *a_16(D);
  _12 = *b_17(D);
  _13 = _11 + _12;
  _14 = VEC_COND_EXPR <_10, _13, { 3.0e+0, 3.0e+0, 3.0e+0, 3.0e+0 }>;

is expanded by vectlower to something like:

  _10 = _9 != { 0, 0, 0, 0 };
  _11 = *a_16(D);
  _12 = *b_17(D);
  _67 = BIT_FIELD_REF <_11, 32, 0>;
  _68 = BIT_FIELD_REF <_12, 32, 0>;
  _69 = _67 + _68;
...
  _14 = {_80, _82, _84, _86};
  *a_16(D) = _14;

So one needs to remove: _10 = _9 != { 0, 0, 0, 0 };
Note the ICE happens without an optimization level.




You also updated veclower "nicely" but still have the hashmap
walk in isel - you should know when you "merged" a condition
into a cond and set the bit there.


Isn't the same as before as the first argument can be actually shared in between
multiple GIMPLE statements?

Thanks,
Martin



Richard.


Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Tue, Jun 23, 2020 at 07:25:22PM -0400, y2s1982 via Gcc-patches wrote:
> This patch adds some unit tests for omp-tools.h header. It also adds some 
> simple
> functions related to OMPD API versions. It also partially defines the OMPD
> initialization function.
> 
> 2020-06-23  Tony Sim  
> 
> libgomp/ChangeLog:
> 
>   * Makefile.am (toolexeclib_LTLIBRARIES and related): Add libgompd.la.

Please spell out all the changes.  I.e.
* Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la.
(libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
libgompd_la_SOURCES): Set.

> @@ -67,6 +67,12 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
> env.c error.c \
>   oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
>   affinity-fmt.c teams.c allocator.c oacc-profiling.c oacc-target.c
>  
> +libgompd_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
> +$(lt_host_flags)
> +libgompd_la_DEPENDENCIES = $(libgomp_version_dep)
> +libgompd_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)

You actually want to use libgompd_la_LDFLAGS, libgompd_version_dep,
libgompd_version_script, libgompd_version_info etc. (and set those variables
next to the libgomp ones).
And introduce libgompd.map version script which will have the exported
symbols in for now OMPD_5.0 symbol version.
So
OMPD_5.0 {
  global:
...;

  local:
*;
};

> +libgompd_la_SOURCES = libgompd.c

Not sure if you want to call the source file libgompd.c, that would make
sense only if you want to have all of OMPD implemented in a single file.
If you need multiple, I'd suggest ompd-*.c where the * would be something
short/descriptive to name a set of related functions.

> +++ b/libgomp/libgompd.c
> @@ -0,0 +1,46 @@
> +#include 
> +#include 
> +#include "omp-tools.h"
> +#include "libgompd.h"
> +//#include "plugin-suffix.h"

So, what actually includes that header?  Otherwise it can't compile.
> +
> +ompd_rc_t ompd_get_api_version(ompd_word_t *version)
> +{
> +*version = OMPD_VERSION;
> +return ompd_rc_ok;
> +}

Formatting.  The GNU Coding Conventions say it should be
ompd_rc_t
ompd_get_api_version (ompd_word_t *version)
{
  *version = OMPD_VERSION;
  return ompd_rc_ok;
}
(i.e. function name should be at the start of line for easy grepping,
there should be a single space before (, no space after the ( or
before ) as you have in other functions, indentation level is 2
(see indent program defaults).

> +
> +ompd_rc_t ompd_get_version_string(const char **string)
> +{
> +string = str(OMPD_VERSION);
> +return ompd_rc_ok;
> +}

This doesn't do anything outside of the function.  You need
  *string = 
and put there something more descriptive than just the version,
perhaps
  *string = "GNU OpenMP Runtime implementing OpenMP 5.0 " str (OMPD_VERSION);
I don't see a str macro defined, and it should have a different name, str is
too generic.
libgomp.h defines (conditionally):
# define ialias_str1(x) ialias_str2(x)
# define ialias_str2(x) #x
which does what you want, but you need it unconditionally and call it some
other way (stringify and stringify1?, define right above the function?).

> +
> +ompd_rc_t ompd_initialize ( ompd_word_t api_version, const ompd_callbacks_t 
> *callbacks )
> +{
> +/* initialized flag */
> +static int ompd_initialized = 0;
> +
> +if (ompd_initialized)
> +return ompd_rc_error;
> +
> +/* compute library name and locations */
> +const char *prefix = "libgompd.";
> +const char *suffix = SONAME_SUFFIX (1);
> +size_t prefix_len, suffix_len;
> +prefix_len = strlen(prefix);
> +suffix_len = strlen(suffix);

Formatting (in addition to what has been said above, e.g. space before (.
But, this doesn't really belong here anyway, ompd_dll_locations needs to be
initialized not in ompd_initialize, but far before that.
Either it should be just a const variable, which I think with
const char *ompd_dll_locations[2] = { "libgompd" SONAME_SUFFIX (1), NULL };
is possible, or some library constructor needs to initializa it and then
pass through (or call) the mandated breakpoint so that the debugger can be
sure it is initialized.
Then the debugger will use the variable, load the libgompd library and only
after that actually call ompd_initialize.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.ompd/header-1.c
> @@ -0,0 +1,6 @@
> +/*  Test that the omp-tools.h will compile successfully.  */
> +
> +/* { dg-do run } */

Should be just dg-do compile, that is all you care about in this case.

> +#include "omp-tools.h"
> +
> +int main(void){ return 0; }

Please fix up formatting,
int
main ()
{
  return 0;
}

Ditto other tests.

Jakub

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Richard Sandiford

Jakub Jelinek  writes:
> On Wed, Jun 24, 2020 at 09:35:42AM +0100, Richard Sandiford wrote:
>> >> I keep meaning to experiment with dropping the second mode from these 
>> >> optabs,
>> >> since it should be uniquely determined by the first.  That would make 
>> >> things
>> >> slightly simpler and more consistent with the other load/store IFNs.  But 
>> >> as
>> >> usual I've never found the time.
>> >
>> > AVX512 would have V16SImode and SImode because the mask would have
>> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and 
>> > V4QImode
>> > or however their mask registers look like.
>> 
>> But what I mean is, once you know the vector mode, there should only
>> be one “choice” of mask mode.  (I agree it might not be the same mode,
>> and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
>> vectoriser masking code is based around the assumption that the mask
>> mode is a function of the vector mode.
>
> But on the same target the same vector mode (e.g. V16HImode) can have two
> different mask modes (for AVX and non-AVX512VL V16HImode, for AVX512VL
> HImode).  It is true that better both shouldn't appear in the same function,
> but it would be better if at least in the optabs it is clearly spelled out.

Hmm, OK.

The reason for bringing this up is that we took the above shortcut
for things like mask_gather_load_optab.  That's a convert optab in
which one mode is the data vector mode and the other is the index
vector mode (which can vary independently of the data mode to some
extent).  If we allow multiple mask modes for the same pair of
index and data vector modes, we'd presumably need to add support
for three-mode optabs.

I don't know whether that would be a practical problem for AVX or not,
if it did switch to using ifns instead of built-in functions for
autovectorised gathers.

Richard

[PATCH PR95854] ICE in find_bswap_or_nop_1 of pass store-merging

2020-06-24 Thread zhoukaipeng (A)

Hi,

This is a fix for pr95854.

Only add a judgement to make sure operand1 and operand2 are both INTEGER_CST.

Bootstrap and tested on aarch64 Linux platform. No new regression witnessed.

Is it ok to be merged?

Thanks,
Kaipeng Zhou


0001-store-merging-ICE-in-find_bswap_or_nop_1-PR95854.patch
Description: 0001-store-merging-ICE-in-find_bswap_or_nop_1-PR95854.patch

Re: [PATCH] Fortran : Bogus error with additional blanks in type(*) PR95829

2020-06-24 Thread Manfred Schwarb

Am 24.06.20 um 10:12 schrieb Mark Eggleston:
> Please find a fix for PR95829.  Original patch by Steve Kargl posted to PR.
>
> Commit to master and backport?
>
> Commit message:
>
> Fortran  : Bogus error with additional blanks in type(*) PR95829
>
> Checking for "* ) " instead of "*)" clears the bogus error.
>
> 2020-06-24  Steven G. Kargl  
>
> gcc/fortran/
>
>     PR fortran/95829
>     * decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
>     of "*)".
>
> 2020-06-24  Mark Eggleston 
>
> gcc/testsuite/
>
>     PR fortran/95829
>     * gfortran.dg/pr95829.f90: New test.
>

@@ -0,0 +1,14 @@
+! {dg-do compile }
+!
+! Declaration of b used to be a bogus failure.

{ dg-do compile }

with surrounding spaces, otherwise this dg-directive will not be recognized.

Re: [PATCH] Implement no_stack_protect attribute.

2020-06-24 Thread Martin Liška


PING^2

On 6/10/20 10:12 AM, Martin Liška wrote:

PING^1

On 5/25/20 3:10 PM, Martin Liška wrote:

On 5/21/20 4:53 PM, Martin Sebor wrote:

On 5/21/20 5:28 AM, Martin Liška wrote:

On 5/18/20 10:37 PM, Martin Sebor wrote:

I know there are some somewhat complex cases the attribute exclusion
mechanism isn't general enough to handle but this seems simple enough
that it should work.  Unless I'm missing something that makes it not
feasible I would suggest to use it.


Hi Martin.

Do we have a better place where we check for attribute collision?


If by collision you mean the same thing as the mutual exclusion I was
talking about then that's done by creating an attribute_spec::exclusions
array like for instance attr_cold_hot_exclusions in c-attribs.c and
pointing to it from the attribute_spec entries for each of
the mutually exclusive attributes in the attribute table.  Everything
else is handled automatically by decl_attributes.

Martin


Thanks, I'm sending updated version of the patch that utilizes the conflict
detection.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

Re: [PATCH] VEC_COND_EXPR: clean up first argument

2020-06-24 Thread Richard Biener via Gcc-patches

On Wed, Jun 24, 2020 at 10:49 AM Martin Liška  wrote:
>
> On 6/24/20 9:43 AM, Richard Biener wrote:
> > Hmm, can you instead use simple_dce_from_worklist and simply
> > record all SSA_NAMEs you end up "forwarding" as possibly dead
> > in a bitmap?  At least that hashmap traversal looks dangerous
> > with respect to address-space randomization and gsi_remove
> > inserting debug stmts and thus eventually allocating debug decls.
>
> Sure, done in the updated patch.

You can simplify the patch by eliding the num_imm_uses checks
and by using auto_bitmap.  Why is it necessary to update
the veclower pass btw?  Is that just to avoid useless isels
on dead code?

You also updated veclower "nicely" but still have the hashmap
walk in isel - you should know when you "merged" a condition
into a cond and set the bit there.

Richard.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>

Re: [PATCH wwwdocs] gcc-11/changes: Document TSAN changes

2020-06-24 Thread Marco Elver via Gcc-patches

On Tue, Jun 23, 2020 at 09:32PM +0200, Martin Liška wrote:
> On 6/23/20 1:27 PM, Marco Elver wrote:
> > Is this one good to go, or any objections?
> Thanks for it, it's fine. Please install it.

Done, thanks!

-- Marco

Re: [PATCH] VEC_COND_EXPR: clean up first argument

2020-06-24 Thread Martin Liška


On 6/24/20 9:43 AM, Richard Biener wrote:

Hmm, can you instead use simple_dce_from_worklist and simply
record all SSA_NAMEs you end up "forwarding" as possibly dead
in a bitmap?  At least that hashmap traversal looks dangerous
with respect to address-space randomization and gsi_remove
inserting debug stmts and thus eventually allocating debug decls.


Sure, done in the updated patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

>From ad8d574ea5bcd0f123e0c890825babbd3bbc5eef Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 24 Jun 2020 08:08:00 +0200
Subject: [PATCH] VEC_COND_EXPR: clean up first argument

gcc/ChangeLog:

	PR tree-optimization/95745
	PR middle-end/95830
	* gimple-isel.cc (gimple_expand_vec_cond_exprs): Delete dead
	SSA_NAMEs used as the first argument of a VEC_COND_EXPR.  Always
	return 0.
	* tree-vect-generic.c (expand_vector_condition): Remove dead
	SSA_NAMEs used as the first argument of a VEC_COND_EXPR.
---
 gcc/gimple-isel.cc  | 14 --
 gcc/tree-vect-generic.c | 23 +--
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 97f92080503..f54c56eb71d 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -33,6 +33,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify-me.h"
 #include "gimplify.h"
 #include "tree-cfg.h"
+#include "bitmap.h"
+#include "tree-ssa-dce.h"
 
 /* Expand all VEC_COND_EXPR gimple assignments into calls to internal
function based on type of selected expansion.  */
@@ -178,8 +180,8 @@ gimple_expand_vec_cond_exprs (void)
 {
   gimple_stmt_iterator gsi;
   basic_block bb;
-  bool cfg_changed = false;
   hash_map vec_cond_ssa_name_uses;
+  bitmap dce_ssa_names = BITMAP_ALLOC (NULL);
 
   FOR_EACH_BB_FN (bb, cfun)
 {
@@ -196,7 +198,15 @@ gimple_expand_vec_cond_exprs (void)
 	}
 }
 
-  return cfg_changed ? TODO_cleanup_cfg : 0;
+  for (hash_map::iterator it = vec_cond_ssa_name_uses.begin ();
+   it != vec_cond_ssa_name_uses.end (); ++it)
+if (num_imm_uses ((*it).first) == 0)
+  bitmap_set_bit (dce_ssa_names, SSA_NAME_VERSION ((*it).first));
+
+  simple_dce_from_worklist (dce_ssa_names);
+  BITMAP_FREE (dce_ssa_names);
+
+  return 0;
 }
 
 namespace {
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 83d399a7898..fafe6bf2d16 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -40,10 +40,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vector-builder.h"
 #include "vec-perm-indices.h"
 #include "insn-config.h"
+#include "tree-ssa-dce.h"
 #include "recog.h"		/* FIXME: for insn_data */
 
 
-static void expand_vector_operations_1 (gimple_stmt_iterator *);
+static void expand_vector_operations_1 (gimple_stmt_iterator *, bitmap *);
 
 /* Return the number of elements in a vector type TYPE that we have
already decided needs to be expanded piecewise.  We don't support
@@ -932,7 +933,7 @@ expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
 /* Expand a vector condition to scalars, by using many conditions
on the vector's elements.  */
 static void
-expand_vector_condition (gimple_stmt_iterator *gsi)
+expand_vector_condition (gimple_stmt_iterator *gsi, bitmap *dce_ssa_names)
 {
   gassign *stmt = as_a  (gsi_stmt (*gsi));
   tree type = gimple_expr_type (stmt);
@@ -954,10 +955,11 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
   tree comp_index = index;
   location_t loc = gimple_location (gsi_stmt (*gsi));
   tree_code code = TREE_CODE (a);
+  gassign *assign = NULL;
 
   if (code == SSA_NAME)
 {
-  gassign *assign = dyn_cast (SSA_NAME_DEF_STMT (a));
+  assign = dyn_cast (SSA_NAME_DEF_STMT (a));
   if (assign != NULL
 	  && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) == tcc_comparison)
 	{
@@ -1064,6 +1066,10 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
   constr = build_constructor (type, v);
   gimple_assign_set_rhs_from_tree (gsi, constr);
   update_stmt (gsi_stmt (*gsi));
+
+  if (a_is_comparison && num_imm_uses (gimple_assign_lhs (assign)) == 0)
+bitmap_set_bit (*dce_ssa_names,
+		SSA_NAME_VERSION (gimple_assign_lhs (assign)));
 }
 
 static tree
@@ -1956,7 +1962,7 @@ expand_vector_conversion (gimple_stmt_iterator *gsi)
 /* Process one statement.  If we identify a vector operation, expand it.  */
 
 static void
-expand_vector_operations_1 (gimple_stmt_iterator *gsi)
+expand_vector_operations_1 (gimple_stmt_iterator *gsi, bitmap *dce_ssa_names)
 {
   tree lhs, rhs1, rhs2 = NULL, type, compute_type = NULL_TREE;
   enum tree_code code;
@@ -1985,7 +1991,7 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
 
   if (code == VEC_COND_EXPR)
 {
-  expand_vector_condition (gsi);
+  expand_vector_condition (gsi, dce_ssa_names);
   return;
 }
 
@@ -2233,11 +2239,13 @@ expand_vector_operations (void)
   basic_block bb;

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Jakub Jelinek via Gcc-patches

On Wed, Jun 24, 2020 at 09:35:42AM +0100, Richard Sandiford wrote:
> >> I keep meaning to experiment with dropping the second mode from these 
> >> optabs,
> >> since it should be uniquely determined by the first.  That would make 
> >> things
> >> slightly simpler and more consistent with the other load/store IFNs.  But 
> >> as
> >> usual I've never found the time.
> >
> > AVX512 would have V16SImode and SImode because the mask would have
> > an integer mode?  Likewise I could imagine RISC-V using V4SImode and 
> > V4QImode
> > or however their mask registers look like.
> 
> But what I mean is, once you know the vector mode, there should only
> be one “choice” of mask mode.  (I agree it might not be the same mode,
> and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
> vectoriser masking code is based around the assumption that the mask
> mode is a function of the vector mode.

But on the same target the same vector mode (e.g. V16HImode) can have two
different mask modes (for AVX and non-AVX512VL V16HImode, for AVX512VL
HImode).  It is true that better both shouldn't appear in the same function,
but it would be better if at least in the optabs it is clearly spelled out.

> This is similar to single-mode direct optabs like vec_pack and vec_unpack,
> which operate on pairs of vector modes, but where one mode is uniquely
> determined by the other.

For those IMHO it would be better to use two modes, otherwise it is a pain.

Jakub

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Richard Sandiford

Richard Biener  writes:
> On Wed, Jun 24, 2020 at 9:13 AM Richard Sandiford
>  wrote:
>>
>> "Kewen.Lin"  writes:
>> > Hi,
>> >
>> > When I am working on IFNs for vector with length, I noticed that the
>> > current optab support query for mask_load/mask_store looks unexpected.
>> > The mask_load/mask_store requires two modes for convert_optab query,
>> > but the macros direct_mask_{load,store}_optab_supported_p uses
>> > direct_optab_supported_p which asserts type pair should have the same mode.
>> >
>> > I'm not sure whether we have some special reason here or just a typo,
>> > since everything goes well now, mask_{load,store} optab check is mainly
>> > handled by can_vec_mask_load_store_p.
>> >
>> > But if we have some codes as below (eg: one checking for all IFNs finally)
>> >
>> >   tree_pair types = direct_internal_fn_types (ifn, call);
>> >   if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...
>> >
>> > It will cause ICE.
>> >
>> > Does it make sense to fix it?
>> >
>> > Thanks in advance!
>> >
>> > BR,
>> > Kewen
>> > -
>> > gcc/ChangeLog:
>> >
>> >   * internal-fn.c (direct_mask_load_optab_supported_p): Use
>> >   convert_optab_supported_p instead of direct_optab_supported_p.
>> >   (direct_mask_store_optab_supported_p): Likewise.
>>
>> OK, thanks.
>>
>> I keep meaning to experiment with dropping the second mode from these optabs,
>> since it should be uniquely determined by the first.  That would make things
>> slightly simpler and more consistent with the other load/store IFNs.  But as
>> usual I've never found the time.
>
> AVX512 would have V16SImode and SImode because the mask would have
> an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
> or however their mask registers look like.

But what I mean is, once you know the vector mode, there should only
be one “choice” of mask mode.  (I agree it might not be the same mode,
and isn't for SVE as well as AVX512.)  TARGET_GET_MASK_MODE and related
vectoriser masking code is based around the assumption that the mask
mode is a function of the vector mode.

This is similar to single-mode direct optabs like vec_pack and vec_unpack,
which operate on pairs of vector modes, but where one mode is uniquely
determined by the other.

In contrast, convert optabs are AIUI designed for cases where there are
two modes that can vary independently of one another (at least to some
extent), and so giving a single mode isn't enough to identity the operation.

Thanks,
Richard

Re: [PATCH] fold-const: Fix A <= 0 ? A : -A folding [PR95810]

2020-06-24 Thread Richard Biener

On Wed, 24 Jun 2020, Jakub Jelinek wrote:

> Hi!
> 
> We folded A <= 0 ? A : -A into -ABS (A), which is for signed integral types
> incorrect - can invoke on INT_MIN UB twice, once on ABS and once on its
> negation.
> 
> The following patch fixes it by instead folding it to (type)-ABSU (A).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2020-06-24  Jakub Jelinek  
> 
>   PR middle-end/95810
>   * fold-const.c (fold_cond_expr_with_comparison): Optimize
>   A <= 0 ? A : -A into (type)-absu(A) rather than -abs(A).
> 
>   * gcc.dg/ubsan/pr95810.c: New test.
> 
> --- gcc/fold-const.c.jj   2020-05-28 16:25:00.240712958 +0200
> +++ gcc/fold-const.c  2020-06-22 11:45:20.940170934 +0200
> @@ -5770,8 +5770,22 @@ fold_cond_expr_with_comparison (location
>case LT_EXPR:
>   if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
> break;
> - tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
> - return negate_expr (fold_convert_loc (loc, type, tem));
> + if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
> + && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1)))
> +   {
> + /* A <= 0 ? A : -A for A INT_MIN is valid, but -abs(INT_MIN)
> +is not, invokes UB both in abs and in the negation of it.
> +So, use ABSU_EXPR instead.  */
> + tree utype = unsigned_type_for (TREE_TYPE (arg1));
> + tem = fold_build1_loc (loc, ABSU_EXPR, utype, arg1);
> + tem = negate_expr (tem);
> + return fold_convert_loc (loc, type, tem);
> +   }
> + else
> +   {
> + tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
> + return negate_expr (fold_convert_loc (loc, type, tem));
> +   }
>default:
>   gcc_assert (TREE_CODE_CLASS (comp_code) == tcc_comparison);
>   break;
> --- gcc/testsuite/gcc.dg/ubsan/pr95810.c.jj   2020-06-22 11:49:03.666910264 
> +0200
> +++ gcc/testsuite/gcc.dg/ubsan/pr95810.c  2020-06-22 11:48:55.057036313 
> +0200
> @@ -0,0 +1,13 @@
> +/* PR middle-end/95810 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=undefined -fno-sanitize-recover=undefined" } */
> +
> +int
> +main ()
> +{
> +  int x = -__INT_MAX__ - 1;
> +  x = (x <= 0 ? x : -x);
> +  if (x != -__INT_MAX__ - 1)
> +__builtin_abort ();
> +  return 0;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[committed] openmp: Fix two pastos in non-rect loop OpenMP lowering

2020-06-24 Thread Jakub Jelinek via Gcc-patches

Hi!

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-06-24  Jakub Jelinek  

* omp-low.c (lower_omp_for): Fix two pastos.

--- gcc/omp-low.c.jj2020-06-16 16:30:43.751400100 +0200
+++ gcc/omp-low.c   2020-06-23 14:14:54.972834539 +0200
@@ -10579,7 +10579,7 @@ lower_omp_for (gimple_stmt_iterator *gsi
TREE_VEC_ELT (*rhs_p, 1)
  = get_formal_tmp_var (TREE_VEC_ELT (*rhs_p, 1), _list);
  if (!is_gimple_min_invariant (TREE_VEC_ELT (*rhs_p, 2)))
-   TREE_VEC_ELT (*rhs_p, 1)
+   TREE_VEC_ELT (*rhs_p, 2)
  = get_formal_tmp_var (TREE_VEC_ELT (*rhs_p, 2), _list);
}
   else if (!is_gimple_min_invariant (*rhs_p))
@@ -10594,7 +10594,7 @@ lower_omp_for (gimple_stmt_iterator *gsi
TREE_VEC_ELT (*rhs_p, 1)
  = get_formal_tmp_var (TREE_VEC_ELT (*rhs_p, 1), _list);
  if (!is_gimple_min_invariant (TREE_VEC_ELT (*rhs_p, 2)))
-   TREE_VEC_ELT (*rhs_p, 1)
+   TREE_VEC_ELT (*rhs_p, 2)
  = get_formal_tmp_var (TREE_VEC_ELT (*rhs_p, 2), _list);
}
   else if (!is_gimple_min_invariant (*rhs_p))

Jakub

[PATCH] fold-const: Fix A <= 0 ? A : -A folding [PR95810]

2020-06-24 Thread Jakub Jelinek via Gcc-patches

Hi!

We folded A <= 0 ? A : -A into -ABS (A), which is for signed integral types
incorrect - can invoke on INT_MIN UB twice, once on ABS and once on its
negation.

The following patch fixes it by instead folding it to (type)-ABSU (A).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-06-24  Jakub Jelinek  

PR middle-end/95810
* fold-const.c (fold_cond_expr_with_comparison): Optimize
A <= 0 ? A : -A into (type)-absu(A) rather than -abs(A).

* gcc.dg/ubsan/pr95810.c: New test.

--- gcc/fold-const.c.jj 2020-05-28 16:25:00.240712958 +0200
+++ gcc/fold-const.c2020-06-22 11:45:20.940170934 +0200
@@ -5770,8 +5770,22 @@ fold_cond_expr_with_comparison (location
   case LT_EXPR:
if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
  break;
-   tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
-   return negate_expr (fold_convert_loc (loc, type, tem));
+   if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
+   && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1)))
+ {
+   /* A <= 0 ? A : -A for A INT_MIN is valid, but -abs(INT_MIN)
+  is not, invokes UB both in abs and in the negation of it.
+  So, use ABSU_EXPR instead.  */
+   tree utype = unsigned_type_for (TREE_TYPE (arg1));
+   tem = fold_build1_loc (loc, ABSU_EXPR, utype, arg1);
+   tem = negate_expr (tem);
+   return fold_convert_loc (loc, type, tem);
+ }
+   else
+ {
+   tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
+   return negate_expr (fold_convert_loc (loc, type, tem));
+ }
   default:
gcc_assert (TREE_CODE_CLASS (comp_code) == tcc_comparison);
break;
--- gcc/testsuite/gcc.dg/ubsan/pr95810.c.jj 2020-06-22 11:49:03.666910264 
+0200
+++ gcc/testsuite/gcc.dg/ubsan/pr95810.c2020-06-22 11:48:55.057036313 
+0200
@@ -0,0 +1,13 @@
+/* PR middle-end/95810 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined -fno-sanitize-recover=undefined" } */
+
+int
+main ()
+{
+  int x = -__INT_MAX__ - 1;
+  x = (x <= 0 ? x : -x);
+  if (x != -__INT_MAX__ - 1)
+__builtin_abort ();
+  return 0;
+}

Jakub

[PATCH] Fortran : Bogus error with additional blanks in type(*) PR95829

2020-06-24 Thread Mark Eggleston


Please find a fix for PR95829.  Original patch by Steve Kargl posted to PR.

Commit to master and backport?

Commit message:

Fortran  : Bogus error with additional blanks in type(*) PR95829

Checking for "* ) " instead of "*)" clears the bogus error.

2020-06-24  Steven G. Kargl  

gcc/fortran/

    PR fortran/95829
    * decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
    of "*)".

2020-06-24  Mark Eggleston 

gcc/testsuite/

    PR fortran/95829
    * gfortran.dg/pr95829.f90: New test.

--
https://www.codethink.co.uk/privacy.html

>From 1da0ff935d819e7bdbac5219ccb7b30d70c6e87c Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Tue, 23 Jun 2020 11:01:28 +0100
Subject: [PATCH] Fortran  : Bogus error with additional blanks in type(*)
 PR95829

Checking for "* ) " instead of "*)" clears the bogus error.

2020-06-24  Steven G. Kargl  

gcc/fortran/

	PR fortran/95829
	* decl.c (gfc_match_decl_type_spec): Compare with "* ) " instead
	of "*)".

2020-06-24  Mark Eggleston  

gcc/testsuite/

	PR fortran/95829
	* gfortran.dg/pr95829.f90: New test.
---
 gcc/fortran/decl.c|  2 +-
 gcc/testsuite/gfortran.dg/pr95829.f90 | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr95829.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index c27cfacf2e4..0659cae3175 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -4128,7 +4128,7 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int implicit_flag)
   gfc_gobble_whitespace ();
   if (gfc_peek_ascii_char () == '*')
 	{
-	  if ((m = gfc_match ("*)")) != MATCH_YES)
+	  if ((m = gfc_match ("* ) ")) != MATCH_YES)
 	return m;
 	  if (gfc_comp_struct (gfc_current_state ()))
 	{
diff --git a/gcc/testsuite/gfortran.dg/pr95829.f90 b/gcc/testsuite/gfortran.dg/pr95829.f90
new file mode 100644
index 000..9ac3c1d5344
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95829.f90
@@ -0,0 +1,14 @@
+! {dg-do compile }
+!
+! Declaration of b used to be a bogus failure.
+
+subroutine s (a, b, c, d, e, f, g)
+  type(*) :: a
+  type(* ) :: b
+  type( *) :: c
+  type( * ) :: d
+  type(*  ) :: e
+  type(  *) :: f
+  type(  *  ) :: g
+end
+
-- 
2.11.0

[PATCH] Fortran : False positive for optional arguments PR95446

2020-06-24 Thread Mark Eggleston

Please find attached a fix for PR95446.  Patch originally posted to the 
PR by Steve Kargl.


OK to commit to master and backport?

Commit message:

Fortran  : False positive for optional arguments PR95446

Check that there is non-optional argument of the same rank in the
list of actual arguments.  If there is the warning is not required.

2020-06-24  Steven G. Kargl  

gcc/fortran/

    PR fortran/95446
    * resolve.c (resolve_elemental_actual): Add code to check for
    non-optional argument of the same rank.  Revise warning message
    to refer to the Fortran 2018 standard.

2020-06-24  Mark Eggleston 

gcc/testsuite/

    PR fortran/95446
    * gfortran.dg/elemental_optional_args_6.f90: Remove check
    for warnings that were erroneously output.
    * gfortran.dg/pr95446.f90: New test.


--
https://www.codethink.co.uk/privacy.html

>From 4ad64b418c93064cfdfd07fc8a9e6305d8cc68db Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Mon, 1 Jun 2020 14:56:00 +0100
Subject: [PATCH] Fortran  : False positive for optional arguments PR95446

Check that there is non-optional argument of the same rank in the
list of actual arguments.  If there is the warning is not required.

2020-06-24  Steven G. Kargl  

gcc/fortran/

	PR fortran/95446
	* resolve.c (resolve_elemental_actual): Add code to check for
	non-optional argument of the same rank.  Revise warning message
	to refer to the Fortran 2018 standard.

2020-06-24  Mark Eggleston  

gcc/testsuite/

	PR fortran/95446
	* gfortran.dg/elemental_optional_args_6.f90: Remove check
	for warnings that were erroneously output.
	* gfortran.dg/pr95446.f90: New test.
---
 gcc/fortran/resolve.c  | 28 
 .../gfortran.dg/elemental_optional_args_6.f90  |  4 +--
 gcc/testsuite/gfortran.dg/pr95446.f90  | 38 ++
 3 files changed, 62 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr95446.f90

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index aaee5eb6b9b..842fefcb4cd 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -2277,12 +2277,28 @@ resolve_elemental_actual (gfc_expr *expr, gfc_code *c)
 	  && (set_by_optional || arg->expr->rank != rank)
 	  && !(isym && isym->id == GFC_ISYM_CONVERSION))
 	{
-	  gfc_warning (OPT_Wpedantic,
-		   "%qs at %L is an array and OPTIONAL; IF IT IS "
-		   "MISSING, it cannot be the actual argument of an "
-		   "ELEMENTAL procedure unless there is a non-optional "
-		   "argument with the same rank (12.4.1.5)",
-		   arg->expr->symtree->n.sym->name, >expr->where);
+	  bool t = false;
+	  gfc_actual_arglist *a;
+
+	  /* Scan the argument list for a non-optional argument with the
+	 same rank as arg.  */
+	  for (a = arg0; a; a = a->next)
+	if (a != arg
+		&& a->expr->rank == arg->expr->rank
+		&& !a->expr->symtree->n.sym->attr.optional)
+	  {
+		t = true;
+		break;
+	  }
+
+	  if (!t)
+	gfc_warning (OPT_Wpedantic,
+			 "%qs at %L is an array and OPTIONAL; If it is not "
+			 "present, then it cannot be the actual argument of "
+			 "an ELEMENTAL procedure unless there is a non-optional"
+			 " argument with the same rank "
+			 "(Fortran 2018, 15.5.2.12)",
+			 arg->expr->symtree->n.sym->name, >expr->where);
 	}
 }
 
diff --git a/gcc/testsuite/gfortran.dg/elemental_optional_args_6.f90 b/gcc/testsuite/gfortran.dg/elemental_optional_args_6.f90
index c19c1df3e2b..56a9db56be2 100644
--- a/gcc/testsuite/gfortran.dg/elemental_optional_args_6.f90
+++ b/gcc/testsuite/gfortran.dg/elemental_optional_args_6.f90
@@ -21,8 +21,8 @@ contains
   integer, optional :: arg1(:)
   integer :: arg2(:)
 !  print *, fun1 (arg1, arg2)
-  if (size (fun1 (arg1, arg2)) /= 2) STOP 1 ! { dg-warning "is an array and OPTIONAL" }
-  if (any (fun1 (arg1, arg2) /= [1,2])) STOP 2 ! { dg-warning "is an array and OPTIONAL" }
+  if (size (fun1 (arg1, arg2)) /= 2) STOP 1
+  if (any (fun1 (arg1, arg2) /= [1,2])) STOP 2
end subroutine
 
elemental function fun1 (arg1, arg2)
diff --git a/gcc/testsuite/gfortran.dg/pr95446.f90 b/gcc/testsuite/gfortran.dg/pr95446.f90
new file mode 100644
index 000..86e1019d7af
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95446.f90
@@ -0,0 +1,38 @@
+! { dg-do compile }
+! { dg-options "-pedantic-errors" }
+!
+! Contributed by Martin Diehl  
+
+program elemental_optional
+  implicit none
+  integer :: m(5), r(5)
+
+  m = 1
+
+  r = outer()
+  r = outer(m)
+  
+  contains
+
+  function outer(o) result(l)
+integer, intent(in), optional :: o(:)
+integer :: u(5), l(5)
+
+l = inner(o,u)
+
+  end function outer
+
+  elemental function inner(a,b) result(x)
+integer, intent(in), optional :: a
+integer, intent(in) :: b
+integer :: x
+
+if(present(a)) then
+  x = a*b
+else
+  x = b
+endif
+  end function inner
+  
+end program elemental_optional
+
-- 
2.11.0

Re: [stage1][PATCH] Add gcc_assert that _options are not dirty modified.

2020-06-24 Thread Martin Liška


On 6/18/20 5:36 PM, Martin Liška wrote:

I'm going to add this to exception list.


Done here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548800.html

Martin

[PATCH] arc: add exceptions for PR92860.

2020-06-24 Thread Martin Liška


Hey.

The patch is about addition of some exceptions for arc target that
address:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92860#c26

It's again another example where optimization options influence target
options.

Ready for master?
Martin

gcc/ChangeLog:

PR tree-optimization/92860
* optc-save-gen.awk: Add exceptions for arc target.
---
 gcc/optc-save-gen.awk | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index 760bf26721a..ff173fee404 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -960,6 +960,10 @@ checked_options["flag_merge_constants"]++
 checked_options["param_max_fields_for_field_sensitive"]++
 checked_options["flag_omit_frame_pointer"]++
 checked_options["unroll_only_small_loops"]++
+# arc exceptions
+checked_options["TARGET_ALIGN_CALL"]++
+checked_options["TARGET_CASE_VECTOR_PC_RELATIVE"]++
+checked_options["arc_size_opt_level"]++
 
 for (i = 0; i < n_opts; i++) {

name = var_name(flags[i]);
--
2.27.0

Re: [PATCH] VEC_COND_EXPR: clean up first argument

2020-06-24 Thread Richard Biener via Gcc-patches

On Wed, Jun 24, 2020 at 9:21 AM Martin Liška  wrote:
>
> Hi.
>
> When expanding a VEC_COND_EXPR it happens that first argument (a SSA_NAME)
> that can be no longer used. When that happens we need to remove the SSA_NAME,
> otherwise we end up expanding it and for targets like s390x, there's no optab
> expansion. We need to remove them at both places as -O0 does not help us
> with a dead SSA_NAMEs after vector lowering.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> Cross-compiler for s390x was tested as well for problematic test-case.
>
> Ready to be installed?

Hmm, can you instead use simple_dce_from_worklist and simply
record all SSA_NAMEs you end up "forwarding" as possibly dead
in a bitmap?  At least that hashmap traversal looks dangerous
with respect to address-space randomization and gsi_remove
inserting debug stmts and thus eventually allocating debug decls.

Thanks,
Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR tree-optimization/95745
> PR middle-end/95830
> * gimple-isel.cc (gimple_expand_vec_cond_exprs): Delete dead
> SSA_NAMEs used as the first argument of a VEC_COND_EXPR.  Always
> return 0.
> * tree-vect-generic.c (expand_vector_condition): Remove dead
> SSA_NAMEs used as the firs argument of a VEC_COND_EXPR.
> ---
>   gcc/gimple-isel.cc  | 11 +--
>   gcc/tree-vect-generic.c |  9 -
>   2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index 97f92080503..25b893224af 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -178,7 +178,6 @@ gimple_expand_vec_cond_exprs (void)
>   {
> gimple_stmt_iterator gsi;
> basic_block bb;
> -  bool cfg_changed = false;
> hash_map vec_cond_ssa_name_uses;
>
> FOR_EACH_BB_FN (bb, cfun)
> @@ -196,7 +195,15 @@ gimple_expand_vec_cond_exprs (void)
> }
>   }
>
> -  return cfg_changed ? TODO_cleanup_cfg : 0;
> +  for (hash_map::iterator it = 
> vec_cond_ssa_name_uses.begin ();
> +   it != vec_cond_ssa_name_uses.end (); ++it)
> +if (num_imm_uses ((*it).first) == 0)
> +  {
> +   gsi = gsi_for_stmt (SSA_NAME_DEF_STMT ((*it).first));
> +   gsi_remove (, true);
> +  }
> +
> +  return 0;
>   }
>
>   namespace {
> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> index 83d399a7898..2479368743d 100644
> --- a/gcc/tree-vect-generic.c
> +++ b/gcc/tree-vect-generic.c
> @@ -954,10 +954,11 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
> tree comp_index = index;
> location_t loc = gimple_location (gsi_stmt (*gsi));
> tree_code code = TREE_CODE (a);
> +  gassign *assign = NULL;
>
> if (code == SSA_NAME)
>   {
> -  gassign *assign = dyn_cast (SSA_NAME_DEF_STMT (a));
> +  assign = dyn_cast (SSA_NAME_DEF_STMT (a));
> if (assign != NULL
>   && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) == 
> tcc_comparison)
> {
> @@ -1064,6 +1065,12 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
> constr = build_constructor (type, v);
> gimple_assign_set_rhs_from_tree (gsi, constr);
> update_stmt (gsi_stmt (*gsi));
> +
> +  if (a_is_comparison && num_imm_uses (gimple_assign_lhs (assign)) == 0)
> +{
> +  gimple_stmt_iterator gsi = gsi_for_stmt (assign);
> +  gsi_remove (, true);
> +}
>   }
>
>   static tree
> --
> 2.27.0
>

Re: [RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-24 Thread Richard Biener via Gcc-patches

On Wed, Jun 24, 2020 at 9:13 AM Richard Sandiford
 wrote:
>
> "Kewen.Lin"  writes:
> > Hi,
> >
> > When I am working on IFNs for vector with length, I noticed that the
> > current optab support query for mask_load/mask_store looks unexpected.
> > The mask_load/mask_store requires two modes for convert_optab query,
> > but the macros direct_mask_{load,store}_optab_supported_p uses
> > direct_optab_supported_p which asserts type pair should have the same mode.
> >
> > I'm not sure whether we have some special reason here or just a typo,
> > since everything goes well now, mask_{load,store} optab check is mainly
> > handled by can_vec_mask_load_store_p.
> >
> > But if we have some codes as below (eg: one checking for all IFNs finally)
> >
> >   tree_pair types = direct_internal_fn_types (ifn, call);
> >   if(direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED) ...
> >
> > It will cause ICE.
> >
> > Does it make sense to fix it?
> >
> > Thanks in advance!
> >
> > BR,
> > Kewen
> > -
> > gcc/ChangeLog:
> >
> >   * internal-fn.c (direct_mask_load_optab_supported_p): Use
> >   convert_optab_supported_p instead of direct_optab_supported_p.
> >   (direct_mask_store_optab_supported_p): Likewise.
>
> OK, thanks.
>
> I keep meaning to experiment with dropping the second mode from these optabs,
> since it should be uniquely determined by the first.  That would make things
> slightly simpler and more consistent with the other load/store IFNs.  But as
> usual I've never found the time.

AVX512 would have V16SImode and SImode because the mask would have
an integer mode?  Likewise I could imagine RISC-V using V4SImode and V4QImode
or however their mask registers look like.

Richard.

>
> Richard

Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-24 Thread Richard Sandiford

Jim Wilson  writes:
> On Tue, Jun 23, 2020 at 5:21 AM Richard Sandiford
>  wrote:
>> MVE and Power both set inactive lanes to zero.  But I'm not sure about RVV.
>> AIUI, for RVV the approach instead would be to reduce the effective vector
>> length for the final iteration of the vector loop, and I'm not sure
>> whether in that situation it makes sense to say that the other elements
>> still exist and are guaranteed to be zero.
>>
>> I'm the last person who should be speculating on that though.  Let's see
>> whether Jim has any comments.
>
> The RVV spec supports two policies for tail elements, i.e. elements
> beyond the current vector length.  They can be undisturbed or
> agnostic.  In the undisturbed case, the trail elements retain their
> old values.  In the agnostic case, the implementation can choose to
> either retain their old values, or set them to all ones, and this
> choice can be different from lane to lane.  The latter case is useful
> because registers may be wider than the execution unit, and current
> vector length may not be a multiple of the width of the execution
> unit.  So for instance if the vector registers can hold 8 elements,
> and the execution unit works on 4 elements at a time, and the current
> vector length is 2, then it might make sense to leave the last four
> elements unmodified to avoid an iteration across the registers, but
> the third and fourth elements might be set to all ones because you
> have to write to them anyways.  The choice is left up to the
> implementation because we have multiple parties designing vector
> units, and some are target for low cost embedded market, and some are
> target for high performance, and they couldn't agree on a single best
> way to implement this.  The software is expected to choose agnostic
> only if it doesn't care about what happens to tail elements, and
> undisturbed if you want to preserve them.  The value of all ones was
> chosen to discourage software developers from trying to use the values
> in tail elements.  The choice of undisturbed or agnostic can be
> changed every time you set the current vector length and type.
>
> In most cases, I think RVV programs will use agnostic for tail
> elements, since we can change the vector length at will, and it will
> be rare that we will care about elements beyond the current vector
> length.
>
> Tail elements can't cause exceptions so there is no need to worry
> about whether those elements hold valid values.

Thanks for the info.  Based on that, I guess GCC should leave the values
of extra inactive lanes undefined for now, so that the agnostic case
is supported.

Maybe in future we could have IFN_LEN_* versions of arithmetic
operations too, similar to the IFN_COND_* ones, so that they explicitly
ignore the inactive elements.

Richard

Re: [PATCH] Make contrib/download_prerequisites work on AIX and OpenBSD

2020-06-24 Thread Richard Biener via Gcc-patches

On Tue, Jun 23, 2020 at 10:37 PM Ilya Leoshkevich via Gcc-patches
 wrote:
>
> Hello,
>
> I needed to test [1] on AIX and OpenBSD and noticed
> download_prerequisites doesn't work there. The attached patch fixes
> it.
>
> OK for master?

OK if David acks the AIX part.

Thanks,
Richard.

> Best regards,
> Ilya
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548182.html
>
> ---
>
> contrib/ChangeLog:
>
> 2020-06-11  Ilya Leoshkevich  
>
> * download_prerequisites: Support AIX and OpenBSD unames.
> Pipe `{gzip,bzip2} -d` to `tar -xf -`.
> ---
>  contrib/download_prerequisites | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
> index aa0356e6266..da19913f9ab 100755
> --- a/contrib/download_prerequisites
> +++ b/contrib/download_prerequisites
> @@ -47,9 +47,12 @@ force=0
>  OS=$(uname)
>
>  case $OS in
> -  "Darwin"|"FreeBSD"|"DragonFly")
> +  "Darwin"|"FreeBSD"|"DragonFly"|"AIX")
>  chksum='shasum -a 512 --check'
>;;
> +  "OpenBSD")
> +chksum='sha512 -c'
> +  ;;
>*)
>  chksum='sha512sum -c'
>;;
> @@ -242,8 +245,19 @@ for ar in $(echo_archives)
>  do
>  package="${ar%.tar*}"
>  if [ ${force} -gt 0 ]; then rm -rf "${directory}/${package}"; fi
> +case $ar in
> +*.gz)
> +   uncompress='gzip -d'
> +   ;;
> +*.bz2)
> +   uncompress='bzip2 -d'
> +   ;;
> +*)
> +   uncompress='cat'
> +   ;;
> +esac
>  [ -e "${directory}/${package}" ] 
>  \
> -|| ( cd "${directory}" && tar -xf "${ar}" )  
>  \
> +|| ( cd "${directory}" && $uncompress <"${ar}" | tar -xf - ) 
>  \
>  || die "Cannot extract package from ${ar}"
>  unset package
>  done
> --
> 2.25.4
>

Re: [PATCH] x86: Fold arch_names_table into processor_alias_table

2020-06-24 Thread Uros Bizjak via Gcc-patches

[CC list removed, it is not shown in archives]

> In i386-builtins.c, arch_names_table is used to to map architecture name
> string to internal model.  A switch statement is used to map internal
> processor name to architecture name string and internal priority.
>
> model and priority are added to processor_alias_table so that a single
> entry contains architecture name string, internal processor name,
> internal model and internal priority.  6 entries are appended for
> i386-builtins.c, which have special architecture name strings: amd,
> amdfam10h, amdfam15h, amdfam17h, shanghai and istanbul, and pta_size is
> adjusted to exclude them.  Entries which are not used by i386-builtins.c

We already have an "amdfam10" entry, so I think these should later be
extended by AMD folks to include correct processor_type, schedule
model and flags. Having to manually exclude generic entries feels a
bit hackish, though, but the patch is certainly an improvement.

> have internal model 0.  P_PROC_DYNAMIC is added to internal priority to
> make entries with dynamic architecture name string or priority.

BTW: Can you please change P_ZERO to P_NONE?

> PR target/95842
> * common/config/i386/i386-common.c (processor_alias_table): Add
> processor model and priority to each entry.
> (pta_size): Updated with -6.
> (num_arch_names): New.
> * common/config/i386/i386-cpuinfo.h: New file.
> * config/i386/i386-builtins.c (feature_priority): Removed.
> (processor_model): Likewise.
> (_arch_names_table): Likewise.
> (arch_names_table): Likewise.
> (get_builtin_code_for_version): Use processor_alias_table.
> (fold_builtin_cpu): Replace arch_names_table with
> processor_alias_table.
> * config/i386/i386.h: Include "common/config/i386/i386-cpuinfo.h".
> (pta): Add model and priority.
> (num_arch_names): New.

LGTM.

Thanks,
Uros.

Re: [PATCH] Add TARGET_UPDATE_DECL_ALIGNMENT [PR95237]

2020-06-24 Thread Richard Biener via Gcc-patches

On Tue, Jun 23, 2020 at 5:31 PM Sunil K Pandey via Gcc-patches
 wrote:
>
> From: Sunil K Pandey 
>
> Default for this hook is NOP. For x86, in 32 bit mode, this hook
> sets alignment of long long on stack to 32 bits if preferred stack
> boundary is 32 bits.
>
>  - This patch fixes
> gcc.target/i386/pr69454-2.c
> gcc.target/i386/stackalign/longlong-1.c
>  - Regression test on x86-64, no new fail introduced.

I think the name is badly chosen, TARGET_LOWER_LOCAL_DECL_ALIGNMENT
would be better suited (and then asks for LOCAL_DECL_ALIGNMENT to be
renamed to INCREASE_LOCAL_DECL_ALIGNMENT).

You're calling it from do_type_align which IMHO is dangerous since that's
invoked from FIELD_DECL layout as well.  Instead invoke it from
layout_decl itself where we do

  if (code != FIELD_DECL)
/* For non-fields, update the alignment from the type.  */
do_type_align (type, decl);

and invoke the hook _after_ do_type_align.  Also avoid
invoking the hook on globals or hard regs and only
invoke it on VAR_DECLs, thus only

  if (VAR_P (decl) && !is_global_var (decl) && !DECL_HARD_REGISTER (decl))

Comments on the hook itself below.

> Tested on x86-64.
>
> gcc/ChangeLog:
>
> PR target/95237
> * config/i386/i386.c (ix86_update_decl_alignment): New
> function.
> (TARGET_UPDATE_DECL_ALIGNMENT): Define.
> * doc/tm.texi: Regenerate.
> * doc/tm.texi.in (TARGET_UPDATE_DECL_ALIGNMENT): New hook.
> * stor-layout.c (do_type_align): Call target hook to update
> decl alignment.
> * target.def (update_decl_alignment): New hook.
>
> gcc/testsuite/ChangeLog:
>
> PR target/95237
> * gcc.target/i386/pr95237-1.c: New test.
> * gcc.target/i386/pr95237-2.c: New test.
> * gcc.target/i386/pr95237-3.c: New test.
> * gcc.target/i386/pr95237-4.c: New test.
> * gcc.target/i386/pr95237-5.c: New test.
> ---
>  gcc/config/i386/i386.c| 22 ++
>  gcc/doc/tm.texi   |  5 +
>  gcc/doc/tm.texi.in|  2 ++
>  gcc/stor-layout.c |  2 ++
>  gcc/target.def|  7 +++
>  gcc/testsuite/gcc.target/i386/pr95237-1.c | 16 
>  gcc/testsuite/gcc.target/i386/pr95237-2.c | 10 ++
>  gcc/testsuite/gcc.target/i386/pr95237-3.c | 10 ++
>  gcc/testsuite/gcc.target/i386/pr95237-4.c | 10 ++
>  gcc/testsuite/gcc.target/i386/pr95237-5.c | 16 
>  10 files changed, 100 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-5.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 37aaa49996d..bcd9abd5303 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -16917,6 +16917,25 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
>
>return align;
>  }
> +
> +/* Implement TARGET_UPDATE_DECL_ALIGNMENT.  */
> +
> +static void
> +ix86_update_decl_alignment (tree decl)
> +{
> +  tree type = TREE_TYPE (decl);
> +
> +  if (cfun != NULL
> +  && !TARGET_64BIT
> +  && DECL_ALIGN (decl) == 64
> +  && ix86_preferred_stack_boundary < 64
> +  && !is_global_var (decl)
> +  && (DECL_MODE (decl) == E_DImode
> + || (type && TYPE_MODE (type) == E_DImode))
> +  && (!type || !TYPE_USER_ALIGN (type))
> +  && (!decl || !DECL_USER_ALIGN (decl)))

I'd simply do

   unsigned new_align = LOCAL_DECL_ALIGNMENT (decl);
   if (new_align < DECL_ALIGN (decl))
 SET_DECL_ALIGN (decl, new_align);

to avoid spreading the logic to multiple places.

Thanks,
Richard.

> +SET_DECL_ALIGN (decl, 32);
> +}
>
>  /* Find a location for the static chain incoming to a nested function.
> This is a register, unless all free registers are used by arguments.  */
> @@ -23519,6 +23538,9 @@ ix86_run_selftests (void)
>  #undef TARGET_CAN_CHANGE_MODE_CLASS
>  #define TARGET_CAN_CHANGE_MODE_CLASS ix86_can_change_mode_class
>
> +#undef TARGET_UPDATE_DECL_ALIGNMENT
> +#define TARGET_UPDATE_DECL_ALIGNMENT ix86_update_decl_alignment
> +
>  #undef TARGET_STATIC_RTX_ALIGNMENT
>  #define TARGET_STATIC_RTX_ALIGNMENT ix86_static_rtx_alignment
>  #undef TARGET_CONSTANT_ALIGNMENT
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 6e7d9dc54a9..c11ef5dca89 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -1086,6 +1086,11 @@ On 32-bit ELF the largest supported section alignment 
> in bits is
>  @samp{(0x8000 * 8)}, but this is not representable on 32-bit hosts.
>  @end defmac
>
> +@deftypefn {Target Hook} void TARGET_UPDATE_DECL_ALIGNMENT (tree @var{decl})
> +Define this hook to update alignment of decl
> +@samp{(@var{decl}}.
>

Re: [stage1][PATCH] Add gcc_assert that _options are not dirty modified.

2020-06-24 Thread Martin Liška


On 6/18/20 5:40 PM, Martin Liška wrote:

This is bogus as these are 2 strings that are equal. Let me fix it.


Hey.

Just for the report, this is fixed on master right now.

Martin

1 2 >

1 - 100 of 104 matches

Mail list logo