Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Bernd Edlinger



On 1/4/21 10:45 PM, Jeff Law wrote:
> 
> 
> On 1/4/21 1:06 PM, Bernd Edlinger wrote:
>> --- a/gcc/final.c
>> +++ b/gcc/final.c
>> @@ -1735,7 +1735,12 @@ final_start_function_1 (rtx_insn **firstp, FILE 
>> *file, int *seen,
>>   last_filename);
>>  
>>if (!dwarf2_debug_info_emitted_p (current_function_decl))
>> -dwarf2out_begin_prologue (0, 0, NULL);
>> +{
>> +  if (write_symbols == DWARF2_DEBUG)
>> +dwarf2out_begin_prologue (last_linenum, last_columnnum, last_filename);
>> +  else
>> +dwarf2out_begin_prologue (0, 0, NULL);
>> +}
> The only way you're getting into this code is for DWARF2_DEBUG and
> VMS_AND_DWARF2_DEBUG and in the latter case we want to make the same
> fix.  So drop the newly added conditional and just make the code
> something like this:
> 
> 
> if (!dwarf2_debug_info_emitted_p (current_function_decl))
>   dwarf2out_begin_prologue (last_linenum, last_columnnum, last_filename)
> 
> 

No, this block is entered iff
 ((write_symbols != DWARF2_DEBUG && write_symbols != VMS_AND_DWARF2_DEBUG)
  || DECL_IGNORED_P (current_function_decl))

so emitting .loc info here could easily break DBX, XCOFF, and VMS w/o DWARF
while I have no way to test anything for these debug formats.
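
The condition above can be restated as a small truth table.  What follows
is only a sketch in Python, not GCC code: the helper names and the string
constants are made up for illustration, and it assumes that
dwarf2_debug_info_emitted_p is true exactly when the debug format is one
of the two DWARF variants and the decl is not ignored.

```python
# Illustrative model only; the names below are hypothetical, not GCC identifiers.
DWARF_FORMATS = {"DWARF2_DEBUG", "VMS_AND_DWARF2_DEBUG"}


def dwarf2_debug_info_emitted(write_symbols, decl_ignored):
    """Assumed model of dwarf2_debug_info_emitted_p."""
    return write_symbols in DWARF_FORMATS and not decl_ignored


def prologue_fallback_taken(write_symbols, decl_ignored):
    """True when the `if (!dwarf2_debug_info_emitted_p (...))` block runs."""
    return not dwarf2_debug_info_emitted(write_symbols, decl_ignored)


# Print the full table: the block runs for every non-DWARF format,
# and for DWARF formats only when the decl is ignored.
for fmt in ("DWARF2_DEBUG", "VMS_AND_DWARF2_DEBUG", "DBX_DEBUG", "XCOFF_DEBUG"):
    for ignored in (False, True):
        print(fmt, ignored, prologue_fallback_taken(fmt, ignored))
```

In this model the block is reached for DBX and XCOFF regardless of the
decl, which is why passing real line information there unconditionally
could affect those formats.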


Bernd.


Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Richard Biener
On Wed, 6 Jan 2021, Bernd Edlinger wrote:

> On 1/6/21 8:01 AM, Alexandre Oliva wrote:
> > On Jan  5, 2021, Richard Biener  wrote:
> > 
> >> But isn't this a consumer issue then?  If there is no line info for
> >> a PC range then gdb shouldn't display any.
> > 
> > No, there *is* line info there, carried over from an earlier .loc
> > directive, as there isn't anything like ".noloc" to output with a
> > function that is not expected or supposed to have line number info.
> > 
> > Without that, the assembler just extends the previous .loc directive
> > onto the function.
> > 
> 
> Theoretically we could exclude the range of the no-loc function
> from the .debug_ranges, then gdb would not even step into the function.

I'd argue we're failing to emit a .endloc at the end of functions
(rather than issuing a .noloc at the start of functions with no 
locations).  I wonder if using a special file ID and switching to that
would be an effective workaround?  When gas is extended we could use
file ID zero for this (which gas currently rejects).

> However if we have at least a single line info as in the case of the thunks,
> then that would be better than nothing (what this patch does).

But the problem extends to functions which do not have any line, so what
line do you use in that case?

Richard.


Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Bernd Edlinger
On 1/6/21 8:01 AM, Alexandre Oliva wrote:
> On Jan  5, 2021, Richard Biener  wrote:
> 
>> But isn't this a consumer issue then?  If there is no line info for
>> a PC range then gdb shouldn't display any.
> 
> No, there *is* line info there, carried over from an earlier .loc
> directive, as there isn't anything like ".noloc" to output with a
> function that is not expected or supposed to have line number info.
> 
> Without that, the assembler just extends the previous .loc directive
> onto the function.
> 

Theoretically we could exclude the range of the no-loc function
from the .debug_ranges, then gdb would not even step into the function.

However if we have at least a single line info as in the case of the thunks,
then that would be better than nothing (what this patch does).


Bernd.


Re: Patch RFA: Support non-ASCII file names in git-changelog

2021-01-05 Thread Martin Liška

On 1/4/21 12:47 PM, Martin Liška wrote:

On 1/4/21 12:01 PM, Martin Liška wrote:

Anyway, I'm going to update server hook first and I'll create an issue for 
GitPython.


So I was not correct about this. The server hooks now also use GitPython
to identify modified files.

I've just created an issue for that:
https://github.com/gitpython-developers/GitPython/issues/1099


This one got fixed and it's present in the newly done release v3.1.12.

Anyway, I've got a workaround that I'm going to push.
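
For reference, the chain of conversions used by the decode_path helper in
the patch can be tried on its own.  The snippet below is a standalone
sketch that copies that logic; the sample path is the one quoted in the
patch comment, and 'b/plain.txt' is an invented unquoted path.

```python
def decode_path(path):
    # Same chain of conversions as the decode_path helper in the patch:
    # 1. drop the surrounding quotes,
    # 2. turn the octal escapes (\304\215) into code points 0xC4, 0x8D,
    # 3. re-encode those code points as raw bytes via latin-1,
    # 4. decode the raw bytes as UTF-8.
    if path.startswith('"') and path.endswith('"'):
        return (path.strip('"').encode('utf8').decode('unicode-escape')
                .encode('latin-1').decode('utf8'))
    return path


quoted = '"b/ko\\304\\215ka.txt"'   # what git prints when core.quotepath is true
print(decode_path(quoted))          # the decoded UTF-8 file name
print(decode_path('b/plain.txt'))   # unquoted paths pass through unchanged
```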

Martin



Martin


From ed9ffe47d6964dc92c92cfddbb8aac555c28e085 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 6 Jan 2021 08:11:57 +0100
Subject: [PATCH] gcc-changelog: workaround for utf8 filenames

contrib/ChangeLog:

	* gcc-changelog/git_commit.py: Add decode_path function.
	* gcc-changelog/git_email.py: Use it in order to solve
	utf8 encoding filename issues.
	* gcc-changelog/git_repository.py: Likewise.
	* gcc-changelog/test_email.py: Test it.
---
 contrib/gcc-changelog/git_commit.py | 26 +
 contrib/gcc-changelog/git_email.py  |  6 +++---
 contrib/gcc-changelog/git_repository.py |  6 +++---
 contrib/gcc-changelog/test_email.py |  3 ++-
 4 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index d2e5dbe294a..ee1973371be 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -174,6 +174,24 @@ REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
 DATE_FORMAT = '%Y-%m-%d'
 
 
+def decode_path(path):
+# When core.quotepath is true (default value), utf8 chars are encoded like:
+# "b/ko\304\215ka.txt"
+#
+# The upstream bug is fixed:
+# https://github.com/gitpython-developers/GitPython/issues/1099
+#
+# but we still need a workaround for older versions of the library.
+# Please take a look at the explanation of the transformation:
+# https://stackoverflow.com/questions/990169/how-do-convert-unicode-escape-sequences-to-unicode-characters-in-a-python-string
+
+if path.startswith('"') and path.endswith('"'):
+return (path.strip('"').encode('utf8').decode('unicode-escape')
+.encode('latin-1').decode('utf8'))
+else:
+return path
+
+
 class Error:
 def __init__(self, message, line=None):
 self.message = message
@@ -303,14 +321,6 @@ class GitCommit:
  'separately from normal commits'))
 return
 
-# check for an encoded utf-8 filename
-hint = 'git config --global core.quotepath false'
-for modified, _ in self.info.modified_files:
-if modified.startswith('"') or modified.endswith('"'):
-self.errors.append(Error('Quoted UTF8 filename, please set: '
- f'"{hint}"', modified))
-return
-
 all_are_ignored = (len(project_files) + len(ignored_files)
== len(self.info.modified_files))
 self.parse_lines(all_are_ignored)
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 5b53ca4a6a9..00ad00458f4 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -22,7 +22,7 @@ from itertools import takewhile
 
 from dateutil.parser import parse
 
-from git_commit import GitCommit, GitInfo
+from git_commit import GitCommit, GitInfo, decode_path
 
 from unidiff import PatchSet, PatchedFile
 
@@ -52,8 +52,8 @@ class GitEmail(GitCommit):
 modified_files = []
 for f in diff:
 # Strip "a/" and "b/" prefixes
-source = f.source_file[2:]
-target = f.target_file[2:]
+source = decode_path(f.source_file)[2:]
+target = decode_path(f.target_file)[2:]
 
 if f.is_added_file:
 t = 'A'
diff --git a/contrib/gcc-changelog/git_repository.py b/contrib/gcc-changelog/git_repository.py
index 8edcff91ad6..a0e293d756d 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -26,7 +26,7 @@ except ImportError:
 print('  Debian, Ubuntu: python3-git')
 exit(1)
 
-from git_commit import GitCommit, GitInfo
+from git_commit import GitCommit, GitInfo, decode_path
 
 
 def parse_git_revisions(repo_path, revisions, strict=True):
@@ -51,11 +51,11 @@ def parse_git_revisions(repo_path, revisions, strict=True):
 # Consider that renamed files are two operations:
 # the deletion of the original name
 # and the addition of the new one.
-modified_files.append((file.a_path, 'D'))
+modified_files.append((decode_path(file.a_path), 'D'))
 t = 'A'
 else:
 t = 'M'
-modified_files.append((file.b_path, t))
+

Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread abebeos via Gcc-patches
On Tue, 5 Jan 2021 at 21:25, Rainer Orth 
wrote:

> Hi Jeff,
>
> > On 1/5/21 10:54 AM, Rainer Orth wrote:
> >>
> >> I fear I'm a bit lost here myself.  I do have a little experience
> >> running various builders:
> >>
> >> * I inherited a Golang one on Solaris/amd64 (based on their own builder
> >>   infrastructure).
> >>
> >> * I do run builders for GDB (mostly dormant since Sergio left RedHat)
> >>   and LLVM on Solaris/amd64 and sparcv9 (both using buildbot).
> >>
> >> In all three cases the projects provide documentation how to configure
> >> your own builders and add them to the infrastructure.  Is something like
> >> this possible for the GCC Jenkins (say adding Solaris builders) and if
> >> so how?  Or would one need to setup one's own instance, in which case it
> >> would be extremely helpful to learn the necessary config: doing
> >> something like this from scratch is a major effort, as seen in Paul
> >> Matos' effort (also buildbot-based) of a couple of years ago.
> > We don't have any procedures in place for this (yet).  I'd like to add
> > them, but I'm swamped.
>
> understood.  Often it's easier for an outsider to document a procedure
> since he's certain to stumble across every possible roadblock someone
> familiar with the system has long forgotten about.
>

There are many roadblocks and barriers, and newcomers (outsiders) who
complain are often even attacked by the regulars, along the lines of "it's
just opening bash, do this, that, then that that that, pick this, ready".

@steering committee

Consider turning the gcc-jenkins setup into an open project (repository,
the usual patch-update processes).


> > I'm certainly open to having others contribute here.  As a long standing
> > member of the community I'd be happy to set up an account for you so you
> > could wire in a sparc/solaris system executor and set up the build
> scripts.
>

Do I need an account to look at how things work there?

I am unable to find even one script from the jenkins UI.

Any direct link?


> That would be nice.  Although my current manual daily regtests do help
> and a considerable part of the work is investigating and reporting
> failures found, any automatism takes part of the legwork.
>
> Rainer
>
> --
>
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>


Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Alexandre Oliva
On Jan  5, 2021, Richard Biener  wrote:

> But isn't this a consumer issue then?  If there is no line info for
> a PC range then gdb shouldn't display any.

No, there *is* line info there, carried over from an earlier .loc
directive, as there isn't anything like ".noloc" to output with a
function that is not expected or supposed to have line number info.

Without that, the assembler just extends the previous .loc directive
onto the function.
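
The carry-over can be illustrated with a toy model of how a line table
is built from an instruction stream.  This is a sketch in Python, not
how gas is implemented; the tuple format and the labels "foo+0",
"thunk+0" etc. are invented for the example.

```python
# Toy model of line-table construction; this is not gas.
def build_line_table(stream):
    """Map each instruction to the line of the most recent .loc."""
    current = None
    table = []
    for kind, value in stream:
        if kind == "loc":
            current = value          # a .loc stays in effect until the next one
        elif kind == "insn":
            table.append((value, current))
    return table


stream = [
    ("loc", 10), ("insn", "foo+0"), ("insn", "foo+4"),
    # A thunk emitted without any .loc of its own follows foo.
    ("insn", "thunk+0"), ("insn", "thunk+4"),
]
table = build_line_table(stream)
print(table)
```

In this model the thunk's instructions are attributed to line 10, the
last line of the preceding function, because nothing resets the current
location between the two.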

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

2021-01-05 Thread Hongtao Liu via Gcc-patches
Hi:
  ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
for vector comparison to vector mask, but ix86_expand_sse_cmp (which is
called by both of these functions) may return an integer mask whenever
an integer mask is available, so convert the integer mask back to a
vector mask if needed.
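
As a rough picture of that conversion, here is a toy model in Python.
It is not the GCC implementation: the function name is made up, and it
only mirrors the idea of selecting all-ones or zero per element, as the
ix86_expand_sse_movcc call with CONSTM1_RTX and CONST0_RTX does in the
patch.

```python
# Toy model: expand a k-mask style integer into a per-element vector mask.
def int_mask_to_vector_mask(mask, nelts):
    """Element i becomes -1 (all ones) if bit i of mask is set, else 0."""
    return [-1 if (mask >> i) & 1 else 0 for i in range(nelts)]


# Bits 1 and 3 are set, so elements 1 and 3 become all ones.
print(int_mask_to_vector_mask(0b1010, 4))   # [0, -1, 0, -1]
```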

gcc/ChangeLog:

PR target/98537
* config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
When cmp is integer mask, convert it to vector.
(ix86_expand_int_vec_cmp): Ditto.

gcc/testsuite/ChangeLog:

PR target/98537
* g++.target/i386/avx512bw-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-1.C: New test.
* g++.target/i386/avx512vl-pr98537-2.C: New test.

--
BR,
Hongtao
From f7c8341793639c401199d5029053244cd7e5f828 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Wed, 6 Jan 2021 11:24:00 +0800
Subject: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in
 ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

gcc/ChangeLog:

	PR target/98537
	* config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
	When cmp is integer mask, convert it to vector.
	(ix86_expand_int_vec_cmp): Ditto.

gcc/testsuite/ChangeLog:

	PR target/98537
	* g++.target/i386/avx512bw-pr98537-1.C: New test.
	* g++.target/i386/avx512vl-pr98537-1.C: New test.
	* g++.target/i386/avx512vl-pr98537-2.C: New test.
---
 gcc/config/i386/i386-expand.c | 28 +++--
 .../g++.target/i386/avx512bw-pr98537-1.C  | 11 +
 .../g++.target/i386/avx512vl-pr98537-1.C  | 40 +++
 .../g++.target/i386/avx512vl-pr98537-2.C  |  8 
 4 files changed, 84 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512vl-pr98537-2.C

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6e08fd32726..c879953b023 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3991,6 +3991,7 @@ bool
 ix86_expand_fp_vec_cmp (rtx operands[])
 {
   enum rtx_code code = GET_CODE (operands[1]);
+  machine_mode dest_mode = GET_MODE (operands[0]);
   rtx cmp;
 
   code = ix86_prepare_sse_fp_compare_args (operands[0], code,
@@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
 cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
 			   operands[1], operands[2]);
 
-  if (operands[0] != cmp)
-emit_move_insn (operands[0], cmp);
+if (operands[0] != cmp)
+{
+  if (GET_MODE (cmp) == dest_mode)
+	emit_move_insn (operands[0], cmp);
+  else
+	{
+	  gcc_assert (ix86_valid_mask_cmp_mode (dest_mode));
+	  ix86_expand_sse_movcc (operands[0], cmp,
+ CONSTM1_RTX (dest_mode),
+ CONST0_RTX (dest_mode));
+	}
+}
 
   return true;
 }
@@ -4286,6 +4297,7 @@ bool
 ix86_expand_int_vec_cmp (rtx operands[])
 {
   rtx_code code = GET_CODE (operands[1]);
+  machine_mode dest_mode = GET_MODE (operands[0]);
   bool negate = false;
   rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
  operands[3], NULL, NULL, &negate);
@@ -4301,7 +4313,17 @@ ix86_expand_int_vec_cmp (rtx operands[])
   gcc_assert (!negate);
 
   if (operands[0] != cmp)
-emit_move_insn (operands[0], cmp);
+{
+  if (GET_MODE (cmp) == dest_mode)
+	emit_move_insn (operands[0], cmp);
+  else
+	{
+	  gcc_assert (ix86_valid_mask_cmp_mode (dest_mode));
+	  ix86_expand_sse_movcc (operands[0], cmp,
+ CONSTM1_RTX (dest_mode),
+ CONST0_RTX (dest_mode));
+	}
+}
 
   return true;
 }
diff --git a/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
new file mode 100644
index 000..969684a222b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512bw-pr98537-1.C
@@ -0,0 +1,11 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#define TYPEV char
+#define TYPEW short
+
+#define T_ARR		\
+  __attribute__ ((target ("avx512vl,avx512bw")))
+
+#include "avx512vl-pr98537-1.C"
diff --git a/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
new file mode 100644
index 000..b2ba9da
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/avx512vl-pr98537-1.C
@@ -0,0 +1,40 @@
+/* PR target/98537 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64 -std=c++11" } */
+
+#ifndef TYPEV
+#define TYPEV int
+#endif
+
+#ifndef TYPEW
+#define TYPEW long long
+#endif
+
+#ifndef T_ARR
+#define T_ARR	\
+  __attribute__ ((target ("avx512vl")))
+#endif
+
+typedef TYPEV V __attribute__((__vector_size__(32)));
+typedef TYPEW W __attribute__((__vector_size__(32)));
+
+W c, d;
+struct B {};
+B e;
+struct C { W i; };
+void foo (C);
+
+C
+operator== (B, B)
+{
+  W r = (V)c == (V)d;
+  return {r};
+}
+
+void
+T_ARR
+bar ()
+{
+  B a;
+  foo (a == e);
+}
diff --git 

Re: [PATCH][AVX512]Lower AVX512 vector compare to AVX version when dest is vector

2021-01-05 Thread Hongtao Liu via Gcc-patches
> >>
> >> Note there's a data dependency between them.  insn 7 feeds insn 9.  When
> >> there's a data dependency, combiner patterns are usually the better
> >> choice than peepholes.  I think you'd be looking to match something
> >> like this (from the .combine dump):
> >>

This version uses combiner patterns; the details are discussed in PR98348.

Bootstrapped and regtested on x86_64-linux-gnu{-m32,} for both GCC10 and trunk.
gcc/ChangeLog:

PR target/96891
PR target/98348
* config/i386/sse.md (VI_128_256): New mode iterator.
(*avx_cmp3_1, *avx_cmp3_2, *avx_cmp3_3,
 *avx_cmp3_4, *avx2_eq3, *avx2_pcmp3_1,
 *avx2_pcmp3_2, *avx2_gt3): New
define_insn_and_split to lower avx512 vector comparison to avx
version when dest is vector.
(*_cmp3,*_cmp3,*_ucmp3):
define_insn_and_split for negating the comparison result.
* config/i386/predicates.md (float_vector_all_ones_operand):
New predicate.
* config/i386/i386-expand.c (ix86_expand_sse_movcc): Use
general NOT operator without UNSPEC_MASKOP.

gcc/testsuite/ChangeLog:

PR target/96891
PR target/98348
* gcc.target/i386/avx512bw-pr96891-1.c: New test.
* gcc.target/i386/avx512f-pr96891-1.c: New test.
* gcc.target/i386/avx512f-pr96891-2.c: New test.
* gcc.target/i386/avx512f-pr96891-3.c: New test.
* g++.target/i386/avx512f-pr96891-1.C: New test.
* gcc.target/i386/bitwise_mask_op-3.c: Adjust testcase.

>
> Jeff
>



--
BR,
Hongtao
From 240c830b3d35f7571da876a21aa71e263c3abe80 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Fri, 18 Dec 2020 15:56:06 +0800
Subject: [PATCH] Lower AVX512 vector comparison to AVX version when dest is
 vector.

gcc/ChangeLog:

	PR target/96891
	PR target/98348
	* config/i386/sse.md (VI_128_256): New mode iterator.
	(*avx_cmp3_1, *avx_cmp3_2, *avx_cmp3_3,
	 *avx_cmp3_4, *avx2_eq3, *avx2_pcmp3_1,
	 *avx2_pcmp3_2, *avx2_gt3): New
	define_insn_and_split to lower avx512 vector comparison to avx
	version when dest is vector.
	(*_cmp3,*_cmp3,*_ucmp3):
	define_insn_and_split for negating the comparison result.
	* config/i386/predicates.md (float_vector_all_ones_operand):
	New predicate.
	* config/i386/i386-expand.c (ix86_expand_sse_movcc): Use
	general NOT operator without UNSPEC_MASKOP.

gcc/testsuite/ChangeLog:

	PR target/96891
	PR target/98348
	* gcc.target/i386/avx512bw-pr96891-1.c: New test.
	* gcc.target/i386/avx512f-pr96891-1.c: New test.
	* gcc.target/i386/avx512f-pr96891-2.c: New test.
	* gcc.target/i386/avx512f-pr96891-3.c: New test.
	* g++.target/i386/avx512f-pr96891-1.C: New test.
	* gcc.target/i386/bitwise_mask_op-3.c: Adjust testcase.
---
 gcc/config/i386/i386-expand.c |  14 +-
 gcc/config/i386/predicates.md |  47 
 gcc/config/i386/sse.md| 261 +-
 .../g++.target/i386/avx512f-pr96891-1.C   |  37 +++
 .../gcc.target/i386/avx512bw-pr96891-1.c  |  75 +
 .../gcc.target/i386/avx512f-pr96891-1.c   |  40 +++
 .../gcc.target/i386/avx512f-pr96891-2.c   |  30 ++
 .../gcc.target/i386/avx512f-pr96891-3.c   |  39 +++
 .../gcc.target/i386/bitwise_mask_op-3.c   |   1 -
 9 files changed, 531 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx512f-pr96891-1.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-pr96891-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-pr96891-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-pr96891-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-pr96891-3.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 6e08fd32726..b4f8b275718 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3568,17 +3568,11 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
 		  ? force_reg (mode, op_false) : op_false);
   if (op_true == CONST0_RTX (mode))
 	{
-	  rtx (*gen_not) (rtx, rtx);
-	  switch (cmpmode)
-	{
-	case E_QImode: gen_not = gen_knotqi; break;
-	case E_HImode: gen_not = gen_knothi; break;
-	case E_SImode: gen_not = gen_knotsi; break;
-	case E_DImode: gen_not = gen_knotdi; break;
-	default: gcc_unreachable ();
-	}
 	  rtx n = gen_reg_rtx (cmpmode);
-	  emit_insn (gen_not (n, cmp));
+	  if (cmpmode == E_DImode && !TARGET_64BIT)
+	emit_insn (gen_knotdi (n, cmp));
+	  else
+	emit_insn (gen_rtx_SET (n, gen_rtx_fmt_e (NOT, cmpmode, cmp)));
 	  cmp = n;
 	  /* Reverse op_true op_false.  */
 	  std::swap (op_true, op_false);
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index be5aaa4d76f..0bb0729e933 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1069,6 +1069,53 @@ (define_predicate "zero_extended_scalar_load_operand"
   return true;
 })
 
+/* Return true if operand is a float vector constant that is all ones. */

Re: [PATCH] ira: Skip some pseudos in move_unallocated_pseudos

2021-01-05 Thread Kewen.Lin via Gcc-patches
on 2021/1/6 上午2:19, Jeff Law wrote:
> 
> 
> On 1/4/21 7:36 PM, Kewen.Lin wrote:
>> Hi Jeff,
>>
>> on 2021/1/5 上午7:13, Jeff Law wrote:
>>>
>>> On 12/22/20 11:40 PM, Kewen.Lin via Gcc-patches wrote:
 Hi Segher,

 on 2020/12/22 下午9:55, Segher Boessenkool wrote:
> Hi!
>
> Just a dumb formatting comment:
>
> On Tue, Dec 22, 2020 at 04:05:39PM +0800, Kewen.Lin wrote:
>> This patch is to make move_unallocated_pseudos consistent
>> to what we have in function find_moveable_pseudos, where we
>> record the original pseudo into pseudo_replaced_reg only if
>> validate_change succeeds with newreg.  To ensure every
>> unallocated pseudo in move_unallocated_pseudos has expected
>> information, it's better to add a check and skip it if it's
>> unexpected.  This avoids possible ICEs in future.
>>
>> btw, I happened to find this in the bootstrapping for one
>> experimental local patch, which is considered as impractical.
>> --- a/gcc/ira.c
>> +++ b/gcc/ira.c
>> @@ -5111,6 +5111,11 @@ move_unallocated_pseudos (void)
>>{
>>  int idx = i - first_moveable_pseudo;
>>  rtx other_reg = pseudo_replaced_reg[idx];
>> +/* If there is no appropriate pseudo in pseudo_replaced_reg, it
>> +   means validate_change fails for this new pseudo in function
>> +   find_moveable_pseudos, then bypass it here.*/
> Dot space space.
 Good catch, thanks!  I forgot to reformat after polishing the comments.
 Will fix it with other potential comments.

> The patch sounds fine to me.  Hard to tell without seeing the patch that
> exposed the problem (for onlookers like me who do not know this code
> well, anyway ;-) )
 The patch which made this issue exposed looks like:

 +; Like *rotl3_insert_3 but work with nonzero_bits rather than
 +; explicit AND.
 +(define_insn "*rotl3_insert_8"
 +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 +(ior:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
 + (match_operand:SI 2 "u6bit_cint_operand" 
 "n"))
 + (match_operand:GPR 3 "gpc_reg_operand" "0")))]
 +  "HOST_WIDE_INT_1U << INTVAL (operands[2])
 +   > nonzero_bits (operands[3], mode)"
 +{
 +  if (mode == SImode)
 +return "rlwimi %0,%1,%h2,0,31-%h2";
 +  else
 +return "rldimi %0,%1,%H2,0";
 +}
 +  [(set_attr "type" "insert")])

 Some insn matches this pattern in combine, later ira tries to introduce
 one new pseudo since it meets the checks in find_moveable_pseudos, but
 it fails in the call to validate_change since the nonzero_bits is more
 rough and can't satisfy the pattern condition, leaving the unexpected
 entry in pseudo_replaced_reg.
>>> But what doesn't make any sense to me is pseudo_replaced_reg[] is only
>>> set when validation is successful in find_moveable_pseudos.   So I can't
>>> see how this patch actually helps the problem you're describing.
>>>
>> Yeah, pseudo_replaced_reg[] is only set when validation is successful,
>> but we bump the max pseudo number in ira_create_new_reg as below
>> regardless of whether validation succeeds or not:
>>
>>rtx newreg = ira_create_new_reg (def_reg);
>>if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0))
>>
>> Later in move_unallocated_pseudos, the iterating could cover those
>> pseudos which were created but not used due to failed validation.
>>
>>   for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++)
>> if (reg_renumber[i] < 0)
>>   {
>>  int idx = i - first_moveable_pseudo;
>>  rtx other_reg = pseudo_replaced_reg[idx];// (1)
>>  rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i));
>>  /* The use must follow all definitions of OTHER_REG, so we can
>> insert the new definition immediately after any of them.  */
>>  df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg))
>>
>> Then we can get the NULL other_reg in (1), also have unexpected df info
>> which causes ICE.  The patch skips the handlings on those pseudos which
>> were intended to be used in validation INSN but failed to.
> I was wondering if it was somehow related to creation of new pseudos. 
> The other important tidbit here is we reset last_moveable_pseudo near the
> end of find_moveable_pseudos.

Yeah, the iteration scans all new pseudos created in find_moveable_pseudos,
and the problem occurs on the ones that failed to validate.
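
To make the failure mode concrete, here is a toy model of the two
functions in Python.  It is only a sketch of the bookkeeping described
above, not the IRA code: the list `replaced` stands in for
pseudo_replaced_reg, and the register names and the validation
predicate are invented.

```python
# Toy model of the two IRA steps; the data structures are invented for illustration.
def find_moveable(defs, validate):
    """Create one new pseudo per def; record the original only on success."""
    replaced = []                    # plays the role of pseudo_replaced_reg
    for original in defs:
        # A slot (new pseudo number) is consumed whether or not validation succeeds.
        replaced.append(original if validate(original) else None)
    return replaced


def move_unallocated(replaced):
    """Process only slots with a recorded original, skipping the failed ones."""
    moved = []
    for idx, other_reg in enumerate(replaced):
        if other_reg is None:        # the kind of check the patch adds
            continue
        moved.append((idx, other_reg))
    return moved


# Validation fails for r101, so its slot stays empty and is skipped later.
replaced = find_moveable(["r100", "r101", "r102"], lambda r: r != "r101")
print(move_unallocated(replaced))    # [(0, 'r100'), (2, 'r102')]
```

Without the None check, the second loop would dereference the empty
slot, which corresponds to the NULL other_reg at (1).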

> OK for the trunk with an expanded comment.

Thanks!  Does the attached new version look good to you?

BR,
Kewen
diff --git a/gcc/ira.c b/gcc/ira.c
index 89b5df4003d..58c1efe54b5 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5111,6 +5111,15 @@ move_unallocated_pseudos (void)
   {
int idx = i - first_moveable_pseudo;
rtx other_reg = pseudo_replaced_reg[idx];
+   

[committed] analyzer: fix false leaks when writing through unknown ptrs [PR97072]

2021-01-05 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-6497-gac3966e315ada63eb379d560a012fa77c3909155.

gcc/analyzer/ChangeLog:
PR analyzer/97072
* region-model-reachability.cc (reachable_regions::init_cluster):
Convert symbolic region handling to a switch statement.  Add cases
to handle SK_UNKNOWN and SK_CONJURED.

gcc/testsuite/ChangeLog:
PR analyzer/97072
* gcc.dg/analyzer/pr97072.c: New test.
---
 gcc/analyzer/region-model-reachability.cc | 36 +--
 gcc/testsuite/gcc.dg/analyzer/pr97072.c   |  9 ++
 2 files changed, 36 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr97072.c

diff --git a/gcc/analyzer/region-model-reachability.cc 
b/gcc/analyzer/region-model-reachability.cc
index daf785254ea..a988ffc1439 100644
--- a/gcc/analyzer/region-model-reachability.cc
+++ b/gcc/analyzer/region-model-reachability.cc
@@ -88,20 +88,38 @@ reachable_regions::init_cluster (const region *base_reg)
   if (m_store->escaped_p (base_reg))
 add (base_reg, true);
 
-  /* If BASE_REG is *INIT_VAL(REG) for some other REG, see if REG is
- unbound and untouched.  If so, then add BASE_REG as a root.  */
   if (const symbolic_region *sym_reg = base_reg->dyn_cast_symbolic_region ())
 {
   const svalue *ptr = sym_reg->get_pointer ();
-  if (const initial_svalue *init_sval = ptr->dyn_cast_initial_svalue ())
+  switch (ptr->get_kind ())
{
- const region *init_sval_reg = init_sval->get_region ();
- const region *other_base_reg = init_sval_reg->get_base_region ();
- const binding_cluster *other_cluster
-   = m_store->get_cluster (other_base_reg);
- if (other_cluster == NULL
- || !other_cluster->touched_p ())
+   default:
+ break;
+   case SK_INITIAL:
+ {
+   /* If BASE_REG is *INIT_VAL(REG) for some other REG, see if REG is
+  unbound and untouched.  If so, then add BASE_REG as a root.  */
+   const initial_svalue *init_sval
+ = as_a  (ptr);
+   const region *init_sval_reg = init_sval->get_region ();
+   const region *other_base_reg = init_sval_reg->get_base_region ();
+   const binding_cluster *other_cluster
+ = m_store->get_cluster (other_base_reg);
+   if (other_cluster == NULL
+   || !other_cluster->touched_p ())
+ add (base_reg, true);
+ }
+ break;
+
+   case SK_UNKNOWN:
+   case SK_CONJURED:
+ {
+   /* If this cluster is due to dereferencing an unknown/conjured
+  pointer, any values written through the pointer could still
+  be live.  */
add (base_reg, true);
+ }
+ break;
}
 }
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr97072.c 
b/gcc/testsuite/gcc.dg/analyzer/pr97072.c
new file mode 100644
index 000..40241248884
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr97072.c
@@ -0,0 +1,9 @@
+void unknown_fn_1 (void *);
+
+void test_1 (int co, int y)
+{
+  void *p = __builtin_malloc (1024);
+  void **q;
+  unknown_fn_1 (&q);
+  *q = p;
+}
-- 
2.26.2



[committed] analyzer: add regression test for PR 98073

2021-01-05 Thread David Malcolm via Gcc-patches
This ICE was fixed by r11-2694-g808f4dfeb3a95f50 (aka the big state
rewrite for GCC 11).

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-6496-g23fc2be633c61f24a4fbd4096c669e7147ca44ae.

gcc/testsuite/ChangeLog:
PR analyzer/98073
* gcc.dg/analyzer/pr98073.c: New test.
---
 gcc/testsuite/gcc.dg/analyzer/pr98073.c | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98073.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr98073.c 
b/gcc/testsuite/gcc.dg/analyzer/pr98073.c
new file mode 100644
index 000..abbda09bf99
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr98073.c
@@ -0,0 +1,13 @@
+struct ist {
+  char ptr;
+  long len;
+} __trans_tmp_1, http_update_host_authority;
+int http_update_host_sl_0_0_0;
+void http_update_host(const struct ist uri) {
+  uri.len || uri.ptr;
+  if (http_update_host_sl_0_0_0) {
+http_update_host_authority = __trans_tmp_1;
+!http_update_host_authority.len;
+  } else
+http_update_host_authority = uri;
+}
-- 
2.26.2



[committed] analyzer: remove xfail [PR98223]

2021-01-05 Thread David Malcolm via Gcc-patches
The bogus leak message went away after
fcae5121154d1c3382b056bcc2c563cedac28e74 (aka "Hybrid EVRP and
testcases") due to that patch improving a phi node in the gimple input
to the analyzer.

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-6495-gdf1eba3ceada6e8990c00ccfa6c5a2c9b1c13334.

gcc/testsuite/ChangeLog:
PR analyzer/98223
* gcc.dg/analyzer/pr94851-1.c: Remove xfail.
---
 gcc/testsuite/gcc.dg/analyzer/pr94851-1.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
index da79652c570..34960e264cd 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
@@ -40,8 +40,7 @@ int pamark(void) {
   last->m_next = p;
   }
 
-  p->m_name = (char)c; /* { dg-bogus "leak of 'p'" "bogus leak" { xfail *-*-* 
} } */
-  // TODO(xfail): related to PR analyzer/97072 and PR analyzer/97074
+  p->m_name = (char)c; /* { dg-bogus "leak of 'p'" "bogus leak" } */
 
   return 1;
 }
-- 
2.26.2



[PATCH] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-01-05 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on x86_64-redhat-linux.  I also built
cross-compilers for arm-linux-gnueabi, cris-elf, mn10300-elf,
nds32-linux-gnu, pdp11-aout (didn't fully work due to
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg251887.html,
but the changed code compiled fine), powerpc-linux-gnu, vax-linux-gnu
and visium-elf, but didn't test them.  I ran into this issue while
implementing TARGET_MD_ASM_ADJUST for s390.  Ok for master?



If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
should be ok as long as the hook itself as well as after_md_seq make up
for it), input_mode will contain stale information.

It might be tempting to fix this by removing input_mode altogether and
just using GET_MODE (), but this will not work correctly with constants.
So add input_modes parameter and document that it should be updated
whenever inputs parameter is updated.
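
The point about constants can be shown with a toy model.  This is a
sketch in Python, not GCC code: the operand dictionaries, get_mode and
the hook body are all invented for illustration.  It relies only on the
fact that integer constants carry no mode of their own (VOIDmode), so
the intended mode has to live in a separate list that the hook keeps in
sync with the inputs.

```python
# Toy model; the mode strings and the hook are stand-ins, not GCC APIs.
VOIDMODE = "VOIDmode"


def get_mode(operand):
    """Registers carry a mode; integer constants are modeless."""
    return operand["mode"] if operand["kind"] == "reg" else VOIDMODE


def md_asm_adjust(inputs, input_modes):
    """Hypothetical hook: widen SImode inputs to DImode, updating both lists."""
    for i, mode in enumerate(input_modes):
        if mode == "SImode":
            if inputs[i]["kind"] == "reg":
                inputs[i] = {"kind": "reg", "mode": "DImode"}
            input_modes[i] = "DImode"


inputs = [{"kind": "reg", "mode": "SImode"}, {"kind": "const", "value": 42}]
input_modes = ["SImode", "SImode"]
md_asm_adjust(inputs, input_modes)

# The separate list knows the constant's intended mode; get_mode does not.
print(input_modes)                        # ['DImode', 'DImode']
print([get_mode(op) for op in inputs])    # ['DImode', 'VOIDmode']
```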

gcc/ChangeLog:

2021-01-05  Ilya Leoshkevich  

* cfgexpand.c (expand_asm_loc): Pass new parameter.
(expand_asm_stmt): Likewise.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add new
parameter.
* config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
* config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
* config/cris/cris.c (cris_md_asm_adjust): Likewise.
* config/i386/i386.c (ix86_md_asm_adjust): Likewise.
* config/mn10300/mn10300.c (mn10300_md_asm_adjust): Likewise.
* config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
* config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
* config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
* config/vax/vax.c (vax_md_asm_adjust): Likewise.
* config/visium/visium.c (visium_md_asm_adjust): Likewise.
* target.def (md_asm_adjust): Likewise.
---
 gcc/cfgexpand.c  | 16 
 gcc/config/arm/aarch-common-protos.h |  8 
 gcc/config/arm/aarch-common.c|  7 ---
 gcc/config/arm/arm.c | 14 --
 gcc/config/cris/cris.c   |  7 ---
 gcc/config/i386/i386.c   |  7 ---
 gcc/config/mn10300/mn10300.c |  7 ---
 gcc/config/nds32/nds32.c |  1 +
 gcc/config/pdp11/pdp11.c |  9 +
 gcc/config/rs6000/rs6000.c   |  7 ---
 gcc/config/vax/vax.c |  3 ++-
 gcc/config/visium/visium.c   | 12 +++-
 gcc/doc/tm.texi  | 10 ++
 gcc/target.def   | 13 -
 14 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b73019b241f..e25528261a0 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2879,6 +2879,7 @@ expand_asm_loc (tree string, int vol, location_t locus)
   rtx asm_op, clob;
   unsigned i, nclobbers;
   auto_vec input_rvec, output_rvec;
+  auto_vec input_mode;
   auto_vec constraints;
   auto_vec clobber_rvec;
   HARD_REG_SET clobbered_regs;
@@ -2888,9 +2889,8 @@ expand_asm_loc (tree string, int vol, location_t locus)
   clobber_rvec.safe_push (clob);
 
   if (targetm.md_asm_adjust)
-   targetm.md_asm_adjust (output_rvec, input_rvec,
-  constraints, clobber_rvec,
-  clobbered_regs);
+   targetm.md_asm_adjust (output_rvec, input_rvec, input_mode,
+  constraints, clobber_rvec, clobbered_regs);
 
   asm_op = body;
   nclobbers = clobber_rvec.length ();
@@ -3067,8 +3067,8 @@ expand_asm_stmt (gasm *stmt)
   return;
 }
 
-  /* There are some legacy diagnostics in here, and also avoids a
- sixth parameger to targetm.md_asm_adjust.  */
+  /* There are some legacy diagnostics in here, and also avoids an extra
+ parameter to targetm.md_asm_adjust.  */
   save_input_location s_i_l(locus);
 
   unsigned noutputs = gimple_asm_noutputs (stmt);
@@ -3419,9 +3419,9 @@ expand_asm_stmt (gasm *stmt)
  the flags register.  */
   rtx_insn *after_md_seq = NULL;
   if (targetm.md_asm_adjust)
-after_md_seq = targetm.md_asm_adjust (output_rvec, input_rvec,
- constraints, clobber_rvec,
- clobbered_regs);
+after_md_seq
+   = targetm.md_asm_adjust (output_rvec, input_rvec, input_mode,
+constraints, clobber_rvec, clobbered_regs);
 
   /* Do not allow the hook to change the output and input count,
  lest it mess up the operand numbering.  */
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 251de3d61a8..cbef50dde71 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -143,9 +143,9 @@ struct cpu_cost_table
   const struct vector_cost_table vect;
 };
 
-rtx_insn *
-arm_md_asm_adjust (vec , vec &/*inputs*/,
-   vec ,
-   vec , 

[PATCH] c++: Fix thinko in auto return type checking [PR98441]

2021-01-05 Thread Marek Polacek via Gcc-patches
This fixes a thinko in my r11-2085 patch: when I said "But only give the
!late_return_type errors when funcdecl_p, to accept e.g. auto (*fp)() = f;
in C++11" I should've done this, otherwise we give bogus errors mentioning
"function with trailing return type" when there is none.
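
As a minimal sketch of the kind of declaration involved (assuming a C++11
compiler; the names f, counter, call_through_fp and call_through_pm are
illustrative only, while the member-pointer case mirrors the new test):
neither declarator below declares a function and neither has a trailing
return type, so both should be accepted.

```cpp
static int counter = 7;

int f() { return 42; }

struct a {
  int& mfn() { return counter; }
};

// Pointer to function: 'auto' in the return type is deduced from f.
int call_through_fp() {
  auto (*fp)() = f;
  return fp();
}

// Pointer to member function returning a reference.
int call_through_pm() {
  a obj;
  auto& (a::*pm)() = &a::mfn;
  return (obj.*pm)();
}
```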

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/98441
* decl.c (grokdeclarator): Move the !funcdecl_p check inside the
!late_return_type block.

gcc/testsuite/ChangeLog:

PR c++/98441
* g++.dg/cpp0x/auto55.C: New test.
---
 gcc/cp/decl.c   |  8 +---
 gcc/testsuite/g++.dg/cpp0x/auto55.C | 13 +
 2 files changed, 18 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/auto55.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index bf6f12c26a0..1a114a2e2d0 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12241,10 +12241,12 @@ grokdeclarator (const cp_declarator *declarator,
tree late_return_type = declarator->u.function.late_return_type;
if (tree auto_node = type_uses_auto (type))
  {
-   if (!late_return_type && funcdecl_p)
+   if (!late_return_type)
  {
-   if (current_class_type
-   && LAMBDA_TYPE_P (current_class_type))
+   if (!funcdecl_p)
+ /* auto (*fp)() = f; is OK.  */;
+   else if (current_class_type
+&& LAMBDA_TYPE_P (current_class_type))
  /* OK for C++11 lambdas.  */;
else if (cxx_dialect < cxx14)
  {
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto55.C b/gcc/testsuite/g++.dg/cpp0x/auto55.C
new file mode 100644
index 00000000000..5bd32ac890d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/auto55.C
@@ -0,0 +1,13 @@
+// PR c++/98441
+// { dg-do compile { target c++11 } }
+
+struct a {
+int& mfn();
+};
+
+void fn()
+{
+int&  (a::*myvar1)(void) = &a::mfn;
+auto& (a::*myvar2)(void) = &a::mfn;
+auto  (a::*myvar3)(void) = &a::mfn;
+}

base-commit: ad92bf4b165935b58195825dc8f089f53fd2710b
-- 
2.29.2



Re: [committed] doc: Remove HSAIL from Language Standards

2021-01-05 Thread Gerald Pfeifer
On Mon, 4 Jan 2021, Martin Jambor wrote:
> I trust you that HSA Foundation's web server was down for weeks but it
> is not down now, http://www.hsafoundation.com/standards/ loads for me
> fine and "HSA Programmer Reference Manual Specification 1.01" available
> from that page describes the HSAIL that the FE implements.

Thanks for checking that again, Martin. I have applied the patch
below, reverting the original commit.

> Given that nobody bothered to update the FE to HSAIL 1.2 (which is 2.5
> years old) and it is unlikely to have many users, maybe it is time to
> deprecate the FE in GCC 11 (I guess it is not a promise to remove it in
> 12), but that is a different question.

I think I'd recommend that, yes.

Cheers, Gerald


commit ad92bf4b165935b58195825dc8f089f53fd2710b
Author: Gerald Pfeifer 
Date:   Wed Jan 6 00:56:55 2021 +0100

doc: Re-add HSAIL to Language Standards

The HSAIL web server has reappeared after weeks, so restore the standard
reference for now while we consider further deprecation.

This reverts commit 7e999bd84f47205dc44b0f2dc90b53b3c888ca48.

gcc/
2021-01-06  Gerald Pfeifer  

Revert:
2020-12-28  Gerald Pfeifer  

* doc/standards.texi (HSAIL): Remove section.

diff --git a/gcc/doc/standards.texi b/gcc/doc/standards.texi
index 128b1c67bbc..0f88333eec6 100644
--- a/gcc/doc/standards.texi
+++ b/gcc/doc/standards.texi
@@ -320,6 +320,14 @@ available online, see @uref{http://gcc.gnu.org/readings.html}
 As of the GCC 4.7.1 release, GCC supports the Go 1 language standard,
 described at @uref{https://golang.org/doc/go1}.
 
+@section HSA Intermediate Language (HSAIL)
+
+GCC can compile the binary representation (BRIG) of the HSAIL text format as
+described in HSA Programmer's Reference Manual version 1.0.1. This
+capability is typically utilized to implement the HSA runtime API's HSAIL 
+finalization extension for a gcc supported processor. HSA standards are
+freely available at @uref{http://www.hsafoundation.com/standards/}.
+
 @section D language
 
 GCC supports the D 2.0 programming language.  The D language itself is


Re: [PATCH] genemit: Handle `const_double_zero' rtx

2021-01-05 Thread Maciej W. Rozycki
On Wed, 16 Dec 2020, Maciej W. Rozycki wrote:

> > CONST_DOUBLE_ATOF ("0", VOIDmode) seems malformed though, and I'd expect
> > it to assert in REAL_MODE_FORMAT (via the format_helper constructor).
> > I'm not sure the patch is strictly safer than the status quo.
> 
>  I may have missed that, though I did follow the chain of calls involved 
> here to see if there is anything problematic.  As I say I have a limited 
> way to verify this in practice as the PDP-11 code involved appears to me 
> to be dead, and the situation does not apply to the VAX backend.  Maybe I 
> could simulate it somehow artificially to see what happens.

 I have made an experiment and arranged for a couple of builtins to refer
to CONST_DOUBLE_ATOF ("0", VOIDmode) via expanders and it works just fine 
except for failing to match an RTL insn, like:

builtin.c: In function 't':
builtin.c:18:1: error: unrecognizable insn:
   18 | }
  | ^
(insn 6 3 7 2 (set (reg/v:SF 23 [ f ])
(plus:SF (const_double 0 [0] 0 [0] 0 [0] 0 [0])
(reg/v:SF 32 [ f ]))) "builtin.c":5:6 -1
 (nil))
during RTL pass: vregs
builtin.c:18:1: internal compiler error: in extract_insn, at recog.c:2315

so it does work in principle and would produce something if there was a 
matching insn.

> > FWIW, I agree with Jeff that this ought to be CONST0_RTX (mode).
> 
>  I'll have to update several places then and push the changes through full 
> regression testing, so it'll probably take until the next week.

 FWIW, CONST_DOUBLE_ATOF ("0", VOIDmode) is of course not equivalent to 
CONST0_RTX (VOIDmode), as the latter produces a CONST_INT rather than a 
CONST_DOUBLE rtx:

builtin.c: In function 't':
builtin.c:18:1: error: unrecognizable insn:
   18 | }
  | ^
(insn 6 3 7 2 (set (reg/v:SF 23 [ f ])
(plus:SF (const_int 0 [0])
(reg/v:SF 32 [ f ]))) "builtin1.c":5:6 -1
 (nil))
during RTL pass: vregs
builtin.c:18:1: internal compiler error: in extract_insn, at recog.c:2315

I suppose we do not have to support VOIDmode here, but I feel a bit uneasy 
about the lack of identity mapping between machine description (where in 
principle we could already use `const_double' with any mode, not only ones 
CONST0_RTX expands to a CONST_DOUBLE for) and RTL produced if CONST0_RTX 
was used rather than CONST_DOUBLE_ATOF, as CONST0_RTX does not always 
return a CONST_DOUBLE rtx.

 For the sake of the experiment I modified machine description further so 
as to actually let the rtx produced by CONST_DOUBLE_ATOF ("0", VOIDmode) 
through by providing suitable insns, and here's an excerpt from annotated 
artificial assembly produced:

#(insn 6 21 7 (set (reg/v:SF 0 %r0 [orig:23 f ] [23])
#(plus:SF (const_double 0 [0] 0 [0] 0 [0] 0 [0])
#(mem/c:SF (plus:SI (reg/f:SI 12 %ap)
#(const_int 4 [0x4])) [1 f+0 S4 A32]))) "builtin.c":5:6 199 {*addzsf3}
# (expr_list:REG_DEAD (reg/f:SI 12 %ap)
#(nil)))
#addf3 $0,4(%ap),%r0# 6 [c=40]  *addzsf3

The CONST_DOUBLE rtx yielded the `$0' operand, so the expression made it 
through the backend down to generated assembly (the leading comment 
character comes from the artificial `*addzsf3' insn in modified MD).

 So with the CONST_DOUBLE_ATOF approach we have a coherent generic model 
where we can express arbitrary modes with CONST_DOUBLE rtxes without the 
need to analyse in the parser (genemit.c) whether the expression requested 
makes sense or not.  Whereas with the CONST0_RTX approach we'll either 
have to diagnose odd `const_double_zero' usage (making it inconsistent 
with `const_zero') or leave it to the undefined (and again inconsistent).

 What I think we want to do though is to make CONST_DOUBLE_ATOF ("0", 
mode) effectively alias to CONST0_RTX (mode) in the cases where the rtx 
produced is of the CONST_DOUBLE type.  By the look of `init_emit_once' I 
infer we do that already, so I must conclude that the choice between:

  printf ("CONST_DOUBLE_ATOF (\"0\", %smode)",
  GET_MODE_NAME (GET_MODE (x)));

and:

  printf ("CONST0_RTX (%smode)",
  GET_MODE_NAME (GET_MODE (x)));

for genemit.c is for those cases merely syntactic.  So I think I made the 
correct choice here and I'd still rather go with CONST_DOUBLE_ATOF, in 
which case we can simply ignore uninteresting modes.

 Have I expressed myself clearly enough?  I can post the patch I made the 
experiments with and builtin.c for the context.

 NB the festive season and the turn of events beforehand has delayed me a 
bit, but I now have a proper fix for the issue considered here, which 
actually removes the current use of VOIDmode with CONST_DOUBLE, and it's 
now only a matter of CONST0_RTX vs CONST_DOUBLE_ATOF to be used there.  
I'll post the patch shortly and we can continue the discussion in that 
context then.

 Thank you both for your input.

  Maciej


Re: [PATCH] Add pytest for a GCOV test-case

2021-01-05 Thread Jeff Law via Gcc-patches



On 12/23/20 6:03 AM, Martin Liška wrote:
>> At a high level, this patch calls out to Python 3, allowing for test
>> logic to be written in Python, rather than Tcl.  Are we doing this
>> anywhere else in our test suite?
>
> No.
I'm surprised.  I thought we did this for some of David's work at some
point.  Clearly I'm mis-remembering.
>>
>> The test implicitly requires python3, and the 3rd party pytest module
>> installed within it.  What happens if these aren't installed?  (ideally
>> an UNSUPPORTED at the DejaGnu level, I think).
>
> Right now, one will see the following in the .log file:
>
> /usr/bin/python3: No module named pytest
>
>
> I must confess that I don't know how to properly mark that as UNRESOLVED
> in DejaGNU.
I think it's just something like

unresolved "could not find python interpreter $testcase" in
run-gcov-pytest if you find the right magic in the output of your spawn.

Jeff



Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/4/21 4:54 PM, Martin Sebor wrote:
> On 1/4/21 2:10 PM, Jeff Law wrote:
>>
>>
>> On 1/4/21 1:53 PM, Martin Sebor wrote:
>>> On 1/4/21 12:23 PM, Jeff Law wrote:


>>>> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>>>>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches
>>>>> wrote:
>>>>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>>>>> created at gimplification time could even be ggc_freed when no
>>>>>>> longer
>>>>>>> used in the IL.
>>>>>> Obviously we can't use SSA_NAMEs as they're specific to each
>>>>>> function as
>>>>>> they get compiled.  But what's not as clear to me is why we can't
>>>>>> use a
>>>>>> SAVE_EXPR of the original expression that indicates the size of the
>>>>>> parameter.
>>>>> The gimplifier is destructive, so if the expressions are partly
>>>>> (e.g. in
>>>>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>>>>> And if they aren't shared and there are side-effects, if we tried to
>>>>> gimplify them again we'd get the side-effects duplicated.
>>>>> So it all depends on what the code wants to handle, if e.g. just
>>>>> values of
>>>>> parameters with simple arithmetics on those and punt on everything
>>>>> else,
>>>>> then it is doable, but generally it is not.
>>>
>>> I explained what the code handles and when in the pipeline in
>>> the discussion of the previous patch:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>> Right, but that message talks about GC.  This is not a GC issue.
>>
>> This feels like we need a SAVE_EXPR to me to ensure single evaluation
>> and an unshare_expr to avoid problems with destructive gimplification.
>>
>>
>>
>>>
>>>> I would expect the expressions to be values of parameters (or
>>>> objects in
>>>> static storage) and simple arithmetic on them.  If there's other
>>>> cases,
>>>> punting seems appropriate.
>>>>
>>>> Martin -- are there nontrivial expressions we need to be worried
>>>> about here?
>>>
>>> At the moment the middle warnings only consider parameters, like
>>> the N in
>>>
>>>    void f (int N, int[N]);
>>>
>>>    void g (void)
>>>    {
>>>  int a[3];
>>>  f (sizeof a, a);   // warning
>>>    }
>>>
>>> The front end redeclaration warnings consider all expressions,
>>> including
>>>
>>>    int f (void);
>>>
>>>    void g (int[f () + 1]);
>>>    void g (int[f () + 2]);   // warning
>>>
>>> The patch turns these complex bounds into strings that the front
>>> end compares instead.
>>
>> If you can have an arbitrary expression, such as a function call like
>> that, then ISTM that a SAVE_EXPR is mandatory as you can't call the
>> function more than once.  But it also seems to me that for cases that
>> aren't simple arithmetic of leaf nodes we could just punt.  I doubt
>> we're going to miss significant real world diagnostics by doing that.
>
> I don't know about that.  Bugs are rare and often in unusual and
> hard to read/understand code, so focusing on the simple cases and
> doing nothing for the rest would certainly not be an improvement.
I would disagree.  It's an improvement for what is most likely the most
common case.  VLAs aren't used that heavily to begin with and VLAs with
bounds that require function calls to evaluate would seem to be quite rare.



>
> My understanding from the discussion at the link above is that
> using SAVE_EXPRs is only necessary when the expression is evaluated
> (the warning doesn't evaluate them).
Hmm,  so this goes back to Richi's comment/question.  If we're not
evaluating the expression, then we're just doing a lexicographical
comparison?  And yes, in that case we wouldn't need the SAVE_EXPR.
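
A rough sketch of that idea, as a toy model only (ParamBounds and
first_mismatch are hypothetical names, not anything in the patch): if each
declaration records the spelled-out text of its VLA bounds, two declarations
can be compared without ever evaluating a bound, so a call such as f () in a
bound is never executed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the textual VLA bounds recorded per declaration.
struct ParamBounds {
  std::vector<std::string> bounds;  // one spelled-out bound per VLA parameter
};

// Returns the index of the first bound whose text differs, or -1 if all
// recorded bounds match.  Nothing is evaluated; only the text is compared.
int first_mismatch(const ParamBounds &first, const ParamBounds &second) {
  std::size_t common = std::min(first.bounds.size(), second.bounds.size());
  for (std::size_t i = 0; i < common; ++i)
    if (first.bounds[i] != second.bounds[i])
      return static_cast<int>(i);
  // A differing number of bounds counts as a mismatch at the first extra one.
  if (first.bounds.size() != second.bounds.size())
    return static_cast<int>(common);
  return -1;
}
```

With the two declarations of g quoted above, the recorded texts
"f () + 1" and "f () + 2" would differ at index 0.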



>>> After the front end is done the strings
>>> don't serve any purpose (and I don't think ever will) and could
>>> be removed.  I looked for a way to do it but couldn't find one
>>> other than the free_lang_data pass in tree.c that Richard had
>>> initially said wasn't the right place.  Sounds like he's
>>> reconsidered but at this point, given that VLA parameters are
>>> used only infrequently, and VLAs with these nontrivial bounds
>>> are exceedingly rare, going to the trouble of removing them
>>> doesn't seem worth the effort.
>> But I'm not sure that inventing a new method for smuggling the data
>> around is all that wise or necessary here.   I don't see a message from
>> anyone suggesting that, but I could have missed it.
>
> No one suggested "smuggling" anything around.  It also wasn't
> my intent, nor do I think the code does that.  It just stores
> the bounds in a form that the middle end can cope with.  There
> are other front-end-only attributes that store strings (e.g.,
> attribute deprecated) so this is not new.  But as I said, I'm
> open to removing either the strings or the expressions.  I'd
> just like to know which before I do the work this time.
You're reading way too much into the word "smuggle".

jeff



Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/4/21 2:20 PM, Jakub Jelinek wrote:
> On Mon, Jan 04, 2021 at 02:10:39PM -0700, Jeff Law wrote:
>>> I explained what the code handles and when in the pipeline in
>>> the discussion of the previous patch:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>> Right, but that message talks about GC.  This is not a GC issue.
>>
>> This feels like we need a SAVE_EXPR to me to ensure single evaluation
>> and an unshare_expr to avoid problems with destructive gimplification.
> unshare_expr will not duplicate SAVE_EXPRs.
> So, one would need to unshare with special handling of SAVE_EXPRs that would
> throw them away (for the simple arguments case) rather than handling them
> normally.
My mental model of how this works must be broken then.  I thought we
would need to unshare the expression, then wrap it in the SAVE_EXPR.  It
seems like you're saying that we've already got the SAVE_EXPR and that
unshare_expr won't traverse into it.  That would indeed be problematical.
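
For what it's worth, the behaviour described above can be pictured with a toy
model (this is not GCC's implementation; toy_tree, toy_unshare and the TOY_*
codes are hypothetical, and copying leaves is a simplification): the copy
duplicates ordinary nodes but hands back any SAVE-like node unchanged, so
that node stays shared between the original and the copy.

```cpp
#include <memory>
#include <vector>

// Toy expression tree; a TOY_SAVE node stands in for a SAVE_EXPR.
enum toy_code { TOY_PLUS, TOY_LEAF, TOY_SAVE };

struct toy_tree {
  toy_code code;
  std::vector<std::shared_ptr<toy_tree>> ops;
};

// Deep-copies the tree, except that TOY_SAVE nodes are returned as-is and
// are not traversed, so they remain shared with the original.
std::shared_ptr<toy_tree> toy_unshare(const std::shared_ptr<toy_tree> &t) {
  if (t->code == TOY_SAVE)
    return t;
  auto copy = std::make_shared<toy_tree>();
  copy->code = t->code;
  for (const auto &op : t->ops)
    copy->ops.push_back(toy_unshare(op));
  return copy;
}
```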

jeff



Re: [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/2/21 3:22 PM, Martin Sebor via Gcc-patches wrote:
> Attached is another revision of a patch I posted last July to keep
> the pretty-printer from crashing on MEM_REFs with void* arguments:
>   https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549746.html
>
> Besides avoiding the ICE and enhancing the MEM_REF detail and
> improving its format, this revision implements the suggestions
> in that discussion.  To avoid code duplication it moves
> the handling to the C pretty-printer and changes the C++ front
> end to delegate to it.  In addition, it includes a cast to
> the accessed type if it's different from/incompatible with
> (according to GIMPLE) that of the dereferenced pointer, or if
> the object is typeless.  Lastly, it replaces the  in
> the output with either VLA names or the RHS of the GIMPLE
> expression (this improves the output for dynamically
> allocated objects).
>
> As an aside, in my experience, MEM_REFs in warnings are limited
> to -Wuninitialized.  I think other middle end warnings tend to
> avoid them.  Those that involve invalid/out-of-bounds accesses
> replace them with either the target DECL (e.g., local variable,
> or FIELD_DECL), the allocation call (e.g., malloc), or the DECL
> of the pointer (e.g., PARM_DECL), followed by a note mentioning
> the offset into the object.  I'd like to change -Wuninitialized
> at some point to follow the same style.  So I see the value of
> the MEM_REF formatting enhancement mainly as a transient solution
> until that happens.
>
> Martin
>
> gcc-95768.diff
>
> PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
>
> gcc/c-family/ChangeLog:
>
>   PR c++/95768
>   * c-pretty-print.c (c_pretty_printer::primary_expression): For
>   SSA_NAMEs print VLA names and GIMPLE defining statements.
>   (print_mem_ref): New function.
>   (c_pretty_printer::unary_expression): Call it.
>
> gcc/cp/ChangeLog:
>
>   PR c++/95768
>   * error.c (dump_expr): Call c_pretty_printer::unary_expression.
>
> gcc/testsuite/ChangeLog:
>
>   PR c++/95768
>   * g++.dg/pr95768.C: New test.
>   * g++.dg/warn/Wuninitialized-12.C: New test.
>   * gcc.dg/uninit-38.c: New test.
OK
jeff



Re: [PATCH] libtool.m4: update GNU/Hurd test from upstream

2021-01-05 Thread Samuel Thibault via Gcc-patches
Jeff Law, on Tue 05 Jan 2021 16:04:45 -0700, wrote:
> Thanks.  Installed.

Thanks!

Samuel


Re: [PATCH] libtool.m4: update GNU/Hurd test from upstream

2021-01-05 Thread Jeff Law via Gcc-patches



On 12/23/20 6:12 PM, Samuel Thibault wrote:
> In upstream libtool, 47a889a4ca20 ("Improve GNU/Hurd support.") fixed
> detection of shlibpath_overrides_runpath, thus avoiding unnecessary relink.
> This backports it.
>
> ChangeLog:
>
>   * libtool.m4: Match gnu* along other GNU systems.
>   * libffi/configure: Re-generate.
>   * libgomp/configure: Re-generate.
>
>   * libgo/config/libtool.m4: Match gnu* along other GNU systems.
>   * libgo/configure: Re-generate.
>
> gcc/ChangeLog:
>
>   * configure: Re-generate.
>
> libatomic/ChangeLog:
>
>   * configure: Re-generate.
>
> libbacktrace/ChangeLog:
>
>   * configure: Re-generate.
>
> libcc1/ChangeLog:
>
>   * configure: Re-generate.
>
> libgfortran/ChangeLog:
>
>   * configure: Re-generate.
>
> libgomp/ChangeLog:
>
>   * configure: Re-generate.
>
> libhsail-rt/ChangeLog:
>
>   * configure: Re-generate.
>
> libitm/ChangeLog:
>
>   * configure: Re-generate.
>
> libobjc/ChangeLog:
>
>   * configure: Re-generate.
>
> liboffloadmic/ChangeLog:
>
>   * configure: Re-generate.
>   * plugin/configure: Re-generate.
>
> libphobos/ChangeLog:
>
>   * configure: Re-generate.
>
> libquadmath/ChangeLog:
>
>   * configure: Re-generate.
>
> libsanitizer/ChangeLog:
>
>   * configure: Re-generate.
>
> libssp/ChangeLog:
>
>   * configure: Re-generate.
>
> libstdc++-v3/ChangeLog:
>
>   * configure: Re-generate.
>
> libvtv/ChangeLog:
>
>   * configure: Re-generate.
>
> lto-plugin/ChangeLog:
>
>   * configure: Re-generate.
>
> zlib/ChangeLog:
>
>   * configure: Re-generate.
Thanks.  Installed.
jeff



Re: [PATCH toplevel] libctf: new testsuite

2021-01-05 Thread Alan Modra via Gcc-patches
On Tue, Jan 05, 2021 at 03:25:10PM +0000, Nick Alcock wrote:
> This enables 'make libctf-check', used by a new libctf testsuite in
> binutils.
> 
> 2021-01-05  Nick Alcock  
> 
>   * Makefile.def (libctf): No longer no_check.  Checking depends on
>   all-ld.
>   * Makefile.in: Regenerated.
> 
> ---
> 
>  Makefile.def  |   4 +-
>  Makefile.in   |  13 +
> 
> This is a stripped-down top-level-only subset of commit 
> c59e30ed1727135f8efb79890f2c458f73709757 in binutils-gdb.git.  (Because
> it is identical to what has already landed in binutils, it should apply
> without trouble in syncs back to there.)
> 
> I don't have permission to push this: Alan has offered to do so.

It doesn't apply due to gcc missing binutils 87279e3cef5b2c5 changes
too.  I could fix that easily enough but I'm going to ask that you
post a combined patch to bring the gcc repo up to date with any libctf
changes.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] dec_math.f90 needs to be xfailed

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 12:14 PM, Steve Kargl wrote:
> On Tue, Jan 05, 2021 at 11:26:24AM -0700, Jeff Law wrote:
>>
>> On 1/4/21 3:28 PM, Steve Kargl wrote:
>>> On Mon, Jan 04, 2021 at 02:30:43PM -0700, Jeff Law wrote:
>>>> On 1/2/21 1:34 AM, Steve Kargl via Gcc-patches wrote:
>>>>> Can someone, anyone, please commit the following trivial patch?
>>>>> gfortran.dg/dec_math.f90 will never pass on i?86-*-freebsd*.
>>>> Why will the test never pass on that platform?  I don't mind installing
>>>> the patch, but I'd like to have a bit more background first :-)

>>> The testcase assumes REAL(10) has 64-bits of precision.  On
>>> i?86-*-freebsd, the i387 FPU control word is set to 53-bits.
>>> The test program is not set up to deal with 11-bits of 
>>> missing precision.
>> Thanks.  That's precisely what I needed to know.  I suspected it was
>> related to the differing state of the fpu control word.  But that begs
>> the question of whether or not the change should apply to the other BSD
>> variants.
>>
> I don't know about other BSD variants.  The setting of the control
> word was done some 27 years ago on i?86-FreeBSD.
Right, and I believe the other BSD variants are derived from the Net/2
and/or 386BSD where this originated.
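
To put numbers on the "11 bits of missing precision" mentioned above, here is
a small sketch.  It assumes, as stated in the thread, a 64-bit significand for
REAL(10) and a control word limiting results to 53 bits; missing_bits and
ulp_at_one are illustrative helper names only.

```cpp
#include <cmath>

// Significand bits lost when a result with wide_bits of precision is
// rounded to narrow_bits of precision.
constexpr int missing_bits(int wide_bits, int narrow_bits) {
  return wide_bits - narrow_bits;
}

// Spacing between 1.0 and the next representable value for a format with
// the given number of significand bits: 2^-(bits - 1).
double ulp_at_one(int bits) { return std::ldexp(1.0, -(bits - 1)); }
```

Under those assumptions the spacing near 1.0 is 2^-52 rather than 2^-63, so
a test that expects results accurate to the full 64 bits cannot pass.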


>
> Hmmm.  A little code spelunking back to original FreeBSD 2.0.5,
>
> https://svnweb.freebsd.org/base/stable/2.0.5/sys/i386/include/npx.h?revision=4=markup
>
> The lines 101-132 provide the justification for the control word.
> AFAIK, older FreeBSD sources are not published on FreeBSD.org due
> to the USL lawsuit.
Right.  I could probably find stuff older than that, but it'd require
heading up to my old office at the U and grubbing around.  Unlikely
worth the effort.
>
> If this file is current
>
> http://mirror.nyi.net/NetBSD/misc/joerg/GENERIC/src/src/sys/arch/x86/include/cpu_extended_state.h.html
>
> then NetBSD is not affected, unless you are using an older version.  See
> lines 196-220.
ACK.   Probably the safe thing to do is keep it limited to FreeBSD.

Thanks for digging around.   I'll go ahead and install it as-is.

jeff



Re: [PATCH] libphobos: Allow building libphobos using Solaris/x86 assembler

2021-01-05 Thread Rainer Orth
Hi Iain,

> This patch removes the disabling of libphobos when the Solaris/x86
> assembler is being used.
>
> Since r11-6373, D symbols are now compressed using back references, this
> helped reduce the average symbol length by a factor of about 3, while
> the longest symbol shrank from 416133 to 1142 characters.  So the issues
> that were seen on Solaris/x86 should no longer be a problem.
>
> However, I have only used x86_64-apple-darwin10 for testing, as
> libphobos couldn't be built on that target for the same reason, except
> it was the system linker segfaulting due to long symbol names.
>
> It would be good to know if Solaris has also benefitted from the change.

great, thanks.  I'll give this a whirl once today's regular bootstraps
have finished.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


libgo patch committed: Don't define sys_SETREUID and friends

2021-01-05 Thread Ian Lance Taylor via Gcc-patches
This libgo patch changes the syscall package to not define
sys_SETREUID and some friends.  We don't use them anyhow, since we
always call the C library functions which do the right thing.  And
they aren't defined on all GNU/Linux variants.  This fixes GCC PR
98510.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.

Ian
f47c00cf95d7dbbe7147c61a4a6bc20921c3da2c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index c80f1cc1425..094b8fad483 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-5b075d039a20f32b9c2711ca67a3e52fba74f957
+a2578eb3983514641f0baf44d27d6474d3a96758
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/syscall/setuidgid_32_linux.go b/libgo/go/syscall/setuidgid_32_linux.go
index b0b7f61d221..1fe7120d1c6 100644
--- a/libgo/go/syscall/setuidgid_32_linux.go
+++ b/libgo/go/syscall/setuidgid_32_linux.go
@@ -12,10 +12,4 @@ const (
 
sys_SETGID = SYS_SETGID32
sys_SETUID = SYS_SETUID32
-
-   sys_SETREGID = SYS_SETREGID32
-   sys_SETREUID = SYS_SETREUID32
-
-   sys_SETRESGID = SYS_SETRESGID32
-   sys_SETRESUID = SYS_SETRESUID32
 )
diff --git a/libgo/go/syscall/setuidgid_linux.go b/libgo/go/syscall/setuidgid_linux.go
index 38c83c92f97..22fa334bfa5 100644
--- a/libgo/go/syscall/setuidgid_linux.go
+++ b/libgo/go/syscall/setuidgid_linux.go
@@ -12,10 +12,4 @@ const (
 
sys_SETGID = SYS_SETGID
sys_SETUID = SYS_SETUID
-
-   sys_SETREGID = SYS_SETREGID
-   sys_SETREUID = SYS_SETREUID
-
-   sys_SETRESGID = SYS_SETRESGID
-   sys_SETRESUID = SYS_SETRESUID
 )


Re: libgo patch committed: Update to Go1.16beta1 release

2021-01-05 Thread Ian Lance Taylor via Gcc-patches
On Sat, Jan 2, 2021 at 6:14 AM Matthias Klose  wrote:
>
> On 1/2/21 12:11 AM, Ian Lance Taylor wrote:
> > On Thu, Dec 31, 2020 at 7:40 AM Matthias Klose  wrote:
> >>
> >> On 12/31/20 12:14 AM, Ian Lance Taylor via Gcc-patches wrote:
> >>> I've committed a patch to update libgo to the Go 1.16beta1 release.
> >>>
> >>> This patch does not include support for the new //go:embed directive
> >>> that will be available in Go 1.16.1 (https://golang.org/issue/41191)
> >>> Support for that requires compiler changes, which will come later.
> >>>
> >>> As usual with these big updates, I have not included the complete
> >>> changes in this e-mail message, only changes that are gccgo-specific.
> >>>
> >>> Testing this requires some changes to gotools.
> >>>
> >>> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
> >>> to mainline.
> >>
> >> also breaks the s390x 32bit multilib build (s390).
> >>
> >> ../../../../src/libgo/go/internal/cpu/cpu.go:123:9: error: reference to
> >> undefined name 'doinit'
> >>   123 | doinit()
> >>   | ^
> >
> > The problems building the internal/cpu and golang.org/x/sys/cpu
> > packages on less common architectures should be fixed by this patch.
> > Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
> > to mainline.
>
> still ftbfs on power*, tested with a multilib build on powerpc64-linux-gnu.
> patch attached, didn't check on aix.
>
> ../../../src/libgo/go/internal/cpu/cpu.go:123:9: error: reference to undefined
> name 'doinit'
>   123 | doinit()
>   | ^
>
>
> ../../../src/libgo/go/internal/cpu/cpu_ppc64x_linux.go:26:26: error: reference
> to undefined name 'isSet'
>26 | PPC64.IsPOWER9 = isSet(HWCap2, hwcap2_ARCH_3_00)
>   |  ^
> ../../../src/libgo/go/internal/cpu/cpu_ppc64x_linux.go:27:25: error: reference
> to undefined name 'isSet'
>27 | PPC64.HasDARN = isSet(HWCap2, hwcap2_DARN)
>   | ^
> ../../../src/libgo/go/internal/cpu/cpu_ppc64x_linux.go:28:24: error: reference
> to undefined name 'isSet'
>28 | PPC64.HasSCV = isSet(HWCap2, hwcap2_SCV)
>   |^


This patch cleans up internal/cpu some more, including bringing in
some files from the source repo.  It should fix these problems.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu and
powerpc64-unknown-linux-gnu.  Committed to mainline.

Ian
9c56d98e6b7f38ee3fc0993a2baa6de1224ef1f1
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index f4c99756d25..c80f1cc1425 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-2b5bdd22b7ec2fc13ae0f644c781f64c1a209500
+5b075d039a20f32b9c2711ca67a3e52fba74f957
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/internal/cpu/cpu_arm.go b/libgo/go/internal/cpu/cpu_arm.go
index 7324e7b8151..06962cfc518 100644
--- a/libgo/go/internal/cpu/cpu_arm.go
+++ b/libgo/go/internal/cpu/cpu_arm.go
@@ -4,6 +4,8 @@
 
 package cpu
 
+// const CacheLinePadSize = 32
+
 // arm doesn't have a 'cpuid' equivalent, so we rely on HWCAP/HWCAP2.
 // These are initialized by archauxv() and should not be changed after they are
 // initialized.
diff --git a/libgo/go/internal/cpu/cpu_mips.go b/libgo/go/internal/cpu/cpu_mips.go
new file mode 100644
index 000..48755b6d1ae
--- /dev/null
+++ b/libgo/go/internal/cpu/cpu_mips.go
@@ -0,0 +1,10 @@
+// Copyright 2017 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package cpu
+
+// const CacheLinePadSize = 32
+
+func doinit() {
+}
diff --git a/libgo/go/internal/cpu/cpu_mips64x.go b/libgo/go/internal/cpu/cpu_mips64x.go
index af10a5071ea..58fbd3db93d 100644
--- a/libgo/go/internal/cpu/cpu_mips64x.go
+++ b/libgo/go/internal/cpu/cpu_mips64x.go
@@ -6,6 +6,8 @@
 
 package cpu
 
+// const CacheLinePadSize = 32
+
 // This is initialized by archauxv and should not be changed after it is
 // initialized.
 var HWCap uint
diff --git a/libgo/go/internal/cpu/cpu_mipsle.go b/libgo/go/internal/cpu/cpu_mipsle.go
new file mode 100644
index 000..48755b6d1ae
--- /dev/null
+++ b/libgo/go/internal/cpu/cpu_mipsle.go
@@ -0,0 +1,10 @@
+// Copyright 2017 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package cpu
+
+// const CacheLinePadSize = 32
+
+func doinit() {
+}
diff --git a/libgo/go/internal/cpu/cpu_no_name.go b/libgo/go/internal/cpu/cpu_no_name.go
index ce1c37a3c7c..f028a92c935 100644
--- a/libgo/go/internal/cpu/cpu_no_name.go
+++ b/libgo/go/internal/cpu/cpu_no_name.go
@@ -4,6 +4,7 @@
 
 // +build !386
 // +build !amd64
+// +build !amd64p32
 
 package cpu
 
diff --git a/libgo/go/internal/cpu/cpu_other.go b/libgo/go/internal/cpu/cpu_other.go
index d0f1f2e2150..ba3c42ad569 

[PATCH] libphobos: Allow building libphobos using Solaris/x86 assembler

2021-01-05 Thread Iain Buclaw via Gcc-patches
Hi,

This patch removes the disabling of libphobos when the Solaris/x86
assembler is being used.

Since r11-6373, D symbols are compressed using back references.  This
reduced the average symbol length by a factor of about 3, while the
longest symbol shrank from 416133 to 1142 characters.  So the issues
that were seen on Solaris/x86 should no longer be a problem.

However, I have only used x86_64-apple-darwin10 for testing, as
libphobos couldn't be built on that target for the same reason, except
it was the system linker segfaulting due to long symbol names.

It would be good to know if Solaris has also benefitted from the change.

Regards
Iain.

---
libphobos/ChangeLog:

* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac (x86_64-*-solaris2.* | i?86-*-solaris2.*): Remove
disabling of libphobos when using Solaris/x86 assembler.
* libdruntime/Makefile.in: Regenerate.
---
 libphobos/Makefile.in |  2 +-
 libphobos/configure   | 12 
 libphobos/configure.ac| 12 
 libphobos/libdruntime/Makefile.in |  2 +-
 4 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/libphobos/Makefile.in b/libphobos/Makefile.in
index a1395929819..d42248405a2 100644
--- a/libphobos/Makefile.in
+++ b/libphobos/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile for the toplevel directory of the D Standard library.
-# Copyright (C) 2006-2020 Free Software Foundation, Inc.
+# Copyright (C) 2006-2021 Free Software Foundation, Inc.
 #
 # GCC is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/libphobos/configure b/libphobos/configure
index a7fb5edb90f..d6e1d7463bb 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -15422,18 +15422,6 @@ $as_echo_n "checking for host support for libphobos... " >&6; }
 . ${srcdir}/configure.tgt
 case ${host} in
   x86_64-*-solaris2.* | i?86-*-solaris2.*)
-# libphobos doesn't compile with the Solaris/x86 assembler due to a
-# relatively low linelength limit.
-as_prog=`$CC -print-prog-name=as`
-if test -n "$as_prog" && $as_prog -v /dev/null 2>&1 | grep GNU > /dev/null 2>&1; then
-  druntime_cv_use_gas=yes;
-else
-  druntime_cv_use_gas=no;
-fi
-rm -f a.out
-if test x$druntime_cv_use_gas = xno; then
-  LIBPHOBOS_SUPPORTED=no
-fi
 # 64-bit D execution fails with Solaris ld without -z relax=transtls support.
 if test "$druntime_ld_gld" = "no" && test "$druntime_ld_relax_transtls" = "no"; then
   LIBPHOBOS_SUPPORTED=no
diff --git a/libphobos/configure.ac b/libphobos/configure.ac
index cc9af29754f..254871f0a6c 100644
--- a/libphobos/configure.ac
+++ b/libphobos/configure.ac
@@ -185,18 +185,6 @@ AC_MSG_CHECKING([for host support for libphobos])
 . ${srcdir}/configure.tgt
 case ${host} in
   x86_64-*-solaris2.* | i?86-*-solaris2.*)
-# libphobos doesn't compile with the Solaris/x86 assembler due to a
-# relatively low linelength limit.
-as_prog=`$CC -print-prog-name=as`
-if test -n "$as_prog" && $as_prog -v /dev/null 2>&1 | grep GNU > /dev/null 2>&1; then
-  druntime_cv_use_gas=yes;
-else
-  druntime_cv_use_gas=no;
-fi
-rm -f a.out
-if test x$druntime_cv_use_gas = xno; then
-  LIBPHOBOS_SUPPORTED=no
-fi
 # 64-bit D execution fails with Solaris ld without -z relax=transtls support.
 if test "$druntime_ld_gld" = "no" && test "$druntime_ld_relax_transtls" = "no"; then
   LIBPHOBOS_SUPPORTED=no
diff --git a/libphobos/libdruntime/Makefile.in b/libphobos/libdruntime/Makefile.in
index 99ee8b92afa..1163207a138 100644
--- a/libphobos/libdruntime/Makefile.in
+++ b/libphobos/libdruntime/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile for the D runtime library.
-# Copyright (C) 2012-2020 Free Software Foundation, Inc.
+# Copyright (C) 2012-2021 Free Software Foundation, Inc.
 #
 # GCC is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
-- 
2.27.0



Re: [PATCH] PING implement pre-c++20 contracts

2021-01-05 Thread Jason Merrill via Gcc-patches

On 1/4/21 9:58 AM, Jeff Chapman wrote:
Ping. re: 
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561135.html 



 > OK, I'll start with -alt then, thanks.

Andrew is exactly correct, contracts-jac-alt is still the current
branch we're focusing our upstreaming efforts on.

It's trailing upstream master by a fair bit at this point. I'll get
a merge pushed shortly.


The latest is still on the same branch, which hasn't been updated since 
that last merge:
https://github.com/lock3/gcc/tree/contracts-jac-alt 



Would you prefer me to keep it from trailing upstream too much through 
regular merges, or would it be more beneficial for it to be left alone 
so you have a more stable review target?


Either way I'm reviewing by diff against the most recent merged trunk 
revision, so it doesn't really matter.


But you probably want to do one merge at least, to make sure that 
modules and contracts coexist well.


Jason



[committed] d: Merge upstream dmd a5c86f5b9

2021-01-05 Thread Iain Buclaw via Gcc-patches
This patch merges the D front-end implementation with upstream dmd
a5c86f5b9, adding the following new `__traits' to the D language.

 - isDeprecated: used to detect if a function is deprecated.

 - isDisabled: used to detect if a function is marked with @disable.

 - isFuture: used to detect if a function is marked with @__future.

 - isModule: used to detect if a given symbol represents a module.  This
   enhancement also adds support for `is(sym == module)'.

 - isPackage: used to detect if a given symbol represents a package.
   This enhancement also adds support for `is(sym == package)'.

 - child: takes two arguments.  The first must be a symbol or expression
   and the second must be a symbol, such as an alias to a member of the
   first 'parent' argument.  The result is the second 'member' argument
   interpreted with its 'this' context set to 'parent'.  This is the
   inverse of `__traits(parent, member)'.

 - isReturnOnStack: determines if a function's return value is placed on
   the stack, or is returned via registers.

 - isZeroInit: used to detect if a type's default initializer has no
   non-zero bits.

 - getTargetInfo: used to query features of the target being compiled
   for.  The back-end can extend this by registering any key to handle
   the given argument; however, a reliable subset exists which includes
   "cppRuntimeLibrary", "cppStd", "floatAbi", and "objectFormat".

 - getLocation: returns a tuple whose entries correspond to the
   filename, line number, and column number of where the argument was
   declared.

 - hasPostblit: used to detect if a type is a struct with a postblit.

 - isCopyable: used to detect if a type allows copying its value.

 - getVisibility: an alias for the getProtection trait.

Bootstrapped and regression tested on x86_64-linux-gnu, with -m32 and
-mx32 multilibs, and committed to mainline.

Regards
Iain.

---
gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd a5c86f5b9.
* d-builtins.cc (d_eval_constant_expression): Handle ADDR_EXPR trees
created by build_string_literal.
* d-frontend.cc (retStyle): Remove function.
* d-target.cc (d_language_target_info): New variable.
(d_target_info_table): Likewise.
(Target::_init): Initialize d_target_info_table.
(Target::isReturnOnStack): New function.
(d_add_target_info_handlers): Likewise.
(d_handle_target_cpp_std): Likewise.
(d_handle_target_cpp_runtime_library): Likewise.
(Target::getTargetInfo): Likewise.
* d-target.h (struct d_target_info_spec): New type.
(d_add_target_info_handlers): Declare.
---
 gcc/d/d-builtins.cc   |  14 +
 gcc/d/d-frontend.cc   |  20 -
 gcc/d/d-target.cc | 104 +++
 gcc/d/d-target.h  |  15 +
 gcc/d/dmd/MERGE   |   2 +-
 gcc/d/dmd/declaration.h   |   3 +-
 gcc/d/dmd/dmodule.c   | 289 
 gcc/d/dmd/dstruct.c   | 118 ++-
 gcc/d/dmd/dtemplate.c |   6 +-
 gcc/d/dmd/expression.c|   9 +-
 gcc/d/dmd/expressionsem.c |  67 +-
 gcc/d/dmd/func.c  |  39 +-
 gcc/d/dmd/globals.h   |   2 +-
 gcc/d/dmd/idgen.c |  13 +
 gcc/d/dmd/module.h|   2 +-
 gcc/d/dmd/mtype.c |   1 +
 gcc/d/dmd/parse.c |  15 +-
 gcc/d/dmd/root/filename.c |  14 +
 gcc/d/dmd/root/filename.h |   1 +
 gcc/d/dmd/target.h|   3 +
 gcc/d/dmd/traits.c| 684 ++
 gcc/testsuite/gdc.test/compilable/Test16206.d |  28 +
 .../compilable/imports/pkgmodule/package.d|   3 +
 .../imports/pkgmodule/plainmodule.d   |   2 +
 .../imports/plainpackage/plainmodule.d|   4 +
 .../gdc.test/compilable/isZeroInit.d  |  78 ++
 .../gdc.test/compilable/isreturnonstack.d |   7 +
 gcc/testsuite/gdc.test/compilable/line.d  |   4 +-
 gcc/testsuite/gdc.test/compilable/test16002.d |  24 +
 gcc/testsuite/gdc.test/compilable/test17791.d |  28 +
 gcc/testsuite/gdc.test/compilable/traits.d| 130 
 .../gdc.test/fail_compilation/fail16206a.d|  12 +
 .../gdc.test/fail_compilation/fail16206b.d|  12 +
 .../fail_compilation/fail_isZeroInit.d|  12 +
 .../fail_compilation/isreturnonstack.d|  12 +
 .../gdc.test/fail_compilation/test16002.d |  15 +
 .../gdc.test/fail_compilation/test17096.d |  50 ++
 .../gdc.test/fail_compilation/trait_loc_err.d |  15 +
 .../fail_compilation/trait_loc_ov_err.d   |  40 +
 .../gdc.test/fail_compilation/traits.d|  27 +
 .../gdc.test/fail_compilation/traits_child.d  |  17 +
 .../runnable/imports/test18322import.d|  14 +
 

Re: C++ Patch ping

2021-01-05 Thread Jason Merrill via Gcc-patches

On 1/5/21 11:34 AM, Jakub Jelinek wrote:

Hi!

I'd like to ping the:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562099.html
patch.


OK, thanks.



[PATCH] PR fortran/78746 - invalid access after error recovery

2021-01-05 Thread Harald Anlauf via Gcc-patches
Dear all,

the PR contains a lengthy discussion of several testcases.  Some of them were
considered invalid and thus removed from the testsuite (charlen_03.f90,
charlen_10.f90), and charlen_15.f90 was resolved elsewhere, so that only
class_61.f90 was left with an invalid access after error recovery with
an instrumented compiler.

I could reproduce the issue triggered by class_61.f90 using valgrind,
and found that the attached trivial, almost obvious patch solves it.
It even regtests cleanly on x86_64-pc-linux-gnu.

OK for master?  Open branches where testcase class_61.f90 exists?

Thanks,
Harald


PR fortran/78746 - invalid access after error recovery

The error recovery after an invalid reference to an undefined CLASS
during a TYPE declaration led to an invalid access.  Add a check.

gcc/fortran/ChangeLog:

* resolve.c (resolve_component): Add check for valid CLASS
reference before trying to access CLASS data.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index fa6f756d285..891db391907 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -14384,7 +14396,7 @@ resolve_component (gfc_component *c, gfc_symbol *sym)
   /* F2008, C448.  */
   if (c->ts.type == BT_CLASS)
 {
-  if (CLASS_DATA (c))
+  if (c->attr.class_ok && CLASS_DATA (c))
 	{
 	  attr = &(CLASS_DATA (c)->attr);
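
The guard added by the patch can be illustrated outside the compiler.  The
following is only a minimal sketch with made-up types: `toy_component',
`toy_class_data' and `toy_component_attr' are hypothetical stand-ins, not
the real gfortran data structures, and the meaning given to `class_ok' here
is an assumption of the toy model.  It shows the general pattern of testing
a validity flag before following a pointer that may be stale after error
recovery.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the front end's data structures; these are
   NOT the real gfc_component / gfc_symbol definitions.  */
struct toy_class_data
{
  int attr;
};

struct toy_component
{
  int class_ok;                       /* nonzero only if the CLASS data is valid */
  struct toy_class_data *class_data;  /* may be stale after error recovery */
};

/* Return the attribute if the component is usable, or -1 otherwise.
   The flag is tested first, so the pointer is never dereferenced when
   the flag says the data cannot be trusted.  */
int
toy_component_attr (const struct toy_component *c)
{
  if (c->class_ok && c->class_data != NULL)
    return c->class_data->attr;
  return -1;
}
```

Because `&&' short-circuits, the pointer test and the dereference are
skipped entirely whenever the flag is clear.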



Re: [PATCH] IBM Z: Fix check_effective_target_s390_z14_hw

2021-01-05 Thread Andreas Krebbel via Gcc-patches
On 1/5/21 7:37 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on z14.  Ok for master?
> 
> 
> 
> Commit 2f473f4b065d ("IBM Z: Do not run long double tests on old
> machines") introduced a predicate for tests that must run only on z14+.
> However, due to a syntax error, the predicate always returns false.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-10  Ilya Leoshkevich  
> 
>   * gcc.target/s390/s390.exp: Replace %% with %.

Ok. Thanks!

Andreas



Re: Go patch committed: Accept -fgo-embedcfg option

2021-01-05 Thread Ian Lance Taylor via Gcc-patches
On Tue, Jan 5, 2021 at 7:15 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Tue, Jan 05, 2021 at 11:06:27AM +0100, Andreas Schwab wrote:
> > FAIL: compiler driver --help=go option(s): "^ +-.*[^:.]$" absent from 
> > output: "  -fgo-embedcfg=List embedded files via go:embed"
>
> Fixed thusly, committed as obvious.
>
> 2021-01-05  Jakub Jelinek  
>
> * lang.opt (fgo-embedcfg=): Add full stop at the end of description.

Thanks.

Ian


Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread Rainer Orth
Hi Jeff,

> On 1/5/21 10:54 AM, Rainer Orth wrote:
>>
>> I fear I'm a bit lost here myself.  I do have a little experience
>> running various builders:
>>
>> * I inherited a Golang one on Solaris/amd64 (based on their own builder
>>   infrastructure).
>>
>> * I do run builders for GDB (mostly dormant since Sergio left RedHat)
>>   and LLVM on Solaris/amd64 and sparcv9 (both using buildbot).
>>
>> In all three cases the projects provide documentation how to configure
>> your own builders and add them to the infrastructure.  Is something like
>> this possible for the GCC Jenkins (say adding Solaris builders) and if
>> so how?  Or would one need to setup one's own instance, in which case it
>> would be extremely helpful to learn the necessary config: doing
>> something like this from scratch is a major effort, as seen in Paul
>> Matos' effort (also buildbot-based) of a couple of years ago.
> We don't have any procedures in place for this (yet).  I'd like to add
> them, but I'm swamped.

understood.  Often it's easier for an outsider to document a procedure
since he's certain to stumble across every possible roadblock someone
familiar with the system has long forgotten about.

> I'm certainly open to having others contribute here.  As a long standing
> member of the community I'd be happy to set up an account for you so you
> could wire in a sparc/solaris system executor and set up the build scripts.

That would be nice.  Although my current manual daily regtests do help
and a considerable part of the work is investigating and reporting
failures found, any automatism takes part of the legwork.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] dec_math.f90 needs to be xfailed

2021-01-05 Thread Steve Kargl via Gcc-patches
On Tue, Jan 05, 2021 at 11:26:24AM -0700, Jeff Law wrote:
> 
> 
> On 1/4/21 3:28 PM, Steve Kargl wrote:
> > On Mon, Jan 04, 2021 at 02:30:43PM -0700, Jeff Law wrote:
> >> On 1/2/21 1:34 AM, Steve Kargl via Gcc-patches wrote:
> >>> Can someone, anyone, please commit the following trivial patch?
> >>> gfortran.dg/dec_math.f90 will never pass on i?86-*-freebsd*.
> >> Why will the test never pass on that platform?  I don't mind installing
> >> the patch, but I'd like to have a bit more background first :-)
> >>
> > The testcase assumes REAL(10) has 64-bits of precision.  On
> > i?86-*-freebsd, the i387 FPU control word is set to 53-bits.
> > The test program is not set up to deal with 11-bits of 
> > missing precision.
> Thanks.  That's precisely what I needed to know.  I suspected it was
> related to the differing state of the fpu control word.  But that begs
> the question of whether or not the change should apply to the other BSD
> variants.
> 

I don't know about other BSD variants.  The setting of the control
word was done some 27 years ago on i?86-FreeBSD.

Hmmm.  A little code spelunking back to original FreeBSD 2.0.5,

https://svnweb.freebsd.org/base/stable/2.0.5/sys/i386/include/npx.h?revision=4=markup

Lines 101-132 provide the justification for the control word.
AFAIK, older FreeBSD sources are not published on FreeBSD.org due
to the USL lawsuit.

If this file is current

http://mirror.nyi.net/NetBSD/misc/joerg/GENERIC/src/src/sys/arch/x86/include/cpu_extended_state.h.html

then NetBSD is not affected unless you are using an older version.  See
lines 196-220.
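
To put the 11 missing bits in perspective, here is a small sketch of the
arithmetic.  The function names are made up for illustration, and the code
only models the significand widths mentioned above (64 bits versus 53
bits); it does not touch the FPU control word.

```c
/* Machine epsilon for a binary significand of BITS bits (leading bit
   included): 2^-(BITS-1).  Computed by repeated halving so that no
   math library is needed.  */
double
epsilon_for_bits (int bits)
{
  double eps = 1.0;
  for (int i = 1; i < bits; i++)
    eps /= 2.0;
  return eps;
}

/* How many times coarser a P_LOW-bit result is than a P_HIGH-bit one.  */
double
precision_ratio (int p_high, int p_low)
{
  return epsilon_for_bits (p_low) / epsilon_for_bits (p_high);
}
```

With these definitions, `precision_ratio (64, 53)' evaluates to 2^11, so a
tolerance tuned for a 64-bit significand is far too tight when the
hardware rounds to 53 bits.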

-- 
Steve


Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-05 Thread Qing Zhao via Gcc-patches
I am attaching my current (incomplete) patch to gcc for your reference.

From a71eb73bee5857440c4ff67c4c82be115e0675cb Mon Sep 17 00:00:00 2001
From: qing zhao 
Date: Sat, 12 Dec 2020 00:02:28 +0100
Subject: [PATCH] First version of -ftrivial-auto-var-init

---
 gcc/common.opt| 35 ++
 gcc/flag-types.h  | 14 
 gcc/gimple-pretty-print.c |  2 +-
 gcc/gimplify.c| 90 +++
 gcc/internal-fn.c | 20 +++
 gcc/internal-fn.def   |  5 +++
 gcc/tree-cfg.c|  3 ++
 gcc/tree-ssa-uninit.c |  3 ++
 gcc/tree-ssa.c|  5 +++
 9 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 6645539f5e5..c4c4fc28ef7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3053,6 +3053,41 @@ ftree-scev-cprop
 Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
 Enable copy propagation of scalar-evolution information.
 
+ftrivial-auto-var-init=
+Common Joined RejectNegative Enum(auto_init_type) Var(flag_trivial_auto_var_init) Init(AUTO_INIT_UNINITIALIZED)
+-ftrivial-auto-var-init=[uninitialized|pattern|zero]   Add initializations to automatic variables.
+
+Enum
+Name(auto_init_type) Type(enum auto_init_type) UnknownError(unrecognized automatic variable initialization type %qs)
+
+EnumValue
+Enum(auto_init_type) String(uninitialized) Value(AUTO_INIT_UNINITIALIZED)
+
+EnumValue
+Enum(auto_init_type) String(pattern) Value(AUTO_INIT_PATTERN)
+
+EnumValue
+Enum(auto_init_type) String(zero) Value(AUTO_INIT_ZERO)
+
+fauto-var-init-approach=
+Common Joined RejectNegative Enum(auto_init_approach) Var(flag_auto_init_approach) Init(AUTO_INIT_A)
+-fauto-var-init-approach=[A|B|C|D] Choose the approach to initialize automatic variables.
+
+Enum
+Name(auto_init_approach) Type(enum auto_init_approach) UnknownError(unrecognized automatic variable initialization approach %qs)
+
+EnumValue
+Enum(auto_init_approach) String(A) Value(AUTO_INIT_A)
+
+EnumValue
+Enum(auto_init_approach) String(B) Value(AUTO_INIT_B)
+
+EnumValue
+Enum(auto_init_approach) String(C) Value(AUTO_INIT_C)
+
+EnumValue
+Enum(auto_init_approach) String(D) Value(AUTO_INIT_D)
+
 ; -fverbose-asm causes extra commentary information to be produced in
 ; the generated assembly code (to make it more readable).  This option
 ; is generally only of use to those who actually need to read the
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 9342bd87be3..bfd0692b82c 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -242,6 +242,20 @@ enum vect_cost_model {
   VECT_COST_MODEL_DEFAULT = 1
 };
 
+/* Automatic variable initialization type.  */
+enum auto_init_type {
+  AUTO_INIT_UNINITIALIZED = 0,
+  AUTO_INIT_PATTERN = 1,
+  AUTO_INIT_ZERO = 2
+};
+
+enum auto_init_approach {
+  AUTO_INIT_A = 0,
+  AUTO_INIT_B = 1,
+  AUTO_INIT_C = 2,
+  AUTO_INIT_D = 3
+};
+
 /* Different instrumentation modes.  */
 enum sanitize_code {
   /* AddressSanitizer.  */
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 075d6e5208a..1044d54e8d3 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -81,7 +81,7 @@ newline_and_indent (pretty_printer *buffer, int spc)
 DEBUG_FUNCTION void
 debug_gimple_stmt (gimple *gs)
 {
-  print_gimple_stmt (stderr, gs, 0, TDF_VOPS|TDF_MEMSYMS);
+  print_gimple_stmt (stderr, gs, 0, TDF_VOPS|TDF_MEMSYMS|TDF_LINENO|TDF_ALIAS);
 }
 
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 54cb66bd1dd..1eb0747ea2f 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1674,6 +1674,16 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
   return GS_ALL_DONE;
 }
 
+/* Return the value that is used to initialize the vla DECL based 
+   on INIT_TYPE.  */
+tree memset_init_node (enum auto_init_type init_type)
+{
+  if (init_type == AUTO_INIT_ZERO)
+return integer_zero_node;
+  else
+gcc_assert (0);
+}
+
 /* Gimplify a variable-length array DECL.  */
 
 static void
@@ -1712,6 +1722,19 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p)
 
   gimplify_and_add (t, seq_p);
 
+  /* Add a call to memset to initialize this vla when the user requested.  */
+  if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
+  && !DECL_ARTIFICIAL (decl)
+  && VAR_P (decl) 
+  && !DECL_EXTERNAL (decl) 
+  && !TREE_STATIC (decl))
+  {
+t = builtin_decl_implicit (BUILT_IN_MEMSET);
+tree init_node = memset_init_node (flag_trivial_auto_var_init);
+t = build_call_expr (t, 3, addr, init_node, DECL_SIZE_UNIT (decl)); 
+gimplify_and_add (t, seq_p);
+  }
+
   /* Record the dynamic allocation associated with DECL if requested.  */
   if (flag_callgraph_info & CALLGRAPH_INFO_DYNAMIC_ALLOC)
 record_dynamic_alloc (decl);
@@ -1734,6 +1757,63 @@ force_labels_r (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
   return NULL_TREE;
 }
 
+
+/* Build a call to internal const function DEFERRED_INIT,
+   1st argument: DECL;
+  

Re: [PATCH] avr: cc0 to mode_cc conversion

2021-01-05 Thread Paul Koning via Gcc-patches



> On Jan 5, 2021, at 8:54 AM, Senthil Kumar Selvaraj via Gcc-patches 
>  wrote:
> 
> 
> Senthil Kumar Selvaraj writes:
> 
>> Georg-Johann Lay writes:
>> 
>> ...
>>> 
>>> 2) We just saw hundreds of insns being duplicated, basically the whole
>>> machine description except for the few insns that leave cc alone.
>>> Isn't it possible to use define_subst for the bulk of the insns and
>>> get neat code that's easier to grasp and to maintain?
>>> After all it's just appending a clobber of reg_cc, and in the current
>>> proposal almost 50% of the backend is just redundant repetitions of
>>> previous insns.
> 
> I could not find a way to get define_subst to do define_insn_and_split -
> other targets using the same approach (pdp11, h8300) have the
> duplication as well.

I ran into the same issue; I tried as well, for the obvious reason.  I'm pretty 
sure someone told me (a) that doesn't work, and (b) the reason is xyzzy.  But I 
no longer remember what the reason is, or even if I was told one.

The impression I have is that define_subst isn't a macro facility, even though 
it looks a bit like one, and that may be why it can't do what you want to do 
here.

paul




The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-05 Thread Qing Zhao via Gcc-patches
Hi,

This is an update for our previous discussion. 

1. I implemented the following two approaches in the latest upstream gcc:

A. Add real initialization during gimplification; do not maintain the
   uninitialized warnings.

D. Add calls to .DEFERRED_INIT during gimplification and expand them to
   real initialization during expand.  Adjust the uninitialized pass to
   handle the new references to “.DEFERRED_INIT”.

Note, in this initial implementation:
** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation of
   -ftrivial-auto-var-init=pattern is not done yet.  Therefore, the
   performance data is only about -ftrivial-auto-var-init=zero.
** I added a temporary option -fauto-var-init-approach=A|B|C|D to choose
   implementation A or D for the runtime performance study.
** I didn’t finish the uninitialized warnings maintenance work for D.
   (That might take more time than I expected.)
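
As a rough source-level picture of what -ftrivial-auto-var-init=zero means
for a variable-length array, consider the sketch below.  It is only a
model: the function name is made up, and the explicit memset stands in for
the call that the compiler would insert on the user's behalf; it is not the
output of either approach.

```c
#include <string.h>

/* Sum the elements of a VLA that nothing but the modelled automatic
   initialization has written to.  N must be positive.  */
int
sum_zero_initialized_vla (int n)
{
  int buf[n];

  /* Stand-in for the initialization the compiler would add right after
     the VLA is allocated.  */
  memset (buf, 0, sizeof buf);

  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}
```

Without the memset the loop would read indeterminate values; with it the
function returns 0 for every positive N.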

2. I collected runtime data for CPU2017 on an x86 machine with this new gcc for 
the following 3 cases:

no: default. (-g -O2 -march=native )
A:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=A 
D:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=D 

I then computed the slowdown for both A and D as follows:

benchmarks          A / no    D / no

500.perlbench_r      1.25%     1.25%
502.gcc_r            0.68%     1.80%
505.mcf_r            0.68%     0.14%
520.omnetpp_r        4.83%     4.68%
523.xalancbmk_r      0.18%     1.96%
525.x264_r           1.55%     2.07%
531.deepsjeng_r     11.57%    11.85%
541.leela_r          0.64%     0.80%
557.xz_r            -0.41%    -0.41%

507.cactuBSSN_r      0.44%     0.44%
508.namd_r           0.34%     0.34%
510.parest_r         0.17%     0.25%
511.povray_r        56.57%    57.27%
519.lbm_r            0.00%     0.00%
521.wrf_r           -0.28%    -0.37%
526.blender_r       16.96%    17.71%
527.cam4_r           0.70%     0.53%
538.imagick_r        2.40%     2.40%
544.nab_r            0.00%    -0.65%

avg                  5.17%     5.37%
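
The averages in the last row are plain arithmetic means over the 19
benchmarks.  The following sketch recomputes them; the array and function
names are made up for illustration, and the numbers are copied from the
table above.

```c
#include <stddef.h>

/* Slowdown percentages for approach A, in table order.  */
static const double slowdown_a[] = {
  1.25, 0.68, 0.68, 4.83, 0.18, 1.55, 11.57, 0.64, -0.41,
  0.44, 0.34, 0.17, 56.57, 0.00, -0.28, 16.96, 0.70, 2.40, 0.00
};

/* Slowdown percentages for approach D, in table order.  */
static const double slowdown_d[] = {
  1.25, 1.80, 0.14, 4.68, 1.96, 2.07, 11.85, 0.80, -0.41,
  0.44, 0.34, 0.25, 57.27, 0.00, -0.37, 17.71, 0.53, 2.40, -0.65
};

/* Arithmetic mean of N percentages.  */
double
mean_percent (const double *v, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}
```

Note that the mean is dominated by a few outliers (511.povray_r and
526.blender_r in particular), so a median might be a useful complement.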

From the above data, we can see that, in general, the runtime slowdown of
implementations A and D is similar for individual benchmarks.

Several benchmarks show a significant slowdown from the newly added
initialization with both A and D, for example 511.povray_r, 526.blender_r,
and 531.deepsjeng_r.  I will try to study in more detail which kinds of
new initializations introduce this slowdown.

From the study so far, I think that approach D should be good enough
for our final implementation.
So, I will try to finish approach D with the following remaining work:

  ** complete the implementation of -ftrivial-auto-var-init=pattern;
  ** complete the implementation of uninitialized warnings maintenance work 
for D. 


Let me know if you have any comments and suggestions on my current and future 
work.

Thanks a lot for your help.

Qing

> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> The following are the approaches I will implement and compare:
> 
> Our final goal is to keep the uninitialized warning and minimize the run-time 
> performance cost.
> 
> A. Adding real initialization during gimplification, not maintaining the
> uninitialized warnings.
> B. Adding real initialization during gimplification, marking it with
> “artificial_init”.  Adjusting the uninitialized pass, maintaining the
> annotation, and making sure the real init is not deleted from the fake
> init.
> C. Marking the DECL for an uninitialized auto variable as
> “no_explicit_init” during gimplification, maintaining this
> “no_explicit_init” bit till after pass_late_warn_uninitialized, or till
> pass_expand, and adding real initialization for all DECLs that are
> marked with “no_explicit_init”.
> D. Adding .DEFERRED_INIT during gimplification and expanding the
> .DEFERRED_INIT during expand to real initialization.  Adjusting the
> uninitialized pass with the new refs with “.DEFERRED_INIT”.
> 
> 
> In the above, approach A will be the one that has the minimum run-time
> cost; it will be the base for the performance comparison.
> 
> I will implement approach D next.  This one is expected to have the most
> run-time overhead among the above list, but its implementation should be
> the cleanest among B, C, and D.  Let’s see how much more performance
> overhead this approach has.  If the data is good, maybe we can avoid the
> effort to implement B and C.
> 
> If the performance of D is not good, I will implement B or C at that time.
> 
> Let me know if you have any comment or suggestions.
> 
> Thanks.
> 
> Qing



[PATCH] x86: Use unsigned short to compute pextrw result

2021-01-05 Thread H.J. Lu via Gcc-patches
On Mon, Jan 4, 2021 at 7:41 PM Jeff Law  wrote:
>
>
>
> On 1/1/21 6:34 AM, H.J. Lu via Gcc-patches wrote:
> > _mm_extract_pi16 is intrinsic for pextrw, which should be zero-extended,
> > not sign-extended.
> >
> > gcc/
> >
> >   PR target/98495
> >   * config/i386/xmmintrin.h (_mm_extract_pi16): Cast to unsigned
> >   short first.
> I'd tend to prefer masking with 0x  rather than relying on the size
> of a particular type being what we need.  But this header is limited to
> just x86 and it doesn't look like there's any variance in the size of a
> short, across the x86 platforms.
>
> So, OK.
> jeff
>

I am checking in this patch to use unsigned short to compute the
zero-extended pextrw result.  This fixed:

FAIL: gcc.target/i386/sse2-mmx-pextrw.c execution test
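
For readers following along, the difference between the two extensions can
be sketched in portable C.  This is only an illustration: the function
names are made up, the 64-bit integer stands in for an __m64 value, and
word 0 is taken to be the least significant 16 bits.

```c
#include <stdint.h>
#include <string.h>

/* Return word IMM (0..3) of V, sign-extended to int.  This mirrors what
   reading the word through a `short' does.  */
int
word_sign_extended (uint64_t v, unsigned int imm)
{
  uint16_t raw = (uint16_t) (v >> (16 * (imm & 3)));
  int16_t s;
  memcpy (&s, &raw, sizeof s);  /* reinterpret the bits without UB */
  return s;
}

/* Return word IMM (0..3) of V, zero-extended to int.  This mirrors what
   reading the word through an `unsigned short' does, and is equivalent
   to masking the shifted value to its low 16 bits.  */
int
word_zero_extended (uint64_t v, unsigned int imm)
{
  return (uint16_t) (v >> (16 * (imm & 3)));
}
```

The two functions agree whenever the top bit of the selected word is
clear and differ by 65536 whenever it is set, which is why a reference
computation using `short' can disagree with a zero-extending intrinsic.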

-- 
H.J.
From 4b3d73a439caffd82eba0a64ee43bae5d5e07de9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 5 Jan 2021 10:57:20 -0800
Subject: [PATCH] x86: Use unsigned short to compute pextrw result

Use unsigned short to compute the zero-extended pextrw result.

	PR target/98495
	* gcc.target/i386/sse2-mmx-pextrw.c (compute_correct_result): Use
	unsigned short to compute pextrw result.
---
 gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c b/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
index bb48740a7ca..edbac919fd8 100644
--- a/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
+++ b/gcc/testsuite/gcc.target/i386/sse2-mmx-pextrw.c
@@ -32,7 +32,7 @@ test_pextrw (__m64 *i, unsigned int imm, int *r)
 static void
 compute_correct_result (__m64 *src_p, unsigned int imm, int *res_p)
 {
-  short *src = (short *) src_p;
+  unsigned short *src = (unsigned short *) src_p;
   if (imm < 4)
 *res_p = src[imm];
 }
-- 
2.29.2



Re: [PATCH] add g_nonstandard_bool attribute for GIMPLE FE use

2021-01-05 Thread Joseph Myers
On Tue, 5 Jan 2021, Richard Biener wrote:

> would maybe result in a surprising result.  One alternative
> would be to make the attribute have the signedness specified as well
> (C doesn't accept 'unsigned _Bool' or 'signed _Bool') or
> simply name the attribute "signed_bool_precision".  I guess the bool case
> is really special compared to the desire to eventually allow
> declaring of a 3 bit precision signed/unsigned integer type.
> 
> Allowing 'signed _Bool' with -fgimple might be another option
> of course.

Something that makes clear it's a signed boolean type with the given 
precision seems a good idea (I'd have assumed a nonstandard boolean type 
with a given precision was unsigned).
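
Why the signedness matters even for a one-bit type can be seen with
ordinary C bit-fields.  This is only an analogy, not the proposed GIMPLE
FE attribute; the struct and function names are made up, and it assumes a
two's-complement target.

```c
/* One-bit fields; a signed one-bit field can hold 0 and -1, an unsigned
   one-bit field can hold 0 and 1.  */
struct one_bit
{
  signed int s : 1;
  unsigned int u : 1;
};

/* The only nonzero value a signed one-bit field can represent.  */
int
signed_one_bit_true (void)
{
  struct one_bit b = { 0, 0 };
  b.s = -1;
  return b.s;
}

/* The only nonzero value an unsigned one-bit field can represent.  */
int
unsigned_one_bit_true (void)
{
  struct one_bit b = { 0, 0 };
  b.u = 1;
  return b.u;
}
```

So "true" reads back as -1 in the signed case and as 1 in the unsigned
case, which is why a name that spells out the signedness avoids surprises.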

-- 
Joseph S. Myers
jos...@codesourcery.com


[r11-6464 Regression] FAIL: gcc.target/i386/sse2-mmx-pextrw.c execution test on Linux/x86_64

2021-01-05 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

af60b0ec79e9c5d7116122b185e44927aca5aa07 is the first bad commit
commit af60b0ec79e9c5d7116122b185e44927aca5aa07
Author: H.J. Lu 
Date:   Fri Jan 1 05:30:34 2021 -0800

x86: Cast to unsigned short first for _mm_extract_pi16

caused

FAIL: gcc.target/i386/sse2-mmx-pextrw.c execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-6464/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/sse2-mmx-pextrw.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/sse2-mmx-pextrw.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/sse2-mmx-pextrw.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/sse2-mmx-pextrw.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[PATCH] IBM Z: Fix check_effective_target_s390_z14_hw

2021-01-05 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on z14.  Ok for master?



Commit 2f473f4b065d ("IBM Z: Do not run long double tests on old
machines") introduced a predicate for tests that must run only on z14+.
However, due to a syntax error, the predicate always returns false.

gcc/testsuite/ChangeLog:

2020-12-10  Ilya Leoshkevich  

* gcc.target/s390/s390.exp: Replace %% with %.
---
 gcc/testsuite/gcc.target/s390/s390.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index ba493de9f95..57b2690f8ab 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -197,7 +197,7 @@ proc check_effective_target_s390_z14_hw { } {
int main (void)
{
int x = 0;
-   asm ("msgrkc %%0,%%0,%%0" : "+r" (x) : );
+   asm ("msgrkc %0,%0,%0" : "+r" (x) : );
return x;
}
 }] "-march=z14 -m64 -mzarch" ] } { return 0 } else { return 1 }
-- 
2.26.2
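
For readers unfamiliar with extended asm templates: when an asm statement has operands, "%%" stands for a literal '%' and "%0" refers to operand 0.  The following is only a toy model of that substitution, not GCC's implementation, and it assumes that the doubled "%%" reached the compiler unchanged before the fix.  The function name expand_template and the register name "%r1" are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of extended-asm template expansion; NOT GCC's code.
   "%%" becomes a literal '%' and "%<digit>" becomes OPERANDS[digit].
   Returns 0 on success, -1 if OUT is too small or the operand number
   is out of range.  */
static int
expand_template (const char *tmpl, const char *const *operands,
                 size_t n_operands, char *out, size_t out_size)
{
  size_t len = 0;
  if (out_size == 0)
    return -1;
  while (*tmpl)
    {
      const char *piece = tmpl;   /* Default: copy one character.  */
      size_t piece_len = 1;
      if (tmpl[0] == '%' && tmpl[1] == '%')
        tmpl += 2;                /* Emit a single '%'.  */
      else if (tmpl[0] == '%' && tmpl[1] >= '0' && tmpl[1] <= '9')
        {
          size_t idx = (size_t) (tmpl[1] - '0');
          if (idx >= n_operands)
            return -1;
          piece = operands[idx];
          piece_len = strlen (piece);
          tmpl += 2;
        }
      else
        tmpl += 1;
      if (len + piece_len >= out_size)
        return -1;
      memcpy (out + len, piece, piece_len);
      len += piece_len;
    }
  out[len] = '\0';
  return 0;
}

/* With "%0" the operand is substituted; with "%%0" the assembler would
   be handed the text "%0", which is not a register.  */
static void
check_expand_template (void)
{
  const char *const ops[] = { "%r1" };   /* Assumed register name.  */
  char buf[64];
  assert (expand_template ("msgrkc %0,%0,%0", ops, 1, buf, sizeof buf) == 0);
  assert (strcmp (buf, "msgrkc %r1,%r1,%r1") == 0);
  assert (expand_template ("msgrkc %%0,%%0,%%0", ops, 1, buf, sizeof buf) == 0);
  assert (strcmp (buf, "msgrkc %0,%0,%0") == 0);
}
```

Under that assumption the second expansion is what would make the probe program fail to assemble, so the effective-target check would report "not z14" on every machine.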



Re: [PATCH] dec_math.f90 needs to be xfailed

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/4/21 3:28 PM, Steve Kargl wrote:
> On Mon, Jan 04, 2021 at 02:30:43PM -0700, Jeff Law wrote:
>> On 1/2/21 1:34 AM, Steve Kargl via Gcc-patches wrote:
>>> Can someone, anyone, please commit the following trivial patch?
>>> gfortran.dg/dec_math.f90 will never pass on i?86-*-freebsd*.
>> Why will the test never pass on that platform?  I don't mind installing
>> the patch, but I'd like to have a bit more background first :-)
>>
> The testcase assumes REAL(10) has 64-bits of precision.  On
> i?86-*-freebsd, the i387 FPU control word is set to 53-bits.
> The test program is not set up to deal with 11-bits of 
> missing precision.
Thanks.  That's precisely what I needed to know.  I suspected it was
related to the differing state of the fpu control word.  But that begs
the question of whether or not the change should apply to the other BSD
variants.

jeff
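
The arithmetic behind the "11 bits" above can be written down as a short sketch.  It only computes the spacing of representable values near 1.0 for a given significand width; it does not touch the FPU control word, and the name unit_epsilon is made up for illustration.

```c
#include <assert.h>

/* Spacing between 1.0 and the next representable value for a binary
   format with MANT_BITS significand bits, i.e. 2^(1 - MANT_BITS).  */
static double
unit_epsilon (int mant_bits)
{
  double eps = 1.0;
  for (int i = 1; i < mant_bits; i++)
    eps /= 2.0;
  return eps;
}

/* 64-bit significand (x87 extended) versus a 53-bit precision setting:
   11 bits are missing, so results are up to 2^11 times coarser.  */
static void
check_precision_gap (void)
{
  assert (64 - 53 == 11);
  assert (unit_epsilon (53) / unit_epsilon (64) == 2048.0);
}
```

A test written with tolerances sized for a 64-bit significand would therefore be off by a factor of up to 2048 when the hardware rounds to 53 bits.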



Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 10:54 AM, Rainer Orth wrote:
>
> I fear I'm a bit lost here myself.  I do have a little experience
> running various builders:
>
> * I inherited a Golang one on Solaris/amd64 (based on their own builder
>   infrastructure).
>
> * I do run builders for GDB (mostly dormant since Sergio left RedHat)
>   and LLVM on Solaris/amd64 and sparcv9 (both using buildbot).
>
> In all three cases the projects provide documentation how to configure
> your own builders and add them to the infrastructure.  Is something like
> this possible for the GCC Jenkins (say adding Solaris builders) and if
> so how?  Or would one need to set up one's own instance, in which case it
> would be extremely helpful to learn the necessary config: doing
> something like this from scratch is a major effort, as seen in Paul
> Matos' effort (also buildbot-based) of a couple of years ago.
We don't have any procedures in place for this (yet).  I'd like to add
them, but I'm swamped.

I'm certainly open to having others contribute here.  As a long standing
member of the community I'd be happy to set up an account for you so you
could wire in a sparc/solaris system executor and set up the build scripts.

Jeff



Re: [PATCH] ira: Skip some pseudos in move_unallocated_pseudos

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/4/21 7:36 PM, Kewen.Lin wrote:
> Hi Jeff,
>
> on 2021/1/5 上午7:13, Jeff Law wrote:
>>
>> On 12/22/20 11:40 PM, Kewen.Lin via Gcc-patches wrote:
>>> Hi Segher,
>>>
>>> on 2020/12/22 下午9:55, Segher Boessenkool wrote:
 Hi!

 Just a dumb formatting comment:

 On Tue, Dec 22, 2020 at 04:05:39PM +0800, Kewen.Lin wrote:
> This patch is to make move_unallocated_pseudos consistent
> to what we have in function find_moveable_pseudos, where we
> record the original pseudo into pseudo_replaced_reg only if
> validate_change succeeds with newreg.  To ensure every
> unallocated pseudo in move_unallocated_pseudos has expected
> information, it's better to add a check and skip it if it's
> unexpected.  This avoids possible ICEs in future.
>
> btw, I happened to find this while bootstrapping one
> experimental local patch, which is considered impractical.
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -5111,6 +5111,11 @@ move_unallocated_pseudos (void)
>{
>   int idx = i - first_moveable_pseudo;
>   rtx other_reg = pseudo_replaced_reg[idx];
> + /* If there is no appropriate pseudo in pseudo_replaced_reg, it
> +means validate_change fails for this new pseudo in function
> +find_moveable_pseudos, then bypass it here.*/
 Dot space space.
>>> Good catch, thanks!  I forgot to reformat after polishing the comments.
>>> Will fix it with other potential comments.
>>>
 The patch sounds fine to me.  Hard to tell without seeing the patch that
 exposed the problem (for onlookers like me who do not know this code
 well, anyway ;-) )
>>> The patch which exposed this issue looks like:
>>>
>>> +; Like *rotl3_insert_3 but work with nonzero_bits rather than
>>> +; explicit AND.
>>> +(define_insn "*rotl3_insert_8"
>>> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>>> +(ior:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
>>> + (match_operand:SI 2 "u6bit_cint_operand" "n"))
>>> + (match_operand:GPR 3 "gpc_reg_operand" "0")))]
>>> +  "HOST_WIDE_INT_1U << INTVAL (operands[2])
>>> +   > nonzero_bits (operands[3], mode)"
>>> +{
>>> +  if (mode == SImode)
>>> +return "rlwimi %0,%1,%h2,0,31-%h2";
>>> +  else
>>> +return "rldimi %0,%1,%H2,0";
>>> +}
>>> +  [(set_attr "type" "insert")])
>>>
>>> Some insn matches this pattern in combine, later ira tries to introduce
>>> one new pseudo since it meets the checks in find_moveable_pseudos, but
>>> it fails in the call to validate_change since nonzero_bits is less
>>> precise there and can't satisfy the pattern condition, leaving the unexpected
>>> entry in pseudo_replaced_reg.
>> But what doesn't make any sense to me is pseudo_replaced_reg[] is only
>> set when validation is successful in find_moveable_pseudos.   So I can't
>> see how this patch actually helps the problem you're describing.
>>
> Yeah, pseudo_replaced_reg[] is only set when validation is successful,
> but we bump the max pseudo number in ira_create_new_reg as below
> regardless of whether validation succeeds or not:
>
> rtx newreg = ira_create_new_reg (def_reg);
> if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0))
>
> Later, in move_unallocated_pseudos, the iteration can cover those
> pseudos which were created but not used due to failed validation.
>
>   for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++)
> if (reg_renumber[i] < 0)
>   {
>   int idx = i - first_moveable_pseudo;
>   rtx other_reg = pseudo_replaced_reg[idx];// (1)
>   rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i));
>   /* The use must follow all definitions of OTHER_REG, so we can
>  insert the new definition immediately after any of them.  */
>   df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg))
>
> Then we can get a NULL other_reg at (1), and also unexpected df info,
> which causes the ICE.  The patch skips the handling of those pseudos which
> were intended to be used in the validated insn but failed validation.
I was wondering if it was somehow related to creation of new pseudos. 
The other important tidbit here is we reset last_moveable_pseudo near the
end of find_moveable_pseudos.

OK for the trunk with an expanded comment.

Thanks,
jeff
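
The situation discussed in this thread can be modelled with a minimal sketch: an index is consumed for every new pseudo, but the side table is only filled in when validation succeeds, so a later walk over all indices has to skip empty slots.  Everything below is a toy stand-in with hypothetical names; it is not the code in ira.c.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEW 8

/* Toy stand-ins; all names here are hypothetical, not GCC's.  */
static const char *replaced[MAX_NEW];   /* Plays pseudo_replaced_reg[].  */
static int n_created;                   /* Plays the bumped pseudo count.  */

/* Always consumes an index, but records ORIGINAL only when VALIDATED,
   mirroring "create the register first, validate the change second".  */
static void
try_new_pseudo (const char *original, int validated)
{
  assert (n_created < MAX_NEW);
  int idx = n_created++;
  if (validated)
    replaced[idx] = original;
}

/* Walk every created index and skip slots that were never recorded,
   which is the check the patch adds in spirit.  */
static int
count_recorded (void)
{
  int count = 0;
  for (int idx = 0; idx < n_created; idx++)
    {
      if (replaced[idx] == NULL)
        continue;               /* Validation failed for this one.  */
      count++;
    }
  return count;
}

static void
check_skip (void)
{
  try_new_pseudo ("r100", 1);
  try_new_pseudo ("r101", 0);   /* Created, but validation failed.  */
  try_new_pseudo ("r102", 1);
  assert (n_created == 3);
  assert (replaced[1] == NULL);
  assert (count_recorded () == 2);
}
```

Without the NULL check in the loop, the middle slot would be dereferenced as if it held a register, which corresponds to the NULL other_reg described above.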



Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread Rainer Orth
Hi Jeff,

> On 1/5/21 10:09 AM, abebeos wrote:
>>
>>
>> On Tue, 5 Jan 2021 at 18:50, Jeff Law > > wrote:
>>
>>
>>
>> On 1/5/21 2:18 AM, abebeos wrote:
>> >
>> > On Mon, 4 Jan 2021 at 21:40, Jeff Law > 
>> > >> wrote:
>> >
>> >     On 12/31/20 7:13 AM, abebeos wrote:
>> >     [...]
>> >     >     >         I'm definitely curious about the testing
>> setup and
>> >     >     whether or
>> >     >     >         not it can
>> >     >     >         be replicated into our Jenkins setup. 
>> >     >     >
>> >     >     >
>> >     >     >     Where can I find this Jenkins setup?
>> >     >     >
>> >     >     >
>> >     >     > To close this: assuming " into our Jenkins setup" is
>> some
>> >     redhat
>> >     >     > internal jenkins setup.
>> >     >     No, it's public.
>> >     >
>> >     >     http://gcc.gnu.org/jenkins
>>  > >
>> >     
>> >>
>> >     >
>> >     >
>> >     > (sidenote: This resolves on my side to the (insecure)
>> >     > http://3.14.90.209:8080/ 
>> >
>> >     
>> >>)
>> >     Yup.
>> >
>> >     >
>> >     > Is the source-code of  http://gcc.gnu.org/jenkins
>> 
>> >     >
>> >     > 
>> >>
>> >     available somewhere? I could not locate it.
>> >     Jenkins is a project independent of GCC for building continuous
>> >     testing/delivery systems.  See http://jenkins.io
>>  >
>> >
>> >
>> > Oh, my bad - I was referring to the sources of gcc's project jenkins
>> > setup (the scripts, configs etc. for the different targets,
>> including
>> > avr).
>> The Generators subdirectory has jobs which are used to rebuild the
>> various target jobs.  They're broadly categorized by the type of
>> build. 
>> ie, pure native, qemu-emulated native, glibc cross, newlib cross
>> and no
>> runtime library.  avr IIRC fits into the final category as it doesn't
>> have an upstreamed glibc or newlib port.
>>
>>
>> Ok, but I'm still unable to find the sources ("Generators
>> subdirectory"?). Can you (or anyone else) give me a direct link to the
>> sources? E.g. I want to change the avr part, where do I start
>> (usually, a git repo.)?
> You're not going to be able to change the scripts.   But they are
> accessible from the web site.  They're not in GIT or anything like that.

I fear I'm a bit lost here myself.  I do have a little experience
running various builders:

* I inherited a Golang one on Solaris/amd64 (based on their own builder
  infrastructure).

* I do run builders for GDB (mostly dormant since Sergio left RedHat)
  and LLVM on Solaris/amd64 and sparcv9 (both using buildbot).

In all three cases the projects provide documentation how to configure
your own builders and add them to the infrastructure.  Is something like
this possible for the GCC Jenkins (say adding Solaris builders) and if
so how?  Or would one need to set up one's own instance, in which case it
would be extremely helpful to learn the necessary config: doing
something like this from scratch is a major effort, as seen in Paul
Matos' effort (also buildbot-based) of a couple of years ago.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] v2: Don't link cc1 etc. against libcody.a

2021-01-05 Thread Nathan Sidwell

On 1/5/21 10:47 AM, Jakub Jelinek wrote:

On Tue, Jan 05, 2021 at 10:00:06AM +0100, Jakub Jelinek via Gcc-patches wrote:

On Tue, Jan 05, 2021 at 09:56:26AM +0100, Rainer Orth wrote:

Richi complained on IRC that cc1 is linked against libcody.a.
 From my understanding, it is just the cc1plus and cc1objplus binaries
that need it, so this patch links only those against it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


this is already part of my Solaris libcody patch

build: libcody: Link with -lsocket -lnsl if necessary [PR98316]
 https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562185.html

to be committed shortly.


Ah, sorry for missing that, patch withdrawn.

The difference between the patches for this particular thing is that
my patch was adding the libcody.a also to cc1*plus-checksum* goal and their
dependencies plus cc1*plus dependencies (so that if one rebuilds libcody,
make in gcc subdir will relink cc1plus).


The following updated patch contains the incremental changes between what Rainer
has committed and what I've posted.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


LGTM, thanks for navigating the twisty maze better than me :)



2021-01-05  Jakub Jelinek  

gcc/cp/
* Make-lang.in (cc1plus-checksum, cc1plus$(exeext)): Add
$(CODYLIB) after $(BACKEND).
gcc/objcp/
* Make-lang.in (cc1objplus-checksum, cc1objplus$(exeext)): Add
$(CODYLIB) after $(BACKEND).

--- gcc/cp/Make-lang.in.jj  2021-01-05 11:44:02.956404880 +0100
+++ gcc/cp/Make-lang.in 2021-01-05 13:56:18.628046238 +0100
@@ -121,17 +121,17 @@ cp-warn = $(STRICT_WARN)
  # re-use the checksum from the prev-final stage so it passes
  # the bootstrap comparison and allows comparing of the cc1 binary
  cc1plus-checksum.c : build/genchecksum$(build_exeext) checksum-options \
-   $(CXX_OBJS) $(BACKEND) $(LIBDEPS)
+   $(CXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS)
if [ -f ../stage_final ] \
   && cmp -s ../stage_current ../stage_final; then \
   cp ../prev-gcc/cc1plus-checksum.c cc1plus-checksum.c; \
else \
- build/genchecksum$(build_exeext) $(CXX_OBJS) $(BACKEND) $(LIBDEPS) \
+ build/genchecksum$(build_exeext) $(CXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS) \
   checksum-options > cc1plus-checksum.c.tmp && \
  $(srcdir)/../move-if-change cc1plus-checksum.c.tmp cc1plus-checksum.c; \
fi
  
-cc1plus$(exeext): $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(LIBDEPS) $(c++.prev)
+cc1plus$(exeext): $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(CODYLIB) $(LIBDEPS) $(c++.prev)
@$(call LINK_PROGRESS,$(INDEX.c++),start)
+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
  $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(CODYLIB) $(NETLIBS) \
--- gcc/objcp/Make-lang.in.jj   2021-01-05 13:56:18.629046227 +0100
+++ gcc/objcp/Make-lang.in  2021-01-05 13:57:01.603562005 +0100
@@ -61,14 +61,14 @@ OBJCXX_OBJS = objcp/objcp-act.o objcp/ob
  obj-c++_OBJS = $(OBJCXX_OBJS) cc1objplus-checksum.o
  
  cc1objplus-checksum.c : build/genchecksum$(build_exeext) checksum-options \
-   $(OBJCXX_OBJS) $(BACKEND) $(LIBDEPS)
-   build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) \
+   $(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS)
+   build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) \
$(LIBDEPS) checksum-options > cc1objplus-checksum.c.tmp && \
$(srcdir)/../move-if-change cc1objplus-checksum.c.tmp \
cc1objplus-checksum.c
  
  cc1objplus$(exeext): $(OBJCXX_OBJS) cc1objplus-checksum.o $(BACKEND) \
-$(LIBDEPS) $(obj-c++.prev)
+$(CODYLIB) $(LIBDEPS) $(obj-c++.prev)
@$(call LINK_PROGRESS,$(INDEX.obj-c++),start)
+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
$(OBJCXX_OBJS) cc1objplus-checksum.o $(BACKEND) \


Jakub




--
Nathan Sidwell


Re: [PATCH] Restore input_location after recursive expand_call_inline

2021-01-05 Thread Bernd Edlinger
On 1/5/21 5:51 PM, Jeff Law wrote:
> 
> 
> On 1/5/21 1:05 AM, Richard Biener wrote:
>> On Tue, 5 Jan 2021, Bernd Edlinger wrote:
>>
>>>
>>> On 1/4/21 10:23 PM, Jeff Law wrote:

 On 1/4/21 1:12 PM, Bernd Edlinger wrote:
> Hi,
>
> I spotted a place where input_location is clobbered accidentally.
>
> That is in a recursive call to expand_call_inline.  The input_location
> is usually restored by goto egress in this function.
>
> Additionally the return value of the recursive expand call is thrown
> away, which does not look like a good idea.
>
> Although this causes no problems ATM, I wanted to fix it anyway.
>
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
>
>
> Thanks
> Bernd.
>
> 0001-Restore-input_location-after-recursive-expand_call_i.patch
>
> From 88b963bba7b32972abf0ea44a01c03d643d7c6ca Mon Sep 17 00:00:00 2001
> From: Bernd Edlinger 
> Date: Mon, 4 Jan 2021 11:35:31 +0100
> Subject: [PATCH] Restore input_location after recursive expand_call_inline
>
> This is just a precautionary fix.
>
> 2021-01-04  Bernd Edlinger  
>
>   * tree-inline.c (expand_call_inline): Restore input_location.
>   Return result from recursive call.
 I suspect that we're always supposed to inline in this case.  So
 asserting that successfully_inlined is true before jumping to "egress"
 seems wise.

 OK with that change after the usual testing.

>>> No this does not work:
>>>
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++98 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++98 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++98 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++14 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++14 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++14 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++17 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++17 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++17 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++2a (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++2a (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++2a compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a (internal compiler error)
>>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 (internal compiler error)
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 (internal compiler error)
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 (internal compiler error)
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a (internal compiler error)
>>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a (test for excess errors)
>>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a compilation failed to 
>>> produce executable
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++98 (internal compiler error)
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++98 (test for excess errors)
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++14 (internal compiler error)
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++14 (test for excess errors)
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++17 (internal compiler error)
>>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++17 (test for excess 

Re: [PATCH] expand: Fold x - y < 0 to x < y during expansion [PR94802]

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 8:30 AM, Jakub Jelinek wrote:
> Hi!
>
> My earlier patch to simplify x - y < 0 etc. for signed subtraction
> with undefined overflow into x < y in match.pd regressed some tests,
> even when it was guarded to be post-IPA, the following patch thus
> attempts to optimize that during expansion instead (which is the last
> time we can do it, afterwards we lose the information whether it was
> x - y < 0 or (int) ((unsigned) x - y) < 0 for which we couldn't
> optimize it).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-01-05  Jakub Jelinek  
>
>   PR tree-optimization/94802
>   * expr.h (maybe_optimize_sub_cmp_0): Declare.
>   * expr.c: Include tree-pretty-print.h and flags.h.
>   (maybe_optimize_sub_cmp_0): New function.
>   (do_store_flag): Use it.
>   * cfgexpand.c (expand_gimple_cond): Likewise.
>
>   * gcc.target/i386/pr94802.c: New test.
>   * gcc.dg/Wstrict-overflow-25.c: Remove xfail.
OK
jeff
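
To illustrate why the two forms mentioned in the patch description have to be told apart, here is a small sketch.  When the subtraction wraps (the unsigned form), "difference is negative" and "x < y" can disagree, so only the signed form with undefined overflow may be folded.  The function names are made up for illustration, and the code assumes UINT_MAX == 2 * INT_MAX + 1.

```c
#include <assert.h>
#include <limits.h>

/* Wrapping form, equivalent to (int) ((unsigned) x - y) < 0 on a two's
   complement target: subtract in unsigned arithmetic and test the sign
   bit.  Written without the cast back to int to stay fully defined.  */
static int
wrapped_diff_negative (int x, int y)
{
  unsigned int diff = (unsigned int) x - (unsigned int) y;
  return diff > (unsigned int) INT_MAX;
}

/* What x - y < 0 may be folded to when signed overflow is undefined.  */
static int
less_than (int x, int y)
{
  return x < y;
}

static void
check_fold_difference (void)
{
  /* No wraparound: both forms agree.  */
  assert (wrapped_diff_negative (1, 2) == 1 && less_than (1, 2) == 1);
  assert (wrapped_diff_negative (5, 3) == 0 && less_than (5, 3) == 0);
  /* Wraparound: the forms disagree, so the fold would be wrong here.  */
  assert (wrapped_diff_negative (INT_MIN, 1) == 0);
  assert (less_than (INT_MIN, 1) == 1);
  assert (wrapped_diff_negative (INT_MAX, -1) == 1);
  assert (less_than (INT_MAX, -1) == 0);
}
```

The last two cases are inputs for which the signed subtraction would overflow, which is why the fold is only valid while it is still known that the subtraction was signed.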



Re: add alignment to enable store merging in strict-alignment targets

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 12:46 AM, Alexandre Oliva wrote:
> In g++.dg/opt/store-merging-2.C, the natural alignment of types T and
> S is a single byte, so we shouldn't expect store merging on
> strict-alignment platforms.  Indeed, without something like the
> adjust-alignment pass to bump up the alignment of the automatic
> variable, as in GCC 10, the optimization does not occur.
>
> This patch adjusts the test so that the required alignment is
> expressly stated, and so we don't rely on its accidentally being there
> to get the desired optimization.
>
> Regstrapped on x86_64-linux-gnu, also tested on x-arm-wrs-vxworks7r2.
> Ok to install?
>
>
> for  gcc/testsuite/ChangeLog
>
>   * g++.dg/opt/store-merging-2.C: Add the required alignment.
OK
jeff
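
A minimal C11 sketch of the point made above: a struct made only of bytes has a natural alignment of one byte on typical ABIs, so nothing guarantees that a wider merged store to it is aligned unless the alignment is requested explicitly.  The struct and function names are made up for illustration, and the one-byte alignment is an assumption about the target ABI, not something the language guarantees.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Four adjacent byte members, a typical store-merging candidate.  */
struct bytes4 { unsigned char a, b, c, d; };

/* True if P is a multiple of ALIGN.  */
static int
is_aligned (const void *p, size_t align)
{
  return (uintptr_t) p % align == 0;
}

static void
check_alignment (void)
{
  /* Explicitly request the alignment a 4-byte store would need.  */
  alignas (4) struct bytes4 s = { 1, 2, 3, 4 };

  /* Assumed: typical ABIs give a byte-only struct alignment 1.  */
  assert (alignof (struct bytes4) == 1);
  assert (is_aligned (&s, 4));
  assert (s.a + s.b + s.c + s.d == 10);
}
```

With the explicit alignas the object is known to be 4-byte aligned regardless of where the compiler places it, which is the property the adjusted test states instead of relying on it by accident.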



Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 10:09 AM, abebeos wrote:
>
>
> On Tue, 5 Jan 2021 at 18:50, Jeff Law  > wrote:
>
>
>
> On 1/5/21 2:18 AM, abebeos wrote:
> >
> > On Mon, 4 Jan 2021 at 21:40, Jeff Law  
> > >> wrote:
> >
> >     On 12/31/20 7:13 AM, abebeos wrote:
> >     [...]
> >     >     >         I'm definitely curious about the testing
> setup and
> >     >     whether or
> >     >     >         not it can
> >     >     >         be replicated into our Jenkins setup. 
> >     >     >
> >     >     >
> >     >     >     Where can I find this Jenkins setup?
> >     >     >
> >     >     >
> >     >     > To close this: assuming " into our Jenkins setup" is
> some
> >     redhat
> >     >     > internal jenkins setup.
> >     >     No, it's public.
> >     >
> >     >     http://gcc.gnu.org/jenkins
>   >
> >     
> >>
> >     >
> >     >
> >     > (sidenote: This resolves on my side to the (insecure)
> >     > http://3.14.90.209:8080/ 
> >
> >     
> >>)
> >     Yup.
> >
> >     >
> >     > Is the source-code of  http://gcc.gnu.org/jenkins
> 
> >     >
> >     > 
> >>
> >     available somewhere? I could not locate it.
> >     Jenkins is a project independent of GCC for building continuous
> >     testing/delivery systems.  See http://jenkins.io
>  >
> >
> >
> > Oh, my bad - I was referring to the sources of gcc's project jenkins
> > setup (the scripts, configs etc. for the different targets,
> including
> > avr).
> The Generators subdirectory has jobs which are used to rebuild the
> various target jobs.  They're broadly categorized by the type of
> build. 
> ie, pure native, qemu-emulated native, glibc cross, newlib cross
> and no
> runtime library.  avr IIRC fits into the final category as it doesn't
> have an upstreamed glibc or newlib port.
>
>
> Ok, but I'm still unable to find the sources ("Generators
> subdirectory"?). Can you (or anyone else) give me a direct link to the
> sources? E.g. I want to change the avr part, where do I start
> (usually, a git repo.)?
You're not going to be able to change the scripts.   But they are
accessible from the web site.  They're not in GIT or anything like that.

jeff



Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread abebeos via Gcc-patches
On Tue, 5 Jan 2021 at 18:50, Jeff Law  wrote:

>
>
> On 1/5/21 2:18 AM, abebeos wrote:
> >
> > On Mon, 4 Jan 2021 at 21:40, Jeff Law  > > wrote:
> >
> > On 12/31/20 7:13 AM, abebeos wrote:
> > [...]
> > > > I'm definitely curious about the testing setup and
> > > whether or
> > > > not it can
> > > > be replicated into our Jenkins setup.
> > > >
> > > >
> > > > Where can I find this Jenkins setup?
> > > >
> > > >
> > > > To close this: assuming " into our Jenkins setup" is some
> > redhat
> > > > internal jenkins setup.
> > > No, it's public.
> > >
> > > http://gcc.gnu.org/jenkins 
> > >
> > >
> > >
> > > (sidenote: This resolves on my side to the (insecure)
> > > http://3.14.90.209:8080/ 
> > >)
> > Yup.
> >
> > >
> > > Is the source-code of  http://gcc.gnu.org/jenkins
> > 
> > > >
> > available somewhere? I could not locate it.
> > Jenkins is a project independent of GCC for building continuous
> > testing/delivery systems.  See http://jenkins.io 
> >
> >
> > Oh, my bad - I was referring to the sources of gcc's project jenkins
> > setup (the scripts, configs etc. for the different targets, including
> > avr).
> The Generators subdirectory has jobs which are used to rebuild the
> various target jobs.  They're broadly categorized by the type of build.
> ie, pure native, qemu-emulated native, glibc cross, newlib cross and no
> runtime library.  avr IIRC fits into the final category as it doesn't
> have an upstreamed glibc or newlib port.
>

Ok, but I'm still unable to find the sources ("Generators subdirectory"?).
Can you (or anyone else) give me a direct link to the sources? E.g. I want
to change the avr part, where do I start (usually, a git repo.)?


>
> Jeff
>
>


Re: [PATCH] Restore input_location after recursive expand_call_inline

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 1:05 AM, Richard Biener wrote:
> On Tue, 5 Jan 2021, Bernd Edlinger wrote:
>
>>
>> On 1/4/21 10:23 PM, Jeff Law wrote:
>>>
>>> On 1/4/21 1:12 PM, Bernd Edlinger wrote:
 Hi,

 I spotted a place where input_location is clobbered accidentally.

 That is in a recursive call to expand_call_inline.  The input_location
 is usually restored by goto egress in this function.

 Additionally the return value of the recursive expand call is thrown
 away, which does not look like a good idea.

 Although this causes no problems ATM, I wanted to fix it anyway.


 Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
 Is it OK for trunk?


 Thanks
 Bernd.

 0001-Restore-input_location-after-recursive-expand_call_i.patch

 From 88b963bba7b32972abf0ea44a01c03d643d7c6ca Mon Sep 17 00:00:00 2001
 From: Bernd Edlinger 
 Date: Mon, 4 Jan 2021 11:35:31 +0100
 Subject: [PATCH] Restore input_location after recursive expand_call_inline

 This is just a precautionary fix.

 2021-01-04  Bernd Edlinger  

* tree-inline.c (expand_call_inline): Restore input_location.
Return result from recursive call.
>>> I suspect that we're always supposed to inline in this case.  So
>>> asserting that successfully_inlined is true before jumping to "egress"
>>> seems wise.
>>>
>>> OK with that change after the usual testing.
>>>
>> No this does not work:
>>
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++98 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++98 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++98 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++14 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++14 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++14 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++17 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++17 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++17 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++2a (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-5.C  -std=gnu++2a (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-5.C  -std=gnu++2a compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++98 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++14 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++17 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a (internal compiler error)
>> +FAIL: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/devirt-c-4.C  -std=gnu++2a compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 (internal compiler error)
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++98 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 (internal compiler error)
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++14 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 (internal compiler error)
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++17 compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a (internal compiler error)
>> +FAIL: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a (test for excess errors)
>> +UNRESOLVED: g++.dg/ipa/imm-devirt-2.C  -std=gnu++2a compilation failed to 
>> produce executable
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++98 (internal compiler error)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++98 (test for excess errors)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++14 (internal compiler error)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++14 (test for excess errors)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++17 (internal compiler error)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++17 (test for excess errors)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++2a (internal compiler error)
>> +FAIL: g++.dg/ipa/pr71146.C  -std=gnu++2a (test for excess 

Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 2:18 AM, abebeos wrote:
>
> On Mon, 4 Jan 2021 at 21:40, Jeff Law  > wrote:
>
> On 12/31/20 7:13 AM, abebeos wrote:
> [...]
> >     >         I'm definitely curious about the testing setup and
> >     whether or
> >     >         not it can
> >     >         be replicated into our Jenkins setup. 
> >     >
> >     >
> >     >     Where can I find this Jenkins setup?
> >     >
> >     >
> >     > To close this: assuming " into our Jenkins setup" is some
> redhat
> >     > internal jenkins setup.
> >     No, it's public.
> >
> >     http://gcc.gnu.org/jenkins 
> >
> >
> >
> > (sidenote: This resolves on my side to the (insecure)
> > http://3.14.90.209:8080/ 
> >)
> Yup.
>
> >
> > Is the source-code of http://gcc.gnu.org/jenkins
> > available somewhere? I could not locate it.
> Jenkins is a project independent of GCC for building continuous
> testing/delivery systems.  See http://jenkins.io 
>
>
> Oh, my bad - I was referring to the sources of gcc's project jenkins
> setup (the scripts, configs etc. for the different targets, including
> avr).
The Generators subdirectory has jobs which are used to rebuild the
various target jobs.  They're broadly categorized by the type of build,
i.e., pure native, qemu-emulated native, glibc cross, newlib cross and no
runtime library.  avr IIRC fits into the final category as it doesn't
have an upstreamed glibc or newlib port.

Jeff



Re: [PATCH] v2: Don't link cc1 etc. against libcody.a

2021-01-05 Thread Jeff Law via Gcc-patches



On 1/5/21 8:47 AM, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Jan 05, 2021 at 10:00:06AM +0100, Jakub Jelinek via Gcc-patches wrote:
>> On Tue, Jan 05, 2021 at 09:56:26AM +0100, Rainer Orth wrote:
>>>> Richi complained on IRC that cc1 is linked against libcody.a.
>>>> From my understanding, it is just the cc1plus and cc1objplus binaries
>>>> that need it, so this patch links only those against it.
>>>>
>>>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>> this is already part of my Solaris libcody patch
>>>
>>> build: libcody: Link with -lsocket -lnsl if necessary [PR98316]
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562185.html
>>>
>>> to be committed shortly.
>> Ah, sorry for missing that, patch withdrawn.
>>
>> The difference between the patches for this particular thing is that
>> my patch was adding the libcody.a also to cc1*plus-checksum* goal and their
>> dependencies plus cc1*plus dependencies (so that if one rebuilds libcody,
>> make in gcc subdir will relink cc1plus).
> The following updated patch contains the incremental changes between what
> Rainer has committed and what I've posted.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-01-05  Jakub Jelinek  
>
> gcc/cp/
>   * Make-lang.in (cc1plus-checksum, cc1plus$(exeext)): Add
>   $(CODYLIB) after $(BACKEND).
> gcc/objcp/
>   * Make-lang.in (cc1objplus-checksum, cc1objplus$(exeext)): Add
>   $(CODYLIB) after $(BACKEND).
OK
jeff



[PATCH] tree-optimization/98516 - fix SLP permute opt materialization

2021-01-05 Thread Richard Biener
When materializing on a VEC_PERM node we have to permute the
incoming vectors, not the outgoing one.
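
To illustrate the point, here is a standalone sketch, not the GCC code:
`lane_sel` and `permute_incoming_lanes` are made-up names standing in for
the (operand, lane) pairs of SLP_TREE_LANE_PERMUTATION.  Materializing a
permutation on a VEC_PERM node remaps which incoming lane each output lane
selects; the order of the selection entries themselves stays as it is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a lane permutation:
   output lane k takes lane LANE of incoming operand OPERAND.  */
struct lane_sel
{
  unsigned operand;
  unsigned lane;
};

/* Materialize PERM on the incoming vectors: every selected lane index
   is looked up through PERM; the order of the entries is unchanged.  */
static void
permute_incoming_lanes (struct lane_sel *sel, size_t n,
                        const unsigned *perm, size_t perm_len)
{
  for (size_t k = 0; k < n; ++k)
    {
      assert (sel[k].lane < perm_len);
      sel[k].lane = perm[sel[k].lane];
    }
}
```

With sel = { (0,1), (1,0) } and perm = { 1, 0 } the selection becomes
{ (0,0), (1,1) }: same operands, same entry order, remapped lanes.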

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-01-05  Richard Biener  

PR tree-optimization/98516
* tree-vect-slp.c (vect_optimize_slp): Permute the incoming
lanes when materializing on a VEC_PERM node.
(vectorizable_slp_permutation): Dump the permute properly.

* gcc.dg/vect/bb-slp-pr98516-1.c: New testcase.
* gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c | 26 ++
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c | 36 
 gcc/tree-vect-slp.c  | 13 ---
 3 files changed, 70 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c
new file mode 100644
index 000..c4c244c6f8a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+
+double a[4], b[2];
+
+void __attribute__((noipa))
+foo ()
+{
+  double a0 = a[0];
+  double a1 = a[1];
+  double a2 = a[2];
+  double a3 = a[3];
+  b[0] = a1 - a3;
+  b[1] = a0 + a2;
+}
+
+int main()
+{
+  a[0] = 1.;
+  a[1] = 2.;
+  a[2] = 3.;
+  a[3] = 4.;
+  foo ();
+  if (b[0] != -2 || b[1] != 4)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c
new file mode 100644
index 000..f1a9341e224
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+
+float a[8], b[4];
+
+void __attribute__((noipa))
+foo ()
+{
+  float a0 = a[0];
+  float a1 = a[1];
+  float a2 = a[2];
+  float a3 = a[3];
+  float a4 = a[4];
+  float a5 = a[5];
+  float a6 = a[6];
+  float a7 = a[7];
+  b[0] = a1 - a5;
+  b[1] = a0 + a4;
+  b[2] = a3 - a7;
+  b[3] = a2 + a6;
+}
+
+int main()
+{
+  a[0] = 1.;
+  a[1] = 2.;
+  a[2] = 3.;
+  a[3] = 4.;
+  a[4] = 5.;
+  a[5] = 6.;
+  a[6] = 7.;
+  a[7] = 8.;
+  foo ();
+  if (b[0] != -4 || b[1] != 6 || b[2] != -4 || b[3] != 10)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 49cb635ee92..c9da8457e5e 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3094,15 +3094,18 @@ vect_optimize_slp (vec_info *vinfo)
;
  else if (SLP_TREE_LANE_PERMUTATION (node).exists ())
{
- /* If the node if already a permute node we just need to apply
-the permutation to the permute node itself.  */
+ /* If the node is already a permute node we can apply
+the permutation to the lane selection, effectively
+materializing it on the incoming vectors.  */
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
 "simplifying permute node %p\n",
 node);
 
- vect_slp_permute (perms[perm], SLP_TREE_LANE_PERMUTATION (node),
-   true);
+ for (unsigned k = 0;
+  k < SLP_TREE_LANE_PERMUTATION (node).length (); ++k)
+   SLP_TREE_LANE_PERMUTATION (node)[k].second
+ = perms[perm][SLP_TREE_LANE_PERMUTATION (node)[k].second];
}
  else
{
@@ -5554,7 +5557,7 @@ vectorizable_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
dump_printf (MSG_NOTE, ",");
  dump_printf (MSG_NOTE, " vops%u[%u][%u]",
   vperm[i].first.first, vperm[i].first.second,
-  vperm[i].first.second);
+  vperm[i].second);
}
   dump_printf (MSG_NOTE, "\n");
 }
-- 
2.26.2


C++ Patch ping

2021-01-05 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping the:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562099.html
patch.

Thanks

Jakub



Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)

2021-01-05 Thread Martin Sebor via Gcc-patches

On 1/5/21 5:38 AM, Richard Biener wrote:

On Mon, Jan 4, 2021 at 9:53 PM Martin Sebor  wrote:


On 1/4/21 12:23 PM, Jeff Law wrote:



On 1/4/21 12:19 PM, Jakub Jelinek wrote:

On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:

Doing the STRING_CST is certainly less fragile since the SSA names
created at gimplification time could even be ggc_freed when no longer
used in the IL.

Obviously we can't use SSA_NAMEs as they're specific to each function as
they get compiled.  But what's not as clear to me is why we can't use a
SAVE_EXPR of the original expression that indicates the size of the
parameter.

The gimplifier is destructive, so if the expressions are partly (e.g. in
those SAVE_EXPRs) shared with what is in the actual IL, we lose.
And if they aren't shared and there are side-effects, if we tried to
gimplify them again we'd get the side-effects duplicated.
So it all depends on what the code wants to handle, if e.g. just values of
parameters with simple arithmetics on those and punt on everything else,
then it is doable, but generally it is not.


I explained what the code handles and when in the pipeline in
the discussion of the previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html


I would expect the expressions to be values of parameters (or objects in
static storage) and simple arithmetic on them.  If there are other cases,
punting seems appropriate.

Martin -- are there nontrivial expressions we need to be worried about here?


At the moment the middle end warnings only consider parameters, like
the N in

void f (int N, int[N]);

void g (void)
{
  int a[3];
  f (sizeof a, a);   // warning


I wonder how this can work reliably without heavy-weight
"parsing" of the attribute?  That is, how do you relate
the passed 24 constant to the N in int[N]?


There is some parsing involved but it's only slightly more complex
than in attribute fn spec.  Just a string scan followed by constant
time lookup for each pointer argument.

The N is associated with int[N] via N's position in the argument
list and encoded as $N in the string.  The attribute for the decl
above is "1[$],$0".  The 1 is the VLA argument position, each
dollar sign in the brackets is one VLA bound (numbers are constant
bounds), and the $0 is the VLA bound argument.
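
As a rough illustration of the "string scan" for the one spec quoted above,
here is a minimal sketch.  It is not the GCC implementation: `vla_spec`,
`read_number` and `parse_vla_spec` are hypothetical names, and the grammar
is only what can be read off the single example "1[$],$0" (constant bounds
inside the brackets are skipped rather than decoded).

```c
#include <ctype.h>

/* Hypothetical decoded form of one spec such as "1[$],$0".  */
struct vla_spec
{
  unsigned ptr_arg;        /* position of the VLA (pointer) argument */
  unsigned n_vla_bounds;   /* number of '$' inside the brackets */
  unsigned bound_args[4];  /* positions of the arguments giving bounds */
  unsigned n_bound_args;   /* how many entries of bound_args are used */
};

/* Read a decimal number at *P and advance *P past it.
   Return 0 if *P does not start with a digit.  */
static int
read_number (const char **p, unsigned *out)
{
  if (!isdigit ((unsigned char) **p))
    return 0;
  unsigned v = 0;
  while (isdigit ((unsigned char) **p))
    v = v * 10 + (unsigned) (*(*p)++ - '0');
  *out = v;
  return 1;
}

/* Scan S once, left to right.  Return 1 on success, 0 on malformed
   input (under the assumed grammar).  */
static int
parse_vla_spec (const char *s, struct vla_spec *spec)
{
  spec->n_vla_bounds = 0;
  spec->n_bound_args = 0;
  if (!read_number (&s, &spec->ptr_arg) || *s++ != '[')
    return 0;
  for (; *s && *s != ']'; ++s)
    if (*s == '$')
      ++spec->n_vla_bounds;
  if (*s++ != ']')
    return 0;
  while (s[0] == ',' && s[1] == '$')
    {
      s += 2;
      if (spec->n_bound_args == 4
          || !read_number (&s, &spec->bound_args[spec->n_bound_args]))
        return 0;
      ++spec->n_bound_args;
    }
  return *s == '\0';
}
```

For "1[$],$0" this yields ptr_arg 1, one VLA bound, and one bound argument
at position 0, i.e. the N in void f (int N, int[N]).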

(There is some redundancy here since all but the most significant
array (or VLA) bound are also encoded in the type of the argument.)



The front end redeclaration warnings consider all expressions,
including

int f (void);

void g (int[f () + 1]);
void g (int[f () + 2]);   // warning


For redeclaration warning the attribute isn't needed since you
have both decls and can compare sizes directly?


The attribute is used here as well.  It's attached to the first
decl irrespective of the form of the VLA, then created for
the second decl and the two are compared.  Mismatches are then
diagnosed and dropped from the second attribute.  The result
is merged with the first and added to the decl.

Martin




The patch turns these complex bounds into strings that the front
end compares instead.  After the front end is done the strings
don't serve any purpose (and I don't think ever will) and could
be removed.  I looked for a way to do it but couldn't find one
other than the free_lang_data pass in tree.c that Richard had
initially said wasn't the right place.  Sounds like he's
reconsidered but at this point, given that VLA parameters are
used only infrequently, and VLAs with these nontrivial bounds
are exceedingly rare, going to the trouble of removing them
doesn't seem worth the effort.

Martin




Jeff







Re: [PATCH] c++: Fix deduction from the type of an NTTP

2021-01-05 Thread Jason Merrill via Gcc-patches

On 1/4/21 5:50 PM, Patrick Palka wrote:

In the testcase nontype-auto17.C below, the calls to f and g are invalid
because neither deduction nor defaulting of the template parameter T
yields a valid specialization.  Deducing T doesn't work because T is
only used in a non-deduced context, and defaulting T doesn't work
because its default argument makes the type of M invalid.

But with -std=c++17 or later, we incorrectly accept both calls.  With
C++17 (specifically P0127R2), we're allowed to try to deduce T from
the argument 42 that's been tentatively deduced for M.  The problem is
that when unify walks into the type of M, it immediately gives up on
the TYPENAME_TYPE and doesn't register any new unifications (so the type
of M is still unknown) -- and then we go on to unify M with 42 anyway.
Later in type_unification_real, we blindly use the default argument for
T to complete the template argument vector, and we end up with the bogus
specializations f and g.

This patch fixes this issue by checking whether the type of a NTTP is
still dependent after walking into the type.  If it is, it means we
couldn't deduce all the template parameters used in its type, and so we
shouldn't yet unify the NTTP.

(The new testcase ttp33.C demonstrates the need for the TEMPLATE_PARM_LEVEL
check; without it, we would ICE on this testcase from the call to tsubst.)

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* pt.c (unify) : After walking into
the type of the TEMPLATE_PARM_INDEX, substitute into the type a
second time.  If the type is still dependent, don't unify it.

gcc/testsuite/ChangeLog:

* g++.dg/template/partial5.C: Adjust directives to expect the
same errors across all dialects.
* g++.dg/cpp1z/nontype-auto17.C: New test.
* g++.dg/cpp1z/nontype-auto18.C: New test.
* g++.dg/template/ttp33.C: New test.
---
  gcc/cp/pt.c | 10 +-
  gcc/testsuite/g++.dg/cpp1z/nontype-auto17.C | 10 ++
  gcc/testsuite/g++.dg/cpp1z/nontype-auto18.C |  6 ++
  gcc/testsuite/g++.dg/template/partial5.C|  2 +-
  gcc/testsuite/g++.dg/template/ttp33.C   | 10 ++
  5 files changed, 36 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/nontype-auto17.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/nontype-auto18.C
  create mode 100644 gcc/testsuite/g++.dg/template/ttp33.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 19fd4c1d8a4..f1e8b01bc01 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -23581,13 +23581,21 @@ unify (tree tparms, tree targs, tree parm, tree arg, 
int strict,
  /* We haven't deduced the type of this parameter yet.  */
  if (cxx_dialect >= cxx17
  /* We deduce from array bounds in try_array_deduction.  */
- && !(strict & UNIFY_ALLOW_INTEGER))
+ && !(strict & UNIFY_ALLOW_INTEGER)
+ && TEMPLATE_PARM_LEVEL (parm) <= TMPL_ARGS_DEPTH (targs))
{
  /* Deduce it from the non-type argument.  */
  tree atype = TREE_TYPE (arg);
  RECUR_AND_CHECK_FAILURE (tparms, targs,
   tparm, atype,
   UNIFY_ALLOW_NONE, explain_p);
+ /* Now check whether the type of this parameter is still
+dependent, and give up if so.  */
+ ++processing_template_decl;
+ tparm = tsubst (tparm, targs, tf_none, NULL_TREE);
+ --processing_template_decl;
+ if (uses_template_parms (tparm))
+   return unify_success (explain_p);


Hmm, I was wondering about returning success without checking whether 
the type is still dependent, and relying on the retrying in 
type_unification_real, but that only works for function templates.


The patch is OK.


}
  else
/* Try again later.  */
diff --git a/gcc/testsuite/g++.dg/cpp1z/nontype-auto17.C 
b/gcc/testsuite/g++.dg/cpp1z/nontype-auto17.C
new file mode 100644
index 000..509eb0e98e3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/nontype-auto17.C
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template  struct K { };
+
+template  int f(K); // { dg-error 
"void" }
+int a = f(K<42>{}); // { dg-error "no match" }
+
+struct S { using type = void; };
+template  int g(K); // { dg-message 
"deduction" }
+int b = g(K<42>{}); // { dg-error "no match" }
diff --git a/gcc/testsuite/g++.dg/cpp1z/nontype-auto18.C 
b/gcc/testsuite/g++.dg/cpp1z/nontype-auto18.C
new file mode 100644
index 000..46873672714
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/nontype-auto18.C
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++11 } }
+
+template  struct K { };
+struct S { using type = int; };
+template  int f(K);
+int c = f(K<42>{});
diff --git a/gcc/testsuite/g++.dg/template/partial5.C 
b/gcc/testsuite/g++.dg/template/partial5.C
index 

Re: [PATCH] libstdc++: Skip atomic instructions in _Sp_counted_base::_M_release when both counts are 1

2021-01-05 Thread Maged Michael via Gcc-patches
Please let me know if more information about this patch is needed. Thank
you.

Maged



On Thu, Dec 17, 2020 at 3:49 PM Maged Michael 
wrote:

> Please find a proposed patch for _Sp_counted_base::_M_release to skip the
> two atomic instructions that decrement each of the use count and the weak
> count when both are 1. I proposed the general idea in an earlier thread (
> https://gcc.gnu.org/pipermail/libstdc++/2020-December/051642.html) and
> got useful feedback on a draft patch and responses to related questions
> about multi-granular atomicity and alignment. This patch is based on that
> feedback.
>
>
> I added a check for thread sanitizer to use the current algorithm in that
> case because TSAN does not support multi-granular atomicity. I'd like to
> add a check of __has_feature(thread_sanitizer) for building using LLVM. I
> found examples of __has_feature in libstdc++ but it doesn't seem to be
> recognized in shared_ptr_base.h. Any guidance on how to check
> __has_feature(thread_sanitizer) in this patch?
>
>
> GCC generates code for _M_release that is larger and more complex than
> that generated by LLVM. I'd like to file a bug report about that. Jonathan,
> would you please create a bugzilla account for me (
> https://gcc.gnu.org/bugzilla/) using my gmail address. Thank you.
>
>
> Information about the patch:
>
> - Benefits of the patch: Save the cost of the last atomic decrements of
> each of the use count and the weak count in _Sp_counted_base. Atomic
> instructions are significantly slower than regular loads and stores across
> major architectures.
>
> - How current code works: _M_release() atomically decrements the use
> count, checks if it was 1, if so calls _M_dispose(), atomically decrements
> the weak count, checks if it was 1, and if so calls _M_destroy().
>
> - How the proposed patch works: _M_release() loads both use count and weak
> count together atomically (when properly aligned), checks if the value is
> equal to the value of both counts equal to 1 (e.g., 0x10001), and if so
> calls _M_dispose() and _M_destroy(). Otherwise, it follows the original
> algorithm.
>
> - Why it works: When the current thread executing _M_release() finds each
> of the counts is equal to 1, then (when _lock_policy is _S_atomic) no other
> threads could possibly hold use or weak references to this control block.
> That is, no other threads could possibly access the counts or the protected
> object.
>
> - The proposed patch is intended to interact correctly with current code
> (under certain conditions: _Lock_policy is _S_atomic, proper alignment, and
> native lock-free support for atomic operations). That is, multiple threads
> using different versions of the code with and without the patch operating
> on the same objects should always interact correctly. The intent for the
> patch is to be ABI compatible with the current implementation.
>
> - The proposed patch involves a performance trade-off between saving the
> costs of two atomic instructions when the counts are both 1 vs adding the
> cost of loading the combined counts and comparison with two ones (e.g.,
> 0x10001).
>
> - The patch has been in use (built using LLVM) in a large environment for
> many months. The performance gains outweigh the losses (roughly 10 to 1)
> across a large variety of workloads.
>
>
> I'd appreciate feedback on the patch and any suggestions for checking
> __has_feature(thread_sanitizer).
>
>
> Maged
>
>
>
> diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h
> b/libstdc++-v3/include/bits/shared_ptr_base.h
> index 368b2d7379a..a8fc944af5f 100644
> --- a/libstdc++-v3/include/bits/shared_ptr_base.h
> +++ b/libstdc++-v3/include/bits/shared_ptr_base.h
> @@ -153,20 +153,78 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>          if (!_M_add_ref_lock_nothrow())
>            __throw_bad_weak_ptr();
>        }
>
>        bool
>        _M_add_ref_lock_nothrow() noexcept;
>
>        void
>        _M_release() noexcept
>        {
> +#if __SANITIZE_THREAD__
> +        _M_release_orig();
> +        return;
> +#endif
> +        if (!__atomic_always_lock_free(sizeof(long long), 0) ||
> +            !__atomic_always_lock_free(sizeof(_Atomic_word), 0) ||
> +            sizeof(long long) < (2 * sizeof(_Atomic_word)) ||
> +            sizeof(long long) > (sizeof(void*)))
> +          {
> +            _M_release_orig();
> +            return;
> +          }
> +        _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&_M_use_count);
> +        _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&_M_weak_count);
> +        if (__atomic_load_n((long long*)(&_M_use_count), __ATOMIC_ACQUIRE)
> +            == (1LL + (1LL << (8 * sizeof(_Atomic_word)))))
> +          {
> +            // Both counts are 1, so there are no weak references and
> +            // we are releasing the last strong reference. No other
> +            // threads can observe the effects of this _M_release()
> +            // call (e.g. 

Re: [PATCH] c++: ICE with deferred noexcept when deducing targs [PR82099]

2021-01-05 Thread Jason Merrill via Gcc-patches

On 1/4/21 8:31 PM, Marek Polacek wrote:

In this test we ICE in type_throw_all_p because it got a deferred
noexcept which it shouldn't.  Here's the story:

In noexcept61.C, we call bar, so we perform overload resolution.  When
adding the (only) candidate, we need to deduce template arguments, so
call fn_type_unification as usual.  That deduces U to

   void (*) (int &, int &)

which is correct, but its noexcept-spec is deferred_noexcept.  Then
we call add_function_candidate (bar), wherein we try to create an
implicit conversion sequence for every argument.  Since baz is
of unknown type, we instantiate_type it; it is a TEMPLATE_ID_EXPR
so that calls resolve_address_of_overloaded_function.  But we crash
there, because target_type contains the deferred_noexcept.

So we need to maybe_instantiate_noexcept before we can compare types.
resolve_overloaded_unification seemed like the appropriate spot; now
fn_type_unification produces the function type with its noexcept-spec
instantiated.  This shouldn't go against CWG 1330 because here we
really need to instantiate the noexcept-spec.

This also fixes class-deduction76.C, a dg-ice test I recently added,
therefore this fix also fixes c++/90799, yay.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/cp/ChangeLog:

PR c++/82099
* pt.c (resolve_overloaded_unification): Call
maybe_instantiate_noexcept after instantiating the function
decl.

gcc/testsuite/ChangeLog:

PR c++/82099
* g++.dg/cpp1z/class-deduction76.C: Remove dg-ice.
* g++.dg/cpp0x/noexcept61.C: New test.
---
  gcc/cp/pt.c|  3 +++
  gcc/testsuite/g++.dg/cpp0x/noexcept61.C| 17 +
  gcc/testsuite/g++.dg/cpp1z/class-deduction76.C |  1 -
  3 files changed, 20 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept61.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 062ef858501..0d061adc2ed 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -22373,6 +22373,9 @@ resolve_overloaded_unification (tree tparms,
  --function_depth;
}
  
+  if (flag_noexcept_type)
+    maybe_instantiate_noexcept (fn, tf_none);
+
  elem = TREE_TYPE (fn);
  if (try_one_overload (tparms, targs, tempargs, parm,
elem, strict, sub_strict, addr_p, explain_p)
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept61.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept61.C
new file mode 100644
index 000..653cd7e6680
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept61.C
@@ -0,0 +1,17 @@
+// PR c++/82099
+// { dg-do compile { target c++11 } }
+
+template 
+void bar (T , T , U u)
+{
+  u (x, y);
+}
+
+template 
+void baz (T , T ) noexcept (noexcept (x == y));
+
+void
+foo (int x, int y)
+{
+  bar (x, y, baz);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
index 23bb6e8fa9a..a131a386baa 100644
--- a/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
@@ -1,6 +1,5 @@
  // PR c++/90799
  // { dg-do compile { target c++17 } }
-// { dg-ice "unify" }
  
  template

  void foo() noexcept(T::value);

base-commit: f262a3518877ccce9ed41b2e152c3a3564727bd6





Re: [PATCH] c++, v2: Fix ICE with __builtin_bit_cast [PR98469]

2021-01-05 Thread Jason Merrill via Gcc-patches

On 1/5/21 10:26 AM, Jakub Jelinek wrote:

On Mon, Jan 04, 2021 at 04:01:25PM -0500, Jason Merrill via Gcc-patches wrote:

On 1/4/21 3:48 PM, Jakub Jelinek wrote:

On Mon, Jan 04, 2021 at 03:44:46PM -0500, Jason Merrill wrote:

This change is OK, but part of the problem is that we're trying to do
overload resolution for an S copy/move constructor, which we shouldn't be
because bit_cast is a prvalue, so in C++17 and up we should use it to
directly initialize the target without any implied constructor call.

It seems we're mishandling this because the code in
build_special_member_call specifically looks for TARGET_EXPR or CONSTRUCTOR,
and BIT_CAST_EXPR is neither of those.

Wrapping a BIT_CAST_EXPR of aggregate type in a TARGET_EXPR would address
this, and any other places that expect a class prvalue to come in the form
of a TARGET_EXPR.


I can try that tomorrow.  Won't that cause copying through extra temporary
in some cases though, or is that guaranteed to be optimized?


It won't cause any extra copying when it's used to initialize another object
(like the return value of std::bit_cast).  Class prvalues are always
expressed with a TARGET_EXPR in the front end; the TARGET_EXPR melts away
when used as an initializer, it only creates a temporary when it's used in
another way.


Ok, this version wraps it into a TARGET_EXPR then, it alone fixes the bug,
but I've kept the constexpr.c change too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-01-05  Jakub Jelinek  

PR c++/98469
* constexpr.c (cxx_eval_constant_expression) :
Punt if lval is true.
* semantics.c (cp_build_bit_cast): Call get_target_expr_sfinae on
the result if it has a class type.

* g++.dg/cpp2a/bit-cast8.C: New test.
* g++.dg/cpp2a/bit-cast9.C: New test.

--- gcc/cp/constexpr.c.jj   2021-01-04 10:25:48.750121531 +0100
+++ gcc/cp/constexpr.c  2021-01-05 11:41:38.315032636 +0100
@@ -6900,6 +6900,15 @@ cxx_eval_constant_expression (const cons
return t;
  
  case BIT_CAST_EXPR:
+  if (lval)
+   {
+ if (!ctx->quiet)
+   error_at (EXPR_LOCATION (t),
+ "address of a call to %qs is not a constant expression",
+ "__builtin_bit_cast");
+ *non_constant_p = true;
+ return t;
+   }
r = cxx_eval_bit_cast (ctx, t, non_constant_p, overflow_p);
break;
  
--- gcc/cp/semantics.c.jj   2021-01-04 10:25:48.489124486 +0100
+++ gcc/cp/semantics.c  2021-01-05 11:27:49.327372582 +0100
@@ -10761,6 +10761,10 @@ cp_build_bit_cast (location_t loc, tree
  
tree ret = build_min (BIT_CAST_EXPR, type, arg);
SET_EXPR_LOCATION (ret, loc);
+
+  if (!processing_template_decl && CLASS_TYPE_P (type))
+ret = get_target_expr_sfinae (ret, complain);
+
return ret;
  }
  
--- gcc/testsuite/g++.dg/cpp2a/bit-cast8.C.jj   2021-01-05 11:41:38.315032636 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast8.C  2021-01-05 11:41:38.315032636 +0100
+// PR c++/98469
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wall" }
+
+struct S { int s; };
+
+S
+foo ()
+{
+  return __builtin_bit_cast (S, 0);
+}
--- gcc/testsuite/g++.dg/cpp2a/bit-cast9.C.jj   2021-01-05 11:41:38.315032636 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast9.C  2021-01-05 11:41:38.315032636 +0100
@@ -0,0 +1,15 @@
+// PR c++/98469
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wall" }
+
+template
+constexpr T
+bit_cast (const F ) noexcept
+{
+  return __builtin_bit_cast (T, f);
+}
+struct S { int s; };
+constexpr int foo (const S ) { return x.s; }
+constexpr int bar () { return foo (bit_cast (0)); }
+constexpr int x = bar ();
+static_assert (!x);


Jakub





[PATCH] v2: Don't link cc1 etc. against libcody.a

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 05, 2021 at 10:00:06AM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Jan 05, 2021 at 09:56:26AM +0100, Rainer Orth wrote:
> > > Richi complained on IRC that cc1 is linked against libcody.a.
> > > From my understanding, it is just the cc1plus and cc1objplus binaries
> > > that need it, so this patch links only those against it.
> > >
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > this is already part of my Solaris libcody patch
> > 
> > build: libcody: Link with -lsocket -lnsl if necessary [PR98316]
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562185.html
> > 
> > to be committed shortly.
> 
> Ah, sorry for missing that, patch withdrawn.
> 
> The difference between the patches for this particular thing is that
> my patch was adding the libcody.a also to cc1*plus-checksum* goal and their
> dependencies plus cc1*plus dependencies (so that if one rebuilds libcody,
> make in gcc subdir will relink cc1plus).

The following updated patch contains the incremental changes between what
Rainer has committed and what I've posted.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-05  Jakub Jelinek  

gcc/cp/
* Make-lang.in (cc1plus-checksum, cc1plus$(exeext)): Add
$(CODYLIB) after $(BACKEND).
gcc/objcp/
* Make-lang.in (cc1objplus-checksum, cc1objplus$(exeext)): Add
$(CODYLIB) after $(BACKEND).

--- gcc/cp/Make-lang.in.jj  2021-01-05 11:44:02.956404880 +0100
+++ gcc/cp/Make-lang.in 2021-01-05 13:56:18.628046238 +0100
@@ -121,17 +121,17 @@ cp-warn = $(STRICT_WARN)
 # re-use the checksum from the prev-final stage so it passes
 # the bootstrap comparison and allows comparing of the cc1 binary
 cc1plus-checksum.c : build/genchecksum$(build_exeext) checksum-options \
-   $(CXX_OBJS) $(BACKEND) $(LIBDEPS) 
+   $(CXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS) 
if [ -f ../stage_final ] \
   && cmp -s ../stage_current ../stage_final; then \
   cp ../prev-gcc/cc1plus-checksum.c cc1plus-checksum.c; \
else \
- build/genchecksum$(build_exeext) $(CXX_OBJS) $(BACKEND) $(LIBDEPS) \
+ build/genchecksum$(build_exeext) $(CXX_OBJS) $(BACKEND) $(CODYLIB) 
$(LIBDEPS) \
  checksum-options > cc1plus-checksum.c.tmp && \
  $(srcdir)/../move-if-change cc1plus-checksum.c.tmp 
cc1plus-checksum.c; \
fi
 
-cc1plus$(exeext): $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(LIBDEPS) 
$(c++.prev)
+cc1plus$(exeext): $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(CODYLIB) 
$(LIBDEPS) $(c++.prev)
@$(call LINK_PROGRESS,$(INDEX.c++),start)
+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
  $(CXX_OBJS) cc1plus-checksum.o $(BACKEND) $(CODYLIB) $(NETLIBS) \
--- gcc/objcp/Make-lang.in.jj   2021-01-05 13:56:18.629046227 +0100
+++ gcc/objcp/Make-lang.in  2021-01-05 13:57:01.603562005 +0100
@@ -61,14 +61,14 @@ OBJCXX_OBJS = objcp/objcp-act.o objcp/ob
 obj-c++_OBJS = $(OBJCXX_OBJS) cc1objplus-checksum.o
 
 cc1objplus-checksum.c : build/genchecksum$(build_exeext) checksum-options \
-   $(OBJCXX_OBJS) $(BACKEND) $(LIBDEPS)
-   build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) \
+   $(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS)
+   build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) \
$(LIBDEPS) checksum-options > cc1objplus-checksum.c.tmp && \
$(srcdir)/../move-if-change cc1objplus-checksum.c.tmp \
cc1objplus-checksum.c
 
 cc1objplus$(exeext): $(OBJCXX_OBJS) cc1objplus-checksum.o $(BACKEND) \
-$(LIBDEPS) $(obj-c++.prev)
+$(CODYLIB) $(LIBDEPS) $(obj-c++.prev)
@$(call LINK_PROGRESS,$(INDEX.obj-c++),start)
+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
$(OBJCXX_OBJS) cc1objplus-checksum.o $(BACKEND) \


Jakub



[PATCH] move SLP debug counter

2021-01-05 Thread Richard Biener
This moves the debug counter so that it covers individual SLP subgraphs.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-01-05  Richard Biener  

* tree-vect-slp.c (vect_slp_region): Move debug counter
to cover individual subgraphs.
---
 gcc/tree-vect-slp.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 67aaa7b0a6a..49cb635ee92 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4616,8 +4616,7 @@ vect_slp_region (vec bbs, 
vec datarefs,
bb_vinfo->shared->check_datarefs ();
   bb_vinfo->vector_mode = next_vector_mode;
 
-  if (vect_slp_analyze_bb_1 (bb_vinfo, n_stmts, fatal, dataref_groups)
- && dbg_cnt (vect_slp))
+  if (vect_slp_analyze_bb_1 (bb_vinfo, n_stmts, fatal, dataref_groups))
{
  if (dump_enabled_p ())
{
@@ -4648,6 +4647,9 @@ vect_slp_region (vec bbs, 
vec datarefs,
  continue;
}
 
+ if (!dbg_cnt (vect_slp))
+   continue;
+
  if (!vectorized && dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
 "Basic block will be vectorized "
-- 
2.26.2


[PATCH] tree-optimization/98428 - avoid pre-existing vectors for loop SLP

2021-01-05 Thread Richard Biener
It wasn't supposed to be enabled and apparently copying around the
checking messed up the condition.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-01-05  Richard Biener  

PR tree-optimization/98428
* tree-vect-slp.c (vect_build_slp_tree_1): Properly reject
vector lane extracts for loop vectorization.
---
 gcc/tree-vect-slp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 2c2cf637e73..67aaa7b0a6a 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1096,11 +1096,10 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
*swap,
   && rhs_code == BIT_FIELD_REF)
{
  tree vec = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
- if (TREE_CODE (vec) != SSA_NAME
+ if (!is_a <bb_vec_info> (vinfo)
+ || TREE_CODE (vec) != SSA_NAME
  || !types_compatible_p (vectype, TREE_TYPE (vec)))
{
- if (is_a <bb_vec_info> (vinfo) && i != 0)
-   continue;
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "Build SLP failed: "
-- 
2.26.2


Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 05, 2021 at 12:13:59PM +, Julian Brown wrote:
> Just to check, does my reply below address your concerns --
> particularly with regards to the current usage of CUDA streams
> serializing kernel executions from different host threads? Given that
> situation, and the observed speed improvement with OpenMP offloading to
> NVPTX with the patch, I'm not sure how much sense it makes to do
> anything more sophisticated than this -- especially without a test case
> that demonstrates a performance regression (or an exacerbated
> out-of-memory condition) with the patch.

I guess I can live with it for GCC 11, but would like this to be
reconsidered for GCC 12; people do run OpenMP offloading code from multiple,
often concurrent, threads and we shouldn't serialize it unnecessarily.

Jakub



[PATCH] expand: Fold x - y < 0 to x < y during expansion [PR94802]

2021-01-05 Thread Jakub Jelinek via Gcc-patches
Hi!

My earlier patch to simplify x - y < 0 etc. for signed subtraction
with undefined overflow into x < y in match.pd regressed some tests,
even when it was guarded to be post-IPA.  The following patch thus
attempts to optimize that during expansion instead (which is the last
time we can do it; afterwards we lose the information whether it was
x - y < 0 or (int) ((unsigned) x - y) < 0, for which we couldn't
optimize it).
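
For illustration only (not part of the patch), here is a minimal sketch of why the two forms have to be told apart. The function names are hypothetical, and the sketch assumes the usual two's complement conversion from unsigned to int (guaranteed since C++20, and what GCC does in practice):

```cpp
#include <cassert>
#include <climits>

// Signed form: overflow in a - b is undefined behaviour, so a compiler is
// allowed to evaluate this as a < b.
bool signed_sub_lt_zero(int a, int b) { return (a - b) < 0; }

// Wrapping form: the subtraction is carried out in unsigned arithmetic,
// which is defined modulo 2^N, so it must not be rewritten to a < b.
bool wrapping_sub_lt_zero(int a, int b) {
  return static_cast<int>(static_cast<unsigned>(a) -
                          static_cast<unsigned>(b)) < 0;
}

// The comparison the signed form may be folded to.
bool plain_lt(int a, int b) { return a < b; }

// Hypothetical driver showing where the forms agree and where they differ.
void demo_sub_cmp() {
  // No overflow involved: all three agree.
  assert(signed_sub_lt_zero(3, 5));
  assert(wrapping_sub_lt_zero(3, 5));
  assert(plain_lt(3, 5));
  // INT_MIN - 1 wraps to INT_MAX in unsigned arithmetic, so the wrapping
  // form reports "not negative" even though INT_MIN < 1 holds.
  assert(!wrapping_sub_lt_zero(INT_MIN, 1));
  assert(plain_lt(INT_MIN, 1));
}
```

Once both forms have been lowered to the same wrapping machine subtraction they can no longer be distinguished, which is why expansion is the last point where the fold is still safe.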

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-05  Jakub Jelinek  

PR tree-optimization/94802
* expr.h (maybe_optimize_sub_cmp_0): Declare.
* expr.c: Include tree-pretty-print.h and flags.h.
(maybe_optimize_sub_cmp_0): New function.
(do_store_flag): Use it.
* cfgexpand.c (expand_gimple_cond): Likewise.

* gcc.target/i386/pr94802.c: New test.
* gcc.dg/Wstrict-overflow-25.c: Remove xfail.

--- gcc/expr.h.jj   2021-01-04 10:25:37.700246654 +0100
+++ gcc/expr.h  2021-01-05 12:55:20.673233214 +0100
@@ -298,6 +298,7 @@ extern tree string_constant (tree, tree
 extern tree byte_representation (tree, tree *, tree *, tree *);
 
 extern enum tree_code maybe_optimize_mod_cmp (enum tree_code, tree *, tree *);
+extern void maybe_optimize_sub_cmp_0 (enum tree_code, tree *, tree *);
 
 /* Two different ways of generating switch statements.  */
 extern int try_casesi (tree, tree, tree, tree, rtx, rtx, rtx, 
profile_probability);
--- gcc/expr.c.jj   2021-01-04 10:25:38.0 +0100
+++ gcc/expr.c  2021-01-05 14:12:26.826136638 +0100
@@ -62,6 +62,8 @@ along with GCC; see the file COPYING3.
 #include "ccmp.h"
 #include "gimple-fold.h"
 #include "rtx-vector-builder.h"
+#include "tree-pretty-print.h"
+#include "flags.h"
 
 
 /* If this is nonzero, we do not bother generating VOLATILE
@@ -12275,6 +12277,37 @@ maybe_optimize_mod_cmp (enum tree_code c
   *arg1 = c4;
   return code == EQ_EXPR ? LE_EXPR : GT_EXPR;
 }
+
+/* Optimize x - y < 0 into x < y if x - y has undefined overflow.  */
+
+void
+maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1)
+{
+  gcc_checking_assert (code == GT_EXPR || code == GE_EXPR
+  || code == LT_EXPR || code == LE_EXPR);
+  gcc_checking_assert (integer_zerop (*arg1));
+
+  if (!optimize)
+return;
+
+  gimple *stmt = get_def_for_expr (*arg0, MINUS_EXPR);
+  if (stmt == NULL)
+return;
+
+  tree treeop0 = gimple_assign_rhs1 (stmt);
+  tree treeop1 = gimple_assign_rhs2 (stmt);
+  if (!TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (treeop0)))
+return;
+
+  if (issue_strict_overflow_warning (WARN_STRICT_OVERFLOW_COMPARISON))
+warning_at (gimple_location (stmt), OPT_Wstrict_overflow,
+   "assuming signed overflow does not occur when "
+   "simplifying %<x - y %s 0%> to %<x %s y%>",
+   op_symbol_code (code), op_symbol_code (code));
+
+  *arg0 = treeop0;
+  *arg1 = treeop1;
+}
 
 /* Generate code to calculate OPS, and exploded expression
using a store-flag instruction and return an rtx for the result.
@@ -12363,6 +12396,14 @@ do_store_flag (sepops ops, rtx target, m
}
 }
 
+  /* Optimize (x - y) < 0 into x < y if x - y has undefined overflow.  */
+  if (!unsignedp
+  && (ops->code == LT_EXPR || ops->code == LE_EXPR
+ || ops->code == GT_EXPR || ops->code == GE_EXPR)
+  && integer_zerop (arg1)
+  && TREE_CODE (arg0) == SSA_NAME)
+    maybe_optimize_sub_cmp_0 (ops->code, &arg0, &arg1);
+
   /* Get the rtx comparison code to use.  We know that EXP is a comparison
  operation of some type.  Some comparisons against 1 and -1 can be
  converted to comparisons with zero.  Do so here so that the tests
--- gcc/cfgexpand.c.jj  2021-01-04 10:25:38.437238308 +0100
+++ gcc/cfgexpand.c 2021-01-05 12:58:51.718857608 +0100
@@ -2621,6 +2621,14 @@ expand_gimple_cond (basic_block bb, gcon
   && TREE_CODE (op1) == INTEGER_CST)
 code = maybe_optimize_mod_cmp (code, &op0, &op1);
 
+  /* Optimize (x - y) < 0 into x < y if x - y has undefined overflow.  */
+  if (!TYPE_UNSIGNED (TREE_TYPE (op0))
+  && (code == LT_EXPR || code == LE_EXPR
+ || code == GT_EXPR || code == GE_EXPR)
+  && integer_zerop (op1)
+  && TREE_CODE (op0) == SSA_NAME)
+    maybe_optimize_sub_cmp_0 (code, &op0, &op1);
+
   last2 = last = get_last_insn ();
 
   extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
--- gcc/testsuite/gcc.target/i386/pr94802.c.jj  2021-01-05 13:08:11.044567008 
+0100
+++ gcc/testsuite/gcc.target/i386/pr94802.c 2021-01-05 13:07:50.093802618 
+0100
@@ -0,0 +1,59 @@
+/* PR tree-optimization/94802 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-final { scan-assembler-not "\ttestl\t" } } */
+/* { dg-final { scan-assembler-times "\tcmpl\t" 8 } } */
+
+void foo (void);
+
+int
+f1 (int a, int b)
+{
+  return (a - b) >= 0;
+}
+
+int
+f2 (int a, int b)
+{
+  return (a - b) > 0;
+}
+
+int
+f3 (int a, int b)
+{
+  return (a - b) <= 0;
+}
+
+int
+f4 (int a, int b)
+{
+  return (a - b) < 0;

[PATCH] c++, v2: Fix ICE with __builtin_bit_cast [PR98469]

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Mon, Jan 04, 2021 at 04:01:25PM -0500, Jason Merrill via Gcc-patches wrote:
> On 1/4/21 3:48 PM, Jakub Jelinek wrote:
> > On Mon, Jan 04, 2021 at 03:44:46PM -0500, Jason Merrill wrote:
> > > This change is OK, but part of the problem is that we're trying to do
> > > overload resolution for an S copy/move constructor, which we shouldn't be
> > > because bit_cast is a prvalue, so in C++17 and up we should use it to
> > > directly initialize the target without any implied constructor call.
> > > 
> > > It seems we're mishandling this because the code in
> > > build_special_member_call specifically looks for TARGET_EXPR or 
> > > CONSTRUCTOR,
> > > and BIT_CAST_EXPR is neither of those.
> > > 
> > > Wrapping a BIT_CAST_EXPR of aggregate type in a TARGET_EXPR would address
> > > this, and any other places that expect a class prvalue to come in the form
> > > of a TARGET_EXPR.
> > 
> > I can try that tomorrow.  Won't that cause copying through extra temporary
> > in some cases though, or is that guaranteed to be optimized?
> 
> It won't cause any extra copying when it's used to initialize another object
> (like the return value of std::bit_cast).  Class prvalues are always
> expressed with a TARGET_EXPR in the front end; the TARGET_EXPR melts away
> when used as an initializer, it only creates a temporary when it's used in
> another way.

Ok, this version wraps it into a TARGET_EXPR then; that alone fixes the bug,
but I've kept the constexpr.c change too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-05  Jakub Jelinek  

PR c++/98469
* constexpr.c (cxx_eval_constant_expression) <case BIT_CAST_EXPR>:
Punt if lval is true.
* semantics.c (cp_build_bit_cast): Call get_target_expr_sfinae on
the result if it has a class type.

* g++.dg/cpp2a/bit-cast8.C: New test.
* g++.dg/cpp2a/bit-cast9.C: New test.

--- gcc/cp/constexpr.c.jj   2021-01-04 10:25:48.750121531 +0100
+++ gcc/cp/constexpr.c  2021-01-05 11:41:38.315032636 +0100
@@ -6900,6 +6900,15 @@ cxx_eval_constant_expression (const cons
   return t;
 
 case BIT_CAST_EXPR:
+  if (lval)
+   {
+ if (!ctx->quiet)
+   error_at (EXPR_LOCATION (t),
+ "address of a call to %qs is not a constant expression",
+ "__builtin_bit_cast");
+ *non_constant_p = true;
+ return t;
+   }
   r = cxx_eval_bit_cast (ctx, t, non_constant_p, overflow_p);
   break;
 
--- gcc/cp/semantics.c.jj   2021-01-04 10:25:48.489124486 +0100
+++ gcc/cp/semantics.c  2021-01-05 11:27:49.327372582 +0100
@@ -10761,6 +10761,10 @@ cp_build_bit_cast (location_t loc, tree
 
   tree ret = build_min (BIT_CAST_EXPR, type, arg);
   SET_EXPR_LOCATION (ret, loc);
+
+  if (!processing_template_decl && CLASS_TYPE_P (type))
+ret = get_target_expr_sfinae (ret, complain);
+
   return ret;
 }
 
--- gcc/testsuite/g++.dg/cpp2a/bit-cast8.C.jj   2021-01-05 11:41:38.315032636 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast8.C  2021-01-05 11:41:38.315032636 
+0100
@@ -0,0 +1,11 @@
+// PR c++/98469
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wall" }
+
+struct S { int s; };
+
+S
+foo ()
+{
+  return __builtin_bit_cast (S, 0);
+}
--- gcc/testsuite/g++.dg/cpp2a/bit-cast9.C.jj   2021-01-05 11:41:38.315032636 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast9.C  2021-01-05 11:41:38.315032636 
+0100
@@ -0,0 +1,15 @@
+// PR c++/98469
+// { dg-do compile { target c++20 } }
+// { dg-options "-Wall" }
+
+template<typename T, typename F>
+constexpr T
+bit_cast (const F &f) noexcept
+{
+  return __builtin_bit_cast (T, f);
+}
+struct S { int s; };
+constexpr int foo (const S &x) { return x.s; }
+constexpr int bar () { return foo (bit_cast<S> (0)); }
+constexpr int x = bar ();
+static_assert (!x);


Jakub



[PATCH toplevel] libctf: new testsuite

2021-01-05 Thread Nick Alcock via Gcc-patches
This enables 'make libctf-check', used by a new libctf testsuite in
binutils.

2021-01-05  Nick Alcock  

* Makefile.def (libctf): No longer no_check.  Checking depends on
all-ld.
* Makefile.in: Regenerated.

---

 Makefile.def  |   4 +-
 Makefile.in   |  13 +

This is a stripped-down top-level-only subset of commit 
c59e30ed1727135f8efb79890f2c458f73709757 in binutils-gdb.git.  (Because
it is identical to what has already landed in binutils, it should apply
without trouble in syncs back to there.)

I don't have permission to push this: Alan has offered to do so.

(I hope I'm doing this right...)

diff --git a/Makefile.def b/Makefile.def
index 089e70ae3ed..cc429aa8628 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -131,8 +131,7 @@ host_modules= { module= lto-plugin; bootstrap=true;
extra_make_flags='@extra_linker_plugin_flags@'; };
 host_modules= { module= libcc1; extra_configure_flags=--enable-shared; };
 host_modules= { module= gotools; };
-host_modules= { module= libctf; no_check=true;
-   bootstrap=true; };
+host_modules= { module= libctf; bootstrap=true; };
 
 target_modules = { module= libstdc++-v3;
   bootstrap=true;
@@ -547,6 +546,7 @@ dependencies = { module=configure-libctf; on=all-bfd; };
 dependencies = { module=configure-libctf; on=all-intl; };
 dependencies = { module=configure-libctf; on=all-zlib; };
 dependencies = { module=configure-libctf; on=all-libiconv; };
+dependencies = { module=check-libctf; on=all-ld; };
 
 // The Makefiles in gdb and gdbserver pull in a file that configure
 // generates in the gnulib directory, so distclean gnulib only after
diff --git a/Makefile.in b/Makefile.in
index fe34132f9e5..4fe7321786e 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -34761,6 +34761,12 @@ maybe-check-libctf:
 maybe-check-libctf: check-libctf
 
 check-libctf:
+   @: $(MAKE); $(unstage)
+   @r=`${PWD_COMMAND}`; export r; \
+   s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
+   $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \
+   (cd $(HOST_SUBDIR)/libctf && \
+ $(MAKE) $(FLAGS_TO_PASS)  $(EXTRA_BOOTSTRAP_FLAGS) check)
 
 @endif libctf
 
@@ -52366,6 +52372,13 @@ configure-stage3-libctf: maybe-all-stage3-libiconv
 configure-stage4-libctf: maybe-all-stage4-libiconv
 configure-stageprofile-libctf: maybe-all-stageprofile-libiconv
 configure-stagefeedback-libctf: maybe-all-stagefeedback-libiconv
+check-libctf: maybe-all-ld
+check-stage1-libctf: maybe-all-stage1-ld
+check-stage2-libctf: maybe-all-stage2-ld
+check-stage3-libctf: maybe-all-stage3-ld
+check-stage4-libctf: maybe-all-stage4-ld
+check-stageprofile-libctf: maybe-all-stageprofile-ld
+check-stagefeedback-libctf: maybe-all-stagefeedback-ld
 distclean-gnulib: maybe-distclean-gdb
 distclean-gnulib: maybe-distclean-gdbserver
 all-bison: maybe-all-build-texinfo
-- 
2.29.2.250.g8336e49d6f.dirty



Re: [PATCH] store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 05, 2021 at 01:55:21PM +0100, Richard Biener wrote:
> > Note, I have no idea why the bswap code needs TODO_update_ssa if it changed
> > things, for the vuses it copies them from the surrounding vuses, which looks
> > correct to me.  Perhaps because it uses force_gimple_operand_gsi* in a few
> > spots in bswap_replace?  Confused...
> 
> .. that shouldn't cause updating SSA to be necessary.  Maybe it at some
> point did not update virtual operands appropriately.

Ok, I've committed the following version without the TODO_update_ssa, which
passed another bootstrap/regtest on x86_64-linux and i686-linux.

2021-01-05  Jakub Jelinek  

PR tree-optimization/96239
* gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New
function.
(get_status_for_store_merging): Don't return BB_INVALID for blocks
with potential bswap optimizable CONSTRUCTORs.
(pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap
if possible.

* gcc.dg/tree-ssa/pr96239.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2020-12-16 13:07:51.729733816 +0100
+++ gcc/gimple-ssa-store-merging.c  2020-12-16 16:02:06.238868137 +0100
@@ -1255,6 +1255,75 @@ bswap_replace (gimple_stmt_iterator gsi,
   return tgt;
 }
 
+/* Try to optimize an assignment CUR_STMT with CONSTRUCTOR on the rhs
+   using bswap optimizations.  CDI_DOMINATORS need to be
+   computed on entry.  Return true if it has been optimized and
+   TODO_update_ssa is needed.  */
+
+static bool
+maybe_optimize_vector_constructor (gimple *cur_stmt)
+{
+  tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
+  struct symbolic_number n;
+  bool bswap;
+
+  gcc_assert (is_gimple_assign (cur_stmt)
+ && gimple_assign_rhs_code (cur_stmt) == CONSTRUCTOR);
+
+  tree rhs = gimple_assign_rhs1 (cur_stmt);
+  if (!VECTOR_TYPE_P (TREE_TYPE (rhs))
+  || !INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
+  || gimple_assign_lhs (cur_stmt) == NULL_TREE)
+return false;
+
+  HOST_WIDE_INT sz = int_size_in_bytes (TREE_TYPE (rhs)) * BITS_PER_UNIT;
+  switch (sz)
+{
+case 16:
+  load_type = bswap_type = uint16_type_node;
+  break;
+case 32:
+  if (builtin_decl_explicit_p (BUILT_IN_BSWAP32)
+ && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)
+   {
+ load_type = uint32_type_node;
+ fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
+ bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+   }
+  else
+   return false;
+  break;
+case 64:
+  if (builtin_decl_explicit_p (BUILT_IN_BSWAP64)
+ && (optab_handler (bswap_optab, DImode) != CODE_FOR_nothing
+ || (word_mode == SImode
+ && builtin_decl_explicit_p (BUILT_IN_BSWAP32)
+ && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)))
+   {
+ load_type = uint64_type_node;
+ fndecl = builtin_decl_explicit (BUILT_IN_BSWAP64);
+ bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+   }
+  else
+   return false;
+  break;
+default:
+  return false;
+}
+
+  gimple *ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
+  if (!ins_stmt || n.range != (unsigned HOST_WIDE_INT) sz)
+return false;
+
+  if (bswap && !fndecl && n.range != 16)
+return false;
+
+  memset (&nop_stats, 0, sizeof (nop_stats));
+  memset (&bswap_stats, 0, sizeof (bswap_stats));
+  return bswap_replace (gsi_for_stmt (cur_stmt), ins_stmt, fndecl,
+   bswap_type, load_type, &n, bswap) != NULL_TREE;
+}
+
 /* Find manual byte swap implementations as well as load in a given
endianness. Byte swaps are turned into a bswap builtin invokation
while endian loads are converted to bswap builtin invokation or
@@ -5126,6 +5195,7 @@ static enum basic_block_status
 get_status_for_store_merging (basic_block bb)
 {
   unsigned int num_statements = 0;
+  unsigned int num_constructors = 0;
   gimple_stmt_iterator gsi;
   edge e;
 
@@ -5138,9 +5208,27 @@ get_status_for_store_merging (basic_bloc
 
   if (store_valid_for_store_merging_p (stmt) && ++num_statements >= 2)
break;
+
+  if (is_gimple_assign (stmt)
+ && gimple_assign_rhs_code (stmt) == CONSTRUCTOR)
+   {
+ tree rhs = gimple_assign_rhs1 (stmt);
+ if (VECTOR_TYPE_P (TREE_TYPE (rhs))
+ && INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
+ && gimple_assign_lhs (stmt) != NULL_TREE)
+   {
+ HOST_WIDE_INT sz
+   = int_size_in_bytes (TREE_TYPE (rhs)) * BITS_PER_UNIT;
+ if (sz == 16 || sz == 32 || sz == 64)
+   {
+ num_constructors = 1;
+ break;
+   }
+   }
+   }
 }
 
-  if (num_statements == 0)
+  if (num_statements == 0 && num_constructors == 0)
 return BB_INVALID;
 
   if (cfun->can_throw_non_call_exceptions && cfun->eh
@@ 

Re: Go patch committed: Accept -fgo-embedcfg option

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 05, 2021 at 11:06:27AM +0100, Andreas Schwab wrote:
> FAIL: compiler driver --help=go option(s): "^ +-.*[^:.]$" absent from output: 
> "  -fgo-embedcfg=List embedded files via go:embed"

Fixed thusly, committed as obvious.

2021-01-05  Jakub Jelinek  

* lang.opt (fgo-embedcfg=): Add full stop at the end of description.

--- gcc/go/lang.opt.jj  2021-01-05 09:17:24.754566798 +0100
+++ gcc/go/lang.opt 2021-01-05 16:11:22.647352215 +0100
@@ -59,7 +59,7 @@ Go Joined RejectNegative
 
 fgo-embedcfg=
 Go Joined RejectNegative
--fgo-embedcfg=   List embedded files via go:embed
+-fgo-embedcfg=   List embedded files via go:embed.
 
 fgo-optimize-
 Go Joined

Jakub



[PATCH] tree-optimization/98381 - fix live bool vector extract

2021-01-05 Thread Richard Biener
This fixes extraction of live bool vector results for the case of
integer mode vectors.
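
As a hedged illustration (not part of the patch), the arithmetic behind the last-lane offset can be sketched as follows. The helper names are hypothetical, and the mask layout (a 4-lane boolean vector held in an 8-bit integer mode with 1-bit elements) is an assumed example:

```cpp
#include <cassert>

// Old computation: total vector size in bits minus one element.
unsigned last_lane_old(unsigned vec_bits, unsigned elem_bits) {
  return vec_bits - elem_bits;
}

// New computation: element size times the index of the last lane.
unsigned last_lane_new(unsigned elem_bits, unsigned nunits) {
  return elem_bits * (nunits - 1);
}

// Hypothetical driver comparing the two formulas.
void demo_last_lane() {
  // Ordinary 128-bit vector of four 32-bit lanes: both give bit 96.
  assert(last_lane_old(128, 32) == 96);
  assert(last_lane_new(32, 4) == 96);
  // Assumed 4-lane mask in an 8-bit integer with 1-bit elements:
  // the last lane is bit 3, but "size minus element" points at bit 7.
  assert(last_lane_old(8, 1) == 7);
  assert(last_lane_new(1, 4) == 3);
}
```

The two formulas only agree when the vector size equals the lane count times the element size, which need not hold for integer-mode masks.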

Bootstrapped and tested on x86_64-unknown-linux-gnu (and i386.exp with 
SDE), pushed.

2021-01-05  Richard Biener  

PR tree-optimization/98381
* tree.c (vector_element_bits): Properly compute bool vector
element size.
* tree-vect-loop.c (vectorizable_live_operation): Properly
compute the last lane bit offset.
---
 gcc/tree-vect-loop.c | 5 ++---
 gcc/tree.c   | 9 +++--
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 830531f48b8..965cc164f6e 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -8494,7 +8494,7 @@ vectorizable_live_operation (vec_info *vinfo,
 {
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
   imm_use_iterator imm_iter;
-  tree lhs, lhs_type, bitsize, vec_bitsize;
+  tree lhs, lhs_type, bitsize;
   tree vectype = (slp_node
  ? SLP_TREE_VECTYPE (slp_node)
  : STMT_VINFO_VECTYPE (stmt_info));
@@ -8637,7 +8637,6 @@ vectorizable_live_operation (vec_info *vinfo,
   lhs_type = TREE_TYPE (lhs);
 
   bitsize = vector_element_bits_tree (vectype);
-  vec_bitsize = TYPE_SIZE (vectype);
 
   /* Get the vectorized lhs of STMT and the lane to use (counted in bits).  */
   tree vec_lhs, bitstart;
@@ -8661,7 +8660,7 @@ vectorizable_live_operation (vec_info *vinfo,
   vec_lhs = gimple_get_lhs (vec_stmt);
 
   /* Get the last lane in the vector.  */
-  bitstart = int_const_binop (MINUS_EXPR, vec_bitsize, bitsize);
+  bitstart = int_const_binop (MULT_EXPR, bitsize, bitsize_int (nunits - 1));
 }
 
   if (loop_vinfo)
diff --git a/gcc/tree.c b/gcc/tree.c
index 421a2b4bc02..e0a1d512019 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -14021,8 +14021,13 @@ vector_element_bits (const_tree type)
 {
   gcc_checking_assert (VECTOR_TYPE_P (type));
   if (VECTOR_BOOLEAN_TYPE_P (type))
-return vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (type)),
-   TYPE_VECTOR_SUBPARTS (type));
+{
+  if (VECTOR_MODE_P (TYPE_MODE (type)))
+   return vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (type)),
+   TYPE_VECTOR_SUBPARTS (type));
+  else
+   return 1;
+}
   return tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)));
 }
 
-- 
2.26.2


[PATCH] i386: Prevent spurious FP exceptions with _mm_cvt{, t}ps_pi32 [PR98522]

2021-01-05 Thread Uros Bizjak via Gcc-patches
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE
by clearing the top 64 bits of the input XMM register.

2021-01-05  Uroš Bizjak  

gcc/
PR target/98522
* config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split.
Clear the top 64 bits of the input XMM register.
(sse_cvttps2pi): Ditto.

gcc/testsuite

PR target/98522
* gcc.target/i386/pr98522.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to mainline, will be backported to gcc-10.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d84103807ff..c8e771fd697 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -5103,31 +5103,65 @@
(set_attr "type" "ssecvt")
(set_attr "mode" "V4SF")])
 
-(define_insn "sse_cvtps2pi"
+(define_insn_and_split "sse_cvtps2pi"
   [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
(vec_select:V2SI
- (unspec:V4SI [(match_operand:V4SF 1 "register_mmxmem_operand" 
"xm,YvBm")]
+ (unspec:V4SI [(match_operand:V4SF 1 "nonimmediate_operand" "xm,YvBm")]
   UNSPEC_FIX_NOTRUNC)
  (parallel [(const_int 0) (const_int 1)])))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
   "@
cvtps2pi\t{%1, %0|%0, %q1}
-   %vcvtps2dq\t{%1, %0|%0, %1}"
+   #"
+  "TARGET_SSE2 && reload_completed
+   && SSE_REG_P (operands[0])"
+  [(const_int 0)]
+{
+  rtx op1 = lowpart_subreg (V2SFmode, operands[1],
+   GET_MODE (operands[1]));
+  rtx tmp = lowpart_subreg (V4SFmode, operands[0],
+   GET_MODE (operands[0]));
+
+  op1 = gen_rtx_VEC_CONCAT (V4SFmode, op1, CONST0_RTX (V2SFmode));
+  emit_insn (gen_rtx_SET (tmp, op1));
+
+  rtx dest = lowpart_subreg (V4SImode, operands[0],
+   GET_MODE (operands[0]));
+  emit_insn (gen_sse2_fix_notruncv4sfv4si (dest, tmp));
+  DONE;
+}
   [(set_attr "isa" "*,sse2")
(set_attr "mmx_isa" "native,*")
(set_attr "type" "ssecvt")
(set_attr "unit" "mmx,*")
(set_attr "mode" "DI")])
 
-(define_insn "sse_cvttps2pi"
+(define_insn_and_split "sse_cvttps2pi"
   [(set (match_operand:V2SI 0 "register_operand" "=y,Yv")
(vec_select:V2SI
- (fix:V4SI (match_operand:V4SF 1 "register_mmxmem_operand" "xm,YvBm"))
+ (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm,YvBm"))
  (parallel [(const_int 0) (const_int 1)])))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
   "@
cvttps2pi\t{%1, %0|%0, %q1}
-   %vcvttps2dq\t{%1, %0|%0, %1}"
+   #"
+  "TARGET_SSE2 && reload_completed
+   && SSE_REG_P (operands[0])"
+  [(const_int 0)]
+{
+  rtx op1 = lowpart_subreg (V2SFmode, operands[1],
+   GET_MODE (operands[1]));
+  rtx tmp = lowpart_subreg (V4SFmode, operands[0],
+   GET_MODE (operands[0]));
+
+  op1 = gen_rtx_VEC_CONCAT (V4SFmode, op1, CONST0_RTX (V2SFmode));
+  emit_insn (gen_rtx_SET (tmp, op1));
+
+  rtx dest = lowpart_subreg (V4SImode, operands[0],
+   GET_MODE (operands[0]));
+  emit_insn (gen_fix_truncv4sfv4si2 (dest, tmp));
+  DONE;
+}
   [(set_attr "isa" "*,sse2")
(set_attr "mmx_isa" "native,*")
(set_attr "type" "ssecvt")
@@ -8026,7 +8060,7 @@
 (define_insn "*vec_concatv4sf_0"
   [(set (match_operand:V4SF 0 "register_operand"   "=v")
(vec_concat:V4SF
- (match_operand:V2SF 1 "nonimmediate_operand" "xm")
+ (match_operand:V2SF 1 "nonimmediate_operand" "vm")
  (match_operand:V2SF 2 "const0_operand"   " C")))]
   "TARGET_SSE2"
   "%vmovq\t{%1, %0|%0, %1}"
@@ -10457,7 +10491,7 @@
   [(set (match_operand:VF2_512_256 0 "register_operand" "=v")
(vec_merge:VF2_512_256
  (vec_duplicate:VF2_512_256
-   (match_operand: 2 "nonimmediate_operand" "xm"))
+   (match_operand: 2 "nonimmediate_operand" "vm"))
  (match_operand:VF2_512_256 1 "const0_operand" "C")
  (const_int 1)))]
   "TARGET_AVX"
diff --git a/gcc/testsuite/gcc.target/i386/pr98522.c 
b/gcc/testsuite/gcc.target/i386/pr98522.c
new file mode 100644
index 00000000000..762f2eded50
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98522.c
@@ -0,0 +1,39 @@
+/* PR target/98522 */
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target fenv_exceptions } */
+
+#include <fenv.h>
+#include <xmmintrin.h>
+
+__m64
+__attribute__((noinline))
+test_cvt (__m128 a)
+{
+  return _mm_cvt_ps2pi (a);
+}
+
+__m64
+__attribute__((noinline))
+test_cvtt (__m128 a)
+{
+  return _mm_cvtt_ps2pi (a);
+}
+
+int
+main ()
+{
+  __m128 x = (__m128)(__m128i){0xLL, 0x7fffLL};
+  volatile __m64 y;
+
+  feclearexcept (FE_INVALID);
+
+  y = test_cvt(x);
+  y = test_cvtt (x);
+
+  if (fetestexcept (FE_INVALID))
+    __builtin_abort ();
+
+  return 0;
+}
+


[PATCH] i386: Add _mm256_cmov_si256 [PR98521]

2021-01-05 Thread Uros Bizjak via Gcc-patches
Add missing _mm256_cmov_si256 intrinsic to xopintrin.h.
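
For readers unfamiliar with the intrinsic, a scalar model of the operation is sketched below. It assumes the usual XOP VPCMOV semantics (each result bit comes from the first source where the selector bit is 1, otherwise from the second source); `cmov_bits` is a hypothetical helper working on 64 bits, not something provided by xopintrin.h:

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select: take each bit from a where sel is 1, else from b.
std::uint64_t cmov_bits(std::uint64_t a, std::uint64_t b, std::uint64_t sel) {
  return (a & sel) | (b & ~sel);
}

// Hypothetical driver with a few easy-to-check cases.
void demo_cmov() {
  assert(cmov_bits(0xFF00, 0x00FF, 0xF0F0) == 0xF00F);
  assert(cmov_bits(0x1234, 0x5678, ~std::uint64_t{0}) == 0x1234);  // all from a
  assert(cmov_bits(0x1234, 0x5678, 0) == 0x5678);                  // all from b
}
```

Under that assumption, the 256-bit intrinsic would apply the same per-bit selection across the whole register.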

2021-01-05  Uroš Bizjak  

gcc/
PR target/98521
* config/i386/xopintrin.h (_mm256_cmov_si256): New.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to mainline, will be backported to gcc-10.

Uros.
diff --git a/gcc/config/i386/xopintrin.h b/gcc/config/i386/xopintrin.h
index 49bac22effa..4299a5993ed 100644
--- a/gcc/config/i386/xopintrin.h
+++ b/gcc/config/i386/xopintrin.h
@@ -208,6 +208,12 @@ _mm_cmov_si128(__m128i __A, __m128i __B, __m128i __C)
   return  (__m128i) __builtin_ia32_vpcmov (__A, __B, __C);
 }
 
+extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm256_cmov_si256(__m256i __A, __m256i __B, __m256i __C)
+{
+  return  (__m256i) __builtin_ia32_vpcmov256 (__A, __B, __C);
+}
+
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
 {


[PATCH] c++: private inheritance access diagnostics fix [PR17314]

2021-01-05 Thread Anthony Sharp via Gcc-patches
This patch fixes PR17314 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17314).
Previously, when class C attempted to access member a declared in class A
through class B, where class B privately inherits from A and class C inherits
from B, GCC would correctly report an access violation, but would erroneously
report that the reason was that a was "protected", when in fact, from the
point of view of class C, it was "private". This patch updates the
diagnostics code to generate more accurate errors in cases of failed
inheritance such as these.

This bug happened because GCC was examining the declared access of decl,
instead of looking at it in the context of class inheritance.
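
A minimal sketch of the situation described above follows. It is not the testcase from the PR; the class names A, B, C and the member a follow the description, while `get`, `via_b`, `direct` and `demo_access` are hypothetical names added for illustration. The ill-formed access is left commented out so the sketch compiles:

```cpp
#include <cassert>

class A {
protected:
  int a = 42;                 // declared "protected" in A
};

class B : private A {         // private inheritance: A's members are private in B
public:
  int get() const { return a; }   // OK: B may use its own base's member
};

class C : public B {
public:
  int via_b() const { return get(); }   // OK: goes through B's public interface
  // int direct() const { return a; }   // ill-formed: seen from C, a is
  //                                    // private, not protected
};

// Hypothetical driver: only the legal access path is exercised.
void demo_access() {
  C c;
  assert(c.via_b() == 42);
}
```

Uncommenting `direct` should produce the access error whose explanatory note this patch changes from "protected" to "private".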

--- COMMENTS ---

This is my first GCC patch ever so there is probably something I have done
very wrong. Please let me know :) The thought of my code being scrutinised
by people with PhDs and doctorates is quite frankly terrifying.

Note that since it is a new year I had to make a new changelog file so the
diff for the patch might be slightly off.

There was no need to add additional regression tests since it was adequate
to simply change some of the regression tests that were there originally
(all the patch changes is the informative message telling the user where
a decl was defined as private).

--- REGRESSION ANALYSIS ---

No regressions reported.

G++ (CLEAN) RESULTS

# of expected passes            202879
# of unexpected failures        1
# of expected failures          988
# of unsupported tests          8654

GCC (CLEAN) RESULTS

# of expected passes            163377
# of unexpected failures        94
# of unexpected successes       37
# of expected failures          915
# of unsupported tests          2530

G++ (PR17314 PATCHED) RESULTS

# of expected passes            202871
# of unexpected failures        1
# of expected failures          988
# of unsupported tests          8654

GCC (PR17314 PATCHED) RESULTS

# of expected passes            163377
# of unexpected failures        94
# of unexpected successes       37
# of expected failures          915
# of unsupported tests          2530

When I build and make -k check -j 6 on the patched source it reports
202871 passes (8 fewer), although the FAILs do not increase. I am not 100%
sure why this happens since I have not removed any testcases, only edited a
few, but I think this happens because in files like dr142.c I removed more
output checks than I added. make -k check -j 6 also returns error 2
sometimes, although there are no obvious errors or warnings in the logs
explaining why. Probably harmless?

--- BUILD REPORT ---

GCC builds normally on x86_64-pc-linux-gnu for x86_64-pc-linux-gnu using
make -j 6. I didn't see it necessary to test on other build targets since the
patch only affects the C++ front end and so functionality is unlikely
to differ between platforms.

The compile log reports:

Comparing stages 2 and 3
warning: gcc/cc1obj-checksum.o differs
Comparison successful.

and then continues. I assume this means it was actually successful.






Index: gcc/cp/ChangeLog
from  Anthony Sharp  

Fixes PR17314
* typeck.c (complain_about_unrecognized_member): Updated function
arguments in complain_about_access.
* call.c (complain_about_access): Altered function.
* semantics.c (get_parent_with_private_access): Added function.
(access_in_type): Added as extern function.
* search.c (access_in_type): Made function non-static so it can be
used in semantics.c.
* cp-tree.h (complain_about_access): Changed parameters of function.
Index: gcc/testsuite/ChangeLog
from  Anthony Sharp  

Fixes PR17314
* g++.dg/lookup/scoped1.c modified testcase to run successfully with
changes.
* g++.dg/tc1/dr142.c modified testcase to run successfully with
changes.
* g++.dg/tc1/dr142.c modified testcase to run successfully with
changes.
* g++.dg/tc1/dr142.c modified testcase to run successfully with
changes.
* g++.dg/tc1/dr52.c modified testcase to run successfully with changes.
* g++.old-deja/g++.brendan/visibility6.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.brendan/visibility8.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.jason/access8.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.law/access4.c modified testcase to run successfully
with changes.
* g++.old-deja/g++.law/visibility12.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.law/visibility4.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.law/visibility8.c modified testcase to run
successfully with changes.
* g++.old-deja/g++.other/access4.c modified testcase to run
successfully with changes.
Index: gcc/testsuite/g++.old-deja/g++.jason/access8.C
===
--- gcc/testsuite/g++.old-deja/g++.jason/access8.C

Re: git commit hook does not record my patches to PRs

2021-01-05 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 05, 2021 at 03:04:55PM +0100, Uros Bizjak via Gcc-patches wrote:
> Hello!
> 
> For some reason git commit hook does not record my patches to PRs,
> mentioned in the commit message. Some recent examples:

Maybe the python mess with UTF-8 is back.
> 
> PR 98521:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=951bdbde6ade56eb63af1dfa18777348a8a0d89e
> 
> and PR98522:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1ff0ddcd8b4728bcc96e1daf2e70a03dc9fbf171

Jakub



Re: V3 [PATCH 5/5] gnulib: Support variables from the top level Makefile

2021-01-05 Thread H.J. Lu via Gcc-patches
On Tue, Jan 5, 2021 at 5:27 AM Christian Biesinger
 wrote:
>
> On Fri, Jan 1, 2021 at 1:07 AM H.J. Lu via Gdb-patches
>  wrote:
> >
> > On Thu, Dec 31, 2020 at 3:50 PM Joseph Myers  
> > wrote:
> > >
> > > On Sat, 19 Dec 2020, H.J. Lu via Gcc-patches wrote:
> > >
> > > > Work around what appears to be a GNU make bug handling MAKEFLAGS
> > > > values defined in terms of make variables, as is the case for CC and
> > > > friends when we are called from the top level Makefile.
> > >
> > > This description, and the comment in Makefile.am repeating it, is rather
> > > unhelpful as it provides no way for a reader to know what the supposed bug
> > > is.  Reviewers need to be able to work out whether the proposed workaround
> > > is correct or the right approach for working around the bug.  Maintainers
> > > in future need to be able to tell what the bug is.  So the comment needs
> > > to explain what the bug is and give a reference to a report for the bug in
> > > the GNU make bug tracker, so that subsequent maintainers can look at that
> > > bug to tell if the workaround is still needed at all.
> > >
> >
> > I just copied the same workaround from other directories in GCC.
>
> But could you explain under which circumstances the bug happens?
>

To rebuild a subdirectory with different CFLAGS/CXXFLAGS without
regenerating new Makefiles, like bootstrapping GCC and doing PGO
build in binutils/GDB,  we pass new CFLAGS/CXXFLAGS from
toplevel Makefile to the subdirectory.  Without this workaround, the old
CFLAGS/CXXFLAGS are used in the subdirectory.  The same workaround
is used in subdirectories for bootstrapping GCC.  My patch extends
it to GDB for PGO build.

-- 
H.J.


git commit hook does not record my patches to PRs

2021-01-05 Thread Uros Bizjak via Gcc-patches
Hello!

For some reason git commit hook does not record my patches to PRs,
mentioned in the commit message. Some recent examples:

PR 98521:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=951bdbde6ade56eb63af1dfa18777348a8a0d89e

and PR98522:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1ff0ddcd8b4728bcc96e1daf2e70a03dc9fbf171

These two patches were committed as a single push (but one commit per
push also doesn't reach the PR), and the second one is also missing in
the gcc-cvs mailing list archive:

https://gcc.gnu.org/pipermail/gcc-cvs/2021-January/date.html

In the commit message, I have put PR marks everywhere I can think of,
but the commit hook is still ignoring them. Can someone please check,
what is wrong with the flow and eventually advise me what to do to
enjoy the benefits of the automation.

Thanks,
Uros.


Re: [PATCH] avr: cc0 to mode_cc conversion

2021-01-05 Thread Senthil Kumar Selvaraj via Gcc-patches


Senthil Kumar Selvaraj writes:

> Georg-Johann Lay writes:
>
>>
>> Finally, some general remarks:
>
> The work on my github branch was not complete - I'd blindly followed
> whatever the CC0 Transition wiki mentioned (the first three steps of
> case #2), and fixed any regression fallout (for ATmega128).
>
> I intend to try out a define_subst/early clobber of reg_cc based
> approach (inspired by the cris port) and see if that can help avoid the
> proliferation of define_insn_and_splits. Will update how that works out.

I had some time this past week to try implementing some of the changes
you suggested.

>
>>
>> 2) We just saw hundreds of insns being duplicated, basically the whole
>> machine description except for the few insns that leave cc alone.
>> Isn't it possible to use define_subst for the bulk of the insns and
>> get neat code that's easier to grasp and to maintain?
>> After all it's just appending a clobber of reg_cc, and in the current
>> proposal almost 50% of the backend is just redundant repetitions of
>> previous insns.

I could not find a way to get define_subst to do define_insn_and_split -
other targets using the same approach (pdp11, h8300) have the
duplication as well.

>>
>> 4) Many insns don't have reloads and don't need to be turned into a
>> splitter + yet another insns, it should be all right to clobber
>> reg_cc from the very start.  Or am I missing something?  I think
>> I marked all places, but it should be easy enough to spot them.

If I remove the define_insn_and_split and add a (clobber (reg:CC
REG_CC)) to the define_insn itself for xcall patterns, then the producer
of the pattern (define_expand, output template of
define_insn_and_split/define_split etc., or C code) needs to be modified to
include the clobber of REG_CC in a PARALLEL, so that's a whole bunch of
changes.

If that is done at define_expand, for example, then similar patterns
that do not use the hard regs (non-call variants) will also need to be
modified to add the clobber, and therefore there's no point in
define_insn without clobber and split after reload with clobber for
those patterns.

Did I get that right?

FWIW, I'm also working on a parallel implementation that clobbers REG_CC
in all patterns from the start (with matching clobbers in define_expand
etc..) - still not in good enough shape though. It will avoid
duplication, but at the expense of modification of nearly every pattern
to emit or accept a clobber of REG_CC.

Regards
Senthil


Re: V3 [PATCH 5/5] gnulib: Support variables from the top level Makefile

2021-01-05 Thread Christian Biesinger via Gcc-patches
On Fri, Jan 1, 2021 at 1:07 AM H.J. Lu via Gdb-patches
 wrote:
>
> On Thu, Dec 31, 2020 at 3:50 PM Joseph Myers  wrote:
> >
> > On Sat, 19 Dec 2020, H.J. Lu via Gcc-patches wrote:
> >
> > > Work around what appears to be a GNU make bug handling MAKEFLAGS
> > > values defined in terms of make variables, as is the case for CC and
> > > friends when we are called from the top level Makefile.
> >
> > This description, and the comment in Makefile.am repeating it, is rather
> > unhelpful as it provides no way for a reader to know what the supposed bug
> > is.  Reviewers need to be able to work out whether the proposed workaround
> > is correct or the right approach for working around the bug.  Maintainers
> > in future need to be able to tell what the bug is.  So the comment needs
> > to explain what the bug is and give a reference to a report for the bug in
> > the GNU make bug tracker, so that subsequent maintainers can look at that
> > bug to tell if the workaround is still needed at all.
> >
>
> I just copied the same workaround from other directories in GCC.

But could you explain under which circumstances the bug happens?

Christian


[c++]: Improve module-decl diagnostics [PR 98327]

2021-01-05 Thread Nathan Sidwell

The diagnostic for a misplaced module decl was essentially 'computer
says no', which isn't the most helpful.  This adjusts it to indicate
what would be acceptable.

gcc/cp/
* parser.c (cp_parser_module_declaration): Alter diagnostic
text to say where it is permissible.
gcc/testsuite/
* g++.dg/modules/mod-decl-1.C: Adjust.
* g++.dg/modules/p0713-2.C: Adjust.
* g++.dg/modules/p0713-3.C: Adjust.


--
Nathan Sidwell
diff --git i/gcc/cp/parser.c w/gcc/cp/parser.c
index d855e034458..c713852fe93 100644
--- i/gcc/cp/parser.c
+++ w/gcc/cp/parser.c
@@ -13726,19 +13726,22 @@ cp_parser_module_declaration (cp_parser *parser, module_parse mp_state,
   cp_lexer_consume_token (parser->lexer);
   cp_parser_require_pragma_eol (parser, token);
 
-  if ((mp_state != MP_PURVIEW && mp_state != MP_PURVIEW_IMPORTS)
+  if (!(mp_state == MP_PURVIEW || mp_state == MP_PURVIEW_IMPORTS)
 	  || !module_interface_p () || module_partition_p ())
 	error_at (token->location,
-		  "private module fragment not permitted here");
+		  "private module fragment only permitted in purview"
+		  " of module interface or partition");
   else
 	{
 	  mp_state = MP_PRIVATE_IMPORTS;
 	  sorry_at (token->location, "private module fragment");
 	}
 }
-  else if (mp_state != MP_FIRST && mp_state != MP_GLOBAL)
+  else if (!(mp_state == MP_FIRST || mp_state == MP_GLOBAL))
 {
-  error_at (token->location, "module-declaration not permitted here");
+  /* Neither the first declaration, nor in a GMF.  */
+  error_at (token->location, "module-declaration only permitted as first"
+		" declaration, or ending a global module fragment");
 skip_eol:
   cp_parser_skip_to_pragma_eol (parser, token);
 }
diff --git i/gcc/testsuite/g++.dg/modules/mod-decl-1.C w/gcc/testsuite/g++.dg/modules/mod-decl-1.C
index b2665bec743..23d34483dd7 100644
--- i/gcc/testsuite/g++.dg/modules/mod-decl-1.C
+++ w/gcc/testsuite/g++.dg/modules/mod-decl-1.C
@@ -6,11 +6,11 @@ export module frist;
 
 import frist; // { dg-error {cannot import module.* in its own purview} }
 
-module foo.second; // { dg-error "not permitted here" }
+module foo.second; // { dg-error "only permitted as" }
 
 namespace Foo 
 {
-module third;  // { dg-error "not permitted here" }
+module third;  // { dg-error "only permitted as" }
 }
 
 struct Baz
@@ -23,7 +23,7 @@ void Bink ()
   module fifth; // { dg-error "expected" }
 }
 
-module a.; // { dg-error "not permitted" }
+module a.; // { dg-error "only permitted as" }
 
 // { dg-prune-output "not writing module" }
 
diff --git i/gcc/testsuite/g++.dg/modules/p0713-2.C w/gcc/testsuite/g++.dg/modules/p0713-2.C
index c7846e450a9..cb4ccb6c5f6 100644
--- i/gcc/testsuite/g++.dg/modules/p0713-2.C
+++ w/gcc/testsuite/g++.dg/modules/p0713-2.C
@@ -1,3 +1,3 @@
 // { dg-additional-options "-fmodules-ts" }
 int j;
-module; // { dg-error "not permitted" }
+module; // { dg-error "only permitted as" }
diff --git i/gcc/testsuite/g++.dg/modules/p0713-3.C w/gcc/testsuite/g++.dg/modules/p0713-3.C
index 3c539ebab3e..09d89b73b3f 100644
--- i/gcc/testsuite/g++.dg/modules/p0713-3.C
+++ w/gcc/testsuite/g++.dg/modules/p0713-3.C
@@ -1,6 +1,6 @@
 // { dg-additional-options "-fmodules-ts" }
 int k;
-module frob; // { dg-error "not permitted" }
+module frob; // { dg-error "only permitted as" }
 // { dg-prune-output "failed to read" }
 // { dg-prune-output "fatal error:" }
 // { dg-prune-output "compilation terminated" }
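
The two condition rewrites in the parser.c hunk above are De Morgan
rewrites and should not change which states are rejected.  A minimal
sketch of that check follows; the enum is a hypothetical stand-in for
the parser's module_parse states (its members and order are assumptions
made for illustration, not copied from gcc/cp).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parser's state enum; the real one in
   gcc/cp may have different members and ordering.  */
enum mp_state_sketch
{
  MP_NOT_MODULE,
  MP_FIRST,
  MP_GLOBAL,
  MP_PURVIEW_IMPORTS,
  MP_PURVIEW,
  MP_STATE_COUNT
};

/* Old spelling: reject unless the state is FIRST or GLOBAL.  */
bool rejected_old (enum mp_state_sketch s)
{
  return s != MP_FIRST && s != MP_GLOBAL;
}

/* New spelling: the same test, written as "not one of the allowed
   states".  */
bool rejected_new (enum mp_state_sketch s)
{
  return !(s == MP_FIRST || s == MP_GLOBAL);
}

/* Both spellings must agree for every state.  */
void check_same_verdict (void)
{
  for (int s = 0; s < MP_STATE_COUNT; s++)
    assert (rejected_old ((enum mp_state_sketch) s)
	    == rejected_new ((enum mp_state_sketch) s));
}
```

Calling check_same_verdict () passes silently, which matches the intent
of the patch: only the diagnostic text changes, not the set of rejected
module-declarations.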


Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Alexandre Oliva
On Jan  4, 2021, Bernd Edlinger  wrote:

> currently there is a problem when debugging a virtual thunk.  That is
> a decl with DECL_IGNORED_P.  Currently the line information displayed
> in gdb is completely bogus, thus the last line of whatever function
> is immediately before the PC of the thunk.

*nod*, I recall seeing such issues before, for compiler-generated
functions.  Ideally, there should be some .noloc directive in the
assembler that would signal "no line number info for this code
fragment", but I don't think we have anything like that.

I don't recall whether using compiler-generated line number programs
avoids the problem, but I do recall that -ffunction-sections works
around it, because IIRC switching to a new section has the same effect
of discontinuing the line number program that the desired .noloc
directive would.
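
The carry-over effect described above can be pictured with a toy model.
This is only a sketch under simplifying assumptions: the row layout, the
addresses, and the use of line 0 as a "no line info" marker are invented
for illustration and are not how DWARF or gdb represent line tables.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a simplified line table: code from ADDR onward belongs to
   LINE until the next row starts.  LINE 0 stands for "no line info"
   (an assumption of this sketch).  */
struct line_row
{
  unsigned long addr;
  int line;
};

/* Return the line of the last row whose address is <= PC, or 0 if
   there is none.  ROWS must be sorted by ascending address.  */
int lookup_line (const struct line_row *rows, size_t n, unsigned long pc)
{
  int line = 0;
  for (size_t i = 0; i < n && rows[i].addr <= pc; i++)
    line = rows[i].line;
  return line;
}

void check_line_lookup (void)
{
  /* A function at 0x100..0x11f, followed by a thunk at 0x120 that
     emits no rows of its own.  */
  const struct line_row carried_over[] = { { 0x100, 10 }, { 0x110, 12 } };
  /* The thunk PC inherits line 12 from the preceding function.  */
  assert (lookup_line (carried_over, 2, 0x124) == 12);

  /* Same code, but the table is cut off where the thunk starts, as a
     section switch or a hypothetical ".noloc" would do.  */
  const struct line_row terminated[]
    = { { 0x100, 10 }, { 0x110, 12 }, { 0x120, 0 } };
  assert (lookup_line (terminated, 3, 0x124) == 0);
}
```

In the first table the lookup for the thunk PC yields the last line of
the previous function, which is the bogus result reported in the PR.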

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


RE: [PR66791][ARM] Replace __builtin_vext* with __buitlin_shuffle in vext intrinsics

2021-01-05 Thread Kyrylo Tkachov via Gcc-patches
Hi Prathamesh,

> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 05 January 2021 11:42
> To: Kyrylo Tkachov 
> Cc: gcc Patches 
> Subject: Re: [PR66791][ARM] Replace __builtin_vext* with __buitlin_shuffle
> in vext intrinsics
> 
> On Mon, 4 Jan 2021 at 16:01, Kyrylo Tkachov 
> wrote:
> >
> > Hi Prathamesh
> >
> > > -Original Message-
> > > From: Prathamesh Kulkarni 
> > > Sent: 04 January 2021 10:27
> > > To: gcc Patches ; Kyrylo Tkachov
> > > 
> > > Subject: [PR66791][ARM] Replace __builtin_vext* with __buitlin_shuffle
> in
> > > vext intrinsics
> > >
> > > Hi Kyrill,
> > > The attached patch replaces __builtin_vextv8qi with __builtin_shuffle
> > > for vext_s8.
> > > Just wanted to confirm if this is in the correct direction ?
> > > If yes, I will send a follow up patch that converts for all vext 
> > > intrinsics.
> >
> > Yeah, that does look correct (aarch64 does it that way).
> > As before, please make sure to delete any now-unused builtins as well.
> Thanks, does the attached patch look OK ?

Ok if testing and bootstrap shows no problems.
Thanks,
Kyrill

> Testing in progress.
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Kyrill
> >
> > >
> > > Thanks,
> > > Prathamesh


Re: [PATCH] store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]

2021-01-05 Thread Richard Biener
On Thu, 17 Dec 2020, Jakub Jelinek wrote:

> On Wed, Dec 16, 2020 at 09:29:31AM +0100, Richard Biener wrote:
> > I think it probably makes sense to have some helper split out that
> > collects & classifies vector constructor components we can use from
> > both forwprop (where matching the V_C_E from integer could be done
> > as well IMHO) and bswap (when a permute is involved) and store-merging.
> 
> I've tried to add such helper, but handling over just analysis and letting
> each pass handle it differently seems complicated given the limitations of
> the bswap infrastructure.
> 
> So, this patch just hooks the optimization also into store-merging so that
> the original testcase from the PR can be fixed.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK, but ...

> Note, I have no idea why the bswap code needs TODO_update_ssa if it changed
> things, for the vuses it copies them from the surrounding vuses, which looks
> correct to me.  Perhaps because it uses force_gimple_operand_gsi* in a few
> spots in bswap_replace?  Confused...

.. that shouldn't cause updating SSA to be necessary.  Maybe it at some
point did not update virtual operands appropriately.

Richard.

> 2020-12-17  Jakub Jelinek  
> 
>   PR tree-optimization/96239
>   * gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New
>   function.
>   (get_status_for_store_merging): Don't return BB_INVALID for blocks
>   with potential bswap optimizable CONSTRUCTORs.
>   (pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap
>   if possible.
> 
>   * gcc.dg/tree-ssa/pr96239.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2020-12-16 13:07:51.729733816 +0100
> +++ gcc/gimple-ssa-store-merging.c2020-12-16 16:02:06.238868137 +0100
> @@ -1255,6 +1255,75 @@ bswap_replace (gimple_stmt_iterator gsi,
>return tgt;
>  }
>  
> +/* Try to optimize an assignment CUR_STMT with CONSTRUCTOR on the rhs
> +   using bswap optimizations.  CDI_DOMINATORS need to be
> +   computed on entry.  Return true if it has been optimized and
> +   TODO_update_ssa is needed.  */
> +
> +static bool
> +maybe_optimize_vector_constructor (gimple *cur_stmt)
> +{
> +  tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
> +  struct symbolic_number n;
> +  bool bswap;
> +
> +  gcc_assert (is_gimple_assign (cur_stmt)
> +   && gimple_assign_rhs_code (cur_stmt) == CONSTRUCTOR);
> +
> +  tree rhs = gimple_assign_rhs1 (cur_stmt);
> +  if (!VECTOR_TYPE_P (TREE_TYPE (rhs))
> +  || !INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (rhs)))
> +  || gimple_assign_lhs (cur_stmt) == NULL_TREE)
> +return false;
> +
> +  HOST_WIDE_INT sz = int_size_in_bytes (TREE_TYPE (rhs)) * BITS_PER_UNIT;
> +  switch (sz)
> +{
> +case 16:
> +  load_type = bswap_type = uint16_type_node;
> +  break;
> +case 32:
> +  if (builtin_decl_explicit_p (BUILT_IN_BSWAP32)
> +   && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)
> + {
> +   load_type = uint32_type_node;
> +   fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
> +   bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
> + }
> +  else
> + return false;
> +  break;
> +case 64:
> +  if (builtin_decl_explicit_p (BUILT_IN_BSWAP64)
> +   && (optab_handler (bswap_optab, DImode) != CODE_FOR_nothing
> +   || (word_mode == SImode
> +   && builtin_decl_explicit_p (BUILT_IN_BSWAP32)
> +   && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing)))
> + {
> +   load_type = uint64_type_node;
> +   fndecl = builtin_decl_explicit (BUILT_IN_BSWAP64);
> +   bswap_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
> + }
> +  else
> + return false;
> +  break;
> +default:
> +  return false;
> +}
> +
> +  gimple *ins_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);
> +  if (!ins_stmt || n.range != (unsigned HOST_WIDE_INT) sz)
> +return false;
> +
> +  if (bswap && !fndecl && n.range != 16)
> +return false;
> +
> +  memset (&nop_stats, 0, sizeof (nop_stats));
> +  memset (&bswap_stats, 0, sizeof (bswap_stats));
> +  return bswap_replace (gsi_for_stmt (cur_stmt), ins_stmt, fndecl,
> + bswap_type, load_type, &n, bswap) != NULL_TREE;
> +}
> +
>  /* Find manual byte swap implementations as well as load in a given
> endianness. Byte swaps are turned into a bswap builtin invokation
> while endian loads are converted to bswap builtin invokation or
> @@ -5126,6 +5195,7 @@ static enum basic_block_status
>  get_status_for_store_merging (basic_block bb)
>  {
>unsigned int num_statements = 0;
> +  unsigned int num_constructors = 0;
>gimple_stmt_iterator gsi;
>edge e;
>  
> @@ -5138,9 +5208,27 @@ get_status_for_store_merging (basic_bloc
>  
>if (store_valid_for_store_merging_p (stmt) && ++num_statements >= 2)
>   break;
> +
> +  if (is_gimple_assign (stmt)
> +   && 
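
For readers unfamiliar with the bswap pass mentioned in the patch, the
following sketch shows the kind of hand-written scalar byte swap it is
meant to recognize.  It is an illustration only and is not the testcase
from the PR; the vector CONSTRUCTOR form handled by the patch is not
reproduced here.  __builtin_bswap32 is the existing GCC builtin; the
other names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of X by masking and shifting each byte into
   its mirrored position.  */
uint32_t manual_bswap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
	 | ((x & 0x0000ff00u) << 8)
	 | ((x & 0x00ff0000u) >> 8)
	 | ((x & 0xff000000u) >> 24);
}

/* The manual version must agree with the builtin.  */
void check_bswap32 (void)
{
  assert (manual_bswap32 (0x11223344u) == 0x44332211u);
  assert (manual_bswap32 (0x11223344u) == __builtin_bswap32 (0x11223344u));
}
```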

Re: [PATCH] add g_nonstandard_bool attribute for GIMPLE FE use

2021-01-05 Thread Richard Biener
On Wed, 16 Dec 2020, Joseph Myers wrote:

> On Sun, 13 Dec 2020, Martin Sebor via Gcc-patches wrote:
> 
> > "nonstandard" isn't a very descriptive name.  The leading g_ prefix
> > also looks a little too terse (is that supposed to stand for GIMPLE?).
> > I would suggest choosing a better name, say, bool_precision.  Since
> 
> Indeed, g_ suggests the GLib API to me, so a name not involving g_ or 
> "nonstandard" seems better.
> 
> The principle of a GIMPLE-front-end-specific attribute for this sort of 
> thing seems reasonable to me.

OK, does "integral_precision" sound better?  (supposed to cover
INTEGRAL_TYPE_P types)  Or would "precision" be preferred (I used
g_ to not conflict with possible future C attributes).  Note that
GCC's "nonstandard boolean types" are signed as opposed to
bool which is unsigned so

typedef _Bool bool1 __attribute__((precision(1)));

would maybe result in a surprising result.  One alternative
would be to make the attribute have the signedness specified as well
(C doesn't accept 'unsigned _Bool' or 'signed _Bool') or
simply name the attribute "signed_bool_precision".  I guess the bool case
is really special compared to the desire to eventually allow
declaring of a 3 bit precision signed/unsigned integer type.

Allowing 'signed _Bool' with -fgimple might be another option
of course.
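
The signedness concern can be illustrated in plain C with 1-bit
bit-fields, without the proposed attribute (which is not used here).
This is only a sketch; the struct and function names are invented for
the example.

```c
#include <assert.h>

/* Two 1-bit fields that differ only in signedness.  */
struct one_bit
{
  signed int s : 1;    /* representable values: 0 and -1 */
  unsigned int u : 1;  /* representable values: 0 and 1 */
};

/* "True" for a signed 1-bit value is -1, the only nonzero value it
   can hold.  */
int signed_true (void)
{
  struct one_bit b = { 0, 0 };
  b.s = -1;
  return b.s;
}

/* "True" for an unsigned 1-bit value is 1.  */
int unsigned_true (void)
{
  struct one_bit b = { 0, 0 };
  b.u = 1;
  return b.u;
}

void check_one_bit (void)
{
  assert (signed_true () == -1);
  assert (unsigned_true () == 1);
}
```

A signed 1-bit boolean therefore reads back as -1 rather than 1 when
widened, which is the surprise a plain precision(1) on _Bool could
cause.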

Thanks,
Richard.


[patch, committed, coarray_native] Fix CO_REDUCE with RESULT_IMAGE

2021-01-05 Thread Thomas Koenig via Gcc-patches

Hi,

I just committed the attached patch to the branch.

I had also merged the trunk to branch previously,
so it should be more or less up to date by now.

Best regards

Thomas

Fix CO_REDUCE with RESULT_IMAGE.

gcc/fortran/ChangeLog:

* trans-array.c (gfc_conv_ss_descriptor): Use correct ref.
* trans-intrinsic.c (trans_argument): Use 
gfc_conv_expr_reference.

* trans-decl.c (gfc_build_builtin_function_decls):
Correct spec for array.

libgfortran/ChangeLog:

* caf_shared/collective_subroutine.c (collsub_reduce_array):
Fix off by one error for result.

gcc/testsuite/ChangeLog:

* gfortran.dg/caf-shared/co_reduce_1.f90: New test.


diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 199bcaed9b1..85ef1537fcd 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -3120,7 +3120,6 @@ gfc_conv_ss_descriptor (stmtblock_t * block, gfc_ss * ss, 
int base)
   gfc_ss_info *ss_info;
   gfc_array_info *info;
   tree tmp;
-  gfc_ref *ref;
 
   ss_info = ss->info;
   info = &ss_info->data.array;
@@ -3172,7 +3171,7 @@ gfc_conv_ss_descriptor (stmtblock_t * block, gfc_ss * ss, 
int base)
 
   if (flag_coarray == GFC_FCOARRAY_SHARED)
{
- gfc_ref *co_ref = cas_impl_this_image_ref (ref);
+ gfc_ref *co_ref = cas_impl_this_image_ref (ss_info->expr->ref);
  if (co_ref)
tmp = cas_add_this_image_offset (tmp, se.expr, &co_ref->u.ar, true);
}
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 3ecd63d6169..f86f39159c5 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4187,7 +4187,7 @@ gfc_build_builtin_function_decls (void)
 
   gfor_fndecl_cas_reduce_array = 
gfc_build_library_function_decl_with_spec (
- get_identifier (PREFIX("cas_collsub_reduce_array")), ". W r r w w . ",
+ get_identifier (PREFIX("cas_collsub_reduce_array")), ". w r r w w . ",
  void_type_node, 6, pvoid_type_node /* desc.  */,
  build_pointer_type (build_function_type_list (void_type_node,
  pvoid_type_node, pvoid_type_node, NULL_TREE)) /* assign function. 
 */,
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 13c32957d69..92cdb3e1bdb 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -11217,7 +11217,7 @@ trans_argument (gfc_actual_arglist **curr_al, 
stmtblock_t *blk,
   if (expr->rank > 0)
 gfc_conv_expr_descriptor (argse, expr);
   else
-gfc_conv_expr (argse, expr);
+gfc_conv_expr_reference (argse, expr);
 
   gfc_add_block_to_block (blk, &argse->pre);
   gfc_add_block_to_block (postblk, &argse->post);
diff --git a/gcc/testsuite/gfortran.dg/caf-shared/co_reduce_1.f90 
b/gcc/testsuite/gfortran.dg/caf-shared/co_reduce_1.f90
new file mode 100644
index 000..ab8b2877295
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/caf-shared/co_reduce_1.f90
@@ -0,0 +1,24 @@
+! { dg-do run }
+! { dg-set-target-env-var GFORTRAN_NUM_IMAGES "4" }
+! This test only works with four images, it will fail otherwise.
+program main
+  implicit none
+  integer, parameter :: n = 3
+  integer, dimension(n) :: a
+  a = [1,2,3] + this_image()
+  call co_reduce (a, mysum, result_image = 2)
+  if (this_image () == 2) then
+ if (any(a /= [14,18,22])) then
+print *,a
+print *,a /= [14,18,22]
+print *,any(a /= [14,18,22])
+stop 1
+ end if
+  end if
+contains
+  PURE FUNCTION mysum (lhs,rhs)
+integer, intent(in) :: lhs, rhs
+integer :: mysum
+mysum = lhs + rhs
+  END FUNCTION mysum
+end program main
diff --git a/libgfortran/caf_shared/collective_subroutine.c 
b/libgfortran/caf_shared/collective_subroutine.c
index 875eb946e60..a39f0ae390f 100644
--- a/libgfortran/caf_shared/collective_subroutine.c
+++ b/libgfortran/caf_shared/collective_subroutine.c
@@ -121,7 +121,7 @@ collsub_reduce_array (collsub_iface *ci, gfc_array_char 
*desc,
   for (; (local->total_num_images >> cbit) != 0; cbit++)
 collsub_sync (ci);
 
-  if (!result_image || *result_image == this_image.image_num)
+  if (!result_image || (*result_image - 1 ) == this_image.image_num)
 {
   if (packed)
memcpy (GFC_DESCRIPTOR_DATA (desc), buffer, this_image_size_bytes);
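
The off-by-one fix in collsub_reduce_array can be sketched in isolation.
The sketch assumes, as the patch implies, that RESULT_IMAGE is the
1-based Fortran image number while the library's internal image_num is
0-based; the function name is hypothetical and not part of libgfortran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decide whether the image with 0-based index IMAGE_NUM should receive
   the reduction result.  RESULT_IMAGE points to the 1-based Fortran
   image number, or is NULL when the argument is absent, in which case
   every image receives the result.  */
bool receives_result (const int *result_image, int image_num)
{
  return result_image == NULL || (*result_image - 1) == image_num;
}

void check_receives_result (void)
{
  int two = 2;
  /* result_image = 2 names the image whose internal index is 1.  */
  assert (receives_result (&two, 1));
  assert (!receives_result (&two, 2));
  /* Without RESULT_IMAGE all images get the result.  */
  assert (receives_result (NULL, 0));
  assert (receives_result (NULL, 3));
}
```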


Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)

2021-01-05 Thread Richard Biener via Gcc-patches
On Mon, Jan 4, 2021 at 9:53 PM Martin Sebor  wrote:
>
> On 1/4/21 12:23 PM, Jeff Law wrote:
> >
> >
> > On 1/4/21 12:19 PM, Jakub Jelinek wrote:
> >> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
>  Doing the STRING_CST is certainly less fragile since the SSA names
>  created at gimplification time could even be ggc_freed when no longer
>  used in the IL.
> >>> Obviously we can't use SSA_NAMEs as they're specific to each function as
> >>> they get compiled.  But what's not as clear to me is why we can't use a
> >>> SAVE_EXPR of the original expression that indicates the size of the
> >>> parameter.
> >> The gimplifier is destructive, so if the expressions are partly (e.g. in
> >> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
> >> And if they aren't shared and there are side-effects, if we tried to
> >> gimplify them again we'd get the side-effects duplicated.
> >> So it all depends on what the code wants to handle, if e.g. just values of
> >> parameters with simple arithmetics on those and punt on everything else,
> >> then it is doable, but generally it is not.
>
> I explained what the code handles and when in the pipeline in
> the discussion of the previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>
> > I would expect the expressions to be values of parameters (or objects in
> > static storage) and simple arithmetic on them.  If there's other cases,
> > punting seems appropriate.
> >
> > Martin -- are there nontrivial expressions we need to be worried about here?
>
> At the moment the middle end warnings only consider parameters, like
> the N in
>
>void f (int N, int[N]);
>
>void g (void)
>{
>  int a[3];
>  f (sizeof a, a);   // warning

I wonder how this can work reliably without heavy-weight
"parsing" of the attribute?  That is, how do you relate
the passed 12 constant to the N in int[N]?

>}
>
> The front end redeclaration warnings consider all expressions,
> including
>
>int f (void);
>
>void g (int[f () + 1]);
>void g (int[f () + 2]);   // warning

For redeclaration warning the attribute isn't needed since you
have both decls and can compare sizes directly?
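
For reference, a small runnable sketch of the parameter form under
discussion: the array bound is spelled in terms of an earlier parameter,
and the caller is expected to pass the element count.  The quoted
example passes sizeof a (a byte size) instead, which is what the warning
is about.  The function names here are invented for illustration.

```c
#include <assert.h>

/* The bound of A is written in terms of the earlier parameter N.  */
int sum_vla (int n, int a[n]);

int sum_vla (int n, int a[n])
{
  int total = 0;
  for (int i = 0; i < n; i++)
    total += a[i];
  return total;
}

void check_sum_vla (void)
{
  int a[3] = { 1, 2, 3 };
  /* Pass the number of elements, not the size in bytes.  */
  assert (sum_vla ((int) (sizeof a / sizeof a[0]), a) == 6);
}
```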

> The patch turns these complex bounds into strings that the front
> end compares instead.  After the front end is done the strings
> don't serve any purpose (and I don't think ever will) and could
> be removed.  I looked for a way to do it but couldn't find one
> other than the free_lang_data pass in tree.c that Richard had
> initially said wasn't the right place.  Sounds like he's
> reconsidered but at this point, given that VLA parameters are
> used only infrequently, and VLAs with these nontrivial bounds
> are exceedingly rare, going to the trouble of removing them
> doesn't seem worth the effort.
>
> Martin
>
> >
> >
> > Jeff
> >
>


[PATCH,committed] arc: fix accumulator first register.

2021-01-05 Thread Claudiu Zissulescu via Gcc-patches
gcc/
2021-01-05  Claudiu Zissulescu  

* config/arc/arc.md (maddsidi4_split): Use ACC_REG_FIRST.
(umaddsidi4_split): Likewise.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.md | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 3e544430167..7a52551eef5 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -6177,12 +6177,12 @@ (define_insn_and_split "maddsidi4_split"
rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST);
emit_move_insn (acc_reg, operands[3]);
if (TARGET_PLUS_MACD && even_register_operand (operands[0], DImode)
-   && REGNO (operands[0]) != ACCL_REGNO)
+   && REGNO (operands[0]) != ACC_REG_FIRST)
   emit_insn (gen_macd (operands[0], operands[1], operands[2]));
else
  {
   emit_insn (gen_mac (operands[1], operands[2]));
-  if (REGNO (operands[0]) != ACCL_REGNO)
+  if (REGNO (operands[0]) != ACC_REG_FIRST)
 emit_move_insn (operands[0], acc_reg);
  }
DONE;
@@ -6279,12 +6279,12 @@ (define_insn_and_split "umaddsidi4_split"
rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST);
emit_move_insn (acc_reg, operands[3]);
if (TARGET_PLUS_MACD && even_register_operand (operands[0], DImode)
-   && REGNO (operands[0]) != ACCL_REGNO)
+   && REGNO (operands[0]) != ACC_REG_FIRST)
   emit_insn (gen_macdu (operands[0], operands[1], operands[2]));
else
  {
   emit_insn (gen_macu (operands[1], operands[2]));
-  if (REGNO (operands[0]) != ACCL_REGNO)
+  if (REGNO (operands[0]) != ACC_REG_FIRST)
 emit_move_insn (operands[0], acc_reg);
  }
DONE;
-- 
2.26.2



Re: [PATCH] Add line debug info for virtual thunks (PR ipa/97937)

2021-01-05 Thread Richard Biener
On Mon, 4 Jan 2021, Bernd Edlinger wrote:

> Hi,
> 
> 
> currently there is a problem when debugging a virtual thunk.  That is
> a decl with DECL_IGNORED_P.  Currently the line information displayed
> in gdb is completely bogus, thus the last line of whatever function
> is immediately before the PC of the thunk.

But isn't this a consumer issue then?  If there is no line info for
a PC range then gdb shouldn't display any.

> This patch improves the debug experience at least a bit by emitting
> the line number information of the place where the thunk has been defined.
> I do not dare to touch anything but dwarf2 debug info, therefore
> the patch is a bit awkward.

There's more DECL_IGNORED_P decls (like functions emitted by
profile instrumentation), which do not have any source correspondence.

So IMHO the fix should be to make a more nuanced DECL_IGNORED_P
for thunks if it is really necessary to emit debug info for them.
For example by making them DECL_ARTIFICIAL only (not sure why
we end up with them DECL_IGNORED_P - there might be a reason).

Richard.

> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-05 Thread Richard Biener via Gcc-patches
On Mon, Jan 4, 2021 at 9:50 PM Eugene Rozenfeld
 wrote:
>
> Ping.
>
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Tuesday, December 22, 2020 3:01 PM
> To: Richard Biener ; gcc-patches@gcc.gnu.org
> Subject: RE: Optimize combination of comparisons to dec+compare
>
> Re-sending my question and re-attaching the patch.
>
> Richard, can you please clarify your feedback?

Hmm, OK.

The patch is OK.

Thanks,
Richard.


> Thanks,
>
> Eugene
>
> -Original Message-
> From: Gcc-patches  On Behalf Of Eugene 
> Rozenfeld via Gcc-patches
> Sent: Tuesday, December 15, 2020 2:06 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: Optimize combination of comparisons to dec+compare
>
> Richard,
>
> > Do we already handle x < y || x <= CST to x <= y - CST?
>
> That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
> Can you please clarify?
>
> Thanks,
>
> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Thursday, December 10, 2020 12:21 AM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Optimize combination of comparisons to dec+compare
>
> On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
> >
> > > This patch adds a pattern for optimizing x < y || y == XXX_MIN to x <=
> > > y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.
>
> Do we already handle x < y || x <= CST to x <= y - CST?
> That is, the XXX_MIN case is just a special-case of generic anti-range 
> testing?  For anti-range testing with signed types we pun to unsigned when 
> possible.
>
> > This fixes pr96674.
> >
> > Tested on x86_64-pc-linux-gnu.
> >
> > For this function
> >
> > bool f(unsigned a, unsigned b)
> > {
> > return (b == 0) | (a < b);
> > }
> >
> > the code without the patch is
> >
> > > test   esi,esi
> > > sete   al
> > > cmp    esi,edi
> > > seta   dl
> > > or     eax,edx
> > > ret
> >
> > the code with the patch is
> >
> > > sub    esi,0x1
> > > cmp    esi,edi
> > > setae  al
> > > ret
> >
> > Eugene
> >
> > gcc/
> > PR tree-optimization/96674
> > > * match.pd: New pattern x < y || y == XXX_MIN --> x <= y - 1
> >
> > gcc/testsuite
> > * gcc.dg/pr96674.c: New test.
> >
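
The equivalence behind the transformation can be spot-checked for
unsigned operands with a few boundary values.  This is a sketch, not the
testcase added by the patch; the sample values assume that unsigned is
at least 32 bits wide.

```c
#include <assert.h>
#include <stdbool.h>

/* The original form from the example function.  */
bool f_original (unsigned a, unsigned b)
{
  return (b == 0) | (a < b);
}

/* The rewritten form: for b == 0, b - 1 wraps to UINT_MAX, so the
   comparison is true for every a.  */
bool f_rewritten (unsigned a, unsigned b)
{
  return a <= b - 1;
}

/* Compare both forms over all pairs of boundary values.  */
void check_dec_compare (void)
{
  static const unsigned samples[]
    = { 0u, 1u, 2u, 3u, 0x7fffffffu, 0x80000000u, ~0u - 1u, ~0u };
  const int n = (int) (sizeof samples / sizeof samples[0]);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (f_original (samples[i], samples[j])
	      == f_rewritten (samples[i], samples[j]));
}
```

The rewrite relies on wrapping arithmetic, which is why the pattern is
restricted to types with TYPE_OVERFLOW_WRAPS.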


Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2021-01-05 Thread Julian Brown
Hi Jakub,

Just to check, does my reply below address your concerns --
particularly with regards to the current usage of CUDA streams
serializing kernel executions from different host threads? Given that
situation, and the observed speed improvement with OpenMP offloading to
NVPTX with the patch, I'm not sure how much sense it makes to do
anything more sophisticated than this -- especially without a test case
that demonstrates a performance regression (or an exacerbated
out-of-memory condition) with the patch.

Thanks,

Julian

On Tue, 15 Dec 2020 23:16:48 +
Julian Brown  wrote:

> On Tue, 15 Dec 2020 18:00:36 +0100
> Jakub Jelinek  wrote:
> 
> > On Tue, Dec 15, 2020 at 04:49:38PM +, Julian Brown wrote:  
> > > > Do you need to hold the omp_stacks.lock across the entire
> > > > offloading? Doesn't that serialize all offloading kernels to the
> > > > same device? I mean, can't the lock be taken just shortly at the
> > > > start to either acquire the cached stacks or allocate a fresh
> > > > stack, and then at the end to put the stack back into the
> > > > cache?
> > > 
> > > I think you're suggesting something like what Alexander mentioned
> > > -- a pool of cached stacks blocks in case the single, locked block
> > > is contested. Obviously at present kernel launches are serialised
> > > on the target anyway, so it's a question of whether having the
> > > device wait for the host to unlock the stacks block (i.e. a
> > > context switch, FSVO context switch), or allocating a new stacks
> > > block, is quicker. I think the numbers posted in the parent email
> > > show that memory allocation is so slow that just waiting for the
> > > lock wins. I'm wary of adding unnecessary complication,
> > > especially if it'll only be exercised in already hard-to-debug
> > > cases (i.e. lots of threads)!
> > 
> > I'm not suggesting to have multiple stacks, on the contrary.  I've
> > suggested to do the caching only if at most one host thread is
> > offloading to the device.
> > 
> > If one uses
> > #pragma omp parallel num_threads(3)
> > {
> >   #pragma omp target
> >   ...
> > }
> > then I don't see what would previously prevent the concurrent
> > offloading, yes, we take the device lock during gomp_map_vars and
> > again during gomp_unmap_vars, but don't hold it across the
> > offloading in between.  
> 
> I still don't think I quite understand what you're getting at.
> 
> We only implement synchronous launches for OpenMP on NVPTX at present,
> and those all use the default CUDA runtime driver stream. Only one
> kernel executes on the hardware at once, even if launched from
> different host threads. The serialisation isn't due to the device lock
> being held, but by the queueing semantics of the underlying API.
> 
> > > Does target-side memory allocation call back into the plugin's
> > > GOMP_OFFLOAD_alloc? I'm not sure how that works. If not,
> > > target-side memory allocation shouldn't be affected, I don't
> > > think?
> > 
> > > Again, I'm not suggesting that it should, but what I'm saying is
> > > that if a target region ends while some other host tasks are running
> > > target regions on the same device concurrently, or if there are
> > > async targets in flight, we shouldn't try to cache the stack, but
> > > free it right away, because the other target regions might need to
> > > malloc larger amounts of memory and fail because of the caching.
> 
> I'm assuming you're not suggesting fundamentally changing APIs or
> anything to determine if we're launching target regions from multiple
> threads at once, but instead that we try to detect the condition
> dynamically in the plugin?
> 
> So, would kernel launch look something like this? (Excuse
> pseudo-code-isms!)
> 
> void GOMP_OFFLOAD_run (...)
> {
>   bool used_cache;
> 
>   pthread_mutex_lock (&ptx_dev->omp_stacks.lock);
>   if (ptx_dev->omp_stacks.usage_count > 0)
>   {
>     cuCtxSynchronize ();
>     nvptx_stacks_free (ptx_dev);
>     ...allocate fresh stack, no caching...
>     used_cache = false;
>   }
>   else
>   {
>     /* Allocate or re-use cached stacks, and then... */
>     ptx_dev->omp_stacks.usage_count++;
>     used_cache = true;
>   }
>   pthread_mutex_unlock (&ptx_dev->omp_stacks.lock);
> 
>   /* Launch kernel */
> 
>   if (used_cache) {
>     cuStreamAddCallback (
>       pthread_mutex_lock (&ptx_dev->omp_stacks.lock);
>       ptx_dev->omp_stacks.usage_count--;
>       pthread_mutex_unlock (&ptx_dev->omp_stacks.lock);
>     );
>   } else {
>     pthread_mutex_lock (&ptx_dev->omp_stacks.lock);
>     /* Free uncached stack */
>     pthread_mutex_unlock (&ptx_dev->omp_stacks.lock);
>   }
> }
> 
> This seems like it'd be rather fragile to me, and would offer some
> benefit perhaps only if a previous cached stacks block was much larger
> than the one required for some given later launch. It wouldn't allow
> any additional parallelism on the target I don't think.
> 
> Is that sort-of what you meant?
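
For illustration only, the "cache only when uncontended" idea discussed above can be modelled on the host in plain C. This is a sketch under stated assumptions, not libgomp or plugin code: struct stacks_cache, stacks_acquire and stacks_release are hypothetical names, and malloc/free stand in for device memory allocation. The lock is held only while the cache state is inspected or updated, not across the launch.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical host-side model of a cached stacks block.  */
struct stacks_cache {
    pthread_mutex_t lock;
    void *block;      /* cached block, or NULL */
    size_t size;      /* size of the cached block */
    int usage_count;  /* launches currently using the cached block */
};

/* Hand out the cached block if no other launch is using it; otherwise
   allocate a private, uncached block outside the lock.  */
void *stacks_acquire(struct stacks_cache *c, size_t size, bool *used_cache)
{
    void *p = NULL;

    pthread_mutex_lock(&c->lock);
    if (c->usage_count > 0) {
        *used_cache = false;
    } else {
        if (c->block == NULL || c->size < size) {
            /* Grow (or create) the cached block.  */
            free(c->block);
            c->block = malloc(size);
            c->size = c->block ? size : 0;
        }
        p = c->block;
        if (p != NULL)
            c->usage_count++;
        *used_cache = true;
    }
    pthread_mutex_unlock(&c->lock);

    if (!*used_cache)
        p = malloc(size);  /* contended: fresh block, no caching */
    return p;
}

/* Return a block: drop the usage count for the cached block, or free an
   uncached block right away.  */
void stacks_release(struct stacks_cache *c, void *p, bool used_cache)
{
    if (p == NULL)
        return;
    if (used_cache) {
        pthread_mutex_lock(&c->lock);
        c->usage_count--;
        pthread_mutex_unlock(&c->lock);
    } else {
        free(p);
    }
}
```

Whether this buys anything on real hardware depends on the point made above: if launches are serialised by the stream anyway, the uncached path mostly adds allocation cost.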
> 
> Oh, or perhaps something more like checking cuStreamQuery at 

Re: [08/23] Add an alternative splay tree implementation

2021-01-05 Thread Richard Biener via Gcc-patches
On Mon, Jan 4, 2021 at 4:43 PM Richard Sandiford via Gcc-patches
 wrote:
>
> Andreas Schwab  writes:
> > On Jan 04 2021, Richard Sandiford wrote:
> >
> >> Andreas Schwab  writes:
> >>> That doesn't build with gcc 4.8:
> >>
> >> Which subversion are you using?
> >
> > This is 4.8.1.
>
> Hmm, OK.  I guess that raises the question whether “supporting GCC 4.8”
> means supporting every patchlevel, or just the latest.

We document

@item ISO C++11 compiler
Necessary to bootstrap GCC.
...

To build all languages in a cross-compiler or other configuration where
3-stage bootstrap is not performed, you need to start with an existing
GCC binary (version 4.8 or later) because source code for language
frontends other than C might use GCC extensions.

Note that to bootstrap GCC with versions of GCC earlier than 4.8, you
may need to use @option{--disable-stage1-checking}, though
bootstrapping the compiler with such earlier compilers is strongly
discouraged.

while the second paragraph suggests GCC 4.8 or later works
(which IMHO includes GCC 4.8.1), the general requirement
lists a C++11 compiler which apparently GCC 4.8.1 isn't ;)

So for simplicity I'd suggest to be more precise and say
4.8.2 or later (if 4.8.2 works)

Richard.

>
> Richard


Re: [PATCH] match.pd: Improve (A / (1 << B)) -> (A >> B) optimization [PR96930]

2021-01-05 Thread Richard Biener
On Tue, 5 Jan 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following patch improves the A / (1 << B) -> A >> B simplification,
> as seen in the testcase, if there is unnecessary widening for the division,
> we just optimize it into a shift on the widened type, but if the lshift
> is widened too, there is no reason to do that, we can just shift it in the
> original type and convert after.  The tree_nonzero_bits & wi::mask check
> already ensures it is fine even for signed values.
> 
> I've split the vr-values optimization into a separate patch as it causes
> a small regression on two testcases, but this patch fixes what has been
> reported in the PR alone.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-01-05  Jakub Jelinek  
> 
>   PR tree-optimization/96930
>   * match.pd ((A / (1 << B)) -> (A >> B)): If A is extended
>   from narrower value which has the same type as 1 << B, perform
>   the right shift on the narrower value followed by extension.
> 
>   * g++.dg/tree-ssa/pr96930.C: New test.
> 
> --- gcc/match.pd.jj   2021-01-04 10:37:06.0 +0100
> +++ gcc/match.pd  2021-01-05 10:27:56.653791400 +0100
> @@ -321,7 +321,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (unsigned long long) (1 << 31) is -2147483648ULL, not 2147483648ULL,
> so it is valid only if A >> 31 is zero.  */
>  (simplify
> - (trunc_div @0 (convert? (lshift integer_onep@1 @2)))
> + (trunc_div (convert?@0 @3) (convert2? (lshift integer_onep@1 @2)))
>   (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0))
>&& (!VECTOR_TYPE_P (type)
> || target_supports_op_p (type, RSHIFT_EXPR, optab_vector)
> @@ -336,7 +336,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> & wi::mask (element_precision (TREE_TYPE (@1)) - 1,
> true,
> element_precision (type))) == 0)))))
> -  (rshift @0 @2)))
> +   (if (!VECTOR_TYPE_P (type)
> + && useless_type_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))
> + && element_precision (TREE_TYPE (@3)) < element_precision (type))
> +(convert (rshift @3 @2))
> +(rshift @0 @2))))
>  
>  /* Preserve explicit divisions by 0: the C++ front-end wants to detect
> undefined behavior in constexpr evaluation, and assuming that the division
> --- gcc/testsuite/g++.dg/tree-ssa/pr96930.C.jj2021-01-04 
> 14:18:15.513100038 +0100
> +++ gcc/testsuite/g++.dg/tree-ssa/pr96930.C   2021-01-04 14:25:35.512148709 
> +0100
> @@ -0,0 +1,10 @@
> +// PR tree-optimization/96930
> +// { dg-do compile }
> +// { dg-options "-O2 -fdump-tree-optimized" }
> +// { dg-final { scan-tree-dump " = a_\[0-9]\\\(D\\\) >> b_\[0-9]\\\(D\\\);" 
> "optimized" } }
> +
> +unsigned
> +foo (unsigned a, unsigned b)
> +{
> +  return a / (unsigned long long) (1U << b);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] reassoc: Fix reassociation on 32-bit hosts with > 32767 bbs [PR98514]

2021-01-05 Thread Richard Biener
On Tue, 5 Jan 2021, Jakub Jelinek wrote:

> Hi!
> 
> Apparently reassoc ICEs on large functions (more than 32767 basic blocks
> with something to reassociate in those).
> The problem is that the pass uses the long type to store the ranks, and
> the bb ranks are
> (number of SSA_NAMEs with default defs + 2 + bb->index) << 16,
> so with many basic blocks the ranks overflow and we then trip the
> assertions that the rank is not negative.
> 
> The following patch just uses HOST_WIDE_INT instead of long in the pass.
> Yes, it means slightly higher memory consumption (one array indexed by
> bb->index is twice as large, and one hash_map from trees to the ranks
> will grow by 50%), but I think that is better than punting on
> reassociation in large functions on 32-bit hosts and making the result
> inconsistent, e.g. when cross-compiling.  Given vec.h uses unsigned for
> vec element counts, we don't really support more than 4G of SSA_NAMEs
> or more than 2G of basic blocks in a function, so even with the << 16
> we can't really overflow the HOST_WIDE_INT rank counters.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK (can you use int64_t instead?)

Thanks,
Richard.

> 2021-01-05  Jakub Jelinek  
> 
>   PR tree-optimization/98514
>   * tree-ssa-reassoc.c (bb_rank): Change type from long * to
>   HOST_WIDE_INT *.
>   (operand_rank): Change type from hash_map<tree, long> to
>   hash_map<tree, HOST_WIDE_INT>.
>   (phi_rank): Change return type from long to HOST_WIDE_INT.
>   (loop_carried_phi): Change block_rank variable type from long to
>   HOST_WIDE_INT.
>   (propagate_rank): Change return type, rank parameter type and
>   op_rank variable type from long to HOST_WIDE_INT.
>   (find_operand_rank): Change return type from long to HOST_WIDE_INT
>   and change slot variable type from long * to HOST_WIDE_INT *.
>   (insert_operand_rank): Change rank parameter type from long to
>   HOST_WIDE_INT.
>   (get_rank): Change return type and rank variable type from long to
>   HOST_WIDE_INT.  Use HOST_WIDE_INT_PRINT_DEC instead of %ld to print
>   the rank.
>   (init_reassoc): Change rank variable type from long to HOST_WIDE_INT
>   and adjust correspondingly bb_rank and operand_rank initialization.
> 
> --- gcc/tree-ssa-reassoc.c.jj 2021-01-04 10:25:37.153252851 +0100
> +++ gcc/tree-ssa-reassoc.c2021-01-04 17:01:15.22328 +0100
> @@ -200,10 +200,10 @@ static unsigned int next_operand_entry_i
>  /* Starting rank number for a given basic block, so that we can rank
> operations using unmovable instructions in that BB based on the bb
> depth.  */
> -static long *bb_rank;
> +static HOST_WIDE_INT *bb_rank;
>  
>  /* Operand->rank hashtable.  */
> -static hash_map<tree, long> *operand_rank;
> +static hash_map<tree, HOST_WIDE_INT> *operand_rank;
>  
>  /* Vector of SSA_NAMEs on which after reassociate_bb is done with
> all basic blocks the CFG should be adjusted - basic blocks
> @@ -212,7 +212,7 @@ static hash_map<tree, long> *operand_ran
>  static vec<tree> reassoc_branch_fixups;
>  
>  /* Forward decls.  */
> -static long get_rank (tree);
> +static HOST_WIDE_INT get_rank (tree);
>  static bool reassoc_stmt_dominates_stmt_p (gimple *, gimple *);
>  
>  /* Wrapper around gsi_remove, which adjusts gimple_uid of debug stmts
> @@ -257,7 +257,7 @@ reassoc_remove_stmt (gimple_stmt_iterato
> calculated into an accumulator variable to be independent for each
> iteration of the loop.  If STMT is some other phi, the rank is the
> block rank of its containing block.  */
> -static long
> +static HOST_WIDE_INT
>  phi_rank (gimple *stmt)
>  {
>basic_block bb = gimple_bb (stmt);
> @@ -311,7 +311,7 @@ static bool
>  loop_carried_phi (tree exp)
>  {
>gimple *phi_stmt;
> -  long block_rank;
> +  HOST_WIDE_INT block_rank;
>  
>if (TREE_CODE (exp) != SSA_NAME
>|| SSA_NAME_IS_DEFAULT_DEF (exp))
> @@ -337,10 +337,10 @@ loop_carried_phi (tree exp)
> from expression OP.  For most operands, this is just the rank of OP.
> For loop-carried phis, the value is zero to avoid undoing the bias
> in favor of the phi.  */
> -static long
> -propagate_rank (long rank, tree op)
> +static HOST_WIDE_INT
> +propagate_rank (HOST_WIDE_INT rank, tree op)
>  {
> -  long op_rank;
> +  HOST_WIDE_INT op_rank;
>  
>if (loop_carried_phi (op))
>  return rank;
> @@ -352,17 +352,17 @@ propagate_rank (long rank, tree op)
>  
>  /* Look up the operand rank structure for expression E.  */
>  
> -static inline long
> +static inline HOST_WIDE_INT
>  find_operand_rank (tree e)
>  {
> -  long *slot = operand_rank->get (e);
> +  HOST_WIDE_INT *slot = operand_rank->get (e);
>return slot ? *slot : -1;
>  }
>  
>  /* Insert {E,RANK} into the operand rank hashtable.  */
>  
>  static inline void
> -insert_operand_rank (tree e, long rank)
> +insert_operand_rank (tree e, HOST_WIDE_INT rank)
>  {
>gcc_assert (rank > 0);
>gcc_assert (!operand_rank->put (e, rank));
> @@ -370,7 +370,7 @@ insert_operand_rank (tree e, long 
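
A small C sketch of the overflow described in the patch mail, using fixed-width types to model a host where long is 32 bits. The function names are illustrative only, and the formula follows the description in the mail rather than the exact code in tree-ssa-reassoc.c.

```c
#include <stdint.h>

/* Block rank as described above:
   (default-def SSA names + 2 + bb index) << 16,
   stored in a 32-bit signed value.  The shift is done in unsigned
   arithmetic so the wraparound is well defined; the conversion back to
   int32_t wraps on GCC (it is implementation-defined before C23).  */
int32_t bb_rank_32(uint32_t default_defs, uint32_t bb_index)
{
    return (int32_t) ((default_defs + 2u + bb_index) << 16);
}

/* The same computation carried out in 64 bits, as with HOST_WIDE_INT
   or int64_t.  */
int64_t bb_rank_64(uint32_t default_defs, uint32_t bb_index)
{
    return (int64_t) ((uint64_t) default_defs + 2u + bb_index) << 16;
}
```

With no default defs, block index 32766 already gives 32768 << 16, which is 2^31: negative in the 32-bit model and an ordinary positive value in the 64-bit one.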

Re: [PR66791][ARM] Replace __builtin_vext* with __buitlin_shuffle in vext intrinsics

2021-01-05 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 4 Jan 2021 at 16:01, Kyrylo Tkachov  wrote:
>
> Hi Prathamesh
>
> > -Original Message-
> > From: Prathamesh Kulkarni 
> > Sent: 04 January 2021 10:27
> > To: gcc Patches ; Kyrylo Tkachov
> > 
> > Subject: [PR66791][ARM] Replace __builtin_vext* with __buitlin_shuffle in
> > vext intrinsics
> >
> > Hi Kyrill,
> > The attached patch replaces __builtin_vextv8qi with __builtin_shuffle
> > for vext_s8.
> > Just wanted to confirm if this is in the correct direction ?
> > If yes, I will send a follow up patch that converts for all vext intrinsics.
>
> Yeah, that does look correct (aarch64 does it that way).
> As before, please make sure to delete any now-unused builtins as well.
Thanks, does the attached patch look OK ?
Testing in progress.

Thanks,
Prathamesh
>
> Thanks,
> Kyrill
>
> >
> > Thanks,
> > Prathamesh


vext-2.diff
Description: Binary data
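
As a hedged illustration of what the shuffle has to compute, here is a scalar C model of vext_s8. It is not the arm_neon.h implementation; vext_s8_model is a hypothetical name, and endianness-specific lane numbering is ignored. The index n + i computed for each lane is the kind of selector a two-input shuffle mask would hold.

```c
#include <stdint.h>

/* Scalar model of vext_s8 (a, b, n): take 8 consecutive lanes starting
   at lane n of the 16-lane concatenation of a followed by b.
   n must be in the range 0..7.  */
void vext_s8_model(const int8_t a[8], const int8_t b[8], unsigned n,
                   int8_t out[8])
{
    for (unsigned i = 0; i < 8; i++) {
        unsigned sel = n + i;  /* shuffle-mask index for lane i */
        out[i] = sel < 8 ? a[sel] : b[sel - 8];
    }
}
```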


Re: [PATCH] phiopt: Optimize x < 0 ? ~y : y to (x >> 31) ^ y [PR96928]

2021-01-05 Thread Richard Biener
On Tue, 5 Jan 2021, Jakub Jelinek wrote:

> Hi!
> 
> As requested in the PR, the one's complement abs can be done more
> efficiently without cmov or branching.
> 
> Had to change the ifcvt-onecmpl-abs-1.c testcase, we no longer optimize
> it in ifcvt, on x86_64 with -m32 we generate in the end the exact same
> code, but with -m64:
>   movl%edi, %eax
> - notl%eax
> - cmpl%edi, %eax
> - cmovl   %edi, %eax
> + sarl$31, %eax
> + xorl%edi, %eax
>   ret
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-01-05  Jakub Jelinek  
> 
>   PR tree-optimization/96928
>   * tree-ssa-phiopt.c (xor_replacement): New function.
>   (tree_ssa_phiopt_worker): Call it.
> 
>   * gcc.dg/tree-ssa/pr96928.c: New test.
>   * gcc.target/i386/ifcvt-onecmpl-abs-1.c: Remove -fdump-rtl-ce1,
>   instead of scanning rtl dump for ifcvt message check assembly
>   for xor instruction.
> 
> --- gcc/tree-ssa-phiopt.c.jj  2021-01-04 10:25:38.638236032 +0100
> +++ gcc/tree-ssa-phiopt.c 2021-01-04 15:29:30.050005505 +0100
> @@ -62,6 +62,8 @@ static bool minmax_replacement (basic_bl
>   edge, edge, gimple *, tree, tree);
>  static bool abs_replacement (basic_block, basic_block,
>edge, edge, gimple *, tree, tree);
> +static bool xor_replacement (basic_block, basic_block,
> +  edge, edge, gimple *, tree, tree);
>  static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, 
> basic_block,
> edge, edge, gimple *,
> tree, tree);
> @@ -346,6 +348,9 @@ tree_ssa_phiopt_worker (bool do_store_el
> else if (abs_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
>   cfgchanged = true;
> else if (!early_p
> +&& xor_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> + cfgchanged = true;
> +   else if (!early_p
>  && cond_removal_in_popcount_clz_ctz_pattern (bb, bb1, e1,
>   e2, phi, arg0,
>   arg1))
> @@ -2097,6 +2102,109 @@ abs_replacement (basic_block cond_bb, ba
>/* Note that we optimized this PHI.  */
>return true;
>  }
> +
> +/* Optimize x < 0 ? ~y : y into (x >> (prec-1)) ^ y.  */
> +
> +static bool
> +xor_replacement (basic_block cond_bb, basic_block middle_bb,
> +  edge e0 ATTRIBUTE_UNUSED, edge e1,
> +  gimple *phi, tree arg0, tree arg1)
> +{
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (arg1)))
> +return false;
> +
> +  /* OTHER_BLOCK must have only one executable statement which must have the
> + form arg0 = ~arg1 or arg1 = ~arg0.  */
> +
> +  gimple *assign = last_and_only_stmt (middle_bb);
> +  /* If we did not find the proper one's complement assignment, then we 
> cannot
> + optimize.  */
> +  if (assign == NULL)
> +return false;
> +
> +  /* If we got here, then we have found the only executable statement
> + in OTHER_BLOCK.  If it is anything other than arg0 = ~arg1 or
> + arg1 = ~arg0, then we cannot optimize.  */
> +  if (!is_gimple_assign (assign))
> +return false;
> +
> +  if (gimple_assign_rhs_code (assign) != BIT_NOT_EXPR)
> +return false;
> +
> +  tree lhs = gimple_assign_lhs (assign);
> +  tree rhs = gimple_assign_rhs1 (assign);
> +
> +  /* The assignment has to be arg0 = ~arg1 or arg1 = ~arg0.  */
> +  if (!(lhs == arg0 && rhs == arg1) && !(lhs == arg1 && rhs == arg0))
> +return false;
> +
> +  gimple *cond = last_stmt (cond_bb);
> +  tree result = PHI_RESULT (phi);
> +
> +  /* Only relationals comparing arg[01] against zero are interesting.  */
> +  enum tree_code cond_code = gimple_cond_code (cond);
> +  if (cond_code != LT_EXPR && cond_code != GE_EXPR)
> +return false;
> +
> +  /* Make sure the conditional is x OP 0.  */
> +  tree clhs = gimple_cond_lhs (cond);
> +  if (TREE_CODE (clhs) != SSA_NAME
> +  || !INTEGRAL_TYPE_P (TREE_TYPE (clhs))
> +  || TYPE_UNSIGNED (TREE_TYPE (clhs))
> +  || TYPE_PRECISION (TREE_TYPE (clhs)) != TYPE_PRECISION (TREE_TYPE 
> (arg1))
> +  || !integer_zerop (gimple_cond_rhs (cond)))
> +return false;
> +
> +  /* We need to know which is the true edge and which is the false
> + edge so that we know if we have xor or inverted xor.  */
> +  edge true_edge, false_edge;
> +  extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge);
> +
> +  /* For GE_EXPR, if the true edge goes to OTHER_BLOCK, then we
> + will need to invert the result.  Similarly for LT_EXPR if
> + the false edge goes to OTHER_BLOCK.  */
> +  edge e;
> +  if (cond_code == GE_EXPR)
> +e = true_edge;
> +  else
> +e = false_edge;
> +
> +  bool invert = e->dest == middle_bb;
> +
> +  result = duplicate_ssa_name (result, NULL);
> +
> +  
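
A minimal C sketch of the identity the new xor_replacement relies on. The function names are illustrative. It assumes a 32-bit int and that >> on a negative int is an arithmetic shift, which is implementation-defined in C but is what GCC does.

```c
/* Branchy form: one's complement of y when x is negative.  */
int select_form(int x, int y)
{
    return x < 0 ? ~y : y;
}

/* Branch-free form: x >> 31 is -1 (all ones) for negative x and 0
   otherwise, so the xor either flips every bit of y or leaves it
   unchanged.  */
int xor_form(int x, int y)
{
    return (x >> 31) ^ y;
}
```

This matches the sarl/xorl pair in the -m64 assembly quoted at the top of the mail.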

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-05 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 5, 2021 at 11:25 AM Hongtao Liu  wrote:
>
> On Tue, Jan 5, 2021 at 3:20 PM Uros Bizjak  wrote:
> >
> > On Tue, Jan 5, 2021 at 8:04 AM Uros Bizjak  wrote:
> > > >
> > > > +(define_split
> > > > +  [(set (match_operand:SI 0 "register_operand")
> > > > +(zero_extend:SI
> > > > +  (not:HI
> > > > +(subreg:HI
> > > > +  (unspec:SI
> > > > +[(match_operand:V16QI 1 "register_operand")]
> > > > > +UNSPEC_MOVMSK) 0))))]
> > > > +  "TARGET_SSE2"
> > > > +  [(set (match_dup 2)
> > > > +(unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))
> > > > +   (set (match_dup 0)
> > > > +(match_dup 3))]
> > >
> > > Just write:
> > >
> > > (set (match_dup 0)
> > > (xor:SI (match_dup 2)(const_int 65535))
> >
>
> Yes, changed.
>
> > BTW: This could be a universal combine splitter to simplify
> >
> > unsigned int foo (unsigned short z)
> > {
> > return (unsigned short)~z;
> > }
> >
> > Trying 7 -> 8:
> >7: r87:HI=~r88:SI#0
> >  REG_DEAD r88:SI
> >8: r86:SI=zero_extend(r87:HI)
> >  REG_DEAD r87:HI
> > Failed to match this instruction:
> > (set (reg:SI 86)
> >(zero_extend:SI (not:HI (subreg:HI (reg:SI 88) 0))))
> >
> > But combine does not "split" to one insn.
>
> Yes, according to the psABI, the top half of the register is not
> necessarily 0, so if you add the splitter, it just changes from notl +
> movzwl to xor + movzwl, which doesn't look better?

Indeed.

The patch is OK.

Uros.
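
A short C sketch of why the xor form is fine for the movmsk result but not as a universal splitter. The function names are illustrative. The two functions agree only when bits 16..31 of the input are already zero, which holds for the 16-bit mask produced from a V16QI operand but not for an arbitrary unsigned short argument held in a 32-bit register.

```c
#include <stdint.h>

/* zero_extend:SI (not:HI (subreg:HI x 0)): invert the low 16 bits of x
   and zero-extend the result to 32 bits.  */
uint32_t not_low_half(uint32_t x)
{
    return (uint16_t) ~x;
}

/* The xor replacement: flips the low 16 bits and leaves bits 16..31
   of x untouched.  */
uint32_t xor_65535(uint32_t x)
{
    return x ^ 65535u;
}
```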


Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-05 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 5, 2021 at 3:20 PM Uros Bizjak  wrote:
>
> On Tue, Jan 5, 2021 at 8:04 AM Uros Bizjak  wrote:
> > >
> > > +(define_split
> > > +  [(set (match_operand:SI 0 "register_operand")
> > > +(zero_extend:SI
> > > +  (not:HI
> > > +(subreg:HI
> > > +  (unspec:SI
> > > +[(match_operand:V16QI 1 "register_operand")]
> > > > +UNSPEC_MOVMSK) 0))))]
> > > +  "TARGET_SSE2"
> > > +  [(set (match_dup 2)
> > > +(unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))
> > > +   (set (match_dup 0)
> > > +(match_dup 3))]
> >
> > Just write:
> >
> > (set (match_dup 0)
> > (xor:SI (match_dup 2)(const_int 65535))
>

Yes, changed.

> BTW: This could be a universal combine splitter to simplify
>
> unsigned int foo (unsigned short z)
> {
> return (unsigned short)~z;
> }
>
> Trying 7 -> 8:
>7: r87:HI=~r88:SI#0
>  REG_DEAD r88:SI
>8: r86:SI=zero_extend(r87:HI)
>  REG_DEAD r87:HI
> Failed to match this instruction:
> (set (reg:SI 86)
>(zero_extend:SI (not:HI (subreg:HI (reg:SI 88) 0))))
>
> But combine does not "split" to one insn.

Yes, according to the psABI, the top half of the register is not
necessarily 0, so if you add the splitter, it just changes from notl +
movzwl to xor + movzwl, which doesn't look better?

>
> Uros.



-- 
BR,
Hongtao


[PATCH] match.pd: Improve (A / (1 << B)) -> (A >> B) optimization [PR96930]

2021-01-05 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch improves the A / (1 << B) -> A >> B simplification,
as seen in the testcase, if there is unnecessary widening for the division,
we just optimize it into a shift on the widened type, but if the lshift
is widened too, there is no reason to do that, we can just shift it in the
original type and convert after.  The tree_nonzero_bits & wi::mask check
already ensures it is fine even for signed values.

I've split the vr-values optimization into a separate patch as it causes
a small regression on two testcases, but this patch fixes what has been
reported in the PR alone.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-05  Jakub Jelinek  

PR tree-optimization/96930
* match.pd ((A / (1 << B)) -> (A >> B)): If A is extended
from narrower value which has the same type as 1 << B, perform
the right shift on the narrower value followed by extension.

* g++.dg/tree-ssa/pr96930.C: New test.

--- gcc/match.pd.jj 2021-01-04 10:37:06.0 +0100
+++ gcc/match.pd2021-01-05 10:27:56.653791400 +0100
@@ -321,7 +321,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(unsigned long long) (1 << 31) is -2147483648ULL, not 2147483648ULL,
so it is valid only if A >> 31 is zero.  */
 (simplify
- (trunc_div @0 (convert? (lshift integer_onep@1 @2)))
+ (trunc_div (convert?@0 @3) (convert2? (lshift integer_onep@1 @2)))
  (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0))
   && (!VECTOR_TYPE_P (type)
  || target_supports_op_p (type, RSHIFT_EXPR, optab_vector)
@@ -336,7 +336,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  & wi::mask (element_precision (TREE_TYPE (@1)) - 1,
  true,
  element_precision (type))) == 0)))))
-  (rshift @0 @2)))
+   (if (!VECTOR_TYPE_P (type)
+   && useless_type_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))
+   && element_precision (TREE_TYPE (@3)) < element_precision (type))
+(convert (rshift @3 @2))
+(rshift @0 @2))))
 
 /* Preserve explicit divisions by 0: the C++ front-end wants to detect
undefined behavior in constexpr evaluation, and assuming that the division
--- gcc/testsuite/g++.dg/tree-ssa/pr96930.C.jj  2021-01-04 14:18:15.513100038 
+0100
+++ gcc/testsuite/g++.dg/tree-ssa/pr96930.C 2021-01-04 14:25:35.512148709 
+0100
@@ -0,0 +1,10 @@
+// PR tree-optimization/96930
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-optimized" }
+// { dg-final { scan-tree-dump " = a_\[0-9]\\\(D\\\) >> b_\[0-9]\\\(D\\\);" 
"optimized" } }
+
+unsigned
+foo (unsigned a, unsigned b)
+{
+  return a / (unsigned long long) (1U << b);
+}

Jakub
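
A minimal C sketch of the identity the new test exercises, assuming a 32-bit unsigned int and b < 32. The function names are illustrative. The third function shows the pitfall mentioned in the match.pd comment: a signed value with only bit 31 set sign-extends when widened, so the signed case needs the extra nonzero-bits check.

```c
#include <limits.h>

/* Division as written in the test case: the shift result is widened
   to unsigned long long before dividing.  Requires b < 32.  */
unsigned div_form(unsigned a, unsigned b)
{
    return a / (unsigned long long) (1U << b);
}

/* What the simplification produces: a right shift in the narrow type.  */
unsigned shift_form(unsigned a, unsigned b)
{
    return a >> b;
}

/* INT_MIN is the value the comment writes as 1 << 31; converting it to
   unsigned long long sign-extends instead of giving 2147483648.  */
unsigned long long widened_int_min(void)
{
    return (unsigned long long) INT_MIN;
}
```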


