date:20200511

Re: [RFC PATCH] i386: Add V2SFmode FMA insn patterns [PR95046]

2020-05-11 Thread Richard Biener

On Mon, 11 May 2020, Uros Bizjak wrote:

> Attached patch implements V2SFmode FMA insn patterns. Patched compiler
> vectorizes FMA, FMS and FNMA instructions, but for some reason fails
> to vectorize FNMS.
> 
> I have double checked that the insn pattern is correct, and now I'm
> all out of ideas what could be wrong with the pattern, still ignored
> by the vectorizer. -fno-vect-cost-model does not help so it's time to
> ask the experts...

Do you have negate patterns for V2SFmode?  The vectorizer sees
decomposed ops and only the vectorized operations are later formed
into FMAs.

Richard.

> gcc/ChangeLog:
> 
> 2020-05-11  Uroš Bizjak  
> 
> PR target/95046
> * config/i386/mmx.md (fmav2sf4): New insn pattern.
> (fmsv2sf4): Ditto.
> (fnmav2sf4): Ditto.
> (fnmsv2sf4): Ditto.
> 
> testsuite/ChangeLog:
> 
> 2020-05-11  Uroš Bizjak  
> 
> PR target/95046
> * gcc.target/i386/pr95046-2.c: New test.
> 
> Otherwise, the patch is bootstrapped and regression tested on
> x86_64-linux-gnu {,-m32}.
> 
> Uros.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[PATCH] [PR94118] Update documentation for x86 operand modifier.

2020-05-11 Thread Hongtao Liu via Gcc-patches

Document operand modifiers that are available in asm statements but
missing from the documentation.

 | Modifier | Description | Available in asm stmt | Existed in documentation |
 | --- | --- | --- | --- |
 | L,W,B,Q,S,T | print the opcode suffix for specified size of operand. | Available | Not |
 | C | print opcode suffix for set/cmov insn. | Not | - |
 | c | like C, but print reversed condition | Not | - |
 | F,f | likewise, but for floating-point. | Not | - |
 | O | if HAVE_AS_IX86_CMOV_SUN_SYNTAX, expand to "w.", "l." or "q.", otherwise nothing | Not | - |
 | R | print embedded rounding and sae. | Available | Not |
 | r | print only sae. | Available | Not |
 | z | print the opcode suffix for the size of the current operand. | Available | Existed |
 | Z | likewise, with special suffixes for x87 instructions. | Available | Not |
 | * | print a star (in certain assembler syntax) | Not | - |
 | A | print an absolute memory reference. | Available | Existed |
 | E | print address with DImode register names if TARGET_64BIT. | Available | Existed |
 | w | print the operand as if it's a "word" (HImode) even if it isn't. | Available | Existed |
 | s | print a shift double count, followed by the assembler's argument delimiter. | Available | Not |
 | b | print the QImode name of the register for the indicated operand; %b0 would print %al if operands[0] is reg 0. | Available | Existed |
 | w | likewise, print the HImode name of the register. | Available | Existed |
 | k | likewise, print the SImode name of the register. | Available | Existed |
 | q | likewise, print the DImode name of the register. | Available | Existed |
 | x | likewise, print the V4SFmode name of the register. | Available | Not |
 | t | likewise, print the V8SFmode name of the register. | Available | Not |
 | g | likewise, print the V16SFmode name of the register. | Available | Not |
 | h | print the QImode name for a "high" register, either ah, bh, ch or dh. | Available | Existed |
 | y | print "st(0)" instead of "st" as a register. | Available | Not |
 | d | print duplicated register operand for AVX instruction. | Available | Not |
 | D | print condition for SSE cmp instruction. | Not | - |
 | P | if PIC, print an @PLT suffix. | Available | Existed |
 | p | print raw symbol name. | Available | Existed |
 | X | don't print any sort of PIC '@' suffix for a symbol. | Not | - |
 | & | print some in-use local-dynamic symbol name. | Not | - |
 | H | print a memory address offset by 8; used for sse high-parts | Available | Existed |
 | Y | print condition for XOP pcom* instruction. | Not | - |
 | V | print naked full integer register name without %. | Available | Existed |
 | + | print a branch hint as 'cs' or 'ds' prefix | Not | - |
 | ; | print a semicolon (after prefixes due to bug in older gas). | Not | - |
 | ~ | print "i" if TARGET_AVX2, "f" otherwise. | Not | - |
 | ^ | print addr32 prefix if TARGET_64BIT and Pmode != word_mode | Not | - |
 | M | print addr32 prefix for TARGET_X32 with VSIB address. | Not | - |
 | ! | print NOTRACK prefix for jxx/call/ret instructions if required. | Not | - |
 | N | print maskz if it's constant 0 operand. | Available | Not |
 | I | print comparison predicate operand for sse cmp condition. | Not | - |

Bootstrap is ok.

gcc/ChangeLog

PR target/94118
* doc/extend.texi (x86Operandmodifiers): Document more x86
operand modifier.
* gcc/config/i386/i386.c: Add comment for operand modifier N
and I.

-- 
BR,
Hongtao
From 333ee5ef21e6903f2893c9dcf3bb941b88516542 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Fri, 8 May 2020 17:47:33 +0800
Subject: [PATCH] Document more x86 operand modifier.

Documents operand modifiers which are available in asm stmt but missing in document.


Re: [PATCH] rs6000: Built-in cleanups for vec_clzm, vec_ctzm, and vec_gnb.

2020-05-11 Thread Segher Boessenkool

On Mon, May 11, 2020 at 09:31:41PM -0500, Bill Schmidt wrote:
> On 5/11/20 7:16 AM, Segher Boessenkool wrote:
> >>* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
> >>Change fourth operand for vec_ternarylogic to require
> >>compatibility with unsigned SImode rather than unsigned QImode.
> >Is it still checked for range 0..255 though?  (If the compiler can
> >derive that).
> 
> Yep, we already have this:
> 
>   if (icode == CODE_FOR_xxeval)
>     {
>   /* Only allow 8-bit unsigned literals.  */
>   STRIP_NOPS (arg3);
>   if (TREE_CODE (arg3) != INTEGER_CST
>   || TREE_INT_CST_LOW (arg3) & ~0xff)
>     {
>   error ("argument 4 must be an 8-bit unsigned literal");
>   return CONST0_RTX (tmode);
>     }
>     }

That test only makes sure that bits 0xff00 are zero -- which
does work correctly here because we do know the operand is SImode.
Tricky.


Segher
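
For illustration only, the mask test from the quoted check can be sketched
in Python.  The function name is_8bit_unsigned_literal is hypothetical, and
an ordinary Python integer stands in for the value of arg3; the sketch makes
no claim about how GCC represents the constant internally.

```python
def is_8bit_unsigned_literal(value: int) -> bool:
    """Return True when no bit outside the low eight bits is set.

    Mirrors the `value & ~0xff` test from the quoted snippet.  The
    function name is hypothetical; `value` is a plain Python integer.
    """
    return (value & ~0xFF) == 0


# Show the outcome for a few candidate literals.
for candidate in (0, 255, 256, -1):
    print(candidate, is_8bit_unsigned_literal(candidate))
```

With arbitrary-precision integers, this accepts exactly the values 0..255.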

Re: [PATCH] rs6000: Add vec_extracth and vec_extractl

2020-05-11 Thread Bill Schmidt via Gcc-patches


On 5/11/20 9:48 AM, David Edelsohn wrote:

On Sun, May 10, 2020 at 9:14 AM Bill Schmidt  wrote:

From: Kelvin Nilsen 

Add new insns vextdu[bhw]vlx, vextddvlx, vextdu[bhw]vhx, and
vextddvhx, along with built-in access and overloaded built-in
access to these insns.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions, using a Power9 configuration.  Is this okay for
master?

Thanks,
Bill

[gcc]

2020-05-10  Kelvin Nilsen  

 * config/rs6000/altivec.h (vec_extractl): New #define.
 (vec_extracth): Likewise.
 * config/rs6000/altivec.md (UNSPEC_EXTRACTL): New constant.
 (UNSPEC_EXTRACTR): Likewise.
 (VEXTRACT_LR): New int iterator.

Well now the previous VSTRIR/VSTRIL patch is inconsistent.  If we're
going to use an iterator for "LR", that's fine, but it needs to be
used consistently for similar situations.  The approach for the two,
similar instructions and issues need to match.



I see your point.  I don't really like the way this was done very much, 
since the attributes are tied to the unspecs for extract-{low,high}.  
Simple attribute names like LR, lr, rl shouldn't be scoped so narrowly.


I don't like any of the alternatives very well, either.  I could either 
(1) change the names of the int iterators in this patch to incorporate 
part of the word "extract", and create similar iterators for the 
vstril/vstrir patterns; or (2) remove the iterators from this patch and 
just create two expansions and two insns instead of one of each.  I have 
a slight preference for (2) since the longer iterator names will make 
things ugly.


Do you or Segher have a preference?

Thanks!
Bill



Thanks, David

Re: [PATCH] rs6000: Built-in cleanups for vec_clzm, vec_ctzm, and vec_gnb.

2020-05-11 Thread Bill Schmidt via Gcc-patches




On 5/11/20 7:16 AM, Segher Boessenkool wrote:

Hi!

On Sat, May 09, 2020 at 08:08:34PM -0500, Bill Schmidt wrote:

I should have noticed this patch before submitting Kelvin's earlier
related patches, sorry.  I think it should still be fine to apply
the patches in order, but if you'd like me to combine this into the
two earlier ones, I'd be happy to do that.

The intermediary step works just fine as well, so it is fine as-is.

One thing:


* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Change fourth operand for vec_ternarylogic to require
compatibility with unsigned SImode rather than unsigned QImode.

Is it still checked for range 0..255 though?  (If the compiler can
derive that).



Yep, we already have this:

  if (icode == CODE_FOR_xxeval)
    {
  /* Only allow 8-bit unsigned literals.  */
  STRIP_NOPS (arg3);
  if (TREE_CODE (arg3) != INTEGER_CST
  || TREE_INT_CST_LOW (arg3) & ~0xff)
    {
  error ("argument 4 must be an 8-bit unsigned literal");
  return CONST0_RTX (tmode);
    }
    }

Thanks for the review!
Bill



In either case, if that is what the ABI says, that is what the ABI says,
so okay for trunk.

Thanks!


Segher

RE: [PATCH PR94991] aarch64: ICE: Segmentation fault with option -mgeneral-regs-only

2020-05-11 Thread Yangfei (Felix)

Hi,

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Monday, May 11, 2020 10:27 PM
> To: Yangfei (Felix) 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94991] aarch64: ICE: Segmentation fault with option -mgeneral-regs-only
> 
> LGTM.  Pushed with one minor formatting fix:
> 
> > @@ -1364,7 +1364,11 @@
> >  if (!TARGET_FLOAT)
> >{
> > aarch64_err_no_fpadvsimd (mode);
> > -   FAIL;
> > +   machine_mode intmode
> > +   = int_mode_for_size (GET_MODE_BITSIZE (mode), 0).require ();
> 
> The "=" should only be indented by two spaces relative to the first line.

Thanks for fixing the bad formatting  :- )
I was expecting issues like that to be reported by contrib/check_GNU_style.sh.
Will be more careful.

Felix

libgo patch committed: Fix TestCallersNilPointerPanic for GoLLVM

2020-05-11 Thread Ian Lance Taylor via Gcc-patches

This libgo patch by Eric Fang fixes TestCallersNilPointerPanic when
using GoLLVM.  The expected result of TestCallersNilPointerPanic has
changed in GoLLVM.  This change makes some elements of the expected
result optional so that this test passes in both gccgo and GoLLVM.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to master.
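
The idea of making some expected elements optional can be sketched as
follows.  This is a loose Python rendering of the comparison loop in the
patch below, not the Go test itself; collect_frames and the sample frame
names are made up for the illustration.

```python
def collect_frames(frames, want, ignore=None):
    """Gather frame names, skipping any that appear in `ignore`.

    Collection stops once as many names as `want` holds have been
    gathered, as in the patched comparison loop.  The function name and
    all sample data below are hypothetical.
    """
    ignore = ignore or set()
    got = []
    for name in frames:
        if len(got) >= len(want):
            break
        if name not in ignore:
            got.append(name)
    return got


want = ["a", "b", "c"]
ignore = {"x", "y"}
run_one = ["a", "x", "b", "c"]   # contains one optional element
run_two = ["a", "b", "y", "c"]   # contains the other optional element

print(collect_frames(run_one, want, ignore) == want)
print(collect_frames(run_two, want, ignore) == want)
```

Both sample runs compare equal to the same expected list because the
optional names are dropped before the comparison.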

Ian
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 939ba7c8929..02f6746cf6b 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-8645632618262d1661ece0c9e6fe9e04c6e3a878
+876bdf3df3bb33dbf1414237d84be5da32a48082
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/runtime/callers_test.go b/libgo/go/runtime/callers_test.go
index 26a6f3a73fc..1fc7f861894 100644
--- a/libgo/go/runtime/callers_test.go
+++ b/libgo/go/runtime/callers_test.go
@@ -67,7 +67,7 @@ func testCallers(t *testing.T, pcs []uintptr, pan bool) {
}
 }
 
-func testCallersEqual(t *testing.T, pcs []uintptr, want []string) {
+func testCallersEqual(t *testing.T, pcs []uintptr, want []string, ignore map[string]struct{}) {
got := make([]string, 0, len(want))
 
frames := runtime.CallersFrames(pcs)
@@ -76,7 +76,9 @@ func testCallersEqual(t *testing.T, pcs []uintptr, want []string) {
if !more || len(got) >= len(want) {
break
}
-   got = append(got, frame.Function)
+   if _, ok := ignore[frame.Function]; !ok {
+   got = append(got, frame.Function)
+   }
}
if !reflect.DeepEqual(want, got) {
t.Fatalf("wanted %v, got %v", want, got)
@@ -106,7 +108,7 @@ func TestCallersPanic(t *testing.T) {
pcs := make([]uintptr, 20)
pcs = pcs[:runtime.Callers(0, pcs)]
testCallers(t, pcs, true)
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs, want, nil)
}()
f1(true)
 }
@@ -128,7 +130,7 @@ func TestCallersDoublePanic(t *testing.T) {
if recover() == nil {
t.Fatal("did not panic")
}
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs, want, nil)
}()
if recover() == nil {
t.Fatal("did not panic")
@@ -149,7 +151,7 @@ func TestCallersAfterRecovery(t *testing.T) {
defer func() {
pcs := make([]uintptr, 20)
pcs = pcs[:runtime.Callers(0, pcs)]
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs, want, nil)
}()
defer func() {
if recover() == nil {
@@ -177,7 +179,7 @@ func TestCallersAbortedPanic(t *testing.T) {
// recovered, there is no remaining panic on the stack.
pcs := make([]uintptr, 20)
pcs = pcs[:runtime.Callers(0, pcs)]
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs, want, nil)
}()
defer func() {
r := recover()
@@ -208,7 +210,7 @@ func TestCallersAbortedPanic2(t *testing.T) {
defer func() {
pcs := make([]uintptr, 20)
pcs = pcs[:runtime.Callers(0, pcs)]
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs, want, nil)
}()
func() {
defer func() {
@@ -233,10 +235,16 @@ func TestCallersNilPointerPanic(t *testing.T) {
want := []string{"runtime.Callers", "runtime_test.TestCallersNilPointerPanic.func1",
"runtime.gopanic", "runtime.panicmem", "runtime.sigpanic",
"runtime_test.TestCallersNilPointerPanic"}
+   ign := make(map[string]struct{})
if runtime.Compiler == "gccgo" {
+   // The expected results of gollvm and gccgo are slightly different, the result
+   // of gccgo does not contain tRunner, and the result of gollvm does not contain
+   // sigpanic. Make these two elementes optional to pass both of gollvm and gccgo.
want = []string{"runtime.Callers", "runtime_test.TestCallersNilPointerPanic..func1",
-   "runtime.gopanic", "runtime.panicmem", "runtime.sigpanic",
+   "runtime.gopanic", "runtime.panicmem",
"runtime_test.TestCallersNilPointerPanic"}
+   ign["runtime.sigpanic"] = struct{}{}
+   ign["testing.tRunner"] = struct{}{}
}
 
defer func() {
@@ -245,7 +253,7 @@ func TestCallersNilPointerPanic(t *testing.T) {
}
pcs := make([]uintptr, 20)
pcs = pcs[:runtime.Callers(0, pcs)]
-   testCallersEqual(t, pcs, want)
+   testCallersEqual(t, pcs,

libgo patch committed: Append to, don't clobber, environment in test

2020-05-11 Thread Ian Lance Taylor via Gcc-patches

This libgo patch changes some tests in the syscall package to append
to the environment in tests rather than clobbering the environment.
In particular, this preserves LD_LIBRARY_PATH.
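
The difference between clobbering and appending can be sketched in Python
(the patch itself is Go).  The helper names clobbered_env and appended_env
are hypothetical, and the sample base environment is invented for the
illustration.

```python
import os


def clobbered_env(extra):
    """Environment holding only the extra variables (the old behaviour)."""
    return dict(extra)


def appended_env(extra, base=None):
    """Copy of the base environment with the extra variables added.

    `base` defaults to the current process environment.  Entries in
    `extra` override entries of the same name in `base`.
    """
    env = dict(os.environ if base is None else base)
    env.update(extra)
    return env


# Invented base environment for the illustration.
base = {"LD_LIBRARY_PATH": "/opt/example/lib"}
extra = {"GO_DEATHSIG_PARENT": "1"}

print("LD_LIBRARY_PATH" in clobbered_env(extra))
print("LD_LIBRARY_PATH" in appended_env(extra, base))
```

Either mapping could be passed as the env argument of subprocess.run; only
the appended one lets the child see the inherited variables.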

This is a partial backport of https://golang.org/cl/233318 from the
master sources.  It's only a partial backport because part of the
change was already applied to libgo in https://golang.org/cl/193497 as
part of the update to the Go 1.13beta1 release.  These additional parts
weren't applied then because they only show up when running the test
as root, and I didn't do that.

This fixes GCC PR 95061.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to master and GCC 10 branch.

Ian
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 428b329382b..939ba7c8929 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-41019d50ae519328dd3cf200815a2a2b0b64674e
+8645632618262d1661ece0c9e6fe9e04c6e3a878
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/syscall/syscall_linux_test.go b/libgo/go/syscall/syscall_linux_test.go
index 97059c87d3d..c12df4cf5c7 100644
--- a/libgo/go/syscall/syscall_linux_test.go
+++ b/libgo/go/syscall/syscall_linux_test.go
@@ -187,7 +187,7 @@ func TestLinuxDeathSignal(t *testing.T) {
}
 
cmd := exec.Command(tmpBinary)
-   cmd.Env = []string{"GO_DEATHSIG_PARENT=1"}
+   cmd.Env = append(os.Environ(), "GO_DEATHSIG_PARENT=1")
chldStdin, err := cmd.StdinPipe()
if err != nil {
t.Fatalf("failed to create new stdin pipe: %v", err)
@@ -225,7 +225,10 @@ func TestLinuxDeathSignal(t *testing.T) {
 
 func deathSignalParent() {
cmd := exec.Command(os.Args[0])
-   cmd.Env = []string{"GO_DEATHSIG_CHILD=1"}
+   cmd.Env = append(os.Environ(),
+   "GO_DEATHSIG_PARENT=",
+   "GO_DEATHSIG_CHILD=1",
+   )
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
attrs := syscall.SysProcAttr{
@@ -360,7 +363,7 @@ func TestSyscallNoError(t *testing.T) {
}
 
cmd := exec.Command(tmpBinary)
-   cmd.Env = []string{"GO_SYSCALL_NOERROR=1"}
+   cmd.Env = append(os.Environ(), "GO_SYSCALL_NOERROR=1")
 
out, err := cmd.CombinedOutput()
if err != nil {

Go frontend patch committed: Use some const string references

2020-05-11 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend uses const string references in a couple
of places that were using plain std::string.  This will save some
std::string copying.  This fixes GCC PR 94766.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to master.

Ian
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 5aecee18dd6..428b329382b 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-761d68dacefc578e45ff299761f20989aef67823
+41019d50ae519328dd3cf200815a2a2b0b64674e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/gogo.h b/gcc/go/gofrontend/gogo.h
index 7d83119b698..2fb8a3aeb43 100644
--- a/gcc/go/gofrontend/gogo.h
+++ b/gcc/go/gofrontend/gogo.h
@@ -958,7 +958,7 @@ class Gogo
 
   // Return the name of the type descriptor list symbol of a package.
   std::string
-  type_descriptor_list_symbol(std::string);
+  type_descriptor_list_symbol(const std::string&);
 
   // Return the name of the list of all type descriptor lists.
   std::string
@@ -1073,7 +1073,7 @@ class Gogo
 
 Specific_type_function(Type* atype, Named_type* aname, int64_t asize,
   Specific_type_function_kind akind,
-  const std::string afnname,
+  const std::string& afnname,
   Function_type* afntype)
   : type(atype), name(aname), size(asize), kind(akind),
fnname(afnname), fntype(afntype)
diff --git a/gcc/go/gofrontend/names.cc b/gcc/go/gofrontend/names.cc
index f4ad181515b..a721a364212 100644
--- a/gcc/go/gofrontend/names.cc
+++ b/gcc/go/gofrontend/names.cc
@@ -1024,7 +1024,7 @@ Gogo::type_descriptor_name(const Type* type, Named_type* nt)
 // Return the name of the type descriptor list symbol of a package.
 
 std::string
-Gogo::type_descriptor_list_symbol(std::string pkgpath)
+Gogo::type_descriptor_list_symbol(const std::string& pkgpath)
 {
   return pkgpath + "..types";
 }

[PATCH] c++: explicit(bool) malfunction with dependent expression [PR95066]

2020-05-11 Thread Marek Polacek via Gcc-patches

I forgot to set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P when merging two
function declarations and as a sad consequence, we never tsubsted
the dependent explicit-specifier in tsubst_function_decl, leading to
disregarding the explicit-specifier altogether, and wrongly accepting
this test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10/9?

PR c++/95066
* decl.c (duplicate_decls): Set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P.

* g++.dg/cpp2a/explicit16.C: New test.
---
 gcc/cp/decl.c   |  2 ++
 gcc/testsuite/g++.dg/cpp2a/explicit16.C | 21 +
 2 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit16.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1b6a5672334..604ecf42e95 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2035,6 +2035,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
   DECL_FINAL_P (newdecl) |= DECL_FINAL_P (olddecl);
   DECL_OVERRIDE_P (newdecl) |= DECL_OVERRIDE_P (olddecl);
   DECL_THIS_STATIC (newdecl) |= DECL_THIS_STATIC (olddecl);
+  DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (newdecl)
+   |= DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (olddecl);
   if (DECL_OVERLOADED_OPERATOR_P (olddecl))
DECL_OVERLOADED_OPERATOR_CODE_RAW (newdecl)
  = DECL_OVERLOADED_OPERATOR_CODE_RAW (olddecl);
diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit16.C b/gcc/testsuite/g++.dg/cpp2a/explicit16.C
new file mode 100644
index 000..9d95b0d669e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/explicit16.C
@@ -0,0 +1,21 @@
+// PR c++/95066 - explicit malfunction with dependent expression.
+// { dg-do compile { target c++2a } }
+
+template 
+struct Foo {
+  template 
+  explicit(static_cast(true)) operator Foo();
+};
+
+template 
+template 
+Foo::operator Foo() {
+  return {};
+}
+
+int
+main ()
+{
+  Foo a;
+  Foo b = a; // { dg-error "conversion" }
+}

base-commit: 840ac85ced0695fefecee433327e4298b4adb20a
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA

[PATCH] c++: premature requires-expression folding [PR95020]

2020-05-11 Thread Patrick Palka via Gcc-patches

In the testcase below we're prematurely folding away the
requires-expression to 'true' after substituting in the function's
template arguments, but before substituting in the lambda's deduced
template arguments.

This happens because during the first tsubst_requires_expr,
processing_template_decl is 1 but 'args' is just {void} and therefore
non-dependent, so we end up folding away the requires-expression to
boolean_true_node before we could substitute in the lambda's template
arguments and determine that '*v' is ill-formed.

This patch removes the uses_template_parms check when deciding in
tsubst_requires_expr whether to keep around a new requires-expression.
Regardless of whether the template arguments are dependent, there still
might be more template parameters to later substitute in -- as in the
testcase below -- and even if not, tsubst_expr doesn't perform full
semantic processing unless !processing_template_decl, so it seems we
should wait until then to fold away the requires-expression.

Passes 'make check-c++', does this look OK to commit after a full
bootstrap/regtest?

gcc/cp/ChangeLog:

PR c++/95020
* constraint.cc (tsubst_requires_expr): Produce a new
requires-expression when processing_template_decl, even if
template arguments are not dependent.

gcc/testsuite/ChangeLog:

PR c++/95020
* g++.dg/cpp2a/concepts-lambda7.C: New test.
---
 gcc/cp/constraint.cc  |  4 +---
 gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C | 14 ++
 2 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 4ad17f3b7d8..8ee347cae60 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2173,9 +2173,7 @@ tsubst_requires_expr (tree t, tree args,
   if (reqs == error_mark_node)
 return boolean_false_node;
 
-  /* In certain cases, produce a new requires-expression.
- Otherwise the value of the expression is true.  */
-  if (processing_template_decl && uses_template_parms (args))
+  if (processing_template_decl)
 return finish_requires_expr (cp_expr_location (t), parms, reqs);
 
   return boolean_true_node;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
new file mode 100644
index 000..50746b777a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
@@ -0,0 +1,14 @@
+// PR c++/95020
+// { dg-do compile { target c++2a } }
+
+template
+void foo() {
+  auto t = [](auto v) {
+static_assert(requires { *v; }); // { dg-error "static assertion failed" }
+  };
+  t(0);
+}
+
+void bar() {
+  foo();
+}
-- 
2.26.2.561.g07d8ea56f2

Re: [PATCH] rs6000: Add xxgenpcvwm and xxgenpcvdm instructions

2020-05-11 Thread Bill Schmidt via Gcc-patches


On 5/11/20 5:21 AM, Segher Boessenkool wrote:

Hi!

On Sat, May 09, 2020 at 12:05:08PM -0500, Bill Schmidt wrote:

From: Carl Love 

Add support for xxgenpcv[dw]m, along with individual and overloaded
built-in functions for access.
(xxgenpcvm_): New insn.
(xxgenpcvm): New expansion.

Eww.  Let's please use or not use underscore in both cases.  Insns that
are not created directly should have a name starting with *.  We have
many examples of an expand with the same name as an insn (other than the
insn having a *), which isn't really confusing because the expand
usually is right before the insn.

But, in this case, you *do* call the insn directly (namely, from the
define expand!)  So maybe use a "xxgenpcvm_internal" or similar
name for the define_insn?


Agreed.  I'm fixing that now.  Thanks!

Bill



Okay for trunk with that improved somehow.  Thanks!


Segher

[PATCH] contrib: Handle GDB specific test result types

2020-05-11 Thread Andrew Burgess

This commit is for the benefit of GDB, but as the binutils-gdb
repository shares the contrib/ directory with gcc, this commit must
first be applied to gcc then copied back to binutils-gdb.

This commit extends the two scripts contrib/dg-extract-results.{py,sh}
to handle some new, GDB specific test result types.  These test
result types should never appear in GCC, or any other tool that
shares the contrib/ directory, so this change should be harmless.

In this patch series:
  https://sourceware.org/pipermail/gdb-patches/2020-April/167847.html
changes were made in GDB's use of Dejagnu so that two additional
conditions could be detected, these are:

  1. Test names that contain either the build or source paths.  Such
  test names make it difficult to compare the results of two test runs
  of GDB from two different directories, and

  2. Duplicate test names.  Duplicates make it difficult to track down
  exactly which test has failed.

When running Dejagnu on GDB we can now (sometimes) see two additional
test result types matching the above conditions, these are '# of paths
in test names' and '# of duplicate test names'.

If the test is run in parallel mode (make -j...) then these extra test
results will appear in the individual test summary files, but are not
merged into the final summary file.

Additionally, within the summary file there are now two new types of
test summary line, these are 'PATH: ...' and 'DUPLICATE: ...', these
allow users to quickly search the test summary to track down where the
offending test names are.  These lines are similarly not merged into
the unified gdb.sum file after a parallel test run.

This commit extends the dg-extract-results.* scripts to calculate the
totals for the two new result types, and to copy the new test summary
lines into the unified summary file.
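
As a rough illustration, the extended result-line pattern from the patch
below can be exercised on a few invented summary lines.  The count_results
helper and the sample lines are hypothetical; the scripts themselves may
compute their totals differently.

```python
import re
from collections import Counter

# Result-line pattern as extended by the patch (PATH and DUPLICATE added).
RESULT_RE = re.compile(r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
                       r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
                       r'|KFAIL|KPASS|PATH|DUPLICATE):\s*(.+)')


def count_results(lines):
    """Count how often each result type appears in `lines`."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, not taken from a real gdb.sum file.
sample = [
    "PASS: gdb.base/example.exp: step one",
    "DUPLICATE: gdb.base/example.exp: step one",
    "PATH: gdb.base/example.exp: load /tmp/build/file",
    "Running target unix",
]

print(dict(count_results(sample)))
```

Lines that do not start with a known result type, such as the
"Running target" line, are left out of the counts.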

contrib/ChangeLog:

* dg-extract-results.py: Handle GDB specific test types.
* dg-extract-results.sh: Likewise.
---
 contrib/ChangeLog |  5 +
 contrib/dg-extract-results.py |  6 --
 contrib/dg-extract-results.sh | 12 +++-
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/contrib/dg-extract-results.py b/contrib/dg-extract-results.py
index 7100794d42a..30aa68771d4 100644
--- a/contrib/dg-extract-results.py
+++ b/contrib/dg-extract-results.py
@@ -117,7 +117,7 @@ class Prog:
 self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$')
 self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
  r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
- r'|KFAIL|KPASS):\s*(.+)')
+ r'|KFAIL|KPASS|PATH|DUPLICATE):\s*(.+)')
 self.completed_re = re.compile (r'.* completed at (.*)')
 # Pieces of text to write at the head of the output.
 # start_line is a pair in which the first element is a datetime
@@ -143,7 +143,9 @@ class Prog:
 '# of known failures\t\t',
 '# of untested testcases\t\t',
 '# of unresolved testcases\t',
-'# of unsupported tests\t\t'
+'# of unsupported tests\t\t',
+'# of paths in test names\t',
+'# of duplicate test names\t'
 ]
 self.runs = dict()
 
diff --git a/contrib/dg-extract-results.sh b/contrib/dg-extract-results.sh
index f948088370e..ff6c50d029c 100755
--- a/contrib/dg-extract-results.sh
+++ b/contrib/dg-extract-results.sh
@@ -326,7 +326,7 @@ BEGIN {
   }
 }
 /^\t\t=== .* ===$/ { curvar = ""; next }
-/^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED|WARNING|ERROR|UNSUPPORTED|UNTESTED|KFAIL|KPASS):/ {
+/^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED|WARNING|ERROR|UNSUPPORTED|UNTESTED|KFAIL|KPASS|PATH|DUPLICATE):/ {
   testname=\$2
   # Ugly hack for gfortran.dg/dg.exp
   if ("$TOOL" == "gfortran" && testname ~ /^gfortran.dg\/g77\//)
@@ -400,6 +400,7 @@ BEGIN {
   variant="$VAR"
   tool="$TOOL"
   passcnt=0; failcnt=0; untstcnt=0; xpasscnt=0; xfailcnt=0; kpasscnt=0; kfailcnt=0; unsupcnt=0; unrescnt=0; dgerrorcnt=0;
+  pathcnt=0; dupcnt=0
   curvar=""; insummary=0
 }
 /^Running target / { curvar = \$3; next }
@@ -414,6 +415,8 @@ BEGIN {
 /^# of untested testcases/ { if (insummary == 1) untstcnt += \$5; next; }
 /^# of unresolved testcases/   { if (insummary == 1) unrescnt += \$5; next; }
 /^# of unsupported tests/  { if (insummary == 1) unsupcnt += \$5; next; }
+/^# of paths in test names/{ if (insummary == 1) pathcnt += \$7; next; }
+/^# of duplicate test names/   { if (insummary == 1) dupcnt += \$6; next; }
 /^$/   { if (insummary == 1)
{ insummary = 0; curvar = "" }
  next
@@ -431,6 +434,8 @@ END {
   if (untstcnt != 0) printf ("# of untested testcases\t\t%d\n", untstcnt)
   if (unrescnt != 0) printf ("# of unresolved testcases\t%d\n", unrescnt)
   if (unsupcnt != 0) printf ("# of unsupported tests\t\t%d\n", unsupcnt)
+  if

Re: [PATCH] Add C++2a synchronization support

2020-05-11 Thread Thomas Rodgers via Gcc-patches

I *think* I have addressed everything in the attached patch.
commit 24a989d2bf2158bdbe2511310d0583d0c6226f71
Author: Thomas Rodgers 
Date:   Mon Apr 6 17:58:47 2020 -0700

Add C++2a synchronization support

Add support for -
atomic wait/notify_one/notify_all
counting_semaphore
binary_semaphore
latch

* include/Makefile.am (bits_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h (__atomic_base<_Itp>::wait): Define.
(__atomic_base<_Itp>::notify_one): Likewise.
(__atomic_base<_Itp>::notify_all): Likewise.
(__atomic_base<_Ptp*>::wait): Likewise.
(__atomic_base<_Ptp*>::notify_one): Likewise.
(__atomic_base<_Ptp*>::notify_all): Likewise.
(__atomic_impl::wait): Likewise.
(__atomic_impl::notify_one): Likewise.
(__atomic_impl::notify_all): Likewise.
(__atomic_float<_Fp>::wait): Likewise.
(__atomic_float<_Fp>::notify_one): Likewise.
(__atomic_float<_Fp>::notify_all): Likewise.
(__atomic_ref<_Tp>::wait): Likewise.
(__atomic_ref<_Tp>::notify_one): Likewise.
(__atomic_ref<_Tp>::notify_all): Likewise.
(atomic_wait<_Tp>): Likewise.
(atomic_wait_explicit<_Tp>): Likewise.
(atomic_notify_one<_Tp>): Likewise.
(atomic_notify_all<_Tp>): Likewise.
* include/bits/atomic_wait.h: New file.
* include/bits/atomic_timed_wait.h: New file.
* include/bits/semaphore_base.h: New file.
* include/std/atomic (atomic::wait): Define.
(atomic::notify_one): Likewise.
(atomic::notify_all): Likewise.
(atomic<_Tp>::wait): Likewise.
(atomic<_Tp>::notify_one): Likewise.
(atomic<_Tp>::notify_all): Likewise.
(atomic<_Tp*>::wait): Likewise.
(atomic<_Tp*>::notify_one): Likewise.
(atomic<_Tp*>::notify_all): Likewise.
* include/std/latch: New file.
* include/std/semaphore: New file.
* include/std/version: Add __cpp_lib_semaphore and
__cpp_lib_latch defines.
* testsuite/29_atomic/atomic/wait_notify/atomic_refs.cc: New test.
* testsuite/29_atomic/atomic/wait_notify/bool.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/integrals.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/floats.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/generic.h: New File.
* testsuite/30_thread/semaphore/1.cc: New test.
* testsuite/30_thread/semaphore/2.cc: Likewise.
* testsuite/30_thread/semaphore/least_max_value_neg.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_for.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_futex.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_posix.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_until.cc: Likewise.
* testsuite/30_thread/latch/1.cc: New test.
* testsuite/30_thread/latch/2.cc: New test.
* testsuite/30_thread/latch/3.cc: New test.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 80aeb3f8959..b3ac1a3365f 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -52,6 +52,7 @@ std_headers = \
 	${std_srcdir}/iostream \
 	${std_srcdir}/istream \
 	${std_srcdir}/iterator \
+	${std_srcdir}/latch\
 	${std_srcdir}/limits \
 	${std_srcdir}/list \
 	${std_srcdir}/locale \
@@ -69,6 +70,7 @@ std_headers = \
 	${std_srcdir}/ratio \
 	${std_srcdir}/regex \
 	${std_srcdir}/scoped_allocator \
+	${std_srcdir}/semaphore \
 	${std_srcdir}/set \
 	${std_srcdir}/shared_mutex \
 	${std_srcdir}/span \
@@ -100,6 +102,8 @@ bits_headers = \
 	${bits_srcdir}/allocated_ptr.h \
 	${bits_srcdir}/allocator.h \
 	${bits_srcdir}/atomic_base.h \
+	${bits_srcdir}/atomic_wait.h \
+	${bits_srcdir}/atomic_timed_wait.h \
 	${bits_srcdir}/atomic_futex.h \
 	${bits_srcdir}/basic_ios.h \
 	${bits_srcdir}/basic_ios.tcc \
@@ -174,6 +178,7 @@ bits_headers = \
 	${bits_srcdir}/regex_compiler.tcc \
 	${bits_srcdir}/regex_executor.h \
 	${bits_srcdir}/regex_executor.tcc \
+	${bits_srcdir}/semaphore_base.h \
 	${bits_srcdir}/shared_ptr.h \
 	${bits_srcdir}/shared_ptr_atomic.h \
 	${bits_srcdir}/shared_ptr_base.h \
diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 87fe0bd6000..73a8a77271e 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -37,6 +37,10 @@
 #include 
 #include 
 
+#if __cplusplus > 201703L
+#include

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-11 Thread Joseph Myers

On Fri, 8 May 2020, Richard Biener wrote:

> The IFNs are supposed to match fmin and fmax from the C standard which 
> IIRC have IEEE semantics.

fmin and fmax have IEEE (2008) semantics (where an sNaN operand results in 
a qNaN result with "invalid" raised, but a quiet NaN results in the other 
operand, if not sNaN, being returned).  Not to be confused with any of the 
new minimum/maximum operations in IEEE (2019) (both variants that treat 
all NaNs like other arithmetic operations, and variants that can raise 
"invalid" for sNaN without returning a NaN), for which C bindings under 
different names are proposed.
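
As a rough illustration of the quiet-NaN half of those semantics, the sketch below uses std::fmin and std::fmax from <cmath>.  The helper names are invented for this example, and it does not exercise signaling NaNs or the "invalid" flag.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helpers, named only for this illustration.

// A quiet NaN operand is treated as missing data: the other operand wins.
inline double min_quiet_nan_left()
{
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  return std::fmin(qnan, 1.0);
}

inline double max_quiet_nan_right()
{
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  return std::fmax(2.0, qnan);
}

// Only when both operands are NaN is the result a NaN.
inline double min_both_nan()
{
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  return std::fmin(qnan, qnan);
}
```

This assumes default floating-point options; flags such as -ffast-math allow the compiler to assume NaNs never occur, so the results may differ there.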

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] c++: C++20 DR 2237, disallow simple-template-id in cdtor.

2020-05-11 Thread Marek Polacek via Gcc-patches

On Sun, Apr 05, 2020 at 09:46:09PM -0400, Jason Merrill wrote:
> On 4/4/20 7:30 PM, Marek Polacek wrote:
> > This patch implements DR 2237 which says that a simple-template-id is
> > no longer valid as the declarator-id of a constructor or destructor;
> > see .  It is not explicitly
> > stated but out-of-line destructors with a simple-template-id are also
> > meant to be ill-formed now.  (Out-of-line constructors like that are
> > invalid since DR1435 I think.)  This change only applies to C++20; it
> > is not a DR against C++17.
> > 
> > I'm not crazy about the diagnostic in constructors but ISTM that
> > cp_parser_constructor_declarator_p shouldn't print errors.
> > 
> > Does it seem reasonable to apply this now or should I defer to GCC 11?
> 
> A new error should wait for GCC 11.

Coming back to this now that we're again in stage 1.

> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > 2020-04-04  Marek Polacek  
> > 
> > DR 2237
> > * parser.c (cp_parser_unqualified_id): Reject simple-template-id as
> > the declarator-id of a destructor.
> > (cp_parser_constructor_declarator_p): Reject simple-template-id as
> > the declarator-id of a constructor.
> > 
> > * g++.dg/DRs/dr2237.C: New test.
> > * g++.dg/parse/constructor2.C: Add dg-error for C++20.
> > * g++.dg/parse/dtor12.C: Likewise.
> > * g++.dg/parse/dtor4.C: Likewise.
> > * g++.dg/template/dtor4.C: Adjust dg-error.
> > * g++.dg/template/error34.C: Likewise.
> > * g++.old-deja/g++.other/inline15.C: Only run for C++17 and lesser.
> > * g++.old-deja/g++.pt/ctor2.C: Add dg-error for C++20.
> > ---
> >   gcc/cp/parser.c| 16 
> >   gcc/testsuite/g++.dg/DRs/dr2237.C  | 18 ++
> >   gcc/testsuite/g++.dg/parse/constructor2.C  |  4 ++--
> >   gcc/testsuite/g++.dg/parse/dtor12.C|  2 +-
> >   gcc/testsuite/g++.dg/parse/dtor4.C |  2 +-
> >   gcc/testsuite/g++.dg/template/dtor4.C  |  2 +-
> >   gcc/testsuite/g++.dg/template/error34.C| 10 +-
> >   .../g++.old-deja/g++.other/inline15.C  |  2 +-
> >   gcc/testsuite/g++.old-deja/g++.pt/ctor2.C  |  2 +-
> >   9 files changed, 46 insertions(+), 12 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/DRs/dr2237.C
> > 
> > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> > index 7e5921e039f..810edfa87a9 100644
> > --- a/gcc/cp/parser.c
> > +++ b/gcc/cp/parser.c
> > @@ -6114,6 +6114,16 @@ cp_parser_unqualified_id (cp_parser* parser,
> > return build_min_nt_loc (loc, BIT_NOT_EXPR, make_auto ());
> >   }
> > +   /* DR 2237 (C++20 only): A simple-template-id is no longer valid as the
> > +  declarator-id of a constructor or destructor.  */
> > +   if (token->type == CPP_TEMPLATE_ID && cxx_dialect >= cxx2a)
> > + {
> > +   if (!cp_parser_uncommitted_to_tentative_parse_p (parser))
> > + error_at (tilde_loc, "template-id not allowed for destructor");
> > +   cp_parser_simulate_error (parser);
> 
> The usual pattern is
> 
> if (!cp_parser_simulate_error (parser))
>   error...

Fixed.

> > +   return error_mark_node;
> > + }
> > +
> > /* If there was an explicit qualification (S::~T), first look
> >in the scope given by the qualification (i.e., S).
> > @@ -28675,6 +28685,12 @@ cp_parser_constructor_declarator_p (cp_parser *parser, cp_parser_flags flags,
> > if (!constructor_name_p (id, nested_name_specifier))
> > constructor_p = false;
> >   }
> > +  /* DR 2237 (C++20 only): A simple-template-id is no longer valid as the
> > + declarator-id of a constructor or destructor.  */
> > +  else if (constructor_p
> > +  && cxx_dialect >= cxx2a
> > +  && cp_lexer_next_token_is (parser->lexer, CPP_TEMPLATE_ID))
> > +constructor_p = false;
> > /* If we still think that this might be a constructor-declarator,
> >look for a class-name.  */
> > else if (constructor_p)
> 
> Do you also want to exclude CPP_TEMPLATE_ID from the test at the top of the
> function for C++20?

In fact we can get away with only excluding CPP_TEMPLATE_ID in C++20 there,
because for e.g. X::X(); we'll give the "names the constructor, not the
type" error, so we won't even get here.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements DR 2237 which says that a simple-template-id is
no longer valid as the declarator-id of a constructor or destructor;
see [diff.cpp17.class]#2.  It is not explicitly stated but out-of-line
destructors with a simple-template-id are also meant to be ill-formed
now.  (Out-of-line constructors like that have been invalid since DR1435, I
think.)  This change only applies to C++20; it is not a DR against C++17.
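
A minimal sketch of what the rule means for user code follows.  The Box template is made up for this illustration; the rejected forms are left in comments so the snippet compiles in any standard mode.

```cpp
#include <cassert>

// Hypothetical class template used only to illustrate DR 2237.
template<typename T>
struct Box {
  // Box<T>();    // simple-template-id as declarator-id: ill-formed in C++20
  // ~Box<T>();   // likewise for the destructor
  Box();          // injected-class-name: fine in every standard mode
  ~Box();
  T value;
};

// Out-of-line definitions: the qualifying scope may still be a template-id,
// but the final name must be the injected-class-name.
template<typename T> Box<T>::Box() : value() {}
template<typename T> Box<T>::~Box() {}
// template<typename T> Box<T>::~Box<T>() {}   // rejected under the new rule
```

Code that spells the constructor or destructor with template arguments can usually be fixed by dropping the argument list from the final name.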

I'm not crazy about the diagnostic in constructors but ISTM that
cp_parser_constructor_declarator_p shouldn't print errors.

DR 2237

[pushed] c++: tree walk into TYPENAME_TYPE.

2020-05-11 Thread Jason Merrill via Gcc-patches

While looking at 92583/92654 it occurred to me that typename types needed
the same fix.  So extract_locals_r also needs to see the TYPE_CONTEXT of a
TYPENAME_TYPE.  But it must not look through a typedef.

Most tree walking in the front end wants to walk through the syntactic form
of a type or expression, and doesn't care about the type referred to by a
typedef.  But min_vis_r does care.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

PR c++/92583
PR c++/92654
* tree.c (cp_walk_subtrees): Stop at typedefs.
Handle TYPENAME_TYPE here.
* pt.c (find_parameter_packs_r): Not here.
(for_each_template_parm_r): Clear *walk_subtrees.
* decl2.c (min_vis_r): Look through typedefs.
---
 gcc/cp/decl2.c| 30 +++
 gcc/cp/pt.c   |  7 +
 gcc/cp/tree.c | 22 +++---
 .../g++.dg/cpp1z/constexpr-if-lambda3.C   |  1 +
 4 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 8d3ac31a0c9..4767d53adef 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2328,26 +2328,30 @@ static tree
 min_vis_r (tree *tp, int *walk_subtrees, void *data)
 {
   int *vis_p = (int *)data;
+  int this_vis = VISIBILITY_DEFAULT;
   if (! TYPE_P (*tp))
-{
-  *walk_subtrees = 0;
-}
+*walk_subtrees = 0;
+  else if (typedef_variant_p (*tp))
+/* Look through typedefs despite cp_walk_subtrees.  */
+this_vis = type_visibility (DECL_ORIGINAL_TYPE (TYPE_NAME (*tp)));
   else if (OVERLOAD_TYPE_P (*tp)
   && !TREE_PUBLIC (TYPE_MAIN_DECL (*tp)))
 {
-  *vis_p = VISIBILITY_ANON;
-  return *tp;
+  this_vis = VISIBILITY_ANON;
+  *walk_subtrees = 0;
+}
+  else if (CLASS_TYPE_P (*tp))
+{
+  this_vis = CLASSTYPE_VISIBILITY (*tp);
+  *walk_subtrees = 0;
 }
-  else if (CLASS_TYPE_P (*tp)
-  && CLASSTYPE_VISIBILITY (*tp) > *vis_p)
-*vis_p = CLASSTYPE_VISIBILITY (*tp);
   else if (TREE_CODE (*tp) == ARRAY_TYPE
   && uses_template_parms (TYPE_DOMAIN (*tp)))
-{
-  int evis = expr_visibility (TYPE_MAX_VALUE (TYPE_DOMAIN (*tp)));
-  if (evis > *vis_p)
-   *vis_p = evis;
-}
+this_vis = expr_visibility (TYPE_MAX_VALUE (TYPE_DOMAIN (*tp)));
+
+  if (this_vis > *vis_p)
+*vis_p = this_vis;
+
   return NULL;
 }
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 28f3c90f17b..86f1bb7470d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3963,12 +3963,6 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, void* data)
       &find_parameter_packs_r, ppd, ppd->visited);
   return NULL_TREE;
 
-case TYPENAME_TYPE:
-  cp_walk_tree (&TYPENAME_TYPE_FULLNAME (t), &find_parameter_packs_r,
-   ppd, ppd->visited);
-  *walk_subtrees = 0;
-  return NULL_TREE;
-
 case TYPE_PACK_EXPANSION:
 case EXPR_PACK_EXPANSION:
   *walk_subtrees = 0;
@@ -10321,6 +10315,7 @@ for_each_template_parm_r (tree *tp, int *walk_subtrees, void *d)
   /* A template-id in a TYPENAME_TYPE might be a deduced context after
 partial instantiation.  */
   WALK_SUBTREE (TYPENAME_TYPE_FULLNAME (t));
+  *walk_subtrees = 0;
   break;
 
 case CONSTRUCTOR:
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 8840932dba2..d526a6311e0 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -5006,9 +5006,18 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p, walk_tree_fn func,
   while (0)
 
   if (TYPE_P (*tp))
-/* Walk into template args without looking through typedefs.  */
-if (tree ti = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (*tp))
-  WALK_SUBTREE (TI_ARGS (ti));
+{
+  /* Walk into template args without looking through typedefs.  */
+  if (tree ti = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (*tp))
+   WALK_SUBTREE (TI_ARGS (ti));
+  /* Don't look through typedefs; walk_tree_fns that want to look through
+typedefs (like min_vis_r) need to do that themselves.  */
+  if (typedef_variant_p (*tp))
+   {
+ *walk_subtrees_p = 0;
+ return NULL_TREE;
+   }
+}
 
   /* Not one of the easy cases.  We must explicitly go through the
  children.  */
@@ -5021,7 +5030,6 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p, walk_tree_fn func,
 case UNBOUND_CLASS_TEMPLATE:
 case TEMPLATE_PARM_INDEX:
 case TEMPLATE_TYPE_PARM:
-case TYPENAME_TYPE:
 case TYPEOF_TYPE:
 case UNDERLYING_TYPE:
   /* None of these have subtrees other than those already walked
@@ -5029,6 +5037,12 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p, walk_tree_fn func,
   *walk_subtrees_p = 0;
   break;
 
+case TYPENAME_TYPE:
+  WALK_SUBTREE (TYPE_CONTEXT (*tp));
+  WALK_SUBTREE (TYPENAME_TYPE_FULLNAME (*tp));
+  *walk_subtrees_p = 0;
+  break;
+
 case BASELINK:
   if (BASELINK_QUALIFIED_P (*tp))

[pushed] c++: Fix specialization of constrained member template.

2020-05-11 Thread Jason Merrill via Gcc-patches

The resolution of comment CA104 clarifies that we need to do direct
substitution of constraints in order to determine which member template
corresponds to an explicit specialization.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

Resolve C++20 NB comment CA104
* pt.c (determine_specialization): Compare constraints for
specialization of member template of class instantiation.
---
 gcc/cp/pt.c | 28 ++---
 gcc/testsuite/g++.dg/cpp2a/concepts-spec1.C | 10 
 gcc/testsuite/g++.dg/template/nontype18.C   |  2 +-
 3 files changed, 36 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-spec1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 86f1bb7470d..84864561c25 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -2282,8 +2282,29 @@ determine_specialization (tree template_id,
  below. */
  if (tsk == tsk_template)
{
- if (compparms (fn_arg_types, decl_arg_types))
-   candidates = tree_cons (NULL_TREE, fn, candidates);
+ if (!comp_template_parms (DECL_TEMPLATE_PARMS (fn),
+   current_template_parms))
+   continue;
+ if (!same_type_p (TREE_TYPE (TREE_TYPE (decl)),
+   TREE_TYPE (TREE_TYPE (fn))))
+   continue;
+ if (!compparms (fn_arg_types, decl_arg_types))
+   continue;
+
+ tree freq = get_trailing_function_requirements (fn);
+ tree dreq = get_trailing_function_requirements (decl);
+ if (!freq != !dreq)
+   continue;
+ if (freq)
+   {
+ tree fargs = DECL_TI_ARGS (fn);
+ tsubst_flags_t complain = tf_none;
+ freq = tsubst_constraint (freq, fargs, complain, fn);
+ if (!cp_tree_equal (freq, dreq))
+   continue;
+   }
+
+ candidates = tree_cons (NULL_TREE, fn, candidates);
  continue;
}
 
@@ -2472,7 +2493,8 @@ determine_specialization (tree template_id,
   *targs_out = copy_node (DECL_TI_ARGS (fn));
 
   /* Propagate the candidate's constraints to the declaration.  */
-  set_constraints (decl, get_constraints (fn));
+  if (tsk != tsk_template)
+   set_constraints (decl, get_constraints (fn));
 
   /* DECL is a re-declaration or partial instantiation of a template
 function.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-spec1.C b/gcc/testsuite/g++.dg/cpp2a/concepts-spec1.C
new file mode 100644
index 000..5001813d7b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-spec1.C
@@ -0,0 +1,10 @@
+// Example from CA 104 proposal.
+// { dg-do compile { target concepts } }
+
+template  concept C = sizeof(T) == 8;
+template  struct A {
+  template  U f(U) requires C; // #1
+  template  U f(U) requires C; // #2
+};
+
+template <> template  U A::f(U) requires C { } // OK, specializes #2
diff --git a/gcc/testsuite/g++.dg/template/nontype18.C b/gcc/testsuite/g++.dg/template/nontype18.C
index cbe0a1b5a0d..b68416dca61 100644
--- a/gcc/testsuite/g++.dg/template/nontype18.C
+++ b/gcc/testsuite/g++.dg/template/nontype18.C
@@ -5,4 +5,4 @@ template struct A
 template void foo();
 };
 
-template template void A<0>::foo() {} // { dg-error "template parameter" }
+template template void A<0>::foo() {} // { dg-error "" }

base-commit: 42e9f80bf4f6a38733c221c03a512c432cdb784f
-- 
2.18.1

[RFC PATCH] i386: Add V2SFmode FMA insn patterns [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches

Attached patch implements V2SFmode FMA insn patterns. Patched compiler
vectorizes FMA, FMS and FNMA instructions, but for some reason fails
to vectorize FNMS.

I have double-checked that the insn pattern is correct, and now I'm
out of ideas as to what could be wrong with the pattern, which is still
ignored by the vectorizer. -fno-vect-cost-model does not help, so it's time to
ask the experts...
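
For reference, the four operations behind the pattern names can be sketched in scalar form with std::fma.  The function names below simply mirror the pattern names and are illustrative only; they say nothing about why the vectorizer misses the last one.

```cpp
#include <cassert>
#include <cmath>

// Scalar sketches of the four fused operations; names are illustrative.
inline float scalar_fma (float a, float b, float c)  { return std::fma (a, b, c); }   //   a*b + c
inline float scalar_fms (float a, float b, float c)  { return std::fma (a, b, -c); }  //   a*b - c
inline float scalar_fnma (float a, float b, float c) { return std::fma (-a, b, c); }  // -(a*b) + c
inline float scalar_fnms (float a, float b, float c) { return std::fma (-a, b, -c); } // -(a*b) - c
```

Note that FNMS is the only variant with a negation on both the product and the addend.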

gcc/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* config/i386/mmx.md (fmav2sf4): New insn pattern.
(fmsv2sf4): Ditto.
(fnmav2sf4): Ditto.
(fnmsv2sf4): Ditto.

testsuite/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* gcc.target/i386/pr95046-2.c: New test.

Otherwise, the patch is bootstrapped and regression tested on
x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index a8f603b94f8..0024ce761d7 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -345,6 +345,70 @@
(set_attr "prefix" "*,orig,vex")
(set_attr "mode" "V2SF,V4SF,V4SF")])
 
+(define_insn "fmav2sf4"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,v,x")
+   (fma:V2SF
+ (match_operand:V2SF 1 "register_operand" "%0,v,x")
+ (match_operand:V2SF 2 "register_operand" "v,v,x")
+ (match_operand:V2SF 3 "register_operand" "v,0,x")))]
+  "(TARGET_FMA || TARGET_FMA4) && TARGET_MMX_WITH_SSE"
+  "@
+   vfmadd132ps\t{%2, %3, %0|%0, %3, %2}
+   vfmadd231ps\t{%2, %1, %0|%0, %1, %2}
+   vfmaddps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "fma,fma,fma4")
+   (set_attr "type" "ssemuladd")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "fmsv2sf4"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,v,x")
+   (fma:V2SF
+ (match_operand:V2SF   1 "register_operand" "%0,v,x")
+ (match_operand:V2SF   2 "register_operand" "v,v,x")
+ (neg:V2SF
+   (match_operand:V2SF 3 "register_operand" "v,0,x"))))]
+  "(TARGET_FMA || TARGET_FMA4) && TARGET_MMX_WITH_SSE"
+  "@
+   vfmsub132ps\t{%2, %3, %0|%0, %3, %2}
+   vfmsub231ps\t{%2, %1, %0|%0, %1, %2}
+   vfmsubps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "fma,fma,fma4")
+   (set_attr "type" "ssemuladd")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "fnmav2sf4"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,v,x")
+   (fma:V2SF
+ (neg:V2SF
+   (match_operand:V2SF 1 "register_operand" "%0,v,x"))
+ (match_operand:V2SF   2 "register_operand" "v,v,x")
+ (match_operand:V2SF   3 "register_operand" "v,0,x")))]
+  "(TARGET_FMA || TARGET_FMA4) && TARGET_MMX_WITH_SSE"
+  "@
+   vfnmadd132ps\t{%2, %3, %0|%0, %3, %2}
+   vfnmadd231ps\t{%2, %1, %0|%0, %1, %2}
+   vfnmaddps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "fma,fma,fma4")
+   (set_attr "type" "ssemuladd")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "fnmsv2sf4"
+  [(set (match_operand:V2SF 0 "register_operand" "=v,v,x")
+   (fma:V2SF
+ (neg:V2SF
+   (match_operand:V2SF 1 "register_operand" "%0,v,x"))
+ (match_operand:V2SF   2 "register_operand" "v,v,x")
+ (neg:V2SF
+   (match_operand:V2SF 3 "register_operand" "v,0,x"))))]
+  "(TARGET_FMA || TARGET_FMA4) && TARGET_MMX_WITH_SSE"
+  "@
+   vfnmsub132ps\t{%2, %3, %0|%0, %3, %2}
+   vfnmsub231ps\t{%2, %1, %0|%0, %1, %2}
+   vfnmsubps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "fma,fma,fma4")
+   (set_attr "type" "ssemuladd")
+   (set_attr "mode" "V4SF")])
+
 (define_expand "mmx_v2sf3"
   [(set (match_operand:V2SF 0 "register_operand")
 (smaxmin:V2SF
/* PR target/95046 */
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O3 -mfma" } */


float r[2], a[2], b[2], c[2];

void
test_fma (void)
{
  for (int i = 0; i < 2; i++)
r[i] = a[i] * b[i] + c[i];
}

/* { dg-final { scan-assembler "fmadd132ps" } } */

void
test_fms (void)
{
  for (int i = 0; i < 2; i++)
r[i] = a[i] * b[i] - c[i];
}

/* { dg-final { scan-assembler "fmsub132ps" } } */

void
test_fnma (void)
{
  for (int i = 0; i < 2; i++)
r[i] = -(a[i] * b[i]) + c[i];
}

/* { dg-final { scan-assembler "fnmadd132ps" } } */

void
test_fnms (void)
{
  for (int i = 0; i < 2; i++)
r[i] = -(a[i] * b[i]) - c[i];
}

/* { dg-final { scan-assembler "fnmsub132ps" } } */

[pushed] c++: Better diagnostic in converted const expr.

2020-05-11 Thread Jason Merrill via Gcc-patches

This improves the diagnostic from

error: could not convert ‘((A<>*)(void)0)->A<>::e’ from
   ‘’ to ‘bool’

to

error: cannot convert ‘A<>::e’ from type ‘void (A<>::)()’ to type ‘bool’

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* call.c (implicit_conversion_error): Split out from...
(perform_implicit_conversion_flags): ...here.
(build_converted_constant_expr_internal): Use it.
---
 gcc/cp/call.c | 41 +--
 gcc/testsuite/g++.dg/cpp0x/noexcept30.C   |  2 +-
 gcc/testsuite/g++.dg/cpp0x/noexcept58.C   |  9 +
 gcc/testsuite/g++.dg/template/crash87.C   |  2 +-
 gcc/testsuite/g++.dg/template/nontype13.C |  2 +-
 5 files changed, 36 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept58.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index aca12c74c25..85d670f52f9 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4282,6 +4282,28 @@ build_user_type_conversion (tree totype, tree expr, int flags,
   return ret;
 }
 
+/* Give a helpful diagnostic when implicit_conversion fails.  */
+
+static void
+implicit_conversion_error (location_t loc, tree type, tree expr)
+{
+  tsubst_flags_t complain = tf_warning_or_error;
+
+  /* If expr has unknown type, then it is an overloaded function.
+ Call instantiate_type to get good error messages.  */
+  if (TREE_TYPE (expr) == unknown_type_node)
+instantiate_type (type, expr, complain);
+  else if (invalid_nonstatic_memfn_p (loc, expr, complain))
+/* We gave an error.  */;
+  else
+{
+  range_label_for_type_mismatch label (TREE_TYPE (expr), type);
+  gcc_rich_location rich_loc (loc, &label);
+  error_at (&rich_loc, "could not convert %qE from %qH to %qI",
+   expr, TREE_TYPE (expr), type);
+}
+}
+
 /* Worker for build_converted_constant_expr.  */
 
 static tree
@@ -4397,8 +4419,7 @@ build_converted_constant_expr_internal (tree type, tree expr,
   else
 {
   if (complain & tf_error)
-   error_at (loc, "could not convert %qE from %qH to %qI", expr,
- TREE_TYPE (expr), type);
+   implicit_conversion_error (loc, type, expr);
   expr = error_mark_node;
 }
 
@@ -11845,21 +11866,7 @@ perform_implicit_conversion_flags (tree type, tree expr,
   if (!conv)
 {
   if (complain & tf_error)
-   {
- /* If expr has unknown type, then it is an overloaded function.
-Call instantiate_type to get good error messages.  */
- if (TREE_TYPE (expr) == unknown_type_node)
-   instantiate_type (type, expr, complain);
- else if (invalid_nonstatic_memfn_p (loc, expr, complain))
-   /* We gave an error.  */;
- else
-   {
- range_label_for_type_mismatch label (TREE_TYPE (expr), type);
- gcc_rich_location rich_loc (loc, &label);
- error_at (&rich_loc, "could not convert %qE from %qH to %qI",
-   expr, TREE_TYPE (expr), type);
-   }
-   }
+   implicit_conversion_error (loc, type, expr);
   expr = error_mark_node;
 }
   else if (processing_template_decl && conv->kind != ck_identity)
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept30.C b/gcc/testsuite/g++.dg/cpp0x/noexcept30.C
index 6a9f7821092..1075c69a491 100644
--- a/gcc/testsuite/g++.dg/cpp0x/noexcept30.C
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept30.C
@@ -5,7 +5,7 @@
 template
 struct F {
   template
-  void f() noexcept(::template f) {} // { dg-error "exception specification|convert" }
+  void f() noexcept(::template f) {} // { dg-error "exception specification|convert|resolve" }
 };
 
 int main () {
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept58.C b/gcc/testsuite/g++.dg/cpp0x/noexcept58.C
new file mode 100644
index 000..0a145e030a5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept58.C
@@ -0,0 +1,9 @@
+// PR c++/90748
+// { dg-do compile { target c++11 } }
+
+template  class A
+{
+  void e ();
+  bool f (int() noexcept(e));  // { dg-error "::e" }
+};
+A<> b;
diff --git a/gcc/testsuite/g++.dg/template/crash87.C b/gcc/testsuite/g++.dg/template/crash87.C
index af81edbfd80..7da6623612a 100644
--- a/gcc/testsuite/g++.dg/template/crash87.C
+++ b/gcc/testsuite/g++.dg/template/crash87.C
@@ -17,7 +17,7 @@ template 
 class BUG2 : BUG
 {
 public:
- typedef BUG1_5 ptr; // { dg-error "convert" }
+ typedef BUG1_5 ptr; // { dg-error "BUG::name" }
 };
 
 int main()
diff --git a/gcc/testsuite/g++.dg/template/nontype13.C b/gcc/testsuite/g++.dg/template/nontype13.C
index 3250109aa4a..4d6b323ed64 100644
--- a/gcc/testsuite/g++.dg/template/nontype13.C
+++ b/gcc/testsuite/g++.dg/template/nontype13.C
@@ -11,7 +11,7 @@ struct Dummy
   template
   void tester()
   {
-bar()(); // { dg-error "constant|template|convert" }
+bar()(); // { dg-error "constant|template|convert|member function" }
   }
   template
   struct bar

base-commit: 1422c2e4462c9b7c44aa035ac56af77565556181
--

[pushed] c++: Use of 'this' in parameter declaration [PR90748]

2020-05-11 Thread Jason Merrill via Gcc-patches

We were incorrectly accepting the use of 'this' at parse time and then
crashing when we tried to instantiate it.  It is invalid because 'this' is
not in scope until after the function-cv-quals.  So let's hoist setting
current_class_ptr up from cp_parser_late_return_type_opt into
cp_parser_direct_declarator where it can work for noexcept as well.
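
A small sketch of the scoping rule being enforced (DR 1207), using a made-up Counter class: 'this' may appear once the cv-qualifiers have been seen, so it is usable in the noexcept-specifier and the trailing return type but not in a parameter declaration.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate where 'this' is in scope.
struct Counter {
  int value = 0;
  int get() const noexcept { return value; }

  // 'this' appears after the cv-qualifier 'const', so it is in scope in both
  // the noexcept-specifier and the trailing return type.
  auto twice() const noexcept(noexcept(this->get())) -> decltype(this->get())
  { return 2 * this->get(); }

  // Invalid: a parameter declaration comes before the cv-qualifiers, so
  // 'this' is not yet in scope there.
  // bool bad(int (*p)() noexcept(noexcept(this->get())));
};
```

The commented-out declaration is in the same spirit as the new noexcept58.C test, which expects an error rather than a crash.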

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

PR c++/90748
* parser.c (inject_parm_decls): Set current_class_ptr here.
(cp_parser_direct_declarator): And here.
(cp_parser_late_return_type_opt): Not here.
(cp_parser_noexcept_specification_opt): Nor here.
(cp_parser_exception_specification_opt)
(cp_parser_late_noexcept_specifier): Remove unneeded parameters.
---
 gcc/cp/parser.c | 87 ++---
 gcc/testsuite/g++.dg/cpp0x/noexcept59.C | 10 +++
 2 files changed, 46 insertions(+), 51 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept59.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 591f44f4934..10627cb1c92 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -246,7 +246,7 @@ static void cp_lexer_stop_debugging
 static cp_token_cache *cp_token_cache_new
   (cp_token *, cp_token *);
 static tree cp_parser_late_noexcept_specifier
-  (cp_parser *, tree, tree);
+  (cp_parser *, tree);
 static void noexcept_override_late_checks
   (tree, tree);
 
@@ -2246,7 +2246,7 @@ static cp_ref_qualifier cp_parser_ref_qualifier_opt
 static tree cp_parser_tx_qualifier_opt
   (cp_parser *);
 static tree cp_parser_late_return_type_opt
-  (cp_parser *, cp_declarator *, tree &, cp_cv_quals);
+  (cp_parser *, cp_declarator *, tree &);
 static tree cp_parser_declarator_id
   (cp_parser *, bool);
 static tree cp_parser_type_id
@@ -2385,11 +2385,11 @@ static tree cp_parser_exception_declaration
 static tree cp_parser_throw_expression
   (cp_parser *);
 static tree cp_parser_exception_specification_opt
-  (cp_parser *, cp_parser_flags, cp_cv_quals);
+  (cp_parser *, cp_parser_flags);
 static tree cp_parser_type_id_list
   (cp_parser *);
 static tree cp_parser_noexcept_specification_opt
-  (cp_parser *, cp_parser_flags, bool, bool *, bool, cp_cv_quals);
+  (cp_parser *, cp_parser_flags, bool, bool *, bool);
 
 /* GNU Extensions */
 
@@ -11082,8 +11082,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
 
   /* Parse optional exception specification.  */
   exception_spec
-   = cp_parser_exception_specification_opt (parser, CP_PARSER_FLAGS_NONE,
-quals);
+   = cp_parser_exception_specification_opt (parser, CP_PARSER_FLAGS_NONE);
 
   std_attrs = cp_parser_std_attribute_spec_seq (parser);
 
@@ -21227,11 +21226,17 @@ cp_parser_direct_declarator (cp_parser* parser,
  ref_qual = cp_parser_ref_qualifier_opt (parser);
  /* Parse the tx-qualifier.  */
  tree tx_qual = cp_parser_tx_qualifier_opt (parser);
- /* And the exception-specification.  */
+
+ tree save_ccp = current_class_ptr;
+ tree save_ccr = current_class_ref;
+ if (memfn)
+   /* DR 1207: 'this' is in scope after the cv-quals.  */
+   inject_this_parameter (current_class_type, cv_quals);
+
+ /* Parse the exception-specification.  */
  exception_specification
= cp_parser_exception_specification_opt (parser,
-flags,
-cv_quals);
+flags);
 
  attrs = cp_parser_std_attribute_spec_seq (parser);
 
@@ -21241,8 +21246,7 @@ cp_parser_direct_declarator (cp_parser* parser,
  tree gnu_attrs = NULL_TREE;
  tree requires_clause = NULL_TREE;
  late_return = (cp_parser_late_return_type_opt
-(parser, declarator, requires_clause,
- memfn ? cv_quals : -1));
+(parser, declarator, requires_clause));
 
  /* Parse the virt-specifier-seq.  */
  virt_specifiers = cp_parser_virt_specifier_seq_opt (parser);
@@ -21264,6 +21268,9 @@ cp_parser_direct_declarator (cp_parser* parser,
 function.  */
  parser->default_arg_ok_p = false;
 
+ current_class_ptr = save_ccp;
+ current_class_ref = save_ccr;
+
  /* Restore the state of local_variables_forbidden_p.  */
  parser->local_variables_forbidden_p
= local_variables_forbidden_p;
@@ -22077,7 +22084,7 @@ parsing_nsdmi (void)
 
 static tree
 cp_parser_late_return_type_opt (cp_parser* parser,

[Patch, committed] PR fortran/95053 - ICE in gfc_divide(): Bad basic type

2020-05-11 Thread Harald Anlauf

Committed as obvious.

Sorry for the breakage.


PR fortran/95053 - ICE in gfc_divide(): Bad basic type

The fix for PR 93499 introduced an overly strict check in gfc_divide
that could trigger errors in the early parsing phase.  Relax the
check and defer it to a later stage.

gcc/fortran/

2020-05-11  Harald Anlauf  

PR fortran/95053
* arith.c (gfc_divide): Do not error out if operand 2 is
non-numeric.  Defer checks to later stage.

gcc/testsuite/

2020-05-11  Harald Anlauf  

PR fortran/95053
* gfortran.dg/pr95053.f: New test.


diff --git a/gcc/fortran/arith.c b/gcc/fortran/arith.c
index 1cd0867a941..dd72f44d377 100644
--- a/gcc/fortran/arith.c
+++ b/gcc/fortran/arith.c
@@ -1828,7 +1828,8 @@ gfc_divide (gfc_expr *op1, gfc_expr *op2)
rc = ARITH_DIV0;
  break;
default:
- gfc_internal_error ("gfc_divide(): Bad basic type");
+ /* basic type is non-numeric, handle this elsewhere.  */
+ break;
}
   if (rc == ARITH_DIV0)
{
diff --git a/gcc/testsuite/gfortran.dg/pr95053.f b/gcc/testsuite/gfortran.dg/pr95053.f
new file mode 100644
index 000..1d15c669467
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95053.f
@@ -0,0 +1,7 @@
+! { dg-do compile }
+! PR fortran/95053 - ICE in gfc_divide(): Bad basic type
+!
+ 123  FORMAT ('A'/'B')
+ 132  FORMAT (A/
+ + ' B')
+  END

[pushed] c++: Remove redundant code.

2020-05-11 Thread Jason Merrill via Gcc-patches

We walk the lambda captures in cp_walk_subtrees, so we don't also need to
walk them here.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* pt.c (find_parameter_packs_r) [LAMBDA_EXPR]: Remove redundant
walking of capture list.
---
 gcc/cp/pt.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c6091127225..112426af72a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3988,18 +3988,12 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
 
 case LAMBDA_EXPR:
   {
-   /* Look at explicit captures.  */
-   for (tree cap = LAMBDA_EXPR_CAPTURE_LIST (t);
-cap; cap = TREE_CHAIN (cap))
- cp_walk_tree (&TREE_VALUE (cap), &find_parameter_packs_r, ppd,
-   ppd->visited);
/* Since we defer implicit capture, look in the parms and body.  */
tree fn = lambda_function (t);
cp_walk_tree (&TREE_TYPE (fn), &find_parameter_packs_r, ppd,
  ppd->visited);
cp_walk_tree (&DECL_SAVED_TREE (fn), &find_parameter_packs_r, ppd,
  ppd->visited);
-   *walk_subtrees = 0;
return NULL_TREE;
   }
 

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

[pushed] c++: Make references to __cxa_pure_virtual weak.

2020-05-11 Thread Jason Merrill via Gcc-patches

If a program has no other dependencies on libstdc++, we shouldn't require it
just for __cxa_pure_virtual, which is only there to give a prettier
diagnostic before crashing the program; resolving the reference to NULL will
also crash, just without the diagnostic.
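
To illustrate the behaviour described above, here is a minimal sketch of how an undefined weak reference behaves on a target that supports weak symbols (for example ELF with GCC). The symbol name optional_diagnostic_hook is hypothetical and is deliberately never defined; it is not part of the patch.

```cpp
// Hypothetical symbol: declared weak and never defined anywhere in this
// sketch, so the linker resolves references to it to a null address.
extern "C" void optional_diagnostic_hook() __attribute__((weak));

// Returns true only when some object or library supplied a definition.
inline bool hook_available() {
  return optional_diagnostic_hook != nullptr;
}

// Calls the hook if a definition was linked in; otherwise does nothing.
// Calling it unconditionally would jump to address zero and crash,
// which is the "crash without the diagnostic" case from the message.
inline void call_hook_if_present() {
  if (hook_available())
    optional_diagnostic_hook();
}
```

With no definition linked in, hook_available() is expected to return false and the program still links, which is the property the new test relies on.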

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* decl.c (cxx_init_decl_processing): Call declare_weak for
__cxa_pure_virtual.
---
 gcc/cp/decl.c|  3 +++
 gcc/testsuite/g++.dg/abi/pure-virtual1.C | 21 +
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/abi/pure-virtual1.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index dea1ba07c0e..1b6a5672334 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -4544,6 +4544,9 @@ cxx_init_decl_processing (void)
   abort_fndecl
 = build_library_fn_ptr ("__cxa_pure_virtual", void_ftype,
ECF_NORETURN | ECF_NOTHROW | ECF_COLD);
+  if (flag_weak)
+/* If no definition is available, resolve references to NULL.  */
+declare_weak (abort_fndecl);
 
   /* Perform other language dependent initializations.  */
   init_class_processing ();
diff --git a/gcc/testsuite/g++.dg/abi/pure-virtual1.C 
b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
new file mode 100644
index 000..823328ea951
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/pure-virtual1.C
@@ -0,0 +1,21 @@
+// Test that we don't need libsupc++ just for __cxa_pure_virtual.
+// { dg-do link }
+// { dg-require-weak }
+// { dg-additional-options "-fno-rtti -nodefaultlibs -lc" }
+
+struct A
+{
+  int i;
+  virtual void f() = 0;
+  A(): i(0) {}
+};
+
+struct B: A
+{
+  virtual void f() {}
+};
+
+int main()
+{
+  B b;
+}

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

[pushed] c++: Improve print_tree of static_assert.

2020-05-11 Thread Jason Merrill via Gcc-patches

We weren't printing the condition and message of a STATIC_ASSERT.

It's also unnecessary to duplicate the code for instantiating a
STATIC_ASSERT between tsubst_expr and instantiate_class_template_1.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* pt.c (instantiate_class_template_1): Call tsubst_expr for
STATIC_ASSERT member.
* ptree.c (cxx_print_xnode): Handle STATIC_ASSERT.
---
 gcc/cp/pt.c| 17 ++---
 gcc/cp/ptree.c | 11 +++
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 112426af72a..28f3c90f17b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -11809,21 +11809,8 @@ instantiate_class_template_1 (tree type)
{
  /* Build new TYPE_FIELDS.  */
   if (TREE_CODE (t) == STATIC_ASSERT)
-{
-  tree condition;
-
- ++c_inhibit_evaluation_warnings;
- condition =
-   tsubst_expr (STATIC_ASSERT_CONDITION (t), args,
-tf_warning_or_error, NULL_TREE,
-/*integral_constant_expression_p=*/true);
- --c_inhibit_evaluation_warnings;
-
-  finish_static_assert (condition,
-STATIC_ASSERT_MESSAGE (t), 
-STATIC_ASSERT_SOURCE_LOCATION (t),
-/*member_p=*/true);
-}
+   tsubst_expr (t, args, tf_warning_or_error, NULL_TREE,
+/*integral_constant_expression_p=*/true);
  else if (TREE_CODE (t) != CONST_DECL)
{
  tree r;
diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 285028a4841..ab18eecd0e6 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -269,6 +269,17 @@ cxx_print_xnode (FILE *file, tree node, int indent)
 case LAMBDA_EXPR:
   cxx_print_lambda_node (file, node, indent);
   break;
+case STATIC_ASSERT:
+  if (location_t loc = STATIC_ASSERT_SOURCE_LOCATION (node))
+   {
+ expanded_location xloc = expand_location (loc);
+ indent_to (file, indent+4);
+ fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column);
+   }
+  print_node (file, "condition", STATIC_ASSERT_CONDITION (node), indent+4);
+  if (tree message = STATIC_ASSERT_MESSAGE (node))
+   print_node (file, "message", message, indent+4);
+  break;
 default:
   break;
 }

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

[pushed] c++: Remove LOOKUP_EXPLICIT_TMPL_ARGS.

2020-05-11 Thread Jason Merrill via Gcc-patches

This flag is redundant with the explicit_targs field in the overload
candidate information.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* cp-tree.h (LOOKUP_EXPLICIT_TMPL_ARGS): Remove.
* call.c (build_new_function_call): Don't set it.
(build_new_method_call_1): Likewise.
(build_over_call): Check cand->explicit_targs instead.
---
 gcc/cp/cp-tree.h |  4 +---
 gcc/cp/call.c| 14 ++
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c4b81428e14..f7c11bcf838 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5603,10 +5603,8 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, 
TYPENAME_FLAG };
 /* Used in calls to store_init_value to suppress its usual call to
digest_init.  */
 #define LOOKUP_ALREADY_DIGESTED (LOOKUP_DEFAULTED << 1)
-/* An instantiation with explicit template arguments.  */
-#define LOOKUP_EXPLICIT_TMPL_ARGS (LOOKUP_ALREADY_DIGESTED << 1)
 /* Like LOOKUP_NO_TEMP_BIND, but also prevent binding to xvalues.  */
-#define LOOKUP_NO_RVAL_BIND (LOOKUP_EXPLICIT_TMPL_ARGS << 1)
+#define LOOKUP_NO_RVAL_BIND (LOOKUP_ALREADY_DIGESTED << 1)
 /* Used by case_conversion to disregard non-integral conversions.  */
 #define LOOKUP_NO_NON_INTEGRAL (LOOKUP_NO_RVAL_BIND << 1)
 /* Used for delegating constructors in order to diagnose self-delegation.  */
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index dbce3866fd8..aca12c74c25 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4600,15 +4600,7 @@ build_new_function_call (tree fn, vec<tree, va_gc> 
**args,
 }
   else
 {
-  int flags = LOOKUP_NORMAL;
-  /* If fn is template_id_expr, the call has explicit template arguments
- (e.g. func<int>(5)), communicate this info to build_over_call
- through flags so that later we can use it to decide whether to warn
- about peculiar null pointer conversion.  */
-  if (TREE_CODE (fn) == TEMPLATE_ID_EXPR)
-flags |= LOOKUP_EXPLICIT_TMPL_ARGS;
-
-  result = build_over_call (cand, flags, complain);
+  result = build_over_call (cand, LOOKUP_NORMAL, complain);
 }
 
   if (flag_coroutines
@@ -8773,7 +8765,7 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
   if (null_node_p (arg)
   && DECL_TEMPLATE_INFO (fn)
   && cand->template_decl
-  && !(flags & LOOKUP_EXPLICIT_TMPL_ARGS))
+ && !cand->explicit_targs)
 conversion_warning = false;
 
   /* Set user_conv_p on the argument conversions, so rvalue/base handling
@@ -10345,8 +10337,6 @@ build_new_method_call_1 (tree instance, tree fns, 
vec<tree, va_gc> **args,
 
  if (call != error_mark_node)
{
-  if (explicit_targs)
-flags |= LOOKUP_EXPLICIT_TMPL_ARGS;
  /* Now we know what function is being called.  */
  if (fn_p)
*fn_p = fn;

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

[pushed] c++: Avoid unnecessary deprecated warnings.

2020-05-11 Thread Jason Merrill via Gcc-patches

There's no need to warn that a deprecated function uses a deprecated type,
that just adds noise.  We were preventing that in start_decl, but that
didn't help member declarations that go through grokfield.  So handle it in
grokdeclarator instead, which is shared between them.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* decl.c (grokdeclarator): Adjust deprecated_state here.
(start_decl): Not here.
---
 gcc/cp/decl.c| 18 +++---
 gcc/testsuite/g++.dg/warn/deprecated-6.C |  2 +-
 gcc/testsuite/g++.dg/warn/deprecated.C   |  2 +-
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 73a06a60786..adf94658420 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5214,18 +5214,11 @@ start_decl (const cp_declarator *declarator,
 
   *pushed_scope_p = NULL_TREE;
 
-  /* An object declared as __attribute__((deprecated)) suppresses
- warnings of uses of other deprecated items.  */
-  if (lookup_attribute ("deprecated", attributes))
-deprecated_state = DEPRECATED_SUPPRESS;
-
   attributes = chainon (attributes, prefix_attributes);
 
   decl = grokdeclarator (declarator, declspecs, NORMAL, initialized,
 &attributes);
 
-  deprecated_state = DEPRECATED_NORMAL;
-
   if (decl == NULL_TREE || VOID_TYPE_P (decl)
   || decl == error_mark_node)
 return error_mark_node;
@@ -11318,6 +11311,17 @@ grokdeclarator (const cp_declarator *declarator,
   type = NULL_TREE;
   type_was_error_mark_node = true;
 }
+
+  /* Ignore erroneous attributes.  */
+  if (attrlist && *attrlist == error_mark_node)
+*attrlist = NULL_TREE;
+
+  /* An object declared as __attribute__((deprecated)) suppresses
+ warnings of uses of other deprecated items.  */
+  temp_override<deprecated_states> ds (deprecated_state);
+  if (attrlist && lookup_attribute ("deprecated", *attrlist))
+deprecated_state = DEPRECATED_SUPPRESS;
+
   cp_warn_deprecated_use (type);
   if (type && TREE_CODE (type) == TYPE_DECL)
 {
diff --git a/gcc/testsuite/g++.dg/warn/deprecated-6.C 
b/gcc/testsuite/g++.dg/warn/deprecated-6.C
index b3c390be385..605b507f534 100644
--- a/gcc/testsuite/g++.dg/warn/deprecated-6.C
+++ b/gcc/testsuite/g++.dg/warn/deprecated-6.C
@@ -89,7 +89,7 @@ struct SS2 *p2;   /* { dg-warning 
"'SS2' is deprecated: Please avoid SS2" } */
 class T {
   public:
 void member1(int) __attribute__ ((deprecated("Please avoid member1")));
-void member2(INT1) __attribute__ ((__deprecated__("Please avoid 
member2"))); /* { dg-warning "'INT1' is deprecated" } */
+void member2(INT1) __attribute__ ((__deprecated__("Please avoid 
member2")));
 int member3(T *);
 int x;
 } __attribute__ ((deprecated("Please avoid T")));
diff --git a/gcc/testsuite/g++.dg/warn/deprecated.C 
b/gcc/testsuite/g++.dg/warn/deprecated.C
index c5ccbf3271f..3817e620250 100644
--- a/gcc/testsuite/g++.dg/warn/deprecated.C
+++ b/gcc/testsuite/g++.dg/warn/deprecated.C
@@ -93,7 +93,7 @@ struct SS2 *p2;   /* { dg-warning 
"'SS2' is deprecated" } */
 class T {
   public:
 void member1(int) __attribute__ ((deprecated));
-void member2(INT1) __attribute__ ((__deprecated__)); /* { dg-warning 
"'INT1' is deprecated" } */
+void member2(INT1) __attribute__ ((__deprecated__));
 int member3(T *);
 int x;
 } __attribute__ ((deprecated));

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

[pushed] c++: Tweak VLA representation.

2020-05-11 Thread Jason Merrill via Gcc-patches

If we put the SAVE_EXPR for a VLA size inside the MINUS_EXPR rather than
outside, it will work better with constant folding.

The equivalent change was made in the C front-end in 2004, in commit
r0-64535-g8b0b9aefd29dfe6398857bcf5628662e2f0e21f6
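
As a rough illustration of why the placement matters, the toy model below treats a SAVE_EXPR as a barrier that constants cannot be folded across. SizeExpr and its helpers are hypothetical names invented for this sketch and do not correspond to GCC's tree API.

```cpp
// Toy model of a size expression: (n + inner), optionally frozen by a
// SAVE_EXPR-like barrier, plus a constant applied outside the barrier.
struct SizeExpr {
  bool saved;  // true once the value has been wrapped in the barrier
  long inner;  // constant folded in before the barrier was applied
  long outer;  // constant folded in after the barrier was applied
};

// The bare variable size n.
constexpr SizeExpr variable_n() { return SizeExpr{false, 0, 0}; }

// Wrap the expression in the barrier; later constants cannot reach inside.
constexpr SizeExpr save(SizeExpr e) { return SizeExpr{true, e.inner, e.outer}; }

// Add a constant, folding it with whatever constant is on the same side.
constexpr SizeExpr add(SizeExpr e, long c) {
  return e.saved ? SizeExpr{true, e.inner, e.outer + c}
                 : SizeExpr{false, e.inner + c, e.outer};
}

// Number of additions or subtractions that survive folding.
constexpr int residual_ops(SizeExpr e) {
  return (e.inner != 0 ? 1 : 0) + (e.outer != 0 ? 1 : 0);
}

// Old shape: save (n - 1), then + 1 when the element count is rebuilt.
constexpr SizeExpr old_form() { return add(save(add(variable_n(), -1)), +1); }

// New shape: save (n) - 1, then + 1; the two constants cancel.
constexpr SizeExpr new_form() { return add(add(save(variable_n()), -1), +1); }
```

In this model the old shape keeps two arithmetic operations (one on each side of the barrier), while the new shape folds down to the saved size itself.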

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-05-11  Jason Merrill  

* decl.c (compute_array_index_type_loc): Stabilize before building
the MINUS_EXPR.
---
 gcc/cp/decl.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index adf94658420..dea1ba07c0e 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10449,6 +10449,15 @@ compute_array_index_type_loc (location_t name_loc, 
tree name, tree size,
 itype = build_min (MINUS_EXPR, sizetype, size, integer_one_node);
   else
 {
+  if (!TREE_CONSTANT (size))
+   {
+ /* A variable sized array.  Arrange for the SAVE_EXPR on the inside
+of the MINUS_EXPR, which allows the -1 to get folded with the +1
+that happens when building TYPE_SIZE.  */
+ size = variable_size (size);
+ stabilize_vla_size (size);
+   }
+
   /* Compute the index of the largest element in the array.  It is
 one less than the number of elements in the array.  We save
 and restore PROCESSING_TEMPLATE_DECL so that computations in
@@ -10466,11 +10475,6 @@ compute_array_index_type_loc (location_t name_loc, 
tree name, tree size,
 
   if (!TREE_CONSTANT (itype))
{
- /* A variable sized array.  */
- itype = variable_size (itype);
-
- stabilize_vla_size (itype);
-
  if (sanitize_flags_p (SANITIZE_VLA)
  && current_function_decl != NULL_TREE)
{

base-commit: 2b2d298ff845ab7a07ffbd51da79473736da3324
-- 
2.18.1

Re: [PATCH] tree-vect-generic: Fix bitfield widths [PR94980 3/3]

2020-05-11 Thread Richard Biener via Gcc-patches

On May 11, 2020 6:41:58 PM GMT+02:00, Richard Sandiford 
 wrote:
>This third patch of three actually fixes the PR.  We were using
>8-bit BIT_FIELD_REFs to access single-bit elements, and multiplying
>the vector index by 8 bits rather than 1 bit.
>
>Tested individually on aarch64-linux-gnu and as a series on
>x86_64-linux-gnu.  OK to install?

OK. 

>I'm not sure what to do about backports.  Arguably the PR is
>really a progression rather than a regression, since we went
>from generating wrong code to ICEing.  I'm not at all convinced
>that this fixes all the vector-lowering problems associated
>with packed booleans, and there's a danger that we could regress
>to wrong code if we backported a piecemeal fix.

I would suggest waiting for a real-world case and only backporting once we're 
sufficiently sure the support is complete. 

Richard 

>Richard
>
>
>2020-05-11  Richard Sandiford  
>
>gcc/
>   PR tree-optimization/94980
>   * tree-vect-generic.c (expand_vector_comparison): Use
>   vector_element_bits_tree to get the element size in bits,
>   rather than using TYPE_SIZE.
>   (expand_vector_condition, vector_element): Likewise.
>
>gcc/testsuite/
>   PR tree-optimization/94980
>   * gcc.target/i386/pr94980.c: New test.
>---
> gcc/testsuite/gcc.target/i386/pr94980.c | 10 ++
> gcc/tree-vect-generic.c |  8 
> 2 files changed, 14 insertions(+), 4 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr94980.c
>
>diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
>index adea9337a97..a7fe83da0e3 100644
>--- a/gcc/tree-vect-generic.c
>+++ b/gcc/tree-vect-generic.c
>@@ -390,7 +390,7 @@ expand_vector_comparison (gimple_stmt_iterator
>*gsi, tree type, tree op0,
>   (TREE_TYPE (type)
>   {
> tree inner_type = TREE_TYPE (TREE_TYPE (op0));
>-tree part_width = TYPE_SIZE (inner_type);
>+tree part_width = vector_element_bits_tree (TREE_TYPE (op0));
> tree index = bitsize_int (0);
> int nunits = nunits_for_known_piecewise_op (TREE_TYPE (op0));
> int prec = GET_MODE_PRECISION (SCALAR_TYPE_MODE (type));
>@@ -944,9 +944,9 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
>   vec<constructor_elt, va_gc> *v;
>   tree constr;
>   tree inner_type = TREE_TYPE (type);
>+  tree width = vector_element_bits_tree (type);
>   tree cond_type = TREE_TYPE (TREE_TYPE (a));
>   tree comp_inner_type = cond_type;
>-  tree width = TYPE_SIZE (inner_type);
>   tree index = bitsize_int (0);
>   tree comp_width = width;
>   tree comp_index = index;
>@@ -960,7 +960,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
>   a1 = TREE_OPERAND (a, 0);
>   a2 = TREE_OPERAND (a, 1);
>   comp_inner_type = TREE_TYPE (TREE_TYPE (a1));
>-  comp_width = TYPE_SIZE (comp_inner_type);
>+  comp_width = vector_element_bits_tree (TREE_TYPE (a1));
> }
> 
>   if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), TREE_CODE (a)))
>@@ -1333,7 +1333,7 @@ vector_element (gimple_stmt_iterator *gsi, tree
>vect, tree idx, tree *ptmpvec)
> }
>   else
> {
>-tree size = TYPE_SIZE (vect_elt_type);
>+tree size = vector_element_bits_tree (vect_type);
> tree pos = fold_build2 (MULT_EXPR, bitsizetype, bitsize_int (index),
> size);
> return fold_build3 (BIT_FIELD_REF, vect_elt_type, vect, size, pos);
>diff --git a/gcc/testsuite/gcc.target/i386/pr94980.c
>b/gcc/testsuite/gcc.target/i386/pr94980.c
>new file mode 100644
>index 000..488f94abec9
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr94980.c
>@@ -0,0 +1,10 @@
>+/* { dg-do compile } */
>+/* { dg-options "-mavx512vl" } */
>+
>+int __attribute__((__vector_size__(16))) v;
>+
>+void
>+foo(void)
>+{
>+  0 <= (0 != v) >= 0;
>+}

Re: [PATCH] tree: Add vector_element_bits(_tree) [PR94980 1/3]

2020-05-11 Thread Richard Biener via Gcc-patches

On May 11, 2020 6:35:58 PM GMT+02:00, Richard Sandiford 
 wrote:
>A lot of code that wants to know the number of bits in a vector
>element gets that information from the element's TYPE_SIZE,
>which is always equal to TYPE_SIZE_UNIT * BITS_PER_UNIT.
>This doesn't work for SVE and AVX512-style packed boolean vectors,
>where several elements can occupy a single byte.
>
>This patch introduces a new pair of helpers for getting the true
>(possibly sub-byte) size.  I made a token attempt to convert obvious
>element size calculations, but I'm sure I missed some.
>
>Tested individually on aarch64-linux-gnu and as a series on
>x86_64-linux-gnu.  OK to install?

OK. 

Richard. 

>Richard
>
>
>2020-05-11  Richard Sandiford  
>
>gcc/
>   PR tree-optimization/94980
>   * tree.h (vector_element_bits, vector_element_bits_tree): Declare.
>   * tree.c (vector_element_bits, vector_element_bits_tree): New.
>   * match.pd: Use the new functions instead of determining the
>   vector element size directly from TYPE_SIZE(_UNIT).
>   * tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
>   * tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
>   * tree-vect-stmts.c (vect_is_simple_cond): Likewise.
>   * tree-vect-generic.c (expand_vector_piecewise): Likewise.
>   (expand_vector_conversion): Likewise.
>   (expand_vector_addition): Likewise for a TYPE_SIZE_UNIT used as
>   a divisor.  Convert the dividend to bits to compensate.
>   * tree-vect-loop.c (vectorizable_live_operation): Call
>   vector_element_bits instead of open-coding it.
>---
> gcc/match.pd  |  2 +-
> gcc/tree-vect-data-refs.c |  2 +-
> gcc/tree-vect-generic.c   | 19 +++
> gcc/tree-vect-loop.c  |  4 +---
> gcc/tree-vect-patterns.c  |  3 +--
> gcc/tree-vect-stmts.c |  3 +--
> gcc/tree.c| 24 
> gcc/tree.h|  2 ++
> 8 files changed, 38 insertions(+), 21 deletions(-)
>
>diff --git a/gcc/tree.h b/gcc/tree.h
>index 4644d6616d9..11c109fffcd 100644
>--- a/gcc/tree.h
>+++ b/gcc/tree.h
>@@ -1996,6 +1996,8 @@ class auto_suppress_location_wrappers
> 
> extern machine_mode element_mode (const_tree);
> extern machine_mode vector_type_mode (const_tree);
>+extern unsigned int vector_element_bits (const_tree);
>+extern tree vector_element_bits_tree (const_tree);
> 
>/* The "canonical" type for this type node, which is used by frontends
>to
>compare the type for equality with another type.  If two types are
>diff --git a/gcc/tree.c b/gcc/tree.c
>index 5b7d3fddbcb..1aabffeea43 100644
>--- a/gcc/tree.c
>+++ b/gcc/tree.c
>@@ -13806,6 +13806,30 @@ vector_type_mode (const_tree t)
>   return mode;
> }
> 
>+/* Return the size in bits of each element of vector type TYPE.  */
>+
>+unsigned int
>+vector_element_bits (const_tree type)
>+{
>+  gcc_checking_assert (VECTOR_TYPE_P (type));
>+  if (VECTOR_BOOLEAN_TYPE_P (type))
>+return vector_element_size (tree_to_poly_uint64 (TYPE_SIZE
>(type)),
>+  TYPE_VECTOR_SUBPARTS (type));
>+  return tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)));
>+}
>+
>+/* Calculate the size in bits of each element of vector type TYPE
>+   and return the result as a tree of type bitsizetype.  */
>+
>+tree
>+vector_element_bits_tree (const_tree type)
>+{
>+  gcc_checking_assert (VECTOR_TYPE_P (type));
>+  if (VECTOR_BOOLEAN_TYPE_P (type))
>+return bitsize_int (vector_element_bits (type));
>+  return TYPE_SIZE (TREE_TYPE (type));
>+}
>+
>/* Verify that basic properties of T match TV and thus T can be a
>variant of
>TV.  TV should be the more specified variant (i.e. the main variant). 
>*/
> 
>diff --git a/gcc/match.pd b/gcc/match.pd
>index 58a4ac66414..33ee1a920bf 100644
>--- a/gcc/match.pd
>+++ b/gcc/match.pd
>@@ -6306,7 +6306,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   }
>   (if (ins)
>(bit_insert { op0; } { ins; }
>- { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE
>(type)))); })
>+ { bitsize_int (at * vector_element_bits (type)); })
>(if (changed)
> (vec_perm { op0; } { op1; } { op2; }))
> 
>diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>index d41ba49fabf..b950aa9e50d 100644
>--- a/gcc/tree-vect-data-refs.c
>+++ b/gcc/tree-vect-data-refs.c
>@@ -3693,7 +3693,7 @@ vect_gather_scatter_fn_p (vec_info *vinfo, bool
>read_p, bool masked_p,
> tree *offset_vectype_out)
> {
>   unsigned int memory_bits = tree_to_uhwi (TYPE_SIZE (memory_type));
>-  unsigned int element_bits = tree_to_uhwi (TYPE_SIZE (TREE_TYPE
>(vectype)));
>+  unsigned int element_bits = vector_element_bits (vectype);
>   if (element_bits != memory_bits)
> /* For now the vector elements must be the same width as the
>memory elements.  */
>diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
>index 8b00f325054..126e906e0a9 100644
>--- a/gcc/tree-vect-generic.c
>+++

Re: [PATCH] tree-vect-generic: Tweak build_replicated_const [PR94980 2/3]

2020-05-11 Thread Richard Biener via Gcc-patches

On May 11, 2020 6:37:43 PM GMT+02:00, Richard Sandiford 
 wrote:
>This patch makes build_replicated_const take the number of bits
>in VALUE rather than calculating the width from the element type.
>The callers can then use vector_element_bits to calculate the
>correct element size from the vector type.
>
>Tested individually on aarch64-linux-gnu and as a series on
>x86_64-linux-gnu.  OK to install?

OK. 

Richard. 

>Richard
>
>
>2020-05-11  Richard Sandiford  
>
>gcc/
>   PR tree-optimization/94980
>   * tree-vect-generic.c (build_replicated_const): Take the number
>   of bits as a parameter, instead of the type of the elements.
>   (do_plus_minus): Update accordingly, using vector_element_bits
>   to calculate the correct number of bits.
>   (do_negate): Likewise.
>---
> gcc/tree-vect-generic.c | 15 ---
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
>diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
>index 126e906e0a9..adea9337a97 100644
>--- a/gcc/tree-vect-generic.c
>+++ b/gcc/tree-vect-generic.c
>@@ -67,11 +67,10 @@ subparts_gt (tree type1, tree type2)
> }
> 
> /* Build a constant of type TYPE, made of VALUE's bits replicated
>-   every TYPE_SIZE (INNER_TYPE) bits to fit TYPE's precision.  */
>+   every WIDTH bits to fit TYPE's precision.  */
> static tree
>-build_replicated_const (tree type, tree inner_type, HOST_WIDE_INT
>value)
>+build_replicated_const (tree type, unsigned int width, HOST_WIDE_INT
>value)
> {
>-  int width = tree_to_uhwi (TYPE_SIZE (inner_type));
>   int n = (TYPE_PRECISION (type) + HOST_BITS_PER_WIDE_INT - 1) 
> / HOST_BITS_PER_WIDE_INT;
>   unsigned HOST_WIDE_INT low, mask;
>@@ -214,13 +213,14 @@ do_plus_minus (gimple_stmt_iterator *gsi, tree
>word_type, tree a, tree b,
>  tree bitpos ATTRIBUTE_UNUSED, tree bitsize ATTRIBUTE_UNUSED,
>  enum tree_code code, tree type ATTRIBUTE_UNUSED)
> {
>+  unsigned int width = vector_element_bits (TREE_TYPE (a));
>   tree inner_type = TREE_TYPE (TREE_TYPE (a));
>   unsigned HOST_WIDE_INT max;
>   tree low_bits, high_bits, a_low, b_low, result_low, signs;
> 
>   max = GET_MODE_MASK (TYPE_MODE (inner_type));
>-  low_bits = build_replicated_const (word_type, inner_type, max >> 1);
>-  high_bits = build_replicated_const (word_type, inner_type, max &
>~(max >> 1));
>+  low_bits = build_replicated_const (word_type, width, max >> 1);
>+  high_bits = build_replicated_const (word_type, width, max & ~(max >>
>1));
> 
>   a = tree_vec_extract (gsi, word_type, a, bitsize, bitpos);
>   b = tree_vec_extract (gsi, word_type, b, bitsize, bitpos);
>@@ -247,13 +247,14 @@ do_negate (gimple_stmt_iterator *gsi, tree
>word_type, tree b,
>  enum tree_code code ATTRIBUTE_UNUSED,
>  tree type ATTRIBUTE_UNUSED)
> {
>+  unsigned int width = vector_element_bits (TREE_TYPE (b));
>   tree inner_type = TREE_TYPE (TREE_TYPE (b));
>   HOST_WIDE_INT max;
>   tree low_bits, high_bits, b_low, result_low, signs;
> 
>   max = GET_MODE_MASK (TYPE_MODE (inner_type));
>-  low_bits = build_replicated_const (word_type, inner_type, max >> 1);
>-  high_bits = build_replicated_const (word_type, inner_type, max &
>~(max >> 1));
>+  low_bits = build_replicated_const (word_type, width, max >> 1);
>+  high_bits = build_replicated_const (word_type, width, max & ~(max >>
>1));
> 
>   b = tree_vec_extract (gsi, word_type, b, bitsize, bitpos);
>
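
For readers unfamiliar with the replicated constants used in the quoted patch, the following is a minimal sketch of the general word-at-a-time (SWAR) addition technique that such masks support. It is not copied from tree-vect-generic.c; the names replicate and add_lanes are hypothetical, a 64-bit word is assumed, and WIDTH is assumed to divide 64 and be smaller than 64.

```cpp
#include <cstdint>

// Repeat VALUE (assumed to fit in WIDTH bits) every WIDTH bits across a
// 64-bit word.
constexpr std::uint64_t replicate(unsigned width, std::uint64_t value) {
  std::uint64_t result = 0;
  for (unsigned shift = 0; shift < 64; shift += width)
    result |= value << shift;
  return result;
}

// Add the WIDTH-bit lanes of A and B independently, so that a carry out
// of one lane never leaks into the next one.
constexpr std::uint64_t add_lanes(unsigned width, std::uint64_t a,
                                  std::uint64_t b) {
  const std::uint64_t max = (std::uint64_t{1} << width) - 1;
  const std::uint64_t low_bits = replicate(width, max >> 1);            // all but the top bit of each lane
  const std::uint64_t high_bits = replicate(width, max & ~(max >> 1));  // only the top bit of each lane
  // Add the low parts, where no carry can cross a lane boundary, then
  // patch in the top bit of each lane with an exclusive or.
  const std::uint64_t sum_low = (a & low_bits) + (b & low_bits);
  return sum_low ^ ((a ^ b) & high_bits);
}
```

With 8-bit lanes, add_lanes (8, 0x00ff, 0x0001) gives 0: the low lane wraps around and the neighbouring lane is untouched, whereas a plain 64-bit addition would give 0x100. Passing the wrong WIDTH places the lane boundaries in the wrong positions, which is why the width has to come from the vector type.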

[committed] i386: Add V2SFmode sqrt insn pattern [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches

gcc/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* config/i386/mmx.md (sqrtv2sf2): New insn pattern.

testsuite/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* gcc.target/i386/pr95046-1.c (test_sqrt): Add.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 7d76c631a77..a8f603b94f8 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -461,6 +461,20 @@
(set_attr "prefix_extra" "1")
(set_attr "mode" "V2SF")])
 
+(define_insn "sqrtv2sf2"
+  [(set (match_operand:V2SF 0 "register_operand" "=x,v")
+   (sqrt:V2SF (match_operand:V2SF 1 "register_operand" "0,v")))]
+  "TARGET_MMX_WITH_SSE"
+  "@
+   sqrtps\t{%1, %0|%0, %1}
+   vsqrtps\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "atom_sse_attr" "sqrt")
+   (set_attr "btver2_sse_attr" "sqrt")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "V4SF")])
+
 (define_insn "mmx_rsqrtv2sf2"
   [(set (match_operand:V2SF 0 "register_operand" "=y")
(unspec:V2SF [(match_operand:V2SF 1 "nonimmediate_operand" "ym")]
diff --git a/gcc/testsuite/gcc.target/i386/pr95046-1.c 
b/gcc/testsuite/gcc.target/i386/pr95046-1.c
index f93d9e1a507..7adc2069c53 100644
--- a/gcc/testsuite/gcc.target/i386/pr95046-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr95046-1.c
@@ -49,3 +49,14 @@ test_max (void)
 }
 
 /* { dg-final { scan-assembler "maxps" } } */
+
+float sqrtf (float);
+
+void
+test_sqrt (void)
+{
+  for (int i = 0; i < 2; i++)
+r[i] = sqrtf (a[i]);
+}
+
+/* { dg-final { scan-assembler "sqrtps" } } */

libbacktrace patch committed: Declare getpagesize if necessary

2020-05-11 Thread Ian Lance Taylor via Gcc-patches

Reportedly mingw-w64-gcc has mmap and getpagesize but does not provide
a declaration of getpagesize in any header files.  Check for a
getpagesize declaration, and declare it if necessary.  This is for PR
95012.  Bootstrapped and ran libbacktrace tests on
x86_64-pc-linux-gnu.  Committed to master.

Ian

2020-05-11  Ian Lance Taylor  

PR libbacktrace/95012
* configure.ac: Check for getpagesize declaration.
* mmap.c: Declare getpagesize if necessary.
* mmapio.c: Likewise.
* configure: Regenerate.
* config.h.in: Regenerate.
* Makefile.in: Regenerate.
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 6f241c5bac0..de9cf628b47 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -376,7 +376,7 @@ if test "$have_fcntl" = "yes"; then
[Define to 1 if you have the fcntl function])
 fi
 
-AC_CHECK_DECLS(strnlen)
+AC_CHECK_DECLS(strnlen getpagesize)
 AC_CHECK_FUNCS(lstat readlink)
 
 # Check for getexecname function.
diff --git a/libbacktrace/mmap.c b/libbacktrace/mmap.c
index dd7d519cc56..6c8bd5d4a19 100644
--- a/libbacktrace/mmap.c
+++ b/libbacktrace/mmap.c
@@ -42,6 +42,10 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include "backtrace.h"
 #include "internal.h"
 
+#ifndef HAVE_DECL_GETPAGESIZE
+extern int getpagesize (void);
+#endif
+
 /* Memory allocation on systems that provide anonymous mmap.  This
permits the backtrace functions to be invoked from a signal
handler, assuming that mmap is async-signal safe.  */
diff --git a/libbacktrace/mmapio.c b/libbacktrace/mmapio.c
index 5dd39525ba1..69cd8065a49 100644
--- a/libbacktrace/mmapio.c
+++ b/libbacktrace/mmapio.c
@@ -40,6 +40,10 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include "backtrace.h"
 #include "internal.h"
 
+#ifndef HAVE_DECL_GETPAGESIZE
+extern int getpagesize (void);
+#endif
+
 #ifndef MAP_FAILED
 #define MAP_FAILED ((void *)-1)
 #endif

Re: [PATCH] Add C++2a synchronization support

2020-05-11 Thread Jonathan Wakely via Gcc-patches


On 11/05/20 08:43 -0700, Thomas Rodgers wrote:


Jonathan Wakely writes:


On 09/05/20 17:01 -0700, Thomas Rodgers via Libstdc++ wrote:





+#include 


 shouldn't be here (it adds runtime cost, as well as
compile-time).


Oversight, not removed after debugging it.





Can't this just be __old instead of *std::__addressof(__old) ?


Copypasta from elsewhere in the same class, I believe. I'll change it.





Isn't alignas(64) already implied by the first data member?



Yes


+{
+  int32_t alignas(64) _M_ver = 0;
+  int32_t alignas(64) _M_wait = 0;
+
+  // TODO make this used only where we don't have futexes


Don't we always need these even with futexes, for the types that don't
use a futex?



If we have futexes, we can use the address of _M_ver to wake
_M_do_wait() instead of using a condvar for types that don't use a
futex directly.


+  using __lock_t = std::unique_lock<std::mutex>;

+  mutable __lock_t::mutex_type _M_mtx;

+
+#ifdef __GTHREAD_COND_INIT
+  mutable __gthread_cond_t _M_cv = __GTHREAD_COND_INIT;
+  __waiters() noexcept = default;


If we moved std::condition_variable into its own header (or
, could we reuse that here instead of using
__gthread_cond_t directly?


Yes, I started down that route initially. I could revisit it in a future
patch, as part of also making its use necessary only when the platform
doesn't support futex.


+__atomic_notify(const _Tp* __addr, bool __all) noexcept
+{
+  using namespace __detail;
+  auto& __w = __waiters::_S_for((void*)__addr);
+  if (!__w._M_waiting())


When __platform_wait_uses_type<_Tp> is true, will __w._M_waiting()
ever be true? Won't this always return before notifying?

Is there meant to be a __waiter constructed here?



__waiter (an RAII type) is constructed in __atomic_wait(). It
increments the _M_wait count on the way into the wait and decrements it
on the way out. __atomic_notify checks whether that count is non-zero
before invoking the platform/semaphore notify, because the atomic load
is cheaper than making the syscall() when there are no waiters.
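
The counting scheme described above can be sketched roughly as follows. This is a simplified illustration and not the libstdc++ implementation: WaiterPool, ScopedWaiter, wake_calls and count_wakeups are hypothetical stand-ins, the wake is only simulated by a counter, and memory-ordering details are ignored.

```cpp
#include <atomic>

// Hypothetical stand-in for the shared waiter state.
struct WaiterPool {
  std::atomic<int> wait_count{0};  // number of threads currently waiting
  int wake_calls = 0;              // counts simulated wake "syscalls"

  bool waiting() const { return wait_count.load() != 0; }

  void notify() {
    if (!waiting())
      return;        // cheap atomic load; skip the expensive wake
    ++wake_calls;    // stand-in for the platform wake call
  }
};

// Hypothetical RAII helper: registers a waiter for the current scope.
class ScopedWaiter {
 public:
  explicit ScopedWaiter(WaiterPool& pool) : pool_(pool) {
    pool_.wait_count.fetch_add(1);
  }
  ~ScopedWaiter() { pool_.wait_count.fetch_sub(1); }
  ScopedWaiter(const ScopedWaiter&) = delete;
  ScopedWaiter& operator=(const ScopedWaiter&) = delete;

 private:
  WaiterPool& pool_;
};

// Notify before, during and after a wait; only the middle call wakes.
inline int count_wakeups() {
  WaiterPool pool;
  pool.notify();                // nobody is waiting: skipped
  {
    ScopedWaiter waiter(pool);  // count goes from 0 to 1
    pool.notify();              // performs the simulated wake
  }                             // count goes back to 0
  pool.notify();                // skipped again
  return pool.wake_calls;
}
```

In this sketch only the notify issued while a ScopedWaiter is alive reaches the simulated wake; the other two return after the atomic load.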


Doh, yes of course.

Re: [PATCH] rs6000: Add cntlzdm and cnttzdm

2020-05-11 Thread Bill Schmidt via Gcc-patches


On 5/8/20 6:51 PM, Segher Boessenkool wrote:

On Fri, May 08, 2020 at 08:17:18AM -0500, Bill Schmidt wrote:

From: Kelvin Nilsen 

Add support for new scalar instructions for counting leading or
trailing zeros under control of a bitmask.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this okay for master?
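
For orientation, here is a reference sketch of what "counting leading zeros under control of a bitmask" could mean in plain C++. The semantics are an assumption made for this illustration (scan from the most significant bit, look only at positions selected by the mask, stop at the first selected 1 bit); the Power ISA definition of cntlzdm is authoritative, and the function name below is hypothetical rather than the GCC built-in.

```cpp
#include <cstdint>

// Assumed semantics: walk from bit 63 down to bit 0, consider only the
// positions where MASK has a 1, and count how many of those positions
// hold a 0 in SRC before the first one that holds a 1.
constexpr unsigned count_leading_zeros_under_mask(std::uint64_t src,
                                                  std::uint64_t mask) {
  unsigned count = 0;
  for (int bit = 63; bit >= 0; --bit) {
    const std::uint64_t probe = std::uint64_t{1} << bit;
    if ((mask & probe) == 0)
      continue;  // position not selected by the mask
    if (src & probe)
      break;     // first selected 1 bit ends the count
    ++count;
  }
  return count;
}
```

Under this assumption, a source of 0x10 with mask 0xF0 yields 3: the mask selects bits 7 to 4, and bits 7, 6 and 5 of the source are zero before bit 4 is found set.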

Ooh, I found problems!


Thanks for catching these!  Okay with them fixed?

Thanks,
Bill




* config/rs6000/rs6000-builtin.def (__builtin_cntlzdm): New
built-in function definition.
(__builtin_cnttzdm): Likewise.,

Stray comma.


+(define_insn "cntlzdm"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+   (unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "r")
+   (match_operand:DI 2 "gpc_reg_operand" "r")]
+UNSPEC_CNTLZDM))]
+   "TARGET_FUTURE && TARGET_64BIT"
+   "cntlzdm %0,%1,%2"
+   [(set_attr "type" "integer")])

TARGET_POWERPC64.


--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/cntlzdm-0.c
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */

And powerpc64 here as well then.

Not sure if this is a bigger problem than the comma thing though.


Segher

[PATCH] tree-vect-generic: Fix bitfield widths [PR94980 3/3]

2020-05-11 Thread Richard Sandiford

This third patch of three actually fixes the PR.  We were using
8-bit BIT_FIELD_REFs to access single-bit elements, and multiplying
the vector index by 8 bits rather than 1 bit.
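
To illustrate the arithmetic (outside GCC, on a plain integer), here is a small sketch of extracting one element from a packed mask. extract_element is a hypothetical helper, not anything from tree-vect-generic.c; it only shows why the element width has to be 1 bit, not 8, for a single-bit element.

```cpp
#include <cstdint>

// Read element INDEX from BITS, where every element is ELEMENT_BITS
// wide and element 0 lives in the least significant bits.
// Assumes INDEX * ELEMENT_BITS is less than 64.
inline std::uint64_t
extract_element (std::uint64_t bits, unsigned index, unsigned element_bits)
{
  unsigned pos = index * element_bits;   // bit offset of the element
  std::uint64_t mask = element_bits >= 64
    ? ~std::uint64_t{0}
    : (std::uint64_t{1} << element_bits) - 1;
  return (bits >> pos) & mask;
}
```

For a four-element mask 0xA (elements 0,1,0,1), extract_element (0xA, 1, 1) reads bit 1 as intended, whereas using a width of 8 for the same index looks at bits 8..15 and sees nothing of the mask.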

Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu.  OK to install?

I'm not sure what to do about backports.  Arguably the PR is
really a progression rather than a regression, since we went
from generating wrong code to ICEing.  I'm not at all convinced
that this fixes all the vector-lowering problems associated
with packed booleans, and there's a danger that we could regress
to wrong code if we backported a piecemeal fix.

Richard


2020-05-11  Richard Sandiford  

gcc/
PR tree-optimization/94980
* tree-vect-generic.c (expand_vector_comparison): Use
vector_element_bits_tree to get the element size in bits,
rather than using TYPE_SIZE.
(expand_vector_condition, vector_element): Likewise.

gcc/testsuite/
PR tree-optimization/94980
* gcc.target/i386/pr94980.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr94980.c | 10 ++
 gcc/tree-vect-generic.c |  8 
 2 files changed, 14 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr94980.c

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index adea9337a97..a7fe83da0e3 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -390,7 +390,7 @@ expand_vector_comparison (gimple_stmt_iterator *gsi, tree 
type, tree op0,
(TREE_TYPE (type)
{
  tree inner_type = TREE_TYPE (TREE_TYPE (op0));
- tree part_width = TYPE_SIZE (inner_type);
+ tree part_width = vector_element_bits_tree (TREE_TYPE (op0));
  tree index = bitsize_int (0);
  int nunits = nunits_for_known_piecewise_op (TREE_TYPE (op0));
  int prec = GET_MODE_PRECISION (SCALAR_TYPE_MODE (type));
@@ -944,9 +944,9 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
   vec *v;
   tree constr;
   tree inner_type = TREE_TYPE (type);
+  tree width = vector_element_bits_tree (type);
   tree cond_type = TREE_TYPE (TREE_TYPE (a));
   tree comp_inner_type = cond_type;
-  tree width = TYPE_SIZE (inner_type);
   tree index = bitsize_int (0);
   tree comp_width = width;
   tree comp_index = index;
@@ -960,7 +960,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
   a1 = TREE_OPERAND (a, 0);
   a2 = TREE_OPERAND (a, 1);
   comp_inner_type = TREE_TYPE (TREE_TYPE (a1));
-  comp_width = TYPE_SIZE (comp_inner_type);
+  comp_width = vector_element_bits_tree (TREE_TYPE (a1));
 }
 
   if (expand_vec_cond_expr_p (type, TREE_TYPE (a1), TREE_CODE (a)))
@@ -1333,7 +1333,7 @@ vector_element (gimple_stmt_iterator *gsi, tree vect, 
tree idx, tree *ptmpvec)
 }
   else
 {
- tree size = TYPE_SIZE (vect_elt_type);
+ tree size = vector_element_bits_tree (vect_type);
  tree pos = fold_build2 (MULT_EXPR, bitsizetype, bitsize_int (index),
  size);
  return fold_build3 (BIT_FIELD_REF, vect_elt_type, vect, size, pos);
diff --git a/gcc/testsuite/gcc.target/i386/pr94980.c 
b/gcc/testsuite/gcc.target/i386/pr94980.c
new file mode 100644
index 000..488f94abec9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr94980.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl" } */
+
+int __attribute__((__vector_size__(16))) v;
+
+void
+foo(void)
+{
+  0 <= (0 != v) >= 0;
+}

[PATCH] tree-vect-generic: Tweak build_replicated_const [PR94980 2/3]

2020-05-11 Thread Richard Sandiford

This patch makes build_replicated_const take the number of bits
in VALUE rather than calculating the width from the element type.
The callers can then use vector_element_bits to calculate the
correct element size from the vector type.
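
For context, a standalone sketch of what "VALUE's bits replicated every WIDTH bits" means, using a 64-bit word instead of a tree constant. replicate and swar_add8 are illustrative names; swar_add8 shows the kind of word-at-a-time addition that low_bits/high_bits masks like the ones in do_plus_minus make possible, under the assumption of 8-bit lanes.

```cpp
#include <cstdint>

// Repeat the low WIDTH bits of VALUE across a 64-bit word.
inline std::uint64_t
replicate (unsigned width, std::uint64_t value)
{
  if (width == 0)
    return 0;                       // nothing sensible to replicate
  std::uint64_t field = width >= 64
    ? value
    : value & ((std::uint64_t{1} << width) - 1);
  std::uint64_t result = 0;
  for (unsigned pos = 0; pos < 64; pos += width)
    result |= field << pos;
  return result;
}

// Add eight 8-bit lanes at once without letting carries cross lanes:
// add the low 7 bits of each lane, then patch in the top bits.
inline std::uint64_t
swar_add8 (std::uint64_t a, std::uint64_t b)
{
  const std::uint64_t low_bits = replicate (8, 0x7f);
  const std::uint64_t high_bits = replicate (8, 0x80);
  std::uint64_t result_low = (a & low_bits) + (b & low_bits);
  std::uint64_t signs = (a ^ b) & high_bits;
  return result_low ^ signs;
}
```

If the width passed to replicate does not match the real lane size, the masks land in the wrong places and carries leak between lanes, which is why the callers need the true element width.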

Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu.  OK to install?

Richard


2020-05-11  Richard Sandiford  

gcc/
PR tree-optimization/94980
* tree-vect-generic.c (build_replicated_const): Take the number
of bits as a parameter, instead of the type of the elements.
(do_plus_minus): Update accordingly, using vector_element_bits
to calculate the correct number of bits.
(do_negate): Likewise.
---
 gcc/tree-vect-generic.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 126e906e0a9..adea9337a97 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -67,11 +67,10 @@ subparts_gt (tree type1, tree type2)
 }
 
 /* Build a constant of type TYPE, made of VALUE's bits replicated
-   every TYPE_SIZE (INNER_TYPE) bits to fit TYPE's precision.  */
+   every WIDTH bits to fit TYPE's precision.  */
 static tree
-build_replicated_const (tree type, tree inner_type, HOST_WIDE_INT value)
+build_replicated_const (tree type, unsigned int width, HOST_WIDE_INT value)
 {
-  int width = tree_to_uhwi (TYPE_SIZE (inner_type));
   int n = (TYPE_PRECISION (type) + HOST_BITS_PER_WIDE_INT - 1) 
 / HOST_BITS_PER_WIDE_INT;
   unsigned HOST_WIDE_INT low, mask;
@@ -214,13 +213,14 @@ do_plus_minus (gimple_stmt_iterator *gsi, tree word_type, 
tree a, tree b,
   tree bitpos ATTRIBUTE_UNUSED, tree bitsize ATTRIBUTE_UNUSED,
   enum tree_code code, tree type ATTRIBUTE_UNUSED)
 {
+  unsigned int width = vector_element_bits (TREE_TYPE (a));
   tree inner_type = TREE_TYPE (TREE_TYPE (a));
   unsigned HOST_WIDE_INT max;
   tree low_bits, high_bits, a_low, b_low, result_low, signs;
 
   max = GET_MODE_MASK (TYPE_MODE (inner_type));
-  low_bits = build_replicated_const (word_type, inner_type, max >> 1);
-  high_bits = build_replicated_const (word_type, inner_type, max & ~(max >> 
1));
+  low_bits = build_replicated_const (word_type, width, max >> 1);
+  high_bits = build_replicated_const (word_type, width, max & ~(max >> 1));
 
   a = tree_vec_extract (gsi, word_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, word_type, b, bitsize, bitpos);
@@ -247,13 +247,14 @@ do_negate (gimple_stmt_iterator *gsi, tree word_type, 
tree b,
   enum tree_code code ATTRIBUTE_UNUSED,
   tree type ATTRIBUTE_UNUSED)
 {
+  unsigned int width = vector_element_bits (TREE_TYPE (b));
   tree inner_type = TREE_TYPE (TREE_TYPE (b));
   HOST_WIDE_INT max;
   tree low_bits, high_bits, b_low, result_low, signs;
 
   max = GET_MODE_MASK (TYPE_MODE (inner_type));
-  low_bits = build_replicated_const (word_type, inner_type, max >> 1);
-  high_bits = build_replicated_const (word_type, inner_type, max & ~(max >> 
1));
+  low_bits = build_replicated_const (word_type, width, max >> 1);
+  high_bits = build_replicated_const (word_type, width, max & ~(max >> 1));
 
   b = tree_vec_extract (gsi, word_type, b, bitsize, bitpos);

[PATCH] tree: Add vector_element_bits(_tree) [PR94980 1/3]

2020-05-11 Thread Richard Sandiford

A lot of code that wants to know the number of bits in a vector
element gets that information from the element's TYPE_SIZE,
which is always equal to TYPE_SIZE_UNIT * BITS_PER_UNIT.
This doesn't work for SVE and AVX512-style packed boolean vectors,
where several elements can occupy a single byte.
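
A toy model of the distinction, with made-up types rather than GCC trees: vector_shape and both functions are hypothetical, and the packed case simply divides the vector size by the number of elements.

```cpp
// Hypothetical description of a vector type.
struct vector_shape
{
  unsigned total_bits;     // size of the whole vector in bits
  unsigned nunits;         // number of elements
  unsigned element_bytes;  // storage size of the element type in bytes
  bool packed_boolean;     // true for SVE/AVX512-style mask vectors
};

// What you get from the element type's size: always a multiple of 8.
inline unsigned
element_bits_from_type_size (const vector_shape &v)
{
  return v.element_bytes * 8;
}

// The true (possibly sub-byte) element size.
inline unsigned
element_bits (const vector_shape &v)
{
  if (v.packed_boolean)
    return v.total_bits / v.nunits;
  return v.element_bytes * 8;
}
```

For a 16-element mask held in 16 bits, the first function reports 8 and the second reports 1; for an ordinary vector of four 32-bit integers both report 32.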

This patch introduces a new pair of helpers for getting the true
(possibly sub-byte) size.  I made a token attempt to convert obvious
element size calculations, but I'm sure I missed some.

Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu.  OK to install?

Richard


2020-05-11  Richard Sandiford  

gcc/
PR tree-optimization/94980
* tree.h (vector_element_bits, vector_element_bits_tree): Declare.
* tree.c (vector_element_bits, vector_element_bits_tree): New.
* match.pd: Use the new functions instead of determining the
vector element size directly from TYPE_SIZE(_UNIT).
* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-stmts.c (vect_is_simple_cond): Likewise.
* tree-vect-generic.c (expand_vector_piecewise): Likewise.
(expand_vector_conversion): Likewise.
(expand_vector_addition): Likewise for a TYPE_SIZE_UNIT used as
a divisor.  Convert the dividend to bits to compensate.
* tree-vect-loop.c (vectorizable_live_operation): Call
vector_element_bits instead of open-coding it.
---
 gcc/match.pd  |  2 +-
 gcc/tree-vect-data-refs.c |  2 +-
 gcc/tree-vect-generic.c   | 19 +++
 gcc/tree-vect-loop.c  |  4 +---
 gcc/tree-vect-patterns.c  |  3 +--
 gcc/tree-vect-stmts.c |  3 +--
 gcc/tree.c| 24 
 gcc/tree.h|  2 ++
 8 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/gcc/tree.h b/gcc/tree.h
index 4644d6616d9..11c109fffcd 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1996,6 +1996,8 @@ class auto_suppress_location_wrappers
 
 extern machine_mode element_mode (const_tree);
 extern machine_mode vector_type_mode (const_tree);
+extern unsigned int vector_element_bits (const_tree);
+extern tree vector_element_bits_tree (const_tree);
 
 /* The "canonical" type for this type node, which is used by frontends to
compare the type for equality with another type.  If two types are
diff --git a/gcc/tree.c b/gcc/tree.c
index 5b7d3fddbcb..1aabffeea43 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13806,6 +13806,30 @@ vector_type_mode (const_tree t)
   return mode;
 }
 
+/* Return the size in bits of each element of vector type TYPE.  */
+
+unsigned int
+vector_element_bits (const_tree type)
+{
+  gcc_checking_assert (VECTOR_TYPE_P (type));
+  if (VECTOR_BOOLEAN_TYPE_P (type))
+return vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (type)),
+   TYPE_VECTOR_SUBPARTS (type));
+  return tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)));
+}
+
+/* Calculate the size in bits of each element of vector type TYPE
+   and return the result as a tree of type bitsizetype.  */
+
+tree
+vector_element_bits_tree (const_tree type)
+{
+  gcc_checking_assert (VECTOR_TYPE_P (type));
+  if (VECTOR_BOOLEAN_TYPE_P (type))
+return bitsize_int (vector_element_bits (type));
+  return TYPE_SIZE (TREE_TYPE (type));
+}
+
 /* Verify that basic properties of T match TV and thus T can be a variant of
TV.  TV should be the more specified variant (i.e. the main variant).  */
 
diff --git a/gcc/match.pd b/gcc/match.pd
index 58a4ac66414..33ee1a920bf 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6306,7 +6306,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   }
   (if (ins)
(bit_insert { op0; } { ins; }
- { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
+ { bitsize_int (at * vector_element_bits (type)); })
(if (changed)
 (vec_perm { op0; } { op1; } { op2; }))
 
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index d41ba49fabf..b950aa9e50d 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3693,7 +3693,7 @@ vect_gather_scatter_fn_p (vec_info *vinfo, bool read_p, 
bool masked_p,
  tree *offset_vectype_out)
 {
   unsigned int memory_bits = tree_to_uhwi (TYPE_SIZE (memory_type));
-  unsigned int element_bits = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (vectype)));
+  unsigned int element_bits = vector_element_bits (vectype);
   if (element_bits != memory_bits)
 /* For now the vector elements must be the same width as the
memory elements.  */
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 8b00f325054..126e906e0a9 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -276,8 +276,7 @@ expand_vector_piecewise (gimple_stmt_iterator *gsi, 
elem_op_func f,
   tree part_width = TYPE_SIZE (inner_type);
   tree index = bitsize_int (0);
   int

[PATCH] tree-optimization/95049 - fix not terminating RPO VN iteration

2020-05-11 Thread Richard Biener

This rejects lattice changes from one constant to another.
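
A simplified model of why that helps termination, not the actual set_ssa_val_to logic (the patch makes the name value to itself; here that is modelled as an explicit "varying" state). The names kind, lattice_value, make_constant and update are illustrative only.

```cpp
enum class kind { top, constant, varying };

struct lattice_value
{
  kind k = kind::top;
  long c = 0;          // only meaningful when k == kind::constant
};

inline lattice_value
make_constant (long c)
{
  lattice_value v;
  v.k = kind::constant;
  v.c = c;
  return v;
}

// Try to move CUR to NEXT.  Returns true if CUR changed.
inline bool
update (lattice_value &cur, lattice_value next)
{
  // Varying is final, and nothing ever moves back up to top.
  if (cur.k == kind::varying || next.k == kind::top)
    return false;
  // Constant -> different constant is rejected: force varying instead.
  if (cur.k == kind::constant && next.k == kind::constant
      && cur.c != next.c)
    {
      next.k = kind::varying;
      next.c = 0;
    }
  if (cur.k == next.k && cur.c == next.c)
    return false;
  cur = next;
  return true;
}
```

Feeding this an alternating sequence 0, 1, 0, 1, ... (as a value like b = !b would produce) gives two changes and then none, so an iteration that repeats until nothing changes stops. Without the rejection rule every step would count as a change.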

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied on trunk so far.

2020-05-11  Richard Biener  

PR tree-optimization/95049
* tree-ssa-sccvn.c (set_ssa_val_to): Reject lattice transition
between different constants.

* gcc.dg/torture/pr95049.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr95049.c |  7 +++
 gcc/tree-ssa-sccvn.c   | 27 ++-
 2 files changed, 29 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95049.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr95049.c 
b/gcc/testsuite/gcc.dg/torture/pr95049.c
new file mode 100644
index 000..164bfdbdcfc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95049.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+
+void a()
+{
+  for (int b; b; b = !b)
+;
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 39e99007c7e..4b3f31c12cb 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4472,6 +4472,8 @@ set_ssa_val_to (tree from, tree to)
   vn_ssa_aux_t from_info = VN_INFO (from);
   tree currval = from_info->valnum; // SSA_VAL (from)
   poly_int64 toff, coff;
+  bool curr_undefined = false;
+  bool curr_invariant = false;
 
   /* The only thing we allow as value numbers are ssa_names
  and invariants.  So assert that here.  We don't allow VN_TOP
@@ -4514,9 +4516,9 @@ set_ssa_val_to (tree from, tree to)
}
  return false;
}
-  bool curr_invariant = is_gimple_min_invariant (currval);
-  bool curr_undefined = (TREE_CODE (currval) == SSA_NAME
-&& ssa_undefined_value_p (currval, false));
+  curr_invariant = is_gimple_min_invariant (currval);
+  curr_undefined = (TREE_CODE (currval) == SSA_NAME
+   && ssa_undefined_value_p (currval, false));
   if (currval != VN_TOP
  && !curr_invariant
  && !curr_undefined
@@ -4571,9 +4573,8 @@ set_and_exit:
   && !operand_equal_p (currval, to, 0)
   /* Different undefined SSA names are not actually different.  See
  PR82320 for a testcase were we'd otherwise not terminate iteration.  
*/
-  && !(TREE_CODE (currval) == SSA_NAME
+  && !(curr_undefined
   && TREE_CODE (to) == SSA_NAME
-  && ssa_undefined_value_p (currval, false)
   && ssa_undefined_value_p (to, false))
   /* ???  For addresses involving volatile objects or types operand_equal_p
  does not reliably detect ADDR_EXPRs as equal.  We know we are only
@@ -4585,6 +4586,22 @@ set_and_exit:
   == get_addr_base_and_unit_offset (TREE_OPERAND (to, 0), &toff))
   && known_eq (coff, toff)))
 {
+  if (to != from
+ && currval != VN_TOP
+ && !curr_undefined
+ /* We do not want to allow lattice transitions from one value
+to another since that may lead to not terminating iteration
+(see PR95049).  Since there's no convenient way to check
+for the allowed transition of VAL -> PHI (loop entry value,
+same on two PHIs, to same PHI result) we restrict the check
+to invariants.  */
+ && curr_invariant
+ && is_gimple_min_invariant (to))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, " forced VARYING");
+ to = from;
+   }
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " (changed)\n");
   from_info->valnum = to;
-- 
2.12.3

Re: [PATCH] Add C++2a synchronization support

2020-05-11 Thread Thomas Rodgers via Gcc-patches



Jonathan Wakely writes:

> On 09/05/20 17:01 -0700, Thomas Rodgers via Libstdc++ wrote:



>>+#include 
>
>  shouldn't be here (it adds runtime cost, as well as
> compile-time).
>
Oversight, not removed after debugging it.



>
> Can't this just be __old instead of *std::__addressof(__old) ?
>
Copypasta from elsewhere in the same class, I believe. I'll change it.



>
> Isn't alignas(64) already implied by the first data member?
>

Yes

>>+{
>>+  int32_t alignas(64) _M_ver = 0;
>>+  int32_t alignas(64) _M_wait = 0;
>>+
>>+  // TODO make this used only where we don't have futexes
>
> Don't we always need these even with futexes, for the types that don't
> use a futex?
>

If we have futexes, we can use the address of _M_ver to wake
_M_do_wait() instead of using a condvar for types that don't use a
futex directly.

>>+  using __lock_t = std::unique_lock;
>+  mutable __lock_t::mutex_type _M_mtx;
>>+
>>+#ifdef __GTHREAD_COND_INIT
>>+  mutable __gthread_cond_t _M_cv = __GTHREAD_COND_INIT;
>>+  __waiters() noexcept = default;
>
> If we moved std::condition_variable into its own header (or
> , could we reuse that here instead of using
> __gthread_cond_t directly?
>
Yes, I started down that route initially, I could revisit it in a future
patch as part of also making its use only necessary when the platform
doesn't support futex.

>>+__atomic_notify(const _Tp* __addr, bool __all) noexcept
>>+{
>>+  using namespace __detail;
>>+  auto& __w = __waiters::_S_for((void*)__addr);
>>+  if (!__w._M_waiting())
>
> When __platform_wait_uses_type<_Tp> is true, will __w._M_waiting()
> ever be true? Won't this always return before notifying?
>
> Is there meant to be a __waiter constructed here?
>

__waiter (an RAII type) is constructed in the __atomic_wait(), that
increments the _M_wait count on the way into the wait, and decrements it
on the way out, __atomic_notify checks to see if that count is non-zero
before invoking the platform/semaphore notify because it is cheaper
to do the atomic load than it is to make the syscall() when there are no
waiters.

>>+ return;
>>+
>>+  if constexpr (__platform_wait_uses_type<_Tp>::__value)
>>+ {
>>+   __platform_notify((__platform_wait_t*)(void*) __addr, __all);
>>+ }



>>+struct __platform_semaphore
>>+{
>>+  using __clock_t = chrono::system_clock;
>>+
>>+  __platform_semaphore(ptrdiff_t __count) noexcept
>
> Should this constructor be explicit?
>

Yes.

>>+  template
>>+ _GLIBCXX_ALWAYS_INLINE bool
>
> Do we really need this to be always_inline?
>
Probably not, copypasta from elsewhere in the same file.

>>+ __try_acquire_until_impl(const chrono::time_point<__clock_t>& __atime) 
>>noexcept
>>+ {
>>+   auto __s = chrono::time_point_cast(__atime);
>>+   auto __ns = chrono::duration_cast(__atime - __s);



>>+template
>>+  struct __atomic_semaphore
>>+  {
>>+ static constexpr size_t _S_alignment = __alignof__(_Tp);
>>+
>>+ __atomic_semaphore(_Tp __count)
>
> Should this be explicit?
>
Yes.

>>+private:
>>+  alignas(_S_alignment) _Tp _M_a;
>
> Could this just use alignas(__alignof__(_Tp)) _Tp here? There's no
> need for the _S_alignment constant if it's only used in one place.
>
Yes.

>>+};
>>+
>>+#ifdef _GLIBCXX_REQUIRE_POSIX_SEMAPHORE
>>+  template
>
> Rename __least_max_t here too.
>
>>+using __semaphore_base = __platform_semaphore<__least_max_t>;
>>+#else
>>+#  ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>>+  template
>>+using __semaphore_base = std::conditional<(__least_max_t > 0
>
> This should use conditional_t<> not conditional<>::type.
>
> The least-max_value can't be negative. If it's zero, can't we use a
> futex or semaphore? So the '__least_max_t > 0' condition is wrong?
>

Yes.

>>+   && __least_max_t < 
>>std::numeric_limits<__detail::__platform_wait_t>::max()),
>
> Should that be <= rather than < ?
>

Likely.

>>+   
>>__atomic_semaphore<__detail::__platform_wait_t>,
>>+   
>>__atomic_semaphore>::type;
>>+ // __platform_semaphore
>>+#  else

Re: [PATCH 3/3] OpenACC dynamic data lifetimes ending within structured blocks

2020-05-11 Thread Thomas Schwinge

Hi Julian!

On 2020-01-17T13:18:21-0800, Julian Brown  wrote:
> This patch adds a new function to logically decrement the "dynamic
> reference counter" for a mapped OpenACC variable, and handles some cases
> in which that counter drops to zero inside a structured data
> block. Previously, it's likely that at least in some cases, ending a
> dynamic data lifetime in this way could behave unpredictably.
>
> Several new test cases are included.

As discussed before, all these test cases were already PASSing before any
of this thread's suggested patches (also for GCC 9), so "from a user's
point of view", all we get here are testsuite regressions:

  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-6-lib.c'
  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-6.c'
  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-7-lib.c'
  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-7.c'
  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-8-lib.c'
  - 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-8.c'

(Adjusted for the version of the test cases already committed; but
already XFAILed in your original patch submission, see below.)


And: the code changes proposed here are breaking compatibility with GCC
9, such that OpenACC/Fortran code compiled with GCC 9, but running with
recent runtime libraries (common case for users, distributions) would
then terminate with: 'libgomp: cannot handle 'exit data' within data
region'.  For example, half of all 'libgomp.oacc-fortran' test cases
using OpenACC 'exit data':

  - 'libgomp.oacc-fortran/data-2.f90'
  - 'libgomp.oacc-fortran/data-3.f90'
  - 'libgomp.oacc-fortran/data-4-2.f90'
  - 'libgomp.oacc-fortran/data-4.f90'
  - 'libgomp.oacc-fortran/if-1.f90'

Even though that "code generation problem" doesn't exist with GCC 10 and
newer, we still have to maintain ABI compatibility with existing binaries
compiled with GCC 9.  (Or, as a less preferred solution, arrange
so that they use host-fallback execution instead of offloading.)


Grüße
 Thomas


> This patch is strongly related to the previous two, but is somewhat of
> a separate change, and those two patches can stand alone if this one
> gets deferred.
>
> Tested alongside the previous patches in the series with offloading to NVPTX.
>
> OK?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
>   libgomp/
>   * oacc-mem.c (decr_dynamic_refcount): New function.
>   (goacc_exit_datum): Call above function.
>   (goacc_exit_data_internal): Call above function.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1.c: New
>   test.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1-lib.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6-lib.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7-lib.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8-lib.c:
>   Likewise.
> ---
>  libgomp/oacc-mem.c| 128 ++
>  .../static-dynamic-lifetimes-1-lib.c  |   3 +
>  .../static-dynamic-lifetimes-1.c  | 160 ++
>  .../static-dynamic-lifetimes-6-lib.c  |   5 +
>  .../static-dynamic-lifetimes-6.c  |  46 +
>  .../static-dynamic-lifetimes-7-lib.c  |   5 +
>  .../static-dynamic-lifetimes-7.c  |  45 +
>  .../static-dynamic-lifetimes-8-lib.c  |   5 +
>  .../static-dynamic-lifetimes-8.c  |  50 ++
>  9 files changed, 412 insertions(+), 35 deletions(-)
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1-lib.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6-lib.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7-lib.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8-lib.c
>  create mode 100644 
> libgomp/testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8.c
>
> diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
> index 783e7f363fb..f34ffa67079 100644
> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -725,6 +725,92 @@ acc_pcopyin (void *h, size_t s)
>  #endif
>
>
> +/* Perform actions necessary to decrement the dynamic

Re: [PATCH] rs6000: Add pdep/pext

2020-05-11 Thread Bill Schmidt via Gcc-patches


On 5/8/20 3:47 PM, Segher Boessenkool wrote:

Hi,

On Thu, May 07, 2020 at 09:29:03PM -0500, Bill Schmidt wrote:

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5ef4889ba55..33ba57855bc 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -162,6 +162,8 @@ (define_c_enum "unspec"
 UNSPEC_VRLNM
 UNSPEC_VCLZDM
 UNSPEC_VCTZDM
+   UNSPEC_VPDEPD
+   UNSPEC_VPEXTD

Similarly, maybe UNSPEC_PDEP and UNSPEC_PEXT would be nicer.



Thanks -- I'll plan to go back and do a general cleanup on scalar/vector 
UNSPECs later on.  I expect we have a lot of this sort of redundancy.


Bill



Looks okay for trunk either way :-)


Segher

[PATCH][v2] tree-optimization/94988 - enhance SM some more

2020-05-11 Thread Richard Biener

This enhances store-order preserving store motion to handle the case
of non-invariant dependent stores in the sequence of unconditionally
executed stores on exit by re-issuing them as part of the sequence
of stores on the exit.  This fixes the observed regression of
gcc.target/i386/pr64110.c which relies on store-motion of 'b'
for a loop like

  for (int i = 0; i < j; ++i)
*b++ = x;

where for correctness we now no longer apply store-motion.  With
the patch we emit the correct

  tem = b;
  for (int i = 0; i < j; ++i)
{
  tem = tem + 1;
  *tem = x;
}
  b = tem;
  *tem = x;

preserving the original order of stores.  A testcase reflecting
the miscompilation done by earlier GCC is added as well.

This also fixes the reported ICE in PR95025 and adds checking code
to catch it earlier - the issue was not-supported refs propagation
leaving stray refs in the sequence.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

2020-05-11  Richard Biener  

PR tree-optimization/94988
PR tree-optimization/95025
* tree-ssa-loop-im.c (seq_entry): Make a struct, add from.
(sm_seq_push_down): Take extra parameter denoting where we
moved the ref to.
(execute_sm_exit): Re-issue sm_other stores in the correct
order.
(sm_seq_valid_bb): When always executed, allow sm_other to
prevail inbetween sm_ord and record their stored value.
(hoist_memory_references): Adjust refs_not_supported propagation
and prune sm_other from the end of the ordered sequences.

* gcc.dg/torture/pr94988.c: New testcase.
* gcc.dg/torture/pr95025.c: Likewise.
* gcc.dg/torture/pr95045.c: Likewise.
* g++.dg/asan/pr95025.C: New testcase.
---
 gcc/testsuite/g++.dg/asan/pr95025.C|  28 
 gcc/testsuite/gcc.dg/torture/pr94988.c |  20 +++
 gcc/testsuite/gcc.dg/torture/pr95025.c |  13 ++
 gcc/testsuite/gcc.dg/torture/pr95045.c |  29 
 gcc/tree-ssa-loop-im.c | 177 ++---
 5 files changed, 218 insertions(+), 49 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/asan/pr95025.C
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94988.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95025.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95045.c

diff --git a/gcc/testsuite/g++.dg/asan/pr95025.C 
b/gcc/testsuite/g++.dg/asan/pr95025.C
new file mode 100644
index 000..dabb7e92f82
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/pr95025.C
@@ -0,0 +1,28 @@
+// { dg-do compile }
+// { dg-options "-O2 -fsanitize=address" }
+
+struct a {
+int b;
+} * c;
+struct d {
+d *e;
+};
+struct f {
+bool done;
+d *g;
+};
+int h;
+int i(f *j) {
+if (j->g) {
+   j->g = j->g->e;
+   return h;
+}
+j->done = true;
+return 0;
+}
+void k(bool j) { c->b = j; }
+void l() {
+f a;
+for (; !()->done; i())
+  k(true);
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr94988.c 
b/gcc/testsuite/gcc.dg/torture/pr94988.c
new file mode 100644
index 000..1ee99fea5ce
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94988.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+
+short *b;
+
+void __attribute__((noipa))
+bar (short x, int j)
+{
+  for (int i = 0; i < j; ++i)
+*b++ = x;
+}
+
+int
+main()
+{
+  b = (short *)
+  bar (0, 1);
+  if ((short)(__UINTPTR_TYPE__)b != 0)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr95025.c 
b/gcc/testsuite/gcc.dg/torture/pr95025.c
new file mode 100644
index 000..5834dc04887
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95025.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+static int a;
+short b;
+int *c;
+void d() {
+for (;; a -= 1)
+  for (; b; b += 1) {
+ *c ^= 5;
+ if (a)
+   return;
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr95045.c 
b/gcc/testsuite/gcc.dg/torture/pr95045.c
new file mode 100644
index 000..9f591beb6be
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95045.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+
+int a, c, f;
+long b;
+char d;
+int e[3];
+int g[9][3][2];
+int main()
+{
+{
+h:
+  for (f = 0; f <= 5; f++) {
+ b = 3;
+ for (; b >= 0; b--) {
+ e[2] = d = 0;
+ for (; d <= 3; d++) {
+ g[8][2][0] = e[1] = c = 0;
+ for (; c <= 1; c++)
+   e[c + 1] = g[d + 5][2][c] = 4;
+ }
+ if (a)
+   goto h;
+ }
+  }
+}
+  if (e[2] != 4)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 2aabb54c98d..bb78dfb2ce8 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2209,7 +2209,14 @@ execute_sm (class loop *loop, im_mem_ref *ref,
able to execute in arbitrary order with respect to other stores
sm_other is used for stores we do not try to apply store motion to.  */
 enum sm_kind { sm_ord,

Re: [PATCH] rs6000: Add vec_extracth and vec_extractl

2020-05-11 Thread David Edelsohn via Gcc-patches

On Sun, May 10, 2020 at 9:14 AM Bill Schmidt  wrote:
>
> From: Kelvin Nilsen 
>
> Add new insns vextdu[bhw]vlx, vextddvlx, vextdu[bhw]vhx, and
> vextddvhx, along with built-in access and overloaded built-in
> access to these insns.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions, using a Power9 configuration.  Is this okay for
> master?
>
> Thanks,
> Bill
>
> [gcc]
>
> 2020-05-10  Kelvin Nilsen  
>
> * config/rs6000/altivec.h (vec_extractl): New #define.
> (vec_extracth): Likewise.
> * config/rs6000/altivec.md (UNSPEC_EXTRACTL): New constant.
> (UNSPEC_EXTRACTR): Likewise.
> (VEXTRACT_LR): New int iterator.

Well now the previous VSTRIR/VSTRIL patch is inconsistent.  If we're
going to use an iterator for "LR", that's fine, but it needs to be
used consistently for similar situations.  The approach for the two,
similar instructions and issues need to match.

Thanks, David

Re: Fix Debug mode Undefined Behavior

2020-05-11 Thread Jonathan Wakely via Gcc-patches


On 11/05/20 14:09 +0200, François Dumont via Libstdc++ wrote:

On 11/05/20 12:51 pm, Ville Voutilainen wrote:

On Mon, 11 May 2020 at 00:09, François Dumont via Libstdc++
 wrote:

I just committed this patch.

This was a commit-without-review. When the patch was originally
posted, the maintainer said
"Let's revisit it in a few weeks.". That's not the same as "OK when
stage1 reopens."


I don't know why I had in mind that it was Ok.

Now reverted.



Thanks.

N.B. The first line of the Git commit log should have a colon after
the component, i.e.

libstdc++: describe it here

not:

libstdc++ describe it here

Re: [PATCH PR94991] aarch64: ICE: Segmentation fault with option -mgeneral-regs-only

2020-05-11 Thread Richard Sandiford

"Yangfei (Felix)"  writes:
> Hi,
>
>   Witnessed another ICE with option -mgeneral-regs-only. 
>   I have created a bug for that: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94991 
>
>   For the given testcase, we are doing FAIL for scalar floating move expand 
> pattern since TARGET_FLOAT
>   is false with option -mgeneral-regs-only. But move expand pattern cannot 
> fail. It would be better to 
>   replace the FAIL with code that bitcasts to the equivalent integer mode, 
> using gen_lowpart.
>
>   Bootstrap and tested on aarch64-linux-gnu.  Comments?
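
As a source-level analogy for the idea in the quoted message (moving a floating-point value as its same-sized integer bit pattern), here is a sketch in plain C++. It says nothing about the RTL the expander produces; move_float_via_int is a hypothetical helper and the 32-bit float layout is an assumption checked by the static_assert.

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == sizeof (std::uint32_t),
               "sketch assumes a 32-bit float");

// Copy *SRC to *DST without ever treating the value as a float:
// the bits travel through a 32-bit integer instead.
inline void
move_float_via_int (float *dst, const float *src)
{
  std::uint32_t carrier;
  std::memcpy (&carrier, src, sizeof carrier);  // float bits -> integer
  std::memcpy (dst, &carrier, sizeof carrier);  // integer -> float bits
}
```

The value arrives unchanged because only the bit pattern is copied; no floating-point operation is involved at any point.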

LGTM.  Pushed with one minor formatting fix:

> @@ -1364,7 +1364,11 @@
>  if (!TARGET_FLOAT)
>{
>   aarch64_err_no_fpadvsimd (mode);
> - FAIL;
> + machine_mode intmode
> + = int_mode_for_size (GET_MODE_BITSIZE (mode), 0).require ();

The "=" should only be indented by two spaces relative to the first line.

Thanks,
Richard
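
For readers less familiar with the idea, here is a rough user-level analogy of
"do the move in the equivalent integer mode".  It is only a sketch, not the
aarch64 expander: move_via_int_bits is a hypothetical name, and it assumes a
32-bit float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not GCC code): copy a float by going through an
// integer of the same size, so no floating-point operation is involved.
inline float move_via_int_bits (float src)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t),
                 "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy (&bits, &src, sizeof bits);   // reinterpret the bit pattern
  float dst;
  std::memcpy (&dst, &bits, sizeof dst);    // move it back unchanged
  return dst;
}
```

The value is unchanged because only the bit pattern is copied, which is the
property the integer-mode move relies on.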

Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend contexts

2020-05-11 Thread Richard Sandiford

Alex Coplan  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: 06 May 2020 11:28
>> To: Alex Coplan 
>> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw ;
>> Marcus Shawcroft ; Kyrylo Tkachov
>> ; nd 
>> Subject: Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend
>> contexts
>>
>> Alex Coplan  writes:
>> >> -Original Message-
>> >> From: Richard Sandiford 
>> >> Sent: 30 April 2020 15:13
>> >> To: Alex Coplan 
>> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
>> ;
>> >> Marcus Shawcroft ; Kyrylo Tkachov
>> >> ; nd 
>> >> Subject: Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend
>> contexts
>> >>
>> >> Yeah, I was hoping for something like...
>> >>
>> >> > Indeed, clang generates a MVN + CSEL sequence where the CSEL
>> operates on the
>> >> > 64-bit registers:
>> >> >
>> >> > f:
>> >> > mvn w8, w2
>> >> > cmp w0, #0
>> >> > csel x0, x8, x1, eq
>> >> > ret
>> >>
>> >> ...this rather than the 4-insn (+ret) sequence that we currently
>> >> generate.  So it would have been a define_insn_and_split that handles
>> >> the zero case directly but splits into the "optimal" two-instruction
>> >> sequence for registers.
>> >>
>> >> But I guess the underlying problem is instead that we don't have
>> >> a pattern for (zero_extend:DI (not:SI ...)).  Adding that would
>> >> definitely be a better fix.
>> >
>> > Yes. I sent a patch for this very fix which Kyrill is going to commit
>> once stage
>> > 1 opens: https://gcc.gnu.org/pipermail/gcc-patches/2020-
>> April/544365.html
>>
>> Sorry, missed that.
>>
>> It looks like that patch hinders this one though.  Trying it with
>> current master (where that patch is applied), I get:
>>
>> FAIL: gcc.target/aarch64/csinv-neg.c check-function-bodies inv_zero1
>> FAIL: gcc.target/aarch64/csinv-neg.c check-function-bodies inv_zero2
>>
>> It looks like a costs issue:
>>
>> Trying 27 -> 18:
>>27: r99:DI=zero_extend(~r101:SI)
>>   REG_DEAD r101:SI
>>18: x0:DI={(cc:CC==0)?r99:DI:0}
>>   REG_DEAD cc:CC
>>   REG_DEAD r99:DI
>> Successfully matched this instruction:
>> (set (reg/i:DI 0 x0)
>> (if_then_else:DI (eq (reg:CC 66 cc)
>> (const_int 0 [0]))
>> (zero_extend:DI (not:SI (reg:SI 101)))
>> (const_int 0 [0])))
>> rejecting combination of insns 27 and 18
>> original costs 4 + 4 = 8
>> replacement cost 12
>>
>> I guess we'll need to teach aarch64_if_then_else_costs about the costs
>> of the new insns.
>
> Ah, thanks for catching this. I've attached an updated patch which fixes the
> costs issue here. With the new patch, all the test cases in csinv-neg.c now
> pass. In addition, I've done a bootstrap and regtest on aarch64-linux with no
> new failures.
>
> Do you think we need to add cases to aarch64_if_then_else_costs for the other
> new insns, or is the attached patch OK for master?

Looks good as-is, thanks.  Just a couple of very minor nits:

> 2020-05-07  Alex Coplan  
>
> * config/aarch64/aarch64.c (aarch64_if_then_else_costs): Add case to 
> correctly
> calculate cost for new pattern (*csinv3_uxtw_insn3).

ChangeLog lines follow the 80-character limit.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index e92c7e69fcb..efb3da7a7fc 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11695,6 +11695,15 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, int *cost, bool speed)
> op1 = XEXP (op1, 0);
> op2 = XEXP (op2, 0);
>   }
> +  else if (GET_CODE (op1) == ZERO_EXTEND && op2 == const0_rtx)
> + {
> +   inner = XEXP (op1, 0);
> +   if (GET_CODE (inner) == NEG || GET_CODE (inner) == NOT)
> +   {
> + /* CSINV/NEG with zero extend + const 0 (*csinv3_uxtw_insn3).  */
> + op1 = XEXP (inner, 0);
> +   }

GCC style is to avoid { ... } around single statements, unless it's
needed to avoid an ambiguity.  If we did use { ... }, the block
should be indented by two further spaces, so that the "{" is
under the space after "if".

Pushed with those changes, thanks.

Richard
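
As a source-level illustration of the RTL shape costed above (a conditional
select between a zero-extended NOT or NEG and zero), something like the
following would be expected to exercise it.  This is a sketch only: the
function names are made up here and are not the ones in csinv-neg.c.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: 32-bit NOT, zero-extended to 64 bits, or zero.
std::uint64_t inv_or_zero (std::uint32_t a, std::uint32_t b)
{
  return a == 0 ? static_cast<std::uint32_t> (~b) : 0u;
}

// Hypothetical example: 32-bit NEG, zero-extended to 64 bits, or zero.
std::uint64_t neg_or_zero (std::uint32_t a, std::uint32_t b)
{
  return a == 0 ? static_cast<std::uint32_t> (-b) : 0u;
}
```

The upper 32 bits of the result are always zero, which is what distinguishes
this from a 64-bit NOT/NEG followed by a select.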

Re: [PATCH] Add C++2a synchronization support

2020-05-11 Thread Jonathan Wakely via Gcc-patches


On 09/05/20 17:01 -0700, Thomas Rodgers via Libstdc++ wrote:

* Note, this patch supersedes my previous atomic wait and semaphore
patches.

Add support for -
   atomic wait/notify_one/notify_all
   counting_semaphore
   binary_semaphore
   latch

   * include/Makefile.am (bits_headers): Add new header.
   * include/Makefile.in: Regenerate.
   * include/bits/atomic_base.h (__atomic_base<_Itp>:wait): Define.


Should be two colons before wait.


diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index eb437ad8d8d..e73ff8b3e64 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in


Generated files don't need to be in the patch.


diff --git a/libstdc++-v3/include/bits/atomic_base.h 
b/libstdc++-v3/include/bits/atomic_base.h
index 87fe0bd6000..b2cec0f1722 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -37,6 +37,11 @@
#include 
#include 

+#if __cplusplus > 201703L
+#include 
+#include 


 shouldn't be here (it adds runtime cost, as well as
compile-time).


@@ -542,6 +546,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __cmpexch_failure_order(__m));
  }

+#if __cplusplus > 201703L
+  _GLIBCXX_ALWAYS_INLINE void
+  wait(__int_type __old, memory_order __m = memory_order_seq_cst) const 
noexcept


Please format everything to <= 80 columns (ideally < 80).


+  wait(__pointer_type __old, memory_order __m = memory_order_seq_cst) 
noexcept
+  {
+   __atomic_wait(&_M_p, __old,


This should be qualified to prevent ADL.


+ [__m, this, __old]()
+ { return this->load(__m) != __old; });
+  }
+
+  // TODO add const volatile overload
+
+  _GLIBCXX_ALWAYS_INLINE void
+  notify_one() const noexcept
+  { __atomic_notify(&_M_p, false); }


Qualify to prevent ADL here too, and all similar calls.
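
To show why the qualification matters, here is a minimal sketch of ADL picking
up an unrelated function.  The namespaces and names (lib, user, helper) are
hypothetical and stand in for std and a user's type.

```cpp
#include <cassert>

namespace lib
{
  // Internal helper the library intends to call.
  template<typename T>
    int helper (T *) { return 1; }

  // Unqualified call: argument-dependent lookup also searches T's namespace.
  template<typename T>
    int call_unqualified (T *p) { return helper (p); }

  // Qualified call: only lib::helper is considered.
  template<typename T>
    int call_qualified (T *p) { return lib::helper (p); }
}

namespace user
{
  struct widget { };

  // Unrelated function that happens to share the helper's name.
  int helper (widget *) { return 2; }
}
```

With a user::widget argument the unqualified call resolves to user::helper
(a non-template exact match wins), while the qualified call still reaches
lib::helper.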


+#if __cplusplus > 201703L
+template
+  _GLIBCXX_ALWAYS_INLINE void
+  wait(const _Tp* __ptr, _Val<_Tp> __old, memory_order __m = 
memory_order_seq_cst) noexcept
+  {
+   __atomic_wait(__ptr, *std::__addressof(__old),


Can't this just be __old instead of *std::__addressof(__old) ?


+ [=]()
+ { return load(__ptr, __m) == *std::__addressof(__old); });


Same here?


diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h 
b/libstdc++-v3/include/bits/atomic_timed_wait.h
new file mode 100644
index 000..10f0fe50ed9
--- /dev/null
+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
@@ -0,0 +1,270 @@
+// -*- C++ -*- header.
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file bits/atomic_timed_wait.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{atomic}
+ */
+
+#ifndef _GLIBCXX_ATOMIC_TIMED_WAIT_H
+#define _GLIBCXX_ATOMIC_TIMED_WAIT_H 1
+
+#pragma GCC system_header
+
+#include 
+#include 
+#include 
+
+#include 
+
+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#include 
+#endif
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+  _GLIBCXX_BEGIN_NAMESPACE_VERSION
+  enum class __atomic_wait_status { __no_timeout, __timeout };


Blank line before and after this enum definition please.


+  namespace __detail
+  {
+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+enum
+{
+  __futex_wait_bitset_private = __futex_wait_bitset | __futex_private_flag,
+  __futex_wake_bitset_private = __futex_wake_bitset | __futex_private_flag,
+  __futex_bitset_match_any = 0x
+};
+
+using __platform_wait_clock_t = chrono::steady_clock;


Blank line after this using-decl please.


+template
+  __atomic_wait_status
+  __platform_wait_until_impl(__platform_wait_t* __addr, __platform_wait_t 
__val,
+const chrono::time_point<__platform_wait_clock_t, 
_Duration>& __atime) noexcept
+  {
+   auto __s =

Re: [libitm] eh specifications are lax

2020-05-11 Thread Nathan Sidwell


ping?

On 5/5/20 4:08 PM, Nathan Sidwell wrote:

I discovered that libitm:
(a) declares __cxa_allocate_exception and friends directly,
(b) doesn't mark them as 'throw()'
(c) doesn't mark the replacement fns _ITM_$foo as nothrow either

We happen to get away with it because of code in the compiler that, 
although it checks the parameter types, doesn't check the exception 
specification.  (One reason being they used to not be part of the 
language's type system, but now they are.)  I suspect this can lead us 
to generate pessimal code later, if we've seen one of these decls 
earlier.  Anyway, with modules it becomes trickier[*], so I'm trying to 
clean it up and not be a problem.  I see Jakub fixed part of the problem 
(https://gcc.gnu.org/pipermail/gcc-patches/2018-December/513302.html) 
AFAICT, he did fix libitm's decls, but left the lax parm-type checking 
in the compiler.


libitm.h is not very informative about specification:
   in version 1 of http://www.intel.com/some/path/here.pdf.  */

Anyway, it was too fiddly to have libitm pick up the declarations from 
libsupc++.  Besides it makes them weak declarations, and then provides 
definitions for non-elf systems.  So this patch adds the expected 'throw()'


While I can't be sure, I suspect the _ITM entry points are supposed to 
have the same exception specification as the original entry points.  So 
those are also made 'throw ()'.  libstdc++'s _GLIBCXX_NOTHROW didn't 
seem available, so I make use of a new _ITM_NOTHROW macro, suitably 
defined.


Because of the lax checking in the compiler, an old compiler with a 
patched libitm.h will be ok...  Until I change the compiler :)


booted & tested on x86_64-linux, ok?

nathan

[*] modules make it harder to have ODR violations, that's why it finds 
ODR violations in existing code.
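
A small sketch of what a missing exception specification means to the
compiler.  The declarations are hypothetical stand-ins (not the real
__cxa_* or _ITM_* prototypes), and noexcept is used in place of the older
'throw()' spelling.

```cpp
#include <cassert>

// Hypothetical declarations; neither is defined or called, they are only
// inspected with the noexcept operator (an unevaluated context).
void *alloc_plain (unsigned long);              // no exception specification
void *alloc_nothrow (unsigned long) noexcept;   // marked non-throwing

// What the compiler is allowed to assume about a call to each.
constexpr bool plain_call_is_nothrow = noexcept (alloc_plain (0));
constexpr bool marked_call_is_nothrow = noexcept (alloc_nothrow (0));
```

For the unmarked declaration the compiler has to assume the call may throw,
which is where the potentially pessimal code comes from.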





--
Nathan Sidwell

[PATCH] c++: Enable spec_hasher table sanitization [PR87847]

2020-05-11 Thread Patrick Palka via Gcc-patches

It looks like hash table sanitization is now safe to enable for the
decl_specializations and type_specializations tables, probably ever
since PR94454 was fixed.

Bootstrapped and regtested on x86_64-pc-linux-gnu with the attached
debugging patch that makes all entries hash to 0, and also successfully
built the range-v3 testsuite and a number of other libraries with this
debugging patch.  Does this look OK to commit?
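
The invariant being checked is that entries which compare equal also hash
equal.  A standalone sketch of that check, with hypothetical types and
functions rather than GCC's hash_table or spec_hasher:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a specialization entry: equality and hashing
// look at tmpl and args only, ignoring spec.
struct entry { int tmpl; int args; int spec; };

bool entry_equal (const entry &a, const entry &b)
{ return a.tmpl == b.tmpl && a.args == b.args; }

std::size_t good_hash (const entry &e)
{ return static_cast<std::size_t> (e.tmpl) * 31u
         + static_cast<std::size_t> (e.args); }

// Deliberately wrong: depends on a field that equality ignores.
std::size_t bad_hash (const entry &e)
{ return static_cast<std::size_t> (e.spec); }

// True if every pair of equal entries has equal hashes under HASH.
bool hashes_consistent (const std::vector<entry> &v,
                        std::size_t (*hash) (const entry &))
{
  for (std::size_t i = 0; i < v.size (); ++i)
    for (std::size_t j = i + 1; j < v.size (); ++j)
      if (entry_equal (v[i], v[j]) && hash (v[i]) != hash (v[j]))
        return false;
  return true;
}
```

Making every entry hash to 0, as the debugging patch does, forces all lookups
into one bucket so that equality is exercised on every pair.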

gcc/cp/ChangeLog:

PR c++/87847
* pt.c (init_template_processing): Enable sanitization for
decl_specializations and type_specializations.
---
 gcc/cp/pt.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c6091127225..2d1869816c5 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29441,9 +29441,8 @@ declare_integer_pack (void)
 void
 init_template_processing (void)
 {
-  /* FIXME: enable sanitization (PR87847) */
-  decl_specializations = hash_table<spec_hasher>::create_ggc (37, false);
-  type_specializations = hash_table<spec_hasher>::create_ggc (37, false);
+  decl_specializations = hash_table<spec_hasher>::create_ggc (37);
+  type_specializations = hash_table<spec_hasher>::create_ggc (37);
 
   if (cxx_dialect >= cxx11)
 declare_integer_pack ();
-- 
2.26.2.561.g07d8ea56f2

From c807f9ae8871d6797fc06c229b2cc5b44e364d31 Mon Sep 17 00:00:00 2001
From: Patrick Palka 
Date: Fri, 8 May 2020 14:04:34 -0400
Subject: [PATCH] Verify equal spec_hasher entries have equal hashes

---
 gcc/cp/pt.c | 41 -
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c6091127225..c9b39f126ba 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1291,7 +1291,7 @@ retrieve_specialization (tree tmpl, tree args, hashval_t hash)
 
   if (hash == 0)
 	hash = spec_hasher::hash (&elt);
-  found = specializations->find_with_hash (&elt, hash);
+  found = specializations->find_with_hash (&elt, 0);
   if (found)
 	return found->spec;
 }
@@ -1574,7 +1574,7 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
 	hash = spec_hasher::hash (&elt);
 
   slot =
-	decl_specializations->find_slot_with_hash (&elt, hash, INSERT);
+	decl_specializations->find_slot_with_hash (&elt, 0, INSERT);
   if (*slot)
 	fn = ((spec_entry *) *slot)->spec;
   else
@@ -1704,6 +1704,15 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
 
 int comparing_specializations;
 
+/* Returns a hash for a template TMPL and template arguments ARGS.  */
+
+static hashval_t
+hash_tmpl_and_args (tree tmpl, tree args)
+{
+  hashval_t val = iterative_hash_object (DECL_UID (tmpl), 0);
+  return iterative_hash_template_arg (args, val);
+}
+
 bool
 spec_hasher::equal (spec_entry *e1, spec_entry *e2)
 {
@@ -1726,25 +1735,19 @@ spec_hasher::equal (spec_entry *e1, spec_entry *e2)
 }
   --comparing_specializations;
 
+  if (equal)
+gcc_assert (hash_tmpl_and_args (e1->tmpl, e1->args)
+		== hash_tmpl_and_args (e2->tmpl, e2->args));
   return equal;
 }
 
-/* Returns a hash for a template TMPL and template arguments ARGS.  */
-
-static hashval_t
-hash_tmpl_and_args (tree tmpl, tree args)
-{
-  hashval_t val = iterative_hash_object (DECL_UID (tmpl), 0);
-  return iterative_hash_template_arg (args, val);
-}
-
 /* Returns a hash for a spec_entry node based on the TMPL and ARGS members,
ignoring SPEC.  */
 
 hashval_t
-spec_hasher::hash (spec_entry *e)
+spec_hasher::hash (spec_entry *)
 {
-  return hash_tmpl_and_args (e->tmpl, e->args);
+  return 0;
 }
 
 /* Recursively calculate a hash value for a template argument ARG, for use
@@ -9593,7 +9596,6 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
   spec_entry **slot;
   spec_entry *entry;
   spec_entry elt;
-  hashval_t hash;
 
   if (identifier_p (d1))
 {
@@ -9770,8 +9772,7 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
   elt.tmpl = gen_tmpl;
   elt.args = arglist;
   elt.spec = NULL_TREE;
-  hash = spec_hasher::hash (&elt);
-  entry = type_specializations->find_with_hash (&elt, hash);
+  entry = type_specializations->find_with_hash (&elt, 0);
 
   if (entry)
 	return entry->spec;
@@ -10057,7 +10058,6 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
 		 use it for hash table lookup.  */
 	  elt.tmpl = found;
 	  elt.args = arglist = INNERMOST_TEMPLATE_ARGS (arglist);
-	  hash = spec_hasher::hash (&elt);
 	}
 	}
 
@@ -10065,7 +10065,7 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
   SET_TYPE_TEMPLATE_INFO (t, build_template_info (found, arglist));
 
   elt.spec = t;
-  slot = type_specializations->find_slot_with_hash (&elt, hash, INSERT);
+  slot = type_specializations->find_slot_with_hash (&elt, 0, INSERT);
   gcc_checking_assert (*slot == NULL);
   entry = ggc_alloc ();
   *entry = elt;
@@ -29441,9 +29441,8 @@ declare_integer_pack (void)
 void
 init_template_processing (void)
 {
-  /*

Re: [PATCH PR94969]Add unit distant vector to DDR in case of invariant access functions

2020-05-11 Thread Richard Biener via Gcc-patches

On Mon, May 11, 2020 at 7:52 AM bin.cheng via Gcc-patches
 wrote:
>
> Hi,
> As analyzed in PR94969, data dependence analysis now misses dependence vector 
> for specific case in which DRs in DDR have the same invariant access 
> functions.  This simple patch fixes the issue by also covering invariant 
> cases.  Bootstrap and test on x86_64, is it OK?

OK.

Thanks,
Richard.

> Thanks,
> bin
>
> 2020-05-11  Bin Cheng  
>
> PR tree-optimization/94969
> * tree-data-dependence.c (constant_access_functions): Rename to...
> (invariant_access_functions): ...this.  Add parameter.  Check for
> invariant access function, rather than constant.
> (build_classic_dist_vector): Call above function.
> * tree-loop-distribution.c (pg_add_dependence_edges): Add comment.
>
> gcc/testsuite
> 2020-05-11  Bin Cheng  
>
> PR tree-optimization/94969
> * gcc.dg/tree-ssa/pr94969.c: New test.

Re: [PATCH] Refactor tree-vrp.c

2020-05-11 Thread Richard Biener via Gcc-patches

On Fri, May 8, 2020 at 7:11 PM Jeff Law via Gcc-patches
 wrote:
>
> On Fri, 2020-05-08 at 13:06 -0300, Giuliano Belinassi via Gcc-patches wrote:
> > Hi,
> >
> > This patch refactors tree-vrp.c to eliminate all global variables except
> > 'x_vrp_values', which will require 'thread_outgoing_edges' to
> > accept an extra argument and pass it to the 'simplify' callback.
> >
> > It also removes every access to 'cfun', retrieving the function being
> > compiled from the pass engine.
> >
> > Bootstrapped and ran the testsuite on Linux x86_64.
> >
> > gcc/ChangeLog
> > 2020-05-08  Giuliano Belinassi  
> >
> >   * tree-vrp.c (class liveness): New.
> >   (insert_range_assertions): Move to class liveness.
> >   (dump_all_asserts): Same as above.
> >   (dump_asserts_for): Same as above.
> >   (live): Same as above.
> >   (need_assert_for): Same as above.
> >   (live_on_edge): Same as above.
> >   (finish_register_edge_assert_for): Same as above.
> >   (find_switch_asserts): Same as above.
> >   (find_assert_locations): Same as above.
> >   (find_assert_locations_1): Same as above.
> >   (find_conditional_asserts): Same as above.
> >   (process_assert_insertions): Same as above.
> >   (register_new_assert_for): Same as above.
> >   (vrp_prop): New variable fun.
> >   (vrp_initialize): New parameter.
> >   (identify_jump_threads): Same as above.
> >   (execute_vrp): Same as above.
> Just a note.  The old VRP implementation in tree-vrp.c is on the chopping
> block, but it'll likely be the end of summer before we know if further work in
> the new Ranger based implementation will be needed to totally replace tree-vrp
> w/o introducing any performance regressions.
>
> Thus, IMHO, we should go forward with the review.

Agreed, so I went ahead and reviewed it.  The only comment I have is
that 'liveness' is not a good match for the machinery which is about
insertion of ASSERT_EXPR stmts for VRP.  I suggest to use
vrp_insert or vrp_asserts instead.

OK with that change.
Richard.

>
> Jeff
>
>
>

Re: [PATCH] ASAN: do not rewrite param for DECL_NOT_GIMPLE_REG_P.

2020-05-11 Thread Richard Biener via Gcc-patches

On Mon, May 11, 2020 at 1:26 PM Martin Liška  wrote:
>
> Hi.
>
> Starting from r11-165-geb72dc663e9070b2 we should not rewrite parameters that
> have DECL_NOT_GIMPLE_REG_P set to true.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Hmm, I think the fix is to clear DECL_NOT_GIMPLE_REG_P instead
where the code clears TREE_ADDRESSABLE of 'arg'

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2020-05-11  Martin Liska  
>
> PR sanitizer/95033
> * sanopt.c (sanitize_rewrite_addressable_params):
> Do not rewrite for DECL_NOT_GIMPLE_REG_P.
>
> gcc/testsuite/ChangeLog:
>
> 2020-05-11  Martin Liska  
>
> PR sanitizer/95033
> * g++.dg/asan/function-argument-4.C: New test.
> * gcc.dg/asan/pr95033.c: New test.
> ---
>   gcc/sanopt.c  |  1 +
>   .../g++.dg/asan/function-argument-4.C | 26 +++
>   gcc/testsuite/gcc.dg/asan/pr95033.c   | 13 ++
>   3 files changed, 40 insertions(+)
>   create mode 100644 gcc/testsuite/g++.dg/asan/function-argument-4.C
>   create mode 100644 gcc/testsuite/gcc.dg/asan/pr95033.c
>
>

[PATCH] Fold &MEM[0 + CST]->a.b.c to a constant

2020-05-11 Thread Richard Biener



This canonicalizes those to a constant literal.
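
A user-level way to picture the fold: taking the address of a nested member
through a constant base pointer is just that constant plus the member's byte
offset.  The sketch below uses hypothetical struct names and checks the
arithmetic against a real object instead of dereferencing a constant address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical nested layout standing in for "->a.b.c".
struct inner { int x; int c; };
struct middle { char pad; inner b; };
struct outer { long head; middle a; };

// The constant that an address of the form (constant base)->a.b.c
// is expected to reduce to: base plus the accumulated member offsets.
constexpr std::uintptr_t folded_address (std::uintptr_t cst)
{
  return cst + offsetof (outer, a) + offsetof (middle, b)
         + offsetof (inner, c);
}

// Sanity check against an actual object's member address.
inline bool matches_real_object (const outer &o)
{
  return reinterpret_cast<std::uintptr_t> (&o.a.b.c)
         == folded_address (reinterpret_cast<std::uintptr_t> (&o));
}
```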

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

2020-05-11  Richard Biener  

* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Canonicalize
literal constant &MEM[..] to a constant literal.
---
 gcc/gimple-fold.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 55b78fa284f..e4507877a0c 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -4840,6 +4840,7 @@ static bool
 maybe_canonicalize_mem_ref_addr (tree *t)
 {
   bool res = false;
+  tree *orig_t = t;
 
   if (TREE_CODE (*t) == ADDR_EXPR)
 t = &TREE_OPERAND (*t, 0);
@@ -4940,6 +4941,28 @@ maybe_canonicalize_mem_ref_addr (tree *t)
}
 }
 
+  else if (TREE_CODE (*orig_t) == ADDR_EXPR
+  && TREE_CODE (*t) == MEM_REF
+  && TREE_CODE (TREE_OPERAND (*t, 0)) == INTEGER_CST)
+{
+  tree base;
+  poly_int64 coffset;
+  base = get_addr_base_and_unit_offset (TREE_OPERAND (*orig_t, 0),
+   &coffset);
+  if (base)
+   {
+ gcc_assert (TREE_CODE (base) == MEM_REF);
+ poly_int64 moffset;
+ if (mem_ref_offset (base).to_shwi (&moffset))
+   {
+ coffset += moffset;
+ coffset += tree_to_poly_int64 (TREE_OPERAND (base, 0));
+ *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset);
+ return true;
+   }
+   }
+}
+
   /* Canonicalize TARGET_MEM_REF in particular with respect to
  the indexes becoming constant.  */
   else if (TREE_CODE (*t) == TARGET_MEM_REF)
-- 
2.16.4

Re: [PATCH] rs6000: Built-in cleanups for vec_clzm, vec_ctzm, and vec_gnb.

2020-05-11 Thread Segher Boessenkool

Hi!

On Sat, May 09, 2020 at 08:08:34PM -0500, Bill Schmidt wrote:
> I should have noticed this patch before submitting Kelvin's earlier
> related patches, sorry.  I think it should still be fine to apply
> the patches in order, but if you'd like me to combine this into the
> two earlier ones, I'd be happy to do that.

The intermediary step works just fine as well, so it is fine as-is.

One thing:

>   * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
>   Change fourth operand for vec_ternarylogic to require
>   compatibility with unsigned SImode rather than unsigned QImode.

Is it still checked for range 0..255 though?  (If the compiler can
derive that).

In either case, if that is what the ABI says, that is what the ABI says,
so okay for trunk.

Thanks!

Segher

Re: Fix Debug mode Undefined Behavior

2020-05-11 Thread François Dumont via Gcc-patches


On 11/05/20 12:51 pm, Ville Voutilainen wrote:

On Mon, 11 May 2020 at 00:09, François Dumont via Libstdc++
 wrote:

I just committed this patch.

This was a commit-without-review. When the patch was originally
posted, the maintainer said
"Let's revisit it in a few weeks.". That's not the same as "OK when
stage1 reopens."


I don't know why I had in mind that it was Ok.

Now reverted.

[PATCH] ASAN: do not rewrite param for DECL_NOT_GIMPLE_REG_P.

2020-05-11 Thread Martin Liška


Hi.

Starting from r11-165-geb72dc663e9070b2 we should not rewrite parameters that
have DECL_NOT_GIMPLE_REG_P set to true.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-05-11  Martin Liska  

PR sanitizer/95033
* sanopt.c (sanitize_rewrite_addressable_params):
Do not rewrite for DECL_NOT_GIMPLE_REG_P.

gcc/testsuite/ChangeLog:

2020-05-11  Martin Liska  

PR sanitizer/95033
* g++.dg/asan/function-argument-4.C: New test.
* gcc.dg/asan/pr95033.c: New test.
---
 gcc/sanopt.c  |  1 +
 .../g++.dg/asan/function-argument-4.C | 26 +++
 gcc/testsuite/gcc.dg/asan/pr95033.c   | 13 ++
 3 files changed, 40 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/asan/function-argument-4.C
 create mode 100644 gcc/testsuite/gcc.dg/asan/pr95033.c


diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index 86180e32c7e..28a63442f4d 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -1155,6 +1155,7 @@ sanitize_rewrite_addressable_params (function *fun)
   if (TREE_ADDRESSABLE (arg)
 	  && !TREE_ADDRESSABLE (type)
 	  && !TREE_THIS_VOLATILE (arg)
+	  && !DECL_NOT_GIMPLE_REG_P (arg)
 	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
 	{
 	  TREE_ADDRESSABLE (arg) = 0;
diff --git a/gcc/testsuite/g++.dg/asan/function-argument-4.C b/gcc/testsuite/g++.dg/asan/function-argument-4.C
new file mode 100644
index 000..cec1f1d788f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/function-argument-4.C
@@ -0,0 +1,26 @@
+// { dg-do run }
+// { dg-shouldfail "asan" }
+
+#include 
+
+static __attribute__ ((noinline)) long double
+goo (long double _Complex *a)
+{
+  return crealf(*(volatile _Complex long double *)a);
+}
+
+__attribute__ ((noinline)) float
+foo (float _Complex arg)
+{
+  return goo ((long double _Complex *)&arg);
+}
+
+int
+main ()
+{
+  return foo (3 + 2 * I);
+}
+
+// { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow on address.*(\n|\r\n|\r)" }
+// { dg-output "READ of size \[0-9\]* at.*" }
+// { dg-output ".*'arg' \\(line 13\\) <== Memory access at offset \[0-9\]* partially overflows this variable.*" }
diff --git a/gcc/testsuite/gcc.dg/asan/pr95033.c b/gcc/testsuite/gcc.dg/asan/pr95033.c
new file mode 100644
index 000..1228b7edcdb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr95033.c
@@ -0,0 +1,13 @@
+/* PR sanitizer/95033 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address" } */
+
+struct a
+{
+  int b;
+};
+
+struct a c(_Complex d)
+{
+  return *(struct a *)&d;
+}

Re: Fix Debug mode Undefined Behavior

2020-05-11 Thread Ville Voutilainen via Gcc-patches

On Mon, 11 May 2020 at 00:09, François Dumont via Libstdc++
 wrote:
>
> I just committed this patch.

This was a commit-without-review. When the patch was originally
posted, the maintainer said
"Let's revisit it in a few weeks.". That's not the same as "OK when
stage1 reopens."

Re: [PATCH] rs6000: Add xxeval and vec_ternarylogic

2020-05-11 Thread Segher Boessenkool

Hi!

On Sat, May 09, 2020 at 01:15:44PM -0400, David Edelsohn wrote:
> Okay with those changes, plus any issues noticed by Segher.

The only thing I can add is, I hope Bill's builtin work will make it
unnecessary to also define QUINARY and SENARY macros (yes I looked it
up), before we need those :-)

Thanks!


Segher

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-11 Thread Richard Sandiford

Richard Biener  writes:
> On Mon, 11 May 2020, Alexander Monakov wrote:
>
>> On Mon, 11 May 2020, Richard Sandiford wrote:
>> 
>> > Like you say, the idea is that since the operation is commutative and
>> > is the same in both vector and scalar form, there's no reason to require
>> > any -ffast-math flags.
>> 
>> Note that PR88540 that Richard is referencing uses open-coded x < y ? x : y
>> (non-commutative) and we want to use SSE minpd even without -ffast-math, as
>> SSE min/max insns match semantics of open-coded ternary operators.
>> 
>> (unlike Arm SIMD, SSE does not have a way to compute fmin/fmax with a
>> single instruction in presence of NaNs)
>
> Indeed.  So it looks like for SSE we eventually want phiopt to generate
> a COND_EXPR here and a new optabs cond_fmin cond_fmax that could be
> used to vectorize this?  cond_fmin and cond_fmax can neither be
> treated as MIN_EXPR or MAX_EXPR nor fmin/fmax since it is not
> commutative.

I know I'm taking the name too literally, but: "cond_foo" in the sense
of "IFN_COND_FOO (cond, ..., fallback)" is always supposed to be the
equivalent of:

  cond ? IFN_FOO (...) : fallback

So cond_fmin would be the conditional form of fmin.

Thanks,
Richard
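
Spelled out element-wise, the "cond ? IFN_FOO (...) : fallback" reading for
fmin would look roughly like this.  It is a scalar sketch with a made-up
function name, assuming all four inputs have the same length; it is not an
internal-function implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model of a conditional fmin: where cond[i] is set, take
// fmin (a[i], b[i]); elsewhere keep fallback[i].
std::vector<double>
cond_fmin_sketch (const std::vector<bool> &cond,
                  const std::vector<double> &a,
                  const std::vector<double> &b,
                  const std::vector<double> &fallback)
{
  std::vector<double> out (cond.size ());
  for (std::size_t i = 0; i < cond.size (); ++i)
    out[i] = cond[i] ? std::fmin (a[i], b[i]) : fallback[i];
  return out;
}
```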

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-11 Thread Richard Biener

On Mon, 11 May 2020, Alexander Monakov wrote:

> On Mon, 11 May 2020, Richard Sandiford wrote:
> 
> > Like you say, the idea is that since the operation is commutative and
> > is the same in both vector and scalar form, there's no reason to require
> > any -ffast-math flags.
> 
> Note that PR88540 that Richard is referencing uses open-coded x < y ? x : y
> (non-commutative) and we want to use SSE minpd even without -ffast-math, as
> SSE min/max insns match semantics of open-coded ternary operators.
> 
> (unlike Arm SIMD, SSE does not have a way to compute fmin/fmax with a
> single instruction in presence of NaNs)

Indeed.  So it looks like for SSE we eventually want phiopt to generate
a COND_EXPR here and a new optabs cond_fmin cond_fmax that could be
used to vectorize this?  cond_fmin and cond_fmax can neither be
treated as MIN_EXPR or MAX_EXPR nor fmin/fmax since it is not
commutative.

The reason why I came back to this is that the x86 backend has
define_insns that match the conditional form, so RTL if-conversion
knows how to generate this, but after my patch to add some
bit-insert/bit-field-ref combining patterns to GIMPLE, RTL if-conversion
cannot recognize them anymore.

I'm also pretty sure we do not want cond_fmin/max IFNs on GIMPLE
early.

That said, I can recover the x86 testcases by just adjusting phiopt
here to generate a COND_EXPR (vectorizer ifcvt doesn't handle this
case either - which is why it is not vectorized I guess).  Not sure
if it really fits minmax detection since for sure the "combinations"
it does do not apply to the conditional form without extra careful
checking (is there any difference between x >= y ? x : y and
x > y ? x : y?)

Anyway, I need to sit down and more closely look at this.  So the
x86 part of the patch is clearly bogus and the phiopt part to
match the conditional form as IFN_FMIN/MAX as well(?)

Richard.
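
For reference, a small sketch of how the open-coded forms differ from fmin
and from each other, assuming IEEE 754 semantics (no -ffast-math).  The
function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Open-coded minimum: returns y whenever the comparison is false,
// which includes every comparison involving a NaN.
inline double open_coded_min (double x, double y) { return x < y ? x : y; }

// Two open-coded maximum spellings that differ only in the comparison.
inline double max_ge (double x, double y) { return x >= y ? x : y; }
inline double max_gt (double x, double y) { return x > y ? x : y; }
```

With a NaN operand the open-coded min depends on operand order, whereas
std::fmin returns the non-NaN operand either way.  The >= and > spellings
agree on NaNs (both return y) but pick different operands when x and y
compare equal, which is observable for +0.0 versus -0.0.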

Re: Fix Debug mode Undefined Behavior

2020-05-11 Thread Jonathan Wakely via Gcc-patches


On 10/05/20 23:03 +0200, François Dumont via Libstdc++ wrote:

I just committed this patch.

François

On 03/03/20 10:11 pm, François Dumont wrote:
After the fix of PR 91910 I tried to consider other possible race 
conditions and I think we still have a problem.


As stated in the PR, when a container is destroyed all associated 
iterators are made singular. If at the same time another thread tries 
to access this iterator, the _M_singular check will face a data race 


That's undefined behaviour, and the user's fault.

The problem described in the PR is different. It must be safe to
destroy the container and iterator concurrently. It does not need to
be safe to destroy the container and read from the iterator
concurrently.

It might be nice to improve the behaviour on such errors in user code,
but it's not necessary for correctness (unlike the case in the PR).

when accessing _M_sequence member. In case of race condition the 
program is likely to abort but maybe because of memory access 
violation rather than a clear singular iterator assertion.


I don't think that's a valid assumption, it might terminate with
SIGSEGV, but not SIGABRT.

To avoid this I rework _M_sequence manipulation to use atomic read 
when necessary and make sure that otherwise container mutex is 
locked.


I'm not very happy with the change. You seem to be trying to make the
debug iterators fully thread-safe, to support arbitrary concurrent
accesses to the iterators and container.  Your patch doesn't achieve
that (there are still races due to non-atomic writes that conflict
with reads), and I don't even think it's possible in general.  What
the patch does do is put more work inside the critical section
controlled by the mutex, which could make things slower.

Re: [PATCH] rs6000: Add xxgenpcvwm and xxgenpcvdm instructions

2020-05-11 Thread Segher Boessenkool

Hi!

On Sat, May 09, 2020 at 12:05:08PM -0500, Bill Schmidt wrote:
> From: Carl Love 
> 
> Add support for xxgenpcv[dw]m, along with individual and overloaded
> built-in functions for access.

>   (xxgenpcvm_): New insn.
>   (xxgenpcvm): New expansion.

Eww.  Let's please use or not use underscore in both cases.  Insns that
are not created directly should have a name starting with *.  We have
many examples of an expand with the same name as an insn (other than the
insn having a *), which isn't really confusing because the expand
usually is right before the insn.

But, in this case, you *do* call the insn directly (namely, from the
define expand!)  So maybe use a "xxgenpcvm_internal" or similar
name for the define_insn?

Okay for trunk with that improved somehow.  Thanks!

Segher

[committed] testsuite: Require gnu-tm support for pr94856.C

2020-05-11 Thread Kito Cheng

 - The testcase uses the -fgnu-tm option but does not ensure that support
   is enabled. This patch adds the test to the testcase.

* gcc/testsuite/g++.dg/ipa/pr94856.C: Require fgnu-tm.
---
 gcc/testsuite/ChangeLog| 4 
 gcc/testsuite/g++.dg/ipa/pr94856.C | 1 +
 2 files changed, 5 insertions(+)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c35e084b366b..7d28875760cf 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-05-11  Kito Cheng  
+
+   * gcc/testsuite/g++.dg/ipa/pr94856.C: Require fgnu-tm.
+
 2020-05-11  Uroš Bizjak  
 
PR target/95046
diff --git a/gcc/testsuite/g++.dg/ipa/pr94856.C 
b/gcc/testsuite/g++.dg/ipa/pr94856.C
index 5315c52d80ed..40f3a167e297 100644
--- a/gcc/testsuite/g++.dg/ipa/pr94856.C
+++ b/gcc/testsuite/g++.dg/ipa/pr94856.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-tree-dse --param uninlined-function-insns=0 --param 
early-inlining-insns=3 -fgnu-tm " } */
+/* { dg-require-effective-target fgnu_tm } */
 
 class a {
 public:
-- 
2.26.2

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-11 Thread Alexander Monakov via Gcc-patches

On Mon, 11 May 2020, Richard Sandiford wrote:

> Like you say, the idea is that since the operation is commutative and
> is the same in both vector and scalar form, there's no reason to require
> any -ffast-math flags.

Note that PR88540 that Richard is referencing uses open-coded x < y ? x : y
(non-commutative) and we want to use SSE minpd even without -ffast-math, as
SSE min/max insns match semantics of open-coded ternary operators.

(unlike Arm SIMD, SSE does not have a way to compute fmin/fmax with a
single instruction in presence of NaNs)
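
To make the difference concrete, here is a minimal C sketch (not part of
the patch).  The names ternary_min and fmin_like are hypothetical;
fmin_like is a hand-written stand-in for the NaN handling of C fmin, so
that the example does not depend on libm.  It assumes default
floating-point options (no -ffast-math).

```c
#include <math.h>   /* only for the NAN macro */

/* Open-coded minimum.  Any comparison with a NaN is false, so this
   returns y whenever x or y is a NaN.  */
double
ternary_min (double x, double y)
{
  return x < y ? x : y;
}

/* Stand-in for C fmin: if exactly one operand is a NaN, return the
   other operand, regardless of operand order.  */
double
fmin_like (double x, double y)
{
  if (x != x)   /* x is a NaN */
    return y;
  if (y != y)   /* y is a NaN */
    return x;
  return x < y ? x : y;
}

/* NaN test that does not need any library call.  */
int
is_nan (double v)
{
  return v != v;
}
```

With these definitions ternary_min (1.0, NAN) yields a NaN while
ternary_min (NAN, 1.0) yields 1.0, so the open-coded form depends on
operand order; fmin_like gives 1.0 for both orders.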

Alexander

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-11 Thread Richard Sandiford

Richard Biener  writes:
> On May 8, 2020 4:28:24 PM GMT+02:00, Alexander Monakov  
> wrote:
>>On Fri, 8 May 2020, Uros Bizjak wrote:
>>
>>> > Am I missing something?
>>> 
>>> Is the above enough to declare min/max as IEEE compliant?
>>
>>No. SSE min/max instructions semantics match C expression x < y ? x :
>>y.
>>IEEE min/max operations are commutative when exactly one operand is a
>>NaN,
>>and so are C fmin/fmax functions:
>>
>>fmin(x, NaN) == fmin(NaN, x) == x   // x is not a NaN
>>
>>In contrast, (x < y ? x : y) always returns y when x or y is a NaN, and
>>likewise the corresponding SSE instructions are not commutative.
>>
>>Therefore they are explicitly non-compliant in presence of NaNs.
>>
>>I don't know how GCC defines the semantics of GIMPLE min/max IFNs.
>
> The IFNs are supposed to match fmin and fmax from the C standard which IIRC 
> have IEEE semantics. 

Yeah, that was my understanding too (specifically the 2008 maxNum & minNum
rules, since new variants were added in 2019).

> Note the ISA likely behaves this way because it matches open coded C 
> semantics. 
>
> Arm folks added the IFNs so I have to dig up what exactly they were after...

We wanted it for pretty much exactly the kind of thing you're doing here:
having a vectorisable version of the C fmin and fmax functions.  FWIW,
Alejandro posted a patch for reductions a while back:

https://gcc.gnu.org/pipermail/gcc-patches/2018-December/513678.html

but it was posted during stage 3 and so kind-of stalled.

Like you say, the idea is that since the operation is commutative and
is the same in both vector and scalar form, there's no reason to require
any -ffast-math flags.

Thanks,
Richard

Re: [committed] i386: Vectorize basic V2SFmode operations [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches

Now with missing testcase.

On Mon, May 11, 2020 at 11:20 AM Uros Bizjak  wrote:
>
> Enable V2SFmode vectorization and vectorize V2SFmode PLUS,
> MINUS, MULT, MIN and MAX operations using XMM registers.
>
> To avoid unwanted secondary effects (e.g. exceptions), load values
> to XMM registers using MOVQ that clears high bits of the XMM
> register outside V2SFmode.
>
> The compiler now vectorizes e.g.:
>
> float r[2], a[2], b[2];
>
> void
> test_plus (void)
> {
>   for (int i = 0; i < 2; i++)
>     r[i] = a[i] + b[i];
> }
>
> to:
> movq    a(%rip), %xmm0
> movq    b(%rip), %xmm1
> addps   %xmm1, %xmm0
> movlps  %xmm0, r(%rip)
> ret
>
> gcc/ChangeLog:
>
> 2020-05-11  Uroš Bizjak  
>
> PR target/95046
> * config/i386/i386.c (ix86_vector_mode_supported_p):
> Vectorize 3dNOW! vector modes for TARGET_MMX_WITH_SSE.
> * config/i386/mmx.md (*mov_internal): Do not set
> mode of alternative 13 to V2SF for TARGET_MMX_WITH_SSE.
>
> (mmx_addv2sf3): Change operand predicates from
> nonimmediate_operand to register_mmxmem_operand.
> (addv2sf3): New expander.
> (*mmx_addv2sf3): Add SSE/AVX alternatives.  Change operand
> predicates from nonimmediate_operand to register_mmxmem_operand.
> Enable instruction pattern for TARGET_MMX_WITH_SSE.
>
> (mmx_subv2sf3): Change operand predicate from
> nonimmediate_operand to register_mmxmem_operand.
> (mmx_subrv2sf3): Ditto.
> (subv2sf3): New expander.
> (*mmx_subv2sf3): Add SSE/AVX alternatives.  Change operand
> predicates from nonimmediate_operand to register_mmxmem_operand.
> Enable instruction pattern for TARGET_MMX_WITH_SSE.
>
> (mmx_mulv2sf3): Change operand predicates from
> nonimmediate_operand to register_mmxmem_operand.
> (mulv2sf3): New expander.
> (*mmx_mulv2sf3): Add SSE/AVX alternatives.  Change operand
> predicates from nonimmediate_operand to register_mmxmem_operand.
> Enable instruction pattern for TARGET_MMX_WITH_SSE.
>
> (mmx_v2sf3): Change operand predicates from
> nonimmediate_operand to register_mmxmem_operand.
> (v2sf3): New expander.
> (*mmx_v2sf3): Add SSE/AVX alternatives.  Change operand
> predicates from nonimmediate_operand to register_mmxmem_operand.
> Enable instruction pattern for TARGET_MMX_WITH_SSE.
> (mmx_ieee_v2sf3): Ditto.
>
> testsuite/ChangeLog:
>
> 2020-05-11  Uroš Bizjak  
>
> PR target/95046
> * gcc.target/i386/pr95046-1.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {-m32}.
>
> Committed to mainline.
>
> Uros.
/* PR target/94942 */
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O3 -ffast-math -msse2" } */


float r[2], a[2], b[2];

void
test_plus (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] + b[i];
}

/* { dg-final { scan-assembler "addps" } } */

void
test_minus (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] - b[i];
}

/* { dg-final { scan-assembler "subps" } } */

void
test_mult (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] * b[i];
}

/* { dg-final { scan-assembler "mulps" } } */

void
test_min (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] < b[i] ? a[i] : b[i];
}

/* { dg-final { scan-assembler "minps" } } */

void
test_max (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] > b[i] ? a[i] : b[i];
}

/* { dg-final { scan-assembler "maxps" } } */

[committed] i386: Vectorize basic V2SFmode operations [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches

Enable V2SFmode vectorization and vectorize V2SFmode PLUS,
MINUS, MULT, MIN and MAX operations using XMM registers.

To avoid unwanted secondary effects (e.g. exceptions), load values
to XMM registers using MOVQ that clears high bits of the XMM
register outside V2SFmode.

The compiler now vectorizes e.g.:

float r[2], a[2], b[2];

void
test_plus (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = a[i] + b[i];
}

to:
movq    a(%rip), %xmm0
movq    b(%rip), %xmm1
addps   %xmm1, %xmm0
movlps  %xmm0, r(%rip)
ret
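
For illustration only, the effect of the zeroing 64-bit load can be
sketched with SSE2 intrinsics.  This is not code from the patch:
add_v2sf is a hypothetical helper, and the instructions a compiler
actually emits for the intrinsics may differ.  On targets without SSE2
the sketch falls back to a plain scalar model of the same lanes.

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Add two pairs of floats.  out4 receives all four lanes of the
   128-bit result so that the upper two lanes can be inspected.  */
void
add_v2sf (const float a[2], const float b[2], float out4[4])
{
#if defined(__SSE2__)
  /* _mm_loadl_epi64 loads 64 bits and zeroes the upper 64 bits of
     the register, like MOVQ.  */
  __m128 va = _mm_castsi128_ps (_mm_loadl_epi64 ((const __m128i *) a));
  __m128 vb = _mm_castsi128_ps (_mm_loadl_epi64 ((const __m128i *) b));
  /* The upper lanes compute 0.0f + 0.0f, which cannot raise a
     floating-point exception.  */
  _mm_storeu_ps (out4, _mm_add_ps (va, vb));
#else
  /* Portable stand-in: two real lanes, two zeroed lanes.  */
  out4[0] = a[0] + b[0];
  out4[1] = a[1] + b[1];
  out4[2] = 0.0f;
  out4[3] = 0.0f;
#endif
}
```

If the upper lanes held leftover data instead of zeros, the packed add
would also operate on that data and could raise exceptions unrelated to
the two elements the source program actually uses.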

gcc/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* config/i386/i386.c (ix86_vector_mode_supported_p):
Vectorize 3dNOW! vector modes for TARGET_MMX_WITH_SSE.
* config/i386/mmx.md (*mov_internal): Do not set
mode of alternative 13 to V2SF for TARGET_MMX_WITH_SSE.

(mmx_addv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(addv2sf3): New expander.
(*mmx_addv2sf3): Add SSE/AVX alternatives.  Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.

(mmx_subv2sf3): Change operand predicate from
nonimmediate_operand to register_mmxmem_operand.
(mmx_subrv2sf3): Ditto.
(subv2sf3): New expander.
(*mmx_subv2sf3): Add SSE/AVX alternatives.  Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.

(mmx_mulv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(mulv2sf3): New expander.
(*mmx_mulv2sf3): Add SSE/AVX alternatives.  Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.

(mmx_v2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(v2sf3): New expander.
(*mmx_v2sf3): Add SSE/AVX alternatives.  Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_ieee_v2sf3): Ditto.

testsuite/ChangeLog:

2020-05-11  Uroš Bizjak  

PR target/95046
* gcc.target/i386/pr95046-1.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {-m32}.

Committed to mainline.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b40f443ba8a..d1c0e354162 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21007,9 +21007,11 @@ ix86_vector_mode_supported_p (machine_mode mode)
 return true;
   if (TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode))
 return true;
-  if ((TARGET_MMX || TARGET_MMX_WITH_SSE) && VALID_MMX_REG_MODE (mode))
+  if ((TARGET_MMX || TARGET_MMX_WITH_SSE)
+  && VALID_MMX_REG_MODE (mode))
 return true;
-  if (TARGET_3DNOW && VALID_MMX_REG_MODE_3DNOW (mode))
+  if ((TARGET_3DNOW || TARGET_MMX_WITH_SSE)
+  && VALID_MMX_REG_MODE_3DNOW (mode))
 return true;
   return false;
 }
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 472f90f9bc1..d3e0004d3a0 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -175,7 +175,13 @@
]
(const_string "TI"))
 
-   (and (eq_attr "alternative" "13,14")
+   (and (eq_attr "alternative" "13")
+(ior (and (match_test "mode == V2SFmode")
+  (not (match_test "TARGET_MMX_WITH_SSE")))
+ (not (match_test "TARGET_SSE2"
+ (const_string "V2SF")
+
+   (and (eq_attr "alternative" "14")
 (ior (match_test "mode == V2SFmode")
  (not (match_test "TARGET_SSE2"
  (const_string "V2SF")
@@ -235,67 +241,112 @@
 (define_expand "mmx_addv2sf3"
   [(set (match_operand:V2SF 0 "register_operand")
(plus:V2SF
- (match_operand:V2SF 1 "nonimmediate_operand")
- (match_operand:V2SF 2 "nonimmediate_operand")))]
+ (match_operand:V2SF 1 "register_mmxmem_operand")
+ (match_operand:V2SF 2 "register_mmxmem_operand")))]
   "TARGET_3DNOW"
   "ix86_fixup_binary_operands_no_copy (PLUS, V2SFmode, operands);")
 
+(define_expand "addv2sf3"
+  [(set (match_operand:V2SF 0 "register_operand")
+   (plus:V2SF
+ (match_operand:V2SF 1 "register_operand")
+ (match_operand:V2SF 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (PLUS, V2SFmode, operands);")
+
 (define_insn "*mmx_addv2sf3"
-  [(set (match_operand:V2SF 0 "register_operand" "=y")
-   (plus:V2SF (match_operand:V2SF 1 "nonimmediate_operand" "%0")
-  (match_operand:V2SF 2 "nonimmediate_operand" "ym")))]
-  "TARGET_3DNOW && ix86_binary_operator_ok (PLUS, V2SFmode, operands)"
-  "pfadd\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "prefix_extra" "1")
-   (set_attr "mode" "V2SF")])

Re: [PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-05-11 Thread Richard Biener

On Mon, 11 May 2020, luoxhu wrote:

> On 2020-05-06 20:09, Richard Biener wrote:
> > On Thu, 30 Apr 2020, luoxhu wrote:
> > 
> >> Update the patch with overflow check.  Bootstrap and regression tested PASS
> >> on Power8-LE.
> >> 
> >> 
> >> Use determine_value_range to get value range info for fold convert
> >> expressions
> >> with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on
> >> wrapping overflow inner type.  i.e.:
> >> 
> >> (long unsigned int)((unsigned  int)n * 10 + 1)
> >> =>
> >> (long unsigned int)n * (long unsigned int)10 + (long unsigned int)1
> >> 
> >> With this patch for affine combination, load/store motion could detect
> >> more address refs independency and promote some memory expressions to
> >> registers within loop.
> >> 
> >> PS: Replace the previous "(T1)(X + CST) as (T1)X - (T1)(-CST))"
> >> to "(T1)(X + CST) as (T1)X + (T1)(CST))" for wrapping overflow.
> > 
> > This is OK for trunk if bootstrapped / tested properly.
> 
> 
> Bootstrap and regression tested pass on power8LE, committed to
> r11-259-g0447929f11e6a3e1b076841712b90a8b6bc7d33a, is it necessary
> to backport it to gcc-10?

For sure not.

Richard.

> 
> Thanks,
> Xionghu
> 
> 
> > 
> > Thanks,
> > Richard.
> > 
> >> gcc/ChangeLog
> >> 
> >>  2020-04-30  Xiong Hu Luo  
> >> 
> >>  PR tree-optimization/83403
> >>  * tree-affine.c (expr_to_aff_combination): Replace SSA_NAME with
> >>  determine_value_range, Add fold conversion of MULT_EXPR, fix the
> >>  previous PLUS_EXPR.
> >> 
> >> gcc/testsuite/ChangeLog
> >> 
> >>  2020-04-30  Xiong Hu Luo  
> >> 
> >>  PR tree-optimization/83403
> >>  * gcc.dg/tree-ssa/pr83403-1.c: New test.
> >>  * gcc.dg/tree-ssa/pr83403-2.c: New test.
> >>  * gcc.dg/tree-ssa/pr83403.h: New header.
> >> ---
> >>  gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c |  8 ++
> >>  gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c |  8 ++
> >>  gcc/testsuite/gcc.dg/tree-ssa/pr83403.h   | 30 
> >> +++
> >>  gcc/tree-affine.c | 24 ++
> >>  4 files changed, 60 insertions(+), 10 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
> >> 
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
> >> b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
> >> new file mode 100644
> >> index 000..748375b03af
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
> >> @@ -0,0 +1,8 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
> >> +
> >> +#define TYPE unsigned int
> >> +
> >> +#include "pr83403.h"
> >> +
> >> +/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 "lim2"
> >> } } */
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
> >> b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
> >> new file mode 100644
> >> index 000..ca2e6bbd61c
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
> >> @@ -0,0 +1,8 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
> >> +
> >> +#define TYPE int
> >> +
> >> +#include "pr83403.h"
> >> +
> >> +/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 "lim2"
> >> } } */
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
> >> b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
> >> new file mode 100644
> >> index 000..0da8a835b5f
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
> >> @@ -0,0 +1,30 @@
> >> +__attribute__ ((noinline)) void
> >> +calculate (const double *__restrict__ A, const double *__restrict__ B,
> >> + double *__restrict__ C)
> >> +{
> >> +  TYPE m = 0;
> >> +  TYPE n = 0;
> >> +  TYPE k = 0;
> >> +
> >> +  A = (const double *) __builtin_assume_aligned (A, 16);
> >> +  B = (const double *) __builtin_assume_aligned (B, 16);
> >> +  C = (double *) __builtin_assume_aligned (C, 16);
> >> +
> >> +  for (n = 0; n < 9; n++)
> >> +{
> >> +  for (m = 0; m < 10; m++)
> >> +  {
> >> +C[(n * 10) + m] = 0.0;
> >> +  }
> >> +
> >> +  for (k = 0; k < 17; k++)
> >> +  {
> >> +#pragma simd
> >> +for (m = 0; m < 10; m++)
> >> +  {
> >> +C[(n * 10) + m] += A[(k * 20) + m] * B[(n * 20) + k];
> >> +  }
> >> +  }
> >> +}
> >> +}
> >> +
> >> diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
> >> index 0eb8db1b086..5620e6bf28f 100644
> >> --- a/gcc/tree-affine.c
> >> +++ b/gcc/tree-affine.c
> >> @@ -343,24 +343,28 @@ expr_to_aff_combination (aff_tree *comb, tree_code
> >> code, tree type,
> >>   wide_int minv, maxv;
> >>   /* If inner type has wrapping overflow behavior, fold conversion
> >>   for below case:
> >> -   (T1)(X - CST) -> (T1)X - (T1)CST
> >> - if X - CST doesn't overflow by range information.  Also handle
> >> - (T1)(X + CST) as (T1)(X - (-CST)).  */
>

Re: [PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-05-11 Thread luoxhu via Gcc-patches


On 2020-05-06 20:09, Richard Biener wrote:

On Thu, 30 Apr 2020, luoxhu wrote:

Update the patch with overflow check.  Bootstrap and regression tested 
PASS on Power8-LE.



Use determine_value_range to get value range info for fold convert 
expressions
with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not 
overflow on

wrapping overflow inner type.  i.e.:

(long unsigned int)((unsigned  int)n * 10 + 1)
=>
(long unsigned int)n * (long unsigned int)10 + (long unsigned int)1

With this patch for affine combination, load/store motion could detect
more address refs independency and promote some memory expressions to
registers within loop.

PS: Replace the previous "(T1)(X + CST) as (T1)X - (T1)(-CST))"
to "(T1)(X + CST) as (T1)X + (T1)(CST))" for wrapping overflow.
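
As a small standalone illustration (not taken from the patch), the
following C sketch shows why the range check matters.  The function
names are hypothetical, and fixed-width types stand in for
"unsigned int" and "long unsigned int" on an LP64 target.

```c
#include <stdint.h>

/* (T1)(X * 10 + 1): compute in the narrow wrapping type, then widen.  */
uint64_t
convert_outer (uint32_t n)
{
  return (uint64_t) (uint32_t) (n * 10u + 1u);
}

/* (T1)X * (T1)10 + (T1)1: widen first, then compute.  */
uint64_t
convert_inner (uint32_t n)
{
  return (uint64_t) n * 10u + 1u;
}

/* The two forms agree for every n in [0, n_max] only if n * 10 + 1
   cannot wrap in 32 bits.  */
int
fold_is_safe (uint32_t n_max)
{
  return n_max <= (UINT32_MAX - 1u) / 10u;
}
```

For n = 5 both forms give 51.  For n = 429496730 the narrow computation
wraps to 5 while the widened one gives 4294967301, so the fold is only
valid when the known range of n rules out the wrap.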


This is OK for trunk if bootstrapped / tested properly.



Bootstrap and regression tested pass on power8LE, committed to
r11-259-g0447929f11e6a3e1b076841712b90a8b6bc7d33a, is it necessary
to backport it to gcc-10?


Thanks,
Xionghu




Thanks,
Richard.


gcc/ChangeLog

2020-04-30  Xiong Hu Luo  

PR tree-optimization/83403
* tree-affine.c (expr_to_aff_combination): Replace SSA_NAME with
determine_value_range, Add fold conversion of MULT_EXPR, fix the
previous PLUS_EXPR.

gcc/testsuite/ChangeLog

2020-04-30  Xiong Hu Luo  

PR tree-optimization/83403
* gcc.dg/tree-ssa/pr83403-1.c: New test.
* gcc.dg/tree-ssa/pr83403-2.c: New test.
* gcc.dg/tree-ssa/pr83403.h: New header.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c |  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c |  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr83403.h   | 30 
+++

 gcc/tree-affine.c | 24 ++
 4 files changed, 60 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403.h

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c

new file mode 100644
index 000..748375b03af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
+
+#define TYPE unsigned int
+
+#include "pr83403.h"
+
+/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 
"lim2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c

new file mode 100644
index 000..ca2e6bbd61c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
+
+#define TYPE int
+
+#include "pr83403.h"
+
+/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 
"lim2" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h

new file mode 100644
index 000..0da8a835b5f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
@@ -0,0 +1,30 @@
+__attribute__ ((noinline)) void
+calculate (const double *__restrict__ A, const double *__restrict__ 
B,

+  double *__restrict__ C)
+{
+  TYPE m = 0;
+  TYPE n = 0;
+  TYPE k = 0;
+
+  A = (const double *) __builtin_assume_aligned (A, 16);
+  B = (const double *) __builtin_assume_aligned (B, 16);
+  C = (double *) __builtin_assume_aligned (C, 16);
+
+  for (n = 0; n < 9; n++)
+{
+  for (m = 0; m < 10; m++)
+   {
+ C[(n * 10) + m] = 0.0;
+   }
+
+  for (k = 0; k < 17; k++)
+   {
+#pragma simd
+ for (m = 0; m < 10; m++)
+   {
+ C[(n * 10) + m] += A[(k * 20) + m] * B[(n * 20) + k];
+   }
+   }
+}
+}
+
diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
index 0eb8db1b086..5620e6bf28f 100644
--- a/gcc/tree-affine.c
+++ b/gcc/tree-affine.c
@@ -343,24 +343,28 @@ expr_to_aff_combination (aff_tree *comb, 
tree_code code, tree type,

wide_int minv, maxv;
/* If inner type has wrapping overflow behavior, fold conversion
   for below case:
-(T1)(X - CST) -> (T1)X - (T1)CST
-	   if X - CST doesn't overflow by range information.  Also 
handle

-  (T1)(X + CST) as (T1)(X - (-CST)).  */
+(T1)(X *+- CST) -> (T1)X *+- (T1)CST
+  if X *+- CST doesn't overflow by range information.  */
if (TYPE_UNSIGNED (itype)
&& TYPE_OVERFLOW_WRAPS (itype)
-   && TREE_CODE (op0) == SSA_NAME
&& TREE_CODE (op1) == INTEGER_CST
-   && icode != MULT_EXPR
-   && get_range_info (op0, &minv, &maxv) == VR_RANGE)
+   && determine_value_range (op0, &minv, &maxv) == VR_RANGE)
  {
+   wi::overflow_type overflow = wi::OVF_NONE;
+   signop sign = UNSIGNED;

Re: std::atomic_flag::test

2020-05-11 Thread Jonathan Wakely via Gcc-patches


On 08/05/20 17:05 +0200, Ulrich Drepper via Libstdc++ wrote:

This is not yet implemented.  Here is a patch.

2020-05-08  Ulrich Drepper  

   * include/bits/atomic_base.h (atomic_flag): Implement test
member function.
   * include/std/version: Define __cpp_lib_atomic_flag_test.
   * testsuite/29_atomics/atomic_flag/test/explicit.cc: New file.
   * testsuite/29_atomics/atomic_flag/test/implicit.cc: New file.



libatomic does not have a function 'test' so I implemented it with
__atomic_load (which takes care of memory ordering) and then compare
with the set-value.

The code generated at least for x86-64 looks good, it's a
straight-forward load, nothing else.
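
The approach described above can be sketched in C with the GCC __atomic
builtins.  This is only an illustration, not the libstdc++ code:
demo_flag and the demo_flag_* functions are hypothetical names, and the
fallback value 1 for the "set" state is an assumption for compilers
that do not predefine __GCC_ATOMIC_TEST_AND_SET_TRUEVAL.

```c
#include <stdbool.h>

/* Value stored by __atomic_test_and_set; assumed to be 1 if the
   compiler does not predefine the macro.  */
#ifdef __GCC_ATOMIC_TEST_AND_SET_TRUEVAL
#define DEMO_FLAG_SET_VALUE __GCC_ATOMIC_TEST_AND_SET_TRUEVAL
#else
#define DEMO_FLAG_SET_VALUE 1
#endif

typedef struct
{
  unsigned char value;
} demo_flag;

/* Set the flag and return whether it was already set.  */
bool
demo_flag_test_and_set (demo_flag *f)
{
  return __atomic_test_and_set (&f->value, __ATOMIC_SEQ_CST);
}

/* Reset the flag to the clear state.  */
void
demo_flag_clear (demo_flag *f)
{
  __atomic_clear (&f->value, __ATOMIC_SEQ_CST);
}

/* Non-modifying test: an atomic load compared with the set value.  */
bool
demo_flag_test (demo_flag *f)
{
  return __atomic_load_n (&f->value, __ATOMIC_SEQ_CST)
	 == DEMO_FLAG_SET_VALUE;
}
```

A caller would typically use demo_flag_test to poll the flag without
changing it, where demo_flag_test_and_set would otherwise force a write.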


Thanks, looks good for master.

[PATCH][OBVIOUS] Fix typo in fprofile-prefix-path.

2020-05-11 Thread Martin Liška


Hi.

I'm going to install the typo fix to both master and gcc-10 branch.

Martin

gcc/ChangeLog:

2020-05-11  Martin Liska  

PR c/95040
* common.opt: Fix typo in option description.
---
 gcc/common.opt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/common.opt b/gcc/common.opt
index 30d05734d16..4464049fc1f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2215,7 +2215,7 @@ Enum(profile_update) String(prefer-atomic) Value(PROFILE_UPDATE_PREFER_ATOMIC)
 
 fprofile-prefix-path=
 Common Joined RejectNegative Var(profile_prefix_path)
-Remove prefix from absolute path before manging name for -fprofile-generate= and -fprofile-use=.
+Remove prefix from absolute path before mangling name for -fprofile-generate= and -fprofile-use=.
 
 fprofile-generate
 Common
