[COMMITTED] Testsuite: Make dependence on -fdelete-null-pointer-checks explicit

2022-01-08 Thread Sandra Loosemore
I've checked in these tweaks for various testcases that fail on 
nios2-elf without an explicit -fdelete-null-pointer-checks option.  This 
target is configured to build with that optimization off by default.
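
For context, here is a minimal sketch of the kind of check this option affects.  GCC's documentation describes -fdelete-null-pointer-checks as assuming that no code or data element resides at address zero, so the compiler may treat the address of an object as non-null.  The names `object`, `is_nonnull` and `object_address_is_nonnull` are illustrative only and do not come from the testcases.

```c
/* Illustrative only: these names are made up for this sketch.  */
int object;

static int is_nonnull (const int *p)
{
  return p != 0;
}

/* With -fdelete-null-pointer-checks the compiler may assume that no object
   lives at address zero and fold this call to the constant 1.  With
   -fno-delete-null-pointer-checks it has to keep the comparison.  */
int object_address_is_nonnull (void)
{
  return is_nonnull (&object);
}
```

The run-time result is the same either way on a normal hosted system; the difference is whether the compiler may treat the result as a compile-time constant.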


-Sandra
commit 04c69d0e61c0f98a010d77a79ab749d5f0aa6b67
Author: Sandra Loosemore 
Date:   Sat Jan 8 22:02:13 2022 -0800

Testsuite: Make dependence on -fdelete-null-pointer-checks explicit

nios2-elf target defaults to -fno-delete-null-pointer-checks, breaking
tests that implicitly depend on that optimization.  Add the option
explicitly on these tests.

2022-01-08  Sandra Loosemore  

	gcc/testsuite/
	* g++.dg/cpp0x/constexpr-compare1.C: Add explicit
	-fdelete-null-pointer-checks option.
	* g++.dg/cpp0x/constexpr-compare2.C: Likewise.
	* g++.dg/cpp0x/constexpr-typeid2.C: Likewise.
	* g++.dg/cpp1y/constexpr-94716.C: Likewise.
	* g++.dg/cpp1z/constexpr-compare1.C: Likewise.
	* g++.dg/cpp1z/constexpr-if36.C: Likewise.
	* gcc.dg/init-compare-1.c: Likewise.

	libstdc++-v3/
	* testsuite/18_support/type_info/constexpr.cc: Add explicit
	-fdelete-null-pointer-checks option.

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-compare1.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-compare1.C
index ad65019..603c6d5 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-compare1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-compare1.C
@@ -1,4 +1,5 @@
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 extern int a, b;
 static_assert (&a == &a, "");
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-compare2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-compare2.C
index b1bc472..5c08dbb 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-compare2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-compare2.C
@@ -1,5 +1,6 @@
 // PR c++/69681
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 void f();
 void g();
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-typeid2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-typeid2.C
index 78c6b8e..8ab76f9 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-typeid2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-typeid2.C
@@ -1,5 +1,6 @@
 // PR c++/103600
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 #include 
 
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-94716.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-94716.C
index 90173f3..5ac8720 100644
--- a/gcc/testsuite/g++.dg/cpp1y/constexpr-94716.C
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-94716.C
@@ -1,5 +1,6 @@
 // PR c++/94716
 // { dg-do compile { target c++14 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 template  char v = 0;
 static_assert (&v<2> == &v<2>, "");
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-compare1.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-compare1.C
index a53c03c..d40d536 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-compare1.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-compare1.C
@@ -1,4 +1,5 @@
 // { dg-do compile { target c++17 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 inline int a = 0;
 inline int b = 0;
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C
index 4a1b134..e425af2 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C
@@ -3,6 +3,7 @@
 // weakness.
 
 // { dg-do compile { target c++17 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 extern void weakfn1 (void);
 extern void weakfn2 (void);
diff --git a/gcc/testsuite/gcc.dg/init-compare-1.c b/gcc/testsuite/gcc.dg/init-compare-1.c
index 9208b66..6737c85 100644
--- a/gcc/testsuite/gcc.dg/init-compare-1.c
+++ b/gcc/testsuite/gcc.dg/init-compare-1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-additional-options "-fdelete-null-pointer-checks" } */
 
 extern int a, b;
 int c = &a == &a;
diff --git a/libstdc++-v3/testsuite/18_support/type_info/constexpr.cc b/libstdc++-v3/testsuite/18_support/type_info/constexpr.cc
index 07f4fb6..6fb67b4 100644
--- a/libstdc++-v3/testsuite/18_support/type_info/constexpr.cc
+++ b/libstdc++-v3/testsuite/18_support/type_info/constexpr.cc
@@ -1,5 +1,6 @@
 // { dg-options "-std=gnu++23 -frtti" }
 // { dg-do compile { target c++23 } }
+// { dg-additional-options "-fdelete-null-pointer-checks" }
 
 #include 
 


[PATCH] middle-end: move initialization of stack_limit_rtx [PR103163]

2022-01-08 Thread Sandra Loosemore
This patch fixes the ICE I reported in PR103163.  We were initializing 
stack_limit_rtx before the register properties it depends on were 
getting set.  I moved it to the same function where stack_pointer_rtx, 
frame_pointer_rtx, etc are being initialized.
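
As a rough analogy for the ordering problem (not the actual emit-rtl.c code), the sketch below builds a register object from a property table that is only filled in by a separate initialization step.  All names here (`reg_nregs`, `init_reg_props`, `make_reg`) are hypothetical.

```c
enum { NUM_HARD_REGS = 4 };

/* Hypothetical per-register property table; all zeros until
   init_reg_props () has run.  */
static int reg_nregs[NUM_HARD_REGS];

void init_reg_props (void)
{
  for (int i = 0; i < NUM_HARD_REGS; i++)
    reg_nregs[i] = 1;
}

struct reg_expr
{
  int regno;
  int nregs;
};

/* The property is captured at creation time, so a call made before
   init_reg_props () records a stale value.  */
struct reg_expr make_reg (int regno)
{
  struct reg_expr r = { regno, reg_nregs[regno] };
  return r;
}
```

Creating the object only after the table has been initialized, as the patch does by moving the code into init_emit_regs, avoids the stale value.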


Besides nios2 where I observed it, this bug was also reported to affect 
powerpc.  Anybody want to check it there?  Otherwise, OK to check in?


-Sandra
commit bd91ec874339f9fd256b2d83de7159f6c11f
Author: Sandra Loosemore 
Date:   Sat Jan 8 19:59:26 2022 -0800

middle-end: move initialization of stack_limit_rtx [PR103163]

stack_limit_rtx was being initialized before init_reg_modes_target (),
resulting in the REG expression being created incorrectly and an ICE
later in compilation.

2022-01-08  Sandra Loosemore  

	PR middle-end/103163

	gcc/
	* emit-rtl.c (init_emit_regs): Initialize stack_limit_rtx here...
	(init_emit_once): ...not here.

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index f16..76dbe42 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -6097,6 +6097,13 @@ init_emit_regs (void)
   if ((unsigned) PIC_OFFSET_TABLE_REGNUM != INVALID_REGNUM)
 pic_offset_table_rtx = gen_raw_REG (Pmode, PIC_OFFSET_TABLE_REGNUM);
 
+  /* Process stack-limiting command-line options.  */
+  if (opt_fstack_limit_symbol_arg != NULL)
+stack_limit_rtx
+  = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (opt_fstack_limit_symbol_arg));
+  if (opt_fstack_limit_register_no >= 0)
+stack_limit_rtx = gen_rtx_REG (Pmode, opt_fstack_limit_register_no);
+
   for (i = 0; i < (int) MAX_MACHINE_MODE; i++)
 {
   mode = (machine_mode) i;
@@ -6177,13 +6184,6 @@ init_emit_once (void)
 
   /* Create the unique rtx's for certain rtx codes and operand values.  */
 
-  /* Process stack-limiting command-line options.  */
-  if (opt_fstack_limit_symbol_arg != NULL)
-stack_limit_rtx 
-  = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (opt_fstack_limit_symbol_arg));
-  if (opt_fstack_limit_register_no >= 0)
-stack_limit_rtx = gen_rtx_REG (Pmode, opt_fstack_limit_register_no);
-
   /* Don't use gen_rtx_CONST_INT here since gen_rtx_CONST_INT in this case
  tries to use these variables.  */
   for (i = - MAX_SAVED_CONST_INT; i <= MAX_SAVED_CONST_INT; i++)


Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Michael Meissner via Gcc-patches
On Sat, Jan 08, 2022 at 02:15:14PM -0500, David Edelsohn wrote:
> On Sat, Jan 8, 2022 at 1:59 PM Michael Meissner wrote:
> >
> > On Sat, Jan 08, 2022 at 03:18:07PM +0100, Jakub Jelinek wrote:
> > > On Sat, Jan 08, 2022 at 03:13:10PM +0100, Thomas Koenig wrote:
> > > >
> > > > On 08.01.22 15:02, Jakub Jelinek via Fortran wrote:
> > > > > Note, as for byteswapping, apparently it wasn't ever working right for
> > > > > the IBM extended real(kind=16) and complex(kind=16).
> > > >
> > > > The lack of bug reports since the conversion feature was introduced in
> > > > 2006, more than 15 years ago, tells us something, I guess...
> > >
> > > powerpc64le was only introduced in GCC 4.8 in 2013, so slightly less
> > > than that, but still.
> > > Either nobody interchanges/shares fortran unformatted data between
> > > powerpc big and little endian, or if they do, they don't use real(kind=16)
> > > or complex(kind=16) in there...
> >
> > I still wish I had had the forethought when we were setting up the LE ABI to
> > change the default 128-bit format to IEEE instead of IBM.  But alas, I didn't.
> > You would still need converters between the big endian IBM format and little
> > endian IEEE format, but it would have avoided a lot of the problems where GCC
> > assumes there is only one floating point format for each size.
> 
> Mike,
> 
> The LE ABI initial target was Power8 and IEEE128 hardware support was
> added to Power9.  The ABI was a conscious decision. IEEE 128 was not a
> viable requirement for the LE ABI at the time of the transition.

Yes I know, but my memory is we (the GCC group within IBM) at least knew that
IEEE 128-bit was coming towards the end of the ABI definition period.  But
perhaps not.  In any case, it doesn't much matter now, as it is all ancient
history.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [power-ieee128] OPEN CONV

2022-01-08 Thread David Edelsohn via Gcc-patches
On Sat, Jan 8, 2022 at 1:59 PM Michael Meissner  wrote:
>
> On Sat, Jan 08, 2022 at 03:18:07PM +0100, Jakub Jelinek wrote:
> > On Sat, Jan 08, 2022 at 03:13:10PM +0100, Thomas Koenig wrote:
> > >
> > > On 08.01.22 15:02, Jakub Jelinek via Fortran wrote:
> > > > Note, as for byteswapping, apparently it wasn't ever working right for
> > > > the IBM extended real(kind=16) and complex(kind=16).
> > >
> > > The lack of bug reports since the conversion feature was introduced in
> > > 2006, more than 15 years ago, tells us something, I guess...
> >
> > powerpc64le was only introduced in GCC 4.8 in 2013, so slightly less
> > than that, but still.
> > Either nobody interchanges/shares fortran unformatted data between
> > powerpc big and little endian, or if they do, they don't use real(kind=16)
> > or complex(kind=16) in there...
>
> I still wish I had had the forethought when we were setting up the LE ABI to
> change the default 128-bit format to IEEE instead of IBM.  But alas, I didn't.
> You would still need converters between the big endian IBM format and little
> endian IEEE format, but it would have avoided a lot of the problems where GCC
> assumes there is only one floating point format for each size.

Mike,

The LE ABI initial target was Power8 and IEEE128 hardware support was
added to Power9.  The ABI was a conscious decision. IEEE 128 was not a
viable requirement for the LE ABI at the time of the transition.

Thanks, David


Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Michael Meissner via Gcc-patches
On Sat, Jan 08, 2022 at 03:18:07PM +0100, Jakub Jelinek wrote:
> On Sat, Jan 08, 2022 at 03:13:10PM +0100, Thomas Koenig wrote:
> > 
> > On 08.01.22 15:02, Jakub Jelinek via Fortran wrote:
> > > Note, as for byteswapping, apparently it wasn't ever working right for
> > > the IBM extended real(kind=16) and complex(kind=16).
> > 
> > The lack of bug reports since the conversion feature was introduced in
> > 2006, more than 15 years ago, tells us something, I guess...
> 
> powerpc64le was only introduced in GCC 4.8 in 2013, so slightly less
> than that, but still.
> Either nobody interchanges/shares fortran unformatted data between
> powerpc big and little endian, or if they do, they don't use real(kind=16)
> or complex(kind=16) in there...

I still wish I had had the forethought when we were setting up the LE ABI to
change the default 128-bit format to IEEE instead of IBM.  But alas, I didn't.
You would still need converters between the big endian IBM format and little
endian IEEE format, but it would have avoided a lot of the problems where GCC
assumes there is only one floating point format for each size.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH 1/1] [PATCH] Fix canadian compile for mingw-w64 copies the wrong dlls for mingw-w64 multilibs [PR100427]

2022-01-08 Thread Jeff Law via Gcc-patches




On 1/8/2022 2:04 AM, NightStrike via Gcc-patches wrote:
> On Thu, Jan 6, 2022, 18:31 cqwrteur via Gcc-patches wrote:
>
> > When building GCC hosted on windows with Canadian/native compilation
> > (host==target), the build scripts in GCC would override DLLs with each
> > other. For example, for MinGW-w64, 32-bit DLLs would override 64 bits
> > because build scripts copy them both to /bin.
> >
> > This patch fixes the issue by avoiding copying DLLs with multilibs.
> > However, it would still copy when we do not build multilibs, usually the
> > native build for GCC on windows.
> > ---
> >   gcc/configure  | 26 ++
>
> You should probably not be modifying configure directly.

Umm, the patch modifies libtool.m4 (two instances) and presumably the
configure changes are just rebuilds with the autotools.


jeff




Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

2022-01-08 Thread Jeff Law via Gcc-patches




On 1/6/2022 6:53 AM, David Malcolm via Gcc-patches wrote:
> On Sun, 2021-12-19 at 22:30 +0100, Marc Nieper-Wißkirchen wrote:
> > This patch fixes a memory leak in the pass manager. In the existing code,
> > the m_name_to_pass_map is allocated in pass_manager::register_pass_name,
> > but never deallocated.  This is fixed by adding a deletion in
> > pass_manager::~pass_manager.  Moreover the string keys in
> > m_name_to_pass_map are all dynamically allocated.  To free them, this
> > patch adds a new hash trait for string hashes that are to be freed when
> > the corresponding hash entry is removed.
> >
> > This fix is particularly relevant for using GCC as a library through
> > libgccjit.  The memory leak also occurs when libgccjit is instructed to
> > use an external driver.
> >
> > Before the patch, compiling the hello world example of libgccjit with
> > the external driver under Valgrind shows a loss of 12,611 (48 direct)
> > bytes.  After the patch, no memory leaks are reported anymore.
> > (Memory leaks occurring when using the internal driver are mostly in
> > the driver code in gcc/gcc.c and have to be fixed separately.)
> >
> > The patch has been tested by fully bootstrapping the compiler with the
> > frontends C, C++, Fortran, LTO, ObjC, JIT and running the test suite
> > under a x86_64-pc-linux-gnu host.
>
> Thanks for the patch.
>
> It looks correct to me, given that pass_manager::register_pass_name
> does an xstrdup and puts the result in the map.
>
> That said:
> - I'm not officially a reviewer for this part of gcc (though I probably
> touched this code last)
> - is it cleaner to instead change m_name_to_pass_map's key type from
> const char * to char *, to convey that the map "owns" the name?  That
> way we probably wouldn't need struct typed_const_free_remove, and (I
> hope) works better with the type system.
>
> Dave
>
> > gcc/ChangeLog:
> >
> >  PR jit/63854
> >  * hash-traits.h (struct typed_const_free_remove): New.
> >  (struct free_string_hash): New.
> >  * pass_manager.h: Use free_string_hash.
> >  * passes.c (pass_manager::register_pass_name): Use free_string_hash.
> >  (pass_manager::~pass_manager): Delete allocated m_name_to_pass_map.

My concern (and what I hadn't had time to dig into) was we initially
used nofree_string_hash -- I wanted to make sure there wasn't any path
where the name came from the stack (can't be free'd), was saved
elsewhere (dangling pointer) and the like.  i.e., why were we using
nofree_string_hash to begin with?  I've never really mucked around with
these bits, so the analysis side kept falling off the daily todo list.


If/once you're comfortable with the patch David, then go ahead and apply 
it on Marc's behalf.


jeff
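
To make the ownership question concrete, here is a minimal sketch of a name map that copies each key on insertion and frees it on destruction, which is the model the patch relies on.  It is not GCC's hash_map or hash-traits code; `name_map`, `name_map_add`, `name_map_get` and `name_map_destroy` are hypothetical names, and a plain linked list stands in for the hash table.

```c
#include <stdlib.h>
#include <string.h>

struct name_entry
{
  char *name;               /* Owned copy of the key.  */
  int pass_id;
  struct name_entry *next;
};

struct name_map
{
  struct name_entry *head;
};

/* Portable stand-in for xstrdup; returns NULL on allocation failure.  */
static char *copy_string (const char *s)
{
  size_t n = strlen (s) + 1;
  char *p = malloc (n);
  if (p != NULL)
    memcpy (p, s, n);
  return p;
}

/* The caller keeps ownership of NAME; the map stores its own copy, so a
   name that lives on the caller's stack is safe to pass in.  */
int name_map_add (struct name_map *m, const char *name, int pass_id)
{
  struct name_entry *e = malloc (sizeof *e);
  if (e == NULL)
    return -1;
  e->name = copy_string (name);
  if (e->name == NULL)
    {
      free (e);
      return -1;
    }
  e->pass_id = pass_id;
  e->next = m->head;
  m->head = e;
  return 0;
}

/* Lookups take a const string and never take ownership.  */
int name_map_get (const struct name_map *m, const char *name)
{
  for (const struct name_entry *e = m->head; e != NULL; e = e->next)
    if (strcmp (e->name, name) == 0)
      return e->pass_id;
  return -1;
}

/* Free every owned key together with its entry.  */
void name_map_destroy (struct name_map *m)
{
  struct name_entry *e = m->head;
  while (e != NULL)
    {
      struct name_entry *next = e->next;
      free (e->name);
      free (e);
      e = next;
    }
  m->head = NULL;
}
```

Because every stored key is a private copy, freeing keys on destruction cannot touch stack memory or leave a dangling pointer in the caller; that only holds if no other code path inserts a pointer it did not allocate.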



Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Jakub Jelinek via Gcc-patches
On Sat, Jan 08, 2022 at 03:13:10PM +0100, Thomas Koenig wrote:
> 
> On 08.01.22 15:02, Jakub Jelinek via Fortran wrote:
> > Note, as for byteswapping, apparently it wasn't ever working right for
> > the IBM extended real(kind=16) and complex(kind=16).
> 
> The lack of bug reports since the conversion feature was introduced in
> 2006, more than 15 years ago, tells us something, I guess...

powerpc64le was only introduced in GCC 4.8 in 2013, so slightly less
than that, but still.
Either nobody interchanges/shares fortran unformatted data between
powerpc big and little endian, or if they do, they don't use real(kind=16)
or complex(kind=16) in there...

Jakub



Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Thomas Koenig via Gcc-patches



On 08.01.22 15:02, Jakub Jelinek via Fortran wrote:
> Note, as for byteswapping, apparently it wasn't ever working right for
> the IBM extended real(kind=16) and complex(kind=16).

The lack of bug reports since the conversion feature was introduced in
2006, more than 15 years ago, tells us something, I guess...


Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Jakub Jelinek via Gcc-patches
On Sat, Jan 08, 2022 at 12:10:56PM +0100, Jakub Jelinek via Gcc-patches wrote:
> One reason for that is that neither conversion is lossless, neither format
> is a subset or superset of the other.  Yes, IEEE quad has both much bigger
> exponent range (-16382..16383 vs. -1022..1023) and slightly bigger fixed
> precision (113 vs. 106 bits).
> But IBM extended has that weirdo numerically awful flexible precision where
> certain numbers can have much bigger precision than those 106 bits, up to
> 2048+52 or so.  So there is rounding in both directions.
> So, after distros switch to -mabi=ieeelongdouble by default or when people
> use -mabi=ieeelongdouble on their programs, they'd better store that format
> into data files by default, without the need of some magic CONVERT= options,
> env vars or command line options.  Only in the case where they need to
> interact with -mabi=ibmlongdouble environments, they need to take some
> action.

Note, as for byteswapping, apparently it wasn't ever working right for
the IBM extended real(kind=16) and complex(kind=16).
Because unlike IEEE extended or integral types, it seems powerpc*-*-*
doesn't actually fully byteswap those between little and big endian.
Proof:
long double a = 
0.L;
compiled little endian IBM long double:
.size   a, 16
a:
.long   1431655765
.long   1070945621
.long   1431655766
.long   1014322517
compiled big endian IBM long double:
.size   a, 16
a:
.long   1070945621
.long   1431655765
.long   1014322517
.long   1431655766
compiled little endian IEEE long double:
.size   a, 16
a:
.long   1431655765
.long   1431655765
.long   1431655765
.long   1073567061
compiled big endian IEEE long double:
.size   a, 16
a:
.long   1073567061
.long   1431655765
.long   1431655765
.long   1431655765
where the numbers in .long arguments are 32-bit numbers stored in the
selected endianity.  Compiled with -mlong-double-64 little endian:
.size   a, 8
a:
.long   1431655765
.long   1070945621
and big endian:
.size   a, 8
a:
.long   1070945621
.long   1431655765
Unless I'm misreading this, for IEEE long double, or double (and I bet float
too) byteswapping the whole numbers is what is needed for interoperability
between powerpc64{,le}-linux, for IBM long double we'd actually want to
byteswap it as 2 real(kind=8) numbers and not one real(kind=16) one, i.e.
the numbers are always stored as the more significant double followed by
less significant double in memory.

Jakub
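
The two byte-swapping strategies described above can be sketched as follows.  This is only an illustration operating on raw 16-byte buffers, not libgfortran code; the function names `swap_ieee128` and `swap_ibm128` are made up for this example.

```c
#include <stddef.h>

/* Reverse the N bytes starting at P in place.  */
static void reverse_bytes (unsigned char *p, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    {
      unsigned char t = p[i];
      p[i] = p[n - 1 - i];
      p[n - 1 - i] = t;
    }
}

/* IEEE quad: a single 16-byte quantity, so reverse all 16 bytes.  */
void swap_ieee128 (unsigned char buf[16])
{
  reverse_bytes (buf, 16);
}

/* IBM extended: two doubles, more significant one first in memory for both
   endiannesses, so reverse each 8-byte half separately.  */
void swap_ibm128 (unsigned char buf[16])
{
  reverse_bytes (buf, 8);
  reverse_bytes (buf + 8, 8);
}
```

Feeding the little endian IBM bytes from the assembly above through swap_ibm128 gives the big endian IBM layout, whereas a whole-value reversal would also exchange the two doubles.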



[PATCH] nvptx: Improved support for HFMode including neghf2 and abshf2.

2022-01-08 Thread Roger Sayle

This patch adds more support for _Float16 (HFmode) to the nvptx backend.
Currently negation, absolute value and floating point comparisons are
implemented by promoting to float (SFmode).  This patch adds suitable
define_insns to nvptx.md, most conditional on TARGET_SM53 (-misa=sm_53).
This patch also adds support for HFmode fused multiply-add.

One subtlety is that neghf2 and abshf2 are implemented by (HImode)
bit manipulation operations to update the sign bit.  The NVidia PTX
ISA documentation for neg.f16 and abs.f16 contains the caution
"Future implementations may comply with the IEEE 754 standard by preserving
the (NaN) payload and modifying only the sign bit".  Given the availability
of suitable replacements, I thought it best to provide IEEE 754 compliant
implementations.  If anyone observes a performance penalty from this
choice I'm happy to provide a -ffast-math variant (or revisit this
decision).
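
As an illustration of the sign-bit approach, here is a small C sketch operating on the raw 16-bit pattern of an IEEE 754 binary16 value.  The function names are hypothetical; the masks correspond to the -32768 (0x8000) and 32767 (0x7fff) immediates used in the patterns.

```c
#include <stdint.h>

/* Negate: flip only the sign bit, as xor.b16 with -32768 does.  */
uint16_t neg_f16_bits (uint16_t bits)
{
  return (uint16_t) (bits ^ 0x8000u);
}

/* Absolute value: clear only the sign bit, as and.b16 with 32767 does.  */
uint16_t abs_f16_bits (uint16_t bits)
{
  return (uint16_t) (bits & 0x7FFFu);
}
```

Since only bit 15 changes, every other bit, including a NaN payload, is preserved.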

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with a make and make -k check with no new failures.
Ok for mainline?


2022-01-08  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (*cmphf): New define_insn.
(cstorehf4): New define_expand.
(fmahf4): New define_insn.
(neghf2): New define_insn.
(abshf2): New define_insn.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/float16-3.c: New test case for neghf2.
* gcc.target/nvptx/float16-4.c: New test case for abshf2.
* gcc.target/nvptx/float16-5.c: New test case for fmahf4.
* gcc.target/nvptx/float16-6.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index ce74672..a6046d7 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -779,6 +779,14 @@
   ""
   "%.\\tsetp%c1\\t%0, %2, %3;")
 
+(define_insn "*cmphf"
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+   (match_operator:BI 1 "nvptx_float_comparison_operator"
+  [(match_operand:HF 2 "nvptx_register_operand" "R")
+   (match_operand:HF 3 "nvptx_nonmemory_operand" "RF")]))]
+  "TARGET_SM53"
+  "%.\\tsetp%c1\\t%0, %2, %3;")
+
 (define_insn "jump"
   [(set (pc)
(label_ref (match_operand 0 "" "")))]
@@ -969,6 +977,21 @@
   DONE;
 })
 
+(define_expand "cstorehf4"
+  [(set (match_operand:SI 0 "nvptx_register_operand")
+   (match_operator:SI 1 "nvptx_float_comparison_operator"
+ [(match_operand:HF 2 "nvptx_register_operand")
+  (match_operand:HF 3 "nvptx_nonmemory_operand")]))]
+  "TARGET_SM53"
+{
+  rtx reg = gen_reg_rtx (BImode);
+  rtx cmp = gen_rtx_fmt_ee (GET_CODE (operands[1]), BImode,
+   operands[2], operands[3]);
+  emit_move_insn (reg, cmp);
+  emit_insn (gen_setccsi_from_bi (operands[0], reg));
+  DONE;
+})
+
 ;; Calls
 
 (define_insn "call_insn_"
@@ -1156,6 +1179,26 @@
   "TARGET_SM53"
   "%.\\tmul.f16\\t%0, %1, %2;")
 
+(define_insn "fmahf4"
+  [(set (match_operand:HF 0 "nvptx_register_operand" "=R")
+   (fma:HF (match_operand:HF 1 "nvptx_register_operand" "R")
+   (match_operand:HF 2 "nvptx_nonmemory_operand" "RF")
+   (match_operand:HF 3 "nvptx_nonmemory_operand" "RF")))]
+  "TARGET_SM53"
+  "%.\\tfma%#.f16\\t%0, %1, %2, %3;")
+
+(define_insn "neghf2"
+  [(set (match_operand:HF 0 "nvptx_register_operand" "=R")
+   (neg:HF (match_operand:HF 1 "nvptx_register_operand" "R")))]
+  ""
+  "%.\\txor.b16\\t%0, %1, -32768;")
+
+(define_insn "abshf2"
+  [(set (match_operand:HF 0 "nvptx_register_operand" "=R")
+   (abs:HF (match_operand:HF 1 "nvptx_register_operand" "R")))]
+  ""
+  "%.\\tand.b16\\t%0, %1, 32767;")
+
 (define_insn "exp2hf2"
   [(set (match_operand:HF 0 "nvptx_register_operand" "=R")
(unspec:HF [(match_operand:HF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-3.c b/gcc/testsuite/gcc.target/nvptx/float16-3.c
new file mode 100644
index 000..914282a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/float16-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -misa=sm_53 -mptx=6.3" } */
+
+_Float16 var;
+
+void neg()
+{
+  var = -var;
+}
+
+/* { dg-final { scan-assembler "xor.b16" } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-4.c b/gcc/testsuite/gcc.target/nvptx/float16-4.c
new file mode 100644
index 000..b11f17a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/float16-4.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -misa=sm_53 -mptx=6.3 -ffast-math" } */
+
+_Float16 var;
+
+void foo()
+{
+  var = (var < (_Float16)0.0) ? -var : var;
+}
+
+/* { dg-final { scan-assembler "and.b16" } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/float16-5.c b/gcc/testsuite/gcc.target/nvptx/float16-5.c
new file mode 100644
index 000..5fe15ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/float16-5.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -misa=sm_53 -mptx=6.3 -ffast-m

Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Jakub Jelinek via Gcc-patches
On Sat, Jan 08, 2022 at 12:00:38PM +0100, Jakub Jelinek via Gcc-patches wrote:
> And IMHO the default like for byte-swapping should be the native
> format, i.e. the one the program actually used.

One reason for that is that neither conversion is lossless, neither format
is a subset or superset of the other.  Yes, IEEE quad has both much bigger
exponent range (-16382..16383 vs. -1022..1023) and slightly bigger fixed
precision (113 vs. 106 bits).
But IBM extended has that weirdo numerically awful flexible precision where
certain numbers can have much bigger precision than those 106 bits, up to
2048+52 or so.  So there is rounding in both directions.
So, after distros switch to -mabi=ieeelongdouble by default or when people
use -mabi=ieeelongdouble on their programs, they'd better store that format
into data files by default, without the need of some magic CONVERT= options,
env vars or command line options.  Only in the case where they need to
interact with -mabi=ibmlongdouble environments, they need to take some
action.

Jakub
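
To illustrate the "flexible precision" point, the sketch below estimates how many significand bits are needed to hold hi + lo exactly when an IBM extended value is viewed as a pair of doubles.  It assumes IEEE binary64 doubles, normal nonzero inputs of the same sign, and |hi| much larger than |lo|; `bit_span` and `bits_needed` are made-up helper names.

```c
#include <stdint.h>
#include <string.h>

/* For a finite, nonzero, normal binary64 value, report the exponents of its
   highest and lowest set significand bits.  */
static void bit_span (double x, int *top, int *bottom)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  int exp = (int) ((u >> 52) & 0x7FF) - 1023;
  /* Fraction bits plus the implicit leading one.  */
  uint64_t sig = (u & ((UINT64_C (1) << 52) - 1)) | (UINT64_C (1) << 52);
  int low = exp - 52;
  while ((sig & 1) == 0)
    {
      sig >>= 1;
      low++;
    }
  *top = exp;
  *bottom = low;
}

/* Bits spanned from the top bit of HI down to the lowest set bit of LO.  */
int bits_needed (double hi, double lo)
{
  int hi_top, hi_bottom, lo_top, lo_bottom;
  bit_span (hi, &hi_top, &hi_bottom);
  bit_span (lo, &lo_top, &lo_bottom);
  return hi_top - lo_bottom + 1;
}
```

For example, the pair (1.0, 0x1p-200) needs 201 bits, more than the 113 bits of IEEE quad, so converting it to IEEE quad has to round.  In the other direction, an IEEE quad value using all 113 bits does not fit the 106 bits that two adjacent doubles provide.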



Re: [Ada] Read directory in Ada.Directories.Start_Search rather than Get_Next_Entry

2022-01-08 Thread Duncan Sands via Gcc-patches
Hi Pierre-Marie, is this really a good idea?  If a directory has millions of 
files in it (rare, but I've seen it) this may consume a lot of memory.  Also, if 
using a slow medium like a network file system, reading the entire directory 
contents may take a long time.  Finally, you aren't really solving the race 
condition, you're just making the window smaller, right?  After all, if I 
understand right you are still using readdir, you just use it during a shorter 
time period.


Best wishes, Duncan.

On 07/01/2022 17:27, Pierre-Marie de Rodat via Gcc-patches wrote:

The Ada.Directories directory search function is changed so the contents
of the directory are now read in Start_Search instead of in
Get_Next_Entry.  Start_Search now stores the result of the directory
search in the search object, with Get_Next_Entry returning results from
the search object. This differs from the prior implementation where
Get_Next_Entry would query the directory directly for the next item
using the POSIX readdir function.

The problem with building Get_Next_Entry around the readdir function is
POSIX does not specify the behavior of readdir when files are added or
removed from the directory being read. For example: on most systems,
deleting files from the folder being read does not impact readdir.
However, some systems, like RTEMS and HFS+ volumes on macOS, will return
NULL instead of the next item in the directory if the current item
returned by readdir is deleted.

To avoid this issue, the contents of the directory are read in
Start_Search and the user is given a copy of these results.
Consequently, any subsequent modification to the directory does not
affect the ability to iterate through the results. This approach is the
same taken by the popular fts C functions.
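
The snapshot approach can be sketched in C as follows.  This is not the GNAT implementation; `snapshot_dir` and `free_snapshot` are hypothetical helpers that read all entry names with POSIX readdir up front, so later iteration never touches the directory stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Release a snapshot returned by snapshot_dir ().  */
void free_snapshot (char **names, size_t count)
{
  for (size_t i = 0; i < count; i++)
    free (names[i]);
  free (names);
}

/* Read every entry name of PATH into a freshly allocated array and store
   the number of entries in *COUNT.  Returns NULL on failure.  */
char **snapshot_dir (const char *path, size_t *count)
{
  DIR *dir = opendir (path);
  if (dir == NULL)
    return NULL;

  size_t n = 0, cap = 16;
  char **names = malloc (cap * sizeof *names);
  int ok = names != NULL;
  struct dirent *ent;

  while (ok && (ent = readdir (dir)) != NULL)
    {
      if (n == cap)
        {
          char **grown = realloc (names, 2 * cap * sizeof *names);
          if (grown == NULL)
            {
              ok = 0;
              break;
            }
          names = grown;
          cap *= 2;
        }
      size_t len = strlen (ent->d_name) + 1;
      names[n] = malloc (len);
      if (names[n] == NULL)
        {
          ok = 0;
          break;
        }
      memcpy (names[n], ent->d_name, len);
      n++;
    }
  closedir (dir);

  if (!ok)
    {
      if (names != NULL)
        free_snapshot (names, n);
      return NULL;
    }
  *count = n;
  return names;
}
```

The trade-off is the one raised in the reply above: the whole listing is held in memory, and files added or removed while the snapshot is being taken can still be missed.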

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-direct.adb (Search_Data): Remove type.
(Directory_Vectors): New package instantiation.
(Search_State): New type.
(Fetch_Next_Entry): Remove.
(Close): Remove.
(Finalize): Rewritten.
(Full_Name): Ditto.
(Get_Next_Entry): Return next entry from Search results vector
rather than querying the directory directly using readdir.
(Kind): Rewritten.
(Modification_Time): Rewritten.
(More_Entries): Use Search state cursor to determine if more
entries are available for users to read.
(Simple_Name): Rewritten.
(Size): Rewritten.
(Start_Search_Internal): Rewritten to load the contents of the
directory that matches the pattern and filter into the search
object.
* libgnat/a-direct.ads (Search_Type): New type.
(Search_Ptr): Ditto.
(Directory_Entry_Type): Rewritten to support new Start_Search
procedure.
* libgnat/s-filatt.ads (File_Length_Attr): New function.





Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Jakub Jelinek via Gcc-patches
On Sat, Jan 08, 2022 at 11:07:24AM +0100, Thomas Koenig wrote:
> I have tried to unravel the different cases here, I count six
> (lumping together the environment variables, the CONVERT specifier
> and -fconvert, and leaving out the byte swapping)
> 
> Compiler  Convert  Read action  Write action
> 
> IEEE      None     None         None
> IEEE      IEEE     None         None
> IEEE      IBM      IBM->IEEE    IEEE->IBM
> 
> IBM       None     None         None
> IBM       IEEE     IEEE->IBM    IBM->IEEE
> IBM       IBM      None         None
> 
> From this table, it is clear that the compiler has to inform
> the library about the option it is using, I think it is best
> encoded in the number passed to _gfortran_set_convert.

Whether the compiler is using IEEE or IBM real(kind=16) or
complex(kind=16) for a particular spot (which doesn't have to be
the same in the whole program) is known to the library by the
kind argument it provides to the I/O routines, if it is kind=16,
it is IBM, if it is kind=17, it is IEEE.
See the patch I've posted, which does one thing when the runtime
kind (i.e. abi_kind on the compiler side) is 17 and convert
says r16_ibm, and another thing when runtime kind is 16 and
convert says r16_ieee.  Other cases shouldn't need conversion.
And IMHO the default like for byte-swapping should be the native
format, i.e. the one the program actually used.
The only thing that should be encoded in _gfortran_set_convert
is -fconvertWHATEVER command line option IMO.

Jakub
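
The rule described above can be written down as a small decision function.  This is a sketch only, not the libgfortran interface: the enum and function names are hypothetical, and kind 16 is taken to mean IBM extended while kind 17 means IEEE quad, as stated in the message.

```c
/* Hypothetical names, for illustration only.  */
enum r16_convert { R16_NATIVE, R16_IBM, R16_IEEE };
enum r16_action { ACT_NONE, ACT_IBM_TO_IEEE, ACT_IEEE_TO_IBM };

/* Conversion needed when reading a file in format CONV into a variable of
   runtime kind KIND.  */
enum r16_action read_action (int kind, enum r16_convert conv)
{
  if (kind == 17 && conv == R16_IBM)
    return ACT_IBM_TO_IEEE;
  if (kind == 16 && conv == R16_IEEE)
    return ACT_IEEE_TO_IBM;
  return ACT_NONE;   /* Native format, or not a 16-byte real.  */
}

/* Writing is the mirror image of reading.  */
enum r16_action write_action (int kind, enum r16_convert conv)
{
  switch (read_action (kind, conv))
    {
    case ACT_IBM_TO_IEEE:
      return ACT_IEEE_TO_IBM;
    case ACT_IEEE_TO_IBM:
      return ACT_IBM_TO_IEEE;
    default:
      return ACT_NONE;
    }
}
```

This reproduces the six rows of the table quoted above without the compiler passing its ABI choice separately, since the kind argument already carries it.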



Re: [PATCH] x86_64: Improve (interunit) moves from TImode to V1TImode.

2022-01-08 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 6, 2022 at 7:00 PM Roger Sayle  wrote:
>
> This patch improves the code generated when moving a 128-bit value
> in TImode, represented by two 64-bit registers, to V1TImode, which
> is a single SSE register.
>
> Currently, the simple move:
> typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
> uv1ti foo(__int128 x) { return (uv1ti)x; }
>
> is always transferred via memory, as:
> foo:    movq    %rdi, -24(%rsp)
>         movq    %rsi, -16(%rsp)
>         movdqa  -24(%rsp), %xmm0
>         ret
>
> With this patch, we now generate (with -msse2):
> foo:    movq    %rdi, %xmm1
>         movq    %rsi, %xmm2
>         punpcklqdq  %xmm2, %xmm1
>         movdqa  %xmm1, %xmm0
>         ret
>
> and with -mavx2:
> foo:    vmovq   %rdi, %xmm1
>         vpinsrq $1, %rsi, %xmm1, %xmm0
>         ret
>
> Even more dramatic is the improvement of zero extended transfers.
>
> uv1ti bar(unsigned char c) { return (uv1ti)(__int128)c; }
>
> Previously generated:
> bar:    movq    $0, -16(%rsp)
>         movzbl  %dil, %eax
>         movq    %rax, -24(%rsp)
>         vmovdqa -24(%rsp), %xmm0
>         ret
>
> Now generates:
> bar:    movzbl  %dil, %edi
>         movq    %rdi, %xmm0
>         ret
>
> My first attempt at this functionality attempted to use a
> simple define_split:
>
> +;; Move TImode to V1TImode via V2DImode instead of memory.
> +(define_split
> +  [(set (match_operand:V1TI 0 "register_operand")
> +(subreg:V1TI (match_operand:TI 1 "register_operand") 0))]
> +  "TARGET_64BIT && TARGET_SSE2 && can_create_pseudo_p ()"
> +  [(set (match_dup 2) (vec_concat:V2DI (match_dup 3) (match_dup 4)))
> +   (set (match_dup 0) (subreg:V1TI (match_dup 2) 0))]
> +{
> +  operands[2] = gen_reg_rtx (V2DImode);
> +  operands[3] = gen_lowpart (DImode, operands[1]);
> +  operands[4] = gen_highpart (DImode, operands[1]);
> +})
> +
>
> Unfortunately, this triggers very late during the compilation,
> preventing some of the simplifications we'd like (in combine).
> For example the foo case above becomes:
>
> foo:    movq    %rsi, -16(%rsp)
>         movq    %rdi, %xmm0
>         movhps  -16(%rsp), %xmm0
>
> transferring half directly, and the other half via memory.
> And for the bar case above, GCC fails to appreciate that
> movq/vmovq clears the high bits, resulting in:
>
> bar:    movzbl  %dil, %eax
>         xorl    %edx, %edx
>         vmovq   %rax, %xmm1
>         vpinsrq $1, %rdx, %xmm1, %xmm0
>         ret
>
> Hence the solution (i.e. this patch) is to add a special case
> to ix86_expand_vector_move for TImode to V1TImode transfers.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
> 2022-01-06  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-expand.c (ix86_expand_vector_move): Add
> special case for TImode to V1TImode moves, going via V2DImode.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/sse2-v1ti-mov-1.c: New test case.
> * gcc.target/i386/sse2-v1ti-zext.c: New test case.

OK.

Thanks,
Uros.
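
For readers who want to reproduce the code generation discussed in the
quoted patch description, here is a minimal sketch of the two functions.
The uv1ti typedef and the body of foo are assumptions (only bar's source
is quoted above), and as_scalar is a hypothetical helper added purely to
inspect the result; none of this is taken from the new test cases.

```c
#include <string.h>

/* Assumed definition: a vector holding a single 128-bit element (V1TImode).  */
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));

/* Assumed body: reinterpret a TImode scalar as a V1TImode vector.  */
uv1ti foo (__int128 x) { return (uv1ti) x; }

/* As quoted in the patch description: zero-extend a byte to 128 bits.  */
uv1ti bar (unsigned char c) { return (uv1ti) (__int128) c; }

/* Hypothetical helper: copy the vector's bits back into a scalar.  */
static unsigned __int128 as_scalar (uv1ti v)
{
  unsigned __int128 r;
  memcpy (&r, &v, sizeof r);
  return r;
}
```

Compiling this with -O2 -S, with and without -mavx2, should show whether
the transfer still goes through the stack on a given compiler version.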


Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

2022-01-08 Thread Marc Nieper-Wißkirchen
Thanks for replying so quickly!

On Thu, Jan 6, 2022 at 14:53, David Malcolm wrote:

[...]

> Thanks for the patch.
>
> It looks correct to me, given that pass_manager::register_pass_name
> does an xstrdup and puts the result in the map.
>
> That said:
> - I'm not officially a reviewer for this part of gcc (though I probably
> touched this code last)

I am a newcomer to the codebase of GCC and haven't yet been able to
figure out whom to contact. I bothered you because the patch is mostly
relevant for the libgccjit frontend.

> - is it cleaner to instead change m_name_to_pass_map's key type from
> const char * to char *, to convey that the map "owns" the name?  That
> way we probably wouldn't need struct typed_const_free_remove, and (I
> hope) works better with the type system.

The problem with that approach is that we would then need a new
version of string_hash in hash-traits.h, say owned_string_hash, which
derives from pointer_hash <char> and not pointer_hash <const char>.
This would add roughly as much code as struct typed_const_free_remove.
Using the hypothetical owned_string_hash in the definition of
m_name_to_pass_map in passes.c would then produce a map taking "char
*" strings instead of "const char *" strings. This, however, would
then lead to problems in pass_manager::register_pass_name where name
is a "const char *" string (coming from outside) but
m_name_to_pass_map->get would take a "char *" string.

I don't see how to resolve this without bigger refactoring, so I think
my struct typed_const_free_remove approach is less intrusive. This
conveys at least that the key isn't changed by the hashmap operations
and that it is nevertheless owned (because this is something that
typed_const_free_remove presupposes).
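
The convention described above (a key that is never modified, yet owned
by the container) can be sketched in plain C, outside GCC's hash-traits
machinery. This is only an illustration under assumptions: struct
name_entry and its functions are hypothetical names, not GCC code, and
the cast in name_entry_release merely mirrors the idea of freeing
through a const pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: the key is const (never modified through the
   table) but owned by the table, which must free it.  */
struct name_entry
{
  const char *name;
  int pass_id;
};

/* Copy a string into freshly allocated storage.  */
static char *dup_string (const char *s)
{
  size_t n = strlen (s) + 1;
  char *p = malloc (n);
  if (p)
    memcpy (p, s, n);
  return p;
}

/* Register a name coming from outside as "const char *": the entry
   stores its own copy.  Returns nonzero on success.  */
static int name_entry_init (struct name_entry *e, const char *name, int id)
{
  e->name = dup_string (name);
  e->pass_id = id;
  return e->name != NULL;
}

/* Lookups take "const char *", so callers never need a mutable string.  */
static int name_entry_matches (const struct name_entry *e, const char *name)
{
  return strcmp (e->name, name) == 0;
}

/* Release the owned key; the cast drops const only for the free call.  */
static void name_entry_release (struct name_entry *e)
{
  free ((void *) e->name);
  e->name = NULL;
}
```

Keeping the key type const means lookups and registration share one
string type, at the cost of a single cast at the point of release.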

Thanks,

Marc

[...]


Re: [power-ieee128] OPEN CONV

2022-01-08 Thread Thomas Koenig via Gcc-patches



On 07.01.22 22:48, Jakub Jelinek wrote:

On Fri, Jan 07, 2022 at 10:40:50PM +0100, Thomas Koenig wrote:

One thing that one has to watch out for is a big-endian IBM long double
file, so the byte swapping will have to be done before assigning
the value.


I've tried to handle that right, i.e. on unformatted read with
byte-swapping and r16 <-> r17 conversions first do byte-swapping
and then r16 <-> r17 conversions, while for unformatted writes
first r16 <-> r17 conversions and then byte-swapping.
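
The ordering described here can be sketched as follows. This is not
libgfortran code: all function names are hypothetical, a whole-buffer
byte reversal stands in for the library's byte swapping, and the two
demo converters are placeholders that only exist to make the order of
operations observable (a real r16 <-> r17 conversion is far more
involved).

```c
#include <stddef.h>

/* Placeholder type for an in-place format conversion.  */
typedef void (*convert_fn) (unsigned char *buf, size_t len);

/* Reverse the bytes of one item in place.  */
static void reverse_bytes (unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len / 2; i++)
    {
      unsigned char t = buf[i];
      buf[i] = buf[len - 1 - i];
      buf[len - 1 - i] = t;
    }
}

/* Read path: byte-swap first, then convert file format to native.  */
static void after_unformatted_read (unsigned char *buf, size_t len,
                                    int swap, convert_fn file_to_native)
{
  if (swap)
    reverse_bytes (buf, len);
  if (file_to_native)
    file_to_native (buf, len);
}

/* Write path: convert native to file format first, then byte-swap.  */
static void before_unformatted_write (unsigned char *buf, size_t len,
                                      int swap, convert_fn native_to_file)
{
  if (native_to_file)
    native_to_file (buf, len);
  if (swap)
    reverse_bytes (buf, len);
}

/* Demo-only converters that touch the first byte, so that applying
   them in the wrong order relative to the swap changes the result.  */
static void demo_native_to_file (unsigned char *buf, size_t len)
{
  if (len)
    buf[0] -= 1;
}

static void demo_file_to_native (unsigned char *buf, size_t len)
{
  if (len)
    buf[0] += 1;
}
```

Because the write path is the exact mirror of the read path, writing an
item and reading it back with the same settings restores the original
bytes.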


I have tried to unravel the different cases here; I count six
(lumping together the environment variables, the CONVERT specifier
and -fconvert, and leaving out the byte swapping):

Compiler  Convert  Read action  Write action

IEEE      None     None         None
IEEE      IEEE     None         None
IEEE      IBM      IBM->IEEE    IEEE->IBM

IBM       None     None         None
IBM       IEEE     IEEE->IBM    IBM->IEEE
IBM       IBM      None         None

From this table, it is clear that the compiler has to inform
the library about the option it is using; I think it is best
encoded in the number passed to _gfortran_set_convert.

Old programs should continue to run with the new library, so
the absence of a call to _gfortran_set_convert, or a call
which sets byte swapping, should have the old meaning, i.e.
IBM long double. A program which uses IEEE long double should
then call _gfortran_set_convert with a suitable argument to
let the library know what to do, just in case.
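
The six cases in the table reduce to one rule: a conversion is needed
only when the file format differs from the compiler's format. A minimal
sketch, with hypothetical enum and function names that are not part of
libgfortran and say nothing about how the value passed to
_gfortran_set_convert would actually be encoded:

```c
/* Format the compiler uses for 16-byte reals.  */
enum real16_format { FMT_IEEE, FMT_IBM };

/* Effective conversion setting (environment variable, CONVERT
   specifier, or -fconvert, lumped together).  */
enum convert_setting { CONV_NONE, CONV_IEEE, CONV_IBM };

/* Action the library has to perform on each item.  */
enum action { ACT_NONE, ACT_IBM_TO_IEEE, ACT_IEEE_TO_IBM };

/* With no setting, the file uses the compiler's own format.  */
static enum real16_format file_format (enum real16_format compiler,
                                       enum convert_setting conv)
{
  if (conv == CONV_NONE)
    return compiler;
  return conv == CONV_IEEE ? FMT_IEEE : FMT_IBM;
}

/* Reading converts from the file format to the compiler format.  */
static enum action read_action (enum real16_format compiler,
                                enum convert_setting conv)
{
  if (file_format (compiler, conv) == compiler)
    return ACT_NONE;
  return compiler == FMT_IEEE ? ACT_IBM_TO_IEEE : ACT_IEEE_TO_IBM;
}

/* Writing converts from the compiler format to the file format.  */
static enum action write_action (enum real16_format compiler,
                                 enum convert_setting conv)
{
  if (file_format (compiler, conv) == compiler)
    return ACT_NONE;
  return compiler == FMT_IEEE ? ACT_IEEE_TO_IBM : ACT_IBM_TO_IEEE;
}
```

Both functions take the compiler's format as an input, which is the
reason the library needs to be told about it.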

I think this is what I will start working on.

Best regards

Thomas


Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

2022-01-08 Thread Marc Nieper-Wißkirchen
On Thu, Jan 6, 2022 at 14:57, David Malcolm via Jit wrote:

> [...snip...]
>
> >
> > > diff --git a/gcc/passes.c b/gcc/passes.c
> > > index 4bea6ae5b6a..0c70ece5321 100644
> > > --- a/gcc/passes.c
> > > +++ b/gcc/passes.c
>
> [...snip...]
>
> > > @@ -1943,7 +1944,7 @@ pass_manager::dump_profile_report () const
> > >" |in count |out
> > > prob "
> > >"|in count  |out prob  "
> > >"|size   |time  |\n");
> > > -
> > > +
> > >for (int i = 1; i < passes_by_id_size; i++)
> > >  if (profile_record[i].run)
> > >{
> >
>
> ...and there's a stray whitespace change here (in
> pass_manager::dump_profile_report), which probably shouldn't be in the
> patch.

There was stray whitespace on that line in the unpatched version of
`passes.c`, which my Emacs silently cleaned up.

Shall I retain this whitespace although it probably shouldn't have
been there in the first place? Or should I just add a remark in the
patch notes about that?

Thanks,

Marc


Re: [PATCH 1/1] [PATCH] Fix canadian compile for mingw-w64 copies the wrong dlls for mingw-w64 multilibs [PR100427]

2022-01-08 Thread NightStrike via Gcc-patches
On Thu, Jan 6, 2022, 18:31 cqwrteur via Gcc-patches 
wrote:

> When building GCC hosted on windows with Canadian/native compilation
> (host==target), the build scripts in GCC would override DLLs with each
> other. For example, for MinGW-w64, 32-bit DLLs would override 64 bits
> because build scripts copy them both to /bin.
>
> This patch fixes the issue by avoiding copying DLLs with multilibs.
> However, it would still copy when we do not build multilibs, usually the
> native build for GCC on windows.
> ---
>  gcc/configure  | 26 ++
>

You should probably not be modifying configure directly.

>