date:20191211

Re: AW: [PATCH] m68k architecture: support ccmode + lra

2019-12-11 Thread Stefan Franke

On 2019-12-08 01:54, Oleg Endo wrote:

On Tue, 2019-11-26 at 07:38 +0100, ste...@franke.ms wrote:

> On 11/21/19 10:30 AM, ste...@franke.ms wrote:
> > Hi there,
> >
> > here is mc68k's patch to switch the m68k architecture over to ccmode
> > and lra. See https://github.com/mc68kghost/gcc 68k-ccmode branch.
>
> Bernd Schmidt posted a conversion of the m68k port to ccmode a couple
> weeks before yours.  We've already ACK'd it for installing onto the
> trunk.
>
> Jeff

To be honest:
- 8 days is hardly "a couple weeks before"
- ccmode is not the same as ccmode+lra

The paperwork for contributing to fsf is on the way and the repo at
https://github.com/mc68kghost/gcc got an update. Tests are not yet at 100%
(master branch fails too many tests) but it's closer to master branch now.
The code is 50% identical, a fair amount has swapped cmp/bcc, a few are a
tad worse and some yield surprisingly better code.

You can still submit patches for further improvements, like adding
support for LRA.  Now that the main CCmode conversion is on trunk and
has been confirmed and tested, it should be much easier for you to
pinpoint problems in your changes.

Cheers,
Oleg

The problems are in the gcc implementation

- the lra implementation is buggy
- the compare elimination can't handle PARALLELs containing a compare
- df-core also treats PARALLELs containing a compare as a USE
- some optimizations/mechanisms only work if HAVE_CC0 is defined
- and many more ...

And the current implementation is IMHO unusable for lra since it did not 
introduce a CC register to track clobbering. So it's a dead end.

I can live with the fact that my patch was rejected, since I simply use my
*working* fork, where I fixed the issues mentioned above.

/cheers
Stefan

[PATCH] V10 patch #12, Test to make sure we don't generate prefixed pre-modify load/stores for -mcpu=future

2019-12-11 Thread Michael Meissner

Patch V10 #12 is a slight reworking of patch V8 #3 (making sure we don't try to
generate the non-existent PLWZU and PSTWU pre-modify instructions).

This test passes when I run make check.  Can I check this in when patch V10 #9
is installed?

2019-12-11  Michael Meissner  

* gcc.target/powerpc/prefix-no-premodify.c: Make sure we do not
generate the non-existent PLWZU instruction if -mcpu=future.

Index: gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c
===
--- gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c  (revision 279259)
+++ gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c  (working copy)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't generate a prefixed form of the load and store with
+   update instructions (i.e. instead of generating LWZU we have to generate
+   PLWZ plus a PADDI).  */
+
+#ifndef SIZE
+#define SIZE 5
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;   /* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;   /* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;   /* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;   /* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mlwz\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mpaddi\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mplwzu\M}} } */
+/* { dg-final { scan-assembler-not   {\mpstwu\M}} } */
+/* { dg-final { scan-assembler-not   {\maddis\M}} } */
+/* { dg-final { scan-assembler-not   {\maddi\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #11, Add test for generating prefixed load/store when the offset is not valid for DS/DQ instructions

2019-12-11 Thread Michael Meissner

Patch V10 #11 is a slight reworking of patch V8 #2 (testing whether we generate
a prefixed instruction when the offset would be invalid for DS and DQ
instruction formats).
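
The DS/DQ restriction that this test exercises can be sketched in plain C.  This is only an illustration under stated assumptions: the enum and function names below are hypothetical and are not taken from the rs6000 back end.  It relies on the ISA rule that DS-form displacements must have the low 2 bits clear and DQ-form displacements the low 4 bits clear; when the rule is violated (offset 1 in the test), a prefixed instruction is the alternative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of a load/store displacement field.  */
enum disp_form {
  FORM_D,   /* any signed 16-bit displacement (e.g. LBZ, LHZ, LWZ, LFS, LFD) */
  FORM_DS,  /* low 2 bits must be zero (e.g. LD, STD, LWA) */
  FORM_DQ   /* low 4 bits must be zero (e.g. LXV, STXV) */
};

/* True if OFFSET fits in a signed 16-bit field.  */
static bool fits_signed_16 (int64_t offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* True if OFFSET can be encoded directly in a traditional (non-prefixed)
   instruction of the given form.  */
static bool offset_ok_for_form (enum disp_form form, int64_t offset)
{
  if (!fits_signed_16 (offset))
    return false;

  switch (form)
    {
    case FORM_D:
      return true;
    case FORM_DS:
      return (offset & 3) == 0;
    case FORM_DQ:
      return (offset & 15) == 0;
    }
  return false;
}
```

With an offset of 1, `offset_ok_for_form (FORM_D, 1)` holds, which matches the LWZ/LHZ/LFS expectations in the test, while the DS and DQ cases fail and correspond to the PLWA, PLD/PSTD and PLXV/PSTXV expectations.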

This test passes when I run make check.  Can I check this in when patch V10 #9
is checked in?

2019-12-11  Michael Meissner  

* gcc.target/powerpc/prefix-ds-dq.c: New test to verify that we
generate the prefix load/store instructions for traditional
instructions with an offset that doesn't match DS/DQ
requirements.

Index: gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
===
--- gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c (revision 279256)
+++ gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c (working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we generate a prefixed load/store operation for addresses that
+   don't meet DS/DQ offset constraints.  */
+
+unsigned long
+load_uc_offset1 (unsigned char *p)
+{
+  return p[1]; /* should generate LBZ.  */
+}
+
+long
+load_sc_offset1 (signed char *p)
+{
+  return p[1]; /* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_offset1 (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);   /* should generate LHZ.  */
+}
+
+long
+load_ss_offset1 (unsigned char *p)
+{
+  return *(short *)(p + 1);/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_offset1 (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1); /* should generate LWZ.  */
+}
+
+long
+load_si_offset1 (unsigned char *p)
+{
+  return *(int *)(p + 1);  /* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_offset1 (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);/* should generate PLD.  */
+}
+
+long
+load_sl_offset1 (unsigned char *p)
+{
+  return *(long *)(p + 1); /* should generate PLD.  */
+}
+
+float
+load_float_offset1 (unsigned char *p)
+{
+  return *(float *)(p + 1);/* should generate LFS.  */
+}
+
+double
+load_double_offset1 (unsigned char *p)
+{
+  return *(double *)(p + 1);   /* should generate LFD.  */
+}
+
+__float128
+load_float128_offset1 (unsigned char *p)
+{
+  return *(__float128 *)(p + 1);   /* should generate PLXV.  */
+}
+
+void
+store_uc_offset1 (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;   /* should generate STB.  */
+}
+
+void
+store_sc_offset1 (signed char sc, signed char *p)
+{
+  p[1] = sc;   /* should generate STB.  */
+}
+
+void
+store_us_offset1 (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us; /* should generate STH.  */
+}
+
+void
+store_ss_offset1 (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;   /* should generate STH.  */
+}
+
+void
+store_ui_offset1 (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;   /* should generate STW.  */
+}
+
+void
+store_si_offset1 (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si; /* should generate STW.  */
+}
+
+void
+store_ul_offset1 (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;  /* should generate PSTD.  */
+}
+
+void
+store_sl_offset1 (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;/* should generate PSTD.  */
+}
+
+void
+store_float_offset1 (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;   /* should generate STFS.  */
+}
+
+void
+store_double_offset1 (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;  /* should generate STFD.  */
+}
+
+void
+store_float128_offset1 (__float128 f128, unsigned char *p)
+{
+  *(__float128 *)(p + 1) = f128;   /* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King

[PATCH] V10 patch #10, Add PADDI/PLI tests for -mcpu=future

2019-12-11 Thread Michael Meissner

Patch V10 #10 is a modification of patch V8 #1.  I renamed the files from
paddi-?.c to prefix-*.c so that there isn't a false match due to the .ident
directive.

This test passes when I do a make check.  Once patch V10 #9 is checked in, can I
commit this patch?

2019-12-11  Michael Meissner  

* gcc.target/powerpc/prefix-add.c: New test for -mcpu=future
generating PADDI for large constant adds.
* gcc.target/powerpc/prefix-di-constant.c: New test for
-mcpu=future generating PLI to load up large DImode constants.
* gcc.target/powerpc/prefix-si-constant.c: New test for
-mcpu=future generating PLI to load up large SImode constants.

Index: gcc/testsuite/gcc.target/powerpc/prefix-add.c
===
--- gcc/testsuite/gcc.target/powerpc/prefix-add.c   (revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-add.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
===
--- gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c   (revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c
===
--- gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c   (revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: introduce -fcallgraph-info option

2019-12-11 Thread Alexandre Oliva

On Dec  9, 2019, Richard Biener  wrote:

> On Tue, 3 Dec 2019, Alexandre Oliva wrote:

>> I'm considering rejecting command lines that specify both an explicit
>> -dumpdir and -save-temps=cwd, and in the absence of an explicit
>> -dumpdir, arranging for -save-temps=cwd or -save-temps=obj to override
>> what would otherwise be the default -dumpdir.

> Making -save-temps=cwd essentially a short-cut to -save-temps -dumpdir ./
> is fine I guess (we usually do not start to reject previously accepted
> options).

What if -save-temps=* and -dumpdir both set the same underlying aux
output dir, with the latest one in the command line prevailing, rather
than being rejected?
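
The "latest one prevails" idea proposed here can be modelled with a small C sketch.  It is not the GCC driver's implementation: `resolve_aux_dir` is a hypothetical helper, only `-dumpdir DIR` and `-save-temps=cwd` are handled, and `-save-temps=obj` is left out because its directory would depend on `-o`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the proposal: -dumpdir and -save-temps=cwd both set
   one underlying aux output directory, and whichever appears last on the
   command line wins.  Returns NULL if neither option is given.  */
static const char *resolve_aux_dir (int argc, const char *const *argv)
{
  const char *dir = NULL;

  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-dumpdir") == 0 && i + 1 < argc)
        dir = argv[++i];          /* explicit directory (or prefix) */
      else if (strcmp (argv[i], "-save-temps=cwd") == 0)
        dir = "./";               /* shortcut for the current directory */
    }
  return dir;
}
```

Under this model `-dumpdir out/ -save-temps=cwd` resolves to `./`, while `-save-temps=cwd -dumpdir out/` resolves to `out/`, so no previously accepted combination has to be rejected.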

>> gcc -c foo.c -dumpbase foobar && gcc -c bar.c -dumpbase foobar
>> 
>> and
>> 
>> gcc -c foo.c bar.c -dumpbase foobar
>> 
>> The latter will name outputs after foobar-foo and foobar-bar,
>> respectively, whereas the former will overwrite outputs named foobar
>> when compiling bar.c.  Under the proposal to modify %b according to
>> -dump*, even object files would be named after an explicit -dumpbase,
>> when -o is not explicitly specified.

> I think rejecting option combinations that do not make much sense
> or would introduce inconsistencies like this is better than trying
> to invent creative things second-guessing what the user meant.

Hmm, I sense conflicting recommendations here vs the previous block ;-)
A single -dumpbase when compiling multiple sources used to be accepted,
after all, it just overrode aux outputs so only the last one prevailed.


> Hum.  I didn't notice -dumpdir is just a prefix and I wouldn't object
> to make it erroneous if it doesn't specify an actual directory.

But would you object to retaining the ability to use it as a prefix?

> I also note that neither -dumpdir nor -dumpbase are documented
> in invoke.texi

*nod*

> Not sure if all this means we should document the altered behavior
> or if we should take it as a hint we can alter behavior at will
> (in future) ;)

I suppose we have more leeway in changing what's not even documented.


Thanks again for the feedback!

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);

[PATCH] V10 patch #9, Add new effective targets for the testsuite

2019-12-11 Thread Michael Meissner

Patch V10 #9 is patch V7 #5 that was redone.  This patch adds new effective
target options for PowerPC.  I have changed this patch to look at the code
generated by the compiler to see if prefixed addressing or PC-relative
addressing is used for -mcpu=future.  This patch needs patch V10 #8 installed
to enable the prefixed addressing and PC-relative tests.

In patch V10 #9, I did not modify the existing test
(check_effective_target_powerpc_future_ok).  As we discussed, this test should
really test whether a non-prefixed instruction is generated to allow for
targets that might support -mcpu=future but not enable prefixed addressing.
However, at present the only instructions being submitted are prefixed
instructions.  So this will have to wait until we get further down the road
with 'future' instructions.

I have bootstrapped a little endian power8 compiler and ran make check with no
regressions.  In addition, with this patch installed, the new tests now run as
expected.  Can I check this in?  (This needs patch V10 #8 to be installed to
enable the tests.)

2019-12-11  Michael Meissner  

* lib/target-supports.exp (check_effective_target_powerpc_pcrel):
New target for PowerPC -mcpu=future support.
(check_effective_target_powerpc_prefixed_addr): New target for
PowerPC -mcpu=future support.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 279141)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -2161,6 +2161,23 @@ proc check_p9modulo_hw_available { } {
 }]
 }
 
+# Return 1 if the target generates PC-relative instructions automatically
+proc check_effective_target_powerpc_pcrel { } {
+return [check_no_messages_and_pattern powerpc_pcrel \
+   {\mpld\M.*[@]pcrel} assembly {
+   static long s;
+   long *p = &s;
+   long foo (void) { return s; }
+   } {-O2 -mcpu=future}]
+}
+
+# Return 1 if the target generates prefixed instructions automatically
+proc check_effective_target_powerpc_prefixed_addr { } {
+return [check_no_messages_and_pattern powerpc_prefixed_addr \
+   {\mpld\M} assembly {
+   long foo (long *p) { return p[0x12345]; }
+   } {-O2 -mcpu=future}]
+}
 
 # Return 1 if the target supports executing FUTURE instructions, 0 otherwise.
 # Cache the result.  It is assumed that if a simulator does not support the

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #8, Enable -mpcrel and -mprefixed-addr for -mcpu=future on 64-bit little endian Linux systems

2019-12-11 Thread Michael Meissner

This patch enables -mpcrel and -mprefixed-addr when -mcpu=future is used on a
64-bit little endian Linux system, but it does not enable those options on
other systems.  It is a slight reworking of patch V7 #7 taking into account the
comments you made.

In particular, I changed the macros used by the target tm.h file to be:
PREFIXED_ADDR_SUPPORTED_BY_OS
PCREL_SUPPORTED_BY_OS

Patch V7 #7:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01307.html

I have bootstrapped the compiler on a little endian power8 system, and ran make
check with no regressions.  I also tested the code by not turning on -mpcrel or
-mprefixed-addr for Linux 64-bit little endian and inspected the code and saw
the appropriate code was generated.

In terms of your comment:

| ... and I don't understand this code.  If you use -mpcrel but you do not
| have the medium model, you _do_ get prefixed but you do _not_ get pcrel?
| And this all quietly?

You do not get this quietly.  You will get an error if you use -mpcrel and
-mcmodel=large options together.

2019-12-10  Michael Meissner  

* config/rs6000/linux64.h (PREFIXED_ADDR_SUPPORTED_BY_OS): Set to
1 to enable prefixed addressing if -mcpu=future.
(PCREL_SUPPORTED_BY_OS): Set to 1 to enable PC-relative addressing
if -mcpu=future.
* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Do not
enable -mprefixed-addr or -mpcrel by default.
(ADDRESSING_FUTURE_MASKS): New macro.
(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
* config/rs6000/rs6000.c (PREFIXED_ADDR_SUPPORTED_BY_OS): Disable
prefixed addressing unless the target OS tm.h says we should
enable it.
(PCREL_SUPPORTED_BY_OS): Disable PC-relative addressing unless the
target OS tm.h says we should enable it.
(rs6000_debug_reg_global): Print whether prefixed addressing and
PC-relative addressing is enabled by default if -mcpu=future.
(rs6000_option_override_internal): Move setting prefixed
addressing and PC-relative addressing after the sub-target option
handling is done.  Only enable prefixed addressing or PC-relative
address on -mcpu=future system if the target OS says to enable
it.  Disallow prefixed addressing on 32-bit systems or if the
target object file is not ELF v2.
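
The decision order described in this ChangeLog entry can be summarised in a short C sketch.  This is a simplified model, not the code in rs6000_option_override_internal: the struct, the flag macros and the function name are hypothetical, explicit user options are ignored, and the rule that PC-relative addressing is only enabled together with prefixed addressing is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the real option masks.  */
#define SKETCH_MASK_PREFIXED_ADDR 0x1u
#define SKETCH_MASK_PCREL         0x2u

/* Hypothetical description of the target being compiled for.  */
struct sketch_target
{
  bool os_supports_prefixed;  /* models PREFIXED_ADDR_SUPPORTED_BY_OS */
  bool os_supports_pcrel;     /* models PCREL_SUPPORTED_BY_OS */
  bool is_64bit;
  bool is_elfv2;
};

/* Default addressing flags for -mcpu=future on the given target.  */
static unsigned default_addressing_flags (const struct sketch_target *t)
{
  unsigned flags = 0;

  /* Prefixed addressing is disallowed on 32-bit or non-ELFv2 targets.  */
  if (!t->is_64bit || !t->is_elfv2)
    return 0;

  /* Only enable what the target OS tm.h asks for.  */
  if (t->os_supports_prefixed)
    flags |= SKETCH_MASK_PREFIXED_ADDR;

  /* Assumption: PC-relative addressing builds on prefixed addressing.  */
  if (t->os_supports_pcrel && (flags & SKETCH_MASK_PREFIXED_ADDR))
    flags |= SKETCH_MASK_PCREL;

  return flags;
}
```

A 64-bit little endian Linux target with both macros set to 1 gets both bits; any target that leaves the macros at their disabled defaults gets neither.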

Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 279141)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
enabling the __float128 keyword.  */
 #undef TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  PREFIXED_ADDR_SUPPORTED_BY_OS
+#define PREFIXED_ADDR_SUPPORTED_BY_OS  1
+
+#undef  PCREL_SUPPORTED_BY_OS
+#define PCREL_SUPPORTED_BY_OS  1
Index: gcc/config/rs6000/rs6000-cpus.def
===
--- gcc/config/rs6000/rs6000-cpus.def   (revision 279141)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -75,15 +75,22 @@
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, they are added if the target
+   OS tm.h says that it supports the addressing modes by default when
+   -mcpu=future is used.  */
 #define ISA_FUTURE_MASKS_SERVER (ISA_3_0_MASKS_SERVER   \
-| OPTION_MASK_FUTURE   \
+| OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These are options that need
+   to be cleared if the target OS is not capable of supporting prefixed
+   addressing at all (such as 32-bit mode or if the object file format is not
+   ELF v2).  */
+#define ADDRESSING_FUTURE_MASKS (OPTION_MASK_PCREL  \
 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS (OPTION_MASK_PCREL  \
-| OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279202)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for

[PATCH] V10 patch #7, Improve vector_extract code of a PC-relative address with a constant offset for -mcpu=future

2019-12-11 Thread Michael Meissner

This patch improves the code of vector_extract when the vector is addressed
with a PC-relative address, and the element number is constant.

I.e.

#include <altivec.h>

static vector double vd[10];
vector double *p = &vd[0];

double get (void)
{
  return vec_extract (vd[4], 1);
}
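
The folding step can be illustrated with ordinary integers.  The sketch below is a model only: the enum and `fold_pcrel_offset` are hypothetical names, and the symbol is represented implicitly.  For the example above, the address already carries an offset of 4 * 16 = 64 bytes and element 1 of a vector double adds 8 more, so the combined displacement is 72.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of folding an element offset into symbol+offset.  */
enum fold_result
{
  FOLD_SYMBOL_ONLY,         /* combined offset is 0: use the bare symbol */
  FOLD_SYMBOL_PLUS_OFFSET,  /* combined offset fits in 34 signed bits */
  FOLD_NEEDS_TEMPORARY      /* too large: load the address into a temporary */
};

/* True if V fits in a signed 34-bit field.  */
static bool fits_signed_34 (int64_t v)
{
  const int64_t limit = (int64_t) 1 << 33;
  return v >= -limit && v < limit;
}

/* Combine the offset already present in the PC-relative address with the
   constant element offset and report how the result can be addressed.  */
static enum fold_result
fold_pcrel_offset (int64_t existing, int64_t element_offset, int64_t *combined)
{
  *combined = existing + element_offset;

  if (*combined == 0)
    return FOLD_SYMBOL_ONLY;
  if (fits_signed_34 (*combined))
    return FOLD_SYMBOL_PLUS_OFFSET;
  return FOLD_NEEDS_TEMPORARY;
}
```

Calling `fold_pcrel_offset (64, 8, &off)` yields `FOLD_SYMBOL_PLUS_OFFSET` with `off` equal to 72, which is the case where a single prefixed load can address the element directly.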

I have bootstrapped this code on a little endian power8 and ran make check and
there were no regressions.  Can I check this into the trunk?

2019-12-10  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
function.
(rs6000_adjust_vec_address): Add support for folding a constant
offset of a vector extract of a vector accessed with PC-relative
addressing into the offset of the load.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279200)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6698,6 +6698,30 @@ rs6000_expand_vector_extract (rtx target
 }
 }
 
+/* Helper function to return an address mask based on a physical register.  */
+
+static addr_mask_type
+rs6000_reg_to_addr_mask (rtx reg, machine_mode mode)
+{
+  unsigned int r = reg_or_subregno (reg);
+  addr_mask_type addr_mask;
+
+  gcc_assert (HARD_REGISTER_NUM_P (r));
+  if (INT_REGNO_P (r))
+addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+  else if (FP_REGNO_P (r))
+addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
+
+  else if (ALTIVEC_REGNO_P (r))
+addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+  else
+gcc_unreachable ();
+
+  return addr_mask;
+}
+
 /* Adjust a memory address (MEM) of a vector type to point to a scalar field
within the vector (ELEMENT) with a mode (SCALAR_MODE).  Use a base register
temporary (BASE_TMP) to fixup the address.  Return the new memory address
@@ -6823,8 +6847,57 @@ rs6000_adjust_vec_address (rtx scalar_re
}
 }
 
+  /* For references to local static variables, try to fold a constant offset
+ into the address.  */
+  else if (pcrel_local_address (addr, Pmode) && CONST_INT_P (element_offset))
+{
+  if (GET_CODE (addr) == CONST)
+   addr = XEXP (addr, 0);
+
+  if (GET_CODE (addr) == PLUS)
+   {
+ rtx op0 = XEXP (addr, 0);
+ rtx op1 = XEXP (addr, 1);
+ if (CONST_INT_P (op1))
+   {
+ HOST_WIDE_INT offset
+   = INTVAL (XEXP (addr, 1)) + INTVAL (element_offset);
+
+ if (offset == 0)
+   new_addr = op0;
+
+ else if (SIGNED_34BIT_OFFSET_P (offset))
+   {
+ rtx plus = gen_rtx_PLUS (Pmode, op0, GEN_INT (offset));
+ new_addr = gen_rtx_CONST (Pmode, plus);
+   }
+
+ else
+   {
+ emit_move_insn (base_tmp, addr);
+ new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+   }
+   }
+ else
+   {
+ emit_move_insn (base_tmp, addr);
+ new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+   }
+   }
+
+  else
+   {
+ rtx plus = gen_rtx_PLUS (Pmode, addr, element_offset);
+ new_addr = gen_rtx_CONST (Pmode, plus);
+   }
+}
+
   else
 {
+  /* Make sure we don't overwrite the temporary if the vector extract
+offset was variable.  */
+  gcc_assert (!rtx_equal_p (base_tmp, element_offset));
+
   emit_move_insn (base_tmp, addr);
   new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
 }
@@ -6834,21 +6907,8 @@ rs6000_adjust_vec_address (rtx scalar_re
   if (GET_CODE (new_addr) == PLUS)
 {
   rtx op1 = XEXP (new_addr, 1);
-  addr_mask_type addr_mask;
-  unsigned int scalar_regno = reg_or_subregno (scalar_reg);
-
-  gcc_assert (HARD_REGISTER_NUM_P (scalar_regno));
-  if (INT_REGNO_P (scalar_regno))
-   addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_GPR];
-
-  else if (FP_REGNO_P (scalar_regno))
-   addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_FPR];
-
-  else if (ALTIVEC_REGNO_P (scalar_regno))
-   addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_VMX];
-
-  else
-   gcc_unreachable ();
+  addr_mask_type addr_mask
+   = rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
 
   if (REG_P (op1) || SUBREG_P (op1))
valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
@@ -6856,9 +6916,21 @@ rs6000_adjust_vec_address (rtx scalar_re
valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
 }
 
+  /* An address that is a single register is always valid for either indexed or
+ offsettable loads.  */
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
 valid_addr_p = true;
 
+  /* If we have a PC-relative address, check if offsetable loads are
+ allowed.  */
+  else if (pcrel_local_address (new_addr, Pmode))
+{
+

[PATCH] V10 patch #6, Use prefixed load/stores for vector extract with large offsets

2019-12-11 Thread Michael Meissner

This patch optimizes vector extracts where the vector is pointed to by an
address with an offset larger than 16-bits to fold the add into the final
address.

I.e.

#include <altivec.h>

double get (vector double *p, unsigned int h)
{
  return vec_extract (p[5], 1);
}
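
The choice between the three addressing strategies in this patch can be written down as a small decision function.  This is a sketch under assumptions: `choose_address` and the enum are hypothetical names, and the conditions simply mirror the ones visible in the diff (a 16-bit offset with the 4-byte alignment requirement for 8-byte scalars, then a 34-bit offset if prefixed addressing is available, otherwise indexed addressing through the temporary).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three outcomes.  */
enum addr_choice
{
  ADDR_D_FORM,    /* base + signed 16-bit offset */
  ADDR_PREFIXED,  /* base + signed 34-bit offset, prefixed instruction */
  ADDR_X_FORM     /* offset moved to a temporary, base + register */
};

/* True if V fits in a signed field of BITS bits (BITS < 64).  */
static bool fits_signed_bits (int64_t v, int bits)
{
  const int64_t limit = (int64_t) 1 << (bits - 1);
  return v >= -limit && v < limit;
}

/* Pick an addressing strategy for a scalar of SCALAR_SIZE bytes located at
   BASE_OFFSET + ELEMENT_OFFSET from the base register.  */
static enum addr_choice
choose_address (int64_t base_offset, int64_t element_offset,
                int scalar_size, bool have_prefixed)
{
  int64_t offset = base_offset + element_offset;

  if (fits_signed_bits (offset, 16)
      && (scalar_size < 8 || (offset & 3) == 0))
    return ADDR_D_FORM;

  if (have_prefixed && fits_signed_bits (offset, 34))
    return ADDR_PREFIXED;

  return ADDR_X_FORM;
}
```

Without prefixed addressing, any offset outside the 16-bit range falls through to the indexed form, which is the behaviour the code had before this patch.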

I have bootstrapped this patch on a little endian power8 system and ran make
check with no regressions.  Can I check this patch in?

2019-12-10  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
for the offset being 34-bits when -mcpu=future is used.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279199)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6766,9 +6766,17 @@ rs6000_adjust_vec_address (rtx scalar_re
  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
  rtx offset_rtx = GEN_INT (offset);
 
- if (IN_RANGE (offset, -32768, 32767)
+ /* 16-bit offset.  */
+ if (SIGNED_16BIT_OFFSET_P (offset)
  && (scalar_size < 8 || (offset & 0x3) == 0))
new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+ /* 34-bit offset if we have prefixed addresses.  */
+ else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+   new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+ /* Offset overflowed, move offset to the temporary (which will likely
+be split), and do X-FORM addressing.  */
  else
{
  emit_move_insn (base_tmp, offset_rtx);
@@ -6799,6 +6807,12 @@ rs6000_adjust_vec_address (rtx scalar_re
  emit_insn (insn);
}
 
+ /* Make sure we don't overwrite the temporary if the element being
+extracted is variable, and we've put the offset into base_tmp
+previously.  */
+ else if (rtx_equal_p (base_tmp, element_offset))
+   emit_insn (gen_add2_insn (base_tmp, op1));
+
  else
{
  emit_move_insn (base_tmp, op1);

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address

2019-12-11 Thread Michael Meissner

This patch fixes a bug with vector extracts using a PC-relative address and a
variable offset when using -mcpu=future.

Consider the code:

#include <altivec.h>

static vector double vd;
vector double *p = &vd;

double get (unsigned int n)
{
  return vec_extract (vd, n);
}

If you compile this code with -O2 -mcpu=future -mpcrel you get:

get:
pla 9,.LANCHOR0@pcrel
lfdx 1,9,9
blr

This is because there is only one base register temporary, and the current code
tries to first create the offset and then use the same temporary to hold the
address of the PC-relative value.

After combine the insn is:

(insn 14 9 15 2 (parallel [
(set (reg/i:DF 33 1)
(unspec:DF [
(mem/c:V2DF (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) [1 vd+0 S16 A128])
(reg:DI 123 [ n ])
] UNSPEC_VSX_EXTRACT))
(clobber (scratch:DI))
(clobber (scratch:V2DI))
]) "foo.c":9:1 1314 {vsx_extract_v2df_var}


Split2 changes this to:

(insn 20 8 21 2 (set (reg:DI 3 3 [orig:123 n ] [123])
(and:DI (reg:DI 3 3 [orig:123 n ] [123])
(const_int 1 [0x1]))) "foo.c":9:1 193 {anddi3_mask}
 (nil))
(insn 21 20 22 2 (set (reg:DI 9 9 [126])
(ashift:DI (reg:DI 3 3 [orig:123 n ] [123])
(const_int 3 [0x3]))) "foo.c":9:1 256 {ashldi3}
 (nil))
(insn 22 21 23 2 (set (reg:DI 9 9 [126])
(symbol_ref:DI ("*.LANCHOR0") [flags 0x182])) "foo.c":9:1 680 {*pcrel_local_addr}
 (nil))
(insn 23 22 15 2 (set (reg/i:DF 33 1)
(mem/c:DF (plus:DI (reg:DI 9 9 [126])
(reg:DI 9 9 [126])) [1  S8 A8])) "foo.c":9:1 512 {*movdf_hardfloat64}
 (nil))

I.e. GPR r9 is first set to the offset << 3, and then the offset is wiped out
when r9 is set to the address of the PC-relative structure.

This patch changes all of the variable extract insns and the function in
rs6000.c that processes them to have a second base register temporary only if
we have prefixed addresses.  The code generated then becomes:

get:
extsw 3,3
pla 10,.LANCHOR0@pcrel
rldicl 3,3,0,63
sldi 9,3,3
lfdx 1,10,9
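
The clobbering problem and its fix can be modelled with a toy C example in which local variables stand in for the scratch registers.  The function names are hypothetical and the arithmetic only mirrors the V2DF case above (element number masked to one bit, then shifted left by 3).

```c
#include <assert.h>
#include <stdint.h>

/* Buggy scheme: one temporary holds first the element offset and then the
   PC-relative address, so the final indexed load uses base + base.  */
static uint64_t address_with_one_temp (uint64_t symbol_addr, uint64_t n)
{
  uint64_t tmp;

  tmp = (n & 1) << 3;   /* element offset in bytes */
  tmp = symbol_addr;    /* overwrites the offset */
  return tmp + tmp;     /* corresponds to "lfdx 1,9,9" */
}

/* Fixed scheme: a second temporary keeps the address separate from the
   offset, so the final indexed load uses base + offset.  */
static uint64_t address_with_two_temps (uint64_t symbol_addr, uint64_t n)
{
  uint64_t tmp_offset = (n & 1) << 3;  /* element offset in bytes */
  uint64_t tmp_base = symbol_addr;     /* PC-relative address */

  return tmp_base + tmp_offset;        /* corresponds to "lfdx 1,10,9" */
}
```

For a symbol at 0x1000 and n = 1, the one-temporary version computes 0x2000 while the two-temporary version computes the intended 0x1008.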

I use the em and ep constraints to keep the alternatives separate.  Using em
prevents the register allocator from skipping the alternative with ep in it
because it has an extra scratch register.

I have bootstrapped the compiler on a little endian power8 system and ran make
check without regression.  Can I check this in once patch V10 #4 is checked in?

2019-12-10  Michael Meissner  

* config/rs6000/rs6000-protos.h (rs6000_split_vec_extract_var):
Update calling signature.
* config/rs6000/rs6000.c (rs6000_split_vec_extract_var): Add
additional tmp base register argument.  If the memory is prefixed,
put the address into the new tmp base register.
* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
Add new temporary for loading up the address of prefixed memory
operands.
(vsx_extract_v4sf_var): Add new temporary for loading up the
address of prefixed memory operands.
(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Add new
temporary for loading up the address of prefixed memory operands.
(vsx_extract_<mode>_<VS_scalar>mode_var): Add new temporary for
loading up the address of prefixed memory operands.

Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 279182)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -59,7 +59,7 @@ extern void rs6000_expand_float128_conve
 extern void rs6000_expand_vector_init (rtx, rtx);
 extern void rs6000_expand_vector_set (rtx, rtx, int);
 extern void rs6000_expand_vector_extract (rtx, rtx, rtx);
-extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx);
+extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx, rtx);
 extern rtx rs6000_adjust_vec_address (rtx, rtx, rtx, rtx, machine_mode);
 extern void altivec_expand_vec_perm_le (rtx op[4]);
 extern void rs6000_expand_extract_even (rtx, rtx, rtx);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279182)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6861,7 +6861,7 @@ rs6000_adjust_vec_address (rtx scalar_re
 
 void
 rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
- rtx tmp_altivec)
+ rtx tmp_altivec, rtx tmp_prefixed)
 {
   machine_mode mode = GET_MODE (src);
   machine_mode scalar_mode = GET_MODE_INNER (GET_MODE (src));
@@ -6878,6 +6878,16 @@ rs6000_split_vec_extract_var (rtx dest,
   int num_elements = GET_MODE_NUNITS (mode);
   rtx

[PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints

2019-12-11 Thread Michael Meissner

Add new constraints to match whether a memory is not prefixed (em constraint)
or prefixed (ep constraint).  This is one of 4 parts aimed at reworking the
vector extract code in patch V7 #6.

This patch just adds the new constraints, but these constraints will not be
used until the next patch.  Originally I had just one constraint (em) that
matched non-prefixed memory operands.  But in order to use it, I needed to make
sure the combiner did not combine vector extracts with a variable offset with a
PC-relative memory location.

I.e.:

#include <altivec.h>

static vector double vd;

double get (unsigned int n)
{
  return vec_extract (vd, n);
}

In addition, as I contemplate the bigger issue about the insn length attribute,
I suspect we may need to have an ep constraint as well as em.

Patch V7 #6:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01306.html

I have bootstrapped the compiler on a little endian power8 system and ran make
check and there were no regressions.  Can I check this patch in?

2019-12-10  Michael Meissner  

* config/rs6000/constraints.md (em constraint): New constraint for
non-prefixed memory operands.
(ep constraint): New constraint for prefixed memory operands.
* config/rs6000/predicates.md (non_prefixed_memory): New predicate
for non-prefixed memory operands.
* doc/md.texi (PowerPC constraints): Document em and ep constraints.

Index: gcc/config/rs6000/constraints.md
===
--- gcc/config/rs6000/constraints.md(revision 279182)
+++ gcc/config/rs6000/constraints.md(working copy)
@@ -202,6 +202,16 @@ (define_constraint "H"
 
 ;; Memory constraints
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+   (match_operand 0 "non_prefixed_memory")))
+
+(define_memory_constraint "ep"
+  "A memory operand that does contains a prefixed address."
+  (and (match_code "mem")
+   (match_operand 0 "prefixed_memory")))
+
 (define_memory_constraint "es"
   "A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  Unlike @samp{m}, this constraint
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 279151)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1846,3 +1846,17 @@ (define_predicate "prefixed_memory"
 {
   return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
 })
+
+;; Return true if the operand is a valid memory address that does not use a
+;; prefixed address.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  enum insn_form iform
+= address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_BAD
+  && iform != INSN_FORM_PREFIXED_NUMERIC
+ && iform != INSN_FORM_PCREL_LOCAL
+ && iform != INSN_FORM_PCREL_EXTERNAL);
+})
Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi (revision 279182)
+++ gcc/doc/md.texi (working copy)
@@ -3373,6 +3373,12 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a prefixed address.
+
+@item ep
+A memory operand that does contains a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used

2019-12-11 Thread Michael Meissner

This patch adds an alternative to use PADDI to add large SImode and DImode
constants if -mcpu=future is used.

It is a slight reworking of patch V7 #3.  I have done bootstraps and make check
on a power8 little endian system and there were no regressions.  Can I check
this patch in?
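
To make the constant ranges concrete, here is a minimal C sketch of how an
add-immediate constant could be classified.  The ranges follow the documented
I (signed 16-bit), L (signed 16-bit shifted left 16 bits), and eI (signed
34-bit) constraints; the enum, the function names, and the have_prefixed flag
are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical classification of an add-immediate constant.  */
enum add_form { ADD_ADDI, ADD_ADDIS, ADD_PADDI, ADD_NEEDS_SPLIT };

/* Signed 16-bit immediate (constraint I).  */
static int fits_signed_16 (int64_t v)
{
  return v >= -32768 && v <= 32767;
}

/* Signed 16-bit immediate shifted left by 16 bits (constraint L).  */
static int fits_shifted_16 (int64_t v)
{
  return (v & 0xffff) == 0 && v >= -2147483648LL && v <= 2147483647LL;
}

/* Signed 34-bit immediate (constraint eI).  */
static int fits_signed_34 (int64_t v)
{
  return v >= -(INT64_C (1) << 33) && v < (INT64_C (1) << 33);
}

/* Pick the cheapest single-instruction form; have_prefixed stands in for
   "prefixed instructions are available" (an assumption of this sketch).  */
enum add_form classify_add_constant (int64_t v, int have_prefixed)
{
  if (fits_signed_16 (v))
    return ADD_ADDI;
  if (fits_shifted_16 (v))
    return ADD_ADDIS;
  if (have_prefixed && fits_signed_34 (v))
    return ADD_PADDI;
  return ADD_NEEDS_SPLIT;
}
```

The order matters in this sketch: the shorter non-prefixed forms are tried
first, and the 34-bit form is only considered when they do not apply.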

2019-12-09  Michael Meissner  

* config/rs6000/predicates.md (add_operand): Allow eI constants.
* config/rs6000/rs6000.md (add<mode>3): Add alternative to
generate PADDI for 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 279141)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
-|| satisfies_constraint_L (op)")
+|| satisfies_constraint_L (op)
+|| satisfies_constraint_eI (op)")
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 279144)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -1761,15 +1761,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
- (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+ (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
add %0,%1,%2
addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #2, use PLI to load up large SImode constants if -mcpu=future

2019-12-11 Thread Michael Meissner

This patch adds an alternative to use PLI to load up large SImode constants if
-mcpu=future is used.

It is a slight reworking of patch V7 #2 after reformatting the movsi_internal1
insn.  I have done bootstraps and make check on a power8 little endian system
and there were no regressions.  Can I check this patch in once patch V10 #1 is
checked in?

Patch V7 #2:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01302.html

2019-12-09  Michael Meissner  

* config/rs6000/rs6000.md (movsi_internal1): Add alternative to
use PLI to load up 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 279143)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6892,7 +6892,7 @@ (define_split
 ;;MR  LA
 ;;LWZ LFIWZX  LXSIWZX
 ;;STW STFIWX  STXSIWX
-;;LI  LIS #
+;;LI  LIS PLI #
 ;;XXLOR   XXSPLTIB 0  XXSPLTIB -1 VSPLTISW
 ;;XXLXOR 0XXLORC -1   P9 const
 ;;MTVSRWZ MFVSRWZ
@@ -6903,7 +6903,7 @@ (define_insn "*movsi_internal1"
  "=r, r,
   r,  d,  v,
   m,  Z,  Z,
-  r,  r,  r,
+  r,  r,  r,  r,
   wa, wa, wa, v,
   wa, v,  v,
   wa, r,
@@ -6912,7 +6912,7 @@ (define_insn "*movsi_internal1"
  "r,  U,
   m,  Z,  Z,
   r,  d,  v,
-  I,  L,  n,
+  I,  L,  eI, n,
   wa, O,  wM, wB,
   O,  wM, wS,
   r,  wa,
@@ -6930,6 +6930,7 @@ (define_insn "*movsi_internal1"
stxsiwx %x1,%y0
li %0,%1
lis %0,%v1
+   li %0,%1
#
xxlor %x0,%x1,%x1
xxspltib %x0,0
@@ -6947,7 +6948,7 @@ (define_insn "*movsi_internal1"
  "*,  *,
   load,   fpload, fpload,
   store,  fpstore,fpstore,
-  *,  *,  *,
+  *,  *,  *,  *,
   veclogical, vecsimple,  vecsimple,  vecsimple,
   veclogical, veclogical, vecsimple,
   mffgpr, mftgpr,
@@ -6956,7 +6957,7 @@ (define_insn "*movsi_internal1"
  "*,  *,
   *,  *,  *,
   *,  *,  *,
-  *,  *,  8,
+  *,  *,  *,  8,
   *,  *,  *,  *,
   *,  *,  8,
   *,  *,
@@ -6965,7 +6966,7 @@ (define_insn "*movsi_internal1"
  "*,  *,
   *,  p8v,p8v,
   *,  p8v,p8v,
-  *,  *,  *,
+  *,  *,  fut,*,
   p8v,p9v,p9v,p8v,
   p9v,p8v,p9v,
   p8v,p8v,
@@ -7120,8 +7121,7 @@ (define_insn "*movsi_from_df"
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
(match_operand:SI 1 "const_int_operand"))]
-  "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x1
-   && (INTVAL (operands[1]) & 0x) != 0"
+  "num_insns_constant (operands[1], SImode) > 1"
   [(set (match_dup 0)
(match_dup 2))
(set (match_dup 0)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future

2019-12-11 Thread Michael Meissner

This patch adds an alternative to use PLI to load up large DImode constants if
-mcpu=future is used.

It is a slight reworking of patch V7 #1 after reformatting the movdi_internal64
insn.  I have done bootstraps and make check on a power8 little endian system
and there were no regressions.  Can I check this patch in?

Patch V7 #1:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01301.html

2019-12-09  Michael Meissner  

* config/rs6000/rs6000.c (num_insns_constant_gpr): Return 1 if the
constant can be loaded with PLI if -mcpu=future.
* config/rs6000/rs6000.md (movdi_internal64): Add alternative to
use PLI to load up 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279141)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5541,6 +5541,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
   && (value >> 31 == -1 || value >> 31 == 0))
 return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+return 1;
+
   else if (TARGET_POWERPC64)
 {
   HOST_WIDE_INT low  = ((value & 0x) ^ 0x8000) - 0x8000;
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 279141)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8828,7 +8828,7 @@ (define_split
 })
 
 ;;GPR store   GPR loadGPR move
-;;GPR li  GPR lis GPR #
+;;GPR li  GPR lis GPR pli GPR #
 ;;FPR store   FPR loadFPR move
 ;;AVX store   AVX store   AVX loadAVX loadVSX move
 ;;P9 0P9 -1   AVX 0/-1VSX 0   VSX -1
@@ -8838,7 +8838,7 @@ (define_split
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
  "=YZ,r,  r,
-  r,  r,  r,
+  r,  r,  r,  r,
   m,  ^d, ^d,
   wY, Z,  $v, $v, ^wa,
   wa, wa, v,  wa, wa,
@@ -8847,7 +8847,7 @@ (define_insn "*movdi_internal64"
   ?r, ?wa")
(match_operand:DI 1 "input_operand"
  "r,  YZ, r,
-  I,  L,  nF,
+  I,  L,  eI, nF,
   ^d, m,  ^d,
   ^v, $v, wY, Z,  ^wa,
   Oj, wM, OjwM,   Oj, wM,
@@ -8863,6 +8863,7 @@ (define_insn "*movdi_internal64"
mr %0,%1
li %0,%1
lis %0,%v1
+   li %0,%1
#
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
@@ -8886,7 +8887,7 @@ (define_insn "*movdi_internal64"
mtvsrd %x0,%1"
   [(set_attr "type"
  "store,  load,   *,
-  *,  *,  *,
+  *,  *,  *,  *,
   fpstore,fpload, fpsimple,
   fpstore,fpstore,fpload, fpload, veclogical,
   vecsimple,  vecsimple,  vecsimple,  veclogical, veclogical,
@@ -8896,7 +8897,7 @@ (define_insn "*movdi_internal64"
(set_attr "size" "64")
(set_attr "length"
  "*,  *,  *,
-  *,  *,  20,
+  *,  *,  *,  20,
   *,  *,  *,
   *,  *,  *,  *,  *,
   *,  *,  *,  *,  *,
@@ -8905,7 +8906,7 @@ (define_insn "*movdi_internal64"
   *,  *")
(set_attr "isa"
  "*,  *,  *,
-  *,  *,  *,
+  *,  *,  fut,*,
   *,  *,  *,
   p9v,p7v,p9v,p7v,*,
   p9v,p9v,p7v,*,  *,

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

PowerPC V10 Patches for -mcpu=future

2019-12-11 Thread Michael Meissner

This set of patches is an attempt to address the issues raised in the previous
sets of patches:

The V7 patches were for important functionality
The V8 patches were for tests
The V9 patches were for the PCREL_OPT support

As I write this there are 12 patches.  There will be more patches later to
address the remaining test suite patches.  I need to look at the comments for
PCREL_OPT in detail to see what the strategy should be for those patches.

Patches V10 #1-3 are the remaining issues from V7 #1-3 to add PADDI and PLI
support for large constants.  In theory, now that the reformatting has been
done and checked in, these should be simple.

Patches V10 #4-7 break up patch V7 #6 (vector extract) into 4 separate patches.

Patch V10 #8 is patch V7 #7 (turn on -mpcrel by default on 64-bit Linux targets
for -mcpu=future), changing the names of the enabling macros.

Patch V10 #9 is patch V7 #5 that was redone.  This patch adds new effective
target options for PowerPC.  I have changed this patch to look at the code
generated by the compiler to see if prefixed addressing or PC-relative
addressing is used for -mcpu=future.  This patch needs patch V10 #8 installed
to enable the prefixed addressing and PC-relative tests.

In patch V10 #9, I did not modify the existing test
(check_effective_target_powerpc_future_ok).  As we discussed, this test should
really test whether a non-prefixed instruction is generated to allow for
targets that might support -mcpu=future but not enable prefixed addressing.
However, at present the only instructions being submitted are prefixed
instructions.  So this will have to wait until we get further down the road
with 'future' instructions.

Patch V10 #10 is a modification of patch V8 #1.  I renamed the files from
paddi-?.c to prefixed-*.c so that there isn't a false match due to the .ident
directive.

Patch V10 #11 is a slight reworking of patch V8 #2 (testing whether we generate
a prefixed instruction when the offset would be invalid for DS and DQ
instruction formats).

Patch V10 #12 is a slight reworking of patch V8 #3 (making sure we don't try to
generate the nonexistent PLWZU and PSTWU pre-modify instructions).

There are 3 other patches from V8 that I will address at a later date.  Patch
V8 #4 contains the tests for using prefixed instructions for each of the types
when a large numeric offset is used.  Patch V8 #5 contains the tests for using
PC-relative load/store instructions for each of the types to reference static
values.  Patch V8 #6 is the test to make sure the -fstack-protector support
works when the stack frame is large and -mcpu=future is used.
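
As an illustration of the case patch V10 #11 tests, here is a minimal C sketch
of when an offset cannot be encoded by a traditional D, DS, or DQ form
instruction but still fits a prefixed one.  It assumes the usual encoding
rules (16-bit signed displacement, with the low 2 bits zero for DS and the low
4 bits zero for DQ, and a 34-bit signed displacement for prefixed forms); the
enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical tags for the traditional displacement formats.  */
enum mem_form { FORM_D, FORM_DS, FORM_DQ };

/* Return nonzero if V fits in a signed field of BITS bits.  */
static int fits_signed_bits (int64_t v, int bits)
{
  int64_t limit = INT64_C (1) << (bits - 1);
  return v >= -limit && v < limit;
}

/* Return nonzero if a non-prefixed instruction of the given form can
   encode OFFSET directly.  */
int offset_ok_non_prefixed (int64_t offset, enum mem_form form)
{
  if (!fits_signed_bits (offset, 16))
    return 0;
  switch (form)
    {
    case FORM_DS:
      return (offset & 3) == 0;   /* low 2 bits must be zero */
    case FORM_DQ:
      return (offset & 15) == 0;  /* low 4 bits must be zero */
    default:
      return 1;
    }
}

/* Return nonzero if the offset is invalid for the traditional form but
   representable with a 34-bit prefixed displacement.  */
int needs_prefixed (int64_t offset, enum mem_form form)
{
  return !offset_ok_non_prefixed (offset, form)
	 && fits_signed_bits (offset, 34);
}
```

A real compiler has other options for such offsets (for example, indexed
addressing), so this only models the condition the tests look for.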

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[committed] Fix OpenMP fortran atomic swap with casts (PR fortran/92899)

2019-12-11 Thread Jakub Jelinek

Hi!

My PR77500 fix apparently broke the following testcase.  While atomic swaps
are handled in many ways similarly to atomic writes, in this conditional,
which is the first one that applies to atomic swap, we actually need to look
through the conversions.
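
For readers more at home in C than Fortran, the following is a rough C analog
of the swap-with-conversion pattern the new test exercises.  It is only an
illustration of the semantics (capture the old value into a wider type while
storing a new value); the function name is hypothetical, and the Fortran
front-end code path fixed here is not involved when compiling C.

```c
/* Atomically read the old value of *x into a double (float -> double
   conversion) and store newval into *x.  Without -fopenmp the pragma is
   ignored and the block runs as ordinary sequential code.  */
double swap_float_capture_double (float *x, float newval)
{
  double old;
  #pragma omp atomic capture
  { old = *x; *x = newval; }
  return old;
}
```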

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk, queued for 9/8 backports.

2019-12-11  Jakub Jelinek  

PR fortran/92899
* trans-openmp.c (gfc_trans_omp_atomic): For GFC_OMP_ATOMIC_SWAP,
do look through conversion on expr2 if any.

* testsuite/libgomp.fortran/atomic1.f90: New test.

--- gcc/fortran/trans-openmp.c.jj   2019-12-09 19:50:25.580940368 +0100
+++ gcc/fortran/trans-openmp.c  2019-12-11 17:14:42.706119406 +0100
@@ -3534,7 +3534,6 @@ gfc_trans_omp_atomic (gfc_code *code)
   expr2 = code->expr2;
   if (((atomic_code->ext.omp_atomic & GFC_OMP_ATOMIC_MASK)
!= GFC_OMP_ATOMIC_WRITE)
-  && (atomic_code->ext.omp_atomic & GFC_OMP_ATOMIC_SWAP) == 0
   && expr2->expr_type == EXPR_FUNCTION
   && expr2->value.function.isym
   && expr2->value.function.isym->id == GFC_ISYM_CONVERSION)
--- libgomp/testsuite/libgomp.fortran/atomic1.f90.jj   2019-12-11 17:10:33.118950283 +0100
+++ libgomp/testsuite/libgomp.fortran/atomic1.f90   2019-12-11 17:09:46.324668521 +0100
@@ -0,0 +1,46 @@
+! PR fortran/92899
+
+program pr92899
+  real :: x = 1.0
+  double precision :: y
+  integer(kind=4) :: z = 4
+  integer(kind=8) :: w
+  !$omp atomic capture
+  y = x
+  x = 2.0
+  !$omp end atomic
+  if (y /= 1.0 .or. x /= 2.0) stop 1
+  !$omp atomic capture
+  x = y
+  y = 3.0
+  !$omp end atomic
+  if (x /= 1.0 .or. y /= 3.0) stop 2
+  !$omp atomic capture
+  w = z
+  z = 5
+  !$omp end atomic
+  if (w /= 4 .or. z /= 5) stop 3
+  !$omp atomic capture
+  z = w
+  w = 6
+  !$omp end atomic
+  if (z /= 4 .or. w /= 6) stop 4
+  !$omp atomic write
+  x = y
+  !$omp end atomic
+  if (x /= 3.0 .or. y /= 3.0) stop 5
+  x = 7.0
+  !$omp atomic write
+  y = x
+  !$omp end atomic
+  if (x /= 7.0 .or. y /= 7.0) stop 6
+  !$omp atomic write
+  z = w
+  !$omp end atomic
+  if (z /= 6 .or. w /= 6) stop 7
+  z = 8
+  !$omp atomic write
+  w = z
+  !$omp end atomic
+  if (z /= 8 .or. w /= 8) stop 8
+end

Jakub

[committed] Small -ftree-loop-distribute-patterns fixes

2019-12-11 Thread Jakub Jelinek

Hi!

When looking at the hash table memset -> loop patch, I checked when
-ftree-loop-distribute-patterns is enabled and noticed that since
r271595 / PR88440 it is actually enabled at -O2+ rather than -O3+.
This patch moves it to the proper section, so that OPT_LEVELS_2_PLUS are
together, and updates invoke.texi to match that change.
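
The reason the move is behavior-neutral can be shown with a small C sketch of
a level-tagged option table.  Everything here (the enum, struct, table
contents, and lookup function) is a hypothetical simplification, not the
actual opts.c code: each entry carries its own level tag, so where it sits in
the table does not change when it takes effect.

```c
#include <string.h>

/* Hypothetical level tags: an entry applies at this level and above.  */
enum opt_levels { LEVELS_1_PLUS = 1, LEVELS_2_PLUS = 2, LEVELS_3_PLUS = 3 };

struct default_option
{
  enum opt_levels levels;
  const char *name;
  int value;
};

/* The second entry is grouped with the other -O2 entries purely for
   readability; its tag alone decides when it is enabled.  */
static const struct default_option option_table[] = {
  { LEVELS_2_PLUS, "tree-vrp", 1 },
  { LEVELS_2_PLUS, "tree-loop-distribute-patterns", 1 },
  { LEVELS_3_PLUS, "tree-loop-distribution", 1 },
};

/* Return the default value of option NAME at optimization LEVEL, or 0 if
   no table entry applies.  */
int enabled_at (const char *name, int level)
{
  size_t n = sizeof option_table / sizeof option_table[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (option_table[i].name, name) == 0
	&& level >= (int) option_table[i].levels)
      return option_table[i].value;
  return 0;
}
```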

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious.

2019-12-11  Jakub Jelinek  

* opts.c (default_options_table): Move -ftree-loop-distribute-patterns
entry from -O3 or later section to -O2 or later section.
* doc/invoke.texi (-ftree-loop-distribute-patterns): Mention the
option is enabled by default at -O2+ rather than just at -O3.

--- gcc/opts.c.jj   2019-12-09 00:21:31.694097994 +0100
+++ gcc/opts.c  2019-12-11 17:03:12.623708255 +0100
@@ -508,6 +508,7 @@ static const struct default_options defa
 { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fvect_cost_model_, NULL, VECT_COST_MODEL_CHEAP },
 { OPT_LEVELS_2_PLUS, OPT_finline_functions, NULL, 1 },
+{ OPT_LEVELS_2_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
 
 /* -O2 and above optimizations, but not -Os or -Og.  */
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_falign_functions, NULL, 1 },
@@ -533,7 +534,6 @@ static const struct default_options defa
 { OPT_LEVELS_3_PLUS, OPT_fpredictive_commoning, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fsplit_loops, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fsplit_paths, NULL, 1 },
-{ OPT_LEVELS_2_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribution, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_ftree_loop_vectorize, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },
--- gcc/doc/invoke.texi.jj  2019-12-10 22:03:21.382798370 +0100
+++ gcc/doc/invoke.texi 2019-12-11 17:03:50.828122335 +0100
@@ -9782,8 +9782,8 @@ It is also enabled by @option{-fprofile-
 @item -ftree-loop-distribute-patterns
 @opindex ftree-loop-distribute-patterns
 Perform loop distribution of patterns that can be code generated with
-calls to a library.  This flag is enabled by default at @option{-O3}, and
-by @option{-fprofile-use} and @option{-fauto-profile}.
+calls to a library.  This flag is enabled by default at @option{-O2} and
+higher, and by @option{-fprofile-use} and @option{-fauto-profile}.
 
 This pass distributes the initialization loops and generates a call to
 memset zero.  For example, the loop

Jakub

[PATCH] Fix simplify-rtx.c handling of avx512 vector comparisons (PR target/92908)

2019-12-11 Thread Jakub Jelinek

Hi!

The AVX512{F,VL} vector comparisons that set %kN registers are represented
in RTL as comparisons with vector mode operands and scalar integral result,
where at runtime the scalar integer is filled with a bitmask.
Unfortunately, simplify_relational_operation would fold e.g.
(eq:SI (reg:V32HI x) (reg:V32HI x))
into (const_int 1) rather than the expected (const_int -1) (all elements
equal).  simplify_const_relational_operation is documented to always return
just const0_rtx or const_true_rtx, and simplify_relational_operation is
expected to fix this up.  For vector comparisons with a vector result it
already duplicates the 0 or -1 into all elements, etc., so this patch adds
handling for the scalar bitmask case there too.
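
The three outcomes the patch distinguishes for an all-true comparison can be
sketched in plain C as follows.  This is only a model: the function name and
the use of int64_t in place of HOST_WIDE_INT are assumptions of the sketch,
and a return value of 0 stands in for "give up" (NULL_RTX in the patch).

```c
#include <stdint.h>

/* Compute the all-true bitmask for a vector comparison with N_ELTS
   elements whose scalar result mode has PRECISION bits.  Returns 1 and
   stores the mask on success, 0 if the mask is not representable.  */
int all_true_mask (unsigned precision, unsigned n_elts, int64_t *mask)
{
  if (precision == n_elts)
    {
      /* Every bit of the result mode is a mask bit: all ones, which in a
	 sign-extended representation is -1.  */
      *mask = -1;
      return 1;
    }
  if (n_elts < 64)
    {
      /* Only the low N_ELTS bits are mask bits.  */
      *mask = (int64_t) ((UINT64_C (1) << n_elts) - 1);
      return 1;
    }
  return 0;
}
```

For the V32HI example above with an SImode result, both values are 32, so the
first branch applies and the mask is -1.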

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-12-11  Jakub Jelinek  

PR target/92908
* simplify-rtx.c (simplify_relational_operation): For vector cmp_mode
and scalar mode, if simplify_relational_operation returned
const_true_rtx, return a scalar bitmask of all ones.
(simplify_const_relational_operation): Change VOID_mode in function
comment to VOIDmode.

* gcc.target/i386/avx512bw-pr92908.c: New test.

--- gcc/simplify-rtx.c.jj   2019-11-19 22:27:02.58742 +0100
+++ gcc/simplify-rtx.c  2019-12-11 13:31:57.197809704 +0100
@@ -5037,6 +5037,23 @@ simplify_relational_operation (enum rtx_
  return NULL_RTX;
 #endif
}
+  if (VECTOR_MODE_P (cmp_mode)
+ && SCALAR_INT_MODE_P (mode)
+ && tem == const_true_rtx)
+   {
+ /* Vector comparisons that expect a scalar integral
+bitmask.  For const0_rtx the result is already correct,
+for const_true_rtx we need all bits set.  */
+ int n_elts;
+ scalar_int_mode smode = as_a <scalar_int_mode> (mode);
+ gcc_assert (GET_MODE_NUNITS (cmp_mode).is_constant (&n_elts)
+ && GET_MODE_PRECISION (smode) <= n_elts);
+ if (GET_MODE_PRECISION (smode) == n_elts)
+   return constm1_rtx;
+ if (n_elts < HOST_BITS_PER_WIDE_INT)
+   return GEN_INT ((HOST_WIDE_INT_1U << n_elts) - 1);
+ return NULL_RTX;
+   }
 
   return tem;
 }
@@ -5383,7 +5400,7 @@ comparison_result (enum rtx_code code, i
 }
 
 /* Check if the given comparison (done in the given MODE) is actually
-   a tautology or a contradiction.  If the mode is VOID_mode, the
+   a tautology or a contradiction.  If the mode is VOIDmode, the
comparison is done in "infinite precision".  If no simplification
is possible, this function returns zero.  Otherwise, it returns
either const_true_rtx or const0_rtx.  */
--- gcc/testsuite/gcc.target/i386/avx512bw-pr92908.c.jj 2019-12-11 14:24:12.083418762 +0100
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr92908.c    2019-12-11 14:23:56.071665326 +0100
@@ -0,0 +1,21 @@
+/* PR target/92908 */
+/* { dg-do run } */
+/* { dg-options "-Og -fno-tree-fre -mavx512bw" } */
+/* { dg-require-effective-target avx512bw } */
+
+#define AVX512BW
+#include "avx512f-helper.h"
+
+typedef unsigned short V __attribute__ ((__vector_size__ (64)));
+
+V v;
+
+void
+TEST (void)
+{
+  int i;
+  v = (V) v == v;
+  for (i = 0; i < 32; i++)
+if (v[i] != 0x)
+  abort ();
+}

Jakub

Re: [PING 3][PATCH] track dynamic allocation in strlen (PR 91582)

2019-12-11 Thread Martin Sebor


Jeff's buildbot exposed a bug in the patch that caused false
positives in cases involving negative offsets into destinations
reached through pointers that point into multiple regions of the
same object.  The attached revision fixes that bug and makes
a few other minor fixes pointed out in PR 92868.

On 12/6/19 5:19 PM, Martin Sebor wrote:

With part 2 (below) of this work committed, I've rebased the patch
on the top of trunk and on top of the updated part 1 (also below).
Attached is the result, retested on x86_64-linux.

[1] include size and offset in -Wstringop-overflow
     https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00392.html

[2] extend -Wstringop-overflow to allocated objects
     (committed in r278983)
     https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00263.html

On 11/25/19 10:54 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00812.html

On 11/18/19 11:23 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00812.html

On 11/11/19 6:27 PM, Martin Sebor wrote:

The attached patch extends the strlen pass to detect out-of-bounds
accesses to memory allocated by calls to other allocation functions
besides calloc and malloc, as well as VLAs, and user-defined
functions declared with attribute alloc_size.  There is some
overlap with the _FORTIFY_SOURCE detection but thanks to
the extensive use of ranges, this enhancement detects many more
cases of overflow.
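
To illustrate what range-based detection means here, the following is a
minimal C sketch of a "definite overflow" test on value ranges.  The struct,
the function name, and the clamping rule for negative offsets are assumptions
made for this illustration only; the pass itself works on GCC's internal
range and strinfo data, not on anything like this.

```c
#include <stdint.h>

/* Hypothetical closed interval [min, max] for a value known only as a
   range.  */
struct range { int64_t min, max; };

/* Return nonzero if a write is certain to overflow: even the smallest
   write length at the smallest usable offset does not fit in the largest
   possible allocation.  A negative minimum offset is clamped to 0 so that
   an offset range straddling zero does not produce a false positive.  */
int write_definitely_overflows (struct range size, struct range off,
				struct range len)
{
  int64_t start = off.min < 0 ? 0 : off.min;
  return start + len.min > size.max;
}
```

Because only the most favorable combination of bounds is checked, the sketch
reports nothing when the ranges merely allow an overflow.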

The solution primarily improves warnings but some of the changes
also improve codegen in some cases as a side-effect.  I hope to
take better advantage of the optimization opportunities the dynamic
memory tracking opens up (and also better buffer overflow and array
out-of-bounds detection) in GCC 11.

Although the strlen pass already tracks some dynamic memory calls
(calloc and malloc), rather than extending the same infrastructure
(strinfo::stmt) to others, I took the approach of adding a separate
data member for the other calls (strinfo::alloc) and tracking those
independently.  I did this to keep the changes only minimally
intrusive.  In the future (post GCC 10) it might be worth
considering merging both.

Besides introducing the new member and making use of it, the rest
of the changes were prompted by weaknesses exposed by test cases
involving dynamically allocated objects.

The patch is intended to apply on top of the two related patches
posted last week ([1] and [2]).  For all tests to pass, it also expects
the fix for PR 92412 posted earlier today ([3]).

Martin

[1] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00429.html
[2] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00652.html
[3] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00800.html


PR middle-end/91582 - missing heap overflow detection for strcpy
PR middle-end/92868 - ICE: tree check: expected integer_cst, have ssa_name

gcc/ChangeLog:

	PR middle-end/91582
	PR middle-end/92868
	* builtins.c (gimple_call_alloc_size): Add arguments.
	(compute_objsize): Add an argument.  Set *PDECL even for allocated
	objects.
	Correct checking for negative wide_int.
	Correct handling of negative outer offsets into unknown regions
	or with unknown inner offsets.
	Extend offsets to at most sizetype precision.
	Only handle constant subobject sizes.
	* builtins.h (gimple_call_alloc_size): Add arguments.
	* gcc/tree.c (component_ref_size): Always return sizetype.
	* tree-ssa-strlen.c (strinfo::alloc): New member.
	(get_addr_stridx): Add argument.
	(get_stridx): Use ptrdiff_t.  Add argument.
	(new_strinfo): Set new member.
	(get_string_length): Handle alloca and VLA.
	(dump_strlen_info): Dump more state.
	(maybe_invalidate): Print more info.  Decrease indentation.
	(unshare_strinfo): Set new member.
	(valid_builtin_call): Handle alloca and VLA.
	(maybe_warn_overflow): Check and set no-warning bit.  Improve
	handling of offsets.  Print allocated objects.
	(handle_builtin_strlen): Handle strinfo records with null lengths.
	(handle_builtin_strcpy): Add argument.  Call maybe_warn_overflow.
	(is_strlen_related_p): Handle dynamically allocated objects.
	(get_range): Add argument.
	(handle_builtin_malloc): Rename...
	(handle_alloc_call): ...to this and handle all allocation functions.
	(handle_builtin_memset): Call maybe_warn_overflow.
	(count_nonzero_bytes): Handle more MEM_REF forms.
	(strlen_check_and_optimize_call): Call handle_alloc_call.  Pass
	arguments to more callees.
	(handle_integral_assign): Add argument.  Create strinfo entries
	for MEM_REF assignments.
	(check_and_optimize_stmt): Handle more MEM_REF forms.

gcc/testsuite/ChangeLog:

	PR middle-end/91582
	* c-c++-common/Wrestrict.c: Adjust expected warnings.
	* gcc.dg/Warray-bounds-46.c: Disable -Wstringop-overflow.
	* gcc.dg/Warray-bounds-47.c: Same.
	* gcc.dg/Warray-bounds-52.c: New test.
	* gcc.dg/Wstringop-overflow-26.c: New test.
	* gcc.dg/Wstringop-overflow-27.c: New test.
	* gcc.dg/attr-alloc_size.c (test): Disable -Warray-bounds.
	* gcc.dg/attr-copy-2.c: Adjust expected warnings.
	*

Re: C++ PATCH for c++/88337 - Implement P1327R1: Allow dynamic_cast in constexpr

2019-12-11 Thread Marek Polacek

On Fri, Nov 22, 2019 at 04:11:53PM -0500, Jason Merrill wrote:
> On 11/8/19 4:24 PM, Marek Polacek wrote:
> > After much weeping and gnashing of teeth, here's a patch to handle
> > dynamic_cast in constexpr evaluation.  While the change in the standard
> > is trivial (see ), the change in the compiler is less so.
> > 
> > When build_dynamic_cast realizes that a dynamic_cast needs a run-time
> > check, it generates a call to __dynamic_cast -- see dyncast.cc in
> > libsupc++ for its definition.  The gist of my approach is to evaluate
> > such a call at compile time.
> > 
> > This should be easy in theory: let the constexpr machinery find out the
> > dynamic type and then handle a sidecast and upcast.  That's ultimately
> > what the patch is trying to do, but there were a number of hindrances.
> > 
> > 1) We can't use __dynamic_cast's type_info parameters, this type is not a
> > literal class.  But that means we have no idea what we're converting to!
> 
> get_tinfo_decl sets the TREE_TYPE of the DECL_NAME of the tinfo decl to the
> relevant type, can't you use that?

Yes, lovely.  I hadn't noticed that :(.

It doesn't say if this is a reference dynamic_cast or a pointer dynamic_cast,
so I checked OBJ to wheedle that information out of it.

> > 2) [class.cdtor] says that when a dynamic_cast is used in a constructor or
> > destructor and the operand of the dynamic_cast refers to the object under
> > construction or destruction, this object is considered to be a most derived
> > object.
> 
> This means that during the 'tor the vtable pointer refers to the type_info
> for that class and the offset-to-top is 0.  Can you use that?

I can't seem to: For e.g.

struct C : A, C2 { A *c = dynamic_cast<A*>(static_cast<C2*>(this)); };

the object under construction is C, the call to __dynamic_cast will be
__dynamic_cast (SAVE_EXPR <&((struct C *) this)->D.2119>, &_ZTI2C2, &_ZTI1A, -2)
here, OBJ is f.D.2156.D.2119 and ctx->global->ctor_object is f.D.2156.  So OBJ
refers to the object under construction.

But I don't see C anywhere; CLASSTYPE_TYPEINFO_VAR of OBJTYPE is _ZTI2C2.

Am I looking into the wrong place?

> > +  /* Given dynamic_cast<T>(v),
> > +
> > + [expr.dynamic.cast] If C is the class type to which T points or refers,
> > + the runtime check logically executes as follows:
> > +
> > + If, in the most derived object pointed (referred) to by v, v points
> > + (refers) to a public base class subobject of a C object, and if only
> > + one object of type C is derived from the subobject pointed (referred)
> > + to by v the result points (refers) to that C object.
> > +
> > + In this case, HINT >= 0.  This is a downcast.  */
> 
> Please avoid using up/down to refer to inheritance relationships, people
> disagree about what they mean.  :)

OK, dropped that terminology.

> > +  if (hint >= 0)
> > +{
> > +  /* We now have something like
> > +
> > + g.D.2181.D.2154.D.2102.D.2093
> > +^~
> > +OBJ
> > +
> > +and we're looking for a component with type TYPE.  */
> > +  tree objtype = TREE_TYPE (obj);
> > +  tree ctor_object = ctx->global->ctor_object;
> > +
> > +  for (;;)
> > +   {
> > + /* Unfortunately, we can't rely on HINT, we need to do some
> > +verification here:
> > +
> > +1) Consider
> > + dynamic_cast((A*)(B*)(D*));
> > +   and imagine that there's an accessible base A from E (so HINT
> > +   is >= 0), but it's a different A than where OBJ points to.
> > +   We need to check that the one we're accessing via E->D->B->A is
> > +   in fact accessible.  If e.g. B on this path is private, we gotta
> > +   fail.  So check that every base on the way can be reached from
> > +   the preceding class.
> > +
> > +2) Further, consider
> > +
> > +   struct A { virtual void a(); };
> > +   struct AA : A {};
> > +   struct B : A {};
> > +   struct Y : AA, private B {};
> > +
> > +   dynamic_cast((A*)(B*));
> > +
> > +   Here HINT is >=0, because A is a public unique base of Y,
> > +   but that's not the A accessed via Y->B->A.  */
> > + if (!accessible_base_p (TREE_TYPE (obj), objtype, false)
> > + || !accessible_base_p (type, TREE_TYPE (obj), false))
> > +   {
> > + if (reference_p)
> > +   {
> > + if (!ctx->quiet)
> > +   {
> > + error_at (loc, "reference %<dynamic_cast%> failed");
> > + inform (loc, "static type %qT of its operand is a "
> > + "non-public base class of dynamic type %qT",
> > + objtype, type);
> > +   }
> > + *non_constant_p = true;
> > +   }
> > + return integer_zero_node;
> > +   }
> > +
> > + if

Re: Re: [Patch, Fortran] PR92898 - [9/10 Regression] ICE in gfc_check_is_contiguous, at fortran/check.c:7157

2019-12-11 Thread Steve Kargl

On Wed, Dec 11, 2019 at 11:24:35PM +0100, Harald Anlauf wrote:
> 
> > Sent: Tuesday, 10 December 2019 at 23:34
> > From: "Thomas Koenig"
> > To: "Harald Anlauf", gfortran, gcc-patches
> > Subject: Re: [Patch, Fortran] PR92898 - [9/10 Regression] ICE in
> > gfc_check_is_contiguous, at fortran/check.c:7157
> >
> >
> > > Index: gcc/fortran/check.c
> > > ===
> > > --- gcc/fortran/check.c   (revision 279183)
> > > +++ gcc/fortran/check.c   (working copy)
> > > @@ -7154,7 +7154,9 @@ bool
> > >   gfc_check_is_contiguous (gfc_expr *array)
> > >   {
> > > if (array->expr_type == EXPR_NULL
> > > -  && array->symtree->n.sym->attr.pointer == 1)
> > > +  && (!array->symtree ||
> > > +   (array->symtree->n.sym &&
> > > +array->symtree->n.sym->attr.pointer == 1)))
> >
> > I have to admit I do not understand the original code here, nor
> > do I quite understand your fix.
> >
> > Is there any circumstance where array->expr_type == EXPR_NULL, but
> > is_contiguous is valid?  What would go wrong if the other tests
> > were removed?
> 
> Actually I do not know what the additional check was supposed to do.
> Removing it does not seem to do any harm.  See below.
> 

The original testcase has NULL(Z) where Z has the pointer
attribute.  See 16.9.144.  NULL(Z) then has the pointer
attribute.  I did not consider NULL(), which is of course
a valid reference.
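
To make the failure mode concrete, here is a minimal C++ sketch. The
types below are simplified stand-ins invented for illustration (they are
not gfortran's real gfc_expr layout), and the sketch assumes, as the ICE
and the first version of the fix suggest, that a bare NULL() carries no
symtree at all:

```cpp
// Hypothetical, simplified stand-ins for the front-end structures.
enum expr_kind { EXPR_VARIABLE, EXPR_NULL };

struct symbol  { bool pointer; };
struct symtree { symbol *sym; };

struct expr
{
  expr_kind expr_type;
  symtree  *tree;   // assumed to be a null pointer for a bare NULL()
};

// Shape of the old test: only safe when e.tree and e.tree->sym are set.
bool old_check_rejects (const expr &e)
{
  return e.expr_type == EXPR_NULL && e.tree->sym->pointer;
}

// Shape of the simplified test: every NULL reference is rejected,
// and nothing is dereferenced.
bool new_check_rejects (const expr &e)
{
  return e.expr_type == EXPR_NULL;
}
```

With these stand-ins, old_check_rejects would dereference a null pointer
for the bare NULL() case, while new_check_rejects treats NULL(Z) and
NULL() the same way.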

-- 
Steve

Aw: Re: [Patch, Fortran] PR92898 - [9/10 Regression] ICE in gfc_check_is_contiguous, at fortran/check.c:7157

2019-12-11 Thread Harald Anlauf

Hi Thomas,

> Sent: Tuesday, 10 December 2019 at 23:34
> From: "Thomas Koenig"
> To: "Harald Anlauf", gfortran,
> gcc-patches
> Subject: Re: [Patch, Fortran] PR92898 - [9/10 Regression] ICE in
> gfc_check_is_contiguous, at fortran/check.c:7157
>
> Hello Harald,
>
> > Index: gcc/fortran/check.c
> > ===
> > --- gcc/fortran/check.c (revision 279183)
> > +++ gcc/fortran/check.c (working copy)
> > @@ -7154,7 +7154,9 @@ bool
> >   gfc_check_is_contiguous (gfc_expr *array)
> >   {
> > if (array->expr_type == EXPR_NULL
> > -  && array->symtree->n.sym->attr.pointer == 1)
> > +  && (!array->symtree ||
> > + (array->symtree->n.sym &&
> > +  array->symtree->n.sym->attr.pointer == 1)))
>
> I have to admit I do not understand the original code here, nor
> do I quite understand your fix.
>
> Is there any circumstance where array->expr_type == EXPR_NULL, but
> is_contiguous is valid?  What would go wrong if the other tests
> were removed?

Actually I do not know what the additional check was supposed to do.
Removing it does not seem to do any harm.  See below.

>
> > Index: gcc/testsuite/gfortran.dg/pr91641.f90
> > ===
> > --- gcc/testsuite/gfortran.dg/pr91641.f90   (revision 279183)
> > +++ gcc/testsuite/gfortran.dg/pr91641.f90   (working copy)
> > @@ -1,7 +1,9 @@
> >   ! { dg-do compile }
> >   ! PR fortran/91641
> > -! Code conyributed by Gerhard Steinmetz
> > +! PR fortran/92898
> > +! Code contributed by Gerhard Steinmetz
> >   program p
> >  real, pointer :: z(:)
> >  print *, is_contiguous (null(z))! { dg-error "shall be an 
> > associated" }
> > +   print *, is_contiguous (null()) ! { dg-error "shall be an 
> > associated" }
> >   end
>
> Sometimes, it is necessary to change test cases, when error messages
> change.  If this is not the case, it is better to add new tests to
> new test cases - this makes regression hunting much easier.
>
> Regards
>
>   Thomas

Agreed.  Please find the modified patches below.  OK for trunk / 9 ?

Thanks,
Harald

Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c (revision 279254)
+++ gcc/fortran/check.c (working copy)
@@ -7153,8 +7153,7 @@ gfc_check_ttynam_sub (gfc_expr *unit, gfc_expr *na
 bool
 gfc_check_is_contiguous (gfc_expr *array)
 {
-  if (array->expr_type == EXPR_NULL
-  && array->symtree->n.sym->attr.pointer == 1)
+  if (array->expr_type == EXPR_NULL)
 {
   gfc_error ("Actual argument at %L of %qs intrinsic shall be an "
 "associated pointer", &array->where, gfc_current_intrinsic);



Index: gcc/testsuite/gfortran.dg/pr92898.f90
===
--- gcc/testsuite/gfortran.dg/pr92898.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr92898.f90   (working copy)
@@ -0,0 +1,6 @@
+! { dg-do compile }
+! PR fortran/92898
+! Code contributed by Gerhard Steinmetz
+program p
+  print *, is_contiguous (null()) ! { dg-error "shall be an associated" }
+end


2019-12-11  Harald Anlauf  

PR fortran/92898
* check.c (gfc_check_is_contiguous): Simplify check to handle
arbitrary NULL() argument.

2019-12-11  Harald Anlauf  

PR fortran/92898
* gfortran.dg/pr92898.f90: New test.

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread Jonathan Wakely


On 11/12/19 22:28 +0100, François Dumont wrote:

On 12/11/19 9:44 PM, Jonathan Wakely wrote:

On 11/12/19 21:23 +0100, François Dumont wrote:

On 12/11/19 12:16 PM, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?


I just don't get the intention so I proposed to move it. But there 
are indeed other usages of this pattern in other src files.


Note that in src/c++11/debug.cc I am using anonymous namespace at 
global scope, is that wrong ?


It's certainly more fragile, so arguably it's wrong, yes.

Consider:

// some libc function in a system header we don't control:
extern "C" void foo();

// libstdc++ code in a .cc file:
namespace
{
 void foo() { }
}
namespace std
{
 void bar() { foo(); }
}

This fails to compile, because the name foo is ambiguous in the global
scope. We don't control the libc headers, so we don't know all the
names they might declare at global scope.

If you don't put the unnamed namespace at global scope, the problem
simply doesn't exist:

namespace std
{
 namespace
 {
   void foo() { }
 }

 void bar() { foo(); }
}

Now it doesn't matter what names libc puts in the global scope,
because we're never looking for foo in the global scope.

It's obviously better to add our declarations to our own namespace
that we control, not to the global namespace (and an unnamed namespace
at global scope effectively adds the names to the global namespace).
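
As a compilable illustration of the second layout, here is a small
sketch. The names lib, probe and call are made up for the example, and
the global probe() merely stands in for whatever a libc header might
declare:

```cpp
// Stand-in for a global-scope name declared by a header we don't control.
int probe() { return 1; }

namespace lib   // hypothetical library namespace
{
  namespace
  {
    // Internal helper; found first by unqualified lookup inside lib.
    int probe() { return 2; }
  }

  // Lookup stops at namespace lib, so the global probe is never considered.
  int call() { return probe(); }
}
```

Calling lib::call() from a test program picks the helper in the unnamed
namespace, regardless of what the global scope contains.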



Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.
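
A minimal sketch of that inline-namespace behaviour, using made-up names
(lib, v8, device) rather than the real libstdc++ macros: the class is
declared inside the inline version namespace, but its member is defined
from the enclosing namespace without reopening the versioned one.

```cpp
#include <type_traits>

namespace lib   // hypothetical library namespace
{
  inline namespace v8   // hypothetical version namespace
  {
    struct device { int get() const; };
  }
}

namespace lib
{
  // No need to reopen v8: lib::device already names lib::v8::device.
  int device::get() const { return 8; }
}

static_assert(std::is_same<lib::device, lib::v8::device>::value,
              "both names denote the same type");
```

This mirrors the situation in random.cc only in outline; whether the
other source files should drop the macros is a separate cleanup question.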


Ok. I must confess I wasn't clear about this but looking at other 
src files, at least in src/c++11, was showing that it is done 
almost always this way, I guess we could cleanup those files.




The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


Like in attached patch ?


Did you attach the wrong patch?



Indeed, here is the correct one.


OK for trunk (including <cctype> unconditionally is fine).
Thanks.







diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 10fbe1dc4c4..04edc582b69 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -41,6 +41,7 @@

#include 
#include 
+#include <cctype> // For std::isdigit.

#if defined _GLIBCXX_HAVE_UNISTD_H && defined _GLIBCXX_HAVE_FCNTL_H
# include 
@@ -286,7 +287,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
_M_mt.seed(seed);
#else
// Convert old default token "mt19937" or numeric seed tokens to "default".
-if (token == "mt19937" || isdigit((unsigned char)token[0]))
+if (token == "mt19937" || std::isdigit((unsigned char)token[0]))
  _M_init("default");
else
  _M_init(token);
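
For readers who want to try the qualified call in isolation, here is a
small stand-alone sketch. The helper name starts_with_digit is invented
for the example and is not part of libstdc++; it only shows the
std::isdigit call with the unsigned char conversion used in the patch.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of libstdc++.
bool starts_with_digit(const std::string& token)
{
  if (token.empty())
    return false;
  // Convert through unsigned char: passing a negative char value to
  // std::isdigit is undefined behaviour.
  return std::isdigit(static_cast<unsigned char>(token[0])) != 0;
}
```

A numeric seed token such as "42" is accepted, while "mt19937" and the
empty string are not.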

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread François Dumont


On 12/11/19 9:44 PM, Jonathan Wakely wrote:

On 11/12/19 21:23 +0100, François Dumont wrote:

On 12/11/19 12:16 PM, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?


I just don't get the intention so I proposed to move it. But there 
are indeed other usages of this pattern in other src files.


Note that in src/c++11/debug.cc I am using anonymous namespace at 
global scope, is that wrong ?


It's certainly more fragile, so arguably it's wrong, yes.

Consider:

// some libc function in a system header we don't control:
extern "C" void foo();

// libstdc++ code in a .cc file:
namespace
{
 void foo() { }
}
namespace std
{
 void bar() { foo(); }
}

This fails to compile, because the name foo is ambiguous in the global
scope. We don't control the libc headers, so we don't know all the
names they might declare at global scope.

If you don't put the unnamed namespace at global scope, the problem
simply doesn't exist:

namespace std
{
 namespace
 {
   void foo() { }
 }

 void bar() { foo(); }
}

Now it doesn't matter what names libc puts in the global scope,
because we're never looking for foo in the global scope.

It's obviously better to add our declarations to our own namespace
that we control, not to the global namespace (and an unnamed namespace
at global scope effectively adds the names to the global namespace).



Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.


Ok. I must confess I wasn't clear about this but looking at other src 
files, at least in src/c++11, was showing that it is done almost 
always this way, I guess we could cleanup those files.




The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


Like in attached patch ?


Did you attach the wrong patch?



Indeed, here is the correct one.


diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 10fbe1dc4c4..04edc582b69 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -41,6 +41,7 @@
 
 #include 
 #include 
+#include <cctype> // For std::isdigit.
 
 #if defined _GLIBCXX_HAVE_UNISTD_H && defined _GLIBCXX_HAVE_FCNTL_H
 # include 
@@ -286,7 +287,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 _M_mt.seed(seed);
 #else
 // Convert old default token "mt19937" or numeric seed tokens to "default".
-if (token == "mt19937" || isdigit((unsigned char)token[0]))
+if (token == "mt19937" || std::isdigit((unsigned char)token[0]))
   _M_init("default");
 else
   _M_init(token);

Re: [PATCH 47/49] analyzer: new files: diagnostic-manager.{cc|h}

2019-12-11 Thread David Malcolm

On Wed, 2019-12-11 at 13:42 -0700, Jeff Law wrote:
> On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> > This patch adds diagnostic_manager and related support classes for
> > saving, deduplicating, and emitting analyzer diagnostics.
> > 
> > gcc/ChangeLog:
> > * analyzer/diagnostic-manager.cc: New file.
> > * analyzer/diagnostic-manager.h: New file.
> I was originally going to suggest we look to bring this out of the
> analyzer subdir, but it looks like there's a lot of tie-ins to the
> static analyzer, so let's not try that right now.
> 
> 
> > +
> > +/* Prune PATH, based on the verbosity level, to the most pertinent
> > +   events for a diagnostic that involves VAR ending in state STATE
> > +   (for state machine SM).
> > +
> > +   PATH is updated in place, and the redundant checker_events are
> > deleted.
> > +
> > +   As well as deleting events, call record_critical_state on
> > events
> > in
> > +   which state critical to the pending_diagnostic is being
> > handled,
> > so
> > +   that the event's get_desc vfunc can potentially supply a more
> > precise
> > +   description of the event to the user.
> > +   e.g. improving
> > + "calling 'foo' from 'bar'"
> > +   to
> > + "passing possibly-NULL pointer 'ptr' to 'foo' from 'bar' as
> > param 1"
> > +   when the diagnostic relates to later dereferencing 'ptr'.  */
> > +
> > +void
> > +diagnostic_manager::prune_path (checker_path *path,
> > +   const state_machine *sm,
> > +   tree var,
> > +   state_machine::state_t state) const
> You might consider breaking this up a bit.  I guess it stands out
> because it's one of the few places (so far) where the function won't
> fit in my portrait window :-)

Thanks.  Have done so locally, so will be in next iteration once I
finish squashing everything.

Dave

Re: [PATCH] Fix gnu-versioned-namespace tr1 declaration

2019-12-11 Thread Jonathan Wakely


On 11/12/19 21:25 +0100, François Dumont wrote:

On 12/11/19 12:22 PM, Jonathan Wakely wrote:

On 11/12/19 11:16 +, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?

Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.

The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


I was curious why that qualification is needed. The problem is that
<ctype.h> is being indirectly included by some other header, and so is
, so the declarations visible are ::isdigit(int) and
std::__8::isdigit(CharT, const locale&). Even after including
<cctype> we still can't call it unqualified, because <cctype> doesn't
use the versioned namespace:


Yes, this is the other patch I wanted to propose. Make sure that tr1 
namespace is always defined consistently with the version namespace.


For instance 17_intro/headers/c++2011/parallel_mode.cc is failing at 
the moment with:


In the Venn diagram for TR1 and Parallel Mode the intersection is
labelled "nobody cares" :-)

The patch is OK for trunk though.

Re: [PATCH][RFC] Add new ipa-reorder pass

2019-12-11 Thread Andi Kleen

Martin Liška  writes:
>
> Notes and limitations:
> - The call-chain-clustering algorithm requires to fit as many as possible 
> functions into page size (4K).
>   Based on my measurements that should correspond to ~1000 GIMPLE statements 
> (IPA inliner size). I can
>   make it a param in the future.

That sounds inaccurate. I would assume e.g. integer code has a
quite different factor than floating point code. Perhaps it would
need something fancier.

If you use a static factor you probably would need to calibrate it
over a lot more code.

Anyways, I think it should be a param, or better even an option, because there's
a trend towards using 2MB code pages on x86. Linux has a number of ways to do
that today.

Also of course that's needed for other architectures.
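
To show why the page size matters as a tunable, here is a deliberately
simplified sketch. It is not the call-chain-clustering algorithm from the
patch; it just packs functions in order and starts a new cluster whenever
the page budget would be exceeded. The helper name count_clusters and the
sizes are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count clusters when functions are packed in order
// and a new cluster starts whenever the page budget would be exceeded.
std::size_t count_clusters(const std::vector<std::size_t>& sizes,
                           std::size_t page_size)
{
  std::size_t clusters = 0, used = 0;
  for (std::size_t s : sizes)
    {
      if (clusters == 0 || used + s > page_size)
        {
          ++clusters;   // open a new cluster (page)
          used = 0;
        }
      used += s;
    }
  return clusters;
}
```

With the same list of function sizes, a 4K budget and a 2MB budget give
different cluster counts, which is the reason for asking that the size be
a param or an option rather than a constant.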

-Andi

Re: [PATCH 46/49] analyzer: new files: checker-path.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds a family of classes for representing paths of events
> for analyzer diagnostics.
> 
> gcc/ChangeLog:
>   * analyzer/checker-path.cc: New file.
>   * analyzer/checker-path.h: New file.
> 
> 
No significant concerns here.  My brain is turning to mush.  That's
probably it for today.

jeff

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread Jonathan Wakely


On 11/12/19 21:23 +0100, François Dumont wrote:

On 12/11/19 12:16 PM, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?


I just don't get the intention so I proposed to move it. But there are 
indeed other usages of this pattern in other src files.


Note that in src/c++11/debug.cc I am using anonymous namespace at 
global scope, is that wrong ?


It's certainly more fragile, so arguably it's wrong, yes.

Consider:

// some libc function in a system header we don't control:
extern "C" void foo();

// libstdc++ code in a .cc file:
namespace
{
 void foo() { }
}
namespace std
{
 void bar() { foo(); }
}

This fails to compile, because the name foo is ambiguous in the global
scope. We don't control the libc headers, so we don't know all the
names they might declare at global scope.

If you don't put the unnamed namespace at global scope, the problem
simply doesn't exist:

namespace std
{
 namespace
 {
   void foo() { }
 }

 void bar() { foo(); }
}

Now it doesn't matter what names libc puts in the global scope,
because we're never looking for foo in the global scope.

It's obviously better to add our declarations to our own namespace
that we control, not to the global namespace (and an unnamed namespace
at global scope effectively adds the names to the global namespace).



Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.


Ok. I must confess I wasn't clear about this but looking at other src 
files, at least in src/c++11, was showing that it is done almost 
always this way, I guess we could cleanup those files.




The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


Like in attached patch ?


Did you attach the wrong patch?

Re: [PATCH 47/49] analyzer: new files: diagnostic-manager.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds diagnostic_manager and related support classes for
> saving, deduplicating, and emitting analyzer diagnostics.
> 
> gcc/ChangeLog:
>   * analyzer/diagnostic-manager.cc: New file.
>   * analyzer/diagnostic-manager.h: New file.
I was originally going to suggest we look to bring this out of the
analyzer subdir, but it looks like there's a lot of tie-ins to the
static analyzer, so let's not try that right now.


> 
> +
> +/* Prune PATH, based on the verbosity level, to the most pertinent
> +   events for a diagnostic that involves VAR ending in state STATE
> +   (for state machine SM).
> +
> +   PATH is updated in place, and the redundant checker_events are
> deleted.
> +
> +   As well as deleting events, call record_critical_state on events
> in
> +   which state critical to the pending_diagnostic is being handled,
> so
> +   that the event's get_desc vfunc can potentially supply a more
> precise
> +   description of the event to the user.
> +   e.g. improving
> + "calling 'foo' from 'bar'"
> +   to
> + "passing possibly-NULL pointer 'ptr' to 'foo' from 'bar' as
> param 1"
> +   when the diagnostic relates to later dereferencing 'ptr'.  */
> +
> +void
> +diagnostic_manager::prune_path (checker_path *path,
> + const state_machine *sm,
> + tree var,
> + state_machine::state_t state) const
You might consider breaking this up a bit.  I guess it stands out
because it's one of the few places (so far) where the function won't
fit in my portrait window :-)


Jeff

[PATCH] Fix gnu-versioned-namespace tr1 declaration

2019-12-11 Thread François Dumont


On 12/11/19 12:22 PM, Jonathan Wakely wrote:

On 11/12/19 11:16 +, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?

Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.

The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


I was curious why that qualification is needed. The problem is that
<ctype.h> is being indirectly included by some other header, and so is
, so the declarations visible are ::isdigit(int) and
std::__8::isdigit(CharT, const locale&). Even after including
<cctype> we still can't call it unqualified, because <cctype> doesn't
use the versioned namespace:


Yes, this is the other patch I wanted to propose. Make sure that tr1 
namespace is always defined consistently with the version namespace.


For instance 17_intro/headers/c++2011/parallel_mode.cc is failing at the 
moment with:


/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/tr1/gamma.tcc:292: 
error: reference to 'tr1' is ambiguous
In file included from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/tr1/random:45,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/random_number.h:36,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/partition.h:38,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/quicksort.h:36,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/sort.h:48,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/algo.h:45,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/algorithm:37,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/algorithm:80,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/stdc++.h:65,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/extc++.h:32,
 from 
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/17_intro/headers/c++2011/parallel_mode.cc:24:
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/tr1/type_traits:40: 
note: candidates are: 'namespace std::__8::tr1 { }'
In file included from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/types.h:37,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/parallel.h:38,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/base.h:40,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/parallel/algobase.h:40,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/stl_algobase.h:2071,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:39,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/ios:40,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/istream:38,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/sstream:38,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/complex:45,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/ccomplex:39,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/stdc++.h:54,
 from 
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/extc++.h:32,
 from 
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/17_intro/headers/c++2011/parallel_mode.cc:24:
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/tr1/cstdint:61: 
note: 'namespace std::tr1 { }'


Tested under Linux x86_64 normal and versioned namespace.

Ok to commit ?

François


diff --git a/libstdc++-v3/include/tr1/cctype b/libstdc++-v3/include/tr1/cctype
index ce994066188..b35cd04f0db 100644

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread François Dumont


On 12/11/19 12:16 PM, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?


I just don't get the intention so I proposed to move it. But there are 
indeed other usages of this pattern in other src files.


Note that in src/c++11/debug.cc I am using anonymous namespace at global 
scope, is that wrong ?




Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.


Ok. I must confess I wasn't clear about this but looking at other src 
files, at least in src/c++11, was showing that it is done almost always 
this way, I guess we could cleanup those files.




The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include <cctype> to declare that).


Like in attached patch ?

I am including it unconditionally with other C wrapping headers like 
, is that fine ?


At least it builds fine.




    * src/c++11/random.cc: Add _GLIBCXX_BEGIN_NAMESPACE_VERSION and
    _GLIBCXX_END_NAMESPACE_VERSION. Move anonymous namespace outside std
    namespace.

Tested under Linux x86_64 normal/debug/versioned namespace modes.

There are still tests failing in versioned namespace, more patches to 
come.


François



diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 10fbe1dc4c4..d4ebc9556ab 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -73,8 +73,6 @@
 # define USE_MT19937 1
 #endif
 
-namespace std _GLIBCXX_VISIBILITY(default)
-{
 namespace
 {
 #if USE_RDRAND
@@ -124,6 +122,10 @@ namespace std _GLIBCXX_VISIBILITY(default)
 #endif
 }
 
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
   void
   random_device::_M_init(const std::string& token)
   {
@@ -286,7 +288,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 _M_mt.seed(seed);
 #else
 // Convert old default token "mt19937" or numeric seed tokens to "default".
-if (token == "mt19937" || isdigit((unsigned char)token[0]))
+if (token == "mt19937" || std::isdigit((unsigned char)token[0]))
   _M_init("default");
 else
   _M_init(token);
@@ -407,5 +409,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 0x9d2c5680UL, 15,
 0xefc6UL, 18, 1812433253UL>;
 #endif // USE_MT19937
-}
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace std
 #endif // _GLIBCXX_USE_C99_STDINT_TR1

Re: [PATCH 00/49] RFC: Add a static analysis framework to GCC

2019-12-11 Thread David Malcolm

On Mon, 2019-12-09 at 09:10 +0100, Richard Biener wrote:
> On Fri, Dec 6, 2019 at 11:31 PM Jeff Law  wrote:
> > On Wed, 2019-12-04 at 12:55 -0700, Martin Sebor wrote:
> > > On 11/15/19 6:22 PM, David Malcolm wrote:
> > > > This patch kit introduces a static analysis pass for GCC that
> > > > can
> > > > diagnose
> > > > various kinds of problems in C code at compile-time (e.g.
> > > > double-
> > > > free,
> > > > use-after-free, etc).
> > > 
> > > I haven't looked at the analyzer bits in any detail yet so I have
> > > just some very high-level questions.  But first let me say I'm
> > > excited to see this project! :)
> > > 
> > > It looks like the analyzer detects some of the same problems as
> > > some existing middle-end warnings (e.g., -Wnonnull,
> > > -Wuninitialized),
> > > and some that I have been working toward implementing (invalid
> > > uses
> > > of freed pointers such as returning them from functions or
> > > passing
> > > them to others), and others still that I have been thinking about
> > > as possible future projects (e.g., detecting uses of
> > > uninitialized
> > > arrays in string functions).
> > > 
> > > What are your thoughts about this sort of overlap?  Do you expect
> > > us to enhance both sets of warnings in parallel, or do you see us
> > > moving away from issuing warnings in the middle-end and toward
> > > making the analyzer the main source of these kinds of
> > > diagnostics?
> > > Maybe even replace some of the problematic middle-end warnings
> > > with the analyzer?  What (if anything) should we do about
> > > warnings
> > > issued for the same problems by both the middle-end and the
> > > analyzer?
> > > Or about false negatives?  E.g., a bug detected by the middle-end
> > > but not the analyzer or vice versa.
> > > 
> > > What do you see as the biggest pros and cons of either approach?
> > > (Middle-end vs analyzer.)  What limitations is the analyzer
> > > approach inherently subject to that the middle-end warnings
> > > aren't,
> > > and vice versa?
> > > 
> > > How do we prioritize between the two approaches (e.g., choose
> > > where to add a new warning)?
> > Given the cost of David's analyzer, I would tend to prioritize the
> > more
> > localized analysis.  Also note that because of the compile-time
> > complexities we end up pruning paths from the search space and lose
> > precision when we have to merge nodes.   These issues are inherent
> > in
> > the depth of analysis we're looking to do.
> > 
> > So the way to think about things is David's work is a slower,
> > deeper
> > analysis than what we usually do.  So things that are reasonable
> > candidates for -Wall would need to use the traditional mechanisms.
> > Things that require deeper analysis would be done in David's
> > framework.
> > 
> > Also note that part of David's work is to bring a fairly generic
> > engine
> > that we can expand with different domain specific analyzers.  It
> > just
> > happens to be the case that the first place he's focused is on
> > double-
> > free and use-after-free.  But (IMHO) the gem is really the generic
> > engine.
> 
> So if the "generic engine" lives inside GCC can the actual analyzers
> be plugins on a (stable) "analyzer plugin API"?

I like the idea of having plugins be able to support the analyzer
itself, so that new checkers can be registered by a plugin, analogous
to plugins that register new passes.  AIUI the clang static analyzer
works in such a fashion.

However, speaking to the "(stable)" part of your question: to do
anything useful, the checkers have to query GCC's IR (as well as
interact with the state of the analyzer), and so this reopens the
question of what the plugin API to GCC's IR is.

I'm focusing on building a concrete example of a checker (double-free)
and a few other examples; trying to generalize it into something
pluggable feels very much like something not to attempt in the initial
version.

> Does the analyzer work with LTO at whole-program scope btw?

My understanding of LTO is a little hazy, but yes, I think.

The first thing the analyzer does (in engine.cc) is:

  /* If using LTO, ensure that the cgraph nodes have function bodies.  */
  cgraph_node *node;
  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
    node->get_untransformed_body ();

before then building a "supergraph" that combines CFGs and the callgraph.

BTW, for more on implementation details, prebuilt HTML of the internal
docs is at:
https://dmalcolm.fedorapeople.org/gcc/static-analyzer/gccint/Static-Analyzer.html

Dave

Re: [PATCH 44/49] analyzer: new files: state-purge.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds classes for tracking what state can be safely purged
> at any given point in the program.
> 
> gcc/ChangeLog:
>   * analyzer/state-purge.cc: New file.
>   * analyzer/state-purge.h: New file.
> ---
>  gcc/analyzer/state-purge.cc | 516
> 
>  gcc/analyzer/state-purge.h  | 170 +++
>  2 files changed, 686 insertions(+)
>  create mode 100644 gcc/analyzer/state-purge.cc
>  create mode 100644 gcc/analyzer/state-purge.h

> +
> +/* state_purge_map's ctor.  Walk all SSA names in all functions,
> building
> +   a state_purge_per_ssa_name instance for each.  */
> +
> +state_purge_map::state_purge_map (const supergraph &sg,
> +   logger *logger)
> +: log_user (logger), m_sg (sg)
> +{
> +  LOG_FUNC (logger);
> +
> +  auto_client_timevar tv ("state_purge_map ctor");
> +
> +  cgraph_node *node;
> +  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
> +  {
> +    function *fun = node->get_fun ();
> +    if (logger)
> +      log ("function: %s", function_name (fun));
> +    //printf ("function: %s\n", function_name (fun));
Debugging leftover.  Kill it.

Otherwise it seems pretty reasonable.

jeff

Re: [PATCH 43/49] analyzer: new file: exploded-graph.h

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds exploded_graph and related classes, for managing
> exploring paths through the user's code as a directed graph
> of <point, state> pairs.
> 
> gcc/ChangeLog:
>   * analyzer/exploded-graph.h: New file.
> ---
>  gcc/analyzer/exploded-graph.h | 754
> ++
>  1 file changed, 754 insertions(+)
>  create mode 100644 gcc/analyzer/exploded-graph.h
> 
> diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-
> graph.h
> new file mode 100644
> index 000..f97d2b6
> --- /dev/null
> +++ b/gcc/analyzer/exploded-graph.h
> @@ -0,0 +1,754 @@
> +/* Classes for managing a directed graph of <point, state> pairs.
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   Contributed by David Malcolm .
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_ANALYZER_EXPLODED_GRAPH_H
> +#define GCC_ANALYZER_EXPLODED_GRAPH_H
> +
> +#include "fibonacci_heap.h"
> +#include "analyzer/analyzer-logging.h"
> +#include "analyzer/constraint-manager.h"
> +#include "analyzer/diagnostic-manager.h"
> +#include "analyzer/program-point.h"
> +#include "analyzer/program-state.h"
> +#include "analyzer/shortest-paths.h"
> +
> +
NIT.  Is there some reason you don't just use whitespace for these kinds
of vertical separators?  The more I see them, the more they bug me,
probably because that's not been the style we've used for GCC.



> ///
> +
> +/* Concrete implementation of region_model_context, wiring it up to
> the
> +   rest of the analysis engine.  */
> +
> +class impl_region_model_context : public region_model_context,
> public log_user
Multiple inheritance?  Is it really needed?  Can we handle via
composition instead?
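
For illustration, a composition-based variant could look roughly like the sketch below. Every class here is a toy stand-in (logger, log_user, context and impl_context are simplified assumptions, not the classes from the patch); the only point is that the logging helper becomes a member rather than a second base class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy logger; stands in for the analyzer's logger class.
class logger
{
public:
  void log (const std::string &msg) { m_lines.push_back (msg); }
  const std::vector<std::string> &lines () const { return m_lines; }
private:
  std::vector<std::string> m_lines;
};

// Toy version of a "log_user"-style helper: holds an optional logger.
class log_user
{
public:
  explicit log_user (logger *l) : m_logger (l) {}
  void log (const std::string &msg) const
  {
    if (m_logger)
      m_logger->log (msg);
  }
private:
  logger *m_logger;
};

// Toy interface playing the role of region_model_context.
class context
{
public:
  virtual ~context () {}
  virtual void warn (const std::string &what) = 0;
};

// Composition: one base class (the interface), with logging as a member.
class impl_context : public context
{
public:
  explicit impl_context (logger *l) : m_log (l) {}
  void warn (const std::string &what) override
  {
    m_log.log ("warning: " + what);
  }
private:
  log_user m_log;
};
```

The cost is a forwarding call (m_log.log) wherever the class used to call the inherited log directly.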



> +/* A <program_point, program_state> pair, used internally by
> +   exploded_node as its immutable data, and as a key for identifying
> +   exploded_nodes we've already seen in the graph.  */
> +
> +struct point_and_state
Shouldn't this be a class?

+
> +/* Per-program_point data for an exploded_graph.  */
> +
> +struct per_program_point_data
Here too?  There may be others.  I'd suggest reviewing all your
"structs" and determining if we're better off calling them "class".  I'm
not going to insist on it though since I think the last discussion in
this space was to relax the conventions :-)

> +
> +class exploded_graph : public digraph, public log_user
Multiple inheritance again?

Jeff

Re: [PATCH 41/49] analyzer: new files: program-point.{cc|h}

2019-12-11 Thread David Malcolm

On Wed, 2019-12-11 at 12:54 -0700, Jeff Law wrote:
> On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> > This patch introduces function_point and program_point, classes
> > for tracking locations within the program (the latter adding
> > a call_string for tracking interprocedural location).
> > 
> > gcc/ChangeLog:
> > * analyzer/program-point.cc: New file.
> > * analyzer/program-point.h: New file.
> > ---
> > 
> > 
> > diff --git a/gcc/analyzer/program-point.h b/gcc/analyzer/program-
> > point.h
> > new file mode 100644
> > index 000..ad7b9cd
> > --- /dev/null
> > +++ b/gcc/analyzer/program-point.h
> > @@ -0,0 +1,316 @@
> > +/* Classes for representing locations within the program.
> > +   Copyright (C) 2019 Free Software Foundation, Inc.
> > +   Contributed by David Malcolm .
> > +
> > +This file is part of GCC.
> > +
> > +GCC is free software; you can redistribute it and/or modify it
> > +under the terms of the GNU General Public License as published by
> > +the Free Software Foundation; either version 3, or (at your
> > option)
> > +any later version.
> > +
> > +GCC is distributed in the hope that it will be useful, but
> > +WITHOUT ANY WARRANTY; without even the implied warranty of
> > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +General Public License for more details.
> > +
> > +You should have received a copy of the GNU General Public License
> > +along with GCC; see the file COPYING3.  If not see
> > +<http://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef GCC_ANALYZER_PROGRAM_POINT_H
> > +#define GCC_ANALYZER_PROGRAM_POINT_H
> > +
> > +#include "analyzer/call-string.h"
> > +#include "analyzer/supergraph.h"
> > +
> > +class exploded_graph;
> > +
> > +/* An enum for distinguishing the various kinds of
> > program_point.  */
> > +
> > +enum point_kind {
> > +  /* A "fake" node which has edges to all entrypoints.  */
> > +  PK_ORIGIN,
> > +
> > +  PK_BEFORE_SUPERNODE,
> > +  PK_BEFORE_STMT,
> > +  PK_AFTER_SUPERNODE,
> > +
> > +  /* Special values used for hash_map:  */
> > +  PK_EMPTY,
> > +  PK_DELETED,
> > +
> > +  NUM_POINT_KINDS
> > +};
> Isn't this the cause of the hash_map stuff we're discussing with
> Martin?  (PK_EMPTY is a non-zero value)?

Well spotted - the one in sm_state_map in program-state.cc was the one
that I ran into, as it broke a selftest (turning it into an infinite
loop).  But I guess these could be reordered to put PK_EMPTY at the
top, if we're going to require or specialize for that.
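
A standalone sketch of why the ordering matters, under the assumption that the table marks slots as empty by zero-filling its storage (the enumerator prefixes and count_empty_after_zero_fill are made up for this example; this is not hash_map's code):

```cpp
#include <cassert>
#include <cstring>

// Ordering from the patch: the "empty" marker is not zero.
enum point_kind_orig
{
  ORIG_PK_ORIGIN,            // == 0
  ORIG_PK_BEFORE_SUPERNODE,
  ORIG_PK_BEFORE_STMT,
  ORIG_PK_AFTER_SUPERNODE,
  ORIG_PK_EMPTY,             // == 4
  ORIG_PK_DELETED
};

// Possible reordering: the "empty" marker comes first, so it is zero.
enum point_kind_reordered
{
  NEW_PK_EMPTY,              // == 0
  NEW_PK_DELETED,
  NEW_PK_ORIGIN,
  NEW_PK_BEFORE_SUPERNODE,
  NEW_PK_BEFORE_STMT,
  NEW_PK_AFTER_SUPERNODE
};

// Count how many slots look empty after the storage has been zero-filled,
// which is how a table that assumes "all-zero bits means empty" would
// initialize itself.
template <typename Kind>
int
count_empty_after_zero_fill (Kind empty_marker)
{
  const int num_slots = 8;
  Kind slots[num_slots];
  std::memset (slots, 0, sizeof (slots));

  int num_empty = 0;
  for (int i = 0; i < num_slots; i++)
    if (slots[i] == empty_marker)
      num_empty++;
  return num_empty;
}
```

With the original ordering every zero-filled slot reads as PK_ORIGIN rather than as empty, so a probe loop waiting for an empty slot never finds one.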

> Regardless, I don't see anything there to object to.  We have to nail
> down the hash_map issues though.

Indeed


Dave

Re: [PATCH 41/49] analyzer: new files: program-point.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch introduces function_point and program_point, classes
> for tracking locations within the program (the latter adding
> a call_string for tracking interprocedural location).
> 
> gcc/ChangeLog:
>   * analyzer/program-point.cc: New file.
>   * analyzer/program-point.h: New file.
> ---
> 
> 
> diff --git a/gcc/analyzer/program-point.h b/gcc/analyzer/program-
> point.h
> new file mode 100644
> index 000..ad7b9cd
> --- /dev/null
> +++ b/gcc/analyzer/program-point.h
> @@ -0,0 +1,316 @@
> +/* Classes for representing locations within the program.
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   Contributed by David Malcolm .
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_ANALYZER_PROGRAM_POINT_H
> +#define GCC_ANALYZER_PROGRAM_POINT_H
> +
> +#include "analyzer/call-string.h"
> +#include "analyzer/supergraph.h"
> +
> +class exploded_graph;
> +
> +/* An enum for distinguishing the various kinds of
> program_point.  */
> +
> +enum point_kind {
> +  /* A "fake" node which has edges to all entrypoints.  */
> +  PK_ORIGIN,
> +
> +  PK_BEFORE_SUPERNODE,
> +  PK_BEFORE_STMT,
> +  PK_AFTER_SUPERNODE,
> +
> +  /* Special values used for hash_map:  */
> +  PK_EMPTY,
> +  PK_DELETED,
> +
> +  NUM_POINT_KINDS
> +};
Isn't this the cause of the hash_map stuff we're discussing with
Martin?  (PK_EMPTY is a non-zero value)?

Regardless, I don't see anything there to object to.  We have to nail
down the hash_map issues though.

jeff

Re: [PATCH] include size and offset in -Wstringop-overflow

2019-12-11 Thread Martin Sebor


On 12/5/19 4:37 PM, Martin Sebor wrote:

If yes, since I don't think I have the time for it for GCC 10
I need to decide whether to drop just this improvement or all
of the buffer overflow checks that depend on it.

Let me know which you prefer.

As I mentioned in my previous message, I think we've got two potential
paths.  We could just drop the problem hunk and adjust the tests with
xfails, but I'm not sure how badly that compromises what you're trying
to do.  We could also leave the hunk in with a fixme or somesuch.


It would compromise the buffer overflow detection in cases where
the length of a string computed by the pass is then used in the same
function to access another string.  There are more of these now as
the strlen pass has been getting better and better at tracking string
lengths.

Rather than using fixed size arrays, code that deals with strings
of unlimited lengths has to allocate space for them dynamically.
Until now, the strlen pass hasn't tried to detect or prevent
overflow into those (and transformed writes to them in ways that
prevented it from being detected later by _FORTIFY_SOURCE).  But
the follow-on enhancement to this patch changes that, and so this
limitation becomes more of an impediment.

Maybe I would ask the question differently.  In real world scenarios how
often does this happen?  How about in tests you're adding?  Is there
value in the patch without this hunk?


To reiterate what we discussed privately: the patches in this
series (the dynamic buffer overflow detection) have considerable
value even without the hunk.  The detection will work correctly
in most cases.  It will fail and cause false negatives in
the specific cases I mentioned upthread, namely:

where the out-of-bounds store is to a character array whose size
or offset depend on the length of a string previously computed
by the strlen pass.  The class of these cases is large but, due
to the limitation (it has to be the same SSA_NAME that's used
in both the allocation context and the destination offset),
the subset where it is effective is relatively small.
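
As a rough illustration of the pattern (a made-up example, not one of the tests in the patch), the function below sizes an allocation from a strlen result and then stores through the same length; dup_string is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Copy S into freshly allocated storage.  The allocation size is derived
// from the same strlen result that bounds the copy, which is the kind of
// relationship the discussion is about.
char *
dup_string (const char *s)
{
  assert (s);
  std::size_t len = std::strlen (s);

  // Correct: LEN characters plus the terminating NUL.
  // Writing "malloc (len)" here instead would make the memcpy below store
  // one byte past the end of the buffer.
  char *p = static_cast<char *> (std::malloc (len + 1));
  if (!p)
    return nullptr;

  std::memcpy (p, s, len + 1);
  return p;
}
```

Diagnosing the off-by-one variant requires relating the allocation size to the store size through the one value that both are computed from.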



Essentially I'm trying to make a determination if the patch and its
follow-ups have value without this hack.



But I don't understand why you think using the RHS of an SSA_NAME
assignment is a problem when it's an INTEGER_CST.  Is that unsafe
for some reason, or likely to fail somehow?

Because it's just working around a failing in our optimizers and a
fairly trivial one at that.  I'd rather fix the real problem than
just paper over it.


The robust solution is to implement the full constant propagation
in (or for) the strlen pass.  Hopefully, the on-demand VRP will
give us that in GCC 11.  If not, I will look into implementing
it myself.

Since the workaround stands in the way of getting the patch
approved I have removed it.  Attached is a revised patch, rebased
on the top of trunk and retested on x86_64-linux.


I have committed this in r279248 with Jeff's off-list approval.

Martin

Re: [PATCH 28/49] analyzer: new files: analyzer.{cc|h}

2019-12-11 Thread David Malcolm

On Sat, 2019-12-07 at 08:01 -0700, Jeff Law wrote:
> On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
[...]
> > diff --git a/gcc/analyzer/analyzer.cc b/gcc/analyzer/analyzer.cc
> > new file mode 100644
> > index 000..399925c
> > --- /dev/null
> > +++ b/gcc/analyzer/analyzer.cc
[...]
> > +
> > +/* Helper function for checkers.  Is the CALL to the given
> > function
> > name?  */
> > +
> > +bool
> > +is_named_call_p (const gcall *call, const char *funcname)
> > +{
> > +  gcc_assert (funcname);
> > +
> > +  tree fndecl = gimple_call_fndecl (call);
> > +  if (!fndecl)
> > +    return false;
> > +
> > +  return 0 == strcmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)),
> > funcname);
> > +}
> > +
> > +/* Helper function for checkers.  Is the CALL to the given
> > function
> > name,
> > +   and with the given number of arguments?  */
> > +
> > +bool
> > +is_named_call_p (const gcall *call, const char *funcname,
> > +                 unsigned int num_args)
> > +{
> > +  gcc_assert (funcname);
> > +
> > +  if (!is_named_call_p (call, funcname))
> > +    return false;
> > +
> > +  if (gimple_call_num_args (call) != num_args)
> > +    return false;
> > +
> > +  return true;
> These seem generic enough to live elsewhere.

There's a potential can of worms here: these functions are used by the
checkers to detect fndecls by name, so that checkers can associate
behavior with them.  Examples:

* sm-malloc.c recognizes "malloc" and "free" as being special

* sm-signal.c knows that fprintf is async-signal-unsafe (but currently
doesn't know about any other fns; it's a proof-of-concept)

* sm-file.c currently doesn't know anything about "__fsetlocking",
which is one of the reasons the analyzer currently doesn't detect the
leak in intl/localealias.c reported as PR 47170, as it assumes that the
FILE * might have been closed and stops tracking state for it.

etc.

So we're going to have a lot more "recognize this function by name"
just to finish the existing proof-of-concept checkers.

In a perfect world, perhaps we'd have attributes for all of this, and
the user's code and their system headers would have dutifully marked
the pertinent decls with attributes, and the use of is_named_call_p
could be thought of as a bug (or wart)...  but we don't live in that
perfect world, and a good static analyzer shouldn't need to rely on the
user marking their code.

There are also integration and chicken-and-egg issues with attributes
where if we rely e.g. on the user's libc headers having attributes for
the checker, then we need to coordinate with e.g. glibc to add the
attributes, and implement them, and then the checker doesn't work if
someone is using a different libc, etc, etc.

So I think we want a concept of "known functions" in the analyzer,
where the analyzer can have its own "on the side" model of APIs -
perhaps a mixture of baked-in (e.g. for malloc/free), perhaps from an
overridable config file, but I'm not quite sure what form it should
take.  Maybe even a pragma that lets us tag named functions with
attributes, or somesuch.
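
A minimal sketch of what a baked-in "known functions" table might look like. The flag names, the known_fn struct and lookup_known_fn are all hypothetical, and the table holds only the examples already named in this mail:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical properties an analyzer might associate with a function name.
enum known_fn_flags
{
  KF_NONE = 0,
  KF_ALLOCATES = 1 << 0,           // returns newly allocated memory
  KF_DEALLOCATES = 1 << 1,         // frees its argument
  KF_ASYNC_SIGNAL_UNSAFE = 1 << 2  // must not be called from a signal handler
};

struct known_fn
{
  const char *name;
  int flags;
};

// Baked-in table; only covers the examples named in this thread.
static const known_fn known_fns[] = {
  { "malloc", KF_ALLOCATES },
  { "free", KF_DEALLOCATES },
  { "fprintf", KF_ASYNC_SIGNAL_UNSAFE }
};

// Return the flags for FUNCNAME, or KF_NONE if nothing is known about it.
int
lookup_known_fn (const char *funcname)
{
  assert (funcname);
  for (const known_fn &fn : known_fns)
    if (0 == std::strcmp (fn.name, funcname))
      return fn.flags;
  return KF_NONE;
}
```

An overridable config file could then be modeled as extra entries consulted before the baked-in ones.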

For now, however, I want to take the pragmatic approach, and use these
functions as needed (and to review this as we gain experience with the
analyzer).

So I think I prefer to keep them in the analyzer subdir (but I'm happy
to move them if you'd prefer that)

Does the above sound sane?

> > +}
> > +
> > +/* Return true if stmt is a setjmp call.  */
> > +
> > +bool
> > +is_setjmp_call_p (const gimple *stmt)
> > +{
> > +  /* TODO: is there a less hacky way to check for "setjmp"?  */
> > +  if (const gcall *call = dyn_cast <const gcall *> (stmt))
> > +    if (is_named_call_p (call, "_setjmp", 1))
> > +      return true;
> > +
> > +  return false;
> > +}
> > +
> > +/* Return true if stmt is a longjmp call.  */
> > +
> > +bool
> > +is_longjmp_call_p (const gcall *call)
> > +{
> > +  /* TODO: is there a less hacky way to check for "longjmp"?  */
> > +  if (is_named_call_p (call, "longjmp", 2))
> > +    return true;
> > +
> > +  return false;
> > +}
> I thought we already had routines to detect these.  We certainly have
> *code* to detect them.   If it's the former we really just want one
> implementation (that handles the various permutations we've seen
> through
> the years).  If it's the latter, then we probably want to use these
> routines to simplify those implementations.

We have several:

* calls.c has:
  * setjmp_call_p (const_tree fndecl)
  * special_function_p which does name matching on "setjmp",
"sigsetjmp", "savectx", "vfork", "getcontext" and sets
ECF_RETURNS_TWICE (dropping leading single and double underscores)
* except.c has "tree setjmp_fn;" used with #ifdef
DONT_USE_BUILTIN_SETJMP
* omp-low.c has: setjmp_or_longjmp_p (const_tree fndecl)

(and maybe more).
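
For reference, the kind of underscore-insensitive name matching described for special_function_p can be sketched standalone as below. This is an approximation written for this mail rather than the code from calls.c, and both helper names are made up:

```cpp
#include <cassert>
#include <cstring>

// Skip at most two leading underscores of NAME.
static const char *
skip_leading_underscores (const char *name)
{
  if (name[0] == '_')
    {
      name++;
      if (name[0] == '_')
        name++;
    }
  return name;
}

// Return true if NAME, ignoring a leading "_" or "__", is one of the
// names listed above as being treated as returning twice.
bool
returns_twice_name_p (const char *name)
{
  assert (name);
  static const char *const names[] = {
    "setjmp", "sigsetjmp", "savectx", "vfork", "getcontext"
  };
  const char *stripped = skip_leading_underscores (name);
  for (const char *candidate : names)
    if (0 == std::strcmp (stripped, candidate))
      return true;
  return false;
}
```

Matching this way covers "setjmp", "_setjmp" and "__setjmp" in one place, whereas is_setjmp_call_p in the patch only looks for "_setjmp".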

I reviewed where I'm using the two functions that the patch proposes
adding above:

* I use is_setjmp_call_p in diagnostic_manager::add_events_for_eedge
for PK_BEFORE_STMT on the gcall to capture recording the setjmp buf (so
that the event can be cross-referenced at the

Re: [PATCH 0/2] Make C front end share the C++ tree representation of loops and switches

2019-12-11 Thread Sandra Loosemore


On 12/11/19 11:27 AM, Jason Merrill wrote:

On 12/11/19 2:03 AM, Sandra Loosemore wrote:


[snip]

Anyway, I'm no longer expecting to get this front end patch into GCC 
10, but it would be helpful to get some guidance on how to proceed for 
resubmission when stage 1 re-opens.  E.g. from where I'm sitting now, 
fixing the C++ constexpr evaluator to handle gotos (if it doesn't 
already?) and reverting to the C way of lowering loops seems like a 
much more bounded problem than fixing optimizers and trying to 
benchmark their effectiveness.  :-S  OTOH, somebody more familiar with 
these optimizations might see easy fixes.  Or I could re-jigger my 
patches to continue to use different loop lowering strategies for C 
and C++ so it at least wouldn't make things any worse for either 
language than it already is.


If this is an important optimization, it would certainly be good to make 
it work for C++.  And now that constexpr works on the pre-generic form 
of the function, it doesn't care how we lower loops.  So C++ could 
revert to the C way without much trouble.  


Thanks for confirming my suspicion that it ought to "just work" to do 
that.  Unless we're concerned about papering over optimizer bugs, this 
would be an easy fix.


I just find it weird that 
apparently the middle end still can't optimize LOOP_EXPR properly.


Yeah, this seems weird to me too; you'd think it would all be the same 
once it gets converted to a control flow graph, and it would not matter 
whether the end test is located at the top or bottom of the loop, etc.
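
To make the comparison concrete, here are two hand-written goto lowerings of the same counted loop, one testing at the top and one at the bottom. This is only a sketch of the two shapes; it does not claim to reproduce what either front end actually emits, and the function names are made up.

```cpp
#include <cassert>

// Lowering with the exit test at the top of the loop.
int
sum_below_top_test (int n)
{
  int i = 0;
  int sum = 0;
top:
  if (!(i < n))
    goto done;
  sum += i;
  i++;
  goto top;
done:
  return sum;
}

// Lowering with the exit test at the bottom, entered via a jump to the test.
int
sum_below_bottom_test (int n)
{
  int i = 0;
  int sum = 0;
  goto test;
body:
  sum += i;
  i++;
test:
  if (i < n)
    goto body;
  return sum;
}
```

Both functions compute the same result for every n; they differ only in where the conditional branch sits relative to the body.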


-Sandra

[Committed] PR fortran/92897 -- remove invalid assert()

2019-12-11 Thread Steve Kargl

Committed as obvious.

2019-12-11 Steven G. Kargl  

PR fortran/92897
* array.c (gfc_set_array_spec):  Remove invalid assert() triggered
by invalid Fortran code.
 
2019-12-11 Steven G. Kargl  

PR fortran/92897
* gfortran.dg/pr92897.f90: New test.

-- 
Steve
Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c	(revision 279239)
+++ gcc/fortran/array.c	(working copy)
@@ -859,10 +859,6 @@ gfc_set_array_spec (gfc_symbol *sym, gfc_array_spec *a
 
   if (as->corank)
 {
-  /* The "sym" has no corank (checked via gfc_add_codimension). Thus
-	 the codimension is simply added.  */
-  gcc_assert (as->rank == 0 && sym->as->corank == 0);
-
   sym->as->cotype = as->cotype;
   sym->as->corank = as->corank;
   /* Check F2018:C822.  */
Index: gcc/testsuite/gfortran.dg/pr92897.f90
===
--- gcc/testsuite/gfortran.dg/pr92897.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr92897.f90	(working copy)
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single" }
+! Test contributed by Gerhard Steinmetz
+type(t) function f()! { dg-error "has not been declared" }
+   dimension :: t(1,2,1,2,1,2,1,2)
+   codimension :: t[1,2,1,2,1,2,1,*]! { dg-error "rank \\+ corank of" }
+end
+! { dg-prune-output "which has not been defined" }

Re: [PATCH 40/49] analyzer: new files: call-string.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds call_string, a class for representing the
> call stacks at a program_point, so that we can ensure that
> paths through the code are interprocedurally valid.
> 
> gcc/ChangeLog:
>   * analyzer/call-string.cc: New file.
>   * analyzer/call-string.h: New file.
Looks reasonable.  There may be some scalability issues in here, but I
guess with all the work you've been doing any particularly bad ones for
the examples you're working with have been fixed.

jeff
>

Re: [PATCH 33/49] analyzer: new files: sm.{cc|h}

2019-12-11 Thread Jeff Law

On Fri, 2019-11-15 at 20:23 -0500, David Malcolm wrote:
> This patch adds a "state_machine" base class for describing
> API checkers in terms of state machine transitions.  Followup
> patches use this to add specific API checkers.
> 
> gcc/ChangeLog:
>   * analyzer/sm.cc: New file.
>   * analyzer/sm.h: New file.
> ---
>  gcc/analyzer/sm.cc | 135
> 
>  gcc/analyzer/sm.h  | 160
> +
>  2 files changed, 295 insertions(+)
>  create mode 100644 gcc/analyzer/sm.cc
>  create mode 100644 gcc/analyzer/sm.h
> 
> diff --git a/gcc/analyzer/sm.cc b/gcc/analyzer/sm.cc
> new file mode 100644
> index 000..eda9350
> --- /dev/null
> +++ b/gcc/analyzer/sm.cc
> @@ -0,0 +1,135 @@
> +/* Modeling API uses and misuses via state machines.
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   Contributed by David Malcolm .
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "gcc-plugin.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "gimple.h"
> +#include "analyzer/analyzer.h"
> +#include "analyzer/sm.h"
> +
> +
> 
> +
> +/* If STMT is an assignment to zero, return the LHS.  */
> +
> +tree
> +is_zero_assignment (const gimple *stmt)
> +{
> +  const gassign *assign_stmt = dyn_cast <const gassign *> (stmt);
> +  if (!assign_stmt)
> +    return NULL_TREE;
> +
> +  enum tree_code op = gimple_assign_rhs_code (assign_stmt);
> +  if (op != INTEGER_CST)
> +    return NULL_TREE;
> +
> +  if (!zerop (gimple_assign_rhs1 (assign_stmt)))
> +    return NULL_TREE;
> +
> +  return gimple_assign_lhs (assign_stmt);
> +}
"assignment from zero" rather than "assignment to zero" I think.

I think you'd want to use "integer_zerop" rather than an open-coded
check.  That'll also allow you to pick up other forms such as
COMPLEX_CST and VECTOR_CST.




> +
> +/* If COND_STMT is a comparison against zero of the form (LHS OP 0),
> +   return true and write what's being compared to *OUT_LHS and the
> kind of
> +   the comparison to *OUT_OP.  */
> +
> +bool
> +is_comparison_against_zero (const gcond *cond_stmt,
> +                            tree *out_lhs, enum tree_code *out_op)
> +{
> +  enum tree_code op = gimple_cond_code (cond_stmt);
> +  tree lhs = gimple_cond_lhs (cond_stmt);
> +  tree rhs = gimple_cond_rhs (cond_stmt);
> +  if (!zerop (rhs))
> +    return false;
> +  // TODO: make it symmetric?
> +
> +  switch (op)
> +    {
> +    case NE_EXPR:
> +    case EQ_EXPR:
> +      *out_lhs = lhs;
> +      *out_op = op;
> +      return true;
> +
> +    default:
> +      return false;
> +    }
> +}
Seems like this might be useful to make generically available.
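
On the "make it symmetric?" TODO, a symmetric variant could look roughly like the toy model below. The toy_* types are stand-ins invented for this sketch, not GCC's tree or gcond, so this only illustrates the control flow:

```cpp
#include <cassert>

// Toy stand-ins for GCC's tree codes and operands; not the real types.
enum toy_code { TOY_EQ_EXPR, TOY_NE_EXPR, TOY_LT_EXPR };

struct toy_operand
{
  bool is_constant;
  long value;   // only meaningful if is_constant
};

static bool
toy_zerop (const toy_operand &op)
{
  return op.is_constant && op.value == 0;
}

// If the condition is (X OP 0) or (0 OP X) with OP being == or !=,
// return true and write X to *OUT_OTHER and OP to *OUT_OP.
// Swapping operands is only safe because == and != are symmetric.
bool
toy_is_comparison_against_zero (toy_code op,
                                const toy_operand &lhs,
                                const toy_operand &rhs,
                                toy_operand *out_other,
                                toy_code *out_op)
{
  assert (out_other && out_op);
  if (op != TOY_EQ_EXPR && op != TOY_NE_EXPR)
    return false;

  if (toy_zerop (rhs))
    *out_other = lhs;
  else if (toy_zerop (lhs))
    *out_other = rhs;
  else
    return false;

  *out_op = op;
  return true;
}
```

If ordered comparisons such as < were ever accepted, the operator would have to be swapped along with the operands.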




> +
> +bool
> +any_pointer_p (tree var)
> +{
> +  if (TREE_CODE (TREE_TYPE (var)) != POINTER_TYPE)
> +    return false;
> +
> +  return true;
> +}
I believe what you really want is POINTER_TYPE_P which will also happen
to help C++ since it'll include things that are REFERENCE_TYPEs.
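
The difference between the two checks, in a toy model (the toy_* names are invented for this sketch; in GCC itself the predicate is the POINTER_TYPE_P macro applied to a type node):

```cpp
#include <cassert>

// Toy stand-in for the relevant tree codes; not GCC's real enum.
enum toy_type_code
{
  TOY_INTEGER_TYPE,
  TOY_POINTER_TYPE,
  TOY_REFERENCE_TYPE
};

// Mirrors the open-coded check in the patch: pointers only.
bool
toy_any_pointer_open_coded_p (toy_type_code code)
{
  return code == TOY_POINTER_TYPE;
}

// Mirrors the suggested predicate: pointers and references.
bool
toy_pointer_type_p (toy_type_code code)
{
  return code == TOY_POINTER_TYPE || code == TOY_REFERENCE_TYPE;
}
```

The two only disagree on reference types, which is the C++ case mentioned above.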


It looks like some of the methods don't have comments on their
definitions.  Please check those and add them as necessary.

Otherwise it looks reasonable.

jeff

Re: [PATCH] [ARC] Use hardware support for double-precision compare instructions.

2019-12-11 Thread Jeff Law

On Mon, 2019-12-09 at 11:52 +0200, Claudiu Zissulescu wrote:
> Although the FDCMP (the double precision floating point compare
> instruction) is added to the compiler, it is not properly used via
> cstoredi pattern. Fix it.
> 
> OK to apply?
> Claudiu
> 
> -xx-xx  Claudiu Zissulescu  
> 
>   * config/arc/arc.md (iterator SDF): Check TARGET_FP_DP_BASE.
>   (cstoredi4): Use TARGET_HARD_FLOAT.
OK
jeff

Re: [patch] factor out common files for compare_exclusions

2019-12-11 Thread Jeff Law

On Wed, 2019-12-11 at 01:35 +0100, Matthias Klose wrote:
> the toplevel configure.ac repeats common exclusion files for specific
> targets.
> Just factor those out.  Maybe not required, but gm2 is adding more
> files to be
> ignored on every target, so make it easy to only have these files
> mentioned in
> one place. Ok for the trunk?

> 2019-12-11  Matthias Klose  
> 
>   * configure.ac: Factor out common cases for compare_exclusions.
>   * configure: Regenerate.
OK.
jeff

Re: [PATCH v2][MSP430] -Add fno-exceptions multilib

2019-12-11 Thread Jeff Law

On Wed, 2019-11-27 at 19:53 +, Jozef Lawrynowicz wrote:
> From b74f34e5ae7f649296f7f6bcce35b75c34a2b0fd Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 25 Nov 2019 12:07:24 +
> Subject: [PATCH] MSP430: Add fno-exceptions multilib
> 
> ChangeLog:
> 
> 2019-11-27  Jozef Lawrynowicz  
> 
>   * config-ml.in: Support --disable-no-exceptions configure flag.
> 
> gcc/ChangeLog:
> 
> 2019-11-27  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.h (STARTFILE_SPEC) [!fexceptions]: Use
>   crtbegin_no_eh.o if building for the C language.
>   [fno-exceptions]: Use crtbegin_no_eh.o if building for any language
>   except C.
>   (ENDFILE_SPEC) [!fexceptions]: Use crtend_no_eh.o if building for 
>   the C language.
>   [fno-exceptions]: Use crtend_no_eh.o if building for any language
>   except C.
>   * config/msp430/t-msp430: Add -fno-exceptions multilib.
>   * doc/install.texi: Document --disable-no-exceptions multilib configure
>   option.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-11-27  Jozef Lawrynowicz  
> 
>   * lib/gcc-dg.exp: Add dg-prune messages for when exception handling is
>   disabled.
>   * lib/target-supports.exp (check_effective_target_exceptions_enabled):
>   New.
> 
> libgcc/ChangeLog:
> 
> 2019-11-27  Jozef Lawrynowicz  
> 
>   * config.host: Add crt{begin,end}_no_eh.o to "extra_parts".
>   * config/msp430/t-msp430: Add rules to build crt{begin,end}_no_eh.o.
OK
jeff

Re: [PATCH v2][MSP430] Add msp430-elfbare target

2019-12-11 Thread Jeff Law

On Wed, 2019-12-11 at 12:21 +, Jozef Lawrynowicz wrote:
> On Wed, 11 Dec 2019 12:19:41 +
> Jozef Lawrynowicz  wrote:
> 
> > On Mon, 9 Dec 2019 15:28:25 +
> > Jozef Lawrynowicz  wrote:
> > 
> > > On Sat, 07 Dec 2019 11:40:33 -0700
> > > Jeff Law  wrote:
> > >   
> > > > On Fri, 2019-11-29 at 21:00 +, Jozef Lawrynowicz wrote:
> > > > > The attached patch consolidates some configuration tweaks I
> > > > > previously submitted
> > > > > as modifications to the msp430-elf target into a new target
> > > > > called
> > > > > "msp430-elfbare" i.e. "bare-metal".
> > > > > 
> > > > > MSP430: Disable TM clone registry by default
> > > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00550.html
> > > > > MSP430: Disable __cxa_atexit
> > > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00552.html
> > > > > 
> > > > > The patches tweak the CRT code to achieve the smallest
> > > > > possible code
> > > > > size, 
> > > > > and rely on some additional generic tweaks to crtstuff.c.
> > > > > 
> > > > > I did submit these tweaks a while ago, but I didn't get any
> > > > > feedback,
> > > > > however even if they are acceptable I suspect it is too late
> > > > > for GCC-
> > > > > 10 anyway:
> > > > > libgcc: Dont define __do_global_dtors_aux if it will be empty
> > > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00417.html
> > > > > libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE
> > > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00418.html
> > > > > 
> > > > > (The second one is a bit hacky, but without some way of
> > > > > removing the
> > > > > __dso_handle declaration, we end up with 150 bytes of
> > > > > unnecessary
> > > > > code in some
> > > > > programs.)
> > > > > 
> > > > > So for this patch crtstuff.c was copied to the msp430
> > > > > subdirectory
> > > > > and the
> > > > > changes were made to that target specific version.
> > > > > 
> > > > > Tiny program size can now be achieved by configuring gcc for
> > > > > msp430-
> > > > > elfbare.
> > > > > 
> > > > > For example in an "empty main" program which loops forever:
> > > > >   msp430-elfbare @ -Os:
> > > > > >    text    data     bss     dec     hex filename
> > > > > >      14       0       0      14       e a.out
> > > > > >   msp430-elf @ -Os:
> > > > > >    text    data     bss     dec     hex filename
> > > > > >     270       6       2     278     116 a.out
> > > > > 
> > > > > Successfully regtested msp430-elfbare vs msp430-elf.
> > > > > 
> > > > > Ok to apply?
> > > > > 
> > > > > P.S. This patch relies on the -fno-exceptions multilib patch
> > > > > submitted here:
> > > > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02523.html
> > > > > 
> > > > > P.P.S. This requires some minor configury tweaks to Newlib
> > > > > and GDB of
> > > > > the form:
> > > > > -  msp430*-*-elf)
> > > > > +  msp430-*-elf*)  
> > > > 
> > > > > I'll apply these changes if the patch is accepted.
> > > > > From cff4611855d838315e793d45256de5fc8eeefafe Mon Sep 17
> > > > > 00:00:00
> > > > > 2001
> > > > > From: Jozef Lawrynowicz 
> > > > > Date: Mon, 25 Nov 2019 19:41:05 +
> > > > > Subject: [PATCH] MSP430: Add new msp430-elfbare target
> > > > > 
> > > > > contrib/ChangeLog:
> > > > > 
> > > > > 2019-11-29  Jozef Lawrynowicz  
> > > > > 
> > > > >   * config-list.mk: Add msp430-elfbare.
> > > > > 
> > > > > gcc/ChangeLog:
> > > > > 
> > > > > 2019-11-29  Jozef Lawrynowicz  
> > > > > 
> > > > >   * config.gcc: s/msp430*-*-*/msp430-*-*.
> > > > >   Handle msp430-*-elfbare.
> > > > >   * config/msp430/msp430-devices.c (TARGET_SUBDIR):
> > > > > Define.
> > > > >   (_MSPMKSTR): Define.
> > > > >   (__MSPMKSTR): Define.
> > > > >   (rest_of_devices_path): Use TARGET_SUBDIR value in
> > > > > string.
> > > > >   * config/msp430/msp430.c (msp430_option_override):
> > > > > Error if
> > > > >   -fuse-cxa-atexit is used when it has been disabled at
> > > > > configure
> > > > > time.
> > > > >   * config/msp430/t-msp430: Define TARGET_SUBDIR when
> > > > > building
> > > > >   msp430-devices.o.
> > > > >   * doc/install.texi: Document msp430-*-elf and msp430-*-
> > > > > elfbare.
> > > > >   * doc/invoke.texi: Update documentation about which
> > > > > path
> > > > > devices.csv is
> > > > >   searched for.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > > 2019-11-29  Jozef Lawrynowicz  
> > > > > 
> > > > >   * g++.dg/init/dso_handle1.C: Require cxa_atexit support.
> > > > >   * g++.dg/init/dso_handle2.C: Likewise.
> > > > >   * g++.dg/other/cxa-atexit1.C: Likewise.
> > > > >   * gcc.target/msp430/msp430.exp: Update csv-using-installed.c test to
> > > > >   handle msp430-elfbare configuration.
> > > > > 
> > > > > libgcc/ChangeLog:
> > > > > 
> > > > > 2019-11-29  Jozef Lawrynowicz  
> > > > > 
> > > > >   * config.host: Use t-msp430-elfbare-crtstuff Makefile fragment when

Re: [PATCH 3/3] libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE

2019-12-11 Thread Jeff Law

On Wed, 2019-12-11 at 11:49 +, Jozef Lawrynowicz wrote:
> On Mon, 9 Dec 2019 13:05:22 +
> Jozef Lawrynowicz  wrote:
> 
> > On Sat, 07 Dec 2019 11:27:54 -0700
> > Jeff Law  wrote:
> > 
> > > On Wed, 2019-11-06 at 16:19 +, Jozef Lawrynowicz wrote:  
> > > > From 7bc0971d2936ebe71e7b7d3d805cf1bbf9f9f5af Mon Sep 17 00:00:00 2001
> > > > From: Jozef Lawrynowicz 
> > > > Date: Mon, 4 Nov 2019 17:38:13 +
> > > > Subject: [PATCH 3/3] libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > 2019-11-06  Jozef Lawrynowicz  
> > > > 
> > > > * doc/tm.texi: Regenerate.
> > > > * doc/tm.texi.in: Define TARGET_LIBGCC_REMOVE_DSO_HANDLE.
> > > > 
> > > > libgcc/ChangeLog:
> > > > 
> > > > 2019-11-06  Jozef Lawrynowicz  
> > > > 
> > > > * crtstuff.c: Don't declare __dso_handle if
> > > > TARGET_LIBGCC_REMOVE_DSO_HANDLE is defined.
> > > Presumably you'll switch this on for your bare elf target
> > > configuration?
> > 
> > Yep that's the plan. I originally didn't include the target changes in
> > this patch since other target changes (disabling __cxa_atexit) were
> > required for the removal of __dso_handle to be OK.
> > 
> > > Are there other things, particularly related to shared library support,
> > > that we wouldn't need to use as well?  The reason I ask is I'm trying
> > > to figure out if REMOVE_DSO_HANDLE is the right name or if we should
> > > generalize it to a name that indicates shared libraries in general
> > > aren't supported on the target.
> > 
> > CRTSTUFFS_O is defined when compiling shared versions of crt{begin,end}
> > and handles an extra case in crtstuff.c where there's some shared library
> > related functionality we don't need on MSP430.
> > 
> > But when CRTSTUFFS_O is undefined __dso_handle is still declared - to 0.
> > The comment gives some additional insight:
> > 
> > /* Declare the __dso_handle variable.  It should have a unique value
> >    in every shared-object; in a main program its value is zero.  The
> >    object should in any case be protected.  This means the instance
> >    in one DSO or the main program is not used in another object.  The
> >    dynamic linker takes care of this.  */
> > 
> > I haven't noticed any further shared library-related bloat coming from
> > libgcc.
> > 
> > I think a better way of solving this problem is just to check
> > DEFAULT_USE_CXA_ATEXIT rather than adding this new macro. If __cxa_atexit
> > is not enabled then as far as I understand __dso_handle serves no purpose.
> > DEFAULT_USE_CXA_ATEXIT is defined at configure time for any targets that
> > want __cxa_atexit support.
> > 
> > A quick bootstrap and test of dg.exp on x86_64-pc-linux-gnu shows no
> > issues with the following:
> > 
> > > diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
> > > index ae6328d317d..349f8191e61 100644
> > > --- a/libgcc/crtstuff.c
> > > +++ b/libgcc/crtstuff.c
> > > @@ -340,8 +340,10 @@ extern void *__dso_handle __attribute__ ((__visibility__ ("hidden")));
> > >  #ifdef CRTSTUFFS_O
> > >  void *__dso_handle = &__dso_handle;
> > >  #else
> > > +#if DEFAULT_USE_CXA_ATEXIT
> > >  void *__dso_handle = 0;
> > >  #endif
> > > +#endif
> > >  
> > >  /* The __cxa_finalize function may not be available so we use only a
> > >     weak declaration.  */
> > 
> > I'll put that patch through some more rigorous testing.
> 
> Successfully bootstrapped and regtested the attached patch for
> x86_64-pc-linux-gnu (where DEFAULT_USE_CXA_ATEXIT is set to 1) and the
> proposed msp430-elfbare target (where DEFAULT_USE_CXA_ATEXIT is set to 0).
> 
> > libgcc/ChangeLog:
> > 
> > 2019-12-11  Jozef Lawrynowicz  
> > 
> > * crtstuff.c: Declare __dso_handle only if DEFAULT_USE_CXA_ATEXIT is
> > true.
OK
jeff

ps.  Sorry about the formatting.  Had to switch MUAs last week and
still haven't gotten all the kinks worked out.
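
The guard approved above can be pictured with a small stand-alone sketch. This is not the libgcc source: `demo_dso_handle` and `demo_has_dso_handle` are hypothetical names used only for illustration, and `DEFAULT_USE_CXA_ATEXIT` is given a fallback value here because no configure step defines it for this snippet.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: in a real GCC build, configure defines this macro.
   The fallback exists only so the sketch compiles on its own.  */
#ifndef DEFAULT_USE_CXA_ATEXIT
#define DEFAULT_USE_CXA_ATEXIT 0
#endif

#if DEFAULT_USE_CXA_ATEXIT
/* Hypothetical stand-in for __dso_handle in a main program,
   where its value is zero.  */
void *demo_dso_handle = 0;
#endif

/* Report whether the handle object was emitted at all.  */
int
demo_has_dso_handle (void)
{
#if DEFAULT_USE_CXA_ATEXIT
  assert (demo_dso_handle == NULL);
  return 1;
#else
  return 0;
#endif
}
```

Building the snippet with `-DDEFAULT_USE_CXA_ATEXIT=1` keeps the object; the default build drops it, which is the size saving the bare-metal target is after.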

Re: [PATCH 0/2] Make C front end share the C++ tree representation of loops and switches

2019-12-11 Thread Jeff Law

On Wed, 2019-12-11 at 00:03 -0700, Sandra Loosemore wrote:
> On 12/6/19 3:41 PM, Jeff Law wrote:
> > On Wed, 2019-11-13 at 09:27 -0700, Sandra Loosemore wrote:
> > > I bootstrapped and regression-tested this on x86_64-linux-gnu.  There
> > > are a few regressions involving these tests:
> > > 
> > > gcc.dg/tree-ssa/pr77445-2.c
> > I believe this is another instance of the FSA optimization.  I'd need
> > to see the before/after dumps to know if it's regressed.  The main
> > purpose of the test was to verify that we didn't muck up the profile
> > estimation after the FSA optimization.
> > 
> > 
> > > gcc.dg/tree-ssa/ssa-dce-3.c
> > So I think this one is supposed to collapse into a trivial infinite
> > loop.  Anything else would be a regression.
> > 
> > 
> > > gcc.dg/tree-ssa/ssa-dom-thread-7.c
> > This is the FSA optimization.  Unfortunately checking for it being done
> > right is exceedingly painful.  If you pass along the before/after dumps
> > I can probably help determine whether or not we just need an update to
> > the scanned bits.
> > 
> > Given how much pressure there was to handle the FSA optimization, I'd
> > prefer to make sure we're still doing it before moving forward.
> 
> I poked at these 3 test cases some more.  FWIW, it appears that if you
> use an unmodified build to compile them as C++ instead of C, the same
> problems appear.  So I guess it is an existing bug in
> something-or-another that we are not presently optimizing code output by
> the C++ front end as well as that from the C front end.  :-S  (At least,
> the ssa-dce-3.c optimization failure is a bug; the other two might be
> fragile test cases.)
> 
> The C++ front end used to lower loops in exactly the same way as the C
> front end.  This is the patch that changed it:
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01994.html
> 
> There wasn't much discussion about how this affected optimization beyond
> noting a slight decrease in code size for a single benchmark, and no
> consideration at all of whether it wouldn't be better to have the C and
> C++ front ends use the same lowering strategy for loops, whichever way
> produced better code, so that the optimizers can better recognize the
> common patterns from both languages.
> 
> Anyway, I'm no longer expecting to get this front end patch into GCC 10,
> but it would be helpful to get some guidance on how to proceed for
> resubmission when stage 1 re-opens.  E.g. from where I'm sitting now,
> fixing the C++ constexpr evaluator to handle gotos (if it doesn't
> already?) and reverting to the C way of lowering loops seems like a much
> more bounded problem than fixing optimizers and trying to benchmark
> their effectiveness.  :-S  OTOH, somebody more familiar with these
> optimizations might see easy fixes.  Or I could re-jigger my patches to
> continue to use different loop lowering strategies for C and C++ so it
> at least wouldn't make things any worse for either language than it
> already is.
Well, I'm happy to look at the before/after dumps if you pass them
along.   I'd certainly like to see the two front-ends sharing these
bits.

Re: [C++ PATCH] Fix -std=c++17 and earlier handling of CLASSTYPE_NON_AGGREGATE (PR c++/92869)

2019-12-11 Thread Jason Merrill


On 12/10/19 3:47 PM, Jakub Jelinek wrote:

Hi!

In C++20, types with user-declared constructors are not aggregate types,
while in C++17 only types with user-provided or explicit constructors.
In check_bases_and_members we handle it properly:
   CLASSTYPE_NON_AGGREGATE (t)
 |= ((cxx_dialect < cxx2a
  ? type_has_user_provided_or_explicit_constructor (t)
  : TYPE_HAS_USER_CONSTRUCTOR (t))
 || TYPE_POLYMORPHIC_P (t));
but for templates the code right now behaves the C++20 way regardless of the
selected standard.

The following patch tweaks finish_struct to match check_bases_and_members.
I had to add !CLASSTYPE_NON_AGGREGATE check because
type_has_user_provided_or_explicit_constructor -> user_provided_p ICEd
on inherited ctors represented as USING_DECLs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2019-12-10  Jakub Jelinek  

PR c++/92869
* class.c (finish_struct): For C++17 and earlier, check
type_has_user_provided_or_explicit_constructor rather than
TYPE_HAS_USER_CONSTRUCTOR whether to set CLASSTYPE_NON_AGGREGATE.

* g++.dg/cpp0x/aggr3.C: New test.

--- gcc/cp/class.c.jj   2019-12-06 00:40:44.525629037 +0100
+++ gcc/cp/class.c  2019-12-10 14:18:41.171380767 +0100
@@ -7456,7 +7456,13 @@ finish_struct (tree t, tree attributes)
/* Remember current #pragma pack value.  */
TYPE_PRECISION (t) = maximum_field_alignment;
  
-  if (TYPE_HAS_USER_CONSTRUCTOR (t))

+  if (cxx_dialect < cxx2a)
+   {
+ if (!CLASSTYPE_NON_AGGREGATE (t)
+ && type_has_user_provided_or_explicit_constructor (t))
+   CLASSTYPE_NON_AGGREGATE (t) = 1;
+   }
+  else if (TYPE_HAS_USER_CONSTRUCTOR (t))
CLASSTYPE_NON_AGGREGATE (t) = 1;
  
/* Fix up any variants we've already built.  */

--- gcc/testsuite/g++.dg/cpp0x/aggr3.C.jj   2019-12-10 14:25:22.344231923 +0100
+++ gcc/testsuite/g++.dg/cpp0x/aggr3.C  2019-12-10 14:23:31.700927787 +0100
@@ -0,0 +1,20 @@
+// PR c++/92869
+// { dg-do compile { target c++11 } }
+
+struct A {
+  A () = default;
+  A (const A &) = default;
+  A (A &&) = default;
+  int arr[3];
+};
+
+template <typename T, int N>
+struct B {
+  B () = default;
+  B (const B &) = default;
+  B (B &&) = default;
+  T arr[N];
+};
+
+A a = { { 1, 2, 3 } }; // { dg-error "could not convert" "" { target c++2a } }
+B<int, 3> b = { { 1, 2, 3 } };   // { dg-error "could not convert" "" { target c++2a } }

Jakub

Re: [C++ PATCH] c++/92878 - Parenthesized init of aggregates in new-expression.

2019-12-11 Thread Jason Merrill


On 12/10/19 2:13 PM, Marek Polacek wrote:

Ville pointed out that our paren init of aggregates doesn't work for

   auto a = new A(1, 2, 3);

and I think it should:

A new-expression that creates an object of type T initializes that object
as follows:
...
-- Otherwise, the new-initializer is interpreted according to the
initialization rules of [dcl.init] for direct-initialization.

so I think it follows that we should perform dcl.init#17.6.2.2.

This doesn't work with new[]; we have:
   error ("parenthesized initializer in array new");

Bootstrapped/regtested on x86_64-linux, ok for trunk?


OK.


2019-12-10  Marek Polacek  

PR c++/92878 - Parenthesized init of aggregates in new-expression.
* init.c (build_new_1): Handle parenthesized initialization of
aggregates in new-expression.

* g++.dg/cpp2a/paren-init20.C: New test.

diff --git gcc/cp/init.c gcc/cp/init.c
index e40afe27e1a..b0331b8ba53 100644
--- gcc/cp/init.c
+++ gcc/cp/init.c
@@ -3608,10 +3608,22 @@ build_new_1 (vec<tree, va_gc> **placement, tree type, tree nelts,
  tree ie;
  
  	  /* We are processing something like `new int (10)', which

-means allocate an int, and initialize it with 10.  */
+means allocate an int, and initialize it with 10.
  
-	  ie = build_x_compound_expr_from_vec (*init, "new initializer",

-  complain);
+In C++20, also handle `new A(1, 2)'.  */
+ if (cxx_dialect >= cxx2a
+ && AGGREGATE_TYPE_P (type)
+ && (*init)->length () > 1)
+   {
+ ie = build_tree_list_vec (*init);
+ ie = build_constructor_from_list (init_list_type_node, ie);
+ CONSTRUCTOR_IS_DIRECT_INIT (ie) = true;
+ CONSTRUCTOR_IS_PAREN_INIT (ie) = true;
+ ie = digest_init (type, ie, complain);
+   }
+ else
+   ie = build_x_compound_expr_from_vec (*init, "new initializer",
+complain);
  init_expr = cp_build_modify_expr (input_location, init_expr,
INIT_EXPR, ie, complain);
}
diff --git gcc/testsuite/g++.dg/cpp2a/paren-init20.C gcc/testsuite/g++.dg/cpp2a/paren-init20.C
new file mode 100644
index 000..05da7604686
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/paren-init20.C
@@ -0,0 +1,54 @@
+// PR c++/92878 - Parenthesized init of aggregates in new-expression.
+// { dg-do compile { target c++2a } }
+// Test new TYPE(...).
+
+int f ();
+
+struct A
+{
+  int a;
+  int b;
+};
+
+void
+fn_A ()
+{
+  int i = 0;
+  auto y = new A(1, 2);
+  auto x = new A(++i, ++i);
+  auto z = new A(1, { ++i });
+  auto u = new A(1, f());
+}
+
+struct B
+{
+  int a;
+  int b;
+  int c = 42;
+};
+
+void
+fn_B ()
+{
+  int i = 0;
+  auto y = new B(1, 2);
+  auto x = new B(++i, ++i);
+  auto z = new B(1, { ++i });
+  auto u = new B(1, f());
+}
+
+struct C
+{
+  int a;
+  int b = 10;
+};
+
+void
+fn_C ()
+{
+  int i = 0;
+  auto y = new C(1);
+  auto x = new C(++i);
+  auto z = new C({ ++i });
+  auto u = new C(f());
+}

Re: [PATCH 0/2] Make C front end share the C++ tree representation of loops and switches

2019-12-11 Thread Jason Merrill


On 12/11/19 2:03 AM, Sandra Loosemore wrote:

On 12/6/19 3:41 PM, Jeff Law wrote:

On Wed, 2019-11-13 at 09:27 -0700, Sandra Loosemore wrote:


I bootstrapped and regression-tested this on x86_64-linux-gnu.  There
are a few regressions involving these tests:

gcc.dg/tree-ssa/pr77445-2.c

I believe this is another instance of the FSA optimization.  I'd need
to see the before/after dumps to know if it's regressed.  The main
purpose of the test was to verify that we didn't muck up the profile
estimation after the FSA optimization.



gcc.dg/tree-ssa/ssa-dce-3.c

So I think this one is supposed to collapse into a trivial infinite
loop.  Anything else would be a regression.



gcc.dg/tree-ssa/ssa-dom-thread-7.c

This is the FSA optimization.  Unfortunately checking for it being done
right is exceedingly painful.  If you pass along the before/after dumps
I can probably help determine whether or not we just need an update to
the scanned bits.

Given how much pressure there was to handle the FSA optimization, I'd
prefer to make sure we're still doing it before moving forward.


I poked at these 3 test cases some more.  FWIW, it appears that if you 
use an unmodified build to compile them as C++ instead of C, the same 
problems appear.  So I guess it is an existing bug in 
something-or-another that we are not presently optimizing code output by 
the C++ front end as well as that from the C front end.  :-S  (At least, 
the ssa-dce-3.c optimization failure is a bug; the other two might be 
fragile test cases.)


The C++ front end used to lower loops in exactly the same way as the C 
front end.  This is the patch that changed it:


https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01994.html

There wasn't much discussion about how this affected optimization beyond 
noting a slight decrease in code size for a single benchmark, and no 
consideration at all of whether it wouldn't be better to have the C and 
C++ front ends use the same lowering strategy for loops, whichever way 
produced better code, so that the optimizers can better recognize the 
common patterns from both languages.


Anyway, I'm no longer expecting to get this front end patch into GCC 10, 
but it would be helpful to get some guidance on how to proceed for 
resubmission when stage 1 re-opens.  E.g. from where I'm sitting now, 
fixing the C++ constexpr evaluator to handle gotos (if it doesn't 
already?) and reverting to the C way of lowering loops seems like a much 
more bounded problem than fixing optimizers and trying to benchmark 
their effectiveness.  :-S  OTOH, somebody more familiar with these 
optimizations might see easy fixes.  Or I could re-jigger my patches to 
continue to use different loop lowering strategies for C and C++ so it 
at least wouldn't make things any worse for either language than it 
already is.


If this is an important optimization, it would certainly be good to make 
it work for C++.  And now that constexpr works on the pre-generic form 
of the function, it doesn't care how we lower loops.  So C++ could 
revert to the C way without much trouble.  I just find it weird that 
apparently the middle end still can't optimize LOOP_EXPR properly.


Jason

Re: [PATCH] Fix vect rotate pattern recog (PR target/92723)

2019-12-11 Thread Richard Sandiford

Jakub Jelinek  writes:
> On Wed, Dec 11, 2019 at 04:52:30PM +, Richard Sandiford wrote:
>> WDYT about instead having:
>> 
>>   if (dt != vect_internal_def || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
>> 
>> and removing the ext_def stuff?  I'd have expected keeping the original
>> operand to always be best for vect_external_def, and it avoids changing
>> the function body during what's supposed to be just an analysis phase.
>
> The ext_def stuff is needed in any case, for the -amount & mask expression.
> If all you mean is following, then I think it should work and can
> bootstrap/regtest it tonight (though just on x86_64/i686):

Yeah, this is what I meant, sorry for the vague description.

> 2019-12-11  Jakub Jelinek  
>
>   PR target/92723
>   * tree-vect-patterns.c (vect_recog_rotate_pattern): If dt is not
>   vect_internal_def, use oprnd1 as is, without trying to cast it.
>   Formatting fix.
>
>   * gcc.dg/vect/pr92723.c: New test.

OK if it passes testing.

Thanks,
Richard

>
> --- gcc/tree-vect-patterns.c.jj   2019-12-10 21:34:45.103643981 +0100
> +++ gcc/tree-vect-patterns.c  2019-12-11 18:21:11.678218461 +0100
> @@ -2432,14 +2432,12 @@ vect_recog_rotate_pattern (stmt_vec_info
>oprnd0 = def;
>  }
>  
> -  if (dt == vect_external_def
> -  && TREE_CODE (oprnd1) == SSA_NAME)
> +  if (dt == vect_external_def && TREE_CODE (oprnd1) == SSA_NAME)
>  ext_def = vect_get_external_def_edge (vinfo, oprnd1);
>  
>def = NULL_TREE;
>scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
> -  if (TREE_CODE (oprnd1) == INTEGER_CST
> -  || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
> +  if (dt != vect_internal_def || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
>  def = oprnd1;
>else if (def_stmt && gimple_assign_cast_p (def_stmt))
>  {
> @@ -2454,14 +2452,7 @@ vect_recog_rotate_pattern (stmt_vec_info
>  {
>def = vect_recog_temp_ssa_var (type, NULL);
>def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
> -  if (ext_def)
> - {
> -   basic_block new_bb
> - = gsi_insert_on_edge_immediate (ext_def, def_stmt);
> -   gcc_assert (!new_bb);
> - }
> -  else
> - append_pattern_def_seq (stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (stmt_vinfo, def_stmt);
>  }
>stype = TREE_TYPE (def);
>scalar_int_mode smode = SCALAR_INT_TYPE_MODE (stype);
> --- gcc/testsuite/gcc.dg/vect/pr92723.c.jj	2019-12-11 18:19:09.944060313 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr92723.c	2019-12-11 18:19:09.944060313 +0100
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +
> +void
> +foo (unsigned long long *x, unsigned long long *y, int z)
> +{
> +  int i;
> +  for (i = 0; i < 1024; i++)
> +x[i] = (y[i] >> z) | (y[i] << (-z & (__SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1)));
> +}
>
>
>   Jakub

Re: [PATCH, GCC/ARM, 2/2] Add support for ASRL(imm), LSLL(imm) and LSRL(imm) instructions for Armv8.1-M Mainline

2019-12-11 Thread Kyrill Tkachov


Hi Mihail,

On 11/14/19 1:54 PM, Mihail Ionescu wrote:

Hi,

This is part of a series of patches where I am trying to add new
instructions for Armv8.1-M Mainline to the arm backend.
This patch is adding the following instructions:

ASRL (imm)
LSLL (imm)
LSRL (imm)


ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * config/arm/arm.md (ashldi3): Generate thumb2_lsll for both reg
    and valid immediate.
    (ashrdi3): Generate thumb2_asrl for both reg and valid immediate.
    (lshrdi3): Generate thumb2_lsrl for valid immediates.
    * config/arm/constraints.md (Pg): New.
    * config/arm/predicates.md (long_shift_imm): New.
    (arm_reg_or_long_shift_imm): Likewise.
    * config/arm/thumb2.md (thumb2_asrl): New immediate alternative.
    (thumb2_lsll): Likewise.
    (thumb2_lsrl): New.

*** gcc/testsuite/ChangeLog ***

2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * gcc.target/arm/armv8_1m-shift-imm_1.c: New test.

Testsuite shows no regression when run for arm-none-eabi targets.

Is this ok for trunk?



This is ok once the prerequisites are in.

Thanks,

Kyrill



Thanks
Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index b735f858a6a5c94d02a6765c1b349cdcb5e77ee3..82f4a5573d43925fb7638b9078a06699df38f88c 100644

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -3509,8 +3509,8 @@
 operands[2] = force_reg (SImode, operands[2]);

   /* Armv8.1-M Mainline double shifts are not expanded.  */
-  if (REG_P (operands[2]))
-   {
+  if (arm_reg_or_long_shift_imm (operands[2], GET_MODE (operands[2])))

+    {
   if (!reg_overlap_mentioned_p(operands[0], operands[1]))
 emit_insn (gen_movdi (operands[0], operands[1]));

@@ -3547,7 +3547,8 @@
   "TARGET_32BIT"
   "
   /* Armv8.1-M Mainline double shifts are not expanded.  */
-  if (TARGET_HAVE_MVE && REG_P (operands[2]))
+  if (TARGET_HAVE_MVE
+  && arm_reg_or_long_shift_imm (operands[2], GET_MODE (operands[2])))
 {
   if (!reg_overlap_mentioned_p(operands[0], operands[1]))
 emit_insn (gen_movdi (operands[0], operands[1]));
@@ -3580,6 +3581,17 @@
  (match_operand:SI 2 "reg_or_int_operand")))]
   "TARGET_32BIT"
   "
+  /* Armv8.1-M Mainline double shifts are not expanded.  */
+  if (TARGET_HAVE_MVE
+    && long_shift_imm (operands[2], GET_MODE (operands[2])))
+    {
+  if (!reg_overlap_mentioned_p(operands[0], operands[1]))
+    emit_insn (gen_movdi (operands[0], operands[1]));
+
+  emit_insn (gen_thumb2_lsrl (operands[0], operands[2]));
+  DONE;
+    }
+
   arm_emit_coreregs_64bit_shift (LSHIFTRT, operands[0], operands[1],
  operands[2], gen_reg_rtx (SImode),
  gen_reg_rtx (SImode));
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index b76de81b85c8ce7a2ca484a750b908b7ca64600a..d807818c8499a6a65837f1ed0487e45947f68199 100644

--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -35,7 +35,7 @@
 ;;   Dt, Dp, Dz, Tu
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
 ;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
-;; in all states: Pf
+;; in all states: Pf, Pg

 ;; The following memory constraints have been used:
 ;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
@@ -187,6 +187,11 @@
 && !is_mm_consume (memmodel_from_int (ival))
 && !is_mm_release (memmodel_from_int (ival))")))

+(define_constraint "Pg"
+  "@internal In Thumb-2 state a constant in range 1 to 32"
+  (and (match_code "const_int")
+   (match_test "TARGET_THUMB2 && ival >= 1 && ival <= 32")))
+
 (define_constraint "Ps"
   "@internal In Thumb-2 state a constant in the range -255 to +255"
   (and (match_code "const_int")
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 69c10c06ff405e19efa172217a08a512c66cb902..ef5b0303d4424981347287865efb3cca85e56f36 100644

--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -322,6 +322,15 @@
   && (UINTVAL (XEXP (op, 1)) < 32)")))
    (match_test "mode == GET_MODE (op)")))

+;; True for Armv8.1-M Mainline long shift instructions.
+(define_predicate "long_shift_imm"
+  (match_test "satisfies_constraint_Pg (op)"))
+
+(define_predicate "arm_reg_or_long_shift_imm"
+  (ior (match_test "TARGET_THUMB2
+   && arm_general_register_operand (op, GET_MODE (op))")
+   (match_test "satisfies_constraint_Pg (op)")))
+
 ;; True for MULT, to identify which variant of shift_operator is in use.
 (define_special_predicate "mult_operator"
   (match_code "mult"))
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index
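
As a rough illustration of the immediate range accepted by the new `Pg` constraint (1 to 32), and of what a 64-bit logical right shift across a register pair computes, here is a small C sketch. The function names are hypothetical, the `TARGET_THUMB2` condition is left out, and the code models the arithmetic only, not the instruction encoding or any flag behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors only the range test of the Pg constraint: 1 <= ival <= 32.  */
int
demo_long_shift_imm_p (long ival)
{
  return ival >= 1 && ival <= 32;
}

/* Logical right shift of the 64-bit value held in the pair (*hi:*lo).  */
void
demo_lsrl_pair (uint32_t *lo, uint32_t *hi, unsigned amount)
{
  assert (demo_long_shift_imm_p ((long) amount));
  uint64_t v = ((uint64_t) *hi << 32) | *lo;
  v >>= amount;   /* amount is 1..32, so the shift is well defined */
  *lo = (uint32_t) v;
  *hi = (uint32_t) (v >> 32);
}
```

Combining the halves into one `uint64_t` keeps the sketch free of the special case a 32-bit shift by 32 would otherwise need.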

Re: [PATCH, GCC/ARM, 1/2] Add support for ASRL(reg) and LSLL(reg) instructions for Armv8.1-M Mainline

2019-12-11 Thread Kyrill Tkachov


Hi Mihail,

On 11/14/19 1:54 PM, Mihail Ionescu wrote:

Hi,

This patch adds the new scalar shift instructions for Armv8.1-M
Mainline to the arm backend.
This patch is adding the following instructions:

ASRL (reg)
LSLL (reg)



Sorry for the delay, very busy time for GCC development :(




ChangeLog entries are as follows:

*** gcc/ChangeLog ***


2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * config/arm/arm.h (TARGET_MVE): New macro for MVE support.



I don't see this hunk in the patch... There's a lot of v8.1-M-related 
patches in flight. Is it defined elsewhere?



    * config/arm/arm.md (ashldi3): Generate thumb2_lsll for TARGET_MVE.
    (ashrdi3): Generate thumb2_asrl for TARGET_MVE.
    * config/arm/arm.c (arm_hard_regno_mode_ok): Allocate even odd
    register pairs for doubleword quantities for ARMv8.1M-Mainline.
    * config/arm/thumb2.md (thumb2_asrl): New.
    (thumb2_lsll): Likewise.

*** gcc/testsuite/ChangeLog ***

2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * gcc.target/arm/armv8_1m-shift-reg_1.c: New test.

Testsuite shows no regression when run for arm-none-eabi targets.

Is this ok for trunk?

Thanks
Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index be51df7d14738bc1addeab8ac5a3806778106bce..bf788087a30343269b30cf7054ec29212ad9c572 100644

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -24454,14 +24454,15 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)


   /* We allow almost any value to be stored in the general registers.
  Restrict doubleword quantities to even register pairs in ARM state
- so that we can use ldrd.  Do not allow very large Neon structure
- opaque modes in general registers; they would use too many.  */
+ so that we can use ldrd and Armv8.1-M Mainline instructions.
+ Do not allow very large Neon structure  opaque modes in general
+ registers; they would use too many.  */



This comment now reads:

"Restrict doubleword quantities to even register pairs in ARM state
 so that we can use ldrd and Armv8.1-M Mainline instructions."

Armv8.1-M Mainline is not ARM mode though, so please clarify this 
comment further.


Looks ok to me otherwise (I may even have merged this with the second 
patch, but I'm not complaining about keeping it simple :) )


Thanks,

Kyrill



   if (regno <= LAST_ARM_REGNUM)
 {
   if (ARM_NUM_REGS (mode) > 4)
 return false;

-  if (TARGET_THUMB2)
+  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
 return true;

   return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index a91a4b941c3f9d2c3d443f9f4639069ae953fb3b..b735f858a6a5c94d02a6765c1b349cdcb5e77ee3 100644

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -3503,6 +3503,22 @@
    (match_operand:SI 2 "reg_or_int_operand")))]
   "TARGET_32BIT"
   "
+  if (TARGET_HAVE_MVE)
+    {
+  if (!reg_or_int_operand (operands[2], SImode))
+    operands[2] = force_reg (SImode, operands[2]);
+
+  /* Armv8.1-M Mainline double shifts are not expanded.  */
+  if (REG_P (operands[2]))
+   {
+ if (!reg_overlap_mentioned_p(operands[0], operands[1]))
+   emit_insn (gen_movdi (operands[0], operands[1]));
+
+ emit_insn (gen_thumb2_lsll (operands[0], operands[2]));
+ DONE;
+   }
+    }
+
   arm_emit_coreregs_64bit_shift (ASHIFT, operands[0], operands[1],
  operands[2], gen_reg_rtx (SImode),
  gen_reg_rtx (SImode));
@@ -3530,6 +3546,16 @@
  (match_operand:SI 2 "reg_or_int_operand")))]
   "TARGET_32BIT"
   "
+  /* Armv8.1-M Mainline double shifts are not expanded.  */
+  if (TARGET_HAVE_MVE && REG_P (operands[2]))
+    {
+  if (!reg_overlap_mentioned_p(operands[0], operands[1]))
+   emit_insn (gen_movdi (operands[0], operands[1]));
+
+  emit_insn (gen_thumb2_asrl (operands[0], operands[2]));
+  DONE;
+    }
+
   arm_emit_coreregs_64bit_shift (ASHIFTRT, operands[0], operands[1],
  operands[2], gen_reg_rtx (SImode),
  gen_reg_rtx (SImode));
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index c08dab233784bd1cbaae147ece795058d2ef234f..3a716ea954ac55b2081121248b930b7f11520ffa 100644

--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1645,3 +1645,19 @@
   }
   [(set_attr "predicable" "yes")]
 )
+
+(define_insn "thumb2_asrl"
+  [(set (match_operand:DI 0 "arm_general_register_operand" "+r")
+   (ashiftrt:DI (match_dup 0)
+    (match_operand:SI 1 
"arm_general_register_operand" "r")))]

+  "TARGET_HAVE_MVE"
+  "asrl%?\\t%Q0, %R0, %1"
+  [(set_attr "predicable" "yes")])
+
+(define_insn
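
The final `return` in the quoted `arm_hard_regno_mode_ok` hunk rejects odd-numbered starting registers for doubleword values. A stand-alone sketch of just that predicate follows, with the target macros replaced by plain parameters; that substitution, and the name `demo_core_reg_ok`, are assumptions made so the snippet compiles outside GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the expression
   !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0).  */
bool
demo_core_reg_ok (bool have_ldrd, unsigned mode_size, unsigned regno)
{
  assert (mode_size > 0);
  return !(have_ldrd && mode_size > 4 && (regno & 1) != 0);
}
```

With the patch, Thumb-2 targets that have MVE no longer return early, so they also reach this even-pair check.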

Re: [PATCH] Fix vect rotate pattern recog (PR target/92723)

2019-12-11 Thread Jakub Jelinek

On Wed, Dec 11, 2019 at 04:52:30PM +, Richard Sandiford wrote:
> WDYT about instead having:
> 
>   if (dt != vect_internal_def || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
> 
> and removing the ext_def stuff?  I'd have expected keeping the original
> operand to always be best for vect_external_def, and it avoids changing
> the function body during what's supposed to be just an analysis phase.

The ext_def stuff is needed in any case, for the -amount & mask expression.
If all you mean is following, then I think it should work and can
bootstrap/regtest it tonight (though just on x86_64/i686):

2019-12-11  Jakub Jelinek  

PR target/92723
* tree-vect-patterns.c (vect_recog_rotate_pattern): If dt is not
vect_internal_def, use oprnd1 as is, without trying to cast it.
Formatting fix.

* gcc.dg/vect/pr92723.c: New test.

--- gcc/tree-vect-patterns.c.jj 2019-12-10 21:34:45.103643981 +0100
+++ gcc/tree-vect-patterns.c2019-12-11 18:21:11.678218461 +0100
@@ -2432,14 +2432,12 @@ vect_recog_rotate_pattern (stmt_vec_info
   oprnd0 = def;
 }
 
-  if (dt == vect_external_def
-  && TREE_CODE (oprnd1) == SSA_NAME)
+  if (dt == vect_external_def && TREE_CODE (oprnd1) == SSA_NAME)
 ext_def = vect_get_external_def_edge (vinfo, oprnd1);
 
   def = NULL_TREE;
   scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
-  if (TREE_CODE (oprnd1) == INTEGER_CST
-  || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
+  if (dt != vect_internal_def || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
 def = oprnd1;
   else if (def_stmt && gimple_assign_cast_p (def_stmt))
 {
@@ -2454,14 +2452,7 @@ vect_recog_rotate_pattern (stmt_vec_info
 {
   def = vect_recog_temp_ssa_var (type, NULL);
   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
-  if (ext_def)
-   {
- basic_block new_bb
-   = gsi_insert_on_edge_immediate (ext_def, def_stmt);
- gcc_assert (!new_bb);
-   }
-  else
-   append_pattern_def_seq (stmt_vinfo, def_stmt);
+  append_pattern_def_seq (stmt_vinfo, def_stmt);
 }
   stype = TREE_TYPE (def);
   scalar_int_mode smode = SCALAR_INT_TYPE_MODE (stype);
--- gcc/testsuite/gcc.dg/vect/pr92723.c.jj  2019-12-11 18:19:09.944060313 +0100
+++ gcc/testsuite/gcc.dg/vect/pr92723.c 2019-12-11 18:19:09.944060313 +0100
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+
+void
+foo (unsigned long long *x, unsigned long long *y, int z)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+x[i] = (y[i] >> z) | (y[i] << (-z & (__SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1)));
+}


Jakub
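
The new test exercises the usual C idiom for a variable rotate. A self-contained sketch of that idiom for a 64-bit rotate right follows; `demo_rotr64` is a hypothetical name, and the count is reduced modulo 64 here so that both shifts stay in range (the test case itself passes `z` through unmasked on the right shift).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Rotate V right by N bits using two shifts and an OR.  */
uint64_t
demo_rotr64 (uint64_t v, unsigned n)
{
  const unsigned mask = sizeof (v) * CHAR_BIT - 1;
  assert (mask == 63);
  n &= mask;
  /* (-n & mask) equals (64 - n) % 64, so n == 0 never shifts by 64.  */
  return (v >> n) | (v << (-n & mask));
}
```

The `-n & mask` form is what lets the left shift avoid an out-of-range count of 64 when the rotate amount is zero.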

[PR92843] [OpenACC] Fix dynamic reference counting for structured 'REFCOUNT_INFINITY'

2019-12-11 Thread Thomas Schwinge

Hi!

See attached "[PR92843] [OpenACC] Fix dynamic reference counting for
structured 'REFCOUNT_INFINITY'"; committed to trunk in r279234.


Grüße
 Thomas


From 7c8ffaf54af2c8acb77f82349aac4dd68d47ad9d Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:49:27 +
Subject: [PATCH] [PR92843] [OpenACC] Fix dynamic reference counting for
 structured 'REFCOUNT_INFINITY'

	libgomp/
	PR libgomp/92843
	* oacc-mem.c (present_create_copy, delete_copyout): Fix dynamic
	reference counting for structured 'REFCOUNT_INFINITY'.  Add some
	assertions.
	(goacc_insert_pointer, goacc_remove_pointer): Adjust accordingly.
	* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: New file.
	* testsuite/libgomp.oacc-c-c++-common/clauses-1.c: Fix OpenACC.
	* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/nested-1.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279234 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog |  10 +
 libgomp/oacc-mem.c|  42 ++--
 .../libgomp.oacc-c-c++-common/clauses-1.c |  16 +-
 .../libgomp.oacc-c-c++-common/lib-82.c|   6 +-
 .../libgomp.oacc-c-c++-common/nested-1.c  |  10 +-
 .../libgomp.oacc-c-c++-common/pr92843-1.c | 179 ++
 6 files changed, 242 insertions(+), 21 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 0a5650ed438..e5fb05aea6d 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,15 @@
 2019-12-11  Thomas Schwinge  
 
+	PR libgomp/92843
+	* oacc-mem.c (present_create_copy, delete_copyout): Fix dynamic
+	reference counting for structured 'REFCOUNT_INFINITY'.  Add some
+	assertions.
+	(goacc_insert_pointer, goacc_remove_pointer): Adjust accordingly.
+	* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: New file.
+	* testsuite/libgomp.oacc-c-c++-common/clauses-1.c: Fix OpenACC.
+	* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/nested-1.c: Likewise.
+
 	* oacc-parallel.c (find_pointer, GOACC_enter_exit_data): Move...
 	* oacc-mem.c: ... here.
 	(gomp_acc_insert_pointer, gomp_acc_remove_pointer): Rename to
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 571e0606ac8..a809d0495a6 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -543,11 +543,11 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 	  gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
 	}
 
+  assert (n->refcount != REFCOUNT_LINK);
   if (n->refcount != REFCOUNT_INFINITY)
-	{
-	  n->refcount++;
-	  n->dynamic_refcount++;
-	}
+	n->refcount++;
+  n->dynamic_refcount++;
+
   gomp_mutex_unlock (&acc_dev->lock);
 }
   else if (!(f & FLAG_CREATE))
@@ -573,8 +573,10 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 
   tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
  &kinds, true, GOMP_MAP_VARS_OPENACC);
-  /* Initialize dynamic refcount.  */
-  tgt->list[0].key->dynamic_refcount = 1;
+  n = tgt->list[0].key;
+  assert (n->refcount == 1);
+  assert (n->dynamic_refcount == 0);
+  n->dynamic_refcount++;
 
   d = tgt->to_free;
 }
@@ -698,12 +700,9 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 		  (void *) h, (int) s, (void *) n->host_start, (int) host_size);
 }
 
-  if (n->refcount == REFCOUNT_INFINITY)
-{
-  n->refcount = 0;
-  n->dynamic_refcount = 0;
-}
-  if (n->refcount < n->dynamic_refcount)
+  assert (n->refcount != REFCOUNT_LINK);
+  if (n->refcount != REFCOUNT_INFINITY
+  && n->refcount < n->dynamic_refcount)
 {
   gomp_mutex_unlock (&acc_dev->lock);
   gomp_fatal ("Dynamic reference counting assert fail\n");
@@ -711,13 +710,15 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 
   if (f & FLAG_FINALIZE)
 {
-  n->refcount -= n->dynamic_refcount;
+  if (n->refcount != REFCOUNT_INFINITY)
+	n->refcount -= n->dynamic_refcount;
   n->dynamic_refcount = 0;
 }
   else if (n->dynamic_refcount)
 {
+  if (n->refcount != REFCOUNT_INFINITY)
+	n->refcount--;
   n->dynamic_refcount--;
-  n->refcount--;
 }
 
   if (n->refcount == 0)
@@ -895,6 +896,8 @@ goacc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
   splay_tree_key n;
   gomp_mutex_lock (&acc_dev->lock);
   n = lookup_host (acc_dev, *hostaddrs, *sizes);
+  assert (n->refcount != REFCOUNT_INFINITY
+	  && n->refcount != REFCOUNT_LINK);
   gomp_mutex_unlock (&acc_dev->lock);
 
   tgt = n->tgt;
@@ -917,10 +920,11 @@ goacc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
   goacc_aq aq = get_goacc_asyncqueue (async);
   tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs,
 			 NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC);
+  splay_tree_key n =

Re: [PATCH] OpenACC reference count overhaul

2019-12-11 Thread Thomas Schwinge

Hi!

On 2019-10-29T12:15:01+, Julian Brown  wrote:
> On Mon, 21 Oct 2019 16:14:11 +0200
> Thomas Schwinge  wrote:
>> On 2019-10-03T09:35:04-0700, Julian Brown 
>> wrote:
>> >  void
>> > -gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async,
>> > -   int finalize, int mapnum)
>> > +gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
>> > +   size_t *sizes, unsigned short *kinds, int async,
>> > +   bool finalize, int mapnum)
>> >  {  
>> > [...]

> That code's all gone with this version...

\o/ Yay!

>> > --- a/libgomp/oacc-parallel.c
>> > +++ b/libgomp/oacc-parallel.c
>> > @@ -56,12 +56,29 @@ find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>> 
>> I've always been confused by this function (before known as
>> 'find_pset'); this feels wrong, but I've never gotten to the bottom
>> of it.
>
> This version removes that function in favour of a function that finds
> groups of consecutive mappings that should be kept together for a
> single gomp_map_vars invocation. I think that fits better with my
> findings as written up on the wiki page
> https://gcc.gnu.org/wiki/LibgompPointerMappingKinds.

\o/ Yay!

>> > [...]
>> 
>> ;-) Yuck.  As requested before: "Can we get a comment added to such
>> 'magic', please?"
>
> That magic is gone now. 

\o/ Yay!

>> I just wish that eventually we'll be able to get rid of that
>> stuff, and just let 'gomp_map_vars' do its thing.  Similar to
>>  "'GOACC_parallel_keyed' should use
>> 'GOMP_MAP_VARS_TARGET'".
>> 
>> (For avoidance of doubt, that's not your task right now.)

> I've removed the special-case handling
> of pointers in the enter/exit data code, and combined the
> gomp_acc_remove_pointer code (which now iterated over mappings
> one-at-a-time anyway) with the loop iterating over mappings in the
> new goacc_exit_data_internal function. It was a bit nonsensical to have
> the "exit data" code split over two files, as before.

Yes, I like that very much, and we shall tackle that next intermediate
step once your patch for  "[OpenACC] In
async context, need to use 'gomp_remove_var_async' instead of
'gomp_remove_var'" is done,
<http://mid.mail-archive.com/87tv681tb3.fsf@euler.schwinge.homeip.net>.

One thing:

> libgomp/

> * oacc-parallel.c (find_pointer): Remove function.
> (find_group_last, goacc_enter_data_internal,
> goacc_exit_data_internal): New functions.
> (GOACC_enter_exit_data): Use goacc_enter_data_internal and
> goacc_exit_data_internal helper functions.

It makes much sense to move all that into 'libgomp/oacc-mem.c', and as a
preparatory step, see attached "[OpenACC] Consolidate
'GOACC_enter_exit_data' and its helper functions in
'libgomp/oacc-mem.c'", committed to trunk in r279233.


Regards
 Thomas


From ca9b2739279e3ef0080164a68f082bcfb5169095 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:49:17 +
Subject: [PATCH] [OpenACC] Consolidate 'GOACC_enter_exit_data' and its helper
 functions in 'libgomp/oacc-mem.c'

	libgomp/
	* oacc-parallel.c (find_pointer, GOACC_enter_exit_data): Move...
	* oacc-mem.c: ... here.
	(gomp_acc_insert_pointer, gomp_acc_remove_pointer): Rename to
	'goacc_insert_pointer', 'goacc_remove_pointer', and make 'static'.
	* libgomp.h (gomp_acc_insert_pointer, gomp_acc_remove_pointer):
	Remove.
	* libgomp_g.h: Update.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279233 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog   |   8 ++
 libgomp/libgomp.h   |   2 -
 libgomp/libgomp_g.h |   7 +-
 libgomp/oacc-mem.c  | 274 +++-
 libgomp/oacc-parallel.c | 253 -
 5 files changed, 281 insertions(+), 263 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index f7d9ae98616..0a5650ed438 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,13 @@
 2019-12-11  Thomas Schwinge  
 
+	* oacc-parallel.c (find_pointer, GOACC_enter_exit_data): Move...
+	* oacc-mem.c: ... here.
+	(gomp_acc_insert_pointer, gomp_acc_remove_pointer): Rename to
+	'goacc_insert_pointer', 'goacc_remove_pointer', and make 'static'.
+	* libgomp.h (gomp_acc_insert_pointer, gomp_acc_remove_pointer):
+	Remove.
+	* libgomp_g.h: Update.
+
 	* oacc-parallel.c (GOACC_wait, goacc_wait): Move...
 	* oacc-async.c: ... here.
 	* oacc-int.h (goacc_wait): Declare.
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index a35aa07c80b..9f4d0428871 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1138,8 +1138,6 @@ enum gomp_map_vars_kind
   GOMP_MAP_VARS_ENTER_DATA
 };
 
-extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int);
-extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int);
 extern void

Re: [PATCH 2/2] [ARM] Add support for -mpure-code in thumb-1 (v6m)

2019-12-11 Thread Christophe Lyon

Ping?

On Thu, 5 Dec 2019 at 11:13, Christophe Lyon wrote:

> ping?
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01667.html
>
> Kyrill approved the previous version modulo a typo fix, but Richard
> wanted a better name for a variable.
> Is that version OK?
>
> Thanks,
>
> Christophe
>
>
> On Tue, 26 Nov 2019 at 16:29, Christophe Lyon
>  wrote:
> >
> > ping?
> >
> > On Mon, 18 Nov 2019 at 10:00, Christophe Lyon
> >  wrote:
> > >
> > > On Wed, 13 Nov 2019 at 15:46, Christophe Lyon
> > >  wrote:
> > > >
> > > > On Tue, 12 Nov 2019 at 12:13, Richard Earnshaw (lists)
> > > >  wrote:
> > > > >
> > > > > On 18/10/2019 14:18, Christophe Lyon wrote:
> > > > > > > +  bool not_supported = arm_arch_notm || flag_pic || TARGET_NEON;
> > > > > >
> > > > >
> > > > > > This is a poor name in the context of the function as a whole.  What's
> > > > > > not supported?  Please think of a better name so that I have some idea
> > > > > > what the intention is.
> > > >
> > > > That's to keep most of the code common when checking if -mpure-code
> > > > and -mslow-flash-data are supported.
> > > > These 3 cases are common to the two compilation flags, and
> > > > -mslow-flash-data still needs to check TARGET_HAVE_MOVT in addition.
> > > >
> > > > Would "common_unsupported_modes" work better for you?
> > > > Or I can duplicate the "arm_arch_notm || flag_pic || TARGET_NEON" in
> > > > the two tests.
> > > >
> > >
> > > Hi,
> > >
> > > Here is an updated version, using "common_unsupported_modes" instead
> > > of "not_supported", and fixing the typo reported by Kyrill.
> > > The ChangeLog is still the same.
> > >
> > > OK?
> > >
> > > Thanks,
> > >
> > > Christophe
> > >
> > > > Thanks,
> > > >
> > > > Christophe
> > > >
> > > > >
> > > > > R.
>

[OpenACC] Consolidate 'async'/'wait' code in 'libgomp/oacc-async.c'

2019-12-11 Thread Thomas Schwinge

Hi!

As a preparatory patch/general refactoring, see attached "[OpenACC]
Consolidate 'async'/'wait' code in 'libgomp/oacc-async.c'"; committed to
trunk in r279232.
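
As a reading aid for the moved 'goacc_wait' loop in the patch below, the per-queue decision it makes can be sketched as a pure function. Everything named `sketch_*` / `SKETCH_*` is hypothetical, and the numeric values for the "noval" and "sync" async arguments are assumptions about the OpenACC header rather than something this patch states.

```c
/* Assumed values of the special async arguments (stand-ins for
   'acc_async_noval' and 'acc_async_sync').  */
enum { SKETCH_ASYNC_NOVAL = -1, SKETCH_ASYNC_SYNC = -2 };

/* What the loop would do for one wait argument.  */
enum sketch_wait_action
{
  SKETCH_WAIT_ALL,        /* blocking wait on all queues */
  SKETCH_WAIT_ALL_ASYNC,  /* make ASYNC wait on all queues */
  SKETCH_SKIP_IDLE,       /* queue already finished, nothing to do */
  SKETCH_WAIT_QUEUE,      /* blocking wait on QID */
  SKETCH_SAME_QUEUE,      /* same queue: ordering is implicit */
  SKETCH_WAIT_ASYNC       /* make ASYNC wait on QID */
};

/* ASYNC is the queue the construct launches on, QID is one wait
   argument, QUEUE_IDLE is nonzero if QID has no outstanding work.  */
enum sketch_wait_action
sketch_classify_wait (int async, int qid, int queue_idle)
{
  /* Waiting on "noval" maps to "wait all"; the real loop also stops
     looking at further arguments after this case.  */
  if (qid == SKETCH_ASYNC_NOVAL)
    return async == SKETCH_ASYNC_SYNC ? SKETCH_WAIT_ALL : SKETCH_WAIT_ALL_ASYNC;

  if (queue_idle)
    return SKETCH_SKIP_IDLE;

  if (async == SKETCH_ASYNC_SYNC)
    return SKETCH_WAIT_QUEUE;
  if (qid == async)
    return SKETCH_SAME_QUEUE;
  return SKETCH_WAIT_ASYNC;
}
```

The order of the tests mirrors the patch: the "noval" check comes before the idle-queue check, so a "wait all" is issued even when the named queue itself is idle.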


Regards
 Thomas


From 2b04bb7b4c9a13b6eadc7d9723245dd58f0f4f04 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:49:08 +
Subject: [PATCH] [OpenACC] Consolidate 'async'/'wait' code in
 'libgomp/oacc-async.c'

	libgomp/
	* oacc-parallel.c (GOACC_wait, goacc_wait): Move...
	* oacc-async.c: ... here.
	* oacc-int.h (goacc_wait): Declare.
	* libgomp_g.h: Update

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279232 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog   |  5 +++
 libgomp/libgomp_g.h |  5 ++-
 libgomp/oacc-async.c| 71 
 libgomp/oacc-int.h  |  1 +
 libgomp/oacc-parallel.c | 72 -
 5 files changed, 81 insertions(+), 73 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 404722e20e3..f7d9ae98616 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,10 @@
 2019-12-11  Thomas Schwinge  
 
+	* oacc-parallel.c (GOACC_wait, goacc_wait): Move...
+	* oacc-async.c: ... here.
+	* oacc-int.h (goacc_wait): Declare.
+	* libgomp_g.h: Update
+
 	PR libgomp/92854
 	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c:
 	New file.
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index dfb55fb66dc..beb1689180d 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -357,6 +357,10 @@ extern void GOMP_teams (unsigned int, unsigned int);
 extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned,
 			unsigned);
 
+/* oacc-async.c */
+
+extern void GOACC_wait (int, int, ...);
+
 /* oacc-parallel.c */
 
 extern void GOACC_parallel_keyed (int, void (*) (void *), size_t,
@@ -370,7 +374,6 @@ extern void GOACC_enter_exit_data (int, size_t, void **,
    size_t *, unsigned short *, int, int, ...);
 extern void GOACC_update (int, size_t, void **, size_t *,
 			  unsigned short *, int, int, ...);
-extern void GOACC_wait (int, int, ...);
 extern int GOACC_get_num_threads (void);
 extern int GOACC_get_thread_num (void);
 extern void GOACC_declare (int, size_t, void **, size_t *, unsigned short *);
diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c
index 2b24ae7adc2..6dfc3bdeb8e 100644
--- a/libgomp/oacc-async.c
+++ b/libgomp/oacc-async.c
@@ -354,6 +354,77 @@ acc_wait_all_async (int async)
 gomp_fatal ("wait all async(%d) failed", async);
 }
 
+void
+GOACC_wait (int async, int num_waits, ...)
+{
+  goacc_lazy_initialize ();
+
+  struct goacc_thread *thr = goacc_thread ();
+
+  /* No nesting.  */
+  assert (thr->prof_info == NULL);
+  assert (thr->api_info == NULL);
+  acc_prof_info prof_info;
+  acc_api_info api_info;
+  bool profiling_p = GOACC_PROFILING_SETUP_P (thr, &prof_info, &api_info);
+  if (profiling_p)
+{
+  prof_info.async = async;
+  prof_info.async_queue = prof_info.async;
+}
+
+  if (num_waits)
+{
+  va_list ap;
+
+  va_start (ap, num_waits);
+  goacc_wait (async, num_waits, &ap);
+  va_end (ap);
+}
+  else if (async == acc_async_sync)
+acc_wait_all ();
+  else
+acc_wait_all_async (async);
+
+  if (profiling_p)
+{
+  thr->prof_info = NULL;
+  thr->api_info = NULL;
+}
+}
+
+attribute_hidden void
+goacc_wait (int async, int num_waits, va_list *ap)
+{
+  while (num_waits--)
+{
+  int qid = va_arg (*ap, int);
+
+  /* Waiting on ACC_ASYNC_NOVAL maps to 'wait all'.  */
+  if (qid == acc_async_noval)
+	{
+	  if (async == acc_async_sync)
+	acc_wait_all ();
+	  else
+	acc_wait_all_async (async);
+	  break;
+	}
+
+  if (acc_async_test (qid))
+	continue;
+
+  if (async == acc_async_sync)
+	acc_wait (qid);
+  else if (qid == async)
+	/* If we're waiting on the same asynchronous queue as we're
+	   launching on, the queue itself will order work as
+	   required, so there's no need to wait explicitly.  */
+	;
+  else
+	acc_wait_async (qid, async);
+}
+}
+
 attribute_hidden void
 goacc_async_free (struct gomp_device_descr *devicep,
 		  struct goacc_asyncqueue *aq, void *ptr)
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 9dc6c8a5713..81cb15c605f 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -113,6 +113,7 @@ void goacc_restore_bind (void);
 void goacc_lazy_initialize (void);
 void goacc_host_init (void);
 
+void goacc_wait (int, int, va_list *);
 void goacc_init_asyncqueues (struct gomp_device_descr *);
 bool goacc_fini_asyncqueues (struct gomp_device_descr *);
 void goacc_async_free (struct gomp_device_descr *, struct goacc_asyncqueue *,
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 68a60de24fa..1faca5d562f 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -111,8 +111,6 @@ handle_ftn_pointers (size_t mapnum, void **hostaddrs, size_t *sizes,
 }
 }
 
-static void goacc_wait (int async, int

[PR92854] Add 'libgomp.oacc-c-c++-common/acc_map_data-device_already-.c', 'libgomp.oacc-c-c++-common/acc_map_data-host_already-.c'

2019-12-11 Thread Thomas Schwinge

Hi!

See attached "[PR92854] Add
'libgomp.oacc-c-c++-common/acc_map_data-device_already-*.c',
'libgomp.oacc-c-c++-common/acc_map_data-host_already-*.c'", committed to
trunk in r279231, "to document the status quo", which does match my
understanding of the OpenACC 2.6 semantics.

The TODO in 'libgomp.oacc-c-c++-common/acc_map_data-device_already-3.c'
is being tracked in PR92888 "[OpenACC] Failure to resolve back via
'acc_hostptr' an 'acc_deviceptr' retrieved for a '#pragma acc declare'd
variable".


Regards
 Thomas


From ebcbd5ae0e1451cc97914cb825fc1017edb98e26 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:48:59 +
Subject: [PATCH] [PR92854] Add
 'libgomp.oacc-c-c++-common/acc_map_data-device_already-*.c',
 'libgomp.oacc-c-c++-common/acc_map_data-host_already-*.c'

... to document the status quo.

	libgomp/
	PR libgomp/92854
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c:
	New file.
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-3.c:
	Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279231 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 16 +
 .../acc_map_data-device_already-1.c   | 36 +++
 .../acc_map_data-device_already-2.c   | 35 ++
 .../acc_map_data-device_already-3.c   | 31 
 .../acc_map_data-host_already-1.c | 33 +
 .../acc_map_data-host_already-2.c | 32 +
 .../acc_map_data-host_already-3.c | 27 ++
 7 files changed, 210 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-3.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-3.c

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 6635ed7b44b..404722e20e3 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,19 @@
+2019-12-11  Thomas Schwinge  
+
+	PR libgomp/92854
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c:
+	New file.
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c:
+	Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-3.c:
+	Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-1.c:
+	Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-2.c:
+	Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/acc_map_data-host_already-3.c:
+	Likewise.
+
 2019-12-11  Thomas Schwinge  
 	Julian Brown  
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c
new file mode 100644
index 000..b48a1adbbb6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-1.c
@@ -0,0 +1,36 @@
+/* Verify that we refuse 'acc_map_data' when the "device address [...] is
+   already mapped".  */
+
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main ()
+{
+  const int N = 131;
+
+  char *h1 = (char *) malloc (N);
+  assert (h1);
+  void *d = acc_malloc (N);
+  assert (d);
+  acc_map_data (h1, d, N);
+
+  char *h2 = (char *) malloc (N);
+  assert (h2);
+  /* Try to arrange a setting such that a later 'acc_unmap_data' would find the
+ device memory object still referenced elsewhere.  This is not possible,
+ given the semantics of 'acc_map_data'.  */
+  fprintf (stderr, "CheCKpOInT\n");
+  acc_map_data (h2, d, N);
+
+  return 0;
+}
+
+
+/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
+/* { dg-output "device address \\\[\[0-9a-fA-FxX\]+, \\\+131\\\] is already mapped" } */
+/* { dg-shouldfail "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c
new file mode 100644
index 000..4fe0662cabb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_map_data-device_already-2.c
@@ -0,0 +1,35 @@
+/* Verify that we refuse 'acc_map_data' when the "device address [...] is
+   already mapped".  */
+
+/* { dg-skip-if "" { *-*-* } {

Re: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior - runtime

2019-12-11 Thread Thomas Schwinge

Hi!

On 2018-06-19T10:01:20-0700, Cesar Philippidis  wrote:
> This patch implements the OpenACC 2.5 data clause semantics in libgomp.

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -853,6 +853,8 @@ struct splay_tree_key_s {
>uintptr_t tgt_offset;
>/* Reference count.  */
>uintptr_t refcount;
> +  /* Dynamic reference count.  */
> +  uintptr_t dynamic_refcount;
>/* Pointer to the original mapping of "omp declare target link" object.  */
>splay_tree_key link_key;
>  };

See attached "[OpenACC] Initialize 'dynamic_refcount' whenever we
initialize 'refcount'" for 'Cases missed in r261813 "Update OpenACC data
clause semantics to the 2.5 behavior"'; committed to trunk in r279230,
and backported to gcc-9-branch in r279238.


Regards
 Thomas


From 20d0998b970ba693b23ee24bd0c94ecb57adf281 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:48:44 +
Subject: [PATCH] [OpenACC] Initialize 'dynamic_refcount' whenever we
 initialize 'refcount'

Cases missed in r261813 "Update OpenACC data clause semantics to the 2.5
behavior".

	libgomp/
	* target.c (gomp_load_image_to_device, omp_target_associate_ptr):
	Initialize 'dynamic_refcount' whenever we initialize 'refcount'.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279230 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 6 ++
 libgomp/target.c  | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 6cefeba5f5f..6635ed7b44b 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Thomas Schwinge  
+	Julian Brown  
+
+	* target.c (gomp_load_image_to_device, omp_target_associate_ptr):
+	Initialize 'dynamic_refcount' whenever we initialize 'refcount'.
+
 2019-12-11  Tobias Burnus  
 
 	* omp_lib.h.in: Fix spelling of function declaration
diff --git a/libgomp/target.c b/libgomp/target.c
index 39a24f56395..1151debf256 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1334,6 +1334,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
   k->tgt = tgt;
   k->tgt_offset = target_table[i].start;
   k->refcount = REFCOUNT_INFINITY;
+  k->dynamic_refcount = 0;
   k->link_key = NULL;
   array->left = NULL;
   array->right = NULL;
@@ -1366,6 +1367,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
   k->tgt = tgt;
   k->tgt_offset = target_var->start;
   k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
+  k->dynamic_refcount = 0;
   k->link_key = NULL;
   array->left = NULL;
   array->right = NULL;
@@ -2627,6 +2629,7 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
   k->tgt = tgt;
   k->tgt_offset = (uintptr_t) device_ptr + device_offset;
   k->refcount = REFCOUNT_INFINITY;
+  k->dynamic_refcount = 0;
   array->left = NULL;
   array->right = NULL;
   splay_tree_insert (&devicep->mem_map, array);
-- 
2.17.1

From f301776d131dd584f1259a4e6bfa5662451407c4 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 11 Dec 2019 16:51:31 +
Subject: [PATCH] [OpenACC, libgomp] Initialize 'dynamic_refcount' whenever we
 initialize 'refcount'

Cases missed in r261813 "Update OpenACC data clause semantics to the 2.5
behavior".

	libgomp/
	* target.c (gomp_load_image_to_device, omp_target_associate_ptr):
	Initialize 'dynamic_refcount' whenever we initialize 'refcount'.

Backport trunk r279230.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@279238 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 6 ++
 libgomp/target.c  | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 70a7f50c22b..c1959a44b8c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Thomas Schwinge  
+	Julian Brown  
+
+	* target.c (gomp_load_image_to_device, omp_target_associate_ptr):
+	Initialize 'dynamic_refcount' whenever we initialize 'refcount'.
+
 2019-12-11  Tobias Burnus  
 
 	Backported from mainline
diff --git a/libgomp/target.c b/libgomp/target.c
index 31148003d0a..97fc1ee2ddc 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1214,6 +1214,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
   k->tgt = tgt;
   k->tgt_offset = target_table[i].start;
   k->refcount = REFCOUNT_INFINITY;
+  k->dynamic_refcount = 0;
   k->link_key = NULL;
   array->left = NULL;
   array->right = NULL;
@@ -1246,6 +1247,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
   k->tgt = tgt;
   k->tgt_offset = target_var->start;
   k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
+  k->dynamic_refcount = 0;
   k->link_key = NULL;
   array->left = NULL;
   array->right = NULL;
@@ -2501,6 +2503,7 @@ omp_target_associate_ptr (const void *host_ptr, const

Re: [PATCH] PR90838: Support ctz idioms

2019-12-11 Thread Wilco Dijkstra

Hi Richard,

>> +(match (ctz_table_index @1 @2 @3)
>> +  (rshift (mult (bit_and (negate @1) @1) INTEGER_CST@2) INTEGER_CST@3))
>
> You need a :c on the bit_and

Fixed.

> +  unsigned HOST_WIDE_INT val = tree_to_uhwi (mulc);
> +  unsigned shiftval = tree_to_uhwi (tshift);
> +  unsigned input_bits = tree_to_shwi (TYPE_SIZE (input_type));

> In the event that a __int128_t IFN_CTZ is supported the above might ICE with
> too large constants so please either use wide-int ops or above verify
> tree_fits_{u,s}hwi () before doing the conversions (the conversion from
> TYPE_SIZE should always succeed though).

I've moved the initialization of val much later so we have done all the checks
and know for sure the mulc will fit in a HOST_WIDE_INT.

> Hmm.  So this verifies that for a subset of all possible inputs the table
> computes the correct value.
>
> a) how do we know this verification is exhaustive?
> b) we do this for every array access matching the pattern

It checks all the values that matter, which is the number of bits plus the special
handling of ctz(0). An array may contain entries which can never be referenced
(see ctz2() in the testcase), so we don't care what the value is in those cases.
Very few accesses can match the pattern given it is very specific and there are
many checks before it tries to check the contents of the array.
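
The "values that matter" check can be sketched outside the compiler as follows. This is only an illustration for a 32-bit input, not the code from the patch: `sketch_table_is_ctz` and `sketch_ctz_table` are hypothetical names, and the sketch leaves the ctz(0) handling out.

```c
#include <stdint.h>

/* The table from the ctz1 testcase, used here as sample input.  */
const unsigned char sketch_ctz_table[32] =
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Return 1 if, for every single-bit input 1 << BIT, the index
   ((1 << BIT) * MULC) >> SHIFT selects an entry equal to BIT.
   Entries that no single-bit input can reach are never inspected.  */
int
sketch_table_is_ctz (const unsigned char *table, unsigned len,
                     uint32_t mulc, unsigned shift)
{
  if (shift >= 32)
    return 0;

  for (unsigned bit = 0; bit < 32; bit++)
    {
      uint32_t idx = (uint32_t) (((uint32_t) 1 << bit) * mulc) >> shift;
      if (idx >= len || table[idx] != bit)
        return 0;
    }
  return 1;
}
```

Only 32 table probes are needed for a 32-bit input, which is why the check stays cheap even though it runs on every candidate access.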

> I suggest you do
>  tree ctor = ctor_for_folding (array);
>  if (!ctor || TREE_CODE (ctor) != CONSTRUCTOR)
>    return false;
>
> and then perform the verification on the constructor elements directly.
> That's a lot cheaper.  Watch out for array_ref_low_bound which you
> don't get passed in here - thus pass in the ARRAY_REF, not the array.
>
> I believe it's also wrong in your code above (probably miscompiles
> a fortran equivalent of your testcase or fails verification/transform).
>
> When you do the verification on the ctor_for_folding then you
> can in theory lookup the varpool node for 'array' and cache
> the verification result there.

I've updated it to use the ctor, but it meant adding another code path to
handle string literals. It's not clear how the array_ref_low_bound affects the
initializer, but I now reject it if it is non-zero.

>> +  tree lhs = gimple_assign_lhs (stmt);
>> +  bool zero_ok = CTZ_DEFINED_VALUE_AT_ZERO (TYPE_MODE (type), val);
>
> since we're using the optab entry shouldn't you check for == 2 here?

Yes, that looks more correct (it's not clear what 1 means exactly).

> Please check this before building the call.

I've reordered the checks so it returns before it builds any gimple if it
cannot do the transformation.

> For all of the above using gimple_build () style stmt building and
> a final gsi_replace_with_seq would be more straight-forward.

I've changed that, but it meant always inserting the nop convert, otherwise
it does not make the code easier to follow.

Cheers,
Wilco


[PATCH v3] PR90838: Support ctz idioms

v3: Directly walk over the array initializer and other tweaks based on review.
v2: Use fwprop pass rather than match.pd

Support common idioms for count trailing zeroes using an array lookup.
The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
constant which when multiplied by a power of 2 contains a unique value
in the top 5 or 6 bits.  This is then indexed into a table which maps it
to the number of trailing zeroes.  When the table is valid, we emit a
sequence using the target defined value for ctz (0):

int ctz1 (unsigned x)
{
  static const char table[32] =
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

  return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
}

Is optimized to:

        rbit    w0, w0
        clz     w0, w0
        and     w0, w0, 31
        ret

Bootstrapped on AArch64. OK for commit?
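
To see why the final `and w0, w0, 31` is needed, here is a small sketch comparing the table lookup with a plain bit scan whose result is masked to 5 bits. The function names are hypothetical; the assumption is that the hardware count yields 32 for a zero input, which the mask folds back to the table's entry 0.

```c
#include <stdint.h>

/* The table-based idiom, as in ctz1 above.  */
unsigned
sketch_ctz_lookup (uint32_t x)
{
  static const unsigned char table[32] =
    {
      0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
      31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
    };

  return table[(uint32_t) ((x & -x) * 0x077CB531u) >> 27];
}

/* Reference: count trailing zero bits, giving 32 for x == 0, then mask
   to 5 bits so that the zero case agrees with table[0].  */
unsigned
sketch_ctz_masked (uint32_t x)
{
  unsigned n = 0;

  while (n < 32 && !((x >> n) & 1))
    n++;
  return n & 31;
}
```

For x == 0 both functions return 0: the lookup because the index is 0, the reference because 32 & 31 is 0.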

ChangeLog:

2019-12-11  Wilco Dijkstra  

PR tree-optimization/90838
* tree-ssa-forwprop.c (check_ctz_array): Add new function.
(check_ctz_string): Likewise.
(optimize_count_trailing_zeroes): Likewise.
(simplify_count_trailing_zeroes): Likewise.
(pass_forwprop::execute): Try ctz simplification.
* match.pd: Add matching for ctz idioms.
* testsuite/gcc.target/aarch64/pr90838.c: New test.

--
diff --git a/gcc/match.pd b/gcc/match.pd
index 3b7a5ce4e9a4de4f983ccdc696ad406a7932c08c..410cd6eaae0cdc9de7e01d5496de0595b7ea15ba 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6116,3 +6116,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (vec_perm vec_same_elem_p@0 @0 @1)
  @0)
+
+/* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
+   The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
+   constant which when multiplied by a power of 2 contains a unique value
+   in the top 5 or 6 bits.  This is then indexed into a table which maps it
+   to the number of trailing zeroes.  */
+(match (ctz_table_index @1

Re: [PATCH] Fix vect rotate pattern recog (PR target/92723)

2019-12-11 Thread Richard Sandiford

Jakub Jelinek  writes:
> Hi!
>
> Unlike x86, where the last operand of vector by scalar shift is DImode for
> V[248]DImode shifts, on aarch64 they seem to be SImode.
> vect_recog_rotate_pattern when the rotate amount is not constant casts the
> amount to the element type of the vector, so for V[248]DImode vectors to
> DImode, but then we ICE during expand_shift_1 because such argument doesn't
> satisfy the predicate and can't be widened to it.
>
> The following patch fixes it by special casing vector by scalar shifts
> when adding patterns for rotates, in that case we punt if the operand isn't
> INTEGER_CST or external_def, and the patch just keeps using the type of the
> amount operand the rotate had for the shifts too.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-12-10  Jakub Jelinek  
>
>   PR target/92723
>   * tree-vect-patterns.c (vect_recog_rotate_pattern): If vector x vector
>   shifts aren't supported and rotate amount is SSA_NAME, use its type
>   rather than first operand's type for the shift amounts.
>
>   * gcc.dg/vect/pr92723.c: New test.
>
> --- gcc/tree-vect-patterns.c.jj   2019-12-09 11:12:29.983288823 +0100
> +++ gcc/tree-vect-patterns.c  2019-12-10 16:30:59.922177911 +0100
> @@ -2242,6 +2242,7 @@ vect_recog_rotate_pattern (stmt_vec_info
>optab optab1, optab2;
>edge ext_def = NULL;
>bool bswap16_p = false;
> +  bool scalar_shift_p = false;
>  
>if (is_gimple_assign (last_stmt))
>  {
> @@ -2420,6 +2421,7 @@ vect_recog_rotate_pattern (stmt_vec_info
> || !optab2
> || optab_handler (optab2, TYPE_MODE (vectype)) == CODE_FOR_nothing)
>   return NULL;
> +  scalar_shift_p = true;
>  }
>  
>*type_out = vectype;
> @@ -2439,7 +2441,8 @@ vect_recog_rotate_pattern (stmt_vec_info
>def = NULL_TREE;
>scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
>if (TREE_CODE (oprnd1) == INTEGER_CST
> -  || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)
> +  || TYPE_MODE (TREE_TYPE (oprnd1)) == mode
> +  || scalar_shift_p)
>  def = oprnd1;
>else if (def_stmt && gimple_assign_cast_p (def_stmt))
>  {

WDYT about instead having:

  if (dt != vect_internal_def || TYPE_MODE (TREE_TYPE (oprnd1)) == mode)

and removing the ext_def stuff?  I'd have expected keeping the original
operand to always be best for vect_external_def, and it avoids changing
the function body during what's supposed to be just an analysis phase.

Thanks,
Richard

> --- gcc/testsuite/gcc.dg/vect/pr92723.c.jj   2019-12-10 16:37:09.924375698 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr92723.c   2019-12-10 16:37:21.823189103 +0100
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +
> +void
> +foo (unsigned long long *x, unsigned long long *y, int z)
> +{
> +  int i;
> +  for (i = 0; i < 1024; i++)
> +x[i] = (y[i] >> z) | (y[i] << (-z & (__SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1)));
> +}
>
>   Jakub
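
For readers unfamiliar with the idiom in the test case, here is a minimal standalone sketch of the rotate-by-variable-amount pattern that vect_recog_rotate_pattern looks for. The function name rotate_right is hypothetical, and unlike the test case this version also masks the right-shift amount so that it is well defined for any non-negative z.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Rotate Y right by Z bits.  Masking with (bit width - 1) keeps both shift
// amounts in range, so Z == 0 does not produce a shift by 64.
std::uint64_t rotate_right (std::uint64_t y, int z)
{
  const unsigned mask = sizeof (y) * CHAR_BIT - 1;  // 63 for a 64-bit type
  return (y >> (z & mask)) | (y << (-z & mask));
}
```

Note that the rotate amount here is an int while the rotated value is 64 bits wide, which is the same mismatch between amount type and element type that the thread is discussing.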

[C++ PATCH] PR c++/92105 - decltype(decltype) error cascade.

2019-12-11 Thread Jason Merrill

The primary change here is to do the CPP_DECLTYPE replacement even when we
get an error, so we don't keep trying and giving the same parse error each
time.  We also commit to the tentative firewall parse more often, leading to
better diagnostics.

Tested x86_64-pc-linux-gnu, applying to trunk.

* parser.c (cp_parser_decltype_expr): Don't tentative_firewall here.
(cp_parser_decltype): Do it here.  Remember a non-tentative error.
---
 gcc/cp/parser.c| 32 --
 gcc/testsuite/g++.dg/cpp0x/decltype-err1.C |  7 +
 gcc/testsuite/g++.dg/cpp0x/decltype10.C|  2 +-
 3 files changed, 31 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-err1.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bf6d291ba9d..16d1359c47d 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -14637,11 +14637,6 @@ cp_parser_decltype_expr (cp_parser *parser,
   cp_token *id_expr_start_token;
   tree expr;
 
-  /* Since we're going to preserve any side-effects from this parse, set up a
- firewall to protect our callers from cp_parser_commit_to_tentative_parse
- in the expression.  */
-  tentative_firewall firewall (parser);
-
   /* First, try parsing an id-expression.  */
   id_expr_start_token = cp_lexer_peek_token (parser->lexer);
   cp_parser_parse_tentatively (parser);
@@ -14733,9 +14728,6 @@ cp_parser_decltype_expr (cp_parser *parser,
  expression.  */
   cp_parser_abort_tentative_parse (parser);
 
-  /* Commit to the tentative_firewall so we get syntax errors.  */
-  cp_parser_commit_to_tentative_parse (parser);
-
   /* Parse a full expression.  */
   expr = cp_parser_expression (parser, /*pidk=*/NULL, /*cast_p=*/false,
   /*decltype_p=*/true);
@@ -14773,6 +14765,17 @@ cp_parser_decltype (cp_parser *parser)
   if (!parens.require_open (parser))
 return error_mark_node;
 
+  /* Since we're going to preserve any side-effects from this parse, set up a
+ firewall to protect our callers from cp_parser_commit_to_tentative_parse
+ in the expression.  */
+  tentative_firewall firewall (parser);
+
+  /* If in_declarator_p, a reparse as an expression might succeed (60361).
+ Otherwise, commit now for better diagnostics.  */
+  if (cp_parser_uncommitted_to_tentative_parse_p (parser)
+  && !parser->in_declarator_p)
+cp_parser_commit_to_topmost_tentative_parse (parser);
+
   push_deferring_access_checks (dk_deferred);
 
   tree expr = NULL_TREE;
@@ -14833,10 +14836,16 @@ cp_parser_decltype (cp_parser *parser)
 }
 
   /* Parse to the closing `)'.  */
-  if (!parens.require_close (parser))
+  if (expr == error_mark_node || !parens.require_close (parser))
 {
   cp_parser_skip_to_closing_parenthesis (parser, true, false,
 /*consume_paren=*/true);
+  expr = error_mark_node;
+}
+
+  /* If we got a parse error while tentative, bail out now.  */
+  if (cp_parser_error_occurred (parser))
+{
   pop_deferring_access_checks ();
   return error_mark_node;
 }
@@ -14859,6 +14868,11 @@ cp_parser_decltype (cp_parser *parser)
   start_token->u.tree_check_value->value = expr;
   start_token->u.tree_check_value->checks = get_deferred_access_checks ();
   start_token->keyword = RID_MAX;
+
+  location_t loc = start_token->location;
+  loc = make_location (loc, loc, parser->lexer);
+  start_token->location = loc;
+
   cp_lexer_purge_tokens_after (parser->lexer, start_token);
 
   pop_to_parent_deferring_access_checks ();
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype-err1.C b/gcc/testsuite/g++.dg/cpp0x/decltype-err1.C
new file mode 100644
index 000..302cb64aafc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype-err1.C
@@ -0,0 +1,7 @@
+// PR c++/92105
+// { dg-do compile { target c++11 } }
+
+// Test that we get exactly one "expected" error.
+
+decltype(decltype) x = 42; // { dg-bogus "expected.*expected" }
+// { dg-error "expected" "" { target *-*-* } .-1 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype10.C b/gcc/testsuite/g++.dg/cpp0x/decltype10.C
index 846d0bf57cf..fe7247269f5 100644
--- a/gcc/testsuite/g++.dg/cpp0x/decltype10.C
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype10.C
@@ -6,4 +6,4 @@ template struct A
   static int i;
 };
 
-template int A::i(decltype (A::i;// { dg-error "expected primary-expression before" }
+template int A::i(decltype (A::i;// { dg-error "expected" }

base-commit: 945f2b19497eff52ef44923d291bf0fdba043299
-- 
2.18.1

[C++ PATCH] PR c++/57082 - new X{} and private destructor.

2019-12-11 Thread Jason Merrill

build_new_1 already passes tf_no_cleanup to build_value_init, but in this
testcase we end up calling build_value_init by way of
build_special_member_call, so we need to pass it to that function as well.

Tested x86_64-pc-linux-gnu, applying to trunk.

* init.c (build_new_1): Also pass tf_no_cleanup to
build_special_member_call.
---
 gcc/cp/init.c  |  2 +-
 gcc/testsuite/g++.dg/cpp0x/initlist-new2.C | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-new2.C

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index e40afe27e1a..ecd09510adb 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -3591,7 +3591,7 @@ build_new_1 (vec **placement, tree type, tree nelts,
 complete_ctor_identifier,
 init, elt_type,
 LOOKUP_NORMAL,
- complain);
+complain|tf_no_cleanup);
}
  else if (explicit_value_init_p)
{
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-new2.C b/gcc/testsuite/g++.dg/cpp0x/initlist-new2.C
new file mode 100644
index 000..d8731389a65
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-new2.C
@@ -0,0 +1,15 @@
+// PR c++/57082
+// { dg-do compile { target c++11 } }
+
+struct X
+{
+private:
+  ~X() {}
+};
+
+int main()
+{
+  new X;// OK
+  new X();  // OK
+  new X{};  // ERROR
+}

base-commit: 945f2b19497eff52ef44923d291bf0fdba043299
-- 
2.18.1
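
As a rough illustration of what the fix permits, the sketch below creates objects of a class with a private destructor using all three forms of new-expression. It assumes a compiler that includes this fix; the member value and the helper names destroy and make_and_destroy are hypothetical additions so that the example can clean up after itself.

```cpp
#include <cassert>

struct X
{
  int value;
  // Only code with access to the private destructor can destroy an X.
  static void destroy (X *p) { delete p; }
private:
  ~X () {}
};

int make_and_destroy ()
{
  X *a = new X;     // default-initialization: value is indeterminate
  X *b = new X ();  // value-initialization: value is zero
  X *c = new X{};   // list-initialization: value is zero
  int sum = b->value + c->value;
  X::destroy (a);
  X::destroy (b);
  X::destroy (c);
  return sum;
}
```

None of the three new-expressions needs to run the destructor, so none of them should require it to be accessible.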

[C++ PATCH] PR c++/92774 - ICE with implicitly deleted operator<=>.

2019-12-11 Thread Jason Merrill

Missing error-recovery code.  While I was poking at this I also figured we
don't need to iterate over the members of a union.

Tested x86_64-pc-linux-gnu, applying to trunk.

* method.c (comp_info::~comp_info): Factor out of...
(build_comparison_op): Here.  Handle error return from build_new_op.
---
 gcc/cp/method.c   | 33 ---
 .../g++.dg/cpp2a/spaceship-synth-neg2.C   | 25 ++
 2 files changed, 46 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg2.C

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 83da20a0779..97c27c51ea3 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1244,6 +1244,21 @@ struct comp_info
 if (noex && !expr_noexcept_p (expr, tf_none))
   noex = false;
   }
+
+  ~comp_info ()
+  {
+if (first_time)
+  {
+   DECL_DECLARED_CONSTEXPR_P (fndecl) = constexp || was_constexp;
+   tree raises = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (fndecl));
+   if (!raises || UNEVALUATED_NOEXCEPT_SPEC_P (raises))
+ {
+   raises = noex ? noexcept_true_spec : noexcept_false_spec;
+   TREE_TYPE (fndecl) = build_exception_variant (TREE_TYPE (fndecl),
+ raises);
+ }
+  }
+  }
 };
 
 /* Build up the definition of a defaulted comparison operator.  Unlike other
@@ -1282,6 +1297,7 @@ build_comparison_op (tree fndecl, tsubst_flags_t complain)
   if (complain & tf_error)
inform (info.loc, "cannot default compare union %qT", ctype);
   DECL_DELETED_FN (fndecl) = true;
+  return;
 }
 
   tree compound_stmt = NULL_TREE;
@@ -1335,6 +1351,11 @@ build_comparison_op (tree fndecl, tsubst_flags_t complain)
 NULL_TREE);
  tree comp = build_new_op (info.loc, code, flags, lhs_mem, rhs_mem,
NULL_TREE, NULL, complain);
+ if (comp == error_mark_node)
+   {
+ DECL_DELETED_FN (fndecl) = true;
+ continue;
+   }
  comps.safe_push (comp);
}
   if (code == SPACESHIP_EXPR && is_auto (rettype))
@@ -1430,18 +1451,6 @@ build_comparison_op (tree fndecl, tsubst_flags_t complain)
 finish_compound_stmt (compound_stmt);
   else
 --cp_unevaluated_operand;
-
-  if (info.first_time)
-{
-  DECL_DECLARED_CONSTEXPR_P (fndecl) = info.constexp || info.was_constexp;
-  tree raises = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (fndecl));
-  if (!raises || UNEVALUATED_NOEXCEPT_SPEC_P (raises))
-   {
- raises = info.noex ? noexcept_true_spec : noexcept_false_spec;
- TREE_TYPE (fndecl) = build_exception_variant (TREE_TYPE (fndecl),
-   raises);
-   }
-}
 }
 
 /* Synthesize FNDECL, a non-static member function.   */
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg2.C b/gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg2.C
new file mode 100644
index 000..ecc249a67b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg2.C
@@ -0,0 +1,25 @@
+// PR c++/92774
+// { dg-do compile { target c++2a } }
+
+#include 
+
+template
+struct X { };
+
+template
+bool operator==(const X&, const X&) { return true; }
+template
+bool operator<(const X&, const X&) { return true; }
+
+struct Y
+{
+  int a;
+  X c;
+
+  auto operator <=>(Y const&) const = default; // { dg-error "no match" }
+};
+
+void f()
+{
+  auto x = Y() < Y();  // { dg-error "deleted" }
+}

base-commit: 945f2b19497eff52ef44923d291bf0fdba043299
-- 
2.18.1

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread Jonathan Wakely


On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing
_GLIBCXX_[BEGIN,END]_NAMESPACE_VERSION I also moved the anonymous namespace
outside the std namespace. Let me know if the original placement was
intentional.


    * src/c++11/random.cc: Add _GLIBCXX_BEGIN_NAMESPACE_VERSION and
    _GLIBCXX_END_NAMESPACE_VERSION. Move anonymous namespace outside std
    namespace.

Tested under Linux x86_64 normal/debug/versioned namespace modes.

There are still tests failing in versioned namespace, more patches to come.


One of them fails like this:

/home/jwakely/src/gcc/buildso8/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_classes.tcc:134: undefined reference to `_ZNSt3__87codecvtIDsDu11__mbstate_tE2idE'

For some reason some of the char8_t instantiations are not exported
from the libstdc++.so.8.0.0 shared library:

000aaad0 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE10do_unshiftERS1_PDuS4_RS4_
000aaa80 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE11do_encodingEv
0000 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE13do_max_lengthEv
000aaa90 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE16do_always_noconvEv
000ab870 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE5do_inERS1_PKDuS5_RS5_PDiS7_RS7_
000abc50 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE6do_outERS1_PKDiS5_RS5_PDuS7_RS7_
000ab810 t _ZNKSt3__87codecvtIDiDu11__mbstate_tE9do_lengthERS1_PKDuS5_m
000aaac0 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE10do_unshiftERS1_PDuS4_RS4_
000aaa80 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE11do_encodingEv
0000 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE13do_max_lengthEv
000aaa90 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE16do_always_noconvEv
000abd40 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE5do_inERS1_PKDuS5_RS5_PDsS7_RS7_
000ab990 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE6do_outERS1_PKDsS5_RS5_PDuS7_RS7_
000ab900 t _ZNKSt3__87codecvtIDsDu11__mbstate_tE9do_lengthERS1_PKDuS5_m
00157a00 b _ZNSt3__87codecvtIDiDu11__mbstate_tE2idE
000ab270 t _ZNSt3__87codecvtIDiDu11__mbstate_tED0Ev
000ab130 t _ZNSt3__87codecvtIDiDu11__mbstate_tED1Ev
000ab130 t _ZNSt3__87codecvtIDiDu11__mbstate_tED2Ev
00157a08 b _ZNSt3__87codecvtIDsDu11__mbstate_tE2idE
000ab250 t _ZNSt3__87codecvtIDsDu11__mbstate_tED0Ev
000ab110 t _ZNSt3__87codecvtIDsDu11__mbstate_tED1Ev
000ab110 t _ZNSt3__87codecvtIDsDu11__mbstate_tED2Ev

We should be able to add them manually to the
config/abi/pre/gnu-versioned-namespace.ver script, but that shouldn't
be necessary.

I think the problem is that the binutils demangler doesn't know how to
demangle char8_t symbols, so this fails to match them:

GLIBCXX_8.0 {

 global:

   # Names inside the 'extern' block are demangled names.
   extern "C++"
   {
 std::*;
 std::__8::*;
 std::random_device::*
   };

I'm using binutils-2.31.1-36.fc30.x86_64 but it looks like
binutils-2.32-30.fc31.x86_64 also can't demangle those symbols.

[C++ PATCH] PR c++/92446 - deduction of class NTTP.

2019-12-11 Thread Jason Merrill

Another place we need to look through the VIEW_CONVERT_EXPR we add to make a
use of a class NTTP have const type.

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (deducible_expression): Look through VIEW_CONVERT_EXPR.
---
 gcc/cp/pt.c  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/nontype-class26.C | 13 +
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class26.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d8ab26ec675..6f658de28ed 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -21183,7 +21183,7 @@ static bool
 deducible_expression (tree expr)
 {
   /* Strip implicit conversions.  */
-  while (CONVERT_EXPR_P (expr))
+  while (CONVERT_EXPR_P (expr) || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
 expr = TREE_OPERAND (expr, 0);
   return (TREE_CODE (expr) == TEMPLATE_PARM_INDEX);
 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class26.C b/gcc/testsuite/g++.dg/cpp2a/nontype-class26.C
new file mode 100644
index 000..315e0ac2309
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class26.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++2a } }
+
+struct p { unsigned p_ {}; };
+
+template  struct pp {};
+struct qq : public pp  {};
+
+template  int f (pp  const &);
+
+int main ()
+{
+  return f (qq {});
+}

base-commit: 945f2b19497eff52ef44923d291bf0fdba043299
-- 
2.18.1

[C++ PATCH] PR c++/92859 - ADL and bit-field.

2019-12-11 Thread Jason Merrill

We also need unlowered_expr_type when considering associated types for ADL.

Tested x86_64-pc-linux-gnu, applying to trunk.

* name-lookup.c: Use unlowered_expr_type.
---
 gcc/cp/name-lookup.c   |  2 +-
 gcc/testsuite/g++.dg/overload/bit-field1.C | 18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/overload/bit-field1.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index e82eaf222c0..e64cd9a9d66 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -908,7 +908,7 @@ name_lookup::adl_expr (tree expr)
 
   if (TREE_TYPE (expr) != unknown_type_node)
 {
-  adl_type (TREE_TYPE (expr));
+  adl_type (unlowered_expr_type (expr));
   return;
 }
 
diff --git a/gcc/testsuite/g++.dg/overload/bit-field1.C b/gcc/testsuite/g++.dg/overload/bit-field1.C
new file mode 100644
index 000..318caaaeb65
--- /dev/null
+++ b/gcc/testsuite/g++.dg/overload/bit-field1.C
@@ -0,0 +1,18 @@
+// PR c++/92859
+// { dg-do compile { target c++11 } }
+
+void f(int) = delete;
+
+struct ES { 
+  enum E { v }; 
+  friend void f(E) { }
+};
+
+struct S {
+  ES::E e : 1; 
+};
+
+int main() {
+  S s{}; 
+  f (s.e);
+}

base-commit: 945f2b19497eff52ef44923d291bf0fdba043299
-- 
2.18.1
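
A runnable variant of the test case may help show what "associated types for ADL" means here. In this sketch, which assumes a compiler that includes the fix, f(int) is not deleted but returns a different value, so the chosen overload can be observed at run time; the function call_f and the return values are hypothetical additions.

```cpp
#include <cassert>

int f (int) { return 0; }         // found by ordinary unqualified lookup

struct ES
{
  enum E { v };
  friend int f (E) { return 1; }  // hidden friend: visible only through ADL
};

struct S
{
  ES::E e : 1;                    // bit-field whose declared type is ES::E
};

int call_f ()
{
  S s{};
  // The argument's type is ES::E, so ADL must consider ES and find the
  // friend, which is an exact match and beats f(int).
  return f (s.e);
}
```

If the lowered bit-field type were used to compute the associated classes, ES would not be searched and only f(int) would be a candidate.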

Re: [PATCH] hash-table.h: support non-zero empty values in empty_slow

2019-12-11 Thread Jakub Jelinek

On Wed, Dec 11, 2019 at 10:44:58AM -0500, David Malcolm wrote:
> Is it OK for a hash_map key to have an "empty" value that isn't
> all-zeroes (and thus the same for a hash_table entry)?
> 
> Is the following patch OK for trunk?

I'd say that it is important to analyze the generated code for the common
case where empty elt is all zeros (and perhaps POD only).
Perhaps -ftree-loop-distribute-patterns can handle it in most cases?
Perhaps only when built in C++14 or later mode, that would still affect
bootstrapped compilers.  Like, could we conditionally constexpr evaluate
what mark_empty does and determine at compile time if it is all zeros or
not or something similar?

Jakub
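
One possible shape for the compile-time check suggested above, as a minimal sketch assuming C++14 constexpr. The traits classes, empty_is_zero_p and clear_slots are hypothetical names and not GCC's hash-table API, and comparing against a value-initialized object is only a stand-in for "all bits zero" that is valid for plain integer types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical traits: one whose empty value is zero, one whose is -1.
struct zero_traits
{
  typedef int value_type;
  static constexpr void mark_empty (value_type &v) { v = 0; }
};

struct minus_one_traits
{
  typedef int value_type;
  static constexpr void mark_empty (value_type &v) { v = -1; }
};

// Evaluate mark_empty at compile time and see whether the result equals a
// value-initialized (zero) object.
template <typename Traits>
constexpr bool empty_is_zero_p ()
{
  typename Traits::value_type v = typename Traits::value_type ();
  Traits::mark_empty (v);
  return v == typename Traits::value_type ();
}

// Use the fast memset only when the empty value is known to be zero.
template <typename Traits>
void clear_slots (typename Traits::value_type *entries, std::size_t size)
{
  constexpr bool zero_p = empty_is_zero_p<Traits> ();
  if (zero_p)
    std::memset (entries, 0, size * sizeof (*entries));
  else
    for (std::size_t i = 0; i < size; ++i)
      Traits::mark_empty (entries[i]);
}
```

Because zero_p is a constant, the compiler can drop the unused branch, so the common all-zeros case should still reduce to a single memset.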

[PATCH] hash-table.h: support non-zero empty values in empty_slow

2019-12-11 Thread David Malcolm

On Tue, 2019-12-10 at 16:00 -0700, Martin Sebor wrote:
> On 12/10/19 3:07 PM, David Malcolm wrote:
> > On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote:
> > > After allocating a new chunk of memory hash_table::expand() copy-
> > > assigns elements from the current array over the newly allocated
> > > elements without constructing them.
> > > 
> > > Similarly, after destroying existing elements, hash_table::
> > > empty_slow() assigns a new value to them.  This bug was
> > > introduced in r249234 when trying to deal with -Wclass-memaccess
> > > instances when the warning was first added.
> > > 
> > > Neither of these is a problem for PODs but both cause trouble
> > > when
> > > the hash_table contains elements of a type with a user-defined
> > > copy
> > > assignment operator.  There are at least two such instances in
> > > GCC
> > > already and a third is under review.
> > > 
> > > The attached patch avoids this by using placement new to
> > > construct
> > > new elements in uninitialized storage and restoring the original
> > > call to memset in hash_table::empty_slow(), analogously to what
> > > was done in r272893 for PR90923,
> > > 
> > > Longer term, to make these templates safely and efficiently
> > > usable
> > > with non-POD types with user-defined copy ctors and copy
> > > assignment
> > > operators I think these classes will probably need to be enhanced
> > > to make use of "assign" and "move" traits functions to
> > > efficiently
> > > assign and move objects.
> > > 
> > > Martin
> > > diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> > > index ba5d64fb7a3..26bac624a08 100644
> > > --- a/gcc/hash-table.h
> > > +++ b/gcc/hash-table.h
> > > @@ -818,8 +818,7 @@ hash_table > > Allocator>::expand ()
> > > if (!is_empty (x) && !is_deleted (x))
> > >   {
> > > value_type *q = find_empty_slot_for_expand
> > > (Descriptor::hash (x));
> > > -
> > > -  *q = x;
> > > +   new ((void*) q) value_type (x);
> > >   }
> > >   
> > > p++;
> > > @@ -869,14 +868,8 @@ hash_table > > Allocator>::empty_slow ()
> > > m_size_prime_index = nindex;
> > >   }
> > > else
> > > -{
> > > -#ifndef BROKEN_VALUE_INITIALIZATION
> > > -  for ( ; size; ++entries, --size)
> > > - *entries = value_type ();
> > > -#else
> > > -  memset (entries, 0, size * sizeof (value_type));
> > > -#endif
> > > -}
> > > +memset ((void *) entries, 0, size * sizeof (value_type));
> > > +
> > > m_n_deleted = 0;
> > > m_n_elements = 0;
> > >   }
> > 
> > On attempting to rebase my analyzer branch I found that this patch
> > uncovered a bug in it, but also possibly a bug in hash-table.h.
> > 
> > In the analyzer branch I have a hash_map with a key where the
> > "empty"
> > value isn't all-zero-bits.
> 
> There's a test case for it in comment #3 in 92762.

Thanks.  I've adapted that into a selftest in this patch.

> > Specifically, in gcc/analyzer/program-state.h: sm_state_map has a
> > hash_map  map_t, where svalue_id, the key type,
> > has
> > hash traits:
> > 
> > template <>
> > inline void
> > pod_hash_traits::mark_empty (value_type )
> > {
> >v = svalue_id::null ();
> > }
> > 
> > which is a -1 int (all ones).
> > 
> > memset to zero bits populates the "empty" slots with a key with a
> > non-
> > empty value for this key type, effectively corrupting the data
> > structure (luckily a selftest caught this).
> > 
> > Looking at the above patch, it looks like I was unwittingly relying
> > on
> > two things:
> > (a) #ifndef BROKEN_VALUE_INITIALIZATION, and
> > (b) that the default ctor for my key type was the "empty" value.
> > 
> > However, shouldn't empty_slow be calling the Descriptor::mark_empty
> > function?  Doesn't this memset code assume that the "empty" value
> > of
> > the hash_table entry is all-zeroes (and thus imposing the same
> > assumption on all hash_maps' key and value types?) - which isn't
> > the
> > case for my code.  I don't remember seeing that assumption
> > documented.

Correcting myself, I *think* this would impose the assumption that the
"empty" value of every hash_map's key type is all-zeroes; I don't think the
same assumption is imposed on the value types, though.

> > 
> > Or am I misreading this?
> 
> IIRC, I had tried the below but it caused problems during bootstrap
> that I didn't spend enough time investigating.
> 
>for (size_t i = size - 1; i < size; i--)
>  if (!is_empty (entries[i]) && !is_deleted (entries[i]))
>{
>  Descriptor::remove (entries[i]);
>  mark_empty (entries[i]);
>}
> 
> (I think it also caused compilation error in one of the IPA passes
> because of a missing mark_empty function in its traits class; maybe
> ipa_bit_ggc_hash_traits in ipa-prop.c).
> 
> To be honest, I'm not sure I understand why the memset is even there.
> It seems like a leak to me (I can reproduce leaks with a user-defined
> type).  I was going to get back to it at some point.
> 
> Martin

Thanks for all the info.
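
The failure mode discussed in this thread can be reproduced in miniature. The sketch below is not GCC code: id_traits is a hypothetical stand-in for a key type whose "empty" marker is -1, and the two helper functions compare clearing a slot array with memset against clearing it through mark_empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical key traits: -1 (all ones) marks an empty slot.
struct id_traits
{
  typedef int value_type;
  static void mark_empty (value_type &v) { v = -1; }
  static bool is_empty (const value_type &v) { return v == -1; }
};

// Count how many slots the traits consider empty.
template <typename Traits>
std::size_t count_empty (const typename Traits::value_type *entries,
                         std::size_t size)
{
  std::size_t n = 0;
  for (std::size_t i = 0; i < size; ++i)
    if (Traits::is_empty (entries[i]))
      ++n;
  return n;
}

// Clearing with memset leaves every slot holding key 0, which is a valid
// non-empty key for these traits.
std::size_t empty_after_memset ()
{
  int entries[4] = { 10, 11, 12, 13 };
  std::memset (entries, 0, sizeof entries);
  return count_empty<id_traits> (entries, 4);
}

// Clearing through the traits produces genuinely empty slots.
std::size_t empty_after_mark_empty ()
{
  int entries[4] = { 10, 11, 12, 13 };
  for (std::size_t i = 0; i < 4; ++i)
    id_traits::mark_empty (entries[i]);
  return count_empty<id_traits> (entries, 4);
}
```

After the memset the table would report four live entries with key 0 while its element count says zero, which is the kind of corruption the selftest caught.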

Re: [PATCH, rs6000] Adjust vectorization cost for scalar COND_EXPR

2019-12-11 Thread Segher Boessenkool

On Wed, Dec 11, 2019 at 07:39:38AM -0600, Bill Schmidt wrote:
> I can't approve this, but for what it's worth it looks fine to me.

But I can :-)  Thanks for looking Bill!

The patch is okay for trunk.  Thanks Ke Wen!


Segher


> >2019-12-11  Kewen Lin  
> >
> > * config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
> > (rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
> > stmt_cost.

Re: [PATCH, rs6000] Adjust vectorization cost for scalar COND_EXPR

2019-12-11 Thread Bill Schmidt


Hi!

I can't approve this, but for what it's worth it looks fine to me.

Bill

On 12/11/19 6:31 AM, Kewen.Lin wrote:

Hi,

We found that the vectorization cost modeling of scalar COND_EXPR is a bit off
on rs6000.  One typical case is 548.exchange2_r, where -Ofast -mcpu=power9
-mrecip -fvect-cost-model=unlimited is 1.94% better than -Ofast -mcpu=power9
-mrecip (the default is -fvect-cost-model=dynamic).  A scalar COND_EXPR is
normally expanded into compare + branch or compare + isel; either of them
should be priced higher than a simple FXU operation.  This patch adds an
additional vectorization cost for scalar COND_EXPR on top of
builtin_vectorization_cost.  The reasons for using an additional cost of 2
rather than another value are: 1) of the candidate values from 1 to 5 that we
tried, 2 measured best on Power9;  2) in terms of latency, compare takes 3
cycles and isel takes 2 on Power9, which is 2.5 times a simple FXU
instruction (cost 1 in the current modeling), so it is close;  3) it gives a
fine SPEC2017 ratio on Power8 as well.

The SPEC2017 performance evaluation on Power9 with explicit unrolling shows a
+2.35% gain for 548.exchange2_r but a -1.99% degradation for 526.blender_r; the
others are trivial.  On further investigation of 526.blender_r, the assembly
of the 10 hottest functions is unchanged, so the impact should be due to some
side effects.  SPECINT geomean +0.16%, SPECFP geomean -0.16% (mainly due to
blender_r).  Without explicit unrolling, 548.exchange2_r gains +1.78% and the
others are trivial.  SPECINT geomean +0.19%, SPECFP geomean +0.06%.

The SPEC2017 performance evaluation on Power8 shows a +1.32% gain for
500.perlbench_r and a +2.03% gain for 511.povray_r; the others are trivial.
SPECINT geomean +0.08%, SPECFP geomean +0.18%.

Bootstrapped and regression tested on powerpc64le-linux-gnu.
Is it OK for trunk?

BR,
Kewen
---

gcc/ChangeLog

2019-12-11  Kewen Lin  

* config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
(rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
stmt_cost.

[patch] libgomp/openacc.f90 – clean-up public/private attributes

2019-12-11 Thread Tobias Burnus


This patch cleans up the public/private handling in libgomp/openacc.f90:

* 'module openacc_kinds' marked symbols explicitly as public and private
(although the default is public). Make the default explicit with a 'PUBLIC'
statement and remove the (redundant) explicit 'public :: ' lines.


* 'module openacc' had a bunch of 'public :: ' lines, but the default was
already 'public'. Changed the default to 'private', marked the
use-associated 'openacc_kinds' symbols as 'public', and added 'public'
statements for the two missing items. (Net effect: this hides all
openacc_internal symbols.)


I think this patch is rather obvious. Nonetheless: are there any comments?
(If not, I will commit it in the next few days.)

Tobias

PS: I found the two missing symbols by looking at the 'interface' lines;
'module openacc' has only those plus the version symbol.


PPS: I have filed PR 92913 as a gfortran feature request to compare the
procedure declarations in 'interface' blocks with the actual procedure
declarations outside of the modules, since unfortunately no argument-mismatch
check currently exists.


2019-12-11  Tobias Burnus  

	libgomp/
	* openacc.f90 (module openacc_kinds): Use 'PUBLIC' to mark all symbols
	as public except for the 'use …, only' imported symbol, which is
	private.
	(module openacc): Default to 'PRIVATE' to exclude openacc_internal; mark
	all symbols from module openacc_kinds as PUBLIC; add missing PUBLIC
	attributes for acc_copyout_finalize and acc_delete_finalize.

diff --git a/libgomp/openacc.f90 b/libgomp/openacc.f90
index 831a157e703..b37f1872d50 100644
--- a/libgomp/openacc.f90
+++ b/libgomp/openacc.f90
@@ -31,13 +31,12 @@ module openacc_kinds
   use iso_fortran_env, only: int32
   implicit none
 
+  public
   private :: int32
-  public :: acc_device_kind
 
-  integer, parameter :: acc_device_kind = int32
+  ! When adding items, also update 'public' setting in 'module openmp' below.
 
-  public :: acc_device_none, acc_device_default, acc_device_host
-  public :: acc_device_not_host, acc_device_nvidia
+  integer, parameter :: acc_device_kind = int32
 
   ! Keep in sync with include/gomp-constants.h.
   integer (acc_device_kind), parameter :: acc_device_none = 0
@@ -48,16 +47,11 @@ module openacc_kinds
   integer (acc_device_kind), parameter :: acc_device_nvidia = 5
   integer (acc_device_kind), parameter :: acc_device_gcn = 8
 
-  public :: acc_handle_kind
-
   integer, parameter :: acc_handle_kind = int32
 
-  public :: acc_async_noval, acc_async_sync
-
   ! Keep in sync with include/gomp-constants.h.
   integer (acc_handle_kind), parameter :: acc_async_noval = -1
   integer (acc_handle_kind), parameter :: acc_async_sync = -2
-
 end module
 
 module openacc_internal
@@ -717,6 +711,13 @@ module openacc
   use openacc_internal
   implicit none
 
+  private
+  ! From openacc_kinds
+  public :: acc_device_kind, acc_handle_kind
+  public :: acc_device_none, acc_device_default, acc_device_host
+  public :: acc_device_not_host, acc_device_nvidia, acc_device_gcn
+  public :: acc_async_noval, acc_async_sync
+
   public :: openacc_version
 
   public :: acc_get_num_devices, acc_set_device_type, acc_get_device_type
@@ -730,6 +731,7 @@ module openacc
   public :: acc_update_device, acc_update_self, acc_is_present
   public :: acc_copyin_async, acc_create_async, acc_copyout_async
   public :: acc_delete_async, acc_update_device_async, acc_update_self_async
+  public :: acc_copyout_finalize, acc_delete_finalize
 
   integer, parameter :: openacc_version = 201306

[PATCH, rs6000] Adjust vectorization cost for scalar COND_EXPR

2019-12-11 Thread Kewen.Lin

Hi,

We found that the vectorization cost modeling of scalar COND_EXPR is a bit off
on rs6000.  One typical case is 548.exchange2_r, where -Ofast -mcpu=power9
-mrecip -fvect-cost-model=unlimited is 1.94% better than -Ofast -mcpu=power9
-mrecip (the default is -fvect-cost-model=dynamic).  A scalar COND_EXPR is
normally expanded into compare + branch or compare + isel; either of them
should be priced higher than a simple FXU operation.  This patch adds an
additional vectorization cost for scalar COND_EXPR on top of
builtin_vectorization_cost.  The reasons for using an additional cost of 2
rather than another value are: 1) of the candidate values from 1 to 5 that we
tried, 2 measured best on Power9;  2) in terms of latency, compare takes 3
cycles and isel takes 2 on Power9, which is 2.5 times a simple FXU
instruction (cost 1 in the current modeling), so it is close;  3) it gives a
fine SPEC2017 ratio on Power8 as well.

The SPEC2017 performance evaluation on Power9 with explicit unrolling shows a
+2.35% gain for 548.exchange2_r but a -1.99% degradation for 526.blender_r; the
others are trivial.  On further investigation of 526.blender_r, the assembly
of the 10 hottest functions is unchanged, so the impact should be due to some
side effects.  SPECINT geomean +0.16%, SPECFP geomean -0.16% (mainly due to
blender_r).  Without explicit unrolling, 548.exchange2_r gains +1.78% and the
others are trivial.  SPECINT geomean +0.19%, SPECFP geomean +0.06%.

The SPEC2017 performance evaluation on Power8 shows a +1.32% gain for
500.perlbench_r and a +2.03% gain for 511.povray_r; the others are trivial.
SPECINT geomean +0.08%, SPECFP geomean +0.18%.

Bootstrapped and regression tested on powerpc64le-linux-gnu.
Is it OK for trunk?

BR,
Kewen
---

gcc/ChangeLog

2019-12-11  Kewen Lin  

* config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
(rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
stmt_cost.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2995348..5dad3cc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5016,6 +5016,29 @@ rs6000_init_cost (struct loop *loop_info)
   return data;
 }
 
+/* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost.
+   For some statement, we would like to further fine-grain tweak the cost on
+   top of rs6000_builtin_vectorization_cost handling which doesn't have any
+   information on statement operation codes etc.  One typical case here is
+   COND_EXPR, it takes the same cost to simple FXU instruction when evaluating
+   for scalar cost, but it should be priced more whatever transformed to either
+   compare + branch or compare + isel instructions.  */
+
+static unsigned
+adjust_vectorization_cost (enum vect_cost_for_stmt kind,
+  struct _stmt_vec_info *stmt_info)
+{
+  if (kind == scalar_stmt && stmt_info && stmt_info->stmt
+  && gimple_code (stmt_info->stmt) == GIMPLE_ASSIGN)
+{
+  tree_code subcode = gimple_assign_rhs_code (stmt_info->stmt);
+  if (subcode == COND_EXPR)
+   return 2;
+}
+
+  return 0;
+}
+
 /* Implement targetm.vectorize.add_stmt_cost.  */
 
 static unsigned
@@ -5031,6 +5054,7 @@ rs6000_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost = rs6000_builtin_vectorization_cost (kind, vectype,
 misalign);
+  stmt_cost += adjust_vectorization_cost (kind, stmt_info);
   /* Statements in an inner loop relative to the loop being
 vectorized are weighted more heavily.  The value here is
 arbitrary and could potentially be improved with analysis.  */
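
The effect of the adjustment can be modelled outside GCC with a few lines. This is only a sketch under stated assumptions: stmt_kind, rhs_code, base_cost, adjust_cost and stmt_cost are hypothetical stand-ins for the GCC internals in the patch, and the base cost of 1 for every statement kind is a simplification of rs6000_builtin_vectorization_cost.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's internal enumerations.
enum stmt_kind { scalar_stmt, vector_stmt };
enum rhs_code { PLUS_EXPR, COND_EXPR };

// Simplified stand-in for the builtin cost: every statement costs 1.
unsigned base_cost (stmt_kind) { return 1; }

// Mirror of the patch's adjustment: only a scalar COND_EXPR pays extra.
unsigned adjust_cost (stmt_kind kind, rhs_code code)
{
  return (kind == scalar_stmt && code == COND_EXPR) ? 2 : 0;
}

// Final per-statement cost: the builtin cost plus the adjustment.
unsigned stmt_cost (stmt_kind kind, rhs_code code)
{
  return base_cost (kind) + adjust_cost (kind, code);
}
```

In this model a scalar COND_EXPR ends up at cost 3, against 1 for a simple scalar operation, which is in line with the 2.5x latency estimate given in the patch description.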

Re: [PATCH v2][MSP430] Add msp430-elfbare target

2019-12-11 Thread Jozef Lawrynowicz

On Wed, 11 Dec 2019 12:19:41 +
Jozef Lawrynowicz  wrote:

> On Mon, 9 Dec 2019 15:28:25 +
> Jozef Lawrynowicz  wrote:
> 
> > On Sat, 07 Dec 2019 11:40:33 -0700
> > Jeff Law  wrote:
> >   
> > > On Fri, 2019-11-29 at 21:00 +, Jozef Lawrynowicz wrote:
> > > > The attached patch consolidates some configuration tweaks I
> > > > previously submitted
> > > > as modifications to the msp430-elf target into a new target called
> > > > "msp430-elfbare" i.e. "bare-metal".
> > > > 
> > > > MSP430: Disable TM clone registry by default
> > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00550.html
> > > > MSP430: Disable __cxa_atexit
> > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00552.html
> > > > 
> > > > The patches tweak the CRT code to achieve the smallest possible code
> > > > size, 
> > > > and rely on some additional generic tweaks to crtstuff.c.
> > > > 
> > > > I did submit these tweaks a while ago, but I didn't get any feedback,
> > > > however even if they are acceptable I suspect it is too late for GCC-
> > > > 10 anyway:
> > > > libgcc: Dont define __do_global_dtors_aux if it will be empty
> > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00417.html
> > > > libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE
> > > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00418.html
> > > > 
> > > > (The second one is a bit hacky, but without some way of removing the
> > > > __dso_handle declaration, we end up with 150 bytes of unnecessary
> > > > code in some
> > > > programs.)
> > > > 
> > > > So for this patch crtstuff.c was copied to the msp430 subdirectory
> > > > and the
> > > > changes were made to that target specific version.
> > > > 
> > > > Tiny program size can now be achieved by configuring gcc for msp430-
> > > > elfbare.
> > > > 
> > > > For example in an "empty main" program which loops forever:
> > > > >   msp430-elfbare @ -Os:
> > > > >    text    data     bss     dec     hex filename
> > > > >      14       0       0      14       e a.out
> > > > >   msp430-elf @ -Os:
> > > > >    text    data     bss     dec     hex filename
> > > > >     270       6       2     278     116 a.out
> > > > 
> > > > Successfully regtested msp430-elfbare vs msp430-elf.
> > > > 
> > > > Ok to apply?
> > > > 
> > > > P.S. This patch relies on the -fno-exceptions multilib patch
> > > > submitted here:
> > > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02523.html
> > > > 
> > > > P.P.S. This requires some minor configury tweaks to Newlib and GDB of
> > > > the form:
> > > > -  msp430*-*-elf)
> > > > +  msp430-*-elf*)  
> > > 
> > > > I'll apply these changes if the patch is accepted.
> > > > From cff4611855d838315e793d45256de5fc8eeefafe Mon Sep 17 00:00:00
> > > > 2001
> > > > From: Jozef Lawrynowicz 
> > > > Date: Mon, 25 Nov 2019 19:41:05 +
> > > > Subject: [PATCH] MSP430: Add new msp430-elfbare target
> > > > 
> > > > contrib/ChangeLog:
> > > > 
> > > > 2019-11-29  Jozef Lawrynowicz  
> > > > 
> > > > * config-list.mk: Add msp430-elfbare.
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > 2019-11-29  Jozef Lawrynowicz  
> > > > 
> > > > * config.gcc: s/msp430*-*-*/msp430-*-*.
> > > > Handle msp430-*-elfbare.
> > > > * config/msp430/msp430-devices.c (TARGET_SUBDIR): Define.
> > > > (_MSPMKSTR): Define.
> > > > (__MSPMKSTR): Define.
> > > > (rest_of_devices_path): Use TARGET_SUBDIR value in string.
> > > > * config/msp430/msp430.c (msp430_option_override): Error if
> > > > -fuse-cxa-atexit is used when it has been disabled at configure
> > > > time.
> > > > * config/msp430/t-msp430: Define TARGET_SUBDIR when building
> > > > msp430-devices.o.
> > > > * doc/install.texi: Document msp430-*-elf and msp430-*-elfbare.
> > > > * doc/invoke.texi: Update documentation about which path
> > > > devices.csv is
> > > > searched for.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > 2019-11-29  Jozef Lawrynowicz  
> > > > 
> > > > * g++.dg/init/dso_handle1.C: Require cxa_atexit support.
> > > > * g++.dg/init/dso_handle2.C: Likewise.
> > > > * g++.dg/other/cxa-atexit1.C: Likewise.
> > > > * gcc.target/msp430/msp430.exp: Update csv-using-installed.c
> > > > test to
> > > > handle msp430-elfbare configuration.
> > > > 
> > > > libgcc/ChangeLog:
> > > > 
> > > > 2019-11-29  Jozef Lawrynowicz  
> > > > 
> > > > * config.host: Use t-msp430-elfbare-crtstuff Makefile fragment
> > > > when GCC
> > > > is configured for the msp430-elfbare target.
> > > > * config/msp430/msp430-elfbare-crtstuff.c: New file.
> > > > * config/msp430/t-msp430: Remove Makefile rules for object
> > > > files
> > > > built from crtstuff.c
> > > > * config/msp430/t-msp430-crtstuff: New file.
> > > > * config/msp430/t-msp430-elfbare-crtstuff: New file.
> > > > * configure: Regenerate.
> > > > *

Re: [PATCH v2][MSP430] Add msp430-elfbare target

2019-12-11 Thread Jozef Lawrynowicz

On Mon, 9 Dec 2019 15:28:25 +
Jozef Lawrynowicz  wrote:

> On Sat, 07 Dec 2019 11:40:33 -0700
> Jeff Law  wrote:
> 
> > On Fri, 2019-11-29 at 21:00 +, Jozef Lawrynowicz wrote:  
> > > The attached patch consolidates some configuration tweaks I
> > > previously submitted
> > > as modifications to the msp430-elf target into a new target called
> > > "msp430-elfbare" i.e. "bare-metal".
> > > 
> > > MSP430: Disable TM clone registry by default
> > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00550.html
> > > MSP430: Disable __cxa_atexit
> > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00552.html
> > > 
> > > The patches tweak the CRT code to achieve the smallest possible code
> > > size, 
> > > and rely on some additional generic tweaks to crtstuff.c.
> > > 
> > > I did submit these tweaks a while ago, but I didn't get any feedback,
> > > however even if they are acceptable I suspect it is too late for GCC-
> > > 10 anyway:
> > > libgcc: Dont define __do_global_dtors_aux if it will be empty
> > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00417.html
> > > libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE
> > >   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00418.html
> > > 
> > > (The second one is a bit hacky, but without some way of removing the
> > > __dso_handle declaration, we end up with 150 bytes of unnecessary
> > > code in some
> > > programs.)
> > > 
> > > So for this patch crtstuff.c was copied to the msp430 subdirectory
> > > and the
> > > changes were made to that target specific version.
> > > 
> > > Tiny program size can now be achieved by configuring gcc for msp430-
> > > elfbare.
> > > 
> > > For example in an "empty main" program which loops forever:
> > > >   msp430-elfbare @ -Os:
> > > >    text    data     bss     dec     hex filename
> > > >      14       0       0      14       e a.out
> > > >   msp430-elf @ -Os:
> > > >    text    data     bss     dec     hex filename
> > > >     270       6       2     278     116 a.out
> > > 
> > > Successfully regtested msp430-elfbare vs msp430-elf.
> > > 
> > > Ok to apply?
> > > 
> > > P.S. This patch relies on the -fno-exceptions multilib patch
> > > submitted here:
> > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02523.html
> > > 
> > > P.P.S. This requires some minor configury tweaks to Newlib and GDB of
> > > the form:
> > > -  msp430*-*-elf)
> > > +  msp430-*-elf*)
> >   
> > > I'll apply these changes if the patch is accepted.
> > > From cff4611855d838315e793d45256de5fc8eeefafe Mon Sep 17 00:00:00
> > > 2001
> > > From: Jozef Lawrynowicz 
> > > Date: Mon, 25 Nov 2019 19:41:05 +
> > > Subject: [PATCH] MSP430: Add new msp430-elfbare target
> > > 
> > > contrib/ChangeLog:
> > > 
> > > 2019-11-29  Jozef Lawrynowicz  
> > > 
> > >   * config-list.mk: Add msp430-elfbare.
> > > 
> > > gcc/ChangeLog:
> > > 
> > > 2019-11-29  Jozef Lawrynowicz  
> > > 
> > >   * config.gcc: s/msp430*-*-*/msp430-*-*.
> > >   Handle msp430-*-elfbare.
> > >   * config/msp430/msp430-devices.c (TARGET_SUBDIR): Define.
> > >   (_MSPMKSTR): Define.
> > >   (__MSPMKSTR): Define.
> > >   (rest_of_devices_path): Use TARGET_SUBDIR value in string.
> > >   * config/msp430/msp430.c (msp430_option_override): Error if
> > >   -fuse-cxa-atexit is used when it has been disabled at configure
> > > time.
> > >   * config/msp430/t-msp430: Define TARGET_SUBDIR when building
> > >   msp430-devices.o.
> > >   * doc/install.texi: Document msp430-*-elf and msp430-*-elfbare.
> > >   * doc/invoke.texi: Update documentation about which path
> > > devices.csv is
> > >   searched for.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > > 2019-11-29  Jozef Lawrynowicz  
> > > 
> > >   * g++.dg/init/dso_handle1.C: Require cxa_atexit support.
> > >   * g++.dg/init/dso_handle2.C: Likewise.
> > >   * g++.dg/other/cxa-atexit1.C: Likewise.
> > >   * gcc.target/msp430/msp430.exp: Update csv-using-installed.c
> > > test to
> > >   handle msp430-elfbare configuration.
> > > 
> > > libgcc/ChangeLog:
> > > 
> > > 2019-11-29  Jozef Lawrynowicz  
> > > 
> > >   * config.host: Use t-msp430-elfbare-crtstuff Makefile fragment
> > > when GCC
> > >   is configured for the msp430-elfbare target.
> > >   * config/msp430/msp430-elfbare-crtstuff.c: New file.
> > >   * config/msp430/t-msp430: Remove Makefile rules for object
> > > files
> > >   built from crtstuff.c
> > >   * config/msp430/t-msp430-crtstuff: New file.
> > >   * config/msp430/t-msp430-elfbare-crtstuff: New file.
> > >   * configure: Regenerate.
> > >   * configure.ac: Disable TM clone registry by default for
> > >   msp430-elfbare.
> > OK.   I probably would have tried to avoid msp430-elfbare-crtstuff, but
> > it's not a huge wart IMHO.  
> 
> If we get the __dso_handle removal into the generic libgcc/crtstuff.c those
> changes won't be necessary.

I've attached an amended patch assuming the removal of __dso_handle
when !DEFAULT_USE_CXA_ATEXIT is approved. Differences vs the original patch are
only in libgcc/, removed the

Re: [testsuite][arm] Remove xfail for vect-epilogues test

2019-12-11 Thread Richard Biener

On December 11, 2019 12:27:31 PM GMT+01:00, "Andre Vieira (lists)" 
 wrote:
>Hi,
>
>We can now vectorize an epilogue for this loop for arm too, so removing
>xfail.
>
>Is this OK for trunk? Wasn't entirely sure whether I could commit this 
>under obvious.

Sure. 

Richard. 

>gcc/testsuite/ChangeLog:
>2019-12-11  Andre Vieira  
>
> * gcc.dg/vect/vect-epilogues.c: Remove xfail for arm.

arm: Fix an incorrect warning when -mcpu=cortex-a55 is used with -mfloat-abi=soft

2019-12-11 Thread Richard Earnshaw (lists)

When a CPU such as cortex-a55 is used with the soft-float ABI variant, 
the compiler is incorrectly issuing a warning about a mismatch between 
the architecture (generated internally) and the CPU.  This is not 
expected or intended.


The problem stems from the fact that we generate (correctly) an 
architecture for a soft-float compilation, but then try to compare it 
against the one recorded for the CPU.  Normally we strip out the 
floating point information before doing that comparison, but we 
currently only do that for the features that can be affected by the 
-mfpu option.  For a soft-float environment we also need to strip out 
any bits that depend on having floating-point present.
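
As a rough illustration of the comparison described above, the following sketch uses plain bitmasks. The bit names and both helper functions are hypothetical; the real port works on generated isa bitmaps, so this only shows the idea of widening the ignored set when the target has no FP at all.

```cpp
#include <cstdint>

using IsaBits = std::uint32_t;

// Hypothetical feature bits, not the real isa bit numbering.
constexpr IsaBits kArmv8_2 = 1u << 0;
constexpr IsaBits kVfp     = 1u << 1;  // can be changed by -mfpu
constexpr IsaBits kNeon    = 1u << 2;  // can be changed by -mfpu
constexpr IsaBits kFp16    = 1u << 3;  // needs FP, but is outside -mfpu
constexpr IsaBits kDotprod = 1u << 4;  // needs FP/SIMD, but is outside -mfpu

constexpr IsaBits kFpuInternal = kVfp | kNeon;
constexpr IsaBits kAllFp = kFpuInternal | kFp16 | kDotprod;

// Behaviour described as the problem: only the -mfpu bits are stripped.
bool matches_before(IsaBits target, IsaBits cpu) {
  return (target & ~kFpuInternal) == (cpu & ~kFpuInternal);
}

// Behaviour described as the fix: if the target has no FP bits enabled,
// ignore every FP-dependent bit; otherwise keep the old masking.
bool matches_after(IsaBits target, IsaBits cpu) {
  const IsaBits ignore = (target & kAllFp) == 0 ? kAllFp : kFpuInternal;
  return (target & ~ignore) == (cpu & ~ignore);
}
```

In this model a soft-float target containing only kArmv8_2 mismatches a CPU with fp16 and dotprod under matches_before, which corresponds to the spurious warning, and matches under matches_after.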


So this patch implements that and does a bit of housekeeping at the same 
time:


- in arm-cpus.in it is not necessary for a CPU to specify both +dotprod 
and +simd in its architecture specification, since +dotprod implies +simd.


- I've refactored the ALL_SIMD fgroup in arm-cpus.in, creating a new 
subgroup ALL_SIMD_EXTERNAL that contains the bits that were previously 
added directly to ALL_SIMD.  Similarly, I've added an ALL_FPU_EXTERNAL 
subgroup.


- in arm.c rename fpu_bitlist and all_fpubits to fpu_bitlist_internal 
and all_fpubits_internal for consistency with the fgroup bits which they 
contain.


* config/arm/arm-cpus.in (ALL_SIMD_EXTERNAL): New fgroup.
(ALL_SIMD): Use it.
(ALL_FPU_EXTERNAL): New fgroup.
(ALL_FP): Use it.
(cortex-a55, cortex-a75, cortex-a76, cortex-a76ae): Remove redundant
+simd from architecture specification.
(cortex-a77, neoverse-n1, cortex-a75.cortex-a55): Likewise.
* config/arm/arm.c (isa_all_fpubits, fpu_bitlist): Rename to ...
(isa_all_fpubits_internal, fpu_bitlist_internal): ... these.
(isa_all_fpbits): New bitmap.
(arm_option_override): Initialize it.
(arm_configure_build_target): If the target isa does not have any
FP enabled, do not warn about mismatches in FP-related feature bits.

Committed to trunk.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 44e6cc6bdb6..7090775aa7e 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -213,15 +213,18 @@ define fgroup ALL_CRYPTO	crypto
 # strip off 32 D-registers, but does not remove support for
 # double-precision FP.
 define fgroup ALL_SIMD_INTERNAL	fp_d32 neon ALL_CRYPTO
-define fgroup ALL_SIMD	ALL_SIMD_INTERNAL dotprod fp16fml
+define fgroup ALL_SIMD_EXTERNAL dotprod fp16fml
+define fgroup ALL_SIMD	ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
 
 # List of all FPU bits to strip out if -mfpu is used to override the
 # default.  fp16 is deliberately missing from this list.
 define fgroup ALL_FPU_INTERNAL	vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
-
 # Similarly, but including fp16 and other extensions that aren't part of
 # -mfpu support.
-define fgroup ALL_FP	fp16 ALL_FPU_INTERNAL
+define fgroup ALL_FPU_EXTERNAL fp16
+
+# Everything related to the FPU extensions (FP or SIMD).
+define fgroup ALL_FP	ALL_FPU_EXTERNAL ALL_FPU_INTERNAL ALL_SIMD
 
 define fgroup ARMv4   armv4 notm
 define fgroup ARMv4t  ARMv4 thumb
@@ -1301,7 +1304,7 @@ begin cpu cortex-a55
  cname cortexa55
  tune for cortex-a53
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  option nofp remove ALL_FP
  costs cortex_a53
@@ -1313,7 +1316,7 @@ begin cpu cortex-a75
  cname cortexa75
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a73
  vendor 41
@@ -1324,7 +1327,7 @@ begin cpu cortex-a76
  cname cortexa76
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a57
  vendor 41
@@ -1335,7 +1338,7 @@ begin cpu cortex-a76ae
  cname cortexa76ae
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a57
  vendor 41
@@ -1346,7 +1349,7 @@ begin cpu cortex-a77
  cname cortexa77
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a57
  vendor 41
@@ -1358,7 +1361,7 @@ begin cpu neoverse-n1
  alias !ares
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a57
  vendor 41
@@ -1370,7 +1373,7 @@ begin cpu cortex-a75.cortex-a55
  cname cortexa75cortexa55
  tune for cortex-a53
  tune flags LDSCHED
- architecture armv8.2-a+fp16+dotprod+simd
+ architecture armv8.2-a+fp16+dotprod
  option crypto add FP_ARMv8 CRYPTO
  costs cortex_a73
 end cpu

Re: [Patch, committed] libgomp – spelling fixes, incl. omp_lib.h.in

2019-12-11 Thread Jakub Jelinek

On Wed, Dec 11, 2019 at 12:48:10PM +0100, Tobias Burnus wrote:
> --- libgomp/ChangeLog (revision 279217)
> +++ libgomp/ChangeLog (revision 279218)
> @@ -1,5 +1,25 @@
>  2019-12-11  Tobias Burnus  
>  
> + * omp_lib.h.in: Fix spelling of function declaration
> + omp_get_cancell(l)ation.

> --- libgomp/omp_lib.h.in  (revision 279217)
> +++ libgomp/omp_lib.h.in  (revision 279218)
> @@ -102,8 +102,8 @@
>external omp_in_final
>logical(4) omp_in_final
>  
> -  external omp_get_cancelllation
> -  logical(4) omp_get_cancelllation
> +  external omp_get_cancellation
> +  logical(4) omp_get_cancellation
>  
>external omp_get_proc_bind
>integer(omp_proc_bind_kind) omp_get_proc_bind

Could you please backport this to 9 and 8 branches?  The other changes
aren't needed there.

Thanks.

Jakub

Re: [PATCH 3/3] libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE

2019-12-11 Thread Jozef Lawrynowicz

On Mon, 9 Dec 2019 13:05:22 +
Jozef Lawrynowicz  wrote:

> On Sat, 07 Dec 2019 11:27:54 -0700
> Jeff Law  wrote:
> 
> > On Wed, 2019-11-06 at 16:19 +, Jozef Lawrynowicz wrote:  
> > > From 7bc0971d2936ebe71e7b7d3d805cf1bbf9f9f5af Mon Sep 17 00:00:00 2001
> > > From: Jozef Lawrynowicz 
> > > Date: Mon, 4 Nov 2019 17:38:13 +
> > > Subject: [PATCH 3/3] libgcc: Implement TARGET_LIBGCC_REMOVE_DSO_HANDLE
> > > 
> > > gcc/ChangeLog:
> > > 
> > > 2019-11-06  Jozef Lawrynowicz  
> > > 
> > >   * doc/tm.texi: Regenerate.
> > >   * doc/tm.texi.in: Define TARGET_LIBGCC_REMOVE_DSO_HANDLE.
> > > 
> > > libgcc/ChangeLog:
> > > 
> > > 2019-11-06  Jozef Lawrynowicz  
> > > 
> > >   * crtstuff.c: Don't declare __dso_handle if
> > >   TARGET_LIBGCC_REMOVE_DSO_HANDLE is defined.
> > Presumably you'll switch this on for your bare elf target
> > configuration?  
> 
> Yep that's the plan. I originally didn't include the target changes in
> this patch since other target changes (disabling __cxa_atexit) were required
> for the removal of __dso_handle to be OK.
> 
> > 
> > Are there other things, particularly related to shared library support,
> > that we wouldn't need to use as well?  The reason I ask is I'm trying
> > to figure out if REMOVE_DSO_HANDLE is the right name or if we should
> > generalize it to a name that indicates shared libraries in general
> > aren't supported on the target.  
> 
> CRTSTUFFS_O is defined when compiling shared versions of crt{begin,end} and
> handles an extra case in crtstuff.c where there's some shared library related
> functionality we don't need on MSP430.
> 
> But when CRTSTUFFS_O is undefined __dso_handle is still declared - to 0. The
> comment gives some additional insight:
> 
> /* Declare the __dso_handle variable.  It should have a unique value  
>in every shared-object; in a main program its value is zero.  The  
>object should in any case be protected.  This means the instance   
>in one DSO or the main program is not used in another object.  The 
>dynamic linker takes care of this.  */ 
> 
> I haven't noticed any further shared library-related bloat coming from libgcc.
> 
> I think a better way of solving this problem is just to check
> DEFAULT_USE_CXA_ATEXIT rather than adding this new macro. If __cxa_atexit is
> not enabled then as far as I understand __dso_handle serves no purpose.
> DEFAULT_USE_CXA_ATEXIT is defined at configure time for any targets that want
> __cxa_atexit support.
> 
> A quick bootstrap and test of dg.exp on x86_64-pc-linux-gnu shows no issues
> with the following:
> 
> > diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
> > index ae6328d317d..349f8191e61 100644
> > --- a/libgcc/crtstuff.c
> > +++ b/libgcc/crtstuff.c
> > @@ -340,8 +340,10 @@ extern void *__dso_handle __attribute__ ((__visibility__ ("hidden")));
> >  #ifdef CRTSTUFFS_O
> >  void *__dso_handle = &__dso_handle;
> >  #else
> > +#if DEFAULT_USE_CXA_ATEXIT
> >  void *__dso_handle = 0;
> >  #endif
> > +#endif
> >  
> >  /* The __cxa_finalize function may not be available so we use only a
> > weak declaration.  */  
> 
> I'll put that patch through some more rigorous testing.

Successfully bootstrapped and regtested the attached patch for
x86_64-pc-linux-gnu (where DEFAULT_USE_CXA_ATEXIT is set to 1) and the proposed
msp430-elfbare target (where DEFAULT_USE_CXA_ATEXIT is set to 0).
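
The guard itself is simple enough to sketch in isolation. The snippet below is not crtstuff.c: demo_dso_handle and handle_bytes are hypothetical names, and the fallback definition of DEFAULT_USE_CXA_ATEXIT as 0 is an assumption that models the msp430-elfbare configuration (configure would normally supply the value).

```cpp
// Assumed to come from configure; 0 models a target without __cxa_atexit.
#ifndef DEFAULT_USE_CXA_ATEXIT
#define DEFAULT_USE_CXA_ATEXIT 0
#endif

#if DEFAULT_USE_CXA_ATEXIT
// Hypothetical stand-in for __dso_handle in a main program (value zero).
void *demo_dso_handle = nullptr;
#endif

// Number of data bytes the handle contributes in this model.
constexpr unsigned handle_bytes() {
#if DEFAULT_USE_CXA_ATEXIT
  return sizeof(void *);
#else
  return 0;
#endif
}
```

Compiling the same file with -DDEFAULT_USE_CXA_ATEXIT=1 brings the variable back, which is the behaviour expected on x86_64-pc-linux-gnu.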

Ok to apply?
> 
> Thanks,
> Jozef
> > 
> > Jeff  

From fc2564803c33229184926a5ac6db62ae36ea8d77 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Mon, 9 Dec 2019 15:20:23 +
Subject: [PATCH] libgcc: Only define __dso_handle if DEFAULT_USE_CXA_ATEXIT is
 true

libgcc/ChangeLog:

2019-12-11  Jozef Lawrynowicz  

	* crtstuff.c: Declare __dso_handle only if DEFAULT_USE_CXA_ATEXIT is
	true.

---
 libgcc/crtstuff.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
index 9346cc5ca54..e282cb1aabb 100644
--- a/libgcc/crtstuff.c
+++ b/libgcc/crtstuff.c
@@ -325,11 +325,14 @@ register_tm_clones (void)
 
 #ifdef OBJECT_FORMAT_ELF
 
+#if DEFAULT_USE_CXA_ATEXIT
 /* Declare the __dso_handle variable.  It should have a unique value
in every shared-object; in a main program its value is zero.  The
object should in any case be protected.  This means the instance
in one DSO or the main program is not used in another object.  The
-   dynamic linker takes care of this.  */
+   dynamic linker takes care of this.
+   If __cxa_atexit is not being used, __dso_handle will not be used and
+   doesn't need to be defined.  */
 
 #ifdef TARGET_LIBGCC_SDATA_SECTION
 extern void *__dso_handle __attribute__ ((__section__ (TARGET_LIBGCC_SDATA_SECTION)));
@@ -342,6 +345,7 @@ void *__dso_handle = &__dso_handle;
 #else
 void *__dso_handle = 0;
 #endif
+#endif /* DEFAULT_USE_CXA_ATEXIT */
 
 /* The __cxa_finalize function may not be available so we use only a
weak declaration.  */
-- 
2.17.1

[Patch, committed] libgomp – spelling fixes, incl. omp_lib.h.in

2019-12-11 Thread Tobias Burnus


This patch fixes various comment typos – and:

* omp_lib.h.in (Fortran): omp_get_cancell(l)ation – this typo could 
break user code; the test suite has 'use omp_lib' (i.e. the module) and 
not the #include file; hence, it didn't fail there.


* libgomp.texi: This one is also user visible.

Committed as Rev. 279218.

Tobias

Index: libgomp/team.c
===
--- libgomp/team.c	(revision 279217)
+++ libgomp/team.c	(revision 279218)
@@ -23,7 +23,7 @@
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
-/* This file handles the maintainence of threads in response to team
+/* This file handles the maintenance of threads in response to team
creation and termination.  */
 
 #include "libgomp.h"
Index: libgomp/ChangeLog
===
--- libgomp/ChangeLog	(revision 279217)
+++ libgomp/ChangeLog	(revision 279218)
@@ -1,5 +1,25 @@
 2019-12-11  Tobias Burnus  
 
+	* omp_lib.h.in: Fix spelling of function declaration
+	omp_get_cancell(l)ation.
+	* libgomp.texi (acc_is_present, acc_async_test, acc_async_test_all):
+	Fix typos.
+	* env.c: Fix comment typos.
+	* oacc-host.c: Likewise.
+	* ordered.c: Likewise.
+	* task.c: Likewise.
+	* team.c: Likewise.
+	* config/gcn/task.c: Likewise.
+	* config/gcn/team.c: Likewise.
+	* config/nvptx/task.c: Likewise.
+	* config/nvptx/team.c: Likewise.
+	* plugin/plugin-gcn.c: Likewise.
+	* testsuite/libgomp.fortran/jacobi.f: Likewise.
+	* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/enter_exit-lib.c: Likewise.
+
+2019-12-11  Tobias Burnus  
+
 	* testsuite/libgomp.oacc-fortran/optional-cache.f95: Add 'dg-do run'.
 	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: Remove
 	unnecessary 'dg-additional-options "-w"'.
@@ -1235,7 +1255,7 @@
 	(host_openacc_async_construct): New function.
 	(host_openacc_async_destruct): New function.
 	(struct gomp_device_descr host_dispatch): Remove initialization of old
-	interface, add intialization of new async sub-struct.
+	interface, add initialization of new async sub-struct.
 	* oacc-init.c (acc_shutdown_1): Adjust to use gomp_fini_device.
 	(goacc_attach_host_thread_to_device): Remove old async code usage.
 	* oacc-int.h (goacc_init_asyncqueues): New declaration.
@@ -6373,7 +6393,7 @@
 	* libgomp_g.h (GOACC_parallel): Remove.
 	(GOACC_parallel_keyed): Declare.
 	* plugin/plugin-nvptx.c (struct targ_fn_launch): New struct.
-	(stuct targ_gn_descriptor): Replace name field with launch field.
+	(struct targ_gn_descriptor): Replace name field with launch field.
 	(nvptx_exec): Lose separate geometry args, take array.  Process
 	dynamic dimensions and adjust.
 	(struct nvptx_tdata): Replace fn_names field with fn_descs.
@@ -6394,7 +6414,7 @@
 2015-09-08  Aditya Kumar  
 	Sebastian Pop  
 
-	* testsuite/libgomp.graphite/bounds.c (int foo): Modifed test case to
+	* testsuite/libgomp.graphite/bounds.c (int foo): Modified test case to
 	match o/p.
 	* testsuite/libgomp.graphite/force-parallel-1.c (void parloop): Same.
 	* testsuite/libgomp.graphite/force-parallel-4.c: Same.
@@ -6671,7 +6691,7 @@
 	* target.c (struct offload_image_descr): Constify target_data.
 	(gomp_offload_image_to_device): Likewise.
 	(GOMP_offload_register): Likewise.
-	(GOMP_offload_unrefister): Likewise.
+	(GOMP_offload_unregister): Likewise.
 	* plugin/plugin-host.c (GOMP_OFFLOAD_load_image,
 	GOMP_OFFLOAD_unload_image): Constify target data.
 	* plugin/plugin-nvptx.c (struct ptx_image_data): Constify target data.
@@ -7997,7 +8017,7 @@
 2014-12-12  Kyrylo Tkachov  
 
 	* testsuite/lib/libgomp.exp: Load target-utils.exp.
-	Move load of target-supportes.exp earlier.
+	Move load of target-supports.exp earlier.
 
 2014-12-10  Ilya Verbin  
 
@@ -8484,7 +8504,7 @@
 
 2013-12-17  Andreas Tobler  
 
-	* testsuite/libgomp.c/affinity-1.c: Remove alloca.h inlcude. Replace
+	* testsuite/libgomp.c/affinity-1.c: Remove alloca.h include. Replace
 	alloca () with __builtin_alloca ().
 	* testsuite/libgomp.c/icv-2.c: Add FreeBSD coverage.
 	* testsuite/libgomp.c/lock-3.c: Likewise.
@@ -8644,7 +8664,7 @@
 	(gomp_team_end): Use gomp_managed_threads_lock instead of
 	gomp_remaining_threads_lock.  Use gomp_team_barrier_wait_final instead
 	of gomp_team_barrier_wait.  If team->team_cancelled, call
-	gomp_fini_worshare on ws chain starting at team->work_shares_to_free
+	gomp_fini_workshare on ws chain starting at team->work_shares_to_free
 	rather than thr->ts.work_share.
 	(initialize_team): Don't call gomp_sem_init here.
 	* sections.c (GOMP_parallel_sections_start): Adjust gomp_team_start
@@ -12019,7 +12039,7 @@
 
 	* configure.ac: Determine whether -pthread or -lpthread is needed.
 	* Makefile.am (libgomp_la_LDFLAGS): Remove explicit -lpthread.
-	* Makefine.in, configure: Rebuild.
+	* Makefile.in, configure: Rebuild.
 
 2005-09-28  Richard Henderson  
 
Index:

[testsuite][arm] Remove xfail for vect-epilogues test

2019-12-11 Thread Andre Vieira (lists)


Hi,

We can now vectorize an epilogue for this loop for arm too, so removing 
xfail.


Is this OK for trunk? Wasn't entirely sure whether I could commit this 
under obvious.


gcc/testsuite/ChangeLog:
2019-12-11  Andre Vieira  

* gcc.dg/vect/vect-epilogues.c: Remove xfail for arm.
diff --git a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
index 94e918ff9d019f07c0d891a9148692f86c92..de95310a65eed78e1f75c4cd7581f9f7a86afd16 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
@@ -16,4 +16,4 @@ void pixel_avg( unsigned char *dst, int i_dst_stride,
  }
  }
 
-/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { xfail { arm*-*-* } } } } */
+/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" } } */

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread Jonathan Wakely


On 11/12/19 11:16 +, Jonathan Wakely wrote:

On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous 
namespace usage outside std namespace. Let me know if it was 
intentional.


It was intentional, why move it?

Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.

The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include  to declare that).


I was curious why that qualification is needed. The problem is that
 is being indirectly included by some other header, and so is
, so the declarations visible are ::isdigit(int) and
std::__8::isdigit(CharT, const locale&). Even after including
 we still can't call it unqualified, because  doesn't
use the versioned namespace:

namespace std
{
 using ::isalnum;
 using ::isalpha;
 using ::iscntrl;
 using ::isdigit;

In any case, I think the correct fix is to #include  and add
the std:: qualification. There should be no need to change any
namespace nesting.
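
A minimal sketch of the inline-namespace point, using a hypothetical namespace demo with an inline namespace v8 in place of std and std::__8 (this is not the libstdc++ source, and init is a made-up member standing in for _M_init):

```cpp
#include <cctype>
#include <string>
#include <type_traits>

namespace demo {
inline namespace v8 {  // hypothetical stand-in for the versioned namespace
struct random_device {
  int init(const std::string& token);
};
}  // inline namespace v8
}  // namespace demo

// Because v8 is inline, the unversioned name already denotes the versioned type.
static_assert(std::is_same<demo::random_device, demo::v8::random_device>::value,
              "inline namespace members are visible in the enclosing namespace");

namespace demo {
// Defined in the enclosing namespace without reopening v8.
int random_device::init(const std::string& token) {
  // Qualify with std:: and cast to unsigned char, as in the reviewed change.
  if (token == "mt19937" || std::isdigit(static_cast<unsigned char>(token[0])))
    return 0;  // models the "default" path
  return 1;    // models passing the token through unchanged
}
}  // namespace demo
```

The member definition compiles without any BEGIN/END version macros around it, which is the property the review relies on.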

Re: [PATCH] Fix gnu-versioned-namespace build

2019-12-11 Thread Jonathan Wakely


On 11/12/19 08:29 +0100, François Dumont wrote:

I plan to commit this tomorrow.

Note that rather than just adding the missing 
_GLIBCXX_[BEGIN,END]_VERSION_NAMESPACE I also move anonymous namespace 
usage outside std namespace. Let me know if it was intentional.


It was intentional, why move it?

Adding the BEGIN/END_VERSION macros is unnecessary. Those namespaces
are inline, so std::random_device already refers to
std::__8::random_device when the original declaration was in the
versioned namespace.

The only fix needed here seems to be qualifying std::isdigit (and
strictly-speaking we should also include  to declare that).



    * src/c++11/random.cc: Add _GLIBCXX_BEGIN_NAMESPACE_VERSION and
    _GLIBCXX_END_NAMESPACE_VERSION. Move anonymous namespace outside std
    namespace.

Tested under Linux x86_64 normal/debug/versioned namespace modes.

There are still tests failing in versioned namespace, more patches to come.

François




diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 10fbe1dc4c4..d4ebc9556ab 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -73,8 +73,6 @@
# define USE_MT19937 1
#endif

-namespace std _GLIBCXX_VISIBILITY(default)
-{
namespace
{
#if USE_RDRAND
@@ -124,6 +122,10 @@ namespace std _GLIBCXX_VISIBILITY(default)
#endif
}

+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
  void
  random_device::_M_init(const std::string& token)
  {
@@ -286,7 +288,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
_M_mt.seed(seed);
#else
// Convert old default token "mt19937" or numeric seed tokens to "default".
-if (token == "mt19937" || isdigit((unsigned char)token[0]))
+if (token == "mt19937" || std::isdigit((unsigned char)token[0]))
  _M_init("default");
else
  _M_init(token);
@@ -407,5 +409,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
0x9d2c5680UL, 15,
0xefc6UL, 18, 1812433253UL>;
#endif // USE_MT19937
-}
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace std
#endif // _GLIBCXX_USE_C99_STDINT_TR1

Re: [Patch][OpenMP/OpenACC/Fortran] Fix mapping of optional (present|absent) arguments

2019-12-11 Thread Tobias Burnus


Hi Thomas,

[Attached patch committed as Rev. 279217]
[One OpenMP (+OpenACC) patch and one OpenACC-only patch pending review 
are linked below.]


On 12/7/19 3:49 PM, Thomas Schwinge wrote:


I'm seeing:
 [-PASS:-]{+FAIL:+} libgomp.fortran/use_device_addr-3.f90   -O1  execution test


Whether it passed or not depended on whether the stack was NULL for the 
local "array_arg.0" variable (which got assigned "array_arg->data" if 
the argument was present). Fixed by the following patch, which also 
fixes some more corner cases:


https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00707.html – pending 
Jakub's review



Except for trivial changes to libgomp/oacc-mem.c

Just because something is just a few lines of code doesn't mean that it's
trivial.  I had asked you to first resolve that issue separately
(referencing PR92726)


I vaguely remembered this email – but couldn't find it in the 
thread. (This thread has 24 emails!)
Turned out that you wrote that email in a different thread – and while 
this patch is mainly about Fortran, your email was also not sent to 
fortran@ (arguably, your quoted text didn't include any Fortran bits).


Pending OpenACC review, from the thread you mentioned: 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00062.html



--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
@@ -0,0 +1,23 @@
+! Test that the cache directives work with optional arguments.  [...]

Missing '{ dg-do run }'.


Fixed/committed.


+! of giving a non-present argument to the cache directive is not tested as
+! it is undefined. […]
+! The effect of
+! non-present aguments in firstprivate clauses is undefined […] The effect of
+! non-present arguments in reduction clauses is undefined

Once you've got access, please file a ticket at
 so this gets clarified.


I will try to remember this – it's now on my to-do list.


+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
+! { dg-additional-options "-w" }

Why that?


No idea (was added by Kwok) – I do see warnings with "-Wall", but not 
with default options. (It then shows warnings like: "‘rg.25’ is used 
uninitialized in this function [-Wuninitialized]".)


I have now also removed this line → committed after testing that it runs 
through with nvptx.


Tobias

commit 6d8c93a07d3844bfcf646d3e53a8d57eb034c509
Author: burnus 
Date:   Wed Dec 11 10:40:11 2019 +

[OpenMP/OpenACC/Fortran] Fix mapping of optional (present|absent) arguments

* testsuite/libgomp.oacc-fortran/optional-cache.f95: Add 'dg-do run'.
* testsuite/libgomp.oacc-fortran/optional-reduction.f90: Remove
unnecessary 'dg-additional-options "-w"'.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279217 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 83227032f88..14154088c95 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Tobias Burnus  
+
+	* testsuite/libgomp.oacc-fortran/optional-cache.f95: Add 'dg-do run'.
+	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: Remove
+	unnecessary 'dg-additional-options "-w"'.
+
 2019-12-09  Thomas Schwinge  
 	Julian Brown  
 
@@ -11109,7 +5,7 @@
 	PR libgomp/30546
 	* configure.ac: Add check for makeinfo
 	* Makefile.am: Redefined target libgomp.info, build libgomp.info only
-	if an appropiate version of makeinfo is found.
+	if an appropriate version of makeinfo is found.
 	* aclocal.m4: Regenerated.
 	* configure: Regenerated.
 	* Makefile.in: Regenerated.
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95 b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
index 00f7472ae6e..0d48e2bd786 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-cache.f95
@@ -1,3 +1,4 @@
+! { dg-do run }
 ! Test that the cache directives work with optional arguments.  The effect
 ! of giving a non-present argument to the cache directive is not tested as
 ! it is undefined.  The test is based on gfortran.dg/goacc/cache-1.f95.
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
index b76db3ef6d3..29f92c0d4c3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-reduction.f90
@@ -3,7 +3,6 @@
 ! for.  The tests are based on those in reduction-1.f90.
 
 ! { dg-do run }
-! { dg-additional-options "-w" }
 
 program optional_reduction
   implicit none

Re: [PATCH] Rename condition_variable_any::wait_on_* methods

2019-12-11 Thread Jonathan Wakely


On 10/12/19 22:38 -0800, Thomas Rodgers wrote:

* include/std/condition_variable
(condition_variable_any::wait_on(_Lock&, stop_token, _Predicate): Rename
to match current draft standard.
(condition_variable_any::wait_on_until(_Lock&, stop_token,
const chrono::time_point<>&, _Predicate): Likewise.
(condition_variable_any::wait_on_for(_Lock&, stop_token,
const chrono::duration<>&, _Predicate(: Likewise.


The closing paren here is an opening one. OK for trunk with that
fixed.

Optional tweaks ...

Since the names wait_on, wait_on_until and wait_on_for are
unambiguous, you could just list them without parameters i.e.

* include/std/condition_variable (condition_variable_any::wait_on)
   (condition_variable_any::wait_on_until)
   (condition_variable_any::wait_on_for): Rename to match current
   draft standard.


* testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc 
(main):


We don't generally bother to say which functions were modified in tests,
so the (main) isn't needed here, but is harmless.


Adjust tests to account for renamed methods.

Re: Ping: [GCC][PATCH] Add ARM-specific Bfloat format support to middle-end

2019-12-11 Thread Kyrill Tkachov


Hi all,

On 12/11/19 9:41 AM, Stam Markianos-Wright wrote:



On 12/11/19 3:48 AM, Jeff Law wrote:
> On Mon, 2019-12-09 at 13:40 +, Stam Markianos-Wright wrote:
>>
>> On 12/3/19 10:31 AM, Stam Markianos-Wright wrote:
>>>
>>> On 12/2/19 9:27 PM, Joseph Myers wrote:
 On Mon, 2 Dec 2019, Jeff Law wrote:

>> 2019-11-13  Stam Markianos-Wright <
>> stam.markianos-wri...@arm.com>
>>
>>  * real.c (struct arm_bfloat_half_format,
>>  encode_arm_bfloat_half, decode_arm_bfloat_half): New.
>>  * real.h (arm_bfloat_half_format): New.
>>
>>
> Generally OK.  Please consider using "arm_bfloat_half" instead
> of
> "bfloat_half" for the name field in the arm_bfloat_half_format
> structure.  I'm not sure if that's really visible externally,
> but it
>>> Hi both! Agreed that we want to be conservative. See latest diff
>>> attached with the name field change (also pasted below).
>>
>> .Ping :)
> Sorry if I wasn't clear.  With the name change I considered this OK for
> the trunk.  Please install on the trunk.
>
> If you don't have commit privs let me know.

Ahh ok gotcha! Sorry I'm new here, and yes, I don't have commit
privileges, yet!



I've committed this on Stam's behalf with r279216.

Thanks,

Kyrill



Cheers,
Stam
>
>
> Jeff
>
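
As background for the encode/decode pair named in the ChangeLog, here is a 
minimal sketch of the bfloat16 bit layout: the format keeps the sign bit, 
the 8 exponent bits and the top 7 mantissa bits of an IEEE binary32 value. 
This is not the real.c implementation; the function names are hypothetical, 
the conversion truncates instead of rounding, and it assumes that float is 
IEEE binary32 on the host.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Keeps the upper 16 bits of the binary32 encoding (truncation, no rounding).
std::uint16_t float_to_bfloat16_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return static_cast<std::uint16_t>(bits >> 16);
}

// Widens bfloat16 bits back to binary32 by zero-filling the low mantissa bits.
float bfloat16_bits_to_float(std::uint16_t half) {
    std::uint32_t bits = static_cast<std::uint32_t>(half) << 16;
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Values whose low 16 mantissa bits are zero, such as 1.0f or -2.0f, survive 
the round trip exactly; others lose precision.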

Re: [PATCH]Add tune option for integer mask cmov, enable this tune for m_CORE_AVX512

2019-12-11 Thread Jakub Jelinek

On Wed, Dec 11, 2019 at 06:18:31PM +0800, Hongtao Liu wrote:
> Hi:
>   This patch is about to add tune option for integer mask cmov, for
> some targets has both integer mask register and sse mask register,
> this tune indicates to use integer one. Currently it's default on for
> m_CORE_AVX512.
> 
>   Bootstrap is ok, regression test on i386/x86_64 backends is ok.
>   ok for trunk?

I don't see the need for that right now, doesn't m_CORE_AVX512 include
all CPUs that support AVX512VL right now?  If yes, the whole effect of the
patch will be that the masked registers won't be used in generic tuning,
something most people actually use.

I think it is worth adding something like this only when some other AVX512VL
capable CPUs appear and we know what will perform better on those.

Jakub

[PATCH]Add tune option for integer mask cmov, enable this tune for m_CORE_AVX512

2019-12-11 Thread Hongtao Liu

Hi:
  This patch adds a tune option for integer mask cmov. Some targets
have both integer mask registers and SSE mask registers; this tune
tells the compiler to use the integer ones. Currently it is on by
default for m_CORE_AVX512.

  Bootstrap is ok, regression test on i386/x86_64 backends is ok.
  ok for trunk?

Changelog
gcc/
* config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Return
false if the target does not prefer integer mask cmov for
128/256-bit vectors under avx512f.
* config/i386/i386.h (TARGET_PREFER_INTEGER_MASK_CMOV): New
macro.
* config/i386/x86-tune.def
(X86_TUNE_PREFER_INTEGER_MASK_CMOV): New tune.

gcc/testsuite
* gcc.target/i386/avx512bw-pr92686-movcc-1.c: Adjust test case.
* gcc.target/i386/avx512bw-pr92686-movcc-2.c: Ditto.
* gcc.target/i386/avx512bw-pr92686-vpcmp-1.c: Ditto.
* gcc.target/i386/avx512bw-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-movcc-1.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-movcc-2.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: Ditto.
* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
* gcc.target/i386/avx512vl-pr88547-1.c: Ditto.
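
A standalone model of the early checks that the i386-expand.c hunk in this 
patch shows, to make the effect of the new tune flag concrete. TargetModel, 
ModeModel and passes_visible_mask_cmp_checks are hypothetical names; the 
real ix86_valid_mask_cmp_mode works on GCC's target macros and machine_mode 
and performs further checks that are not visible in the posted excerpt.

```cpp
// Hypothetical stand-ins for TARGET_XOP, TARGET_AVX512F and the new tune flag.
struct TargetModel {
    bool xop;
    bool avx512f;
    bool prefer_integer_mask_cmov;
};

// Hypothetical stand-in for VECTOR_MODE_P (mode) and GET_MODE_SIZE (mode).
struct ModeModel {
    bool is_vector;
    unsigned size_bytes;
};

// Mirrors only the three checks visible in the posted hunk.
bool passes_visible_mask_cmp_checks(const TargetModel &t, const ModeModel &m) {
    if (t.xop && !t.avx512f)
        return false;
    // Without the tune, only 64-byte (512-bit) vectors use integer masks.
    if (!t.prefer_integer_mask_cmov && m.size_bytes != 64)
        return false;
    // AVX512F is needed for mask operations.
    if (!(t.avx512f && m.is_vector))
        return false;
    return true;  // the real function continues with checks not shown here
}
```

Under this model, a 256-bit vector mode is rejected when the tune is off and 
accepted when it is on, while 512-bit modes are unaffected by the flag.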


-- 
BR,
Hongtao
From 716bdede7f23ef035d93fb1d4f6917e19cef5f3e Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Wed, 11 Dec 2019 16:38:04 +0800
Subject: [PATCH] Add tune option for integer mask cmov, enable this tune for m_CORE_AVX512

Changelog
gcc/
	* config/i386/i386-expand.c (ix86_valid_mask_cmp_mode): Return
	false if the target does not prefer integer mask cmov for
	128/256-bit vectors under avx512f.
	* config/i386/i386.h (TARGET_PREFER_INTEGER_MASK_CMOV): New
	macro.
	* config/i386/x86-tune.def
	(X86_TUNE_PREFER_INTEGER_MASK_CMOV): New tune.

gcc/testsuite
	* gcc.target/i386/avx512bw-pr92686-movcc-1.c: Adjust test case.
	* gcc.target/i386/avx512bw-pr92686-movcc-2.c: Ditto.
	* gcc.target/i386/avx512bw-pr92686-vpcmp-1.c: Ditto.
	* gcc.target/i386/avx512bw-pr92686-vpcmp-2.c: Ditto.
	* gcc.target/i386/avx512vl-pr92686-movcc-1.c: Ditto.
	* gcc.target/i386/avx512vl-pr92686-movcc-2.c: Ditto.
	* gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: Ditto.
	* gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
	* gcc.target/i386/avx512vl-pr88547-1.c: Ditto.
---
 gcc/config/i386/i386-expand.c                      |  4 ++++
 gcc/config/i386/i386.h                             |  2 ++
 gcc/config/i386/x86-tune.def                       | 10 ++++++++++
 .../gcc.target/i386/avx512bw-pr92686-movcc-1.c     |  2 +-
 .../gcc.target/i386/avx512bw-pr92686-movcc-2.c     |  2 +-
 .../gcc.target/i386/avx512bw-pr92686-vpcmp-1.c     |  2 +-
 .../gcc.target/i386/avx512bw-pr92686-vpcmp-2.c     |  2 +-
 gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c |  6 +++---
 .../gcc.target/i386/avx512vl-pr92686-movcc-1.c     |  2 +-
 .../gcc.target/i386/avx512vl-pr92686-movcc-2.c     |  2 +-
 .../gcc.target/i386/avx512vl-pr92686-vpcmp-1.c     |  2 +-
 .../gcc.target/i386/avx512vl-pr92686-vpcmp-2.c     |  2 +-
 12 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index cbf4eb7..a627642 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3431,6 +3431,10 @@ ix86_valid_mask_cmp_mode (machine_mode mode)
   if (TARGET_XOP && !TARGET_AVX512F)
 return false;
 
+  /* For 512-bit vector, only integer mask vcmp/vcmov is valid.  */
+  if (!TARGET_PREFER_INTEGER_MASK_CMOV && GET_MODE_SIZE (mode) != 64)
+return false;
+
   /* AVX512F is needed for mask operation.  */
   if (!(TARGET_AVX512F && VECTOR_MODE_P (mode)))
 return false;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 2542cb3..23d796e 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -596,6 +596,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 	ix86_tune_features[X86_TUNE_USE_XCHG_FOR_ATOMIC_STORE]
 #define TARGET_EMIT_VZEROUPPER \
 	ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
+#define TARGET_PREFER_INTEGER_MASK_CMOV \
+	ix86_tune_features[X86_TUNE_PREFER_INTEGER_MASK_CMOV]
 
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 328535d..e944f39 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -467,6 +467,16 @@ DEF_TUNE (X86_TUNE_AVX128_OPTIMAL, "avx128_optimal", m_BDVER | m_BTVER2
 DEF_TUNE (X86_TUNE_AVX256_OPTIMAL, "avx256_optimal", m_CORE_AVX512)
 
 /*/
+/* AVX512 instruction selection tuning. */
+/*/
+
+/* X86_TUNE_PREFER_INTEGER_MASK_CMOV: Use integer mask vcmov/vcmp for
+   128/256-bit vectors under avx512f; there are also instructions
+   using sse regs as mask under avx2 or xop.  */

Re: Ping: [GCC][PATCH] Add ARM-specific Bfloat format support to middle-end

2019-12-11 Thread Stam Markianos-Wright



On 12/11/19 3:48 AM, Jeff Law wrote:
> On Mon, 2019-12-09 at 13:40 +, Stam Markianos-Wright wrote:
>>
>> On 12/3/19 10:31 AM, Stam Markianos-Wright wrote:
>>>
>>> On 12/2/19 9:27 PM, Joseph Myers wrote:
 On Mon, 2 Dec 2019, Jeff Law wrote:

>> 2019-11-13  Stam Markianos-Wright  <
>> stam.markianos-wri...@arm.com>
>>
>>  * real.c (struct arm_bfloat_half_format,
>>  encode_arm_bfloat_half, decode_arm_bfloat_half): New.
>>  * real.h (arm_bfloat_half_format): New.
>>
>>
> Generally OK.  Please consider using "arm_bfloat_half" instead
> of
> "bfloat_half" for the name field in the arm_bfloat_half_format
> structure.  I'm not sure if that's really visible externally,
> but it
>>> Hi both! Agreed that we want to be conservative. See latest diff
>>> attached with the name field change (also pasted below).
>>
>> .Ping :)
> Sorry if I wasn't clear.  With the name change I considered this OK for
> the trunk.  Please install on the trunk.
> 
> If you don't have commit privs let me know.

Ahh ok gotcha! Sorry I'm new here, and yes, I don't have commit 
privileges, yet!

Cheers,
Stam
> 
> 
> Jeff
>

[PATCH, committed] Fix PR92901: Change test expectation for C++ in OpenACC test clause-locations.c

2019-12-11 Thread Harwath, Frederik

Hi,
I have committed the attached trivial patch to trunk as r279215. The C and C++ 
front ends report the columns of the clause locations differently, and hence we 
need different test expectations for the two languages.

Best regards,
Frederik

r279215 | frederik | 2019-12-11 09:26:18 +0100 (Wed, 11 Dec 2019) | 12 lines

Fix PR92901: Change test expectation for C++ in OpenACC test clause-locations.c 

The columns of the clause locations that are reported for C and C++ are
different and hence we need separate test expectations for both languages.

2019-12-11  Frederik Harwath  

	PR other/92901
	/gcc/testsuite/
	* c-c++-common/goacc/clause-locations.c: Adjust test expectation for C++.




Index: gcc/testsuite/c-c++-common/goacc/clause-locations.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/clause-locations.c	(revision 279214)
+++ gcc/testsuite/c-c++-common/goacc/clause-locations.c	(working copy)
@@ -9,7 +9,9 @@
 #pragma acc loop reduction(+:sum)
 for (i = 1; i <= 10; i++)
   {
-#pragma acc loop reduction(-:diff) reduction(-:sum)  /* { dg-warning "53: conflicting reduction operations for .sum." } */
+#pragma acc loop reduction(-:diff) reduction(-:sum)
+	/* { dg-warning "53: conflicting reduction operations for .sum." "" { target c } .-1 } */
+	/* { dg-warning "56: conflicting reduction operations for .sum." "" { target c++ } .-2 } */
 	for (j = 1; j <= 10; j++)
 	  sum = 1;
   }
