date:20190718

[C++ PATCH] PR c++/90098 - partial specialization and class non-type parms.

2019-07-18 Thread Jason Merrill

A non-type template parameter of class type used in an expression has
const-qualified type; the pt.c hunks deal with this difference from the
unqualified type of the parameter declaration.  When we use such a
parameter as an argument to another template, we don't want to confuse
things by copying it, we should pass it straight through.  And we might as
well skip copying other classes in constant evaluation context in a
template, too; we'll get the copy semantics at instantiation time.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/90099
PR c++/90101
* call.c (build_converted_constant_expr_internal): Don't copy.
* pt.c (process_partial_specialization): Allow VIEW_CONVERT_EXPR
around class non-type parameter.
(unify) [TEMPLATE_PARM_INDEX]: Ignore cv-quals.
---
 gcc/cp/call.c|  5 +
 gcc/cp/pt.c  | 11 +++
 gcc/testsuite/g++.dg/cpp2a/nontype-class18.C | 17 +
 gcc/testsuite/g++.dg/cpp2a/nontype-class19.C | 13 +
 gcc/testsuite/g++.dg/cpp2a/nontype-class20.C | 13 +
 gcc/cp/ChangeLog | 10 ++
 6 files changed, 65 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class18.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class19.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class20.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e597d7ac919..38d229b1f33 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4278,6 +4278,11 @@ build_converted_constant_expr_internal (tree type, tree expr,
 
   if (conv)
 {
+  /* Don't copy a class in a template.  */
+  if (CLASS_TYPE_P (type) && conv->kind == ck_rvalue
+ && processing_template_decl)
+   conv = next_conversion (conv);
+
   conv->check_narrowing = true;
   conv->check_narrowing_const_only = true;
   expr = convert_like (conv, expr, complain);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index e23c0aaf325..53aaad1800a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4954,7 +4954,8 @@ process_partial_specialization (tree decl)
  simple identifier' condition and also the `specialized
  non-type argument' bit.  */
   && TREE_CODE (arg) != TEMPLATE_PARM_INDEX
- && !(REFERENCE_REF_P (arg)
+ && !((REFERENCE_REF_P (arg)
+   || TREE_CODE (arg) == VIEW_CONVERT_EXPR)
   && TREE_CODE (TREE_OPERAND (arg, 0)) == TEMPLATE_PARM_INDEX))
 {
   if ((!packed_args && tpd.arg_uses_template_parms[i])
@@ -22371,9 +22372,11 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
/* Template-parameter dependent expression.  Just accept it for now.
   It will later be processed in convert_template_argument.  */
;
-  else if (same_type_p (non_reference (TREE_TYPE (arg)),
-   non_reference (tparm)))
-   /* OK */;
+  else if (same_type_ignoring_top_level_qualifiers_p
+  (non_reference (TREE_TYPE (arg)),
+   non_reference (tparm)))
+   /* OK.  Ignore top-level quals here because a class-type template
+  parameter object is const.  */;
   else if ((strict & UNIFY_ALLOW_INTEGER)
   && CP_INTEGRAL_TYPE_P (tparm))
/* Convert the ARG to the type of PARM; the deduced non-type
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C
new file mode 100644
index 000..22f47884d08
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C
@@ -0,0 +1,17 @@
+// PR c++/90101
+// { dg-do compile { target c++2a } }
+
+template
+struct A;
+
+template typename List>
+struct A> {};
+
+template typename List, auto V>
+struct A> {};
+
+template
+struct B {};
+
+struct X { int value; };
+A> a2;
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C
new file mode 100644
index 000..91267aca383
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C
@@ -0,0 +1,13 @@
+// PR c++/90099
+// { dg-do compile { target c++2a } }
+
+struct Unit {
+  int value;
+  // auto operator<=>(const Unit&) = default;
+};
+
+template
+struct X {};
+
+template
+struct X {};
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C
new file mode 100644
index 000..5d3479c345e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C
@@ -0,0 +1,13 @@
+// PR c++/90098
+// { dg-do compile { target c++2a } }
+
+struct A {
+  int value;
+  // auto operator<=>(const A&) = default;
+};
+
+template
+struct Z {};
+
+template
+struct Z {};
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index c9091f523c5..cef36b2d1b2 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,13 @@
+2019-07-18  Jason Merrill  
+
+   PR c++/

Re: [PATCH, rs6000] Support vrotr3 for int vector types

2019-07-18 Thread Kewen.Lin

on 2019/7/19 3:48 AM, Segher Boessenkool wrote:
> On Thu, Jul 18, 2019 at 01:44:36PM +0800, Kewen.Lin wrote:
>> Hi Segher,
>>
>> on 2019/7/17 9:40 PM, Segher Boessenkool wrote:
>>> Hi Kewen,
>>>
>>> On Wed, Jul 17, 2019 at 04:32:15PM +0800, Kewen.Lin wrote:
>>>> Regression testing just launched, is it OK for trunk if it's bootstrapped
>>>> and regresstested on powerpc64le-unknown-linux-gnu?
>>>
>>>> +;; Expanders for rotatert to make use of vrotl
>>>> +(define_expand "vrotr3"
>>>> +  [(set (match_operand:VEC_I 0 "vint_operand")
>>>> +  (rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
>>>> +(match_operand:VEC_I 2 "vint_reg_or_const_vector")))]
>>>
>>> Having any rotatert in a define_expand or define_insn will regress.
>>>
>>> So, nope, sorry.
>>
>> Thanks for clarifying!  Since regression testing passed on powerpc64le, I'd
>> like to double confirm the meaning of "regress", does it mean it's 
>> a regression from design view?  Is it specific to rotatert and its 
>> related one like vrotr? 
> 
> You will get HAVE_rotatert defined in insn-config.h if you do this patch,
> and then simplify-rtx.c will not work correctly, generating rotatert by
> an immediate, which we have no instructions for.
> 
> This might be masked because many of our rl*.c tests already fail because
> of other changes, I should fix that :-/
> 

Hi Segher,

Thanks for the further explanation!  Sorry, but I couldn't find this
HAVE_rotatert definition.  I guess that is because the preparation
statements always end in "DONE", so the expander never really generates
rotatert?  I can still see rotatert in an insn like the one below, but only
in the REG_EQUAL note, which seems fine?

(insn 10 9 11 4 (set (reg:V4SI 122 [ vect__2.7 ])
(rotate:V4SI (reg:V4SI 121 [ vect__1.6 ])
(reg:V4SI 124))) "t.c":17:28 1596 {*altivec_vrlw}
 (expr_list:REG_EQUAL (rotatert:V4SI (reg:V4SI 121 [ vect__1.6 ])
(const_vector:V4SI [
(const_int 8 [0x8]) repeated x4
]))
(nil)))
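
The dump above pairs a left rotate (the vrlw insn) with a REG_EQUAL note
that describes the same result as a right rotate by 8.  For 32-bit elements
the two agree when the left-rotate count is 32 - 8 = 24; that register 124
holds 24 in every lane is an inference here, not something the dump shows.
A minimal scalar sketch of that identity, with hypothetical helper names
(this is not the rs6000 expander code):

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by N bits; N must be in [0, 31].  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  assert (n < 32);
  /* The "& 31" keeps the right-shift count in range when N is 0.  */
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Reference right rotate, written directly.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  assert (n < 32);
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Right rotate expressed through the left rotate: rotating right by N
   is the same as rotating left by (32 - N) mod 32.  */
static uint32_t
rotr32_via_rotl (uint32_t x, unsigned n)
{
  assert (n < 32);
  return rotl32 (x, (32 - n) & 31);
}
```

For example, rotr32_via_rotl (0x12345678, 8) and rotr32 (0x12345678, 8)
both give 0x78123456.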


Thanks,
Kewen

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-18 Thread Segher Boessenkool

Hi!

On Thu, Jul 18, 2019 at 08:45:38PM +0100, Jozef Lawrynowicz wrote:
>   PR target/70320
>   * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
>   * doc/tm.texi: Likewise.

"Regenerate." -- or did you edit this file by hand?  Don't, or don't tell
us anyway ;-)

>   strcmp for comparisons of asmspec with a register name if 

(Trailing space here, and elsewhere).

> +/* { dg-do compile } */
> +/* { dg-options "-ffixed-r6 -ffixed-R7" } */
> +/* { dg-final { scan-assembler "PUSH.*R4" } } */
> +/* { dg-final { scan-assembler "PUSH.*R5" } } */

scan-assembler does multi-line matching by default, so that .* probably
matches things you do not want it to match.  You can do things like

/* { dg-final { scan-assembler "(?n)PUSH.*R5" } } */

to make sure this is on one line at least.  See man re_syntax.
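
To illustrate the multi-line pitfall outside of DejaGnu: scan-assembler
uses Tcl regular expressions, but POSIX regcomp shows the same effect, and
its REG_NEWLINE flag plays roughly the role of (?n) here.  This is only an
analogy under that assumption; the helper name and the assembly text in the
usage note are made up.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN matches somewhere in TEXT, 0 otherwise.
   When NEWLINE_SENSITIVE is nonzero, compile with REG_NEWLINE so that
   "." (and therefore ".*") cannot match across a line break.  */
static int
matches (const char *pattern, const char *text, int newline_sensitive)
{
  regex_t re;
  int flags = REG_EXTENDED | REG_NOSUB;

  assert (pattern != NULL && text != NULL);
  if (newline_sensitive)
    flags |= REG_NEWLINE;
  if (regcomp (&re, pattern, flags) != 0)
    return 0;
  int found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

With the two-line text "\tPUSH\tR10\n\tMOV\tR5, R12\n", the pattern
"PUSH.*R5" matches when newline_sensitive is 0, because ".*" runs from the
PUSH line into the MOV line, and does not match when it is 1.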

Rest looks fine, but I'm not an RTL maintainer.

Segher

Re: [PATCH, rs6000] Split up rs6000.c. (part 2)

2019-07-18 Thread Segher Boessenkool

Hi!

On Wed, Jul 17, 2019 at 10:06:14AM -0500, Bill Seurer wrote:
> 2019-07-17  Bill Seurer  
> 
>   * config/rs6000/rs6000.c (builtin_description, cpu_is_info,

[ ... ]
(Your mailer seems to have wrapped some changelog lines, with trailing
spaces and everything).

>   rs6000_internal_arg_pointer, rs6000_output_mi_thunk): Moved
>   to rs6000-logue.c.

rs6000-call.c, instead :-)  And don't use passive voice in changelogs
please, just say "Move to rs6000-call.c ."?

>   * config/rs6000/t-rs6000: Add new source file rs6000-call.c.
>   * config/config.gcc: Add new source file rs6000-call.c to garbage
>   collector.

To extra_objs, too.

You forgot a changelog entry for rs6000-internal.h I think?

>  /* Support targetm.vectorize.builtin_mask_for_load.  */
> -static GTY(()) tree altivec_builtin_mask_for_load;
> +GTY(()) tree altivec_builtin_mask_for_load;

The changelog doesn't mention these changes.  There are only a few :-)

>  /* True if we have expanded a CPU builtin.  */
> -bool cpu_builtin_p;
> +bool cpu_builtin_p = false;

I'm curious, why was this needed?  Or is it just general cleanliness :-)

The patch is fine, the changelog needs a little work.  Okay for trunk
with that fixed.  Thanks!

Segher

Re: Fix failing tests after PR libstdc++/85965

2019-07-18 Thread François Dumont

Got it, it is my PR 68303 patch which introduced this regression. I
will fix it to restore those assertions.


You'll see once the awaiting hashtable patches are in...

On 7/18/19 12:18 PM, Jonathan Wakely wrote:

On 18/07/19 07:41 +0200, François Dumont wrote:

Since commit 5d3695d03b7bdade9f4d05d2b those tests are failing.

    * testsuite/23_containers/unordered_map/48101_neg.cc: Adapt dg-error
    after PR libstdc++/85965 fix.
    * testsuite/23_containers/unordered_multimap/48101_neg.cc: Likewise.
    * testsuite/23_containers/unordered_multiset/48101_neg.cc: Likewise.
    * testsuite/23_containers/unordered_set/48101_neg.cc

It is quite trivial but I wonder if there is another plan to restore 
those static assertions differently.


Ok to commit ?


No. I don't see these failures. With the first change applied, I see a
new failure.

The patch seems wrong.

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Ian Lance Taylor

Jan Hubicka  writes:

>> >> 
>> >> OK.  I wonder if we can/should carve off some bits to note
>> >> type_with_linkage_p and type_in_anonymous_namespace_p in the tree
>> >> itself?  At least in type_common there's plenty of bits left.
>> >> Not sure how expensive / reliable (non-C++?) those tests otherwise are.
>> >
>> > It also makes me wonder if other languages (D, Ada, go, Fortran...) have
>> > concept of anonymous namespace types - that is types that are never
>> > interoperable with types from another translation unit.  That would
>> > justify the extra flag pretty well.
>> >
>> > Similarly for types with name mangling defined.  Both these bits can be
>> > made independent of C++.
>> 
>> Go has the concept, but it implements it by mangling the names with the
>> package-path, which is required to be unique within an application (the
>> package-path is normally the path used to find an import, so it is
>> inherently unique within a file system).
>
> Currently we implement ODR names only for C++.  If Go has a similar
> concept (i.e. types have mangled names and equal names imply equal
> types across units), we may want to implement it too and improve TBAA for
> Go programs.  I wonder, is there something I can read about Go types and
> mangling?

I don't know that the mangling is documented anywhere.  It's just an
implementation detail.  The basic idea follows from
https://golang.org/ref/spec#Import_declarations which explains how to
refer to identifiers defined in other packages.  There is no requirement
that the identifiers in different packages have unique names.  That is
if packages p1 and p2 both define a type T, then p1.T and p2.T are
different types.

Within the GCC middle-end, names will have the mangling applied, so if
the middle-end sees two types both named p1.T, then they are indeed the
same type.  It may be possible to use this fact for better TBAA when
using LTO.
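
The rule above (equal mangled names imply the same type) can be sketched
as a plain string comparison.  The "pkgpath.Type" scheme used below is a
hypothetical stand-in chosen for illustration; it is not the real gccgo
mangling, which as noted is an undocumented implementation detail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "PKGPATH.TYPE" into BUF (hypothetical naming scheme).
   Return 0 on success, -1 if the name did not fit.  */
static int
qualified_name (char *buf, size_t size, const char *pkgpath,
                const char *type)
{
  assert (buf != NULL && pkgpath != NULL && type != NULL);
  int n = snprintf (buf, size, "%s.%s", pkgpath, type);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Treat two types as the same exactly when their qualified names are
   equal; return 1 for same, 0 otherwise.  */
static int
same_type_by_name (const char *pkg_a, const char *type_a,
                   const char *pkg_b, const char *type_b)
{
  char a[256], b[256];

  if (qualified_name (a, sizeof a, pkg_a, type_a) != 0
      || qualified_name (b, sizeof b, pkg_b, type_b) != 0)
    return 0;
  return strcmp (a, b) == 0;
}
```

Under this sketch p1.T and p2.T compare as different types, while two
occurrences of p1.T compare as the same type.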

> This would be good motivation to make ODR type machinery independent of
> C++.  Until now it was only used to drive devirtualization (which needs
> BINFOs that are not done by go FE either) and produce ODR violation
> warnings (that I am not sure if would make sense for go), but with TBAA
> I think I can take a look into this.

ODR violations can't arise in Go, since there is no way to give two
distinct types/variables/functions the same mangled name.

Go could in principle benefit from devirtualization optimizations, but
they would look pretty different than they do in C++ and I doubt they
could actually share an implementation.

Ian

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-18 Thread Matthias Klose

On 08.07.19 23:19, Matthias Klose wrote:
> On 14.06.19 15:09, Gaius Mulley wrote:
>>
>> Hello,
>>
>> here is version two of the patches which introduce Modula-2 into the
>> GCC trunk.  The patches include:
>>
>>   (*)  a patch to allow all front ends to register a lang spec function.
>>(included are patches for all front ends to provide an empty
>> callback function).
>>   (*)  patch diffs to allow the Modula-2 front end driver to be
>>built using GCC Makefile and friends.
>>
>> The compressed tarball includes:
>>
>>   (*)  gcc/m2  (compiler driver and lang-spec stuff for Modula-2).
>>Including the need for registering lang spec functions.
>>   (*)  gcc/testsuite/gm2  (a Modula-2 dejagnu test to ensure that
>>the gm2 driver is built and can understand --version).
>>
>> These patches have been re-written after taking on board the comments
>> found in this thread:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02620.html
>>
>> it is a revised patch set from:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00220.html
>>
>> I've run make bootstrap and run the regression tests on trunk and no
>> extra failures occur for all languages touched in the ChangeLog.
>>
>> I'm currently tracking gcc trunk and gcc-9 with gm2 (which works well
>> with amd64/arm64/i386) - these patches are currently simply for the
>> driver to minimise the patch size.  There are also > 1800 tests in a
>> dejagnu testsuite for gm2 which can be included at some future time.
> 
> I had a look at the GCC 9 version of the patches, with a build including a
> make install. Some comments:

A look at the licenses:

libgm2/p2c/*: GPL 3+
libgm2/libiso/*: LGPL 2.1+
libgm2/libmin/libc.c: GPL 3+
libgm2/liblog/*: LGPL 2.1+
libgm2/libpim/*: LGPL 2.1+
libgm2/libpim/Selective.c: GPL 3+
libgm2/libpim/wrapc.c: GPL 3+
libgm2/libpth/*: LGPL 2.1+

gcc/gm2/ulm-lib-gm2/* GPL 3+, Ulm copyright holder?

gcc/gm2/gm2-libs/*.def GPL 3+
gcc/gm2/gm2-libs/Break.def LGPL 2.1+
gcc/gm2/gm2-libs/*.mod LGPL 2.1+
gcc/gm2/gm2-libs/Builtins.mod GPL 3+

I didn't look at everything in gcc/gm2, however it's not clear to me when a
file is LGPL or GPL.  And at least in gm2-libs, it seems to be mixed randomly.
First I thought all definition modules were GPL, and implementation modules were
LGPL, but that's not the case.

So currently all code linked with the runtime libs becomes GPL 3+?

For the ulm lib, the files mention the Ulm university as the copyright holder,
but it's not clear which license these files had before they were imported.

libgm2 seems to be mostly LGPL except for two files. Intended?

Matthias

[PATCH, i386]: Remove *qi_2_slp insn patterns

2019-07-18 Thread Uros Bizjak

These insn patterns are just too complex to ever match. Remove them.

2019-07-18  Uroš Bizjak  

* config/i386/i386.md (*addqi_2_slp): Remove.
(*qi_2_slp): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

For reference, the correct form would be:

--cut here--
(define_insn "*and_2_slp"
  [(set (reg FLAGS_REG)
(compare
  (and:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0")
 (match_operand:SWI12 2 "general_operand" "mn"))
  (const_int 0)))
   (set (strict_low_part (match_operand:SWI12 0 "register_operand" "+"))
(and:SWI12 (match_dup 1) (match_dup 2)))]
  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
   && ix86_match_ccmode (insn, CCNOmode)
   /* FIXME: without this LRA can't reload this pattern, see PR82524.  */
   && (rtx_equal_p (operands[0], operands[1])
   || rtx_equal_p (operands[0], operands[2]))"
  "and{}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "")])

(define_insn "*_2_slp"
  [(set (reg FLAGS_REG)
(compare
  (any_or:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0")
(match_operand:SWI12 2 "general_operand" "mn"))
  (const_int 0)))
   (set (strict_low_part (match_operand:SWI12 0 "register_operand" "+"))
(any_or:SWI12 (match_dup 1) (match_dup 2)))]
  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
   && ix86_match_ccmode (insn, CCNOmode)
   /* FIXME: without this LRA can't reload this pattern, see PR82524.  */
   && (rtx_equal_p (operands[0], operands[1])
   || rtx_equal_p (operands[0], operands[2]))"
  "{}\t{%2, %0|%0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "")])
--cut here--

Uros.
Index: i386.md
===
--- i386.md (revision 273578)
+++ i386.md (working copy)
@@ -8723,20 +8723,6 @@
   [(set_attr "type" "alu")
(set_attr "mode" "")])
 
-(define_insn "*andqi_2_slp"
-  [(set (reg FLAGS_REG)
-   (compare (and:QI (match_operand:QI 0 "nonimmediate_operand" "+qm,q")
-(match_operand:QI 1 "nonimmediate_operand" "qn,m"))
-(const_int 0)))
-   (set (strict_low_part (match_dup 0))
-   (and:QI (match_dup 0) (match_dup 1)))]
-  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && ix86_match_ccmode (insn, CCNOmode)
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "and{b}\t{%1, %0|%0, %1}"
-  [(set_attr "type" "alu1")
-   (set_attr "mode" "QI")])
-
 (define_insn "andqi_ext_1"
   [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q,Q")
 (const_int 8)
@@ -9155,20 +9141,6 @@
   [(set_attr "type" "alu")
(set_attr "mode" "SI")])
 
-(define_insn "*qi_2_slp"
-  [(set (reg FLAGS_REG)
-   (compare (any_or:QI (match_operand:QI 0 "nonimmediate_operand" "+qm,q")
-   (match_operand:QI 1 "general_operand" "qn,m"))
-(const_int 0)))
-   (set (strict_low_part (match_dup 0))
-   (any_or:QI (match_dup 0) (match_dup 1)))]
-  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && ix86_match_ccmode (insn, CCNOmode)
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "{q}\t{%1, %0|%0, %1}"
-  [(set_attr "type" "alu1")
-   (set_attr "mode" "QI")])
-
 (define_insn "*_3"
   [(set (reg FLAGS_REG)
(compare (any_or:SWI

Re: [PATCH, rs6000] Support vrotr3 for int vector types

2019-07-18 Thread Segher Boessenkool

On Thu, Jul 18, 2019 at 01:44:36PM +0800, Kewen.Lin wrote:
> Hi Segher,
> 
> on 2019/7/17 9:40 PM, Segher Boessenkool wrote:
> > Hi Kewen,
> > 
> > On Wed, Jul 17, 2019 at 04:32:15PM +0800, Kewen.Lin wrote:
> >> Regression testing just launched, is it OK for trunk if it's bootstrapped
> >> and regresstested on powerpc64le-unknown-linux-gnu?
> > 
> >> +;; Expanders for rotatert to make use of vrotl
> >> +(define_expand "vrotr3"
> >> +  [(set (match_operand:VEC_I 0 "vint_operand")
> >> +  (rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
> >> +(match_operand:VEC_I 2 "vint_reg_or_const_vector")))]
> > 
> > Having any rotatert in a define_expand or define_insn will regress.
> > 
> > So, nope, sorry.
> 
> Thanks for clarifying!  Since regression testing passed on powerpc64le, I'd
> like to double confirm the meaning of "regress", does it mean it's 
> a regression from design view?  Is it specific to rotatert and its 
> related one like vrotr? 

You will get HAVE_rotatert defined in insn-config.h if you do this patch,
and then simplify-rtx.c will not work correctly, generating rotatert by
an immediate, which we have no instructions for.

This might be masked because many of our rl*.c tests already fail because
of other changes, I should fix that :-/


Segher

[PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-18 Thread Jozef Lawrynowicz

The attached patch adds a new target macro called 
CASE_INSENSITIVE_REGISTER_NAMES, which allows the case of register names
used in an asm statement clobber list, or given in a command line option, to be
disregarded when comparing with the register names defined for the target in
REGISTER_NAMES. 

The macro is set to 1 for msp430 only, and set to 0 by default, so comparisons
continue to be case-sensitive for all targets except msp430.

Previously, a register name provided by the user using one of the aforementioned
methods must exactly match those defined in the target's REGISTER_NAMES macro.

This means that, for example, for msp430-elf the following code emits an
ambiguous error:

> void
> foo (void)
> {
>   __asm__ ("" : : : "r4", "R6");
> }

> asm-register-names.c:8:3: error: unknown register name 'r4' in 'asm'

All the register names defined in the msp430 REGISTER_NAMES macro use an
upper case 'R', so use of lower case 'r' gets rejected.
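
A minimal sketch of the difference between the two comparisons, assuming a
small made-up register table and a hypothetical lookup_reg helper (this is
not the decode_reg_name_and_count code from varasm.c):

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical stand-in for a target's REGISTER_NAMES table.  */
static const char *const reg_names[] =
  { "R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7" };

/* Return the index of NAME in reg_names, or -1 if it is not found.
   CASE_INSENSITIVE selects strcasecmp instead of strcmp.  */
static int
lookup_reg (const char *name, int case_insensitive)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof reg_names / sizeof reg_names[0]; i++)
    {
      int differ = case_insensitive
		   ? strcasecmp (name, reg_names[i])
		   : strcmp (name, reg_names[i]);
      if (differ == 0)
	return (int) i;
    }
  return -1;
}
```

With this table, lookup_reg ("r4", 0) returns -1, which corresponds to the
"unknown register name" error above, while lookup_reg ("r4", 1) returns 4.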

Successfully bootstrapped and regtested on trunk for x86_64-pc-linux-gnu, and
regtested for msp430-elf.

Ok for trunk?
From 82eadcdcbb8914b06818f7c8a10156336518e8d1 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Wed, 17 Jul 2019 11:48:23 +0100
Subject: [PATCH] Implement CASE_INSENSITIVE_REGISTER_NAMES

gcc/ChangeLog:

2019-07-18  Jozef Lawrynowicz  

	PR target/70320
	* doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
	* doc/tm.texi: Likewise.
	* defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
	* config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
	* varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
	strcmp for comparisons of asmspec with a register name if 
	CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.

gcc/testsuite/ChangeLog:

2019-07-18  Jozef Lawrynowicz  

	PR target/70320
	* gcc.target/msp430/asm-register-names.c: New test. 
---
 gcc/config/msp430/msp430.h|  1 +
 gcc/defaults.h|  4 
 gcc/doc/tm.texi   | 19 +++
 gcc/doc/tm.texi.in| 19 +++
 .../gcc.target/msp430/asm-register-names.c| 14 ++
 gcc/varasm.c  | 18 --
 6 files changed, 73 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/asm-register-names.c

diff --git a/gcc/config/msp430/msp430.h b/gcc/config/msp430/msp430.h
index 1288b1a263d..7b02c5fe28d 100644
--- a/gcc/config/msp430/msp430.h
+++ b/gcc/config/msp430/msp430.h
@@ -317,6 +317,7 @@ enum reg_class
 #define REGNO_OK_FOR_BASE_P(regno)	1
 #define REGNO_OK_FOR_INDEX_P(regno)	1
 
+#define CASE_INSENSITIVE_REGISTER_NAMES 1
 
 
 typedef struct
diff --git a/gcc/defaults.h b/gcc/defaults.h
index af7ea185f1e..2a22d52ba2f 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1254,6 +1254,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define SHORT_IMMEDIATES_SIGN_EXTEND 0
 #endif
 
+#ifndef CASE_INSENSITIVE_REGISTER_NAMES
+#define CASE_INSENSITIVE_REGISTER_NAMES 0
+#endif
+
 #ifndef WORD_REGISTER_OPERATIONS
 #define WORD_REGISTER_OPERATIONS 0
 #endif
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8e5b01c9383..b895dfaa4b0 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2000,6 +2000,25 @@ If the program counter has a register number, define this as that
 register number.  Otherwise, do not define it.
 @end defmac
 
+@defmac CASE_INSENSITIVE_REGISTER_NAMES
+Define this macro to 1 if it is safe to disregard the case of register
+names when comparing user-provided register names with the
+names defined by @code{REGISTER_NAMES}.  By default this is set to
+0.
+
+This affects the register clobber list in an @code{asm} statement and
+command line options which accept register names, such as
+@option{-ffixed-@var{reg}}.
+
+For example, if @code{REGISTER_NAMES} defines a register called @var{R4},
+then the following use of lower case @var{r4} will not result in an error
+if this macro is defined to 1.
+@smallexample
+asm ("" : : : "r4");
+@end smallexample
+
+@end defmac
+
 @node Allocation Order
 @subsection Order of Allocation of Registers
 @cindex order of register allocation
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index b4d57b86e2f..4aba454248e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1749,6 +1749,25 @@ If the program counter has a register number, define this as that
 register number.  Otherwise, do not define it.
 @end defmac
 
+@defmac CASE_INSENSITIVE_REGISTER_NAMES
+Define this macro to 1 if it is safe to disregard the case of register
+names when comparing user-provided register names with the
+names defined by @code{REGISTER_NAMES}.  By default this is set to
+0.
+
+This affects the register clobber list in an @code{asm} statement and
+command line options which accept register names, such as
+@option{-ffixed-@var{reg}}.
+
+For example, if @code{REGISTER_NAMES} defines a register c

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Segher Boessenkool

On Thu, Jul 18, 2019 at 09:01:13AM +0200, Jakub Jelinek wrote:
> On Wed, Jul 17, 2019 at 12:00:32PM -0500, Segher Boessenkool wrote:
> > I think we can say that *all* targets behave like SHIFT_COUNT_TRUNCATED
> > for rotates?  Not all immediates are valid of course, but that is a
> > separate issue.
> 
> Well, we'd need to double check all the hw rotate instructions on all the
> targets to be sure.

Yes :-(

> As for the current GCC code, SHIFT_COUNT_TRUNCATED value is used even for
> rotates at least in combine.c, expmed.c and simplify-rtx.c and in
> optabs.c through targetm.shift_truncation_mask, but e.g. in cse.c is used
> only for shifts and not rotates.

Many targets cannot use SHIFT_COUNT_TRUNCATED, and not even
targetm.shift_truncation_mask, because the actual mask depends on the
insn used, or rtl code used at least, not just the mode.

[snip]
> hunk in there, just it is limited to scalar rotates ATM rather than vector
> ones through is_int_mode.  So I bet the problem with the vector shifts is
> just that tree-vect-generic.c support isn't there.

:-)

Should we always allow both directions in gimple, and pretend both are
cheap?  Should we allow only one direction, and let the target select
which, or both?


Segher

Re: [PATCH], Patch #6, revision 3, Create pc-relative addressing insns

2019-07-18 Thread Segher Boessenkool

On Thu, Jul 18, 2019 at 02:20:43PM -0400, Michael Meissner wrote:
> On Tue, Jul 16, 2019 at 03:58:18PM -0500, Segher Boessenkool wrote:
> > > I did not move the initialization of the TOC_alias_set
> > > elsewhere, because in order to call TOC_alias_set, the code has already
> > > called force_const_mem, create_TOC_reference, and gen_const_mem, so I
> > > didn't see the point of adding a micro-optimization for this.
> > 
> > It gets rid of a call, but also of the conditional, and that makes this
> > eminently inlinable.  You could remove the getter function completely
> > even, access the variable directly.
> > 
> > But, sure, that's existing code only now.
> 
> We can fix this later.

Yup :-)

> I did rename the TARGET_NO_TOC macro to TARGET_NO_TOC_OR_PCREL which hopefully
> makes it more obvious when it should be true.

It does, thanks!


Segher

Re: [PATCH], Patch #6, revision 3, Create pc-relative addressing insns

2019-07-18 Thread Michael Meissner

On Tue, Jul 16, 2019 at 03:58:18PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Tue, Jul 16, 2019 at 02:19:14AM -0400, Michael Meissner wrote:
> > I have changed the TARGET_TOC to be TARGET_HAS_TOC in the aix, darwin,
> > system V, and Linux 64-bit headers.  Then in rs6000.h, TARGET_TOC is
> > defined in terms of TARGET_HAS_TOC and not pc-relative referencing.
> 
> Cool, thanks.  Good name, too.
> 
> > I discovered that TARGET_NO_TOC must not be set to be just !TARGET_TOC,
> > since TARGET_NO_TOC is used to create the elf_high, elf_low insns in 32-bit.
> 
> I don't know if your setting in sysv4.h works.  This file is used on so
> very many platforms, it is hard to predict if it works everywhere :-/

Well it is a mechanical change.

> > I did rename the static variable 'set' that contained the alias set to
> > TOC_alias_set.
> 
> :-)
> 
> > I did not move the initialization of the TOC_alias_set
> > elsewhere, because in order to call TOC_alias_set, the code has already
> > called force_const_mem, create_TOC_reference, and gen_const_mem, so I
> > didn't see the point of adding a micro-optimization for this.
> 
> It gets rid of a call, but also of the conditional, and that makes this
> eminently inlinable.  You could remove the getter function completely
> even, access the variable directly.
> 
> But, sure, that's existing code only now.

We can fix this later.

> > Index: gcc/config/rs6000/linux64.h
> > ===
> > --- gcc/config/rs6000/linux64.h (revision 273457)
> > +++ gcc/config/rs6000/linux64.h (working copy)
> > @@ -277,8 +277,8 @@ extern int dot_symbols;
> >  #ifndef RS6000_BI_ARCH
> >  
> >  /* 64-bit PowerPC Linux always has a TOC.  */
> > -#undef  TARGET_TOC
> > -#defineTARGET_TOC  1
> > +#undef  TARGET_HAS_TOC
> > +#defineTARGET_HAS_TOC  1
> 
> Fix the tab while you're at it?  Not that it is consistent at all in this
> file, but having the undef and the define in different style... :-)
> 
> 
> So, what does TARGET_NO_TOC mean now?  Maybe a better name would help,
> or some documentation if not?
> 
> Looks good otherwise, okay for trunk.  Thanks!
> 
> (And watch out if it works on AIX and Darwin, please).

David has said the earlier patch works on AIX, and Ian has said he will test it
when he is able to.

I did rename the TARGET_NO_TOC macro to TARGET_NO_TOC_OR_PCREL which hopefully
makes it more obvious when it should be true.

Here is the patch I committed:

2019-07-18  Michael Meissner  

* config/rs6000/aix.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.
* config/rs6000/darwin.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.
* config/rs6000/linux64.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
check to require -mcmodel=medium for pc-relative addressing.
(create_TOC_reference): Add assertion for TARGET_TOC.
(rs6000_legitimize_address): Use TARGET_NO_TOC_OR_PCREL instead of
TARGET_NO_TOC.
(rs6000_emit_move): Likewise.
(TOC_alias_set): Rename TOC alias set static variable from 'set'
to 'TOC_alias_set'.
(get_TOC_alias_set): Likewise.
(output_toc): Use TARGET_NO_TOC_OR_PCREL instead of
TARGET_NO_TOC.
(rs6000_can_eliminate): Likewise.
* config/rs6000/rs6000.h (TARGET_TOC): Define in terms of
TARGET_HAS_TOC and not pc-relative.
(TARGET_NO_TOC_OR_PCREL): New macro to replace TARGET_NO_TOC.
* config/rs6000/sysv4.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.

Index: gcc/config/rs6000/aix.h
===
--- gcc/config/rs6000/aix.h (revision 273578)
+++ gcc/config/rs6000/aix.h (working copy)
@@ -32,8 +32,7 @@
 #define TARGET_AIX_OS 1

 /* AIX always has a TOC.  */
-#define TARGET_NO_TOC 0
-#define TARGET_TOC 1
+#define TARGET_HAS_TOC 1
 #define FIXED_R2 1

 /* AIX allows r13 to be used in 32-bit mode.  */
Index: gcc/config/rs6000/darwin.h
===
--- gcc/config/rs6000/darwin.h  (revision 273578)
+++ gcc/config/rs6000/darwin.h  (working copy)
@@ -43,8 +43,7 @@

 /* We're not ever going to do TOCs.  */

-#define TARGET_TOC 0
-#define TARGET_NO_TOC 1
+#define TARGET_HAS_TOC 0

 /* Override the default rs6000 definition.  */
 #undef  PTRDIFF_TYPE
Index: gcc/config/rs6000/linux64.h
===

Re: [PATCH, i386]: Fix PR 91188, strict_low_part operations do not work

2019-07-18 Thread Uros Bizjak

On Thu, Jul 18, 2019 at 7:23 PM Uros Bizjak  wrote:
>
> Attached patch fixes several strict_low_part insn patterns to operate
> only on register outputs. Also, the patch paves the way for patterns
> to handle unmatched registers (once PR82524 is fixed), and allows
> patterns to operate on HImode operands.

Please note that variable shifts currently don't trigger this
optimization (but constant shifts do) due to unnecessary promotion to
int, as reported in PR91202 [1].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202

Uros.

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Segher Boessenkool

On Thu, Jul 18, 2019 at 04:12:48PM +0100, Richard Earnshaw (lists) wrote:
> 
> 
> On 17/07/2019 18:00, Segher Boessenkool wrote:
> >On Wed, Jul 17, 2019 at 12:54:32PM +0200, Jakub Jelinek wrote:
> >>On Wed, Jul 17, 2019 at 12:37:59PM +0200, Richard Biener wrote:
> >>>I'm not sure if it makes sense to have both LROTATE_EXPR and
> >>>RROTATE_EXPR on the GIMPLE level then (that CPUs only
> >>>support one direction is natural though).  So maybe simply get
> >>>rid of one?  Its semantics are also nowhere documented
> >>
> >>A lot of targets support both,
> >
> >Of all the linux targets, we have:
> >
> >No rotate:
> >   alpha microblaze riscv sparc
> >
> >Both directions:
> >   aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa
> 
> AArch64 is Right only.

This lists whether a port has any rotate resp. rotatert pattern in recog,
which generates HAVE_rotate and HAVE_rotatert in insn-config.h.

If the hardware can only do right rotates you should either handle
immediate left rotates everywhere as well, or avoid using left rotates
in define_expand and define_insn.
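
To make the first option concrete: a left rotate by a constant n can always
be rewritten as a right rotate by (width - n). The following is only a
minimal C++ sketch of that identity for 32-bit values, not code from any
port; the helper names rotr32 and rotl32_via_rotr are made up for
illustration.

```cpp
#include <cstdint>

// Rotate right by n bits. n is reduced modulo 32, and the left-shift count
// is masked as well, so n == 0 never produces an out-of-range shift.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

// Left rotate expressed purely through the right rotate:
// rotl(x, n) == rotr(x, (32 - n) mod 32).
constexpr std::uint32_t rotl32_via_rotr(std::uint32_t x, unsigned n) {
    return rotr32(x, (32u - (n & 31u)) & 31u);
}

// Rotating 0x80000001 left by one moves the top bit into bit 0.
static_assert(rotl32_via_rotr(0x80000001u, 1) == 0x00000003u,
              "left rotate by 1 must equal right rotate by 31");
```

For an immediate count this rewrite costs nothing at run time, which is
presumably why handling immediate left rotates on right-rotate-only
hardware is feasible at all.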


Segher

[PATCH, i386]: Fix PR 91188, strict_low_part operations do not work

2019-07-18 Thread Uros Bizjak

Attached patch fixes several strict_low_part insn patterns to operate
only on register outputs. Also, the patch paves the way for patterns
to handle unmatched registers (once PR82524 is fixed), and allows
patterns to operate on HImode operands.
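
For readers less familiar with strict_low_part: the operation writes only
the low part of a register and leaves the remaining bits untouched. Below
is a rough C++ model of that semantics for a QImode and an HImode add on a
32-bit register. It is an illustration only; the function names are
invented and nothing here is taken from i386.md.

```cpp
#include <cstdint>

// Model of a QImode add through strict_low_part: the sum wraps at 8 bits
// and bits 8..31 of the register keep their old value.
constexpr std::uint32_t add_low_byte(std::uint32_t reg, std::uint8_t imm) {
    const std::uint32_t low = (reg + imm) & 0xffu;
    return (reg & ~0xffu) | low;
}

// Same idea for HImode: the sum wraps at 16 bits and bits 16..31 are kept.
constexpr std::uint32_t add_low_word(std::uint32_t reg, std::uint16_t imm) {
    const std::uint32_t low = (reg + imm) & 0xffffu;
    return (reg & ~0xffffu) | low;
}
```

Note that the carry out of the low part is dropped rather than propagated
into the upper bits, which is what distinguishes this from a full-width add.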

2019-07-18  Uroš Bizjak  

PR target/91188
* config/i386/i386.md (*addqi_1_slp): Use register_operand predicate
for operand 0.  Do not use (match_dup) to match operand 1 with
operand 0.  Add check in insn constraint that either input operand
matches operand 0.  Use SWI12 mode iterator to also handle
HImode operands.
(*and_1_slp): Ditto.
(*qi_1_slp): Ditto.
(*sub_1_slp): Use register_operand predicate for operand 0.
Do not use (match_dup) to match operand 1 with operand 0.  Add
check in insn constraint that operand 1 matches operand 0.
Use SWI12 mode iterator to also handle HImode operands.
(*ashl3_1_slp): Ditto.
(*3_1_slp): Ditto.
(*3_1_slp): Ditto.

testsuite/ChangeLog:

2019-07-18  Uroš Bizjak  

PR target/91188
* gcc.target/i386/pr91188-1a.c: New test.
* gcc.target/i386/pr91188-1b.c: Ditto.
* gcc.target/i386/pr91188-1c.c: Ditto.
* gcc.target/i386/pr91188-2a.c: Ditto.
* gcc.target/i386/pr91188-2b.c: Ditto.
* gcc.target/i386/pr91188-2c.c: Ditto.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 273573)
+++ config/i386/i386.md (working copy)
@@ -5583,41 +5583,39 @@
  (symbol_ref "!TARGET_PARTIAL_REG_STALL")]
   (symbol_ref "true")))])
 
-(define_insn "*addqi_1_slp"
-  [(set (strict_low_part (match_operand:QI 0 "nonimmediate_operand" "+qm,q"))
-   (plus:QI (match_dup 0)
-(match_operand:QI 1 "general_operand" "qn,m")))
+(define_insn "*add_1_slp"
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+"))
+   (plus:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0")
+   (match_operand:SWI12 2 "general_operand" "mn")))
(clobber (reg:CC FLAGS_REG))]
-  "(! TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
+   /* FIXME: without this LRA can't reload this pattern, see PR82524.  */
+   && (rtx_equal_p (operands[0], operands[1])
+   || rtx_equal_p (operands[0], operands[2]))"
 {
   switch (get_attr_type (insn))
 {
 case TYPE_INCDEC:
-  if (operands[1] == const1_rtx)
-   return "inc{b}\t%0";
+  if (operands[2] == const1_rtx)
+   return "inc{}\t%0";
   else
{
- gcc_assert (operands[1] == constm1_rtx);
- return "dec{b}\t%0";
+ gcc_assert (operands[2] == constm1_rtx);
+ return "dec{}\t%0";
}
 
 default:
-  if (x86_maybe_negate_const_int (&operands[1], QImode))
-   return "sub{b}\t{%1, %0|%0, %1}";
+  if (x86_maybe_negate_const_int (&operands[2], QImode))
+   return "sub{}\t{%2, %0|%0, %2}";
 
-  return "add{b}\t{%1, %0|%0, %1}";
+  return "add{}\t{%2, %0|%0, %2}";
 }
 }
   [(set (attr "type")
- (if_then_else (match_operand:QI 1 "incdec_operand")
+ (if_then_else (match_operand:QI 2 "incdec_operand")
(const_string "incdec")
-   (const_string "alu1")))
-   (set (attr "memory")
- (if_then_else (match_operand 1 "memory_operand")
-(const_string "load")
-(const_string "none")))
-   (set_attr "mode" "QI")])
+   (const_string "alu")))
+   (set_attr "mode" "")])
 
 ;; Split non destructive adds if we cannot use lea.
 (define_split
@@ -6345,16 +6343,17 @@
   [(set_attr "type" "alu")
(set_attr "mode" "SI")])
 
-(define_insn "*subqi_1_slp"
-  [(set (strict_low_part (match_operand:QI 0 "nonimmediate_operand" "+qm,q"))
-   (minus:QI (match_dup 0)
- (match_operand:QI 1 "general_operand" "qn,m")))
+(define_insn "*sub_1_slp"
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+"))
+   (minus:SWI12 (match_operand:SWI12 1 "register_operand" "0")
+(match_operand:SWI12 2 "general_operand" "mn")))
(clobber (reg:CC FLAGS_REG))]
-  "(! TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "sub{b}\t{%1, %0|%0, %1}"
-  [(set_attr "type" "alu1")
-   (set_attr "mode" "QI")])
+  "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
+   /* FIXME: without this LRA can't reload this pattern, see PR82524.  */
+   && rtx_equal_p (operands[0], operands[1])"
+  "sub{}\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "mode" "")])
 
 (define_insn "*sub_2"
   [(set (reg FLAGS_REG)
@@ -8548,16 +8547,18 @@
  (symbol_ref "!TARGET_PARTIAL_REG_STALL")]
   (symbol_ref "true")))])
 
-(define_insn "*an

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

2019-07-18 Thread James Greenhalgh

On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote:
> Greetings,
> 
> This patch adds the intrinsic functions for:
> - vld1__x4
> - vst1__x4
> - vld1q__x4
> - vst1q__x4
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk? If yes, I don't have any commit rights, so can someone 
> please commit it on my behalf.

Hi,

I'm concerned by this strategy for implementing the arm_neon.h builtins:

> +__extension__ extern __inline int8x8x4_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vld1_s8_x4 (const int8_t *__a)
> +{
> +  union { int8x8x4_t __i; __builtin_aarch64_simd_xi __o; } __au;
> +  __au.__o
> += __builtin_aarch64_ld1x4v8qi ((const __builtin_aarch64_simd_qi *) __a);
> +  return __au.__i;
> +}

As far as I know this is undefined behaviour in C++11. This was the best
resource I could find pointing to the relevant standards paragraphs.

  https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior

That said, GCC explicitly allows it, so maybe this is fine?

  https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/Optimize-Options.html#Type-punning

Can anyone from the languages side chime in on whether we're exposing
undefined behaviour (in either C or C++) here?
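
For comparison, a conversion that avoids reading an inactive union member
could copy the object representation with memcpy, which is well defined in
both C and C++ for trivially copyable types of equal size. The sketch
below uses made-up stand-in types (OpaqueXI, Int8x8x4) rather than the
real builtin and arm_neon.h types, and is not a claim about how the header
should be written.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins: an opaque 32-byte builtin result and the
// user-visible aggregate of four 8-lane vectors.
struct OpaqueXI { unsigned char bytes[32]; };
struct Int8x8x4 { std::int8_t val[4][8]; };

static_assert(sizeof(OpaqueXI) == sizeof(Int8x8x4),
              "both views must have the same size");

// Reinterpret the opaque value by copying its bytes instead of punning
// through a union.
inline Int8x8x4 to_visible(const OpaqueXI& o) {
    Int8x8x4 r;
    std::memcpy(&r, &o, sizeof r);
    return r;
}
```

Compilers generally fold a fixed-size memcpy like this into plain register
moves, so the two approaches need not differ in generated code.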

Thanks,
James



> 
> Cheers,
> Syl
> 
> gcc/ChangeLog:
> 
> 2019-06-10  Sylvia Taylor  
> 
>   * config/aarch64/aarch64-simd-builtins.def:
>   (ld1x4): New.
>   (st1x4): Likewise.
>   * config/aarch64/aarch64-simd.md:
>   (aarch64_ld1x4): New pattern.
>   (aarch64_st1x4): Likewise.
>   (aarch64_ld1_x4_): Likewise.
>   (aarch64_st1_x4_): Likewise.
>   * config/aarch64/arm_neon.h:
>   (vld1_s8_x4): New function.
>   (vld1q_s8_x4): Likewise.
>   (vld1_s16_x4): Likewise.
>   (vld1q_s16_x4): Likewise.
>   (vld1_s32_x4): Likewise.
>   (vld1q_s32_x4): Likewise.
>   (vld1_u8_x4): Likewise.
>   (vld1q_u8_x4): Likewise.
>   (vld1_u16_x4): Likewise.
>   (vld1q_u16_x4): Likewise.
>   (vld1_u32_x4): Likewise.
>   (vld1q_u32_x4): Likewise.
>   (vld1_f16_x4): Likewise.
>   (vld1q_f16_x4): Likewise.
>   (vld1_f32_x4): Likewise.
>   (vld1q_f32_x4): Likewise.
>   (vld1_p8_x4): Likewise.
>   (vld1q_p8_x4): Likewise.
>   (vld1_p16_x4): Likewise.
>   (vld1q_p16_x4): Likewise.
>   (vld1_s64_x4): Likewise.
>   (vld1_u64_x4): Likewise.
>   (vld1_p64_x4): Likewise.
>   (vld1q_s64_x4): Likewise.
>   (vld1q_u64_x4): Likewise.
>   (vld1q_p64_x4): Likewise.
>   (vld1_f64_x4): Likewise.
>   (vld1q_f64_x4): Likewise.
>   (vst1_s8_x4): Likewise.
>   (vst1q_s8_x4): Likewise.
>   (vst1_s16_x4): Likewise.
>   (vst1q_s16_x4): Likewise.
>   (vst1_s32_x4): Likewise.
>   (vst1q_s32_x4): Likewise.
>   (vst1_u8_x4): Likewise.
>   (vst1q_u8_x4): Likewise.
>   (vst1_u16_x4): Likewise.
>   (vst1q_u16_x4): Likewise.
>   (vst1_u32_x4): Likewise.
>   (vst1q_u32_x4): Likewise.
>   (vst1_f16_x4): Likewise.
>   (vst1q_f16_x4): Likewise.
>   (vst1_f32_x4): Likewise.
>   (vst1q_f32_x4): Likewise.
>   (vst1_p8_x4): Likewise.
>   (vst1q_p8_x4): Likewise.
>   (vst1_p16_x4): Likewise.
>   (vst1q_p16_x4): Likewise.
>   (vst1_s64_x4): Likewise.
>   (vst1_u64_x4): Likewise.
>   (vst1_p64_x4): Likewise.
>   (vst1q_s64_x4): Likewise.
>   (vst1q_u64_x4): Likewise.
>   (vst1q_p64_x4): Likewise.
>   (vst1_f64_x4): Likewise.
>   (vst1q_f64_x4): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-10  Sylvia Taylor  
> 
>   * gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: New test.

Go patch committed: Fix bug in importing blocks in inline functions

2019-07-18 Thread Ian Lance Taylor

This Go frontend patch by Than McIntosh fixes a buglet in the
function body importer.  It adds hooks for keeping a stack of blocks
corresponding to the block nesting in the imported function.  This
ensures that local variables and temps wind up correctly scoped and
don't introduce collisions.

There is a new test case for this problem in https://golang.org/cl/186717.

This fixes https://golang.org/issue/33158.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
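
As a rough illustration of the block-stack idea (not the gofrontend code
itself), the sketch below keeps a vector of block pointers whose back
element is the current block. The BlockStack and ScopedBlock classes are
hypothetical; in particular the RAII guard is an assumption of this
sketch, whereas the patch pairs begin_block and finish_block calls
explicitly.

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for a block that knows its enclosing block.
struct Block {
    Block* enclosing;
    explicit Block(Block* e) : enclosing(e) {}
};

// Stack of blocks being imported; the innermost block is at the back.
class BlockStack {
 public:
    explicit BlockStack(Block* outer) { blocks_.push_back(outer); }
    Block* block() const { return blocks_.back(); }
    void begin_block(Block* b) { blocks_.push_back(b); }
    void finish_block() { blocks_.pop_back(); }
    std::size_t depth() const { return blocks_.size(); }

 private:
    std::vector<Block*> blocks_;
};

// Hypothetical guard that pops the block again when the scope ends, so an
// early return cannot leave the stack unbalanced.
class ScopedBlock {
 public:
    ScopedBlock(BlockStack& s, Block* b) : stack_(s) { stack_.begin_block(b); }
    ~ScopedBlock() { stack_.finish_block(); }
    ScopedBlock(const ScopedBlock&) = delete;
    ScopedBlock& operator=(const ScopedBlock&) = delete;

 private:
    BlockStack& stack_;
};
```

Anything created while the guard is alive sees the nested block as
current, and once it goes out of scope only the original outer block
remains, which mirrors the invariant asserted in the new destructor.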

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 273564)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-19ed722fb3ae5e618c746da20efb79fc837337cd
+4df7c8d7af894ee93f50c3a50debdcf4e369a2c6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/import.cc
===
--- gcc/go/gofrontend/import.cc (revision 273534)
+++ gcc/go/gofrontend/import.cc (working copy)
@@ -1535,6 +1535,26 @@ Stream_from_file::do_advance(size_t skip
 
 // Class Import_function_body.
 
+Import_function_body::Import_function_body(Gogo* gogo,
+   Import* imp,
+   Named_object* named_object,
+   const std::string& body,
+   size_t off,
+   Block* block,
+   int indent)
+: gogo_(gogo), imp_(imp), named_object_(named_object), body_(body),
+  off_(off), indent_(indent), temporaries_(), labels_(),
+  saw_error_(false)
+{
+  this->blocks_.push_back(block);
+}
+
+Import_function_body::~Import_function_body()
+{
+  // At this point we should be left with the original outer block only.
+  go_assert(saw_errors() || this->blocks_.size() == 1);
+}
+
 // The name of the function we are parsing.
 
 const std::string&
Index: gcc/go/gofrontend/import.h
===
--- gcc/go/gofrontend/import.h  (revision 273534)
+++ gcc/go/gofrontend/import.h  (working copy)
@@ -593,11 +593,8 @@ class Import_function_body : public Impo
  public:
   Import_function_body(Gogo* gogo, Import* imp, Named_object* named_object,
   const std::string& body, size_t off, Block* block,
-  int indent)
-: gogo_(gogo), imp_(imp), named_object_(named_object), body_(body),
-  off_(off), block_(block), indent_(indent), temporaries_(), labels_(),
-  saw_error_(false)
-  { }
+  int indent);
+  ~Import_function_body();
 
   // The IR.
   Gogo*
@@ -637,7 +634,17 @@ class Import_function_body : public Impo
   // The current block.
   Block*
   block()
-  { return this->block_; }
+  { return this->blocks_.back(); }
+
+  // Begin importing a new block BLOCK nested within the current block.
+  void
+  begin_block(Block *block)
+  { this->blocks_.push_back(block); }
+
+  // Record the fact that we're done importing the current block.
+  void
+  finish_block()
+  { this->blocks_.pop_back(); }
 
   // The current indentation.
   int
@@ -757,8 +764,8 @@ class Import_function_body : public Impo
   const std::string& body_;
   // The current offset into body_.
   size_t off_;
-  // Current block.
-  Block* block_;
+  // Stack to record nesting of blocks being imported.
+  std::vector<Block*> blocks_;
   // Current expected indentation level.
   int indent_;
   // Temporary statements by index.
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc (revision 273534)
+++ gcc/go/gofrontend/statements.cc (working copy)
@@ -2176,7 +2176,9 @@ Block_statement::do_import(Import_functi
   ifb->set_off(nl + 1);
   ifb->increment_indent();
   Block* block = new Block(ifb->block(), loc);
+  ifb->begin_block(block);
   bool ok = Block::import_block(block, ifb, loc);
+  ifb->finish_block();
   ifb->decrement_indent();
   if (!ok)
 return NULL;

[Ada] clean ups in C runtime files

2019-07-18 Thread Arnaud Charlet

This change introduces a "STANDALONE" mode where the C files of the Ada
runtime do not have any dependency on GCC include files. This is
useful for rebuilding the Ada runtime in a sandbox where GCC include files
are not available.

Also a few clean ups along the way.

Tested on x86_64-pc-linux-gnu, committed on trunk.

2019-07-18  Arnaud Charlet  

* Makefile.rtl, expect.c, env.c, aux-io.c, mkdir.c, initialize.c,
cstreams.c, raise.c, tracebak.c, adadecode.c, init.c, raise-gcc.c,
argv.c, adaint.c, adaint.h, ctrl_c.c, sysdep.c, rtinit.c, cio.c,
seh_init.c, exit.c, targext.c: Introduce a "STANDALONE" mode where C
runtime files do not have any dependency on GCC include files.
Remove unnecessary includes.
Remove remaining references to VMS in runtime C file.
* runtime.h: New file.

--
Index: expect.c
===
--- expect.c(revision 273575)
+++ expect.c(working copy)
@@ -29,14 +29,11 @@
  *  *
  /
 
-#ifdef __alpha_vxworks
-#include "vxWorks.h"
-#endif
-
 #ifdef IN_RTS
 #define POSIX
-#include "tconfig.h"
-#include "tsystem.h"
+#include "runtime.h"
+#include 
+
 #else
 #include "config.h"
 #include "system.h"
Index: env.c
===
--- env.c   (revision 273575)
+++ env.c   (working copy)
@@ -30,15 +30,11 @@
  /
 
 #ifdef IN_RTS
-# include "tconfig.h"
-# include "tsystem.h"
+# include "runtime.h"
+# include 
+# include 
+# include 
 
-# include 
-# include 
-# include 
-# ifdef VMS
-#  include 
-# endif
 /* We don't have libiberty, so use malloc.  */
 # define xmalloc(S) malloc (S)
 #else /* IN_RTS */
@@ -109,89 +105,10 @@
   return;
 }
 
-/* VMS specific declarations for set_env_value.  */
-
-#ifdef VMS
-
-typedef struct _ile3
-{
-  unsigned short len, code;
-  __char_ptr32 adr;
-  __char_ptr32 retlen_adr;
-} ile_s;
-
-#endif
-
 void
 __gnat_setenv (char *name, char *value)
 {
-#if defined (VMS)
-  struct dsc$descriptor_s name_desc;
-  $DESCRIPTOR (table_desc, "LNM$PROCESS");
-  char *host_pathspec = value;
-  char *copy_pathspec;
-  int num_dirs_in_pathspec = 1;
-  char *ptr;
-  long status;
-
-  name_desc.dsc$w_length = strlen (name);
-  name_desc.dsc$b_dtype = DSC$K_DTYPE_T;
-  name_desc.dsc$b_class = DSC$K_CLASS_S;
-  name_desc.dsc$a_pointer = name; /* ??? Danger, not 64bit safe.  */
-
-  if (*host_pathspec == 0)
-/* deassign */
-{
-  status = LIB$DELETE_LOGICAL (&name_desc, &table_desc);
-  /* no need to check status; if the logical name is not
- defined, that's fine. */
-  return;
-}
-
-  ptr = host_pathspec;
-  while (*ptr++)
-if (*ptr == ',')
-  num_dirs_in_pathspec++;
-
-  {
-int i, status;
-/* Alloca is guaranteed to be 32bit.  */
-ile_s *ile_array = alloca (sizeof (ile_s) * (num_dirs_in_pathspec + 1));
-char *copy_pathspec = alloca (strlen (host_pathspec) + 1);
-char *curr, *next;
-
-strcpy (copy_pathspec, host_pathspec);
-curr = copy_pathspec;
-for (i = 0; i < num_dirs_in_pathspec; i++)
-  {
-   next = strchr (curr, ',');
-   if (next == 0)
- next = strchr (curr, 0);
-
-   *next = 0;
-   ile_array[i].len = strlen (curr);
-
-   /* Code 2 from lnmdef.h means it's a string.  */
-   ile_array[i].code = 2;
-   ile_array[i].adr = curr;
-
-   /* retlen_adr is ignored.  */
-   ile_array[i].retlen_adr = 0;
-   curr = next + 1;
-  }
-
-/* Terminating item must be zero.  */
-ile_array[i].len = 0;
-ile_array[i].code = 0;
-ile_array[i].adr = 0;
-ile_array[i].retlen_adr = 0;
-
-status = LIB$SET_LOGICAL (&name_desc, 0, &table_desc, 0, ile_array);
-if ((status & 1) != 1)
-  LIB$SIGNAL (status);
-  }
-
-#elif (defined (__vxworks) && defined (__RTP__)) || defined (__APPLE__)
+#if (defined (__vxworks) && defined (__RTP__)) || defined (__APPLE__)
   setenv (name, value, 1);
 
 #else
@@ -213,10 +130,7 @@
 char **
 __gnat_environ (void)
 {
-#if defined (VMS) || defined (RTX)
-  /* Not implemented */
-  return NULL;
-#elif defined (__MINGW32__)
+#if defined (__MINGW32__)
   return _environ;
 #elif defined (__sun__)
   extern char **_environ;
@@ -247,10 +161,7 @@
 
 void __gnat_unsetenv (char *name)
 {
-#if defined (VMS)
-  /* Not implemented */
-  return;
-#elif defined (__hpux__) || defined (__sun__) \
+#if defined (__hpux__) || defined (__sun__) \
  || (defined (__vxworks) && ! defined (__RTP__)) \
  || defined (_AIX) || defined (__Lynx__)
 
@@ -306,10 +217,7 @@
 
 void __gnat_clearenv (void)
 {
-#if defined (VMS)
-  /* not implemented */
-  return;
-#elif defined (__sun__) \
+#if defined (__sun__) \
   || (defined (__vxworks) && ! defined (__R

Re: [PATCH][gcc] libgccjit: check result_type in gcc_jit_context_new_unary_op

2019-07-18 Thread Andrea Corallo



David Malcolm writes:

> On Thu, 2019-07-18 at 14:20 +, Andrea Corallo wrote:
>> Hi all,
>> I've just realized that what has been done recently for
>> gcc_jit_context_new_binary_op should also be done for the unary
>> version.
>> This patch checks at record time that the result type of
>> gcc_jit_context_new_unary_op is a numeric type, and adds a testcase
>> for the new check.
>>
>> make check-jit runs clean
>>
>> Is it okay for trunk?
>>
>> Bests
>>   Andrea
>>
>> gcc/jit/ChangeLog
>> 2019-07-18  Andrea Corallo 
>>
>>  * libgccjit.c (gcc_jit_context_new_unary_op): Check result_type
>> to be a
>>  numeric type.
>>  * libgccjit.c (gcc_jit_context_new_binary_op): Fix nit in error
>> message.
>>
>> gcc/testsuite/ChangeLog
>> 2019-07-04  Andrea Corallo 
>>
>>  * jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-
>> type.c:
>>  New testcase.
>>  * jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-
>> type.c:
>>  Fix nit in error message.
>
> Thanks for the patch.  What happens with the existing code if the user
> tries to use such a unary op?

When the result type is something "exotic" like a structure, I've
encountered an ICE, again during gimplification if I'm not mistaken.

>> diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
>> index 23e83e2..bea840f 100644
>> --- a/gcc/jit/libgccjit.c
>> +++ b/gcc/jit/libgccjit.c
>> @@ -1336,6 +1336,12 @@ gcc_jit_context_new_unary_op (gcc_jit_context *ctxt,
>>  "unrecognized value for enum gcc_jit_unary_op: %i",
>>  op);
>>RETURN_NULL_IF_FAIL (result_type, ctxt, loc, "NULL result_type");
>> +  RETURN_NULL_IF_FAIL_PRINTF3 (
>> +result_type->is_numeric (), ctxt, loc,
>> +"gcc_jit_unary_op %i with operand %s "
>> +"has non-numeric result_type: %s",
>> +op, rvalue->get_debug_string (),
>> +result_type->get_debug_string ());
>>RETURN_NULL_IF_FAIL (rvalue, ctxt, loc, "NULL rvalue");
>
> The use of "%i" for "op" here isn't as user-friendly as it could be; it
> would be ideal to tell the user the enum value.
>
> "op" has already been validated, so why not expose the currently-static
> unary_op_reproducer_strings from jit-recording.c in an internal header,
> and use it here with a "%s"?
>
>>return (gcc_jit_rvalue *)ctxt->new_unary_op (loc, op, result_type,
> rvalue);
>> @@ -1388,7 +1394,7 @@ gcc_jit_context_new_binary_op (gcc_jit_context
> *ctxt,
>>RETURN_NULL_IF_FAIL_PRINTF4 (
>>  result_type->is_numeric (), ctxt, loc,
>>  "gcc_jit_binary_op %i with operands a: %s b: %s "
>> -"has non numeric result_type: %s",
>> +"has non-numeric result_type: %s",
>>  op, a->get_debug_string (), b->get_debug_string (),
>>  result_type->get_debug_string ());
>
> Ah, I see there's one of these "%i" for op already.  Given that you're
> already fixing a nit here, please make this print "%s", using
> binary_op_reproducer_strings from jit-recording.c ("op" has already
> been validated).
>
> Thanks
> Dave

That's a really good idea; I'll update the patch.
Thanks for the comments.

Bests
  Andrea

Ping: [PATCH] x86/AVX512: improve generated code for mask-to-vector-register conversions

2019-07-18 Thread Jan Beulich

>>> On 27.06.19 at 10:59,  wrote:
> Conversion of comparison results to full vectors does, when VPMOVM2* are
> unavailable, not require any intermediate VMOVDQ{A,U}*: Simply use
> embedded masking on VPTERNLOG* right away, which is available with
> AVX512F (while VPMOVM2{D,Q} are available only with AVX512DQ).
> 
> Note that the chosen immediate is only one of many possible ones; I was
> trying to make the insn here distinguishable from the pre-existing uses
> of vpternlog.
> 
> gcc/
> 2019-06-27  Jan Beulich  
> 
>   * config/i386/sse.md (_cvtmask2):
>   Require only AVX512F.
>   (*_cvtmask2): Likewise.  Add
>   alternative expanding to vpternlog.
> 
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -6395,21 +6395,25 @@
> (match_dup 2)
> (match_dup 3)
> (match_operand: 1 "register_operand")))]
> -  "TARGET_AVX512DQ"
> +  "TARGET_AVX512F"
>"{
>  operands[2] = CONSTM1_RTX (mode);
>  operands[3] = CONST0_RTX (mode);
>}")
>  
>  (define_insn "*_cvtmask2"
> -  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
> +  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v,v")
>   (vec_merge:VI48_AVX512VL
> (match_operand:VI48_AVX512VL 2 "vector_all_ones_operand")
> (match_operand:VI48_AVX512VL 3 "const0_operand")
> -   (match_operand: 1 "register_operand" "k")))]
> -  "TARGET_AVX512DQ"
> -  "vpmovm2\t{%1, %0|%0, %1}"
> -  [(set_attr "prefix" "evex")
> +   (match_operand: 1 "register_operand" "k,Yk")))]
> +  "TARGET_AVX512F"
> +  "@
> +   vpmovm2\t{%1, %0|%0, %1}
> +   vpternlog\t{$0x81, %0, %0, %0%{%1%}%{z%}|%0%{%1%}%{z%}, 
> %0, %0, 0x81}"
> +  [(set_attr "isa" "avx512dq,*")
> +   (set_attr "length_immediate" "0,1")
> +   (set_attr "prefix" "evex")
> (set_attr "mode" "")])
>  
>  (define_insn "sse2_cvtps2pd"
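
The remark in the quoted mail that the immediate is "only one of many
possible ones" can be checked with a small model. When all three VPTERNLOG
inputs are the same register, each bit selects truth-table entry 0b000 or
0b111, so any immediate with bits 0 and 7 set yields all-ones. The C++
below is a hedged sketch of that reasoning with invented helper names, plus
a 4-lane model of zero-masking; it is not derived from sse.md.

```cpp
#include <array>
#include <cstdint>

// Result bit of a three-input lookup whose inputs are all the same bit x:
// the truth-table index is 0b000 for x == 0 and 0b111 for x == 1.
constexpr unsigned same_input_bit(unsigned x, std::uint8_t imm8) {
    return (imm8 >> (x ? 7u : 0u)) & 1u;
}

// True if the immediate turns any register into all-ones when that
// register is used as all three operands.
constexpr bool sets_all_ones(std::uint8_t imm8) {
    return same_input_bit(0, imm8) == 1u && same_input_bit(1, imm8) == 1u;
}

// Number of 8-bit immediates with that property.
inline int count_all_ones_immediates() {
    int n = 0;
    for (int imm = 0; imm < 256; ++imm)
        if (sets_all_ones(static_cast<std::uint8_t>(imm)))
            ++n;
    return n;
}

// 4-lane model of the zero-masked operation: lanes selected by mask k
// receive all-ones, the others are zeroed.
inline std::array<std::uint32_t, 4> mask_to_vector(std::uint8_t k) {
    std::array<std::uint32_t, 4> out{};
    for (unsigned lane = 0; lane < 4; ++lane)
        out[lane] = ((k >> lane) & 1u) ? 0xffffffffu : 0u;
    return out;
}
```

Under this model 0x81 is simply the immediate with only the two required
bits set, and the six remaining bits are free.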

Re: [PATCH][gcc] libgccjit: check result_type in gcc_jit_context_new_unary_op

2019-07-18 Thread David Malcolm

On Thu, 2019-07-18 at 14:20 +, Andrea Corallo wrote:
> Hi all,
> I've just realized that what has been done recently for
> gcc_jit_context_new_binary_op should also be done for the unary
> version.
> This patch checks at record time that the result type of
> gcc_jit_context_new_unary_op is a numeric type, and adds a testcase
> for the new check.
> 
> make check-jit runs clean
> 
> Is it okay for trunk?
> 
> Bests
>   Andrea
> 
> gcc/jit/ChangeLog
> 2019-07-18  Andrea Corallo 
> 
>   * libgccjit.c (gcc_jit_context_new_unary_op): Check result_type
> to be a
>   numeric type.
>   * libgccjit.c (gcc_jit_context_new_binary_op): Fix nit in error
> message.
> 
> gcc/testsuite/ChangeLog
> 2019-07-04  Andrea Corallo 
> 
>   * jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-
> type.c:
>   New testcase.
>   * jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-
> type.c:
>   Fix nit in error message.

Thanks for the patch.  What happens with the existing code if the user
tries to use such a unary op?

> diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
> index 23e83e2..bea840f 100644
> --- a/gcc/jit/libgccjit.c
> +++ b/gcc/jit/libgccjit.c
> @@ -1336,6 +1336,12 @@ gcc_jit_context_new_unary_op (gcc_jit_context *ctxt,
>  "unrecognized value for enum gcc_jit_unary_op: %i",
>  op);
>RETURN_NULL_IF_FAIL (result_type, ctxt, loc, "NULL result_type");
> +  RETURN_NULL_IF_FAIL_PRINTF3 (
> +result_type->is_numeric (), ctxt, loc,
> +"gcc_jit_unary_op %i with operand %s "
> +"has non-numeric result_type: %s",
> +op, rvalue->get_debug_string (),
> +result_type->get_debug_string ());
>RETURN_NULL_IF_FAIL (rvalue, ctxt, loc, "NULL rvalue");

The use of "%i" for "op" here isn't as user-friendly as it could be; it
would be ideal to tell the user the enum value.

"op" has already been validated, so why not expose the currently-static 
unary_op_reproducer_strings from jit-recording.c in an internal header,
and use it here with a "%s"?

>return (gcc_jit_rvalue *)ctxt->new_unary_op (loc, op, result_type,
rvalue);
> @@ -1388,7 +1394,7 @@ gcc_jit_context_new_binary_op (gcc_jit_context
*ctxt,
>RETURN_NULL_IF_FAIL_PRINTF4 (
>  result_type->is_numeric (), ctxt, loc,
>  "gcc_jit_binary_op %i with operands a: %s b: %s "
> -"has non numeric result_type: %s",
> +"has non-numeric result_type: %s",
>  op, a->get_debug_string (), b->get_debug_string (),
>  result_type->get_debug_string ());

Ah, I see there's one of these "%i" for op already.  Given that you're
already fixing a nit here, please make this print "%s", using
binary_op_reproducer_strings from jit-recording.c ("op" has already
been validated).
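
To sketch what the suggested "%s" output might look like: a table of
enumerator names indexed by the (already validated) op value. The enum,
table and function below are stand-ins written for this example; they only
imitate the shape of unary_op_reproducer_strings and are not the
libgccjit sources.

```cpp
#include <cstdio>
#include <string>

// Hypothetical mirror of enum gcc_jit_unary_op, for illustration only.
enum UnaryOp {
    UNARY_OP_MINUS,
    UNARY_OP_BITWISE_NEGATE,
    UNARY_OP_LOGICAL_NEGATE,
    UNARY_OP_ABS
};

// Name table indexed by the enum value; the caller must have validated op.
static const char* const unary_op_names[] = {
    "GCC_JIT_UNARY_OP_MINUS",
    "GCC_JIT_UNARY_OP_BITWISE_NEGATE",
    "GCC_JIT_UNARY_OP_LOGICAL_NEGATE",
    "GCC_JIT_UNARY_OP_ABS"
};

// Build the diagnostic with the enumerator name instead of its number.
inline std::string describe_bad_result_type(UnaryOp op, const char* operand,
                                            const char* result_type) {
    char buf[256];
    std::snprintf(buf, sizeof buf,
                  "gcc_jit_unary_op %s with operand %s "
                  "has non-numeric result_type: %s",
                  unary_op_names[op], operand, result_type);
    return buf;
}
```

The table lookup is only safe because the range check on op runs first,
which is the ordering the existing validation already provides.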

Thanks
Dave

Ping: [PATCH] x86/AVX512: improve generated code for bit-wise negation of vectors of integers

2019-07-18 Thread Jan Beulich

>>> On 27.06.19 at 10:59,  wrote:
> NOT on vectors of integers does not require loading a constant vector of
> all ones into a register - VPTERNLOG can be used here (and could/should
> be further used to carry out other binary and ternary logical operations
> which don't have a special purpose instruction).
> 
> gcc/
> 2019-06-27  Jan Beulich  
> 
>   * config/i386/sse.md (ternlogsuffix): New.
>   (one_cmpl2): Don't force CONSTM1_RTX into a register when
>   AVX512F is in use.
>   (one_cmpl2): New.
> 
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -853,6 +853,13 @@
> (V4SF "k") (V2DF "q")
> (SF "k") (DF "q")])
>  
> +;; Mapping of vector modes to VPTERNLOG suffix
> +(define_mode_attr ternlogsuffix
> +  [(V8DI "q") (V4DI "q") (V2DI "q")
> +   (V16SI "d") (V8SI "d") (V4SI "d")
> +   (V32HI "d") (V16HI "d") (V8HI "d")
> +   (V64QI "d") (V32QI "d") (V16QI "d")])
> +
>  ;; Number of scalar elements in each vector type
>  (define_mode_attr ssescalarnum
>[(V64QI "64") (V16SI "16") (V8DI "8")
> @@ -12564,9 +12571,22 @@
>   (match_dup 2)))]
>"TARGET_SSE"
>  {
> -  operands[2] = force_reg (mode, CONSTM1_RTX (mode));
> +  if (!TARGET_AVX512F)
> +operands[2] = force_reg (mode, CONSTM1_RTX (mode));
> +  else
> +operands[2] = CONSTM1_RTX (mode);
>  })
>  
> +(define_insn "one_cmpl2"
> +  [(set (match_operand:VI 0 "register_operand" "=v")
> + (xor:VI (match_operand:VI 1 "nonimmediate_operand" "vm")
> + (match_operand:VI 2 "vector_all_ones_operand" "BC")))]
> +  "TARGET_AVX512F"
> +  "vpternlog\t{$0x55, %1, %0, 
> %0|%0, %0, %1, 0x55}"
> +  [(set_attr "type" "sselog")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "")])
> +
>  (define_expand "_andnot3"
>[(set (match_operand:VI_AVX2 0 "register_operand")
>   (and:VI_AVX2
> 
> 
> 
>
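
For reference, the 0x55 immediate in the quoted pattern can be read off the
VPTERNLOG truth table: per bit, the index (a << 2) | (b << 1) | c selects a
bit of imm8, and 0x55 has exactly the entries with c == 0 set, so the
result is NOT c regardless of a and b. The following C++ is a small
software model of that lookup on 32-bit values; ternlog32 is a made-up name
and the model is only a sketch of the instruction's per-bit behaviour.

```cpp
#include <cstdint>

// Per-bit three-input lookup: for every bit position, the bits of a, b
// and c form a 3-bit index (a is the most significant) into imm8.
constexpr std::uint32_t ternlog32(std::uint32_t a, std::uint32_t b,
                                  std::uint32_t c, std::uint8_t imm8) {
    std::uint32_t r = 0;
    for (unsigned i = 0; i < 32; ++i) {
        const unsigned idx = (((a >> i) & 1u) << 2)
                           | (((b >> i) & 1u) << 1)
                           | ((c >> i) & 1u);
        r |= static_cast<std::uint32_t>((imm8 >> idx) & 1u) << i;
    }
    return r;
}
```

The same helper shows why other logical operations fit the instruction:
for example 0x96 encodes a three-way XOR.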

Re: [patch2/2][arm]: remove builtin expand for sha1

2019-07-18 Thread Kyrill Tkachov


Hi Sylvia,

On 7/3/19 10:36 AM, Sylvia Taylor wrote:

Greetings,

This patch removes the builtin expand handling for sha1h/c/m/p and
replaces it with expand patterns. This should make it more consistent
with how we handle intrinsic implementations and clean up the custom
sha1 code in the arm_expand builtins for unop and ternop.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Cheers,
Syl

gcc/ChangeLog:

2019-07-03  Sylvia Taylor  

    * config/arm/arm-builtins.c
    (arm_expand_ternop_builtin): Remove builtin_sha1cpm.
    (arm_expand_unop_builtin): Remove builtin_sha1h.
    * config/arm/crypto.md
    (crypto_sha1h): New expand pattern.
    (crypto_sha1c): Likewise.
    (crypto_sha1m): Likewise.
    (crypto_sha1p): Likewise.
    (crypto_sha1h_lb): Modify.
    (crypto_sha1c_lb): Likewise.
    (crypto_sha1m_lb): Likewise.
    (crypto_sha1p_lb): Likewise.


This doesn't exactly match what the patch does. You don't need to list
the names the iterators expand into.

We just need the names as they appear in the MD files. Granted, this is
a tricky case for ChangeLog writing, as define_insns are converted to
define_expand, things are renamed, etc.

I've taken the liberty of updating the ChangeLog to:

2019-07-18  Sylvia Taylor 

    * config/arm/arm-builtins.c
    (arm_expand_ternop_builtin): Remove explicit sha1 builtin handling.
    (arm_expand_unop_builtin): Likewise.
    * config/arm/crypto.md
    (crypto_sha1h): Convert from define_insn to define_expand.
    (crypto_): Likewise.
    (crypto_sha1h_lb): New define_insn.
    (crypto__lb): Likewise.

and committed as r273575.

Thanks for the nice cleanup!

Kyrill

Re: [PATCH] Fix simd attribute handling on aarch64

2019-07-18 Thread Steve Ellcey

On Thu, 2019-07-18 at 08:37 +0100, Richard Sandiford wrote:
> 
> > 2019-07-17  Steve Ellcey  
> > 
> > * omp-simd-clone.c (simd_clone_adjust):  Call targetm.simd_clone.adjust
> > after calling simd_clone_adjust_return_type.
> > (expand_simd_clones): Ditto.
> 
> It should be pretty easy to add a test for this, now that we use
> .variant_pcs to mark symbols with the attribute.

OK, I will add some tests that make sure this mark is not on the
scalar version of a simd function.

> > diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
> > index caa8da3cba5..6a6b439d146 100644
> > --- a/gcc/omp-simd-clone.c
> > +++ b/gcc/omp-simd-clone.c
> > @@ -1164,9 +1164,8 @@ simd_clone_adjust (struct cgraph_node *node)
> >  {
> >push_cfun (DECL_STRUCT_FUNCTION (node->decl));
> >  
> > -  targetm.simd_clone.adjust (node);
> > -
> >tree retval = simd_clone_adjust_return_type (node);
> > +  targetm.simd_clone.adjust (node);
> >ipa_parm_adjustment_vec adjustments
> >  = simd_clone_adjust_argument_types (node);
> >  
> > @@ -1737,8 +1736,8 @@ expand_simd_clones (struct cgraph_node *node)
> > simd_clone_adjust (n);
> >   else
> > {
> > - targetm.simd_clone.adjust (n);
> >   simd_clone_adjust_return_type (n);
> > + targetm.simd_clone.adjust (n);
> >   simd_clone_adjust_argument_types (n);
> > }
> > }
> 
> I don't think this is enough, since simd_clone_adjust_return_type
> does nothing for functions that return void (e.g. sincos).
> I think instead aarch64_simd_clone_adjust should do something like:
> 
>   TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node-
> >decl));
> 
> But maybe that has consequences that I've not thought about...

I think that would work, but it would build two distinct types for non-
void functions, one of which would be unused/unneeded.  I.e.
aarch64_simd_clone_adjust would create a distinct type and then
simd_clone_adjust_return_type would create another distinct type
and the previous one would no longer be used anywhere.

What do you think about moving the call to build_distinct_type_copy
out of simd_clone_adjust_return_type and doing it even for functions
that return void?  Below is what I am thinking about (untested).  I suppose
we could also leave the call to build_distinct_type_copy in 
simd_clone_adjust_return_type but just move it above the check
for a NULL type so that a distinct type is always created there.
That would still require that we change the order of the
targetm.simd_clone.adjust and simd_clone_adjust_return_type
calls as my original patch does.
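
The underlying issue can be shown with a toy model: if the clone still
shares its function type with the scalar declaration when the target hook
tags the type, the scalar version gets tagged too. The C++ below is only
an analogy using shared_ptr; FnType, FnDecl and the helper functions are
invented for this sketch and do not correspond to GCC's tree API, and the
attribute string is used purely as an example label.

```cpp
#include <memory>
#include <set>
#include <string>

// Toy function type carrying a set of attribute names.
struct FnType { std::set<std::string> attributes; };

// Toy declaration that points at a possibly shared function type.
struct FnDecl { std::shared_ptr<FnType> type; };

// Toy analogue of making a distinct copy of a type.
inline std::shared_ptr<FnType> distinct_type_copy(
        const std::shared_ptr<FnType>& t) {
    return std::make_shared<FnType>(*t);
}

// Toy analogue of a target hook that marks the clone's type.
inline void target_adjust(FnDecl& clone) {
    clone.type->attributes.insert("aarch64_vector_pcs");
}

// Unshare the type first, independent of the return type, then let the
// hook modify it; the scalar declaration's type is left alone.
inline FnDecl make_simd_clone(const FnDecl& scalar) {
    FnDecl clone = scalar;  // initially shares the scalar's type
    clone.type = distinct_type_copy(clone.type);
    target_adjust(clone);
    return clone;
}
```

Doing the copy unconditionally before the hook is what makes the void
case behave like the non-void one in this model.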


diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index caa8da3cba5..427d6f6f514 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -498,7 +498,6 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
   /* Adjust the function return type.  */
   if (orig_rettype == void_type_node)
 return NULL_TREE;
-  TREE_TYPE (fndecl) = build_distinct_type_copy (TREE_TYPE (fndecl));
   t = TREE_TYPE (TREE_TYPE (fndecl));
   if (INTEGRAL_TYPE_P (t) || POINTER_TYPE_P (t))
 veclen = node->simdclone->vecsize_int;
@@ -1164,6 +1163,7 @@ simd_clone_adjust (struct cgraph_node *node)
 {
   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
+  TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node->decl));
   targetm.simd_clone.adjust (node);
 
   tree retval = simd_clone_adjust_return_type (node);
@@ -1737,6 +1737,8 @@ expand_simd_clones (struct cgraph_node *node)
simd_clone_adjust (n);
  else
{
+ TREE_TYPE (n->decl)
+   = build_distinct_type_copy (TREE_TYPE (n->decl));
  targetm.simd_clone.adjust (n);
  simd_clone_adjust_return_type (n);
  simd_clone_adjust_argument_types (n);


Steve Ellcey
sell...@marvell.com

Re: [patch1/2][arm][PR90317]: fix sha1 patterns

2019-07-18 Thread Kyrill Tkachov


Hi Sylvia,

On 7/3/19 10:31 AM, Sylvia Taylor wrote:

Greetings,

This patch fixes:

1) ICE thrown when using the crypto_sha1h intrinsic due to an
incompatible mode used for zero_extend. Removed the zero extend, as it is
not a good choice for vector modes, and using an equivalent single
mode like TI (128 bits) instead of V4SI produces extra instructions,
making it inefficient.

This affects gcc version 8 and above.

2) Incorrect combine optimizations made due to vec_select usage
in the sha1 patterns on arm. The patterns should only combine
a vec select within a sha1h instruction when the lane is 0.

This affects gcc version 5 and above.

- Fixed by explicitly declaring the valid const int for such
optimizations. For cases when the lane is not 0, the vector
lane selection now occurs in e.g. a vmov instruction prior
to sha1h.

- Updated the sha1h testcases on arm to check for additional
cases with custom vector lane selection.

The intrinsic functions for the sha1 patterns have also been
simplified which seems to eliminate extra vmovs like:
- vmov.i32 q8, #0.

Bootstrapped and tested on arm-none-linux-gnueabihf.

I've also given it a quick build and test on arm-none-eabi and 
armeb-none-eabi (big-endian arm)



Cheers,
Syl

gcc/ChangeLog:

2019-07-03  Sylvia Taylor  

    PR target/90317
    * config/arm/arm_neon.h
    (vsha1h_u32): Refactor.
    (vsha1cq_u32): Likewise.
    (vsha1pq_u32): Likewise.
    (vsha1mq_u32): Likewise.
    * config/arm/crypto.md:
    (crypto_sha1h): Remove zero extend, correct vec select.
    (crypto_sha1c): Correct vec select.
    (crypto_sha1m): Likewise.
    (crypto_sha1p): Likewise.

gcc/testsuite/ChangeLog:

2019-07-03  Sylvia Taylor  

    PR target/90317
    * gcc.target/arm/crypto-vsha1cq_u32.c (foo): Change.
    (GET_LANE, TEST_SHA1C_VEC_SELECT): New.
    * gcc.target/arm/crypto-vsha1h_u32.c (foo): Change.
    (GET_LANE, TEST_SHA1H_VEC_SELECT): New.
    * gcc.target/arm/crypto-vsha1mq_u32.c (foo): Change.
    (GET_LANE, TEST_SHA1M_VEC_SELECT): New.
    * gcc.target/arm/crypto-vsha1pq_u32.c (foo): Change.
    (GET_LANE, TEST_SHA1P_VEC_SELECT): New.


"Change" is too vague. Better to use "Change return type to uint32_t."

I've made this change to the ChangeLog and committed on your behalf with 
r273574.


Thanks for the patch and sorry for the delay in review!

Kyrill

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Richard Earnshaw (lists)





On 18/07/2019 16:30, Jakub Jelinek wrote:

On Thu, Jul 18, 2019 at 04:26:26PM +0100, Richard Earnshaw (lists) wrote:



On 18/07/2019 16:17, Jakub Jelinek wrote:

On Thu, Jul 18, 2019 at 04:12:48PM +0100, Richard Earnshaw (lists) wrote:

Both directions:
 aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa


AArch64 is Right only.


Maybe hw-wise, but it has both rotr3 and rotl3 expanders.
At least for GPRs.

Jakub



Only for immediates.  And the patterns that support that just write out
assembly as "ror ( - n)".


For registers too (for those it negates and uses rotr).
Note, the middle-end ought to be able to do the same thing already,
except if not SHIFT_COUNT_TRUNCATED it will use bits - count instead of
-count.
(define_expand "rotl3"
   [(set (match_operand:GPI 0 "register_operand")
 (rotatert:GPI (match_operand:GPI 1 "register_operand")
   (match_operand:QI 2 "aarch64_reg_or_imm")))]
   ""
   {
 /* (SZ - cnt) % SZ == -cnt % SZ */
 if (CONST_INT_P (operands[2]))
   {
 operands[2] = GEN_INT ((-INTVAL (operands[2]))
& (GET_MODE_BITSIZE (mode) - 1));
 if (operands[2] == const0_rtx)
   {
 emit_insn (gen_mov (operands[0], operands[1]));
 DONE;
   }
   }
 else
   operands[2] = expand_simple_unop (QImode, NEG, operands[2],
 NULL_RTX, 1);
   }
)

Jakub



Well Arm has that sort of expansion as well, but it was listed as 'right 
only'.


R.

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Jakub Jelinek

On Thu, Jul 18, 2019 at 04:26:26PM +0100, Richard Earnshaw (lists) wrote:
> 
> 
> On 18/07/2019 16:17, Jakub Jelinek wrote:
> > On Thu, Jul 18, 2019 at 04:12:48PM +0100, Richard Earnshaw (lists) wrote:
> > > > Both directions:
> > > > aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa
> > > 
> > > AArch64 is Right only.
> > 
> > Maybe hw-wise, but it has both rotr3 and rotl3 expanders.
> > At least for GPRs.
> > 
> > Jakub
> > 
> 
> Only for immediates.  And the patterns that support that just write out
> assembly as "ror ( - n)".

For registers too (for those it negates and uses rotr).
Note, the middle-end ought to be able to do the same thing already,
except if not SHIFT_COUNT_TRUNCATED it will use bits - count instead of
-count.
(define_expand "rotl3"
  [(set (match_operand:GPI 0 "register_operand")
(rotatert:GPI (match_operand:GPI 1 "register_operand")
  (match_operand:QI 2 "aarch64_reg_or_imm")))]
  ""
  {
/* (SZ - cnt) % SZ == -cnt % SZ */
if (CONST_INT_P (operands[2]))
  {
operands[2] = GEN_INT ((-INTVAL (operands[2]))
   & (GET_MODE_BITSIZE (mode) - 1));
if (operands[2] == const0_rtx)
  {
emit_insn (gen_mov (operands[0], operands[1]));
DONE;
  }
  }
else
  operands[2] = expand_simple_unop (QImode, NEG, operands[2],
NULL_RTX, 1);
  }
)
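
As a rough illustration of the identity in the comment above, here is a
minimal C sketch for a 32-bit word.  The helper names (rotr32, rotl32_ref,
rotl32_via_rotr, check_rotl_equivalence) are made up for this example and
are not GCC functions; the masking with 31 stands in for a target that
truncates rotate counts.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n.  The count is reduced modulo 32 so that every
   shift below stays in range, including n == 0.  */
static uint32_t
rotr32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Reference rotate left, written out directly.  */
static uint32_t
rotl32_ref (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate left expressed as a rotate right by the negated count,
   relying on (32 - n) % 32 == -n % 32.  */
static uint32_t
rotl32_via_rotr (uint32_t x, unsigned int n)
{
  return rotr32 (x, (0u - n) & 31);
}

/* Check that both formulations agree for every count 0..31.  */
static void
check_rotl_equivalence (uint32_t x)
{
  for (unsigned int n = 0; n < 32; n++)
    assert (rotl32_via_rotr (x, n) == rotl32_ref (x, n));
}
```

Without the "& 31" truncation the negated count would have to be written as
32 - n instead, which is the bits - count form mentioned above.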

Jakub

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Richard Earnshaw (lists)





On 18/07/2019 16:17, Jakub Jelinek wrote:

On Thu, Jul 18, 2019 at 04:12:48PM +0100, Richard Earnshaw (lists) wrote:

Both directions:
aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa


AArch64 is Right only.


Maybe hw-wise, but it has both rotr3 and rotl3 expanders.
At least for GPRs.

Jakub



Only for immediates.  And the patterns that support that just write out 
assembly as "ror ( - n)".


R.

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Jakub Jelinek

On Thu, Jul 18, 2019 at 04:12:48PM +0100, Richard Earnshaw (lists) wrote:
> > Both directions:
> >aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa
> 
> AArch64 is Right only.

Maybe hw-wise, but it has both rotr3 and rotl3 expanders.
At least for GPRs.

Jakub

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Richard Earnshaw (lists)





On 17/07/2019 18:00, Segher Boessenkool wrote:

On Wed, Jul 17, 2019 at 12:54:32PM +0200, Jakub Jelinek wrote:

On Wed, Jul 17, 2019 at 12:37:59PM +0200, Richard Biener wrote:

I'm not sure if it makes sense to have both LROTATE_EXPR and
RROTATE_EXPR on the GIMPLE level then (that CPUs only
support one direction is natural though).  So maybe simply get
rid of one?  Its semantics are also nowhere documented


A lot of targets support both,


Of all the linux targets, we have:

No rotate:
   alpha microblaze riscv sparc

Both directions:
   aarch64 c6x ia64 m68k nios2 parisc sh x86 xtensa


AArch64 is Right only.

R.



Left only:
   csky h8300 powerpc s390

Right only:
   arc arm mips nds32 openrisc


Then there are some targets that only support left rotates and not right
rotates (rs6000, s390, tilegx, ...), and other targets that only support
right rotates (mips, iq2000, ...).
So only having one GIMPLE code doesn't seem to be good enough.

I think handling it during expansion in generic code is fine, especially
when we clearly have several targets that do support only one of the
rotates.  As you wrote, it needs corresponding code in tree-vect-generic.c,
and shouldn't hardcode the rs6000 direction of mapping rotr to rotl, but
support also the other direction - rotl to rotr.  For the sake of
!SHIFT_COUNT_TRUNCATED targets for constant shift counts it needs to do
negation + masking and for variable shift counts probably punt and let the
backend code handle it if it can do the truncation in there?


I think we can say that *all* targets behave like SHIFT_COUNT_TRUNCATED
for rotates?  Not all immediates are valid of course, but that is a
separate issue.


Segher

[PATCH][gcc] libgccjit: check result_type in gcc_jit_context_new_unary_op

2019-07-18 Thread Andrea Corallo

Hi all,
I've just realized that what has been done recently for
gcc_jit_context_new_binary_op should also be done for the unary
version.
This patch checks at record time that the result type of
gcc_jit_context_new_unary_op is a numeric type, and adds a testcase
for the new check.
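
To make the idea of a record-time check concrete, here is a small
stand-alone C sketch.  It does not use libgccjit at all: every name in it
(toy_type_kind, toy_rvalue, toy_check_unary_op, toy_first_error) is
hypothetical and only mimics the shape of the validation, namely rejecting
a non-numeric result type and recording an error message before anything is
built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real type and rvalue objects.  */
enum toy_type_kind { TOY_INT, TOY_DOUBLE, TOY_VOID_PTR };

struct toy_rvalue
{
  enum toy_type_kind type;
  const char *debug_string;
};

/* First error recorded by the most recent check, or "" if none.  */
static char toy_first_error[128];

static int
toy_type_is_numeric (enum toy_type_kind kind)
{
  return kind == TOY_INT || kind == TOY_DOUBLE;
}

static const char *
toy_type_name (enum toy_type_kind kind)
{
  switch (kind)
    {
    case TOY_INT: return "int";
    case TOY_DOUBLE: return "double";
    default: return "void *";
    }
}

/* Validate the arguments of a unary operation at record time.
   Returns 1 if they are acceptable, 0 after recording an error.
   The operand is checked for NULL before its debug string is used.  */
static int
toy_check_unary_op (int op, enum toy_type_kind result_type,
                    const struct toy_rvalue *operand)
{
  toy_first_error[0] = '\0';
  if (operand == NULL)
    {
      snprintf (toy_first_error, sizeof toy_first_error, "NULL rvalue");
      return 0;
    }
  if (!toy_type_is_numeric (result_type))
    {
      snprintf (toy_first_error, sizeof toy_first_error,
                "toy_unary_op %i with operand %s "
                "has non-numeric result_type: %s",
                op, operand->debug_string, toy_type_name (result_type));
      return 0;
    }
  return 1;
}
```

With an operand described as "(int)1", asking for a TOY_VOID_PTR result is
rejected and the recorded message has the same layout as the one the new
testcase expects.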

make check-jit runs clean

Is it okay for trunk?

Bests
  Andrea

gcc/jit/ChangeLog
2019-07-18  Andrea Corallo 

* libgccjit.c (gcc_jit_context_new_unary_op): Check result_type to be a
numeric type.
* libgccjit.c (gcc_jit_context_new_binary_op): Fix nit in error message.

gcc/testsuite/ChangeLog
2019-07-04  Andrea Corallo 

* jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-type.c:
New testcase.
* jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c:
Fix nit in error message.
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 23e83e2..bea840f 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -1336,6 +1336,12 @@ gcc_jit_context_new_unary_op (gcc_jit_context *ctxt,
 "unrecognized value for enum gcc_jit_unary_op: %i",
 op);
   RETURN_NULL_IF_FAIL (result_type, ctxt, loc, "NULL result_type");
+  RETURN_NULL_IF_FAIL_PRINTF3 (
+result_type->is_numeric (), ctxt, loc,
+"gcc_jit_unary_op %i with operand %s "
+"has non-numeric result_type: %s",
+op, rvalue->get_debug_string (),
+result_type->get_debug_string ());
   RETURN_NULL_IF_FAIL (rvalue, ctxt, loc, "NULL rvalue");
 
   return (gcc_jit_rvalue *)ctxt->new_unary_op (loc, op, result_type, rvalue);
@@ -1388,7 +1394,7 @@ gcc_jit_context_new_binary_op (gcc_jit_context *ctxt,
   RETURN_NULL_IF_FAIL_PRINTF4 (
 result_type->is_numeric (), ctxt, loc,
 "gcc_jit_binary_op %i with operands a: %s b: %s "
-"has non numeric result_type: %s",
+"has non-numeric result_type: %s",
 op, a->get_debug_string (), b->get_debug_string (),
 result_type->get_debug_string ());
 
diff --git a/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c b/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c
index abadc9f..d2a0963 100644
--- a/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c
+++ b/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c
@@ -36,6 +36,6 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
   /* Verify that the correct error message was emitted.	 */
   CHECK_STRING_VALUE (gcc_jit_context_get_first_error (ctxt),
 		  "gcc_jit_context_new_binary_op: gcc_jit_binary_op 1 with"
-		  " operands a: (int)1 b: (int)2 has non numeric "
+		  " operands a: (int)1 b: (int)2 has non-numeric "
 		  "result_type: void *");
 }
diff --git a/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-type.c b/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-type.c
new file mode 100644
index 000..f547974
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-error-gcc_jit_context_new_unary_op-bad-res-type.c
@@ -0,0 +1,37 @@
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+/* Try to create an unary operator with invalid result type.  */
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  gcc_jit_type *int_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+  gcc_jit_type *void_ptr_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_VOID_PTR);
+
+  gcc_jit_context_new_unary_op (
+ctxt,
+NULL,
+GCC_JIT_UNARY_OP_LOGICAL_NEGATE,
+void_ptr_type,
+gcc_jit_context_new_rvalue_from_int (ctxt,
+	 int_type,
+	 1));
+}
+
+void
+verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
+{
+  CHECK_VALUE (result, NULL);
+
+  /* Verify that the correct error message was emitted.	 */
+  CHECK_STRING_VALUE (gcc_jit_context_get_first_error (ctxt),
+		  "gcc_jit_context_new_unary_op: gcc_jit_unary_op 2 with "
+		  "operand (int)1 has non-numeric result_type: void *");
+}

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Jan Hubicka

> Jan Hubicka  writes:
> 
> >> 
> >> OK.  I wonder if we can/should carve off some bits to note
> >> type_with_linkage_p and type_in_anonymous_namespace_p in the tree
> >> itself?  At least in type_common there's plenty of bits left.
> >> Not sure how expensive / reliable (non-C++?) those tests otherwise are.
> >
> > It also makes me wonder if other languages (D, Ada, go, Fortran...) have
> > concept of anonymous namespace types - that is types that are never
> > interoperable with types from another translation unit.  That would
> > justify the extra flag pretty well.
> >
> > Similarly for types with name mangling defined.  Both these bits can be
> > made independent of C++.
> 
> Go has the concept, but it implements it by mangling the names with the
> package-path, which is required to be unique within an application (the
> package-path is normally the path used to find an import, so it is
> inherently unique within a file system).

Currently we implement ODR names only for C++.  If Go has a similar
concept (i.e. types have mangled names and equal names imply equal
types across units), we may want to implement it too and improve TBAA for
Go programs.  Is there something I can read about Go types and
mangling?

This would be good motivation to make the ODR type machinery independent of
C++.  Until now it was only used to drive devirtualization (which needs
BINFOs that are not produced by the Go FE either) and to produce ODR
violation warnings (which I am not sure would make sense for Go), but with
TBAA I think I can take a look into this.

Honza
> 
> Ian

Re: [PATCH] Move rust_{is_mangled,demangle_sym} to a private libiberty header.

2019-07-18 Thread Jakub Jelinek

On Thu, Jul 18, 2019 at 07:04:05AM -0700, Ian Lance Taylor wrote:
> >> On Mon, Jun 3, 2019, at 7:23 AM, Ian Lance Taylor wrote:
> >> > On Sat, Jun 1, 2019 at 7:15 AM Eduard-Mihai Burtescu  
> >> > wrote:
> >> > >
> >> > > 2019-06-01 Eduard-Mihai Burtescu 
> >> > > include/ChangeLog:
> >> > > * demangle.h (rust_is_mangled): Move to libiberty/rust-demangle.h.
> >> > > (rust_demangle_sym): Move to libiberty/rust-demangle.h.
> >> > > libiberty/ChangeLog:
> >> > > * cplus-dem.c: Include rust-demangle.h.
> >> > > * rust-demangle.c: Include rust-demangle.h.
> >> > > * rust-demangle.h: New file.
> >> > 
> >> > This is OK if it bootstraps and tests pass.
> >> > 
> >> > Thanks.
> >> > 
> >> > Ian
> >> > 
> 
> 
> Can someone volunteer to commit this patch?  Thanks.

Done.

Jakub

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Ian Lance Taylor

Jan Hubicka  writes:

>> 
>> OK.  I wonder if we can/should carve off some bits to note
>> type_with_linkage_p and type_in_anonymous_namespace_p in the tree
>> itself?  At least in type_common there's plenty of bits left.
>> Not sure how expensive / reliable (non-C++?) those tests otherwise are.
>
> It also makes me wonder if other languages (D, Ada, go, Fortran...) have
> concept of anonymous namespace types - that is types that are never
> interoperable with types from another translation unit.  That would
> justify the extra flag pretty well.
>
> Similarly for types with name mangling defined.  Both these bits can be
> made independent of C++.

Go has the concept, but it implements it by mangling the names with the
package-path, which is required to be unique within an application (the
package-path is normally the path used to find an import, so it is
inherently unique within a file system).

Ian

Re: [PATCH] Move rust_{is_mangled,demangle_sym} to a private libiberty header.

2019-07-18 Thread Ian Lance Taylor

"Eduard-Mihai Burtescu"  writes:

> Pinging this again - while it's a tiny change, I want it to land
> before I submit anything else in this area.
> Also, I forgot to mention I have no commit access.
>
> Original submission can be found at
> https://gcc.gnu.org/ml/gcc-patches/2019-06/msg6.html.
>
> Thanks,
> - Eddy B.
>
>
> On Wed, Jun 26, 2019, at 11:54 AM, Eduard-Mihai Burtescu wrote:
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> 
>> (Apologies for the delay, while I was able to run libiberty tests
>> back when I submitted the patch, I wanted to make sure I can run the
>> whole GCC testsuite, especially for more significant future
>> contributions, so I had to wait until I had the time to troubleshoot
>> the NixOS support for GCC's make check)
>> 
>> Thanks,
>> - Eddy B.
>> 
>> 
>> On Mon, Jun 3, 2019, at 7:23 AM, Ian Lance Taylor wrote:
>> > On Sat, Jun 1, 2019 at 7:15 AM Eduard-Mihai Burtescu  
>> > wrote:
>> > >
>> > > 2019-06-01 Eduard-Mihai Burtescu 
>> > > include/ChangeLog:
>> > > * demangle.h (rust_is_mangled): Move to libiberty/rust-demangle.h.
>> > > (rust_demangle_sym): Move to libiberty/rust-demangle.h.
>> > > libiberty/ChangeLog:
>> > > * cplus-dem.c: Include rust-demangle.h.
>> > > * rust-demangle.c: Include rust-demangle.h.
>> > > * rust-demangle.h: New file.
>> > 
>> > This is OK if it bootstraps and tests pass.
>> > 
>> > Thanks.
>> > 
>> > Ian
>> > 


Can someone volunteer to commit this patch?  Thanks.

Ian

[arm] Fix incorrect modes with 'borrow' operations

2019-07-18 Thread Richard Earnshaw (lists)


Looking through the arm backend I noticed that the modes used to pass
comparison types into subtract-with-carry operations were being
incorrectly set.  The result is that the compiler is not truly
self-consistent.  To clean this up I've introduced a new predicate,
arm_borrow_operation (borrowed from the AArch64 backend) which can
match the comparison type with the required mode and then fixed all
the patterns to use this.  The split patterns that were generating
incorrect modes have all obviously been fixed as well.

The basic rule for the use of a borrow is:
- if the condition code was set by a 'subtract-like' operation (subs,
  cmp), then use CCmode and LTU.
- if the condition code was set by unsigned overflow of addition (adds),
  then use CC_Cmode and GEU.
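
A small C model may help to see why the two cases use opposite conditions.
This is only a sketch of my reading of the rule above: the function names
are invented for illustration, and it assumes that the Arm C flag is set
when a subtraction does not borrow and when an addition carries out.

```c
#include <stdint.h>

/* Borrow after a subtract-like flag setter (subs/cmp a, b).
   C is set when there is no borrow, so a borrow is needed exactly
   when a < b unsigned: LTU on a comparison of a with b (CCmode).  */
static uint32_t
borrow_after_cmp (uint32_t a, uint32_t b)
{
  return a < b;
}

/* Borrow after adds a, b.  Here C is the carry out of the addition,
   modelled as comparing the sum with one operand: carry iff sum < a.
   A following sbc borrows when C is clear, i.e. when sum >= a:
   GEU on the sum-versus-operand comparison (CC_Cmode).  */
static uint32_t
borrow_after_adds (uint32_t a, uint32_t b)
{
  uint32_t sum = a + b;
  return sum >= a;
}

/* 64-bit subtraction built from 32-bit halves, in the style of a
   subs on the low words followed by an sbc on the high words.  */
static uint64_t
sub64_by_halves (uint64_t a, uint64_t b)
{
  uint32_t a_lo = (uint32_t) a, a_hi = (uint32_t) (a >> 32);
  uint32_t b_lo = (uint32_t) b, b_hi = (uint32_t) (b >> 32);
  uint32_t lo = a_lo - b_lo;
  uint32_t hi = a_hi - b_hi - borrow_after_cmp (a_lo, b_lo);
  return ((uint64_t) hi << 32) | lo;
}
```

The subdi3 style splits correspond to sub64_by_halves, which is why they
want CCmode with LTU rather than CC_Cmode.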

* config/arm/predicates.md (arm_borrow_operation): New predicate.
* config/arm/arm.md (subdi3_compare1): Use CCmode for the split.
(arm_subdi3, subdi_di_zesidi, subdi_di_sesidi): Likewise.
(subdi_zesidi_zesidi): Likewise.
(negdi2_compare, negdi2_insn): Likewise.
(negdi_extensidi): Likewise.
(negdi_zero_extendsidi): Likewise.
(arm_cmpdi_insn): Likewise.
(subsi3_carryin): Use arm_borrow_operation.
(subsi3_carryin_const): Likewise.
(subsi3_carryin_const0): Likewise.
(subsi3_carryin_compare): Likewise.
(subsi3_carryin_compare_const): Likewise.
(subsi3_carryin_compare_const0): Likewise.
(subsi3_carryin_shift): Likewise.
(rsbsi3_carryin_shift): Likewise.
(negsi2_carryin_compare): Likewise.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8f4a4c26ea8..dcb57372192 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1110,7 +1110,7 @@ (define_insn_and_split "subdi3_compare1"
(parallel [(set (reg:CC CC_REGNUM)
 		   (compare:CC (match_dup 4) (match_dup 5)))
 	 (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5))
-			   (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
+			   (ltu:SI (reg:CC CC_REGNUM) (const_int 0))))])]
   {
 operands[3] = gen_highpart (SImode, operands[0]);
 operands[0] = gen_lowpart (SImode, operands[0]);
@@ -1141,7 +1141,7 @@ (define_insn "*subsi3_carryin"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 	(minus:SI (minus:SI (match_operand:SI 1 "reg_or_int_operand" "r,I,Pz")
 			(match_operand:SI 2 "s_register_operand" "r,r,r"))
-		  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+		  (match_operand:SI 3 "arm_borrow_operation" "")))]
   "TARGET_32BIT"
   "@
sbc%?\\t%0, %1, %2
@@ -1155,9 +1155,10 @@ (define_insn "*subsi3_carryin"
 
 (define_insn "*subsi3_carryin_const"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-(minus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "r")
-   (match_operand:SI 2 "arm_neg_immediate_operand" "L"))
-  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(minus:SI (plus:SI
+		   (match_operand:SI 1 "s_register_operand" "r")
+		   (match_operand:SI 2 "arm_neg_immediate_operand" "L"))
+		  (match_operand:SI 3 "arm_borrow_operation" "")))]
   "TARGET_32BIT"
   "sbc\\t%0, %1, #%n2"
   [(set_attr "conds" "use")
@@ -1166,8 +1167,8 @@ (define_insn "*subsi3_carryin_const"
 
 (define_insn "*subsi3_carryin_const0"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-(minus:SI (match_operand:SI 1 "s_register_operand" "r")
-  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(minus:SI (match_operand:SI 1 "s_register_operand" "r")
+		  (match_operand:SI 2 "arm_borrow_operation" "")))]
   "TARGET_32BIT"
   "sbc\\t%0, %1, #0"
   [(set_attr "conds" "use")
@@ -1176,12 +1177,11 @@ (define_insn "*subsi3_carryin_const0"
 
 (define_insn "*subsi3_carryin_compare"
   [(set (reg:CC CC_REGNUM)
-(compare:CC (match_operand:SI 1 "s_register_operand" "r")
-(match_operand:SI 2 "s_register_operand" "r")))
+	(compare:CC (match_operand:SI 1 "s_register_operand" "r")
+		(match_operand:SI 2 "s_register_operand" "r")))
(set (match_operand:SI 0 "s_register_operand" "=r")
-(minus:SI (minus:SI (match_dup 1)
-(match_dup 2))
-  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(minus:SI (minus:SI (match_dup 1) (match_dup 2))
+		  (match_operand:SI 3 "arm_borrow_operation" "")))]
   "TARGET_32BIT"
   "sbcs\\t%0, %1, %2"
   [(set_attr "conds" "set")
@@ -1190,12 +1190,13 @@ (define_insn "*subsi3_carryin_compare"
 
 (define_insn "*subsi3_carryin_compare_const"
   [(set (reg:CC CC_REGNUM)
-(compare:CC (match_operand:SI 1 "reg_or_int_operand" "r")
-(match_operand:SI 2 "const_int_I_operand" "I")))
+	(compare:CC (match_operand:SI 1 "reg_or_int_operand" "r")
+		(match_operand:SI 2 "const_int_I_operand" "I")))
(set (match_operand:SI 0 "s_register_operand" "=r")
-(minus:SI (plus:SI (match_dup 1)
-   (match_

[PATCH][MSP430] Fix unnecessary saving of all callee-saved regs in an interrupt function that calls another function

2019-07-18 Thread Jozef Lawrynowicz

The attached patch fixes an issue for msp430 where the logic to decide which
registers need to be saved in an interrupt function was unnecessarily
choosing to save all callee-saved registers regardless of whether they were
used or not. This came at a code size and performance penalty for the 430 ISA,
and a performance penalty for the 430X ISA.

Interrupt functions require special conventions for saving registers which
would normally be caller-saved. Since the interrupt happens without warning,
registers that would normally have been preserved by the caller of a function
cannot be preserved when an interrupt is triggered. This means interrupts must
save and restore the used caller-saved registers, in addition to the used
callee-saved registers that a regular function would save.

If an interrupt is not a leaf function, all caller-saved registers must be
saved/restored in the prologue/epilogue of the interrupt function, since it
is unknown which of these will be modified in later functions.

We can rely on the function called by an interrupt to save and restore
callee-saved registers, so it is unnecessary to save all callee-saved regs
in the ISR. This is what this patch changes.
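
The resulting decision can be sketched as a small stand-alone C model.  It
is a simplification for illustration only: toy_fn and the toy_* helpers are
invented names, fixed registers are ignored, and the register ranges
(R4-R10 callee-saved, R11-R15 caller-saved) are taken from the description
in this patch.

```c
#include <stdbool.h>

/* Hypothetical summary of the function being compiled.  */
struct toy_fn
{
  bool is_interrupt;   /* Function has the interrupt attribute.  */
  bool is_leaf;        /* Function calls no other function.  */
  bool reg_live[16];   /* reg_live[N] is true if RN is ever used.  */
};

static bool
toy_caller_saved_p (int regno)
{
  return regno >= 11 && regno <= 15;
}

static bool
toy_callee_saved_p (int regno)
{
  return regno >= 4 && regno <= 10;
}

/* Return true if REGNO must be saved in the prologue of FN.  */
static bool
toy_preserve_reg_p (const struct toy_fn *fn, int regno)
{
  if (fn->is_interrupt && toy_caller_saved_p (regno))
    {
      /* A non-leaf ISR cannot know what its callees clobber, so all
         caller-saved registers are saved.  */
      if (!fn->is_leaf)
        return true;
      /* A leaf ISR only needs the caller-saved registers it uses.  */
      if (fn->reg_live[regno])
        return true;
    }

  /* Callee-saved registers are saved only when used, for ISRs and
     regular functions alike; a callee preserves them itself.  */
  return toy_callee_saved_p (regno) && fn->reg_live[regno];
}
```

For a non-leaf ISR that uses no registers itself, this model saves R11 but
not R10, which matches what the new isr-push-pop-isr tests scan for.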

Successfully regtested for msp430-elf on trunk for C/C++.

Ok for trunk?

Thanks,
Jozef
>From 1e151dac2be34ae50bea8b4b37bd2d78c5f7ddd6 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 18 Jul 2019 09:25:52 +0100
Subject: [PATCH] MSP430: Fix unnecessary saving of all callee-saved regs in an
 ISR which calls another function

gcc/ChangeLog:

2019-07-18  Jozef Lawrynowicz  

	* config/msp430/msp430.c (msp430_preserve_reg_p): Don't save
	callee-saved regs R4->R10 in an interrupt function that calls another
	function.

gcc/testsuite/ChangeLog:

2019-07-18  Jozef Lawrynowicz  

	* gcc.target/msp430/isr-push-pop-main.c: New test.
	* gcc.target/msp430/isr-push-pop-isr-430.c: Likewise.
	* gcc.target/msp430/isr-push-pop-isr-430x.c: Likewise.
	* gcc.target/msp430/isr-push-pop-leaf-isr-430.c: Likewise.
	* gcc.target/msp430/isr-push-pop-leaf-isr-430x.c: Likewise.
	
---
 gcc/config/msp430/msp430.c|  18 ++-
 .../gcc.target/msp430/isr-push-pop-isr-430.c  |  13 ++
 .../gcc.target/msp430/isr-push-pop-isr-430x.c |  12 ++
 .../msp430/isr-push-pop-leaf-isr-430.c|  27 
 .../msp430/isr-push-pop-leaf-isr-430x.c   |  24 
 .../gcc.target/msp430/isr-push-pop-main.c | 120 ++
 6 files changed, 209 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430x.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/isr-push-pop-leaf-isr-430.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/isr-push-pop-leaf-isr-430x.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/isr-push-pop-main.c

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 365e9eba747..265c2f642d8 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1755,11 +1755,19 @@ msp430_preserve_reg_p (int regno)
   if (fixed_regs [regno])
 return false;
 
-  /* Interrupt handlers save all registers they use, even
- ones which are call saved.  If they call other functions
- then *every* register is saved.  */
-  if (msp430_is_interrupt_func ())
-return ! crtl->is_leaf || df_regs_ever_live_p (regno);
+  /* For interrupt functions we must save and restore the used regs that
+ would normally be caller-saved (R11->R15).  */
+  if (msp430_is_interrupt_func () && regno >= 11 && regno <= 15)
+{
+  if (crtl->is_leaf && df_regs_ever_live_p (regno))
+	/* If the interrupt func is a leaf then we only need to restore the
+	   caller-saved regs that are used.  */
+	return true;
+  else if (!crtl->is_leaf)
+	/* If the interrupt function is not a leaf we must save all
+	   caller-saved regs in case the callee modifies them.  */
+	return true;
+}
 
   if (!call_used_regs [regno]
   && df_regs_ever_live_p (regno))
diff --git a/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430.c b/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430.c
new file mode 100644
index 000..a2bf8433ebd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-mcpu=msp430x*" "-mlarge" } { "" } } */
+/* { dg-options "-mcpu=msp430" } */
+/* { dg-final { scan-assembler "PUSH\tR11" } } */
+/* { dg-final { scan-assembler-not "PUSH\tR10" } } */
+
+void __attribute__((noinline)) callee (void);
+
+void __attribute__((interrupt))
+isr (void)
+{
+  callee();
+}
diff --git a/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430x.c b/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430x.c
new file mode 100644
index 000..2d65186bdf9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/isr-push-pop-isr-430x.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Jan Hubicka

> 
> OK.  I wonder if we can/should carve off some bits to note
> type_with_linkage_p and type_in_anonymous_namespace_p in the tree
> itself?  At least in type_common there's plenty of bits left.
> Not sure how expensive / reliable (non-C++?) those tests otherwise are.

It also makes me wonder if other languages (D, Ada, go, Fortran...) have
concept of anonymous namespace types - that is types that are never
interoperable with types from another translation unit.  That would
justify the extra flag pretty well.

Similarly for types with name mangling defined.  Both these bits can be
made independent of C++.

Honza

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Jan Hubicka

> On Thu, 18 Jul 2019, Jan Hubicka wrote:
> 
> > Hi,
> > this patch adjusts LTO tree merging to treat anonymous namespace types
> > as local to a given TU, so just like !TREE_PUBLIC decls they are not
> > merged (this is unify_scc change). This makes them to get different
> > canonical types and act as independent types for TBAA.
> > 
> > I also modified canonical type calculation to never consider anonymous
> > namespace types as interoperable with structurally equivalent types
> > declared in non-C++ translation units.
> > 
> > Both these changes are tested by the testcase (first change makes set1
> > and set2 independent and second change makes set1&set2 independent of
> > set3).
> > 
> > lto-bootstrapped/regtested x86_64-linux
> 
> OK.  I wonder if we can/should carve off some bits to note
> type_with_linkage_p and type_in_anonymous_namespace_p in the tree
> itself?  At least in type_common there's plenty of bits left.
> Not sure how expensive / reliable (non-C++?) those tests otherwise are.

I was thinking of that too. The pattern matching is somewhat ugly.
It is now sanity checked for non-C++ FEs by verifying that CXX_ODR_P is
set if the test matches, so it seems to work reliably (but indeed there
were issues until this stage1).

After we dropped -fno-odr-type-merging these predicates are no longer
absolutely terrible though. type_with_linkage_p boils down to testing
that TYPE_NAME is TYPE_DECL and it has ASSEMBLER_NAME set.  Anonymous
namespace just looks for  in the type name.

I plan to do some profiling and we can see how bad they are these days.
A flag would also save the weird set of tests we do in
need_assembler_name_p to not get confused by other languages.

Honza

Re: Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Richard Biener

On Thu, 18 Jul 2019, Jan Hubicka wrote:

> Hi,
> this patch adjusts LTO tree merging to treat anonymous namespace types
> as local to a given TU, so just like !TREE_PUBLIC decls they are not
> merged (this is unify_scc change). This makes them to get different
> canonical types and act as independent types for TBAA.
> 
> I also modified canonical type calculation to never consider anonymous
> namespace types as interoperable with structurally equivalent types
> declared in non-C++ translation units.
> 
> Both these changes are tested by the testcase (first change makes set1
> and set2 independent and second change makes set1&set2 independent of
> set3).
> 
> lto-bootstrapped/regtested x86_64-linux

OK.  I wonder if we can/should carve off some bits to note
type_with_linkage_p and type_in_anonymous_namespace_p in the tree
itself?  At least in type_common there's plenty of bits left.
Not sure how expensive / reliable (non-C++?) those tests otherwise are.

Thanks,
Richard.

> Honza
> 
>   * lto-common.c (gimple_register_canonical_type_1): Do not look for
>   non-ODR conflicts of types in anonymous namespaces.
>   (unify_scc): Do not merge anonymous namespace types.
>   * g++.dg/lto/alias-5_0.C: New testcase.
>   * g++.dg/lto/alias-5_1.C: New.
>   * g++.dg/lto/alias-5_2.c: New.
> Index: lto/lto-common.c
> ===
> --- lto/lto-common.c  (revision 273551)
> +++ lto/lto-common.c  (working copy)
> @@ -418,13 +418,19 @@ gimple_register_canonical_type_1 (tree t
>if (RECORD_OR_UNION_TYPE_P (t)
>&& odr_type_p (t) && !odr_type_violation_reported_p (t))
>  {
> -  /* Here we rely on fact that all non-ODR types was inserted into
> -  canonical type hash and thus we can safely detect conflicts between
> -  ODR types and interoperable non-ODR types.  */
> -  gcc_checking_assert (type_streaming_finished
> -&& TYPE_MAIN_VARIANT (t) == t);
> -  slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
> -NO_INSERT);
> +  /* Anonymous namespace types never conflict with non-C++ types.  */
> +  if (type_with_linkage_p (t) && type_in_anonymous_namespace_p (t))
> + slot = NULL;
> +  else
> + {
> +   /* Here we rely on fact that all non-ODR types was inserted into
> +  canonical type hash and thus we can safely detect conflicts between
> +  ODR types and interoperable non-ODR types.  */
> +   gcc_checking_assert (type_streaming_finished
> +&& TYPE_MAIN_VARIANT (t) == t);
> +   slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
> +NO_INSERT);
> + }
>if (slot && !TYPE_CXX_ODR_P (*(tree *)slot))
>   {
> tree nonodr = *(tree *)slot;
> @@ -1640,11 +1646,14 @@ unify_scc (class data_in *data_in, unsig
>tree t = streamer_tree_cache_get_tree (cache, from + i);
>scc->entries[i] = t;
>/* Do not merge SCCs with local entities inside them.  Also do
> -  not merge TRANSLATION_UNIT_DECLs.  */
> +  not merge TRANSLATION_UNIT_DECLs and anonymous namespace types.  */
>if (TREE_CODE (t) == TRANSLATION_UNIT_DECL
> || (VAR_OR_FUNCTION_DECL_P (t)
> && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
> -   || TREE_CODE (t) == LABEL_DECL)
> +   || TREE_CODE (t) == LABEL_DECL
> +   || (TYPE_P (t)
> +   && type_with_linkage_p (TYPE_MAIN_VARIANT (t))
> +   && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t))))
>   {
> /* Avoid doing any work for these cases and do not worry to
>record the SCCs for further merging.  */
> Index: testsuite/g++.dg/lto/alias-5_0.C
> ===
> --- testsuite/g++.dg/lto/alias-5_0.C  (nonexistent)
> +++ testsuite/g++.dg/lto/alias-5_0.C  (working copy)
> @@ -0,0 +1,35 @@
> +/* { dg-lto-do run } */
> +/* { dg-lto-options { { -O3 -flto } } } */
> +/* This testcase tests that anonymous namespaces in different TUs are treated
> +   as different types by LTO TBAA and that they never alias with structurally
> +   same C types.  */
> +namespace {
> +  __attribute__((used))
> +  struct a {int a;} *p,**ptr=&p;
> +};
> +void
> +set1()
> +{
> +  *ptr=0;
> +}
> +void
> +get1()
> +{
> +  if (!__builtin_constant_p (*ptr==0))
> +__builtin_abort ();
> +}
> +extern void set2();
> +extern "C" void set3();
> +int n = 1;
> +int
> +main()
> +{
> +  for (int i = 0; i < n; i++)
> +{
> +  set1();
> +  set2();
> +  set3();
> +  get1();
> +}
> +  return 0;
> +}
> Index: testsuite/g++.dg/lto/alias-5_1.C
> ===
> --- testsuite/g++.dg/lto/alias-5_1.C  (nonexistent)
> +++ testsuite/g++.dg/lto/alias-5_1.C  (working copy)
> @@ -0,0 +1,9 @@
> +namespace {
> +  __attribute__((

Improve TBAA for types in anonymous namespaces

2019-07-18 Thread Jan Hubicka

Hi,
this patch adjusts LTO tree merging to treat anonymous namespace types
as local to a given TU, so just like !TREE_PUBLIC decls they are not
merged (this is the unify_scc change). This makes them get different
canonical types and act as independent types for TBAA.

I also modified canonical type calculation to never consider anonymous
namespace types as interoperable with structurally equivalent types
declared in non-C++ translation units.

Both these changes are tested by the testcase (first change makes set1
and set2 independent and second change makes set1&set2 independent of
set3).

lto-bootstrapped/regtested x86_64-linux
Honza

* lto-common.c (gimple_register_canonical_type_1): Do not look for
non-ODR conflicts of types in anonymous namespaces.
(unify_scc): Do not merge anonymous namespace types.
* g++.dg/lto/alias-5_0.C: New testcase.
* g++.dg/lto/alias-5_1.C: New.
* g++.dg/lto/alias-5_2.c: New.
Index: lto/lto-common.c
===
--- lto/lto-common.c(revision 273551)
+++ lto/lto-common.c(working copy)
@@ -418,13 +418,19 @@ gimple_register_canonical_type_1 (tree t
   if (RECORD_OR_UNION_TYPE_P (t)
   && odr_type_p (t) && !odr_type_violation_reported_p (t))
 {
-  /* Here we rely on fact that all non-ODR types was inserted into
-canonical type hash and thus we can safely detect conflicts between
-ODR types and interoperable non-ODR types.  */
-  gcc_checking_assert (type_streaming_finished
-  && TYPE_MAIN_VARIANT (t) == t);
-  slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
-  NO_INSERT);
+  /* Anonymous namespace types never conflict with non-C++ types.  */
+  if (type_with_linkage_p (t) && type_in_anonymous_namespace_p (t))
+   slot = NULL;
+  else
+   {
+ /* Here we rely on fact that all non-ODR types was inserted into
+canonical type hash and thus we can safely detect conflicts between
+ODR types and interoperable non-ODR types.  */
+ gcc_checking_assert (type_streaming_finished
+  && TYPE_MAIN_VARIANT (t) == t);
+ slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
+  NO_INSERT);
+   }
   if (slot && !TYPE_CXX_ODR_P (*(tree *)slot))
{
  tree nonodr = *(tree *)slot;
@@ -1640,11 +1646,14 @@ unify_scc (class data_in *data_in, unsig
   tree t = streamer_tree_cache_get_tree (cache, from + i);
   scc->entries[i] = t;
   /* Do not merge SCCs with local entities inside them.  Also do
-not merge TRANSLATION_UNIT_DECLs.  */
+not merge TRANSLATION_UNIT_DECLs and anonymous namespace types.  */
   if (TREE_CODE (t) == TRANSLATION_UNIT_DECL
  || (VAR_OR_FUNCTION_DECL_P (t)
  && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
- || TREE_CODE (t) == LABEL_DECL)
+ || TREE_CODE (t) == LABEL_DECL
+ || (TYPE_P (t)
+ && type_with_linkage_p (TYPE_MAIN_VARIANT (t))
+ && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t))))
{
  /* Avoid doing any work for these cases and do not worry to
 record the SCCs for further merging.  */
Index: testsuite/g++.dg/lto/alias-5_0.C
===
--- testsuite/g++.dg/lto/alias-5_0.C(nonexistent)
+++ testsuite/g++.dg/lto/alias-5_0.C(working copy)
@@ -0,0 +1,35 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options { { -O3 -flto } } } */
+/* This testcase tests that anonymous namespaces in different TUs are treated
+   as different types by LTO TBAA and that they never alias with structurally
+   same C types.  */
+namespace {
+  __attribute__((used))
+  struct a {int a;} *p,**ptr=&p;
+};
+void
+set1()
+{
+  *ptr=0;
+}
+void
+get1()
+{
+  if (!__builtin_constant_p (*ptr==0))
+__builtin_abort ();
+}
+extern void set2();
+extern "C" void set3();
+int n = 1;
+int
+main()
+{
+  for (int i = 0; i < n; i++)
+{
+  set1();
+  set2();
+  set3();
+  get1();
+}
+  return 0;
+}
Index: testsuite/g++.dg/lto/alias-5_1.C
===
--- testsuite/g++.dg/lto/alias-5_1.C(nonexistent)
+++ testsuite/g++.dg/lto/alias-5_1.C(working copy)
@@ -0,0 +1,9 @@
+namespace {
+  __attribute__((used))
+  struct a {int a;} *p,**ptr=&p,q;
+};
+void
+set2()
+{
+  *ptr=&q;
+}
Index: testsuite/g++.dg/lto/alias-5_2.c
===
--- testsuite/g++.dg/lto/alias-5_2.c(nonexistent)
+++ testsuite/g++.dg/lto/alias-5_2.c(working copy)
@@ -0,0 +1,7 @@
+  __attribute__((used))
+  struct a {int a;} *p,**ptr=&p,q;
+void
+set3()
+{
+  *ptr=&q;
+}
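
A minimal stand-alone sketch of the condition that the unify_scc hunk extends, for readers who want to see it in isolation. Everything named toy_ or Toy is hypothetical: the struct only stands in for the tree properties the real code reads (TREE_CODE, TREE_PUBLIC, DECL_EXTERNAL, type_with_linkage_p, type_in_anonymous_namespace_p), and the sketch does not model the streaming or hashing around it.

```cpp
#include <cassert>

// Hypothetical stand-in for the tree codes the check distinguishes.
enum ToyCode
{
  TOY_TRANSLATION_UNIT_DECL,
  TOY_VAR_OR_FUNCTION_DECL,
  TOY_LABEL_DECL,
  TOY_TYPE,
  TOY_OTHER
};

// Hypothetical stand-in for the properties read from a tree node.
struct ToyTree
{
  ToyCode code;
  bool is_public;          // models TREE_PUBLIC
  bool is_external;        // models DECL_EXTERNAL
  bool has_linkage;        // models type_with_linkage_p
  bool in_anon_namespace;  // models type_in_anonymous_namespace_p
};

// Return true if T is local to its translation unit, so that an SCC
// containing it is not a candidate for merging.
inline bool toy_blocks_merging(const ToyTree* t)
{
  assert(t != nullptr);
  return (t->code == TOY_TRANSLATION_UNIT_DECL
          || (t->code == TOY_VAR_OR_FUNCTION_DECL
              && !(t->is_public || t->is_external))
          || t->code == TOY_LABEL_DECL
          || (t->code == TOY_TYPE
              && t->has_linkage
              && t->in_anon_namespace));
}
```

In this model an anonymous namespace type is treated the same way as a static variable: both keep their SCC out of merging, which is the "just like !TREE_PUBLIC decls" behaviour described in the mail.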

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-18 Thread Kwok Cheung Yeung


On 18/07/2019 10:28 am, Tobias Burnus wrote:

> Hi all,
>
> I played around and came up with a second way one gets a single "*"
> without 'optional'.
>
> I haven't checked which of those match the proposed
> omp_is_optional_argument's
> +&& DECL_BY_REFERENCE (decl)
> +&& TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE;
> nor whether some checks reject any of those with OpenACC (or OpenMP).
>
> In any case, the dump of "type(c_ptr),value", "integer, dimension(1)" and
> "integer,optional" is:
>
> static void foo (void *, integer(kind=4)[1] *, integer(kind=4) *);



The case of the fixed-length array currently doesn't work properly with 
omp_is_optional_argument, as it returns true whether or not it is optional. 
Indeed, the PARM_DECL doesn't seem to change between optional and non-optional, 
so it is probably impossible to discern via the tree unless some extra 
information is added by the front-end.


However, optional arrays still 'just work' with my patches (most of the 
testcases include tests for arrays in optional arguments). I believe this is 
because the existing code must already deal with pointers to arrays, so the 
false positive simply does not matter on the codepath taken. The new case of a 
null pointer (in the case of a non-present optional argument) was dealt with by 
making operations on null pointers into NOPs.
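
A minimal sketch of why the false positive described above cannot be avoided by a check that only looks at these two properties. The ToyParm struct and the toy_ names are hypothetical, the function mirrors only the two conditions quoted in this thread (the proposed omp_is_optional_argument may test more), and the flag values for the two dummies are an assumption taken from the statement that the PARM_DECL does not change between optional and non-optional.

```cpp
#include <cassert>

// Hypothetical model of the two properties the quoted check reads.
struct ToyParm
{
  bool by_reference;  // models DECL_BY_REFERENCE (decl)
  bool pointer_type;  // models TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
};

// Mirrors only the two conditions quoted in the thread.
inline bool toy_is_optional_argument(const ToyParm& p)
{
  return p.by_reference && p.pointer_type;
}

inline void toy_demo()
{
  // Assumption from the mail: a non-optional "integer, dimension(1)" dummy
  // and an "integer, optional" dummy look identical to the check.
  const ToyParm fixed_length_array = { true, true };
  const ToyParm optional_scalar = { true, true };

  // Both are reported as optional, so the first result is a false positive.
  assert(toy_is_optional_argument(fixed_length_array));
  assert(toy_is_optional_argument(optional_scalar));
}
```

Since the inputs are identical, no predicate over these fields alone can separate the two cases; telling them apart would need extra information from the front end, as noted above.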




> Actually, if one combines VALUE with OPTIONAL, it gets even more interesting.
> To implement it, one has two choices:
> * pass a copy by reference (or NULL)
> * pass by value (including a dummy value if absent) and denote the state as
>   an extra argument.
>
> The latter is done in gfortran, cf. PR fortran/35203, following the IBM
> compiler.
>
> I am not sure whether it needs special care for OpenACC (or OpenMP 5)
> offloading, but that's at least a case which is not handled by the patch.



Optional by-value arguments are tested in optional-data-copyin-by-value.f90. 
They do not need extra handling, since from the OACC lowering perspective they 
are just two bog-standard integral values of no special interest.


Kwok

Re: Fix failing tests after PR libstdc++/85965

2019-07-18 Thread Jonathan Wakely


On 18/07/19 07:41 +0200, François Dumont wrote:

> Since commit 5d3695d03b7bdade9f4d05d2b those tests are failing.
>
>     * testsuite/23_containers/unordered_map/48101_neg.cc: Adapt dg-error
>     after PR libstdc++/85965 fix.
>     * testsuite/23_containers/unordered_multimap/48101_neg.cc: Likewise.
>     * testsuite/23_containers/unordered_multiset/48101_neg.cc: Likewise.
>     * testsuite/23_containers/unordered_set/48101_neg.cc
>
> It is quite trivial but I wonder if there is another plan to restore
> those static assertions differently.
>
> Ok to commit ?


No. I don't see these failures. With the first change applied, I see a
new failure.

The patch seems wrong.

Re: sized delete in _Temporary_buffer<>

2019-07-18 Thread Jonathan Wakely


On 18/07/19 07:41 +0200, François Dumont wrote:
> As we adopted the sized deallocation in the new_allocator, why not
> do the same in _Temporary_buffer<>?
>
>     * include/bits/stl_tempbuf.h
>     (__detail::__return_temporary_buffer): New.
>     (~_Temporary_buffer()): Use latter.
>     (_Temporary_buffer(_FIterator, size_type)): Likewise.
>
> Tested w/o activating sized deallocation. I'll try to run tests with
> this option activated.


As the manual says, it's enabled by default for C++14 and later.


> Ok to commit ?


OK for trunk, thanks.
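
For context, a minimal sketch of what pairing a raw allocation with sized deallocation can look like. This is not the libstdc++ patch: the toy_ helper names are hypothetical, overflow of n * sizeof(T) and over-aligned types are ignored, and the feature-test macro is used so that the code falls back to the unsized form when sized deallocation is not enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Allocate raw storage for N objects of type T; returns nullptr on failure.
template<typename T>
T* toy_get_buffer(std::size_t n) noexcept
{
  assert(n != 0);
  return static_cast<T*>(::operator new(n * sizeof(T), std::nothrow));
}

// Release storage obtained from toy_get_buffer<T>(n).  The byte count
// passed to the sized form must match the one used for the allocation.
template<typename T>
void toy_return_buffer(T* p, std::size_t n) noexcept
{
#if __cpp_sized_deallocation
  ::operator delete(p, n * sizeof(T));  // sized form
#else
  (void) n;
  ::operator delete(p);                 // unsized fallback
#endif
}
```

The caller has to remember the element count it originally requested, which is why a helper taking both the pointer and the length is a natural shape for this.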

Re: [PATCH 04/10, OpenACC] Turn OpenACC kernels regions into a sequence of, parallel regions

2019-07-18 Thread Jakub Jelinek

On Wed, Jul 17, 2019 at 10:06:07PM +0100, Kwok Cheung Yeung wrote:
> --- a/gcc/omp-oacc-kernels.c
> +++ b/gcc/omp-oacc-kernels.c
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "backend.h"
>  #include "target.h"
>  #include "tree.h"
> +#include "cp/cp-tree.h"

No, you certainly don't want to do this.  Use langhooks if needed, though
that can be only for stuff done before IPA.  After IPA, because of LTO FE, you
must not rely on anything that is not in the IL generically.

> +  /* Build the gang-single region.  */
> +  gimple *single_region
> += gimple_build_omp_target (
> +NULL,
> +GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE,
> +gang_single_clause);

Formatting, both lack of tab uses, and ( at the end of line is very ugly.
Either try to use shorter GF_OMP_TARGET_KIND names, or say set a local
variable to that value and use that as an argument to the function to make
it shorter and more readable.

Jakub

Re: [PATCH 2/5, OpenACC] Support Fortran optional arguments in the firstprivate clause

2019-07-18 Thread Tobias Burnus

Hi all,

I played around and came up with a second way one gets a single "*"
without 'optional'.

I haven't checked which of those match the proposed
omp_is_optional_argument's
+&& DECL_BY_REFERENCE (decl)
+&& TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE;
nor whether some checks reject any of those with OpenACC (or OpenMP).

In any case, the dump of "type(c_ptr),value", "integer, dimension(1)" and
"integer,optional" is:

static void foo (void *, integer(kind=4)[1] *, integer(kind=4) *);


Actually, if one combines VALUE with OPTIONAL, it gets even more interesting.
To implement it, one has two choices:
* pass a copy by reference (or NULL)
* pass by value (including a dummy value if absent) and denote the state as
  extra argument.

The latter is done in gfortran, cf. PR fortran/35203, following the IBM 
compiler.
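
The two choices listed above can be sketched from the callee's point of view as follows. This is only an illustration of the idea: the toy_ function names are hypothetical, and the type and position of the extra argument that gfortran actually emits are not modelled here.

```cpp
#include <cassert>

// Choice 1: the caller passes a pointer to a copy, or a null pointer
// when the argument is absent.
inline int toy_value_or_by_reference(const int* z2_copy, int fallback)
{
  return z2_copy != nullptr ? *z2_copy : fallback;
}

// Choice 2: the caller passes the value itself (a dummy value when absent)
// plus an extra flag that records whether the argument is present.
inline int toy_value_or_by_value(int z2, bool z2_present, int fallback)
{
  return z2_present ? z2 : fallback;
}

inline void toy_demo()
{
  const int copy = 7;
  // Present argument: both conventions deliver the value.
  assert(toy_value_or_by_reference(&copy, -1) == 7);
  assert(toy_value_or_by_value(7, true, -1) == 7);
  // Absent argument: null pointer versus dummy value with a false flag.
  assert(toy_value_or_by_reference(nullptr, -1) == -1);
  assert(toy_value_or_by_value(0, false, -1) == -1);
}
```

With the second convention the presence information lives in a separate scalar, so any code that forwards the argument has to forward the flag as well.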


I am not sure whether it needs special care for OpenACC (or OpenMP 5)
offloading, but that's at least a case which is not handled by the patch.


Actually, there is a bug: the declaration of the function and the definition of
the function are not the same - one misses the hidden argument :-(

That's now PR fortran/91196.


Fortran code - which now also contains VALUE, OPTIONAL:

use iso_c_binding
implicit none
logical(kind=c_bool) :: is_present
integer :: y(1)
y(1) = 5
is_present = foo(c_null_ptr, y)
contains
  logical(kind=c_bool) function foo(x, y, z, z2)
    type(c_ptr), value :: x ! Matches a C 'void *' pointer
    integer, target :: y(1)
    integer, optional :: z
    integer, value, optional :: z2

    foo = present(z2)
  end function foo
end


Tobias

Re: [PATCH 02/10, OpenACC] Add OpenACC target kinds for decomposed kernels regions

2019-07-18 Thread Jakub Jelinek

On Wed, Jul 17, 2019 at 10:04:10PM +0100, Kwok Cheung Yeung wrote:
> @@ -2319,7 +2339,8 @@ scan_omp_for (gomp_for *stmt, omp_context *outer_ctx)
>  {
>omp_context *tgt = enclosing_target_ctx (outer_ctx);
> 
> -  if (!tgt || is_oacc_parallel (tgt))
> +  if (!tgt || (is_oacc_parallel (tgt)
> +&& !was_originally_oacc_kernels (tgt)))
>   for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> {
>   char const *check = NULL;

Please watch the formatting, the above doesn't use tabs where it should.
Have you run the series through contrib/check_GNU_style.sh ?

Otherwise, no concerns about this particular patch, assuming Thomas is ok
with it.

Jakub

Re: [PATCH 00/10, OpenACC] Rework handling of OpenACC kernels regions

2019-07-18 Thread Jakub Jelinek

On Wed, Jul 17, 2019 at 10:02:18PM +0100, Kwok Cheung Yeung wrote:
> This series of patches reworks the way that OpenACC kernels regions are
> processed by GCC. Instead of relying on the parloops pass for
> auto-parallelisation of the kernel region, the contents of the region are
> transformed into a sequence of offloaded regions, which are then processed
> individually.
> 
> Tested on an x86_64 host, with offloading to a Nvidia Tesla K20c card.

So, what is the state of this series?  Has Thomas reviewed it and acked it
from the OpenACC side?  Which particular patches do you want me to look at
for the OpenMP vs. OpenACC interaction?

Jakub

Re: [PATCH] Fix simd attribute handling on aarch64

2019-07-18 Thread Richard Sandiford

Steve Ellcey  writes:
> This patch fixes a bug with SIMD functions on Aarch64.  I found it
> while trying to run SPEC with ToT GCC and a glibc that defines vector
> math functions for aarch64.  When a function is declared with the simd
> attribute GCC creates vector clones of that function with the return
> and argument types changed to vector types.  On Aarch64 the vector
> clones are also marked with the aarch64_vector_pcs attribute to signify
> that they use an alternate calling convention.  Due to a bug in GCC the
> non-vector version of the function being cloned was also being marked
> with this attribute.
>
> Because simd_clone_adjust and expand_simd_clones are calling
> targetm.simd_clone.adjust (which attached the aarch64_vector_pcs
> attribute to the function type) before calling
> simd_clone_adjust_return_type (which created a new distinct type tree
> for the cloned function) the attribute got attached to both the
> 'normal' scalar version of the SIMD function and any vector versions of
> the function.  The attribute should only be on the vector versions.
>
> My fix is to call simd_clone_adjust_return_type and create the new type
> before calling targetm.simd_clone.adjust which adds the attribute.  The
> only other platform that this patch could affect is x86 because that is
> the only other platform to use targetm.simd_clone.adjust.  I did a
> bootstrap and gcc test run on x86 (as well as Aarch64) and got no
> regressions.
>
> OK to checkin?
>
> Steve Ellcey
> sell...@marvell.com
>
>
> 2019-07-17  Steve Ellcey  
>
>   * omp-simd-clone.c (simd_clone_adjust):  Call targetm.simd_clone.adjust
>   after calling simd_clone_adjust_return_type.
>   (expand_simd_clones): Ditto.

It should be pretty easy to add a test for this, now that we use
.variant_pcs to mark symbols with the attribute.

> diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
> index caa8da3cba5..6a6b439d146 100644
> --- a/gcc/omp-simd-clone.c
> +++ b/gcc/omp-simd-clone.c
> @@ -1164,9 +1164,8 @@ simd_clone_adjust (struct cgraph_node *node)
>  {
>push_cfun (DECL_STRUCT_FUNCTION (node->decl));
>  
> -  targetm.simd_clone.adjust (node);
> -
>tree retval = simd_clone_adjust_return_type (node);
> +  targetm.simd_clone.adjust (node);
>ipa_parm_adjustment_vec adjustments
>  = simd_clone_adjust_argument_types (node);
>  
> @@ -1737,8 +1736,8 @@ expand_simd_clones (struct cgraph_node *node)
>   simd_clone_adjust (n);
> else
>   {
> -   targetm.simd_clone.adjust (n);
> simd_clone_adjust_return_type (n);
> +   targetm.simd_clone.adjust (n);
> simd_clone_adjust_argument_types (n);
>   }
>   }

I don't think this is enough, since simd_clone_adjust_return_type
does nothing for functions that return void (e.g. sincos).
I think instead aarch64_simd_clone_adjust should do something like:

  TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node->decl));

But maybe that has consequences that I've not thought about...

Thanks,
Richard
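
A minimal sketch of the sharing problem discussed in this thread, under the assumption that the scalar function and its clone initially point at the same type object. The Toy types and toy_ functions are hypothetical; toy_make_distinct_type only models the effect of build_distinct_type_copy, and toy_target_adjust models a target hook that tags the function type.

```cpp
#include <cassert>
#include <memory>

// Hypothetical function type carrying one attribute bit.
struct ToyFnType
{
  bool vector_pcs = false;  // models the aarch64_vector_pcs attribute
};

// Hypothetical declaration that refers to a possibly shared type.
struct ToyDecl
{
  std::shared_ptr<ToyFnType> type;
};

// Models a target hook that marks the clone's type.
inline void toy_target_adjust(ToyDecl& clone)
{
  assert(clone.type);
  clone.type->vector_pcs = true;
}

// Models giving the declaration its own copy of the type first.
inline void toy_make_distinct_type(ToyDecl& decl)
{
  assert(decl.type);
  decl.type = std::make_shared<ToyFnType>(*decl.type);
}
```

Calling toy_target_adjust on a clone that still shares its type also marks the scalar declaration; calling toy_make_distinct_type first confines the mark to the clone, independently of whether the return type needed adjusting.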

Re: [C++] DEFERRED_PARSE

2019-07-18 Thread Nathan Sidwell


On 7/17/19 8:17 PM, Marek Polacek wrote:

> On Mon, Jul 08, 2019 at 08:25:25AM -0400, Nathan Sidwell wrote:
> > Jason, Marek,
> > can DEFERRED_PARSE trees survive past the in-class-context late parsing
> > stage?  My assumption was not, but in reducing a module testcase I
> > encountered a situation when one survived to end of compilation (with no
> > errors).  It was an exception specifier on a declared-but-not-defined
> > template member function.
> >
> > Is my assumption incorrect?  (I can of course further reduce the testcase,
> > if needed.)
>
> I think that should be fine.  I guess we can treat it similarly to
> DEFERRED_NOEXCEPT.  At least I haven't seen that it broke anything.


Thanks. I did some more poking, and we do parse them when the class is complete, 
it's just that in the non-defining template case we leave some detritus behind 
that I was picking up.  I'm playing with a patch to fix that.


nathan

--
Nathan Sidwell

Re: Rewrite some jump.c routines to use flags

2019-07-18 Thread Richard Sandiford

Eric Botcazou  writes:
>> I'm not sure using flags_to_condition really buys anything then,
>> since you have to think about each individual case to see whether
>> it belongs in the switch or not.  I also don't have any proof
>> that the no-op cases are the common ones (since adding this
>> fast path of course slows down the others).
>
> Really?  Branch prediction is rather efficient these days.

Sure.  But we're still adding another branch here to avoid the branches in
flags_to_condition in some cases.  There's no guarantee that this branch
is going to be more predictable than the ones in flags_to_condition,
or that the total number of mispredicts is going to be lower this way.
(Not saying it won't be, just that we're making assumptions if we think
it will.)

I just think we should only add extra code like this if we have proof
that it makes things better, or at least that the cases it handles
are worth handling specially.

But by the same token, the status quo wins if there's doubt about
whether the original patch itself is worthwhile.

Richard

[COMMITTED][GCC9] Backport RISC-V: Fix splitter for 32-bit AND on 64-bit target.

2019-07-18 Thread Kito Cheng

Hi:

I've backported this patch from trunk in order to fix a code gen error
for RISC-V port.
From a1f4984764c66c135cc385e6ea90ca24861bdcc4 Mon Sep 17 00:00:00 2001
From: kito 
Date: Thu, 18 Jul 2019 07:00:32 +
Subject: [PATCH] RISC-V: Fix splitter for 32-bit AND on 64-bit target.

Fixes github.com/riscv/riscv-gcc issue #161.  We were accidentally using
BITS_PER_WORD to compute shift counts when we should have been using the
bitsize of the operand modes.  This was wrong when we had an SImode shift
and a 64-bit target.

	Andrew Waterman  
	gcc/
	* config/riscv/riscv.md (lshrsi3_zero_extend_3+1): Use operands[1]
	bitsize	instead of BITS_PER_WORD.
	gcc/testsuite/
	* gcc.target/riscv/shift-shift-2.c: Add one more test.

gcc/ChangeLog:
2019-07-18  Kito Cheng  

	Backport from mainline
	2019-07-08  Andrew Waterman  
		Jim Wilson  

	* config/riscv/riscv.md (lshrsi3_zero_extend_3+1): Use operands[1]
	bitsize	instead of BITS_PER_WORD.
	gcc/testsuite/

gcc/testsuite/ChangeLog:
2019-07-18  Kito Cheng  

	Backport from mainline
	2019-07-08  Jim Wilson  

	* gcc.target/riscv/shift-shift-2.c: Add one more test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@273566 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  | 10 ++
 gcc/config/riscv/riscv.md  |  5 +++--
 gcc/testsuite/ChangeLog|  7 +++
 gcc/testsuite/gcc.target/riscv/shift-shift-2.c | 16 ++--
 4 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6c61632e373..22716b0c0c0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2019-07-18  Kito Cheng  
+
+	Backport from mainline
+	2019-07-08  Andrew Waterman  
+		Jim Wilson  
+
+	* config/riscv/riscv.md (lshrsi3_zero_extend_3+1): Use operands[1]
+	bitsize	instead of BITS_PER_WORD.
+	gcc/testsuite/
+
 2019-07-17  John David Anglin  
 
 	* config/pa/pa.c (pa_som_asm_init_sections): Don't force all constant
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index e3799a5bdd8..a8bac170e72 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1775,10 +1775,11 @@
   (set (match_dup 0)
(lshiftrt:GPR (match_dup 0) (match_dup 2)))]
 {
-  operands[2] = GEN_INT (BITS_PER_WORD
+  /* Op2 is a VOIDmode constant, so get the mode size from op1.  */
+  operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1]))
 			 - exact_log2 (INTVAL (operands[2]) + 1));
 })
-  
+
 ;; Handle AND with 0xF...F0...0 where there are 32 to 63 zeros.  This can be
 ;; split into two shifts.  Otherwise it requires 3 instructions: li, sll, and.
 (define_split
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index ecd4b6a7178..de0b52ce248 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2019-07-18  Kito Cheng  
+
+	Backport from mainline
+	2019-07-08  Jim Wilson  
+
+	* gcc.target/riscv/shift-shift-2.c: Add one more test.
+
 2019-07-17  Andreas Krebbel  
 
 	Backport from mainline
diff --git a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
index 3f07e7776e7..10a5bb728be 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
@@ -25,5 +25,17 @@ sub4 (unsigned long i)
 {
   return (i << 52) >> 52;
 }
-/* { dg-final { scan-assembler-times "slli" 4 } } */
-/* { dg-final { scan-assembler-times "srli" 4 } } */
+
+unsigned int
+sub5 (unsigned int i)
+{
+  unsigned int j;
+  j = i >> 24;
+  j = j * (1 << 24);
+  j = i - j;
+  return j;
+}
+/* { dg-final { scan-assembler-times "slli" 5 } } */
+/* { dg-final { scan-assembler-times "srli" 5 } } */
+/* { dg-final { scan-assembler-times "slliw" 1 } } */
+/* { dg-final { scan-assembler-times "srliw" 1 } } */
-- 
2.17.1
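
A small worked example of the shift-count arithmetic behind this fix, written as plain C++ rather than RTL. The toy_ names are hypothetical; toy_low_mask_bits plays the role of exact_log2 (mask + 1) for masks of the form 2^k - 1, and the sketch assumes a 32-bit unsigned int.

```cpp
#include <cassert>
#include <cstdint>

// Bit length of MASK; for a mask of the form 2^k - 1 this is k.
inline int toy_low_mask_bits(std::uint64_t mask)
{
  int bits = 0;
  while (mask != 0)
    {
      mask >>= 1;
      ++bits;
    }
  return bits;
}

// Shift count for replacing "x & mask" by a left shift followed by a
// logical right shift in an operand that is MODE_BITS wide.
inline int toy_shift_count(int mode_bits, std::uint64_t mask)
{
  const int bits = toy_low_mask_bits(mask);
  assert(bits > 0 && bits < mode_bits);
  return mode_bits - bits;
}

// 32-bit AND with a low mask, done as two shifts.
inline std::uint32_t toy_and_via_shifts32(std::uint32_t x, std::uint32_t mask)
{
  const int count = toy_shift_count(32, mask);
  const std::uint32_t shifted = static_cast<std::uint32_t>(x << count);
  return shifted >> count;
}
```

For the 24-bit mask exercised by sub5, the 32-bit operand width gives a count of 8, while using a 64-bit word size would give 40, which is out of range for a 32-bit shift.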

Re: [PATCH] Move rust_{is_mangled,demangle_sym} to a private libiberty header.

2019-07-18 Thread Eduard-Mihai Burtescu

Pinging this again - while it's a tiny change, I want it to land before I 
submit anything else in this area.
Also, I forgot to mention I have no commit access.

Original submission can be found at 
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg6.html.

Thanks,
- Eddy B.


On Wed, Jun 26, 2019, at 11:54 AM, Eduard-Mihai Burtescu wrote:
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> (Apologies for the delay, while I was able to run libiberty tests back when I 
> submitted the patch, I wanted to make sure I can run the whole GCC testsuite, 
> especially for more significant future contributions, so I had to wait until 
> I had the time to troubleshoot the NixOS support for GCC's make check)
> 
> Thanks,
> - Eddy B.
> 
> 
> On Mon, Jun 3, 2019, at 7:23 AM, Ian Lance Taylor wrote:
> > On Sat, Jun 1, 2019 at 7:15 AM Eduard-Mihai Burtescu  wrote:
> > >
> > > 2019-06-01 Eduard-Mihai Burtescu 
> > > include/ChangeLog:
> > > * demangle.h (rust_is_mangled): Move to libiberty/rust-demangle.h.
> > > (rust_demangle_sym): Move to libiberty/rust-demangle.h.
> > > libiberty/ChangeLog:
> > > * cplus-dem.c: Include rust-demangle.h.
> > > * rust-demangle.c: Include rust-demangle.h.
> > > * rust-demangle.h: New file.
> > 
> > This is OK if it bootstraps and tests pass.
> > 
> > Thanks.
> > 
> > Ian
> >

Re: [RFC] Consider lrotate const rotation in vectorizer

2019-07-18 Thread Jakub Jelinek

On Wed, Jul 17, 2019 at 12:00:32PM -0500, Segher Boessenkool wrote:
> I think we can say that *all* targets behave like SHIFT_COUNT_TRUNCATED
> for rotates?  Not all immediates are valid of course, but that is a
> separate issue.

Well, we'd need to double check all the hw rotate instructions on all the
targets to be sure.
As for the current GCC code, SHIFT_COUNT_TRUNCATED value is used even for
rotates at least in combine.c, expmed.c and simplify-rtx.c and in
optabs.c through targetm.shift_truncation_mask, but e.g. in cse.c it is used
only for shifts and not rotates.

And speaking of optabs.c, it already has code to emit the other rotate
if one rotate isn't supported, the:
  /* If we were trying to rotate, and that didn't work, try rotating
 the other direction before falling back to shifts and bitwise-or.  */
  if (((binoptab == rotl_optab
&& (icode = optab_handler (rotr_optab, mode)) != CODE_FOR_nothing)
   || (binoptab == rotr_optab
   && (icode = optab_handler (rotl_optab, mode)) != CODE_FOR_nothing))
  && is_int_mode (mode, &int_mode))
{
  optab otheroptab = (binoptab == rotl_optab ? rotr_optab : rotl_optab);
hunk in there, just it is limited to scalar rotates ATM rather than vector
ones through is_int_mode.  So I bet the problem with the vector shifts is
just that tree-vect-generic.c support isn't there.  And the above mentioned
code actually emits the
newop1 = expand_binop (GET_MODE (op1), sub_optab,
   gen_int_mode (bits, GET_MODE (op1)), op1,
   NULL_RTX, unsignedp, OPTAB_DIRECT);
as the fallback, rather than masking of negation with some mask, so if there
is some target that doesn't truncate the rotate count, it will be
problematic with variable rotate by 0.  And note that the other rotate
code explicitly uses targetm.shift_truncation_mask.

Jakub
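
The rotate-by-0 concern above can be illustrated with a short sketch in plain C++. The toy_ names are hypothetical and a 32-bit unsigned int is assumed; toy_other_direction_count models the count used when a rotate is replaced by a rotate in the other direction, with and without truncating it to the valid range.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left; the complementary count is masked so that n == 0 does not
// produce a shift by 32.
inline std::uint32_t toy_rotl32(std::uint32_t x, unsigned n)
{
  assert(n < 32);
  return (x << n) | (x >> ((32 - n) & 31));
}

// Rotate right, written the same way.
inline std::uint32_t toy_rotr32(std::uint32_t x, unsigned n)
{
  assert(n < 32);
  return (x >> n) | (x << ((32 - n) & 31));
}

// Count for the opposite-direction rotate: plain "bits - n", optionally
// truncated to the range 0..31.
inline unsigned toy_other_direction_count(unsigned n, bool truncate)
{
  assert(n < 32);
  const unsigned count = 32 - n;
  return truncate ? (count & 31) : count;
}
```

For any nonzero n the two forms agree, but for n == 0 the untruncated count is 32, which is only harmless if the target instruction truncates rotate counts.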
