Re: [10/32] Remove global call sets: combine.c

2019-09-11 Thread Segher Boessenkool
On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
>hard_reg_set_iterator hrsi;
> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
> +  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
> +   0, i, hrsi)

So "abi" in that means calls?  It is not such a great name like that.
Since its children are very_long_names, it doesn't need to be only three
chars itself, either?


Segher


Re: [PATCH V6 04/11] testsuite: new require effective target indirect_calls

2019-09-11 Thread Mike Stump
On Sep 11, 2019, at 1:47 PM, Jose E. Marchesi  wrote:
> 
> I am working on a new compilation mode for what I call xbpf, which is
> basically eBPF plus extensions to eliminate the current restrictions to
> C.  The purpose of -mxbpf is mainly to test the compiler.  Once the
> support is in, I will revert this and similar other patches.

Yeah, not a bad way to do it.  I have a machine with limits on certain memories 
in hardware that trip up gcc's testsuite, and in my simulator I have a special 
gcc testing mode that simply gives me a ton more memory (17 MB) than I have on 
real hardware.  This lets me easily test under simulation without 
worrying about the limits.  I can also have my simulator select the small mode 
to reflect what the hardware does, and everything else uses this.  Best of both 
worlds.

Re: Fix PR rtl-optimization/89795

2019-09-11 Thread Segher Boessenkool
On Wed, Sep 11, 2019 at 02:18:36PM +0200, Eric Botcazou wrote:
> > What does it mean for
> >   (set (reg:QI) (const_int -128))
> > for example?  This is the canonical form of moving positive 128 into a
> > register as well, of course.
> 
> Nothing, since word_register_operation_p returns false on CONST_INTs.
> 
> > W_R_O needs many more restrictions if it can work at all, but those aren't
> > documented.
> 
> The problematic issue is more basic and pertains to mere SUBREGs.

I'd say this is more basic, but heh...  It's an example.  If it isn't
documented what W_R_O really *does*, how can we ever hope to make it
work stably?
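
For reference, the canonicalisation comes from RTL's rule that a
CONST_INT is stored sign-extended from the precision of its context's
mode; trunc_int_for_mode is the helper that implements this (a minimal
illustration, not from the patch):

  trunc_int_for_mode (128, QImode);  /* returns -128: bit 7 is QImode's sign bit */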


Segher


[PATCH V6 08/11] bpf: make target-supports.exp aware of eBPF

2019-09-11 Thread Mike Stump
On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  wrote:
> 
> This patch makes the several effective-target checks in
> target-supports.exp aware of eBPF targets.

Ok.

These sorts of updates will be obvious to you now that you have gotten your 
feet wet.  This applies to the indirect-calls style changes as well.

Re: Patch to support extended characters in C/C++ identifiers

2019-09-11 Thread Joseph Myers
On Wed, 11 Sep 2019, Lewis Hyatt wrote:

> things that may be a little surprising. For instance, you can take a
> UTF-8 encoded file and insert a backslash line continuation in the
> middle of a multibyte sequence, and gcc will happily paste it back
> together and then interpret the resulting UTF-8. I think it's
> technically OK standardwise since the conversion from extended
> characters to the source character set is implementation-defined, but
> it's hardly a straightforward definition. It is sort of consistent
> with the treatment of undefined behavior with UCN escapes though,
> which gcc already permits to be pasted together over a line
> continuation. Anyway, should this behavior be documented as well? I

I don't think that peculiarity should be documented.  (Whereas accepting 
arbitrary bytes inside comments and strings by default is arguably 
actually a feature.)
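
For the curious, here is a sketch of how such an input can be
constructed (illustrative only; this is not one of the patch's
testcases):

#include <fstream>

int
main ()
{
  /* Write "int café = 1;" with a backslash-newline spliced between the
     two bytes of the UTF-8 encoding of é (0xC3 0xA9).  */
  std::ofstream out ("split-utf8.c", std::ios::binary);
  out << "int caf" << '\xc3' << "\\\n" << '\xa9' << " = 1;\n";
  return 0;
}

After line-continuation splicing, the preprocessor sees the bytes 0xC3
0xA9 adjacent again and interprets them as the extended identifier café.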

> > gcc/testsuite/g++.dg/cpp/ucnid-2-utf8.C and
> > gcc/testsuite/g++.dg/cpp/ucnid-3-utf8.C are testing double stringizing in
> > C++, where strictly the results they expect show that GCC does not conform
> > to the C++ standard requirement to convert all extended characters to UCNs
> > (because C++ does not have the special C rule making it
> > implementation-defined whether the \ of a UCN in a string literal is
> > doubled when stringizing).
> 
> Thanks, I didn't mean to ignore this point when you made it on the PR
> comments, I just wasn't sure what was the best way to handle it. Do
> you find it preferable to just add a comment, or should I rather
> change the test to look for the standard-conforming output, and make
> it an XFAIL?

My inclination would be a comment, with reference to a bug filed for this 
issue in Bugzilla.

> Finally, one general question, when I submit these last changes, is it
> better to send them as a new patch relative to what I already sent, or
> is it better to send the whole thing updated from scratch? Thanks
> again.

A complete patch that can be applied to trunk is best.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch, fortran] PR91717 - ICE on concatenating deferred-length character and character literal

2019-09-11 Thread Steve Kargl
On Wed, Sep 11, 2019 at 11:27:56PM +0100, Paul Richard Thomas wrote:
> Hi Steve,
> 
> Being an allocatable component, this code appears on entry into scope:
> 
>   my_core.msg = 0B;
>   my_core._msg_length = 0;
>   {
> 
> Thus, according to the standard, my_core%msg is defined.
>

I'll politely defer to the Fortran standard.

  5.4.10 Allocatable variables

  The allocation status of an allocatable variable is either allocated
  or unallocated.  An allocatable variable becomes allocated as described
  in 9.7.1.3. It becomes unallocated as described in 9.7.3.2.

  An unallocated allocatable variable shall not be referenced or defined.

  7.5.4.6 Default initialization for components

  Allocatable components are always initialized to unallocated.

In your testcase, you have an unallocated allocatable entity 
referenced on the RHS in the first line that involves my_core%msg.
From the Fortran standard's point of view, I believe the code is
nonconforming.

  type core
character (len=:), allocatable :: msg
  end type

  type(core) :: my_core

  print *, allocated(my_core%msg)

! Comment out to avoid ICE.
!  my_core%msg on RHS is unallocated in next line.
!  my_core%msg = my_core%msg//"my message is: "
!  my_core%msg = my_core%msg//"Hello!"
!
!  if (my_core%msg .ne. "my message is: Hello!") stop 1
end

% gfcx -o z a.f90 && ./z
 F

> detected for this assignment even though no array references are
> involved. There is certainly the need for a temporary, which the patch
> generates.
> 

Your patch may fix the ICE, but even if the code compiles and
executes, you are getting the "right" answer fortuitously.

Of course, I could be wrong.

-- 
Steve


Re: [Patch, fortran] PR91717 - ICE on concatenating deferred-length character and character literal

2019-09-11 Thread Paul Richard Thomas
Hi Steve,

Being an allocatable component, this code appears on entry into scope:

  my_core.msg = 0B;
  my_core._msg_length = 0;
  {

Thus, according to the standard, my_core%msg is defined.

If you follow the backtrace from the ICE, you will see that something
like the patch is required. It is important that the dependency is
detected for this assignment even though no array references are
involved. There is certainly the need for a temporary, which the patch
generates.

Cheers

Paul

On Wed, 11 Sep 2019 at 06:50, Steve Kargl
 wrote:
>
> On Wed, Sep 11, 2019 at 06:44:50AM +0100, Paul Richard Thomas wrote:
> > ===
> > *** gcc/testsuite/gfortran.dg/dependency_55.f90   (nonexistent)
> > --- gcc/testsuite/gfortran.dg/dependency_55.f90   (working copy)
> > ***
> > *** 0 
> > --- 1,17 
> > + ! { dg-do run }
> > + !
> > + ! Test the fix for PR91717 in which the concatenation operation ICEd.
> > + !
> > + ! Contributed by Damian Rouson  
> > + !
> > +   type core
> > + character (len=:), allocatable :: msg
> > +   end type
> > +
> > +   type(core) :: my_core
> > +
> > +   my_core%msg = my_core%msg//"my message is: "
>
> my_core%msg is undefined on the RHS.  This is invalid
> Fortran.  Not sure whether your patch is correct or not.
>
> --
> Steve



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


C++ PATCH for c++/91740 - ICE with constexpr call and ?: in ARRAY_REF

2019-09-11 Thread Marek Polacek
This ICEs since r267272 - more location wrapper nodes, but not because we can't
cope with new location wrappers, but because that commit introduced a call to
maybe_constant_value in cp_build_array_ref.  In this testcase we call it with

  f (VIEW_CONVERT_EXPR<const char[4]>("BAR")) ? 1 : 0

argument and that crashes in fold_convert because we end up trying to convert 
"BAR" of type const char[4] to const char * when evaluating the call.  At this
point, decay_conversion hasn't turned the argument into (const char *) "BAR"
yet.

The ICE doesn't occur without ?:, because then the call will be wrapped in
NON_DEPENDENT_EXPR, and the constexpr machinery throws up its hands (I'm
anthropomorphizing) when it encounters such an expression.

I noticed that build_non_dependent_expr doesn't wrap op0 of ?: in N_D_E.  This
is so since r70606 -- Nathan, is there a reason not to do it?  Doing it fixes
this problem.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-09-11  Marek Polacek  

PR c++/91740 - ICE with constexpr call and ?: in ARRAY_REF.
* pt.c (build_non_dependent_expr): Call build_non_dependent_expr for
the first operand.

* g++.dg/cpp1y/var-templ63.C: New test.

diff --git gcc/cp/pt.c gcc/cp/pt.c
index c5915a5ecd0..775389d8245 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -26709,7 +26709,7 @@ build_non_dependent_expr (tree expr)
   if (TREE_CODE (expr) == COND_EXPR)
 return build3 (COND_EXPR,
   TREE_TYPE (expr),
-  TREE_OPERAND (expr, 0),
+  build_non_dependent_expr (TREE_OPERAND (expr, 0)),
   (TREE_OPERAND (expr, 1)
? build_non_dependent_expr (TREE_OPERAND (expr, 1))
: build_non_dependent_expr (TREE_OPERAND (expr, 0))),
diff --git gcc/testsuite/g++.dg/cpp1y/var-templ63.C 
gcc/testsuite/g++.dg/cpp1y/var-templ63.C
new file mode 100644
index 000..a65f53b2963
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1y/var-templ63.C
@@ -0,0 +1,5 @@
+// PR c++/91740 - ICE with constexpr call and ?: in ARRAY_REF.
+// { dg-do compile { target c++14 } }
+
+constexpr bool f(const char*) { return true; }
+template <typename T> const char c = "FOO"[f("BAR") ? 1 : 0];


[testsuite, obvious] Remove explicit "dg-do run" from more gcc.dg/vect tests

2019-09-11 Thread Sandra Loosemore

This patch is a followup to PR testsuite/83889, in which it was noted that

"The failures in this PR were from forcing { dg-do run } even when
vect.exp chooses options that are incompatible with the runtime.
The default vect.exp behaviour is to execute when possible, so there's
no need for a dg-do at all."

Since the original patch for this PR was committed, a bunch of new 
testcases have been added that also incorrectly specify "dg-do run", 
leading to bogus execution failures (in the case I tripped over, trying 
to run code built for ARM Neon on a target that doesn't support those 
instructions).  I've applied the same trivial fix as in the original 
patch.  I thought this was obvious enough that I went ahead and 
committed it.


-Sandra
2019-09-11  Sandra Loosemore  

	PR testsuite/83889

	gcc/testsuite/
	* gcc.dg/vect/pr81740-2.c: Remove explicit dg-do run.
	* gcc.dg/vect/pr88598-1.c: Likewise.
	* gcc.dg/vect/pr88598-2.c: Likewise.
	* gcc.dg/vect/pr88598-3.c: Likewise.
	* gcc.dg/vect/pr88598-4.c: Likewise.
	* gcc.dg/vect/pr88598-5.c: Likewise.
	* gcc.dg/vect/pr88598-6.c: Likewise.
	* gcc.dg/vect/pr89440.c: Likewise.
	* gcc.dg/vect/pr90018.c: Likewise.
	* gcc.dg/vect/pr91293-1.c: Likewise.
	* gcc.dg/vect/pr91293-2.c: Likewise.
	* gcc.dg/vect/pr91293-3.c: Likewise.
Index: gcc/testsuite/gcc.dg/vect/pr81740-1.c
===
--- gcc/testsuite/gcc.dg/vect/pr81740-1.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr81740-1.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr81740-2.c
===
--- gcc/testsuite/gcc.dg/vect/pr81740-2.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr81740-2.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
 /* { dg-require-effective-target vect_hw_misalign } */
 
Index: gcc/testsuite/gcc.dg/vect/pr88598-1.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-1.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-1.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr88598-2.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-2.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-2.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr88598-3.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-3.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-3.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-ffast-math -fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr88598-4.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-4.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-4.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr88598-5.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-5.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-5.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr88598-6.c
===
--- gcc/testsuite/gcc.dg/vect/pr88598-6.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr88598-6.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-ffast-math -fdump-tree-optimized" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr89440.c
===
--- gcc/testsuite/gcc.dg/vect/pr89440.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr89440.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-additional-options "-ffast-math" } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr90018.c
===
--- gcc/testsuite/gcc.dg/vect/pr90018.c	(revision 275664)
+++ gcc/testsuite/gcc.dg/vect/pr90018.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-require-effective-target vect_double } */
 
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr91293-1.c
===
--- gcc/testsuite/gcc.dg/vect/pr91293-1.c	(revision 275664)
+++ 

Re: [PATCH V6 03/11] testsuite: annotate c-torture/compile tests with dg-require-stack-size

2019-09-11 Thread Jose E. Marchesi


On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  
wrote:
> 
> This patch annotates tests that make use of a significant amount of
> stack space.

Ok.  Future changes like this are now obvious to you and you can
just maintain them as you see fit.

Ok thanks.


Re: [PATCH V6 04/11] testsuite: new require effective target indirect_calls

2019-09-11 Thread Jose E. Marchesi


Hi Mike.
Thanks for the review.

On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  
wrote:
> 
> This patch adds a new dg_require_effective_target procedure to the
> testsuite infrastructure: indirect_calls.  This new function tells
> whether a target supports calls to non-constant call targets.

Ok.  I'll let people contemplate some comments...

I'm torn between targets that can't support C and gooping up the test
suite and approving.  I'll err on the side of approving this, but
would like to hear from folks if I go too far.

Since they are easy to identify, maintain and ignore...  I went with
approval.

People can contemplate other ways to do this, like introducing a fake
marker for when the feature is used; when running a program with such
a marker, mark it as unsupported.  This way, no test need be marked,
and all future test cases that use the feature just flip to
unsupported, with no maintenance required.

We do this sort of thing with programs that overflow the RAM, by using
a stylized error message from ld, and noticing that in dejagnu, and
then not expecting it to work.

If you can find a way to tally stack space, and check it before
running it, the other change to tightly track stack space then would
not be as necessary.  I think you might be able to do this on your
target.  Having the stack space marked is generally useful for other
targets, as most won't have a nice way to sort out small stacks, so my
general comments apply less to the stack size; but things that can
reduce your maintenance burden are likely good in any event.

I am working on a new compilation mode for what I call xbpf, which is
basically eBPF plus extensions to eliminate the current restrictions to
C.  The purpose of -mxbpf is mainly to test the compiler.  Once the
support is in, I will revert this and similar other patches.

J.


Re: [RFC PATCH] Add inlining growth bias flag

2019-09-11 Thread Graham Markall
Hi Jeff,

On 09/09/2019 19:55, Jeff Law wrote:
> I'm not sure if we really want to expose this as a first class option.
> Perhaps a PARAM would be better.  Jan would have the final call though
> since he's done the most work on the IPA bits.

Many thanks for the suggestion - I could turn it into a pair of integer
params (one to increase the growth and one to decrease it) - would that
still be suitable? (I think there's no easy way to have a param with a
signed value, but am I missing something obvious?)
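
For concreteness, the pair might look something like this in
params.def (the names and bounds here are invented for illustration;
only the DEFPARAM mechanism itself is real):

DEFPARAM (PARAM_INLINE_GROWTH_BIAS_UP,
          "inline-growth-bias-up",
          "Percentage added to the estimated size growth of an inline "
          "candidate.",
          0, 0, 1000)
DEFPARAM (PARAM_INLINE_GROWTH_BIAS_DOWN,
          "inline-growth-bias-down",
          "Percentage subtracted from the estimated size growth of an "
          "inline candidate.",
          0, 0, 1000)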

Best regards,
Graham.





Re: [PATCH] Fix PR 91708

2019-09-11 Thread Bernd Edlinger
On 9/11/19 8:30 PM, Richard Biener wrote:
> On September 11, 2019 7:41:22 PM GMT+02:00, Bernd Edlinger 
>  wrote:
>> On 9/11/19 6:08 PM, Jeff Law wrote:
>>> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
 On 9/11/19 9:23 AM, Richard Biener wrote:
> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>
>> Hi!
>>
>> This ICE happens when compiling real_nextafter in real.c.
>> CSE sees this:
>>
>> (insn 179 178 180 11 (set (reg:SI 319)
>> (reg/v/f:SI 273 [ rD.73757 ]))
>> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>  (nil))
>> [...]
>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
>>  [(voidD.73 *)r_77(D)]+0 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE))
>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>  (nil))
>> [...]
>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>> rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM 
>> [(voidD.73 *)r_77(D)]+20 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE))
>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>  (expr_list:REG_DEAD (reg:SI 320)
>> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>> (nil
>> [...]
>> (insn 234 233 235 11 (set (reg:SI 340)
>> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>  (nil))
>>
>>
>> ... and transforms insn 234 in an invalid insn:
>>
>>
>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int>
>> [(struct real_valueD.28367 *)r_77(D)] ])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM 
>> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>> 643 {*thumb2_movsi_vfp}
>>  (nil))
>>
>> which triggers the assertion in the arm back-end, because the MEM
>> is not aligned.
>>
>> To fix that I changed exp_equiv_p to consider MEMs with different
>> MEM_ALIGN or
>> ALIAS_SET as different.
>
> I think the appropriate fix is to retain the mem_attrs of the
>> original
> MEM since obviously the replacement does not change those?  It's a
>> mere
> change of the MEMs address but implementation-wise CSE replaces the
>> whole
> MEM?
>
> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>
> I'm not sure what CSE does above, that is, what does it think it
>> does?
> It doesn't replace the loaded value, it seems to do sth else which
> is odd for CSE (replaces a REG with a PLUS - but why?)
>  

 What I think CSE is thinking there is this:

 insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>> MEM  [(voidD.73 *)r_77(D)]+0 S4 A8]
 that is the same address as where insn 234 reads the value back but
>> there we know it is aligned.

 insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>> rD.73757 ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73
>> *)r_77(D)]+20 S4 A8]

 But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>> when
 insn 234 processed, it replaces one mem with another mem that has
>> the same
 value.
>>> Just to make sure I understand: are you saying the addresses for the
>>> MEMs are equal, or the contents of the memory locations are equal?
>>>
>>
>> The MEMs all have different addresses, but the same value; they are
>> from a memset or something:
>>
>> (gdb) call dump_class(elt)
>> Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
>>  [(void *)r_77(D)]+0 S4 A8]): 
>> (unspec:SI [
>>(reg:SI 320)
>>] UNSPEC_UNALIGNED_STORE)
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 16 [0x10])) [0 MEM  [(void *)r_77(D)]+16 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 12 [0xc])) [0 MEM  [(void *)r_77(D)]+12 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>>   (const_int 8 [0x8])) [0 MEM  [(void *)r_77(D)]+8 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>>   (const_int 4 [0x4])) [0 MEM  [(void *)r_77(D)]+4 S4 A8])
>> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0
>> S4 A8])
>>
>>
>> The insn 234, uses the same address as the last in the list
>> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0
>> S4 A8])
>> but incompatible alignment and alias set.
>>
>> Since we are only interested in the value, CSE picks the silliest
>> variant,
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
>>
>> now this 

[32/32] Hide regs_invalidated_by_call etc.

2019-09-11 Thread Richard Sandiford
The previous patches removed all target-independent uses of
regs_invalidated_by_call, call_used_or_fixed_regs and
call_used_or_fixed_reg_p.  This patch therefore restricts
them to target-specific code (and reginfo.c, which sets them up).


2019-09-11  Richard Sandiford  

gcc/
* hard-reg-set.h (regs_invalidated_by_call): Only define if
IN_TARGET_CODE.
(call_used_or_fixed_regs): Likewise.
(call_used_or_fixed_reg_p): Likewise.
* reginfo.c (regs_invalidated_by_call): New macro.

Index: gcc/hard-reg-set.h
===
--- gcc/hard-reg-set.h  2019-09-10 19:57:04.713041281 +0100
+++ gcc/hard-reg-set.h  2019-09-11 19:49:18.149461264 +0100
@@ -477,10 +477,12 @@ #define call_used_regs \
 #endif
 #define savable_regs \
   (this_target_hard_regs->x_savable_regs)
+#ifdef IN_TARGET_CODE
 #define regs_invalidated_by_call \
   (this_target_hard_regs->x_regs_invalidated_by_call)
 #define call_used_or_fixed_regs \
   (regs_invalidated_by_call | fixed_reg_set)
+#endif
 #define reg_alloc_order \
   (this_target_hard_regs->x_reg_alloc_order)
 #define inv_reg_alloc_order \
@@ -509,6 +511,7 @@ #define reg_names \
 #define REG_CAN_CHANGE_MODE_P(REGN, FROM, TO)  \
   (targetm.can_change_mode_class (FROM, TO, REGNO_REG_CLASS (REGN)))
 
+#ifdef IN_TARGET_CODE
 /* Return true if register REGNO is either fixed or call-used
(aka call-clobbered).  */
 
@@ -517,5 +520,6 @@ call_used_or_fixed_reg_p (unsigned int r
 {
   return fixed_regs[regno] || this_target_hard_regs->x_call_used_regs[regno];
 }
+#endif
 
 #endif /* ! GCC_HARD_REG_SET_H */
Index: gcc/reginfo.c
===
--- gcc/reginfo.c   2019-09-11 19:47:39.474156575 +0100
+++ gcc/reginfo.c   2019-09-11 19:49:18.149461264 +0100
@@ -69,6 +69,8 @@ struct target_regs *this_target_regs = &
 
 #define call_used_regs \
   (this_target_hard_regs->x_call_used_regs)
+#define regs_invalidated_by_call \
+  (this_target_hard_regs->x_regs_invalidated_by_call)
 
 /* Data for initializing fixed_regs.  */
 static const char initial_fixed_regs[] = FIXED_REGISTERS;


[31/32] Remove global call sets: shrink-wrap.c

2019-09-11 Thread Richard Sandiford
This is a straight replacement of "calls we can clobber without saving
them first".


2019-09-11  Richard Sandiford  

gcc/
* shrink-wrap.c: Include function-abi.h.
(requires_stack_frame_p): Use crtl->abi to test whether the
current function can use a register without saving it first.

Index: gcc/shrink-wrap.c
===
--- gcc/shrink-wrap.c   2019-09-10 19:56:32.577268091 +0100
+++ gcc/shrink-wrap.c   2019-09-11 19:49:15.133482515 +0100
@@ -43,7 +43,7 @@ Software Foundation; either version 3, o
 #include "regcprop.h"
 #include "rtl-iter.h"
 #include "valtrack.h"
-
+#include "function-abi.h"
 
 /* Return true if INSN requires the stack frame to be set up.
PROLOGUE_USED contains the hard registers used in the function
@@ -76,7 +76,7 @@ requires_stack_frame_p (rtx_insn *insn,
 }
   if (hard_reg_set_intersect_p (hardregs, prologue_used))
 return true;
-  hardregs &= ~call_used_or_fixed_regs;
+  hardregs &= ~crtl->abi->full_reg_clobbers ();
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 if (TEST_HARD_REG_BIT (hardregs, regno)
&& df_regs_ever_live_p (regno))


[30/32] Remove global call sets: sel-sched.c

2019-09-11 Thread Richard Sandiford
The main change here is to replace a crosses_call boolean with
a bitmask of the ABIs used by the crossed calls.  For space reasons,
I didn't also add a HARD_REG_SET that tracks the set of registers
that are actually clobbered, which means that this is the one part
of the series that doesn't benefit from -fipa-ra.  The existing
FIXME suggests that the current structures aren't the preferred
way of representing this anyhow, and the pass already makes
conservative assumptions about call-crossing registers.


2019-09-11  Richard Sandiford  

gcc/
* sel-sched-ir.h (_def::crosses_call): Replace with...
(_def::crossed_call_abis): ..this new field.
(def_list_add): Take a mask of ABIs instead of a crosses_call
boolean.
* sel-sched-ir.c (def_list_add): Likewise.  Update initialization
of _def accordingly.
* sel-sched.c: Include function-abi.h.
(hard_regs_data::regs_for_call_clobbered): Delete.
(reg_rename::crosses_call): Replace with...
(reg_rename::crossed_call_abis): ...this new field.
(fur_static_params::crosses_call): Replace with...
(fur_static_params::crossed_call_abis): ...this new field.
(init_regs_for_mode): Don't initialize sel_hrd.regs_for_call_clobbered.
(init_hard_regs_data): Use crtl->abi to test which registers the
current function would need to save before it uses them.
(mark_unavailable_hard_regs): Update handling of call-clobbered
registers, using call_clobbers_in_region to find out which registers
might be call-clobbered (but without taking -fipa-ra into account
for now).  Remove separate handling of partially call-clobbered
registers.
(verify_target_availability): Use crossed_call_abis instead of
crosses_call.
(get_spec_check_type_for_insn, find_used_regs): Likewise.
(fur_orig_expr_found, fur_on_enter, fur_orig_expr_not_found): Likewise.

Index: gcc/sel-sched-ir.h
===
--- gcc/sel-sched-ir.h  2019-07-10 19:41:26.359898316 +0100
+++ gcc/sel-sched-ir.h  2019-09-11 19:49:11.673506894 +0100
@@ -188,12 +188,12 @@ struct _def
 {
   insn_t orig_insn;
 
-  /* FIXME: Get rid of CROSSES_CALL in each def, since if we're moving up
+  /* FIXME: Get rid of CROSSED_CALL_ABIS in each def, since if we're moving up
  rhs from two different places, but only one of the code motion paths
  crosses a call, we can't use any of the call_used_regs, no matter which
- path or whether all paths crosses a call.  Thus we should move 
CROSSES_CALL
- to static params.  */
-  bool crosses_call;
+ path or whether all paths crosses a call.  Thus we should move
+ CROSSED_CALL_ABIS to static params.  */
+  unsigned int crossed_call_abis;
 };
 typedef struct _def *def_t;
 
@@ -1510,7 +1510,7 @@ extern void flist_tail_init (flist_tail_
 
 extern fence_t flist_lookup (flist_t, insn_t);
 extern void flist_clear (flist_t *);
-extern void def_list_add (def_list_t *, insn_t, bool);
+extern void def_list_add (def_list_t *, insn_t, unsigned int);
 
 /* Target context functions.  */
 extern tc_t create_target_context (bool);
Index: gcc/sel-sched-ir.c
===
--- gcc/sel-sched-ir.c  2019-09-09 19:01:40.375078244 +0100
+++ gcc/sel-sched-ir.c  2019-09-11 19:49:11.673506894 +0100
@@ -311,9 +311,10 @@ flist_clear (flist_t *lp)
 flist_remove (lp);
 }
 
-/* Add ORIGINAL_INSN the def list DL honoring CROSSES_CALL.  */
+/* Add ORIGINAL_INSN the def list DL honoring CROSSED_CALL_ABIS.  */
 void
-def_list_add (def_list_t *dl, insn_t original_insn, bool crosses_call)
+def_list_add (def_list_t *dl, insn_t original_insn,
+ unsigned int crossed_call_abis)
 {
   def_t d;
 
@@ -321,7 +322,7 @@ def_list_add (def_list_t *dl, insn_t ori
   d = DEF_LIST_DEF (*dl);
 
   d->orig_insn = original_insn;
-  d->crosses_call = crosses_call;
+  d->crossed_call_abis = crossed_call_abis;
 }
 
 
Index: gcc/sel-sched.c
===
--- gcc/sel-sched.c 2019-09-11 19:47:32.902202887 +0100
+++ gcc/sel-sched.c 2019-09-11 19:49:11.673506894 +0100
@@ -46,6 +46,7 @@ Software Foundation; either version 3, o
 #include "sel-sched-dump.h"
 #include "sel-sched.h"
 #include "dbgcnt.h"
+#include "function-abi.h"
 
 /* Implementation of selective scheduling approach.
The below implementation follows the original approach with the following
@@ -302,10 +303,6 @@ struct hard_regs_data
  that the whole set is not computed yet.  */
   HARD_REG_SET regs_for_rename[FIRST_PSEUDO_REGISTER];
 
-  /* For every mode, this stores registers not available due to
- call clobbering.  */
-  HARD_REG_SET regs_for_call_clobbered[NUM_MACHINE_MODES];
-
   /* All registers that are used or call used.  */
   HARD_REG_SET regs_ever_used;
 
@@ -325,8 +322,8 @@ struct reg_rename
   /* These 

[29/32] Remove global call sets: sched-deps.c

2019-09-11 Thread Richard Sandiford
This is a straight replacement of an existing "full or partial"
call-clobber check.


2019-09-11  Richard Sandiford  

gcc/
* sched-deps.c (deps_analyze_insn): Use the ABI of the target
function to test whether a register is fully or partly clobbered.

Index: gcc/sched-deps.c
===
--- gcc/sched-deps.c2019-09-11 19:47:32.902202887 +0100
+++ gcc/sched-deps.c2019-09-11 19:49:08.517529131 +0100
@@ -3736,9 +3736,7 @@ deps_analyze_insn (class deps_desc *deps
  Since we only have a choice between 'might be clobbered'
  and 'definitely not clobbered', we must include all
  partly call-clobbered registers here.  */
-   else if (targetm.hard_regno_call_part_clobbered (abi.id (), i,
-reg_raw_mode[i])
- || TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
+   else if (abi.clobbers_at_least_part_of_reg_p (i))
   SET_REGNO_REG_SET (reg_pending_clobbers, i);
   /* We don't know what set of fixed registers might be used
  by the function, but it is certain that the stack pointer


[28/32] Remove global call sets: rtlanal.c

2019-09-11 Thread Richard Sandiford
The reg_set_p part is simple, since the caller is asking about
a specific REG rtx, with a known register number and mode.

The find_all_hard_reg_sets part emphasises that the "implicit"
behaviour was always a bit suspect, since it includes fully-clobbered
registers but not partially-clobbered registers.  The only current
user of this path is the c6x-specific scheduler predication code,
and c6x doesn't have partly call-clobbered registers, so in practice
it's fine.  I've added a comment to try to dissuade future users.
(The !implicit path is OK and useful though.)
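
As a sketch, a caller for which partial clobbers matter should use the
explicit path and handle calls itself:

  HARD_REG_SET written;
  find_all_hard_reg_sets (insn, &written, false);
  /* With implicit == false no call-clobber guesswork is added;
     the caller decides how to treat CALL_P (insn).  */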


2019-09-11  Richard Sandiford  

gcc/
* rtlanal.c: Include function-abi.h.
(reg_set_p): Use call_insn_abi to get the ABI of the called
function and clobbers_reg_p to test whether the register
is call-clobbered.
(find_all_hard_reg_sets): When implicit is true, use call_insn_abi
to get the ABI of the called function and full_reg_clobbers to
get the set of fully call-clobbered registers.  Warn about the
pitfalls of using this mode.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   2019-09-11 19:47:24.418262673 +0100
+++ gcc/rtlanal.c   2019-09-11 19:49:04.417558020 +0100
@@ -36,6 +36,7 @@ Software Foundation; either version 3, o
 #include "addresses.h"
 #include "rtl-iter.h"
 #include "hard-reg-set.h"
+#include "function-abi.h"
 
 /* Forward declarations */
 static void set_of_1 (rtx, const_rtx, void *);
@@ -1270,8 +1271,8 @@ reg_set_p (const_rtx reg, const_rtx insn
  || (CALL_P (insn)
  && ((REG_P (reg)
   && REGNO (reg) < FIRST_PSEUDO_REGISTER
-  && overlaps_hard_reg_set_p (regs_invalidated_by_call,
-  GET_MODE (reg), REGNO (reg)))
+  && (call_insn_abi (as_a (insn))
+  .clobbers_reg_p (GET_MODE (reg), REGNO (reg
  || MEM_P (reg)
  || find_reg_fusage (insn, CLOBBER, reg)
 return true;
@@ -1486,7 +1487,11 @@ record_hard_reg_sets (rtx x, const_rtx p
 }
 
 /* Examine INSN, and compute the set of hard registers written by it.
-   Store it in *PSET.  Should only be called after reload.  */
+   Store it in *PSET.  Should only be called after reload.
+
+   IMPLICIT is true if we should include registers that are fully-clobbered
+   by calls.  This should be used with caution, since it doesn't include
+   partially-clobbered registers.  */
 void
 find_all_hard_reg_sets (const rtx_insn *insn, HARD_REG_SET *pset, bool implicit)
 {
@@ -1495,7 +1500,7 @@ find_all_hard_reg_sets (const rtx_insn *
   CLEAR_HARD_REG_SET (*pset);
   note_stores (insn, record_hard_reg_sets, pset);
   if (CALL_P (insn) && implicit)
-*pset |= call_used_or_fixed_regs;
+*pset |= call_insn_abi (insn).full_reg_clobbers ();
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
 if (REG_NOTE_KIND (link) == REG_INC)
   record_hard_reg_sets (XEXP (link, 0), NULL, pset);


[27/32] Remove global call sets: reload.c

2019-09-11 Thread Richard Sandiford
The inheritance code in find_equiv_reg can use clobbers_reg_p
to test whether a call clobbers either of the equivalent registers.

reload and find_reg use crtl->abi to test whether a register needs
to be saved in the prologue before use.

reload_as_needed can use full_and_partial_reg_clobbers and thus
avoid needing to keep its own record of which registers are part
call-clobbered.


2019-09-11  Richard Sandiford  

gcc/
* reload.c: Include function-abi.h.
(find_equiv_reg): Use clobbers_reg_p to test whether either
of the equivalent registers is clobbered by a call.
* reload1.c: Include function-abi.h.
(reg_reloaded_call_part_clobbered): Delete.
(reload): Use crtl->abi to test which registers would need
saving in the prologue before use.
(find_reg): Likewise.
(emit_reload_insns): Remove code for reg_reloaded_call_part_clobbered.
(reload_as_needed): Likewise.  Use full_and_partial_reg_clobbers
instead of call_used_or_fixed_regs | reg_reloaded_call_part_clobbered.

Index: gcc/reload.c
===
--- gcc/reload.c2019-09-11 19:47:32.902202887 +0100
+++ gcc/reload.c2019-09-11 19:49:00.269587248 +0100
@@ -106,6 +106,7 @@ #define REG_OK_STRICT
 #include "reload.h"
 #include "addresses.h"
 #include "params.h"
+#include "function-abi.h"
 
 /* True if X is a constant that can be forced into the constant pool.
MODE is the mode of the operand, or VOIDmode if not known.  */
@@ -6904,24 +6905,19 @@ find_equiv_reg (rtx goal, rtx_insn *insn
 if either of the two is in a call-clobbered register, or memory.  */
   if (CALL_P (p))
{
- int i;
-
  if (goal_mem || need_stable_sp)
return 0;
 
- if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
-   for (i = 0; i < nregs; ++i)
- if (call_used_or_fixed_reg_p (regno + i)
- || targetm.hard_regno_call_part_clobbered (0, regno + i,
-mode))
-   return 0;
+ function_abi abi = call_insn_abi (p);
+ if (regno >= 0
+ && regno < FIRST_PSEUDO_REGISTER
+ && abi.clobbers_reg_p (mode, regno))
+   return 0;
 
- if (valueno >= 0 && valueno < FIRST_PSEUDO_REGISTER)
-   for (i = 0; i < valuenregs; ++i)
- if (call_used_or_fixed_reg_p (valueno + i)
- || targetm.hard_regno_call_part_clobbered (0, valueno + i,
-mode))
-   return 0;
+ if (valueno >= 0
+ && valueno < FIRST_PSEUDO_REGISTER
+ && abi.clobbers_reg_p (mode, valueno))
+   return 0;
}
 
   if (INSN_P (p))
Index: gcc/reload1.c
===
--- gcc/reload1.c   2019-09-11 19:47:32.902202887 +0100
+++ gcc/reload1.c   2019-09-11 19:49:00.273587220 +0100
@@ -42,6 +42,7 @@ Software Foundation; either version 3, o
 #include "except.h"
 #include "dumpfile.h"
 #include "rtl-iter.h"
+#include "function-abi.h"
 
 /* This file contains the reload pass of the compiler, which is
run after register allocation has been done.  It checks that
@@ -120,11 +121,6 @@ #define spill_indirect_levels  \
This is only valid if reg_reloaded_contents is set and valid.  */
 static HARD_REG_SET reg_reloaded_dead;
 
-/* Indicate whether the register's current value is one that is not
-   safe to retain across a call, even for registers that are normally
-   call-saved.  This is only meaningful for members of reg_reloaded_valid.  */
-static HARD_REG_SET reg_reloaded_call_part_clobbered;
-
 /* Number of spill-regs so far; number of valid elements of spill_regs.  */
 static int n_spills;
 
@@ -795,7 +791,7 @@ reload (rtx_insn *first, int global)
 
   if (crtl->saves_all_registers)
 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-  if (! call_used_or_fixed_reg_p (i)
+  if (! crtl->abi->clobbers_full_reg_p (i)
  && ! fixed_regs[i]
  && ! LOCAL_REGNO (i))
df_set_regs_ever_live (i, true);
@@ -1908,8 +1904,8 @@ find_reg (class insn_chain *chain, int o
  && (inv_reg_alloc_order[regno]
  < inv_reg_alloc_order[best_reg])
 #else
- && call_used_or_fixed_reg_p (regno)
- && ! call_used_or_fixed_reg_p (best_reg)
+ && crtl->abi->clobbers_full_reg_p (regno)
+ && !crtl->abi->clobbers_full_reg_p (best_reg)
 #endif
  ))
{
@@ -4464,7 +4460,6 @@ reload_as_needed (int live_known)
   reg_last_reload_reg = XCNEWVEC (rtx, max_regno);
   INIT_REG_SET (_has_output_reload);
   CLEAR_HARD_REG_SET (reg_reloaded_valid);
-  CLEAR_HARD_REG_SET (reg_reloaded_call_part_clobbered);
 
   set_initial_elim_offsets ();
 
@@ -4786,8 +4781,8 

[26/32] Remove global call sets: regrename.c

2019-09-11 Thread Richard Sandiford
This patch makes regrename use a similar mask-and-clobber-set
pair to IRA when tracking whether registers are clobbered by
calls in a region.  Testing for a nonzero ABI mask is equivalent
to testing for a register that crosses a call.
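
As a sketch of that equivalence, in terms of the new fields added
below:

  /* A nonzero ABI mask records at least one crossed call; this
     replaces the old need_caller_save_reg bit.  */
  static bool
  chain_crosses_call_p (const du_head *head)
  {
    return head->call_abis != 0;
  }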

Since AArch64 and c6x use regrename.h, they need to be updated
to include function-abi.h first.  AIUI this is preferred over
including function-abi.h in regrename.h.


2019-09-11  Richard Sandiford  

gcc/
* regrename.h (du_head::call_clobber_mask): New field.
(du_head::need_caller_save_reg): Replace with...
(du_head::call_abis): ...this new field.
* regrename.c: Include function-abi.h.
(call_clobbered_in_chain_p): New function.
(check_new_reg_p): Use crtl->abi when deciding whether a register
is free for use after RA.  Use call_clobbered_in_chain_p to test
whether a candidate register would be clobbered by a call.
(find_rename_reg): Don't add call-clobber conflicts here.
(rename_chains): Check call_abis instead of need_caller_save_reg.
(merge_chains): Update for changes to du_head.
(build_def_use): Use call_insn_abi to get the ABI of the call insn
target.  Record the ABI identifier in call_abis and the set of
fully or partially clobbered registers in call_clobber_mask.
Add fully-clobbered registers to hard_conflicts here rather
than in find_rename_reg.
* config/aarch64/cortex-a57-fma-steering.c: Include function-abi.h.
(rename_single_chain): Check call_abis instead of need_caller_save_reg.
* config/aarch64/falkor-tag-collision-avoidance.c: Include
function-abi.h.
* config/c6x/c6x.c: Likewise.

Index: gcc/regrename.h
===
--- gcc/regrename.h 2019-07-10 19:41:20.115948322 +0100
+++ gcc/regrename.h 2019-09-11 19:48:56.069616842 +0100
@@ -41,9 +41,12 @@ #define GCC_REGRENAME_H
   bitmap_head conflicts;
   /* Conflicts with untracked hard registers.  */
   HARD_REG_SET hard_conflicts;
+  /* Which registers are fully or partially clobbered by the calls that
+ the chain crosses.  */
+  HARD_REG_SET call_clobber_mask;
 
-  /* Nonzero if the chain crosses a call.  */
-  unsigned int need_caller_save_reg:1;
+  /* A bitmask of ABIs used by the calls that the chain crosses.  */
+  unsigned int call_abis : NUM_ABI_IDS;
   /* Nonzero if the register is used in a way that prevents renaming,
  such as the SET_DEST of a CALL_INSN or an asm operand that used
  to be a hard register.  */
Index: gcc/regrename.c
===
--- gcc/regrename.c 2019-09-11 19:47:32.898202916 +0100
+++ gcc/regrename.c 2019-09-11 19:48:56.069616842 +0100
@@ -33,6 +33,7 @@
 #include "addresses.h"
 #include "cfganal.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 #include "regrename.h"
 
 /* This file implements the RTL register renaming pass of the compiler.  It is
@@ -303,6 +304,18 @@ merge_overlapping_regs (HARD_REG_SET *ps
 }
 }
 
+/* Return true if (reg:MODE REGNO) would be clobbered by a call covered
+   by THIS_HEAD.  */
+
+static bool
+call_clobbered_in_chain_p (du_head *this_head, machine_mode mode,
+  unsigned int regno)
+{
+  return call_clobbered_in_region_p (this_head->call_abis,
+this_head->call_clobber_mask,
+mode, regno);
+}
+
 /* Check if NEW_REG can be the candidate register to rename for
REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
registers.  */
@@ -322,7 +335,7 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE
|| global_regs[new_reg + i]
/* Can't use regs which aren't saved by the prologue.  */
|| (! df_regs_ever_live_p (new_reg + i)
-   && ! call_used_or_fixed_reg_p (new_reg + i))
+   && ! crtl->abi->clobbers_full_reg_p (new_reg + i))
 #ifdef LEAF_REGISTERS
/* We can't use a non-leaf register if we're in a
   leaf function.  */
@@ -337,11 +350,8 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSE
   for (tmp = this_head->first; tmp; tmp = tmp->next_use)
 if ((!targetm.hard_regno_mode_ok (new_reg, GET_MODE (*tmp->loc))
 && ! DEBUG_INSN_P (tmp->insn))
-   || (this_head->need_caller_save_reg
-   && ! (targetm.hard_regno_call_part_clobbered
- (0, reg, GET_MODE (*tmp->loc)))
-   && (targetm.hard_regno_call_part_clobbered
-   (0, new_reg, GET_MODE (*tmp->loc)
+   || call_clobbered_in_chain_p (this_head, GET_MODE (*tmp->loc),
+ new_reg))
   return false;
 
   return true;
@@ -363,12 +373,6 @@ find_rename_reg (du_head_p this_head, en
   int pass;
   int best_new_reg = old_reg;
 
-  /* Further narrow the set of registers we can use for renaming.
- If the chain needs a call-saved register, mark the 

[25/32] Remove global call sets: regcprop.c

2019-09-11 Thread Richard Sandiford
This is a direct replacement of an existing test for fully and
partially clobbered registers.


2019-09-11  Richard Sandiford  

gcc/
* regcprop.c (copyprop_hardreg_forward_1): Use the recorded
mode of the register when deciding whether it is no longer
available after a call.

Index: gcc/regcprop.c
===
--- gcc/regcprop.c  2019-09-11 19:47:32.898202916 +0100
+++ gcc/regcprop.c  2019-09-11 19:48:51.961645788 +0100
@@ -1055,16 +1055,15 @@ copyprop_hardreg_forward_1 (basic_block
 
  function_abi abi = call_insn_abi (insn);
  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-   if ((abi.clobbers_full_reg_p (regno)
-|| (targetm.hard_regno_call_part_clobbered
-(abi.id (), regno, vd->e[regno].mode)))
+   if (vd->e[regno].mode != VOIDmode
+   && abi.clobbers_reg_p (vd->e[regno].mode, regno)
&& (regno < set_regno || regno >= set_regno + set_nregs))
  kill_value_regno (regno, 1, vd);
 
  /* If SET was seen in CALL_INSN_FUNCTION_USAGE, and SET_SRC
-of the SET isn't in regs_invalidated_by_call hard reg set,
-but instead among CLOBBERs on the CALL_INSN, we could wrongly
-assume the value in it is still live.  */
+of the SET isn't clobbered by ABI, but instead among
+CLOBBERs on the CALL_INSN, we could wrongly assume the
+value in it is still live.  */
  if (ksvd.ignore_set_reg)
kill_clobbered_values (insn, vd);
}


[24/32] Remove global call sets: recog.c

2019-09-11 Thread Richard Sandiford
2019-09-11  Richard Sandiford  

gcc/
* recog.c: Include function-abi.h.
(peep2_find_free_register): Use crtl->abi when deciding whether
a register is free for use after RA.

Index: gcc/recog.c
===
--- gcc/recog.c 2019-09-10 19:56:45.357177891 +0100
+++ gcc/recog.c 2019-09-11 19:48:48.689668843 +0100
@@ -40,6 +40,7 @@ Software Foundation; either version 3, o
 #include "cfgcleanup.h"
 #include "reload.h"
 #include "tree-pass.h"
+#include "function-abi.h"
 
 #ifndef STACK_POP_CODE
 #if STACK_GROWS_DOWNWARD
@@ -3227,7 +3228,7 @@ peep2_find_free_register (int from, int
  break;
}
  /* And that we don't create an extra save/restore.  */
- if (! call_used_or_fixed_reg_p (regno + j)
+ if (! crtl->abi->clobbers_full_reg_p (regno + j)
  && ! df_regs_ever_live_p (regno + j))
{
  success = 0;


[21/32] Remove global call sets: LRA

2019-09-11 Thread Richard Sandiford
lra_reg has an actual_call_used_reg_set field that is only used during
inheritance.  This in turn required a special lra_create_live_ranges
pass for flag_ipa_ra to set up this field.  This patch instead makes
the inheritance code do its own live register tracking, using the
same ABI-mask-and-clobber-set pair as for IRA.

Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
means we no longer need a separate path for -fipa-ra.  It also means
we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.

The patch also strengthens the sanity check in lra_assigns so that
we check that reg_renumber is consistent with the whole conflict set,
not just the call-clobbered registers.


2019-09-11  Richard Sandiford  

gcc/
* target.def (return_call_with_max_clobbers): Delete.
* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
* doc/tm.texi: Regenerate.
* config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
* lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
(lra_reg::call_insn): Delete.
* lra.c: Include function-abi.h.
(initialize_lra_reg_info_element): Don't initialize the fields above.
(lra): Use crtl->abi to test whether the current function needs to
save a register in the prologue.  Remove special pre-inheritance
lra_create_live_ranges pass for flag_ipa_ra.
* lra-assigns.c: Include function-abi.h
(find_hard_regno_for_1): Use crtl->abi to test whether the current
function needs to save a register in the prologue.
(lra_assign): Assert that registers aren't allocated to a
conflicting register, rather than checking only for overlaps
with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
and for registers that are not live across a call.
* lra-constraints.c (last_call_for_abi): New variable.
(full_and_partial_call_clobbers): Likewise.
(setup_next_usage_insn): Remove the register from
full_and_partial_call_clobbers.
(need_for_call_save_p): Use call_clobbered_in_region_p to test
whether the register needs a caller save.
(need_for_split_p): Use full_and_partial_reg_clobbers instead
of call_used_or_fixed_regs.
(inherit_in_ebb): Initialize and maintain last_call_for_abi and
full_and_partial_call_clobbers.
* lra-lives.c (check_pseudos_live_through_calls): Replace
last_call_used_reg_set and call_insn arguments with an abi argument.
Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
as the set of conflicting registers.
(calls_have_same_clobbers_p): Delete.
(process_bb_lives): Track the ABI of the last call instead of an
insn/HARD_REG_SET pair.  Update calls to
check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
the set of registers that could be clobbered by an EH edge.
Include partially-clobbered as well as fully-clobbered registers.
(lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
* lra-remat.c: Include function-abi.h.
(call_used_regs_arr_len, call_used_regs_arr): Delete.
(set_bb_regs): Use call_insn_abi to get the set of call-clobbered
registers and bitmap_view to combine them into dead_regs.
(call_used_input_regno_present_p): Take a function_abi argument
and use it to test whether a register is call-clobbered.
(calculate_gen_cands): Use call_insn_abi to get the ABI of the
call insn target.  Update the call to call_used_input_regno_present_p.
(do_remat): Likewise.
(lra_remat): Remove the initialization of call_used_regs_arr_len
and call_used_regs_arr.

Index: gcc/target.def
===
--- gcc/target.def  2019-09-11 19:47:32.906202859 +0100
+++ gcc/target.def  2019-09-11 19:48:38.549740292 +0100
@@ -5786,20 +5786,6 @@ for targets that don't have partly call-
  hook_bool_uint_uint_mode_false)
 
 DEFHOOK
-(return_call_with_max_clobbers,
- "This hook returns a pointer to the call that partially clobbers the\n\
-most registers.  If a platform supports multiple ABIs where the registers\n\
-that are partially clobbered may vary, this function compares two\n\
-calls and returns a pointer to the one that clobbers the most registers.\n\
-If both calls clobber the same registers, @var{call_1} must be returned.\n\
-\n\
-The registers clobbered in different ABIs must be a proper subset or\n\
-superset of all other ABIs.  @var{call_1} must always be a call insn,\n\
-call_2 may be NULL or a call insn.",
- rtx_insn *, (rtx_insn *call_1, rtx_insn *call_2),
- NULL)
-
-DEFHOOK
 (get_multilib_abi_name,
  "This hook returns name of multilib ABI name.",
  const char *, (void),
Index: gcc/doc/tm.texi.in

[23/32] Remove global call sets: postreload-gcse.c

2019-09-11 Thread Richard Sandiford
This is another case in which we should conservatively treat
partial kills as full kills.


2019-09-11  Richard Sandiford  

gcc/
* postreload-gcse.c: Include regs.h and function-abi.h.
(record_opr_changes): Use call_insn_abi to get the ABI of the
call insn target.  Conservatively assume that partially-clobbered
registers are altered.

Index: gcc/postreload-gcse.c
===
--- gcc/postreload-gcse.c   2019-09-09 18:58:51.472270712 +0100
+++ gcc/postreload-gcse.c   2019-09-11 19:48:45.585690715 +0100
@@ -41,6 +41,8 @@ Software Foundation; either version 3, o
 #include "intl.h"
 #include "gcse-common.h"
 #include "gcse.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The following code implements gcse after reload, the purpose of this
pass is to cleanup redundant loads generated by reload and other
@@ -770,9 +772,13 @@ record_opr_changes (rtx_insn *insn)
   /* Finally, if this is a call, record all call clobbers.  */
   if (CALL_P (insn))
 {
+  function_abi abi = call_insn_abi (insn);
   unsigned int regno;
   hard_reg_set_iterator hrsi;
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+  /* We don't track modes of hard registers, so we need to be
+conservative and assume that partial kills are full kills.  */
+  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+ 0, regno, hrsi)
record_last_reg_set_info_regno (insn, regno);
 
   if (! RTL_CONST_OR_PURE_CALL_P (insn))


[22/32] Remove global call sets: postreload.c

2019-09-11 Thread Richard Sandiford
The "|= fixed_regs" in reload_combine isn't necessary, since the
set is only used to determine which values have changed (rather than,
for example, which registers are available for use).

In reload_cse_move2add we can be accurate about which registers
are still available.  BLKmode indicates a continuation of the
previous register, and since clobbers_reg_p handles multi-register
values, it's enough to skip over BLKmode entries and just test the
start register.


2019-09-11  Richard Sandiford  

gcc/
* postreload.c (reload_combine_recognize_pattern): Use crtl->abi
when deciding whether a register is free for use after RA.
(reload_combine): Remove unnecessary use of fixed_reg_set.
(reload_cse_move2add): Use call_insn_abi to get the ABI of the
call insn target.  Use reg_mode when testing whether a register
is no longer available.

Index: gcc/postreload.c
===
--- gcc/postreload.c2019-09-11 19:47:24.418262673 +0100
+++ gcc/postreload.c2019-09-11 19:48:41.905716645 +0100
@@ -1136,7 +1136,8 @@ reload_combine_recognize_pattern (rtx_in
  if (TEST_HARD_REG_BIT (reg_class_contents[INDEX_REG_CLASS], i)
  && reg_state[i].use_index == RELOAD_COMBINE_MAX_USES
  && reg_state[i].store_ruid <= reg_state[regno].use_ruid
- && (call_used_or_fixed_reg_p (i) || df_regs_ever_live_p (i))
+ && (crtl->abi->clobbers_full_reg_p (i)
+ || df_regs_ever_live_p (i))
  && (!frame_pointer_needed || i != HARD_FRAME_POINTER_REGNUM)
  && !fixed_regs[i] && !global_regs[i]
  && hard_regno_nregs (i, GET_MODE (reg)) == 1
@@ -1332,9 +1333,6 @@ reload_combine (void)
{
  rtx link;
  HARD_REG_SET used_regs = call_insn_abi (insn).full_reg_clobbers ();
- /* ??? This preserves traditional behavior; it might not be
-needed.  */
- used_regs |= fixed_reg_set;
 
  for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
if (TEST_HARD_REG_BIT (used_regs, r))
@@ -2126,12 +2124,13 @@ reload_cse_move2add (rtx_insn *first)
 unknown values.  */
   if (CALL_P (insn))
{
+ function_abi abi = call_insn_abi (insn);
  for (i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
-   {
- if (call_used_or_fixed_reg_p (i))
-   /* Reset the information about this register.  */
-   reg_mode[i] = VOIDmode;
-   }
+   if (reg_mode[i] != VOIDmode
+   && reg_mode[i] != BLKmode
+   && abi.clobbers_reg_p (reg_mode[i], i))
+ /* Reset the information about this register.  */
+ reg_mode[i] = VOIDmode;
}
 }
   return changed;


[20/32] Remove global call sets: loop-iv.c

2019-09-11 Thread Richard Sandiford
Similar idea to the combine.c and gcse.c patches.


2019-09-11  Richard Sandiford  

gcc/
* loop-iv.c: Include regs.h and function-abi.h.
(simplify_using_initial_values): Use call_insn_abi to get the
ABI of the call insn target.  Conservatively assume that
partially-clobbered registers are altered.

Index: gcc/loop-iv.c
===
--- gcc/loop-iv.c   2019-09-09 19:01:40.371078272 +0100
+++ gcc/loop-iv.c   2019-09-11 19:48:35.161764168 +0100
@@ -62,6 +62,8 @@ Free Software Foundation; either version
 #include "dumpfile.h"
 #include "rtl-iter.h"
 #include "tree-ssa-loop-niter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* Possible return values of iv_get_reaching_def.  */
 
@@ -1972,8 +1974,14 @@ simplify_using_initial_values (class loo
  CLEAR_REG_SET (this_altered);
  note_stores (insn, mark_altered, this_altered);
  if (CALL_P (insn))
-   /* Kill all call clobbered registers.  */
-   IOR_REG_SET_HRS (this_altered, regs_invalidated_by_call);
+   {
+ /* Kill all registers that might be clobbered by the call.
+We don't track modes of hard registers, so we need to be
+conservative and assume that partial kills are full kills.  */
+ function_abi abi = call_insn_abi (insn);
+ IOR_REG_SET_HRS (this_altered,
+  abi.full_and_partial_reg_clobbers ());
+   }
 
  if (suitable_set_for_replacement (insn, , ))
{


[19/32] Remove global call sets: IRA

2019-09-11 Thread Richard Sandiford
For -fipa-ra, IRA already keeps track of which specific registers
are call-clobbered in a region, rather than using global information.
The patch generalises this so that it tracks which ABIs are used
by calls in the region.

We can then use the new ABI descriptors to handle partially-clobbered
registers in the same way as fully-clobbered registers, without having
special code for targetm.hard_regno_call_part_clobbered.  This in turn
makes -fipa-ra work for partially-clobbered registers too.

A side-effect of allowing multiple ABIs is that we no longer have
an obvious set of conflicting registers for the self-described
"fragile hack" in ira-constraints.c.  This code kicks in for
user-defined registers that aren't live across a call at -O0,
and it tries to avoid allocating a call-clobbered register to them.
Here I've used the set of call-clobbered registers in the current
function's ABI, applying on top of any registers that are clobbered by
called functions.  This is enough to keep gcc.dg/debug/dwarf2/pr5948.c
happy.

The handling of GENERIC_STACK_CHECK in do_reload seemed to have
a reversed condition:

  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
if (df_regs_ever_live_p (i)
&& !fixed_regs[i]
&& call_used_or_fixed_reg_p (i))
  size += UNITS_PER_WORD;

The final part of the condition counts registers that don't need to be
saved in the prologue, but I think the opposite was intended.
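
A sketch of the intended count (the patch expresses the test via
crtl->abi, per the ChangeLog below):

  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
    if (df_regs_ever_live_p (i)
        && !fixed_regs[i]
        && !crtl->abi->clobbers_full_reg_p (i))
      /* Used call-saved registers need a prologue save.  */
      size += UNITS_PER_WORD;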


2019-09-11  Richard Sandiford  

gcc/
* function-abi.h (call_clobbers_in_region): Declare.
(call_clobbered_in_region_p): New function.
* function-abi.cc (call_clobbers_in_region): Likewise.
* ira-int.h: Include function-abi.h.
(ira_allocno::crossed_calls_abis): New field.
(ALLOCNO_CROSSED_CALLS_ABIS): New macro.
(ira_need_caller_save_regs): New function.
(ira_need_caller_save_p): Likewise.
* ira.c (setup_reg_renumber): Use ira_need_caller_save_p instead
of call_used_or_fixed_regs.
(do_reload): Use crtl->abi to test whether the current function
needs to save a register in the prologue.  Count registers that
need to be saved rather than registers that don't.
* ira-build.c (create_cap_allocno): Copy ALLOCNO_CROSSED_CALLS_ABIS.
Remove unnecessary | from ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
(propagate_allocno_info): Merge ALLOCNO_CROSSED_CALLS_ABIS too.
(propagate_some_info_from_allocno): Likewise.
(copy_info_to_removed_store_destinations): Likewise.
(ira_flattening): Say that ALLOCNO_CROSSED_CALLS_ABIS and
ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS are handled conservatively.
(ira_build): Use ira_need_caller_save_regs instead of
call_used_or_fixed_regs.
* ira-color.c (calculate_saved_nregs): Use crtl->abi to test
whether the current function would need to save a register
before using it.
(calculate_spill_cost): Likewise.
(allocno_reload_assign): Use ira_need_caller_save_regs and
ira_need_caller_save_p instead of call_used_or_fixed_regs.
* ira-conflicts.c (ira_build_conflicts): Use
ira_need_caller_save_regs rather than call_used_or_fixed_regs
as the set of call-clobbered registers.  Remove the
call_used_or_fixed_regs mask from the calculation of
temp_hard_reg_set and mask its use instead.  Remove special
handling of partially-clobbered registers.
* ira-costs.c (ira_tune_allocno_costs): Use ira_need_caller_save_p.
* ira-lives.c (process_bb_node_lives): Use mode_clobbers to
calculate the set of conflicting registers for calls that
can throw.  Record the ABIs of calls in ALLOCNO_CROSSED_CALLS_ABIS.
Use full_and_partial_reg_clobbers rather than full_reg_clobbers
for the calculation of ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
Use eh_edge_abi to calculate the set of registers that could
be clobbered by an EH edge.  Include partially-clobbered as
well as fully-clobbered registers.

Index: gcc/function-abi.h
===
--- gcc/function-abi.h  2019-09-11 19:47:24.418262673 +0100
+++ gcc/function-abi.h  2019-09-11 19:48:31.709788491 +0100
@@ -265,6 +265,32 @@ #define default_function_abi \
   (this_target_function_abi_info->x_function_abis[0])
 #define eh_edge_abi default_function_abi
 
+extern HARD_REG_SET call_clobbers_in_region (unsigned int, const_hard_reg_set,
+machine_mode mode);
+
+/* Return true if (reg:MODE REGNO) might be clobbered by one of the
+   calls in a region described by ABIS and MASK, where:
+
+   * Bit ID of ABIS is set if the region contains a call with
+ function_abi identifier ID.
+
+   * MASK contains all the registers that are fully or partially
+ clobbered by calls in the region.
+
+   This is not quite as accurate as testing 

Re: [PATCH] Sync MIPS support from libffi master repository

2019-09-11 Thread Aurelien Jarno
On 2019-08-08 16:22, Jeff Law wrote:
> On 8/8/19 4:18 PM, Aurelien Jarno wrote:
> > 
> > It is used by libgo, but it seems to be the last user.
> Ah, yes, I should have remembered libgo.  And it links via
> ../../blahblah.  Thanks for digging into it.
> 
> So, yea, we need it.  So I think the best path forward is to just resync
> everything to the upstream master.

I have noticed that parisc and riscv support have commits that haven't
been propagated to the upstream libffi side.  Things might be easier to
sync if those changes get merged on the libffi side.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net


[18/32] Remove global call sets: haifa-sched.c

2019-09-11 Thread Richard Sandiford
The code patched here is counting how many registers the current
function would need to save in the prologue before it uses them.
The code is called per function, so using crtl is OK.


2019-09-11  Richard Sandiford  

gcc/
* haifa-sched.c: Include function-abi.h.
(alloc_global_sched_pressure_data): Use crtl->abi to check whether
the function would need to save a register before using it.

Index: gcc/haifa-sched.c
===
--- gcc/haifa-sched.c   2019-09-10 19:56:45.353177919 +0100
+++ gcc/haifa-sched.c   2019-09-11 19:48:27.281819692 +0100
@@ -146,6 +146,7 @@ Software Foundation; either version 3, o
 #include "cfgloop.h"
 #include "dumpfile.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 #ifdef INSN_SCHEDULING
 
@@ -939,7 +940,8 @@ enum reg_class *sched_regno_pressure_cla
 /* Effective number of available registers of a given class (see comment
in sched_pressure_start_bb).  */
 static int sched_class_regs_num[N_REG_CLASSES];
-/* Number of call_saved_regs and fixed_regs.  Helpers for calculating of
+/* The number of registers that the function would need to save before it
+   uses them, and the number of fixed_regs.  Helpers for calculating of
sched_class_regs_num.  */
 static int call_saved_regs_num[N_REG_CLASSES];
 static int fixed_regs_num[N_REG_CLASSES];
@@ -7207,10 +7209,13 @@ alloc_global_sched_pressure_data (void)
  fixed_regs_num[cl] = 0;
 
  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
-   if (!call_used_or_fixed_reg_p (ira_class_hard_regs[cl][i]))
- ++call_saved_regs_num[cl];
-   else if (fixed_regs[ira_class_hard_regs[cl][i]])
- ++fixed_regs_num[cl];
+   {
+ unsigned int regno = ira_class_hard_regs[cl][i];
+ if (fixed_regs[regno])
+   ++fixed_regs_num[cl];
+ else if (!crtl->abi->clobbers_full_reg_p (regno))
+   ++call_saved_regs_num[cl];
+   }
}
 }
 }


[16/32] Remove global call sets: function.c

2019-09-11 Thread Richard Sandiford
Whatever the rights and wrongs of the way aggregate_value_p
handles call-preserved registers, it's a de facto part of the ABI,
so we shouldn't change it.  The patch simply extends the current
approach to whatever call-preserved set the function happens to
be using.


2019-09-11  Richard Sandiford  

gcc/
* function.c (aggregate_value_p): Work out which ABI the
function is using before testing which registers are at least
partly preserved by a call.

Index: gcc/function.c
===
--- gcc/function.c  2019-09-11 19:47:07.490381964 +0100
+++ gcc/function.c  2019-09-11 19:48:18.357882573 +0100
@@ -2120,10 +2120,17 @@ aggregate_value_p (const_tree exp, const
   if (!REG_P (reg))
 return 0;
 
+  /* Use the default ABI if the type of the function isn't known.
+ The scheme for handling interoperability between different ABIs
+ requires us to be able to tell when we're calling a function with
+ a nondefault ABI.  */
+  const predefined_function_abi &abi = (fntype
+   ? fntype_abi (fntype)
+   : default_function_abi);
   regno = REGNO (reg);
   nregs = hard_regno_nregs (regno, TYPE_MODE (type));
   for (i = 0; i < nregs; i++)
-if (! call_used_or_fixed_reg_p (regno + i))
+if (!fixed_regs[regno + i] && !abi.clobbers_full_reg_p (regno + i))
   return 1;
 
   return 0;


[17/32] Remove global call sets: gcse.c

2019-09-11 Thread Richard Sandiford
This is another case in which we can conservatively treat partial
kills as full kills.  Again this is in principle a bug fix for
TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets, but in practice
it probably doesn't make a difference.


2019-09-11  Richard Sandiford  

gcc/
* gcse.c: Include function-abi.h.
(compute_hash_table_work): Use call_insn_abi to get the ABI of
the call insn target.  Invalidate partially call-clobbered
registers as well as fully call-clobbered ones.

Index: gcc/gcse.c
===
--- gcc/gcse.c  2019-09-09 18:58:51.468270740 +0100
+++ gcc/gcse.c  2019-09-11 19:48:23.453846664 +0100
@@ -160,6 +160,7 @@ Software Foundation; either version 3, o
 #include "dbgcnt.h"
 #include "gcse.h"
 #include "gcse-common.h"
+#include "function-abi.h"
 
 /* We support GCSE via Partial Redundancy Elimination.  PRE optimizations
are a superset of those done by classic GCSE.
@@ -1527,9 +1528,14 @@ compute_hash_table_work (struct gcse_has
 
  if (CALL_P (insn))
{
+ function_abi abi = call_insn_abi (insn);
  hard_reg_set_iterator hrsi;
- EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call,
- 0, regno, hrsi)
+
+ /* We don't track modes of hard registers, so we need
+to be conservative and assume that partial kills
+are full kills.  */
+ const_hard_reg_set clob = abi.full_and_partial_reg_clobbers ();
+ EXECUTE_IF_SET_IN_HARD_REG_SET (clob, 0, regno, hrsi)
record_last_reg_set_info (insn, regno);
 
  if (! RTL_CONST_OR_PURE_CALL_P (insn)


[15/32] Remove global call sets: early-remat.c

2019-09-11 Thread Richard Sandiford
This pass previously excluded rematerialisation candidates if they
clobbered a call-preserved register, on the basis that it then
wouldn't be safe to add new instances of the candidate instruction
after a call.  This patch instead makes the decision on a call-by-call
basis.

The second emit_remat_insns_for_block hunk probably isn't needed,
but it seems safer and more consistent to have it, so that every call
to emit_remat_insns is preceded by a check for invalid clobbers.
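
The resulting pattern at each site is roughly as follows (a sketch
only: the real process_block bookkeeping is more involved, and the
argument list of emit_remat_insns is elided here):

  /* Drop candidates that would clobber a register the call might
     leave live, then rematerialize the survivors.  */
  restrict_remat_for_call (candidates, call_insn);
  emit_remat_insns (candidates /* , ... */);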


2019-09-11  Richard Sandiford  

gcc/
* early-remat.c: Include regs.h and function-abi.h.
(early_remat::maybe_add_candidate): Don't check for call-clobbered
registers here.
(early_remat::restrict_remat_for_unavail_regs): New function.
(early_remat::restrict_remat_for_call): Likewise.
(early_remat::process_block): Before calling emit_remat_insns
for a previous call in the block, invalidate any candidates
that would clobber call-preserved registers.
(early_remat::emit_remat_insns_for_block): Likewise for the
final call in a block.  Do the same thing for live-in registers
when calling emit_remat_insns at the head of a block.

Index: gcc/early-remat.c
===
--- gcc/early-remat.c   2019-07-01 09:37:04.720545420 +0100
+++ gcc/early-remat.c   2019-09-11 19:48:14.825907465 +0100
@@ -36,6 +36,8 @@ Software Foundation; either version 3, o
 #include "rtlhash.h"
 #include "print-rtl.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* This pass runs before register allocation and implements an aggressive
form of rematerialization.  It looks for pseudo registers R of mode M
@@ -435,6 +437,8 @@ struct remat_candidate_hasher : nofree_p
   void compute_clobbers (unsigned int);
   void assign_value_number (unsigned int);
   void decide_candidate_validity (void);
+  void restrict_remat_for_unavail_regs (bitmap, const_bitmap);
+  void restrict_remat_for_call (bitmap, rtx_insn *);
   bool stable_use_p (unsigned int);
   void emit_copy_before (unsigned int, rtx, rtx);
   void stabilize_pattern (unsigned int);
@@ -889,8 +893,8 @@ #define FAILURE_ARGS regno, INSN_UID (in
   else
{
  /* The instruction can set additional registers, provided that
-they're call-clobbered hard registers.  This is useful for
-instructions that alter the condition codes.  */
+they're hard registers.  This is useful for instructions
+that alter the condition codes.  */
  if (!HARD_REGISTER_NUM_P (def_regno))
{
  if (dump_file)
@@ -898,20 +902,6 @@ #define FAILURE_ARGS regno, INSN_UID (in
 " pseudo reg %d\n", FAILURE_ARGS, def_regno);
  return false;
}
- if (global_regs[def_regno])
-   {
- if (dump_file)
-   fprintf (dump_file, FAILURE_FORMAT "insn also sets"
-" global reg %d\n", FAILURE_ARGS, def_regno);
- return false;
-   }
- if (!TEST_HARD_REG_BIT (regs_invalidated_by_call, def_regno))
-   {
- if (dump_file)
-   fprintf (dump_file, FAILURE_FORMAT "insn also sets"
-" call-preserved reg %d\n", FAILURE_ARGS, def_regno);
- return false;
-   }
}
 }
 
@@ -1532,6 +1522,39 @@ early_remat::decide_candidate_validity (
   }
 }
 
+/* Remove any candidates in CANDIDATES that would clobber a register in
+   UNAVAIL_REGS.  */
+
+void
+early_remat::restrict_remat_for_unavail_regs (bitmap candidates,
+ const_bitmap unavail_regs)
+{
+  bitmap_clear (_tmp_bitmap);
+  unsigned int cand_index;
+  bitmap_iterator bi;
+  EXECUTE_IF_SET_IN_BITMAP (candidates, 0, cand_index, bi)
+{
+  remat_candidate *cand = _candidates[cand_index];
+  if (cand->clobbers
+ && bitmap_intersect_p (cand->clobbers, unavail_regs))
+   bitmap_set_bit (_tmp_bitmap, cand_index);
+}
+  bitmap_and_compl_into (candidates, _tmp_bitmap);
+}
+
+/* Remove any candidates in CANDIDATES that would clobber a register
+   that is potentially live across CALL.  */
+
+void
+early_remat::restrict_remat_for_call (bitmap candidates, rtx_insn *call)
+{
+  function_abi abi = call_insn_abi (call);
+  /* We don't know whether partially-clobbered registers are live
+ across the call or not, so assume that they are.  */
+  bitmap_view call_preserved_regs (~abi.full_reg_clobbers ());
+  restrict_remat_for_unavail_regs (candidates, call_preserved_regs);
+}
+
 /* Assuming that every path reaching a point P contains a copy of a
use U of REGNO, return true if another copy of U at P would have
access to the same value of REGNO.  */
@@ -1984,10 +2007,13 @@ early_remat::process_block (basic_block
  init_temp_bitmap (_required);
}
  else
-  

[14/32] Remove global call sets: DF (entry/exit defs)

2019-09-11 Thread Richard Sandiford
The code patched here is seeing whether the current function
needs to save at least part of a register before using it.


2019-09-11  Richard Sandiford  

gcc/
* df-scan.c (df_get_entry_block_def_set): Use crtl->abi to test
whether the current function needs to save at least part of a
register before using it.
(df_get_exit_block_use_set): Likewise for epilogue restores.

Index: gcc/df-scan.c
===
--- gcc/df-scan.c   2019-09-11 19:48:07.405959747 +0100
+++ gcc/df-scan.c   2019-09-11 19:48:11.009934354 +0100
@@ -3499,7 +3499,9 @@ df_get_entry_block_def_set (bitmap entry
   /* Defs for the callee saved registers are inserted so that the
 pushes have some defining location.  */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-   if ((call_used_or_fixed_reg_p (i) == 0) && (df_regs_ever_live_p (i)))
+   if (!crtl->abi->clobbers_full_reg_p (i)
+   && !fixed_regs[i]
+   && df_regs_ever_live_p (i))
  bitmap_set_bit (entry_block_defs, i);
 }
 
@@ -3672,8 +3674,9 @@ df_get_exit_block_use_set (bitmap exit_b
 {
   /* Mark all call-saved registers that we actually used.  */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-   if (df_regs_ever_live_p (i) && !LOCAL_REGNO (i)
-   && !TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
+   if (df_regs_ever_live_p (i)
+   && !LOCAL_REGNO (i)
+   && !crtl->abi->clobbers_full_reg_p (i))
  bitmap_set_bit (exit_block_uses, i);
 }
 


[13/32] Remove global call sets: DF (EH edges)

2019-09-11 Thread Richard Sandiford
The DF dense_invalidated_by_call and sparse_invalidated_by_call
sets are actually only used on EH edges, and so are more the set
of registers that are invalidated by a taken EH edge.  Under the
new order, that means that they describe eh_edge_abi.


2019-09-11  Richard Sandiford  

gcc/
* df-problems.c: Include regs.h and function-abi.h.
(df_rd_problem_data): Rename sparse_invalidated_by_call to
sparse_invalidated_by_eh and dense_invalidated_by_call to
dense_invalidated_by_eh.
(df_print_bb_index): Update accordingly.
(df_rd_alloc, df_rd_start_dump, df_rd_confluence_n): Likewise.
(df_lr_confluence_n): Use eh_edge_abi to get the set of registers
that are clobbered by an EH edge.  Clobber partially-clobbered
registers as well as fully-clobbered ones.
(df_md_confluence_n): Likewise.
(df_rd_local_compute): Likewise.  Update for changes to
df_rd_problem_data.
* df-scan.c (df_scan_start_dump): Use eh_edge_abi to get the set
of registers that are clobbered by an EH edge.  Include partially-
clobbered registers as well as fully-clobbered ones.

Index: gcc/df-problems.c
===
--- gcc/df-problems.c   2019-09-09 19:01:48.423021426 +0100
+++ gcc/df-problems.c   2019-09-11 19:48:07.405959747 +0100
@@ -36,6 +36,8 @@ Software Foundation; either version 3, o
 #include "valtrack.h"
 #include "dumpfile.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* Note that turning REG_DEAD_DEBUGGING on will cause
gcc.c-torture/unsorted/dump-noaddr.c to fail because it prints
@@ -139,18 +141,17 @@ df_print_bb_index (basic_block bb, FILE
these along with the bitmap_clear_range call to remove ranges of
bits without actually generating a knockout vector.
 
-   The kill and sparse_kill and the dense_invalidated_by_call and
-   sparse_invalidated_by_call both play this game.  */
+   The kill and sparse_kill and the dense_invalidated_by_eh and
+   sparse_invalidated_by_eh both play this game.  */
 
 /* Private data used to compute the solution for this problem.  These
data structures are not accessible outside of this module.  */
 class df_rd_problem_data
 {
 public:
-  /* The set of defs to regs invalidated by call.  */
-  bitmap_head sparse_invalidated_by_call;
-  /* The set of defs to regs invalidate by call for rd.  */
-  bitmap_head dense_invalidated_by_call;
+  /* The set of defs to regs invalidated by EH edges.  */
+  bitmap_head sparse_invalidated_by_eh;
+  bitmap_head dense_invalidated_by_eh;
   /* An obstack for the bitmaps we need for this problem.  */
   bitmap_obstack rd_bitmaps;
 };
@@ -187,8 +188,8 @@ df_rd_alloc (bitmap all_blocks)
   if (df_rd->problem_data)
 {
   problem_data = (class df_rd_problem_data *) df_rd->problem_data;
-  bitmap_clear (_data->sparse_invalidated_by_call);
-  bitmap_clear (_data->dense_invalidated_by_call);
+  bitmap_clear (_data->sparse_invalidated_by_eh);
+  bitmap_clear (_data->dense_invalidated_by_eh);
 }
   else
 {
@@ -196,9 +197,9 @@ df_rd_alloc (bitmap all_blocks)
   df_rd->problem_data = problem_data;
 
   bitmap_obstack_initialize (_data->rd_bitmaps);
-  bitmap_initialize (_data->sparse_invalidated_by_call,
+  bitmap_initialize (_data->sparse_invalidated_by_eh,
 _data->rd_bitmaps);
-  bitmap_initialize (_data->dense_invalidated_by_call,
+  bitmap_initialize (_data->dense_invalidated_by_eh,
 _data->rd_bitmaps);
 }
 
@@ -391,8 +392,8 @@ df_rd_local_compute (bitmap all_blocks)
   bitmap_iterator bi;
   class df_rd_problem_data *problem_data
 = (class df_rd_problem_data *) df_rd->problem_data;
-  bitmap sparse_invalidated = _data->sparse_invalidated_by_call;
-  bitmap dense_invalidated = _data->dense_invalidated_by_call;
+  bitmap sparse_invalidated = _data->sparse_invalidated_by_eh;
+  bitmap dense_invalidated = _data->dense_invalidated_by_eh;
 
   bitmap_initialize (_in_block, _bitmap_obstack);
   bitmap_initialize (_in_insn, _bitmap_obstack);
@@ -404,10 +405,13 @@ df_rd_local_compute (bitmap all_blocks)
   df_rd_bb_local_compute (bb_index);
 }
 
-  /* Set up the knockout bit vectors to be applied across EH_EDGES.  */
+  /* Set up the knockout bit vectors to be applied across EH_EDGES.
+ Conservatively treat partially-clobbered registers as surviving
+ across the EH edge, i.e. assume that definitions before the edge
+ is taken *might* reach uses after it has been taken.  */
   if (!(df->changeable_flags & DF_NO_HARD_REGS))
 for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-  if (TEST_HARD_REG_BIT (regs_invalidated_by_call, regno))
+  if (eh_edge_abi.clobbers_full_reg_p (regno))
{
  if (DF_DEFS_COUNT (regno) > DF_SPARSE_THRESHOLD)
bitmap_set_bit (sparse_invalidated, regno);
@@ -455,8 

Re: [PATCH V6 05/11] bpf: new GCC port

2019-09-11 Thread Mike Stump
On Sep 5, 2019, at 3:50 PM, Jose E. Marchesi  wrote:
> I guess I should wait for acceptance of the remaining patches not having
> an explicit ack?
> 
> This leaves the following patches that still need explicit approval:
> 
> - [PATCH V6 03/11] testsuite: annotate c-torture/compile tests with 
> dg-require-stack-size
> - [PATCH V6 04/11] testsuite: new require effective target indirect_calls
> - [PATCH V6 08/11] bpf: make target-supports.exp aware of eBPF
> - [PATCH V6 09/11] bpf: adjust GCC testsuite to eBPF limitations

I think I've responded to these now.

[12/32] Remove global call sets: cselib.c

2019-09-11 Thread Richard Sandiford
cselib_invalidate_regno is a no-op if REG_VALUES (i) is null,
so we can check that first.  Then, if we know what mode the register
currently has, we can check whether it's clobbered in that mode.

Using GET_MODE (values->elt->val_rtx) to get the mode of the last
set is taken from cselib_reg_set_mode.


2019-09-11  Richard Sandiford  

gcc/
* cselib.c (cselib_process_insn): If we know what mode a
register was set in, check whether it is clobbered in that
mode by a call.  Only fall back to reg_raw_mode if that fails.

Index: gcc/cselib.c
===
--- gcc/cselib.c    2019-09-11 19:47:32.894202944 +0100
+++ gcc/cselib.c    2019-09-11 19:48:04.229982128 +0100
@@ -2768,11 +2768,23 @@ cselib_process_insn (rtx_insn *insn)
 {
   function_abi abi = call_insn_abi (insn);
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-   if (call_used_or_fixed_reg_p (i)
-   || (REG_VALUES (i) && REG_VALUES (i)->elt
-   && (targetm.hard_regno_call_part_clobbered
-   (abi.id (), i, GET_MODE (REG_VALUES (i)->elt->val_rtx)
- cselib_invalidate_regno (i, reg_raw_mode[i]);
+   if (elt_list *values = REG_VALUES (i))
+ {
+   /* If we know what mode the value was set in, check whether
+  it is still available after the call in that mode.  If we
+  don't know the mode, we have to check for the worst-case
+  scenario instead.  */
+   if (values->elt)
+ {
+   if (abi.clobbers_reg_p (GET_MODE (values->elt->val_rtx), i))
+ cselib_invalidate_regno (i, GET_MODE (values->elt->val_rtx));
+ }
+   else
+ {
+   if (abi.clobbers_at_least_part_of_reg_p (i))
+ cselib_invalidate_regno (i, reg_raw_mode[i]);
+ }
+ }
 
   /* Since it is not clear how cselib is going to be used, be
 conservative here and treat looping pure or const functions


[10/32] Remove global call sets: combine.c

2019-09-11 Thread Richard Sandiford
There shouldn't be many cases in which a useful hard register is
live across a call before RA, so we might as well keep things simple
and invalidate partially-clobbered registers here, in case the values
they hold leak into the call-clobbered part.  In principle this is
a bug fix for TARGET_HARD_REGNO_CALL_PART_CLOBBERED targets,
but in practice it probably doesn't make a difference.


2019-09-11  Richard Sandiford  

gcc/
* combine.c: Include function-abi.h.
(record_dead_and_set_regs): Use call_insn_abi to get the ABI
of call insns.  Invalidate partially-clobbered registers as
well as fully-clobbered ones.

Index: gcc/combine.c
===
--- gcc/combine.c   2019-09-09 18:58:51.448270881 +0100
+++ gcc/combine.c   2019-09-11 19:47:57.062032638 +0100
@@ -105,6 +105,7 @@ Software Foundation; either version 3, o
 #include "valtrack.h"
 #include "rtl-iter.h"
 #include "print-rtl.h"
+#include "function-abi.h"
 
 /* Number of attempts to combine instructions in this function.  */
 
@@ -13464,11 +13465,21 @@ record_dead_and_set_regs (rtx_insn *insn
 
   if (CALL_P (insn))
 {
+  function_abi abi = call_insn_abi (insn);
   hard_reg_set_iterator hrsi;
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, hrsi)
+  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+ 0, i, hrsi)
{
  reg_stat_type *rsp;
 
+ /* ??? We could try to preserve some information from the last
+set of register I if the call doesn't actually clobber
+(reg:last_set_mode I), which might be true for ABIs with
+partial clobbers.  However, it would be difficult to
+update last_set_nonzero_bits and last_sign_bit_copies
+to account for the part of I that actually was clobbered.
+It wouldn't help much anyway, since we rarely see this
+situation before RA.  */
  rsp = _stat[i];
  rsp->last_set_invalid = 1;
  rsp->last_set = insn;


[11/32] Remove global call sets: cse.c

2019-09-11 Thread Richard Sandiford
Like with the combine.c patch, this one keeps things simple by
invalidating values in partially-clobbered registers, rather than
trying to tell whether the value in a partially-clobbered register
is actually clobbered or not.  Again, this is in principle a bug fix,
but probably never matters in practice.


2019-09-11  Richard Sandiford  

gcc/
* cse.c: Include regs.h and function-abi.h.
(invalidate_for_call): Take the call insn as an argument.
Use call_insn_abi to get the ABI of the call and invalidate
partially clobbered registers as well as fully clobbered ones.
(cse_insn): Update call accordingly.

Index: gcc/cse.c
===
--- gcc/cse.c   2019-09-09 18:58:51.468270740 +0100
+++ gcc/cse.c   2019-09-11 19:48:00.966005128 +0100
@@ -42,6 +42,8 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include "regs.h"
+#include "function-abi.h"
 
 /* The basic idea of common subexpression elimination is to go
through the code, keeping a record of expressions that would
@@ -566,7 +568,6 @@ static void remove_invalid_subreg_refs (
machine_mode);
 static void rehash_using_reg (rtx);
 static void invalidate_memory (void);
-static void invalidate_for_call (void);
 static rtx use_related_value (rtx, struct table_elt *);
 
 static inline unsigned canon_hash (rtx, machine_mode);
@@ -2091,23 +2092,29 @@ rehash_using_reg (rtx x)
 }
 
 /* Remove from the hash table any expression that is a call-clobbered
-   register.  Also update their TICK values.  */
+   register in INSN.  Also update their TICK values.  */
 
 static void
-invalidate_for_call (void)
+invalidate_for_call (rtx_insn *insn)
 {
-  unsigned int regno, endregno;
-  unsigned int i;
+  unsigned int regno;
   unsigned hash;
   struct table_elt *p, *next;
   int in_table = 0;
   hard_reg_set_iterator hrsi;
 
-  /* Go through all the hard registers.  For each that is clobbered in
- a CALL_INSN, remove the register from quantity chains and update
+  /* Go through all the hard registers.  For each that might be clobbered
+ in call insn INSN, remove the register from quantity chains and update
  reg_tick if defined.  Also see if any of these registers is currently
- in the table.  */
-  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, regno, hrsi)
+ in the table.
+
+ ??? We could be more precise for partially-clobbered registers,
+ and only invalidate values that actually occupy the clobbered part
+ of the registers.  It doesn't seem worth the effort though, since
+ we shouldn't see this situation much before RA.  */
+  function_abi abi = call_insn_abi (insn);
+  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers (),
+ 0, regno, hrsi)
 {
   delete_reg_equiv (regno);
   if (REG_TICK (regno) >= 0)
@@ -2132,15 +2139,11 @@ invalidate_for_call (void)
  || REGNO (p->exp) >= FIRST_PSEUDO_REGISTER)
continue;
 
- regno = REGNO (p->exp);
- endregno = END_REGNO (p->exp);
-
- for (i = regno; i < endregno; i++)
-   if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
- {
-   remove_from_table (p, hash);
-   break;
- }
+ /* This must use the same test as above rather than the
+more accurate clobbers_reg_p.  */
+ if (overlaps_hard_reg_set_p (abi.full_and_partial_reg_clobbers (),
+  GET_MODE (p->exp), REGNO (p->exp)))
+   remove_from_table (p, hash);
}
 }
 
@@ -5834,7 +5837,7 @@ cse_insn (rtx_insn *insn)
  if (GET_CODE (XEXP (tem, 0)) == USE
  && MEM_P (XEXP (XEXP (tem, 0), 0)))
invalidate (XEXP (XEXP (tem, 0), 0), VOIDmode);
-  invalidate_for_call ();
+  invalidate_for_call (insn);
 }
 
   /* Now invalidate everything set by this instruction.


[09/32] Remove global call sets: cfgloopanal.c

2019-09-11 Thread Richard Sandiford
...or rather, make the use of the default ABI explicit.  That seems
OK if not ideal for this heuristic.

In practical terms, the code patched here is counting GENERAL_REGS,
which are treated in the same way by all concurrent ABI variants
on AArch64.  It might give bad results if used for interrupt
handlers though.


2019-09-11  Richard Sandiford  

gcc/
* cfgloopanal.c: Include regs.h and function-abi.h.
(init_set_costs): Use default_function_abi to test whether
a general register is call-clobbered.

Index: gcc/cfgloopanal.c
===
--- gcc/cfgloopanal.c   2019-09-10 19:56:45.313178201 +0100
+++ gcc/cfgloopanal.c   2019-09-11 19:47:53.946054595 +0100
@@ -32,6 +32,8 @@ Software Foundation; either version 3, o
 #include "graphds.h"
 #include "params.h"
 #include "sreal.h"
+#include "regs.h"
+#include "function-abi.h"
 
 struct target_cfgloop default_target_cfgloop;
 #if SWITCHABLE_TARGET
@@ -353,7 +355,10 @@ init_set_costs (void)
&& !fixed_regs[i])
   {
target_avail_regs++;
-   if (call_used_or_fixed_reg_p (i))
+   /* ??? This is only a rough heuristic.  It doesn't cope well
+  with alternative ABIs, but that's an optimization rather than
+  correctness issue.  */
+   if (default_function_abi.clobbers_full_reg_p (i))
  target_clobbered_regs++;
   }
 


[08/32] Remove global call sets: cfgcleanup.c

2019-09-11 Thread Richard Sandiford
old_insns_match_p just tests whether two instructions are
similar enough to merge.  With call_insn_abi it makes more
sense to compare the ABIs directly.
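
This relies on function_abi objects being directly comparable, which
presumably amounts to something like the following (assumed shape:
a predefined base ABI plus the -fipa-ra clobber mask; a sketch, not
the exact definition):

  /* Inside class function_abi.  */
  bool operator== (const function_abi &other) const
  {
    return (m_base_abi == other.m_base_abi
            && m_mask == other.m_mask);
  }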


2019-09-11  Richard Sandiford  

gcc/
* cfgcleanup.c (old_insns_match_p): Compare the ABIs of calls
instead of the call-clobbered sets.

Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c    2019-09-11 19:47:24.402262786 +0100
+++ gcc/cfgcleanup.c    2019-09-11 19:47:50.610078102 +0100
@@ -1227,13 +1227,7 @@ old_insns_match_p (int mode ATTRIBUTE_UN
}
}
 
-  HARD_REG_SET i1_used = call_insn_abi (i1).full_reg_clobbers ();
-  HARD_REG_SET i2_used = call_insn_abi (i2).full_reg_clobbers ();
-  /* ??? This preserves traditional behavior; it might not be needed.  */
-  i1_used |= fixed_reg_set;
-  i2_used |= fixed_reg_set;
-
-  if (i1_used != i2_used)
+  if (call_insn_abi (i1) != call_insn_abi (i2))
 return dir_none;
 }
 


[07/32] Remove global call sets: caller-save.c

2019-09-11 Thread Richard Sandiford
All caller-save.c uses of "|= fixed_reg_set" added in a previous patch
were redundant, since the sets are later ANDed with ~fixed_reg_set.
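
In set-algebra terms (illustrative variables):

  HARD_REG_SET x = used_regs | fixed_reg_set;   /* the removed |=  */
  x &= ~fixed_reg_set;                          /* the later masking */
  /* == used_regs & ~fixed_reg_set, so the | contributed nothing.  */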


2019-09-11  Richard Sandiford  

gcc/
* caller-save.c (setup_save_areas): Remove redundant |s of
fixed_reg_set.
(save_call_clobbered_regs): Likewise.  Use the call ABI rather
than call_used_or_fixed_regs to decide whether a REG_RETURNED
value is useful.

Index: gcc/caller-save.c
===
--- gcc/caller-save.c   2019-09-11 19:47:24.402262786 +0100
+++ gcc/caller-save.c   2019-09-11 19:47:45.710112631 +0100
@@ -428,8 +428,6 @@ setup_save_areas (void)
   REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
   &chain->live_throughout);
   used_regs = call_insn_abi (insn).full_reg_clobbers ();
-  /* ??? This preserves traditional behavior; it might not be needed.  */
-  used_regs |= fixed_reg_set;
 
   /* Record all registers set in this call insn.  These don't
 need to be saved.  N.B. the call insn might set a subreg
@@ -513,9 +511,6 @@ setup_save_areas (void)
  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
   &chain->live_throughout);
  used_regs = call_insn_abi (insn).full_reg_clobbers ();
- /* ??? This preserves traditional behavior; it might not
-be needed.  */
- used_regs |= fixed_reg_set;
 
  /* Record all registers set in this call insn.  These don't
 need to be saved.  N.B. the call insn might set a subreg
@@ -793,7 +788,6 @@ save_call_clobbered_regs (void)
{
  unsigned regno;
  HARD_REG_SET hard_regs_to_save;
- HARD_REG_SET call_def_reg_set;
  reg_set_iterator rsi;
  rtx cheap;
 
@@ -840,15 +834,12 @@ save_call_clobbered_regs (void)
  note_stores (insn, mark_set_regs, _insn_sets);
 
  /* Compute which hard regs must be saved before this call.  */
+ function_abi abi = call_insn_abi (insn);
  hard_regs_to_save &= ~(fixed_reg_set
 | this_insn_sets
 | hard_regs_saved);
  hard_regs_to_save &= savable_regs;
- call_def_reg_set = call_insn_abi (insn).full_reg_clobbers ();
- /* ??? This preserves traditional behavior; it might not
-be needed.  */
- call_def_reg_set |= fixed_reg_set;
- hard_regs_to_save &= call_def_reg_set;
+ hard_regs_to_save &= abi.full_reg_clobbers ();
 
  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
@@ -863,8 +854,7 @@ save_call_clobbered_regs (void)
  
  if (cheap
  && HARD_REGISTER_P (cheap)
- && TEST_HARD_REG_BIT (call_used_or_fixed_regs,
-   REGNO (cheap)))
+ && abi.clobbers_reg_p (GET_MODE (cheap), REGNO (cheap)))
{
  rtx dest, newpat;
  rtx pat = PATTERN (insn);


[06/32] Pass an ABI to choose_hard_reg_mode

2019-09-11 Thread Richard Sandiford
choose_hard_reg_mode previously took a boolean saying whether the
mode needed to be call-preserved.  This patch replaces it with an
optional ABI pointer instead, so that the function can use that
to test whether a value is call-saved.

default_dwarf_frame_reg_mode uses eh_edge_abi because that's the
ABI that matters for unwinding.  Targets need to override the hook
if they want something different.
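
In terms of call sites, the change looks roughly like this
(illustrative; the real sites are in the diff below):

  /* Old interface: a boolean "must be call-preserved" request.  */
  mode = choose_hard_reg_mode (regno, nregs, true);

  /* New interface: name the ABI the value must survive, or pass
     NULL if call-preservation doesn't matter.  */
  mode = choose_hard_reg_mode (regno, nregs, &eh_edge_abi);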


2019-09-11  Richard Sandiford  

gcc/
* rtl.h (predefined_function_abi): Declare.
(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
instead of a boolean call_save flag.
* config/gcn/gcn.c (gcn_hard_regno_caller_save_mode): Update call
accordingly.
* config/i386/i386.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
* config/ia64/ia64.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
* config/mips/mips.c (mips_hard_regno_caller_save_mode): Likewise.
* config/msp430/msp430.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
* config/sh/sh.c (sh_hard_regno_caller_save_mode): Likewise.
* reginfo.c (init_reg_modes_target): Likewise.
(choose_hard_reg_mode): Take a pointer to a predefined_function_abi
instead of a boolean call_save flag.
* targhooks.c: Include function-abi.h.
(default_dwarf_frame_reg_mode): Update call to choose_hard_reg_mode,
using eh_edge_abi to choose the mode.

Index: gcc/rtl.h
===
--- gcc/rtl.h   2019-09-11 19:47:24.418262673 +0100
+++ gcc/rtl.h   2019-09-11 19:47:39.478156547 +0100
@@ -36,6 +36,8 @@ #define GCC_RTL_H
 
 #include "hard-reg-set.h"
 
+class predefined_function_abi;
+
 /* Value used by some passes to "recognize" noop moves as valid
  instructions.  */
 #define NOOP_MOVE_INSN_CODEINT_MAX
@@ -3383,7 +3385,8 @@ extern bool val_signbit_known_clear_p (m
   unsigned HOST_WIDE_INT);
 
 /* In reginfo.c  */
-extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int, bool);
+extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int,
+ const predefined_function_abi *);
 extern const HARD_REG_SET _subregs (const subreg_shape &);
 
 /* In emit-rtl.c  */
Index: gcc/config/gcn/gcn.c
===
--- gcc/config/gcn/gcn.c    2019-09-10 19:56:45.333178060 +0100
+++ gcc/config/gcn/gcn.c    2019-09-11 19:47:39.466156632 +0100
@@ -3017,7 +3017,7 @@ gcn_hard_regno_rename_ok (unsigned int f
 gcn_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
 machine_mode regmode)
 {
-  machine_mode result = choose_hard_reg_mode (regno, nregs, false);
+  machine_mode result = choose_hard_reg_mode (regno, nregs, NULL);
 
   if (VECTOR_MODE_P (result) && !VECTOR_MODE_P (regmode))
 result = (nregs == 1 ? SImode : DImode);
Index: gcc/config/i386/i386.h
===
--- gcc/config/i386/i386.h  2019-08-27 07:24:49.455527415 +0100
+++ gcc/config/i386/i386.h  2019-09-11 19:47:39.466156632 +0100
@@ -1256,7 +1256,7 @@ #define AVOID_CCMODE_COPIES
#define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
   (CC_REGNO_P (REGNO) ? VOIDmode   \
: (MODE) == VOIDmode && (NREGS) != 1 ? VOIDmode \
-   : (MODE) == VOIDmode ? choose_hard_reg_mode ((REGNO), (NREGS), false) \
+   : (MODE) == VOIDmode ? choose_hard_reg_mode ((REGNO), (NREGS), NULL) \
: (MODE) == HImode && !((GENERAL_REGNO_P (REGNO)\
&& TARGET_PARTIAL_REG_STALL)\
   || MASK_REGNO_P (REGNO)) ? SImode\
Index: gcc/config/ia64/ia64.h
===
--- gcc/config/ia64/ia64.h  2019-09-10 19:57:04.693041422 +0100
+++ gcc/config/ia64/ia64.h  2019-09-11 19:47:39.466156632 +0100
@@ -562,7 +562,7 @@ #define REG_ALLOC_ORDER \
 
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
   ((FR_REGNO_P (REGNO) && (NREGS) == 1) ? RFmode\
-   : choose_hard_reg_mode ((REGNO), (NREGS), false))
+   : choose_hard_reg_mode ((REGNO), (NREGS), NULL))
 
 /* Handling Leaf Functions */
 
Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  2019-09-11 19:47:32.874203085 +0100
+++ gcc/config/mips/mips.c  2019-09-11 19:47:39.470156604 +0100
@@ -22174,7 +22174,7 @@ mips_hard_regno_caller_save_mode (unsign
   /* For performance, avoid saving/restoring upper parts of a register
  by returning MODE as save mode when the mode is known.  */
   if (mode == VOIDmode)
-

[05/32] Pass an ABI identifier to hard_regno_call_part_clobbered

2019-09-11 Thread Richard Sandiford
This patch replaces the rtx_insn argument to
targetm.hard_regno_call_part_clobbered with an ABI identifier, since
call insns are now just one possible way of getting an ABI handle.
This in turn allows predefined_function_abi::initialize to do the
right thing for non-default ABIs.

The horrible ?: in need_for_call_save_p goes away in a later patch,
with the series as a whole removing most direct calls to the hook in
favour of function_abi operations.
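
For illustration, a target implementation might now look like this
(a simplified sketch modelled on the aarch64 change, not the exact
patch text; the base PCS preserves the low 64 bits of the relevant
vector registers, the vector PCS variant the low 128 bits):

static bool
aarch64_hard_regno_call_part_clobbered (unsigned int abi_id,
                                        unsigned int regno,
                                        machine_mode mode)
{
  unsigned int preserved_size = (abi_id == ARM_PCS_SIMD ? 16 : 8);
  return (FP_REGNUM_P (regno)
          && maybe_gt (GET_MODE_SIZE (mode), preserved_size));
}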


2019-09-11  Richard Sandiford  

gcc/
* target.def (hard_regno_call_part_clobbered): Take an ABI
identifier instead of an rtx_insn.
* doc/tm.texi: Regenerate.
* hooks.h (hook_bool_insn_uint_mode_false): Delete.
(hook_bool_uint_uint_mode_false): New function.
* hooks.c (hook_bool_insn_uint_mode_false): Delete.
(hook_bool_uint_uint_mode_false): New function.
* config/aarch64/aarch64.c (aarch64_hard_regno_call_part_clobbered):
Take an ABI identifier instead of an rtx_insn.
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Likewise.
* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Likewise.
* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Likewise.
* config/pru/pru.c (pru_hard_regno_call_part_clobbered): Likewise.
* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered):
Likewise.
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Likewise.
* cselib.c: Include function-abi.h.
(cselib_process_insn): Update call to
targetm.hard_regno_call_part_clobbered, using call_insn_abi
to get the appropriate ABI identifier.
* function-abi.cc (predefined_function_abi::initialize): Update call
to targetm.hard_regno_call_part_clobbered.
* ira-conflicts.c (ira_build_conflicts): Likewise.
* ira-costs.c (ira_tune_allocno_costs): Likewise.
* lra-constraints.c: Include function-abi.h.
(need_for_call_save_p): Update call to
targetm.hard_regno_call_part_clobbered, using call_insn_abi
to get the appropriate ABI identifier.
* lra-lives.c (check_pseudos_live_through_calls): Likewise.
* regcprop.c (copyprop_hardreg_forward_1): Update call
to targetm.hard_regno_call_part_clobbered.
* reginfo.c (choose_hard_reg_mode): Likewise.
* regrename.c (check_new_reg_p): Likewise.
* reload.c (find_equiv_reg): Likewise.
* reload1.c (emit_reload_insns): Likewise.
* sched-deps.c: Include function-abi.h.
(deps_analyze_insn): Update call to
targetm.hard_regno_call_part_clobbered, using call_insn_abi
to get the appropriate ABI identifier.
* sel-sched.c (init_regs_for_mode, mark_unavailable_hard_regs): Update
call to targetm.hard_regno_call_part_clobbered.
* targhooks.c (default_dwarf_frame_reg_mode): Likewise.

Index: gcc/target.def
===
--- gcc/target.def  2019-09-11 19:47:24.422262645 +0100
+++ gcc/target.def  2019-09-11 19:47:32.906202859 +0100
@@ -5763,18 +5763,27 @@ The default version of this hook always
 
 DEFHOOK
 (hard_regno_call_part_clobbered,
- "This hook should return true if @var{regno} is partly call-saved and\n\
-partly call-clobbered, and if a value of mode @var{mode} would be partly\n\
-clobbered by call instruction @var{insn}.  If @var{insn} is NULL then it\n\
-should return true if any call could partly clobber the register.\n\
-For example, if the low 32 bits of @var{regno} are preserved across a call\n\
-but higher bits are clobbered, this hook should return true for a 64-bit\n\
-mode but false for a 32-bit mode.\n\
+ "ABIs usually specify that calls must preserve the full contents\n\
+of a particular register, or that calls can alter any part of a\n\
+particular register.  This information is captured by the target macro\n\
+@code{CALL_REALLY_USED_REGISTERS}.  However, some ABIs specify that calls\n\
+must preserve certain bits of a particular register but can alter others.\n\
+This hook should return true if this applies to at least one of the\n\
+registers in @samp{(reg:@var{mode} @var{regno})}, and if as a result the\n\
+call would alter part of the @var{mode} value.  For example, if a call\n\
+preserves the low 32 bits of a 64-bit hard register @var{regno} but can\n\
+clobber the upper 32 bits, this hook should return true for a 64-bit mode\n\
+but false for a 32-bit mode.\n\
+\n\
+The value of @var{abi_id} comes from the @code{predefined_function_abi}\n\
+structure that describes the ABI of the call; see the definition of the\n\
+structure for more details.  If (as is usual) the target uses the same ABI\n\
+for all functions in a translation unit, @var{abi_id} is always 0.\n\
 \n\
 The default implementation returns false, which is correct\n\
 for targets that don't have partly call-clobbered registers.",
- bool, (rtx_insn *insn, 

[04/32] [x86] Robustify vzeroupper handling across calls

2019-09-11 Thread Richard Sandiford
One of the effects of the function_abi series is to make -fipa-ra
work for partially call-clobbered registers.  E.g. if a call preserves
only the low 32 bits of a register R, we handled the partial clobber
separately from -fipa-ra, and so treated the upper bits of R as
clobbered even if we knew that the target function doesn't touch R.

"Fixing" this caused problems for the vzeroupper handling on x86.
The pass that inserts the vzerouppers assumes that no 256-bit or 512-bit
values are live across a call unless the call takes a 256-bit or 512-bit
argument:

  /* Needed mode is set to AVX_U128_CLEAN if there are
 no 256bit or 512bit modes used in function arguments. */

This implicitly relies on:

/* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The only ABI that
   saves SSE registers across calls is Win64 (thus no need to check the
   current ABI here), and with AVX enabled Win64 only guarantees that
   the low 16 bytes are saved.  */

static bool
ix86_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
 unsigned int regno, machine_mode mode)
{
  return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
}

The comment suggests that this code is only needed for Win64 and that
not testing for Win64 is just a simplification.  But in practice it was
needed for correctness on GNU/Linux and other targets too, since without
it the RA would be able to keep 256-bit and 512-bit values in SSE
registers across calls that are known not to clobber them.

This patch conservatively treats calls as AVX_U128_ANY if the RA can see
that some SSE registers are not touched by a call.  There are then no
regressions if the ix86_hard_regno_call_part_clobbered check is disabled
for GNU/Linux (not something we should do, was just for testing).

If in fact we want -fipa-ra to pretend that all functions clobber
SSE registers above 128 bits, it'd certainly be possible to arrange
that.  But IMO that would be an optimisation decision, whereas what
the patch is fixing is a correctness decision.  So I think we should
have this check even so.


2019-09-11  Richard Sandiford  

gcc/
* config/i386/i386.c: Include function-abi.h.
(ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
if they preserve some 256-bit or 512-bit SSE registers.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  2019-09-10 19:56:55.601105594 +0100
+++ gcc/config/i386/i386.c  2019-09-11 19:47:28.506233865 +0100
@@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
 #include "i386-builtins.h"
 #include "i386-expand.h"
 #include "i386-features.h"
+#include "function-abi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
}
}
 
+  /* If the function is known to preserve some SSE registers,
+RA and previous passes can legitimately rely on that for
+modes wider than 256 bits.  It's only safe to issue a
+vzeroupper if all SSE registers are clobbered.  */
+      const function_abi &abi = call_insn_abi (insn);
+  if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
+ abi.mode_clobbers (V4DImode)))
+   return AVX_U128_ANY;
+
   return AVX_U128_CLEAN;
 }
 


Re: [PATCH V6 09/11] bpf: adjust GCC testsuite to eBPF limitations

2019-09-11 Thread Mike Stump
On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  wrote:
> 
> This patch makes many tests in gcc.dg and gcc.c-torture be skipped
> in bpf-*-* targets.  This is due to the many limitations imposed by
> eBPF

Ok.

So, see my other comments about marking and automatically skipping tests in the 
indirect calls patch.  All of that applies to this one as well.

[03/32] Add a function for getting the ABI of a call insn target

2019-09-11 Thread Richard Sandiford
This patch replaces get_call_reg_set_usage with call_insn_abi,
which returns the ABI of the target of a call insn.  The ABI's
full_reg_clobbers corresponds to regs_invalidated_by_call,
whereas many callers instead passed call_used_or_fixed_regs, i.e.:

  (regs_invalidated_by_call | fixed_reg_set)

The patch slavishly preserves the "| fixed_reg_set" for these callers;
later patches will clean this up.
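
At a typical call site the mechanical change is therefore roughly
(illustrative; "clobbers" is a local name here):

  HARD_REG_SET clobbers;

  /* Before: fill CLOBBERS from a caller-supplied default set.  */
  get_call_reg_set_usage (insn, &clobbers, call_used_or_fixed_regs);

  /* After: ask the call's ABI directly; the traditional
     "| fixed_reg_set" survives for now and goes away later.  */
  clobbers = call_insn_abi (insn).full_reg_clobbers () | fixed_reg_set;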


2019-09-11  Richard Sandiford  

gcc/
* target.def (call_insn_abi): New hook.
(remove_extra_call_preserved_regs): Delete.
* doc/tm.texi.in (TARGET_CALL_INSN_ABI): New macro.
(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
* doc/tm.texi: Regenerate.
* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
insn argument.
(aarch64_remove_extra_call_preserved_regs): Delete.
(aarch64_call_insn_abi): New function.
(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
(TARGET_CALL_INSN_ABI): New macro.
* rtl.h (get_call_fndecl): Declare.
(cgraph_rtl_info): Fix formatting.  Tweak comment for
function_used_regs.  Remove function_used_regs_valid.
* rtlanal.c (get_call_fndecl): Moved from final.c
* function-abi.h (call_insn_abi): Declare.
(target_function_abi_info): Mention call_insn_abi.
* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
way to get_call_reg_set_usage did.
(call_insn_abi): New function.
* regs.h (get_call_reg_set_usage): Delete.
* final.c: Include function-abi.h.
(collect_fn_hard_reg_usage): Add fixed and stack registers to
function_used_regs before the main loop rather than afterwards.
Use call_insn_abi instead of get_call_reg_set_usage.  Exit early
if function_used_regs ends up not being useful.
(get_call_fndecl): Move to rtlanal.c
(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
* caller-save.c: Include function-abi.h.
(setup_save_areas, save_call_clobbered_regs): Use call_insn_abi
instead of get_call_reg_set_usage.
* cfgcleanup.c: Include function-abi.h.
(old_insns_match_p): Use call_insn_abi instead of
get_call_reg_set_usage.
* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
a tree.
* cgraph.c (cgraph_node::rtl_info): Likewise.  Initialize
function_used_regs.
* df-scan.c: Include function-abi.h.
(df_get_call_refs): Use call_insn_abi instead of
get_call_reg_set_usage.
* ira-lives.c: Include function-abi.h.
(process_bb_node_lives): Use call_insn_abi instead of
get_call_reg_set_usage.
* lra-lives.c: Include function-abi.h.
(process_bb_lives): Use call_insn_abi instead of
get_call_reg_set_usage.
* postreload.c: Include function-abi.h.
(reload_combine): Use call_insn_abi instead of get_call_reg_set_usage.
* regcprop.c: Include function-abi.h.
(copyprop_hardreg_forward_1): Use call_insn_abi instead of
get_call_reg_set_usage.
* resource.c: Include function-abi.h.
(mark_set_resources, mark_target_live_regs): Use call_insn_abi
instead of get_call_reg_set_usage.
* var-tracking.c: Include function-abi.h.
(dataflow_set_clear_at_call): Use call_insn_abi
instead of get_call_reg_set_usage.

Index: gcc/target.def
===
--- gcc/target.def  2019-09-11 19:47:20.406290945 +0100
+++ gcc/target.def  2019-09-11 19:47:24.422262645 +0100
@@ -4901,6 +4901,19 @@ interoperability between several ABIs in
  const predefined_function_abi &, (const_tree type),
  NULL)
 
+DEFHOOK
+(call_insn_abi,
+ "This hook returns a description of the ABI used by the target of\n\
+call instruction @var{insn}; see the definition of\n\
+@code{predefined_function_abi} for details of the ABI descriptor.\n\
+Only the global function @code{call_insn_abi} should call this hook\n\
+directly.\n\
+\n\
+Targets only need to define this hook if they support\n\
+interoperability between several ABIs in the same translation unit.",
+ const predefined_function_abi &, (const rtx_insn *insn),
+ NULL)
+
 /* ??? Documenting this hook requires a GFDL license grant.  */
 DEFHOOK_UNDOC
 (internal_arg_pointer,
@@ -5783,20 +5796,6 @@ DEFHOOK
  const char *, (void),
  hook_constcharptr_void_null)
 
-DEFHOOK
-(remove_extra_call_preserved_regs,
- "This hook removes registers from the set of call-clobbered registers\n\
- in @var{used_regs} if, contrary to the default rules, something guarantees\n\
- that @samp{insn} preserves those registers.  For example, some targets\n\
- support variant ABIs in which functions preserve more registers than\n\
- normal 

[02/32] Add a target hook for getting an ABI from a function type

2019-09-11 Thread Richard Sandiford
This patch adds a target hook that allows targets to return
the ABI associated with a particular function type.  Generally,
when multiple ABIs are in use, it must be possible to tell from
a function type and its attributes which ABI it is using.
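
For example, the aarch64 implementation (whose hunk is truncated
below) presumably reduces to something like this sketch, based on
the ChangeLog rather than the exact patch text:

static const predefined_function_abi &
aarch64_fntype_abi (const_tree fntype)
{
  /* The vector PCS is selected by a type attribute.  */
  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))
    return aarch64_simd_abi ();
  return default_function_abi;
}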


2019-09-11  Richard Sandiford  

gcc/
* target.def (fntype_abi): New target hook.
* doc/tm.texi.in (TARGET_FNTYPE_ABI): Likewise.
* doc/tm.texi: Regenerate.
* target.h (predefined_function_abi): Declare.
* function-abi.cc (fntype_abi): Call targetm.calls.fntype_abi,
if defined.
* config/aarch64/aarch64.h (ARM_PCS_SIMD): New arm_pcs value.
* config/aarch64/aarch64.c: Include function-abi.h.
(aarch64_simd_abi, aarch64_fntype_abi): New functions.
(TARGET_FNTYPE_ABI): Define.

Index: gcc/target.def
===
--- gcc/target.def  2019-09-09 17:51:55.848574716 +0100
+++ gcc/target.def  2019-09-11 19:47:20.406290945 +0100
@@ -4892,6 +4892,15 @@ If this hook is not defined, then FUNCTI
  bool, (const unsigned int regno),
  default_function_value_regno_p)
 
+DEFHOOK
+(fntype_abi,
+ "Return the ABI used by a function with type @var{type}; see the\n\
+definition of @code{predefined_function_abi} for details of the ABI\n\
+descriptor.  Targets only need to define this hook if they support\n\
+interoperability between several ABIs in the same translation unit.",
+ const predefined_function_abi &, (const_tree type),
+ NULL)
+
 /* ??? Documenting this hook requires a GFDL license grant.  */
 DEFHOOK_UNDOC
 (internal_arg_pointer,
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in  2019-09-10 19:57:04.713041281 +0100
+++ gcc/doc/tm.texi.in  2019-09-11 19:47:20.402290974 +0100
@@ -1709,6 +1709,11 @@ must be defined.  Modern ports should de
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
+@hook TARGET_FNTYPE_ABI
+
+@cindex call-used register
+@cindex call-clobbered register
+@cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
 @hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi 2019-09-10 19:57:04.713041281 +0100
+++ gcc/doc/tm.texi 2019-09-11 19:47:20.402290974 +0100
@@ -1898,6 +1898,16 @@ must be defined.  Modern ports should de
 @cindex call-used register
 @cindex call-clobbered register
 @cindex call-saved register
+@deftypefn {Target Hook} {const predefined_function_abi &} TARGET_FNTYPE_ABI (const_tree @var{type})
+Return the ABI used by a function with type @var{type}; see the
+definition of @code{predefined_function_abi} for details of the ABI
+descriptor.  Targets only need to define this hook if they support
+interoperability between several ABIs in the same translation unit.
+@end deftypefn
+
+@cindex call-used register
+@cindex call-clobbered register
+@cindex call-saved register
@deftypefn {Target Hook} bool TARGET_HARD_REGNO_CALL_PART_CLOBBERED (rtx_insn *@var{insn}, unsigned int @var{regno}, machine_mode @var{mode})
 This hook should return true if @var{regno} is partly call-saved and
 partly call-clobbered, and if a value of mode @var{mode} would be partly
Index: gcc/target.h
===
--- gcc/target.h    2019-08-20 09:52:11.022820825 +0100
+++ gcc/target.h    2019-09-11 19:47:20.406290945 +0100
@@ -152,6 +152,9 @@ struct noce_if_info;
 /* This is defined in calls.h.  */
 class function_arg_info;
 
+/* This is defined in function-abi.h.  */
+class predefined_function_abi;
+
 /* These are defined in tree-vect-stmts.c.  */
 extern tree stmt_vectype (class _stmt_vec_info *);
 extern bool stmt_in_inner_loop_p (class _stmt_vec_info *);
Index: gcc/function-abi.cc
===
--- gcc/function-abi.cc 2019-09-11 19:47:07.490381964 +0100
+++ gcc/function-abi.cc 2019-09-11 19:47:20.402290974 +0100
@@ -132,6 +132,8 @@ const predefined_function_abi &
 fntype_abi (const_tree type)
 {
   gcc_assert (FUNC_OR_METHOD_TYPE_P (type));
+  if (targetm.calls.fntype_abi)
+return targetm.calls.fntype_abi (type);
   return default_function_abi;
 }
 
Index: gcc/config/aarch64/aarch64.h
===
--- gcc/config/aarch64/aarch64.h    2019-09-05 08:49:31.193737018 +0100
+++ gcc/config/aarch64/aarch64.h    2019-09-11 19:47:20.398291002 +0100
@@ -783,6 +783,7 @@ #define TARGET_ILP32(aarch64_abi & AARC
 enum arm_pcs
 {
   ARM_PCS_AAPCS64, /* Base standard AAPCS for 64 bit.  */
+  ARM_PCS_SIMD,/* For aarch64_vector_pcs functions.  */
   ARM_PCS_UNKNOWN
 };
 
Index: gcc/config/aarch64/aarch64.c
===
--- 

[01/32] Add function_abi.{h,cc}

2019-09-11 Thread Richard Sandiford
This patch adds new structures and functions for handling
multiple ABIs in a translation unit.  The structures are:

- predefined_function_abi: describes a static, predefined ABI
- function_abi: describes either a predefined ABI or a local
  variant of one (e.g. taking -fipa-ra into account)

The patch adds functions for getting the ABI from a given type
or decl; a later patch will also add a function for getting the
ABI of the target of a call insn.

Although ABIs are about much more than call-clobber/saved choices,
I wanted to keep the name general in case we add more ABI-related
information in future.
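
In use, the distinction looks something like this (a sketch; the
accessor names follow the later patches in the series):

  /* A static, predefined ABI, chosen by the function's type.  */
  const predefined_function_abi &type_abi = fntype_abi (fntype);

  /* A possibly refined view of one specific function, with -fipa-ra
     information folded in.  */
  function_abi abi = fndecl_abi (fndecl);
  if (abi.clobbers_reg_p (mode, regno))
    /* The value in (reg:MODE REGNO) doesn't survive calls to FNDECL.  */;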


2019-09-11  Richard Sandiford  

gcc/
* Makefile.in (OBJS): Add function-abi.o.
(GTFILES): Add function-abi.h.
* function-abi.cc: New file.
* function-abi.h: Likewise.
* emit-rtl.h (rtl_data::abi): New field.
* function.c: Include function-abi.h.
(prepare_function_start): Initialize crtl->abi.
* read-rtl-function.c: Include regs.h and function-abi.h.
(read_rtl_function_body): Initialize crtl->abi.
(read_rtl_function_body_from_file_range): Likewise.
* reginfo.c: Include function-abi.h.
(init_reg_sets_1): Initialize default_function_abi.
(globalize_reg): Call add_full_reg_clobber for each predefined ABI
when making a register global.
* target-globals.h (this_target_function_abi_info): Declare.
(target_globals::function_abi_info): New field.
(restore_target_globals): Copy it.
* target-globals.c: Include function-abi.h.
(default_target_globals): Initialize the function_abi_info field.
(target_globals): Allocate it.
(save_target_globals): Free it.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in 2019-09-09 17:51:55.832574829 +0100
+++ gcc/Makefile.in 2019-09-11 19:47:07.486381992 +0100
@@ -1306,6 +1306,7 @@ OBJS = \
fold-const.o \
fold-const-call.o \
function.o \
+   function-abi.o \
function-tests.o \
fwprop.o \
gcc-rich-location.o \
@@ -2522,6 +2523,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
   $(srcdir)/libfuncs.h $(SYMTAB_H) \
   $(srcdir)/real.h $(srcdir)/function.h $(srcdir)/insn-addr.h $(srcdir)/hwint.h \
   $(srcdir)/fixed-value.h \
+  $(srcdir)/function-abi.h \
   $(srcdir)/output.h $(srcdir)/cfgloop.h $(srcdir)/cfg.h $(srcdir)/profile-count.h \
   $(srcdir)/cselib.h $(srcdir)/basic-block.h  $(srcdir)/ipa-ref.h $(srcdir)/cgraph.h \
   $(srcdir)/reload.h $(srcdir)/caller-save.c $(srcdir)/symtab.c \
Index: gcc/function-abi.cc
===
--- /dev/null   2019-07-30 08:53:31.317691683 +0100
+++ gcc/function-abi.cc 2019-09-11 19:47:07.490381964 +0100
@@ -0,0 +1,145 @@
+/* Information about function binary interfaces.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+This file is part of GCC
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "regs.h"
+#include "function-abi.h"
+#include "varasm.h"
+#include "cgraph.h"
+
+target_function_abi_info default_target_function_abi_info;
+#if SWITCHABLE_TARGET
+target_function_abi_info *this_target_function_abi_info
+  = &default_target_function_abi_info;
+#endif
+
+/* Initialize a predefined function ABI with the given values of
+   ID and FULL_REG_CLOBBERS.  */
+
+void
+predefined_function_abi::initialize (unsigned int id,
+const_hard_reg_set full_reg_clobbers)
+{
+  m_id = id;
+  m_initialized = true;
+  m_full_reg_clobbers = full_reg_clobbers;
+
+  /* Set up the value of m_full_and_partial_reg_clobbers.
+
+ If the ABI specifies that part of a hard register R is call-clobbered,
+ we should be able to find a single-register mode M for which
+ targetm.hard_regno_call_part_clobbered (NULL, R, M) is true.
+ In other words, it shouldn't be the case that R can hold all
+ single-register modes across a call, but can't hold part of
+ a multi-register mode.
+
+ If that assumption doesn't hold for a future target, we would need
+ to change the interface of TARGET_HARD_REGNO_CALL_PART_CLOBBERED so
+ that it tells us which registers in a multi-register value are
+ actually clobbered.  */

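A minimal sketch of the kind of loop that can fill this cache, assuming the
hook is queried once per single-register mode (loop shape illustrative, not
the actual patch text):

  m_full_and_partial_reg_clobbers = full_reg_clobbers;
  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
    {
      machine_mode mode = (machine_mode) i;
      for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
	if (targetm.hard_regno_mode_ok (regno, mode)
	    && hard_regno_nregs (regno, mode) == 1
	    && targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
	  SET_HARD_REG_BIT (m_full_and_partial_reg_clobbers, regno);
    }
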
[00/32] Support multiple ABIs in the same translation unit

2019-09-11 Thread Richard Sandiford
This series of patches introduces some classes and helpers for handling
multiple ABIs in the same translation unit.  At the moment "ABI" means
specifically the choice of call-clobbered registers, but I'm hoping the
structures could be used for other ABI properties in future.

The main point of the series is to use these ABI structures instead of
global information like regs_invalidated_by_call, call_used_or_fixed_regs
and targetm.hard_regno_call_part_clobbered.  This has the side effect
of making all passes take -fipa-ra into account (except sel-sched.c,
see its patch for details).

The series also makes -fipa-ra work for partially-clobbered registers too.
Previously, if the ABI said that only the upper bits of a register are
call-clobbered, we'd enforce that rule separately from the -fipa-ra
information and apply it even when -fipa-ra can prove that the registers
aren't modified.  It turns out that fixing this interacts badly with
vzeroupper on x86, so the series has a patch to fix that.

Another general knock-on change is that we now always use the equivalent
of regs_invalidated_by_call rather than call_used_reg_set when deciding
whether a register is clobbered.  Among other things, this means that
cselib no longer invalidates expressions involving the stack pointer
when processing a call, since calls are guaranteed to return with the
same stack pointer.

The main motivating case for the series is the AArch64 vector PCS
and the SVE PCS, which are variants of the base AArch64 ABI but are
interoperable with it.  (Specifically, vector PCS calls preserve the
low 128 bits of 16 vector registers rather than the usual low 64 bits
of 8 registers.  SVE PCS calls instead preserve the whole of those 16
vector registers.)  However, I realised later that we could also use
this for the tlsdesc ABI on SVE targets, which would remove the need
for CLOBBER_HIGH.  I have follow-on patches to do that.
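For the record, the vector PCS variant is already selectable at the source
level; a hedged example (function name hypothetical):

  /* Calls to this function preserve the low 128 bits of 16 vector
     registers, rather than the usual low 64 bits of 8 registers.  */
  void my_simd_fn (void) __attribute__ ((aarch64_vector_pcs));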

I also think the new structures would be useful for targets that
implement interrupt-handler attributes.  At the moment, we compile
interrupt handlers pretty much like ordinary functions, using the
same optimisation heuristics as for ordinary functions, and then
account for the extra call-saved registers in the prologue and
epilogue code.  Hooks like TARGET_HARD_REGNO_SCRATCH_OK then
prevent later optimisers from introducing new uses of unprotected
call-saved registers.  If the interrupt handler ABI was described
directly, the middle-end code would work with it in the same way
as for ordinary functions, including taking it into account when
making optimisation decisions.
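A hedged sketch of the kind of function this would help (Arm back-end
attribute syntax; the handler itself is hypothetical):

  /* Today the extra call-saved registers are handled only in the
     prologue and epilogue; with a directly-described ABI the
     optimisers could reason about them like any other ABI.  */
  void __attribute__ ((interrupt ("IRQ")))
  irq_handler (void)
  {
  }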

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  There were
some extra guality.exp failures due to the extra optimisation,
but they look like acceptable regressions.

Also tested by compiling at least one target per CPU directory and
checking for no new warnings.  It's quite hard to test for accidental
code differences given the general optimisation changes above, so I
resorted to comparing the gcc.c-torture, gcc.dg and g++.dg code at
-O0 only.  This came back clean except on PRU.

The reason for the PRU differences is that the port defines
targetm.hard_regno_call_part_clobbered, but uses it to test whether
a multi-register value contains a mixture of fully-clobbered and
fully-preserved registers.  AFAICT the port doesn't actually have
individual registers that are partly clobbered, so it doesn't need
to define the hook.  (I can see how the documentation gave a misleading
impression though.  I've tried to improve it in one of the patches.)
The series moves away from testing hard_regno_call_part_clobbered
directly to testing cached information instead, and the way that the
cached information is calculated means that defining the hook the way
the PRU port does has no effect.  In other words, after the series we
treat it (rightly IMO) as having a "normal" ABI whereas before we didn't.

Sorry for the long write-up.

Richard


Re: [PATCH V6 04/11] testsuite: new require effective target indirect_calls

2019-09-11 Thread Mike Stump
On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  wrote:
> 
> This patch adds a new dg_require_effective_target procedure to the
> testsuite infrastructure: indirect_calls.  This new function tells
> whether a target supports calls to non-constant call targets.

Ok.  I'll let people contemplate some comments...

I'm torn between targets that can't support C gooping up the test suite, and 
approving.  I'll err on the side of approving this, but would like to hear 
from folks if I go too far.

Since they are easy to identify, maintain and ignore...  I went with approval.

People can contemplate other ways to do this, like introducing a fake marker 
for when the feature is used; when running a program with such a marker, mark 
it as unsupported.  That way no test need be marked, and all future test cases 
that use the feature just flip to unsupported, with no maintenance required.

We do this sort of thing with programs that overflow the RAM, by using a 
stylized error message from ld, and noticing that in dejagnu, and then not 
expecting it to work.

If you can find a way to tally stack space and check it before running a test, 
the other change to tightly track stack space would not be as necessary.  I 
think you might be able to do this on your target.  Having the stack space 
marked is generally useful for other targets, as most won't have a nice way to 
sort out small stacks, so my general comments apply less to the stack size; but 
things that can cause you less maintenance burden are likely good in any event.
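
For reference, a sketch of the annotation pattern being discussed (the
directive is the one the patch series uses; the test body is illustrative):

  /* { dg-require-stack-size "16384" } */
  int
  main (void)
  {
    char buf[16384];  /* frame too large for small-stack targets */
    buf[0] = 0;
    return buf[0];
  }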

Re: [PATCH V6 03/11] testsuite: annotate c-torture/compile tests with dg-require-stack-size

2019-09-11 Thread Mike Stump
On Aug 29, 2019, at 8:13 AM, Jose E. Marchesi  wrote:
> 
> This patch annotates tests that make use of a significant amount of
> stack space.

Ok.  Future changes like this are now obvious to you and you can just maintain 
them as you see fit.

[Committed] PR fortran/91553 -- simplification of constants needs to check for parenthesis

2019-09-11 Thread Steve Kargl
Committed as obvious (once the problem is identified).

When an expression is simplified, parentheses can be inserted.  When
simplifying a constant expression, we need to check for the inserted
parentheses.

2019-09-11  Steven G. Kargl  

PR fortran/91553
* simplify.c (gfc_convert_constant):  During conversion check if the
constant is enclosed in parenthesis, and simplify expression.

2019-09-11  Steven G. Kargl  

PR fortran/91553
* gfortran.dg/pr91553.f90: New test.

-- 
Steve
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c	(revision 275651)
+++ gcc/fortran/simplify.c	(working copy)
@@ -8503,6 +8503,12 @@ gfc_convert_constant (gfc_expr *e, bt type, int kind)
 	{
 	  if (c->expr->expr_type == EXPR_ARRAY)
 		tmp = gfc_convert_constant (c->expr, type, kind);
+	  else if (c->expr->expr_type == EXPR_OP
+		   && c->expr->value.op.op == INTRINSIC_PARENTHESES)
+		{
+		  gfc_simplify_expr (c->expr, 1);
+		  tmp = f (c->expr, kind);
+		}
 	  else
 		tmp = f (c->expr, kind);
 	}
Index: gcc/testsuite/gfortran.dg/pr91553.f90
===
--- gcc/testsuite/gfortran.dg/pr91553.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr91553.f90	(working copy)
@@ -0,0 +1,9 @@
+! { dg-do run }
+! Code contributed by Gerhard Steinmetz
+program p
+   complex z(1)
+   z = (1.0, 2.0) * [real :: (3.0 + 4.0)]
+   if (real(z(1)) /= 7.) stop 1
+   if (aimag(z(1)) /= 14.) stop 2
+end


Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC

2019-09-11 Thread Evgenii Stepanov via gcc-patches
On Wed, Sep 11, 2019 at 9:37 AM Matthew Malcomson
 wrote:
>
> On 11/09/19 12:53, Martin Liška wrote:
> > On 9/9/19 5:54 PM, Matthew Malcomson wrote:
> >> On 09/09/19 11:47, Martin Liška wrote:
> >>> On 9/6/19 4:46 PM, Matthew Malcomson wrote:
>  Hello,
> 
> >> As I understand it, `hwasan-abi=interceptor` vs `platform` is about
> >> adding such MTE emulation for "application code" or "platform code (e.g.
> >> kernel)" respectively.
> >
> > Hm, are you sure? Clang also uses -fsanitize=kernel-hwaddress which should
> > be equivalent to kernel-address for -fsanitize=address.
> >
>
> I'm not at all sure it's to do with the kernel ;-}
>
> Here's the commit that adds the flag.
> https://reviews.llvm.org/D56038
>
>  From the commit message it seems the point is to distinguish between
> running on runtimes that natively support HWASAN (named the "platform"
> abi) and those where functions like malloc and pthread_create have to be
> intercepted (named the "interceptor" abi).
>
> I had assumed that targeting the kernel would be in the "platform"
> group, but it could easily not be the case.
>
> Considering the message form the below commit it seems that this is more
> targeted at instrumenting things like libc https://reviews.llvm.org/D50922.

With hwasan we tried a different approach from asan: instead of
intercepting libc we build it with sanitizer instrumentation, and rely
on a few hooks to update internal state of the tool on interesting
events, such as process startup, thread creation and destruction,
stack unwind (longjmp, vfork). This effectively puts hwasan _below_
libc (as in libc depends on libhwasan).

It has worked amazingly well for Android, where we aim to sanitize
most of platform code at once. Ex. ASan has this requirement that the
main executable needs to be built with ASan before any of the
libraries could - otherwise the tool will not be able to interpose
malloc/free symbols. As a consequence, when there are binaries that
can not be sanitized for any reason, we need to keep unsanitized
copies of all their transitive dependencies, and that turns into a
huge build/deployment mess. Hwasan approach avoids this problem by
making sure that the allocator is always there (because everything
depends on libc).

The downside, of course, is that this can not be used to sanitize a
single binary without a specially built libc. Hence the "interceptor"
ABI, which was an attempt to support running hwasan-instrumented
applications on regular, non-hwasan devices. We are not developing
this mode any longer, but it is used to run compiler-rt tests on
aarch64-android.

> I'm currently working on writing down the questions I plan to ask the
> developers of HWASAN in LLVM, I'll put this on the list :-)
>
> >>
> >>>
> >> There's an even more fundamental problem of accesses within the
> >> instrumented binary -- I haven't yet figured out how to remove the tag
> >> before accesses on architectures without the AArch64 TBI feature.
> >
> > Which should platforms like x86_64, right?
>
> Yes.
> As yet I haven't gotten anything working for architectures without TBI
> (everything except AArch64).
> This particular problem was one I was hoping for suggestions around (my
> first of the questions in my cover letter).

We have support for hwasan on x86_64 in LLVM (by removing tags before
accesses), but it is not really practical because any library built
without instrumentation is a big source of false positives. Even, say,
libc++/libstdc++. We use it exclusively for tests.
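
For illustration, "removing tags" amounts to masking the top byte before a
dereference; a minimal sketch, assuming the tag lives in bits 56-63:

  #include <stdint.h>

  static inline void *
  untag (void *p)
  {
    /* Clear the tag byte so the access uses the real address.  */
    return (void *) ((uintptr_t) p & ~((uintptr_t) 0xff << 56));
  }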

> 
>  The current patch series is far from complete, but I'm posting the 
>  current state
>  to provide something to discuss at the Cauldron next week.
> 
>  In its current state, this sanitizer only works on AArch64 with a custom 
>  kernel
>  to allow tagged pointers in system calls.  This is discussed in the 
>  below link
>  https://source.android.com/devices/tech/debug/hwasan -- the custom 
>  kernel allows
>  tagged pointers in syscalls.
> >>>
> >>> Can you be please more specific. Is the MTE in upstream linux kernel? If 
> >>> so,
> >>> starting from which version?
> >>
> >> I find I can only make complicated statements remotely clear in bullet
> >> points ;-)
> >>
> >> What I was trying to say was:
> >> - HWASAN from this patch series requires AArch64 TBI.
> >> (I have not handled architectures without TBI)
> >> - The upstream kernel does not accept tagged pointers in syscalls.
> >> (programs that use TBI must currently clear tags before passing
> >>  pointers to the kernel)
> >
> > I know that in case of ASAN, the libasan provides wrappers (interceptors) 
> > for various glibc
> > functions that are often system calls. Similar wrappers are probably used 
> > in HWASAN
> > and so that one can create the memory pointer tags.
> >
> >> - This patch series doesn't include any way to avoid passing tagged
> >> pointers to syscalls.
> >
> > I bet LLVM has the same problem so I would expect a handling in the
> > interceptors.

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Richard Biener
On September 11, 2019 7:41:22 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 9/11/19 6:08 PM, Jeff Law wrote:
>> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
>>> On 9/11/19 9:23 AM, Richard Biener wrote:
 On Tue, 10 Sep 2019, Bernd Edlinger wrote:

> Hi!
>
> This ICE happens when compiling real_nextafter in real.c.
> CSE sees this:
>
> (insn 179 178 180 11 (set (reg:SI 319)
> (reg/v/f:SI 273 [ rD.73757 ]))
>"../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>  (nil))
> [...]
> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
> [(voidD.73 *)r_77(D)]+0 S4 A8])
> (unspec:SI [
> (reg:SI 320)
> ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>  (nil))
> [...]
> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ])
> (const_int 20 [0x14])) [0 MEM 
>[(voidD.73 *)r_77(D)]+20 S4 A8])
> (unspec:SI [
> (reg:SI 320)
> ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>  (expr_list:REG_DEAD (reg:SI 320)
> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> (nil
> [...]
> (insn 234 233 235 11 (set (reg:SI 340)
> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>"../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>  (nil))
>
>
> ... and transforms insn 234 in an invalid insn:
>
>
> (insn 234 233 235 11 (set (reg:SI 340 [ MEM 
>[(struct real_valueD.28367 *)r_77(D)] ])
> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> (const_int 20 [0x14])) [0 MEM 
>[(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>643 {*thumb2_movsi_vfp}
>  (nil))
>
> which triggers the assertion in the arm back-end, because the MEM
>is not aligned.
>
> To fix that I changed exp_equiv_p to consider MEMs with different
>MEM_ALIGN or
> ALIAS_SET as different.

 I think the appropriate fix is to retain the mem_attrs of the
>original
 MEM since obviously the replacement does not change those?  It's a
>mere
 change of the MEMs address but implementation-wise CSE replaces the
>whole
 MEM?

 For CSE exp_equiv_p simply compares the addresses, only if for_gcse
 we do further checks like mem_attrs_eq_p.  IMHO that is correct.

 I'm not sure what CSE does above, that is, what does it think it
>does?
 It doesn't replace the loaded value, it seems to do sth else which
 is odd for CSE (replaces a REG with a PLUS - but why?)
  
>>>
>>> What I think CSE is thinking there is this:
>>>
>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>MEM  [(voidD.73 *)r_77(D)]+0 S4 A8]
>>> that is the same address as where insn 234 reads the value back but
>there we know it is aligned.
>>>
>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73
>*)r_77(D)]+20 S4 A8]
>>>
>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>when
>>> insn 234 processed, it replaces one mem with another mem that has
>the same
>>> value.
>> Just to make sure I understand.  Are you saying the addresses for the
>> MEMs are equal or the contents of the memory location are equal.
>> 
>
>The MEMs all have different addresses, but the same value, they are
>from a
>memset or something:
>
>(gdb) call dump_class(elt)
>Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
> [(void *)r_77(D)]+0 S4 A8]): 
>(unspec:SI [
>(reg:SI 320)
>] UNSPEC_UNALIGNED_STORE)
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 16 [0x10])) [0 MEM  [(void *)r_77(D)]+16 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> (const_int 12 [0xc])) [0 MEM  [(void *)r_77(D)]+12 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 8 [0x8])) [0 MEM  [(void *)r_77(D)]+8 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 4 [0x4])) [0 MEM  [(void *)r_77(D)]+4 S4 A8])
>(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0
>S4 A8])
>
>
>The insn 234, uses the same address as the last in the list
>(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0
>S4 A8])
>but incompatible alignment and alias set.
>
>since we only are interested in the value CSE picks the most silly
>variant,
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
>
>now this has a wrong alias set, a wrong alignment and a different
>address,
>and is much more expensive, no wonder that lra needs to rewrite that.

But the alias and alignment are properties of the expression representing the 
value 

[Committed] PR fortran/91642 -- NULL() cannot be in iolength inquire

2019-09-11 Thread Steve Kargl
Committed as obvious.

2019-09-11  Steven G. Kargl  

PR fortran/91642
* io.c (gfc_match_inquire): null() cannot be in an iolength inquire
list.

2019-09-11  Steven G. Kargl  

PR fortran/91642
* gfortran.dg/pr91642.f90: New test.

-- 
Steve
Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c	(revision 275651)
+++ gcc/fortran/io.c	(working copy)
@@ -4641,6 +4641,17 @@ gfc_match_inquire (void)
   if (m == MATCH_NO)
 	goto syntax;
 
+  for (gfc_code *c = code; c; c = c->next)
+	if (c->expr1 && c->expr1->expr_type == EXPR_FUNCTION
+	&& c->expr1->symtree && c->expr1->symtree->n.sym->attr.function
+	&& !c->expr1->symtree->n.sym->attr.external
+	&& strcmp (c->expr1->symtree->name, "null") == 0)
+	  {
+	gfc_error ("NULL() near %L cannot appear in INQUIRE statement",
+		   &c->expr1->where);
+	goto cleanup;
+	  }
+
   new_st.op = EXEC_IOLENGTH;
   new_st.expr1 = inquire->iolength;
   new_st.ext.inquire = inquire;
Index: gcc/testsuite/gfortran.dg/pr91642.f90
===
--- gcc/testsuite/gfortran.dg/pr91642.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr91642.f90	(working copy)
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! PR fortran/91642
+! Code contributed by Gerhard Steinmetz
+program p
+   integer i
+   integer :: iol
+   integer, external :: null
+   i = 0
+   inquire (iolength=iol) i, null()
+   if (iol == 4) stop 1
+end
+
+subroutine q
+   integer i
+   integer :: iol
+   i = 0
+   inquire (iolength=iol) i, null() ! { dg-error "cannot appear in INQUIRE" }
+   if (iol == 4) stop 1
+end


Re: [PATCH] Fix PR 91708

2019-09-11 Thread Wilco Dijkstra
Hi Jeff,
 
> We're talking about two instructions where if the first executes, then
> the second also executes.  If the memory addresses are the same, then
> their alignment is the same.
> 
> In your case the two instructions are on different execution paths and
> are in fact mutually exclusive.

Sure but it shows we cannot just force the alignment to be the same in all
cases. The loads are reading the same address in the same block but they
are *not* identical. The unaligned stores are emitted by an inlined memset,
and it *aliases* with a later load that uses the same address.

Given they are really different accesses with very different meanings you
can't just force the same MEM for both! The alignment of the memset could
be improved (I think we still don't pass the correct alignment info), but the
alias set would need to continue to be different otherwise we violate the C
aliasing rules.
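
For reference, a reduced sketch of the situation (types and field names
illustrative, simplified from real_nextafter):

  #include <string.h>

  struct real_value { unsigned int sig[5]; };

  unsigned int
  f (struct real_value *r)
  {
    memset (r, 0, sizeof (*r));  /* expands to the unaligned stores (A8)  */
    return r->sig[0];            /* aligned SImode load (A32), same bytes */
  }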

Wilco
 

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Bernd Edlinger
On 9/11/19 6:08 PM, Jeff Law wrote:
> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>
 Hi!

 This ICE happens when compiling real_nextafter in real.c.
 CSE sees this:

 (insn 179 178 180 11 (set (reg:SI 319)
 (reg/v/f:SI 273 [ rD.73757 ])) 
 "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
  (nil))
 [...]
 (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
 [(voidD.73 *)r_77(D)]+0 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (nil))
 [...]
 (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (expr_list:REG_DEAD (reg:SI 320)
 (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
 (nil
 [...]
 (insn 234 233 235 11 (set (reg:SI 340)
 (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
 [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
 "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
  (nil))


 ... and transforms insn 234 in an invalid insn:


 (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
 real_valueD.28367 *)r_77(D)] ])
 (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
 {*thumb2_movsi_vfp}
  (nil))

 which triggers the assertion in the arm back-end, because the MEM is not 
 aligned.

 To fix that I changed exp_equiv_p to consider MEMs with different 
 MEM_ALIGN or
 ALIAS_SET as different.
>>>
>>> I think the appropriate fix is to retain the mem_attrs of the original
>>> MEM since obviously the replacement does not change those?  It's a mere
>>> change of the MEMs address but implementation-wise CSE replaces the whole
>>> MEM?
>>>
>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>
>>> I'm not sure what CSE does above, that is, what does it think it does?
>>> It doesn't replace the loaded value, it seems to do sth else which
>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>  
>>
>> What I think CSE is thinking there is this:
>>
>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM 
>>  [(voidD.73 *)r_77(D)]+0 S4 A8]
>> that is the same address as where insn 234 reads the value back but there we 
>> know it is aligned.
>>
>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ 
>> rD.73757 ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8]
>>
>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
>> insn 234 processed, it replaces one mem with another mem that has the same
>> value.
> Just to make sure I understand.  Are you saying the addresses for the
> MEMs are equal or the contents of the memory location are equal.
> 

The MEMs all have different addresses, but the same value, they are from a
memset or something:

(gdb) call dump_class(elt)
Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void 
*)r_77(D)]+0 S4 A8]): 
(unspec:SI [
(reg:SI 320)
] UNSPEC_UNALIGNED_STORE)
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 16 [0x10])) [0 MEM  [(void *)r_77(D)]+16 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 12 [0xc])) [0 MEM  [(void *)r_77(D)]+12 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 8 [0x8])) [0 MEM  [(void *)r_77(D)]+8 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 4 [0x4])) [0 MEM  [(void *)r_77(D)]+4 S4 A8])
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0 S4 A8])


The insn 234, uses the same address as the last in the list
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0 S4 A8])
but incompatible alignment and alias set.

since we only are interested in the value CSE picks the most silly variant,
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])

now this has a wrong alias set, a wrong alignment and a different address,
and is much more expensive, no wonder that lra needs to rewrite that.

> For the former the alignment has to be the same, plain and simple, even
> if GCC isn't aware the alignments have to be the same.
> 

Yes.

> For the latter we'd be better off loading the data into a REG, then
> using the REG for the source of both memory stores.

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Jeff Law
On 9/11/19 10:38 AM, Wilco Dijkstra wrote:
> Hi Jeff,
> 
> Jeff wrote:
>> Just to make sure I understand.  Are you saying the addresses for the
>> MEMs are equal or the contents of the memory location are equal.
>>
>> For the former the alignment has to be the same, plain and simple, even
>> if GCC isn't aware the alignments have to be the same.
>>
>> For the latter we'd be better off loading the data into a REG, then
>> using the REG for the source of both memory stores.
> 
> The addresses are the same (they should probably have been canonicalized
> earlier, that might be a separate bug). I'm not sure why the unaligned stores
> have a lower alignment/no alias set, but it's certainly possible that a memset
> expansion of a subfield of a structure has a lower alignment.
> 
> The known alignment of otherwise identical loads in different blocks could be
> different, ie:
> 
> __attribute__((aligned(1))) struct { int x; } *p;
> if (((intptr_t)p & 3) == 0)
>x = p->x;  // align 4
> else
>y = p->x;  // align 1
> 
> It would be very wrong to change the alignment of the 2nd load to be 4-byte
> aligned.
True, but that's not what's going on here.

We're talking about two instructions where if the first executes, then
the second also executes.  If the memory addresses are the same, then
their alignment is the same.

In your case the two instructions are on different execution paths and
are in fact mutually exclusive.

jeff


Re: Fix for type confusion bug on microblaze

2019-09-11 Thread Jeff Law
On 9/11/19 5:28 AM, Jonas Pfeil wrote:
> The Microblaze target confuses unsigned long with HOST_WIDE_INT.
> 
> This works fine for many architectures, but fails on ARM (HOST_WIDE_INT is 8
> bytes, unsigned long is 4 bytes), leading to printing an uninitialized
> register instead of the desired number.  Fix by using the correct
> printf specifier.
> 
> Tested to fix the issue and work with an ARM->MB cross compiler.
> 
> --- a/gcc/config/microblaze/microblaze.h
> +++ b/gcc/config/microblaze/microblaze.h
> @@ -695,7 +695,7 @@ do {  \
>    fprintf (STREAM, "\t.align\t%d\n", (LOG))
>  
>  #define ASM_OUTPUT_SKIP(STREAM,SIZE)   \
> -  fprintf (STREAM, "\t.space\t%lu\n", (SIZE))
> +  fprintf (STREAM, "\t.space\t" HOST_WIDE_INT_PRINT_DEC "\n", (SIZE))
I believe that we should be using HOST_WIDE_INT_PRINT_UNSIGNED, not
HOST_WIDE_INT_PRINT_DEC.  Can you please verify that works for your
cross builds?
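
Something like this, as a sketch of the suggested variant:

#define ASM_OUTPUT_SKIP(STREAM,SIZE)   \
  fprintf (STREAM, "\t.space\t" HOST_WIDE_INT_PRINT_UNSIGNED "\n", (SIZE))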

Thanks,
jeff



Re: [PATCH][ARM] Correctly set SLOW_BYTE_ACCESS

2019-09-11 Thread Wilco Dijkstra
Hi Paul,

> > On Sep 11, 2019, at 11:48 AM, Wilco Dijkstra  wrote:
> > 
> > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> > bitfields by their declared type, which results in better code generation
> > on practically any target.  So set it correctly to 1 on Arm.
> 
> If the documentation is indeed wrong, could you submit a patch to cure that 
> documentation defect?
 
I already submitted a patch to completely remove SLOW_BYTE_ACCESS since there
are only 2 uses of it in all of GCC.  There is already a target call to handle
volatile bitfields; the other use is completely unrelated (since it's not even
about memory access) and dead code as far as I can see.

Wilco

 
 

Re: [PATCH][ARM] Correctly set SLOW_BYTE_ACCESS

2019-09-11 Thread Paul Koning



> On Sep 11, 2019, at 11:48 AM, Wilco Dijkstra  wrote:
> 
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation
> on practically any target.  So set it correctly to 1 on Arm.

If the documentation is indeed wrong, could you submit a patch to cure that 
documentation defect?

paul



Re: [PATCH] Fix PR 91708

2019-09-11 Thread Wilco Dijkstra
Hi Jeff,

Jeff wrote:
> Just to make sure I understand.  Are you saying the addresses for the
> MEMs are equal or the contents of the memory location are equal.
>
> For the former the alignment has to be the same, plain and simple, even
> if GCC isn't aware the alignments have to be the same.
>
> For the latter we'd be better off loading the data into a REG, then
> using the REG for the source of both memory stores.

The addresses are the same (they should probably have been canonicalized
earlier, that might be a separate bug). I'm not sure why the unaligned stores
have a lower alignment/no alias set, but it's certainly possible that a memset
expansion of a subfield of a structure has a lower alignment.

The known alignment of otherwise identical loads in different blocks could be
different, ie:

__attribute__((aligned(1))) struct { int x; } *p;
if (((intptr_t)p & 3) == 0)
   x = p->x;  // align 4
else
   y = p->x;  // align 1

It would be very wrong to change the alignment of the 2nd load to be 4-byte
aligned.

Wilco

Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC

2019-09-11 Thread Matthew Malcomson
On 11/09/19 12:53, Martin Liška wrote:
> On 9/9/19 5:54 PM, Matthew Malcomson wrote:
>> On 09/09/19 11:47, Martin Liška wrote:
>>> On 9/6/19 4:46 PM, Matthew Malcomson wrote:
 Hello,

>> As I understand it, `hwasan-abi=interceptor` vs `platform` is about
>> adding such MTE emulation for "application code" or "platform code (e.g.
>> kernel)" respectively.
> 
> Hm, are you sure? Clang also uses -fsanitize=kernel-hwaddress which should
> be equivalent to kernel-address for -fsanitize=address.
> 

I'm not at all sure it's to do with the kernel ;-}

Here's the commit that adds the flag.
https://reviews.llvm.org/D56038

 From the commit message it seems the point is to distinguish between 
running on runtimes that natively support HWASAN (named the "platform" 
abi) and those where functions like malloc and pthread_create have to be 
intercepted (named the "interceptor" abi).

I had assumed that targeting the kernel would be in the "platform" 
group, but it could easily not be the case.

Considering the message form the below commit it seems that this is more 
targeted at instrumenting things like libc https://reviews.llvm.org/D50922.

I'm currently working on writing down the questions I plan to ask the 
developers of HWASAN in LLVM, I'll put this on the list :-)

>>
>>>
>> There's an even more fundamental problem of accesses within the
>> instrumented binary -- I haven't yet figured out how to remove the tag
>> before accesses on architectures without the AArch64 TBI feature.
> 
> Which should platforms like x86_64, right?

Yes.
As yet I haven't gotten anything working for architectures without TBI 
(everything except AArch64).
This particular problem was one I was hoping for suggestions around (my 
first of the questions in my cover letter).


 The current patch series is far from complete, but I'm posting the current 
 state
 to provide something to discuss at the Cauldron next week.

 In its current state, this sanitizer only works on AArch64 with a custom 
 kernel
 to allow tagged pointers in system calls.  This is discussed in the below 
 link
 https://source.android.com/devices/tech/debug/hwasan -- the custom kernel 
 allows
 tagged pointers in syscalls.
>>>
>>> Can you be please more specific. Is the MTE in upstream linux kernel? If so,
>>> starting from which version?
>>
>> I find I can only make complicated statements remotely clear in bullet
>> points ;-)
>>
>> What I was trying to say was:
>> - HWASAN from this patch series requires AArch64 TBI.
>> (I have not handled architectures without TBI)
>> - The upstream kernel does not accept tagged pointers in syscalls.
>> (programs that use TBI must currently clear tags before passing
>>  pointers to the kernel)
> 
> I know that in case of ASAN, the libasan provides wrappers (interceptors) for 
> various glibc
> functions that are often system calls. Similar wrappers are probably used in 
> HWASAN
> and so that one can create the memory pointer tags.
> 
>> - This patch series doesn't include any way to avoid passing tagged
>> pointers to syscalls.
> 
> I bet LLVM has the same problem so I would expect a handling in the 
> interceptors.
> 

I'm pretty sure this problem hasn't been solved with interceptors.

The android page describing hwasan specifically mentions the requirement 
of a Linux kernel accepting tagged pointers, and I believe this is the 
most supported environment.

https://source.android.com/devices/tech/debug/hwasan
"HWASan requires the Linux kernel to accept tagged pointers in system 
call arguments."

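A hedged illustration of why (tag placement assumed; the exact error depends
on the syscall):

  #include <stdlib.h>
  #include <unistd.h>

  int
  main (void)
  {
    char *p = malloc (16);    /* under hwasan, the result carries a tag  */
    return write (1, p, 16);  /* an unpatched kernel rejects the tagged
                                 address (e.g. with EFAULT)  */
  }
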
Also, there are surprisingly few interceptors defined in libhwasan.

Thanks,
Matthew

>> - Hence on order to test the sanitizer I'm using a kernel that has been
>> patched to accept tagged pointers in many syscalls.
>> - The link to the android.com site is just another source describing the
>> same requirement.
>>
>>
>> The support for the relaxed ABI (of accepting tagged pointers in various
>> syscalls in the kernel) is being discussed on the kernel mailing list,
>> the latest patchset I know of is here:
>> https://lkml.org/lkml/2019/7/25/725
> 
> Thanks for pointer.
> 


Re: [PATCH] Fix PR 91708

2019-09-11 Thread Jeff Law
On 9/11/19 7:49 AM, Bernd Edlinger wrote:
> On 9/11/19 9:23 AM, Richard Biener wrote:
>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>
>>> Hi!
>>>
>>> This ICE happens when compiling real_nextafter in real.c.
>>> CSE sees this:
>>>
>>> (insn 179 178 180 11 (set (reg:SI 319)
>>> (reg/v/f:SI 273 [ rD.73757 ])) 
>>> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>  (nil))
>>> [...]
>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
>>> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>> (unspec:SI [
>>> (reg:SI 320)
>>> ] UNSPEC_UNALIGNED_STORE)) 
>>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>  (nil))
>>> [...]
>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>>> *)r_77(D)]+20 S4 A8])
>>> (unspec:SI [
>>> (reg:SI 320)
>>> ] UNSPEC_UNALIGNED_STORE)) 
>>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>  (expr_list:REG_DEAD (reg:SI 320)
>>> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>> (nil
>>> [...]
>>> (insn 234 233 235 11 (set (reg:SI 340)
>>> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
>>> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
>>> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>  (nil))
>>>
>>>
>>> ... and transforms insn 234 in an invalid insn:
>>>
>>>
>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
>>> real_valueD.28367 *)r_77(D)] ])
>>> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>>> *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
>>> {*thumb2_movsi_vfp}
>>>  (nil))
>>>
>>> which triggers the assertion in the arm back-end, because the MEM is not 
>>> aligned.
>>>
>>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN 
>>> or
>>> ALIAS_SET as different.
>>
>> I think the appropriate fix is to retain the mem_attrs of the original
>> MEM since obviously the replacement does not change those?  It's a mere
>> change of the MEMs address but implementation-wise CSE replaces the whole
>> MEM?
>>
>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>
>> I'm not sure what CSE does above, that is, what does it think it does?
>> It doesn't replace the loaded value, it seems to do sth else which
>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>  
> 
> What I think CSE is thinking there is this:
> 
> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM 
>  [(voidD.73 *)r_77(D)]+0 S4 A8]
> that is the same address as where insn 234 reads the value back but there we 
> know it is aligned.
> 
> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 
> ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 
> A8]
> 
> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
> insn 234 processed, it replaces one mem with another mem that has the same
> value.
Just to make sure I understand.  Are you saying the addresses for the
MEMs are equal or the contents of the memory location are equal.

For the former the alignment has to be the same, plain and simple, even
if GCC isn't aware the alignments have to be the same.

For the latter we'd be better off loading the data into a REG, then
using the REG for the source of both memory stores.

Jeff


Re: [PATCH] Fix PR 91708

2019-09-11 Thread Jeff Law
On 9/11/19 7:37 AM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/11/19 12:43 AM, Jeff Law wrote:
>>> On 9/10/19 1:51 PM, Bernd Edlinger wrote:
 Hi!

 This ICE happens when compiling real_nextafter in real.c.
 CSE sees this:

 (insn 179 178 180 11 (set (reg:SI 319)
 (reg/v/f:SI 273 [ rD.73757 ])) 
 "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
  (nil))
 [...]
 (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
 [(voidD.73 *)r_77(D)]+0 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (nil))
 [...]
 (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (expr_list:REG_DEAD (reg:SI 320)
 (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
 (nil
 [...]
 (insn 234 233 235 11 (set (reg:SI 340)
 (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
 [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
 "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
  (nil))


 ... and transforms insn 234 in an invalid insn:


 (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
 real_valueD.28367 *)r_77(D)] ])
 (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
 {*thumb2_movsi_vfp}
  (nil))

 which triggers the assertion in the arm back-end, because the MEM is not 
 aligned.

 To fix that I changed exp_equiv_p to consider MEMs with different 
 MEM_ALIGN or
 ALIAS_SET as different.

 This patch fixes the arm bootstrap for --with-cpu=cortex-a57 
 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
 which I confirmed using a cross compiler.  And it fixes the test case that 
 is attached to the PR, but it is way
 too large for the test suite.


 Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
 Is it OK for trunk?


 Thanks
 Bernd.


 patch-pr91708.diff

 2019-09-10  Bernd Edlinger  

PR middle-end/91708
* cse.c (exp_equiv_p): Consider MEMs with different
alias set or alignment as different.
>>> Hmm, I would have expected the address space and alignment checks to be
>>> handled by the call to mem_attrs_eq_p.
>>>
>>
>> Yes, but exp_equiv_p is called from here:
>>
>> lookup (rtx x, unsigned int hash, machine_mode mode)
>> {
>>   struct table_elt *p;
>>
>>   for (p = table[hash]; p; p = p->next_same_hash)
>> if (mode == p->mode && ((x == p->exp && REG_P (x))
>> || exp_equiv_p (x, p->exp, !REG_P (x), false)))
>>   return p;
>>
>> with for_gcse = false si mem_attrs_eq_p is not used.
>>
>>> Which aspect of the MEMs is really causing the problem here?
>>>
>>
>> The alignment is (formally) different, but since also the address is 
>> different,
>> the alignment could be also be really wrong.
>>
>> Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
>> but it is replaced with [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 
>> A8],
>> that appears to be done, because there is the same value that was in the 
>> original
>> location.
>>
>> But also the wrong alias set will cause problems if the instruction
>> is re-materialized in another place (which actually happened in LRA and 
>> caused the
>> assertion, BTW).
> 
> But !for_gcse shouldn't do this kind of things with MEMs.  It should
> only replace a MEM with a REG, not put in some other "equivalent"
> MEMs.
Agreed, generally replacing one MEM with another MEM would be silly.
Though I wouldn't be terribly surprised if CSE did so if the form of one
of the MEMs was cheaper than the other.

It'd be better to use a REG though.

jeff



Re: [C++] Don't fold __builtin_constant_p prematurely

2019-09-11 Thread Marc Glisse

Ping

On Tue, 3 Sep 2019, Marc Glisse wrote:


On Fri, 2 Aug 2019, Marc Glisse wrote:


Ping

On Tue, 16 Jul 2019, Marc Glisse wrote:


Adding a C++ maintainer in Cc:
https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00808.html

On Wed, 10 Jul 2019, Marc Glisse wrote:


Hello,

this avoids folding __builtin_constant_p to 0 early when we are not 
forced to do so. Clearly this has an effect, since it uncovered a bug in 
wi::lshift, fixed today ;-)


I wasn't sure about using |= or just =, the first one seemed more 
conservative.


Bootstrap+regtest on x86_64-pc-linux-gnu.

2019-07-11  Marc Glisse  

gcc/cp/
* constexpr.c (cxx_eval_builtin_function_call): Only set
force_folding_builtin_constant_p if manifestly_const_eval.

gcc/testsuite/
* g++.dg/pr85746.C: New file.


--
Marc Glisse


[PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-11 Thread Wilco Dijkstra
While code hoisting generally improves codesize, it can affect performance
negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
embedded benchmarks, so only enable code hoisting with -Os on Arm.

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-11  Wilco Dijkstra  


PR tree-optimization/80155
* common/config/arm/arm-common.c (arm_option_optimization_table):
Enable -fcode-hoisting with -Os.

--
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -39,6 +39,8 @@ static const struct default_options arm_option_optimization_table[] =
     /* Enable section anchors by default at -O1 or higher.  */
     { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+    /* Enable code hoisting only with -Os.  */
+    { OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
     { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 



[PATCH][ARM] Correctly set SLOW_BYTE_ACCESS

2019-09-11 Thread Wilco Dijkstra
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.  So set it correctly to 1 on Arm.

As a result we generate much better code for bitfields:

typedef struct
{
  int x : 2, y : 8, z : 2;
} X;

int bitfield (X *p)
{
  return p->x + p->y + p->z;
}


Before:
ldrb    r3, [r0]        @ zero_extendqisi2
ldrh    r2, [r0]
ldrb    r0, [r0, #1]    @ zero_extendqisi2
sbfx    r3, r3, #0, #2
sbfx    r2, r2, #2, #8
sbfx    r0, r0, #2, #2
sxtab   r3, r2, r3
sxtab   r0, r3, r0
bx      lr

After:
ldr     r0, [r0]
sbfx    r3, r0, #0, #2
sbfx    r2, r0, #2, #8
sbfx    r0, r0, #10, #2
sxtab   r3, r2, r3
add     r0, r0, r3
bx      lr

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-11  Wilco Dijkstra  

* config/arm/arm.h (SLOW_BYTE_ACCESS): Set to 1.

--

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 8b92c830de09a3ad49420fdfacde02d8efc2a89b..11212d988a0f56299c2266bace80170d074be56c 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1892,8 +1892,9 @@ enum arm_auto_incmodes
((arm_arch4 || (MODE) == QImode) ? ZERO_EXTEND  \
 : ((BYTES_BIG_ENDIAN && (MODE) == HImode) ? SIGN_EXTEND : UNKNOWN)))
 
-/* Nonzero if access to memory by bytes is slow and undesirable.  */
-#define SLOW_BYTE_ACCESS 0
+/* Contrary to all documentation, this enables wide bitfield accesses,
+   which results in better code when accessing multiple bitfields.  */
+#define SLOW_BYTE_ACCESS 1
 
 /* Immediate shift counts are truncated by the output routines (or was it
the assembler?).  Shift counts in a register are truncated by ARM.  Note



Re: [PATCH] Fix PR 91708

2019-09-11 Thread Richard Biener
On September 11, 2019 4:41:10 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 9/11/19 3:55 PM, Richard Biener wrote:
>> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
>> 
>>> On 9/11/19 9:23 AM, Richard Biener wrote:
 On Tue, 10 Sep 2019, Bernd Edlinger wrote:

> Hi!
>
> This ICE happens when compiling real_nextafter in real.c.
> CSE sees this:
>
> (insn 179 178 180 11 (set (reg:SI 319)
> (reg/v/f:SI 273 [ rD.73757 ]))
>"../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>  (nil))
> [...]
> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
> [(voidD.73 *)r_77(D)]+0 S4 A8])
> (unspec:SI [
> (reg:SI 320)
> ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>  (nil))
> [...]
> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ])
> (const_int 20 [0x14])) [0 MEM 
>[(voidD.73 *)r_77(D)]+20 S4 A8])
> (unspec:SI [
> (reg:SI 320)
> ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>  (expr_list:REG_DEAD (reg:SI 320)
> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> (nil
> [...]
> (insn 234 233 235 11 (set (reg:SI 340)
> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>"../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>  (nil))
>
>
> ... and transforms insn 234 in an invalid insn:
>
>
> (insn 234 233 235 11 (set (reg:SI 340 [ MEM 
>[(struct real_valueD.28367 *)r_77(D)] ])
> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> (const_int 20 [0x14])) [0 MEM 
>[(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>643 {*thumb2_movsi_vfp}
>  (nil))
>
> which triggers the assertion in the arm back-end, because the MEM
>is not aligned.
>
> To fix that I changed exp_equiv_p to consider MEMs with different
>MEM_ALIGN or
> ALIAS_SET as different.

 I think the appropriate fix is to retain the mem_attrs of the
>original
 MEM since obviously the replacement does not change those?  It's a
>mere
 change of the MEMs address but implementation-wise CSE replaces the
>whole
 MEM?

 For CSE exp_equiv_p simply compares the addresses, only if for_gcse
 we do further checks like mem_attrs_eq_p.  IMHO that is correct.

 I'm not sure what CSE does above, that is, what does it think it
>does?
 It doesn't replace the loaded value, it seems to do sth else which
 is odd for CSE (replaces a REG with a PLUS - but why?)
  
>>>
>>> What I think CSE is thinking there is this:
>>>
>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>MEM  [(voidD.73 *)r_77(D)]+0 S4 A8]
>>> that is the same address as where insn 234 reads the value back but
>there we know it is aligned.
>>>
>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73
>*)r_77(D)]+20 S4 A8]
>>>
>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>when
>>> insn 234 processed, it replaces one mem with another mem that has
>the same
>>> value.
>> 
>> But why?  I see zero benefit doing that.  Note the addresses are
>equal,
>> so if it for some reason is beneficial CSE could still simply 
>> _preserve_ the original mem_attrs.
>> 
>
>Yes, but the addresses are simply *not* equal.
>
>--- cse.c.orig 2019-07-24 21:21:53.590065924 +0200
>+++ cse.c  2019-09-11 16:03:43.429236836 +0200
>@@ -4564,6 +4564,10 @@
>   else if (GET_CODE (x) == PARALLEL)
> sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
> 
>+  if (INSN_UID(insn) == 234 && GET_CODE(x) == SET
>+  && GET_CODE(SET_DEST(x)) == REG && REGNO(SET_DEST(x)) == 340
>+  && getenv("debug")) asm("int3");
>+
>   this_insn = insn;
>   /* Records what this insn does to set CC0.  */
>   this_insn_cc0 = 0;
>
>(gdb) n
>4809   elt = lookup (src, sets[i].src_hash, mode);
>(gdb) call debug(src)
>(mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM  [(struct
>real_value *)r_77(D)]+0 S4 A32])
>(gdb) call dump_class(elt)
>Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
> [(void *)r_77(D)]+0 S4 A8]): 
>(unspec:SI [
>(reg:SI 320)
>] UNSPEC_UNALIGNED_STORE)
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 16 [0x10])) [0 MEM  [(void *)r_77(D)]+16 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> (const_int 12 [0xc])) [0 MEM  [(void *)r_77(D)]+12 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 8 [0x8])) [0 MEM  [(void *)r_77(D)]+8 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 4 [0x4])) [0 MEM  [(void *)r_77(D)]+4 S4 A8])

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Bernd Edlinger
On 9/11/19 3:55 PM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>
 Hi!

 This ICE happens when compiling real_nextafter in real.c.
 CSE sees this:

 (insn 179 178 180 11 (set (reg:SI 319)
 (reg/v/f:SI 273 [ rD.73757 ])) 
 "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
  (nil))
 [...]
 (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
 [(voidD.73 *)r_77(D)]+0 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (nil))
 [...]
 (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])
 (unspec:SI [
 (reg:SI 320)
 ] UNSPEC_UNALIGNED_STORE)) 
 "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
  (expr_list:REG_DEAD (reg:SI 320)
 (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
 (nil
 [...]
 (insn 234 233 235 11 (set (reg:SI 340)
 (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
 [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
 "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
  (nil))


 ... and transforms insn 234 in an invalid insn:


 (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
 real_valueD.28367 *)r_77(D)] ])
 (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
 (const_int 20 [0x14])) [0 MEM  [(voidD.73 
 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
 {*thumb2_movsi_vfp}
  (nil))

 which triggers the assertion in the arm back-end, because the MEM is not 
 aligned.

 To fix that I changed exp_equiv_p to consider MEMs with different 
 MEM_ALIGN or
 ALIAS_SET as different.
>>>
>>> I think the appropriate fix is to retain the mem_attrs of the original
>>> MEM since obviously the replacement does not change those?  It's a mere
>>> change of the MEMs address but implementation-wise CSE replaces the whole
>>> MEM?
>>>
>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>
>>> I'm not sure what CSE does above, that is, what does it think it does?
>>> It doesn't replace the loaded value, it seems to do sth else which
>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>  
>>
>> What I think CSE is thinking there is this:
>>
>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM 
>>  [(voidD.73 *)r_77(D)]+0 S4 A8]
>> that is the same address as where insn 234 reads the value back but there we 
>> know it is aligned.
>>
>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ 
>> rD.73757 ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8]
>>
>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
>> insn 234 processed, it replaces one mem with another mem that has the same
>> value.
> 
> But why?  I see zero benefit doing that.  Note the addresses are equal,
> so if it for some reason is beneficial CSE could still simply 
> _preserve_ the original mem_attrs.
> 

Yes, but the addresses are simply *not* equal.

--- cse.c.orig  2019-07-24 21:21:53.590065924 +0200
+++ cse.c   2019-09-11 16:03:43.429236836 +0200
@@ -4564,6 +4564,10 @@
   else if (GET_CODE (x) == PARALLEL)
 sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
 
+  if (INSN_UID(insn) == 234 && GET_CODE(x) == SET
+  && GET_CODE(SET_DEST(x)) == REG && REGNO(SET_DEST(x)) == 340
+  && getenv("debug")) asm("int3");
+
   this_insn = insn;
   /* Records what this insn does to set CC0.  */
   this_insn_cc0 = 0;

(gdb) n
4809elt = lookup (src, sets[i].src_hash, mode);
(gdb) call debug(src)
(mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM  [(struct real_value 
*)r_77(D)]+0 S4 A32])
(gdb) call dump_class(elt)
Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void 
*)r_77(D)]+0 S4 A8]): 
(unspec:SI [
(reg:SI 320)
] UNSPEC_UNALIGNED_STORE)
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 20 [0x14])) [0 MEM  [(void *)r_77(D)]+20 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 16 [0x10])) [0 MEM  [(void *)r_77(D)]+16 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 12 [0xc])) [0 MEM  [(void *)r_77(D)]+12 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 8 [0x8])) [0 MEM  [(void *)r_77(D)]+8 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
(const_int 4 [0x4])) [0 MEM  [(void *)r_77(D)]+4 S4 A8])
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM  [(void *)r_77(D)]+0 S4 A8])

[...]
5393  

Re: Patch to support extended characters in C/C++ identifiers

2019-09-11 Thread Lewis Hyatt
On Tue, Sep 10, 2019 at 7:47 PM Joseph Myers  wrote:
>
> Thanks, I think this is OK with a few updates to the documentation.

Thanks for looking through this, I'm glad it will be acceptable. I
will make the documentation adjustments as you suggest.

Speaking of documentation, one other thing occurred to me. When I made
these changes, I tried to make them as minimally disruptive as
possible, so they are the smallest changes I could find to the current
overall architecture to make this work. As a result there are some
things that may be a little surprising. For instance, you can take a
UTF-8 encoded file and insert a backslash line continuation in the
middle of a multibyte sequence, and gcc will happily paste it back
together and then interpret the resulting UTF-8. I think it's
technically OK standardwise since the conversion from extended
characters to the source character set is implementation-defined, but
it's hardly a straightforward definition. It is sort of consistent
with the treatment of undefined behavior with UCN escapes though,
which gcc already permits to be pasted together over a line
continuation. Anyway, should this behavior be documented as well? I
doubt anyone would be happy with a full-blown solution that involves
doing the UTF-8 conversion at initial parse time, given how much of
the libcpp code is devoted to optimizing the performance of scanning
input files, so this is probably the way it's going to end up working
I presume.
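
To make the splice case above concrete, here is a sketch (bytes shown
symbolically, since the raw file contains 0xC3, backslash, newline, 0xA9):

  /* int caf<0xC3>\
     <0xA9> = 0;
     After line splicing the bytes rejoin into U+00E9, so the
     preprocessor effectively sees:  int café = 0;  */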

> I should also note that a few of the tests added by the test are testing
> things that are properties of the implementation that might arguably be
> bugs, rather than standard features, and so perhaps should at least have
> comments added saying they are testing those implementation properties.
>
> gcc/testsuite/gcc.dg/cpp/ucnid-7-utf8.c, testing invalid UTF-8, is relying
> on GCC, in its default -finput-charset=utf-8 mode, not actually checking
> that the input is valid UTF-8.  It's clear that avoiding such a check
> makes sense in strings and comments, both as a matter of efficiency and
> because it's likely to do the right thing for a lot of user programs that
> use non-UTF-8 character sets in those places and just need the bytes in
> the strings to be passed through to the compiler output (rather than
> requiring users to specify -finput-charset and -fexec-charset for those
> programs).  Outside those contexts it's less obvious what's the best way
> to behave (this sort of test, where the stray non-UTF-8 bytes are in text
> that disappears as a result of macro expansion, is certainly a corner
> case).
>

My main reason for including this test was to demonstrate that
existing behavior is unchanged by the patch. If you think it makes
more sense, I could omit the test altogether, otherwise I will add a
comment like you suggested.

> gcc/testsuite/g++.dg/cpp/ucnid-2-utf8.C and
> gcc/testsuite/g++.dg/cpp/ucnid-3-utf8.C are testing double stringizing in
> C++, where strictly the results they expect show that GCC does not conform
> to the C++ standard requirement to convert all extended characters to UCNs
> (because C++ does not have the special C rule making it
> implementation-defined whether the \ of a UCN in a string literal is
> doubled when stringizing).

Thanks, I didn't mean to ignore this point when you made it on the PR
comments, I just wasn't sure what was the best way to handle it. Do
you find it preferable to just add a comment, or should I rather
change the test to look for the standard-conforming output, and make
it an XFAIL?

Finally, one general question: when I submit these last changes, is it
better to send them as a new patch relative to what I already sent, or
is it better to send the whole thing updated from scratch? Thanks
again.

-Lewis


Go patch committed: Don't generate type descriptors for aliases

2019-09-11 Thread Ian Lance Taylor
This Go frontend patch by Than McIntosh suppresses type descriptor
generation for aliases.

This changes Named_object::get_backend to ignore aliases when creating
type descriptors for types, to be consistent with
Type::needs_specific_type_functions and the Specific_type_functions
traversal class.  For example, when compiling a package that creates
an alias to an externally defined type, e.g.,

   import "foo"
   type MyFoo = foo.Foo

it makes sense to skip the alias (not try to generate type specific
functions for it) and let the normal mechanisms take care of the alias
target, depending on whether the target is defined locally or defined
elsewhere.

A testcase for this problem can be found in https://golang.org/cl/193261.

This fixes https://golang.org/issue/33866.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275650)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-27b2311fa460b1dd76fb3a796c7c05ebedc64df2
+0950e905939f88c1421f8667ac4dc9e14528471c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 275594)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -8718,7 +8718,13 @@ Named_object::get_backend(Gogo* gogo, st
 case NAMED_OBJECT_TYPE:
   {
 Named_type* named_type = this->u_.type_value;
-   if (!Gogo::is_erroneous_name(this->name_) && !named_type->is_alias())
+
+// No need to do anything for aliases-- whatever has to be done
+// can be done for the alias target.
+if (named_type->is_alias())
+  break;
+
+   if (!Gogo::is_erroneous_name(this->name_))
  type_decls.push_back(named_type->get_backend(gogo));
 
 // We need to produce a type descriptor for every named


Re: [gofrontend-dev] Re: libgo: Update to Go 1.13beta1 release

2019-09-11 Thread Ian Lance Taylor
On Tue, Sep 10, 2019 at 11:54 PM Andreas Schwab  wrote:
>
> On Sep 10 2019, Ian Lance Taylor  wrote:
>
> > On Mon, Sep 9, 2019 at 2:00 PM Andreas Schwab  wrote:
> >>
> >> ../../../libgo/go/golang.org/x/sys/cpu/cpu.go:17:30: error: reference to 
> >> undefined name ‘cacheLineSize’
> >>17 | type CacheLinePad struct{ _ [cacheLineSize]byte }
> >>   |  ^
> >> ../../../libgo/go/golang.org/x/sys/cpu/cpu_linux.go:56:2: error: reference 
> >> to undefined name ‘doinit’
> >>56 |  doinit()
> >>   |  ^
> >
> > Thanks, should be fixed by SVN revision 275611, just committed.
>
> Nope.
>
> ../../../libgo/go/golang.org/x/sys/cpu/cpu_linux.go:56:2: error: reference to 
> undefined name 'doinit'
>56 |  doinit()
>   |  ^
> make[4]: *** [golang.org/x/sys/cpu.lo] Error 1

Bother, sorry.  I just committed this patch.  Perhaps this will fix it.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275648)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1f4ce28409a2d9d4045b1085de55c46de8759d1c
+27b2311fa460b1dd76fb3a796c7c05ebedc64df2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/golang.org/x/sys/cpu/cpu_linux_other.go
===
--- libgo/go/golang.org/x/sys/cpu/cpu_linux_other.go(nonexistent)
+++ libgo/go/golang.org/x/sys/cpu/cpu_linux_other.go(working copy)
@@ -0,0 +1,9 @@
+// Copyright 2019 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !amd64,!amd64p32,!386,!arm64,!ppc64,!ppc64le,!s390x
+
+package cpu
+
+func doinit() {}


Re: [PATCH 5/5] Rewrite second part of or_comparisons_1 into match.pd.

2019-09-11 Thread Martin Liška
Hi.

Updated version of the patch that drops GENERIC
support in TREE codes.
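
As an illustration (hypothetical examples, not from the patch or its
testsuite), the (X OP1 CST1) || (X OP2 CST2) patterns this patch moves to
match.pd fold C source such as:

  /* (x < 5) || (x <= 7): the less restrictive bound wins, giving x <= 7.  */
  int f1 (int x) { return x < 5 || x <= 7; }

  /* Singleton range: (x < 5) || (x > 5) becomes x != 5.  */
  int f2 (int x) { return x < 5 || x > 5; }

  /* (x <= 5) || (x >= 5) does not restrict the range at all, giving 1.  */
  int f3 (int x) { return x <= 5 || x >= 5; }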

Martin
From 548ff649ae56e2f60ba4b2537d97c5bf085248d5 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 6 Sep 2019 12:59:36 +0200
Subject: [PATCH 5/5] Rewrite second part of or_comparisons_1 into match.pd.

gcc/ChangeLog:

2019-09-09  Martin Liska  

	* gimple-fold.c (or_comparisons_1): Remove rules moved
	to ...
	* match.pd: ... here.
---
 gcc/gimple-fold.c | 45 -
 gcc/match.pd  | 38 ++
 2 files changed, 38 insertions(+), 45 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index dc8e8eef45c..5b79ebd711f 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -6043,51 +6043,6 @@ or_comparisons_1 (tree type, enum tree_code code1, tree op1a, tree op1b,
 	return t;
 }
 
-  /* If both comparisons are of the same value against constants, we might
- be able to merge them.  */
-  if (operand_equal_p (op1a, op2a, 0)
-  && TREE_CODE (op1b) == INTEGER_CST
-  && TREE_CODE (op2b) == INTEGER_CST)
-{
-  int cmp = tree_int_cst_compare (op1b, op2b);
-
-  /* Chose the less restrictive of two < or <= comparisons.  */
-  if ((code1 == LT_EXPR || code1 == LE_EXPR)
-	   && (code2 == LT_EXPR || code2 == LE_EXPR))
-	{
-	  if ((cmp < 0) || (cmp == 0 && code1 == LT_EXPR))
-	return fold_build2 (code2, boolean_type_node, op2a, op2b);
-	  else
-	return fold_build2 (code1, boolean_type_node, op1a, op1b);
-	}
-
-  /* Likewise chose the less restrictive of two > or >= comparisons.  */
-  else if ((code1 == GT_EXPR || code1 == GE_EXPR)
-	   && (code2 == GT_EXPR || code2 == GE_EXPR))
-	{
-	  if ((cmp > 0) || (cmp == 0 && code1 == GT_EXPR))
-	return fold_build2 (code2, boolean_type_node, op2a, op2b);
-	  else
-	return fold_build2 (code1, boolean_type_node, op1a, op1b);
-	}
-
-  /* Check for singleton ranges.  */
-  else if (cmp == 0
-	   && ((code1 == LT_EXPR && code2 == GT_EXPR)
-		   || (code1 == GT_EXPR && code2 == LT_EXPR)))
-	return fold_build2 (NE_EXPR, boolean_type_node, op1a, op2b);
-
-  /* Check for less/greater pairs that don't restrict the range at all.  */
-  else if (cmp >= 0
-	   && (code1 == LT_EXPR || code1 == LE_EXPR)
-	   && (code2 == GT_EXPR || code2 == GE_EXPR))
-	return boolean_true_node;
-  else if (cmp <= 0
-	   && (code1 == GT_EXPR || code1 == GE_EXPR)
-	   && (code2 == LT_EXPR || code2 == LE_EXPR))
-	return boolean_true_node;
-}
-
   /* Perhaps the first comparison is (NAME != 0) or (NAME == 1) where
  NAME's definition is a truth value.  See if there are any simplifications
  that can be done against the NAME's definition.  */
diff --git a/gcc/match.pd b/gcc/match.pd
index c465eabbb89..08b382fd2b9 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2051,6 +2051,44 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (code1 == NE_EXPR && val) { constant_boolean_node (true, type); })
   (if (code1 == NE_EXPR && !val) @3))
 
+/* Convert (X OP1 CST1) || (X OP2 CST2).  */
+
+(for code1 (lt le gt ge)
+ (for code2 (lt le gt ge)
+  (simplify
+  (bit_ior (code1:c@3 @0 INTEGER_CST@1) (code2:c@4 @0 INTEGER_CST@2))
+   (with
+{
+ int cmp = tree_int_cst_compare (@1, @2);
+}
+(switch
+ /* Choose the less restrictive of two < or <= comparisons.  */
+ (if ((code1 == LT_EXPR || code1 == LE_EXPR)
+	  && (code2 == LT_EXPR || code2 == LE_EXPR))
+  (if ((cmp < 0) || (cmp == 0 && code1 == LT_EXPR))
+   @4
+   @3))
+ /* Likewise choose the less restrictive of two > or >= comparisons.  */
+ (if ((code1 == GT_EXPR || code1 == GE_EXPR)
+	  && (code2 == GT_EXPR || code2 == GE_EXPR))
+  (if ((cmp > 0) || (cmp == 0 && code1 == GT_EXPR))
+   @4
+   @3))
+ /* Check for singleton ranges.  */
+ (if (cmp == 0
+	  && ((code1 == LT_EXPR && code2 == GT_EXPR)
+	  || (code1 == GT_EXPR && code2 == LT_EXPR)))
+  (ne @0 @2))
+ /* Check for less/greater pairs that don't restrict the range at all.  */
+ (if (cmp >= 0
+	  && (code1 == LT_EXPR || code1 == LE_EXPR)
+	  && (code2 == GT_EXPR || code2 == GE_EXPR))
+  { constant_boolean_node (true, type); })
+ (if (cmp <= 0
+	  && (code1 == GT_EXPR || code1 == GE_EXPR)
+	  && (code2 == LT_EXPR || code2 == LE_EXPR))
+  { constant_boolean_node (true, type); })
+ )
 
 /* We can't reassociate at all for saturating types.  */
 (if (!TYPE_SATURATING (type))
-- 
2.23.0



Re: [PATCH 4/5] Rewrite first part of or_comparisons_1 into match.pd.

2019-09-11 Thread Martin Liška
Hi.

Updated version of the patch that drops GENERIC
support in TREE codes.
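
As an illustration (hypothetical examples, not from the patch or its
testsuite), the eq/ne bit_ior patterns this patch moves to match.pd fold C
source such as:

  /* (x != 3) || (x < 5): since 3 < 5 the inequality is implied, giving 1.  */
  int g1 (int x) { return x != 3 || x < 5; }

  /* (x == 3) || (x < 5): the equality case is subsumed, giving x < 5.  */
  int g2 (int x) { return x == 3 || x < 5; }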

Martin
From 725f04c781c3d9cc2108b075201fc9ac7afb9a44 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 6 Sep 2019 12:47:01 +0200
Subject: [PATCH 4/5] Rewrite first part of or_comparisons_1 into match.pd.

gcc/ChangeLog:

2019-09-09  Martin Liska  

	* gimple-fold.c (or_comparisons_1): Remove rules
	moved to ...
	* match.pd: ... here.
---
 gcc/gimple-fold.c | 87 +--
 gcc/match.pd  | 28 +++
 2 files changed, 29 insertions(+), 86 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 23891fed930..dc8e8eef45c 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -6051,93 +6051,8 @@ or_comparisons_1 (tree type, enum tree_code code1, tree op1a, tree op1b,
 {
   int cmp = tree_int_cst_compare (op1b, op2b);
 
-  /* If we have (op1a != op1b), we should either be able to
-	 return that or TRUE, depending on whether the constant op1b
-	 also satisfies the other comparison against op2b.  */
-  if (code1 == NE_EXPR)
-	{
-	  bool done = true;
-	  bool val;
-	  switch (code2)
-	{
-	case EQ_EXPR: val = (cmp == 0); break;
-	case NE_EXPR: val = (cmp != 0); break;
-	case LT_EXPR: val = (cmp < 0); break;
-	case GT_EXPR: val = (cmp > 0); break;
-	case LE_EXPR: val = (cmp <= 0); break;
-	case GE_EXPR: val = (cmp >= 0); break;
-	default: done = false;
-	}
-	  if (done)
-	{
-	  if (val)
-		return boolean_true_node;
-	  else
-		return fold_build2 (code1, boolean_type_node, op1a, op1b);
-	}
-	}
-  /* Likewise if the second comparison is a != comparison.  */
-  else if (code2 == NE_EXPR)
-	{
-	  bool done = true;
-	  bool val;
-	  switch (code1)
-	{
-	case EQ_EXPR: val = (cmp == 0); break;
-	case NE_EXPR: val = (cmp != 0); break;
-	case LT_EXPR: val = (cmp > 0); break;
-	case GT_EXPR: val = (cmp < 0); break;
-	case LE_EXPR: val = (cmp >= 0); break;
-	case GE_EXPR: val = (cmp <= 0); break;
-	default: done = false;
-	}
-	  if (done)
-	{
-	  if (val)
-		return boolean_true_node;
-	  else
-		return fold_build2 (code2, boolean_type_node, op2a, op2b);
-	}
-	}
-
-  /* See if an equality test is redundant with the other comparison.  */
-  else if (code1 == EQ_EXPR)
-	{
-	  bool val;
-	  switch (code2)
-	{
-	case EQ_EXPR: val = (cmp == 0); break;
-	case NE_EXPR: val = (cmp != 0); break;
-	case LT_EXPR: val = (cmp < 0); break;
-	case GT_EXPR: val = (cmp > 0); break;
-	case LE_EXPR: val = (cmp <= 0); break;
-	case GE_EXPR: val = (cmp >= 0); break;
-	default:
-	  val = false;
-	}
-	  if (val)
-	return fold_build2 (code2, boolean_type_node, op2a, op2b);
-	}
-  else if (code2 == EQ_EXPR)
-	{
-	  bool val;
-	  switch (code1)
-	{
-	case EQ_EXPR: val = (cmp == 0); break;
-	case NE_EXPR: val = (cmp != 0); break;
-	case LT_EXPR: val = (cmp > 0); break;
-	case GT_EXPR: val = (cmp < 0); break;
-	case LE_EXPR: val = (cmp >= 0); break;
-	case GE_EXPR: val = (cmp <= 0); break;
-	default:
-	  val = false;
-	}
-	  if (val)
-	return fold_build2 (code1, boolean_type_node, op1a, op1b);
-	}
-
   /* Chose the less restrictive of two < or <= comparisons.  */
-  else if ((code1 == LT_EXPR || code1 == LE_EXPR)
+  if ((code1 == LT_EXPR || code1 == LE_EXPR)
 	   && (code2 == LT_EXPR || code2 == LE_EXPR))
 	{
 	  if ((cmp < 0) || (cmp == 0 && code1 == LT_EXPR))
diff --git a/gcc/match.pd b/gcc/match.pd
index ac80dd7dd15..c465eabbb89 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2024,6 +2024,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   { constant_boolean_node (false, type); })
  )
 
+/* Convert (X == CST1) || (X OP2 CST2) to a known value
+   based on CST1 OP2 CST2.  Similarly for (X != CST1).  */
+
+(for code1 (eq ne)
+ (for code2 (eq ne lt gt le ge)
+  (simplify
+   (bit_ior:c (code1@3 @0 INTEGER_CST@1) (code2@4 @0 INTEGER_CST@2))
+(with
+ {
+  int cmp = tree_int_cst_compare (@1, @2);
+  bool val;
+  switch (code2)
+	{
+	case EQ_EXPR: val = (cmp == 0); break;
+	case NE_EXPR: val = (cmp != 0); break;
+	case LT_EXPR: val = (cmp < 0); break;
+	case GT_EXPR: val = (cmp > 0); break;
+	case LE_EXPR: val = (cmp <= 0); break;
+	case GE_EXPR: val = (cmp >= 0); break;
+	default: gcc_unreachable ();
+	}
+ }
+ (switch
+  (if (code1 == EQ_EXPR && val) @4)
+  (if (code1 == NE_EXPR && val) { constant_boolean_node (true, type); })
+  (if (code1 == NE_EXPR && !val) @3))
+
+
 /* We can't reassociate at all for saturating types.  */
 (if (!TYPE_SATURATING (type))
 
-- 
2.23.0



Re: [PATCH 3/5] Rewrite part of and_comparisons_1 into match.pd.

2019-09-11 Thread Martin Liška
On 9/11/19 3:19 PM, Martin Liška wrote:
> On 9/11/19 2:43 PM, Richard Biener wrote:
>> On Wed, 11 Sep 2019, Martin Liška wrote:
>>
>>> I'm sending updated version of the patch where I changed
>>> from previous version:
>>> - more compact matching is used (note @3 and @4):
>>>   (and (code1:c@3 @0 INTEGER_CST@1) (code2:c@4 @0 INTEGER_CST@2))
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> --- a/gcc/genmatch.c
>> +++ b/gcc/genmatch.c
>> @@ -1894,9 +1894,11 @@ dt_node *
>>  dt_node::append_simplify (simplify *s, unsigned pattern_no,
>>   dt_operand **indexes)
>>  {
>> +  dt_simplify *s2;
>>dt_simplify *n = new dt_simplify (s, pattern_no, indexes);
>>for (unsigned i = 0; i < kids.length (); ++i)
>> -if (dt_simplify *s2 = dyn_cast  (kids[i]))
>> +if ((s2 = dyn_cast  (kids[i]))
>> +   && s->match->location != s2->s->match->location)
>>{
>>
>> can you retain the warning for verbose >= 1 please?  And put in
>> a comment that duplicates are sometimes hard to avoid with
>> nested for so this keeps match.pd sources small.
> 
> Sure.
> 
>>
>> +  /* Function maybe_fold_comparisons_from_match_pd creates temporary
>> + SSA_NAMEs.  */
>> +  if (TREE_CODE (op1) == SSA_NAME && TREE_CODE (op2) == SSA_NAME)
>> +{
>> +  gimple *s = SSA_NAME_DEF_STMT (op2);
>> +  if (is_gimple_assign (s))
>> +   return same_bool_comparison_p (op1, gimple_assign_rhs_code (s),
>> +  gimple_assign_rhs1 (s),
>> +  gimple_assign_rhs2 (s));
>> +  else
>> +   return false;
>> +}
>>
>> when do you actually run into the need to add this?  The whole
>> same_bool_result_p/same_bool_comparison_p helpers look a bit
>> odd to me.  It shouldn't be special to the new temporaries
>> either so at least the comment looks out of place.
> 
> At some point it was needed to handle gcc/testsuite/gcc.dg/pr46909.c
> test-case. Apparently, the test-case is now fine without the hunk. I will test
> removal of the hunk.

So it's not needed.

> 
>>
>> +  else if (op.code.is_tree_code ()
>> +  && TREE_CODE_CLASS ((tree_code)op.code) == tcc_comparison)
>> +   {
>>
>> COMPARISON_CLASS_P ((tree_code)op.code)
>>
>> as was noted.
> 
> Won't work here:
> 
> /home/marxin/Programming/gcc/gcc/gimple-fold.c: In function ‘tree_node* 
> maybe_fold_comparisons_from_match_pd(tree, tree_code, tree_code, tree, tree, 
> tree_code, tree, tree)’:
> /home/marxin/Programming/gcc/gcc/tree.h:239:49: error: base operand of ‘->’ 
> is not a pointer
>   239 | #define TREE_CODE(NODE) ((enum tree_code) (NODE)->base.code)
>   | ^~
> 
>>
>> Any particular reason you needed to swap the calls in
>> maybe_fold_and/or_comparisons?
>>
>> +(for code1 (eq ne)
>> + (for code2 (eq ne lt gt le ge)
>> +  (for and (truth_and bit_and)
>> +   (simplify
>>
>> You could save some code-bloat with writing
>>
>>   (for and (
>> #if GENERIC
>>   truth_and
>> #endif
>>   bit_and)
>>
>> but since you are moving the patterns from GIMPLE code I'd say
>> simply remove truth_and and that innermost for?
> 
> I'm gonna drop the generic tree codes support.

It's dropped in the updated patch.

Martin

> 
> Martin
> 
>>
>> Otherwise OK.
>>
>> Thanks,
>> Richard.
>>
>>
>>
>>
>>> Thanks,
>>> Martin
>>>
>>
> 

From f4e6cf9aec14111a35b1c8d641f83ec355d8c7e0 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 6 Sep 2019 12:34:49 +0200
Subject: [PATCH 3/5] Rewrite part of and_comparisons_1 into match.pd.

gcc/ChangeLog:

2019-09-09  Martin Liska  

	* genmatch.c (dt_node::append_simplify): Do not print
	warning when we have duplicate patterns belonging
	to a same simplify rule.
	* gimple-fold.c (and_comparisons_1): Remove matching moved to match.pd.
	(maybe_fold_comparisons_from_match_pd): Handle
	tcc_comparison as a results.
	(maybe_fold_and_comparisons):Call maybe_fold_comparisons_from_match_pd
	first.
	(maybe_fold_or_comparisons): Likewise.
	* match.pd: Handle (X == CST1) && (X OP2 CST2) conditions.
---
 gcc/genmatch.c|   7 +-
 gcc/gimple-fold.c | 161 ++
 gcc/match.pd  |  66 +++
 3 files changed, 93 insertions(+), 141 deletions(-)

diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 2e7bf27eeda..cede432cdc9 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -1894,10 +1894,15 @@ dt_node *
 dt_node::append_simplify (simplify *s, unsigned pattern_no,
 			  dt_operand **indexes)
 {
+  dt_simplify *s2;
   dt_simplify *n = new dt_simplify (s, pattern_no, indexes);
   for (unsigned i = 0; i < kids.length (); ++i)
-if (dt_simplify *s2 = dyn_cast  (kids[i]))
+if ((s2 = dyn_cast  (kids[i]))
+	&& (verbose >= 1
+	|| s->match->location != s2->s->match->location))
   {
+	/* With nested patterns, it's hard to avoid these in order
+	   to keep match.pd rules relatively small.  */
 

Re: [PATCH 2/2] Fix PR88784, middle end is missing some optimizations about unsigned

2019-09-11 Thread Martin Liška
On 9/11/19 3:08 PM, Richard Biener wrote:
> On Fri, 6 Sep 2019, Martin Liška wrote:
> 
>> On 9/5/19 3:17 PM, Richard Biener wrote:
>>> On Tue, 16 Jul 2019, Li Jia He wrote:
>>>
 Hi,

   I made some changes based on the recommendations. Would you like to
   help me to see it again? Sorry, it took so long to provide the
   patch.

   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
The reason is that I did not find a good way to handle the
optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
Maybe I am missing some important information about match.pd.
2. The gimple_resimplify2 function is not used.  Since stmt1,
stmt2, lhs1 and lhs2 are allocated on the stack, is there a
problem with the value on the stack as the return value?
I may have misunderstood Richard's intention.
>>>
>>> And now for the match.pd patch.
>>>
>>> +/* x >  y  &&  x != XXX_MIN  -->  x > y  */
>>> +(for and (truth_and bit_and)
>>> + (simplify
>>> +  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
>>> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
>>> (TREE_TYPE(@1))
>>> +   && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)
>>> +@3)))
>>> +
>>> +/* x >  y  &&  x == XXX_MIN  -->  false  */
>>> +(for and (truth_and bit_and)
>>> + (simplify
>>> +  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
>>> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
>>> (TREE_TYPE(@1))
>>> +   && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)
>>> +{ boolean_false_node; })))
>>>
>>> you could merge those two via
>>>
>>>  (for eqne (eq ne)
>>>   (for and (
>>>(simplify
>>> (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2))
>>> (if (...)
>>>  (switch
>>>   (if (eqne == NE_EXPR)
>>>@3)
>>>   (if (eqne == EQ_EXPR)
>>>{ constant_boolean_node (false, type); }
>>>
>>> notice using constant_boolean_node (false, type); instead of
>>> boolean_false_node.  I suspect more unification is possible.
>>>
>>> Also you could do
>>>
>>> (match min_value
>>>  INTEGER_CST
>>>  (if (INTEGRAL_TYPE_P (type)
>>>   && wi::eq_p (wi::to_wide (t), wi::min_value (type)
>>>
>>> and then write
>>>
>>>  (simplify
>>>   (and:c (gt:c @0 @1) (eq @0 min_value))
>>>   (...
>>>
>>> Your
>>>
>>> (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
>>> (TREE_TYPE(@1))
>>>
>>> is redundant, it's enough to check either @0 or @1 given they
>>> have to be compatible for the gt operation.  Note you probably
>>> want to use
>>>
>>>   (and:c (gt:c @0 @1) (eq @@0 min_value))
>>>
>>> and verify that types_match (@1, @0) because when @0 are a constant
>>> (and (eq @0 min_value) is not folded which can happen) then they
>>> might have different types and thus you could have
>>> (SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX)
>>>
>>> That said, the patterns can be quite a bit simplified I think.
>>>
>>> Richard.
>>>
>>
>> Likewise, I applied the suggested simplification.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> +/* { dg-options "-O2 -fdump-tree-ifcombine --param 
> logical-op-non-short-circuit=0" } */
> +
> +#include 
> +
> +_Bool and1(unsigned x, unsigned y)
> +{
> +  /* x > y && x != 0 --> x > y */
> +  return x > y && x != 0;
> +}
> ...
> +/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
> 
> since you iterate over bit_and and truth_and GENERIC folding should
> already simplify this, no?
> 
> I suggest to remove the truth_and/or iteration.

Done in updated patch.

Martin

> 
> Otherwise looks OK now.
> 
> Richard.
> 

From acb5e62c2babf749b60c5475925ac6afb059e459 Mon Sep 17 00:00:00 2001
From: Li Jia He 
Date: Mon, 15 Jul 2019 03:50:34 -0500
Subject: [PATCH 2/5] Fix PR88784, middle end is missing some optimizations
 about unsigned

Hi,

According to the optimizable case described by Qi Feng on
issue 88784, we can combine the cases into the following:

1. x >  y  &&  x != XXX_MIN  -->   x > y
2. x >  y  &&  x == XXX_MIN  -->   false
3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN

4. x <  y  &&  x != XXX_MAX  -->   x < y
5. x <  y  &&  x == XXX_MAX  -->   false
6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX

7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
8. x <= y  ||  x != XXX_MIN  -->   true
9. x <= y  ||  x == XXX_MIN  -->   x <= y

10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
11. x >= y  ||  x != XXX_MAX  -->   true
12. x >= y  ||  x == XXX_MAX  -->   x >= y

Note: XXX_MIN represents the minimum value of type x.
  XXX_MAX represents the maximum value of type x.

Here we don't need to care about whether the operation is
signed or unsigned.  For example, in the below equation:

'x >  y  &&  x != XXX_MIN  -->   x > y'

If the x type is signed int and XXX_MIN is INT_MIN, we can
optimize it to 'x > y'.  However, if the type of x is unsigned
int and XXX_MIN is 0, we can still optimize 

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Richard Biener
On Wed, 11 Sep 2019, Bernd Edlinger wrote:

> On 9/11/19 9:23 AM, Richard Biener wrote:
> > On Tue, 10 Sep 2019, Bernd Edlinger wrote:
> > 
> >> Hi!
> >>
> >> This ICE happens when compiling real_nextafter in real.c.
> >> CSE sees this:
> >>
> >> (insn 179 178 180 11 (set (reg:SI 319)
> >> (reg/v/f:SI 273 [ rD.73757 ])) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
> >>  (nil))
> >> [...]
> >> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
> >> [(voidD.73 *)r_77(D)]+0 S4 A8])
> >> (unspec:SI [
> >> (reg:SI 320)
> >> ] UNSPEC_UNALIGNED_STORE)) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>  (nil))
> >> [...]
> >> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
> >> *)r_77(D)]+20 S4 A8])
> >> (unspec:SI [
> >> (reg:SI 320)
> >> ] UNSPEC_UNALIGNED_STORE)) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>  (expr_list:REG_DEAD (reg:SI 320)
> >> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> >> (nil
> >> [...]
> >> (insn 234 233 235 11 (set (reg:SI 340)
> >> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
> >> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
> >> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>  (nil))
> >>
> >>
> >> ... and transforms insn 234 in an invalid insn:
> >>
> >>
> >> (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
> >> real_valueD.28367 *)r_77(D)] ])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
> >> *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
> >> {*thumb2_movsi_vfp}
> >>  (nil))
> >>
> >> which triggers the assertion in the arm back-end, because the MEM is not 
> >> aligned.
> >>
> >> To fix that I changed exp_equiv_p to consider MEMs with different 
> >> MEM_ALIGN or
> >> ALIAS_SET as different.
> > 
> > I think the appropriate fix is to retain the mem_attrs of the original
> > MEM since obviously the replacement does not change those?  It's a mere
> > change of the MEMs address but implementation-wise CSE replaces the whole
> > MEM?
> > 
> > For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> > we do further checks like mem_attrs_eq_p.  IMHO that is correct.
> > 
> > I'm not sure what CSE does above, that is, what does it think it does?
> > It doesn't replace the loaded value, it seems to do sth else which
> > is odd for CSE (replaces a REG with a PLUS - but why?)
> >  
> 
> What I think CSE is thinking there is this:
> 
> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM 
>  [(voidD.73 *)r_77(D)]+0 S4 A8]
> that is the same address as where insn 234 reads the value back but there we 
> know it is aligned.
> 
> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 
> ]) (const_int 20 [0x14])) [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 
> A8]
> 
> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
> insn 234 processed, it replaces one mem with another mem that has the same
> value.

But why?  I see zero benefit doing that.  Note the addresses are equal,
so if it for some reason is beneficial CSE could still simply 
_preserve_ the original mem_attrs.

Richard.


Re: [PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd

2019-09-11 Thread Martin Liška
On 9/11/19 2:23 PM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Martin Liška wrote:
> 
>> Hello.
>>
>> One another updated version of the patch.
>> Changes from the previous version:
>> - I fixed:
>>   gimple *stmt1 = (gimple *) XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN, 
>> 2));
>>   into gimple_size (GIMPLE_ASSIGN, 3)
>> - I simplified condition in gimple_simplified_result_is_gimple_val
>> - optimize_vec_cond_expr is using build_same_sized_truth_vector_type
> 
> Can you here instead amend ovce_extract_ops to return the type of
> 'cond' which should be the type you want?

Sure, updated in the tested attached patch.

I'm going to install the patch after the Cauldron.

Martin

> 
> Otherwise OK.
> 
> Richard.
> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin

From 8ad0d2e59350b9410543571059c0d24bcda52db3 Mon Sep 17 00:00:00 2001
From: Li Jia He 
Date: Mon, 15 Jul 2019 00:30:25 -0500
Subject: [PATCH 1/5] Auto-generate maybe_fold_and/or_comparisons from match.pd

gcc/ChangeLog

2019-07-16  Li Jia He  
	Martin Liska  

	* gimple-fold.c (and_comparisons_1): Add type as first
	argument.
	(and_var_with_comparison): Likewise.
	(and_var_with_comparison_1): Likewise.
	(or_comparisons_1): Likewise.
	(or_var_with_comparison): Likewise.
	(or_var_with_comparison_1): Likewise.
	(maybe_fold_and_comparisons): Call maybe_fold_comparisons_from_match_pd.
	(maybe_fold_or_comparisons): Likewise.
	(maybe_fold_comparisons_from_match_pd): New.
	* gimple-fold.h (maybe_fold_and_comparisons): Add type argument.
	(maybe_fold_or_comparisons): Likewise.
	* gimple.c (gimple_size): Make it public and add num_ops argument.
	(gimple_init): New function.
	(gimple_alloc): Call gimple_init.
	* gimple.h (gimple_size): New.
	(gimple_init): Likewise.
	* tree-if-conv.c (fold_or_predicates): Pass type.
	* tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
	* tree-ssa-reassoc.c (eliminate_redundant_comparison): Likewise.
	(optimize_vec_cond_expr): Likewise.
	(ovce_extract_ops): Return type of conditional expression.
	* tree-ssanames.c (init_ssa_name_imm_use): New.
	(make_ssa_name_fn): Use init_ssa_name_imm_use.
	* tree-ssanames.h (init_ssa_name_imm_use): New.
---
 gcc/gimple-fold.c| 170 +--
 gcc/gimple-fold.h|   4 +-
 gcc/gimple.c |  37 +
 gcc/gimple.h |   2 +
 gcc/tree-if-conv.c   |   2 +-
 gcc/tree-ssa-ifcombine.c |   2 +-
 gcc/tree-ssa-reassoc.c   |  25 --
 gcc/tree-ssanames.c  |  21 +++--
 gcc/tree-ssanames.h  |   1 +
 9 files changed, 189 insertions(+), 75 deletions(-)

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index fcffb9802b7..6d9ba367839 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5371,22 +5371,22 @@ same_bool_result_p (const_tree op1, const_tree op2)
 /* Forward declarations for some mutually recursive functions.  */
 
 static tree
-and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
+and_comparisons_1 (tree type, enum tree_code code1, tree op1a, tree op1b,
 		   enum tree_code code2, tree op2a, tree op2b);
 static tree
-and_var_with_comparison (tree var, bool invert,
+and_var_with_comparison (tree type, tree var, bool invert,
 			 enum tree_code code2, tree op2a, tree op2b);
 static tree
-and_var_with_comparison_1 (gimple *stmt,
+and_var_with_comparison_1 (tree type, gimple *stmt,
 			   enum tree_code code2, tree op2a, tree op2b);
 static tree
-or_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
+or_comparisons_1 (tree, enum tree_code code1, tree op1a, tree op1b,
 		  enum tree_code code2, tree op2a, tree op2b);
 static tree
-or_var_with_comparison (tree var, bool invert,
+or_var_with_comparison (tree, tree var, bool invert,
 			enum tree_code code2, tree op2a, tree op2b);
 static tree
-or_var_with_comparison_1 (gimple *stmt,
+or_var_with_comparison_1 (tree, gimple *stmt,
 			  enum tree_code code2, tree op2a, tree op2b);
 
 /* Helper function for and_comparisons_1:  try to simplify the AND of the
@@ -5395,7 +5395,7 @@ or_var_with_comparison_1 (gimple *stmt,
Return NULL_EXPR if we can't simplify this to a single expression.  */
 
 static tree
-and_var_with_comparison (tree var, bool invert,
+and_var_with_comparison (tree type, tree var, bool invert,
 			 enum tree_code code2, tree op2a, tree op2b)
 {
   tree t;
@@ -5409,11 +5409,11 @@ and_var_with_comparison (tree var, bool invert,
  !var AND (op2a code2 op2b) => !(var OR !(op2a code2 op2b))
  Then we only have to consider the simpler non-inverted cases.  */
   if (invert)
-t = or_var_with_comparison_1 (stmt, 
+t = or_var_with_comparison_1 (type, stmt,
   invert_tree_comparison (code2, false),
   op2a, op2b);
   else
-t = and_var_with_comparison_1 (stmt, code2, op2a, op2b);
+t = and_var_with_comparison_1 (type, stmt, code2, op2a, op2b);
   return canonicalize_bool (t, invert);
 }
 
@@ -5422,7 +5422,7 @@ 

Re: [PATCH] Fix PR 91708

2019-09-11 Thread Bernd Edlinger
On 9/11/19 9:23 AM, Richard Biener wrote:
> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
> 
>> Hi!
>>
>> This ICE happens when compiling real_nextafter in real.c.
>> CSE sees this:
>>
>> (insn 179 178 180 11 (set (reg:SI 319)
>> (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 643 {*thumb2_movsi_vfp}
>>  (nil))
>> [...]
>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
>> [(voidD.73 *)r_77(D)]+0 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 129 {unaligned_storesi}
>>  (nil))
>> [...]
>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 129 {unaligned_storesi}
>>  (expr_list:REG_DEAD (reg:SI 320)
>> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>> (nil
>> [...]
>> (insn 234 233 235 11 (set (reg:SI 340)
>> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
>> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
>> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>  (nil))
>>
>>
>> ... and transforms insn 234 in an invalid insn:
>>
>>
>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
>> real_valueD.28367 *)r_77(D)] ])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
>> {*thumb2_movsi_vfp}
>>  (nil))
>>
>> which triggers the assertion in the arm back-end, because the MEM is not 
>> aligned.
>>
>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN 
>> or
>> ALIAS_SET as different.
> 
> I think the appropriate fix is to retain the mem_attrs of the original
> MEM since obviously the replacement does not change those?  It's a mere
> change of the MEMs address but implementation-wise CSE replaces the whole
> MEM?
> 
> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
> 
> I'm not sure what CSE does above, that is, what does it think it does?
> It doesn't replace the loaded value, it seems to do sth else which
> is odd for CSE (replaces a REG with a PLUS - but why?)
>  

What I think CSE is thinking there is this:

insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM 
 [(voidD.73 *)r_77(D)]+0 S4 A8]
that is the same address as where insn 234 reads the value back but there we 
know it is aligned.

insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 
]) (const_int 20 [0x14])) [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 A8]

But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
insn 234 processed, it replaces one mem with another mem that has the same
value.


Usually that would be okay, but not when the RTX is extracted from an unspec
unaligned_storesi.


Bernd.


>> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 
>> --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
>> which I confirmed using a cross compiler.  And it fixes the test case that 
>> is attached to the PR, but it is way
>> too large for the test suite.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> So no, I don't think this is correct.
> 
> Thanks,
> Richard.
> 


Re: [PATCH] Fix PR 91708

2019-09-11 Thread Richard Biener
On Wed, 11 Sep 2019, Bernd Edlinger wrote:

> On 9/11/19 12:43 AM, Jeff Law wrote:
> > On 9/10/19 1:51 PM, Bernd Edlinger wrote:
> >> Hi!
> >>
> >> This ICE happens when compiling real_nextafter in real.c.
> >> CSE sees this:
> >>
> >> (insn 179 178 180 11 (set (reg:SI 319)
> >> (reg/v/f:SI 273 [ rD.73757 ])) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
> >>  (nil))
> >> [...]
> >> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
> >> [(voidD.73 *)r_77(D)]+0 S4 A8])
> >> (unspec:SI [
> >> (reg:SI 320)
> >> ] UNSPEC_UNALIGNED_STORE)) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>  (nil))
> >> [...]
> >> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
> >> *)r_77(D)]+20 S4 A8])
> >> (unspec:SI [
> >> (reg:SI 320)
> >> ] UNSPEC_UNALIGNED_STORE)) 
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>  (expr_list:REG_DEAD (reg:SI 320)
> >> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> >> (nil
> >> [...]
> >> (insn 234 233 235 11 (set (reg:SI 340)
> >> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
> >> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
> >> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>  (nil))
> >>
> >>
> >> ... and transforms insn 234 in an invalid insn:
> >>
> >>
> >> (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
> >> real_valueD.28367 *)r_77(D)] ])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
> >> *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
> >> {*thumb2_movsi_vfp}
> >>  (nil))
> >>
> >> which triggers the assertion in the arm back-end, because the MEM is not 
> >> aligned.
> >>
> >> To fix that I changed exp_equiv_p to consider MEMs with different 
> >> MEM_ALIGN or
> >> ALIAS_SET as different.
> >>
> >> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 
> >> --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
> >> which I confirmed using a cross compiler.  And it fixes the test case that 
> >> is attached to the PR, but it is way
> >> too large for the test suite.
> >>
> >>
> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> >> Is it OK for trunk?
> >>
> >>
> >> Thanks
> >> Bernd.
> >>
> >>
> >> patch-pr91708.diff
> >>
> >> 2019-09-10  Bernd Edlinger  
> >>
> >>PR middle-end/91708
> >>* cse.c (exp_equiv_p): Consider MEMs with different
> >>alias set or alignment as different.
> > Hmm, I would have expected the address space and alignment checks to be
> > handled by the call to mem_attrs_eq_p.
> > 
> 
> Yes, but exp_equiv_p is called from here:
> 
> lookup (rtx x, unsigned int hash, machine_mode mode)
> {
>   struct table_elt *p;
> 
>   for (p = table[hash]; p; p = p->next_same_hash)
> if (mode == p->mode && ((x == p->exp && REG_P (x))
> || exp_equiv_p (x, p->exp, !REG_P (x), false)))
>   return p;
> 
with for_gcse = false, so mem_attrs_eq_p is not used.
> 
> > Which aspect of the MEMs is really causing the problem here?
> > 
> 
The alignment is (formally) different, and since the address is also
different,
the alignment could also be really wrong.
> 
> Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
> but it is replaced with [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 
> A8],
that appears to be done because the same value was stored at the original
location.
> 
> But also the wrong alias set will cause problems if the instruction
> is re-materialized in another place (which actually happened in LRA and 
> caused the
> assertion, BTW).

But !for_gcse shouldn't do this kind of thing with MEMs.  It should
only replace a MEM with a REG, not put in some other "equivalent"
MEMs.

Richard.


Re: [PATCH] Fix PR 91708

2019-09-11 Thread Bernd Edlinger
On 9/11/19 12:43 AM, Jeff Law wrote:
> On 9/10/19 1:51 PM, Bernd Edlinger wrote:
>> Hi!
>>
>> This ICE happens when compiling real_nextafter in real.c.
>> CSE sees this:
>>
>> (insn 179 178 180 11 (set (reg:SI 319)
>> (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 643 {*thumb2_movsi_vfp}
>>  (nil))
>> [...]
>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM  
>> [(voidD.73 *)r_77(D)]+0 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 129 {unaligned_storesi}
>>  (nil))
>> [...]
>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8])
>> (unspec:SI [
>> (reg:SI 320)
>> ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 
>> 129 {unaligned_storesi}
>>  (expr_list:REG_DEAD (reg:SI 320)
>> (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>> (nil
>> [...]
>> (insn 234 233 235 11 (set (reg:SI 340)
>> (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM  
>> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) 
>> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>  (nil))
>>
>>
>> ... and transforms insn 234 in an invalid insn:
>>
>>
>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM  [(struct 
>> real_valueD.28367 *)r_77(D)] ])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>> (const_int 20 [0x14])) [0 MEM  [(voidD.73 
>> *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 
>> {*thumb2_movsi_vfp}
>>  (nil))
>>
>> which triggers the assertion in the arm back-end, because the MEM is not 
>> aligned.
>>
>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN 
>> or
>> ALIAS_SET as different.
>>
>> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 
>> --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
>> which I confirmed using a cross compiler.  And it fixes the test case that 
>> is attached to the PR, but it is way
>> too large for the test suite.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>>
>> patch-pr91708.diff
>>
>> 2019-09-10  Bernd Edlinger  
>>
>>  PR middle-end/91708
>>  * cse.c (exp_equiv_p): Consider MEMs with different
>>  alias set or alignment as different.
> Hmm, I would have expected the address space and alignment checks to be
> handled by the call to mem_attrs_eq_p.
> 

Yes, but exp_equiv_p is called from here:

lookup (rtx x, unsigned int hash, machine_mode mode)
{
  struct table_elt *p;

  for (p = table[hash]; p; p = p->next_same_hash)
if (mode == p->mode && ((x == p->exp && REG_P (x))
|| exp_equiv_p (x, p->exp, !REG_P (x), false)))
  return p;

with for_gcse = false, so mem_attrs_eq_p is not used.

> Which aspect of the MEMs is really causing the problem here?
> 

The alignment is (formally) different, and since the address is also different,
the alignment could also be really wrong.

Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
but it is replaced with [0 MEM  [(voidD.73 *)r_77(D)]+20 S4 A8],
that appears to be done because the same value was stored at the original
location.

But also the wrong alias set will cause problems if the instruction
is re-materialized in another place (which actually happened in LRA and caused 
the
assertion, BTW).


Bernd.


web site patch committed: Mention that GCC 9 supports Go 1.12.2

2019-09-11 Thread Ian Lance Taylor
Add a Go entry for GCC 9, for PR 91700.

Ian
Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.76
diff -u -r1.76 changes.html
--- changes.html	1 Sep 2019 00:28:46 -0000	1.76
+++ changes.html	11 Sep 2019 13:27:22 -0000
@@ -778,7 +778,11 @@
   
 
 
-
+Go
+
+  GCC 9 provides a complete implementation of the Go 1.12.2
+user packages.
+
 
 
 libgccjit


Re: [PATCH 3/5] Rewrite part of and_comparisons_1 into match.pd.

2019-09-11 Thread Martin Liška
On 9/11/19 2:43 PM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Martin Liška wrote:
> 
>> I'm sending updated version of the patch where I changed
>> from previous version:
>> - more compact matching is used (note @3 and @4):
>>   (and (code1:c@3 @0 INTEGER_CST@1) (code2:c@4 @0 INTEGER_CST@2))
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -1894,9 +1894,11 @@ dt_node *
>  dt_node::append_simplify (simplify *s, unsigned pattern_no,
>   dt_operand **indexes)
>  {
> +  dt_simplify *s2;
>dt_simplify *n = new dt_simplify (s, pattern_no, indexes);
>for (unsigned i = 0; i < kids.length (); ++i)
> -if (dt_simplify *s2 = dyn_cast  (kids[i]))
> +if ((s2 = dyn_cast  (kids[i]))
> +   && s->match->location != s2->s->match->location)
>{
> 
> can you retain the warning for verbose >= 1 please?  And put in
> a comment that duplicates are sometimes hard to avoid with
> nested for so this keeps match.pd sources small.

Sure.

> 
> +  /* Function maybe_fold_comparisons_from_match_pd creates temporary
> + SSA_NAMEs.  */
> +  if (TREE_CODE (op1) == SSA_NAME && TREE_CODE (op2) == SSA_NAME)
> +{
> +  gimple *s = SSA_NAME_DEF_STMT (op2);
> +  if (is_gimple_assign (s))
> +   return same_bool_comparison_p (op1, gimple_assign_rhs_code (s),
> +  gimple_assign_rhs1 (s),
> +  gimple_assign_rhs2 (s));
> +  else
> +   return false;
> +}
> 
> when do you actually run into the need to add this?  The whole
> same_bool_result_p/same_bool_comparison_p helpers look a bit
> odd to me.  It shouldn't be special to the new temporaries
> either so at least the comment looks out of place.

At some point it was needed to handle gcc/testsuite/gcc.dg/pr46909.c
test-case. Apparently, the test-case is now fine without the hunk. I will test
removal of the hunk.

> 
> +  else if (op.code.is_tree_code ()
> +  && TREE_CODE_CLASS ((tree_code)op.code) == tcc_comparison)
> +   {
> 
> COMPARISON_CLASS_P ((tree_code)op.code)
> 
> as was noted.

Won't work here:

/home/marxin/Programming/gcc/gcc/gimple-fold.c: In function ‘tree_node* 
maybe_fold_comparisons_from_match_pd(tree, tree_code, tree_code, tree, tree, 
tree_code, tree, tree)’:
/home/marxin/Programming/gcc/gcc/tree.h:239:49: error: base operand of ‘->’ is 
not a pointer
  239 | #define TREE_CODE(NODE) ((enum tree_code) (NODE)->base.code)
  | ^~

> 
> Any particular reason you needed to swap the calls in
> maybe_fold_and/or_comparisons?
> 
> +(for code1 (eq ne)
> + (for code2 (eq ne lt gt le ge)
> +  (for and (truth_and bit_and)
> +   (simplify
> 
> You could save some code-bloat with writing
> 
>   (for and (
> #if GENERIC
>   truth_and
> #endif
>   bit_and)
> 
> but since you are moving the patterns from GIMPLE code I'd say
> simply remove truth_and and that innermost for?

I'm gonna drop the generic tree codes support.

Martin

> 
> Otherwise OK.
> 
> Thanks,
> Richard.
> 
> 
> 
> 
>> Thanks,
>> Martin
>>
> 



[PATCH] Builtin fadd variants folding implementation

2019-09-11 Thread Tejas Joshi
Hi.
This patch includes the implementation of the fadd, faddl and daddl functions.
The patch bootstraps on x86_64-linux-gnu and passes regression tests.
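
As a quick illustration (hypothetical usage, not part of the patch), constant
arguments let the new folding evaluate the narrowing addition at compile time:

  /* fadd adds two doubles and returns the sum rounded to float, so under
     the default flags this call folds to the constant 3.0f.  */
  float
  f (void)
  {
    return __builtin_fadd (1.0, 2.0);
  }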

Thanks,
Tejas

gcc/ChangeLog:

2019-09-11  Tejas Joshi  

* builtin-types.def: Define narrowing function types.
* builtins.def: Define fadd variants builtin functions.
* fold-const-call.c (fold_const_narrow_binary): New. Folding calls
and conditions for standard arithmetic functions.
(fold_const_call): Add case for fadd variants.

gcc/testsuite/ChangeLog:

2019-09-11  Tejas Joshi  

* gcc.dg/builtin-fadd-1.c: New test.
* gcc.dg/builtin-fadd-2.c: New test.
* gcc.dg/builtin-fadd-3.c: New test.
* gcc.dg/builtin-fadd-4.c: New test.
* gcc.dg/builtin-fadd-5.c: New test.
* gcc.dg/builtin-fadd-6.c: New test.
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index e5c9e063c48..6bc552fa71a 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -387,8 +387,14 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT_PTR,
 		 BT_VOID, BT_UINT, BT_PTR)
 DEF_FUNCTION_TYPE_2 (BT_FN_FLOAT_FLOAT_FLOAT,
 		 BT_FLOAT, BT_FLOAT, BT_FLOAT)
+DEF_FUNCTION_TYPE_2 (BT_FN_FLOAT_DOUBLE_DOUBLE,
+		 BT_FLOAT, BT_DOUBLE, BT_DOUBLE)
+DEF_FUNCTION_TYPE_2 (BT_FN_FLOAT_LONGDOUBLE_LONGDOUBLE,
+		 BT_FLOAT, BT_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_2 (BT_FN_DOUBLE_DOUBLE_DOUBLE,
 		 BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
+DEF_FUNCTION_TYPE_2 (BT_FN_DOUBLE_LONGDOUBLE_LONGDOUBLE,
+		 BT_DOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_2 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
 		 BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_2 (BT_FN_FLOAT16_FLOAT16_FLOAT16,
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 8bb7027aac7..2df616c477e 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -355,6 +355,9 @@ DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FABS, "fabs", FABS_TYPE, ATTR_CONST_NOT
 DEF_GCC_BUILTIN(BUILT_IN_FABSD32, "fabsd32", BT_FN_DFLOAT32_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_FABSD64, "fabsd64", BT_FN_DFLOAT64_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_FABSD128, "fabsd128", BT_FN_DFLOAT128_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_FADD, "fadd", BT_FN_FLOAT_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_FADDL, "faddl", BT_FN_FLOAT_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_DADDL, "daddl", BT_FN_DOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_FDIM, "fdim", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN(BUILT_IN_FDIMF, "fdimf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN(BUILT_IN_FDIML, "fdiml", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index 3a14d2a41c1..f6b4508a101 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -570,6 +570,44 @@ fold_const_nextafter (real_value *result, const real_value *arg0,
   return true;
 }
 
+/* Try to evaluate:
+
+  *RESULT = add (*ARG0, *ARG1)
+
+   in format FORMAT.  Return true on success.  */
+
+static bool
+fold_const_narrow_binary (real_value *result, const real_value *arg0,
+			  int icode, const real_value *arg1,
+			  const real_format *format)
+{
+  if (REAL_VALUE_ISSIGNALING_NAN (*arg0)
+  || REAL_VALUE_ISSIGNALING_NAN (*arg1))
+return false;
+
+  real_arithmetic (result, icode, arg0, arg1);
+  /* Underflow condition.  */
+  if (flag_errno_math
+  && result->cl == rvc_normal
+  && REAL_EXP (result) < format->emin)
+return false;
+
+  if (!exact_real_truncate (format, result)
+  && (flag_rounding_math || flag_trapping_math))
+return false;
+
+  real_convert (result, format, result);
+  /* Overflow condition.  */
+  if (!real_isfinite (result) && flag_errno_math)
+return false;
+
+  if (REAL_VALUE_ISNAN (*result)
+  && (flag_errno_math || flag_trapping_math))
+return false;
+
+  return true;
+}
+
 /* Try to evaluate:
 
   *RESULT = ldexp (*ARG0, ARG1)
@@ -1674,6 +1712,25 @@ fold_const_call (combined_fn fn, tree type, tree arg0, tree arg1)
 case CFN_FOLD_LEFT_PLUS:
   return fold_const_fold_left (type, arg0, arg1, PLUS_EXPR);
 
+case CFN_BUILT_IN_FADD:
+case CFN_BUILT_IN_FADDL:
+case CFN_BUILT_IN_DADDL:
+  {
+	if (real_cst_p (arg0)
+	&& real_cst_p (arg1))
+	  {
+	machine_mode arg0_mode = TYPE_MODE (TREE_TYPE (arg0));
+	gcc_checking_assert (SCALAR_FLOAT_MODE_P (arg0_mode));
+	REAL_VALUE_TYPE result;
+	machine_mode mode = TYPE_MODE (type);
+	if (fold_const_narrow_binary (&result, TREE_REAL_CST_PTR (arg0),
+	  PLUS_EXPR, TREE_REAL_CST_PTR (arg1),
+	  REAL_MODE_FORMAT (mode)))
+	  return build_real (type, result);
+	  }
+  }
+  return NULL_TREE;
+
 default:

Re: [PATCH 2/2] Fix PR88784, middle end is missing some optimizations about unsigned

2019-09-11 Thread Richard Biener
On Fri, 6 Sep 2019, Martin Liška wrote:

> On 9/5/19 3:17 PM, Richard Biener wrote:
> > On Tue, 16 Jul 2019, Li Jia He wrote:
> > 
> >> Hi,
> >>
> >>   I made some changes based on the recommendations. Would you like to
> >>   help me to see it again? Sorry, it took so long to provide the
> >>   patch.
> >>
> >>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
> >>The reason is that I did not find a good way to handle the
> >>optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
> >>Maybe I am missing some important information about match.pd.
> >>2. The gimple_resimplify2 function is not used.  Since stmt1,
> >>stmt2, lhs1 and lhs2 are allocated on the stack, is there a
> >>problem with the value on the stack as the return value?
> >>I may have misunderstood Richard's intention.
> > 
> > And now for the match.pd patch.
> > 
> > +/* x >  y  &&  x != XXX_MIN  -->  x > y  */
> > +(for and (truth_and bit_and)
> > + (simplify
> > +  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
> > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
> > (TREE_TYPE(@1))
> > +   && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)
> > +@3)))
> > +
> > +/* x >  y  &&  x == XXX_MIN  -->  false  */
> > +(for and (truth_and bit_and)
> > + (simplify
> > +  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
> > +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P 
> > (TREE_TYPE(@1))
> > +   && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)
> > +{ boolean_false_node; })))
> > 
> > you could merge those two via
> > 
> >  (for eqne (eq ne)
> >   (for and (
> >(simplify
> > (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2))
> > (if (...)
> >  (switch
> >   (if (eqne == NE_EXPR)
> >@3)
> >   (if (eqne == EQ_EXPR)
> >{ constant_boolean_node (false, type); }
> > 
> > notice using constant_boolean_node (false, type); instead of
> > boolean_false_node.  I suspect more unification is possible.
> > 
> > Also you could do
> > 
> > (match min_value
> >  INTEGER_CST
> >  (if (INTEGRAL_TYPE_P (type)
> >   && wi::eq_p (wi::to_wide (t), wi::min_value (type)
> > 
> > and then write
> > 
> >  (simplify
> >   (and:c (gt:c @0 @1) (eq @0 min_value))
> >   (...
> > 
> > Your
> > 
> > (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
> > (TREE_TYPE(@1))
> > 
> > is redundant, it's enough to check either @0 or @1 given they
> > have to be compatible for the gt operation.  Note you probably
> > want to use
> > 
> >   (and:c (gt:c @0 @1) (eq @@0 min_value))
> > 
> > and verify that types_match (@1, @0) because when @0 are a constant
> > (and (eq @0 min_value) is not folded which can happen) then they
> > might have different types and thus you could have
> > (SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX)
> > 
> > That said, the patterns can be quite a bit simplified I think.
> > 
> > Richard.
> > 
> 
> Likewise, I applied the suggested simplification.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?

+/* { dg-options "-O2 -fdump-tree-ifcombine --param 
logical-op-non-short-circuit=0" } */
+
+#include 
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
...
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */

since you iterate over bit_and and truth_and GENERIC folding should
already simplify this, no?

I suggest to remove the truth_and/or iteration.

Otherwise looks OK now.

Richard.

Re: [PATCH 2/9] operand_equal_p: add support for FIELD_DECL

2019-09-11 Thread Martin Liška
On 8/16/19 2:09 PM, Jan Hubicka wrote:
>> On Thu, Aug 15, 2019 at 4:17 PM Jan Hubicka  wrote:
>>>
 On Tue, Aug 6, 2019 at 5:44 PM Martin Liska  wrote:
>
>
> gcc/ChangeLog:

 So I suppose this isn't to call operand_equal_p on two FIELD_DECLs
 but to make two COMPONENT_REFs "more equal"?  If so I then
>>>
>>> yes. The patch originates from my original patchset I believe and it is
>>> what ICF does.
 I suggest to make this change "local" to the COMPONENT_REF handling.
 This also interacts with path-based disambiguation so you want to make
 sure to only make things equal here iff it wouldn't change the outcome
 of path-based analysis.  Honza?
>>>
>>> Indeed this can be handled as part of COMPONENT_REF match.
>>> Access path oracle here basically checks:
>>>  1) that MEM_REF type matches (we want predicate for this)
>>>  2) if it finds type match via same_type_for_tbaa and then it applies
>>> the assumption about disjointness or overlap
>>>
>>> So I guess ideally we should
>>>
>>>  1) do matching part of COMPONENT_REF
>>>  2) compare OFFSET, BIT_OFFSET
>>> This establishes that the access has same semantics.
>>>  3) for -fno-strict-aliasing be happy
>>>  4) for -fstrict-aliaisng check if access path applies (we should export
>>> predicate from tree-ssa-alias as discussed earlier)
>>>  5) compare types by same_type_for_tbaa_p
>>
>> Ick.  This smells like a layering violation to me.  IMHO this extended
>> equality handling should be handled with the overloading/callback
>> and not in native operand_equal_p.  Either on the level of the
>> COMPONENT_REF itself (sounds like that would be needed)
>> or the FIELD_DECL.  Not sure if the above suggestions make
>> it neccessary to look at more than a single COMPONENT_REF/FIELD_DECL
>> in the access path.  If so then watch out for quadraticness as 
>> operand_equal_p
>> traverses a reference chain...
> 
> I suppose we want to match whole access paths at once, since only having
> the MEM_REF allows one to check whether access path oracle applies to
> the given reference or not...

Doing that, Honza, can you please point me to a function that should be
used for it?

Martin

> 
> Honza
>>
>> Richard.
>>
>>> Honza
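
For readers following the thread, the situation being discussed is ICF
comparing accesses such as the following, where the two COMPONENT_REFs
name distinct FIELD_DECLs of structurally identical types; a hand-written
sketch, not a testcase from the patch:

struct A { int x; };
struct B { int x; };

/* ICF wants to merge fa and fb.  The bodies differ only in which
   FIELD_DECL the COMPONENT_REF references, which is why operand_equal_p
   needs a notion of "equal enough" fields; the catch discussed above is
   doing that without changing the outcome of the access-path (TBAA)
   oracle under -fstrict-aliasing.  */
int fa (struct A *a) { return a->x; }
int fb (struct B *b) { return b->x; }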



libgo patch committed: Force test package to be imported first

2019-09-11 Thread Ian Lance Taylor
When running libgo tests, when compiling the x_test package, this
patch forces the test package to be imported first.  That ensures that
we will see the types defined in the test package before the types
defined in the non-test version of the package.  This matters if the
types differ in some way, such as by adding a new method.

This avoids a failure in internal/poll on Solaris, in which the test
package adds a method to a type (FD.EOFError).  I think it was
Solaris-specific because files are sorted in a different order by
default.

The go tool handles this kind of thing correctly, by rebuilding
dependent packages.  This is just a hack sufficient to run the libgo
testsuite without using the go tool.

This fixes PR 91712.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Ran the
failing test on Solaris.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275611)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-bf4832d604e7522dee78fca76de220b62a131d54
+1f4ce28409a2d9d4045b1085de55c46de8759d1c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest  (revision 275594)
+++ libgo/testsuite/gotest  (working copy)
@@ -419,10 +419,12 @@ esac
 # Split $gofiles into external gofiles (those in *_test packages)
 # and internal ones (those in the main package).
 xgofiles=
+xpackage=
 for f in $gofiles; do
 package=`grep '^package[   ]' $f | sed 1q`
 case "$package" in
 *_test)
+   xpackage=`echo $package | sed -e 's/package[]//' -e 's/[]*$//'`
xgofiles="$xgofiles $f"
;;
 *)
@@ -471,10 +473,17 @@ $GC -g $pkgpatharg $prefixarg -c -I . -f
 if $havex; then
mkdir -p `dirname $package`
cp _gotest_.o `dirname $package`/lib`basename $package`.a
+
+   # Force the test version of the package to be imported first,
+   # so that its type definitions will be used, in case any new
+   # methods appear in export_test.go files.
+   echo "package $xpackage" > _first_test.go
+   echo 'import _ "'$package'"' >> _first_test.go
+
if test "$trace" = "true"; then
-   echo $GC -g $xpkgpatharg -c -I . -fno-toplevel-reorder -o $xofile $xgofiles
+   echo $GC -g $xpkgpatharg -c -I . -fno-toplevel-reorder -o $xofile _first_test.go $xgofiles
fi
-   $GC -g $xpkgpatharg -c -I . -fno-toplevel-reorder -o $xofile $xgofiles
+   $GC -g $xpkgpatharg -c -I . -fno-toplevel-reorder -o $xofile _first_test.go $xgofiles
 fi
 
 # They all compile; now generate the code to call them.


Re: [PATCH 3/5] Rewrite part of and_comparisons_1 into match.pd.

2019-09-11 Thread Richard Biener
On Wed, 11 Sep 2019, Martin Liška wrote:

> I'm sending updated version of the patch where I changed
> from previous version:
> - more compact matching is used (note @3 and @4):
>   (and (code1:c@3 @0 INTEGER_CST@1) (code2:c@4 @0 INTEGER_CST@2))
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?

--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -1894,9 +1894,11 @@ dt_node *
 dt_node::append_simplify (simplify *s, unsigned pattern_no,
  dt_operand **indexes)
 {
+  dt_simplify *s2;
   dt_simplify *n = new dt_simplify (s, pattern_no, indexes);
   for (unsigned i = 0; i < kids.length (); ++i)
-if (dt_simplify *s2 = dyn_cast  (kids[i]))
+if ((s2 = dyn_cast  (kids[i]))
+   && s->match->location != s2->s->match->location)
   {

can you retain the warning for verbose >= 1 please?  And put in
a comment that duplicates are sometimes hard to avoid with
nested for so this keeps match.pd sources small.

+  /* Function maybe_fold_comparisons_from_match_pd creates temporary
+ SSA_NAMEs.  */
+  if (TREE_CODE (op1) == SSA_NAME && TREE_CODE (op2) == SSA_NAME)
+{
+  gimple *s = SSA_NAME_DEF_STMT (op2);
+  if (is_gimple_assign (s))
+   return same_bool_comparison_p (op1, gimple_assign_rhs_code (s),
+  gimple_assign_rhs1 (s),
+  gimple_assign_rhs2 (s));
+  else
+   return false;
+}

when do you actually run into the need to add this?  The whole
same_bool_result_p/same_bool_comparison_p helpers look a bit
odd to me.  It shouldn't be special to the new temporaries
either, so at least the comment looks out of place.

+  else if (op.code.is_tree_code ()
+  && TREE_CODE_CLASS ((tree_code)op.code) == tcc_comparison)
+   {

COMPARISON_CLASS_P ((tree_code)op.code)

as was noted.

Any particular reason you needed to swap the calls in
maybe_fold_and/or_comparisons?

+(for code1 (eq ne)
+ (for code2 (eq ne lt gt le ge)
+  (for and (truth_and bit_and)
+   (simplify

You could save some code-bloat with writing

  (for and (
#if GENERIC
  truth_and
#endif
  bit_and)

but since you are moving the patterns from GIMPLE code I'd say
simply remove truth_and and that innermost for?

Otherwise OK.

Thanks,
Richard.




> Thanks,
> Martin
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 247165 (AG München)
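
As a concrete C instance of the (and (code1 @0 INTEGER_CST@1)
(code2 @0 INTEGER_CST@2)) shape being matched above; illustrative only:

/* Both comparisons test the same operand against constants, so the
   conjunction can be decided when the patterns combine them.  */
int g1 (int x) { return x == 3 && x > 2; }   /* folds to x == 3 */
int g2 (int x) { return x == 3 && x < 2; }   /* folds to 0 */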

Re: [PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd

2019-09-11 Thread Richard Biener
On Wed, 11 Sep 2019, Martin Liška wrote:

> Hello.
> 
> One another updated version of the patch.
> Changes from the previous version:
> - I fixed:
>   gimple *stmt1 = (gimple *) XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN, 2));
>   into gimple_size (GIMPLE_ASSIGN, 3)
> - I simplified condition in gimple_simplified_result_is_gimple_val
> - optimize_vec_cond_expr is using build_same_sized_truth_vector_type

Can you here instead amend ovce_extract_ops to return the type of
'cond' which should be the type you want?

Otherwise OK.

Richard.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin

Re: Fix PR rtl-optimization/89795

2019-09-11 Thread Eric Botcazou
> What does it mean for
>   (set (reg:QI) (const_int -128))
> for example?  This is the canonical form of moving positive 128 into a
> register as well, of course.

Nothing, since word_register_operation_p returns false on CONST_INTs .

> W_R_O needs many more restrictions if it can work at all, but those aren't
> documented.

The problematic issue is more basic and pertains to mere SUBREGs.

-- 
Eric Botcazou


Re: [patch, libgomp] Fix segfault with plugin-hsa

2019-09-11 Thread Martin Jambor
Hi,

On Wed, Sep 11 2019, Jakub Jelinek wrote:
> On Wed, Sep 11, 2019 at 11:30:17AM +0200, Tobias Burnus wrote:
>> when playing around with offloading, I got an hsa initialization error
>> (/dev/... lacked write permission) and the library call returned with an
>> error status – but hsa_fns.hsa_status_string_fn didn't set the message to
>> the string variable.
>> 
>> Hence, the string variable was uninitialized and libgomp crashed when
>> printing the error message; "[unknown]" is not the best HSA run-time
>> message, but at least the primary message is shown and there is no crash :-)
>
> LGTM, but give Martin (CCed) as the author a chance to chime in.

Looks good to me too.  Thanks.

Martin

>
>> 2019-09-11  Tobias Burnus  
>> 
>> * plugin/plugin-hsa.c (hsa_warn, hsa_fatal, hsa_error): Ensure
>> string is initialized.
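
Schematically, the fix is the usual defensive initialization; the
following stand-alone sketch mocks the HSA types, so the names are
stand-ins rather than the plugin's actual code:

#include <stdio.h>

typedef int hsa_status_t;               /* stand-in for the HSA type */

/* Mimics hsa_fns.hsa_status_string_fn failing without touching *out.  */
static hsa_status_t mock_status_string (hsa_status_t s, const char **out)
{ (void) s; (void) out; return 1; }

int main (void)
{
  const char *msg = "[unknown]";        /* initialized up front */
  hsa_status_t status = 1;              /* pretend initialization failed */
  /* The callback may fail and leave msg untouched; the initialization
     guarantees the later print cannot dereference garbage.  */
  mock_status_string (status, &msg);
  fprintf (stderr, "HSA error: %s\n", msg);
  return 0;
}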


Re: Fix PR rtl-optimization/89795

2019-09-11 Thread Segher Boessenkool
Hi Eric,

On Wed, Sep 11, 2019 at 12:39:51PM +0200, Eric Botcazou wrote:
> This fixes a wrong code generation on the ARM in very peculiar circumstances 
> ( 
> -O2 -fno-dce -fno-forward-propagate -fno-sched-pressure) present on all 
> active 
> branches in the form of a missing zero-extension.  It turns out that not all 
> parts of the RTL middle-end agree on the assumptions valid on RISC machines 
> as 
> encoded by WORD_REGISTER_OPERATIONS, so the nonzero_bits machinery needs to 
> be 
> more conservative with them.  As witnessed by the added XFAIL, this will add 
> back some redundant zero-extensions on WORD_REGISTER_OPERATIONS targets.

But what are those assumptions, _exactly_?  It's not clear to me.


WORD_REGISTER_OPERATIONS
  Define this macro if operations between registers with integral mode
  smaller than a word are always performed on the entire register. Most
  RISC machines have this property and most CISC machines do not.


What does it mean for
  (set (reg:QI) (const_int -128))
for example?  This is the canonical form of moving positive 128 into a
register as well, of course.

On a target with 64-bit registers, does it mean that
  (set (reg:SI d) (mult:SI (reg:SI a) (reg:SI b)))
is actually done as
  (set (reg:DI d) (mult:DI (reg:DI a) (reg:DI b)))
?  I can't think of any hardware where that is true.

W_R_O needs many more restrictions if it can work at all, but those aren't
documented.


Segher
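
For readers without the PR in front of them, the class of code affected
looks like this; the RTL in the comment is schematic rather than taken
from a particular target:

/* On a WORD_REGISTER_OPERATIONS target the QImode load below is assumed
   to operate on the entire register, so nonzero_bits may conclude that
   the upper bits of the word are zero and let combine delete the
   zero-extension:

     (set (reg:QI r) (mem:QI ...))                 ;; sub-word load
     (set (reg:SI s) (zero_extend:SI (reg:QI r)))  ;; may be elided

   PR 89795 is the case where parts of the RTL middle-end disagreed on
   that assumption and a required zero-extension was lost.  */
unsigned int widen (unsigned char *p) { return *p; }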


Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC

2019-09-11 Thread Martin Liška
On 9/9/19 5:54 PM, Matthew Malcomson wrote:
> On 09/09/19 11:47, Martin Liška wrote:
>> On 9/6/19 4:46 PM, Matthew Malcomson wrote:
>>> Hello,
>>>
>>> This patch series is a WORK-IN-PROGRESS towards porting the LLVM hardware
>>> address sanitizer (HWASAN) in GCC.  The document describing HWASAN can be 
>>> found
>>> here http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html.
>>
>> Hello.
>>
>> I'm happy that you are working on the functionality for GCC and I can provide
>> my knowledge that I have with ASAN. I briefly read the patch series and I 
>> have
>> multiple questions (and observations):
>>
>> 1) Is the ambition of the patchset to be a software emulation of MTE that can
>> work on targets that do not support MTE? Is it something that clang
>> names hwasan-abi=interceptor?
> 
> The ambition is to provide a software emulation of MTE for AArch64 
> targets that don't support MTE.

Hello.

It would also be great to provide the emulation on targets that do not
provide TBI (like x86_64).

> I also hope to have the framework set up so that enabling for other 
> architectures is relatively easy and can be done by those interested.
> 
> As I understand it, `hwasan-abi=interceptor` vs `platform` is about 
> adding such MTE emulation for "application code" or "platform code (e.g. 
> kernel)" respectively.

Hm, are you sure? Clang also uses -fsanitize=kernel-hwaddress which should
be equivalent to kernel-address for -fsanitize=address.

> 
>>
>> 2) Do you have a real aarch64 hardware that has MTE support? Would it be 
>> possible
>> for the future to give such a machine to GCC Compile Farm for testing 
>> purpose?
> 
> No our team doesn't have real MTE hardware, I have been testing on an 
> AArch64 machine that has TBI, other work in the team that requires MTE 
> support is being tested on the Arm "Fast Models" emulator.
> 
>>
>> 3) I like the idea of sharing of internal functions like 
>> ASAN_CHECK/HWASAN_CHECK.
>> We should benefit from that in the future.
>>
>> 4) Am I correct that due to escape of "tagged" pointers, one needs to have 
>> an entire
>> DSO (dynamic shared object) built with hwasan enabled? Otherwise, a 
>> dereference of
>> a tagged pointer will lead to a segfault (except TBI feature on aarch64)?
> 
> 
> Yes, one needs to take pains to avoid the escape of tagged pointers on 
> architectures other than AArch64.

Which is the very same pain MPX was suffering from before it was dropped
in GCC :)

> 
> I don't believe that compiling the entire DSO with HWASAN enabled is 
> enough, since pointers can be passed across DSO boundaries.
> I haven't yet looked into how to handle this.
> 
> There's an even more fundamental problem of accesses within the 
> instrumented binary -- I haven't yet figured out how to remove the tag 
> before accesses on architectures without the AArch64 TBI feature.

Which would be platforms like x86_64, right?

> 
> 
>>
>> 5) Is there a documentation/definition of how shadow memory for memory 
>> tagging looks like?
>> Is it similar to ASAN, where one can get to tag with:
>> u8 memory_tag = *((PTR >> TG) + SHADOW_OFFSET) & 0xf?
>>
> 
> Yes, it's similar.
> 
>  From the libhwasan code, the function to fetch a pointer to the shadow 
> memory byte corresponding to a memory address is MemToShadow.
> 
> constexpr uptr kShadowScale = 4;
> inline uptr MemToShadow(uptr untagged_addr) {
>return (untagged_addr >> kShadowScale) +
>   __hwasan_shadow_memory_dynamic_address;
> }
> 
> https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L42
> 
> 
>> 6) Note that thing like memtag_tag_size, memtag_granule_size define an ABI 
>> of libsanitizer
>>
> 
> Yes, the size of these values define an ABI.
> 
> Those particular hooks are added as a demonstration of how something 
> like MTE would be implemented on top of this framework (where the 
> backend would specify the tag and granule size to match their targets 
> architecture).
> 
> HWASAN itself would use the hard-coded tag and granule size that matches 
> what libsanitizer uses.
> https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L36
> 
> I define these as `HWASAN_TAG_SIZE` and `HWASAN_TAG_GRANULE_SIZE` in 
> asan.h, and when using the sanitizer library the macro 
> `HARDWARE_MEMORY_TAGGING` would be false so their values would be constant.
> 
> 
>>>
>>> The current patch series is far from complete, but I'm posting the current 
>>> state
>>> to provide something to discuss at the Cauldron next week.
>>>
>>> In its current state, this sanitizer only works on AArch64 with a custom 
>>> kernel
>>> to allow tagged pointers in system calls.  This is discussed in the below 
>>> link
>>> https://source.android.com/devices/tech/debug/hwasan -- the custom kernel 
>>> allows
>>> tagged pointers in syscalls.
>>
>> Can you please be more specific? Is the MTE in upstream 
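
Putting the quoted pieces together, the check HWASAN performs is roughly
the following; a sketch with hypothetical names, assuming AArch64 TBI and
the 16-byte granules from hwasan_mapping.h:

#include <stdint.h>

static uintptr_t shadow_base;  /* __hwasan_shadow_memory_dynamic_address */
enum { kShadowScale = 4 };     /* 16-byte tag granules */

static int tag_matches (uintptr_t tagged_ptr)
{
  uint8_t ptr_tag = (uint8_t) (tagged_ptr >> 56);  /* tag in the TBI byte */
  uintptr_t untagged = tagged_ptr & ~((uintptr_t) 0xff << 56);
  uint8_t mem_tag = *(uint8_t *) ((untagged >> kShadowScale) + shadow_base);
  return ptr_tag == mem_tag;   /* a mismatch is a reported bug */
}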

Re: Fix PR rtl-optimization/89795

2019-09-11 Thread Jakub Jelinek
On Wed, Sep 11, 2019 at 12:39:51PM +0200, Eric Botcazou wrote:
> This fixes a wrong code generation on the ARM in very peculiar circumstances 
> ( 
> -O2 -fno-dce -fno-forward-propagate -fno-sched-pressure) present on all 
> active 
> branches in the form of a missing zero-extension.  It turns out that not all 
> parts of the RTL middle-end agree on the assumptions valid on RISC machines 
> as 
> encoded by WORD_REGISTER_OPERATIONS, so the nonzero_bits machinery needs to 
> be 
> more conservative with them.  As witnessed by the added XFAIL, this will add 
> back some redundant zero-extensions on WORD_REGISTER_OPERATIONS targets.
> 
> Tested on SPARC/Solaris and SPARC64/Linux, applied on all active branches.

Thanks.  I've added testcases for the various PRs that are fixed by this
change, verified on armv4 and riscv64 on the first and last testcases that
the rtlanal.c change makes the desired change in the assembly, plus tested
on x86_64-linux -m32/-m64, committed to trunk as obvious.

2019-09-11  Jakub Jelinek  

PR rtl-optimization/89435
PR rtl-optimization/89795
PR rtl-optimization/91720
* gcc.dg/pr89435.c: New test.
* gcc.dg/pr89795.c: New test.
* gcc.dg/pr91720.c: New test.

--- gcc/testsuite/gcc.dg/pr89435.c.jj   2019-09-11 13:34:09.630165981 +0200
+++ gcc/testsuite/gcc.dg/pr89435.c  2019-09-11 13:33:37.038664072 +0200
@@ -0,0 +1,21 @@
+/* PR rtl-optimization/89435 */
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-forward-propagate -fno-tree-forwprop -fno-tree-ccp" } */
+
+unsigned short a;
+unsigned int b, c, d, e, f;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8 && __SIZEOF_INT__ == 4
+  unsigned char g = e = __builtin_mul_overflow_p (5, 542624702, 0);
+  d = __builtin_bswap64 (a);
+  b = __builtin_sub_overflow ((unsigned char) -e, (unsigned int) d, &g);
+  e = __builtin_mul_overflow (b, c, &d);
+  f = g + e;
+  if (f != 0xff)
+__builtin_abort ();
+#endif
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr89795.c.jj   2019-09-11 13:05:58.405033868 +0200
+++ gcc/testsuite/gcc.dg/pr89795.c  2019-09-11 13:04:52.508042619 +0200
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/89795 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-sched-pressure" } */
+
+unsigned char a;
+unsigned b, c, d;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8
+  unsigned x;
+  int e, f;
+  unsigned char g;
+  e = __builtin_bswap32 (a);
+  f = __builtin_ffs (~(unsigned short) e);
+  a = __builtin_mul_overflow ((unsigned char) 0xf7, f, &g);
+  a |= __builtin_sub_overflow_p (c, 0, (unsigned char) 0);
+  d = g + b;
+  x = d;
+  if (x != 0xf7)
+__builtin_abort ();
+#endif
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr91720.c.jj   2019-09-11 13:06:08.909873060 +0200
+++ gcc/testsuite/gcc.dg/pr91720.c  2019-09-11 13:00:50.070754162 +0200
@@ -0,0 +1,22 @@
+/* PR rtl-optimization/91720 */
+/* { dg-do run } */
+/* { dg-options "-Og -fno-forward-propagate -frerun-cse-after-loop -fno-tree-fre" } */
+
+unsigned a, b;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8
+  unsigned c = 1;
+  unsigned long long d = 0;
+  unsigned char e = 0;
+  e = __builtin_sub_overflow (d, e, ) ? 0 : 0x80;
+  e = e << 7 | e >> c;
+  __builtin_memmove (, , 2);
+  b = e;
+  if (b != 0x40)
+__builtin_abort ();
+#endif
+  return 0;
+}


Jakub


[PATCH] Fix libstdc++ tests for -Wvolatile warnings in C++2a mode

2019-09-11 Thread Jonathan Wakely

* testsuite/20_util/result_of/sfinae_friendly_1.cc: Add -Wno-volatile
for C++2a and up. Define HAS_52748_FIXED and fix incorrect tests.
* testsuite/tr1/3_function_objects/result_of.cc: Add -Wno-volatile
for C++2a and up.

Tested x86_64-linux, committed to trunk.

commit 25f120f074c602424f5a05bd629deabccd77f75e
Author: redi 
Date:   Wed Sep 11 11:38:15 2019 +

Fix libstdc++ tests for -Wvolatile warnings in C++2a mode

* testsuite/20_util/result_of/sfinae_friendly_1.cc: Add -Wno-volatile
for C++2a and up. Define HAS_52748_FIXED and fix incorrect tests.
* testsuite/tr1/3_function_objects/result_of.cc: Add -Wno-volatile
for C++2a and up.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@275643 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/20_util/result_of/sfinae_friendly_1.cc 
b/libstdc++-v3/testsuite/20_util/result_of/sfinae_friendly_1.cc
index ed3e011f937..86aa52029cc 100644
--- a/libstdc++-v3/testsuite/20_util/result_of/sfinae_friendly_1.cc
+++ b/libstdc++-v3/testsuite/20_util/result_of/sfinae_friendly_1.cc
@@ -1,4 +1,5 @@
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wno-volatile" { target c++2a } }
 
 // Copyright (C) 2012-2019 Free Software Foundation, Inc.
 //
@@ -21,9 +22,9 @@
 #include 
 #include 
 
-// TODO: Uncomment the following define once gcc has fixed bug 52748
+// Uncomment the following define once gcc has fixed bug 52748
 // (incomplete types in function call expressions):
-//#define HAS_52748_FIXED
+#define HAS_52748_FIXED
 
 // Helper types:
 struct has_type_impl
@@ -627,9 +628,9 @@ static_assert(is_type&(int, int, int)>, Ukn>(), "Error!");
 
-static_assert(is_type, Ukn>(), "Error!");
-static_assert(is_type, Ukn>(), "Error!");
-static_assert(is_type, Ukn>(), "Error!");
+static_assert(is_type, Ukn>(), 
"Error!");
+static_assert(is_type, Ukn>(), 
"Error!");
+static_assert(is_type, Ukn>(), 
"Error!");
 
 static_assert(is_type, Ukn>(), "Error!");
 static_assert(is_type, Ukn>(), "Error!");
diff --git a/libstdc++-v3/testsuite/tr1/3_function_objects/result_of.cc 
b/libstdc++-v3/testsuite/tr1/3_function_objects/result_of.cc
index 02cf8bf8c6d..dea0665b1e3 100644
--- a/libstdc++-v3/testsuite/tr1/3_function_objects/result_of.cc
+++ b/libstdc++-v3/testsuite/tr1/3_function_objects/result_of.cc
@@ -17,6 +17,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+// { dg-additional-options "-Wno-volatile" { target c++2a } }
+
 // 3.4 function return types
 #include 
 #include 
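
For context, the constructs that trip -Wvolatile in C++2a (deprecated by
P1152) look schematically like this; a sketch, not taken from the tests:

volatile int vi;
void g () { vi += 1; }    // compound assignment on a volatile lvalue
int h (volatile int p);   // volatile-qualified function parameter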


[PATCH] Fix Xmethod for shared_ptr::use_count()

2019-09-11 Thread Jonathan Wakely

This was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1749578

* python/libstdcxx/v6/xmethods.py (SharedPtrUseCountWorker.__call__):
Fix syntax error.

Tested x86_64-linux (although I can't get tests for the use_count() or
unique() Xmethods to actually work ...)

Committed to trunk. I'll backport it too.

commit 0f5caaa9b5fedebc6117cae5f3bc57c5cb3448d8
Author: redi 
Date:   Wed Sep 11 11:38:23 2019 +

Fix Xmethod for shared_ptr::use_count()

This was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1749578

* python/libstdcxx/v6/xmethods.py (SharedPtrUseCountWorker.__call__):
Fix syntax error.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@275644 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/python/libstdcxx/v6/xmethods.py 
b/libstdc++-v3/python/libstdcxx/v6/xmethods.py
index 623cb80bc0e..a720a63fa1a 100644
--- a/libstdc++-v3/python/libstdcxx/v6/xmethods.py
+++ b/libstdc++-v3/python/libstdcxx/v6/xmethods.py
@@ -739,7 +739,7 @@ class SharedPtrUseCountWorker(gdb.xmethod.XMethodWorker):
 return gdb.lookup_type('long')
 
 def __call__(self, obj):
-refcounts = ['_M_refcount']['_M_pi']
+refcounts = obj['_M_refcount']['_M_pi']
 return refcounts['_M_use_count'] if refcounts else 0
 
 class SharedPtrUniqueWorker(SharedPtrUseCountWorker):


Re: [PATCH] Prevent LTO section collision for a symbol name starting with '*'.

2019-09-11 Thread Martin Liška
On 9/9/19 4:33 PM, Jan Hubicka wrote:
>> PING^1
>>
>> On 8/26/19 12:04 PM, Martin Liška wrote:
>>> Ok. I have a semi-working patch that has issues for inline clones.
>>> When we call cgraph_node::get_untransformed_body for an inline clone,
>>> then one needs to use clone_of->order to find proper LTO stream.
> 
> This seems OK to me - when using inline clone we really look for a body
> of its master, so that seems OK.

Ok!

>>>
>>> What's more problematic is that such clone can be expanded:
>>>
>>> f/12 (f) @0x7769f708
>>>   Type: function definition analyzed
>>>   Visibility: external public
>>>   References: mumble.lto_priv.0/8 (write)
>>>   Referring: 
>>>   Read from file: /tmp/cciAkXHp.ltrans1.o
>>>   Function f/12 is inline copy in main/0
>>>   Availability: local
>>>   Function flags: count:1073741824 (estimated locally) local nonfreeing_fn 
>>> executed_once
>>>   Called by: main/0 (inlined) (1073741824 (estimated locally),1.00 per 
>>> call) 
>>>   Calls: 
>>>
>>> and lost. So we end up with an orphan and we ICE with:
> 
> We do some work on removing unnecessary master clone when function is
> fully inlined and I guess in that case you lose the order info.
> One option would be to copy order into all inline clones (it does not
> have very good meaning there) or do that when reshaping the tree.

What do you mean by "reshaping the tree"?

> This is done in cgraph_node::remove at the place clone_of is
> manipulated.

The inline-clone manipulation happens in cgraph_node::find_replacement,
where we manipulate clone_of.

Martin

> This is probably a bit cleaner.
>>>
>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/lto/20081112_0.c: In 
>>> function ‘main’:
>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/lto/20081112_0.c:10:3: 
>>> fatal error: /tmp/cciAkXHp.ltrans1.o: section f is missing
>>>
>>> So usage of symtab_node::order seems awkward to me :/
>>>
>>> Martin
>>>
>>
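
A source-level sketch of the situation in the quoted dump: f writes
mumble and, after IPA inlining, survives only as an inline copy in main,
so the master clone (and with it the order used to locate the per-symbol
LTO section) goes away.  Illustrative only, not the failing testcase:

static int mumble;

static void f (void) { mumble = 1; }  /* fully inlined into main */

int main (void)
{
  f ();  /* only the inline copy of f remains after inlining */
  return 0;
}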



Fix for type confusion bug on microblaze

2019-09-11 Thread Jonas Pfeil
The Microblaze target confuses unsigned long with HOST_WIDE_INT.

This works fine for many host architectures but fails on ARM hosts
(HOST_WIDE_INT is 8 bytes, unsigned long is 4 bytes): an uninitialized
register gets printed instead of the desired number. Fix this by using
the correct printf specifier.

Tested to fix the issue and work with an ARM->MB cross compiler.

--- a/gcc/config/microblaze/microblaze.h
+++ b/gcc/config/microblaze/microblaze.h
@@ -695,7 +695,7 @@ do {  \
   fprintf (STREAM, "\t.align\t%d\n", (LOG))
 
 #define ASM_OUTPUT_SKIP(STREAM,SIZE)   \
-  fprintf (STREAM, "\t.space\t%lu\n", (SIZE))
+  fprintf (STREAM, "\t.space\t" HOST_WIDE_INT_PRINT_DEC "\n", (SIZE))
 
 #define ASCII_DATA_ASM_OP  "\t.ascii\t"
 #define STRING_ASM_OP  "\t.asciz\t"
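
The host-side issue, reduced to a stand-alone sketch; host_wide_int and
the format macro below are simplified stand-ins for GCC's HOST_WIDE_INT
and HOST_WIDE_INT_PRINT_DEC:

#include <inttypes.h>
#include <stdio.h>

typedef int64_t host_wide_int;       /* HOST_WIDE_INT is 64-bit */
#define HWI_PRINT_DEC "%" PRId64     /* format matching the type */

/* With plain "%lu" the conversion consumes an unsigned long, which on a
   host such as 32-bit ARM is only half of the 64-bit argument, so the
   output picks up whatever happens to sit in the other half.  A format
   that matches the type is correct on every host.  */
static void output_skip (FILE *stream, host_wide_int size)
{
  fprintf (stream, "\t.space\t" HWI_PRINT_DEC "\n", size);
}

int main (void) { output_skip (stdout, 16); return 0; }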



