Re: [PATCH] tbb-backend effective target should check ability to link TBB

2019-05-21 Thread François Dumont

On 5/21/19 10:51 PM, Jonathan Wakely wrote:

On 21/05/19 22:47 +0200, François Dumont wrote:

On 5/21/19 3:50 PM, Jonathan Wakely wrote:

On 20/05/19 21:41 -0700, Thomas Rodgers wrote:

With the addition of "-ltbb" to the v3_target_compile flags (so as to,
you know, actually try to link tbb).

Tested x86_64-linux, committed to trunk.


This didn't work, I still get a FAIL for every pstl test when
tbb.x86_64 and tbb-devel.x86_64 are installed but not tbb.i686.

Adding -v to RUNTESTFLAGS shows -ltbb wasn't being added to the
command, and because the test program didn't actually refer to any TBB
symbols, it still linked successfully.

But the test program still does not refer to any TBB symbol. Why is it
better?


Because -ltbb means the linker will give an error if there is no
suitable libtbb. That's true even if no symbols from it are needed.
Try it.


Ok, good to know.

Thanks




Re: C++ PATCH for c++/90548 - ICE with generic lambda and empty pack

2019-05-21 Thread Jason Merrill

On 5/20/19 6:08 PM, Marek Polacek wrote:

We ICE here because we are accessing call_args even though it's empty:

(gdb) p (*call_args).is_empty()
$5 = true

It's empty because the pack processed by tsubst_pack_expansion expanded into an
empty vector, so nothing got pushed onto call_args.  So handle that, and also
handle pushing 'this' properly when call_args is empty.


Ah, the problem is that nargs is just wrong in the presence of pack 
expansions; in this case it's too large, but it could also end up too 
small.  What we want is the length of call_args, not the number of args 
before doing the expansions.


Jason
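
A minimal user-level sketch of why a pre-expansion argument count is unreliable. This is not the compiler code, and `forwarded_count` is a hypothetical name used only for illustration. The inner call is written with one argument expression, `ts...`, yet after expansion the lambda receives zero or more arguments.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, for illustration only.  The inner call has a single
// syntactic argument (the pack expansion `ts...`), but the generic lambda
// receives however many arguments the pack expands to, possibly none.
template <typename... Ts>
std::size_t forwarded_count(Ts... ts)
{
  auto count = [](auto... xs) { return sizeof...(xs); };
  return count(ts...);
}

// Usage: the same call expression yields different argument counts.
inline void forwarded_count_demo()
{
  assert(forwarded_count() == 0);             // empty pack: no arguments
  assert(forwarded_count(1, 'a', 2.0) == 3);  // three arguments
}
```

Counting what was actually pushed (the length of the expanded list) gives the right answer in both cases, which is the point made above about call_args.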


Re: [RS6000] Don't pass -many to the assembler

2019-05-21 Thread Alan Modra
On Tue, May 21, 2019 at 09:48:10AM -0500, Segher Boessenkool wrote:
> > +static const char *
> > +rs6000_machine_from_flags (void)
> > +{
> > +  if ((rs6000_isa_flags & (ISA_3_0_MASKS_SERVER & ~ISA_2_7_MASKS_SERVER)) 
> > != 0)
> > +return "power9";
> > +  if ((rs6000_isa_flags & (ISA_2_7_MASKS_SERVER & ~ISA_2_6_MASKS_SERVER)) 
> > != 0)
> > +return "power8";
> > +  if ((rs6000_isa_flags & (ISA_2_6_MASKS_SERVER & ~ISA_2_5_MASKS_SERVER)) 
> > != 0)
> > +return "power7";
> > +  if ((rs6000_isa_flags & (ISA_2_5_MASKS_SERVER & ~ISA_2_4_MASKS)) != 0)
> > +return "power6";
> > +  if ((rs6000_isa_flags & (ISA_2_4_MASKS & ~ISA_2_1_MASKS)) != 0)
> > +return "power5";
> > +  if ((rs6000_isa_flags & ISA_2_1_MASKS) != 0)
> > +return "power4";
> > +  if ((rs6000_isa_flags & OPTION_MASK_POWERPC64) != 0)
> > +return "ppc64";
> > +  return "ppc";
> > +}
> 
> As you know I'm trying to get rid of most of the separate user-selectable
> features we have currently.  I think I'll steal this code :-)
> 
> (Is Power5 2.4?  Not 2.2?)

Yes, I think power5 was 2.02, but I haven't looked at cpu and arch
books to verify exactly what power5 and power5+ were.  Note that gas
lumps power5 and power5+ in one category so "power5" from
rs6000_machine_from_flags means power5+ too.

This change was based on the fact that "friz" and other similar
instructions enabled by gcc with TARGET_FPRND are enabled in gas by
"-mpower5", "-mpwr5", or "-mpwr5x".  ("-mpower5+" isn't a valid gas
option.)  rs6000-cpus.def puts OPTION_MASK_FPRND in ISA_2_4_MASKS, so
ISA_2_4_MASKS is the one to use in deciding to pass "-mpower5" to
gas.
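
A standalone sketch of the selection logic in the quoted function, assuming cumulative masks where each ISA level contains every bit of the level below it. The mask values and the names `kIsa21` through `kIsa30` and `kPowerpc64` are made up for illustration; they are not the real rs6000 definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical cumulative feature masks, NOT the real rs6000 values.
// Each level contains all bits of the level below plus one new bit.
constexpr std::uint32_t kIsa21 = 0x01;
constexpr std::uint32_t kIsa24 = kIsa21 | 0x02;
constexpr std::uint32_t kIsa25 = kIsa24 | 0x04;
constexpr std::uint32_t kIsa26 = kIsa25 | 0x08;
constexpr std::uint32_t kIsa27 = kIsa26 | 0x10;
constexpr std::uint32_t kIsa30 = kIsa27 | 0x20;
constexpr std::uint32_t kPowerpc64 = 0x40;

// Test levels from newest to oldest.  `hi & ~lo` isolates the bits that
// first appear at a level, so any one of them selects that level.
inline std::string machine_from_flags(std::uint32_t flags)
{
  if ((flags & (kIsa30 & ~kIsa27)) != 0) return "power9";
  if ((flags & (kIsa27 & ~kIsa26)) != 0) return "power8";
  if ((flags & (kIsa26 & ~kIsa25)) != 0) return "power7";
  if ((flags & (kIsa25 & ~kIsa24)) != 0) return "power6";
  if ((flags & (kIsa24 & ~kIsa21)) != 0) return "power5";
  if ((flags & kIsa21) != 0) return "power4";
  if ((flags & kPowerpc64) != 0) return "ppc64";
  return "ppc";
}

// Usage: a single bit new at a level is enough to select that level.
inline void machine_from_flags_demo()
{
  assert(machine_from_flags(0x02) == "power5");
  assert(machine_from_flags(0) == "ppc");
}
```

Because only the newly added bits are tested at each step, a feature such as FPRND decides the answer through whichever mask it was placed in, which is why the placement of OPTION_MASK_FPRND matters here.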

> -mdejagnu-cpu=power7 should make the -mno-* things unnecessary I think?

No, it doesn't.  The -mno- options are to counter options added by
check_vect_support_and_set_flags based on hardware detection.  On
power8 hardware just switching to -mdejagnu-cpu results in, for
example:
...xgcc -B.../ ...gcc.dg/vect/pr4875.c -fno-diagnostics-show-caret \
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never -flto \
-ffat-lto-objects -maltivec -mpower8-vector -ftree-vectorize \
-fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details \
-O3 -mdejagnu-cpu=power6 -S -o pr48765.s

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/2] Add support for IVOPT

2019-05-21 Thread Kugan Vivekanandarajah
Hi Richard,


On Fri, 17 May 2019 at 18:47, Richard Sandiford
 wrote:
>
> Kugan Vivekanandarajah  writes:
> > [...]
> >> > +{
> >> > +  struct mem_address parts = {NULL_TREE, integer_one_node,
> >> > +   NULL_TREE, NULL_TREE, NULL_TREE};
> >>
> >> Might be better to use "= {}" and initialise the fields that matter by
> >> assignment.  As it stands this uses integer_one_node as the base, but I
> >> couldn't tell if that was deliberate.
> >
> > I just copied this part from get_address_cost, similar to what is done
> > there.
>
> Ah, sorry :-)
>
> > I have now changed the way you suggested but using the values
> > used in get_address_cost.
>
> Thanks.
>
> > [...]
> > @@ -3479,6 +3481,35 @@ add_iv_candidate_derived_from_uses (struct 
> > ivopts_data *data)
> >data->iv_common_cands.truncate (0);
> >  }
> >
> > +/* Return the preferred mem scale factor for accessing MEM_MODE
> > +   of BASE in LOOP.  */
> > +static unsigned int
> > +preferred_mem_scale_factor (struct loop *loop,
> > + tree base, machine_mode mem_mode)
>
> IMO this should live in tree-ssa-address.c instead.
>
> The only use of "loop" is to test for size vs. speed, but other callers
> might want to make that decision based on individual blocks, so I think
> it would make sense to pass a "speed" bool instead.  Probably also worth
> making it the last parameter, so that the order is consistent with
> address_cost (though probably then inconsistent with something else :-)).
>
> > [...]
> > @@ -3500,6 +3531,28 @@ add_iv_candidate_for_use (struct ivopts_data *data, 
> > struct iv_use *use)
> >  basetype = sizetype;
> >record_common_cand (data, build_int_cst (basetype, 0), iv->step, use);
> >
> > +  /* Compare the cost of an address with an unscaled index with the cost of
> > +an address with a scaled index and add candidate if useful. */
> > +  if (use != NULL
> > +  && poly_int_tree_p (iv->step)
> > +  && tree_fits_poly_int64_p (iv->step)
> > +  && address_p (use->type))
> > +{
> > +  poly_int64 new_step;
> > +  poly_int64 poly_step = tree_to_poly_int64 (iv->step);
>
> This should be:
>
>   poly_int64 step;
>   if (use != NULL
>   && poly_int_tree_p (iv->step, &step)
>   && address_p (use->type))
> {
>   poly_int64 new_step;
>
> > +  unsigned int fact
> > + = preferred_mem_scale_factor (data->current_loop,
> > +use->iv->base,
> > +TYPE_MODE (use->mem_type));
> > +
> > +  if ((fact != 1)
> > +   && multiple_p (poly_step, fact, &new_step))
>
> Should be no brackets around "fact != 1".
>
> > [...]
>
> Looks really good to me otherwise, thanks.  Bin, any comments?
Revised patch which handles the above review comments is attached.

Thanks,
Kugan
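
A simplified sketch of the decision the revised helper makes, assuming the target's address cost can be modelled as a function from index scale to cost. `AddressCost`, `preferred_scale` and `scaled_step` are hypothetical names; they do not match the GCC signatures, and plain integers stand in for poly_int64.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the target's address cost query: maps an index
// scale (1 means unscaled) to a cost.  Not GCC's address_cost signature.
using AddressCost = std::function<unsigned(unsigned scale)>;

// Return the element size when "base + index * size" is strictly cheaper
// than "base + index"; otherwise return 1 (no scaling preferred).
inline unsigned preferred_scale(unsigned unit_size, const AddressCost &cost_of)
{
  assert(unit_size >= 1);
  unsigned unscaled = cost_of(1);
  unsigned scaled = cost_of(unit_size);
  return scaled < unscaled ? unit_size : 1;
}

// Mirrors the guard in the caller: only use the factor when it is not 1
// and divides the step exactly, and hand back the reduced step.
inline bool scaled_step(long step, unsigned fact, long *new_step)
{
  if (fact == 1 || step % static_cast<long>(fact) != 0)
    return false;
  *new_step = step / static_cast<long>(fact);
  return true;
}
```

Passing the cost query in, rather than a loop, follows the review suggestion that the caller decide between size and speed.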

> Richard
From 6a146662fab39de876de332bacbb1a3300caefb8 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Wed, 15 May 2019 09:16:43 +1000
Subject: [PATCH 1/2] Add support for IVOPT

gcc/ChangeLog:

2019-05-15  Kugan Vivekanandarajah  

	PR target/88834
	* tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle
	IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES.
	(get_alias_ptr_type_for_ptr_address): Likewise.
	(add_iv_candidate_for_use): Add scaled index candidate if useful.
	* tree-ssa-address.c (preferred_mem_scale_factor): New.

Change-Id: Ie47b1722dc4fb430f07dadb8a58385759e75df58
---
 gcc/tree-ssa-address.c | 28 
 gcc/tree-ssa-address.h |  3 +++
 gcc/tree-ssa-loop-ivopts.c | 26 +-
 3 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index 1c17e93..fdb6619 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -1127,6 +1127,34 @@ maybe_fold_tmr (tree ref)
   return new_ref;
 }
 
+/* Return the preferred mem scale factor for accessing MEM_MODE
+   of BASE which is optimized for SPEED.  */
+unsigned int
+preferred_mem_scale_factor (tree base, machine_mode mem_mode,
+			bool speed)
+{
+  struct mem_address parts = {};
+  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
+  unsigned int fact = GET_MODE_UNIT_SIZE (mem_mode);
+
+  /* Addressing mode "base + index".  */
+  parts.index = integer_one_node;
+  parts.base = integer_one_node;
+  rtx addr = addr_for_mem_ref (&parts, as, false);
+  unsigned cost = address_cost (addr, mem_mode, as, speed);
+
+  /* Addressing mode "base + index << scale".  */
+  parts.step = wide_int_to_tree (sizetype, fact);
+  addr = addr_for_mem_ref (&parts, as, false);
+  unsigned new_cost = address_cost (addr, mem_mode, as, speed);
+
+  /* Compare the cost of an address with an unscaled index with
+ a scaled index and return factor if useful. */
+  if (new_cost < cost)
+return GET_MODE_UNIT_SIZE (mem_mode);
+  return 1;
+}
+
 /* Dump PARTS to FILE.  */
 
 extern void dump_mem_address (FILE *, struct mem_address *);
diff --git a/gcc/tree-ssa-address.h 

[RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-05-21 Thread Kugan Vivekanandarajah
Hi,

Attached RFC patch attempts to use 32-bit WHILELO in LP64 mode to fix
the PR. Bootstrap and regression testing are ongoing. In earlier testing,
I ran into an issue related to fwprop. I will tackle that based on the
feedback for the patch.

Thanks,
Kugan
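
A rough model of why a truncated 64-bit IV can feed a 32-bit WHILELO when the limit fits in 32 bits. This is a sketch under simplifying assumptions: `whilelo_model` is a hypothetical, simplified stand-in for the instruction's predicate, not the architectural pseudocode, and `narrow_matches_wide` is likewise invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified model of a WHILELO-style predicate: lane i is active while
// base + i < limit (unsigned), and once a lane is inactive all later
// lanes stay inactive.
template <typename UInt>
std::vector<bool> whilelo_model(UInt base, UInt limit, unsigned lanes)
{
  std::vector<bool> pred(lanes, false);
  UInt cur = base;
  for (unsigned i = 0; i < lanes && cur < limit; ++i, ++cur)
    pred[i] = true;
  return pred;
}

// True when, for every vector iteration of a loop whose trip count fits
// in 32 bits, the 32-bit predicate on the truncated IV equals the 64-bit
// predicate on the full IV.
inline bool narrow_matches_wide(std::uint32_t n, unsigned lanes)
{
  assert(lanes > 0);
  for (std::uint64_t iv = 0; iv < n; iv += lanes)
    {
      auto wide = whilelo_model<std::uint64_t>(iv, n, lanes);
      auto narrow
        = whilelo_model<std::uint32_t>(static_cast<std::uint32_t>(iv), n, lanes);
      if (wide != narrow)
        return false;
    }
  return true;
}
```

The 64-bit IV stays available for address computation, while only its truncated value is used for the comparison.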
From 4e9837ff9c0c080923f342e83574a6fdba2b3d92 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 5 Mar 2019 10:01:45 +1100
Subject: [PATCH] pr88838[v2]

As mentioned in PR88838, this patch avoids the SXTW by using WHILELO on W
registers instead of X registers.

As mentioned in the PR, vect_verify_full_masking checks which IV widths are
supported for WHILELO but prefers to go to Pmode width.  This is because
using Pmode allows ivopts to reuse the IV for indices (as in the loads
and store above).  However, it would be better to use a 32-bit WHILELO
with a truncated 64-bit IV if:

(a) the limit is extended from 32 bits.
(b) the detection loop in vect_verify_full_masking detects that using a
32-bit IV would be correct.

gcc/ChangeLog:

2019-05-22  Kugan Vivekanandarajah  

	* tree-vect-loop-manip.c (vect_set_loop_masks_directly): If the
	compare_type is not with Pmode size, we will create an IV with
	Pmode size with truncated use (i.e. converted to the correct type).
	* tree-vect-loop.c (vect_verify_full_masking): Find which IV
	widths are supported for WHILELO.

gcc/testsuite/ChangeLog:

2019-05-22  Kugan Vivekanandarajah  

	* gcc.target/aarch64/pr88838.c: New test.
	* gcc.target/aarch64/sve/while_1.c: Adjust.

Change-Id: Iff52946c28d468078f2cc0868d53edb05325b8ca
---
 gcc/fwprop.c   | 13 +++
 gcc/testsuite/gcc.target/aarch64/pr88838.c | 11 ++
 gcc/testsuite/gcc.target/aarch64/sve/while_1.c | 16 
 gcc/tree-vect-loop-manip.c | 52 --
 gcc/tree-vect-loop.c   | 39 ++-
 5 files changed, 117 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr88838.c

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index cf2c9de..5275ad3 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -1358,6 +1358,19 @@ forward_propagate_and_simplify (df_ref use, rtx_insn *def_insn, rtx def_set)
   else
 mode = GET_MODE (*loc);
 
+  /* TODO.  */
+  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
+return false;
+  /* TODO. We can't get the mode for
+ (set (reg:VNx16BI 109)
+  (unspec:VNx16BI [
+	(reg:SI 131)
+	(reg:SI 106)
+   ] UNSPEC_WHILE_LO))
+ Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
+  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg))
+  && GET_CODE (SET_SRC (use_set)) == UNSPEC)
+return false;
   new_rtx = propagate_rtx (*loc, mode, reg, src,
   			   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr88838.c b/gcc/testsuite/gcc.target/aarch64/pr88838.c
new file mode 100644
index 000..9d03c0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr88838.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-S -O3 -march=armv8.2-a+sve" } */
+
+void
+f (int *restrict x, int *restrict y, int *restrict z, int n)
+{
+for (int i = 0; i < n; i += 1)
+  x[i] = y[i] + z[i];
+}
+
+/* { dg-final { scan-assembler-not "sxtw" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/while_1.c b/gcc/testsuite/gcc.target/aarch64/sve/while_1.c
index a93a04b..05a4860 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/while_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/while_1.c
@@ -26,14 +26,14 @@
 TEST_ALL (ADD_LOOP)
 
 /* { dg-final { scan-assembler-not {\tuqdec} } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.b, xzr,} 2 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.b, x[0-9]+,} 2 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.h, xzr,} 2 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.h, x[0-9]+,} 2 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, xzr,} 3 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
-/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.b, wzr,} 2 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.b, w[0-9]+,} 2 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.h, wzr,} 2 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.h, w[0-9]+,} 2 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, wzr,} 3 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, w[0-9]+,} 3 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, wzr,} 3 } } */
+/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, w[0-9]+,} 3 } } */
 /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, 

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Kewen.Lin
on 2019/5/21 6:20 PM, Richard Biener wrote:
> On Tue, 21 May 2019, Kewen.Lin wrote:
> 
>> on 2019/5/21 12:37 AM, Segher Boessenkool wrote:
>>> On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
>>>>> I think we should have two hooks: one is called with the struct loop as
>>>>> parameter; and the other is called for every statement in the loop, if
>>>>> the hook isn't null anyway.  Or perhaps we do not need that second one.
>>>> I'd wait to see a compelling example from real world code where we need
>>>> to scan the statements.  Otherwise we're just dragging in more target
>>>> specific decisions which in fact we want to minimize target stuff.
>>>
>>> The ivopts pass will be too optimistic about what loops will end up as a
>>> doloop, and cost things accordingly.  The cases where we cannot later
>>> actually use a doloop are doing pretty much per iteration, so I think
>>> ivopts will still make good decisions.  We'll need to make the rtl part
>>> not actually do a doloop then, but we probably still need that logic
>>> anyway.
>>>
>>> Kewen, Bin, will that work satisfactorily do you think?
>>>
>>
>> If my understanding of this question is correct, IMHO we should try to make
>> IVOPTs conservative rather than optimistic: once a prediction is wrong because
>> of an overly optimistic decision, the costing of the doloop use is wrong, and
>> that can easily affect the globally optimal set.  It looks like we have no way
>> to recover from that in RTL?  (Otherwise, there would be a better place to fix
>> the PR.)  Although this may also miss some good cases, it is at least
>> as good as before, so I'm inclined to make it conservative.
> 
> I wonder if you could simply benchmark what happens if you make
> IVOPTs _always_ create a doloop IV (if possible)?  I doubt the
> cases where a doloop IV is bad (calls, etc.) are too common and
> that in those cases the extra simple IV hurts.
> 

OK.  I'll make the changes and measure with SPEC2017.  I may also
extend it to two other checks: the niter cost check and the estimated
niter range check.  It may take some days, so please expect a late
response.  :)


Thanks
Kewen


> Richard.
> 



Re: Go patch committed: Intrinsify runtime/internal/atomic functions

2019-05-21 Thread Jim Wilson
On Sun, May 19, 2019 at 5:22 AM Andreas Schwab  wrote:
> ../../../libgo/go/runtime/mbitmap.go: In function 
> ‘runtime.setMarked.runtime.markBits’:
> ../../../libgo/go/runtime/mbitmap.go:291:9: internal compiler error: 
> Segmentation fault
>   291 |  atomic.Or8(m.bytep, m.mask)
>   | ^

This is failing for RISC-V because __atomic_or_fetch_1 isn't a
built-in function that can be expanded inline.  You have to call the
library function in libatomic.  The C front-end is registering all of
the built-in functions, but it looks like the go front-end is only
registering functions it thinks it needs and this list is incomplete.
In expand_builtin, case BUILT_IN_ATOMIC_OR_FETCH_1, the external
library call for this gets set to BUILT_IN_ATOMIC_FETCH_OR_1.  Then in
expand_builtin_atomic_fetch_op when we call builtin_decl_explicit
(ext_call) it returns NULL.  This is because the go front end
registered BUILT_IN_ATOMIC_OR_FETCH_1 as a built-in, but did not
register BUILT_IN_ATOMIC_FETCH_OR_1 as a built-in.  The NULL return
from builtin_decl_explicit gives us an ADDR_EXPR with a NULL operand
which eventually causes the internal compiler error.  It looks like
the same thing is done with all of the op_fetch built-ins, so use of
any of them means that the fetch_op built-in also has to be
registered.  I verified with a quick hack that I need both
BUILT_IN_ATOMIC_FETCH_OR_1 and BUILT_IN_ATOMIC_FETCH_AND_1 defined as
built-ins to make a RISC-V go build work.  I haven't done any testing
yet.

Jim
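
A small sketch of the relationship described above, written with std::atomic rather than the GCC built-ins directly: an op_fetch result (the new value) can be obtained from the corresponding fetch_op call (the old value) by applying the operation once more. The function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// or_fetch returns the NEW value; fetch_or returns the OLD value.  The new
// value can be recomputed from the old one by applying the operation again.
inline std::uint8_t or_fetch_via_fetch_or(std::atomic<std::uint8_t> &byte,
                                          std::uint8_t mask)
{
  return static_cast<std::uint8_t>(byte.fetch_or(mask) | mask);
}

// Same idea for AND: and_fetch == fetch_and & mask.
inline std::uint8_t and_fetch_via_fetch_and(std::atomic<std::uint8_t> &byte,
                                            std::uint8_t mask)
{
  return static_cast<std::uint8_t>(byte.fetch_and(mask) & mask);
}

// Usage: both the returned value and the stored value are the new value.
inline void or_fetch_demo()
{
  std::atomic<std::uint8_t> byte{0x01};
  assert(or_fetch_via_fetch_or(byte, 0x10) == 0x11);
  assert(byte.load() == 0x11);
}
```

This is why a front end that registers only the op_fetch form can still end up needing the fetch_op declaration when the operation is not expanded inline.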
diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index 1b26f2bac93..91043b51463 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -871,6 +871,8 @@ Gcc_backend::Gcc_backend()
NULL_TREE);
   this->define_builtin(BUILT_IN_ATOMIC_AND_FETCH_1, "__atomic_and_fetch_1", 
NULL,
t, false, false);
+  this->define_builtin(BUILT_IN_ATOMIC_FETCH_AND_1, "__atomic_fetch_and_1", 
NULL,
+   t, false, false);
 
   t = build_function_type_list(unsigned_char_type_node,
ptr_type_node,
@@ -879,6 +881,8 @@ Gcc_backend::Gcc_backend()
NULL_TREE);
   this->define_builtin(BUILT_IN_ATOMIC_OR_FETCH_1, "__atomic_or_fetch_1", NULL,
t, false, false);
+  this->define_builtin(BUILT_IN_ATOMIC_FETCH_OR_1, "__atomic_fetch_or_1", NULL,
+   t, false, false);
 }
 
 // Get an unnamed integer type.


Re: RFA: fix PR90553, IRA assigning a call-clobbered reg to call with post-increment

2019-05-21 Thread Hans-Peter Nilsson
> From: Vladimir Makarov 
> Date: Tue, 21 May 2019 17:05:50 -0400

> Yes, the patch is ok to commit.  Thank you for working on the problem.
> 
> It is hard to reproduce the same problem in LRA as LRA mostly follows 
> IRA decisions.
> 
> I'll probably do the analogous patch for LRA this week.

Thanks!

> > gcc:
> > * lra-lives.c (process_bb_node_lives): Consider defs
> It should be ira-lives.c here.
> > for a call insn to die before the call, not after.

I also missed the "PR middle-end/90553" heading above and
misspelled the GNU spacing at the end of the moved and expanded
comment.  Fixed before the commit.  Thanks.

brgds, H-P


Re: [PATCH 3/3][DejaGNU] target: Wrap linker flags into `-largs'/`-margs' for Ada

2019-05-21 Thread Jacob Bachmeyer

Maciej Rozycki wrote:

On Thu, 16 May 2019, Jacob Bachmeyer wrote:

  
 I suspect the origins may be different, however as valuable as your 
observation is functional problems have precedence over issues with code 
structuring, so we need to fix the problem at hand first.  I'm sure 
DejaGNU maintainers will be happy to review your implementation of code 
restructuring afterwards.
  
My concern is that your patch may only solve a small part of the problem 
-- enough to make your specific case work, yes, but then someone else 
will hit other parts of the problem later and we spiral towards "tissue 
of hacks" unmaintainability.



 I think however that fixing problems in small steps as they are 
discovered is also a reasonable approach and a way to move forward: 
perfect is the enemy of good.
  


Fair enough; observe the small patches I have recently submitted to DejaGnu.

 So I don't think the prospect of making a comprehensive solution should 
prevent a simple fix for the problem at hand that has been already 
developed from being applied.
  


I recognize a difference between "simple but complete" (an ideal 
sometimes achieved in practice) and "simple because incomplete" (which 
does not actually fix the problem).  My concerns are that your patch may 
be the latter.


 IOW I don't discourage you from developing a comprehensive solution, 
however applying my proposal right away will help at least some people and 
will not block you in any way.
  


Correct, although, considering how long my FSF paperwork took, I might 
be able to finish a comprehensive patch before your paperwork is 
completed.  :-)


The biggest hint to me that your patch is incomplete is that it only 
adds -largs/-margs to wrap LDFLAGS.  I suspect that there are other 
-?args options that should be used also with other flag sets, but those 
do not appear in this patch.  Do we know what the GNU Ada frontend 
actually expects?



 At first glance it looks to me we should be safe overall as compiler 
flags are supposed to be passed through by `gnatmake' (barring switch 
processing bugs, as observed with 1/3), and IIUC assembler flags are 
considered compiler flags for the purpose of this consideration as 
`gnatmake' does not make assembly a separate build stage.  So while we 
could wrap compiler flags into `-cargs'/`-margs', it would only serve to 
avoid possible `gnatmake' switch processing bugs.
  


I am not sure if those are actually bugs in `gnatmake' or the result of 
an incomplete specification for `gnatmake' -- I suspect that --sysroot= 
may have been added to the rest of GCC after `gnatmake' was written and 
whoever added it did not update the Ada frontend.


 There's also `-bargs' for binder switches, but I can't see any use for it 
here.


 Finally boards are offered the choice of `adaflags', `cflags', 
`cxxflags', etc. for the individual languages, where the correct syntax 
can be used if anything unusual is needed beyond what I have noted above.
  


Which also raises the issue of "cflags_for_target" (used regardless of 
language and currently always taken from the "unix" board configuration) 
and how that is supposed to make sense and whether it should be 
similarly split into language-specific values or simply removed.  I have 
already submitted a patch to draw that value from the actual host board 
configuration.


 I'll defer any further consideration to the Ada maintainers cc-ed; I do 
hope I haven't missed anything here, but then Ada is far from being my 
primary area of experience.
  


Likewise, hopefully some of the Ada maintainers will be able to shed 
light on this issue.  And I hope Ben (the DejaGnu maintainer) is okay -- 
I would have expected him to comment by now.


 The ordering rules are system-specific I'm afraid and we have to be 
careful not to break working systems out there.  People may be forced to a 
DejaGNU upgrade, due to a newer version of a program being tested having 
such a requirement, and can legitimately expect their system to continue 
working.
  
We can start by simply preserving the existing ordering until we know 
something should change, but the main goal of my previous message was to 
collect the requirements for a specification for default_target_compile 
so I can write regression tests (some of which will fail due to known 
bugs like broken Ada support in our current implementation) before 
embarking on extensive changes to that procedure.  Improving 
"target.test" was already on my local TODO list.



 You are welcome to go ahead with your effort as far as I am concerned.
  


I am working on it.  :-)

-- Jacob
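
For readers unfamiliar with the gnatmake mode switches discussed in this thread, a minimal sketch of the wrapping idea: linker flags are bracketed by `-largs` and `-margs` so that gnatmake passes them to the linker and then resumes parsing its own switches. This is not DejaGnu code (DejaGnu is written in Tcl); `wrap_ada_ldflags` is a hypothetical helper, and returning nothing for an empty flag list is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not DejaGnu code.  Everything between -largs and
// -margs is handed to the linker; -margs switches back to gnatmake's own
// options.  Emitting nothing for an empty list is an assumption.
inline std::vector<std::string>
wrap_ada_ldflags(const std::vector<std::string> &ldflags)
{
  if (ldflags.empty())
    return {};
  std::vector<std::string> out;
  out.reserve(ldflags.size() + 2);
  out.push_back("-largs");
  out.insert(out.end(), ldflags.begin(), ldflags.end());
  out.push_back("-margs");
  return out;
}

// Usage: two linker flags become four gnatmake arguments.
inline void wrap_ada_ldflags_demo()
{
  const std::vector<std::string> flags{"-L/opt/lib", "-lm"};
  assert(wrap_ada_ldflags(flags).size() == 4);
}
```

The same shape would apply to `-cargs` or `-bargs` if compiler or binder flags ever needed wrapping, which is the open question raised above.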


Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-21 Thread Jeff Law
On 5/15/19 8:30 AM, Robin Dapp wrote:
>> It would really help if you could provide testcases which show the
>> suboptimal code and any analysis you've done.
> 
> I tried introducing a define_subst pattern that substitutes something
> one of two other subst patterns already changed.
> 
> The first subst pattern helps remove a superfluous and on the shift
> count operand by accepting both variants, with and without an and for
> the insn pattern.
> 
> (define_subst "masked_op_subst"
>   [(set (match_operand:DSI 0 ""   "")
> (shift:DSI (match_operand:DSI 1 "" "")
>  (match_operand:SI  2 "" "")))]
>   ""
>   [(set (match_dup 0)
> (shift:DSI (match_dup 1)
>  (and:SI (match_dup 2)
> (match_operand:SI 3 "const_int_6bitset_operand" "jm6"))))])
> 
> The second subst helps encode a shift count addition like $r1 + 1 as
> address style operand 1($r1) that is directly supported by the shift
> instruction.
> 
> (define_subst "addr_style_op_subst"
>   [(set (match_operand:DSI_VI 0 "" "")
> (shift:DSI_VI (match_operand:DSI_VI 1 "" "")
> (match_operand:SI 2 "" "")))]
>   ""
>   [(set (match_dup 0)
> (shift:DSI_VI (match_dup 1)
> (plus:SI (match_operand:SI 2 "register_operand" "a")
> (match_operand 3 "const_int_operand" "n"))))])
> 
> Both of these are also used in combination.
> 
> Now, in order to get rid of the subregs in the pattern combine creates,
> I would need to be able to do something like
> 
> (define_subst "subreg_subst"
>   [(set (match_operand:DI 0 "" "")
> (shift:DI (match_operand:DI 1 "" "")
> (subreg:SI (match_dup:DI 2)))]
> 
> where the (match_dup:DI 2) would capture both (and:SI ...) [with the
> first argument being either a register or an already substituted
> (plus:SI ...)] as well as a simple (plus:SI ...).
> 
> As far as I can tell match_dup:mode can be used to change the mode of
> the top-level operation but the operands will remain the same.  For
> this, a match_dup_deep or whatever would be useful.  I'm pretty sure we
> don't want to open this can of worms, though :)
> 
> To get rid of this, I explicitly duplicated all three subst combinations
> resulting in a lot of additional code.  This is not necessary when the
> subregs are eliminated by the middle end via SHIFT_COUNT_TRUNCATED.
> Maybe there is a much more obvious way that I missed?
Painful.  I doubt exposing the masking during the RTL expansion phase
and hoping the standard optimizers will eliminate it would work better
-- though perhaps if the expanders queried the global range information
and elided the masking when the range of the shift was known to be in range.

jeff
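
A source-level sketch of the idea in the last paragraph, assuming a 64-bit shift whose count is masked to six bits. Range information is modelled here by a caller precondition rather than by the compiler's own analysis, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Explicitly masked shift: well defined for any count, because the count
// is reduced to [0, 63] before shifting.
inline std::uint64_t shl_masked(std::uint64_t x, unsigned count)
{
  return x << (count & 63u);
}

// Shift with the mask elided.  Only valid when the caller already knows
// the count is in [0, 63]; the assert stands in for range information.
inline std::uint64_t shl_in_range(std::uint64_t x, unsigned count)
{
  assert(count < 64u);
  return x << count;
}
```

Whenever the count is known to be in range the two functions agree, so the AND carries no information and can be dropped.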




Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Jeff Law
On 5/21/19 3:32 AM, Martin Liška wrote:
> Hi.
> 
> As suggested by Joseph, the patch is about not copying
> target_clones attributes in handle_copy_attribute.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/c-family/ChangeLog:
> 
> 2019-05-21  Martin Liska  
> 
>   PR lto/90500
>   * c-attribs.c (handle_copy_attribute): Do not copy
>   target_clones attribute.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-05-21  Martin Liska  
> 
>   PR lto/90500
>   * gcc.target/i386/pr90500-1.c: Make the test-case valid
>   now.
I think you need to update the docs to indicate it doesn't copy the
target_clones attribute.  With that, OK.

jeff


Re: [PATCH,RFC 0/3] Support for CTF in GCC

2019-05-21 Thread Indu Bhagat

Thanks for your feedback. Comments inline.


On 05/21/2019 03:28 AM, Richard Biener wrote:

GCC RFC patch set :
Patch 1 is a simple addition of a new function lang_GNU_GIMPLE to check for
GIMPLE frontend.

I don't think you should need this - the GIMPLE "frontend" is intended for
unit testing only, I wouldn't like it to be exposed more.


When using -gt with -flto, I would still like the CTF hooks to be initialized
so that CTF can be generated when -flto is used. So the check in toplev.c is
done to allow only C and GNU GIMPLE.  I am fine with doing a string compare
with the language.hooks string if you suggest to go that way.




One of the main high-level design requirements that is relevant in the context
of the current GCC patch set is that - CTF and DWARF must be able to co-exist.
A user may want CTF debug information in isolation or with other debug formats.
A .ctf section is small and unlike other debug sections, ideally should not
need to be stripped out of the binary/executable.

High-level proposed plan (phase 1) :
In the next few patches, the functionality to generate contents of the CTF
section (.ctf) for a single compilation unit will be added.
Once CTF generation for a single compilation unit stabilizes, LTO and CTF
generation will be looked at.

Feedback and suggestions welcome.

You probably got asked this question multiple times already, but,
can CTF information be generated from DWARF instead?


Yes and No :) And that is indeed one of the motivations of the project - to
allow CTF generation where it's most suited, namely the toolchain.

There do exist utilities for generating CTF from DWARF. For example, one of
them is dwarf2ctf in the DTrace Linux distribution. dwarf2ctf works offline
to transform DWARF generated by the compiler into CTF.

A dependency on an external conversion utility for "post-processing" DWARF
offline poses several problems:

1. Deployment problems: the converter should be distributed and integrated in
   the build system of the program.  This, on occasions, can be intrusive.  For
   example, in terms of dependencies: the dwarf2ctf converter depends on
   libdwarf from elfutils suite, glib2 (used for the GHashTable), zlib (used to
   compress the CTF information) and libctf (which both reads and writes the
   CTF data).

2. Performance problems: the conversion from DWARF to CTF can take a long time,
   especially in big programs such as the Linux kernel.

3. Maintainability problems: the converter should be maintained in order to
   reflect potential changes in the DWARF generated by the compiler.

4. Adoption problem: it is difficult for applications to adopt the usage of
   CTF, even if it happens to provide what they need, since that would require
   writing a conversion utility or integrating DTrace's.




The meaning of the CTF acronym suggests that there's nothing
like locations, registers, etc. but just a representation of the
types?


Yes. CTF is, simply put, type information; no locations, registers, etc.



Generally we are trying to walk away from supporting multiple
debug info formats because that gets in the way of being
more precise from the frontend side.  Since DWARF is the


With regard to whether the support for CTF imposes infeasible or distinct
requirements on the frontend - it does not appear to be the case (I have
been working on CTF generation in GCC for a SINGLE compilation unit; see more
below). I agree that CTF debug information generation should ideally not impose
additional requirements on the frontend.


defacto standard, extensible and with a rich feature set the
line of thinking is that other formats (like STABS) can be
generated by "post-processing" DWARF.  Such
post-processing could happen on the object files or
on the GCC internal DWARF data structures by
providing alternate output routines.  That is, the mid-term
design goal is to make DWARF generation the "API"
for GCC frontends to use when creating high-level
debug information rather than trying to abstract from
the debuginfo format via the current debug-hooks or
the other way around via language-hooks.


I am not sure if I understood the last part very well, so I will state how CTF
generation is intended to work. Does the following fit the design goal you
state ?

( Caveat : I have been working on the functionality to generate CTF for a SINGLE
  compilation unit. LTO bits remain. )

So far, there are no additional requirements on the frontend side. CTF hooks
are wrappers around DWARF debug hooks (much like go-dump hooks, and vms dbg
hooks).  We did notice that GCC does not have the infrastructure to register or
enlist multiple debug hooks; and now from your comments it is clear that this
is by design. Thanks for clarifying that.

Having said that, I use CTF hooks to go from TREE --> update CTF internal
structures or output CTF routines depending on the hook (e.g., type_decl or
finish respectively), rather than changing the dwarf* files with CTF APIs. The
CTF debug hooks relay control 

Re: [PATCH] Implement LWG 3062, Unnecessary decay_t in is_execution_policy_v

2019-05-21 Thread Thomas Rodgers
The revised attached patch has been tested x86_64-linux, committed to trunk.
>From 5f8aeeb98477d6555d65a45d1d2aed84b26863c9 Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Tue, 21 May 2019 12:02:35 -0700
Subject: [PATCH] LWG 3062 - Unnecessary decay_t in is_execution_policy_v

* include/pstl/execution_defs.h (__enable_if_execution_policy):
Use std::__remove_cv_ref_t when building with GCC.
---
 libstdc++-v3/ChangeLog | 6 ++
 libstdc++-v3/include/pstl/execution_defs.h | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 804bace5e03..8ce445e39b8 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,9 @@
+2019-05-21  Thomas Rodgers  
+
+	LWG 3062 - Unnecessary decay_t in is_execution_policy_v
+	* include/pstl/execution_defs.h (__enable_if_execution_policy):
+	Use std::__remove_cv_ref_t when building with GCC.
+
 2019-05-21  Jonathan Wakely  
 
 	PR libstdc++/90252
diff --git a/libstdc++-v3/include/pstl/execution_defs.h b/libstdc++-v3/include/pstl/execution_defs.h
index 1a551c7871c..34b0e3d6350 100644
--- a/libstdc++-v3/include/pstl/execution_defs.h
+++ b/libstdc++-v3/include/pstl/execution_defs.h
@@ -152,9 +152,15 @@ constexpr bool is_execution_policy_v = __pstl::execution::is_execution_policy<_T
 namespace __internal
 {
 template 
+#if _GLIBCXX_RELEASE >= 9
+using __enable_if_execution_policy =
+typename std::enable_if<__pstl::execution::is_execution_policy>::value,
+_Tp>::type;
+#else
 using __enable_if_execution_policy =
 typename std::enable_if<__pstl::execution::is_execution_policy<typename std::decay<_ExecPolicy>::type>::value,
 _Tp>::type;
+#endif
 } // namespace __internal
 
 } // namespace __pstl
-- 
2.20.1
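
For illustration only (not part of the patch): a minimal sketch of why the
choice between decay and remove_cvref matters. The names
remove_cvref_t_sketch, seq_policy and is_policy are hypothetical stand-ins,
not the libstdc++ or PSTL names. Both transformations strip cv-qualifiers
and references, but decay additionally applies the array-to-pointer and
function-to-pointer conversions, which a policy check does not need.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for C++20 std::remove_cvref_t, usable before C++20.
template <class T>
using remove_cvref_t_sketch =
    typename std::remove_cv<typename std::remove_reference<T>::type>::type;

// Hypothetical policy type and trait standing in for the real ones.
struct seq_policy {};
template <class T> struct is_policy : std::false_type {};
template <> struct is_policy<seq_policy> : std::true_type {};

// Both transformations recognize a cv/ref-qualified policy.
static_assert(is_policy<typename std::decay<const seq_policy&>::type>::value, "");
static_assert(is_policy<remove_cvref_t_sketch<const seq_policy&>>::value, "");

// decay also converts arrays and functions to pointers; remove_cvref does not.
static_assert(std::is_same<typename std::decay<int[3]>::type, int*>::value, "");
static_assert(std::is_same<remove_cvref_t_sketch<int[3]>, int[3]>::value, "");
static_assert(std::is_same<typename std::decay<void()>::type, void (*)()>::value, "");
static_assert(std::is_same<remove_cvref_t_sketch<void()>, void()>::value, "");

// Runtime-visible value so the sketch can also be checked with assert().
constexpr bool recognized_after_cvref =
    is_policy<remove_cvref_t_sketch<seq_policy&&>>::value;
```

If the static_asserts compile, the two traits agree on policy types and
differ only on array and function types.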


Thomas Rodgers writes:

> Revised patch.
>
> From 074685cf74b48604244c0c6f1d8cba63ff8915e5 Mon Sep 17 00:00:00 2001
> From: Thomas Rodgers 
> Date: Wed, 24 Apr 2019 15:53:45 -0700
> Subject: [PATCH] Implement LWG 3062, Unnecessary decay_t in
>  is_execution_policy_v
>
>   should be remove_cvref_t
>   * include/pstl/execution_defs.h (__enable_if_execution_policy):
> Use std::__remove_cv_ref_t when building with GCC.
> ---
>  libstdc++-v3/include/pstl/execution_defs.h | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/libstdc++-v3/include/pstl/execution_defs.h 
> b/libstdc++-v3/include/pstl/execution_defs.h
> index 86c7a5a770d..0ed4cc30914 100644
> --- a/libstdc++-v3/include/pstl/execution_defs.h
> +++ b/libstdc++-v3/include/pstl/execution_defs.h
> @@ -152,9 +152,15 @@ constexpr bool is_execution_policy_v = 
> __pstl::execution::is_execution_policy<_T
>  namespace __internal
>  {
>  template 
> +#if _GLIBCXX_RELEASE >= 9
> +using __enable_if_execution_policy =
> +typename 
> std::enable_if<__pstl::execution::is_execution_policy>::value,
> +_T>::type;
> +#else
>  using __enable_if_execution_policy =
> typename std::enable_if<__pstl::execution::is_execution_policy<typename std::decay<_ExecPolicy>::type>::value,
>  _T>::type;
> +#endif
>  } // namespace __internal
>  
>  } // namespace __pstl



PING^1: V2 [PATCH] i386: Insert ENDBR for NOTE_INSN_DELETED_LABEL only if needed

2019-05-21 Thread H.J. Lu
On Sat, Feb 16, 2019 at 7:02 AM H.J. Lu  wrote:
>
> On Thu, Feb 14, 2019 at 08:13:32PM -0800, H.J. Lu wrote:
> > NOTE_INSN_DELETED_LABEL is used to mark what used to be a 'code_label',
> > but was not used for other purposes than taking its address and was
> > transformed to mark that no code jumps to it.  NOTE_INSN_DELETED_LABEL
> > is generated only in 3 places:
> >
> > 1. When delete_insn sees an unused label which is an explicit label in
> > the input source code or its address is taken, it turns the label into
> > a NOTE_INSN_DELETED_LABEL note.
> > 2. When rtl_tidy_fallthru_edge deletes a tablejump, it turns the
> > tablejump into a NOTE_INSN_DELETED_LABEL note.
> > 3. ix86_init_large_pic_reg creates a NOTE_INSN_DELETED_LABEL note, .L2,
> > to initialize large model PIC register:
> >
> > L2:
> >   movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %r11
> >   leaq    .L2(%rip), %rax
> >   movabsq $val@GOT, %rdx
> >   addq    %r11, %rax
> >
> > Among them, ENDBR is needed only when the label address is taken.
> > rest_of_insert_endbranch has
> >
> >   if ((LABEL_P (insn) && LABEL_PRESERVE_P (insn))
> >   || (NOTE_P (insn)
> >   && NOTE_KIND (insn) == NOTE_INSN_DELETED_LABEL))
> > /* TODO.  Check /s bit also.  */
> > {
> >   cet_eb = gen_nop_endbr ();
> >   emit_insn_after (cet_eb, insn);
> >   continue;
> > }
> >
> > For NOTE_INSN_DELETED_LABEL, we should check forced_labels to see
> > if its address is taken.  Also ix86_init_large_pic_reg shouldn't set
> > LABEL_PRESERVE_P (in_struct) since NOTE_INSN_DELETED_LABEL is sufficient
> > to keep the label.
> >
> > gcc/
> >
> >   PR target/89355
> >   * config/i386/i386.c (rest_of_insert_endbranch): Check
> >   forced_labels to see if the address of NOTE_INSN_DELETED_LABEL
> >   is taken.
> >   (ix86_init_large_pic_reg): Don't set LABEL_PRESERVE_P.
> >
>
> Here is the updated patch.  We should check LABEL_PRESERVE_P on
> NOTE_INSN_DELETED_LABEL to see if its address is taken.
>
> OK for trunk?
>
> Thanks.
>
> H.J.
> ---
> NOTE_INSN_DELETED_LABEL is used to mark what used to be a 'code_label',
> but was not used for other purposes than taking its address and was
> transformed to mark that no code jumps to it.  Since LABEL_PRESERVE_P is
> true only if the label address was taken, check LABEL_PRESERVE_P before
> inserting ENDBR.
>
> 2019-02-15  H.J. Lu  
> Hongtao Liu  
>
> gcc/
>
> PR target/89355
> * config/i386/i386.c (rest_of_insert_endbranch): Check LABEL_PRESERVE_P
> to see if the address of NOTE_INSN_DELETED_LABEL is taken.
> (ix86_init_large_pic_reg): Don't set LABEL_PRESERVE_P.
>
> gcc/testsuite/
>
> PR target/89355
> * gcc.target/i386/cet-label-3.c: New test.
> * gcc.target/i386/cet-label-4.c: Likewise.
> * gcc.target/i386/cet-label-5.c: Likewise.

PING:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01327.html


-- 
H.J.
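
For illustration only: a toy model of the condition change described in the
V2 patch above. The types and function names (InsnKind, Insn,
needs_endbr_old, needs_endbr_new) are hypothetical and are not GCC's RTL
API; the "preserve" field merely models LABEL_PRESERVE_P. The sketch assumes
the updated check requires LABEL_PRESERVE_P for deleted-label notes as well,
as the V2 description states.

```cpp
#include <cassert>

// Toy model, not GCC code: the kinds of insn the ENDBR pass looks at.
enum class InsnKind { Label, DeletedLabelNote, Other };

struct Insn {
  InsnKind kind;
  bool preserve;  // models LABEL_PRESERVE_P (label address was taken)
};

// Old behaviour as quoted above: every deleted-label note gets an ENDBR.
constexpr bool needs_endbr_old(const Insn& i) {
  return (i.kind == InsnKind::Label && i.preserve) ||
         i.kind == InsnKind::DeletedLabelNote;
}

// Behaviour described for V2: the note also needs its address taken.
constexpr bool needs_endbr_new(const Insn& i) {
  return (i.kind == InsnKind::Label || i.kind == InsnKind::DeletedLabelNote) &&
         i.preserve;
}
```

In this model the two predicates differ only for a deleted-label note whose
address was never taken, which is exactly the case where the ENDBR is
unnecessary.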


PING^1: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2019-05-21 Thread H.J. Lu
On Fri, Feb 22, 2019 at 8:25 AM H.J. Lu  wrote:
>
> Hi Jan, Uros,
>
> This patch fixes the wrong code bug:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89229
>
> Tested on AVX2 and AVX512 with and without --with-arch=native.
>
> OK for trunk?
>
> Thanks.
>
> H.J.
> --
> i386 backend has
>
> INT_MODE (OI, 32);
> INT_MODE (XI, 64);
>
> So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
> in case of const_1, all 512 bits set.
>
> We can load zeros with narrower instruction, (e.g. 256 bit by inherent
> zeroing of highpart in case of 128 bit xor), so TImode in this case.
>
> Some targets prefer V4SF mode, so they will emit float xorps for zeroing.
>
> sse.md has
>
> (define_insn "mov<mode>_internal"
>   [(set (match_operand:VMOVE 0 "nonimmediate_operand"
>  "=v,v ,v ,m")
> (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
>  " C,BC,vm,v"))]
> 
>   /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
>  in avx512f, so we need to use workarounds, to access sse registers
>  16-31, which are evex-only. In avx512vl we don't need workarounds.  */
>   if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>   && (EXT_REX_SSE_REG_P (operands[0])
>   || EXT_REX_SSE_REG_P (operands[1])))
> {
>   if (memory_operand (operands[0], <MODE>mode))
> {
>   if (<MODE_SIZE> == 32)
> return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>   else if (<MODE_SIZE> == 16)
> return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>   else
> gcc_unreachable ();
> }
> ...
>
> However, since ix86_hard_regno_mode_ok has
>
>  /* TODO check for QI/HI scalars.  */
>   /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
>   if (TARGET_AVX512VL
>   && (mode == OImode
>   || mode == TImode
>   || VALID_AVX256_REG_MODE (mode)
>   || VALID_AVX512VL_128_REG_MODE (mode)))
> return true;
>
>   /* xmm16-xmm31 are only available for AVX-512.  */
>   if (EXT_REX_SSE_REGNO_P (regno))
> return false;
>
>   if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>   && (EXT_REX_SSE_REG_P (operands[0])
>   || EXT_REX_SSE_REG_P (operands[1])))
>
> is dead code.
>
> Also for
>
> long long *p;
> volatile __m256i yy;
>
> void
> foo (void)
> {
>_mm256_store_epi64 (p, yy);
> }
>
> with AVX512VL, we should generate
>
> vmovdqa %ymm0, (%rax)
>
> not
>
> vmovdqa64   %ymm0, (%rax)
>
> All TYPE_SSEMOV vector moves are consolidated to ix86_output_ssemov:
>
> 1. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE/AVX vector
> moves will be generated.
> 2. If xmm16-xmm31/ymm16-ymm31 registers are used:
>a. With AVX512VL, AVX512VL vector moves will be generated.
>b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
>   move will be done with zmm register move.
>
> ext_sse_reg_operand is removed since it is no longer needed.
>
> Tested on AVX2 and AVX512 with and without --with-arch=native.
>
> gcc/
>
> PR target/89229
> PR target/89346
> * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
> * config/i386/i386.c (ix86_get_ssemov): New function.
> (ix86_output_ssemov): Likewise.
> * config/i386/i386.md (*movxi_internal_avx512f): Call
> ix86_output_ssemov for TYPE_SSEMOV.
> (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> (*movti_internal): Likewise.
> (*movdi_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> Remove ext_sse_reg_operand check.
> (*movsi_internal): Likewise.
> (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> (*movdf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> Remove TARGET_AVX512F, TARGET_PREFER_AVX256, TARGET_AVX512VL
> and ext_sse_reg_operand check.
> (*movsf_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> Remove TARGET_PREFER_AVX256, TARGET_AVX512VL and
> ext_sse_reg_operand check.
> * config/i386/mmx.md (MMXMODE:*mov_internal): Call
> ix86_output_ssemov for TYPE_SSEMOV.  Remove ext_sse_reg_operand
> check.
> * config/i386/sse.md (VMOVE:mov_internal): Call
> ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
> check.
> * config/i386/predicates.md (ext_sse_reg_operand): Removed.
>
> gcc/testsuite/
>
> PR target/89229
> PR target/89346
> * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
> * gcc.target/i386/pr89229-2a.c: New test.
> * gcc.target/i386/pr89229-2b.c: Likewise.
> * gcc.target/i386/pr89229-2c.c: Likewise.
> * gcc.target/i386/pr89229-3a.c: Likewise.
> * gcc.target/i386/pr89229-3b.c: 
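
For illustration only: the selection rules listed in the quoted description
(items 1, 2a and 2b) can be sketched as a small decision function. MoveKind
and choose_vector_move are hypothetical names, not the ix86_output_ssemov
implementation, and the sketch covers only the register-to-register case
that rule 2b spells out.

```cpp
#include <cassert>

// Hypothetical labels for the three outcomes named in the description.
enum class MoveKind {
  SseAvx,       // rule 1: plain SSE/AVX vector move
  Avx512vl,     // rule 2a: AVX512VL vector move
  ZmmRegToReg   // rule 2b: register-to-register move done via zmm
};

// uses_ext_regs: any operand is xmm16-xmm31 or ymm16-ymm31.
// have_avx512vl: the target supports AVX512VL.
MoveKind choose_vector_move(bool uses_ext_regs, bool have_avx512vl) {
  if (!uses_ext_regs)
    return MoveKind::SseAvx;
  if (have_avx512vl)
    return MoveKind::Avx512vl;
  return MoveKind::ZmmRegToReg;
}
```

Under these rules the vmovdqa example above falls under rule 1: %ymm0 is not
an extended register, so the plain AVX move is chosen even with AVX512VL
enabled.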

Re: [PATCH 3/3][DejaGNU] target: Wrap linker flags into `-largs'/`-margs' for Ada

2019-05-21 Thread Maciej Rozycki
On Thu, 16 May 2019, Jacob Bachmeyer wrote:

> >  I suspect the origins may be different, however as valuable as your 
> > observation is functional problems have precedence over issues with code 
> > structuring, so we need to fix the problem at hand first.  I'm sure 
> > DejaGNU maintainers will be happy to review your implementation of code 
> > restructuring afterwards.
> 
> My concern is that your patch may only solve a small part of the problem 
> -- enough to make your specific case work, yes, but then someone else 
> will hit other parts of the problem later and we spiral towards "tissue 
> of hacks" unmaintainability.

 I think however that fixing problems in small steps as they are 
discovered is also a reasonable approach and a way to move forward: 
perfect is the enemy of good.

 So I don't think the prospect of making a comprehensive solution should 
prevent a simple fix for the problem at hand that has been already 
developed from being applied.

 IOW I don't discourage you from developing a comprehensive solution, 
however applying my proposal right away will help at least some people and 
will not block you in any way.

> The biggest hint to me that your patch is incomplete is that it only 
> adds -largs/-margs to wrap LDFLAGS.  I suspect that there are other 
> -?args options that should be used also with other flag sets, but those 
> do not appear in this patch.  Do we know what the GNU Ada frontend 
> actually expects?

 At first glance it looks to me we should be safe overall as compiler 
flags are supposed to be passed through by `gnatmake' (barring switch 
processing bugs, as observed with 1/3), and IIUC assembler flags are 
considered compiler flags for the purpose of this consideration as 
`gnatmake' does not make assembly a separate build stage.  So while we 
could wrap compiler flags into `-cargs'/`-margs', it would only serve to 
avoid possible `gnatmake' switch processing bugs.

 There's also `-bargs' for binder switches, but I can't see any use for it 
here.

 Finally boards are offered the choice of `adaflags', `cflags', 
`cxxflags', etc. for the individual languages, where the correct syntax 
can be used if anything unusual is needed beyond what I have noted above.

 I'll defer any further consideration to the Ada maintainers cc-ed; I do 
hope I haven't missed anything here, but then Ada is far from being my 
primary area of experience.

> >  The ordering rules are system-specific I'm afraid and we have to be 
> > careful not to break working systems out there.  People may be forced to a 
> > DejaGNU upgrade, due to a newer version of a program being tested having 
> > such a requirement, and can legitimately expect their system to continue 
> > working.
> 
> We can start by simply preserving the existing ordering until we know 
> something should change, but the main goal of my previous message was to 
> collect the requirements for a specification for default_target_compile 
> so I can write regression tests (some of which will fail due to known 
> bugs like broken Ada support in our current implementation) before 
> embarking on extensive changes to that procedure.  Improving 
> "target.test" was already on my local TODO list.

 You are welcome to go ahead with your effort as far as I am concerned.

> Unfortunately, people with that particular attitude seem to have 
> acquired outsize influence over the last few years.  I would suspect an 
> organized attack if I were more conspiracy-oriented, but Hanlon's razor 
> strongly suggests that this is simply a consequence of lowering barriers 
> to entry.

 Nod.

  Maciej

Re: RFA: fix PR90553, IRA assigning a call-clobbered reg to call with post-increment

2019-05-21 Thread Vladimir Makarov



On 2019-05-20 9:58 p.m., Hans-Peter Nilsson wrote:

I was looking into why I couldn't trivially move cris-elf to
"use init_array".  It appeared that it wasn't the hooks into
that machinery that went wrong, but that a compiler bug is
plaguing __libc_init_array.  It's been there since at least
4.7-era, hiding under the covers of the __init_array being empty
(everything being in .init).

Looking into it, it seems that IRA is treating this insn:

(call_insn 16 14 17 4 (parallel [
 (call (mem:QI (mem/f:SI (post_inc:SI (reg:SI 27 [ ivtmp.7 ])) [1 
MEM[base: _8, offset: 0B]+0 S4 A8]) [0 *_1 S1 A8])
 (const_int 0 [0]))
 (clobber (reg:SI 16 srp))
 ]) "t.c":7:5 220 {*expanded_call_non_v32}
  (expr_list:REG_INC (reg:SI 27 [ ivtmp.7 ])
 (expr_list:REG_CALL_DECL (nil)
 (nil)))
 (nil))

...as if the read-part of the post-increment happens before the
call, and the write-part to happen after the call, supposedly
with the value magically kept unclobbered or treated as some
kind of return-value.  Looking around, it seems only the VAX
port would also be affected; grepping around I see no other port
having a call instruction capable of loading a value indirectly,
with a side-effect.

Now, I'm ok with deliberately forbidding autoinc and other
side-effect constructs on call insns during register allocation
and will do the documentation legwork of that part (and the more
involved target-fixing) if there's consensus for that, but it
seems that for IRA the fix is as simple as follows.

LRA seems to have the same issue, but I have no way to reproduce
it there; I'll just have to watch out when I move the port to
LRA.  I don't know if reload is affected, but I believe
autoincdec doesn't count as an output reload.  (Please correct
me if I'm wrong!  An output-reload on a call insn is not
allowed, says a comment in find_reloads, but AFAICS that's still
undocumented.)

Probably this case was a reason to prohibit output reloads for calls.

So, is this ok?  Regtested on cris-elf and x86_64-linux-gnu
(though the latter uses LRA).  Note that this does *not* cause
the return-value for f3 and f4 in the test-case to be allocated
a call-saved register after the value.


Yes, the patch is ok to commit.  Thank you for working on the problem.

It is hard to reproduce the same problem in LRA as LRA mostly follows 
IRA decisions.


I'll probably do the analogous patch for LRA this week.



gcc:
* lra-lives.c (process_bb_node_lives): Consider defs

It should be ira-lives.c here.

for a call insn to die before the call, not after.
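
For illustration only: source code of roughly the following shape is what
produces a call through a post-incremented pointer, as in the
__libc_init_array loop mentioned above. This is a hypothetical sketch, not
the newlib code or the test case from the patch; run_init_array, count_call
and demo_run are made-up names.

```cpp
#include <cassert>

using init_fn = void (*)();

static int g_calls = 0;                    // demo-only call counter
static void count_call() { ++g_calls; }

// Calls every function in [p, end).  On a target with a call instruction
// that can load its target indirectly with a side effect, "(*p++)()" may
// become a single call insn with a post-increment address.  The updated p
// is needed after the call returns, so it must not be allocated to a
// register that the call clobbers.
void run_init_array(init_fn* p, init_fn* end) {
  while (p != end)
    (*p++)();
}

// Small driver so the loop can be exercised.
int demo_run() {
  init_fn table[] = {count_call, count_call, count_call};
  g_calls = 0;
  run_init_array(table, table + 3);
  return g_calls;
}
```

If the incremented pointer were kept in a call-clobbered register, the loop
would continue from a garbage address after the first call.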


Re: [C++ Patch] Two literal operator template location fixes

2019-05-21 Thread Jason Merrill

On 5/21/19 7:03 AM, Paolo Carlini wrote:

Hi,

also in my back queue a few more location fixes (of course ;) Tested 
x86_64-linux.


OK.



Re: [C++ PATCH] Add test for DR 1940 - static_assert in anonymous unions

2019-05-21 Thread Jason Merrill

On 5/21/19 11:11 AM, Marek Polacek wrote:

DR 1940 clarified that static_assert declarations in anonymous unions are
permitted, but nowhere in the testsuite do we test that.

Tested on x86_64-linux, ok for trunk?

2019-05-21  Marek Polacek  

DR 1940 - static_assert in anonymous unions.
* g++.dg/DRs/dr1940.C: New test.


OK.

Jason
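
For illustration only: a minimal sketch of the construct DR 1940 permits.
This is not the contents of g++.dg/DRs/dr1940.C; Holder and holder_roundtrip
are hypothetical names.

```cpp
#include <cassert>

struct Holder {
  union {
    int i;
    // A static_assert-declaration is allowed among the members of an
    // anonymous union; it declares no member and takes no storage.
    static_assert(sizeof(int) <= sizeof(long), "int must fit in long");
    long l;
  };
};

// Writes and reads back the same union member.
int holder_roundtrip(int v) {
  Holder h;
  h.i = v;
  return h.i;
}
```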



Re: [C++ Patch] PR 67184 ("Missed optimization with C++11 final specifier")

2019-05-21 Thread Jason Merrill

On 5/21/19 12:34 PM, Paolo Carlini wrote:

Hi,

On 21/05/19 16:57, Jason Merrill wrote:

On 5/16/19 7:12 PM, Paolo Carlini wrote:

Hi,

when Roberto Agostino and I implemented the front-end 
devirtualization of final overriders we missed this case, where it 
comes from the base. It seems to me that by way of access_path the 
existing approach can be neatly extended. Tested x86_64-linux.



+  || CLASSTYPE_FINAL (TREE_TYPE (cand->access_path)))


This will give the wrong type when the function is called with an 
explicit scope; you probably want to look at argtype instead.


Yes, thanks, that works fine and is even neater. I'm finishing testing 
the below. As you can see, I also added a line to final3.C where the two 
versions of the call.c conditional give different answers. Note, 
however, that in practice, in terms, say, of dumps, the difference is 
difficult to emphasize because the call would not be virtual anyway (if 
I'm not horribly mistaken ;)


True enough.  OK.

Jason
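
For illustration only: a minimal sketch of the case discussed in this
thread, where the class is final but the virtual function is inherited from
the base rather than overridden. The names are hypothetical and this is not
the final3.C test.

```cpp
#include <cassert>

struct Base {
  virtual int f() { return 1; }
  virtual ~Base() = default;
};

// Derived is final and does not override f, so the final overrider of f
// for any Derived object comes from Base.
struct Derived final : Base {};

// The static type of d is a final class, so no further overrider can
// exist and a front end may resolve this call to Base::f directly.
int call_through_final(Derived& d) { return d.f(); }

// A call with an explicit scope is non-virtual regardless of 'final'.
int call_with_scope(Derived& d) { return d.Base::f(); }
```

Both functions return the same value; the difference is only in whether the
compiler has to go through the vtable to get there.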



Re: [PATCH] tbb-backend effective target should check ability to link TBB

2019-05-21 Thread Jonathan Wakely

On 21/05/19 22:47 +0200, François Dumont wrote:

On 5/21/19 3:50 PM, Jonathan Wakely wrote:

On 20/05/19 21:41 -0700, Thomas Rodgers wrote:

With the addition of "-ltbb" to the v3_target_compile flags (so as to,
you know, actually try to link tbb).

Tested x86_64-linux, committed to trunk.


This didn't work, I still get a FAIL for every pstl test when
tbb.x86_64 and tbb-devel.x86_64 are installed but not tbb.i686.

Adding -v to RUNTESTFLAGS shows -ltbb wasn't being added to the
command, and because the test program didn't actually refer to any TBB
symbols, it still linked successfully.

But the test program still does not refer to any TBB symbol. Why is it
better?


Because -ltbb means the linker will give an error if there is no
suitable libtbb. That's true even if no symbols from it are needed.
Try it.


It looks like it could be a preprocess test.


It used to be a preprocess test, and it didn't work. How can a
preprocess test tell if -ltbb will work?



Re: [PATCH] tbb-backend effective target should check ability to link TBB

2019-05-21 Thread François Dumont

On 5/21/19 3:50 PM, Jonathan Wakely wrote:

On 20/05/19 21:41 -0700, Thomas Rodgers wrote:

With the addition of "-ltbb" to the v3_target_compile flags (so as to,
you know, actually try to link tbb).

Tested x86_64-linux, committed to trunk.


This didn't work, I still get a FAIL for every pstl test when
tbb.x86_64 and tbb-devel.x86_64 are installed but not tbb.i686.

Adding -v to RUNTESTFLAGS shows -ltbb wasn't being added to the
command, and because the test program didn't actually refer to any TBB
symbols, it still linked successfully.

But the test program still does not refer to any TBB symbol. Why is it better?
It looks like it could be a preprocess test.





[PATCH, i386]: Fix PR90547, ICE in gen_lowpart_general

2019-05-21 Thread Uros Bizjak
2019-05-21  Uroš Bizjak  

PR target/90547
* config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

2019-05-21  Uroš Bizjak  

PR target/90547
* gcc.target/i386/pr90547.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN, patch will be backported to all release branches.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 271467)
+++ config/i386/i386.md (working copy)
@@ -8525,6 +8525,14 @@
   operands[2] = shallow_copy_rtx (operands[2]);
   PUT_MODE (operands[2], SImode);
 }
+  else if (GET_CODE (operands[2]) == CONST)
+{
+  /* (const:DI (plus:DI (symbol_ref:DI ("...")) (const_int N))) */
+  operands[2] = copy_rtx (operands[2]);
+  PUT_MODE (operands[2], SImode);
+  PUT_MODE (XEXP (operands[2], 0), SImode);
+  PUT_MODE (XEXP (XEXP (operands[2], 0), 0), SImode);
+}
   else
 operands[2] = gen_lowpart (SImode, operands[2]);
 })
Index: testsuite/gcc.target/i386/pr90547.c
===
--- testsuite/gcc.target/i386/pr90547.c (nonexistent)
+++ testsuite/gcc.target/i386/pr90547.c (working copy)
@@ -0,0 +1,21 @@
+/* PR target/90547 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo ()
+{
+  void *g[] = {&, &};
+
+  for (unsigned c = 0x1F;; c >>= 1)
+{
+  unsigned d = (long)("a"+1);
+  long e = 8;
+
+  while (e)
+{
+  a: goto *g[c];
+  b: e--;
+}
+}
+}


Re: [PATCH PR57534]Support strength reduction for MEM_REF in slsr

2019-05-21 Thread Bill Schmidt
Hi Bin,

On 5/16/19 11:00 AM, Bin.Cheng wrote:
> On Thu, May 16, 2019 at 11:50 PM Bill Schmidt  wrote:
>> Thanks, Bin and Richard -- I am out of the office until Tuesday, so will 
>> review
>> when I get back.  Bin, please CC me on SLSR patches as otherwise I might miss
>> them.  Thanks!
> Thanks for helping.  Will do it next time.
>
> Thanks,
> bin
>> Bill
>>
>>
>> On 5/16/19 6:37 AM, Richard Biener wrote:
>>> On Wed, May 15, 2019 at 6:30 AM bin.cheng  
>>> wrote:
 Hi,
 As noted in PR57534 comment #33, SLSR currently doesn't strength reduce 
 memory
 references in reported cases, which conflicts with its comment at the 
 beginning of file.
 The main reason is in functions slsr_process_ref and restructure_reference 
 which
 rejects MEM_REF by handled_component_p in the first place.  This patch 
 identifies
 and creates CAND_REF for MEM_REF by restructuring base/offset.

 Note the patch only affects cases with multiple reducible MEM_REF.

 Also note, with this patch, [base + cst_offset] addressing mode would be 
 generally
 preferred.  I need to adjust three existing tests:
 * gcc.dg/tree-ssa/isolate-5.c
 * gcc.dg/tree-ssa/ssa-hoist-4.c
 Though address computation is reduced out of memory reference, the 
 generated
 assembly is not worsened.

 * gcc.dg/tree-ssa/slsr-3.c
 The generated assembly has two more instructions:
 <   movslq  %edx, %rcx
 <   movl(%rsi,%rcx,4), %r9d
 <   leaq0(,%rcx,4), %rax
 <   leal2(%r9), %r8d
 <   movl%r8d, (%rdi,%rcx,4)
 <   movl4(%rsi,%rax), %ecx
 <   addl$2, %ecx
 <   movl%ecx, 4(%rdi,%rax)
 <   movl8(%rsi,%rax), %ecx
 <   addl$2, %ecx
 <   movl%ecx, 8(%rdi,%rax)
 <   movl12(%rsi,%rax), %ecx
 <   addl$2, %ecx
 <   movl%ecx, 12(%rdi,%rax)
 ---
>   movslq  %edx, %rax
>   salq$2, %rax
>   addq%rax, %rsi
>   addq%rax, %rdi
>   movl(%rsi), %eax
>   addl$2, %eax
>   movl%eax, (%rdi)
>   movl4(%rsi), %eax
>   addl$2, %eax
>   movl%eax, 4(%rdi)
>   movl8(%rsi), %eax
>   addl$2, %eax
>   movl%eax, 8(%rdi)
>   movl12(%rsi), %eax
>   addl$2, %eax
>   movl%eax, 12(%rdi)
 Seems to me this is not a deterioration, and "salq" can be saved by two 
 forward propagations.

 Bootstrap and test on x86_64, any comments?
>>> The idea is good I think and the result above as well.  Leaving for Bill
>>> to have a look as well, otherwise OK.

Once again, I lost the original patch due to some glitch in my mail server,
so I apologize for not quoting the patch.

I'm initially okay with all of it until we get to the "stride_cand" treatment in
the second half of restructure_base_offset.  It looks to me like the 
transformation
here is not necessarily a good one for architectures that don't support x86's 
elaborate set of addressing modes.  Can you convince me otherwise?

I'm not certain that any of the test cases that have been modified won't be
degradations for other architectures.  Did you do any code gen comparisons 
there?
Seems like we need results from Power and Aarch64 at a minimum.

Perhaps the existing MEM_REFs are already ugly enough that the replacement 
doesn't
make things worse; I just can't prove it to myself without examples.  Could you
please run some experiments?

This is otherwise okay if we can get past that concern.

Thanks!
Bill

>>>
>>> Thanks,
>>> Richard.
>>>
 Thanks,
 bin

 2019-05-15  Bin Cheng  

 PR tree-optimization/57534
 * gimple-ssa-strength-reduction.c (restructure_base_offset): New.
 (restructure_reference): Call restructure_base_offset when offset 
 is
 NULL.
 (slsr_process_ref): Handle MEM_REF.

 2018-05-15  Bin Cheng  

 PR tree-optimization/57534
 * gcc.dg/tree-ssa/pr57534.c: New test.
 * gcc.dg/tree-ssa/isolate-5.c: Adjust checking strings.
 * gcc.dg/tree-ssa/slsr-3.c: Ditto.
 * gcc.dg/tree-ssa/ssa-hoist-4.c: Ditto.



Re: preserve more debug stmts in gimple jump threading

2019-05-21 Thread Alexandre Oliva
On May 17, 2019, Jeff Law  wrote:

> OK.  Presumably creating a reliable testcase was painful?

Heh, that it might even be possible didn't even cross my mind.  I was happy
enough to be able to exercise and inspect the before IR for most
of the new code paths in the patch in a GCC bootstrap, by placing
gcc_assert(gcc_stop_here_*||getenv("GCC_STOP_HERE_*")) at various
places, so that I could stop to inspect, skip them in a debugger or in a
full run to look at the compiler dumps.  That alone took much longer
than I had to complete it :-(

-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara


Re: dg-require-ifunc syntax

2019-05-21 Thread Rainer Orth
Rainer Orth  writes:

> Hi Iain,
>
>> Hi Dominique,
>>
>>> On 19 May 2019, at 15:10, Dominique d'Humières  wrote:
>>> 
>>> AFAICT the syntax for dg-require-ifunc seems to be
>>> 
>>> /* { dg-require-ifunc "" } */
>>> 
>>> with two sets of exceptions:
>>> 
>>> (1) gcc.target/i386/pr90500-*.c
>>> 
>>> which explains
>>> 
>>> FAIL: gcc.target/i386/pr90500-1.c  (test for errors, line 6)
>>> FAIL: gcc.target/i386/pr90500-1.c  (test for warnings, line 6)
>>> FAIL: gcc.target/i386/pr90500-1.c (test for excess errors)
>>> FAIL: gcc.target/i386/pr90500-2.c  (test for errors, line 6)
>>> FAIL: gcc.target/i386/pr90500-2.c  (test for warnings, line 6)
>>> FAIL: gcc.target/i386/pr90500-2.c (test for excess errors)
>>> 
>>> and is fixed with the trivial patch
>>> 
>>> --- ../_clean/gcc/testsuite/gcc.target/i386/pr90500-1.c 2019-05-16
>>> 17:34:09.0 +0200
>>> +++ gcc/testsuite/gcc.target/i386/pr90500-1.c 2019-05-18
>>> 14:28:12.0 +0200
>>> @@ -1,6 +1,6 @@
>>> /* PR middle-end/84723 */
>>> /* { dg-do compile } */
>>> -/* { dg-require-ifunc } */
>>> +/* { dg-require-ifunc "" } */
>>> 
>>> __attribute__((target_clones("arch=haswell", "default"))) int __tanh() {}
>>> __typeof(__tanh) tanhf64 __attribute__((alias("__tanh")))/* { dg-error
>>> "clones for .target_clones. attribute cannot be created" } */
>>
>> From a Darwin point of view, this is OK (it seems obvious to me, also).
>
> indeed: I have the same issue on Solaris, too.
>
>>> --- ../_clean/gcc/testsuite/gcc.target/i386/pr90500-2.c 2019-05-16
>>> 17:34:09.0 +0200
>>> +++ gcc/testsuite/gcc.target/i386/pr90500-2.c 2019-05-18
>>> 14:28:25.0 +0200
>>> @@ -1,6 +1,6 @@
>>> /* PR middle-end/84723 */
>>> /* { dg-do compile } */
>>> -/* { dg-require-ifunc } */
>>> +/* { dg-require-ifunc "" } */
>>> 
>>> __attribute__((target_clones("arch=haswell", "default"))) int __tanh() {}
>>> __typeof(__tanh) tanhf64
>>> __attribute__((alias("__tanh"),target_clones("arch=haswell",
>>> "default"))); /* { dg-error "clones for .target_clones. attribute cannot
>>> be created" } */
>>> 
>>> (2) gcc.target/i386/pr84723-*.c
>>> 
>>> which succeed on darwin. What is the suitable fix for that?
>>
>> My assumption here is that the tests should not be run on a non-ifuncs 
>> target, 
>> but that it happens to be that they are testing for an erroneous condition 
>> - which by chance also gives the correct error on a non-ifuncs target.
>>> 
>>> (a) Fix the dg-require-ifunc as above?
>>
>> I would prefer this, (it’s confusing to run tests for an unsupported
>> functionality)
>> - unless there is some other value to running the tests (will wait for 
>> comments
>> on that).
>
> Right: I guess we can wait for Jakub's take on that.  The tests PASS on
> Solaris/x86 as well, which hasn't ifunc support either, and there are no
> gcc-testresults postings showing failures for this test anywhere.
>
> Prompted by the initial bug, I looked around a bit and found some more
> instances of the dg-require-* syntax problem.  I'll commit them either
> together with the rest or separately, since this stuff tends to be
> copied around.

Given no response from Jakub and no indication that
gcc.target/i386/pr84723-?.c fails anywhere, I've installed the following
patch on mainline.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2019-05-17  Rainer Orth  

* gcc.dg/Wattribute-alias.c: Pass empty arg to dg-require-ifunc.

* gcc.c-torture/execute/20030125-1.c: Pass empty arg to dg-require-weak.

* gcc.dg/torture/ftrapv-2.c: Pass empty arg to dg-require-fork.

* gcc.target/i386/pr84723-1.c: Remove dg-require-ifunc.
* gcc.target/i386/pr84723-2.c: Likewise.
* gcc.target/i386/pr84723-3.c: Likewise.
* gcc.target/i386/pr84723-4.c: Likewise.
* gcc.target/i386/pr84723-5.c: Likewise.

# HG changeset patch
# Parent  20c3c82fdcb626b5183ad78d127d3c0679bc257c
Fix dg-require-* syntax

diff --git a/gcc/testsuite/gcc.c-torture/execute/20030125-1.c b/gcc/testsuite/gcc.c-torture/execute/20030125-1.c
--- a/gcc/testsuite/gcc.c-torture/execute/20030125-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20030125-1.c
@@ -1,6 +1,6 @@
 /* Verify whether math functions are simplified.  */
 /* { dg-require-effective-target c99_runtime } */
-/* { dg-require-weak } */
+/* { dg-require-weak "" } */
 double sin(double);
 double floor(double);
 float 
diff --git a/gcc/testsuite/gcc.dg/Wattribute-alias.c b/gcc/testsuite/gcc.dg/Wattribute-alias.c
--- a/gcc/testsuite/gcc.dg/Wattribute-alias.c
+++ b/gcc/testsuite/gcc.dg/Wattribute-alias.c
@@ -1,6 +1,6 @@
 /* PR middle-end/81824 - Warn for missing attributes with function aliases
{ dg-do compile }
-   { dg-require-ifunc "require ifunc support" }
+   { dg-require-ifunc "" }
{ dg-options "-Wall -Wattribute-alias=2" } */
 
 

[PATCH][GCC][AArch64] Make processing less fragile in config.gcc

2019-05-21 Thread Tamar Christina
Hi All,

Because config.gcc extracts each option with grep, which selects only the first
line of an entry, all the options currently need to be on one line.

This causes it not to select the right bits on options that are spread over
multiple lines when the --with-arch configure option is used.  The issue happens
silently and you just get a compiler with an incorrect set of default flags.

The current rules are quite rigid:

   1) No space between the AARCH64_OPT_EXTENSION and the opening (.
   2) No space between the opening ( and the extension name.
   3) No space after the extension name before the ,.
   4) Spaces are only allowed after a , and around |.

This patch makes this a lot less fragile by using the C pre-processor to flatten
the list and then provides much more flexible regex using group matching to
process the options instead of string replacement.  This removes all the
restrictions above and makes the code a bit more readable.
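
For illustration, the group-matching idea described above can be sketched in
isolation.  This is a minimal sketch under stated assumptions, not the code
from the patch: the sample entry in `options_parsed`, its FLAG_* names and the
`ext_field` helper are made up, and the element pattern uses the POSIX
`\{1,\}` interval instead of `\+`.

```shell
#!/bin/sh
# Hypothetical stand-in for the preprocessed .def output: one flattened
# entry per line.  The entry and its FLAG_* names are invented.
options_parsed='"demo", FLAG_DEMO,   FLAG_BASE | FLAG_OTHER, FLAG_DEP, false, "demo feat"'

# One element: optional leading blanks, then anything that is not a comma.
elem='[ ]*\([^,]\{1,\}\)[ ]*'

# Six elements per entry, separated by commas.
sed_patt="^$elem,$elem,$elem,$elem,$elem,$elem"

# ext_field NAME N: print field N (1-6) of the entry named NAME.
ext_field () {
  opt_line=$(printf '%s\n' "$options_parsed" | grep "^\"$1\"")
  printf '%s\n' "$opt_line" | sed -e "s/$sed_patt/\\$2/"
}

ext_field demo 2   # second field, the canonical flag
ext_field demo 3   # third field, the flags to turn on
ext_field demo 4   # fourth field, the flags to turn off
```

Because the `[^,]\{1,\}` group is greedy, blanks directly in front of a comma
stay inside the captured field; the sample entry avoids them on purpose.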

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

OK for trunk, and for an eventual backport?

Thanks,
Tamar

gcc/ChangeLog:

2019-05-21  Tamar Christina  

PR target/89517
* config.gcc: Relax parsing of AARCH64_OPT_EXTENSION.
* config/aarch64/aarch64-option-extensions.def: Add new comments
and restore easier to read options.

-- 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 76bb316942d93448e6354a89a8266118ee767601..effd42e8612ff0432f7b4dc656531a9cb11f99cd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3828,32 +3828,40 @@ case "${target}" in
   sed -e 's/,.*$//'`
 			  fi
 
+			  # Use the pre-processor to strip flatten the options.
+			  # This makes the format less rigid than if we use
+			  # grep and sed directly here.
+			  opt_macro="AARCH64_OPT_EXTENSION(A, B, C, D, E, F)=A, B, C, D, E, F"
+			  options_parsed="`$ac_cv_prog_CPP -D"$opt_macro" -x c \
+${srcdir}/config/aarch64/aarch64-option-extensions.def`"
+
+			  # Match one element inside AARCH64_OPT_EXTENSION, we
+			  # consume anything that's not a ,.
+			  elem="[ 	]*\([^,]\+\)[ 	]*"
+
+			  # Repeat the pattern for the number of entries in the
+			  # AARCH64_OPT_EXTENSION, currently 6 times.
+			  sed_patt="^$elem,$elem,$elem,$elem,$elem,$elem"
+
 			  while [ x"$ext_val" != x ]
 			  do
 ext_val=`echo $ext_val | sed -e 's/\+//'`
 ext=`echo $ext_val | sed -e 's/\+.*//'`
 base_ext=`echo $ext | sed -e 's/^no//'`
+opt_line=`echo -e "$options_parsed" | \
+	grep "^\"$base_ext\""`
 
 if [ x"$base_ext" = x ] \
-|| grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-${srcdir}/config/aarch64/aarch64-option-extensions.def \
-> /dev/null; then
-
-  ext_canon=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//'`
-  ext_on=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//' | \
-	sed -e 's/).*$//'`
-  ext_off=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*[^,]*,[ 	]*[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//' | \
-	sed -e 's/).*$//'`
-
+|| [[ -n $opt_line ]]; then
+
+  # These regexp extract the elements based on
+  # their group match index in the regexp.
+  ext_canon=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\2/"`
+  ext_on=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\3/"`
+  ext_off=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\4/"`
 
   if [ $ext = $base_ext ]; then
 	# Adding extension
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 53dcd03590d2e4eebac83f03039c442fca7f5d5d..9b1f11c4170793aefda67beb4b56cacdf6f527fd 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -45,29 +45,34 @@
  entries: aes, pmull, sha1, sha2 being present).  In that case this field
  should contain a space (" ") separated list of the strings in 'Features'
  that are required.  Their order is not important.  An empty string means
- do not detect this feature during auto detection.  */
+ do not detect this feature during auto detection.
 
-/* NOTE: This file is being parsed by config.gcc and so the
-   AARCH64_OPT_EXTENSION must adhere to a strict format:
-   1) No space between the AARCH64_OPT_EXTENSION and the opening (.
-   2) No space between the opening ( and the extension name.
-   3) No space after the extension name before the ,.
-   4) Spaces are only allowed after a , and around |.
-   5) Everything must be on one line.  */
+ NOTE: Any changes to the AARCH64_OPT_EXTENSION macro need to be mirrored in
+ config.gcc.  */
 
 /* Enabling "fp" just enables 

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-05-21 Thread Segher Boessenkool
Hi Joseph,

On Mon, May 20, 2019 at 10:42:45PM +, Joseph Myers wrote:
> (SVN will remain 
> readonly, just as the old CVS repository remains available readonly).

> That the git-svn mirror is useful for many purposes for which people want 
> to use git also provides a clear argument against needing to do the final 
> conversion in a hurry;

Right.  But trying to correct the ancient history in the repo isn't useful
*anyway*.  One much bigger problem is that very often very unrelated things
are committed at the same time, in big omnibus patches.  Another much
bigger problem is that when you are doing the kind of archeology where this
matters, you need to have the old email archives anyway, which aren't
available.

> I think having author names and email addresses is a basic requirement of 
> any reasonable repository conversion

Yes, and they should be the same as they were in the original repository.


Segher


[PATCH, Darwin, testsuite] Fix PR67958

2019-05-21 Thread Iain Sandoe
This is a series of tests that require specific scan-asms in some cases
because of the different codegen for Darwin.
Added some explanations too.

tested on x86_64-darwin and x86_64-linux-gnu
applied to mainline,
Iain

gcc/testsuite/

2019-05-21  Iain Sandoe  

PR testsuite/67958
* gcc.target/i386/pr32219-1.c: Adjust scan-asms for Darwin, comment
the differences.
* gcc.target/i386/pr32219-2.c: Likewise.
* gcc.target/i386/pr32219-3.c: Likewise.
* gcc.target/i386/pr32219-4.c: Likewise.
* gcc.target/i386/pr32219-5.c: Likewise.
* gcc.target/i386/pr32219-6.c: Likewise.
* gcc.target/i386/pr32219-7.c: Likewise.
* gcc.target/i386/pr32219-8.c: Likewise.

2019-05-21  Iain Sandoe  
 
PR target/63891
* gcc.dg/darwin-weakimport-3.c: Adjust options and explain
diff --git a/gcc/testsuite/gcc.target/i386/pr32219-1.c b/gcc/testsuite/gcc.target/i386/pr32219-1.c
index bb28f9f..0fcb138 100644
--- a/gcc/testsuite/gcc.target/i386/pr32219-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr32219-1.c
@@ -12,7 +12,11 @@ foo ()
   return xxx;
 }
 
-/* { dg-final { scan-assembler "movl\[ \t\]xxx\\(%rip\\), %eax" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-not "xxx@GOTPCREL" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %eax" { target ia32 } } } */
-/* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler {movl[ \t]_?xxx\(%rip\),[ \t]%eax} { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not "_?xxx@GOTPCREL" { target { ! ia32 } } } } */
+
+/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %eax" { target { ia32 && { ! *-*-darwin* } } } } } */
+/* { dg-final { scan-assembler-not "movl\[ \t\]_?xxx@GOT\\(%\[^,\]*\\), %eax" { target { ia32 && { ! *-*-darwin* } } } } } */
+
+/* For Darwin, we default to PIC - but that's needed for Darwin's PIE.  */
+/* { dg-final { scan-assembler {movl[ \t]_xxx-L1\$pb\(%eax\),[ \t]%eax} { target { ia32 && *-*-darwin* } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr32219-2.c b/gcc/testsuite/gcc.target/i386/pr32219-2.c
index b30862d..cb587db 100644
--- a/gcc/testsuite/gcc.target/i386/pr32219-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr32219-2.c
@@ -12,6 +12,13 @@ foo ()
 }
 
 /* { dg-final { scan-assembler-not "movl\[ \t\]xxx\\(%rip\\), %" { target { ! ia32 } } } } */
+/* For Darwin m64 we are always PIC, but common symbols are indirected, which happens to
+   match the general "ELF" case.  */
 /* { dg-final { scan-assembler "xxx@GOTPCREL" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %" { target ia32 } } } */
-/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %" { target ia32 } } } */
+
+/* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %" { target { ia32 && { ! *-*-darwin* } } } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %" { target { ia32 && { ! *-*-darwin* } } } } } */
+
+/* Darwin m32 defaults to PIC but common symbols need to be indirected.  */
+/* { dg-final { scan-assembler {movl[ \t]l_xxx\$non_lazy_ptr-L1\$pb\(%eax\),[ \t]%eax} { target { ia32 && *-*-darwin* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/i386/pr32219-3.c b/gcc/testsuite/gcc.target/i386/pr32219-3.c
index 657fb78..f9cfca7 100644
--- a/gcc/testsuite/gcc.target/i386/pr32219-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr32219-3.c
@@ -12,7 +12,16 @@ foo ()
   return xxx;
 }
 
-/* { dg-final { scan-assembler "movl\[ \t\]xxx\\(%rip\\), %eax" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-not "xxx@GOTPCREL" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %eax" { target ia32 } } } */
-/* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler {movl[ \t]xxx\(%rip\),[ \t]%eax} { target { { ! ia32 } && { ! *-*-darwin* } } } } } */
+/* { dg-final { scan-assembler-not "xxx@GOTPCREL" { target { { ! ia32 } && { ! *-*-darwin* } } } } } */
+
+/* For Darwin m64, code is always PIC and we need to indirect through the GOT to allow
+   weak symbols to be interposed.  The dynamic loader knows how to apply PIE to this.  */
+/* { dg-final { scan-assembler {movq[ \t]_xxx@GOTPCREL\(%rip\),[ \t]%rax} { target { { ! ia32 } && *-*-darwin* } } } } */
+
+/* { dg-final { scan-assembler "movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %eax" { target { ia32 && { ! *-*-darwin* } } } } } */
+/* { dg-final { scan-assembler-not "movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %eax" { target { ia32 && { ! *-*-darwin* } } } } } */
+
+/* For Darwin, we need PIC to allow PIE, but also we must indirect weak symbols so that
+   they can be indirected.  Again, dyld knows how to deal with this. */
+/* { dg-final { scan-assembler {movl[ 

[PATCH, Darwin, testsuite] Fix PR 63891.

2019-05-21 Thread Iain Sandoe
This is a testcase failing because one part of the codegen is (correctly)
generating the scan-asm-not signature.

Fixed by altering the build options. 

tested on x86_64-darwin16, applied to mainline,
Iain

gcc/testsuite/

2019-05-21  Iain Sandoe  

   PR target/63891
   * gcc.dg/darwin-weakimport-3.c: Adjust options and explain
   the reasons.

 2019-05-21  Uroš Bizjak  
 
* gcc.target/i386/vect-signbitf.c: New test.

diff --git a/gcc/testsuite/gcc.dg/darwin-weakimport-3.c b/gcc/testsuite/gcc.dg/darwin-weakimport-3.c
index 77ab980..a15b5b0 100644
--- a/gcc/testsuite/gcc.dg/darwin-weakimport-3.c
+++ b/gcc/testsuite/gcc.dg/darwin-weakimport-3.c
@@ -1,5 +1,20 @@
 /* { dg-do compile { target *-*-darwin* } } */
-/* { dg-options "-fno-asynchronous-unwind-tables" } */
+
+/* Here we want to test if "foo" gets placed into a coalesced
+   section (it should not).
+
+   However, for i386, and PIC code we have a "get_pc thunk" that
+   is (correctly) placed in a coalesced section when using an older
+   linker - also unwind tables are emitted into coalesced.
+
+   With modern linkers this is moot, since even weak symbols
+   are emitted into the regular sections.
+
+   To avoid the unwind tables -fno-asynchronous-unwind-tables.
+   To ensure that we emit code for an older linker -mtarget-linker
+   To avoid the get_pc thunk optimise at least O1.  */
+
+/* { dg-options "-fno-asynchronous-unwind-tables -O1 -mtarget-linker 85.2" } */
 /* { dg-require-weak "" } */
 
 /* { dg-final { scan-assembler-not "coalesced" } } */



Re: [PATCH GCC]Correct cand_chain and stmt_cand_map for copy/cast

2019-05-21 Thread Bill Schmidt
On 5/15/19 6:09 AM, Richard Biener wrote:
> On Wed, May 15, 2019 at 7:54 AM bin.cheng  wrote:
>> Hi,
>> I noticed that cand_chain (first_interp/next_interp) is not maintained 
>> correctly
>> in slsr_process_copy/slsr_process_cast (now slsr_process_copycast).  This one
>> fixes the issue, as well as records the "first" cand in stmt_cand_map.
>>
>> Hi Bill, is this correct, or did I misunderstand the code? Bootstrapped and tested on
>> x86_64.
> Looks good to me, still waiting for Bill's feedback (now CCed).

Good catch, thanks for fixing!  Okay for trunk.

My mail server seems to have eaten the patch that introduces slsr_process_copycast.
This is okay for trunk also, with one grammatical nit.  Please change the comment
before alloc_cand_and_find_basis to read:

"In case of a cast, the type is not propagated because it comes from the cast;
nor is the base candidate's cast, which is no longer applicable."

I'm very sorry for the delay due to vacation.

Thanks!
Bill

>
> Richard.
>
>> Thanks,
>> bin
>>
>> 2019-05-15  Bin Cheng  
>>
>> * gimple-ssa-strength-reduction.c (slsr_process_copycast): Record
>> information about next_interp and the first cand.



Re: [C++ Patch] PR 67184 ("Missed optimization with C++11 final specifier")

2019-05-21 Thread Paolo Carlini

Hi,

On 21/05/19 16:57, Jason Merrill wrote:

On 5/16/19 7:12 PM, Paolo Carlini wrote:

Hi,

when Roberto Agostino and I implemented the front-end 
devirtualization of final overriders we missed this case, where it 
comes from the base. It seems to me that by way of access_path the 
existing approach can be neatly extended. Tested x86_64-linux.



+  || CLASSTYPE_FINAL (TREE_TYPE (cand->access_path)))


This will give the wrong type when the function is called with an 
explicit scope; you probably want to look at argtype instead.


Yes, thanks, that works fine and is even neater. I'm finishing testing 
the below. As you can see, I also added a line to final3.C where the two 
versions of the call.c conditional give different answers. Note, 
however, that in practice, in terms, say, of dumps, the difference is 
difficult to emphasize because the call would not be virtual anyway (if 
I'm not horribly mistaken ;)


Thanks, Paolo.



Index: cp/call.c
===
--- cp/call.c   (revision 271459)
+++ cp/call.c   (working copy)
@@ -8244,7 +8244,7 @@ build_over_call (struct z_candidate *cand, int fla
   /* See if the function member or the whole class type is declared
 final and the call can be devirtualized.  */
   if (DECL_FINAL_P (fn)
- || CLASSTYPE_FINAL (TYPE_METHOD_BASETYPE (TREE_TYPE (fn))))
+ || CLASSTYPE_FINAL (TREE_TYPE (argtype)))
flags |= LOOKUP_NONVIRTUAL;
 
   /* [class.mfct.nonstatic]: If a nonstatic member function of a class
Index: testsuite/g++.dg/other/final3.C
===
--- testsuite/g++.dg/other/final3.C (nonexistent)
+++ testsuite/g++.dg/other/final3.C (working copy)
@@ -0,0 +1,27 @@
+// PR c++/67184
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdump-tree-original"  }
+
+struct V {
+ virtual void foo(); 
+};
+
+struct wV final : V {
+};
+
+struct oV final : V {
+  void foo();
+};
+
+void call(wV& x)
+{
+  x.foo();
+  x.V::foo();
+}
+
+void call(oV& x)
+{
+  x.foo();
+}
+
+// { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "original" } }
Index: testsuite/g++.dg/other/final4.C
===
--- testsuite/g++.dg/other/final4.C (nonexistent)
+++ testsuite/g++.dg/other/final4.C (working copy)
@@ -0,0 +1,16 @@
+// PR c++/67184
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdump-tree-original"  }
+
+struct B
+{
+  virtual void operator()();
+  virtual operator int();
+  virtual int operator++();
+};
+
+struct D final : B { };
+
+void foo(D& d) { d(); int t = d; ++d; }
+
+// { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "original" } }
Index: testsuite/g++.dg/other/final5.C
===
--- testsuite/g++.dg/other/final5.C (nonexistent)
+++ testsuite/g++.dg/other/final5.C (working copy)
@@ -0,0 +1,19 @@
+// PR c++/69445
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdump-tree-original"  }
+
+struct Base {
+  virtual void foo() const = 0;
+  virtual void bar() const {}
+};
+
+struct C final : Base {
+  void foo() const { }
+};
+
+void func(const C & c) {
+  c.bar();
+  c.foo();
+}
+
+// { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "original" } }


[PATCH 6/6] rs6000: wz -> d+p7

2019-05-21 Thread Segher Boessenkool
2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wz"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wz.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md: Replace "wz" constraint by "d" with "p7".
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md | 3 ---
 gcc/config/rs6000/rs6000.c   | 8 +---
 gcc/config/rs6000/rs6000.h   | 1 -
 gcc/config/rs6000/rs6000.md  | 8 
 gcc/doc/md.texi  | 3 ---
 5 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 802ce44..fd8be34 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -103,9 +103,6 @@ (define_register_constraint "ww" "rs6000_constraints[RS6000_CONSTRAINT_ww]"
 (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
-(define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
-  "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
-
 (define_register_constraint "wA" "rs6000_constraints[RS6000_CONSTRAINT_wA]"
   "BASE_REGS if 64-bit instructions are enabled or NO_REGS.")
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6124bce..54a3261 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2521,7 +2521,6 @@ rs6000_debug_reg_global (void)
   "wv reg_class = %s\n"
   "ww reg_class = %s\n"
   "wx reg_class = %s\n"
-  "wz reg_class = %s\n"
   "wA reg_class = %s\n"
   "\n",
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
@@ -2541,7 +2540,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ww]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wA]]);
 
   nl = "\n";
@@ -3160,8 +3158,7 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wt - VSX register for TImode in VSX registers.
wv - Altivec register for ISA 2.06 VSX DF/DI load/stores.
ww - Register class to do SF conversions in with VSX operations.
-   wx - Float register if we can do 32-bit int stores.
-   wz - Float register if we can do 32-bit unsigned int loads.  */
+   wx - Float register if we can do 32-bit int stores.  */
 
   if (TARGET_HARD_FLOAT)
 {
@@ -3202,9 +3199,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_STFIWX)
 rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS; /* DImode  */
 
-  if (TARGET_LFIWZX)
-rs6000_constraints[RS6000_CONSTRAINT_wz] = FLOAT_REGS; /* DImode  */
-
   if (TARGET_FLOAT128_TYPE)
 {
   rs6000_constraints[RS6000_CONSTRAINT_wq] = VSX_REGS; /* KFmode  */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 176f34d..fb94901 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1262,7 +1262,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wv,/* Altivec register for double load/stores.  */
   RS6000_CONSTRAINT_ww,/* FP or VSX register for vsx float ops.  */
   RS6000_CONSTRAINT_wx,/* FPR register for STFIWX */
-  RS6000_CONSTRAINT_wz,/* FPR register for LFIWZX */
   RS6000_CONSTRAINT_wA,/* BASE_REGS if 64-bit.  */
   RS6000_CONSTRAINT_MAX
 };
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 454518e..32c41f3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -834,7 +834,7 @@ (define_insn_and_split "*zero_extendhi2_dot2"
 
 
 (define_insn "zero_extendsi2"
-  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wz,wa,wi,r,wa")
+  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,d,wa,wi,r,wa")
(zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" 
"m,r,Z,Z,r,wa,wa")))]
   ""
   "@
@@ -846,7 +846,7 @@ (define_insn "zero_extendsi2"
mfvsrwz %0,%x1
xxextractuw %x0,%x1,4"
   [(set_attr "type" "load,shift,fpload,fpload,mffgpr,mftgpr,vecexts")
-   (set_attr "isa" "*,*,*,p8v,p8v,p8v,p9v")])
+   (set_attr "isa" "*,*,p7,p8v,p8v,p8v,p9v")])
 
 (define_insn_and_split "*zero_extendsi2_dot"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
@@ -7349,7 +7349,7 @@ (define_insn "movsf_hardfloat"
 ;; FMR  MR MT%0   MF%1   NOP
 (define_insn "movsd_hardfloat"
   [(set (match_operand:SD 0 "nonimmediate_operand"
-"=!r,   wz,  

[PATCH 5/6] rs6000: wl -> d+p6

2019-05-21 Thread Segher Boessenkool
2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wl"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wl.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md: Replace "wl" constraint by "d" with "p6".
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md | 3 ---
 gcc/config/rs6000/rs6000.c   | 6 --
 gcc/config/rs6000/rs6000.h   | 1 -
 gcc/config/rs6000/rs6000.md  | 4 ++--
 gcc/doc/md.texi  | 5 +
 5 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 90a94c1..802ce44 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -74,9 +74,6 @@ (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
-(define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
-  "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
-
 ;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
 ;; direct move directly, and movsf can't to move between the register sets.
 ;; There is a mode_attr that resolves to wa for SDmode and wn for SFmode
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d6ffc36..6124bce 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2513,7 +2513,6 @@ rs6000_debug_reg_global (void)
   "wf reg_class = %s\n"
   "wg reg_class = %s\n"
   "wi reg_class = %s\n"
-  "wl reg_class = %s\n"
   "wp reg_class = %s\n"
   "wq reg_class = %s\n"
   "wr reg_class = %s\n"
@@ -2534,7 +2533,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wp]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wq]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
@@ -3156,7 +3154,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wf - Preferred register class for V4SFmode.
wg - Float register for power6x move insns.
wi - FP or VSX register to hold 64-bit integers for VSX insns.
-   wl - Float register if we can do 32-bit signed int loads.
wn - always NO_REGS.
wr - GPR if 64-bit mode is permitted.
ws - Register class to do ISA 2.06 DF operations.
@@ -3191,9 +3188,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_MFPGPR)   /* DFmode  */
 rs6000_constraints[RS6000_CONSTRAINT_wg] = FLOAT_REGS;
 
-  if (TARGET_LFIWAX)
-rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS; /* DImode  */
-
   if (TARGET_POWERPC64)
 {
   rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 27055a6..176f34d 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1254,7 +1254,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wf,/* VSX register for V4SF */
   RS6000_CONSTRAINT_wg,/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wi,/* FPR/VSX register to hold DImode */
-  RS6000_CONSTRAINT_wl,/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wp,/* VSX reg for IEEE 128-bit fp TFmode. */
   RS6000_CONSTRAINT_wq,/* VSX reg for IEEE 128-bit fp KFmode.  */
   RS6000_CONSTRAINT_wr,/* GPR register if 64-bit  */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 33a6de7..454518e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1019,7 +1019,7 @@ (define_insn_and_split "*extendhi2_dot2"
 
 (define_insn "extendsi2"
   [(set (match_operand:EXTSI 0 "gpc_reg_operand"
-"=r, r,   wl,wa,wi,v,  v, wr")
+"=r, r,   d, wa,wi,v,  v, wr")
(sign_extend:EXTSI (match_operand:SI 1 "lwa_operand"
 "YZ, r,   Z, Z, r, v,  v, ?wa")))]
   ""
@@ -1035,7 +1035,7 @@ (define_insn "extendsi2"
   [(set_attr "type" "load,exts,fpload,fpload,mffgpr,vecexts,vecperm,mftgpr")
(set_attr "sign_extend" "yes")
(set_attr "length" "4,4,4,4,4,4,8,8")
-   (set_attr "isa" 

[PATCH 3/6] rs6000: wk -> ws+p8v

2019-05-21 Thread Segher Boessenkool
2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wk"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wk.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md: Replace "wk" constraint by "ws" with "p8v".
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md | 3 ---
 gcc/config/rs6000/rs6000.c   | 9 +
 gcc/config/rs6000/rs6000.h   | 1 -
 gcc/config/rs6000/rs6000.md  | 2 +-
 gcc/doc/md.texi  | 5 +
 5 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 9f315e4..6f60627 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -74,9 +74,6 @@ (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
-(define_register_constraint "wk" "rs6000_constraints[RS6000_CONSTRAINT_wk]"
-  "FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.")
-
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 76c80a4..718535f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2513,7 +2513,6 @@ rs6000_debug_reg_global (void)
   "wf reg_class = %s\n"
   "wg reg_class = %s\n"
   "wi reg_class = %s\n"
-  "wk reg_class = %s\n"
   "wl reg_class = %s\n"
   "wm reg_class = %s\n"
   "wp reg_class = %s\n"
@@ -2536,7 +2535,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wk]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wp]],
@@ -3160,7 +3158,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wf - Preferred register class for V4SFmode.
wg - Float register for power6x move insns.
wi - FP or VSX register to hold 64-bit integers for VSX insns.
-   wk - FP or VSX register to hold 64-bit doubles for direct moves.
wl - Float register if we can do 32-bit signed int loads.
wm - VSX register for ISA 2.07 direct move operations.
wn - always NO_REGS.
@@ -3201,11 +3198,7 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
 rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS; /* DImode  */
 
   if (TARGET_DIRECT_MOVE)
-{
-  rs6000_constraints[RS6000_CONSTRAINT_wk] /* DFmode  */
-   = rs6000_constraints[RS6000_CONSTRAINT_ws];
-  rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
-}
+rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
 
   if (TARGET_POWERPC64)
 {
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 218ed10..cc60559 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1254,7 +1254,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wf,/* VSX register for V4SF */
   RS6000_CONSTRAINT_wg,/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wi,/* FPR/VSX register to hold DImode */
-  RS6000_CONSTRAINT_wk,/* FPR/VSX register for DFmode direct moves. */
   RS6000_CONSTRAINT_wl,/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,/* VSX register for direct move */
   RS6000_CONSTRAINT_wp,/* VSX reg for IEEE 128-bit fp TFmode. */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9a986a1..33a6de7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -471,7 +471,7 @@ (define_mode_attr zero_fp [(SF "j")
 (define_mode_attr f64_vsx [(DF "ws") (DD "wn")])
 
 ; Definitions for 64-bit direct move
-(define_mode_attr f64_dm  [(DF "wk") (DD "d")])
+(define_mode_attr f64_dm  [(DF "ws") (DD "d")])
 
 ; Definitions for 64-bit use of altivec registers
 (define_mode_attr f64_av  [(DF "wv") (DD "wn")])
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 55de2f1..13a621d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3197,7 +3197,7 @@ Altivec vector register
 Any VSX register if the @option{-mvsx} option was used or NO_REGS.
 
 When using any of the register constraints 

[PATCH 4/6] rs6000: wm -> wa+p8v

2019-05-21 Thread Segher Boessenkool
2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wm"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wm.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/vsx.md: Replace "wm" constraint by "wa" with "p8v".
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md | 5 +
 gcc/config/rs6000/rs6000.c   | 6 --
 gcc/config/rs6000/rs6000.h   | 1 -
 gcc/config/rs6000/vsx.md | 6 ++
 gcc/doc/md.texi  | 5 +
 5 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 6f60627..90a94c1 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -77,12 +77,9 @@ (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
 
-(define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]"
-  "VSX register if direct move instructions are enabled, or NO_REGS.")
-
 ;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
 ;; direct move directly, and movsf can't to move between the register sets.
-;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
+;; There is a mode_attr that resolves to wa for SDmode and wn for SFmode
 (define_register_constraint "wn" "NO_REGS" "No register (NO_REGS).")
 
 (define_register_constraint "wp" "rs6000_constraints[RS6000_CONSTRAINT_wp]"
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 718535f..d6ffc36 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2514,7 +2514,6 @@ rs6000_debug_reg_global (void)
   "wg reg_class = %s\n"
   "wi reg_class = %s\n"
   "wl reg_class = %s\n"
-  "wm reg_class = %s\n"
   "wp reg_class = %s\n"
   "wq reg_class = %s\n"
   "wr reg_class = %s\n"
@@ -2536,7 +2535,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wp]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wq]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
@@ -3159,7 +3157,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wg - Float register for power6x move insns.
wi - FP or VSX register to hold 64-bit integers for VSX insns.
wl - Float register if we can do 32-bit signed int loads.
-   wm - VSX register for ISA 2.07 direct move operations.
wn - always NO_REGS.
wr - GPR if 64-bit mode is permitted.
ws - Register class to do ISA 2.06 DF operations.
@@ -3197,9 +3194,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_LFIWAX)
 rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS; /* DImode  */
 
-  if (TARGET_DIRECT_MOVE)
-rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
-
   if (TARGET_POWERPC64)
 {
   rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index cc60559..27055a6 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1255,7 +1255,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wg,/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wi,/* FPR/VSX register to hold DImode */
   RS6000_CONSTRAINT_wl,/* FPR register for LFIWAX */
-  RS6000_CONSTRAINT_wm,/* VSX register for direct move */
   RS6000_CONSTRAINT_wp,/* VSX reg for IEEE 128-bit fp TFmode. 
*/
   RS6000_CONSTRAINT_wq,/* VSX reg for IEEE 128-bit fp KFmode.  
*/
   RS6000_CONSTRAINT_wr,/* GPR register if 64-bit  */
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index ff4ceb6..6108451 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3198,10 +3198,8 @@ (define_expand "vsx_set_"
 
 (define_insn "vsx_extract_"
   [(set (match_operand: 0 "gpc_reg_operand" "=d,d, wr, wr")
-
(vec_select:
-(match_operand:VSX_D 1 "gpc_reg_operand"  ", , wm, wa")
-
+(match_operand:VSX_D 1 "gpc_reg_operand"  ", , wa, wa")
 (parallel
  [(match_operand:QI 2 "const_0_to_1_operand"  "wD,n, wD, 
n")])))]
   "VECTOR_MEM_VSX_P (mode)"
@@ -3250,7 +3248,7 @@ (define_insn 

[PATCH 2/6] rs6000: wj -> wi+p8v

2019-05-21 Thread Segher Boessenkool
Also deletes VS_64dm, it's unused.


2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wj"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wj.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md: Replace "wj" constraint by "wi" with "p8v".
(VS_64dm): Delete.
* config/rs6000/vsx.md: Ditto.
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md |  3 ---
 gcc/config/rs6000/rs6000.c   |  5 -
 gcc/config/rs6000/rs6000.h   |  1 -
 gcc/config/rs6000/rs6000.md  | 22 +++---
 gcc/config/rs6000/vsx.md | 10 +++---
 gcc/doc/md.texi  |  5 +
 6 files changed, 15 insertions(+), 31 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index c9f168f..9f315e4 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -74,9 +74,6 @@ (define_register_constraint "wg" 
"rs6000_constraints[RS6000_CONSTRAINT_wg]"
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
-(define_register_constraint "wj" "rs6000_constraints[RS6000_CONSTRAINT_wj]"
-  "FP or VSX register to hold 64-bit integers for direct moves or NO_REGS.")
-
 (define_register_constraint "wk" "rs6000_constraints[RS6000_CONSTRAINT_wk]"
   "FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.")
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a95848a..76c80a4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2513,7 +2513,6 @@ rs6000_debug_reg_global (void)
   "wf reg_class = %s\n"
   "wg reg_class = %s\n"
   "wi reg_class = %s\n"
-  "wj reg_class = %s\n"
   "wk reg_class = %s\n"
   "wl reg_class = %s\n"
   "wm reg_class = %s\n"
@@ -2537,7 +2536,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wj]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wk]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]],
@@ -3162,7 +3160,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wf - Preferred register class for V4SFmode.
wg - Float register for power6x move insns.
wi - FP or VSX register to hold 64-bit integers for VSX insns.
-   wj - FP or VSX register to hold 64-bit integers for direct moves.
wk - FP or VSX register to hold 64-bit doubles for direct moves.
wl - Float register if we can do 32-bit signed int loads.
wm - VSX register for ISA 2.07 direct move operations.
@@ -3205,8 +3202,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
 
   if (TARGET_DIRECT_MOVE)
 {
-  rs6000_constraints[RS6000_CONSTRAINT_wj] /* DImode  */
-   = rs6000_constraints[RS6000_CONSTRAINT_wi];
   rs6000_constraints[RS6000_CONSTRAINT_wk] /* DFmode  */
= rs6000_constraints[RS6000_CONSTRAINT_ws];
   rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index ca30639..218ed10 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1254,7 +1254,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wf,/* VSX register for V4SF */
   RS6000_CONSTRAINT_wg,/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wi,/* FPR/VSX register to hold DImode */
-  RS6000_CONSTRAINT_wj,/* FPR/VSX register for DImode direct 
moves. */
   RS6000_CONSTRAINT_wk,/* FPR/VSX register for DFmode direct 
moves. */
   RS6000_CONSTRAINT_wl,/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,/* VSX register for direct move */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 398398c..9a986a1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -834,7 +834,7 @@ (define_insn_and_split "*zero_extendhi2_dot2"
 
 
 (define_insn "zero_extendsi2"
-  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wz,wa,wj,r,wa")
+  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wz,wa,wi,r,wa")
(zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" 
"m,r,Z,Z,r,wa,wa")))]
   ""
   "@
@@ -846,7 +846,7 @@ (define_insn "zero_extendsi2"
mfvsrwz %0,%x1
xxextractuw %x0,%x1,4"
   

[PATCH 0/6] rs6000: Some more easy "enabled" cases

2019-05-21 Thread Segher Boessenkool
This converts some more constraints to a simpler constraint plus
enabled formulation.

Tested on p7 powerpc64-linux {-m32,-m64}; still testing on p8 and p9 LE,
will commit if that succeeds.


Segher


Segher Boessenkool (6):
  wh -> d+p8v
  wj -> wi+p8v
  wk -> ws+p8v
  wm -> wa+p8v
  wl -> d+p6
  wz -> d+p7

 gcc/config/rs6000/constraints.md | 20 +--
 gcc/config/rs6000/rs6000.c   | 36 +-
 gcc/config/rs6000/rs6000.h   |  6 --
 gcc/config/rs6000/rs6000.md  | 42 ++--
 gcc/config/rs6000/vsx.md | 16 +--
 gcc/doc/md.texi  | 22 ++---
 6 files changed, 32 insertions(+), 110 deletions(-)

-- 
1.8.3.1
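
As a rough illustration of the "simpler constraint plus enabled"
formulation, here is a toy model in C.  It is only a sketch and not GCC
code: the enum, the function names and the boolean flags are all
hypothetical stand-ins.  The old style folds the feature test into the
constraint (it resolves to NO_REGS when the feature is off); the new
style uses a plain constraint and gates the whole alternative
separately.  The model assumes the feature flags line up, that is,
that the flag behind the old constraint is set exactly when the "isa"
gate is.

```c
#include <stdbool.h>

/* Toy register classes; the names mirror GCC's, but this is not GCC code.  */
enum reg_class { NO_REGS, VSX_REGS };

/* Old style (like "wm"): the constraint itself resolves to NO_REGS
   when the feature is off.  */
enum reg_class old_wm_class (bool direct_move)
{
  return direct_move ? VSX_REGS : NO_REGS;
}

/* New style: a plain constraint (like "wa", any VSX register if VSX is
   enabled) ...  */
enum reg_class new_wa_class (bool vsx)
{
  return vsx ? VSX_REGS : NO_REGS;
}

/* ... plus a separate per-alternative gate (like isa "p8v").  */
bool new_alternative_enabled (bool p8_vector)
{
  return p8_vector;
}

/* An alternative is usable if its register class is not NO_REGS.  */
bool old_alternative_usable (bool direct_move)
{
  return old_wm_class (direct_move) != NO_REGS;
}

/* Here the gate must also be on.  */
bool new_alternative_usable (bool vsx, bool p8_vector)
{
  return new_alternative_enabled (p8_vector)
	 && new_wa_class (vsx) != NO_REGS;
}
```

With VSX on, both formulations accept or reject the alternative
together in this model, and the constraint no longer needs its own
slot in the per-target constraint table.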



[PATCH 1/6] rs6000: wh -> d+p8v

2019-05-21 Thread Segher Boessenkool
This replaces the "wh" constraint by "d", with isa "p8v".


2019-05-21  Segher Boessenkool  

* config/rs6000/constraints.md (define_register_constraint "wh"):
Delete.
* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
RS6000_CONSTRAINT_wh.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
(rs6000_init_hard_regno_mode_ok): Adjust.
* config/rs6000/rs6000.md: Replace "wh" constraint by "d" with "p8v".
* doc/md.texi (Machine Constraints): Adjust.

---
 gcc/config/rs6000/constraints.md |  3 ---
 gcc/config/rs6000/rs6000.c   |  4 
 gcc/config/rs6000/rs6000.h   |  1 -
 gcc/config/rs6000/rs6000.md  | 20 
 gcc/doc/md.texi  |  5 +
 5 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index dbcf08c..c9f168f 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -71,9 +71,6 @@ (define_register_constraint "wf" 
"rs6000_constraints[RS6000_CONSTRAINT_wf]"
 (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
   "If -mmfpgpr was used, a floating point register or NO_REGS.")
 
-(define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
-  "Floating point register if direct moves are available, or NO_REGS.")
-
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b5dc5f3..a95848a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2512,7 +2512,6 @@ rs6000_debug_reg_global (void)
   "we reg_class = %s\n"
   "wf reg_class = %s\n"
   "wg reg_class = %s\n"
-  "wh reg_class = %s\n"
   "wi reg_class = %s\n"
   "wj reg_class = %s\n"
   "wk reg_class = %s\n"
@@ -2537,7 +2536,6 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_we]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
-  reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wh]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wi]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wj]],
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wk]],
@@ -3163,7 +3161,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
wd - Preferred register class for V2DFmode.
wf - Preferred register class for V4SFmode.
wg - Float register for power6x move insns.
-   wh - FP register for direct move instructions.
wi - FP or VSX register to hold 64-bit integers for VSX insns.
wj - FP or VSX register to hold 64-bit integers for direct moves.
wk - FP or VSX register to hold 64-bit doubles for direct moves.
@@ -3208,7 +3205,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
 
   if (TARGET_DIRECT_MOVE)
 {
-  rs6000_constraints[RS6000_CONSTRAINT_wh] = FLOAT_REGS;
   rs6000_constraints[RS6000_CONSTRAINT_wj] /* DImode  */
= rs6000_constraints[RS6000_CONSTRAINT_wi];
   rs6000_constraints[RS6000_CONSTRAINT_wk] /* DFmode  */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index eaf309b..ca30639 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1253,7 +1253,6 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_we,/* VSX register if ISA 3.0 vector. */
   RS6000_CONSTRAINT_wf,/* VSX register for V4SF */
   RS6000_CONSTRAINT_wg,/* FPR register for -mmfpgpr */
-  RS6000_CONSTRAINT_wh,/* FPR register for direct moves.  */
   RS6000_CONSTRAINT_wi,/* FPR/VSX register to hold DImode */
   RS6000_CONSTRAINT_wj,/* FPR/VSX register for DImode direct 
moves. */
   RS6000_CONSTRAINT_wk,/* FPR/VSX register for DFmode direct 
moves. */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b2bba5d..398398c 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -471,7 +471,7 @@ (define_mode_attr zero_fp [(SF "j")
 (define_mode_attr f64_vsx [(DF "ws") (DD "wn")])
 
 ; Definitions for 64-bit direct move
-(define_mode_attr f64_dm  [(DF "wk") (DD "wh")])
+(define_mode_attr f64_dm  [(DF "wk") (DD "d")])
 
 ; Definitions for 64-bit use of altivec registers
 (define_mode_attr f64_av  [(DF "wv") (DD "wn")])
@@ -7349,10 +7349,10 @@ (define_insn "movsf_hardfloat"
 ;; FMR  MR MT%0   MF%1   NOP
 (define_insn "movsd_hardfloat"
   [(set (match_operand:SD 0 "nonimmediate_operand"
-"=!r,   wz,m, Z, ?wh,   ?r,
+"=!r, 

[PATCH, i386]: Introduce signbit2 expander

2019-05-21 Thread Uros Bizjak
Based on the recent work that enabled vectorization of
__builtin_signbit on aarch64.

2019-05-21  Uroš Bizjak  

* config/i386/sse.md (VF1_AVX2): New mode iterator.
(signbit2): New expander.

testsuite/ChangeLog:

2019-05-21  Uroš Bizjak  

* gcc.target/i386/vect-signbitf.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 271467)
+++ config/i386/sse.md  (working copy)
@@ -279,6 +279,9 @@
 (define_mode_iterator VF1
   [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF])
 
+(define_mode_iterator VF1_AVX2
+  [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX2") V4SF])
+
 ;; 128- and 256-bit SF vector modes
 (define_mode_iterator VF1_128_256
   [(V8SF "TARGET_AVX") V4SF])
@@ -3523,6 +3526,15 @@
   operands[4] = gen_reg_rtx (mode);
 })
 
+(define_expand "signbit2"
+  [(set (match_operand: 0 "register_operand")
+   (lshiftrt:
+ (subreg:
+   (match_operand:VF1_AVX2 1 "register_operand") 0)
+ (match_dup 2)))]
+  "TARGET_SSE2"
+  "operands[2] = GEN_INT (GET_MODE_UNIT_BITSIZE (mode)-1);")
+
 ;; Also define scalar versions.  These are used for abs, neg, and
 ;; conditional move.  Using subregs into vector modes causes register
 ;; allocation lossage.  These patterns do not allow memory operands
Index: testsuite/gcc.target/i386/vect-signbitf.c
===
--- testsuite/gcc.target/i386/vect-signbitf.c   (nonexistent)
+++ testsuite/gcc.target/i386/vect-signbitf.c   (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do run { target sse2_runtime } } */
+/* { dg-options "-O2 -msse2 -ftree-vectorize -fdump-tree-vect-details 
-save-temps" } */
+
+extern void abort ();
+
+#define N 1024
+float a[N] = {0.0f, -0.0f, 1.0f, -1.0f,
+ -2.0f, 3.0f, -5.0f, -8.0f,
+ 13.0f, 21.0f, -25.0f, 33.0f};
+int r[N];
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+r[i] = __builtin_signbitf (a[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+if (__builtin_signbit (a[i]) && !r[i])
+  abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { scan-assembler-not "-2147483648" } } */
+/* { dg-final { scan-assembler "psrld" } } */
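
For readers less familiar with the trick, the expander above is a
logical right shift of each element's bit pattern by
GET_MODE_UNIT_BITSIZE - 1, which is 31 for SFmode.  A scalar model in C
might look like the sketch below.  The function name is hypothetical,
and the sketch assumes float is a 32-bit IEEE 754 binary32 value.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if the sign bit of X is set and 0 otherwise, by shifting the
   32-bit representation right by 31 (one lane of a psrld by 31).  */
int signbit_via_shift (float x)
{
  uint32_t bits;

  /* Copy the bytes instead of casting pointers, to avoid aliasing
     problems.  */
  memcpy (&bits, &x, sizeof bits);
  return (int) (bits >> 31);
}
```

Note that this distinguishes -0.0f from 0.0f, which a plain "x < 0"
comparison would not.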


Re: Follow-up-fix 2 to "[PATCH] Move PR84877 fix elsewhere (PR bootstrap/88450)"

2019-05-21 Thread Jeff Law
On 5/12/19 10:01 PM, Hans-Peter Nilsson wrote:
>> Date: Tue, 30 Apr 2019 11:37:17 -0600
>> From: Jeff Law 
> 
>> On 2/10/19 6:09 PM, Hans-Peter Nilsson wrote:
>>> Here's the follow-up, getting rid of the observed
>>> alignment-padding in execute/930126-1.c: the x parameter in f
>>> spuriously being runtime-aligned to BITS_PER_WORD.  I separated
>>> this change because this is an older issue, a change introduced
>>> in r94104 where BITS_PER_WORD was chosen perhaps because we
>>> expect register-sized writes into this area.  Here, we instead
>>> align to a minimum of PREFERRED_STACK_BOUNDARY, but of course
>>> gated on !  STRICT_ALIGNMENT.
>>>
>>> Regtested cris-elf and x86_64-pc-linux-gnu.
>>>
>>> Ok to commit?
>>>
>>> gcc:
>>> * function.c (assign_parm_setup_block): If not STRICT_ALIGNMENT,
>>> instead of always BITS_PER_WORD, align the stacked
>>> parameter to a minimum PREFERRED_STACK_BOUNDARY.
>> Interestingly enough in the thread from 2005 Richard S suggests that he
>> could have made increasing the alignment conditional on STRICT_ALIGNMENT
>> but thought that with the size already being rounded up it wasn't worth
>> it and that we could take advantage of the increased alignment elsewhere.
>>
>> I wonder if we could just go back to that idea.  Leave the alignment as
>> DECL_ALIGN for !STRICT_ALIGNMENT targets and bump it up for
>> STRICT_ALIGNMENT targets?
>>
>> So something like
>>
>> align = STRICT_ALIGNMENT ? MAX (DECL_ALIGN (parm), BITS_PER_WORD) :
>> DECL_ALIGN (parm)
> 
> That'd work for me, I think.  Testing in progress.  Thanks.
> Almost obvious, but: is this ok; what you meant?
> 
> gcc:
>   * function.c (assign_parm_setup_block): Raise alignment of
>   stacked parameter only for STRICT_ALIGNMENT targets.
Yea.  I threw it into my tester overnight and it hasn't caused any
problems.   I think this is good for the trunk.


> 
> Index: gcc/function.c
> ===
> --- gcc/function.c(revision 27)
> +++ gcc/function.c(working copy)
> @@ -2912,7 +2912,11 @@ assign_parm_setup_block (struct assign_p
>size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
>if (stack_parm == 0)
>  {
> -  SET_DECL_ALIGN (parm, MAX (DECL_ALIGN (parm), BITS_PER_WORD));
> +  HOST_WIDE_INT parm_align
> + = (STRICT_ALIGNMENT
> +? MAX (DECL_ALIGN (parm), BITS_PER_WORD) : DECL_ALIGN (parm));
> +
> +  SET_DECL_ALIGN (parm, parm_align);
>if (DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT)
>   {
> rtx allocsize = gen_int_mode (size_stored, Pmode);
> 
> PS. A preferable solution would IMHO involve hookifying
> parameter padding and alignment as separate entities.  Maybe
> later, perhaps even after PR84877 is fixed, or that bug risk
> dying from perceived misprioritized attention.  (Amazing how
> easy it is to "overalign" here, and hard to align "there"!)
Yea, but at this point I suspect folks are a bit reluctant to dig into
this code :(

jeff
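
The alignment rule agreed on in this thread can be modeled outside of
GCC as in the sketch below.  The function and its parameters are
hypothetical: they stand in for STRICT_ALIGNMENT, DECL_ALIGN (parm) and
BITS_PER_WORD, which in GCC are target macros and tree accessors rather
than plain arguments.

```c
/* Stand-alone model of the alignment choice for a stacked parameter.
   Only strict-alignment targets raise the alignment to at least one
   word; everyone else keeps the declared alignment.  */
unsigned int choose_parm_align (int strict_alignment,
				unsigned int decl_align,
				unsigned int bits_per_word)
{
  if (!strict_alignment)
    return decl_align;

  return decl_align > bits_per_word ? decl_align : bits_per_word;
}
```

For example, with a 64-bit word a byte-aligned parameter stays at 8
bits on a non-strict target and is raised to 64 on a strict one.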



[PATCH] i386: Generate standard floating point scalar operation patterns

2019-05-21 Thread H.J. Lu
On Wed, May 15, 2019 at 2:29 PM Richard Sandiford
 wrote:
>
> "H.J. Lu"  writes:
> > On Thu, Feb 7, 2019 at 9:49 AM H.J. Lu  wrote:
> >>
> >> Standard scalar operation patterns which preserve the rest of the vector
> >> look like
> >>
> >>  (vec_merge:V2DF
> >>(vec_duplicate:V2DF
> >>  (op:DF (vec_select:DF (reg/v:V2DF 85 [ x ])
> >> (parallel [ (const_int 0 [0])]))
> >>  (reg:DF 87))
> >>(reg/v:V2DF 85 [ x ])
> >>(const_int 1 [0x1])]))
> >>
> >> Add such patterns to i386 backend and convert VEC_CONCAT patterns to
> >> standard scalar operation patterns.
>
> It looks like there's some variety in the patterns used, e.g.:
>
> (define_insn 
> "_vm3"
>   [(set (match_operand:VF_128 0 "register_operand" "=x,v")
> (vec_merge:VF_128
>   (smaxmin:VF_128
> (match_operand:VF_128 1 "register_operand" "0,v")
> (match_operand:VF_128 2 "vector_operand" 
> "xBm,"))
>  (match_dup 1)
>  (const_int 1)))]
>   "TARGET_SSE"
>   "@
>\t{%2, %0|%0, %2}
>v\t{%2, 
> %1, %0|%0, %1, 
> %2}"
>   [(set_attr "isa" "noavx,avx")
>(set_attr "type" "sse")
>(set_attr "btver2_sse_attr" "maxmin")
>(set_attr "prefix" "")
>(set_attr "mode" "")])
>
> makes the operand a full vector operation, which seems simpler.

This pattern is used to implement scalar smaxmin intrinsics.

> The above would then be:
>
>   (vec_merge:V2DF
> (op:V2DF
>   (reg:V2DF 85)
>   (vec_duplicate:V2DF (reg:DF 87)))
> (reg/v:V2DF 85 [ x ])
> (const_int 1 [0x1])]))
>
> I guess technically the two have different faulting behaviour though,
> since the smaxmin gets applied to all elements, not just element 0.

This is the issue.   We don't use the correct mode for scalar instructions:

---
#include <immintrin.h>

__m128d
foo1 (__m128d x, double *p)
{
  __m128d y = _mm_load_sd (p);
  return _mm_max_pd (x, y);
}
---

movq (%rdi), %xmm1
maxpd %xmm1, %xmm0
ret


Here is the updated patch to add standard floating point scalar
operation patterns to i386 backend.  Then we can do

---
#include <immintrin.h>

extern __inline __m128d __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
_new_mm_max_pd (__m128d __A, __m128d __B)
{
  __A[0] = __A[0] > __B[0] ? __A[0] : __B[0];
  return __A;
}

__m128d
foo2 (__m128d x, double *p)
{
  __m128d y = _mm_load_sd (p);
  return _new_mm_max_pd (x, y);
}

maxsd (%rdi), %xmm0
ret

We should use generic vector operations to implement i386 intrinsics
as much as we can.

> The patch seems very specific.  E.g. why just PLUS, MINUS, MULT and DIV?

This patch only adds +, -, *, /, > and <.  We can add more if there are
testcases for them.
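
To show the source shape these patterns are meant to match for other
operators, here is a sketch using GCC's generic vector extension
without any intrinsics header.  The typedef and function names are
hypothetical, and whether a given compiler turns these into a single
scalar instruction depends on the patch and on the options used.

```c
/* Two doubles in a 16-byte vector, via the GCC vector extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Add only element 0 and preserve the rest of the vector.  */
v2df scalar_add_low (v2df a, v2df b)
{
  a[0] = a[0] + b[0];
  return a;
}

/* Take the minimum of element 0 only and preserve the rest of the
   vector.  */
v2df scalar_min_low (v2df a, v2df b)
{
  a[0] = a[0] < b[0] ? a[0] : b[0];
  return a;
}
```

Both functions touch only element 0, so element 1 of the first argument
is returned unchanged.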

> Thanks,
> Richard
>
>
> >>
> >> gcc/
> >>
> >> PR target/54855
> >> * simplify-rtx.c (simplify_binary_operation_1): Convert
> >> VEC_CONCAT patterns to standard scalar operation
> >> patterns.
> >> * config/i386/sse.md (*_vm3): New.
> >> (*_vm3): Likewise.
> >>
> >> gcc/testsuite/
> >>
> >> PR target/54855
> >> * gcc.target/i386/pr54855-1.c: New test.
> >> * gcc.target/i386/pr54855-2.c: Likewise.
> >> * gcc.target/i386/pr54855-3.c: Likewise.
> >> * gcc.target/i386/pr54855-4.c: Likewise.
> >> * gcc.target/i386/pr54855-5.c: Likewise.
> >> * gcc.target/i386/pr54855-6.c: Likewise.
> >> * gcc.target/i386/pr54855-7.c: Likewise.
> >
> > PING:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00398.html

Thanks.

-- 
H.J.
From 5d91bf264c89541a79ca8f9121264416ce307420 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 3 Feb 2019 09:16:23 -0800
Subject: [PATCH] i386: Generate standard floating point scalar operation
 patterns

Standard floating point scalar operation patterns for combiner, which
preserve the rest of the vector, look like

 (vec_merge:V2DF
   (vec_duplicate:V2DF (reg:DF 87))
   (reg/v:V2DF 85 [ x ])
   (const_int 1 [0x1])]))

and

 (vec_merge:V2DF
   (vec_duplicate:V2DF
 (op:DF (vec_select:DF (reg/v:V2DF 85 [ x ])
(parallel [ (const_int 0 [0])]))
 (reg:DF 87))
   (reg/v:V2DF 85 [ x ])
   (const_int 1 [0x1])]))

This patch adds and generates such standard floating point scalar
operation patterns for +, -, *, /, > and <.

Tested on x86-64.

gcc/

	PR target/54855
	* config/i386/i386-expand.c (ix86_expand_vector_set): Generate
	standard scalar operation pattern for V2DF.
	* config/i386/sse.md (*_vm3): New.
	(*_vm3): Likewise.
	(*ieee_3): Likewise.
	(vec_setv2df_0): Likewise.

gcc/testsuite/

	PR target/54855
	* gcc.target/i386/pr54855-1.c: New test.
	* gcc.target/i386/pr54855-2.c: Likewise.
	* gcc.target/i386/pr54855-3.c: Likewise.
	* gcc.target/i386/pr54855-4.c: Likewise.
	* gcc.target/i386/pr54855-5.c: Likewise.
	* gcc.target/i386/pr54855-6.c: Likewise.
	* gcc.target/i386/pr54855-7.c: Likewise.
	* gcc.target/i386/pr54855-8.c: Likewise.
	* 

Re: *Ping* Re: [PATCH] PR c/43673 - Incorrect warning in dfp printf.

2019-05-21 Thread Jeff Law
On 5/20/19 6:56 PM, luoxhu wrote:
> Ping for GCC-10.
I thought this was a NAK in its current form.

See Ryan's c#1 in the BZ.

jeff


Re: [ARM/FDPIC v5 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2019-05-21 Thread Rich Felker
On Tue, May 21, 2019 at 05:28:51PM +0200, Christophe Lyon wrote:
> On Wed, 15 May 2019 at 18:06, Rich Felker  wrote:
> >
> > On Wed, May 15, 2019 at 03:59:39PM +, Szabolcs Nagy wrote:
> > > On 15/05/2019 16:37, Rich Felker wrote:
> > > > On Wed, May 15, 2019 at 05:12:11PM +0200, Christophe Lyon wrote:
> > > >> On Wed, 15 May 2019 at 16:37, Rich Felker  wrote:
> > > >>> On Wed, May 15, 2019 at 01:55:30PM +, Szabolcs Nagy wrote:
> > >  can support both normal elf and fdpic elf so you can test/use
> > >  an fdpic toolchain on a system with mmu, but this requires
> > >  different dynamic linker name ..otherwise one has to run
> > >  executables in a chroot or separate mount namespace to change
> > >  the dynamic linker)
> > > >>>
> > > >>> Indeed, it's a bad idea to make them clash.
> > > >>>
> > > >>
> > > >> Not sure to understand your point: indeed FDPIC binaries work
> > > >> on a system with mmu, provided you have the right dynamic
> > > >> linker in the right place, as well as the needed runtime libs (libc, 
> > > >> etc)
> > > >>
> > > >> Do you want me to change anything here?
> > > >
> > > > I think the concern is that if the PT_INTERP name is the same for
> > > > binaries with different ABIs, you wouldn't be able to have both
> > > > present in the same root fs, and this would make it more of a pain to
> > > > debug fdpic binaries on a full (with-mmu) host.
> > > >
> > > > musl always uses a different PT_INTERP name for each ABI combination,
> > > > so I guess the question is whether uclibc or whatever other libc
> > > > you're intending people to use would also want to do this.
> > >
> > > glibc uses different names now for new abis, so i was expecting
> > > some *_DYNAMIC_LINKER update, but it seems uclibc always uses
> > > the same fixed name
> > >
> > > /lib/ld-uClibc.so.0
> > >
> > > i guess it makes sense for them since iirc uclibc can change
> > > its runtime abi based on lot of build time config so having
> > > different name for each abi variant may be impractical.
> >
> > Yes, this "feature" of uclibc was one of the key motivations behind
> > the creation of musl... :-)
> >
> 
> Hi,
> 
> I discussed a bit further with Szabolcs on irc, and tried to get some
> feedback from uclibc-ng community (none so far)
> 
> I propose the following 2 patches on top of this one to address part
> of the concerns:
> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
> index 67edb42..d7cc923 100644
> --- a/gcc/config/arm/linux-eabi.h
> +++ b/gcc/config/arm/linux-eabi.h
> @@ -89,7 +89,7 @@
>  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:eb}"
>  #endif
>  #define MUSL_DYNAMIC_LINKER \
> -  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}.so.1"
> +  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E
> "%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"
> 
>  /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
> use the GNU/Linux version, not the generic BPABI version.  */
> 
> diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt
> index c38b3f4..92bca69 100644
> --- a/libsanitizer/configure.tgt
> +++ b/libsanitizer/configure.tgt
> @@ -45,7 +45,7 @@ case "${target}" in
> ;;
>sparc*-*-solaris2.11*)
> ;;
> -  arm*-*-uclinuxfdpiceabi)
> +  arm*-*-fdpiceabi)
> UNSUPPORTED=1
> ;;
>arm*-*-linux*)
> 
> However, regarding -static/-static-pie, it seems I have several options:
> (a) add support for static-pie to uclibc-ng. This means creating a new
> rcrt1.o or similar, which would embed parts of the dynamic linker into
> static-pie executables. This seems to involve quite a bit of work
> 
> (b) add support for FDPIC on arm to musl, which I'm not familiar with
> 
> (c) declare -static not supported on arm-FDPIC
> 
> (d) gather consensus that -static with pt_interp is ok (my preference,
> since that's what the current patches do :-)

musl definitely does not support static with pt_interp, and won't. If
it works, it's by chance, and not a good idea to try relying on it. If
you want to follow this path in upstream for now that's fine but it
means musl users will need to apply patches. This is already done
anyway, so it's not a *new* burden, but it's still annoying.

> At this point, I'd prefer to stick with (d), or (c), to avoid further delaying
> inclusion of FDPIC support for arm in GCC, and address improvements
> later, so that it's not a constantly moving target.

I'd find (c) mildly better as long as it's still easy for us to patch.
Providing a -static that's not actually static is not useful and will
be harder to fix later.

Rich
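
To make the PT_INTERP naming concrete, the sketch below models in C how
the proposed MUSL_DYNAMIC_LINKER spec string would expand for each ABI
combination.  The function and its flag parameters are hypothetical;
in GCC the expansion is done by the driver's spec machinery from
-mbig-endian, -mfloat-abi=hard and -mfdpic, not by code like this.

```c
#include <stdio.h>

/* Write the musl dynamic linker path for an ARM ABI combination into
   BUF, following the pieces of the proposed spec string: an optional
   "eb", an optional "hf" and an optional "-fdpic".  Returns what
   snprintf returns.  */
int musl_dynamic_linker (char *buf, size_t size,
			 int big_endian, int hard_float, int fdpic)
{
  return snprintf (buf, size, "/lib/ld-musl-arm%s%s%s.so.1",
		   big_endian ? "eb" : "",
		   hard_float ? "hf" : "",
		   fdpic ? "-fdpic" : "");
}
```

Each combination gets its own name, so FDPIC and non-FDPIC dynamic
linkers can sit side by side in the same root filesystem.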


Re: [PATCH] PR bootstrap/87338: Fix ia64 bootstrap comparison regression in r257511

2019-05-21 Thread Jeff Law
On 5/21/19 7:08 AM, James Clarke wrote:
> On 26 Apr 2019, at 15:24, Jeff Law  wrote:
>> On 4/26/19 2:01 AM, Jakub Jelinek wrote:
>>> On Fri, Apr 26, 2019 at 08:58:18AM +0200, Richard Biener wrote:
 On Thu, Apr 25, 2019 at 7:52 PM James Clarke  wrote:
>
> By using ASM_OUTPUT_LABEL, r257511 forced the assembler to start a new
> bundle when emitting an inline entry label on. Instead, use
> ASM_OUTPUT_DEBUG_LABEL like for the block begin and end labels so tags are
> emitted rather than labels.

 Looks sensible.  mips is the other port defining ASM_OUTPUT_DEBUG_LABEL,
 so either you can do a bootstrap/test on mips as well or I'm asking Matthew
 for approval here.
>>>
>>> And arm and arc, while they don't define their own ASM_OUTPUT_DEBUG_LABEL,
>>> they override TARGET_ASM_INTERNAL_LABEL which is the underlying
>>> implementation of the default ASM_OUTPUT_DEBUG_LABEL.  But I agree that
>>> for mips it is a significant change, while arm and arc call 
>>> default_internal_label
>>> from their hook, just do additional stuff.
>> My tester will bootstrap the mips port within 24hrs after the change is
>> committed.  Happy to contact y'all if something goes wrong ;-)  If you
>> don't hear from me, assume it didn't cause problems.
> 
> Hi all,
> It looks like there are no objections to this (assuming it doesn't regress
> anything), so could somebody please commit this to trunk?
Sorry, it looks like this slipped through the cracks.  I've committed it
to the trunk.  I've also updated the BZ to reflect the bootstrap
regression for gcc-8 and gcc-9 on ia64.

jeff


Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Richard Biener
On May 21, 2019 5:17:43 PM GMT+02:00, Jeff Law  wrote:
>On 5/21/19 8:37 AM, Jan Hubicka wrote:
>>> And that should be done at RTL creation time instead of
>>> repeatedly over and over.  Like with the following.
>>>
>>> Bootstrap / regtest on x86_64-unknown-linux-gnu in progress.
>> 
>> Thanks,
>> for TBAA stats I now get
>> 
>> Alias oracle query stats:
>>   refs_may_alias_p: 3022975 disambiguations, 3321454 queries
>>   ref_maybe_used_by_call_p: 6451 disambiguations, 3048555 queries
>>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>>   aliasing_component_ref_p: 187 disambiguations, 16103 queries
>>   TBAA oracle: 1452502 disambiguations 2956630 queries
>>550659 are in alias set 0
>>576739 queries asked about the same object
>>0 queries asked about the same alias set
>>0 access volatile
>>260391 are dependent in the DAG
>>116339 are aritificially in conflict with void *
>> 
>> So some improvement from original (but less great than with my wrong
>> patch):
>> 
>>   refs_may_alias_p: 3027850 disambiguations, 3340416 queries
>>   ref_maybe_used_by_call_p: 6451 disambiguations, 3053430 queries
>>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>>   aliasing_component_ref_p: 151 disambiguations, 12565 queries
>>   TBAA oracle: 1468434 disambiguations 3010778 queries
>>550723 are in alias set 0
>>614261 queries asked about the same object
>>0 queries asked about the same alias set
>>0 access volatile
>>260983 are dependent in the DAG
>>116377 are aritificially in conflict with void *
>My recollection of the stack slot issue is that when objects from
>different scopes share a slot we can end up reordering their accesses.
>So we might have
>
>  {
>int x;
>  }
>
>  {
>long y;
>  }
>
>Assume x & y are addressable for some reason.  A read of x at the end of
>the first scope might get moved after a store to y in the second scope
>because we think the two objects can't alias.

Of course the GCC memory model doesn't allow such code motion. For anti 
dependences you may not use TBAA. 

But yes, we had bugs in this area in the past. 

Richard. 

>
>IIRC this was particularly problematical with aggressive inlining where
>stack slot sharing is important to keep stack usage down in the kernel.
>
>jeff
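
A complete C function with the shape described above might look like
the following sketch.  The function name is hypothetical, and whether x
and y actually share a stack slot depends on the compiler and options.
The point is only that the read of x must stay ahead of the store to y
even though their types differ.

```c
/* Two disjoint scopes whose variables may be given the same stack
   slot.  Taking the addresses makes x and y addressable.  */
long shared_slot_demo (void)
{
  long result;

  {
    int x = 1;
    int *px = &x;
    result = *px;	/* Read of x at the end of the first scope.  */
  }

  {
    long y;
    long *py = &y;
    *py = 41;		/* Store to y, possibly into the slot x used.  */
    result += *py;
  }

  return result;
}
```

If the read of x were moved after the store to y on a shared slot, the
function would no longer return 42.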



Re: [C++ PATCH] Using decls

2019-05-21 Thread Nathan Sidwell

On 5/21/19 10:43 AM, Marek Polacek wrote:

Thanks for the patch and sorry for nitpicking:

On Tue, May 21, 2019 at 10:32:31AM -0400, Nathan Sidwell wrote:

-/* Process a local-scope or namespace-scope using declaration.  SCOPE
+/* Process a local-scope or namespace-scope using declaration.
+   FIXME


This doesn't look right.  You meant to document the INSERT_P param, right?


The INSERT_P parameter is not (yet) needed.  This patch removes it and 
cleans up the inadvertent stray FIXMEs.  Sorry for the noise.


nathan
--
Nathan Sidwell
2019-05-20  Nathan Sidwell  

	gcc/cp/
	* name-lookup.c (finish_namespace_using_directive)
	(finish_local_using_directive): Merge to ...
	(finish_using_directive): ... here.  Handle both contexts.
	* name-lookup.h (finish_namespace_using_directive)
	(finish_local_using_directive): Replace with ...
	(finish_using_directive): ... this.
	* parser.c (cp_parser_using_directive): Adjust.
	* pt.c (tsubst_expr): Likewise.

	libcc1/
	* libcp1plugin.cc (plugin_add_using_namespace): Call renamed
	finish_using_directive.

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c	(revision 271416)
+++ gcc/cp/name-lookup.c	(working copy)
@@ -7235,52 +7235,36 @@ emit_debug_info_using_namespace (tree fr
 }
 
-/* Process a namespace-scope using directive.  */
+/* Process a using directive.  */
 
 void
-finish_namespace_using_directive (tree target, tree attribs)
+finish_using_directive (tree target, tree attribs)
 {
-  gcc_checking_assert (namespace_bindings_p ());
   if (target == error_mark_node)
 return;
 
-  add_using_namespace (current_binding_level->using_directives,
-		   ORIGINAL_NAMESPACE (target));
-  emit_debug_info_using_namespace (current_namespace,
-   ORIGINAL_NAMESPACE (target), false);
-
-  if (attribs == error_mark_node)
-return;
-
-  for (tree a = attribs; a; a = TREE_CHAIN (a))
-{
-  tree name = get_attribute_name (a);
-  if (is_attribute_p ("strong", name))
-	{
-	  warning (0, "strong using directive no longer supported");
-	  if (CP_DECL_CONTEXT (target) == current_namespace)
-	inform (DECL_SOURCE_LOCATION (target),
-		"you may use an inline namespace instead");
-	}
-  else
-	warning (OPT_Wattributes, "%qD attribute directive ignored", name);
-}
-}
-
-/* Process a function-scope using-directive.  */
-
-void
-finish_local_using_directive (tree target, tree attribs)
-{
-  gcc_checking_assert (local_bindings_p ());
-  if (target == error_mark_node)
-return;
-
-  if (attribs)
-warning (OPT_Wattributes, "attributes ignored on local using directive");
-
-  add_stmt (build_stmt (input_location, USING_STMT, target));
+  if (current_binding_level->kind != sk_namespace)
+add_stmt (build_stmt (input_location, USING_STMT, target));
+  else
+emit_debug_info_using_namespace (current_binding_level->this_entity,
+ ORIGINAL_NAMESPACE (target), false);
 
   add_using_namespace (current_binding_level->using_directives,
 		   ORIGINAL_NAMESPACE (target));
+
+  if (attribs != error_mark_node)
+for (tree a = attribs; a; a = TREE_CHAIN (a))
+  {
+	tree name = get_attribute_name (a);
+	if (current_binding_level->kind == sk_namespace
+	&& is_attribute_p ("strong", name))
+	  {
+	warning (0, "strong using directive no longer supported");
+	if (CP_DECL_CONTEXT (target) == current_namespace)
+	  inform (DECL_SOURCE_LOCATION (target),
+		  "you may use an inline namespace instead");
+	  }
+	else
+	  warning (OPT_Wattributes, "%qD attribute directive ignored", name);
+  }
 }
 
Index: gcc/cp/name-lookup.h
===
--- gcc/cp/name-lookup.h	(revision 271416)
+++ gcc/cp/name-lookup.h	(working copy)
@@ -1,3 +1,3 @@
-/* Declarations for C++ name lookup routines.
+/* Declarations for -*- C++ -*- name lookup routines.
Copyright (C) 2003-2019 Free Software Foundation, Inc.
Contributed by Gabriel Dos Reis 
@@ -318,6 +318,5 @@ extern void cp_emit_debug_info_for_using
 extern void finish_namespace_using_decl (tree, tree, tree);
 extern void finish_local_using_decl (tree, tree, tree);
-extern void finish_namespace_using_directive (tree, tree);
-extern void finish_local_using_directive (tree, tree);
+extern void finish_using_directive (tree, tree);
 extern tree pushdecl (tree, bool is_friend = false);
 extern tree pushdecl_outermost_localscope (tree);
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 271416)
+++ gcc/cp/parser.c	(working copy)
@@ -19738,8 +19738,5 @@ cp_parser_using_directive (cp_parser* pa
 
   /* Update the symbol table.  */
-  if (namespace_bindings_p ())
-finish_namespace_using_directive (namespace_decl, attribs);
-  else
-finish_local_using_directive (namespace_decl, attribs);
+  finish_using_directive (namespace_decl, attribs);
 
   /* Look for the final `;'.  */
Index: gcc/cp/pt.c

Re: [ARM/FDPIC v5 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2019-05-21 Thread Christophe Lyon
On Wed, 15 May 2019 at 18:06, Rich Felker  wrote:
>
> On Wed, May 15, 2019 at 03:59:39PM +, Szabolcs Nagy wrote:
> > On 15/05/2019 16:37, Rich Felker wrote:
> > > On Wed, May 15, 2019 at 05:12:11PM +0200, Christophe Lyon wrote:
> > >> On Wed, 15 May 2019 at 16:37, Rich Felker  wrote:
> > >>> On Wed, May 15, 2019 at 01:55:30PM +, Szabolcs Nagy wrote:
> >  can support both normal elf and fdpic elf so you can test/use
> >  an fdpic toolchain on a system with mmu, but this requires
> >  different dynamic linker name ..otherwise one has to run
> >  executables in a chroot or separate mount namespace to change
> >  the dynamic linker)
> > >>>
> > >>> Indeed, it's a bad idea to make them clash.
> > >>>
> > >>
> > >> I'm not sure I understand your point: indeed, FDPIC binaries work
> > >> on a system with an MMU, provided you have the right dynamic
> > >> linker in the right place, as well as the needed runtime libs (libc,
> > >> etc.).
> > >>
> > >> Do you want me to change anything here?
> > >
> > > I think the concern is that if the PT_INTERP name is the same for
> > > binaries with different ABIs, you wouldn't be able to have both
> > > present in the same root fs, and this would make it more of a pain to
> > > debug fdpic binaries on a full (with-mmu) host.
> > >
> > > musl always uses a different PT_INTERP name for each ABI combination,
> > > so I guess the question is whether uclibc or whatever other libc
> > > you're intending people to use would also want to do this.
> >
> > glibc uses different names now for new abis, so i was expecting
> > some *_DYNAMIC_LINKER update, but it seems uclibc always uses
> > the same fixed name
> >
> > /lib/ld-uClibc.so.0
> >
> > i guess it makes sense for them since iirc uclibc can change
> > its runtime abi based on lot of build time config so having
> > different name for each abi variant may be impractical.
>
> Yes, this "feature" of uclibc was one of the key motivations behind
> the creation of musl... :-)
>

Hi,

I discussed this a bit further with Szabolcs on IRC, and tried to get some
feedback from the uclibc-ng community (none so far).

I propose the following two patches on top of this one to address part
of the concerns:
diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
index 67edb42..d7cc923 100644
--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -89,7 +89,7 @@
 #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:eb}"
 #endif
 #define MUSL_DYNAMIC_LINKER \
-  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}.so.1"
+  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"

 /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
    use the GNU/Linux version, not the generic BPABI version.  */

diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt
index c38b3f4..92bca69 100644
--- a/libsanitizer/configure.tgt
+++ b/libsanitizer/configure.tgt
@@ -45,7 +45,7 @@ case "${target}" in
;;
   sparc*-*-solaris2.11*)
;;
-  arm*-*-uclinuxfdpiceabi)
+  arm*-*-fdpiceabi)
UNSUPPORTED=1
;;
   arm*-*-linux*)

However, regarding -static/-static-pie, it seems I have several options:
(a) add support for static-pie to uclibc-ng. This means creating a new
rcrt1.o or similar, which would embed parts of the dynamic linker into
static-pie executables. This seems to involve quite a bit of work

(b) add support for FDPIC on arm to musl, which I'm not familiar with

(c) declare -static not supported on arm-FDPIC

(d) gather consensus that -static with pt_interp is ok (my preference,
since that's what the current patches do :-)

At this point, I'd prefer to stick with (d), or (c), to avoid further delaying
inclusion of FDPIC support for arm in GCC, and address improvements
later, so that it's not a constantly moving target.
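
As a rough illustration of the PT_INTERP naming discussed above, here is a
minimal C sketch of how the MUSL_DYNAMIC_LINKER spec string from the first
patch would expand for a given option combination.  The helper name
musl_arm_interp_name and its boolean parameters are hypothetical and exist
only for this example; GCC itself does this through spec-string
substitution, and the sketch ignores configurations where big-endian is the
default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC: mimics the expansion of
   "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E
   "%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"
   for one combination of options.  */
static int
musl_arm_interp_name (char *buf, size_t len, bool big_endian,
                      bool hard_float, bool fdpic)
{
  assert (buf != NULL && len > 0);

  /* snprintf returns the length it wanted to write, so the caller can
     compare the result against LEN to detect truncation.  */
  return snprintf (buf, len, "/lib/ld-musl-arm%s%s%s.so.1",
                   big_endian ? "eb" : "",   /* %{mbig-endian:eb} */
                   hard_float ? "hf" : "",   /* %{mfloat-abi=hard:hf} */
                   fdpic ? "-fdpic" : "");   /* %{mfdpic:-fdpic} */
}
```

With this scheme each ABI combination gets its own interpreter path, so an
FDPIC and a non-FDPIC musl loader could coexist in the same root filesystem.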

Thanks,

Christophe


Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Jeff Law
On 5/21/19 4:20 AM, Richard Biener wrote:
> On Tue, 21 May 2019, Kewen.Lin wrote:
> 
>> on 2019/5/21 上午12:37, Segher Boessenkool wrote:
>>> On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
> I think we should have two hooks: one is called with the struct loop as
> parameter; and the other is called for every statement in the loop, if
> the hook isn't null anyway.  Or perhaps we do not need that second one.
 I'd wait to see a compelling example from real world code where we need
 to scan the statements.  Otherwise we're just dragging in more target
 specific decisions which in fact we want to minimize target stuff.
>>>
>>> The ivopts pass will be too optimistic about what loops will end up as a
>>> doloop, and cost things accordingly.  The cases where we cannot later
>>> actually use a doloop are doing pretty much per iteration, so I think
>>> ivopts will still make good decisions.  We'll need to make the rtl part
>>> not actually do a doloop then, but we probably still need that logic
>>> anyway.
>>>
>>> Kewen, Bin, will that work satisfactorily do you think?
>>>
>>
>> If my understanding of this question is correct, IMHO we should try to make
>> IVOPTs conservative rather than optimistic: once the prediction is wrong
>> because of a too-optimistic decision, the costing of the doloop use is wrong,
>> and that can easily affect the globally optimal set.  It looks like we have
>> no way to recover from that in RTL?  (Otherwise, there would be a better
>> place to fix the PR.)  Although a conservative choice may miss some good
>> cases, it is at least as good as before, so I'm inclined to make it
>> conservative.
> 
> I wonder if you could simply benchmark what happens if you make
> IVOPTs _always_ create a doloop IV (if possible)?  I doubt the
> cases where a doloop IV is bad (calls, etc.) are too common and
> that in those cases the extra simple IV hurts.
This had been in the back of my mind as well.  I wonder how the RTL IV
bits would respond to that.

Jeff


Re: CPUID Patch for IDT Winchip

2019-05-21 Thread Uros Bizjak
On Tue, May 21, 2019 at 9:46 AM Uros Bizjak  wrote:
>
> On Tue, May 21, 2019 at 12:20 AM tedheadster  wrote:
> >
> > On Mon, May 20, 2019 at 2:57 PM Uros Bizjak  wrote:
> > >
> > > On Mon, May 20, 2019 at 6:12 PM tedheadster  wrote:
> > > > Did you instead mean "zeroing %EBX and %ECX regs should be enough"?
> > >
> > > Ah, yes. This is what I meant to say. The patch clears %ebx and %ecx.
> > >
> >
> > Uros,
> >   your patch worked on real 32-bit hardware. The assembly output is
> > nearly identical to mine, with merely a re-ordering when setting the
> > %eax, %ebx, and %ecx registers.
>
> Attached patch fixes the core of the problem. We can change __cpuid
> itself to use zeroing, unless we can prove that it is called with a
> constant argument other than 1. __cpuid is mostly used with a constant
> argument, so constant propagation does its job. As an example:
>
> --cut here--
> #include <stdio.h>
> #include "cpuid.h"
>
> int main ()
> {
>   unsigned int eax, ebx, ecx, edx;
>
>   if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
> __builtin_abort ();
>
>   printf ("%#x, %#x, %#x, %#x\n", eax, ebx, ecx, edx);
>   return 0;
> }
> --cut here--
>
> results in:
>
>   2f:   31 f6   xor%esi,%esi
>   31:   89 f0   mov%esi,%eax
>   33:   0f a2   cpuid
>   35:   85 c0   test   %eax,%eax
>   37:   0f 84 fc ff ff ff   je 39 
>   3d:   83 ec 0csub$0xc,%esp
>   40:   89 f3   mov%esi,%ebx
>   42:   89 f1   mov%esi,%ecx
>   44:   b8 01 00 00 00  mov$0x1,%eax
>   49:   0f a2   cpuid

2019-05-21  Uroš Bizjak  

* config/i386/cpuid.h (__cpuid): For 32bit targets, zero
%ebx and %ecx before calling cpuid with leaf 1 or
non-constant leaf argument.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN, will be backported to all active branches.

Uros.


Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Jeff Law
On 5/21/19 8:37 AM, Jan Hubicka wrote:
>> And that should be done at RTL creation time instead of
>> repeatedly over and over.  Like with the following.
>>
>> Bootstrap / regtest on x86_64-unknown-linux-gnu in progress.
> 
> Thanks,
> for TBAA stats I now get
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3022975 disambiguations, 3321454 queries
>   ref_maybe_used_by_call_p: 6451 disambiguations, 3048555 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 187 disambiguations, 16103 queries
>   TBAA oracle: 1452502 disambiguations 2956630 queries
>550659 are in alias set 0
>576739 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>260391 are dependent in the DAG
>116339 are aritificially in conflict with void *
> 
> So some improvement from original (but less great than with my wrong
> patch):
> 
>   refs_may_alias_p: 3027850 disambiguations, 3340416 queries
>   ref_maybe_used_by_call_p: 6451 disambiguations, 3053430 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 151 disambiguations, 12565 queries
>   TBAA oracle: 1468434 disambiguations 3010778 queries
>550723 are in alias set 0
>614261 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>260983 are dependent in the DAG
>116377 are aritificially in conflict with void *
My recollection of the stack slot issue is that when objects from
different scopes share a slot we can end up reordering their accesses.
So we might have

  {
int x;
  }

  {
long y;
  }

Assume x & y are addressable for some reason.  A read of x at the end of
the first scope might get moved after a store to y in the second scope
because we think the two objects can't alias.
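
To make the scenario above concrete, here is a minimal, self-contained C
sketch.  Everything in it (disjoint_scopes, the store_* helpers and the
volatile function pointers) is hypothetical and only serves to keep x and y
addressable; it is not taken from any GCC testcase.  The point is simply
that the read of x must observe the value stored in the first scope, even
if x and y end up in the same stack slot.

```c
#include <assert.h>

/* Hypothetical stand-ins: calling through volatile function pointers
   keeps the compiler from seeing what happens to the pointee, so x and y
   stay addressable as in the scenario described above.  */
static void store_int (int *p, int v) { *p = v; }
static void store_long (long *p, long v) { *p = v; }

static void (*volatile escape_int) (int *, int) = store_int;
static void (*volatile escape_long) (long *, long) = store_long;

static long
disjoint_scopes (void)
{
  long sum = 0;

  {
    int x;
    escape_int (&x, 1);
    sum += x;           /* Read of x: must stay before the store to y.  */
  }

  {
    long y;             /* May be given the same stack slot as x.  */
    escape_long (&y, 2);
    sum += y;
  }

  assert (sum == 3);    /* Holds as long as the accesses are not reordered.  */
  return sum;
}
```

If the two objects share a slot and the read of x were moved past the store
to y on the grounds that int and long cannot alias, the sum would no longer
be 3.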

IIRC this was particularly problematical with aggressive inlining where
stack slot sharing is important to keep stack usage down in the kernel.

jeff


[C++ PATCH] Add test for DR 1940 - static_assert in anonymous unions

2019-05-21 Thread Marek Polacek
DR 1940 clarified that static_assert declarations in anonymous
unions are permitted, but nowhere in the testsuite do we test that.

Tested on x86_64-linux, ok for trunk?

2019-05-21  Marek Polacek  

DR 1940 - static_assert in anonymous unions.
* g++.dg/DRs/dr1940.C: New test.

diff --git gcc/testsuite/g++.dg/DRs/dr1940.C gcc/testsuite/g++.dg/DRs/dr1940.C
new file mode 100644
index 000..dee4ae998a6
--- /dev/null
+++ gcc/testsuite/g++.dg/DRs/dr1940.C
@@ -0,0 +1,13 @@
+// DR 1940 - static_assert in anonymous unions
+// { dg-do compile { target c++11 } }
+
+namespace N {
+  static union { int i; static_assert(1, ""); };
+}
+
+void
+g ()
+{
+  union { int j; static_assert(1, ""); };
+  N::i = 42;
+}


Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-05-21 Thread Richard Earnshaw (lists)
On 21/05/2019 15:44, Jeff Law wrote:
> On 5/21/19 8:24 AM, Richard Earnshaw (lists) wrote:
>> On 20/05/2019 23:42, Joseph Myers wrote:
>>
>>> I'm not particularly concerned with distinguishing between different names 
>>> and email addresses for an author depending on when or in what capacity 
>>> they contributed a change, or with the cases where a patch was committed 
>>> for someone else and SVN simply doesn't provide a way to distinguish that 
>>> information.  However, since some people were concerned with that, and 
>>> since the feature needed for that was implemented (the "changelogs" 
>>> feature in reposurgeon, which will do it as long as a proper ChangeLog 
>>> entry was included in the commit), we may as well use that feature.  (The 
>>> author map is still needed for commits without ChangeLog entries.)
>>>
>>
>> For very old commits, back in the GCC 2 days, even the ChangeLogs don't
>> always show the author.  At that time only the committers' name was
>> used.  I'm pretty sure that some of my earliest patches to GCC were
>> committed by tege and kenner under their names.  So we'll never really
>> be able to fully reconstruct the early history.
> I'd say we make a reasonable effort here, but the importance of
> authorship decays rapidly the further back we go.  Even when the author
> (or committer) is still around, they often can't remember the details
> around commits from that era.
> 
> jeff
> 


Agreed, and I'm well aware of my limitations in remembering which of
those early patches were mine.  I was just pointing out that the
ChangeLogs from that period cannot be taken as an indication of authorship.

There's a fair chance that, if it was Arm related and dated from mid
1992 onwards, I had a hand in it.  But that's by no means a claim on all
such patches from that era.

R.


[PATCH] x86: Don't allocate stack frame nor align stack if not needed

2019-05-21 Thread H.J. Lu
get_frame_size () returns used stack slots during compilation, which
may be optimized out later.  This patch does the following:

1. Add no_stack_frame to machine_function to indicate that the function
doesn't need a stack frame.
2. Change ix86_find_max_used_stack_alignment to set no_stack_frame.
3. Always call ix86_find_max_used_stack_alignment to check if stack
frame is needed.

Tested on i686 and x86-64 with

--with-arch=native --with-cpu=native

Tested on AVX512 machine configured with

--with-arch=native --with-cpu=native

gcc/

PR target/88483
* config/i386/i386.c (ix86_get_frame_size): New function.
(ix86_frame_pointer_required): Replace get_frame_size with
ix86_get_frame_size.
(ix86_compute_frame_layout): Likewise.
(ix86_find_max_used_stack_alignment): Changed to void.  Set
no_stack_frame.
(ix86_finalize_stack_frame_flags): Always call
ix86_find_max_used_stack_alignment.  Replace get_frame_size with
ix86_get_frame_size.
* config/i386/i386.h (machine_function): Add no_stack_frame.

gcc/testsuite/

PR target/88483
* gcc.target/i386/stackalign/pr88483-1.c: New test.
* gcc.target/i386/stackalign/pr88483-2.c: Likewise.
---
 gcc/config/i386/i386.c| 53 ---
 gcc/config/i386/i386.h|  3 ++
 .../gcc.target/i386/stackalign/pr88483-1.c| 18 +++
 .../gcc.target/i386/stackalign/pr88483-2.c| 18 +++
 4 files changed, 74 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/stackalign/pr88483-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/stackalign/pr88483-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 54607748b0b..d0b2a4f8b70 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5012,6 +5012,19 @@ ix86_can_use_return_insn_p (void)
  && (frame.nregs + frame.nsseregs) == 0);
 }
 
+/* Return stack frame size.  get_frame_size () returns used stack slots
+   during compilation, which may be optimized out later.  no_stack_frame
+   is set to true if stack frame isn't needed.  */
+
+static HOST_WIDE_INT
+ix86_get_frame_size (void)
+{
+  if (cfun->machine->no_stack_frame)
+return 0;
+  else
+return get_frame_size ();
+}
+
 /* Value should be nonzero if functions must have frame pointers.
Zero means the frame pointer need not be set up (and parms may
be accessed via the stack pointer) in functions that seem suitable.  */
@@ -5035,7 +5048,7 @@ ix86_frame_pointer_required (void)
 
   /* Win64 SEH, very large frames need a frame-pointer as maximum stack
  allocation is 4GB.  */
-  if (TARGET_64BIT_MS_ABI && get_frame_size () > SEH_MAX_FRAME_SIZE)
+  if (TARGET_64BIT_MS_ABI && ix86_get_frame_size () > SEH_MAX_FRAME_SIZE)
 return true;
 
   /* SSE saves require frame-pointer when stack is misaligned.  */
@@ -5842,7 +5855,7 @@ ix86_compute_frame_layout (void)
   unsigned HOST_WIDE_INT stack_alignment_needed;
   HOST_WIDE_INT offset;
   unsigned HOST_WIDE_INT preferred_alignment;
-  HOST_WIDE_INT size = get_frame_size ();
+  HOST_WIDE_INT size = ix86_get_frame_size ();
   HOST_WIDE_INT to_allocate;
 
   /* m->call_ms2sysv is initially enabled in ix86_expand_call for all 64-bit
@@ -7436,11 +7449,11 @@ output_probe_stack_range (rtx reg, rtx end)
   return "";
 }
 
-/* Return true if stack frame is required.  Update STACK_ALIGNMENT
-   to the largest alignment, in bits, of stack slot used if stack
-   frame is required and CHECK_STACK_SLOT is true.  */
+/* Set no_stack_frame to true if stack frame isn't required.  Update
+   STACK_ALIGNMENT to the largest alignment, in bits, of stack slot
+   used if stack frame is required and CHECK_STACK_SLOT is true.  */
 
-static bool
+static void
 ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
bool check_stack_slot)
 {
@@ -7489,7 +7502,7 @@ ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
  }
 }
 
-  return require_stack_frame;
+  cfun->machine->no_stack_frame = !require_stack_frame;
 }
 
 /* Finalize stack_realign_needed and frame_pointer_needed flags, which
@@ -7519,6 +7532,14 @@ ix86_finalize_stack_frame_flags (void)
   return;
 }
 
+  /* It is always safe to compute max_used_stack_alignment.  We
+ compute it only if 128-bit aligned load/store may be generated
+ on misaligned stack slot which will lead to segfault. */
+  bool check_stack_slot
+= (stack_realign || crtl->max_used_stack_slot_alignment >= 128);
+  ix86_find_max_used_stack_alignment (stack_alignment,
+ check_stack_slot);
+
   /* If the only reason for frame_pointer_needed is that we conservatively
  assumed stack realignment might be needed or -fno-omit-frame-pointer
  is used, but in the end nothing that needed the stack alignment had
@@ -7538,12 +7559,11 @@ ix86_finalize_stack_frame_flags (void)
  

[PATCHv2] debug: make -feliminate-unused-debug-symbols the default [PR debug/86964]

2019-05-21 Thread Thomas De Schampheleire
From: Thomas De Schampheleire 

In addition to making -feliminate-unused-debug-symbols work for the DWARF
format (see [1]), make this option the default. This was already the
behavior before, e.g. in GCC 4.9.x.
[1] https://gcc.gnu.org/viewcvs/gcc?view=revision=269925

This change requires some updates to test cases, which expected the previous
default of not eliminating unused debug symbols.

gcc/ChangeLog:

2019-05-21  Thomas De Schampheleire  

PR debug/86964
* common.opt (feliminate-unused-debug-symbols): Enable by default.
* doc/invoke.texi (Debugging Options): Document new default of
-feliminate-unused-debug-symbols and remove restriction to 'stabs'.

gcc/testsuite/ChangeLog:

2019-05-21  Thomas De Schampheleire  

PR debug/86964
* g++.dg/debug/dwarf2/fesd-any.C: Use
-fno-eliminate-unused-debug-symbols.
* g++.dg/debug/dwarf2/fesd-baseonly.C: Likewise.
* g++.dg/debug/dwarf2/fesd-none.C: Likewise.
* g++.dg/debug/dwarf2/fesd-reduced.C: Likewise.
* g++.dg/debug/dwarf2/fesd-sys.C: Likewise.
* g++.dg/debug/dwarf2/inline-var-1.C: Likewise.
* g++.dg/debug/enum-2.C: Likewise.
* gcc.dg/debug/dwarf2/fesd-any.c: Likewise.
* gcc.dg/debug/dwarf2/fesd-baseonly.c: Likewise.
* gcc.dg/debug/dwarf2/fesd-none.c: Likewise.
* gcc.dg/debug/dwarf2/fesd-reduced.c: Likewise.
* gcc.dg/debug/dwarf2/fesd-sys.c: Likewise.
---
 gcc/common.opt| 2 +-
 gcc/doc/invoke.texi   | 9 +
 gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C  | 2 +-
 gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C | 2 +-
 gcc/testsuite/g++.dg/debug/dwarf2/fesd-none.C | 2 +-
 gcc/testsuite/g++.dg/debug/dwarf2/fesd-reduced.C  | 2 +-
 gcc/testsuite/g++.dg/debug/dwarf2/fesd-sys.C  | 2 +-
 gcc/testsuite/g++.dg/debug/dwarf2/inline-var-1.C  | 2 +-
 gcc/testsuite/g++.dg/debug/enum-2.C   | 1 +
 gcc/testsuite/gcc.dg/debug/dwarf2/fesd-any.c  | 2 +-
 gcc/testsuite/gcc.dg/debug/dwarf2/fesd-baseonly.c | 2 +-
 gcc/testsuite/gcc.dg/debug/dwarf2/fesd-none.c | 2 +-
 gcc/testsuite/gcc.dg/debug/dwarf2/fesd-reduced.c  | 2 +-
 gcc/testsuite/gcc.dg/debug/dwarf2/fesd-sys.c  | 2 +-
 14 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index d342c4f3749..0e72fd08ec4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1379,7 +1379,7 @@ Common Report Var(flag_ipa_sra) Init(0) Optimization
 Perform interprocedural reduction of aggregates.
 
 feliminate-unused-debug-symbols
-Common Report Var(flag_debug_only_used_symbols)
+Common Report Var(flag_debug_only_used_symbols) Init(1)
 Perform unused symbol elimination in debug info.
 
 feliminate-unused-debug-types
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5e3e8873d35..06c8c60f19e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -388,7 +388,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-eliminate-unused-debug-types @gol
 -femit-struct-debug-baseonly  -femit-struct-debug-reduced @gol
 -femit-struct-debug-detailed@r{[}=@var{spec-list}@r{]} @gol
--feliminate-unused-debug-symbols  -femit-class-debug-always @gol
+-fno-eliminate-unused-debug-symbols  -femit-class-debug-always @gol
 -fno-merge-debug-strings  -fno-dwarf2-cfi-asm @gol
 -fvar-tracking  -fvar-tracking-assignments}
 
@@ -7827,10 +7827,11 @@ confusion with @option{-gdwarf-@var{level}}.
 Instead use an additional @option{-g@var{level}} option to change the
 debug level for DWARF.
 
-@item -feliminate-unused-debug-symbols
+@item -fno-eliminate-unused-debug-symbols
 @opindex feliminate-unused-debug-symbols
-Produce debugging information in stabs format (if that is supported),
-for only symbols that are actually used.
+@opindex fno-eliminate-unused-debug-symbols
+By default, no debug information is produced for symbols that are not actually
+used. Use this option if you want debug information for all symbols.
 
 @item -femit-class-debug-always
 @opindex femit-class-debug-always
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C 
b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
index a4a0b50ee50..5868ebc9c85 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-options "-gdwarf-2 -dA -femit-struct-debug-detailed=any" }
+// { dg-options "-gdwarf-2 -dA -femit-struct-debug-detailed=any 
-fno-eliminate-unused-debug-symbols" }
 // { dg-final { scan-assembler "timespec.*DW_AT_name" } }
 // { dg-final { scan-assembler "tv_sec.*DW_AT_name" } }
 // { dg-final { scan-assembler "tv_nsec.*DW_AT_name" } }
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C 
b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C
index 4f580ebd361..fe0016a4563 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C
@@ -1,5 +1,5 

Re: [C++ Patch] PR 67184 ("Missed optimization with C++11 final specifier")

2019-05-21 Thread Jason Merrill

On 5/16/19 7:12 PM, Paolo Carlini wrote:

Hi,

when Roberto Agostino and I implemented the front-end devirtualization 
of final overriders we missed this case, where it comes from the base. 
It seems to me that by way of access_path the existing approach can be 
neatly extended. Tested x86_64-linux.



+ || CLASSTYPE_FINAL (TREE_TYPE (cand->access_path)))


This will give the wrong type when the function is called with an 
explicit scope; you probably want to look at argtype instead.


Jason



Re: [RS6000] Don't pass -many to the assembler

2019-05-21 Thread Segher Boessenkool
Hi!

On Tue, May 21, 2019 at 10:22:26PM +0930, Alan Modra wrote:
> This is a repost of
> https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00911.html with a small
> tweak to rs6000_machine_from_flags (&~ instead of ^).
> 
> Bootstrapped and regression tested powerpc64le-linux power8 and
> power9.  OK to apply now that we're in stage1?
> 
>   * config/rs6000/rs6000.h (ASM_OPT_ANY): Define.
>   (ASM_CPU_SPEC): Conditionally add -many.
>   * config/rs6000/rs6000.c (rs6000_machine): New static var.
>   (rs6000_machine_from_flags, emit_asm_machine): New functions..
>   (rs6000_file_start): ..extracted from here, and modified to
>   test all ISA bits.
>   (rs6000_output_function_prologue): Emit .machine as necessary.
>   * testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c: Don't use
>   power mnemonics.
>   * testsuite/gcc.dg/vect/O3-pr70130.c: Disable default options
>   added by check_vect_support_and_set_flags.
>   * testsuite/gcc.dg/vect/pr48765.c: Likewise.
>   * testsuite/gfortran.dg/vect/pr45714-b.f: Likewise.


> +static const char *
> +rs6000_machine_from_flags (void)
> +{
> +  if ((rs6000_isa_flags & (ISA_3_0_MASKS_SERVER & ~ISA_2_7_MASKS_SERVER)) != 
> 0)
> +return "power9";
> +  if ((rs6000_isa_flags & (ISA_2_7_MASKS_SERVER & ~ISA_2_6_MASKS_SERVER)) != 
> 0)
> +return "power8";
> +  if ((rs6000_isa_flags & (ISA_2_6_MASKS_SERVER & ~ISA_2_5_MASKS_SERVER)) != 
> 0)
> +return "power7";
> +  if ((rs6000_isa_flags & (ISA_2_5_MASKS_SERVER & ~ISA_2_4_MASKS)) != 0)
> +return "power6";
> +  if ((rs6000_isa_flags & (ISA_2_4_MASKS & ~ISA_2_1_MASKS)) != 0)
> +return "power5";
> +  if ((rs6000_isa_flags & ISA_2_1_MASKS) != 0)
> +return "power4";
> +  if ((rs6000_isa_flags & OPTION_MASK_POWERPC64) != 0)
> +return "ppc64";
> +  return "ppc";
> +}

As you know I'm trying to get rid of most of the separate user-selectable
features we have currently.  I think I'll steal this code :-)

(Is Power5 2.4?  Not 2.2?)

> +{
> +  rs6000_output_savres_externs (file);
> +#ifdef USING_ELFOS_H
> +  const char *curr_machine = rs6000_machine_from_flags ();
> +  if (rs6000_machine != curr_machine)
> + {
> +   rs6000_machine = curr_machine;
> +   emit_asm_machine ();
> + }
> +#endif
> +}

Comparing fixed strings using ==...  Not great.  I'll change things to use
an enum soon, so it's okay for now.

> diff --git a/gcc/testsuite/gcc.dg/vect/O3-pr70130.c 
> b/gcc/testsuite/gcc.dg/vect/O3-pr70130.c
> index 18a295c83f0..f8b84405140 100644
> --- a/gcc/testsuite/gcc.dg/vect/O3-pr70130.c
> +++ b/gcc/testsuite/gcc.dg/vect/O3-pr70130.c
> @@ -1,5 +1,5 @@
>  /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */
> -/* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */
> +/* { dg-additional-options "-mcpu=power7 -mno-power9-vector 
> -mno-power8-vector" { target powerpc*-*-* } } */

-mdejagnu-cpu=power7 should make the -mno-* things unnecessary I think?
Hrm, I missed the few testcases outside gcc.target/powerpc/ when I did
that.

Please try that?  Okay for trunk with that.  Thanks!


Segher


Re: [C++ PATCH] Using decls

2019-05-21 Thread Nathan Sidwell

On 5/21/19 10:43 AM, Marek Polacek wrote:

Thanks for the patch and sorry for nitpicking:

On Tue, May 21, 2019 at 10:32:31AM -0400, Nathan Sidwell wrote:

-/* Process a local-scope or namespace-scope using declaration.  SCOPE
+/* Process a local-scope or namespace-scope using declaration.
+   FIXME


This doesn't look right.  You meant to document the INSERT_P param, right?


Yes indeed.  Found these when merging back to modules.  Thanks for noticing!


--
Nathan Sidwell


Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-05-21 Thread Jeff Law
On 5/21/19 8:24 AM, Richard Earnshaw (lists) wrote:
> On 20/05/2019 23:42, Joseph Myers wrote:
> 
>> I'm not particularly concerned with distinguishing between different names 
>> and email addresses for an author depending on when or in what capacity 
>> they contributed a change, or with the cases where a patch was committed 
>> for someone else and SVN simply doesn't provide a way to distinguish that 
>> information.  However, since some people were concerned with that, and 
>> since the feature needed for that was implemented (the "changelogs" 
>> feature in reposurgeon, which will do it as long as a proper ChangeLog 
>> entry was included in the commit), we may as well use that feature.  (The 
>> author map is still needed for commits without ChangeLog entries.)
>>
> 
> For very old commits, back in the GCC 2 days, even the ChangeLogs don't
> always show the author.  At that time only the committers' name was
> used.  I'm pretty sure that some of my earliest patches to GCC were
> committed by tege and kenner under their names.  So we'll never really
> be able to fully reconstruct the early history.
I'd say we make a reasonable effort here, but the importance of
authorship decays rapidly the further back we go.  Even when the author
(or committer) is still around, they often can't remember the details
around commits from that era.

jeff


Re: [C++ PATCH] Using decls

2019-05-21 Thread Marek Polacek
Thanks for the patch and sorry for nitpicking:

On Tue, May 21, 2019 at 10:32:31AM -0400, Nathan Sidwell wrote:
> -/* Process a local-scope or namespace-scope using declaration.  SCOPE
> +/* Process a local-scope or namespace-scope using declaration.
> +   FIXME

This doesn't look right.  You meant to document the INSERT_P param, right?

> +  /* DR 36 questions why using-decls at function scope may not be
> +  duplicates.  Disallow it, as C++11 claimed and PR 20420
> +  implemented.  */
> +  do_nonmember_using_decl (lookup, true, true, &value, &type);
> +
> +  if (!value)
> + ;
> +  else if (binding && value == binding->value)
> + ;
> +  else if (binding && binding->value && TREE_CODE (value) == OVERLOAD)
> + {
> +   update_local_overload (IDENTIFIER_BINDING (name), value);
> +   IDENTIFIER_BINDING (name)->value = value;
> + }
> +  else
> + /* Install the new binding.  */
> + // FIXME: Short circuit P_L_B

Was this FIXME meant to be here?

Marek


Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Jan Hubicka
> And that should be done at RTL creation time instead of
> repeatedly over and over.  Like with the following.
> 
> Bootstrap / regtest on x86_64-unknown-linux-gnu in progress.

Thanks,
for TBAA stats I now get

Alias oracle query stats:
  refs_may_alias_p: 3022975 disambiguations, 3321454 queries
  ref_maybe_used_by_call_p: 6451 disambiguations, 3048555 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  aliasing_component_ref_p: 187 disambiguations, 16103 queries
  TBAA oracle: 1452502 disambiguations 2956630 queries
   550659 are in alias set 0
   576739 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   260391 are dependent in the DAG
   116339 are aritificially in conflict with void *

So some improvement from original (but less great than with my wrong
patch):

  refs_may_alias_p: 3027850 disambiguations, 3340416 queries
  ref_maybe_used_by_call_p: 6451 disambiguations, 3053430 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  aliasing_component_ref_p: 151 disambiguations, 12565 queries
  TBAA oracle: 1468434 disambiguations 3010778 queries
   550723 are in alias set 0
   614261 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   260983 are dependent in the DAG
   116377 are aritificially in conflict with void *

Honza

> 
> Richard.
> 
> 2019-05-21  Richard Biener  
> 
> * alias.c (ao_ref_from_mem): Move stack-slot sharing
> rewrite ...
> * emit-rtl.c (set_mem_attributes_minus_bitpos): ... here.
> 
> Index: gcc/alias.c
> ===
> --- gcc/alias.c   (revision 271463)
> +++ gcc/alias.c   (working copy)
> @@ -307,18 +307,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>   && TREE_CODE (TMR_BASE (base)) == SSA_NAME)))
>  return false;
>  
> -  /* If this is a reference based on a partitioned decl replace the
> - base with a MEM_REF of the pointer representative we
> - created during stack slot partitioning.  */
> -  if (VAR_P (base)
> -  && ! is_global_var (base)
> -  && cfun->gimple_df->decls_to_pointers != NULL)
> -{
> -  tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> -  if (namep)
> - ref->base = build_simple_mem_ref (*namep);
> -}
> -
>ref->ref_alias_set = MEM_ALIAS_SET (mem);
>  
>/* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
> Index: gcc/emit-rtl.c
> ===
> --- gcc/emit-rtl.c(revision 271463)
> +++ gcc/emit-rtl.c(working copy)
> @@ -61,6 +61,8 @@ along with GCC; see the file COPYING3.
>  #include "opts.h"
>  #include "predict.h"
>  #include "rtx-vector-builder.h"
> +#include "gimple.h"
> +#include "gimple-ssa.h"
>  
>  struct target_rtl default_target_rtl;
>  #if SWITCHABLE_TARGET
> @@ -2128,6 +2130,26 @@ set_mem_attributes_minus_bitpos (rtx ref
> apply_bitpos = bitpos;
>   }
>  
> +  /* If this is a reference based on a partitioned decl replace the
> +  base with a MEM_REF of the pointer representative we created
> +  during stack slot partitioning.  */
> +  if (attrs.expr
> +   && VAR_P (base)
> +   && ! is_global_var (base)
> +   && cfun->gimple_df->decls_to_pointers != NULL)
> + {
> +   tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> +   if (namep)
> + {
> +   tree *orig_base = 
> +   while (handled_component_p (*orig_base))
> + orig_base = &TREE_OPERAND (*orig_base, 0);
> +   tree aptrt = reference_alias_ptr_type (*orig_base);
> +   *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
> +build_int_cst (aptrt, 0));
> + }
> + }
> +
>/* Compute the alignment.  */
>unsigned int obj_align;
>unsigned HOST_WIDE_INT obj_bitpos;


[C++ PATCH] Using decls

2019-05-21 Thread Nathan Sidwell
This patch reimplements using-decl handling.  It removes the double 
lookup of the target name, and commonizes the local- and namespace- 
scope handling into a single function.


Applying to trunk.

nathan
--
Nathan Sidwell
2019-05-21  Nathan Sidwell  

	gcc/cp/
	* name-lookup.h (struct cp_binding_level): Drop usings field.
	(finish_namespace_using_decl, finish_local_using_decl): Replace with ...
	(finish_nonmember_using_decl): ... this.
	* name-lookup.c (push_using_decl_1, push_using_decl):
	(do_nonmember_using_decl): ... here.  Add INSERT_P arg.  Reimplement.
	(validate_nonmember_using_decl, finish_namespace_using_decl)
	(finish_local_using_decl): Replace with ...
	(finish_nonmember_using_decl): ... this.  Drop DECL parm.
	* parser.c (cp_parser_using_declaration): Don't do lookup here.
	* pt.c (tsubst_expr): Do not do using decl lookup here.

	gcc/testsuite/
	* g++.dg/lookup/using53.C: Adjust diagnostic.

	libcc1/
	* libcp1plugin.cc (plugin_add_using_decl): Use
	finish_nonmember_using_decl.

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c	(revision 271427)
+++ gcc/cp/name-lookup.c	(working copy)
@@ -3830,40 +3830,4 @@ make_lambda_name (void)
 }
 
-/* Insert another USING_DECL into the current binding level, returning
-   this declaration. If this is a redeclaration, do nothing, and
-   return NULL_TREE if this not in namespace scope (in namespace
-   scope, a using decl might extend any previous bindings).  */
-
-static tree
-push_using_decl_1 (tree scope, tree name)
-{
-  tree decl;
-
-  gcc_assert (TREE_CODE (scope) == NAMESPACE_DECL);
-  gcc_assert (identifier_p (name));
-  for (decl = current_binding_level->usings; decl; decl = DECL_CHAIN (decl))
-if (USING_DECL_SCOPE (decl) == scope && DECL_NAME (decl) == name)
-  break;
-  if (decl)
-return namespace_bindings_p () ? decl : NULL_TREE;
-  decl = build_lang_decl (USING_DECL, name, NULL_TREE);
-  USING_DECL_SCOPE (decl) = scope;
-  DECL_CHAIN (decl) = current_binding_level->usings;
-  current_binding_level->usings = decl;
-  return decl;
-}
-
-/* Wrapper for push_using_decl_1.  */
-
-static tree
-push_using_decl (tree scope, tree name)
-{
-  tree ret;
-  timevar_start (TV_NAME_LOOKUP);
-  ret = push_using_decl_1 (scope, name);
-  timevar_stop (TV_NAME_LOOKUP);
-  return ret;
-}
-
 /* Same as pushdecl, but define X in binding-level LEVEL.  We rely on the
caller to set DECL_CONTEXT properly.
@@ -3919,89 +3883,17 @@ pushdecl_outermost_localscope (tree x)
 }
 
-/* Check a non-member using-declaration. Return the name and scope
-   being used, and the USING_DECL, or NULL_TREE on failure.  */
-
-static tree
-validate_nonmember_using_decl (tree decl, tree scope, tree name)
-{
-  /* [namespace.udecl]
-   A using-declaration for a class member shall be a
-   member-declaration.  */
-  if (TYPE_P (scope))
-{
-  error ("%qT is not a namespace or unscoped enum", scope);
-  return NULL_TREE;
-}
-  else if (scope == error_mark_node)
-return NULL_TREE;
-
-  if (TREE_CODE (decl) == TEMPLATE_ID_EXPR)
-{
-  /* 7.3.3/5
-	   A using-declaration shall not name a template-id.  */
-  error ("a using-declaration cannot specify a template-id.  "
-	 "Try %", name);
-  return NULL_TREE;
-}
-
-  if (TREE_CODE (decl) == NAMESPACE_DECL)
-{
-  error ("namespace %qD not allowed in using-declaration", decl);
-  return NULL_TREE;
-}
-
-  if (TREE_CODE (decl) == SCOPE_REF)
-{
-  /* It's a nested name with template parameter dependent scope.
-	 This can only be using-declaration for class member.  */
-  error ("%qT is not a namespace", TREE_OPERAND (decl, 0));
-  return NULL_TREE;
-}
-
-  decl = OVL_FIRST (decl);
-
-  /* Make a USING_DECL.  */
-  tree using_decl = push_using_decl (scope, name);
-
-  if (using_decl == NULL_TREE
-  && at_function_scope_p ()
-  && VAR_P (decl))
-/* C++11 7.3.3/10.  */
-error ("%qD is already declared in this scope", name);
-  
-  return using_decl;
-}
-
-/* Process a local-scope or namespace-scope using declaration.  SCOPE
+/* Process a local-scope or namespace-scope using declaration.
+   FIXME
is the nominated scope to search for NAME.  VALUE_P and TYPE_P
point to the binding for NAME in the current scope and are
updated.  */
 
-static void
-do_nonmember_using_decl (tree scope, tree name, tree *value_p, tree *type_p)
+static bool
+do_nonmember_using_decl (name_lookup &lookup, bool fn_scope_p,
+			 bool insert_p, tree *value_p, tree *type_p)
 {
-  name_lookup lookup (name, 0);
-
-  if (!qualified_namespace_lookup (scope, &lookup))
-{
-  error ("%qD not declared", name);
-  return;
-}
-  else if (TREE_CODE (lookup.value) == TREE_LIST)
-{
-  error ("reference to %qD is ambiguous", name);
-  print_candidates (lookup.value);
-  lookup.value = NULL_TREE;
-}
-
-  if (lookup.type && TREE_CODE (lookup.type) == TREE_LIST)
-{
-  error 

Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-05-21 Thread Richard Biener
On Tue, May 21, 2019 at 12:12 PM Richard Biener wrote:
>
> On Mon, May 20, 2019 at 4:51 PM Feng Xue OS wrote:
> >
> > > I don't see how it is safe in a late pass when it is not safe in an
> > > earlier one.  Optimization is imperfect - we could fail to remove
> > > an "obvious" never taken exit and still have a loop that appears to be
> > > finite according to our definition.
> >
> > Yes, it is. This is somewhat similar to the strict-aliasing option or a
> > loop dependence pragma. The compiler tries to do something based on the
> > hint you give it, but does not ensure correctness.
> >
> > > The only way
> > > to define it would be if there was, at any point, an exit from the
> > > loop (and there it _may_ exclude EH edges) then
> > > the loop is assumed to be finite.
> >
> > I don't catch your point. If we treat an infinite loop as finite, it's bad
> > because the loop might be removed.
> >
> > Suppose we have a function:
> >
> > void foo(int bound)
> >  { for (int i = 0; i <= bound; i++); }
> >
> >  In an early CD-DCE pass, "bound" is represented as a variable, and the loop
> > has an exit, so it is assumed to be finite, and is removed.
> >
> > But in a late pass, this function is inlined into another one, and "bound"
> > has the value INT_MAX, so this loop is infinite, and here we know it should
> > not be removed.
>
> But if "bound" is always INT_MAX but that's not visible to the
> compiler we will still remove the
> loop so I see no difference with removing it always.
>
> > This is why I suggest doing the optimization as late as possible.
>
> But this will defeat the purpose of allowing followup optimizations.
>
> IMHO the only "sensible" thing is to do
>
> Index: gcc/tree-ssa-dce.c
> ===
> --- gcc/tree-ssa-dce.c  (revision 271415)
> +++ gcc/tree-ssa-dce.c  (working copy)
> @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
>   }
>
>FOR_EACH_LOOP (loop, 0)
> -   if (!finite_loop_p (loop))
> +   if (!loop_has_exit_edges (loop))
>   {
> if (dump_file)
>   fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);

Bootstrapped / tested on x86_64-unknown-linux-gnu.  Fallout:

FAIL: gcc.dg/loop-unswitch-1.c scan-tree-dump unswitch ";; Unswitching loop"
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first match heuristics: 2.20%" 3
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first match heuristics: 5.50%" 1
FAIL: gcc.dg/uninit-28-gimple.c  (test for bogus messages, line 9)
FAIL: gcc.dg/graphite/scop-19.c scan-tree-dump-times graphite "number of SCoPs: 0" 2
...
UNRESOLVED: gcc.dg/tree-ssa/20040211-1.c scan-tree-dump cddce2 "if "
FAIL: gcc.dg/tree-ssa/loop-10.c scan-tree-dump-times optimized "if " 3
FAIL: gcc.dg/tree-ssa/pr84648.c scan-tree-dump-times cddce1 "Found loop 1 to be finite: upper bound found" 1
FAIL: gcc.dg/tree-ssa/split-path-6.c scan-tree-dump-times split-paths "Duplicating join block" 3
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread2 "FSM"
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread3 "FSM"

I didn't check whether the testcases are sensible for loop removal (or what
actually happens).

Richard.

> that also has the obvious advantage that we don't need to replace the loop
> with a trap() but have a place to forward control flow to.  The loop in the
> following testcase is then successfully removed:
>
> int main(int argc, char **argv)
> {
>   unsigned i = argc;
>   while (i+=2);
>   return 0;
> }
>
> Likewise is the loop
>
> void **q;
> int main(int argc, char **argv)
> {
>   void **p = q;
>   while (p = (void **)*p);
>   return 0;
> }
>
> (that's the pointer-chasing).  Not with -fnon-call-exceptions
> -fexceptions though.
>
> Richard.
>
> > Feng
> >
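
A side note on the quoted `while (i+=2);` testcase: whether that loop ever
exits depends on the parity of the starting value, since an odd counter
stepping by 2 never reaches zero. The sketch below is only an illustration
and not part of the thread; it uses an 8-bit counter and an invented helper
name (steps_until_zero) so the wrap-around can be observed quickly.

```c
#include <stdint.h>

/* Hypothetical helper: count how many "i += 2" steps an 8-bit unsigned
   counter needs to reach zero, or return -1 if it does not get there
   within one full period of 256 steps.  */
static int
steps_until_zero (uint8_t start)
{
  uint8_t i = start;
  for (int n = 1; n <= 256; n++)
    {
      /* Unsigned arithmetic wraps modulo 256 here.  */
      i = (uint8_t) (i + 2);
      if (i == 0)
        return n;
    }
  return -1;
}
```

An even start wraps around to zero, while an odd start stays odd forever, so
the loop has an exit edge yet is not finite for every input.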


Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-05-21 Thread Richard Earnshaw (lists)
On 20/05/2019 23:42, Joseph Myers wrote:

> I'm not particularly concerned with distinguishing between different names 
> and email addresses for an author depending on when or in what capacity 
> they contributed a change, or with the cases where a patch was committed 
> for someone else and SVN simply doesn't provide a way to distinguish that 
> information.  However, since some people were concerned with that, and 
> since the feature needed for that was implemented (the "changelogs" 
> feature in reposurgeon, which will do it as long as a proper ChangeLog 
> entry was included in the commit), we may as well use that feature.  (The 
> author map is still needed for commits without ChangeLog entries.)
> 

For very old commits, back in the GCC 2 days, even the ChangeLogs don't
always show the author.  At that time only the committer's name was
used.  I'm pretty sure that some of my earliest patches to GCC were
committed by tege and kenner under their names.  So we'll never really
be able to fully reconstruct the early history.

R.


[PATCH] Handle ABS_EXPR in rewrite_to_defined_overflow

2019-05-21 Thread Richard Biener


The following makes us properly rewrite ABS_EXPR to avoid
undefined overflow when hoisting it from a not always executed
region.
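
To illustrate why the rewrite matters, here is a minimal C sketch (not part
of the patch; the helper name absu is made up for illustration). Signed
abs (INT_MIN) overflows, so a hoisted ABS_EXPR could execute undefined
behavior on a path where the original code never evaluated it. Computing the
absolute value in the unsigned type, which is roughly what ABSU_EXPR
expresses, is defined for every input:

```c
#include <limits.h>

/* Hypothetical helper: absolute value of X computed in unsigned
   arithmetic, so that X == INT_MIN is well defined.  */
static unsigned int
absu (int x)
{
  unsigned int ux = (unsigned int) x;
  /* Unsigned negation wraps modulo 2^N instead of overflowing.  */
  return x < 0 ? 0u - ux : ux;
}
```

The result type is unsigned, which is why the rewritten statement gets a new
unsigned LHS in the patch below.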

Bootstrap & regtest on x86_64-unknown-linux-gnu in progress.

Richard.

2019-05-21  Richard Biener  

* gimple-fold.c (arith_code_with_undefined_signed_overflow):
Add ABS_EXPR.
(rewrite_to_defined_overflow): Handle rewriting ABS_EXPR
as ABSU_EXPR.

* gcc.dg/tree-ssa/ssa-lim-13.c: New testcase.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 271463)
+++ gcc/gimple-fold.c   (working copy)
@@ -7329,6 +7329,7 @@ arith_code_with_undefined_signed_overflo
 {
   switch (code)
 {
+case ABS_EXPR:
 case PLUS_EXPR:
 case MINUS_EXPR:
 case MULT_EXPR:
@@ -7361,12 +7362,15 @@ rewrite_to_defined_overflow (gimple *stm
   tree lhs = gimple_assign_lhs (stmt);
   tree type = unsigned_type_for (TREE_TYPE (lhs));
   gimple_seq stmts = NULL;
-  for (unsigned i = 1; i < gimple_num_ops (stmt); ++i)
-{
-  tree op = gimple_op (stmt, i);
-  op = gimple_convert (&stmts, type, op);
-  gimple_set_op (stmt, i, op);
-}
+  if (gimple_assign_rhs_code (stmt) == ABS_EXPR)
+gimple_assign_set_rhs_code (stmt, ABSU_EXPR);
+  else
+for (unsigned i = 1; i < gimple_num_ops (stmt); ++i)
+  {
+   tree op = gimple_op (stmt, i);
+   op = gimple_convert (&stmts, type, op);
+   gimple_set_op (stmt, i, op);
+  }
   gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
 gimple_assign_set_rhs_code (stmt, PLUS_EXPR);
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c  (working copy)
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fgimple -fdump-tree-lim2-details" } */
+
+int __GIMPLE (ssa,startwith("lim"))
+foo (int x, int n)
+{
+  int i;
+  int r;
+  int _1;
+  int _2;
+  int _6;
+
+  __BB(2):
+  goto __BB7;
+
+  __BB(3):
+  if (i_5 == 17)
+goto __BB8;
+  else
+goto __BB4;
+
+  __BB(4):
+  _1 = i_5 & 1;
+  if (_1 != 0)
+goto __BB5;
+  else
+goto __BB6;
+
+  __BB(5):
+  _2 = __ABS x_8(D);
+  r_9 = _2 / 5;
+  goto __BB6;
+
+  __BB(6):
+  r_3 = __PHI (__BB5: r_9, __BB4: r_4);
+  i_10 = i_5 + 1;
+  goto __BB7;
+
+  __BB(7,loop_header(1)):
+  r_4 = __PHI (__BB2: 1, __BB6: r_3);
+  i_5 = __PHI (__BB2: 0, __BB6: i_10);
+  if (i_5 < n_7(D))
+goto __BB3;
+  else
+goto __BB8;
+
+  __BB(8):
+  _6 = __PHI (__BB3: 0, __BB7: r_4);
+  return _6;
+}
+
+/* { dg-final { scan-tree-dump-times "Moving statement" 2 "lim2" } } */
+/* { dg-final { scan-tree-dump "ABSU_EXPR" "lim2" } } */


Re: [PATCH] [aarch64] Introduce flags for SVE2.

2019-05-21 Thread Kyrill Tkachov

Hi Matthew,

On 5/15/19 4:11 PM, Matthew Malcomson wrote:

> Matthew Malcomson  writes:
>> @@ -326,16 +326,22 @@ int opt_ext_cmp (const void* a, const void* b)

Cheers Richard -- modified patch attached and inlined.
MM


This looks ok to me FWIW (you'll still need maintainer approval).

The size of the change is mostly due to mechanical updates to use 
uint64_t to hold the flags.
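
As a hedged illustration of why the comparator in opt_ext_cmp changes along
with the type (a sketch only; the function names below are invented and are
not in the patch): once the flags are 64 bits wide, returning their
difference through an int can lose the high bits and report two different
flag sets as equal, whereas an explicit three-way comparison cannot.

```c
#include <stdint.h>

/* Hypothetical: order flag sets so that the larger value sorts first,
   using explicit comparisons rather than subtraction.  */
static int
cmp_flags_desc (uint64_t a, uint64_t b)
{
  if (a != b)
    return a < b ? 1 : -1;
  return 0;
}

/* Hypothetical: what a subtraction-based comparator effectively sees
   when only the low 32 bits of the difference survive.  */
static uint32_t
low_bits_of_difference (uint64_t a, uint64_t b)
{
  return (uint32_t) (b - a);
}
```

For two flag sets that differ only in bit 32, the low 32 bits of the
difference are all zero, so a truncating comparator would call them equal.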


I'm guessing the ChangeLog remains the same?

Thanks,

Kyrill


### Attachment also inlined for ease of reply    ###



diff --git a/gcc/common/config/aarch64/aarch64-common.c b/gcc/common/config/aarch64/aarch64-common.c
index bab3ab3fa36c66906d1b4367e2b7bfb1bf6aa08c..f9051056589861ce0ffe1bae4fa04cf44d34b9a2 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -170,9 +170,9 @@ aarch64_handle_option (struct gcc_options *opts,
 struct aarch64_option_extension
 {
   const char *const name;
-  const unsigned long flag_canonical;
-  const unsigned long flags_on;
-  const unsigned long flags_off;
+  const uint64_t flag_canonical;
+  const uint64_t flags_on;
+  const uint64_t flags_off;
   const bool is_synthetic;
 };

@@ -201,14 +201,14 @@ struct processor_name_to_arch
 {
   const std::string processor_name;
   const enum aarch64_arch arch;
-  const unsigned long flags;
+  const uint64_t flags;
 };

 struct arch_to_arch_name
 {
   const enum aarch64_arch arch;
   const std::string arch_name;
-  const unsigned long flags;
+  const uint64_t flags;
 };

 /* Map processor names to the architecture revision they implement and
@@ -238,7 +238,7 @@ static const struct arch_to_arch_name all_architectures[] =
    a copy of the string is created and stored to INVALID_EXTENSION.  */

 enum aarch64_parse_opt_result
-aarch64_parse_extension (const char *str, unsigned long *isa_flags,
+aarch64_parse_extension (const char *str, uint64_t *isa_flags,
  std::string *invalid_extension)
 {
   /* The extension string is parsed left to right.  */
@@ -326,18 +326,21 @@ int opt_ext_cmp (const void* a, const void* b)
  turns on as a dependency.  As an example +dotprod turns on FL_DOTPROD and
  FL_SIMD.  As such the set of bits represented by this option is
  {FL_DOTPROD, FL_SIMD}. */
-  unsigned long total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
-  unsigned long total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
+  uint64_t total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
+  uint64_t total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
   int popcnt_a = popcount_hwi ((HOST_WIDE_INT)total_flags_a);
   int popcnt_b = popcount_hwi ((HOST_WIDE_INT)total_flags_b);
   int order = popcnt_b - popcnt_a;

   /* If they have the same amount of bits set, give it a more
  deterministic ordering by using the value of the bits themselves.  */
-  if (order == 0)
-    return total_flags_b - total_flags_a;
+  if (order != 0)
+    return order;

-  return order;
+  if (total_flags_a != total_flags_b)
+    return total_flags_a < total_flags_b ? 1 : -1;
+
+  return 0;
 }

 /* Implement TARGET_OPTION_INIT_STRUCT.  */
@@ -373,9 +376,9 @@ aarch64_option_init_struct (struct gcc_options *opts ATTRIBUTE_UNUSED)
 */

 static bool
-aarch64_contains_opt (unsigned long isa_flag_bits, opt_ext *opt)
+aarch64_contains_opt (uint64_t isa_flag_bits, opt_ext *opt)
 {
-  unsigned long flags_check
+  uint64_t flags_check
 = opt->is_synthetic ? opt->flags_on : opt->flag_canonical;

   return (isa_flag_bits & flags_check) == flags_check;
@@ -388,13 +391,13 @@ aarch64_contains_opt (unsigned long isa_flag_bits, opt_ext *opt)
    that all the "+" flags come before the "+no" flags.  */

 std::string
-aarch64_get_extension_string_for_isa_flags (unsigned long isa_flags,
-   unsigned long default_arch_flags)
+aarch64_get_extension_string_for_isa_flags (uint64_t isa_flags,
+   uint64_t default_arch_flags)
 {
   const struct aarch64_option_extension *opt = NULL;
   std::string outstr = "";

-  unsigned long isa_flag_bits = isa_flags;
+  uint64_t isa_flag_bits = isa_flags;

   /* Pass one: Minimize the search space by reducing the set of options
  to the smallest set that still turns on the same features as before in
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 53dcd03590d2e4eebac83f03039c442fca7f5d5d..4b10c62d20401a66374eb68e36531d73df300af1 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -57,17 +57,20 @@

 /* Enabling "fp" just enables "fp".
    Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
-   "sha3", sm3/sm4 and "sve".  */
-AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2 
| AARCH64_FL_SHA3 | 

Re: [PATCH] tbb-backend effective target should check ability to link TBB

2019-05-21 Thread Jonathan Wakely

On 20/05/19 21:41 -0700, Thomas Rodgers wrote:

> With the addition of "-ltbb" to the v3_target_compile flags (so as to,
> you know, actually try to link tbb).
>
> Tested x86_64-linux, committed to trunk.


This didn't work, I still get a FAIL for every pstl test when
tbb.x86_64 and tbb-devel.x86_64 are installed but not tbb.i686.

Adding -v to RUNTESTFLAGS shows -ltbb wasn't being added to the
command, and because the test program didn't actually refer to any TBB
symbols, it still linked successfully.

This patch uses additional_flags=-ltbb to pass that flag, which seems
to work correctly. I've also cached the result of the effective-target
check, because it's pretty slow and was being run again for each of
the 56 pstl tests, multiplied by the number of test permutations. Now
it runs once per permutation, e.g. once for "unix" and once for
"unix/-m32".

Tested x86_64-linux (with 64-bit tbb only, 32-bit tbb only, and both
32-bit and 64-bit tbb), and powerpc64le-linux (with no tbb installed).

Committed to trunk. I'll backport it to gcc-9-branch too.


commit cab777300b4d549d5dd43152c5d2a5b77fe554bd
Author: Jonathan Wakely 
Date:   Tue May 21 12:32:24 2019 +0100

PR libstdc++/90252 fix effective-target check for TBB

PR libstdc++/90252
* testsuite/lib/libstdc++.exp (check_effective_target_tbb-backend):
Use "additional_flags" to pass -ltbb to v3_target_compile command.
Use check_v3_target_prop_cached to cache the result of the test.

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 26f3d46e089..868a7cf7aec 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1581,34 +1581,33 @@ proc check_effective_target_random_device { } {
 
 # Return 1 if tbb parallel backend is available
 proc check_effective_target_tbb-backend { } {
-global cxxflags
+return [check_v3_target_prop_cached et_tbb {
+# Set up and compile a C++ test program that depends on tbb
+set src tbb_backend[pid].cc
+set exe tbb_backend[pid].x
 
-# Set up and preprocess a C++ test program that depends
-# on tbb
-set src tbb_backend[pid].cc
-set exe tbb_backend[pid].x
-
-set f [open $src "w"]
-puts $f "#include "
-puts $f "#if TBB_INTERFACE_VERSION < 1"
-puts $f "#  error Intel(R) Threading Building Blocks 2018 is required; older versions are not supported."
-puts $f "#endif"
-puts $f "int main ()"
-puts $f "{"
-puts $f "  return 0;"
-puts $f "}"
-close $f
-
-set lines [v3_target_compile $src $exe executable "-ltbb"]
-file delete $src
+set f [open $src "w"]
+puts $f "#include "
+puts $f "#if TBB_INTERFACE_VERSION < 1"
+puts $f "#  error Intel(R) Threading Building Blocks 2018 is required; older versions are not supported."
+puts $f "#endif"
+puts $f "int main ()"
+puts $f "{"
+puts $f "  return 0;"
+puts $f "}"
+close $f
 
-if [string match "" $lines] {
-	# No error message, preprocessing succeeded.
-	verbose "check_v3_tbb-backend: `1'" 2
-	return 1
-}
-verbose "check_v3_tbb-backend: `0'" 2
-return 0
+set lines [v3_target_compile $src $exe executable "additional_flags=-std=c++17 additional_flags=-ltbb"]
+file delete $src
+
+if [string match "" $lines] {
+# No error message, compilation succeeded.
+verbose "check_v3_tbb-backend: `1'" 2
+return 1
+}
+verbose "check_v3_tbb-backend: `0'" 2
+return 0
+}]
 }
 
 set additional_prunes ""


[PATCH] A jump threading opportunity for condition branch

2019-05-21 Thread Jiufu Guo
Hi,

This patch implements a new jump threading opportunity for PR77820.
In this optimization, the conditional jump in a join block is merged into the
predecessor blocks that reach it through unconditional jumps, so the CMP
result no longer has to be moved to a GPR.

It looks like below:

  
  p0 = a CMP b
  goto ;

  
  p1 = c CMP d
  goto ;

  
  # phi = PHI 
  if (phi != 0) goto ; else goto ;

Could be transformed to:

  
  p0 = a CMP b
  if (p0 != 0) goto ; else goto ;

  
  p1 = c CMP d
  if (p1 != 0) goto ; else goto ;


This optimization eliminates:
1. saving CMP result: p0 = a CMP b.
2. additional CMP on branch: if (phi != 0).
3. converting the CMP result, if there is a conversion such as phi = (INT_CONV) p0.
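
At the source level the effect can be pictured roughly as follows. This is a
hedged sketch, not code from the patch: the function names are invented, and
the real transformation happens on GIMPLE basic blocks, not on C source.

```c
/* Hypothetical "before" shape: each arm stores a comparison result and
   the join block tests the stored value again.  */
static int
before_threading (long a, long b, long c, long d, int x)
{
  int t;
  if (x)
    t = a < b;
  else
    t = c > d;
  if (t)
    return 1;
  return 0;
}

/* Hypothetical "after" shape: each arm branches directly on its own
   comparison, so no result is stored and no second test is needed.  */
static int
after_threading (long a, long b, long c, long d, int x)
{
  if (x)
    {
      if (a < b)
        return 1;
      return 0;
    }
  if (c > d)
    return 1;
  return 0;
}
```

Both functions return the same value for every input; only the control flow
differs.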

Bootstrapped and tested on powerpc64le with no regressions (one case is improved),
and new testcases are added. Is this OK for trunk?

Thanks!
Jiufu Guo


[gcc]
2019-05-21  Jiufu Guo  
Lijia He  

PR tree-optimization/77820
* tree-ssa-threadedge.c (cmp_from_unconditional_block): New function.
* tree-ssa-threadedge.c (is_trivial_join_block): New function.
* tree-ssa-threadedge.c (thread_across_edge): Call 
is_trivial_join_block.

[gcc/testsuite]
2019-05-21  Jiufu Guo  
Lijia He  

PR tree-optimization/77820
* gcc.dg/tree-ssa/phi_on_compare-1.c: New testcase.
* gcc.dg/tree-ssa/phi_on_compare-2.c: New testcase.
* gcc.dg/tree-ssa/phi_on_compare-3.c: New testcase.
* gcc.dg/tree-ssa/phi_on_compare-4.c: New testcase.
* gcc.dg/tree-ssa/split-path-6.c: Update testcase.

---
 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c | 32 +
 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c | 27 +++
 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c | 31 
 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c | 40 +++
 gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c |  2 +-
 gcc/tree-ssa-threadedge.c| 91 +++-
 6 files changed, 219 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
new file mode 100644
index 000..ad4890a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+
+void g (int);
+void g1 (int);
+
+void
+f (long a, long b, long c, long d, long x)
+{
+  _Bool t;
+  if (x)
+{
+  g (a + 1);
+  t = a < b;
+  c = d + x;
+}
+  else
+{
+  g (b + 1);
+  a = c + d;
+  t = c > d;
+}
+
+  if (t)
+{
+  g1 (c);
+}
+
+  g (a);
+}
+
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
new file mode 100644
index 000..ca67d65
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+
+void g (void);
+void g1 (void);
+
+void
+f (long a, long b, long c, long d, int x)
+{
+  _Bool t;
+  if (x)
+{
+  t = c < d;
+}
+  else
+{
+  t = a < b;
+}
+
+  if (t)
+{
+  g1 ();
+  g ();
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
new file mode 100644
index 000..a126e97
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+
+void g (void);
+void g1 (void);
+
+void
+f (long a, long b, long c, long d, int x)
+{
+  int t;
+  if (x)
+{
+  t = a < b;
+}
+  else if (d == x)
+{
+  t = c < b;
+}
+  else
+{
+  t = d > c;
+}
+
+  if (t)
+{
+  g1 ();
+  g ();
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
new file mode 100644
index 000..5a50c2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-vrp1" } */
+
+void g (int);
+void g1 (int);
+
+void
+f (long a, long b, long c, long d, int x)
+{
+  int t;
+  _Bool l1 = 0, l2 = 0;
+  if (x)
+{
+  g (a);
+  c = a + b;
+  t = a < b;
+  l1 = 1;
+}
+  else
+{
+  g1 (b);
+  t = c > d;
+  d = c + b;
+  l2 = 1;
+}
+
+  if (t)
+{
+  if (l1 | l2)
+  g1 (c);
+}
+  else
+{
+  g (d);
+  g1 (a + 

Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Richard Biener
On Tue, 21 May 2019, Richard Biener wrote:

> On Tue, 21 May 2019, Jan Hubicka wrote:
> 
> > > > So about 8 times the aliasing_component_refs hit rate.
> > > 
> > > OK, one issue with the patch is that it restores TBAA for the
> > > access which we may _not_ do IIRC.
> > 
> > I can see that with stack sharing we have one memory location that for a
> > while is of type A and later is rewritten by type B, but we already give
> > up on optimizing this because of C++ placement new, right?
> > 
> > In what scenarios can one disambiguate by the alias set of the reference
> > type (which we do in all cases) but not by the alias set of the base type?
> > The code in cfgexpand does not seem to care about either of those.
> 
> I don't remember exactly, but let's change the things independently
> if possible.
> 
> > > 
> > > > Bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > I'd rather not have that new build_simple_mem_ref_with_type_loc
> > > function - the "simple" MEM_REF was to be a way to replace
> > > a plain old INDIRECT_REF.
> > > 
> > > So please instead ...
> > > 
> > > > Honza
> > > > 
> > > > * alias.c (ao_ref_from_mem): Use build_simple_mem_ref_with_type.
> > > > * tree.c (build_simple_mem_ref_with_type_loc): Break out from ...
> > > > (build_simple_mem_ref_loc): ... here.
> > > > * fold-const.h (build_simple_mem_ref_with_type_loc): Declare.
> > > > (build_simple_mem_ref_with_type): New macro.
> > > > Index: alias.c
> > > > ===
> > > > --- alias.c (revision 271379)
> > > > +++ alias.c (working copy)
> > > > @@ -316,7 +316,8 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
> > > >  {
> > > >tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> > > >if (namep)
> > > > -   ref->base = build_simple_mem_ref (*namep);
> > > > +   ref->base = build_simple_mem_ref_with_type
> > > > +     (*namep, build_pointer_type (TREE_TYPE (base)));
> > > 
> > > ...
> > > 
> > > ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
> > >   build_int_cst (TREE_TYPE (*namep), 0));
> > > 
> > > which preserves TBAA behavior but fixes the 'void' type ref.
> > 
> > 
> > My understanding of MEMREF is that it has two types, one is TREE_TYPE
> > (MEMREF) and its ref type taken from TREE_TYPE of the constant.
> > So we will still be dereferencing void which is odd.
> 
> Here we reference 'base' (TREE_TYPE of the mem-ref) and the
> pointer-type for TBAA purposes is the type of the constant.
> void * here simply means alias-set zero.
> 
> > If globbing is necessary, perhaps the outer type should be something like
> > alias set 0 char pointer (used by some builtins such as copysign) or
> > union of all types of vars that gets into a given partition?
> 
> Note the original reason might have been latent bugs in the RTL code
> not correctly dealing with the placement new case.  With stack slot
> sharing you end up with a lot more placement news ;)
> 
> As said, fixing TREE_TYPE (mem-ref) is quite obvious (it should
> never have been void but retained the original type of the base).
> 
> Changing TBAA behavior should be done separately and that should
> probably simply be
> 
>   ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
>                       build_int_cst (build_pointer_type (TREE_TYPE (base)), 0));
> 
> note we retain the original alias-set from RTL:
> 
>   ref->ref_alias_set = MEM_ALIAS_SET (mem);
> 
> and that might _not_ reflect that of the original tree.  For
> example a
> 
>   MEM[&b, (void * ref-all)0] = 1;
> 
> may be represented as ref.base = b; ref.ref_alias_set = 0; ref.ref = NULL;
> losing the ref-all qualification.  So it is _not_ easily possible
> to recreate the original 'base'.  There might be code in the
> component-ref disambiguations looking at those alias-types but that's
> probably fishy for the cases coming in via ao_ref_from_mem which
> means being conservative here is important.
> 
> That may also hold for the type of the reference for ref->base.
> _Nothing_ should really look at that...  In fact the correct
> type should be salvaged by not doing
> 
>   /* Get the base of the reference and see if we have to reject or
>  adjust it.  */
>   base = ao_ref_base (ref);
>   if (base == NULL_TREE)
> return false;
> 
> but doing what ao_ref_base_alias_set does, strip handled-components
> and thus preserve an eventual inner view-converting MEM_REF for
> the purpose of building the stack-slot sharing ref...
> 
> The current (somewhat broken) code simply side-steps this by
> being awkwardly conservative...
> 
> So if we want to go full in on "fixing" ref->base (previously
> we just said doing that is wasted cycles) then do
> 
> Index: gcc/alias.c
> ===
> --- gcc/alias.c (revision 271415)
> +++ gcc/alias.c (working copy)
> @@ -316,7 +316,14 @@ ao_ref_from_mem 

Re: [RS6000] PR90545, gcc.target/powerpc/fold-vec-splats-floatdouble.c fails

2019-05-21 Thread Segher Boessenkool
Hi Alan,

On Tue, May 21, 2019 at 09:01:29PM +0930, Alan Modra wrote:
> Bootstrapped powerpc64le-linux power8 and power9, OK to apply?
> 
> I figure a tweak to register_move_cost is better than sprinkling ?s
> in instruction alternatives.

Yup.

>   PR 90545

PR target/90545

>   * config/rs6000/rs6000.c (rs6000_register_move_cost): Increase
>   power9 direct move cost.
>   * testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c:
>   Correct comments and rename functions to suit parameters.

Okay for trunk with that nit fixed.  Thanks!


Segher


Re: [PATCH] PR bootstrap/87338: Fix ia64 bootstrap comparison regression in r257511

2019-05-21 Thread James Clarke
On 26 Apr 2019, at 15:24, Jeff Law wrote:
> On 4/26/19 2:01 AM, Jakub Jelinek wrote:
>> On Fri, Apr 26, 2019 at 08:58:18AM +0200, Richard Biener wrote:
>>> On Thu, Apr 25, 2019 at 7:52 PM James Clarke wrote:
>>>>
>>>> By using ASM_OUTPUT_LABEL, r257511 forced the assembler to start a new
>>>> bundle when emitting an inline entry label on. Instead, use
>>>> ASM_OUTPUT_DEBUG_LABEL like for the block begin and end labels so tags are
>>>> emitted rather than labels.
>>> 
>>> Looks sensible.  mips is the other port defining ASM_OUTPUT_DEBUG_LABEL,
>>> so either you can do a bootstrap/test on mips as well or I'm asking Matthew
>>> for approval here.
>> 
>> And arm and arc, while they don't define their own ASM_OUTPUT_DEBUG_LABEL,
>> they override TARGET_ASM_INTERNAL_LABEL which is the underlying
>> implementation of the default ASM_OUTPUT_DEBUG_LABEL.  But I agree that
>> for mips it is a significant change, while arm and arc call default_internal_label
>> from their hook, just do additional stuff.
> My tester will bootstrap the mips port within 24hrs after the change is
> committed.  Happy to contact y'all if something goes wrong ;-)  If you
> don't hear from me, assume it didn't cause problems.

Hi all,
It looks like there are no objections to this (assuming it doesn't regress
anything), so could somebody please commit this to trunk?

Thanks,
James



[RS6000] Don't pass -many to the assembler

2019-05-21 Thread Alan Modra
This is a repost of
https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00911.html with a small
tweak to rs6000_machine_from_flags (&~ instead of ^).

Bootstrapped and regression tested powerpc64le-linux power8 and
power9.  OK to apply now that we're in stage1?

* config/rs6000/rs6000.h (ASM_OPT_ANY): Define.
(ASM_CPU_SPEC): Conditionally add -many.
* config/rs6000/rs6000.c (rs6000_machine): New static var.
(rs6000_machine_from_flags, emit_asm_machine): New functions..
(rs6000_file_start): ..extracted from here, and modified to
test all ISA bits.
(rs6000_output_function_prologue): Emit .machine as necessary.
* testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c: Don't use
power mnemonics.
* testsuite/gcc.dg/vect/O3-pr70130.c: Disable default options
added by check_vect_support_and_set_flags.
* testsuite/gcc.dg/vect/pr48765.c: Likewise.
* testsuite/gfortran.dg/vect/pr45714-b.f: Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b5dc5f30f88..d27871ab907 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5664,6 +5664,36 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree type_out,
 /* Default CPU string for rs6000*_file_start functions.  */
 static const char *rs6000_default_cpu;
 
+#ifdef USING_ELFOS_H
+static const char *rs6000_machine;
+
+static const char *
+rs6000_machine_from_flags (void)
+{
+  if ((rs6000_isa_flags & (ISA_3_0_MASKS_SERVER & ~ISA_2_7_MASKS_SERVER)) != 0)
+return "power9";
+  if ((rs6000_isa_flags & (ISA_2_7_MASKS_SERVER & ~ISA_2_6_MASKS_SERVER)) != 0)
+return "power8";
+  if ((rs6000_isa_flags & (ISA_2_6_MASKS_SERVER & ~ISA_2_5_MASKS_SERVER)) != 0)
+return "power7";
+  if ((rs6000_isa_flags & (ISA_2_5_MASKS_SERVER & ~ISA_2_4_MASKS)) != 0)
+return "power6";
+  if ((rs6000_isa_flags & (ISA_2_4_MASKS & ~ISA_2_1_MASKS)) != 0)
+return "power5";
+  if ((rs6000_isa_flags & ISA_2_1_MASKS) != 0)
+return "power4";
+  if ((rs6000_isa_flags & OPTION_MASK_POWERPC64) != 0)
+return "ppc64";
+  return "ppc";
+}
+
+static void
+emit_asm_machine (void)
+{
+  fprintf (asm_out_file, "\t.machine %s\n", rs6000_machine);
+}
+#endif
+
 /* Do anything needed at the start of the asm file.  */
 
 static void
@@ -5729,27 +5759,10 @@ rs6000_file_start (void)
 }
 
 #ifdef USING_ELFOS_H
+  rs6000_machine = rs6000_machine_from_flags ();
   if (!(rs6000_default_cpu && rs6000_default_cpu[0])
   && !global_options_set.x_rs6000_cpu_index)
-{
-  fputs ("\t.machine ", asm_out_file);
-  if ((rs6000_isa_flags & OPTION_MASK_MODULO) != 0)
-   fputs ("power9\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_DIRECT_MOVE) != 0)
-   fputs ("power8\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_POPCNTD) != 0)
-   fputs ("power7\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_CMPB) != 0)
-   fputs ("power6\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_POPCNTB) != 0)
-   fputs ("power5\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_MFCRF) != 0)
-   fputs ("power4\n", asm_out_file);
-  else if ((rs6000_isa_flags & OPTION_MASK_POWERPC64) != 0)
-   fputs ("ppc64\n", asm_out_file);
-  else
-   fputs ("ppc\n", asm_out_file);
-}
+emit_asm_machine ();
 #endif
 
   if (DEFAULT_ABI == ABI_ELFv2)
@@ -27536,7 +27549,17 @@ static void
 rs6000_output_function_prologue (FILE *file)
 {
   if (!cfun->is_thunk)
-rs6000_output_savres_externs (file);
+{
+  rs6000_output_savres_externs (file);
+#ifdef USING_ELFOS_H
+  const char *curr_machine = rs6000_machine_from_flags ();
+  if (rs6000_machine != curr_machine)
+   {
+ rs6000_machine = curr_machine;
+ emit_asm_machine ();
+   }
+#endif
+}
 
   /* ELFv2 ABI r2 setup code and local entry point.  This must follow
  immediately after the global entry point label.  */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index eaf309b45b7..f435d8bf1db 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -70,6 +70,12 @@
 #define PPC405_ERRATUM77 0
 #endif
 
+#if CHECKING_P
+#define ASM_OPT_ANY ""
+#else
+#define ASM_OPT_ANY " -many"
+#endif
+
 /* Common ASM definitions used by ASM_SPEC among the various targets for
handling -mcpu=xxx switches.  There is a parallel list in driver-rs6000.c to
provide the default assembler options if the user uses -mcpu=native, so if
@@ -137,8 +143,8 @@
mvsx: -mpower7; \
mpowerpc64: -mppc64;: %(asm_default)}; \
   :%eMissing -mcpu option in ASM_CPU_SPEC?\n} \
-%{mvsx: -mvsx -maltivec; maltivec: -maltivec} \
--many"
+%{mvsx: -mvsx -maltivec; maltivec: -maltivec}" \
+ASM_OPT_ANY
 
 #define CPP_DEFAULT_SPEC ""
 
diff --git a/gcc/testsuite/gcc.dg/vect/O3-pr70130.c b/gcc/testsuite/gcc.dg/vect/O3-pr70130.c
index 

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Segher Boessenkool
On Tue, May 21, 2019 at 12:20:50PM +0200, Richard Biener wrote:
> On Tue, 21 May 2019, Kewen.Lin wrote:
> 
> > on 2019/5/21 12:37 AM, Segher Boessenkool wrote:
> > > On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
> > >>> I think we should have two hooks: one is called with the struct loop as
> > >>> parameter; and the other is called for every statement in the loop, if
> > >>> the hook isn't null anyway.  Or perhaps we do not need that second one.
> > >> I'd wait to see a compelling example from real world code where we need
> > >> to scan the statements.  Otherwise we're just dragging in more target
> > >> specific decisions which in fact we want to minimize target stuff.
> > > 
> > > The ivopts pass will be too optimistic about what loops will end up as a
> > > doloop, and cost things accordingly.  The cases where we cannot later
> > > actually use a doloop are doing pretty much per iteration, so I think
> > > ivopts will still make good decisions.  We'll need to make the rtl part
> > > not actually do a doloop then, but we probably still need that logic
> > > anyway.
> > > 
> > > Kewen, Bin, will that work satisfactorily do you think?
> > 
> > If my understanding of this question is correct, IMHO we should try to make
> > IVOPTs conservative rather than optimistic: once the prediction is wrong because
> > of a too-optimistic decision, the costing of the doloop use is wrong, and that is
> > very likely to affect the globally optimal set.  It looks like we don't have any way
> > to recover from it in RTL then?  (Otherwise, there would be a better place to fix
> > the PR.)  Although it's also possible to miss some good cases, it's at least
> > as good as before, so I'm inclined to make it conservative.
> 
> I wonder if you could simply benchmark what happens if you make
> IVOPTs _always_ create a doloop IV (if possible)?

That would help yes.

> I doubt the
> cases where a doloop IV is bad (calls, etc.) are too common and

They are quite common I think :-(  Like, more than the number of valid
doloops for us.  But let's see numbers :-)

> that in those cases the extra simple IV hurts.

That, I don't know.  We do not usually need an IV that counts down, but
will it be worse performance?


Segher


[PATCH] Fix names of _Lock_policy constants in libstdc++ manual

2019-05-21 Thread Jonathan Wakely

* doc/xml/manual/shared_ptr.xml: Fix names of lock policy constants.

I'll commit this to trunk shortly (and maybe backport it in my next
round of doc backports).

commit b82fec365a7d9404b94bf5df8b9c1266b3bd4c38
Author: Jonathan Wakely 
Date:   Tue May 21 12:29:55 2019 +0100

Fix names of _Lock_policy constants in libstdc++ manual

* doc/xml/manual/shared_ptr.xml: Fix names of lock policy constants.

diff --git a/libstdc++-v3/doc/xml/manual/shared_ptr.xml b/libstdc++-v3/doc/xml/manual/shared_ptr.xml
index fcbade6d5bf..24e275e95eb 100644
--- a/libstdc++-v3/doc/xml/manual/shared_ptr.xml
+++ b/libstdc++-v3/doc/xml/manual/shared_ptr.xml
@@ -239,7 +239,7 @@ available policies are:

  

-   _S_Atomic
+   _S_atomic


 Selected when GCC supports a builtin atomic compare-and-swap operation
@@ -252,7 +252,7 @@ synchronisation.
 
  

-   _S_Mutex
+   _S_mutex


 The _Sp_counted_base specialization for this policy contains a mutex,
@@ -263,7 +263,7 @@ builtins aren't available so explicit memory barriers are needed in places.
 
  

-   _S_Single
+   _S_single


 This policy uses a non-reentrant add_ref_lock() with no locking. It is


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-05-21 Thread Richard Biener
On Tue, May 21, 2019 at 1:02 PM Martin Liška  wrote:
>
> On 5/21/19 11:38 AM, Richard Biener wrote:
> > On Tue, May 21, 2019 at 12:07 AM Jeff Law  wrote:
> >>
> >> On 5/13/19 1:41 AM, Martin Liška wrote:
> >>> On 11/8/18 9:56 AM, Martin Liška wrote:
> >>>> On 11/7/18 11:23 PM, Jeff Law wrote:
> >>>>> On 10/30/18 6:28 AM, Martin Liška wrote:
> >>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
> >>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
> >>>>>>>> +hashtab_chk_error ()
> >>>>>>>> +{
> >>>>>>>> +  fprintf (stderr, "hash table checking failed: "
> >>>>>>>> +   "equal operator returns true for a pair "
> >>>>>>>> +   "of values with a different hash value");
> >>>>>>> BTW, either use internal_error here, or at least if using fprintf
> >>>>>>> terminate with \n, in your recent mail I saw:
> >>>>>>> ...different hash valueduring RTL pass: vartrack
> >>>>>>> ^^
> >>>>>> Sure, fixed in attached patch.
> >>>>>>
> >>>>>> Martin
> >>>>>>
> >>>>>>>> +  gcc_unreachable ();
> >>>>>>>> +}
> >>>>>>>   Jakub
> >>>>>>>
> >>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
> >>>>>>
> >>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
> >>>>>> From: marxin
> >>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100
> >>>>>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
> >>>>>>
> >>>>>> ---
> >>>>>>  gcc/hash-table.h | 40 +++-
> >>>>>>  1 file changed, 39 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> >>>>>> index bd83345c7b8..694eedfc4be 100644
> >>>>>> --- a/gcc/hash-table.h
> >>>>>> +++ b/gcc/hash-table.h
> >>>>>> @@ -503,6 +503,7 @@ private:
> >>>>>>
> >>>>>>value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
> >>>>>>value_type *find_empty_slot_for_expand (hashval_t);
> >>>>>> +  void verify (const compare_type , hashval_t hash);
> >>>>>>bool too_empty_p (unsigned int);
> >>>>>>void expand ();
> >>>>>>static bool is_deleted (value_type )
> >>>>>> @@ -882,8 +883,12 @@ hash_table
> >>>>>>if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
> >>>>>>  expand ();
> >>>>>>
> >>>>>> -  m_searches++;
> >>>>>> +#if ENABLE_EXTRA_CHECKING
> >>>>>> +if (insert == INSERT)
> >>>>>> +  verify (comparable, hash);
> >>>>>> +#endif
> >>>>>>
> >>>>>> +  m_searches++;
> >>>>>>value_type *first_deleted_slot = NULL;
> >>>>>>hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
> >>>>>>hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
> >>>>>> @@ -930,6 +935,39 @@ hash_table
> >>>>>>return _entries[index];
> >>>>>>  }
> >>>>>>
> >>>>>> +#if ENABLE_EXTRA_CHECKING
> >>>>>> +
> >>>>>> +/* Report a hash table checking error.  */
> >>>>>> +
> >>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
> >>>>>> +static void
> >>>>>> +hashtab_chk_error ()
> >>>>>> +{
> >>>>>> +  fprintf (stderr, "hash table checking failed: "
> >>>>>> + "equal operator returns true for a pair "
> >>>>>> + "of values with a different hash value\n");
> >>>>>> +  gcc_unreachable ();
> >>>>>> +}
> >>>>> I think an internal_error here is probably still better than a simple
> >>>>> fprintf, even if the fprintf is terminated with a \n :-)
> >>>> Fully agree with that, but I see a lot of build errors when using
> >>>> internal_error.
> >>>>
> >>>>> The question then becomes can we bootstrap with this stuff enabled and
> >>>>> if not, are we likely to soon?  It'd be a shame to put it into
> >>>>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
> >>>>> because we've got too many bugs to fix.
> >>>> Unfortunately it's blocked with these 2 PRs:
> >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
> >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
> >>> Hi.
> >>>
> >>> I've just added one more PR:
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
> >>>
> >>> I'm sending updated version of the patch that provides a disablement for 
> >>> the 3 PRs
> >>> with a new function disable_sanitize_eq_and_hash.
> >>>
> >>> With that I can bootstrap and finish tests. However, I've done that with a patch
> >>> that limits the maximal number of checks:
> >> So rather than call the disable_sanitize_eq_and_hash, can you have its
> >> state set up when you instantiate the object?  It's not a huge deal,
> >> just thinking out loud.
> >>
> >>
> >>
> >> So how do we want to go forward, particularly the EXTRA_EXTRA checking
> >> issue :-)
> >
> > There is at least one PR where we have a table where elements _in_ the
> > table are never compared against each other but always against another
> > object (I guess that's usual even), but the setup is in a way that the
> > comparison function only works with those.  With the patch we verify
> > hashing/comparison for something that is never used.
> >
> > So - wouldn't it be more "correct" to only verify comparison/hashing
> > at lookup time, 
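
The invariant under discussion can be shown with a small stand-alone sketch. This is not the hash-table.h code: `hash_consistent_with_equal`, the string helpers and the linear scan over a `std::vector` are assumptions made for illustration only. The check is just the one named in the error message quoted above: two values that compare equal must not have different hash values.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative comparison function: equality that ignores letter case.
inline bool equal_ignore_case(const std::string &a, const std::string &b)
{
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (std::tolower(static_cast<unsigned char>(a[i]))
        != std::tolower(static_cast<unsigned char>(b[i])))
      return false;
  return true;
}

// Deliberately inconsistent with equal_ignore_case: hashes raw characters.
inline std::size_t case_sensitive_hash(const std::string &s)
{
  std::size_t h = 0;
  for (unsigned char c : s)
    h = h * 31 + c;
  return h;
}

// Consistent with equal_ignore_case: hashes the lower-cased characters.
inline std::size_t case_insensitive_hash(const std::string &s)
{
  std::size_t h = 0;
  for (unsigned char c : s)
    h = h * 31 + static_cast<std::size_t>(std::tolower(c));
  return h;
}

// Returns false if some stored entry compares equal to the candidate
// while hashing to a different value.
template <typename T, typename Hash, typename Equal>
bool hash_consistent_with_equal(const std::vector<T> &entries,
                                const T &candidate, Hash hash, Equal equal)
{
  const std::size_t candidate_hash = hash(candidate);
  for (const T &entry : entries)
    if (equal(entry, candidate) && hash(entry) != candidate_hash)
      return false;
  return true;
}
```

Note that this sketch compares the candidate against every stored entry, which is exactly the situation raised above: a table whose comparison function is only meant to be used against lookup keys would need the check restricted accordingly.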

[RS6000] PR90545, gcc.target/powerpc/fold-vec-splats-floatdouble.c fails

2019-05-21 Thread Alan Modra
Bootstrapped powerpc64le-linux power8 and power9, OK to apply?

I figure a tweak to register_move_cost is better than sprinkling ?s
in instruction alternatives.

PR 90545
* config/rs6000/rs6000.c (rs6000_register_move_cost): Increase
power9 direct move cost.
* testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c:
Correct comments and rename functions to suit parameters.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d27871ab907..d0e37d98ed5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -34678,8 +34678,14 @@ rs6000_register_move_cost (machine_mode mode,
{
  if (TARGET_DIRECT_MOVE)
{
+ /* Keep the cost for direct moves above that for within
+a register class even if the actual processor cost is
+comparable.  We do this because a direct move insn
+can't be a nop, whereas with ideal register
+allocation a move within the same class might turn
+out to be a nop.  */
  if (rs6000_tune == PROCESSOR_POWER9)
-   ret = 2 * hard_regno_nregs (FIRST_GPR_REGNO, mode);
+   ret = 3 * hard_regno_nregs (FIRST_GPR_REGNO, mode);
  else
ret = 4 * hard_regno_nregs (FIRST_GPR_REGNO, mode);
  /* SFmode requires a conversion when moving between gprs
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c
index c4544f1a452..3c7cc7c9a67 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splats-floatdouble.c
@@ -8,20 +8,20 @@
 #include <altivec.h>
 
 vector float
-test1d (float x)
+test1f (float x)
 {
   return vec_splats (x);
 }
 
 vector double
-test1f (double x)
+test1d (double x)
 {
   return vec_splats (x);
 }
 
-// float test generates the permute instruction.
+// double test generates the permute instruction.
 /* { dg-final { scan-assembler-times "xxpermdi" 1 } } */
 
-// double test generates a convert (double to single non-signalling) followed by a splat.
+// float test generates a convert (double to single non-signalling) followed by a splat.
 /* { dg-final { scan-assembler-times {\mxscvdpspn?\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mvspltw\M|\mxxspltw\M} 1 } } */

-- 
Alan Modra
Australia Development Lab, IBM


Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Richard Biener
On Tue, 21 May 2019, Jan Hubicka wrote:

> > > So about 8 times the aliasing_component_refs hit rate.
> > 
> > OK, one issue with the patch is that it restores TBAA for the
> > access which we may _not_ do IIRC.
> 
> I can see that with stack sharing we have one memory location that for a
> while is of type A and later is rewritten by type B, but we already give
> up on optimizing this because of C++ placement new, right?
> 
> In what scenarios can one disambiguate by the alias set of the reference
> type (which we do in all cases) but not by the alias set of the base type?
> The code in cfgexpand does not seem to care about either of those.

I don't remember exactly but let's change the things independently
if possible.

> > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > I'd rather not have that new build_simple_mem_ref_with_type_loc
> > function - the "simple" MEM_REF was to be a way to replace
> > a plain old INDIRECT_REF.
> > 
> > So please instead ...
> > 
> > > Honza
> > > 
> > >   * alias.c (ao_ref_from_mem): Use build_simple_mem_ref_with_type.
> > >   * tree.c (build_simple_mem_ref_with_type_loc): Break out from ...
> > >   (build_simple_mem_ref_loc): ... here.
> > >   * fold-const.h (build_simple_mem_ref_with_type_loc): Declare.
> > >   (build_simple_mem_ref_with_type): New macro.
> > > Index: alias.c
> > > ===
> > > --- alias.c   (revision 271379)
> > > +++ alias.c   (working copy)
> > > @@ -316,7 +316,8 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
> > >  {
> > >tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> > >if (namep)
> > > - ref->base = build_simple_mem_ref (*namep);
> > > + ref->base = build_simple_mem_ref_with_type
> > > +  (*namep, build_pointer_type (TREE_TYPE (base)));
> > 
> > ...
> > 
> > ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
> > build_int_cst (TREE_TYPE (*namep), 0));
> > 
> > which preserves TBAA behavior but fixes the 'void' type ref.
> 
> 
> My understanding of MEMREF is that it has two types, one is TREE_TYPE
> (MEMREF) and its ref type taken from TREE_TYPE of the constant.
> So we will still be dereferencing void which is odd.

Here we reference 'base' (TREE_TYPE of the mem-ref) and the
pointer-type for TBAA purposes is the type of the constant.
void * here simply means alias-set zero.

> If globbing is necessary, perhaps the outer type should be something like
> alias set 0 char pointer (used by some builtins such as copysign) or
> union of all types of vars that gets into a given partition?

Note the original reason might have been latent bugs in the RTL code
not correctly dealing with the placement new case.  With stack slot
sharing you end up with a lot more placement news ;)

As said, fixing TREE_TYPE (mem-ref) is quite obvious (it should
never have been void but retained the original type of the base).

Changing TBAA behavior should be done separately and that should
probably simply be

  ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
                      build_int_cst (build_pointer_type (TREE_TYPE (base)), 0));

note we retain the original alias-set from RTL:

  ref->ref_alias_set = MEM_ALIAS_SET (mem);

and that might _not_ reflect that of the original tree.  For
example a

  MEM[&b, (void * ref-all)0] = 1;

may be represented as ref.base = b; ref.ref_alias_set = 0; ref.ref = NULL;
losing the ref-all qualification.  So it is _not_ easily possible
to recreate the original 'base'.  There might be code in the
component-ref disambiguations looking at those alias-types but that's
probably fishy for the cases coming in via ao_ref_from_mem which
means being conservative here is important.

That may also hold for the type of the reference for ref->base.
_Nothing_ should really look at that...  In fact the correct
type should be salvaged by not doing

  /* Get the base of the reference and see if we have to reject or
 adjust it.  */
  base = ao_ref_base (ref);
  if (base == NULL_TREE)
return false;

but doing what ao_ref_base_alias_set does, strip handled-components
and thus preserve an eventual inner view-converting MEM_REF for
the purpose of building the stack-slot sharing ref...

The current (somewhat broken) code simply side-steps this by
being awkwardly conservative...

So if we want to go full in on "fixing" ref->base (previously
we just said doing that is wasted cycles) then do

Index: gcc/alias.c
===
--- gcc/alias.c (revision 271415)
+++ gcc/alias.c (working copy)
@@ -316,7 +316,14 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
 {
   tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
   if (namep)
-   ref->base = build_simple_mem_ref (*namep);
+   {
+ tree orig_base = expr;
+ while (handled_component_p (orig_base))
+   orig_base = TREE_OPERAND (orig_base, 0);
+ ref->base = build2 

Re: [PATCH][RFC] Fix PR90510, VEC_PERM -> BIT_INSERT folding

2019-05-21 Thread Richard Biener
On Mon, 20 May 2019, Richard Biener wrote:

> On Sun, 19 May 2019, Richard Sandiford wrote:
> 
> > Richard Biener  writes:
> > > This adds, incrementally on top of moving VEC_PERM_EXPR folding
> > > to match.pd (but not incremental patch - sorry...), folding
> > > of single-element insert permutations to BIT_INSERT_EXPR.
> > >
> > > Things are quite awkward with the new poly-int vec-perm stuff
> > > so effectively this doesn't work for SVE and is still very
> > > ugly.  I wonder how to make it generic enough so SVE would
> > > be happy and / or how to make the code prettier.
> > >
> > > I also can't find a helper to read a specific vector element
> > > from a VECTOR_CST/CONSTRUCTOR (can I even do that "generally"
> > > with a poly-int index?!), but there surely must be one.
> > 
> > Yeah, would be nice to have a helper that handles both VECTOR_CST
> > and CONSTRUCTOR, even if just for constant indices.
> > 
> > CONSTRUCTORs are still very much fixed-length, so it wouldn't really
> > make sense to fold a poly_int index at compile time.  The only case we
> > could handle is when the index is known to be beyond the last nonzero
> > element.
> > 
> > We could fold some poly_int VECTOR_CST_ELT indices to poly_ints, but
> > it'd depend on the values involved.
> > 
> > Indexing specific poly_int elements of a single vector is a bit dubious
> > in length-agnostic code though.  Not saying it'll never be useful, but
> > it's certainly not a native SVE operation.  So I think even for SVE,
> > constant VECTOR_CST/CONSTRUCTOR elements are the only interesting case
> > for now.
> > 
> > > [...]
> > > @@ -11774,61 +11777,7 @@ fold_ternary_loc (location_t loc, enum t
> > > poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> > > bool single_arg = (op0 == op1);
> > > vec_perm_indices sel (builder, single_arg ? 1 : 2, nelts);
> > > -
> > > -   /* Check for cases that fold to OP0 or OP1 in their original
> > > -  element order.  */
> > > -   if (sel.series_p (0, 1, 0, 1))
> > > - return op0;
> > > -   if (sel.series_p (0, 1, nelts, 1))
> > > - return op1;
> > > -
> > > -   if (!single_arg)
> > > - {
> > > -   if (sel.all_from_input_p (0))
> > > - op1 = op0;
> > > -   else if (sel.all_from_input_p (1))
> > > - {
> > > -   op0 = op1;
> > > -   sel.rotate_inputs (1);
> > > - }
> > > - }
> > 
> > Since this isn't an incremental patch... :-)
> > 
> > One of the holes of the current code is that we still allow two
> > permute indices for the same permutation, e.g.  { 4, 1, 2, 3 } and
> > { 0, 5, 6, 7 }.  IMO we should canonicalize it so that the first index
> > selects from the first vector.  So maybe the above should be:
> > 
> >   if (!single_arg)
> > {
> >   if (known_ge (sel[0], nunits))
> > {
> >   std::swap (op0, op1);
> >   sel.rotate_inputs (1);
> > }
> >   if (sel.all_from_input_p (0))
> > op1 = op0;
> > }
> > 
> > > [...]
> > > + /* See if the permutation is performing a single element
> > > +insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
> > > +in that case.  */
> > > + unsigned HOST_WIDE_INT cnelts;
> > > +if ((TREE_CODE (cop0) == VECTOR_CST
> > > +  || TREE_CODE (cop0) == CONSTRUCTOR
> > > +  || TREE_CODE (cop1) == VECTOR_CST
> > > +  || TREE_CODE (cop1) == CONSTRUCTOR)
> > > + && nelts.is_constant ())
> > > +  {
> > > + unsigned first = 0, first_oo = 0, first_i;
> > > + unsigned second = 0, second_oo = 0, second_i;
> > > + HOST_WIDE_INT idx;
> > > + for (unsigned HOST_WIDE_INT i = 0; i < cnelts; ++i)
> > > +   if (!sel[i].is_constant ())
> > > + {
> > > +   first = second = 0;
> > > +   break;
> > > + }
> > > +   else if ((unsigned HOST_WIDE_INT)idx < cnelts)
> > > + {
> > > +   first_i = i;
> > > +   first++;
> > > +   first_oo += (unsigned HOST_WIDE_INT)idx != i;
> > > + }
> > > +   else
> > > + {
> > > +   second_i = i;
> > > +   second++;
> > > +   second_oo += (unsigned HOST_WIDE_INT)idx != i + cnelts;
> > > + }
> > 
> > This won't handle the case in which the inserted element comes from
> > the same vector.
> > 
> > If we add the extra canonicalization above, we'd only ever be inserting
> > into the second vector at element 0.   The test for that would be:
> > 
> >if (sel.series_p (1, 1, nelts + 1, 1))
> >  // insert sel[0] into index 0 of the second vector
> > 
> > I think the SVE-friendly way of doing the check for the first vector
> > would be:
> > 
> >unsigned int encoded_nelts = sel.encoding ().encoded_nelts ();
> >unsigned int i = 0;
> >for (; i < encoded_nelts; ++i)
> >  if (maybe_ne (sel[i], i))
> >break;
> >if (i < encoded_nelts && sel.series_p (i + 1, 1, i + 1, 1))
> >  // insert sel[i] into index 

Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Jan Hubicka
> > So about 8 times the aliasing_component_refs hit rate.
> 
> OK, one issue with the patch is that it restores TBAA for the
> access which we may _not_ do IIRC.

I can see that with stack sharing we have one memory location that for a
while is of type A and later is rewritten by type B, but we already give
up on optimizing this because of C++ placement new, right?

In what scenarios can one disambiguate by the alias set of the reference
type (which we do in all cases) but not by the alias set of the base type?
The code in cfgexpand does not seem to care about either of those.
> 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> I'd rather not have that new build_simple_mem_ref_with_type_loc
> function - the "simple" MEM_REF was to be a way to replace
> a plain old INDIRECT_REF.
> 
> So please instead ...
> 
> > Honza
> > 
> > * alias.c (ao_ref_from_mem): Use build_simple_mem_ref_with_type.
> > * tree.c (build_simple_mem_ref_with_type_loc): Break out from ...
> > (build_simple_mem_ref_loc): ... here.
> > * fold-const.h (build_simple_mem_ref_with_type_loc): Declare.
> > (build_simple_mem_ref_with_type): New macro.
> > Index: alias.c
> > ===
> > --- alias.c (revision 271379)
> > +++ alias.c (working copy)
> > @@ -316,7 +316,8 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
> >  {
> >tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> >if (namep)
> > -   ref->base = build_simple_mem_ref (*namep);
> > +   ref->base = build_simple_mem_ref_with_type
> > +(*namep, build_pointer_type (TREE_TYPE (base)));
> 
> ...
> 
> ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
>   build_int_cst (TREE_TYPE (*namep), 0));
> 
> which preserves TBAA behavior but fixes the 'void' type ref.


My understanding of MEMREF is that it has two types, one is TREE_TYPE
(MEMREF) and its ref type taken from TREE_TYPE of the constant.
So we will still be dereferencing void which is odd.

If globbing is necessary, perhaps the outer type should be something like
alias set 0 char pointer (used by some builtins such as copysign) or
union of all types of vars that gets into a given partition?

Honza


Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Segher Boessenkool
On Tue, May 21, 2019 at 02:03:04PM +0800, Kewen.Lin wrote:
> on 2019/5/21 12:37 AM, Segher Boessenkool wrote:
> > On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
> >>> I think we should have two hooks: one is called with the struct loop as
> >>> parameter; and the other is called for every statement in the loop, if
> >>> the hook isn't null anyway.  Or perhaps we do not need that second one.
> >> I'd wait to see a compelling example from real world code where we need
> >> to scan the statements.  Otherwise we're just dragging in more target
> >> specific decisions which in fact we want to minimize target stuff.
> > 
> > The ivopts pass will be too optimistic about what loops will end up as a
> > doloop, and cost things accordingly.  The cases where we cannot later
> > actually use a doloop are doing pretty much per iteration, so I think
> > ivopts will still make good decisions.  We'll need to make the rtl part
> > not actually do a doloop then, but we probably still need that logic
> > anyway.
> > 
> > Kewen, Bin, will that work satisfactorily do you think?
> 
> If my understanding on this question is correct, IMHO we should try to make
> IVOPTs conservative than optimistic, since once the predict is wrong from
> too optimistic decision, the costing on the doloop use is wrong, it's very
> possible to affect the global optimal set.

Ah, it does change what is chosen, and it happens a lot as well...  So
never mind, this is one simplification too far :-)

> It looks we don't have any ways to recover it in RTL then?

We don't.  We'd have to redo everything that happened in between...

> (otherwise, there should be better place to fix
> the PR).  Although it's also possible to miss some good cases, it's at least
> as good as before, I'm inclined to make it conservative.


Segher


Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Martin Liška
On 5/21/19 11:41 AM, Dominique d'Humières wrote:
> Hi Martin,
> 
>  /* { dg-require-ifunc } */
> 
> should be
> 
>  /* { dg-require-ifunc ""} */
> 
> and the same for pr90500-2.c (see 
> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01152.html)
> 
> TIA
> 
> Dominique
> 

I'm addressing that in patch that I'm going to install.

Martin
From 14f8d01fccd4d94e9575f7d964faf4144db0879d Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 21 May 2019 13:04:14 +0200
Subject: [PATCH] Add missing "" for dg-require-ifunc.

gcc/testsuite/ChangeLog:

2019-05-21  Martin Liska  

	* gcc.target/i386/pr90500-1.c: Add missing '""'.
	* gcc.target/i386/pr90500-2.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr90500-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr90500-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr90500-1.c b/gcc/testsuite/gcc.target/i386/pr90500-1.c
index 7ac6a739c05..e90e5ed4674 100644
--- a/gcc/testsuite/gcc.target/i386/pr90500-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr90500-1.c
@@ -1,6 +1,6 @@
 /* PR middle-end/84723 */
 /* { dg-do compile } */
-/* { dg-require-ifunc } */
+/* { dg-require-ifunc "" } */
 
 __attribute__((target_clones("arch=haswell", "default"))) int __tanh() {}
 __typeof(__tanh) tanhf64 __attribute__((alias("__tanh")))/* { dg-error "clones for .target_clones. attribute cannot be created" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr90500-2.c b/gcc/testsuite/gcc.target/i386/pr90500-2.c
index 0fafb8adb21..cb0658dbc38 100644
--- a/gcc/testsuite/gcc.target/i386/pr90500-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr90500-2.c
@@ -1,6 +1,6 @@
 /* PR middle-end/84723 */
 /* { dg-do compile } */
-/* { dg-require-ifunc } */
+/* { dg-require-ifunc "" } */
 
 __attribute__((target_clones("arch=haswell", "default"))) int __tanh() {}
 __typeof(__tanh) tanhf64 __attribute__((alias("__tanh"),target_clones("arch=haswell", "default"))); /* { dg-error "clones for .target_clones. attribute cannot be created" } */
-- 
2.21.0



[C++ Patch] Two literal operator template location fixes

2019-05-21 Thread Paolo Carlini

Hi,

also in my back queue a few more location fixes (of course ;) Tested 
x86_64-linux.


Thanks, Paolo.



/cp
2019-05-21  Paolo Carlini  

* parser.c (cp_parser_template_declaration_after_parameters): Use
DECL_SOURCE_LOCATION in literal operator template errors.

/testsuite
2019-05-21  Paolo Carlini  

* g++.dg/cpp0x/udlit-tmpl-arg-neg2.C: Check locations too.
* g++.dg/cpp0x/udlit-tmpl-parms-neg.C: Likewise.
Index: cp/parser.c
===
--- cp/parser.c (revision 271459)
+++ cp/parser.c (working copy)
@@ -27912,14 +27912,16 @@ cp_parser_template_declaration_after_parameters (c
   if (!ok)
{
  if (cxx_dialect > cxx17)
-   error ("literal operator template %qD has invalid parameter list;"
-  " expected non-type template parameter pack %<%> "
-  "or single non-type parameter of class type",
-  decl);
+   error_at (DECL_SOURCE_LOCATION (decl), "literal operator "
+ "template %qD has invalid parameter list; expected "
+ "non-type template parameter pack %<%> or "
+ "single non-type parameter of class type",
+ decl);
  else
-   error ("literal operator template %qD has invalid parameter list;"
-  " expected non-type template parameter pack %<%>",
-  decl);
+   error_at (DECL_SOURCE_LOCATION (decl), "literal operator "
+ "template %qD has invalid parameter list; expected "
+ "non-type template parameter pack %<%>",
+ decl);
}
 }
 
Index: testsuite/g++.dg/cpp0x/udlit-tmpl-arg-neg2.C
===
--- testsuite/g++.dg/cpp0x/udlit-tmpl-arg-neg2.C(revision 271459)
+++ testsuite/g++.dg/cpp0x/udlit-tmpl-arg-neg2.C(working copy)
@@ -2,6 +2,6 @@
 // { dg-do compile { target c++11 } }
 
 template// { dg-error "'T' has not been declared" }
-int operator"" _foo ();// { dg-error "has invalid parameter list" }
+int operator"" _foo ();// { dg-error "5:literal operator template .int operator\"\"_foo\\(\\). has invalid parameter list" }
 template   // { dg-error "'T' has not been declared" }
-int operator"" _bar ();// { dg-error "has invalid parameter list" }
+int operator"" _bar ();// { dg-error "5:literal operator template .int operator\"\"_bar\\(\\). has invalid parameter list" }
Index: testsuite/g++.dg/cpp0x/udlit-tmpl-parms-neg.C
===
--- testsuite/g++.dg/cpp0x/udlit-tmpl-parms-neg.C   (revision 271459)
+++ testsuite/g++.dg/cpp0x/udlit-tmpl-parms-neg.C   (working copy)
@@ -3,10 +3,10 @@
 class Foo { };
 
 template
-  Foo operator"" _Foo(); // { dg-error "literal operator template|has invalid parameter list" }
+  Foo operator"" _Foo(); // { dg-error "7:literal operator template .Foo operator\"\"_Foo\\(\\). has invalid parameter list" }
 
 template
-  Foo operator"" _Bar(); // { dg-error "literal operator template|has invalid parameter list" }
+  Foo operator"" _Bar(); // { dg-error "7:literal operator template .Foo operator\"\"_Bar\\(\\). has invalid parameter list" }
 
 template
-  Foo operator"" _Bar(); // { dg-error "literal operator template|has invalid parameter list" }
+  Foo operator"" _Bar(); // { dg-error "7:literal operator template .Foo operator\"\"_Bar\\(\\). has invalid parameter list" }


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-05-21 Thread Martin Liška
On 5/21/19 11:38 AM, Richard Biener wrote:
> On Tue, May 21, 2019 at 12:07 AM Jeff Law  wrote:
>>
>> On 5/13/19 1:41 AM, Martin Liška wrote:
>>> On 11/8/18 9:56 AM, Martin Liška wrote:
 On 11/7/18 11:23 PM, Jeff Law wrote:
> On 10/30/18 6:28 AM, Martin Liška wrote:
>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
 +hashtab_chk_error ()
 +{
 +  fprintf (stderr, "hash table checking failed: "
 +   "equal operator returns true for a pair "
 +   "of values with a different hash value");
>>> BTW, either use internal_error here, or at least if using fprintf
>>> terminate with \n, in your recent mail I saw:
>>> ...different hash valueduring RTL pass: vartrack
>>> ^^
>> Sure, fixed in attached patch.
>>
>> Martin
>>
 +  gcc_unreachable ();
 +}
>>>   Jakub
>>>
>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>
>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>
>> ---
>>  gcc/hash-table.h | 40 +++-
>>  1 file changed, 39 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>> index bd83345c7b8..694eedfc4be 100644
>> --- a/gcc/hash-table.h
>> +++ b/gcc/hash-table.h
>> @@ -503,6 +503,7 @@ private:
>>
>>value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>value_type *find_empty_slot_for_expand (hashval_t);
>> +  void verify (const compare_type , hashval_t hash);
>>bool too_empty_p (unsigned int);
>>void expand ();
>>static bool is_deleted (value_type )
>> @@ -882,8 +883,12 @@ hash_table
>>if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>  expand ();
>>
>> -  m_searches++;
>> +#if ENABLE_EXTRA_CHECKING
>> +if (insert == INSERT)
>> +  verify (comparable, hash);
>> +#endif
>>
>> +  m_searches++;
>>value_type *first_deleted_slot = NULL;
>>hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>> @@ -930,6 +935,39 @@ hash_table
>>return _entries[index];
>>  }
>>
>> +#if ENABLE_EXTRA_CHECKING
>> +
>> +/* Report a hash table checking error.  */
>> +
>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>> +static void
>> +hashtab_chk_error ()
>> +{
>> +  fprintf (stderr, "hash table checking failed: "
>> + "equal operator returns true for a pair "
>> + "of values with a different hash value\n");
>> +  gcc_unreachable ();
>> +}
> I think an internal_error here is probably still better than a simple
> fprintf, even if the fprintf is terminated with a \n :-)
 Fully agree with that, but I see a lot of build errors when using 
 internal_error.

> The question then becomes can we bootstrap with this stuff enabled and
> if not, are we likely to soon?  It'd be a shame to put it into
> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
> because we've got too many bugs to fix.
 Unfortunately it's blocked with these 2 PRs:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
>>> Hi.
>>>
>>> I've just added one more PR:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
>>>
>>> I'm sending an updated version of the patch that provides a disablement for
>>> the 3 PRs with a new function disable_sanitize_eq_and_hash.
>>>
>>> With that I can bootstrap and finish tests. However, I've done that with a
>>> patch that limits the maximal number of checks:
>> So rather than call the disable_sanitize_eq_and_hash, can you have its
>> state set up when you instantiate the object?  It's not a huge deal,
>> just thinking out loud.
>>
>>
>>
>> So how do we want to go forward, particularly the EXTRA_EXTRA checking
>> issue :-)
> 
> There is at least one PR where we have a table where elements _in_ the
> table are never compared against each other but always against another
> object (I guess that's usual even), but the setup is in a way that the
> comparison function only works with those.  With the patch we verify
> hashing/comparison for something that is never used.
> 
> So - wouldn't it be more "correct" to only verify comparison/hashing
> at lookup time, using the object from the lookup and verify that against
> all other elements?

I don't have a problem with that. Apparently this change fixes
PR90450 and PR87847.

Changes from previous version:
- verification happens only when an element is searched (not inserted)
- new argument 
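
To make the lookup-time check concrete, here is a minimal self-contained
sketch in C. It is not the hash-table.h patch: toy_table, toy_verify and
the hash/equality helpers are hypothetical names made up for illustration,
and the table is a plain fixed-size array. It only shows the invariant being
sanitized, namely that the searched object must not compare equal to any
stored element that has a different hash value.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy table of C strings; not GCC's hash_table.  */
#define TOY_TABLE_SIZE 8

typedef unsigned (*toy_hash_fn) (const char *);
typedef bool (*toy_equal_fn) (const char *, const char *);

struct toy_table
{
  const char *slots[TOY_TABLE_SIZE];   /* NULL marks an empty slot.  */
  toy_hash_fn hash;
  toy_equal_fn equal;
};

/* Hash that distinguishes upper and lower case.  */
unsigned
hash_raw (const char *s)
{
  unsigned h = 0;
  for (; *s; s++)
    h = h * 31 + (unsigned char) *s;
  return h;
}

/* Hash that folds case first, so it agrees with equal_nocase.  */
unsigned
hash_folded (const char *s)
{
  unsigned h = 0;
  for (; *s; s++)
    h = h * 31 + (unsigned) tolower ((unsigned char) *s);
  return h;
}

/* Case-insensitive string equality.  */
bool
equal_nocase (const char *a, const char *b)
{
  for (; *a && *b; a++, b++)
    if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
      return false;
  return *a == *b;
}

/* Check KEY, the object being searched for, against every stored element.
   Return false if some element compares equal to KEY but hashes
   differently, i.e. the equality and hash functions are inconsistent.  */
bool
toy_verify (const struct toy_table *t, const char *key)
{
  assert (t != NULL && key != NULL);
  unsigned key_hash = t->hash (key);
  for (size_t i = 0; i < TOY_TABLE_SIZE; i++)
    {
      const char *entry = t->slots[i];
      if (entry != NULL
          && t->equal (entry, key)
          && t->hash (entry) != key_hash)
        return false;
    }
  return true;
}
```

Pairing equal_nocase with hash_raw is the kind of bug this catches: "Foo"
and "foo" compare equal but hash differently. Note that the check only
compares the searched object against stored elements, never stored elements
against each other.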

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Segher Boessenkool
On Tue, May 21, 2019 at 01:50:31PM +0800, Kewen.Lin wrote:
> on 2019/5/20 10:43 PM, Jeff Law wrote:
> > On 5/20/19 4:24 AM, Segher Boessenkool wrote:
> >> Let me try to answer a bit here...
> >>
> >> On Mon, May 20, 2019 at 11:28:26AM +0200, Richard Biener wrote:
> >>> On Thu, 16 May 2019, li...@linux.ibm.com wrote: 
> >>
> >>> So the better way would be to expose that via a target hook somehow.
> >>> Or simply restrict IVOPTs processing to innermost loops for now.
> >>
> >> I think we should have two hooks: one is called with the struct loop as
> >> parameter; and the other is called for every statement in the loop, if
> >> the hook isn't null anyway.  Or perhaps we do not need that second one.
> > I'd wait to see a compelling example from real world code where we need
> > to scan the statements.  Otherwise we're just dragging in more target
> > specific decisions which in fact we want to minimize target stuff.
> 
> The scan is trying to do similar thing like default_invalid_within_doloop.
> It scans for hardware counter register clobbering.  I think it's important
> and valuable to scan especially for call since it's common.

Ah, right, without this check we would say many more loops can be doloop
than can in fact be.

>   if (CALL_P (insn))
> return "Function call in loop.";
> 
>   if (tablejump_p (insn, NULL, NULL) || computed_jump_p (insn))
> return "Computed branch in the loop.";
> 
> But it's a question whether to make it as part of generic.  I double checked
> that most of the doloop targets use this default behavior, only 5 targets are
> using their own TARGET_INVALID_WITHIN_DOLOOP, so it might be a good thing to 
> make it common to share.

Yeah, and have the default hook for gimple be similar to the rtl one.
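
For readers following along, the shape of such a default scan can be
sketched in plain C. This is only a toy model under stated assumptions:
toy_stmt_kind and toy_invalid_within_doloop are hypothetical names, and the
statement kinds stand in for whatever the real gimple or RTL predicates
would test. The two rejection reasons mirror the strings quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement kinds; not GCC's real IR.  */
enum toy_stmt_kind
{
  TOY_ARITH,
  TOY_LOAD,
  TOY_STORE,
  TOY_CALL,
  TOY_COMPUTED_JUMP
};

/* Scan the N statements of a loop body.  Return NULL if nothing rules
   out a doloop, otherwise a string describing the first offending
   statement.  */
const char *
toy_invalid_within_doloop (const enum toy_stmt_kind *body, size_t n)
{
  assert (body != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    switch (body[i])
      {
      case TOY_CALL:
        /* A call may clobber the hardware counter register.  */
        return "Function call in loop.";
      case TOY_COMPUTED_JUMP:
        return "Computed branch in the loop.";
      default:
        break;
      }
  return NULL;
}
```

A target that needs something different would supply its own predicate
instead of this default, much like the 5 targets mentioned above do for
TARGET_INVALID_WITHIN_DOLOOP.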


Segher


Re: [PATCH,RFC 0/3] Support for CTF in GCC

2019-05-21 Thread Richard Biener
On Mon, May 20, 2019 at 7:56 PM Indu Bhagat  wrote:
>
> Background :
> CTF is the Compact Ansi-C Type Format. It is a format designed to express some
> characteristics (specifically Type information) of the data types in a C
> program. CTF format is compact and fast; It was originally designed for
> use-cases like dynamic tracing, online in-application debugging among others.
>
> A patch to the binutils mailing list to add libctf is currently under
> review (https://sourceware.org/ml/binutils/2019-05/msg00154.html and
> https://sourceware.org/ml/binutils/2019-05/msg00212.html).  libctf
> provides means to create, update, read and manipulate CTF information.
>
> This GCC patch set is preliminary work and the purpose is to gather comments
> and feedback about CTF support in GCC.
>
> (For technical introduction into the CTF format, the CTF header or
> https://sourceware.org/ml/binutils/2019-04/msg00277.html will be useful.)
>
> Project Details :
> The project aims to add the support for CTF in the GNU toolchain. Adding CTF
> support in the GNU toolchain will help the community in developing and
> converging the tools and use-cases where a light-weight debug format is needed.
>
> De-duplication is a key aspect of the CTF format which ensures its compactness.
> A parallel effort is ongoing to support de-duplication of CTF types at the
> link-time.
>
> In phase 1, we are making the compiler, linker and the debugger (GDB) capable
> of handling the CTF format.
>
> CTF format, in its present form, does not have callsite information.  We are
> working on this as well. Once the CTF format extensions are agreed upon, the
> -gt1 option (see below) will begin to take form, in phase 2 of the project.
>
> GCC RFC patch set :
> Patch 1 is a simple addition of a new function lang_GNU_GIMPLE to check for
> GIMPLE frontend.

I don't think you should need this - the GIMPLE "frontend" is intended for
unit testing only, I wouldn't like it to be exposed more.

> Patch 2 and Patch 3 set up the framework for CTF support in GCC :
> -- Patch 2 adds the new command line option for generating CTF. CTF generation
>is enabled in the compiler by specifying an explicit -gt or
>-gtLEVEL[LEVEL=1,2] :
>
> -gtLEVEL
>
> This is used to request CTF debug information and to specify how much CTF
> debug information, LEVEL[=0,1,2] can be specified. If -gt is specified
> (with no LEVEL), the default value of LEVEL is 2.
>
> -gt0 (Level 0) produces no CTF debug information at all. Thus, -gt0
> negates -gt.
>
> -gt1 (Level 1) produces CTF information for tracebacks only. This includes
> CTF callsite information, but does not include type information for other
> entities.
>
> -gt2 (Level 2) produces type information for entities (functions, variables
> etc.) at file-scope or global-scope only. This level of information can be
> used by dynamic tracers like DTrace.
>
> --  Patch 3 adds the CTF debug hooks and initializes them if the required
> user-level options are specified.
> CTF debug hooks are wrappers around the DWARF debug hooks.
>
> One of the main high-level design requirements that is relevant in the context
> of the current GCC patch set is that - CTF and DWARF must be able to co-exist.
> A user may want CTF debug information in isolation or with other debug formats.
> A .ctf section is small and unlike other debug sections, ideally should not
> need to be stripped out of the binary/executable.
>
> High-level proposed plan (phase 1) :
> In the next few patches, the functionality to generate contents of the CTF
> section (.ctf) for a single compilation unit will be added.
> Once CTF generation for a single compilation unit stabilizes, LTO and CTF
> generation will be looked at.
>
> Feedback and suggestions welcome.

You probably got asked this question multiple times already, but,
can CTF information be generated from DWARF instead?

The meaning of the CTF acronym suggests that there's nothing
like locations, registers, etc. but just a representation of the
types?

Generally we are trying to walk away from supporting multiple
debug info formats because that gets in the way of being
more precise from the frontend side.  Since DWARF is the
de facto standard, extensible and with a rich feature set, the
line of thinking is that other formats (like STABS) can be
generated by "post-processing" DWARF.  Such
post-processing could happen on the object files or
on the GCC internal DWARF data structures by
providing alternate output routines.  That is, the mid-term
design goal is to make DWARF generation the "API"
for GCC frontends to use when creating high-level
debug information rather than trying to abstract from
the debuginfo format via the current debug-hooks or
the other way around via language-hooks.

Richard.

> Thanks
>
> Indu Bhagat (3):
>   Add new function lang_GNU_GIMPLE
>   Add CTF command line options : -gtLEVEL
>   Create CTF debug hooks
>
>  

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Richard Biener
On Tue, 21 May 2019, Kewen.Lin wrote:

> on 2019/5/21 12:37 AM, Segher Boessenkool wrote:
> > On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
> >>> I think we should have two hooks: one is called with the struct loop as
> >>> parameter; and the other is called for every statement in the loop, if
> >>> the hook isn't null anyway.  Or perhaps we do not need that second one.
> >> I'd wait to see a compelling example from real world code where we need
> >> to scan the statements.  Otherwise we're just dragging in more target
> >> specific decisions which in fact we want to minimize target stuff.
> > 
> > The ivopts pass will be too optimistic about what loops will end up as a
> > doloop, and cost things accordingly.  The cases where we cannot later
> > actually use a doloop are doing pretty much per iteration, so I think
> > ivopts will still make good decisions.  We'll need to make the rtl part
> > not actually do a doloop then, but we probably still need that logic
> > anyway.
> > 
> > Kewen, Bin, will that work satisfactorily do you think?
> > 
> 
> If my understanding on this question is correct, IMHO we should try to make
> IVOPTs conservative than optimistic, since once the predict is wrong from
> too optimistic decision, the costing on the doloop use is wrong, it's very
> possible to affect the global optimal set.  It looks we don't have any ways
> to recover it in RTL then?  (otherwise, there should be better place to fix
> the PR).  Although it's also possible to miss some good cases, it's at least
> as good as before, I'm inclined to make it conservative.

I wonder if you could simply benchmark what happens if you make
IVOPTs _always_ create a doloop IV (if possible)?  I doubt the
cases where a doloop IV is bad (calls, etc.) are too common and
that in those cases the extra simple IV hurts.

Richard.

Re: [PATCH] fix diagnostic quoting/spelling in rs6000

2019-05-21 Thread Segher Boessenkool
On Mon, May 20, 2019 at 11:31:17PM +, Joseph Myers wrote:
> Where you refer to 'homogeneous % aggregates', are such aggregates 
> in the rs6000 case (or in the case where the ABI changed) really 
> restricted to the type float, or do they apply more generally to some 
> other floating-point types so that expanding "float" to a longer 
> description would be the appropriate fix?

See https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00632.html .

It's about single precision floating point only.


Segher


Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-05-21 Thread Richard Biener
On Mon, May 20, 2019 at 4:51 PM Feng Xue OS  wrote:
>
> > I don't see how it is safe in a late pass when it is not safe in an
>
> > earlier one.  Optimization is imperfect - we could fail to remove
> > an "obvious" never taken exit and still have a loop that appears to be
> > finite according to our definition.
>
> Yes, it is. This is somewhat similar to the strict-alias option/loop dep pragma.
> The compiler tries to do something based on the hint you give it, but does not
> ensure correctness.
>
> > The only way
> > to define it would be if there was, at any point, an exit from the
> > loop (and there it _may_ be exclude EH edges) then
> > the loop is assumed to be finite.
>
> I don't get your point. If we treat an infinite loop as finite, it's bad because
> the loop might be removed.
>
> Suppose we have a function:
>
> void foo(int bound)
>  { for (int i = 0; i <= bound; i++); }
>
> In an early CD-DCE pass, "bound" is represented as a variable, and the loop has
> an exit, so it is assumed to be finite, and is removed.
>
> But in a late pass, this function is inlined into another one, and "bound"
> has the value INT_MAX, so this loop is infinite, and here we can know it should
> not be removed.

But if "bound" is always INT_MAX but that's not visible to the
compiler we will still remove the
loop so I see no difference with removing it always.

> This is why I suggest doing the optimization as late as possible.

But this will defeat the purpose of allowing followup optimizations.

IMHO the only "sensible" thing is to do

Index: gcc/tree-ssa-dce.c
===
--- gcc/tree-ssa-dce.c  (revision 271415)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
  }

   FOR_EACH_LOOP (loop, 0)
-   if (!finite_loop_p (loop))
+   if (!loop_has_exit_edges (loop))
  {
if (dump_file)
  fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);

that also has the obvious advantage that we don't need to replace the loop
with a trap() but have a place to forward control flow to.  The loop in the
following testcase is then successfully removed:

int main(int argc, char **argv)
{
  unsigned i = argc;
  while (i+=2);
  return 0;
}

Likewise is the loop

void **q;
int main(int argc, char **argv)
{
  void **p = q;
  while (p = (void **)*p);
  return 0;
}

(that's the pointer-chasing).  Not with -fnon-call-exceptions
-fexceptions though.
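
As a side note on the first testcase: `while (i+=2);` has an exit edge, but
whether the exit is ever taken depends on the parity of the start value,
since unsigned arithmetic wraps and adding 2 never changes parity. The
following is a small self-contained sketch of just that arithmetic, scaled
down to an 8-bit counter so it finishes quickly. toy_loop_exits and its step
limit are hypothetical helpers for illustration, not anything in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulate `while (i += 2);` on an 8-bit unsigned counter starting at
   START.  Return true if the counter reaches zero (the loop exits)
   within MAX_STEPS iterations, false otherwise.  */
bool
toy_loop_exits (uint8_t start, unsigned max_steps)
{
  assert (max_steps > 0);
  uint8_t i = start;
  for (unsigned step = 0; step < max_steps; step++)
    {
      i += 2;           /* Wraps modulo 256 when stored back.  */
      if (i == 0)
        return true;    /* Condition became false: loop exits.  */
    }
  return false;         /* Exit not taken within the step limit.  */
}
```

An even start value wraps around to zero, while an odd one cycles through
odd values forever. So the loop "has an exit" in the sense of
loop_has_exit_edges without being provably finite, which is exactly the
assumption the suggested change makes.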

Richard.

> Feng
>


Re: [PATCH v2 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Segher Boessenkool
On Mon, May 20, 2019 at 01:31:10PM -0600, Jeff Law wrote:
> On 5/15/19 10:44 AM, Segher Boessenkool wrote:
> > But we can be allocated a floating point register, or memory, instead.
> > That is heavily discouraged (by making it more expensive), but it can
> > still happen.  This is a jump_insn so it cannot get any reloads, either;
> > but even if it could, that is an *expensive* thing to do.
> Right.  And that's consistent with what other architectures have needed
> to do.  I can't describe the pain of what happens on the PA when you
> find out that the loop counter got allocated to the shift amount
> register or a floating point register.  It's rare, but you had to handle
> it.  Ugh.

Maybe it's time to finally allow output reloads on jump insns.  In LRA
only, if that helps?


Segher


Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Richard Biener
On Mon, 20 May 2019, Segher Boessenkool wrote:

> On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote:
> > > I think we should have two hooks: one is called with the struct loop as
> > > parameter; and the other is called for every statement in the loop, if
> > > the hook isn't null anyway.  Or perhaps we do not need that second one.
> > I'd wait to see a compelling example from real world code where we need
> > to scan the statements.  Otherwise we're just dragging in more target
> > specific decisions which in fact we want to minimize target stuff.
> 
> The ivopts pass will be too optimistic about what loops will end up as a
> doloop, and cost things accordingly.  The cases where we cannot later
> actually use a doloop are doing pretty much per iteration, so I think
> ivopts will still make good decisions.  We'll need to make the rtl part
> not actually do a doloop then, but we probably still need that logic
> anyway.

Yes - my thinking was that if IVOPTs does _not_ choose a doloop IV then
the RTL side has to give up, so that's bad.  But if it chooses
a doloop IV but we end up failing to do a doloop that's not too
bad.  After all we indeed cannot remove RTL doloop analysis
at this point.

So the important thing is to make IVOPTs create a separate
counter IV that is only used in the update/compare/branch
and cost that appropriately (not even sure if _that_ is
actually required!).

A slight complication might be that if IVOPTs decides
to use a doloop IV but creates another equivalent IV
for other uses then later CSE might end up unifying them
again.  We should probably make IVOPTs aware of this.

> Kewen, Bin, will that work satisfactorily do you think?
> 
> 
> Segher
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Iain Sandoe


> On 21 May 2019, at 10:48, Martin Liška  wrote:
> 
> On 5/21/19 11:41 AM, Dominique d'Humières wrote:
>> Hi Martin,
>> 
>> /* { dg-require-ifunc } */
>> 
>> should be
>> 
>> /* { dg-require-ifunc ""} */
>> 
>> and the same for pr90500-2.c (see 
>> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01152.html)
>> 
>> TIA
>> 
>> Dominique
>> 

> Thanks, should I fix it for all:
> 
> gcc/testsuite/gcc.target/i386/pr84723-1.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-2.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-3.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-4.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-5.c:/* { dg-require-ifunc } */

as noted in the thread referenced, if there is some benefit in running these ^^
tests on a non-ifunc target, then the line should be removed - otherwise
yes, fix it so that these targets don’t run it by accident.

> gcc/testsuite/gcc.target/i386/pr90500-1.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr90500-2.c:/* { dg-require-ifunc } */

yes.
thanks
Iain



Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Dominique d'Humières



> On 21 May 2019, at 11:48, Martin Liška  wrote:
> 
> On 5/21/19 11:41 AM, Dominique d'Humières wrote:
>> Hi Martin,
>> 
>> /* { dg-require-ifunc } */
>> 
>> should be
>> 
>> /* { dg-require-ifunc ""} */
>> 
>> and the same for pr90500-2.c (see 
>> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01152.html)
>> 
>> TIA
>> 
>> Dominique
>> 
> 
> Hi.
> 
> Thanks, should I fix it for all:
> 
> gcc/testsuite/gcc.target/i386/pr84723-1.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-2.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-3.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-4.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr84723-5.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr90500-1.c:/* { dg-require-ifunc } */
> gcc/testsuite/gcc.target/i386/pr90500-2.c:/* { dg-require-ifunc } */

IMO Jakub’s advice is needed for the gcc/testsuite/gcc.target/i386/pr84723-* 
tests: they pass on darwin.

> 
> ?



Re: Simplify more EXACT_DIV_EXPR comparisons

2019-05-21 Thread Richard Biener
On Tue, May 21, 2019 at 4:13 AM Martin Sebor  wrote:
>
> On 5/20/19 3:16 AM, Richard Biener wrote:
> > On Mon, May 20, 2019 at 10:16 AM Marc Glisse  wrote:
> >>
> >> On Mon, 20 May 2019, Richard Biener wrote:
> >>
> >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse  wrote:
> 
>  Hello,
> 
>  2 pieces:
> 
>  - the first one handles the case where the denominator is negative. It
>  doesn't happen often with exact_div, so I don't handle it everywhere, but
>  this one looked trivial
> 
>  - handle the case where a pointer difference is cast to an unsigned type
>  before being compared to a constant (I hit this in std::vector). With some
>  range info we could probably handle some non-constant cases as well...
> 
>  The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2)
> 
>  void f (void*);
>  void g (int *p, int *q)
>  {
>  __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
>  if (n < 100)
>    f (__builtin_alloca (n));
>  }
> 
>  At the time of walloca2, we have
> 
>  _1 = p_5(D) - q_6(D);
>  # RANGE [-2305843009213693952, 2305843009213693951]
>  _2 = _1 /[ex] 4;
>  # RANGE ~[2305843009213693952, 16140901064495857663]
>  n_7 = (long unsigned intD.10) _2;
>  _11 = (long unsigned intD.10) _1;
>  if (_11 <= 396)
>  [...]
>  _3 = allocaD.1059 (n_7);
> 
>  and warn.
> >>>
> >>> That's indeed too complicated a relation of _11 to n_7 for
> >>> VRP predicate discovery.
> >>>
>  However, DOM3 later produces
> 
>  _1 = p_5(D) - q_6(D);
>  _11 = (long unsigned intD.10) _1;
>  if (_11 <= 396)
> >>>
> >>> while _11 vs. _1 works fine.
> >>>
>  [...]
>  # RANGE [0, 99] NONZERO 127
>  _2 = _1 /[ex] 4;
>  # RANGE [0, 99] NONZERO 127
>  n_7 = (long unsigned intD.10) _2;
>  _3 = allocaD.1059 (n_7);
> 
>  so I am tempted to say that the walloca2 pass is too early, xfail the
>  testcase and file an issue...
> >>>
> >>> Hmm, there's a DOM pass before walloca2 already and moving
> >>> walloca2 after loop opts doesn't look like the best thing to do?
> >>> I suppose it's not DOM but sinking that does the important transform
> >>> here?  That is,
> >>>
> >>> Index: gcc/passes.def
> >>> ===
> >>> --- gcc/passes.def  (revision 271395)
> >>> +++ gcc/passes.def  (working copy)
> >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3.
> >>>NEXT_PASS (pass_optimize_bswap);
> >>>NEXT_PASS (pass_laddress);
> >>>NEXT_PASS (pass_lim);
> >>> -  NEXT_PASS (pass_walloca, false);
> >>>NEXT_PASS (pass_pre);
> >>>NEXT_PASS (pass_sink_code);
> >>> +  NEXT_PASS (pass_walloca, false);
> >>>NEXT_PASS (pass_sancov);
> >>>NEXT_PASS (pass_asan);
> >>>NEXT_PASS (pass_tsan);
> >>>
> >>> fixes it?
> >>
> >> I will check, but I don't think walloca uses any kind of on-demand VRP, so
> >> we still need some pass to update the ranges after sinking, which doesn't
> >> seem to happen until the next DOM pass.
> >
> > Oh, ok...  Aldy, why's this a separate pass anyways?  I think similar
> > other warnings are emitted from RTL expansion?  So maybe we can
> > indeed move the pass towards warn_restrict or late_warn_uninit.
>
> I thought there was a preference to add new middle-end warnings
> into passes of their own rather than into existing passes.  Is
> that not so (either in general or in this specific case)?

The preference was to add them not into optimization passes.  But
of course having 10+ warning passes, each going over the whole IL
is excessive.  Also, each of them computes ranges locally or so.

Given the simplicity of Walloca I wonder why it's not part of another
warning pass - since it's about tracking "sizes" again there are plenty
that fit ;)

>  From my POV, the main (only?) benefit of putting warnings in their
> own passes is modularity.  Are there any others?
>
> The biggest drawback I see is that it makes it hard to then share
> data across multiple passes.  The sharing can help not just
> warnings (reduce both false positive and false negative rates) but
> also optimization.  That's why I'm merging the strlen and sprintf
> passes, and want to eventually also look into merging
> the -Wstringop-overflow warnings there (also emitted just before
> RTL expansion).  Did I miss any downsides?

When things fit together they are fine to merge obviously.

One may not like -Warray-bounds inside VRP but it really "fits".

OTOH making a warning part of an optimization pass naturally
limits its effect to when the specific optimization is enabled.
In theory it's possible to do -Warray-bounds at -O0 - we are in
SSA form after all - but of course you don't want to enable VRP at -O0.

> I don't know if there's the -Walloca pass would benefit 

Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Martin Liška
On 5/21/19 11:41 AM, Dominique d'Humières wrote:
> Hi Martin,
> 
>  /* { dg-require-ifunc } */
> 
> should be
> 
>  /* { dg-require-ifunc ""} */
> 
> and the same for pr90500-2.c (see 
> https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01152.html)
> 
> TIA
> 
> Dominique
> 

Hi.

Thanks, should I fix it for all:

gcc/testsuite/gcc.target/i386/pr84723-1.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr84723-2.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr84723-3.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr84723-4.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr84723-5.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr90500-1.c:/* { dg-require-ifunc } */
gcc/testsuite/gcc.target/i386/pr90500-2.c:/* { dg-require-ifunc } */

?


Re: [PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Dominique d'Humières
Hi Martin,

 /* { dg-require-ifunc } */

should be

 /* { dg-require-ifunc ""} */

and the same for pr90500-2.c (see 
https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01152.html)

TIA

Dominique

Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-05-21 Thread Richard Biener
On Tue, May 21, 2019 at 12:07 AM Jeff Law  wrote:
>
> On 5/13/19 1:41 AM, Martin Liška wrote:
> > On 11/8/18 9:56 AM, Martin Liška wrote:
> >> On 11/7/18 11:23 PM, Jeff Law wrote:
> >>> On 10/30/18 6:28 AM, Martin Liška wrote:
>  On 10/30/18 11:03 AM, Jakub Jelinek wrote:
> > On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
> >> +hashtab_chk_error ()
> >> +{
> >> +  fprintf (stderr, "hash table checking failed: "
> >> +   "equal operator returns true for a pair "
> >> +   "of values with a different hash value");
> > BTW, either use internal_error here, or at least if using fprintf
> > terminate with \n, in your recent mail I saw:
> > ...different hash valueduring RTL pass: vartrack
> > ^^
>  Sure, fixed in attached patch.
> 
>  Martin
> 
> >> +  gcc_unreachable ();
> >> +}
> >   Jakub
> >
>  0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
> 
>  From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>  From: marxin 
>  Date: Mon, 29 Oct 2018 09:38:21 +0100
>  Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
> 
>  ---
>   gcc/hash-table.h | 40 +++-
>   1 file changed, 39 insertions(+), 1 deletion(-)
> 
>  diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>  index bd83345c7b8..694eedfc4be 100644
>  --- a/gcc/hash-table.h
>  +++ b/gcc/hash-table.h
>  @@ -503,6 +503,7 @@ private:
> 
> value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
> value_type *find_empty_slot_for_expand (hashval_t);
>  +  void verify (const compare_type &comparable, hashval_t hash);
> bool too_empty_p (unsigned int);
> void expand ();
> static bool is_deleted (value_type )
>  @@ -882,8 +883,12 @@ hash_table
> if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>   expand ();
> 
>  -  m_searches++;
>  +#if ENABLE_EXTRA_CHECKING
>  +if (insert == INSERT)
>  +  verify (comparable, hash);
>  +#endif
> 
>  +  m_searches++;
> value_type *first_deleted_slot = NULL;
> hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
> hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>  @@ -930,6 +935,39 @@ hash_table
> return _entries[index];
>   }
> 
>  +#if ENABLE_EXTRA_CHECKING
>  +
>  +/* Report a hash table checking error.  */
>  +
>  +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>  +static void
>  +hashtab_chk_error ()
>  +{
>  +  fprintf (stderr, "hash table checking failed: "
>  + "equal operator returns true for a pair "
>  + "of values with a different hash value\n");
>  +  gcc_unreachable ();
>  +}
> >>> I think an internal_error here is probably still better than a simple
> >>> fprintf, even if the fprintf is terminated with a \n :-)
> >> Fully agree with that, but I see a lot of build errors when using
> >> internal_error.
> >>
> >>> The question then becomes can we bootstrap with this stuff enabled and
> >>> if not, are we likely to soon?  It'd be a shame to put it into
> >>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
> >>> because we've got too many bugs to fix.
> >> Unfortunately it's blocked with these 2 PRs:
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
> > Hi.
> >
> > I've just added one more PR:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
> >
> > I'm sending an updated version of the patch that provides a disablement
> > for the 3 PRs with a new function disable_sanitize_eq_and_hash.
> >
> > With that I can bootstrap and finish tests. However, I've done that with
> > a patch that limits the maximal number of checks:
> So rather than call the disable_sanitize_eq_and_hash, can you have its
> state set up when you instantiate the object?  It's not a huge deal,
> just thinking out loud.
>
>
>
> So how do we want to go forward, particularly the EXTRA_EXTRA checking
> issue :-)

There is at least one PR where we have a table where elements _in_ the
table are never compared against each other but always against another
object (I guess that's usual even), but the setup is in a way that the
comparison function only works with those.  With the patch we verify
hashing/comparison for something that is never used.

So - wouldn't it be more "correct" to only verify comparison/hashing
at lookup time, using the object from the lookup and verify that against
all other elements?

Richard.
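
A minimal sketch of the lookup-time check suggested above, written in Python rather than against hash-table.h. The function name check_lookup and the toy hash functions are hypothetical; the only property taken from the thread is the invariant being verified: if the equality function says two values are equal, their hashes must match.

```python
def check_lookup(entries, key, hash_fn, eq_fn):
    """Return the entries that compare equal to KEY but hash differently.

    Only the looked-up object is compared against the table contents, so
    elements already in the table are never compared with each other.
    """
    key_hash = hash_fn(key)
    return [e for e in entries if eq_fn(e, key) and hash_fn(e) != key_hash]


def eq_ignore_case(a, b):
    """Equality that ignores case."""
    return a.lower() == b.lower()


def hash_ignore_case(s):
    """Deterministic toy hash that is consistent with eq_ignore_case."""
    return sum(ord(c) for c in s.lower())


def hash_case_sensitive(s):
    """Deterministic toy hash that is NOT consistent with eq_ignore_case."""
    return sum(ord(c) for c in s)


table = ["Alpha", "beta"]

# Consistent pair: nothing to report.
consistent = check_lookup(table, "ALPHA", hash_ignore_case, eq_ignore_case)

# Inconsistent pair: "Alpha" equals "ALPHA" but hashes to a different value.
inconsistent = check_lookup(table, "ALPHA", hash_case_sensitive, eq_ignore_case)
```

Scanning every element on each lookup is linear in the table size, which is presumably why the thread also mentions limiting the number of checks.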

>
> Jeff


[PATCH] Strip target_clones in copy attribute (PR lto/90500).

2019-05-21 Thread Martin Liška
Hi.

As suggested by Joseph, the patch is about not copying
target_clones attributes in handle_copy_attribute.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/c-family/ChangeLog:

2019-05-21  Martin Liska  

PR lto/90500
* c-attribs.c (handle_copy_attribute): Do not copy
target_clones attribute.

gcc/testsuite/ChangeLog:

2019-05-21  Martin Liska  

PR lto/90500
* gcc.target/i386/pr90500-1.c: Make the test-case valid
now.
---
 gcc/c-family/c-attribs.c  | 3 ++-
 gcc/testsuite/gcc.target/i386/pr90500-1.c | 3 +--
 2 files changed, 3 insertions(+), 3 deletions(-)


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 03203470955..517b7e0dd01 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -2486,7 +2486,8 @@ handle_copy_attribute (tree *node, tree name, tree args,
 	  || is_attribute_p ("noinline", atname)
 	  || is_attribute_p ("visibility", atname)
 	  || is_attribute_p ("weak", atname)
-	  || is_attribute_p ("weakref", atname))
+	  || is_attribute_p ("weakref", atname)
+	  || is_attribute_p ("target_clones", atname))
 	continue;
 
 	  /* Attribute leaf only applies to extern functions.
diff --git a/gcc/testsuite/gcc.target/i386/pr90500-1.c b/gcc/testsuite/gcc.target/i386/pr90500-1.c
index 7ac6a739c05..2b4639ca7f9 100644
--- a/gcc/testsuite/gcc.target/i386/pr90500-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr90500-1.c
@@ -3,6 +3,5 @@
 /* { dg-require-ifunc } */
 
 __attribute__((target_clones("arch=haswell", "default"))) int __tanh() {}
-__typeof(__tanh) tanhf64 __attribute__((alias("__tanh")))/* { dg-error "clones for .target_clones. attribute cannot be created" } */
-  /* { dg-message "'target_clones' cannot be combined with 'alias' attribute" "" { target *-*-* } .-1 } */
+__typeof(__tanh) tanhf64 __attribute__((alias("__tanh")))
 __attribute__((__copy__(__tanh)));
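
The effect of the c-attribs.c hunk can be pictured with a short Python sketch. It assumes attributes are modelled as (name, arguments) pairs, and NOT_COPIED holds only the names visible in the hunk above; the real exclusion list in handle_copy_attribute is longer. The function name and the "nothrow" entry are illustrative only.

```python
# Attribute names that the copy attribute skips, limited to those visible
# in the hunk (an assumption: the list in c-attribs.c has further entries).
NOT_COPIED = {"noinline", "visibility", "weak", "weakref", "target_clones"}


def copy_attributes(source_attrs):
    """Return the (name, args) pairs that would be copied to the new decl."""
    return [(name, args) for name, args in source_attrs
            if name not in NOT_COPIED]


# Attributes of a hypothetical source declaration.
tanh_attrs = [
    ("target_clones", ("arch=haswell", "default")),
    ("nothrow", ()),
]

copied = copy_attributes(tanh_attrs)
```

With target_clones filtered out, the alias declaration no longer inherits it, which is why the dg-error and dg-message expectations are dropped from pr90500-1.c.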



Re: Fix MEM_REF creation for shared stack slots

2019-05-21 Thread Richard Biener
On Tue, 21 May 2019, Jan Hubicka wrote:

> Hi,
> while creating shared stack slots we create a fake void * pointer and
> merge the corresponding points-to sets.  Later ao_ref_from_mem constructs
> a mem_ref from it to feed the alias oracle.  Since the pointer is void, we
> dereference it and end up with a void_type mem_ref, which does not make
> much sense.  It also makes the oracle punt later on
> 
>   /* If either reference is view-converted, give up now.  */
>   if (same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) != 1
>   || same_type_for_tbaa (TREE_TYPE (dbase2), TREE_TYPE (base2)) != 1)
> return true;
> 
> The patch improves access path disambiguation from:
> 
>   refs_may_alias_p: 3027850 disambiguations, 3340416 queries
>   ref_maybe_used_by_call_p: 6451 disambiguations, 3053430 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 151 disambiguations, 12565 queries
>   TBAA oracle: 1468434 disambiguations 3010778 queries
>550723 are in alias set 0
>614261 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>260983 are dependent in the DAG
>116377 are aritificially in conflict with void *
> 
> to:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3029219 disambiguations, 3341410 queries
>   ref_maybe_used_by_call_p: 6451 disambiguations, 3054799 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 1286 disambiguations, 18458 queries
>   TBAA oracle: 1468536 disambiguations 3013214 queries
>550743 are in alias set 0
>616203 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>261355 are dependent in the DAG
>116377 are aritificially in conflict with void *
> 
> So about 8 times the aliasing_component_refs hit rate.

OK, one issue with the patch is that it restores TBAA for the
access which we may _not_ do IIRC.

> Bootstrapped/regtested x86_64-linux, OK?

I'd rather not have that new build_simple_mem_ref_with_type_loc
function - the "simple" MEM_REF was to be a way to replace
a plain old INDIRECT_REF.

So please instead ...

> Honza
> 
>   * alias.c (ao_ref_from_mem): Use build_simple_mem_ref_with_type.
>   * tree.c (build_simple_mem_ref_with_type_loc): Break out from ...
>   (build_simple_mem_ref_loc): ... here.
>   * fold-const.h (build_simple_mem_ref_with_type_loc): Declare.
>   (build_simple_mem_ref_with_type): New macro.
> Index: alias.c
> ===
> --- alias.c   (revision 271379)
> +++ alias.c   (working copy)
> @@ -316,7 +316,8 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>  {
>tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
>if (namep)
> - ref->base = build_simple_mem_ref (*namep);
> + ref->base = build_simple_mem_ref_with_type
> +  (*namep, build_pointer_type (TREE_TYPE (base)));

...

ref->base = build2 (MEM_REF, TREE_TYPE (base), *namep,
build_int_cst (TREE_TYPE (*namep), 0));

which preserves TBAA behavior but fixes the 'void' type ref.

Thanks,
Richard.


>  }
>  
>ref->ref_alias_set = MEM_ALIAS_SET (mem);
> Index: fold-const.h
> ===
> --- fold-const.h  (revision 271379)
> +++ fold-const.h  (working copy)
> @@ -120,6 +120,9 @@ extern tree fold_indirect_ref_loc (locat
>  extern tree build_simple_mem_ref_loc (location_t, tree);
>  #define build_simple_mem_ref(T)\
>   build_simple_mem_ref_loc (UNKNOWN_LOCATION, T)
> +extern tree build_simple_mem_ref_with_type_loc (location_t, tree, tree);
> +#define build_simple_mem_ref_with_type(T, T2)\
> + build_simple_mem_ref_with_type_loc (UNKNOWN_LOCATION, T, T2)
>  extern poly_offset_int mem_ref_offset (const_tree);
>  extern tree build_invariant_address (tree, tree, poly_int64);
>  extern tree constant_boolean_node (bool, tree);
> Index: tree.c
> ===
> --- tree.c(revision 271402)
> +++ tree.c(working copy)
> @@ -4907,14 +4907,14 @@ build5 (enum tree_code code, tree tt, tr
>  }
>  
>  /* Build a simple MEM_REF tree with the sematics of a plain INDIRECT_REF
> -   on the pointer PTR.  */
> +   on the pointer PTR casted to TYPE.  */
>  
>  tree
> -build_simple_mem_ref_loc (location_t loc, tree ptr)
> +build_simple_mem_ref_with_type_loc (location_t loc, tree ptr, tree ptype)
>  {
>poly_int64 offset = 0;
> -  tree ptype = TREE_TYPE (ptr);
>tree tem;
> +  gcc_checking_assert (POINTER_TYPE_P (ptype) && ptype != void_type_node);
>/* For convenience allow addresses that collapse to a simple base
>   and offset.  */
>if (TREE_CODE (ptr) == ADDR_EXPR
> @@ 

Re: [PATCH] Convert contrib/mklog script to Python 3

2019-05-21 Thread Janne Blomqvist
On Tue, May 21, 2019 at 10:47 AM Janne Blomqvist
 wrote:
>
> On Tue, May 21, 2019 at 10:32 AM Martin Liška  wrote:
> >
> > Hi.
> >
> > There's a regression I see after the transition to python3:
> >
> > $ cat /tmp/patch
> > diff --git a/gcc/testsuite/gcc.dg/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
> > index acf3db16640..3222a5331c1 100644
> > --- a/gcc/testsuite/gcc.dg/pr90263.c
> > +++ b/gcc/testsuite/gcc.dg/pr90263.c
> > @@ -1,5 +1,6 @@
> >  /* PR middle-end/90263 */
> >  /* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> >  /* { dg-require-effective-target glibc } */
> >
> >  int *f (int *p, int *q, long n)
> >
> > $ ~/Programming/gcc/contrib/mklog /tmp/patch
> > Traceback (most recent call last):
> >   File "/home/marxin/Programming/gcc/contrib/mklog", line 470, in <module>
> >     main()
> >   File "/home/marxin/Programming/gcc/contrib/mklog", line 388, in main
> >     diffs = parse_patch(contents)
> >   File "/home/marxin/Programming/gcc/contrib/mklog", line 273, in parse_patch
> >     lines = contents.split('\n')
> > TypeError: a bytes-like object is required, not 'str'
> >
> > Thanks,
> > Martin
>
> Oof, thanks for the report, looking into it!
>
>
> --
> Janne Blomqvist

Committed r271459 as obvious:

diff --git a/contrib/mklog b/contrib/mklog
index 125f52ef11c..be1dc3a27fc 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -380,7 +380,7 @@ def main():
   if len(args) == 1 and args[0] == '-':
 input = sys.stdin
   elif len(args) == 1:
-input = open(args[0], 'rb')
+input = open(args[0])
   else:
 error("too many arguments; for more details run with -h")

@@ -442,7 +442,7 @@ def main():
 shutil.copymode(args[0], tmp)

 # Open the temp file, clearing contents.
-out = open(tmp, 'wb')
+out = open(tmp, 'w')
   else:
 tmp = None
 out = sys.stdout


-- 
Janne Blomqvist
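
The failure in the traceback can be reproduced outside mklog with a short sketch. It assumes nothing about mklog beyond the two calls shown in the thread, open(..., 'rb') and contents.split('\n'); the helper name split_lines and the temporary file are illustrative.

```python
import os
import tempfile


def split_lines(path, mode):
    """Read PATH with the given open() mode and split it on a str newline."""
    with open(path, mode) as handle:
        contents = handle.read()
    return contents.split('\n')


# Write a tiny stand-in for a patch file.
with tempfile.NamedTemporaryFile('w', suffix='.patch', delete=False) as tmp:
    tmp.write("--- a/file\n+++ b/file\n")
    patch_path = tmp.name

try:
    # Text mode yields str, so splitting on a str separator works.
    text_lines = split_lines(patch_path, 'r')

    # Binary mode yields bytes; a str separator raises TypeError in Python 3.
    try:
        split_lines(patch_path, 'rb')
        binary_error = None
    except TypeError as exc:
        binary_error = str(exc)
finally:
    os.unlink(patch_path)
```

Opening the file in text mode, as the committed fix does, keeps the rest of the script working on str. The alternative would be to keep 'rb' and split on b'\n', but then every later string operation would need bytes as well.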


Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-05-21 Thread Marc Glisse

On Mon, 20 May 2019, Michael Matz wrote:


On Mon, 20 May 2019, Richard Biener wrote:


The C++ standard says that do{}while(1) is __builtin_unreachable(), we
don't have to preserve it. There is no mention of anything like a
"nontrivial exit condition". Other languages may have a different
opinion though, so it would probably need a flag indeed... But I am
curious what the point of such a loop is.


busy wait until wakeup by signal or interrupt.


I'd actually turn it around from what C++ says.  If the user wrote, as
is, "do{}while(1);" or "while(1);" or "for(;;);" then we can assume
something funky going on and not remove the loop.  For any other loop we
assume that they are finite.  I.e. we mark loops as to-be-preserved (which
we set on a few known patterns), and just remove all other loops when they
contain no observable side effects after optimization.


Seems sensible, although marking the trivial infinite loops in gimple 
seems simpler than doing it in the front-ends, and a good enough 
approximation unless we are willing to replace some other infinite loops 
with unreachable (or trap).



And of course we'd still have to determine what acceptable side effects
are.  E.g. in a pointer chasing loop containing no body, is the
segfault when the pointer chain is not in fact circular, a side effect we
should retain, or should we be allowed to remove the loop?  I'd say we
should remove the loop, of course.


That may depend on flags like -fnon-call-exceptions (maybe not the right 
one) I guess, although I would also want the removal to happen in as many 
cases as possible. We do usually remove memory reads if the value read is 
unused.



(And yes, I've always found our obsession with preserving infinite loops,
outside of the above "obvious" cases, overly anal as well)


--
Marc Glisse

