Re: Backport important ASan features from upstream.

2015-11-19 Thread Jakub Jelinek
On Thu, Nov 19, 2015 at 11:19:23AM +0300, Maxim Ostapenko wrote:
> Hi!
> 
> Since the last sanitizer library merge to GCC happened, some new useful
> features were applied upstream. In particular, the most significant are:
> 
> * The shadow offset for ASan was unified on Aarch64 for 39 and 42 VMAs
> (http://reviews.llvm.org/D13782). AFAU, this change wouldn't require any
> additional support from compiler side, because the shadow offset is the same
> as for 39-bit VMA (36) .
> * Optional ASan recovery functionality was merged to sanitizer library
> (http://reviews.llvm.org/D12318). This feature seems to be very helpful in
> complex systems where we may want to proceed to work even in case of bug was
> detected. However, to access this functionality, we'll need to implement new
> asan_report_{store, load}{1, 2, 4, 8, 16, N}_noabort callbacks in compiler.
> This is probably unacceptable for stage 3.

No, those are already there (for -fsanitize{,-recover}=kernel-address).
IMHO all you need is remove:
  if (opts->x_flag_sanitize_recover & SANITIZE_USER_ADDRESS)
error_at (loc, "-fsanitize-recover=address is not supported");
in opts.c (finish_options), and in asan.c (asan_expand_check_ifn)
change
  bool recover_p
= (flag_sanitize & flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
to say:
  bool recover_p;
  if (flag_sanitize & SANITIZE_USER_ADDRESS)
recover_p = (flag_sanitize_recover & SANITIZE_USER_ADDRESS) != 0;
  else
recover_p = (flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
(plus add some testsuite coverage).
This is small enough change that I'm ok with an exception for this.

> I think it would be nice to have unified shadow offset on Aarch64 for 39 and
> 42 VMAs even in GCC 6 (enabling ASan recovery would be nice too, but it's
> much harder).
> 
> So, my question is: is it acceptable to backport these features from
> upstream without touching compiler side? If so, I see two options here:
> 
> - Perform sanitizer library merge to GCC without changing compiler side.

And at this point I also prefer the above, rather than cherry-picking, of
course later on in the process cherry-picking will be desirable instead.

> - Cherry-pick the patch for AArch64 on top of current trunk.

Jakub


[PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Kyrill Tkachov

Hi all,

In this PR we end up removing a widening multiplication. I suspect it's some 
path in combine
that doesn't handle the case where we return clobber of 0 properly. This 
started with my recent
combine patch r230326.
Changing the subst hunk from that patch to just return x when handling a no-op 
substitution
fixes the testcase for me and doesn't regress the widening-mult cases that 
r230326 improved.
In fact, with this patch I'm seeing improved codegen for aarch64 with some 
widening multiplications
combined with constant operands and ending up in bitfield move/insert 
instructions.

Bootstrapped and tested on aarch64, arm, x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-11-19  Kyrylo Tkachov  

PR rtl-optimization/68381
* combine.c (subst): Do not return clobber of zero in widening mult
case.  Just return x unchanged if it is a no-op substitution.

2015-11-19  Kyrylo Tkachov  

PR rtl-optimization/68381
* gcc.c-torture/execute/pr68381.c: New test.
commit 58ec17169da2bb864d2bcaeb34a33e8c5a93348f
Author: Kyrylo Tkachov 
Date:   Tue Nov 17 15:33:58 2015 +

[combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

diff --git a/gcc/combine.c b/gcc/combine.c
index 26b14bf..3791abf 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5286,7 +5286,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy)
 	  || GET_CODE (SET_DEST (x)) == PC))
 	fmt = "ie";
 
-  /* Substituting into the operands of a widening MULT is not likely
+  /* Trying to simplify the operands of a widening MULT is not likely
 	 to create RTL matching a machine insn.  */
   if (code == MULT
 	  && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND
@@ -5294,13 +5294,10 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy)
 	  && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND
 	  || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND)
 	  && REG_P (XEXP (XEXP (x, 0), 0))
-	  && REG_P (XEXP (XEXP (x, 1), 0)))
-	{
-	  if (from == to)
-	return x;
-	  else
-	return gen_rtx_CLOBBER (GET_MODE (x), const0_rtx);
-	}
+	  && REG_P (XEXP (XEXP (x, 1), 0))
+	  && from == to)
+	return x;
+
 
   /* Get the mode of operand 0 in case X is now a SIGN_EXTEND of a
 	 constant.  */
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68381.c b/gcc/testsuite/gcc.c-torture/execute/pr68381.c
new file mode 100644
index 000..33012a3
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68381.c
@@ -0,0 +1,22 @@
+/* { dg-options "-fexpensive-optimizations -fno-tree-bit-ccp" } */
+
+__attribute__ ((noinline, noclone))
+int
+foo (unsigned short x, unsigned short y)
+{
+  int r;
+  if (__builtin_mul_overflow (x, y, ))
+__builtin_abort ();
+  return r;
+}
+
+int
+main (void)
+{
+  int x = 1;
+  int y = 2;
+  if (foo (x, y) != x * y)
+__builtin_abort ();
+  return 0;
+}
+


Re: [PATCH][RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves

2015-11-19 Thread Kyrill Tkachov


On 18/11/15 09:11, Kyrill Tkachov wrote:


On 17/11/15 23:11, Bernd Schmidt wrote:

On 11/17/2015 02:03 PM, Kyrill Tkachov wrote:

+  || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn
 return false;



Well, I think the statement we want to make is
"return false from this function if the two expressions contain the same
register number".


I looked at it again and I think I'll need more time to really figure out 
what's going on in this pass.

However, about the above... either I'm very confused, or the actual statement here is "return false _unless_ the two expressions contain the same register number". In the testcase, the regs in question are ax and bx, which is then 
rejected if the patch has been applied.


I'm sorry, it is my mistake in explaining. I meant to say:
"return false from this function unless the two expressions contain the same
 register number"



(gdb) p tmp_reg
(reg:SI 0 ax)
(gdb) p cand->insn
(insn 59 117 60 7 (set (reg:SI 4 si [orig:107 D.1813 ] [107])
(sign_extend:SI (reg:HI 3 bx [orig:99 D.1813 ] [99])))

And I think it would be better to strengthen that to "return false unless registers 
are identical". (From the above it's clear however that a plain rtx_equal_p wouldn't 
work since there's an extension in the operand).


So the three relevant instructions are:
I1: (set (reg:HI 0 ax)
 (mem:HI (symbol_ref:DI ("f"

I2: (set (reg:SI 3 bx)
 (if_then_else:SI (eq (reg:CCZ 17 flags)
(const_int 0))
(reg:SI 0 ax)
(reg:SI 3 bx)))

I3: (set (reg:SI 4 si)
 (sign_extend:SI (reg:HI 3 bx)))

I1 is def_insn, I3 is cand->insn. tmp_reg is 'ax'. What we want to do is reject 
this transformation
because the destination of def_insn (aka I1), that is 'ax', is not the operand 
of the extend operation
in cand->insn (aka I3). As you said, rtx_equal won't work on just SET_SRC (PATTERN 
(cand->insn)) because
it's an extend operation. So reg_overlap_mentioned should be appropriate.
Hope this helps.



Also, I had another look at the testcase. It uses __builtin_printf and dg-output, which 
is at least unusual. Try to use the normal "if (cond) abort ()".



I did not try to remove the __builtin_printf because the use of 'f' there is 
needed to trigger this.
There are at least two other dups of this PR: 68328 and 68185 the testcases for which 
have an "if (cond) abort ();" form.
I'll use them instead.



For completeness' sake here's the patch I'm proposing.
I've used the testcases from the other two PRs that exhibit the same problem 
and use
the if (cond) abort (); form.

Kyrill

2015-11-16  Kyrylo Tkachov  

PR rtl-optimization/68194
PR rtl-optimization/68328
PR rtl-optimization/68185
* ree.c (combine_reaching_defs): Reject copy_needed case if defining
insn does not feed directly into candidate extension insn.

2015-11-16  Kyrylo Tkachov  

PR rtl-optimization/68194
* gcc.c-torture/execute/pr68185.c: Likewise.
* gcc.c-torture/execute/pr68328.c: Likewise.





Bernd





commit 7ca1b135babbe3a542c850591e1b0e8736a19f55
Author: Kyrylo Tkachov 
Date:   Fri Nov 13 15:01:47 2015 +

[RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves

diff --git a/gcc/ree.c b/gcc/ree.c
index b8436f2..e91d164 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -814,7 +814,30 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 
   rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))),
  REGNO (SET_DEST (*dest_sub_rtx)));
-  if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn
+
+  /* When transforming:
+	(set (reg1) (expression))
+	 ...
+	(set (reg2) (any_extend (reg1)))
+
+	into
+
+	(set (reg2) (any_extend (expression)))
+	(set (reg1) (reg2))
+	make sure that reg1 from the first set feeds directly into the extend.
+	This may not hold in a situation with an intermediate
+	conditional copy i.e.
+	I1: (set (reg3) (expression))
+	I2: (set (reg1) (cond ? reg3 : reg1))
+	I3: (set (reg2) (any_extend (reg1)))
+
+	where I3 is cand, I1 is def_insn and I2 is a conditional copy.
+	We want to avoid transforming that into:
+	(set (reg2) (any_extend (expression)))
+	(set (reg1) (reg2))
+	(set (reg1) (cond ? reg3 : reg1)).  */
+  if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn)))
+	  || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn
 	return false;
 
   /* The destination register of the extension insn must not be
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68185.c b/gcc/testsuite/gcc.c-torture/execute/pr68185.c
new file mode 100644
index 000..826531b
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68185.c
@@ -0,0 +1,29 @@
+int a, b, d = 1, e, f, o, u, w = 1, z;
+short c, q, t;
+
+int
+main ()
+{
+  char g;
+  for (; d; d--)
+{
+  while (o)
+	for (; e;)
+	  

Re: [PATCH] Add clang-format config to contrib folder

2015-11-19 Thread Martin Liška

On 11/19/2015 12:09 AM, Sebastian Pop wrote:

Martin, thanks for getting this patch out.  I like the patch.
Jeff, clang-format has scripts that allow formatting only the lines
touched by a patch.
It also has a script to integrate with git:
https://llvm.org/svn/llvm-project/cfe/trunk/tools/clang-format/git-clang-format
We could use those scripts in a commit hook to automatically enforce
correct formatting of new patches.

Sebastian


Hi.

Thanks for pointing out the script touching just modified lines,
I mentioned the script in the clang-format file.

Sending v2 that I'm going to install.

Thanks,
Martin



On Wed, Nov 18, 2015 at 8:10 AM, Martin Liška  wrote:

Hello.

Following patch adds a clang-format config file that should respect the GNU 
coding standards.
As the file is not part of build process, I hope the patch can be applied even 
though
we've just skipped to stage3? The patch adds a hunk to Makefile which can 
create symlink
to the root directory of the GCC compiler. The clang-format automatically loads 
style from
the configuration file.

clang-format (version 3.8) provides rich variety of configuration options that 
can
ensure the GNU coding style.

Limitations:
+ placement of opening brace of an initializer can't be requested
+ sometimes, '(' is the trailing symbol at the end of a line, which can look 
weird

As we've been continuously converting our source base to C++, the clang-format 
should
provide better results than a collection of regular expressions 
(check_GNU_style.sh).

As a reference file I attach gcc/tree-ssa-uninit.c file.
Feel free to comment the suggested configuration file.

Thanks,
Martin



From c344396c77ce462b576204697ccd48f1eaa17c7b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 18 Nov 2015 14:40:58 +0100
Subject: [PATCH] Add clang-format config to contrib folder

ChangeLog:

2015-11-18  Martin Liska  

	* Makefile.in: Add clang-format.
	* Makefile.tpl: Likewise.

contrib/ChangeLog:

2015-11-18  Martin Liska  

	* clang-format: New file.
---
 Makefile.in  |  9 +
 Makefile.tpl |  9 +
 contrib/clang-format | 55 
 3 files changed, 73 insertions(+)
 create mode 100644 contrib/clang-format

diff --git a/Makefile.in b/Makefile.in
index bc2bae6..75caafd 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -2461,6 +2461,15 @@ vimrc: $(srcdir)/.local.vimrc $(srcdir)/.lvimrc
 
 .PHONY: vimrc
 
+# clang-format config
+
+$(srcdir)/.clang-format:
+	$(LN_S) contrib/clang-format $@
+
+clang-format: $(srcdir)/.clang-format
+
+.PHONY: clang-format
+
 # Installation targets.
 
 .PHONY: install uninstall
diff --git a/Makefile.tpl b/Makefile.tpl
index 660e1e4..beabadc 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -889,6 +889,15 @@ vimrc: $(srcdir)/.local.vimrc $(srcdir)/.lvimrc
 
 .PHONY: vimrc
 
+# clang-format config
+
+$(srcdir)/.clang-format:
+	$(LN_S) contrib/clang-format $@
+
+clang-format: $(srcdir)/.clang-format
+
+.PHONY: clang-format
+
 # Installation targets.
 
 .PHONY: install uninstall
diff --git a/contrib/clang-format b/contrib/clang-format
new file mode 100644
index 000..8a6fd4b
--- /dev/null
+++ b/contrib/clang-format
@@ -0,0 +1,55 @@
+# Copyright (C) 2015 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+# clang-format 3.8+ (Mon Nov 16) is required
+#
+# To utilize the tool to lines just touched by a patch, use
+# clang-format-diff.py script, which can be downloaded here:
+# https://llvm.org/svn/llvm-project/cfe/trunk/tools/clang-format/clang-format-diff.py
+
+---
+Language: Cpp
+AccessModifierOffset: -2
+AlwaysBreakAfterDefinitionReturnType: All
+BinPackArguments: true
+BinPackParameters: true
+BraceWrapping:
+  AfterClass: true
+  AfterControlStatement: true
+  AfterEnum: true
+  AfterFunction: true
+  AfterNamespace: false
+  AfterObjCDeclaration: true
+  AfterStruct: true
+  AfterUnion: true
+  BeforeCatch: true
+  BeforeElse: true
+  IndentBraces: true
+BreakBeforeBinaryOperators: All
+BreakBeforeBraces: Custom
+BreakBeforeTernaryOperators: true
+ColumnLimit: 80
+ConstructorInitializerIndentWidth: 2
+ContinuationIndentWidth: 2
+ForEachMacros: 

Backport important ASan features from upstream.

2015-11-19 Thread Maxim Ostapenko

Hi!

Since the last sanitizer library merge to GCC happened, some new useful 
features were applied upstream. In particular, the most significant are:


* The shadow offset for ASan was unified on Aarch64 for 39 and 42 VMAs 
(http://reviews.llvm.org/D13782). AFAU, this change wouldn't require any 
additional support from compiler side, because the shadow offset is the 
same as for 39-bit VMA (36) .
* Optional ASan recovery functionality was merged to sanitizer library 
(http://reviews.llvm.org/D12318). This feature seems to be very helpful 
in complex systems where we may want to proceed to work even in case of 
bug was detected. However, to access this functionality, we'll need to 
implement new asan_report_{store, load}{1, 2, 4, 8, 16, N}_noabort 
callbacks in compiler. This is probably unacceptable for stage 3.


I think it would be nice to have unified shadow offset on Aarch64 for 39 
and 42 VMAs even in GCC 6 (enabling ASan recovery would be nice too, but 
it's much harder).


So, my question is: is it acceptable to backport these features from 
upstream without touching compiler side? If so, I see two options here:


- Perform sanitizer library merge to GCC without changing compiler side.
- Cherry-pick the patch for AArch64 on top of current trunk.

Thanks,
-Maxim


Re: [PATCH] Avoid useless work in loop vectorization

2015-11-19 Thread Richard Biener
On Wed, 18 Nov 2015, Alan Lawrence wrote:

> On 13/11/15 08:41, Richard Biener wrote:
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > Richard.
> > 
> > 2015-11-13  Richard Biener  
> > 
> > * tree-vect-loop.c (vect_analyze_loop_2): Add fatal parameter.
> > Signal fatal failure if early checks fail.
> > (vect_analyze_loop): If vect_analyze_loop_2 fails fatally
> > do not bother testing further vector sizes.
> 
> It seems that on AArch64 this causes:
> 
> FAIL: gcc.dg/vect/vect-outer-1-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1.c -flto -ffat-lto-objects  scan-tree-dump-times
> vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-2b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-2b.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-3b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 4
> FAIL: gcc.dg/vect/vect-outer-3b.c scan-tree-dump-times vect "grouped access in
> outer loop" 4
> 
> Still there on r230556, I haven't dug any further yet.

Probably a testsuite issue as we have

/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 
"vect" { target { ! vect_multiple_sizes } } } } */
/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 2 
"vect" { target vect_multiple_sizes } } } */

and now possibly terminate early before considering the other vector
size(s).

Richard.


[PATCH] Fix PR68117

2015-11-19 Thread Richard Biener

I'm reverting an earlier change removing the redirect_edge_var_map_destroy
call from delete_tree_ssa.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-11-19  Richard Biener  

PR middle-end/68117
* tree-ssa.c (delete_tree_ssa): Revert removal of call to
redirect_edge_var_map_destroy.

Index: gcc/tree-ssa.c
===
--- gcc/tree-ssa.c  (revision 230550)
+++ gcc/tree-ssa.c  (working copy)
@@ -1126,6 +1126,9 @@ delete_tree_ssa (struct function *fn)
   fn->gimple_df->decls_to_pointers = NULL;
   fn->gimple_df->modified_noreturn_calls = NULL;
   fn->gimple_df = NULL;
+
+  /* We no longer need the edge variable maps.  */
+  redirect_edge_var_map_destroy ();
 }
 
 /* Return true if EXPR is a useless type conversion, otherwise return


Re: [PATCH] [4.9] Re: [PATCH][5] Backport ISL 0.15 support

2015-11-19 Thread Richard Biener
On Wed, 18 Nov 2015, Matthias Klose wrote:

> On 12.10.2015 12:58, Richard Biener wrote:
> > 
> > This backports the patch to allow bootstrapping with ISL 0.15 to the
> > GCC 5 branch (the GCC 4.9 branch will require backporting of some
> > dependencies).
> 
> I don't think so. 4.8 and 4.9 don't use as much ISL code as 5 does.  I had a
> look at the backport and came up with something which is just mechanical
> changes for the cpp conditional.
> 
> The version check in the toplevel configure needs to be extended.  I'm
> currently not checking non matching isl and cloog versions.
> 
> Now build for 4.9 using ISL 0.15 and 0.14. Ok to commit to the branch?

Ok.  I wonder if we have a cloog version in infrastructure that builds
against ISL 0.1[45]?  IIRC cloog 0.18.1 doesn't.

Thanks,
Richard.

> Matthias
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH, PR68373 ] Call scev_const_prop in pass_parallelize_loops::execute

2015-11-19 Thread Tom de Vries

On 17/11/15 23:20, Tom de Vries wrote:

[ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ]

Hi,

Consider test-case test.c, with a use of the final value of the
iteration variable (return i):
...
unsigned int
foo (int *a, unsigned int n)
{
   unsigned int i;
   for (i = 0; i < n; ++i)
 a[i] = 1;

   return i;
}
...

Compiled with:
...
$ gcc -S -O2 test.c -ftree-parallelize-loops=2 -fdump-tree-all-details
...

Before parloops, we have:
...
  :
   # i_12 = PHI <0(3), i_10(5)>
   _5 = (long unsigned int) i_12;
   _6 = _5 * 4;
   _8 = a_7(D) + _6;
   *_8 = 1;
   i_10 = i_12 + 1;
   if (n_4(D) > i_10)
 goto ;
   else
 goto ;

   :
   goto ;

   :
   # i_14 = PHI 
...

Parloops will fail because:
...
phi is n_2 = PHI 
arg of phi to exit:   value n_4(D) used outside loop
   checking if it a part of reduction pattern:
   FAILED: it is not a part of reduction
...
[ note that the phi looks slightly different. In
gather_scalar_reductions -> vect_analyze_loop_form ->
vect_analyze_loop_form_1 -> split_loop_exit_edge we split the edge from
bb4 to bb6. ]

This patch uses scev_const_prop at the start of parloops.
scev_const_prop first also splits the exit edge, and then replaces the
phi with a assignment:
...
  final value replacement:
   n_2 = PHI 
   with
   n_2 = n_4(D);
...

This allows parloops to succeed.

And there's a similar story when we compile with -fno-tree-scev-cprop in
addition.

Bootstrapped and reg-tested on x86_64.

OK for stage3/stage1?


The patch has been updated to do the final value replacement only for 
the loop that parloops is processing, as suggested in review comment at 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02166.html .


That means the patch is now also required for the kernels patch series.

Bootstrapped and reg-tested on x86_64.

OK for stage 3 trunk?

Thanks,
- Tom
Do final value replacement in try_create_reduction_list

2015-11-18  Tom de Vries  

	* tree-scalar-evolution.c (final_value_replacement_loop): Factor out of ...
	(scev_const_prop): ... here.
	* tree-scalar-evolution.h (final_value_replacement_loop): Declare.
	* tree-parloops.c (try_create_reduction_list): Call
	final_value_replacement_loop.

	* gcc.dg/autopar/pr68373.c: New test.

---
 gcc/testsuite/gcc.dg/autopar/pr68373.c |  14 ++
 gcc/tree-parloops.c|   3 +
 gcc/tree-scalar-evolution.c| 248 +
 gcc/tree-scalar-evolution.h|   1 +
 4 files changed, 145 insertions(+), 121 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/autopar/pr68373.c b/gcc/testsuite/gcc.dg/autopar/pr68373.c
new file mode 100644
index 000..8e0f8a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr68373.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-parallelize-loops=2 -fdump-tree-parloops-details" } */
+
+unsigned int
+foo (int *a, unsigned int n)
+{
+  unsigned int i;
+  for (i = 0; i < n; ++i)
+a[i] = 1;
+
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops" } } */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 17415a8..8d7912d 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2539,6 +2539,9 @@ try_create_reduction_list (loop_p loop,
 
   gcc_assert (exit);
 
+  /* Try to get rid of exit phis.  */
+  final_value_replacement_loop (loop);
+
   gather_scalar_reductions (loop, reduction_list);
 
 
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 27630f0..9b33693 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -3417,6 +3417,131 @@ expression_expensive_p (tree expr)
 }
 }
 
+/* Do final value replacement for LOOP.  */
+
+void
+final_value_replacement_loop (struct loop *loop)
+{
+  /* If we do not know exact number of iterations of the loop, we cannot
+ replace the final value.  */
+  edge exit = single_exit (loop);
+  if (!exit)
+return;
+
+  tree niter = number_of_latch_executions (loop);
+  if (niter == chrec_dont_know)
+return;
+
+  /* Ensure that it is possible to insert new statements somewhere.  */
+  if (!single_pred_p (exit->dest))
+split_loop_exit_edge (exit);
+
+  /* Set stmt insertion pointer.  All stmts are inserted before this point.  */
+  gimple_stmt_iterator gsi = gsi_after_labels (exit->dest);
+
+  struct loop *ex_loop
+= superloop_at_depth (loop,
+			  loop_depth (exit->dest->loop_father) + 1);
+
+  gphi_iterator psi;
+  for (psi = gsi_start_phis (exit->dest); !gsi_end_p (psi); )
+{
+  gphi *phi = psi.phi ();
+  tree rslt = PHI_RESULT (phi);
+  tree def = PHI_ARG_DEF_FROM_EDGE (phi, exit);
+  if (virtual_operand_p (def))
+	{
+	  gsi_next ();
+	  continue;
+	}
+
+  if (!POINTER_TYPE_P (TREE_TYPE (def))
+	  && !INTEGRAL_TYPE_P (TREE_TYPE (def)))
+	{
+	  gsi_next ();
+	  continue;
+	}
+
+  bool folded_casts;
+  def = 

Re: Backport important ASan features from upstream.

2015-11-19 Thread Yury Gribov

On 11/19/2015 11:36 AM, Andrew Pinski wrote:

On Nov 19, 2015 12:19 AM, "Maxim Ostapenko" 
wrote:


Hi!

Since the last sanitizer library merge to GCC happened, some new useful

features were applied upstream. In particular, the most significant are:


* The shadow offset for ASan was unified on Aarch64 for 39 and 42 VMAs (

http://reviews.llvm.org/D13782). AFAU, this change wouldn't require any
additional support from compiler side, because the shadow offset is the
same as for 39-bit VMA (36).

Actually until 48 vma is implemented asan is useless for aarch64 and should
not be enabled at all.


Linaro has stated several times that main reason for missing 48-bit 
support is lack of HW. Could Cavium provide them with ThunderX bot?


Best regards,
Yury Gribov



Re: [PATCH] Fix memory leaks in tree-ssa-uninit.c

2015-11-19 Thread Martin Liška

On 11/19/2015 01:50 AM, Joseph Myers wrote:

I don't think all the reformattings here are things we want to do globally
for most source files.  E.g.


@@ -75,18 +74,17 @@ get_mask_first_set_bit (unsigned mask)
  static bool
  has_undefined_value_p (tree t)
  {
-  return (ssa_undefined_value_p (t)
-  || (possibly_undefined_names
-  && possibly_undefined_names->contains (t)));
+  return ssa_undefined_value_p (t)
+|| (possibly_undefined_names
+&& possibly_undefined_names->contains (t));


This seems plain wrong.  The GNU Coding Standards explicitly say that in
such cases where a binary operator goes on the start of the next line, you
should insert extra parentheses so that Emacs will get the indentation
right (even though otherwise parentheses shouldn't be used with "return").
So the old code was better.


  static void
-warn_uninit (enum opt_code wc, tree t, tree expr, tree var,
-const char *gmsgid, void *data, location_t phiarg_loc)
+warn_uninit (enum opt_code wc, tree t, tree expr, tree var, const char *gmsgid,
+void *data, location_t phiarg_loc)


Just because it's OK to go up to an 80- or 79-column limit if necessary
doesn't mean code should be reformatted to do so.  Breaking a bit before
that is perfectly reasonable in general (and in some cases, the choice of
where to break may be based on logical grouping of arguments).


@@ -135,10 +133,9 @@ warn_uninit (enum opt_code wc, tree t, tree expr, tree var,

/* TREE_NO_WARNING either means we already warned, or the front end
   wishes to suppress the warning.  */
-  if ((context
-   && (gimple_no_warning_p (context)
-  || (gimple_assign_single_p (context)
-  && TREE_NO_WARNING (gimple_assign_rhs1 (context)
+  if ((context && (gimple_no_warning_p (context)
+  || (gimple_assign_single_p (context)
+  && TREE_NO_WARNING (gimple_assign_rhs1 (context)


I think in cases such as this, putting the operator && on the start of the
next line to make subsequent lines less-indented makes sense.  Again, the
new formatting isn't incorrect, but that doesn't make it a desirable
change.


@@ -217,24 +212,20 @@ warn_uninitialized_vars (bool warn_possibly_uninitialized)
 so must be limited which means we would miss warning
 opportunities.  */
  use = gimple_vuse (stmt);
- if (use
- && gimple_assign_single_p (stmt)
- && !gimple_vdef (stmt)
+ if (use && gimple_assign_single_p (stmt) && !gimple_vdef (stmt)
  && SSA_NAME_IS_DEFAULT_DEF (use))


I think there may be a preference, where it's necessary to use multiple
lines (if the whole condition doesn't fit on one line) for a condition
like this, for each "&& condition" to go on its own line, rather than some
lines having multiple conditions.


  /* Do not warn if it can be initialized outside this function.  */
- if (TREE_CODE (base) != VAR_DECL
- || DECL_HARD_REGISTER (base)
+ if (TREE_CODE (base) != VAR_DECL || DECL_HARD_REGISTER (base)
  || is_global_var (base))


Likewise.


@@ -2393,15 +2338,28 @@ class pass_late_warn_uninitialized : public 
gimple_opt_pass
  public:
pass_late_warn_uninitialized (gcc::context *ctxt)
  : gimple_opt_pass (pass_data_late_warn_uninitialized, ctxt)
-  {}
+  {
+  }



Hi Joseph.

Agree with you that logical expressions are not formatted by the tool ideally,
human interaction is highly needed. The same rule can be applied to function 
arguments,
where we have mixture of approaches:

a) maximally fill-up a line
b) logically separate group of arguments
c) put each argument on a separate line if it does not fit to the single line

As the patch hasn't been installed, I'm attaching v2, where I preserve original 
grouping of a function
arguments and logic expressions that are made of multiple operands should be 
fine.



Is that really what we want?


The hunk is reverted.



I think those examples are sufficient to illustrate the problems with
automatic conversion from one correct format to another correct format (as
opposed to fixing cases where the formatting is unambiguously incorrect,
which covers plenty of the changes in this patch).



You are right, however as the original coding style was really broken, it was 
much easier
to use the tool and clean-up fall-out.

Waiting for thoughts related to v2.

Thanks,
Martin
>From 17270f8aa982567142aae1dd97238a424e2bf2d6 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 18 Nov 2015 10:20:11 +0100
Subject: [PATCH 1/2] Reformat tree-ssa-uninit.c

---
 gcc/tree-ssa-uninit.c | 1349 -
 1 file changed, 668 insertions(+), 681 deletions(-)

diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 0709cce..c4ed2d7 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -35,16 +35,15 @@ 

Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-19 Thread Martin Sebor

On 11/18/2015 09:26 PM, Jason Merrill wrote:

The rs6000 target was hitting a bootstrap failure due to
-Werror=type-limits.  Since warn_tautological_cmp and other warnings
avoid warning if one of the operands comes from a macro, I thought it
would make sense to do that here as well.


The also disables the warning for functions that are shadowed by
macros such as C atomic_load et al. For example, in the program
below. Is that an acceptable compromise or is there a way to avoid
this?

  #include 

  unsigned foo (unsigned char *x)
  {
if (atomic_load (x) > 1000)
  return 0;
return 1;
  }

At the same time, the change doesn't suppress the warning in other
cases where I would have expected it to suppress it based on your
description. For instance here:

  unsigned short bar (unsigned short x)
  {
  #define X x

if (x > 0x)
  return 0;
return 1;
  }

I noticed there is code elsewhere in c-common.c that avoids
issuing the same warning for system headers (that's the code
that responsible for issuing the warning for the second test
case above).

There is also code in tree-vrp.c that issues it unconditionally
regardless of macros or system headers.

Martin


Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-19 Thread Jeff Law

On 11/19/2015 10:18 AM, Bernd Schmidt wrote:

On 11/19/2015 06:06 PM, Jeff Law wrote:


Frankly, the overall structure of ree is a mess -- I've found it
incredibly difficult to follow every time I've had to look at it.


Yeah, no kidding. The check seems to be in the wrong place - it's done
very late when we're replacing things, and IMO we should just restrict
the candidates for optimization much earlier.
And I didn't make it any cleaner when I hacked it up to handle one more 
case :(I'm hoping that a year away will allow me to look and think 
about the code again without getting a horrid headache.





Also, I want to apply the following. Ok if testing succeeds?


Bernd

clean-ree1.diff


commit ce68938b5150f5d41a54ed317ab97d98461be064
Author: Bernd Schmidt
Date:   Thu Nov 19 17:38:15 2015 +0100

 Readability improvements in ree.c:combine_reaching_defs.

* ree.c (combine_reaching_defs): Simplify code by caching expressions
in variables.

Yes. That's good with me.

jeff



Re: [PATCH] Transactional Memory: Support __cxa_free_exception and fix exception handling.

2015-11-19 Thread Jason Merrill

On 11/18/2015 06:22 PM, Torvald Riegel wrote:

The EH scheme that we had been using for TM / libitm doesn't work
properly.  We fail to handle throwing exceptions whose constructors may
throw themselves.  We also do not clean up properly in all situations
when a transactions abort while being in the process of throwing an
exception.
This patch solves  this particular problem by adding a transactional
wrapper for __cxa_free_exception and changing the EH scheme in libitm.

Follow-up patches will fix other issues that we have identified.  Some
of the changes to the libitm.texi ABI docs added in this patch already
take this future work into account.

Tested using the libitm testsuite on x86_64-linux.

Are the gcc/ bits OK?

 gcc/cp/
 * except.c (do_free_exception): Use transactional wrapper.


OK.

Jason




Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-19 Thread Manuel López-Ibáñez
On 19 November 2015 at 17:54, Jeff Law  wrote:
>> But there were a couple of patches from you some time ago, for
>> example: http://permalink.gmane.org/gmane.comp.gcc.patches/343476
>>
>> What happened with those?
>
> On hold pending fixing the type-limits warning placement.  Essentially that
> has to be untangled first.

Could you elaborate on this? Or point me to some previous email
thread? (I don't have enough free time to follow the mailing list
anymore, sorry).

Cheers,

Manuel.


Openacc reduction tests

2015-11-19 Thread Nathan Sidwell
in adding the complex double support I noticed some existing tests had commented 
out sequences. These were broken tests


a) the multiplication reduction assumed that  0 * 99! was numerically stable in 
the face of arbitrary re-association.  While it did guess that results might 
vary, it used an absolute, rather than relative, epsilon value.


b) A min/max test assumed that that operator is well defined on complex numbers 
-- it's not.


c) Another min/max test failed because the reduction was specified as +

Fixed thusly,

nathan
2015-11-19  Nathan Sidwell  

	* libgomp.oacc-c-c++-common/reduction-dbl.c: New.
	* libgomp.oacc-c-c++-common/reduction-flt.c: New.
	* libgomp.oacc-c-c++-common/reduction-cplx-dbl.c: Use typedef.
	* libgomp.oacc-c-c++-common/reduction-cplx-flt.c: Use typedef.
	* libgomp.oacc-c-c++-common/reduction-2.c: Uncomment broken tests
	and fix.
	* libgomp.oacc-c-c++-common/reduction-3.c: Likewise.
	* libgomp.oacc-c-c++-common/reduction-4.c: Likewise.

Index: testsuite/libgomp.oacc-c-c++-common/reduction-2.c
===
--- testsuite/libgomp.oacc-c-c++-common/reduction-2.c	(revision 230605)
+++ testsuite/libgomp.oacc-c-c++-common/reduction-2.c	(working copy)
@@ -50,39 +50,37 @@ main(void)
 
   if (fabs(result - vresult) > .0001)
 abort ();
-//   result = 0;
-//   vresult = 0;
-// 
-//   /* 'max' reductions.  */
-// #pragma acc parallel vector_length (vl)
-// #pragma acc loop reduction (+:result)
-//   for (i = 0; i < n; i++)
-//   result = result > array[i] ? result : array[i];
-// 
-//   /* Verify the reduction.  */
-//   for (i = 0; i < n; i++)
-//   vresult = vresult > array[i] ? vresult : array[i];
-// 
-//   printf("%d != %d\n", result, vresult);
-//   if (result != vresult)
-// abort ();
-// 
-//   result = 0;
-//   vresult = 0;
-// 
-//   /* 'min' reductions.  */
-// #pragma acc parallel vector_length (vl)
-// #pragma acc loop reduction (+:result)
-//   for (i = 0; i < n; i++)
-//   result = result < array[i] ? result : array[i];
-// 
-//   /* Verify the reduction.  */
-//   for (i = 0; i < n; i++)
-//   vresult = vresult < array[i] ? vresult : array[i];
-// 
-//   printf("%d != %d\n", result, vresult);
-//   if (result != vresult)
-// abort ();
+  result = 0;
+  vresult = 0;
+
+  /* 'max' reductions.  */
+#pragma acc parallel vector_length (vl) copy(result)
+#pragma acc loop reduction (max:result)
+  for (i = 0; i < n; i++)
+result = result > array[i] ? result : array[i];
+
+  /* Verify the reduction.  */
+  for (i = 0; i < n; i++)
+vresult = vresult > array[i] ? vresult : array[i];
+
+  if (result != vresult)
+abort ();
+
+  result = 0;
+  vresult = 0;
+
+  /* 'min' reductions.  */
+#pragma acc parallel vector_length (vl) copy(result)
+#pragma acc loop reduction (min:result)
+  for (i = 0; i < n; i++)
+result = result < array[i] ? result : array[i];
+
+  /* Verify the reduction.  */
+  for (i = 0; i < n; i++)
+vresult = vresult < array[i] ? vresult : array[i];
+
+  if (result != vresult)
+abort ();
 
   result = 5;
   vresult = 5;
Index: testsuite/libgomp.oacc-c-c++-common/reduction-3.c
===
--- testsuite/libgomp.oacc-c-c++-common/reduction-3.c	(revision 230605)
+++ testsuite/libgomp.oacc-c-c++-common/reduction-3.c	(working copy)
@@ -22,15 +22,15 @@ main(void)
   result = 0;
   vresult = 0;
 
-  /* '+' reductions.  */
+  /* 'max' reductions.  */
 #pragma acc parallel vector_length (vl) copy(result)
-#pragma acc loop reduction (+:result)
+#pragma acc loop reduction (max:result)
   for (i = 0; i < n; i++)
-result += array[i];
+result = result > array[i] ? result : array[i];
 
   /* Verify the reduction.  */
   for (i = 0; i < n; i++)
-vresult += array[i];
+vresult = vresult > array[i] ? vresult : array[i];
 
   if (result != vresult)
 abort ();
@@ -38,51 +38,18 @@ main(void)
   result = 0;
   vresult = 0;
 
-  /* '*' reductions.  */
+  /* 'min' reductions.  */
 #pragma acc parallel vector_length (vl) copy(result)
-#pragma acc loop reduction (*:result)
+#pragma acc loop reduction (min:result)
   for (i = 0; i < n; i++)
-result *= array[i];
+result = result < array[i] ? result : array[i];
 
   /* Verify the reduction.  */
   for (i = 0; i < n; i++)
-vresult *= array[i];
+vresult = vresult < array[i] ? vresult : array[i];
 
-  if (fabs(result - vresult) > .0001)
+  if (result != vresult)
 abort ();
-//   result = 0;
-//   vresult = 0;
-// 
-//   /* 'max' reductions.  */
-// #pragma acc parallel vector_length (vl)
-// #pragma acc loop reduction (+:result)
-//   for (i = 0; i < n; i++)
-//   result = result > array[i] ? result : array[i];
-// 
-//   /* Verify the reduction.  */
-//   for (i = 0; i < n; i++)
-//   vresult = vresult > array[i] ? vresult : array[i];
-// 
-//   printf("%d != %d\n", result, vresult);
-//   if (result != vresult)
-//   

Re: [C++ PATCH] Fix unexpanded pack error with auto-deduced functions (PR c++/68396)

2015-11-19 Thread Jason Merrill

On 11/19/2015 09:35 AM, Ryan Burn wrote:

This fixes an issue where the return type of an auto-deduced function gets
classified as a parameter pack if it's called in a decltype expression
inside of a type expansion.


Thanks!  Do you have a copyright assignment on file with the FSF?  This 
patch is small enough not to need one, but it would be good to get it in 
place for future contributions.


I moved the testcase to cpp1y since it's valid C++14 code.

Jason



Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-19 Thread Jeff Law

On 11/19/2015 08:34 AM, Nick Clifton wrote:

Hi Bernd,


I had a look around. There's code testing HARD_REGNO_NREGS in
ree.c:combine_set_extension. It's inside #if 0, and labelled
"temporarily disabled". See if enabling that helps you? (Jeff, that #if
0 was added by you).


I suspect that the code was disabled because it prevented too many
useful optimization opportunities.

The code there would solve this problem, but the approach is is overly
cautious, since it disables the optimization for all extensions that
increase the number of hard registers used.  Some of these will be
viable candidates, provided that the extra hard registers are no used.
(This is certainly true for the RL78, where the (patched) optimization
does improve code, even though the widening does use extra registers).

Nick -- can you pass along your testcode?

Jeff



Re: [C PATCH] Don't leak C_MAYBE_CONST_EXPRs into fold() (PR c/68412)

2015-11-19 Thread Marek Polacek
On Wed, Nov 18, 2015 at 09:14:46PM +, Joseph Myers wrote:
> remove_c_maybe_const_expr doesn't seem to be quite what you want.  Apart 
> from this not being a case covered by the comment on the function, you're 
> ignoring the possibility of the side effects in the 
> C_MAYBE_CONST_EXPR_PRE.  So I think you want something else: if either 
> argument is a C_MAYBE_CONST_EXPR whose C_MAYBE_CONST_EXPR_PRE is non-NULL 
> and has side-effects, don't run the comparison, and otherwise it's OK to 
> go down to the C_MAYBE_CONST_EXPR_EXPR.

Indeed, thanks.  The following variant does what you suggest.  Perhaps
I should've introduced a new helper function, but I couldn't find a
proper name.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-11-19  Marek Polacek  

PR c/68412
* c-typeck.c (parser_build_binary_op): Properly handle
C_MAYBE_CONST_EXPR before calling warn_tautological_cmp.

* gcc.dg/pr68412-2.c: New test.
* gcc.dg/pr68412.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index c18c307..5cb0f7e 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -3512,7 +3512,28 @@ parser_build_binary_op (location_t location, enum 
tree_code code,
   code1, arg1.value, code2, arg2.value);
 
   if (warn_tautological_compare)
-warn_tautological_cmp (location, code, arg1.value, arg2.value);
+{
+  tree lhs = arg1.value;
+  tree rhs = arg2.value;
+  if (TREE_CODE (lhs) == C_MAYBE_CONST_EXPR)
+   {
+ if (C_MAYBE_CONST_EXPR_PRE (lhs) != NULL_TREE
+ && TREE_SIDE_EFFECTS (C_MAYBE_CONST_EXPR_PRE (lhs)))
+   lhs = NULL_TREE;
+ else
+   lhs = C_MAYBE_CONST_EXPR_EXPR (lhs);
+   }
+  if (TREE_CODE (rhs) == C_MAYBE_CONST_EXPR)
+   {
+ if (C_MAYBE_CONST_EXPR_PRE (rhs) != NULL_TREE
+ && TREE_SIDE_EFFECTS (C_MAYBE_CONST_EXPR_PRE (rhs)))
+   rhs = NULL_TREE;
+ else
+   rhs = C_MAYBE_CONST_EXPR_EXPR (rhs);
+   }
+  if (lhs != NULL_TREE && rhs != NULL_TREE)
+   warn_tautological_cmp (location, code, lhs, rhs);
+}
 
   if (warn_logical_not_paren
   && TREE_CODE_CLASS (code) == tcc_comparison
diff --git gcc/testsuite/gcc.dg/pr68412-2.c gcc/testsuite/gcc.dg/pr68412-2.c
index e69de29..be1dcfa 100644
--- gcc/testsuite/gcc.dg/pr68412-2.c
+++ gcc/testsuite/gcc.dg/pr68412-2.c
@@ -0,0 +1,15 @@
+/* PR c/68412 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra" } */
+
+int
+fn1 (int i)
+{
+  return ({ i; }) == ({ i; }); /* { dg-warning "self-comparison always 
evaluates to true" } */
+}
+
+int
+fn2 (int i)
+{
+  return ({ i + 1; }) != ({ i + 1; }); /* { dg-warning "self-comparison always 
evaluates to false" } */
+}
diff --git gcc/testsuite/gcc.dg/pr68412.c gcc/testsuite/gcc.dg/pr68412.c
index e69de29..825eb63 100644
--- gcc/testsuite/gcc.dg/pr68412.c
+++ gcc/testsuite/gcc.dg/pr68412.c
@@ -0,0 +1,41 @@
+/* PR c/68412 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra" } */
+
+#define M (sizeof (int) * __CHAR_BIT__)
+
+int
+fn1 (int i)
+{
+  return i == (-1 << 8); /* { dg-warning "left shift of negative value" } */
+}
+
+int
+fn2 (int i)
+{
+  return i == (1 << M); /* { dg-warning "left shift count" } */
+}
+
+int
+fn3 (int i)
+{
+  return i == 10 << (M - 1); /* { dg-warning "requires" } */
+}
+
+int
+fn4 (int i)
+{
+  return i == 1 << -1; /* { dg-warning "left shift count" } */
+}
+
+int
+fn5 (int i)
+{
+  return i == 1 >> M; /* { dg-warning "right shift count" } */
+}
+
+int
+fn6 (int i)
+{
+  return i == 1 >> -1; /* { dg-warning "right shift count" } */
+}

Marek


Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-19 Thread Bernd Schmidt

On 11/19/2015 07:08 PM, David Malcolm wrote:

gcc_assert terminates the process and no further testing is done,
whereas the approach the kit tries to run as much of the testsuite as
possible, and then fail if any errors occurred.


Yeah, but let's say someone is working on bitmaps and one of the bitmap 
tests fail, it's somewhat unlikely that cfg will also fail (or if it 
does, it'll be a consequence of the earlier failure). You debug the 
issue, fix it, and run cc1 -fself-test again to see if that sorted it out.


As I said, it's a matter of taste and style and I won't claim that my 
way is necessarily the right one, but I do want to see if others feel 
the same.



The patch kit does use a lot of "magic" via macros and C++.

Taking registration/discovery/running in completely the other direction,
another approach could be a completely manual approach, with something
like this in toplev.c:

   bitmap_selftest ();
   et_forest_selftest ();
   /* etc */
   vec_selftest ();

This has the advantage of being explicit, and the disadvantage of
requiring a bit more typing.
(possibly passing in a "selftest *" param if we're doing the
try-to-run-everything approach, so we can count failures etc without
introducing globals)

Was that what you had in mind, or something else?


It's one option, but it doesn't seem like the best one either. I was 
thinking of something not dissimilar to your approach, but with fewer 
bells and whistles. My class registrator would look something like this:


static list test_callbacks;

class registrator
{
public:
  registrator (void (*)() cb)
  {
test_callbacks.push_front (cb);
  }
}

(or use a vec if you can do that at global constructor time)

and then you just walk the list and run the callbacks when you want to 
run tests. The one you have implements both the registration and a 
special case linked list, which just doesn't look right to me, and I 
think I'd also have avoided the runner class.



Bernd


Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-19 Thread Jeff Law

On 11/19/2015 10:48 AM, Manuel López-Ibáñez wrote:

On 19 November 2015 at 17:09, Jeff Law  wrote:

The even longer term direction for this code is to separate out the
type-limits warning from the canonicalization and shortening.  I've got a
blob of code form Kai that goes in that direction, but it needs more
engineering around it.

Ideally the canonicalization/shortening moves into match.pd.  The warning,
in theory, moves out of the front-ends as well.


The last attempt by Kai was here:
https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01265.html
Right.  THe problem is there wasn't any real information on how that 
affected code generation.





But there were a couple of patches from you some time ago, for
example: http://permalink.gmane.org/gmane.comp.gcc.patches/343476

What happened with those?
On hold pending fixing the type-limits warning placement.  Essentially 
that has to be untangled first.




There is also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712 It
would be great if that was eventually fixed in the new delay-folding
world.

Noted.

jeff



Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-19 Thread Mike Stump
On Nov 19, 2015, at 10:08 AM, David Malcolm  wrote:
> gcc_assert terminates the process and no further testing is done,
> whereas the approach the kit tries to run as much of the testsuite as
> possible, and then fail if any errors occurred.

Running as much as possible is desirable over stopping at the first failure.  I 
personally like the magic registration.  Not sure if we need the qsort (they 
can be run in registration order I think).



Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-19 Thread Mikhail Maltsev
On 11/19/2015 09:44 PM, Bernd Schmidt wrote:
> It's one option, but it doesn't seem like the best one either. I was
> thinking of something not dissimilar to your approach, but with fewer
> bells and whistles. My class registrator would look something like this:
> 
> static list test_callbacks;
> 
> class registrator
> {
> public:
>   registrator (void (*)() cb)
>   {
> test_callbacks.push_front (cb);
>   }
> }
> 
> (or use a vec if you can do that at global constructor time)
> 
> and then you just walk the list and run the callbacks when you want to
> run tests. The one you have implements both the registration and a
> special case linked list, which just doesn't look right to me, and I
> think I'd also have avoided the runner class.

Google test allows to run tests which match a given regular expression.
This is very convenient when you debug a failing test: you just need to
run the testsuite under debugger and specify the relevant (failing) test
as a filter. I think this feature is worth implementing eventually
(maybe regex is an overkill, and matching against a substring will be
enough). So, if testcases are implemented as objects, it will easy to
add filtering. If testcases are just callbacks, names and possibly some
other metainformation need to be stored separately.

FWIW, I worked with other unit test frameworks, such as CppUnit (but
Google Test is superior, IMHO) and they use the same approach: testcases
are objects which store metainformation (such as name) and have methods
for running tests.

-- 
Regards,
Mikhail Maltsev


Re: [PATCH] Fix ICE with mangling aliases (PR c++/67354, take 2)

2015-11-19 Thread Jason Merrill

On 11/19/2015 07:40 AM, Jakub Jelinek wrote:

@@ -4502,6 +4509,7 @@ c_parse_final_cleanups (void)

locus_at_end_of_parsing = input_location;
at_eof = 1;
+  defer_mangling_aliases = false;


Let's clear this in generate_mangling_aliases rather than here.  OK with 
that change.


Jason



Re: [PATCH 1/2] [graphite] Move codegen related functions to graphite-isl-ast-to-gimple.c

2015-11-19 Thread Sebastian Pop
Fixed in r230625.

David, please test on your systems, we were not able to reproduce the fails on
x86_64-linux: the linker does optimize away the functions that are not used.

Thanks,
Sebastian

Aditya K wrote:
> Thanks for the update. I'll fix that asap.
> 
> 
> -Aditya
> 
> 
> 
> > Date: Thu, 19 Nov 2015 08:36:58 -0500
> > Subject: Re: [PATCH 1/2] [graphite] Move codegen related functions to 
> > graphite-isl-ast-to-gimple.c
> > From: dje@gmail.com
> > To: hiradi...@msn.com; aditya...@samsung.com; s@samsung.com
> > CC: gcc-patches@gcc.gnu.org
> >
> > This patch broke bootstrap when ISL is not enabled.
> >
> > graphite-isl-ast-to-gimple.c is protected by HAVE_isl but
> > get_false_edge_from_guard_bb() is used outside of Graphite, including
> > sese.c, which is not restricted to HAVE_isl.
> >
> > Please fix.
> >
> > Thanks, David
> 


Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-19 Thread Jason Merrill

On 11/15/2015 12:01 AM, David Malcolm wrote:

As with the C frontend, there's an issue with tree nodes that
don't have locations: VAR_DECL, INTEGER_CST, etc:

   int test (int foo)
   {
 return foo * 100;
^^^   ^^^
   }

where we'd like to access the source spelling ranges of the expressions
during parsing, so that we can use them when reporting parser errors.


Hmm, I had been thinking to address this in the C++ front end by 
wrapping uses in another tree: NOP_EXPR for rvalues, VIEW_CONVERT_EXPR 
for lvalues.


Jason



[PATCH] (Partial) Implementation of simplificaiton of CSHIFT

2015-11-19 Thread Steve Kargl
The attached patch provides a partial implementation for
the simplification for CSHIFT.  It is partial in that it
only applies to rank 1 arrays.  For arrays with rank > 1,
gfc_simplify_cshift will issue an error.  Here, the intent
is that hopefully someone that knows what they are doing
with supply a patch for rank > 1.

The meat of the patch for rank = 1 may not be the most
efficient.  It copies the array elements from 'a' to 'result'
in the circularly shifted order.  It inefficiently always
starts with the first element in 'a' to find the candidate
element for next 'result' element.

  cr = gfc_constructor_first (result->value.constructor);
  for (i = 0; i < sz; i++, cr = gfc_constructor_next (cr))
{
  j = (i + shft) % sz;
  ca = gfc_constructor_first (a->value.constructor);
  while (j-- > 0)
ca = gfc_constructor_next (ca);
  cr->expr = gfc_copy_expr (ca->expr);
}

As the values are storied in a splay tree, there may be
a more efficient way to split the splay and recombine
it into a new.

Anyway, I would like to commit the attached patch.
Built and tested on x86_64-*-freebsd?

2015-11-19  Steven G. Kargl  

* intrinsic.h: Prototype for gfc_simplify_cshift
* intrinsic.c (add_functions): Use gfc_simplify_cshift.
* simplify.c (gfc_simplify_cshift): Implement simplification of CSHIFT.
(gfc_simplify_spread): Remove a FIXME and add error condition.
 
2015-11-19  Steven G. Kargl  

* gfortran.dg/simplify_cshift_1.f90: New test.

-- 
Steve
Index: gcc/fortran/intrinsic.c
===
--- gcc/fortran/intrinsic.c	(revision 230585)
+++ gcc/fortran/intrinsic.c	(working copy)
@@ -1659,9 +1659,11 @@ add_functions (void)
 
   make_generic ("count", GFC_ISYM_COUNT, GFC_STD_F95);
 
-  add_sym_3 ("cshift", GFC_ISYM_CSHIFT, CLASS_TRANSFORMATIONAL, ACTUAL_NO, BT_REAL, dr, GFC_STD_F95,
-	 gfc_check_cshift, NULL, gfc_resolve_cshift,
-	 ar, BT_REAL, dr, REQUIRED, sh, BT_INTEGER, di, REQUIRED,
+  add_sym_3 ("cshift", GFC_ISYM_CSHIFT, CLASS_TRANSFORMATIONAL, ACTUAL_NO,
+	 BT_REAL, dr, GFC_STD_F95,
+	 gfc_check_cshift, gfc_simplify_cshift, gfc_resolve_cshift,
+	 ar, BT_REAL, dr, REQUIRED,
+	 sh, BT_INTEGER, di, REQUIRED,
 	 dm, BT_INTEGER, ii, OPTIONAL);
 
   make_generic ("cshift", GFC_ISYM_CSHIFT, GFC_STD_F95);
Index: gcc/fortran/intrinsic.h
===
--- gcc/fortran/intrinsic.h	(revision 230585)
+++ gcc/fortran/intrinsic.h	(working copy)
@@ -271,6 +271,7 @@ gfc_expr *gfc_simplify_conjg (gfc_expr *
 gfc_expr *gfc_simplify_cos (gfc_expr *);
 gfc_expr *gfc_simplify_cosh (gfc_expr *);
 gfc_expr *gfc_simplify_count (gfc_expr *, gfc_expr *, gfc_expr *);
+gfc_expr *gfc_simplify_cshift (gfc_expr *, gfc_expr *, gfc_expr *);
 gfc_expr *gfc_simplify_dcmplx (gfc_expr *, gfc_expr *);
 gfc_expr *gfc_simplify_dble (gfc_expr *);
 gfc_expr *gfc_simplify_digits (gfc_expr *);
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c	(revision 230585)
+++ gcc/fortran/simplify.c	(working copy)
@@ -1789,6 +1789,88 @@ gfc_simplify_count (gfc_expr *mask, gfc_
 
 
 gfc_expr *
+gfc_simplify_cshift (gfc_expr *array, gfc_expr *shift, gfc_expr *dim)
+{
+  gfc_expr *a;
+
+  a = gfc_copy_expr (array);
+
+  switch (a->expr_type)
+{
+  case EXPR_VARIABLE:
+  case EXPR_ARRAY:
+	gfc_simplify_expr (a, 0);
+	if (!is_constant_array_expr (a))
+	  {
+	gfc_free_expr (a);
+	return NULL;
+	  }
+	break;
+  default:
+	gcc_unreachable ();
+}
+
+  if (a->rank == 1)
+{
+  gfc_constructor *ca, *cr;
+  gfc_expr *result;
+  mpz_t size;
+  int i, j, shft, sz;
+
+  if (!gfc_is_constant_expr (shift))
+	return NULL;
+
+  shft = mpz_get_si (shift->value.integer);
+
+  /* Special case: rank 1 array with no shift!  */
+  if (shft == 0)
+	return a;
+
+  /*  Case (i):  If ARRAY has rank one, element i of the result is
+	  ARRAY (1 + MODULO (i + SHIFT ­ 1, SIZE (ARRAY))).  */
+
+  result = gfc_copy_expr (a);
+  mpz_init (size);
+  gfc_array_size (a, );
+  sz = mpz_get_si (size);
+  mpz_clear (size);
+  shft = shft < 0 ? 1 - shft : shft;
+  cr = gfc_constructor_first (result->value.constructor);
+  for (i = 0; i < sz; i++, cr = gfc_constructor_next (cr))
+	{
+	  j = (i + shft) % sz;
+	  ca = gfc_constructor_first (a->value.constructor);
+	  while (j-- > 0)
+	ca = gfc_constructor_next (ca);
+	  cr->expr = gfc_copy_expr (ca->expr);
+	}
+
+  gfc_free_expr (a);
+  return result;
+}
+  else
+{
+  int dm;
+
+  if (dim)
+	{
+	  if (!gfc_is_constant_expr (dim))
+	return NULL;
+
+	  dm = mpz_get_si (dim->value.integer);
+	}
+  else
+	dm = 1;
+
+  gfc_error ("Simplification of CSHIFT with an array with rank > 1 "

Re: [PATCH][RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves

2015-11-19 Thread Bernd Schmidt

I1 is def_insn, I3 is cand->insn. tmp_reg is 'ax'. What we want to do
is reject this transformation
because the destination of def_insn (aka I1), that is 'ax', is not the
operand of the extend operation
in cand->insn (aka I3). As you said, rtx_equal won't work on just
SET_SRC (PATTERN (cand->insn)) because
it's an extend operation. So reg_overlap_mentioned should be appropriate.


Yeah, so just use the src_reg variable for the comparison. I still don't 
see why you wouldn't want to use the stronger test. But the whole thing 
still feels not completely ideal somehow, so after reading through ree.c 
for a while and getting a better feeling for how it works, I think the 
following (which you said is equivalent) would be the most 
understandable and direct fix.


You said that the two tests should be equivalent, and I agree. I've not 
found cases where the change makes a difference, other than the 
testcase. Would you mind running this version through the testsuite and 
committing if it passes?


I've shrunk the comment; massive explanations like this for every bug 
are inappropriate IMO, and the example also duplicates an earlier 
comment in the same function. And, as I said earlier, the way you placed 
the comment is confusing because only one part of the following if 
statement is related to it.



Bernd
diff --git a/gcc/ree.c b/gcc/ree.c
index 4550cc3..2c9d4d6 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -772,6 +772,12 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
   if (state->defs_list.length () != 1)
 	return false;
 
+  /* We don't have the structure described above if there are
+	 conditional moves in between the def and the candidate,
+	 and we will not handle them correctly.  See PR68194.  */
+  if (state->copies_list.length () > 0)
+	return false;
+
   /* We require the candidate not already be modified.  It may,
 	 for example have been changed from a (sign_extend (reg))
 	 into (zero_extend (sign_extend (reg))).


Re: [PATCH 0/5] Add support -mcpu=thunderxt88pass1 and needed changes to support that

2015-11-19 Thread Andrew Pinski
On Thu, Nov 19, 2015 at 4:08 AM, Kyrill Tkachov  wrote:
> Hi Andrew,
>
> On 17/11/15 22:10, Andrew Pinski wrote:
>>
>> To Add support for -mcpu=thunderxt88pass1, I needed to fix up a few
>> things in the support for -mcpu=native.  First was I wanted to do the same
>> cleanup that was done for some of the other .def files and move the
>> #undef into the .def files instead of the .h files.
>> Second to make it easier to understand the implemention_id and part
>> numbers
>> are really numbers, move them over to use integer to compare against
>> instead of strings which allows for easier comparisons later on.
>> Third fix the way we compare imp and part num; that is instead of finding
>> the part number and then comparing the imp, we find both numbers first
>> and then search for the ones that match.
>> Add a comment to the aarch64-cores.def file in front of each group of
>> cores
>> to signify the groups of company's cores.
>> And all of this allows for adding variant for the check of the cores
>> which allows for thunderxt88pass1 to be added.
>>
>> The reason why thunderxt88pass1 is seperate from thunderx is because
>> thunderx is changed to be an ARMv8.1 arch core while thunderxt88pass1
>> is still an ARMv8 arch core.
>>
>> I tested each of these patches seperately.
>
>
> I tried these patches out on a big.LITTLE system with 2 Cortex-A57 and 4
> Cortex-A53
> cores and the code doesn't detect it properly anymore. Can you have a look
> please?
> The Linux /proc/cpuinfo for that system looks like what's described here:
> https://lists.linaro.org/pipermail/cross-distro/2014-October/000750.html
> Look for the entry "[3] arm64, v3.17 + this patch, Juno platform "
> You can put that into a file, point the fopen call in driver-aarch64.c to
> that file
> and use that to debug the issue.

It is a simple fix, valid_bL_core_p  should be checking both orders of
the core as I thought it ordering the BIG as first always.  Like:
static bool
valid_bL_core_p (unsigned int *core, unsigned int bL_core)
{
  return AARCH64_BIG_LITTLE (core[0], core[1]) == bL_core
 || AARCH64_BIG_LITTLE (core[1], core[0]) == bL_core;
}


Thanks,
Andrew


>
> Thanks,
> Kyrill
>
>
>> Ok for the trunk even though I missed out on stage 1?
>>
>> Thanks,
>> Andrew
>>
>> Andrew Pinski (5):
>>[AARCH64]: Move #undef into .def files.
>>[AARCH64] Change IMP and PART over to integers from strings.
>>[AARCH64] Fix part num and implement indendent.
>>{AARCH64] Add comment for the company's cores.
>>[AARCH64] Add variant support to -m*=native and add thunderxt88pass1.
>>
>>   gcc/common/config/aarch64/aarch64-common.c   |   5 +-
>>   gcc/config/aarch64/aarch64-arches.def|   1 +
>>   gcc/config/aarch64/aarch64-cores.def |  43 --
>>   gcc/config/aarch64/aarch64-fusion-pairs.def  |   1 +
>>   gcc/config/aarch64/aarch64-option-extensions.def |   2 +
>>   gcc/config/aarch64/aarch64-opts.h|   4 +-
>>   gcc/config/aarch64/aarch64-protos.h  |   4 -
>>   gcc/config/aarch64/aarch64-tune.md   |   2 +-
>>   gcc/config/aarch64/aarch64-tuning-flags.def  |   1 +
>>   gcc/config/aarch64/aarch64.c |   7 +-
>>   gcc/config/aarch64/aarch64.h |   3 +-
>>   gcc/config/aarch64/driver-aarch64.c  | 169
>> +--
>>   12 files changed, 136 insertions(+), 106 deletions(-)
>>
>


msp430: fix alignment in multiply

2015-11-19 Thread DJ Delorie

Minor fix, committed.

2015-11-19  DJ Delorie  

* config/msp430/lib2hw_mul.S: Fix alignment.

Index: libgcc/config/msp430/lib2hw_mul.S
===
--- libgcc/config/msp430/lib2hw_mul.S   (revision 230632)
+++ libgcc/config/msp430/lib2hw_mul.S   (working copy)
@@ -19,13 +19,13 @@
 ; a copy of the GCC Runtime Library Exception along with this program;
 ; see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 ; .
 
 .macro start_func name
.pushsection .text.\name,"ax",@progbits
-   .align 2
+   .p2align 1
.global \name
.type \name , @function
 \name:
PUSH.W  sr  ; Save current interrupt state
DINT; Disable interrupts
NOP ; Account for latency


Re: [PATCH] (Partial) Implementation of simplificaiton of CSHIFT

2015-11-19 Thread Steve Kargl
On Thu, Nov 19, 2015 at 04:58:36PM -0800, Steve Kargl wrote:
> +  else
> +{
> +  int dm;
> +
> +  if (dim)
> + {
> +   if (!gfc_is_constant_expr (dim))
> + return NULL;
> +
> +   dm = mpz_get_si (dim->value.integer);
> + }
> +  else
> + dm = 1;
> +
> +  gfc_error ("Simplification of CSHIFT with an array with rank > 1 "
> +  "no yet support");
> +}
> +

To save some time, the dim portion of the patch isn't
correct.  dim can be scalar or rank 1 array.  I'll
#if 0 ... #endif this section unless I persevere with
the rank > 1 case.

-- 
Steve


[patch] Python Pretty Printers get disabled on libstdc++ reload by GDB (PR libstdc++/68448)

2015-11-19 Thread Jan Kratochvil
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68448
https://bugzilla.redhat.com/show_bug.cgi?id=1279406

echo -e '#include \nint main(){std::vector l;\nreturn 0;}'|g++ -g 
-Wall -x c++ -;gdb -q ./a.out -batch -ex 'b 3' -ex r -ex 'p l' -ex r -ex 'p l'

Actual:
[...]
$1 = std::vector of length 0, capacity 0
[...]
$2 = {> = {_M_impl = 
{ = {<__gnu_cxx::new_allocator> = {}, 
}, _M_start = 0x0, _M_finish = 0x0, _M_end_of_storage = 0x0}}, 
}

Expected:
[...]
$1 = std::vector of length 0, capacity 0
[...]
$2 = std::vector of length 0, capacity 0


Thanks,
Jan
* python/hook.in: Call register_libstdcxx_printers.
* python/libstdcxx/v6/__init__.py: Wrap it to
register_libstdcxx_printers.

diff --git a/libstdc++-v3/python/hook.in b/libstdc++-v3/python/hook.in
index 07fe4c6..703b6a7 100644
--- a/libstdc++-v3/python/hook.in
+++ b/libstdc++-v3/python/hook.in
@@ -55,4 +55,7 @@ if gdb.current_objfile () is not None:
 if not dir_ in sys.path:
 sys.path.insert(0, dir_)
 
-import libstdcxx.v6
+# Call a function as a plain import would not execute body of the included file
+# on repeated reloads of this object file.
+from libstdcxx.v6 import register_libstdcxx_printers
+register_libstdcxx_printers(gdb.current_objfile())
diff --git a/libstdc++-v3/python/libstdcxx/v6/__init__.py 
b/libstdc++-v3/python/libstdcxx/v6/__init__.py
index de3aa72..d8e060c 100644
--- a/libstdc++-v3/python/libstdcxx/v6/__init__.py
+++ b/libstdc++-v3/python/libstdcxx/v6/__init__.py
@@ -15,10 +15,6 @@
 
 import gdb
 
-# Load the pretty-printers.
-from .printers import register_libstdcxx_printers
-register_libstdcxx_printers(gdb.current_objfile())
-
 # Load the xmethods if GDB supports them.
 def gdb_has_xmethods():
 try:
@@ -27,6 +23,11 @@ def gdb_has_xmethods():
 except ImportError:
 return False
 
-if gdb_has_xmethods():
-from .xmethods import register_libstdcxx_xmethods
-register_libstdcxx_xmethods(gdb.current_objfile())
+def register_libstdcxx_printers(obj):
+# Load the pretty-printers.
+from .printers import register_libstdcxx_printers
+register_libstdcxx_printers(obj)
+
+if gdb_has_xmethods():
+from .xmethods import register_libstdcxx_xmethods
+register_libstdcxx_xmethods(obj)


Re: vector lightweight debug mode

2015-11-19 Thread François Dumont
Thanks for the explanation, it is much clearer.

So here is a the more limited patch.

Tested under Linux x86_64.

Ok to commit ?

François


On 18/11/2015 13:27, Jonathan Wakely wrote:
> On 17/11/15 20:49 +0100, François Dumont wrote:
>> On 16/11/2015 11:29, Jonathan Wakely wrote:
>>> Not doing the checks is also an option. That minimizes the cost :-)
>>
>> This is controlled by a macro, users already have this option.
>
> True, but we're talking about maybe enabling these checks by default
> when building linux distributions.
>
>>>
>>> For the full debug mode we want to check everything we can, and accept
>>> that has a cost.
>>>
>>> For the lightweight one we need to evaluate the relative benefits. Is
>>> it worth adding checks for errors that only happen rarely? Does the
>>> benefit outweigh the cost?
>>>
>>> I'm still not convinced that's the case for the "valid range" checks.
>>> I'm willing to be convinced, but am not convinced yet.
>>
>> Ok so I will remove this check. And what about insert position check ? I
>> guess this one too so I will remove it too. Note that will only remain
>> checks on the most basic operations that is to say those on which the
>> check will have the biggest impact proportionally.
>
> Yes, that's a good point.
>
> But my unproven assumption is that it's more common to use operator[]
> incorrectly, rather than pass invalid iterators to range insert, which
> is a relatively "advanced" operation.
>
>
>> I would like we push the simplest version so that people can start
>> experimenting.
>>
>> I would also prefer concentrate on _GLIBCXX_DEBUG mode :-)
>>
>>>
It would be great to have it for gcc 6.0. I am working on the same
 for other containers.
>>>
>>> Please don't do the valid range checks for std::deque, the checks are
>>> undefined for iterators into different containers and will not give a
>>> reliable answer.
>>
>> But debug mode is full of those checks, no ?
>
> They're supposed to be guarded by checks for _M_can_compare, if they
> aren't that's a regression. For debug mode we can tell whether two
> iterators are comparable, because they store a pointer back to their
> parent container. We can't check that in normal mode.
>
>

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 305d446..3b17fb4 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -63,6 +63,8 @@
 #include 
 #endif
 
+#include 
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
@@ -320,7 +322,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   vector(const vector& __x)
   : _Base(__x.size(),
 	_Alloc_traits::_S_select_on_copy(__x._M_get_Tp_allocator()))
-  { this->_M_impl._M_finish =
+  {
+	this->_M_impl._M_finish =
 	  std::__uninitialized_copy_a(__x.begin(), __x.end(),
   this->_M_impl._M_start,
   _M_get_Tp_allocator());
@@ -340,7 +343,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /// Copy constructor with alternative allocator
   vector(const vector& __x, const allocator_type& __a)
   : _Base(__x.size(), __a)
-  { this->_M_impl._M_finish =
+  {
+	this->_M_impl._M_finish =
 	  std::__uninitialized_copy_a(__x.begin(), __x.end(),
   this->_M_impl._M_start,
   _M_get_Tp_allocator());
@@ -470,7 +474,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   vector&
   operator=(initializer_list __l)
   {
-	this->assign(__l.begin(), __l.end());
+	this->_M_assign_aux(__l.begin(), __l.end(),
+			random_access_iterator_tag());
 	return *this;
   }
 #endif
@@ -532,7 +537,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   void
   assign(initializer_list __l)
-  { this->assign(__l.begin(), __l.end()); }
+  {
+	this->_M_assign_aux(__l.begin(), __l.end(),
+			random_access_iterator_tag());
+  }
 #endif
 
   /// Get a copy of the memory allocation object.
@@ -694,7 +702,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   resize(size_type __new_size, const value_type& __x)
   {
 	if (__new_size > size())
-	  insert(end(), __new_size - size(), __x);
+	  _M_fill_insert(end(), __new_size - size(), __x);
 	else if (__new_size < size())
 	  _M_erase_at_end(this->_M_impl._M_start + __new_size);
   }
@@ -714,7 +722,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   resize(size_type __new_size, value_type __x = value_type())
   {
 	if (__new_size > size())
-	  insert(end(), __new_size - size(), __x);
+	  _M_fill_insert(end(), __new_size - size(), __x);
 	else if (__new_size < size())
 	  _M_erase_at_end(this->_M_impl._M_start + __new_size);
   }
@@ -778,7 +786,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   reference
   operator[](size_type __n) _GLIBCXX_NOEXCEPT
-  { return *(this->_M_impl._M_start + __n); }
+  {
+	__glibcxx_requires_subscript(__n);
+	return *(this->_M_impl._M_start + __n);
+  }
 
   /**
*  @brief  Subscript access to the data 

Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-19 Thread David Malcolm
On Thu, 2015-11-19 at 18:35 +0100, Bernd Schmidt wrote:
> In general I'm much happier with this approach, and I think this series 
> is close to ready, but I want to bring up some questions that could use 
> wider discussion.

> > This patch adds a selftest.h/.c to gcc, with an API loosely
> > modelled on gtest (though without the use of CamelCase): it
> > supports enough of the gtest API to enable the tests that I
> > wrote to run with minimal changes.
> 
> Here there's a question of style. I don't want to approve or reject this 
> just now, I'd like to hear what others think. To my eyes this still 
> looks rather seriously overengineered. Plain gcc_assert and if (cond) 
> abort (); would work just fine for the tests IMO, it's what we have for 
> regular testcases where we don't distinguish between all sorts of 
> microscopic subtests, and any gcc_assert failure is easy enough to 
> debug, just load up cc1 into the debugger and run. I don't think we need 
> output for tests that are really just expected to pass always, all we 
> need is the build to stop if an internal error is detected.

gcc_assert terminates the process and no further testing is done,
whereas the approach the kit tries to run as much of the testsuite as
possible, and then fail if any errors occurred.

Given that the aim is for something that runs very quickly in stage1 and
should rarely fail, this distinction may be irrelevant, though maybe the
latter approach is better for the case where *something* has broken a
lot of tests and you want to see what the lowest level failures are,
rather than just the first one that broke.

Consider the case of something being tested on, say x86_64, that works
fine there, but which breaks some of the selftests on AIX (perhaps the
selftest failure is indicating a genuine problem, or perhaps the
selftest is over-specified).  I believe it would be preferable to have
some sort of report on the scope of the breakage, rather than require
each test to be fixed before the self-test can continue.

> If I'd written it I'd also have used a somewhat lower-tech approach for 
> the registration and running of tests, but once again I'd like to hear 
> from others.

The patch kit does use a lot of "magic" via macros and C++.

Taking registration/discovery/running in completely the other direction,
another approach could be a completely manual approach, with something
like this in toplev.c:

  bitmap_selftest ();
  et_forest_selftest ();
  /* etc */
  vec_selftest ();

This has the advantage of being explicit, and the disadvantage of
requiring a bit more typing.  I was going to add that there might be
more risk of adding a test and not having it called, but I suspect that
the "function declared but not used" warning would protect us from that.
(there's also the loss of the ability to e.g. randomize the order in
which the tests get run, but that's less important).

(possibly passing in a "selftest *" param if we're doing the
try-to-run-everything approach, so we can count failures etc without
introducing globals)

Was that what you had in mind, or something else?

> For things like
> 
> > +#define RUN_ALL_TESTS() \
> > +  ::selftest::run_all_tests ()
> 
> I don't see the point of the macro. 

FWIW, this was for compatibility with the gtest API, but yeah, it's
redundant.

>Also, in [8/15]
> 
> > +class gimple_test : public ::selftest::test
> > +{
> > + protected:
> > +  void
> > +  verify_gimple_pp (const char *expected, gimple *stmt)
> > +  {
> > +pretty_printer pp;
> > +pp_gimple_stmt_1 (, stmt, 0 /* spc */, 0 /* flags */);
> > +EXPECT_STREQ (expected, pp_formatted_text ());
> > +  }
> > +};
> > +
> 
> Why have the class rather than just a function? This sort of thing makes 
> me go "overuse of C++".

It's useful if a test has state: the test functions are methods of a
class, so the state can live as instance data, and hence the helper
functions in the fixtures can get at it.  But this could be achieved in
other ways that might be more explicit, and use less macro chicanery.
(also, the result reporting relies on the test code being a method of
a ::selftest::test subclass so it's useful to write fixtures as
subclasses to inherit from, but yeah, this could be overengineering it).


Thanks
Dave



Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-19 Thread Jeff Law

On 11/18/2015 07:15 AM, Nick Clifton wrote:

Hi Guys,

   I recently discovered a bug in the current Redundant Extension
   Elimination optimization.  If the candidate extension instruction
   increases the number of hard registers used, the pass does not check
   to see if these extra registers are live between the definition and
   the extension.

   For example:

 (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
 (insn  45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
 [...]
 (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
 [...]
 (insn  88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))

   (This is on the RL78 target where HImode values occupy two hard
   registers and QImode values only one.  The bug however is generic, not
   RL78 specific).
Right.  I can recall walking though multi-reg cases on my whiteboard. 
There was one particular case Jakub was concerned about in this space, 
but I couldn't contrive a testcase to trigger and I was getting to the 
point where I was struggling to be able to reason about the behaviour of 
the code.







   The REE pass transforms this into:

 (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
 (insn  45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18
 [...]
 (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
 [...]
 (insn  88 deleted)

   Note how the new set at insn 45 clobbers the value loaded by insn 44
   into r11.

Right.



   The patch below is my attempt to fix this.  It is not correct
   however.  I could not work out how to determine if a given hard
   register is live at any point between two insns.  So instead I used
   the liveness information associated with the basic block containing
   the definition.  If one of the extended registers is live or becomes
   live in this block, and this block is not the same block as the one
   containing the extension insn, then I stop the optimization.  This
   works for the RL78 for all of the cases that I could find.
The way this pass has dealt with this class of problems is the various 
routines in rtlanal.c  reg_used_between_p, reg_set_between_p and the like.





   So ... comments please ?

   Will gcc ever generate a single basic block that contains both the
   definition and the extension instructions, but also where the extra
   hard registers are used for another purpose as well ?

I can't see any reason why it would not.


jeff


Re: [C PATCH] Don't leak C_MAYBE_CONST_EXPRs into fold() (PR c/68412)

2015-11-19 Thread Joseph Myers
On Thu, 19 Nov 2015, Marek Polacek wrote:

> On Wed, Nov 18, 2015 at 09:14:46PM +, Joseph Myers wrote:
> > remove_c_maybe_const_expr doesn't seem to be quite what you want.  Apart 
> > from this not being a case covered by the comment on the function, you're 
> > ignoring the possibility of the side effects in the 
> > C_MAYBE_CONST_EXPR_PRE.  So I think you want something else: if either 
> > argument is a C_MAYBE_CONST_EXPR whose C_MAYBE_CONST_EXPR_PRE is non-NULL 
> > and has side-effects, don't run the comparison, and otherwise it's OK to 
> > go down to the C_MAYBE_CONST_EXPR_EXPR.
> 
> Indeed, thanks.  The following variant does what you suggest.  Perhaps
> I should've introduced a new helper function, but I couldn't find a
> proper name.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2015-11-19 Thread Evandro Menezes

On 11/05/2015 02:51 PM, Evandro Menezes wrote:

2015-11-05  Evandro Menezes 

   gcc/

   * config/aarch64/aarch64.c (aarch64_override_options_internal):
   Increase loop peeling limit.

This patch increases the limit for the number of peeled insns. With 
this change, I noticed no major regression in either Geekbench v3 or 
SPEC CPU2000 while some benchmarks, typically FP ones, improved 
significantly.


I tested this tuning on Exynos M1 and on A57.  ThunderX seems to 
benefit from this tuning too.  However, I'd appreciate comments from 
other stakeholders.


Ping.

--
Evandro Menezes



Re: PATCH to shorten_compare -Wtype-limits handling

2015-11-19 Thread Jason Merrill

On 11/19/2015 02:44 PM, Martin Sebor wrote:

On 11/18/2015 09:26 PM, Jason Merrill wrote:

The rs6000 target was hitting a bootstrap failure due to
-Werror=type-limits.  Since warn_tautological_cmp and other warnings
avoid warning if one of the operands comes from a macro, I thought it
would make sense to do that here as well.


The also disables the warning for functions that are shadowed by
macros such as C atomic_load et al. For example, in the program
below. Is that an acceptable compromise or is there a way to avoid
this?


I think it's an acceptable compromise, but see below.


At the same time, the change doesn't suppress the warning in other
cases where I would have expected it to suppress it based on your
description. For instance here:

   unsigned short bar (unsigned short x)
   {
   #define X x

 if (x > 0x)


Yes, this is missed because the front end doesn't remember the location 
of the use of x; that's one of many location tracking issues.  David 
Malcolm is working on this stuff.



I noticed there is code elsewhere in c-common.c that avoids
issuing the same warning for system headers (that's the code
that responsible for issuing the warning for the second test
case above).


Hmm, it looks like using expansion_point_if_in_system_header might avoid 
the first issue you mention.



There is also code in tree-vrp.c that issues it unconditionally
regardless of macros or system headers.


Good to know.

Jason



Re: [PATCH 4/4][AArch64] Add cost model for Exynos M1

2015-11-19 Thread Evandro Menezes

On 11/05/2015 06:09 PM, Evandro Menezes wrote:

2015-10-25  Evandro Menezes 

   gcc/

   * config/aarch64/aarch64-cores.def: Use the Exynos M1 cost model.
   * config/aarch64/aarch64.c (exynosm1_addrcost_table): New 
variable.

   (exynosm1_regmove_cost): Likewise.
   (exynosm1_vector_cost): Likewise.
   (exynosm1_tunings): Likewise.
   * config/arm/aarch-cost-tables.h (exynosm1_extra_costs): Likewise.
   * config/arm/arm.c (arm_exynos_m1_tune): Likewise.

This patch adds the cost model for Exynos M1.  This patch depends on a 
couple of previous patches though, 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00505.html and 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00538.html


Please, commit if it's alright.


Ping.

--
Evandro Menezes



Re: [PATCH 3a/4][AArch64] Add attribute for compatibility with ARM pipeline models

2015-11-19 Thread Evandro Menezes

On 11/12/2015 11:32 AM, Evandro Menezes wrote:

On 11/12/2015 09:39 AM, Evandro Menezes wrote:

On 11/12/2015 08:55 AM, James Greenhalgh wrote:

On Tue, Nov 10, 2015 at 11:50:12AM -0600, Evandro Menezes wrote:

2015-11-10  Evandro Menezes 

gcc/

* config/aarch64/aarch64.md (predicated): Copy attribute from
"arm.md".

This patch duplicates an attribute from arm.md so that the same
pipeline model can be used for both AArch32 and AArch64.

Bootstrapped on arm-unknown-linux-gnueabihf, 
aarch64-unknown-linux-gnu.


Please, commit if it's alright.

--
Evandro Menezes


 From 3b643a3c026350864713e1700dc44e4794d93809 Mon Sep 17 00:00:00 
2001

From: Evandro Menezes 
Date: Mon, 9 Nov 2015 17:11:16 -0600
Subject: [PATCH 1/2] [AArch64] Add attribute for compatibility with 
ARM

  pipeline models

gcc/
* config/aarch64/aarch64.md (predicated): Copy attribute from 
"arm.md".

---
  gcc/config/aarch64/aarch64.md | 5 +
  1 file changed, 5 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md 
b/gcc/config/aarch64/aarch64.md

index 6b08850..2bc2ff5 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -195,6 +195,11 @@
  ;; 1 :=: yes
  (define_attr "far_branch" "" (const_int 0))
  +;; [For compatibility with ARM in pipeline models]
+;; Attribute that specifies whether or not the instruction is 
executed

+;; conditionally ( != "AL"? "yes": "no").
I'm not sure this  != "AL" [...] part makes sense to me (thinking 
only
of AArch64, I'd understand it on AArch32 :) ) and we should document 
that

this is never set for AArch64. Could you respin with a slightly clearer
comment.
Since this attribute was not described in config/arm/arm.md, I felt 
that it needed to, but perhaps in its original place instead.  I 
agree that I should also point out that it's strictly for 
compatibility with AArch32 and that it never matters for AArch64.


WRT the  thing, I was referring to the opcode fields terminology 
used by ARM in its ISA documentation.  Perhaps it's unnecessary, yes?




   2015-11-12  Evandro Menezes 

   [AArch64] Add attribute for compatibility with ARM pipeline models

   gcc/

   * config/aarch64/aarch64.md (predicated): Copy attribute from
   "arm.md".
   * config/arm/arm.md (predicated): Added description.

Please, commit if it's alright.


Ping.

--
Evandro Menezes



Re: [PATCH 3b/4][AArch64] Add scheduling model for Exynos M1

2015-11-19 Thread Evandro Menezes

On 11/10/2015 11:54 AM, Evandro Menezes wrote:

2015-11-10  Evandro Menezes 

   gcc/

   * config/aarch64/aarch64-cores.def: Use the Exynos M1 sched model.
   * config/aarch64/aarch64.md: Include "exynos-m1.md".
   * config/arm/arm-cores.def: Use the Exynos M1 sched model.
   * config/arm/arm.md: Include "exynos-m1.md".
   * config/arm/arm-tune.md: Regenerated.
   * config/arm/exynos-m1.md: New file.

This patch adds the scheduling model for Exynos M1.  It depends on 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01257.html


Bootstrapped on arm-unknown-linux-gnueabihf, aarch64-unknown-linux-gnu.

Please, commit if it's alright.


Ping

--
Evandro Menezes



Re: Aw: Re: TR1 Special Math

2015-11-19 Thread Joseph Myers
On Thu, 19 Nov 2015, Ed Smith-Rowland wrote:

> Testing coverage is growing.
> I starting to build check_inf tests for everyone (but they aren't in the
> patch.
> I'll look at musl to see how you do check_denorm, etc.

FWIW, in glibc I've systematically put special cases and a representative 
range of input values in auto-libm-test-in (where the test inputs and 
mathematical results are finite, but possibly overflowing / underflowing) 
/ libm-test.inc (where not all finite).  Random test generation has also 
been done with both mpcheck and ad hoc scripts that put random inputs in 
auto-libm-test-in before regenerating auto-libm-test-out and running the 
testsuite, in order to find cases with large errors or spurious or missing 
exceptions (those cases then being added to auto-libm-test-in when the 
bugs are fixed).  This all relies heavily on having MPFR / MPC support for 
the functions in question in order to compute the ideal correctly rounded 
results for all rounding modes (so if new functions were added to glibc, 
it would be a very good idea to add them to MPFR / MPC first).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] fix PR68341: correctly compute the insertion point for close phi nodes

2015-11-19 Thread Sebastian Pop
---
 gcc/graphite-isl-ast-to-gimple.c | 96 ++--
 1 file changed, 62 insertions(+), 34 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 3e0907d..91bda33 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -399,6 +399,11 @@ class translate_isl_ast_to_gimple
   edge copy_bb_and_scalar_dependences (basic_block bb, edge next_e,
   vec iv_map);
 
+  /* Given a basic block containing close-phi it returns the new basic block
+ where to insert a copy of the close-phi nodes.  All the uses in close phis
+ should come from a single loop otherwise it returns NULL.  */
+  edge edge_for_new_close_phis (basic_block bb);
+
   /* Add NEW_NAME as the ARGNUM-th arg of NEW_PHI which is in NEW_BB.
  DOMINATING_PRED is the predecessor basic block of OLD_BB which dominates
  the other pred of OLD_BB as well.  If no such basic block exists then it 
is
@@ -1660,8 +1665,8 @@ translate_isl_ast_to_gimple::rename_all_uses (tree 
new_expr, basic_block new_bb,
   return new_expr;
 }
 
-/* For ops which are scev_analyzeable, we can regenerate a new name from
-its scalar evolution around LOOP.  */
+/* For ops which are scev_analyzeable, we can regenerate a new name from its
+   scalar evolution around LOOP.  */
 
 tree
 translate_isl_ast_to_gimple::
@@ -1702,9 +1707,7 @@ get_rename_from_scev (tree old_name, gimple_seq *stmts, 
loop_p loop,
   basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (new_expr));
   if (bb && !dominated_by_p (CDI_DOMINATORS, new_bb, bb))
{
- /* FIXME: Remove if bootstrap passes.  */
  codegen_error = true;
- gcc_unreachable ();
  return build_zero_cst (TREE_TYPE (old_name));
}
 }
@@ -1717,9 +1720,7 @@ get_rename_from_scev (tree old_name, gimple_seq *stmts, 
loop_p loop,
   basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (new_expr));
   if (bb && !dominated_by_p (CDI_DOMINATORS, new_bb, bb))
{
- /* FIXME: Remove if bootstrap passes.  */
  codegen_error = true;
- gcc_unreachable ();
  return build_zero_cst (TREE_TYPE (old_name));
}
 }
@@ -2079,7 +2080,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
 {
   /* The successor of bb having close phi should be a merge of the diamond
  inserted to guard the loop during codegen.  */
-  basic_block close_phi_merge_bb = single_succ (new_bb);
+  basic_block succ_new_bb = single_succ (new_bb);
 
   for (gphi_iterator psi = gsi_start_phis (old_bb); !gsi_end_p (psi);
gsi_next ())
@@ -2119,10 +2120,11 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
 
   /* When there is no loop guard around this codegenerated loop, there is 
no
 need to collect the close-phi arg.  */
-  if (2 != EDGE_COUNT (close_phi_merge_bb->preds))
+  if (2 != EDGE_COUNT (succ_new_bb->preds)
+ || bb_contains_loop_phi_nodes (succ_new_bb))
continue;
 
-  /* Add a PHI in the close_phi_merge_bb for each close phi of the loop.  
*/
+  /* Add a PHI in the succ_new_bb for each close phi of the loop.  */
   tree init = find_init_value_close_phi (new_phi);
 
   /* A close phi must come from a loop-phi having an init value.  */
@@ -2140,8 +2142,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
  continue;
}
 
-  gphi *merge_phi = create_phi_node (SSA_NAME_VAR (res),
-close_phi_merge_bb);
+  gphi *merge_phi = create_phi_node (SSA_NAME_VAR (res), succ_new_bb);
   tree merge_res = create_new_def_for (res, merge_phi,
   gimple_phi_result_ptr (merge_phi));
   set_rename (res, merge_res);
@@ -2150,8 +2151,8 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
   add_phi_arg (merge_phi, new_res, from_loop, get_loc (old_name));
 
   /* The edge coming from loop guard.  */
-  edge other = from_loop == (*close_phi_merge_bb->preds)[0]
-   ? (*close_phi_merge_bb->preds)[1] : (*close_phi_merge_bb->preds)[0];
+  edge other = from_loop == (*succ_new_bb->preds)[0]
+   ? (*succ_new_bb->preds)[1] : (*succ_new_bb->preds)[0];
 
   add_phi_arg (merge_phi, init, other, get_loc (old_name));
   if (dump_file)
@@ -2605,6 +2606,47 @@ 
translate_isl_ast_to_gimple::graphite_copy_stmts_from_block (basic_block bb,
   return true;
 }
 
+
+/* Given a basic block containing close-phi it returns the new basic block 
where
+   to insert a copy of the close-phi nodes.  All the uses in close phis should
+   come from a single loop otherwise it returns NULL.  */
+
+edge
+translate_isl_ast_to_gimple::edge_for_new_close_phis (basic_block bb)
+{
+  /* Make sure that NEW_BB is the new_loop->exit->dest.  We find the definition
+ of close phi in the 

Re: [PATCH] C++ FE: offer suggestions for misspelled field names

2015-11-19 Thread Jason Merrill

OK, thanks.

Jason


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-11-19 Thread Evandro Menezes

On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

On 20 October 2015 at 00:40, Evandro Menezes  wrote:

In the existing targets, it seems that it's always faster to zero up a DF
register with "movi %d0, #0" instead of "fmov %d0, xzr".

This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.


Ping.

--
Evandro Menezes



[PTX] Add weak support

2015-11-19 Thread Nathan Sidwell
I've committed this patch to trunk, which adds weak symbol support to PTX.  PTX 
supports weak definitions but not weak declarations, so some of the tests need 
explicitly skipping, however the overall change is for the better.


I did discover a bug in the PTX JIT, in that it resolved weak definitions too 
early, and have reported that to Nvidia. This affected gcc.dg/special/weak-2.c, 
which is thus skipped.


nathan
2015-11-19  Nathan Sidwell  

	gcc/
	* config/nvptx/nvptx.h (SUPPORTS_WEAK): Define.
	* config/nvptx/nvptx.c (nvptx_write_function_decl): Support
	DECL_WEAK.
	(nvptx_declare_objec_name): Likewise.

	gcc/testsuite/
	* lib/target-supports.exp (check_weak_available): Add nvptx-*-*.
	* gcc.dg/attr-weakref-1.c: Skip for nvptx-*-*
	* gcc.dg/special/weak-2.c: Likewise.
	* gcc.dg/weak/weak-12.c: Likewise.
	* gcc.dg/weak/weak-15.c: Likewise.
	* gcc.dg/weak/weak-16.c: Likewise.
	* gcc.dg/weak/weak-1.c: Likewise.
	* gcc.dg/weak/weak-2.c: Likewise.
	* gcc.dg/weak/weak-4.c: Likewise.
	* gcc.dg/torture/pr53922.c: Likewise.
	* gcc.dg/torture/pr60092.c: Likewise.

Index: config/nvptx/nvptx.c
===
--- config/nvptx/nvptx.c	(revision 230554)
+++ config/nvptx/nvptx.c	(working copy)
@@ -379,7 +379,7 @@ nvptx_write_function_decl (std::stringst
   if (DECL_EXTERNAL (decl))
 s << ".extern ";
   else if (TREE_PUBLIC (decl))
-s << ".visible ";
+s << (DECL_WEAK (decl) ? ".weak " : ".visible ");
 
   if (kernel)
 s << ".entry ";
@@ -1780,8 +1780,9 @@ nvptx_declare_object_name (FILE *file, c
   size = tree_to_uhwi (DECL_SIZE_UNIT (decl));
   const char *section = nvptx_section_for_decl (decl);
   fprintf (file, "\t%s%s .align %d .u%d ",
-	   TREE_PUBLIC (decl) ? " .visible" : "", section,
-	   DECL_ALIGN (decl) / BITS_PER_UNIT,
+	   !TREE_PUBLIC (decl) ? ""
+	   : DECL_WEAK (decl) ? ".weak" : ".visible",
+	   section, DECL_ALIGN (decl) / BITS_PER_UNIT,
 	   decl_chunk_size * BITS_PER_UNIT);
   assemble_name (file, name);
   if (size > 0)
Index: config/nvptx/nvptx.h
===
--- config/nvptx/nvptx.h	(revision 230554)
+++ config/nvptx/nvptx.h	(working copy)
@@ -349,6 +349,7 @@ struct GTY(()) machine_function
 #define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
   ((VALUE) = GET_MODE_BITSIZE ((MODE)), 2)
 
+#define SUPPORTS_WEAK 1
 #define NO_DOT_IN_LABEL
 #define ASM_COMMENT_START "//"
 
Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp	(revision 230554)
+++ testsuite/lib/target-supports.exp	(working copy)
@@ -292,6 +292,12 @@ proc check_weak_available { } {
 	return 0
 }
 
+# nvptx (nearly) supports it
+
+if { [istarget nvptx-*-*] } {
+	return 1
+}
+
 # ELF and ECOFF support it. a.out does with gas/gld but may also with
 # other linkers, so we should try it
 
Index: testsuite/gcc.dg/attr-weakref-1.c
===
--- testsuite/gcc.dg/attr-weakref-1.c	(revision 230554)
+++ testsuite/gcc.dg/attr-weakref-1.c	(working copy)
@@ -1,10 +1,12 @@
 // { dg-do run }
 // { dg-require-weak "" }
 // On darwin, we use attr-weakref-1-darwin.c.
+
 // This test requires support for undefined weak symbols.  This support
-// is not available on hppa*-*-hpux*.  The test is skipped rather than
+// is not available on the following targets.  The test is skipped rather than
 // xfailed to suppress the warning that would otherwise arise.
-// { dg-skip-if "" { "hppa*-*-hpux*" "*-*-aix*" } "*" { "" } }
+// { dg-skip-if "" { "hppa*-*-hpux*" "*-*-aix*" "nvptx-*-*" } "*" { "" } }
+
 // For kernel modules and static RTPs, the loader treats undefined weak
 // symbols in the same way as undefined strong symbols.  The test
 // therefore fails to load, so skip it.
Index: testsuite/gcc.dg/special/weak-2.c
===
--- testsuite/gcc.dg/special/weak-2.c	(revision 230554)
+++ testsuite/gcc.dg/special/weak-2.c	(working copy)
@@ -2,6 +2,10 @@
 /* { dg-require-weak "" } */
 /* { dg-additional-sources "weak-2a.c weak-2b.c" } */
 
+/* NVPTX's implementation of weak is broken when a strong symbol is in
+   a later object file than the weak definition.   */
+/* { dg-skip-if "" { "nvptx-*-*" } "*" { "" } } */
+
 #include 
 
 extern int foo(void);
Index: testsuite/gcc.dg/weak/weak-12.c
===
--- testsuite/gcc.dg/weak/weak-12.c	(revision 230554)
+++ testsuite/gcc.dg/weak/weak-12.c	(working copy)
@@ -2,6 +2,8 @@
 /* { dg-do compile } */
 /* { dg-require-weak "" } */
 /* { dg-options "" } */
+/* NVPTX's weak is applied to the definition,  not declaration.  */
+/* { dg-skip-if "" { nvptx-*-* } { "*" } { "" } } */
 
 /* { dg-final { scan-assembler "weak\[^ \t\]*\[ \t\]_?foo" } } */
 
Index: 

C++ PATCH for c++/68422 (sizeof... slow to evaluate)

2015-11-19 Thread Jason Merrill
This turns out to be because we represent sizeof... as a sizeof a pack 
expansion internally, and we generated a full new pack expansion one 
element at a time rather than just look at how many elements there were 
in the argument pack.


We already had an optimization in tsubst_parameter_pack to recognize 
when the pack expansion we're asking for will be the same as one of the 
argument packs, but we weren't doing it for expressions because we might 
need to convert_from_reference the elements.  This patch improves the 
optimization to recognize when we won't need to worry about that, and 
therefore handle more expression pack expansions, which fixes the 
quadratic behavior on the testcase.


But we'd like sizeof... to be fast for packs of references as well, so I 
added a flag to EXPR_PACK_EXPANSION for sizeof... so we can always just 
return the argument pack.  Surprisingly, this provided a small but 
consistent further speedup for the testcase.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 3d0560f2a78158cf6a3165d5cd8f1b6eda0d9fc8
Author: Jason Merrill 
Date:   Thu Nov 19 15:36:50 2015 -0500

	PR c++/68422

	* cp-tree.h (PACK_EXPANSION_SIZEOF_P): New.
	* parser.c (cp_parser_sizeof_pack): Set it.
	* pt.c 	(tsubst_copy) [SIZEOF_EXPR]: Likewise.
	(tsubst_pack_expansion): Improve T... shortcut for expression packs.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 160bf1e..14ea119 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -98,6 +98,7 @@ c-common.h, not after.
   DECLTYPE_FOR_INIT_CAPTURE (in DECLTYPE_TYPE)
   CONSTRUCTOR_NO_IMPLICIT_ZERO (in CONSTRUCTOR)
   TINFO_USED_TEMPLATE_ID (in TEMPLATE_INFO)
+  PACK_EXPANSION_SIZEOF_P (in *_PACK_EXPANSION)
2: IDENTIFIER_OPNAME_P (in IDENTIFIER_NODE)
   ICS_THIS_FLAG (in _CONV)
   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (in VAR_DECL)
@@ -3200,6 +3201,9 @@ extern void decl_shadowed_for_var_insert (tree, tree);
 /* True iff this pack expansion is within a function context.  */
 #define PACK_EXPANSION_LOCAL_P(NODE) TREE_LANG_FLAG_0 (NODE)
 
+/* True iff this pack expansion is for sizeof  */
+#define PACK_EXPANSION_SIZEOF_P(NODE) TREE_LANG_FLAG_1 (NODE)
+
 /* True iff the wildcard can match a template parameter pack.  */
 #define WILDCARD_PACK_P(NODE) TREE_LANG_FLAG_0 (NODE)
 
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 30cde0b..24ed404 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -25868,6 +25868,7 @@ cp_parser_sizeof_pack (cp_parser *parser)
   else if (TREE_CODE (expr) == CONST_DECL)
 expr = DECL_INITIAL (expr);
   expr = make_pack_expansion (expr);
+  PACK_EXPANSION_SIZEOF_P (expr) = true;
 
   if (paren)
 cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b4a5e71..5868be2 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10868,14 +10868,26 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
 	}
 }
 
-  /* If the expansion is just T..., return the matching argument pack.  */
+  /* If the expansion is just T..., return the matching argument pack, unless
+ we need to call convert_from_reference on all the elements.  This is an
+ important optimization; see c++/68422.  */
   if (!unsubstituted_packs
   && TREE_PURPOSE (packs) == pattern)
 {
   tree args = ARGUMENT_PACK_ARGS (TREE_VALUE (packs));
+  /* Types need no adjustment, nor does sizeof..., and if we still have
+	 some pack expansion args we won't do anything yet.  */
   if (TREE_CODE (t) == TYPE_PACK_EXPANSION
+	  || PACK_EXPANSION_SIZEOF_P (t)
 	  || pack_expansion_args_count (args))
 	return args;
+  /* Also optimize expression pack expansions if we can tell that the
+	 elements won't have reference type.  */
+  tree type = TREE_TYPE (pattern);
+  if (type && TREE_CODE (type) != REFERENCE_TYPE
+	  && !PACK_EXPANSION_P (type)
+	  && !WILDCARD_TYPE_P (type))
+	return args;
   /* Otherwise use the normal path so we get convert_from_reference.  */
 }
 
@@ -13940,7 +13952,6 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 case SIZEOF_EXPR:
   if (PACK_EXPANSION_P (TREE_OPERAND (t, 0)))
 {
-
   tree expanded, op = TREE_OPERAND (t, 0);
 	  int len = 0;
 
@@ -13966,6 +13977,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 	{
 	  if (TREE_CODE (expanded) == TREE_VEC)
 		expanded = TREE_VEC_ELT (expanded, len - 1);
+	  else
+		PACK_EXPANSION_SIZEOF_P (expanded) = true;
 
 	  if (TYPE_P (expanded))
 		return cxx_sizeof_or_alignof_type (expanded, SIZEOF_EXPR, 


[PATCH] fix PR68428: ignore bb dominated by the scop->exit

2015-11-19 Thread Sebastian Pop
---
 gcc/graphite-scop-detection.c   |  6 +-
 gcc/testsuite/gcc.dg/graphite/pr68428.c | 23 +++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/pr68428.c

diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index e42b9bf..84fe945 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -1062,12 +1062,16 @@ scop_detection::harmful_stmt_in_region (sese_l scop) 
const
   basic_block bb;
   FOR_EACH_VEC_ELT (dom, i, bb)
 {
-  DEBUG_PRINT (dp << "\nVisiting bb_" << bb->index);
+  DEBUG_PRINT (dp << "Visiting bb_" << bb->index << "\n");
 
   /* We don't want to analyze any bb outside sese.  */
   if (!dominated_by_p (CDI_POST_DOMINATORS, bb, exit_bb))
continue;
 
+  /* Basic blocks dominated by the scop->exit are not in the scop.  */
+  if (bb != exit_bb && dominated_by_p (CDI_DOMINATORS, bb, exit_bb))
+   continue;
+
   /* The basic block should not be part of an irreducible loop.  */
   if (bb->flags & BB_IRREDUCIBLE_LOOP)
{
diff --git a/gcc/testsuite/gcc.dg/graphite/pr68428.c 
b/gcc/testsuite/gcc.dg/graphite/pr68428.c
new file mode 100644
index 000..2dc63b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr68428.c
@@ -0,0 +1,23 @@
+/* { dg-options "-O2 -floop-nest-optimize" } */
+
+int au[4] = { 0 };
+
+int
+main(void)
+{
+  int dc;
+  int m7;
+  int lv;
+  int f2;
+  int uq[3] = { 1 };
+  for (dc = 0; dc < 2; ++dc) {
+for (lv = 0; lv < 2; ++lv)
+  for (m7 = 0; m7 < 3; ++m7) {
+if (uq[dc] == 0)
+  continue;
+for (f2 = 0; f2 < 3; ++f2)
+  au[dc+2] = uq[f2];
+  }
+  }
+  return 0;
+}
-- 
1.9.1



[PATCH] add testcase for PR68335

2015-11-19 Thread Sebastian Pop
---
 gcc/testsuite/gfortran.dg/graphite/pr68335.f90 | 45 ++
 1 file changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/pr68335.f90

diff --git a/gcc/testsuite/gfortran.dg/graphite/pr68335.f90 
b/gcc/testsuite/gfortran.dg/graphite/pr68335.f90
new file mode 100644
index 000..a948bea
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/graphite/pr68335.f90
@@ -0,0 +1,45 @@
+! { dg-options "-O2 -floop-nest-optimize" }
+
+MODULE whittaker
+  INTEGER, PARAMETER :: dp=8
+  INTEGER, PARAMETER :: maxfac = 30
+  REAL(KIND=dp), PARAMETER, DIMENSION (-1:2*maxfac+1) :: dfac = (/&
+ 0.1000E+01_dp, 0.1000E+01_dp, 
0.1000E+01_dp,&
+ 0.2000E+01_dp, 0.3000E+01_dp, 
0.8000E+01_dp,&
+ 0.1500E+02_dp, 0.4800E+02_dp, 
0.1050E+03_dp,&
+ 0.3840E+03_dp, 0.9450E+03_dp, 
0.3840E+04_dp,&
+ 0.10395000E+05_dp, 0.4608E+05_dp, 
0.13513500E+06_dp,&
+ 0.64512000E+06_dp, 0.20270250E+07_dp, 
0.10321920E+08_dp,&
+ 0.34459425E+08_dp, 0.18579456E+09_dp, 
0.654729075000E+09_dp,&
+ 0.37158912E+10_dp, 0.137493105750E+11_dp, 
0.817496064000E+11_dp,&
+ 0.316234143225E+12_dp, 0.196199055360E+13_dp, 
0.7905853580625000E+13_dp,&
+ 0.510117543936E+14_dp, 0.2134580466768750E+15_dp, 
0.1428329123020800E+16_dp,&
+ 0.6190283353629375E+16_dp, 0.4284987369062400E+17_dp, 
0.19189878396251062500E+18_dp,&
+ 0.1371195958099968E+19_dp, 0.63326598707628506250E+19_dp, 
0.46620662575398912000E+20_dp,&
+ 0.22164309547669977187E+21_dp, 0.16783438527143608320E+22_dp, 
0.82007945326378915594E+22_dp,&
+ 0.63777066403145711616E+23_dp, 0.3198309867728082E+24_dp, 
0.25510826561258284646E+25_dp,&
+ 0.13113070457687988603E+26_dp, 0.10714547155728479551E+27_dp, 
0.56386202968058350995E+27_dp,&
+ 0.47144007485205310027E+28_dp, 0.25373791335626257948E+29_dp, 
0.21686243443194442612E+30_dp,&
+ 0.11925681927744341235E+31_dp, 0.1040939685272454E+32_dp, 
0.58435841445947272053E+32_dp,&
+ 0.5204698426362269E+33_dp, 0.29802279137433108747E+34_dp, 
0.27064431817106664380E+35_dp,&
+ 0.15795207942839547636E+36_dp, 0.14614793181237598765E+37_dp, 
0.86873643685617511998E+37_dp,&
+ 0.81842841814930553085E+38_dp, 0.49517976900801981839E+39_dp, 
0.47468848252659720789E+40_dp,&
+ 0.29215606371473169285E+41_dp, 0.28481308951595832474E+42_dp, 
0.17821519886598633264E+43_dp/)
+CONTAINS
+  SUBROUTINE whittaker_c0 ( wc, r, expa, erfa, alpha, l, n )
+INTEGER, INTENT(IN)  :: n, l
+REAL(KIND=dp), INTENT(IN):: alpha
+REAL(KIND=dp), DIMENSION(n)  :: erfa, expa, r, wc
+INTEGER  :: i, k
+REAL(dp) :: t1,x
+SELECT CASE (l)
+  CASE DEFAULT
+DO i = 1, n
+  DO k = 0, l/2
+wc(i) = wc(i) + expa(i)*x**(2*k+1)*t1**(2*k+3)*&
+dfac(l+1)/dfac(2*k+1)*2**(k+1)
+  END DO
+END DO
+END SELECT
+  END SUBROUTINE whittaker_c0
+END MODULE whittaker
-- 
1.9.1



Re: libgomp: Compile-time error for non-portable gomp_mutex_t initialization

2015-11-19 Thread Jakub Jelinek
On Wed, Nov 18, 2015 at 06:19:29PM +0300, Ilya Verbin wrote:
> On Fri, Sep 25, 2015 at 17:28:25 +0200, Jakub Jelinek wrote:
> > On Fri, Sep 25, 2015 at 05:04:47PM +0200, Thomas Schwinge wrote:
> > > On Thu, 26 Mar 2015 23:41:30 +0300, Ilya Verbin  wrote:
> > > > On Thu, Mar 26, 2015 at 13:09:19 +0100, Jakub Jelinek wrote:
> > > > > the current code is majorly broken.  As I've said earlier, e.g. the 
> > > > > lack
> > > > > of mutex guarding gomp_target_init (which is using pthread_once 
> > > > > guaranteed
> > > > > to be run just once) vs. concurrent GOMP_offload_register calls
> > > > > (if those are run from ctors, then I guess something like dl_load_lock
> > > > > ensures at least on glibc that multiple GOMP_offload_register calls 
> > > > > aren't
> > > > > performed at the same time) in accessing/reallocating offload_images
> > > > > and num_offload_images and the lack of support to register further
> > > > > images after the gomp_target_init call (if you dlopen further shared
> > > > > libraries) is really bad.  And it would be really nice to support the
> > > > > unloading.
> > > 
> > > > Here is the latest patch for libgomp and mic plugin.
> > > 
> > > > libgomp/
> > > 
> > > > * target.c (register_lock): New mutex for offload image 
> > > > registration.
> > > 
> > > > (GOMP_offload_register): Add mutex lock.
> > 
> > That is definitely wrong.  You'd totally break --disable-linux-futex support
> > on linux and bootstrap on e.g. Solaris and various other pthread targets.
> 
> I don't quite understand, do you mean that gcc 5 and trunk are broken, because
> register_lock doesn't have initialization?  But it seems that bootstrap on
> Solaris and other targets works fine...

Thomas has been proposing to add an #error when !GOMP_MUTEX_INIT_0 into
target.c, so that means break build of libgomp on all targets where
config/posix/mutex.h is used.  That includes --disable-linux-futex on Linux,
and various other targets.

> > At least for ELF and dynamic linking, shared libraries that contain
> > constructors that call GOMP_offload_register* should have DT_NEEDED libgomp
> > and thus libgomp's constructors should be run before the constructors of
> > the libraries that call GOMP_offload_register*.
> 
> So, libgomp should contain a constructor, which will call gomp_mutex_init
> (_lock) before any call to GOMP_offload_register*, right?

No, I think the GOMP_MUTEX_INITIALIZER case is better.
All pthread targets need to support PTHREAD_MUTEX_INITIALIZER, and the other
config/*/bar.h implementations are GOMP_MUTEX_INIT_0 1.
So, config/posix/bar.h would
#define GOMP_MUTEX_INITIALIZER PTHREAD_MUTEX_INITIALIZER
config/linux/bar.h would
#define GOMP_MUTEX_INITIALIZER { 0 }
and
config/rtems/bar.h would
#define GOMP_MUTEX_INITIALIZER {}
// or something similar.
and then just initialize the file scope locks with that static initializer.

Jakub


Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Segher Boessenkool
Hi Kyrill,

On Thu, Nov 19, 2015 at 10:26:25AM +, Kyrill Tkachov wrote:
> In this PR we end up removing a widening multiplication. I suspect it's 
> some path in combine
> that doesn't handle the case where we return clobber of 0 properly.

That is troublesome.  Could you look deeper?

> This 
> started with my recent
> combine patch r230326.
> Changing the subst hunk from that patch to just return x when handling a 
> no-op substitution
> fixes the testcase for me and doesn't regress the widening-mult cases that 
> r230326 improved.
> In fact, with this patch I'm seeing improved codegen for aarch64 with some 
> widening multiplications
> combined with constant operands and ending up in bitfield move/insert 
> instructions.
> 
> Bootstrapped and tested on aarch64, arm, x86_64.
> 
> Ok for trunk?

I'll have a look what it does for code quality on other targets.


Segher


[SPARC] Fix PR target/68408

2015-11-19 Thread Eric Botcazou
This is the breakage of the init_priority attribute on sparc-elf, a regression 
from older releases in the 4.x series.

Tested on sparc-elf, applied on all active branches.


2015-11-19  Eric Botcazou  

PR target/68408
* config/sparc/sp-elf.h (CTORS_SECTION_ASM_OP): Undefine.
(DTORS_SECTION_ASM_OP): Likewise.

-- 
Eric BotcazouIndex: config/sparc/sp-elf.h
===
--- config/sparc/sp-elf.h	(revision 230589)
+++ config/sparc/sp-elf.h	(working copy)
@@ -53,6 +53,10 @@ along with GCC; see the file COPYING3.
 #undef  ASM_GENERATE_INTERNAL_LABEL
 #define ASM_GENERATE_INTERNAL_LABEL(LABEL,PREFIX,NUM)	\
   sprintf ((LABEL), "*.L%s%ld", (PREFIX), (long)(NUM))
+
+/* We use GNU ld so undefine this so that attribute((init_priority)) works.  */
+#undef CTORS_SECTION_ASM_OP
+#undef DTORS_SECTION_ASM_OP
 
 /* ??? Inherited from sol2.h.  Probably wrong.  */
 #undef WCHAR_TYPE


Re: [patch] Fix PR lto/61313

2015-11-19 Thread Bernd Schmidt

On 11/19/2015 01:02 PM, Eric Botcazou wrote:

this fixes the glitch I introduced when trying to fix another bug:
   https://gcc.gnu.org/ml/gcc-patches/2012-05/msg00330.html
The --with-plugin-ld value specified by the user is also stripped, although
that's unnecessay to fix the original bug.  So I guess that we can leave this
value alone, as before my patch.

Tested on x86_64-suse-linux, OK for all active branches?


Ok.


Bernd



[PATCH] Fix ICE with mangling aliases (PR c++/67354, take 2)

2015-11-19 Thread Jakub Jelinek
On Wed, Nov 18, 2015 at 04:22:21PM -0500, Jason Merrill wrote:
> On 11/18/2015 03:52 AM, Jakub Jelinek wrote:
> >If changing at_eof for this is too big hack, perhaps we could have a
> >different bool just to affect the mangling aliases (and set it to true
> >in generate_mangling_aliases or so).
> 
> Let's do that.

Here it is.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk/5.3?

2015-11-19  Jakub Jelinek  

PR c++/67354
* cp-tree.h (defer_mangling_aliases): Declare.
(generate_mangling_aliases): New prototype.
* decl2.c (defer_mangling_aliases): New variable.
(note_mangling_alias): Use !defer_mangling_aliases
instead of at_eof.
(generate_mangling_aliases): No longer static.
(c_parse_final_cleanups): Clear defer_mangling_aliases.
* optimize.c (maybe_thunk_body): Defer emitting mangling aliases
if !defer_mangling_aliases until the fns are put into the same
comdat group.

* g++.dg/abi/mangle67.C: New test.

--- gcc/cp/cp-tree.h.jj 2015-11-18 11:19:20.956770209 +0100
+++ gcc/cp/cp-tree.h2015-11-19 09:58:00.355009161 +0100
@@ -4842,6 +4842,11 @@ extern GTY(()) vec *local_c
 
 extern int at_eof;
 
+/* True if note_mangling_alias should enqueue mangling aliases for
+   later generation, rather than emitting them right away.  */
+
+extern bool defer_mangling_aliases;
+
 /* A list of namespace-scope objects which have constructors or
destructors which reside in the global scope.  The decl is stored
in the TREE_VALUE slot and the initializer is stored in the
@@ -5768,6 +5773,7 @@ extern tree cxx_maybe_build_cleanup   (tr
 
 /* in decl2.c */
 extern void note_mangling_alias(tree, tree);
+extern void generate_mangling_aliases  (void);
 extern bool check_java_method  (tree);
 extern tree build_memfn_type   (tree, tree, cp_cv_quals, 
cp_ref_qualifier);
 extern tree build_pointer_ptrmemfn_type(tree);
--- gcc/cp/decl2.c.jj   2015-11-18 11:19:12.866884363 +0100
+++ gcc/cp/decl2.c  2015-11-19 10:12:24.423695746 +0100
@@ -102,6 +102,11 @@ static GTY(()) vec *manglin
 /* Nonzero if we're done parsing and into end-of-file activities.  */
 
 int at_eof;
+
+/* True if note_mangling_alias should enqueue mangling aliases for
+   later generation, rather than emitting them right away.  */
+
+bool defer_mangling_aliases = true;
 
 
 /* Return a member function type (a METHOD_TYPE), given FNTYPE (a
@@ -4389,7 +4394,7 @@ void
 note_mangling_alias (tree decl ATTRIBUTE_UNUSED, tree id2 ATTRIBUTE_UNUSED)
 {
 #ifdef ASM_OUTPUT_DEF
-  if (at_eof)
+  if (!defer_mangling_aliases)
 generate_mangling_alias (decl, id2);
   else
 {
@@ -4399,7 +4404,9 @@ note_mangling_alias (tree decl ATTRIBUTE
 #endif
 }
 
-static void
+/* Emit all mangling aliases that were deferred up to this point.  */
+
+void
 generate_mangling_aliases ()
 {
   while (!vec_safe_is_empty (mangling_aliases))
@@ -4502,6 +4509,7 @@ c_parse_final_cleanups (void)
 
   locus_at_end_of_parsing = input_location;
   at_eof = 1;
+  defer_mangling_aliases = false;
 
   /* Bad parse errors.  Just forget about it.  */
   if (! global_bindings_p () || current_class_type
--- gcc/cp/optimize.c.jj2015-11-18 11:19:12.935883389 +0100
+++ gcc/cp/optimize.c   2015-11-19 10:12:24.423695746 +0100
@@ -270,7 +270,11 @@ maybe_thunk_body (tree fn, bool force)
 }
   else if (HAVE_COMDAT_GROUP)
 {
+  /* At eof, defer creation of mangling aliases temporarily.  */
+  bool save_defer_mangling_aliases = defer_mangling_aliases;
+  defer_mangling_aliases = true;
   tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
+  defer_mangling_aliases = save_defer_mangling_aliases;
   cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
   cgraph_node::get_create (fns[1])->add_to_same_comdat_group
(cgraph_node::get_create (fns[0]));
@@ -281,6 +285,9 @@ maybe_thunk_body (tree fn, bool force)
   virtual, it goes into the same comdat group as well.  */
cgraph_node::get_create (fns[2])->add_to_same_comdat_group
  (symtab_node::get (fns[0]));
+  /* Emit them now that the thunks are same comdat group aliases.  */
+  if (!save_defer_mangling_aliases)
+   generate_mangling_aliases ();
   TREE_PUBLIC (fn) = false;
   DECL_EXTERNAL (fn) = false;
   DECL_INTERFACE_KNOWN (fn) = true;
--- gcc/testsuite/g++.dg/abi/mangle67.C.jj  2015-11-19 09:54:51.644701162 
+0100
+++ gcc/testsuite/g++.dg/abi/mangle67.C 2015-11-19 09:54:51.644701162 +0100
@@ -0,0 +1,21 @@
+// PR c++/67354
+// { dg-do compile { target c++11 } }
+// { dg-options "-fabi-version=5 -Os" }
+
+class A
+{
+};
+
+template 
+void
+foo ()
+{
+  T ();
+}
+
+struct B : virtual A
+{
+  template  B () {}
+};
+
+auto f = foo;


Jakub


[C++ PATCH] Issue hard error even with -fpermissive for certain goto violations (PR c++/67409, take 2)

2015-11-19 Thread Jakub Jelinek
On Thu, Nov 19, 2015 at 02:52:55AM +, Manuel López-Ibáñez wrote:
> static bool
> error_jumpto (diagnostic_t kind, location_t loc, tree decl)
> {
>   bool complained = (decl
>? emit_diagnostic (kind, input_location, 0,
>   "jump to label %qD", decl)
>: emit_diagnostic (kind, input_location, 0,
>   "jump to case label"));
>   if (complained && loc)
> inform (loc, " from here");
>   return complained;
> }

Thanks for the suggestion.  That said, I see no reason to rename the
function, or reverse the order of arguments (seems typeck2.c routines
that call emit_diagnostic also have diagnostic_t diag_kind last).
And I really need two locations, not just one, so that it can be used
in both check_previous_goto_1 and check_goto.

So, here is an updated patch, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk/5.3?

2015-11-19  Jakub Jelinek  
Manuel López-Ibáñez  

PR c++/67409
* decl.c (identify_goto): Add LOC and DIAG_KIND arguments, call
emit_diagnostic instead of permerror.
(check_previous_goto_1): Adjust identify_goto callers, treat all
cases but crossing initialization and entering scope of decl with
non-trivial dtor as unconditional hard errors.
(check_goto): Use identify_goto.  Treat all cases but crossing
initialization and entering scope of decl with non-trivial dtor
as unconditional hard errors.

* g++.dg/eh/goto3.C: New test.

--- gcc/cp/decl.c.jj2015-11-19 09:16:02.675919026 +0100
+++ gcc/cp/decl.c   2015-11-19 10:47:09.317990016 +0100
@@ -2967,14 +2967,16 @@ decl_jump_unsafe (tree decl)
   return 0;
 }
 
-/* A subroutine of check_previous_goto_1 to identify a branch to the user.  */
+/* A subroutine of check_previous_goto_1 and check_goto to identify a branch
+   to the user.  */
 
 static bool
-identify_goto (tree decl, const location_t *locus)
+identify_goto (tree decl, location_t loc, const location_t *locus,
+  diagnostic_t diag_kind)
 {
-  bool complained = (decl
-? permerror (input_location, "jump to label %qD", decl)
-: permerror (input_location, "jump to case label"));
+  bool complained
+= (decl ? emit_diagnostic (diag_kind, loc, 0, "jump to label %qD", decl)
+   : emit_diagnostic (diag_kind, loc, 0, "jump to case label"));
   if (complained && locus)
 inform (*locus, "  from here");
   return complained;
@@ -2991,15 +2993,17 @@ check_previous_goto_1 (tree decl, cp_bin
   bool exited_omp, const location_t *locus)
 {
   cp_binding_level *b;
-  bool identified = false, complained = false;
+  bool complained = false;
+  int identified = 0;
   bool saw_eh = false, saw_omp = false, saw_tm = false;
 
   if (exited_omp)
 {
-  complained = identify_goto (decl, locus);
+  complained = identify_goto (decl, input_location, locus, DK_ERROR);
   if (complained)
inform (input_location, "  exits OpenMP structured block");
-  identified = saw_omp = true;
+  saw_omp = true;
+  identified = 2;
 }
 
   for (b = current_binding_level; b ; b = b->level_chain)
@@ -3016,8 +3020,9 @@ check_previous_goto_1 (tree decl, cp_bin
 
  if (!identified)
{
- complained = identify_goto (decl, locus);
- identified = true;
+ complained = identify_goto (decl, input_location, locus,
+ DK_PERMERROR);
+ identified = 1;
}
  if (complained)
{
@@ -3035,10 +3040,11 @@ check_previous_goto_1 (tree decl, cp_bin
break;
   if ((b->kind == sk_try || b->kind == sk_catch) && !saw_eh)
{
- if (!identified)
+ if (identified < 2)
{
- complained = identify_goto (decl, locus);
- identified = true;
+ complained = identify_goto (decl, input_location, locus,
+ DK_ERROR);
+ identified = 2;
}
  if (complained)
{
@@ -3051,10 +3057,11 @@ check_previous_goto_1 (tree decl, cp_bin
}
   if (b->kind == sk_omp && !saw_omp)
{
- if (!identified)
+ if (identified < 2)
{
- complained = identify_goto (decl, locus);
- identified = true;
+ complained = identify_goto (decl, input_location, locus,
+ DK_ERROR);
+ identified = 2;
}
  if (complained)
inform (input_location, "  enters OpenMP structured block");
@@ -3062,10 +3069,11 @@ check_previous_goto_1 (tree decl, cp_bin
}
   if (b->kind == sk_transaction && !saw_tm)
{
- if (!identified)
+ if (identified < 2)
{
-

Re: [PATCH] Add clang-format config to contrib folder

2015-11-19 Thread Martin Liška
On 11/19/2015 11:40 AM, Martin Liška wrote:
> On 11/19/2015 12:09 AM, Sebastian Pop wrote:
>> Martin, thanks for getting this patch out.  I like the patch.
>> Jeff, clang-format has scripts that allow formatting only the lines
>> touched by a patch.
>> It also has a script to integrate with git:
>> https://llvm.org/svn/llvm-project/cfe/trunk/tools/clang-format/git-clang-format
>> We could use those scripts in a commit hook to automatically enforce
>> correct formatting of new patches.
>>
>> Sebastian
> 
> Hi.
> 
> Thanks for pointing out the script touching just modified lines,
> I mentioned the script in the clang-format file.
> 
> Sending v2 that I'm going to install.
> 
> Thanks,
> Martin
> 
>>
>> On Wed, Nov 18, 2015 at 8:10 AM, Martin Liška  wrote:
>>> Hello.
>>>
>>> Following patch adds a clang-format config file that should respect the GNU 
>>> coding standards.
>>> As the file is not part of build process, I hope the patch can be applied 
>>> even though
>>> we've just skipped to stage3? The patch adds a hunk to Makefile which can 
>>> create symlink
>>> to the root directory of the GCC compiler. The clang-format automatically 
>>> loads style from
>>> the configuration file.
>>>
>>> clang-format (version 3.8) provides rich variety of configuration options 
>>> that can
>>> ensure the GNU coding style.
>>>
>>> Limitations:
>>> + placement of opening brace of an initializer can't be requested
>>> + sometimes, '(' is the trailing symbol at the end of a line, which can 
>>> look weird
>>>
>>> As we've been continuously converting our source base to C++, the 
>>> clang-format should
>>> provide better results than a collection of regular expressions 
>>> (check_GNU_style.sh).
>>>
>>> As a reference file I attach gcc/tree-ssa-uninit.c file.
>>> Feel free to comment the suggested configuration file.
>>>
>>> Thanks,
>>> Martin
>>>
> 

There's a small follow-up where I enhance list of FOR_EACH macros as spotted
by octoploid.

As the change is obvious in a new file, I'm going to install the patch.

Thanks,
Martin
From f5571ac8379a376056aa5dc4846d3adb2d1db7b8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 19 Nov 2015 13:48:47 +0100
Subject: [PATCH] clang-format: Enhance list of FOR_EACH macros

contrib/ChangeLog:

2015-11-19  Martin Liska  

	* clang-format: Enhance list of FOR_EACH macros.
---
 contrib/clang-format | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/clang-format b/contrib/clang-format
index 8a6fd4b..76a9d87 100644
--- a/contrib/clang-format
+++ b/contrib/clang-format
@@ -43,7 +43,7 @@ BreakBeforeTernaryOperators: true
 ColumnLimit: 80
 ConstructorInitializerIndentWidth: 2
 ContinuationIndentWidth: 2
-ForEachMacros: ['_FOR_EACH','_FOR_EACH_1','FOR_EACH_AGGR_INIT_EXPR_ARG','FOR_EACH_ALIAS','FOR_EACH_ALLOCNO','FOR_EACH_ALLOCNO_OBJECT','FOR_EACH_ARTIFICIAL_DEF','FOR_EACH_ARTIFICIAL_USE','FOR_EACH_BB_FN','FOR_EACH_BB_REVERSE_FN','FOR_EACH_BIT_IN_MINMAX_SET','FOR_EACH_CALL_EXPR_ARG','FOR_EACH_CLONE','FOR_EACH_CONST_CALL_EXPR_ARG','FOR_EACH_CONSTRUCTOR_ELT','FOR_EACH_CONSTRUCTOR_VALUE','FOR_EACH_COPY','FOR_EACH_DEF','FOR_EACH_DEFINED_FUNCTION','FOR_EACH_DEFINED_SYMBOL','FOR_EACH_DEFINED_VARIABLE','FOR_EACH_DEP','FOR_EACH_EDGE','FOR_EACH_EXPR','FOR_EACH_EXPR_1','FOR_EACH_FUNCTION','FOR_EACH_FUNCTION_WITH_GIMPLE_BODY','FOR_EACH_HASH_TABLE_ELEMENT','FOR_EACH_IMM_USE_FAST','FOR_EACH_IMM_USE_ON_STMT','FOR_EACH_IMM_USE_STMT','FOR_EACH_INSN','FOR_EACH_INSN_1','FOR_EACH_INSN_DEF','FOR_EACH_INSN_EQ_USE','FOR_EACH_INSN_INFO_DEF','FOR_EACH_INSN_INFO_EQ_USE','FOR_EACH_INSN_INFO_MW','FOR_EACH_INSN_INFO_USE','FOR_EACH_INSN_USE','FOR_EACH_LOCAL_DECL','FOR_EACH_LOOP','FOR_EACH_LOOP_FN','FOR_EACH_OBJECT','FOR_EACH_OBJECT_CONFLICT','FOR_EACH_PHI_ARG','FOR_EACH_PHI_OR_STMT_DEF','FOR_EACH_PHI_OR_STMT_USE','FOR_EACH_PREF','FOR_EACH_SCALAR','FOR_EACH_SSA_DEF_OPERAND','FOR_EACH_SSA_TREE_OPERAND','FOR_EACH_SSA_USE_OPERAND','FOR_EACH_STATIC_INITIALIZER','FOR_EACH_SUBRTX','FOR_EACH_SUBRTX_PTR','FOR_EACH_SUBRTX_VAR','FOR_EACH_SUCC','FOR_EACH_SUCC_1','FOR_EACH_SYMBOL','FOR_EACH_VARIABLE','FOR_EACH_VEC_ELT','FOR_EACH_VEC_ELT_FROM','FOR_EACH_VEC_ELT_REVERSE','FOR_EACH_VEC_SAFE_ELT','FOR_EACH_VEC_SAFE_ELT_REVERSE','_FOR_EACH_X','_FOR_EACH_X_1','FOREACH_FUNCTION_ARGS','FOREACH_FUNCTION_ARGS_PTR']
+ForEachMacros: 

[patch] Fix PR lto/61313

2015-11-19 Thread Eric Botcazou
Hi,

this fixes the glitch I introduced when trying to fix another bug:
  https://gcc.gnu.org/ml/gcc-patches/2012-05/msg00330.html
The --with-plugin-ld value specified by the user is also stripped, although 
that's unnecessay to fix the original bug.  So I guess that we can leave this 
value alone, as before my patch.

Tested on x86_64-suse-linux, OK for all active branches?


2015-11-19  Eric Botcazou  

PR lto/61313
* configure.ac (PLUGIN_LD_SUFFIX): Do not touch the value specified
by the user.
* configure: Regenerate.

-- 
Eric BotcazouIndex: configure.ac
===
--- configure.ac	(revision 230589)
+++ configure.ac	(working copy)
@@ -2246,7 +2246,7 @@ AC_ARG_WITH(plugin-ld,
 [AS_HELP_STRING([[--with-plugin-ld=[ARG]]], [specify the plugin linker])],
 [if test x"$withval" != x; then
ORIGINAL_PLUGIN_LD_FOR_TARGET="$withval"
-   PLUGIN_LD_SUFFIX=`echo $withval | sed -e "s,$target_alias-,,"`
+   PLUGIN_LD_SUFFIX="$withval"
  fi])
 AC_SUBST(ORIGINAL_PLUGIN_LD_FOR_TARGET)
 AC_DEFINE_UNQUOTED(PLUGIN_LD_SUFFIX, "$PLUGIN_LD_SUFFIX", [Specify plugin linker])


Re: [1/2] OpenACC routine support

2015-11-19 Thread Jakub Jelinek
On Wed, Nov 18, 2015 at 11:02:01AM -0800, Cesar Philippidis wrote:
>  static inline void
>  cp_ensure_no_oacc_routine (cp_parser *parser)
>  {
> -  cp_finalize_oacc_routine (parser, NULL_TREE, false, true);
> +  if (parser->oacc_routine && !parser->oacc_routine->error_seen)
> +{
> +  tree clauses = parser->oacc_routine->clauses;
> +  location_t loc = OMP_CLAUSE_LOCATION (TREE_PURPOSE(clauses));

Formatting, missing space before (clauses));

> @@ -19326,7 +19330,11 @@ cp_parser_late_return_type_opt (cp_parser* parser, 
> cp_declarator *declarator,
>  declarator->std_attributes
>= cp_parser_late_parsing_omp_declare_simd (parser,
>declarator->std_attributes);
> -
> +  if (oacc_routine_p)
> +declarator->std_attributes
> +  = cp_parser_late_parsing_oacc_routine (parser,
> +  declarator->std_attributes);
> +  

Trailing whitespace at the end of line.

Otherwise LGTM.

Jakub


Re: [PATCH 0/5] Add support -mcpu=thunderxt88pass1 and needed changes to support that

2015-11-19 Thread Kyrill Tkachov

Hi Andrew,

On 17/11/15 22:10, Andrew Pinski wrote:

To Add support for -mcpu=thunderxt88pass1, I needed to fix up a few
things in the support for -mcpu=native.  First was I wanted to do the same
cleanup that was done for some of the other .def files and move the
#undef into the .def files instead of the .h files.
Second to make it easier to understand the implemention_id and part numbers
are really numbers, move them over to use integer to compare against
instead of strings which allows for easier comparisons later on.
Third fix the way we compare imp and part num; that is instead of finding
the part number and then comparing the imp, we find both numbers first
and then search for the ones that match.
Add a comment to the aarch64-cores.def file in front of each group of cores
to signify the groups of company's cores.
And all of this allows for adding variant for the check of the cores
which allows for thunderxt88pass1 to be added.

The reason why thunderxt88pass1 is seperate from thunderx is because
thunderx is changed to be an ARMv8.1 arch core while thunderxt88pass1
is still an ARMv8 arch core.

I tested each of these patches seperately.


I tried these patches out on a big.LITTLE system with 2 Cortex-A57 and 4 
Cortex-A53
cores and the code doesn't detect it properly anymore. Can you have a look 
please?
The Linux /proc/cpuinfo for that system looks like what's described here:
https://lists.linaro.org/pipermail/cross-distro/2014-October/000750.html
Look for the entry "[3] arm64, v3.17 + this patch, Juno platform "
You can put that into a file, point the fopen call in driver-aarch64.c to that 
file
and use that to debug the issue.

Thanks,
Kyrill


Ok for the trunk even though I missed out on stage 1?

Thanks,
Andrew

Andrew Pinski (5):
   [AARCH64]: Move #undef into .def files.
   [AARCH64] Change IMP and PART over to integers from strings.
   [AARCH64] Fix part num and implement indendent.
   {AARCH64] Add comment for the company's cores.
   [AARCH64] Add variant support to -m*=native and add thunderxt88pass1.

  gcc/common/config/aarch64/aarch64-common.c   |   5 +-
  gcc/config/aarch64/aarch64-arches.def|   1 +
  gcc/config/aarch64/aarch64-cores.def |  43 --
  gcc/config/aarch64/aarch64-fusion-pairs.def  |   1 +
  gcc/config/aarch64/aarch64-option-extensions.def |   2 +
  gcc/config/aarch64/aarch64-opts.h|   4 +-
  gcc/config/aarch64/aarch64-protos.h  |   4 -
  gcc/config/aarch64/aarch64-tune.md   |   2 +-
  gcc/config/aarch64/aarch64-tuning-flags.def  |   1 +
  gcc/config/aarch64/aarch64.c |   7 +-
  gcc/config/aarch64/aarch64.h |   3 +-
  gcc/config/aarch64/driver-aarch64.c  | 169 +--
  12 files changed, 136 insertions(+), 106 deletions(-)





Re: Backport important ASan features from upstream.

2015-11-19 Thread Maxim Ostapenko

On 19/11/15 13:18, Jakub Jelinek wrote:

On Thu, Nov 19, 2015 at 11:19:23AM +0300, Maxim Ostapenko wrote:

Hi!

Since the last sanitizer library merge to GCC happened, some new useful
features were applied upstream. In particular, the most significant are:

* The shadow offset for ASan was unified on Aarch64 for 39 and 42 VMAs
(http://reviews.llvm.org/D13782). AFAU, this change wouldn't require any
additional support from compiler side, because the shadow offset is the same
as for 39-bit VMA (36) .
* Optional ASan recovery functionality was merged to sanitizer library
(http://reviews.llvm.org/D12318). This feature seems to be very helpful in
complex systems where we may want to proceed to work even in case of bug was
detected. However, to access this functionality, we'll need to implement new
asan_report_{store, load}{1, 2, 4, 8, 16, N}_noabort callbacks in compiler.
This is probably unacceptable for stage 3.

No, those are already there (for -fsanitize{,-recover}=kernel-address).
IMHO all you need is remove:
   if (opts->x_flag_sanitize_recover & SANITIZE_USER_ADDRESS)
 error_at (loc, "-fsanitize-recover=address is not supported");
in opts.c (finish_options), and in asan.c (asan_expand_check_ifn)
change
   bool recover_p
 = (flag_sanitize & flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
to say:
   bool recover_p;
   if (flag_sanitize & SANITIZE_USER_ADDRESS)
 recover_p = (flag_sanitize_recover & SANITIZE_USER_ADDRESS) != 0;
   else
 recover_p = (flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
(plus add some testsuite coverage).
This is small enough change that I'm ok with an exception for this.


I think it would be nice to have unified shadow offset on Aarch64 for 39 and
42 VMAs even in GCC 6 (enabling ASan recovery would be nice too, but it's
much harder).

So, my question is: is it acceptable to backport these features from
upstream without touching compiler side? If so, I see two options here:

- Perform sanitizer library merge to GCC without changing compiler side.

And at this point I also prefer the above, rather than cherry-picking, of
course later on in the process cherry-picking will be desirable instead.


Great, thanks! I'll try to perform another merge as soon as possible.




- Cherry-pick the patch for AArch64 on top of current trunk.

Jakub





Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-19 Thread Tom de Vries

On 16/11/15 13:45, Richard Biener wrote:

I've eliminated all the uses for pass_tree_loop_init/pass_tree_loop_done in
>the pass group. Instead, I've added conditional loop optimizer setup in:
>-  pass_lim and pass_scev_cprop (added in this patch), and


Reposting the "Add pass_oacc_kernels pass group in passes.def" patch.

pass_scev_cprop is no longer part of the pass group.

And I've dropped the scev_initialize in pass_lim.

Pass_lim is part of the pass_tree_loop pass group, where AFAIU scev info 
is initialized at the start of the pass group and updated or reset by 
passes in the pass group if necessary, such that it's always available, 
or can be recalculated on the spot.


First, pass_lim doesn't invalidate scev info. And second, AFAIU pass_lim 
doesn't use scev info. So there doesn't seem to be a need to do anything 
about scev info for using pass_lim outside pass_tree_loop.



>- pass_parallelize_loops_oacc_kernels (added in patch "Add
>   pass_parallelize_loops_oacc_kernels").

You miss calling scev_finalize ().


I've added the scev_finalize () in patch "Add 
pass_parallelize_loops_oacc_kernels".


Thanks,
- Tom

Add pass_oacc_kernels pass group in passes.def

2015-11-09  Tom de Vries  

	* omp-low.c (pass_expand_omp_ssa::clone): New function.
	* passes.def: Add pass_oacc_kernels pass group.
	* tree-ssa-loop-ch.c (pass_ch::clone): New function.
	* tree-ssa-loop-im.c (tree_ssa_lim): Make static.
	(pass_lim::execute): Allow to run outside pass_tree_loop.

---
 gcc/omp-low.c  |  1 +
 gcc/passes.def | 25 +
 gcc/tree-ssa-loop-ch.c |  2 ++
 gcc/tree-ssa-loop-im.c | 10 +-
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 9c27396..d2f88b3 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13385,6 +13385,7 @@ public:
   return !(fun->curr_properties & PROP_gimple_eomp);
 }
   virtual unsigned int execute (function *) { return execute_expand_omp (); }
+  opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); }
 
 }; // class pass_expand_omp_ssa
 
diff --git a/gcc/passes.def b/gcc/passes.def
index 17027786..00446c3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,7 +88,32 @@ along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	 execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  /* Pass group that runs when the function is an offloaded function
+	 containing oacc kernels loops.  Part 1.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	  /* We need pass_ch here, because pass_lim has no effect on
+	 exit-first loops (PR65442).  Ideally we want to remove both
+		 this pass instantiation, and the reverse transformation
+		 transform_to_exit_first_loop_alt, which is done in
+		 pass_parallelize_loops_oacc_kernels. */
+	  NEXT_PASS (pass_ch);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_fre);
+	  /* Pass group that runs when the function is an offloaded function
+	 containing oacc kernels loops.  Part 2.  */
+	  NEXT_PASS (pass_oacc_kernels2);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+	  /* We use pass_lim to rewrite in-memory iteration and reduction
+	 variable accesses in loops into local variables accesses.  */
+	  NEXT_PASS (pass_lim);
+	  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	  NEXT_PASS (pass_lim);
+	  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	  NEXT_PASS (pass_dce);
+	  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+	  NEXT_PASS (pass_expand_omp_ssa);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 7e618bf..6493fcc 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -165,6 +165,8 @@ public:
   /* Initialize and finalize loop structures, copying headers inbetween.  */
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_ch (m_ctxt); }
+
 protected:
   /* ch_base method: */
   virtual bool process_loop_p (struct loop *loop);
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 30b53ce..96f05f2 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2496,7 +2496,7 @@ tree_ssa_lim_finalize (void)
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
i.e. those that are likely to be win regardless of the register pressure.  */
 
-unsigned int
+static unsigned int
 tree_ssa_lim (void)
 {
   unsigned int todo;
@@ -2560,9 +2560,17 @@ public:
 unsigned int
 pass_lim::execute (function *fun)
 {
+  if (!loops_state_satisfies_p (LOOPS_NORMAL
+| LOOPS_HAVE_RECORDED_EXITS))
+loop_optimizer_init (LOOPS_NORMAL
+			 | LOOPS_HAVE_RECORDED_EXITS);
+
   if 

Re: [PATCH,RFC] Introduce RUN_UNDER_VALGRIND in test-suite

2015-11-19 Thread Markus Trippelsdorf
On 2015.11.19 at 15:38 +0100, Martin Liška wrote:
> 
> In last two weeks I've removed couple of memory leaks, mainly tight to
> middle-end.  Currently, a user of the GCC compiler can pass
> '--enable-checking=valgrind' configure option that will run all
> commands within valgrind environment, but as the valgrind runs just
> with '-q' option, the result is not very helpful.

Well, it is easy to add as many valgrind options as you like to gcc.c.
(I normally add --track-origins=yes and just adjust the for loop at
line 3045).

The advantage of that approach is that the valgrind annotation macros
for the alloc-pool and the garbage collector (etc.) are used. So no
suppression file is needed.

-- 
Markus


Re: [PATCH] [4.9] Re: [PATCH][5] Backport ISL 0.15 support

2015-11-19 Thread Matthias Klose

On 19.11.2015 09:07, Richard Biener wrote:

On Wed, 18 Nov 2015, Matthias Klose wrote:


On 12.10.2015 12:58, Richard Biener wrote:


This backports the patch to allow bootstrapping with ISL 0.15 to the
GCC 5 branch (the GCC 4.9 branch will require backporting of some
dependencies).


I don't think so. 4.8 and 4.9 don't use as much ISL code as 5 does.  I had a
look at the backport and came up with something which is just mechanical
changes for the cpp conditional.

The version check in the toplevel configure needs to be extended.  I'm
currently not checking non matching isl and cloog versions.

Now build for 4.9 using ISL 0.15 and 0.14. Ok to commit to the branch?


Ok.  I wonder if we have a cloog version in infrastructure that builds
against ISL 0.1[45]?  IIRC cloog 0.18.1 doesn't.


No, that would be 0.18.4, the minimum requirement for isl 0.15.

Trying to rebuild 0.15 against 0.12.2 results in test failures (below), 
rebuilding against 0.14 doesn't show test failures.


Matthias

--- ./reservoir/QR.c 2014-10-02 14:33:50.0 +
+++ cloog_temp  2015-11-19 15:12:15.844164292 +
@@ -1,12 +1,6 @@
-/* Generated from ./reservoir/QR.cloog by CLooG 0.18.1-2-g43fc508 gmp bits in 
0.07s. */
+/* Generated from ./reservoir/QR.cloog by CLooG 0.18.4-b2b4f77 gmp bits in 
0.13s. */

 if (N >= 1) {
   S1(0);
-  if ((M <= 0) && (N >= 2)) {
-S3(0);
-S10(0);
-S1(1);
-S5(0);
-  }
   if ((M >= 1) && (N == 1)) {
 for (c4=0;c4<=M-1;c4++) {
   S2(0,c4);
@@ -34,6 +28,12 @@
 S10(0);
 S1(1);
 S5(0);
+  }
+  if ((M <= 0) && (N >= 2)) {
+S3(0);
+S10(0);
+S1(1);
+S5(0);
   }
   for (c2=2;c2<=min(M,N-1);c2++) {
 for (c4=c2-1;c4<=N-1;c4++) {
FAIL: ./reservoir/QR.c has a problem

--- ./isl/jacobi-shared.c  2014-10-02 16:33:52.0 +
+++ cloog_temp  2015-11-19 15:12:17.260192519 +
@@ -3,7 +3,7 @@
   if ((16*floord(t0-1,16) >= -N+g1+t0+1) && (16*floord(g1+t0-3,16) >= 
-N+g1+t0+1) && (32*floord(t1-1,32) >= -N+g2+t1+1) && (32*floord(g2+t1-3,32) >= 
t1-32)) {
 for 
(c0=max(-16*floord(t0-1,16)+t0,-16*floord(g1+t0-3,16)+t0);c0<=min(32,N-g1-1);c0+=16) 
{

   for (c1=-32*floord(t1-1,32)+t1;c1<=min(32,N-g2-1);c1+=32) {
-if (c1 >= 1) {
+if ((c1 >= 1) && (c1 <= 32)) {
   S1((c0+g1-1),(c1+g2-1));
 }
   }
FAIL: ./isl/jacobi-shared.c has a problemE


Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-19 Thread Yuri Rumyantsev
Hi Richard,

I send you updated version of patch which contains fixes you mentioned
and additional early exit in
register_edge_assert_for() for gcond with vector comparison - it tries
to produce assert for
  if (vect != {0,0,0,0}) but can't create such constant. This is not
essential since this is applied to very specialized context.

My answers are below.

2015-11-12 16:58 GMT+03:00 Richard Biener :
> On Wed, Nov 11, 2015 at 2:13 PM, Yuri Rumyantsev  wrote:
>> Richard,
>>
>> What we should do to cope with this problem (structure size increasing)?
>> Should we return to vector comparison version?
>
> Ok, given this constraint I think the cleanest approach is to allow
> integer(!) vector equality(!) compares with scalar result.  This should then
> expand via cmp_optab and not via vec_cmp_optab.

  In fact it is expanded through cbranch_optab since the only use of
such comparison is for masked
store motion
>
> On gimple you can then have
>
>  if (mask_vec_1 != {0, 0,  })
> ...
>
> Note that a fallback expansion (for optabs.c to try) would be
> the suggested view-conversion (aka, subreg) variant using
> a same-sized integer mode.
>
> Target maintainers can then choose what is a better fit for
> their target (and instruction set as register set constraints may apply).
>
> The patch you posted seems to do this but not restrict the compares
> to integer ones (please do that).
>
>if (TREE_CODE (op0_type) == VECTOR_TYPE
>   || TREE_CODE (op1_type) == VECTOR_TYPE)
>  {
> -  error ("vector comparison returning a boolean");
> -  debug_generic_expr (op0_type);
> -  debug_generic_expr (op1_type);
> -  return true;
> + /* Allow vector comparison returning boolean if operand types
> +are equal and CODE is EQ/NE.  */
> + if ((code != EQ_EXPR && code != NE_EXPR)
> + || TREE_CODE (op0_type) != TREE_CODE (op1_type)
> + || TYPE_VECTOR_SUBPARTS (op0_type)
> +!= TYPE_VECTOR_SUBPARTS (op1_type)
> + || GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op0_type)))
> +!= GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op1_type
>
> These are all checked with the useless_type_conversion_p checks done earlier.
>
> As said I'd like to see a
>
> || ! VECTOR_INTEGER_TYPE_P (op0_type)

  I added check on VECTOR_BOOLEAN_TYPE_P (op0_type) instead since type
of mask was changed.
>
> check added so we and targets do not need to worry about using EQ/NE vs. CMP
> and worry about signed zeros and friends.
>
> +   {
> + error ("type mismatch for vector comparison returning a 
> boolean");
> + debug_generic_expr (op0_type);
> + debug_generic_expr (op1_type);
> + return true;
> +   }
>
>
>
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -422,6 +422,15 @@ forward_propagate_into_comparison_1 (gimple *stmt,
>   enum tree_code def_code = gimple_assign_rhs_code (def_stmt);
>   bool invariant_only_p = !single_use0_p;
>
> + /* Can't combine vector comparison with scalar boolean type of
> +the result and VEC_COND_EXPR having vector type of comparison.  
> */
> + if (TREE_CODE (TREE_TYPE (op0)) == VECTOR_TYPE
> + && INTEGRAL_TYPE_P (type)
> + && (TREE_CODE (type) == BOOLEAN_TYPE
> + || TYPE_PRECISION (type) == 1)
> + && def_code == VEC_COND_EXPR)
> +   return NULL_TREE;
>
> this hints at larger fallout you paper over here.  So this effectively
> means we're trying combining (vec1 != vec2) != 0 for example
> and that fails miserably?  If so then the solution is to fix whatever
> does not expect this (valid) GENERIC tree.

  I changed it to the following check in combine_cond_expr_cond:
  /* Do not perform combining it types are not compatible.  */
  if (TREE_CODE (TREE_TYPE (op0)) == VECTOR_TYPE
  && !tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (op0
return NULL_TREE;
>
> +  if (ENABLE_ZERO_TEST_FOR_MASK_STORE == 0)
> +return;
>
> not sure if I like a param more than a target hook ... :/
  I introduced the param instead of a target hook to make this
transformation user visible, i.e. to switch it on/off
for different targets.
>
> +  /* Create vector comparison with boolean result.  */
> +  vectype = TREE_TYPE (mask);
> +  zero = build_zero_cst (TREE_TYPE (vectype));
> +  zero = build_vector_from_val (vectype, zero);
>
> build_zero_cst (vectype);

Done.
>
> +  stmt = gimple_build_cond (EQ_EXPR, mask, zero, NULL_TREE, NULL_TREE);
>
> you can omit the NULL_TREE operands.

  I did not find such definition for it.
>
> +  gcc_assert (vdef && TREE_CODE (vdef) == SSA_NAME);
>
> please omit the assert.

  Done.
>
> +  gimple_set_vdef (last, new_vdef);
>
> do this before you create the PHI.
>
  Done.
> + /* Put definition 

[PATCH; committed] Fix typo in doc/gty.texi

2015-11-19 Thread David Malcolm
Committed to trunk (as r230609) as obvious, having
verified that the docs build.

gcc/ChangeLog:
* doc/gty.texi (Support for inheritance): Fix missing
parentheses in example.
---
 gcc/doc/gty.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 5e0a465..f66fa01 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -429,13 +429,13 @@ public:
 tree a;
 @};
 
-class GTY((tag("1")) some_subclass : public example_base
+class GTY((tag("1"))) some_subclass : public example_base
 @{
 public:
 tree b;
 @};
 
-class GTY((tag("2")) some_other_subclass : public example_base
+class GTY((tag("2"))) some_other_subclass : public example_base
 @{
 public:
 tree c;
-- 
1.8.5.3



Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Kyrill Tkachov


On 19/11/15 15:00, Kyrill Tkachov wrote:


On 19/11/15 14:41, Segher Boessenkool wrote:

On Thu, Nov 19, 2015 at 01:38:53PM +, Kyrill Tkachov wrote:

That is troublesome.  Could you look deeper?

Yes.

Thanks.


So the bad case is when we're in subst and returning a CLOBBER of zero
and 'from' is (reg/v:SI 80 [ x ]) and 'to' is (zero_extend:SI (reg:HI 0 x0
[ x ])).
The call to subst comes from try_combine at line 3403:

if (added_sets_1)
 {
   rtx t = i1pat;
   if (i0_feeds_i1_n)
 t = subst (t, i0dest, i0src_copy ? i0src_copy : i0src, 0, 0, 0);

   XVECEXP (newpat, 0, --total_sets) = t;
 }

It uses t after calling subst on it without checking that it didn't return
a clobber.
If I change that snippet to check for CLOBBER:
   if (added_sets_1)
 {
   rtx t = i1pat;
   if (i0_feeds_i1_n)
 t = subst (t, i0dest, i0src_copy ? i0src_copy : i0src, 0, 0, 0);

   if (GET_CODE (t) == CLOBBER)
 {
   undo_all ();
   return 0;
 }
   XVECEXP (newpat, 0, --total_sets) = t;
 }

The testcase gets fixed.
But shouldn't the XVECEXP (newpat, 0, --total_sets) = t; line create an
uncrecognizable rtx
that would then get rejected by combine or something?

Yes.  recog_for_combine_1 checks for a PARALLEL with such a CLOBBER
right at the start; and of course having the clobber elsewhere will
just not match.


If we don't check for clobber there and perform the "XVECEXP = ..."
the resulting newpat looks like:
(parallel [
 (set (reg:CC 66 cc)
 (compare:CC (const_int 0 [0])
 (const_int 0 [0])))
 (nil)
 (clobber:DI (const_int 0 [0]))
 ])

ah, so the clobber is put in a parallel with another insn and is thus
accepted by recog?

No, recog_for_combine should refuse it because of that third arm.
The second arm (the nil) looks very wrong, where is that coming
from?  That isn't valid RTL.


Well, it came from a bit earlier before the subst call (around line 3390):
  /* If the actions of the earlier insns must be kept
 in addition to substituting them into the latest one,
 we must make a new PARALLEL for the latest insn
 to hold additional the SETs.  */

  rtx old = newpat;
  total_sets = 1 + extra_sets;
  newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets));
  XVECEXP (newpat, 0, 0) = old;


extra_sets is 2, so we have a parallel with 3 slots so after the subst call
we put the clobber into --total_sets, that is slot 2:
  XVECEXP (newpat, 0, --total_sets) = t;
A bit further below we fill slot 1 with another rtx, so all 3 parts of the 
PARALLEL
are filled.
I'll look further into why recog_for_combine doesn't kill the whole thing.



Hmmm, so the answer to that is a bit further down the validate_replacement: 
path.
It's the code after the big comment:
  /* See if this is a PARALLEL of two SETs where one SET's destination is
 a register that is unused and this isn't marked as an instruction that
 might trap in an EH region.  In that case, we just need the other SET.
 We prefer this over the PARALLEL.

 This can occur when simplifying a divmod insn.  We *must* test for this
 case here because the code below that splits two independent SETs doesn't
 handle this case correctly when it updates the register status.

 It's pointless doing this if we originally had two sets, one from
 i3, and one from i2.  Combining then splitting the parallel results
 in the original i2 again plus an invalid insn (which we delete).
 The net effect is only to move instructions around, which makes
 debug info less accurate.  */

The code extracts all the valid sets inside the PARALLEL and calls 
recog_for_combine on them
individually, ignoring the clobber. Should we tell it to reject any. So 
recog_for_combine doesn't
complain because it never sees the clobbers and all the other checks in 
try_combine are gated
on "insn_code < 0" so they never apply.

Where should we add the extra checks for CLOBBER? Or do you reckon we should 
add another call to
recog_for_combine somewhere there?

Thanks,
Kyrill


Kyrill






So, should we check 't' after subst for clobbers as above? Or should this
be fixed in
some other place?

There is a bug somewhere, so that will need fixing.  Workarounds are
last resort, and even then we really need to know what is going on.


Thanks. In light of the above I think this patch happens to avoid
the issue highlighted above but we should fix the above code separately?

Yes, if your patch creates better code we want it (and fix the regression),
but you exposed a separate bug as well :-)


Segher







Re: [C++ PATCH] Issue hard error even with -fpermissive for certain goto violations (PR c++/67409, take 2)

2015-11-19 Thread Jason Merrill

OK.

Jason


Re: [PATCH] (Partial) Implementation of simplificaiton of CSHIFT

2015-11-19 Thread Steve Kargl
On Thu, Nov 19, 2015 at 05:31:32PM -0800, Steve Kargl wrote:
> On Thu, Nov 19, 2015 at 04:58:36PM -0800, Steve Kargl wrote:
> > +  else
> > +{
> > +  int dm;
> > +
> > +  if (dim)
> > +   {
> > + if (!gfc_is_constant_expr (dim))
> > +   return NULL;
> > +
> > + dm = mpz_get_si (dim->value.integer);
> > +   }
> > +  else
> > +   dm = 1;
> > +
> > +  gfc_error ("Simplification of CSHIFT with an array with rank > 1 "
> > +"no yet support");
> > +}
> > +
> 
> To save some time, the dim portion of the patch isn't
> correct.  dim can be scalar or rank 1 array.  I'll
> #if 0 ... #endif this section unless I persevere with
> the rank > 1 case.

Ugh.  Too much gdb today.  The above is correct.  I conflated
SHIFT and DIM's requirements.

-- 
Steve


Re: [PATCH 1/2] [graphite] Move codegen related functions to graphite-isl-ast-to-gimple.c

2015-11-19 Thread David Edelsohn
The patch fixed the bootstrap failure.

Thanks, David

On Thu, Nov 19, 2015 at 3:37 PM, Sebastian Pop  wrote:
> Fixed in r230625.
>
> David, please test on your systems, we were not able to reproduce the fails on
> x86_64-linux: the linker does optimize away the functions that are not used.
>
> Thanks,
> Sebastian
>
> Aditya K wrote:
>> Thanks for the update. I'll fix that asap.
>>
>>
>> -Aditya
>>
>>
>> 
>> > Date: Thu, 19 Nov 2015 08:36:58 -0500
>> > Subject: Re: [PATCH 1/2] [graphite] Move codegen related functions to 
>> > graphite-isl-ast-to-gimple.c
>> > From: dje@gmail.com
>> > To: hiradi...@msn.com; aditya...@samsung.com; s@samsung.com
>> > CC: gcc-patches@gcc.gnu.org
>> >
>> > This patch broke bootstrap when ISL is not enabled.
>> >
>> > graphite-isl-ast-to-gimple.c is protected by HAVE_isl but
>> > get_false_edge_from_guard_bb() is used outside of Graphite, including
>> > sese.c, which is not restricted to HAVE_isl.
>> >
>> > Please fix.
>> >
>> > Thanks, David
>>


Re: [PATCH] Fix memory leaks in tree-ssa-uninit.c

2015-11-19 Thread Bernd Schmidt
BTW, I'm with whoever said absolutely no way to the idea of making 
automatic changes like this as part of a commit hook.


I think the whitespace change can go in if it hasn't already, but I 
think the other one still has enough problems that I'll say - leave it 
for the next stage 1.



@@ -178,8 +173,9 @@ warn_uninitialized_vars (bool warn_possibly_uninitialized)

   FOR_EACH_BB_FN (bb, cfun)
 {
-  bool always_executed = dominated_by_p (CDI_POST_DOMINATORS,
-single_succ 
(ENTRY_BLOCK_PTR_FOR_FN (cfun)), bb);
+  bool always_executed
+   = dominated_by_p (CDI_POST_DOMINATORS,
+ single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)), bb);
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
{


Better to pull the single_succ into its own variable perhaps?


@@ -1057,7 +1039,8 @@ prune_uninit_phi_opnds_in_unrealizable_paths (gphi *phi,
*visited_flag_phis = BITMAP_ALLOC (NULL);

  if (bitmap_bit_p (*visited_flag_phis,
-   SSA_NAME_VERSION (gimple_phi_result 
(flag_arg_def
+   SSA_NAME_VERSION (
+ gimple_phi_result (flag_arg_def
return false;

  bitmap_set_bit (*visited_flag_phis,


Pull the gimple_phi_result into a separate variable, or just leave it 
unchanged for now, the modified version is too ugly to live.



  bitmap_clear_bit (*visited_flag_phis,
-   SSA_NAME_VERSION (gimple_phi_result 
(flag_arg_def)));
+   SSA_NAME_VERSION (
+ gimple_phi_result (flag_arg_def)));
  continue;
}


Here too.


-  all_pruned = prune_uninit_phi_opnds_in_unrealizable_paths (phi,
-uninit_opnds,
-as_a  
(flag_def),
-boundary_cst,
-cmp_code,
-visited_phis,
-
_flag_phis);
+  all_pruned = prune_uninit_phi_opnds_in_unrealizable_paths
+(phi, uninit_opnds, as_a (flag_def), boundary_cst, cmp_code,
+ visited_phis, _flag_phis);


I'd rather shorten the name of the function, even if it goes against the 
spirit of the novel writing month.



-  if (gphi *use_phi = dyn_cast  (use_stmt))
-   use_bb = gimple_phi_arg_edge (use_phi,
- PHI_ARG_INDEX_FROM_USE (use_p))->src;
+  if (gphi *use_phi = dyn_cast (use_stmt))
+   use_bb
+ = gimple_phi_arg_edge (use_phi, PHI_ARG_INDEX_FROM_USE (use_p))->src;
else
use_bb = gimple_bb (use_stmt);


There are some changes of this nature and I'm not sure it's an 
improvement. Leave them out for now?



@@ -2345,8 +2309,8 @@ warn_uninitialized_phi (gphi *phi, vec *worklist,
  }

/* Now check if we have any use of the value without proper guard.  */
-  uninit_use_stmt = find_uninit_use (phi, uninit_opnds,
-worklist, added_to_worklist);
+  uninit_use_stmt
+= find_uninit_use (phi, uninit_opnds, worklist, added_to_worklist);

/* All uses are properly guarded.  */
if (!uninit_use_stmt)


Here too.


@@ -2396,12 +2359,24 @@ public:
{}

/* opt_pass methods: */
-  opt_pass * clone () { return new pass_late_warn_uninitialized (m_ctxt); }
-  virtual bool gate (function *) { return gate_warn_uninitialized (); }
+  opt_pass *clone ();
+  virtual bool gate (function *);


This may technically violate our coding standards, but it's consistent 
with a lot of similar cases. Since coding standards are about enforcing 
consistency, I'd drop this change.



  namespace {

  const pass_data pass_data_early_warn_uninitialized =
  {
-  GIMPLE_PASS, /* type */
+  GIMPLE_PASS,/* type */
"*early_warn_uninitialized", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
-  TV_TREE_UNINIT, /* tv_id */
-  PROP_ssa, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  0, /* todo_flags_finish */
+  OPTGROUP_NONE,  /* optinfo_flags */
+  TV_TREE_UNINIT, /* tv_id */
+  PROP_ssa,   /* properties_required */
+  0,  /* properties_provided */
+  0,  /* properties_destroyed */
+  0,  /* todo_flags_start */
+  0,  /* todo_flags_finish */
  };


Likewise. Seems to be done practically nowhere.


  class pass_early_warn_uninitialized : public gimple_opt_pass
@@ -2519,14 +2491,23 @@ public:
{}

/* opt_pass methods: */
-  virtual bool gate (function *) { return gate_warn_uninitialized (); }
-  virtual unsigned 

Fix -fno-checking segfault

2015-11-19 Thread Michael Matz
Hi,

in an enabled-checking compiler gcc_checking_assert is always executed.  
If that depends on things having happened under flag_checking being true, 
but it's actually false during runtime due to -fno-checking things go 
awry, like segfaulting in this case.  The fix is obvious and checked in 
after regstrapping on x86_64-linux.


Ciao,
Michael.
* fwprop.c (update_uses): Use flag_checking instead of
gcc_checking_assert.

Index: fwprop.c
===
--- fwprop.c(revision 230463)
+++ fwprop.c(working copy)
@@ -893,7 +893,8 @@ update_uses (df_ref use)
   if (DF_REF_ID (use) >= (int) use_def_ref.length ())
 use_def_ref.safe_grow_cleared (DF_REF_ID (use) + 1);
 
-  gcc_checking_assert (sparseset_bit_p (active_defs_check, regno));
+  if (flag_checking)
+   gcc_assert (sparseset_bit_p (active_defs_check, regno));
   use_def_ref[DF_REF_ID (use)] = active_defs[regno];
 }
 }


Re: Fix -fno-checking segfault

2015-11-19 Thread Kyrill Tkachov

Hi Michael,

On 19/11/15 16:13, Michael Matz wrote:

Hi,

in an enabled-checking compiler gcc_checking_assert is always executed.
If that depends on things having happened under flag_checking being true,
but it's actually false during runtime due to -fno-checking things go
awry, like segfaulting in this case.  The fix is obvious and checked in
after regstrapping on x86_64-linux.


I think this is PR 68392 if you want to mark it in your ChangeLog.

Kyrill



Ciao,
Michael.
* fwprop.c (update_uses): Use flag_checking instead of
gcc_checking_assert.

Index: fwprop.c
===
--- fwprop.c(revision 230463)
+++ fwprop.c(working copy)
@@ -893,7 +893,8 @@ update_uses (df_ref use)
if (DF_REF_ID (use) >= (int) use_def_ref.length ())
  use_def_ref.safe_grow_cleared (DF_REF_ID (use) + 1);
  
-  gcc_checking_assert (sparseset_bit_p (active_defs_check, regno));

+  if (flag_checking)
+   gcc_assert (sparseset_bit_p (active_defs_check, regno));
use_def_ref[DF_REF_ID (use)] = active_defs[regno];
  }
  }





Re: OpenACC declare directive updates

2015-11-19 Thread James Norris

Jakub,

Here's the updated version of the Fortran changes. More test
cases have been added as well as the issues that Cesar
pointed on in error checking have been addressed (Thanks).
I've also addressed the issue, described below, in dealing
with declare directives when found within a BLOCK construct.

On 11/06/2015 01:49 PM, Jakub Jelinek wrote:

On Fri, Nov 06, 2015 at 01:45:09PM -0600, James Norris wrote:

Okay, I'll fix this.

After fixing, OK to commit?

Thank you for taking the time for the review.


Well, isn't this patch really dependent on the other one?

Also, wonder about BLOCK stmt in Fortran, that can give you variables that
don't live through the whole function, but only a portion of it even in
Fortran.




On 11/18/2015 02:09 PM, Cesar Philippidis wrote:
> On 11/08/2015 08:53 PM, James Norris wrote:
>
>
> What block stmt? The most recent version of Fortran OpenACC 2.0a
> supports is 2003. The block construct is a 2008 feature. I don't think
> that's applicable to this version. Jim, maybe you should add an error
> message for variables defined in blocks.
>
> Thinking about this some more, I wonder if we should emit an error if
> any acc constructs are used inside blocks? That's probably overly
> pessimistic though.

Thanks!
Jim

2015-XX-XX  James Norris  
Cesar Philippidis  

gcc/fortran/
* dump-parse-tree.c (show_namespace): Handle declares.
* gfortran.h (struct symbol_attribute): New fields.
(enum gfc_omp_map_map): Add OMP_MAP_DEVICE_RESIDENT and OMP_MAP_LINK.
(OMP_LIST_LINK): New enum.
(struct gfc_oacc_declare): New structure.
(gfc_get_oacc_declare): New definition.
(struct gfc_namespace): Change type.
(enum gfc_exec_op): Add EXEC_OACC_DECLARE.
(struct gfc_code): New field.
* module.c (enum ab_attribute): Add AB_OACC_DECLARE_CREATE,
AB_OACC_DECLARE_COPYIN, AB_OACC_DECLARE_DEVICEPTR,
AB_OACC_DECLARE_DEVICE_RESIDENT, AB_OACC_DECLARE_LINK
(attr_bits): Add new initializers.
(mio_symbol_attribute): Handle new atributes.
* openmp.c (gfc_free_oacc_declare_clauses): New function.
(gfc_match_oacc_clause_link: Likewise.
(OMP_CLAUSE_LINK): New definition.
(gfc_match_omp_clauses): Handle OMP_CLAUSE_LINK.
(OACC_DECLARE_CLAUSES): Add OMP_CLAUSE_LINK
(gfc_match_oacc_declare): Add checking and module handling.
(gfc_resolve_oacc_declare): Reimplement.
* parse.c (case_decl): Add ST_OACC_DECLARE.
(parse_spec): Remove handling.
(parse_progunit): Remove handling.
* parse.h (struct gfc_state_data): Change type.
* resolve.c (gfc_resolve_blocks): Handle EXEC_OACC_DECLARE.
* st.c (gfc_free_statement): Handle EXEC_OACC_DECLARE.
* symbol.c (check_conflict): Add conflict checks.
(gfc_add_oacc_declare_create, gfc_add_oacc_declare_copyin, 
gfc_add_oacc_declare_deviceptr, gfc_add_oacc_declare_device_resident):
New functions.
(gfc_copy_attr): Handle new symbols.
* trans-decl.c (add_clause, find_module_oacc_declare_clauses,
finish_oacc_declare): New functions.
(gfc_generate_function_code): Replace with call.
* trans-openmp.c (gfc_trans_oacc_declare): Reimplement.
(gfc_trans_oacc_directive): Handle EXEC_OACC_DECLARE.
* trans-stmt.c (gfc_trans_block_construct): Replace with call.
* trans-stmt.h (gfc_trans_oacc_declare): Remove argument.
* trans.c (trans_code): Handle EXEC_OACC_DECLARE.

gcc/testsuite
* gfortran.dg/goacc/declare-1.f95: Update test.
* gfortran.dg/goacc/declare-2.f95: New test.

libgomp/
* testsuite/libgomp.oacc-fortran/declare-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/declare-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/declare-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/declare-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/declare-5.f90: Likewise.
diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 83ecbaa..48476af 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -2570,12 +2570,16 @@ show_namespace (gfc_namespace *ns)
   for (eq = ns->equiv; eq; eq = eq->next)
 show_equiv (eq);
 
-  if (ns->oacc_declare_clauses)
+  if (ns->oacc_declare)
 {
+  struct gfc_oacc_declare *decl;
   /* Dump !$ACC DECLARE clauses.  */
-  show_indent ();
-  fprintf (dumpfile, "!$ACC DECLARE");
-  show_omp_clauses (ns->oacc_declare_clauses);
+  for (decl = ns->oacc_declare; decl; decl = decl->next)
+	{
+	  show_indent ();
+	  fprintf (dumpfile, "!$ACC DECLARE");
+	  show_omp_clauses (decl->clauses);
+	}
 }
 
   fputc ('\n', dumpfile);
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h

update acc routines in fortran

2015-11-19 Thread Cesar Philippidis
This patch extends the existing support for acc routines in fortran.
It's a little bit more invasive than what I remembered, but it's still
fairly straightforward. Basically, it adds support for the following:

 - name routines
 - gang, worker, vector and seq clauses

In addition, I've also taught tree-nested to be aware of the
aforementioned clauses. Without those tree-nested changes, a lot of the
new test cases would fail.

If you observe the changelog closely, you'll noticed that I didn't
include libgomp.oacc-fortran/routine-[48].f90. The reason is, we don't
have support for the bind and nohost clauses on trunk yet. Thomas posted
a patch right before stage1 closed. So if that patch gets accepted, I'll
create a follow up patch for routines in fortran.

This this OK for trunk?

Cesar
2015-11-19  Cesar Philippidis  

	gcc/
	* tree-nested.c (convert_nonlocal_omp_clauses): Add support for
	OMP_CLAUSE_{NUM_GANGS,NUM_VECTORS,VECTOR_LENGTH,SEQ}.
	(convert_local_omp_clauses): Likewise.

2015-11-19  Cesar Philippidis  
	James Norris  
	Nathan Sidwell  

	gcc/fortran/
	* f95-lang.c (gfc_attribute_table): Add an "oacc function"
	attribute.
	* gfortran.h (symbol_attribute): Add an oacc_function bit-field.
	(gfc_oacc_routine_name): New struct;
	(gfc_get_oacc_routine_name): New macro.
	(gfc_namespace): Add oacc_routine_clauses, oacc_routine_names and
	oacc_routine fields.
	(gfc_exec_op): Add EXEC_OACC_ROUTINE.
	* openmp.c (OACC_ROUTINE_CLAUSES): New mask.
	(gfc_oacc_routine_dims): New function.
	(gfc_match_oacc_routine): Add support for named routines and the
	gang, worker vector and seq clauses.
	* parse.c (is_oacc): Add EXEC_OACC_ROUTINE.
	* resolve.c (gfc_resolve_blocks): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-decl.c (add_attributes_to_decl): Attach an 'oacc function'
	attribute and shape geometry for acc routine.

2015-11-19  Cesar Philippidis  
	Nathan Sidwell  

	gcc/testsuite/
	* gfortran.dg/goacc/routine-3.f90: New test.
	* gfortran.dg/goacc/routine-4.f90: New test.
	* gfortran.dg/goacc/routine-5.f90: New test.
	* gfortran.dg/goacc/routine-6.f90: New test.

2015-11-19  James Norris  
	Cesar Philippidis  

	libgomp/
	* libgomp.oacc-fortran/routine-5.f90: New test.
	* libgomp.oacc-fortran/routine-7.f90: New test.
	* libgomp.oacc-fortran/routine-9.f90: New test.

diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index 605c2ab..8556b70 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -93,6 +93,8 @@ static const struct attribute_spec gfc_attribute_table[] =
affects_type_identity } */
   { "omp declare target", 0, 0, true,  false, false,
 gfc_handle_omp_declare_target_attribute, false },
+  { "oacc function", 0, -1, true,  false, false,
+gfc_handle_omp_declare_target_attribute, false },
   { NULL,		  0, 0, false, false, false, NULL, false }
 };
 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index e13b4d4..3dbcd96 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -841,6 +841,9 @@ typedef struct
   /* Mentioned in OMP DECLARE TARGET.  */
   unsigned omp_declare_target:1;
 
+  /* This is an OpenACC acclerator function at level N - 1  */
+  unsigned oacc_function:3;
+
   /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES).  */
   unsigned ext_attr:EXT_ATTR_NUM;
 
@@ -1582,6 +1585,16 @@ gfc_dt_list;
   /* A list of all derived types.  */
   extern gfc_dt_list *gfc_derived_types;
 
+typedef struct gfc_oacc_routine_name
+{
+  struct gfc_symbol *sym;
+  struct gfc_omp_clauses *clauses;
+  struct gfc_oacc_routine_name *next;
+}
+gfc_oacc_routine_name;
+
+#define gfc_get_oacc_routine_name() XCNEW (gfc_oacc_routine_name)
+
 /* A namespace describes the contents of procedure, module, interface block
or BLOCK construct.  */
 /* ??? Anything else use these?  */
@@ -1648,6 +1661,12 @@ typedef struct gfc_namespace
   /* !$ACC DECLARE clauses.  */
   gfc_omp_clauses *oacc_declare_clauses;
 
+  /* !$ACC ROUTINE clauses.  */
+  gfc_omp_clauses *oacc_routine_clauses;
+
+  /* !$ACC ROUTINE names.  */
+  gfc_oacc_routine_name *oacc_routine_names;
+
   gfc_charlen *cl_list, *old_cl_list;
 
   gfc_dt_list *derived_types;
@@ -1693,6 +1712,9 @@ typedef struct gfc_namespace
 
   /* Set to 1 for !$OMP DECLARE REDUCTION namespaces.  */
   unsigned omp_udr_ns:1;
+
+  /* Set to 1 for !$ACC ROUTINE namespaces.  */
+  unsigned oacc_routine:1;
 }
 gfc_namespace;
 
@@ -2320,7 +2342,7 @@ enum gfc_exec_op
   EXEC_READ, EXEC_WRITE, EXEC_IOLENGTH, EXEC_TRANSFER, EXEC_DT_END,
   EXEC_BACKSPACE, EXEC_ENDFILE, EXEC_INQUIRE, EXEC_REWIND, EXEC_FLUSH,
   EXEC_LOCK, EXEC_UNLOCK,
-  EXEC_OACC_KERNELS_LOOP, EXEC_OACC_PARALLEL_LOOP,
+  EXEC_OACC_KERNELS_LOOP, EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_ROUTINE,
   EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS, 

[PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-19 Thread Ilya Enkovich
Hi,

Currently we fold all memcpy/memmove calls with a known data size.
It causes two problems when used with Pointer Bounds Checker.
The first problem is that we may copy pointers as integer data
and thus loose bounds.  The second problem is that if we inline
memcpy, we also have to inline bounds copy and this may result
in a huge amount of code and significant compilation time growth.
This patch disables folding for functions we want to instrument.

Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
and regtested on x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-11-19  Ilya Enkovich  

* gimple-fold.c (gimple_fold_builtin_memory_op): Don't
fold non-useless call if we are going to instrument it.

gcc/testsuite/

2015-11-19  Ilya Enkovich  

* gcc.target/i386/mpx/pr68337.c: New test.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1ab20d1..b3a1229 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "optabs-query.h"
 #include "omp-low.h"
+#include "ipa-chkp.h"
 
 
 /* Return true when DECL can be referenced from current unit.
@@ -664,6 +665,13 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   unsigned int src_align, dest_align;
   tree off0;
 
+  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
+pointers as wide integer) and also may result in huge function
+size because of inlined bounds copy.  Thus don't inline for
+functions we want to instrument.  */
+  if (flag_check_pointer_bounds && chkp_instrumentable_p (cfun->decl))
+   return false;
+
   /* Build accesses at offset zero with a ref-all character type.  */
   off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
 ptr_mode, true), 0);
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337.c
new file mode 100644
index 000..3f8d79d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include "mpx-check.h"
+
+#define N 2
+
+extern void abort ();
+
+static int
+mpx_test (int argc, const char **argv)
+{
+  char ** src = (char **)malloc (sizeof (char *) * N);
+  char ** dst = (char **)malloc (sizeof (char *) * N);
+  int i;
+
+  for (i = 0; i < N; i++)
+src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
+
+  __builtin_memcpy(dst, src, sizeof (char *) * N);
+
+  for (i = 0; i < N; i++)
+{
+  char *p = dst[i];
+  if (p != argv[0] + i
+ || __bnd_get_ptr_lbound (p) != p
+ || __bnd_get_ptr_ubound (p) != p + i)
+   abort ();
+}
+
+  return 0;
+}


[PATCH] fix vectorizer performance problem on cygwin hosted cross compiler

2015-11-19 Thread Jim Wilson
A cygwin hosted cross compiler to aarch64-linux, compiling a C version
of linpack with -Ofast, produces code that runs 17% slower than a
linux hosted compiler.  The problem shows up in the vect dump, where
some different vectorization optimization decisions were made by the
cygwin compiler than the linux compiler.  That happened because
tree-vect-data-refs.c calls qsort in vect_analyze_data_ref_accesses,
and the newlib and glibc qsort routines sort the list differently.  I
can reproduce the same problem on linux by adding the newlib qsort
sources to a gcc build.  For an x86_64 target, I see about a 30%
performance loss using the newlib qsort.

The qsort trouble turns out to be a problem in the qsort comparison
function, dr_group_sort_cmp.  It does this
  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0))
{
  cmp = compare_tree (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb));
  if (cmp != 0)
return cmp;
}
operand_equal_p calls STRIP_NOPS, so it will consider two trees to be
the same even if they have NOP_EXPR.  However, compare_tree is not
calling STRIP_NOPS, so it handles trees with NOP_EXPRs differently
than trees without.  The result is that depending on which array entry
gets used as the qsort pivot point, you can get very different sorts.
The newlib qsort happens to be accidentally choosing a bad pivot for
this testcase.  The glibc qsort happens to be accidentally choosing a
good pivot for this testcase.  This then triggers a cascading problem
in vect_analyze_data_ref_accesses which assumes that array entries
that pass the operand_equal_p test for the base address will end up
adjacent, and will only vectorize in that case.

For a contrived example, suppose we have four entries to sort: (plus Y
8), (mult A 4), (pointer_plus Z 16), and (nop (mult A 4)).  Suppose we
choose the mult as the pivot point. The plus sorts before because
tree_code plus is less than mult. The pointer_plus sorts after for the
same reason. The nop sorts equal. So we end up with plus, mult, nop,
pointer_plus. The mult and nop are then combined into the same
vectorization group.  Now suppose we choose the pointer_plus as the
pivot point. The plus and mult sort before. The nop sorts after. The
final result is plus, mult, pointer_plus, nop. And we fail to
vectorize as the mult and nop are not adjacent as they should be.

When I modify compare_tree to call STRIP_NOPS, this problem goes away.
I get the same sort from both the newlib and glibc qsort functions,
and I get the same linpack performance from a cygwin hosted compiler
and a linux hosted compiler.

This patch was tested with an x86_64 bootstrap and make check.  There
were no regressions.  I've also done a SPEC CPU2000 run with and
without the patch on aarch64-linux, there is no performance change.
And I've verified it by building linpack for aarch64-linux with cygwin
hosted cross compiler, x86_64 hosted cross compiler, and an aarch64
native compiler.

Jim
2015-11-19  Jim Wilson  

	* tree-vect-data-refs.c (compare_tree): Call STRIP_NOPS.

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c	(revision 230429)
+++ tree-vect-data-refs.c	(working copy)
@@ -2545,6 +2545,8 @@ compare_tree (tree t1, tree t2)
   if (t2 == NULL)
 return 1;
 
+  STRIP_NOPS (t1);
+  STRIP_NOPS (t2);
 
   if (TREE_CODE (t1) != TREE_CODE (t2))
 return TREE_CODE (t1) < TREE_CODE (t2) ? -1 : 1;


Re: [PATCH] S/390: Clobber r1 in patterns resulting in pfpo instruction.

2015-11-19 Thread Andreas Krebbel
On 11/18/2015 03:35 PM, Dominik Vogt wrote:
> The attached patch fixes the S/390 patterns using the "pfpo"
> instruction in s390.md.  The instructions clobber r1, but the
> patterns did not reflect that.

Good catch!

Your testcase requires -mzarch in the options to enable pfpo with -m31 as well. 
 I've committed the
patch with that change.

Thanks!

Bye,

-Andreas-



Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-19 Thread Jakub Jelinek
On Mon, Nov 16, 2015 at 06:40:43PM +0300, Ilya Verbin wrote:
> Here is WIP patch, not for check-in.  There are still many FIXMEs, which I am
> going to resolve, however target-link-1.c testcase pass.
> Is this approach correct?  Any comments on FIXMEs?
> 
> 
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 23d0107..58771c0 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -15895,7 +15895,10 @@ c_parser_omp_declare_target (c_parser *parser)
> g->have_offload = true;
> if (is_a  (node))
>   {
> -   vec_safe_push (offload_vars, t);
> +   omp_offload_var var;
> +   var.decl = t;
> +   var.link_ptr_decl = NULL_TREE;
> +   vec_safe_push (offload_vars, var);
> node->force_output = 1;
>   }

Another possible approach would be to keep offload_vars as
vector of trees, and simply push 2 trees in each case.
Or not to change this at all, see below.

> @@ -2009,7 +2010,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
> decl = OMP_CLAUSE_DECL (c);
> /* Global variables with "omp declare target" attribute
>don't need to be copied, the receiver side will use them
> -  directly.  */
> +  directly.  However, global variables with "omp declare target link"
> +  attribute need to be copied.  */
> if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> && DECL_P (decl)
> && ((OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FIRSTPRIVATE_POINTER
> @@ -2017,7 +2019,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
>  != GOMP_MAP_FIRSTPRIVATE_REFERENCE))
> || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
> && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
> -   && varpool_node::get_create (decl)->offloadable)
> +   && varpool_node::get_create (decl)->offloadable
> +   && !lookup_attribute ("omp declare target link",
> + DECL_ATTRIBUTES (decl)))

I wonder if Honza/Richi wouldn't prefer to have this info also
in cgraph, instead of looking up the attribute in each case.

> +  if (var.link_ptr_decl == NULL_TREE)
> + addr = build_fold_addr_expr (var.decl);
> +  else
> + {
> +   /* For "omp declare target link" var use address of the pointer
> +  instead of address of the var.  */
> +   addr = build_fold_addr_expr (var.link_ptr_decl);
> +   /* Most significant bit of the size marks such vars.  */
> +   unsigned HOST_WIDE_INT isize = tree_to_uhwi (size);
> +   isize |= 1ULL << (int_size_in_bytes (const_ptr_type_node) * 8 - 1);
> +   size = wide_int_to_tree (const_ptr_type_node, isize);
> +
> +   /* FIXME: Remove varpool node of var?  */

There is varpool_node::remove (), but not sure if at this point all the
references are already gone.

> +class pass_omp_target_link : public gimple_opt_pass
> +{
> +public:
> +  pass_omp_target_link (gcc::context *ctxt)
> +: gimple_opt_pass (pass_data_omp_target_link, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *fun)
> +{
> +#ifdef ACCEL_COMPILER
> +  /* FIXME: Replace globals in target regions too or not?  */
> +  return lookup_attribute ("omp declare target",
> +DECL_ATTRIBUTES (fun->decl));

Certainly in "omp declare target entrypoint" regions too.

> +unsigned
> +pass_omp_target_link::execute (function *fun)
> +{
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, fun)
> +{
> +  gimple_stmt_iterator gsi;
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
> + {
> +   unsigned i;
> +   gimple *stmt = gsi_stmt (gsi);
> +   for (i = 0; i < gimple_num_ops (stmt); i++)
> + {
> +   tree op = gimple_op (stmt, i);
> +   tree var = NULL_TREE;
> +
> +   if (!op)
> + continue;
> +   if (TREE_CODE (op) == VAR_DECL)
> + var = op;
> +   else if (TREE_CODE (op) == ADDR_EXPR)
> + {
> +   tree op1 = TREE_OPERAND (op, 0);
> +   if (TREE_CODE (op1) == VAR_DECL)
> + var = op1;
> + }
> +   /* FIXME: Support arrays.  What else?  */

We need to support all the references to the variables.
So, I think this approach is not right.

> +
> +   if (var && lookup_attribute ("omp declare target link",
> +DECL_ATTRIBUTES (var)))
> + {
> +   tree type = TREE_TYPE (var);
> +   tree ptype = build_pointer_type (type);
> +
> +   /* Find var in offload table.  */
> +   omp_offload_var *table_entry = NULL;
> +   for (unsigned j = 0; j < vec_safe_length (offload_vars); j++)
> + if ((*offload_vars)[j].decl == var)
> +   {
> + 

Re: [PATCH] Allow embedded timestamps by C/C++ macros to be set externally (2)

2015-11-19 Thread Dhole
On 11/17/2015 12:26 AM, Joseph Myers wrote:
> fprintf to stderr is never appropriate.  All diagnostics should go through 
> a diagnostic function that properly causes the message to be translated.
> 
> If you want a fatal error (exit immediately after giving the message 
> rather than continuing compilation), use the fatal_error function, which 
> takes an explicit location.  You can use UNKNOWN_LOCATION to avoid an 
> input file being named, (...)

Thanks for the advise! I've applied the change in the patch.

> Also, this environment variable needs documenting in cppenv.texi.

I have added the documentation as required, it's included in the
attached patch.

Regarding the copyright assignment process, it's in progress :)


Best regards,
Dhole
gcc/c-family/ChangeLog:

2015-11-18  Eduard Sanou  
Matthias Klose  
* c-common.c (get_source_date_epoch): New function, gets the environment
variable SOURCE_DATE_EPOCH and parses it as long long with error 
handling.
* c-common.h (get_source_date_epoch): Prototype.
* c-lex.c (c_lex_with_flags): set parse_in->source_date_epoch.

libcpp/ChangeLog:
2015-11-18  Eduard Sanou  
Matthias Klose  
* doc/cppenv.texi: Document SOURCE_DATE_EPOCH environment variable.

2015-11-18  Eduard Sanou  
Matthias Klose  
* include/cpplib.h (cpp_init_source_date_epoch): Prototype.
* init.c (cpp_init_source_date_epoch): New function.
* internal.h: Added source_date_epoch variable to struct
cpp_reader to store a reproducible date.
* macro.c (_cpp_builtin_macro_text): Set pfile->date timestamp from 
pfile->source_date_epoch instead of localtime if source_date_epoch is 
set, to be used for __DATE__ and __TIME__ macros to help reproducible 
builds.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7fe7fa6..dbf9fb0 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12318,4 +12318,38 @@ pointer_to_zero_sized_aggr_p (tree t)
   return (TYPE_SIZE (t) && integer_zerop (TYPE_SIZE (t)));
 }
 
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+long long
+get_source_date_epoch()
+{
+  char *source_date_epoch;
+  unsigned long long epoch;
+  char *endptr;
+
+  source_date_epoch = getenv ("SOURCE_DATE_EPOCH");
+  if (!source_date_epoch)
+return -1;
+
+  errno = 0;
+  epoch = strtoull (source_date_epoch, , 10);
+  if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+  || (errno != 0 && epoch == 0))
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"strtoull: %s\n", xstrerror(errno));
+  if (endptr == source_date_epoch)
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"No digits were found: %s\n", endptr);
+  if (*endptr != '\0')
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"Trailing garbage: %s\n", endptr);
+  if (epoch > ULONG_MAX)
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"value must be smaller than or equal to: %lu but was found to "
+"be: %llu \n", ULONG_MAX, epoch);
+
+  return (long long) epoch;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index cabf452..4b55277 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1437,4 +1437,10 @@ extern bool contains_cilk_spawn_stmt (tree);
 extern tree cilk_for_number_of_iterations (tree);
 extern bool check_no_cilk (tree, const char *, const char *,
   location_t loc = UNKNOWN_LOCATION);
+
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+extern long long get_source_date_epoch();
+
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index bb55be8..9344f66 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -402,6 +402,10 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned 
char *cpp_flags,
   enum cpp_ttype type;
   unsigned char add_flags = 0;
   enum overflow_type overflow = OT_NONE;
+  long long source_date_epoch = -1;
+
+  source_date_epoch = get_source_date_epoch();
+  cpp_init_source_date_epoch(parse_in, source_date_epoch);
 
   timevar_push (TV_CPP);
  retry:
diff --git a/gcc/doc/cppenv.texi b/gcc/doc/cppenv.texi
index 100811d..3b5317b 100644
--- a/gcc/doc/cppenv.texi
+++ b/gcc/doc/cppenv.texi
@@ -79,4 +79,21 @@ main input file is omitted.
 @ifclear cppmanual
 

Re: [PATCH] Get rid of insn-codes.h in optabs-tree.c

2015-11-19 Thread Bernd Schmidt

On 11/19/2015 03:28 PM, Ilya Enkovich wrote:

This is a refactoring patch discussed in another thread [1].  It gets
rid of CODE_FOR_nothing usage in optabs-tree.c by introducing boolean
predicated in optabs-query.  Bootstrapped and regtesed on
x86_64-unknown-linux-gnu.


Looks pretty reasonable, but I think we have to start saying "not now" 
after the end of stage 1.




+/* Return 1 id there is a valid insn code to convert fixed-point mode


"true", not "1" (elsewhere too), and "if".


+{
+  return get_fix_icode (fixmode, fltmode, unsignedp, truncp_ptr)
+!= CODE_FOR_nothing;


Formatting.


Bernd


Re: [PATCH] Implement GOMP_OFFLOAD_unload_image in intelmic plugin

2015-11-19 Thread Jakub Jelinek
On Thu, Nov 19, 2015 at 06:47:28PM +0300, Ilya Verbin wrote:
> I will add this:
> 
> diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp 
> b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> index 6ee585e..f8c1725 100644
> --- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> +++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> @@ -71,7 +71,7 @@ struct TargetImageDesc {
>/* 10 characters is enough for max int value.  */
>char name[sizeof ("lib00.so")];
>char data[];
> -} __attribute__ ((packed));
> +};
>  
>  /* Image descriptors, indexed by a pointer obtained from libgomp.  */
>  typedef std::map ImgDescMap;
> @@ -313,9 +313,8 @@ offload_image (const void *target_image)
>target_image, image_start, image_end);
>  
>int64_t image_size = (uintptr_t) image_end - (uintptr_t) image_start;
> -  TargetImageDesc *image
> -= (TargetImageDesc *) malloc (sizeof (int64_t) + sizeof 
> ("lib00.so")
> -   + image_size);
> +  TargetImageDesc *image = (TargetImageDesc *) malloc (offsetof 
> (TargetImageDesc, data)
> ++ image_size);
>if (!image)
>  {
>fprintf (stderr, "%s: Can't allocate memory\n", __FILE__);

LGTM.
> > What is the point of this hunk?  Is there any point in unregistering the
> > main target image?  I mean at that point the process is exiting anyway.
> > The importance of unregistering target images registered from shared
> > libraries is that they should be unregistered when they are dlclosed.
> 
> liboffloadmic performs correct finalization of the target process in
> __offload_fini_library, which is called only during unregistration of the main
> target image.
> Without this finalization the target process will be destroyed after unloading
> libcoi_host.so.  And then some DSO may call GOMP_offload_unregister_ver from 
> its
> destructor, which will try to unload target image from the already destroyed
> process.  This issue is reproducible only using real COI.

Ok then.

Jakub


Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-19 Thread Nick Clifton

Hi Bernd,


I had a look around. There's code testing HARD_REGNO_NREGS in
ree.c:combine_set_extension. It's inside #if 0, and labelled
"temporarily disabled". See if enabling that helps you? (Jeff, that #if
0 was added by you).


I suspect that the code was disabled because it prevented too many 
useful optimization opportunities.


The code there would solve this problem, but the approach is is overly 
cautious, since it disables the optimization for all extensions that 
increase the number of hard registers used.  Some of these will be 
viable candidates, provided that the extra hard registers are no used. 
(This is certainly true for the RL78, where the (patched) optimization 
does improve code, even though the widening does use extra registers).


Cheers
  Nick




Re: [PATCH] Allow embedded timestamps by C/C++ macros to be set externally (2)

2015-11-19 Thread Dhole
On 11/19/2015 04:35 PM, Dhole wrote:
> On 11/17/2015 12:26 AM, Joseph Myers wrote:
>> fprintf to stderr is never appropriate.  All diagnostics should go through 
>> a diagnostic function that properly causes the message to be translated.
>>
>> If you want a fatal error (exit immediately after giving the message 
>> rather than continuing compilation), use the fatal_error function, which 
>> takes an explicit location.  You can use UNKNOWN_LOCATION to avoid an 
>> input file being named, (...)
> 
> Thanks for the advise! I've applied the change in the patch.
> 
>> Also, this environment variable needs documenting in cppenv.texi.
> 
> I have added the documentation as required, it's included in the
> attached patch.
> 
> Regarding the copyright assignment process, it's in progress :)
> 
> 
> Best regards,
> Dhole
> 

Resending the ChangeLog as it was malformed.

-- 
Dhole
gcc/c-family/ChangeLog:

2015-11-18  Eduard Sanou  
Matthias Klose  
* c-common.c (get_source_date_epoch): New function, gets the environment
variable SOURCE_DATE_EPOCH and parses it as long long with error 
handling.
* c-common.h (get_source_date_epoch): Prototype.
* c-lex.c (c_lex_with_flags): set parse_in->source_date_epoch.

gcc/ChangeLog:

2015-11-18  Eduard Sanou  
Matthias Klose  
* doc/cppenv.texi: Document SOURCE_DATE_EPOCH environment variable.

libcpp/ChangeLog:

2015-11-18  Eduard Sanou  
Matthias Klose  
* include/cpplib.h (cpp_init_source_date_epoch): Prototype.
* init.c (cpp_init_source_date_epoch): New function.
* internal.h: Added source_date_epoch variable to struct
cpp_reader to store a reproducible date.
* macro.c (_cpp_builtin_macro_text): Set pfile->date timestamp from 
pfile->source_date_epoch instead of localtime if source_date_epoch is 
set, to be used for __DATE__ and __TIME__ macros to help reproducible 
builds.


Re: [PATCH] Transactional Memory: Support __cxa_free_exception and fix exception handling.

2015-11-19 Thread Peter Bergner
On Thu, 2015-11-19 at 09:35 -0600, Torvald Riegel wrote:
> The EH scheme that we had been using for TM / libitm doesn't work
> properly.  We fail to handle throwing exceptions whose constructors may
> throw themselves.  We also do not clean up properly in all situations
> when a transactions abort while being in the process of throwing an
> exception.
> This patch solves  this particular problem by adding a transactional
> wrapper for __cxa_free_exception and changing the EH scheme in libitm.
> 
> Follow-up patches will fix other issues that we have identified.  Some
> of the changes to the libitm.texi ABI docs added in this patch already
> take this future work into account.
> 
> Tested using the libitm testsuite on x86_64-linux.

I have fired off a test on powerpc64le-linux and will report back
on its status when it's done.

Peter



Re: [PATCH] Implement GOMP_OFFLOAD_unload_image in intelmic plugin

2015-11-19 Thread Ilya Verbin
On Thu, Nov 19, 2015 at 14:33:06 +0100, Jakub Jelinek wrote:
> On Mon, Nov 16, 2015 at 08:33:28PM +0300, Ilya Verbin wrote:
> > diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp 
> > b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> > index 772e198..6ee585e 100644
> > --- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> > +++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> > @@ -65,6 +65,17 @@ typedef std::vector DevAddrVect;
> >  /* Addresses for all images and all devices.  */
> >  typedef std::map ImgDevAddrMap;
> >  
> > +/* Image descriptor needed by __offload_[un]register_image.  */
> > +struct TargetImageDesc {
> > +  int64_t size;
> > +  /* 10 characters is enough for max int value.  */
> > +  char name[sizeof ("lib00.so")];
> > +  char data[];
> > +} __attribute__ ((packed));
> 
> Why the packed attribute?  I know it is preexisting, but with int64_t
> being the first and then just char, there is no padding in between fields.

Hmmm, I can't remember, but I definitely have added this attribute 2 years ago,
because liboffloadmic failed to register the image.  Anyway, now everything
works fine without it.

> And to determine the size without data, you can just use offsetof.

I will add this:

diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp 
b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
index 6ee585e..f8c1725 100644
--- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
+++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
@@ -71,7 +71,7 @@ struct TargetImageDesc {
   /* 10 characters is enough for max int value.  */
   char name[sizeof ("lib00.so")];
   char data[];
-} __attribute__ ((packed));
+};
 
 /* Image descriptors, indexed by a pointer obtained from libgomp.  */
 typedef std::map ImgDescMap;
@@ -313,9 +313,8 @@ offload_image (const void *target_image)
 target_image, image_start, image_end);
 
   int64_t image_size = (uintptr_t) image_end - (uintptr_t) image_start;
-  TargetImageDesc *image
-= (TargetImageDesc *) malloc (sizeof (int64_t) + sizeof 
("lib00.so")
- + image_size);
+  TargetImageDesc *image = (TargetImageDesc *) malloc (offsetof 
(TargetImageDesc, data)
+  + image_size);
   if (!image)
 {
   fprintf (stderr, "%s: Can't allocate memory\n", __FILE__);


> > @@ -217,13 +231,27 @@ offload (const char *file, uint64_t line, int device, 
> > const char *name,
> >  }
> >  
> >  static void
> > +unregister_main_image ()
> > +{
> > +  __offload_unregister_image (_target_image);
> > +}
> > +
> > +static void
> >  register_main_image ()
> >  {
> > +  /* Do not check the return value, because old versions of liboffloadmic 
> > did
> > + not have return values.  */
> >__offload_register_image (_target_image);
> >  
> >/* liboffloadmic will call GOMP_PLUGIN_target_task_completion when
> >   asynchronous task on target is completed.  */
> >__offload_register_task_callback (GOMP_PLUGIN_target_task_completion);
> > +
> > +  if (atexit (unregister_main_image) != 0)
> > +{
> > +  fprintf (stderr, "%s: atexit failed\n", __FILE__);
> > +  exit (1);
> > +}
> >  }
> 
> What is the point of this hunk?  Is there any point in unregistering the
> main target image?  I mean at that point the process is exiting anyway.
> The importance of unregistering target images registered from shared
> libraries is that they should be unregistered when they are dlclosed.

liboffloadmic performs correct finalization of the target process in
__offload_fini_library, which is called only during unregistration of the main
target image.
Without this finalization the target process will be destroyed after unloading
libcoi_host.so.  And then some DSO may call GOMP_offload_unregister_ver from its
destructor, which will try to unload target image from the already destroyed
process.  This issue is reproducible only using real COI.

  -- Ilya


Re: [PATCH] Add clang-format config to contrib folder

2015-11-19 Thread Hans-Peter Nilsson
On Wed, 18 Nov 2015, Jeff Law wrote:
> Given that gnu-indent seems to muck up C++ badly in my
> experience, clang-format may be a better long term solution.  I'd really like
> to get to a point one day where formatting is a commit hook so that things are
> always kept properly formatted.

I hope you don't mean the direct interpretation; that the result
of indent (any indent program) should be blindly committed, but
that it should (perhaps) be used to (re)format and the commit
refused if there's a diff.  No tool should be trusted to
override the committers' edits.  Really.

brgds, H-P


Re: [OpenACC 0/7] host_data construct

2015-11-19 Thread Jakub Jelinek
On Thu, Nov 19, 2015 at 02:26:50PM +, Julian Brown wrote:
> OK, thanks -- as to what the standard says, it's so ill-specified in
> this area that nothing can be learned about the behaviour of offloaded
> regions within host_data constructs, and my question about that on the
> technical mailing list is still unanswered (actually Nathan suggested
> in private mail that the conservative thing to do would be to disallow
> offloaded regions entirely within host_data constructs, so maybe that's
> the way to go).
> 
> OpenMP 4.5 seems to *not* specify the skipping-over behaviour for
> use_device_ptr variables (p105, lines 20-23):
> 
> "The is_device_ptr clause is used to indicate that a list item is a
> device pointer already in the device data environment and that it
> should be used directly. Support for device pointers created outside
> of OpenMP, specifically outside of the omp_target_alloc routine and the
> use_device_ptr clause, is implementation defined."
> 
> That suggests that use_device_ptr is a valid way to create device
> pointers for use in enclosed target regions: the behaviour I assumed
> was wrong for OpenACC. So I think my guess at the "most-obvious"
> behaviour was probably misguided anyway.

use_device_ptr kind of privatizes the variable, the private variable being
the device pointer corresponding to the host pointer outside of the target
data with use_device_ptr clause.

And, if you want to use that device pointer in a target region, it should be
on the is_device_ptr clause on the target construct.  See e.g.
libgomp.c/target-18.c testcase.
  int a[4];
...
  #pragma omp target data map(to:a)
  #pragma omp target data use_device_ptr(a) map(from:err)
  #pragma omp target is_device_ptr(a) private(i) map(from:err)
  {
err = 0;
for (i = 0; i < 4; i++)
  if (a[i] != 23 + i)
err = 1;
  }
The implementation has this way a choice how to implement device pointers
(what use_device_ptr gives you, or say omp_target_alloc returns)
- either (GCC's choice at least for the XeonPhi and hopefully PTX, HSA does
not care, as it shares address space) implement them as host pointer
encoding the bits the target device wants to use, or some kind of
descriptor.  In the former case, is_device_ptr is essentially a
firstprivate, you bitwise copy the device pointer from the host to target
device, where you can dereference it etc.  In the descriptor case you'd
do some transformation of the host side representation of the device pointer
to the device side.

> 
> It's maybe even more complicated. Consider the example:
> 
> char x[1024];
> 
> #pragma acc enter data copyin(x)
> 
> #pragma acc host_data use_device(x)
> {
>   target_primitive(x);
>   #pragma acc parallel present(x)[1]
>   {
> x[5] = 0;[2]
>   }
> }

If it is unclear, I think disallowing acc {parallel,kernels} inside of
acc host_data might be too big hammer, but perhaps just erroring out
or warning during gimplification that if you (explicitly or implicitly)
try to map a var that is in use_device clause in some outer context,
it is either wrong, unsupported or will not do what users think?

I will double check on omp-lang, but supposedly we could for OpenMP
warn in similar cases (use_device_ptr clause instead of use_device),
except when it is passed to is_device_ptr clause, because I think the
behavior is just unspecified otherwise.
> 
> Here, the "present" clause marked [1] will fail (because 'x' is a
> target pointer now). If it's omitted, the array access [2] will cause an
> implicit present_or_copy to be used for the 'x' pointer (which again
> will fail, because now 'x' points to target data). Maybe what we
> actually need is,
> 
> #pragma acc host_data use_device(x)
> {
>   target_primitive(x);
>   #pragma acc parallel deviceptr(x)
>   {
> ...
>   }
> }
> 
> with the deviceptr(x) clause magically substituted in the parallel
> construct, but I'm struggling to see how we could justify doing that
> when that behaviour's not mentioned in the spec at all.

Is deviceptr as above meant to work?  That is the OpenACC counterpart
of is_device_ptr, right?  If yes, then I'd suggest just warning if you
try to implicitly or explicitly map something use_device in outer contexts,
and just make sure you don't ICE on the cases where you warn.
If the standard does not say what it means, then it is unspecified
behavior...

Jakub


Re: [PATCH] Get rid of insn-codes.h in optabs-tree.c

2015-11-19 Thread Ilya Enkovich
On 19 Nov 16:46, Bernd Schmidt wrote:
> On 11/19/2015 03:28 PM, Ilya Enkovich wrote:
> >This is a refactoring patch discussed in another thread [1].  It gets
> >rid of CODE_FOR_nothing usage in optabs-tree.c by introducing boolean
> >predicated in optabs-query.  Bootstrapped and regtesed on
> >x86_64-unknown-linux-gnu.
> 
> Looks pretty reasonable, but I think we have to start saying "not now" after
> the end of stage 1.

I send it now because Jeff considered this patch at early stage3.  I can commit 
it at the next stage1 either.

> 
> >
> >+/* Return 1 id there is a valid insn code to convert fixed-point mode
> 
> "true", not "1" (elsewhere too), and "if".
> 
> >+{
> >+  return get_fix_icode (fixmode, fltmode, unsignedp, truncp_ptr)
> >+!= CODE_FOR_nothing;
> 
> Formatting.

Thanks! Below is a fixed version.

Ilya

> 
> 
> Bernd


--
gcc/

2015-11-19  Ilya Enkovich  

* optabs-query.h (get_vec_cmp_icode): Remove 'static'.
(get_vcond_mask_icode): Likewise.
(get_extend_icode): New.
(get_float_icode): New.
(get_fix_icode): New.
(can_extend_p): Return bool
(can_float_p): Return bool.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* optabs-query.c (get_float_icode): New.
(can_extend_p): Return bool.
(get_float_icode): New.
(can_float_p): Return bool.
(get_fix_icode): New.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* expr.c (init_expr_target): Use get_extend_icode and
adjust to new can_extend_p return type.
(convert_move): Likewise.
(compress_float_constant): Likewise.
* function.c (assign_parm_setup_reg): Likewise.
* optabs-tree.c: Don't include insn-codes.h.
(supportable_convert_operation): Adjust to can_fix_p and
can_float_p new return types.
* optabs.c (gen_extend_insn): Use get_extend_icode.
(expand_float): Use get_float_icode and adjust to can_float_p
new return type.
(expand_fix): Use get_fix_icode and adjust to can_fix_p
new return type.
* tree-vrp.c (simplify_float_conversion_using_ranges): Adjust
to can_float_p new return type.


diff --git a/gcc/expr.c b/gcc/expr.c
index bd43dc4..f4c06a1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -223,7 +223,7 @@ init_expr_target (void)
{
  enum insn_code ic;
 
- ic = can_extend_p (mode, srcmode, 0);
+ ic = get_extend_icode (mode, srcmode, 0);
  if (ic == CODE_FOR_nothing)
continue;
 
@@ -452,7 +452,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   int nwords = CEIL (GET_MODE_SIZE (to_mode), UNITS_PER_WORD);
 
   /* Try converting directly if the insn is supported.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  /* If FROM is a SUBREG, put it into a register.  Do this
@@ -466,7 +466,7 @@ convert_move (rtx to, rtx from, int unsignedp)
}
   /* Next, try converting via full word.  */
   else if (GET_MODE_PRECISION (from_mode) < BITS_PER_WORD
-  && ((code = can_extend_p (to_mode, word_mode, unsignedp))
+  && ((code = get_extend_icode (to_mode, word_mode, unsignedp))
   != CODE_FOR_nothing))
{
  rtx word_to = gen_reg_rtx (word_mode);
@@ -573,7 +573,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   if (GET_MODE_PRECISION (to_mode) > GET_MODE_PRECISION (from_mode))
 {
   /* Convert directly if that works.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  emit_unop_insn (code, to, from, equiv_code);
@@ -588,12 +588,10 @@ convert_move (rtx to, rtx from, int unsignedp)
  /* Search for a mode to convert via.  */
  for (intermediate = from_mode; intermediate != VOIDmode;
   intermediate = GET_MODE_WIDER_MODE (intermediate))
-   if (((can_extend_p (to_mode, intermediate, unsignedp)
- != CODE_FOR_nothing)
+   if ((can_extend_p (to_mode, intermediate, unsignedp)
 || (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (intermediate)
 && TRULY_NOOP_TRUNCATION_MODES_P (to_mode, intermediate)))
-   && (can_extend_p (intermediate, from_mode, unsignedp)
-   != CODE_FOR_nothing))
+   && can_extend_p (intermediate, from_mode, unsignedp))
  {
convert_move (to, convert_to_mode (intermediate, from,
   unsignedp), unsignedp);
@@ -3638,7 +3636,7 @@ compress_float_constant (rtx 

[PATCH] Fix division-by-unsigned optimization in VRP (PR tree-optimization/68431)

2015-11-19 Thread Marek Polacek
This fixes a failure to optimize division by an unsigned.  The comment before
the condition I'm fixing says "When vr0.max < 0, vr1.min != 0 and ..." but
"&& !compare_values (vr1.min, zero)" actually ensures that vr1.min is zero.
(Indeed, the following int_const_binop would attemp to divide by zero and
return NULL_TREE.)

With this, (a / x) gets a correct range [-2, 0] and we're able to optimize
the condition away, thus even the abort () call.  (This won't work on targets
where long long == int.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-11-19  Marek Polacek  

PR tree-optimization/68431
* tree-vrp.c (extract_range_from_binary_expr_1): Fix condition.

* gcc.dg/tree-ssa/pr68431.c: New test.

diff --git gcc/testsuite/gcc.dg/tree-ssa/pr68431.c 
gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
index e69de29..3bd3843 100644
--- gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
+++ gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
@@ -0,0 +1,16 @@
+/* PR tree-optimization/68431 */
+/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+
+unsigned int x = 1;
+int
+main (void)
+{
+  long long int a = -2LL;
+  int t = 1 <= (a / x);
+  if (t != 0)
+__builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Folding predicate .*to 0" 1 "vrp1" } } */
diff --git gcc/tree-vrp.c gcc/tree-vrp.c
index e67048e..736082b 100644
--- gcc/tree-vrp.c
+++ gcc/tree-vrp.c
@@ -2975,7 +2975,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
  if (vr1.type == VR_RANGE
  && !symbolic_range_p ()
  && !symbolic_range_p ()
- && !compare_values (vr1.min, zero))
+ && compare_values (vr1.min, zero) != 0)
max = int_const_binop (code, vr0.max, vr1.min);
  else
max = zero;

Marek


Re: [PATCH] Fix memory leaks in tree-ssa-uninit.c

2015-11-19 Thread Martin Liška
On 11/19/2015 02:58 PM, Bernd Schmidt wrote:
> On 11/19/2015 11:16 AM, Martin Liška wrote:
>> You are right, however as the original coding style was really broken,
>> it was much easier
>> to use the tool and clean-up fall-out.
>>
>> Waiting for thoughts related to v2.
> 
> Better, but still some oddities. I hope you won't get mad at me if I suggest 
> doing this in stages? A first patch could just deal with non-reformatting 
> whitespace changes, such as removing trailing whitespace, and converting 
> leading spaces to tabs - that would be mechanical, and reduce the size of the 
> rest of the patch (it seems emacs has an appropriate command, M-x 
> whitespace-cleanup). Such a change is preapproved.

Hi.

That's good approach, let's start with that.

> 
> A few things I noticed:
> 
>> -   flag_1 = phi <0, 1>  // (1)
>> +   flag_1 = phi <0, 1>  // (1)
> 
> Check whether the // (1) is still lined up with the // (2) and // (3) parts. 
> In general I'm not sure we should to what extent we should be reformatting 
> comments in this patch. Breaking long lines and ensuring two spaces after a 
> period seems fine.

Fixed.

> 
>> +   Checking recursively into (1), the compiler can find out that only 
>> some_val
>> +   (which is defined) can flow into (3) which is OK.
>>
>>  */
> 
> Could take the opportunity to move the */ onto the end of the line.

Likewise.

> 
>> +  if (is_gimple_call (cond_stmt) && EDGE_COUNT (e->src->succs) >= 2)
>> +{
>> +  /* Ignore EH edge.  Can add assertion
>> + on the other edge's flag.  */
>> +  continue;
>> +}
> 
> Could also take the opportunity to remove excess braces and parentheses. Not 
> a requirement and probably a distraction at this point.

Likewise.

> 
>> -  return (pred.cond_code == NE_EXPR && !pred.invert)
>> -  || (pred.cond_code == EQ_EXPR && pred.invert);
>> +  return (pred.cond_code == NE_EXPR && !pred.invert)
>> + || (pred.cond_code == EQ_EXPR && pred.invert);
> 
> This still isn't right.

Also fixed.

The patch has been split to two files.

Thanks,
Martin

> 
> 
> Bernd

>From 5a1de38916beefca319ef38df89a3a3b60c17a95 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 19 Nov 2015 15:41:38 +0100
Subject: [PATCH 1/2] Replace spaces with tabs and remove trailing whitespaces

---
 gcc/tree-ssa-uninit.c | 1004 -
 1 file changed, 502 insertions(+), 502 deletions(-)

diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 0709cce..50bfb03 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -76,8 +76,8 @@ static bool
 has_undefined_value_p (tree t)
 {
   return (ssa_undefined_value_p (t)
-  || (possibly_undefined_names
-  && possibly_undefined_names->contains (t)));
+	  || (possibly_undefined_names
+	  && possibly_undefined_names->contains (t)));
 }
 
 
@@ -270,9 +270,9 @@ can_skip_redundant_opnd (tree opnd, gimple *phi)
 {
   tree op = gimple_phi_arg_def (op_def, i);
   if (TREE_CODE (op) != SSA_NAME)
-continue;
+	continue;
   if (op != phi_def && uninit_undefined_value_p (op))
-return false;
+	return false;
 }
 
   return true;
@@ -296,10 +296,10 @@ compute_uninit_opnds_pos (gphi *phi)
 {
   tree op = gimple_phi_arg_def (phi, i);
   if (TREE_CODE (op) == SSA_NAME
-  && uninit_undefined_value_p (op)
-  && !can_skip_redundant_opnd (op, phi))
+	  && uninit_undefined_value_p (op)
+	  && !can_skip_redundant_opnd (op, phi))
 	{
-  if (cfun->has_nonlocal_label || cfun->calls_setjmp)
+	  if (cfun->has_nonlocal_label || cfun->calls_setjmp)
 	{
 	  /* Ignore SSA_NAMEs that appear on abnormal edges
 		 somewhere.  */
@@ -323,7 +323,7 @@ find_pdom (basic_block block)
else
  {
basic_block bb
-   = get_immediate_dominator (CDI_POST_DOMINATORS, block);
+	   = get_immediate_dominator (CDI_POST_DOMINATORS, block);
if (! bb)
 	 return EXIT_BLOCK_PTR_FOR_FN (cfun);
return bb;
@@ -398,8 +398,8 @@ find_control_equiv_block (basic_block bb)
 
 static bool
 compute_control_dep_chain (basic_block bb, basic_block dep_bb,
-   vec *cd_chains,
-   size_t *num_chains,
+			   vec *cd_chains,
+			   size_t *num_chains,
 			   vec *cur_cd_chain,
 			   int *num_calls)
 {
@@ -426,7 +426,7 @@ compute_control_dep_chain (basic_block bb, basic_block dep_bb,
   edge e = (*cur_cd_chain)[i];
   /* Cycle detected. */
   if (e->src == bb)
-return false;
+	return false;
 }
 
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -434,39 +434,39 @@ compute_control_dep_chain (basic_block bb, basic_block dep_bb,
   basic_block cd_bb;
   int post_dom_check = 0;
   if (e->flags & (EDGE_FAKE | EDGE_ABNORMAL))
-continue;
+	continue;
 
   cd_bb = e->dest;
   cur_cd_chain->safe_push (e);
   while (!is_non_loop_exit_postdominating 

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Kyrill Tkachov


On 19/11/15 14:41, Segher Boessenkool wrote:

On Thu, Nov 19, 2015 at 01:38:53PM +, Kyrill Tkachov wrote:

That is troublesome.  Could you look deeper?

Yes.

Thanks.


So the bad case is when we're in subst and returning a CLOBBER of zero
and 'from' is (reg/v:SI 80 [ x ]) and 'to' is (zero_extend:SI (reg:HI 0 x0
[ x ])).
The call to subst comes from try_combine at line 3403:

if (added_sets_1)
 {
   rtx t = i1pat;
   if (i0_feeds_i1_n)
 t = subst (t, i0dest, i0src_copy ? i0src_copy : i0src, 0, 0, 0);

   XVECEXP (newpat, 0, --total_sets) = t;
 }

It uses t after calling subst on it without checking that it didn't return
a clobber.
If I change that snippet to check for CLOBBER:
   if (added_sets_1)
 {
   rtx t = i1pat;
   if (i0_feeds_i1_n)
 t = subst (t, i0dest, i0src_copy ? i0src_copy : i0src, 0, 0, 0);

   if (GET_CODE (t) == CLOBBER)
 {
   undo_all ();
   return 0;
 }
   XVECEXP (newpat, 0, --total_sets) = t;
 }

The testcase gets fixed.
But shouldn't the XVECEXP (newpat, 0, --total_sets) = t; line create an
uncrecognizable rtx
that would then get rejected by combine or something?

Yes.  recog_for_combine_1 checks for a PARALLEL with such a CLOBBER
right at the start; and of course having the clobber elsewhere will
just not match.


If we don't check for clobber there and perform the "XVECEXP = ..."
the resulting newpat looks like:
(parallel [
 (set (reg:CC 66 cc)
 (compare:CC (const_int 0 [0])
 (const_int 0 [0])))
 (nil)
 (clobber:DI (const_int 0 [0]))
 ])

ah, so the clobber is put in a parallel with another insn and is thus
accepted by recog?

No, recog_for_combine should refuse it because of that third arm.
The second arm (the nil) looks very wrong, where is that coming
from?  That isn't valid RTL.


Well, it came from a bit earlier before the subst call (around line 3390):
  /* If the actions of the earlier insns must be kept
 in addition to substituting them into the latest one,
 we must make a new PARALLEL for the latest insn
 to hold additional the SETs.  */

  rtx old = newpat;
  total_sets = 1 + extra_sets;
  newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets));
  XVECEXP (newpat, 0, 0) = old;


extra_sets is 2, so we have a parallel with 3 slots so after the subst call
we put the clobber into --total_sets, that is slot 2:
  XVECEXP (newpat, 0, --total_sets) = t;
A bit further below we fill slot 1 with another rtx, so all 3 parts of the 
PARALLEL
are filled.
I'll look further into why recog_for_combine doesn't kill the whole thing.

Kyrill






So, should we check 't' after subst for clobbers as above? Or should this
be fixed in
some other place?

There is a bug somewhere, so that will need fixing.  Workarounds are
last resort, and even then we really need to know what is going on.


Thanks. In light of the above I think this patch happens to avoid
the issue highlighted above but we should fix the above code separately?

Yes, if your patch creates better code we want it (and fix the regression),
but you exposed a separate bug as well :-)


Segher





Re: [PATCH] Fix division-by-unsigned optimization in VRP (PR tree-optimization/68431)

2015-11-19 Thread Richard Biener
On November 19, 2015 3:57:03 PM GMT+01:00, Marek Polacek  
wrote:
>This fixes a failure to optimize division by an unsigned.  The comment
>before
>the condition I'm fixing says "When vr0.max < 0, vr1.min != 0 and ..."
>but
>"&& !compare_values (vr1.min, zero)" actually ensures that vr1.min is
>zero.
>(Indeed, the following int_const_binop would attemp to divide by zero
>and
>return NULL_TREE.)
>
>With this, (a / x) gets a correct range [-2, 0] and we're able to
>optimize
>the condition away, thus even the abort () call.  (This won't work on
>targets
>where long long == int.)
>
>Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

Richard.

>2015-11-19  Marek Polacek  
>
>   PR tree-optimization/68431
>   * tree-vrp.c (extract_range_from_binary_expr_1): Fix condition.
>
>   * gcc.dg/tree-ssa/pr68431.c: New test.
>
>diff --git gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
>gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
>index e69de29..3bd3843 100644
>--- gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
>+++ gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
>@@ -0,0 +1,16 @@
>+/* PR tree-optimization/68431 */
>+/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
>+
>+unsigned int x = 1;
>+int
>+main (void)
>+{
>+  long long int a = -2LL;
>+  int t = 1 <= (a / x);
>+  if (t != 0)
>+__builtin_abort ();
>+
>+  return 0;
>+}
>+
>+/* { dg-final { scan-tree-dump-times "Folding predicate .*to 0" 1
>"vrp1" } } */
>diff --git gcc/tree-vrp.c gcc/tree-vrp.c
>index e67048e..736082b 100644
>--- gcc/tree-vrp.c
>+++ gcc/tree-vrp.c
>@@ -2975,7 +2975,7 @@ extract_range_from_binary_expr_1 (value_range
>*vr,
> if (vr1.type == VR_RANGE
> && !symbolic_range_p ()
> && !symbolic_range_p ()
>-&& !compare_values (vr1.min, zero))
>+&& compare_values (vr1.min, zero) != 0)
>   max = int_const_binop (code, vr0.max, vr1.min);
> else
>   max = zero;
>
>   Marek




RE: [PATCH 1/2] [graphite] Move codegen related functions to graphite-isl-ast-to-gimple.c

2015-11-19 Thread Aditya K
Thanks for the update. I'll fix that asap.


-Aditya



> Date: Thu, 19 Nov 2015 08:36:58 -0500
> Subject: Re: [PATCH 1/2] [graphite] Move codegen related functions to 
> graphite-isl-ast-to-gimple.c
> From: dje@gmail.com
> To: hiradi...@msn.com; aditya...@samsung.com; s@samsung.com
> CC: gcc-patches@gcc.gnu.org
>
> This patch broke bootstrap when ISL is not enabled.
>
> graphite-isl-ast-to-gimple.c is protected by HAVE_isl but
> get_false_edge_from_guard_bb() is used outside of Graphite, including
> sese.c, which is not restricted to HAVE_isl.
>
> Please fix.
>
> Thanks, David
  

[PATCH 05/15] Add selftests to fold-const.c

2015-11-19 Thread David Malcolm
Jeff approved an older version of this (as a separate
unittests/test-folding.c):
 https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03305.html
> OK if/when prereqs are approved. Minor twiddling if we end up
> moving it elsewhere or standardizing/reducing header files
> is pre-approved.

gcc/ChangeLog:
* fold-const.c: Include "selftest.h".
(class tree_folding_test): New test subclass.
(tree_folding_test, arithmetic_folding): New selftest.
---
 gcc/fold-const.c | 66 
 1 file changed, 66 insertions(+)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 698062e..27c9216 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -74,6 +74,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-into-ssa.h"
 #include "md5.h"
 #include "case-cfn-macros.h"
+#include "selftest.h"
 
 #ifndef LOAD_EXTEND_OP
 #define LOAD_EXTEND_OP(M) UNKNOWN
@@ -14423,3 +14424,68 @@ c_getstr (tree src)
 
   return TREE_STRING_POINTER (src) + tree_to_uhwi (offset_node);
 }
+
+#if CHECKING_P
+
+namespace {
+
+/* A test fixture for writing tests of folding trees.  */
+class tree_folding_test : public ::selftest::test
+{
+ protected:
+  void
+  assert_binop_folds_to_const (tree lhs, enum tree_code code, tree rhs,
+  tree constant)
+  {
+EXPECT_EQ (constant, fold_build2 (code, TREE_TYPE (lhs), lhs, rhs));
+  }
+
+  void
+  assert_binop_folds_to_nonlvalue (tree lhs, enum tree_code code, tree rhs,
+  tree wrapped_expr)
+  {
+tree result = fold_build2 (code, TREE_TYPE (lhs), lhs, rhs);
+EXPECT_NE (wrapped_expr, result);
+EXPECT_EQ (NON_LVALUE_EXPR, TREE_CODE (result));
+EXPECT_EQ (wrapped_expr, TREE_OPERAND (result, 0));
+  }
+};
+
+TEST_F (tree_folding_test, arithmetic_folding)
+{
+  tree type = integer_type_node;
+  tree x = create_tmp_var_raw (type, "x");
+  tree zero = build_zero_cst (type);
+  tree one = build_int_cst (type, 1);
+
+  /* Addition.  */
+  /* 1 <-- (0 + 1) */
+  assert_binop_folds_to_const (zero, PLUS_EXPR, one,
+  one);
+  assert_binop_folds_to_const (one, PLUS_EXPR, zero,
+  one);
+
+  /* (nonlvalue)x <-- (x + 0) */
+  assert_binop_folds_to_nonlvalue (x, PLUS_EXPR, zero,
+  x);
+
+  /* Subtraction.  */
+  /* 0 <-- (x - x) */
+  assert_binop_folds_to_const (x, MINUS_EXPR, x,
+  zero);
+  assert_binop_folds_to_nonlvalue (x, MINUS_EXPR, zero,
+  x);
+
+  /* Multiplication.  */
+  /* 0 <-- (x * 0) */
+  assert_binop_folds_to_const (x, MULT_EXPR, zero,
+  zero);
+
+  /* (nonlvalue)x <-- (x * 1) */
+  assert_binop_folds_to_nonlvalue (x, MULT_EXPR, one,
+  x);
+}
+
+}  // anon namespace
+
+#endif /* CHECKING_P */
-- 
1.8.5.3



[PATCH 15/15] RFC: Add ggc-tests.c

2015-11-19 Thread David Malcolm
Jeff approved an earlier version of this (as
unittests/test-ggc.c):
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03306.html
> Not terribly happy with that counter to used to create a big list
> to detect recursion. But I'm not offhand sure how to avoid without
> exposing more of the ggc system that is wise.
>
> OK if/when prereqs are approved. Minor twiddling if we end up
> moving it elsewhere or standardizing/reducing header files is
> pre-approved.

This version moves it to gcc/gcc-tests.c.

For now, I also reduced the count within
  TEST_F (ggc_test, chain_next)
from 2 million to 10, to avoid swamping stderr with PASS results.

There's a problem with this test as-is: it adds GTY roots, but it's
all wrapped with #if CHECKING_P.  This leads to a failure at
link time when building with
  --enable-checking=release
since the various generated gtype-foo.c contain references to the
roots like this:

  /* ...snip.. */
  extern const struct ggc_root_tab gt_ggc_r_gt_ggc_tests_h[];
  /* ...snip.. */

  EXPORTED_CONST struct ggc_root_tab * const gt_ggc_rtab[] = {
/* ...snip.. */
gt_ggc_r_gt_ggc_tests_h,
/* ...snip.. */
  };

etc.

(This could be special-cased in gengtype, I suppose, but do we want
to have such conditional roots?)

Potentially we could build up PCH testing here, in which case do we
want to have conditional roots?  In particular, that would mean PCH
files would be incompatible between release vs checked builds
(I'm not sure whether or not that's already the case).

Hence I'm deferring this one for now; if we resolve the issues it
could be committed separately to the other patches.

gcc/ChangeLog:
* Makefile.in (OBJS): Add ggc-tests.o.
(GTFILES): Add ggc-tests.c.
* ggc-tests.c: New file.
---
 gcc/Makefile.in |   2 +
 gcc/ggc-tests.c | 302 
 2 files changed, 304 insertions(+)
 create mode 100644 gcc/ggc-tests.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ebc9dee..657aaac 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1267,6 +1267,7 @@ OBJS = \
gcse.o \
gcse-common.o \
ggc-common.o \
+   ggc-tests.o \
gimple.o \
gimple-builder.o \
gimple-expr.o \
@@ -2366,6 +2367,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h 
$(srcdir)/coretypes.h \
   $(srcdir)/emit-rtl.c $(srcdir)/except.h $(srcdir)/explow.c $(srcdir)/expr.c \
   $(srcdir)/expr.h \
   $(srcdir)/function.c $(srcdir)/except.c \
+  $(srcdir)/ggc-tests.c \
   $(srcdir)/gcse.c $(srcdir)/godump.c \
   $(srcdir)/lists.c $(srcdir)/optabs-libfuncs.c \
   $(srcdir)/profile.c $(srcdir)/mcf.c \
diff --git a/gcc/ggc-tests.c b/gcc/ggc-tests.c
new file mode 100644
index 000..ed6cbe6
--- /dev/null
+++ b/gcc/ggc-tests.c
@@ -0,0 +1,302 @@
+/* Unit tests for GCC's garbage collector (and gengtype etc).
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "opts.h"
+#include "signop.h"
+#include "hash-set.h"
+#include "fixed-value.h"
+#include "alias.h"
+#include "symtab.h"
+#include "tree-core.h"
+#include "stor-layout.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "vec.h"
+#include "hashtab.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "ggc-internal.h" /* (for ggc_force_collect).  */
+
+#include "selftest.h"
+
+#if CHECKING_P
+
+/* The various GTY markers must be outside of a namespace to be seen by
+   gengtype, so we don't put this file within an anonymous namespace.  */
+
+/* A test fixture for writing ggc tests.  */
+class ggc_test : public ::selftest::test
+{
+ protected:
+  void
+  forcibly_ggc_collect ()
+  {
+ggc_force_collect = true;
+ggc_collect ();
+ggc_force_collect = false;
+  }
+};
+
+/* Smoketest to ensure that a GC root is marked ("tree" type).  */
+
+static GTY(()) tree dummy_unittesting_tree;
+
+TEST_F (ggc_test, tree_marking)
+{
+  dummy_unittesting_tree = build_int_cst (integer_type_node, 1066);
+
+  forcibly_ggc_collect ();
+
+  EXPECT_TRUE (ggc_marked_p (dummy_unittesting_tree));
+}
+
+/* Verify that a simple custom struct works, and that it can
+   own references to non-roots, and have them be marked.  */
+
+struct GTY(()) test_struct
+{
+  test_struct *other;
+};
+
+static GTY(()) 

[PATCH 06/15] Add function-tests.c

2015-11-19 Thread David Malcolm
Jeff approved an earlier version of this (as
unittests/test-functions.c):
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03310.html
with:

> There's some if (0) code in here that needs to be eliminated.

(done)

> The RTL case in particular is probably stretching the limits of what
> we can do with the framework right now. Or more correctly what folks
> are likely to write within this framework.
>
> We may need to build up a library of bits that do common things so
> that we're not repeating that stuff all over the place.
>
> As far as RTL testing, as you note, once we hit RTL we're going to
> have far more target dependencies to contend with. Testing will
> be nontrivial.
>
> OK if/when prereqs are approved. Minor twiddling if we end up moving
> it elsewhere or standardizing/reducing header files is pre-approved.

Like other moves, this one gains #if CHECKING_P, and the change of
include from gtest/gtest.h to selftest.h.

gcc/ChangeLog:
* function-tests.c: New file.
---
 gcc/function-tests.c | 630 +++
 1 file changed, 630 insertions(+)
 create mode 100644 gcc/function-tests.c

diff --git a/gcc/function-tests.c b/gcc/function-tests.c
new file mode 100644
index 000..58b27b8
--- /dev/null
+++ b/gcc/function-tests.c
@@ -0,0 +1,630 @@
+/* Unit tests for function-handling.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "opts.h"
+#include "signop.h"
+#include "hash-set.h"
+#include "fixed-value.h"
+#include "alias.h"
+#include "flags.h"
+#include "symtab.h"
+#include "tree-core.h"
+#include "stor-layout.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "rtl.h"
+#include "predict.h"
+#include "vec.h"
+#include "hashtab.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "hard-reg-set.h"
+#include "input.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "cfganal.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "gimple-expr.h"
+#include "toplev.h"
+#include "print-tree.h"
+#include "tree-iterator.h"
+#include "gimplify.h"
+#include "tree-cfg.h"
+#include "basic-block.h"
+#include "double-int.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "stmt.h"
+#include "hash-table.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "context.h"
+#include "hash-map.h"
+#include "plugin-api.h"
+#include "ipa-ref.h"
+#include "cgraph.h"
+#include "selftest.h"
+
+#if CHECKING_P
+
+namespace {
+
+class function_test : public ::selftest::test
+{
+ protected:
+  tree
+  make_fndecl (tree return_type,
+  const char *name,
+  vec  _types,
+  bool is_variadic = false)
+  {
+tree fn_type;
+if (is_variadic)
+  fn_type = build_varargs_function_type_array (return_type,
+  param_types.length (),
+  param_types.address ());
+else
+  fn_type = build_function_type_array (return_type,
+  param_types.length (),
+  param_types.address ());
+/* FIXME: this uses input_location: */
+tree fndecl = build_fn_decl (name, fn_type);
+
+return fndecl;
+  }
+};
+
+TEST_F (function_test, fndecl_int_void)
+{
+  auto_vec  param_types;
+  tree fndecl = make_fndecl (integer_type_node,
+"test_fndecl_int_void",
+param_types);
+  ASSERT_TRUE (fndecl != NULL);
+
+  /* Verify name of decl.  */
+  tree declname = DECL_NAME (fndecl);
+  ASSERT_TRUE (declname != NULL);
+  EXPECT_EQ (IDENTIFIER_NODE, TREE_CODE (declname));
+  /* We expect it to use a *copy* of the string we passed in.  */
+  const char *identifier_ptr = IDENTIFIER_POINTER (declname);
+  EXPECT_NE ("test_fndecl_int_void", identifier_ptr);
+  EXPECT_EQ (0, strcmp ("test_fndecl_int_void", identifier_ptr));
+
+  /* Verify type of fndecl.  */
+  EXPECT_EQ 

[PATCH 14/15] Add selftests to vec.c

2015-11-19 Thread David Malcolm
Jeff approved an earlier version of this (as
unittests/test-vec.c):
 https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03308.html
> OK if/when prereqs are approved. Minor twiddling if we end up
> moving it elsewhere or standardizing/reducing header files is
> pre-approved.

This version puts the tests at the end of vec.c.

gcc/ChangeLog:
* vec.c: Include "selftest.h".
(class vec_test): New test subclass.
(vec_test, quick_push): New selftest.
(vec_test, safe_push): New selftest.
(vec_test, truncate): New selftest.
(vec_test, safe_grow_cleared): New selftest.
(vec_test, pop): New selftest.
(vec_test, safe_insert): New selftest.
(vec_test, ordered_remove): New selftest.
(vec_test, unordered_remove): New selftest.
(vec_test, block_remove): New selftest.
(reverse_cmp): New function.
(vec_test, qsort): New selftest.
---
 gcc/vec.c | 142 ++
 1 file changed, 142 insertions(+)

diff --git a/gcc/vec.c b/gcc/vec.c
index 48b10c4..c320635 100644
--- a/gcc/vec.c
+++ b/gcc/vec.c
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "hash-table.h"
+#include "selftest.h"
 
 /* vNULL is an empty type with a template cast operation that returns
a zero-initialized vec instance.  Use this when you want
@@ -188,3 +189,144 @@ dump_vec_loc_statistics (void)
 {
   vec_mem_desc.dump (VEC_ORIGIN);
 }
+
+#ifndef GENERATOR_FILE
+#if CHECKING_P
+
+namespace {
+
+class vec_test : public ::selftest::test
+{
+ protected:
+  /* Add the range [START..LIMIT) to V.  */
+  void
+  safe_push_range (vec , int start, int limit)
+  {
+for (int i = start; i < limit; i++)
+  v.safe_push (i);
+  }
+};
+
+TEST_F (vec_test, quick_push)
+{
+  auto_vec  v;
+  EXPECT_EQ (0, v.length ());
+  v.reserve (3);
+  EXPECT_EQ (0, v.length ());
+  EXPECT_TRUE (v.space (3));
+  v.quick_push (5);
+  v.quick_push (6);
+  v.quick_push (7);
+  EXPECT_EQ (3, v.length ());
+  EXPECT_EQ (5, v[0]);
+  EXPECT_EQ (6, v[1]);
+  EXPECT_EQ (7, v[2]);
+}
+
+TEST_F (vec_test, safe_push)
+{
+  auto_vec  v;
+  EXPECT_EQ (0, v.length ());
+  v.safe_push (5);
+  v.safe_push (6);
+  v.safe_push (7);
+  EXPECT_EQ (3, v.length ());
+  EXPECT_EQ (5, v[0]);
+  EXPECT_EQ (6, v[1]);
+  EXPECT_EQ (7, v[2]);
+}
+
+TEST_F (vec_test, truncate)
+{
+  auto_vec  v;
+  EXPECT_EQ (0, v.length ());
+  safe_push_range (v, 0, 10);
+  EXPECT_EQ (10, v.length ());
+
+  v.truncate (5);
+  EXPECT_EQ (5, v.length ());
+}
+
+TEST_F (vec_test, safe_grow_cleared)
+{
+  auto_vec  v;
+  EXPECT_EQ (0, v.length ());
+  v.safe_grow_cleared (50);
+  EXPECT_EQ (50, v.length ());
+  EXPECT_EQ (0, v[0]);
+  EXPECT_EQ (0, v[49]);
+}
+
+TEST_F (vec_test, pop)
+{
+  auto_vec  v;
+  safe_push_range (v, 5, 20);
+  EXPECT_EQ (15, v.length ());
+
+  int last = v.pop ();
+  EXPECT_EQ (19, last);
+  EXPECT_EQ (14, v.length ());
+}
+
+TEST_F (vec_test, safe_insert)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.safe_insert (5, 42);
+  EXPECT_EQ (4, v[4]);
+  EXPECT_EQ (42, v[5]);
+  EXPECT_EQ (5, v[6]);
+  EXPECT_EQ (11, v.length ());
+}
+
+TEST_F (vec_test, ordered_remove)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.ordered_remove (5);
+  EXPECT_EQ (4, v[4]);
+  EXPECT_EQ (6, v[5]);
+  EXPECT_EQ (9, v.length ());
+}
+
+TEST_F (vec_test, unordered_remove)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.unordered_remove (5);
+  EXPECT_EQ (9, v.length ());
+}
+
+TEST_F (vec_test, block_remove)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.block_remove (5, 3);
+  EXPECT_EQ (3, v[3]);
+  EXPECT_EQ (4, v[4]);
+  EXPECT_EQ (8, v[5]);
+  EXPECT_EQ (9, v[6]);
+  EXPECT_EQ (7, v.length ());
+}
+
+static int reverse_cmp (const void *p_i, const void *p_j)
+{
+  return *(const int *)p_j - *(const int *)p_i;
+}
+
+TEST_F (vec_test, qsort)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.qsort (reverse_cmp);
+  EXPECT_EQ (9, v[0]);
+  EXPECT_EQ (8, v[1]);
+  EXPECT_EQ (1, v[8]);
+  EXPECT_EQ (0, v[9]);
+  EXPECT_EQ (10, v.length ());
+}
+
+}  // anon namespace
+
+#endif /* #if CHECKING_P */
+#endif /* #ifndef GENERATOR_FILE */
-- 
1.8.5.3



[PATCH 03/15] Add selftests to tree-cfg.c

2015-11-19 Thread David Malcolm
Jeff approved an older version of this:
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03285.html
with:
> Unless there's a good reason, drop the presumably redundant tests
> and this is OK. Save preapprovald for these changes as the bitmap
> patch.

This version removes the redundant tests, and moves the tests
from being in a new file to being in tree-cfg.c

gcc/ChangeLog:
* tree-cfg.c: Include "selftest.h".
(class cfg_test): New test subclass.
(cfg_test, linear_chain): New selftest.
(cfg_test, diamond): New selftest.
(cfg_test, fully_connected): New selftest.
---
 gcc/tree-cfg.c | 264 +
 1 file changed, 264 insertions(+)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 0c624aa..797cce3 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-cfgcleanup.h"
 #include "gimplify.h"
 #include "attribs.h"
+#include "selftest.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
for a function tree.  */
@@ -9011,3 +9012,266 @@ gt_pch_nx (edge_def *e, gt_pointer_operator op, void 
*cookie)
 op (&(e->insns.r), cookie);
   op (&(block), cookie);
 }
+
+#if CHECKING_P
+
+namespace {
+
+class cfg_test : public ::selftest::test
+{
+ protected:
+  tree
+  push_fndecl (const char *name)
+  {
+tree fn_type = build_function_type_array (integer_type_node, /* 
return_type */
+   0, NULL);
+/* FIXME: this uses input_location: */
+tree fndecl = build_fn_decl (name, fn_type);
+tree retval = build_decl (UNKNOWN_LOCATION, RESULT_DECL,
+ NULL_TREE, integer_type_node);
+DECL_RESULT (fndecl) = retval;
+push_struct_function (fndecl);
+function *fun = DECL_STRUCT_FUNCTION (fndecl);
+EXPECT_TRUE (fun != NULL);
+init_empty_tree_cfg_for_function (fun);
+EXPECT_EQ (2, n_basic_blocks_for_fn (fun));
+EXPECT_EQ (0, n_edges_for_fn (fun));
+return fndecl;
+  }
+};
+
+/* These tests directly create CFGs.
+   Compare with the static fns within tree-cfg.c:
+ - build_gimple_cfg
+ - make_blocks: calls create_basic_block (seq, bb);
+ - make_edges.   */
+
+/* Verify a simple cfg of the form:
+ ENTRY -> A -> B -> C -> EXIT.  */
+TEST_F (cfg_test, linear_chain)
+{
+  gimple_register_cfg_hooks ();
+
+  tree fndecl = push_fndecl ("cfg_test_linear_chain");
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+
+  /* Create some empty blocks.  */
+  basic_block bb_a = create_empty_bb (ENTRY_BLOCK_PTR_FOR_FN (fun));
+  basic_block bb_b = create_empty_bb (bb_a);
+  basic_block bb_c = create_empty_bb (bb_b);
+
+  EXPECT_EQ (5, n_basic_blocks_for_fn (fun));
+  EXPECT_EQ (0, n_edges_for_fn (fun));
+
+  /* Create some edges: a simple linear chain of BBs.  */
+  make_edge (ENTRY_BLOCK_PTR_FOR_FN (fun), bb_a, EDGE_FALLTHRU);
+  make_edge (bb_a, bb_b, 0);
+  make_edge (bb_b, bb_c, 0);
+  make_edge (bb_c, EXIT_BLOCK_PTR_FOR_FN (fun), 0);
+
+  /* Verify the edges.  */
+  EXPECT_EQ (4, n_edges_for_fn (fun));
+  EXPECT_EQ (NULL, ENTRY_BLOCK_PTR_FOR_FN (fun)->preds);
+  EXPECT_EQ (1, ENTRY_BLOCK_PTR_FOR_FN (fun)->succs->length ());
+  EXPECT_EQ (1, bb_a->preds->length ());
+  EXPECT_EQ (1, bb_a->succs->length ());
+  EXPECT_EQ (1, bb_b->preds->length ());
+  EXPECT_EQ (1, bb_b->succs->length ());
+  EXPECT_EQ (1, bb_c->preds->length ());
+  EXPECT_EQ (1, bb_c->succs->length ());
+  EXPECT_EQ (1, EXIT_BLOCK_PTR_FOR_FN (fun)->preds->length ());
+  EXPECT_EQ (NULL, EXIT_BLOCK_PTR_FOR_FN (fun)->succs);
+
+  /* Verify the dominance information
+ Each BB in our simple chain should be dominated by the one before
+ it.  */
+  calculate_dominance_info (CDI_DOMINATORS);
+  EXPECT_EQ (bb_a, get_immediate_dominator (CDI_DOMINATORS, bb_b));
+  EXPECT_EQ (bb_b, get_immediate_dominator (CDI_DOMINATORS, bb_c));
+  vec dom_by_b = get_dominated_by (CDI_DOMINATORS, bb_b);
+  EXPECT_EQ (1, dom_by_b.length ());
+  EXPECT_EQ (bb_c, dom_by_b[0]);
+  free_dominance_info (CDI_DOMINATORS);
+
+  /* Similarly for post-dominance: each BB in our chain is post-dominated
+ by the one after it.  */
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  EXPECT_EQ (bb_b, get_immediate_dominator (CDI_POST_DOMINATORS, bb_a));
+  EXPECT_EQ (bb_c, get_immediate_dominator (CDI_POST_DOMINATORS, bb_b));
+  vec postdom_by_b = get_dominated_by (CDI_POST_DOMINATORS, bb_b);
+  EXPECT_EQ (1, postdom_by_b.length ());
+  EXPECT_EQ (bb_a, postdom_by_b[0]);
+  free_dominance_info (CDI_POST_DOMINATORS);
+
+  pop_cfun ();
+}
+
+/* Verify a simple CFG of the form:
+ ENTRY
+   |
+   A
+  / \
+ /t  \f
+B C
+ \   /
+  \ /
+   D
+   |
+  EXIT.  */
+TEST_F (cfg_test, diamond)
+{
+  gimple_register_cfg_hooks ();
+
+  tree fndecl = push_fndecl ("cfg_test_diamond");
+  function *fun = DECL_STRUCT_FUNCTION (fndecl);
+
+  /* Create some 

[PATCH 00/15] Unittests framework v4: -fself-test

2015-11-19 Thread David Malcolm
On Mon, 2015-11-16 at 19:17 +0100, Bernd Schmidt wrote:
> So Jeff and I just had a chat, and we came up with some thoughts about 
> how to proceed. I think we both agree that it would be good to have a 
> special testing backend, along with frontends designed to be able to 
> read in gimple or rtl that can be operated on. That's more of a 
> long-term thing.
> 
> For some of the simpler infrastructure tests such as the ones in this 
> patch kit (bitmap, vec or wide-int functionality testing and such), we 
> had the idea of putting these into every ENABLE_CHECKING compiler, and 
> run them after building stage1, controlled by a -fself-test flag. It's 
> better to detect such basic failures early rather than complete a full 
> bootstrap and test cycle. It also keeps the tests alongside the rest of 
> the implementation, which I consider desirable for such relatively 
> simple data structures.
> 
> Thoughts?

I like the idea.

Here's another iteration of the patch kit, which implements it (mostly).

David Malcolm (15):
  Selftest framework (unittests v4)
  Add selftests to bitmap.c
  Add selftests to tree-cfg.c
  Add selftests to et-forest.c
  Add selftests to fold-const.c
  Add function-tests.c
  Fix warning in function-tests.c
  Add selftests to gimple.c
  Add hash-map-tests.c
  Add hash-set-tests.c
  Add selftests to input.c
  Add rtl-tests.c
  Add selftests to tree.c
  Add selftests to vec.c
  RFC: Add ggc-tests.c

 gcc/Makefile.in  |   9 +-
 gcc/bitmap.c |  92 
 gcc/common.opt   |   4 +
 gcc/et-forest.c  |  99 
 gcc/fold-const.c |  66 ++
 gcc/function-tests.c | 632 +++
 gcc/ggc-tests.c  | 302 
 gcc/gimple.c | 103 +
 gcc/hash-map-tests.c |  81 +++
 gcc/hash-set-tests.c |  57 +
 gcc/input.c  | 107 +
 gcc/rtl-tests.c  |  98 
 gcc/selftest.c   | 152 +
 gcc/selftest.h   | 270 ++
 gcc/toplev.c |  51 +
 gcc/toplev.h |   2 +
 gcc/tree-cfg.c   | 264 +
 gcc/tree.c   |  47 
 gcc/vec.c| 142 
 19 files changed, 2575 insertions(+), 3 deletions(-)
 create mode 100644 gcc/function-tests.c
 create mode 100644 gcc/ggc-tests.c
 create mode 100644 gcc/hash-map-tests.c
 create mode 100644 gcc/hash-set-tests.c
 create mode 100644 gcc/rtl-tests.c
 create mode 100644 gcc/selftest.c
 create mode 100644 gcc/selftest.h

-- 
1.8.5.3



Re: RFC/RFA: Fix bug with REE optimization corrupting extended registers

2015-11-19 Thread Bernd Schmidt

On 11/19/2015 04:34 PM, Nick Clifton wrote:

Hi Bernd,


I had a look around. There's code testing HARD_REGNO_NREGS in
ree.c:combine_set_extension. It's inside #if 0, and labelled
"temporarily disabled". See if enabling that helps you? (Jeff, that #if
0 was added by you).


I suspect that the code was disabled because it prevented too many
useful optimization opportunities.


Well, correctness goes first. Since reenabling it fixes your problem, 
that change is approved (conditional on testing as usual obviously).



Bernd


  1   2   >