Re: [SFN+LVU+IEPM v4 7/9] [LVU] Introduce location views

2018-02-12 Thread Alexandre Oliva
On Feb 12, 2018, Andreas Schwab  wrote:

> On Feb 12 2018, Alexandre Oliva  wrote:
>> On Feb 11, 2018, Andreas Schwab  wrote:
>> 
>>> On Feb 09 2018, Alexandre Oliva  wrote:
>> 
 +  if (list_head->vl_symbol && dwarf2out_locviews_in_attribute ())
 +{
 +  ASM_OUTPUT_LABEL (asm_out_file, list_head->vl_symbol);
>> 
>>> That needs to use ASM_OUTPUT_DEBUG_LABEL.
>> 
>> Note this is always output in the .debug_loclist section, not in code
>> sections, so I don't get why it should matter.  Care to clarify, please?

> Perhaps I'm misunderstanding it, but I see .LM labels emitted in the
> middle of code bundles, which breaks them apart.

That line would only output .LVUS symbols.

The only line that outputs LM symbols is 

  targetm.asm_out.internal_label (asm_out_file, LINE_CODE_LABEL, label_num);

It's not ASM_OUTPUT_DEBUG_LABEL, but it was there before.

What may have changed is that, using an assembler without .loc view
support, GCC would switch to internal line number tables, which requires
it to emit LM labels at points in which it would otherwise have output
.loc directives.

The patch I posted last night should work around this problem, in that
it will disable LVU by default if the assembler doesn't support .loc
views, and then you won't get this error any more, unless you explicitly
ask for location views.  If you can give it a try on ia64-linux-gnu,
that would be appreciated.  You might also want to give it a spin with
GNU as 2.30: the assembler view mismatch errors you'd get before that
patch should now be gone too, because we no longer trust min insn
lengths to compute view reset points internally.

If you force that on, with -ginternal-reset-location-views, you'll get
errors in at least some of the cases in which the assembler finds the
GCC-computed insn length mismatches the assembler's, and then you can
fix the lengths in GCC, so that eventually we can mark this port as one
whose lengths can be used to this end.  See also the reset_location_view
target hook introduced in the same patch.

https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00624.html

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] Fix _vpermi2var3_mask (PR target/84336)

2018-02-12 Thread Kirill Yukhin
Hello Jakub!

> On 13 Feb 2018, at 00:59, Jakub Jelinek  wrote:
> 
> Hi!
> 
> The following testcase ICEs, because the expander is called with
> a subreg as operands[2], and gen_lowpart on it creates another subreg
> from the same pseudo; the instructions rely on match_dup working:
> (define_insn "*_vpermi2var3_mask"
>  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
>(vec_merge:VF_AVX512VL
>  (unspec:VF_AVX512VL
>[(match_operand: 2 "register_operand" "0")
> (match_operand:VF_AVX512VL 1 "register_operand" "v")
> (match_operand:VF_AVX512VL 3 "nonimmediate_operand" "vm")]
>UNSPEC_VPERMT2)
>  (subreg:VF_AVX512VL (match_dup 2) 0)
>  (match_operand: 4 "register_operand" "Yk")))]
> and this only works if operands[2] is initially a REG.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
Patch is OK for trunk.

--
Thanks, K
> 
> 2018-02-12  Jakub Jelinek  
> 
>   PR target/84336
>   * config/i386/sse.md (_vpermi2var3_mask): Force
>   operands[2] into a REG before using gen_lowpart on it.
> 
>   * gcc.target/i386/pr84336.c: New test.
> 
> --- gcc/config/i386/sse.md.jj 2018-02-06 13:13:03.911758746 +0100
> +++ gcc/config/i386/sse.md 2018-02-12 18:55:27.257386614 +0100
> @@ -18183,7 +18183,10 @@ (define_expand "_vpermi2var (match_dup 5)
> (match_operand: 4 "register_operand")))]
>   "TARGET_AVX512F"
> -  "operands[5] = gen_lowpart (mode, operands[2]);")
> +{
> +  operands[2] = force_reg (mode, operands[2]);
> +  operands[5] = gen_lowpart (mode, operands[2]);
> +})
> 
> (define_insn "*_vpermi2var3_mask"
>   [(set (match_operand:VPERMI2I 0 "register_operand" "=v")
> --- gcc/testsuite/gcc.target/i386/pr84336.c.jj 2018-02-12 19:10:15.861401288 +0100
> +++ gcc/testsuite/gcc.target/i386/pr84336.c 2018-02-12 19:09:17.911405540 +0100
> @@ -0,0 +1,13 @@
> +/* PR target/84336 */
> +/* { dg-do compile } */
> +/* { dg-options "-O0 -ftree-ter -mavx512f" } */
> +
> +#include 
> +
> +struct S { __m512i h; } b;
> +
> +__m512
> +foo (__m512 a, __mmask16 c, __m512 d)
> +{
> +  return _mm512_mask2_permutex2var_ps (a, b.h, c, d);
> +}
> 
>   Jakub



[PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-12 Thread Peter Bergner
PR84279 is a similar problem to PR83399, in that we generate an altivec
load/store through an explicit call to the altivec_{l,st}vx_v4si_2op
pattern and then due to spilling, we end up calling recog() and we match
an earlier pattern, in this case vsx_movv4si_64bit.  That is ok, since
this pattern can generate the lvx/stvx insns the altivec pattern can.
However, due to a constraint bug, we end up using the wrong alternative.

The problematic code after spilling looks like:

(insn 92 131 126 2 (parallel [
(set (reg:V4SI 140)
(unspec:V4SI [
(reg:SI 143 [ g ])
(reg:SI 150 [ ar.v ])
(subreg:SI (reg:DI 146) 0)
(subreg:SI (reg:DI 149) 0)
] UNSPEC_VSX_VEC_INIT))
(clobber (scratch:DI))
(clobber (scratch:DI))
]) "bug.i":25 1237 {vsx_init_v4si})

(insn 126 92 95 2 (set (mem/c:V4SI (and:DI (plus:DI (reg/f:DI 111 sfp)
(reg:DI 156))
(const_int -16 [0xfff0])) [3 MEM[(struct A *)]+0 
S16 A128])
(reg:V4SI 140)) "bug.i":25 1792 {altivec_stvx_v4si_2op})

The vsx_init_v4si pattern forces pseudo 140 to be assigned a GPR, which
should force a reload in insn 126, because the altivec store requires
an altivec register for its src operand.  However, after recog(), we
end up using the vsx_movv4si_64bit pattern which looks like:

(define_insn "*vsx_mov_64bit"
  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
   "=ZwO,  , , r, we,?wQ,
?,   ??r,   ??Y,   ??r,   wo,v,
?,*r,v, ??r,   wZ,v")

(match_operand:VSX_M 1 "input_operand" 
   ", ZwO,   , we,r, r,
wQ,Y, r, r, wE,jwM,
?jwM,  jwM,   W, W, v, wZ"))]

Now we _should_ match using the second to last alternative, but we end up
matching the 8th alternative ("??Y" and "r").  The 8th alternative is used for
storing a GPR, which we have, but the mem we're trying to store to does not
have a valid address for a GPR store.  The "bug" is that the "Y" constraint
code, which is implemented by mem_operand_gpr(), allows our altivec address
when it should not.  The following patch, which fixes the ICE, adds code to
mem_operand_gpr() that disallows such addresses.
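
To make the shape of that check concrete, here is a minimal sketch in plain C.
It is a toy model only: the toy_* names and the struct are hypothetical
stand-ins and are not GCC's rtx representation.  The one thing it takes from
the patch is the idea that an address whose outermost operation is AND (the
altivec form, (and (plus ...) -16)) is rejected for GPR accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT the real GCC definitions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_AND };

struct toy_addr
{
  enum toy_code code;
  const struct toy_addr *op0;   /* first operand, NULL for leaves */
  const struct toy_addr *op1;   /* second operand, NULL for leaves */
  long value;                   /* register number or constant value */
};

/* Return true if ADDR is acceptable for a GPR load/store in this toy
   model.  Only the top-level AND test mirrors the patch; the offset
   checks that follow in the real function are left out here.  */
bool
toy_mem_operand_gpr (const struct toy_addr *addr)
{
  assert (addr != NULL);

  /* An altivec-style address masks the low bits with AND.  */
  if (addr->code == TOY_AND)
    return false;

  return true;
}
```

In this model, an address built as AND of (PLUS reg reg) and the constant -16
is rejected, while the bare (PLUS reg reg) is still accepted.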

This patch passed bootstrap and regtesting on powerpc64le-linux with
no regressions.  Ok for mainline?

Peter

gcc/
PR target/84279
* config/rs6000/rs6000.c (mem_operand_gpr): Disallow altivec addresses.

gcc/testsuite/
PR target/84279
* g++.dg/pr84279.C: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 257606)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -8220,6 +8220,12 @@ mem_operand_gpr (rtx op, machine_mode mo
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Don't allow altivec type addresses like (mem (and (plus ...))).
+ See PR target/84279.  */
+
+  if (GET_CODE (addr) == AND)
+return false;
+
   op = address_offset (addr);
   if (op == NULL_RTX)
 return true;
Index: gcc/testsuite/g++.dg/pr84279.C
===
--- gcc/testsuite/g++.dg/pr84279.C  (nonexistent)
+++ gcc/testsuite/g++.dg/pr84279.C  (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-O3 -mcpu=power8 -g -fPIC -fvisibility=hidden -fstack-protector-strong" } */
+
+template  struct E { T e; };
+struct J {
+  unsigned k, l;
+  J (unsigned x, unsigned y) : k(x), l(y) {}
+};
+typedef struct A {
+  J n, p;
+  A ();
+  A (J x, J y) : n(x), p(y) {}
+} *S;
+S t;
+struct B {
+  struct C {
+S q, r;
+int u, v;
+bool m1 (S, A &);
+J m2 () const;
+J m3 () const;
+A m4 () const;
+  };
+  typedef E D;
+  void m5 (D *);
+  void m6 (unsigned, A);
+};
+bool B::C::m1 (S, A ) { bool o; x = m4 (); return o; }
+J B::C::m2 () const { unsigned g (u == 0); unsigned h (v); return J (g, h); }
+J B::C::m3 () const { unsigned g (q != t); unsigned h (r != t); return J (g, h); }
+A B::C::m4 () const { return A (m2 (), m3 ()); }
+void B::m5 (D *c) { unsigned x; C ar; A am; if (ar.m1 (c->e, am)) m6 (x, am); }


Re: [PING #3] [PATCH] make -Wrestrict for strcat more meaningful (PR 83698)

2018-02-12 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

On 02/05/2018 08:20 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

As I mentioned, this doesn't solve a regression per se but
rather implements what I consider an important usability
improvement to the -Wrestrict warnings.  Printing offsets
that are the most meaningful makes debugging the problems
the warnings point out easier.

On 01/30/2018 10:24 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01488.html

This is a minor improvement but there have been a couple of
comments lately pointing out that the numbers in the -Wrestrict
messages can make them confusing.

Jakub, since you made one of those comments (the other came up
in the context of bug 84095 - [8 Regression] false-positive
-Wrestrict warnings for memcpy within array).  Can you please
either approve this patch or suggest changes?

On 01/16/2018 05:35 PM, Martin Sebor wrote:

On 01/16/2018 02:32 PM, Jakub Jelinek wrote:

On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:

--- gcc/gimple-ssa-warn-restrict.c(revision 256752)
+++ gcc/gimple-ssa-warn-restrict.c(working copy)
@@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr,
tree si
   base = SSA_NAME_VAR (base);
   }

+  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
+{
+  if (offrange[0] < 0 && offrange[1] > 0)
+offrange[0] = 0;
+}


Why the 2 nested ifs?


No particular reason.  There may have been more code in there
that I ended up removing.  Or a comment.  I can remove the
extra braces when the patch is approved.




@@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
 return false;

   /* When strcat overlap is certain it is always a single byte:
- the terminatinn NUL, regardless of offsets and sizes.  When
+ the terminating NUL, regardless of offsets and sizes.  When
  overlap is only possible its range is [0, 1].  */
   acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
   acs.ovlsiz[1] = 1;
-  acs.ovloff[0] = (dstref->sizrange[0] +
dstref->offrange[0]).to_shwi ();
-  acs.ovloff[1] = (dstref->sizrange[1] +
dstref->offrange[1]).to_shwi ();


You use to_shwi many times in the patch, do the callers or something
earlier
in this function guarantee that you aren't throwing away any bits
(unlike
tree_to_shwi, to_shwi method doesn't ICE, just throws away upper bits).
Especially when you perform additions like here, even if both
wide_ints fit
into a shwi, the result might not.
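
The concern about additions can be illustrated with a small stand-alone
sketch.  It assumes that long long plays the role of a signed HOST_WIDE_INT,
and checked_add is a hypothetical helper, not something in the patch.  It
relies on __builtin_add_overflow, a GCC/Clang builtin that returns true when
the mathematically exact sum does not fit in the result type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: add A and B, storing the result in *SUM.
   Returns true if the sum is representable, false if it overflowed
   (in which case *SUM holds the wrapped value).  */
bool
checked_add (long long a, long long b, long long *sum)
{
  assert (sum != NULL);
  return !__builtin_add_overflow (a, b, sum);
}
```

Both operands of checked_add (LLONG_MAX, 1, &s) fit on their own, yet the call
reports failure, which is the case a plain truncating conversion would hide.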


No, I'm not sure.  In fact, it wouldn't surprise me if it did
happen.  It doesn't cause false positives or negatives but it
can make the offsets less than meaningful in cases where they
are within valid bounds.  There are also cases where they are
meaningless to begin with and there is little the pass can do
about that.

IMO, the ideal solution to the first problem is to add a format
specifier for wide ints to the pretty printer and get rid of
the conversions.  It's probably too late for something like
that now but I'd like to do it for GCC 9.  Unless someone
files a bug/regression, it's also too late for me to go and
try to find and fix these conversions now.

Martin

PS While looking for a case you asked about I came up with
the following.  I don't think there's any slicing involved
but the offsets are just as meaningless as if there were.
I think the way to do significantly better is to detect
out-of-bounds offsets earlier (e.g., as in this patch:
https://gcc.gnu.org/ml/gcc-patches/2017-10/msg02143.html)

$ cat z.c && gcc -O2 -S -Warray-bounds -m32 z.c
extern int a[];

void f (__PTRDIFF_TYPE__ i)
{
  if (i < __PTRDIFF_MAX__ - 7 || __PTRDIFF_MAX__ - 5 < i)
i = __PTRDIFF_MAX__ -  7;

  const int *s = a + i;

  __builtin_memcpy (a, [i], 3);
}
z.c: In function ‘f’:
z.c:10:3: warning: ‘__builtin_memcpy’ offset [-64, -48] is out of the
bounds of object ‘a’ with type ‘int[]’ [-Warray-bounds]
   __builtin_memcpy (a, [i], 3);
   ^~
z.c:1:12: note: ‘a’ declared here
 extern int a[];
^









[PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-12 Thread Martin Sebor

Bug 84207 - Hard coded plural in gimple-fold.c points out one
of a number of warning_at() calls where warning_n() should have
been used.  The attached patch both replaces the calls and also
changes the signatures of the warning_n(), error_n(), and
inform_n() functions to take an unsigned HOST_WIDE_INT argument
instead of int.  I also changed the implementation of
diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
in excess of ULONG_MAX (the maximum value ngettext handles) so
callers don't need to.
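
As a rough sketch of the clamping idea (not the code from the patch itself),
the following models unsigned HOST_WIDE_INT as unsigned long long, which is an
assumption, and uses a hypothetical English-only pick_plural in place of
ngettext so that the example stays self-contained.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Clamp a wide count to what an unsigned long parameter can represent.
   Where the two types have the same width this is the identity.  */
unsigned long
clamp_plural_count (unsigned long long n)
{
  return n <= ULONG_MAX ? (unsigned long) n : ULONG_MAX;
}

/* Hypothetical stand-in for ngettext: English plural rule only.  */
const char *
pick_plural (const char *singular, const char *plural, unsigned long n)
{
  assert (singular != NULL && plural != NULL);
  return n == 1 ? singular : plural;
}
```

Clamping rather than truncating matters because a truncated large count could
land on 1 and select the singular form; a clamped one stays large and selects
the plural.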

Bootstrapped/regtested on x86_64-linux.

Martin

PS Is there any reason why diagnostic-core.h and diagnostic.c
do not/should not include tree.h and other GCC headers?
PR translation/84207 - Hard coded plural in gimple-fold.c

gcc/ChangeLog:

	PR translation/84207
	* diagnostic-core.h (warning_n, error_n, inform_n): Change
	n argument to unsigned HOST_WIDE_INT.
	* diagnostic.c (warning_n, error_n, inform_n): Ditto.
	(diagnostic_n_impl): Ditto.  Handle arguments in excess of ULONG_MAX.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Use warning_n.

Index: gcc/diagnostic-core.h
===
--- gcc/diagnostic-core.h	(revision 257607)
+++ gcc/diagnostic-core.h	(working copy)
@@ -59,10 +59,11 @@ extern void internal_error_no_backtrace (const cha
  ATTRIBUTE_GCC_DIAG(1,2) ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the first parameter.  */
 extern bool warning (int, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern bool warning_n (location_t, int, int, const char *, const char *, ...)
+extern bool warning_n (location_t, int, unsigned HOST_WIDE_INT,
+		   const char *, const char *, ...)
 ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
-extern bool warning_n (rich_location *, int, int, const char *,
-		   const char *, ...)
+extern bool warning_n (rich_location *, int, unsigned HOST_WIDE_INT,
+		   const char *, const char *, ...)
 ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
 extern bool warning_at (location_t, int, const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,4);
@@ -69,7 +70,8 @@ extern bool warning_at (location_t, int, const cha
 extern bool warning_at (rich_location *, int, const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
-extern void error_n (location_t, int, const char *, const char *, ...)
+extern void error_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		 const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void error_at (rich_location *, const char *, ...)
@@ -87,7 +89,8 @@ extern bool permerror (rich_location *, const char
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform (rich_location *, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void inform_n (location_t, int, const char *, const char *, ...)
+extern void inform_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		  const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern bool emit_diagnostic (diagnostic_t, location_t, int,
Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c	(revision 257607)
+++ gcc/diagnostic.c	(working copy)
@@ -51,8 +51,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Prototypes.  */
 static bool diagnostic_impl (rich_location *, int, const char *,
 			 va_list *, diagnostic_t) ATTRIBUTE_GCC_DIAG(3,0);
-static bool diagnostic_n_impl (rich_location *, int, int, const char *,
-			   const char *, va_list *,
+static bool diagnostic_n_impl (rich_location *, int, unsigned HOST_WIDE_INT,
+			   const char *, const char *, va_list *,
 			   diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
 
 static void error_recursion (diagnostic_context *) ATTRIBUTE_NORETURN;
@@ -,15 +,22 @@ diagnostic_impl (rich_location *richloc, int opt,
 /* Implement inform_n, warning_n, and error_n, as documented and
defined below.  */
 static bool
-diagnostic_n_impl (rich_location *richloc, int opt, int n,
+diagnostic_n_impl (rich_location *richloc, int opt, unsigned HOST_WIDE_INT n,
 		   const char *singular_gmsgid,
 		   const char *plural_gmsgid,
 		   va_list *ap, diagnostic_t kind)
 {
   diagnostic_info diagnostic;
-  diagnostic_set_info_translated (,
-  ngettext (singular_gmsgid, plural_gmsgid, n),
-  ap, richloc, kind);
+  unsigned long gtn;
+
+  if (sizeof n <= sizeof gtn)
+gtn = n;
+  else
+/* Use the largest number ngettext() can handle.  */
+gtn = n <= ULONG_MAX ? n : ULONG_MAX;
+
+  const char *text = ngettext (singular_gmsgid, plural_gmsgid, gtn);
+  

Re: Merge from trunk to gccgo branch

2018-02-12 Thread Ian Lance Taylor
I merged trunk revision 257610 to the gccgo branch.

Ian


[PATCH][committed][PR target/83760] Fix several instances of maybe_record_trace_start ICEs on sh

2018-02-12 Thread Jeff Law


The SH port has a pass which looks for suitable places to put its
constant table.

Ideally it finds a barrier (within a certain range from the first use of
the pool) and shoves the constant pool after that barrier.

Otherwise it'll create jump/barrier/ style sequence and insert
the constant pool after the barrier, but before the jump target.


That's all fine and good.  Except that it can confuse the CFI machinery.
 The sequence makes it look like there's live code after the sibling
call.  It's now clear why this bug came and went semi-randomly across
the various sh targets -- it depends on the sibling call being the insn
which causes the main loop in find_barrier to exit.  So one more or one
less insn, or an insn changing alternative (and thus size) and the bug
goes latent.

Anyway fixing this is pretty simple.  We can just insert the table after
the sibling call (taking care not to separate the call from its
NOTE_INSN_CALL_ARG_LOCATION note).

I went back and verified that each of the SH bugs for which we had a
testcase for this problem was fixed by this patch.  I also verified that
the sh-elf libgcc/newlib targets built after this patch.

Given the sensitivity to precisely what insn causes the loop to exit, I
haven't bothered to add a test to the testsuite.

This would likely be a reasonable backport candidate to gcc-7 if SH
folks are interested.

Installed onto the trunk,

Jeff

ps.  No, I wasn't really looking to fix the SH stuff.  I'm really
chasing that MIPS regression.  Fixing this was just collateral damage.

commit eeace813c643ee2f2c5f37bf984a680be6032a2d
Author: law 
Date:   Tue Feb 13 03:07:04 2018 +

PR target/83760
* config/sh/sh.c (find_barrier): Consider a sibling call
a barrier as well.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@257611 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d5913d0a7db..a74c8610443 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2018-02-12  Jeff Law  
 
+   PR target/83760
+   * config/sh/sh.c (find_barrier): Consider a sibling call
+   a barrier as well.
+
* cse.c (try_back_substitute_reg): Move any REG_ARGS_SIZE note when
successfully back substituting a reg.
 
diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 48e99a3cadf..90d6c733d33 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -5233,10 +5233,22 @@ find_barrier (int num_mova, rtx_insn *mova, rtx_insn *from)
 CALL_ARG_LOCATION note.  */
   if (CALL_P (from))
{
+ bool sibcall_p = SIBLING_CALL_P (from);
+
  rtx_insn *next = NEXT_INSN (from);
  if (next && NOTE_P (next)
  && NOTE_KIND (next) == NOTE_INSN_CALL_ARG_LOCATION)
from = next;
+
+ /* If FROM was a sibling call, then we know that control
+will not return.  In fact, we were guaranteed to hit
+a barrier before another real insn.
+
+The jump around the constant pool is unnecessary.  It
+costs space, but more importantly it confuses dwarf2cfi
+generation.  */
+ if (sibcall_p)
+   return emit_barrier_after (from);
}
 
   from = emit_jump_insn_after (gen_jump (label), from);


Re: [PATCH 1/2] Untangle stddef.h a little

2018-02-12 Thread coypu
ping, let me know if there is anything wrong with it.


[RFC][AARCH64] Machine reorg pass for aarch64/Falkor to handle prefetcher tag collision

2018-02-12 Thread Kugan Vivekanandarajah
Implements a machine reorg pass for aarch64/Falkor to handle
prefetcher tag collisions. Strictly speaking, this is not part of the
loop unroller, but on Falkor, unrolling can make the h/w prefetcher
perform badly if there are too many tag collisions, based on the
discussion in https://gcc.gnu.org/ml/gcc/2017-10/msg00178.html.
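
To illustrate what a tag collision means here, the sketch below reuses the bit
layout of make_tag from the patch (4 bits of dest, 4 bits of base, 6 bits of
offset).  The tags_collide helper is hypothetical, and reading the arguments as
register numbers plus an offset value is an assumption, since the patch text
does not spell that out.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit layout as make_tag in the patch: low 4 bits of DEST, low
   4 bits of BASE, low 6 bits of OFFSET, packed into 14 bits.  */
unsigned
make_tag (unsigned dest, unsigned base, unsigned offset)
{
  return (dest & 0xf)
	 | ((base & 0xf) << 4)
	 | ((offset & 0x3f) << 8);
}

/* Hypothetical helper: two loads collide when their tags are equal.  */
bool
tags_collide (unsigned tag_a, unsigned tag_b)
{
  /* Tags built by make_tag never exceed 14 bits.  */
  assert (tag_a < (1u << 14) && tag_b < (1u << 14));
  return tag_a == tag_b;
}
```

With this layout, make_tag (1, 2, 3) and make_tag (17, 2, 3) both give 0x321,
because only the low four bits of dest are kept.  Two such loads would share a
tag even though their destinations differ.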

gcc/ChangeLog:

2018-02-12  Kugan Vivekanandarajah  

* config/aarch64/aarch64.c (iv_p): New.
(strided_load_p): Likewise.
(make_tag): Likewise.
(get_load_info): Likewise.
(aarch64_reorg): Likewise.
(TARGET_MACHINE_DEPENDENT_REORG): Implement new target hook.
From 0cd4f5acb2117c739ba81bb4b8b71af499107812 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 12 Feb 2018 10:44:53 +1100
Subject: [PATCH 4/4] reorg-for-tag-collision

Change-Id: Ic6e42d54268c9112ec1c25de577ca92c1808eeff
---
 gcc/config/aarch64/aarch64.c | 353 +++
 1 file changed, 353 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1ce2a0c..48e7c54 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -71,6 +71,7 @@
 #include "selftest.h"
 #include "selftest-rtl.h"
 #include "rtx-vector-builder.h"
+#include "cfgrtl.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -17203,6 +17204,355 @@ aarch64_select_early_remat_modes (sbitmap modes)
 }
 }
 
+static bool
+iv_p (rtx reg, struct loop *loop)
+{
+  df_ref adef;
+  unsigned regno = REGNO (reg);
+  bool def_in_loop = false;
+  bool def_out_loop = false;
+
+  if (GET_MODE_CLASS (GET_MODE (reg)) != MODE_INT)
+return false;
+
+  for (adef = DF_REG_DEF_CHAIN (regno); adef; adef = DF_REF_NEXT_REG (adef))
+{
+  if (!DF_REF_INSN_INFO (adef)
+	  || !NONDEBUG_INSN_P (DF_REF_INSN (adef)))
+	continue;
+
+  basic_block bb = DF_REF_BB (adef);
+  if (dominated_by_p (CDI_DOMINATORS, bb, loop->header)
+	  && bb->loop_father == loop)
+	{
+	  rtx_insn *insn = DF_REF_INSN (adef);
+	  recog_memoized (insn);
+	  rtx pat = PATTERN (insn);
+	  if (GET_CODE (pat) != SET)
+	continue;
+	  rtx x = SET_SRC (pat);
+	  if (GET_CODE (x) == ZERO_EXTRACT
+	  || GET_CODE (x) == ZERO_EXTEND
+	  || GET_CODE (x) == SIGN_EXTEND)
+	x = XEXP (x, 0);
+	  if (MEM_P (x))
+	continue;
+	  if (GET_CODE (x) == POST_INC
+	  || GET_CODE (x) == POST_DEC
+	  || GET_CODE (x) == PRE_INC
+	  || GET_CODE (x) == PRE_DEC)
+	def_in_loop = true;
+	  else if (BINARY_P (x))
+	def_in_loop = true;
+	}
+  if (dominated_by_p (CDI_DOMINATORS, loop->header, bb))
+	def_out_loop = true;
+  if (def_in_loop && def_out_loop)
+	return true;
+}
+  return false;
+}
+
+/* Return true if X is a strided load.  */
+
+static bool
+strided_load_p (rtx x,
+		struct loop *loop,
+		bool *pre_post,
+		rtx *base,
+		rtx *offset)
+{
+  /* Loaded value is extended, get src.  */
+  if (GET_CODE (x) == ZERO_EXTRACT
+  || GET_CODE (x) == ZERO_EXTEND
+  || GET_CODE (x) == SIGN_EXTEND)
+x = XEXP (x, 0);
+
+  /* If it is not MEM_P, it is not loaded from memory.  */
+  if (!MEM_P (x))
+return false;
+
+  /* Get the src of MEM_P.  */
+  x = XEXP (x, 0);
+
+  /* If it is a post/pre increment, get the src.  */
+  if (GET_CODE (x) == POST_INC
+  || GET_CODE (x) == POST_DEC
+  || GET_CODE (x) == PRE_INC
+  || GET_CODE (x) == PRE_DEC)
+{
+  x = XEXP (x, 0);
+  *pre_post = true;
+}
+
+  /* get base and offset depending on the type.  */
+  if (REG_P (x)
+  || UNARY_P (x))
+{
+  if (!REG_P (x))
+	x = XEXP (x, 0);
+  if (REG_P (x)
+	  && iv_p (x, loop))
+	{
+	  *base = x;
+	  return true;
+	}
+}
+  else if (BINARY_P (x))
+{
+  rtx reg1, reg2;
+  reg1 = XEXP (x, 0);
+
+  if (REG_P (reg1)
+	  && REGNO (reg1) == SP_REGNUM)
+	return false;
+  reg2 = XEXP (x, 1);
+
+  if (REG_P (reg1)
+	  && iv_p (reg1, loop))
+	{
+
+	  *base = reg1;
+	  *offset = reg2;
+	  return true;
+	}
+
+  if (REG_P (reg1)
+	  && REG_P (reg2)
+	  && iv_p (reg2, loop))
+	{
+	  *base = reg1;
+	  *offset = reg2;
+	  return true;
+	}
+}
+  return false;
+}
+
+static unsigned
+make_tag (unsigned dest, unsigned base, unsigned offset)
+{
+  return (dest & 0xf)
+| ((base & 0xf) << 4)
+| ((offset & 0x3f) << 8);
+}
+
+
+/* Return true if X INSN is a strided load.  */
+
+static bool
+get_load_info (rtx_insn *insn,
+	   struct loop *loop,
+	   bool *pre_post,
+	   rtx *base,
+	   rtx *dest,
+	   rtx *offset)
+{
+  subrtx_var_iterator::array_type array;
+  if (!INSN_P (insn) || recog_memoized (insn) < 0)
+return false;
+  rtx pat = PATTERN (insn);
+  switch (GET_CODE (pat))
+{
+case PARALLEL:
+	{
+	  for (int j = 0; j < XVECLEN (pat, 0); ++j)
+	{
+	  rtx ex = XVECEXP (pat, 0, j);
+	  FOR_EACH_SUBRTX_VAR (iter, array, ex, NONCONST)
+		{
+		  const_rtx x = *iter;
+		  if (GET_CODE (x) == SET
+		  

[RFC][AARCH64] Implements target hook

2018-02-12 Thread Kugan Vivekanandarajah
Implements target hook TARGET_HW_MAX_MEM_READ_STREAMS for aarch64
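
The hook amounts to reading one field from the per-CPU tuning table.  A
reduced model of that lookup follows.  The toy_* names are hypothetical, the
struct keeps only the new field, and treating -1 as "not specified" is an
assumption; the values 7 (qdf24xx) and -1 (generic) are the ones used in the
patch below.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of cpu_prefetch_tune: only the field the patch adds.  */
struct toy_prefetch_tune
{
  int hw_prefetchers_avail;   /* -1 is assumed to mean "not specified" */
};

const struct toy_prefetch_tune toy_generic_tune = { -1 };
const struct toy_prefetch_tune toy_qdf24xx_tune = { 7 };

/* Model of the hook: report the value from the selected tuning.  */
int
toy_hw_max_mem_read_streams (const struct toy_prefetch_tune *tune)
{
  assert (tune != NULL);
  return tune->hw_prefetchers_avail;
}
```

In the real port the tuning is reached through aarch64_tune_params rather than
being passed as an argument; the parameter here only keeps the sketch free of
global state.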

gcc/ChangeLog:

2018-02-12  Kugan Vivekanandarajah  

* config/aarch64/aarch64-protos.h (struct cpu_prefetch_tune): Add
  new entry hw_prefetchers_avail.
* config/aarch64/aarch64.c (aarch64_hw_max_mem_read_streams):
  Implement new target hook.
(TARGET_HW_MAX_MEM_READ_STREAMS): Likewise.
From 3529cf5b85d7282b1829d53652f03d0945359ad6 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 12 Feb 2018 10:44:26 +1100
Subject: [PATCH 3/4] add-prefetchers-availabl

Change-Id: I68af62d7be56255574a9c3f636b2d338f918b4e1
---
 gcc/config/aarch64/aarch64-protos.h |  1 +
 gcc/config/aarch64/aarch64.c| 26 --
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 2d705d2..2e3b2a1 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -231,6 +231,7 @@ struct cpu_prefetch_tune
   const int l1_cache_line_size;
   const int l2_cache_size;
   const int default_opt_level;
+  const int hw_prefetchers_avail;
 };
 
 struct tune_params
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2e70f3a..1ce2a0c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -547,7 +547,8 @@ static const cpu_prefetch_tune generic_prefetch_tune =
   -1,			/* l1_cache_size  */
   -1,			/* l1_cache_line_size  */
   -1,			/* l2_cache_size  */
-  -1			/* default_opt_level  */
+  -1,			/* default_opt_level  */
+  -1			/* default hw_prefetchers_avail */
 };
 
 static const cpu_prefetch_tune exynosm1_prefetch_tune =
@@ -556,7 +557,8 @@ static const cpu_prefetch_tune exynosm1_prefetch_tune =
   -1,			/* l1_cache_size  */
   64,			/* l1_cache_line_size  */
   -1,			/* l2_cache_size  */
-  -1			/* default_opt_level  */
+  -1,			/* default_opt_level  */
+  -1			/* default hw_prefetchers_avail */
 };
 
 static const cpu_prefetch_tune qdf24xx_prefetch_tune =
@@ -565,7 +567,8 @@ static const cpu_prefetch_tune qdf24xx_prefetch_tune =
   32,			/* l1_cache_size  */
   64,			/* l1_cache_line_size  */
   1024,			/* l2_cache_size  */
-  -1			/* default_opt_level  */
+  -1,			/* default_opt_level  */
+  7			/* hw_prefetchers_avail */
 };
 
 static const cpu_prefetch_tune thunderxt88_prefetch_tune =
@@ -574,7 +577,8 @@ static const cpu_prefetch_tune thunderxt88_prefetch_tune =
   32,			/* l1_cache_size  */
   128,			/* l1_cache_line_size  */
   16*1024,		/* l2_cache_size  */
-  3			/* default_opt_level  */
+  3,			/* default_opt_level  */
+  -1			/* default hw_prefetchers_avail */
 };
 
 static const cpu_prefetch_tune thunderx_prefetch_tune =
@@ -583,7 +587,8 @@ static const cpu_prefetch_tune thunderx_prefetch_tune =
   32,			/* l1_cache_size  */
   128,			/* l1_cache_line_size  */
   -1,			/* l2_cache_size  */
-  -1			/* default_opt_level  */
+  -1,			/* default_opt_level  */
+  -1			/* default hw_prefetchers_avail */
 };
 
 static const cpu_prefetch_tune thunderx2t99_prefetch_tune =
@@ -592,7 +597,8 @@ static const cpu_prefetch_tune thunderx2t99_prefetch_tune =
   32,			/* l1_cache_size  */
   64,			/* l1_cache_line_size  */
   256,			/* l2_cache_size  */
-  -1			/* default_opt_level  */
+  -1,			/* default_opt_level  */
+  -1			/* default hw_prefetchers_avail */
 };
 
 static const struct tune_params generic_tunings =
@@ -17143,6 +17149,11 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn)
 	return true;
 }
 }
+static int
+aarch64_hw_max_mem_read_streams ()
+{
+  return aarch64_tune_params.prefetch->hw_prefetchers_avail;
+}
 
 /* Implement TARGET_COMPUTE_PRESSURE_CLASSES.  */
 
@@ -17661,6 +17672,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_SELECT_EARLY_REMAT_MODES
 #define TARGET_SELECT_EARLY_REMAT_MODES aarch64_select_early_remat_modes
 
+#undef TARGET_HW_MAX_MEM_READ_STREAMS
+#define TARGET_HW_MAX_MEM_READ_STREAMS aarch64_hw_max_mem_read_streams
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
-- 
2.7.4



[RFC] Tree Loop Unroller Pass

2018-02-12 Thread Kugan Vivekanandarajah
Implements tree loop unroller using the infrastructure provided.
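
For readers less familiar with the transformation, here is a hand-written
source-level illustration of unrolling by a factor of four.  It is not output
of the pass and does not reflect its heuristics; the function names and the
factor are chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: one element per iteration.  */
long
sum_rolled (const int *a, size_t n)
{
  long s = 0;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}

/* The same loop unrolled by four, with a remainder loop for the
   iterations left over when N is not a multiple of four.  */
long
sum_unrolled4 (const int *a, size_t n)
{
  assert (a != NULL || n == 0);

  long s = 0;
  size_t i = 0;

  for (; i + 4 <= n; i += 4)
    s += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];

  for (; i < n; i++)
    s += a[i];

  return s;
}
```

The unrolled body contains four loads per iteration instead of one, which is
the kind of change that interacts with hardware prefetchers.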

gcc/ChangeLog:

2018-02-12  Kugan Vivekanandarajah  

* Makefile.in (OBJS): Add tree-ssa-loop-unroll.o.
* common.opt (ftree-loop-unroll): New option.
* passes.def: Add pass_tree_loop_unroll.
* timevar.def (TV_TREE_LOOP_UNROLL): Add.
* tree-pass.h (make_pass_tree_loop_unroll): Declare.
* tree-ssa-loop-unroll.c: New file.
From 71baaf8393dd79a98b4c0216e56d87083caf0177 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 12 Feb 2018 10:44:00 +1100
Subject: [PATCH 2/4] tree-loop-unroller

Change-Id: I58c25b5f2e796d4166af3ea4e50a0f4a3078b6c2
---
 gcc/Makefile.in|   1 +
 gcc/common.opt |   4 +
 gcc/passes.def |   1 +
 gcc/timevar.def|   1 +
 gcc/tree-pass.h|   1 +
 gcc/tree-ssa-loop-unroll.c | 268 +
 6 files changed, 276 insertions(+)
 create mode 100644 gcc/tree-ssa-loop-unroll.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 374bf3e..de3c146 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1536,6 +1536,7 @@ OBJS = \
 	tree-ssa-loop-im.o \
 	tree-ssa-loop-ivcanon.o \
 	tree-ssa-loop-ivopts.o \
+	tree-ssa-loop-unroll.o \
 	tree-ssa-loop-manip.o \
 	tree-ssa-loop-niter.o \
 	tree-ssa-loop-prefetch.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index b20a9aa..ea47b8c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1770,6 +1770,10 @@ fivopts
 Common Report Var(flag_ivopts) Init(1) Optimization
 Optimize induction variables on trees.
 
+ftree-loop-unroll
+Common Report Var(flag_tree_loop_unroll) Init(1) Optimization
+Perform loop unrolling in gimple.
+
 fjump-tables
 Common Var(flag_jump_tables) Init(1) Optimization
 Use jump tables for sufficiently large switch statements.
diff --git a/gcc/passes.def b/gcc/passes.def
index 9802f08..57f7cc2 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -302,6 +302,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_predcom);
 	  NEXT_PASS (pass_complete_unroll);
 	  NEXT_PASS (pass_slp_vectorize);
+	  NEXT_PASS (pass_tree_loop_unroll);
 	  NEXT_PASS (pass_loop_prefetch);
 	  /* Run IVOPTs after the last pass that uses data-reference analysis
 	 as that doesn't handle TARGET_MEM_REFs.  */
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 91221ae..a6bb847 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -202,6 +202,7 @@ DEFTIMEVAR (TV_TREE_LOOP_DISTRIBUTION, "tree loop distribution")
 DEFTIMEVAR (TV_CHECK_DATA_DEPS   , "tree check data dependences")
 DEFTIMEVAR (TV_TREE_PREFETCH	 , "tree prefetching")
 DEFTIMEVAR (TV_TREE_LOOP_IVOPTS	 , "tree iv optimization")
+DEFTIMEVAR (TV_TREE_LOOP_UNROLL , "tree loop unroll")
 DEFTIMEVAR (TV_PREDCOM		 , "predictive commoning")
 DEFTIMEVAR (TV_TREE_CH		 , "tree copy headers")
 DEFTIMEVAR (TV_TREE_SSA_UNCPROP	 , "tree SSA uncprop")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 93a6a99..2c0740f 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -388,6 +388,7 @@ extern gimple_opt_pass *make_pass_complete_unrolli (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_parallelize_loops (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_loop_prefetch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_tree_loop_unroll (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop-unroll.c b/gcc/tree-ssa-loop-unroll.c
new file mode 100644
index 000..04cf092
--- /dev/null
+++ b/gcc/tree-ssa-loop-unroll.c
@@ -0,0 +1,268 @@
+
+/* Tree Loop Unroller.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "target.h"
+#include "gimple.h"
+#include "cfgloop.h"
+#include "tree-ssa-loop.h"
+#include "tree-scalar-evolution.h"
+#include "tree-ssa-loop-manip.h"
+#include "tree-ssa-loop-niter.h"
+#include 

[RFC] Adds a target hook

2018-02-12 Thread Kugan Vivekanandarajah
Adds a target hook TARGET_HW_MAX_MEM_READ_STREAMS. Loop unroller, if
defined, will try to limit the unrolling factor based on this.
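
A minimal sketch of how such a limit might feed into the unroll decision. The helper name limit_unroll_factor, its parameters, and the rule "read streams after unrolling = loads per iteration times the factor" are assumptions made for illustration; the patch description only says the unroller will try to limit the factor based on the hook.

```cpp
#include <algorithm>

// Hypothetical helper, not taken from the patch.
//   requested      - unroll factor suggested by the cost model
//   loads_per_iter - memory read streams in one iteration of the original loop
//   max_streams    - value a hook like TARGET_HW_MAX_MEM_READ_STREAMS might
//                    return; values <= 0 are treated as "no preference"
//                    (an assumption of this sketch)
unsigned limit_unroll_factor(unsigned requested, unsigned loads_per_iter,
                             int max_streams)
{
  if (max_streams <= 0 || loads_per_iter == 0)
    return requested;

  // Largest factor that keeps the unrolled body within the preferred
  // number of read streams, but never below 1 (no unrolling).
  unsigned cap = static_cast<unsigned>(max_streams) / loads_per_iter;
  cap = std::max(cap, 1u);
  return std::min(requested, cap);
}
```

With two loads per iteration and a preference of eight streams, a requested factor of 8 would be reduced to 4 under these assumptions.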


gcc/ChangeLog:

2018-02-12  Kugan Vivekanandarajah  

* doc/tm.texi.in (TARGET_HW_MAX_MEM_READ_STREAMS): Document.
* doc/tm.texi: Regenerate.
* target.def (hw_max_mem_read_streams): New target hook.
From 95287a11980ff64ee473406d832d75f96204c6e9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 12 Feb 2018 10:42:29 +1100
Subject: [PATCH 1/4] add-target-hook

Change-Id: I1789769c27786babc6a071d12049c72d7afed00e
---
 gcc/doc/tm.texi| 6 ++
 gcc/doc/tm.texi.in | 2 ++
 gcc/target.def | 9 +
 3 files changed, 17 insertions(+)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 7f02b0d..08f4e2a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11718,6 +11718,12 @@ is required only when the target has special constraints like maximum
 number of memory accesses.
 @end deftypefn
 
+@deftypefn {Target Hook} signed TARGET_HW_MAX_MEM_READ_STREAMS (void)
+This target hook returns the maximum number of memory read streams
+ that hw prefers.  Tree loop unroller will use this while deciding
+ unroll factor.
+@end deftypefn
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 90c24be..e222372 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7927,6 +7927,8 @@ build_type_attribute_variant (@var{mdecl},
 
 @hook TARGET_LOOP_UNROLL_ADJUST
 
+@hook TARGET_HW_MAX_MEM_READ_STREAMS
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/target.def b/gcc/target.def
index aeb41df..29295ae 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2751,6 +2751,15 @@ number of memory accesses.",
  unsigned, (unsigned nunroll, struct loop *loop),
  NULL)
 
+/* Return a new value for loop unroll size.  */
+DEFHOOK
+(hw_max_mem_read_streams,
+ "This target hook returns the maximum number of memory read streams\n\
+ that hw prefers.  Tree loop unroller will use this while deciding\n\
+ unroll factor.",
+ signed, (void),
+ NULL)
+
 /* True if X is a legitimate MODE-mode immediate operand.  */
 DEFHOOK
 (legitimate_constant_p,
-- 
2.7.4



[RFC] Tree loop unroller pass

2018-02-12 Thread Kugan Vivekanandarajah
Hi All,

Based on the previous discussions, I tried to implement a tree loop
unroller for partial unrolling. I would like to queue these RFC patches
for next stage1 review.

In summary:

* The cost model for selecting loops uses the same params used
elsewhere in related optimizations. I was told that keeping these the
same would allow better tuning across all the optimizations.

* I have also implemented an option to limit unrolling based on memory
streams, i.e., on micro-architectures that prefer a limited number of
resulting memory streams, that limit is used to cap the unrolling factor.

* I have tested this on variants of aarch64 and the results are
promising. I am in the process of running benchmarks on x86. I will
update the results later.

* I expect that some cost-model changes might be needed to handle (or
provide the ability to handle) various loop preferences of the
micro-architectures. I am sending this patch for review early to
get feedback on this.

* The position of the pass in passes.def can also be changed, for
example to unroll before SLP.

* I have bootstrapped and regression tested on aarch64-linux-gnu.
There are no execution errors or ICEs. There are some testsuite
differences as expected. A few of them need further evaluation and I am
doing that now.
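
For readers less familiar with the transformation, the following is a hand-written sketch of what partial unrolling by a factor of 4 amounts to at the source level. The function names are hypothetical and the factor is fixed for illustration; the pass itself works on GIMPLE and picks the factor from its cost model.

```cpp
#include <cstddef>

// Reference loop: one element per iteration.
long sum_rolled(const int *a, std::size_t n)
{
  long s = 0;
  for (std::size_t i = 0; i < n; ++i)
    s += a[i];
  return s;
}

// The same loop partially unrolled by 4: the main loop handles four
// elements per trip, and a remainder loop handles the last n % 4.
long sum_unrolled4(const int *a, std::size_t n)
{
  long s = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    {
      s += a[i];
      s += a[i + 1];
      s += a[i + 2];
      s += a[i + 3];
    }
  for (; i < n; ++i)
    s += a[i];
  return s;
}
```

Both functions compute the same result; the unrolled form trades code size for fewer loop-control instructions per element.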

Patches are organized as:

Patch1: Adds a target hook TARGET_HW_MAX_MEM_READ_STREAMS. Loop
unroller, if defined, will try to limit the unrolling factor based on
this.

Patch2: Implements tree loop unroller using the infrastructure
provided. Pass itself is very simple.

Patch3: Implements target hook TARGET_HW_MAX_MEM_READ_STREAMS for aarch64.

Patch4: Implements a machine reorg pass for aarch64/Falkor to handle
prefetcher tag collision. This is strictly not part of the loop
unroller, but for Falkor, unrolling can make the h/w prefetcher perform
badly if there are too many tag collisions, based on the discussions in
https://gcc.gnu.org/ml/gcc/2017-10/msg00178.html.

Thanks,
Kugan


Re: [PATCH] RL78 new "vector" function attribute

2018-02-12 Thread DJ Delorie
"Sebastian Perta"  writes:
>>>Looks OK to me, but wait a day or two for a docs person to comment on...
> 6 days no comments so far, can I check in now?

Yup, go ahead.

>>>if the new line is too long
> There are many other lines which have the same length or are even longer
> this is why I left it as it is.

Ok.

> Also based on comments from Jakub (on a different patch) I corrected the
> Changelog entry for this patch (see below). Is this OK?

Yup.  Thanks!


Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-12 Thread Paolo Bonzini
On 09/02/2018 19:07, Jakub Jelinek wrote:
> On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
>>> which indeed fixes the testcase and seems not to break asan.exp.
>>
>> Huh. Need to double check why that makes sense ;) 
> 
> I think it does, for both ASAN_CHECK and ASAN_MARK the pointer argument
> is the second one, the first one is an integer argument with flags.
> And ASAN_MARK, both poison and unpoison, works kind of like a clobber on the
> referenced variable, before unpoison it is generally inaccessible and after
> poison too.

This was too optimistic. :(

In use-after-scope-types-1.C, after the patch FRE+DSE are able to
optimize away the problematic read.  In general it seems to me that the
sanitizer passes should be before DSE if we want ASAN builtins to have
precise info, otherwise some reads or stores might not be
instrumented---GCC was being lucky here.

The obvious change here is:

Index: passes.def
===
--- passes.def  (revision 257584)
+++ passes.def  (working copy)
@@ -95,6 +95,9 @@
  NEXT_PASS (pass_fre);
  NEXT_PASS (pass_early_vrp);
  NEXT_PASS (pass_merge_phi);
+ NEXT_PASS (pass_sancov);
+ NEXT_PASS (pass_asan);
+ NEXT_PASS (pass_tsan);
   NEXT_PASS (pass_dse);
  NEXT_PASS (pass_cd_dce);
  NEXT_PASS (pass_early_ipa_sra);
@@ -259,9 +262,6 @@
   NEXT_PASS (pass_walloca, false);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
-  NEXT_PASS (pass_sancov);
-  NEXT_PASS (pass_asan);
-  NEXT_PASS (pass_tsan);
   NEXT_PASS (pass_dce);
   /* Pass group that runs when 1) enabled, 2) there are loops
 in the function.  Make sure to run pass_fix_loops before

which seems to work (this time for real... not sure what went wrong in
my previous testing) but it's a pretty large change that I'd like to run
by you guys before posting it.
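
To make the ordering concern concrete, here is a small sketch of the kind of access that dead store elimination may remove. The function is hypothetical and is not taken from the testsuite; it only illustrates that an access deleted before the sanitizer passes run can no longer be instrumented by them.

```cpp
// Hypothetical example, not part of the patch or of asan.exp.
int last_store_wins()
{
  int slot[2];
  slot[0] = 1;   // dead: overwritten below before any read, so DSE may drop it
  slot[0] = 2;   // live store
  slot[1] = 3;
  return slot[0] + slot[1];
}
```

If instrumentation happens only after DSE, the first store would never receive a check; with the reordering proposed above, the sanitizer passes would see it first.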

Paolo


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-12 Thread Martin Sebor

On 02/12/2018 10:11 AM, Jason Merrill wrote:

On Mon, Feb 12, 2018 at 11:59 AM, Martin Sebor  wrote:

On 02/12/2018 09:30 AM, Jason Merrill wrote:


On Fri, Feb 9, 2018 at 6:57 PM, Martin Sebor  wrote:


On 02/09/2018 12:52 PM, Jason Merrill wrote:


On 02/08/2018 04:52 PM, Martin Sebor wrote:



It took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
that's the right way to get the primary from a TEMPLATE_DECL.



Yes.


Attached is an updated patch.  It hasn't gone through full
testing yet but please let me know if you'd like me to make
some changes.





+  const char* const whitelist[] = {
+"error", "noreturn", "warning"
+  };




Why whitelist noreturn?  I would expect to want that to be consistent.



I expect noreturn to be used on a primary whose definition
is provided but that's not meant to be used the way the API
is otherwise expected to be.  As in:

   template 
   T [[noreturn]] foo () { throw "not implemented"; }

   template <> int foo();   // implemented elsewhere



Marking that template as noreturn seems pointless, and possibly harmful;
the deprecated, warning, or error attributes would be better for this
situation.



I meant either:

  template 
  T __attribute__ ((noreturn)) foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

or (sigh)

  template 
  [[noreturn]] T foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

It lets code like this

  int bar ()
  {
 return foo();
  }

be diagnosed because it's likely a bug (as Clang does with
-Wunreachable-code).  It doesn't stop code like the following
from compiling (which is good) but it instead lets them throw
at runtime which is what foo's author wants.

  void bar ()
  {
 foo();
  }

It's the same as having an "unimplemented" base virtual function
throw an exception when it's called rather than making it pure
and having calls to it abort.  Declaring the base virtual function
noreturn is useful for the same reason (and also diagnosed by
Clang).  I should remember to add the same warning in GCC 9.
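
A self-contained sketch of the "unimplemented primary" pattern, under the assumption that the primary throws a standard exception. The names make_default and primary_throws are hypothetical and only stand in for foo above; no explicit specialization is shown here.

```cpp
#include <stdexcept>

// Primary template: deliberately unimplemented, so it never returns
// normally and can be declared noreturn.
template <class T>
[[noreturn]] T make_default()
{
  throw std::logic_error("not implemented");
}

// Calls the primary and reports whether it threw, which is the runtime
// behavior the author of such a primary would be relying on.
bool primary_throws()
{
  try
    {
      make_default<char>();
    }
  catch (const std::logic_error &)
    {
      return true;
    }
  return false;
}
```

A caller that uses the result of make_default<char>() can then be flagged by diagnostics that understand noreturn, while a call that discards the result still compiles and throws at runtime.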



Yes, I understood the patterns you had in mind, but I disagree with
them.  My point about harmful is that declaring a function noreturn
because it's unimplemented could be a problem for when the function is
later implemented, and callers were optimized inappropriately.  This
seems like a rather roundabout way to get a warning about calling an
unimplemented function, and not worth overriding the normal behavior.



Removing noreturn from the whitelist means having to prevent
the attribute from causing conflicts with the attributes on
the blacklist.  E.g., in this:

  template  [[malloc]] void* allocate (int);

  template <> [[noreturn]] void* allocate (int);

-Wmissing-attributes would warn for the missing malloc but
-Wattributes will warn once malloc is added.  Ditto for all
other attributes noreturn is considered to conflict with such
as alloc_size and warn_unused_result.


This example seems rather unlikely, and the solution is to remove
[[noreturn]].  I don't think this is worth worrying about for GCC 8.


Removing [[noreturn]] is not a solution because (as I said)
-Wmissing-attributes will warn that the specialization is
missing the malloc attribute.  The only solutions to avoid
both warnings are to either a) remove the malloc attribute
(and all others on the blacklist) from the primary or b) add
all of them to the specialization and remove the noreturn.
I.e., have them match.  That makes sense except when either
the primary or the specialization in fact does not return.

I really don't think it's helpful to try to force noreturn
to match between the primary and its specializations.

The use case you are concerned about (a noreturn function
returning) is already diagnosed:

  warning: function declared ‘noreturn’ has a ‘return’ statement

Martin


[PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-12 Thread Martin Sebor

While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.

The root cause of this regression is different than 83871 so I
prefer to fix each separately.  The attached patch does that.
Because there is an opportunity to share code between the two
fixes I expect to integrate one into the other (whichever is
approved/committed last).

Martin
PR c++/84318 - attribute deprecated on function templates different than class templates

gcc/cp/ChangeLog:

	PR c++/84318
	* pt.c (check_explicit_specialization): Warn for explicit
	specializations of deprecated primary templates.

gcc/testsuite/ChangeLog:

	PR c++/84318
	* g++.dg/ext/attr-deprecated-2.C: New test.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b58c60f..aa5f0dd 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3104,6 +3104,20 @@ check_explicit_specialization (tree declarator,
 	  else if (VAR_P (decl))
 	DECL_COMDAT (decl) = false;
 
+	  if (TREE_CODE (gen_tmpl) != TYPE_DECL)
+	{
+	  tree tmpl = gen_tmpl;
+	  if (DECL_FUNCTION_TEMPLATE_P (tmpl)
+		  || TREE_CODE (tmpl) == TEMPLATE_DECL)
+		tmpl = DECL_TEMPLATE_RESULT (tmpl);
+
+	  /* Diagnose declarations of specializations of
+		 a deprecated primary template.  */
+	  if (TREE_DEPRECATED (tmpl)
+		  || lookup_attribute ("deprecated", DECL_ATTRIBUTES (tmpl)))
+		warn_deprecated_use (tmpl, NULL_TREE);
+	}
+
 	  /* If this is a full specialization, register it so that we can find
 	 it again.  Partial specializations will be registered in
 	 process_partial_specialization.  */
diff --git a/gcc/testsuite/g++.dg/ext/attr-deprecated-2.C b/gcc/testsuite/g++.dg/ext/attr-deprecated-2.C
new file mode 100644
index 000..f639a73
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attr-deprecated-2.C
@@ -0,0 +1,63 @@
+// PR c++/84318 - attribute deprecated on function templates different
+// than class templates
+// { dg-do compile }
+// { dg-options "-Wall" }
+
+#define DEPRECATED __attribute__ ((deprecated))
+
+template 
+struct DEPRECATED
+ClassPartial { };   // { dg-message "declared here" }
+
+// Verify that a partial specialization is diagnosed.
+template 
+struct ClassPartial { };   // { dg-warning ".template * struct ClassPartial. is deprecated" }
+
+
+template 
+struct
+ClassPartialDeprecated { };
+
+template 
+struct DEPRECATED
+ClassPartialDeprecated { };
+
+ClassPartialDeprecated cpdi;
+ClassPartialDeprecated cpdci; // { dg-warning "is deprecated" "bug 84347" { xfail *-*-* } }
+
+
+template 
+struct DEPRECATED
+ClassExplicit { };  // { dg-message "declared here" }
+
+template <>
+struct
+ClassExplicit { }; // { dg-warning ".template * struct ClassExplicit. is deprecated" }
+
+
+template 
+void DEPRECATED
+FuncExplicit ();// { dg-message "declared here" }
+
+template <>
+void
+FuncExplicit();// { dg-warning ".void FuncExplicit\\\(\\\). is deprecated" }
+
+
+template 
+int DEPRECATED
+VarPartial; // { dg-message "declared here" }
+
+template 
+int
+VarPartial;// { dg-warning ".VarPartial. is deprecated" }
+
+template 
+int DEPRECATED
+VarExplicit;// { dg-message "declared here" }
+
+template <>
+int
+VarExplicit;   // { dg-warning ".VarExplicit. is deprecated" }
+
+// { dg-prune-output "variable templates only available" }


Re: Mising Patch #2 from the RISC-V v3 Submission

2018-02-12 Thread Jim Wilson

On 02/12/2018 03:23 AM, Andreas Schwab wrote:

On Feb 06 2017, Palmer Dabbelt  wrote:


+/* Because RISC-V only has word-sized atomics, it requries libatomic where
+   others do not.  So link libatomic by default, as needed.  */
+#undef LIB_SPEC
+#ifdef LD_AS_NEEDED_OPTION
+#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC \
+  " %{pthread:" LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION "}"
+#else
+#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC " -latomic "
+#endif


Why is -latomic added only with -pthread if --as-needed is supported,
but unconditionally if not?  Wouldn't it make sense to add it
unconditionally in both cases?


I don't know the history here, but I do know that the most common atomic 
related bug report we get is for people using pthread, so we were 
probably thinking about that when this was written.  But handling the 
two cases differently does look like a bug.  I'm OK with a patch that 
makes it unconditional in the LD_AS_NEEDED_OPTION case also. 
Particularly if you have a good case to justify it.  Joseph's pointer to 
bug 81358 looks like a possible good justification for this.  Do you 
want to write a patch, or do you want me to write it?


I'm not worried about the case where --as-needed is missing.  Linker 
--as-needed support was added to GNU ld in 2004, and RISC-V support was 
added to GCC in 2017, so there should be no riscv target that is missing 
the linker --as-needed support.


Jim


Re: [PATCH] RISC-V: define _REENTRANT with -pthread

2018-02-12 Thread Jim Wilson

On 02/12/2018 03:15 AM, Andreas Schwab wrote:

This is expected by the AX_PTHREAD autoconf macro from
.

* config/riscv/linux.h (CPP_SPEC): Define.


OK.

Jim


[committed] Fix OpenMP atomic and for condition C++ parsing (PR c++/84341)

2018-02-12 Thread Jakub Jelinek
Hi!

In these cases, we want the tree to be just a placeholder for the operation
plus 2 operands, we are going to take it apart later; by using build_min
we can avoid the asserts build2_loc does, because the operands might not
have the same type etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2018-02-12  Jakub Jelinek  

PR c++/84341
* parser.c (cp_parser_binary_expression): Use build_min instead of
build2_loc to build the no_toplevel_fold_p toplevel binary expression.

* c-c++-common/gomp/pr84341.c: New test.

--- gcc/cp/parser.c.jj  2018-02-12 19:17:37.939216029 +0100
+++ gcc/cp/parser.c 2018-02-12 20:05:55.287322491 +0100
@@ -9330,12 +9330,14 @@ cp_parser_binary_expression (cp_parser*
   if (no_toplevel_fold_p
  && lookahead_prec <= current.prec
  && sp == stack)
-   current.lhs = build2_loc (combined_loc,
- current.tree_type,
- TREE_CODE_CLASS (current.tree_type)
- == tcc_comparison
- ? boolean_type_node : TREE_TYPE (current.lhs),
- current.lhs, rhs);
+   {
+ current.lhs
+   = build_min (current.tree_type,
+TREE_CODE_CLASS (current.tree_type) == tcc_comparison
+? boolean_type_node : TREE_TYPE (current.lhs),
+current.lhs.get_value (), rhs.get_value ());
+ SET_EXPR_LOCATION (current.lhs, combined_loc);
+   }
   else
 {
   current.lhs = build_x_binary_op (combined_loc, current.tree_type,
--- gcc/testsuite/c-c++-common/gomp/pr84341.c.jj 2018-02-12 20:08:20.500327702 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr84341.c   2018-02-12 20:08:00.290326972 +0100
@@ -0,0 +1,10 @@
+/* PR c++/84341 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+void
+foo (int i)
+{
+  #pragma omp atomic
+i =  + 1;/* { dg-error "invalid form of" } */
+}

Jakub


[PATCH] Fix ISA masks for wmmintrin.h builtins (PR target/84335)

2018-02-12 Thread Jakub Jelinek
Hi!

While the documentation only mentions AES resp. PCLMUL CPUIDs for these
intrinsics, they use and return V2DImode vectors and V2DImode is
only in VALID_SSE2_REG_MODE and VALID_AVX512VL_128_REG_MODE, so without
-msse2 we can't create registers with that mode.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?  This isn't really backportable to older branches, unless
r256281 is backported too.

2018-02-12  Jakub Jelinek  

PR target/84335
* config/i386/i386.c (ix86_init_mmx_sse_builtins): Pass
OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2 instead of
OPTION_MASK_ISA_AES as first argument to def_builtin_const
for AES builtins.  Pass OPTION_MASK_ISA_PCLMUL | OPTION_MASK_ISA_SSE2
instead of OPTION_MASK_ISA_PCLMUL as first argument to
def_builtin_const for __builtin_ia32_pclmulqdq128 builtin.
* config/i386/wmmintrin.h: If __SSE2__ is not defined, enable it
temporarily for AES and PCLMUL builtins.

* gcc.target/i386/pr84335.c: New test.

--- gcc/config/i386/i386.c.jj   2018-02-09 06:44:38.737804246 +0100
+++ gcc/config/i386/i386.c  2018-02-12 16:29:11.731233919 +0100
@@ -31282,21 +31282,28 @@ ix86_init_mmx_sse_builtins (void)
   VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
 
   /* AES */
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aesenc128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aesenc128",
 V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESENC128);
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aesenclast128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aesenclast128",
 V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESENCLAST128);
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aesdec128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aesdec128",
 V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESDEC128);
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aesdeclast128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aesdeclast128",
 V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESDECLAST128);
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aesimc128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aesimc128",
 V2DI_FTYPE_V2DI, IX86_BUILTIN_AESIMC128);
-  def_builtin_const (OPTION_MASK_ISA_AES, "__builtin_ia32_aeskeygenassist128",
+  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_aeskeygenassist128",
 V2DI_FTYPE_V2DI_INT, IX86_BUILTIN_AESKEYGENASSIST128);
 
   /* PCLMUL */
-  def_builtin_const (OPTION_MASK_ISA_PCLMUL, "__builtin_ia32_pclmulqdq128",
+  def_builtin_const (OPTION_MASK_ISA_PCLMUL | OPTION_MASK_ISA_SSE2,
+"__builtin_ia32_pclmulqdq128",
 V2DI_FTYPE_V2DI_V2DI_INT, IX86_BUILTIN_PCLMULQDQ128);
 
   /* RDRND */
--- gcc/config/i386/wmmintrin.h.jj  2018-01-03 10:20:05.953535684 +0100
+++ gcc/config/i386/wmmintrin.h 2018-02-12 16:25:32.526060590 +0100
@@ -32,9 +32,9 @@
 
 /* AES */
 
-#ifndef __AES__
+#if !defined(__AES__) || !defined(__SSE2__)
 #pragma GCC push_options
-#pragma GCC target("aes")
+#pragma GCC target("aes,sse2")
 #define __DISABLE_AES__
 #endif /* __AES__ */
 
@@ -101,9 +101,9 @@ _mm_aeskeygenassist_si128 (__m128i __X,
 
 /* PCLMUL */
 
-#ifndef __PCLMUL__
+#if !defined(__PCLMUL__) || !defined(__SSE2__)
 #pragma GCC push_options
-#pragma GCC target("pclmul")
+#pragma GCC target("pclmul,sse2")
 #define __DISABLE_PCLMUL__
 #endif /* __PCLMUL__ */
 
--- gcc/testsuite/gcc.target/i386/pr84335.c.jj  2018-02-12 16:37:55.304353911 +0100
+++ gcc/testsuite/gcc.target/i386/pr84335.c 2018-02-12 16:37:33.230351025 +0100
@@ -0,0 +1,10 @@
+/* PR target/84335 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -maes -mno-sse2" } */
+typedef long long V __attribute__ ((__vector_size__ (16)));
+
+V
+foo (V *a, V *b)
+{
+  return __builtin_ia32_aesenc128 (*a, *b); /* { dg-error "needs isa option" } */
+}

Jakub


[PATCH] Fix _vpermi2var3_mask (PR target/84336)

2018-02-12 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because the expander is called with
a subreg as operands[2], and gen_lowpart on it creates another subreg
from the same pseudo; the instructions rely on match_dup working:
(define_insn "*_vpermi2var3_mask"
  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
(vec_merge:VF_AVX512VL
  (unspec:VF_AVX512VL
[(match_operand: 2 "register_operand" "0")
 (match_operand:VF_AVX512VL 1 "register_operand" "v")
 (match_operand:VF_AVX512VL 3 "nonimmediate_operand" "vm")]
UNSPEC_VPERMT2)
  (subreg:VF_AVX512VL (match_dup 2) 0)
  (match_operand: 4 "register_operand" "Yk")))]
and this only works if operands[2] is initially a REG.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-02-12  Jakub Jelinek  

PR target/84336
* config/i386/sse.md (_vpermi2var3_mask): Force
operands[2] into a REG before using gen_lowpart on it.

* gcc.target/i386/pr84336.c: New test.

--- gcc/config/i386/sse.md.jj   2018-02-06 13:13:03.911758746 +0100
+++ gcc/config/i386/sse.md  2018-02-12 18:55:27.257386614 +0100
@@ -18183,7 +18183,10 @@ (define_expand "_vpermi2var 4 "register_operand")))]
   "TARGET_AVX512F"
-  "operands[5] = gen_lowpart (mode, operands[2]);")
+{
+  operands[2] = force_reg (mode, operands[2]);
+  operands[5] = gen_lowpart (mode, operands[2]);
+})
 
 (define_insn "*_vpermi2var3_mask"
   [(set (match_operand:VPERMI2I 0 "register_operand" "=v")
--- gcc/testsuite/gcc.target/i386/pr84336.c.jj  2018-02-12 19:10:15.861401288 +0100
+++ gcc/testsuite/gcc.target/i386/pr84336.c 2018-02-12 19:09:17.911405540 +0100
@@ -0,0 +1,13 @@
+/* PR target/84336 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -ftree-ter -mavx512f" } */
+
+#include 
+
+struct S { __m512i h; } b;
+
+__m512
+foo (__m512 a, __mmask16 c, __m512 d)
+{
+  return _mm512_mask2_permutex2var_ps (a, b.h, c, d);
+}

Jakub


Re: [PATCH] Improve pow (C, x) -> exp (log (C) * x) optimization (PR middle-end/84309)

2018-02-12 Thread Joseph Myers
On Sat, 10 Feb 2018, Wilco Dijkstra wrote:

> For floats exp2f is ~10% faster than expf, powf is 2.2 times slower, and 
> exp10f is 3.2 times slower (slower than powf due to using double pow).

I expect it would be reasonably straightforward to adapt Szabolcs's 
optimized expf to produce an optimized exp10f (and likewise for log10f).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] fix-up le-altivec-const.c and altivec-const.c tests

2018-02-12 Thread Segher Boessenkool
Hi!

On Mon, Feb 12, 2018 at 03:35:27PM -0600, Will Schmidt wrote:
>   Noticed during review of test results.  I expect the intent
> here was to compile in cases where the run command was not
> valid.
> But for the scan-assembler stanza to work, need to have compile results
> in all cases.

> --- a/gcc/testsuite/gcc.target/powerpc/altivec-consts.c
> +++ b/gcc/testsuite/gcc.target/powerpc/altivec-consts.c
> @@ -1,7 +1,11 @@
> +/* altivec-consts.c:
> +   Requires vmx_hw support to run.  Requires altivec support to compile.
> +   This test varies from le-altivec-consts.c in the ordering of the
> +   vector elements below.  */
>  /* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
> -/* { dg-do compile { target { powerpc*-*-* && { ! vmx_hw } } } } */
> +/* { dg-do compile { target { powerpc*-*-* } } } */
>  /* { dg-require-effective-target powerpc_altivec_ok } */
>  /* { dg-options "-maltivec -mabi=altivec -O2" } */

As dg.exp (in dejagnu itself) says:

# Multiple instances are supported (since we don't support target and xfail
# selectors on one line), though it doesn't make much sense to change the
# compile/assemble/link/run field.  Nor does it make any sense to have
# multiple lines of target selectors (use one line).

You can just leave out the "dg-do compile" line and it will still run the
compile tests if the "dg-do run"'s target selector does not match (and
btw., you can remove the powerpc*-*-* part of the selectors, this is
always true in gcc.target/powerpc/).

Same for the other test.  Okay with that change (if that works :-) )
Thanks!


Segher


Re: Mising Patch #2 from the RISC-V v3 Submission

2018-02-12 Thread Joseph Myers
On Mon, 12 Feb 2018, Andreas Schwab wrote:

> On Feb 06 2017, Palmer Dabbelt  wrote:
> 
> > +/* Because RISC-V only has word-sized atomics, it requries libatomic where
> > +   others do not.  So link libatomic by default, as needed.  */
> > +#undef LIB_SPEC
> > +#ifdef LD_AS_NEEDED_OPTION
> > +#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC \
> > +  " %{pthread:" LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION "}"
> > +#else
> > +#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC " -latomic "
> > +#endif
> 
> Why is -latomic added only with -pthread if --as-needed is supported,
> but unconditionally if not?  Wouldn't it make sense to add it
> unconditionally in both cases?

Really -latomic should be used with --as-needed (provided -nostdlib isn't 
used, which gcc.c specs deal with) - generally, on all architectures (this 
is bug 81358).  Using it without --as-needed is riskier (introducing 
spurious dependencies in binaries that don't actually use it).  I don't 
think there should be any architecture dependency here.
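
As a small illustration of why the library matters here, consider a one-byte atomic counter. The function name is hypothetical. On a target that only has word-sized atomic instructions, the compiler may lower such an operation to a libatomic call (for example __atomic_fetch_add_1) instead of inline code, which is when -latomic is needed at link time; on targets with native byte atomics no library call is involved.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical example: a sub-word (one-byte) atomic read-modify-write.
std::uint8_t bump_twice()
{
  std::atomic<std::uint8_t> counter{0};
  counter.fetch_add(1);
  counter.fetch_add(1);
  return counter.load();
}
```

Linking with --as-needed around -latomic means the dependency is recorded only when such a call actually ends up in the binary.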

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch, libfortran] Use flexible array members for array descriptor

2018-02-12 Thread Thomas Koenig

Hi Jakub,


Or if we have some easy way to find out what objects will need local
variables with descriptors (those need the non-flexible array member stuff)
and others (e.g. dummy arguments etc.) where we could use just the flexible
array members.


Descriptors are used for passing arguments to procedures, for
allocatable variables and for pointers. Pointers and allocatable
variables can also be passed to procedures.

So, not so easy (unfortunately).

Regards

Thomas


[PATCH] Fix get_range_strlen (PR tree-optimization/84339)

2018-02-12 Thread Jakub Jelinek
Hi!

get_range_strlen fails to tell the caller that array_at_struct_end_p has
been involved in cases like (&ptr->arr[0]), while it handles
(ptr->arr).  Fixed thusly, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2018-02-12  Jakub Jelinek  

PR tree-optimization/84339
* gimple-fold.c (get_range_strlen): Set *FLEXP to true when handling
ARRAY_REF where first operand is array_at_struct_end_p COMPONENT_REF.
Formatting fixes.

* gcc.c-torture/execute/pr84339.c: New test.

--- gcc/gimple-fold.c.jj2018-01-11 19:01:07.259442879 +0100
+++ gcc/gimple-fold.c   2018-02-12 15:44:41.350214335 +0100
@@ -1380,9 +1380,15 @@ get_range_strlen (tree arg, tree length[
  /* Set the minimum size to zero since the string in
 the array could have zero length.  */
  *minlen = ssize_int (0);
+
+ if (TREE_CODE (TREE_OPERAND (arg, 0)) == COMPONENT_REF
+ && type == TREE_TYPE (TREE_OPERAND (arg, 0))
+ && array_at_struct_end_p (TREE_OPERAND (arg, 0)))
+   *flexp = true;
}
  else if (TREE_CODE (arg) == COMPONENT_REF
- && TREE_CODE (TREE_TYPE (TREE_OPERAND (arg, 1))) == ARRAY_TYPE)
+  && (TREE_CODE (TREE_TYPE (TREE_OPERAND (arg, 1)))
+  == ARRAY_TYPE))
{
  /* Use the type of the member array to determine the upper
 bound on the length of the array.  This may be overly
@@ -1428,7 +1434,7 @@ get_range_strlen (tree arg, tree length[
  || integer_zerop (val))
return false;
  val = wide_int_to_tree (TREE_TYPE (val),
- wi::sub(wi::to_wide (val), 1));
+ wi::sub (wi::to_wide (val), 1));
  /* Set the minimum size to zero since the string in
 the array could have zero length.  */
  *minlen = ssize_int (0);
--- gcc/testsuite/gcc.c-torture/execute/pr84339.c.jj 2018-02-12 15:47:04.167243039 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr84339.c   2018-02-12 15:46:48.363239868 +0100
@@ -0,0 +1,30 @@
+/* PR tree-optimization/84339 */
+
+struct S { int a; char b[1]; };
+
+__attribute__((noipa)) int
+foo (struct S *p)
+{
+  return __builtin_strlen (&p->b[0]);
+}
+
+__attribute__((noipa)) int
+bar (struct S *p)
+{
+  return __builtin_strlen (p->b);
+}
+
+int
+main ()
+{
+  struct S *p = __builtin_malloc (sizeof (struct S) + 16);
+  if (p)
+{
+  p->a = 1;
+  __builtin_strcpy (p->b, "abcdefg");
+  if (foo (p) != 7 || bar (p) != 7)
+   __builtin_abort ();
+  __builtin_free (p);
+}
+  return 0;
+}

Jakub


[PATCH] Improve pow (C, x) -> exp (log (C) * x) optimization (PR middle-end/84309, take 2)

2018-02-12 Thread Jakub Jelinek
On Sat, Feb 10, 2018 at 03:26:46PM +0100, Jakub Jelinek wrote:
> If use_exp2 is true and (cfun->curr_properties & PROP_gimple_lvec) == 0,
> don't fold it?  Then I guess if we vectorize or slp vectorize the pow
> as vector pow, we'd need to match.pd it into the exp (log (vec_cst) * x).

Here is an updated patch, that defers it for pow (0x2.0pN, x) until after
vectorization and adds tree-vect-patterns.c matcher that will handle it
during vectorization (that one using exp, because we don't have exp2
vectorized).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-12  Jakub Jelinek  

PR middle-end/84309
* match.pd (pow(C,x) -> exp(log(C)*x)): Optimize instead into
exp2(log2(C)*x) if C is a power of 2 and c99 runtime is available.
* generic-match-head.c (canonicalize_math_after_vectorization_p): New
inline function.
* gimple-match-head.c (canonicalize_math_after_vectorization_p): New
inline function.
* omp-simd-clone.h: New file.
* omp-simd-clone.c: Include omp-simd-clone.h.
(expand_simd_clones): No longer static.
* tree-vect-patterns.c: Include fold-const-call.h, attribs.h,
cgraph.h and omp-simd-clone.h.
(vect_recog_pow_pattern): Optimize pow(C,x) to exp(log(C)*x).
(vect_recog_widen_shift_pattern): Formatting fix.
(vect_pattern_recog_1): Don't check optab for calls.

* gcc.dg/pr84309.c: New test.
* gcc.target/i386/pr84309.c: New test.

--- gcc/match.pd.jj 2018-02-09 19:11:26.910070491 +0100
+++ gcc/match.pd2018-02-12 14:15:05.653779352 +0100
@@ -3992,15 +3992,36 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(logs (pows @0 @1))
(mult @1 (logs @0
 
- /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
+ /* pow(C,x) -> exp(log(C)*x) if C > 0,
+or if C is a positive power of 2,
+pow(C,x) -> exp2(log2(C)*x).  */
  (for pows (POW)
   exps (EXP)
   logs (LOG)
+  exp2s (EXP2)
+  log2s (LOG2)
   (simplify
(pows REAL_CST@0 @1)
-(if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
-&& real_isfinite (TREE_REAL_CST_PTR (@0)))
- (exps (mult (logs @0) @1)
+   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0)
+   && real_isfinite (TREE_REAL_CST_PTR (@0)))
+(with {
+   const REAL_VALUE_TYPE *const value = TREE_REAL_CST_PTR (@0);
+   bool use_exp2 = false;
+   if (targetm.libc_has_function (function_c99_misc)
+  && value->cl == rvc_normal)
+{
+  REAL_VALUE_TYPE frac_rvt = *value;
+  SET_REAL_EXP (&frac_rvt, 1);
+  if (real_equal (&frac_rvt, &dconst1))
+use_exp2 = true;
+}
+ }
+ (if (!use_exp2)
+  (exps (mult (logs @0) @1))
+  /* As libmvec doesn't have a vectorized exp2, defer optimizing
+this until after vectorization.  */
+  (if (canonicalize_math_after_vectorization_p ())
+   (exps (mult (logs @0) @1
 
  (for sqrts (SQRT)
   cbrts (CBRT)
--- gcc/generic-match-head.c.jj 2018-01-03 10:19:55.454534005 +0100
+++ gcc/generic-match-head.c2018-02-12 14:13:27.088784495 +0100
@@ -68,3 +68,12 @@ canonicalize_math_p ()
 {
   return true;
 }
+
+/* Return true if math operations that are beneficial only after
+   vectorization should be canonicalized.  */
+
+static inline bool
+canonicalize_math_after_vectorization_p ()
+{
+  return false;
+}
--- gcc/gimple-match-head.c.jj  2018-01-03 10:19:55.931534081 +0100
+++ gcc/gimple-match-head.c 2018-02-12 14:14:17.352781873 +0100
@@ -831,3 +831,12 @@ canonicalize_math_p ()
 {
   return !cfun || (cfun->curr_properties & PROP_gimple_opt_math) == 0;
 }
+
+/* Return true if math operations that are beneficial only after
+   vectorization should be canonicalized.  */
+
+static inline bool
+canonicalize_math_after_vectorization_p ()
+{
+  return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0;
+}
--- gcc/omp-simd-clone.h.jj 2018-02-12 18:11:01.843931808 +0100
+++ gcc/omp-simd-clone.h2018-02-12 18:12:13.901948041 +0100
@@ -0,0 +1,26 @@
+/* OMP constructs' SIMD clone supporting code.
+
+   Copyright (C) 2005-2018 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_OMP_SIMD_CLONE_H
+#define GCC_OMP_SIMD_CLONE_H
+
+extern void expand_simd_clones (struct cgraph_node *);
+
+#endif /* GCC_OMP_SIMD_CLONE_H */
--- 

Re: Add a DECL_EXPR for VLA pointer casts (PR 84305)

2018-02-12 Thread Joseph Myers
On Mon, 12 Feb 2018, Richard Sandiford wrote:

> 2018-02-11  Richard Sandiford  
> 
> gcc/c/
>   PR c/84305
>   * c-decl.c (grokdeclarator): Create an anonymous TYPE_DECL
>   in PARM and TYPENAME contexts too, but attach it to a BIND_EXPR
>   and include the BIND_EXPR in the list of things that need to be
>   pre-evaluated.
> 
> gcc/testsuite/
>   PR c/84305
>   * gcc.c-torture/compile/pr84305.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH, rs6000] fix-up le-altivec-const.c and altivec-const.c tests

2018-02-12 Thread Will Schmidt
Hi,
  Noticed during review of test results.  I expect the intent
here was to compile only in cases where the run command was not
valid.
But for the scan-assembler stanza to work, we need compile results
in all cases.

/* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
/* { dg-do compile { target { powerpc*-*-* && { ! vmx_hw } } } } */
/* { dg-require-effective-target powerpc_altivec_ok } */
...
/* { dg-final { scan-assembler-not "lvx" { target { powerpc*le-*-* } } } } */


So..
Added some commentary, updated the stanzas, retested on assorted power systems.
This fixes "scan-assembler-not lvx" failure as seen in testresults from LE
systems.
OK for trunk?

Thanks
-Will

[testsuite]

2018-02-12  Will Schmidt  

* gcc.target/powerpc/altivec-consts.c:  Update compile stanzas.
* gcc.target/powerpc/le-altivec-consts.c:  Same.

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-consts.c b/gcc/testsuite/gcc.target/powerpc/altivec-consts.c
index 36cb60c..8ec73e9 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-consts.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-consts.c
@@ -1,7 +1,11 @@
+/* altivec-consts.c:
+   Requires vmx_hw support to run.  Requires altivec support to compile.
+   This test varies from le-altivec-consts.c in the ordering of the
+   vector elements below.  */
 /* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
-/* { dg-do compile { target { powerpc*-*-* && { ! vmx_hw } } } } */
+/* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec -mabi=altivec -O2" } */
 
 /* Check that "easy" AltiVec constants are correctly synthesized.  */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c b/gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
index 15ec650..2f81ff7 100644
--- a/gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
+++ b/gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
@@ -1,7 +1,11 @@
+/* le-altivec-consts.c:
+   Requires vmx_hw support to run.  Requires altivec support to compile.
+   This test varies from altivec-consts.c in the ordering of the
+   vector elements below.  */
 /* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
-/* { dg-do compile { target { powerpc*-*-* && { ! vmx_hw } } } } */
+/* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec -mabi=altivec -O2" } */
 
 /* Check that "easy" AltiVec constants are correctly synthesized.  */
 




Re: [PATCH 1/3] Add PTWRITE builtins for x86

2018-02-12 Thread Joseph Myers
On Sun, 11 Feb 2018, Andi Kleen wrote:

> @@ -27064,6 +27064,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
>  @itemx -mfsgsbase
>  @opindex mfsgsbase
>  @need 200
> +@itemx -mptwrite
> +@opindex mptwrite
> +@need 200
>  @itemx -mrdrnd
>  @opindex mrdrnd
>  @need 200

This @itemx sequence is above a paragraph that lists the corresponding 
instruction set extension for each option.  You need to insert an 
appropriate entry in that list between "FSGSBASE, RDRND".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PING Fwd: [patch] implement generic debug() for vectors and hash sets

2018-02-12 Thread Jason Merrill
On Mon, Nov 20, 2017 at 6:47 AM, Aldy Hernandez  wrote:
> Minor oversight...
>
> debug_vec_tree() no longer exists.  I forgot to remove the prototype.
>
> Also, gdbinit.in has a macro that uses it, but this is no longer
> necessary as we can print tree vectors generically with "print
> debug(xxx)".

But that's a lot more to type than "pvt"

Jason


[patch, testcase, fortran, committed] Fix read_dir.f90

2018-02-12 Thread Thomas Koenig

Hello world,

I just committed (in two attempts...) the patch below as obvious.
Reading a byte is fine on some operating systems, just not on
Linux.
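
The relaxed check can be mirrored at the POSIX level.  The sketch below assumes the test reads from a directory opened as a file, as the name read_dir.f90 suggests; read_one_byte is a hypothetical helper, not part of the testsuite.  On Linux, EISDIR has the value 21, which matches the iostat value the test compares against.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to read one byte from PATH opened read-only.
   Returns 0 if the read call succeeded, otherwise the errno value
   reported by open or read.  */
static int
read_one_byte (const char *path)
{
  char c;
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return errno;

  ssize_t n = read (fd, &c, 1);
  /* Save errno before close, which may overwrite it.  */
  int err = n < 0 ? errno : 0;
  close (fd);
  return err;
}
```

A portable caller would accept both outcomes for a directory, i.e. a result of 0 or EISDIR, which is what the updated condition in the test does with iostat.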

I verified this on AIX first.

I'll keep the PR open for a few days to see if this fix really works
on all affected systems.

Regards

Thomas

-! { dg-do run { xfail *-*-freebsd* *-*-dragonfly* hppa*-*-hpux* powerpc-ibm-aix* } }

+! { dg-do run }
 ! PR67367
 program bug
implicit none
@@ -12,7 +11,7 @@ program bug
   call abort
end if
read(10, iostat=ios) c
-   if (ios.ne.21) then
+   if (ios.ne.21.and.ios.ne.0) then
   close(10, status='delete')
   call abort
end if


Re: [PR 83990] Fix location handling in ipa_modify_call_arguments

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 06:35:47PM +0100, Martin Jambor wrote:
> Hi,
> 
> the callee-side arguments manipulation method used by IPA-SRA has two
> issues with how it deals with locations.  First, it gets the location
> from expressions in an unreliable way rather than the statements it sees
> and then it forgets to set a location of one gimple assign it creates.
> Both are fixed in the patch below.
> 
> I have bootstrapped and tested the patch on an x86_64-linux and consider
> it pre-approved by Jakub in bugzilla so plan to commit it to trunk
> tomorrow and to the gcc-7-branch soon afterwards (after testing there).

Yeah, this is ok.

> 2018-01-30  Martin Jambor  
> 
>   PR c++/83990
>   * ipa-param-manipulation.c (ipa_modify_call_arguments): Use location
>   of call statements, also set location of a load to a temporary.
> 
> --- a/gcc/ipa-param-manipulation.c
> +++ b/gcc/ipa-param-manipulation.c
> @@ -295,8 +295,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt,
>  
> poly_int64 byte_offset = exact_div (adj->offset, BITS_PER_UNIT);
> base = gimple_call_arg (stmt, adj->base_index);
> -   loc = DECL_P (base) ? DECL_SOURCE_LOCATION (base)
> -   : EXPR_LOCATION (base);
> +   loc = gimple_location (stmt);
>  
> if (TREE_CODE (base) != ADDR_EXPR
> && POINTER_TYPE_P (TREE_TYPE (base)))
> @@ -385,6 +384,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt,
> else
>   expr = create_tmp_reg (TREE_TYPE (expr));
> gimple_assign_set_lhs (tem, expr);
> +   gimple_set_location (tem, loc);
> gsi_insert_before (&gsi, tem, GSI_SAME_STMT);
>   }
>   }
> -- 
> 2.15.1

Jakub


Re: [patch, libfortran] Use flexible array members for array descriptor

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 08:41:56PM +0100, Thomas Koenig wrote:
> On 08.02.2018 at 12:27, Richard Biener wrote:
> > If the effect of the patch is (it doesn't include generated files) that the
> > function arguments now have pointers to array descriptor types with
> > the flexible array then yes, that's what will be needed anyways, no need
> > for any dust to settle here.
> 
> All right.
> 
> I have attached the patch updated with Janne's comments.
> 
> Regression-tested. OK for trunk then?

For the library part, you could have just used a single
GFC_ARRAY_DESCRIPTOR macro, just use
GFC_ARRAY_DESCRIPTOR (, GFC_INTEGER_1) etc. when you want the flexible
array member.

The compiler side will be harder, wonder if we want to always use
as the TREE_TYPE of things what we use right now and have somewhere
in the lang_specific part a pointer to a type with flexible array member
and whenever trying to dereference something build a MEM_REF with the
flexible array member type; plus whenever passing it to some other function.
Or if we have some easy way to find out what objects will need local
variables with descriptors (those need the non-flexible array member stuff)
and others (e.g. dummy arguments etc.) where we could use just the flexible
array members.
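
To make the two flavours concrete, here is a small C sketch.  All names (dim_t, flex_desc, full_desc, MAX_RANK, alloc_desc) are hypothetical stand-ins and not libgfortran's actual definitions; the point is only the difference between a descriptor with a flexible array member and one with fixed storage.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-dimension record; field names are illustrative.  */
typedef struct
{
  ptrdiff_t stride, lower_bound, upper_bound;
} dim_t;

/* Descriptor with a flexible array member.  It works behind a pointer
   (e.g. for a dummy argument), but a local variable of this type has
   no room for any dimensions.  */
typedef struct
{
  int *base_addr;
  int rank;
  dim_t dim[];
} flex_desc;

/* "Full" descriptor with fixed storage, usable as a local variable.
   MAX_RANK is an assumed upper limit on the rank.  */
#define MAX_RANK 15
typedef struct
{
  int *base_addr;
  int rank;
  dim_t dim[MAX_RANK];
} full_desc;

/* Allocate a flexible descriptor with room for RANK dimensions.
   Returns NULL if the allocation fails.  */
static flex_desc *
alloc_desc (int rank)
{
  flex_desc *d = malloc (sizeof (flex_desc) + (size_t) rank * sizeof (dim_t));
  if (d)
    {
      d->base_addr = NULL;
      d->rank = rank;
    }
  return d;
}
```

Heap allocation sizes the flexible variant as header plus one dim_t per dimension, which is the same shape as the sizeof (gfc_array_i4) + sizeof (descriptor_dimension) computation used for rank-1 descriptors in the library patch.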

Jakub


Re: [patch, libfortran] Use flexible array members for array descriptor

2018-02-12 Thread Janne Blomqvist
On Mon, Feb 12, 2018 at 9:41 PM, Thomas Koenig  wrote:
> On 08.02.2018 at 12:27, Richard Biener wrote:
>>
>> If the effect of the patch is (it doesn't include generated files) that
>> the
>> function arguments now have pointers to array descriptor types with
>> the flexible array then yes, that's what will be needed anyways, no need
>> for any dust to settle here.
>
>
> All right.
>
> I have attached the patch updated with Janne's comments.
>
> Regression-tested. OK for trunk then?

Ok, thanks.

-- 
Janne Blomqvist


Re: [patch, libfortran] Use flexible array members for array descriptor

2018-02-12 Thread Thomas Koenig

On 08.02.2018 at 12:27, Richard Biener wrote:

> If the effect of the patch is (it doesn't include generated files) that the
> function arguments now have pointers to array descriptor types with
> the flexible array then yes, that's what will be needed anyways, no need
> for any dust to settle here.


All right.

I have attached the patch updated with Janne's comments.

Regression-tested. OK for trunk then?

Regards

Thomas

2018-02-12  Thomas Koenig  

* libgfortran.h (GFC_ARRAY_DESCRIPTOR): Remove dimension
of descriptor to use variable members for dim.
Change usage of GFC_ARRAY_DESCRIPTOR accordingly.
(GFC_FULL_ARRAY_DESCRIPTOR): New macro.
(gfc_full_array_i4): New type.
* intrinsics/date_and_time.c (secnds): Use sizeof
(gfc_array_i4) + sizeof (descriptor_dimension) for memory
allocation.
* intrinsics/reshape_generic.c: Use GFC_FULL_ARRAY_DESCRIPTOR.
* io/format.c: Use sizeof (gfc_array_i4) + sizeof
(descriptor_dimension) for memory allocation.
* io/list_read.c (list_formatted_read_scalar): Use
gfc_full_array_i4 for variable.
(nml_read_obj): Likewise.
* io/write.c (list_formatted_write_scalar): Likewise.
(nml_write_obj): Likewise.
* m4/reshape.m4: Use GFC_FULL_ARRAY_DESCRIPTOR.
* generated/reshape_c10.c: Regenerated.
* generated/reshape_c16.c: Regenerated.
* generated/reshape_c4.c: Regenerated.
* generated/reshape_c8.c: Regenerated.
* generated/reshape_i16.c: Regenerated.
* generated/reshape_i4.c: Regenerated.
* generated/reshape_i8.c: Regenerated.
* generated/reshape_r10.c: Regenerated.
* generated/reshape_r16.c: Regenerated.
* generated/reshape_r4.c: Regenerated.
* generated/reshape_r8.c: Regenerated.
Index: generated/reshape_c10.c
===
--- generated/reshape_c10.c	(Revision 257347)
+++ generated/reshape_c10.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_COMPLEX_10)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_c10 (gfc_array_c10 * const restrict, 
Index: generated/reshape_c16.c
===
--- generated/reshape_c16.c	(Revision 257347)
+++ generated/reshape_c16.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_COMPLEX_16)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_c16 (gfc_array_c16 * const restrict, 
Index: generated/reshape_c4.c
===
--- generated/reshape_c4.c	(Revision 257347)
+++ generated/reshape_c4.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_COMPLEX_4)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_c4 (gfc_array_c4 * const restrict, 
Index: generated/reshape_c8.c
===
--- generated/reshape_c8.c	(Revision 257347)
+++ generated/reshape_c8.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_COMPLEX_8)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_c8 (gfc_array_c8 * const restrict, 
Index: generated/reshape_i16.c
===
--- generated/reshape_i16.c	(Revision 257347)
+++ generated/reshape_i16.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_INTEGER_16)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_16 (gfc_array_i16 * const restrict, 
Index: generated/reshape_i4.c
===
--- generated/reshape_i4.c	(Revision 257347)
+++ generated/reshape_i4.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined (HAVE_GFC_INTEGER_4)
 
-typedef GFC_ARRAY_DESCRIPTOR(1, index_type) shape_type;
+typedef GFC_FULL_ARRAY_DESCRIPTOR(1, index_type) shape_type;
 
 
 extern void reshape_4 (gfc_array_i4 * const restrict, 
Index: generated/reshape_i8.c
===
--- generated/reshape_i8.c	(Revision 257347)
+++ generated/reshape_i8.c	(Arbeitskopie)
@@ -28,7 +28,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #if defined 

Go patch committed: error on func declaration/definition

2018-02-12 Thread Ian Lance Taylor
Long long long ago Go permitted writing
func F()
in one file and writing
func F() {}
in another file.  This was removed from the language, and that is now
considered to be a multiple definition error.  Gccgo never caught up
to that, and it has been permitting this invalid code for some time.

Stop permitting it, so that we give correct errors.  Since we've
supported it for a long time, the compiler uses it in a couple of
cases: it predeclares the hash/equal methods if it decides to create
them while compiling another function, and it predeclares main.main as
a mechanism for getting the right warning if a program uses the wrong
signature for main.  For simplicity, keep those existing uses.

This required a few minor changes in libgo which were relying,
unnecessarily, on the current behavior.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257599)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-cebdbf3f293f5b0f3120c009c47da0ceadc113cb
+7998e29eec43ede1cee925d87eef0b09da67d90b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 257540)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -7762,33 +7762,29 @@ Bindings::new_definition(Named_object* o
   go_unreachable();
 
 case Named_object::NAMED_OBJECT_FUNC:
-  if (new_object->is_function_declaration())
-   {
- if (!new_object->func_declaration_value()->asm_name().empty())
-   go_error_at(Linemap::unknown_location(),
-   ("sorry, not implemented: "
-"__asm__ for function definitions"));
- Function_type* old_type = old_object->func_value()->type();
- Function_type* new_type =
-   new_object->func_declaration_value()->type();
- if (old_type->is_valid_redeclaration(new_type, &reason))
-   return old_object;
-   }
   break;
 
 case Named_object::NAMED_OBJECT_FUNC_DECLARATION:
   {
-   if (new_object->is_function())
+   // We declare the hash and equality functions before defining
+   // them, because we sometimes see that we need the declaration
+   // while we are in the middle of a different function.  We
+   // declare the main function before the user defines it, to
+   // give better error messages.
+   if (new_object->is_function()
+   && ((Linemap::is_predeclared_location(old_object->location())
+&& Linemap::is_predeclared_location(new_object->location()))
+   || (Gogo::unpack_hidden_name(old_object->name()) == "main"
+   && Linemap::is_unknown_location(old_object->location()))))
  {
 Function_type* old_type =
 old_object->func_declaration_value()->type();
Function_type* new_type = new_object->func_value()->type();
if (old_type->is_valid_redeclaration(new_type, &reason))
  {
-   if (!old_object->func_declaration_value()->asm_name().empty())
- go_error_at(Linemap::unknown_location(),
- ("sorry, not implemented: "
-  "__asm__ for function definitions"));
+   Function_declaration* fd =
+ old_object->func_declaration_value();
+   go_assert(fd->asm_name().empty());
old_object->set_function_value(new_object->func_value());
this->named_objects_.push_back(old_object);
return old_object;
@@ -7810,8 +7806,10 @@ Bindings::new_definition(Named_object* o
   old_object->set_is_redefinition();
   new_object->set_is_redefinition();
 
-  go_inform(old_object->location(), "previous definition of %qs was here",
-n.c_str());
+  if (!Linemap::is_unknown_location(old_object->location())
+  && !Linemap::is_predeclared_location(old_object->location()))
+go_inform(old_object->location(), "previous definition of %qs was here",
+ n.c_str());
 
   return old_object;
 }
Index: libgo/go/runtime/extern.go
===
--- libgo/go/runtime/extern.go  (revision 257527)
+++ libgo/go/runtime/extern.go  (working copy)
@@ -157,10 +157,6 @@ package runtime
 
 import "runtime/internal/sys"
 
-// Gosched yields the processor, allowing other goroutines to run.  It does not
-// suspend the current goroutine, so execution resumes automatically.
-func Gosched()
-
 // Caller reports file and line number information about function invocations on
 // the calling goroutine's stack. The argument skip is the number of stack frames
 // to ascend, with 0 identifying the caller of 


[C++ Patch] PR 84333 ("[6/7/8 Regression] ICE with ternary operator in template function")

2018-02-12 Thread Paolo Carlini

Hi,

this ICE on valid happens only with checking enabled - that explains why 
we didn't notice it so far - but I think points to a minor but 
substantive correctness issue. In short, we ICE when 
build_conditional_expr calls save_expr, which in turn calls 
contains_placeholder_p, which doesn't correctly handle the sizeof(int),
and tries to use TREE_CONSTANT on the INTEGER_TYPE. I think that in 
general we simply have to explicitly handle both kinds of sizeof in 
contains_placeholder_p: even for a type as simple as INTEGER_TYPE the 
result may not be trivial, ie, type_contains_placeholder_1 checks the 
bounds:


    case INTEGER_TYPE:
    case REAL_TYPE:
    case FIXED_POINT_TYPE:
      /* Here we just check the bounds.  */
      return (CONTAINS_PLACEHOLDER_P (TYPE_MIN_VALUE (type))
              || CONTAINS_PLACEHOLDER_P (TYPE_MAX_VALUE (type)));

I'm finishing testing the below on x86_64-linux, all good so far.

Thanks, Paolo.

//

2018-02-12  Paolo Carlini  

PR c++/84333
* tree.c (contains_placeholder_p): Explicitly handle both kinds
of SIZEOF_EXPR, ie, type and expr operand.

/testsuite
2018-02-12  Paolo Carlini  

PR c++/84333
* g++.dg/template/sizeof16.C: New.
Index: testsuite/g++.dg/template/sizeof16.C
===
--- testsuite/g++.dg/template/sizeof16.C(nonexistent)
+++ testsuite/g++.dg/template/sizeof16.C(working copy)
@@ -0,0 +1,6 @@
+// PR c++/84333
+
+template<typename> int foo()
+{
+  return sizeof(int) > 1 ? : 1;
+}
Index: tree.c
===
--- tree.c  (revision 257588)
+++ tree.c  (working copy)
@@ -3733,6 +3733,12 @@ contains_placeholder_p (const_tree exp)
 a PLACEHOLDER_EXPR. */
  return 0;
 
+   case SIZEOF_EXPR:
+ if (TYPE_P (TREE_OPERAND (exp, 0)))
+   return type_contains_placeholder_p (TREE_OPERAND (exp, 0));
+ else
+   return CONTAINS_PLACEHOLDER_P (TREE_OPERAND (exp, 0));
+
default:
  break;
}


libgo patch committed: Use write barrier for atomic pointer functions

2018-02-12 Thread Ian Lance Taylor
This patch to the Go frontend uses a write barrier for atomic pointer
functions.  This copies atomic_pointer.go from 1.10rc2.  It was
omitted during the transition of the runtime from C to Go, and I
forgot about it.  This may help with PR 84215; I'm not sure since I
haven't been able to recreate the problems described there.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
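
The ordering the new file relies on (run the write barrier first, then do the atomic operation) can be sketched with C11 atomics.  This is only an analogy to the Go runtime code: write_barrier_prewrite here is a hypothetical hook that just counts calls, and none of these names exist in libgo.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical collector hook.  It only counts calls here; a real
   collector would record NEW_VAL so that marking cannot miss it.  */
static int barrier_calls;

static void
write_barrier_prewrite (void *_Atomic *slot, void *new_val)
{
  (void) slot;
  (void) new_val;
  barrier_calls++;
}

/* Atomically store NEW_VAL into *SLOT, running the barrier first.  */
static void
atomic_store_ptr (void *_Atomic *slot, void *new_val)
{
  write_barrier_prewrite (slot, new_val);
  atomic_store (slot, new_val);
}

/* Compare-and-swap *SLOT from OLD_VAL to NEW_VAL.  The barrier has to
   run before the new pointer can become visible, so it is invoked
   even when the swap ends up failing.  */
static bool
cas_ptr (void *_Atomic *slot, void *old_val, void *new_val)
{
  write_barrier_prewrite (slot, new_val);
  return atomic_compare_exchange_strong (slot, &old_val, new_val);
}
```

The conservative barrier in cas_ptr corresponds to the comment in casp: the barrier is only strictly needed when the CAS succeeds, but it cannot be deferred until after the result is known.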

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257540)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-89105404f94005ffa8e2b08df78015dc9ac91362
+cebdbf3f293f5b0f3120c009c47da0ceadc113cb
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/atomic_pointer.go
===
--- libgo/go/runtime/atomic_pointer.go  (nonexistent)
+++ libgo/go/runtime/atomic_pointer.go  (working copy)
@@ -0,0 +1,69 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package runtime
+
+import (
+   "runtime/internal/atomic"
+   "unsafe"
+)
+
+// These functions cannot have go:noescape annotations,
+// because while ptr does not escape, new does.
+// If new is marked as not escaping, the compiler will make incorrect
+// escape analysis decisions about the pointer value being stored.
+// Instead, these are wrappers around the actual atomics (casp1 and so on)
+// that use noescape to convey which arguments do not escape.
+
+// atomicstorep performs *ptr = new atomically and invokes a write barrier.
+//
+//go:nosplit
+func atomicstorep(ptr unsafe.Pointer, new unsafe.Pointer) {
+   writebarrierptr_prewrite((*uintptr)(ptr), uintptr(new))
+   atomic.StorepNoWB(noescape(ptr), new)
+}
+
+//go:nosplit
+func casp(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool {
+   // The write barrier is only necessary if the CAS succeeds,
+   // but since it needs to happen before the write becomes
+   // public, we have to do it conservatively all the time.
+   writebarrierptr_prewrite((*uintptr)(unsafe.Pointer(ptr)), uintptr(new))
+   return atomic.Casp1((*unsafe.Pointer)(noescape(unsafe.Pointer(ptr))), noescape(old), new)
+}
+
+// Like above, but implement in terms of sync/atomic's uintptr operations.
+// We cannot just call the runtime routines, because the race detector expects
+// to be able to intercept the sync/atomic forms but not the runtime forms.
+
+//go:linkname sync_atomic_StoreUintptr sync_atomic.StoreUintptr
+func sync_atomic_StoreUintptr(ptr *uintptr, new uintptr)
+
+//go:linkname sync_atomic_StorePointer sync_atomic.StorePointer
+//go:nosplit
+func sync_atomic_StorePointer(ptr *unsafe.Pointer, new unsafe.Pointer) {
+   writebarrierptr_prewrite((*uintptr)(unsafe.Pointer(ptr)), uintptr(new))
+   sync_atomic_StoreUintptr((*uintptr)(unsafe.Pointer(ptr)), uintptr(new))
+}
+
+//go:linkname sync_atomic_SwapUintptr sync_atomic.SwapUintptr
+func sync_atomic_SwapUintptr(ptr *uintptr, new uintptr) uintptr
+
+//go:linkname sync_atomic_SwapPointer sync_atomic.SwapPointer
+//go:nosplit
+func sync_atomic_SwapPointer(ptr *unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer {
+   writebarrierptr_prewrite((*uintptr)(unsafe.Pointer(ptr)), uintptr(new))
+   old := unsafe.Pointer(sync_atomic_SwapUintptr((*uintptr)(noescape(unsafe.Pointer(ptr))), uintptr(new)))
+   return old
+}
+
+//go:linkname sync_atomic_CompareAndSwapUintptr sync_atomic.CompareAndSwapUintptr
+func sync_atomic_CompareAndSwapUintptr(ptr *uintptr, old, new uintptr) bool
+
+//go:linkname sync_atomic_CompareAndSwapPointer sync_atomic.CompareAndSwapPointer
+//go:nosplit
+func sync_atomic_CompareAndSwapPointer(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool {
+   writebarrierptr_prewrite((*uintptr)(unsafe.Pointer(ptr)), uintptr(new))
+   return sync_atomic_CompareAndSwapUintptr((*uintptr)(noescape(unsafe.Pointer(ptr))), uintptr(old), uintptr(new))
+}
Index: libgo/go/runtime/stubs.go
===
--- libgo/go/runtime/stubs.go   (revision 257527)
+++ libgo/go/runtime/stubs.go   (working copy)
@@ -5,7 +5,6 @@
 package runtime
 
 import (
-   "runtime/internal/atomic"
"runtime/internal/sys"
"unsafe"
 )
@@ -307,15 +306,6 @@ func setSupportAES(v bool) {
support_aes = v
 }
 
-// Here for gccgo until we port atomic_pointer.go and mgc.go.
-//go:nosplit
-func casp(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool {
-   if !atomic.Casp1((*unsafe.Pointer)(noescape(unsafe.Pointer(ptr))), noescape(old), new) {
-   return false
-   }
-   return true
-}
-
 // Here for gccgo until we port lock_*.go.
 func lock(l *mutex)
 func unlock(l *mutex)
@@ -347,12 +337,6 

[PATCH][committed] Fix ICE in maybe_record_trace_start

2018-02-12 Thread Jeff Law


This was something my tester was tripping over on h8-elf.  I was hoping
it was going to fix the similar ICEs for the SH port, but alas those are
different.

The fundamental problem is generic code generated something like this:

(set (temp) (plus (stack_pointer_rtx) (const_int)))
(set (stack_pointer_rtx) (temp))  REG_ARGS_SIZE note


The backward propagation step in cse.c turns the first insn into:

(set (stack_pointer_rtx) (plus (stack_pointer_rtx) (const_int)))

And the second insn gets deleted, losing the REG_ARGS_SIZE note.

We then cross jump the tail of that block with the tail of another block
which has REG_ARGS_SIZE notes that do not get deleted.

The net result is that at the commonized tail we have two paths with
different notions of REG_ARGS_SIZE and thus different CFIs, triggering
the ICE.

The most sensible way to fix this is to move the REG_ARGS_SIZE note
during the backward propagation step in cse.c.

That allows the H8 port to build libgcc/newlib across all its multilib
variants.
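
For illustration only, the note move can be modelled outside of GCC.  The
sketch below is a toy: `toy_insn` and `toy_back_substitute` are made-up
names, not GCC's RTL API, and the only thing they track is whether an insn
carries a REG_ARGS_SIZE note.  It assumes, as the patch's gcc_assert does,
that the earlier insn has no such note of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an insn (hypothetical type, not GCC's rtx_insn).
   It only records whether the insn carries a REG_ARGS_SIZE note and,
   if so, the note's value.  */
struct toy_insn
{
  bool has_args_size;
  long args_size;
  bool deleted;
};

/* Model of the back substitution: INSN's effect is folded into PREV and
   INSN goes away.  Before INSN is dropped, its REG_ARGS_SIZE note (if
   any) is moved to PREV so the argument-size bookkeeping survives.  */
static void
toy_back_substitute (struct toy_insn *prev, struct toy_insn *insn)
{
  if (insn->has_args_size)
    {
      /* Same precondition as the gcc_assert in the patch: PREV must
         not already carry a note of its own.  */
      assert (!prev->has_args_size);
      prev->has_args_size = true;
      prev->args_size = insn->args_size;
      insn->has_args_size = false;
    }
  insn->deleted = true;
}
```

After the call, the surviving insn carries the note, so a later cross-jump
sees the same REG_ARGS_SIZE state on both incoming paths.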

I've also bootstrapped and regression tested on x86_64-linux-gnu, though
I doubt it really got exercised there.

Installing on the trunk.  Now back to the SH, rx and mips problems with
maybe_record_trace_start.

Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5a264391268..d5913d0a7db 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-02-12  Jeff Law  
+
+   * cse.c (try_back_substitute_reg): Move any REG_ARGS_SIZE note when
+   successfully back substituting a reg.
+
 2018-02-12  Richard Biener  
 
PR tree-optimization/84037
diff --git a/gcc/cse.c b/gcc/cse.c
index 825b0bd8989..a73a771041a 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -4256,6 +4256,15 @@ try_back_substitute_reg (rtx set, rtx_insn *insn)
  && (reg_mentioned_p (dest, XEXP (note, 0))
  || rtx_equal_p (src, XEXP (note, 0))))
remove_note (insn, note);
+
+ /* If INSN has a REG_ARGS_SIZE note, move it to PREV.  */
+ note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
+ if (note != 0)
+   {
+ remove_note (insn, note);
+ gcc_assert (!find_reg_note (prev, REG_ARGS_SIZE, NULL_RTX));
+ set_unique_reg_note (prev, REG_ARGS_SIZE, XEXP (note, 0));
+   }
}
}
 }
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dba0bedb7cf..8f22a65c7bb 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2018-02-12  Jeff Law  
+
+   * gcc.c-torture/compile/reg-args-size.c: New test.
+
 2018-02-12  Carl Love  
 
* gcc.target/powerpc/builtins-4-runnable.c (main): Move int128 and
diff --git a/gcc/testsuite/gcc.c-torture/compile/regs-arg-size.c 
b/gcc/testsuite/gcc.c-torture/compile/regs-arg-size.c
new file mode 100644
index 000..0ca0b9f034b
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/regs-arg-size.c
@@ -0,0 +1,36 @@
+int foo;
+typedef long unsigned int size_t;
+typedef short unsigned int wchar_t;
+struct tm
+{
+  int tm_mday;
+  int tm_mon;
+  int tm_year;
+};
+size_t
+__strftime (wchar_t * s, size_t maxsize, const wchar_t * format, const struct 
tm *tim_p)
+{
+  size_t count = 0;
+  int len = 0;
+  size_t i, ctloclen;
+  unsigned long width;
+  {
+if (foo)
+  {
+   {
+ wchar_t *fmt = L"%s%.*d";
+ len = swprintf (&s[count], maxsize, fmt, "-", width, 0);
+   }
+   if ((count) >= maxsize)
+ return 0;
+  }
+else
+  {
+   len =
+ swprintf (&s[count], maxsize - count, L"%.2d/%.2d/%.2d", 42, 99, 0);
+   if ((count) >= maxsize)
+ return 0;
+
+  }
+  }
+}


Re: [PATCH] combine: Update links correctly for new I2 (PR84169)

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 03:59:05PM +, Segher Boessenkool wrote:
> If there is a LOG_LINK between two insns, this means those two insns
> can be combined, as far as dataflow is concerned.  There never should
> be a LOG_LINK between two unrelated insns.  If there is one, combine
> will try to combine the insns without doing all the needed checks if
> the earlier destination is used before the later insn, etc.
> 
> Unfortunately we do not update the LOG_LINKs correctly in some cases.
> This patch fixes at least some of those cases.
> 
> This fixes the PR's testcase on aarch64.  Also tested on 30+ cross
> compilers, and on powerpc64-linux {-m32,-m64}.  Will test on x86_64
> as well before committing.

Will you check in the testcase too?

My preference would be something like the following, so that it can
be torture-tested on all targets.

--- gcc/testsuite/gcc.c-torture/execute/pr84169.c   2018-01-12 
10:39:42.940283691 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr84169.c   2018-02-12 
17:11:18.970878978 +0100
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/84169 */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 T;
+#else
+typedef unsigned long long T;
+#endif
+
+T b;
+
+static __attribute__ ((noipa)) T
+foo (T c, T d, T e, T f, T g, T h)
+{
+  __builtin_mul_overflow ((unsigned char) h, -16, &h);
+  return b + h;
+}
+
+int
+main ()
+{
+  T x = foo (0, 0, 0, 0, 0, 4);
+  if (x != -64)
+__builtin_abort ();
+  return 0;
+}
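
As a side note on the expected value: the test compares against -64
because 4 * -16 is stored into an unsigned T, where it wraps.  The sketch
below shows that arithmetic in isolation.  It assumes GCC or Clang (for
`__builtin_mul_overflow`) and uses the `unsigned long long` variant of T;
`toy_mul_wrapped` and `toy_selfcheck` are hypothetical helpers, not part
of the testcase.

```c
#include <assert.h>

typedef unsigned long long T;

/* Multiply H by -16 and store the result in an unsigned T.  The
   mathematical product is negative (or zero), so for nonzero H it is
   not representable in T; the builtin stores the value reduced modulo
   2^64 and reports the overflow through its return value, which this
   sketch ignores.  */
T
toy_mul_wrapped (unsigned char h)
{
  T result;
  (void) __builtin_mul_overflow (h, -16, &result);
  return result;
}

/* The comparison the testcase makes: -64 converted to T is the same
   wrapped value that the multiplication stores.  */
int
toy_selfcheck (void)
{
  assert (toy_mul_wrapped (4) == (T) -64);
  return 0;
}
```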


Jakub


[PR 83990] Fix location handling in ipa_modify_call_arguments

2018-02-12 Thread Martin Jambor
Hi,

the callee-side arguments manipulation method used by IPA-SRA has two
issues with how it deals with locations.  First, it takes the location
from expressions, which is unreliable, rather than from the statements
it sees.  Second, it forgets to set the location of one gimple assign
it creates.  Both are fixed in the patch below.

I have bootstrapped and tested the patch on x86_64-linux and consider
it pre-approved by Jakub in bugzilla, so I plan to commit it to trunk
tomorrow and to the gcc-7-branch soon afterwards (after testing there).

Thanks,

Martin


2018-01-30  Martin Jambor  

PR c++/83990
* ipa-param-manipulation.c (ipa_modify_call_arguments): Use location
of call statements, also set location of a load to a temporary.

---
 gcc/ipa-param-manipulation.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 36290704644..1ab1fcccdae 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -295,8 +295,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall 
*stmt,
 
  poly_int64 byte_offset = exact_div (adj->offset, BITS_PER_UNIT);
  base = gimple_call_arg (stmt, adj->base_index);
- loc = DECL_P (base) ? DECL_SOURCE_LOCATION (base)
- : EXPR_LOCATION (base);
+ loc = gimple_location (stmt);
 
  if (TREE_CODE (base) != ADDR_EXPR
  && POINTER_TYPE_P (TREE_TYPE (base)))
@@ -385,6 +384,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall 
*stmt,
  else
expr = create_tmp_reg (TREE_TYPE (expr));
  gimple_assign_set_lhs (tem, expr);
+ gimple_set_location (tem, loc);
  gsi_insert_before (&gsi, tem, GSI_SAME_STMT);
}
}
-- 
2.15.1



Re: [Patch, Fortran] PR 84273: Reject allocatable passed-object dummy argument (proc_ptr_47.f90)

2018-02-12 Thread Janus Weil
2018-02-12 8:22 GMT+01:00 Richard Biener :
>> 2018-02-10 0:21 GMT+01:00 Steve Kargl :
>> > On Fri, Feb 09, 2018 at 06:13:34PM +0100, Janus Weil wrote:
>> >>
>> >> the attached patch fixes some checking code for PASS arguments in
>> >> procedure-pointer components, which does not properly account for the
>> >> fact that the PASS argument needs to be polymorphic.
>> >>
>> >> [The reason for this issue is probably that PPCs were mostly
>> >> implemented before polymorphism was available. The corresponding
>> >> pass-arg checks for TBPs are ok.]
>> >>
>> >> The patch also fixes an invalid test case (which was detected thanks
>> >> to Neil Carlson). It regtests cleanly on x86_64-linux-gnu. Ok for
>> >> trunk?
>> >
>> > The patch looks ok to me.  Trunk is in regression and doc
>> > fixes only mode, so you'll probably need to ping Jakub or
>> > Richard (ie., release engineer) for an ok.
>>
>> would you mind if I applied this patch to trunk at the current stage?
>> It was approved by Steve and Paul, is rather simple and low-risk ...
>
> Go ahead.


Thanks! Committed as r257590.

Cheers,
Janus


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-12 Thread Jason Merrill
On Mon, Feb 12, 2018 at 11:59 AM, Martin Sebor  wrote:
> On 02/12/2018 09:30 AM, Jason Merrill wrote:
>>
>> On Fri, Feb 9, 2018 at 6:57 PM, Martin Sebor  wrote:
>>>
>>> On 02/09/2018 12:52 PM, Jason Merrill wrote:

 On 02/08/2018 04:52 PM, Martin Sebor wrote:
>
>
> It took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
> that's the right way to get the primary from a TEMPLATE_DECL.


 Yes.

>>> Attached is an updated patch.  It hasn't gone through full
>>> testing yet but please let me know if you'd like me to make
>>> some changes.
>>
>>
>>
>>> +  const char* const whitelist[] = {
>>> +"error", "noreturn", "warning"
>>> +  };
>>
>>
>>
>> Why whitelist noreturn?  I would expect to want that to be consistent.
>
>
> I expect noreturn to be used on a primary whose definition
> is provided but that's not meant to be used the way the API
> is otherwise expected to be.  As in:
>
>template 
>T [[noreturn]] foo () { throw "not implemented"; }
>
>template <> int foo();   // implemented elsewhere


 Marking that template as noreturn seems pointless, and possibly harmful;
 the deprecated, warning, or error attributes would be better for this
 situation.
>>>
>>>
>>> I meant either:
>>>
>>>   template 
>>>   T __attribute__ ((noreturn)) foo () { throw "not implemented"; }
>>>
>>>   template <> int foo();   // implemented elsewhere
>>>
>>> or (sigh)
>>>
>>>   template 
>>>   [[noreturn]] T foo () { throw "not implemented"; }
>>>
>>>   template <> int foo();   // implemented elsewhere
>>>
>>> It lets code like this
>>>
>>>   int bar ()
>>>   {
>>>  return foo();
>>>   }
>>>
>>> be diagnosed because it's likely a bug (as Clang does with
>>> -Wunreachable-code).  It doesn't stop code like the following
>>> from compiling (which is good) but it instead lets them throw
>>> at runtime which is what foo's author wants.
>>>
>>>   void bar ()
>>>   {
>>>  foo();
>>>   }
>>>
>>> It's the same as having an "unimplemented" base virtual function
>>> throw an exception when it's called rather than making it pure
>>> and having calls to it abort.  Declaring the base virtual function
>>> noreturn is useful for the same reason (and also diagnosed by
>>> Clang).  I should remember to add the same warning in GCC 9.
>>
>>
>> Yes, I understood the patterns you had in mind, but I disagree with
>> them.  My point about harmful is that declaring a function noreturn
>> because it's unimplemented could be a problem for when the function is
>> later implemented, and callers were optimized inappropriately.  This
>> seems like a rather roundabout way to get a warning about calling an
>> unimplemented function, and not worth overriding the normal behavior.
>
>
> Removing noreturn from the whitelist means having to prevent
> the attribute from causing conflicts with the attributes on
> the blacklist.  E.g., in this:
>
>   template  [[malloc]] void* allocate (int);
>
>   template <> [[noreturn]] void* allocate (int);
>
> -Wmissing-attributes would warn for the missing malloc but
> -Wattributes will warn once malloc is added.  Ditto for all
> other attributes noreturn is considered to conflict with such
> as alloc_size and warn_unused_result.

This example seems rather unlikely, and the solution is to remove
[[noreturn]].  I don't think this is worth worrying about for GCC 8.

> I anticipate the warning code to ultimately end up in
> the middle-end so it can handle Joseph's case as well, and
> so it can also be integrated with the attribute conflict
> machinery.  It also needs to be in the middle-end to become
> usable by -Wsuggest-attribute.  But I wasn't thinking of
> making any of these bigger changes until GCC 9.

> Do you want me to integrate it with the conflict stuff now?

No, leaving it for GCC 9 makes sense to me.

Jason

>>> I actually had some misgivings about both warning and deprecated
>>> for the white-listing, but not for noreturn.  My (only mild)
>>> concern is that both warning and deprecated functions can and
>>> likely will in some cases still be called, and so using them to
>>> suppress the warning runs the risk that their calls might be
>>> wrong and no one will notice.  Warning cannot be suppressed
>>> so it seems unlikely to be ignored, but deprecated can be.
>>> So I wonder if the white-listing for deprecated should be
>>> conditional on -Wdeprecated being enabled.
>>>
> -  /* Merge the type qualifiers.  */
> -  if (TREE_READONLY (newdecl))
> -TREE_READONLY (olddecl) = 1;
>if (TREE_THIS_VOLATILE (newdecl))
>  TREE_THIS_VOLATILE (olddecl) = 1;
> -  if (TREE_NOTHROW (newdecl))
> -TREE_NOTHROW (olddecl) = 1;
> +
> +  if (merge_attr)
> +{
> +  /* Merge the type qualifiers.  */
> +  if (TREE_READONLY (newdecl))
> +

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-12 Thread Martin Sebor

On 02/12/2018 09:30 AM, Jason Merrill wrote:

On Fri, Feb 9, 2018 at 6:57 PM, Martin Sebor  wrote:

On 02/09/2018 12:52 PM, Jason Merrill wrote:

On 02/08/2018 04:52 PM, Martin Sebor wrote:


It took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
that's the right way to get the primary from a TEMPLATE_DECL.


Yes.


Attached is an updated patch.  It hasn't gone through full
testing yet but please let me know if you'd like me to make
some changes.




+  const char* const whitelist[] = {
+"error", "noreturn", "warning"
+  };



Why whitelist noreturn?  I would expect to want that to be consistent.


I expect noreturn to be used on a primary whose definition
is provided but that's not meant to be used the way the API
is otherwise expected to be.  As in:

   template 
   T [[noreturn]] foo () { throw "not implemented"; }

   template <> int foo();   // implemented elsewhere


Marking that template as noreturn seems pointless, and possibly harmful;
the deprecated, warning, or error attributes would be better for this
situation.


I meant either:

  template 
  T __attribute__ ((noreturn)) foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

or (sigh)

  template 
  [[noreturn]] T foo () { throw "not implemented"; }

  template <> int foo();   // implemented elsewhere

It lets code like this

  int bar ()
  {
 return foo();
  }

be diagnosed because it's likely a bug (as Clang does with
-Wunreachable-code).  It doesn't stop code like the following
from compiling (which is good) but it instead lets them throw
at runtime which is what foo's author wants.

  void bar ()
  {
 foo();
  }

It's the same as having an "unimplemented" base virtual function
throw an exception when it's called rather than making it pure
and having calls to it abort.  Declaring the base virtual function
noreturn is useful for the same reason (and also diagnosed by
Clang).  I should remember to add the same warning in GCC 9.


Yes, I understood the patterns you had in mind, but I disagree with
them.  My point about harmful is that declaring a function noreturn
because it's unimplemented could be a problem for when the function is
later implemented, and callers were optimized inappropriately.  This
seems like a rather roundabout way to get a warning about calling an
unimplemented function, and not worth overriding the normal behavior.


Removing noreturn from the whitelist means having to prevent
the attribute from causing conflicts with the attributes on
the blacklist.  E.g., in this:

  template  [[malloc]] void* allocate (int);

  template <> [[noreturn]] void* allocate (int);

-Wmissing-attributes would warn for the missing malloc but
-Wattributes will warn once malloc is added.  Ditto for all
other attributes noreturn is considered to conflict with such
as alloc_size and warn_unused_result.

I anticipate the warning code to ultimately end up in
the middle-end so it can handle Joseph's case as well, and
so it can also be integrated with the attribute conflict
machinery.  It also needs to be in the middle-end to become
usable by -Wsuggest-attribute.  But I wasn't thinking of
making any of these bigger changes until GCC 9.

Do you want me to integrate it with the conflict stuff now?

Martin


I actually had some misgivings about both warning and deprecated
for the white-listing, but not for noreturn.  My (only mild)
concern is that both warning and deprecated functions can and
likely will in some cases still be called, and so using them to
suppress the warning runs the risk that their calls might be
wrong and no one will notice.  Warning cannot be suppressed
so it seems unlikely to be ignored, but deprecated can be.
So I wonder if the white-listing for deprecated should be
conditional on -Wdeprecated being enabled.


-  /* Merge the type qualifiers.  */
-  if (TREE_READONLY (newdecl))
-TREE_READONLY (olddecl) = 1;
   if (TREE_THIS_VOLATILE (newdecl))
 TREE_THIS_VOLATILE (olddecl) = 1;
-  if (TREE_NOTHROW (newdecl))
-TREE_NOTHROW (olddecl) = 1;
+
+  if (merge_attr)
+{
+  /* Merge the type qualifiers.  */
+  if (TREE_READONLY (newdecl))
+TREE_READONLY (olddecl) = 1;
+}
+  else
+{
+  /* Set the bits that correspond to the const function
attributes.  */
+  TREE_READONLY (olddecl) = TREE_READONLY (newdecl);
+}



Let's limit the const/volatile handling here to non-functions, and
handle the const/noreturn attributes for functions in the later hunk
along with nothrow/malloc/pure.



I had to keep the TREE_HAS_VOLATILE handling as is since it
applies to functions too (has side-effects).  Otherwise the
attr-nothrow-2.C test fails.


When I mentioned "noreturn" above I was referring to
TREE_HAS_VOLATILE; sorry I wasn't clear.  For functions it should be
handled along with nothrow/readonly/malloc/pure.

Jason





Re: [PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-12 Thread Segher Boessenkool
Hi Carl,

On Mon, Feb 12, 2018 at 08:20:16AM -0800, Carl Love wrote:
> On Mon, 2018-02-12 at 09:17 -0600, Segher Boessenkool wrote:
> > > Without the powerpc64*-*-* the harness still tried to compile the
> > > test case in 32-bit mode on BE and failed.
> > > 
> > > If the dg-do target clause fails, you still get the default dg-do
> > > value, which is "compile" for most testcases.
> > > 
> > > You want to use  dg-require-effective-target int128  if the testcase
> > > cannot compile without int128.  Could you try that?
> > > 
> > > Btw, your patch is completely whitespace-damaged which makes it very
> > > hard to read (and impossible to apply).  Please fix your setup :-)

[ That is fixed now, thanks! ]

> No problem, I hadn't committed the patch yet.  Nothing seemed to be
> going right for me last Friday so I figured it best to wait for a
> better day to do the commit.
> 
> I updated the patch as requested.  The dg commands are now:
> 
> /* { dg-do run } */
> /* { dg-require-effective-target int128 } */
> /* { dg-require-effective-target vsx_hw } */
> /* { dg-options "-maltivec -mvsx" } */

Thanks, that looks good.

> 2018-02-12  Carl Love  
> 
>   * gcc.target/powerpc/builtins-4-runnable.c (main): Move int128 and

You have a tab in the middle of the line there (after "Move").

>   uint128 tests to new testfile.
>   * gcc.target/powerpc/builtins-4-int128-runnable.c: New testfile for
>   int128 and uint128 tests.
>   * gcc.target/powerpc/powerpc.exp: Add builtins-4-int128-runnable.c to
>   list of torture tests.

Okay for trunk.  Thanks!


Segher


Re: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 01:36:54PM +, Sebastian Perta wrote:
> Hi Jakub,
> 
> >>Still missing . at the end of the above line.
> 
> The sentence continues on the next line (so the "." is there):
> 
> +* config/rx/constraints.md (CALL_OP_SYMBOL_REF): Added new constraint
> +to allow or block "symbol_ref" depending on the value of TARGET_JSR.
> 
>  I think this is OK, please confirm.

You're right; I was confused by your mailer wrapping the line between
"new" and "constraint".  Sorry.

Jakub


Re: [PATCH] combine: Update links correctly for new I2 (PR84169)

2018-02-12 Thread Segher Boessenkool
On Mon, Feb 12, 2018 at 05:12:20PM +0100, Jakub Jelinek wrote:
> On Mon, Feb 12, 2018 at 03:59:05PM +, Segher Boessenkool wrote:
> > If there is a LOG_LINK between two insns, this means those two insns
> > can be combined, as far as dataflow is concerned.  There never should
> > be a LOG_LINK between two unrelated insns.  If there is one, combine
> > will try to combine the insns without doing all the needed checks if
> > the earlier destination is used before the later insn, etc.
> > 
> > Unfortunately we do not update the LOG_LINKs correctly in some cases.
> > This patch fixes at least some of those cases.
> > 
> > This fixes the PR's testcase on aarch64.  Also tested on 30+ cross
> > compilers, and on powerpc64-linux {-m32,-m64}.  Will test on x86_64
> > as well before committing.
> 
> Will you check in the testcase too?

Yes, but it is dg-do run and I don't have a native aarch64 compiler built
for it yet, so that will be a later patch.

> My preference would be something like the following, so that it can
> be torture-tested on all targets.

Thanks!


Segher


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-12 Thread Jason Merrill
On Fri, Feb 9, 2018 at 6:57 PM, Martin Sebor  wrote:
> On 02/09/2018 12:52 PM, Jason Merrill wrote:
>> On 02/08/2018 04:52 PM, Martin Sebor wrote:
>>>
>>> It took me a while to find DECL_TEMPLATE_RESULT.  Hopefully
>>> that's the right way to get the primary from a TEMPLATE_DECL.
>>
>> Yes.
>>
> Attached is an updated patch.  It hasn't gone through full
> testing yet but please let me know if you'd like me to make
> some changes.


> +  const char* const whitelist[] = {
> +"error", "noreturn", "warning"
> +  };


 Why whitelist noreturn?  I would expect to want that to be consistent.
>>>
>>> I expect noreturn to be used on a primary whose definition
>>> is provided but that's not meant to be used the way the API
>>> is otherwise expected to be.  As in:
>>>
>>>template 
>>>T [[noreturn]] foo () { throw "not implemented"; }
>>>
>>>template <> int foo();   // implemented elsewhere
>>
>> Marking that template as noreturn seems pointless, and possibly harmful;
>> the deprecated, warning, or error attributes would be better for this
>> situation.
>
> I meant either:
>
>   template 
>   T __attribute__ ((noreturn)) foo () { throw "not implemented"; }
>
>   template <> int foo();   // implemented elsewhere
>
> or (sigh)
>
>   template 
>   [[noreturn]] T foo () { throw "not implemented"; }
>
>   template <> int foo();   // implemented elsewhere
>
> It lets code like this
>
>   int bar ()
>   {
>  return foo();
>   }
>
> be diagnosed because it's likely a bug (as Clang does with
> -Wunreachable-code).  It doesn't stop code like the following
> from compiling (which is good) but it instead lets them throw
> at runtime which is what foo's author wants.
>
>   void bar ()
>   {
>  foo();
>   }
>
> It's the same as having an "unimplemented" base virtual function
> throw an exception when it's called rather than making it pure
> and having calls to it abort.  Declaring the base virtual function
> noreturn is useful for the same reason (and also diagnosed by
> Clang).  I should remember to add the same warning in GCC 9.

Yes, I understood the patterns you had in mind, but I disagree with
them.  My point about harmful is that declaring a function noreturn
because it's unimplemented could be a problem for when the function is
later implemented, and callers were optimized inappropriately.  This
seems like a rather roundabout way to get a warning about calling an
unimplemented function, and not worth overriding the normal behavior.

> I actually had some misgivings about both warning and deprecated
> for the white-listing, but not for noreturn.  My (only mild)
> concern is that both warning and deprecated functions can and
> likely will in some cases still be called, and so using them to
> suppress the warning runs the risk that their calls might be
> wrong and no one will notice.  Warning cannot be suppressed
> so it seems unlikely to be ignored, but deprecated can be.
> So I wonder if the white-listing for deprecated should be
> conditional on -Wdeprecated being enabled.
>
>>> -  /* Merge the type qualifiers.  */
>>> -  if (TREE_READONLY (newdecl))
>>> -TREE_READONLY (olddecl) = 1;
>>>if (TREE_THIS_VOLATILE (newdecl))
>>>  TREE_THIS_VOLATILE (olddecl) = 1;
>>> -  if (TREE_NOTHROW (newdecl))
>>> -TREE_NOTHROW (olddecl) = 1;
>>> +
>>> +  if (merge_attr)
>>> +{
>>> +  /* Merge the type qualifiers.  */
>>> +  if (TREE_READONLY (newdecl))
>>> +TREE_READONLY (olddecl) = 1;
>>> +}
>>> +  else
>>> +{
>>> +  /* Set the bits that correspond to the const function
>>> attributes.  */
>>> +  TREE_READONLY (olddecl) = TREE_READONLY (newdecl);
>>> +}
>>
>>
>> Let's limit the const/volatile handling here to non-functions, and
>> handle the const/noreturn attributes for functions in the later hunk
>> along with nothrow/malloc/pure.
>
>
> I had to keep the TREE_HAS_VOLATILE handling as is since it
> applies to functions too (has side-effects).  Otherwise the
> attr-nothrow-2.C test fails.

When I mentioned "noreturn" above I was referring to
TREE_HAS_VOLATILE; sorry I wasn't clear.  For functions it should be
handled along with nothrow/readonly/malloc/pure.

Jason


Re: [PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-12 Thread Carl Love
On Mon, 2018-02-12 at 09:17 -0600, Segher Boessenkool wrote:
> Hi Carl,
> 
> On Fri, Feb 09, 2018 at 02:09:06PM -0800, Carl Love wrote:
> > As pointed out, the dg arguments in the new test file were missing the
> > {target 128}.  I updated the arguments to be
> > 
> > { dg-do run { target { int128 && powerpc64*-*-* } } }
> > 
> > Without the powerpc64*-*-* the harness still tried to compile the
> > test case in 32-bit mode on BE and failed.
> 
> If the dg-do target clause fails, you still get the default dg-do
> value, which is "compile" for most testcases.
> 
> You want to use  dg-require-effective-target int128  if the testcase
> cannot compile without int128.  Could you try that?
> 
> Btw, your patch is completely whitespace-damaged which makes it very
> hard to read (and impossible to apply).  Please fix your setup :-)
> 
> 
> Segher
> 

Segher:

No problem, I hadn't committed the patch yet.  Nothing seemed to be
going right for me last Friday so I figured it best to wait for a
better day to do the commit.

I updated the patch as requested.  The dg commands are now:

/* { dg-do run } */
/* { dg-require-effective-target int128 } */
/* { dg-require-effective-target vsx_hw } */
/* { dg-options "-maltivec -mvsx" } */

#include 

When run with the command:

make -k check-gcc RUNTESTFLAGS="--target_board=unix'{-m32,-m64}' powerpc.exp=builtins-4-int128-runnable.c"

The results are the same:

# of unsupported tests          6
Running target unix/-m64
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/carll/GCC/gcc-builtin4/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /home/carll/GCC/gcc-builtin4/gcc/testsuite/gcc.target/powerpc/powerpc.exp ...

                === gcc Summary for unix/-m64 ===

# of expected passes            12

                === gcc Summary ===

# of expected passes            12
# of unsupported tests          6
/home/carll/GCC/build/gcc-builtin4/gcc/xgcc  version 8.0.1 20180209 (experimental) (GCC)

It does sound like this is more reliable in terms of the test harness.

Below is the patch.  Sorry, I forgot to set the patch to "preformatted"
last time, as I said, I wasn't having a good day Friday.  :-)

Let me know if it all looks good this time.  Thanks.

 Carl 

-

gcc/testsuite/ChangeLog:

2018-02-12  Carl Love  

* gcc.target/powerpc/builtins-4-runnable.c (main): Move int128 and
uint128 tests to new testfile.
* gcc.target/powerpc/builtins-4-int128-runnable.c: New testfile for
int128 and uint128 tests.
* gcc.target/powerpc/powerpc.exp: Add builtins-4-int128-runnable.c to
list of torture tests.
---
 .../powerpc/builtins-4-int128-runnable.c   | 109 +
 .../gcc.target/powerpc/builtins-4-runnable.c   |  84 
 gcc/testsuite/gcc.target/powerpc/powerpc.exp   |   1 +
 3 files changed, 110 insertions(+), 84 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
new file mode 100644
index 000..162e267
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
@@ -0,0 +1,109 @@
+/* { dg-do run } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-maltivec -mvsx" } */
+
+#include 
+#include  // vector
+
+#ifdef DEBUG
+#include 
+#endif
+
+void abort (void);
+
+int main() {
+  int i;
+  __uint128_t data_u128[100];
+  __int128_t data_128[100];
+
+  vector __int128_t vec_128_expected1, vec_128_result1;
+  vector __uint128_t vec_u128_expected1, vec_u128_result1;
+  signed long long zero = (signed long long) 0;
+
+  for (i = 0; i < 100; i++)
+{
+  data_128[i] = i + 1280;
+  data_u128[i] = i + 1281;
+}
+
+  /* vec_xl() tests */
+
+  vec_128_expected1 = (vector __int128_t){1280};
+  vec_128_result1 = vec_xl (zero, data_128);
+
+  if (vec_128_expected1[0] != vec_128_result1[0])
+{
+#ifdef DEBUG
+   printf("Error: vec_xl(), vec_128_result1[0] = %lld %llu; ",
+  vec_128_result1[0] >> 64,
+  vec_128_result1[0] & (__int128_t)0x);
+   printf("vec_128_expected1[0] = %lld %llu\n",
+  vec_128_expected1[0] >> 64,
+  vec_128_expected1[0] & (__int128_t)0x);
+#else
+   abort ();
+#endif
+}
+
+  vec_u128_result1 = vec_xl (zero, data_u128);
+  vec_u128_expected1 = (vector __uint128_t){1281};
+  if (vec_u128_expected1[0] != vec_u128_result1[0])
+{

[PATCH] combine: Update links correctly for new I2 (PR84169)

2018-02-12 Thread Segher Boessenkool
If there is a LOG_LINK between two insns, this means those two insns
can be combined, as far as dataflow is concerned.  There never should
be a LOG_LINK between two unrelated insns.  If there is one, combine
will try to combine the insns without doing all the needed checks if
the earlier destination is used before the later insn, etc.

Unfortunately we do not update the LOG_LINKs correctly in some cases.
This patch fixes at least some of those cases.

This fixes the PR's testcase on aarch64.  Also tested on 30+ cross
compilers, and on powerpc64-linux {-m32,-m64}.  Will test on x86_64
as well before committing.
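
To make the link update concrete, here is a toy model of it, offered only
as a sketch.  `toy_link`, `toy_user` and `toy_relink` are hypothetical
names, insns are reduced to integer ids, and the basic-block and
debug-insn checks of the real scan are left out.  What it keeps is the
core idea: the first later link that names the old setter (I3) for the
given register is redirected to the new setter (I2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One LOG_LINK-like record: "this insn uses REGNO, which was set by the
   insn with id SETTER".  Hypothetical type, not GCC's insn_link.  */
struct toy_link
{
  int setter;
  unsigned int regno;
};

/* A later insn with a small fixed array of links.  */
struct toy_user
{
  struct toy_link links[4];
  size_t n_links;
};

/* Scan USERS in program order.  The first link that names OLD_SETTER
   for REGNO is redirected to NEW_SETTER, and the scan stops.  Returns
   true if a link was updated.  */
static bool
toy_relink (struct toy_user *users, size_t n_users,
            int old_setter, int new_setter, unsigned int regno)
{
  assert (users != NULL || n_users == 0);
  for (size_t u = 0; u < n_users; u++)
    for (size_t l = 0; l < users[u].n_links; l++)
      if (users[u].links[l].setter == old_setter
          && users[u].links[l].regno == regno)
        {
          users[u].links[l].setter = new_setter;
          return true;
        }
  return false;
}
```

Matching on the register number as well as the insn is what keeps links
for other registers set by I3 untouched.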


Segher


2018-02-12  Segher Boessenkool  

PR rtl-optimization/84169
* combine.c (try_combine): New variable split_i2i3.  Set it to true if
we generated a parallel as new i3 and we split that to new i2 and i3
instructions.  Handle split_i2i3 similar to swap_i2i3: scan the
LOG_LINKs of i3 to see which of those need to link to i2 now.  Link
those to i2, not i1.  Partially rewrite this scan code.

---
 gcc/combine.c | 55 ++-
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 870bc77..204368e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2737,6 +2737,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   /* Notes that I1, I2 or I3 is a MULT operation.  */
   int have_mult = 0;
   int swap_i2i3 = 0;
+  int split_i2i3 = 0;
   int changed_i3_dest = 0;
 
   int maxreg;
@@ -4167,6 +4168,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
}
 
  insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
+
+ if (insn_code_number >= 0)
+   split_i2i3 = 1;
}
 }
 
@@ -4334,44 +4338,45 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
 
   if (swap_i2i3)
 {
-  rtx_insn *insn;
-  struct insn_link *link;
-  rtx ni2dest;
-
   /* I3 now uses what used to be its destination and which is now
 I2's destination.  This requires us to do a few adjustments.  */
   PATTERN (i3) = newpat;
   adjust_for_new_dest (i3);
+}
 
-  /* We need a LOG_LINK from I3 to I2.  But we used to have one,
-so we still will.
+  if (swap_i2i3 || split_i2i3)
+{
+  /* We might need a LOG_LINK from I3 to I2.  But then we used to
+have one, so we still will.
 
 However, some later insn might be using I2's dest and have
-a LOG_LINK pointing at I3.  We must remove this link.
-The simplest way to remove the link is to point it at I1,
-which we know will be a NOTE.  */
+a LOG_LINK pointing at I3.  We should change it to point at
+I2 instead.  */
 
   /* newi2pat is usually a SET here; however, recog_for_combine might
 have added some clobbers.  */
-  if (GET_CODE (newi2pat) == PARALLEL)
-   ni2dest = SET_DEST (XVECEXP (newi2pat, 0, 0));
-  else
-   ni2dest = SET_DEST (newi2pat);
+  rtx x = newi2pat;
+  if (GET_CODE (x) == PARALLEL)
+   x = XVECEXP (newi2pat, 0, 0);
 
-  for (insn = NEXT_INSN (i3);
-  insn && (this_basic_block->next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
-   || insn != BB_HEAD (this_basic_block->next_bb));
+  unsigned int regno = REGNO (SET_DEST (x));
+
+  bool done = false;
+  for (rtx_insn *insn = NEXT_INSN (i3);
+  !done
+  && insn
+  && NONDEBUG_INSN_P (insn)
+  && BLOCK_FOR_INSN (insn) == this_basic_block;
   insn = NEXT_INSN (insn))
{
- if (NONDEBUG_INSN_P (insn)
- && reg_referenced_p (ni2dest, PATTERN (insn)))
-   {
- FOR_EACH_LOG_LINK (link, insn)
-   if (link->insn == i3)
- link->insn = i1;
-
- break;
-   }
+ struct insn_link *link;
+ FOR_EACH_LOG_LINK (link, insn)
+   if (link->insn == i3 && link->regno == regno)
+ {
+   link->insn = i2;
+   done = true;
+   break;
+ }
}
 }
 
-- 
1.8.3.1



Fix VR_ANTI_RANGE handling in intersect_range_with_nonzero_bits (PR 84321)

2018-02-12 Thread Richard Sandiford
VR_ANTI_RANGE is basically a union of two ranges, and although
intersect_range_with_nonzero_bits had code to deal with the upper
one being empty, it didn't handle the lower one being empty.
There were also some off-by-one errors.

This patch rewrites the code in a hopefully clearer way.
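
As an illustration of the case analysis, here is a minimal sketch on
8-bit unsigned values.  It is not the tree-vrp.c code: the toy_* names
are hypothetical, the two rounding helpers are brute-force stand-ins
for wi::round_down_for_mask and wi::round_up_for_mask, and signed
ranges are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_range_type { TOY_UNDEFINED, TOY_RANGE, TOY_ANTI_RANGE };
struct toy_range { enum toy_range_type type; uint8_t min, max; };

/* Largest value <= VAL with no bits set outside MASK (0 always fits).  */
static uint8_t
round_down_for_mask (uint8_t val, uint8_t mask)
{
  while ((val & (uint8_t) ~mask) != 0)
    val--;
  return val;
}

/* Smallest value >= VAL with no bits set outside MASK; wraps to 0 if
   there is none.  */
static uint8_t
round_up_for_mask (uint8_t val, uint8_t mask)
{
  while ((val & (uint8_t) ~mask) != 0)
    val++;
  return val;
}

static struct toy_range
intersect_with_nonzero_bits (struct toy_range r, uint8_t mask)
{
  if (r.type == TOY_ANTI_RANGE)
    {
      /* ~[min, max] is the union of A: [0, min) and B: (max, 255].  */
      uint8_t a_max = round_down_for_mask ((uint8_t) (r.min - 1), mask);
      uint8_t b_min = round_up_for_mask ((uint8_t) (r.max + 1), mask);
      bool a_empty = a_max >= r.min;   /* min - 1 wrapped around */
      bool b_empty = b_min <= r.max;   /* rounding up wrapped around */

      if (a_empty && b_empty)
        return (struct toy_range) { TOY_UNDEFINED, 0, 0 };
      if (a_empty || b_empty)
        {
          assert (b_min <= a_max);
          return (struct toy_range) { TOY_RANGE, b_min, a_max };
        }

      r.min = (uint8_t) (a_max + 1);
      r.max = (uint8_t) (b_min - 1);
      /* If the excluded values never satisfy MASK, excluding them adds
         nothing: switch to the full range and refine it below.  */
      if (round_up_for_mask (r.min, mask) == b_min)
        r = (struct toy_range) { TOY_RANGE, 0, UINT8_MAX };
    }
  if (r.type == TOY_RANGE)
    {
      r.max = round_down_for_mask (r.max, mask);
      if (r.min > r.max)
        return (struct toy_range) { TOY_UNDEFINED, 0, 0 };
      r.min = round_up_for_mask (r.min, mask);
    }
  return r;
}
```

With mask 0xF0, for example, ~[20, 40] tightens to ~[17, 47], while
~[5, 250] has an empty upper part and collapses to the range [0, 0].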

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2018-02-12  Richard Sandiford  

gcc/
PR tree-optimization/84321
* tree-vrp.c (intersect_range_with_nonzero_bits): Fix VR_ANTI_RANGE
handling.  Also check whether the anti-range contains any values
that satisfy the mask; switch to a VR_RANGE if not.

gcc/testsuite/
PR tree-optimization/84321
* gcc.dg/pr84321.c: New test.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  2018-02-08 15:16:21.784407397 +
--- gcc/tree-vrp.c  2018-02-12 15:26:13.703500747 +
*** intersect_range_with_nonzero_bits (enum
*** 184,220 
   const wide_int &nonzero_bits,
   signop sgn)
  {
!   if (vr_type == VR_RANGE)
  {
!   *max = wi::round_down_for_mask (*max, nonzero_bits);
  
!   /* Check that the range contains at least one valid value.  */
!   if (wi::gt_p (*min, *max, sgn))
!   return VR_UNDEFINED;
  
!   *min = wi::round_up_for_mask (*min, nonzero_bits);
!   gcc_checking_assert (wi::le_p (*min, *max, sgn));
! }
!   if (vr_type == VR_ANTI_RANGE)
! {
!   *max = wi::round_up_for_mask (*max, nonzero_bits);
  
!   /* If the calculation wrapped, we now have a VR_RANGE whose
!lower bound is *MAX and whose upper bound is *MIN.  */
!   if (wi::gt_p (*min, *max, sgn))
{
! std::swap (*min, *max);
! *max = wi::round_down_for_mask (*max, nonzero_bits);
  gcc_checking_assert (wi::le_p (*min, *max, sgn));
  return VR_RANGE;
}
  
!   *min = wi::round_down_for_mask (*min, nonzero_bits);
gcc_checking_assert (wi::le_p (*min, *max, sgn));
  
!   /* Check whether we now have an empty set of values.  */
!   if (*min - 1 == *max)
return VR_UNDEFINED;
  }
return vr_type;
  }
--- 184,244 
   const wide_int &nonzero_bits,
   signop sgn)
  {
!   if (vr_type == VR_ANTI_RANGE)
  {
!   /* The VR_ANTI_RANGE is equivalent to the union of the ranges
!A: [-INF, *MIN) and B: (*MAX, +INF].  First use NONZERO_BITS
!to create an inclusive upper bound for A and an inclusive lower
!bound for B.  */
!   wide_int a_max = wi::round_down_for_mask (*min - 1, nonzero_bits);
!   wide_int b_min = wi::round_up_for_mask (*max + 1, nonzero_bits);
  
!   /* If the calculation of A_MAX wrapped, A is effectively empty
!and A_MAX is the highest value that satisfies NONZERO_BITS.
!Likewise if the calculation of B_MIN wrapped, B is effectively
!empty and B_MIN is the lowest value that satisfies NONZERO_BITS.  */
!   bool a_empty = wi::ge_p (a_max, *min, sgn);
!   bool b_empty = wi::le_p (b_min, *max, sgn);
  
!   /* If both A and B are empty, there are no valid values.  */
!   if (a_empty && b_empty)
!   return VR_UNDEFINED;
  
!   /* If exactly one of A or B is empty, return a VR_RANGE for the
!other one.  */
!   if (a_empty || b_empty)
{
! *min = b_min;
! *max = a_max;
  gcc_checking_assert (wi::le_p (*min, *max, sgn));
  return VR_RANGE;
}
  
!   /* Update the VR_ANTI_RANGE bounds.  */
!   *min = a_max + 1;
!   *max = b_min - 1;
gcc_checking_assert (wi::le_p (*min, *max, sgn));
  
!   /* Now check whether the excluded range includes any values that
!satisfy NONZERO_BITS.  If not, switch to a full VR_RANGE.  */
!   if (wi::round_up_for_mask (*min, nonzero_bits) == b_min)
!   {
! unsigned int precision = min->get_precision ();
! *min = wi::min_value (precision, sgn);
! *max = wi::max_value (precision, sgn);
! vr_type = VR_RANGE;
!   }
! }
!   if (vr_type == VR_RANGE)
! {
!   *max = wi::round_down_for_mask (*max, nonzero_bits);
! 
!   /* Check that the range contains at least one valid value.  */
!   if (wi::gt_p (*min, *max, sgn))
return VR_UNDEFINED;
+ 
+   *min = wi::round_up_for_mask (*min, nonzero_bits);
+   gcc_checking_assert (wi::le_p (*min, *max, sgn));
  }
return vr_type;
  }
Index: gcc/testsuite/gcc.dg/pr84321.c
===
*** /dev/null   2018-02-10 09:05:46.714416790 +
--- gcc/testsuite/gcc.dg/pr84321.c  2018-02-12 15:26:13.702500788 +
***
*** 0 
--- 1,16 
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -fwrapv" } */
+ 
+ int c;
+ 
+ void

Re: PR84300, ICE in dwarf2cfi on ppc64le

2018-02-12 Thread Segher Boessenkool
On Sat, Feb 10, 2018 at 02:09:57PM +1030, Alan Modra wrote:
> On Fri, Feb 09, 2018 at 08:11:44AM -0600, Segher Boessenkool wrote:
> > On Fri, Feb 09, 2018 at 04:12:47PM +1030, Alan Modra wrote:
> > >  ;; Use r0 to stop regrename twiddling with lr restore insns emitted
> > >  ;; after the call to __morestack.
> > >  (define_insn "split_stack_return"
> > > -  [(unspec_volatile [(use (reg:SI 0))] UNSPECV_SPLIT_STACK_RETURN)]
> > > +  [(unspec_volatile [(use (reg:SI 0)) (use (reg:SI LR_REGNO))]
> > > + UNSPECV_SPLIT_STACK_RETURN)]
> > 
> > I'm not sure what a USE as input of an UNSPEC means -- it should work
> > without the USEs?
> 
> Hmm, yes, plain [(reg:SI 0) (reg:SI LR_REGNO)] ought to work.  A sniff
> test says it's OK but I'll do the whole bootstrap/regtest cycle before
> committing.  I'm not sure why I put the USE there for r0, probably
> because I had the r0 dependency outside the unspec initially then
> decided it could replace (const_int 0) inside the unspec vector to
> save on useless RTL.  I didn't go far enough in trimming the RTL..

For R0 you can use a USE I think, but it should be in a parallel with
the unspec then.  But LR more appropriately is an input to the unspec,
it's not just there to keep the register alive for .

Having both as inputs works of course.


Segher


Re: [PATCH][AArch64][1/3] PR target/84164: Simplify subreg + redundant AND-immediate

2018-02-12 Thread Kyrill Tkachov

Hi Richard,

On 08/02/18 20:29, Richard Sandiford wrote:

Thanks for doing this.

Kyrill  Tkachov  writes:

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 2e7aa5c12952ab1a9b49b5adaf23710327e577d3..af06d7502cebac03cefc689b2646874b8397e767 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -6474,6 +6474,18 @@ simplify_subreg (machine_mode outermode, rtx op,
return NULL_RTX;
  }
  
+  /* Simplify (subreg:QI (and:SI (reg:SI) (const_int 0x)) 0)
+ into (subreg:QI (reg:SI) 0).  */
+  scalar_int_mode int_outermode, int_innermode;
+  if (!paradoxical_subreg_p (outermode, innermode)
+  && is_a <scalar_int_mode> (outermode, &int_outermode)
+  && is_a <scalar_int_mode> (innermode, &int_innermode)
+  && GET_CODE (op) == AND && CONST_INT_P (XEXP (op, 1))
+  && known_eq (subreg_lowpart_offset (outermode, innermode), byte)
+  && (~INTVAL (XEXP (op, 1)) & GET_MODE_MASK (int_outermode)) == 0
+  && validate_subreg (outermode, innermode, XEXP (op, 0), byte))
+return gen_rtx_SUBREG (outermode, XEXP (op, 0), byte);
+
/* A SUBREG resulting from a zero extension may fold to zero if
   it extracts higher bits that the ZERO_EXTEND's source bits.  */
if (GET_CODE (op) == ZERO_EXTEND && SCALAR_INT_MODE_P (innermode))

I think it'd be better to do this in simplify_truncation (shared
by the subreg code and the TRUNCATE code).  The return would then
be simplify_gen_unary (TRUNCATE, ...), which will become a subreg
if TRULY_NOOP_TRUNCATION.


Thanks, that does look cleaner.
Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu and 
x86_64-unknown-linux-gnu.
The other two patches are still needed to address the fallout.
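
To make the condition concrete, here is a small sketch on 32-bit
values.  It is only an illustration: low_bits_mask stands in for
GET_MODE_MASK of the outer mode, and the function names are
hypothetical rather than anything in simplify-rtx.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low BITS bits set (BITS in 1..32); a stand-in for
   GET_MODE_MASK of the outer mode.  */
static uint32_t
low_bits_mask (unsigned bits)
{
  assert (bits >= 1 && bits <= 32);
  return bits == 32 ? UINT32_MAX : ((UINT32_C (1) << bits) - 1);
}

/* True if truncating (x & AND_MASK) to OUTER_BITS bits gives the same
   result as truncating x itself, for every x.  That holds exactly when
   AND_MASK keeps every bit that the truncation keeps.  */
static bool
and_redundant_under_truncation_p (uint32_t and_mask, unsigned outer_bits)
{
  return (~and_mask & low_bits_mask (outer_bits)) == 0;
}

/* Truncate X to its low BITS bits.  */
static uint32_t
truncate_to (uint32_t x, unsigned bits)
{
  return x & low_bits_mask (bits);
}
```

For an 8-bit outer mode, masks 0xff and 0xffff make the AND redundant,
whereas 0x7f clears bit 7 and so has to be kept.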

Is this ok?

Thanks,
Kyrill

2018-02-12  Kyrylo Tkachov  

PR target/84164
* simplify-rtx.c (simplify_truncation): Simplify truncation of masking
operation.
* config/aarch64/aarch64.md (*aarch64_reg_3_neg_mask2):
Use simplify_gen_unary creating a SUBREG.
(*aarch64_reg_3_minus_mask): Likewise.
(*aarch64__reg_di3_mask2): Use const_int_operand predicate
for operand 3.

2018-02-12  Kyrylo Tkachov  

PR target/84164
* gcc.c-torture/compile/pr84164.c: New test.
commit 3bc951e94bec9395a732b35038dc0abf8785ba7d
Author: Kyrylo Tkachov 
Date:   Thu Feb 1 13:57:46 2018 +

[AArch64] PR target/84164: Simplify subreg + redundant AND-immediate

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 0d13c35..69ff5ca 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4278,8 +4278,10 @@ (define_insn_and_split "*aarch64_reg_3_neg_mask2"
 emit_insn (gen_negsi2 (tmp, operands[2]));
 
 rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
-rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
- SUBREG_BYTE (operands[4]));
+rtx subreg_tmp = simplify_gen_unary (TRUNCATE, GET_MODE (operands[4]),
+	  and_op, SImode);
+
+gcc_assert (subreg_tmp);
 emit_insn (gen_3 (operands[0], operands[1], subreg_tmp));
 DONE;
   }
@@ -4305,9 +4307,10 @@ (define_insn_and_split "*aarch64_reg_3_minus_mask"
 emit_insn (gen_negsi2 (tmp, operands[3]));
 
 rtx and_op = gen_rtx_AND (SImode, tmp, operands[4]);
-rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[5]), and_op,
- SUBREG_BYTE (operands[5]));
+rtx subreg_tmp = simplify_gen_unary (TRUNCATE, GET_MODE (operands[5]),
+	  and_op, SImode);
 
+gcc_assert (subreg_tmp);
 emit_insn (gen_ashl3 (operands[0], operands[1], subreg_tmp));
 DONE;
   }
@@ -4318,9 +4321,9 @@ (define_insn "*aarch64__reg_di3_mask2"
 	(SHIFT:DI
 	  (match_operand:DI 1 "register_operand" "r")
 	  (match_operator 4 "subreg_lowpart_operator"
-	   [(and:SI (match_operand:SI 2 "register_operand" "r")
-		 (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]
-  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"
+	[(and:SI (match_operand:SI 2 "register_operand" "r")
+		 (match_operand 3 "const_int_operand" "n"))])))]
+  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"
 {
   rtx xop[3];
   xop[0] = operands[0];
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 2e7aa5c..1ccfce8 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -848,6 +848,16 @@ simplify_truncation (machine_mode mode, rtx op,
 	return simplify_gen_subreg (int_mode, SUBREG_REG (op), subreg_mode, 0);
 }
 
+  /* Simplify (truncate:QI (and:SI (reg:SI) (const_int 0x)) 0)
+ into (truncate:QI (reg:SI) 0).  */
+
+  scalar_int_mode int_outermode, int_innermode;
+  if (is_a <scalar_int_mode> (mode, &int_outermode)
+  && is_a <scalar_int_mode> (op_mode, &int_innermode)
+  && GET_CODE (op) == AND && CONST_INT_P (XEXP (op, 1))
+  && (~INTVAL (XEXP (op, 1)) & GET_MODE_MASK (int_outermode)) == 0)
+return simplify_gen_unary (TRUNCATE, mode, XEXP (op, 0), op_mode);
+
   /* (truncate:A (truncate:B X)) is 

Re: [PATCH rs6000] Fix for builtins-4-runnable.c testcase FAIL on power7/BE 32-bit

2018-02-12 Thread Segher Boessenkool
Hi Carl,

On Fri, Feb 09, 2018 at 02:09:06PM -0800, Carl Love wrote:
> As pointed out, the dg arguments in the new test file were missing the
> {target 128}.  I updated the arguments to be 
> 
> { dg-do run { target { int128 && powerpc64*-*-* } } }
> 
> Without the powerpc64*-*-* the test harness still tried to compile the
> test case in 32-bit mode on BE and failed.

If the dg-do target clause fails, you still get the default dg-do value,
which is "compile" for most testcases.

You want to use  dg-require-effective-target int128  if the testcase
cannot compile without int128.  Could you try that?

Btw, your patch is completely whitespace-damaged which makes it very hard
to read (and impossible to apply).  Please fix your setup :-)


Segher


RE: PR84239, Reimplement CET intrinsics for rdssp/incssp insn

2018-02-12 Thread Tsimbalist, Igor V
> -Original Message-
> From: Sandra Loosemore [mailto:san...@codesourcery.com]
> Sent: Friday, February 9, 2018 7:42 PM
> To: Tsimbalist, Igor V ; gcc-
> patc...@gcc.gnu.org
> Cc: Uros Bizjak 
> Subject: Re: PR84239, Reimplement CET intrinsics for rdssp/incssp insn
> 
> On 02/09/2018 05:50 AM, Tsimbalist, Igor V wrote:
> > Introduce a couple of new CET intrinsics for reading and updating a
> > shadow stack pointer (_get_ssp and _inc_ssp), which are more user
> > friendly. They replace the existing _rdssp[d|q] and _incssp[d|q]
> > intrinsics. The _get_ssp intrinsic has more deterministic semantics:
> > it returns a value of the shadow stack pointer if HW is CET capable
> > and 0 otherwise.
> >
> > Ok for trunk?
> 
> Just reviewing the documentation part:
> 
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index cb9df97..9f25dd9 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -12461,6 +12461,7 @@ instructions, but allow the compiler to
> schedule those calls.
> >  * TILEPro Built-in Functions::
> >  * x86 Built-in Functions::
> >  * x86 transactional memory intrinsics::
> > +* x86 control-flow protection intrinsics::
> >  @end menu
> >
> >  @node AArch64 Built-in Functions
> > @@ -21772,13 +21773,17 @@ void __builtin_ia32_wrpkru (unsigned int)
> >  unsigned int __builtin_ia32_rdpkru ()
> >  @end smallexample
> >
> > -The following built-in functions are available when @option{-mcet} is
> used.
> > -They are used to support Intel Control-flow Enforcment Technology (CET).
> > -Each built-in function generates the  machine instruction that is part of
> the
> > -function's name.
> > +The following built-in functions are available when @option{-mcet} or
> > +@option{-mshstk} option is used.  They support shadow stack
> > +machine instructions from Intel Control-flow Enforcment Technology
> (CET).
> 
> s/Enforcment/Enforcement/
> 
> > +Each built-in function generates the  machine instruction that is part
> > +of the function's name.  These are the internal low level functions.
> 
> s/low level/low-level/
> 
> > +Normally the functions in @ref{x86 control-flow protection intrinsics}
> > +should be used instead.
> > +
> >  @smallexample
> > -unsigned int __builtin_ia32_rdsspd (unsigned int)
> > -unsigned long long __builtin_ia32_rdsspq (unsigned long long)
> > +unsigned int __builtin_ia32_rdsspd (void)
> > +unsigned long long __builtin_ia32_rdsspq (void)
> >  void __builtin_ia32_incsspd (unsigned int)
> >  void __builtin_ia32_incsspq (unsigned long long)
> >  void __builtin_ia32_saveprevssp(void);
> > @@ -21885,6 +21890,51 @@ else
> >  Note that, in most cases, the transactional and non-transactional code
> >  must synchronize together to ensure consistency.
> >
> > +@node x86 control-flow protection intrinsics
> > +@subsection x86 Control-Flow Protection Intrinsics
> > +
> > +@deftypefn {CET Function} {ret_type} _get_ssp (void)
> > +The @code{ret_type} is @code{unsigned long long} for x86-64 platform
> > +and @code{unsigned int} for x86 pltform.
> 
> I'd prefer the sentence about the return type be placed after the
> description of what the function does.  And please fix typos:
> s/x86-64 platform/64-bit targets/
> s/x86 pltform/32-bit targets/
> 
> > +Get the current value of shadow stack pointer if shadow stack support
> > +from Intel CET is enabled in the HW or @code{0} otherwise.
> 
> s/HW/hardware,/
> 
> > +@end deftypefn
> > +
> > +@deftypefn {CET Function} void _inc_ssp (unsigned int)
> > +Increment the current shadow stack pointer by the size specified by the
> > +function argument.  For security reason only unsigned byte value is used
> > +from the argument.  Therefore for the size greater than @code{255} the
> > +function should be called several times.
> 
> How about rephrasing the last two sentences:
> 
> The argument is masked to a byte value for security reasons, so to
> increment by more than 255 bytes you must call the function multiple times.
> 
> > +@end deftypefn
> > +
> > +The shadow stack unwind code looks like:
> > +
> > +@smallexample
> > +#include 
> > +
> > +/* Unwind the shadow stack for EH.  */
> > +#define _Unwind_Frames_Extra(x)\
> > +  do   \
> > +@{ \
> > +  _Unwind_Word ssp = _get_ssp ();  \
> > +  if (ssp != 0)\
> > +   @{  \
> > + _Unwind_Word tmp = (x);   \
> > + while (tmp > 255) \
> > +   @{  \
> > + _inc_ssp (tmp);   \
> > + tmp -= 255;   \
> > +   @}  \
> > + _inc_ssp (tmp);   \
> > +   @}  \
> > +@} \
> > +while (0)
> > +@end smallexample
> 
> Tabs in Texinfo input don't work well.  Please use spaces to format code
> 
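
A minimal sketch of the chunked increment, assuming (as the rephrased
text above says) that only the low byte of the argument is used.  The
names mock_ssp, mock_inc_ssp and mock_unwind_frames are hypothetical;
mock_inc_ssp merely simulates the intrinsic on a plain counter and is
not the real _inc_ssp.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated shadow stack pointer, counted in abstract units.  */
static uint64_t mock_ssp;

/* Hypothetical stand-in for _inc_ssp: only the low byte of N is used.  */
static void
mock_inc_ssp (unsigned int n)
{
  mock_ssp += n & 0xff;
}

/* Advance the simulated pointer by COUNT units, at most 255 per call.  */
static void
mock_unwind_frames (uint64_t count)
{
  while (count > 255)
    {
      mock_inc_ssp (255);
      count -= 255;
    }
  assert (count <= 255);
  mock_inc_ssp ((unsigned int) count);
}
```

The sketch passes 255 explicitly for each full chunk, since under this
masking assumption a larger argument would advance the pointer by only
its low byte.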

RE: [PATCH] RL78 new "vector" function attribute

2018-02-12 Thread Sebastian Perta
Hi DJ,

>>Looks OK to me, but wait a day or two for a docs person to comment on...
It has been 6 days with no comments so far; can I check in now?

>>if the new line is too long
There are many other lines that have the same length or are even longer,
which is why I left it as it is.

Also, based on comments from Jakub (on a different patch), I corrected the
ChangeLog entry for this patch (see below). Is this OK?

Best Regards,
Sebastian

Index: ChangeLog
===
--- ChangeLog   (revision 257588)
+++ ChangeLog   (working copy)
@@ -1,3 +1,13 @@
+2018-02-12  Sebastian Perta  
+
+   * config/rl78/rl78.c (add_vector_labels): New function.
+   * config/rl78/rl78.c (rl78_handle_vector_attribute): New function.
+   * config/rl78/rl78.c (rl78_start_function): Call add_vector_labels.
+   * config/rl78/rl78.c (rl78_handle_func_attribute): Removed the assert
+   which checks that no arguments are passed.
+   * config/rl78/rl78.c (rl78_attribute_table): Add "vector" attribute.
+   * doc/extend.texi: Documentation for the new attribute.
+
 2018-02-12  Richard Biener  
 
PR tree-optimization/84037
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 257588)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2018-02-12  Sebastian Perta  
+
+   * gcc.target/rl78/test_auto_vector.c: New test.
+
 2018-02-12  Tamar Christina  
 
PR target/82641



> -Original Message-
> From: DJ Delorie [mailto:d...@redhat.com]
> Sent: 06 February 2018 22:57
> To: Sebastian Perta 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] RL78 new "vector" function attribute
> 
> 
> Sebastian Perta  writes:
> > I've updated the patch (extend.texi) as you suggested.
> > Please let me know if this is OK to check-in, thank you!
> 
> Looks OK to me, but wait a day or two for a docs person to comment on...
> 
> > -On RX targets, you may specify one or more vector numbers as arguments
> > +On RX and RL78 targets, you may specify one or more vector numbers as
> arguments
> 
> ...if the new line is too long and if a paragraph reformat is warranted.
> 
> Thanks!



Re: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 01:27:24PM -, Sebastian Perta wrote:
> --- ChangeLog (revision 257583)
> +++ ChangeLog (working copy)
> @@ -129,8 +129,7 @@
>  
>  2018-02-09  Sebastian Perta  
>  
> - * config/rx.md: updated "movsicc" expand to be matched by GCC
> - * testsuite/gcc.target/rx/movsicc.c: new test case
> + * config/rx/rx.md (movsicc): Update expander to be matched by GCC.
>  
>  2018-02-09  Peter Bergner  
>  
> @@ -143,10 +142,10 @@
>  
>  2018-02-09  Sebastian Perta  
>  
> - * config/rx/constraints.md: added new constraint CALL_OP_SYMBOL_REF 
> - to allow or block "symbol_ref" depending on value of TARGET_JSR
> - * config/rx/rx.md: use CALL_OP_SYMBOL_REF in call_internal and 
> - call_value_internal insns
> + * config/rx/constraints.md (CALL_OP_SYMBOL_REF): Added new constraint 

Still missing . at the end of the above line.

Otherwise LGTM.

Jakub


RE: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Sebastian Perta
Hi Jakub,

>>Still missing . at the end of the above line.

The sentence continues on the next line (so the "." is there):

+* config/rx/constraints.md (CALL_OP_SYMBOL_REF): Added new constraint
+to allow or block "symbol_ref" depending on the value of TARGET_JSR.

 I think this is OK, please confirm.

Best Regards,
Sebastian

> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: 12 February 2018 13:30
> To: Sebastian Perta 
> Cc: Nick Clifton ; gcc-patches  patc...@gcc.gnu.org>
> Subject: Re: PING [PATCH] RX movsicc degrade fix
>
> On Mon, Feb 12, 2018 at 01:27:24PM -, Sebastian Perta wrote:
> > --- ChangeLog(revision 257583)
> > +++ ChangeLog(working copy)
> > @@ -129,8 +129,7 @@
> >
> >  2018-02-09  Sebastian Perta  
> >
> > -* config/rx.md: updated "movsicc" expand to be matched by GCC
> > -* testsuite/gcc.target/rx/movsicc.c: new test case
> > +* config/rx/rx.md (movsicc): Update expander to be matched by
> GCC.
> >
> >  2018-02-09  Peter Bergner  
> >
> > @@ -143,10 +142,10 @@
> >
> >  2018-02-09  Sebastian Perta  
> >
> > -* config/rx/constraints.md: added new constraint
> CALL_OP_SYMBOL_REF
> > -to allow or block "symbol_ref" depending on value of TARGET_JSR
> > -* config/rx/rx.md: use CALL_OP_SYMBOL_REF in call_internal and
> > -call_value_internal insns
> > +* config/rx/constraints.md (CALL_OP_SYMBOL_REF): Added new
> constraint
>
> Still missing . at the end of the above line.
>
> Otherwise LGTM.
>
> Jakub



Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, 
Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered 
No. 04586709.


RE: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Sebastian Perta
Hi Jakub,

I have updated the changelog entries as per your suggestion.
Is this OK? Thank you!

Best Regards,
Sebastian

Index: ChangeLog
===
--- ChangeLog   (revision 257583)
+++ ChangeLog   (working copy)
@@ -129,8 +129,7 @@
 
 2018-02-09  Sebastian Perta  
 
-   * config/rx.md: updated "movsicc" expand to be matched by GCC
-   * testsuite/gcc.target/rx/movsicc.c: new test case
+   * config/rx/rx.md (movsicc): Update expander to be matched by GCC.
 
 2018-02-09  Peter Bergner  
 
@@ -143,10 +142,10 @@
 
 2018-02-09  Sebastian Perta  
 
-   * config/rx/constraints.md: added new constraint CALL_OP_SYMBOL_REF 
-   to allow or block "symbol_ref" depending on value of TARGET_JSR
-   * config/rx/rx.md: use CALL_OP_SYMBOL_REF in call_internal and 
-   call_value_internal insns
+   * config/rx/constraints.md (CALL_OP_SYMBOL_REF): Added new constraint
+   to allow or block "symbol_ref" depending on the value of TARGET_JSR.
+   * config/rx/rx.md (call_internal): Use CALL_OP_SYMBOL_REF.
+   * config/rx/rx.md (call_value_internal): Use CALL_OP_SYMBOL_REF.
 
 2018-02-09  Pierre-Marie de Rodat  
 
@@ -1342,9 +1341,8 @@
 
 2018-01-26  Sebastian Perta  
 
-   * config/rl78/rl78.c: if operand 2 is const avoid addition with 0
-   and use incw and decw where possible
-   * testsuite/gcc.target/rl78/test_addsi3_internal.c: new file
+   * config/rl78/rl78.c (rl78_addsi3_internal): If operand 2 is const 
+   avoid addition with 0 and use incw and decw where possible.
 
 2018-01-26  Richard Biener  
 
@@ -1675,15 +1673,15 @@
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78-expand.md: New define_expand "bswaphi2"
-   * config/rl78/rl78-virt.md: New define_insn "*bswaphi2_virt"
-   * config/rl78/rl78-real.md: New define_insn "*bswaphi2_real"
+   * config/rl78/rl78-expand.md (bswaphi2): New define_expand.
+   * config/rl78/rl78-virt.md (*bswaphi2_virt): New define_insn.
+   * config/rl78/rl78-real.md (*bswaphi2_real): New define_insn.
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78-protos.h: New function declaration rl78_split_movdi
-   * config/rl78/rl78.md: New define_expand "movdi"
-   * config/rl78/rl78.c: New function definition rl78_split_movdi
+   * config/rl78/rl78-protos.h (rl78_split_movdi): New function declaration.
+   * config/rl78/rl78.md (movdi): New define_expand.
+   * config/rl78/rl78.c (rl78_split_movdi): New function.
 
 2018-01-22  Michael Meissner  
 
@@ -1706,19 +1704,19 @@
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.md: New define_expand "anddi3".
+   * config/rl78/rl78.md (anddi3): New define_expand.
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.md: New define_expand "umindi3".
+   * config/rl78/rl78.md (umindi3): New define_expand.
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.md: New define_expand "smindi3".
+   * config/rl78/rl78.md (smindi3): New define_expand.
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.md: New define_expand "smaxdi3".
+   * config/rl78/rl78.md (smaxdi3): New define_expand.
 
 2018-01-22 Carl Love 
 
@@ -1738,12 +1736,12 @@
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.md: New define_expand "umaxdi3".
+   * config/rl78/rl78.md (umaxdi3): New define_expand.
 
 2018-01-22  Sebastian Perta  
 
-   * config/rl78/rl78.c (rl78_note_reg_set): fixed dead reg check
-   for non-QImode registers
+   * config/rl78/rl78.c (rl78_note_reg_set): Fixed dead reg check
+   for non-QImode registers.
 
 2018-01-22  Richard Biener  
 
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 257583)
+++ testsuite/ChangeLog (working copy)
@@ -80,7 +80,11 @@
 
PR sanitizer/83987
* g++.dg/ubsan/pr83987-2.C: New test.
+   
+2018-02-09  Sebastian Perta  
+
+   * gcc.target/rx/movsicc.c: New test.
+
 2018-02-09  Peter Bergner  
 
PR target/83926
@@ -945,6 +949,10 @@
PR c++/83924
* g++.dg/warn/Wduplicated-branches5.C: New.
 
+2018-01-26  Sebastian Perta  
+
+   * gcc.target/rl78/test_addsi3_internal.c: New test.
+
 2018-01-26  Segher Boessenkool  
 
* gcc.target/powerpc/safe-indirect-jump-1.c: Build on all targets.


Re: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 11:06:35AM -, Sebastian Perta wrote:
> Hi Jakub,
> 
> Thank you for pointing this out, I'm sorry!
> Can I create a patch to correct the changelog entries?
> 
> Best Regards,
> Sebastian
>  
> >>1) there should be a space between * and the filename
> The spaces are there (see the changelog), the renesas mail server removes
> them sometimes

They weren't there in what I've fixed.
See 
https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/ChangeLog?r1=257536=257535=257536

Jakub


[COMMITTED][PATCH][GCC][ARM] Change baseline test from armv5t to armv5te

2018-02-12 Thread Tamar Christina
Hi All,

This patch updates the pragma_arch_switch_2.c test to use Armv5te instead of
Armv5t so that the test can run on hard-float configurations.

Regtested on arm-none-eabi (no hf, as hf trunk was broken) and with an explicit
hard-float ABI.

Committed under the GCC obvious rules.

Thanks,
Tamar

gcc/testsuite
2018-02-12  Tamar Christina  

PR target/82641
* gcc.target/arm/pragma_arch_switch_2.c: Use armv5te.

-- 
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
index 7f297557d555fd139a3b804d354117239a78ae62..b6211f94c377e3b1c0ba6c24203434ca8b550c3b 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
@@ -2,7 +2,7 @@
 /* { dg-skip-if "instruction not valid on thumb" { *-*-* } { "-mthumb" } { "" } } */
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_arm_ok } */
-/* { dg-additional-options "-Wall -O2 -march=armv5t -std=gnu99 -marm" } */
+/* { dg-additional-options "-Wall -O2 -march=armv5te -std=gnu99 -marm" } */
 
 #pragma GCC target ("arch=armv6")
 int test_assembly (int hi, int lo)



Re: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Oleg Endo
On Mon, 2018-02-12 at 11:06 +, Sebastian Perta wrote:


> > > 1) there should be a space between * and the filename
> The spaces are there (see the changelog), the renesas mail server
> removes them sometimes

You might want to send around your patches as email attachments.  That
avoids formatting issues.

Cheers,
Oleg


[PATCH] More PR84037 fixing

2018-02-12 Thread Richard Biener

The following fixes two issues I found while investigating the costing
of vectorization for capacita.  First, we fail to CSE between SLP
instances, which is easy to fix.  Second, we are double-counting hybrid
SLP stmts.  Fixing both leads to SLP vectorization being profitable
for AVX256 (but still not AVX128).
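
As a toy illustration of the first issue only, the sketch below costs
two instances that share a node, once with a visited set per instance
and once with a single set for all instances.  Everything named toy_*
is hypothetical; it is not the tree-vect-slp.c code and does not model
the hybrid SLP fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 16

/* An SLP instance reduced to the ids of the nodes it covers.  */
struct toy_instance { const int *nodes; size_t n; };

/* Sum COST[id] over the nodes of INST that are not yet marked in
   VISITED, marking them as we go.  */
static int
toy_instance_cost (const struct toy_instance *inst, const int *cost,
                   bool *visited)
{
  int sum = 0;
  for (size_t i = 0; i < inst->n; i++)
    {
      int id = inst->nodes[i];
      assert (id >= 0 && id < TOY_MAX_NODES);
      if (!visited[id])
        {
          visited[id] = true;
          sum += cost[id];
        }
    }
  return sum;
}

/* Total cost over all instances.  With SHARE_VISITED the set is set up
   once for all instances; otherwise it is reset for each instance, so
   a node shared by two instances is counted twice.  */
static int
toy_total_cost (const struct toy_instance *insts, size_t n,
                const int *cost, bool share_visited)
{
  bool visited[TOY_MAX_NODES] = { false };
  int total = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (!share_visited)
        for (size_t k = 0; k < TOY_MAX_NODES; k++)
          visited[k] = false;
      total += toy_instance_cost (&insts[i], cost, visited);
    }
  return total;
}
```

With node costs 4, 3 and 5 and instances {0, 1} and {1, 2}, the
per-instance sets give 15 while the shared set gives 12, because node 1
is then costed only once.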

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.  Checked
SPEC 2k6 for build and test (with and without LTO).

Richard.

2018-02-12  Richard Biener  

PR tree-optimization/84037
* tree-vect-slp.c (vect_analyze_slp_cost): Add visited
parameter, move visited init to caller.
(vect_slp_analyze_operations): Separate cost from validity
check, initialize visited once for all instances.
(vect_schedule_slp): Analyze map to CSE vectorized nodes once
for all instances.
* tree-vect-stmts.c (vect_model_simple_cost): Make early
out an assert.
(vect_model_promotion_demotion_cost): Likewise.
(vectorizable_bswap): Guard cost modeling with !slp_node
instead of !PURE_SLP_STMT to avoid double-counting on hybrid
SLP stmts.
(vectorizable_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_comparison): Likewise.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 257581)
+++ gcc/tree-vect-slp.c (working copy)
@@ -2003,17 +2003,13 @@ vect_analyze_slp_cost_1 (slp_instance in
 /* Compute the cost for the SLP instance INSTANCE.  */
 
 static void
-vect_analyze_slp_cost (slp_instance instance, void *data)
+vect_analyze_slp_cost (slp_instance instance, void *data, scalar_stmts_set_t *visited)
 {
   stmt_vector_for_cost body_cost_vec, prologue_cost_vec;
   unsigned ncopies_for_cost;
   stmt_info_for_cost *si;
   unsigned i;
 
-  if (dump_enabled_p ())
-dump_printf_loc (MSG_NOTE, vect_location,
-"=== vect_analyze_slp_cost ===\n");
-
   /* Calculate the number of vector stmts to create based on the unrolling
  factor (number of vectors is 1 if NUNITS >= GROUP_SIZE, and is
  GROUP_SIZE / NUNITS otherwise.  */
@@ -2050,11 +2046,9 @@ vect_analyze_slp_cost (slp_instance inst
 
   prologue_cost_vec.create (10);
   body_cost_vec.create (10);
-  scalar_stmts_set_t *visited = new scalar_stmts_set_t ();
   vect_analyze_slp_cost_1 (instance, SLP_INSTANCE_TREE (instance),
   _cost_vec, _cost_vec,
   ncopies_for_cost, visited);
-  delete visited;
 
   /* Record the prologue costs, which were delayed until we were
  sure that SLP was successful.  */
@@ -2871,13 +2865,19 @@ vect_slp_analyze_operations (vec_info *v
   vinfo->slp_instances.ordered_remove (i);
}
   else
-   {
- /* Compute the costs of the SLP instance.  */
- vect_analyze_slp_cost (instance, vinfo->target_cost_data);
- i++;
-   }
+   i++;
 }
 
+  if (dump_enabled_p ())
+dump_printf_loc (MSG_NOTE, vect_location,
+"=== vect_analyze_slp_cost ===\n");
+
+  /* Compute the costs of the SLP instances.  */
+  scalar_stmts_set_t *visited = new scalar_stmts_set_t ();
+  for (i = 0; vinfo->slp_instances.iterate (i, ); ++i)
+vect_analyze_slp_cost (instance, vinfo->target_cost_data, visited);
+  delete visited;
+
   return !vinfo->slp_instances.is_empty ();
 }
 
@@ -4246,19 +4246,20 @@ vect_schedule_slp (vec_info *vinfo)
   unsigned int i;
   bool is_store = false;
 
+
+  scalar_stmts_to_slp_tree_map_t *bst_map
+= new scalar_stmts_to_slp_tree_map_t ();
   slp_instances = vinfo->slp_instances;
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
 {
   /* Schedule the tree of INSTANCE.  */
-  scalar_stmts_to_slp_tree_map_t *bst_map
-   = new scalar_stmts_to_slp_tree_map_t ();
   is_store = vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
  instance, bst_map);
-  delete bst_map;
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
  "vectorizing stmts using SLP.\n");
 }
+  delete bst_map;
 
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
 {
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 257581)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -826,8 +826,7 @@ vect_model_simple_cost (stmt_vec_info st
   int inside_cost = 0, prologue_cost = 0;
 
   /* The SLP costs were already calculated during SLP tree build.  */
-  if (PURE_SLP_STMT (stmt_info))
-return;
+  gcc_assert (!PURE_SLP_STMT 

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 01:02:20PM +0100, Paolo Bonzini wrote:
> On 12/02/2018 09:56, Richard Biener wrote:
> >>> I think it does: for both ASAN_CHECK and ASAN_MARK the pointer argument
> >>> is the second one; the first one is an integer argument with flags.
> >>> And ASAN_MARK, both poison and unpoison, works kind of like a clobber
> >>> on the referenced variable: before unpoison it is generally
> >>> inaccessible, and after poison too.
> >> Ah, indeed.
> > Which was an approval as well, in case you want to push this right now.
> 
> Oh cool.  I was going to look at ubsan builtins too, I'll post that
> separately.  Ok for GCC 7 too?

Please wait with the GCC 7 backport until at least 2 weeks after it has been
committed to trunk.

Jakub


[PATCH] Add limit for maximal alignment options (PR c/84310).

2018-02-12 Thread Martin Liška
Hi.

The following patch fixes 2 issues with -falign-*:
1) with -malign-x=16 (or the corresponding -falign-* value) an ICE appeared,
because the code in final.c can only deal with a limited alignment.
2) I therefore also documented and limited the maximum value of the
-falign-* options.

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
The i386.exp test-suite works fine on an x86_64 machine.
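
To illustrate the two pieces outside of GCC, here is a minimal sketch of the
limit check against 1 << MAX_CODE_ALIGN, and of why a table indexed by the
log2 of the alignment needs MAX_CODE_ALIGN + 1 slots.  The helper names
align_value_ok and align_log2 are made up for illustration; only the two
macros come from the patch.

```c
#include <stdbool.h>

/* Same values as in the patch.  */
#define MAX_CODE_ALIGN 16
#define MAX_CODE_ALIGN_VALUE (1 << MAX_CODE_ALIGN)

/* Hypothetical helper: true if an -falign-* value passes the new
   limit check.  */
static bool
align_value_ok (int value)
{
  return value <= MAX_CODE_ALIGN_VALUE;
}

/* Hypothetical helper: smallest LOG such that (1 << LOG) >= VALUE, i.e.
   the index a table keyed by log2 alignment would be accessed with.  */
static int
align_log2 (int value)
{
  int log = 0;
  while ((1 << log) < value)
    log++;
  return log;
}
```

As I read the final.c hunk, the largest accepted value, 65536, maps to index
16, which would be one past the end of an array declared with MAX_CODE_ALIGN
elements, hence the + 1.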

Ready to be installed?
Martin

gcc/ChangeLog:

2018-02-09  Martin Liska  

PR c/84310
PR target/79747
* final.c (shorten_branches): Build align_tab array with one
more element.
* opts.c (finish_options): Add alignment option limit check.
(MAX_CODE_ALIGN): Likewise.
(MAX_CODE_ALIGN_VALUE): Likewise.
* doc/invoke.texi: Document maximum allowed option value for
all -falign-* options.

gcc/testsuite/ChangeLog:

2018-02-12  Martin Liska  

PR c/84310
PR target/79747
* gcc.target/i386/pr84310.c: New test.
* gcc.target/i386/pr84310-2.c: Likewise.
---
 gcc/doc/invoke.texi   |  4 
 gcc/final.c   |  4 ++--
 gcc/opts.c| 20 
 gcc/testsuite/gcc.target/i386/pr84310-2.c | 10 ++
 gcc/testsuite/gcc.target/i386/pr84310.c   |  8 
 5 files changed, 44 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr84310-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr84310.c


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index df357bea7dc..edfa9d5ada1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9096,6 +9096,7 @@ Some assemblers only support this flag when @var{n} is a power of two;
 in that case, it is rounded up.
 
 If @var{n} is not specified or is zero, use a machine-dependent default.
+The maximum allowed @var{n} option value is 65536.
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
@@ -9121,6 +9122,7 @@ are greater than this value, then their values are used instead.
 
 If @var{n} is not specified or is zero, use a machine-dependent default
 which is very likely to be @samp{1}, meaning no alignment.
+The maximum allowed @var{n} option value is 65536.
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
@@ -9134,6 +9136,7 @@ operations.
 
 @option{-fno-align-loops} and @option{-falign-loops=1} are
 equivalent and mean that loops are not aligned.
+The maximum allowed @var{n} option value is 65536.
 
 If @var{n} is not specified or is zero, use a machine-dependent default.
 
@@ -9151,6 +9154,7 @@ need be executed.
 equivalent and mean that loops are not aligned.
 
 If @var{n} is not specified or is zero, use a machine-dependent default.
+The maximum allowed @var{n} option value is 65536.
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
diff --git a/gcc/final.c b/gcc/final.c
index 99a7cadd7c9..933c613cabf 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -911,7 +911,7 @@ shorten_branches (rtx_insn *first)
   char *varying_length;
   rtx body;
   int uid;
-  rtx align_tab[MAX_CODE_ALIGN];
+  rtx align_tab[MAX_CODE_ALIGN + 1];
 
   /* Compute maximum UID and allocate label_align / uid_shuid.  */
   max_uid = get_max_uid ();
@@ -1016,7 +1016,7 @@ shorten_branches (rtx_insn *first)
  alignment of n.  */
   uid_align = XCNEWVEC (rtx, max_uid);
 
-  for (i = MAX_CODE_ALIGN; --i >= 0;)
+  for (i = MAX_CODE_ALIGN + 1; --i >= 0;)
 align_tab[i] = NULL_RTX;
   seq = get_last_insn ();
   for (; seq; seq = PREV_INSN (seq))
diff --git a/gcc/opts.c b/gcc/opts.c
index f2795f98bf4..33efcc0d6e7 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1039,6 +1039,26 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   if ((opts->x_flag_sanitize & SANITIZE_KERNEL_ADDRESS) && opts->x_flag_tm)
 sorry ("transactional memory is not supported with "
 	   "%<-fsanitize=kernel-address%>");
+
+  /* Comes from final.c -- no real reason to change it.  */
+#define MAX_CODE_ALIGN 16
+#define MAX_CODE_ALIGN_VALUE (1 << MAX_CODE_ALIGN)
+
+  if (opts->x_align_loops > MAX_CODE_ALIGN_VALUE)
+error_at (loc, "-falign-loops=%d is not between 0 and %d",
+	  opts->x_align_loops, MAX_CODE_ALIGN_VALUE);
+
+  if (opts->x_align_jumps > MAX_CODE_ALIGN_VALUE)
+error_at (loc, "-falign-jumps=%d is not between 0 and %d",
+	  opts->x_align_jumps, MAX_CODE_ALIGN_VALUE);
+
+  if (opts->x_align_functions > MAX_CODE_ALIGN_VALUE)
+error_at (loc, "-falign-functions=%d is not between 0 and %d",
+	  opts->x_align_functions, MAX_CODE_ALIGN_VALUE);
+
+  if (opts->x_align_labels > MAX_CODE_ALIGN_VALUE)
+error_at (loc, "-falign-labels=%d is not between 0 and %d",
+	  opts->x_align_labels, MAX_CODE_ALIGN_VALUE);
 }
 
 #define LEFT_COLUMN	27
diff --git a/gcc/testsuite/gcc.target/i386/pr84310-2.c b/gcc/testsuite/gcc.target/i386/pr84310-2.c
new file mode 100644
index 000..e39a421e8d2
--- /dev/null
+++ 

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-12 Thread Paolo Bonzini
On 12/02/2018 09:56, Richard Biener wrote:
>>> I think it does: for both ASAN_CHECK and ASAN_MARK the pointer argument
>>> is the second one; the first one is an integer argument with flags.
>>> And ASAN_MARK, both poison and unpoison, works kind of like a clobber
>>> on the referenced variable: before unpoison it is generally
>>> inaccessible, and after poison too.
>> Ah, indeed.
> Which was an approval as well, in case you want to push this right now.

Oh cool.  I was going to look at ubsan builtins too, I'll post that
separately.  Ok for GCC 7 too?

Thanks,

Paolo


Re: Mising Patch #2 from the RISC-V v3 Submission

2018-02-12 Thread Andreas Schwab
On Feb 06 2017, Palmer Dabbelt  wrote:

> +/* Because RISC-V only has word-sized atomics, it requries libatomic where
> +   others do not.  So link libatomic by default, as needed.  */
> +#undef LIB_SPEC
> +#ifdef LD_AS_NEEDED_OPTION
> +#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC \
> +  " %{pthread:" LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION "}"
> +#else
> +#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC " -latomic "
> +#endif

Why is -latomic added only with -pthread if --as-needed is supported,
but unconditionally if not?  Wouldn't it make sense to add it
unconditionally in both cases?
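
For reference, a sketch of what the two variants expand to after string
literal concatenation.  The three macro values below are stand-ins assumed
for illustration only; they are not copied from the GCC headers.

```c
/* Stand-in values, assumed for illustration only.  */
#define GNU_USER_TARGET_LIB_SPEC "<GNU_USER_TARGET_LIB_SPEC>"
#define LD_AS_NEEDED_OPTION "--as-needed"
#define LD_NO_AS_NEEDED_OPTION "--no-as-needed"

/* Variant from the quoted patch when LD_AS_NEEDED_OPTION is defined:
   -latomic is only added inside the %{pthread:...} spec.  */
static const char lib_spec_as_needed[] = GNU_USER_TARGET_LIB_SPEC
  " %{pthread:" LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION "}";

/* Variant from the quoted patch otherwise: -latomic is always added.  */
static const char lib_spec_plain[] = GNU_USER_TARGET_LIB_SPEC " -latomic ";
```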

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 11:06:35AM -, Sebastian Perta wrote:
> Thank you for pointing this out, I'm sorry!
> Can I create a patch to correct the changelog entries?

Yes, and no need to add a ChangeLog entry for ChangeLog changes ;)

Jakub


[PATCH] RISC-V: define _REENTRANT with -pthread

2018-02-12 Thread Andreas Schwab
This is expected by the AX_PTHREAD autoconf macro from
.

* config/riscv/linux.h (CPP_SPEC): Define.

diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
index 1da1b0a74e..ad03654e8d 100644
--- a/gcc/config/riscv/linux.h
+++ b/gcc/config/riscv/linux.h
@@ -47,6 +47,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #define ICACHE_FLUSH_FUNC "__riscv_flush_icache"
 
+#define CPP_SPEC "%{pthread:-D_REENTRANT}"
+
 #define LINK_SPEC "\
 -melf" XLEN_SPEC "lriscv \
 %{shared} \
-- 
2.16.1


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


RE: PING [PATCH] RX movsicc degrade fix

2018-02-12 Thread Sebastian Perta
Hi Jakub,

Thank you for pointing this out, I'm sorry!
Can I create a patch to correct the changelog entries?

Best Regards,
Sebastian
 
>>1) there should be a space between * and the filename
The spaces are there (see the changelog); the Renesas mail server sometimes
removes them.

> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: 09 February 2018 18:24
> To: Sebastian Perta ; Nick Clifton
> 
> Cc: gcc-patches 
> Subject: Re: PING [PATCH] RX movsicc degrade fix
> 
> On Wed, Feb 07, 2018 at 02:10:21PM +, Nick Clifton wrote:
> > Hi Sebastian,
> >
> >   Sorry for missing this one.  If it helps in the future, feel free to
> > ping me directly.
> >
> > > +2018-01-09  Sebastian Perta  
> > > +
> > > + *config/rx.md: updated "movsicc" expand to be matched by GCC
> > > + *testsuite/gcc.target/rx/movsicc.c: new test case
> >
> > Approved - please apply.
> 
> Note the ChangeLog is incorrect:
> 1) there should be a space between * and the filename
> 2) testsuite/ has its own ChangeLog, so changes for testsuite/ should
>go there and filenames be relative to the testsuite/ directory
> 3) there is no config/rx.md file, you've changed config/rx/rx.md instead
> 4) the format is * filename (what): Description. , so it should be
>   * config/rx/rx.md (movsicc): Update expander to be matched by
> GCC.
> 5) note capital letter after : and full stop at the end.
>   * gcc.target/rx/movsicc.c: New test.
>goes into testsuite/ChangeLog
> 
> Many other of your ChangeLog entries suffer from similar issues.
> 
>   Jakub



[patch, fortran] Fix handling of assumed-size arrays in inline matmul

2018-02-12 Thread Thomas Koenig

Hello world,

the attached patch fixes a rejects-valid regression: valid code using an
assumed-size array in an inline matmul was rejected with an error.

OK for the affected branches, trunk and gcc-7?

Regards

Thomas

2018-02-12  Thomas Koenig  

PR fortran/84270
* frontend-passes.c (scalarized_expr): If the expression
is an assumed-size array, leave in the last reference
and pass AR_SECTION instead of AR_FULL to gfc_resolve
in order to avoid an error.

2018-02-12  Thomas Koenig  

PR fortran/84270
* gfortran.dg/inline_matmul_22.f90: New test.
Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 257347)
+++ frontend-passes.c	(Arbeitskopie)
@@ -3567,11 +3567,27 @@ scalarized_expr (gfc_expr *e_in, gfc_expr **index,
 			 is the lbound of a full ref.  */
 		  int j;
 		  gfc_array_ref *ar;
+		  int to;
 
 		  ar = >u.ar;
-		  ar->type = AR_FULL;
-		  for (j = 0; j < ar->dimen; j++)
+
+		  /* For assumed size, we need to keep around the final
+			 reference in order not to get an error on resolution
+			 below, and we cannot use AR_FULL.  */
+			 
+		  if (ar->as->type == AS_ASSUMED_SIZE)
 			{
+			  ar->type = AR_SECTION;
+			  to = ar->dimen - 1;
+			}
+		  else
+			{
+			  to = ar->dimen;
+			  ar->type = AR_FULL;
+			}
+
+		  for (j = 0; j < to; j++)
+			{
 			  gfc_free_expr (ar->start[j]);
 			  ar->start[j] = NULL;
 			  gfc_free_expr (ar->end[j]);
! { dg-do compile }
! { dg-additional-options "-ffrontend-optimize" }
! PR 84270 - this used to be rejected.
! Test case by Michael Weinert

module fp_precision

   integer, parameter   :: fp = selected_real_kind(13)

end module fp_precision

  subroutine lhcal(nrot,orth,ngpts,vgauss,vr_0)

  use fp_precision  ! floating point precision

  implicit none

!--->rotation matrices and rotations (input)
  integer,  intent(in)  :: nrot
! real(kind=fp),intent(in)  :: orth(3,3,nrot)  ! fine at all -O
  real(kind=fp),intent(in)  :: orth(3,3,*)

!--->gaussian integration points
  integer,  intent(in)  :: ngpts
  real(kind=fp),intent(in)  :: vgauss(3,*)

!--->output results
  real(kind=fp),intent(out) :: vr_0(3)

  real(kind=fp) :: v(3),vr(3)
  integer   :: n,nn

  vr_0 = 0
  do nn=1,ngpts
 v(:) = vgauss(:,nn)
!--->apply rotations
 do n=2,nrot
vr = matmul( orth(:,:,n), v )
vr_0 = vr_0 + vr
 enddo
  enddo

  return
  end subroutine lhcal


Add a DECL_EXPR for VLA pointer casts (PR 84305)

2018-02-12 Thread Richard Sandiford
This PR was about a case in which we ended up with a MULT_EXPR
that was shared between an ungimplified VLA type and a pointer
calculation.  The SSA names used in the pointer calculation were
later freed, but they were still there in the VLA, and caused an
ICE when remapping the types during inlining.

The fix is to add a DECL_EXPR that forces the VLA type sizes to be
gimplified too, but the tricky part is deciding where.  As the comment
in grokdeclarator says, we can't just add it to the statement list,
since the size might only be conditionally evaluated.  It might also
end up being evaluated out of sequence.

The patch gets around that by putting the DECL_EXPR in a BIND_EXPR
and adding the BIND_EXPR to the list of things that need to be
evaluated for the declarator.  This means that the TYPE_NAME is
used outside of its BIND_EXPR though.  Is that a problem?
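
For illustration only (this is a made-up function, not the testcase from the
PR), the kind of construct involved is a cast to a pointer to an anonymous
VLA type, i.e. a variably modified type named in a TYPENAME context:

```c
#include <stddef.h>

/* Hypothetical example: the cast names an anonymous, variably modified
   type (pointer to int[n]) whose size expression depends on N.  */
static size_t
row_bytes (void *p, int n)
{
  int (*rows)[n] = (int (*)[n]) p;
  /* The size of the pointed-to array type is only known at run time:
     it is n * sizeof (int).  */
  return sizeof (*rows);
}
```

The size expression of int[n] has to be evaluated at some well-defined point
for this to work, which is the job of the DECL_EXPR discussed above.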

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2018-02-11  Richard Sandiford  

gcc/c/
PR c/84305
* c-decl.c (grokdeclarator): Create an anonymous TYPE_DECL
in PARM and TYPENAME contexts too, but attach it to a BIND_EXPR
and include the BIND_EXPR in the list of things that need to be
pre-evaluated.

gcc/testsuite/
PR c/84305
* gcc.c-torture/compile/pr84305.c: New test.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  2018-02-12 10:10:10.802846717 +
+++ gcc/c/c-decl.c  2018-02-12 10:10:10.955833751 +
@@ -6479,28 +6479,53 @@ grokdeclarator (const struct c_declarato
   type has a name/declaration of it's own, but special attention
   is required if the type is anonymous.
 
-  We handle the NORMAL and FIELD contexts here by attaching an
-  artificial TYPE_DECL to such pointed-to type.  This forces the
-  sizes evaluation at a safe point and ensures it is not deferred
-  until e.g. within a deeper conditional context.
+  We attach an artificial TYPE_DECL to such pointed-to type
+  and arrange for it to be included in a DECL_EXPR.  This
+  forces the sizes evaluation at a safe point and ensures it
+  is not deferred until e.g. within a deeper conditional context.
 
-  We expect nothing to be needed here for PARM or TYPENAME.
-  Pushing a TYPE_DECL at this point for TYPENAME would actually
-  be incorrect, as we might be in the middle of an expression
-  with side effects on the pointed-to type size "arguments" prior
-  to the pointer declaration point and the fake TYPE_DECL in the
-  enclosing context would force the size evaluation prior to the
-  side effects.  */
+  PARM contexts have no enclosing statement list that
+  can hold the DECL_EXPR, so we need to use a BIND_EXPR
+  instead, and add it to the list of expressions that
+  need to be evaluated.
 
+  TYPENAME contexts do have an enclosing statement list,
+  but it would be incorrect to use it, as the size should
+  only be evaluated if the containing expression is
+  evaluated.  We might also be in the middle of an
+  expression with side effects on the pointed-to type size
+  "arguments" prior to the pointer declaration point and
+  the fake TYPE_DECL in the enclosing context would force
+  the size evaluation prior to the side effects.  We therefore
+  use BIND_EXPRs in TYPENAME contexts too.  */
if (!TYPE_NAME (type)
-   && (decl_context == NORMAL || decl_context == FIELD)
&& variably_modified_type_p (type, NULL_TREE))
  {
+   tree bind = NULL_TREE;
+   if (decl_context == TYPENAME || decl_context == PARM)
+ {
+   bind = build3 (BIND_EXPR, void_type_node, NULL_TREE,
+  NULL_TREE, NULL_TREE);
+   TREE_SIDE_EFFECTS (bind) = 1;
+   BIND_EXPR_BODY (bind) = push_stmt_list ();
+   push_scope ();
+ }
tree decl = build_decl (loc, TYPE_DECL, NULL_TREE, type);
DECL_ARTIFICIAL (decl) = 1;
pushdecl (decl);
finish_decl (decl, loc, NULL_TREE, NULL_TREE, NULL_TREE);
TYPE_NAME (type) = decl;
+   if (bind)
+ {
+   pop_scope ();
+   BIND_EXPR_BODY (bind)
+ = pop_stmt_list (BIND_EXPR_BODY (bind));
+   if (*expr)
+ *expr = build2 (COMPOUND_EXPR, void_type_node, *expr,
+ bind);
+   else
+ *expr = bind;
+ 

Re: [SFN+LVU+IEPM v4 7/9] [LVU] Introduce location views

2018-02-12 Thread Andreas Schwab
On Feb 12 2018, Alexandre Oliva  wrote:

> On Feb 11, 2018, Andreas Schwab  wrote:
>
>> On Feb 09 2018, Alexandre Oliva  wrote:
>
>>> +  if (list_head->vl_symbol && dwarf2out_locviews_in_attribute ())
>>> +{
>>> +  ASM_OUTPUT_LABEL (asm_out_file, list_head->vl_symbol);
>
>> That needs to use ASM_OUTPUT_DEBUG_LABEL.
>
> Note this is always output in the .debug_loclist section, not in code
> sections, so I don't get why it should matter.  Care to clarify, please?

Perhaps I'm misunderstanding it, but I see .LM labels emitted in the
middle of code bundles, which breaks them apart.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH 1/3] Remove support for obsolete x86 -malign-foo options

2018-02-12 Thread Martin Liška
On 05/06/2017 09:20 AM, Uros Bizjak wrote:
> On Tue, Apr 18, 2017 at 8:30 PM, Denys Vlasenko  wrote:
>> 2017-04-18  Denys Vlasenko  
>>
>> * config/i386/i386-common.c (ix86_handle_option): Remove support
>> for obsolete -malign-loops, -malign-jumps and -malign-functions
>> options.
>> * config/i386/i386.opt: Likewise.
>> Index: gcc/common/config/i386/i386-common.c
>> ===
>> --- gcc/common/config/i386/i386-common.c(revision 240663)
>> +++ gcc/common/config/i386/i386-common.c(working copy)
>> @@ -998,38 +998,6 @@ ix86_handle_option (struct gcc_options *opts,
>> }
>>return true;
>>
>> -
>> -  /* Comes from final.c -- no real reason to change it.  */
>> -#define MAX_CODE_ALIGN 16
>> -
>> -case OPT_malign_loops_:
>> -  warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
>> -  if (value > MAX_CODE_ALIGN)
>> -   error_at (loc, "-malign-loops=%d is not between 0 and %d",
>> - value, MAX_CODE_ALIGN);
>> -  else
>> -   opts->x_align_loops = 1 << value;
>> -  return true;
>> -
>> -case OPT_malign_jumps_:
>> -  warning_at (loc, 0, "-malign-jumps is obsolete, use -falign-jumps");
>> -  if (value > MAX_CODE_ALIGN)
>> -   error_at (loc, "-malign-jumps=%d is not between 0 and %d",
>> - value, MAX_CODE_ALIGN);
>> -  else
>> -   opts->x_align_jumps = 1 << value;
>> -  return true;
>> -
>> -case OPT_malign_functions_:
>> -  warning_at (loc, 0,
>> - "-malign-functions is obsolete, use -falign-functions");
>> -  if (value > MAX_CODE_ALIGN)
>> -   error_at (loc, "-malign-functions=%d is not between 0 and %d",
>> - value, MAX_CODE_ALIGN);
>> -  else
>> -   opts->x_align_functions = 1 << value;
>> -  return true;
>> -
>>  case OPT_mbranch_cost_:
>>if (value > 5)
>> {
>> Index: gcc/config/i386/i386.opt
>> ===
>> --- gcc/config/i386/i386.opt(revision 240663)
>> +++ gcc/config/i386/i386.opt(working copy)
>> @@ -205,18 +205,6 @@ malign-double
>>  Target Report Mask(ALIGN_DOUBLE) Save
>>  Align some doubles on dword boundary.
>>
>> -malign-functions=
>> -Target RejectNegative Joined UInteger
>> -Function starts are aligned to this power of 2.
>> -
>> -malign-jumps=
>> -Target RejectNegative Joined UInteger
>> -Jump targets are aligned to this power of 2.
>> -
>> -malign-loops=
>> -Target RejectNegative Joined UInteger
>> -Loop code aligned to this power of 2.
>> -
>>  malign-stringops
>>  Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, 
>> ALIGN_STRINGOPS) Save
>>  Align destination of the string operations.
> 
> Instead of removing the above definitions, please rather redefine them
> in a similar way -mcpu in i386.opt is obsoleted, e.g.:
> 
> malign-functions=
> Target RejectNegative Joined Undocumented Alias(falign-functions=)
> Warn(%<-malign-functions%> is obsolete, use %<-falign-functions%>)

Please correct me if I'm wrong, but doing the alias is not that simple, as the
value of the -malign-functions option is a power of 2, while
-falign-functions= takes an absolute value.
Thus -malign-functions=5 == -falign-functions=32.
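
To spell out the conversion, a small sketch (the function name is made up;
the 1 << value mapping is the one used by the removed ix86_handle_option
code quoted above):

```c
/* Hypothetical helper: convert the argument of a legacy -malign-*=N
   option (a power of 2) to the absolute value that -falign-*= expects.  */
static int
malign_to_falign (int log2_value)
{
  return 1 << log2_value;
}
```

A plain Alias() in the .opt file would pass the number through unchanged,
which is why the two options cannot simply be aliased.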

I believe the legacy options are not a problem for the patch series, as it
only sets the value of the -falign-functions option.

Martin

> 
> This cleanup should be done a long time ago, the patch can be
> committed independently of other patches in the series.
> 
> Uros.
> 



RE: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa

2018-02-12 Thread Koval, Julia
Hi,

There is no PR for this. This builtin was just missing for all the new CPUs.

Thanks,
Julia

> -Original Message-
> From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com]
> Sent: Monday, February 12, 2018 7:19 AM
> To: Koval, Julia 
> Cc: 'GCC Patches' 
> Subject: Re: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa
> 
> Hello Julia.
> 
> On 15 Jan 08:28, Koval, Julia wrote:
> > Hi,
> > This patch fixes subj. Ok for trunk?
> >
> > gcc/
> > * config/i386/i386.c (F_AVX512VBMI2, F_GFNI, F_VPCLMULQDQ,
> > F_AVX512VNNI, F_AVX512BITALG): New.
> >
> > gcc/testsuite/
> > * gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add
> > cannonlake.
> > (check_features): Add avx512vbmi2, gfni, vpclmulqdq, avx512vnni,
> > avx512bitalg.
> >
> > libgcc/
> > * config/i386/cpuinfo.c (get_available_features): Add
> > FEATURE_AVX512VBMI2, FEATURE_GFNI, FEATURE_VPCLMULQDQ,
> > FEATURE_AVX512VNNI, FEATURE_AVX512BITALG.
> > * config/i386/cpuinfo.h (processor_features): Add
> > FEATURE_AVX512VBMI2, FEATURE_GFNI, FEATURE_VPCLMULQDQ,
> > FEATURE_AVX512VNNI, FEATURE_AVX512BITALG.
> 
> Could you please mention which problem your patch fixes?
> 
> --
> Thanks, K


Re: Reduce inline limits a bit to compensate changes in inlining metrics

2018-02-12 Thread Richard Biener
On Fri, 9 Feb 2018, Jan Hubicka wrote:

> Hi,
> this patch addresses the code size regression by reducing 
> max-inline-insns-auto 40->30 and increasing inline-min-speedup 8->15.
> 
> The main reasons why we need retuning are the following:
> 
>  - inline-min-speedup works in such a way that if the expected runtime
>    of the caller+callee combo after inlining reduces by more than 8%,
>    the inliner is going to bypass inline-insns-auto (because it knows the
>    inlining is beneficial rather than just inlining in the hope it will be).
>    The decrease happens either because the callee is very lightweight on
>    average or because we can track that it will optimize well.
> 
>    During GCC 8 development I have switched time estimates from int to sreal.
>    The original estimates capped time at about 1000 instructions and thus
>    large functions rarely saw a speedup, because that was based on comparing
>    capped numbers.  With sreals we can now track benefits better.
> 
>  - We made quite some progress on early optimizations, making function
>    bodies appear smaller to the inliner, which in turn inlines more of them.
>    This is the reason why we want to decrease max-inline-insns-auto to gain
>    some code size back.
> 
>    The code size estimate difference at the beginning of inlining is about
>    6% to gcc 6 and about 12% to gcc 4.9.
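
As a rough model of the interaction between the two parameters described in
the first item, here is a simplified sketch.  The function and its arguments
are hypothetical and the real inliner heuristics consider many more inputs;
it only shows a large enough estimated speedup bypassing the size limit.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the decision: inline if the callee
   is small enough, or if the estimated speedup is big enough to bypass
   the size limit.  */
static bool
want_inline_auto (double time_before, double time_after, int callee_size,
                  int min_speedup_percent, int max_insns_auto)
{
  /* Relative speedup above min_speedup_percent bypasses the limit.  */
  bool big_speedup
    = (time_before - time_after) * 100 > time_before * min_speedup_percent;
  return big_speedup || callee_size <= max_insns_auto;
}
```

With the old values (8, 40) a callee of size 50 with a 10% estimated speedup
would be inlined in this model; with the new values (15, 30) it would not.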
> 
> I have benchmarked the patch on Haswell with SPEC2000, SPEC2006, polyhedron
> and our C++ benchmarks.  Here I found no off-noise changes on SPEC2000/2006.
> I know that reducing inline-insns-auto to 10 still produces no regressions
> and even improves facerec 6600->8000, but that seems to be a bit of good luck
> (it also depends on the setting of branch predictor weights and needs to be
> analyzed independently).  min-speedup can be increased to 30 without
> measurable effects as well.
> 
> On the C++ benchmark suite I know that cray degrades with min-speedup set to
> 30 (it needs a value of 22).  Also there is a degradation with
> profile-generate on tramp3d.
> 
> So overall I believe that for Haswell the reduction of inline limits gives a
> very consistent code size improvement without performance tradeoffs.
> 
> I also tested Itanium and here things are slightly more sensitive.  The
> reduction of limits affects gzip 337->332 (-1.5%), vpr 1000->980 (-2%), crafty
> (925->935) (+2%) and vortex (1165->1180) (+1%).  So overall it is specint2000
> neutral.  Reducing inline-insns-auto to 10 brings an off-noise overall
> degradation of -1%, and 20 is in between.
> 
> specfp2000 reacts positively by improving applu 520->525 (+1%) and mgrid
> 391->397 (+1.3%).  It would let me reduce inline-insns-auto to 10 without
> any other regressions.
> 
> The C++ benchmarks do not show any off-noise changes.
> 
> I have also done some limited testing on ppc and arm.  They reacted more
> similarly to Haswell, also showing no important changes from reducing the
> inlining limits.
> 
> Now reducing inline limits triggers a failure of testsuite/g++.dg/pr83239.C,
> which tests that inlining happens.  The reason why it does not happen is
> because ipa-fnsplit is trying to second-guess whether the inliner will
> eventually consider the function for inlining, and the test is out of date.
> I decided to hack around it for stage4 and will try to clean these things up
> next stage1.
> 
> Bootstrapped/regtested x86_64-linux.  I know it is late in stage4, but would
> it be OK for GCC 8?

Ok.

Richard.

>   PR middle-end/83665
>   * params.def (inline-min-speedup): Increase from 8 to 15.
>   (max-inline-insns-auto): Decrease from 40 to 30.
>   * ipa-split.c (consider_split): Add some buffer for function to
>   be considered inlining candidate.
>   * invoke.texi (max-inline-insns-auto, inline-min-speedup): Update
>   default values.
> Index: params.def
> ===
> --- params.def(revision 257520)
> +++ params.def(working copy)
> @@ -52,13 +52,13 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCO
>  DEFPARAM (PARAM_INLINE_MIN_SPEEDUP,
> "inline-min-speedup",
> "The minimal estimated speedup allowing inliner to ignore 
> inline-insns-single and inline-insns-auto.",
> -   8, 0, 0)
> +   15, 0, 0)
>  
>  /* The single function inlining limit. This is the maximum size
> of a function counted in internal gcc instructions (not in
> real machine instructions) that is eligible for inlining
> by the tree inliner.
> -   The default value is 450.
> +   The default value is 400.
> Only functions marked inline (or methods defined in the class
> definition for C++) are affected by this.
> There are more restrictions to inlining: If inlined functions
> @@ -77,11 +77,11 @@ DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
> that is applied to functions marked inlined (or defined in the
> class declaration in C++) given by the "max-inline-insns-single"
> parameter.
> -   The default value is 40.  */
> +   The default value is 30.  */
>  DEFPARAM 

Re: [PATCH] Improve dead code elimination with -fsanitize=address (PR84307)

2018-02-12 Thread Richard Biener
On Fri, Feb 9, 2018 at 9:10 PM, Richard Biener
 wrote:
> On February 9, 2018 7:07:45 PM GMT+01:00, Jakub Jelinek  
> wrote:
>>On Fri, Feb 09, 2018 at 07:01:08PM +0100, Richard Biener wrote:
>>> >which indeed fixes the testcase and seems not to break asan.exp.
>>>
>>> Huh. Need to double check why that makes sense ;)
>>
>>I think it does: for both ASAN_CHECK and ASAN_MARK the pointer argument
>>is the second one; the first one is an integer argument with flags.
>>And ASAN_MARK, both poison and unpoison, works kind of like a clobber
>>on the referenced variable: before unpoison it is generally
>>inaccessible, and after poison too.
>
> Ah, indeed.

Which was an approval as well, in case you want to push this right now.

Richard.

> Richard.
>
>>   Jakub
>


Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-12 Thread Alexandre Oliva
On Feb  9, 2018, Alexandre Oliva  wrote:

> On Feb  9, 2018, Jakub Jelinek  wrote:
>> On Fri, Feb 09, 2018 at 07:01:25PM -0200, Alexandre Oliva wrote:
>>> So, as discussed on IRC, I'm trying to use a target hook to allow
>>> targets to indicate that their length attrs have been assessed for this
>>> purpose, and a param to make that overridable, but I'm having trouble
>>> initializing the param from the target hook.  How does one do that?

>> Better in the default version of the target hook check the param
>> whether it should return true or false, and for analyzed targets
>> just use an always true (or false, depending on what the hook is)
>> as the hook.

> I want it to be overridable, so here's what I ended up with.
> Testing underway; ok to install if it succeeds?

This patch supersedes the previous one.  Testing underway...  Ok if it
succeeds?

Sorry for combining so many not-entirely-related issues in a single
patch, but there would be lots of overlaps and conflicts otherwise, and
in the end they're all about allowing finer-tuning of markers and views,
so I hope that's ok.  Well, not all: there are formatting fixes to docs
that are totally unrelated, but still overlapping.  Anyway...


[LVU, IEPM] several new controlling options

Given that the minimum insn length is not generally reliable to tell
whether an insn actually advances PC, this patch disables the locview
list optimizations that can only be done when we can tell.

The preexisting logic is retained, however, and can be enabled with
the newly-introduced -ginternal-reset-location-views.  This is now
enabled by default only if the target defines a hook that may override
or defer to the preexisting logic.  The negated command line option
can then be used should errors still be encountered.


We also introduce options to control whether to assume .loc and view
support in the assembler, and to control whether to output inline
entry points (and views) from markers.
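
As a rough model of how the defaults listed in the toplev.c ChangeLog entry
of this message fit together, here is a standalone sketch.  All names are
hypothetical, and it assumes every one of these options uses 2 as its
"not set on the command line" value, as the Init(2) in common.opt suggests;
it is not the code from the patch.

```c
#include <stdbool.h>

/* 2 is assumed to mean "not set on the command line".  */
enum { VIEW_OPT_OFF = 0, VIEW_OPT_ON = 1, VIEW_OPT_UNSET = 2 };

/* Hypothetical helper: keep an explicit setting, otherwise use the
   computed default.  */
static int
resolve_default (int opt, bool default_on)
{
  return opt == VIEW_OPT_UNSET ? (int) default_on : opt;
}

struct view_options
{
  int variable_location_views;
  int internal_reset_location_views;
  int inline_points;
};

/* Hypothetical model of the defaults described in the ChangeLog.  */
static void
resolve_view_options (struct view_options *o, bool as_locview_support,
                      bool target_has_reset_hook, bool statement_frontiers)
{
  /* Location views default to on only with assembler locview support.  */
  o->variable_location_views
    = resolve_default (o->variable_location_views, as_locview_support);
  /* Internal view resets default to on only if the target has the hook.  */
  o->internal_reset_location_views
    = resolve_default (o->internal_reset_location_views,
                       target_has_reset_hook);
  /* Inline entry points follow location views by default...  */
  o->inline_points
    = resolve_default (o->inline_points,
                       o->variable_location_views != VIEW_OPT_OFF);
  /* ... and are forced off without statement frontiers.  */
  if (!statement_frontiers)
    o->inline_points = VIEW_OPT_OFF;
}
```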


This patch also fixes a number of documentation formatting errors,
namely using @item rather than @itemx for all but the first of several
options before a description.

for  gcc/ChangeLog

* common.opt (gas-loc-support, gas-locview-support): New.
(ginline-points, ginternal-reset-location-views): New.
* doc/invoke.texi: Document them.  Use @itemx where intended.
(gvariable-location-views): Adjust.
* target.def (reset_location_view): New.
* doc/tm.texi.in (DWARF2_ASM_VIEW_DEBUG_INFO): New.
(TARGET_RESET_LOCATION_VIEW): New.
* doc/tm.texi: Rebuilt.
* dwarf2out.c (dwarf2out_default_as_loc_support): New.
(dwarf2out_default_as_locview_support): New.
(output_asm_line_debug_info): Use option variables.
(dwarf2out_maybe_output_loclist_view_pair): Likewise.
(output_loc_list): Likewise.
(add_high_low_attributes): Check option variables.
Don't output entry view attribute in strict mode.
(gen_inlined_subroutine_die): Check option variables.
(dwarf2out_inline_entry): Likewise.
(init_sections_and_labels): Likewise.
(dwarf2out_early_finish): Likewise.
(maybe_reset_location_view): New, from...
(dwarf2out_var_location): ... here.  Call it.
* debug.h (dwarf2out_default_as_loc_support): Declare.
(dwarf2out_default_as_locview_support): Declare.
* hooks.c (hook_int_rtx_insn_0): New.
* hooks.h (hook_int_rtx_insn_0): Declare.
* toplev.c (process_options): Take -gas-loc-support and
-gas-locview-support from dwarf2out.  Enable
-gvariable-location-views by default only with locview
assembler support.  Enable -ginternal-reset-location-views by
default only if the target defines the corresponding hook.
Enable -ginline-points by default if location views are
enabled; force it disabled if statement frontiers are
disabled.
* tree-inline.c (expand_call_inline): Check option variables.
* tree-ssa-live.c (remove_unused_scope_block_p): Likewise.
---
 gcc/common.opt  |   16 
 gcc/debug.h |2 +
 gcc/doc/invoke.texi |  106 +++--
 gcc/doc/tm.texi |   23 ++
 gcc/doc/tm.texi.in  |9 ++
 gcc/dwarf2out.c |  186 +--
 gcc/hooks.c |6 ++
 gcc/hooks.h |1 
 gcc/target.def  |   17 +
 gcc/toplev.c|   45 +++-
 gcc/tree-inline.c   |2 -
 gcc/tree-ssa-live.c |4 +
 12 files changed, 324 insertions(+), 93 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 40ec0088c57e..e0bc4d1bb18d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2880,6 +2880,14 @@ g
 Common Driver RejectNegative JoinedOrMissing
 Generate debug information in default format.
 
+gas-loc-support
+Common Driver Var(dwarf2out_as_loc_support) Init(2)
+Assume assembler support for (DWARF2+)