Re: [PATCH][i386][3/3] PR target/84164: Make *cmpqi_ext_ patterns accept more zero_extract modes

2018-02-14 Thread Uros Bizjak
On Wed, Feb 14, 2018 at 7:04 PM, Kyrill Tkachov wrote:
>
> On 13/02/18 16:45, Jeff Law wrote:
>>
>> On 02/09/2018 07:50 AM, Kyrill Tkachov wrote:
>>>
>>> Hi Uros,
>>>
>>> On 08/02/18 22:54, Uros Bizjak wrote:

 On Thu, Feb 8, 2018 at 6:11 PM, Kyrill Tkachov wrote:
>
> Hi all,
>
> This patch fixes some fallout in the i386 testsuite that occurs after the
> simplification in patch [1/3] [1].
> The gcc.target/i386/extract-2.c FAILs because it expects to match:
> (set (reg:CC 17 flags)
>   (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98)
>   (const_int 8 [0x8])
>   (const_int 8 [0x8])) 0)
>   (const_int 4 [0x4])))
>
> which is the *cmpqi_ext_2 pattern in i386.md but with the new simplification
> the combine/simplify-rtx machinery produces:
> (set (reg:CC 17 flags)
>   (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98)
>   (const_int 8 [0x8])
>   (const_int 8 [0x8])) 0)
>   (const_int 4 [0x4])))
>
> Notice that the zero_extract now has HImode like the register source rather
> than SImode.
> The existing *cmpqi_ext_ patterns however explicitly demand an SImode on
> the zero_extract.
> I'm not overly familiar with the i386 port but I think that's too restrictive.
> The RTL documentation says:
> For (zero_extract:m loc size pos) "The mode m is the same as the mode that
> would be used for loc if it were a register."
> I'm not sure if that means that the mode of the zero_extract and the source
> register must always match (as is the case after patch [1/3]) but in any case
> it shouldn't matter semantically since we're taking a QImode subreg of the
> whole thing anyway.
>
> So the proposed solution in this patch is to allow HI, SI and DImode
> zero_extracts in these patterns as these are the modes that the
> ext_register_operand predicate accepts, so that the patterns can match the
> new form above.
>
> With this patch the aforementioned test passes again and bootstrap and
> testing on x86_64-unknown-linux-gnu shows no regressions.
>
> Is this ok for trunk if the first patch is accepted?
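
The claim that the zero_extract mode cannot matter once a QImode subreg is taken can be sanity-checked outside GCC. The following is only a sketch in plain C, not anything from the patch: it models the 8-bit extract at bit position 8 with a 32-bit temporary (standing in for SImode) and with a 16-bit temporary (standing in for HImode), then truncates to 8 bits as the lowpart subreg would. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of (subreg:QI (zero_extract:SI reg 8 8) 0): extract bits 8..15
   into a 32-bit temporary, then keep the low byte.  */
static uint8_t high_byte_via_si(uint16_t reg)
{
  uint32_t extracted = ((uint32_t)reg >> 8) & 0xffu;
  return (uint8_t)extracted;
}

/* Model of (subreg:QI (zero_extract:HI reg 8 8) 0): same extract, but
   the temporary is only 16 bits wide.  */
static uint8_t high_byte_via_hi(uint16_t reg)
{
  uint16_t extracted = (uint16_t)((reg >> 8) & 0xffu);
  return (uint8_t)extracted;
}

/* Returns 1 when both models agree for every possible 16-bit input.  */
static int widths_agree(void)
{
  for (uint32_t v = 0; v <= 0xffffu; v++)
    if (high_byte_via_si((uint16_t)v) != high_byte_via_hi((uint16_t)v))
      return 0;
  return 1;
}
```

Since only eight bits are extracted and the result is zero-extended, every bit above the low byte is zero in either width, so the truncated value is the same.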

 Huh, there are many other zero-extract patterns besides cmpqi_ext_*
 with QImode subreg of SImode zero_extract in i386.md, used to access
 high QImode register of HImode pair. A quick grep shows these that
 have _ext_ in their name:

 (define_insn "*cmpqi_ext_1"
 (define_insn "*cmpqi_ext_2"
 (define_expand "cmpqi_ext_3"
 (define_insn "*cmpqi_ext_3"
 (define_insn "*cmpqi_ext_4"
 (define_insn "addqi_ext_1"
 (define_insn "*addqi_ext_2"
 (define_expand "testqi_ext_1_ccno"
 (define_insn "*testqi_ext_1"
 (define_insn "*testqi_ext_2"
 (define_insn_and_split "*testqi_ext_3"
 (define_insn "andqi_ext_1"
 (define_insn "*andqi_ext_1_cc"
 (define_insn "*andqi_ext_2"
 (define_insn "*qi_ext_1"
 (define_insn "*qi_ext_2"
 (define_expand "xorqi_ext_1_cc"
 (define_insn "*xorqi_ext_1_cc"

 There are also relevant splitters and peephole2 patterns.
>>>
>>> I see. Another approach I've looked at is removing the mode specifier from
>>> the zero_extract in these patterns. This means that they can be of any mode
>>> so they will match all of these modes without creating new patterns through
>>> iterators. That also works for the testcase and passes bootstrap and testing
>>> however there is the snag that the define_insns that don't start with a "*"
>>> are used to generate RTL through the gen_* mechanism and in that context
>>> the absence of a mode on the zero_extract would mean a VOIDmode zero_extract
>>> would be created, which I'm fairly sure is not good. So in my experiments
>>> I left those patterns alone (with an explicit SI on the zero_extract).
>>>
 IIRC, SImode zero_extract was enough to catch all high-register uses.
 There will be a pattern explosion if we want to handle all other
 integer modes here. However, I'm not a RTL expert, so someone will
 have to say what is the correct RTX form here.
>>>
>>> Jeff, Richard, could you please give us some guidance on this issue?
>>> Sorry for the trouble.
>>>
>> I don't think any of the patterns above are known to the generic code.
>> So you just have to check the x86 backend to see their precise uses in a
>> generator (ie gen_cmpqi_ext_3) and verify those do not allow a VOIDmode
>> (or any other undesirable mode) to slip through.
>>
>> Jeff
>
>
> Thanks Jeff, I did have a look. I think we want to maintain the SImode on the
> RTL that gets created through these named expanders, as generating a VOIDmode
> zero_extract is not valid. So my patch leaves those 

Re: [PATCH] __VA_OPT__ fixes (PR preprocessor/83063, PR preprocessor/83708)

2018-02-14 Thread Jason Merrill

On 01/10/2018 07:04 AM, Jakub Jelinek wrote:

> The following patch attempts to fix various issues, including some ICEs,
> by introducing 3 new states, two of them are alternatives to INCLUDE used
> for the very first token after __VA_OPT__( , where we want to take into
> account also flags from the __VA_OPT__ token, and before the closing )
> token where we want to use flags from the closing ) token.  Plus
> PADDING which is used for the case where there are no varargs and __VA_OPT__
> is supposed to fold into a placemarker, or for the case of __VA_OPT__(),
> which is similar to that, in both cases we need to take into account in
> those cases both flags from __VA_OPT__ and from the closing ).


I had an idea for a way to simplify this, which I've attached below.


> This is just a partial fix, one thing this patch doesn't change is that
> the standard says that __VA_OPT__ ( contents ) should be treated as
> parameter, which means that #__VA_OPT__ ( contents ) should stringify it,
> which we right now reject.  My preprocessor knowledge is too limited to
> handle this right myself, including all the corner cases, e.g. one can have
> #define f(x, ...) #__VA_OPT__(#x x ## x) etc..  I presume
> m_flags = token->flags & (PREV_FALLTHROUGH | PREV_WHITE);
> could be changed into:
> m_flags = token->flags & (PREV_FALLTHROUGH | PREV_WHITE | STRINGIFY_ARG);
> and when handling the PADDING result from update, we could just emit a
> "" token, but for INCLUDE_FIRST with this we'd need something complex,
> probably a new routine similar to stringify_arg to some extent.


Yes, I think long term we really need to treat __VA_OPT__ more like an 
argument.


The first patch below makes your testcases work in what seems to me a 
simpler way: pad when we see __VA_OPT__ if we aren't pasting to the 
left, and fix up the end of the body if we're pasting to the right.


The second further patch below makes the rest of the clang testcase work 
the way it does in clang, apart from stringification.  But it feels more 
kludgey.


Thoughts?

Jason
commit f07a7a181489ad2c6287c239d6b034a3933a56ce
Author: Jason Merrill 
Date:   Wed Feb 14 13:59:22 2018 -0500

PR preprocessor/83063

PR preprocessor/83708
* macro.c (vaopt_state): Reorder m_last_was_paste before m_state.
(vaopt_state::vaopt_state): Adjust.
(vaopt_state::update_flags): Add BEGIN and END.
(vaopt_state::update): Return them.
(retcon_paste_flag, padding_ok_after_last_token): New.
(replace_args): Handle BEGIN and END.
(tokens_buff_last_token_ptr): Return NULL if no tokens.

diff --git a/gcc/testsuite/c-c++-common/cpp/va-opt-2.c b/gcc/testsuite/c-c++-common/cpp/va-opt-2.c
new file mode 100644
index 000..cff2d6cbe5d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/va-opt-2.c
@@ -0,0 +1,41 @@
+/* PR preprocessor/83063 */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++2a" { target c++ } } */
+
+#define f1(...) int b##__VA_OPT__(c)
+#define f2(...) int __VA_OPT__(c)##d
+#define f3(...) int e##__VA_OPT__()
+#define f4(...) int __VA_OPT__()##f
+#define f5(...) int g##__VA_OPT__(h)##i
+#define f6(...) int j##__VA_OPT__()##k
+#define f7(...) int l##__VA_OPT__()
+#define f8(...) int __VA_OPT__()##m
+#define f9(...) int n##__VA_OPT__()##o
+#define f10(x, ...) int x##__VA_OPT__(x)
+#define f11(x, ...) int __VA_OPT__(x)##x
+#define f12(x, ...) int x##__VA_OPT__(x)##x
+f1 (1, 2, 3);
+f1 ();
+f2 (1, 2);
+f2 ();
+f3 (1);
+f4 (2);
+f5 (6, 7);
+f5 ();
+f6 (8);
+f7 ();
+f8 ();
+f9 ();
+f10 (p, 5, 6);
+f10 (p);
+f11 (q, 7);
+f11 (q);
+f12 (r, 1, 2, 3, 4, 5);
+f12 (r);
+
+int
+main ()
+{
+  return bc + b + cd + d + e + f + ghi + gi + jk + l + m + no + pp + p + qq + q + rrr + rr;
+}
diff --git a/gcc/testsuite/c-c++-common/cpp/va-opt-3.c b/gcc/testsuite/c-c++-common/cpp/va-opt-3.c
new file mode 100644
index 000..2e639891070
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/va-opt-3.c
@@ -0,0 +1,69 @@
+/* PR preprocessor/83063 */
+/* PR preprocessor/83708 */
+/* { dg-do preprocess } */
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++2a" { target c++ } } */
+
+#define f1(...) b##__VA_OPT__(c)
+#define f2(...) __VA_OPT__(c)##d
+#define f3(...) e##__VA_OPT__()
+#define f4(...) __VA_OPT__()##f
+#define f5(...) g##__VA_OPT__(h)##i
+#define f6(...) j##__VA_OPT__()##k
+#define f7(...) l##__VA_OPT__()
+#define f8(...) __VA_OPT__()##m
+#define f9(...) n##__VA_OPT__()##o
+#define f10(x, ...) x##__VA_OPT__(x)
+#define f11(x, ...) __VA_OPT__(x)##x
+#define f12(x, ...) x##__VA_OPT__(x)##x
+#define f13(...) __VA_OPT__(a)__VA_OPT__(b)c
+#define f14(a, b, c, ...) __VA_OPT__(a)__VA_OPT__(b)c
+#define f15(a, b, c, ...) __VA_OPT__(a b)__VA_OPT__(b c)a/**/__VA_OPT__(c a)a
+t1 f1 (1, 2, 3);
+/* { dg-final { scan-file va-opt-3.i "t1 bc;" } } */
+t2 f1 ();
+/* { dg-final { scan-file va-opt-3.i "t2 b;" } } */
+t3 f2 (1, 2);
+/* { dg-final { 

[PATCH] RISC-V: Change sp subtracts so prologue stores can compress.

2018-02-14 Thread Jim Wilson
This patch changes the initial stack pointer subtraction when we need two
subtracts, so that the first stack pointer subtraction allows the register
save stores to compress.  This is only done for RVC targets, and only when
it doesn't change the number of instructions required.  A follow-on patch
may change this further when -Os is used, to allow an extra instruction or two
if it results in smaller code in the end.

This initial patch doesn't affect many functions, but is useful when it can
trigger.  For instance, in the newlib vfwscanf file, text size reduces from
4006 to 3954.

This was tested with rv32/rv64 newlib/linux builds.  There were no regressions.

Committed.

Jim

gcc/
* config/riscv/riscv.c (riscv_first_stack_step): Move locals after
first SMALL_OPERAND check.  New local min_second_step.  Move assert
to where locals are set.  Add TARGET_RVC support.
* config/riscv/riscv.h (C_SxSP_BITS, SWSP_REACH, SDSP_REACH): New.
---
 gcc/config/riscv/riscv.c | 30 --
 gcc/config/riscv/riscv.h |  4 
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 4ef7a1774c4..c38f6c394d5 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3495,25 +3495,43 @@ riscv_output_gpr_save (unsigned mask)
 
 /* For stack frames that can't be allocated with a single ADDI instruction,
compute the best value to initially allocate.  It must at a minimum
-   allocate enough space to spill the callee-saved registers.  */
+   allocate enough space to spill the callee-saved registers.  If TARGET_RVC,
+   try to pick a value that will allow compression of the register saves
+   without adding extra instructions.  */
 
 static HOST_WIDE_INT
 riscv_first_stack_step (struct riscv_frame_info *frame)
 {
-  HOST_WIDE_INT min_first_step = frame->total_size - frame->fp_sp_offset;
-  HOST_WIDE_INT max_first_step = IMM_REACH / 2 - STACK_BOUNDARY / 8;
-
   if (SMALL_OPERAND (frame->total_size))
 return frame->total_size;
 
+  HOST_WIDE_INT min_first_step = frame->total_size - frame->fp_sp_offset;
+  HOST_WIDE_INT max_first_step = IMM_REACH / 2 - STACK_BOUNDARY / 8;
+  HOST_WIDE_INT min_second_step = frame->total_size - max_first_step;
+  gcc_assert (min_first_step <= max_first_step);
+
   /* As an optimization, use the least-significant bits of the total frame
  size, so that the second adjustment step is just LUI + ADD.  */
-  if (!SMALL_OPERAND (frame->total_size - max_first_step)
+  if (!SMALL_OPERAND (min_second_step)
   && frame->total_size % IMM_REACH < IMM_REACH / 2
   && frame->total_size % IMM_REACH >= min_first_step)
 return frame->total_size % IMM_REACH;
 
-  gcc_assert (min_first_step <= max_first_step);
+  if (TARGET_RVC)
+{
+  /* If we need two subtracts, and one is small enough to allow compressed
+loads and stores, then put that one first.  */
+  if (IN_RANGE (min_second_step, 0,
+   (TARGET_64BIT ? SDSP_REACH : SWSP_REACH)))
+   return MAX (min_second_step, min_first_step);
+
+  /* If we need LUI + ADDI + ADD for the second adjustment step, then start
+with the minimum first step, so that we can get compressed loads and
+stores.  */
+  else if (!SMALL_OPERAND (min_second_step))
+   return min_first_step;
+}
+
   return max_first_step;
 }
 
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 1c1c3431119..6144e267727 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -891,9 +891,13 @@ extern unsigned riscv_stack_boundary;
 #define SHIFT_RS1 15
 #define SHIFT_IMM 20
 #define IMM_BITS 12
+#define C_SxSP_BITS 6
 
 #define IMM_REACH (1LL << IMM_BITS)
 #define CONST_HIGH_PART(VALUE) (((VALUE) + (IMM_REACH/2)) & ~(IMM_REACH-1))
 #define CONST_LOW_PART(VALUE) ((VALUE) - CONST_HIGH_PART (VALUE))
 
+#define SWSP_REACH (4LL << C_SxSP_BITS)
+#define SDSP_REACH (8LL << C_SxSP_BITS)
+
 #endif /* ! GCC_RISCV_H */
-- 
2.14.1
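
For readers who want to experiment with the selection logic without building GCC, here is a standalone sketch of the patched riscv_first_stack_step in plain C. It is a model under stated assumptions, not the backend code: the frame is reduced to two integers, TARGET_RVC and TARGET_64BIT become boolean parameters, SMALL_OPERAND is modelled as a signed 12-bit range check, and STACK_BOUNDARY is assumed to be 128 bits. The function and parameter names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

#define IMM_BITS 12
#define IMM_REACH (1LL << IMM_BITS)
#define C_SxSP_BITS 6
#define SWSP_REACH (4LL << C_SxSP_BITS)   /* 256 bytes */
#define SDSP_REACH (8LL << C_SxSP_BITS)   /* 512 bytes */
#define STACK_BOUNDARY 128                /* assumption: 16-byte alignment */

/* Model of SMALL_OPERAND: does VALUE fit a signed 12-bit immediate?  */
static bool small_operand(long long value)
{
  return value >= -(IMM_REACH / 2) && value < IMM_REACH / 2;
}

/* Model of the patched riscv_first_stack_step.  total_size and
   fp_sp_offset mirror the frame fields used in the diff.  */
static long long first_stack_step(long long total_size, long long fp_sp_offset,
                                  bool target_rvc, bool target_64bit)
{
  if (small_operand(total_size))
    return total_size;

  long long min_first_step = total_size - fp_sp_offset;
  long long max_first_step = IMM_REACH / 2 - STACK_BOUNDARY / 8;
  long long min_second_step = total_size - max_first_step;
  assert(min_first_step <= max_first_step);

  /* Use the low bits of the frame size so the second step is LUI + ADD.  */
  if (!small_operand(min_second_step)
      && total_size % IMM_REACH < IMM_REACH / 2
      && total_size % IMM_REACH >= min_first_step)
    return total_size % IMM_REACH;

  if (target_rvc)
    {
      long long reach = target_64bit ? SDSP_REACH : SWSP_REACH;
      /* Small second step: do it first so the saves can compress.  */
      if (min_second_step >= 0 && min_second_step <= reach)
        return min_second_step > min_first_step
               ? min_second_step : min_first_step;
      /* Second step needs LUI + ADDI + ADD anyway: start minimal.  */
      else if (!small_operand(min_second_step))
        return min_first_step;
    }

  return max_first_step;
}
```

With a 2064-byte frame of which 32 bytes hold register saves, the model picks a 32-byte first step when RVC is enabled and 2032 bytes otherwise, which is the situation the patch description targets.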



[PATCH] Fix LRA ICE in lra_substitute_pseudo on DEBUG_INSN (PR rtl-optimization/83723)

2018-02-14 Thread Jakub Jelinek
Hi!

Unlike normal insns where SUBREGs must properly validate, in
DEBUG_INSNs we allow arbitrary SUBREGs; either the dwarf2out code
will be able to use them, or it will just punt.  The reason for this is,
among other things, that during analysis we usually need to ignore
debug insns, so we can't reject some optimization just because it would
create a subreg in a debug insn that doesn't validate, and resetting such
debug insns is too big a hammer.

On the following testcase on i?86 we ICE because we have an SFmode
pseudo and want to use an XFmode new_reg for it, and such a subreg doesn't
validate on i386.

Fixed by using gen_rtx_raw_SUBREG in DEBUG_INSNs as other passes do.
We don't have gen_lowpart_raw_SUBREG, so the patch inlines what
gen_lowpart_SUBREG does to compute the offset and uses gen_rtx_{,raw_}SUBREG
in all cases.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-02-14  Jakub Jelinek  

PR rtl-optimization/83723
* lra-int.h (lra_substitute_pseudo): Add DEBUG_P argument.
* lra.c (lra_substitute_pseudo): Likewise.  If true, use
gen_rtx_raw_SUBREG instead of gen_rtx_SUBREG.  Pass DEBUG_P to
recursive calls.
(lra_substitute_pseudo_within_insn): Adjust lra_substitute_pseudo
callers.
* lra-constraints.c (inherit_reload_reg, split_reg): Likewise.

* gcc.dg/pr83723.c: New test.

--- gcc/lra-int.h.jj   2018-01-03 10:19:53.848533738 +0100
+++ gcc/lra-int.h   2018-02-14 10:48:21.246324445 +0100
@@ -309,7 +309,7 @@ extern void lra_update_dups (lra_insn_re
 extern void lra_process_new_insns (rtx_insn *, rtx_insn *, rtx_insn *,
   const char *);
 
-extern bool lra_substitute_pseudo (rtx *, int, rtx, bool);
+extern bool lra_substitute_pseudo (rtx *, int, rtx, bool, bool);
 extern bool lra_substitute_pseudo_within_insn (rtx_insn *, int, rtx, bool);
 
 extern lra_insn_recog_data_t lra_set_insn_recog_data (rtx_insn *);
--- gcc/lra.c.jj   2018-01-03 10:19:54.726533889 +0100
+++ gcc/lra.c   2018-02-14 10:47:52.033315369 +0100
@@ -1893,9 +1893,11 @@ lra_process_new_insns (rtx_insn *insn, r
 
 /* Replace all references to register OLD_REGNO in *LOC with pseudo
register NEW_REG.  Try to simplify subreg of constant if SUBREG_P.
-   Return true if any change was made.  */
+   DEBUG_P is if LOC is within a DEBUG_INSN.  Return true if any
+   change was made.  */
 bool
-lra_substitute_pseudo (rtx *loc, int old_regno, rtx new_reg, bool subreg_p)
+lra_substitute_pseudo (rtx *loc, int old_regno, rtx new_reg, bool subreg_p,
+  bool debug_p)
 {
   rtx x = *loc;
   bool result = false;
@@ -1931,11 +1933,14 @@ lra_substitute_pseudo (rtx *loc, int old
   if (mode != inner_mode
  && ! (CONST_INT_P (new_reg) && SCALAR_INT_MODE_P (mode)))
{
- if (!partial_subreg_p (mode, inner_mode)
- || ! SCALAR_INT_MODE_P (inner_mode))
-   new_reg = gen_rtx_SUBREG (mode, new_reg, 0);
+ poly_uint64 offset = 0;
+ if (partial_subreg_p (mode, inner_mode)
+ && SCALAR_INT_MODE_P (inner_mode))
+   offset = subreg_lowpart_offset (mode, inner_mode);
+ if (debug_p)
+   new_reg = gen_rtx_raw_SUBREG (mode, new_reg, offset);
  else
-   new_reg = gen_lowpart_SUBREG (mode, new_reg);
+   new_reg = gen_rtx_SUBREG (mode, new_reg, offset);
}
   *loc = new_reg;
   return true;
@@ -1948,14 +1953,14 @@ lra_substitute_pseudo (rtx *loc, int old
   if (fmt[i] == 'e')
{
  if (lra_substitute_pseudo (&XEXP (x, i), old_regno,
-new_reg, subreg_p))
+new_reg, subreg_p, debug_p))
result = true;
}
   else if (fmt[i] == 'E')
{
  for (j = XVECLEN (x, i) - 1; j >= 0; j--)
if (lra_substitute_pseudo (&XVECEXP (x, i, j), old_regno,
-  new_reg, subreg_p))
+  new_reg, subreg_p, debug_p))
  result = true;
}
 }
@@ -1970,7 +1975,8 @@ lra_substitute_pseudo_within_insn (rtx_i
   rtx new_reg, bool subreg_p)
 {
   rtx loc = insn;
-  return lra_substitute_pseudo (&loc, old_regno, new_reg, subreg_p);
+  return lra_substitute_pseudo (&loc, old_regno, new_reg, subreg_p,
+   DEBUG_INSN_P (insn));
 }
 
 
--- gcc/lra-constraints.c.jj   2018-02-09 06:44:36.392805568 +0100
+++ gcc/lra-constraints.c   2018-02-14 10:49:22.193340178 +0100
@@ -5287,7 +5287,8 @@ inherit_reload_reg (bool def_p, int orig
  lra_assert (DEBUG_INSN_P (usage_insn));
  next_usage_insns = XEXP (next_usage_insns, 1);
}
-  lra_substitute_pseudo (&usage_insn, original_regno, new_reg, false);
+  lra_substitute_pseudo (&usage_insn, original_regno, new_reg, false,
+DEBUG_INSN_P (usage_insn));
   

[PATCH] Fix ICE in maybe_diag_stxncpy_trunc (PR tree-optimization/84383)

2018-02-14 Thread Jakub Jelinek
Hi!

The function calls get_addr_base_and_unit_offset on 2 trees, but
that can return NULL if the unit offset is not constant.
The conditional tests just one of them for non-NULL and operand_equal_p
ICEs if one argument is NULL, so depending on the uninitialized poly_int64
(get_addr_base_and_unit_offset doesn't touch it if it returns NULL),
we either ICE in operand_equal_p or are lucky and dstoff is equal to lhsoff
and just valgrind complains.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-02-15  Jakub Jelinek  

PR tree-optimization/84383
* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Don't look at
dstoff nor call operand_equal_p if dstbase is NULL.

* gcc.c-torture/compile/pr84383.c: New test.

--- gcc/tree-ssa-strlen.c.jj   2018-02-09 06:44:29.993809176 +0100
+++ gcc/tree-ssa-strlen.c   2018-02-14 16:38:36.981713666 +0100
@@ -1878,6 +1878,7 @@ maybe_diag_stxncpy_trunc (gimple_stmt_it
   poly_int64 lhsoff;
  tree lhsbase = get_addr_base_and_unit_offset (lhs, &lhsoff);
   if (lhsbase
+ && dstbase
  && known_eq (dstoff, lhsoff)
  && operand_equal_p (dstbase, lhsbase, 0))
return false;
--- gcc/testsuite/gcc.c-torture/compile/pr84383.c.jj   2018-02-14 17:33:21.972803287 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr84383.c   2018-02-14 17:32:37.639803918 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/84383 */
+
+struct S { char *s; };
+void bar (struct S *);
+
+void
+foo (int a, char *b)
+{
+  struct S c[4];
+  bar (c);
+  __builtin_strncpy (c[a].s, b, 32);
+  c[a].s[31] = '\0';
+  bar (c);
+}

Jakub
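
The shape of the bug is easier to see away from the GCC internals. The sketch below is plain C with invented names (struct ref, base_and_offset, same_location); it only mimics the contract described above, where the helper returns NULL and leaves the offset untouched when no constant offset exists. Both returned pointers are tested before either offset is read, which is what the one-line fix adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a memory reference.  */
struct ref
{
  const char *base;     /* name of the underlying object */
  bool const_offset;    /* is the byte offset a known constant? */
  long offset;
};

/* Mimics the described contract: on failure return NULL and do NOT
   write *off, so the caller's variable stays uninitialized.  */
static const char *base_and_offset(const struct ref *r, long *off)
{
  if (!r->const_offset)
    return NULL;
  *off = r->offset;
  return r->base;
}

/* True only when both references resolve to the same base and offset.
   Short-circuit evaluation guarantees dstoff and lhsoff are read only
   after both pointers are known to be non-NULL.  */
static bool same_location(const struct ref *dst, const struct ref *lhs)
{
  long dstoff, lhsoff;
  const char *dstbase = base_and_offset(dst, &dstoff);
  const char *lhsbase = base_and_offset(lhs, &lhsoff);
  return lhsbase
         && dstbase
         && dstoff == lhsoff
         && strcmp(dstbase, lhsbase) == 0;
}
```

Dropping the dstbase test from the condition reproduces the original hazard: dstoff would be compared while uninitialized, and the comparison helper would receive a NULL argument.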


Re: [Patch] Minor GCC documentation correction for -Wformat-overflow

2018-02-14 Thread Martin Sebor

On 02/14/2018 02:28 PM, Indu Bhagat wrote:

> In section "-Wformat-overflow=1", following is stated :
>
> void f (int a, int b)
> {
>   char buf [12];
>   sprintf (buf, "a = %i, b = %i\n", a, b);
> }
>
> " Increasing the size of the buffer by a single byte is sufficient to avoid
> the warning,"
>
> [size of an unknown int for the purpose of this warning is = 1 (to
> represent 0); add 1 for newline, add 1 for null; add all the other chars
> in the format string = 14]
>
> The minimum increase however needs to be of 2 bytes. i.e., a buf of size
> 14 is the minimum length for the warning in the example to go away.
> So the correct statement should be -
>
> " Increasing the size of the buffer by two bytes is sufficient to avoid the
> warning,"
>
> Alternatively, the size of buf can be bumped up to 13 in the sample code
> as done in the patch below.


You're right, that's an off-by-one error/typo in the example.
Your patch seems like an obvious fix that can go in without
a formal approval, so I've committed it for you in r257680.

Thank you!
Martin
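
The byte count in this thread can be double-checked with a few lines of C. This is just an illustrative sketch (the helper name bytes_needed is made up): it asks snprintf for the formatted length without writing anything and adds one for the terminating NUL. Passing zero for both arguments gives the shortest possible output, which is the case the level 1 warning assumes for unknown ints.

```c
#include <assert.h>
#include <stdio.h>

/* Buffer size, including the terminating NUL, that sprintf would need
   for the documentation example with the given argument values.  */
static int bytes_needed(int a, int b)
{
  /* With a NULL buffer and size 0, snprintf only reports the length.  */
  return snprintf(NULL, 0, "a = %i, b = %i\n", a, b) + 1;
}
```

For a = b = 0 the result is 14, so a 12-byte buffer is two bytes short and a 13-byte buffer is one byte short, matching the correction above.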



RE: PR84239, Reimplement CET intrinsics for rdssp/incssp insn

2018-02-14 Thread Joseph Myers
This patch has broken bootstrap of a cross toolchain for x86_64 (the case 
where inhibit_libc is defined because there is no libc for the target 
available at that stage in the bootstrap process).

In file included from 
/scratch/jmyers/glibc-bot/build/compilers/x86_64-linux-gnu/gcc-first/gcc/include/xmmintrin.h:34,
 from 
/scratch/jmyers/glibc-bot/build/compilers/x86_64-linux-gnu/gcc-first/gcc/include/x86intrin.h:33,
 from 
/scratch/jmyers/glibc-bot/src/gcc/libgcc/config/i386/shadow-stack-unwind.h:25,
 from ./md-unwind-support.h:27,
 from /scratch/jmyers/glibc-bot/src/gcc/libgcc/unwind-dw2.c:411:
../../.././gcc/mm_malloc.h:27:10: fatal error: stdlib.h: No such file or 
directory
 #include <stdlib.h>
          ^~~~~~~~~~

https://sourceware.org/ml/libc-testresults/2018-q1/msg00307.html

The patch makes shadow-stack-unwind.h include <x86intrin.h>, which ends up 
including <mm_malloc.h>, which includes <stdlib.h> and  
unconditionally.  You can't include any libc system headers 
unconditionally from libgcc (only when inhibit_libc is not defined - and 
<mm_malloc.h>, being an installed header, can't test inhibit_libc because 
it's in the user's namespace).  So I think you need to avoid the 
mm_malloc.h include here somehow (without adding any inhibit_libc 
conditionals to installed headers).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [gcc-7 backport PATCH, rs6000/PR84388] fix fold-vec-mult-int128 testcases

2018-02-14 Thread Segher Boessenkool
On Wed, Feb 14, 2018 at 05:02:19PM -0600, Peter Bergner wrote:
> On 2/14/18 4:47 PM, Will Schmidt wrote:
> > -/* { dg-options "-maltivec -mvsx -mpower8-vector" } */
> > +/* { dg-options "-mpower8-vector -mcpu=power8 -O2" } */
> 
> [snip]
> 
> > -/* { dg-options "-maltivec -mvsx -mcpu=power9 -O2" } */
> > +/* { dg-options "-mpower9-vector -mcpu=power9 -O2" } */
> 
> As we discussed offline, I don't know why we need -mpower[89]-vector
> in dg-options if we're already specifying -mcpu=power[89].
> Those are both implied by the -mcpu=power[89] options.

Yes, or the other way around.  -mcpu= has the advantage that it fixes
more variables (scheduling model, potentially some instruction selection,
etc.); -mpower9-vector has the advantage that you do not need the skip-if-
some-other-cpu-is-already-set dance.  Will, your choice.

If you want to change it, please first do so on trunk.


Segher


Re: [gcc-7 backport PATCH, rs6000/PR84388] fix fold-vec-mult-int128 testcases

2018-02-14 Thread Segher Boessenkool
On Wed, Feb 14, 2018 at 04:47:56PM -0600, Will Schmidt wrote:
>   This backports some testcase fixes to the gcc7 branch.  The testcases 
> touched here would now match their gcc-trunk equivalents.
> 
> OK for gcc-7 ?

Sure, thanks!  But see other thread...


Segher


> 2018-02-14  Will Schmidt  
> 
> PR target/84388
>   * gcc.target/powerpc/fold-vec-mult-int128-p8.c: Update dg-options
>   and scan-assembler stanzas.
>   * gcc.target/powerpc/fold-vec-mult-int128-p9.c: Same.


Re: [gcc-7 backport PATCH, rs6000/PR84388] fix fold-vec-mult-int128 testcases

2018-02-14 Thread Peter Bergner
On 2/14/18 4:47 PM, Will Schmidt wrote:
> -/* { dg-options "-maltivec -mvsx -mpower8-vector" } */
> +/* { dg-options "-mpower8-vector -mcpu=power8 -O2" } */

[snip]

> -/* { dg-options "-maltivec -mvsx -mcpu=power9 -O2" } */
> +/* { dg-options "-mpower9-vector -mcpu=power9 -O2" } */

As we discussed offline, I don't know why we need -mpower[89]-vector
in dg-options if we're already specifying -mcpu=power[89].
Those are both implied by the -mcpu=power[89] options.

Peter



[gcc-7 backport PATCH, rs6000/PR84388] fix fold-vec-mult-int128 testcases

2018-02-14 Thread Will Schmidt

Hi, 
  This backports some testcase fixes to the gcc7 branch.  The testcases 
touched here would now match their gcc-trunk equivalents.

OK for gcc-7 ?

Thanks,
-Will

[testsuite]

2018-02-14  Will Schmidt  

PR target/84388
* gcc.target/powerpc/fold-vec-mult-int128-p8.c: Update dg-options
and scan-assembler stanzas.
* gcc.target/powerpc/fold-vec-mult-int128-p9.c: Same.



Index: fold-vec-mult-int128-p8.c
===
--- fold-vec-mult-int128-p8.c   (revision 257672)
+++ fold-vec-mult-int128-p8.c   (working copy)
@@ -4,7 +4,9 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-require-effective-target int128 } */
-/* { dg-options "-maltivec -mvsx -mpower8-vector" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mpower8-vector -mcpu=power8 -O2" } */
 /* { dg-additional-options "-maix64" { target powerpc-ibm-aix* } } */
 
 #include "altivec.h"
@@ -21,5 +23,5 @@
   return vec_mul (x, y);
 }
 
-/* { dg-final { scan-assembler-times "\[ \t\]mulld " 6 } } */
-/* { dg-final { scan-assembler-times "\[ \t\]mulhdu" 2 } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mmulhdu\M} 2 } } */
Index: fold-vec-mult-int128-p9.c
===
--- fold-vec-mult-int128-p9.c   (revision 257672)
+++ fold-vec-mult-int128-p9.c   (working copy)
@@ -2,10 +2,10 @@
inputs produce the right results.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_float128_hw_ok } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-require-effective-target int128 } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
-/* { dg-options "-maltivec -mvsx -mcpu=power9 -O2" } */
+/* { dg-options "-mpower9-vector -mcpu=power9 -O2" } */
 /* { dg-additional-options "-maix64" { target powerpc-ibm-aix* } } */
 
 #include "altivec.h"
@@ -22,4 +22,5 @@
   return vec_mul (x, y);
 }
 
-/* { dg-final { scan-assembler-times "\[ \t\]xsmulqp" 2 } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mmulhdu\M} 2 } } */




Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-14 Thread Pedro Alves
On 02/14/2018 09:47 PM, Manuel López-Ibáñez wrote:
> On 14 Feb 2018 8:16 pm, "Pedro Alves" wrote:
> 
> Instead of a class that has to have a constructor for every type
> you want to pass as plural selector to the _n functions, which
> increases coupling, I'd suggest using a conversion function, and
> overload that.  I.e., something like, in the core diagnostics code:
> 
> static inline unsigned HOST_WIDE_INT
> as_plural_form (unsigned HOST_WIDE_INT val)
> {
>   return val;
> }
> 
> /* In some tree-diagnostics header.  */
> 
> static inline unsigned HOST_WIDE_INT
> as_plural_form (tree t)
> {
>    // extract & return a HWI
> }
> 
> /* In some whatever-other-type-diagnostics header.  */
> 
> static inline unsigned HOST_WIDE_INT
> as_plural_form (whatever_other_type v)
> {
>    // like above
> }
> 
> and then you call error_n and other similar functions like this:
> 
>     error_n (loc, u, "%u thing", "%u things", u);
>     error_n (loc, as_plural_form (u), "%u thing", "%u things", u);
>     error_n (loc, as_plural_form (t), "%E thing", "%E things", t);
>     error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);
> 
> This is similar in spirit to std::to_string, etc.
> 
> 
> If that's desired, why not simply have GCC::to_uhwi() ? It would likely be 
> useful in other contexts.

Because of types (e.g. wide_int specializations) that can store
values larger than what fits in uhwi.  GCC::to_uhwi's semantics for that
aren't clear -- could saturate, could unsigned wrap, could
throw/abort/assert, could be undefined.

as_plural_form has clear semantics -- it'd return a value that does
the right thing for ngettext's N parameter.  I.e., it'd do
the "n % 100 + 100" operation while the value is still a wide_int,
before converting/casting the result to uhwi.

Thanks,
Pedro Alves
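
The reduce-then-narrow ordering described above can be sketched in plain C. This is a toy model, not GCC code: uint64_t stands in for a wide_int-like type, uint32_t stands in for the narrower selector type (deliberately small so the overflow case is easy to reach), and the function name as_plural_form_wide is invented. The "n % 100 + 100" step is taken from the message as written.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins: a wide source type and a narrower selector type.  */
typedef uint64_t wide_t;
typedef uint32_t narrow_t;

/* Return a plural selector for N that fits the narrow type.  Values
   that already fit pass through unchanged; larger ones are reduced
   while still wide, so the narrowing cast never truncates.  */
static narrow_t as_plural_form_wide(wide_t n)
{
  if (n <= UINT32_MAX)
    return (narrow_t)n;
  return (narrow_t)(n % 100 + 100);
}
```

Doing the modulo after the cast would instead operate on an already truncated value, which is the ambiguity a generic to_uhwi conversion would leave open.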


Re: [PATCH] jit: fix link on OS X and Solaris (PR jit/64089 and PR jit/84288)

2018-02-14 Thread FX
I can confirm that, with the attached revised patch, a bootstrap with 
--enable-languages=c,c++,jit --enable-host-shared is successful on macOS.

FX





Re: [PATCH] MPX and CET changes in release notes

2018-02-14 Thread Gerald Pfeifer
Hi Igor,

On Wed, 14 Feb 2018, Tsimbalist, Igor V wrote:
> MPX is going to be deprecated in gcc-8. Control-flow protection support 
> is in gcc-8. Reflect these in Release Notes for gcc-8.

thanks for this update.  Only some minor changes, then this is
good to go.

>>   
>> A new option -fcf-protection=[full|branch|return|none] is
>> introduced to perform a code instrumentation to increase program 
>> security by

"performs code instrumentation" (omit "a")

>> checking that target addresses of control-flow transfer instructions 
>> (such as
>> indirect function call, function return, indirect jump) are valid. 
>> Currently
>> the instrumentation is supported on x86 GNU/Linux target only.

"on x86 GNU/Linux targets" (since there are several such as i586, 
i686,...).

>> See, the user

No comma.

>>   
>> GCC now supports the Intel Control-flow Enforcement Technology (CET)
>> extension through -mibt, -mshstk, -mcet options. One of 
>> these

"...through the -mibt, -mshstk, and 
-mcet options"


Gerald


Re: [PATCH] Fix up compound literal handling in C FE (PR sanitizer/84307)

2018-02-14 Thread Joseph Myers
On Wed, 14 Feb 2018, Jakub Jelinek wrote:

> 2018-02-13  Jakub Jelinek  
> 
>   PR sanitizer/84340
>   * c-decl.c (build_compound_literal): Call pushdecl (decl) even when
>   it is not TREE_STATIC.
>   * c-typeck.c (c_mark_addressable) : Mark
>   not just the COMPOUND_LITERAL_EXPR node itself addressable, but also
>   its COMPOUND_LITERAL_EXPR_DECL.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] combine: Update links correctly for new I2 (PR84169)

2018-02-14 Thread Jakub Jelinek
On Mon, Feb 12, 2018 at 03:59:05PM +0000, Segher Boessenkool wrote:
> 2018-02-12  Segher Boessenkool  
> 
>   PR rtl-optimization/84169
>   * combine.c (try_combine): New variable split_i2i3.  Set it to true if
>   we generated a parallel as new i3 and we split that to new i2 and i3
>   instructions.  Handle split_i2i3 similar to swap_i2i3: scan the
>   LOG_LINKs of i3 to see which of those need to link to i2 now.  Link
>   those to i2, not i1.  Partially rewrite this scan code.

> +  unsigned int regno = REGNO (SET_DEST (x));

This line ICEs with rtl checking on both x86_64-linux and i686-linux
on gcc.c-torture/compile/pr66168.c:
+FAIL: gcc.c-torture/compile/pr66168.c   -O2  (internal compiler error)
+FAIL: gcc.c-torture/compile/pr66168.c   -O2  (test for excess errors)
+FAIL: gcc.c-torture/compile/pr66168.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
+FAIL: gcc.c-torture/compile/pr66168.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
+FAIL: gcc.c-torture/compile/pr66168.c   -O3 -g  (internal compiler error)
+FAIL: gcc.c-torture/compile/pr66168.c   -O3 -g  (test for excess errors)
+FAIL: gcc.c-torture/compile/pr66168.c   -Os  (internal compiler error)
+FAIL: gcc.c-torture/compile/pr66168.c   -Os  (test for excess errors)

/home/jakub/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr66168.c:15:1: internal compiler error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1896
0x6755ff rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*)
../../gcc/rtl.c:844
0x7fcdd2 rhs_regno
../../gcc/rtl.h:1896
0x807e75 rhs_regno
../../gcc/rtl.h:1447
0x807e75 try_combine
../../gcc/combine.c:4286
0x1680cc1 combine_instructions
../../gcc/combine.c:1320
0x1680cc1 rest_of_handle_combine
../../gcc/combine.c:14881
0x1680cc1 execute
../../gcc/combine.c:14926

Jakub


[PATCH] Fix up compound literal handling in C FE (PR sanitizer/84307)

2018-02-14 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 03:40:20PM +0100, Jakub Jelinek wrote:
> BTW, your testcase shows a more severe problem, that we actually don't
> handle compound literals correctly.
> 
> C99 says that:
> "If the compound literal occurs outside the body of a function, the object
> has static storage duration; otherwise, it has automatic storage duration
> associated with the enclosing block."
> but if we create an object with automatic storage duration, we don't
> actually put that object into the scope of the enclosing block, but of the
> enclosing function, which explains the weird ASAN_MARK UNPOISON present, but
> corresponding ASAN_MARK POISON not present.  The following testcase should
> IMHO FAIL with -fsanitize=address on the second bar call, but doesn't, even
> at -O0 without any DSE.  Because of this, when optimizing we don't emit
> CLOBBER stmts when the compound literal object goes out of scope, and with
> -fsanitize=address -fsanitize-address-use-after-scope we don't emit the
> POISON.
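
To make the scoping rule quoted above concrete, here is a minimal sketch in
plain C.  The names struct point, sum_point and scoped_sum are made up for
illustration and are not part of the patch.  It only shows the well-defined
pattern: the compound literal is used while its enclosing block is still
live, and only a copied-out value survives the block.

```c
struct point { int x, y; };

/* Hypothetical helper, not from the patch: sums the two coordinates.  */
static int
sum_point (const struct point *p)
{
  return p->x + p->y;
}

int
scoped_sum (int a, int b)
{
  int result;
  {
    /* Inside a function body the compound literal has automatic storage
       duration associated with this enclosing block.  */
    const struct point *p = &(struct point) { a, b };
    result = sum_point (p);   /* Valid: the object is still alive.  */
  }
  /* The object is dead here; only the copied-out value may be used.  */
  return result;
}
```

Keeping a pointer to the literal and dereferencing it after the closing
brace would be the use-after-scope case that the POISON marker is meant to
catch.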

Here is the full patch that passed bootstrap/regtest on x86_64-linux and
i686-linux, ok for stage1?

The mark_addressable part is needed to fix
gcc.c-torture/compile/compound-literal-3.c, where we marked the underlying
automatic temporary of the compound literal as addressable too late and
the gimplifier already set DECL_GIMPLE_REG_P on it when seeing it in the
block scope.

2018-02-13  Jakub Jelinek  

PR sanitizer/84340
* c-decl.c (build_compound_literal): Call pushdecl (decl) even when
it is not TREE_STATIC.
* c-typeck.c (c_mark_addressable) : Mark
not just the COMPOUND_LITERAL_EXPR node itself addressable, but also
its COMPOUND_LITERAL_EXPR_DECL.

--- gcc/c/c-decl.c.jj   2018-02-13 21:21:35.447978191 +0100
+++ gcc/c/c-decl.c  2018-02-13 22:13:02.030148159 +0100
@@ -5348,6 +5348,8 @@ build_compound_literal (location_t loc,
   pushdecl (decl);
   rest_of_decl_compilation (decl, 1, 0);
 }
+  else
+pushdecl (decl);
 
   if (non_const)
 {
--- gcc/c/c-typeck.c.jj 2018-02-10 00:15:36.398163493 +0100
+++ gcc/c/c-typeck.c2018-02-13 22:24:06.294104876 +0100
@@ -4821,6 +4821,10 @@ c_mark_addressable (tree exp, bool array
break;
 
   case COMPOUND_LITERAL_EXPR:
+   TREE_ADDRESSABLE (x) = 1;
+   TREE_ADDRESSABLE (COMPOUND_LITERAL_EXPR_DECL (x)) = 1;
+   return true;
+
   case CONSTRUCTOR:
TREE_ADDRESSABLE (x) = 1;
return true;


Jakub


Re: [Patch, Fortran, F03] PR 84385: Reject invalid SELECT TYPE selector (allocate_with_source_22.f03)

2018-02-14 Thread Janus Weil
2018-02-14 22:30 GMT+01:00 Janus Weil :
> 2018-02-14 22:16 GMT+01:00 Steve Kargl :
>> On Wed, Feb 14, 2018 at 10:10:09PM +0100, Janus Weil wrote:
>>>
>>> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>>>
>>
>> Looks okay to me with two questions below.
>>
>>> Index: gcc/fortran/match.c
>>> ===
>>> --- gcc/fortran/match.c   (revision 257635)
>>> +++ gcc/fortran/match.c   (working copy)
>>> @@ -6201,9 +6201,10 @@ gfc_match_select_type (void)
>>>|| CLASS_DATA (expr1)->attr.codimension)
>>>&& expr1->ref
>>>&& expr1->ref->type == REF_ARRAY
>>> +  && expr1->ref->u.ar.type == AR_FULL
>>>&& expr1->ref->next == NULL);
>>>
>>> -  /* Check for F03:C811.  */
>>> +  /* Check for F03:C811 (F08:C835).  */
>>
>> Is there a testcase that causes gfortran to emit
>> an error message for violation of F03:C811?  If no,
>> could you commit one?
>
> Good point: Yes, there is such a test case, but it does not cover the
> case that is fixed with the patch. I have now added this case to
> select_type_1.f03, see updated patch in attachment.

I have just committed this updated patch as r257673. Thanks for the
review, Steve.

Cheers,
Janus


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-14 Thread Manuel López-Ibáñez
On 14 Feb 2018 8:16 pm, "Pedro Alves"  wrote:

Instead of a class that has to have a constructor for every type
you want to pass as plural selector to the _n functions, which
increases coupling, I'd suggest using a conversion function, and
overload that.  I.e., something like, in the core diagnostics code:

static inline unsigned HOST_WIDE_INT
as_plural_form (unsigned HOST_WIDE_INT val)
{
  return val;
}

/* In some tree-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (tree t)
{
   // extract & return a HWI
}

/* In some whatever-other-type-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (whatever_other_type v)
{
   // like above
}

and then you call error_n and other similar functions like this:

error_n (loc, u, "%u thing", "%u things", u);
error_n (loc, as_plural_form (u), "%u thing", "%u things", u);
error_n (loc, as_plural_form (t), "%E thing", "%E things", t);
error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);

This is similar in spirit to std::to_string, etc.


If that's desired, why not simply have GCC::to_uhwi()?  It would likely be
useful in other contexts.

Cheers,

Manuel.


Re: [Patch, Fortran, F03] PR 84385: Reject invalid SELECT TYPE selector (allocate_with_source_22.f03)

2018-02-14 Thread Janus Weil
2018-02-14 22:16 GMT+01:00 Steve Kargl :
> On Wed, Feb 14, 2018 at 10:10:09PM +0100, Janus Weil wrote:
>>
>> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>>
>
> Looks okay to me with two questions below.
>
>> Index: gcc/fortran/match.c
>> ===
>> --- gcc/fortran/match.c   (revision 257635)
>> +++ gcc/fortran/match.c   (working copy)
>> @@ -6201,9 +6201,10 @@ gfc_match_select_type (void)
>>|| CLASS_DATA (expr1)->attr.codimension)
>>&& expr1->ref
>>&& expr1->ref->type == REF_ARRAY
>> +  && expr1->ref->u.ar.type == AR_FULL
>>&& expr1->ref->next == NULL);
>>
>> -  /* Check for F03:C811.  */
>> +  /* Check for F03:C811 (F08:C835).  */
>
> Is there a testcase that causes gfortran to emit
> an error message for violation of F03:C811?  If no,
> could you commit one?

Good point: Yes, there is such a test case, but it does not cover the
case that is fixed with the patch. I have now added this case to
select_type_1.f03, see updated patch in attachment.

Cheers,
Janus
Index: gcc/fortran/match.c
===
--- gcc/fortran/match.c (revision 257671)
+++ gcc/fortran/match.c (working copy)
@@ -6201,9 +6201,10 @@ gfc_match_select_type (void)
 || CLASS_DATA (expr1)->attr.codimension)
 && expr1->ref
 && expr1->ref->type == REF_ARRAY
+&& expr1->ref->u.ar.type == AR_FULL
 && expr1->ref->next == NULL);
 
-  /* Check for F03:C811.  */
+  /* Check for F03:C811 (F08:C835).  */
   if (!expr2 && (expr1->expr_type != EXPR_VARIABLE
 || (!class_array && expr1->ref != NULL)))
 {
Index: gcc/testsuite/gfortran.dg/allocate_with_source_22.f03
===
--- gcc/testsuite/gfortran.dg/allocate_with_source_22.f03   (revision 257671)
+++ gcc/testsuite/gfortran.dg/allocate_with_source_22.f03   (working copy)
@@ -27,7 +27,7 @@ subroutine test_class()
   ! with -fcheck=bounds.
   if (size(b) /= 4) call abort()
   if (any(b(1:2)%i /= [ 1,2])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default
Index: gcc/testsuite/gfortran.dg/allocate_with_source_23.f03
===
--- gcc/testsuite/gfortran.dg/allocate_with_source_23.f03   (revision 257671)
+++ gcc/testsuite/gfortran.dg/allocate_with_source_23.f03   (working copy)
@@ -28,7 +28,7 @@ subroutine test_class_correct()
   allocate(b(1:4), source=a(1))
   if (size(b) /= 4) call abort()
   if (any(b(:)%i /= [ 1,1,1,1])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default
@@ -46,7 +46,7 @@ subroutine test_class_fail()
   allocate(b(1:4), source=a) ! Fail expected: sizes do not conform
   if (size(b) /= 4) call abort()
   if (any(b(1:2)%i /= [ 1,2])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default
Index: gcc/testsuite/gfortran.dg/select_type_1.f03
===
--- gcc/testsuite/gfortran.dg/select_type_1.f03 (revision 257671)
+++ gcc/testsuite/gfortran.dg/select_type_1.f03 (working copy)
@@ -23,6 +23,7 @@
   end type
 
   class(t1), pointer :: a => NULL()
+  class(t1), allocatable, dimension(:) :: ca
   type(t1), target :: b
   type(t2), target :: c
   a => b
@@ -32,6 +33,7 @@
 
   select type (3.5)  ! { dg-error "is not a named variable" }
   select type (a%cp) ! { dg-error "is not a named variable" }
+  select type (ca(1))! { dg-error "is not a named variable" }
   select type (b)! { dg-error "Selector shall be polymorphic" }
   end select
 


[committed, hppa] Fix loading of PIC labels

2018-02-14 Thread John David Anglin
The attached patch fixes the compilation of pr81687-2.c in the libgomp
testsuite.


Typically, when labels are "close" on hppa, it is best to load the address
of the label in PIC code using a pc-relative sequence.  The code generated
for pr81687-2.c is an example where the label and the referencing code are
in different spaces.  A pc-relative sequence doesn't work in this case, and
we are forced to load the address from the linkage table.
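
As a rough sketch of the selection logic described above (not the actual
pa.md code), the choice can be modelled in plain C.  The enum and function
names are made up for illustration; only the 8100 distance limit and the
three instruction sequences named in the comments are taken from the patch
below.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names for the three sequences the patch can emit.  */
enum pic_label_load
{
  LOAD_PCREL_SHORT,    /* ldo %1-%2(%0),%0 */
  LOAD_PCREL_LONG,     /* addil L%%%1-%2,%0 then ldo R%%%1-%2(%0),%0 */
  LOAD_LINKAGE_TABLE   /* addil LT%%%1,... then ldw or ldd RT%%%1(%0),%0 */
};

/* Pick a sequence.  IS_LOCAL_LABEL models "LABEL_REF and not nonlocal",
   ADDRESSES_KNOWN models INSN_ADDRESSES_SET_P (), and DISTANCE is the
   byte distance between the label and the referencing insn.  */
enum pic_label_load
choose_pic_label_load (bool is_local_label, bool addresses_known,
                       long distance)
{
  if (!is_local_label)
    return LOAD_LINKAGE_TABLE;
  if (addresses_known && labs (distance) < 8100)
    return LOAD_PCREL_SHORT;
  return LOAD_PCREL_LONG;
}
```

The point of the reordering is that the locality test now comes first, so
the pc-relative setup (mfia or bl .+8) is only emitted for labels that can
actually be reached that way.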


Tested on hppa64-hp-hpux11.11, hppa2.0w-hp-hpux11.11 and
hppa-unknown-linux-gnu with no observed regressions.

Committed to trunk.

Dave

--
John David Anglin  dave.ang...@bell.net

2018-02-14  John David Anglin  

PR target/83984
* config/pa/pa.md: Load address of PIC label using the linkage table
if the label is nonlocal.

Index: config/pa/pa.md
===
--- config/pa/pa.md (revision 257505)
+++ config/pa/pa.md (working copy)
@@ -2536,24 +2536,40 @@
 
   xoperands[0] = operands[0];
   xoperands[1] = operands[1];
-  xoperands[2] = gen_label_rtx ();
 
-  (*targetm.asm_out.internal_label) (asm_out_file, \"L\",
-CODE_LABEL_NUMBER (xoperands[2]));
-  output_asm_insn (\"mfia %0\", xoperands);
+  if (GET_CODE (operands[1]) == LABEL_REF
+  && !LABEL_REF_NONLOCAL_P (operands[1]))
+{
+  xoperands[2] = gen_label_rtx ();
+  (*targetm.asm_out.internal_label) (asm_out_file, \"L\",
+CODE_LABEL_NUMBER (xoperands[2]));
+  output_asm_insn (\"mfia %0\", xoperands);
 
-  /* If we're trying to load the address of a label that happens to be
- close, then we can use a shorter sequence.  */
-  if (GET_CODE (operands[1]) == LABEL_REF
-  && !LABEL_REF_NONLOCAL_P (operands[1])
-  && INSN_ADDRESSES_SET_P ()
-  && abs (INSN_ADDRESSES (INSN_UID (XEXP (operands[1], 0)))
-   - INSN_ADDRESSES (INSN_UID (insn))) < 8100)
-output_asm_insn (\"ldo %1-%2(%0),%0\", xoperands);
+  /* If we're trying to load the address of a label that happens to be
+close, then we can use a shorter sequence.  */
+  if (INSN_ADDRESSES_SET_P ()
+ && abs (INSN_ADDRESSES (INSN_UID (XEXP (operands[1], 0)))
+ - INSN_ADDRESSES (INSN_UID (insn))) < 8100)
+   output_asm_insn (\"ldo %1-%2(%0),%0\", xoperands);
+  else
+   {
+ output_asm_insn (\"addil L%%%1-%2,%0\", xoperands);
+ output_asm_insn (\"ldo R%%%1-%2(%0),%0\", xoperands);
+   }
+}
   else
 {
-  output_asm_insn (\"addil L%%%1-%2,%0\", xoperands);
-  output_asm_insn (\"ldo R%%%1-%2(%0),%0\", xoperands);
+  /* Load using linkage table.  */
+  if (TARGET_64BIT)
+   {
+ output_asm_insn (\"addil LT%%%1,%%r27\", xoperands);
+ output_asm_insn (\"ldd RT%%%1(%0),%0\", xoperands);
+   }
+  else
+   {
+ output_asm_insn (\"addil LT%%%1,%%r19\", xoperands);
+ output_asm_insn (\"ldw RT%%%1(%0),%0\", xoperands);
+   }
 }
   return \"\";
 }"
@@ -2570,25 +2586,33 @@
 
   xoperands[0] = operands[0];
   xoperands[1] = operands[1];
-  xoperands[2] = gen_label_rtx ();
 
-  output_asm_insn (\"bl .+8,%0\", xoperands);
-  output_asm_insn (\"depi 0,31,2,%0\", xoperands);
-  (*targetm.asm_out.internal_label) (asm_out_file, \"L\",
-CODE_LABEL_NUMBER (xoperands[2]));
+  if (GET_CODE (operands[1]) == LABEL_REF
+  && !LABEL_REF_NONLOCAL_P (operands[1]))
+{
+  xoperands[2] = gen_label_rtx ();
+  output_asm_insn (\"bl .+8,%0\", xoperands);
+  output_asm_insn (\"depi 0,31,2,%0\", xoperands);
+  (*targetm.asm_out.internal_label) (asm_out_file, \"L\",
+CODE_LABEL_NUMBER (xoperands[2]));
 
-  /* If we're trying to load the address of a label that happens to be
- close, then we can use a shorter sequence.  */
-  if (GET_CODE (operands[1]) == LABEL_REF
-  && !LABEL_REF_NONLOCAL_P (operands[1])
-  && INSN_ADDRESSES_SET_P ()
-  && abs (INSN_ADDRESSES (INSN_UID (XEXP (operands[1], 0)))
-   - INSN_ADDRESSES (INSN_UID (insn))) < 8100)
-output_asm_insn (\"ldo %1-%2(%0),%0\", xoperands);
+  /* If we're trying to load the address of a label that happens to be
+close, then we can use a shorter sequence.  */
+  if (INSN_ADDRESSES_SET_P ()
+ && abs (INSN_ADDRESSES (INSN_UID (XEXP (operands[1], 0)))
+ - INSN_ADDRESSES (INSN_UID (insn))) < 8100)
+   output_asm_insn (\"ldo %1-%2(%0),%0\", xoperands);
+  else
+   {
+ output_asm_insn (\"addil L%%%1-%2,%0\", xoperands);
+ output_asm_insn (\"ldo R%%%1-%2(%0),%0\", xoperands);
+   }
+}
   else
 {
-  output_asm_insn (\"addil L%%%1-%2,%0\", xoperands);
-  output_asm_insn (\"ldo R%%%1-%2(%0),%0\", xoperands);
+  /* Load using linkage table.  */
+  output_asm_insn 

[Patch] Minor GCC documentation correction for -Wformat-overflow

2018-02-14 Thread Indu Bhagat

In the section on "-Wformat-overflow=1", the following is stated:

void f (int a, int b)
{
  char buf [12];
  sprintf (buf, "a = %i, b = %i\n", a, b);
}

" Increasing the size of the buffer by a single byte is sufficient to avoid
the warning,"

[For the purpose of this warning, the size of an unknown int is 1 (to
represent 0); add 1 for the newline, 1 for the terminating null, and the
other characters in the format string, for a total of 14.]

The minimum increase, however, needs to be 2 bytes, i.e., a buf of size 14
is the minimum length for the warning in the example to go away.
So the correct statement should be:

" Increasing the size of the buffer by two bytes is sufficient to avoid the
warning,"

Alternatively, the size of buf can be bumped up to 13 in the sample code as done
in the patch below.
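
The byte count above can be cross-checked with a small C sketch.  The helper
name min_buf_size is made up for illustration; it relies only on the
standard behaviour of snprintf with a null buffer and size 0, which returns
the number of characters that would be written, excluding the terminating
null.

```c
#include <stdio.h>

/* Hypothetical helper: number of bytes sprintf needs for the example's
   format string with the given arguments, including the terminating
   null.  Returns 0 if snprintf reports an error.  */
size_t
min_buf_size (int a, int b)
{
  int n = snprintf (NULL, 0, "a = %i, b = %i\n", a, b);
  return n < 0 ? 0 : (size_t) n + 1;
}
```

With a and b both 0 (the shortest possible output, which is what level 1
of the warning assumes) this gives 14, matching the count above.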

Thanks
 
--

gcc/ChangeLog:


* doc/invoke.texi: Correction in -Wformat-overflow code sample.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 257646)
+++ gcc/doc/invoke.texi (working copy)
@@ -4184,7 +4184,7 @@
 @smallexample
 void f (int a, int b)
 @{
-  char buf [12];
+  char buf [13];
   sprintf (buf, "a = %i, b = %i\n", a, b);
 @}
 @end smallexample


Re: [Patch, Fortran, F03] PR 84385: Reject invalid SELECT TYPE selector (allocate_with_source_22.f03)

2018-02-14 Thread Steve Kargl
On Wed, Feb 14, 2018 at 10:10:09PM +0100, Janus Weil wrote:
> 
> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
> 

Looks okay to me with two questions below.

> Index: gcc/fortran/match.c
> ===
> --- gcc/fortran/match.c   (revision 257635)
> +++ gcc/fortran/match.c   (working copy)
> @@ -6201,9 +6201,10 @@ gfc_match_select_type (void)
>|| CLASS_DATA (expr1)->attr.codimension)
>&& expr1->ref
>&& expr1->ref->type == REF_ARRAY
> +  && expr1->ref->u.ar.type == AR_FULL
>&& expr1->ref->next == NULL);
>  
> -  /* Check for F03:C811.  */
> +  /* Check for F03:C811 (F08:C835).  */

Is there a testcase that causes gfortran to emit
an error message for violation of F03:C811?  If no,
could you commit one?

-- 
Steve


[Patch, Fortran, F03] PR 84385: Reject invalid SELECT TYPE selector (allocate_with_source_22.f03)

2018-02-14 Thread Janus Weil
Hi all,

here is another small patch that fixes two invalid test cases in the
test suite and also fixes the check that was supposed to reject these
cases (but failed).

Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus


2018-02-14  Janus Weil  

PR fortran/84385
* match.c (gfc_match_select_type): Fix check for selector in
SELECT TYPE statement.


2018-02-14  Janus Weil  

PR fortran/84385
* gfortran.dg/allocate_with_source_22.f03: Fix invalid test case.
* gfortran.dg/allocate_with_source_23.f03: Ditto.
Index: gcc/fortran/match.c
===
--- gcc/fortran/match.c (revision 257635)
+++ gcc/fortran/match.c (working copy)
@@ -6201,9 +6201,10 @@ gfc_match_select_type (void)
 || CLASS_DATA (expr1)->attr.codimension)
 && expr1->ref
 && expr1->ref->type == REF_ARRAY
+&& expr1->ref->u.ar.type == AR_FULL
 && expr1->ref->next == NULL);
 
-  /* Check for F03:C811.  */
+  /* Check for F03:C811 (F08:C835).  */
   if (!expr2 && (expr1->expr_type != EXPR_VARIABLE
 || (!class_array && expr1->ref != NULL)))
 {
Index: gcc/testsuite/gfortran.dg/allocate_with_source_22.f03
===
--- gcc/testsuite/gfortran.dg/allocate_with_source_22.f03   (revision 257635)
+++ gcc/testsuite/gfortran.dg/allocate_with_source_22.f03   (working copy)
@@ -27,7 +27,7 @@ subroutine test_class()
   ! with -fcheck=bounds.
   if (size(b) /= 4) call abort()
   if (any(b(1:2)%i /= [ 1,2])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default
Index: gcc/testsuite/gfortran.dg/allocate_with_source_23.f03
===
--- gcc/testsuite/gfortran.dg/allocate_with_source_23.f03   (revision 257635)
+++ gcc/testsuite/gfortran.dg/allocate_with_source_23.f03   (working copy)
@@ -28,7 +28,7 @@ subroutine test_class_correct()
   allocate(b(1:4), source=a(1))
   if (size(b) /= 4) call abort()
   if (any(b(:)%i /= [ 1,1,1,1])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default
@@ -46,7 +46,7 @@ subroutine test_class_fail()
   allocate(b(1:4), source=a) ! Fail expected: sizes do not conform
   if (size(b) /= 4) call abort()
   if (any(b(1:2)%i /= [ 1,2])) call abort()
-  select type (b(1))
+  select type (b1 => b(1))
 class is (tt)
   continue
 class default


Re: [PATCH rs6000] Fix for builtins-4-int128-runnable.c

2018-02-14 Thread Segher Boessenkool
Hi Carl,

On Wed, Feb 14, 2018 at 11:09:09AM -0800, Carl Love wrote:
> The following patch fixes an issue with the 128-bit test on
> Power 7.  Power 7 does not support int128 so the test must be
> restricted to run on Power 8 and later.

> 2018-02-14  Carl Love  

(blank line here)

> * gcc.target/powerpc/builtins-4-int128-runnable.c 
> (dg-require-effective-target):

Line too long.  You can put the ( at the start of a new line.

>   Change vsx_hw to p8vector_hw.
>   (dg-options): Change -maltivec -mvsx to -mpower8-vector.

Otherwise fine, thanks!  Okay for trunk.


Segher


[PATCH] Fix endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334, take 2)

2018-02-14 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 07:04:09PM +0100, Richard Biener wrote:
> On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek wrote:
> >On the following testcase, we recurse infinitely, because
> >we have float re-association enabled, but also rounding-math, so
> >we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
> >but (cst2 + cst3) doesn't simplify and we try again and optimize
> >it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
> >forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
> >if it is, the code just checks if we can actually simplify the
> >operation between cst2 and cst3 into a constant.
> 
> Is there a reason to try simplifying at all for constant @0?  I'd rather
> not try, to avoid all the complex code.

So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
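
For readers following along, the cycle described in the quoted text can be
modelled with a toy C sketch.  This is not match.pd code: struct sum3,
reassociate and steps_until_repeat are made-up names, and the model simply
assumes the inner sum never folds to a constant, which is the problematic
case under -frounding-math.

```c
#include <stdbool.h>

/* Toy model: the triple stands for the expression (x + y) + z, where
   all three operands are constants identified by small integers.  */
struct sum3 { int x, y, z; };

/* One re-association step: (x + y) + z becomes (y + z) + x.  */
static struct sum3
reassociate (struct sum3 s)
{
  struct sum3 r = { s.y, s.z, s.x };
  return r;
}

/* Count the rewrite steps until the starting form reappears, giving up
   after LIMIT steps (returns -1 then).  If PUNT_IF_CONSTANT is set, the
   rewrite is refused up front, as the patch does when @0 is constant,
   and 0 steps are taken.  */
int
steps_until_repeat (struct sum3 start, bool punt_if_constant, int limit)
{
  if (punt_if_constant)
    return 0;

  struct sum3 cur = start;
  for (int i = 1; i <= limit; i++)
    {
      cur = reassociate (cur);
      if (cur.x == start.x && cur.y == start.y && cur.z == start.z)
        return i;
    }
  return -1;
}
```

Without the guard the model returns to its starting form after three
rewrites, which is the loop the simplifier followed forever; with the guard
no rewrite happens at all.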

2018-02-14  Jakub Jelinek  

PR tree-optimization/84334
* match.pd ((A +- CST1) +- CST2 -> A + CST3): If A is
also a CONSTANT_CLASS_P, punt.

* gcc.dg/pr84334.c: New test.

--- gcc/match.pd.jj 2018-02-13 21:22:19.565979401 +0100
+++ gcc/match.pd2018-02-14 13:55:06.584668049 +0100
@@ -1733,9 +1733,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   CONSTANT_CLASS_P@2)
  /* If one of the types wraps, use that one.  */
  (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type))
-  (if (outer_op == PLUS_EXPR)
-   (plus (view_convert @0) (inner_op @2 (view_convert @1)))
-   (minus (view_convert @0) (neg_inner_op @2 (view_convert @1
+  /* If all 3 captures are CONSTANT_CLASS_P, punt, as we might recurse
+forever if something doesn't simplify into a constant.  */
+  (if (!CONSTANT_CLASS_P (@0))
+   (if (outer_op == PLUS_EXPR)
+   (plus (view_convert @0) (inner_op @2 (view_convert @1)))
+   (minus (view_convert @0) (neg_inner_op @2 (view_convert @1)
   (if (!ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
   || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
(if (outer_op == PLUS_EXPR)
--- gcc/testsuite/gcc.dg/pr84334.c.jj   2018-02-14 13:53:36.816683512 +0100
+++ gcc/testsuite/gcc.dg/pr84334.c  2018-02-14 13:53:36.815683512 +0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/84334 */
+/* { dg-do compile } */
+/* { dg-options "-Ofast -frounding-math" } */
+
+float
+foo (void)
+{
+  float a = 9.99974752427078783512115478515625e-7f;
+  float b = 1.4950485415756702423095703125e-6f;
+  float c = 4.99873689375817775726318359375e-6f;
+  return a + b + c;
+}


Jakub


[PATCH, rs6000, committed] Fix PR84390: test case gcc.target/powerpc/vsxcopy.c fails for gcc 7 and gcc 6 on power9

2018-02-14 Thread Peter Bergner
This had already been fixed on trunk and just needed back porting to the
release branches.  Committed as obvious.

Peter

PR target/84390
* gcc.target/powerpc/vsxcopy.c: Also match lxv when compiling
with -mcpu=power9.

Index: gcc/testsuite/gcc.target/powerpc/vsxcopy.c
===
--- gcc/testsuite/gcc.target/powerpc/vsxcopy.c  (revision 257670)
+++ gcc/testsuite/gcc.target/powerpc/vsxcopy.c  (revision 257671)
@@ -1,8 +1,8 @@
 /* { dg-do compile { target { powerpc64*-*-* } } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O1 -mvsx" } */
-/* { dg-final { scan-assembler "lxvd2x" } } */
-/* { dg-final { scan-assembler "stxvd2x" } } */
+/* { dg-final { scan-assembler {\mlxvd2x\M|\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxvd2x\M|\mstxv\M} } } */
 /* { dg-final { scan-assembler-not "xxpermdi" } } */
 
 typedef float vecf __attribute__ ((vector_size (16)));



Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-14 Thread Pedro Alves
On 02/13/2018 10:37 PM, Martin Sebor wrote:
> On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
>>
> Here's a sketch of what I tried to do:
> 
>   struct IntegerConverter
>   {
>     union {
>   tree t;
>   unsigned HOST_WIDE_INT hwi;
>   // buffer for offset_int, wide_int, etc.
>     } value;
> 
>     IntegerConverter (tree t)
>     {
>   value.t = t;
>     }
> 
>     IntegerConverter (unsigned HOST_WIDE_INT x)
>     {
>   value.x = x;
>     }
> 
>     ...
>   };
> 
>   void error_n (int, const IntegerConverter &, const char*, ...);
>   ...
> 
> With that, the call
> 
>   error_n (loc, t, "%E thing", "%E things", t);
> 
> works when t is a tree, and the call to the same function
> 
>   error_n (loc, i, "%wu thing", "%wu things", i);
> 
> also works when i is a HOST_WIDE_INT.  I chose this approach
> to avoid introducing additional overloads of the error_n()
> functions.

Instead of a class that has to have a constructor for every type
you want to pass as plural selector to the _n functions, which
increases coupling, I'd suggest using a conversion function, and
overload that.  I.e., something like, in the core diagnostics code:

static inline unsigned HOST_WIDE_INT
as_plural_form (unsigned HOST_WIDE_INT val)
{
  return val;
}

/* In some tree-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (tree t)
{
   // extract & return a HWI
}

/* In some whatever-other-type-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (whatever_other_type v)
{
   // like above
}

and then you call error_n and other similar functions like this:

error_n (loc, u, "%u thing", "%u things", u); 
error_n (loc, as_plural_form (u), "%u thing", "%u things", u); 
error_n (loc, as_plural_form (t), "%E thing", "%E things", t); 
error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);

This is similar in spirit to std::to_string, etc.
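
For illustration only, here is what the plural selector ends up doing once
it reaches an ngettext-style lookup.  This is a plain C sketch, not GCC
diagnostics code: pick_plural and format_things are made-up names, and
pick_plural mimics only the untranslated fallback (singular for n == 1,
plural otherwise), not the per-language rules a message catalog can supply.

```c
#include <stdio.h>

/* Hypothetical stand-in for the untranslated fallback of ngettext:
   the singular string when N is 1, the plural string otherwise.  */
const char *
pick_plural (const char *singular, const char *plural, unsigned long n)
{
  return n == 1 ? singular : plural;
}

/* Format "N thing" or "N things" into BUF, returning what snprintf
   returns (the length that was or would have been written).  */
int
format_things (char *buf, size_t size, unsigned long n)
{
  return snprintf (buf, size,
                   pick_plural ("%lu thing", "%lu things", n), n);
}
```

The selector only chooses the format string; the value printed is still
passed separately, which is why the calls above pass both as_plural_form (t)
and t.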

BTW, the "plural_form" naming above comes from ngettext's documentation
of the N parameter:

  char * ngettext (const char *msgid1, const char *msgid2, unsigned long int n);

  "The parameter n is used to determine the plural form."

Pedro Alves


Re: [PATCH rs6000] Fix for builtins-4-int128-runnable.c

2018-02-14 Thread Peter Bergner
On 2/14/18 1:09 PM, Carl Love wrote:
> The following patch fixes an issue with the 128-bit test on
> Power 7.  Power 7 does not support int128 so the test must be
> restricted to run on Power 8 and later.

To be pedantic, GCC POWER7 supports __int128_t just fine (in 64-bit mode).
It however does not support vector __int128_t which is what we're testing
here.

Peter



[PATCH, rs6000] Remove non-ABI builtin support for vec_insert4b, vec_extract4b

2018-02-14 Thread Carl Love
GCC maintainers:

This is the second patch that removes the non-ABI vec_insert4b and
vec_extract4b builtin support.  It also removes the two existing test
files for the non-ABI builtin instances.  A runnable test file for the
ABI specified builtins was added by the first patch.

This patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 8 LE)
  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.

Let me know if the patch looks OK or not. Thanks.

The patch should also be ported to GCC 7 so we are in compliance with
the ABI.

   Carl Love
---

gcc/ChangeLog:

2018-02-13  Carl Love  

* config/rs6000/altivec.h: Remove vec_vextract4b and vec_vinsert4b.
* config/rs6000/rs6000-builtin.def: Remove macro expansion for
VEXTRACT4B, VINSERT4B, VINSERT4B_DI and VEXTRACT4B.
* config/rs6000/rs6000.c: Remove case statements for
P9V_BUILTIN_VEXTRACT4B, P9V_BUILTIN_VEC_VEXTRACT4B,
P9V_BUILTIN_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
and P9V_BUILTIN_VEC_VINSERT4B.
* config/rs6000/rs6000-c.c (altivec_expand_builtin): Remove entries for
P9V_BUILTIN_VEC_VEXTRACT4B and P9V_BUILTIN_VEC_VINSERT4B.
* config/rs6000/vsx.md:
* doc/extend.texi: Remove vec_vextract4b, non ABI definitions for
vec_insert4b.

gcc/testsuite/ChangeLog:

2018-02-13  Carl Love  
* gcc.target/powerpc/p9-vinsert4b-1.c: Remove test file for non-ABI
tests.
* gcc.target/powerpc/p9-vinsert4b-2.c: Remove test file for non-ABI
tests.
---
 gcc/config/rs6000/altivec.h   |  2 -
 gcc/config/rs6000/rs6000-builtin.def  |  5 --
 gcc/config/rs6000/rs6000-c.c  | 25 ---
 gcc/config/rs6000/rs6000.c|  5 --
 gcc/config/rs6000/vsx.md  | 87 ---
 gcc/doc/extend.texi   | 11 +--
 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c | 39 --
 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c | 30 
 8 files changed, 1 insertion(+), 203 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 3bce2ae39..1e495e69c 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -433,8 +433,6 @@
 #define vec_vctzd __builtin_vec_vctzd
 #define vec_vctzh __builtin_vec_vctzh
 #define vec_vctzw __builtin_vec_vctzw
-#define vec_vextract4b __builtin_vec_vextract4b
-#define vec_vinsert4b __builtin_vec_vinsert4b
 #define vec_extract4b __builtin_vec_extract4b
 #define vec_insert4b __builtin_vec_insert4b
 #define vec_vprtyb __builtin_vec_vprtyb
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 420d12e29..16fb18d53 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2226,9 +2226,6 @@ BU_P9V_AV_2 (VEXTUWLX, "vextuwlx",CONST,  vextuwlx)
 BU_P9V_AV_2 (VEXTUWRX, "vextuwrx", CONST,  vextuwrx)
 
 /* Insert/extract 4 byte word into a vector.  */
-BU_P9V_VSX_2 (VEXTRACT4B,   "vextract4b",  CONST,  vextract4b)
-BU_P9V_VSX_3 (VINSERT4B,"vinsert4b",   CONST,  vinsert4b)
-BU_P9V_VSX_3 (VINSERT4B_DI, "vinsert4b_di",CONST,  vinsert4b_di)
 BU_P9V_VSX_3 (INSERT4B,"insert4b", CONST,  insert4b)
 BU_P9V_VSX_2 (EXTRACT4B,   "extract4b",CONST,  extract4b)
 
@@ -2292,13 +2289,11 @@ BU_P9V_OVERLOAD_2 (LXVL,"lxvl")
 BU_P9V_OVERLOAD_2 (XL_LEN_R,   "xl_len_r")
 BU_P9V_OVERLOAD_2 (VEXTULX,"vextulx")
 BU_P9V_OVERLOAD_2 (VEXTURX,"vexturx")
-BU_P9V_OVERLOAD_2 (VEXTRACT4B, "vextract4b")
 BU_P9V_OVERLOAD_2 (EXTRACT4B,  "extract4b")
 
 /* ISA 3.0 Vector scalar overloaded 3 argument functions */
 BU_P9V_OVERLOAD_3 (STXVL,  "stxvl")
 BU_P9V_OVERLOAD_3 (XST_LEN_R,  "xst_len_r")
-BU_P9V_OVERLOAD_3 (VINSERT4B,  "vinsert4b")
 BU_P9V_OVERLOAD_3 (INSERT4B,"insert4b")
 
 /* Overloaded CMPNE support was implemented prior to Power 9,
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 56e66db98..24675b12e 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -5429,10 +5429,6 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { P9V_BUILTIN_VEC_VCTZLSBB, P9V_BUILTIN_VCTZLSBB_V4SI,
 RS6000_BTI_INTSI, RS6000_BTI_V4SI, 0, 0 },
 
-  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
-RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
-  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
-RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
   { P9V_BUILTIN_VEC_EXTRACT4B, P9V_BUILTIN_EXTRACT4B,
 RS6000_BTI_unsigned_V2DI, 

[PATCH, rs6000] Add builtin support for vec_insert4b, vec_extract4b

2018-02-14 Thread Carl Love
GCC maintainers:

Per Segher's comments on the first version of the patch, I split the
patch into two.  The first patch (this one) adds the ABI specified
vec_insert4b and vec_extract4b builtins.  It adds a runnable file to test
the ABI specified builtin instances.  Note, the runnable test file does
not test for illegal argument values such as the const int second
argument > 12 or of the wrong type.

Note, the rtl for vec_insert4b in vsx.md is a copy of the vec_vinsert4b
code with the name changed.  The rtl for vec_extract4b is new.

The second patch removes all of the non-ABI builtin support.

Additionally, I have addressed the other comments from Segher with
regards to formatting issues and rtl register specification.

This patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 8 LE)
  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.

Let me know if the patch looks OK or not. Thanks.

The patch should also be ported to GCC 7 so we are in compliance with
the ABI.

   Carl Love

---

gcc/ChangeLog:

2018-02-13  Carl Love  

* config/rs6000/altivec.h: Add builtin names vec_extract4b
vec_insert4b.
* config/rs6000/rs6000-builtin.def: Add INSERT4B and EXTRACT4B
definitions.
* config/rs6000/rs6000-c.c: Add the definitions for
P9V_BUILTIN_VEC_EXTRACT4B and P9V_BUILTIN_VEC_INSERT4B.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add
P9V_BUILTIN_EXTRACT4B and P9V_BUILTIN_INSERT4B case statements.
* config/rs6000/vsx.md: Add define_insn extract4b.  Add define_expand
definition for insert4b and define insn *insert3b_internal.
* doc/extend.texi: Add documentation for vec_extract4b.

gcc/testsuite/ChangeLog:

2018-02-13  Carl Love  
* gcc.target/powerpc/builtins-7-p9-runnable.c: New runnable test file
for the ABI definitions for vec_extract4b and vec_insert4b.
---
 gcc/config/rs6000/altivec.h|   2 +
 gcc/config/rs6000/rs6000-builtin.def   |   4 +
 gcc/config/rs6000/rs6000-c.c   |   8 +
 gcc/config/rs6000/rs6000.c |   2 +
 gcc/config/rs6000/vsx.md   |  41 +
 gcc/doc/extend.texi|   7 +
 .../gcc.target/powerpc/builtins-7-p9-runnable.c| 169 +
 7 files changed, 233 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-7-p9-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 684cb1990..3bce2ae39 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -435,6 +435,8 @@
 #define vec_vctzw __builtin_vec_vctzw
 #define vec_vextract4b __builtin_vec_vextract4b
 #define vec_vinsert4b __builtin_vec_vinsert4b
+#define vec_extract4b __builtin_vec_extract4b
+#define vec_insert4b __builtin_vec_insert4b
 #define vec_vprtyb __builtin_vec_vprtyb
 #define vec_vprtybd __builtin_vec_vprtybd
 #define vec_vprtybw __builtin_vec_vprtybw
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 86604da46..420d12e29 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2229,6 +2229,8 @@ BU_P9V_AV_2 (VEXTUWRX, "vextuwrx",CONST,  vextuwrx)
 BU_P9V_VSX_2 (VEXTRACT4B,   "vextract4b",  CONST,  vextract4b)
 BU_P9V_VSX_3 (VINSERT4B,"vinsert4b",   CONST,  vinsert4b)
 BU_P9V_VSX_3 (VINSERT4B_DI, "vinsert4b_di",CONST,  vinsert4b_di)
+BU_P9V_VSX_3 (INSERT4B,"insert4b", CONST,  insert4b)
+BU_P9V_VSX_2 (EXTRACT4B,   "extract4b",CONST,  extract4b)
 
 /* Hardware IEEE 128-bit floating point round to odd instrucitons added in ISA
3.0 (power9).  */
@@ -2291,11 +2293,13 @@ BU_P9V_OVERLOAD_2 (XL_LEN_R,"xl_len_r")
 BU_P9V_OVERLOAD_2 (VEXTULX,"vextulx")
 BU_P9V_OVERLOAD_2 (VEXTURX,"vexturx")
 BU_P9V_OVERLOAD_2 (VEXTRACT4B, "vextract4b")
+BU_P9V_OVERLOAD_2 (EXTRACT4B,  "extract4b")
 
 /* ISA 3.0 Vector scalar overloaded 3 argument functions */
 BU_P9V_OVERLOAD_3 (STXVL,  "stxvl")
 BU_P9V_OVERLOAD_3 (XST_LEN_R,  "xst_len_r")
 BU_P9V_OVERLOAD_3 (VINSERT4B,  "vinsert4b")
+BU_P9V_OVERLOAD_3 (INSERT4B,"insert4b")
 
 /* Overloaded CMPNE support was implemented prior to Power 9,
so is not mentioned here.  */
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a68be511c..56e66db98 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -5433,6 +5433,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
   { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
 RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_VEC_EXTRACT4B, 

Re: patch for PR84359

2018-02-14 Thread Uros Bizjak
>  The following patch fixes
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84359
>
>  Committed as rev. 257628.
>
>
> Index: testsuite/gcc.target/i386/pr57193.c
> ===
> --- testsuite/gcc.target/i386/pr57193.c (revision 257537)
> +++ testsuite/gcc.target/i386/pr57193.c (working copy)
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target { ! ia32 } } } */
> -/* { dg-options "-O2" } */
> +/* { dg-options "-O2 -march=x86-64" } */
>  /* { dg-final { scan-assembler-times "movdqa" 2 } } */

The preferred way to stabilize assembler dumps in testcases that
depend on -march is to specify -mno-sseX manually, as in the attached
patch.

Also, the test will work on 32-bit x86 targets with -msse2, so the
patch also removes the target selector from the testcase.

Tested on x86_64-linux-gnu {,-m32}, will be committed to mainline ASAP.

Uros.

Index: gcc.target/i386/pr57193.c
===
--- gcc.target/i386/pr57193.c(revision 257659)
+++ gcc.target/i386/pr57193.c(working copy)
@@ -1,5 +1,5 @@
-/* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2 -march=x86-64" } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-sse3" } */
 /* { dg-final { scan-assembler-times "movdqa" 2 } } */

 #include 


[PATCH rs6000] Fix for builtins-4-int128-runnable.c

2018-02-14 Thread Carl Love
GCC maintainers:

The following patch fixes an issue with the 128-bit test on
Power 7.  Power 7 does not support int128, so the test must be
restricted to run on Power 8 and later.

The change was tested by hand on a Power 8 machine with the command:

make -k check-gcc RUNTESTFLAGS="--target_board=unix'{-mcpu=power7}' powerpc.exp=builtins-4-int128-runnable.c"

to verify the test passes cleanly with the change in the dg directives.

Please let me know if the patch is OK for trunk.  Thanks.

   Carl Love




2018-02-14  Carl Love  
* gcc.target/powerpc/builtins-4-int128-runnable.c 
(dg-require-effective-target):
Change vsx_hw to p8vector_hw.
(dg-options): Change -maltivec -mvsx to -mpower8-vector.
---
 gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
index 162e267..4f4e7a9 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-int128-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run } */
 /* { dg-require-effective-target int128 } */
-/* { dg-require-effective-target vsx_hw } */
-/* { dg-options "-maltivec -mvsx" } */
+/* { dg-require-effective-target p8vector_hw } */
+/* { dg-options "-mpower8-vector" } */
 
 #include 
 #include  // vector
-- 
2.7.4



Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-14 Thread Joseph Myers
On Wed, 14 Feb 2018, Martin Sebor wrote:

> I was also hoping to test it, either now if it's easy, or if
> it's complicated, sometime in the future but I couldn't find
> a .po file where it would make a difference.  I could have
> easily missed one but none of those I've looked at seems to do
> much with the plural forms where such large numbers could
> come up.  The strings are either all empty or all look
> the same.  Do you happen to know of one where it matters

There are cases where there are more than two translations, but I don't 
have an example where such large numbers could come up (which in any case 
requires a 32-bit host, to have an unsigned HOST_WIDE_INT value above 
ULONG_MAX).  (For example, the Russian translation of "  candidate expects 
%d argument, %d provided" appears to have three different translations, 
using "%d аргумент", "%d аргумента" and "%d аргументов".)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH][i386][3/3] PR target/84164: Make *cmpqi_ext_ patterns accept more zero_extract modes

2018-02-14 Thread Kyrill Tkachov


On 13/02/18 16:45, Jeff Law wrote:

On 02/09/2018 07:50 AM, Kyrill Tkachov wrote:

Hi Uros,

On 08/02/18 22:54, Uros Bizjak wrote:

On Thu, Feb 8, 2018 at 6:11 PM, Kyrill Tkachov wrote:

Hi all,

This patch fixes some fallout in the i386 testsuite that occurs after
the simplification in patch [1/3] [1].
The gcc.target/i386/extract-2.c FAILs because it expects to match:
(set (reg:CC 17 flags)
  (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98)
  (const_int 8 [0x8])
  (const_int 8 [0x8])) 0)
  (const_int 4 [0x4])))

which is the *cmpqi_ext_2 pattern in i386.md but with the new
simplification the combine/simplify-rtx machinery produces:
(set (reg:CC 17 flags)
  (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98)
  (const_int 8 [0x8])
  (const_int 8 [0x8])) 0)
  (const_int 4 [0x4])))

Notice that the zero_extract now has HImode like the register source
rather than SImode.
The existing *cmpqi_ext_ patterns however explicitly demand an SImode
on the zero_extract.
I'm not overly familiar with the i386 port but I think that's too
restrictive.
The RTL documentation says:
For (zero_extract:m loc size pos) "The mode m is the same as the mode
that would be used for loc if it were a register."
I'm not sure if that means that the mode of the zero_extract and the
source register must always match (as is the case after patch [1/3])
but in any case it shouldn't matter semantically since we're taking a
QImode subreg of the whole thing anyway.

So the proposed solution in this patch is to allow HI, SI and DImode
zero_extracts in these patterns as these are the modes that the
ext_register_operand predicate accepts, so that the patterns can match
the new form above.

With this patch the aforementioned test passes again and bootstrap and
testing on x86_64-unknown-linux-gnu shows no regressions.

Is this ok for trunk if the first patch is accepted?

Huh, there are many other zero-extract patterns besides cmpqi_ext_*
with QImode subreg of SImode zero_extract in i386.md, used to access
high QImode register of HImode pair. A quick grep shows these that
have _ext_ in their name:

(define_insn "*cmpqi_ext_1"
(define_insn "*cmpqi_ext_2"
(define_expand "cmpqi_ext_3"
(define_insn "*cmpqi_ext_3"
(define_insn "*cmpqi_ext_4"
(define_insn "addqi_ext_1"
(define_insn "*addqi_ext_2"
(define_expand "testqi_ext_1_ccno"
(define_insn "*testqi_ext_1"
(define_insn "*testqi_ext_2"
(define_insn_and_split "*testqi_ext_3"
(define_insn "andqi_ext_1"
(define_insn "*andqi_ext_1_cc"
(define_insn "*andqi_ext_2"
(define_insn "*qi_ext_1"
(define_insn "*qi_ext_2"
(define_expand "xorqi_ext_1_cc"
(define_insn "*xorqi_ext_1_cc"

There are also relevant splitters and peephole2 patterns.

I see. Another approach I've looked at is removing the mode specifier from
the zero_extract in these patterns. This means that they can be of any mode
so they will match all of these modes without creating new patterns through
iterators. That also works for the testcase and passes bootstrap and testing
however there is the snag that the define_insns that don't start with a "*"
are used to generate RTL through the gen_* mechanism and in that context
the absence of a mode on the zero_extract would mean a VOIDmode zero_extract
would be created, which I'm fairly sure is not good. So in my experiments
I left those patterns alone (with an explicit SI on the zero_extract).


IIRC, SImode zero_extract was enough to catch all high-register uses.
There will be a pattern explosion if we want to handle all other
integer modes here. However, I'm not a RTL expert, so someone will
have to say what is the correct RTX form here.

Jeff, Richard, could you please give us some guidance on this issue?
Sorry for the trouble.


I don't think any of the patterns above are known to the generic code.
So you just have to check the x86 backend to see their precise uses in a
generator (ie gen_cmpqi_ext_3) and verify those do not allow a VOIDmode
(or any other undesirable mode) to slip through.

Jeff


Thanks Jeff, I did have a look. I think we want to maintain the SImode on the
RTL that gets created through these named expanders, as generating a VOIDmode
zero_extract is not valid. So my patch leaves those intact.
The patch removes the mode from the zero_extract RTXes in patterns that are
only ever going to get matched (as opposed to generated). That is the ones that
start with "*" in their name.
This should allow them to match any of the zero_extract modes that
might get generated by the midend.

Bootstrapped and tested on x86_64-unknown-linux-gnu.
Is this a preferable approach?

Thanks,
Kyrill

2018-02-14  Kyrylo Tkachov  

PR target/84164
* config/i386/i386.md (*cmpqi_ext_1, *cmpqi_ext_2, *cmpqi_ext_3,
*cmpqi_ext_4, *extzvqi_mem_rex64, *extzvqi, QImode zero_extract
peephole, *addqi_ext_2, *testqi_ext_1, *testqi_ext_2, *andqi_ext_1_cc,

Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-14 Thread Martin Sebor

On 02/13/2018 02:05 PM, Joseph Myers wrote:

On Mon, 12 Feb 2018, Martin Sebor wrote:


Bug 84207 - Hard coded plural in gimple-fold.c points out one
of a number of warning_at() calls where warning_n() should have
been used.  The attached patch both replaces the calls and also
changes the signatures of the warning_n(), error_n(), and
inform_n() functions to take an unsigned HOST_WIDE_INT argument
instead of int.  I also changed the implementation of
diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
in excess of ULONG_MAX (the maximum value ngettext handles) so
callers don't need to.


Saturating to ULONG_MAX is not correct for languages where the plural form
depends on n%10 or n%100 (see the various Plural-Forms entries in the .po
files).  If n is too large you want something like n % 100 + 100
instead to get the correct plural form in all cases.
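
To illustrate the point about saturation, here is a minimal sketch; it
is not the code from the patch.  The names plural_index_ru,
reduce_for_plural and saturate_for_plural are hypothetical, the limit
parameter stands in for ULONG_MAX on a 32-bit host, and the Russian
rule is written out by hand from the usual Plural-Forms expression
rather than read from a .po file.

```cpp
#include <cassert>

// Russian plural rule as commonly written in Plural-Forms headers
// (nplurals=3): 0 for "1 аргумент", 1 for "2 аргумента",
// 2 for "5 аргументов".
inline unsigned plural_index_ru (unsigned long long n)
{
  if (n % 10 == 1 && n % 100 != 11)
    return 0;
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
    return 1;
  return 2;
}

// Hypothetical helper: map a count above LIMIT to a small value that
// keeps both n % 10 and n % 100, so rules based on them still pick the
// same form.  The result lies in [100, 199].
inline unsigned long long
reduce_for_plural (unsigned long long n, unsigned long long limit)
{
  assert (limit >= 199);   // the reduced value must itself fit
  return n <= limit ? n : n % 100 + 100;
}

// Saturating alternative, shown only for comparison.
inline unsigned long long
saturate_for_plural (unsigned long long n, unsigned long long limit)
{
  return n <= limit ? n : limit;
}
```

With a limit of 0xFFFFFFFF, the count 4294967301 ends in 01 and takes
the first form.  Reducing it gives 101, which selects the same form,
while saturating gives 4294967295, which selects the third.  Adding 100
also keeps the reduced value away from 1, which rules of the "n != 1"
kind treat specially.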


Thanks.  I've made that change in the attached patch.

I was also hoping to test it, either now if it's easy, or if
it's complicated, sometime in the future but I couldn't find
a .po file where it would make a difference.  I could have
easily missed one but none of those I've looked at seems to do
much with the plural forms where such large numbers could
come up.  The strings are either all empty or all look
the same.  Do you happen to know of one where it matters
and a suggestion for how to test it?  I suppose I could
create a dummy .po file with a non-trivial Plural-Forms but
then how would I plug it into GCC to verify (in an automated
test) that the right form is used?

Martin
PR translation/84207 - Hard coded plural in gimple-fold.c

gcc/ChangeLog:

	PR translation/84207
	* diagnostic-core.h (warning_n, error_n, inform_n): Change
	n argument to unsigned HOST_WIDE_INT.
	* diagnostic.c (warning_n, error_n, inform_n): Ditto.
	(diagnostic_n_impl): Ditto.  Handle arguments in excess of LONG_MAX.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Use warning_n.
	* gimple-ssa-sprintf.c (format_directive): Simplify inform_n call.

Index: gcc/diagnostic-core.h
===
--- gcc/diagnostic-core.h	(revision 257665)
+++ gcc/diagnostic-core.h	(working copy)
@@ -59,10 +59,11 @@ extern void internal_error_no_backtrace (const cha
  ATTRIBUTE_GCC_DIAG(1,2) ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the first parameter.  */
 extern bool warning (int, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern bool warning_n (location_t, int, int, const char *, const char *, ...)
+extern bool warning_n (location_t, int, unsigned HOST_WIDE_INT,
+		   const char *, const char *, ...)
 ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
-extern bool warning_n (rich_location *, int, int, const char *,
-		   const char *, ...)
+extern bool warning_n (rich_location *, int, unsigned HOST_WIDE_INT,
+		   const char *, const char *, ...)
 ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
 extern bool warning_at (location_t, int, const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,4);
@@ -69,7 +70,8 @@ extern bool warning_at (location_t, int, const cha
 extern bool warning_at (rich_location *, int, const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
-extern void error_n (location_t, int, const char *, const char *, ...)
+extern void error_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		 const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void error_at (rich_location *, const char *, ...)
@@ -87,7 +89,8 @@ extern bool permerror (rich_location *, const char
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform (rich_location *, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void inform_n (location_t, int, const char *, const char *, ...)
+extern void inform_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		  const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern bool emit_diagnostic (diagnostic_t, location_t, int,
Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c	(revision 257665)
+++ gcc/diagnostic.c	(working copy)
@@ -51,8 +51,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Prototypes.  */
 static bool diagnostic_impl (rich_location *, int, const char *,
 			 va_list *, diagnostic_t) ATTRIBUTE_GCC_DIAG(3,0);
-static bool diagnostic_n_impl (rich_location *, int, int, const char *,
-			   const char *, va_list *,
+static bool diagnostic_n_impl (rich_location *, int, unsigned HOST_WIDE_INT,
+			   const char *, const char *, va_list *,
 			   diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
 
 static void 

Patch ping

2018-02-14 Thread Jakub Jelinek
Hi!

I'd like to ping these patches:

PR84146 fix -fcompare-debug issues with -mcet -fcf-protection=full
  http://gcc.gnu.org/ml/gcc-patches/2018-02/msg00390.html

PR83708 __VA_OPT__ assorted fixes
  http://gcc.gnu.org/ml/gcc-patches/2018-01/msg00727.html

Thanks

Jakub


Re: [PATCH] jit: fix link on OS X and Solaris (PR jit/64089 and PR jit/84288)

2018-02-14 Thread FX
> Does this fix the jit linker issues on OS X and Solaris?

The patch fails to bootstrap on x86_64-apple-darwin17. gcc/config.log says:

gcc_cv_ld_version_script=no
ld_version_script_option='--version-script'

gcc/Makefile says:

LD_VERSION_SCRIPT_OPTION = --version-script
LD_SONAME_OPTION = -install_name

which makes it try to link with:

  -Wl,--version-script,../../trunk/gcc/jit/libgccjit.map \
  -Wl,-install_name,libgccjit.so.0

which fails with: ld: unknown option: --version-script

I think the patch to gcc/configure.ac should be:

+AC_MSG_CHECKING(linker --version-script option)
+gcc_cv_ld_version_script=no
+ld_version_script_option=''


FX

RFA: PATCH to build_type_attribute_qual_variant for c++/84314, ICE with fastcall

2018-02-14 Thread Jason Merrill
This testcase involves a fastcall-qualified function type.  During
mangling, we use build_type_attribute_qual_variant to look up an
attribute-unqualified version of that type.
build_type_attribute_qual_variant calls type_hash_canon and finds the
original unqualified type, but then clobbers its TYPE_CANONICAL
because it's incompatible with the fastcall-qualified type.

Fixed by leaving TYPE_CANONICAL of a previously existing type alone.
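
The idea can be modelled outside GCC with a small interning table.
This is only a sketch under stated assumptions: TypeNode, TypeTable and
build_variant are hypothetical stand-ins, a string key plays the role
of the structural hash, and a null canonical pointer plays the role of
"needs structural equality".  It is not the code in attribs.c.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct TypeNode
{
  std::string key;             // stands in for the structural hash key
  const TypeNode *canonical;   // stands in for TYPE_CANONICAL
};

class TypeTable
{
  std::map<std::string, std::unique_ptr<TypeNode>> table_;

public:
  // Return the node for KEY and whether it was created by this call.
  std::pair<TypeNode *, bool> intern (const std::string &key)
  {
    auto it = table_.find (key);
    if (it != table_.end ())
      return {it->second.get (), false};
    std::unique_ptr<TypeNode> node (new TypeNode{key, nullptr});
    TypeNode *raw = node.get ();
    raw->canonical = raw;      // a fresh node starts as its own canonical
    table_.emplace (key, std::move (node));
    return {raw, true};
  }
};

// Build (or look up) the variant KEY derived from FROM.  The canonical
// field is only written when the node was created by this call.
inline TypeNode *
build_variant (TypeTable &table, const std::string &key,
               const TypeNode &from, bool compatible)
{
  std::pair<TypeNode *, bool> result = table.intern (key);
  TypeNode *node = result.first;
  assert (node != nullptr);
  if (!result.second)
    return node;               // already in the table: leave it alone
  node->canonical = compatible ? from.canonical : nullptr;
  return node;
}
```

In this model, asking for the attribute-free variant of a
fastcall-qualified type returns the node that was already in the
table, and its canonical pointer is left exactly as it was.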

Tested x86_64-pc-linux-gnu.  OK for trunk?
commit dca04c7fb9d7002d342f6e5d47dfbe85569dbc5e
Author: Jason Merrill 
Date:   Tue Feb 13 15:15:26 2018 -0500

PR c++/84314 - ICE with templates and fastcall attribute.

* attribs.c (build_type_attribute_qual_variant): Don't clobber
TYPE_CANONICAL on an existing type.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 2cac9c403b4..d13a3d4b88b 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1127,19 +1127,29 @@ build_type_attribute_qual_variant (tree otype, tree attribute, int quals)
ttype = (lang_hooks.types.copy_lang_qualifiers
 (ttype, TYPE_MAIN_VARIANT (otype)));
 
-  ntype = build_distinct_type_copy (ttype);
+  tree dtype = ntype = build_distinct_type_copy (ttype);
 
   TYPE_ATTRIBUTES (ntype) = attribute;
 
   hashval_t hash = type_hash_canon_hash (ntype);
   ntype = type_hash_canon (hash, ntype);
 
-  /* If the target-dependent attributes make NTYPE different from
-its canonical type, we will need to use structural equality
-checks for this type.  */
-  if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
- || !comp_type_attributes (ntype, ttype))
-   SET_TYPE_STRUCTURAL_EQUALITY (ntype);
+  if (ntype != dtype)
+   /* This variant was already in the hash table, don't mess with
+  TYPE_CANONICAL.  */;
+  else if (TYPE_STRUCTURAL_EQUALITY_P (ttype)
+  || !comp_type_attributes (ntype, ttype))
+   {
+ /* If the target-dependent attributes make NTYPE different from
+its canonical type, we will need to use structural equality
+checks for this type.
+
+But make sure we don't get here for stripping attributes from a
+type; the no-attribute type might not need structural comparison,
+and it should have been in the hash table already.  */
+ gcc_assert (attribute);
+ SET_TYPE_STRUCTURAL_EQUALITY (ntype);
+   }
   else if (TYPE_CANONICAL (ntype) == ntype)
TYPE_CANONICAL (ntype) = TYPE_CANONICAL (ttype);
 
diff --git a/gcc/testsuite/g++.dg/ext/attrib55.C b/gcc/testsuite/g++.dg/ext/attrib55.C
new file mode 100644
index 000..dc0cdc48b7a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attrib55.C
@@ -0,0 +1,99 @@
+// PR c++/84314
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
+// { dg-additional-options "-w -std=c++11" }
+
+template  struct c { static constexpr a d = b; };
+template  using e = c;
+template  struct conditional;
+template  struct f;
+template 
+struct f : conditional::i {};
+template  struct j;
+template  struct j : conditional<1, h, g>::i {};
+template 
+struct j : conditional<1, j, g>::i {};
+struct aa : e {};
+template  struct m : c {};
+template  struct o {
+  template  static c p(int);
+  typedef decltype(p(0)) i;
+};
+template  struct ab : o::i {};
+template  struct s { typedef int ad; };
+template  struct q;
+template  struct q { typedef a i; };
+template  struct conditional { typedef ae i; };
+template  struct conditional {
+  typedef r i;
+};
+struct B {
+  B(int);
+};
+template  struct af;
+template 
+struct af : af<1, ah...>, B {
+  typedef af<1, ah...> ai;
+  ai al(af);
+  template  af(af p1) : ai(al(p1)), B(0) {}
+};
+template  struct af {};
+template  struct ap {
+  template  static constexpr bool ar() {
+return j...>::d;
+  }
+};
+template  class as : public af<0, ao...> {
+  typedef af<0, ao...> ai;
+
+public:
+  template  using au = ap::d, ao...>;
+  template ::template ar(), bool>::i = true>
+  as(as an) : ai(an) {}
+};
+template  as::ad...> ax(ao...);
+namespace ay {
+class az {};
+}
+using ay::az;
+namespace ay {
+template  struct C { typedef ba bc; };
+}
+template  class bd;
+template  using bj = f, ab>;
+template  class bd {
+  struct F : bj {};
+  template  using bm = typename q::i;
+
+public:
+  template , typename = bm>
+  bd(bg);
+  using bn = bf;
+  bn bo;
+};
+template 
+template 
+bd::bd(bg) {
+  bo;
+}
+typedef long long(__attribute__((fastcall)) bq)(int *);
+struct v : ay::C> {
+  bc bt() { 

update LTO test harness README

2018-02-14 Thread Martin Sebor

I was excited to find out about the recent enhancement to
the LTO test harness to support the new dg-lto-warning and
dg-lto-message directives (thanks, David).

To make them easier to find and use (there is a C++ LTO test
that uses them but no C tests yet) the attached patch updates
the README to document them.  While I was at it I made a few
minor cosmetic improvements to the README as well.

Let me know if I didn't get something quite right or if there
is something else that might be worth mentioning in the README.

Martin
gcc/testsuite/ChangeLog:

	* gcc.dg/lto/README (dg-lto-warning, dg-lto-message): Document new
	directives.	

Index: gcc/testsuite/gcc.dg/lto/README
===
--- gcc/testsuite/gcc.dg/lto/README	(revision 257664)
+++ gcc/testsuite/gcc.dg/lto/README	(working copy)
@@ -1,4 +1,20 @@
 This directory contains tests for link-time optimization (LTO).
+
+=== Directives ===
+
+The LTO harness recognizes the following special DejaGnu directives:
+ *  dg-lto-do - the equivalent of dg-do with a limited set of supported
+  arguments (see below),
+ *  dg-lto-options - the equivalent of dg-options with additional syntax
+  to support different sets of options for different files compiled
+  as part of the same test case,
+ *  dg-lto-warning - the equivalent of dg-warning for diagnostics expected
+  to be emitted at LTO link time,
+ *  dg-lto-message - the equivalent of dg-message for informational notes
+  expected to be emitted at LTO link time.
+
+=== Test Names ===
+
 Tests in this directory may span multiple files, so the naming of
 the files is significant.
 
@@ -9,8 +25,8 @@ executable.
 
 By default, each set of files will be compiled with list of
 options listed in LTO_OPTIONS (../../lib/lto.exp), which can be
-overwritten in the shell environment or using the 'dg-lto-options'
-command in the main file of the set (i.e., the file with _0
+overridden in the shell environment or using the 'dg-lto-options'
+directive in the main file of the set (i.e., the file with _0
 suffix).
 
 For example, given the files a_0.C a_1.C a_2.C, they will be
@@ -24,7 +40,9 @@ $ g++ -o  a_0.o a_1.o a_2.o
 Tests that do not need more than one file are a special case
 where there is a single file named 'foo_0.C'.
 
-The only supported dg-lto-do option are 'assemble', 'run' and 'link'.
+=== The dg-lto-do Directive ===
+
+The only supported dg-lto-do options are 'assemble', 'run' and 'link'.
 Additionally, these can only be used in the main file.  If
 'assemble' is used, only the individual object files are
 generated.  If 'link' is used, the final executable is generated


[PATCH] MPX and CET changes in release notes

2018-02-14 Thread Tsimbalist, Igor V
MPX is going to be deprecated in gcc-8, and control-flow protection
support is in gcc-8.
This patch reflects both in the release notes for gcc-8.

Ok for trunk?

Igor


Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.36
diff -r1.36 changes.html
34a35,38
>   
> The MPX extensions to the C and C++ languages have been deprecated and
> will be removed in a future release.
>   
43a48,56
>   
> A new option -fcf-protection=[full|branch|return|none] is
> introduced to perform a code instrumentation to increase program security 
> by
> checking that target addresses of control-flow transfer instructions 
> (such as
> indirect function call, function return, indirect jump) are valid. 
> Currently
> the instrumentation is supported on x86 GNU/Linux target only. See the 
> user
> guide for further information about the option syntax and section "New 
> Targets
> and Target Specific Improvements" for IA-32/x86-64 for more details.
>   
402a416,421
>   
> GCC now supports the Intel Control-flow Enforcement Technology (CET)
> extension through -mibt, -mshstk, -mcet options. One of these
> options has to accompany the -fcf-protection option to enable
> the code instrumentation for control-flow protection.
>   

Igor




Re: wwwdocs: An additional release note for powerpc for GCC 8

2018-02-14 Thread Segher Boessenkool
On Wed, Feb 14, 2018 at 09:06:55AM -0600, Kelvin Nilsen wrote:
> 
> Is this revision to the existing draft GCC 8 release notes ok for
> commit?  

That looks fine to me, thanks!


Segher


> --- htdocs/gcc-8/changes.html 12 Feb 2018 07:23:11 -  1.36
> +++ htdocs/gcc-8/changes.html 14 Feb 2018 14:58:56 -
> @@ -464,6 +464,11 @@ a work-in-progress.
>  powerpc-xilinx-eabi*)
>  is deprecated and will be removed in a future release.
>
> +  
> +Support for using big-endian AltiVec intrinsics on a little-endian target
> +(-maltivec=be) is deprecated and will be removed in a
> +future release.
> +  
>  
>  
>  PowerPC SPE


Re: [PATCH 0/5] Make std::future::wait_* use std::chrono::steady_clock when required

2018-02-14 Thread Mike Crowe
On Sunday 14 January 2018 at 20:44:10 +, Mike Crowe wrote:
> On Sunday 14 January 2018 at 16:08:09 +, Mike Crowe wrote:
> > Hi Torvald,
> > 
> > Thanks for reviewing this change.
> > 
> > On Saturday 13 January 2018 at 16:29:57 +0100, Torvald Riegel wrote:
> > > On Sun, 2018-01-07 at 20:55 +, Mike Crowe wrote:
> > > > This is a first attempt to make std::future::wait_until and
> > > > std::future::wait_for make correct use of
> > > > std::chrono::steady_clock/CLOCK_MONOTONIC. It also makes
> > > > std::future::wait_until react to changes to CLOCK_REALTIME during the
> > > > wait, but only when passed a std::chrono::system_clock time point.
> > > 
> > > I have comments on the design.
> > > 
> > > First, I don't think we should not change
> > > __atomic_futex_unsigned_base::_M_futex_wait_until, as there's a risk
> > > that we'll change behavior of existing applications that work as
> > > expected.
> > 
> > I assume you mean "I don't think we should change" or "I think we should
> > not change"... :-)
> > 
> > The only way I can see that behaviour will change for existing programs is
> > when the system clock changes (i.e. when someone calls settimeofday.) In
> > the existing code, the maximum wait time is fixed once gettimeofday is
> > called to calculate the relative timeout. When using FUTEX_CLOCK_REALTIME,
> > the maximum wait can change based on changes to the system clock after that
> > point. It appears that glibc made this transition successfully and
> > currently uses FUTEX_CLOCK_REALTIME. I think that the new behaviour is
> > better than the old behaviour.
> > 
> > Or perhaps I've missed another possibility. Did you have another risk in
> > mind?
> > 
> > > Instead, ISTM we should additionally expose the two options we have at
> > > the level of futexes:
> > > * Relative timeout using CLOCK_MONOTONIC
> > > * Absolute timeout using CLOCK_REALTIME (which will fall back to the
> > > former on old kernels, which is fine I think).
> > > 
> > > Then we do the following translations from functions that programs would
> > > call to the new futex functions:
> > > 
> > > 1) wait_for is a loop in which we load the current time from the steady
> > > clock, then call the relative futex wait, and if that returns for a
> > > spurious reason (ie, neither timeout nor is the expected value present),
> > > we reduce the prior relative amount by the difference between the time
> > > before the futex wait and the current time.
> > 
> > If we're going to loop on a relative timeout it sounds safer to convert it
> > to an absolute (steady clock) timeout. That way we won't risk increasing
> > the timeout if the scheduler decides not to run us at an inopportune moment
> > between waits. _M_load_when_equal_for already does this.
> > 
> > _M_load_and_test_until already has a loop for spurious wakeup. I think that
> > it makes sense to only loop at one level. That loop relies on the timeout
> > being absolute, which is why my _M_load_and_test_until_steady also uses an
> > absolute timeout.
> > 
> > > 2) wait_until using the steady clock is a loop similar to wait_for, just
> > > that we additionally compute the initial relative timeout.
> > 
> > Clearly an absolute wait can be implemented in terms of a relative one and
> > vice-versa, but at least in my attempts to write them I find the code
> > easier to understand (and therefore get right) if the fundamental wait is
> > the absolute one and the relative one is implemented on top of it.
> 
> I had a quick go at implementing at least the first part of your design, as
> I understood it. (I've kept the loops inside atomic_futex_unsigned - and I
> think that you wanted to move them out to the client code.) I've not tested
> it much.
> 
> I think that this implementation of _M_load_and_test_for is rather more
> error-prone than my previous _M_load_and_test_until_steady. That's probably
> partly because the type-safe duration has already been separated into seconds
> and nanoseconds. It would be nice to push this separation as deeply as
> possible in the code, but I'm afraid that would break ABI compatibility.
> 
> Thanks.
> 
> Mike.
> 
> --8<--
> 
> diff --git a/libstdc++-v3/include/bits/atomic_futex.h b/libstdc++-v3/include/bits/atomic_futex.h
> index ad9437da4e2..fa4a4382c79 100644
> --- a/libstdc++-v3/include/bits/atomic_futex.h
> +++ b/libstdc++-v3/include/bits/atomic_futex.h
> @@ -57,6 +57,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
>   chrono::seconds __s, chrono::nanoseconds __ns);
> 
> +// Returns false iff a timeout occurred.
> +bool
> +_M_futex_wait_for(unsigned *__addr, unsigned __val, bool __has_timeout,
> + chrono::seconds __s, chrono::nanoseconds __ns);
> +
>  // This can be executed after the object has been destroyed.
>  static void _M_futex_notify_all(unsigned* __addr);
>};
> @@ -110,6 +115,40 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   }
> 
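
The absolute-deadline approach argued for in the quoted discussion can
be sketched as follows.  This is only an illustration under stated
assumptions: wait_for_sketch, ready and wait_once are hypothetical
names, and wait_once stands in for any primitive that blocks for at
most the given time but may return early.  It is not the libstdc++
futex code.

```cpp
#include <cassert>
#include <chrono>

// Wait until READY returns true or REL has elapsed on the steady clock.
// Returns false iff a timeout occurred.
template <typename Pred, typename WaitOnce>
bool
wait_for_sketch (std::chrono::nanoseconds rel, Pred ready, WaitOnce wait_once)
{
  using clock = std::chrono::steady_clock;
  assert (rel >= std::chrono::nanoseconds::zero ());

  // Convert the relative timeout to an absolute deadline exactly once,
  // so that time lost between wake-ups cannot extend the total wait.
  const auto deadline = clock::now () + rel;

  while (!ready ())
    {
      const auto now = clock::now ();
      if (now >= deadline)
        return false;
      // Recompute the remaining time from the fixed deadline.
      wait_once (std::chrono::duration_cast<std::chrono::nanoseconds>
                 (deadline - now));
    }
  return true;
}
```

Because the remaining time is always derived from the fixed deadline,
a spurious wake-up followed by a scheduling delay shortens the next
wait instead of silently lengthening the overall timeout.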

[C++ PATCH]: instantiation via vtable marking

2018-02-14 Thread Nathan Sidwell
We had encountered a bogus warning about class visibility. Namely a 
lambda had captured a variable whose type was in the anonymous 
namespace.  The problem was that in_main_input_context was returning 
false (i.e. we're in a header file), as that was the location of the 
outermost template instantiation (yup, the lambda came out of a template 
instantiation).


The problem was the location was wrong -- it was pointing to some much 
earlier non-template class definition. (funnily enough, the class was to 
do with Futures, messing with my head about reaching into the future of 
the compilation).


The problem stemmed from c_parse_final_cleanups.  That consists of a 
do-until-no-more loop that does a bunch of things.  Early in the loop it 
calls maybe_emit_vtables and later in the loop it emits inline functions 
that need a body.  In this case, an earlier non-template class needed its 
virtual deleting dtor synthesized and emitted. That sets the 
input_location to that of the class, but doesn't immediately restore it.


Then the next iteration of the main loop in c_parse_final_cleanups 
discovered a new vtable needed emitting and calls mark_used on the 
vfuncs it points at.  That caused instantiation of the body of a vfunc 
within the template class whose vtable we were emitting.  We still had 
the stale input_location from the above synthesis.


The upshot of which is we think the cause of the template instantiation 
of the vfunc was at the other class definition, and as that was from a 
header file, don't think we're in the main input file.


I failed at reducing the testcase.

Anyway, the fix is to set input_location to something definite before 
calling mark_used in maybe_emit_vtables.  Candidates are:

1) locus_at_end_of_parsing
2) locus of the class owning the vtable
3) locus of the vfunc being marked.

I went with #3 in this patch.  Although #2 is attractive, the class 
definition could easily be in a header file, which would fail the 
in_main_input_context test.
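
To make the stale-location problem concrete, here is a minimal standalone sketch.  It is not GCC code: current_location, Decl, synthesize_dtor, mark_used_sketch and mark_vtable_entry_fixed are hypothetical stand-ins for input_location and the real front-end routines, and the file names are invented.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for GCC's global input_location.
static std::string current_location = "end-of-parsing";

struct Decl {
  std::string name;
  std::string location;  // where the declaration was written
};

// Notes of the form "<name> required from <location>", recorded with
// whatever the global location holds at the time of the call.
static std::vector<std::string> required_from;

void mark_used_sketch(const Decl& fn) {
  required_from.push_back(fn.name + " required from " + current_location);
}

// Synthesizing a member sets the global location and does not restore it,
// which mirrors the behaviour described in the message.
void synthesize_dtor(const Decl& cls) {
  current_location = cls.location;
}

// Option #3: set the location to that of the vfunc before marking it used.
void mark_vtable_entry_fixed(const Decl& fn) {
  current_location = fn.location;
  mark_used_sketch(fn);
}

// Runs the buggy sequence and then the fixed one; returns both notes.
std::vector<std::string> demo() {
  required_from.clear();
  Decl futures{"Future", "future.h:10"};
  Decl vfunc{"C<T>::bar", "main.cc:42"};

  synthesize_dtor(futures);             // leaves a stale location behind
  mark_used_sketch(vfunc);              // buggy: blames future.h:10
  mark_vtable_entry_fixed(vfunc);       // fixed: blames main.cc:42
  current_location = "end-of-parsing";  // restore after vtable emission
  return required_from;
}
```

The first note blames the unrelated header, the second points at the vfunc itself, which is the behaviour the patch aims for.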


This changes the location of the 'required-from' message in 
g++.dg/template/instantiate5.C.  It now points at the vfunc.  However, 
the current location is misleading -- although it's on the 'C c;' 
line, that's just happenstance as the last token in the file.  If we 
append some unrelated C++, we'll point to that.  Which isn't really helpful.


committing to trunk.

nathan

--
Nathan Sidwell
2018-02-14  Nathan Sidwell  

	gcc/cp/
	* decl2.c (mark_vtable_entries): Set input_location to decl's.
	(c_parse_final_cleanups): Restore input_location after emitting
	vtables.

	gcc/testsuite/
	* g++.dg/template/instantiate5.C: Adjust required-from loc.

Index: cp/decl2.c
===
--- cp/decl2.c	(revision 257657)
+++ cp/decl2.c	(working copy)
@@ -1825,6 +1825,11 @@ mark_vtable_entries (tree decl)
 	 function, so we emit the thunks there instead.  */
   if (DECL_THUNK_P (fn))
 	use_thunk (fn, /*emit_p=*/0);
+  /* Set the location, as marking the function could cause
+ instantiation.  We do not need to preserve the incoming
+ location, as we're called from c_parse_final_cleanups, which
+ takes care of that.  */
+  input_location = DECL_SOURCE_LOCATION (fn);
   mark_used (fn);
 }
 }
@@ -4727,6 +4732,9 @@ c_parse_final_cleanups (void)
 	reconsider = true;
 	keyed_classes->unordered_remove (i);
 	  }
+  /* The input_location may have been changed during marking of
+	 vtable entries.  */
+  input_location = locus_at_end_of_parsing;
 
   /* Write out needed type info variables.  We have to be careful
 	 looping through unemitted decls, because emit_tinfo_decl may
Index: testsuite/g++.dg/template/instantiate5.C
===
--- testsuite/g++.dg/template/instantiate5.C	(revision 257657)
+++ testsuite/g++.dg/template/instantiate5.C	(working copy)
@@ -18,7 +18,12 @@ struct B
 
 template  struct C
 {
-  virtual void bar() const { T::foo(); } // { dg-error "no matching function" }
+  virtual void bar() const	// { dg-message "required" }
+  {
+T::foo(); // { dg-error "no matching function" }
+  }
 };
 
-C c;// { dg-message "required" }
+C c;
+
+int k;


Re: [PATCH] __VA_OPT__ fixes (PR preprocessor/83063, PR preprocessor/83708)

2018-02-14 Thread Jason Merrill

On 01/10/2018 07:04 AM, Jakub Jelinek wrote:

I've also cross-checked the libcpp implementation with this patch against
trunk clang which apparently also implements __VA_OPT__ now, on the
testcases included here the output is the same and on their
macro_vaopt_expand.cpp testcase, if I remove all tests that test
#__VA_OPT__ ( contents ) handling which we just reject now, there are still
some differences:
$ /usr/src/llvm/obj8/bin/clang++  -E /tmp/macro_vaopt_expand.cpp -std=c++2a > /tmp/1
$ ~/src/gcc/obj20/gcc/cc1plus -quiet -E /tmp/macro_vaopt_expand.cpp -std=c++2a > /tmp/2
diff -up /tmp/1 /tmp/2
-4: f(0 )
+4: f(0)
-6: f(0, a )
-7: f(0, a )
+6: f(0, a)
+7: f(0, a)
-9: TONG C ( ) B ( ) "A()"
+9: HT_A() C ( ) B ( ) "A()"
-16: S foo ;
+16: S foo;
-26: B1
-26_1: B1
+26: B 1
+26_1: B 1
-27: B11
-27_1: BexpandedA0 11
-28: B11
+27: B 11
+27_1: BA0 11
+28: B 11

Perhaps some of the whitespace changes aren't significant, but 9:, and
2[678]{,_1}: are significantly different.
9: is
#define LPAREN (
#define A() B LPAREN )
#define B() C LPAREN )
#define HT_B() TONG
#define F(x, ...) HT_ ## __VA_OPT__(x x A()  #x)
9: F(A(),1)

Thoughts on what is right and why?


clang is.  First we substitute into the body of the __VA_OPT__, so

x x A() #x
B ( ) B ( ) A() "A()"

Then we paste, so

HT_B ( ) B ( ) A() "A()"

then rescan, so

TONG C ( ) B ( ) "A()"
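
A much smaller case shows the same substitute, paste, rescan order without touching the disputed corner cases.  This is only a sketch: HAS_ARGS, PASTE_FIRST and PRE_A are made-up names, and it assumes a compiler that accepts __VA_OPT__ (C++2a mode, or as an extension in earlier modes).

```cpp
// Yields 1 when at least one variadic argument is present, else 0.
#define HAS_ARGS(...) (0 __VA_OPT__(+ 1))

// The body of __VA_OPT__ is substituted first (x becomes A), then PRE_ is
// pasted onto its first token, and the result PRE_A is rescanned and
// expanded as a macro in its own right.
#define PRE_A 42
#define PASTE_FIRST(x, ...) PRE_ ## __VA_OPT__(x)

constexpr int no_args   = HAS_ARGS();         // (0)
constexpr int with_args = HAS_ARGS(a, b);     // (0 + 1)
constexpr int pasted    = PASTE_FIRST(A, 1);  // PRE_A, then 42
```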


Similarly for expansion on the last token from __VA_OPT__ when followed
by ##, like:
#define m1 (
#define f16() f17 m1 )
#define f17() f18 m1 )
#define f18() m2 m1 )
#define m3f17() g
#define f19(x, ...) m3 ## __VA_OPT__(x x f16() #x)
#define f20(x, ...) __VA_OPT__(x x)##m4()
#define f21() f17
#define f17m4() h
t25 f19 (f16 (), 1);
t26 f20 (f21 (), 2);

E.g. 26: is:
#define F(a,...)  __VA_OPT__(B a ## a) ## 1
26: F(,1)
I really wonder why clang emits B1 in that case, there
is no ## in between B and a, so those are 2 separate tokens
separated by whitespace, even when a ## a is a placemarker.
Does that mean __VA_OPT__ should throw away all the placemarkers
and return the last non-placemarker token for the ## handling?


This is unclear to me.  The standard says that the replacement for the 
__VA_OPT__ is the result of expansion before rescanning and further 
replacement, but it's unclear to me whether the discarding of 
placemarkers should happen or not, since that is mentioned in the 
context of rescanning.  Representing a placemarker as <> below, we first 
expand the __VA_OPT__ body to


B a ## a
B <> ## <>
B <>

then do we have

B <> ## 1
or
B ## 1
?

Looking at the patch now.

Jason


Re: [C++ Patch] PR 84350 ("[7/8 Regression] ICE with new and auto")

2018-02-14 Thread Jason Merrill
OK.

On Wed, Feb 14, 2018 at 9:53 AM, Paolo Carlini  wrote:
> Hi,
>
> today I spent some time on this: basing on r245826, when we started ICEing.
> For example I wondered if we wanted to rework the use of do_auto_deduction
> from build_new, and check CLASS_PLACEHOLDER_TEMPLATE (auto_node) and
> possibly directly call do_class_deduction when d_init stays NULL_TREE
> because vec_safe_length (*init) != 1 (
> https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/cp/init.c?r1=245826=245825=245826
> ). But that would require a non-static do_class_deduction and an additional
> function call from build_new, not at all sure it's worth it. Thus I'm just
> proposing the below, restoring the old diagnostic and avoiding the ICE.
>
> Thanks, Paolo.
>
> 
>


Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-14 Thread Janus Weil
>>> Adding ! { dg-additional-options "-std=f2003" }
>>> doesn't work, because the test uses
>>>   call abort
>
> I actually think we should get rid of such extensions in the
> testsuite, where possible. This particular one is used all over the
> place, but could be easily replaced by something like "stop 1", which
> is standard Fortran.

Just opened a PR for this:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84381


>>> which is a GNU extension and I have no idea how to choose allow_std
>>> which includes GNU but doesn't include F2008.
>>
>> ! { dg-additional-options "-std=f2003 -fdec" }
>>
>> seems to work (because -std=f2003 sets
>>   gfc_option.allow_std = GFC_STD_F95_OBS | GFC_STD_F77
>> | GFC_STD_F2003 | GFC_STD_F95 | GFC_STD_F2008_OBS;
>> and -fdec adds:
>>   gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
>> | GFC_STD_GNU | GFC_STD_LEGACY;
>> ), but it is quite nasty.  Isn't there a better way?
>
> Yes, there is "-std=f2003 -fall-intrinsics", which is a little better
> at least. It's what I did with proc_ptr_common1.f90 as well ...
>
> https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/testsuite/gfortran.dg/proc_ptr_common_1.f90?r1=257636=257635=257636

Thanks for taking care of fixing this so quickly, Jakub (and for
noticing it in the first place)!
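
For readers not familiar with allow_std: the combination quoted above is plain bit-mask arithmetic.  A rough sketch follows; the bit values are invented for illustration and are not the real GFC_STD_* constants, and allows() is a hypothetical helper.

```cpp
#include <cstdint>

// Hypothetical bit values; the real GFC_STD_* constants are defined in
// GCC's Fortran sources and are not reproduced here.
enum : std::uint32_t {
  STD_F77       = 1u << 0,
  STD_F95       = 1u << 1,
  STD_F95_OBS   = 1u << 2,
  STD_F95_DEL   = 1u << 3,
  STD_F2003     = 1u << 4,
  STD_F2008     = 1u << 5,
  STD_F2008_OBS = 1u << 6,
  STD_GNU       = 1u << 7,
  STD_LEGACY    = 1u << 8
};

// What -std=f2003 sets, following the quoted message.
constexpr std::uint32_t std_f2003 =
    STD_F95_OBS | STD_F77 | STD_F2003 | STD_F95 | STD_F2008_OBS;

// What -fdec ORs in on top, following the quoted message.
constexpr std::uint32_t fdec_adds =
    STD_F95_OBS | STD_F95_DEL | STD_GNU | STD_LEGACY;

constexpr std::uint32_t f2003_plus_dec = std_f2003 | fdec_adds;

// True if any bit of 'feature' is allowed by 'mask'.
constexpr bool allows(std::uint32_t mask, std::uint32_t feature) {
  return (mask & feature) != 0;
}
```

With these masks, allows(f2003_plus_dec, STD_GNU) holds while allows(f2003_plus_dec, STD_F2008) does not, which is the "GNU but not F2008" combination the test needs.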


>> Kind of like -std=gnu++17 vs. -std=c++17, where the latter is standard
>> and the former is standard + GNU extensions (which would roughly be
>> "| GFC_STD_GNU | GFC_STD_LEGACY" in the fortran world).
>
> Huh, I suppose it would be nice to have options like -std=gnu2003 and
> -std=gnu2008, in analogy to those C++ options ...

This is now:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84382

Cheers,
Janus


Re: [PATCH, rs6000] (v2) PR84220 remove RS6000_BTI_NOT_OPAQUE refs from builtins table

2018-02-14 Thread Will Schmidt
On Wed, 2018-02-14 at 04:53 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Feb 13, 2018 at 05:40:08PM -0600, Will Schmidt wrote:
> >   Some of our builtin definitions were allowing invalid parameters, and a
> > subsequent ICE (on invalid code) was the result.  This is due to the use of
> > RS6000_BTI_NOT_OPAQUE (which allowed vector arguments), where
> > RS6000_BTI_INTSI appears to be a more appropriate choice.
> > This change adjusts the definitions for the VEC_SLD, VEC_SLDW, vec_XXSLDWI
> > and VEC_XXPERMDI entries.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr84220-xxperm.c
> > @@ -0,0 +1,100 @@
> > +/* PR target/84220 */
> > +/* Test to ensure we generate invalid parameter errors rather than an ICE
> > +when calling vec_xxpermdi() with invalid parameters.  */
> > +/* { dg-do compile { target { powerpc64*-*-* } } } */
> > +/* { dg-require-effective-target powerpc_vsx_ok } */
> > +/* { dg-options "-O2 -mvsx" } */
> 
> Does this test need powerpc64*?  Or does it need lp64 instead, or nothing?

Good catch.  :-)   slightly more coverage when removing the target.
Changed it to just a dg-do compile stanza to match the other tests.

Thanks,
-Will

> Looks good, please look at that detail again; okay for trunk.  Thanks!
> 
> 
> Segher
> 




Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-14 Thread Jason Merrill
On Wed, Feb 14, 2018 at 10:18 AM, Martin Sebor  wrote:
> On 02/13/2018 10:28 PM, Jason Merrill wrote:
>> On Tue, Feb 13, 2018 at 5:20 PM, Martin Sebor  wrote:
>>> On 02/13/2018 01:09 PM, Jason Merrill wrote:
>>>> On Tue, Feb 13, 2018 at 2:59 PM, Martin Sebor  wrote:
>>>>> On 02/13/2018 12:15 PM, Jason Merrill wrote:
>>>>>> On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:
>>>>>>> On 02/13/2018 09:24 AM, Martin Sebor wrote:
>>>>>>>> On 02/13/2018 08:35 AM, Martin Sebor wrote:
>>>>>>>>> On 02/13/2018 07:40 AM, Jason Merrill wrote:
>>>>>>>>>> On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor  wrote:
>>>>>>>>>>>
>>>>>>>>>>> While testing my fix for 83871 (handling attributes on explicit
>>>>>>>>>>> specializations) I noticed another old regression: while GCC 4.4
>>>>>>>>>>> would diagnose declarations of explicit specializations of all
>>>>>>>>>>> primary templates declared deprecated, GCC 4.5 and later only
>>>>>>>>>>> diagnose declarations of explicit specializations of class
>>>>>>>>>>> templates but not those of function or variable templates.
>>>>>>>>>>
>>>>>>>>>> Hmm, the discussion on the core reflector seemed to be agreeing that
>>>>>>>>>> we want to be able to define non-deprecated specializations of a
>>>>>>>>>> deprecated primary template.
>>>>>>>>>
>>>>>>>>> Yes, that's what Richard wanted to do.  The only way to do it
>>>>>>>>> within the existing constraints(*) is to define a non-deprecated
>>>>>>>>> primary, and a deprecated partial specialization.  This is in line
>>>>>>>>> with that approach and supported by Clang and all other compilers
>>>>>>>>> I tested (including Clang).
>>>>>>>>
>>>>>>>> To clarify, this approach works for class templates (e.g., like
>>>>>>>> std::numeric_limits that was mentioned in the core discussion)
>>>>>>>> and for variable templates.  Functions have no partial
>>>>>>>> specializations so they have to be overloaded to achieve the same
>>>>>>>> effect.
>>>>>>>>
>>>>>>>> Implementations don't treat the deprecated attribute on partial
>>>>>>>> specializations consistently.
>>>>>>>>
>>>>>>>> EDG accepts and honors it on class template partial specializations
>>>>>>>> but rejects it with an error on those of variables.
>>>>>>>>
>>>>>>>> Clang accepts but silently ignores it on class template partial
>>>>>>>> specializations and rejects it with an error on variables.
>>>>>>>>
>>>>>>>> MSVC accepts and honors it on variables but silently ignores it
>>>>>>>> on class template partial specializations.
>>>>>>>>
>>>>>>>> GCC ignores it silently on class partial specializations and
>>>>>>>> with a warning on variables (I opened bug 84347 to track this
>>>>>>>> and to have GCC honor it everywhere).
>>>>>>>>
>>>>>>>> This is clearly a mess, which isn't surprising given how poorly
>>>>>>>> specified this is in the standard.  But from the test cases and
>>>>>>>> from the core discussion it seems clear that deprecating
>>>>>>>> a template, including its partial specializations (as opposed
>>>>>>>> to just a single explicit specialization) is desirable and
>>>>>>>> already supported, and that the wording in the standard just
>>>>>>>> needs to be adjusted to reflect that.
>>>>>>>>
>>>>>>>>> [*] Except (as Richard noted) that the standard doesn't seem to
>>>>>>>>> allow a template to be deprecated.  I think that's a bug in the
>>>>>>>>> spec because all implementations allow it to some degree.
>>>>>>>
>>>>>>> One other note.  While thinking about this problem during
>>>>>>> the core discussion, another approach to deprecating a primary
>>>>>>> template without also deprecating all of its specializations
>>>>>>> occurred to me.
>>>>>>>
>>>>>>> 1) First declare the primary template without [[deprecated]].
>>>>>>> 2) Next declare its non-deprecated specializations (partial
>>>>>>>    or explicit).
>>>>>>> 3) Finally declare the primary again, this time [[deprecated]].
>>>>>>>
>>>>>>> Like this:
>>>>>>>
>>>>>>>   template  structS;
>>>>>>>   template  structS { };
>>>>>>>   template  struct [[deprecated]] S { };
>>>>>>>   template  struct [[deprecated]] S { };
>>>>>>>
>>>>>>>   S si; // warning
>>>>>>>   S sci;  // no warning
>>>>>>>   S svi;   // warning
>>>>>>>
>>>>>>> This works as expected with Intel ICC.  All other compilers
>>>>>>> diagnose all three variables.  I'd say for [[deprecated]] it
>>>>>>> should work the way ICC does.  (For [[noreturn]] the first
>>>>>>> declaration must be [[noreturn]], so there this solution
>>>>>>> wouldn't work also because of that, in addition to function
>>>>>>> templates not being partially-specializable.)
>>>>>>
>>>>>> My understanding of the reflector discussion, and Richard's comment in
>>>>>> particular, was that [[deprecated]] should apply to the instances, not
>>>>>> the template itself, so that declaring the primary template
>>>>>> [[deprecated]] doesn't affect explicit specializations.  Your 

Re: [PATCH, rs6000] Fix PR84279, powerpc64le ICE on cvc4

2018-02-14 Thread Peter Bergner
On 2/13/18 5:51 PM, Segher Boessenkool wrote:
> We can backport without having a failing testcase.  Let's do that for 7
> at least?

Ok, the backport tested clean, so I committed it.  Thanks.

Peter




Re: [PATCH] diagnose specializations of deprecated templates (PR c++/84318)

2018-02-14 Thread Martin Sebor

On 02/13/2018 10:28 PM, Jason Merrill wrote:

On Tue, Feb 13, 2018 at 5:20 PM, Martin Sebor  wrote:

On 02/13/2018 01:09 PM, Jason Merrill wrote:


On Tue, Feb 13, 2018 at 2:59 PM, Martin Sebor  wrote:


On 02/13/2018 12:15 PM, Jason Merrill wrote:


On Tue, Feb 13, 2018 at 1:31 PM, Martin Sebor  wrote:


On 02/13/2018 09:24 AM, Martin Sebor wrote:




On 02/13/2018 08:35 AM, Martin Sebor wrote:




On 02/13/2018 07:40 AM, Jason Merrill wrote:




On Mon, Feb 12, 2018 at 6:32 PM, Martin Sebor 
wrote:




While testing my fix for 83871 (handling attributes on explicit
specializations) I noticed another old regression: while GCC 4.4
would diagnose declarations of explicit specializations of all
primary templates declared deprecated, GCC 4.5 and later only
diagnose declarations of explicit specializations of class
templates but not those of function or variable templates.





Hmm, the discussion on the core reflector seemed to be agreeing that
we want to be able to define non-deprecated specializations of a
deprecated primary template.





Yes, that's what Richard wanted to do.  The only way to do it
within the existing constraints(*) is to define a non-deprecated
primary, and a deprecated partial specialization.  This is in line
with that approach and supported by Clang and all other compilers
I tested (including Clang).





To clarify, this approach works for class templates (e.g., like
std::numeric_limits that was mentioned in the core discussion)
and for variable templates.  Functions have no partial
specializations so they have to be overloaded to achieve the same
effect.

Implementations don't treat the deprecated attribute on partial
specializations consistently.

EDG accepts and honors it on class template partial specializations
but rejects it with an error on those of variables.

Clang accepts but silently ignores it on class template partial
specializations and rejects it with an error on variables.

MSVC accepts and honors it on variables but silently ignores it
on class template partial specializations.

GCC ignores it silently on class partial specializations and
with a warning on variables (I opened bug 84347 to track this
and to have GCC honor it everywhere).

This is clearly a mess, which isn't surprising given how poorly
specified this is in the standard.  But from the test cases and
from the core discussion it seems clear that deprecating
a template, including its partial specializations (as opposed
to just a single explicit specialization) is desirable and
already supported, and that the wording in the standard just
needs to be adjusted to reflect that.



Martin

[*] Except (as Richard noted) that the standard doesn't seem to
allow a template to be deprecated.  I think that's a bug in the
spec because all implementations allow it to some degree.





One other note.  While thinking about this problem during
the core discussion, another approach to deprecating a primary
template without also deprecating all of its specializations
occurred to me.

1) First declare the primary template without [[deprecated]].
2) Next declare its non-deprecated specializations (partial
   or explicit).
3) Finally declare the primary again, this time [[deprecated]].

Like this:

  template  structS;
  template  structS { };
  template  struct [[deprecated]] S { };
  template  struct [[deprecated]] S { };

  S si; // warning
  S sci;  // no warning
  S svi;   // warning

This works as expected with Intel ICC.  All other compilers
diagnose all three variables.  I'd say for [[deprecated]] it
should work the way ICC does.  (For [[noreturn]] the first
declaration must be [[noreturn]], so there this solution
wouldn't work also because of that, in addition to function
templates not being partially-specializable.)
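
The archive has dropped the template argument lists from the example above.  A compilable reading of it, assuming (from the variable names si, sci and svi) that the lost lists were <class T>, <const T>, <volatile T>, <int>, <const int> and <volatile int>, would be the sketch below.  That assignment is a guess, not the original text.

```cpp
// 1) Primary template, declared without [[deprecated]].
template <class T> struct S;

// 2) A non-deprecated partial specialization ...
template <class T> struct S<const T> { };

// ... and a deprecated one.
template <class T> struct [[deprecated]] S<volatile T> { };

// 3) The primary again, now defined and marked [[deprecated]].
template <class T> struct [[deprecated]] S { };

S<int>          si;   // intended: warning (deprecated primary)
S<const int>    sci;  // intended: no warning
S<volatile int> svi;  // intended: warning (deprecated specialization)
```

Which of the three declarations actually draws a warning depends on the compiler, as the message says; the comments show the intended outcome only.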




My understanding of the reflector discussion, and Richard's comment in
particular, was that [[deprecated]] should apply to the instances, not
the template itself, so that declaring the primary template
[[deprecated]] doesn't affect explicit specializations.  Your last
example should work as you expect in this model, but you can also
write the simpler

template  struct [[deprecated]] S { };
template  struct S { }; // no warning




With this approach there would be no way to deprecate all of
a template's specializations because it would always be
possible for a user to get around deprecation by defining
their own specialization, partial or explicit.



Yep.  And so he suggested that we might want to add a new way to write
attributes that do apply to the template name.



[[deprecated]] was introduced in part to make it possible for
C++ standard library implementers to add warnings for stuff
the committee has deprecated.  Most C++ deprecated features
are templates.  Declaring that [[deprecated]] isn't meant to
serve its purpose for templates and that some new form of
it is needed 

wwwdocs: An additional release note for powerpc for GCC 8

2018-02-14 Thread Kelvin Nilsen

Is this revision to the existing draft GCC 8 release notes ok for
commit?  

Thanks

? cvs.diffs
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.36
diff -u -3 -p -r1.36 changes.html
--- htdocs/gcc-8/changes.html   12 Feb 2018 07:23:11 -  1.36
+++ htdocs/gcc-8/changes.html   14 Feb 2018 14:58:56 -
@@ -464,6 +464,11 @@ a work-in-progress.
 powerpc-xilinx-eabi*)
 is deprecated and will be removed in a future release.
   
+  
+Support for using big-endian AltiVec intrinsics on a little-endian target
+(-maltivec=be) is deprecated and will be removed in a
+future release.
+  
 
 
 PowerPC SPE



Re: [PATCH, rs6000] PR84220 fix altivec_vec_sld and vec_sldw intrinsic definitions

2018-02-14 Thread Bill Schmidt

> On Feb 13, 2018, at 5:28 PM, Will Schmidt  wrote:
> 
> On Thu, 2018-02-08 at 17:48 -0600, Segher Boessenkool wrote:
>> Hi!
>> 
>> On Wed, Feb 07, 2018 at 09:14:59AM -0600, Will Schmidt wrote:
>>>  Our VEC_SLD definitions were mistakenly allowing the third argument to be
>>> of an invalid type, triggering an ICE (on invalid code) later in the build
>>> process.  This fixes those definitions.  The nearby VEC_SLDW definitions 
>>> have
>>> the same issue, those have been fixed as part of this patch too.
>>> Testcases have been added to ensure we generate the 'invalid intrinsic'
>>> message as is appropriate, instead of ICEing.
>>> Giving proper credit, this was found by Peter Bergner while working a
>>> different issue. :-)
>>> 
>>> Sniff-tests passed on P8.  Doing larger reg-test across power systems now.
>>> OK for trunk?
>>> And,.. do we want this one backported too?
>> 
>>> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
>>> index a68be51..26f9990 100644
>>> --- a/gcc/config/rs6000/rs6000-c.c
>>> +++ b/gcc/config/rs6000/rs6000-c.c
>>> @@ -3654,39 +3654,39 @@ const struct altivec_builtin_types 
>>> altivec_overloaded_builtins[] = {
>>>   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
>>> RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
>>> RS6000_BTI_bool_V16QI },
>>>   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
>>> RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 
>>> RS6000_BTI_unsigned_V16QI },
>>>   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_4SF,
>>> -RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 
>>> RS6000_BTI_NOT_OPAQUE },
>>> +RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_INTSI },
>> 
>> It isn't clear to me what RS6000_BTI_NOT_OPAQUE means...  rs6000-c.c says:
>> 
>>/* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
>>   the opX fields.  */
>> 
>> (whatever that means!), and the following code seems to allow anything in
>> such args?  If you understand it, please update some comments somewhere?

The intent is that each entry in the table has a maximum of one result type
and three argument types.  If a built-in only takes one argument, then the 
second
and third argument entries in the table should be flagged as
RS6000_BTI_NOT_OPAQUE.  I have never understood why that particular
(misleading) term was used.  But basically it means (as Will found in the code
chunk below) that you shouldn't check type compatibility because we don't
expect an argument in that position.

At least that's always been my understanding.  I could be wrong.  It seems
very odd that you are finding it easy to remove this (only a few entries), so
I might be way off base.
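
As a rough picture of that reading, the sketch below treats the sentinel as "do not check this slot".  It is a simplification with hypothetical names (TypeIndex, OverloadEntry, slot_matches, entry_matches); the real table and rs6000_builtin_type_compatible are more involved.

```cpp
// Hypothetical, heavily reduced set of type indices.
enum TypeIndex { BTI_NOT_OPAQUE = 0, BTI_V4SF, BTI_V4SI, BTI_INTSI };

// One table row: a result type and up to three argument types.
struct OverloadEntry {
  TypeIndex ret, op1, op2, op3;
};

// A slot holding the sentinel is never checked; any other slot must match
// exactly (the real compatibility test is more lenient than this).
constexpr bool slot_matches(TypeIndex expected, TypeIndex actual) {
  return expected == BTI_NOT_OPAQUE || expected == actual;
}

constexpr bool entry_matches(const OverloadEntry& e,
                             TypeIndex a1, TypeIndex a2, TypeIndex a3) {
  return slot_matches(e.op1, a1) && slot_matches(e.op2, a2)
         && slot_matches(e.op3, a3);
}

// VEC_SLD-like row before the fix: the third slot is the sentinel.
constexpr OverloadEntry sld_old = {BTI_V4SF, BTI_V4SF, BTI_V4SF, BTI_NOT_OPAQUE};
// The same row after the fix: the third slot demands an integer.
constexpr OverloadEntry sld_new = {BTI_V4SF, BTI_V4SF, BTI_V4SF, BTI_INTSI};
```

With sld_old a vector in the third position is accepted because the slot is skipped; with sld_new only an integer third argument matches.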

Bill

> 
> I dug in a bit more to try to understand the history and context.
> 
> The RS6000_BTI_NOT_OPAQUE entry was added as part of the (large) AltiVec
> rewrite (By Paolo Bonzini) in Apr 2005.
> The ALTIVEC_BUILTIN_VEC_SLD entries, and the "for arguments after the
> last" code chunk in altivec_resolve_overloaded_builtin() was part of
> that addition, and pretty much un-touched since that time.
> 
>>>   { VSX_BUILTIN_VEC_XXPERMDI, VSX_BUILTIN_XXPERMDI_2DF,
>> 
>> XXPERMDI is the only other builtin that uses NOT_OPAQUE, does that suffer
>> from the same problem?  If so, you can completely delete NOT_OPAQUE it
>> seems?
> 
> Yes and no.   I've generated a few more tests that show the problem also
> included vec_xxperms.   
> SO,.. I've updated the patch to fix those references too.
> 
> With that change, all references to NOT_OPAQUE in the builtins table are
> removed.
> 
> (I'll be posting that momentarily while regtests run overnight..)
> 
> So then with the idea of cleaning up all remaining references to
> _NOT_OPAQUE..  I got stuck.  :-)
> 
> The _NOT_OPAQUE definition is the first entry in the (rs6000.h: enum
> rs6000_builtin_type_index)
> 
> enum rs6000_builtin_type_index
> {
> -  RS6000_BTI_NOT_OPAQUE,
> +  RS6000_BTI_unset,
>   RS6000_BTI_opaque_V2SI,
> 
> 
> And the only other reference is in this chunk of code in rs6000-c.c:
> altivec_resolve_overloaded_builtin() 
> 
>  /* For arguments after the last, we have RS6000_BTI_NOT_OPAQUE in
>   the opX fields.  */
>for (; desc->code == fcode; desc++)
>  {
>   if ((desc->op1 == RS6000_BTI_NOT_OPAQUE
>|| rs6000_builtin_type_compatible (types[0], desc->op1))
>   && (desc->op2 == RS6000_BTI_NOT_OPAQUE
>   || rs6000_builtin_type_compatible (types[1], desc->op2))
>   && (desc->op3 == RS6000_BTI_NOT_OPAQUE
>   || rs6000_builtin_type_compatible (types[2], desc->op3)))
> 
> 
> So there should no longer be any matches to ...NOT_OPAQUE, but if I
> comment out the snippets "== ..NOT_OPAQUE || ", lots of ICE's show up.
> 
> which makes me wonder if the check here is more of a "if desc->op1 was
> not explicitly set,... " thing. 

[C++ Patch] PR 84350 ("[7/8 Regression] ICE with new and auto")

2018-02-14 Thread Paolo Carlini

Hi,

today I spent some time on this: basing on r245826, when we started 
ICEing. For example I wondered if we wanted to rework the use of 
do_auto_deduction from build_new, and check CLASS_PLACEHOLDER_TEMPLATE 
(auto_node) and possibly directly call do_class_deduction when d_init 
stays NULL_TREE because vec_safe_length (*init) != 1 ( 
https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/cp/init.c?r1=245826=245825=245826 
). But that would require a non-static do_class_deduction and an 
additional function call from build_new, not at all sure it's worth it. 
Thus I'm just proposing the below, restoring the old diagnostic and 
avoiding the ICE.


Thanks, Paolo.



/cp
2018-02-14  Paolo Carlini  

PR c++/84350
* pt.c (do_auto_deduction): Don't check the TREE_TYPE of a null
init, early return.

/testsuite
2018-02-14  Paolo Carlini  

PR c++/84350
* g++.dg/cpp0x/auto49.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 257659)
+++ cp/pt.c (working copy)
@@ -25975,7 +25975,7 @@ do_auto_deduction (tree type, tree init, tree auto
 /* C++17 class template argument deduction.  */
 return do_class_deduction (type, tmpl, init, flags, complain);
 
-  if (TREE_TYPE (init) == NULL_TREE)
+  if (init == NULL_TREE || TREE_TYPE (init) == NULL_TREE)
 /* Nothing we can do with this, even in deduction context.  */
 return type;
 
Index: testsuite/g++.dg/cpp0x/auto49.C
===
--- testsuite/g++.dg/cpp0x/auto49.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/auto49.C (working copy)
@@ -0,0 +1,12 @@
+// PR c++/84350
+// { dg-do compile { target c++11 } }
+
+template void foo(T... t)
+{
+  new auto(t...);  // { dg-error "invalid use" }
+}
+
+void bar()
+{
+  foo();
+}


Re: [PATCH] FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2018 at 12:09:57PM +0100, Richard Biener wrote:
> On Tue, 13 Feb 2018, Marc Glisse wrote:
> 
> > On Tue, 13 Feb 2018, Richard Biener wrote:
> > 
> > > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek 
> > > 
> > > wrote:
> > > > Hi!
> > > > 
> > > > On the following testcase, we recurse infinitely, because
> > > > we have float re-association enabled, but also rounding-math, so
> > > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
> > > > but (cst2 + cst3) doesn't simplify and we try again and optimize
> > > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
> > > > forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
> > > > if it is, the code just checks if we can actually simplify the
> > > > operation between cst2 and cst3 into a constant.
> > > 
> > > Is there a reason to try simplifying at all for constant @0?
> > 
> > Yes. cst2+cst3 might simplify (the operation happens to be exact and not
> > require rounding), which leaves us with only one addition instead of 2.
> > 
> > On the other hand, mixing -frounding-math with reassociation seems strange 
> > to
> > me, and likely not worth optimizing for.
> 
> ./cc1 -quiet t.c -O -frounding-math -fassociative-math
> cc1: warning: -fassociative-math disabled; other options take precedence

You need
./cc1 -quiet t.c -O -fassociative-math -fno-trapping-math -fno-signed-zeros -frounding-math

> So _maybe_ we should disable these patterns for !flag_associative_math
> when dealing with FP?

We do; this is inside the block guarded by:
 /* We can't reassociate floating-point unless -fassociative-math
or fixed-point plus or minus because of saturation to +-Inf.  */
 (if ((!FLOAT_TYPE_P (type) || flag_associative_math)
  && !FIXED_POINT_TYPE_P (type))

But that doesn't mean you can't request associative math and rounding math
at the same time.
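
The cycle from the quoted description can be modelled in a few lines.  This is only an illustration, not match.pd code: Sum3, rotate_sum, same_sum and rewrite_guarded are hypothetical names, and the integers merely label the three constants.

```cpp
// Represents (a + b) + c; the members only identify the constants.
struct Sum3 { int a, b, c; };

// The rewrite (a + b) + c -> (b + c) + a.
constexpr Sum3 rotate_sum(Sum3 s) { return Sum3{s.b, s.c, s.a}; }

constexpr bool same_sum(Sum3 x, Sum3 y) {
  return x.a == y.a && x.b == y.b && x.c == y.c;
}

// Guarded rewrite: only rotate when the new inner sum folds to a constant.
// 'inner_folds' stands in for that check; with -frounding-math an inexact
// cst2 + cst3 does not fold, so the expression is left alone.
constexpr Sum3 rewrite_guarded(Sum3 s, bool inner_folds) {
  return inner_folds ? rotate_sum(s) : s;
}

constexpr Sum3 start = {1, 2, 3};
```

Applying rotate_sum three times gives back the starting expression, so an unguarded rewrite never reaches a fixed point; the guarded form stops as soon as the inner sum fails to fold.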

Jakub


[PATCH] CET shouldn't be enabled in 32-bit run-time libraries by defualt

2018-02-14 Thread Tsimbalist, Igor V
ENDBR32 and RDSSPD are multi-byte NOPs on x86-64 processors and
newer x86 processors, starting with the Pentium Pro.  They are UD on older
32-bit processors.  Detect this at configure time and adjust the default
value for enable_cet.  GCC will still enable CET in 32-bit run-time
libraries if --enable-cet is used to configure GCC.

OK for trunk?

Igor




0001-CET-shouldn-t-be-enabled-in-32-bit-run-time-librarie.patch
Description: 0001-CET-shouldn-t-be-enabled-in-32-bit-run-time-librarie.patch


Re: [PATCH] Handle PowerPC64 ELFv1 function descriptors in libbacktrace (PR other/82368)

2018-02-14 Thread Ian Lance Taylor
On Wed, Feb 14, 2018 at 3:41 AM, Jakub Jelinek  wrote:
>
> As mentioned in detail in the PR, PowerPC64 ELFv1 function symbols
> point to function descriptors in .opd section rather than actual
> code, and one needs to read the code address from the .opd section
> in order to associate symbols with .text addresses.
>
> Fixed thusly, bootstrapped/regtested on powerpc64-linux (-m32/-m64
> testing) and powerpc64le-linux, ok for trunk?
>
> 2018-02-14  Jakub Jelinek  
>
> PR other/82368
> * elf.c (EM_PPC64, EF_PPC64_ABI): Undefine and define.
> (struct elf_ppc64_opd_data): New type.
> (elf_initialize_syminfo): Add opd argument, handle symbols
> pointing into the PowerPC64 ELFv1 .opd section.
> (elf_add): Read .opd section on PowerPC64 ELFv1, pass pointer
> to structure with .opd data to elf_initialize_syminfo.
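
For context, an ELFv1 function descriptor is three doublewords (entry address, TOC base, environment pointer), so the lookup amounts to reading the first doubleword at the symbol's offset into .opd.  The sketch below is not the libbacktrace code: OpdData and resolve_code_address are hypothetical names, and it assumes the section contents are already in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// In-memory view of the .opd section (hypothetical type).
struct OpdData {
  std::uint64_t addr;         // virtual address where .opd starts
  const unsigned char* data;  // section contents
  std::size_t size;           // section size in bytes
};

// If sym_value points into .opd, return the code entry address stored in
// the first doubleword of the descriptor; otherwise return sym_value as is.
std::uint64_t resolve_code_address(const OpdData& opd, std::uint64_t sym_value) {
  if (sym_value < opd.addr)
    return sym_value;
  std::uint64_t off = sym_value - opd.addr;
  if (off >= opd.size || opd.size - off < sizeof(std::uint64_t))
    return sym_value;
  std::uint64_t entry;
  std::memcpy(&entry, opd.data + off, sizeof entry);  // host byte order assumed
  return entry;
}
```

A symbol whose value lies outside the section is returned unchanged, so the same helper can be applied to every symbol.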

This is OK.

Thanks for taking this on.

Ian


Re: [C++ Patch] tsubst_flags_t fixlet

2018-02-14 Thread Jason Merrill
OK.

On Wed, Feb 14, 2018 at 6:01 AM, Paolo Carlini  wrote:
> Hi,
>
> today, while having a look at c++/84350, I noticed that in a couple of
> places we aren't forwarding the tsubst_flags_t argument to
> do_auto_deduction. I think we can elegantly solve the problem by removing
> the do_auto_deduction overload taking three arguments and adding two
> defaults to the other. Tested x86_64-linux.
>
> Thanks, Paolo.
>
> ///
>


Re: [PATCH] jit: fix link on OS X and Solaris (PR jit/64089 and PR jit/84288)

2018-02-14 Thread Rainer Orth
Hi David,

>> * added LD_SONAME_OPTION, done in the same way
[...]
>> Does this fix the jit linker issues on OS X and Solaris?
>
> I'll give it a whirl tomorrow, including the jit-recording.c part of my
> patch to allow the build to complete.

actually, I've replaced the Makefile and configure parts of my patch
with yours and did a jit-only bootstrap on i386-pc-solaris2.11 with
as/ld and gas/ld.  Both went fine with a minor caveat: I noticed that
LD_SONAME_OPTION wasn't set in gcc/Makefile.  Fixed with the following
(so far untested) snippet:

diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3715,6 +3715,12 @@ elif test x$gcc_cv_ld != x; then
   gcc_cv_ld_soname=yes
   ld_soname_option='-install_name'
   ;;
+# Solaris 2 ld always supports -h.  It also supports --soname for GNU
+# ld compatibility since some Solaris 10 update.
+*-*-solaris2*)
+  gcc_cv_ld_soname=yes
+  ld_soname_option='-h'
+  ;;
   esac
 fi
 # Don't AC_DEFINE result, only used in jit/Make-lang.in so far.

I've also checked that the original Solaris 10 release didn't support ld
-soname, so it's safer to always use the Solaris-native -h option
instead.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-14 Thread Marc Glisse

On Wed, 14 Feb 2018, Richard Biener wrote:

> On Tue, 13 Feb 2018, Marc Glisse wrote:
>
> > On Tue, 13 Feb 2018, Richard Biener wrote:
> >
> > > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek
> > > wrote:
> > > > Hi!
> > > >
> > > > On the following testcase, we recurse infinitely, because
> > > > we have float re-association enabled, but also rounding-math, so
> > > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
> > > > but (cst2 + cst3) doesn't simplify and we try again and optimize
> > > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
> > > > forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
> > > > if it is, the code just checks if we can actually simplify the
> > > > operation between cst2 and cst3 into a constant.
> > >
> > > Is there a reason to try simplifying at all for constant @0?
> >
> > Yes. cst2+cst3 might simplify (the operation happens to be exact and not
> > require rounding), which leaves us with only one addition instead of 2.
> >
> > On the other hand, mixing -frounding-math with reassociation seems strange to
> > me, and likely not worth optimizing for.
>
> ./cc1 -quiet t.c -O -frounding-math -fassociative-math
> cc1: warning: -fassociative-math disabled; other options take precedence
>
> So _maybe_ we should disable these patterns for !flag_associative_math
> when dealing with FP?

There is

 (if ((!FLOAT_TYPE_P (type) || flag_associative_math)
  && !FIXED_POINT_TYPE_P (type))

above, which I think covers this transformation.

--
Marc Glisse
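
As an aside on the "cst2+cst3 might simplify" point: whether a
floating-point addition is exact (needs no rounding) can be checked with
Knuth's TwoSum error-free transformation.  The sketch below is only an
illustration and is not what match.pd does; the name add_is_exact is made
up, and it assumes IEEE double arithmetic in round-to-nearest, no excess
precision (e.g. SSE on x86_64) and no -ffast-math.

```cpp
// Hypothetical helper: returns true if a + b is exactly representable
// as a double, i.e. the addition needs no rounding.
// TwoSum computes the rounding error of s = a + b exactly; volatile keeps
// the compiler from folding or reassociating the individual steps.
bool add_is_exact (double a, double b)
{
  volatile double s = a + b;
  volatile double bb = s - a;                       // part of s that came from b
  volatile double err = (a - (s - bb)) + (b - bb);  // exact rounding error
  return err == 0.0;
}
```

With these assumptions add_is_exact (0.5, 0.25) holds while
add_is_exact (0.1, 0.2) does not, so only the first pair could be folded
into a single constant without depending on the rounding mode.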


Re: Check array indices in object_address_invariant_in_loop_p (PR 84357)

2018-02-14 Thread Richard Biener
On Wed, Feb 14, 2018 at 10:44 AM, Richard Sandiford
 wrote:
> object_address_invariant_in_loop_p ignored ARRAY_REF indices on
> the basis that:
>
>   /* Index of the ARRAY_REF was zeroed in analyze_indices, thus we only
>      need to check the stride and the lower bound of the reference.  */
>
> That was true back in 2007 when the code was added:
>
> static void
> dr_analyze_indices (struct data_reference *dr, struct loop *nest)
> {
>   [...]
>   while (handled_component_p (aref))
> {
>   if (TREE_CODE (aref) == ARRAY_REF)
> {
>   op = TREE_OPERAND (aref, 1);
>   access_fn = analyze_scalar_evolution (loop, op);
>   access_fn = resolve_mixers (nest, access_fn);
>   VEC_safe_push (tree, heap, access_fns, access_fn);
>
>   TREE_OPERAND (aref, 1) = build_int_cst (TREE_TYPE (op), 0);
> }
>
>   aref = TREE_OPERAND (aref, 0);
> }
>
> but the assignment was removed a few years ago.

GCC 4.7!

>  We were therefore
> treating "two->arr[i]" and "three->arr[i]" as loop invariant.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> 2018-02-14  Richard Sandiford  
>
> gcc/
> PR tree-optimization/84357
> * tree-data-ref.c (object_address_invariant_in_loop_p): Check
> operand 1 of an ARRAY_REF too.
>
> gcc/testsuite/
> PR tree-optimization/84357
> * gcc.dg/vect/pr84357.c: New test.
>
> Index: gcc/tree-data-ref.c
> ===
> --- gcc/tree-data-ref.c 2018-02-08 15:16:21.784407397 +
> +++ gcc/tree-data-ref.c 2018-02-14 09:42:14.801095011 +
> @@ -2200,13 +2200,10 @@ object_address_invariant_in_loop_p (cons
>  {
>if (TREE_CODE (obj) == ARRAY_REF)
> {
> - /* Index of the ARRAY_REF was zeroed in analyze_indices, thus we only
> -need to check the stride and the lower bound of the reference.  */
> - if (chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, 2),
> - loop->num)
> - || chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, 3),
> -loop->num))
> -   return false;
> + for (int i = 1; i < 4; ++i)
> +   if (chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, i),
> +   loop->num))
> + return false;
> }
>else if (TREE_CODE (obj) == COMPONENT_REF)
> {
> Index: gcc/testsuite/gcc.dg/vect/pr84357.c
> ===
> --- /dev/null   2018-02-10 09:05:46.714416790 +
> +++ gcc/testsuite/gcc.dg/vect/pr84357.c 2018-02-14 09:42:14.800095067 +
> @@ -0,0 +1,31 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wall" } */
> +
> +#define COUNT 32
> +
> +typedef struct s1 {
> +unsigned char c;
> +} s1;
> +
> +typedef struct s2
> +{
> +char pad;
> +s1 arr[COUNT];
> +} s2;
> +
> +typedef struct s3 {
> +s1 arr[COUNT];
> +} s3;
> +
> +s2 * get_s2();
> +s3 * gActiveS3;
> +void foo()
> +{
> +s3 * three = gActiveS3;
> +s2 * two = get_s2();
> +
> +for (int i = 0; i < COUNT; i++)
> +{
> +two->arr[i].c = three->arr[i].c;
> +}
> +}
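
The effect of the quoted change can be sketched outside GCC with a toy
model.  Everything in it is hypothetical: toy_array_ref and its flags merely
stand in for the tree operands and for
chrec_contains_symbols_defined_in_loop.  The point is only that operand 1,
the index, now takes part in the check together with operands 2 and 3.

```cpp
// Toy stand-in for an ARRAY_REF: operand 0 is the base, 1 the index,
// 2 the lower bound, 3 the stride.  Each flag says whether that operand
// depends on a value defined inside the loop.
struct toy_array_ref
{
  bool defined_in_loop[4];
};

// Old behaviour: only operands 2 and 3 were inspected.
bool invariant_old (const toy_array_ref &ref)
{
  return !ref.defined_in_loop[2] && !ref.defined_in_loop[3];
}

// New behaviour: operands 1..3 are all inspected, as in the patch's loop.
bool invariant_new (const toy_array_ref &ref)
{
  for (int i = 1; i < 4; ++i)
    if (ref.defined_in_loop[i])
      return false;
  return true;
}
```

For something like two->arr[i] the index changes on every iteration, so in
this model only operand 1 is flagged: invariant_old would call the address
invariant, invariant_new would not.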


Re: [PATCH] Fix PR84101, account for function ABI details in vectorization costs

2018-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2018 at 12:52:45PM +0100, Richard Biener wrote:
> On Tue, 13 Feb 2018, Jeff Law wrote:
> 
> > On 01/30/2018 02:59 AM, Richard Biener wrote:
> > > 
> > > This patch tries to deal with the "easy" part of a function ABI,
> > > the return value location, in vectorization costing.  The testcase
> > > shows that if we vectorize the returned value but the function
> > > doesn't return in memory or in a vector register but as in this
> > > case in an integer register pair (reg:TI ax) (bah, ABI details
> > > exposed late?  why's this not a parallel?) we end up spilling
> > > badly.
> > PARALLEL is used when the ABI mandates a value be returned in multiple
> > places.  Typically that happens when the value is returned in different
> > types of registers (integer, floating point, vector).
> > 
> > Presumably it's not a PARALLEL in this case because the value is only
> > returned in %eax.
> 
> It's returned in %eax and %rdx (TImode after all).  But maybe
> "standard register pairs" are not represented as PARALLEL ...

Yes, it is (reg:TI %rax) if low part is in register 0 and high part in
register 1.

Jakub
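
A small illustration of the register-pair case: under the x86_64 psABI a
128-bit integer return value comes back in %rax (low half) and %rdx (high
half), which is what (reg:TI ax) describes.  The sketch assumes a compiler
with the unsigned __int128 extension (GCC or Clang on a 64-bit target); the
function names are made up.

```cpp
#include <cstdint>

// Build a 128-bit value from two 64-bit halves.  On x86_64 the result is
// returned in an integer register pair, not in memory or a vector register.
unsigned __int128 make_u128 (std::uint64_t lo, std::uint64_t hi)
{
  return ((unsigned __int128) hi << 64) | lo;
}

// Split the value back into its halves.
std::uint64_t low_half (unsigned __int128 v)
{
  return (std::uint64_t) v;
}

std::uint64_t high_half (unsigned __int128 v)
{
  return (std::uint64_t) (v >> 64);
}
```

Compiling make_u128 with -O2 -S and looking at the generated assembly is a
quick way to see where the two halves end up.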


Re: [PING] [PATCH] [MSP430] PR79242: Implement Complex Partial Integers

2018-02-14 Thread Jozef Lawrynowicz

On 14/02/18 07:25, Jeff Law wrote:

On 02/08/2018 09:54 AM, Jozef Lawrynowicz wrote:

ping x1

Complex Partial Integers are unimplemented, resulting in an ICE when
attempting to use them. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79242
This results in GCC7/8 for msp430-elf failing to build.

typedef _Complex __int20 C;

C
foo (C x, C y)
{
   return x + y;
}

(Thanks Jakub - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79242#c2)

../../gcc/testsuite/gcc.target/msp430/pr79242.c: In function 'foo':
../../gcc/testsuite/gcc.target/msp430/pr79242.c:8:1: internal compiler
error: in make_decl_rtl, at varasm.c:1304
  foo (C x, C y)
  ^~~
0xc07b29 make_decl_rtl(tree_node*)
 ../../gcc/varasm.c:1303
0x67523c set_parm_rtl(tree_node*, rtx_def*)
 ../../gcc/cfgexpand.c:1274
0x79ffb9 expand_function_start(tree_node*)
 ../../gcc/function.c:5166
0x6800e1 execute
 ../../gcc/cfgexpand.c:6250

The attached patch defines a new complex mode for PARTIAL_INT.
You may notice that genmodes.c:complex_class returns MODE_COMPLEX_INT for
MODE_PARTIAL_INT rather than MODE_COMPLEX_PARTIAL_INT. I reviewed the uses of
MODE_COMPLEX_INT and it doesn't look like a Complex Partial Int requires any
different behaviour to MODE_COMPLEX_INT.

msp430_hard_regno_nregs now returns 2 for CPSImode, but I feel like this may be
better handled in the front-end. PSImode is already defined to only use 1
register, so for a CPSI shouldn't the front-end be able to work out that
double the number of registers is required? Thoughts?
Without the definition for CPSI in msp430_hard_regno_nregs,
rtlanal.c:subreg_get_info thinks that a CPSI requires 4 registers of size 2,
instead of 2 registers of size 4.
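
The arithmetic behind that difference can be sketched as follows.  This is a
hypothetical model, not the msp430 backend code: the sizes are taken from the
description above, and the 2-byte word used for the generic fallback is an
assumption that matches "4 registers of size 2".

```cpp
// Sizes in bytes: a CPSI value occupies 8 bytes, made of two PSI parts of
// 4 bytes each.  word_size is the assumed 16-bit word of the fallback.
const int cpsi_size = 8;
const int psi_size = 4;
const int word_size = 2;

// Generic fallback: round the mode size up to whole words.
int nregs_by_words (int mode_size)
{
  return (mode_size + word_size - 1) / word_size;
}

// Target-specific count: one register per PSI part of the complex value.
int nregs_cpsi ()
{
  return cpsi_size / psi_size;
}
```

Here nregs_by_words (cpsi_size) gives 4 while nregs_cpsi () gives 2, which is
the mismatch the explicit CPSImode case avoids.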

Successfully bootstrapped and tested for c,c++,fortran,lto,objc on
x86_64-pc-linux-gnu with no regressions on gcc-7-branch.

With this patch gcc-7-branch now builds for msp430-elf. A further bug prevents
trunk from building for msp430-elf.

If the attached patch is acceptable, I would appreciate if someone would commit
it for me (to trunk and gcc-7-branch), as I do not have write access.


0001-Add-support-for-Complex-Partial-Integers-CPSImode.patch


 From 31d8554ebb6afeb2d8f235cf3d3c262236aa5e32 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Fri, 12 Jan 2018 13:23:40 +
Subject: [PATCH] Add support for Complex Partial Integers - CPSImode

2018-01-XX  Jozef Lawrynowicz 

gcc/
   PR target/79242
   * machmode.def: Define a complex mode for PARTIAL_INT.
   * genmodes.c (complex_class): Return MODE_COMPLEX_INT for
 MODE_PARTIAL_INT.
   * doc/rtl.texi: Document CPSImode.
   * config/msp430/msp430.c (msp430_hard_regno_nregs): Add CPSImode
 handling.
 (msp430_hard_regno_nregs_with_padding): Likewise.

gcc/testsuite/
   PR target/79242
   * gcc.target/msp430/pr79242.c: New test.

Thanks.  Installed.

jeff


Thanks! Do you mind applying to gcc-7-branch as well?

Jozef



Re: [RX] Fix PR 83831 -- Unused bclr, bnot, bset insns

2018-02-14 Thread Oleg Endo
On Tue, 2018-02-13 at 17:04 +, Nick Clifton wrote:
> 
> > gcc/ChangeLog:
> > 
> > PR target/83831
> > * config/rx/rx-protos.h (rx_reg_dead_or_unused_after_insn,
> > rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
> > declarations.
> > (set_of_reg): New struct.
> > (rx_find_set_of_reg, rx_find_use_of_reg): New functions.
> > * config/rx/rx.c (rx_reg_dead_or_unused_after_insn,
> > rx_copy_reg_dead_or_unused_notes, rx_fuse_in_memory_bitop): New
> > functions.
> > * config/rx/rx.md (andsi3, iorsi3, xorsi3): Convert to insn_and_split.
> > Split into bitclr, bitset, bitinvert patterns if appropriate.
> > (*bitset, *bitinvert, *bitclr): Convert to named insn_and_split and
> > use rx_fuse_in_memory_bitop.
> > (*bitset_in_memory, *bitinvert_in_memory, *bitclr_in_memory): Convert
> > to named insn, correct maximum insn length.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR target/83831
> > * gcc.target/rx/pr83831.c: New tests.

> Approved - please apply - and thanks very much for doing this!

Thanks!  Committed as r257655.

Cheers,
Oleg


Re: [PR tree-optimization/84047] missing -Warray-bounds on an out-of-bounds index

2018-02-14 Thread Richard Biener
On Thu, Feb 8, 2018 at 9:45 PM, Martin Sebor  wrote:
> On 02/08/2018 03:38 AM, Richard Biener wrote:
>>
>> On Thu, Feb 1, 2018 at 6:42 PM, Aldy Hernandez  wrote:
>>>
>>> Since my patch isn't the easy one liner I wanted it to be, perhaps we
>>> should concentrate on Martin's patch, which is more robust, and has
>>> testcases to boot!  His patch from last week also fixes a couple other
>>> PRs.
>>>
>>> Richard, would this be acceptable?  That is, could you or Jakub review
>>> Martin's all-encompassing patch?  If so, I'll drop mine.
>>
>>
>> Sorry, no - this one looks way too complicated.
>
>
> Presumably the complication is in the loop that follows SSA_NAMEs
> and the offsets in:
>
>   const char *s0 = "12";
>   const char *s1 = s0 + 1;
>   const char *s2 = s1 + 1;
>   const char *s3 = s2 + 1;
>
>   int i = *s0 + *s1 + *s2 + *s3;
>
> ?
>
> I don't know if this used to be diagnosed and is also part of
> the regression.  If it isn't it could be removed for GCC 8 and
> then added for GCC 9.  If this isn't your concern can you be
> more specific about what is?
>
> I should note (again) that this patch doesn't fix the whole
> regression.  There are remaining cases (involving arrays) that
> used to be handled but no longer are.  It's tedious (and hacky)
> to limit the fix to just the subset of the regression while at
> the same time preserving the pre-existing limitations (or bugs,
> depending on one's point of view).
>
>>> Also, could someone pontificate on whether we want to fix
>>> -Warray-bounds regressions for this release cycle?
>>
>>
>> Remove bogus ones?  Yes.  Add "missing ones"?  No.
>
>
> Can you please explain how to interpret the Target Milestone
> then?  Why is it set to 6.5 when the bug is not meant to
> be fixed?
>
> If it's meant to be fixed in 6.5 (and presumably also 7.4) but
> not in 8.1, do we expect to fix it in 8.2?  More to the point,
> how can we tell which it is?

The target milestone is set by formal criteria.  We have leeway
with the priority and only P1 bugs must be fixed before the release.

What release we fix something in ultimately ends up depending on
what the patch looks like.  And most bugs are about multiple issues
so having "one" target milestone or even priority is difficult if it
is supposed to be a hard one.

Richard.

> Thanks
> Martin
>
>
>>
>> Richard.
>>
>>> Thanks.
>>>
>>> On Wed, Jan 31, 2018 at 6:05 AM, Richard Biener
>>>  wrote:

>>>> On Tue, Jan 30, 2018 at 11:11 PM, Aldy Hernandez
>>>> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> [Note: Jakub has mentioned that missing -Warray-bounds regressions
>>>>> should be punted to GCC 9.  I think this particular one is easy
>>>>> pickings, but if this and/or the rest of the -Warray-bounds
>>>>> regressions should be marked as GCC 9 material, please let me know
>>>>> so we can adjust all relevant PRs.]
>>>>>
>>>>> This is a -Warray-bounds regression that happens because the IL now
>>>>> has a MEM_REF instead of an ARRAY_REF.
>>>>>
>>>>> Previously we had an ARRAY_REF we could diagnose:
>>>>>
>>>>>   D.2720_5 = "12345678"[1073741824];
>>>>>
>>>>> But now this is represented as:
>>>>>
>>>>>   _1 = MEM[(const char *)"12345678" + 1073741824B];
>>>>>
>>>>> I think we can just allow check_array_bounds() to handle MEM_REF's and
>>>>> everything should just work.
>>>>>
>>>>> The attached patch fixes both regressions mentioned in the PR.
>>>>>
>>>>> Tested on x86-64 Linux.
>>>>>
>>>>> OK?
>>>>
>>>>
>>>> This doesn't look correct.  You lump MEM_REF handling together with
>>>> ADDR_EXPR handling but for the above case you want to diagnose
>>>> _dereferences_ not address-taking.
>>>>
>>>> For the dereference case you need to amend the ARRAY_REF case, for
>>>> example via
>>>>
>>>> Index: gcc/tree-vrp.c
>>>> ===
>>>> --- gcc/tree-vrp.c  (revision 257181)
>>>> +++ gcc/tree-vrp.c  (working copy)
>>>> @@ -5012,6 +5012,13 @@ check_array_bounds (tree *tp, int *walk_
>>>>    if (TREE_CODE (t) == ARRAY_REF)
>>>>      vrp_prop->check_array_ref (location, t, false /*ignore_off_by_one*/);
>>>>
>>>> +  else if (TREE_CODE (t) == MEM_REF
>>>> +  && TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR
>>>> +  && TREE_CODE (TREE_OPERAND (TREE_OPERAND (t, 0), 0)) == STRING_CST)
>>>> +{
>>>> +  call factored part of check_array_ref passing in STRING_CST and offset
>>>> +}
>>>> +
>>>>    else if (TREE_CODE (t) == ADDR_EXPR)
>>>>      {
>>>>        vrp_prop->search_for_addr_array (t, location);
>>>>
>>>> note your patch will fail to warn for "1"[1] because taking that
>>>> address is valid but not dereferencing it.
>>>>
>>>> Richard.
>>>>>
>>>>>
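
The distinction between taking an address and dereferencing it, which this
thread keeps coming back to, can be sketched for Martin's s0..s3 example
quoted above.  The helper names are made up; the sketch only encodes the
language rule that a pointer may point one past the end of an object but may
not be dereferenced there.

```cpp
#include <cstddef>

// Size of the string literal "12" including its terminating NUL: 3 bytes.
const std::size_t lit_size = sizeof "12";

// Forming the address lit + off is valid up to and including one past
// the end of the array.
bool address_ok (std::size_t off)
{
  return off <= lit_size;
}

// Reading *(lit + off) is valid only strictly inside the array.
bool deref_ok (std::size_t off)
{
  return off < lit_size;
}
```

In that example s0..s3 sit at offsets 0..3: *s0, *s1 and *s2 are in bounds
(s2 reads the NUL), while s3 is a valid pointer value whose dereference is
out of bounds.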


Re: [PATCH] Handle PowerPC64 ELFv1 function descriptors in libbacktrace (PR other/82368)

2018-02-14 Thread Segher Boessenkool
Hi Jakub,

On Wed, Feb 14, 2018 at 12:41:38PM +0100, Jakub Jelinek wrote:
> As mentioned in detail in the PR, PowerPC64 ELFv1 function symbols
> point to function descriptors in .opd section rather than actual
> code, and one needs to read the code address from the .opd section
> in order to associate symbols with .text addresses.
> 
> Fixed thusly, bootstrapped/regtested on powerpc64-linux (-m32/-m64
> testing) and powerpc64le-linux, ok for trunk?
> 
> 2018-02-14  Jakub Jelinek  
> 
>   PR other/82368
>   * elf.c (EM_PPC64, EF_PPC64_ABI): Undefine and define.
>   (struct elf_ppc64_opd_data): New type.
>   (elf_initialize_syminfo): Add opd argument, handle symbols
>   pointing into the PowerPC64 ELFv1 .opd section.
>   (elf_add): Read .opd section on PowerPC64 ELFv1, pass pointer
>   to structure with .opd data to elf_initialize_syminfo.

Looks good to me; you may want a libbacktrace maintainer though.


Segher


Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-14 Thread Janus Weil
2018-02-14 12:47 GMT+01:00 Jakub Jelinek :
> On Wed, Feb 14, 2018 at 12:30:14PM +0100, Jakub Jelinek wrote:
>> On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
>> > as the subject line says, the attached patch rejects procedure
>> > pointers in COMMON blocks (which is forbidden in F08). Since it's
>> > apparently legal in F03, I'm still accepting it with -std=f2003 and
>> > add that flag to a test case where this 'feature' is used. In another
>> > one, I'm adding the error message that one gets with -std=f2008.
>> >
>> > As my last submission, this fixes fallout from
>> > https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
>> > As the last one, it is a very simple fix for an accepts-invalid
>> > problem (which is not a regression), so I hope this one will also
>> > still be suitable for trunk (if not, I hope the release managers, in
>> > CC, will stop me).
>> >
>> > It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?
>>
>> This broke libgomp.fortran/threadprivate4.f90 test.

Sorry about that! I regtested with "make check-gfortran", which
doesn't run the libgomp tests, I guess?


>> Adding ! { dg-additional-options "-std=f2003" }
>> doesn't work, because the test uses
>>   call abort

I actually think we should get rid of such extensions in the
testsuite, where possible. This particular one is used all over the
place, but could be easily replaced by something like "stop 1", which
is standard Fortran.


>> which is a GNU extension and I have no idea how to choose allow_std
>> which includes GNU but doesn't include F2008.
>
> ! { dg-additional-options "-std=f2003 -fdec" }
>
> seems to work (because -std=f2003 sets
>   gfc_option.allow_std = GFC_STD_F95_OBS | GFC_STD_F77
> | GFC_STD_F2003 | GFC_STD_F95 | GFC_STD_F2008_OBS;
> and -fdec adds:
>   gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
> | GFC_STD_GNU | GFC_STD_LEGACY;
> ), but it is quite nasty.  Isn't there a better way?

Yes, there is "-std=f2003 -fall-intrinsics", which is a little better
at least. It's what I did with proc_ptr_common_1.f90 as well ...

https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/testsuite/gfortran.dg/proc_ptr_common_1.f90?r1=257636=257635=257636


> Kind of like -std=gnu++17 vs. -std=c++17 where the latter is standard
> and former standard + GNU extensions (which would roughly be
> "| GFC_STD_GNU | GFC_STD_LEGACY" in the fortran world).

Huh, I suppose it would be nice to have options like -std=gnu2003 and
-std=gnu2008, in analogy to those C++ options ...

Cheers,
Janus


Re: [PATCH] Fix PR84101, account for function ABI details in vectorization costs

2018-02-14 Thread Richard Biener
On Tue, 13 Feb 2018, Jeff Law wrote:

> On 01/30/2018 02:59 AM, Richard Biener wrote:
> > 
> > This patch tries to deal with the "easy" part of a function ABI,
> > the return value location, in vectorization costing.  The testcase
> > shows that if we vectorize the returned value but the function
> > doesn't return in memory or in a vector register but as in this
> > case in an integer register pair (reg:TI ax) (bah, ABI details
> > exposed late?  why's this not a parallel?) we end up spilling
> > badly.
> PARALLEL is used when the ABI mandates a value be returned in multiple
> places.  Typically that happens when the value is returned in different
> types of registers (integer, floating point, vector).
> 
> Presumably it's not a PARALLEL in this case because the value is only
> returned in %eax.

It's returned in %eax and %rdx (TImode after all).  But maybe
"standard register pairs" are not represented as PARALLEL ...

> > 
> > The idea is to account for such spilling so if vectorization
> > benefits outweigh the spilling we'll vectorize anyway.
> That's a pretty serious bleed of the target into the vectorizer.  But
> we've already deemed that the vectorizer is going to have these target
> dependencies.  So I won't object on those grounds.
> 
> 
> > 
> > I think the particular testcase could be fixed in the subreg
> > pass basically undoing the vectorization but I realize that
> > generally this is a too hard problem and avoiding vectorization
> > is better.  Still this patch is somewhat fragile in that it
> > depends on us "seeing" that the stored to decl is returned
> > (see cfun_returns).
> > 
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > 
> > I'd like to hear opinions on my use of hard_function_value
> > and also from other target maintainers.  I'm not sure we
> > have sufficient testsuite coverage of _profitable_ vectorization
> > of a return value.  Feel free to add to this for your
> > target.
> Well, it's the right way to get at the information.  I'm not aware of
> any other way to get what you want.  We could possibly hide the RTL bits
> to avoid GET_MODE and friends within the vectorizer -- your call.
>
> I'm not sure the bits in vect_mode_store_cost are right though.  ISTM
> you want to penalize if and only if the return value is not stored in a
> vector-capable location.  If it's a PARALLEL and any element is a
> suitable vector register, then do not penalize -- the spills are
> unavoidable in that case.

So we vectorize sth like

  res[0] = a;
  res[1] = b;

to

  vector_reg = {a, b};
  MEM[] = vector_reg;

and if 'res' is a PARALLEL we'd end up spilling vector_reg so we
can assign it to MEM[] and/or decompose 'res' according to the
PARALLEL.  Without this we should be able to manage to emit
direct sets of the PARALLELs components to a and b?  Certainly
it works that way for the (reg:TI ax) testcase (which is not a
PARALLEL ...).
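
For readers who want to see the two shapes side by side, here is a
source-level sketch of the transformation described above.  It assumes the
GCC/Clang vector_size extension; the function names are made up, and the
vectorizer of course works on GIMPLE rather than on source like this.

```cpp
#include <cstring>

// GCC/Clang vector extension: two doubles in one 16-byte vector.
typedef double v2df __attribute__ ((vector_size (16)));

// Scalar form: two independent stores.
void store_scalar (double *res, double a, double b)
{
  res[0] = a;
  res[1] = b;
}

// Vectorized form: build {a, b} and store it with one 16-byte copy.
void store_vector (double *res, double a, double b)
{
  v2df vector_reg = { a, b };
  std::memcpy (res, &vector_reg, sizeof vector_reg);
}
```

Both functions leave the same bytes in res; the cost question in this thread
is what happens afterwards when res has to be handed back in registers that
are not a single vector register.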

> So I think you have to iterate over elements in the PARALLEL case to
> verify none of them are suitable for holding the vector result.

Ok, so in case ret was bigger than just res[0] and res[1] and the
PARALLEL for it would contain both a vector of two elements and
sth for the rest then yes, I guess we might be able to generate
optimal code for the assignment to the vector part of the PARALLEL.

> I'm not entirely sure what to do with CONCAT.  I wasn't immediately
> aware it could show up in that context.

It was just a guess that it might occur for complex vars.  OTOH
I wasn't able to create a testcase that didn't end up using
a SSA name for the actual return value so we don't catch this
case with the patch.

void bar(_Complex double *);
_Complex double foo (double x, double y)
{
  _Complex double z;
  bar (&z);
  __real z = x;
  __imag z = y;
  return z;
}

gets vectorized to

foo:
.LFB0:
        .cfi_startproc
        subq    $40, %rsp
        .cfi_def_cfa_offset 48
        unpcklpd        %xmm1, %xmm0
        leaq    16(%rsp), %rdi
        movaps  %xmm0, (%rsp)
        call    bar
        movapd  (%rsp), %xmm0
        movaps  %xmm0, 16(%rsp)
        movsd   16(%rsp), %xmm0
        movsd   24(%rsp), %xmm1
        addq    $40, %rsp
        .cfi_def_cfa_offset 8
        ret

vs. non-vectorized

foo:
.LFB0:
        .cfi_startproc
        subq    $40, %rsp
        .cfi_def_cfa_offset 48
        leaq    16(%rsp), %rdi
        movsd   %xmm0, 8(%rsp)
        movsd   %xmm1, (%rsp)
        call    bar
        movsd   8(%rsp), %xmm0
        movsd   (%rsp), %xmm1
        addq    $40, %rsp
        .cfi_def_cfa_offset 8
        ret

but hard_function_value indeed returns a PARALLEL here:

(parallel:DC [
(expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
(const_int 8 [0x8]))
])


> Or am I missing something here?

No idea - how would a testcase with a PARALLEL like you mention
look like?  On x86_64 stuff like struct A { v2df x; v2df y; };
is returned via an 

[PATCH] Handle PowerPC64 ELFv1 function descriptors in libbacktrace (PR other/82368)

2018-02-14 Thread Jakub Jelinek
Hi!

As mentioned in detail in the PR, PowerPC64 ELFv1 function symbols
point to function descriptors in .opd section rather than actual
code, and one needs to read the code address from the .opd section
in order to associate symbols with .text addresses.

Fixed thusly, bootstrapped/regtested on powerpc64-linux (-m32/-m64
testing) and powerpc64le-linux, ok for trunk?

2018-02-14  Jakub Jelinek  

PR other/82368
* elf.c (EM_PPC64, EF_PPC64_ABI): Undefine and define.
(struct elf_ppc64_opd_data): New type.
(elf_initialize_syminfo): Add opd argument, handle symbols
pointing into the PowerPC64 ELFv1 .opd section.
(elf_add): Read .opd section on PowerPC64 ELFv1, pass pointer
to structure with .opd data to elf_initialize_syminfo.

--- libbacktrace/elf.c.jj   2018-02-08 20:46:10.671242369 +
+++ libbacktrace/elf.c  2018-02-14 08:39:06.674088951 +
@@ -165,6 +165,8 @@ dl_iterate_phdr (int (*callback) (struct
 #undef ELFDATA2MSB
 #undef EV_CURRENT
 #undef ET_DYN
+#undef EM_PPC64
+#undef EF_PPC64_ABI
 #undef SHN_LORESERVE
 #undef SHN_XINDEX
 #undef SHN_UNDEF
@@ -245,6 +247,9 @@ typedef struct {
 
 #define ET_DYN 3
 
+#define EM_PPC64 21
+#define EF_PPC64_ABI 3
+
 typedef struct {
   b_elf_word   sh_name;/* Section name, index in string tbl */
   b_elf_word   sh_type;/* Type of section */
@@ -405,6 +410,20 @@ struct elf_syminfo_data
   size_t count;
 };
 
+/* Information about PowerPC64 ELFv1 .opd section.  */
+
+struct elf_ppc64_opd_data
+{
+  /* Address of the .opd section.  */
+  b_elf_addr addr;
+  /* Section data.  */
+  const char *data;
+  /* Size of the .opd section.  */
+  size_t size;
+  /* Corresponding section view.  */
+  struct backtrace_view view;
+};
+
 /* Compute the CRC-32 of BUF/LEN.  This uses the CRC used for
.gnu_debuglink files.  */
 
@@ -569,7 +588,8 @@ elf_initialize_syminfo (struct backtrace
const unsigned char *symtab_data, size_t symtab_size,
const unsigned char *strtab, size_t strtab_size,
backtrace_error_callback error_callback,
-   void *data, struct elf_syminfo_data *sdata)
+   void *data, struct elf_syminfo_data *sdata,
+   struct elf_ppc64_opd_data *opd)
 {
   size_t sym_count;
   const b_elf_sym *sym;
@@ -620,7 +640,17 @@ elf_initialize_syminfo (struct backtrace
  return 0;
}
   elf_symbols[j].name = (const char *) strtab + sym->st_name;
-  elf_symbols[j].address = sym->st_value + base_address;
+  /* Special case PowerPC64 ELFv1 symbols in .opd section, if the symbol
+is a function descriptor, read the actual code address from the
+descriptor.  */
+  if (opd
+ && sym->st_value >= opd->addr
+ && sym->st_value < opd->addr + opd->size)
+   elf_symbols[j].address
+ = *(const b_elf_addr *) (opd->data + (sym->st_value - opd->addr));
+  else
+   elf_symbols[j].address = sym->st_value;
+  elf_symbols[j].address += base_address;
   elf_symbols[j].size = sym->st_size;
   ++j;
 }
@@ -2637,6 +2667,7 @@ elf_add (struct backtrace_state *state,
   int debug_view_valid;
   unsigned int using_debug_view;
   uint16_t *zdebug_table;
+  struct elf_ppc64_opd_data opd_data, *opd;
 
   if (!debuginfo)
 {
@@ -2655,6 +2686,7 @@ elf_add (struct backtrace_state *state,
   debuglink_name = NULL;
   debuglink_crc = 0;
   debug_view_valid = 0;
+  opd = NULL;
 
   if (!backtrace_get_view (state, descriptor, 0, sizeof ehdr, error_callback,
   data, &ehdr_view))
@@ -2857,6 +2889,23 @@ elf_add (struct backtrace_state *state,
  debuglink_crc = *(const uint32_t*)(debuglink_data + crc_offset);
}
}
+
+  /* Read the .opd section on PowerPC64 ELFv1.  */
+  if (ehdr.e_machine == EM_PPC64
+ && (ehdr.e_flags & EF_PPC64_ABI) < 2
+ && shdr->sh_type == SHT_PROGBITS
+ && strcmp (name, ".opd") == 0)
+   {
+ if (!backtrace_get_view (state, descriptor, shdr->sh_offset,
+  shdr->sh_size, error_callback, data,
+  &opd_data.view))
+   goto fail;
+
+ opd = &opd_data;
+ opd->addr = shdr->sh_addr;
+ opd->data = (const char *) opd_data.view.data;
+ opd->size = shdr->sh_size;
+   }
 }
 
   if (symtab_shndx == 0)
@@ -2898,7 +2947,7 @@ elf_add (struct backtrace_state *state,
   if (!elf_initialize_syminfo (state, base_address,
   symtab_view.data, symtab_shdr->sh_size,
   strtab_view.data, strtab_shdr->sh_size,
-  error_callback, data, sdata))
+  error_callback, data, sdata, opd))
{
  backtrace_free (state, sdata, sizeof *sdata, error_callback, data);
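
The descriptor lookup that the patch adds to elf_initialize_syminfo can be
sketched in isolation as follows.  This is a simplified illustration rather
than the libbacktrace code: toy_opd and resolve_symbol are made-up names, the
descriptor is assumed to be in host byte order, and the check that a full
8-byte entry fits inside the section is an addition of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified view of an .opd section held in memory.
struct toy_opd
{
  std::uint64_t addr;          // address of the section
  const unsigned char *data;   // section contents
  std::size_t size;            // section size in bytes
};

// Return the code address for a symbol value.  If the value points into
// .opd, the first 8 bytes of the function descriptor hold the entry point;
// otherwise the value already is the code address.
std::uint64_t resolve_symbol (const toy_opd *opd, std::uint64_t st_value)
{
  if (opd != nullptr
      && st_value >= opd->addr
      && st_value < opd->addr + opd->size)
    {
      std::uint64_t off = st_value - opd->addr;
      std::uint64_t code;
      if (opd->size - off >= sizeof code)
        {
          // memcpy avoids alignment and aliasing problems.
          std::memcpy (&code, opd->data + off, sizeof code);
          return code;
        }
    }
  return st_value;
}
```

As in the patch, the caller would still add the load base address to the
result afterwards.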
 

Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2018 at 12:30:14PM +0100, Jakub Jelinek wrote:
> On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
> > as the subject line says, the attached patch rejects procedure
> > pointers in COMMON blocks (which is forbidden in F08). Since it's
> > apparently legal in F03, I'm still accepting it with -std=f2003 and
> > add that flag to a test case where this 'feature' is used. In another
> > one, I'm adding the error message that one gets with -std=f2008.
> > 
> > As my last submission, this fixes fallout from
> > https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
> > As the last one, it is a very simple fix for an accepts-invalid
> > problem (which is not a regression), so I hope this one will also
> > still be suitable for trunk (if not, I hope the release managers, in
> > CC, will stop me).
> > 
> > It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?
> 
> This broke libgomp.fortran/threadprivate4.f90 test.
> Adding ! { dg-additional-options "-std=f2003" }
> doesn't work, because the test uses
>   call abort
> which is a GNU extension and I have no idea how to choose allow_std
> which includes GNU but doesn't include F2008.

! { dg-additional-options "-std=f2003 -fdec" }

seems to work (because -std=f2003 sets
  gfc_option.allow_std = GFC_STD_F95_OBS | GFC_STD_F77
| GFC_STD_F2003 | GFC_STD_F95 | GFC_STD_F2008_OBS;
and -fdec adds:
  gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
| GFC_STD_GNU | GFC_STD_LEGACY;
), but it is quite nasty.  Isn't there a better way?

Kind of like -std=gnu++17 vs. -std=c++17 where the latter is standard
and former standard + GNU extensions (which would roughly be
"| GFC_STD_GNU | GFC_STD_LEGACY" in the fortran world).

Jakub
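
To spell out why the "-std=f2003 -fdec" combination behaves as described,
here is a bitmask sketch.  The bit values and the STD_* names are made up for
illustration (the real GFC_STD_* constants live in the gfortran sources and
may differ); only the way the two option sets are OR-ed together follows the
message above.

```cpp
// Made-up bit values standing in for the GFC_STD_* flags.
const unsigned STD_F77       = 1u << 0;
const unsigned STD_F95       = 1u << 1;
const unsigned STD_F95_OBS   = 1u << 2;
const unsigned STD_F95_DEL   = 1u << 3;
const unsigned STD_F2003     = 1u << 4;
const unsigned STD_F2008     = 1u << 5;
const unsigned STD_F2008_OBS = 1u << 6;
const unsigned STD_GNU       = 1u << 7;
const unsigned STD_LEGACY    = 1u << 8;

// What -std=f2003 sets, per the message above.
unsigned std_f2003 ()
{
  return STD_F95_OBS | STD_F77 | STD_F2003 | STD_F95 | STD_F2008_OBS;
}

// What -fdec adds on top.
unsigned add_fdec (unsigned allow_std)
{
  return allow_std | STD_F95_OBS | STD_F95_DEL | STD_GNU | STD_LEGACY;
}

// A feature is accepted if its standard bit is in the allowed set.
bool allowed (unsigned allow_std, unsigned feature)
{
  return (allow_std & feature) != 0;
}
```

In this model add_fdec (std_f2003 ()) allows STD_GNU features such as
"call abort" while STD_F2008 stays unset, which is the combination the
threadprivate4.f90 test needs.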


Re: [Patch, Fortran, F08] PR 84313: reject procedure pointers in COMMON blocks

2018-02-14 Thread Jakub Jelinek
On Tue, Feb 13, 2018 at 07:24:35PM +0100, Janus Weil wrote:
> Hi all,
> 
> as the subject line says, the attached patch rejects procedure
> pointers in COMMON blocks (which is forbidden in F08). Since it's
> apparently legal in F03, I'm still accepting it with -std=f2003 and
> add that flag to a test case where this 'feature' is used. In another
> one, I'm adding the error message that one gets with -std=f2008.
> 
> As my last submission, this fixes fallout from
> https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/AIHRQ2kJv3c.
> As the last one, it is a very simple fix for an accepts-invalid
> problem (which is not a regression), so I hope this one will also
> still be suitable for trunk (if not, I hope the release managers, in
> CC, will stop me).
> 
> It does regtest cleanly on x86_64-linux-gnu. Ok for trunk?

This broke libgomp.fortran/threadprivate4.f90 test.
Adding ! { dg-additional-options "-std=f2003" }
doesn't work, because the test uses
  call abort
which is a GNU extension and I have no idea how to choose allow_std
which includes GNU but doesn't include F2008.

Jakub


Re: [PATCH] Fix endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334)

2018-02-14 Thread Richard Biener
On Tue, 13 Feb 2018, Marc Glisse wrote:

> On Tue, 13 Feb 2018, Richard Biener wrote:
> 
> > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek 
> > wrote:
> > > Hi!
> > > 
> > > On the following testcase, we recurse infinitely, because
> > > we have float re-association enabled, but also rounding-math, so
> > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1
> > > but (cst2 + cst3) doesn't simplify and we try again and optimize
> > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on
> > > forever.  If @0 is not a CONSTANT_CLASS_P, there is not a problem,
> > > if it is, the code just checks if we can actually simplify the
> > > operation between cst2 and cst3 into a constant.
> > 
> > Is there a reason to try simplifying at all for constant @0?
> 
> Yes. cst2+cst3 might simplify (the operation happens to be exact and not
> require rounding), which leaves us with only one addition instead of 2.
> 
> On the other hand, mixing -frounding-math with reassociation seems strange to
> me, and likely not worth optimizing for.

./cc1 -quiet t.c -O -frounding-math -fassociative-math
cc1: warning: -fassociative-math disabled; other options take precedence

So _maybe_ we should disable these patterns for !flag_associative_math
when dealing with FP?

Richard.


[C++ Patch] tsubst_flags_t fixlet

2018-02-14 Thread Paolo Carlini

Hi,

today, while having a look at c++/84350, I noticed that in a couple of 
places we aren't forwarding the tsubst_flags_t argument to 
do_auto_deduction. I think we can elegantly solve the problem by 
removing the do_auto_deduction overload taking three arguments and 
adding two defaults to the other. Tested x86_64-linux.
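
For readers less familiar with the idiom, a minimal sketch of the change in shape, using hypothetical stand-in types (complain_flags, deduction_context, deduce) rather than the real tsubst_flags_t and do_auto_deduction: one function with default arguments replaces a short overload that forwarded fixed values.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real GCC types.
enum complain_flags { cf_none = 0, cf_warning_or_error = 3 };
enum deduction_context { dc_unspecified, dc_variable_type };

// Records which values the callee actually received.
struct deduction_result {
    complain_flags complain;
    deduction_context context;
};

// A single declaration with defaults.  Callers that pass only the
// leading arguments get the same values the removed forwarding
// overload used to supply, and callers that have their own complain
// flags can now pass them through.
deduction_result deduce(int type, int init,
                        complain_flags complain = cf_warning_or_error,
                        deduction_context context = dc_unspecified) {
    (void) type;
    (void) init;
    return deduction_result{complain, context};
}
```

A call such as deduce (t, i) behaves like the old short overload, while deduce (t, i, cf_none) forwards the caller's flags and still picks up the default context.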


Thanks, Paolo.

///

2018-02-14  Paolo Carlini  

* cp-tree.h (do_auto_deduction (tree, tree, tree)): Remove.
(do_auto_deduction (tree, tree, tree, tsubst_flags_t,
auto_deduction_context, tree, int)): Add defaults.
* pt.c (do_auto_deduction (tree, tree, tree)): Remove definition.
(tsubst_omp_for_iterator): Adjust do_auto_deduction call, forward
tsubst_flags_t argument.
* init.c (build_new): Likewise.
Index: cp-tree.h
===
--- cp-tree.h   (revision 257653)
+++ cp-tree.h   (working copy)
@@ -6470,10 +6470,11 @@ extern tree make_auto   (void);
 extern tree make_decltype_auto (void);
 extern tree make_template_placeholder  (tree);
 extern bool template_placeholder_p (tree);
-extern tree do_auto_deduction   (tree, tree, tree);
 extern tree do_auto_deduction   (tree, tree, tree,
- tsubst_flags_t,
- auto_deduction_context,
+ tsubst_flags_t
+= tf_warning_or_error,
+ auto_deduction_context
+= adc_unspecified,
 tree = NULL_TREE,
 int = LOOKUP_NORMAL);
 extern tree type_uses_auto (tree);
Index: init.c
===
--- init.c  (revision 257653)
+++ init.c  (working copy)
@@ -3593,7 +3593,7 @@ build_new (vec **placement, tree type
  d_init = (**init)[0];
  d_init = resolve_nondeduced_context (d_init, complain);
}
- type = do_auto_deduction (type, d_init, auto_node);
+ type = do_auto_deduction (type, d_init, auto_node, complain);
}
 }
 
Index: pt.c
===
--- pt.c(revision 257653)
+++ pt.c(working copy)
@@ -15785,7 +15785,7 @@ tsubst_omp_for_iterator (tree t, int i, tree declv
   tree auto_node = type_uses_auto (TREE_TYPE (decl));
   if (auto_node && init)
 TREE_TYPE (decl)
-  = do_auto_deduction (TREE_TYPE (decl), init, auto_node);
+  = do_auto_deduction (TREE_TYPE (decl), init, auto_node, complain);
 
   gcc_assert (!type_dependent_expression_p (decl));
 
@@ -25941,17 +25941,6 @@ do_class_deduction (tree ptype, tree tmpl, tree in
 }
 
 /* Replace occurrences of 'auto' in TYPE with the appropriate type deduced
-   from INIT.  AUTO_NODE is the TEMPLATE_TYPE_PARM used for 'auto' in TYPE.  */
-
-tree
-do_auto_deduction (tree type, tree init, tree auto_node)
-{
-  return do_auto_deduction (type, init, auto_node,
-tf_warning_or_error,
-adc_unspecified);
-}
-
-/* Replace occurrences of 'auto' in TYPE with the appropriate type deduced
from INIT.  AUTO_NODE is the TEMPLATE_TYPE_PARM used for 'auto' in TYPE.
The CONTEXT determines the context in which auto deduction is performed
and is used to control error diagnostics.  FLAGS are the LOOKUP_* flags.


Re: [PATCH, rs6000] (v2) PR84220 remove RS6000_BTI_NOT_OPAQUE refs from builtins table

2018-02-14 Thread Segher Boessenkool
Hi!

On Tue, Feb 13, 2018 at 05:40:08PM -0600, Will Schmidt wrote:
>   Some of our builtin definitions were allowing invalid parameters, and a
> subsequent ICE (on invalid code) were the result.  This is due to the use of
> RS6000_BTI_NOT_OPAQUE (which allowed vector arguments), where a
> RS6000_BTI_INTSI appears to be a more appropriate choice.
> This change adjusts the definitions for the VEC_SLD, VEC_SLDW, vec_XXSLDWI
> and VEC_XXPERMDI entries.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr84220-xxperm.c
> @@ -0,0 +1,100 @@
> +/* PR target/84220 */
> +/* Test to ensure we generate invalid parameter errors rather than an ICE
> +when calling vec_xxpermdi() with invalid parameters.  */
> +/* { dg-do compile { target { powerpc64*-*-* } } } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-O2 -mvsx" } */

Does this test need powerpc64*?  Or does it need lp64 instead, or nothing?

Looks good, please look at that detail again; okay for trunk.  Thanks!


Segher


[PATCH][i386] Adjust vec_construct cost for AVX256/512, penalize elementwise load vectorization

2018-02-14 Thread Richard Biener

The following tries to account for the fact that when constructing
AVX256 or AVX512 vectors from elements we can only use insertps to
insert into the low 128 bits of a vector, and have to use
vinserti128 or vinserti64x4 to build the larger AVX256/512 vectors.
Those operations also have higher latency (Agner documents
3 cycles for Broadwell for reg-reg vinserti128 while insertps has
one cycle latency).  Agner doesn't have tables for AVX512 yet but
I guess the story is similar for vinserti64x4.

Latency is similar for FP adds so I re-used ix86_cost->addss for
this cost.

This works towards fixing the PRs referenced below, where we end
up vectorizing a lot of loads via elementwise construction, mostly
"enabled" by the new support for alias versioning for variable
strides.  As analyzed for PR84037, the large number of scalar
loads and vector builds before any meaningful computation means
the CPU is bottlenecked on AGU and load ops and doesn't get
any meaningful work done, so the vectorization should end up
being not profitable (with some more massaging in the vectorizer
and using SLP, which reduces the number of loads a lot, I can only
get to the same speed as the non-vectorized code).

So the real fix for those issues is to account for those
microarchitectural issues in the backend costing.  I've decided
to plumb this onto the vector construction op if that happens
to be fed by loads, scaling this cost by the number of
vector elements (overall latency should grow with the number
of dependences).
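
A simplified sketch of the resulting cost shape, for illustration only.  The costs struct and both function names are hypothetical, the numbers a caller plugs in are made up, and the sketch ignores any extra scaling ix86_vec_cost may apply.

```cpp
#include <cassert>

// Hypothetical per-operation costs; the real values come from the
// per-processor cost tables and differ with tuning.
struct costs {
    int sse_op;  // cost of one element insert
    int addss;   // re-used as the cost of one wide insert
};

// One insert per element, plus the extra wide inserts needed to
// assemble a 256-bit or 512-bit vector from 128-bit pieces.
int vec_construct_cost(const costs &c, int vector_bits, int nelts) {
    assert(vector_bits == 128 || vector_bits == 256 || vector_bits == 512);
    int cost = nelts * c.sse_op;
    if (vector_bits == 256)
        cost += c.addss;       // one vinserti128
    else if (vector_bits == 512)
        cost += 3 * c.addss;   // one vinserti64x4 and two vinserti128
    return cost;
}

// When the construction is fed by elementwise loads, scale the
// construction cost by the number of vector elements.
int elementwise_load_cost(const costs &c, int vector_bits, int nelts) {
    return vec_construct_cost(c, vector_bits, nelts) * nelts;
}
```

With made-up costs of sse_op = 1 and addss = 3, an 8-element 256-bit build costs 8 + 3 = 11, and 88 once scaled for elementwise loads, which is what pushes such loops towards "not profitable".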

Bootstrap/regtest running on x86_64-unknown-linux-gnu.

I've benchmarked this on Haswell with SPEC CPU 2006, and a three-run
comparison shows that it doesn't regress any benchmark beyond noise but
improves 416.gamess by 7%, 465.tonto by 6% and 481.wrf by 2%.  It also
fixes the Polyhedron capacita regression (which is what I "tuned" the
factoring with).  I've mentioned the bugs referring to any of the
affected benchmarks above in the ChangeLog, but it still has to be
verified whether those bugs are fully fixed (84037 is).

Ok for trunk?

Any confirmation of the microarchitectural bottleneck in, say,
Capacita from people with access to cycle-accurate simulators
is welcome ;)  Performance counters only help so much (not much...),
so my guesses are based on Agner and finger-counting.

Thanks,
Richard.

2018-02-13  Richard Biener  

PR tree-optimization/84037
PR tree-optimization/84016
PR target/82862
* config/i386/i386.c (ix86_builtin_vectorization_cost):
Adjust vec_construct for the fact we need additional higher latency
128bit inserts for AVX256 and AVX512 vector builds.
(ix86_add_stmt_cost): Scale vector construction cost for
elementwise loads.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 257620)
+++ gcc/config/i386/i386.c  (working copy)
@@ -45904,7 +45904,18 @@ ix86_builtin_vectorization_cost (enum ve
  ix86_cost->sse_op, true);
 
   case vec_construct:
-   return ix86_vec_cost (mode, ix86_cost->sse_op, false);
+   {
+ /* N element inserts.  */
+ int cost = ix86_vec_cost (mode, ix86_cost->sse_op, false);
+ /* One vinserti128 for combining two SSE vectors for AVX256.  */
+ if (GET_MODE_BITSIZE (mode) == 256)
+   cost += ix86_vec_cost (mode, ix86_cost->addss, true);
+ /* One vinserti64x4 and two vinserti128 for combining SSE
+and AVX256 vectors to AVX512.  */
+ else if (GET_MODE_BITSIZE (mode) == 512)
+   cost += 3 * ix86_vec_cost (mode, ix86_cost->addss, true);
+ return cost;
+   }
 
   default:
 gcc_unreachable ();
@@ -50243,6 +50254,18 @@ ix86_add_stmt_cost (void *data, int coun
  break;
}
 }
+  /* If we do elementwise loads into a vector then we are bound by
+ latency and execution resources for the many scalar loads
+ (AGU and load ports).  Try to account for this by scaling the
+ construction cost by the number of elements involved.  */
+  if (kind == vec_construct
+  && stmt_info
+  && stmt_info->type == load_vec_info_type
+  && stmt_info->memory_access_type == VMAT_ELEMENTWISE)
+{
+  stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
+  stmt_cost *= TYPE_VECTOR_SUBPARTS (vectype);
+}
   if (stmt_cost == -1)
 stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
 


Check array indices in object_address_invariant_in_loop_p (PR 84357)

2018-02-14 Thread Richard Sandiford
object_address_invariant_in_loop_p ignored ARRAY_REF indices on
the basis that:

  /* Index of the ARRAY_REF was zeroed in analyze_indices, thus we only
 need to check the stride and the lower bound of the reference.  */

That was true back in 2007 when the code was added:

static void
dr_analyze_indices (struct data_reference *dr, struct loop *nest)
{
  [...]
  while (handled_component_p (aref))
{
  if (TREE_CODE (aref) == ARRAY_REF)
{
  op = TREE_OPERAND (aref, 1);
  access_fn = analyze_scalar_evolution (loop, op);
  access_fn = resolve_mixers (nest, access_fn);
  VEC_safe_push (tree, heap, access_fns, access_fn);

  TREE_OPERAND (aref, 1) = build_int_cst (TREE_TYPE (op), 0);
}

  aref = TREE_OPERAND (aref, 0);
}

but the assignment was removed a few years ago.  We were therefore
treating "two->arr[i]" and "three->arr[i]" as loop invariant.
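
A toy model of the difference, not the real tree-data-ref.c code.  The array_ref struct and both function names are hypothetical; operand numbering follows ARRAY_REF (0 = base, 1 = index, 2 = lower bound, 3 = element size), and the walk down to the base object is left out.

```cpp
#include <cassert>

// varies_in_loop[i] is true when operand i of the reference depends on
// a value defined inside the loop.
struct array_ref {
    bool varies_in_loop[4];
};

// Old behaviour: only the lower bound and element size are checked,
// on the assumption that the index had been zeroed earlier.
bool old_operands_invariant(const array_ref &r) {
    return !(r.varies_in_loop[2] || r.varies_in_loop[3]);
}

// New behaviour: the index (operand 1) is checked as well.
bool new_operands_invariant(const array_ref &r) {
    for (int i = 1; i < 4; ++i)
        if (r.varies_in_loop[i])
            return false;
    return true;
}
```

For a reference like arr[i] inside a loop over i, only operand 1 varies, so the old check reports it as invariant and the new one does not.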

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2018-02-14  Richard Sandiford  

gcc/
PR tree-optimization/84357
* tree-data-ref.c (object_address_invariant_in_loop_p): Check
operand 1 of an ARRAY_REF too.

gcc/testsuite/
PR tree-optimization/84357
* gcc.dg/vect/pr84357.c: New test.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c 2018-02-08 15:16:21.784407397 +
+++ gcc/tree-data-ref.c 2018-02-14 09:42:14.801095011 +
@@ -2200,13 +2200,10 @@ object_address_invariant_in_loop_p (cons
 {
   if (TREE_CODE (obj) == ARRAY_REF)
{
- /* Index of the ARRAY_REF was zeroed in analyze_indices, thus we only
-need to check the stride and the lower bound of the reference.  */
- if (chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, 2),
- loop->num)
- || chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, 3),
-loop->num))
-   return false;
+ for (int i = 1; i < 4; ++i)
+   if (chrec_contains_symbols_defined_in_loop (TREE_OPERAND (obj, i),
+   loop->num))
+ return false;
}
   else if (TREE_CODE (obj) == COMPONENT_REF)
{
Index: gcc/testsuite/gcc.dg/vect/pr84357.c
===
--- /dev/null   2018-02-10 09:05:46.714416790 +
+++ gcc/testsuite/gcc.dg/vect/pr84357.c 2018-02-14 09:42:14.800095067 +
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall" } */
+
+#define COUNT 32
+
+typedef struct s1 {
+unsigned char c;
+} s1;
+
+typedef struct s2
+{
+char pad;
+s1 arr[COUNT];
+} s2;
+
+typedef struct s3 {
+s1 arr[COUNT];
+} s3;
+
+s2 * get_s2();
+s3 * gActiveS3;
+void foo()
+{
+s3 * three = gActiveS3;
+s2 * two = get_s2();
+
+for (int i = 0; i < COUNT; i++)
+{
+two->arr[i].c = three->arr[i].c;
+}
+}


Re: [SFN+LVU+IEPM v4 7/9] [LVU] Introduce location views

2018-02-14 Thread Andreas Schwab
On Feb 13 2018, Alexandre Oliva  wrote:

> The patch I posted last night should work around this problem, in that
> it will disable LVU by default if the assembler doesn't support .loc
> views, and then you won't get this error any more, unless you explicitly
> ask for location views.  If you can give it a try on ia64-linux-gnu,
> that would be appreciated.

That fixes the build failure on ia64.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[Aarch64] Fix conditional branches with target far away.

2018-02-14 Thread Sameera Deshpande
Hi!

Please find attached a patch to fix a bug in branches with offsets over 1MiB.
There has been an attempt to fix this issue in commit
050af05b9761f1979f11c151519e7244d5becd7c

However, the far_branch attribute defined in that patch used
insn_length, which computes an incorrect offset. Hence, I eliminated
the attribute completely and computed the offset from INSN_ADDRESSES
instead.
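
The range test this boils down to can be sketched as follows.  This is an illustration only: needs_far_branch and the two constants are hypothetical names, and the bounds are the ones used by the "ge -1048576" / "lt 1048572" tests in the length attributes, so an offset of exactly -1048576 counts as in range here.

```cpp
#include <cassert>
#include <cstdint>

// Reach of a conditional branch, in bytes relative to the branch.
constexpr std::int64_t kCondBranchMin = -1048576;  // inclusive
constexpr std::int64_t kCondBranchEnd = 1048572;   // exclusive

// Decide from the two instruction addresses whether the target is out
// of reach and the branch has to be emitted as a far-branch sequence.
bool needs_far_branch(std::int64_t target_addr, std::int64_t insn_addr) {
    std::int64_t offset = target_addr - insn_addr;
    return offset < kCondBranchMin || offset >= kCondBranchEnd;
}
```

In the machine description the two addresses would come from INSN_ADDRESSES for the label and for the branch insn itself.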

Ok for trunk?

gcc/ChangeLog

2018-02-13 Sameera Deshpande 
* config/aarch64/aarch64.md (far_branch): Remove attribute. Eliminate
all the dependencies on the attribute from RTL patterns.

-- 
- Thanks and regards,
  Sameera D.
Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md	(revision 257620)
+++ gcc/config/aarch64/aarch64.md	(working copy)
@@ -244,13 +244,6 @@
 	(const_string "no")
 	] (const_string "yes")))
 
-;; Attribute that specifies whether we are dealing with a branch to a
-;; label that is far away, i.e. further away than the maximum/minimum
-;; representable in a signed 21-bits number.
-;; 0 :=: no
-;; 1 :=: yes
-(define_attr "far_branch" "" (const_int 0))
-
 ;; Strictly for compatibility with AArch32 in pipeline models, since AArch64 has
 ;; no predicated insns.
 (define_attr "predicated" "yes,no" (const_string "no"))
@@ -448,12 +441,7 @@
 	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
 			   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
 		  (const_int 4)
-		  (const_int 8)))
-   (set (attr "far_branch")
-	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-			   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
-		  (const_int 0)
-		  (const_int 1)))]
+		  (const_int 8)))]
 )
 
 ;; For a 24-bit immediate CST we can optimize the compare for equality
@@ -670,12 +658,7 @@
 	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -1048576))
 			   (lt (minus (match_dup 1) (pc)) (const_int 1048572)))
 		  (const_int 4)
-		  (const_int 8)))
-   (set (attr "far_branch")
-	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-			   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
-		  (const_int 0)
-		  (const_int 1)))]
+		  (const_int 8)))]
 )
 
 (define_insn "*tb1"
@@ -692,7 +675,11 @@
   {
 if (get_attr_length (insn) == 8)
   {
-	if (get_attr_far_branch (insn) == 1)
+	long long int offset;
+	offset = INSN_ADDRESSES (INSN_UID (XEXP (operands[2], 0)))
+		  - INSN_ADDRESSES (INSN_UID (insn));
+
+	if (offset <= -1048576 || offset >= 1048572)
 	  return aarch64_gen_far_branch (operands, 2, "Ltb",
 	 "\\t%0, %1, ");
 	else
@@ -709,12 +696,7 @@
 	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768))
 			   (lt (minus (match_dup 2) (pc)) (const_int 32764)))
 		  (const_int 4)
-		  (const_int 8)))
-   (set (attr "far_branch")
-	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-			   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
-		  (const_int 0)
-		  (const_int 1)))]
+		  (const_int 8)))]
 
 )
 
@@ -727,8 +709,12 @@
   ""
   {
 if (get_attr_length (insn) == 8)
-  {
-	if (get_attr_far_branch (insn) == 1)
+   {
+long long int offset;
+offset = INSN_ADDRESSES (INSN_UID (XEXP (operands[1], 0)))
+		 - INSN_ADDRESSES (INSN_UID (insn));
+
+	if (offset <= -1048576 || offset >= 1048572)
 	  return aarch64_gen_far_branch (operands, 1, "Ltb",
 	 "\\t%0, , ");
 	else
@@ -740,7 +726,7 @@
 	output_asm_insn (buf, operands);
 	return "\t%l1";
 	  }
-  }
+   }
 else
   return "\t%0, , %l1";
   }
@@ -749,12 +735,7 @@
 	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768))
 			   (lt (minus (match_dup 1) (pc)) (const_int 32764)))
 		  (const_int 4)
-		  (const_int 8)))
-   (set (attr "far_branch")
-	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -1048576))
-			   (lt (minus (match_dup 1) (pc)) (const_int 1048572)))
-		  (const_int 0)
-		  (const_int 1)))]
+		  (const_int 8)))]
 )
 
 ;; ---