Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-11-03 Thread Jeff Law

On 10/20/2017 06:18 PM, Martin Sebor wrote:


What might be even better would be to use the immediate uses of the
memory tag.  For your case there should be only one immediate use 
and it

should point to the statement which NUL terminates the destination.  Or
maybe that would be worse in that you only want to allow this exception
when the statements are consecutive.


You said "maybe that would be worse" so I hadn't implemented it.
I went ahead and coded it up but with more testing I don't think
it has the desired result.  See below.
I probably should have said that while it may be better at rooting out 
false positives, the ability to find things that are not consecutive 
could perhaps be seen as a negative: it might fail to warn on code 
that, while not technically incorrect, is likely dodgy.







I'll have to try this to better understand how it might work.

It's actually quite simple.

Rather than looking at the next statement in the chain via
gsi_next_nondebug you follow the def->use chain for the memory tag
associated with the string copy statement.

/* Get the memory tag that is defined by this statement.  */
tree defvar = gimple_vdef (gsi_stmt (gsi));

if (num_imm_uses (defvar) == 1)
  {
    imm_use_iterator iter;
    gimple *use_stmt;

    /* Iterate over the immediate uses of the memory tag.  */
    FOR_EACH_IMM_USE_STMT (use_stmt, iter, defvar)
      {
        /* Check whether USE_STMT is dst[i] = '\0'.  */
      }
  }



The check that there is a single immediate use is designed to make sure
you get a warning for this scenario:


Thanks for the outline of the solution.  I managed to get it to
work with only a few minor changes (*) but...


strxncpy
read the destination
terminate the destination

Which I think you'd want to consider non-terminated because of the read
of the destination prior to termination.

But avoids warnings for

strxncpy
stuff that doesn't read the destination
terminate the destination


...while it works fine for the basic cases it has the downside
of missing more subtle problems like this one:

   char a[8];

   void f (void) { puts (a); }

   void g (const char *s)
   {
     strncpy (a, s, sizeof a - 1);

     f ();   // assumes a is a string

     a[sizeof a - 1] = '\0';
   }

Don't you have a VDEF at the call to f()?  Wouldn't that be the only 
immediate use?  If you hit a VDEF without finding the termination, then 
you warn.


Or is it the case that the call gets inlined and we know enough about 
puts that we get a VUSE rather than a VDEF?


In which case we have a VUSE at the puts and a VDEF at the array 
assignment in the immediate use list.


I think this highlights that I over-simplified my pseudo code.  If you 
find a VUSE in the list, then you have to warn.  If you find a VDEF that 
does not terminate, then you have to warn.


It's probably academic at this point -- it's probably more useful to 
understand the immediate uses and the vuse/vdef work for the future.  I 
can live with a statement walking implementation.




or this one:

   struct S { char a[8]; };

   void f (const struct S *p) { puts (p->a); }

   void g (struct S *p, const char *s)
   {
     strncpy (p->a, s, sizeof p->a - 1);

     f (p);   // assumes p->a is a string

     p->a[sizeof p->a - 1] = '\0';
   }

This one looks similar.




+  if (TREE_CODE (dest) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (dest);
+      if (!is_gimple_assign (stmt))
+	return NULL_TREE;
+
+      dest = gimple_assign_rhs1 (stmt);
+    }


This seems wrong as-written -- you have no idea what the RHS code is.
You're just blindly taking the first operand, then digging into its
type.  It probably works in practice, but it would seem better to verify
that gimple_assign_rhs_code is a conversion first (CONVERT_EXPR_CODE_P).
 If it's not a CONVERT_EXPR_CODE_P, return NULL_TREE.


The code doesn't match CONVERT_EXPR_P().  It's expected to match
ADDR_EXPR and that's also what it tests for on the line just below
the hunk you pasted above.  Here it is:

[ ... ]


   if (TREE_CODE (dest) == SSA_NAME)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
       if (!is_gimple_assign (stmt))
	 return NULL_TREE;

       dest = gimple_assign_rhs1 (stmt);
     }

   if (TREE_CODE (dest) != ADDR_EXPR)
     return NULL_TREE;

So unless I'm missing something this does what you're looking
for.  But you're still making assumptions that you don't know are valid.


You walk to the defining statement of the given SSA_NAME.  You verify it 
is an assignment.  You're OK up to this point.


You then proceed to look at rhs1. Instead you should look at 
gimple_assign_rhs_code and verify that is an ADDR_EXPR.


For your particular example I don't think it matters, but I suspect it'd 
matter for something if the defining statement was something like a 
POINTER_PLUS_EXPR where the first operand might be an ADDR_EXPR.




Jeff


Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-11-03 Thread Daniel Santos
On 11/03/2017 04:22 PM, Daniel Santos wrote:
> ...
> How does this patch look?  (Also, I've updated comments for
> choose_baseaddr.)  Currently re-running tests.
>
> Thanks,
> Daniel
>
> @@ -13110,10 +13125,26 @@ ix86_expand_prologue (void)
>target.  */
>if (TARGET_SEH)
>   m->fs.sp_valid = false;
> -}
>  
> -  if (m->call_ms2sysv)
> -ix86_emit_outlined_ms2sysv_save (frame);
> +  /* If SP offset is non-immediate after allocation of the stack frame,
> +  then emit SSE saves or stub call prior to allocating the rest of the
> +  stack frame.  This is less efficient for the out-of-line stub because
> +  we can't combine allocations across the call barrier, but it's better
> +  than using a scratch register.  */
> +  else if (!x86_64_immediate_operand (GEN_INT (frame.stack_pointer_offset - m->fs.sp_realigned_offset), Pmode))

Oops, and also after fixing this formatting...

Daniel


[PATCH], PR 82748, Fix __builtin_fabsq on PowerPC

2017-11-03 Thread Michael Meissner
This patch fixes PR 82748, which is a compiler abort if you use the old
__builtin_fabsq function when you are changing the long double default from IBM
double-double format to IEEE.

The problem is __builtin_fabsq returns a KFmode type, but when you use
-mabi=ieeelongdouble, the float128 type is TFmode.

In fixing this, I made use of the recent changes that I made to move float128
fabs, copysign, fma, sqrt, etc. handling to machine independent code.  I just
made __builtin_fabsq map into __builtin_fabsf128.  So all of the old 'q'
built-ins got deleted, and #define'ed to the new name.

A related issue is the round to odd functions only take KFmode arguments and
return KFmode results.  These would not work when the -mabi=ieeelongdouble
switch is used.  I originally was going down the usual path of using the
overloaded builtin support to support these functions with KFmode or TFmode
arguments.  However, it was getting complex, in that we would need to move the
TARGET_IEEEQUAD option to a option bit, add new BTM declarations, add new BTI
type fields, etc.  It was just simpler to build the calls manually instead of
using the rs6000-builtin.def machinery.

I have checked these patches on a little endian power8 system, and after
bootstrap and check, there were no regressions.  Can I check this patch into
trunk?

Note, I will need to re-engineer a different patch for GCC 7, as the machine
independent handling of f128 built-ins is not in GCC 7 (the bug was against GCC
7.x).

[gcc]
2017-11-03  Michael Meissner  

PR target/82748
* config/rs6000/rs6000-builtin.def (BU_FLOAT128_1): Delete
float128 helper macros, which are no longer used after deleting
the old 'q' built-in functions, and moving the round to odd
built-in functions to being special built-in functions.
(BU_FLOAT128_2): Likewise.
(BU_FLOAT128_1_HW): Likewise.
(BU_FLOAT128_2_HW): Likewise.
(BU_FLOAT128_3_HW): Likewise.
(FABSQ): Delete old 'q' built-in functions.
(COPYSIGNQ): Likewise.
(NANQ): Likewise.
(NANSQ): Likewise.
(INFQ): Likewise.
(HUGE_VALQ): Likewise.
(SQRTF128_ODD): Move round to odd built-in functions to be
special built-in functions, so that we can handle
-mabi=ieeelongdouble.
(TRUNCF128_ODD): Likewise.
(ADDF128_ODD): Likewise.
(SUBF128_ODD): Likewise.
(MULF128_ODD): Likewise.
(DIVF128_ODD): Likewise.
(FMAF128_ODD): Likewise.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Map old 'q'
built-in names to 'f128'.
* config/rs6000/rs6000.c (rs6000_fold_builtin): Remove folding the
old 'q' built-in functions, as the machine independent code for
'f128' built-in functions handles this.
(rs6000_expand_builtin): Add expansion for float128 round to odd
functions, keying off on -mabi=ieeelongdouble of whether to use
the KFmode or TFmode variant.
(rs6000_init_builtins): Initialize the _Float128 round to odd
built-in functions.
* doc/extend.texi (PowerPC Built-in Functions): Document the old
_Float128 'q' built-in functions are now mapped into the new
'f128' built-in functions.

[gcc/testsuite]
2017-11-03  Michael Meissner  

PR target/82748
* gcc.target/powerpc/pr82748-1.c: New test.
* gcc.target/powerpc/pr82748-2.c: Likewise.



-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 254357)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -660,48 +660,6 @@
 | RS6000_BTC_BINARY),  \
CODE_FOR_ ## ICODE) /* ICODE */
 
-/* IEEE 128-bit floating-point builtins.  */
-#define BU_FLOAT128_2(ENUM, NAME, ATTR, ICODE)  \
-  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
-   "__builtin_" NAME,  /* NAME */  \
-   RS6000_BTM_FLOAT128,/* MASK */  \
-   (RS6000_BTC_ ## ATTR/* ATTR */  \
-| RS6000_BTC_BINARY),  \
-   CODE_FOR_ ## ICODE) /* ICODE */
-
-#define BU_FLOAT128_1(ENUM, NAME, ATTR, ICODE)  \
-  RS6000_BUILTIN_1 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
-   "__builtin_" NAME,  /* NAME */  \
-   RS6000_BTM_FLOAT128,/* MASK */  \
-   (RS6000_BTC_ ## ATTR/* ATTR 

Re: [PATCH] Add a warning for invalid function casts

2017-11-03 Thread Joseph Myers
On Mon, 9 Oct 2017, Bernd Edlinger wrote:

> +type @code{void (*) (void);} is special and matches everything, which can

The type name should not include ";".

The non-C++ parts of the patch are OK with that change.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Ping: [patch, fortran] KIND arguments for MINLOC and MAXLOC

2017-11-03 Thread Steve Kargl
On Fri, Nov 03, 2017 at 10:04:21PM +0100, Thomas Koenig wrote:
> Am 28.10.2017 um 23:57 schrieb Thomas Koenig:
> 
> Ping?
> 

Ok.  (I thought you had already committed this.)

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [PATCH 2/2] [i386] PR82002 Part 2: Correct non-immediate offset/invalid INSN

2017-11-03 Thread Daniel Santos
On 11/03/2017 02:09 AM, Uros Bizjak wrote:
> On Thu, Nov 2, 2017 at 11:43 PM, Daniel Santos  
> wrote:
>
int_registers_saved = (frame.nregs == 0);
sse_registers_saved = (frame.nsseregs == 0);
 +  save_stub_call_needed = (m->call_ms2sysv);
 +  gcc_assert (!(!sse_registers_saved && save_stub_call_needed));
>>> Oooh, double negation :(
>> I'm just saying that we shouldn't be saving SSE registers inline and via
>> the stub.  If I followed the naming convention of e.g.,
>> "sse_registers_saved" then my variable would end up being called
>> "save_stub_called" which would be incorrect and misleading, similar to
>> how "sse_registers_saved" is misleading when there are in fact no SSE
>> registers that need to be saved.  Maybe I should rename
>> (int|sse)_registers_saved to (int|sse)_register_saves_needed with
>> inverted logic instead.
> But, we can just say
>
> gcc_assert (sse_registers_saved || !save_stub_call_needed);
>
> No?
>
> Uros.
>

Oh yes, I see.  Because "sse_registers_saved" really means that we've
either already saved them or don't have to, and not literally that they
have been saved.  I ranted about its name but didn't think it all the
way through. :)

How does this patch look?  (Also, I've updated comments for
choose_baseaddr.)  Currently re-running tests.

Thanks,
Daniel
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2967876..fb81d4dba84 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11515,12 +11515,15 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx &base_reg,
    an alignment value (in bits) that is preferred or zero and will
    receive the alignment of the base register that was selected,
    irrespective of whether or not CFA_OFFSET is a multiple of that
-   alignment value.
+   alignment value.  If it is possible for the base register offset to be
+   non-immediate then SCRATCH_REGNO should specify a scratch register to
+   use.
 
The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
-choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
+choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align,
+		 unsigned int scratch_regno = INVALID_REGNUM)
 {
   rtx base_reg = NULL;
   HOST_WIDE_INT base_offset = 0;
@@ -11534,6 +11537,19 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
 choose_basereg (cfa_offset, base_reg, base_offset, 0, align);
 
   gcc_assert (base_reg != NULL);
+
+  rtx base_offset_rtx = GEN_INT (base_offset);
+
+  if (!x86_64_immediate_operand (base_offset_rtx, Pmode))
+{
+  gcc_assert (scratch_regno != INVALID_REGNUM);
+
+  rtx scratch_reg = gen_rtx_REG (Pmode, scratch_regno);
+  emit_move_insn (scratch_reg, base_offset_rtx);
+
+  return gen_rtx_PLUS (Pmode, base_reg, scratch_reg);
+}
+
   return plus_constant (Pmode, base_reg, base_offset);
 }
 
@@ -12793,23 +12809,19 @@ ix86_emit_outlined_ms2sysv_save (const struct ix86_frame &frame)
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
-  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
 
   /* AL should only be live with sysv_abi.  */
   gcc_assert (!ix86_eax_live_at_start_p ());
+  gcc_assert (m->fs.sp_offset >= frame.sse_reg_save_offset);
 
   /* Setup RAX as the stub's base pointer.  We use stack_realign_offset
  irrespective of whether we've actually realigned the stack or not.  */
   align = GET_MODE_ALIGNMENT (V4SFmode);
   addr = choose_baseaddr (frame.stack_realign_offset
-			  + xlogue.get_stub_ptr_offset (), &align);
+			  + xlogue.get_stub_ptr_offset (), &align, AX_REG);
   gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Allocate stack if not already done.  */
-  if (allocate > 0)
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-allocate), -1, false);
+  emit_insn (gen_rtx_SET (rax, addr));
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
@@ -12841,6 +12853,7 @@ ix86_expand_prologue (void)
   HOST_WIDE_INT allocate;
   bool int_registers_saved;
   bool sse_registers_saved;
+  bool save_stub_call_needed;
   rtx static_chain = NULL_RTX;
 
   if (ix86_function_naked (current_function_decl))
@@ -13016,6 +13029,8 @@ ix86_expand_prologue (void)
 
   int_registers_saved = (frame.nregs == 0);
   sse_registers_saved = (frame.nsseregs == 0);
+  save_stub_call_needed = (m->call_ms2sysv);
+  gcc_assert (sse_registers_saved || !save_stub_call_needed);
 
   if (frame_pointer_needed && !m->fs.fp_valid)
 {
@@ -13110,10 +13125,26 @@ ix86_expand_prologue (void)
 	 target.  */
   if (TARGET_SEH)
 	m->fs.sp_valid = false;
-}
 
-  if (m->call_ms2sysv)
-ix86_emit_outlined_ms2sysv_save (frame);
+  /* If SP offset is non-immediate after allocation of the stack frame,
+	 then emit SSE saves or stub call prior to allocating the rest of 

Re: [patch, fortran] Index interchange for FORALL and DO CONCURRENT

2017-11-03 Thread Thomas Koenig

Am 31.10.2017 um 21:56 schrieb Bernhard Reutner-Fischer:

On Tue, Oct 31, 2017 at 09:50:37PM +0100, Bernhard Reutner-Fischer wrote:

On Tue, Oct 31, 2017 at 09:30:27PM +0100, Thomas Koenig wrote:


Or maybe emit diagnostics into the frontend optimize dump file and scan
that?


If we could check the Fortran tree dumps with dejagnu, that would be
doable. Unfortunately, we don't have that in place.


Well that should be rather easy.
Don't we have a basic scan-dump where we can pass the file as well as a
regexp? I might look into this later.


and there is a scan-lang-dump which may be exactly what is needed for this case.


I have looked at this a little, and currently, there is no easy
way to scan the dump from -fdump-fortran-original.

My preference would be to commit the patch as is and open a PR
for scanning the dump, with a note about the missing test case.

So, is the original patch (with the spelling corrections) OK for trunk?

Regards

Thomas




Ping: [patch, fortran] KIND arguments for MINLOC and MAXLOC

2017-11-03 Thread Thomas Koenig

Am 28.10.2017 um 23:57 schrieb Thomas Koenig:

Ping?


the attached patch allows KIND arguments to MINLOC and MAXLOC.
There was a bit of a choice to make here. Originally, I wanted to
run the calculation using index_type only and convert to another
integer kind if that was required. This ran into the issue that
bounds checking fails for this approach if there is a conversion
( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82660 ), and I
got regressions for that.

On the other hand, I wanted to avoid adding kind=1 and kind=2 versions
to the library. This approach had been rejected some time ago,
in 2009.

So, I chose a third path by using only pre-existing library
functions for kind=4, kind=8 and kind=16 and by doing a conversion
if the user specified kind=1 or kind=2.

This introduces a bug (array bounds violation not caught) if the user

- specifies bounds checking
- chooses kind=1 or kind=2 for minloc or maxloc (it escapes me why
   anybody would want to do that)
- uses an array as return value whose bounds cannot be determined
   at compile-time, and gets the dimension of that array wrong

Frankly, if anybody would do this, the expression "deserves to lose"
comes to mind.

This would not be a regression, because kind=1 and kind=2 are
not supported at the moment.  This bug would be fixed together
with 82660.

Regression-tested. OK for trunk?

Regards

 Thomas

2017-10-28  Thomas Koenig  

     PR fortran/29600
     * gfortran.h (gfc_check_f): Replace f3ml with f4ml.
     * intrinsic.h (gfc_resolve_maxloc): Add gfc_expr * to argument
     list in prototype.
     (gfc_resolve_minloc): Likewise.
     * check.c (gfc_check_minloc_maxloc): Handle kind argument.
     * intrinsic.c (add_sym_3_ml): Rename to
     (add_sym_4_ml): and handle kind argument.
     (add_function): Replace add_sym_3_ml with add_sym_4_ml and add
     extra arguments for maxloc and minloc.
     (check_specific): Replace use of check.f3ml with check.f4ml.
     * iresolve.c (gfc_resolve_maxloc): Handle kind argument. If
     the kind is smaller than the smallest library version available,
     use gfc_default_integer_kind and convert afterwards.
     (gfc_resolve_minloc): Likewise.

2017-10-28  Thomas Koenig  

     PR fortran/29600
     * gfortran.dg/minmaxloc_8.f90: New test.




[PR target/82823] Add testcase

2017-11-03 Thread Jeff Law


So the x86 assertion failure I fixed yesterday where I didn't have a 
testcase?  Martin L has just tripped over it and filed a bug for it 
overnight.


I'm adding his testcase to the C++ regression testsuite.  I also 
retro-actively added the PR marker to the x86 commit which fixes this 
problem.



Jeff
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 4cc2cedc0dc..0a08fe2ed5c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,8 @@
 2017-11-03  Jeff Law  
 
+   PR target/82823
+   * g++.dg/torture/pr82823.C: New test.
+
* gcc.target/i386/stack-check-12.c: New test.
 
 2017-11-03  Jakub Jelinek  
diff --git a/gcc/testsuite/g++.dg/torture/pr82823.C b/gcc/testsuite/g++.dg/torture/pr82823.C
new file mode 100644
index 000..dab369e7ad3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr82823.C
@@ -0,0 +1,26 @@
+// { dg-do compile }
+// { dg-additional-options "-fstack-clash-protection" }
+// { dg-require-effective-target supports_stack_clash_protection }
+
+
+class a
+{
+public:
+  ~a ();
+  int b;
+};
+class c
+{
+public:
+  a m_fn1 ();
+};
+class d
+{
+  int e ();
+  c f;
+};
+int
+d::e ()
+{
+  return f.m_fn1 ().b;
+}


Re: [C++ Patch] PR 65579 ("gcc requires definition of a static constexpr member...")

2017-11-03 Thread Paolo Carlini

Hi,

On 03/11/2017 18:56, Jason Merrill wrote:

Looking at the code again, it seems that the problem is the difference
between start_decl_1 and grokfield, in that the former has

   /* If an explicit initializer is present, or if this is a definition
  of an aggregate, then we need a complete type at this point.
  (Scalars are always complete types, so there is nothing to
  check.)  This code just sets COMPLETE_P; errors (if necessary)
  are issued below.  */
   if ((initialized || aggregate_definition_p)
   && !complete_p
   && COMPLETE_TYPE_P (complete_type (type)))
 {
   complete_p = true;
   /* We will not yet have set TREE_READONLY on DECL if the type
  was "const", but incomplete, before this point.  But, now, we
  have a complete type, so we can try again.  */
   cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
 }

and grokfield/finish_static_data_member_decl don't.  How about
completing the type and re-applying the quals in
finish_static_data_member_decl if there's an initializer?  Your most
recent patch ought to work, but is less parallel.  Sorry for the
churn.
No problem, I learned something! Anyway, yes, the below is passing 
testing, shall we go ahead with it?


Thanks,
Paolo.

///
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 254365)
+++ cp/decl2.c  (working copy)
@@ -787,6 +787,15 @@ finish_static_data_member_decl (tree decl,
   && TYPE_DOMAIN (TREE_TYPE (decl)) == NULL_TREE)
 SET_VAR_HAD_UNKNOWN_BOUND (decl);
 
+  if (init)
+    {
+      /* Similarly to start_decl_1, we want to complete the type in order
+	 to do the right thing in cp_apply_type_quals_to_decl, possibly
+	 clear TYPE_QUAL_CONST (c++/65579).  */
+      tree type = TREE_TYPE (decl) = complete_type (TREE_TYPE (decl));
+      cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
+    }
+
   cp_finish_decl (decl, init, init_const_expr_p, asmspec_tree, flags);
 }
 
Index: testsuite/g++.dg/cpp0x/constexpr-template11.C
===
--- testsuite/g++.dg/cpp0x/constexpr-template11.C   (nonexistent)
+++ testsuite/g++.dg/cpp0x/constexpr-template11.C   (working copy)
@@ -0,0 +1,16 @@
+// PR c++/65579
+// { dg-do link { target c++11 } }
+
+template 
+struct S {
+int i;
+};
+
+struct T {
+  static constexpr S s = { 1 };
+};
+
+int main()
+{
+  return T::s.i;
+}


Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Uros Bizjak
On 3 Nov 2017 at 20:30, "Jeff Law" wrote:
>
> On 11/03/2017 11:38 AM, Uros Bizjak wrote:
>>
>> On Fri, Nov 3, 2017 at 6:13 PM, Jeff Law  wrote:
>>>
>>> On 11/03/2017 04:46 AM, Uros Bizjak wrote:


 On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener
  wrote:
>
>
> On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:
>>>
>>>
>>>  * config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
>>>  (ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
>>>  to probe at the start of a noreturn function.
>>>
>>>  * gcc.target/i386/stack-check-12.c: New test
>>
>>
>>
>> -  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
>> -   -GET_MODE_SIZE (word_mode)));
>> +  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));
>>
>> Please use AX_REG instead of 0.
>>
>> +  RTX_FRAME_RELATED_P (insn) = 1;
>> +  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));
>>
>> Also here.
>>
>>  emit_insn (gen_blockage ());
>>
>> BTW: Could we use an unused register here, if available? %eax is used
>> to pass first argument in regparm functions on 32bit targets.
>
>
>
> Can you push %[er]sp?  What about partial reg stalls when using other
> registers (if the last set was a movb to it)?  I guess [er]sp is safe
> here
> as was [re]ax due to the ABI?



 That would work, too. I believe, that this won't trigger stack engine
 [1], but since the operation is a bit unusual, let's ask HJ to be
 sure.

 [1] https://en.wikipedia.org/wiki/Stack_register#Stack_engine
>>>
>>>
>>> How about %esi in 32 bit mode and %rax in 64 bit mode?  I think that
>>> avoids hitting the parameter passing registers.
>>> hitting the parameter passing registers.
>>
>>
>> That is a good choice.  IMO, it warrants a small comment, in the
>> source, why this choice.
>
> Here's a patch implementing that approach.
>
> Bootstrapped and regression tested on x86_64.  Also spot tested the new
test on x86.
>
> OK for the trunk now?
>
> Jeff
>
> * config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
> (ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
> to probe at the start of a noreturn function.
>
> * gcc.target/i386/stack-check-12.c: New test

OK.

Thanks, Uros.

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 25b28a1..1b83755 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -101,6 +101,8 @@ static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool);
>  static bool ix86_save_reg (unsigned int, bool, bool);
>  static bool ix86_function_naked (const_tree);
>  static bool ix86_notrack_prefixed_insn_p (rtx);
> +static void ix86_emit_restore_reg_using_pop (rtx);
> +
>
>  #ifndef CHECK_STACK_LIMIT
>  #define CHECK_STACK_LIMIT (-1)
> @@ -12124,8 +12126,14 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
>   we just probe when we cross PROBE_INTERVAL.  */
>if (TREE_THIS_VOLATILE (cfun->decl))
>  {
> -  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
> -  -GET_MODE_SIZE (word_mode)));
> +  /* We can safely use any register here since we're just going to
push
> +its value and immediately pop it back.  But we do try and avoid
> +argument passing registers so as not to introduce dependencies in
> +the pipeline.  For 32 bit we use %esi and for 64 bit we use
%rax.  */
> +  rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG :
SI_REG);
> +  rtx_insn *insn = emit_insn (gen_push (dummy_reg));
> +  RTX_FRAME_RELATED_P (insn) = 1;
> +  ix86_emit_restore_reg_using_pop (dummy_reg);
>emit_insn (gen_blockage ());
>  }
>
> diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c b/gcc/testsuite/gcc.target/i386/stack-check-12.c
> new file mode 100644
> index 000..cb69bb0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fstack-clash-protection -mtune=generic" } */
> +/* { dg-require-effective-target supports_stack_clash_protection } */
> +
> +__attribute__ ((noreturn)) void exit (int);
> +
> +__attribute__ ((noreturn)) void
> +f (void)
> +{
> +  asm volatile ("nop" ::: "edi");
> +  exit (1);
> +}
> +
> +/* { dg-final { scan-assembler-not "or\[ql\]" } } */
> +/* { dg-final { scan-assembler "pushl  %esi" { target ia32 } } } */
> +/* { dg-final { scan-assembler "popl   %esi" { target ia32 } } }*/
> +/* { dg-final { scan-assembler "pushq  %rax" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler "popq   %rax" { target { ! ia32 } } } }*/
> +

Re: [PATCH] Improve store merging to handle load+store or bitwise logicals (PR tree-optimization/78821, take 2)

2017-11-03 Thread Richard Biener
On November 3, 2017 8:17:30 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Fri, Nov 03, 2017 at 03:04:18PM +0100, Jakub Jelinek wrote:
>> single-use vs. multiple uses is something I've thought about, but
>don't
>> know whether it is better to require single-use or not (or sometimes,
>> under some condition?).  Say if we have:
>
>So, here is what I've committed in the end after
>bootstrapping/regtesting
>it on x86_64-linux and i686-linux, the only changes from the earlier
>patch
>were comments and addition of has_single_use checks.
>
>In those bootstraps/regtests, the number of integer_cst stores were
>expectedly the same, and so were the number of bit_*_expr cases, but
>it apparently matters a lot for the memory copying (rhs_code MEM_REF).
>Without this patch new/orig stores:
>16943   35369
>and with the patch:
>12111   24911
>So, perhaps we'll need to do something smarter (approximate how many
>original loads would be kept and how many new loads/stores we'd need to
>add
>to get rid of how many original stores).
>Or allow multiple uses for the MEM_REF rhs_code only and for anything
>else
>require single use.

Probably interesting to look at the individual cases. But yes, it should be 
factored into the cost model somehow. 
It's possibly also increasing register pressure. 

Richard. 

>2017-11-03  Jakub Jelinek  
>
>   PR tree-optimization/78821
>   * gimple-ssa-store-merging.c: Update the file comment.
>   (MAX_STORE_ALIAS_CHECKS): Define.
>   (struct store_operand_info): New type.
>   (store_operand_info::store_operand_info): New constructor.
>   (struct store_immediate_info): Add rhs_code and ops data members.
>   (store_immediate_info::store_immediate_info): Add rhscode, op0r
>   and op1r arguments to the ctor, initialize corresponding data members.
>   (struct merged_store_group): Add load_align_base and load_align
>   data members.
>   (merged_store_group::merged_store_group): Initialize them.
>   (merged_store_group::do_merge): Update them.
>   (merged_store_group::apply_stores): Pick the constant for
>   encode_tree_to_bitpos from one of the two operands, or skip
>   encode_tree_to_bitpos if neither operand is a constant.
>   (class pass_store_merging): Add process_store method decl.  Remove
>   bool argument from terminate_all_aliasing_chains method decl.

Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Jeff Law

On 11/03/2017 11:38 AM, Uros Bizjak wrote:

On Fri, Nov 3, 2017 at 6:13 PM, Jeff Law  wrote:

On 11/03/2017 04:46 AM, Uros Bizjak wrote:


On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener
 wrote:


On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:


 * config/i386/i386.c (ix86_emit_restore_reg_using_pop):
Prototype.
 (ix86_adjust_stack_and_probe_stack_clash): Use a push/pop
sequence
 to probe at the start of a noreturn function.

 * gcc.target/i386/stack-check-12.c: New test



-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-   -GET_MODE_SIZE (word_mode)));
+  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode,
0)));

Please use AX_REG instead of 0.

+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

Also here.

 emit_insn (gen_blockage ());

BTW: Could we use an unused register here, if available? %eax is used
to pass first argument in regparm functions on 32bit targets.



Can you push %[er]sp?  What about partial reg stalls when using other
registers (if the last set was a movb to it)?  I guess [er]sp is safe
here
as was [re]ax due to the ABI?



That would work, too. I believe, that this won't trigger stack engine
[1], but since the operation is a bit unusual, let's ask HJ to be
sure.

[1] https://en.wikipedia.org/wiki/Stack_register#Stack_engine


How about %esi in 32 bit mode and %rax in 64 bit mode?  I think that avoids
hitting the parameter passing registers.


That is a good choice. IMO, it warrants a small comment in the source
explaining this choice.

Here's a patch implementing that approach.

Bootstrapped and regression tested on x86_64.  Also spot tested the new 
test on x86.


OK for the trunk now?

Jeff
* config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
to probe at the start of a noreturn function.

* gcc.target/i386/stack-check-12.c: New test


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 25b28a1..1b83755 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -101,6 +101,8 @@ static void ix86_print_operand_address_as (FILE *, rtx, 
addr_space_t, bool);
 static bool ix86_save_reg (unsigned int, bool, bool);
 static bool ix86_function_naked (const_tree);
 static bool ix86_notrack_prefixed_insn_p (rtx);
+static void ix86_emit_restore_reg_using_pop (rtx);
+
 
 #ifndef CHECK_STACK_LIMIT
 #define CHECK_STACK_LIMIT (-1)
@@ -12124,8 +12126,14 @@ ix86_adjust_stack_and_probe_stack_clash (const 
HOST_WIDE_INT size)
  we just probe when we cross PROBE_INTERVAL.  */
   if (TREE_THIS_VOLATILE (cfun->decl))
 {
-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-  -GET_MODE_SIZE (word_mode)));
+  /* We can safely use any register here since we're just going to push
+its value and immediately pop it back.  But we do try and avoid
+argument passing registers so as not to introduce dependencies in
+the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
+  rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
+  rtx_insn *insn = emit_insn (gen_push (dummy_reg));
+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (dummy_reg);
   emit_insn (gen_blockage ());
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c 
b/gcc/testsuite/gcc.target/i386/stack-check-12.c
new file mode 100644
index 000..cb69bb0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+__attribute__ ((noreturn)) void exit (int);
+
+__attribute__ ((noreturn)) void
+f (void)
+{
+  asm volatile ("nop" ::: "edi");
+  exit (1);
+}
+
+/* { dg-final { scan-assembler-not "or\[ql\]" } } */
+/* { dg-final { scan-assembler "pushl  %esi" { target ia32 } } } */
+/* { dg-final { scan-assembler "popl   %esi" { target ia32 } } }*/
+/* { dg-final { scan-assembler "pushq  %rax" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler "popq   %rax" { target { ! ia32 } } } }*/
+


[Patch, fortran] PR81735 - [6/7/8 Regression] double free or corruption (fasttop) error (SIGABRT) with character(:) and custom return type with allocatable

2017-11-03 Thread Paul Richard Thomas
This was already fixed by the patch for PR82375 on trunk, albeit less
well than the patch applied here. I will correct trunk tomorrow.

Committed as 'obvious' on 6-branch(r254393) and 7-branch(r254389)
after bootstrapping and regtesting.

Cheers

Paul
2017-11-03  Paul Thomas  

PR fortran/81735
* trans-decl.c (gfc_trans_deferred_vars): Correct case where
'tmp' can be used uninitialized.

2017-11-03  Paul Thomas  

PR fortran/81735
* gfortran.dg/pr81735.f90: New test.


Re: [PATCH] Improve store merging to handle load+store or bitwise logicals (PR tree-optimization/78821, take 2)

2017-11-03 Thread Jakub Jelinek
On Fri, Nov 03, 2017 at 03:04:18PM +0100, Jakub Jelinek wrote:
> single-use vs. multiple uses is something I've thought about, but don't
> know whether it is better to require single-use or not (or sometimes,
> under some condition?).  Say if we have:

So, here is what I've committed in the end after bootstrapping/regtesting
it on x86_64-linux and i686-linux, the only changes from the earlier patch
were comments and addition of has_single_use checks.

In those bootstraps/regtests, the number of integer_cst stores was,
as expected, the same, and so was the number of bit_*_expr cases, but
it apparently matters a lot for the memory copying (rhs_code MEM_REF).
Without this patch new/orig stores:
16943   35369
and with the patch:
12111   24911
So, perhaps we'll need to do something smarter (approximate how many
original loads would be kept and how many new loads/stores we'd need to add
to get rid of how many original stores).
Or allow multiple uses for the MEM_REF rhs_code only and for anything else
require single use.
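To make the MEM_REF case concrete, here is a minimal, hypothetical example (not taken from the patch or its testcases) of the kind of adjacent byte-wise copy through loads and stores that the enhanced pass can now coalesce into one wider load/store pair:

```c
#include <assert.h>

/* Hypothetical illustration: four adjacent byte copies whose right-hand
   sides are loads (rhs_code MEM_REF).  With this patch, store merging can
   combine them into a single wider load followed by a single wider store,
   instead of only handling constant right-hand sides.  */
struct S { char a, b, c, d; };

void
copy4 (struct S *p, const struct S *q)
{
  p->a = q->a;
  p->b = q->b;
  p->c = q->c;
  p->d = q->d;
}
```

The semantics are unchanged by the optimization; only the number of emitted memory operations differs.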

2017-11-03  Jakub Jelinek  

PR tree-optimization/78821
* gimple-ssa-store-merging.c: Update the file comment.
(MAX_STORE_ALIAS_CHECKS): Define.
(struct store_operand_info): New type.
(store_operand_info::store_operand_info): New constructor.
(struct store_immediate_info): Add rhs_code and ops data members.
(store_immediate_info::store_immediate_info): Add rhscode, op0r
and op1r arguments to the ctor, initialize corresponding data members.
(struct merged_store_group): Add load_align_base and load_align
data members.
(merged_store_group::merged_store_group): Initialize them.
(merged_store_group::do_merge): Update them.
(merged_store_group::apply_stores): Pick the constant for
encode_tree_to_bitpos from one of the two operands, or skip
encode_tree_to_bitpos if neither operand is a constant.
(class pass_store_merging): Add process_store method decl.  Remove
bool argument from terminate_all_aliasing_chains method decl.
(pass_store_merging::terminate_all_aliasing_chains): Remove
var_offset_p argument and corresponding handling.
(stmts_may_clobber_ref_p): New function.
(compatible_load_p): New function.
(imm_store_chain_info::coalesce_immediate_stores): Terminate group
if there is overlap and rhs_code is not INTEGER_CST.  For
non-overlapping stores terminate group if rhs is not mergeable.
(get_alias_type_for_stmts): Change first argument from
auto_vec & to vec &.  Add IS_LOAD, CLIQUEP and
BASEP arguments.  If IS_LOAD is true, look at rhs1 of the stmts
instead of lhs.  Compute *CLIQUEP and *BASEP in addition to the
alias type.
(get_location_for_stmts): Change first argument from
auto_vec & to vec &.
(struct split_store): Remove orig_stmts data member, add orig_stores.
(split_store::split_store): Create orig_stores rather than orig_stmts.
(find_constituent_stmts): Renamed to ...
(find_constituent_stores): ... this.  Change second argument from
vec * to vec *, push pointers
to info structures rather than the statements.
(split_group): Rename ALLOW_UNALIGNED argument to
ALLOW_UNALIGNED_STORE, add ALLOW_UNALIGNED_LOAD argument and handle
it.  Adjust find_constituent_stores caller.
(imm_store_chain_info::output_merged_store): Handle rhs_code other
than INTEGER_CST, adjust split_group, get_alias_type_for_stmts and
get_location_for_stmts callers.  Set MR_DEPENDENCE_CLIQUE and
MR_DEPENDENCE_BASE on the MEM_REFs if they are the same in all stores.
(mem_valid_for_store_merging): New function.
(handled_load): New function.
(pass_store_merging::process_store): New method.
(pass_store_merging::execute): Use process_store method.  Adjust
terminate_all_aliasing_chains caller.

* gcc.dg/store_merging_13.c: New test.
* gcc.dg/store_merging_14.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2017-11-03 15:37:02.869561500 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-11-03 16:15:15.059282459 +0100
@@ -19,7 +19,8 @@
.  */
 
 /* The purpose of this pass is to combine multiple memory stores of
-   constant values to consecutive memory locations into fewer wider stores.
+   constant values, values loaded from memory or bitwise operations
+   on those to consecutive memory locations into fewer wider stores.
For example, if we have a sequence performing four byte stores to
consecutive memory locations:
[p ] := imm1;
@@ -29,21 +30,49 @@
we can transform this into a single 4-byte store if the target supports it:
   [p] := imm1:imm2:imm3:imm4 //concatenated immediates according to endianness.
 
+   Or:
+   [p ] := [q ];
+   [p + 1B] := [q + 

Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-03 Thread Sandra Loosemore

On 11/03/2017 10:54 AM, Wilco Dijkstra wrote:

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 
71b2445f70fd5b832c68c08e69e71d8ecad37a4a..1c56f4b12495fe97c604200ef245c9fa02684b0f
 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7436,16 +7436,17 @@ machine-description macro @code{FRAME_POINTER_REQUIRED} 
controls
  whether a target machine supports this flag.  @xref{Registers,,Register
  Usage, gccint, GNU Compiler Collection (GCC) Internals}.

-The default setting (when not optimizing for
-size) for 32-bit GNU/Linux x86 and 32-bit Darwin x86 targets is
-@option{-fomit-frame-pointer}.  You can configure GCC with the
-@option{--enable-frame-pointer} configure option to change the default.
+The default setting is @option{-fomit-frame-pointer}.  You can configure GCC
+with the @option{--enable-frame-pointer} configure option to change the 
default.


I'd prefer that you remove the reference to configure options entirely 
here.  Nowadays most GCC users install a package provided by their OS 
distribution, Linaro, etc, rather than trying to build GCC from scratch.



  Note that @option{-fno-omit-frame-pointer} doesn't force a new stack
  frame for all functions if it isn't otherwise needed, and hence doesn't
-guarantee a new frame pointer for all functions.
+guarantee a new frame pointer for all functions.  Several targets always omit
+the frame pointer in leaf functions.
+
+Enabled at levels @option{-O}, @option{-O1}, @option{-O2}, @option{-O3},
+@option{-Os} and @option{-Og}.


This last sentence makes no sense.  If the option is now enabled by 
default, then the optimization level is irrelevant.


-Sandra



[Committed] un-XFAIL gfortran testcase on FreeBSD.

2017-11-03 Thread Steve Kargl
The Changelog entry and patch are somewhat self-explanatory.

2017-11-03  Steven G. Kargl  

* gfortran.dg/large_real_kind_2.F90: Test passes on FreeBSD.  Remove
dg-xfail-if directive.


Index: gcc/testsuite/gfortran.dg/large_real_kind_2.F90
===
--- gcc/testsuite/gfortran.dg/large_real_kind_2.F90 (revision 254388)
+++ gcc/testsuite/gfortran.dg/large_real_kind_2.F90 (working copy)
@@ -1,6 +1,5 @@
 ! { dg-do run }
 ! { dg-require-effective-target fortran_large_real }
-! { dg-xfail-if "" { "*-*-freebsd*" } }
 
 ! Testing library calls on large real kinds (larger than kind=8)
   implicit none

-- 
Steve


Re: [PATCH 5/6] [ARC] Add 'uncached' attribute.

2017-11-03 Thread Sandra Loosemore

On 11/03/2017 05:22 AM, Claudiu Zissulescu wrote:


I see no documentation here.



Oops, forgot this one :) Please find it attached. I'll merge it into the final 
patch when everything is approved.

Thanks,
Claudiu

+@node ARC Type Attributes
+@subsection ARC Type Attributes
+
+@cindex @code{uncached} type attribute, ARC
+Declaring variables @code{uncached} allows you to exclude data-cache


Since this is a type attribute and not a variable attribute (I presume 
to allow accessing objects through a pointer), it would be better to say


Declaring objects with the @code{uncached} type attribute allows


+participation in load and store operations on those variables without


And s/variables/objects/ here too.


+involving the additional semantic implications of volatile.  The


You probably want @code{volatile} markup here?


+@code{.di} instruction suffix is used for all loads and stores of data
+declared @code{uncached}.
+


Otherwise, the description makes sense to me.  (In fact, I might 
eventually want to copy this attribute over to the Nios II backend, too, 
since it also has similar "io"-variant load/store instructions.)


-Sandra



[PATCH] C/C++: more stdlib header hints (PR c/81404) (v5)

2017-11-03 Thread David Malcolm
On Fri, 2017-11-03 at 17:49 +, Joseph Myers wrote:
> On Thu, 2 Nov 2017, David Malcolm wrote:
> 
> > +{"offsetof", {"", ""} },
> 
> offsetof is in stddef.h for C, not stdalign.h.

Thanks.  Here's an updated version of the patch which fixes that.

OK for trunk? (assuming bootstrap and regtesting passes)


Changed in v5:
- fixed "offsetof" C header from "<stdalign.h>" to "<stddef.h>"

Changed in v4:
- updated for changes of "inform_at_rich_loc" to "inform"
- added #define INCLUDE_UNIQUE_PTR to known-headers.cc

Changed in v3:
- fixed WINT_MAX and WINT_MIN

Here's an updated version of the patch, which moves the data to
known-headers and unifies the C and C++ data into one array.

Blurb from v1:

This patch depends on:

* "[PATCH] c-family: add name_hint/deferred_diagnostic (v2)"
* https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01021.html
(waiting review)

* [PATCH 3/3] C: hints for missing stdlib includes for macros and types
* https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00125.html
(approved, pending the prereq above)

It extends the C frontend's "knowledge" of the C stdlib within
get_c_name_hint to cover some more macros and functions, covering
a case reported in PR c/81404 ("INT_MAX"), so that rather than printing:

  t.c:5:12: error: 'INT_MAX' undeclared here (not in a function); did you mean 
'__INT_MAX__'?
   int test = INT_MAX;
  ^~~
  __INT_MAX__

we instead print:

  t.c:5:12: error: 'INT_MAX' undeclared here (not in a function)
   int test = INT_MAX;
  ^~~
  t.c:5:12: note: 'INT_MAX' is defined in header '<limits.h>'; did you forget 
to '#include <limits.h>'?
  t.c:1:1:
  +#include <limits.h>

  t.c:5:12:
int test = INT_MAX;
   ^~~

It also adds generalizes some of the code for this (and for the "std::"
namespace hints in the C++ frontend), moving it to a new
c-family/known-headers.cc and .h, and introducing a class known_headers.
This currently just works by scanning a hardcoded array of known
name/header associations, but perhaps in the future could be turned
into some kind of symbol database so that the compiler could record API
uses and use that to offer suggestions e.g.

foo.cc: error: 'myapi::foo' was not declared in this scope
foo.cc: note: 'myapi::foo' was declared in header 'myapi/private.h'
(included via 'myapi/public.h') when compiling 'bar.cc'; did you forget to
'#include "myapi/public.h"'?

or somesuch.

In any case, moving this to a class gives an easier way to locate the
hardcoded knowledge about the stdlib.
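The scanning approach described above can be sketched as follows. This is a hedged, self-contained C sketch of the idea only: the function name, table entries, and return convention are illustrative and do not reproduce the patch's actual `known_headers` class or data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative name/header association table, scanned linearly.  */
struct known_header
{
  const char *name;
  const char *c_header;
};

static const struct known_header table[] = {
  { "INT_MAX",  "<limits.h>" },
  { "offsetof", "<stddef.h>" },
  { "NULL",     "<stddef.h>" },
};

/* Return the C header that declares NAME, or NULL if unknown.  */
static const char *
get_stdlib_header_for_name_sketch (const char *name)
{
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (table[i].name, name) == 0)
      return table[i].c_header;
  return NULL;
}
```

A symbol-database version would replace the hardcoded array with recorded API uses, as suggested above, but the lookup interface could stay the same.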

The patch also adds similar code to the C++ frontend covering
unqualified names in the standard library, so that rather than just
e.g.:

  t.cc:19:13: error: 'NULL' was not declared in this scope
   void *ptr = NULL;
   ^~~~

we can emit:

  t.cc:19:13: error: 'NULL' was not declared in this scope
   void *ptr = NULL;
   ^~~~
  t.cc:19:13: note: 'NULL' is defined in header '<cstddef>'; did you forget
  to '#include <cstddef>'?
  t.cc:1:1:
  +#include <cstddef>

  t.cc:19:13:
   void *ptr = NULL;
   ^~~~

(Also XFAIL for PR c++/80567 added for the C++ testcase; this is a
separate pre-existing bug exposed by the testcase for PR 81404).

gcc/ChangeLog:
PR c/81404
* Makefile.in (C_COMMON_OBJS): Add c-family/known-headers.o.

gcc/c-family/ChangeLog:
PR c/81404
* known-headers.cc: New file, based on material from c/c-decl.c.
(suggest_missing_header): Copied as-is.
(get_stdlib_header_for_name): New, based on get_c_name_hint but
heavily edited to add C++ support.  Add some knowledge about
, , and .
* known-headers.h: Likewise.

gcc/c/ChangeLog:
PR c/81404
* c-decl.c: Include "c-family/known-headers.h".
(get_c_name_hint): Rename to get_stdlib_header_for_name and move
to known-headers.cc.
(class suggest_missing_header): Move to known-header.h.
(lookup_name_fuzzy): Call get_c_stdlib_header_for_name rather
than get_c_name_hint.

gcc/cp/ChangeLog:
PR c/81404
* name-lookup.c: Include "c-family/known-headers.h"
(lookup_name_fuzzy): Call get_cp_stdlib_header_for_name and
potentially return a new suggest_missing_header hint.

gcc/testsuite/ChangeLog:
PR c/81404
* g++.dg/spellcheck-stdlib.C: New.
* gcc.dg/spellcheck-stdlib.c (test_INT_MAX): New.
---
 gcc/Makefile.in  |   2 +-
 gcc/c-family/known-headers.cc| 169 +++
 gcc/c-family/known-headers.h |  41 
 gcc/c/c-decl.c   |  82 +--
 gcc/cp/name-lookup.c |  11 ++
 gcc/testsuite/g++.dg/spellcheck-stdlib.C |  84 +++
 gcc/testsuite/gcc.dg/spellcheck-stdlib.c |   9 ++
 7 files changed, 319 insertions(+), 79 deletions(-)
 create mode 100644 gcc/c-family/known-headers.cc
 create mode 100644 gcc/c-family/known-headers.h
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-stdlib.C

diff --git 

Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-03 Thread Joseph Myers
On Fri, 3 Nov 2017, Wilco Dijkstra wrote:

> Almost all targets add an explict -fomit-frame-pointer in the target specific
> options.  Rather than doing this in a target-specific way, do this in the

Which targets do not?  You should explicitly list them and CC their 
maintainers and seek confirmation that such a change is appropriate for 
them.

The addition of -fomit-frame-pointer through this mechanism was a 
replacement for the old target macro CAN_DEBUG_WITHOUT_FP.  It may now be 
the cases that with DWARF debug info, having or not having a frame pointer 
is not particularly relevant to debugging.  But since there are other 
reasons people may want a frame pointer (e.g. light-weight backtraces that 
don't depend on debug / unwind info), it's at least possible there are 
architecture-specific choices regarding keeping frame pointers involved 
here.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 65579 ("gcc requires definition of a static constexpr member...")

2017-11-03 Thread Jason Merrill
On Thu, Oct 26, 2017 at 6:17 AM, Paolo Carlini  wrote:
> Hi again,
>
> On 24/10/2017 20:58, Jason Merrill wrote:
>>
>> This seems like an odd place to add the complete_type call.  What
>> happens if we change the COMPLETE_TYPE_P (type) in
>> cp_apply_type_quals_to_decl to COMPLETE_TYPE_P (complete_type (type))?
>
> Finally I'm back with some information.
>
> Simply doing the above doesn't fully work. The first symptom is the failure
> of g++.dg/init/mutable1.C which is precisely the testcase that you added
> together with the "|| !COMPLETE_TYPE_P (type)" itself: clearly, the
> additional condition isn't able anymore to do its work, because, first, when
> the type isn't complete, TYPE_HAS_MUTABLE_P (type) is false and then, when
> in fact it would be found true, we check !COMPLETE_TYPE_P (complete_type
> (type)) which is false, because completing succeeded.

> Thus it seems we need at least something like:
>
>TREE_TYPE (decl) = type = complete_type (type);
>
>if (TYPE_HAS_MUTABLE_P (type) || !COMPLETE_TYPE_P (type))
>  type_quals &= ~TYPE_QUAL_CONST;
>
> But then, toward the end of the testsuite, we notice a more serious issue,
> which is unrelated to the above: g++.old-deja/g++.pt/poi1.C
>
> // { dg-do assemble  }
> // Origin: Gerald Pfeifer 
>
> template <class T>
> class TLITERAL : public T
> {
> int x;
> };
>
> class GATOM;
>
> typedef TLITERAL<GATOM> x;
> extern TLITERAL<GATOM> y;
>
> also fails:
>
> poi1.C: In instantiation of ‘class TLITERAL<GATOM>’:
> poi1.C:13:24:   required from here
> poi1.C:5:7: error: invalid use of incomplete type ‘class GATOM’
>  class TLITERAL : public T
> poi1.C:10:7: note: forward declaration of ‘class GATOM’
>  class GATOM;
>
> that is, trying to complete GATOM at the 'extern TLITERAL<GATOM> y;' line
> obviously fails. Note, in case isn't obvious, that this happens exactly for
> the cp_apply_type_quals_to_decl call at the end of grokdeclarator which I
> tried to change in my first try: the failure of poi1.C seems rather useful
> to figure out what we want to do for this bug.
>
> Well, as expected, explicitly checking VAR_P && DECL_DECLARED_CONSTEXPR_P
> works again - it seems to me that after all it could make sense given the
> comment precisely talking about the additional complexities related to
> constexpr. Anyway, I'm attaching the corresponding complete patch.

Looking at the code again, it seems that the problem is the difference
between start_decl_1 and grokfield, in that the former has

  /* If an explicit initializer is present, or if this is a definition
 of an aggregate, then we need a complete type at this point.
 (Scalars are always complete types, so there is nothing to
 check.)  This code just sets COMPLETE_P; errors (if necessary)
 are issued below.  */
  if ((initialized || aggregate_definition_p)
  && !complete_p
  && COMPLETE_TYPE_P (complete_type (type)))
{
  complete_p = true;
  /* We will not yet have set TREE_READONLY on DECL if the type
 was "const", but incomplete, before this point.  But, now, we
 have a complete type, so we can try again.  */
  cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
}

and grokfield/finish_static_data_member_decl don't.  How about
completing the type and re-applying the quals in
finish_static_data_member_decl if there's an initializer?  Your most
recent patch ought to work, but is less parallel.  Sorry for the
churn.

Jason


[4/4] SVE unwinding

2017-11-03 Thread Richard Sandiford
This patch adds support for unwinding frames that use the SVE
pseudo VG register.  We want this register to act like a normal
register if the CFI explicitly sets it, but want to provide a
default value otherwise.  Computing the default value requires
an SVE target, so we only want to compute it on demand.

aarch64_vg uses a hard-coded .inst in order to avoid a build
dependency on binutils 2.28 or later.
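The lazy-default protocol described above can be sketched host-independently. In this hedged illustration the register number and the fallback computation are placeholders, not AArch64's actual `AARCH64_DWARF_VG` or `aarch64_vg`; only the macro shape (evaluate to 1 and write through VALUE for the one register we can compute) mirrors the patch:

```c
#include <assert.h>

/* Placeholder DWARF column for the lazily-computed register.  */
#define LAZY_REG 99

/* Stands in for aarch64_vg (): only executed when actually asked for,
   so no special instructions run unless the register is needed.  */
static unsigned long
compute_fallback (void)
{
  return 4;
}

/* Same protocol as DWARF_LAZY_REGISTER_VALUE: nonzero iff REGNO has a
   lazy default, in which case the value is stored through VALUE.  */
#define DWARF_LAZY_REGISTER_VALUE_SKETCH(REGNO, VALUE) \
  ((REGNO) == LAZY_REG && ((*(VALUE)) = compute_fallback (), 1))

/* Simplified _Unwind_GetGR-style lookup: prefer the lazy default when
   one exists, otherwise fall back to the saved slot.  */
static unsigned long
get_reg (int regno, unsigned long saved)
{
  unsigned long value;
  if (DWARF_LAZY_REGISTER_VALUE_SKETCH (regno, &value))
    return value;
  return saved;
}
```

This matches the intent stated above: CFI that explicitly sets the register still wins in the real unwinder, because the lazy path is only a fallback.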


2017-11-03  Richard Sandiford  

libgcc/
* config/aarch64/value-unwind.h (aarch64_vg): New function.
(DWARF_LAZY_REGISTER_VALUE): Define.
* unwind-dw2.c (_Unwind_GetGR): Use DWARF_LAZY_REGISTER_VALUE
to provide a fallback register value.

gcc/testsuite/
* g++.target/aarch64/aarch64.exp: New harness.
* g++.target/aarch64/sve_catch_1.C: New test.
* g++.target/aarch64/sve_catch_2.C: Likewise.
* g++.target/aarch64/sve_catch_3.C: Likewise.
* g++.target/aarch64/sve_catch_4.C: Likewise.
* g++.target/aarch64/sve_catch_5.C: Likewise.
* g++.target/aarch64/sve_catch_6.C: Likewise.

Index: libgcc/config/aarch64/value-unwind.h
===
--- libgcc/config/aarch64/value-unwind.h2017-02-23 19:53:58.0 
+
+++ libgcc/config/aarch64/value-unwind.h2017-11-03 17:24:20.172023500 
+
@@ -23,3 +23,19 @@
 #if defined __aarch64__ && !defined __LP64__
 # define REG_VALUE_IN_UNWIND_CONTEXT
 #endif
+
+/* Return the value of the pseudo VG register.  This should only be
+   called if we know this is an SVE host.  */
+static inline int
+aarch64_vg (void)
+{
+  register int vg asm ("x0");
+  /* CNTD X0.  */
+  asm (".inst 0x04e0e3e0" : "=r" (vg));
+  return vg;
+}
+
+/* Lazily provide a value for VG, so that we don't try to execute SVE
+   instructions unless we know they're needed.  */
+#define DWARF_LAZY_REGISTER_VALUE(REGNO, VALUE) \
+  ((REGNO) == AARCH64_DWARF_VG && ((*VALUE) = aarch64_vg (), 1))
Index: libgcc/unwind-dw2.c
===
--- libgcc/unwind-dw2.c 2017-02-23 19:54:02.0 +
+++ libgcc/unwind-dw2.c 2017-11-03 17:24:20.172023500 +
@@ -216,12 +216,12 @@ _Unwind_IsExtendedContext (struct _Unwin
  || (context->flags & EXTENDED_CONTEXT_BIT));
 }
 
-/* Get the value of register INDEX as saved in CONTEXT.  */
+/* Get the value of register REGNO as saved in CONTEXT.  */
 
 inline _Unwind_Word
-_Unwind_GetGR (struct _Unwind_Context *context, int index)
+_Unwind_GetGR (struct _Unwind_Context *context, int regno)
 {
-  int size;
+  int size, index;
   _Unwind_Context_Reg_Val val;
 
 #ifdef DWARF_ZERO_REG
@@ -229,7 +229,7 @@ _Unwind_GetGR (struct _Unwind_Context *c
 return 0;
 #endif
 
-  index = DWARF_REG_TO_UNWIND_COLUMN (index);
+  index = DWARF_REG_TO_UNWIND_COLUMN (regno);
   gcc_assert (index < (int) sizeof(dwarf_reg_size_table));
   size = dwarf_reg_size_table[index];
   val = context->reg[index];
@@ -237,6 +237,14 @@ _Unwind_GetGR (struct _Unwind_Context *c
   if (_Unwind_IsExtendedContext (context) && context->by_value[index])
 return _Unwind_Get_Unwind_Word (val);
 
+#ifdef DWARF_LAZY_REGISTER_VALUE
+  {
+_Unwind_Word value;
+if (DWARF_LAZY_REGISTER_VALUE (regno, &value))
+  return value;
+  }
+#endif
+
   /* This will segfault if the register hasn't been saved.  */
   if (size == sizeof(_Unwind_Ptr))
 return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val;
Index: gcc/testsuite/g++.target/aarch64/aarch64.exp
===
--- /dev/null   2017-11-03 10:40:07.002381728 +
+++ gcc/testsuite/g++.target/aarch64/aarch64.exp2017-11-03 
17:24:20.171023116 +
@@ -0,0 +1,38 @@
+#  Specific regression driver for AArch64.
+#  Copyright (C) 2009-2017 Free Software Foundation, Inc.
+#  Contributed by ARM Ltd.
+#
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it
+#  under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 3, or (at your option)
+#  any later version.
+#
+#  GCC is distributed in the hope that it will be useful, but
+#  WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  <http://www.gnu.org/licenses/>.  */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib g++-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
+
+# All done.
+dg-finish
Index: gcc/testsuite/g++.target/aarch64/sve_catch_1.C

[3/4] [AArch64] SVE tests

2017-11-03 Thread Richard Sandiford
This patch adds gcc.target/aarch64 tests for SVE, and forces some
existing Advanced SIMD tests to use -march=armv8-a.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* gcc.target/aarch64/bic_imm_1.c: Force -march=armv8-a.
* gcc.target/aarch64/fmaxmin.c: Likewise.
* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
* gcc.target/aarch64/orr_imm_1.c: Likewise.
* gcc.target/aarch64/pr62178.c: Likewise.
* gcc.target/aarch64/pr71727-2.c: Likewise.
* gcc.target/aarch64/saddw-1.c: Likewise.
* gcc.target/aarch64/saddw-2.c: Likewise.
* gcc.target/aarch64/uaddw-1.c: Likewise.
* gcc.target/aarch64/uaddw-2.c: Likewise.
* gcc.target/aarch64/uaddw-3.c: Likewise.
* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
* gcc.target/aarch64/vect-compile.c: Likewise.
* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
* gcc.target/aarch64/vect-fmovd.c: Likewise.
* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
* gcc.target/aarch64/vect-fmovf.c: Likewise.
* gcc.target/aarch64/vect-fp-compile.c: Likewise.
* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
* gcc.target/aarch64/vect-movi.c: Likewise.
* gcc.target/aarch64/vect-mull-compile.c: Likewise.
* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
* gcc.target/aarch64/vect-vaddv.c: Likewise.
* gcc.target/aarch64/vect_saddl_1.c: Likewise.
* gcc.target/aarch64/vect_smlal_1.c: Likewise.
* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
fixed-length SVE.
* gcc.target/aarch64/sve_arith_1.c: New test.
* gcc.target/aarch64/sve_const_pred_1.C: Likewise.
* gcc.target/aarch64/sve_const_pred_2.C: Likewise.
* gcc.target/aarch64/sve_const_pred_3.C: Likewise.
* gcc.target/aarch64/sve_const_pred_4.C: Likewise.
* gcc.target/aarch64/sve_cvtf_signed_1.c: Likewise.
* gcc.target/aarch64/sve_cvtf_signed_1_run.c: Likewise.
* gcc.target/aarch64/sve_cvtf_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve_cvtf_unsigned_1_run.c: Likewise.
* gcc.target/aarch64/sve_dup_imm_1.c: Likewise.
* gcc.target/aarch64/sve_dup_imm_1_run.c: Likewise.
* gcc.target/aarch64/sve_dup_lane_1.c: Likewise.
* gcc.target/aarch64/sve_ext_1.c: Likewise.
* gcc.target/aarch64/sve_ext_2.c: Likewise.
* gcc.target/aarch64/sve_extract_1.c: Likewise.
* gcc.target/aarch64/sve_extract_2.c: Likewise.
* gcc.target/aarch64/sve_extract_3.c: Likewise.
* gcc.target/aarch64/sve_extract_4.c: Likewise.
* gcc.target/aarch64/sve_fabs_1.c: Likewise.
* gcc.target/aarch64/sve_fcvtz_signed_1.c: Likewise.
* gcc.target/aarch64/sve_fcvtz_signed_1_run.c: Likewise.
* gcc.target/aarch64/sve_fcvtz_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c: Likewise.
* gcc.target/aarch64/sve_fdiv_1.c: Likewise.
* gcc.target/aarch64/sve_fdup_1.c: Likewise.
* gcc.target/aarch64/sve_fdup_1_run.c: Likewise.
* gcc.target/aarch64/sve_fmad_1.c: Likewise.
* gcc.target/aarch64/sve_fmla_1.c: Likewise.
* gcc.target/aarch64/sve_fmls_1.c: Likewise.
* gcc.target/aarch64/sve_fmsb_1.c: Likewise.
* gcc.target/aarch64/sve_fmul_1.c: Likewise.
* gcc.target/aarch64/sve_fneg_1.c: Likewise.
* gcc.target/aarch64/sve_fnmad_1.c: Likewise.
* gcc.target/aarch64/sve_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve_fnmls_1.c: Likewise.
* gcc.target/aarch64/sve_fnmsb_1.c: Likewise.
* gcc.target/aarch64/sve_fp_arith_1.c: Likewise.
* gcc.target/aarch64/sve_frinta_1.c: Likewise.
* gcc.target/aarch64/sve_frinti_1.c: Likewise.
* gcc.target/aarch64/sve_frintm_1.c: Likewise.
* gcc.target/aarch64/sve_frintp_1.c: Likewise.
* gcc.target/aarch64/sve_frintx_1.c: Likewise.
* gcc.target/aarch64/sve_frintz_1.c: Likewise.
* gcc.target/aarch64/sve_fsqrt_1.c: Likewise.
* gcc.target/aarch64/sve_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve_index_1.c: Likewise.
* gcc.target/aarch64/sve_index_1_run.c: 

[2/4] [AArch64] Testsuite markup for SVE

2017-11-03 Thread Richard Sandiford
This patch adds new target selectors for SVE and updates existing
selectors accordingly.  It also XFAILs some tests that don't yet
work for some SVE modes; most of these go away with follow-on
vectorisation enhancements.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve)
(aarch64_sve_bits, check_effective_target_aarch64_sve_hw)
(aarch64_sve_hw_bits, check_effective_target_aarch64_sve256_hw):
New procedures.
(check_effective_target_vect_perm): Handle SVE.
(check_effective_target_vect_perm_byte): Likewise.
(check_effective_target_vect_perm_short): Likewise.
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
(check_effective_target_vect_element_align_preferred): Likewise.
(check_effective_target_vect_align_stack_vars): Likewise.
(check_effective_target_vect_load_lanes): Likewise.
(check_effective_target_vect_masked_store): Likewise.
(available_vector_sizes): Use aarch64_sve_bits for SVE.
* gcc.dg/vect/tree-vect.h (VECTOR_BITS): Define appropriately
for SVE.
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Add SVE XFAIL.
* gcc.dg/vect/bb-slp-pr69907.c: Likewise.
* gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise.
* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
* gcc.dg/vect/slp-23.c: Likewise.
* gcc.dg/vect/slp-25.c: Likewise.
* gcc.dg/vect/slp-perm-5.c: Likewise.
* gcc.dg/vect/slp-perm-6.c: Likewise.
* gcc.dg/vect/slp-perm-9.c: Likewise.
* gcc.dg/vect/slp-reduc-3.c: Likewise.
* gcc.dg/vect/vect-114.c: Likewise.
* gcc.dg/vect/vect-119.c: Likewise.
* gcc.dg/vect/vect-cselim-1.c: Likewise.
* gcc.dg/vect/vect-live-slp-1.c: Likewise.
* gcc.dg/vect/vect-live-slp-2.c: Likewise.
* gcc.dg/vect/vect-live-slp-3.c: Likewise.
* gcc.dg/vect/vect-mult-const-pattern-1.c: Likewise.
* gcc.dg/vect/vect-mult-const-pattern-2.c: Likewise.
* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-1.c: Likewise.
* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-4.c: Likewise.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp	2017-11-03 17:22:13.533564036 +0000
+++ gcc/testsuite/lib/target-supports.exp	2017-11-03 17:24:09.475993817 +0000
@@ -3350,6 +3350,35 @@ proc check_effective_target_aarch64_litt
 }]
 }
 
+# Return 1 if this is an AArch64 target supporting SVE.
+proc check_effective_target_aarch64_sve { } {
+if { ![istarget aarch64*-*-*] } {
+   return 0
+}
+return [check_no_compiler_messages aarch64_sve assembly {
+   #if !defined (__ARM_FEATURE_SVE)
+   #error FOO
+   #endif
+}]
+}
+
+# Return the size in bits of an SVE vector, or 0 if the size is variable.
+proc aarch64_sve_bits { } {
+return [check_cached_effective_target aarch64_sve_bits {
+   global tool
+
+   set src dummy[pid].c
+   set f [open $src "w"]
+   puts $f "int bits = __ARM_FEATURE_SVE_BITS;"
+   close $f
+   set output [${tool}_target_compile $src "" preprocess ""]
+   file delete $src
+
+   regsub {.*bits = ([^;]*);.*} $output {\1} bits
+   expr { $bits }
+}]
+}
+
 # Return 1 if this is a compiler supporting ARC atomic operations
 proc check_effective_target_arc_atomic { } {
 return [check_no_compiler_messages arc_atomic assembly {
@@ -4275,6 +4304,49 @@ proc check_effective_target_arm_neon_hw
 } [add_options_for_arm_neon ""]]
 }
 
+# Return true if this is an AArch64 target that can run SVE code.
+
+proc check_effective_target_aarch64_sve_hw { } {
+if { ![istarget aarch64*-*-*] } {
+   return 0
+}
+return [check_runtime aarch64_sve_hw_available {
+   int
+   main (void)
+   {
+ asm volatile ("ptrue p0.b");
+ return 0;
+   }
+}]
+}
+
+# Return true if this is an AArch64 target that can run SVE code and
+# if its SVE vectors have exactly BITS bits.
+
+proc aarch64_sve_hw_bits { bits } {
+if { ![check_effective_target_aarch64_sve_hw] } {
+   return 0
+}
+return [check_runtime aarch64_sve${bits}_hw [subst {
+   int
+   main (void)
+   {
+ int res;
+ asm volatile ("cntd %0" : "=r" (res));
+ if (res * 64 != $bits)
+   __builtin_abort ();
+ return 0;
+   }
+}]]
+}
+
+# Return true if this is an 

Re: [PATCH] C/C++: more stdlib header hints (PR c/81404) (v4)

2017-11-03 Thread Joseph Myers
On Thu, 2 Nov 2017, David Malcolm wrote:

> +{"offsetof", {"", ""} },

offsetof is in stddef.h for C, not stdalign.h.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Use rcrt1.o%s/grcrt1.o%s to relocate static PIE

2017-11-03 Thread H.J. Lu
On Wed, Nov 1, 2017 at 9:39 AM, H.J. Lu  wrote:
> On Wed, Nov 1, 2017 at 9:32 AM, Rich Felker  wrote:
>> On Sun, Oct 15, 2017 at 06:16:57AM -0700, H.J. Lu wrote:
>>> crt1.o is used to create dynamic and non-PIE static executables.  Static
>>> PIE needs to link with Pcrt1.o, instead of crt1.o, to relocate static PIE
>>> at run-time.  When -pg is used with -static-pie, gPcrt1.o should be used.
>>>
>>> Tested on x86-64.  OK for master?
>>
>> Is there a reason you didn't follow the existing naming practice here?
>> Openbsd and musl libc have both had static pie for a long time now and
>> have used rcrt1.o as the name.
>
> I wasn't aware of rcrt1.o and there is no reference to rcrt1.o in GCC at all.
> Does the FSF GCC support static PIE for musl libc? If not, is there a GCC
> bug for it?
>
> BTW, I don't mind replacing Pcrt1.o/gPcrt1.o with rcrt1.o/grcrt1.o.
>

Here is the updated patch to use rcrt1.o/grcrt1.o.

OK for trunk?

Thanks.


-- 
H.J.
From 4d727ac09ef22d0706b30055a233920ffa2419a8 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 3 Oct 2017 17:29:13 -0700
Subject: [PATCH] Use rcrt1.o%s/grcrt1.o%s to relocate static PIE

crt1.o is used to create dynamic and non-PIE static executables.  Static
PIE needs to link with rcrt1.o instead of crt1.o in order to relocate
the static PIE at run-time.  The rcrt1.o name is also used by musl libc
and OpenBSD:

https://gcc.gnu.org/ml/gcc/2015-06/msg8.html

When -pg is used with -static-pie, grcrt1.o should be used.
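Spelled out, the startfile selection after this change is roughly the following (a sketch; the dynamic-PIE case depends on how PIE_SPEC expands for the given command line):

```
-shared                ->  (no crt1 object)
-pg / -p / -profile    ->  gcrt1.o  (grcrt1.o if -static-pie is also given)
-static                ->  crt1.o
-static-pie            ->  rcrt1.o
PIE (dynamic)          ->  Scrt1.o
otherwise              ->  crt1.o
```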

	* config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Use
	rcrt1.o%s/grcrt1.o%s for -static-pie.
---
 gcc/config/gnu-user.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index df17b180906..93960f5cacb 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -51,9 +51,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #if defined HAVE_LD_PIE
 #define GNU_USER_TARGET_STARTFILE_SPEC \
   "%{shared:; \
- pg|p|profile:gcrt1.o%s; \
+ pg|p|profile:%{static-pie:grcrt1.o%s;:gcrt1.o%s}; \
  static:crt1.o%s; \
- static-pie|" PIE_SPEC ":Scrt1.o%s; \
+ static-pie:rcrt1.o%s; \
+ " PIE_SPEC ":Scrt1.o%s; \
  :crt1.o%s} \
crti.o%s \
%{static:crtbeginT.o%s; \
-- 
2.13.6



[1/4] [AArch64] SVE backend support

2017-11-03 Thread Richard Sandiford
This patch adds support for ARM's Scalable Vector Extension.
The patch just contains the core features that work with the
current vectoriser framework; later patches will add extra
capabilities to both the target-independent code and AArch64 code.
The patch doesn't include:

- support for unwinding frames whose size depends on the vector length
- modelling the effect of __tls_get_addr on the SVE registers

These are handled by later patches instead.

Some notes:

- The copyright years for aarch64-sve.md start at 2009 because some of
  the code is based on aarch64.md, which also starts from then.

- The patch inserts spaces between items in the AArch64 section
  of sourcebuild.texi.  This matches at least the surrounding
  architectures and looks a little nicer in the info output.

- aarch64-sve.md includes a pattern:

while_ult

  A later patch adds a matching "while_ult" optab, but the pattern
  is also needed by the predicate vec_duplicate expander.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/invoke.texi (-msve-vector-bits=): Document new option.
(sve): Document new AArch64 extension.
* doc/md.texi (w): Extend the description of the AArch64
constraint to include SVE vectors.
(Upl, Upa): Document new AArch64 predicate constraints.
* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
enum.
* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
(msve-vector-bits=): New option.
* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
SVE when these are disabled.
(sve): New extension.
* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
modes.  Adjust their number of units based on aarch64_sve_vg.
(MAX_BITSIZE_MODE_ANY_MODE): Define.
* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
aarch64_addr_query_type.
(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
(aarch64_simd_imm_zero_p): Delete.
(aarch64_check_zero_based_sve_index_immediate): Declare.
(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
(aarch64_sve_float_mul_immediate_p): Likewise.
(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
rather than an rtx.
(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
(aarch64_regmode_natural_size): Likewise.
* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
left one place.
(AARCH64_ISA_SVE, TARGET_SVE): New macros.
(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
for VG and the SVE predicate registers.
(V_ALIASES): Add a "z"-prefixed alias.
(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
(REG_CLASS_NAMES): Add entries for them.
(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
and the predicate registers.
(aarch64_sve_vg): Declare.
(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
(REGMODE_NATURAL_SIZE): Define.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
SVE macros.
* config/aarch64/aarch64.c: Include cfgrtl.h.
(simd_immediate_info): Add a constructor for series vectors,
and an associated step field.
(aarch64_sve_vg): New variable.
(aarch64_dbx_register_number): Handle VG and the predicate registers.
(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
(VEC_ANY_DATA, VEC_STRUCT): New constants.
(aarch64_advsimd_struct_mode_p, 

[0/4] [AArch64] Add SVE support

2017-11-03 Thread Richard Sandiford
This series adds support for ARM's Scalable Vector Extension.
More details on SVE can be found here:

  
https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a

There are four parts for ease of review, but it probably makes
sense to commit them as one patch.

The series plugs SVE into the current vectorisation framework without
adding any new features to the framework itself.  This means for example
that vector loops still handle full vectors, with a scalar epilogue loop
being needed for the rest.  Later patches add support for other features
like fully-predicated loops.

The patches build on top of the various series that I've already posted.
Sorry that there were so many, and thanks again for all the reviews.

Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
(in the default vector-length agnostic mode).  Also tested with
-msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
and 512-bit SVE registers.

Thanks,
Richard


RE: [patch][x86] GFNI enabling [2/4]

2017-11-03 Thread Koval, Julia
Here is the solution I propose:

gcc/
* common/config/i386/i386-common.c
(OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET): Remove MPX from flag.
(ix86_handle_option): Move MPX to isa_flags2 and GFNI to isa_flags.
* config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
* config/i386/i386.opt: Ditto.
* config/i386/i386.c (ix86_target_string): Ditto.
(ix86_option_override_internal): Ditto.
(ix86_init_mpx_builtins): Move MPX to args2.
(ix86_expand_builtin): Special handling for OPTION_MASK_ISA_GFNI.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineinvqb_v64qi,
__builtin_ia32_vgf2p8affineinvqb_v64qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v32qi,
__builtin_ia32_vgf2p8affineinvqb_v32qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v16qi,
__builtin_ia32_vgf2p8affineinvqb_v16qi_mask): Move to ARGS array.

> -Original Message-
> From: Koval, Julia
> Sent: Friday, November 03, 2017 9:27 AM
> To: 'Jakub Jelinek' 
> Cc: 'Kirill Yukhin' ; 'GCC Patches'  patc...@gcc.gnu.org>
> Subject: RE: [patch][x86] GFNI enabling [2/4]
> 
> > But what do you think about adding AVX/SSE flags to this special set 
> > instead?
> Ok, I was wrong; it is impossible to add SSE, because it is used in the
> normal "or" way.
> Then I'll add GFNI/VAES instead.
> 
> There is also another problem there: GFNI belongs to isa_flags2, while
> AVX512VL/AVX/SSE belong to isa_flags, so we can't keep them in the same field.
> There are candidates, which can be moved from isa_flags to isa_flags2 instead
> of GFNI, because there are no dependencies on other flags, but it is only a 
> short
> term solution.
> 
> > -Original Message-
> > From: Koval, Julia
> > Sent: Thursday, November 02, 2017 12:57 PM
> > To: Jakub Jelinek 
> > Cc: Kirill Yukhin ; GCC Patches  > patc...@gcc.gnu.org>
> > Subject: RE: [patch][x86] GFNI enabling [2/4]
> >
> > The documentation is right; I was wrong not to add SSE/AVX flags in these
> > builtin declarations.
> >
> > > The exceptions are
> > > MMX, AVX512VL and 64BIT is also special.
> > > So, shall GFNI be added to that set?
> > Turns out only GFNI and VAES (haven't sent those yet; they are from the same
> > Icelake pdf) are like this; others rely on AVX512VL/BW. But what do you 
> > think
> > about adding AVX/SSE flags to this special set instead? Looks like they more
> > probably will be used as a flags, on which new instructions may depend in 
> > the
> > future, than GFNI/VAES flags.
> >
> > -Julia
> >
> > > -Original Message-
> > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > > ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> > > Sent: Tuesday, October 31, 2017 8:28 PM
> > > To: Koval, Julia 
> > > Cc: Kirill Yukhin ; GCC Patches  > > patc...@gcc.gnu.org>
> > > Subject: Re: [patch][x86] GFNI enabling [2/4]
> > >
> > > On Mon, Oct 30, 2017 at 07:02:23PM +, Koval, Julia wrote:
> > > > gcc/testsuite/
> > > > * gcc.target/i386/avx-1.c: Handle new intrinsics.
> > > > * gcc.target/i386/avx512-check.h: Check GFNI bit.
> > > > * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
> > > > * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
> > > > * gcc.target/i386/gfni-1.c: New.
> > > > * gcc.target/i386/gfni-2.c: New.
> > > > * gcc.target/i386/gfni-3.c: New.
> > > > * gcc.target/i386/gfni-4.c: New.
> > >
> > > The gfni-4.c testcase ICEs on i686-linux (e.g. try
> > > make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-msse,-m32/-
> > > mno-sse,-m64\} i386.exp=gfni*'
> > > to see it).
> > >
> > > I must say I'm confused by the CPUIDs, the
> > > https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> > > instruction-set-extensions-programming-reference.pdf
> > > lists GFNI; 2x AVX+GFNI; 2x AVX512VL+GFNI; AVX512F+GFNI CPUIDs for the
> > > instructions, but i386-builtins.def has:
> > > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v64qi,
> > > "__builtin_ia32_vgf2p8affineinvqb_v64qi",
> > > IX86_BUILTIN_VGF2P8AFFINEINVQB512, UNKNOWN
> > > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > > CODE_FOR_vgf2p8affineinvqb_v64qi_mask,
> > > "__builtin_ia32_vgf2p8affineinvqb_v64qi_mask", IX86_
> > > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v32qi,
> > > "__builtin_ia32_vgf2p8affineinvqb_v32qi",
> > > IX86_BUILTIN_VGF2P8AFFINEINVQB256, UNKNOWN
> > > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > > CODE_FOR_vgf2p8affineinvqb_v32qi_mask,
> > > "__builtin_ia32_vgf2p8affineinvqb_v32qi_mask", IX86_
> > > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v16qi,
> > > "__builtin_ia32_vgf2p8affineinvqb_v16qi",
> > > IX86_BUILTIN_VGF2P8AFFINEINVQB128, UNKNOWN
> > > BDESC (OPTION_MASK_ISA_GFNI | 

Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Uros Bizjak
On Fri, Nov 3, 2017 at 6:13 PM, Jeff Law  wrote:
> On 11/03/2017 04:46 AM, Uros Bizjak wrote:
>>
>> On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener
>>  wrote:
>>>
>>> On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:
>
> * config/i386/i386.c (ix86_emit_restore_reg_using_pop):
> Prototype.
> (ix86_adjust_stack_and_probe_stack_clash): Use a push/pop
> sequence
> to probe at the start of a noreturn function.
>
> * gcc.target/i386/stack-check-12.c: New test


 -  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
 -   -GET_MODE_SIZE (word_mode)));
 +  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode,
 0)));

 Please use AX_REG instead of 0.

 +  RTX_FRAME_RELATED_P (insn) = 1;
 +  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

 Also here.

 emit_insn (gen_blockage ());

 BTW: Could we use an unused register here, if available? %eax is used
 to pass first argument in regparm functions on 32bit targets.
>>>
>>>
>>> Can you push %[er]sp?  What about partial reg stalls when using other
>>> registers (if the last set was a movb to it)?  I guess [er]sp is safe
>>> here
>>> as was [re]ax due to the ABI?
>>
>>
>> That would work, too. I believe, that this won't trigger stack engine
>> [1], but since the operation is a bit unusual, let's ask HJ to be
>> sure.
>>
>> [1] https://en.wikipedia.org/wiki/Stack_register#Stack_engine
>
> How about %esi in 32 bit mode and %rax in 64 bit mode?  I think that avoids
> hitting the parameter passing registers.

That is a good choice.  IMO, it warrants a small comment in the
source explaining why this choice was made.

Uros.


[C++ Patch/RFC] PR 82593 ("Internal compiler error: in process_init_constructor_array, at cp/typeck2.c:1294")

2017-11-03 Thread Paolo Carlini

Hi,

this ICE on valid code (given GNU's designated initializers) is rather simple 
to analyze: for the testcase, the gcc_assert in 
process_init_constructor_array triggers because at that time INDEX1 is 
still a CONST_DECL, not an INTEGER_CST. As regards fixing the problem, I 
immediately noticed earlier today that fold_non_dependent_expr is 
definitely able to fold the CONST_DECL to the expected zero INTEGER_CST 
- and that also passes testing - but I'm not sure that in the big 
picture folding at that time is correct. Thanks in advance for any feedback!


Paolo.

//

Index: cp/typeck2.c
===
--- cp/typeck2.c	(revision 254365)
+++ cp/typeck2.c	(working copy)
@@ -1291,6 +1291,7 @@ process_init_constructor_array (tree type, tree in
 {
   if (ce->index)
 	{
+	  ce->index = fold_non_dependent_expr (ce->index);
 	  gcc_assert (TREE_CODE (ce->index) == INTEGER_CST);
 	  if (compare_tree_int (ce->index, i) != 0)
 	    {
Index: testsuite/g++.dg/cpp0x/desig2.C
===
--- testsuite/g++.dg/cpp0x/desig2.C	(nonexistent)
+++ testsuite/g++.dg/cpp0x/desig2.C	(working copy)
@@ -0,0 +1,23 @@
+// PR c++/82593
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+enum {
+ INDEX1 = 0,
+ INDEX2
+};
+
+class SomeClass {
+public:
+ SomeClass();
+private:
+ struct { int field; } member[2];
+};
+
+SomeClass::SomeClass()
+ : member({
+   [INDEX1] = { .field = 0 },
+   [INDEX2] = { .field = 1 }
+ })
+{
+}


Re: [PATCH] Fix libsanitizer bootstrap with glibc 2.26

2017-11-03 Thread Andi Kleen
On Fri, Nov 03, 2017 at 10:22:12AM -0700, Andi Kleen wrote:
> 
> It looks like some non POSIX symbols got removed from the header
> files, which breaks the libsanitizer build.

Never mind, looks like I was on an old checkout.  This seems to be
already fixed in current trunk.

-Andi


[PATCH] Fix libsanitizer bootstrap with glibc 2.26

2017-11-03 Thread Andi Kleen

It looks like some non POSIX symbols got removed from the header
files, which breaks the libsanitizer build.

struct sigaltstack now only exists as stack_t (which is the official
POSIX name)

__res_state typedef is now only struct __res_state

This fixes bootstrap of trunk on a current opensuse tumbleweed system.

I realize this is a downstream version, but fixing bootstrap is rather
important.

Now passes bootstrap with this patch.

libsanitizer/:
2017-11-03  Andi Kleen  

* sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc
(TracerThread): Use stack_t instead of struct sigaltstack.
* tsan/tsan_platform_linux.cc (ExtractResolvFDs):
Use struct __res_state instead of __res_state.


diff --git 
a/libsanitizer/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc 
b/libsanitizer/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc
index 891386dc0ba..14dedcae64f 100644
--- a/libsanitizer/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc
@@ -273,11 +273,11 @@ static int TracerThread(void* argument) {
 
   // Alternate stack for signal handling.
   InternalScopedBuffer<char> handler_stack_memory(kHandlerStackSize);
-  struct sigaltstack handler_stack;
+  stack_t handler_stack;
   internal_memset(&handler_stack, 0, sizeof(handler_stack));
   handler_stack.ss_sp = handler_stack_memory.data();
   handler_stack.ss_size = kHandlerStackSize;
-  internal_sigaltstack(&handler_stack, nullptr);
+  internal_sigaltstack((struct sigaltstack *)&handler_stack, nullptr);
 
   // Install our handler for synchronous signals. Other signals should be
   // blocked by the mask we inherited from the parent thread.
diff --git a/libsanitizer/tsan/tsan_platform_linux.cc 
b/libsanitizer/tsan/tsan_platform_linux.cc
index 2ed5718a12e..6f972ab0dd6 100644
--- a/libsanitizer/tsan/tsan_platform_linux.cc
+++ b/libsanitizer/tsan/tsan_platform_linux.cc
@@ -287,7 +287,7 @@ void InitializePlatform() {
 int ExtractResolvFDs(void *state, int *fds, int nfd) {
 #if SANITIZER_LINUX && !SANITIZER_ANDROID
   int cnt = 0;
-  __res_state *statp = (__res_state*)state;
+  struct __res_state *statp = (struct __res_state*)state;
   for (int i = 0; i < MAXNS && cnt < nfd; i++) {
 if (statp->_u._ext.nsaddrs[i] && statp->_u._ext.nssocks[i] != -1)
   fds[cnt++] = statp->_u._ext.nssocks[i];


Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Jeff Law

On 11/03/2017 04:46 AM, Uros Bizjak wrote:

On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener
 wrote:

On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:

* config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
to probe at the start of a noreturn function.

* gcc.target/i386/stack-check-12.c: New test


-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-   -GET_MODE_SIZE (word_mode)));
+  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));

Please use AX_REG instead of 0.

+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

Also here.

emit_insn (gen_blockage ());

BTW: Could we use an unused register here, if available? %eax is used
to pass first argument in regparm functions on 32bit targets.


Can you push %[er]sp?  What about partial reg stalls when using other
registers (if the last set was a movb to it)?  I guess [er]sp is safe here
as was [re]ax due to the ABI?


That would work, too. I believe, that this won't trigger stack engine
[1], but since the operation is a bit unusual, let's ask HJ to be
sure.

[1] https://en.wikipedia.org/wiki/Stack_register#Stack_engine
How about %esi in 32 bit mode and %rax in 64 bit mode?  I think that 
avoids hitting the parameter passing registers.

jeff


[PATCH] ipa-fnsummary.c: fix use-after-free crash (PR jit/82826)

2017-11-03 Thread David Malcolm
PR jit/82826 reports a crash when running jit.dg/test-benchmark.c,
introduced by r254140
(aka "Extend ipa-pure-const pass to propagate malloc attribute.")

I see the crash on the 28th of 400 in-process iterations of the
compiler; on turning on GCC_JIT_BOOL_OPTION_SELFCHECK_GC, it shows up
on the 2nd iteration.

The root cause is that in one in-process invocation of the compiler we
have mismatching calls to ipa_fn_summary_alloc/ipa_free_fn_summary.

The sequence of calls is:

1st in-process iteration:
  (1): ipa_fn_summary_alloc
(called indirectly by pass_local_fn_summary::execute)

  (2): ipa_free_fn_summary
(called by pass_ipa_free_fn_summary::execute)

  (3): ipa_fn_summary_alloc

...where (3) is called thusly:

  (gdb) bt
  #0  ipa_fn_summary_alloc () at ../../src/gcc/ipa-fnsummary.c:533
  #1  0x76616788 in ipa_fn_summary_generate () at 
../../src/gcc/ipa-fnsummary.c:3184
  #2  0x7679ebe4 in execute_ipa_summary_passes (ipa_pass=0x656be0) at 
../../src/gcc/passes.c:2200
  #3  0x76359c2c in ipa_passes () at ../../src/gcc/cgraphunit.c:2448
  #4  0x7635a095 in symbol_table::compile (this=0x7fffef8ed100) at 
../../src/gcc/cgraphunit.c:2558
  #5  0x7635a4e2 in symbol_table::finalize_compilation_unit 
(this=0x7fffef8ed100) at ../../src/gcc/cgraphunit.c:2716
  #6  0x768ab25d in compile_file () at ../../src/gcc/toplev.c:481
  #7  0x768ae3e1 in do_compile () at ../../src/gcc/toplev.c:2063
  #8  0x768ae758 in toplev::main (this=0x7fffdf8e, argc=12, 
argv=0x61f6c8) at ../../src/gcc/toplev.c:2198
  (etc)

and so we have an "alloc" that's not matched by a free.

Hence the global "ipa_fn_summaries" is left non-NULL, a
GTY-marked object, with a pointer to the GC-heap-allocated
"symtab".  There's no GTY marking for the symtab reference
in the function_summary ; it has GTY((user)), and the
hand-written functions don't visit the symtab.

On the next in-process iteration, the would-be alloc encounters this
conditional:

2396  if (!ipa_fn_summaries)
2397    ipa_fn_summary_alloc ();

and hence doesn't re-allocate.

The global "ipa_fn_summaries" thus contains a pointer to last iteration's
"symtab", leading to unpredictable bugs due to there being *two* symbol
tables (one for the previous iteration, one for the current iteration).
At some point the previous symtab will be collected from "under"
ipa_fn_summaries, leading to a use-after-free crash e.g. here in
function_summary::release:
67  m_symtab->remove_cgraph_insertion_hook (m_symtab_insertion_hook);

due to m_symtab being GC-allocated but with nothing keeping it alive,
and thus being (hopefully) poisoned by GC, or perhaps with the memory
being reused for something else.

This isn't an issue for the regular compiler, but when libgccjit
reruns in-process, the mismatched alloc/cleanups can lead to use-after-free
after the GC has run.

This patch fixes the issue in the least invasive way I could, by explicitly
ensuring that the cleanup happens in toplev::finalize (the jit's hook for
doing such cleanups); merely fixing the GTY issue wouldn't fix the issue of
accidentally having two symtabs.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
OK for trunk?

gcc/ChangeLog:
PR jit/82826
* ipa-fnsummary.c (ipa_fnsummary_c_finalize): New function.
* ipa-fnsummary.h (ipa_fnsummary_c_finalize): New decl.
* toplev.c: Include "ipa-fnsummary.h".
(toplev::finalize): Call ipa_fnsummary_c_finalize.
---
 gcc/ipa-fnsummary.c | 9 +
 gcc/ipa-fnsummary.h | 2 ++
 gcc/toplev.c| 2 ++
 3 files changed, 13 insertions(+)

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index f71338e..20eec2a 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3618,3 +3618,12 @@ make_pass_ipa_fn_summary (gcc::context *ctxt)
 {
   return new pass_ipa_fn_summary (ctxt);
 }
+
+/* Reset all state within ipa-fnsummary.c so that we can rerun the compiler
+   within the same process.  For use by toplev::finalize.  */
+
+void
+ipa_fnsummary_c_finalize (void)
+{
+  ipa_free_fn_summary ();
+}
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index a794bd0..b345bbc 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -266,4 +266,6 @@ void estimate_node_size_and_time (struct cgraph_node *node,
  vec
  inline_param_summary);
 
+void ipa_fnsummary_c_finalize (void);
+
 #endif /* GCC_IPA_FNSUMMARY_H */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index f68de3b..590ab58 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -83,6 +83,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "edit-context.h"
 #include "tree-pass.h"
 #include "dumpfile.h"
+#include "ipa-fnsummary.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -2238,6 +2239,7 @@ toplev::finalize (void)
 
   /* Needs to be called before cgraph_c_finalize 

[PATCH] Set default to -fomit-frame-pointer

2017-11-03 Thread Wilco Dijkstra
Almost all targets add an explicit -fomit-frame-pointer in the target-specific
options.  Rather than doing this in a target-specific way, do this in the
generic options so it works identically across all targets.  In many cases the
target no longer needs to define TARGET_OPTION_OPTIMIZATION_TABLE, reducing
the amount of target code.

Verified that all targets built by the build-many-glibcs.py script still build.

OK for commit?


ChangeLog:
2017-11-03  Wilco Dijkstra  

* opts.c (default_options_table): Add OPT_fomit_frame_pointer entry.
* common/config/alpha/alpha-common.c (TARGET_OPTION_OPTIMIZATION_TABLE):
Remove OPT_fomit_frame_pointer entry.
* common/config/arc/arc-common.c: Likewise. 
* common/config/arm/arm-common.c: Likewise. 
* common/config/avr/avr-common.c: Likewise. 
* common/config/c6x/c6x-common.c: Likewise. 
* common/config/cr16/cr16-common.c: Likewise.   
* common/config/cris/cris-common.c: Likewise.   
* common/config/epiphany/epiphany-common.c: Likewise.   
* common/config/fr30/fr30-common.c: Likewise.   
* common/config/frv/frv-common.c: Likewise. 
* common/config/ia64/ia64-common.c: Likewise.   
* common/config/iq2000/iq2000-common.c: Likewise.   
* common/config/lm32/lm32-common.c: Likewise.   
* common/config/m32r/m32r-common.c: Likewise.   
* common/config/mcore/mcore-common.c: Likewise. 
* common/config/microblaze/microblaze-common.c: Likewise.   
* common/config/mips/mips-common.c: Likewise.   
* common/config/mmix/mmix-common.c: Likewise.   
* common/config/mn10300/mn10300-common.c: Likewise.
* common/config/nios2/nios2-common.c: Likewise. 
* common/config/pa/pa-common.c: Likewise.   
* common/config/pdp11/pdp11-common.c: Likewise. 
* common/config/powerpcspe/powerpcspe-common.c: Likewise.   
* common/config/riscv/riscv-common.c: Likewise. 
* common/config/rs6000/rs6000-common.c: Likewise.   
* common/config/rx/rx-common.c: Likewise.   
* common/config/s390/s390-common.c: Likewise.   
* common/config/sh/sh-common.c: Likewise.   
* common/config/sparc/sparc-common.c: Likewise. 
* common/config/tilegx/tilegx-common.c: Likewise.   
* common/config/tilepro/tilepro-common.c: Likewise. 
* common/config/v850/v850-common.c: Likewise.   
* common/config/visium/visium-common.c: Likewise.   
* common/config/xstormy16/xstormy16-common.c: Likewise. 
* common/config/xtensa/xtensa-common.c: Likewise.

doc/
* invoke.texi (-fomit-frame-pointer): Update documentation.

--
diff --git a/gcc/common/config/alpha/alpha-common.c 
b/gcc/common/config/alpha/alpha-common.c
index 
be42282270bbc22e31e39bfb5307d7b4d82a84b9..3a7d28d16225478e2fdae42c5610e55dc0b68c6f
 100644
--- a/gcc/common/config/alpha/alpha-common.c
+++ b/gcc/common/config/alpha/alpha-common.c
@@ -30,7 +30,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Implement TARGET_OPTION_OPTIMIZATION_TABLE.  */
 static const struct default_options alpha_option_optimization_table[] =
   {
-{ OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
 /* Enable redundant extension instructions removal at -O2 and higher.  */
 { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
diff --git a/gcc/common/config/arc/arc-common.c 
b/gcc/common/config/arc/arc-common.c
index 
82e0dd383c9d627d39cc8cf904ef3c17a80f3da9..c437313ba4192b1d6c79b047b40b02e5b7a4facb
 100644
--- a/gcc/common/config/arc/arc-common.c
+++ b/gcc/common/config/arc/arc-common.c
@@ -47,7 +47,6 @@ arc_option_init_struct (struct gcc_options *opts)
 static const struct default_options arc_option_optimization_table[] =
   {
 { OPT_LEVELS_SIZE, OPT_fsection_anchors, NULL, 1 },
-{ OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
 { OPT_LEVELS_ALL, OPT_mRcq, NULL, 1 },
 { OPT_LEVELS_ALL, OPT_mRcw, NULL, 1 },
 { OPT_LEVELS_ALL, OPT_msize_level_, NULL, 1 },
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 
1588ca86e9b06282ed4358e072bc2b0224a11483..5ae20fea916a636d078b9e1aa2b4e866b9da1259
 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -36,7 +36,6 @@ static const struct default_options 
arm_option_optimization_table[] =
   {
 /* Enable section anchors by default at -O1 or higher.  */
 { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
-{ OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
diff --git a/gcc/common/config/avr/avr-common.c 
b/gcc/common/config/avr/avr-common.c
index 

Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Jeff Law

On 11/03/2017 04:14 AM, Richard Biener wrote:

On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:

* config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
to probe at the start of a noreturn function.

* gcc.target/i386/stack-check-12.c: New test


-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-   -GET_MODE_SIZE (word_mode)));
+  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));

Please use AX_REG instead of 0.

+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

Also here.

emit_insn (gen_blockage ());

BTW: Could we use an unused register here, if available? %eax is used
to pass first argument in regparm functions on 32bit targets.


Can you push %[er]sp?  What about partial reg stalls when using other
registers (if the last set was a movb to it)?  I guess [er]sp is safe here
as was [re]ax due to the ABI?
The whole point here is to touch the stack, but leave other machine 
state as-is.  So the register we use isn't terribly important (with the 
possible exception of %esp).  After the push/pop sequence the register 
we use still has the same value & the stack pointer still points to the 
same location as before the push/pop.


Jeff


Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Jeff Law

On 11/03/2017 04:46 AM, Uros Bizjak wrote:

On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener
 wrote:

On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:

* config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
to probe at the start of a noreturn function.

* gcc.target/i386/stack-check-12.c: New test


-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-   -GET_MODE_SIZE (word_mode)));
+  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));

Please use AX_REG instead of 0.

+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

Also here.

emit_insn (gen_blockage ());

BTW: Could we use an unused register here, if available? %eax is used
to pass first argument in regparm functions on 32bit targets.


Can you push %[er]sp?  What about partial reg stalls when using other
registers (if the last set was a movb to it)?  I guess [er]sp is safe here
as was [re]ax due to the ABI?


That would work, too.  I believe that this won't trigger the stack engine
[1], but since the operation is a bit unusual, let's ask HJ to be
sure.

I suspect we're better off avoiding %esp as the source/dest.

And a note on the micro-optimizations.  This only happens for no-return 
function prologues.  I think we can assume they are not performance 
critical.


Jeff


Re: [PATCH][AArch64] Simplify frame layout for stack probing

2017-11-03 Thread Wilco Dijkstra
James Greenhalgh wrote:
>
> This caused:
>
>  Failures:
>    gcc.target/aarch64/test_frame_4.c
>    gcc.target/aarch64/test_frame_2.c
>    gcc.target/aarch64/test_frame_7.c
>    gcc.target/aarch64/test_frame_10.c

Sorry, I missed that in testing.  I've reverted part of the patch that
caused this.  The tests are definitely too picky but they also
uncovered a real code generation inefficiency, so I need to look into
that further.

I've committed this:

2017-11-03  Wilco Dijkstra  

PR target/82786
* config/aarch64/aarch64.c (aarch64_layout_frame):
Undo forcing of LR at bottom of frame.
--
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2fc7db4..949f3cb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-11-03  Wilco Dijkstra  
+
+   PR target/82786
+   * config/aarch64/aarch64.c (aarch64_layout_frame):
+   Undo forcing of LR at bottom of frame.
+
 2017-11-03  Jeff Law  
 
* cfganal.c (single_pred_edge_ignoring_loop_edges): New function
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1e12645..12f247d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2908,8 +2908,7 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  If the function is not a leaf, ensure LR is
-   saved at the bottom of the callee-save area.  */
+   and LR may be omitted).  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2966,13 +2965,6 @@ aarch64_layout_frame (void)
   cfun->machine->frame.wb_candidate2 = R30_REGNUM;
   offset = 2 * UNITS_PER_WORD;
 }
-  else if (!crtl->is_leaf)
-{
-  /* Ensure LR is saved at the bottom of the callee-saves.  */
-  cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
-  cfun->machine->frame.wb_candidate1 = R30_REGNUM;
-  offset = UNITS_PER_WORD;
-}
 
   /* Now assign stack slots for them.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)





Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Jeff Law

On 11/03/2017 02:38 AM, Uros Bizjak wrote:

* config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
to probe at the start of a noreturn function.

* gcc.target/i386/stack-check-12.c: New test


-  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
-   -GET_MODE_SIZE (word_mode)));
+  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));

Please use AX_REG instead of 0.

+  RTX_FRAME_RELATED_P (insn) = 1;
+  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));

Also here.

emit_insn (gen_blockage ());

BTW: Could we use an unused register here, if available? %eax is used
to pass first argument in regparm functions on 32bit targets.

Any register is sufficient.  We just want to touch the top of the stack.

Jeff


[PATCH, testsuite]: PR 82828: Fix invalid gcc.target/i386/pr70263-2.c testcase

2017-11-03 Thread Uros Bizjak
Hello!

This testcase uses undefined values. Attached patch fixes the
testcase; the scan for message in the RTL dump assures us that the fix
for PR 70263 still performs its magic.

2017-11-03  Uros Bizjak  

PR testsuite/82828
PR rtl-optimization/70263
* gcc.target/i386/pr70263-2.c: Fix invalid testcase.

Tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: gcc.target/i386/pr70263-2.c
===
--- gcc.target/i386/pr70263-2.c (revision 254289)
+++ gcc.target/i386/pr70263-2.c (working copy)
@@ -4,20 +4,13 @@
 /* { dg-final { scan-rtl-dump "Adding REG_EQUIV to insn \[0-9\]+ for source of 
insn \[0-9\]+" "ira" } } */
 
 typedef float XFtype __attribute__ ((mode (XF)));
-typedef _Complex float XCtype __attribute__ ((mode (XC)));
-XCtype
-__mulxc3 (XFtype a, XFtype b, XFtype c, XFtype d)
+
+void bar (XFtype);
+
+void
+foo (XFtype a, XFtype c)
 {
-  XFtype ac, bd, ad, bc, x, y;
-  ac = a * c;
-__asm__ ("": "=m" (ac):"m" (ac));
-  if (x != x)
-{
-  _Bool recalc = 0;
-  if (((!(!(((ac) - (ac)) != ((ac) - (ac)))
-   recalc = 1;
-  if (recalc)
-   x = __builtin_huge_vall () * (a * c - b * d);
-}
-  return x;
+  XFtype ac = a * c;
+
+  bar (ac);
 }


Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-03 Thread Martin Jambor
Hi,

On Thu, Oct 26, 2017 at 02:43:02PM +0200, Richard Biener wrote:
> On Thu, Oct 26, 2017 at 2:18 PM, Martin Jambor  wrote:
> >
> > Nevertheless, I still intend to experiment with the limit, I sent out
> > this RFC exactly so that I don't spend a lot of time benchmarking
> > something that is eventually not deemed acceptable on principle.
> 
> I think the limit should be on the number of generated copies and not
> the overall size of the structure...  If the struct were composed of
> 32 individual chars we wouldn't want to emit 32 loads and 32 stores...

I have added another parameter to also limit the number of generated
element copies.  I have kept the size limit so that we don't even
attempt to count them for large structures.

> Given that load bandwith is usually higher than store bandwith it
> might make sense to do the store combining in our copying sequence,
> like for the 8 byte entry case use sth like
> 
>   movq 0(%eax), %xmm0
>   movhps 8(%eax), %xmm0 // or vpinsert
>   mov[au]ps %xmm0, 0%(ebx)

I would be concerned about the cost of GPR->XMM moves when the value
being stored is in a GPR, especially with generic tuning which (with
-O2) is the main thing I am targeting here.  Wouldn't we actually pass
it through stack with all the associated penalties?

Also, while such store combining might work for ImageMagick, if a
programmer did:

region1->x = x1;
region2->x = x2;
region1->y = 0;
region2->y = 20;
...
SetPixelCacheNexusPixels(cache_info, ReadMode, region1, ...)

The transformation would not work unless it could prove region1 and
region2 are not the same thing.

> As said a general concern was you not copying padding.  If you
> put this into an even more common place you surely will break
> stuff, no?

I don't understand, what even more common place do you mean?

I have been testing the patch also on a bunch of other architectures
and those have tests in their testsuite that check that padding is
copied, for example some tests in gcc.target/aarch64/aapcs64/ check
whether a structure passed to a function is binary the same as the
original, and the test fail because of padding.  That is the only
"breakage" I know about, but I believe that the assumption that padding
must always be copied is wrong (if it is not, then we need to make SRA
quite a bit more conservative).


On Thu, Oct 26, 2017 at 05:09:42PM +0200, Richard Biener wrote:
> Also if we do the stores in smaller chunks we are more
> likely hitting the same store-to-load-forwarding issue
> elsewhere.  Like in case the destination is memcpy'ed
> away.
> 
> So the proposed change isn't necessarily a win without
> a possible similar regression that it tries to fix.
>

With some encouragement by Honza, I have done some benchmarking anyway
and I did not see anything of that kind.

> Whole-program analysis of accesses might allow
> marking affected objects.

Attempting to save access patterns before IPA and then tracking them
and keep them in sync across inlining and all gimple late passes seems
like a nightmarish task.  If this approach is indeed rejected I might
attempt to do the store combining but a WPA analysis seems just too
complex.

Anyway, here are the numbers.  They were taken on two different
Zen-based machines.  I am also in the process of measuring at least
something on a Haswell machine but I started later and the machine is
quite a bit slower so I will not have the numbers until next week (and
not all equivalents in any way).  I found out I do not have access to
any more modern .*Lake intel CPU.

trunk is pristine trunk revision 254205.  All benchmarks were run
three times and the median was chosen.

s or strict means the patch with the strictest possible settings to
speed-up ImageMagick, i.e. --param max-size-for-elementwise-copy=32
--param max-insns-for-elementwise-copy=4.  Also run three times.

x1 is patched trunk with the parameters having the default values I was
going to propose, i.e. --param max-size-for-elementwise-copy=35
--param max-insns-for-elementwise-copy=6.  Also run three times.

I then increased the parameters, in search of further missed
opportunities and to see what would start to regress and how soon.
x2 is roughly twice that, --param max-size-for-elementwise-copy=67
--param max-insns-for-elementwise-copy=12.  Run twice, outliers
manually checked.

x4 is roughly four times x1, namely --param max-size-for-elementwise-copy=143
--param max-insns-for-elementwise-copy=24.  Run only once.

The times below are of course "non-reportable," for a whole bunch of
reasons.


Zen SPECINT 2006  -O2 generic tuning


 Run-time
 
 
| Benchmark      | trunk |   s |     % |  x1 |     % |  x2 |     % |  x4 |     % |
|----------------+-------+-----+-------+-----+-------+-----+-------+-----+-------|
| 400.perlbench  |   237 | 236 | -0.42 | 236 | -0.42 | 238 | +0.42 | 237 | +0.00 |
| 401.bzip2      |   341 | 342 | +0.29 | 341 | +0.00 | 341 | +0.00 | 341 | +0.00 |
| 403.gcc   

Use extract_bit_field_as_subreg for vectors

2017-11-03 Thread Richard Sandiford
extract_bit_field_1 tries to use vec_extract to extract part of a
vector.  However, if that pattern isn't defined or if the operands
aren't suitable, another good approach is to try a direct subreg
reference.  This is particularly useful for multi-vector modes on
SVE (e.g. when extracting one vector from an LD2 result).

The function would go on to try the same thing anyway, but only
if there is an integer mode with the same size as the vector mode,
which isn't true for SVE modes (and doesn't seem a good thing to
require in general).  Even when there is an integer mode, doing the
operation on the original modes avoids some unnecessary bitcasting.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expmed.c (extract_bit_field_1): For vector extracts,
fall back to extract_bit_field_as_subreg if vec_extract
isn't available.

Index: gcc/expmed.c
===
--- gcc/expmed.c2017-11-03 12:15:45.018033355 +
+++ gcc/expmed.c2017-11-03 12:23:33.286459352 +
@@ -1708,33 +1708,49 @@ extract_bit_field_1 (rtx str_rtx, poly_u
 }
 
   /* Use vec_extract patterns for extracting parts of vectors whenever
- available.  */
+ available.  If that fails, see whether the current modes and bitregion
+ give a natural subreg.  */
   machine_mode outermode = GET_MODE (op0);
-  scalar_mode innermode = GET_MODE_INNER (outermode);
-  poly_uint64 pos;
-  if (VECTOR_MODE_P (outermode)
-  && !MEM_P (op0)
-  && (convert_optab_handler (vec_extract_optab, outermode, innermode)
- != CODE_FOR_nothing)
-  && must_eq (bitsize, GET_MODE_BITSIZE (innermode))
-  && multiple_p (bitnum, GET_MODE_BITSIZE (innermode), &pos))
+  if (VECTOR_MODE_P (outermode) && !MEM_P (op0))
 {
-  struct expand_operand ops[3];
+  scalar_mode innermode = GET_MODE_INNER (outermode);
   enum insn_code icode
= convert_optab_handler (vec_extract_optab, outermode, innermode);
+  poly_uint64 pos;
+  if (icode != CODE_FOR_nothing
+ && must_eq (bitsize, GET_MODE_BITSIZE (innermode))
+ && multiple_p (bitnum, GET_MODE_BITSIZE (innermode), &pos))
+   {
+ struct expand_operand ops[3];
+
+ create_output_operand (&ops[0], target, innermode);
+ ops[0].target = 1;
+ create_input_operand (&ops[1], op0, outermode);
+ create_integer_operand (&ops[2], pos);
+ if (maybe_expand_insn (icode, 3, ops))
+   {
+ if (alt_rtl && ops[0].target)
+   *alt_rtl = target;
+ target = ops[0].value;
+ if (GET_MODE (target) != mode)
+   return gen_lowpart (tmode, target);
+ return target;
+   }
+   }
+  /* Using subregs is useful if we're extracting the least-significant
+vector element, or if we're extracting one register vector from
+a multi-register vector.  extract_bit_field_as_subreg checks
+for valid bitsize and bitnum, so we don't need to do that here.
 
-  create_output_operand (&ops[0], target, innermode);
-  ops[0].target = 1;
-  create_input_operand (&ops[1], op0, outermode);
-  create_integer_operand (&ops[2], pos);
-  if (maybe_expand_insn (icode, 3, ops))
+The mode check makes sure that we're extracting either
+a single element or a subvector with the same element type.
+If the modes aren't such a natural fit, fall through and
+bitcast to integers first.  */
+  if (GET_MODE_INNER (mode) == innermode)
{
- if (alt_rtl && ops[0].target)
-   *alt_rtl = target;
- target = ops[0].value;
- if (GET_MODE (target) != mode)
-   return gen_lowpart (tmode, target);
- return target;
+ rtx sub = extract_bit_field_as_subreg (mode, op0, bitsize, bitnum);
+ if (sub)
+   return sub;
}
 }
 


Improve spilling for variable-size slots

2017-11-03 Thread Richard Sandiford
Once SVE is enabled, a general AArch64 spill slot offset will be

  A + B * VL

where A is a constant and B is a multiple of the SVE vector length.
The offsets in SVE load and store instructions are a multiple of VL
(and so can encode some values of B), while offsets for base AArch64
load and store instructions aren't (and encode some values of A).

We therefore get better spill code if variable-sized slots are grouped
together separately from constant-sized slots, and if variable-sized
slots are not reused for constant-sized data.  Then, spills to the
constant-sized slots can add B * VL to the offset first, creating a
common anchor point for spills with the same B component but different
A components.  Spills to variable-sized slots can likewise add A to
the offset first, creating a common anchor point for spills with the
same A component but different B components.

This patch implements the sorting and grouping side of the optimisation.
A later patch creates the anchor points.

The patch is a no-op on other targets.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* lra-spills.c (pseudo_reg_slot_compare): Sort slots by whether
they are variable or constant sized.
(assign_stack_slot_num_and_sort_pseudos): Don't reuse variable-sized
slots for constant-sized data.

Index: gcc/lra-spills.c
===
--- gcc/lra-spills.c2017-11-03 12:15:45.033032920 +
+++ gcc/lra-spills.c2017-11-03 12:22:34.003396358 +
@@ -174,9 +174,17 @@ regno_freq_compare (const void *v1p, con
 }
 
 /* Sort pseudos according to their slots, putting the slots in the order
-   that they should be allocated.  Slots with lower numbers have the highest
-   priority and should get the smallest displacement from the stack or
-   frame pointer (whichever is being used).
+   that they should be allocated.
+
+   First prefer to group slots with variable sizes together and slots
+   with constant sizes together, since that usually makes them easier
+   to address from a common anchor point.  E.g. loads of polynomial-sized
+   registers tend to take polynomial offsets while loads of constant-sized
+   registers tend to take constant (non-polynomial) offsets.
+
+   Next, slots with lower numbers have the highest priority and should
+   get the smallest displacement from the stack or frame pointer
+   (whichever is being used).
 
The first allocated slot is always closest to the frame pointer,
so prefer lower slot numbers when frame_pointer_needed.  If the stack
@@ -194,6 +202,10 @@ pseudo_reg_slot_compare (const void *v1p
 
   slot_num1 = pseudo_slots[regno1].slot_num;
   slot_num2 = pseudo_slots[regno2].slot_num;
+  diff = (int (slots[slot_num1].size.is_constant ())
+ - int (slots[slot_num2].size.is_constant ()));
+  if (diff != 0)
+return diff;
   if ((diff = slot_num1 - slot_num2) != 0)
 return (frame_pointer_needed
|| (!FRAME_GROWS_DOWNWARD) == STACK_GROWS_DOWNWARD ? diff : -diff);
@@ -356,8 +368,17 @@ assign_stack_slot_num_and_sort_pseudos (
j = slots_num;
   else
{
+ machine_mode mode
+   = wider_subreg_mode (PSEUDO_REGNO_MODE (regno),
+lra_reg_info[regno].biggest_mode);
  for (j = 0; j < slots_num; j++)
if (slots[j].hard_regno < 0
+   /* Although it's possible to share slots between modes
+  with constant and non-constant widths, we usually
+  get better spill code by keeping the constant and
+  non-constant areas separate.  */
+   && (GET_MODE_SIZE (mode).is_constant ()
+   == slots[j].size.is_constant ())
&& ! (lra_intersected_live_ranges_p
  (slots[j].live_ranges,
   lra_reg_info[regno].live_ranges)))


Improve canonicalisation of TARGET_MEM_REFs

2017-11-03 Thread Richard Sandiford
A general TARGET_MEM_REF is:

BASE + STEP * INDEX + INDEX2 + OFFSET

After classifying the address in this way, the code that builds
TARGET_MEM_REFs tries to simplify the address until it's valid
for the current target and for the mode of memory being addressed.
It does this in a fixed order:

(1) add SYMBOL to BASE
(2) add INDEX * STEP to the base, if STEP != 1
(3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
(4) add INDEX to BASE
(5) add OFFSET to BASE

So suppose we had an address:

&a + offset + index * 8   (e.g. "a[i + 1]" for a global "a")

on a target that only allows an index or an offset, not both.  Following
the steps above, we'd first create:

tmp = symbol
tmp2 = tmp + index * 8

Then if the given offset value was valid for the mode being addressed,
we'd create:

MEM[base:tmp2, offset:offset]

while if it was invalid we'd create:

tmp3 = tmp2 + offset
MEM[base:tmp3, offset:0]

The problem is that this could happen if ivopts had decided to use
a scaled index for an address that happens to have a constant base.
The old procedure failed to give an indexed TARGET_MEM_REF in that case,
and adding the offset last prevented later passes from being able to
fold the index back in.

The patch avoids this by skipping (2) if BASE + INDEX * STEP
is a legitimate address and if OFFSET is stopping the address
being valid.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-10-31  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree-ssa-address.c (keep_index_p): New function.
(create_mem_ref): Use it.  Only split out the INDEX * STEP
component if that is invalid even with the symbol and offset
removed.

Index: gcc/tree-ssa-address.c
===
--- gcc/tree-ssa-address.c  2017-11-03 12:15:44.097060121 +
+++ gcc/tree-ssa-address.c  2017-11-03 12:21:18.060359821 +
@@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
 true, GSI_SAME_STMT);
 }
 
+/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
+   address for type TYPE and if the offset is making it appear invalid.  */
+
+static bool
+keep_index_p (tree type, mem_address parts)
+{
+  if (!parts.base)
+return false;
+
+  gcc_assert (!parts.symbol);
+  parts.offset = NULL_TREE;
  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
+}
+
 /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
computations are emitted in front of GSI.  TYPE is the mode
of created memory reference. IV_CAND is the selected iv candidate in ADDR,
@@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs
  into:
index' = index << step;
[... + index' + ,,,].  */
-  if (parts.step && !integer_onep (parts.step))
+  bool scaled_p = (parts.step && !integer_onep (parts.step));
+  if (scaled_p && !keep_index_p (type, parts))
 {
   gcc_assert (parts.index);
   parts.index = force_gimple_operand_gsi (gsi,
@@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs
   mem_ref = create_mem_ref_raw (type, alias_ptr_type, , true);
   if (mem_ref)
return mem_ref;
+  scaled_p = false;
 }
 
   /* Add offset to invariant part by transforming address expression:
@@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
index' = index + offset;
[base + index']
  depending on which one is invariant.  */
-  if (parts.offset && !integer_zerop (parts.offset))
+  if (parts.offset
+  && !integer_zerop (parts.offset)
+  && (!var_in_base || !scaled_p))
 {
   tree old_base = unshare_expr (parts.base);
   tree old_index = unshare_expr (parts.index);
@@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs
   /* Transform [base + index + ...] into:
base' = base + index;
[base' + ...].  */
-  if (parts.index)
+  if (parts.index && !scaled_p)
 {
   tmp = parts.index;
   parts.index = NULL_TREE;


Re: [RFA][PATCH] Refactor duplicated code used by various dom walkers

2017-11-03 Thread Jeff Law

On 11/03/2017 09:01 AM, Jeff Law wrote:

On 11/03/2017 04:05 AM, Richard Biener wrote:

On Fri, Nov 3, 2017 at 4:49 AM, Jeff Law  wrote:




Several passes which perform dominator walks want to identify when a
block has a single incoming edge, ignoring loop backedges.

I'm aware of 4 implementations of this code.  3 of the 4 are identical
in function.  The 4th (tree-ssa-dom.c) has an additional twist that it
also ignores edges that are not marked as executable.

So I've taken the more general implementation from tree-ssa-dom.c and
conditionalized the handling of unexecutable edges on a flag and moved
the implementation into cfganal.c where it more naturally belongs.

Bootstrapped and regression tested on x86_64.  OK for the trunk?


Minor nits (sorry...)
No need to apologize.  I'm always appreciative of feedback as it 
consistently improves what ultimately lands in the tree.






Jeff

 * cfganal.c (single_incoming_edge_ignoring_loop_edges): New function
 extracted from tree-ssa-dom.c.
 * cfganal.h (single_incoming_edge_ignoring_loop_edges): Prototype.
 * tree-ssa-dom.c (single_incoming_edge_ignoring_loop_edges): Remove.
 (record_equivalences_from_incoming_edge): Add additional argument
 to single_incoming_edge_ignoring_loop_edges call.
 * tree-ssa-uncprop.c (single_incoming_edge_ignoring_loop_edges):
 Remove.
 (uncprop_dom_walker::before_dom_children): Add additional argument
 to single_incoming_edge_ignoring_loop_edges call.
 * tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children): Use
 single_incoming_edge_ignoring_loop_edges rather than open coding.
 * tree-vrp.c (evrp_dom_walker::before_dom_children): Similarly.





diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index c506067..14d94b2 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -1554,3 +1554,38 @@ single_pred_before_succ_order (void)
  #undef MARK_VISITED
  #undef VISITED_P
  }
+
+/* Ignoring loop backedges, if BB has precisely one incoming edge then
+   return that edge.  Otherwise return NULL.  */
+edge
+single_incoming_edge_ignoring_loop_edges (basic_block bb,
+ bool ignore_unreachable)


single_pred_edge_ignoring_loop_edges and ignore_not_executable

to better match existing CFG functions and actual edge flag use.

Ok with that change.

Sure.  Easy to change.
Final patch attached for archival purposes.  I took the liberty of also 
documenting the new IGNORE_NOT_EXECUTABLE argument.


Jeff
commit 1477a5a7a0a0a93cb8b1c79581134b8ccdca072b
Author: law 
Date:   Fri Nov 3 16:28:28 2017 +

* cfganal.c (single_pred_edge_ignoring_loop_edges): New function
extracted from tree-ssa-dom.c.
* cfganal.h (single_pred_edge_ignoring_loop_edges): Prototype.
* tree-ssa-dom.c (single_incoming_edge_ignoring_loop_edges): Remove.
(record_equivalences_from_incoming_edge): Add additional argument
to single_pred_edge_ignoring_loop_edges call.
* tree-ssa-uncprop.c (single_incoming_edge_ignoring_loop_edges): Remove.
(uncprop_dom_walker::before_dom_children): Add additional argument
to single_pred_edge_ignoring_loop_edges call.
* tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children): Use
single_pred_edge_ignoring_loop_edges rather than open coding.
* tree-vrp.c (evrp_dom_walker::before_dom_children): Similarly.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254383 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 83db8bf..2fc7db44f7b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2017-11-03  Jeff Law  
+
+   * cfganal.c (single_pred_edge_ignoring_loop_edges): New function
+   extracted from tree-ssa-dom.c.
+   * cfganal.h (single_pred_edge_ignoring_loop_edges): Prototype.
+   * tree-ssa-dom.c (single_incoming_edge_ignoring_loop_edges): Remove.
+   (record_equivalences_from_incoming_edge): Add additional argument
+   to single_pred_edge_ignoring_loop_edges call.
+   * tree-ssa-uncprop.c (single_incoming_edge_ignoring_loop_edges): Remove.
+   (uncprop_dom_walker::before_dom_children): Add additional argument
+   to single_pred_edge_ignoring_loop_edges call.
+   * tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children): Use
+   single_pred_edge_ignoring_loop_edges rather than open coding.
+   * tree-vrp.c (evrp_dom_walker::before_dom_children): Similarly.
+
 2017-11-03  Marc Glisse  
 
* match.pd (-(-A)): Rewrite.
diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index c506067fdcd..8bf8a53fa58 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -1554,3 +1554,42 @@ single_pred_before_succ_order (void)
 #undef MARK_VISITED
 #undef VISITED_P
 }
+
+/* Ignoring loop 

Improve ivopts handling of forced scales

2017-11-03 Thread Richard Sandiford
This patch improves the ivopts address cost calculcation for modes
in which an index must be scaled rather than unscaled.  Previously
we would only try the scaled form if the unscaled form was valid.

Many of the SVE tests rely on this when matching scaled indices.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree-ssa-loop-ivopts.c (get_address_cost): Try using a
scaled index even if the unscaled address was invalid.
Don't increase the complexity of using a scale in that case.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  2017-11-03 12:20:07.041206480 +
+++ gcc/tree-ssa-loop-ivopts.c  2017-11-03 12:20:07.193201997 +
@@ -4333,18 +4333,25 @@ get_address_cost (struct ivopts_data *da
   machine_mode addr_mode = TYPE_MODE (type);
   machine_mode mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
+  /* Only true if ratio != 1.  */
+  bool ok_with_ratio_p = false;
+  bool ok_without_ratio_p = false;
 
   if (!aff_combination_const_p (aff_inv))
 {
   parts.index = integer_one_node;
   /* Addressing mode "base + index".  */
-  if (valid_mem_ref_p (mem_mode, as, &parts))
+  ok_without_ratio_p = valid_mem_ref_p (mem_mode, as, &parts);
+  if (ratio != 1)
{
  parts.step = wide_int_to_tree (type, ratio);
  /* Addressing mode "base + index << scale".  */
- if (ratio != 1 && !valid_mem_ref_p (mem_mode, as, &parts))
+ ok_with_ratio_p = valid_mem_ref_p (mem_mode, as, &parts);
+ if (!ok_with_ratio_p)
parts.step = NULL_TREE;
-
+   }
+  if (ok_with_ratio_p || ok_without_ratio_p)
+   {
  if (maybe_nonzero (aff_inv->offset))
{
  parts.offset = wide_int_to_tree (sizetype, aff_inv->offset);
@@ -4444,7 +4451,9 @@ get_address_cost (struct ivopts_data *da
 
   if (parts.symbol != NULL_TREE)
 cost.complexity += 1;
-  if (parts.step != NULL_TREE && !integer_onep (parts.step))
+  /* Don't increase the complexity of adding a scaled index if it's
+ the only kind of index that the target allows.  */
+  if (parts.step != NULL_TREE && ok_without_ratio_p)
 cost.complexity += 1;
   if (parts.base != NULL_TREE && parts.index != NULL_TREE)
 cost.complexity += 1;


Improve vectorisation of COND_EXPR

2017-11-03 Thread Richard Sandiford
This patch allows us to recognise:

... = bool1 != bool2 ? x : y

as equivalent to:

bool tmp = bool1 ^ bool2;
... = tmp ? x : y

For the latter we were already able to find the natural number
of vector units for tmp based on the types that feed bool1 and
bool2, whereas with the former we would simply treat bool1 and
bool2 as vectorised 8-bit values, possibly requiring them to
be packed and unpacked from their natural width.

This is used by a later SVE patch.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): When
handling COND_EXPRs with boolean comparisons, try to find a better
basis for the mask type than the boolean itself.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2017-11-03 12:17:34.392744807 +
+++ gcc/tree-vect-patterns.c2017-11-03 12:17:36.313554835 +
@@ -3968,15 +3968,43 @@ vect_recog_mask_conversion_pattern (vec<
return NULL;
}
   else if (COMPARISON_CLASS_P (rhs1))
-   rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0));
+   {
+ /* Check whether we're comparing scalar booleans and (if so)
+whether a better mask type exists than the mask associated
+with boolean-sized elements.  This avoids unnecessary packs
+and unpacks if the booleans are set from comparisons of
+wider types.  E.g. in:
+
+  int x1, x2, x3, x4, y1, y2;
+  ...
+  bool b1 = (x1 == x2);
+  bool b2 = (x3 == x4);
+  ... = b1 == b2 ? y1 : y2;
+
+it is better for b1 and b2 to use the mask type associated
+with int elements rather than bool (byte) elements.  */
+ rhs1_type = search_type_for_mask (TREE_OPERAND (rhs1, 0), vinfo);
+ if (!rhs1_type)
+   rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0));
+   }
   else
return NULL;
 
   vectype2 = get_mask_type_for_scalar_type (rhs1_type);
 
-  if (!vectype1 || !vectype2
- || must_eq (TYPE_VECTOR_SUBPARTS (vectype1),
- TYPE_VECTOR_SUBPARTS (vectype2)))
+  if (!vectype1 || !vectype2)
+   return NULL;
+
+  /* Continue if a conversion is needed.  Also continue if we have
+a comparison whose vector type would normally be different from
+VECTYPE2 when considered in isolation.  In that case we'll
+replace the comparison with an SSA name (so that we can record
+its vector type) and behave as though the comparison was an SSA
+name from the outset.  */
+  if (must_eq (TYPE_VECTOR_SUBPARTS (vectype1),
+  TYPE_VECTOR_SUBPARTS (vectype2))
+ && (TREE_CODE (rhs1) == SSA_NAME
+ || rhs1_type == TREE_TYPE (TREE_OPERAND (rhs1, 0))))
return NULL;
 
   /* If rhs1 is a comparison we need to move it into a
@@ -3993,7 +4021,11 @@ vect_recog_mask_conversion_pattern (vec<
  append_pattern_def_seq (stmt_vinfo, pattern_stmt);
}
 
-  tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo);
+  if (may_ne (TYPE_VECTOR_SUBPARTS (vectype1),
+ TYPE_VECTOR_SUBPARTS (vectype2)))
+   tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo);
+  else
+   tmp = rhs1;
 
   lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
   pattern_stmt = gimple_build_assign (lhs, COND_EXPR, tmp,


[10/10] Add a vect_masked_store target selector

2017-11-03 Thread Richard Sandiford
This patch adds a target selector that says whether the target
supports IFN_MASK_STORE.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_masked_store): Document.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_masked_store):
New proc.
* gcc.dg/vect/vect-cselim-1.c (foo): Mention that the second loop
is vectorizable with masked stores.  Update scan-tree-dump-times
accordingly.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-11-03 16:06:56.51697 +
+++ gcc/doc/sourcebuild.texi2017-11-03 16:07:00.028331940 +
@@ -1403,6 +1403,9 @@ Target supports hardware vectors of @cod
 @item vect_long_long
 Target supports hardware vectors of @code{long long}.
 
+@item vect_masked_store
+Target supports vector masked stores.
+
 @item vect_aligned_arrays
 Target aligns arrays to vector alignment boundary.
 
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:56.519977825 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:07:00.029332326 
+
@@ -6433,6 +6433,12 @@ proc check_effective_target_vect_load_la
 return $et_vect_load_lanes
 }
 
+# Return 1 if the target supports vector masked stores.
+
+proc check_effective_target_vect_masked_store { } {
+return 0
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
Index: gcc/testsuite/gcc.dg/vect/vect-cselim-1.c
===
--- gcc/testsuite/gcc.dg/vect/vect-cselim-1.c   2015-06-02 23:53:38.0 
+0100
+++ gcc/testsuite/gcc.dg/vect/vect-cselim-1.c   2017-11-03 16:07:00.028331940 
+
@@ -38,7 +38,7 @@ foo ()
 }
 }
 
-  /* Not vectorizable.  */
+  /* Only vectorizable with masked stores.  */
   for (i = 0; i < N; i++)
 {
   c = in1[i].b;
@@ -82,4 +82,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { xfail { 
{ vect_no_align && { ! vect_hw_misalign } } || { ! vect_strided2 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
! vect_masked_store } xfail { { vect_no_align && { ! vect_hw_misalign } } || { 
! vect_strided2 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { 
vect_masked_store } } } } */


[9/10] Add a vect_align_stack_vars target selector

2017-11-03 Thread Richard Sandiford
This patch adds a target selector to say whether it's possible to
align a local variable to the target's preferred vector alignment.
This can be false for large vectors if the alignment is only
a preference and not a hard requirement (and thus if there is no
need to support a stack realignment mechanism).


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_align_stack_vars): Document.

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_align_stack_vars): New proc.
* gcc.dg/vect/vect-23.c: Only expect the array to be aligned if
vect_align_stack_vars.
* gcc.dg/vect/vect-24.c: Likewise.
* gcc.dg/vect/vect-25.c: Likewise.
* gcc.dg/vect/vect-26.c: Likewise.
* gcc.dg/vect/vect-32-big-array.c: Likewise.
* gcc.dg/vect/vect-32.c: Likewise.
* gcc.dg/vect/vect-40.c: Likewise.
* gcc.dg/vect/vect-42.c: Likewise.
* gcc.dg/vect/vect-46.c: Likewise.
* gcc.dg/vect/vect-48.c: Likewise.
* gcc.dg/vect/vect-52.c: Likewise.
* gcc.dg/vect/vect-54.c: Likewise.
* gcc.dg/vect/vect-62.c: Likewise.
* gcc.dg/vect/vect-67.c: Likewise.
* gcc.dg/vect/vect-75-big-array.c: Likewise.
* gcc.dg/vect/vect-75.c: Likewise.
* gcc.dg/vect/vect-77-alignchecks.c: Likewise.
* gcc.dg/vect/vect-78-alignchecks.c: Likewise.
* gcc.dg/vect/vect-89-big-array.c: Likewise.
* gcc.dg/vect/vect-89.c: Likewise.
* gcc.dg/vect/vect-96.c: Likewise.
* gcc.dg/vect/vect-multitypes-3.c: Likewise.
* gcc.dg/vect/vect-multitypes-6.c: Likewise.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-11-03 16:06:52.929591350 +
+++ gcc/doc/sourcebuild.texi2017-11-03 16:06:56.51697 +
@@ -1373,6 +1373,10 @@ Target supports Fortran @code{real} kind
 @subsubsection Vector-specific attributes
 
 @table @code
+@item vect_align_stack_vars
+The target's ABI allows stack variables to be aligned to the preferred
+vector alignment.
+
 @item vect_condition
 Target supports vector conditional operations.
 
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:52.930591737 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:56.519977825 
+
@@ -6350,6 +6350,12 @@ proc check_effective_target_vect_element
 return [check_effective_target_vect_variable_length]
 }
 
+# Return 1 if we can align stack data to the preferred vector alignment.
+
+proc check_effective_target_vect_align_stack_vars { } {
+return 1
+}
+
 # Return 1 if vector alignment (for types of size 32 bit or less) is 
reachable, 0 otherwise.
 
 proc check_effective_target_vector_alignment_reachable { } {
Index: gcc/testsuite/gcc.dg/vect/vect-23.c
===
--- gcc/testsuite/gcc.dg/vect/vect-23.c 2016-11-22 21:16:10.0 +
+++ gcc/testsuite/gcc.dg/vect/vect-23.c 2017-11-03 16:06:56.51697 +
@@ -125,4 +125,4 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" { xfail { ! vect_align_stack_vars } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-24.c
===
--- gcc/testsuite/gcc.dg/vect/vect-24.c 2017-02-23 19:54:09.0 +
+++ gcc/testsuite/gcc.dg/vect/vect-24.c 2017-11-03 16:06:56.51697 +
@@ -123,4 +123,4 @@ int main (void)
   return main1 ();
 }
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail { { 
! aarch64*-*-* } && { ! arm-*-* } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" { xfail { ! vect_align_stack_vars } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-25.c
===
--- gcc/testsuite/gcc.dg/vect/vect-25.c 2015-06-02 23:53:35.0 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-25.c 2017-11-03 16:06:56.51697 +
@@ -51,4 +51,4 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" { xfail { ! vect_align_stack_vars } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-26.c

[8/10] Add a vect_variable_length target selector

2017-11-03 Thread Richard Sandiford
This patch adds a target selector for variable-length vectors.
Initially it's always false, but the SVE patch provides a case
in which it's true.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_variable_length): Document.

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_variable_length): New proc.
* gcc.dg/vect/pr60482.c: XFAIL test for no epilog loop if
vect_variable_length.
* gcc.dg/vect/slp-reduc-6.c: XFAIL two-operation SLP if
vect_variable_length.
* gcc.dg/vect/vect-alias-check-5.c: XFAIL alias optimization if
vect_variable_length.
* gfortran.dg/vect/fast-math-mgrid-resid.f: XFAIL predictive
commoning optimization if vect_variable_length.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-11-03 16:06:26.237889385 +
+++ gcc/doc/sourcebuild.texi2017-11-03 16:06:52.929591350 +
@@ -1486,6 +1486,9 @@ Target prefers vectors to have an alignm
 alignment, but also allows unaligned vector accesses in some
 circumstances.
 
+@item vect_variable_length
+Target has variable-length vectors.
+
 @item vect_widen_sum_hi_to_si
 Target supports a vector widening summation of @code{short} operands
 into @code{int} results, or can promote (unpack) from @code{short}
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:26.241888136 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:52.930591737 
+
@@ -6714,6 +6714,12 @@ proc check_effective_target_vect_multipl
 return [expr { [llength [available_vector_sizes]] > 1 }]
 }
 
+# Return true if variable-length vectors are supported.
+
+proc check_effective_target_vect_variable_length { } {
+return [expr { [lindex [available_vector_sizes] 0] == 0 }]
+}
+
 # Return 1 if the target supports vectors of 64 bits.
 
 proc check_effective_target_vect64 { } {
Index: gcc/testsuite/gcc.dg/vect/pr60482.c
===
--- gcc/testsuite/gcc.dg/vect/pr60482.c 2015-06-02 23:53:38.0 +0100
+++ gcc/testsuite/gcc.dg/vect/pr60482.c 2017-11-03 16:06:52.929591350 +
@@ -16,4 +16,6 @@ foo (double *x, int n)
   return p;
 }
 
-/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" } } */
+/* Until fully-masked loops are supported, we always need an epilog
+   loop for variable-length vectors.  */
+/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" { xfail 
vect_variable_length } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-reduc-6.c
===
--- gcc/testsuite/gcc.dg/vect/slp-reduc-6.c 2015-06-02 23:53:35.0 
+0100
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-6.c 2017-11-03 16:06:52.929591350 
+
@@ -44,5 +44,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail { 
vect_no_int_add || { ! { vect_unpack || vect_strided2 } } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } 
} */
-/* { dg-final { scan-tree-dump-times "different interleaving chains in one 
node" 1 "vect" { target { ! vect_no_int_add } } } } */
+/* { dg-final { scan-tree-dump-times "different interleaving chains in one 
node" 1 "vect" { target { ! vect_no_int_add } xfail vect_variable_length } } } 
*/
 
Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
===
--- gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c  2017-08-04 
11:39:37.910284386 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c  2017-11-03 
16:06:52.929591350 +
@@ -15,5 +15,5 @@ f1 (struct s *a, struct s *b)
 }
 
 /* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1 "vect" } } 
*/
-/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 
to 0" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 
to 0" 1 "vect" { xfail vect_variable_length } } } */
 /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */
Index: gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
===
--- gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f  2017-10-04 
16:25:39.620051123 +0100
+++ gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f  2017-11-03 
16:06:52.929591350 +
@@ -42,5 +42,5 @@ C
 ! vectorized loop.  If vector factor is 2, the vectorized loop can
 ! be predictive commoned, we check if predictive commoning PHI node
 ! is created with vector(2) type.
-! { dg-final { scan-tree-dump 

[7/10] Add a vect_unaligned_possible target selector

2017-11-03 Thread Richard Sandiford
This patch adds a target selector that says whether we can ever
generate an "unaligned" accesses, where "unaligned" is relative
to the target's preferred vector alignment.  This is already true if:

   vect_no_align && { ! vect_hw_misalign }

i.e. if the target doesn't have any alignment mechanism and also
doesn't allow unaligned accesses.  It is also true (for the things
tested by gcc.dg/vect) if the target only wants things to be aligned
to an element; in that case every normal scalar access is "vector aligned".


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_unaligned_possible): Document.

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_unaligned_possible): New proc.
* gcc.dg/vect/slp-25.c: Extend XFAIL of peeling for alignment from
vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
* gcc.dg/vect/vect-multitypes-1.c: Likewise.
* gcc.dg/vect/vect-109.c: XFAIL vectorisation of an unaligned
access to ! vect_unaligned_possible.
* gcc.dg/vect/vect-33.c: Likewise.
* gcc.dg/vect/vect-42.c: Likewise.
* gcc.dg/vect/vect-56.c: Likewise.
* gcc.dg/vect/vect-60.c: Likewise.
* gcc.dg/vect/vect-96.c: Likewise.
* gcc.dg/vect/vect-peel-1.c: Likewise.
* gcc.dg/vect/vect-27.c: Extend XFAIL of unaligned vectorization from
vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
* gcc.dg/vect/vect-29.c: Likewise.
* gcc.dg/vect/vect-44.c: Likewise.
* gcc.dg/vect/vect-48.c: Likewise.
* gcc.dg/vect/vect-50.c: Likewise.
* gcc.dg/vect/vect-52.c: Likewise.
* gcc.dg/vect/vect-72.c: Likewise.
* gcc.dg/vect/vect-75-big-array.c: Likewise.
* gcc.dg/vect/vect-75.c: Likewise.
* gcc.dg/vect/vect-77-alignchecks.c: Likewise.
* gcc.dg/vect/vect-77-global.c: Likewise.
* gcc.dg/vect/vect-78-alignchecks.c: Likewise.
* gcc.dg/vect/vect-78-global.c: Likewise.
* gcc.dg/vect/vect-multitypes-3.c: Likewise.
* gcc.dg/vect/vect-multitypes-4.c: Likewise.
* gcc.dg/vect/vect-multitypes-6.c: Likewise.
* gcc.dg/vect/vect-peel-4.c: Likewise.
* gcc.dg/vect/vect-peel-3.c: Likewise, and also for peeling
for alignment.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-11-03 16:06:22.561036988 +
+++ gcc/doc/sourcebuild.texi2017-11-03 16:06:26.237889385 +
@@ -1481,6 +1481,11 @@ Like @code{vect_perm3_byte}, but for 16-
 @item vect_shift
 Target supports a hardware vector shift operation.
 
+@item vect_unaligned_possible
+Target prefers vectors to have an alignment greater than element
+alignment, but also allows unaligned vector accesses in some
+circumstances.
+
 @item vect_widen_sum_hi_to_si
 Target supports a vector widening summation of @code{short} operands
 into @code{int} results, or can promote (unpack) from @code{short}
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:22.564036053 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:26.241888136 
+
@@ -6399,6 +6399,15 @@ proc check_effective_target_vect_element
 return $et_vect_element_align($et_index)
 }
 
+# Return 1 if we expect to see unaligned accesses in at least some
+# vector dumps.
+
+proc check_effective_target_vect_unaligned_possible { } {
+return [expr { ![check_effective_target_vect_element_align_preferred]
+  && (![check_effective_target_vect_no_align]
+  || [check_effective_target_vect_hw_misalign]) }]
+}
+
 # Return 1 if the target supports vector LOAD_LANES operations, 0 otherwise.
 
 proc check_effective_target_vect_load_lanes { } {
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===
--- gcc/testsuite/gcc.dg/vect/slp-25.c  2015-06-02 23:53:38.0 +0100
+++ gcc/testsuite/gcc.dg/vect/slp-25.c  2017-11-03 16:06:26.237889385 +
@@ -57,4 +57,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || { ! 
vect_natural_alignment } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail { { ! vect_unaligned_possible } || { ! 
vect_natural_alignment } } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c

[6/10] Add a vect_element_align_preferred target selector

2017-11-03 Thread Richard Sandiford
This patch adds a target selector for targets whose
preferred_vector_alignment is the alignment of one element.  We'll never
peel in that case, and the step of a loop that operates on normal (as
opposed to packed) elements will always divide the preferred alignment.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_element_align_preferred): Document.

gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_element_align_preferred): New proc.
(check_effective_target_vect_peeling_profitable): Test it.
* gcc.dg/vect/no-section-anchors-vect-31.c: Don't expect peeling
if vect_element_align_preferred.
* gcc.dg/vect/no-section-anchors-vect-64.c: Likewise.
* gcc.dg/vect/pr65310.c: Likewise.
* gcc.dg/vect/vect-26.c: Likewise.
* gcc.dg/vect/vect-54.c: Likewise.
* gcc.dg/vect/vect-56.c: Likewise.
* gcc.dg/vect/vect-58.c: Likewise.
* gcc.dg/vect/vect-60.c: Likewise.
* gcc.dg/vect/vect-89-big-array.c: Likewise.
* gcc.dg/vect/vect-89.c: Likewise.
* gcc.dg/vect/vect-92.c: Likewise.
* gcc.dg/vect/vect-peel-1.c: Likewise.
* gcc.dg/vect/vect-outer-3a-big-array.c: Expect the step to
divide the alignment if vect_element_align_preferred.
* gcc.dg/vect/vect-outer-3a.c: Likewise.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-11-03 16:06:19.377029536 +
+++ gcc/doc/sourcebuild.texi2017-11-03 16:06:22.561036988 +
@@ -1383,6 +1383,10 @@ have different type from the value opera
 @item vect_double
 Target supports hardware vectors of @code{double}.
 
+@item vect_element_align_preferred
+The target's preferred vector alignment is the same as the element
+alignment.
+
 @item vect_float
 Target supports hardware vectors of @code{float}.
 
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:19.378029224 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:22.564036053 
+
@@ -3290,7 +3290,8 @@ proc check_effective_target_vect_peeling
 } else {
set et_vect_peeling_profitable_saved($et_index) 1
 if { ([istarget s390*-*-*]
- && [check_effective_target_s390_vx]) } {
+ && [check_effective_target_s390_vx])
+|| [check_effective_target_vect_element_align_preferred] } {
set et_vect_peeling_profitable_saved($et_index) 0
 }
 }
@@ -6342,6 +6343,13 @@ proc check_effective_target_vect_natural
 return $et_vect_natural_alignment
 }
 
+# Return 1 if the target doesn't prefer any alignment beyond element
+# alignment during vectorization.
+
+proc check_effective_target_vect_element_align_preferred { } {
+return [check_effective_target_vect_variable_length]
+}
+
 # Return 1 if vector alignment (for types of size 32 bit or less) is 
reachable, 0 otherwise.
 
 proc check_effective_target_vector_alignment_reachable { } {
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
===
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c  2017-11-03 
16:06:08.010089752 +
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c  2017-11-03 
16:06:22.561036988 +
@@ -94,4 +94,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail vect_element_align_preferred } } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
===
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c  2017-11-03 
16:06:08.010089752 +
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c  2017-11-03 
16:06:22.562036677 +
@@ -91,4 +91,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail vect_element_align_preferred } } } */
Index: gcc/testsuite/gcc.dg/vect/pr65310.c
===
--- gcc/testsuite/gcc.dg/vect/pr65310.c 2015-06-02 23:53:38.0 

[5/10] Add vect_perm3_* target selectors

2017-11-03 Thread Richard Sandiford
SLP load permutation fails if any individual permutation requires more
than two vector inputs.  For 128-bit vectors, it's possible to permute
3 contiguous loads of 32-bit and 8-bit elements, but not 16-bit elements
or 64-bit elements.  The results are reversed for 256-bit vectors,
and so on for wider vectors.

This patch adds a routine that tests whether a permute will require
three vectors for a given vector count and element size, then adds
vect_perm3_* target selectors for the cases that we currently use.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/sourcebuild.texi (vect_perm_short, vect_perm_byte): Document
previously undocumented selectors.
(vect_perm3_byte, vect_perm3_short, vect_perm3_int): Document.

gcc/testsuite/
* lib/target-supports.exp (vect_perm_supported): New proc.
(check_effective_target_vect_perm3_int): Likewise.
(check_effective_target_vect_perm3_short): Likewise.
(check_effective_target_vect_perm3_byte): Likewise.
* gcc.dg/vect/slp-perm-1.c: Expect SLP load permutation to
succeed if vect_perm3_int.
* gcc.dg/vect/slp-perm-5.c: Likewise.
* gcc.dg/vect/slp-perm-6.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-perm-8.c: Likewise vect_perm3_byte.
* gcc.dg/vect/slp-perm-9.c: Likewise vect_perm3_short.
Use vect_perm_short instead of vect_perm.  Add a scan-tree-dump-not
test for vect_perm3_short targets.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi2017-10-26 12:27:58.124235242 +0100
+++ gcc/doc/sourcebuild.texi2017-11-03 16:06:19.377029536 +
@@ -1448,6 +1448,32 @@ element types.
 @item vect_perm
 Target supports vector permutation.
 
+@item vect_perm_byte
+Target supports permutation of vectors with 8-bit elements.
+
+@item vect_perm_short
+Target supports permutation of vectors with 16-bit elements.
+
+@item vect_perm3_byte
+Target supports permutation of vectors with 8-bit elements, and for the
+default vector length it is possible to permute:
+@example
+@{ a0, a1, a2, b0, b1, b2, @dots{} @}
+@end example
+to:
+@example
+@{ a0, a0, a0, b0, b0, b0, @dots{} @}
+@{ a1, a1, a1, b1, b1, b1, @dots{} @}
+@{ a2, a2, a2, b2, b2, b2, @dots{} @}
+@end example
+using only two-vector permutes, regardless of how long the sequence is.
+
+@item vect_perm3_int
+Like @code{vect_perm3_byte}, but for 32-bit elements.
+
+@item vect_perm3_short
+Like @code{vect_perm3_byte}, but for 16-bit elements.
+
 @item vect_shift
 Target supports a hardware vector shift operation.
 
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:12.625838683 
+
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:19.378029224 
+
@@ -5547,6 +5547,78 @@ proc check_effective_target_vect_perm {
 return $et_vect_perm_saved($et_index)
 }
 
+# Return 1 if, for some VF:
+#
+# - the target's default vector size is VF * ELEMENT_BITS bits
+#
+# - it is possible to implement the equivalent of:
+#
+#  int_t s1[COUNT][COUNT * VF], s2[COUNT * VF];
+#  for (int i = 0; i < COUNT; ++i)
+#for (int j = 0; j < COUNT * VF; ++j)
+#  s1[i][j] = s2[j - j % COUNT + i]
+#
+#   using only a single 2-vector permute for each vector in s1.
+#
+# E.g. for COUNT == 3 and vector length 4, the two arrays would be:
+#
+#s2| a0 a1 a2 a3 | b0 b1 b2 b3 | c0 c1 c2 c3
+#--+-+-+
+#s1[0] | a0 a0 a0 a3 | a3 a3 b2 b2 | b2 c1 c1 c1
+#s1[1] | a1 a1 a1 b0 | b0 b0 b3 b3 | b3 c2 c2 c2
+#s1[2] | a2 a2 a2 b1 | b1 b1 c0 c0 | c0 c3 c3 c3
+#
+# Each s1 permute requires only two of a, b and c.
+#
+# The distance between the start of vector n in s1[0] and the start
+# of vector n in s2 is:
+#
+#A = (n * VF) % COUNT
+#
+# The corresponding value for the end of vector n is:
+#
+#B = (n * VF + VF - 1) % COUNT
+#
+# Subtracting i from each value gives the corresponding difference
+# for s1[i].  The condition being tested by this function is false
+# iff A - i > 0 and B - i < 0 for some i and n, such that the first
+# element for s1[i] comes from vector n - 1 of s2 and the last element
+# comes from vector n + 1 of s2.  The condition is therefore true iff
+# A <= B for all n.  This in turn means the condition is true iff:
+#
+#(n * VF) % COUNT + (VF - 1) % COUNT < COUNT
+#
+# for all n.  COUNT - (n * VF) % COUNT is bounded by gcd (VF, COUNT),
+# and will be that value for at least one n in [0, COUNT), so we want:
+#
+#(VF - 1) % COUNT < gcd (VF, COUNT)
+
+proc vect_perm_supported { count element_bits } {
+set vector_bits [lindex [available_vector_sizes] 0]
+if { 

Re: Adjust empty class parameter passing ABI (PR c++/60336)

2017-11-03 Thread Jason Merrill
On Fri, Nov 3, 2017 at 9:55 AM, Marek Polacek  wrote:
> +  TYPE_EMPTY_P (t) = targetm.calls.empty_record_p (t);

I think we want to set this in finalize_type_size; since the point of
all this is becoming compliant with the psABI (and compatible with the
C front end), I wouldn't think it should be specific to the C++ front
end.

> +  TYPE_WARN_EMPTY_P (t) = warn_abi && abi_version_crosses (12);

Can this flag go on the TRANSLATION_UNIT_DECL rather than the type?

> + if (TREE_CODE (field) == FIELD_DECL
> + && (DECL_NAME (field)
> + || RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
> + && !default_is_empty_type (TREE_TYPE (field)))
> +return false;
> +  return true;

Hmm, this assumes that any unnamed field can be ignored; I'm concerned
that some front end might clear DECL_NAME for a reason that doesn't
imply that the field is just padding.

Jason


[4/10] Don't assume vect_multiple_sizes means 2 sizes

2017-11-03 Thread Richard Sandiford
Some tests assumed that there would only be 2 vector sizes if
vect_multiple_sizes, whereas for SVE there are three (SVE, 128-bit
and 64-bit).  This patch replaces scan-tree-dump-times with
scan-tree-dump for vect_multiple_sizes but keeps it for
!vect_multiple_sizes.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* gcc.dg/vect/no-vfa-vect-101.c: Use scan-tree-dump rather than
scan-tree-dump-times for vect_multiple_sizes.
* gcc.dg/vect/no-vfa-vect-102.c: Likewise.
* gcc.dg/vect/no-vfa-vect-102a.c: Likewise.
* gcc.dg/vect/no-vfa-vect-37.c: Likewise.
* gcc.dg/vect/no-vfa-vect-79.c: Likewise.
* gcc.dg/vect/vect-104.c: Likewise.

Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c
===
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c 2015-06-02 23:53:38.0 
+0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c 2017-11-03 16:06:16.141037152 
+
@@ -46,5 +46,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { 
target { ! vect_multiple_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { 
target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target 
vect_multiple_sizes } } } */
 
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c
===
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c 2017-11-03 16:06:03.052282173 
+
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c 2017-11-03 16:06:16.141037152 
+
@@ -51,5 +51,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 
"vect" { target { ! vect_multiple_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 
"vect" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump "possible dependence between data-refs" "vect" 
{ target vect_multiple_sizes } } } */
 
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c
===
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c2017-11-03 
16:06:03.052282173 +
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c2017-11-03 
16:06:16.141037152 +
@@ -51,5 +51,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 
"vect" { target { ! vect_multiple_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 
"vect" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump "possible dependence between data-refs" "vect" 
{ target vect_multiple_sizes } } } */
 
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c
===
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c  2015-06-02 23:53:38.0 
+0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c  2017-11-03 16:06:16.141037152 
+
@@ -59,4 +59,4 @@ int main (void)
prevent vectorization on some targets.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target { ! vect_multiple_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 4 "vect" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target vect_multiple_sizes } } } */
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c
===
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c  2015-06-02 23:53:35.0 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c  2017-11-03 16:06:16.141037152 +
@@ -47,4 +47,4 @@ int main (void)
   prevent vectorization on some targets.  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target { ! vect_multiple_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target vect_multiple_sizes } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-104.c
===
--- gcc/testsuite/gcc.dg/vect/vect-104.c 2017-11-03 16:06:03.054282499 +
+++ gcc/testsuite/gcc.dg/vect/vect-104.c

[3/10] Add available_vector_sizes to target-supports.exp

2017-11-03 Thread Richard Sandiford
This patch adds a routine that lists the available vector sizes
for a target and uses it for some existing target conditions.
Later patches add more uses.

The cases are taken from multiple_sizes.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* lib/target-supports.exp (available_vector_sizes): New proc.
(check_effective_target_vect_multiple_sizes): Use it.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_sizes_32B_16B): Likewise.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   2017-11-03 12:16:58.605777011 +
+++ gcc/testsuite/lib/target-supports.exp   2017-11-03 16:06:12.625838683 +
@@ -6581,46 +6581,38 @@ foreach N {2 3 4 8} {
 }]
 }
 
-# Return 1 if the target supports multiple vector sizes
+# Return the list of vector sizes (in bits) that each target supports.
+# A vector length of "0" indicates variable-length vectors.
 
-proc check_effective_target_vect_multiple_sizes { } {
-global et_vect_multiple_sizes_saved
-global et_index
-
-set et_vect_multiple_sizes_saved($et_index) 0
-if { [istarget aarch64*-*-*]
-|| [is-effective-target arm_neon]
-|| (([istarget i?86-*-*] || [istarget x86_64-*-*])
-&& ([check_avx_available] && ![check_prefer_avx128])) } {
-   set et_vect_multiple_sizes_saved($et_index) 1
+proc available_vector_sizes { } {
+set result {}
+if { [istarget aarch64*-*-*] } {
+   lappend result 128 64
+} elseif { [istarget arm*-*-*]
+   && [check_effective_target_arm_neon_ok] } {
+   lappend result 128 64
+} elseif { (([istarget i?86-*-*] || [istarget x86_64-*-*])
+&& ([check_avx_available] && ![check_prefer_avx128])) } {
+   lappend result 256 128
+} elseif { [istarget sparc*-*-*] } {
+   lappend result 64
+} else {
+   # The traditional default assumption.
+   lappend result 128
 }
+return $result
+}
+
+# Return 1 if the target supports multiple vector sizes
 
-verbose "check_effective_target_vect_multiple_sizes:\
-returning $et_vect_multiple_sizes_saved($et_index)" 2
-return $et_vect_multiple_sizes_saved($et_index)
+proc check_effective_target_vect_multiple_sizes { } {
+return [expr { [llength [available_vector_sizes]] > 1 }]
 }
 
 # Return 1 if the target supports vectors of 64 bits.
 
 proc check_effective_target_vect64 { } {
-global et_vect64_saved
-global et_index
-
-if [info exists et_vect64_saved($et_index)] {
-verbose "check_effective_target_vect64: using cached result" 2
-} else {
-   set et_vect64_saved($et_index) 0
-if { ([is-effective-target arm_neon]
- && [check_effective_target_arm_little_endian])
-|| [istarget aarch64*-*-*]
- || [istarget sparc*-*-*] } {
-  set et_vect64_saved($et_index) 1
-}
-}
-
-verbose "check_effective_target_vect64:\
-returning $et_vect64_saved($et_index)" 2
-return $et_vect64_saved($et_index)
+return [expr { [lsearch -exact [available_vector_sizes] 64] >= 0 }]
 }
 
 # Return 1 if the target supports vector copysignf calls.
@@ -7747,11 +7739,7 @@ proc check_avx_available { } {
 # Return true if 32- and 16-bytes vectors are available.
 
 proc check_effective_target_vect_sizes_32B_16B { } {
-  if { [check_avx_available] && ![check_prefer_avx128] } {
- return 1;
-  } else {
-return 0;
-  }
+return [expr { [available_vector_sizes] == [list 256 128] }]
 }
 
 # Return true if 16- and 8-bytes vectors are available.


[2/10] Add VECTOR_BITS to tree-vect.h

2017-11-03 Thread Richard Sandiford
Several vector tests are sensitive to the vector size.  This patch adds
a VECTOR_BITS macro to tree-vect.h to select the expected vector size
and uses it to influence iteration counts and array sizes.  The tests
keep the original values if the vector size is small enough.

For now VECTOR_BITS is always 128, but the SVE patches add other values.


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* gcc.dg/vect/tree-vect.h (VECTOR_BITS): Define.
* gcc.dg/vect/bb-slp-pr69907.c: Include tree-vect.h.
(N): New macro.
(foo): Use it instead of hard-coded 320.
* gcc.dg/vect/no-scevccp-outer-7.c (N): Redefine if the default
value is too small for VECTOR_BITS.
* gcc.dg/vect/no-scevccp-vect-iv-3.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-31.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-36.c (N): Likewise.
* gcc.dg/vect/slp-perm-9.c (N): Likewise.
* gcc.dg/vect/vect-32.c (N): Likewise.
* gcc.dg/vect/vect-75.c (N, OFF): Likewise.
* gcc.dg/vect/vect-77-alignchecks.c (N, OFF): Likewise.
* gcc.dg/vect/vect-78-alignchecks.c (N, OFF): Likewise.
* gcc.dg/vect/vect-89.c (N): Likewise.
* gcc.dg/vect/vect-96.c (N): Likewise.
* gcc.dg/vect/vect-multitypes-3.c (N): Likewise.
* gcc.dg/vect/vect-multitypes-6.c (N): Likewise.
* gcc.dg/vect/vect-over-widen-1.c (N): Likewise.
* gcc.dg/vect/vect-over-widen-4.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-1a.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-1b.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-2a.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-64.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(ia, ib, ic): Use NINTS instead of hard-coded constants in the
array bounds.
* gcc.dg/vect/no-section-anchors-vect-69.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(test1): Replace a and b fields with NINTS - 2 ints of padding.
(main1): Use NINTS instead of hard-coded constants.
* gcc.dg/vect/section-anchors-vect-69.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(test1): Replace a and b fields with NINTS - 2 ints of padding.
(test2): Remove incorrect comments about alignment.
(main1): Use NINTS instead of hard-coded constants.
* gcc.dg/vect/pr45752.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main): Continue to use canned results for the default value of N,
but compute the expected results from scratch for other values.
* gcc.dg/vect/slp-perm-1.c (N, main): As for pr45752.c.
* gcc.dg/vect/slp-perm-4.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-5.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-6.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-7.c (N, main): Likewise.
* gcc.dg/vect/pr65518.c (NINTS, N, RESULT): New macros.
(giga): Use NINTS as the array bound.
(main): Use NINTS, N and RESULT.
* gcc.dg/vect/pr65947-5.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main): Fill in any remaining elements of A programmatically.
* gcc.dg/vect/pr81136.c: Include tree-vect.h.
(a): Use VECTOR_BITS to set the alignment of the target structure.
* gcc.dg/vect/slp-19c.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main1): Continue to use the canned input for the default value of N,
but compute the input from scratch for other values.
* gcc.dg/vect/slp-28.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(in1, in2, in3): Remove initialization.
(check1, check2): Delete.
(main1): Initialize in1, in2 and in3 here.  Check every element
of the vectors and compute the expected values directly instead
of using an array.
* gcc.dg/vect/slp-perm-8.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(foo, main): Change type of "i" to int.
* gcc.dg/vect/vect-103.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(c): Delete.
(main1): Use NINTS.  Check the result from a and b directly.
* gcc.dg/vect/vect-67.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(main1): Use NINTS for the inner array bounds.
* gcc.dg/vect/vect-70.c (NINTS, OUTERN): New macros.
(N): Redefine in terms of NINTS.
(s): Keep the outer dimensions as 4 even if N is larger than 24.
(tmp1): New variable.
(main1): Only define a local tmp1 if NINTS is relatively small.
Use OUTERN for the outer loops and NINTS for the inner loops.
* 

[PATCH] RISC-V: Emit "i" suffix for instructions with immediate operands

2017-11-03 Thread Palmer Dabbelt
From: Michael Clark 

This change makes GCC's asm output use instruction names that are
consistent with the RISC-V ISA manual.  The assembler accepts
immediate-operand instructions without the "i" suffix, so this all
worked before; it's just a bit cleaner to match the ISA manual more
closely.
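For illustration (the operands below are invented, not taken from the patch), the change only affects the printed mnemonic; both forms assemble to the same instruction:

```asm
# Before, GCC printed the register mnemonic even with an immediate operand:
add   a0, a1, 100     # accepted by the assembler, but not the ISA manual's name
# After, the %i2 format adds the "i" suffix when operand 2 is not a register:
addi  a0, a1, 100
```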

gcc/ChangeLog

2017-10-03  Michael Clark 

* config/riscv/riscv.c (riscv_print_operand): Add a 'i' format.
config/riscv/riscv.md (addsi3): Use 'i' for immediates.
(adddi3): Likewise.
(*addsi3_extended): Likewise.
(*addsi3_extended2): Likewise.
(si3): Likewise.
(di3): Likewise.
(3): Likewise.
(<*optabe>si3_internal): Likewise.
(zero_extendqi2): Likewise.
(*addhi3): Likewise.
(*xorhi3): Likewise.
(di3): Likewise.
(*si3_extend): Likewise.
(*sge_): Likewise.
(*slt_): Likewise.
(*sle_): Likewise.
---
 gcc/config/riscv/riscv.c  |  8 +++-
 gcc/config/riscv/riscv.md | 36 ++--
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index f0b05d7eaeda..d7e6bd0f205e 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -2733,7 +2733,8 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
'C' Print the integer branch condition for comparison OP.
'A' Print the atomic operation suffix for memory model OP.
'F' Print a FENCE if the memory model requires a release.
-   'z' Print x0 if OP is zero, otherwise print OP normally.  */
+   'z' Print x0 if OP is zero, otherwise print OP normally.
+   'i' Print i if the operand is not a register. */
 
 static void
 riscv_print_operand (FILE *file, rtx op, int letter)
@@ -2768,6 +2769,11 @@ riscv_print_operand (FILE *file, rtx op, int letter)
fputs ("fence iorw,ow; ", file);
   break;
 
+case 'i':
+  if (code != REG)
+fputs ("i", file);
+  break;
+
 default:
   switch (code)
{
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 9f056bbcda4f..53e1db97db7d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -414,7 +414,7 @@
(plus:SI (match_operand:SI 1 "register_operand" " r,r")
 (match_operand:SI 2 "arith_operand"" r,I")))]
   ""
-  { return TARGET_64BIT ? "addw\t%0,%1,%2" : "add\t%0,%1,%2"; }
+  { return TARGET_64BIT ? "add%i2w\t%0,%1,%2" : "add%i2\t%0,%1,%2"; }
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
@@ -423,7 +423,7 @@
(plus:DI (match_operand:DI 1 "register_operand" " r,r")
 (match_operand:DI 2 "arith_operand"" r,I")))]
   "TARGET_64BIT"
-  "add\t%0,%1,%2"
+  "add%i2\t%0,%1,%2"
   [(set_attr "type" "arith")
(set_attr "mode" "DI")])
 
@@ -433,7 +433,7 @@
 (plus:SI (match_operand:SI 1 "register_operand" " r,r")
  (match_operand:SI 2 "arith_operand"" r,I"]
   "TARGET_64BIT"
-  "addw\t%0,%1,%2"
+  "add%i2w\t%0,%1,%2"
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
@@ -444,7 +444,7 @@
  (match_operand:DI 2 "arith_operand"" r,I"))
 0)))]
   "TARGET_64BIT"
-  "addw\t%0,%1,%2"
+  "add%i2w\t%0,%1,%2"
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
@@ -705,7 +705,7 @@
(any_div:SI (match_operand:SI 1 "register_operand" " r")
(match_operand:SI 2 "register_operand" " r")))]
   "TARGET_DIV"
-  { return TARGET_64BIT ? "w\t%0,%1,%2" : "\t%0,%1,%2"; }
+  { return TARGET_64BIT ? "%i2w\t%0,%1,%2" : "%i2\t%0,%1,%2"; }
   [(set_attr "type" "idiv")
(set_attr "mode" "SI")])
 
@@ -714,7 +714,7 @@
(any_div:DI (match_operand:DI 1 "register_operand" " r")
(match_operand:DI 2 "register_operand" " r")))]
   "TARGET_DIV && TARGET_64BIT"
-  "\t%0,%1,%2"
+  "%i2\t%0,%1,%2"
   [(set_attr "type" "idiv")
(set_attr "mode" "DI")])
 
@@ -724,7 +724,7 @@
(any_div:SI (match_operand:SI 1 "register_operand" " r")
(match_operand:SI 2 "register_operand" " r"]
   "TARGET_DIV && TARGET_64BIT"
-  "w\t%0,%1,%2"
+  "%i2w\t%0,%1,%2"
   [(set_attr "type" "idiv")
(set_attr "mode" "DI")])
 
@@ -928,7 +928,7 @@
(any_bitwise:X (match_operand:X 1 "register_operand" "%r,r")
   (match_operand:X 2 "arith_operand"" r,I")))]
   ""
-  "\t%0,%1,%2"
+  "%i2\t%0,%1,%2"
   [(set_attr "type" "logical")
(set_attr "mode" "")])
 
@@ -937,7 +937,7 @@
(any_bitwise:SI (match_operand:SI 1 "register_operand" "%r,r")
(match_operand:SI 2 "arith_operand"" r,I")))]
   "TARGET_64BIT"
-  "\t%0,%1,%2"
+  "%i2\t%0,%1,%2"
   [(set_attr "type" "logical")
(set_attr "mode" "SI")])
 
@@ -1025,7 +1025,7 @@
(match_operand:QI 1 "nonimmediate_operand" " r,m")))]
   ""
   "@
-   and\t%0,%1,0xff
+ 

[PATCH] RISC-V: If -m[no-]strict-align is not passed, assume its value from -mtune

2017-11-03 Thread Palmer Dabbelt
From: Andrew Waterman 

2017-11-03  Andrew Waterman  

* config/riscv/riscv.c (riscv_option_override): Conditionally set
TARGET_STRICT_ALIGN based upon -mtune argument.
---
 gcc/config/riscv/riscv.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index b81a2d29fbfd..f0b05d7eaeda 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3772,9 +3772,13 @@ riscv_option_override (void)
 
   /* Use -mtune's setting for slow_unaligned_access, even when optimizing
  for size.  For architectures that trap and emulate unaligned accesses,
- the performance cost is too great, even for -Os.  */
-  riscv_slow_unaligned_access_p = (cpu->tune_info->slow_unaligned_access
-  || TARGET_STRICT_ALIGN);
+ the performance cost is too great, even for -Os.  Similarly, if
+ -m[no-]strict-align is left unspecified, heed -mtune's advice.  */
+  riscv_slow_unaligned_access = (cpu->tune_info->slow_unaligned_access
+|| TARGET_STRICT_ALIGN);
+  if ((target_flags_explicit & MASK_STRICT_ALIGN) == 0
+  && cpu->tune_info->slow_unaligned_access)
+target_flags |= MASK_STRICT_ALIGN;
 
   /* If the user hasn't specified a branch cost, use the processor's
  default.  */
-- 
2.13.6



[1/10] Consistently use asm volatile ("" ::: "memory") in vect tests

2017-11-03 Thread Richard Sandiford
The vectoriser tests used a combination of:

1) if (impossible condition) abort ();
2) volatile int x; ... *x = ...;
3) asm volatile ("" ::: "memory");

to prevent vectorisation of a set-up loop.  The problem with 1) is that
the compiler can often tell that the condition is false and optimise
it away before vectorisation.

This was already happening in slp-perm-9.c, which is why the test was
expecting one loop to be vectorised even when the required permutes
weren't supported.  It becomes a bigger problem with SVE, which is
able to vectorise more set-up loops.

The point of this patch is therefore to replace 1) with something else.
2) should work most of the time, but we don't usually treat non-volatile
accesses as aliasing unrelated volatile accesses, so I think in principle
we could split the loop into one that does the set-up and one that does
the volatile accesses.  3) seems more robust because it's also a wild
read and write.

The patch therefore tries to replace all instances of 1) and 2) with 3).


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/testsuite/
* gcc.dg/vect/bb-slp-cond-1.c (main): Add an asm volatile
to the set-up loop.
* gcc.dg/vect/slp-perm-7.c (main): Prevent vectorisation with
asm volatile ("" ::: "memory") instead of a conditional abort.
Update the expected vector loop count accordingly.
* gcc.dg/vect/slp-perm-9.c (main): Likewise.
* gcc.dg/vect/bb-slp-1.c (main1): Prevent vectorisation with
asm volatile ("" ::: "memory") instead of a conditional abort.
* gcc.dg/vect/slp-23.c (main): Likewise,
* gcc.dg/vect/slp-35.c (main): Likewise,
* gcc.dg/vect/slp-37.c (main): Likewise,
* gcc.dg/vect/slp-perm-4.c (main): Likewise.
* gcc.dg/vect/bb-slp-24.c (foo): Likewise.  Remove dummy argument.
(main): Update call accordingly.
* gcc.dg/vect/bb-slp-25.c (foo, main): As for bb-slp-24.c.
* gcc.dg/vect/bb-slp-26.c (foo, main): Likewise.
* gcc.dg/vect/bb-slp-29.c (foo, main): Likewise.
* gcc.dg/vect/no-vfa-vect-102.c (foo): Delete.
(main): Don't initialize it.
(main1): Prevent vectorisation with asm volatile ("" ::: "memory")
instead of a conditional abort.
* gcc.dg/vect/no-vfa-vect-102a.c (foo, main1, main): As for
no-vfa-vect-102.c
* gcc.dg/vect/vect-103.c (foo, main1, main): Likewise.
* gcc.dg/vect/vect-104.c (foo, main1, main): Likewise.
* gcc.dg/vect/pr42709.c (main1): Remove dummy argument.
Prevent vectorisation with asm volatile ("" ::: "memory")
instead of a conditional abort.
* gcc.dg/vect/slp-13-big-array.c (y): Delete.
(main1): Prevent vectorisation with asm volatile ("" ::: "memory")
instead of a conditional abort.
* gcc.dg/vect/slp-3-big-array.c (y, main1): As for slp-13-big-array.c.
* gcc.dg/vect/slp-34-big-array.c (y, main1): Likewise.
* gcc.dg/vect/slp-4-big-array.c (y, main1): Likewise.
* gcc.dg/vect/slp-multitypes-11-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-105.c (y, main1): Likewise.
* gcc.dg/vect/vect-105-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-112-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-15-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-2-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-34-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-6-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-73-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-74-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-75-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-76-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-80-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-97-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-all-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-reduc-1char-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-reduc-2char-big-array.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-mult.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u16-i2.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u16-i4.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u16-mult.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u8-i2-gap.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c (y, main1):
Likewise.
* gcc.dg/vect/vect-strided-a-u8-i8-gap2.c (y, main1): Likewise.
* gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c (y, main1):
Likewise.
* gcc.dg/vect/vect-strided-a-u8-i8-gap7.c (y, main1): Likewise.
* gcc.dg/vect/slp-24.c (y): Delete.
(main): Prevent 

[0/10] Vectoriser testsuite tweaks

2017-11-03 Thread Richard Sandiford
This series of patches generalises the vector testsuite and makes
it cope better with arbitrary vector lengths.  It also adds some
target selectors needed for SVE.

Tested on aarch64-linux-gnu without SVE, with various fixed-length
SVE modes, and with the default variable-length SVE mode.  Also tested
on x86_64-linux-gnu and powerpc64-linux-gnu.

Thanks,
Richard


[PATCH] RISC-V: Set SLOW_BYTE_ACCESS=1

2017-11-03 Thread Palmer Dabbelt
From: Andrew Waterman 

When implementing the RISC-V port, I took the name of this macro at
face value.  It appears we were mistaken in what this means; here's a
quote from the SPARC port that better describes what SLOW_BYTE_ACCESS
does:

/* Nonzero if access to memory by bytes is slow and undesirable.
   For RISC chips, it means that access to memory by bytes is no
   better than access by words when possible, so grab a whole word
   and maybe make use of that.  */

I've added the comment to our port as well.

See https://gcc.gnu.org/ml/gcc/2017-08/msg00202.html for more
discussion.  Thanks to Michael Clark and Andrew Pinski for the help!

gcc/ChangeLog

2017-10-03  Andrew Waterman  

* config/riscv/riscv.h (SLOW_BYTE_ACCESS): Change to 1.
---
 gcc/config/riscv/riscv.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index e53555efe82f..a802a3f8cbbb 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -615,7 +615,12 @@ typedef struct {
 #define MOVE_MAX UNITS_PER_WORD
 #define MAX_MOVE_MAX 8
 
-#define SLOW_BYTE_ACCESS 0
+/* The SPARC port says:
+   Nonzero if access to memory by bytes is slow and undesirable.
+   For RISC chips, it means that access to memory by bytes is no
+   better than access by words when possible, so grab a whole word
+   and maybe make use of that.  */
+#define SLOW_BYTE_ACCESS 1
 
 #define SHIFT_COUNT_TRUNCATED 1
 
-- 
2.13.6



[PATCH] Define std::endian for C++2a (P0463R1)

2017-11-03 Thread Jonathan Wakely

This is a tiny feature, small but perfectly formed. And easy to
implement.

* include/std/type_traits (endian): Define new enumeration type.
* testsuite/20_util/endian/1.cc: New test.

Tested powerpc64le-linux, committed to trunk.

commit ad69c5875cdefc5c699b50975ee6016a424d450b
Author: Jonathan Wakely 
Date:   Fri Nov 3 15:04:51 2017 +

Define std::endian for C++2a (P0463R1)

* include/std/type_traits (endian): Define new enumeration type.
* testsuite/20_util/endian/1.cc: New test.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 036f7667bd8..7eca08c1a50 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2664,7 +2664,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 void operator=(__nonesuch const&) = delete;
   };
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 # define __cpp_lib_is_invocable 201703
 
   /// std::invoke_result
@@ -2739,7 +2739,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   = is_nothrow_invocable_r<_Fn, _Args...>::value;
 #endif // C++17
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 # define __cpp_lib_type_trait_variable_templates 201510L
 template 
   inline constexpr bool is_void_v = is_void<_Tp>::value;
@@ -2943,6 +2943,16 @@ template 
 
 #endif // C++17
 
+#if __cplusplus > 201703L
+  /// Byte order
+  enum class endian
+  {
+little = __ORDER_LITTLE_ENDIAN__,
+big= __ORDER_BIG_ENDIAN__,
+native = __BYTE_ORDER__
+  };
+#endif // C++2a
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
diff --git a/libstdc++-v3/testsuite/20_util/endian/1.cc b/libstdc++-v3/testsuite/20_util/endian/1.cc
new file mode 100644
index 000..2720c5edcdb
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/endian/1.cc
@@ -0,0 +1,36 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include 
+
+static_assert( std::is_enum_v );
+static_assert( std::endian::little != std::endian::big );
+static_assert( std::endian::native == std::endian::big
+   || std::endian::native == std::endian::little );
+
+namespace gnu {
+  int little, big, native;
+}
+
+using namespace std;
+using namespace gnu;
+
+// std::endian is a scoped-enum so these should refer to gnu::native etc.
+int test = little + big + native;


[PATCH] rs6000: Remove rs6000_emit_sISEL

2017-11-03 Thread Segher Boessenkool
Instead of calling rs6000_emit_sISEL, call rs6000_emit_int_cmove
directly, in the one place it is used.

Tested as usual; committing to trunk.


Segher


2017-11-03  Segher Boessenkool  

* config/rs6000/rs6-protos.h (rs6000_emit_sISEL): Delete.
(rs6000_emit_int_cmove): New declaration.
* config/rs6000/rs6000.c (rs6000_emit_int_cmove): Delete declaration.
(rs6000_emit_sISEL): Delete.
(rs6000_emit_int_cmove): Make non-static.
* config/rs6000/rs6000.md (cstore4): Use rs6000_emit_int_cmove
instead of rs6000_emit_sISEL.

---
 gcc/config/rs6000/rs6000-protos.h |  2 +-
 gcc/config/rs6000/rs6000.c| 11 +--
 gcc/config/rs6000/rs6000.md   |  2 +-
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index db0e692..721b906 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -124,7 +124,6 @@ extern void print_operand_address (FILE *, rtx);
 extern enum rtx_code rs6000_reverse_condition (machine_mode,
   enum rtx_code);
 extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
-extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
 extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
 extern const char * output_probe_stack_range (rtx, rtx, rtx);
 extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
 extern bool rs6000_emit_set_const (rtx, rtx);
 extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
+extern int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
 extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
 extern void rs6000_split_signbit (rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 182dc30..0eabd5f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1357,7 +1357,6 @@ static void rs6000_common_init_builtins (void);
 static void paired_init_builtins (void);
 static rtx paired_expand_predicate_builtin (enum insn_code, tree, rtx);
 static void htm_init_builtins (void);
-static int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
 static rs6000_stack_t *rs6000_stack_info (void);
 static void is_altivec_return_reg (rtx, void *);
 int easy_vector_constant (rtx, machine_mode);
@@ -22478,14 +22477,6 @@ rs6000_expand_float128_convert (rtx dest, rtx src, bool unsigned_p)
 }
 
 
-/* Emit the RTL for an sISEL pattern.  */
-
-void
-rs6000_emit_sISEL (machine_mode mode ATTRIBUTE_UNUSED, rtx operands[])
-{
-  rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
-}
-
 /* Emit RTL that sets a register to zero if OP1 and OP2 are equal.  SCRATCH
can be used as that dest register.  Return the dest register.  */
 
@@ -23261,7 +23252,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 
 /* Same as above, but for ints (isel).  */
 
-static int
+int
 rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   rtx condition_rtx, cr;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2ef028f..ed5ff39 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11782,7 +11782,7 @@ (define_expand "cstore4"
 {
   /* Use ISEL if the user asked for it.  */
   if (TARGET_ISEL)
-rs6000_emit_sISEL (mode, operands);
+rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
 
   /* Expanding EQ and NE directly to some machine instructions does not help
  but does hurt combine.  So don't.  */
-- 
1.8.3.1



PR82816: Widening multiplies of bitfields

2017-11-03 Thread Richard Sandiford
In this PR we tried to create a widening multiply of two 3-bit numbers,
but that isn't a widening multiply at the optab/rtl level, since both
the input and output still have the same mode.

We could trap this either in is_widening_mult_p or (as the patch does)
in the routines that actually ask for an optab.  The latter seemed
more natural since is_widening_mult_p doesn't otherwise care about modes.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-11-03  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
PR tree-optimization/82816
* tree-ssa-math-opts.c (convert_mult_to_widen): Return false
if the modes of the two types are the same.
(convert_plusminus_to_widen): Likewise.

gcc/testsuite/
* gcc.c-torture/compile/pr82816.c: New test.

Index: gcc/tree-ssa-math-opts.c
===
--- gcc/tree-ssa-math-opts.c2017-11-01 12:29:40.203534002 +
+++ gcc/tree-ssa-math-opts.c2017-11-03 11:18:03.046411241 +
@@ -3259,6 +3259,9 @@ convert_mult_to_widen (gimple *stmt, gim
 
   to_mode = SCALAR_INT_TYPE_MODE (type);
   from_mode = SCALAR_INT_TYPE_MODE (type1);
+  if (to_mode == from_mode)
+return false;
+
   from_unsigned1 = TYPE_UNSIGNED (type1);
   from_unsigned2 = TYPE_UNSIGNED (type2);
 
@@ -3449,6 +3452,9 @@ convert_plusminus_to_widen (gimple_stmt_
 
   to_mode = SCALAR_TYPE_MODE (type);
   from_mode = SCALAR_TYPE_MODE (type1);
+  if (to_mode == from_mode)
+return false;
+
   from_unsigned1 = TYPE_UNSIGNED (type1);
   from_unsigned2 = TYPE_UNSIGNED (type2);
   optype = type1;
Index: gcc/testsuite/gcc.c-torture/compile/pr82816.c
===
--- /dev/null   2017-11-03 10:40:07.002381728 +
+++ gcc/testsuite/gcc.c-torture/compile/pr82816.c   2017-11-03 11:18:03.045411265 +
@@ -0,0 +1,12 @@
+struct A
+{
+  int b:3;
+} d, e;
+
+int c;
+
+void f ()
+{
+  char g = d.b * e.b;
+  c = g;
+}


Re: [PATCH] Fix test-suite fallout of default -Wreturn-type.

2017-11-03 Thread Jason Merrill
On Thu, Oct 26, 2017 at 8:14 AM, Martin Liška  wrote:
> On 10/24/2017 04:39 PM, Jason Merrill wrote:
>> On 10/18/2017 08:48 AM, Martin Liška wrote:
>>> This is second patch that addresses test-suite fallout. All these tests 
>>> fail because -Wreturn-type is
>>> now on by default.
>>
>>> +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag3.C
>>> -constexpr T g(T t) { return f(t); } // { dg-error "f.int" }
>>> +constexpr T g(T t) { return f(t); } // { dg-error "f.int" "" { target 
>>> c++14_only } }
>>
>>> +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-neg3.C
>>> -  constexpr int bar() { return a.foo(); } // { dg-error "foo" }
>>> +  constexpr int bar() { return a.foo(); } // { dg-error "foo" "" { target 
>>> c++14_only } }
>>
>> Why are these changes needed?  They aren't "Return a value for functions 
>> with non-void return type, or change type to void, or add -Wno-return-type 
>> for test."
>>
>> The rest of the patch is OK.
>>
>> Jason
>
> Hi.
>
> Sorry, I forgot to describe this change. With -std=c++11 we do:
>
> #0  massage_constexpr_body (fun=0x76955500, body=0x76813eb8) at 
> ../../gcc/cp/constexpr.c:708
> #1  0x0087700b in explain_invalid_constexpr_fn (fun=0x76955500) 
> at ../../gcc/cp/constexpr.c:896
> #2  0x008799dc in cxx_eval_call_expression (ctx=0x7fffd150, 
> t=0x76820118, lval=false, non_constant_p=0x7fffd1cf, 
> overflow_p=0x7fffd1ce) at ../../gcc/cp/constexpr.c:1558
> #3  0x008843fe in cxx_eval_constant_expression (ctx=0x7fffd150, 
> t=0x76820118, lval=false, non_constant_p=0x7fffd1cf, 
> overflow_p=0x7fffd1ce, jump_target=0x0) at ../../gcc/cp/constexpr.c:4069
>
> static tree
> massage_constexpr_body (tree fun, tree body)
> {
>   if (DECL_CONSTRUCTOR_P (fun))
> body = build_constexpr_constructor_member_initializers
>   (DECL_CONTEXT (fun), body);
>   else if (cxx_dialect < cxx14)
> {
>   if (TREE_CODE (body) == EH_SPEC_BLOCK)
> body = EH_SPEC_STMTS (body);
>   if (TREE_CODE (body) == MUST_NOT_THROW_EXPR)
> body = TREE_OPERAND (body, 0);
>   body = constexpr_fn_retval (body);
> }
>   return body;
> }
>
> and we end up with error_mark_node and thus potential_constant_expression_1 
> does bail out.
> That's why we don't print the later error with -std=c++11.
>
> What should we do with that?

Fix constexpr_fn_retval to ignore the call to __builtin_unreachable.

Jason


Re: [PATCH] Zero vptr in dtor for -fsanitize=vptr.

2017-11-03 Thread Jason Merrill
On Fri, Nov 3, 2017 at 10:25 AM, Martin Liška  wrote:
> On 10/27/2017 09:44 PM, Nathan Sidwell wrote:
>> On 10/27/2017 02:34 PM, Jakub Jelinek wrote:
>>
>>> But when singly inheriting a polymorphic base and thus mapped to the same
>>> vptr all but the last dtor will not be in charge, right?
>>
>> Correct.
>>
>>> So, if using build_clobber_this for this, instead of clobbering what we
>>> clobber we'd just clear the single vptr (couldn't clobber the rest, even
>>> if before the store, because that would make the earlier other vptr stores
>>> dead).
>>
>> ok (I'd not looked at the patch to see if in-chargeness was significant)
>>
>> nathan
>>
>
> Hello.
>
> I'm sending v2, which only zeros the vptr of the object.
>
> Ready to be installed after finishing tests?

Surely we also want to check TYPE_CONTAINS_VPTR_P.

Jason


Re: [PATCH][AArch64] Set default sched pressure algorithm

2017-11-03 Thread James Greenhalgh
On Thu, Nov 02, 2017 at 06:41:58PM +, Wilco Dijkstra wrote:
> The Arm backend sets the default sched-pressure algorithm to
> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this 
> speeds up floating-point performance on SPEC - e.g. CactusBSSN improves
> by ~16%.  The gains are mostly due to less spilling, so enable this on AArch64
> by default.
> 
> OK for commit?

OK.

Reviewed-By: James Greenhalgh 

Thanks,
James

> 
> 2017-11-02  Wilco Dijkstra  
> 
>   * config/aarch64/aarch64.c (aarch64_override_options_internal):
>   Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
> 
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 34456e96497ac7b6d2f9931187ff05619e1934a4..750b0bc29c0963742d5d7bb4ae4619d93bec3e4a
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9276,6 +9276,11 @@ aarch64_override_options_internal (struct gcc_options 
> *opts)
>  opts->x_param_values,
>  global_options_set.x_param_values);
>  
> +  /* Use the alternative scheduling-pressure algorithm by default.  */
> +  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, 
> SCHED_PRESSURE_MODEL,
> +  opts->x_param_values,
> +  global_options_set.x_param_values);
> +
>/* Enable sw prefetching at specified optimization level for
>   CPUS that have prefetch.  Lower optimization level threshold by 1
>   when profiling is enabled.  */


Re: [RFA][PATCH] Refactor duplicated code used by various dom walkers

2017-11-03 Thread Jeff Law

On 11/03/2017 04:05 AM, Richard Biener wrote:

On Fri, Nov 3, 2017 at 4:49 AM, Jeff Law  wrote:




Several passes which perform dominator walks want to identify when a block has
a single incoming edge, ignoring loop backedges.

I'm aware of 4 implementations of this code.  3 of the 4 are identical in
function.  The 4th (tree-ssa-dom.c) has an additional twist that it also
ignores edges that are not marked as executable.

So I've taken the more general implementation from tree-ssa-dom.c and
conditionalized the handling of unexecutable edges on a flag and moved the
implementation into cfganal.c where it more naturally belongs.

Bootstrapped and regression tested on x86_64.  OK for the trunk?


Minor nits (sorry...)
No need to apologize.  I'm always appreciative of feedback as it 
consistently improves what ultimately lands in the tree.






Jeff

 * cfganal.c (single_incoming_edge_ignoring_loop_edges): New function
 extracted from tree-ssa-dom.c.
 * cfganal.h (single_incoming_edge_ignoring_loop_edges): Prototype.
 * tree-ssa-dom.c (single_incoming_edge_ignoring_loop_edges): Remove.
 (record_equivalences_from_incoming_edge): Add additional argument
 to single_incoming_edge_ignoring_loop_edges call.
 * tree-ssa-uncprop.c (single_incoming_edge_ignoring_loop_edges):
Remove.
 (uncprop_dom_walker::before_dom_children): Add additional argument
 to single_incoming_edge_ignoring_loop_edges call.
 * tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children): Use
 single_incoming_edge_ignoring_loop_edges rather than open coding.
 * tree-vrp.c (evrp_dom_walker::before_dom_children): Similarly.





diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index c506067..14d94b2 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -1554,3 +1554,38 @@ single_pred_before_succ_order (void)
  #undef MARK_VISITED
  #undef VISITED_P
  }
+
+/* Ignoring loop backedges, if BB has precisely one incoming edge then
+   return that edge.  Otherwise return NULL.  */
+edge
+single_incoming_edge_ignoring_loop_edges (basic_block bb,
+ bool ignore_unreachable)


single_pred_edge_ignoring_loop_edges and ignore_not_executable

to better match existing CFG functions and actual edge flag use.

Ok with that change.

Sure.  Easy to change.

Jeff


Re: [PATCH] RISC-V: Handle non-legitimate address in riscv_legitimize_move

2017-11-03 Thread Palmer Dabbelt
Committed.

On Thu, 02 Nov 2017 09:03:19 PDT (-0700), Palmer Dabbelt wrote:
> From: Kito Cheng 
>
> GCC may generate a non-legitimate address because we allow some
> load/store patterns with non-legitimate addresses in pic.md.
>
> gcc/ChangeLog
>
> 2017-11-02  Kito Cheng  
>
> * config/riscv/riscv.c (riscv_legitimize_move): Handle
> non-legitimate address.
> ---
>  gcc/config/riscv/riscv.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index c34468e018d6..b81a2d29fbfd 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -1332,6 +1332,22 @@ riscv_legitimize_move (machine_mode mode, rtx dest, 
> rtx src)
>return true;
>  }
>
> +  /* RISC-V GCC may generate a non-legitimate address because we provide
> + patterns to optimize access to PIC local symbols, which can make GCC
> + generate unrecognizable instructions during optimization.  */
> +
> +  if (MEM_P (dest) && !riscv_legitimate_address_p (mode, XEXP (dest, 0),
> +reload_completed))
> +{
> +  XEXP (dest, 0) = riscv_force_address (XEXP (dest, 0), mode);
> +}
> +
> +  if (MEM_P (src) && !riscv_legitimate_address_p (mode, XEXP (src, 0),
> +   reload_completed))
> +{
> +  XEXP (src, 0) = riscv_force_address (XEXP (src, 0), mode);
> +}
> +
>return false;
>  }


[PATCH][Arm] Cleanup IT attributes

2017-11-03 Thread Wilco Dijkstra
A recent change to remove the movdi_vfp_cortexa8 meant that ldrd was used in
IT blocks even when arm_restrict_it was enabled.  Rather than just fixing this
latent issue, change the default of predicable_short_it to "no" so that only
16-bit instructions need to be marked with it.  As a result there are far fewer
patterns that need the attribute, and omitting predicable_short_it is no longer
causing issues.

This fixes 4 tests that failed after r254233, OK for commit?

ChangeLog:
2017-11-03  Wilco Dijkstra  

* config/arm/arm.md (predicable_short_it): Change default to "no",
improve documentation, remove uses that are identical to the default.
(enabled_for_depr_it): Rename to enabled_for_short_it.
* config/arm/arm-fixed.md (predicable_short_it): Remove default uses.
* config/arm/ldmstm.md (predicable_short_it): Likewise. 
* config/arm/sync.md (predicable_short_it): Likewise.
* config/arm/thumb2.md (predicable_short_it): Likewise.
* config/arm/vfp.md (predicable_short_it): Likewise.

--
diff --git a/gcc/config/arm/arm-fixed.md b/gcc/config/arm/arm-fixed.md
index 
ca721437792c7e3ad4fdc5ab5701aa79f01932cb..6730a2bbad6b107c669cb003cfdb651243740553
 100644
--- a/gcc/config/arm/arm-fixed.md
+++ b/gcc/config/arm/arm-fixed.md
@@ -35,7 +35,6 @@
   "TARGET_INT_SIMD"
   "sadd%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 (define_insn "usadd3"
@@ -45,7 +44,6 @@
   "TARGET_INT_SIMD"
   "uqadd%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 (define_insn "ssadd3"
@@ -55,7 +53,6 @@
   "TARGET_INT_SIMD"
   "qadd%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 (define_insn "sub3"
@@ -75,7 +72,6 @@
   "TARGET_INT_SIMD"
   "ssub%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 (define_insn "ussub3"
@@ -86,7 +82,6 @@
   "TARGET_INT_SIMD"
   "uqsub%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 (define_insn "sssub3"
@@ -96,7 +91,6 @@
   "TARGET_INT_SIMD"
   "qsub%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_dsp_reg")])
 
 ;; Fractional multiplies.
@@ -414,7 +408,6 @@
   "TARGET_32BIT && arm_arch6"
   "ssat%?\\t%0, #16, %2%S1"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "shift" "1")
(set_attr "type" "alu_shift_imm")])
 
@@ -424,6 +417,5 @@
   "TARGET_INT_SIMD"
   "usat%?\\t%0, #16, %1"
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set_attr "type" "alu_imm")]
 )
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
22a264b53dcb8dffe62da77ebd7420e1484de42d..e2d528442b49f816e854acb3945413b4e5fcc3a8
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -81,14 +81,17 @@
   (const (if_then_else (symbol_ref "TARGET_THUMB1")
   (const_string "yes") (const_string "no"
 
-; We use this attribute to disable alternatives that can produce 32-bit
-; instructions inside an IT-block in Thumb2 state.  ARMv8 deprecates IT blocks
-; that contain 32-bit instructions.
-(define_attr "enabled_for_depr_it" "no,yes" (const_string "yes"))
-
-; This attribute is used to disable a predicated alternative when we have
-; arm_restrict_it.
-(define_attr "predicable_short_it" "no,yes" (const_string "yes"))
+; Mark an instruction as suitable for "short IT" blocks in Thumb-2.
+; The arm_restrict_it flag enables the "short IT" feature which
+; restricts IT blocks to a single 16-bit instruction.
+; This attribute should only be used on 16-bit Thumb-2 instructions
+; which may be predicated (the "predicable" attribute must be set).
+(define_attr "predicable_short_it" "no,yes" (const_string "no"))
+
+; Mark an instruction as suitable for "short IT" blocks in Thumb-2.
+; This attribute should only be used on instructions which may emit
+; an IT block in their expansion which is not a short IT.
+(define_attr "enabled_for_short_it" "no,yes" (const_string "yes"))
 
 ;; Operand number of an input operand that is shifted.  Zero if the
 ;; given instruction does not shift one of its input operands.
@@ -229,7 +232,7 @@
(match_test "arm_restrict_it")))
  (const_string "no")
 
- (and (eq_attr "enabled_for_depr_it" "no")
+ (and (eq_attr "enabled_for_short_it" "no")
   (match_test "arm_restrict_it"))
  (const_string "no")
 
@@ -1031,7 +1034,6 @@
   "adc%?\\t%0, %1, %3%S2"
   [(set_attr "conds" "use")
(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
(set (attr "type") (if_then_else 

Re: [C++ Patch] PR 80955 (Macros expanded in definition of user-defined literals)

2017-11-03 Thread Paolo Carlini

Hi,

On 02/11/2017 15:42, Jason Merrill wrote:



This is a good suggestion. I have attached the revised patch. Thanks.

OK, thanks!

Thanks Jason.

I was about to volunteer to commit the patch but then noticed that the 
testcase includes quite a lot, e.g.  too, which we never 
include anywhere in the C++ testsuite. Can we have something simpler? Also, 
we don't need to include the whole  and  for a couple 
of declarations; we can simply provide the declarations of sprintf and 
strcmp by hand at the beginning of the file (plenty of examples in 
the testsuite). Mukesh, can you please work on that? Also, please watch 
out for trailing blank lines.


Thanks,
Paolo.


Re: [PATCH] Zero vptr in dtor for -fsanitize=vptr.

2017-11-03 Thread Marek Polacek
On Fri, Nov 03, 2017 at 03:25:25PM +0100, Martin Liška wrote:
> On 10/27/2017 09:44 PM, Nathan Sidwell wrote:
> > On 10/27/2017 02:34 PM, Jakub Jelinek wrote:
> > 
> >> But when singly inheriting a polymorphic base and thus mapped to the same
> >> vptr all but the last dtor will not be in charge, right?
> > 
> > Correct.
> > 
> >> So, if using build_clobber_this for this, instead of clobbering what we
> >> clobber we'd just clear the single vptr (couldn't clobber the rest, even
> >> if before the store, because that would make the earlier other vptr stores
> >> dead).
> > 
> > ok (I'd not looked at the patch to see if in-chargeness was significant)
> > 
> > nathan
> > 
> 
> Hello.
> 
> I'm sending v2, which only zeros the vptr of the object.
> 
> Ready to be installed after finishing tests?
> Martin

> From 098932be5472656c834b402038accb0b861afcc1 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Thu, 19 Oct 2017 11:10:19 +0200
> Subject: [PATCH] Zero vptr in dtor for -fsanitize=vptr.
> 
> gcc/cp/ChangeLog:
> 
> 2017-11-03  Martin Liska  
> 
>   * decl.c (begin_destructor_body): In case of VPTR sanitization
>   (with disabled recovery), zero vptr in order to catch virtual calls
>   after lifetime of an object.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-10-27  Martin Liska  
> 
>   * g++.dg/ubsan/vptr-12.C: New test.
> ---
>  gcc/cp/decl.c                        | 20 +++++++++++++++++++-
>  gcc/testsuite/g++.dg/ubsan/vptr-12.C | 22 ++++++++++++++++++++++
>  2 files changed, 41 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/ubsan/vptr-12.C
> 
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index d88c78f348b..d45cc29e636 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -15241,7 +15241,25 @@ begin_destructor_body (void)
>if (flag_lifetime_dse
> /* Clobbering an empty base is harmful if it overlays real data.  */
> && !is_empty_class (current_class_type))
> - finish_decl_cleanup (NULL_TREE, build_clobber_this ());
> +  {
> +   if (sanitize_flags_p (SANITIZE_VPTR)
> +   && (flag_sanitize_recover & SANITIZE_VPTR) == 0)
> + {
> +   tree binfo = TYPE_BINFO (current_class_type);
> +   tree ref
> + = cp_build_indirect_ref (current_class_ptr, RO_NULL,
> +  tf_warning_or_error);
> +
> +   tree vtbl_ptr = build_vfield_ref (ref, TREE_TYPE (binfo));
> +   tree vtbl = build_zero_cst (TREE_TYPE (vtbl_ptr));
> +   tree stmt = cp_build_modify_expr (input_location, vtbl_ptr,
> + NOP_EXPR, vtbl,
> + tf_warning_or_error);
> +   finish_decl_cleanup (NULL_TREE, stmt);
> + }
> +   else
> + finish_decl_cleanup (NULL_TREE, build_clobber_this ());
> +  }
>  
>/* And insert cleanups for our bases and members so that they
>will be properly destroyed if we throw.  */
> diff --git a/gcc/testsuite/g++.dg/ubsan/vptr-12.C 
> b/gcc/testsuite/g++.dg/ubsan/vptr-12.C
> new file mode 100644
> index 000..be5c074dfc1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ubsan/vptr-12.C
> @@ -0,0 +1,22 @@
> +// { dg-do run }
> +// { dg-shouldfail "ubsan" }
> +// { dg-options "-fsanitize=vptr -fno-sanitize-recover=vptr" }
> +
> +struct MyClass
> +{
> +  virtual ~MyClass () {}
> +  virtual void Doit () {}
> +};
> +
> +int
> +main ()
> +{
> +  MyClass *c = new MyClass;
> +  c->~MyClass ();
> +  c->Doit ();
> +
> +  return 0;
> +}
> +
> +// { dg-output "\[^\n\r]*vptr-12.C:16:\[0-9]*: runtime error: member call on 
> address 0x\[0-9a-fA-F]* which does not point to an object of type 
> 'MyClass'(\n|\r\n|\r)" }
> +// { dg-output "0x\[0-9a-fA-F]*: note: object has invalid vptr(\n|\r\n|\r)" }

I think the last dg-output shouldn't have any regexps at the end, so:

// { dg-output "0x\[0-9a-fA-F]*: note: object has invalid vptr" }

Marek


Re: [PATCH] Zero vptr in dtor for -fsanitize=vptr.

2017-11-03 Thread Martin Liška
On 10/27/2017 09:44 PM, Nathan Sidwell wrote:
> On 10/27/2017 02:34 PM, Jakub Jelinek wrote:
> 
>> But when singly inheriting a polymorphic base and thus mapped to the same
>> vptr all but the last dtor will not be in charge, right?
> 
> Correct.
> 
>> So, if using build_clobber_this for this, instead of clobbering what we
>> clobber we'd just clear the single vptr (couldn't clobber the rest, even
>> if before the store, because that would make the earlier other vptr stores
>> dead).
> 
> > ok (I'd not looked at the patch to see if in-chargeness was significant)
> 
> nathan
> 

Hello.

I'm sending v2, which only zeros the vptr of the object.

Ready to be installed after finishing tests?
Martin
From 098932be5472656c834b402038accb0b861afcc1 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 19 Oct 2017 11:10:19 +0200
Subject: [PATCH] Zero vptr in dtor for -fsanitize=vptr.

gcc/cp/ChangeLog:

2017-11-03  Martin Liska  

	* decl.c (begin_destructor_body): In case of VPTR sanitization
	(with disabled recovery), zero vptr in order to catch virtual calls
	after lifetime of an object.

gcc/testsuite/ChangeLog:

2017-10-27  Martin Liska  

	* g++.dg/ubsan/vptr-12.C: New test.
---
 gcc/cp/decl.c                        | 20 +++++++++++++++++++-
 gcc/testsuite/g++.dg/ubsan/vptr-12.C | 22 ++++++++++++++++++++++
 2 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/vptr-12.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d88c78f348b..d45cc29e636 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -15241,7 +15241,25 @@ begin_destructor_body (void)
   if (flag_lifetime_dse
 	  /* Clobbering an empty base is harmful if it overlays real data.  */
 	  && !is_empty_class (current_class_type))
-	finish_decl_cleanup (NULL_TREE, build_clobber_this ());
+  {
+	  if (sanitize_flags_p (SANITIZE_VPTR)
+	  && (flag_sanitize_recover & SANITIZE_VPTR) == 0)
+	{
+	  tree binfo = TYPE_BINFO (current_class_type);
+	  tree ref
+		= cp_build_indirect_ref (current_class_ptr, RO_NULL,
+	 tf_warning_or_error);
+
+	  tree vtbl_ptr = build_vfield_ref (ref, TREE_TYPE (binfo));
+	  tree vtbl = build_zero_cst (TREE_TYPE (vtbl_ptr));
+	  tree stmt = cp_build_modify_expr (input_location, vtbl_ptr,
+		NOP_EXPR, vtbl,
+		tf_warning_or_error);
+	  finish_decl_cleanup (NULL_TREE, stmt);
+	}
+	  else
+	finish_decl_cleanup (NULL_TREE, build_clobber_this ());
+  }
 
   /* And insert cleanups for our bases and members so that they
 	 will be properly destroyed if we throw.  */
diff --git a/gcc/testsuite/g++.dg/ubsan/vptr-12.C b/gcc/testsuite/g++.dg/ubsan/vptr-12.C
new file mode 100644
index 000..be5c074dfc1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/vptr-12.C
@@ -0,0 +1,22 @@
+// { dg-do run }
+// { dg-shouldfail "ubsan" }
+// { dg-options "-fsanitize=vptr -fno-sanitize-recover=vptr" }
+
+struct MyClass
+{
+  virtual ~MyClass () {}
+  virtual void Doit () {}
+};
+
+int
+main ()
+{
+  MyClass *c = new MyClass;
+  c->~MyClass ();
+  c->Doit ();
+
+  return 0;
+}
+
+// { dg-output "\[^\n\r]*vptr-12.C:16:\[0-9]*: runtime error: member call on address 0x\[0-9a-fA-F]* which does not point to an object of type 'MyClass'(\n|\r\n|\r)" }
+// { dg-output "0x\[0-9a-fA-F]*: note: object has invalid vptr(\n|\r\n|\r)" }
-- 
2.14.3



[PATCH] Fix testsuite error message

2017-11-03 Thread Nathan Sidwell
Someone noticed I'd not updated the error message when cloning 
scan-tree-dump into scan-lang-dump.


Fixed thusly and applied

nathan
--
Nathan Sidwell
2017-11-03  Nathan Sidwell  

	* lib/scanlang.exp: Fix error message to refer to scan-lang-dump.

Index: lib/scanlang.exp
===================================================================
--- lib/scanlang.exp	(revision 254349)
+++ lib/scanlang.exp	(working copy)
@@ -28,11 +28,11 @@ load_lib scandump.exp
 proc scan-lang-dump { args } {
 
 if { [llength $args] < 2 } {
-	error "scan-tree-dump: too few arguments"
+	error "scan-lang-dump: too few arguments"
 	return
 }
 if { [llength $args] > 3 } {
-	error "scan-tree-dump: too many arguments"
+	error "scan-lang-dump: too many arguments"
 	return
 }
 if { [llength $args] >= 3 } {


Re: [PATCH] Improve store merging to handle load+store or bitwise logicals (PR tree-optimization/78821, take 2)

2017-11-03 Thread Richard Biener
On Fri, 3 Nov 2017, Jakub Jelinek wrote:

> On Fri, Nov 03, 2017 at 02:14:39PM +0100, Richard Biener wrote:
> > > +/* Return true if stmts in between FIRST (inclusive) and LAST (exclusive)
> > > +   may clobber REF.  FIRST and LAST must be in the same basic block and
> > > +   have non-NULL vdef.  */
> > > +
> > > +bool
> > > +stmts_may_clobber_ref_p (gimple *first, gimple *last, tree ref)
> > > +{
> > > +  ao_ref r;
> > > +  ao_ref_init (, ref);
> > > +  unsigned int count = 0;
> > > +  tree vop = gimple_vdef (last);
> > > +  gimple *stmt;
> > > +
> > > +  gcc_checking_assert (gimple_bb (first) == gimple_bb (last));
> > 
> > EBB would probably work as well, thus we should assert we do not
> > end up visiting a PHI in the loop?
> 
> For a general-purpose routine, sure; this one is in an anonymous namespace
> and meant for use in this pass.  And there it is only checking stores from
> the same store group and any other stores intermixed between those.
> The pass at least right now is resetting all of its state at the end of
> basic blocks, so gimple_bb (first) == gimple_bb (last) is indeed always
> guaranteed.  If we ever extend it such that we don't have this guarantee,
> then this assert would fail and then of course it should be adjusted to
> handle whatever is needed.  But do we need to do that right now?

No, we don't.  Just wondered about the assert and the real limitation
of the implementation.

> Note extending store-merging to handle groups of stores within EBB
> doesn't look useful, then not all stores would be unconditional.

Yes.

> What could make sense is if we have e.g. a diamond
>  |
> bb1
>/  \
>   bb2 bb3
>\  /
> bb4
>  |
> and stores are in bb1 and bb4 and no stores in bb2 or bb3 can alias
> with those.  But then we'd likely need full-blown walk_aliased_vdefs
> for this...

Yes.

> > > +  gimple *first = merged_store->first_stmt;
> > > +  gimple *last = merged_store->last_stmt;
> > > +  unsigned int i;
> > > +  store_immediate_info *infoc;
> > > +  if (info->order < merged_store->first_order)
> > > +{
> > > +  FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
> > > + if (stmts_may_clobber_ref_p (info->stmt, first, infoc->ops[idx].val))
> > > +   return false;
> > > +  first = info->stmt;
> > > +}
> > > +  else if (info->order > merged_store->last_order)
> > > +{
> > > +  FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
> > > + if (stmts_may_clobber_ref_p (last, info->stmt, infoc->ops[idx].val))
> > > +   return false;
> > > +  last = info->stmt;
> > > +}
> > > +  if (stmts_may_clobber_ref_p (first, last, info->ops[idx].val))
> > > +return false;
> > 
> > Can you comment on what you check in this block?  It first checks
> > all stmts (but not info->stmt itself if it is after last!?) 
> > against
> > all stores that would be added when adding 'info'.  Then it checks
> > from new first to last against the newly added stmt (again
> > excluding that stmt if it was added last).
> 
> The stmts_may_clobber_ref_p routine doesn't check aliasing on the last
> stmt, only on the first stmt and stmts in between.
> 
> Previous iterations have checked FOR_EACH_VEC_ELT (merged_store->stores, i, 
> infoc)
> that merged_store->first_stmt and stmts in between that and 
> merged_store->last_stmt don't clobber any of the infoc->ops[idx].val
> references and we want to maintain that invariant if we add another store to
> the group.  So, if we are about to extend the range between first_stmt and
> last_stmt, then we need to check all the earlier refs on the stmts we've
> added to the range.  Note that the stores are sorted by bitpos, not by
> their order within the basic block at this point, so it is possible that a
> store with a higher bitpos extends to earlier stmts or later stmts.
> 
> And finally the if (stmts_may_clobber_ref_p (first, last, info->ops[idx].val))
> is checking the reference we are adding against the whole new range.
> 
> > > +  if (offset != NULL_TREE)
> > > +{
> > > +  /* If the access is variable offset then a base decl has to be
> > > +  address-taken to be able to emit pointer-based stores to it.
> > > +  ???  We might be able to get away with re-using the original
> > > +  base up to the first variable part and then wrapping that inside
> > > +  a BIT_FIELD_REF.  */
> > 
> > Yes, that's what I'd generally recommend...  OTOH it can get quite
> > fugly but it only has to survive until RTL expansion...
> 
> This is a preexisting comment I've just moved from the
> pass_store_merging::execute method into a helper function (it grew too large
> and needed too much indentation, and furthermore I wanted to use it for the
> loads too).  Haven't really changed anything about it.
> 
> > As extra sanity check I'd rather have that all refs share a common
> > base (operand-equal-p'ish).  But I guess that's what usually will
> > happen anyways.  The alias-ptr-type trick will be tricky to do
> > here as well (you have to go down to 

Re: [PATCH] Improve store merging to handle load+store or bitwise logicals (PR tree-optimization/78821, take 2)

2017-11-03 Thread Jakub Jelinek
On Fri, Nov 03, 2017 at 02:14:39PM +0100, Richard Biener wrote:
> > +/* Return true if stmts in between FIRST (inclusive) and LAST (exclusive)
> > +   may clobber REF.  FIRST and LAST must be in the same basic block and
> > +   have non-NULL vdef.  */
> > +
> > +bool
> > +stmts_may_clobber_ref_p (gimple *first, gimple *last, tree ref)
> > +{
> > +  ao_ref r;
> > +  ao_ref_init (, ref);
> > +  unsigned int count = 0;
> > +  tree vop = gimple_vdef (last);
> > +  gimple *stmt;
> > +
> > +  gcc_checking_assert (gimple_bb (first) == gimple_bb (last));
> 
> EBB would probably work as well, thus we should assert we do not
> end up visiting a PHI in the loop?

For a general-purpose routine, sure; this one is in an anonymous namespace
and meant for use in this pass.  And there it is only checking stores from
the same store group and any other stores intermixed between those.
The pass at least right now is resetting all of its state at the end of
basic blocks, so gimple_bb (first) == gimple_bb (last) is indeed always
guaranteed.  If we ever extend it such that we don't have this guarantee,
then this assert would fail and then of course it should be adjusted to
handle whatever is needed.  But do we need to do that right now?

Note extending store-merging to handle groups of stores within EBB
doesn't look useful, then not all stores would be unconditional.
What could make sense is if we have e.g. a diamond
 |
bb1
   /  \
  bb2 bb3
   \  /
bb4
 |
and stores are in bb1 and bb4 and no stores in bb2 or bb3 can alias
with those.  But then we'd likely need full-blown walk_aliased_vdefs
for this...

> > +  gimple *first = merged_store->first_stmt;
> > +  gimple *last = merged_store->last_stmt;
> > +  unsigned int i;
> > +  store_immediate_info *infoc;
> > +  if (info->order < merged_store->first_order)
> > +{
> > +  FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
> > +   if (stmts_may_clobber_ref_p (info->stmt, first, infoc->ops[idx].val))
> > + return false;
> > +  first = info->stmt;
> > +}
> > +  else if (info->order > merged_store->last_order)
> > +{
> > +  FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
> > +   if (stmts_may_clobber_ref_p (last, info->stmt, infoc->ops[idx].val))
> > + return false;
> > +  last = info->stmt;
> > +}
> > +  if (stmts_may_clobber_ref_p (first, last, info->ops[idx].val))
> > +return false;
> 
> Can you comment on what you check in this block?  It first checks
> all stmts (but not info->stmt itself if it is after last!?) 
> against
> all stores that would be added when adding 'info'.  Then it checks
> from new first to last against the newly added stmt (again
> excluding that stmt if it was added last).

The stmts_may_clobber_ref_p routine doesn't check aliasing on the last
stmt, only on the first stmt and stmts in between.

Previous iterations have checked FOR_EACH_VEC_ELT (merged_store->stores, i, 
infoc)
that merged_store->first_stmt and stmts in between that and 
merged_store->last_stmt don't clobber any of the infoc->ops[idx].val
references and we want to maintain that invariant if we add another store to
the group.  So, if we are about to extend the range between first_stmt and
last_stmt, then we need to check all the earlier refs on the stmts we've
added to the range.  Note that the stores are sorted by bitpos, not by
their order within the basic block at this point, so it is possible that a
store with a higher bitpos extends to earlier stmts or later stmts.

And finally the if (stmts_may_clobber_ref_p (first, last, info->ops[idx].val))
is checking the reference we are adding against the whole new range.

> > +  if (offset != NULL_TREE)
> > +{
> > +  /* If the access is variable offset then a base decl has to be
> > +address-taken to be able to emit pointer-based stores to it.
> > +???  We might be able to get away with re-using the original
> > +base up to the first variable part and then wrapping that inside
> > +a BIT_FIELD_REF.  */
> 
> Yes, that's what I'd generally recommend...  OTOH it can get quite
> fugly but it only has to survive until RTL expansion...

This is a preexisting comment I've just moved from the
pass_store_merging::execute method into a helper function (it grew too large
and needed too much indentation, and furthermore I wanted to use it for the
loads too).  Haven't really changed anything about it.

> As extra sanity check I'd rather have that all refs share a common
> base (operand-equal-p'ish).  But I guess that's what usually will
> happen anyways.  The alias-ptr-type trick will be tricky to do
> here as well (you have to go down to the base MEM_REF, wrap
> a decl if there's no MEM_REF and adjust the offset type).

For the aliasing, I have an untested incremental patch, need to finish
testcases for that, then test and then I can post it.

> given the style of processing we can end up doing more than
> necessary work when following ! single-use chains here, no?
> Would it 

[PATCH] rs6000: Improve *lt0 patterns

2017-11-03 Thread Segher Boessenkool
The rs6000 port currently has an *lt0_disi define_insn, setting the DI
result to whether the SI argument is negative or not.  It turns out the
generic optimisers cannot always figure out in the other cases either
that this is just a shift for us.  This patch adds patterns for all
four SI/DI combinations.

Tested on powerpc64-linux {-m32,-m64}; committing to trunk.


Segher


2017-11-03  Segher Boessenkool  

* config/rs6000/rs6000.md (*lt0_disi): Delete.
(*lt0_di, *lt0_si): New.

---
 gcc/config/rs6000/rs6000.md | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 18ebe8f..ff79f2d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3829,11 +3829,19 @@ (define_insn_and_split "*rotl3_mask_dot2"
 
 ; Special case for less-than-0.  We can do it with just one machine
 ; instruction, but the generic optimizers do not realise it is cheap.
-(define_insn "*lt0_disi"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-   (lt:DI (match_operand:SI 1 "gpc_reg_operand" "r")
-  (const_int 0)))]
+(define_insn "*lt0_di"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+   (lt:GPR (match_operand:DI 1 "gpc_reg_operand" "r")
+   (const_int 0)))]
   "TARGET_POWERPC64"
+  "srdi %0,%1,63"
+  [(set_attr "type" "shift")])
+
+(define_insn "*lt0_si"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+   (lt:GPR (match_operand:SI 1 "gpc_reg_operand" "r")
+   (const_int 0)))]
+  ""
   "rlwinm %0,%1,1,31,31"
   [(set_attr "type" "shift")])
 
-- 
1.8.3.1



[PATCH] rs6000: move_from_CR_ov_bit is TARGET_PAIRED_FLOAT, not TARGET_ISEL

2017-11-03 Thread Segher Boessenkool
Tested as usual, committing to trunk.


Segher


2017-11-03  Segher Boessenkool  

* config/rs6000/rs6000.md (move_from_CR_ov_bit): Change condition to
TARGET_PAIRED_FLOAT.

---
 gcc/config/rs6000/rs6000.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ff79f2d..2ef028f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12142,7 +12142,7 @@ (define_insn "move_from_CR_ov_bit"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
(unspec:SI [(match_operand:CC 1 "cc_reg_operand" "y")]
   UNSPEC_MV_CR_OV))]
-  "TARGET_ISEL"
+  "TARGET_PAIRED_FLOAT"
   "mfcr %0\;rlwinm %0,%0,%t1,1"
   [(set_attr "type" "mfcr")
(set_attr "length" "8")])
-- 
1.8.3.1



Re: Adjust empty class parameter passing ABI (PR c++/60336)

2017-11-03 Thread Marek Polacek
On Thu, Nov 02, 2017 at 02:50:17PM -0400, Jason Merrill wrote:
> We probably want to call them something like default_is_empty_type and
> default_is_empty_record, though.

Done.

> > For one thing I thought we should be consistent in treating these two
> > structs, regarding being empty:
> >
> > struct S { struct { } a; int b[0]; };
> > struct T { struct { } a; int b[]; };
> >
> > Without the (ugly, I know) special handling in is_empty_type, only T
> > would be considered empty.
> 
> Why would T be considered empty?  I don't see anything in
> is_empty_type that would cause that.
 
Sorry, I meant *non*-empty :(.  I've changed the code so that both S and T
are considered empty, because...

> > And that would mean that the g++.dg/abi/empty23.C test
> > crashes (the abort triggers).  The problem seems to be that when we're
> > passing a pointer to an empty struct, it got turned into a null pointer.
> 
> That seems like a bug that needs fixing regardless of whether we end
> up considering S and T empty.

...I fixed this.  The problem was that ix86_function_arg_advance tried to
skip parameters with empty types:

+  /* Skip empty records because they won't be passed.  */
+  if (type && targetm.calls.empty_record_p (type))
+    return;

but I think that's wrong.  Thus the testcase passes now.

> > And yeah, when passing such a struct by value, the flexible array member is
> > ignored.  The ABI of passing struct with a flexible array member has 
> > changed in
> > GCC 4.4.
> 
> Can we add a warning about such pass/return (as a separate patch)?

Oh, the warning is already there.

> > +  if (TREE_ADDRESSABLE (type))
> > +return false;
> 
> I think we want to abort in this case, as the front end should have
> turned this into an invisible reference already.

Done.  Thanks,

Bootstrap/regtest running on x86_64-linux and ppc64-linux.

2017-11-03  Marek Polacek  
H.J. Lu  
Jason Merrill  

PR c++/60336
PR middle-end/67239
PR target/68355
* class.c (layout_class_type): Set TYPE_EMPTY_P and TYPE_WARN_EMPTY_P.

* lto.c (compare_tree_sccs_1): Compare TYPE_WARN_EMPTY_P and
TYPE_EMPTY_P.

* calls.c (initialize_argument_information): Call
warn_parameter_passing_abi target hook.
(store_one_arg): Use 0 for empty record size.  Don't push 0 size
argument onto stack.
(must_pass_in_stack_var_size_or_pad): Return false for empty types.
* common.opt: Update -fabi-version description.
* config/i386/i386.c (init_cumulative_args): Set cum->warn_empty.
(ix86_return_in_memory): Return false for empty types.
(ix86_gimplify_va_arg): Call arg_int_size_in_bytes instead of
int_size_in_bytes.
(ix86_is_empty_record): New function.
(ix86_warn_parameter_passing_abi): New function.
(TARGET_EMPTY_RECORD_P): Redefine.
(TARGET_WARN_PARAMETER_PASSING_ABI): Redefine.
* config/i386/i386.h (CUMULATIVE_ARGS): Add warn_empty.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_EMPTY_RECORD_P,
TARGET_WARN_PARAMETER_PASSING_ABI): Add.
* explow.c (hard_function_value): Call arg_int_size_in_bytes
instead of int_size_in_bytes.
* expr.c (copy_blkmode_to_reg): Likewise.
* function.c (assign_parm_find_entry_rtl): Call
warn_parameter_passing_abi target hook.
(locate_and_pad_parm): Call arg_size_in_bytes instead of
size_in_bytes.
* lto-streamer-out.c (hash_tree): Hash TYPE_EMPTY_P and
TYPE_WARN_EMPTY_P.
* target.def (empty_record_p, warn_parameter_passing_abi): New target
hook.
* targhooks.c (hook_void_CUMULATIVE_ARGS_tree): New hook.
(std_gimplify_va_arg_expr): Skip empty records.  Call
arg_size_in_bytes instead of size_in_bytes.
* targhooks.h (hook_void_CUMULATIVE_ARGS_tree): Declare.
* tree-core.h (tree_type_common): Update comment.  Add artificial_flag
and empty_flag.
* tree-streamer-in.c (unpack_ts_base_value_fields): Stream
TYPE_WARN_EMPTY_P instead of TYPE_ARTIFICIAL.
(unpack_ts_type_common_value_fields): Stream TYPE_EMPTY_P and
TYPE_ARTIFICIAL.
* tree-streamer-out.c (pack_ts_base_value_fields): Stream
TYPE_WARN_EMPTY_P instead of TYPE_ARTIFICIAL.
(pack_ts_type_common_value_fields): Stream TYPE_EMPTY_P and
TYPE_ARTIFICIAL.
* tree.c (is_empty_type): New function.
(default_is_empty_record): New function.
(arg_int_size_in_bytes): New function.
(arg_size_in_bytes): New function.
* tree.h: Define TYPE_EMPTY_P and TYPE_WARN_EMPTY_P.  Map
TYPE_ARTIFICIAL to type_common.artificial_flag.
(default_is_empty_record, arg_int_size_in_bytes,
arg_size_in_bytes): Declare.

* g++.dg/abi/empty12.C: New test.
* 

Re: [PATCH OBVIOUS]Fix memory leak in tree-predcom.c

2017-11-03 Thread Richard Biener
On Fri, Nov 3, 2017 at 1:36 PM, Bin Cheng  wrote:
> Hi,
> I ran into this memory leak issue in tree-predcom.c when investigating other 
> PRs.
> This is the obvious fix by freeing reference of trivial component.
> Bootstrap and test on x86_64.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-11-02  Bin Cheng  
>
> * tree-predcom.c (determine_roots_comp): Avoid memory leak by freeing
> reference of trivial component.


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-03 Thread Martin Liška
On 10/24/2017 04:19 PM, Jason Merrill wrote:
> On 10/18/2017 09:07 AM, Martin Liška wrote:
>> @@ -1182,7 +1182,13 @@ cxx_eval_builtin_function_call (const constexpr_ctx 
>> *ctx, tree t, tree fun,
>>  {
>>    new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
>>     CALL_EXPR_FN (t), nargs, args);
>> -  error ("%q+E is not a constant expression", new_call);
>> +
>> +  /* Do not allow __builtin_unreachable in constexpr function.  */
>> +  if (DECL_FUNCTION_CODE (fun) == BUILT_IN_UNREACHABLE
>> +  && EXPR_LOCATION (t) == BUILTINS_LOCATION)
>> +    error ("constexpr call flows off the end of the function");
>> +  else
>> +    error ("%q+E is not a constant expression", new_call);
> 
> You don't need to build new_call in the new case, since you don't use it.
> 
> Also, please adjust the comment to say that a __builtin_unreachable call with 
> BUILTINS_LOCATION comes from cp_maybe_instrument_return.
> 
> OK with those changes.
> 
> Jason

Hi.

Thank you for review, done that.
Can you please take a look at the single problematic test-case that blocks 
acceptance of the patch to trunk?

Martin
From 0c4fc1acba49d2d5ca2e6c475286a14e465b6f6c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 12 Oct 2017 10:14:59 +0200
Subject: [PATCH 1/3] Instrument function exit with __builtin_unreachable in
 C++

gcc/c-family/ChangeLog:

2017-10-12  Martin Liska  

	PR middle-end/82404
	* c-opts.c (c_common_post_options): Set -Wreturn-type for C++
	FE.
	* c.opt: Set default value of warn_return_type.

gcc/cp/ChangeLog:

2017-10-12  Martin Liska  

	PR middle-end/82404
	* constexpr.c (cxx_eval_builtin_function_call): Handle
	__builtin_unreachable call.
	* cp-gimplify.c (cp_ubsan_maybe_instrument_return): Rename to
	...
	(cp_maybe_instrument_return): ... this.
	(cp_genericize): Call the function unconditionally.

gcc/fortran/ChangeLog:

2017-10-12  Martin Liska  

	PR middle-end/82404
	* options.c (gfc_post_options): Set default value of
	-Wreturn-type to false.
---
 gcc/c-family/c-opts.c |  3 +++
 gcc/c-family/c.opt|  2 +-
 gcc/cp/constexpr.c| 15 ---
 gcc/cp/cp-gimplify.c  | 20 ++--
 gcc/fortran/options.c |  3 +++
 5 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 32120e636c2..cead15e7a63 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -989,6 +989,9 @@ c_common_post_options (const char **pfilename)
 	flag_extern_tls_init = 1;
 }
 
+  if (warn_return_type == -1)
+    warn_return_type = c_dialect_cxx ();
+
   if (num_in_fnames > 1)
 error ("too many filenames given.  Type %s --help for usage",
 	   progname);
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dae124ac1c2..9ab31f0e153 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -960,7 +960,7 @@ C++ ObjC++ Var(warn_reorder) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn when the compiler reorders code.
 
 Wreturn-type
-C ObjC C++ ObjC++ Var(warn_return_type) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
+C ObjC C++ ObjC++ Var(warn_return_type) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Init(-1)
 Warn whenever a function's return type defaults to \"int\" (C), or about inconsistent return types (C++).
 
 Wscalar-storage-order
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 483f731a49a..7c2185851e0 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1180,9 +1180,18 @@ cxx_eval_builtin_function_call (const constexpr_ctx *ctx, tree t, tree fun,
 {
   if (!*non_constant_p && !ctx->quiet)
 	{
-	  new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
-	   CALL_EXPR_FN (t), nargs, args);
-	  error ("%q+E is not a constant expression", new_call);
+	  /* Do not allow __builtin_unreachable in constexpr function.
+	     The __builtin_unreachable call with BUILTINS_LOCATION
+	     comes from cp_maybe_instrument_return.  */
+	  if (DECL_FUNCTION_CODE (fun) == BUILT_IN_UNREACHABLE
+	      && EXPR_LOCATION (t) == BUILTINS_LOCATION)
+	    error ("constexpr call flows off the end of the function");
+	  else
+	    {
+	      new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
+					       CALL_EXPR_FN (t), nargs, args);
+	      error ("%q+E is not a constant expression", new_call);
+	    }
 	}
   *non_constant_p = true;
   return t;
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 262485a5c1f..014c1ee7231 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1556,10 +1556,11 @@ cp_genericize_tree (tree* t_p, bool handle_invisiref_parm_p)
 
 /* If a function that should end with a return in non-void
function doesn't obviously end with return, add ubsan
-   instrumentation code to verify it at runtime.  */
+   instrumentation code to verify it at runtime.  If -fsanitize=return
+   is not enabled, instrument 

Re: [PATCH] New option saphira for Qualcomm server part

2017-11-03 Thread Richard Earnshaw
On 03/11/17 12:45, Siddhesh Poyarekar wrote:
> On 3 November 2017 at 15:50, Richard Earnshaw
>  wrote:
>>>   2017-10-27  Siddhesh Poyarekar  
>>>   Jim Wilson  
>>>
>>>   gcc/
>>>   * config/aarch64/aarch64-cores.def (saphira): New.
>>>   * config/aarch64/aarch64-tune.md: Regenerated.
>>>   * doc/invoke.texi (AArch64 Options/-mtune): Add "saphira".
>>>   * gcc/config/aarch64/aarch64.c (saphira_tunings): New.
>>
>> OK.
> 
> Thanks, I don't have commit access, can you please push for me?
> Alternatively, may I request commit access?
> 
> Siddhesh
> 

Committed.

R.


Re: [PATCH] Improve store merging to handle load+store or bitwise logicals (PR tree-optimization/78821, take 2)

2017-11-03 Thread Richard Biener
On Thu, 2 Nov 2017, Jakub Jelinek wrote:

> On Thu, Nov 02, 2017 at 03:38:45PM +, Kyrill Tkachov wrote:
> > this looks great! I have a couple of comments.
> > * Can you please extend file comments for gimple-ssa-store-merging.c ?
> > Currently it mostly describes how we merge constants together. Once we start
> > accepting non-constant members
> > we should mention it in there.
> 
> The following updated patch introduced the #define and updates comments.
> I'll do the BIT_NOT_EXPR work incrementally.
> 
> BTW, finished the statistics gathering from combined x86_64 and i686-linux
> bootstraps.  With my recent gimple-ssa-store-merging.c (the bitfield
> handling etc.) changes reverted, the split_stores.length () and orig_num_stmts
> counts at the end of successful output_merged_store was (sum from all
> cases):
> integer_cst   199245  413294
> with the recent change in plus this patch:
> integer_cst   215274  442134
> mem_ref   16943   35369
> bit_and_expr  37  88
> bit_ior_expr  19  46
> bit_xor_expr  27  58
> I think the integer_cst numbers without/with this patch should be roughly
> the same.
> 
> 2017-11-02  Jakub Jelinek  
> 
>   PR tree-optimization/78821
>   * gimple-ssa-store-merging.c: Update the file comment.
>   (MAX_STORE_ALIAS_CHECKS): Define.
>   (struct store_operand_info): New type.
>   (store_operand_info::store_operand_info): New constructor.
>   (struct store_immediate_info): Add rhs_code and ops data members.
>   (store_immediate_info::store_immediate_info): Add rhscode, op0r
>   and op1r arguments to the ctor, initialize corresponding data members.
>   (struct merged_store_group): Add load_align_base and load_align
>   data members.
>   (merged_store_group::merged_store_group): Initialize them.
>   (merged_store_group::do_merge): Update them.
>   (merged_store_group::apply_stores): Pick the constant for
>   encode_tree_to_bitpos from one of the two operands, or skip
>   encode_tree_to_bitpos if neither operand is a constant.
>   (class pass_store_merging): Add process_store method decl.  Remove
>   bool argument from terminate_all_aliasing_chains method decl.
>   (pass_store_merging::terminate_all_aliasing_chains): Remove
>   var_offset_p argument and corresponding handling.
>   (stmts_may_clobber_ref_p): New function.
>   (compatible_load_p): New function.
>   (imm_store_chain_info::coalesce_immediate_stores): Terminate group
>   if there is overlap and rhs_code is not INTEGER_CST.  For
>   non-overlapping stores terminate group if rhs is not mergeable.
>   (get_alias_type_for_stmts): Change first argument from
>   auto_vec<gimple *> & to vec<gimple *> &.  Add IS_LOAD, CLIQUEP and
>   BASEP arguments.  If IS_LOAD is true, look at rhs1 of the stmts
>   instead of lhs.  Compute *CLIQUEP and *BASEP in addition to the
>   alias type.
>   (get_location_for_stmts): Change first argument from
>   auto_vec<gimple *> & to vec<gimple *> &.
>   (struct split_store): Remove orig_stmts data member, add orig_stores.
>   (split_store::split_store): Create orig_stores rather than orig_stmts.
>   (find_constituent_stmts): Renamed to ...
>   (find_constituent_stores): ... this.  Change second argument from
>   vec<gimple *> * to vec<store_immediate_info *> *, push pointers
>   to info structures rather than the statements.
>   (split_group): Rename ALLOW_UNALIGNED argument to
>   ALLOW_UNALIGNED_STORE, add ALLOW_UNALIGNED_LOAD argument and handle
>   it.  Adjust find_constituent_stores caller.
>   (imm_store_chain_info::output_merged_store): Handle rhs_code other
>   than INTEGER_CST, adjust split_group, get_alias_type_for_stmts and
>   get_location_for_stmts callers.  Set MR_DEPENDENCE_CLIQUE and
>   MR_DEPENDENCE_BASE on the MEM_REFs if they are the same in all stores.
>   (mem_valid_for_store_merging): New function.
>   (handled_load): New function.
>   (pass_store_merging::process_store): New method.
>   (pass_store_merging::execute): Use process_store method.  Adjust
>   terminate_all_aliasing_chains caller.
> 
>   * gcc.dg/store_merging_13.c: New test.
>   * gcc.dg/store_merging_14.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2017-11-01 22:49:18.123965696 +0100
> +++ gcc/gimple-ssa-store-merging.c	2017-11-02 17:24:04.236317245 +0100
> @@ -19,7 +19,8 @@
> <http://www.gnu.org/licenses/>.  */
>  
>  /* The purpose of this pass is to combine multiple memory stores of
> -   constant values to consecutive memory locations into fewer wider stores.
> +   constant values, values loaded from memory or bitwise operations
> +   on those to consecutive memory locations into fewer wider stores.
> For example, if we have a sequence peforming four byte stores to
> consecutive memory locations:
> [p ] := imm1;
> @@ -29,21 +30,49 @@
> we can transform this into a single 4-byte store if the target supports 

Re: [PR c++/82710] false positive paren warning

2017-11-03 Thread Nathan Sidwell

On 11/02/2017 02:24 PM, Nathan Sidwell wrote:

This patch fixes pr82710, where we erroneously warn on something like:
    friend class_X (::other::name (...));
the parens are needed, otherwise the '::other' is taken to be a 
qualified lookup inside the class_X.


Unfortunately, at the point we can check, we've lost information that 
'::' was used.  So I back off when we see a qualified name there.


I realized CLASS_TYPE_P was overly restrictive.  There are other 
class-like entities that also need protection (typename_types etc). 
Fixed with this patch that uses MAYBE_CLASS_TYPE_P and ENUMERAL_TYPE.  I 
pushed this test to the end, as it is more complicated, so we'll only 
get to it after we know we have a qualified name.


nathan

--
Nathan Sidwell
2017-11-03  Nathan Sidwell  

	PR c++/82710
	* decl.c (grokdeclarator): Protect MAYBE_CLASS things from paren
	warning too.

	PR c++/82710
	* g++.dg/warn/pr82710.C: More cases.

Index: cp/decl.c
===
--- cp/decl.c	(revision 254350)
+++ cp/decl.c	(working copy)
@@ -10795,13 +10795,15 @@ grokdeclarator (const cp_declarator *dec
 	 to be a constructor call.  */
   if (decl_context != PARM
 	  && declarator->parenthesized != UNKNOWN_LOCATION
-	  /* If the type is a class and the inner name used a global
-	 namespace qualifier, we need the parens.  Unfortunately
-	 all we can tell is that a qualified name was used.  */
-	  && !(CLASS_TYPE_P (type)
-	   && inner_declarator
+	  /* If the type is class-like and the inner name used a
+	 global namespace qualifier, we need the parens.
+	 Unfortunately all we can tell is whether a qualified name
+	 was used or not.  */
+	  && !(inner_declarator
 	   && inner_declarator->kind == cdk_id
-	   && inner_declarator->u.id.qualifying_scope))
+	   && inner_declarator->u.id.qualifying_scope
+	   && (MAYBE_CLASS_TYPE_P (type)
+		   || TREE_CODE (type) == ENUMERAL_TYPE)))
 	warning_at (declarator->parenthesized, OPT_Wparentheses,
 		"unnecessary parentheses in declaration of %qs", name);
   if (declarator->kind == cdk_id || declarator->kind == cdk_decomp)
Index: testsuite/g++.dg/warn/pr82710.C
===
--- testsuite/g++.dg/warn/pr82710.C	(revision 254349)
+++ testsuite/g++.dg/warn/pr82710.C	(working copy)
@@ -1,7 +1,10 @@
-// { dg-additional-options -Wparentheses }
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wparentheses -Wno-non-template-friend" }
 
 // the MVP warning triggered on a friend decl.  */
 class X;
+enum class Q {}; // C++ 11ness
+enum R {};
 
 namespace here 
 {
@@ -9,6 +12,9 @@ namespace here
   X friendFunc1();
   X *friendFunc2 ();
   int friendFunc3 ();
+  int bob ();
+  Q bill ();
+  R ben ();
 }
 
 namespace nm
@@ -19,6 +25,9 @@ namespace nm
 void friendFunc1 ();
 void friendFunc2 ();
 void friendFunc3 ();
+int bob ();
+Q bill ();
+R ben ();
   }
 
   class TestClass
@@ -28,5 +37,12 @@ namespace nm
 friend X *::here::friendFunc2 ();
 friend int (::here::friendFunc3 ()); // { dg-warning "" }
   };
+
+  template <typename T> class X
+  {
+friend typename T::frob (::here::bob ());
+friend Q (::here::bill ());
+friend R (::here::ben ());
+  };
 }
 


Re: [PATCH][AArch64] Set default sched pressure algorithm

2017-11-03 Thread Wilco Dijkstra
Richard Biener wrote:
> On Fri, Nov 3, 2017 at 6:38 AM, Andrew Pinski  wrote:
> > On Fri, Nov 3, 2017 at 12:11 AM, Wilco Dijkstra  
> > wrote:
> >> The Arm backend sets the default sched-pressure algorithm to
> >> SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this
> >> speeds up floating point performance on SPEC - eg. CactusBSSN improves
> >> by ~16%.  The gains are mostly due to less spilling, so enable this on 
> >> AArch64
> >> by default.
> >>
>>> OK for commit?
> >
> > I am ok with this from my point of view.  The rs6000, arm and s390
> > back-ends all enable the same way.  I suspect all RISC targets should
> > enable this way too.
>
> I think all OOO execution capable CPUs should.  Ideally this wouldn't be
> a choice between two models but the scheduler would take into account
> register pressure anyways.  Or we should always schedule with sched-pressure
> during first scheduling.

Of the 6 targets which use -fsched-pressure, 5 prefer SCHED_PRESSURE_MODEL,
so we could just make that the default (nds32 is the only exception, but it
has 32 registers so that should not be an issue).

This also fits nicely with my patches to improve GCC settings to be more 
optimal.

Wilco


Re: [PATCH] New option saphira for Qualcomm server part

2017-11-03 Thread Siddhesh Poyarekar
On 3 November 2017 at 15:50, Richard Earnshaw
 wrote:
>>   2017-10-27  Siddhesh Poyarekar  
>>   Jim Wilson  
>>
>>   gcc/
>>   * config/aarch64/aarch64-cores.def (saphira): New.
>>   * config/aarch64/aarch64-tune.md: Regenerated.
>>   * doc/invoke.texi (AArch64 Options/-mtune): Add "saphira".
>>   * gcc/config/aarch64/aarch64.c (saphira_tunings): New.
>
> OK.

Thanks, I don't have commit access, can you please push for me?
Alternatively, may I request commit access?

Siddhesh


[PATCH PR82726/PR70754][2/2]New fix by finding correct root reference in combined chains

2017-11-03 Thread Bin Cheng
Hi,
As described in the message of the previous patch:

This patch set fixes both PRs in the opposite way: Instead of finding dominance
insertion position for the root reference, we re-sort zero-distance references of
combined chain by their position information so that new root reference must
dominate others.  This should be more efficient because we avoid function call
to stmt_dominates_stmt_p.
Bootstrap and test on x86_64 and AArch64 in patch set.  Is it OK?

Thanks,
bin
2017-11-02  Bin Cheng  

PR tree-optimization/82726
PR tree-optimization/70754
* tree-predcom.c (<map>, INCLUDE_ALGORITHM): New headers.
(order_drefs_by_pos): New function.
(combine_chains): Move code setting has_max_use_after to...
(try_combine_chains): ...here.  New parameter.  Sort combined chains
according to position information.
(tree_predictive_commoning_loop): Update call to above function.
(update_pos_for_combined_chains, pcom_stmt_dominates_stmt_p): New.

gcc/testsuite
2017-11-02  Bin Cheng  

PR tree-optimization/82726
* gcc.dg/tree-ssa/pr82726.c: New test.

From 843cef544a46236e40063416cebc8037736ad18a Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 1 Nov 2017 17:43:55 +
Subject: [PATCH 2/2] pr82726-20171102.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c |  26 ++
 gcc/tree-predcom.c  | 159 
 2 files changed, 169 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
new file mode 100644
index 000..179f93a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 --param tree-reassoc-width=4" } */
+/* { dg-additional-options "-mavx2" { target avx2_runtime } } */
+
+#define N 40
+#define M 128
+unsigned int in[N+M];
+unsigned short out[N];
+
+/* Outer-loop vectorization. */
+
+void
+foo (){
+  int i,j;
+  unsigned int diff;
+
+  for (i = 0; i < N; i++) {
+diff = 0;
+for (j = 0; j < M; j+=8) {
+  diff += in[j+i];
+}
+out[i]=(unsigned short)diff;
+  }
+
+  return;
+}
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 24d7c9c..a243bce 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -201,6 +201,8 @@ along with GCC; see the file COPYING3.  If not see
i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */
 
 #include "config.h"
+#include <map>
+#define INCLUDE_ALGORITHM /* std::sort */
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
@@ -1020,6 +1022,14 @@ order_drefs (const void *a, const void *b)
   return (*da)->pos - (*db)->pos;
 }
 
+/* Compares two drefs A and B by their position.  Callback for std::sort.  */
+
+static bool
+order_drefs_by_pos (dref a, dref b)
+{
+  return a->pos < b->pos;
+}
+
 /* Returns root of the CHAIN.  */
 
 static inline dref
@@ -2633,7 +2643,6 @@ combine_chains (chain_p ch1, chain_p ch2)
   bool swap = false;
   chain_p new_chain;
   unsigned i;
-  gimple *root_stmt;
   tree rslt_type = NULL_TREE;
 
   if (ch1 == ch2)
@@ -2675,31 +2684,56 @@ combine_chains (chain_p ch1, chain_p ch2)
   new_chain->refs.safe_push (nw);
 }
 
-  new_chain->has_max_use_after = false;
-  root_stmt = get_chain_root (new_chain)->stmt;
-  for (i = 1; new_chain->refs.iterate (i, &nw); i++)
-{
-  if (nw->distance == new_chain->length
-	  && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
-	{
-	  new_chain->has_max_use_after = true;
-	  break;
-	}
-}
-
   ch1->combined = true;
   ch2->combined = true;
   return new_chain;
 }
 
-/* Try to combine the CHAINS.  */
+/* Recursively update position information of all offspring chains to ROOT
+   chain's position information.  */
+
+static void
+update_pos_for_combined_chains (chain_p root)
+{
+  chain_p ch1 = root->ch1, ch2 = root->ch2;
+  dref ref, ref1, ref2;
+  for (unsigned j = 0; (root->refs.iterate (j, &ref)
+			&& ch1->refs.iterate (j, &ref1)
+			&& ch2->refs.iterate (j, &ref2)); ++j)
+ref1->pos = ref2->pos = ref->pos;
+
+  if (ch1->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch1);
+  if (ch2->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch2);
+}
+
+/* Returns true if statement S1 dominates statement S2.  */
+
+static bool
+pcom_stmt_dominates_stmt_p (std::map<gimple *, int> &stmts_map,
+			gimple *s1, gimple *s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  if (!bb1 || s1 == s2)
+return true;
+
+  if (bb1 == bb2)
+return stmts_map[s1] < stmts_map[s2];
+
+  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
+}
+
+/* Try to combine the CHAINS in LOOP.  */
 
 static void
-try_combine_chains (vec<chain_p> *chains)
+try_combine_chains (struct loop *loop, vec<chain_p> *chains)
 {
   unsigned i, j;
   chain_p ch1, ch2, cch;
  auto_vec<chain_p> worklist;
+  bool combined_p = false;
 
 

[PATCH PR82726][1/2]Revert previous fixes for PR70754 and PR79663

2017-11-03 Thread Bin Cheng
Hi,
When fixing PR70754, I thought the issue only happens for ZERO-length chains.
Well, that's apparently not true with PR82726.
The whole story is, with chain combination/re-association, new stmts may be
created/inserted at positions not dominating following uses.  This happens in
two scenarios:
  1) Zero length chains, as in PR70754.
  2) Non-zero chains with multiple zero distance references.
PR82726 falls in case 2).  Because zero distance references are root of the
chain, they don't inherit values from loop carried PHIs.  In code generation,
we still need to be careful not inserting use before definitions.

Previous fix to PR70754 tries to find dominance position for insertion when
combining all references.  I could do the similar thing on top of that fix,
but it would be inefficient/complicated because we should only do that for
zero distance references in a non-zero length combined chain.

This patch set fixes both PRs in the opposite way: Instead of finding dominance
insertion position for root reference, we re-sort zero-distance references of
combined chain by their position information so that new root reference must
dominate others.  This should be more efficient because we avoid function call
to stmt_dominates_stmt_p.

This is the first patch reverting r244815 and r245689.

Bootstrap and test on x86_64 and AArch64 in patch set.  Is it OK?

Thanks,
bin
2017-11-02  Bin Cheng  

PR tree-optimization/82726
Revert
2017-01-23  Bin Cheng  

PR tree-optimization/70754
* tree-predcom.c (stmt_combining_refs): New parameter INSERT_BEFORE.
(reassociate_to_the_same_stmt): New parameter INSERT_BEFORE.  Insert
combined stmt before it if not NULL.
(combine_chains): Process refs reversely and compute dominance point
for root ref.

Revert
2017-02-23  Bin Cheng  

PR tree-optimization/79663
* tree-predcom.c (combine_chains): Process refs in reverse order
only for ZERO length chains, and add explaining comment.

From 408c86c33670ce64e9872fa9d4cc66fe0b3bffa4 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 1 Nov 2017 12:53:43 +
Subject: [PATCH 1/2] revert-244815-245689.txt

---
 gcc/tree-predcom.c | 64 +++---
 1 file changed, 13 insertions(+), 51 deletions(-)

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index fdb32f1..24d7c9c 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -2520,11 +2520,10 @@ remove_name_from_operation (gimple *stmt, tree op)
 }
 
 /* Reassociates the expression in that NAME1 and NAME2 are used so that they
-   are combined in a single statement, and returns this statement.  Note the
-   statement is inserted before INSERT_BEFORE if it's not NULL.  */
+   are combined in a single statement, and returns this statement.  */
 
 static gimple *
-reassociate_to_the_same_stmt (tree name1, tree name2, gimple *insert_before)
+reassociate_to_the_same_stmt (tree name1, tree name2)
 {
   gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
   gassign *new_stmt, *tmp_stmt;
@@ -2581,12 +2580,6 @@ reassociate_to_the_same_stmt (tree name1, tree name2, gimple *insert_before)
   var = create_tmp_reg (type, "predreastmp");
   new_name = make_ssa_name (var);
   new_stmt = gimple_build_assign (new_name, code, name1, name2);
-  if (insert_before && stmt_dominates_stmt_p (insert_before, s1))
-    bsi = gsi_for_stmt (insert_before);
-  else
-    bsi = gsi_for_stmt (s1);
-
-  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
 
   var = create_tmp_reg (type, "predreastmp");
   tmp_name = make_ssa_name (var);
@@ -2603,6 +2596,7 @@ reassociate_to_the_same_stmt (tree name1, tree name2, gimple *insert_before)
   s1 = gsi_stmt (bsi);
   update_stmt (s1);
 
+  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
   gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);
 
   return new_stmt;
@@ -2611,11 +2605,10 @@ reassociate_to_the_same_stmt (tree name1, tree name2, gimple *insert_before)
 /* Returns the statement that combines references R1 and R2.  In case R1
and R2 are not used in the same statement, but they are used with an
associative and commutative operation in the same expression, reassociate
-   the expression so that they are used in the same statement.  The combined
-   statement is inserted before INSERT_BEFORE if it's not NULL.  */
+   the expression so that they are used in the same statement.  */
 
 static gimple *
-stmt_combining_refs (dref r1, dref r2, gimple *insert_before)
+stmt_combining_refs (dref r1, dref r2)
 {
   gimple *stmt1, *stmt2;
   tree name1 = name_for_ref (r1);
@@ -2626,7 +2619,7 @@ stmt_combining_refs (dref r1, dref r2, gimple *insert_before)
   if (stmt1 == stmt2)
 return stmt1;
 
-  return reassociate_to_the_same_stmt (name1, name2, insert_before);
+  return reassociate_to_the_same_stmt (name1, name2);
 }
 
 /* Tries to 

[PATCH OBVIOUS]Fix memory leak in tree-predcom.c

2017-11-03 Thread Bin Cheng
Hi,
I ran into this memory leak issue in tree-predcom.c when investigating other 
PRs.
This is the obvious fix by freeing reference of trivial component.
Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin
2017-11-02  Bin Cheng  

* tree-predcom.c (determine_roots_comp): Avoid memory leak by freeing
reference of trivial component.

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index a243bce..e493dcd 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -1341,7 +1341,14 @@ determine_roots_comp (struct loop *loop,
 
   /* Trivial component.  */
   if (comp->refs.length () <= 1)
-    return;
+    {
+      if (comp->refs.length () == 1)
+	{
+	  free (comp->refs[0]);
+	  comp->refs.truncate (0);
+	}
+      return;
+    }
 
   comp->refs.qsort (order_drefs);
   FOR_EACH_VEC_ELT (comp->refs, i, a)


[PATCH PR82776]Exploit more undefined pointer overflow behavior in loop niter analysis

2017-11-03 Thread Bin Cheng
Hi,
This is a simple patch exploiting more undefined pointer overflow behavior in
loop niter analysis.  Originally, it only supports POINTER_PLUS_EXPR if the
offset part is IV.  This patch also handles the case if pointer is IV.  With
this patch, the while(true) loop in test can be removed by cddce pass now.

Bootstrap and test on x86_64 and AArch64.  This patch introduces two failures:
FAIL: g++.dg/pr79095-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/pr79095-2.C  -std=gnu++11 (test for excess errors)
I believe this exposes an inaccurate value range information issue.  For
the code below:
/* { dg-do compile } */
/* { dg-options "-Wall -O3" } */

typedef long unsigned int size_t;

inline void
fill (int *p, size_t n, int)
{
  while (n--)
*p++ = 0;
}

struct B
{
  int* p0, *p1, *p2;

  size_t size () const {
return size_t (p1 - p0);
  }

  void resize (size_t n) {
if (n > size())
  append (n - size());
  }

  void append (size_t n)
  {
if (size_t (p2 - p1) >= n)   {
  fill (p1, n, 0);
}
  }
};

void foo (B &b)
{
  if (b.size () != 0)
b.resize (b.size () - 1);
}

GCC gives the warning below with this patch:
pr79095-1.C: In function ‘void foo(B&)’:
pr79095-1.C:10:7: warning: iteration 4611686018427387903 invokes undefined behavior [-Waggressive-loop-optimizations]
 *p++ = 0;
  ~^~
pr79095-1.C:9:11: note: within this loop
   while (n--)
   ^~

The problem is that VRP should understand this case can never occur given the
condition:
  (size_t (p2 - p1) >= n)
in function B::append.

So, any comment?

Thanks,
bin
2017-11-02  Bin Cheng  

PR tree-optimization/82776
* tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith): Handle
POINTER_PLUS_EXPR in which the pointer is an IV.
(infer_loop_bounds_from_signedness): Refine comment.

gcc/testsuite
2017-11-02  Bin Cheng  

PR tree-optimization/82776
* g++.dg/pr82776.C: New test.
* gcc.dg/tree-ssa/split-path-6.c: Refine test.

diff --git a/gcc/testsuite/g++.dg/pr82776.C b/gcc/testsuite/g++.dg/pr82776.C
new file mode 100644
index 000..2a66817
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr82776.C
@@ -0,0 +1,78 @@
+// PR tree-optimization/82776
+// { dg-do compile }
+// { dg-options "-O2 -std=c++14 -fdump-tree-cddce2-details" }
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+unsigned baz (unsigned);
+
+struct Chunk {
+  std::array tags_;
+  uint8_t control_;
+
+  bool eof() const {
+return (control_ & 1) != 0;
+  }
+
+  static constexpr unsigned kFullMask = (1 << 14) - 1;
+
+  unsigned occupiedMask() const {
+return baz (kFullMask);
+  }
+};
+
+#define LIKELY(x) __builtin_expect((x), true)
+#define UNLIKELY(x) __builtin_expect((x), false)
+
+struct Iter {
+  Chunk* chunk_;
+  std::size_t index_;
+
+  void advance() {
+// common case is packed entries
+while (index_ > 0) {
+  --index_;
+  if (LIKELY(chunk_->tags_[index_] != 0)) {
+return;
+  }
+}
+
+// bar only skips the work of advance() if this loop can
+// be guaranteed to terminate
+#ifdef ENABLE_FORLOOP
+for (std::size_t i = 1; i != 0; ++i) {
+#else
+while (true) {
+#endif
+  // exhausted the current chunk
+  if (chunk_->eof()) {
+chunk_ = nullptr;
+break;
+  }
+  ++chunk_;
+  auto m = chunk_->occupiedMask();
+  if (m != 0) {
+index_ = 31 - __builtin_clz(m);
+break;
+  }
+}
+  }
+};
+
+static Iter foo(Iter iter) {
+  puts("hello");
+  iter.advance();
+  return iter;
+}
+
+void bar(Iter iter) {
+  foo(iter);
+}
+
+// { dg-final { scan-tree-dump-not "can not prove finiteness of loop" "cddce2" } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index 682166f..2206d05 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -53,10 +53,11 @@ oof ()
 }
 
 void
-lookharder (string)
+lookharder (string, res)
  char *string;
+ char *res;
 {
-  register char *g;
+  register char *g = res;
   register char *s;
   for (s = string; *s != '\0'; s++)
 {
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6efe67a..7c1ac61 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3422,7 +3422,7 @@ static void
 infer_loop_bounds_from_pointer_arith (struct loop *loop, gimple *stmt)
 {
   tree def, base, step, scev, type, low, high;
-  tree var, ptr;
+  tree rhs2, rhs1;
 
   if (!is_gimple_assign (stmt)
   || gimple_assign_rhs_code (stmt) != POINTER_PLUS_EXPR)
@@ -3436,12 +3436,13 @@ infer_loop_bounds_from_pointer_arith (struct loop *loop, gimple *stmt)
   if (!nowrap_type_p (type))
 return;
 
-  ptr = gimple_assign_rhs1 (stmt);
-  if (!expr_invariant_in_loop_p (loop, ptr))
+  rhs2 = gimple_assign_rhs2 (stmt);
+  if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (rhs2)))
 

GCC 8.0.0 Status Report (2017-11-03)

2017-11-03 Thread Richard Biener

Status
==

The feature development phase of GCC 8, Stage 1, is coming to its end on
Friday, Nov. 17th (as usual you can use your local timezone to your
own advantage).

This means that from Saturday, Nov. 18th we will be in Stage 3 which
allows for general bugfixing.  All feature implementations posted
before Stage 3 might be considered during a brief period after we
entered Stage 3.

As usual at this time, not all regressions have been prioritized; the
usual rules apply -- regressions new in GCC 8 will end up as P1
unless they do not affect primary or secondary targets or languages.
Regressions that we shipped with in GCC 7.2 can only be at most P2.
Regressions that only affect non-primary/secondary targets or
languages will be demoted to P4/5.


Quality Data


Priority  #   Change from last report
---   ---
P1   14   +   8
P2  163   +  44
P3  134   + 126
P4  135   -  11
P5   27   -   3
---   ---
Total P1-P3 311   + 178
Total   473   + 164


Previous Report
===

https://gcc.gnu.org/ml/gcc/2017-04/msg00084.html


Re: [PATCH] Make inlining consistent in LTO and non-LTO mode (PR target/71991).

2017-11-03 Thread Martin Liška
Hi.

Honza can you please take a look at this, because Richi installed patch
that I originally suggested and you were not happy about: r251333.

Thanks,
Martin



[patch] Do not report non-executed blocks in Ada coverage

2017-11-03 Thread Eric Botcazou
Hi,

as explained in
  https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00105.html
we don't (necessarily) want to report non-executed blocks in Ada.

The two attached patches prevent this from happening by detecting whether
we are dealing with an Ada file; the first variant is optimal, the second
variant is slightly more general in case other languages need it too.

Tested on x86_64-suse-linux, OK for the mainline (and which one)?


2017-11-03  Eric Botcazou  

* gcov.c (output_line_beginning): Document HAS_UNEXECUTED_BLOCK.
(output_lines): Always pass false for HAS_UNEXECUTED_BLOCK to
above function if this is for an Ada file.

-- 
Eric Botcazou

Index: gcov.c
===
--- gcov.c	(revision 254348)
+++ gcov.c	(working copy)
@@ -2477,10 +2477,11 @@ pad_count_string (string )
 }
 
 /* Print GCOV line beginning to F stream.  If EXISTS is set to true, the
-   line exists in source file.  UNEXCEPTIONAL indicated that it's not in
-   an exceptional statement.  The output is printed for LINE_NUM of given
-   COUNT of executions.  EXCEPTIONAL_STRING and UNEXCEPTIONAL_STRING are
-   used to indicate non-executed blocks.  */
+   line exists in source file.  UNEXCEPTIONAL indicates that it's not in
+   an exceptional statement.  HAS_UNEXECUTED_BLOCK indicates that it has
+   at least one non-executed block.  The output is printed for LINE_NUM of
+   given COUNT of executions.  EXCEPTIONAL_STRING and UNEXCEPTIONAL_STRING
+   are used to indicate non-executed blocks.  */
 
 static void
 output_line_beginning (FILE *f, bool exists, bool unexceptional,
@@ -2552,6 +2553,7 @@ output_lines (FILE *gcov_file, const sou
   const line_info *line;  /* current line info ptr.  */
   const char *retval = "";	/* status of source file reading.  */
   function_t *fn = NULL;
+  bool is_ada_file = false;
 
   fprintf (gcov_file, DEFAULT_LINE_START "Source:%s\n", src->coverage.name);
   if (!multiple_files)
@@ -2575,6 +2577,17 @@ output_lines (FILE *gcov_file, const sou
   if (flag_branches)
 fn = src->functions;
 
+  /* Detect whether we are dealing with an Ada file.  If this is the case,
+ then we don't report non-executed blocks for lines because statements
+ can easily give rise to non-executed blocks in Ada, e.g. for checks.  */
+  const char *const last_dot = strrchr (src->name, '.');
+  if (last_dot
+  && last_dot[1] == 'a'
+  && last_dot[2] == 'd'
+  && (last_dot[3] == 'a' || last_dot[3] == 'b' || last_dot[3] == 's')
+  && last_dot[4] == '\0')
+is_ada_file = true;
+
  for (line_num = 1, line = &src->lines[line_num];
line_num < src->lines.size (); line_num++, line++)
 {
@@ -2610,7 +2623,8 @@ output_lines (FILE *gcov_file, const sou
 	 There are 16 spaces of indentation added before the source
 	 line so that tabs won't be messed up.  */
   output_line_beginning (gcov_file, line->exists, line->unexceptional,
-			 line->has_unexecuted_block, line->count, line_num,
+			 line->has_unexecuted_block && !is_ada_file,
+			 line->count, line_num,
 			 "=", "#");
   fprintf (gcov_file, ":%s\n", retval ? retval : "/*EOF*/");
 
Index: gcov.c
===
--- gcov.c	(revision 254348)
+++ gcov.c	(working copy)
@@ -2477,10 +2477,11 @@ pad_count_string (string )
 }
 
 /* Print GCOV line beginning to F stream.  If EXISTS is set to true, the
-   line exists in source file.  UNEXCEPTIONAL indicated that it's not in
-   an exceptional statement.  The output is printed for LINE_NUM of given
-   COUNT of executions.  EXCEPTIONAL_STRING and UNEXCEPTIONAL_STRING are
-   used to indicate non-executed blocks.  */
+   line exists in source file.  UNEXCEPTIONAL indicates that it's not in
+   an exceptional statement.  HAS_UNEXECUTED_BLOCK indicates that it has
+   at least one non-executed block.  The output is printed for LINE_NUM of
+   given COUNT of executions.  EXCEPTIONAL_STRING and UNEXCEPTIONAL_STRING
+   are used to indicate non-executed blocks.  */
 
 static void
 output_line_beginning (FILE *f, bool exists, bool unexceptional,
@@ -2552,6 +2553,7 @@ output_lines (FILE *gcov_file, const sou
   const line_info *line;  /* current line info ptr.  */
   const char *retval = "";	/* status of source file reading.  */
   function_t *fn = NULL;
+  bool is_ada_file = false;
 
   fprintf (gcov_file, DEFAULT_LINE_START "Source:%s\n", src->coverage.name);
   if (!multiple_files)
@@ -2575,6 +2577,23 @@ output_lines (FILE *gcov_file, const sou
   if (flag_branches)
 fn = src->functions;
 
+  /* Detect whether we are dealing with an Ada file.  If this is the case,
+ then we don't report non-executed blocks for lines because statements
+ can easily give rise to non-executed blocks in Ada, e.g. for checks.  */
+  const char *const last_dot = strrchr (src->name, '.');
+  if (last_dot)
+{
+  const char *const 

RE: [PATCH 5/6] [ARC] Add 'uncached' attribute.

2017-11-03 Thread Claudiu Zissulescu
> 
> I see no documentation here.
> 

Oops, forgot this one :) Please find it attached. I'll merge it into the final
patch when everything is approved.

Thanks,
Claudiu 


0001-ARC-DOC-Add-uncached-documentation.patch
Description: 0001-ARC-DOC-Add-uncached-documentation.patch


Re: [PATCH] Simplify _Node_insert_return to avoid including

2017-11-03 Thread Jonathan Wakely

On 03/11/17 09:29 +, Jonathan Wakely wrote:

On 02/11/17 18:00 -0400, Tim Song wrote:

Um, why are those member get's there at all (and with an index mapping
that doesn't agree with the member order)? [container.insert.return]
says that "It has no base classes or members other than those
specified."


Because I forgot to implement https://wg21.link/p0508r0 which changed
the spec, including the order of the members.

I'll do that shortly, thanks.



Fixed by this patch, tested powerpc64le-linux, committed to trunk.


commit a13d8a5a5620bbcc44ac6d6a1065e9c5d425b629
Author: Jonathan Wakely 
Date:   Fri Nov 3 09:48:59 2017 +

Remove _Node_insert_return::get() member functions (P0508R0)

* include/bits/node_handle.h (_Node_insert_return::get): Remove, as
per P0508R0.

diff --git a/libstdc++-v3/include/bits/node_handle.h b/libstdc++-v3/include/bits/node_handle.h
index f93bfd7f686..4a830630c89 100644
--- a/libstdc++-v3/include/bits/node_handle.h
+++ b/libstdc++-v3/include/bits/node_handle.h
@@ -282,54 +282,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Iterator		position = _Iterator();
   bool		inserted = false;
   _NodeHandle	node;
-
-  template
-	decltype(auto) get() &
-	{
-	  static_assert(_Idx < 3);
-	  if constexpr (_Idx == 0)
-	return inserted;
-	  else if constexpr (_Idx == 1)
-	return position;
-	  else if constexpr (_Idx == 2)
-	return node;
-	}
-
-  template
-	decltype(auto) get() const &
-	{
-	  static_assert(_Idx < 3);
-	  if constexpr (_Idx == 0)
-	return inserted;
-	  else if constexpr (_Idx == 1)
-	return position;
-	  else if constexpr (_Idx == 2)
-	return node;
-	}
-
-  template
-	decltype(auto) get() &&
-	{
-	  static_assert(_Idx < 3);
-	  if constexpr (_Idx == 0)
-	return std::move(inserted);
-	  else if constexpr (_Idx == 1)
-	return std::move(position);
-	  else if constexpr (_Idx == 2)
-	return std::move(node);
-	}
-
-  template
-	decltype(auto) get() const &&
-	{
-	  static_assert(_Idx < 3);
-	  if constexpr (_Idx == 0)
-	return std::move(inserted);
-	  else if constexpr (_Idx == 1)
-	return std::move(position);
-	  else if constexpr (_Idx == 2)
-	return std::move(node);
-	}
 };
 
 _GLIBCXX_END_NAMESPACE_VERSION


RE: [ARC] Fix stack unwinding for ARC

2017-11-03 Thread Claudiu Zissulescu
> - Fix to unwinding. Now it is possible to unwind from syscall
> wrappers, signals and functions with dynamic stack allocation.
> 
> - Patch also fixes millicode. Although millicode save and restore functions
> would change blink, the calls to those functions were not clobbering blink.
> 

Approved and committed with minor modifications.

Cheers,
Claudiu


Re: [RFA][PATCH] Improve initial probe for noreturn functions for x86 target

2017-11-03 Thread Uros Bizjak
On Fri, Nov 3, 2017 at 11:14 AM, Richard Biener wrote:
> On Fri, Nov 3, 2017 at 9:38 AM, Uros Bizjak  wrote:
>>>* config/i386/i386.c (ix86_emit_restore_reg_using_pop): 
>>> Prototype.
>>>(ix86_adjust_stack_and_probe_stack_clash): Use a push/pop 
>>> sequence
>>>to probe at the start of a noreturn function.
>>>
>>>* gcc.target/i386/stack-check-12.c: New test
>>
>> -  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
>> -   -GET_MODE_SIZE (word_mode)));
>> +  rtx_insn *insn = emit_insn (gen_push (gen_rtx_REG (word_mode, 0)));
>>
>> Please use AX_REG instead of 0.
>>
>> +  RTX_FRAME_RELATED_P (insn) = 1;
>> +  ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, 0));
>>
>> Also here.
>>
>>emit_insn (gen_blockage ());
>>
>> BTW: Could we use an unused register here, if available? %eax is used
>> to pass first argument in regparm functions on 32bit targets.
>
> Can you push %[er]sp?  What about partial reg stalls when using other
> registers (if the last set was a movb to it)?  I guess [er]sp is safe here
> as was [re]ax due to the ABI?

That would work, too. I believe this won't trigger the stack engine
[1], but since the operation is a bit unusual, let's ask HJ to be
sure.

[1] https://en.wikipedia.org/wiki/Stack_register#Stack_engine

Uros.


Re: [PATCH] Initialize variable in order to survive PGO bootstrap.

2017-11-03 Thread Richard Biener
On Fri, Nov 3, 2017 at 9:52 AM, Martin Liška  wrote:
> Hi.
>
> This is a one-liner that fixes PGO bootstrap. I've discussed it with Richi and
> the core is correct.
> However we probably don't have an attribute that will ignore the warning?

I think

  wide_int res = res;

might do (untested).

> Only option is to push/pop Wuninitialized warning.

Too ugly...

> Ready for trunk?

Any better idea?  Richard?

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2017-11-03  Martin Liska  
>
> * tree-vrp.c (vrp_int_const_binop): Initialize to wi::zero.
> ---
>  gcc/tree-vrp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
>

