date:20161031

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-10-31 Thread Martin Sebor


Attached is an updated patch that works around the problem with
the definition of the NOTE_DATA macro discussed below.  I've
raised bug 78174 for it and temporarily worked around it in
the patch.  I'll see if I can come up with a patch to fix the
macro the "right way" but would prefer to do that separately.
The NOTE_DATA macro is implemented in terms of the RTL_CHECK1
macro that will need to change and that macro is used in many
others, so I would rather not mess around with those as part
of this patch.

I tested the updated patch with today's top of trunk on x86_64.
The reason why I missed the problem in the previous patch is
because I accidentally tested an earlier version of it.

In following up on bug 78174 it occurred to me that the warning
option added by the patch for the non-checking functions could
perhaps be enhanced to specify the object size checking type
(right now it hardcodes type 1 for all non-checking functions).
That would make it possible to request looser checking for some
functions (e.g., memset, for compatibility with Glibc's setting
of zero even at _FORTIFY_SOURCE=2), and stricter checking for
others (e.g., strcpy where Glibc uses type-1 with
_FORTIFY_SOURCE=2).  If this is something people would think
would be worthwhile please let me know.

Martin

On 10/31/2016 10:44 AM, Martin Sebor wrote:

On 10/31/2016 06:39 AM, Tobias Burnus wrote:

Martin Sebor wrote:

Attached is an updated patch that adds checks for excessive sizes
and bounds (those in excess of SIZE_MAX / 2), and also enables
the same checking for strcat and strncat).  This version also
fixes an issue with the interpretation of anti-ranges in the
first patch.  The improvements exposed two bugs in the regression
tests.


If I apply this patch to my local trunk - and try to bootstrap GCC,
bootstrapping fails (on x86-64_gnu-linux) as following. I have not
tried to figure out whether the warning (-Werror) makes sense or not.

../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’:
../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t)
writing 8 bytes into a region of size 0 overflows the destination
[-Werror=stringop-overflow]
   memset (_DATA (note), 0, sizeof (NOTE_DATA (note)));


where NOTE_DATA is defined in rtl.h as

/* Opaque data.  */
#define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE)


Thanks for trying it out!  The patch bootstrapped and passed regression
tests for me yesterday, also on x86_64, and today on powepc64le, but
just now I reproduced the error with the top of today's trunk on
x86_64.  I think the error is justified because the call to memset
in the make_note_raw function references a fourth element
(rtx_note::rtx_insn::rtx_def::u.fld[3]) of what is just a single
element array.  After macro expansion, the memset call itself:

  memset (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3])));

is within the bounds of the object pointed to by note but the index
3 is out of bounds for note->u.fld and so undefined.  An unpatched
GCC issues a similar error when the call to memset is replaced with
__builtin___memset_chk like so:

  __builtin___memset_chk (&((note)->u.fld[3]), 0,
  sizeof (((note)->u.fld[3])),
  __builtin_object_size (&((note)->u.fld[3]), 1));

/src/gcc/53562/gcc/emit-rtl.c: In function ‘rtx_note*
make_note_raw(insn_note)’:
/src/gcc/53562/gcc/emit-rtl.c:3941:53: warning: call to void*
__builtin___memset_chk(void*, int, long unsigned int, long unsigned int)
will always overflow destination buffer

The code can be made valid and the warning avoided by deriving
the same address not from an invalid pointer/subscript but rather
from a pointer to the beginning of the union itself and adding
the offset of the fourth element, like so:

  char *p = (char*) &(note)->u.fld[0];
  p += sizeof (((note)->u.fld[0])) * 3;
  memset (p, 0, sizeof *p);

Unfortunately, because the invalid subscript is the result of
the expansion of the RTL_CHECK1() macro fixing this isn't as
straightforward as that.  What really should be fixed is the macro
itself.  Until then, it can be worked (hacked would be a better
term) around it by storing the result of _DATA (note)
expression in a volatile pointer, like so:

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 8afcfbe..6dd9439 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3930,7 +3930,8 @@ make_note_raw (enum insn_note subtype)
   INSN_UID (note) = cur_insn_uid++;
   NOTE_KIND (note) = subtype;
   BLOCK_FOR_INSN (note) = NULL;
-  memset (_DATA (note), 0, sizeof (NOTE_DATA (note)));
+  void* volatile p = _DATA (note);
+  memset (p, 0, sizeof (NOTE_DATA (note)));
   return note;
 }

Martin


PR c/53562 - Add -Werror= support for -D_FORTIFY_SOURCE / __builtin___memcpy_chk
PR middle-end/78149 - missing warning on strncpy buffer overflow due to an excessive bound
PR middle-end/78138 - missing warnings on buffer overflow with non-constant source length

gcc/c-family/ChangeLog:
2016-10-30  Martin Sebor

Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Bill Schmidt

On Oct 31, 2016, at 7:19 PM, Michael Meissner  
wrote:
> 
> One question is how many of the billions and billions (ok, 1,345) of the 
> rs6000
> built-ins would be improved by expanding them in gimple time rather than rtl?
> 
Hundreds and hundreds of them.  All of the basic operators, many of the memory
operations, all of the dozens of flavors of things that are just permutes at 
heart.
The loads and stores alone are a huge deal that we've seen cause problems in
customer code.

Bill

Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Bill Schmidt


> On Oct 31, 2016, at 7:09 PM, Jakub Jelinek  wrote:
> 
> On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote:
>> The PowerPC back end loses performance on vector intrinsics, because 
>> currently
>> all of them are treated as calls throughout the middle-end phases and only
>> expanded when they reach RTL.  Our version of altivec.h currently defines the
>> public names of overloaded functions (like vec_add) to be #defines for hidden
>> functions (like __builtin_vec_add), which are recognized in the parser as 
>> requiring special back-end support.  Tables in rs6000-c.c handle dispatch of
>> the overloaded functions to specific function calls appropriate to the 
>> argument
>> types.
> 
> This doesn't look very nice.  If all you care is that the builtins like
> __builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold
> into generic vector operations under certain conditions, just fold those
> into whatever you want in targetm.gimple_fold_builtin (gsi).
> 
>   Jakub
> 
Ah, thanks, Jakub.  I wasn't aware of that hook, and that sounds like the best
approach.  I had found previously how difficult it can be to expand some of
these things during the parser hook, but if we can expand them early in
GIMPLE that is probably much easier.  I will look into it.

"This doesn't look very nice" wins the understatement of the year award...
I was getting increasingly unhappy the further I got into it.

Thanks,
Bill

Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Michael Meissner

On Mon, Oct 31, 2016 at 06:49:20PM -0500, Bill Schmidt wrote:
> > 
> > On Oct 31, 2016, at 5:28 PM, Bill Schmidt  
> > wrote:
> > 
> >   The other way would be to require a specific option on the command line
> >   to use the new dispatch mechanism.  When the option is present, we would
> >   predefine a macro such as __PPC_FAST_VECTOR__, which would then gate the
> >   usage in altivec.h and overload.h.  Use of #pragma GCC target to change
> >   the availability of Altivec, VMX, P8-vector, etc. would also be disallowed
> >   when the option is present.  This has the advantage of always generating
> >   correct code, at the cost of requiring a special option before anyone
> >   can leverage the benefits of early vector expansion.  That's unfortunate,
> >   but I suspect it's the best we can do.
> 
> Though I suppose we could require the option to turn off the new dispatch
> mechanism, and document the change in gcc7/changes.html.  A little irritating
> for people already using the pragma support, but I really expect this 
> wouldn't affect
> many people at all.

I suspect we may find out how many people are using #pragma GCC target and
altivec vector instructions if we break their code :-(

Even with attribute target instead of pragma, you need to give appropriate
error messages if the user calls a built-in that the current machine doesn't
support.

IIRC, C++ does not support #pragma GCC target nor the target attribute.

One question is how many of the billions and billions (ok, 1,345) of the rs6000
built-ins would be improved by expanding them in gimple time rather than rtl?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote:
> The PowerPC back end loses performance on vector intrinsics, because currently
> all of them are treated as calls throughout the middle-end phases and only
> expanded when they reach RTL.  Our version of altivec.h currently defines the
> public names of overloaded functions (like vec_add) to be #defines for hidden
> functions (like __builtin_vec_add), which are recognized in the parser as 
> requiring special back-end support.  Tables in rs6000-c.c handle dispatch of
> the overloaded functions to specific function calls appropriate to the 
> argument
> types.

This doesn't look very nice.  If all you care is that the builtins like
__builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold
into generic vector operations under certain conditions, just fold those
into whatever you want in targetm.gimple_fold_builtin (gsi).

Jakub

Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Bill Schmidt

> 
> On Oct 31, 2016, at 5:28 PM, Bill Schmidt  wrote:
> 
>   The other way would be to require a specific option on the command line
>   to use the new dispatch mechanism.  When the option is present, we would
>   predefine a macro such as __PPC_FAST_VECTOR__, which would then gate the
>   usage in altivec.h and overload.h.  Use of #pragma GCC target to change
>   the availability of Altivec, VMX, P8-vector, etc. would also be disallowed
>   when the option is present.  This has the advantage of always generating
>   correct code, at the cost of requiring a special option before anyone
>   can leverage the benefits of early vector expansion.  That's unfortunate,
>   but I suspect it's the best we can do.

Though I suppose we could require the option to turn off the new dispatch
mechanism, and document the change in gcc7/changes.html.  A little irritating
for people already using the pragma support, but I really expect this wouldn't 
affect
many people at all.

-- Bill

Bill Schmidt, Ph.D.
GCC for Linux on Power
Linux on Power Toolchain
IBM Linux Technology Center
wschm...@linux.vnet.ibm.com

Re: [PATCH], Optimize (double)vec_extract(n) on PowerPC

2016-10-31 Thread Michael Meissner

On Mon, Oct 31, 2016 at 06:04:47PM -0500, Segher Boessenkool wrote:
> On Mon, Oct 31, 2016 at 05:25:56PM -0400, Michael Meissner wrote:
> > +(define_insn "vsx_xvcvsxwdp_df"
> > +  [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
> > +   (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
> > +  UNSPEC_VSX_CVSXWDP))]
> > +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> > +  "xvcvsxwdp %x0,%x1"
> > +  [(set_attr "type" "vecdouble")])
> 
> I think the condition here is a pasto?  It probably works, but it isn't
> exactly clear what it means here.

I will change it to TARGET_VSX, since it needs the XVCVSXWDP instruction to be
defined.

> > +;; Opimize  f = () vec_extract (vi, )
> 
> Typo (optimize).

Thanks.

> > +;; where  is a hardware supported binary floating point type
> > +;; that is not double
> 
> Full stop.

Ok.

> 
> Looks fine otherwise.  Please apply with those things taking care of.
> 
> Thanks,
> 
> 
> Segher
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH], Optimize (double)vec_extract(n) on PowerPC

2016-10-31 Thread Segher Boessenkool

On Mon, Oct 31, 2016 at 05:25:56PM -0400, Michael Meissner wrote:
> +(define_insn "vsx_xvcvsxwdp_df"
> +  [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
> + (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
> +UNSPEC_VSX_CVSXWDP))]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +  "xvcvsxwdp %x0,%x1"
> +  [(set_attr "type" "vecdouble")])

I think the condition here is a pasto?  It probably works, but it isn't
exactly clear what it means here.

> +;; Opimize  f = () vec_extract (vi, )

Typo (optimize).

> +;; where  is a hardware supported binary floating point type
> +;; that is not double

Full stop.


Looks fine otherwise.  Please apply with those things taking care of.

Thanks,


Segher

[PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add

2016-10-31 Thread Bill Schmidt

Hi,

The PowerPC back end loses performance on vector intrinsics, because currently
all of them are treated as calls throughout the middle-end phases and only
expanded when they reach RTL.  Our version of altivec.h currently defines the
public names of overloaded functions (like vec_add) to be #defines for hidden
functions (like __builtin_vec_add), which are recognized in the parser as 
requiring special back-end support.  Tables in rs6000-c.c handle dispatch of
the overloaded functions to specific function calls appropriate to the argument
types.

The Clang version of altivec.h, by contrast, creates static inlines for each
overloaded function variant, relying on a special __attribute__((overloadable))
construct to do the dispatch in the parser itself.  This allows vec_add to be
immediately translated into type-specific addition during parsing, allowing
the expressions to be subject to all subsequent optimization.

We have opened a PR suggesting that this attribute be supported in GCC as well
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71199), but so far there hasn't
been any success in that regard.  While waiting/hoping for the attribute to be
implemented, though, we can use existing mechanisms to create a poor man's
version of overloading dispatch.  This patch is a proof of concept for how
this can be done, and provides support for early expansion of the overloaded
vec_add intrinsic.  If we get this working, then we can gradually add more
intrinsics over time.

The dispatch mechanism is provided in a new header file, overload.h, which is
included in altivec.h.  This is done because the guts of the dispatch
mechanism are pretty ugly to look at.  Overloading is done with a chain of
calls to __builtin_choose_expr and __builtin_types_compatible_p.  Currently
I envision providing a separate dispatch macro for each combination of the
number of arguments and the number of variants to be distinguished.  I also
provide a separate "decl" macro for each number of arguments, used to create
the function decls for each static inline function.  The add_vec intrinsic
takes two input arguments and has 28 variants, so it requires the definition
of OVERLOAD_2ARG_28VAR and OVERLOAD_2ARG_DECL in overload.h.

These macros are then instantiated in altivec.h.  The dispatch macro for an
overloaded intrinsic is instantiated once, and the decl macro is instantiated
once for each variant, along with the associated inline function body.

The dispatch macro may need to vary depending on the supported processor
features.  In the vec_add example, we have some variants that support the
"vector double" and "vector long long" data types.  These only exist when
VSX code generation is supported, so a dispatch table conditioned on
__VSX__ includes these, while a separate one without VSX support does not.
Similarly, __POWER8_VECTOR__ must be defined if we are to support "vector
signed/unsigned __int128".  Because we use a numbering scheme that needs
to be kept consistent, this requires three versions of the dispatch table,
where the more restrictive versions replace the unimplemented entries with
redundant entries.

Note that if and when we get an overloadable attribute in GCC, the stuff
in overload.h will become obsolete, we will remove the dispatch instantiations,
and we will replace the decl instantiations with plain decls with the
overloadable attribute applied.

There are several complications on top of the basic design:

 * When compiling for C++, the dispatch mechanism is not available, and indeed
   is not necessary.  Thus for C++ we skip the dispatch mechanism, and change
   the definition of OVERLOAD_2ARG_DECL to use standard function overloading.

 * Compiling with -ansi or -std=c11 or the like means the dispatch mechanism
   is unavailable even for C, since GNU extensions are disallowed.  Regret-
   tably, this means that we can't get rid of the existing late-expansion
   methods altogether.  I don't see any way to avoid this.  Note that this
   would be the case even if we had __attribute__ ((overloadable)), since
   that would also be a GNU extension.  Despite the mess, I think that the 
   performance improvements for non-strict-ANSI code make the dual maintenance
   worthwhile.

 * "#pragma GCC target" is going to cause a lot of trouble.  With the patch
   in its present state, we fail gcc.target/powerpc/ppc-target-4.c, which
   tests the use of "vsx", "altivec,no-vsx", and "no-altivec" target options,
   and happens to use vec_add (float, float) as a testbed.  The test fails
   because altivec.h is #included after the #pragma GCC target("vsx"), which
   allows the interfaces involving vector long long and vector double to be
   produced.  However, when the options are changed to "altivec,no-vsx", the
   subsequent invocation of vec_add expands to a dispatch sequence including
   vector long long, leading to a compile-time error.

   I can only think of two ways to deal with this, neither of which is
   attractive.  The

[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834)

2016-10-31 Thread Jakub Jelinek

Hi!

Some automatic VAR_DECLs don't get DECL_RTL set - e.g. if its SSA_NAMEs
expand to multiple rtls, then there is not a single one that can be used.
Using DECL_RTL on such VAR_DECLs ICEs.

I've tried to just return 0 in nonoverlapping_memrefs_p if either
DECL_RTL_SET_P (expr{x,y}) wasn't true, but that triggered way too often
during bootstrap/regtest (3800+ times).  So the following patch narrows it
down more and triggers only on the testcase below.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-31  Jakub Jelinek  

PR target/77834
* alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry
doesn't have rtl set and it can't be safely created.

* gcc.dg/pr77834.c: New test.

--- gcc/alias.c.jj  2016-10-21 17:06:27.0 +0200
+++ gcc/alias.c 2016-10-31 11:38:29.448031590 +0100
@@ -2755,6 +2755,27 @@ nonoverlapping_memrefs_p (const_rtx x, c
   || TREE_CODE (expry) == CONST_DECL)
 return 1;
 
+  /* Don't try to create RTL for decls that intentionally don't have
+ DECL_RTL set (e.g. marked as living in multiple places).  */
+  if (!DECL_RTL_SET_P (exprx))
+{
+  if (TREE_CODE (exprx) == PARM_DECL
+ || TREE_CODE (exprx) == RESULT_DECL
+ || (VAR_P (exprx)
+ && !TREE_STATIC (exprx)
+ && !DECL_EXTERNAL (exprx)))
+   return 0;
+}
+  if (!DECL_RTL_SET_P (expry))
+{
+  if (TREE_CODE (expry) == PARM_DECL
+ || TREE_CODE (expry) == RESULT_DECL
+ || (VAR_P (expry)
+ && !TREE_STATIC (expry)
+ && !DECL_EXTERNAL (expry)))
+   return 0;
+}
+
   rtlx = DECL_RTL (exprx);
   rtly = DECL_RTL (expry);
 
--- gcc/testsuite/gcc.dg/pr77834.c.jj   2016-10-31 11:41:46.290521464 +0100
+++ gcc/testsuite/gcc.dg/pr77834.c  2016-10-31 11:41:24.0 +0100
@@ -0,0 +1,18 @@
+/* PR target/77834 */
+/* { dg-do compile } */
+/* { dg-options "-O -ftree-pre -Wno-psabi" } */
+/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* 
x86_64-*-* } } */
+
+typedef int V __attribute__ ((vector_size (64)));
+
+V
+foo (V u, V v, int w)
+{
+  do
+{
+  if (u[0]) v ^= u[w];
+}
+  while ((V) { 0, u[w] }[1]);
+  u = (V) { v[v[0]], u[u[0]] };
+  return v + u;
+}

Jakub

[PATCH], Optimize (double)vec_extract(n) on PowerPC

2016-10-31 Thread Michael Meissner

This patch adds an optimization that I noticed towards the end of working on my
previous patch to allow SImode into vector registers.

This patch adds optimization when you are extracting an element using a 32-bit
integer with constant element number, and then converting the value to double.
While I was doing the patch, I noticed the SImode extract pattern could be
optimized on power8 for better extract when SImode is allowed in vector
registers.

For example, consider the code:

#include 
double vi_to_df (vector int v) { return (double)vec_extract (v, 0); }

The current trunk compiler generates (on a Power8 little endian system):

vspltw 0,2,3
mfvsrwz 9,32
mtvsrwa 1,9
fcfid 1,1

With these patches it generates:

vspltw 2,2,3
xvcvsxwdp 1,34

I also added support for conversions to SFmode, TFmode, IFmode, and on ISA 3.0,
KFmode using a DFmode temporary.

I did a bootstrap build and there were no regressions with these patches.  In
addition, I built spec 2006 with/without these patches.  In general, there was
no change.  The 416.gamess benchmark generated a few less xxlor's used as a
move instruction due to the improvements in the SImode extraction and when
464.h264ref is auto vectorized, one of the extracts converted to floating point
and generated the new code.  However, neither change affected the performance
of those two benchmarks.

Can I check this into the trunk?

[gcc]
2016-10-31  Michael Meissner  

* config/rs6000/vsx.md (VSX_EXTRACT_FL): New iterator for all
binary floating point types supported by the hardware except for
double.
(vsx_xvcvsxwdp_df): Provide scalar result alternative to the
vector instruction for optimizing extracting a SImode from a
V4SImode vector and converting it to floating point.
(vsx_xvcvuxwdp_df): Likewise.
(vsx_extract_si): On ISA 3.0, allow extract target and temporary
registers to be any VSX register.  Move stores to the end of the
constraints.
(vsx_extract_si_float_df): New combiner pattern and splitter
to optimize extracting a SImode from a V4SImode vector and
converting it to a binary floating point type supported by the
hardware.  Use the vector converts instead of extracting the
element, sign extending it, and then converting it to double.
Other floating point types  than double first convert to double,
then the double is converted to that type.
(vsx_extract_si_float_): Likewise.

[gcc/testsuite]
2016-10-31  Michael Meissner  

* gcc.target/powerpc/vsx-extract-4.c: New test.
* gcc.target/powerpc/vsx-extract-5.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 241631)
+++ gcc/config/rs6000/vsx.md(.../gcc/config/rs6000) (working copy)
@@ -288,6 +288,16 @@ (define_mode_attr VSX_EX [(V16QI "v")
  (V8HI  "v")
  (V4SI  "wa")])
 
+;; Mode iterator for binary floating types other than double to
+;; optimize convert to that floating point type from an extract
+;; of an integer type
+(define_mode_iterator VSX_EXTRACT_FL [SF
+ (IF "FLOAT128_2REG_P (IFmode)")
+ (KF "TARGET_FLOAT128_HW")
+ (TF "FLOAT128_2REG_P (TFmode)
+  || (FLOAT128_IEEE_P (TFmode)
+  && TARGET_FLOAT128_HW)")])
+
 ;; Iterator for the 2 short vector types to do a splat from an integer
 (define_mode_iterator VSX_SPLAT_I [V16QI V8HI])
 
@@ -1907,6 +1917,7 @@ (define_insn "vsx_xvcvuxdsp"
   [(set_attr "type" "vecdouble")])
 
 ;; Convert from 32-bit to 64-bit types
+;; Provide both vector and scalar targets
 (define_insn "vsx_xvcvsxwdp"
   [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
@@ -1915,6 +1926,14 @@ (define_insn "vsx_xvcvsxwdp"
   "xvcvsxwdp %x0,%x1"
   [(set_attr "type" "vecdouble")])
 
+(define_insn "vsx_xvcvsxwdp_df"
+  [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+   (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
+  UNSPEC_VSX_CVSXWDP))]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+  "xvcvsxwdp %x0,%x1"
+  [(set_attr "type" "vecdouble")])
+
 (define_insn "vsx_xvcvuxwdp"
   [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
(unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")]
@@ -1923,6 +1942,14 @@

Re: [PATCH] DWARF5 .debug_rnglists support

2016-10-31 Thread Jason Merrill

On Mon, Oct 31, 2016 at 3:33 PM, Jakub Jelinek  wrote:
> On Mon, Oct 31, 2016 at 02:42:15PM -0400, Jason Merrill wrote:
>> On 10/20/2016 02:52 PM, Jakub Jelinek wrote:
>> >@@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die)
>> > size += DWARF_OFFSET_SIZE;
>> >   break;
>> > case dw_val_class_range_list:
>> >-  size += DWARF_OFFSET_SIZE;
>> >+  if (dwarf_split_debug_info
>> >+  && dwarf_version >= 5
>> >+  && a->dw_attr_val.val_entry != RELOCATED_OFFSET)
>>
>> This test, here and in value_format, should be factored out into a separate
>> function with a comment explaining why you check dwarf_split_debug_info.
>
> Ok.  Another option would be to call value_format in size_of_die
> in this case and put the comment into value_format.

That works, too.

>> In general there's a lot of code duplication between the existing ranges
>> support and the new rnglist support; even the new vector is a superset of
>> the old one.  Why duplicate the code rather than modify it?
>
> The point was to conserve memory for the -gdwarf-{2,3,4} case.
> The old table needs just 4 bytes per entry, the new one 16 bytes per entry.
> The code duplication because of that is mainly in
> add_high_low_attributes - 24 lines duplicated.

I was also thinking of the output_rnglists function.

> At least while -gdwarf-4 is the default that looked to me like acceptable
> cost, but if you disagree, I can surely try to just grow the original
> table and use it for all versions.

Please.  I think if we're going in this direction we should just go
ahead with it; if the memory consumption is going to be a problem it
would be good to find that out so we can address it.

Jason

[PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-10-31 Thread Dominik Vogt

The attached patch does a little change in
combine.c:combine_simplify_rtx() to prevent a "simplification"
where the rtl code gets more complex in reality.  The complete
description of the change can be found in the commit comment in
the attached patch.

The patch reduces the number of patterns in the s390 backend and
slightly reduces the size of the compiled SPEC2006 code.  (Code
size or runtime only tested on s390x with -m64.)  It is
theoretically possible that this patch leads to somewhat worse
code on some target if that only has a pattern for the formerly replaced
rtl expression but not for the original one.

The patch has passed the testsuite on s390, s390x biarch, x86_64
and Power biarch.

--

(I'm not sure whether the const_int expression can appear in both
operands or only as the second.  If the latter is the case, the
conditions can be simplified a bit.)

What do you think about this patch?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* combine.c (if_then_else_cond): Suppress replacement of
"(and (reg) (const_int bit))" with "if_then_else".
>From bfa15721c760fa4cd9003050c3137cba3165139f Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 31 Oct 2016 09:00:31 +0100
Subject: [PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

combine_simplify_rtx() tries to replace rtx expressions with just two
possible values with an experession that uses if_then_else:

  (if_then_else (condition) (value1) (value2))

If the original expression is e.g.

  (and (reg) (const_int 2))

where the constant is the mask for a single bit, the replacement results
in a more complex expression than before:

  (if_then_else (ne (zero_extract (reg) (1) (31))) (2) (0))

Similar replacements are done for

  (signextend (and ...))
  (zeroextend (and ...))

Suppress the replacement this special case in if_then_else_cond().
---
 gcc/combine.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index b22a274..244669d 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9103,9 +9103,25 @@ if_then_else_cond (rtx x, rtx *ptrue, rtx *pfalse)
   return x;
 }
 
-  /* Likewise for 0 or a single bit.  */
+  /* Likewise for 0 or a single bit.
+ If the operation is an AND (possibly wrapped in a SIGN_EXTEND or
+ ZERO_EXTEND) with either operand being just a constant single bit value,
+ do nothing since IF_THEN_ELSE is likely to increase the expression's
+ complexity.  */
   else if (HWI_COMPUTABLE_MODE_P (mode)
-  && pow2p_hwi (nz = nonzero_bits (x, mode)))
+  && pow2p_hwi (nz = nonzero_bits (x, mode))
+  && ! (code == AND
+&& ((CONST_INT_P (XEXP (x, 0))
+ && UINTVAL (XEXP (x, 0)) == nz)
+|| (CONST_INT_P (XEXP (x, 1))
+&& UINTVAL (XEXP (x, 1)) == nz)))
+  && ! ((code == SIGN_EXTEND || code == ZERO_EXTEND)
+&& GET_CODE (XEXP (x, 0)) == AND
+&& ((CONST_INT_P (XEXP (XEXP (x, 0), 0))
+ && UINTVAL (XEXP (XEXP (x, 0), 0)) == nz)
+|| (CONST_INT_P (XEXP (XEXP (x, 0), 1))
+&& UINTVAL (XEXP (XEXP (x, 0), 1)) == nz)))
+ )
 {
   *ptrue = gen_int_mode (nz, mode), *pfalse = const0_rtx;
   return x;
-- 
2.3.0

Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module

2016-10-31 Thread Jason Merrill


On 10/21/2016 04:30 AM, Jiong Wang wrote:

All DW_OP_* of the expression are grouped together inside the PARALLEL,
and those operations which don't have RTL mapping are wrapped by
UNSPEC.  The parsing algorithm is simply something like:

  foreach elem inside PARALLEL
if (UNSPEC)
  {
dw_op_code = INTVAL (XVECEXP (elem, 0, 0));
oprnd1 = INTVAL (XVECEXP (elem, 0, 1));
oprnd2 = INTVAL (XVECEXP (elem, 0, 2));
  }
else
  call standard RTL parser.

Any comments on the approach?


If you're going to use UNSPEC, why not put the DWARF operator in the 
second operand?


Jason

Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type

2016-10-31 Thread Jason Merrill

On Mon, Oct 31, 2016 at 3:38 PM, Jakub Jelinek  wrote:
> On Mon, Oct 31, 2016 at 03:33:22PM -0400, Jason Merrill wrote:
>> OK.
>
> This patch needs the incremental
> http://gcc.gnu.org/ml/gcc-patches/2016-10/msg02440.html
> patch (now successfully bootstrapped/regtested on x86_64-linux and
> i686-linux).  Is that one ok too?

I meant to imply that by replying to that message. :)

Yes, both are OK.

Jason

Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 03:33:22PM -0400, Jason Merrill wrote:
> OK.

This patch needs the incremental
http://gcc.gnu.org/ml/gcc-patches/2016-10/msg02440.html
patch (now successfully bootstrapped/regtested on x86_64-linux and
i686-linux).  Is that one ok too?
I think it would be better to commit those two in a single commit.

Jakub

Re: [PATCH] DWARF5 .debug_rnglists support

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 02:42:15PM -0400, Jason Merrill wrote:
> On 10/20/2016 02:52 PM, Jakub Jelinek wrote:
> >@@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die)
> > size += DWARF_OFFSET_SIZE;
> >   break;
> > case dw_val_class_range_list:
> >-  size += DWARF_OFFSET_SIZE;
> >+  if (dwarf_split_debug_info
> >+  && dwarf_version >= 5
> >+  && a->dw_attr_val.val_entry != RELOCATED_OFFSET)
> 
> This test, here and in value_format, should be factored out into a separate
> function with a comment explaining why you check dwarf_split_debug_info.

Ok.  Another option would be to call value_format in size_of_die
in this case and put the comment into value_format.

> In general there's a lot of code duplication between the existing ranges
> support and the new rnglist support; even the new vector is a superset of
> the old one.  Why duplicate the code rather than modify it?

The point was to conserve memory for the -gdwarf-{2,3,4} case.
The old table needs just 4 bytes per entry, the new one 16 bytes per entry.
The code duplication because of that is mainly in
add_high_low_attributes - 24 lines duplicated.
At least while -gdwarf-4 is the default that looked to me like acceptable
cost, but if you disagree, I can surely try to just grow the original
table and use it for all versions.

Jakub

Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type

2016-10-31 Thread Jason Merrill

OK.

Re: [PATCH] DWARF5 - Emit DW_AT_rank and DW_TAG_generic_subrange for assumed-rank arrays

2016-10-31 Thread Jason Merrill

OK.

Re: [PATCH] DWARF5 .debug_line{,_str} support

2016-10-31 Thread Jason Merrill


On 10/31/2016 02:38 PM, Jakub Jelinek wrote:

On Mon, Oct 31, 2016 at 11:25:26AM -0400, Jason Merrill wrote:

On 10/19/2016 07:30 PM, Jakub Jelinek wrote:

This patch adds support for DWARF5 .debug_line{,_str} section format,
though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise
.debug_line is emitted by the assembler.  For that we'll need some
coordination with gas, dunno if we want a new as directive for that,
or as command line switch, or perhaps key DWARF5 .debug_line on the
presence of .debug_line_str section (though the last one probably isn't a
good idea, because -gsplit-dwarf doesn't use .debug_line_str).


Could gas pay attention to the DWARF version in a unit header?


Maybe.  Guess it needs to be decided in binutils and then we adjust
gcc if any changes are needed.


+  if (ndirs + idx_offset <= 256)
+   idx_form = DW_FORM_data1;
+  else if (ndirs + idx_offset <= 65536)
+   {
+ unsigned HOST_WIDE_INT sum = 1;
+ for (i = 0; i < numfiles; i++)
+   {
+ int file_idx = backmap[i];
+ int dir_idx = dirs[files[file_idx].dir_idx].dir_idx;
+ sum += size_of_uleb128 (dir_idx);
+   }
+ if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1))
+   idx_form = DW_FORM_data2;
+   }


Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we need
to look at the files table to choose DW_FORM_data2?  This at least needs a
comment.


I'll add a comment.  The reason is that DW_FORM_data1, if it can be used,
is always more compact than (or as compact as) DW_FORM_udata.  While the
choice in between DW_FORM_data2 and DW_FORM_udata if we have 256 to 65535
directories needs to be decided by computing both the size it would take
for all the uleb128s and compare that to the size it takes to encode all
as data2.


+   dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2,
+0, NULL);


It seems that we could use an output_data function that takes a DW_FORM as
one of its arguments and does the appropriate thing.


Had a brief look at this, but it seems hard without big changes in
dwarf2asm.[ch] or lots of duplication.  The problem is that
dw2_asm_output_data* are varargs functions, so if we add some wrapper
that picks what to call based on a enum dwarf_form argument, we'd need to
convert all of them to take va_list instead, and add various further
wrappers.  Is this ok to leave this part out?


+  if (dwarf_version >= 5 && str_form == DW_FORM_line_strp)
+   {
+ node = find_AT_string_in_table (filename0, debug_line_str_hash);
+ set_indirect_string (node);
+ node->form = DW_FORM_line_strp;
+ dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+debug_line_str_section,
+"File Entry: %#x: \"%s\"", 0, node->str);
+   }
+  else
+   dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0);


Please factor this out into a separate function.


So like
static void
output_line_string (enum dwarf_form form, const char *str,
const char *entry_kind, unsigned int idx)
in the patch below?


+  if (!DWARF2_ASM_LINE_DEBUG_INFO
+  && dwarf_version >= 5
+  && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+  || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0
+ /* FIXME: For now.  */
+  || dwarf_split_debug_info))


This should also be a separate function or macro, since it's used more than
once.  And please elaborate the FIXME comment.


Like DWARF5_USE_DEBUG_LINE_STR macro in the patch below?

For -gsplit-dwarf, the problem is I'm really not familiar with what tools
are used post assembly for this (and likely the tools need adjustment
for DWARF5 anyway), so what is in the patches is just partial support what
appeared to be easily done by myself.  For the rest I've been hoping Cary
or somebody else could finish it up.


OK.

Jason

Re: [PATCH] DWARF5 .debug_rnglists support

2016-10-31 Thread Jason Merrill


On 10/20/2016 02:52 PM, Jakub Jelinek wrote:

@@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die)
 size += DWARF_OFFSET_SIZE;
  break;
case dw_val_class_range_list:
-  size += DWARF_OFFSET_SIZE;
+ if (dwarf_split_debug_info
+ && dwarf_version >= 5
+ && a->dw_attr_val.val_entry != RELOCATED_OFFSET)


This test, here and in value_format, should be factored out into a 
separate function with a comment explaining why you check 
dwarf_split_debug_info.


In general there's a lot of code duplication between the existing ranges 
support and the new rnglist support; even the new vector is a superset 
of the old one.  Why duplicate the code rather than modify it?


Jason

Re: [PATCH] DWARF5 .debug_line{,_str} support

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 11:25:26AM -0400, Jason Merrill wrote:
> On 10/19/2016 07:30 PM, Jakub Jelinek wrote:
> >This patch adds support for DWARF5 .debug_line{,_str} section format,
> >though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise
> >.debug_line is emitted by the assembler.  For that we'll need some
> >coordination with gas, dunno if we want a new as directive for that,
> >or as command line switch, or perhaps key DWARF5 .debug_line on the
> >presence of .debug_line_str section (though the last one probably isn't a
> >good idea, because -gsplit-dwarf doesn't use .debug_line_str).
> 
> Could gas pay attention to the DWARF version in a unit header?

Maybe.  Guess it needs to be decided in binutils and then we adjust
gcc if any changes are needed.

> >+  if (ndirs + idx_offset <= 256)
> >+idx_form = DW_FORM_data1;
> >+  else if (ndirs + idx_offset <= 65536)
> >+{
> >+  unsigned HOST_WIDE_INT sum = 1;
> >+  for (i = 0; i < numfiles; i++)
> >+{
> >+  int file_idx = backmap[i];
> >+  int dir_idx = dirs[files[file_idx].dir_idx].dir_idx;
> >+  sum += size_of_uleb128 (dir_idx);
> >+}
> >+  if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1))
> >+idx_form = DW_FORM_data2;
> >+}
> 
> Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we need
> to look at the files table to choose DW_FORM_data2?  This at least needs a
> comment.

I'll add a comment.  The reason is that DW_FORM_data1, if it can be used,
is always more compact than (or as compact as) DW_FORM_udata.  While the
choice in between DW_FORM_data2 and DW_FORM_udata if we have 256 to 65535
directories needs to be decided by computing both the size it would take
for all the uleb128s and compare that to the size it takes to encode all
as data2.

> >+dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2,
> >+ 0, NULL);
> 
> It seems that we could use an output_data function that takes a DW_FORM as
> one of its arguments and does the appropriate thing.

Had a brief look at this, but it seems hard without big changes in
dwarf2asm.[ch] or lots of duplication.  The problem is that
dw2_asm_output_data* are varargs functions, so if we add some wrapper
that picks what to call based on a enum dwarf_form argument, we'd need to
convert all of them to take va_list instead, and add various further
wrappers.  Is this ok to leave this part out?

> >+  if (dwarf_version >= 5 && str_form == DW_FORM_line_strp)
> >+{
> >+  node = find_AT_string_in_table (filename0, debug_line_str_hash);
> >+  set_indirect_string (node);
> >+  node->form = DW_FORM_line_strp;
> >+  dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
> >+ debug_line_str_section,
> >+ "File Entry: %#x: \"%s\"", 0, node->str);
> >+}
> >+  else
> >+dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0);
> 
> Please factor this out into a separate function.

So like
static void
output_line_string (enum dwarf_form form, const char *str,
const char *entry_kind, unsigned int idx)
in the patch below?

> >+  if (!DWARF2_ASM_LINE_DEBUG_INFO
> >+  && dwarf_version >= 5
> >+  && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
> >+   || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0
> >+  /* FIXME: For now.  */
> >+   || dwarf_split_debug_info))
> 
> This should also be a separate function or macro, since it's used more than
> once.  And please elaborate the FIXME comment.

Like DWARF5_USE_DEBUG_LINE_STR macro in the patch below?

For -gsplit-dwarf, the problem is I'm really not familiar with what tools
are used post assembly for this (and likely the tools need adjustment
for DWARF5 anyway), so what is in the patches is just partial support what
appeared to be easily done by myself.  For the rest I've been hoping Cary
or somebody else could finish it up.

Jakub
2016-10-31  Jakub Jelinek  

* dwarf2out.c (debug_line_str_section): New variable.
(debug_line_str_hash): Likewise.
(DEBUG_LINE_STR_SECTION): Define.
(set_indirect_string): Handle DW_FORM_line_strp like
DW_FORM_strp.
(find_string_form): Fix up formatting.
(size_of_die): Handle DW_FORM_line_strp like DW_FORM_strp.
Fix up indentation.
(output_die): Handle DW_FORM_line_strp.
(DWARF5_USE_DEBUG_LINE_STR): Define.
(output_line_string): New function.
(output_file_names): Add -gdwarf-5 support.
(output_line_info): Likewise.
(init_sections_and_labels): Initialize debug_line_str_section.
(output_indirect_string): Change 2nd argument from void *
to enum dwarf_form form, compare with form rather than
DW_FORM_strp.
(output_indirect_strings): Pass DW_FORM_strp to
output_indirect_string traversion.
(dwarf2out_finish):

[RFC][PATCH][AArch64] Cleanup frame pointer usage

2016-10-31 Thread Wilco Dijkstra

This patch cleans up all code related to the frame pointer.  On AArch64 we
emit a frame chain even in cases where the frame pointer is not required.
So make this explicit by introducing a boolean emit_frame_chain in
aarch64_frame record.

When the frame pointer is enabled but not strictly required (eg. no use of
alloca), we emit a frame chain in non-leaf functions, but continue to use the
stack pointer to access locals.  This results in smaller code and unwind info.

Also simplify the complex logic in aarch64_override_options_after_change_1
and compute whether the frame chain is required in aarch64_layout_frame
instead.  As a result aarch64_frame_pointer_required is now redundant and
aarch64_can_eliminate can be greatly simplified.

Finally convert all callee save/restore functions to use gen_frame_mem.

Bootstrap OK. Any comments?

ChangeLog:
2016-10-31  Wilco Dijkstra  

gcc/
* config/aarch64/aarch64.h (aarch64_frame):
Add emit_frame_chain boolean.
* config/aarch64/aarch64.c (aarch64_frame_pointer_required)
Remove.
(aarch64_layout_frame): Initialise emit_frame_chain.
(aarch64_pushwb_single_reg): Use gen_frame_mem.
(aarch64_pop_regs): Likewise.
(aarch64_gen_load_pair): Likewise.
(aarch64_save_callee_saves): Likewise.
(aarch64_restore_callee_saves): Likewise.
(aarch64_expand_prologue): Use emit_frame_chain.
(aarch64_can_eliminate): Simplify. When FP needed or outgoing
arguments are large, eliminate to FP, otherwise SP.
(aarch64_override_options_after_change_1): Simplify.
(TARGET_FRAME_POINTER_REQUIRED): Remove define.

--

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
fa81e4b853daf08842955288861ec7e7acca..6e32dc9f6f171dde0c182fdd7857230251f71712
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -583,6 +583,9 @@ struct GTY (()) aarch64_frame
   /* The size of the stack adjustment after saving callee-saves.  */
   HOST_WIDE_INT final_adjust;
 
+  /* Store FP,LR and setup a frame pointer.  */
+  bool emit_frame_chain;
+
   unsigned wb_candidate1;
   unsigned wb_candidate2;
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f07d771ea343803e054e03f59c8c1efb698bf474..6c06ac18d16f8afa7ee1cc5e8530e285a60e2b0f
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2728,24 +2728,6 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
-static bool
-aarch64_frame_pointer_required (void)
-{
-  /* In aarch64_override_options_after_change
- flag_omit_leaf_frame_pointer turns off the frame pointer by
- default.  Turn it back on now if we've not got a leaf
- function.  */
-  if (flag_omit_leaf_frame_pointer
-  && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
-return true;
-
-  /* Force a frame pointer for EH returns so the return address is at FP+8.  */
-  if (crtl->calls_eh_return)
-return true;
-
-  return false;
-}
-
 /* Mark the registers that need to be saved by the callee and calculate
the size of the callee-saved registers area and frame record (both FP
and LR may be omitted).  */
@@ -2758,6 +2740,18 @@ aarch64_layout_frame (void)
   if (reload_completed && cfun->machine->frame.laid_out)
 return;
 
+  /* Force a frame chain for EH returns so the return address is at FP+8.  */
+  cfun->machine->frame.emit_frame_chain
+= frame_pointer_needed || crtl->calls_eh_return;
+
+  /* Emit a frame chain if the frame pointer is enabled.
+ If -momit-leaf-frame-pointer is used, do not use a frame chain
+ in leaf functions which do not use LR.  */
+  if (flag_omit_frame_pointer == 2
+  && !(flag_omit_leaf_frame_pointer && crtl->is_leaf
+  && !df_regs_ever_live_p (LR_REGNUM)))
+cfun->machine->frame.emit_frame_chain = true;
+
 #define SLOT_NOT_REQUIRED (-2)
 #define SLOT_REQUIRED (-1)
 
@@ -2789,7 +2783,7 @@ aarch64_layout_frame (void)
&& !call_used_regs[regno])
   cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
-  if (frame_pointer_needed)
+  if (cfun->machine->frame.emit_frame_chain)
 {
   /* FP and LR are placed in the linkage record.  */
   cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
@@ -2937,7 +2931,7 @@ aarch64_pushwb_single_reg (machine_mode mode, unsigned 
regno,
   reg = gen_rtx_REG (mode, regno);
   mem = gen_rtx_PRE_MODIFY (Pmode, base_rtx,
plus_constant (Pmode, base_rtx, -adjustment));
-  mem = gen_rtx_MEM (mode, mem);
+  mem = gen_frame_mem (mode, mem);
 
   insn = emit_move_insn (mem, reg);
   RTX_FRAME_RELATED_P (insn) = 1;
@@ -3011,7 +3005,7 @@ aarch64_pop_regs (unsigned regno1, unsigned regno2, 
HOST_WIDE_INT adjustment,
 {
   rtx mem = plus_constant (Pmode, stack_pointer_rtx, adjustment);
   mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
-  emit_move_insn (reg1,

A new branch 'ira-select' created

2016-10-31 Thread Vladimir N Makarov


I've created a new branch ira-select for work on experimental
algorithm of calculations of pseudo register classes.

The current algorithm in IRA (file ira-costs.c) was adopted from old
regclass.c.  It has a big disadvantage which is in ignoring a fact
that operands should be in the same alternative when we calculate reg
class costs for an insn pseudos.

The new algorithm is based on choosing insn alternative and then
calculation of pseudo reg class costs knowing the alternative.  As the
current algorithm the new one is a two pass algorithm. On the second
pass we use pseudo reg classes chosen on the first pass to choose
again an insn alternative better and calculate reg class costs more
accurately.  A special treatment of pseudo move insns is done as they
have too many possible alternatives.

In some way the new algorithm can be considered as a more explicit
early code selection than the current algorithm is (the final code
selection in GCC is actually done in LRA).

On the branch the new algorithm is switched on by default.  To switch
it off, use option -fno-ira-select.

Currently, the new algorithm improves x86-64 SPECINT2000 by 0.5% and
speed up the compiler on it by 1%.  Still there are some problems with
code.  They are a bigger code for SPECFP2000 and some failures on GCC
testsuite (unexpected code generation).

This code is experimental, the work is not finished yet (I publish
this code as I promised to show it to some people on last Cauldron).
The code might become a part of GCC8 if it works well for variety of
targets.

Re: [PATCH] dwarf2cfi: Dump row differences before asserting

2016-10-31 Thread Kyrill Tkachov



On 31/10/16 17:15, Segher Boessenkool wrote:

If maybe_record_trace_start fails because the CFI is inconsistent on two
paths into a block it currently just ICEs.  This changes it to also dump
the CFI on those two paths in the dump file; debugging it without that
information is hopeless.

Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?


+1 from me as a variant of this patch has helped me debug issues
with the CFI.

Kyrill



Segher


2016-10-31  Segher Boessenkool  

* dwarf2cfi.c (dump_cfi_row): Add forward declaration.
(maybe_record_trace_start): If the CFI is different on the new and
old paths, print out both to the dump file before ICEing.

---
  gcc/dwarf2cfi.c | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index da9da52..19edc28 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -2239,6 +2239,8 @@ add_cfis_to_fde (void)
  }
  }
  
+static void dump_cfi_row (FILE *f, dw_cfi_row *row);

+
  /* If LABEL is the start of a trace, then initialize the state of that
 trace from CUR_TRACE and CUR_ROW.  */
  
@@ -2282,7 +2284,21 @@ maybe_record_trace_start (rtx_insn *start, rtx_insn *origin)

/* We ought to have the same state incoming to a given trace no
 matter how we arrive at the trace.  Anything else means we've
 got some kind of optimization error.  */
-  gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
+#if CHECKING_P
+  if (!cfi_row_equal_p (cur_row, ti->beg_row))
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file, "Inconsistent CFI state!\n");
+ fprintf (dump_file, "SHOULD have:\n");
+ dump_cfi_row (dump_file, ti->beg_row);
+ fprintf (dump_file, "DO have:\n");
+ dump_cfi_row (dump_file, cur_row);
+   }
+
+ gcc_unreachable ();
+   }
+#endif
  
/* The args_size is allowed to conflict if it isn't actually used.  */

if (ti->beg_true_args_size != args_size)

[PATCH] Fall-through warnings in combine.c

2016-10-31 Thread Fritz Reese

I think this is obvious, the following code in combine.c which intends
to fallthrough some switch cases is not marked with 'gcc_fallthrough'
as it should be, thus producing warnings during compilation.

---
Fritz Reese


2016-10-31  Fritz Reese  

gcc/combine.c (simplify_compare_const): Mark fallthrough cases.
diff --git a/gcc/combine.c b/gcc/combine.c
index 60a127f..6cbb8cf 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11414,6 +11414,7 @@ simplify_compare_const (enum rtx_code code, 
machine_mode mode,
  const_op -= 1;
  code = LE;
  /* ... fall through to LE case below.  */
+ gcc_fallthrough ();
}
   else
break;
@@ -11443,6 +11444,7 @@ simplify_compare_const (enum rtx_code code, 
machine_mode mode,
  const_op -= 1;
  code = GT;
  /* ... fall through to GT below.  */
+ gcc_fallthrough ();
}
   else
break;

[PATCH] dwarf2cfi: Dump row differences before asserting

2016-10-31 Thread Segher Boessenkool

If maybe_record_trace_start fails because the CFI is inconsistent on two
paths into a block it currently just ICEs.  This changes it to also dump
the CFI on those two paths in the dump file; debugging it without that
information is hopeless.

Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?


Segher


2016-10-31  Segher Boessenkool  

* dwarf2cfi.c (dump_cfi_row): Add forward declaration.
(maybe_record_trace_start): If the CFI is different on the new and
old paths, print out both to the dump file before ICEing.

---
 gcc/dwarf2cfi.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index da9da52..19edc28 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -2239,6 +2239,8 @@ add_cfis_to_fde (void)
 }
 }
 
+static void dump_cfi_row (FILE *f, dw_cfi_row *row);
+
 /* If LABEL is the start of a trace, then initialize the state of that
trace from CUR_TRACE and CUR_ROW.  */
 
@@ -2282,7 +2284,21 @@ maybe_record_trace_start (rtx_insn *start, rtx_insn 
*origin)
   /* We ought to have the same state incoming to a given trace no
 matter how we arrive at the trace.  Anything else means we've
 got some kind of optimization error.  */
-  gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
+#if CHECKING_P
+  if (!cfi_row_equal_p (cur_row, ti->beg_row))
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file, "Inconsistent CFI state!\n");
+ fprintf (dump_file, "SHOULD have:\n");
+ dump_cfi_row (dump_file, ti->beg_row);
+ fprintf (dump_file, "DO have:\n");
+ dump_cfi_row (dump_file, cur_row);
+   }
+
+ gcc_unreachable ();
+   }
+#endif
 
   /* The args_size is allowed to conflict if it isn't actually used.  */
   if (ti->beg_true_args_size != args_size)
-- 
1.9.3

Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}

2016-10-31 Thread Jason Merrill


On 10/31/2016 01:04 PM, Jakub Jelinek wrote:

On Mon, Oct 31, 2016 at 12:58:23PM -0400, Jason Merrill wrote:

On 10/14/2016 03:41 PM, Jakub Jelinek wrote:

Another set of GNU extensions that were accepted into DWARF5, so we should
emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use
the accepted standard opcodes instead of the corresponding GNU ones.


Hmm, I wonder if we also want to use the standard opcodes instead of the GNU
ones in -gdwarf-4 and below.


Maybe in 2 years, but not now IMHO.  The GNU opcodes are supported by lots of
consumers for often as long as 5 years, while the DWARF5 ones might be
supported next year.  So, emitting them now by default would mean e.g.
most of the guality tests would fail until one updates to the not yet
existing GDB.  Similar reason why we IMHO shouldn't enable -gdwarf-5 by
default in GCC 7, but should wait for GCC 8+ with that.


Makes sense.  The patch is OK.

Jason

Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 12:58:23PM -0400, Jason Merrill wrote:
> On 10/14/2016 03:41 PM, Jakub Jelinek wrote:
> >Another set of GNU extensions that were accepted into DWARF5, so we should
> >emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use
> >the accepted standard opcodes instead of the corresponding GNU ones.
> 
> Hmm, I wonder if we also want to use the standard opcodes instead of the GNU
> ones in -gdwarf-4 and below.

Maybe in 2 years, but not now IMHO.  The GNU opcodes are supported by lots of
consumers for often as long as 5 years, while the DWARF5 ones might be
supported next year.  So, emitting them now by default would mean e.g.
most of the guality tests would fail until one updates to the not yet
existing GDB.  Similar reason why we IMHO shouldn't enable -gdwarf-5 by
default in GCC 7, but should wait for GCC 8+ with that.

Jakub

Re: [C++ PATCH] Fix postfix-expression parsing (PR c++/78089, take 2)

2016-10-31 Thread Jason Merrill

OK.

Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}

2016-10-31 Thread Jason Merrill


On 10/14/2016 03:41 PM, Jakub Jelinek wrote:

Another set of GNU extensions that were accepted into DWARF5, so we should
emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use
the accepted standard opcodes instead of the corresponding GNU ones.


Hmm, I wonder if we also want to use the standard opcodes instead of the 
GNU ones in -gdwarf-4 and below.


Jason

Re: [PATCH] DWARF5 .debug_loclists support

2016-10-31 Thread Jason Merrill


All the different cases in output_loc_list could use brief comments.

Let's keep loc_list_idx, output_loclists_offsets, and 
assign_location_list_indexes together in the file.


OK with those changes.

Jason

Re: [PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly

2016-10-31 Thread Vineet Gupta

On 10/31/2016 09:45 AM, Andreas Schwab wrote:
> On Okt 31 2016, Vineet Gupta  wrote:
>
>> This silences tons of -Wformat= warnings when building ARC Linux kernel
>> with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had
>> similar fix.
>>
>> gcc/
>> 2016-10-28  Vineet Gupta 
>>
>> * config/arc/arc.h (SIZE_TYPE): define as unsigned int.
>>  * (PTRDIFF_TYPE): define as int.
>>
>> Signed-off-by: Vineet Gupta 
>> ---
>>  gcc/config/arc/arc.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
>> index cf13a7fa9fc3..0b579c44ee12 100644
>> --- a/gcc/config/arc/arc.h
>> +++ b/gcc/config/arc/arc.h
>> @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT  \
>>  #define DEFAULT_SIGNED_CHAR 0
>>  
>>  #undef SIZE_TYPE
>> -#define SIZE_TYPE "long unsigned int"
>> +#define SIZE_TYPE "unsigned int"
>>  
>>  #undef PTRDIFF_TYPE
>> -#define PTRDIFF_TYPE "long int"
>> +#define PTRDIFF_TYPE "int"
>>  
>>  #undef WCHAR_TYPE
>>  #define WCHAR_TYPE "int"
> This changes the C++ ABI.  You need to fix the kernel instead, see
> arch/avr32/include/uapi/asm/posix_types.h for how to do that.

There's ABI change one way or the other. I'd rather fix the upstream to have the
ABI we want to continue with in the long run and match with the ABI we have in 
our
off-upstream 4.8 and 6.x versions !

Thx,
-Vineet

[PATCH-Resend] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly

2016-10-31 Thread Vineet Gupta

This silences tons of -Wformat= warnings when building ARC Linux kernel
with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had
similar fix.

gcc/
2016-10-28  Vineet Gupta 

* config/arc/arc.h (SIZE_TYPE): define as unsigned int.
* (PTRDIFF_TYPE): define as int.

Signed-off-by: Vineet Gupta 
---
Fat fingered Andrew's address in prev posting
---
 gcc/config/arc/arc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index cf13a7fa9fc3..0b579c44ee12 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \
 #define DEFAULT_SIGNED_CHAR 0
 
 #undef SIZE_TYPE
-#define SIZE_TYPE "long unsigned int"
+#define SIZE_TYPE "unsigned int"
 
 #undef PTRDIFF_TYPE
-#define PTRDIFF_TYPE "long int"
+#define PTRDIFF_TYPE "int"
 
 #undef WCHAR_TYPE
 #define WCHAR_TYPE "int"
-- 
2.7.4

Re: [C++ PATCH] Add tests for a const member and a reference member for launder.

2016-10-31 Thread Jason Merrill

OK, thanks.

On Mon, Oct 31, 2016 at 12:13 PM, Ville Voutilainen
 wrote:
> On 31 October 2016 at 15:31, Jason Merrill  wrote:
>> Good idea.  You might put the reuse in a separate function in order to
>> hide it from the optimizer.
>
>
> Ok, new patch, tested on Linux-x64, ok for trunk?
>
> 2016-10-31  Ville Voutilainen  
>
> Add tests for a const member and a reference member for launder.
> * g++.dg/cpp1z/launder3.C: New.
> * g++.dg/cpp1z/launder4.C: Likewise.
> * g++.dg/cpp1z/launder5.C: Likewise.
> * g++.dg/cpp1z/launder5.cc: Likewise.
> * g++.dg/cpp1z/launder5.h: Likewise.
> * g++.dg/cpp1z/launder6.C: Likewise.
> * g++.dg/cpp1z/launder6.cc: Likewise.
> * g++.dg/cpp1z/launder6.h: Likewise.

Re: [PATCH] DWARF5 DW_FORM_data16 support

2016-10-31 Thread Jason Merrill

On Mon, Oct 31, 2016 at 12:44 PM, Jakub Jelinek  wrote:
> On Mon, Oct 31, 2016 at 10:15:11AM -0400, Jason Merrill wrote:
>> On 10/17/2016 05:39 PM, Jakub Jelinek wrote:
>> >(dwarf_version >= 5 ? 128 : 64)
>>
>> Please make this a macro.  OK with that change.
>
> DWARF_LARGEST_DATA_FORM_BITS ?  Or do you have better name?

Sounds good.

Jason

Re: [PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly

2016-10-31 Thread Andreas Schwab

On Okt 31 2016, Vineet Gupta  wrote:

> This silences tons of -Wformat= warnings when building ARC Linux kernel
> with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had
> similar fix.
>
> gcc/
> 2016-10-28  Vineet Gupta 
>
> * config/arc/arc.h (SIZE_TYPE): define as unsigned int.
>   * (PTRDIFF_TYPE): define as int.
>
> Signed-off-by: Vineet Gupta 
> ---
>  gcc/config/arc/arc.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
> index cf13a7fa9fc3..0b579c44ee12 100644
> --- a/gcc/config/arc/arc.h
> +++ b/gcc/config/arc/arc.h
> @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT   \
>  #define DEFAULT_SIGNED_CHAR 0
>  
>  #undef SIZE_TYPE
> -#define SIZE_TYPE "long unsigned int"
> +#define SIZE_TYPE "unsigned int"
>  
>  #undef PTRDIFF_TYPE
> -#define PTRDIFF_TYPE "long int"
> +#define PTRDIFF_TYPE "int"
>  
>  #undef WCHAR_TYPE
>  #define WCHAR_TYPE "int"

This changes the C++ ABI.  You need to fix the kernel instead, see
arch/avr32/include/uapi/asm/posix_types.h for how to do that.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH] DWARF5 DW_TAG_call_site support

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 10:13:16AM -0400, Jason Merrill wrote:
> OK.

This patch is on top of the
- DWARF5 http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01202.html
   
  - DW_OP_{implicit_pointer,entry_value,{const,regval,deref}_type}  
   
and DW_OP_{convert,reinterpret} -gdwarf-5 support   
   
patch and depends on the functions added by that patch.

Jakub

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-10-31 Thread Martin Sebor


On 10/31/2016 06:39 AM, Tobias Burnus wrote:

Martin Sebor wrote:

Attached is an updated patch that adds checks for excessive sizes
and bounds (those in excess of SIZE_MAX / 2), and also enables
the same checking for strcat and strncat).  This version also
fixes an issue with the interpretation of anti-ranges in the
first patch.  The improvements exposed two bugs in the regression
tests.


If I apply this patch to my local trunk - and try to bootstrap GCC,
bootstrapping fails (on x86-64_gnu-linux) as following. I have not
tried to figure out whether the warning (-Werror) makes sense or not.

../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’:
../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t) writing 8 
bytes into a region of size 0 overflows the destination 
[-Werror=stringop-overflow]
   memset (_DATA (note), 0, sizeof (NOTE_DATA (note)));


where NOTE_DATA is defined in rtl.h as

/* Opaque data.  */
#define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE)


Thanks for trying it out!  The patch bootstrapped and passed regression
tests for me yesterday, also on x86_64, and today on powepc64le, but
just now I reproduced the error with the top of today's trunk on
x86_64.  I think the error is justified because the call to memset
in the make_note_raw function references a fourth element
(rtx_note::rtx_insn::rtx_def::u.fld[3]) of what is just a single
element array.  After macro expansion, the memset call itself:

  memset (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3])));

is within the bounds of the object pointed to by note but the index
3 is out of bounds for note->u.fld and so undefined.  An unpatched
GCC issues a similar error when the call to memset is replaced with
__builtin___memset_chk like so:

  __builtin___memset_chk (&((note)->u.fld[3]), 0,
  sizeof (((note)->u.fld[3])),
  __builtin_object_size (&((note)->u.fld[3]), 1));

/src/gcc/53562/gcc/emit-rtl.c: In function ‘rtx_note* 
make_note_raw(insn_note)’:
/src/gcc/53562/gcc/emit-rtl.c:3941:53: warning: call to void* 
__builtin___memset_chk(void*, int, long unsigned int, long unsigned int) 
will always overflow destination buffer


The code can be made valid and the warning avoided by deriving
the same address not from an invalid pointer/subscript but rather
from a pointer to the beginning of the union itself and adding
the offset of the fourth element, like so:

  char *p = (char*) &(note)->u.fld[0];
  p += sizeof (((note)->u.fld[0])) * 3;
  memset (p, 0, sizeof *p);

Unfortunately, because the invalid subscript is the result of
the expansion of the RTL_CHECK1() macro fixing this isn't as
straightforward as that.  What really should be fixed is the macro
itself.  Until then, it can be worked (hacked would be a better
term) around it by storing the result of _DATA (note)
expression in a volatile pointer, like so:

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 8afcfbe..6dd9439 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3930,7 +3930,8 @@ make_note_raw (enum insn_note subtype)
   INSN_UID (note) = cur_insn_uid++;
   NOTE_KIND (note) = subtype;
   BLOCK_FOR_INSN (note) = NULL;
-  memset (_DATA (note), 0, sizeof (NOTE_DATA (note)));
+  void* volatile p = _DATA (note);
+  memset (p, 0, sizeof (NOTE_DATA (note)));
   return note;
 }

Martin

Re: [PATCH] DWARF5 DW_FORM_data16 support

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 10:15:11AM -0400, Jason Merrill wrote:
> On 10/17/2016 05:39 PM, Jakub Jelinek wrote:
> >(dwarf_version >= 5 ? 128 : 64)
> 
> Please make this a macro.  OK with that change.

DWARF_LARGEST_DATA_FORM_BITS ?  Or do you have better name?

Jakub

[PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly

2016-10-31 Thread Vineet Gupta

This silences tons of -Wformat= warnings when building ARC Linux kernel
with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had
similar fix.

gcc/
2016-10-28  Vineet Gupta 

* config/arc/arc.h (SIZE_TYPE): define as unsigned int.
* (PTRDIFF_TYPE): define as int.

Signed-off-by: Vineet Gupta 
---
 gcc/config/arc/arc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index cf13a7fa9fc3..0b579c44ee12 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \
 #define DEFAULT_SIGNED_CHAR 0
 
 #undef SIZE_TYPE
-#define SIZE_TYPE "long unsigned int"
+#define SIZE_TYPE "unsigned int"
 
 #undef PTRDIFF_TYPE
-#define PTRDIFF_TYPE "long int"
+#define PTRDIFF_TYPE "int"
 
 #undef WCHAR_TYPE
 #define WCHAR_TYPE "int"
-- 
2.7.4

Re: [PATCH] Fix reassoc NEGATE_EXPR handling (PR tree-optimization/77860)

2016-10-31 Thread Richard Biener

On October 31, 2016 4:50:08 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>As the testcase shows, while reassociation of vector multiplication
>has been enabled, the eliminate_using_constants optimization
>isn't enabled for integral vectors (only for floating point vectors),
>so e.g. if we have more than one { ~0, ~0, ... } vectors in there
>coming from NEGATE_EXPRs, we don't actually fold them into a single
>{ ~0, ~0, ... } or { 1, 1, ... } vector (the former then into
>NEGATE_EXPR on the result, the latter removed altogether).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-10-31  Jakub Jelinek  
>
>   PR tree-optimization/77860
>   * tree-ssa-reassoc.c (eliminate_using_constants): Handle
>   also integral complex and vector constants.
>
>   * gcc.dg/pr77860.c: New test.
>
>--- gcc/tree-ssa-reassoc.c.jj  2016-10-31 13:28:12.0 +0100
>+++ gcc/tree-ssa-reassoc.c 2016-10-31 13:47:00.364216248 +0100
>@@ -924,7 +924,7 @@ eliminate_using_constants (enum tree_cod
>   tree type = TREE_TYPE (oelast->op);
> 
>   if (oelast->rank == 0
>-  && (INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type)))
>+  && (ANY_INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type)))
> {
>   switch (opcode)
>   {
>--- gcc/testsuite/gcc.dg/pr77860.c.jj  2016-10-31 13:50:34.019649814
>+0100
>+++ gcc/testsuite/gcc.dg/pr77860.c 2016-10-31 13:49:35.0 +0100
>@@ -0,0 +1,13 @@
>+/* PR tree-optimization/77860 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-forwprop -Wno-psabi" } */
>+
>+typedef unsigned short V __attribute__((vector_size (16)));
>+
>+V
>+foo (V x, V y)
>+{
>+  V a = -x;
>+  V b = -y;
>+  return a * b;
>+}
>
>   Jakub

Re: [C++ PATCH] Add tests for a const member and a reference member for launder.

2016-10-31 Thread Ville Voutilainen

On 31 October 2016 at 15:31, Jason Merrill  wrote:
> Good idea.  You might put the reuse in a separate function in order to
> hide it from the optimizer.


Ok, new patch, tested on Linux-x64, ok for trunk?

2016-10-31  Ville Voutilainen  

Add tests for a const member and a reference member for launder.
* g++.dg/cpp1z/launder3.C: New.
* g++.dg/cpp1z/launder4.C: Likewise.
* g++.dg/cpp1z/launder5.C: Likewise.
* g++.dg/cpp1z/launder5.cc: Likewise.
* g++.dg/cpp1z/launder5.h: Likewise.
* g++.dg/cpp1z/launder6.C: Likewise.
* g++.dg/cpp1z/launder6.cc: Likewise.
* g++.dg/cpp1z/launder6.h: Likewise.
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder3.C 
b/gcc/testsuite/g++.dg/cpp1z/launder3.C
new file mode 100644
index 000..2a2afc5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder3.C
@@ -0,0 +1,38 @@
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-O2" }
+
+#include 
+
+void *
+operator new (decltype (sizeof (0)), void *p)
+{
+  return p;
+}
+
+namespace std
+{
+  template 
+  T *
+  launder (T *p)
+  {
+return __builtin_launder (p);
+  }
+}
+
+struct A
+{
+  const int x;
+};
+
+struct B
+{
+  A a;
+};
+
+int
+main ()
+{
+  B b{{42}};
+  new () A{666};
+  assert(std::launder()->x == 666);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder4.C 
b/gcc/testsuite/g++.dg/cpp1z/launder4.C
new file mode 100644
index 000..3a65eb2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder4.C
@@ -0,0 +1,40 @@
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-O2" }
+
+#include 
+
+void *
+operator new (decltype (sizeof (0)), void *p)
+{
+  return p;
+}
+
+namespace std
+{
+  template 
+  T *
+  launder (T *p)
+  {
+return __builtin_launder (p);
+  }
+}
+
+struct A
+{
+  int& x;
+};
+
+struct B
+{
+  A a;
+};
+
+int
+main ()
+{
+  int x = 42;
+  B b{{x}};
+  int y = 666;
+  new () A{y};
+  assert(std::launder()->x == 666);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.C 
b/gcc/testsuite/g++.dg/cpp1z/launder5.C
new file mode 100644
index 000..483d6f2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder5.C
@@ -0,0 +1,25 @@
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-O2" }
+// { dg-additional-sources "launder5.cc" }
+
+#include 
+#include "launder5.h"
+
+namespace std
+{
+  template 
+  T *
+  launder (T *p)
+  {
+return __builtin_launder (p);
+  }
+}
+
+
+int
+main ()
+{
+  B b{{42}};
+  f(b);
+  assert(std::launder()->x == 666);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.cc 
b/gcc/testsuite/g++.dg/cpp1z/launder5.cc
new file mode 100644
index 000..f9d867d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder5.cc
@@ -0,0 +1,12 @@
+#include "launder5.h"
+
+void *
+operator new (decltype (sizeof (0)), void *p)
+{
+  return p;
+}
+
+void f(B& b)
+{
+  new () A{666};
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.h 
b/gcc/testsuite/g++.dg/cpp1z/launder5.h
new file mode 100644
index 000..b66103a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder5.h
@@ -0,0 +1,16 @@
+#ifndef GCC_TEST_LAUNDER5_H
+#define GCC_TEST_LAUNDER5_H
+
+struct A
+{
+  const int x;
+};
+
+struct B
+{
+  A a;
+};
+
+void f(B& b);
+
+#endif
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.C 
b/gcc/testsuite/g++.dg/cpp1z/launder6.C
new file mode 100644
index 000..babc4b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder6.C
@@ -0,0 +1,24 @@
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-O2" }
+// { dg-additional-sources "launder6.cc" }
+#include 
+#include "launder6.h"
+
+namespace std
+{
+  template 
+  T *
+  launder (T *p)
+  {
+return __builtin_launder (p);
+  }
+}
+
+int
+main ()
+{
+  int x = 42;
+  B b{{x}};
+  f(b);
+  assert(std::launder()->x == 666);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.cc 
b/gcc/testsuite/g++.dg/cpp1z/launder6.cc
new file mode 100644
index 000..1822891
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder6.cc
@@ -0,0 +1,14 @@
+#include "launder6.h"
+
+void *
+operator new (decltype (sizeof (0)), void *p)
+{
+  return p;
+}
+
+int y = 666;
+
+void f(B& b)
+{
+  new () A{y};
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.h 
b/gcc/testsuite/g++.dg/cpp1z/launder6.h
new file mode 100644
index 000..eca2ad4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/launder6.h
@@ -0,0 +1,16 @@
+#ifndef GCC_TEST_LAUNDER6_H
+#define GCC_TEST_LAUNDER6_H
+
+struct A
+{
+  int& x;
+};
+
+struct B
+{
+  A a;
+};
+
+void f(B& b);
+
+#endif

[PATCH] Fix reassoc NEGATE_EXPR handling (PR tree-optimization/77860)

2016-10-31 Thread Jakub Jelinek

Hi!

As the testcase shows, while reassociation of vector multiplication
has been enabled, the eliminate_using_constants optimization
isn't enabled for integral vectors (only for floating point vectors),
so e.g. if we have more than one { ~0, ~0, ... } vectors in there
coming from NEGATE_EXPRs, we don't actually fold them into a single
{ ~0, ~0, ... } or { 1, 1, ... } vector (the former then into
NEGATE_EXPR on the result, the latter removed altogether).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-10-31  Jakub Jelinek  

PR tree-optimization/77860
* tree-ssa-reassoc.c (eliminate_using_constants): Handle
also integral complex and vector constants.

* gcc.dg/pr77860.c: New test.

--- gcc/tree-ssa-reassoc.c.jj   2016-10-31 13:28:12.0 +0100
+++ gcc/tree-ssa-reassoc.c  2016-10-31 13:47:00.364216248 +0100
@@ -924,7 +924,7 @@ eliminate_using_constants (enum tree_cod
   tree type = TREE_TYPE (oelast->op);
 
   if (oelast->rank == 0
-  && (INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type)))
+  && (ANY_INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type)))
 {
   switch (opcode)
{
--- gcc/testsuite/gcc.dg/pr77860.c.jj   2016-10-31 13:50:34.019649814 +0100
+++ gcc/testsuite/gcc.dg/pr77860.c  2016-10-31 13:49:35.0 +0100
@@ -0,0 +1,13 @@
+/* PR tree-optimization/77860 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-forwprop -Wno-psabi" } */
+
+typedef unsigned short V __attribute__((vector_size (16)));
+
+V
+foo (V x, V y)
+{
+  V a = -x;
+  V b = -y;
+  return a * b;
+}

Jakub

[PATCH] [ARC] New option handling, refurbish multilib support.

2016-10-31 Thread Claudiu Zissulescu

Hi Andrew,

Please find the updated patch.

What is new:
- The .def files are having a comment block on how to add new lines.
- The arc_seen_option is not used.
- The arc_cpu* variables are not used.

Please let me know if I miss something,
Claudiu

gcc/
2016-05-09  Claudiu Zissulescu  

* config/arc/arc-arch.h: New file.
* config/arc/arc-arches.def: Likewise.
* config/arc/arc-cpus.def: Likewise.
* config/arc/arc-options.def: Likewise.
* config/arc/t-multilib: Likewise.
* config/arc/genmultilib.awk: Likewise.
* config/arc/genoptions.awk: Likewise.
* config/arc/arc-tables.opt: Likewise.
* config/arc/driver-arc.c: Likewise.
* common/config/arc/arc-common.c (arc_handle_option): Trace
toggled options.
* config.gcc (arc*-*-*): Add arc-tables.opt to arc's extra
options; check for supported cpu against arc-cpus.def file.
(arc*-*-elf*, arc*-*-linux-uclibc*): Use new make fragment; define
TARGET_CPU_BUILD macro; add driver-arc.o as an extra object.
* config/arc/arc-c.def: Add emacs local variables.
* config/arc/arc-opts.h (processor_type): Use arc-cpus.def file.
(FPU_FPUS, FPU_FPUD, FPU_FPUDA, FPU_FPUDA_DIV, FPU_FPUDA_FMA)
(FPU_FPUDA_ALL, FPU_FPUS_DIV, FPU_FPUS_FMA, FPU_FPUS_ALL)
(FPU_FPUD_DIV, FPU_FPUD_FMA, FPU_FPUD_ALL): New defines.
(DEFAULT_arc_fpu_build): Define.
(DEFAULT_arc_mpy_option): Define.
* config/arc/arc-protos.h (arc_init): Delete.
* config/arc/arc.c (arc_cpu_name): New variable.
(arc_selected_cpu, arc_selected_arch, arc_arcem, arc_archs)
(arc_arc700, arc_arc600, arc_arc601): New variable.
(arc_init): Add static; remove selection of default tune value,
cleanup obsolete error messages.
(arc_override_options): Make use of .def files for selecting the
right cpu and option configurations.
* config/arc/arc.h (stdbool.h): Include.
(TARGET_CPU_DEFAULT): Define.
(CPP_SPEC): Remove mcpu=NPS400 handling.
(arc_cpu_to_as): Declare.
(EXTRA_SPEC_FUNCTIONS): Define.
(OPTION_DEFAULT_SPECS): Likewise.
(ASM_DEFAULT): Remove.
(ASM_SPEC): Use arc_cpu_to_as.
(DRIVER_SELF_SPECS): Remove deprecated options.
(arc_base_cpu): Declare.
(TARGET_ARC600, TARGET_ARC601, TARGET_ARC700, TARGET_EM)
(TARGET_HS, TARGET_V2, TARGET_ARC600): Make them use arc_base_cpu
variable.
(MULTILIB_DEFAULTS): Use ARC_MULTILIB_CPU_DEFAULT.
* config/arc/arc.md (attr_cpu): Remove.
* config/arc/arc.opt (arc_mpy_option): Make it target variable.
(mno-mpy): Deprecate.
(mcpu=ARC600, mcpu=ARC601, mcpu=ARC700, mcpu=NPS400, mcpu=ARCEM)
(mcpu=ARCHS): Remove.
(mcrc, mdsp-packa, mdvbf, mmac-d16, mmac-24, mtelephony, mrtsc):
Deprecate.
(mbarrel_shifte, mspfp_, mdpfp_, mdsp_pack, mmac_): Remove.
(arc_fpu): Use new defines.
(arc_seen_options): New target variable.
* config/arc/t-arc (driver-arc.o): New target.
(arc-cpus, t-multilib, arc-tables.opt): Likewise.
* config/arc/t-arc-newlib: Delete.
* config/arc/t-arc-uClibc: Renamed to t-uClibc.
* doc/invoke.texi (ARC): Update arc options.
---
 gcc/common/config/arc/arc-common.c | 102 ++-
 gcc/config.gcc |  47 +
 gcc/config/arc/arc-arch.h  | 120 ++
 gcc/config/arc/arc-arches.def  |  56 ++
 gcc/config/arc/arc-c.def   |   4 +
 gcc/config/arc/arc-cpus.def|  75 ++
 gcc/config/arc/arc-options.def | 109 
 gcc/config/arc/arc-opts.h  |  47 +++--
 gcc/config/arc/arc-protos.h|   1 -
 gcc/config/arc/arc-tables.opt  |  90 
 gcc/config/arc/arc.c   | 176 +---
 gcc/config/arc/arc.h   |  89 
 gcc/config/arc/arc.md  |   5 -
 gcc/config/arc/arc.opt | 106 +--
 gcc/config/arc/driver-arc.c|  78 ++
 gcc/config/arc/genmultilib.awk | 203 +
 gcc/config/arc/genoptions.awk  |  86 
 gcc/config/arc/t-arc   |  19 
 gcc/config/arc/t-arc-newlib|  46 -
 gcc/config/arc/t-arc-uClibc|  20 
 gcc/config/arc/t-multilib  |  34 +++
 gcc/config/arc/t-uClibc|  20 
 gcc/doc/invoke.texi|  90 +---
 23 files changed, 1247 insertions(+), 376 deletions(-)
 create mode 100644 gcc/config/arc/arc-arch.h
 create mode 100644 gcc/config/arc/arc-arches.def
 create mode 100644 gcc/config/arc/arc-cpus.def
 create mode 100644 gcc/config/arc/arc-options.def
 create mode 100644 gcc/config/arc/arc-tables.opt
 create mode 100644

[C++ PATCH] Fix postfix-expression parsing (PR c++/78089, take 2)

2016-10-31 Thread Jakub Jelinek

Hi!

On Mon, Oct 24, 2016 at 04:20:25PM +0200, Jakub Jelinek wrote:
> While writing a testcase for __builtin_launder, I've noticed we don't parse
> __builtin_shuffle or __builtin_addressof properly (as well as do error
> recovery e.g. for typeid in constant expression etc.).
> 
> The problem is that several spots would return something; from the first
> switch in cp_parser_postfix_expression, but that is not something we really
> want, that primary expression part can be followed by [, ., -> etc. to form
> postfix-expression, but this is done in the same function.  So instead of
> returning, we need to set postfix_expression to whatever we were returning
> and just get out of the first switch, then continue parsing in the loop
> after that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> For release branches, __builtin_addressof doesn't apply and the
> __builtin_shuffle code is differently formatted, but the bug is there
> for __builtin_shuffle, ok to backport to the release branches too?

The patch doesn't apply any longer, as the __builtin_launder patch has been
committed earlier.
So here is the same patch updated for current trunk, bootstrapped/regtested
on x86_64-linux and i686-linux again, ok for trunk?

2016-10-31  Jakub Jelinek  

PR c++/78089
* parser.c (cp_parser_postfix_expression): Replace return statement in
the first switch with setting postfix_expression to the return
expression and break;.

* c-c++-common/builtin-shuffle-1.c: New test.
* g++.dg/cpp0x/addressof3.C: New test.

--- gcc/cp/parser.c.jj  2016-10-22 20:58:11.0 +0200
+++ gcc/cp/parser.c 2016-10-31 13:03:57.504037757 +0200
@@ -6441,7 +6441,10 @@ cp_parser_postfix_expression (cp_parser
   can be used in constant-expressions.  */
if (!cast_valid_in_integral_constant_expression_p (type)
&& cp_parser_non_integral_constant_expression (parser, NIC_CAST))
- return error_mark_node;
+ {
+   postfix_expression = error_mark_node;
+   break;
+ }
 
switch (keyword)
  {
@@ -6521,7 +6524,7 @@ cp_parser_postfix_expression (cp_parser
parser->type_definition_forbidden_message = saved_message;
/* `typeid' may not appear in an integral constant expression.  */
if (cp_parser_non_integral_constant_expression (parser, NIC_TYPEID))
- return error_mark_node;
+ postfix_expression = error_mark_node;
   }
   break;
 
@@ -6615,7 +6618,10 @@ cp_parser_postfix_expression (cp_parser
/*cast_p=*/false, /*allow_expansion_p=*/true,
/*non_constant_p=*/NULL);
if (vec == NULL)
- return error_mark_node;
+ {
+   postfix_expression = error_mark_node;
+   break;
+ }
 
FOR_EACH_VEC_ELT (*vec, i, p)
  mark_exp_read (p);
@@ -6624,10 +6630,15 @@ cp_parser_postfix_expression (cp_parser
  {
  case RID_ADDRESSOF:
if (vec->length () == 1)
- return cp_build_addressof (loc, (*vec)[0], tf_warning_or_error);
-   error_at (loc, "wrong number of arguments to "
-  "%<__builtin_addressof%>");
-   return error_mark_node;
+ postfix_expression
+   = cp_build_addressof (loc, (*vec)[0], tf_warning_or_error);
+   else
+ {
+   error_at (loc, "wrong number of arguments to "
+  "%<__builtin_addressof%>");
+   postfix_expression = error_mark_node;
+ }
+   break;
 
  case RID_BUILTIN_LAUNDER:
if (vec->length () == 1)
@@ -6643,14 +6654,20 @@ cp_parser_postfix_expression (cp_parser
 
  case RID_BUILTIN_SHUFFLE:
if (vec->length () == 2)
- return build_x_vec_perm_expr (loc, (*vec)[0], NULL_TREE,
-   (*vec)[1], tf_warning_or_error);
+ postfix_expression
+   = build_x_vec_perm_expr (loc, (*vec)[0], NULL_TREE,
+(*vec)[1], tf_warning_or_error);
else if (vec->length () == 3)
- return build_x_vec_perm_expr (loc, (*vec)[0], (*vec)[1],
-   (*vec)[2], tf_warning_or_error);
-   error_at (loc, "wrong number of arguments to "
-  "%<__builtin_shuffle%>");
-   return error_mark_node;
+ postfix_expression
+   = build_x_vec_perm_expr (loc, (*vec)[0], (*vec)[1],
+(*vec)[2], tf_warning_or_error);
+   else
+ {
+   error_at (loc, "wrong number of arguments to "
+  "%<__builtin_shuffle%>");
+   postfix_expression = error_mark_node;
+ }
+   break;
 
  default:

RFA (tree-inline): PATCH for C++ inheriting constructors overhaul

2016-10-31 Thread Jason Merrill

wg21.link/p0136 significantly changes the specification of C++11
inheriting constructors so that they become an implementation detail
rather than a language-level construct; instead, overload resolution
and such are done on the constructor from the base, and the artificial
constructor in the derived class comes in only later.  As a result,
several things that worked poorly with the old specification work much
better.  But this changes the representation as well; as a result, the
mangled name of the artificial inherited constructor is changing.

This proposal is part of C++17, but it is specified as a defect report
against C++11 and C++14 as well, so the new behavior affects those
modes as well.  The change can be controlled with
-fno-new-inheriting-ctors, and it is also disabled by default with
-fabi-version=10 or below.

One subtlety with the new scheme is that base constructor clones of an
inheriting constructor that inherits from a virtual base don't get the
inherited parameters, since they won't actually call the constructor
for the virtual base; only the complete constructor clone does that.

For the cloning code to handle omitted inherited parms, I needed to
change copy_tree_body_r to immediately optimize away a COND_EXPR with
a constant condition, as the non-taken branch would try to pass the
omitted parameters to the virtual base constructor.

Is the tree-inline.c patch OK for trunk?
commit 1375d880435fc21023a812ad17da209e44526b07
Author: Jason Merrill 
Date:   Tue Oct 25 23:53:51 2016 -0400

Implement P0136R1, Rewording inheriting constructors.

gcc/c-family/
* c.opt (-fnew-inheriting-ctors): New.
* c-opts.c: Default to on for ABI 11+.
gcc/cp/
* call.c (enum rejection_reason_code): Add rr_inherited_ctor.
(inherited_ctor_rejection): New.
(add_function_candidate): Reject inherited ctors for copying.
(enforce_access): Use strip_inheriting_ctors.
(print_z_candidate): Likewise.  Handle rr_inherited_ctor.
(convert_like_real): Avoid copying inheriting ctor parameters.
(build_over_call): Likewise.  A base ctor inheriting from vbase
has no parms.  Sorry about varargs.
(joust): A local constructor beats inherited with the same convs.
* class.c (add_method): Handle hiding inheriting ctors.
(one_inherited_ctor): Handle new semantics.
(add_implicitly_declared_members): Pass using_decl down.
(build_clone): A base ctor inheriting from vbase has no parms.
* cp-tree.h (DECL_INHERITED_CTOR): Store this instead of the base.
(SET_DECL_INHERITED_CTOR): Likewise.
(DECL_INHERITED_CTOR_BASE): Adjust.
* constexpr.c: Adjust.
* error.c (dump_function_decl): Decorate inheriting ctors.
* init.c (emit_mem_initializers): Suppress access control in
inheriting ctor.
* mangle.c (write_special_name_constructor): Handle new inheriting
ctor mangling.
* method.c (strip_inheriting_ctors, inherited_ctor_binfo)
(ctor_omit_inherited_parms, binfo_inherited_from): New.
(synthesized_method_walk): Use binfo_inherited_from.  Suppress
access control in inheriting ctor.
(deduce_inheriting_ctor): Deleted if ambiguous ctor inheritance.
(maybe_explain_implicit_delete): Explain ambigous ctor inheritance.
(add_one_base_init, do_build_copy_constructor): Adjust.
(locate_fn_flags, explain_implicit_non_constexpr): Adjust.
(implicitly_declare_fn): Adjust.
(get_inherited_ctor): Remove.
* name-lookup.c (do_class_using_decl): Check for indirect ctor
inheritance.
* optimize.c (cdtor_comdat_group): Adjust for new mangling.
(maybe_clone_body): Handle omitted parms in base clone.
(maybe_thunk_body): Don't thunk if base clone omits parms.
* pt.c (tsubst_decl): Adjust.
(instantiate_template_1): Suppress access control in inheriting
ctor.
(fn_type_unification): Do deduction with inherited ctor.
* tree.c (special_function_p): Adjust.
gcc/
* tree-inline.c (copy_tree_body_r): Only copy the taken branch of
a COND_EXPR with constant condition.
libiberty/
* cp-demangle.c (d_ctor_dtor_name): Handle inheriting constructor.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index c399307..969b05c 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -914,6 +914,12 @@ c_common_post_options (const char **pfilename)
   if (flag_abi_version == 0)
 flag_abi_version = 11;
 
+  /* By default, enable the new inheriting constructor semantics along with ABI
+ 11.  New and old should coexist fine, but it is a change in what
+ artificial symbols are

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-10-31 Thread Segher Boessenkool

On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote:
> On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
> > This patch solves this problem by simply running the 
> > duplicate_computed_gotos
> > pass again, as long as it does any work.  The patch looks much bigger than
> > it is, because I factored out two routines to simplify the control flow.
> 
> It's made the patch a bit difficult to read. Condensing it a bit...

Well, it would have a goto crossing a loop, or two gotos crossing each
other, otherwise.  I should have done it as two patches I guess (first
factor, then change).

> > +  for (;;)
> >  {
> > +  if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
> > +   return 0;
> 
> This test should not be needed in the loop. This pass can never
> collapse the function to a single basic block.

Yeah maybe, but that relies on quite a few assumptions.  This is strictly
an optimisation anyway, will move it outside the loop.

> > +  basic_block bb;
> > +  FOR_EACH_BB_FN (bb, fun)
> > +   {
> > + /* Build the reorder chain for the original order of blocks.  */
> > + if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
> > +   bb->aux = bb->next_bb;
> > +   }
> >
> > +  duplicate_computed_gotos_find_candidates (fun, candidates, max_size);
> >
> > +  bool changed = false;
> > +  if (!bitmap_empty_p (candidates))
> > +   changed = duplicate_computed_gotos_do_duplicate (fun, candidates);
> >
> > +  if (changed)
> > +   fixup_partitions ();
> > +
> > +  cfg_layout_finalize ();
> 
> I don't think you have to go into/out-of cfglayout mode for each iteration.

Yeah probably.  I was too lazy :-)  It needs the cleanup_cfg every
iteration though, I was afraid that interacts.

> >/* Merge the duplicated blocks into predecessors, when possible.  */
> > +  if (changed)
> > +   cleanup_cfg (0);
> > +  else
> > +   break;
> >  }
> 
> Maybe a gcc_assert that the loop doesn't iterate more often than num_edges?

Good plan (num blocks even).

Thanks,


Segher

Re: [PATCH] DWARF5 .debug_line{,_str} support

2016-10-31 Thread Jason Merrill


On 10/19/2016 07:30 PM, Jakub Jelinek wrote:

This patch adds support for DWARF5 .debug_line{,_str} section format,
though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise
.debug_line is emitted by the assembler.  For that we'll need some
coordination with gas, dunno if we want a new as directive for that,
or as command line switch, or perhaps key DWARF5 .debug_line on the
presence of .debug_line_str section (though the last one probably isn't a
good idea, because -gsplit-dwarf doesn't use .debug_line_str).


Could gas pay attention to the DWARF version in a unit header?


+  if (ndirs + idx_offset <= 256)
+   idx_form = DW_FORM_data1;
+  else if (ndirs + idx_offset <= 65536)
+   {
+ unsigned HOST_WIDE_INT sum = 1;
+ for (i = 0; i < numfiles; i++)
+   {
+ int file_idx = backmap[i];
+ int dir_idx = dirs[files[file_idx].dir_idx].dir_idx;
+ sum += size_of_uleb128 (dir_idx);
+   }
+ if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1))
+   idx_form = DW_FORM_data2;
+   }


Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we 
need to look at the files table to choose DW_FORM_data2?  This at least 
needs a comment.



+   dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2,
+0, NULL);


It seems that we could use an output_data function that takes a DW_FORM 
as one of its arguments and does the appropriate thing.



+  if (dwarf_version >= 5 && str_form == DW_FORM_line_strp)
+   {
+ node = find_AT_string_in_table (filename0, debug_line_str_hash);
+ set_indirect_string (node);
+ node->form = DW_FORM_line_strp;
+ dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+debug_line_str_section,
+"File Entry: %#x: \"%s\"", 0, node->str);
+   }
+  else
+   dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0);


Please factor this out into a separate function.


+  if (!DWARF2_ASM_LINE_DEBUG_INFO
+  && dwarf_version >= 5
+  && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+  || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0
+ /* FIXME: For now.  */
+  || dwarf_split_debug_info))


This should also be a separate function or macro, since it's used more 
than once.  And please elaborate the FIXME comment.


Jason

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-10-31 Thread Steven Bosscher

On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote:
> This patch solves this problem by simply running the duplicate_computed_gotos
> pass again, as long as it does any work.  The patch looks much bigger than
> it is, because I factored out two routines to simplify the control flow.

It's made the patch a bit difficult to read. Condensing it a bit...

> +  for (;;)
>  {
> +  if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
> +   return 0;

This test should not be needed in the loop. This pass can never
collapse the function to a single basic block.


> +  clear_bb_flags ();
> +  cfg_layout_initialize (0);

See comment below...


> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, fun)
> +   {
> + /* Build the reorder chain for the original order of blocks.  */
> + if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
> +   bb->aux = bb->next_bb;
> +   }
>
> +  duplicate_computed_gotos_find_candidates (fun, candidates, max_size);
>
> +  bool changed = false;
> +  if (!bitmap_empty_p (candidates))
> +   changed = duplicate_computed_gotos_do_duplicate (fun, candidates);
>
> +  if (changed)
> +   fixup_partitions ();
> +
> +  cfg_layout_finalize ();

I don't think you have to go into/out-of cfglayout mode for each iteration.


>/* Merge the duplicated blocks into predecessors, when possible.  */
> +  if (changed)
> +   cleanup_cfg (0);
> +  else
> +   break;
>  }

Maybe a gcc_assert that the loop doesn't iterate more often than num_edges?

Ciao!
Steven

Re: [PATCH] fix linker name for uClibc

2016-10-31 Thread Michael Eager


On 10/31/2016 08:04 AM, Jeff Law wrote:

On 10/28/2016 03:30 PM, Michael Eager wrote:

On 10/28/2016 11:14 AM, Waldemar Brodkorb wrote:

Hi,

uClibc-ng can be used for Microblaze architecture.
It is regulary tested with qemu-system-microblaze in little and
big endian mode.

2016-10-28  Waldemar Brodkorb  

 gcc/
 * config/microblaze/linux.h: add UCLIBC_DYNAMIC_LINKER

diff --git a/gcc/config/microblaze/linux.h
b/gcc/config/microblaze/linux.h
index ae8523c..b3bf43a 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -29,6 +29,7 @@
  #define TLS_NEEDS_GOT 1

  #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"

  #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}"


best regards
  Waldemar


OK to apply.

Thanks.  I don't believe Waldemar has write access, so I committed the patch 
for him.


Thanks.


--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: MAINTAINERS update

2016-10-31 Thread Carl E. Love

On Sat, 2016-10-29 at 12:10 -0500, Segher Boessenkool wrote:
Hi Carl,
> 
> On Fri, Oct 28, 2016 at 08:11:11AM -0700, Carl E. Love wrote:
> >  Manuel López-Ibáñez  
> >  Martin v. Löwis

> >  H.J. Lu

> > +Carl Love
> >  Christophe Lyon

> >  Luis Machado 
> >  Ziga Mahkovec

> 
> That is not alphabetical...  Move it two up?
> 
> Segher

Seger:

Yup, it is in the wrong order.  I have updated the MAINTAINERS file and
the ChangeLog.

  Carl Love

-

Index: ChangeLog
===
--- ChangeLog   (revision 241702)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2016-10-27  Carl Love  
 
+   * MAINTAINERS (Write After Approval): Fix my entry in the Write After
+   Approval list to make it alphabetical.
+
+2016-10-27  Carl Love  
+
* MAINTAINERS (Write After Approval): Add myself.
 
 2016-10-27  Andrew Burgess  
Index: MAINTAINERS
===
--- MAINTAINERS (revision 241702)
+++ MAINTAINERS (working copy)
@@ -477,9 +477,9 @@
 Gabor Loki 
 Sandra Loosemore   
 Manuel López-Ibáñez
+Carl Love  
 Martin v. Löwis

 H.J. Lu
-Carl Love  
 Christophe Lyon
 Luis Machado   
 Ziga Mahkovec

Re: [PATCH] fix linker name for uClibc

2016-10-31 Thread Jeff Law


On 10/28/2016 03:30 PM, Michael Eager wrote:

On 10/28/2016 11:14 AM, Waldemar Brodkorb wrote:

Hi,

uClibc-ng can be used for Microblaze architecture.
It is regulary tested with qemu-system-microblaze in little and
big endian mode.

2016-10-28  Waldemar Brodkorb  

 gcc/
 * config/microblaze/linux.h: add UCLIBC_DYNAMIC_LINKER

diff --git a/gcc/config/microblaze/linux.h
b/gcc/config/microblaze/linux.h
index ae8523c..b3bf43a 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -29,6 +29,7 @@
  #define TLS_NEEDS_GOT 1

  #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"

  #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}"


best regards
  Waldemar


OK to apply.
Thanks.  I don't believe Waldemar has write access, so I committed the 
patch for him.


jeff

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Martin Liška

On 10/31/2016 12:11 PM, Richard Biener wrote:
> On Mon, Oct 31, 2016 at 11:18 AM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford
>>>  wrote:
 Richard Biener  writes:
> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
>> On 10/31/2016 01:12 AM, Richard Sandiford wrote:
>>> Richard Biener  writes:
 On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
> On 10/27/2016 03:35 PM, Richard Biener wrote:
>> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
>>> Running simple test-case w/o the proper header file causes ICE:
>>> strncmp ("a", "b", -1);
>>>
>>> 0xe74462 tree_to_uhwi(tree_node const*)
>>> ../../gcc/tree.c:7324
>>> 0x90a23f host_size_t_cst_p
>>> ../../gcc/fold-const-call.c:63
>>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
>>> tree_node*, tree_node*)
>>> ../../gcc/fold-const-call.c:1512
>>> 0x787b01 fold_builtin_3
>>> ../../gcc/builtins.c:8385
>>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**,
>>> int, bool)
>>> ../../gcc/builtins.c:8465
>>> 0x9052b1 fold(tree_node*)
>>> ../../gcc/fold-const.c:11919
>>> 0x6de2bb c_fully_fold_internal
>>> ../../gcc/c/c-fold.c:185
>>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
>>> ../../gcc/c/c-fold.c:90
>>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10369
>>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10414
>>> 0x6cb578 c_parser_statement_after_labels
>>> ../../gcc/c/c-parser.c:5430
>>> 0x6cd333 c_parser_compound_statement_nostart
>>> ../../gcc/c/c-parser.c:4944
>>> 0x6cdbde c_parser_compound_statement
>>> ../../gcc/c/c-parser.c:4777
>>> 0x6c93ac c_parser_declaration_or_fndef
>>> ../../gcc/c/c-parser.c:2176
>>> 0x6d51ab c_parser_external_declaration
>>> ../../gcc/c/c-parser.c:1574
>>> 0x6d5c09 c_parser_translation_unit
>>> ../../gcc/c/c-parser.c:1454
>>> 0x6d5c09 c_parse_file()
>>> ../../gcc/c/c-parser.c:18173
>>> 0x72ffd2 c_common_parse_file()
>>> ../../gcc/c-family/c-opts.c:1087
>>>
>>> Following patch improves the host_size_t_cst_p predicate.
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives
>>> regression tests.
>>>
>>> Ready to be installed?
>>
>> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
>> CHAR_BIT test is now redundant.
>>
>> OTOH it was probably desired to allow -1 here?  A little looking back
>> in time should tell.
>
> Ok, it started with r229922, where it was changed from:
>
>   if (tree_fits_uhwi_p (len) && p1 && p2)
> {
>   const int i = strncmp (p1, p2, tree_to_uhwi (len));
> ...
>
> to current version:
>
> case CFN_BUILT_IN_STRNCMP:
>   {
> bool const_size_p = host_size_t_cst_p (arg2, );
>
> Thus I'm suggesting to change to back to it.
>
> Ready to be installed?

 Let's ask Richard.
>>>
>>> The idea with the:
>>>
>>>   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT
>>>
>>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
>>> I think we still want that.
>>
>> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
>> wi::min_precision check, right?
>
> Not sure.  If we have host_size_t_cst_p then we should have a 
> corresponding
> size_t host_size_t (const_tree) and should use those in pairs.  Not sure
> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.

 It's the other way around: something can satisfy tree_fits_uhwi_p
 (i.e. fit within a uint64_t) but not fit within the host's size_t.
 The kind of case I'm thinking of is:

   strncmp ("fi", "fo", (1L << 32) + 1)

 for an ILP32 host and LP64 target.  There's a danger that by passing
 the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd
 truncate it to 1, giving the wrong result.
>>>
>>> Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then?
>>> (unless we have a > 64bit host size_t).
>>
>> Because in Martin's test

[PATCH 2/2, i386]: Implement TARGET_EXPAND_DIVMOD_LIBFUNC

2016-10-31 Thread Uros Bizjak

Attached patch builds on previous libgcc patch and implements
TARGET_EXPAND_DIVMOD_LIBFUNC target hook.

The same approach can also be used on other targets without hardware
divmod insn.

Patch also fixes several gcc.dg/divmod-?.c testsuite failures for
x86_64 multilibs.

2016-10-31  Uros Bizjak  

* config/i386/i386.c (ix86_init_libfuncs): New.  Call
darwin_rename_builtins here.
(ix86_expand_divmod_libfunc): New.
(TARGET_INIT_LIBFUNCS): Unconditionally define to ix86_init_libfuncs.
(TARGET_EXPAND_DIVMOD_LIBFUNC): Define.

testsuite/Changelog:

2016-10-31  Uros Bizjak  

* lib/target-supports.exp (check_effective_target_divmod):
Add i?86-*-*.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 241697)
+++ config/i386/i386.c  (working copy)
@@ -50571,9 +50571,44 @@ ix86_addr_space_zero_address_valid (addr_space_t a
 {
   return as != ADDR_SPACE_GENERIC;
 }
-#undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
-#define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid
 
+static void
+ix86_init_libfuncs (void)
+{
+  if (TARGET_64BIT)
+{
+  set_optab_libfunc (sdivmod_optab, TImode, "__divmodti4");
+  set_optab_libfunc (udivmod_optab, TImode, "__udivmodti4");
+}
+  else
+{
+  set_optab_libfunc (sdivmod_optab, DImode, "__divmoddi4");
+  set_optab_libfunc (udivmod_optab, DImode, "__udivmoddi4");
+}
+
+#if TARGET_MACHO
+  darwin_rename_builtins ();
+#endif
+}
+
+/* Generate call to __divmoddi4.  */
+
+static void
+ix86_expand_divmod_libfunc (rtx libfunc, machine_mode mode,
+   rtx op0, rtx op1,
+   rtx *quot_p, rtx *rem_p)
+{
+  rtx rem = assign_386_stack_local (mode, SLOT_TEMP);
+
+  rtx quot = emit_library_call_value (libfunc, NULL_RTX, LCT_NORMAL,
+   mode, 3,
+   op0, GET_MODE (op0),
+   op1, GET_MODE (op1),
+   XEXP (rem, 0), Pmode);
+  *quot_p = quot;
+  *rem_p = rem;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
@@ -50957,11 +50992,6 @@ ix86_addr_space_zero_address_valid (addr_space_t a
 #undef TARGET_CONDITIONAL_REGISTER_USAGE
 #define TARGET_CONDITIONAL_REGISTER_USAGE ix86_conditional_register_usage
 
-#if TARGET_MACHO
-#undef TARGET_INIT_LIBFUNCS
-#define TARGET_INIT_LIBFUNCS darwin_rename_builtins
-#endif
-
 #undef TARGET_LOOP_UNROLL_ADJUST
 #define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust
 
@@ -51052,6 +51082,15 @@ ix86_addr_space_zero_address_valid (addr_space_t a
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
 #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
 
+#undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
+#define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid
+
+#undef TARGET_INIT_LIBFUNCS
+#define TARGET_INIT_LIBFUNCS ix86_init_libfuncs
+
+#undef TARGET_EXPAND_DIVMOD_LIBFUNC
+#define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-i386.h"
Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp   (revision 241697)
+++ testsuite/lib/target-supports.exp   (working copy)
@@ -8110,7 +8110,7 @@ proc check_effective_target_divmod { } {
 #TODO: Add checks for all targets that have either hardware divmod insn
 # or define libfunc for divmod.
 if { [istarget arm*-*-*]
-|| [istarget x86_64-*-*] } {
+|| [istarget x86_64-*-*] || [istarget i?86-*-*] } {
return 1
 }
 return 0

[PATCH 1/2, libgcc]: Implement _divmoddi4

2016-10-31 Thread Uros Bizjak

This function will be used in a follow-up patch to implement
TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call
this function, so IMO it should be part of a generic library.

2016-10-31  Uros Bizjak  

* Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4.
* libgcc2.c (__divmoddi4): New function.
* libgcc2.h (__divmoddi4): Declare.
* libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4
and __PFX_divmodti4.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} with a
follow-up target-dependent  patch.

OK for mainline?

Uros.
Index: Makefile.in
===
--- Makefile.in (revision 241697)
+++ Makefile.in (working copy)
@@ -255,10 +255,6 @@ LIB2FUNCS_ST = _eprintf __gcc_bcmp
 # List of functions not to build from libgcc2.c.
 LIB2FUNCS_EXCLUDE =
 
-# These might cause a divide overflow trap and so are compiled with
-# unwinder info.
-LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4
-
 # List of extra C and assembler files to add to static and shared libgcc2.
 # Assembler files should have names ending in `.S'.
 LIB2ADD = 
@@ -455,7 +451,8 @@ endif
 
 # These might cause a divide overflow trap and so are compiled with
 # unwinder info.
-LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4
+LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _divmoddi4 \
+   _udivdi3 _umoddi3 _udivmoddi4 _udiv_w_sdiv
 
 # Remove any objects from lib2funcs and LIB2_DIVMOD_FUNCS that are
 # defined as optimized assembly code in LIB1ASMFUNCS or as C code
Index: libgcc-std.ver.in
===
--- libgcc-std.ver.in   (revision 241697)
+++ libgcc-std.ver.in   (working copy)
@@ -1938,3 +1938,9 @@ GCC_4.7.0 {
 %inherit GCC_4.8.0 GCC_4.7.0
 GCC_4.8.0 {
 }
+
+%inherit GCC_7.0.0 GCC_4.8.0
+GCC_7.0.0 {
+  __PFX__divmoddi4
+  __PFX__divmodti4
+}
Index: libgcc2.c
===
--- libgcc2.c   (revision 241697)
+++ libgcc2.c   (working copy)
@@ -680,7 +680,8 @@ __udiv_w_sdiv (UWtype *rp __attribute__ ((__unused
 #endif
 
 #if (defined (L_udivdi3) || defined (L_divdi3) || \
- defined (L_umoddi3) || defined (L_moddi3))
+ defined (L_umoddi3) || defined (L_moddi3) || \
+ defined (L_divmoddi4))
 #define L_udivmoddi4
 #endif
 
@@ -937,7 +938,8 @@ __parityDI2 (UDWtype x)
 #ifdef TARGET_HAS_NO_HW_DIVIDE
 
 #if (defined (L_udivdi3) || defined (L_divdi3) || \
- defined (L_umoddi3) || defined (L_moddi3))
+ defined (L_umoddi3) || defined (L_moddi3) || \
+ defined (L_divmoddi4))
 static inline __attribute__ ((__always_inline__))
 #endif
 UDWtype
@@ -1004,7 +1006,8 @@ __udivmoddi4 (UDWtype n, UDWtype d, UDWtype *rp)
 #else
 
 #if (defined (L_udivdi3) || defined (L_divdi3) || \
- defined (L_umoddi3) || defined (L_moddi3))
+ defined (L_umoddi3) || defined (L_moddi3) || \
+ defined (L_divmoddi4))
 static inline __attribute__ ((__always_inline__))
 #endif
 UDWtype
@@ -1269,6 +1272,34 @@ __moddi3 (DWtype u, DWtype v)
 }
 #endif
 
+#ifdef L_divmoddi4
+DWtype
+__divmoddi4 (DWtype u, DWtype v, DWtype *rp)
+{
+  Wtype c1 = 0, c2 = 0;
+  DWunion uu = {.ll = u};
+  DWunion vv = {.ll = v};
+  DWtype w;
+  DWtype r;
+
+  if (uu.s.high < 0)
+c1 = ~c1, c2 = ~c2,
+uu.ll = -uu.ll;
+  if (vv.s.high < 0)
+c1 = ~c1,
+vv.ll = -vv.ll;
+
+  w = __udivmoddi4 (uu.ll, vv.ll, (UDWtype*));
+  if (c1)
+w = -w;
+  if (c2)
+r = -r;
+
+  *rp = r;
+  return w;
+}
+#endif
+
 #ifdef L_umoddi3
 UDWtype
 __umoddi3 (UDWtype u, UDWtype v)
Index: libgcc2.h
===
--- libgcc2.h   (revision 241697)
+++ libgcc2.h   (working copy)
@@ -281,6 +281,7 @@ typedef int shift_count_type __attribute__((mode (
 #define __ashrdi3  __NDW(ashr,3)
 #define __cmpdi2   __NDW(cmp,2)
 #define __ucmpdi2  __NDW(ucmp,2)
+#define __divmoddi4__NDW(divmod,4)
 #define __udivmoddi4   __NDW(udivmod,4)
 #define __fixunstfDI   __NDW(fixunstf,)
 #define __fixtfdi  __NDW(fixtf,)
@@ -376,10 +377,12 @@ extern DWtype __divdi3 (DWtype, DWtype);
 extern UDWtype __udivdi3 (UDWtype, UDWtype);
 extern UDWtype __umoddi3 (UDWtype, UDWtype);
 extern DWtype __moddi3 (DWtype, DWtype);
+extern DWtype __divmoddi4 (DWtype, DWtype, DWtype *);
 
 /* __udivmoddi4 is static inline when building other libgcc2 portions.  */
 #if (!defined (L_udivdi3) && !defined (L_divdi3) && \
- !defined (L_umoddi3) && !defined (L_moddi3))
+ !defined (L_umoddi3) && !defined (L_moddi3) && \
+ !defined (L_divmoddi4))
 extern UDWtype __udivmoddi4 (UDWtype, UDWtype, UDWtype *);
 #endif

Re: [PATCH] Emit DW_AT_inline for C++17 inline variables

2016-10-31 Thread Jason Merrill

On Mon, Oct 31, 2016 at 10:25 AM, Jakub Jelinek  wrote:
> On Mon, Oct 31, 2016 at 09:52:28AM -0400, Jason Merrill wrote:
>> On 10/14/2016 01:33 PM, Jakub Jelinek wrote:
>> >This also uses the infrastructure of the langhook patch I've sent earlier.
>> >It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline
>> >variables, and also tweaks dwarf2out so that for inline static data members
>> >we consider in-class declarations as definitions (emit DW_AT_linkage_name
>> >and no DW_AT_declaration) for them.
>>
>> Hmm, so there's no DW_TAG_variable for the inline static data member? That
>> seems problematic to me.  The DWARF 3 convention that static data members
>> use DW_TAG_member seems rather misguided, since in fact they are variables.
>> Why did that change?
>
> I had to bisect it, apparently it has been changed by myself in
> r145770, rationale in thread starting at
> https://gcc.gnu.org/ml/gcc-patches/2009-04/msg00039.html
>
> The current DWARF 5 wording is:
> "If the variable entry represents the defining declaration for a C++ static 
> data
> member of a structure, class or union, the entry has a DW_AT_specification
> attribute, whose value is a reference to the debugging information entry
> representing the declaration of this data member. The referenced entry has
> the tag DW_TAG_member and will be a child of some class, structure or
> union type entry." on page 98 in DWARF5_Public_Review.pdf.

Yes, this changed in DWARF 3; DWARF 2 didn't specify the tag.  I think
this was a mistake.

> Incidentally, I've filed today a DWARF issue that Appendix A doesn't list
> for DW_TAG_member lots of attributes that are allowed for DW_TAG_variable
> and are useful for static data members.

Using DW_TAG_variable would address that, too.

Jason

Re: [PATCH] DWARF5 DW_FORM_implicit_const support

2016-10-31 Thread Jason Merrill


On 10/17/2016 05:45 PM, Jakub Jelinek wrote:

+/* First abbrev_id that can be optimized based on usage.  */
+static unsigned int abbrev_opt_start;
+
+/* Vector of usage counts during build_abbrev_table.  Indexed by
+   abbrev_id - abbrev_opt_start.  */
+static vec abbrev_usage_count;
+
+/* Vector of all DIEs added with die_abbrev >= abbrev_opt_start.  */
+static vec abbrev_dies;


I think these could use a comment that they are temporary data only live 
between build_abbrev_table and optimize_abbrev_table.  And perhaps the 
abbrev_dies variable should be named something like sorted_abbrev_dies 
to avoid confusion with abbrev_die_table.


OK with that change.

Jason

[Patch, testsuite] Skip gcc.dg/lto/pr60449_0.c for avr

2016-10-31 Thread Senthil Kumar Selvaraj

Hi,

  gcc.dg/lto/pr60449_0.c fails to link for the avr target because it
  doesn't have an implementation for gettimeofday. This patch therefore
  skips the test for avr.

  Committed to trunk as obvious.

Regards
Senthil


gcc/testsuite/ChangeLog

2016-10-31  Senthil Kumar Selvaraj  

* gcc.dg/lto/pr60449_0.c: Skip for avr.

Index: gcc/testsuite/gcc.dg/lto/pr60449_0.c
===
--- gcc/testsuite/gcc.dg/lto/pr60449_0.c(revision 241697)
+++ gcc/testsuite/gcc.dg/lto/pr60449_0.c(working copy)
@@ -1,4 +1,5 @@
 /* { dg-lto-do link } */
+/* { dg-skip-if "Needs gettimeofday" { "avr-*-*" } } */
 
 extern int printf (const char *__restrict __format, ...);
 typedef long int __time_t;

Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2016-10-31 Thread Kyrill Tkachov



On 31/10/16 13:42, Richard Earnshaw (lists) wrote:

On 31/10/16 11:54, Kyrill Tkachov wrote:

On 24/10/16 17:15, Andrew Pinski wrote:

On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachov
 wrote:

Hi all,

When storing a 64-bit immediate that has equal bottom and top halves we
currently
synthesize the repeating 32-bit pattern twice and perform a single
X-store.
With this patch we synthesize the 32-bit pattern once into a W
register and
store
that twice using an STP. This reduces codesize bloat from
synthesising the
same
constant multiple times at the expense of converting a store to a
store-pair.
It will only trigger if we can save two or more instructions, so it will
only transform:
  mov x1, 49370
  movkx1, 0xc0da, lsl 32
  str x1, [x0]

into:

  mov w1, 49370
  stp w1, w1, [x0]

when optimising for -Os, whereas it will always transform a 4-insn
synthesis
sequence into a two-insn sequence + STP (see comments in the patch).

This patch triggers already but will trigger more with the store merging
pass
that I'm working on since that will generate more of these repeating
64-bit
constants.
This helps improve codegen on 456.hmmer where store merging can
sometimes
create very
complex repeating constants and target-specific expand needs to break
them
down.

Doing STP might be worse on ThunderX 1 than the mov/movk.  Or this
might cause an ICE with -mcpu=thunderx; I can't remember if the check
for slow unaligned store pair word is with the pattern or not.

I can't get it to ICE with -mcpu=thunderx.
The restriction is just on the STP forming code in the sched-fusion
hooks AFAIK.


Basically the rule is
1) if 4 byte aligned, then it is better to do two str.
2) If 8 byte aligned, then doing stp is good
3) Otherwise it is better to do two str.

Ok, then I'll make the function just emit two stores and depend on the
sched-fusion
machinery to fuse them into an STP when appropriate since that has the
logic that
takes thunderx into account.

If the mode is DImode (ie the pattern is 'movdi', then surely we must
have a 64-bit aligned store.


I don't think that's guaranteed. At least the Standard Names documentation
doesn't mention it. I've gone through an example with gdb where store merging
produces an unaligned store and the gen_movdi expander is called with a source
of (const_int 8589934593 [0x20001]) and a destination of:
(mem:DI (reg/v/f:DI 73 [ a ]) [1 MEM[(int *)a_2(D)]+0 S8 A32])

i.e. a 32-bit aligned DImode MEM.

Thanks,
Kyrill


R.


Thanks for the info.
Kyrill


Thanks,
Andrew



Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-10-24  Kyrylo Tkachov  

  * config/aarch64/aarch64.md (mov): Call
  aarch64_split_dimode_const_store on DImode constant stores.
  * config/aarch64/aarch64-protos.h
(aarch64_split_dimode_const_store):
  New prototype.
  * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
  function.

2016-10-24  Kyrylo Tkachov  

  * gcc.target/aarch64/store_repeating_constant_1.c: New test.
  * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.

Re: [PATCH] Emit DW_AT_inline for C++17 inline variables

2016-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2016 at 09:52:28AM -0400, Jason Merrill wrote:
> On 10/14/2016 01:33 PM, Jakub Jelinek wrote:
> >This also uses the infrastructure of the langhook patch I've sent earlier.
> >It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline
> >variables, and also tweaks dwarf2out so that for inline static data members
> >we consider in-class declarations as definitions (emit DW_AT_linkage_name
> >and no DW_AT_declaration) for them.
> 
> Hmm, so there's no DW_TAG_variable for the inline static data member? That
> seems problematic to me.  The DWARF 3 convention that static data members
> use DW_TAG_member seems rather misguided, since in fact they are variables.
> Why did that change?

I had to bisect it, apparently it has been changed by myself in
r145770, rationale in thread starting at
https://gcc.gnu.org/ml/gcc-patches/2009-04/msg00039.html

The current DWARF 5 wording is:
"If the variable entry represents the defining declaration for a C++ static data
member of a structure, class or union, the entry has a DW_AT_specification
attribute, whose value is a reference to the debugging information entry
representing the declaration of this data member. The referenced entry has
the tag DW_TAG_member and will be a child of some class, structure or
union type entry." on page 98 in DWARF5_Public_Review.pdf.
And page 117 has in 5.7.6:
"A data member (as opposed to a member function) is represented by a
debugging information entry with the tag DW_TAG_member."

So, using DW_TAG_variable as DW_TAG_{structure,class}_type children
for the non-inline static data members would violate these two.
C++17 is outside of the scope of DWARF5 I guess, so the question is to find
out what do we want for those and agree on that with consumers and other
producers.

Incidentally, I've filed today a DWARF issue that Appendix A doesn't list
for DW_TAG_member lots of attributes that are allowed for DW_TAG_variable
and are useful for static data members.  Some of them we want anyway, at
least unless we agree on the non-inline static data members no longer be
DW_TAG_member in the structure/class type and change the standard.
In particular DW_AT_const_value and DW_AT_const_expr are desirable even
for the static data members in C++98..14 (or non-inline ones in C++17).
The rest, like DW_AT_linkage_name, DW_AT_location, etc. are generally more
useful mainly for the inline ones.

Jakub

[PATCH] Fix PR78047

2016-10-31 Thread Richard Biener


I am testing the following to fix another latent PTA bug uncovered
by pointer comparsion folding.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-10-31  Richard Biener  

PR tree-optimization/78047
* tree-ssa-structalias.c (push_fields_onto_fieldstack): Initialize
fake field at offset zero conservatively regarding to may_have_pointers.

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index fb364f1..2880382 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5566,7 +5568,7 @@ push_fields_onto_fieldstack (tree type, vec 
*fieldstack,
&& offset + foff != 0)
  {
fieldoff_s e
- = {0, offset + foff, false, false, false, false, NULL_TREE};
+ = {0, offset + foff, false, false, true, false, NULL_TREE};
pair = fieldstack->safe_push (e);
  }

Re: [PATCH] DWARF5 DW_FORM_data16 support

2016-10-31 Thread Jason Merrill


On 10/17/2016 05:39 PM, Jakub Jelinek wrote:

(dwarf_version >= 5 ? 128 : 64)


Please make this a macro.  OK with that change.

Jason

Re: [PATCH] DWARF5 DW_TAG_call_site support

2016-10-31 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Emit DWARF5 DW_AT_reference and DW_AT_rvalue_reference

2016-10-31 Thread Jason Merrill


On 10/24/2016 03:09 PM, Jakub Jelinek wrote:

On Mon, Oct 24, 2016 at 02:34:04PM -0400, Jason Merrill wrote:

On Mon, Oct 24, 2016 at 10:29 AM, Jakub Jelinek  wrote:

This is another addition in DWARF5.  The patch emits these attributes
only for DW_TAG_subprogram for non-static ref-qualified member functions.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


Thanks, committed.


We really should emit it also for DW_TAG_subroutine_type for PMF types,
the problem is that for FUNCTION_TYPE/METHOD_TYPE, dwarf2out.c uses
type_main_variant and get_qualified_type, neither of them understand
ref-qualifies.  I think we'd need to change get_qualified_type to use
some lang-hook which for C++ would also check the ref qualifiers, and then
for these two avoid type_main_variant and instead use get_qualified_type
to get the non-const/volatile/restrict etc. qualified one (and only then
we can use the decl_dwarf_attribute langhook to ask the FE whether the
attribute should be emitted).  Does that make sense, or do you have another
approach?


Can we pull out the ref-qualifier before we clobber the qualifiers on the type?


We could handle it like "reverse" flag, basically replace bool reverse with
int that would hold bitmask of the flags, but I really don't understand how
the reverse flag works right now, I don't see how it can work reliably when
we use and rely on equate_type_number_to_die/lookup_type_die mechanism where
we store just a single DIE for each type.  We either equate it to a type
with DW_AT_endianity present, or not, and then the rest of the lookups will
be just random.  I'm afraid to make it work reliably we'd need to somehow
store both the "reverse" and "!reverse" DIEs for each type DIE, and
similarly for the & and && flags.

Or by changing get_qualified_die (in particular check_base_type) to use a
langhook, we could at least for DW_AT_{reference,rvalue_reference} just use
equate_type_number_to_die/lookup_type_die reliably (DW_AT_endianity issue
remains).


Yeah, I think that adding a langhook is the right way to go.  The 
generic code clobbering C++ qualifiers is a frequent headache.


Jason

Re: [PATCH] For -gdwarf-5 emit DWARF5 .debug_macro

2016-10-31 Thread Jason Merrill

OK.

Re: [PATCH] Emit DW_AT_inline for C++17 inline variables

2016-10-31 Thread Jason Merrill


On 10/14/2016 01:33 PM, Jakub Jelinek wrote:

This also uses the infrastructure of the langhook patch I've sent earlier.
It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline
variables, and also tweaks dwarf2out so that for inline static data members
we consider in-class declarations as definitions (emit DW_AT_linkage_name
and no DW_AT_declaration) for them.


Hmm, so there's no DW_TAG_variable for the inline static data member? 
That seems problematic to me.  The DWARF 3 convention that static data 
members use DW_TAG_member seems rather misguided, since in fact they are 
variables.  Why did that change?


Jason

Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2016-10-31 Thread Richard Earnshaw (lists)

On 31/10/16 11:54, Kyrill Tkachov wrote:
> 
> On 24/10/16 17:15, Andrew Pinski wrote:
>> On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachov
>>  wrote:
>>> Hi all,
>>>
>>> When storing a 64-bit immediate that has equal bottom and top halves we
>>> currently
>>> synthesize the repeating 32-bit pattern twice and perform a single
>>> X-store.
>>> With this patch we synthesize the 32-bit pattern once into a W
>>> register and
>>> store
>>> that twice using an STP. This reduces codesize bloat from
>>> synthesising the
>>> same
>>> constant multiple times at the expense of converting a store to a
>>> store-pair.
>>> It will only trigger if we can save two or more instructions, so it will
>>> only transform:
>>>  mov x1, 49370
>>>  movkx1, 0xc0da, lsl 32
>>>  str x1, [x0]
>>>
>>> into:
>>>
>>>  mov w1, 49370
>>>  stp w1, w1, [x0]
>>>
>>> when optimising for -Os, whereas it will always transform a 4-insn
>>> synthesis
>>> sequence into a two-insn sequence + STP (see comments in the patch).
>>>
>>> This patch triggers already but will trigger more with the store merging
>>> pass
>>> that I'm working on since that will generate more of these repeating
>>> 64-bit
>>> constants.
>>> This helps improve codegen on 456.hmmer where store merging can
>>> sometimes
>>> create very
>>> complex repeating constants and target-specific expand needs to break
>>> them
>>> down.
>>
>> Doing STP might be worse on ThunderX 1 than the mov/movk.  Or this
>> might cause an ICE with -mcpu=thunderx; I can't remember if the check
>> for slow unaligned store pair word is with the pattern or not.
> 
> I can't get it to ICE with -mcpu=thunderx.
> The restriction is just on the STP forming code in the sched-fusion
> hooks AFAIK.
> 
>> Basically the rule is
>> 1) if 4 byte aligned, then it is better to do two str.
>> 2) If 8 byte aligned, then doing stp is good
>> 3) Otherwise it is better to do two str.
> 
> Ok, then I'll make the function just emit two stores and depend on the
> sched-fusion
> machinery to fuse them into an STP when appropriate since that has the
> logic that
> takes thunderx into account.

If the mode is DImode (ie the pattern is 'movdi', then surely we must
have a 64-bit aligned store.

R.

> 
> Thanks for the info.
> Kyrill
> 
>>
>> Thanks,
>> Andrew
>>
>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>>
>>> Ok for trunk?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2016-10-24  Kyrylo Tkachov  
>>>
>>>  * config/aarch64/aarch64.md (mov): Call
>>>  aarch64_split_dimode_const_store on DImode constant stores.
>>>  * config/aarch64/aarch64-protos.h
>>> (aarch64_split_dimode_const_store):
>>>  New prototype.
>>>  * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
>>>  function.
>>>
>>> 2016-10-24  Kyrylo Tkachov  
>>>
>>>  * gcc.target/aarch64/store_repeating_constant_1.c: New test.
>>>  * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.
>

Re: [PATCH 0/3] use rtx_insn * more

2016-10-31 Thread Jeff Law


On 10/28/2016 01:13 PM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

HI,

This series changes various variables type from rtx to rtx_insn * so that the
remaining patches in this series
http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01353.html can be applied.

patches bootstrapped and regtested on x86_64-linux-gnu, and run through 
config-list.mk, ok?

Thanks!

Trev

Trevor Saunders (3):
  use rtx_insn * in various places where it is obvious
  split up the trial variable in reorg.c:relax_delay_slots to use
rtx_insn * more
  split up some variables to use rtx_insn * more

All 3 patches in this series are fine.

Thanks,
Jeff

Re: [C++ PATCH] Fix -Wimplicit-fallthrough in templates (PR c++/77886)

2016-10-31 Thread Jason Merrill

OK.

Re: [C++ PATCH] Add tests for a const member and a reference member for launder.

2016-10-31 Thread Jason Merrill

On Sun, Oct 30, 2016 at 2:53 PM, Ville Voutilainen
 wrote:
> So, how about adding these? They should give us regression tests
> that make sure launder does the right in in case some optimizations
> attempt to reuse const/reference members.

Good idea.  You might put the reuse in a separate function in order to
hide it from the optimizer.

Jason

Re: relax rule for flexible array members in 6.x (78039 - fails to compile glibc tests)

2016-10-31 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Fix PR78035

2016-10-31 Thread Markus Trippelsdorf

On 2016.10.31 at 13:46 +0100, Richard Biener wrote:
> 
> This is an updated patch for PR78035 which is about optimizing
> address comparisons using points-to info vs. directly looking
> at two addr-exprs.  I've given up trying to dissect a predicate
> more useful than decl_binds_to_current_def_p from
> symtab_node::equal_address_to and thus I'm going with the former.

It also fixes the kernel issue:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77964

Unfortunately Chromium still gets miscompiled:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78047

-- 
Markus

[PATCH] Fix PR78129

2016-10-31 Thread Richard Biener


This fixes leftovers in /tmp from -Werror=suggest-final-types.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-10-31  Richard Biener  

PR lto/78129
* lto.c (do_whole_program_analysis): Bail out after errors
from WPA analysis.

Index: gcc/lto/lto.c
===
--- gcc/lto/lto.c   (revision 241695)
+++ gcc/lto/lto.c   (working copy)
@@ -3092,6 +3092,10 @@ do_whole_program_analysis (void)
 
   execute_ipa_pass_list (g->get_passes ()->all_regular_ipa_passes);
 
+  /* When WPA analysis raises errors, do not bother to output anything.  */
+  if (seen_error ())
+return;
+
   if (symtab->dump_file)
 {
   fprintf (symtab->dump_file, "Optimized ");

[PATCH] Fix PR78035

2016-10-31 Thread Richard Biener


This is an updated patch for PR78035 which is about optimizing
address comparisons using points-to info vs. directly looking
at two addr-exprs.  I've given up trying to dissect a predicate
more useful than decl_binds_to_current_def_p from
symtab_node::equal_address_to and thus I'm going with the former.

Bootstrap & regtest pending on x86_64-unknown-linux-gnu.

Richard.

2016-10-31  Richard Biener  

PR tree-optimization/78035
* gimple-pretty-print.c (pp_points_to_solution): Print
vars_contains_interposable.
* tree-ssa-alias.c: Include varasm.h.
(ptrs_compare_unequal): Check vars_contains_interposable and
decl_binds_to_current_def_p.
(dump_points_to_solution): Dump vars_contains_interposable.
* tree-ssa-alias.h (struct pt_solution): Add vars_contains_interposable
flag.
* tree-ssa-structalias.c: Include varasm.h.
(set_uids_in_ptset): Record whether vars contains a
not decl_binds_to_current_def_p variable in vars_contains_interposable.
(ipa_escaped_pt): Update initializer.

* gcc.target/i386/pr78035.c: New testcase.

Index: trunk/gcc/testsuite/gcc.target/i386/pr78035.c
===
--- trunk/gcc/testsuite/gcc.target/i386/pr78035.c   (revision 0)
+++ trunk/gcc/testsuite/gcc.target/i386/pr78035.c   (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern int a;
+extern int b;
+extern int c;
+
+int foo(int choose_a)
+{
+  int *p;
+  if (choose_a)
+p = 
+  else
+p = 
+  return p != 
+}
+
+int bar ()
+{
+  return  != 
+}
+
+/* We should not optimize away either comparison.  */
+/* { dg-final { scan-assembler-times "cmp" 2 } } */
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 9d47f7f..8286326 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -728,6 +728,12 @@ pp_points_to_solution (pretty_printer *buffer, struct 
pt_solution *pt)
{
  pp_string (buffer, comma);
  pp_string (buffer, "restrict");
+ comma = ", ";
+   }
+ if (pt->vars_contains_interposable)
+   {
+ pp_string (buffer, comma);
+ pp_string (buffer, "interposable");
}
  pp_string (buffer, ")");
}
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 26c9f9e..ebae6cf 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -32,12 +32,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pretty-print.h"
 #include "alias.h"
 #include "fold-const.h"
-
 #include "langhooks.h"
 #include "dumpfile.h"
 #include "tree-eh.h"
 #include "tree-dfa.h"
 #include "ipa-reference.h"
+#include "varasm.h"
 
 /* Broad overview of how alias analysis on gimple works:
 
@@ -373,14 +373,18 @@ ptrs_compare_unequal (tree ptr1, tree ptr2)
   /* We may not use restrict to optimize pointer comparisons.
  See PR71062.  So we have to assume that restrict-pointed-to
 may be in fact obj1.  */
-  if (!pi || pi->pt.vars_contains_restrict)
+  if (!pi
+ || pi->pt.vars_contains_restrict
+ || pi->pt.vars_contains_interposable)
return false;
   if (VAR_P (obj1)
  && (TREE_STATIC (obj1) || DECL_EXTERNAL (obj1)))
{
  varpool_node *node = varpool_node::get (obj1);
  /* If obj1 may bind to NULL give up (see below).  */
- if (! node || ! node->nonzero_address ())
+ if (! node
+ || ! node->nonzero_address ()
+ || ! decl_binds_to_current_def_p (obj1))
return false;
}
   return !pt_solution_includes (>pt, obj1);
@@ -553,7 +557,12 @@ dump_points_to_solution (FILE *file, struct pt_solution 
*pt)
  comma = ", ";
}
  if (pt->vars_contains_restrict)
-   fprintf (file, "%srestrict", comma);
+   {
+ fprintf (file, "%srestrict", comma);
+ comma = ", ";
+   }
+ if (pt->vars_contains_interposable)
+   fprintf (file, "%sinterposable", comma);
  fprintf (file, ")");
}
 }
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 27a06fc..810d83c 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -57,6 +57,8 @@ struct GTY(()) pt_solution
   /* Nonzero if the vars bitmap includes a anonymous variable used to
  represent storage pointed to by a restrict qualified pointer.  */
   unsigned int vars_contains_restrict : 1;
+  /* Nonzero if the vars bitmap includes an interposable variable.  */
+  unsigned int vars_contains_interposable : 1;
 
   /* Set of variables that this pointer may point to.  */
   bitmap vars;
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index fb364f1..10434f6 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -39,6 +39,8 @@

Re: [PATCH] Add -Wshadow-local and -Wshadow-compatible-local.

2016-10-31 Thread Jason Merrill

OK.

On Sun, Oct 30, 2016 at 3:53 PM, Mark Wielaard  wrote:
> On Tue, Oct 25, 2016 at 01:57:09AM +0200, Mark Wielaard wrote:
>> I think the only thing "blocking" the patch from going in is that
>> nobody made a decission on how the actual warning option should be
>> named. I think the suggestion for -Wshadow=[global|local|compatible-local]
>> is the right one. With -Wshadow being an alias for -Wshadow=global.
>> But since there are already gcc versions out there that accept
>> -Wshadow-local and -Wshadow-compatible-local (and you can find some
>> configure scripts that already check for those options) it would be
>> good to have those as (hidden) aliases.
>>
>> If people, some maintainer, agrees with that then we can do the .opt
>> file hacking to make it so.
>
> Nobody objected, nor did anybody say this is a great idea. But I think
> it is. So I just implemented the options this way.
>
> I made one small diagnostic change to fix a regression pointed out by
> gcc/testsuite/gcc.dg/pr48062.c. It should still be possible to ignore
> all shadow warnings with #pragma GCC diagnostic ignored "-Wshadow" when
> -Wshadow was given, but not the new -Wshadow=(local|compatible-local).
> So we now set warning_code = OPT_Wshadow when -Wshadow was given.
>
> The documentation and code comments were updated to refer to the
> new -Wshadow=... variants.
>
> OK to commit the attached patch?
>
> Thanks,
>
> Mark

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-10-31 Thread Tobias Burnus

Martin Sebor wrote:
> Attached is an updated patch that adds checks for excessive sizes
> and bounds (those in excess of SIZE_MAX / 2), and also enables
> the same checking for strcat and strncat).  This version also
> fixes an issue with the interpretation of anti-ranges in the
> first patch.  The improvements exposed two bugs in the regression
> tests.

If I apply this patch to my local trunk - and try to bootstrap GCC,
bootstrapping fails (on x86-64_gnu-linux) as following. I have not
tried to figure out whether the warning (-Werror) makes sense or not.

../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’:
../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t) writing 8 
bytes into a region of size 0 overflows the destination 
[-Werror=stringop-overflow]
   memset (_DATA (note), 0, sizeof (NOTE_DATA (note)));


where NOTE_DATA is defined in rtl.h as

/* Opaque data.  */
#define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE)


Cheers,

Tobias

Re: [C++ PATCH] Fix -std=c++11 -std=gnu++11 option handling (PR c++/77948)

2016-10-31 Thread Jason Merrill

On Sat, Oct 29, 2016 at 1:09 PM, Jakub Jelinek  wrote:
> On Sat, Oct 29, 2016 at 12:41:43PM -0400, Jason Merrill wrote:
>> On Sat, Oct 29, 2016 at 12:09 PM, Jakub Jelinek  wrote:
>> > Apparently the driver reorders the options, -std=* options come before
>> > all -f* options, so both of the following patches work the same, except
>> > when cc1plus is invoked by hand.  The first (inlined) patch solves it
>> > more in line how e.g. -fgnu-keywords and similar options are handled, so I 
>> > think
>> > it is preferable (and I've successfully bootstrapped/regtested it on
>> > x86_64-linux and i686-linux).  The second (attached) patch handles it
>> > by not clearing -fext-numeric-literals during option processing
>> > for -std=* options, just in post option handling clears it for the
>> > strict c++11+ modes if the option is not specified explicitly.
>> > If the driver wouldn't reorder the options, it would handle better
>> > cases like -fext-numeric-literals -std=c++11 to mean the same thing
>> > as -std=c++11 -fext-numeric-literals etc., but due to the reordering
>> > it isn't needed and for consistency we'd also need to change -fgnu-keywords
>> > etc.
>>
>> I lean toward the second patch because it keeps the logic in a single
>> place, but it looks like it doesn't currently handle C++98
>> differently.  I don't think we need to change the handling of other
>> options to match.
>
> The second patch has that
> +  /* Unless -f{,no-}ext-numeric-literals has been used explicitly,
> +for -std=c++{11,14,1z} default to -fno-ext-numeric-literals.  */
> +  if (flag_iso && !global_options_set.x_flag_ext_numeric_literals)
> +   cpp_opts->ext_numeric_literals = 0;
> in a block guarded with cxx_dialect >= cxx11.  So for -std=c++98 it will
> just not disable them

Ah, OK then.

Jason

Re: [PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)

2016-10-31 Thread Martin Liška

On 10/31/2016 11:07 AM, Rainer Orth wrote:
> Hi Martin,
> 
>> Using priority {cd}tors on a target that does not support that can cause
>> failures (see the PR).
>> Apart from that, I decided to use priority 100 for both gcov_init and
>> gcov_exit functions as
>> the reserved range includes priority 100. Moreover, I enhanced test-cases
>> we have.
> 
> just two nits:
> 
> diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C 
> b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
> new file mode 100644
> index 000..7e39565
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
> [...]
> +static void __attribute__ ((constructor ((101 ctor_100 ()
> 
> Should be called ctor_101 now.  Same for dtor_100 below.
> 
>   Rainer
> 

Thanks for the note. Fixed in attached patch.

Martin
>From 0bd345d168bd18754bcb04a72e9717028d601a5f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 26 Oct 2016 12:50:35 +0200
Subject: [PATCH] Don't use priority {cd}tors if not supported by a target (PR
 gcov-profile/78086)

gcc/testsuite/ChangeLog:

2016-10-26  Martin Liska  

	* g++.dg/gcov/pr16855.C: Clean up the test case.
	* g++.dg/gcov/pr16855-priority.C: New test.

gcc/ChangeLog:

2016-10-26  Martin Liska  

	* coverage.c (build_init_ctor): Don't use priority {cd}tors if
	not supported by a target.  Set priority to 100 if possible.
	(build_gcov_exit_decl): Likewise.
---
 gcc/coverage.c   | 13 +++--
 gcc/testsuite/g++.dg/gcov/pr16855-priority.C | 79 
 gcc/testsuite/g++.dg/gcov/pr16855.C  | 55 +--
 3 files changed, 116 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gcov/pr16855-priority.C

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 8810710..4167e26 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1056,8 +1056,10 @@ build_init_ctor (tree gcov_info_type)
   stmt = build_call_expr (init_fn, 1, stmt);
   append_to_statement_list (stmt, );
 
-  /* Generate a constructor to run it (with priority 99).  */
-  cgraph_build_static_cdtor ('I', ctor, DEFAULT_INIT_PRIORITY - 1);
+  /* Generate a constructor to run it.  */
+  int priority = SUPPORTS_INIT_PRIORITY
+? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY;
+  cgraph_build_static_cdtor ('I', ctor, priority);
 }
 
 /* Generate the destructor function to call __gcov_exit.  */
@@ -1078,8 +1080,11 @@ build_gcov_exit_decl (void)
   tree stmt = build_call_expr (init_fn, 0);
   append_to_statement_list (stmt, );
 
-  /* Generate a destructor to run it (with priority 99).  */
-  cgraph_build_static_cdtor ('D', dtor, MAX_RESERVED_INIT_PRIORITY - 1);
+  /* Generate a destructor to run it.  */
+  int priority = SUPPORTS_INIT_PRIORITY
+? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY;
+
+  cgraph_build_static_cdtor ('D', dtor, priority);
 }
 
 /* Create the gcov_info types and object.  Generate the constructor
diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
new file mode 100644
index 000..5c1fbc4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
@@ -0,0 +1,79 @@
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+/* { dg-require-effective-target init_priority } */
+
+#include 
+#include 
+
+int a;
+
+void
+foo ()
+{
+  fprintf (stderr, "In foo\n");
+  a = 123; /* count(1) */
+}
+
+using namespace std;
+class Test
+{
+public:
+  Test (void) { fprintf (stderr, "In Test::Test\n"); /* count(1) */ }
+  ~Test (void) { fprintf (stderr, "In Test::~Test\n"); /* count(1) */ }
+} T1;
+
+void
+uncalled (void)
+{
+  fprintf (stderr, "In uncalled\n"); /* count(#) */
+}
+
+int
+main (void)
+{
+  atexit ();
+  fprintf (stderr, "In main\n"); /* count(1) */
+  return 0;
+}
+
+static void __attribute__ ((constructor)) ctor_default ()
+{
+  fprintf (stderr, "in constructor(())\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((101 ctor_101 ()
+{
+  fprintf (stderr, "in constructor((101))\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((500 ctor_500 ()
+{
+  fprintf (stderr, "in constructor((500))\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((65535 ctor_65535 ()
+{
+  fprintf (stderr, "in constructor((65535))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor)) dtor_default ()
+{
+  fprintf (stderr, "in destructor(())\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((101 dtor_101 ()
+{
+  fprintf (stderr, "in destructor((101))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((500 dtor_500 ()
+{
+  fprintf (stderr, "in destructor((500))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((65535 dtor_65535 ()
+{
+  fprintf (stderr, "in destructor((65535))\n"); /* count(1) */
+}
+
+/* { dg-final { run-gcov branches { -b pr16855-priority.C } } } */
diff --git

Re: [PATCH][AArch64] Add function comments to some prologue/epilogue helpers

2016-10-31 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 24/10/16 12:30, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00839.html

Thanks,
Kyrill

On 12/10/16 11:23, Kyrill Tkachov wrote:

Hi all,

I'm looking at the aarch64 prologue and epilogue generation code and I noticed 
many of the helper
functions don't have function comments so it makes it harder than it has to to 
understand what's going on.
This patch adds function comments to some of them. I hope I understood the 
functions correctly.

Is this ok for trunk?

Thanks,
Kyrill

2016-10-12  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_register_saved_on_entry): Add
function comment.
(aarch64_next_callee_save): Likewise.
(aarch64_pushwb_single_reg): Likewise.
(aarch64_gen_storewb_pair): Likewise.
(aarch64_push_regs): Likewise.
(aarch64_gen_loadwb_pair): Likewise.
(aarch64_pop_regs): Likewise.
(aarch64_gen_store_pair): Likewise.
(aarch64_gen_load_pair): Likewise.
(aarch64_save_callee_saves): Likewise.
(aarch64_restore_callee_saves): Likewise.

Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns

2016-10-31 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 24/10/16 14:12, Kyrill Tkachov wrote:


On 24/10/16 12:29, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01321.html



I just noticed my original ChangeLog entry was truncated.
It is
2016-10-04  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-04  Kyrylo Tkachov  

PR target/77822
* g++.dg/torture/pr77822.C: New test.

Kyrill



On 17/10/16 17:15, Kyrill Tkachov wrote:

Hi all,

For the attached testcase the code ends up trying to extract bits outside the 
range of the normal register
widths. The aarch64 patterns for ubfz and tbnz end up accepting such operands 
and emitting invalid assembly
such as 'ubfx x18,x2,192,32'

The solution is to add proper predicates and guards to the operands of the 
zero_extract operations that are going on.
I had a look at all the other patterns in aarch64 that generate/use 
zero_extract and they all have guards on their
operands in one form or another to avoid them accessing an area that is out of 
range.

With this patch the testcase compiles and assembles fine.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-10-17  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-17  Kyrylo Tkachov  

PR target/77822

Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2016-10-31 Thread Kyrill Tkachov



On 24/10/16 17:15, Andrew Pinski wrote:

On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachov
 wrote:

Hi all,

When storing a 64-bit immediate that has equal bottom and top halves we
currently
synthesize the repeating 32-bit pattern twice and perform a single X-store.
With this patch we synthesize the 32-bit pattern once into a W register and
store
that twice using an STP. This reduces codesize bloat from synthesising the
same
constant multiple times at the expense of converting a store to a
store-pair.
It will only trigger if we can save two or more instructions, so it will
only transform:
 mov x1, 49370
 movkx1, 0xc0da, lsl 32
 str x1, [x0]

into:

 mov w1, 49370
 stp w1, w1, [x0]

when optimising for -Os, whereas it will always transform a 4-insn synthesis
sequence into a two-insn sequence + STP (see comments in the patch).

This patch triggers already but will trigger more with the store merging
pass
that I'm working on since that will generate more of these repeating 64-bit
constants.
This helps improve codegen on 456.hmmer where store merging can sometimes
create very
complex repeating constants and target-specific expand needs to break them
down.


Doing STP might be worse on ThunderX 1 than the mov/movk.  Or this
might cause an ICE with -mcpu=thunderx; I can't remember if the check
for slow unaligned store pair word is with the pattern or not.


I can't get it to ICE with -mcpu=thunderx.
The restriction is just on the STP forming code in the sched-fusion hooks AFAIK.


Basically the rule is
1) if 4 byte aligned, then it is better to do two str.
2) If 8 byte aligned, then doing stp is good
3) Otherwise it is better to do two str.


Ok, then I'll make the function just emit two stores and depend on the 
sched-fusion
machinery to fuse them into an STP when appropriate since that has the logic 
that
takes thunderx into account.

Thanks for the info.
Kyrill



Thanks,
Andrew



Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-10-24  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (mov): Call
 aarch64_split_dimode_const_store on DImode constant stores.
 * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store):
 New prototype.
 * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
 function.

2016-10-24  Kyrylo Tkachov  

 * gcc.target/aarch64/store_repeating_constant_1.c: New test.
 * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.

Re: [PATCH 3/5] [AARCH64] Fix part num and implement dependency

2016-10-31 Thread James Greenhalgh

On Sun, Oct 23, 2016 at 03:37:22PM -0700, Andrew Pinski wrote:
> On Tue, Nov 17, 2015 at 2:10 PM, Andrew Pinski  wrote:
> >
> > The way the current code was written assumes all cores have an unique part
> > num which is not true.  What they have is an unique pair of implementer and
> > part num.  This changes the code to look up that pair after the parsing
> > of the two is done.
> >
> > Someone should test this on a big.little target too just to make sure
> > the parsing is done correctly as I don't have access to one off hand.
> >
> > OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
> 
> 
> Here is an updated version of this patch after the other patches have
> now gone in.
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions
> and -mcpu=native still works.

OK.

The logic in this file looks a bit upside down in places. I might have a go
at refactoring it at some point in stage 3 or 4, but that's no reason to
block this patch for now.

Thanks,
James

> Index: config/aarch64/driver-aarch64.c
> ===
> --- config/aarch64/driver-aarch64.c   (revision 241437)
> +++ config/aarch64/driver-aarch64.c   (working copy)
> @@ -169,7 +169,6 @@ host_detect_local_cpu (int argc, const c
>bool tune = false;
>bool cpu = false;
>unsigned int i = 0;
> -  unsigned int core_idx = 0;
>unsigned char imp = INVALID_IMP;
>unsigned int cores[2] = { INVALID_CORE, INVALID_CORE };
>unsigned int n_cores = 0;
> @@ -219,18 +218,13 @@ host_detect_local_cpu (int argc, const c
>if (strstr (buf, "part") != NULL)
>   {
> unsigned ccore = parse_field (buf);
> -   for (i = 0; aarch64_cpu_data[i].name != NULL; i++)
> - if (ccore == aarch64_cpu_data[i].part_no
> - && !contains_core_p (cores, ccore))
> -   {
> - if (n_cores == 2)
> -   goto not_found;
> -
> - cores[n_cores++] = ccore;
> - core_idx = i;
> - arch_id = aarch64_cpu_data[i].arch;
> - break;
> -   }
> +   if (!contains_core_p (cores, ccore))
> + {
> +   if (n_cores == 2)
> + goto not_found;
> +
> +   cores[n_cores++] = ccore;
> + }
> continue;
>   }
>if (!tune && !processed_exts && strstr (buf, "Features") != NULL)
> @@ -276,11 +270,19 @@ host_detect_local_cpu (int argc, const c
>if (n_cores == 0 || n_cores > 2 || imp == INVALID_IMP)
>  goto not_found;
>  
> -  if (arch && !arch_id)
> -goto not_found;
> -
>if (arch)
>  {
> +  /* Search for one of the cores in the list. */
> +  for (i = 0; aarch64_cpu_data[i].name != NULL; i++)
> + if (aarch64_cpu_data[i].implementer_id == imp
> + && contains_core_p (cores, aarch64_cpu_data[i].part_no))
> +   {
> + arch_id = aarch64_cpu_data[i].arch;
> + break;
> +   }
> +  if (!arch_id)
> + goto not_found;
> +
>struct aarch64_arch_driver_info* arch_info = get_arch_from_id 
> (arch_id);
>  
>/* We got some arch indentifier that's not in aarch64-arches.def?  */
> @@ -312,7 +314,15 @@ host_detect_local_cpu (int argc, const c
>/* The simple, non-big.LITTLE case.  */
>else
>  {
> -  if (aarch64_cpu_data[core_idx].implementer_id != imp)
> +  int core_idx = -1;
> +  for (i = 0; aarch64_cpu_data[i].name != NULL; i++)
> + if (cores[0] == aarch64_cpu_data[i].part_no
> + && aarch64_cpu_data[i].implementer_id == imp)
> +   {
> + core_idx = i;
> + break;
> +   }
> +  if (core_idx == -1)
>   goto not_found;
>  
>res = concat ("-m", cpu ? "cpu" : "tune", "=",

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Richard Biener

On Mon, Oct 31, 2016 at 11:18 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
> On 10/31/2016 01:12 AM, Richard Sandiford wrote:
>> Richard Biener  writes:
>>> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
 On 10/27/2016 03:35 PM, Richard Biener wrote:
> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
>> Running simple test-case w/o the proper header file causes ICE:
>> strncmp ("a", "b", -1);
>>
>> 0xe74462 tree_to_uhwi(tree_node const*)
>> ../../gcc/tree.c:7324
>> 0x90a23f host_size_t_cst_p
>> ../../gcc/fold-const-call.c:63
>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
>> tree_node*, tree_node*)
>> ../../gcc/fold-const-call.c:1512
>> 0x787b01 fold_builtin_3
>> ../../gcc/builtins.c:8385
>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**,
>> int, bool)
>> ../../gcc/builtins.c:8465
>> 0x9052b1 fold(tree_node*)
>> ../../gcc/fold-const.c:11919
>> 0x6de2bb c_fully_fold_internal
>> ../../gcc/c/c-fold.c:185
>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
>> ../../gcc/c/c-fold.c:90
>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
>> ../../gcc/c/c-typeck.c:10369
>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
>> ../../gcc/c/c-typeck.c:10414
>> 0x6cb578 c_parser_statement_after_labels
>> ../../gcc/c/c-parser.c:5430
>> 0x6cd333 c_parser_compound_statement_nostart
>> ../../gcc/c/c-parser.c:4944
>> 0x6cdbde c_parser_compound_statement
>> ../../gcc/c/c-parser.c:4777
>> 0x6c93ac c_parser_declaration_or_fndef
>> ../../gcc/c/c-parser.c:2176
>> 0x6d51ab c_parser_external_declaration
>> ../../gcc/c/c-parser.c:1574
>> 0x6d5c09 c_parser_translation_unit
>> ../../gcc/c/c-parser.c:1454
>> 0x6d5c09 c_parse_file()
>> ../../gcc/c/c-parser.c:18173
>> 0x72ffd2 c_common_parse_file()
>> ../../gcc/c-family/c-opts.c:1087
>>
>> Following patch improves the host_size_t_cst_p predicate.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives
>> regression tests.
>>
>> Ready to be installed?
>
> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
> CHAR_BIT test is now redundant.
>
> OTOH it was probably desired to allow -1 here?  A little looking back
> in time should tell.

 Ok, it started with r229922, where it was changed from:

   if (tree_fits_uhwi_p (len) && p1 && p2)
 {
   const int i = strncmp (p1, p2, tree_to_uhwi (len));
 ...

 to current version:

 case CFN_BUILT_IN_STRNCMP:
   {
 bool const_size_p = host_size_t_cst_p (arg2, );

 Thus I'm suggesting to change to back to it.

 Ready to be installed?
>>>
>>> Let's ask Richard.
>>
>> The idea with the:
>>
>>   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT
>>
>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
>> I think we still want that.
>
> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
> wi::min_precision check, right?

 Not sure.  If we have host_size_t_cst_p then we should have a corresponding
 size_t host_size_t (const_tree) and should use those in pairs.  Not sure
 why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.
>>>
>>> It's the other way around: something can satisfy tree_fits_uhwi_p
>>> (i.e. fit within a uint64_t) but not fit within the host's size_t.
>>> The kind of case I'm thinking of is:
>>>
>>>   strncmp ("fi", "fo", (1L << 32) + 1)
>>>
>>> for an ILP32 host and LP64 target.  There's a danger that by passing
>>> the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd
>>> truncate it to 1, giving the wrong result.
>>
>> Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then?
>> (unless we have a > 64bit host size_t).
>
> Because in Martin's test case the length has a signed type.
> tree_to_uhwi then treats the argument as -1 to infinite precision
> rather than ~(size_t) 0 to infinite precision.

Indeed.  So the

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Richard Sandiford

Richard Biener  writes:
> On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
 On 10/31/2016 01:12 AM, Richard Sandiford wrote:
> Richard Biener  writes:
>> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
>>> On 10/27/2016 03:35 PM, Richard Biener wrote:
 On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
> Running simple test-case w/o the proper header file causes ICE:
> strncmp ("a", "b", -1);
>
> 0xe74462 tree_to_uhwi(tree_node const*)
> ../../gcc/tree.c:7324
> 0x90a23f host_size_t_cst_p
> ../../gcc/fold-const-call.c:63
> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
> tree_node*, tree_node*)
> ../../gcc/fold-const-call.c:1512
> 0x787b01 fold_builtin_3
> ../../gcc/builtins.c:8385
> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**,
> int, bool)
> ../../gcc/builtins.c:8465
> 0x9052b1 fold(tree_node*)
> ../../gcc/fold-const.c:11919
> 0x6de2bb c_fully_fold_internal
> ../../gcc/c/c-fold.c:185
> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
> ../../gcc/c/c-fold.c:90
> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
> ../../gcc/c/c-typeck.c:10369
> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
> ../../gcc/c/c-typeck.c:10414
> 0x6cb578 c_parser_statement_after_labels
> ../../gcc/c/c-parser.c:5430
> 0x6cd333 c_parser_compound_statement_nostart
> ../../gcc/c/c-parser.c:4944
> 0x6cdbde c_parser_compound_statement
> ../../gcc/c/c-parser.c:4777
> 0x6c93ac c_parser_declaration_or_fndef
> ../../gcc/c/c-parser.c:2176
> 0x6d51ab c_parser_external_declaration
> ../../gcc/c/c-parser.c:1574
> 0x6d5c09 c_parser_translation_unit
> ../../gcc/c/c-parser.c:1454
> 0x6d5c09 c_parse_file()
> ../../gcc/c/c-parser.c:18173
> 0x72ffd2 c_common_parse_file()
> ../../gcc/c-family/c-opts.c:1087
>
> Following patch improves the host_size_t_cst_p predicate.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives
> regression tests.
>
> Ready to be installed?

 I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
 CHAR_BIT test is now redundant.

 OTOH it was probably desired to allow -1 here?  A little looking back
 in time should tell.
>>>
>>> Ok, it started with r229922, where it was changed from:
>>>
>>>   if (tree_fits_uhwi_p (len) && p1 && p2)
>>> {
>>>   const int i = strncmp (p1, p2, tree_to_uhwi (len));
>>> ...
>>>
>>> to current version:
>>>
>>> case CFN_BUILT_IN_STRNCMP:
>>>   {
>>> bool const_size_p = host_size_t_cst_p (arg2, );
>>>
>>> Thus I'm suggesting to change to back to it.
>>>
>>> Ready to be installed?
>>
>> Let's ask Richard.
>
> The idea with the:
>
>   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT
>
> test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
> I think we still want that.

 OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
 wi::min_precision check, right?
>>>
>>> Not sure.  If we have host_size_t_cst_p then we should have a corresponding
>>> size_t host_size_t (const_tree) and should use those in pairs.  Not sure
>>> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.
>>
>> It's the other way around: something can satisfy tree_fits_uhwi_p
>> (i.e. fit within a uint64_t) but not fit within the host's size_t.
>> The kind of case I'm thinking of is:
>>
>>   strncmp ("fi", "fo", (1L << 32) + 1)
>>
>> for an ILP32 host and LP64 target.  There's a danger that by passing
>> the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd
>> truncate it to 1, giving the wrong result.
>
> Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then?
> (unless we have a > 64bit host size_t).

Because in Martin's test case the length has a signed type.
tree_to_uhwi then treats the argument as -1 to infinite precision
rather than ~(size_t) 0 to infinite precision.

Thanks,
Richard

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Richard Biener

On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
>>> On 10/31/2016 01:12 AM, Richard Sandiford wrote:
 Richard Biener  writes:
> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
>> On 10/27/2016 03:35 PM, Richard Biener wrote:
>>> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
 Running simple test-case w/o the proper header file causes ICE:
 strncmp ("a", "b", -1);

 0xe74462 tree_to_uhwi(tree_node const*)
 ../../gcc/tree.c:7324
 0x90a23f host_size_t_cst_p
 ../../gcc/fold-const-call.c:63
 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
 tree_node*, tree_node*)
 ../../gcc/fold-const-call.c:1512
 0x787b01 fold_builtin_3
 ../../gcc/builtins.c:8385
 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, 
 bool)
 ../../gcc/builtins.c:8465
 0x9052b1 fold(tree_node*)
 ../../gcc/fold-const.c:11919
 0x6de2bb c_fully_fold_internal
 ../../gcc/c/c-fold.c:185
 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
 ../../gcc/c/c-fold.c:90
 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
 ../../gcc/c/c-typeck.c:10369
 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
 ../../gcc/c/c-typeck.c:10414
 0x6cb578 c_parser_statement_after_labels
 ../../gcc/c/c-parser.c:5430
 0x6cd333 c_parser_compound_statement_nostart
 ../../gcc/c/c-parser.c:4944
 0x6cdbde c_parser_compound_statement
 ../../gcc/c/c-parser.c:4777
 0x6c93ac c_parser_declaration_or_fndef
 ../../gcc/c/c-parser.c:2176
 0x6d51ab c_parser_external_declaration
 ../../gcc/c/c-parser.c:1574
 0x6d5c09 c_parser_translation_unit
 ../../gcc/c/c-parser.c:1454
 0x6d5c09 c_parse_file()
 ../../gcc/c/c-parser.c:18173
 0x72ffd2 c_common_parse_file()
 ../../gcc/c-family/c-opts.c:1087

 Following patch improves the host_size_t_cst_p predicate.

 Patch can bootstrap on ppc64le-redhat-linux and survives
 regression tests.

 Ready to be installed?
>>>
>>> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
>>> CHAR_BIT test is now redundant.
>>>
>>> OTOH it was probably desired to allow -1 here?  A little looking back
>>> in time should tell.
>>
>> Ok, it started with r229922, where it was changed from:
>>
>>   if (tree_fits_uhwi_p (len) && p1 && p2)
>> {
>>   const int i = strncmp (p1, p2, tree_to_uhwi (len));
>> ...
>>
>> to current version:
>>
>> case CFN_BUILT_IN_STRNCMP:
>>   {
>> bool const_size_p = host_size_t_cst_p (arg2, );
>>
>> Thus I'm suggesting to change to back to it.
>>
>> Ready to be installed?
>
> Let's ask Richard.

 The idea with the:

   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT

 test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
 I think we still want that.
>>>
>>> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
>>> wi::min_precision check, right?
>>
>> Not sure.  If we have host_size_t_cst_p then we should have a corresponding
>> size_t host_size_t (const_tree) and should use those in pairs.  Not sure
>> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.
>
> It's the other way around: something can satisfy tree_fits_uhwi_p
> (i.e. fit within a uint64_t) but not fit within the host's size_t.
> The kind of case I'm thinking of is:
>
>   strncmp ("fi", "fo", (1L << 32) + 1)
>
> for an ILP32 host and LP64 target.  There's a danger that by passing
> the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd
> truncate it to 1, giving the wrong result.

Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then?
(unless we have a > 64bit host size_t).

Richard.

>
> Thanks,
> Richard

Re: [PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)

2016-10-31 Thread Rainer Orth

Hi Martin,

> Using priority {cd}tors on a target that does not support that can cause
> failures (see the PR).
> Apart from that, I decided to use priority 100 for both gcov_init and
> gcov_exit functions as
> the reserved range includes priority 100. Moreover, I enhanced test-cases
> we have.

just two nits:

diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C 
b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
new file mode 100644
index 000..7e39565
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
[...]
+static void __attribute__ ((constructor ((101 ctor_100 ()

Should be called ctor_101 now.  Same for dtor_100 below.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fix computation of register limit for -fsched-pressure

2016-10-31 Thread Tamar Christina

Yeah, In the test it's reasonable, just used one extra register.

I'm currently running some benchmarks to make sure those are OK too.

Regards,
Tamar


From: Bin.Cheng 
Sent: Friday, October 28, 2016 12:38:28 PM
To: Tamar Christina
Cc: Pat Haugen; Maxim Kuvyrkov; GCC Patches; nd
Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure

On Fri, Oct 28, 2016 at 12:27 PM, Tamar Christina
 wrote:
> Looking at it again,
>
> it seems to be that the testcase should be adjusted.
> There's no actual spilling. It just uses more registers than before due to 
> the scheduling.
Sorry I didn't look into the test, but using more registers sounds
like a regression too?  At least we need to make sure it's reasonable.

Thanks,
bin
>
> I will update the testcase.
>
> Thanks.
>
> 
> From: gcc-patches-ow...@gcc.gnu.org  on behalf 
> of Tamar Christina 
> Sent: Friday, October 28, 2016 10:53:20 AM
> To: Pat Haugen; Maxim Kuvyrkov
> Cc: GCC Patches; nd
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> Forwarding to list as well.
> 
> From: Tamar Christina
> Sent: Friday, October 28, 2016 10:52:17 AM
> To: Pat Haugen; Maxim Kuvyrkov
> Cc: GCC Patches
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> Hi Pat,
>
> The commit seems to be causing some odd stack spills on aarch64.
>
> I've created a new ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78142
>
> Thanks,
> Tamar
>
> 
> From: gcc-patches-ow...@gcc.gnu.org  on behalf 
> of Pat Haugen 
> Sent: Tuesday, October 18, 2016 4:07:55 PM
> To: Maxim Kuvyrkov
> Cc: GCC Patches
> Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure
>
> On 10/18/2016 05:31 AM, Maxim Kuvyrkov wrote:
>>> > I see your point and agree that current code isn't optimal.  However, I 
>>> > don't think your patch is accurate either.  Consider 
>>> > https://gcc.gnu.org/onlinedocs/gccint/Register-Basics.html and let's 
>>> > assume that FIXED_REGISTERS in class CL is set for a third of the 
>>> > registers, and CALL_USED_REGISTERS is set to "1" for another third of 
>>> > registers.  So we have a third available for zero-cost allocation 
>>> > (CALL_USED_REGISTERS-FIXED_REGISTERS), a third available for spill-cost 
>>> > allocation (ALL_REGISTERS-CALL_USED_REGISTERS) and a third non-available 
>>> > (FIXED_REGISTERS).
>>> >
>>> > For a non-loop-single-basic-block function we should be targeting only 
>>> > the third of register available at zero-cost -- correct?
> Yes.
>
>   This is what is done by the current code, but, apparently, by accident.  It 
> seems that the right register count can be obtained with:
>>> >
>>> >  for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
>>> > -  if (call_used_regs[ira_class_hard_regs[cl][i]])
>>> > -++call_used_regs_num[cl];
>>> > +  if (!call_used_regs[ira_class_hard_regs[cl][i]]
>>> > +   || fixed_regs[ira_class_hard_regs[cl][i]])
>>> > +++call_saved_regs_num[cl];
>>> >
>>> > Does this look correct to you?
>> Thinking some more, it seems like fixed_regs should not be available to the 
>> scheduler no matter what.  Therefore, the number of fixed registers should 
>> be subtracted from ira_class_hard_regs_num[cl] without any scaling 
>> (entry_freq / bb_freq).
>
> Ahh, yes, I forgot about FIXED_REGISTERS being included in 
> CALL_USED_REGISTERS. I agree they should be totally removed from the register 
> limit calculation. I'll rework the patch.
>
> Thanks,
> Pat
>

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Richard Sandiford

Richard Biener  writes:
> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
>> On 10/31/2016 01:12 AM, Richard Sandiford wrote:
>>> Richard Biener  writes:
 On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
> On 10/27/2016 03:35 PM, Richard Biener wrote:
>> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
>>> Running simple test-case w/o the proper header file causes ICE:
>>> strncmp ("a", "b", -1);
>>>
>>> 0xe74462 tree_to_uhwi(tree_node const*)
>>> ../../gcc/tree.c:7324
>>> 0x90a23f host_size_t_cst_p
>>> ../../gcc/fold-const-call.c:63
>>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
>>> tree_node*, tree_node*)
>>> ../../gcc/fold-const-call.c:1512
>>> 0x787b01 fold_builtin_3
>>> ../../gcc/builtins.c:8385
>>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, 
>>> bool)
>>> ../../gcc/builtins.c:8465
>>> 0x9052b1 fold(tree_node*)
>>> ../../gcc/fold-const.c:11919
>>> 0x6de2bb c_fully_fold_internal
>>> ../../gcc/c/c-fold.c:185
>>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
>>> ../../gcc/c/c-fold.c:90
>>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10369
>>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
>>> ../../gcc/c/c-typeck.c:10414
>>> 0x6cb578 c_parser_statement_after_labels
>>> ../../gcc/c/c-parser.c:5430
>>> 0x6cd333 c_parser_compound_statement_nostart
>>> ../../gcc/c/c-parser.c:4944
>>> 0x6cdbde c_parser_compound_statement
>>> ../../gcc/c/c-parser.c:4777
>>> 0x6c93ac c_parser_declaration_or_fndef
>>> ../../gcc/c/c-parser.c:2176
>>> 0x6d51ab c_parser_external_declaration
>>> ../../gcc/c/c-parser.c:1574
>>> 0x6d5c09 c_parser_translation_unit
>>> ../../gcc/c/c-parser.c:1454
>>> 0x6d5c09 c_parse_file()
>>> ../../gcc/c/c-parser.c:18173
>>> 0x72ffd2 c_common_parse_file()
>>> ../../gcc/c-family/c-opts.c:1087
>>>
>>> Following patch improves the host_size_t_cst_p predicate.
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives
>>> regression tests.
>>>
>>> Ready to be installed?
>>
>> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
>> CHAR_BIT test is now redundant.
>>
>> OTOH it was probably desired to allow -1 here?  A little looking back
>> in time should tell.
>
> Ok, it started with r229922, where it was changed from:
>
>   if (tree_fits_uhwi_p (len) && p1 && p2)
> {
>   const int i = strncmp (p1, p2, tree_to_uhwi (len));
> ...
>
> to current version:
>
> case CFN_BUILT_IN_STRNCMP:
>   {
> bool const_size_p = host_size_t_cst_p (arg2, );
>
> Thus I'm suggesting to change to back to it.
>
> Ready to be installed?

 Let's ask Richard.
>>>
>>> The idea with the:
>>>
>>>   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT
>>>
>>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
>>> I think we still want that.
>>
>> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
>> wi::min_precision check, right?
>
> Not sure.  If we have host_size_t_cst_p then we should have a corresponding
> size_t host_size_t (const_tree) and should use those in pairs.  Not sure
> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.

It's the other way around: something can satisfy tree_fits_uhwi_p
(i.e. fit within a uint64_t) but not fit within the host's size_t.
The kind of case I'm thinking of is:

  strncmp ("fi", "fo", (1L << 32) + 1)

for an ILP32 host and LP64 target.  There's a danger that by passing
the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd
truncate it to 1, giving the wrong result.

Thanks,
Richard

[PATCH] libiberty: Fix memory leak in ada_demangle when symbol cannot be demangled.

2016-10-31 Thread Mark Wielaard

When a symbol cannot be demangled in ada_demangle a new demangled VEC
will be allocated without deleting the demangled VEC already in use.

Running testsuite/test-demangle under valgrind will show the leak for
this entry in testsuite/demangle-expected:

# Elaborated flag (not demangled)
--format=gnat
x_E


 11 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
by 0x413FE7: xmalloc (xmalloc.c:148)
by 0x4025EC: ada_demangle (cplus-dem.c:930)
by 0x402C59: cplus_demangle (cplus-dem.c:892)
by 0x400FEC: main (test-demangle.c:317)

libiberty/ChangeLog:

* cplus-dem.c (ada_demangle): Initialize demangled to NULL and
XDELETEVEC demangled when unknown.
---
 libiberty/ChangeLog   | 5 +
 libiberty/cplus-dem.c | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 5934bc1..73cbcc4 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,8 @@
+2016-10-31  Mark Wielaard  
+
+   * cplus-dem.c (ada_demangle): Initialize demangled to NULL and
+   XDELETEVEC demangled when unknown.
+
 2016-09-19  Andrew Stubbs  
 
* pex-win32.c (argv_to_cmdline): Quote zero-length parameters.
diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
index f954050..7f63397 100644
--- a/libiberty/cplus-dem.c
+++ b/libiberty/cplus-dem.c
@@ -911,7 +911,7 @@ ada_demangle (const char *mangled, int option 
ATTRIBUTE_UNUSED)
   int len0;
   const char* p;
   char *d;
-  char *demangled;
+  char *demangled = NULL;
   
   /* Discard leading _ada_, which is used for library level subprograms.  */
   if (strncmp (mangled, "_ada_", 5) == 0)
@@ -1156,6 +1156,7 @@ ada_demangle (const char *mangled, int option 
ATTRIBUTE_UNUSED)
   return demangled;
 
  unknown:
+  XDELETEVEC (demangled);
   len0 = strlen (mangled);
   demangled = XNEWVEC (char, len0 + 3);
 
-- 
1.8.3.1

Re: [PATCH] Fix and testcases for pr72747

2016-10-31 Thread Richard Biener

On Fri, Oct 28, 2016 at 4:06 PM, Will Schmidt  wrote:
> On Fri, 2016-10-28 at 08:31 -0500, Will Schmidt wrote:
>> On Fri, 2016-10-28 at 10:38 +0200, Richard Biener wrote:
>> > On Thu, Oct 27, 2016 at 5:37 PM, Will Schmidt  
>> > wrote:
>> > > Hi,
>> > >
>> > > Per PR72747, A statement such as "v = vec_splats (1);" correctly
>> > > initializes a vector.  However, a statement such as "v[1] = v[0] =
>> > > vec_splats (1);" initializes both v[1] and v[0] to random garbage.
>> > >
>> > > It has been determined that this is occurring because we did not emit
>> > > the actual initialization statement before our final exit from
>> > > gimplify_init_constructor, at which time we lose the expression when we
>> > > assign *expr_p to either NULL or object.  This problem affected both 
>> > > constant
>> > > and non-constant initializers.  Corrected this by moving the logic to
>> > > emit the statement up earlier within the if/else logic.
>> > >
>> > > Bootstrapped and make check ran without regressions on
>> > > powerpc64le-unknown-linux-gnu.
>> > >
>> > > OK for trunk?
>> >
>> > Ok.
>> >
>> > RIchard.
>>
>> Committed revision 241647.
>> Thanks.
>
> Oops, forgot to ask.  After some burn-in time, is this OK to backport to
> the 5 and 6 branches?

Yes.

Richard.

> The bug was first noticed in GCC 5.
>
> Thanks,
> -Will
>
>>
>>
>> >
>> > > Thanks,
>> > > -Will
>> > >
>> > > gcc:
>> > > 2016-10-26  Will Schmidt 
>> > >
>> > > PR middle-end/72747
>> > > * gimplify.c (gimplify_init_constructor): Move emit of 
>> > > constructor
>> > > assignment to earlier in the if/else logic.
>> > >
>> > > testsuite:
>> > > 2016-10-26  Will Schmidt 
>> > >
>> > > PR middle-end/72747
>> > > * c-c++-common/pr72747-1.c: New test.
>> > > * c-c++-common/pr72747-2.c: Likewise.
>> > >
>> > > Index: gcc/gimplify.c
>> > > ===
>> > > --- gcc/gimplify.c  (revision 241600)
>> > > +++ gcc/gimplify.c  (working copy)
>> > > @@ -4730,24 +4730,23 @@
>> > >
>> > >if (ret == GS_ERROR)
>> > >  return GS_ERROR;
>> > > -  else if (want_value)
>> > > +  /* If we have gimplified both sides of the initializer but have
>> > > + not emitted an assignment, do so now.  */
>> > > +  if (*expr_p)
>> > >  {
>> > > +  tree lhs = TREE_OPERAND (*expr_p, 0);
>> > > +  tree rhs = TREE_OPERAND (*expr_p, 1);
>> > > +  gassign *init = gimple_build_assign (lhs, rhs);
>> > > +  gimplify_seq_add_stmt (pre_p, init);
>> > > +}
>> > > +  if (want_value)
>> > > +{
>> > >*expr_p = object;
>> > >return GS_OK;
>> > >  }
>> > >else
>> > >  {
>> > > -  /* If we have gimplified both sides of the initializer but have
>> > > -not emitted an assignment, do so now.  */
>> > > -  if (*expr_p)
>> > > -   {
>> > > - tree lhs = TREE_OPERAND (*expr_p, 0);
>> > > - tree rhs = TREE_OPERAND (*expr_p, 1);
>> > > - gassign *init = gimple_build_assign (lhs, rhs);
>> > > - gimplify_seq_add_stmt (pre_p, init);
>> > > - *expr_p = NULL;
>> > > -   }
>> > > -
>> > > +  *expr_p = NULL;
>> > >return GS_ALL_DONE;
>> > >  }
>> > >  }
>> > > Index: gcc/testsuite/c-c++-common/pr72747-1.c
>> > > ===
>> > > --- gcc/testsuite/c-c++-common/pr72747-1.c  (revision 0)
>> > > +++ gcc/testsuite/c-c++-common/pr72747-1.c  (working copy)
>> > > @@ -0,0 +1,16 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-require-effective-target powerpc_altivec_ok } */
>> > > +/* { dg-options "-maltivec -fdump-tree-gimple" } */
>> > > +
>> > > +/* PR 72747: Test that cascaded definition is happening for constant 
>> > > vectors. */
>> > > +
>> > > +#include 
>> > > +
>> > > +int main (int argc, char *argv[])
>> > > +{
>> > > +   __vector int v1,v2;
>> > > +   v1 = v2 = vec_splats ((int) 42);
>> > > +   return 0;
>> > > +}
>> > > +/* { dg-final { scan-tree-dump-times " v2 = { 42, 42, 42, 42 }" 1 
>> > > "gimple" } } */
>> > > +
>> > > Index: gcc/testsuite/c-c++-common/pr72747-2.c
>> > > ===
>> > > --- gcc/testsuite/c-c++-common/pr72747-2.c  (revision 0)
>> > > +++ gcc/testsuite/c-c++-common/pr72747-2.c  (working copy)
>> > > @@ -0,0 +1,18 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-require-effective-target powerpc_altivec_ok } */
>> > > +/* { dg-options "-c -maltivec -fdump-tree-gimple" } */
>> > > +
>> > > +/* PR 72747: test that cascaded definition is happening for non 
>> > > constants. */
>> > > +
>> > > +void foo ()
>> > > +{
>> > > +  extern int i;
>> > > +  __vector int v,w;
>> > > +v = w = (vector int) { i };
>> > > +}
>> > > +
>> > > +int main (int argc, char *argv[])
>> > > +{
>> > > +  return 0;

Re: [PATCH] Fix host_size_t_cst_p predicate

2016-10-31 Thread Richard Biener

On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška  wrote:
> On 10/31/2016 01:12 AM, Richard Sandiford wrote:
>> Richard Biener  writes:
>>> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška  wrote:
 On 10/27/2016 03:35 PM, Richard Biener wrote:
> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška  wrote:
>> Running simple test-case w/o the proper header file causes ICE:
>> strncmp ("a", "b", -1);
>>
>> 0xe74462 tree_to_uhwi(tree_node const*)
>> ../../gcc/tree.c:7324
>> 0x90a23f host_size_t_cst_p
>> ../../gcc/fold-const-call.c:63
>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*,
>> tree_node*, tree_node*)
>> ../../gcc/fold-const-call.c:1512
>> 0x787b01 fold_builtin_3
>> ../../gcc/builtins.c:8385
>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, bool)
>> ../../gcc/builtins.c:8465
>> 0x9052b1 fold(tree_node*)
>> ../../gcc/fold-const.c:11919
>> 0x6de2bb c_fully_fold_internal
>> ../../gcc/c/c-fold.c:185
>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*)
>> ../../gcc/c/c-fold.c:90
>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*)
>> ../../gcc/c/c-typeck.c:10369
>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*)
>> ../../gcc/c/c-typeck.c:10414
>> 0x6cb578 c_parser_statement_after_labels
>> ../../gcc/c/c-parser.c:5430
>> 0x6cd333 c_parser_compound_statement_nostart
>> ../../gcc/c/c-parser.c:4944
>> 0x6cdbde c_parser_compound_statement
>> ../../gcc/c/c-parser.c:4777
>> 0x6c93ac c_parser_declaration_or_fndef
>> ../../gcc/c/c-parser.c:2176
>> 0x6d51ab c_parser_external_declaration
>> ../../gcc/c/c-parser.c:1574
>> 0x6d5c09 c_parser_translation_unit
>> ../../gcc/c/c-parser.c:1454
>> 0x6d5c09 c_parse_file()
>> ../../gcc/c/c-parser.c:18173
>> 0x72ffd2 c_common_parse_file()
>> ../../gcc/c-family/c-opts.c:1087
>>
>> Following patch improves the host_size_t_cst_p predicate.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>> tests.
>>
>> Ready to be installed?
>
> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) *
> CHAR_BIT test is now redundant.
>
> OTOH it was probably desired to allow -1 here?  A little looking back
> in time should tell.

 Ok, it started with r229922, where it was changed from:

   if (tree_fits_uhwi_p (len) && p1 && p2)
 {
   const int i = strncmp (p1, p2, tree_to_uhwi (len));
 ...

 to current version:

 case CFN_BUILT_IN_STRNCMP:
   {
 bool const_size_p = host_size_t_cst_p (arg2, );

 Thus I'm suggesting to change to back to it.

 Ready to be installed?
>>>
>>> Let's ask Richard.
>>
>> The idea with the:
>>
>>   wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT
>>
>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts.
>> I think we still want that.
>
> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current
> wi::min_precision check, right?

Not sure.  If we have host_size_t_cst_p then we should have a corresponding
size_t host_size_t (const_tree) and should use those in pairs.  Not sure
why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p.
Is that wi::min_precision fault?  The way it is documented suggests that
it should be equal to tree_fits_uwhi_p ...

Richard.

> Thanks,
> Martin
>
>>
>> Thanks,
>> Richard
>>
>

Re: [committed] Fix bootstrap with ada x86_64-linux and -fcompare-debug failure on ppc64le-linux (PR target/78148)

2016-10-31 Thread Kyrill Tkachov



On 29/10/16 16:57, Jakub Jelinek wrote:

On Sat, Oct 29, 2016 at 10:07:22AM +0200, Andreas Schwab wrote:

That breaks Ada:

a-teioed.adb: In function 'Ada.Text_Io.Editing.Format_Number':
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb:127:4: error: alignment of array elements is greater than element 
size
a-teioed.adb: In function 'Ada.Text_Io.Editing.To_Picture':
a-teioed.adb:2724:4: error: alignment of array elements is greater than element 
size
a-teioed.adb: In function 'Ada.Text_Io.Editing.Valid':
a-teioed.adb:2751:4: error: alignment of array elements is greater than element 
size

The bug is the same why PR78148 testcase fails with -fcompare-debug,
build_nonstandard_integer_type return values are shared, all such calls
with the same arguments return the same types, so using SET_TYPE_ALIGN on it
is wrong.

As it is a bootstrap failure on primary target and the fix is obvious,
I've committed it to trunk after bootstrapping/regtesting it on x86_64-linux
(where it fixed ada bootstrap) and i686-linux.


Thanks for fixing this Jakub!
Sorry for the breakage.
Kyrill


2016-10-29  Jakub Jelinek  

PR target/78148
* gimple-ssa-store-merging.c
(imm_store_chain_info::output_merged_store): Use build_aligned_type
instead of SET_TYPE_ALIGN on shared integral type.

* gcc.dg/pr78148.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2016-10-29 14:39:24.0 +0200
+++ gcc/gimple-ssa-store-merging.c  2016-10-29 15:09:34.650749175 +0200
@@ -1130,7 +1130,7 @@ imm_store_chain_info::output_merged_stor
location_t loc = get_location_for_stmts (split_store->orig_stmts);
  
tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);

-  SET_TYPE_ALIGN (int_type, align);
+  int_type = build_aligned_type (int_type, align);
tree addr = build_fold_addr_expr (base);
tree dest = fold_build2 (MEM_REF, int_type, addr,
   build_int_cst (offset_type, try_pos));
--- gcc/testsuite/gcc.dg/pr78148.c.jj   2016-10-29 15:10:05.432358626 +0200
+++ gcc/testsuite/gcc.dg/pr78148.c  2016-10-29 15:09:09.0 +0200
@@ -0,0 +1,31 @@
+/* PR target/78148 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcompare-debug" } */
+
+struct A { int a, b; };
+struct B { char c, d; };
+extern void bar (struct A, struct B);
+struct C { char e, f; } a;
+struct D
+{
+  int g;
+  struct C h[4];
+};
+struct D *e;
+
+struct D
+foo (void)
+{
+  int b;
+  struct B c;
+  struct A d;
+  d.b = c.c = c.d = 0;
+  bar (d, c);
+}
+
+void
+baz ()
+{
+  e->h[0].e = e->h[0].f = 0;
+  foo ();
+}

Jakub

Re: [PATCH VECT]Support operand swapping for cond_expr in vect_slp

2016-10-31 Thread Bin.Cheng

On Fri, Oct 28, 2016 at 1:17 PM, Richard Biener
 wrote:
> On Thu, Oct 27, 2016 at 3:37 PM, Bin Cheng  wrote:
>> Hi,
>> During analysis, vect_slp checks if statements of a group are isomorphic to 
>> each other, specifically, all statements have to be isomorphic to the first 
>> one.  Apparently, operands of commutative operators (PLUS_EXPR/MINUS_EXPR 
>> etc.) could be swapped when checking isomorphic property.  Though vect_slp 
>> has basic support for such commutative operators, the related code is not 
>> properly implemented:
>>   1) vect_build_slp_tree mixes operand swapping in the current slp tree node 
>> and operand swapping in its child slp tree nodes.
>>   2) Operand swapping in the current slp tree is incorrect when 
>> vect_get_and_check_slp_defs has already committed to a fixed operand order.
>> In addition, operand swapping for COND_EXPR is implemented in a wrong way 
>> (place) because:
>>   3) vect_get_and_check_slp_defs swaps operands for COND_EXPR by changing 
>> comparison code after vect_build_slp_tree_1 checks the code consistency for 
>> the statement group.
>>   4) vect_build_slp_tree_1 should support operand swapping for COND_EXPR 
>> while it doesn't.
>>
>> This patch addresses above issues.  It supports COND_EXPR by recording 
>> swapping information in vect_build_slp_tree_1 and applies the swap in 
>> vect_get_check_slp_defs.  It supports two types swapping: swapping and 
>> inverting.  The patch also does refactoring so that operand swapping in 
>> child slp tree node and the current slp tree node are differentiated.  With 
>> this patch, failures (gcc.dg/vect/slp-cond-3.c) revealed by previous 
>> COND_EXPR match.pd patches are resolved.
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Ok, but please re-instantiate the early-out here:

Thanks for reviewing.

>
> @@ -905,18 +960,10 @@ vect_build_slp_tree (vec_info *vinfo,
>slp_oprnd_info oprnd_info;
>FOR_EACH_VEC_ELT (stmts, i, stmt)
>  {
> -  switch (vect_get_and_check_slp_defs (vinfo, stmt, i, _info))
> -   {
> -   case 0:
> - break;
> -   case -1:
> - matches[0] = false;
> - vect_free_oprnd_info (oprnds_info);
> - return NULL;
>
> 
>
> you seem to needlessly continue checking other DEFs if one returns -1.

Yes, updated patch committed.

Thanks,
bin
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 5a611e4..62f060c 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -207,14 +207,20 @@ vect_get_place_in_interleaving_chain (gimple *stmt, 
gimple *first_stmt)
 
 /* Get the defs for the rhs of STMT (collect them in OPRNDS_INFO), check that
they are of a valid type and that they match the defs of the first stmt of
-   the SLP group (stored in OPRNDS_INFO).  If there was a fatal error
-   return -1, if the error could be corrected by swapping operands of the
-   operation return 1, if everything is ok return 0.  */
+   the SLP group (stored in OPRNDS_INFO).  This function tries to match stmts
+   by swapping operands of STMT when possible.  Non-zero *SWAP indicates swap
+   is required for cond_expr stmts.  Specifically, *SWAP is 1 if STMT is cond
+   and operands of comparison need to be swapped; *SWAP is 2 if STMT is cond
+   and code of comparison needs to be inverted.  If there is any operand swap
+   in this function, *SWAP is set to non-zero value.
+   If there was a fatal error return -1; if the error could be corrected by
+   swapping operands of father node of this one, return 1; if everything is
+   ok return 0.  */
 
-static int 
-vect_get_and_check_slp_defs (vec_info *vinfo,
+static int
+vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char *swap,
 gimple *stmt, unsigned stmt_num,
- vec *oprnds_info)
+vec *oprnds_info)
 {
   tree oprnd;
   unsigned int i, number_of_oprnds;
@@ -237,11 +243,12 @@ vect_get_and_check_slp_defs (vec_info *vinfo,
 {
   enum tree_code code = gimple_assign_rhs_code (stmt);
   number_of_oprnds = gimple_num_ops (stmt) - 1;
+  /* Swap can only be done for cond_expr if asked to, otherwise we
+could result in different comparison code to the first stmt.  */
   if (gimple_assign_rhs_code (stmt) == COND_EXPR
  && COMPARISON_CLASS_P (gimple_assign_rhs1 (stmt)))
{
  first_op_cond = true;
- commutative = true;
  number_of_oprnds++;
}
   else
@@ -250,20 +257,24 @@ vect_get_and_check_slp_defs (vec_info *vinfo,
   else
 return -1;
 
-  bool swapped = false;
+  bool swapped = (*swap != 0);
+  gcc_assert (!swapped || first_op_cond);
   for (i = 0; i < number_of_oprnds; i++)
 {
 again:
   if (first_op_cond)
{
- if (i == 0 || i == 1)
-   oprnd = TREE_OPERAND (gimple_op (stmt, first_op_idx),
- swapped ? !i : i);
+ /* Map

Re: [PATCH] add a gimple test for PR21458

2016-10-31 Thread Richard Biener

On Mon, Oct 31, 2016 at 8:26 AM,   wrote:
> From: Trevor Saunders 
>
> A demonstration we can do the same thing with a gimple test as -fno-tree-evrp 
> but somewhat more precisely.
>
> I tested this passes on x86_64-linux-gnu, ok?

Looks good to me with the indentation fixed (once the FE is on trunk).

Note that as the startwith 'vrp1' doesn't necessarily correspond to
the dumpfile 'vrp1' the pass specification
might be somewhat fragile...  it would require some more refactoring to make the
dumpfile name of a pass easily accessible from struct pass * to make a
match to that
(but then the startwith choice is to make it work with any pass
re-ordering as well).  I think
we support just -fdump-tree-vrp and using "vrp" in the
scan-tree-dump-times line as well?
Though I don't remember what the semantics is of those if we have
multiple vrp dumps.

Richard.

> Trev
>
> gcc/testsuite/ChangeLog:
>
> 2016-10-31  Trevor Saunders  
>
> * gcc.dg/tree-ssa/pr21458-3.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c | 40 
> +++
>  1 file changed, 40 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c
> new file mode 100644
> index 000..6433a7e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fgimple -fdump-tree-vrp1" } */
> +
> +extern void g (void);
> +extern void bar (int);
> +
> +int __GIMPLE(startwith("vrp1"))
> +foo (int a)
> +{
> +int i;
> +
> +  bb_2:
> +  i_4 = 1;
> +goto bb_6;
> +
> + bb_3:
> + if (i_1 != 0)
> +   goto bb_4;
> +   else
> + goto bb_5;
> +
> + bb_4:
> + g ();
> +
> +   bb_5:
> +   i_7 = i_1 + 1;
> +
> + bb_6:
> + i_1 = __PHI (bb_2: i_4, bb_5: i_7);
> +   if (i_1 <= 99)
> + goto bb_3;
> + else
> +   goto bb_7;
> +
> +   bb_7:
> +   return;
> +
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Folding predicate.*to 1" 1 "vrp1" } } 
> */
> --
> 2.9.3.dirty
>

Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-10-31 Thread Jiong Wang


On 21/10/16 13:30, Bernd Schmidt wrote:



On 10/21/2016 02:04 PM, Jiong Wang wrote:

+  /* Locate the end of existing REG_NOTES in NEW_RTX.  */
+  rtx *ptail = _NOTES (new_rtx);
+  while (*ptail != NULL_RTX)
+ptail =  (*ptail, 1);


I was thinking along the lines of something like this (untested, 
emit-rtl.c part omitted). Eric can choose whether he likes either of 
these or wants something else.


Hi Eric,

  What's your decision on this?

  Thanks.

Regards,
Jiong




Bernd

Index: gcc/rtl.h
===
--- gcc/rtl.h(revision 241233)
+++ gcc/rtl.h(working copy)
@@ -3008,6 +3008,7 @@ extern rtx alloc_reg_note (enum reg_note
 extern void add_reg_note (rtx, enum reg_note, rtx);
 extern void add_int_reg_note (rtx, enum reg_note, int);
 extern void add_shallow_copy_of_reg_note (rtx_insn *, rtx);
+extern rtx duplicate_reg_note (rtx_insn *, rtx);
 extern void remove_note (rtx, const_rtx);
 extern void remove_reg_equal_equiv_notes (rtx_insn *);
 extern void remove_reg_equal_equiv_notes_for_regno (unsigned int);
Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c(revision 241233)
+++ gcc/rtlanal.c(working copy)
@@ -2304,6 +2304,21 @@ add_shallow_copy_of_reg_note (rtx_insn *
 add_reg_note (insn, REG_NOTE_KIND (note), XEXP (note, 0));
 }

+/* Duplicate NOTE and return the copy.  */
+rtx
+duplicate_reg_note (rtx note)
+{
+  rtx n;
+  reg_note_kind kind = REG_NOTE_KIND (note);
+
+  if (GET_CODE (note) == INT_LIST)
+return gen_rtx_INT_LIST ((machine_mode) kind, XINT (note, 0), 
NULL_RTX);

+  else if (GET_CODE (note) == EXPR_LIST)
+return alloc_reg_note (kind, copy_insn_1 (XEXP (note, 0)), 
NULL_RTX);

+  else
+return alloc_reg_note (kind, XEXP (note, 0), NULL_RTX);
+}
+
 /* Remove register note NOTE from the REG_NOTES of INSN.  */

 void
Index: gcc/sel-sched-ir.c
===
--- gcc/sel-sched-ir.c(revision 241233)
+++ gcc/sel-sched-ir.c(working copy)
@@ -5762,6 +5762,11 @@ create_copy_of_insn_rtx (rtx insn_rtx)
   res = create_insn_rtx_from_pattern (copy_rtx (PATTERN (insn_rtx)),
   NULL_RTX);

+  /* Locate the end of existing REG_NOTES in NEW_RTX.  */
+  rtx *ptail = _NOTES (new_rtx);
+  while (*ptail != NULL_RTX)
+ptail =  (*ptail, 1);
+
   /* Copy all REG_NOTES except REG_EQUAL/REG_EQUIV and REG_LABEL_OPERAND
  since mark_jump_label will make them.  REG_LABEL_TARGETs are 
created

  there too, but are supposed to be sticky, so we copy them. */
@@ -5770,11 +5775,8 @@ create_copy_of_insn_rtx (rtx insn_rtx)
 && REG_NOTE_KIND (link) != REG_EQUAL
 && REG_NOTE_KIND (link) != REG_EQUIV)
   {
-if (GET_CODE (link) == EXPR_LIST)
-  add_reg_note (res, REG_NOTE_KIND (link),
-copy_insn_1 (XEXP (link, 0)));
-else
-  add_reg_note (res, REG_NOTE_KIND (link), XEXP (link, 0));
+*ptail = duplicate_reg_note (link);
+ptail =  (*ptail, 1);
   }

   return res;

[Ping][gcc] Enable DW_CFA_val_expression support in dwarf module

2016-10-31 Thread Jiong Wang


On 21/10/16 09:30, Jiong Wang wrote:

Currently, GCC only support DW_CFA_expression in dwarf module, this patch
extend the support to DW_CFA_val_expression which share the same code
mostly the same code with DW_CFA_expression.

Meanwhile the existed dwarf expression parser only allows expressions
which can be represented using GCC RTL.  If one operation doesn't have
a correspondent GCC RTL operator, then there is no way to attach that
information in reg-note.

This patch extends the current dwarf expression support to unlimited
number of operations by using PARALLEL, and unlimited type of operations
by using UNSPEC.

All DW_OP_* of the expression are grouped together inside the PARALLEL,
and those operations which don't have RTL mapping are wrapped by
UNSPEC.  The parsing algorithm is simply something like:

  foreach elem inside PARALLEL
if (UNSPEC)
  {
dw_op_code = INTVAL (XVECEXP (elem, 0, 0));
oprnd1 = INTVAL (XVECEXP (elem, 0, 1));
oprnd2 = INTVAL (XVECEXP (elem, 0, 2));
  }
else
  call standard RTL parser.

Any comments on the approach?


Ping ~



Thanks.

gcc/
2016-10-20  Jiong Wang  

* reg-notes.def (CFA_VAL_EXPRESSION): New entry.
* dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New 
function.

(dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION.
(output_cfa_loc): Support DW_CFA_val_expression.
(output_cfa_loc_raw): Likewise.
(output_cfi): Likewise.
(output_cfi_directive): Likewise.
* dwarf2out.c (dw_cfi_oprnd1_desc): Support 
DW_CFA_val_expression.

(dw_cfi_oprnd2_desc): Likewise.
(mem_loc_descriptor): Recognize new pattern generated for value
expression.

Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type

2016-10-31 Thread Jakub Jelinek

On Sun, Oct 30, 2016 at 08:31:47PM +0100, Jan Kratochvil wrote:
> On Fri, 21 Oct 2016 21:32:42 +0200, Jakub Jelinek wrote:
> > This patch changes the .debug_info headers to follow the current
> > specification (I still hope the useless padding1/padding2 fields will be
> > removed), and also changes the -gsplit-dwarf stuff to move dwo_id into
> > the header and use DW_UT_{skeleton,split_*}.
> 
> During GDB consumer patch testing I needed this fix on top of your patch.

Thanks for noticing it.  As I had no consumer, all I did was eyeballing
-gdwarf-5 -dA output, and haven't noticed this.

> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 7c6a9e9..1c0ca35 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -2977,7 +2977,8 @@ skeleton_chain_node;
>  
>  /* Fixed size portion of the DWARF compilation unit header.  */
>  #define DWARF_COMPILE_UNIT_HEADER_SIZE \
> -  (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE + 3)
> +  (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE \
> +   + (dwarf_version < 5 ? 3 : 4 + 8 + DWARF_OFFSET_SIZE))

I'd use DWARF_TYPE_SIGNATURE_SIZE instead of the 8 to match the define
below it.

>  /* Fixed size portion of the DWARF comdat type unit header.  */
>  #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \

And this define needs adjusting too.

I still hope we can get rid of the useless padding for the most common
DW_UT_* types, I've raised it on dwarf workgroup and got some positive
reactions, but will now file it officially as public review issue.

2016-10-31  Jan Kratochvil  
Jakub Jelinek  

* dwarf2out.c (DWARF_COMPILE_UNIT_HEADER_SIZE): Adjust for -gdwarf-5.
(DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE): Likewise.

--- gcc/dwarf2out.c.jj  2016-10-27 15:57:24.0 +0200
+++ gcc/dwarf2out.c 2016-10-31 10:12:20.338761079 +0100
@@ -2805,12 +2805,15 @@ skeleton_chain_node;
 
 /* Fixed size portion of the DWARF compilation unit header.  */
 #define DWARF_COMPILE_UNIT_HEADER_SIZE \
-  (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE + 3)
+  (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE   \
+   + (dwarf_version >= 5   \
+  ? 4 + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE : 3))
 
 /* Fixed size portion of the DWARF comdat type unit header.  */
 #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \
-  (DWARF_COMPILE_UNIT_HEADER_SIZE + DWARF_TYPE_SIGNATURE_SIZE \
-   + DWARF_OFFSET_SIZE)
+  (DWARF_COMPILE_UNIT_HEADER_SIZE  \
+   + (dwarf_version >= 5   \
+  ? 0 : DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE))
 
 /* Fixed size portion of public names info.  */
 #define DWARF_PUBNAMES_HEADER_SIZE (2 * DWARF_OFFSET_SIZE + 2)


Jakub

[PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)

2016-10-31 Thread Martin Liška

Hi.

Using priority {cd}tors on a target that does not support that can cause 
failures (see the PR).
Apart from that, I decided to use priority 100 for both gcov_init and gcov_exit 
functions as
the reserved range includes priority 100. Moreover, I enhanced test-cases we 
have.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From 05a0dcb13d608facdd1c85f4101cd821634d07cd Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 26 Oct 2016 12:50:35 +0200
Subject: [PATCH] Don't use priority {cd}tors if not supported by a target (PR
 gcov-profile/78086)

gcc/testsuite/ChangeLog:

2016-10-26  Martin Liska  

	* g++.dg/gcov/pr16855.C: Clean up the test case.
	* g++.dg/gcov/pr16855-priority.C: New test.

gcc/ChangeLog:

2016-10-26  Martin Liska  

	* coverage.c (build_init_ctor): Don't use priority {cd}tors if
	not supported by a target.  Set priority to 100 if possible.
	(build_gcov_exit_decl): Likewise.
---
 gcc/coverage.c   | 13 +++--
 gcc/testsuite/g++.dg/gcov/pr16855-priority.C | 79 
 gcc/testsuite/g++.dg/gcov/pr16855.C  | 55 +--
 3 files changed, 116 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gcov/pr16855-priority.C

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 8810710..4167e26 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1056,8 +1056,10 @@ build_init_ctor (tree gcov_info_type)
   stmt = build_call_expr (init_fn, 1, stmt);
   append_to_statement_list (stmt, );
 
-  /* Generate a constructor to run it (with priority 99).  */
-  cgraph_build_static_cdtor ('I', ctor, DEFAULT_INIT_PRIORITY - 1);
+  /* Generate a constructor to run it.  */
+  int priority = SUPPORTS_INIT_PRIORITY
+? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY;
+  cgraph_build_static_cdtor ('I', ctor, priority);
 }
 
 /* Generate the destructor function to call __gcov_exit.  */
@@ -1078,8 +1080,11 @@ build_gcov_exit_decl (void)
   tree stmt = build_call_expr (init_fn, 0);
   append_to_statement_list (stmt, );
 
-  /* Generate a destructor to run it (with priority 99).  */
-  cgraph_build_static_cdtor ('D', dtor, MAX_RESERVED_INIT_PRIORITY - 1);
+  /* Generate a destructor to run it.  */
+  int priority = SUPPORTS_INIT_PRIORITY
+? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY;
+
+  cgraph_build_static_cdtor ('D', dtor, priority);
 }
 
 /* Create the gcov_info types and object.  Generate the constructor
diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
new file mode 100644
index 000..7e39565
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C
@@ -0,0 +1,79 @@
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+/* { dg-require-effective-target init_priority } */
+
+#include 
+#include 
+
+int a;
+
+void
+foo ()
+{
+  fprintf (stderr, "In foo\n");
+  a = 123; /* count(1) */
+}
+
+using namespace std;
+class Test
+{
+public:
+  Test (void) { fprintf (stderr, "In Test::Test\n"); /* count(1) */ }
+  ~Test (void) { fprintf (stderr, "In Test::~Test\n"); /* count(1) */ }
+} T1;
+
+void
+uncalled (void)
+{
+  fprintf (stderr, "In uncalled\n"); /* count(#) */
+}
+
+int
+main (void)
+{
+  atexit ();
+  fprintf (stderr, "In main\n"); /* count(1) */
+  return 0;
+}
+
+static void __attribute__ ((constructor)) ctor_default ()
+{
+  fprintf (stderr, "in constructor(())\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((101 ctor_100 ()
+{
+  fprintf (stderr, "in constructor((101))\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((500 ctor_500 ()
+{
+  fprintf (stderr, "in constructor((500))\n"); /* count(1) */
+}
+
+static void __attribute__ ((constructor ((65535 ctor_65535 ()
+{
+  fprintf (stderr, "in constructor((65535))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor)) dtor_default ()
+{
+  fprintf (stderr, "in destructor(())\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((101 dtor_100 ()
+{
+  fprintf (stderr, "in destructor((101))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((500 dtor_500 ()
+{
+  fprintf (stderr, "in destructor((500))\n"); /* count(1) */
+}
+
+static void __attribute__ ((destructor ((65535 dtor_65535 ()
+{
+  fprintf (stderr, "in destructor((65535))\n"); /* count(1) */
+}
+
+/* { dg-final { run-gcov branches { -b pr16855-priority.C } } } */
diff --git a/gcc/testsuite/g++.dg/gcov/pr16855.C b/gcc/testsuite/g++.dg/gcov/pr16855.C
index 91801d4..d7aa8a4 100644
--- a/gcc/testsuite/g++.dg/gcov/pr16855.C
+++ b/gcc/testsuite/g++.dg/gcov/pr16855.C
@@ -2,46 +2,47 @@
 /* { dg-do run { target native } } */
 
 #include 
+#include 
 
 int a;
 
-void foo()
+void
+foo ()
 {
-  a = 123;		  /* count(1) */
+  fprintf (stderr, "In foo\n");
+  a = 123; /* count(1) */
 }
 
-#include

Re: [PATCH] Introduce -fprofile-update=maybe-atomic

2016-10-31 Thread Martin Liška

PING^1

On 10/13/2016 05:34 PM, Martin Liška wrote:
> Hello.
> 
> As it's very hard to guess from GCC driver whether a target supports atomic 
> updates
> for GCOV counter or not, I decided to come up with a new option value 
> (maybe-atomic),
> that would be transformed in a corresponding value (single or atomic) in 
> tree-profile.c.
> The GCC driver selects the option when -pthread is present in the command 
> line.
> 
> That should fix all tests failures seen on AIX target.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
>

1 2 >

1 - 100 of 104 matches

Mail list logo