Broken link
In this message: http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01096.html Gerald indicated a stale link for 'Guide for the use of the Ada programming language in high integrity systems'. The document was, in fact, not removed, just moved. Its new location is: http://www.dit.upm.es/~str/ork/documents/adahis.pdf Thank you, Leighton Dawson
Re: [PATCH] Try to coalesce for unary and binary ops
On 04/17/14 05:38, Richard Biener wrote: The patch below increases the number of coalesces we attempt to also cover unary and binary operations. This improves initial code generation for code like [ ... ] Expansion has special code to improve coalescing of op1 with the target, thus this is what we try to match here. Overall there are positive and negative size effects during a bootstrap on x86_64, but overall it seems to be a loss - stage3 cc1 text size is 18261647 bytes without the patch compared to 18265751 bytes with the patch. Now the question is, what does this tell us? That not re-using the same pseudo as op and target is always better? Coalescing has always been, and probably always will be, a transformation that can help or hurt code. At the level where you're applying it I suspect there aren't any really good heuristics to determine when it's likely profitable vs. not profitable. I'd tend to lean towards coalescing being good here, but would want to investigate the regressions a bit to see if there's something systematic going on that causes us to generate poor code in some small number of situations that we might be able to fix. You might even initially restrict this to coalescing on unary operators until we have better heuristics for binary operators. Also note that all of this may end up hurting RISC targets. Jeff
Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a macro for C expression
On 04/16/14 06:14, Kito Cheng wrote: Hi Vladimir: Thanks for your reply and approval; however, I don't have commit rights yet, can you help to commit it? Thanks! I fixed up some minor whitespace issues and committed your patch. Jeff
Re: [PATCH] Remove keep_aligning from get_inner_reference
On 04/22/14 03:25, Eric Botcazou wrote: Sure, and thanks again for your help. Thanks! I was not able to find any difference on the generated code with or without that patch. Yes, my gut feeling is that TYPE_ALIGN_OK is really obsolete now. It is set in a single place in the compiler (gcc-interface/decl.c:gnat_to_gnu_entity): /* Tell the middle-end that objects of tagged types are guaranteed to be properly aligned. This is necessary because conversions to the class-wide type are translated into conversions to the root type, which can be less aligned than some of its derived types. */ if (Is_Tagged_Type (gnat_entity) || Is_Class_Wide_Equivalent_Type (gnat_entity)) TYPE_ALIGN_OK (gnu_type) = 1; but we changed the way these conversions are done some time ago. So does this remove the last concern around Bernd's patch? Jeff
Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a macro for C expression
Hi Jeff: I fixed up some minor whitespace issues and committed your patch. Thanks for your help :)
Re: [C PATCH] Fix up diagnostics (PR c/25801)
On Thu, May 01, 2014 at 10:54:18PM +, Joseph S. Myers wrote: I think the comment on this function should be updated to explain the interface contract when an incomplete (or function) type is passed (i.e. return size_one_node, caller is responsible for any diagnostics). Done. The test should I think also cover incomplete union types, and pointer subtraction for all the different cases of incomplete types, and the += and -= operators (i.e. cover invalid arithmetic on pointers to incomplete types more thoroughly, even if not actually affected by this patch). Done. Another change is that I used COMPLETE_TYPE_P instead of COMPLETE_OR_VOID_TYPE_P; the latter would be a bug and didn't make much sense before either, since VOID_TYPE is already checked just above. Tested x86_64-linux, ok now? 2014-05-02 Marek Polacek pola...@redhat.com PR c/25801 * c-typeck.c (c_size_in_bytes): Update comment. Don't call error. Return size_one_node when the type is not complete. (pointer_diff): Remove comment. (build_unary_op): Improve error messages. * gcc.dg/pr25801.c: New test. diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index 98567b4..41770bc 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -1754,22 +1754,20 @@ type_lists_compatible_p (const_tree args1, const_tree args2, } } -/* Compute the size to increment a pointer by. */ +/* Compute the size to increment a pointer by. When a function type or void + type or incomplete type is passed, size_one_node is returned. + This function does not emit any diagnostics; the caller is responsible + for that. */ static tree c_size_in_bytes (const_tree type) { enum tree_code code = TREE_CODE (type); - if (code == FUNCTION_TYPE || code == VOID_TYPE || code == ERROR_MARK) + if (code == FUNCTION_TYPE || code == VOID_TYPE || code == ERROR_MARK + || !COMPLETE_TYPE_P (type)) return size_one_node; - if (!COMPLETE_OR_VOID_TYPE_P (type)) -{ - error ("arithmetic on pointer to an incomplete type"); - return size_one_node; -} - /* Convert in case a char is more than one unit. */ return size_binop_loc (input_location, CEIL_DIV_EXPR, TYPE_SIZE_UNIT (type), size_int (TYPE_PRECISION (char_type_node) @@ -3530,7 +3528,6 @@ pointer_diff (location_t loc, tree op0, tree op1) if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (TREE_TYPE (orig_op1)))) error_at (loc, "arithmetic on pointer to an incomplete type"); - /* This generates an error if op0 is pointer to incomplete type. */ op1 = c_size_in_bytes (target_type); if (pointer_to_zero_sized_aggr_p (TREE_TYPE (orig_op1))) @@ -4004,16 +4001,18 @@ build_unary_op (location_t location, if (typecode == POINTER_TYPE) { - /* If pointer target is an undefined struct, + /* If pointer target is an incomplete type, we just cannot know how to do the arithmetic. */ if (!COMPLETE_OR_VOID_TYPE_P (TREE_TYPE (argtype))) { if (code == PREINCREMENT_EXPR || code == POSTINCREMENT_EXPR) error_at (location, - "increment of pointer to unknown structure"); + "increment of pointer to an incomplete type %qT", + TREE_TYPE (argtype)); else error_at (location, - "decrement of pointer to unknown structure"); + "decrement of pointer to an incomplete type %qT", + TREE_TYPE (argtype)); } else if (TREE_CODE (TREE_TYPE (argtype)) == FUNCTION_TYPE || TREE_CODE (TREE_TYPE (argtype)) == VOID_TYPE) diff --git gcc/testsuite/gcc.dg/pr25801.c gcc/testsuite/gcc.dg/pr25801.c index e69de29..10b53d3 100644 --- gcc/testsuite/gcc.dg/pr25801.c +++ gcc/testsuite/gcc.dg/pr25801.c @@ -0,0 +1,44 @@ +/* PR c/25801 */ +/* { dg-do compile } */ +/* { dg-options "-std=c99" } */ + +int (*a)[]; +struct S *s; +union U *u; +enum E *e; + +void +f (void) +{ + a++; /* { dg-error "increment of pointer to an incomplete type" } */ + ++a; /* { dg-error "increment of pointer to an incomplete type" } */ + a--; /* { dg-error "decrement of pointer to an incomplete type" } */ + --a; /* { dg-error "decrement of pointer to an incomplete type" } */ + a += 1; /* { dg-error "invalid use of array with unspecified bounds" } */ + a -= 1; /* { dg-error "invalid use of array with unspecified bounds" } */ + a - a; /* { dg-error "arithmetic on pointer to an incomplete type" } */ + + s++; /* { dg-error "increment of pointer to an incomplete type" } */ + ++s; /* { dg-error "increment of pointer to an incomplete type" } */ + s--; /* { dg-error "decrement of pointer to an incomplete type" } */ + --s; /* { dg-error "decrement of pointer to an incomplete type" } */ + s += 1; /* { dg-error "invalid use of undefined type" } */ + s -= 1; /* { dg-error "invalid use of undefined type
Re: [PING][ARM] Handle simple SImode PLUS and MINUS operations in rtx costs
Ping again. Kyrill On 24/04/14 09:20, Kyrill Tkachov wrote: Hi all, Pinging this: http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01276.html Thanks, Kyrill
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
On Fri, 2014-05-02 at 00:45 +0200, Samuel Thibault wrote: Hello, Svante Signell, on Thu 24 Apr 2014 10:39:10 +0200, wrote: - Without split stack enabled, around 70 libgo tests pass and 50 fail, most of them with a segfault. - Enabling split stack and using the libc Samuel built, all 122 libgo tests fail with a segfault. Please provide a segfault backtrace, rpctrace, etc. Otherwise we won't be able to help you. You already have one in this thread, analysed by Justus: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01736.html The full rpctrace and gdb backtrace for that case were published on paste.debian.net, but are gone now. I can upload them again if you need them, or do you want them attached to a mail?
[DOC PATCH] Describe -fsanitize=float-divide-by-zero
When I submitted -fsanitize=float-divide-by-zero stuff, I forgot to document the option in the table of ubsan options. Ok? 2014-05-02 Marek Polacek pola...@redhat.com * doc/invoke.texi: Describe -fsanitize=float-divide-by-zero. diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi index 0eba1e0..3fe9d5f 100644 --- gcc/doc/invoke.texi +++ gcc/doc/invoke.texi @@ -5393,6 +5393,14 @@ While @option{-ftrapv} causes traps for signed overflows to be emitted, @option{-fsanitize=undefined} gives a diagnostic message. This currently works only for the C family of languages. +@item -fsanitize=float-divide-by-zero +@opindex fsanitize=float-divide-by-zero + +Detect floating-point division by zero. Unlike other similar options, +@option{-fsanitize=float-divide-by-zero} is not enabled by +@option{-fsanitize=undefined}, since floating-point division by zero can +be a legitimate way of obtaining infinities and NaNs. + @item -fsanitize-recover @opindex fsanitize-recover By default @option{-fsanitize=undefined} sanitization (and its suboptions Marek
Re: [DOC PATCH] Describe -fsanitize=float-divide-by-zero
On Fri, May 02, 2014 at 10:11:05AM +0200, Marek Polacek wrote: When I submitted -fsanitize=float-divide-by-zero stuff, I forgot to document the option in the table of ubsan options. Ok? 2014-05-02 Marek Polacek pola...@redhat.com * doc/invoke.texi: Describe -fsanitize=float-divide-by-zero. Ok. --- gcc/doc/invoke.texi +++ gcc/doc/invoke.texi @@ -5393,6 +5393,14 @@ While @option{-ftrapv} causes traps for signed overflows to be emitted, @option{-fsanitize=undefined} gives a diagnostic message. This currently works only for the C family of languages. +@item -fsanitize=float-divide-by-zero +@opindex fsanitize=float-divide-by-zero + +Detect floating-point division by zero. Unlike other similar options, +@option{-fsanitize=float-divide-by-zero} is not enabled by +@option{-fsanitize=undefined}, since floating-point division by zero can +be a legitimate way of obtaining infinities and NaNs. + @item -fsanitize-recover @opindex fsanitize-recover By default @option{-fsanitize=undefined} sanitization (and its suboptions Marek Jakub
[PATCH] RTEMS: Adjust SPARC multilibs
A recent fix for the LEON3 support added a new option -muser-mode to select the ASI of the CASA instruction used for atomic operations. Use the user-mode CASA instruction since this also works in supervisor mode. This patch should be applied to GCC 4.8, 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-05-02 Sebastian Huber sebastian.hu...@embedded-brains.de * config/sparc/t-rtems: Use user-mode for leon3 multilibs. --- gcc/config/sparc/t-rtems | 18 -- 1 files changed, 16 insertions(+), 2 deletions(-) diff --git a/gcc/config/sparc/t-rtems b/gcc/config/sparc/t-rtems index f1a3d84..91de612 100644 --- a/gcc/config/sparc/t-rtems +++ b/gcc/config/sparc/t-rtems @@ -17,6 +17,20 @@ # http://www.gnu.org/licenses/. # -MULTILIB_OPTIONS = msoft-float mcpu=v8/mcpu=leon3 -MULTILIB_DIRNAMES = soft v8 leon3 +MULTILIB_OPTIONS = msoft-float mcpu=v8/mcpu=leon3 muser-mode +MULTILIB_DIRNAMES = soft v8 leon3 user-mode MULTILIB_MATCHES = msoft-float=mno-fpu + +# Enumeration of multilibs + +MULTILIB_EXCEPTIONS += msoft-float/mcpu=v8/muser-mode +# MULTILIB_EXCEPTIONS += msoft-float/mcpu=v8 +# MULTILIB_EXCEPTIONS += msoft-float/mcpu=leon3/muser-mode +MULTILIB_EXCEPTIONS += msoft-float/mcpu=leon3 +MULTILIB_EXCEPTIONS += msoft-float/muser-mode +# MULTILIB_EXCEPTIONS += msoft-float +MULTILIB_EXCEPTIONS += mcpu=v8/muser-mode +# MULTILIB_EXCEPTIONS += mcpu=v8 +# MULTILIB_EXCEPTIONS += mcpu=leon3/muser-mode +MULTILIB_EXCEPTIONS += mcpu=leon3 +MULTILIB_EXCEPTIONS += muser-mode -- 1.7.7
Re: [PATCH] RTEMS thread model configuration
Ping. On 2014-04-18 12:11, Sebastian Huber wrote: From: Sebastian Huber sebastian-hu...@web.de The command line to build a GCC for RTEMS virtually always contained '--enable-threads'. This patch helps to avoid this extra configuration command line parameter and makes the GCC build a bit more user friendly for RTEMS. This patch should be applied to the GCC 4.9 branch and master. 2014-04-18 Sebastian Huber sebastian.hu...@embedded-brains.de * config.gcc (*-*-rtems*): Default to 'rtems' thread model. Enable selection of 'posix' or no thread model. --- gcc/config.gcc | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index 3c55c88..93d5994 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -791,7 +791,13 @@ case ${target} in ;; *-*-rtems*) case ${enable_threads} in -yes) thread_file='rtems' ;; +"" | yes | rtems) thread_file='rtems' ;; +posix) thread_file='posix' ;; +no) ;; +*) + echo 'Unknown thread configuration for RTEMS' + exit 1 + ;; esac tmake_file="${tmake_file} t-rtems" extra_options="${extra_options} rtems.opt" -- Sebastian Huber, embedded brains GmbH Address : Dornierstr. 4, D-82178 Puchheim, Germany Phone : +49 89 189 47 41-16 Fax : +49 89 189 47 41-09 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. This message is not a business communication within the meaning of the EHUG.
[AArch64] Fix integer vabs intrinsics
Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the Neon vabs intrinsics should behave as the hardware does. That is to say, the pseudo-code sequence: a = vabs_s8 (vdup_n_s8 (-128)); assert (a >= 0); does not hold, as in hardware abs (-128) == -128. Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. We keep the standard pattern name around for the benefit of auto-vectorization. Tested on aarch64-none-elf with no issues. This will also be a bug on 4.9 (ugh), OK for trunk and gcc-4_9-branch? Thanks, James --- 2014-05-02 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin): Don't fold integer abs builtins. * config/aarch64/aarch64-simd-builtins.def (abs): Split by integer and floating point variants. * config/aarch64/aarch64-simd.md (aarch64_abs<mode>): New. * config/aarch64/iterators.md (unspec): Add UNSPEC_ABS.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index a301982..6d47c0b 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1153,7 +1153,7 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, switch (fcode) { - BUILTIN_VALLDI (UNOP, abs, 2) + BUILTIN_VDQF (UNOP, abs, 2) return fold_build1 (ABS_EXPR, type, args[0]); break; BUILTIN_VALLDI (BINOP, cmge, 0) diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 339e8f8..e2d1078 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -365,7 +365,8 @@ BUILTIN_VDQF (UNOP, frecpe, 0) BUILTIN_VDQF (BINOP, frecps, 0) - BUILTIN_VALLDI (UNOP, abs, 2) + BUILTIN_VDQ (UNOP, abs, 0) + BUILTIN_VDQF (UNOP, abs, 2) VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf) VAR1 (BINOP, float_truncate_hi_, 0, v4sf) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 108bc8d88931e67e6c7eeb4a01bb391a1ced..acb75f5bd0c732d8e11d4a7b6b61f8b1e81d1960 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -390,6 +390,18 @@ (define_insn "aba<mode>_3" [(set_attr "type" "neon_arith_acc<q>")] ) +;; To mirror the behaviour of hardware, as required for arm_neon.h, we must +;; show an abundance of caution around the abs instruction.
+ +(define_insn "aarch64_abs<mode>" + [(set (match_operand:VDQ 0 "register_operand" "=w") +(unspec:VDQ [(match_operand:VDQ 1 "register_operand" "w")] + UNSPEC_ABS))] + "TARGET_SIMD" + "abs\t%0.<Vtype>, %1.<Vtype>" + [(set_attr "type" "neon_abs<q>")] +) + (define_insn "fabd<mode>_3" [(set (match_operand:VDQF 0 "register_operand" "=w") (abs:VDQF (minus:VDQF diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c537c3780eea95fa315c82bb36ac7f91f0f920fd..e45a1a11991a71ad37a8d5bb7c4ff81627671384 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -197,6 +197,7 @@ (define_c_enum "unspec" [ UNSPEC_ASHIFT_SIGNED ; Used in aarch64-simd.md. UNSPEC_ASHIFT_UNSIGNED ; Used in aarch64-simd.md. +UNSPEC_ABS ; Used in aarch64-simd.md. UNSPEC_FMAX ; Used in aarch64-simd.md. UNSPEC_FMAXNMV ; Used in aarch64-simd.md. UNSPEC_FMAXV ; Used in aarch64-simd.md.
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Svante Signell, on Fri 02 May 2014 10:18:12 +0200, wrote: Thread 4 (Thread 1205.4): #0 0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132 err = <optimized out> err = <optimized out> user_option = 3 user_timeout = 48 m = 0x532370 msgh_bits = 0 remote_port = 268509186 msgid = 21118 save_data = <optimized out> __PRETTY_FUNCTION__ = "_hurd_intr_rpc_mach_msg" Ok, reading this again, this looks like what Justus analyzed. Samuel
Re: [AArch64] Fix integer vabs intrinsics
On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the Neon vabs intrinsics should behave as the hardware does. That is to say, the pseudo-code sequence: Only for signed integer types. You should be able to use an unsigned integer type here instead. a = vabs_s8 (vdup_n_s8 (-128)); assert (a >= 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. No it is not. The mistake is to use signed integer types here. Just add a conversion to an unsigned integer vector and it will work correctly. In fact the ABS rtl code is not undefined for the overflow. Thanks, Andrew Pinski We keep the standard pattern name around for the benefit of auto-vectorization. Tested on aarch64-none-elf with no issues. This will also be a bug on 4.9 (ugh), OK for trunk and gcc-4_9-branch? Thanks, James --- 2014-05-02 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin): Don't fold integer abs builtins. * config/aarch64/aarch64-simd-builtins.def (abs): Split by integer and floating point variants. * config/aarch64/aarch64-simd.md (aarch64_abs<mode>): New. * config/aarch64/iterators.md (unspec): Add UNSPEC_ABS.
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
+2014-04-29 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv): + New function declaration. + * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add + AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR. + AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR + and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. The updated patch is based on the review at http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00041.html. FE_* values are now changed to AARCH64_FE_*.
Thanks, Kugan diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..40d53b1 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = { enum aarch64_builtins { AARCH64_BUILTIN_MIN, + + AARCH64_BUILTIN_GET_FPCR, + AARCH64_BUILTIN_SET_FPCR, + AARCH64_BUILTIN_GET_FPSR, + AARCH64_BUILTIN_SET_FPSR, + AARCH64_SIMD_BUILTIN_BASE, #include aarch64-simd-builtins.def AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE @@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void) void aarch64_init_builtins (void) { + tree ftype_set_fpr += build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_get_fpr += build_function_type_list (unsigned_type_node, NULL); + + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR] += add_builtin_function (__builtin_aarch64_get_fpcr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR] += add_builtin_function (__builtin_aarch64_set_fpcr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR] += add_builtin_function (__builtin_aarch64_get_fpsr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR] += add_builtin_function (__builtin_aarch64_set_fpsr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + if (TARGET_SIMD) aarch64_init_simd_builtins (); } @@ -964,6 +988,36 @@ aarch64_expand_builtin (tree exp, { tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); int fcode = DECL_FUNCTION_CODE (fndecl); + int icode; + rtx pat, op0; + tree arg0; + + switch (fcode) +{ +case AARCH64_BUILTIN_GET_FPCR: +case AARCH64_BUILTIN_SET_FPCR: +case AARCH64_BUILTIN_GET_FPSR: +case AARCH64_BUILTIN_SET_FPSR: + if ((fcode == AARCH64_BUILTIN_GET_FPCR) + || 
(fcode == AARCH64_BUILTIN_GET_FPSR)) + { + icode = (fcode == AARCH64_BUILTIN_GET_FPSR) ? + CODE_FOR_get_fpsr : CODE_FOR_get_fpcr; + target = gen_reg_rtx (SImode); + pat = GEN_FCN (icode) (target); + } + else + { + target = NULL_RTX; + icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ? + CODE_FOR_set_fpsr : CODE_FOR_set_fpcr; + arg0 = CALL_EXPR_ARG (exp, 0); + op0 = expand_normal (arg0); + pat = GEN_FCN (icode) (op0); + } + emit_insn (pat); + return target; +} if (fcode >= AARCH64_SIMD_BUILTIN_BASE) return aarch64_simd_expand_builtin (fcode, exp, target); @@ -1196,6 +1250,106 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi) return changed; } +void +aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) +{ + const unsigned AARCH64_FE_INVALID = 1; + const unsigned AARCH64_FE_DIVBYZERO = 2; + const unsigned AARCH64_FE_OVERFLOW = 4; + const unsigned AARCH64_FE_UNDERFLOW = 8; + const unsigned AARCH64_FE_INEXACT = 16; + const unsigned HOST_WIDE_INT AARCH64_FE_ALL_EXCEPT = (AARCH64_FE_INVALID + | AARCH64_FE_DIVBYZERO + |
Re: [PATCH 00/89] Compile-time gimple-checking
On April 30, 2014 11:26:35 PM CEST, Jeff Law l...@redhat.com wrote: On 04/23/14 08:32, Richard Biener wrote: Also I see you introduce a const_FOO class with every FOO one. I wonder whether, now that we have C++, can address const-correctness in a less awkward way than with a typedef. Can you try to go back in time and see why we did with that in the first place? ISTR that it was oh, if we were only using C++ we wouldn't need to jump through that hoop. To followup myself here, it's because 'tree' is a typedef to a pointer and thus 'const tree' is different from 'const tree_node *'. Right. Not sure why we re-introduced the 'mistake' of making 'tree' a pointer when we introduced 'gimple'. If we were to make 'gimple' the class type itself we can use gimple *, const gimple * and also const gimple (when a NULL pointer is not expected). It wasn't ever really discussed to the best of my recollection. Anyway, gazillion new typedefs are ugly :/ (typedefs are ugly) Yea, can't argue with that. However, do we want to ask David to fix up the gimple vs gimple * vs const gimple * vs const gimple as a prerequisite for this patchset or are you comfortable going forward with the patchset, then researching if there's a cleaner way to handle the const/typedef issues? Well, I'd like to see both and one affects the other. Doing the const correctness thing first seems more natural to me. Of course both need to wait for 4.9.1. Thanks, Richard. jeff
Re: [AArch64] Fix integer vabs intrinsics
On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote: On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the Neon vabs intrinsics should behave as the hardware does. That is to say, the pseudo-code sequence: Only for signed integer types. You should be able to use an unsigned integer type here instead. If anything, I think that puts us in a worse position. The issue that inspires this patch is that GCC will happily fold: t1 = ABS_EXPR (x) t2 = GE_EXPR (t1, 0) to t2 = TRUE Surely an unsigned integer type is going to suffer the same fate? Certainly I can imagine somewhere in the compiler there being a fold path for: (unsigned >= 0) == TRUE a = vabs_s8 (vdup_n_s8 (-128)); assert (a >= 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. No it is not. The mistake is to use signed integer types here. Just add a conversion to an unsigned integer vector and it will work correctly. In fact the ABS rtl code is not undefined for the overflow. Here we are covering ourselves against a separate issue. For auto-vectorized code we want the SABD combine patterns to kick in whenever sensible. For intrinsics code, in the case where vsub_s8 (x, y) would cause an underflow: vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y) So in this case, the combine would be erroneous. Likewise SABA. Thanks, James
Re: [AArch64] Fix integer vabs intrinsics
On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com wrote: On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote: On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the Neon vabs intrinsics should behave as the hardware does. That is to say, the pseudo-code sequence: Only for signed integer types. You should be able to use an unsigned integer type here instead. If anything, I think that puts us in a worse position. Not if you cast it back. The issue that inspires this patch is that GCC will happily fold: t1 = ABS_EXPR (x) t2 = GE_EXPR (t1, 0) to t2 = TRUE Surely an unsigned integer type is going to suffer the same fate? Certainly I can imagine somewhere in the compiler there being a fold path for: Yes, but if we add a cast from the unsigned type to the signed type, GCC does not optimize that. If it does, it is a bug, since the overflow is defined there. (unsigned >= 0) == TRUE a = vabs_s8 (vdup_n_s8 (-128)); assert (a >= 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. No it is not. The mistake is to use signed integer types here. Just add a conversion to an unsigned integer vector and it will work correctly. In fact the ABS rtl code is not undefined for the overflow. Here we are covering ourselves against a separate issue. For auto-vectorized code we want the SABD combine patterns to kick in whenever sensible. For intrinsics code, in the case where vsub_s8 (x, y) would cause an underflow: vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y) So in this case, the combine would be erroneous. Likewise SABA. This sounds like it would be problematic for unsigned types and not just for vabs_s8 with vsub_s8.
So I think you should be using unspec for vabd_s8 instead, since in RTL overflow and underflow are defined to be wrapping. Thanks, Andrew Pinski Thanks, James
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Thu, May 01, 2014 at 11:20:25PM +, Joseph S. Myers wrote: On Wed, 23 Apr 2014, Marek Polacek wrote: diff --git gcc/testsuite/c-c++-common/attributes-1.c gcc/testsuite/c-c++-common/attributes-1.c index af4dd12..8458e47 100644 --- gcc/testsuite/c-c++-common/attributes-1.c +++ gcc/testsuite/c-c++-common/attributes-1.c @@ -9,7 +9,7 @@ typedef char vec __attribute__((vector_size(bar))); /* { dg-warning ignored } void f1(char*) __attribute__((nonnull(bar))); /* { dg-error invalid operand } */ void f2(char*) __attribute__((nonnull(1,bar))); /* { dg-error invalid operand } */ -void g() __attribute__((aligned(bar))); /* { dg-error invalid value|not an integer } */ +void g() __attribute__((aligned(bar))); I don't think it's appropriate to remove any test assertion that this invalid code gets diagnosed. If the only diagnostic is now one swallowed by the dg-prune-output in this test, either that dg-prune-output needs to be removed (and corresponding more detailed error expectations added), or a separate test needs adding for this erroneous use of this attribute (that separate test not using dg-prune-output). Yeah, that was a weird thing to do. I yanked that particular test to a new testcase. Otherwise no changes. Tested again x86_64-linux, ok now? 2014-05-02 Marek Polacek pola...@redhat.com PR c/50459 c-family/ * c-common.c (check_user_alignment): Return -1 if alignment is error node. (handle_aligned_attribute): Don't call default_conversion on FUNCTION_DECLs. (handle_vector_size_attribute): Likewise. (handle_tm_wrap_attribute): Handle case when wrap_decl is error node. (handle_sentinel_attribute): Call default_conversion and allow even integral types as an argument. c/ * c-parser.c (c_parser_attributes): Parse the arguments as an expression-list if the attribute takes identifier. testsuite/ * c-c++-common/attributes-1.c: Move test line to a new test. * c-c++-common/attributes-2.c: New test. * c-c++-common/pr50459.c: New test. 
* c-c++-common/pr59280.c: Add undeclared to dg-error. * gcc.dg/nonnull-2.c: Likewise. * gcc.dg/pr55570.c: Modify dg-error. * gcc.dg/tm/wrap-2.c: Likewise. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 0ad955d..3ebd960 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -7438,6 +7438,8 @@ check_user_alignment (const_tree align, bool allow_zero) { int i; + if (error_operand_p (align)) +return -1; if (TREE_CODE (align) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (align))) { @@ -7559,7 +7561,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args, if (args) { align_expr = TREE_VALUE (args); - if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE) + if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE + TREE_CODE (align_expr) != FUNCTION_DECL) align_expr = default_conversion (align_expr); } else @@ -8424,9 +8427,11 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args, else { tree wrap_decl = TREE_VALUE (args); - if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE - TREE_CODE (wrap_decl) != VAR_DECL - TREE_CODE (wrap_decl) != FUNCTION_DECL) + if (error_operand_p (wrap_decl)) +; + else if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE + TREE_CODE (wrap_decl) != VAR_DECL + TREE_CODE (wrap_decl) != FUNCTION_DECL) error (%qE argument not an identifier, name); else { @@ -8553,7 +8558,8 @@ handle_vector_size_attribute (tree *node, tree name, tree args, *no_add_attrs = true; size = TREE_VALUE (args); - if (size TREE_CODE (size) != IDENTIFIER_NODE) + if (size TREE_CODE (size) != IDENTIFIER_NODE + TREE_CODE (size) != FUNCTION_DECL) size = default_conversion (size); if (!tree_fits_uhwi_p (size)) @@ -8964,8 +8970,12 @@ handle_sentinel_attribute (tree *node, tree name, tree args, if (args) { tree position = TREE_VALUE (args); + if (position TREE_CODE (position) != IDENTIFIER_NODE + TREE_CODE (position) != FUNCTION_DECL) + position = default_conversion (position); - if (TREE_CODE (position) != INTEGER_CST) + if (TREE_CODE 
(position) != INTEGER_CST + || !INTEGRAL_TYPE_P (TREE_TYPE (position))) { warning (OPT_Wattributes, "requested position is not an integer constant"); diff --git gcc/c/c-parser.c gcc/c/c-parser.c index 7947355..48f8d2f 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -3955,11 +3955,16 @@ c_parser_attributes (c_parser *parser) In objective-c the identifier may be a classname. */ if (c_parser_next_token_is (parser, CPP_NAME) && (c_parser_peek_token (parser)->id_kind == C_ID_ID -
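For contrast with the erroneous-argument cases the patch above hardens, here is a minimal sketch of well-formed uses of the same attributes (GCC's `__attribute__` extension is assumed; the helper name is illustrative, not from the patch):

```c
#include <stddef.h>

/* A well-formed use of the attributes whose error handling the patch
   hardens; the parser change only affects how bad arguments are diagnosed. */
struct aligned16 { char c; } __attribute__((aligned(16)));

extern void f1(char *p) __attribute__((nonnull(1)));

size_t aligned16_alignment(void)
{
  return _Alignof(struct aligned16);
}
```

With a valid constant argument, `aligned(16)` both aligns and pads the type to 16 bytes.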
Re: [PATCH][ARM] Handle simple SImode PLUS and MINUS operations in rtx costs
On 24/03/14 17:21, Kyrill Tkachov wrote: Hi all, I noticed that we don't handle simple reg-to-reg arithmetic operations in the arm rtx cost functions. We should be adding the cost of alu.arith to the costs of the operands. This patch does that. Since we don't have any cost tables yet that have a non-zero value for that field it shouldn't affect code-gen for any current cores. Bootstrapped and tested on arm-none-linux-gnueabihf. Ok for next stage1? Thanks, Kyrill 2014-03-24 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/arm.c (arm_new_rtx_costs): Handle reg-to-reg PLUS and MINUS RTXs. I think the convention for big switch statements like this is to write * config/arm/arm.c (arm_new_rtx_costs, case PLUS, MINUS): However... rtx-costs-plus-minus.patch diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index bf5d1b2..3102b67 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -9428,6 +9428,14 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, *cost += rtx_cost (XEXP (x, 1), code, 1, speed_p); return true; } + else if (REG_P (XEXP (x, 0))) + { + *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p) ++ rtx_cost (XEXP (x, 1), code, 1, speed_p); + if (speed_p) + *cost += extra_cost-alu.arith; + return true; + } This still doesn't handle the case where the first operand is not a register. Furthermore, the only difference between this and what we had before is that we now add alu.arith to the overall cost. But we want to do that anyway. I think you really just need else if (speed_p) *cost += extra_cost.alu_arith; and then let recursion (return false) handle the cost of non-register operands (register will always cost as zero). 
return false; } @@ -9663,6 +9671,14 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, *cost += rtx_cost (XEXP (x, 0), PLUS, 0, speed_p); return true; } + else if (REG_P (XEXP (x, 1))) + { + *cost += rtx_cost (XEXP (x, 0), PLUS, 0, speed_p) ++ rtx_cost (XEXP (x, 1), PLUS, 1, speed_p); + if (speed_p) + *cost += extra_cost->alu.arith; + return true; + } The same here. return false; } R.
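The reviewer's point — add only the operation's own `alu.arith` cost and let the generic recursion price the operands — can be modeled in miniature. The names below are illustrative, not GCC's rtx-cost API:

```c
#include <stdbool.h>

/* Hypothetical miniature of the review comment: charge the operation's own
   ALU cost once, and let recursion price each operand. A register operand
   costs zero, so the reg-to-reg case needs no special branch at all. */
enum operand_kind { OP_REG, OP_EXPR };

struct operand { enum operand_kind kind; int expr_cost; };

static const int alu_arith_cost = 2;   /* assumed per-core cost-table entry */

static int operand_cost(struct operand op)
{
  return op.kind == OP_REG ? 0 : op.expr_cost;
}

/* cost of PLUS (a, b): the equivalent of
   "*cost += extra_cost->alu.arith; return false;" plus operand recursion */
int plus_rtx_cost(struct operand a, struct operand b)
{
  return alu_arith_cost + operand_cost(a) + operand_cost(b);
}
```

This is why `return false` suffices: non-register operands get priced by the caller's recursion, and registers contribute nothing.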
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Svante Signell, on Fri 02 May 2014 10:03:23 +0200, wrote: On Fri, 2014-05-02 at 00:45 +0200, Samuel Thibault wrote: Hello, Svante Signell, on Thu 24 Apr 2014 10:39:10 +0200, wrote: - Without split stack enabled, around 70 libgo tests pass and 50 fail, most of them with a segfault. - Enabling split stack and using the libc Samuel built, all 122 libgo tests fail with a segfault. Please provide a segfault backtrace, rpctrace, etc. Otherwise we won't be able to help you. You already have one in this thread, analysed by Justus: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01736.html Yes, but that backtrace was taken without the split-stack fix. The backtrace you have just attached seems different. Samuel
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Justus Winter, on Sat 26 Apr 2014 08:53:08 +0200, wrote: task130(pid1182)->vm_map (0 49880 0 1133--160(pid1182) 0 1 5 7 1) = 0 2453504 We map that somewhere. task130(pid1182)->mach_port_deallocate (pn{ 25}) = 0 Deallocate the port. Again, for some strange reason 133 == pn{ 25}. 158--157(pid1182)->io_map_request () = 0133--162(pid1182) (null) Some more io_map. task130(pid1182)->vm_map (2498560 8192 0 0133--162(pid1182) 40960 1 3 7 1) = 0x3 ((os/kern) no space available) task130(pid1182)->vm_deallocate (2498560 8192) = 0 Hum? task130(pid1182)->vm_map (2498560 8192 0 0133--162(pid1182) 40960 1 3 7 1) = 0 2498560 task130(pid1182)->mach_port_deallocate (pn{ 25}) = 0 Success! See the logic in elf/dl-load.c's _dl_map_object_from_fd and sysdeps/mach/hurd/mmap.c. _dl_map_object_from_fd first gets somewhere to map the SO (first __mmap call), and then, if bss is big, it maps anonymous pages for it with MAP_FIXED, replacing the bss part of the SO: 2453504 + 49880 = 0x2632d8 2498560 + 8192 = 0x264000 So this seems like just normal behavior. Samuel
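The address arithmetic at the end of the analysis checks out mechanically — the retried `vm_map` is indeed a page-aligned anonymous mapping covering the tail of the shared object's file mapping, consistent with the MAP_FIXED bss replacement. A quick check (constants taken from the trace):

```c
/* Checking the addresses from the rpctrace above: the failing-then-retried
   vm_map is the anonymous MAP_FIXED mapping for bss, and it overlaps the
   final page of the shared object's file mapping. */
static const unsigned long so_base  = 2453504;  /* first vm_map result   */
static const unsigned long so_len   = 49880;
static const unsigned long bss_base = 2498560;  /* MAP_FIXED target      */
static const unsigned long bss_len  = 8192;

unsigned long so_end(void)  { return so_base + so_len; }
unsigned long bss_end(void) { return bss_base + bss_len; }

int bss_overlaps_so_tail(void)
{
  return (bss_base & 0xfff) == 0      /* page aligned                    */
         && bss_base < so_end()       /* starts inside the SO mapping    */
         && so_end() < bss_end();     /* and covers its tail             */
}
```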
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Svante Signell, on Fri 02 May 2014 10:18:12 +0200, wrote: task130(pid1182)->vm_allocate (33562796 8364 0) = 0x3 ((os/kern) no space available) task130(pid1182)->vm_allocate (33571160 8364 0) = 0 33570816 While inspecting this, I realized this is from __pthread_stack_alloc, the only caller of vm_allocate with anywhere set to 0 which would have such behavior. 8364 is really small for a stack (but that's expected from -fsplit-stack), and the thing is: we have a bogus libpthread which includes the guard size in the stack size. I guess this is what happens: gcc believes there is 8K of stack, but our libpthread actually removes 4K from it for the guard, so the process will crash as soon as 4K are used on the stack. So we just need to fix guardsize in our libpthread. Samuel
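The accounting bug described here can be sketched in a couple of lines. POSIX specifies that the guard area is allocated *in addition to* the requested stack size; a libpthread that carves the guard out of the allocation leaves less usable stack than `-fsplit-stack` asked for (the helper name is illustrative):

```c
#include <stddef.h>

/* Sketch of the guard-size accounting bug: with guard_is_extra == 0 the
   guard page is subtracted from the caller's request, so a thread asking
   for ~8K crashes after using ~4K of stack. */
size_t usable_stack(size_t requested, size_t guard, int guard_is_extra)
{
  return guard_is_extra ? requested : requested - guard;
}
```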
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Samuel Thibault, on Fri 02 May 2014 11:57:53 +0200, wrote: So we just need to fix guardsize in our libpthread. (And I'll have a look at it). Samuel
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 29 April 2014 03:37, Kugan kugan.vivekanandara...@linaro.org wrote: On 28/04/14 21:01, Ramana Radhakrishnan wrote: On 04/26/14 11:57, Kugan wrote: Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk? Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments. Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu. Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be syncrhronized by a context synchronization operation so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations so it would be good to ensure that both the write and the isb are scheduled independently. IIRC there si I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASS. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I will appreciate if you could do the regression testing on real hw. Once the ISB issue is resolved I'll give the patch a spin on HW here. /Marcus
Re: [PATCH], RFC, add support for __float128/__ibm128 types on PowerPC
Hi! On Tue, Apr 29, 2014 at 06:30:32PM -0400, Michael Meissner wrote: This patch adds support for a new type (__float128) on the PowerPC to allow people to use the 128-bit IEEE floating point format instead of the traditional IBM double-double that has been used in the Linux compilers. At this time, long double still will remain using the IBM double-double format. There has been an undocumented option to switch long double to to IEEE 128-bit, but right now, there are bugs I haven't ironed out on VSX systems. In addition, I added another type (__ibm128) so that when the transition is eventually made, people can use this type to get the old long double type. I was wondering if people had any comments on the code so far, and things I should different. Note, I will be out on vacation May 6th - 14th, so I don't expect to submit the patches until I get back. For mangling, if you are going to mangle it the same as the -mlong-double-64 long double, is __float128 going to be supported solely for ELFv2 ABI and are you sure nobody has ever used -mlong-double-64 or --without-long-double-128 configured compiler for it? What is the plan for glibc (and for libstdc++)? Looking at current ppc64le glibc, it seems it mistakenly still supports the -mlong-double-64 stuff (e.g. printf calls are usually redirected to __nldbl_printf (and tons of other calls). So, is the plan to use yet another set of symbols? For __nldbl_* it is about 113 entry points in libc.so and 1 in libm.so, but if you are going to support all of -mlong-double-64, -mlong-double-128 as well as __float128, that would be far more, because the compat -mlong-double-64 support mostly works by redirecting, either in headers or through a special *.a library, to corresponding double entry points whenever possible. So, if you call logl in -mlong-double-64 code, it will be redirected to log, because it has the same ABI. But if you call *printf or nexttowardf etc. 
where there is no ABI-compatible double entry point, it needs to be a new symbol. But with __float128 vs. __ibm128, and long double being either of those, you need a different logl. Which is why it is such a huge problem that this wasn't resolved initially as part of the ELFv2 changes. Jakub
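The core of the ABI problem is that two distinct 128-bit storage formats cannot share one `logl` entry point. The types can be illustrated with x86-64 GCC, which already exposes `__float128` (assumption: a GCC/glibc x86-64 host; on that target `long double` is 80-bit extended padded to 16 bytes, not binary128):

```c
/* Three distinct floating formats in play on a typical x86-64 GCC host:
   double (binary64), long double (x87 extended, padded), and __float128
   (IEEE binary128). Same storage size does not mean same format or ABI. */
unsigned float128_bytes(void)   { return sizeof(__float128);  }
unsigned double_bytes(void)     { return sizeof(double);      }
unsigned longdouble_bytes(void) { return sizeof(long double); }
```

The PowerPC situation is the same shape: `__float128` and `__ibm128` are both 16 bytes but encode values differently, so every `long double` libm/libc entry point must exist per-format.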
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 2 May 2014 11:06, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be synchronized by a context synchronization operation, so we need to plant an ISB after the write. Both the write and the ISB are likely to be expensive on some implementations, so it would be good to ensure that the write and the ISB are scheduled independently. IIRC there si Sorry, incomplete sentence. I had started to write that IIRC the same issue did not apply to FPSCR in the ARM patch. I have double-checked, and the FPSCR does not have the issue, so the ARM patch is fine in this respect. /Marcus
[RS6000] PR60737, expand_block_clear uses word stores
In cases where the compiler has no alignment info, powerpc64le-linux gcc generates byte-at-a-time copies for -mstrict-align (which is on for little-endian power7). That's awful code, a problem shared by other strict-align targets, see pr50417. However, we also have a case when -mno-strict-align generates less than ideal code, which I believe stems from using alignment as a proxy for testing an address offset. See http://gcc.gnu.org/ml/gcc-patches/1999-09n/msg01072.html. So my first attempt at fixing this problem looked at address offsets directly. That worked fine too, but on thinking some more, I believe we no longer have the movdi restriction. Nowadays we'll reload the address if we have an offset that doesn't satisfy the Y constraint (ie. a multiple of 4 offset). Which led to this simpler patch. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? PR target/60737 * config/rs6000/rs6000.c (expand_block_move): Allow 64-bit loads and stores when -mno-strict-align at any alignment. (expand_block_clear): Similarly. Also correct calculation of instruction count. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 209926) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -15438,7 +15422,7 @@ expand_block_clear (rtx operands[]) load zero and three to do clearing. */ if (TARGET_ALTIVEC && align >= 128) clear_step = 16; - else if (TARGET_POWERPC64 && align >= 32) + else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT)) clear_step = 8; else if (TARGET_SPE && align >= 64) clear_step = 8; @@ -15466,9 +15450,7 @@ expand_block_clear (rtx operands[]) mode = V2SImode; } else if (bytes >= 8 && TARGET_POWERPC64 - /* 64-bit loads and stores require word-aligned - displacements.
*/ - && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32))) + && (align >= 64 || !STRICT_ALIGNMENT)) { clear_bytes = 8; mode = DImode; } @@ -15599,9 +15581,7 @@ expand_block_move (rtx operands[]) gen_func.movmemsi = gen_movmemsi_4reg; } else if (bytes >= 8 && TARGET_POWERPC64 - /* 64-bit loads and stores require word-aligned - displacements. */ - && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32))) + && (align >= 64 || !STRICT_ALIGNMENT)) { move_bytes = 8; mode = DImode; -- Alan Modra Australia Development Lab, IBM
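The store-width selection the patch changes can be sketched in plain C (this is a simplification, not GCC's actual code — the real function also has AltiVec and SPE cases):

```c
#include <stdbool.h>

/* Sketch of how the patch changes the store width expand_block_clear picks:
   with -mno-strict-align, 64-bit stores are now allowed at any alignment,
   since the address gets reloaded when its offset fails the 'Y' constraint
   (a multiple-of-4 displacement). */
int clear_step_bytes(int align_bits, bool powerpc64, bool strict_align)
{
  if (powerpc64 && (align_bits >= 64 || !strict_align))
    return 8;      /* DImode stores (previously needed align >= 32) */
  if (align_bits >= 32)
    return 4;      /* SImode stores */
  return 1;        /* strict-align fallback: byte at a time */
}
```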
Re: [AArch64] Fix integer vabs intrinsics
On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote: On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com wrote: On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote: On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the neon intrinsics vabs intrinsics should behave as the hardware. That is to say, the pseudo-code sequence: Only for signed integer types. You should be able to use an unsigned integer type here instead. If anything, I think that puts us in a worse position. Not if you cast it back. The issue that inspires this patch is that GCC will happily fold: t1 = ABS_EXPR (x) t2 = GE_EXPR (t1, 0) to t2 = TRUE Surely an unsigned integer type is going to suffer the same fate? Certainly I can imagine somewhere in the compiler there being a fold path for: Yes but if add a cast from the unsigned type to the signed type gcc does not optimize that. If it does it is a bug since the overflow is defined there. I'm not sure I understand, are you saying I want to fold to: t1 = VIEW_CONVERT_EXPR (x, unsigned) t2 = ABS_EXPR (t1) t3 = VIEW_CONVERT_EXPR (t2, signed) Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each other out leading to an overall NOP? It might just be Friday morning and a lack of coffee talking, but I think I need you to spell this one out to me in big letters! (unsigned = 0) == TRUE a = vabs_s8 (vdup_n_s8 (-128)); assert (a = 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. No it is not. The mistake is to use signed integer types here. Just add a conversion to an unsigned integer vector and it will work correctly. 
In fact the ABS rtl code is not undefined for the overflow. Here we are covering ourselves against a seperate issue. For auto-vectorized code we want the SABD combine patterns to kick in whenever sensible. For intrinsics code, in the case where vsub_s8 (x, y) would cause an underflow: vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y) So in this case, the combine would be erroneous. Likewise SABA. This sounds like it would problematic for unsigned types and not just for vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8 instead. Since in rtl overflow and underflow is defined to be wrapping. There are no vabs_u8/vabd_u8 so I don't see how we can reach this point with unsigned types. Further, I have never thought of RTL having signed and unsigned types, just a bag of bits. We'll want to use unspec for the intrinsic version of vabd_s8 - but we'll want to specify the (abs (minus (reg) (reg))) behaviour so that auto-vectorized code can pick it up. So in the end we'll have these patterns: (abs (abs (reg))) (intrinsic_abs (unspec [(reg)] UNSPEC_ABS)) (abd (abs (minus (reg) (reg (intrinsic_abd (unspec [(reg) (reg)] UNSPEC_ABD)) (aba (plus (abs (minus (reg) (reg))) (reg))) (intrinsic_aba (plus (unspec [(reg) (reg)] UNSPEC_ABD) (reg))) which should give us reasonable auto-vectorized code without triggering any of the issues mapping the semantics of the instructions to intrinsics. Thanks, James Thanks, Andrew Pinski Thanks, James
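The hardware semantics under discussion — `vabs_s8` of -128 yielding -128 — can be modeled in scalar form via Andrew's unsigned-arithmetic suggestion. Note the final cast back to `int8_t` is implementation-defined in ISO C, though GCC defines it to wrap:

```c
#include <stdint.h>

/* Scalar model of wrapping abs, the behaviour the vabs instruction has:
   abs(INT8_MIN) == INT8_MIN. Computing through unsigned arithmetic keeps
   the operation defined; the narrowing cast wraps on GCC. */
int8_t wrap_abs_s8(int8_t x)
{
  uint8_t u = (uint8_t) x;
  uint8_t a = (x < 0) ? (uint8_t)(0u - u) : u;
  return (int8_t) a;
}
```

This is precisely the value an `ABS_EXPR` fold would get wrong: `abs(x) >= 0` does not hold for this function.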
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Samuel Thibault, on Fri 02 May 2014 11:57:53 +0200, wrote: So we just need to fix guardsize in our libpthread. It was actually not so difficult. Svante, could you try this libpthread: http://people.debian.org/~sthibault/tmp/libpthread.so.0.3 Thanks, Samuel
[C++ Patch/RFC] PR 58582
Hi, this relatively old bug report is about an ICE during error recovery, and some time ago I figured out that the simple tweak to grokfndecl below worked for it (we notice that duplicate_decls returned error_mark_node). Today, however, while formatting the original testcases, I noticed something slightly more interesting: at variance with both current clang and icc, we reject the below deleted6.C; thus we reject explicit instantiations of deleted template functions (without explaining that the function is deleted in the error message, and still accepting the code with -fpermissive (?)). Thus the below tweak to instantiate_decl. Without it, 58582 is still fixed of course, only the testcases need a little adjusting. Tested on x86_64-linux. Thanks, Paolo. // Index: cp/decl.c === --- cp/decl.c (revision 210004) +++ cp/decl.c (working copy) @@ -7822,6 +7822,8 @@ grokfndecl (tree ctype, decl, ctype); return NULL_TREE; } + if (ok == error_mark_node) + return NULL_TREE; return old_decl; } } Index: cp/pt.c === --- cp/pt.c (revision 210004) +++ cp/pt.c (working copy) @@ -19698,7 +19699,8 @@ instantiate_decl (tree d, int defer_ok, if (at_eof && !pattern_defined && DECL_EXPLICIT_INSTANTIATION (d) - && DECL_NOT_REALLY_EXTERN (d)) + && DECL_NOT_REALLY_EXTERN (d) + && (TREE_CODE (d) != FUNCTION_DECL || !DECL_DELETED_FN (d))) /* [temp.explicit] The definition of a non-exported function template, a Index: testsuite/g++.dg/cpp0x/deleted4.C === --- testsuite/g++.dg/cpp0x/deleted4.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted4.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + template<int> void foo() = delete; +}; + +template<int> void A::foo() { int i; } // { dg-error "redefinition" } + +template void A::foo<0>(); Index: testsuite/g++.dg/cpp0x/deleted5.C === --- testsuite/g++.dg/cpp0x/deleted5.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted5.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + 
template<int> void foo() = delete; +}; + +template<int> void A::foo() {} // { dg-error "redefinition" } + +template void A::foo<0>(); Index: testsuite/g++.dg/cpp0x/deleted6.C === --- testsuite/g++.dg/cpp0x/deleted6.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted6.C (working copy) @@ -0,0 +1,9 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + template<int> void foo() = delete; +}; + +template void A::foo<0>();
Re: [AArch64] Fix integer vabs intrinsics
On 02/05/14 11:28, James Greenhalgh wrote: On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote: On May 2, 2014, at 2:21 AM, James Greenhalgh james.greenha...@arm.com wrote: On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote: On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as undefined/impossible, the neon intrinsics vabs intrinsics should behave as the hardware. That is to say, the pseudo-code sequence: Only for signed integer types. You should be able to use an unsigned integer type here instead. If anything, I think that puts us in a worse position. Not if you cast it back. The issue that inspires this patch is that GCC will happily fold: t1 = ABS_EXPR (x) t2 = GE_EXPR (t1, 0) to t2 = TRUE Surely an unsigned integer type is going to suffer the same fate? Certainly I can imagine somewhere in the compiler there being a fold path for: Yes but if add a cast from the unsigned type to the signed type gcc does not optimize that. If it does it is a bug since the overflow is defined there. I'm not sure I understand, are you saying I want to fold to: t1 = VIEW_CONVERT_EXPR (x, unsigned) t2 = ABS_EXPR (t1) t3 = VIEW_CONVERT_EXPR (t2, signed) Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each other out leading to an overall NOP? It might just be Friday morning and a lack of coffee talking, but I think I need you to spell this one out to me in big letters! I agree. I think what you need is a type widening so that you get t1 = VEC_WIDEN (x) t2 = ABS_EXPR (t1) t3 = VEC_NARROW (t2) This then guarantees that the ABS expression cannot be undefined. I'm less sure, however about the narrow causing a change in 'sign'. Has it just punted the problem? Maybe you need t1 = VEC_WIDEN (x) t2 = ABS_EXPR (t1) t3 = VIEW_CONVERT_EXPR (x, unsigned) t4 = VEC_NARROW (t3) t5 = VIEW_CONVERT_EXPR (t4, signed) !!! 
How you capture this into RTL during expand, though, is another thing. R. (unsigned = 0) == TRUE a = vabs_s8 (vdup_n_s8 (-128)); assert (a = 0); does not hold. As in hardware abs (-128) == -128 Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should avoid it. In fact, we have to be even more careful than that, and keep the integer vabs intrinsics as an unspec in the back end. No it is not. The mistake is to use signed integer types here. Just add a conversion to an unsigned integer vector and it will work correctly. In fact the ABS rtl code is not undefined for the overflow. Here we are covering ourselves against a seperate issue. For auto-vectorized code we want the SABD combine patterns to kick in whenever sensible. For intrinsics code, in the case where vsub_s8 (x, y) would cause an underflow: vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y) So in this case, the combine would be erroneous. Likewise SABA. This sounds like it would problematic for unsigned types and not just for vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8 instead. Since in rtl overflow and underflow is defined to be wrapping. There are no vabs_u8/vabd_u8 so I don't see how we can reach this point with unsigned types. Further, I have never thought of RTL having signed and unsigned types, just a bag of bits. We'll want to use unspec for the intrinsic version of vabd_s8 - but we'll want to specify the (abs (minus (reg) (reg))) behaviour so that auto-vectorized code can pick it up. So in the end we'll have these patterns: (abs (abs (reg))) (intrinsic_abs (unspec [(reg)] UNSPEC_ABS)) (abd (abs (minus (reg) (reg (intrinsic_abd (unspec [(reg) (reg)] UNSPEC_ABD)) (aba (plus (abs (minus (reg) (reg))) (reg))) (intrinsic_aba (plus (unspec [(reg) (reg)] UNSPEC_ABD) (reg))) which should give us reasonable auto-vectorized code without triggering any of the issues mapping the semantics of the instructions to intrinsics. Thanks, James Thanks, Andrew Pinski Thanks, James
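Richard's widening idea can be checked in scalar form: widen so the absolute value cannot overflow, take it, then narrow modulo 2^8, which reproduces the wrap-around the instruction performs for INT8_MIN (the narrowing cast is implementation-defined in ISO C but wraps on GCC):

```c
#include <stdint.h>

/* Scalar model of the widen / abs / narrow sequence proposed above. */
int8_t abs_via_widen(int8_t x)
{
  int16_t w = x;                        /* widen: abs is well-defined   */
  int16_t a = w < 0 ? (int16_t) -w : w; /* abs cannot overflow in 16 bits */
  return (int8_t)(uint8_t) a;           /* narrow mod 256 (wraps on GCC) */
}
```

The value semantics come out right; as Richard notes, the open question is how expand would capture this sequence in RTL without reintroducing the ABS fold.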
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
On Fri, 2014-05-02 at 12:00 +0200, Samuel Thibault wrote: Samuel Thibault, on Fri 02 May 2014 11:57:53 +0200, wrote: So we just need to fix guardsize in our libpthread. (And I'll have a look at it). Maybe this can also fix the roughly 40 segfaults (out of 50 failures) seen when split stack is disabled? The segfaults are always around the same place in libpthread.
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Svante Signell, on Fri 02 May 2014 12:45:56 +0200, wrote: On Fri, 2014-05-02 at 12:00 +0200, Samuel Thibault wrote: Samuel Thibault, on Fri 02 May 2014 11:57:53 +0200, wrote: So we just need to fix guardsize in our libpthread. (And I'll have a look at it). Maybe this can also fix the roughly 40 segfaults (out of 50 failures) seen when split stack is disabled? The segfaults are always around the same place in libpthread. Possibly, if libgo uses small stacks already. Samuel
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 05/02/14 10:08, Kugan wrote: diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 347a94a..8bd13f3 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -9107,6 +9107,7 @@ to those machines. Generally these generate calls to specific machine instructions, but allow the compiler to schedule those calls. @menu +* AARCH64 Built-in Functions:: * Alpha Built-in Functions:: * Altera Nios II Built-in Functions:: * ARC Built-in Functions:: @@ -9139,6 +9140,18 @@ instructions, but allow the compiler to schedule those calls. * TILEPro Built-in Functions:: @end menu +@node AARCH64 Built-in Functions +@subsection AARCH64 Built-in Functions + +These built-in functions are available for the AARCH64 family of +processors. +@smallexample +unsigned int __builtin_aarch64_get_fpcr () +void __builtin_aarch64_set_fpcr (unsigned int) +unsigned int __builtin_aarch64_get_fpsr () +void __builtin_aarch64_set_fpsr (unsigned int) +@end smallexample + @node Alpha Built-in Functions @subsection Alpha Built-in Functions Please s/AARCH64/AArch64 to stay consistent with the existing usage, e.g. those in invoke.texi. Thanks, Yufeng
Re: [Patch, Fortran] Fix an issue with CLASS and -fcoarray=lib on the trunk
Dominique Dhumieres wrote: This causes several failures with -m32 (see http://gcc.gnu.org/ml/gcc-regression/2014-05/msg3.html): Sorry for the breakage - obviously, I forgot to test with -m32. Most of the failures are fixed by replacing 'kind=8' with 'kind=.' or 'kind=\[48\]'. Either change is fine with me. The remaining failures in gfortran.dg/coarray_lib_this_image_*.f90 are fixed by the following patches The patch is okay with a proper ChangeLog; please also list PR fortran/61025. Thanks, Tobias
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 02/05/14 20:06, Marcus Shawcroft wrote: On 29 April 2014 03:37, Kugan kugan.vivekanandara...@linaro.org wrote: On 28/04/14 21:01, Ramana Radhakrishnan wrote: On 04/26/14 11:57, Kugan wrote: Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk? Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments. Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu. Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be syncrhronized by a context synchronization operation so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations so it would be good to ensure that both the write and the isb are scheduled independently. IIRC there si I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASS. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I will appreciate if you could do the regression testing on real hw. Once the ISB issue is resolved I'll give the patch a spin on HW here. Here is the modified patch which also includes changes Yufeng has suggested. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. 
Thanks, Kugan gcc/ +2014-05-02 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv): + New function declaration. + * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add + AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR. + AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR +and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. 
+ diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..a5af874 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = { enum aarch64_builtins { AARCH64_BUILTIN_MIN, + + AARCH64_BUILTIN_GET_FPCR, + AARCH64_BUILTIN_SET_FPCR, + AARCH64_BUILTIN_GET_FPSR, + AARCH64_BUILTIN_SET_FPSR, + AARCH64_SIMD_BUILTIN_BASE, #include aarch64-simd-builtins.def AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE @@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void) void aarch64_init_builtins (void) { + tree ftype_set_fpr += build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_get_fpr += build_function_type_list (unsigned_type_node, NULL); + + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR] += add_builtin_function (__builtin_aarch64_get_fpcr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR] += add_builtin_function (__builtin_aarch64_set_fpcr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR] += add_builtin_function (__builtin_aarch64_get_fpsr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR] += add_builtin_function (__builtin_aarch64_set_fpsr,
Re: [PATCH], RFC, add support for __float128/__ibm128 types on PowerPC
On Fri, 2014-05-02 at 12:13 +0200, Jakub Jelinek wrote: Hi! On Tue, Apr 29, 2014 at 06:30:32PM -0400, Michael Meissner wrote: This patch adds support for a new type (__float128) on the PowerPC to allow people to use the 128-bit IEEE floating point format instead of the traditional IBM double-double that has been used in the Linux compilers. At this time, long double still will remain using the IBM double-double format. There has been an undocumented option to switch long double to to IEEE 128-bit, but right now, there are bugs I haven't ironed out on VSX systems. In addition, I added another type (__ibm128) so that when the transition is eventually made, people can use this type to get the old long double type. I was wondering if people had any comments on the code so far, and things I should different. Note, I will be out on vacation May 6th - 14th, so I don't expect to submit the patches until I get back. For mangling, if you are going to mangle it the same as the -mlong-double-64 long double, is __float128 going to be supported solely for ELFv2 ABI and are you sure nobody has ever used -mlong-double-64 or --without-long-double-128 configured compiler for it? What is the plan for glibc (and for libstdc++)? Looking at current ppc64le glibc, it seems it mistakenly still supports the -mlong-double-64 stuff (e.g. printf calls are usually redirected to __nldbl_printf (and tons of other calls). So, is the plan to use yet another set of symbols? For __nldbl_* it is about 113 entry points in libc.so and 1 in libm.so, but if you are going to support all of -mlong-double-64, -mlong-double-128 as well as __float128, that would be far more, because the compat -mlong-double-64 support mostly works by redirecting, either in headers or through a special *.a library, to corresponding double entry points whenever possible. So, if you call logl in -mlong-double-64 code, it will be redirected to log, because it has the same ABI. But if you call *printf or nexttowardf etc. 
where there is no ABI-compatible double entry point, it needs to be a new symbol. But with __float128 vs. __ibm128 and long double being either of those, you need a different logl. Yes, and we will work on a plan to do this. But at this time and in the near future there is no performance advantage to __float128 over IBM long double. Which is why it is such a huge problem that this hasn't been resolved initially as part of the ELFv2 changes. Because it was a huge problem and there was no way for the required GCC support to be available in time for GLIBC-2.19. So we will develop an orderly, step-by-step transition plan. This will take some time.
[PATCH] Describe Wsizeof-pointer-memaccess in c.opt
The Wsizeof-pointer-memaccess option is missing a description in the c.opt file. This patch adds something in there. Ok? 2014-05-02 Marek Polacek pola...@redhat.com * c.opt (Wsizeof-pointer-memaccess): Describe option. diff --git gcc/c-family/c.opt gcc/c-family/c.opt index 1bbb81c..7aa9e23 100644 --- gcc/c-family/c.opt +++ gcc/c-family/c.opt @@ -516,6 +516,7 @@ Warn about missing fields in struct initializers Wsizeof-pointer-memaccess C ObjC C++ ObjC++ Var(warn_sizeof_pointer_memaccess) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +Warn about suspicious length parameters to certain string functions if the argument uses sizeof Wsuggest-attribute=format C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning Marek
Re: [C PATCH] Another locus improvements (PR c/60257)
On Fri, May 02, 2014 at 12:11:02AM +, Joseph S. Myers wrote: On Sat, 26 Apr 2014, Marek Polacek wrote: What is worth mentioning is that the (near initialization for X) message seems next to useless to me now with caret diagnostics (?). The caret diagnostics point to the initializer. near initialization says (or should say) what field GCC thinks is being initialized, which may not be obvious from looking at the initializer (consider a 100-element structure where the initializer for one element is missing, resulting in errors for subsequent elements having the wrong type; being told which element GCC thinks matches the initializer with a bad type may help you find where the missing initializer is). I see, thanks. It would be nice if we could show the previous initialization in some note message... Marek
Re: [PATCH ARM] Improve ARM memset inlining
On 30/04/14 03:52, bin.cheng wrote: Hi, This patch expands small memset calls into direct memory set instructions by introducing a setmemsi pattern. For processors without NEON support, it expands memset using general store instructions. For example, strd for 4-byte aligned addresses. For processors with NEON support, it expands memset using NEON instructions like vstr and miscellaneous vst1.* instructions for both aligned and unaligned cases. This patch depends on http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01923.html otherwise vst1.64 will be generated for 32-bit aligned memory units. There is also one piece of leftover work from this patch: Since vst1.* instructions only support post-increment addressing mode, the inlined memset for unaligned neon cases should be like: vmov.i32 q8, #... vst1.8 {q8}, [r3]! vst1.8 {q8}, [r3]! vst1.8 {q8}, [r3]! vst1.8 {q8}, [r3] Other than for zero, I'd expect the vmov to be vmov.i8 to move an arbitrary byte value into all lanes in a vector. After that, if the alignment is known to be more than 8-bit, I'd expect the vst1 instructions (with the exception of the last store if the length is not a multiple of the alignment) to use vst1.align {reg}, [addr-reg :align]! Hence, for 16-bit aligned data, we want vst1.16 {q8}, [r3:16]! But for now, gcc can't do this and the code below is generated: vmov.i32 q8, #... vst1.8 {q8}, [r3] add r2, r3, #16 add r3, r2, #16 vst1.8 {q8}, [r2] vst1.8 {q8}, [r3] add r2, r3, #16 vst1.8 {q8}, [r2] I investigated this issue. The root cause lies in the rtx cost returned by the ARM backend. Anyway, I think this is another issue and should be fixed in a separate patch. Bootstrap and reg-test on cortex-a15, with or without NEON support. Is it OK? Some more comments inline. Thanks, bin 2014-04-29 Bin Cheng bin.ch...@arm.com PR target/55701 * config/arm/arm.md (setmem): New pattern. * config/arm/arm-protos.h (struct tune_params): New field. (arm_gen_setmem): New prototype. * config/arm/arm.c (arm_slowmul_tune): Initialize new field. 
(arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto. (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto. (arm_cortex_a8_tune, arm_cortex_a7_tune): Ditto. (arm_cortex_a15_tune, arm_cortex_a53_tune): Ditto. (arm_cortex_a57_tune, arm_cortex_a5_tune): Ditto. (arm_cortex_a9_tune, arm_cortex_a12_tune): Ditto. (arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune): Ditto. (arm_const_inline_cost): New function. (arm_block_set_max_insns): New function. (arm_block_set_straight_profit_p): New function. (arm_block_set_vect_profit_p): New function. (arm_block_set_unaligned_vect): New function. (arm_block_set_aligned_vect): New function. (arm_block_set_unaligned_straight): New function. (arm_block_set_aligned_straight): New function. (arm_block_set_vect, arm_gen_setmem): New functions. gcc/testsuite/ChangeLog 2014-04-29 Bin Cheng bin.ch...@arm.com PR target/55701 * gcc.target/arm/memset-inline-1.c: New test. * gcc.target/arm/memset-inline-2.c: New test. * gcc.target/arm/memset-inline-3.c: New test. * gcc.target/arm/memset-inline-4.c: New test. * gcc.target/arm/memset-inline-5.c: New test. * gcc.target/arm/memset-inline-6.c: New test. * gcc.target/arm/memset-inline-7.c: New test. * gcc.target/arm/memset-inline-8.c: New test. * gcc.target/arm/memset-inline-9.c: New test. j1328-20140429.txt Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c (revision 209852) +++ gcc/config/arm/arm.c (working copy) @@ -1585,10 +1585,11 @@ const struct tune_params arm_slowmul_tune = true, /* Prefer constant pool. */ arm_default_branch_cost, false, /* Prefer LDRD/STRD. */ - {true, true}, /* Prefer non short circuit. */ - arm_default_vec_cost,/* Vectorizer costs. */ - false,/* Prefer Neon for 64-bits bitops. */ - false, false /* Prefer 32-bit encodings. */ + {true, true}, /* Prefer non short circuit. */ + arm_default_vec_cost,/* Vectorizer costs. */ + false,/* Prefer Neon for 64-bits bitops. */ + false, false, /* Prefer 32-bit encodings. */ + false /* Prefer Neon for stringops. 
*/ }; Please make sure that all the white space before the comments is using TAB, not spaces. Similarly for the other
Re: [PATCH AARCH64] One-line tidy of bit-twiddle expression in aarch64.c
Whilst I agree with Richard H that it is obvious, my feeling is that the assertion does no harm, so have committed rev 210005 with Richard E's changes. --Alan Richard Henderson wrote: On 04/29/2014 05:42 AM, Richard Earnshaw wrote: On 23/04/14 16:20, Alan Lawrence wrote: This patch is a small tidy of a more-complicated expression that just flips a single bit and can thus be a simple XOR. No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code is indeed exercised by dg-torture.exp vshuf-v*.c). Also ok after applying TBL and testsuite patches in http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression. s/tidy/Tidy/ It's not obvious from your description (or from the code, for that matter) that for this to be valid nelt must be a power of 2. I suggest that, above the loop, you put gcc_assert (nelt == (nelt & -nelt)); OK with those changes. I think it's sort of obvious from context that we're working with a vector. And it also seems obvious that we won't have a vector without a power-of-two number of elements. r~
[wide-int] out-of-range set_bit in java
I locally tried adding an assertion to the wide-int version of set_bit to make sure that the bit number was in range. It triggers for this code in boehm.c:mark_reference_fields (quoting trunk version): /* First word in object corresponds to most significant byte of bitmap. In the case of a multiple-word record, we set pointer bits for all words in the record. This is conservative, but the size_words != 1 case is impossible in regular java code. */ for (i = 0; i < size_words; ++i) *mask = (*mask).set_bit (ubit - count - i - 1); if (count >= ubit - 2) *pointer_after_end = 1; if count + i + 1 > ubit. AIUI the lower 2 bits are used for something else: /* Bottom two bits for bitmap mark type are 01. */ mask = mask.set_bit (0); value = double_int_to_tree (value_type, mask); which is why the pointer_after_end condition checks for count >= ubit - 2. We never actually use the mask if pointer_after_end is true, so this patch puts the set_bit in an else branch. On face value it looks like the condition should be: count + size_words > ubit - 2 instead, but it'd go without saying that I don't really understand this code. Tested on x86_64-linux-gnu and powerpc64-linux-gnu for wide-int. OK to install? Thanks, Richard gcc/java/ * boehm.c (mark_reference_fields): Don't update the mask when setting pointer_after_end. Index: gcc/java/boehm.c === --- gcc/java/boehm.c2014-01-13 15:05:22.543887284 + +++ gcc/java/boehm.c2014-05-02 16:08:25.500760537 +0100 @@ -101,17 +101,17 @@ mark_reference_fields (tree field, *last_set_index = count; - /* First word in object corresponds to most significant byte of -bitmap. - -In the case of a multiple-word record, we set pointer -bits for all words in the record. This is conservative, but the -size_words != 1 case is impossible in regular java code. 
*/ - for (i = 0; i < size_words; ++i) - *mask = wi::set_bit (*mask, ubit - count - i - 1); - if (count >= ubit - 2) *pointer_after_end = 1; + else + /* First word in object corresponds to most significant byte of + bitmap. + + In the case of a multiple-word record, we set pointer + bits for all words in the record. This is conservative, but the + size_words != 1 case is impossible in regular java code. */ + for (i = 0; i < size_words; ++i) + *mask = wi::set_bit (*mask, ubit - count - i - 1); /* If we saw a non-reference field earlier, then we can't use the count representation. We keep track of that in
Re: [RS6000] PR60737, expand_block_clear uses word stores
On Fri, May 2, 2014 at 6:20 AM, Alan Modra amo...@gmail.com wrote: In cases where the compiler has no alignment info, powerpc64le-linux gcc generates byte-at-a-time copies for -mstrict-align (which is on for little-endian power7). That's awful code, a problem shared by other strict-align targets, see pr50417. However, we also have a case when -mno-strict-align generates less than ideal code, which I believe stems from using alignment as a proxy for testing an address offset. See http://gcc.gnu.org/ml/gcc-patches/1999-09n/msg01072.html. So my first attempt at fixing this problem looked at address offsets directly. That worked fine too, but on thinking some more, I believe we no longer have the movdi restriction. Nowadays we'll reload the address if we have an offset that doesn't satisfy the Y constraint (ie. a multiple of 4 offset). Which led to this simpler patch. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? Hi, Alan Thanks for finding and addressing this. As you mention, recent server-class processors, at least POWER8, do not have the performance degradation for common, mis-aligned loads and stores of wider modes. But the patch should not impose this default on the large, installed base of processors, where mis-aligned loads can be a severe performance penalty. This heuristic has become processor-dependent and should not be hard-coded in the block_move and block_clear algorithms. PROCESSOR_DEFAULT is POWER8 for ELFv2 (and should be updated as the default for PowerLinux in general). Please update the patch to test rs6000_cpu, probably another boolean flag set in rs6000_option_override_internal(). Because of the processor defaults, the preferred instruction sequence will be the default without encoding an assumption about the heuristics in the algorithm itself. Thanks, David
[committed] Fix signed overflow for zext_hwi (..., 63)
...well, still HOST_BITS_PER_WIDE_INT-1 officially, until Richard's patches. Caught by a bootstrap-ubsan on wide-int. Tested on x86_64-linux-gnu and committed as obvious. Richard gcc/ * hwint.h (zext_hwi): Fix signed overflow for prec == 63. Index: gcc/hwint.h === --- gcc/hwint.h 2014-01-03 15:06:18.841058787 + +++ gcc/hwint.h 2014-05-02 16:31:44.987263123 +0100 @@ -344,7 +344,7 @@ zext_hwi (unsigned HOST_WIDE_INT src, un else { gcc_checking_assert (prec < HOST_BITS_PER_WIDE_INT); - return src & (((HOST_WIDE_INT) 1 << prec) - 1); + return src & (((unsigned HOST_WIDE_INT) 1 << prec) - 1); } }
[PATCH] Change PPC64 Linux default to POWER8.
This patch updates the default processor tuning to POWER8 for any PPC64 Linux target, not only little endian (ELFv2). Any objection? Thanks, David * config/rs6000/linux64.h (PROCESSOR_DEFAULT): Change to PROCESSOR_POWER8. (PROCESSOR_DEFAULT64): Same. Index: linux64.h === --- linux64.h (revision 209981) +++ linux64.h (working copy) @@ -69,13 +69,9 @@ #endif #undef PROCESSOR_DEFAULT -#define PROCESSOR_DEFAULT PROCESSOR_POWER7 +#define PROCESSOR_DEFAULT PROCESSOR_POWER8 #undef PROCESSOR_DEFAULT64 -#ifdef LINUX64_DEFAULT_ABI_ELFv2 #define PROCESSOR_DEFAULT64 PROCESSOR_POWER8 -#else -#define PROCESSOR_DEFAULT64 PROCESSOR_POWER7 -#endif /* We don't need to generate entries in .fixup, except when -mrelocatable or -mrelocatable-lib is given. */
Re: [PATCH] Implement -fsanitize=float-divide-by-zero
On Tue, 29 Apr 2014, Jakub Jelinek wrote: To my surprise, the wording in C99 or C++11 makes even floating point division by zero undefined behavior, but I think generally at least for IEEE floating point semantics it is well defined, thus I think we shouldn't include it in -fsanitize=undefined. Making Annex F explicitly take precedence over the generic text is DR#442. (This precedence issue applies to out-of-range conversions from floating point to integer as well as such things as floating-point division by zero: the generic text makes such out-of-range conversions undefined, but Annex F says they raise invalid and return an unspecified value.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Change PPC64 Linux default to POWER8.
On Fri, 2014-05-02 at 11:50 -0400, David Edelsohn wrote: This patch updates the default processor tuning to POWER8 for any PPC64 Linux target, not only little endian (ELFv2). Any objection? Thanks, David * config/rs6000/linux64.h (PROCESSOR_DEFAULT): Change to PROCESSOR_POWER8. (PROCESSOR_DEFAULT64): Same. That looks good to me. Thanks. Peter
Re: [PING] Re: Add const char* constructors for exception classes in stdexcept
On 1 May 2014 17:18, Oleg Endo wrote: Jonathan, now that we're in stage 1 again, I'd like to revive the issue below. Do you have any particular plans? How should we proceed? Hi Oleg, sorry for letting the thread die in January. We will definitely want to deal with the missing constructors in stdexcept during stage 1, but I'm not sure exactly when :-) I'll come back to this next week - ping me if you don't hear from me again!
[patch] fix libstdc++/59476 - pretty printers
Update the python pretty printers to understand how _Rb_tree_node is defined in C++11 mode now. Tested x86_64-linux, committed to trunk and the 4.9 branch. commit a498be0a723e8c6b60185d5c15081be4650b2644 Author: Jonathan Wakely jwak...@redhat.com Date: Fri May 2 14:02:51 2014 +0100 PR libstdc++/59476 * python/libstdcxx/v6/printers.py (get_value_from_Rb_tree_node): New function to handle both C++03 and C++11 _Rb_tree_node implementations. (StdRbtreeIteratorPrinter, StdMapPrinter, StdSetPrinter): Use it. * testsuite/libstdc++-prettyprinters/simple.cc: Update comment to refer to... * testsuite/libstdc++-prettyprinters/simple11.cc: New. diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py index 05da17b..1f1f860 100644 --- a/libstdc++-v3/python/libstdcxx/v6/printers.py +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py @@ -375,6 +375,22 @@ class RbtreeIterator: self.node = node return result +def get_value_from_Rb_tree_node(node): +"""Returns the value held in an _Rb_tree_node<_Val>""" +try: +member = node.type.fields()[1].name +if member == '_M_value_field': +# C++03 implementation, node contains the value as a member +return node['_M_value_field'] +elif member == '_M_storage': +# C++11 implementation, node stores value in __aligned_buffer +p = node['_M_storage']['_M_storage'].address +p = p.cast(node.type.template_argument(0).pointer()) +return p.dereference() +except: +pass +raise ValueError, "Unsupported implementation for %s" % str(node.type) + # This is a pretty printer for std::_Rb_tree_iterator (which is # std::map::iterator), and has nothing to do with the RbtreeIterator # class above. 
@@ -387,7 +403,8 @@ class StdRbtreeIteratorPrinter: def to_string (self): typename = str(self.val.type.strip_typedefs()) + '::_Link_type' nodetype = gdb.lookup_type(typename).strip_typedefs() -return self.val.cast(nodetype).dereference()['_M_value_field'] +node = self.val.cast(nodetype).dereference() +return get_value_from_Rb_tree_node(node) class StdDebugIteratorPrinter: Print a debug enabled version of an iterator @@ -417,7 +434,8 @@ class StdMapPrinter: def next(self): if self.count % 2 == 0: n = self.rbiter.next() -n = n.cast(self.type).dereference()['_M_value_field'] +n = n.cast(self.type).dereference() +n = get_value_from_Rb_tree_node(n) self.pair = n item = n['first'] else: @@ -458,7 +476,8 @@ class StdSetPrinter: def next(self): item = self.rbiter.next() -item = item.cast(self.type).dereference()['_M_value_field'] +item = item.cast(self.type).dereference() +item = get_value_from_Rb_tree_node(item) # FIXME: this is weird ... what to do? # Maybe a 'set' display hint? result = ('[%d]' % self.count, item) diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple.cc index 66ae8f7..030207a 100644 --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple.cc +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple.cc @@ -1,4 +1,4 @@ -// If you modify this, please update debug.cc as well. +// If you modify this, please update simple11.cc and debug.cc as well. // { dg-do run } // { dg-options -g -O0 } diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple11.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple11.cc new file mode 100644 index 000..e94bea6 --- /dev/null +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/simple11.cc @@ -0,0 +1,92 @@ +// If you modify this, please update simple.cc and debug.cc as well. + +// { dg-do run } +// { dg-options -g -O0 -std=gnu++11 } + +// Copyright (C) 2011-2014 Free Software Foundation, Inc. 
+// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +#include <string> +#include <deque> +#include <bitset> +#include <iostream> +#include <list> +#include <map> +#include <set> +#include <ext/slist> + +int +main() +{ + std::string tem; +
Re: [PATCH 00/89] Compile-time gimple-checking
On 05/02/14 03:09, Richard Biener wrote: Well, I'd like to see both and one affects the other. Doing the const correctness thing first seems more natural to me. Of course both need to wait for 4.9.1. Well, it looks like David is already on that path to some extent with the proposed gengtype changes. I guess I'm just trying to figure out how to stage this stuff in. ie, is it easier to go with the #89 patchkit, then followup with fixing the const stuff, or is it easier to first fix the const stuff, then adjust the #89 kit. You're recommending the latter, which is fine with me, but I'd like David to chime in as well since he's doing the work :-) jeff
Re: [committed] Fix signed overflow for zext_hwi (..., 63)
On May 2, 2014, at 8:36 AM, Richard Sandiford rsand...@linux.vnet.ibm.com wrote: ...well, still HOST_BITS_PER_WIDE_INT-1 officially, until Richard's patches. Caught by a bootstrap-ubsan on wide-int. Tested on x86_64-linux-gnu and committed as obvious. Thanks.
Re: [PATCH 00/89] Compile-time gimple-checking
On Fri, 2014-05-02 at 10:02 -0600, Jeff Law wrote: On 05/02/14 03:09, Richard Biener wrote: Well, I'd like to see both and one affects the other. Doing the const correctness thing first seems more natural to me. Of course both need to wait for 4.9.1. Well, it looks like David is already on that path to some extent with the proposed gengtype changes. I guess I'm just trying to figure out how to stage this stuff in. ie, is it easier to go with the #89 patchkit, then followup with fixing the const stuff, or is it easier to first fix the const stuff, then adjust the #89 kit. You're recommending the latter, which is fine with me, but I'd like David to chime in as well since he's doing the work :-) I'm about 4 or 5 hours from having patches I can post :) [my automated typedef-removal script seems to work, but I need to write some ChangeLogs and port one of the patches from the #89 kit, then rebootstrap & regtest]
Re: [PATCH] gengtype: Support explicit pointers in template arguments
David Malcolm dmalc...@redhat.com writes: Currently, gengtype does not support template arguments that are explicitly pointers, such as: static GTY(()) vec<gimple_statement_base *> *test_gimple; giving this error: ../../src/gcc/gimple-expr.c:902: parse error: expected a string constant or ',', have '*' instead, requiring them to be typedefs. This patch removes this restriction, supporting up to a single pointer in each template argument. Sorry to ask, but would you mind holding on until after the wide-int merge (hopefully the next few days, this time for real)? It looks like it might clash with the support for nested templates. Thanks, Richard
Re: [C PATCH] Improve warn msg (PR c/43395)
On Tue, 29 Apr 2014, Marek Polacek wrote: It's correct to warn about returning an address of a local label, but it's clumsy to say it's local variable. We can easily distinguish between a label and a variable. Regtested/bootstrapped on x86_64-linux, ok for trunk? You always need to have complete sentences in diagnostics for the sake of translation. Thus, you need two separate warning_at calls, one with each version of the message. (Using ? : for the whole format string isn't sufficient; I think xgettext only extracts one of the two alternatives for translation if you do that.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] gengtype: Support explicit pointers in template arguments
On Fri, 2014-05-02 at 17:43 +0100, Richard Sandiford wrote: David Malcolm dmalc...@redhat.com writes: Currently, gengtype does not support template arguments that are explicitly pointers, such as: static GTY(()) vec<gimple_statement_base *> *test_gimple; giving this error: ../../src/gcc/gimple-expr.c:902: parse error: expected a string constant or ',', have '*' instead, requiring them to be typedefs. This patch removes this restriction, supporting up to a single pointer in each template argument. Sorry to ask, but would you mind holding on until after the wide-int merge (hopefully the next few days, this time for real)? It looks like it might clash with the support for nested templates. Ack. [This is enabling work for something I've already been asked to wait post-4.9.1 for, so sure. I hope it won't be too hard to redo on top of your changes.] Good luck with the wide-int merge
Re: [PATCH], RFC, add support for __float128/__ibm128 types on PowerPC
On Tue, 29 Apr 2014, Michael Meissner wrote: * soft-fp/quad.h (TFmode): If TFmode is defined, don't use the normal TF mode definition. That does of course need to go to glibc first (and I think it would be best to do something consistent for all the floating-point formats in soft-fp - allow overriding the definitions of all of them in terms of machine modes to keep things consistent, even without any use case for overriding definitions other than for TFmode). I don't see anything in this patch to give appropriate symbol versions to the new libgcc functions. libgcc/config/rs6000/sfp-machine.h does not currently implement integration with hardware exceptions and rounding modes. Such integration should be added as on other architectures such as x86 and AArch64 (conditional on FPRs being present, as opposed to soft-fp being used in libgcc for soft-float or e500). -- Joseph S. Myers jos...@codesourcery.com
Re: [C PATCH] Improve warn msg (PR c/43395)
On Fri, May 02, 2014 at 05:01:18PM +, Joseph S. Myers wrote: On Tue, 29 Apr 2014, Marek Polacek wrote: It's correct to warn about returning an address of a local label, but it's clumsy to say it's a local variable. We can easily distinguish between a label and a variable. Regtested/bootstrapped on x86_64-linux, ok for trunk? You always need to have complete sentences in diagnostics for the sake of translation. Thus, you need two separate warning_at calls, one with each version of the message. (Using ? : for the whole format string isn't sufficient; I think xgettext only extracts one of the two alternatives for translation if you do that.) Ooops. Is it ok to fix it up with this patch then? 2014-05-02 Marek Polacek pola...@redhat.com c/ * c-typeck.c (c_finish_return): Separate warning_at calls. cp/ * typeck.c (maybe_warn_about_returning_address_of_local): Separate warning_at calls. diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index b95fd89..f7ad91e 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -9273,10 +9273,14 @@ c_finish_return (location_t loc, tree retval, tree origtype) && !DECL_EXTERNAL (inner) && !TREE_STATIC (inner) && DECL_CONTEXT (inner) == current_function_decl) - warning_at (loc, - OPT_Wreturn_local_addr, "function returns address - of %s", TREE_CODE (inner) == LABEL_DECL -? "label" : "local variable"); + { + if (TREE_CODE (inner) == LABEL_DECL) + warning_at (loc, OPT_Wreturn_local_addr, + "function returns address of label"); + else + warning_at (loc, OPT_Wreturn_local_addr, + "function returns address of local variable"); + } break; default: diff --git gcc/cp/typeck.c gcc/cp/typeck.c index 8b7cb8d..7b28a9a 100644 --- gcc/cp/typeck.c +++ gcc/cp/typeck.c @@ -8309,10 +8309,12 @@ maybe_warn_about_returning_address_of_local (tree retval) if (TREE_CODE (valtype) == REFERENCE_TYPE) warning (OPT_Wreturn_local_addr, "reference to local variable %q+D returned", whats_returned); + else if (TREE_CODE (whats_returned) == LABEL_DECL) + warning (OPT_Wreturn_local_addr, "address of label %q+D returned", +whats_returned); else - warning (OPT_Wreturn_local_addr, "address of %s %q+D returned", -TREE_CODE (whats_returned) == LABEL_DECL -? "label" : "local variable", whats_returned); + warning (OPT_Wreturn_local_addr, "address of local variable %q+D +returned", whats_returned); return; } } Marek
[Patch, testsuite, mips] Fix two failing MIPS tests.
Two MIPS-specific tests started failing a few days ago; this patch fixes them. The tests are trying to check that a performance optimization is done and that one constant is derived from another and not simply loaded as a completely new constant. Post-4.9 changes have affected which constant gets loaded and which one gets derived (and how it is derived) from it. The new code looks as good as the old code: Old: li $5,305397760 # 0x12340000 addiu $4,$5,1 addiu $5,$5,-1 New: li $5,305332224 # 0x12330000 ori $5,$5,0xffff addiu $4,$5,2 My proposed fix is to change the asm scan to verify that there is only 1 'li' instruction and to not worry about what value is being loaded or what instructions are used to generate the second constant. Tested on MIPS32 and MIPS64 (n32 and 64 ABIs). OK to checkin? Steve Ellcey sell...@mips.com 2014-05-02 Steve Ellcey sell...@mips.com * gcc.target/mips/const-anchor-1.c: Modify asm scan. * gcc.target/mips/const-anchor-2.c: Ditto. diff --git a/gcc/testsuite/gcc.target/mips/const-anchor-1.c b/gcc/testsuite/gcc.target/mips/const-anchor-1.c index a5f01e4..debab12 100644 --- a/gcc/testsuite/gcc.target/mips/const-anchor-1.c +++ b/gcc/testsuite/gcc.target/mips/const-anchor-1.c @@ -1,8 +1,8 @@ -/* Derive a constant (0x1233) from an intermediate value - (0x12340000) used to build another constant. */ +/* Derive one constant from an intermediate value used to build another + constant. Verify that there is only one load immediate instruction + generated. 
*/ /* { dg-skip-if code quality test { *-*-* } { -O0 } { } } */ -/* { dg-final { scan-assembler-not 0x1233|305332224 } } */ -/* { dg-final { scan-assembler \td?addiu\t\\\$5,\\\$\[0-9\]*,-1 } } */ +/* { dg-final { scan-assembler-times \tli\t 1 } } */ NOMIPS16 void f () { diff --git a/gcc/testsuite/gcc.target/mips/const-anchor-2.c b/gcc/testsuite/gcc.target/mips/const-anchor-2.c index 8dad5a7..2edd1ef 100644 --- a/gcc/testsuite/gcc.target/mips/const-anchor-2.c +++ b/gcc/testsuite/gcc.target/mips/const-anchor-2.c @@ -1,7 +1,8 @@ -/* Derive a constant (0x30001) from another constant. */ +/* Derive one constant from an intermediate value used to build another + constant. Verify that there is only one load immediate instruction + generated. */ /* { dg-skip-if code quality test { *-*-* } { -O0 } { } } */ -/* { dg-final { scan-assembler-not 0x30|196608 } } */ -/* { dg-final { scan-assembler \td?addiu\t\\\$5,\\\$\[0-9\]*,32763 } } */ +/* { dg-final { scan-assembler-times \tli\t 1 } } */ NOMIPS16 void f () {
Re: [patch] change specific int128 - generic intN
On Thu, 1 May 2014, DJ Delorie wrote: Do we have a routine to do that already? I don't know. -- Joseph S. Myers jos...@codesourcery.com
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On Fri, 2 May 2014, John Marino wrote: So given the track record (building itself, building base, building 21,000 software ports) over a couple of years I think any issues this could have caused would have been seen and identified by now. These issues aren't generally obvious (given that the ISO C conformance modes aren't used that often, and when they are that doesn't mean the application is using POSIX types for something else). (I don't know what the FreeBSD sys/_types.h defines, but it at least seems possible from the name that it is only defining things in the implementation namespace, with the public sys/types.h being what then includes sys/_types.h and does typedef __foo_t foo_t; or similar to provide the public POSIX types that aren't in ISO C.) Here are the headers in question: http://grok.dragonflybsd.org/xref/freebsd/sys/sys/_types.h That seems OK for stddef.h inclusion, as long as the machine/_types.h is OK. http://grok.dragonflybsd.org/xref/dragonfly/sys/sys/types.h That's definitely not correct to include in stddef.h; it defines lots of types outside the ISO C namespace. -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch] Fix a typo in Chinese msg
On Thu, 1 May 2014, Jeff Law wrote: On 05/01/14 17:59, Joseph S. Myers wrote: On Thu, 24 Apr 2014, Tobias Burnus wrote: Jeff, are you sure that this will not be overridden? I think the translations are always synced from http://translationproject.org/team/zh_CN.html ? Yes. This should be reverted; we shouldn't make any local changes to these files. When there's a new zh_CN.po from the TP (whether resulting from changes by translators, or an automatic merge with a new .pot file), that will be imported verbatim (to both trunk and 4.9 branch), whether or not it includes this change. OK. Sorry about that. Reverted. What's the procedure to contact the translators so they can fix this? Contact the address for the translation team given in the .po file. Unfortunately I got a bounce message in Chinese; maybe it indicates it's a subscriber-posting-only mailing list, which seems unfortunate. -- Joseph S. Myers jos...@codesourcery.com
Re: [C PATCH] Fix up diagnostics (PR c/25801)
On Fri, 2 May 2014, Marek Polacek wrote: On Thu, May 01, 2014 at 10:54:18PM +, Joseph S. Myers wrote: I think the comment on this function should be updated to explain the interface contract when an incomplete (or function) type is passed (i.e. return size_one_node, caller is responsible for any diagnostics). Done. The test should I think also cover incomplete union types, and pointer subtraction for all the different cases of incomplete types, and the += and -= operators (i.e. cover invalid arithmetic on pointers to incomplete types more thoroughly, even if not actually affected by this patch). Done. Another change is that I used COMPLETE_TYPE_P instead of COMPLETE_OR_VOID_TYPE_P, that would be a bug and before it didn't make much sense: we checked VOID_TYPE before. Tested x86_64-linux, ok now? OK. -- Joseph S. Myers jos...@codesourcery.com
[committed] Fix two omp loop issues
Hi! While starting to work on OpenMP 4.0 Fortran stuff, I've noticed some issues in C/C++ too. One problem is that when some variable other than the iteration var is linear on a combined for simd or parallel for simd loop, we weren't adding a barrier between the read of the initial value and the store of the final value. Fixed by making such vars firstprivate+lastprivate on the omp for loop. The second issue is that if there are lastprivate vars on a combined for simd loop, the outer assignment on the last iteration was keyed on some fd->loop.v iterator value which hadn't ever been assigned to. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk/4.9.
2014-05-02  Jakub Jelinek  ja...@redhat.com
	* gimplify.c (gimplify_adjust_omp_clauses_1): Handle
	GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE.
	(gimplify_adjust_omp_clauses): Simd region is never directly nested
	in combined parallel.  Instead, for linear with copyin/copyout, if
	in combined for simd loop, make decl firstprivate/lastprivate on
	OMP_FOR.
	* omp-low.c (expand_omp_for_generic, expand_omp_for_static_nochunk,
	expand_omp_for_static_chunk): When setting endvar, also set
	fd->loop.v to the same value.
libgomp/
	* testsuite/libgomp.c/simd-10.c: New test.
	* testsuite/libgomp.c/simd-11.c: New test.
	* testsuite/libgomp.c/simd-12.c: New test.
	* testsuite/libgomp.c/simd-13.c: New test.
--- gcc/gimplify.c.jj	2014-04-30 19:06:11.000000000 +0200
+++ gcc/gimplify.c	2014-05-02 11:38:07.908463543 +0200
@@ -6294,9 +6294,17 @@ gimplify_adjust_omp_clauses_1 (splay_tre
 	  OMP_CLAUSE_CHAIN (clause) = nc;
 	}
     }
+  if (code == OMP_CLAUSE_FIRSTPRIVATE && (flags & GOVD_LASTPRIVATE) != 0)
+    {
+      tree nc = build_omp_clause (input_location, OMP_CLAUSE_LASTPRIVATE);
+      OMP_CLAUSE_DECL (nc) = decl;
+      OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE (nc) = 1;
+      OMP_CLAUSE_CHAIN (nc) = *list_p;
+      OMP_CLAUSE_CHAIN (clause) = nc;
+      lang_hooks.decls.omp_finish_clause (nc);
+    }
   *list_p = clause;
   lang_hooks.decls.omp_finish_clause (clause);
-
   return 0;
 }
@@ -6335,18 +6343,17 @@ gimplify_adjust_omp_clauses (tree *list_
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LINEAR
 	      && ctx->outer_context
 	      && !(OMP_CLAUSE_LINEAR_NO_COPYIN (c)
-		   && OMP_CLAUSE_LINEAR_NO_COPYOUT (c))
-	      && !is_global_var (decl))
+		   && OMP_CLAUSE_LINEAR_NO_COPYOUT (c)))
 	    {
-	      if (ctx->outer_context->region_type == ORT_COMBINED_PARALLEL)
+	      if (ctx->outer_context->combined_loop
+		  && !OMP_CLAUSE_LINEAR_NO_COPYIN (c))
 		{
 		  n = splay_tree_lookup (ctx->outer_context->variables,
 					 (splay_tree_key) decl);
 		  if (n == NULL
 		      || (n->value & GOVD_DATA_SHARE_CLASS) == 0)
 		    {
-		      int flags = OMP_CLAUSE_LINEAR_NO_COPYIN (c)
-				  ? GOVD_LASTPRIVATE : GOVD_SHARED;
+		      int flags = GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE;
 		      if (n == NULL)
 			omp_add_variable (ctx->outer_context, decl,
 					  flags | GOVD_SEEN);
@@ -6354,7 +6361,7 @@ gimplify_adjust_omp_clauses (tree *list_
 			n->value |= flags | GOVD_SEEN;
 		    }
 		}
-	      else
+	      else if (!is_global_var (decl))
 		omp_notice_variable (ctx->outer_context, decl, true);
 	    }
 	}
--- gcc/omp-low.c.jj	2014-04-30 09:13:22.000000000 +0200
+++ gcc/omp-low.c	2014-05-01 15:18:05.368771615 +0200
@@ -5584,6 +5584,12 @@ expand_omp_for_generic (struct omp_regio
     {
       stmt = gimple_build_assign (endvar, iend);
       gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      if (useless_type_conversion_p (TREE_TYPE (fd->loop.v), TREE_TYPE (iend)))
+	stmt = gimple_build_assign (fd->loop.v, iend);
+      else
+	stmt = gimple_build_assign_with_ops (NOP_EXPR, fd->loop.v, iend,
+					     NULL_TREE);
+      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
     }
   if (fd->collapse > 1)
     expand_omp_for_init_vars (fd, &gsi, counts, inner_stmt, startvar);
@@ -6000,6 +6006,12 @@ expand_omp_for_static_nochunk (struct om
     {
       stmt = gimple_build_assign (endvar, e);
       gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      if (useless_type_conversion_p (TREE_TYPE (fd->loop.v), TREE_TYPE (e)))
+	stmt = gimple_build_assign (fd->loop.v, e);
+      else
+	stmt = gimple_build_assign_with_ops (NOP_EXPR, fd->loop.v, e,
+					     NULL_TREE);
+
Re: [C PATCH] Improve warn msg (PR c/43395)
On Fri, 2 May 2014, Marek Polacek wrote: Ooops. Is it ok to fix it up with this patch then? 2014-05-02 Marek Polacek pola...@redhat.com c/ * c-typeck.c (c_finish_return): Separate warning_at calls. cp/ * typeck.c (maybe_warn_about_returning_address_of_local): Separate warning_at calls. OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [C++ Patch/RFC] PR 58582
I'd tweak instantiate_decl differently; a deleted function is defined, so pattern_defined should be true. Jason
Re: [PATCH] Describe Wsizeof-pointer-memaccess in c.opt
On Fri, 2 May 2014, Marek Polacek wrote: Wsizeof-pointer-memaccess option is missing a description in the c.opt file. This patch adds something in there. Ok? OK. -- Joseph S. Myers jos...@codesourcery.com
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 5/2/2014 19:49, Joseph S. Myers wrote: On Fri, 2 May 2014, John Marino wrote: http://grok.dragonflybsd.org/xref/dragonfly/sys/sys/types.h That's definitely not correct to include in stddef.h; it defines lots of types outside the ISO C namespace. Ok. So I guess there are two problems. 1) I don't know which type definitions are missing (in other words, the important ones from sys/types.h that are required to build gcc) 2) Mainly because of 1) I don't know what to pull in instead. DragonFly doesn't have this _types.h split like FreeBSD. Do you have a suggestion for me on how to proceed? Thanks, John
[patch] libstdc++/61036 - fix dumb typo in shared_ptr
This was a really dumb regression from me :-( Tested x86_64-linux, committed to trunk and 4.9 branch.

commit 892be9bd861871e39049c78c75807f5680fffb94
Author: Jonathan Wakely jwak...@redhat.com
Date:   Fri May 2 18:35:41 2014 +0100

    PR libstdc++/61036
    * include/bits/shared_ptr_base.h (__shared_ptr::__shared_ptr(_Tp1*)):
    Check the correct type in the static assertion.
    * testsuite/20_util/shared_ptr/cons/61036.cc: New.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 57398af..c25157f 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -871,7 +871,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	: _M_ptr(__p), _M_refcount(__p)
 	{
 	  __glibcxx_function_requires(_ConvertibleConcept<_Tp1*, _Tp*>)
-	  static_assert( !is_void<_Tp>::value, "incomplete type" );
+	  static_assert( !is_void<_Tp1>::value, "incomplete type" );
 	  static_assert( sizeof(_Tp1) > 0, "incomplete type" );
 	  __enable_shared_from_this_helper(_M_refcount, __p, __p);
 	}
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/61036.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/61036.cc
new file mode 100644
index 000..9cade66
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/61036.cc
@@ -0,0 +1,28 @@
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 20.8.2.2 Template class shared_ptr [util.smartptr.shared]
+
+#include <memory>
+
+void test01()
+{
+  std::shared_ptr<void> p(new int);
+}
[wide-int] Handle zero-precision INTEGER_CSTs again
I'd hoped the days of zero-precision INTEGER_CSTs were behind us after Richard's patch to remove min and max values from zero-width bitfields, but a bootstrap-ubsan showed otherwise. One source is in: null_pointer_node = build_int_cst (build_pointer_type (void_type_node), 0); if no_target, since the precision of the type comes from ptr_mode, which remains the default VOIDmode. Maybe that's a bug, but setting it to an arbitrary nonzero value would also be dangerous. The other use is in the C/C++ void_zero_node, which is specifically defined as a VOID_TYPE, zero-precision INTEGER_CST. This is used by the ubsan code in some ?: tests, as well as by other places in the frontend proper. At least the ubsan use persists until gimple. This patch therefore restores the wide-int handling for zero precision, for which the length must be 1 and the single HWI must be zero. I've tried to wrap up most of the dependencies in two new functions, wi::blocks_needed (a function version of the .cc BLOCKS_NEEDED, now also used in the header) and wi::excess_bits, so that it'll be easier to remove the handling again if zero precision ever goes away. There are some remaining, easily-greppable cases that check directly for a precision of 0 though. The combination of this and the other patches allows bootstrap-ubsan to complete. There are a lot of extra testsuite failures compared to a normal bootstrap, but they don't look related to wide-int. Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? Thanks, Richard
Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc	2014-05-02 16:28:07.657847935 +0100
+++ gcc/wide-int.cc	2014-05-02 16:28:09.560842845 +0100
@@ -48,8 +48,6 @@ static const HOST_WIDE_INT zeros[WIDE_IN
 #define HALF_INT_MASK (((HOST_WIDE_INT) 1 << HOST_BITS_PER_HALF_WIDE_INT) - 1)
 
 #define BLOCK_OF(TARGET) ((TARGET) / HOST_BITS_PER_WIDE_INT)
-#define BLOCKS_NEEDED(PREC) \
-  (PREC ? (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) : 1)
 #define SIGN_MASK(X) ((HOST_WIDE_INT) (X) < 0 ? -1 : 0)
 
 /* Return the value a VAL[I] if I < LEN, otherwise, return 0 or -1
@@ -69,20 +67,20 @@ safe_uhwi (const HOST_WIDE_INT *val, uns
 static unsigned int
 canonize (HOST_WIDE_INT *val, unsigned int len, unsigned int precision)
 {
-  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
+  unsigned int blocks_needed = wi::blocks_needed (precision);
   HOST_WIDE_INT top;
   int i;
 
   if (len > blocks_needed)
     len = blocks_needed;
 
+  if (wi::excess_bits (len, precision) > 0)
+    val[len - 1] = sext_hwi (val[len - 1], precision % HOST_BITS_PER_WIDE_INT);
   if (len == 1)
     return len;
 
   top = val[len - 1];
-  if (len * HOST_BITS_PER_WIDE_INT > precision)
-    val[len - 1] = top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT);
-  if (top != 0 && top != (HOST_WIDE_INT)-1)
+  if (top != 0 && top != (HOST_WIDE_INT) -1)
     return len;
 
   /* At this point we know that the top is either 0 or -1.  Find the
@@ -134,7 +132,7 @@ wi::from_buffer (const unsigned char *bu
 
   /* We have to clear all the bits ourself, as we merely or in values
      below.  */
-  unsigned int len = BLOCKS_NEEDED (precision);
+  unsigned int len = wi::blocks_needed (precision);
   HOST_WIDE_INT *val = result.write_val ();
   for (unsigned int i = 0; i < len; ++i)
     val[i] = 0;
@@ -180,7 +178,7 @@ wi::to_mpz (const wide_int_ref &x, mpz_t
 {
   int len = x.get_len ();
   const HOST_WIDE_INT *v = x.get_val ();
-  int excess = len * HOST_BITS_PER_WIDE_INT - x.get_precision ();
+  unsigned int excess = wi::excess_bits (len, x.get_precision ());
 
   if (wi::neg_p (x, sgn))
     {
@@ -306,7 +304,8 @@ wi::force_to_size (HOST_WIDE_INT *val, c
 		   unsigned int xlen, unsigned int xprecision,
 		   unsigned int precision, signop sgn)
 {
-  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
+  unsigned int blocks_needed = wi::blocks_needed (precision);
+  unsigned int xblocks_needed = wi::blocks_needed (xprecision);
   unsigned int len = blocks_needed < xlen ? blocks_needed : xlen;
   for (unsigned i = 0; i < len; i++)
     val[i] = xval[i];
@@ -318,11 +317,11 @@ wi::force_to_size (HOST_WIDE_INT *val, c
       /* Expanding.  */
       if (sgn == UNSIGNED)
 	{
-	  if (small_xprecision && len == BLOCKS_NEEDED (xprecision))
+	  if (small_xprecision && len == xblocks_needed)
 	    val[len - 1] = zext_hwi (val[len - 1], small_xprecision);
 	  else if (val[len - 1] < 0)
 	    {
-	      while (len < BLOCKS_NEEDED (xprecision))
+	      while (len < xblocks_needed)
 		val[len++] = -1;
 	      if (small_xprecision)
 		val[len - 1] = zext_hwi (val[len - 1], small_xprecision);
@@ -332,7 +331,7 @@ wi::force_to_size (HOST_WIDE_INT *val, c
 	}
       else
 	{
-	  if (small_xprecision && len == BLOCKS_NEEDED (xprecision))
+	  if (small_xprecision && len
[wide-int] Fix some division cases
divmod_internal didn't handle unsigned division in which the inputs have implicit all-one upper bits. There were two problems:
- wi_unpack should extend implicit 1s to index blocks_needed - 1 (possibly with a zext_hwi on the last block for small_prec) regardless of signedness
- when calculating m and n, divmod_internal only used the *_blocks_needed value if the division was signed. Neither value is really signed in this context, since we've already calculated absolute values if sgn == SIGNED, so the neg_p (and its pre-wi:: incarnation) would only have held for the minimum integer. Using a simple while loop to find the top half-HWI seems more direct.
Also, divmod_internal_2 took n and m as unsigned values but used HOST_WIDE_INT iterators on m - n. m can be less than n for things like 0 / big, and the negative m - n was being treated as an unsigned int and zero-extended to HOST_WIDE_INT. The patch fixes both the types of n and m and the types of the iterators. Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? Thanks, Richard
Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc	2014-05-02 16:28:09.560842845 +0100
+++ gcc/wide-int.cc	2014-05-02 16:40:40.002916374 +0100
@@ -1126,8 +1126,7 @@ wi::add_large (HOST_WIDE_INT *val, const
    HOST_HALF_WIDE_INTs of RESULT.  The rest of RESULT is filled by
    uncompressing the top bit of INPUT[IN_LEN - 1].
*/ static void -wi_unpack (unsigned HOST_HALF_WIDE_INT *result, - const unsigned HOST_WIDE_INT *input, +wi_unpack (unsigned HOST_HALF_WIDE_INT *result, const HOST_WIDE_INT *input, unsigned int in_len, unsigned int out_len, unsigned int prec, signop sgn) { @@ -1145,20 +1144,24 @@ wi_unpack (unsigned HOST_HALF_WIDE_INT * else mask = 0; - for (i = 0; i in_len; i++) + for (i = 0; i blocks_needed - 1; i++) { - HOST_WIDE_INT x = input[i]; - if (i == blocks_needed - 1 small_prec) - { - if (sgn == SIGNED) - x = sext_hwi (x, small_prec); - else - x = zext_hwi (x, small_prec); - } + HOST_WIDE_INT x = safe_uhwi (input, in_len, i); result[j++] = x; result[j++] = x HOST_BITS_PER_HALF_WIDE_INT; } + HOST_WIDE_INT x = safe_uhwi (input, in_len, i); + if (small_prec) +{ + if (sgn == SIGNED) + x = sext_hwi (x, small_prec); + else + x = zext_hwi (x, small_prec); +} + result[j++] = x; + result[j++] = x HOST_BITS_PER_HALF_WIDE_INT; + /* Smear the sign bit. */ while (j out_len) result[j++] = mask; @@ -1317,10 +1320,8 @@ wi::mul_internal (HOST_WIDE_INT *val, co } /* We do unsigned mul and then correct it. */ - wi_unpack (u, (const unsigned HOST_WIDE_INT *) op1val, op1len, -half_blocks_needed, prec, SIGNED); - wi_unpack (v, (const unsigned HOST_WIDE_INT *) op2val, op2len, -half_blocks_needed, prec, SIGNED); + wi_unpack (u, op1val, op1len, half_blocks_needed, prec, SIGNED); + wi_unpack (v, op2val, op2len, half_blocks_needed, prec, SIGNED); /* The 2 is for a full mult. */ memset (r, 0, half_blocks_needed * 2 @@ -1519,7 +1520,7 @@ divmod_internal_2 (unsigned HOST_HALF_WI unsigned HOST_HALF_WIDE_INT *b_remainder, unsigned HOST_HALF_WIDE_INT *b_dividend, unsigned HOST_HALF_WIDE_INT *b_divisor, - unsigned int m, unsigned int n) + int m, int n) { /* The digits are a HOST_HALF_WIDE_INT which the size of half of a HOST_WIDE_INT and stored in the lower bits of each word. This @@ -1530,7 +1531,8 @@ divmod_internal_2 (unsigned HOST_HALF_WI unsigned HOST_WIDE_INT qhat; /* Estimate of quotient digit. 
*/ unsigned HOST_WIDE_INT rhat; /* A remainder. */ unsigned HOST_WIDE_INT p; /* Product of two digits. */ - HOST_WIDE_INT s, i, j, t, k; + HOST_WIDE_INT t, k; + int i, j, s; /* Single digit divisor. */ if (n == 1) @@ -1757,26 +1759,17 @@ wi::divmod_internal (HOST_WIDE_INT *quot } } - wi_unpack (b_dividend, (const unsigned HOST_WIDE_INT *) dividend.get_val (), -dividend.get_len (), dividend_blocks_needed, dividend_prec, sgn); - wi_unpack (b_divisor, (const unsigned HOST_WIDE_INT *) divisor.get_val (), -divisor.get_len (), divisor_blocks_needed, divisor_prec, sgn); - - if (wi::neg_p (dividend, sgn)) -m = dividend_blocks_needed; - else -m = 2 * dividend.get_len (); + wi_unpack (b_dividend, dividend.get_val (), dividend.get_len (), +dividend_blocks_needed, dividend_prec, sgn); + wi_unpack (b_divisor, divisor.get_val (), divisor.get_len (), +divisor_blocks_needed, divisor_prec, sgn); + + m = dividend_blocks_needed; + while (m 1 b_dividend[m - 1] == 0) +m--; - if (wi::neg_p (divisor, sgn)) -n = divisor_blocks_needed; - else -n = 2 *
Re: [wide-int 3/8] Add and use udiv_ceil
Richard Sandiford rdsandif...@googlemail.com writes: Just a minor tweak to avoid several calculations when one would do. Since we have a function for rounded-up division, we might as well use it instead of the (X + Y - 1) / Y idiom. Tested on x86_64-linux-gnu. OK to install? I ended up reverting this. I'd assumed that T would be nonnegative, but it turns out that there are cases where T itself is negative but T + ALIGN - 1 is nonnegative. In those cases the effect of this patch was to do an unsigned division on the negative (i.e. very large positive) number, add 1 to the large result, and then multiply the large result by ALIGN to get zero. Using a ubsan bootstrap after this change showed up some bugs in the divmod code; the symptoms were non-fatal overflow warnings during the bootstrap itself. I sent a separate patch for that and made sure that the failures were gone. With that fixed, the udiv_ceil version gave the same result as the original, but it's a non-optimisation after all. Thanks, Richard
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	2014-04-22 20:31:25.187000808 +0100
+++ gcc/dwarf2out.c	2014-04-22 20:31:26.374009366 +0100
@@ -14824,7 +14824,7 @@ simple_decl_align_in_bits (const_tree de
 static inline offset_int
 round_up_to_align (const offset_int &t, unsigned int align)
 {
-  return wi::udiv_trunc (t + align - 1, align) * align;
+  return wi::udiv_ceil (t, align) * align;
 }
 
 /* Given a pointer to a FIELD_DECL, compute and return the byte offset of the
Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	2014-04-22 20:31:25.842005530 +0100
+++ gcc/wide-int.h	2014-04-22 20:31:26.375009373 +0100
@@ -521,6 +521,7 @@ #define SHIFT_FUNCTION \
   BINARY_FUNCTION udiv_floor (const T1 &, const T2 &);
   BINARY_FUNCTION sdiv_floor (const T1 &, const T2 &);
   BINARY_FUNCTION div_ceil (const T1 &, const T2 &, signop, bool * = 0);
+  BINARY_FUNCTION udiv_ceil (const T1 &, const T2 &);
   BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0);
   BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop,
				 WI_BINARY_RESULT (T1, T2) *);
@@ -2566,6 +2567,13 @@ wi::div_ceil (const T1 &x, const T2 &y,
   return quotient;
 }
 
+template <typename T1, typename T2>
+inline WI_BINARY_RESULT (T1, T2)
+wi::udiv_ceil (const T1 &x, const T2 &y)
+{
+  return div_ceil (x, y, UNSIGNED);
+}
+
 /* Return X / Y, rouding towards nearest with ties away from zero. Treat X and Y as having the signedness given by SGN. Indicate in *OVERFLOW if the result overflows. */
C++ PATCH for c++/60992 (lambda and constant variable)
My change to avoid building a dummy variable in a nested function turns out to have been too aggressive; in this case, we have a reference to a local const variable in the type of an array that we're capturing. It seems safe enough to make dummy copies of const and static variables when instantiating a lambda operator, so let's do that. Tested x86_64-pc-linux-gnu, applying to trunk and 4.9. commit 7638a1fc9626c71c8665263dba6703b007293f92 Author: Jason Merrill ja...@redhat.com Date: Wed Apr 30 13:58:09 2014 -0400 PR c++/60992 * lambda.c (lambda_capture_field_type): Wrap anything dependent other than 'this'. (add_capture): Check for VLA before calling it. * semantics.c (is_this_parameter): Accept any 'this' parameter, not just the current one. Make non-static. * cp-tree.h: Declare it. * pt.c (tsubst_copy) [VAR_DECL]: Also build a new VAR_DECL if the operand was static or constant. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 55ecc4e..34d3d20 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -5773,6 +5773,7 @@ extern bool is_sub_constant_expr (tree); extern bool reduced_constant_expression_p (tree); extern void explain_invalid_constexpr_fn (tree); extern vectree cx_error_context (void); +extern bool is_this_parameter (tree); enum { BCS_NO_SCOPE = 1, diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c index 0b8b46a..5ba6f14 100644 --- a/gcc/cp/lambda.c +++ b/gcc/cp/lambda.c @@ -216,8 +216,8 @@ lambda_capture_field_type (tree expr, bool explicit_init_p) } else type = non_reference (unlowered_expr_type (expr)); - if (!type || WILDCARD_TYPE_P (type) || type_uses_auto (type) - || DECL_PACK_P (expr)) + if (type_dependent_expression_p (expr) + !is_this_parameter (tree_strip_nop_conversions (expr))) { type = cxx_make_type (DECLTYPE_TYPE); DECLTYPE_TYPE_EXPR (type) = expr; @@ -455,7 +455,7 @@ add_capture (tree lambda, tree id, tree orig_init, bool by_reference_p, if (TREE_CODE (initializer) == TREE_LIST) initializer = build_x_compound_expr_from_list (initializer, 
ELK_INIT, tf_warning_or_error); - type = lambda_capture_field_type (initializer, explicit_init_p); + type = TREE_TYPE (initializer); if (array_of_runtime_bound_p (type)) { vla = true; @@ -482,15 +482,19 @@ add_capture (tree lambda, tree id, tree orig_init, bool by_reference_p, variable size, TREE_TYPE (type)); type = error_mark_node; } - else if (by_reference_p) + else { - type = build_reference_type (type); - if (!real_lvalue_p (initializer)) - error (cannot capture %qE by reference, initializer); + type = lambda_capture_field_type (initializer, explicit_init_p); + if (by_reference_p) + { + type = build_reference_type (type); + if (!real_lvalue_p (initializer)) + error (cannot capture %qE by reference, initializer); + } + else + /* Capture by copy requires a complete type. */ + type = complete_type (type); } - else -/* Capture by copy requires a complete type. */ -type = complete_type (type); /* Add __ to the beginning of the field name so that user code won't find the field with name lookup. We can't just leave the name diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 48cc2a9..1584eb9 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -12629,13 +12629,17 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl) } else { - /* This can happen for a variable used in a late-specified - return type of a local lambda. Just make a dummy decl - since it's only used for its type. */ - if (cp_unevaluated_operand) - return tsubst_decl (t, args, complain); - gcc_assert (errorcount || sorrycount); - return error_mark_node; + /* This can happen for a variable used in a + late-specified return type of a local lambda, or for a + local static or constant. Building a new VAR_DECL + should be OK in all those cases. */ + r = tsubst_decl (t, args, complain); + if (decl_constant_var_p (r)) + /* A use of a local constant must decay to its value. 
*/ + return integral_constant_value (r); + gcc_assert (cp_unevaluated_operand || TREE_STATIC (r) + || errorcount || sorrycount); + return r; } } } diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 3f8ca44..4afb821 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -8155,10 +8155,11 @@ maybe_initialize_constexpr_call_table (void) /* Return true if T designates the implied `this' parameter. */ -static inline bool +bool is_this_parameter (tree t) { - return t == current_class_ptr; + return (TREE_CODE (t) == PARM_DECL + DECL_NAME (t) == this_identifier); } /* We have an expression tree T that represents a call, either CALL_EXPR diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-const3.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-const3.C new file mode
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On Fri, 2 May 2014, John Marino wrote: 1) I don't know which type definitions are missing (iow, the important ones from sys/type.h that are required to build gcc) The default presumption should be: * stddef.h from GCC provides what it needs to provide; nothing extra is needed and such a #include should not be needed at all. * Special measures to avoid duplicate typedefs (where some other header also defines one of the typedefs defined in stddef.h) aren't in fact needed, because GCC allows duplicate typedefs in system headers (even outside C11 mode - in C11 mode it's a standard feature). So try removing that #include. If that causes problems, investigate based on the actual problems seen. -- Joseph S. Myers jos...@codesourcery.com
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 5/2/2014 22:15, Joseph S. Myers wrote: On Fri, 2 May 2014, John Marino wrote: 1) I don't know which type definitions are missing (iow, the important ones from sys/type.h that are required to build gcc) The default presumption should be: * stddef.h from GCC provides what it needs to provide; nothing extra is needed and such a #include should not be needed at all. * Special measures to avoid duplicate typedefs (where some other header also defines one of the typedefs defined in stddef.h) aren't in fact needed, because GCC allows duplicate typedefs in system headers (even outside C11 mode - in C11 mode it's a standard feature). So try removing that #include. If that causes problems, investigate based on the actual problems seen. Okay, will do. Thanks!
Re: debug container patch
Hi Jonathan, I just wanted to make sure that you are aware that I preferred to wait for another validation of the small modification I have done. François On 28/04/2014 23:07, François Dumont wrote: On 27/04/2014 15:39, Jonathan Wakely wrote: On 17/04/14 22:43 +0200, François Dumont wrote: Hi Here is a patch to globally enhance the debug containers implementation. François, sorry for the delay; this is a large patch and I wanted to give it the time it deserves to review properly. No problem, I see that there are a lot of proposals lately. I understand why this is needed, but it changes the layout of the classes in very significant ways, meaning the debug containers will not be compatible across GCC releases. I'm OK with that now, but from the next major GCC release I'd like to avoid that in future. I remember Paolo saying that there is no ABI guarantee for debug mode; this is why I didn't hesitate to make this proposal. Will there be one in the future? I also plan breaking changes for profile mode to fix its very bad performance. I also noticed that in std/c++11/debug.cc we have some methods qualified with noexcept while in C++03 user code those methods will have a throw() qualification. Is that fine? As I said in my last mail, yes, those specifications are compatible. But I don't think your changes are doing what you think they are doing in all cases. Using _GLIBCXX_NOEXCEPT does not expand to throw() in C++03 mode; it expands to nothing. Yes, I discovered this difference in one of your recent mails. * include/debug/safe_unordered_container.tcc N.B. This file has no changes listed in the changelog entry. I reviewed the ChangeLog and limited modifications like in this file. Note however that the patch has been generated with the '-x -b' option to hide whitespace modifications. I cleaned up the use of whitespace in the impacted files, replaced some spaces with tabs and removed useless whitespace.
@@ -69,8 +75,26 @@ // 23.2.1.1 construct/copy/destroy: - deque() : _Base() { } +#if __cplusplus 201103L + deque() + : _Base() { } + deque(const deque __x) + : _Base(__x) { } + + ~deque() _GLIBCXX_NOEXCEPT { } In C++03 mode the _GLIBCXX_NOEXCEPT macro expands to an empty string, so it is useless in this chunk of code, which is only compiled for C++03 mode. It should probably just be removed here (and in all the other debug containers which use it in C++03-only code). Ok, I cleaned those. Did you mean removing the whole explicit destructor ? Is it a coding Standard to always explicitly implement the destructor or just a way to have Doxygen generate ? + * before-begin ownership.*/ + templatetypename _SafeSequence +void +_Safe_forward_list_SafeSequence:: +_M_swap(_Safe_sequence_base __other) noexcept +{ + __gnu_cxx::__scoped_lock sentry(_M_this()._M_get_mutex()); Shouldn't we be locking both containers' mutexes here? As we do in src/c++11/debug.cc Good point, not a regression but nice to fix in this patch. + forward_list(forward_list __list, const allocator_type __al) +: _Safe(std::move(__list), __al), + _Base(std::move(__list), __al) + { } This makes me feel uneasy, seeing a moved-from object being used again, but I don't think changing it to use static_casts to the two base classes would look better, so let's leave it like that. That indeed looks scary, we replaced with: forward_list(forward_list __list, const allocator_type __al) : _Safe(std::move(__list._M_safe()), __al), _Base(std::move(__list._M_base()), __al) { } it makes clearer the fact that we move each part. 
Index: include/debug/safe_base.h === --- include/debug/safe_base.h(revision 209446) +++ include/debug/safe_base.h(working copy) @@ -188,22 +188,18 @@ protected: // Initialize with a version number of 1 and no iterators -_Safe_sequence_base() +_Safe_sequence_base() _GLIBCXX_NOEXCEPT This use of _GLIBCXX_NOEXCEPT are correct, if the intention is to be noexcept in C++11 and have no exception specification in C++98/C++03. Yes, I preferred to use default implementation for special function in C++11 so I qualified as many things as possible noexcept so that resulting noexcept qualification depends only on the normal mode noexcept qualification. : _M_iterators(0), _M_const_iterators(0), _M_version(1) { } #if __cplusplus = 201103L _Safe_sequence_base(const _Safe_sequence_base) noexcept : _Safe_sequence_base() { } - -_Safe_sequence_base(_Safe_sequence_base __x) noexcept - : _Safe_sequence_base() -{ _M_swap(__x); } #endif /** Notify all iterators that reference this sequence that the sequence is being destroyed. */ -~_Safe_sequence_base() +~_Safe_sequence_base() _GLIBCXX_NOEXCEPT This is redundant. In C++03 the macro expands to nothing, and in
Re: [C++ Patch/RFC] PR 58582
Hi, On 05/02/2014 08:05 PM, Jason Merrill wrote: I'd tweak instantiate_decl differently; a deleted function is defined, so pattern_defined should be true. I see, thanks. Simply setting pattern_defined and nothing else appears to work very well. The only minor annoyance is that DECL_STRUCT_FUNCTION can be null and we have to check for that before using it (in fact, we do that in 2 other places too). Tested x86_64-linux. Thanks again, Paolo. /// Index: cp/decl.c === --- cp/decl.c (revision 210004) +++ cp/decl.c (working copy) @@ -7822,6 +7822,8 @@ grokfndecl (tree ctype, decl, ctype); return NULL_TREE; } + if (ok == error_mark_node) + return NULL_TREE; return old_decl; } } Index: cp/pt.c === --- cp/pt.c (revision 210004) +++ cp/pt.c (working copy) @@ -19616,7 +19617,8 @@ instantiate_decl (tree d, int defer_ok, if (TREE_CODE (d) == FUNCTION_DECL) pattern_defined = (DECL_SAVED_TREE (code_pattern) != NULL_TREE - || DECL_DEFAULTED_OUTSIDE_CLASS_P (code_pattern)); + || DECL_DEFAULTED_OUTSIDE_CLASS_P (code_pattern) + || DECL_DELETED_FN (code_pattern)); else pattern_defined = ! DECL_IN_AGGR_P (code_pattern); @@ -19858,14 +19860,17 @@ instantiate_decl (tree d, int defer_ok, tf_warning_or_error, tmpl, /*integral_constant_expression_p=*/false); - /* Set the current input_location to the end of the function -so that finish_function knows where we are. */ - input_location - = DECL_STRUCT_FUNCTION (code_pattern)->function_end_locus; + if (DECL_STRUCT_FUNCTION (code_pattern)) + { + /* Set the current input_location to the end of the function +so that finish_function knows where we are. */ + input_location + = DECL_STRUCT_FUNCTION (code_pattern)->function_end_locus; - /* Remember if we saw an infinite loop in the template. */ - current_function_infinite_loop - = DECL_STRUCT_FUNCTION (code_pattern)->language->infinite_loop; + /* Remember if we saw an infinite loop in the template. 
*/ + current_function_infinite_loop + = DECL_STRUCT_FUNCTION (code_pattern)->language->infinite_loop; + } } /* We don't need the local specializations any more. */ Index: testsuite/g++.dg/cpp0x/deleted4.C === --- testsuite/g++.dg/cpp0x/deleted4.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted4.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + template<int> void foo() = delete; +}; + +template<int> void A::foo() { int i; } // { dg-error "redefinition" } + +template void A::foo<0>(); Index: testsuite/g++.dg/cpp0x/deleted5.C === --- testsuite/g++.dg/cpp0x/deleted5.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted5.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + template<int> void foo() = delete; +}; + +template<int> void A::foo() {} // { dg-error "redefinition" } + +template void A::foo<0>(); Index: testsuite/g++.dg/cpp0x/deleted6.C === --- testsuite/g++.dg/cpp0x/deleted6.C (revision 0) +++ testsuite/g++.dg/cpp0x/deleted6.C (working copy) @@ -0,0 +1,9 @@ +// PR c++/58582 +// { dg-do compile { target c++11 } } + +struct A +{ + template<int> void foo() = delete; +}; + +template void A::foo<0>();
[PATCH 0/3] Compile-time gimple checking, without typedefs
This patch series demonstrates a way of reimplementing the 89-patch series: [PATCH 00/89] Compile-time gimple-checking http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01148.html whilst avoiding introducing a pair of gimple_foo/const_gimple_foo typedefs for each subclass. It eliminates the gimple and const_gimple typedefs, renaming gimple_statement_base to gimple_stmt, giving types: gimple_stmt * and const gimple_stmt * throughout the middle-end. The rest of the gimple statement classes are renamed, converting the various gimple_statement_with_FOO to: gimple_stmt_with_FOO and the remainder: gimple_statement_SOME_SUBCLASS to just: gimple_SOME_SUBCLASS The idea is then to reimplement the earlier patch series, porting many of these: gimple_stmt *something to point to some more concrete subclass; I've done this for GIMPLE_SWITCH. It requires two patches that I've already posted separately: (A): [PATCH] gengtype: Support explicit pointers in template arguments: http://gcc.gnu.org/ml/gcc-patches/2014-05/msg3.html (which apparently will need reworking after wide-int is merged; oh well). (B): [PATCH 19/89] Const-correctness of gimple_call_builtin_p: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01194.html (I have a separate bootstrap & regrtest in progress for just this one, which appears to be pre-approved per Jeff's earlier comments). Of the 3 patches in the series itself: Patch 1: This one is handwritten: it renames the gimple statement classes in their declarations, in the docs, in the selftests and renames explicit uses of *subclasses*. Patch 2: This one is autogenerated: it does the mass conversion of gimple to gimple_stmt * and const_gimple to const gimple_stmt * throughout the code. The conversion script is at: https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/rename_gimple.py It's not a real C++ parser, but it has some smarts for handling awkward cases like: gimple def0, def2; which must become: gimple_stmt *def0, *def2; ^ note the second '*' character. 
You can see the selftests for the conversion script at: https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_rename_gimple.py Patch 3: This one is a port of the previously-posted: [PATCH 02/89] Introduce gimple_switch and use it in various places http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01154.html to the new approach. As per an earlier revision: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01549.html it *eliminates* the stmt->as_a_gimple_switch () and stmt->dyn_cast_gimple_switch () casting methods from the base class in favor of direct usage of is-a.h: as_a <gimple_switch *> (stmt) and dyn_cast <gimple_switch *> (stmt) This version of the patch also eliminates the gimple_switch / const_gimple_switch typedefs from the initial patch series in favor of gimple_switch being the name of the subclass. Hopefully porting this first patch in the series gives a sense of what the end-result would look like. There are a few possible variations on the names we could pick in this approach. I deliberately renamed gimple to gimple_stmt since IIRC Andrew MacLeod had suggested something like this, for the potential of making gimple be a namespace. That said, using namespaces may be controversial, and would need further gengtype support, which may be tricky (i.e. fully teach gengtype about C++ namespaces, which may be a gengtype hack too far). [I'd much prefer to avoid C++ namespaces in the core code, mostly because of gengtype]. Also, AIUI, Andrew is looking at introducing concepts of gimple types and gimple expressions, so gimple may no longer imply a *statement*. Alternatively, we could make the base class be just gimple (which would be more consistent with the names of the accessor functions). There's also the bargain basement namespaces approach, where we don't have an implicit gimple namespace, but just *pretend* we do, and rename the base type to stmt, with e.g. gimple_statement_phi becoming just phi. 
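The is-a.h usage being argued for can be sketched with a toy hierarchy. The class names and the helper below are illustrative stand-ins, not GCC's actual is-a.h machinery:

```cpp
#include <cassert>
#include <cstddef>

enum toy_code { TOY_SWITCH, TOY_OTHER };

struct toy_stmt { toy_code code; };
struct toy_switch : toy_stmt { unsigned num_labels; };

// Minimal stand-in for gcc's is_a_helper<T>::test specializations:
// a per-subclass predicate keyed on the statement code.
template <typename T> bool toy_is_a_test (const toy_stmt *);

template <> inline bool
toy_is_a_test<toy_switch *> (const toy_stmt *gs)
{
  return gs->code == TOY_SWITCH;
}

// as_a: checked downcast; here it asserts on a wrong cast, the way
// GCC's checked build aborts.
template <typename T> T as_a (toy_stmt *gs)
{
  assert (toy_is_a_test<T> (gs));
  return static_cast<T> (gs);
}

// dyn_cast: returns a null pointer instead of aborting on failure.
template <typename T> T dyn_cast (toy_stmt *gs)
{
  return toy_is_a_test<T> (gs) ? static_cast<T> (gs) : NULL;
}
```

The design point is that the casting logic lives in standalone templates rather than as member functions on the base class, so the base class need not know about its subclasses.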
[gimple_switch would need to become switch_, say, to avoid the reserved word]. Successfully bootstrapped and regrtested on x86_64-unknown-linux-gnu (Fedora 20) (on top of the 2 dependent patches mentioned above; equal results compared to a control build of r209953). How does this look? (for trunk; after 4.9.1 is released). David Malcolm (3): Handwritten part of conversion of gimple to gimple_stmt * Autogenerated part of conversion of gimple to gimple_stmt * Introduce gimple_switch and use it in various places gcc/asan.c | 36 +- gcc/builtins.c | 10 +- gcc/builtins.h |2 +- gcc/c-family/c-gimplify.c|4 +- gcc/calls.c
[PATCH 3/3] Introduce gimple_switch and use it in various places
gcc/ * gimple.h (gimple_switch): New subclass of gimple_stmt_with_ops, adding the invariant that stmt->code == GIMPLE_SWITCH. (is_a_helper <gimple_switch *>::test): New. (is_a_helper <const gimple_switch *>::test): New. (gimple_try): Likewise. * coretypes.h (gimple_switch): Add forward declaration here. * gdbhooks.py (build_pretty_printer): Add gimple_switch, using the gimple printer. * gimple.c (gimple_build_switch_nlabels): Return a gimple_switch rather than just a gimple. (gimple_build_switch): Likewise. * gimple.h (gimple_build_switch_nlabels): Likewise. (gimple_build_switch): Likewise. * gimple.h (gimple_switch_num_labels): Update type-signature to require a gimple_switch rather than just a gimple. (gimple_switch_set_num_labels): Likewise. (gimple_switch_set_index): Likewise. (gimple_switch_label): Likewise. (gimple_switch_set_label): Likewise. (gimple_switch_default_label): Likewise. (gimple_switch_set_default_label): Likewise. * expr.h (expand_case): Likewise. * gimple-pretty-print.c (dump_gimple_call): Likewise. * stmt.c (compute_cases_per_edge): Likewise. (expand_case): Likewise. * tree-cfg.h (group_case_labels_stmt): Likewise. * tree-cfg.c (make_gimple_switch_edges): Likewise. (find_taken_edge_switch_expr): Likewise. (find_case_label_for_value): Likewise. (get_cases_for_edge): Likewise. (group_case_labels_stmt): Likewise. (verify_gimple_switch): Likewise. * tree-eh.c (verify_norecord_switch_expr): Likewise. * tree-eh.c (lower_eh_constructs_2): Likewise. * tree-loop-distribution.c (generate_loops_for_partition): Likewise. * tree-ssa-dom.c (record_edge_info): Likewise. * tree-ssa-forwprop.c (simplify_gimple_switch_label_vec): Likewise. (simplify_gimple_switch): Likewise. * tree-switch-conversion.c (emit_case_bit_tests): Likewise. (collect_switch_conv_info): Likewise. (build_constructors): Likewise. (array_value_type): Likewise. (build_one_array): Likewise. (build_arrays): Likewise. (gen_inbound_check): Likewise. * tree-vrp.c (find_switch_asserts): Likewise. 
(find_case_label_range): Likewise. (find_case_label_ranges): Likewise. (vrp_visit_switch_stmt): Likewise. (simplify_switch_using_ranges): Likewise. * tree-vrp.c (switch_update): Strengthen field stmt from being merely a gimple_stmt * to being a gimple_switch *. * cfgexpand.c (expand_gimple_stmt_1): Add checked cast to gimple_switch in regions where the stmt code has been tested as GIMPLE_SWITCH. * gimple-pretty-print.c (pp_gimple_stmt_1): Likewise. * tree-cfg.c (make_edges): Likewise. (end_recording_case_labels): Likewise. (cleanup_dead_labels): Likewise. (cleanup_dead_labels): Likewise. (group_case_labels): Likewise. (find_taken_edge): Likewise. (find_case_label_for_value): Likewise. (verify_gimple_stmt): Likewise. (gimple_verify_flow_info): Likewise. (gimple_redirect_edge_and_branch): Likewise. * tree-inline.c (estimate_num_insns): Likewise. * tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Likewise. * tree-ssa-uncprop.c (associate_equivalences_with_edges): Likewise. * tree-switch-conversion.c (do_switchconv): Likewise. * tree-vrp.c (find_assert_locations_1): Likewise. (vrp_visit_stmt): Likewise. (simplify_stmt_using_ranges): Likewise. * ipa-inline-analysis.c (set_switch_stmt_execution_predicate): Introduce local lastg as a generic gimple, so that local last can be of type gimple_switch once lastg's code has been verified. * omp-low.c (diagnose_sb_2): Introduce switch_stmt local to handle the GIMPLE_SWITCH case. * tree-cfg.c (find_taken_edge_switch_expr): Add gimple_switch argument, since the caller (find_taken_edge) has checked that last_stmt is a switch. * doc/gimple.texi (Class hierarchy of GIMPLE statements): Add gimple_switch class. (GIMPLE_SWITCH): Update signatures of accessor functions to reflect above gimple_stmt to gimple_switch changes. 
--- gcc/cfgexpand.c | 2 +- gcc/coretypes.h | 5 gcc/doc/gimple.texi | 34 ++--- gcc/expr.h | 2 +- gcc/gdbhooks.py | 4 ++- gcc/gimple-pretty-print.c| 4 +-- gcc/gimple.c | 11 + gcc/gimple.h | 44 ++--- gcc/ipa-inline-analysis.c| 7 +++--- gcc/omp-low.c| 5 ++-- gcc/stmt.c | 4 +-- gcc/tree-cfg.c |
[PATCH, RS6000] Fix __builtin_{pack,unpack}_longdouble and __builtin_{pack,unpack}_dec128 builtins
Currently, the gcc.target/powerpc/pack0[23].c test cases are failing on trunk and the 4.9 and 4.8 branches. The problem is we either fail to register the builtins, leading to undefined reference errors to the __builtin_* symbol, or we ICE due to having registered the builtin, but due to compiler options, rs6000_expand_builtin doesn't think we should be expanding the builtin. I fixed the new pack/unpack long double builtins by enabling them for TARGET_HARD_FLOAT and easing the gcc_assert in rs6000_expand_builtin(). The pack/unpack DFP builtins just required passing -mhard-dfp as an option to the test case. I also reduced the dg-require-effective-target, since the test case works just fine on non-vsx POWER6 hardware. In the process, I noticed we didn't disallow -msoft-float -mhard-dfp being used together, so we enforce that now. I'll note this is a precursor patch to using the __builtin_pack_longdouble builtin in libgcc/config/rs6000/ibm-ldouble.c, which currently uses code that copies data through a union. The code it produces is horrific, as it stores the two double fprs to memory, loads them into gprs, then stores them back to memory before loading them back into fprs again. The builtin will also allow us to get a simple fmr(s). This passed bootstrap and regtesting on powerpc64-linux with no regressions and the pack0[23].c tests now pass. Ok for trunk? Is this also ok for the 4.8/4.9 branches once my testing on those is complete? Peter gcc/ * config/rs6000/rs6000.h (RS6000_BTM_HARD_FLOAT): New define. (RS6000_BTM_COMMON): Add RS6000_BTM_HARD_FLOAT. (TARGET_EXTRA_BUILTINS): Add TARGET_HARD_FLOAT. * config/rs6000/rs6000-builtin.def (BU_MISC_1): Use RS6000_BTM_HARD_FLOAT. (BU_MISC_2): Likewise. * config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Handle RS6000_BTM_HARD_FLOAT. (rs6000_option_override_internal): Enforce -mhard-float if -mhard-dfp is explicitly used. (rs6000_invalid_builtin): Add hard floating builtin support. 
(rs6000_expand_builtin): Relax the gcc_assert to allow the new hard float builtins. (rs6000_builtin_mask_names): Add RS6000_BTM_HARD_FLOAT. gcc/testsuite/ * gcc.target/powerpc/pack02.c (dg-options): Add -mhard-float. (dg-require-effective-target): Change target to powerpc_fprs. * gcc.target/powerpc/pack03.c (dg-options): Add -mhard-dfp. (dg-require-effective-target): Change target to dfprt. Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 209990) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -624,7 +624,8 @@ extern int rs6000_vector_align[]; || TARGET_CMPB /* ISA 2.05 */ \ || TARGET_POPCNTD /* ISA 2.06 */ \ || TARGET_ALTIVEC \ - || TARGET_VSX))) + || TARGET_VSX \ + || TARGET_HARD_FLOAT))) /* E500 cores only support plain sync, not lwsync. */ #define TARGET_NO_LWSYNC (rs6000_cpu == PROCESSOR_PPC8540 \ @@ -2517,6 +2518,7 @@ extern int frame_pointer_needed; #define RS6000_BTM_POPCNTD MASK_POPCNTD /* Target supports ISA 2.06. */ #define RS6000_BTM_CELL MASK_FPRND /* Target is cell powerpc. */ #define RS6000_BTM_DFP MASK_DFP /* Decimal floating point. */ +#define RS6000_BTM_HARD_FLOAT MASK_SOFT_FLOAT /* Hardware floating point. */ #define RS6000_BTM_COMMON (RS6000_BTM_ALTIVEC \ | RS6000_BTM_VSX \ @@ -2529,7 +2531,8 @@ extern int frame_pointer_needed; | RS6000_BTM_HTM \ | RS6000_BTM_POPCNTD \ | RS6000_BTM_CELL \ -| RS6000_BTM_DFP) +| RS6000_BTM_DFP \ +| RS6000_BTM_HARD_FLOAT) /* Define builtin enum index. */ Index: gcc/config/rs6000/rs6000-builtin.def === --- gcc/config/rs6000/rs6000-builtin.def (revision 209990) +++ gcc/config/rs6000/rs6000-builtin.def (working copy) @@ -626,7 +626,7 @@ #define BU_MISC_1(ENUM, NAME, ATTR, ICODE) \ RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ "__builtin_" NAME, /* NAME */ \ - RS6000_BTM_ALWAYS, /* MASK */ \ + RS6000_BTM_HARD_FLOAT, /* MASK */ \ (RS6000_BTC_ ## ATTR
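The union round-trip criticized above (data travelling through memory instead of a register-to-register fmr) can be sketched generically. This is an assumed shape for illustration only, not the actual libgcc ibm-ldouble.c source, and pack_via_memory/unpack_via_memory are invented names:

```cpp
#include <cassert>
#include <cstring>

// An IBM double-double value is just two doubles: a high part with
// most of the significance and a low part holding the residual.
struct dd_parts { double hi, lo; };

// Packing through memory: both halves are stored and then reloaded,
// which is the store/load shuffle the message complains about; the
// __builtin_pack_longdouble builtin would keep them in FP registers.
static void
pack_via_memory (double hi, double lo, unsigned char out[sizeof (dd_parts)])
{
  dd_parts p;
  p.hi = hi;
  p.lo = lo;
  std::memcpy (out, &p, sizeof p);   // forced trip through memory
}

static dd_parts
unpack_via_memory (const unsigned char in[sizeof (dd_parts)])
{
  dd_parts p;
  std::memcpy (&p, in, sizeof p);
  return p;
}
```

The values survive the round trip, but every pack/unpack pair costs stores and loads that a single register move would avoid.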
Re: [PATCH] Detect a pack-unpack pattern in GCC vectorizer and optimize it.
On Mon, Apr 28, 2014 at 4:04 AM, Richard Biener rguent...@suse.de wrote: On Thu, 24 Apr 2014, Cong Hou wrote: Given the following loop: int a[N]; short b[N*2]; for (int i = 0; i < N; ++i) a[i] = b[i*2]; After being vectorized, the access to b[i*2] will be compiled into several packing statements, while the type promotion from short to int will be compiled into several unpacking statements. With this patch, each pair of pack/unpack statements will be replaced by less expensive statements (with shift or bit-and operations). On x86_64, the loop above will be compiled into the following assembly (with -O2 -ftree-vectorize): movdqu 0x10(%rcx),%xmm3 movdqu -0x20(%rcx),%xmm0 movdqa %xmm0,%xmm2 punpcklwd %xmm3,%xmm0 punpckhwd %xmm3,%xmm2 movdqa %xmm0,%xmm3 punpcklwd %xmm2,%xmm0 punpckhwd %xmm2,%xmm3 movdqa %xmm1,%xmm2 punpcklwd %xmm3,%xmm0 pcmpgtw %xmm0,%xmm2 movdqa %xmm0,%xmm3 punpckhwd %xmm2,%xmm0 punpcklwd %xmm2,%xmm3 movups %xmm0,-0x10(%rdx) movups %xmm3,-0x20(%rdx) With this patch, the generated assembly is shown below: movdqu 0x10(%rcx),%xmm0 movdqu -0x20(%rcx),%xmm1 pslld $0x10,%xmm0 psrad $0x10,%xmm0 pslld $0x10,%xmm1 movups %xmm0,-0x10(%rdx) psrad $0x10,%xmm1 movups %xmm1,-0x20(%rdx) Bootstrapped and tested on x86-64. OK for trunk? This is an odd place to implement such a transform. Also whether it is faster or not depends on the exact ISA you target - for example ppc has constraints on the maximum number of shifts carried out in parallel and the above has 4 in very short succession. Esp. for the sign-extend path. Thank you for the information about ppc. If this is an issue, I think we can do it in a target dependent way. So this looks more like an opportunity for a post-vectorizer transform on RTL or for the vectorizer special-casing widening loads with a vectorizer pattern. I am not sure if the RTL transform is more difficult to implement. I prefer the widening loads method, which can be detected in a pattern recognizer. 
The target-related issue will be resolved by only expanding the widening load on those targets where this pattern is beneficial. But this requires new tree operations to be defined. What is your suggestion? I apologize for the delayed reply. thanks, Cong Richard.
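The shift-based replacement in the generated assembly can be seen in scalar form: after the vector load, each 32-bit lane holds b[2*i] in its low 16 bits and b[2*i+1] in its high 16 bits, and the pslld/psrad pair by 16 leaves sign_extend(b[2*i]). A hedged scalar sketch (the helper name is invented for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Scalar analogue of "pslld $0x10 ; psrad $0x10" on one vector lane:
// shift the low 16-bit element to the top, then arithmetic-shift it
// back down, sign-extending it on the way.  (Right-shifting a negative
// signed value is implementation-defined before C++20, but every
// mainstream compiler implements it as an arithmetic shift.)
static int32_t
extract_even_sext (uint32_t lane)
{
  return (int32_t) (lane << 16) >> 16;
}
```

Two shifts per lane replace the punpcklwd/punpckhwd interleave and pcmpgtw sign-mask dance of the unpatched code.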
Re: [wide-int] Add more assertions
These are fine. On 05/02/2014 03:20 PM, Richard Sandiford wrote: This patch adds some assertions against sext (..., 0) and zext (..., 0). The former is undefined at the sext_hwi level and the latter is disallowed for consistency with the former. Also, set_bit (rightly IMO) can't handle bit >= precision. For precision <= HOST_BITS_PER_WIDE_INT it would invoke undefined behaviour while for other precisions I think it would crash. A case with precision <= HOST_BITS_PER_WIDE_INT showed up in java (fix posted separately). Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? Thanks, Richard Index: gcc/wide-int.h === --- gcc/wide-int.h 2014-05-02 16:28:09.561842842 +0100 +++ gcc/wide-int.h 2014-05-02 16:44:04.015463718 +0100 @@ -2046,6 +2048 @@ wi::sext (const T &x, unsigned int offse unsigned int precision = get_precision (result); WIDE_INT_REF_FOR (T) xi (x, precision); + gcc_checking_assert (offset != 0); + if (offset <= HOST_BITS_PER_WIDE_INT) { val[0] = sext_hwi (xi.ulow (), offset); @@ -2065,6 +2067,8 @@ wi::zext (const T &x, unsigned int offse unsigned int precision = get_precision (result); WIDE_INT_REF_FOR (T) xi (x, precision); + gcc_checking_assert (offset != 0); + /* This is not just an optimization, it is actually required to maintain canonization. */ if (offset >= precision) @@ -2102,6 +2106,9 @@ wi::set_bit (const T &x, unsigned int bi WI_UNARY_RESULT_VAR (result, val, T, x); unsigned int precision = get_precision (result); WIDE_INT_REF_FOR (T) xi (x, precision); + + gcc_checking_assert (bit < precision); + if (precision <= HOST_BITS_PER_WIDE_INT) { val[0] = xi.ulow () | ((unsigned HOST_WIDE_INT) 1 << bit);
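A scalar sketch of what sign/zero extension from the low OFFSET bits does, and why offset == 0 has to be asserted away. my_sext_hwi/my_zext_hwi are illustrative stand-ins written here, not the real wide-int internals:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend X from its low OFFSET bits.  With offset == 0 the shift
// count would equal the full 64-bit width, which is undefined
// behaviour -- the motivation for the new gcc_checking_asserts.
static int64_t
my_sext_hwi (int64_t x, unsigned int offset)
{
  assert (offset != 0 && offset <= 64);
  unsigned int shift = 64 - offset;
  // Shift the wanted bits to the top, then arithmetic-shift back down.
  return (int64_t) ((uint64_t) x << shift) >> shift;
}

// Zero-extend X from its low OFFSET bits.
static uint64_t
my_zext_hwi (int64_t x, unsigned int offset)
{
  assert (offset != 0 && offset <= 64);
  if (offset == 64)
    return (uint64_t) x;
  return (uint64_t) x & (((uint64_t) 1 << offset) - 1);
}
```

With offset == 0 the mask construction in the zext path would also shift by zero bits and produce a mask of 0 rather than "keep nothing defined", so disallowing it keeps the two functions consistent.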
Re: [wide-int] Fix some division cases
This is fine. On 05/02/2014 03:22 PM, Richard Sandiford wrote: divmod_internal didn't handle unsigned division in which the inputs have implicit all-one upper bits. There were two problems: - wi_unpack should extend implicit 1s to index blocks_needed - 1 (possibly with a zext_hwi on the last block for small_prec) regardless of signedness - when calculating m and n, divmod_internal only used the *_blocks_needed value if the division was signed. Neither value is really signed in this context, since we've already calculated absolute values if sgn == SIGNED, so the neg_p (and its pre-wi:: incarnation) would only have held for the minimum integer. Using a simple while loop to find the top half-HWI seems more direct. Also, divmod_internal_2 took n and m as unsigned values but used HOST_WIDE_INT iterators on m - n. m can be less than n for things like 0 / big, and the negative m - n was being treated as an unsigned int and zero-extended to HOST_WIDE_INT. The patch fixes both the types of n and m and the types of the iterators. Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? Thanks, Richard Index: gcc/wide-int.cc === --- gcc/wide-int.cc 2014-05-02 16:28:09.560842845 +0100 +++ gcc/wide-int.cc 2014-05-02 16:40:40.002916374 +0100 @@ -1126,8 +1126,7 @@ wi::add_large (HOST_WIDE_INT *val, const HOST_HALF_WIDE_INTs of RESULT. The rest of RESULT is filled by uncompressing the top bit of INPUT[IN_LEN - 1]. 
*/ static void -wi_unpack (unsigned HOST_HALF_WIDE_INT *result, - const unsigned HOST_WIDE_INT *input, +wi_unpack (unsigned HOST_HALF_WIDE_INT *result, const HOST_WIDE_INT *input, unsigned int in_len, unsigned int out_len, unsigned int prec, signop sgn) { @@ -1145,20 +1144,24 @@ wi_unpack (unsigned HOST_HALF_WIDE_INT * else mask = 0; - for (i = 0; i < in_len; i++) + for (i = 0; i < blocks_needed - 1; i++) { - HOST_WIDE_INT x = input[i]; - if (i == blocks_needed - 1 && small_prec) - { - if (sgn == SIGNED) - x = sext_hwi (x, small_prec); - else - x = zext_hwi (x, small_prec); - } + HOST_WIDE_INT x = safe_uhwi (input, in_len, i); result[j++] = x; result[j++] = x >> HOST_BITS_PER_HALF_WIDE_INT; } + HOST_WIDE_INT x = safe_uhwi (input, in_len, i); + if (small_prec) +{ + if (sgn == SIGNED) + x = sext_hwi (x, small_prec); + else + x = zext_hwi (x, small_prec); +} + result[j++] = x; + result[j++] = x >> HOST_BITS_PER_HALF_WIDE_INT; + /* Smear the sign bit. */ while (j < out_len) result[j++] = mask; @@ -1317,10 +1320,8 @@ wi::mul_internal (HOST_WIDE_INT *val, co } /* We do unsigned mul and then correct it. */ - wi_unpack (u, (const unsigned HOST_WIDE_INT *) op1val, op1len, -half_blocks_needed, prec, SIGNED); - wi_unpack (v, (const unsigned HOST_WIDE_INT *) op2val, op2len, -half_blocks_needed, prec, SIGNED); + wi_unpack (u, op1val, op1len, half_blocks_needed, prec, SIGNED); + wi_unpack (v, op2val, op2len, half_blocks_needed, prec, SIGNED); /* The 2 is for a full mult. */ memset (r, 0, half_blocks_needed * 2 @@ -1519,7 +1520,7 @@ divmod_internal_2 (unsigned HOST_HALF_WI unsigned HOST_HALF_WIDE_INT *b_remainder, unsigned HOST_HALF_WIDE_INT *b_dividend, unsigned HOST_HALF_WIDE_INT *b_divisor, - unsigned int m, unsigned int n) + int m, int n) { /* The digits are a HOST_HALF_WIDE_INT which the size of half of a HOST_WIDE_INT and stored in the lower bits of each word. This @@ -1530,7 +1531,8 @@ divmod_internal_2 (unsigned HOST_HALF_WI unsigned HOST_WIDE_INT qhat; /* Estimate of quotient digit. 
*/ unsigned HOST_WIDE_INT rhat; /* A remainder. */ unsigned HOST_WIDE_INT p; /* Product of two digits. */ - HOST_WIDE_INT s, i, j, t, k; + HOST_WIDE_INT t, k; + int i, j, s; /* Single digit divisor. */ if (n == 1) @@ -1757,26 +1759,17 @@ wi::divmod_internal (HOST_WIDE_INT *quot } } - wi_unpack (b_dividend, (const unsigned HOST_WIDE_INT *) dividend.get_val (), -dividend.get_len (), dividend_blocks_needed, dividend_prec, sgn); - wi_unpack (b_divisor, (const unsigned HOST_WIDE_INT *) divisor.get_val (), -divisor.get_len (), divisor_blocks_needed, divisor_prec, sgn); - - if (wi::neg_p (dividend, sgn)) -m = dividend_blocks_needed; - else -m = 2 * dividend.get_len (); + wi_unpack (b_dividend, dividend.get_val (), dividend.get_len (), +dividend_blocks_needed, dividend_prec, sgn); + wi_unpack (b_divisor, divisor.get_val (), divisor.get_len (), +divisor_blocks_needed, divisor_prec, sgn); + + m = dividend_blocks_needed; + while (m > 1 && b_dividend[m - 1] ==
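The half-word unpacking at the heart of wi_unpack can be sketched in isolation. Types and the name are simplified stand-ins; the real code additionally handles the sign smearing and small_prec masking shown in the patch:

```cpp
#include <cassert>
#include <cstdint>

// Split each 64-bit input block into two 32-bit halves, low half
// first -- the digit layout that divmod_internal_2 operates on.
static void
unpack_halves (uint32_t *result, const uint64_t *input, unsigned int in_len)
{
  unsigned int j = 0;
  for (unsigned int i = 0; i < in_len; i++)
    {
      result[j++] = (uint32_t) input[i];           // low half
      result[j++] = (uint32_t) (input[i] >> 32);   // high half
    }
}
```

Working in half-wide digits lets the schoolbook division compute a full-width product of two digits without overflowing a HOST_WIDE_INT.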