Re: [RFC PATCH, i386]: Remove special PIC related __cpuid definitions from config/i386/cpuid.h

2014-10-17 Thread Uros Bizjak
On Thu, Oct 16, 2014 at 12:25 PM, Uros Bizjak ubiz...@gmail.com wrote:

  Now that %ebx is also allocatable in PIC modes, we can cleanup
  config/i386/cpuid considerably. I propose to remove all PIC related
  specializations of __cpuid and __cpuid_count and protect the
  compilation with #if __GNUC__ >= 5.
 
  The only drawback would be that non-bootstrapped build with gcc < 5.0
  will ignore -march=native, but I think this should be acceptable.
 
  I'm worried about that.
  Can't you instead keep the current cpuid.h stuff as is, just add
   && __GNUC__ < 5
  to that, so it treats GCC 5+ PIC as if __PIC__ wasn't defined?
 
  Or, at least use cpuid.h even for older GCC if __PIC__ is not defined
  (or __x86_64__ is defined and not medium/large PIC model)?

 Do we really care that much about non-bootstrapped build? I don't see

 At least on Linux, driver-i386.c should not be built with PIC normally,
 so at least changing
 #if __GNUC__ >= 5
 to
 #if defined(__GNUC__) && (__GNUC__ >= 5 || !defined(__PIC__))
 would limit the -march=native change for non-bootstrapped compilers to
 Darwin only (or what other targets use PIC by default?).

 Yes, this would work for me - the goal is to keep only one universal
 __cpuid (and __cpuid_count) define, and the above condition fits this
 goal.
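
For readers following the guard discussed above, here is a minimal sketch of how the agreed condition evaluates on a given host compiler. The macro name NATIVE_DETECTION_OK and the function name are hypothetical and chosen only for illustration; the thread itself only fixes the `#if` condition.

```c
/* Illustrative only: NATIVE_DETECTION_OK is a made-up macro name.
   The condition is the one proposed in the thread: GCC 5 or newer,
   or any GCC when the file is not compiled as PIC.  */
#if defined(__GNUC__) && (__GNUC__ >= 5 || !defined(__PIC__))
#define NATIVE_DETECTION_OK 1
#else
#define NATIVE_DETECTION_OK 0
#endif

/* Returns 1 when the host compiler satisfies the guard, 0 otherwise.  */
static int
native_detection_ok (void)
{
  return NATIVE_DETECTION_OK;
}
```

With an older GCC building position-independent code the function returns 0, which corresponds to the case where -march=native detection would be left out of a non-bootstrapped build.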

I have committed the attached patch to mainline SVN.

2014-10-17  Uros Bizjak  ubiz...@gmail.com

* config/i386/cpuid.h (__cpuid): Remove definitions that handle %ebx
register in a special way.
(__cpuid_count): Ditto.
* config/i386/driver-i386.h: Protect with
#if defined(__GNUC__) && (__GNUC__ >= 5 || !defined(__PIC__)).
(host_detect_local_cpu): Mention that GCC with non-fixed %ebx
is required to compile the function.

Bootstrapped and regression tested on x86_64-linux-gnu.

Uros.
Index: config/i386/cpuid.h
===
--- config/i386/cpuid.h (revision 216298)
+++ config/i386/cpuid.h (working copy)
@@ -146,56 +146,7 @@
 #define signature_VORTEX_ecx   0x436f5320
 #define signature_VORTEX_edx   0x36387865
 
-#if defined(__i386__) && defined(__PIC__)
-/* %ebx may be the PIC register.  */
-#if __GNUC__ >= 3
 #define __cpuid(level, a, b, c, d) \
-  __asm__ (xchg{l}\t{%%}ebx, %k1\n\t \
-  cpuid\n\t  \
-  xchg{l}\t{%%}ebx, %k1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level))
-
-#define __cpuid_count(level, count, a, b, c, d)\
-  __asm__ (xchg{l}\t{%%}ebx, %k1\n\t \
-  cpuid\n\t  \
-  xchg{l}\t{%%}ebx, %k1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level), 2 (count))
-#else
-/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
-   nor alternatives in i386 code.  */
-#define __cpuid(level, a, b, c, d) \
-  __asm__ (xchgl\t%%ebx, %k1\n\t \
-  cpuid\n\t  \
-  xchgl\t%%ebx, %k1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level))
-
-#define __cpuid_count(level, count, a, b, c, d)\
-  __asm__ (xchgl\t%%ebx, %k1\n\t \
-  cpuid\n\t  \
-  xchgl\t%%ebx, %k1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level), 2 (count))
-#endif
-#elif defined(__x86_64__) && (defined(__code_model_medium__) || defined(__code_model_large__)) && defined(__PIC__)
-/* %rbx may be the PIC register.  */
-#define __cpuid(level, a, b, c, d) \
-  __asm__ (xchg{q}\t{%%}rbx, %q1\n\t \
-  cpuid\n\t  \
-  xchg{q}\t{%%}rbx, %q1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level))
-
-#define __cpuid_count(level, count, a, b, c, d)\
-  __asm__ (xchg{q}\t{%%}rbx, %q1\n\t \
-  cpuid\n\t  \
-  xchg{q}\t{%%}rbx, %q1\n\t  \
-  : =a (a), =r (b), =c (c), =d (d)\
-  : 0 (level), 2 (count))
-#else
-#define __cpuid(level, a, b, c, d) \
   __asm__ (cpuid\n\t \
   : =a (a), =b (b), =c (c), =d (d) \
   : 0 (level))
@@ -204,8 +155,8 @@
   __asm__ (cpuid\n\t \
   : =a (a), =b (b), =c (c), =d (d) \
   : 0 (level), 2 (count))
-#endif
 
+
 /* Return highest supported input value for cpuid instruction.  ext can
be either 0x0 or 0x800 to return highest supported value for
basic or extended cpuid information.  Function returns 0 if cpuid
Index: config/i386/driver-i386.c
===
--- 

Re: [PATCH Fortran] rename gfc_warning_cmdline to gfc_warning_now_2

2014-10-17 Thread Tobias Burnus

Manuel López-Ibáñez wrote:

This patch is mostly cleanups, sorry for the churn. The next one will
be far more interesting.

Bootregtested on x86_64-linux-gnu.
OK?


Looks good to me. Thanks!

Tobias


gcc/fortran/ChangeLog:
2014-10-16  Manuel López-Ibáñez  m...@gcc.gnu.org

 PR fortran/44054
 * gfortran.h (gfc_warning_cmdline): Rename as gfc_warning_now_2.
 (gfc_error_cmdline): Rename as gfc_error_now_2.
 * error.c (gfc_diagnostic_build_locus_prefix): Remove trailing space.
 (gfc_diagnostic_starter): Add space between locus and prefix.
 (gfc_warning_now_2): Renamed from gfc_warning_cmdline.
 (gfc_error_now_2): Renamed from gfc_error_cmdline.
 * scanner.c (add_path_to_list): Use gfc_warning_now_2.
 (load_line): Likewise.
 (load_file): Likewise.
 * options.c (gfc_post_options): Update all renamed functions.




Re: [PATCH diagnostics/fortran] dynamically generate locations from offset + handle %C

2014-10-17 Thread Tobias Burnus

Manuel López-Ibáñez wrote:

This patch adds handling of Fortran %C using the common diagnostics
machinery. This is achieved by dynamically generating a location given
a location and an offset. This only works for non-macro line-maps (for
now), but this is OK since Fortran does not have virtual locations
(and I'm afraid it won't have them in the foreseeable future).

Dodji, are the linemap_asserts() appropriate? I tried to follow your
previous comments whenever possible.



Bootregtested on x86_64-linux-gnu.
OK?


From my side, the patch is OK. Thanks again for your diagnostic work 
(for this patch set and in general)!


Tobias


libcpp/ChangeLog:

2014-10-16  Manuel López-Ibáñez  m...@gcc.gnu.org

 PR fortran/44054
 * include/line-map.h (linemap_position_for_loc_and_offset):
 Declare.
 * line-map.c (linemap_position_for_loc_and_offset): New.


gcc/fortran/ChangeLog:

2014-10-16  Manuel López-Ibáñez  m...@gcc.gnu.org

 PR fortran/44054
 * gfortran.h (warn_use_without_only): Remove.
 (gfc_diagnostics_finish): Declare.
 * error.c: Include tree-diagnostics.h
 (gfc_format_decoder): New.
 (gfc_diagnostics_init): Use gfc_format_decoder. Set default caret
 char.
 (gfc_diagnostics_finish): Restore tree diagnostics defaults, but
 keep gfc_diagnostics_starter and finalizer. Restore default caret.
 * options.c: Remove all uses of warn_use_without_only.
 * lang.opt (Wuse-without-only): Add Var.
 * module.c (gfc_use_module): Use gfc_warning_now_2.
 * f95-lang.c (gfc_be_parse_file): Call gfc_diagnostics_finish.

gcc/testsuite/ChangeLog:

2014-10-16  Manuel López-Ibáñez  m...@gcc.gnu.org

 PR fortran/44054
 * lib/gfortran-dg.exp: Update regexp to match locus and message
 without caret.
 * gfortran.dg/use_without_only_1.f90: Add column numbers.




[libatomic PATCH] Fix libatomic behavior for big endian toolchain

2014-10-17 Thread Shiva Chen
Hi,

I noticed a problem in how libatomic implements the builtin functions for
parameters smaller than a word.
It shifts the parameter value into position within a word and then stores the word.
However, the shift amount is wrong for big endian.
This patch fixes the libatomic builtin function behavior for big endian toolchains.
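
To make the shift arithmetic concrete, here is a small sketch of the two formulas used in the patch below. It is not libatomic code: the helper names are hypothetical, and endianness is passed as a parameter (or probed at run time) instead of being selected with `__ARMEB__`.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bit position of an n-byte field that starts at
   byte offset OFF inside a WORDSIZE-byte word.  The two branches mirror
   the big endian and little endian formulas from the patch.  */
static unsigned
subword_shift (unsigned off, unsigned n, unsigned wordsize, int big_endian)
{
  assert (off + n <= wordsize);
  if (big_endian)
    return (wordsize - n - off) * CHAR_BIT;
  return off * CHAR_BIT;
}

/* Probe the byte order of the machine running this code.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0;
}

/* Returns 1 if shifting the containing word by the computed amount
   yields the byte stored at offset OFF.  */
static int
byte_matches_shifted_extract (unsigned off)
{
  const unsigned char bytes[4] = { 0x11, 0x22, 0x33, 0x44 };
  uint32_t word;
  memcpy (&word, bytes, sizeof word);
  unsigned shift = subword_shift (off, 1, sizeof word, host_is_big_endian ());
  return ((word >> shift) & 0xffu) == bytes[off];
}
```

For a one-byte field at offset 0 of a four-byte word, the little endian shift is 0 while the big endian shift is 24, which is the difference the patch addresses.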

Is it ok for trunk ?

Shiva


2014-10-17 Shiva Chen shiva0...@gmail.com

Fix libatomic behavior for big endian toolchain
* libatomic/cas_n.c: Fix shift amount for big endian toolchain
* libatomic/config/arm/exch_n.c: Fix shift amount for big endian toolchain
* libatomic/exch_n.c: Fix shift amount for big endian toolchain
* libatomic/fop_n.c: Fix shift amount for big endian toolchain
diff --git a/libatomic/cas_n.c b/libatomic/cas_n.c
index 801262d..aea49f0 100644
--- a/libatomic/cas_n.c
+++ b/libatomic/cas_n.c
@@ -60,7 +60,11 @@ SIZE(libat_compare_exchange) (UTYPE *mptr, UTYPE *eptr, UTYPE newval,
   if (N < WORDSIZE)
 {
   wptr = (UWORD *)((uintptr_t)mptr & -WORDSIZE);
-  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT) ^ SIZE(INVERT_MASK);
+#ifdef __ARMEB__
+  shift = ((WORDSIZE - N - ((uintptr_t)mptr % WORDSIZE)) * CHAR_BIT);
+#else
+  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT);
+#endif
   mask = SIZE(MASK) << shift;
 }
   else
diff --git a/libatomic/config/arm/exch_n.c b/libatomic/config/arm/exch_n.c
index c90d57f..0d71c5a 100644
--- a/libatomic/config/arm/exch_n.c
+++ b/libatomic/config/arm/exch_n.c
@@ -88,7 +88,11 @@ SIZE(libat_exchange) (UTYPE *mptr, UTYPE newval, int smodel)
   __atomic_thread_fence (__ATOMIC_SEQ_CST);
 
   wptr = (UWORD *)((uintptr_t)mptr & -WORDSIZE);
-  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT) ^ INVERT_MASK_1;
+#ifdef __ARMEB__
+  shift = ((WORDSIZE - N - ((uintptr_t)mptr % WORDSIZE)) * CHAR_BIT);
+#else
+  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT);
+#endif
   mask = MASK_1 << shift;
   wnewval = newval << shift;
 
diff --git a/libatomic/exch_n.c b/libatomic/exch_n.c
index 23558b0..e293d0b 100644
--- a/libatomic/exch_n.c
+++ b/libatomic/exch_n.c
@@ -77,7 +77,11 @@ SIZE(libat_exchange) (UTYPE *mptr, UTYPE newval, int smodel)
   if (N < WORDSIZE)
 {
   wptr = (UWORD *)((uintptr_t)mptr & -WORDSIZE);
-  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT) ^ SIZE(INVERT_MASK);
+#ifdef __ARMEB__
+  shift = ((WORDSIZE - N - ((uintptr_t)mptr % WORDSIZE)) * CHAR_BIT);
+#else
+  shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT);
+#endif
   mask = SIZE(MASK) << shift;
 }
   else
diff --git a/libatomic/fop_n.c b/libatomic/fop_n.c
index 4a18da9..b3184b7 100644
--- a/libatomic/fop_n.c
+++ b/libatomic/fop_n.c
@@ -113,7 +113,11 @@ SIZE(C2(libat_fetch_,NAME)) (UTYPE *mptr, UTYPE opval, int smodel)
   pre_barrier (smodel);
 
   wptr = (UWORD *)mptr;
+#ifdef __ARMEB__
+  shift = (WORDSIZE - N) * CHAR_BIT;
+#else
   shift = 0;
+#endif
   mask = -1;
 
   wopval = (UWORD)opval << shift;
@@ -137,7 +141,11 @@ SIZE(C3(libat_,NAME,_fetch)) (UTYPE *mptr, UTYPE opval, int smodel)
   pre_barrier (smodel);
 
   wptr = (UWORD *)mptr;
+#ifdef __ARMEB__
+  shift = (WORDSIZE - N) * CHAR_BIT;
+#else
   shift = 0;
+#endif
   mask = -1;
 
   wopval = (UWORD)opval << shift;


Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-17 Thread Ramana Radhakrishnan



On 16/10/14 21:43, Andrew Pinski wrote:

On Thu, Oct 16, 2014 at 1:38 PM, Sebastian Pop seb...@gmail.com wrote:

Richard Biener wrote:


I have posted 5 patches as part of a larger series to merge
(parts) from the match-and-simplify branch.  While I think
there was overall consensus that the idea behind the project
is sound there are technical questions left for how the
thing should look in the end.  I've raised them in 3/n
which is the only patch of the series that contains any
patterns sofar.

To re-iterate here (as I expect most people will only look
at [0/n] patches ;)), the question is whether we are fine
with making fold-const (thus fold_{unary,binary,ternary})
not handle some cases it handles currently.





I have tested on aarch64 all the code in the match-and-simplify against trunk as
of the last merge at r216315:

2014-10-16  Richard Biener  rguent...@suse.de

 Merge from trunk r216235 through r216315.

Overall, I see a lot more perf regressions (about 2/3 of the tests) than
improvements (1/3 of the tests).  I will try to reduce tests.




For instance, saxpy regresses at -O3 on aarch64:

void saxpy(double* x, double* y, double* z) {
 int i=0;
 for (i = 0 ; i < ARRAY_SIZE; i++) {
 z[i] = x[i] + scalar*y[i];
 }
}


This looks like a scheduling issue rather than anything else.  The
scheduler for a57 is not complete and does not model some things like
the fusion of the compares and branch which is most likely what you
are seeing.



Huh !! how is that related to the code generation shown by Seb ?

See the replacement of subs by cmp and sub. Folding cmp into other flag 
setting instructions is a very useful optimization on ARM and AArch64 
and that's what appears missing in fold-const. That may be what's causing 
the slowdown. I've never known that to be caused by any scheduler 
vagaries !


regards
Ramana






Thanks,
Andrew Pinski



$ diff -u base.s mas.s
--- base.s  2014-10-16 15:30:15.35143 -0500
+++ mas.s   2014-10-16 15:30:16.183035000 -0500
@@ -2,12 +2,14 @@
 add x1, x2, 800
 ldr q0, [x0, x2]
 add x3, x2, 1600
+   cmp x0, 784
 ldr q1, [x0, x1]
+   add x1, x0, 16
 fmla v0.2d, v1.2d, v2.2d
 str q0, [x0, x3]
-   add x0, x0, 16
-   cmp x0, 800
+   mov x0, x1
 bne .L140
  .LBE179:
-   subs w4, w4, #1
+   cmp w4, 1
+   sub w4, w4, #1
 bne .L139



Thanks,
Sebastian




Re: [PATCH 9/17] Initial KAsan support

2014-10-17 Thread Eric Botcazou
 The patch was slightly updated to take care of missing UBSan work
 (SANITIZE_FLOAT_DIVIDE, SANITIZE_FLOAT_CAST, SANITIZE_BOUNDS).

--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -868,6 +868,20 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   /* The -gsplit-dwarf option requires -gpubnames.  */
   if (opts->x_dwarf_split_debug_info)
     opts->x_debug_generate_pub_sections = 1;
+
+  /* Userspace and kernel ASan conflict with each other and with TSan.  */
+
+  if ((flag_sanitize & SANITIZE_USER_ADDRESS)
+      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS))
+    error_at (loc,
+              "-fsanitize=address is incompatible with "
+              "-fsanitize=kernel-address");
+
+  if ((flag_sanitize & SANITIZE_ADDRESS)
+      && (flag_sanitize & SANITIZE_THREAD))
+    error_at (loc,
+              "-fsanitize=address and -fsanitize=kernel-address "
+              "are incompatible with -fsanitize=thread");
 }

Why aren't you using opts->x_ here, like all the code just above?

-- 
Eric Botcazou
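
The conflict rule in the hunk quoted above can be sketched outside of GCC as a plain bitmask check. Everything here is an assumption made for illustration: the enum names, the bit values, and the choice to model the combined address flag as the union of the user and kernel bits. The sketch returns the message instead of calling error_at.

```c
#include <stddef.h>

/* Illustrative bit assignments; they are not copied from GCC.  */
enum
{
  SAN_USER_ADDRESS   = 1 << 0,
  SAN_KERNEL_ADDRESS = 1 << 1,
  SAN_THREAD         = 1 << 2,
  /* Assumption: either flavour of ASan counts as "address".  */
  SAN_ADDRESS        = SAN_USER_ADDRESS | SAN_KERNEL_ADDRESS
};

/* Returns a diagnostic string if FLAGS holds a conflicting combination,
   or NULL if the combination is acceptable.  */
static const char *
sanitizer_conflict (unsigned flags)
{
  if ((flags & SAN_USER_ADDRESS) && (flags & SAN_KERNEL_ADDRESS))
    return "-fsanitize=address is incompatible with "
           "-fsanitize=kernel-address";

  if ((flags & SAN_ADDRESS) && (flags & SAN_THREAD))
    return "-fsanitize=address and -fsanitize=kernel-address "
           "are incompatible with -fsanitize=thread";

  return NULL;
}
```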


Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-17 Thread Ramana Radhakrishnan
On Wed, Oct 15, 2014 at 5:29 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 On 15/10/14 14:00, Richard Biener wrote:


 Any comments and reviews welcome (I don't think that
 my maintainership covers enough to simply check this in
 without approval).

 Hi Richard,

 The match-and-simplify branch bootstrapped successfully on
 aarch64-none-linux-gnu FWIW.


What about regression tests ?

 Thanks,
 Kyrill




Re: [PATCH 9/17] Initial KAsan support

2014-10-17 Thread Yury Gribov

On 10/17/2014 11:27 AM, Eric Botcazou wrote:

The patch was slightly updated to take care of missing UBSan work
(SANITIZE_FLOAT_DIVIDE, SANITIZE_FLOAT_CAST, SANITIZE_BOUNDS).


--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -868,6 +868,20 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
   /* The -gsplit-dwarf option requires -gpubnames.  */
   if (opts->x_dwarf_split_debug_info)
     opts->x_debug_generate_pub_sections = 1;
+
+  /* Userspace and kernel ASan conflict with each other and with TSan.  */
+
+  if ((flag_sanitize & SANITIZE_USER_ADDRESS)
+      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS))
+    error_at (loc,
+              "-fsanitize=address is incompatible with "
+              "-fsanitize=kernel-address");
+
+  if ((flag_sanitize & SANITIZE_ADDRESS)
+      && (flag_sanitize & SANITIZE_THREAD))
+    error_at (loc,
+              "-fsanitize=address and -fsanitize=kernel-address "
+              "are incompatible with -fsanitize=thread");
  }

Why aren't you using opts->x_ here, like all the code just above?


Well, that's a backport of an ancient patch from trunk, so all credit goes 
there. And flag_sanitize is indeed handled differently from other 
compiler flags.


-Y


Re: [PATCH 9/17] Initial KAsan support

2014-10-17 Thread Eric Botcazou
 Well, that's a backport of an ancient patch from trunk, so all credit goes
 there. And flag_sanitize is indeed handled differently from other
 compiler flags.

Really curious to know why...

-- 
Eric Botcazou


Re: [PATCH][4/n] Merge from match-and-simplify, hook into fold-const.c

2014-10-17 Thread Richard Biener
On Thu, 16 Oct 2014, Sebastian Pop wrote:

 Richard Biener wrote:
  To give you an example how it looks like, the following code is
  generated for
  
  /* fold_negate_exprs convert - (~A) to A + 1.  */
  (simplify
   (negate (bit_not @0))
   (if (INTEGRAL_TYPE_P (type))
(plus @0 { build_int_cst (TREE_TYPE (@0), 1); } )))
  
  tree
  generic_simplify (enum tree_code code, tree type ATTRIBUTE_UNUSED, tree op0)
 
 I wonder why ATTRIBUTE_UNUSED is generated for used parameters.

I've added them for the initial patch set because without any patterns
defined (just 1/n and 2/n) only one of the parameters will be used.

Consider them removed again once we have enough patterns to make
bootstrap happy after that.

  {
 if ((op0 && TREE_SIDE_EFFECTS (op0)))
  return NULL_TREE;
switch (code)
  {
  ...
  case NEGATE_EXPR:
{
  switch (TREE_CODE (op0))
{
case BIT_NOT_EXPR:
  {
tree o20 = TREE_OPERAND (op0, 0);
  {
 /* #line 136 "/space/rguenther/src/svn/match-and-simplify/gcc/match.pd" */
 tree captures[2] ATTRIBUTE_UNUSED = {};
 
 Same here.
 Also, why do we allocate two elements when only captures[0] is used?

Good question - I'll have a look.

Thanks,
Richard.

captures[0] = o20;
 /* #line 135 "/space/rguenther/src/svn/match-and-simplify/gcc/match.pd" */
 if (INTEGRAL_TYPE_P (type))
   {
 if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Applying pattern match.pd:136, %s:%d\n", __FILE__, __LINE__);
 tree res_op0;
 res_op0 = captures[0];
 tree res_op1;
 res_op1 =  build_int_cst (TREE_TYPE (captures[0]), 1);
 return fold_build2 (PLUS_EXPR, type, res_op0, res_op1);
  }
  }
break;
  }
  ...
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
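
As a sanity check of the pattern quoted in this message, which rewrites - (~A) as A + 1 for integral types, the identity can be exercised in plain C. This is only a sketch with hypothetical function names; it uses unsigned arithmetic so that wraparound is well defined, and it says nothing about how the generated simplifier is tested in GCC.

```c
/* Left-hand side of the pattern: -(~a), computed with wrapping
   unsigned arithmetic.  */
static unsigned
negate_bit_not (unsigned a)
{
  return 0u - ~a;
}

/* Right-hand side of the pattern: a + 1.  */
static unsigned
plus_one (unsigned a)
{
  return a + 1u;
}
```

The two agree because ~a equals -a - 1 in two's complement, so negating it gives a + 1.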


[PATCHv4][Kasan] Allow to override Asan shadow offset from command line

2014-10-17 Thread Yury Gribov

Hi all,

On 09/29/2014 09:21 PM, Yury Gribov wrote:

Kasan developers have asked for an option to override the offset of the Asan
shadow memory region. This should simplify experimenting with memory
layouts on 64-bit architectures.


New patch which checks that -fasan-shadow-offset is only enabled for
-fsanitize=kernel-address. I (unfortunately) can't make this --param
because this can be a 64-bit value.

Bootstrapped and regtested on x64.


New patchset that adds strtoull to libiberty (a blind copy-paste of the 
already existing strtoul.c) and uses it to parse -fasan-shadow-offset 
(to avoid problems when compiling for a 64-bit target on a 32-bit host).
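
A minimal sketch of why a 64-bit parser is needed here, using the standard C99 strtoull from stdlib.h. The helper name parse_shadow_offset and its error convention are hypothetical; this is not the option handling code from the patch.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse ARG as an unsigned 64-bit offset.
   Base 0 lets strtoull accept decimal, octal and 0x-prefixed hex.
   Returns 0 and stores the value in *OUT on success, -1 otherwise.  */
static int
parse_shadow_offset (const char *arg, unsigned long long *out)
{
  char *end;

  errno = 0;
  unsigned long long value = strtoull (arg, &end, 0);

  /* Reject empty input, trailing junk, and out-of-range values.  */
  if (end == arg || *end != '\0' || errno == ERANGE)
    return -1;

  *out = value;
  return 0;
}
```

A value such as 0xdffffc0000000000 does not fit in a 32-bit unsigned long, which is the situation on a 32-bit host where strtoul alone would not be enough.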


Bootstrapped and regtested on x64.

-Y
From 0225b7878bbb5b803814646d089824d016316fef Mon Sep 17 00:00:00 2001
From: Yury Gribov y.gri...@samsung.com
Date: Thu, 16 Oct 2014 18:31:10 +0400
Subject: [PATCH 1/2] Add strtoull to libiberty.

2014-10-17  Yury Gribov  y.gri...@samsung.com

libiberty/
	* strtoull.c: New file.
---
 libiberty/strtoull.c |  119 ++
 1 file changed, 119 insertions(+)
 create mode 100644 libiberty/strtoull.c

diff --git a/libiberty/strtoull.c b/libiberty/strtoull.c
new file mode 100644
index 000..c92a4a3
--- /dev/null
+++ b/libiberty/strtoull.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (c) 2014 Regents of the University of California.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. [rescinded 22 July 1999]
+ * 4. Neither the name of the University nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+#ifdef HAVE_LIMITS_H
+#include <limits.h>
+#endif
+#ifdef HAVE_SYS_PARAM_H
+#include <sys/param.h>
+#endif
+#include <errno.h>
+#ifdef NEED_DECLARATION_ERRNO
+extern int errno;
+#endif
+#if 0
+#include <stdlib.h>
+#endif
+#include "ansidecl.h"
+#include "safe-ctype.h"
+
+#ifdef HAVE_LONG_LONG
+
+#ifndef ULLONG_MAX
+#define ULLONG_MAX ((unsigned long long)(~0L)) /* 0x */
+#endif
+
+/*
+ * Convert a string to an unsigned long long integer.
+ *
+ * Ignores `locale' stuff.  Assumes that the upper and lower case
+ * alphabets and digits are each contiguous.
+ */
+unsigned long long
+strtoull(const char *nptr, char **endptr, register int base)
+{
+	register const char *s = nptr;
+	register unsigned long long acc;
+	register int c;
+	register unsigned long long cutoff;
+	register int neg = 0, any, cutlim;
+
+	/*
+	 * See strtol for comments as to the logic used.
+	 */
+	do {
+		c = *s++;
+	} while (ISSPACE(c));
+	if (c == '-') {
+		neg = 1;
+		c = *s++;
+	} else if (c == '+')
+		c = *s++;
+	if ((base == 0 || base == 16) &&
+	    c == '0' && (*s == 'x' || *s == 'X')) {
+		c = s[1];
+		s += 2;
+		base = 16;
+	}
+	if (base == 0)
+		base = c == '0' ? 8 : 10;
+	cutoff = (unsigned long long)ULLONG_MAX / (unsigned long long)base;
+	cutlim = (unsigned long long)ULLONG_MAX % (unsigned long long)base;
+	for (acc = 0, any = 0;; c = *s++) {
+		if (ISDIGIT(c))
+			c -= '0';
+		else if (ISALPHA(c))
+			c -= ISUPPER(c) ? 'A' - 10 : 'a' - 10;
+		else
+			break;
+		if (c >= base)
+			break;
+		if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim))
+			any = -1;
+		else {
+			any = 1;
+			acc *= base;
+			acc += c;
+		}
+	}
+	if (any < 0) {
+		acc = ULLONG_MAX;
+		errno = ERANGE;
+	} else if (neg)
+		acc = -acc;
+	if (endptr != 0)
+		*endptr = (char *) (any ? s - 1 : nptr);
+	return (acc);
+}
+
+#endif /* ifdef HAVE_LONG_LONG */
-- 
1.7.9.5
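
The overflow test in the loop above (the cutoff and cutlim comparison) can be isolated as follows. This is a sketch under the assumption that the digit has already been validated against the base; the function name is hypothetical and does not appear in the patch.

```c
#include <limits.h>

/* Returns 1 if computing acc * base + digit would exceed ULLONG_MAX,
   0 otherwise.  cutoff is the largest value that can still be
   multiplied by base; cutlim is the largest digit that may follow
   when acc is exactly cutoff.  */
static int
digit_would_overflow (unsigned long long acc, unsigned digit, unsigned base)
{
  const unsigned long long cutoff = ULLONG_MAX / base;
  const unsigned cutlim = (unsigned) (ULLONG_MAX % base);

  return acc > cutoff || (acc == cutoff && digit > cutlim);
}
```

Checking before the multiplication avoids relying on wraparound to detect that the accumulated value no longer fits.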

From 6c9ad20bdcfc0fbf7ccb8e2700ef7dce52a34c64 Mon Sep 17 00:00:00 2001
From: Yury Gribov y.gri...@samsung.com
Date: Fri, 29 Aug 2014 11:58:03 +0400
Subject: [PATCH 2/2] 

Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-17 Thread Richard Biener
On Thu, 16 Oct 2014, Sebastian Pop wrote:

 Richard Biener wrote:
  
  I have posted 5 patches as part of a larger series to merge
  (parts) from the match-and-simplify branch.  While I think
  there was overall consensus that the idea behind the project
  is sound there are technical questions left for how the
  thing should look in the end.  I've raised them in 3/n
  which is the only patch of the series that contains any
  patterns sofar.
  
  To re-iterate here (as I expect most people will only look
  at [0/n] patches ;)), the question is whether we are fine
  with making fold-const (thus fold_{unary,binary,ternary})
  not handle some cases it handles currently.
 
 I have tested on aarch64 all the code in the match-and-simplify against trunk
 as of the last merge at r216315:
 
 2014-10-16  Richard Biener  rguent...@suse.de
 
 Merge from trunk r216235 through r216315.
 
 Overall, I see a lot more perf regressions (about 2/3 of the tests) than
 improvements (1/3 of the tests).  I will try to reduce tests.

Note that the branch goes much further in exercising the machinery
than I want to merge at this point (that applies mostly to all
passes using the SSA propagator such as CCP and VRP and passes
exercising value-numbering - FRE and PRE).

It may also simply show the effect of now folding all statements
from tree-ssa-forwprop.c.  I have yet to investigate the testsuite
fallout of [1/n] to [5/n] - testresults have been very noisy lately
due to the C11 change and now ICF.

 For instance, saxpy regresses at -O3 on aarch64:
 
 void saxpy(double* x, double* y, double* z) {
 int i=0;
 for (i = 0 ; i < ARRAY_SIZE; i++) {
 z[i] = x[i] + scalar*y[i];
 }
 }
 
 $ diff -u base.s mas.s
 --- base.s  2014-10-16 15:30:15.35143 -0500
 +++ mas.s   2014-10-16 15:30:16.183035000 -0500
 @@ -2,12 +2,14 @@
 add x1, x2, 800
 ldr q0, [x0, x2]
 add x3, x2, 1600
 +   cmp x0, 784
 ldr q1, [x0, x1]
 +   add x1, x0, 16
 fmla v0.2d, v1.2d, v2.2d
 str q0, [x0, x3]
 -   add x0, x0, 16
 -   cmp x0, 800
 +   mov x0, x1
 bne .L140
  .LBE179:
 -   subs w4, w4, #1
 +   cmp w4, 1
 +   sub w4, w4, #1
 bne .L139

I don't understand AARCH64 assembly very well but the above looks like
RTL issues and/or IVOPTs issues?

Thanks for doing performance measurements.

Richard.


Re: [PATCH][match-and-simplify] More ternary commutative ops, canonicalize operand order before generic_simplify

2014-10-17 Thread Richard Biener
On Thu, 16 Oct 2014, Jeff Law wrote:

 On 10/16/14 05:06, Richard Biener wrote:
  
  This patch (also applicable to trunk) makes us canonicalize operand
  order for comparisons at the same time we canonicalize other
  operand order, in particular before dispatching to generic_simplify.
  It also adds operand canonicalization to ternary ops and adds
  FMA_EXPR and DOT_PROD_EXPR to the list of ternary commutative ops.
  
  Bootstrap and regtest running on match-and-simplify branch and
  x86_64-unknown-linux-gnu.
  
  Richard.
  
  2014-10-16  Richard Biener  rguent...@suse.de
  
  * fold-const.c (fold_comparison): Remove redundant constant
  folding and operand swapping.
  (fold_binary_loc): Do comparison operand swapping here,
  dispatch to generic_simplify after operand canonicalization.
  (fold_ternary_loc): Canonicalize operand order for
  commutative ternary operations.
  * tree.c (commutative_ternary_tree_code): Add DOT_PROD_EXPR
  and FMA_EXPR.
 Seems like something we'd want for the trunk independent of the
 match-and-simplify work

Yes, I am going to test and apply it there today.

Thanks,
Richard.


Re: [PATCH] Fix range test optimization (PR tree-optimization/63302)

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Jakub Jelinek wrote:

 Hi!
 
 This patch fixes PR63302 by using the proper predicate to test whether
 an INTEGER_CST is not a power of 2.
 While the issue has been originally reported for PA, the testcase
 shows the same issue on x86_64 (just with __int128 instead of long long).
 
 Bootstrapped/regtested on x86_64-linux and i686-linux on both
 4.9 branch and trunk, ok for trunk/4.9?
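
For orientation, a plain-integer analogue of a power-of-two predicate is sketched below. The helper name is hypothetical and works on uint64_t; it does not reproduce GCC's tree-level integer_pow2p or tree_log2, it only illustrates the kind of test the range optimization depends on.

```c
#include <stdint.h>

/* Returns 1 if X has exactly one bit set, 0 otherwise (including
   for X == 0).  Clearing the lowest set bit with x & (x - 1) leaves
   zero only when there was a single bit to begin with.  */
static int
is_pow2_u64 (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```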

Ok.

Thanks,
Richard.

 2014-10-16  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/63302
   * tree-ssa-reassoc.c (optimize_range_tests_xor,
   optimize_range_tests_diff): Use !integer_pow2p () instead of
   tree_log2 () < 0.
 
   * gcc.c-torture/execute/pr63302.c: New test.
 
 --- gcc/tree-ssa-reassoc.c.jj 2014-04-22 15:05:46.0 +0200
 +++ gcc/tree-ssa-reassoc.c2014-10-15 13:33:12.501190909 +0200
 @@ -2198,7 +2198,7 @@ optimize_range_tests_xor (enum tree_code
lowxor = fold_binary (BIT_XOR_EXPR, type, lowi, lowj);
if (lowxor == NULL_TREE || TREE_CODE (lowxor) != INTEGER_CST)
  return false;
 -  if (tree_log2 (lowxor) < 0)
 +  if (!integer_pow2p (lowxor))
  return false;
highxor = fold_binary (BIT_XOR_EXPR, type, highi, highj);
if (!tree_int_cst_equal (lowxor, highxor))
 @@ -2245,7 +2245,7 @@ optimize_range_tests_diff (enum tree_cod
tem1 = fold_binary (MINUS_EXPR, type, lowj, lowi);
if (tem1 == NULL_TREE || TREE_CODE (tem1) != INTEGER_CST)
  return false;
 -  if (tree_log2 (tem1) < 0)
 +  if (!integer_pow2p (tem1))
  return false;
  
mask = fold_build1 (BIT_NOT_EXPR, type, tem1);
 --- gcc/testsuite/gcc.c-torture/execute/pr63302.c.jj  2014-10-15 13:33:57.075343573 +0200
 +++ gcc/testsuite/gcc.c-torture/execute/pr63302.c 2014-10-15 13:33:44.0 +0200
 @@ -0,0 +1,60 @@
 +/* PR tree-optimization/63302 */
 +
 +#ifdef __SIZEOF_INT128__
 +#if __SIZEOF_INT128__ * __CHAR_BIT__ == 128
 +#define USE_INT128
 +#endif
 +#endif
 +#if __SIZEOF_LONG_LONG__ * __CHAR_BIT__ == 64
 +#define USE_LLONG
 +#endif
 +
 +#ifdef USE_INT128
 +__attribute__((noinline, noclone)) int
 +foo (__int128 x)
 +{
 +  __int128 v = x & (((__int128) -1 << 63) | 0x7ff);
 + 
 +  return v == 0 || v == ((__int128) -1 << 63);
 +}
 +#endif
 +
 +#ifdef USE_LLONG
 +__attribute__((noinline, noclone)) int
 +bar (long long x)
 +{
 +  long long v = x & (((long long) -1 << 31) | 0x7ff);
 + 
 +  return v == 0 || v == ((long long) -1 << 31);
 +}
 +#endif
 +
 +int
 +main ()
 +{
 +#ifdef USE_INT128
 +  if (foo (0) != 1
 +  || foo (1) != 0
 +  || foo (0x800) != 1
 +  || foo (0x801) != 0
 +  || foo ((__int128) 1 << 63) != 0
 +  || foo ((__int128) -1 << 63) != 1
 +  || foo (((__int128) -1 << 63) | 1) != 0
 +  || foo (((__int128) -1 << 63) | 0x800) != 1
 +  || foo (((__int128) -1 << 63) | 0x801) != 0)
 +__builtin_abort ();
 +#endif
 +#ifdef USE_LLONG
 +  if (bar (0) != 1
 +  || bar (1) != 0
 +  || bar (0x800) != 1
 +  || bar (0x801) != 0
 +  || bar (1LL << 31) != 0
 +  || bar (-1LL << 31) != 1
 +  || bar ((-1LL << 31) | 1) != 0
 +  || bar ((-1LL << 31) | 0x800) != 1
 +  || bar ((-1LL << 31) | 0x801) != 0)
 +__builtin_abort ();
 +#endif
 +  return 0;
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PATCH] Optimize range tests into bittests (PR tree-optimization/63464)

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Jakub Jelinek wrote:

 Hi!
 
 This patch optimizes some range tests into bit tests, like we
 already do for switches in emit_case_bit_tests, but this time
 for a series of ored equality or anded non-equality comparisons.
 If at least 3 comparisons (after the contiguous range, xor and
 diff + xor optimizations are performed) are needed and range of
 values is at most number of bits in a word, we instead check
 whether the operand is >= smallest and <= highest number in the
 range and if it is, test (word) 1 << operand against a bitmask.
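
A hand-written sketch of the transformation described above, for an arbitrary example set {3, 5, 6, 10}. The function names and the set are made up for illustration, and subtracting the lower bound before shifting is a simplification of this sketch, not necessarily what the patch emits.

```c
/* Series of ored equality comparisons, as written by the user.  */
static int
in_set_naive (unsigned x)
{
  return x == 3 || x == 5 || x == 6 || x == 10;
}

/* Same test as one range check plus one bit test.  */
static int
in_set_bittest (unsigned x)
{
  const unsigned low = 3, high = 10;
  /* Bit i is set when low + i belongs to the set:
     3 -> bit 0, 5 -> bit 2, 6 -> bit 3, 10 -> bit 7.  */
  const unsigned long mask = (1ul << 0) | (1ul << 2) | (1ul << 3) | (1ul << 7);

  if (x < low || x > high)
    return 0;
  return ((1ul << (x - low)) & mask) != 0;
}
```

The range check keeps the shift count below the word width, so the shift itself is always well defined.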
 
 Bootstrapped/regtested on x86_64-linux and i686-linux (for i686-linux
 I had to go back to r216304 because there are multiple ICF related
 bootstrap issues on i686-linux), ok for trunk?

Ok.

Thanks,
Richard.

 2014-10-16  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/63464
   * gimple.h (gimple_seq_discard): New prototype.
   * gimple.c: Include stringpool.h and tree-ssanames.h.
   (gimple_seq_discard): New function.
   * optabs.h (lshift_cheap_p): New prototype.
   * optabs.c (lshift_cheap_p): New function, moved from...
   * tree-switch-conversion.c (lshift_cheap_p): ... here.
   * tree-ssa-reassoc.c: Include gimplify.h and optabs.h.
   (reassoc_branch_fixups): New variable.
   (update_range_test): Add otherrangep and seq arguments.
   Unshare exp.  If otherrange is NULL, use for other ranges
   array of pointers pointed by otherrangep instead.
   Emit seq before gimplified statements for tem.
   (optimize_range_tests_diff): Adjust update_range_test
   caller.
   (optimize_range_tests_xor): Likewise.  Fix up comment.
   (extract_bit_test_mask, optimize_range_tests_to_bit_test): New
   functions.
   (optimize_range_tests): Adjust update_range_test caller.
   Call optimize_range_tests_to_bit_test.
   (branch_fixup): New function.
   (execute_reassoc): Call branch_fixup.
 
   * gcc.dg/torture/pr63464.c: New test.
   * gcc.dg/tree-ssa/reassoc-37.c: New test.
   * gcc.dg/tree-ssa/reassoc-38.c: New test.
 
 --- gcc/gimple.h.jj   2014-10-15 12:28:06.428498079 +0200
 +++ gcc/gimple.h  2014-10-15 13:43:18.967491428 +0200
 @@ -1269,9 +1269,10 @@ extern bool gimple_asm_clobbers_memory_p
  extern void dump_decl_set (FILE *, bitmap);
  extern bool nonfreeing_call_p (gimple);
  extern bool infer_nonnull_range (gimple, tree, bool, bool);
 -extern void sort_case_labels (vec<tree> );
 -extern void preprocess_case_label_vec_for_gimple (vec<tree> , tree, tree *);
 -extern void gimple_seq_set_location (gimple_seq , location_t);
 +extern void sort_case_labels (vec<tree>);
 +extern void preprocess_case_label_vec_for_gimple (vec<tree>, tree, tree *);
 +extern void gimple_seq_set_location (gimple_seq, location_t);
 +extern void gimple_seq_discard (gimple_seq);
  
  /* Formal (expression) temporary table handling: multiple occurrences of
 the same scalar expression are evaluated into the same temporary.  */
 --- gcc/gimple.c.jj   2014-10-15 12:28:19.917235900 +0200
 +++ gcc/gimple.c  2014-10-15 13:43:18.970491368 +0200
 @@ -47,6 +47,8 @@ along with GCC; see the file COPYING3.
  #include "demangle.h"
  #include "langhooks.h"
  #include "bitmap.h"
 +#include "stringpool.h"
 +#include "tree-ssanames.h"
  
  
  /* All the tuples have their operand vector (if present) at the very bottom
 @@ -2826,3 +2828,19 @@ gimple_seq_set_location (gimple_seq seq,
    for (gimple_stmt_iterator i = gsi_start (seq); !gsi_end_p (i); gsi_next (&i))
  gimple_set_location (gsi_stmt (i), loc);
  }
 +
 +/* Release SSA_NAMEs in SEQ as well as the GIMPLE statements.  */
 +
 +void
 +gimple_seq_discard (gimple_seq seq)
 +{
 +  gimple_stmt_iterator gsi;
 +
 +  for (gsi = gsi_start (seq); !gsi_end_p (gsi); )
 +{
 +  gimple stmt = gsi_stmt (gsi);
 +  gsi_remove (&gsi, true);
 +  release_defs (stmt);
 +  ggc_free (stmt);
 +}
 +}
 --- gcc/optabs.h.jj   2014-10-15 12:28:06.479497088 +0200
 +++ gcc/optabs.h  2014-10-15 13:43:18.970491368 +0200
 @@ -538,5 +538,6 @@ extern void gen_satfractuns_conv_libfunc
 enum machine_mode,
 enum machine_mode);
  extern void init_tree_optimization_optabs (tree);
 +extern bool lshift_cheap_p (bool);
  
  #endif /* GCC_OPTABS_H */
 --- gcc/optabs.c.jj   2014-10-15 12:28:06.433497982 +0200
 +++ gcc/optabs.c  2014-10-15 13:43:18.969491387 +0200
 @@ -8624,4 +8624,31 @@ get_best_mem_extraction_insn (extraction
  struct_bits, field_mode);
  }
  
 +/* Determine whether 1 << x is relatively cheap in word_mode.  */
 +
 +bool
 +lshift_cheap_p (bool speed_p)
 +{
 +  /* FIXME: This should be made target dependent via this "this_target"
 + mechanism, similar to e.g. can_copy_init_p in gcse.c.  */
 +  static bool init[2] = { false, false };
 +  static bool cheap[2] = { true, true };
 +
 +  /* If the targer has no lshift in word_mode, the 

Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Ramana Radhakrishnan wrote:

 On Wed, Oct 15, 2014 at 5:29 PM, Kyrill Tkachov kyrylo.tkac...@arm.com 
 wrote:
 
  On 15/10/14 14:00, Richard Biener wrote:
 
 
  Any comments and reviews welcome (I don't think that
  my maintainership covers enough to simply check this in
  without approval).
 
  Hi Richard,
 
  The match-and-simplify branch bootstrapped successfully on
  aarch64-none-linux-gnu FWIW.
 
 
 What about regression tests ?

Note the branch isn't regression free on x86_64 either.  The branch
does more than I want to merge to trunk (and it also retains all
folding code I added patterns for).  I've gone farther there to
explore whether it will end up working in the end and what kind
of features the IL and the APIs need.

I've pasted testsuite results on x86_64 below for rev. 216324
which is based on trunk rev. 216315 which unfortunately has
lots of regressions on its own.

This is why I want to restrict the effect of the machinery to
fold (), fold_stmt () and tree-ssa-forwprop.c for the moment
and merge individual patterns (well, maybe in small groups)
separately to allow for easy bi-section.

I suppose I should push the most visible change to trunk first,
namely tree-ssa-forwprop.c folding all statements via fold_stmt
after the merge.  I suspect this alone can have some odd effects
like the sub + cmp fusing.  That would be sth like the patch
attached below.

Richard.

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 216258)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -54,6 +54,8 @@ along with GCC; see the file COPYING3.
 #include "tree-ssa-propagate.h"
 #include "tree-ssa-dom.h"
 #include "builtins.h"
+#include "tree-cfgcleanup.h"
+#include "tree-into-ssa.h"
 
 /* This pass propagates the RHS of assignment statements into use
sites of the LHS of the assignment.  It's basically a specialized
@@ -3586,6 +3588,8 @@ simplify_mult (gimple_stmt_iterator *gsi
 
   return false;
 }
+
+
 /* Main entry point for the forward propagation and statement combine
optimizer.  */
 
@@ -3626,6 +3630,40 @@ pass_forwprop::execute (function *fun)
 
   cfg_changed = false;
 
+  /* Combine stmts with the stmts defining their operands.  Do that
+     in an order that guarantees visiting SSA defs before SSA uses.  */
+  int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (fun));
+  int postorder_num = inverted_post_order_compute (postorder);
+  for (int i = 0; i < postorder_num; ++i)
+    {
+      bb = BASIC_BLOCK_FOR_FN (fun, postorder[i]);
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+           !gsi_end_p (gsi); gsi_next (&gsi))
+        {
+          gimple stmt = gsi_stmt (gsi);
+          gimple orig_stmt = stmt;
+
+          if (fold_stmt (&gsi))
+            {
+              stmt = gsi_stmt (gsi);
+              if (maybe_clean_or_replace_eh_stmt (orig_stmt, stmt)
+                  && gimple_purge_dead_eh_edges (bb))
+                cfg_changed = true;
+              update_stmt (stmt);
+            }
+        }
+    }
+  free (postorder);
+
+  /* ???  Code below doesn't expect non-renamed VOPs and the above
+     doesn't keep virtual operand form up-to-date.  */
+  if (cfg_changed)
+    {
+      cleanup_tree_cfg ();
+      cfg_changed = false;
+    }
+  update_ssa (TODO_update_ssa_only_virtuals);
+
   FOR_EACH_BB_FN (bb, fun)
 {
   gimple_stmt_iterator gsi;


[ARM] Fix DWARF unwinding breakage

2014-10-17 Thread Eric Botcazou
Hi,

some OSes, for example VxWorks 6, still use DWARF unwinding on the ARM, which 
means that they use __builtin_eh_return (EABI unwinding doesn't).  The builtin 
is implemented by means of {arm|thumb}_set_return_address, which can generate 
a store if LR has been stored on function entry.  The problem is that, if this 
store is FP-based, it is not seen by the RTL DSE pass as being consumed by the 
SP-based load at the same address on function exit.

That's by design in the RTL DSE pass: FP and SP are never substituted for each 
other by cselib, see for example this comment:

  /* The only thing that we are not willing to do (this
 is requirement of dse and if others potential uses
 need this function we should add a parm to control
 it) is that we will not substitute the
 STACK_POINTER_REGNUM, FRAME_POINTER or the
 HARD_FRAME_POINTER.

 These expansions confuses the code that notices that
 stores into the frame go dead at the end of the
 function and that the frame is not effected by calls
 to subroutines.  If you allow the
 STACK_POINTER_REGNUM substitution, then dse will
 think that parameter pushing also goes dead which is
 wrong.  If you allow the FRAME_POINTER or the
 HARD_FRAME_POINTER then you lose the opportunity to
 make the frame assumptions.  */
  if (regno == STACK_POINTER_REGNUM
  || regno == FRAME_POINTER_REGNUM
  || regno == HARD_FRAME_POINTER_REGNUM
  || regno == cfa_base_preserved_regno)
return orig;

so a FP-based store and a SP-based load are never seen as a RAW dependency.

This nevertheless used to work because the blockage insn emitted by the RTL 
epilogue was acting as a wild load but this got broken by Richard's patch:

2014-03-11  Richard Sandiford  rdsandif...@googlemail.com

* builtins.c (expand_builtin_setjmp_receiver): Use and clobber
hard_frame_pointer_rtx.
* cse.c (cse_insn): Remove volatile check.
* cselib.c (cselib_process_insn): Likewise.
* dse.c (scan_insn): Likewise.

which removed the wild load trick.  This is visible at -O2 for:

void
foo (void *c1, void *t1, void *ra)
{
long offset = uw_install_context_1 (c1, t1);
void *handler = __builtin_frob_return_addr (ra);
__builtin_unwind_init ();
__builtin_eh_return (offset, handler);
}

The attached patch fixes the breakage by marking the stores as frame related.

Tested on ARM/VxWorks, OK for mainline and 4.9 branch?


2014-10-17  Eric Botcazou  ebotca...@adacore.com

* config/arm/arm.c (arm_set_return_address): Mark the store as frame
related, if any.
(thumb_set_return_address): Likewise.


-- 
Eric Botcazou
Index: config/arm/arm.c
===
--- config/arm/arm.c	(revision 216252)
+++ config/arm/arm.c	(working copy)
@@ -28952,7 +28952,11 @@ arm_set_return_address (rtx source, rtx
 
 	  addr = plus_constant (Pmode, addr, delta);
 	}
-  emit_move_insn (gen_frame_mem (Pmode, addr), source);
+  /* The store needs to be marked as frame related in order to prevent
+	 DSE from deleting it as dead if it is based on fp.  */
+  rtx insn = emit_move_insn (gen_frame_mem (Pmode, addr), source);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (Pmode, LR_REGNUM));
 }
 }
 
@@ -29004,7 +29008,11 @@ thumb_set_return_address (rtx source, rt
   else
 	addr = plus_constant (Pmode, addr, delta);
 
-  emit_move_insn (gen_frame_mem (Pmode, addr), source);
+  /* The store needs to be marked as frame related in order to prevent
+	 DSE from deleting it as dead if it is based on fp.  */
+  rtx insn = emit_move_insn (gen_frame_mem (Pmode, addr), source);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (Pmode, LR_REGNUM));
 }
   else
 emit_move_insn (gen_rtx_REG (Pmode, LR_REGNUM), source);


Pasto in is_old_name: s/new_ssa_names/old_ssa_names/g

2014-10-17 Thread Marc Glisse

Hello,

this trivial fix was pre-approved by Richard, I regtested and committed 
it.


2014-10-17  Marc Glisse  marc.gli...@inria.fr

* tree-into-ssa.c (is_old_name): Replace new with old.

--- tree-into-ssa.c (revision 216366)
+++ tree-into-ssa.c (working copy)
@@ -572,23 +572,23 @@ set_livein_block (tree var, basic_block
     info->need_phi_state = NEED_PHI_STATE_MAYBE;
 }


 /* Return true if NAME is in OLD_SSA_NAMES.  */

 static inline bool
 is_old_name (tree name)
 {
   unsigned ver = SSA_NAME_VERSION (name);
-  if (!new_ssa_names)
+  if (!old_ssa_names)
     return false;
-  return (ver < SBITMAP_SIZE (new_ssa_names)
+  return (ver < SBITMAP_SIZE (old_ssa_names)
           && bitmap_bit_p (old_ssa_names, ver));
 }


 /* Return true if NAME is in NEW_SSA_NAMES.  */

 static inline bool
 is_new_name (tree name)
 {
   unsigned ver = SSA_NAME_VERSION (name);


--
Marc Glisse
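
To illustrate why the pasto matters, here is a small self-contained sketch. It does not use GCC's sbitmap API: struct bitset and every function name below are hypothetical stand-ins. The point is only that the null check and the size check must refer to the same set that is then queried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an sbitmap: a small fixed-capacity bit set.  */
struct bitset
{
  unsigned nbits;          /* number of valid bit positions */
  unsigned char bytes[8];  /* storage for up to 64 bits */
};

static void
bitset_set (struct bitset *set, unsigned ver)
{
  assert (ver < set->nbits);
  set->bytes[ver / 8] |= (unsigned char) (1u << (ver % 8));
}

static bool
bitset_test (const struct bitset *set, unsigned ver)
{
  return ((set->bytes[ver / 8] >> (ver % 8)) & 1u) != 0;
}

/* Fixed form: every guard uses the set that is actually queried.  */
static bool
is_old_name (const struct bitset *old_names, unsigned ver)
{
  if (!old_names)
    return false;
  return ver < old_names->nbits && bitset_test (old_names, ver);
}

/* Pasto form: the guards look at NEW_NAMES, the query at OLD_NAMES.  */
static bool
is_old_name_pasto (const struct bitset *old_names,
                   const struct bitset *new_names, unsigned ver)
{
  if (!new_names)
    return false;
  return ver < new_names->nbits && bitset_test (old_names, ver);
}
```

With an old set of 16 bits and a new set of 8 bits, the pasto form reports that version 12 is not an old name even when its bit is set. In the opposite case, where the new set is the larger one, it would read past the end of the old set.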


[Ada] Missing inheritance of pragma Default_Initial_Condition

2014-10-17 Thread Arnaud Charlet
This patch modifies the inheritance of all attributes related to pragma
Default_Initial_Condition to account for a case where the full view of
a private type derives from another private type.


-- Source --


--  parent.ads

package Parent is
   type Parent_Typ is private
     with Default_Initial_Condition => False;
private
   type Parent_Typ is null record;
end Parent;

--  derivation.ads

with Parent; use Parent;

package Derivation is
   type Derivation_Typ is private;
private
   type Derivation_Typ is new Parent_Typ;
end Derivation;

--  derivation_check.adb

with Ada.Assertions; use Ada.Assertions;
with Ada.Text_IO;    use Ada.Text_IO;
with Derivation;     use Derivation;

procedure Derivation_Check is
begin
   declare
      Obj : Derivation_Typ;
   begin
      Put_Line ("ERROR: Default_Initial_Condition not triggered");
   end;
exception
   when Assertion_Error =>
      Put_Line ("OK");
   when others =>
      Put_Line ("ERROR: expected Assertion_Error");
end Derivation_Check;


-- Compilation and output --


$ gnatmake -q -gnata derivation_check.adb
$ ./derivation_check
OK

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Hristian Kirtchev  kirtc...@adacore.com

* sem_ch3.adb (Build_Derived_Record_Type): Remove the propagation
of all attributes related to pragma Default_Initial_Condition.
(Build_Derived_Type): Propagation of all attributes related
to pragma Default_Initial_Condition.
(Process_Full_View): Account for the case where the full view derives
from another private type and propagate the attributes related
to pragma Default_Initial_Condition to the private view.
(Propagate_Default_Init_Cond_Attributes): New routine.
* sem_util.adb: Alphabetize various routines.
(Build_Default_Init_Cond_Call): Use an unchecked type conversion
when calling the default initial condition procedure of a private type.
(Build_Default_Init_Cond_Procedure_Declaration): Prevent
the generation of multiple default initial condition procedures.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 216367)
+++ sem_ch3.adb (working copy)
@@ -650,6 +650,17 @@
--  present. If errors are found, error messages are posted, and the
--  Real_Range_Specification of Def is reset to Empty.
 
+   procedure Propagate_Default_Init_Cond_Attributes
+ (From_Typ : Entity_Id;
+  To_Typ   : Entity_Id;
+  Parent_To_Derivation : Boolean := False;
+  Private_To_Full_View : Boolean := False);
+   --  Subsidiary to routines Build_Derived_Type and Process_Full_View. Inherit
+   --  all attributes related to pragma Default_Initial_Condition from From_Typ
+   --  to To_Typ. Flag Parent_To_Derivation should be set when the context is
+   --  the creation of a derived type. Flag Private_To_Full_View should be set
+   --  when processing both views of a private type.
+
procedure Record_Type_Declaration
  (T: Entity_Id;
   N: Node_Id;
@@ -8546,23 +8557,6 @@
   end if;
 
   Check_Function_Writable_Actuals (N);
-
-  --  Propagate the attributes related to pragma Default_Initial_Condition
-  --  from the parent type to the private extension. A derived type always
-  --  inherits the default initial condition flag from the parent type. If
-  --  the derived type carries its own Default_Initial_Condition pragma,
-  --  the flag is later reset in Analyze_Pragma. Note that both flags are
-  --  mutually exclusive.
-
-  if Has_Inherited_Default_Init_Cond (Parent_Type)
-or else Present (Get_Pragma
-  (Parent_Type, Pragma_Default_Initial_Condition))
-  then
- Set_Has_Inherited_Default_Init_Cond (Derived_Type);
-
-  elsif Has_Default_Init_Cond (Parent_Type) then
- Set_Has_Default_Init_Cond (Derived_Type);
-  end if;
end Build_Derived_Record_Type;
 

@@ -8680,6 +8674,18 @@
  Set_First_Rep_Item (Derived_Type, First_Rep_Item (Parent_Type));
   end if;
 
+  --  Propagate the attributes related to pragma Default_Initial_Condition
+  --  from the parent type to the private extension. A derived type always
+  --  inherits the default initial condition flag from the parent type. If
+  --  the derived type carries its own Default_Initial_Condition pragma,
+  --  the flag is later reset in Analyze_Pragma. Note that both flags are
+  --  mutually exclusive.
+
+  Propagate_Default_Init_Cond_Attributes
+(From_Typ => Parent_Type,
+ To_Typ   => Derived_Type,
+ Parent_To_Derivation => True);
+
   --  If the parent type has delayed rep aspects, then mark the derived
   --  type as possibly inheriting a delayed rep aspect.
 
@@ -10008,6 +10014,401 @@

Re: [fortran,patch] Handle infinities and NaNs in intrinsics code generation

2014-10-17 Thread Tobias Burnus
Hi FX,

FX wrote:
 After the compile-time simplification, this patch fixes the handling of
 special values (infinities and NaNs) by intrinsics EXPONENT, FRACTION,
 SPACING, RRSPACING & SET_EXPONENT

 Bootstrapped and regtested on x86_64-linux.
 OK to commit?

Looks good to me. Thanks for taking care of F2003's IEEE support.

Tobias

PS: You might want to browse through the current (F2008 + corrigenda
+ first F2015 additions) draft at http://j3-fortran.org/doc/year/14/14-007r2.pdf

See especially the list at the beginning under the item
"Changes to the intrinsic modules IEEE_ARITHMETIC, IEEE_EXCEPTIONS, and
IEEE_FEATURES for conformance with ISO/IEC/IEEE 60559:2011: [...]"
and then later in that file.

Everything which is in the draft is very likely to be in the final version but
of course not guaranteed to be so.


[Ada] Ensure record type equality treated correctly for codepeer

2014-10-17 Thread Arnaud Charlet
This is an internal change that does not affect the compiler, but fixes
a problem in which a record comparison was not properly expanded. The
compiler back end handled this, but it blew up codepeer. No further
test required.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* exp_ch4.adb (Expand_N_Op_Eq): Make sure we deal with the
implementation base type.
* sinfo.ads: Add a note for N_Op_Eq and N_Op_Ne that record
operands are always expanded out into component comparisons.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 216367)
+++ exp_ch4.adb (working copy)
@@ -7152,8 +7152,11 @@
  return;
   end if;
 
-  Typl := Base_Type (Typl);
+  --  Now get the implementation base type (note that plain Base_Type here
+  --  might lead us back to the private type, which is not what we want!)
 
+  Typl := Implementation_Base_Type (Typl);
+
   --  Equality between variant records results in a call to a routine
   --  that has conditional tests of the discriminant value(s), and hence
   --  violates the No_Implicit_Conditionals restriction.
Index: sinfo.ads
===
--- sinfo.ads   (revision 216367)
+++ sinfo.ads   (working copy)
@@ -4246,6 +4246,11 @@
   --  point operands if the Treat_Fixed_As_Integer flag is set and will
   --  thus treat these nodes in identical manner, ignoring small values.
 
+  --  Note on equality/inequality tests for records. In the expanded tree,
+  --  record comparisons are always expanded to be a series of component
+  --  comparisons, so the back end will never see an equality or inequality
+  --  operation with operands of a record type.
+
   --  Note on overflow handling: When the overflow checking mode is set to
   --  MINIMIZED or ELIMINATED, nodes for signed arithmetic operations may
   --  be modified to use a larger type for the operands and result. In


[Ada] Make System.Atomic_Counters available to user applications

2014-10-17 Thread Arnaud Charlet
The system unit System.Atomic_Counters, which provides an atomic
counter type along with increment, decrement and test operations,
is now available to user programs.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* gnat_rm.texi: Document System.Atomic_Counters.
* impunit.adb: Add System.Atomic_Counters (s-atocou.ads) to the
list of user-accessible units added as children of System.
* s-atocou.ads: Update comment.

Index: gnat_rm.texi
===
--- gnat_rm.texi(revision 216367)
+++ gnat_rm.texi(working copy)
@@ -661,6 +661,7 @@
 * Interfaces.VxWorks.IO (i-vxwoio.ads)::
 * System.Address_Image (s-addima.ads)::
 * System.Assertions (s-assert.ads)::
+* System.Atomic_Counters (s-atocou.ads)::
 * System.Memory (s-memory.ads)::
 * System.Multiprocessors (s-multip.ads)::
 * System.Multiprocessors.Dispatching_Domains (s-mudido.ads)::
@@ -19074,6 +19075,7 @@
 * Interfaces.VxWorks.IO (i-vxwoio.ads)::
 * System.Address_Image (s-addima.ads)::
 * System.Assertions (s-assert.ads)::
+* System.Atomic_Counters (s-atocou.ads)::
 * System.Memory (s-memory.ads)::
 * System.Multiprocessors (s-multip.ads)::
 * System.Multiprocessors.Dispatching_Domains (s-mudido.ads)::
@@ -20585,6 +20587,18 @@
 by an run-time assertion failure, as well as the routine that
 is used internally to raise this assertion.
 
+@node System.Atomic_Counters (s-atocou.ads)
+@section @code{System.Atomic_Counters} (@file{s-atocou.ads})
+@cindex @code{System.Atomic_Counters} (@file{s-atocou.ads})
+
+@noindent
+This package provides the declaration of an atomic counter type,
+together with efficient routines (using hardware
+synchronization primitives) for incrementing, decrementing,
+and testing of these counters. This package is implemented
+on most targets, including all Alpha, ia64, PowerPC, SPARC V9,
+x86, and x86_64 platforms.
+
 @node System.Memory (s-memory.ads)
 @section @code{System.Memory} (@file{s-memory.ads})
 @cindex @code{System.Memory} (@file{s-memory.ads})
Index: impunit.adb
===
--- impunit.adb (revision 216367)
+++ impunit.adb (working copy)
@@ -367,6 +367,7 @@
--
 
 ("s-addima", F),  -- System.Address_Image
+("s-atocou", F),  -- System.Atomic_Counters
 ("s-assert", F),  -- System.Assertions
 ("s-diflio", F),  -- System.Dim.Float_IO
 ("s-diinio", F),  -- System.Dim.Integer_IO
Index: s-atocou.ads
===
--- s-atocou.ads(revision 216367)
+++ s-atocou.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2011-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2011-2014, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -37,8 +37,6 @@
 --- all x86 platforms
 --- all x86_64 platforms
 
---  Why isn't this package available to application programs???
-
 package System.Atomic_Counters is
 
pragma Preelaborate;
@@ -59,20 +57,19 @@
 
function Decrement (Item : in out Atomic_Counter) return Boolean;
pragma Inline_Always (Decrement);
-   --  Decrements value of atomic counter, returns True when value reach zero.
+   --  Decrements value of atomic counter, returns True when value reach zero
 
function Is_One (Item : Atomic_Counter) return Boolean;
pragma Inline_Always (Is_One);
-   --  Returns True when value of the atomic counter is one.
+   --  Returns True when value of the atomic counter is one
 
procedure Initialize (Item : out Atomic_Counter);
pragma Inline_Always (Initialize);
--  Initialize counter by setting its value to one. This subprogram is
-   --  intended to be used in special cases when counter object can't be
+   --  intended to be used in special cases when the counter object cannot be
--  initialized in standard way.
 
 private
-
type Unsigned_32 is mod 2 ** 32;
 
type Atomic_Counter is limited record
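
For readers who do not use Ada, the contract described in the spec comments above can be sketched in C11 roughly as follows. This is only an analogue built on <stdatomic.h>; it is not how s-atocou is implemented, and the type and function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical C analogue of the Atomic_Counter type.  */
typedef struct
{
  atomic_uint value;
} atomic_counter;

/* Initialize counter by setting its value to one.  */
static void
counter_initialize (atomic_counter *c)
{
  atomic_init (&c->value, 1u);
}

/* Increments value of atomic counter.  */
static void
counter_increment (atomic_counter *c)
{
  atomic_fetch_add (&c->value, 1u);
}

/* Decrements value of atomic counter, returns true when value reaches
   zero.  atomic_fetch_sub yields the value held before the subtraction.  */
static bool
counter_decrement (atomic_counter *c)
{
  unsigned previous = atomic_fetch_sub (&c->value, 1u);
  assert (previous != 0u);  /* decrementing below zero is a usage error */
  return previous == 1u;
}

/* Returns true when value of the atomic counter is one.  */
static bool
counter_is_one (atomic_counter *c)
{
  return atomic_load (&c->value) == 1u;
}
```

A typical use is reference counting: the owner that sees the decrement return true is the last one and may release the shared object.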


Re: [PATCH 9/17] Initial KAsan support

2014-10-17 Thread Yury Gribov

On 10/17/2014 11:43 AM, Eric Botcazou wrote:

Well, that's a backport of ancient patch from trunk so all credits go
there. And flag_sanitize is indeed handled differently from other
compiler flags.


Really curious to know why...


I bet this was a typo but let's wait for Jakub's comments.

-Y



Re: [PATCH][4/n] Merge from match-and-simplify, hook into fold-const.c

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Richard Biener wrote:

 On Thu, 16 Oct 2014, Sebastian Pop wrote:
 
  Richard Biener wrote:
   To give you an example how it looks like, the following code is
   generated for
   
   /* fold_negate_exprs convert - (~A) to A + 1.  */
   (simplify
(negate (bit_not @0))
(if (INTEGRAL_TYPE_P (type))
 (plus @0 { build_int_cst (TREE_TYPE (@0), 1); } )))
   
   tree
   generic_simplify (enum tree_code code, tree type ATTRIBUTE_UNUSED, tree 
   op0)
  
  I wonder why ATTRIBUTE_UNUSED is generated for used parameters.
 
 I've added them for the initial patch set because without any patterns
 defined (just 1/n and 2/n) only one of the parameters will be used.
 
 Consider them removed again once we have enough patterns to make
 bootstrap happy after that.
 
   {
     if ((op0 && TREE_SIDE_EFFECTS (op0)))
   return NULL_TREE;
 switch (code)
   {
   ...
   case NEGATE_EXPR:
 {
   switch (TREE_CODE (op0))
 {
 case BIT_NOT_EXPR:
   {
 tree o20 = TREE_OPERAND (op0, 0);
   {
 /* #line 136 
   /space/rguenther/src/svn/match-and-simplify/gcc/match.pd */
 tree captures[2] ATTRIBUTE_UNUSED = {};
  
  Same here.
  Also, why do we allocate two elements when only captures[0] is used?
 
 Good question - I'll have a look.

Fixed by the following - bootstrapped on x86_64-unknown-linux-gnu, 
applied.

Richard.

2014-10-17  Richard Biener  rguent...@suse.de

* genmatch.c (simplify::simplify): Fix off-by-one error.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 216316)
+++ gcc/genmatch.c  (working copy)
@@ -495,7 +495,7 @@ struct simplify
   : match (match_), match_location (match_location_),
   result (result_), result_location (result_location_),
   ifexpr_vec (ifexpr_vec_), for_vec (for_vec_),
-  capture_ids (capture_ids_), capture_max (capture_ids_->size ()) {}
+  capture_ids (capture_ids_), capture_max (capture_ids_->size () - 1) {}
 
   /* The expression that is matched against the GENERIC or GIMPLE IL.  */
   operand *match;
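
The arithmetic behind the off-by-one, as a tiny sketch. It assumes, as the exchange above suggests, that the generated captures array is sized as the highest capture index plus one; the helper names are made up for illustration and do not exist in genmatch.c.

```c
#include <assert.h>
#include <stddef.h>

/* Highest capture index for COUNT distinct captures @0 .. @(COUNT-1).
   This is what capture_max holds after the fix.  */
static size_t
capture_max_index (size_t count)
{
  assert (count > 0);
  return count - 1;
}

/* Array length needed to address indices 0 .. MAX_INDEX.  */
static size_t
captures_length (size_t max_index)
{
  return max_index + 1;
}
```

For the pattern with the single capture @0, the old code stored the count (1) as the maximum index and so produced captures[2]. With the fix the maximum index is 0 and the array has one element.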


Re: [PATCH][match-and-simplify] More ternary commutative ops, canonicalize operand order before generic_simplify

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Richard Biener wrote:

 On Thu, 16 Oct 2014, Jeff Law wrote:
 
  On 10/16/14 05:06, Richard Biener wrote:
   
   This patch (also applicable to trunk) makes us canonicalize operand
   order for comparisons at the same time we canonicalize other
   operand order, in particular before dispatching to generic_simplify.
   It also adds operand canonicalization to ternary ops and adds
   FMA_EXPR and DOT_PROD_EXPR to the list of ternary commutative ops.
   
   Bootstrap and regtest running on match-and-simplify branch and
   x86_64-unknown-linux-gnu.
   
   Richard.
   
   2014-10-16  Richard Biener  rguent...@suse.de
   
 * fold-const.c (fold_comparison): Remove redundant constant
 folding and operand swapping.
 (fold_binary_loc): Do comparison operand swapping here,
 dispatch to generic_simplify after operand canonicalization.
 (fold_ternary_loc): Canonicalize operand order for
 commutative ternary operations.
 * tree.c (commutative_ternary_tree_code): Add DOT_PROD_EXPR
 and FMA_EXPR.
  Seems like something we'd want for the trunk independent of the
  match-and-simplify work
 
 Yes, I am going to test and apply it there today.

Like below.  Bootstrapped on x86_64-unknown-linux-gnu, testing in 
progress.

Richard.

2014-10-17  Richard Biener  rguent...@suse.de

* fold-const.c (fold_comparison): Remove redundant constant
folding and operand swapping.
(fold_binary_loc): Do comparison operand swapping here.
(fold_ternary_loc): Canonicalize operand order for
commutative ternary operations.
* tree.c (commutative_ternary_tree_code): Add DOT_PROD_EXPR
and FMA_EXPR.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 216366)
+++ gcc/fold-const.c(working copy)
@@ -8721,14 +8721,6 @@ fold_comparison (location_t loc, enum tr
   STRIP_SIGN_NOPS (arg0);
   STRIP_SIGN_NOPS (arg1);
 
-  tem = fold_relational_const (code, type, arg0, arg1);
-  if (tem != NULL_TREE)
-return tem;
-
-  /* If one arg is a real or integer constant, put it last.  */
-  if (tree_swap_operands_p (arg0, arg1, true))
-return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, op0);
-
   /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
   if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
       && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
@@ -9915,6 +9907,12 @@ fold_binary_loc (location_t loc,
       && tree_swap_operands_p (arg0, arg1, true))
 return fold_build2_loc (loc, code, type, op1, op0);
 
+  /* Likewise if this is a comparison, and ARG0 is a constant, move it
+ to ARG1 to reduce the number of tests below.  */
+  if (kind == tcc_comparison
+      && tree_swap_operands_p (arg0, arg1, true))
+return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, op0);
+
   /* ARG0 is the first operand of EXPR, and ARG1 is the second operand.
 
  First check for cases where an arithmetic operation is applied to a
@@ -13799,6 +13797,12 @@ fold_ternary_loc (location_t loc, enum t
   gcc_assert (IS_EXPR_CODE_CLASS (kind)
               && TREE_CODE_LENGTH (code) == 3);
 
+  /* If this is a commutative operation, and OP0 is a constant, move it
+ to OP1 to reduce the number of tests below.  */
+  if (commutative_ternary_tree_code (code)
+      && tree_swap_operands_p (op0, op1, true))
+return fold_build3_loc (loc, code, type, op1, op0, op2);
+
   /* Strip any conversions that don't change the mode.  This is safe
  for every expression, except for a comparison expression because
  its signedness is derived from its operands.  So, in the latter
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 216366)
+++ gcc/tree.c  (working copy)
@@ -7385,6 +7385,8 @@ commutative_ternary_tree_code (enum tree
 {
 case WIDEN_MULT_PLUS_EXPR:
 case WIDEN_MULT_MINUS_EXPR:
+case DOT_PROD_EXPR:
+case FMA_EXPR:
   return true;
 
 default:
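
A minimal sketch of the comparison canonicalization, outside of GCC's tree representation. The struct layout and all names are illustrative assumptions. Only the idea comes from the patch: if the first operand is a constant, swap the operands and mirror the comparison code so that the constant ends up last.

```c
#include <assert.h>
#include <stdbool.h>

enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_EQ, CMP_NE };

/* An operand is either a constant or the single variable "x".  */
struct operand { bool is_constant; long value; };
struct comparison { enum cmp_code code; struct operand op0, op1; };

/* Comparison code that gives the same result with operands exchanged.  */
static enum cmp_code
swap_comparison (enum cmp_code code)
{
  switch (code)
    {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    default:     return code;  /* EQ and NE are symmetric */
    }
}

/* If only the first operand is constant, move it to the second place.  */
static struct comparison
canonicalize (struct comparison c)
{
  if (c.op0.is_constant && !c.op1.is_constant)
    {
      struct operand tmp = c.op0;
      c.op0 = c.op1;
      c.op1 = tmp;
      c.code = swap_comparison (c.code);
    }
  return c;
}

/* Evaluate C with the variable bound to X.  */
static bool
evaluate (struct comparison c, long x)
{
  long a = c.op0.is_constant ? c.op0.value : x;
  long b = c.op1.is_constant ? c.op1.value : x;
  switch (c.code)
    {
    case CMP_LT: return a < b;
    case CMP_LE: return a <= b;
    case CMP_GT: return a > b;
    case CMP_GE: return a >= b;
    case CMP_EQ: return a == b;
    default:     return a != b;
    }
}
```

Later folding code then only needs to handle the "x CMP constant" shape, which is the "reduce the number of tests below" mentioned in the new comment.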


[Ada] String literal is allowed for pragma Warnings in Ada 83

2014-10-17 Thread Arnaud Charlet
Documentation change only, no further test required

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* gnat_rm.texi: Document that string literal can be used for
pragma Warnings when operating in Ada 83 mode.

Index: gnat_rm.texi
===
--- gnat_rm.texi(revision 216371)
+++ gnat_rm.texi(working copy)
@@ -7829,6 +7829,9 @@
 pragma Warnings (On | Off, static_string_EXPRESSION [,REASON]);
 
 REASON ::= Reason => STRING_LITERAL @{& STRING_LITERAL@}
+
+Note: in Ada 83 mode, a string literal may be used in place of
+a static string expression (which does not exist in Ada 83).
 @end smallexample
 
 @noindent


[Ada] Class-wide type invariants for type extensions in other units.

2014-10-17 Thread Arnaud Charlet
A class-wide type invariant is inherited by a type extension, and incorporated
into the invariant procedure for that type. When the expression for such an
invariant (typically a function call) is first analyzed, we must preserve some
semantic information in it, because the type extension may be declared in a
different unit, where it cannot be resolved by visibility if it refers to
local entities.

The following must compile quietly:
   gcc -c -gnata inv2.ads

---
package Inv1 is
   type T_Inv1 is tagged private with
      Type_Invariant'Class => Invariant (T_Inv1);

   function Invariant (This : in T_Inv1'Class) return Boolean;
   type T_Inv2 is new Inv1.T_Inv1 with private;

private
   type T_Inv1 is tagged record
  Value : Integer := 1234;
   end record;

   function Invariant (This : in T_Inv1'Class) return Boolean is
      (This.Value > 1000);

   type T_Inv2 is new Inv1.T_Inv1 with null record;
end Inv1;
---
with Inv1;
package Inv2 is
   type T_Inv2 is new Inv1.T_Inv1 with private;
private
   type T_Inv2 is new Inv1.T_Inv1 with null record;
end Inv2;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Ed Schonberg  schonb...@adacore.com

* sem_ch13.adb (Add_Invariants): For a class-wide type invariant,
preserve semantic information on the invariant expression
(typically a function call) because it may be inherited by a
type extension in a different unit, and it cannot be resolved
by visibility elsewhere because it may refer to local entities.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 216367)
+++ sem_ch13.adb(working copy)
@@ -2947,8 +2947,7 @@
 --  evaluation of this aspect should be delayed to the
 --  freeze point (why???)
 
-if No (Expr)
-  or else Is_True (Static_Boolean (Expr))
+if No (Expr) or else Is_True (Static_Boolean (Expr))
 then
Set_Uses_Lock_Free (E);
 end if;
@@ -3621,10 +3620,10 @@
if (Attr = Name_Constant_Indexing
 and then Present
   (Find_Aspect (Etype (Ent), Aspect_Constant_Indexing)))
-
- or else (Attr = Name_Variable_Indexing
-and then Present
-  (Find_Aspect (Etype (Ent), Aspect_Variable_Indexing)))
+ or else
+   (Attr = Name_Variable_Indexing
+ and then Present
+   (Find_Aspect (Etype (Ent), Aspect_Variable_Indexing)))
then
   if Debug_Flag_Dot_XX then
  null;
@@ -4269,11 +4268,7 @@
 
 --  Case of address clause for a (non-controlled) object
 
-elsif
-  Ekind (U_Ent) = E_Variable
-or else
-  Ekind (U_Ent) = E_Constant
-then
+elsif Ekind_In (U_Ent, E_Variable, E_Constant) then
declare
   Expr  : constant Node_Id := Expression (N);
   O_Ent : Entity_Id;
@@ -4295,7 +4290,7 @@
 
   if Present (O_Ent)
 and then (Has_Controlled_Component (Etype (O_Ent))
-or else Is_Controlled (Etype (O_Ent)))
+   or else Is_Controlled (Etype (O_Ent)))
   then
  Error_Msg_N
 ("??cannot overlay with controlled object", Expr);
@@ -4826,13 +4821,10 @@
 --  except from aspect specification.
 
 if From_Aspect_Specification (N) then
-   if not (Is_Protected_Type (U_Ent)
-or else Is_Task_Type (U_Ent))
-   then
+   if not Is_Concurrent_Type (U_Ent) then
   Error_Msg_N
-("Interrupt_Priority can only be defined for task "
- & "and protected object",
- Nam);
+("Interrupt_Priority can only be defined for task "
+  & "and protected object", Nam);
 
elsif Duplicate_Clause then
   null;
@@ -4985,14 +4977,12 @@
 --  aspect specification.
 
 if From_Aspect_Specification (N) then
-   if not (Is_Protected_Type (U_Ent)
-or else Is_Task_Type (U_Ent)
+   if not (Is_Concurrent_Type (U_Ent)
 or else Ekind (U_Ent) = E_Procedure)
then
   Error_Msg_N
-(Priority can only be defined for task and protected  
- object,
- Nam);
+(Priority can only be defined for task and protected 
+  object, Nam);
 
elsif Duplicate_Clause then
  

[PATCH,i686]: Temporary fix for PR63566

2014-10-17 Thread Martin Liška

Hello.

After an IRC discussion, IPA ICF will set the local flag to false for both the
original node and the node that becomes an alias.
That forces the same calling convention to be used for both.

The i686-pc-linux bootstrap is still running; I will commit the fix as soon
as it finishes.
I consider it pre-approved.

Thank you,
Martin
gcc/ChangeLog:

2014-10-17  Martin Liska  mli...@suse.cz

* ipa-icf.c (sem_function::merge): Local flags are set to false
to enforce equal calling convention to be used.
* opts.c (common_handle_option): Indentation fix.
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index f7510b3..0e6bd9a 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -630,6 +630,11 @@ sem_function::merge (sem_item *alias_item)
   cgraph_node::create_alias (alias_func->decl, decl);
   alias->resolve_alias (original);
 
+  /* Workaround for PR63566 that forces equal calling convention
+	 to be used.  */
+  alias->local.local = false;
+  original->local.local = false;
+
   if (dump_file)
 	fprintf (dump_file, "Callgraph alias has been created.\n\n");
 }
diff --git a/gcc/opts.c b/gcc/opts.c
index dc8ddf4..3054196 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1982,8 +1982,8 @@ common_handle_option (struct gcc_options *opts,
   break;
 
 case OPT_fipa_icf:
-	opts->x_flag_ipa_icf_functions = value;
-	opts->x_flag_ipa_icf_variables = value;
+  opts->x_flag_ipa_icf_functions = value;
+  opts->x_flag_ipa_icf_variables = value;
   break;
 
 default:


[Ada] Fix obscure case of compiler crash on bad attribute

2014-10-17 Thread Arnaud Charlet
This fixes an error in the handling of attributes where the prefix
raises an exception. This resulted from other errors in the program.
No simple test case has been found, but the correction is clearly
safe.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* sem_attr.adb (Eval_Attribute): Ensure that attribute
reference is not marked as being a static expression if the
prefix evaluation raises CE.

Index: sem_attr.adb
===
--- sem_attr.adb(revision 216367)
+++ sem_attr.adb(working copy)
@@ -7553,15 +7553,17 @@
Static :=
  Static and then not Is_Constr_Subt_For_U_Nominal (P_Type);
Set_Is_Static_Expression (N, Static);
-
 end if;
 
 while Present (Nod) loop
if not Is_Static_Subtype (Etype (Nod)) then
   Static := False;
   Set_Is_Static_Expression (N, False);
+
elsif not Is_OK_Static_Subtype (Etype (Nod)) then
   Set_Raises_Constraint_Error (N);
+  Static := False;
+  Set_Is_Static_Expression (N, False);
end if;
 
--  If however the index type is generic, or derived from
@@ -7591,6 +7593,7 @@
 
   begin
  E := E1;
+
  while Present (E) loop
 
 --  If expression is not static, then the attribute reference
@@ -7638,6 +7641,7 @@
  end loop;
 
  if Raises_Constraint_Error (Prefix (N)) then
+Set_Is_Static_Expression (N, False);
 return;
  end if;
   end;


[Ada] Better messages for missing entities in configurable runtime

2014-10-17 Thread Arnaud Charlet
A new mechanism has been implemented that allows specialization of
error messages for missing entities in a configurable run-time.
Instead of just outputting the (sometimes obscure) name of the
entity involved, a more meaningful message can be issued. This
new mechanism is used for a case of rendezvous not being supported
and also for packed array operations not being supported.

Also in the case of unsupported array packing, the message is now
issued explicitly on the array type entity, as shown in this
test program (compiled with -gnatld7 -gnatj55)

 1. pragma No_Run_Time;
 2. procedure BadPack (M : Integer) is
 3.type R is mod 2 ** 43;
 4.type A is array (1 .. 10) of R;
|
 packing of 43-bit components not allowed
in no run time mode

 5.pragma Pack (A);
 6.AV : A;
 7. begin
 8.AV (M) := 3;
  |
 construct not allowed in no run time mode
 packed component size of 43 is not
supported

 9. end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* exp_pakd.adb: Move bit packed entity tables to spec.
* exp_pakd.ads: Move bit packed entity tables here from body.
* freeze.adb (Freeze_Array_Type): Check that packed array type
is supported.
* rtsfind.adb (PRE_Id_Table): New table (Entity_Not_Defined):
Specialize messages using PRE_Id_Table.
* uintp.ads, uintp.adb (UI_Image): New functional form.

Index: exp_pakd.adb
===
--- exp_pakd.adb(revision 216367)
+++ exp_pakd.adb(working copy)
@@ -34,7 +34,6 @@
 with Nlists;   use Nlists;
 with Nmake;use Nmake;
 with Opt;  use Opt;
-with Rtsfind;  use Rtsfind;
 with Sem;  use Sem;
 with Sem_Aux;  use Sem_Aux;
 with Sem_Ch3;  use Sem_Ch3;
@@ -77,365 +76,6 @@
--  right rotate into a left rotate, avoiding the subtract, if the machine
--  architecture provides such an instruction.
 
-   --
-   -- Entity Tables for Packed Access Routines --
-   --
-
-   --  For the cases of component size = 3,5-7,9-15,17-31,33-63 we call library
-   --  routines. This table provides the entity for the proper routine.
-
-   type E_Array is array (Int range 01 .. 63) of RE_Id;
-
-   --  Array of Bits_nn entities. Note that we do not use library routines
-   --  for the 8-bit and 16-bit cases, but we still fill in the table, using
-   --  entries from System.Unsigned, because we also use this table for
-   --  certain special unchecked conversions in the big-endian case.
-
-   Bits_Id : constant E_Array :=
- (01 = RE_Bits_1,
-  02 = RE_Bits_2,
-  03 = RE_Bits_03,
-  04 = RE_Bits_4,
-  05 = RE_Bits_05,
-  06 = RE_Bits_06,
-  07 = RE_Bits_07,
-  08 = RE_Unsigned_8,
-  09 = RE_Bits_09,
-  10 = RE_Bits_10,
-  11 = RE_Bits_11,
-  12 = RE_Bits_12,
-  13 = RE_Bits_13,
-  14 = RE_Bits_14,
-  15 = RE_Bits_15,
-  16 = RE_Unsigned_16,
-  17 = RE_Bits_17,
-  18 = RE_Bits_18,
-  19 = RE_Bits_19,
-  20 = RE_Bits_20,
-  21 = RE_Bits_21,
-  22 = RE_Bits_22,
-  23 = RE_Bits_23,
-  24 = RE_Bits_24,
-  25 = RE_Bits_25,
-  26 = RE_Bits_26,
-  27 = RE_Bits_27,
-  28 = RE_Bits_28,
-  29 = RE_Bits_29,
-  30 = RE_Bits_30,
-  31 = RE_Bits_31,
-  32 = RE_Unsigned_32,
-  33 = RE_Bits_33,
-  34 = RE_Bits_34,
-  35 = RE_Bits_35,
-  36 = RE_Bits_36,
-  37 = RE_Bits_37,
-  38 = RE_Bits_38,
-  39 = RE_Bits_39,
-  40 = RE_Bits_40,
-  41 = RE_Bits_41,
-  42 = RE_Bits_42,
-  43 = RE_Bits_43,
-  44 = RE_Bits_44,
-  45 = RE_Bits_45,
-  46 = RE_Bits_46,
-  47 = RE_Bits_47,
-  48 = RE_Bits_48,
-  49 = RE_Bits_49,
-  50 = RE_Bits_50,
-  51 = RE_Bits_51,
-  52 = RE_Bits_52,
-  53 = RE_Bits_53,
-  54 = RE_Bits_54,
-  55 = RE_Bits_55,
-  56 = RE_Bits_56,
-  57 = RE_Bits_57,
-  58 = RE_Bits_58,
-  59 = RE_Bits_59,
-  60 = RE_Bits_60,
-  61 = RE_Bits_61,
-  62 = RE_Bits_62,
-  63 = RE_Bits_63);
-
-   --  Array of Get routine entities. These are used to obtain an element from
-   --  a packed array. The N'th entry is used to obtain elements from a packed
-   --  array whose component size is N. RE_Null is used as a null entry, for
-   --  the cases where a library routine is not used.
-
-   Get_Id : constant E_Array :=
- (01 = RE_Null,
-  02 = RE_Null,
-  03 = RE_Get_03,
-  04 = RE_Null,
-  05 = RE_Get_05,
-  06 = RE_Get_06,
-  07 = RE_Get_07,
-  08 = RE_Null,
-  09 = RE_Get_09,
-  10 = RE_Get_10,
-  11 = RE_Get_11,
-  12 = RE_Get_12,
-  13 = RE_Get_13,
-  14 = RE_Get_14,
-  15 = RE_Get_15,
-  16 = RE_Null,
-  

[Ada] Short_Integer should be considered implementation defined

2014-10-17 Thread Arnaud Charlet
For the purposes of restriction No_Implementation_Identifiers,
Standard.Short_Integer should be considered as being implementation
defined and this was not the case. In addition, this patch fixes
a compiler blow up with a compiler built with assertions in the
test for implementation-defined identifiers. Note that the latter
problem is not documented in the KP entry for this ticket, since
it shows up only in compilers built with assertions.

The following should compile as indicated with -gnatld7 -gnatj55

 1. pragma Restriction_Warnings
 2.  (No_Implementation_Identifiers);
 3. package ImplIdent is
 4.  subtype Integer_8 is Standard.Short_Short_Integer;
   |
 warning: violation of restriction
No_Implementation_Identifiers at line 1

 5.  subtype Integer_16 is Standard.Short_Integer;
|
 warning: violation of restriction
No_Implementation_Identifiers at line 1

 6. end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* cstand.adb (Create_Standard): Mark Short_Integer as
implementation defined.
* sem_util.adb (Set_Entity_With_Checks): Avoid blow up for
compiler built with assertions for No_Implementation_Identifiers test.

Index: sem_util.adb
===
--- sem_util.adb(revision 216371)
+++ sem_util.adb(working copy)
@@ -16462,8 +16462,9 @@
  --  the entities within it).
 
  if (Is_Implementation_Defined (Val)
-   or else
- Is_Implementation_Defined (Scope (Val)))
+  or else
+(Present (Scope (Val))
+  and then Is_Implementation_Defined (Scope (Val))))
and then not (Ekind_In (Val, E_Package, E_Generic_Package)
   and then Is_Library_Level_Entity (Val))
  then
Index: cstand.adb
===
--- cstand.adb  (revision 216367)
+++ cstand.adb  (working copy)
@@ -735,6 +735,7 @@
 
   Build_Signed_Integer_Type
 (Standard_Short_Integer, Standard_Short_Integer_Size);
+  Set_Is_Implementation_Defined (Standard_Short_Integer);
 
   Build_Signed_Integer_Type
 (Standard_Integer, Standard_Integer_Size);


Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.

2014-10-17 Thread Richard Biener
On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is reduced patch as you requested. All your remarks have been fixed.
 Could you please look at it ( I have already sent the patch with
 changes in add_to_predicate_list for review).

+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "More than two phi node args.\n");
+ return false;
+   }
+
+}

Excess vertical space.


+/* Assumes that BB has more than 2 predecessors.

More than 1 predecessor?

+   Returns false if at least one successor is not on critical edge
+   and true otherwise.  */
+
+static inline bool
+all_edges_are_critical (basic_block bb)
+{

all_preds_critical_p would be a better name

+  if (EDGE_COUNT (bb-preds)  2)
+{
+  if (!flag_force_vectorize)
+   return false;
+}

as I said in the last review I don't think we should restrict edge
predicates to flag_force_vectorize.  At least I can't see how
if-conversion is magically more expensive for that case?

So please rework the patch so critical edges are always handled
correctly.

Ok with that and the above suggested changes.

Thanks,
Richard.


 Thanks.
 Yuri.
 ChangeLog
 2014-10-16  Yuri Rumyantsev  ysrum...@gmail.com

 (flag_force_vectorize): New variable.
 (edge_predicate): New function.
 (set_edge_predicate): New function.
 (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
 if destination block of edge is not always executed. Set-up predicate
 for critical edge.
 (if_convertible_phi_p): Accept phi nodes with more than two args
 if FLAG_FORCE_VECTORIZE was set-up.
 (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
 (if_convertible_stmt_p): Fix up pre-function comments.
 (all_edges_are_critical): New function.
 (if_convertible_bb_p): Allow bb has more than two predecessors if
 FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
 to reject block if-conversion with incoming critical edges only if
 FLAG_FORCE_VECTORIZE was not set-up.
 (predicate_bbs): Skip loop exit block also.Invoke build2_loc
 to compute predicate instead of fold_build2_loc.
 Add zeroing of edge 'aux' field.
 (find_phi_replacement_condition): Extend function interface:
 it returns NULL if given phi node must be handled by means of
 extended phi node predication. If number of predecessors of phi-block
 is equal 2 and atleast one incoming edge is not critical original
 algorithm is used.
 (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
 Nullify 'aux' field of edges for blocks with two successors.



 2014-10-15 13:50 GMT+04:00 Richard Biener richard.guent...@gmail.com:
 On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is updated patch (part1) for extended if conversion.

 Second part of patch will be sent later.

 Ok, I'm starting to look at this.  I'd still like you to split things up
 more.

  static inline void
  add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
  {
 ...

 +  /* We use notion of cd equivalence to get simplier predicate for
 +join block, e.g. if join block has 2 predecessors with predicates
 +p1  p2 and p1  !p2, we'd like to get p1 for it instead of
 +p1  p2 | p1  !p2.  */
 +  if (dom_bb != loop-header
 +  get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == bb)
 +   {
 + gcc_assert (flow_bb_inside_loop_p (loop, dom_bb));
 + bc = bb_predicate (dom_bb);
 + gcc_assert (!is_true_predicate (bc));

 these changes look worthwhile even for !flag_force_vectorize.  So please
 split the change to add_to_predicate_list out and compute post-dominators
 unconditionally.  Note that you should call free_dominance_info
 (CDI_POST_DOMINATORS) at the end of if-conversion.
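
The simplification described in the quoted comment can be checked by brute force. The sketch below is only an illustration of the boolean identity (p1 && p2) || (p1 && !p2) == p1; the function names are hypothetical and none of this is GCC code.

```cpp
#include <cassert>

// Predicate of the join block when the two incoming predicates are OR-ed.
bool join_predicate_naive(bool p1, bool p2)
{
    return (p1 && p2) || (p1 && !p2);
}

// Predicate taken from the dominating block instead; p2 drops out.
bool join_predicate_simplified(bool p1, bool /* p2 */)
{
    return p1;
}

// Compare both forms over all four input combinations.
bool predicates_agree()
{
    for (int p1 = 0; p1 < 2; ++p1)
        for (int p2 = 0; p2 < 2; ++p2)
            if (join_predicate_naive(p1, p2) != join_predicate_simplified(p1, p2))
                return false;
    return true;
}
```

Calling predicates_agree() should return true, which is why reusing the predicate of the dominating block is safe for such a join block.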

 +  if (!dominated_by_p (CDI_DOMINATORS, loop-latch, e-dest))
 +add_to_predicate_list (loop, e-dest, cond);
 +
 +  /* If edge E is critical save predicate on it.  */
 +  if (EDGE_COUNT (e-dest-preds) = 2)
 +set_edge_predicate (e, cond);

 how do we know the edge is critical by this simple check?  Why not
 simply always save edge predicates (well, you kind of do but omit
 the case where e-src dominates e-dest).

 Btw, you can rely on edge-aux being NULL at the start of the
 pass but need to clear it at the end (best use clear_aux_for_edges ()
 for that).  So stuff like

 + extract_true_false_edges_from_block (bb, true_edge, false_edge);
 + if (flag_force_vectorize)
 +   true_edge-aux = false_edge-aux = NULL;

 shouldn't be necessary.

 I think the edge predicate handling should also be unconditional
 and not depend on flag_force_vectorize.

 +  /* The loop latch and loop exit block are always executed and
 +have no extra conditions to be processed: skip them.  */
 +  if (bb == loop-latch
 + || bb_with_exit_edge_p (loop, bb))

 I don't think the edge stuff is true - given you 

[Ada] Better error message for illegal iterator expression

2014-10-17 Thread Arnaud Charlet
This patch improves the error message on an iterator specification whose name
is a function call that does not yield a type that implements an iterator
interface.

Compiling try_containers.adb must yield:

   try_containers.adb:17:18: expect object that implements iterator interface

--
with Ada.Text_Io; use Ada.Text_Io;
with Ada.Containers.Vectors;
procedure Try_Containers
is
   package Integer_Vectors is new Ada.Containers.Vectors (Natural, Integer);
   use Integer_Vectors;

   A : Vector := To_Vector (1, 10);
begin
   Loop_1 :
   for Element of A loop
  Put_Line ("A (i) = " & Integer'Image (Element));
  -- can't do Element := 2;
   end loop Loop_1;

   Loop_2 :
   for Cursor in First (A) loop -- oops! should be:
   --  for Cursor in Iterate (A) loop

  Put_Line ("A (I) = " & Integer'Image (Element (Cursor)));
  Replace_Element (A, Cursor, 2);
  Reference (A, Cursor) := 2;
   end loop Loop_2;

end Try_Containers;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Ed Schonberg  schonb...@adacore.com

* sem_ch5.adb (Analyze_Iterator_Specification): If the domain
of iteration is given by an expression that is not an array type,
verify that its type implements an iterator interface.

Index: sem_ch5.adb
===
--- sem_ch5.adb (revision 216367)
+++ sem_ch5.adb (working copy)
@@ -1838,6 +1838,17 @@
 
 else
Typ := Etype (Iter_Name);
+
+   --  Verify that the expression produces an iterator.
+
+   if not Of_Present (N) and then not Is_Iterator (Typ)
+ and then not Is_Array_Type (Typ)
+ and then No (Find_Aspect (Typ, Aspect_Iterable))
+   then
+  Error_Msg_N
+("expect object that implements iterator interface",
+Iter_Name);
+   end if;
 end if;
 
 --  Protect against malformed iterator


[Ada] Directories are no longer created for abstract projects

2014-10-17 Thread Arnaud Charlet
Directories such as object directories are no longer created for abstract
projects when the builder (gnatmake or gprbuild) is called with -P or
with --subdirs=..., even when there is no explicit indication in the
abstract project that there are no sources in the project.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Vincent Celier  cel...@adacore.com

* prj-nmsc.adb (Get_Directories): Do not create directories
when a project is abstract.

Index: prj-nmsc.adb
===
--- prj-nmsc.adb(revision 216367)
+++ prj-nmsc.adb(working copy)
@@ -5498,13 +5498,15 @@
   Dir_Exists : Boolean;
 
   No_Sources : constant Boolean :=
- ((not Source_Files.Default
+ Project.Qualifier = Abstract_Project
+   or else
+ (((not Source_Files.Default
 and then Source_Files.Values = Nil_String)
or else (not Source_Dirs.Default
  and then Source_Dirs.Values = Nil_String)
or else (not Languages.Default
  and then Languages.Values = Nil_String))
- and then Project.Extends = No_Project;
+ and then Project.Extends = No_Project);
 
--  Start of processing for Get_Directories
 


[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2014-10-17 Thread Yvan Roux
Hi all

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 216130 as r216256.  We have also backported this set of revisions:

r209643 as 215975 : [AArch64] Define TARGET_FLAGS_REGNUM
r211881 as 215975 : PR target/61565
r213035 as 215846 : [AArch64] libitm: Improve _ITM_beginTransaction
r213090 as 215847 : [AArch64] Fix *extr_insv_lower_regmode pattern
r214824 as 215977 : [AArch64] Use CC_Z and CC_NZ with csinc and
similar instructions
r214825 as 216007 : [AArch32][1/2] Implement lceil, lfloor, lround
optabs with new ARMv8-A instructions
r214826 as 216007 : [AArch32][2/2] Vectorise lroundf, lfloorf, lceilf
using the new ARMv8-A vcvt* instructions
r214886 as 215944 : [AArch64] Improve epilogue unwind info rth
r214940 as 215853 : [AArch64] Add a mode to operand 1 of sibcall_value_insn
r214943 as 215946 : [AArch64] Add a builtin for rbit(q?)_p8; add
intrinsics and tests.
r214944 as 215948 : [AArch32/AArch64] Schedule alu_ext for Cortex-A53
r214945 as 215949 : [AArch64] Remove varargs from aarch64_simd_expand_args
r214947 as 215854 : [AArch64] Tidy: remove unused qualifier_const_pointer
r214959 as 215857 : [AArch32/AArch64] Add scheduling info for ARMv8-A
FPU new instructions in Cortex-A53
r215050 as 215858 : [AArch32][1/7] Convert FP mnemonics to UAL | mov patterns.
r215051 as 215858 : [AArch32][2/7] Convert FP mnemonics to UAL |
add/sub/div/abs patterns
r215052 as 215858 : [AARch32][3/7] Convert FP mnemonics to UAL |
mul+add patterns
r215053 as 215858 : [AArch32][4/7] Convert FP mnemonics to UAL | vcvt patterns
r215054 as 215858 : [AArch32][5/7] Convert FP mnemonics to UAL | sqrt
and FP compare patterns
r215055 as 215858 : [AArch32][6/7] Convert FP mnemonics to UAL |
movcc_vfp (fmstat)
r215056 as 215858 : [AArch32][7/7] Convert FP mnemonics to UAL |
f{ld,st}m - v{ld,st}m
r215067 as 215923 : [AArch32] Enable auto-vectorization for copysignf
r215085 as 216007 : [AArch32][tests] Make input and output arrays
128-bit aligned in vectorisation tests
r215086 as 215928 : [AARch64] Add crtfastmath for AArch64
r215101 as 215929 : PR target/56846 libstdc++
r215136 as 215932 : PR target/63209
r215205 as 215935 : [Ree] Ensure inserted copy don't change the number
of hard registers
r215260 as 215937 : [AArch64] Fix force_simd macro in vdup_lane_2
r215321 as 215938 : Disallow -mfpu=neon for unsuitable architectures
r215346 as 215940 : movmisalignmode_neon_load
r215385 as 215941 : [AArch64] Add constraint letter for
stack_protect_test pattern
r215471 as 216004 : [AArch64] Auto-generate the BUILTIN_ macros

This will be part of our 2014.10 4.9 release.

Thanks,
Yvan


[Ada] Internal clean up (use Is_Directory_Separator)

2014-10-17 Thread Arnaud Charlet
This is an internal clean up to use an existing abstraction
more extensively. No external effect, no test required.
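
For readers who do not know the helper, here is a rough C++ analogue of the abstraction (not the Ada source). It assumes that Is_Directory_Separator accepts '/' as well as the native separator, which is what the replaced conditions in the diffs below test for. The names and the choice of '\\' as the native separator are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the host's native separator; '\\' is chosen
// here only so that it differs from '/'.
constexpr char directory_separator = '\\';

// Accepts '/' in addition to the native separator.
bool is_directory_separator(char c)
{
    return c == '/' || c == directory_separator;
}

// Index of the first character of the file-name part of `name`, found by
// scanning backwards until a separator is seen.
std::size_t base_name_start(const std::string& name)
{
    std::size_t first = name.size();
    while (first > 0 && !is_directory_separator(name[first - 1]))
        --first;
    return first;
}
```

With one helper, each call site needs a single test instead of two comparisons that must be kept in sync.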

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-17  Robert Dewar  de...@adacore.com

* gnatcmd.adb, make.adb, prj-part.adb, gnatlink.adb, prj-nmsc.adb,
prj-conf.adb, prj-env.adb: Use Is_Directory_Separator where possible.

Index: gnatcmd.adb
===
--- gnatcmd.adb (revision 216367)
+++ gnatcmd.adb (working copy)
@@ -883,10 +883,9 @@
   if not Is_Absolute_Path (Exec_File_Name) then
  for Index in Exec_File_Name'Range loop
 if Exec_File_Name (Index) = Directory_Separator then
-   Fail (relative executable ( 
-   Exec_File_Name 
-   ) with directory part not allowed  
-   when using project files);
+   Fail (relative executable (  Exec_File_Name
+  ) with directory part not allowed 
+  when using project files);
 end if;
  end loop;
 
@@ -1398,9 +1397,7 @@
 
   else
  for K in Switch'Range loop
-if Switch (K) = '/'
-  or else Switch (K) = Directory_Separator
-then
+if Is_Directory_Separator (Switch (K)) then
Test_Existence := True;
exit;
 end if;
Index: make.adb
===
--- make.adb(revision 216367)
+++ make.adb(working copy)
@@ -4057,8 +4057,7 @@
begin
   First := Name'Last;
   while First  Name'First
-and then Name (First - 1) /= Directory_Separator
-and then Name (First - 1) /= '/'
+and then not Is_Directory_Separator (Name (First - 1))
   loop
  First := First - 1;
   end loop;
@@ -6805,8 +6804,7 @@
  begin
 First := Name'Last;
 while First  Name'First
-  and then Name (First - 1) /= Directory_Separator
-  and then Name (First - 1) /= '/'
+  and then not Is_Directory_Separator (Name (First - 1))
 loop
First := First - 1;
 end loop;
Index: prj-part.adb
===
--- prj-part.adb(revision 216367)
+++ prj-part.adb(working copy)
@@ -349,8 +349,7 @@
   Get_Name_String (Path_Name_Of (Main_Project, In_Tree));
 
   while Name_Len  0
-and then Name_Buffer (Name_Len) /= Directory_Separator
-and then Name_Buffer (Name_Len) /= '/'
+and then not Is_Directory_Separator (Name_Buffer (Name_Len))
   loop
  Name_Len := Name_Len - 1;
   end loop;
Index: gnatlink.adb
===
--- gnatlink.adb(revision 216367)
+++ gnatlink.adb(working copy)
@@ -1204,9 +1204,8 @@
if GCC_Index = 0 then
   GCC_Index :=
 Index (Path (1 .. Path_Last),
-   Directory_Separator 
-   lib 
-   Directory_Separator);
+   Directory_Separator  lib
+Directory_Separator);
end if;
 
--  If we have found a lib subdir in
Index: prj-nmsc.adb
===
--- prj-nmsc.adb(revision 216381)
+++ prj-nmsc.adb(working copy)
@@ -5031,10 +5031,7 @@
 
if OK then
   for J in 1 .. Name_Len loop
- if Name_Buffer (J) = '/'
-  or else
-Name_Buffer (J) = Directory_Separator
- then
+ if Is_Directory_Separator (Name_Buffer (J)) then
 OK := False;
 exit;
  end if;
@@ -5336,9 +5333,7 @@
function Compute_Directory_Last (Dir : String) return Natural is
begin
   if Dir'Length  1
-and then (Dir (Dir'Last - 1) = Directory_Separator
-or else
-  Dir (Dir'Last - 1) = '/')
+and then Is_Directory_Separator (Dir (Dir'Last - 1))
   then
  return Dir'Last - 1;
   else
@@ -5858,7 +5853,7 @@
--  Check that there is no directory information
 
for J in 1 .. Last loop
-  if Line (J) = '/' or else Line (J) = Directory_Separator then
+  if 

[patch] LWG 2019 - std::isblank<C>(C, const std::locale&)

2014-10-17 Thread Jonathan Wakely
http://cplusplus.github.io/LWG/lwg-defects.html#2019 


I've checked the relevant _ISblank/_ISBLANK/_CTYPE_B constant on all
targets except VxWorks where I chose something that looks reasonable.
Not all targets reserve a bit for isblank, but this way
ctype_base::blank is always defined, but on some platforms with the
same value as ctype_base::space. That means that on those targets
isblank(c, loc) is equivalent to isspace(c, loc) which is not correct,
but isn't completely crazy either.
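
A small sketch of how the new overload is meant to behave. It assumes a C++11 library that provides the two-argument std::isblank from <locale> and a target where ctype_base::blank has its own bit (for example x86_64-linux). The char_class and classify names are made up for this illustration.

```cpp
#include <cassert>
#include <locale>

// Result of classifying one character with the locale-aware overloads.
struct char_class
{
    bool blank;
    bool space;
};

// Classify `c` in the given locale (the classic "C" locale by default).
char_class classify(char c, const std::locale& loc = std::locale::classic())
{
    return { std::isblank(c, loc), std::isspace(c, loc) };
}
```

On such a target, space and tab should be both blank and space, while newline should be space only. On targets where blank shares its value with space, newline would be reported as blank too, which is the inaccuracy described above.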

Some systems (bionic, newlib, netbsd, openbsd) do define a _B (or
_CTYPE_B) constant, but as it says on netbsd:

/*
* isblank() is implemented as C function, due to insufficient bitwidth in
* _ctype_.  Note that _B does not mean isblank - it means "isprint && !isgraph".
*/

On those targets there is no bitmask corresponding to the isblank set.
I don't know how to solve that without changing ctype_base::mask to a
wider type, which I'm not planning on doing.

N.B. on other BSDs (freebsd, darwin, dragonfly) _CTYPE_B *does*
correspond to isblank. Portability is fun.

Some implementations of ctype<char>::is(mask, char) and/or
ctype<wchar_t>::do_is defined inline in config/os/*/ctype_inline.h
need the ctype_base::blank mask, but those files get included by C++98
code, so for some targets ctype_base::blank is always defined even in
C++98 mode. Solving that is too difficult.

Tested x86_64-linux, with --enable-clocale={gnu,generic}
and also by hacking configure.host to use config/os/generic, and also
tested on x86_64-netbsd5.1 and x86_64-dragonfly3.6. Something will
probably break on a target I didn't test, but should be easy to fix.

I plan to commit this later today.

commit 1c95ff5159b6a41e4f5f4d4919b8e394905bb9c4
Author: Jonathan Wakely jwak...@redhat.com
Date:   Thu Oct 16 15:21:02 2014 +0100

	* src/c++98/Makefile.am: Move ctype.cc, ctype_configure_char.cc and
	ctype_members.cc to ...
	* src/c++11/Makefile.am: Here.
	* src/c++98/Makefile.in: Regenerate.
	* src/c++11/Makefile.in: Regenerate.
	* src/c++98/ctype.cc: Move file to ...
	* src/c++11/ctype.cc: Here, define ctype_base::blank.
	* config/abi/pre/gnu.ver: Export ctype_base::blank.
	* config/locale/generic/ctype_members.cc
	(ctype<wchar_t>::_M_convert_to_wmask): Handle blank. Update comments.
	* config/locale/gnu/ctype_members.cc
	(ctype<wchar_t>::_M_convert_to_wmask): Likewise.
	* config/os/aix/ctype_base.h (ctype_base::blank): Declare.
	* config/os/bionic/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/bsd/darwin/ctype_base.h (ctype_base::blank): Declare.
	* config/os/bsd/darwin/ctype_inline.h (ctype<char>::is): Use blank.
	(ctype<wchar_t>::do_is): Likewise.
	* config/os/bsd/dragonfly/ctype_base.h (ctype_base::blank): Declare.
	* config/os/bsd/dragonfly/ctype_inline.h (ctype<char>::is): Use blank.
	(ctype<wchar_t>::do_is): Likewise.
	* config/os/bsd/freebsd/ctype_base.h (ctype_base::blank): Declare.
	* config/os/bsd/freebsd/ctype_inline.h (ctype<char>::is): Use blank.
	(ctype<wchar_t>::do_is): Likewise.
	* config/os/bsd/netbsd/ctype_base.h (ctype_base::blank): Declare.
	* config/os/bsd/openbsd/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/djgpp/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/generic/ctype_base.h (ctype_base::blank): Declare.
	* config/os/generic/ctype_inline.h (ctype<char>::is): Use blank.
	* config/os/gnu-linux/ctype_base.h (ctype_base::blank): Declare.
	* config/os/hpux/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/mingw32-w64/ctype_base.h (ctype_base::blank): Declare.
	* config/os/mingw32-w64/ctype_configure_char.cc
	(ctype<char>::classic_table()): Set blank bit for space and tab.
	* config/os/mingw32/ctype_base.h (ctype_base::blank): Declare.
	* config/os/mingw32/ctype_configure_char.cc
	(ctype<char>::classic_table()): Set blank bit for space and tab.
	* config/os/newlib/ctype_base.h (ctype_base::blank): Declare.
	* config/os/qnx/qnx6.1/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/solaris/solaris2.10/ctype_base.h (ctype_base::blank):
	Likewise.
	* config/os/tpf/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/uclibc/ctype_base.h (ctype_base::blank): Likewise.
	* config/os/vxworks/ctype_base.h (ctype_base::blank): Likewise.
	* include/bits/locale_facets.h (isblank): Define.
	* include/bits/localefwd.h (isblank): Declare.
	* testsuite/22_locale/classification/isblank.cc: New.
	* testsuite/22_locale/ctype_base/blank.cc: New.

diff --git a/libstdc++-v3/src/c++11/Makefile.am b/libstdc++-v3/src/c++11/Makefile.am
index 39425d4..c8507ce 100644
--- a/libstdc++-v3/src/c++11/Makefile.am
+++ b/libstdc++-v3/src/c++11/Makefile.am
@@ -27,9 +27,22 @@ noinst_LTLIBRARIES = libc++11convenience.la
 
 headers =
 
+# Source files linked in via configuration/make substitution for a
+# particular host.
+host_sources = \
+	ctype_configure_char.cc \
+	

[RS6000] Fix -mlongcall with nested functions on AIX

2014-10-17 Thread Eric Botcazou
Hi,

-mlongcall miscompiles nested functions with the AIX ABI.  The problem is 
that, when -mlongcall is in effect, rs6000_call_aix redirects all calls to:

  /* Handle indirect calls.  */
  if (GET_CODE (func_desc) != SYMBOL_REF
  || (DEFAULT_ABI == ABI_AIX && !SYMBOL_REF_FUNCTION_P (func_desc)))

and then

  /* A function pointer under AIX is a pointer to a data area whose
 first word contains the actual address of the function, whose
 second word contains a pointer to its TOC, and whose third word
 contains a value to place in the static chain register (r11).
 Note that if we load the static chain, our trampoline need
 not have any executable code.  */

But, if the call was originally direct, no trampoline has been built, which 
means that the value loaded into the static chain register is garbage.  That's 
sort of OK, except when the called function is nested because the static chain 
register has already been loaded with the proper static chain value by the 
generic code, so overwriting it with garbage breaks the program.
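
The failure mode can be modelled outside the compiler. The following is a deliberately simplified sketch, not the RTL expansion in rs6000.c: the struct layout follows the three-word descriptor described in the comment above, and the function and parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Three-word function descriptor as described in the quoted comment.
struct function_descriptor
{
    std::uintptr_t entry;         // word 0: address of the code
    std::uintptr_t toc;           // word 1: TOC pointer
    std::uintptr_t static_chain;  // word 2: only meaningful if a trampoline was built
};

// Value left in the static chain register by the call sequence.
// `incoming_chain` is what generic code has already placed there.
std::uintptr_t static_chain_after_call(const function_descriptor& desc,
                                       std::uintptr_t incoming_chain,
                                       bool direct_call_p,
                                       bool pointers_to_nested_functions)
{
    if (!direct_call_p && pointers_to_nested_functions)
        return desc.static_chain;  // indirect call: trust the descriptor
    return incoming_chain;         // originally direct: keep the existing value
}
```

In this model, a call that was originally direct keeps the chain set up by the generic code, while a genuinely indirect call still picks up the third descriptor word.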

Tested on PowerPC/AIX, OK for the mainline?


2014-10-17  Eric Botcazou  ebotca...@adacore.com

* config/rs6000/rs6000.c (rs6000_call_aix): For the AIX ABI, do not
load the static chain if the call was originally direct.


2014-10-17  Eric Botcazou  ebotca...@adacore.com

* gcc.target/powerpc/longcall-2.c: New test.


-- 
Eric Botcazou
Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 216252)
+++ config/rs6000/rs6000.c	(working copy)
@@ -32568,6 +32568,8 @@ rs6000_legitimate_constant_p (enum machi
 void
 rs6000_call_aix (rtx value, rtx func_desc, rtx flag, rtx cookie)
 {
+  const bool direct_call_p
+    = GET_CODE (func_desc) == SYMBOL_REF && SYMBOL_REF_FUNCTION_P (func_desc);
   rtx toc_reg = gen_rtx_REG (Pmode, TOC_REGNUM);
   rtx toc_load = NULL_RTX;
   rtx toc_restore = NULL_RTX;
@@ -32636,8 +32638,11 @@ rs6000_call_aix (rtx value, rtx func_des
 			func_toc_offset));
 	  toc_load = gen_rtx_USE (VOIDmode, func_toc_mem);
 
-	  /* If we have a static chain, load it up.  */
-	  if (TARGET_POINTERS_TO_NESTED_FUNCTIONS)
+	  /* If we have a static chain, load it up.  But, if the call was
+	 originally direct, the 3rd word has not been written since no
+	 trampoline has been built, so we ought not to load it, lest we
+	 override a static chain value.  */
+	  if (!direct_call_p && TARGET_POINTERS_TO_NESTED_FUNCTIONS)
 	{
 	  rtx sc_reg = gen_rtx_REG (Pmode, STATIC_CHAIN_REGNUM);
 	  rtx func_sc_offset = GEN_INT (2 * GET_MODE_SIZE (Pmode));
/* { dg-do run } */
/* { dg-options "-mlongcall" } */

extern void abort (void);

#define VAL 12345678

int j = VAL;

void
bar (void)
{
  if (j != VAL)
    abort ();
}

int
main (void)
{
  int i = VAL;

  int foo (void)
  {
    if (i != VAL)
      abort ();
  }

  foo ();
  bar ();

  return 0;
}

[patch, rfc] fix warning building libssp in C11 mode

2014-10-17 Thread Matthias Klose
Building libssp in C11 mode shows a warning for 64bit configurations,

../../../src/libssp/gets-chk.c:62:12: warning: return makes pointer from integer
without a cast [-Wint-conversion]

I am currently working around this by adding a prototype in gets-chk.c,
conditionally defined by the inverse of the condition found in glibc's stdio.h.

Is there a better approach?

  Matthias

# DP: Declare prototype for gets in C11 mode

--- libssp/gets-chk.c
+++ libssp/gets-chk.c
@@ -51,6 +51,11 @@
 # include <string.h>
 #endif
 
+#if !(!defined __USE_ISOC11 \
+    || (defined __cplusplus && __cplusplus <= 201103L))
+extern char *gets (char *);
+#endif
+
 extern void __chk_fail (void) __attribute__((__noreturn__));
 
 char *


Re: -fuse-caller-save - Collect register usage information

2014-10-17 Thread Tom de Vries

On 16-10-14 23:46, Eric Botcazou wrote:

Having said that, in my mind, what is confusing about the name
-fuse-caller-save, is that in fact the caller-save registers are already
used in register allocation. It's just that they're used across calls
without the need to save them, but
-fuse-caller-save-across-calls-without-saving-if-possible is not such a
good option name.


Agreed.


Another thing that - in my mind - is confusing is that there's an option
fcaller-saves which controls behaviour for caller-save registers:
- for -fno-caller-saves, caller-save registers are not used across calls
- for -fcaller-saves, caller-save registers are used across calls
The name is similar to -fuse-caller-save, and it won't be clear from just
the names what the difference is.


OK, so the existing -fcaller-saves is in fact -fuse-caller-saves,


Right, in the sense that a caller-save is the save of a caller-save register,
as opposed to being short for a caller-save register, which is how it's used in
-fuse-caller-save.



which means
that we should really find a better name for yours. :-)



Agreed :)


I've pondered the name -fipa-ira, but I rejected that earlier because that
might suggest actual register allocation at the interprocedural scope,
while this is only register allocation at the scope of a single procedure,
taking some interprocedural information into account. Furthermore, it's not
only ira that uses the interprocedural information.

So, let's generate a list of option names.
-fuse-caller-save
-fuse-call-clobbered
-fprecise-call-clobbers
-foptimize-call-clobbers
-fprune-call-clobbers
-freduce-call-clobbers
-fcall-clobbers-ipa

Any preferences, alternatives?


Given the existing -fcaller-saves, I'd keep caller-saves in the name, so
something along the lines of -foptimize-caller-saves or -fipa-caller-saves.



Let's look at the effect of the option (after the recent fix for PR61605) on 
gcc.target/i386/fuse-caller-save.c:

...
 foo:
 .LFB1:
        .cfi_startproc
-       pushq   %rbx
-       .cfi_def_cfa_offset 16
-       .cfi_offset 3, -16
-       movl    %edi, %ebx
        call    bar
-       addl    %ebx, %eax
-       popq    %rbx
-       .cfi_def_cfa_offset 8
+       addl    %edi, %eax
        ret
        .cfi_endproc
 .LFE1:
...
So, the effect is: instead of using a callee-save register, we use a caller-save 
register to store a value that's live over a call, without needing to add a 
caller-save, as would be normally the case.


If I see an option -foptimize-caller-saves, I'd expect the effect to be that
without it there are some caller-saves, and with it there are fewer. This is not
the case in the diff above. Nevertheless, if we had a case where we already have
caller-saves, that would indeed be the observed effect. I'm just trying to point
out that the optimization does more than just removing caller-saves.


The optimization, at its core, can be regarded as removing superfluous clobbers
from calls, and everything else is derived from that:

- if a caller-save register is not clobbered by a call, then there's no need
  for a caller-save before that call, so it's cheaper to use across that call
  than a callee-save register.
  (which explains what we see in the diff)
- if a caller-save register is live across a call, and is not clobbered by a
  call, then there's no need for a caller-save, and it can be removed.
  (which explains what we see in case we have an example where there are
   actual caller-saves without the optimization, and less so with the
   optimization)

I'm starting to lean towards -foptimize-call-clobbers or similar.

Thanks,
- Tom


Re: -fuse-caller-save - Collect register usage information

2014-10-17 Thread Richard Biener
On Fri, Oct 17, 2014 at 12:47 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 16-10-14 23:46, Eric Botcazou wrote:

 Having said that, in my mind, what is confusing about the name
 -fuse-caller-save, is that in fact the caller-save registers are already
 used in register allocation. It's just that they're used across calls
 without the need to save them, but
 -fuse-caller-save-across-calls-without-saving-if-possible is not such a
 good option name.


 Agreed.

 Another thing that - in my mind - is confusing is that there's an option
 fcaller-saves which controls behaviour for caller-save registers:
 - for -fno-caller-saves, caller-save registers are not used across calls
 - for -fcaller-saves, caller-save registers are used across calls
 The name is similar to -fuse-caller-save, and it won't be clear from just
 the names what the difference is.


 OK, so the existing -fcaller-saves is in fact -fuse-caller-saves,


 Right, in the sense that a caller-save is the save of caller-save register,
 as opposed to short for a caller-save register, which is how it's used in
 -fuse-caller-save.

 which means
 that we should really find a better name for yours. :-)


 Agreed :)

 I've pondered the name -fipa-ira, but I rejected that earlier because that
 might suggest actual register allocation at the interprocedural scope,
 while this is only register allocation at the scope of a single procedure,
 taking some interprocedural information into account. Furthermore, it's not
 only ira that uses the interprocedural information.

 So, let's generate a list of option names.
 -fuse-caller-save
 -fuse-call-clobbered
 -fprecise-call-clobbers
 -foptimize-call-clobbers
 -fprune-call-clobbers
 -freduce-call-clobbers
 -fcall-clobbers-ipa

 Any preferences, alternatives?


 Given the existing -fcaller-saves, I'd keep caller-saves in the name, so
 something along the lines of -foptimize-caller-saves or
 -fipa-caller-saves.


 Let's look at the effect of the option (after the recent fix for PR61605) on
 gcc.target/i386/fuse-caller-save.c:
 ...
  foo:
  .LFB1:
         .cfi_startproc
 -       pushq   %rbx
 -       .cfi_def_cfa_offset 16
 -       .cfi_offset 3, -16
 -       movl    %edi, %ebx
         call    bar
 -       addl    %ebx, %eax
 -       popq    %rbx
 -       .cfi_def_cfa_offset 8
 +       addl    %edi, %eax
         ret
         .cfi_endproc
  .LFE1:
 ...
 So, the effect is: instead of using a callee-save register, we use a
 caller-save register to store a value that's live over a call, without
 needing to add a caller-save, as would be normally the case.

 If I see an option -foptimize-caller-saves, I'd expect the effect to be that
 without, there are some caller-saves and with, there are less. This is not
 the case in the diff above. Nevertheless, if we'd have a case where we
 already have caller-saves, that would be indeed the observed effect. I'm
 just trying to point out that the optimization does more than just removing
 caller-saves.

 The optimization, at its core, can be regarded as removing superfluous
 clobbers from calls, and everything else is derived from that:
 - if a caller-save register is not clobbered by a call, then there's no need
   for a caller-save before that call, so it's cheaper to use across that call
   than a callee-save register.
   (which explains what we see in the diff)
 - if a caller-save register is live across a call, and is not clobbered by a
   call, then there's no need for a caller-save, and it can be removed.
   (which explains what we see in case we have an example where there are
    actual caller-saves without the optimization, and less so with the
    optimization)

 I'm starting to lean towards -foptimize-call-clobbers or similar.

Well, it is really some form of IPA-driven register allocation.  Whether
you want to call it -fipa-ra or not is another question - but if we had
such an option then enabling it with that option would be fine.

Also users may have no idea what call vs callee clobbers are, but
IPA RA may be a term that is more widely known (or at least google
can come up with something for you).

So - I like -fipa-ra more.

I can't see the obvious difference between -foptimize-caller-saves
and -foptimize-call-clobbers (for the latter -fipa-call-clobbers would
be more to the point?)

Richard.

 Thanks,
 - Tom


Re: [PATCHv4][Kasan] Allow to override Asan shadow offset from command line

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 11:53:01AM +0400, Yury Gribov wrote:
 From 0225b7878bbb5b803814646d089824d016316fef Mon Sep 17 00:00:00 2001
 From: Yury Gribov y.gri...@samsung.com
 Date: Thu, 16 Oct 2014 18:31:10 +0400
 Subject: [PATCH 1/2] Add strtoull to libiberty.
 
 2014-10-17  Yury Gribov  y.gri...@samsung.com
 
 libiberty/
   * strtoull.c: New file.

Just putting a file in there won't magically make it be part of libiberty.
Please read libiberty/README on how to add an optional file.
strtoul is also optional, as strtoull should be, so you can also just
grep for strtoul in libiberty/* and add strtoull in similar spots.

Not sure if there aren't extra steps to make strtoull prototype available
in system.h, libiberty.h etc. for systems that don't have strtoull in their
headers.

CCing Ian as libiberty maintainer.

Jakub


[C PATCH] Enable initializing statics with COMPOUND_LITERAL_EXPR in C99 (PR c/63567)

2014-10-17 Thread Marek Polacek
Building Linux kernel failed with 'error: initializer element is not
constant', because they're initializing objects with static storage
duration with (T){ ...} - and that isn't permitted in gnu99/gnu11.

I think the Right Thing is to allow some latitude here and enable it
even in gnu99/gnu11 unless -pedantic.  In gnu89, this will work as
before even with -pedantic.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-17  Marek Polacek  pola...@redhat.com

PR c/63567
* c-typeck.c (digest_init): Allow initializing objects with static
storage duration with compound literals in non-pedantic mode.

* gcc.dg/pr63567.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 5c0697a..8ddf368 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -6676,7 +6676,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
inside_init = convert (type, inside_init);
 
   if (require_constant
-      && (code == VECTOR_TYPE || !flag_isoc99)
+      && (code == VECTOR_TYPE || !pedantic || !flag_isoc99)
       && TREE_CODE (inside_init) == COMPOUND_LITERAL_EXPR)
{
  /* As an extension, allow initializing objects with static storage
diff --git gcc/testsuite/gcc.dg/pr63567.c gcc/testsuite/gcc.dg/pr63567.c
index e69de29..cf942ef 100644
--- gcc/testsuite/gcc.dg/pr63567.c
+++ gcc/testsuite/gcc.dg/pr63567.c
@@ -0,0 +1,11 @@
+/* PR c/63567 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+/* Allow initializing objects with static storage duration with
+   compound literals even in non-pedantic gnu99/gnu11.  This is
+   being used in Linux kernel.  */
+
+struct T { int i; };
+struct S { struct T t; };
+static struct S s = (struct S) { .t = { 42 } };

Marek


[PATCH] Don't expand string/memory builtins if ASan is enabled.

2014-10-17 Thread Maxim Ostapenko

Hi,

this patch disables inline expansion of string/memory builtin functions if
ASan is enabled. As described in my previous post
(https://gcc.gnu.org/ml/gcc/2014-09/msg00020.html), this allows us to be
sure that some dangerous builtins (strcpy, stpcpy, etc.) will be handled
correctly. Also, some redundant checks will be removed for builtin
functions that are instrumented but later not inlined for some reason.


The patch also changes the logic for updating asan_mem_ref_hash. I eliminated
the memory ref access size from the hash computation, so all accesses to the
same memory reference have the same hash. asan_mem_ref_hash is updated
only if the new access size is greater than the saved one.


I've done some performance testing (spec2006 v1.1) on
x86_64-unknown-linux-gnu and attached the results in test.res (sorry for
this, I couldn't make my Thunderbird make a pretty table).


Regtested / bootstrapped on x86_64-unknown-linux-gnu.

Does this patch look sane?

-Maxim
$ ~/install/master-x86_64/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/max/install/master-x86_64/bin/gcc
COLLECT_LTO_WRAPPER=/home/max/install/master-x86_64/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /home/max/workspace/downloads/gcc/configure --enable-multilib 
--enable-checking --target=x86_64-unknown-linux-gnu 
--host=x86_64-unknown-linux-gnu --build=x86_64-unknown-linux-gnu 
--prefix=/home/max/install/master-x86_64 --disable-bootstrap 
--enable-languages=c,c++
Thread model: posix
gcc version 5.0.0 20141014 (experimental) (GCC)


Compile options:  -O3 -fsanitize=address -static-libasan.


test             master  nobuiltin  % of slowdown
400.perlbench      1044       1050         0,5747
401.bzip2           682        676        -0,8798
403.gcc             497        495        -0,4024
429.mcf             488        489         0,2049
445.gobmk           723        724         0,1383
456.hmmer           783        750        -4,2146
458.sjeng           887        880        -0,7892
462.libquantum      330        323        -2,1212
464.h264ref        1108       1154         4,1516
471.omnetpp         545        559         2,5688
473.astar           490        480        -2,0408
483.xalancbmk       411        400        -2,6764
433.milc            517        509        -1,5474
444.namd            419        419         0,0000
450.soplex          310        299        -3,5484
453.povray          287        276        -3,8328
470.lbm             299        306         2,3411
482.sphinx3         777        804         3,4749

Geomean:
master  nobuiltin  % of increase
   540        538          -0,50

gcc/ChangeLog:

2014-10-17  Max Ostapenko  m.ostape...@partner.samsung.com

	* asan.c (asan_mem_ref_hasher::hash): Remove MEM_REF access size from
	hash value construction. Call iterative_hash_expr instead of explicit
	hash building.
	(asan_mem_ref_hasher::equal): Change condition.
	(has_mem_ref_been_instrumented): Likewise.
	(update_mem_ref_hash_table): Likewise.
	(maybe_update_mem_ref_hash_table): New function.
	(instrument_strlen_call): Removed.
	(instrument_mem_region_access): Likewise.
	(instrument_builtin_call): Call maybe_update_mem_ref_hash_table instead
	of instrument_mem_region_access.
	* builtins.c (is_memory_builtin): New function.
	(expand_builtin): Don't expand string/memory builtin functions if ASan
	is enabled.
	* builtins.def: Add comment.

gcc/testsuite/ChangeLog:

2014-10-17  Max Ostapenko  m.ostape...@partner.samsung.com

	* c-c++-common/asan/no-redundant-instrumentation-1.c: Updated test.
	* c-c++-common/asan/no-redundant-instrumentation-4.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-5.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-6.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-7.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-8.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-2.c: Removed.
	* c-c++-common/asan/no-redundant-instrumentation-9.c: Likewise.
	* c-c++-common/asan/no-redundant-instrumentation-10.c: New test.
	* c-c++-common/asan/no-redundant-instrumentation-11.c: Likewise.

diff --git a/gcc/asan.c b/gcc/asan.c
index 2a61a82..391f693 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -352,10 +352,7 @@ struct asan_mem_ref_hasher
 inline hashval_t
 asan_mem_ref_hasher::hash (const asan_mem_ref *mem_ref)
 {
-  inchash::hash hstate;
-  inchash::add_expr (mem_ref->start, hstate);
-  hstate.add_wide_int (mem_ref->access_size);
-  return hstate.end ();
+  return iterative_hash_expr (mem_ref->start, 0);
 }
 
 /* Compare two memory references.  We accept the length of either
@@ -365,8 +362,7 @@ inline bool
 asan_mem_ref_hasher::equal (const asan_mem_ref *m1,
 			const asan_mem_ref *m2)
 {
-  return (m1->access_size == m2->access_size
-	  && operand_equal_p (m1->start, m2->start, 0));
+  return operand_equal_p (m1->start, m2->start, 0);
 }
 
 static hash_table<asan_mem_ref_hasher> *asan_mem_ref_ht;
@@ -417,7 +413,8 @@ has_mem_ref_been_instrumented (tree ref, HOST_WIDE_INT access_size)
   asan_mem_ref r;
   

[C PATCH] Make -Wno-implicit-int work in C99 mode

2014-10-17 Thread Marek Polacek
C99 mode warns about defaulting to int by default, but without
the possibility to suppress the warning with -Wno-implicit-int.
This is likely to arouse the ire of the users, especially with
the new default.

Therefore the following patch tweaks warn_implicit_int in such
a way that -Wimplicit and -Wimplicit-int should work as intended
(following the rule that the more specific option takes precedence
over the less specific one).  There should be no changes in GNU89
mode.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-17  Marek Polacek  pola...@redhat.com

c-family/
* c-opts.c (c_common_post_options): Set warn_implicit_int.
* c.opt (Wimplicit-int): Initialize to -1.
c/
* c-decl.c (grokdeclarator): Use OPT_Wimplicit_int unconditionally.
(start_function): Use OPT_Wimplicit_int instead of 0.
(store_parm_decls_oldstyle): Likewise.
testsuite/
* gcc.dg/Wimplicit-int-1.c: New test.
* gcc.dg/Wimplicit-int-2.c: New test.
* gcc.dg/Wimplicit-int-3.c: New test.
* gcc.dg/Wimplicit-int-4.c: New test.

diff --git gcc/c-family/c-opts.c gcc/c-family/c-opts.c
index eb078e3..448eb3e 100644
--- gcc/c-family/c-opts.c
+++ gcc/c-family/c-opts.c
@@ -864,6 +864,10 @@ c_common_post_options (const char **pfilename)
   if (warn_implicit_function_declaration == -1)
 warn_implicit_function_declaration = flag_isoc99;
 
+  /* -Wimplicit-int is enabled by default for C99.  */
+  if (warn_implicit_int == -1)
+warn_implicit_int = flag_isoc99;
+
   /* Declone C++ 'structors if -Os.  */
   if (flag_declone_ctor_dtor == -1)
 flag_declone_ctor_dtor = optimize_size;
diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 72ac2ed..4f96cf8 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -488,7 +488,7 @@ C ObjC Var(warn_implicit_function_declaration) Init(-1) Warning LangEnabledBy(C
 Warn about implicit function declarations
 
 Wimplicit-int
-C ObjC Var(warn_implicit_int) Warning LangEnabledBy(C ObjC,Wimplicit)
+C ObjC Var(warn_implicit_int) Init(-1) Warning LangEnabledBy(C ObjC,Wimplicit)
 Warn when a declaration does not specify a type
 
 Wimport
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 839c67b..b18da48 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5330,11 +5330,11 @@ grokdeclarator (const struct c_declarator *declarator,
   else
{
  if (name)
-   warn_defaults_to (loc, flag_isoc99 ? 0 : OPT_Wimplicit_int,
+   warn_defaults_to (loc, OPT_Wimplicit_int,
 			  "type defaults to %<int%> in declaration "
 			  "of %qE", name);
  else
-   warn_defaults_to (loc, flag_isoc99 ? 0 : OPT_Wimplicit_int,
+   warn_defaults_to (loc, OPT_Wimplicit_int,
 			  "type defaults to %<int%> in type name");
}
 }
@@ -8120,7 +8120,7 @@ start_function (struct c_declspecs *declspecs, struct c_declarator *declarator,
 }
 
   if (warn_about_return_type)
-warn_defaults_to (loc, flag_isoc99 ? 0
+warn_defaults_to (loc, flag_isoc99 ? OPT_Wimplicit_int
   : (warn_return_type ? OPT_Wreturn_type
  : OPT_Wimplicit_int),
 		      "return type defaults to %<int%>");
@@ -8429,7 +8429,8 @@ store_parm_decls_oldstyle (tree fndecl, const struct c_arg_info *arg_info)
 
  if (flag_isoc99)
pedwarn (DECL_SOURCE_LOCATION (decl),
-		 0, "type of %qD defaults to %<int%>", decl);
+		 OPT_Wimplicit_int, "type of %qD defaults to %<int%>",
+		 decl);
  else
warning_at (DECL_SOURCE_LOCATION (decl),
OPT_Wmissing_parameter_type,
diff --git gcc/testsuite/gcc.dg/Wimplicit-int-1.c gcc/testsuite/gcc.dg/Wimplicit-int-1.c
index e69de29..0c89caf 100644
--- gcc/testsuite/gcc.dg/Wimplicit-int-1.c
+++ gcc/testsuite/gcc.dg/Wimplicit-int-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+static l; /* { dg-warning "type defaults to" } */
+
+foo (a) /* { dg-warning "return type defaults to" } */
+/* { dg-warning "type of .a. defaults to .int." "type" { target *-*-* } 6 } */
+{
+  auto p; /* { dg-warning "type defaults to" } */
+  typedef bar; /* { dg-warning "type defaults to" } */
+}
diff --git gcc/testsuite/gcc.dg/Wimplicit-int-2.c gcc/testsuite/gcc.dg/Wimplicit-int-2.c
index e69de29..158b61c 100644
--- gcc/testsuite/gcc.dg/Wimplicit-int-2.c
+++ gcc/testsuite/gcc.dg/Wimplicit-int-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-pedantic-errors" } */
+
+static l; /* { dg-error "type defaults to" } */
+
+foo (a) /* { dg-error "return type defaults to" } */
+/* { dg-error "type of .a. defaults to .int." "type" { target *-*-* } 6 } */
+{
+  auto p; /* { dg-error "type defaults to" } */
+  typedef bar; /* { dg-error "type defaults to" } */
+}
diff --git gcc/testsuite/gcc.dg/Wimplicit-int-3.c gcc/testsuite/gcc.dg/Wimplicit-int-3.c
index e69de29..654ce73 100644
--- 

[PATCH][match-and-simplify] Merge from trunk

2014-10-17 Thread Richard Biener

Lightly tested, committed.

Richard.

2014-10-17  Richard Biener  rguent...@suse.de

Merge from trunk r216316 through r216394.



Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-17 Thread Richard Biener
On Fri, 17 Oct 2014, Richard Biener wrote:

 On Fri, 17 Oct 2014, Ramana Radhakrishnan wrote:
 
  On Wed, Oct 15, 2014 at 5:29 PM, Kyrill Tkachov kyrylo.tkac...@arm.com 
  wrote:
  
   On 15/10/14 14:00, Richard Biener wrote:
  
  
   Any comments and reviews welcome (I don't think that
   my maintainership covers enough to simply check this in
   without approval).
  
   Hi Richard,
  
   The match-and-simplify branch bootstrapped successfully on
   aarch64-none-linux-gnu FWIW.
  
  
  What about regression tests ?
 
 Note the branch isn't regression free on x86_64 either.  The branch
 does more than I want to merge to trunk (and it also retains all
 folding code I added patterns for).  I've gone farther there to
 explore whether it will end up working in the end and what kind
 of features the IL and the APIs need.
 
 I've pasted testsuite results on x86_64 below for rev. 216324
 which is based on trunk rev. 216315 which unfortunately has
 lots of regressions on its own.
 
 This is why I want to restrict the effect of the machinery to
 fold (), fold_stmt () and tree-ssa-forwprop.c for the moment
 and merge individual patterns (well, maybe in small groups)
 separately to allow for easy bi-section.
 
 I suppose I should push the most visible change to trunk first,
 namely tree-ssa-forwprop.c folding all statements via fold_stmt
 after the merge.  I suspect this alone can have some odd effects
 like the sub + cmp fusing.  That would be sth like the patch
 attached below.

Just finished testing this (with -m32 on x86_64), showing regressions
in the testsuite like

FAIL: gcc.dg/tree-ssa/slsr-19.c scan-tree-dump-times optimized  * y 1
FAIL: gcc.dg/vect/bb-slp-27.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 basic block vectorized 1
FAIL: gcc.dg/vect/bb-slp-27.c scan-tree-dump-times slp2 basic block vectorized 1
FAIL: gcc.dg/vect/bb-slp-8b.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 basic block vectorized 1
FAIL: gcc.dg/vect/bb-slp-8b.c scan-tree-dump-times slp2 basic block vectorized 1
FAIL: gcc.dg/vect/slp-cond-3.c -flto -ffat-lto-objects  scan-tree-dump-times vect vectorizing stmts using SLP 1
FAIL: gcc.dg/vect/slp-cond-3.c scan-tree-dump-times vect vectorizing stmts using SLP 1

Bah.

I suppose I need to investigate this (simply folding a stmt shouldn't
cause any of the above... - with SLP it is probably operand
canonicalization, but not sure).

Richard.

 Richard.
 
 Index: gcc/tree-ssa-forwprop.c
 ===
 --- gcc/tree-ssa-forwprop.c   (revision 216258)
 +++ gcc/tree-ssa-forwprop.c   (working copy)
 @@ -54,6 +54,8 @@ along with GCC; see the file COPYING3.
  #include "tree-ssa-propagate.h"
  #include "tree-ssa-dom.h"
  #include "builtins.h"
 +#include "tree-cfgcleanup.h"
 +#include "tree-into-ssa.h"
  
  /* This pass propagates the RHS of assignment statements into use
 sites of the LHS of the assignment.  It's basically a specialized
 @@ -3586,6 +3588,8 @@ simplify_mult (gimple_stmt_iterator *gsi
  
return false;
  }
 +
 +
  /* Main entry point for the forward propagation and statement combine
 optimizer.  */
  
 @@ -3626,6 +3630,40 @@ pass_forwprop::execute (function *fun)
  
cfg_changed = false;
  
 +  /* Combine stmts with the stmts defining their operands.  Do that
 +     in an order that guarantees visiting SSA defs before SSA uses.  */
 +  int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (fun));
 +  int postorder_num = inverted_post_order_compute (postorder);
 +  for (int i = 0; i < postorder_num; ++i)
 +    {
 +      bb = BASIC_BLOCK_FOR_FN (fun, postorder[i]);
 +      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
 +           !gsi_end_p (gsi); gsi_next (&gsi))
 +        {
 +          gimple stmt = gsi_stmt (gsi);
 +          gimple orig_stmt = stmt;
 +
 +          if (fold_stmt (&gsi))
 +            {
 +              stmt = gsi_stmt (gsi);
 +              if (maybe_clean_or_replace_eh_stmt (orig_stmt, stmt)
 +                  && gimple_purge_dead_eh_edges (bb))
 +                cfg_changed = true;
 +              update_stmt (stmt);
 +            }
 +        }
 +    }
 +  free (postorder);
 +
 +  /* ???  Code below doesn't expect non-renamed VOPs and the above
 +     doesn't keep virtual operand form up-to-date.  */
 +  if (cfg_changed)
 +    {
 +      cleanup_tree_cfg ();
 +      cfg_changed = false;
 +    }
 +  update_ssa (TODO_update_ssa_only_virtuals);
 +
FOR_EACH_BB_FN (bb, fun)
  {
gimple_stmt_iterator gsi;
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PATCH 9/17] Initial KAsan support

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 12:44:26PM +0400, Yury Gribov wrote:
 On 10/17/2014 11:43 AM, Eric Botcazou wrote:
 Well, that's a backport of ancient patch from trunk so all credits go
 there. And flag_sanitize is indeed handled differently from other
 compiler flags.
 
 Really curious to know why...
 
 I bet this was a typo but let's wait for Jakub's comments.

The obvious change to replace the direct flag with opts->x_ is preapproved.

Jakub


[PATCH] PR63442 ICE with ubsan/overflow-int128.c test on AArch64

2014-10-17 Thread Jiong Wang

The cause appears to be a minor bug in prepare_cmp_insn.

The last mode parameter, pmode, of prepare_cmp_insn should match the
mode of the first parameter, x.  During the recursive call of
prepare_cmp_insn, however, x has the mode returned by
targetm.libgcc_cmp_return_mode () while pmode is assigned word_mode.

Generally this is OK, because the default libgcc_cmp_return_mode hook always
returns word_mode, but AArch64 has a target-private implementation which always
returns SImode, so there is a mismatch which causes an ICE later.

This minor issue has stayed hidden because nearly all other targets use the
default hook, and this compare path is rarely invoked.

Thanks

gcc/
  PR target/63442
  * optabs.c (prepare_cmp_insn): Use target hook libgcc_cmp_return_mode
  instead of word_mode.

diff --git a/gcc/optabs.c b/gcc/optabs.c
index d55a6bb..3073816 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -4264,7 +4264,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code comparison, rtx size,
 	y = const0_rtx;
 	}

-  *pmode = word_mode;
+  *pmode = targetm.libgcc_cmp_return_mode ();
   prepare_cmp_insn (x, y, comparison, NULL_RTX, unsignedp, methods,
 			ptest, pmode);
 }

Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Jakub Jelinek
On Thu, Oct 16, 2014 at 12:34:35PM +0400, Yury Gribov wrote:
 Hi all,
 
 As discussed in https://gcc.gnu.org/ml/gcc/2014-09/msg00234.html , this
 patchset backports mainline patches necessary for Kernel ASan in GCC 4.9
 (gcc-4_9-branch). The patchset consists of
 * Asan headers installation (1 patch)
 * __asan_loadN/__asan_storeN support (3 patches)
 * instrumentation with calls support (1 patch)
 * optimization of strlen instrumentation (1 patch)
 * Kasan support (3 patches)
 * move inlining to sanopt (1 patch)
 * bugfixes (7 patches)
 
 To my knowledge it does not contain any changes that would influence ABI of
 generated code.
 
 The code was bootstrapped and regtested on x64 (I only tested the net
 result, not each patch in isolation).

I had a brief look at what ended up on the branch in the end, and
from what I understand, the 4.9 libasan.so has
__asan_report_store_n and __asan_report_load_n entry points, but does
not have any __asan_loadN/__asan_storeN entrypoints (neither 1/2/4/8/16,
nor variable).
So, what the branch does seems to not match what the library provides.
E.g. trying:
struct S { long long a; long long b; char c; };

void
foo (struct S *p, struct S *q)
{
  *p = *q;
}

int
bar (struct S *p)
{
  return p->a;
}

on x86_64-linux, with -fsanitize=kernel-address -O2 I get expected
__asan_storeN/__asan_loadN/__asan_load8 calls.
With -fsanitize=address -O2, foo unexpectedly is not instrumented
(IMHO it should be, it can use __asan_report_{store,load}_n) and
bar uses (expectedly) __asan_report_load8.
With -fsanitize=address -O2 --param asan-instrumentation-with-call-threshold=0
foo is again unexpectedly not instrumented, and bar is instrumented
with __asan_load8, which looks wrong to me, because the library does not
provide any such entry point.

Thus, IMHO the:
  if ((flag_sanitize & SANITIZE_USER_ADDRESS) != 0
      && ((size_in_bytes & (size_in_bytes - 1)) != 0
          || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16))
    return;
should be nuked from 4.9, we can do unaligned/non-{1,2,4,8,16}
accesses fine.  Instead, execute_sanopt should force !use_calls
for (flag_sanitize & SANITIZE_USER_ADDRESS).

Or were there any bugfixes needed for __asan_report_{store,load}_n
on the library side?

Jakub


[v3 patch] partially fix testsuite/27_io/headers/cstdio/types_std.cc

2014-10-17 Thread Jonathan Wakely

testsuite/27_io/headers/cstdio/types_std.cc FAILs on dragonflybsd:

/mnt/gcc-src/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc:25:13:
error: aggregate 'FILE gnu::f' has incomplete type and cannot be
defined
/mnt/gcc-src/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc:26:13:
error: aggregate 'FILE gnu::fpos_t' has incomplete type and cannot be
defined

These errors look correct to me, the C standard says that stdio.h
declares FILE as an object type, but it doesn't say complete object
type, so I think that's a bug in the test.

I think there's another bug:

#include <cstdio>

namespace gnu
{
 std::size_t s;
 std::FILE f;
 std::FILE fpos_t;
}

Surely that third declaration should be testing that fpos_t is a valid
type, rather than declaring a variable of that name, so I'm committing
the attached patch, which also fixes another fail on dragonflybsd.

Tested x86_64-linux, committed to trunk.

commit c388b3dc00c256c9cc5d8ae7b5bc37386bd14c58
Author: Jonathan Wakely jwak...@redhat.com
Date:   Fri Oct 17 12:57:31 2014 +0100

	* testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc:
	Add dg-require-string-conversions.
	* testsuite/27_io/headers/cstdio/types_std.cc: Test for fpos_t.

diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
index 485a485..7fe6ff8 100644
--- a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-string-conversions "" }
 
 // 2014-03-27 Rüdiger Sonderfeld
 // test the hexadecimal floating point inserters (facet num_put)
diff --git a/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc b/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc
index a359b87..a34663f 100644
--- a/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc
+++ b/libstdc++-v3/testsuite/27_io/headers/cstdio/types_std.cc
@@ -23,5 +23,5 @@ namespace gnu
 {
   std::size_t s;
   std::FILE f;
-  std::FILE fpos_t;
+  std::fpos_t p;
 }


Re: [libstdc++ PATCH] Implement the Library Fundamentals v1 variable templates for type traits

2014-10-17 Thread Jonathan Wakely

On 16/10/14 23:04 +0300, Ville Voutilainen wrote:

Argh, needed to do uglification, and formatting fixes.
Also renamed the Dummy types in tests to something
a bit more descriptive. I'm not using the tr2 test types because the
test types in these tests
are amalgamations of multiple properties to keep the tests simple(r).


Thanks! Tested and committed to trunk.


Re: [PATCH] Don't expand string/memory builtins if ASan is enabled.

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 03:45:52PM +0400, Maxim Ostapenko wrote:
 The patch also changes the logic for updating asan_mem_ref_hash. I removed the
 memory reference access size from the hash computation, so all accesses to the
 same memory reference have the same hash. asan_mem_ref_hash is updated only if
 the new access size is greater than the saved one.

I guess that is reasonable.
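
A minimal sketch of the hashing change described in the quoted text, not the code from asan.c. It assumes a small fixed-size table with linear probing; `struct mem_ref`, `hash_ref` and `record_access` are hypothetical names. The hash uses only the address, and the stored size grows monotonically:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ASan memory reference entry.  */
struct mem_ref
{
  const void *start;    /* Address being accessed.  */
  size_t access_size;   /* Largest size seen so far; 0 marks an empty slot.  */
};

#define TABLE_SIZE 64
static struct mem_ref table[TABLE_SIZE];

/* The hash depends on the address only, so every access to the same
   reference lands in the same bucket regardless of its size.  */
static size_t
hash_ref (const void *start)
{
  return ((uintptr_t) start >> 3) % TABLE_SIZE;
}

/* Record an access of SIZE bytes (SIZE > 0) at START.  The saved size is
   replaced only when the new one is larger.  Returns the size stored
   after the update.  The sketch assumes the table never fills up.  */
size_t
record_access (const void *start, size_t size)
{
  size_t i = hash_ref (start);

  while (table[i].access_size != 0 && table[i].start != start)
    i = (i + 1) % TABLE_SIZE;
  if (table[i].access_size == 0)
    table[i].start = start;
  if (size > table[i].access_size)
    table[i].access_size = size;
  return table[i].access_size;
}
```

With this shape, a later lookup for a smaller access to the same address finds the existing entry and can conclude the bytes are already covered.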

 -/* Instrument an access to a contiguous memory region that starts at
 -   the address pointed to by BASE, over a length of LEN (expressed in
 -   the sizeof (*BASE) bytes).  ITER points to the instruction before
 -   which the instrumentation instructions must be inserted.  LOCATION
 -   is the source location that the instrumentation instructions must
 -   have.  If IS_STORE is true, then the memory access is a store;
 -   otherwise, it's a load.  */
 +/*  Insert a memory reference into the hash table if access length
 +can be determined in compile time.  */

...

If you don't expand the memops builtins inline, I'd expect you start with
get_mem_refs_of_builtin_call and remove all the builtins you stop
expanding specially (i.e. emit a libcall instead unconditionally) that are
handled by libsanitizer (only a subset of them are apparently, perhaps
something to fix) from there.

There are builtins that must be kept instrumented (e.g. all the
sync/atomic builtins).  There are builtins which might need first additions
to libsanitizer (e.g. I see no __*_chk functions in libsanitizer).

 +/* Returns TRUE if given FCODE corresponds to string or memory builtin 
 function.
 + */
 +
 +static inline bool
 +is_memory_builtin (enum built_in_function fcode)
 +{
 +  return fcode >= BUILT_IN_STRSTR && fcode <= BUILT_IN_BCMP;

This is too fragile and ugly.
IMHO you should list (supposedly not in a special inline, but directly
where you use it) in a switch all the builtins you don't want to expand.
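
To illustrate the objection, here is a small sketch with a hypothetical, shortened enum (`enum builtin_code` is not GCC's `enum built_in_function`). The range test silently accepts an unrelated builtin that happens to sit between the two endpoints, while the explicit switch does not:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a builtin enumeration.  The order is
   arbitrary on purpose.  */
enum builtin_code
{
  BI_MEMCPY,
  BI_SQRT,          /* Unrelated builtin inside the "memory" range.  */
  BI_STRLEN,
  BI_MEMSET,
  BI_ATOMIC_LOAD
};

/* Fragile: the result depends on enumerator order.  */
bool
is_memory_builtin_by_range (enum builtin_code code)
{
  return code >= BI_MEMCPY && code <= BI_MEMSET;
}

/* Explicit: every builtin that is treated specially is named.  */
bool
is_memory_builtin_by_switch (enum builtin_code code)
{
  switch (code)
    {
    case BI_MEMCPY:
    case BI_STRLEN:
    case BI_MEMSET:
      return true;
    default:
      return false;
    }
}
```

The sketch wraps the switch in a helper only so it can be tested in isolation; the review suggests writing the switch directly at the place of use.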

Jakub


Re: [PATCH i386 AVX512] [75/n] Update vec_init.

2014-10-17 Thread Kirill Yukhin
Hello Jakub,
On 15 Oct 18:23, Jakub Jelinek wrote:
 On Thu, Oct 09, 2014 at 04:13:25PM +0400, Kirill Yukhin wrote:
  --- a/gcc/config/i386/i386.c
  +++ b/gcc/config/i386/i386.c
  @@ -39821,6 +39823,9 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
  enum machine_mode mode,
 goto widen;
   
   case V8HImode:
  +  if (TARGET_AVX512VL)
  +return ix86_vector_duplicate_value (mode, target, val);
  +
 
 Shouldn't that be TARGET_AVX512VL && TARGET_AVX512BW ?
Nice catch! Fixed.
 
 if (TARGET_SSE2)
  {
struct expand_vec_perm_d dperm;
  @@ -39851,6 +39856,9 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
  enum machine_mode mode,
 goto widen;
   
   case V16QImode:
  +  if (TARGET_AVX512VL)
  +return ix86_vector_duplicate_value (mode, target, val);
  +
 
 Ditto.
Ditto.

 if (TARGET_SSE2)
  goto permute;
 goto widen;
  @@ -39880,16 +39888,19 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
  enum machine_mode mode,
   
   case V16HImode:
   case V32QImode:
  -  {
  -   enum machine_mode hvmode = (mode == V16HImode ? V8HImode : V16QImode);
  -   rtx x = gen_reg_rtx (hvmode);
  +  if (TARGET_AVX512VL)
  +return ix86_vector_duplicate_value (mode, target, val);
 
 Ditto.
Ditto.
 
  @@ -40503,6 +40515,42 @@ half:
gen_rtx_VEC_CONCAT (mode, op0, op1)));
 return;
   
  +case V64QImode:
  +  quarter_mode = V16QImode;
  +  half_mode = V32QImode;
  +  goto quarter;
  +
  +case V32HImode:
  +  quarter_mode = V8HImode;
  +  half_mode = V16HImode;
  +  goto quarter;
 
 I wonder whether for these modes it can ever be beneficial to build them
 through interleaves/concatenations etc., if it wouldn't be better to build
 them by storing all values into memory and just reading it back.
I've tried this example:
#include <immintrin.h>

unsigned char a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14,
  a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, a29,
  a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, a44,
  a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, a59,
  a60, a61, a62, a63;

__m512i foo ()
{
  return __extension__ (__m512i)(__v64qi){
a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14,
  a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, a29,
  a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, a44,
  a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, a59,
  a60, a61, a62, a63 };
}

w/ and w/o -mavx512bw (and always -mavx512f).

When this code works, we get 127 lines of assembly to do this init.
W/o AVX-512BW we get > 300 lines of code (mostly on GPRs, using sal, and,
etc.)

Then I looked into the actual assembly w/ -mavx512bw and it turned out that no
AVX-512BW insns were generated, only AVX-512F (and below). Fixed the iterator.

 
  -(define_mode_iterator VI48F_512 [V16SI V16SF V8DI V8DF])
  +(define_mode_iterator VI48F_I12_AVX512BW
  +  [V16SI V16SF V8DI V8DF
  +  (V32HI TARGET_AVX512BW) (V64QI TARGET_AVX512BW)])
 
 What does the I12 stand for?  Wasn't it meant to be VI48F_512_AVX512BW
 or I512?
Actually, I am not aware of any naming convention for iterators.
As far as I understand, the name [more or less] of a vector mode iterator
should reflect:
  - Type family of the unit: float or int
  - Size of the unit: 1, 2, 4 etc. bytes
  - If possible, target predicates to enable certain modes in
given iterator.

The name is:
  - Vector (V)
  - I48F - contains both ints and floats of size 4 and 8
  - I12 - contains ints of size 1 and 2
  - AVX512BW - affected by the target (according to previous note - to be 
removed)

Maybe it'll be better to name it: VF48_I1248?

--
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index baf0d3d..c3202c4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -39760,6 +39760,8 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, enum 
machine_mode mode,
 case V8SFmode:
 case V8SImode:
 case V2DFmode:
+case V64QImode:
+case V32HImode:
 case V2DImode:
 case V4SFmode:
 case V4SImode:
@@ -39790,6 +39792,9 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, enum 
machine_mode mode,
   goto widen;
 
 case V8HImode:
+  if (TARGET_AVX512VL && TARGET_AVX512BW)
+return ix86_vector_duplicate_value (mode, target, val);
+
   if (TARGET_SSE2)
{
  struct expand_vec_perm_d dperm;
@@ -39820,6 +39825,9 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, enum 
machine_mode mode,
   goto widen;
 
 case V16QImode:
+  if (TARGET_AVX512VL && TARGET_AVX512BW)
+return ix86_vector_duplicate_value (mode, target, val);
+
   if (TARGET_SSE2)
goto permute;
   goto widen;
@@ -39849,16 +39857,19 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, enum 

Re: [PATCH][1/n] Merge from match-and-simplify, public API

2014-10-17 Thread Jakub Jelinek
On Wed, Oct 15, 2014 at 01:40:07PM +0200, Richard Biener wrote:
 2014-10-15  Richard Biener  rguent...@suse.de
 
   * gimple-fold.h (gimple_build): Declare various overloads.
   (gimple_simplify): Likewise.
   (gimple_convert): Re-implement in terms of gimple_build.
   * gimple-fold.c (gimple_convert): Remove.
   (gimple_build): New functions.
 
 --- 45,141 
   extern bool arith_code_with_undefined_signed_overflow (tree_code);
   extern gimple_seq rewrite_to_defined_overflow (gimple);
   
 ! /* gimple_build, functionally matching fold_buildN, outputs stmts
 !int the provided sequence, matching and simplifying them on-the-fly.
 !Supposed to replace force_gimple_operand (fold_buildN (...), ...).  */
 ! tree gimple_build (gimple_seq *, location_t,
 !enum tree_code, tree, tree,
 !tree (*valueize) (tree) = NULL);

I find mixing prototypes with and without extern keyword weird,
most of the prototypes in headers use extern, I think it would be cleaner
to use it everywhere.

 *** gcc/gimple-fold.c.orig2014-10-14 15:49:30.634356179 +0200
 --- gcc/gimple-fold.c 2014-10-15 13:02:08.158099055 +0200
 *** along with GCC; see the file COPYING3.
 *** 56,61 
 --- 56,62 
   #include "builtins.h"
   #include "output.h"
   
 + 
   /* Return true when DECL can be referenced from current unit.
  FROM_DECL (if non-null) specify constructor of variable DECL was taken 
 from.
  We can get declarations that are not possible to reference for various

Why the whitespace change?
   
   tree
 ! gimple_convert (gimple_seq *seq, location_t loc, tree type, tree op)
   {
 !   if (useless_type_conversion_p (type, TREE_TYPE (op)))
 ! return op;
 !   op = fold_convert_loc (loc, type, op);
 !   gimple_seq stmts = NULL;
 !   op = force_gimple_operand (op, &stmts, true, NULL_TREE);
 !   gimple_seq_add_seq_without_update (seq, stmts);
 !   return op;
   }
 --- 5297,5487 
 return stmts;
   }
   
 ! 
 ! 

3 lines of vertical space too much?

Otherwise, LGTM.

Jakub


[committed] Fix ChangeLog entry

2014-10-17 Thread Kyrill Tkachov

Hi all,

I've committed the attached as obvious to fix up a whitespace issue in a 
patch I committed recently.

This is r216399.

Cheers,
Kyrill

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9e35d69b..5a09e3e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -136,8 +136,8 @@
 	* doc/install.texi (aarch64*-*-*): Document 
 	new --enable-fix-cortex-a53-835769 option.
 
-2014-10-10  Kyrylo Tkachovkyrylo.tkac...@arm.com
-Ramana Radhakrishnanramana.radhakrish...@arm.com
+2014-10-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com
+Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 
 	* config/aarch64/aarch64.h (FINAL_PRESCAN_INSN): Define.
 	(ADJUST_INSN_LENGTH): Define.

Re: [PATCH i386 AVX512] [80/n] Extend expand_sse2_mulvxdi3.

2014-10-17 Thread Kirill Yukhin
Hello Uroš,
On 16 Oct 14:29, Uros Bizjak wrote:
  + if (mode == V4DImode)
  +   emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2));
  + else if (mode == V2DImode)
  +   emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2));
 
 Should this be v2di ?
Right, copy-and-paste :(
 
  +   }
  +}
  +  else if (TARGET_XOP && mode == V2DImode)
   {
 /* op1: A,B,C,D, op2: E,F,G,H */
 op1 = gen_lowpart (V4SImode, op1);
 
 Please use function pointers in the added part.
Done.
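
A minimal sketch of the function-pointer style requested above, using hypothetical stand-ins (`enum vec_mode`, `gen_mul_*`, `expand_mul`) rather than the real i386.c generators. The generator is selected once and called from a single site. Initialising the pointer to NULL and checking it before the call is a choice of this sketch, so that unsupported mode and feature combinations are reported to the caller:

```c
#include <stddef.h>

/* Hypothetical stand-ins for machine modes.  */
enum vec_mode { MODE_V2DI, MODE_V4DI, MODE_V8DI, MODE_OTHER };

/* Hypothetical generators; each returns its number of 64-bit lanes so
   the caller can tell which one ran.  */
static int gen_mul_v2di (void) { return 2; }
static int gen_mul_v4di (void) { return 4; }
static int gen_mul_v8di (void) { return 8; }

/* Pick the generator once, then emit through one call site.
   Returns 0 when no generator applies.  */
int
expand_mul (enum vec_mode mode, int have_vl)
{
  int (*gen) (void) = NULL;

  if (mode == MODE_V8DI)
    gen = gen_mul_v8di;
  else if (have_vl)
    {
      if (mode == MODE_V4DI)
        gen = gen_mul_v4di;
      else if (mode == MODE_V2DI)
        gen = gen_mul_v2di;
    }

  return gen ? gen () : 0;
}
```

Compared with one `emit_insn` call per branch, this keeps the copy-and-paste surface small: a wrong generator name can only appear in an assignment, not in a duplicated call.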

Updated patch at the bottom. Is it OK?

--
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c3202c4..415e330 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45667,7 +45667,22 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
   enum machine_mode mode = GET_MODE (op0);
   rtx t1, t2, t3, t4, t5, t6;
 
-  if (TARGET_XOP && mode == V2DImode)
+  if (TARGET_AVX512DQ)
+{
+  rtx (*gen) (rtx, rtx, rtx);
+
+  if (mode == V8DImode)
+   gen = gen_avx512dq_mulv8di3;
+  else if (TARGET_AVX512VL)
+   {
+ if (mode == V4DImode)
+   gen = gen_avx512dq_mulv4di3;
+ else if (mode == V2DImode)
+   gen = gen_avx512dq_mulv2di3;
+   }
+  emit_insn (gen (op0, op1, op2));
+}
+  else if (TARGET_XOP && mode == V2DImode)
 {
   /* op1: A,B,C,D, op2: E,F,G,H */
   op1 = gen_lowpart (V4SImode, op1);


Re: [PATCHv4][Kasan] Allow to override Asan shadow offset from command line

2014-10-17 Thread Ian Lance Taylor
Jakub Jelinek ja...@redhat.com writes:

 Not sure if there aren't extra steps to make strtoull prototype available
 in system.h, libiberty.h etc. for systems that don't have strtoull in their
 headers.

See the

#if defined(HAVE_DECL_XXX) && !HAVE_DECL_XXX

lines in include/libiberty.h.  Although strtol is missing there as well.
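
For readers unfamiliar with that pattern, a minimal sketch applied to strtoull follows. It assumes a configure-style macro `HAVE_DECL_STRTOULL` set to 0 or 1 (defined here by hand so the sketch compiles on its own); `parse_offset` is a hypothetical caller, not a function from GCC or libiberty:

```c
#include <stdlib.h>

/* Normally a configure script defines this to 0 or 1.  The sketch
   pretends the host headers already declare strtoull.  */
#ifndef HAVE_DECL_STRTOULL
# define HAVE_DECL_STRTOULL 1
#endif

#if defined (HAVE_DECL_STRTOULL) && !HAVE_DECL_STRTOULL
/* Fallback prototype for hosts whose headers lack one.  */
extern unsigned long long strtoull (const char *, char **, int);
#endif

/* Hypothetical caller: parse a number in any base strtoull accepts
   with base 0 (decimal, 0x... hexadecimal, 0... octal).  */
unsigned long long
parse_offset (const char *text)
{
  return strtoull (text, NULL, 0);
}
```

The guard only supplies a declaration; a host that also lacks the function itself would still need a replacement implementation.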

Ian


[wwwdocs] Add recent C++ changes to gcc-5/changes.html

2014-10-17 Thread Jonathan Wakely

Committed to CVS.
? htdocs/gcc-5/.changes.html.swp
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.18
diff -u -r1.18 changes.html
--- htdocs/gcc-5/changes.html	15 Oct 2014 11:21:52 -	1.18
+++ htdocs/gcc-5/changes.html	17 Oct 2014 12:28:16 -
@@ -60,6 +60,31 @@
 <li>Full support for <a href="https://www.cilkplus.org/">Cilk Plus</a>
 	has been added to the GCC compiler. Cilk Plus is an extension to
 	the C and C++ languages to support data and task parallelism.</li>
+<li>New preprocessor constructs, <code>__has_include</code>
+and <code>__has_include_next</code>, to test the availability of headers
+have been added.<br/>
+This demonstrates a way to include the header <code>&lt;optional&gt;</code>
+only if it is available:<br/>
+<blockquote><pre>
+#ifdef __has_include
+#  if __has_include(&lt;optional&gt;)
+#include &lt;optional&gt;
+#define have_optional 1
+#  elif __has_include(&lt;experimental/optional&gt;)
+#include &lt;experimental/optional&gt;
+#define have_optional 1
+#define experimental_optional
+#  else
+#define have_optional 0
+#  endif
+#endif
+</pre></blockquote>
+The header search paths for <code>__has_include</code>
+and <code>__has_include_next</code> are equivalent to those
+of the standard directive <code>#include</code>
+and the extension <code>#include_next</code> respectively.
+</li>
+
   </ul>
 
 <h3 id="c">C</h3>
@@ -93,6 +118,10 @@
   <li>G++ and libstdc++ now implement the feature-testing macros from
 <a href="http://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations">Feature-testing
 recommendations for C++</a>.</li>
+  <li>G++ now allows <code>typename</code> in a template template parameter.
+<blockquote><pre>
+  template&lt;template&lt;typename&gt; <b>typename</b> X&gt; struct D; // OK
+</pre></blockquote></li>
 </ul>
 
   <h4 id="libstdcxx">Runtime Library (libstdc++)</h4>
@@ -100,11 +129,18 @@
 <li><a href="https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2011">
   Improved support for C++11</a>, including:
   <ul>
+<li> A new implementation of <code>std::list</code> is enabled by
+ default, with an O(1) <code>size()</code> function; </li>
 <li> <code>std::deque</code> meets the allocator-aware container requirements;</li>
 <li> movable and swappable iostream classes;</li>
-<li> support for <code>std::aligned_union</code>;</li>
-<li> I/O manipulators <code>std::hexfloat</code> and
-<code>std::defaultfloat</code>;
+<li> support for <code>std::align</code> and
+ <code>std::aligned_union</code>;</li>
+<li> Type traits <code>std::is_trivially_copyable</code>,
+ <code>std::is_trivially_constructible</code>,
+ <code>std::is_trivially_assignable</code> etc.;
+</li>
+<li> I/O manipulators <code>std::put_time</code>,
+ <code>std::hexfloat</code> and <code>std::defaultfloat</code>;
 </li>
   </ul>
 </li>
@@ -128,12 +164,13 @@
   <ul>
 <li> Class <code>std::experimental::any</code>; </li>
 <li> Function template <code>std::experimental::apply</code>; </li>
+<li> Variable templates for type traits; </li>
   </ul>
 </li>
 <li>New random number distributions <code>logistic_distribution</code> and
   <code>uniform_on_sphere_distribution</code> as extensions.</li>
 <li><a href="https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html">GDB
-  Xmethods</a> for <code>std::vector</code> and <code>std::unique_ptr</code>;</li>
+  Xmethods</a> for Sequence Containers and <code>std::unique_ptr</code>;</li>
   </ul>
 
 <h3 id="fortran">Fortran</h3>


Re: [PATCH i386 AVX512] [80/n] Extend expand_sse2_mulvxdi3.

2014-10-17 Thread Uros Bizjak
On Fri, Oct 17, 2014 at 2:32 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello Uroš,
 On 16 Oct 14:29, Uros Bizjak wrote:
  + if (mode == V4DImode)
  +   emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2));
  + else if (mode == V2DImode)
  +   emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2));

 Should this be v2di ?
 Right, copy-and-paste :(

  +   }
  +}
  +  else if (TARGET_XOP && mode == V2DImode)
   {
 /* op1: A,B,C,D, op2: E,F,G,H */
 op1 = gen_lowpart (V4SImode, op1);

 Please use function pointers in the added part.
 Done.

 Updated patch in the bottom. Is it ok?

OK.

Thanks,
Uros.


Re: [PATCH][1/n] Merge from match-and-simplify, infrastructure

2014-10-17 Thread Jakub Jelinek
On Wed, Oct 15, 2014 at 01:39:33PM +0200, Richard Biener wrote:
 2014-10-15  Richard Biener  rguent...@suse.de

Shouldn't Prathamesh be listed as co-author of the patch?

 +   fprintf (f, "case SSA_NAME:\n");
 +   fprintf (f, "{\n");
 +   fprintf (f, "gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n", 
 kid_opname);

Etc.; so no attempt to indent the generated code, by tracking the number
of current indentation columns and translating that into a series of
spaces or tabs or tabs+spaces?  Other generated sources like insn-*.c
usually are indented, at least to some extent.
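
One way the suggested indentation tracking could look, as a sketch only: the generator keeps a column count and converts it to tabs (8 columns each) plus spaces when emitting a line. `format_indent` and `emit_line` are hypothetical helpers, not part of genmatch:

```c
#include <stdio.h>

/* Write COLUMNS worth of indentation into BUF (capacity SIZE >= 1),
   using one tab per full 8 columns and spaces for the remainder.
   Returns the number of characters written, excluding the NUL.  */
size_t
format_indent (char *buf, size_t size, unsigned columns)
{
  size_t n = 0;

  for (; columns >= 8 && n + 1 < size; columns -= 8)
    buf[n++] = '\t';
  for (; columns > 0 && n + 1 < size; columns--)
    buf[n++] = ' ';
  buf[n] = '\0';
  return n;
}

/* Emit TEXT on its own line in F, indented by COLUMNS columns.  */
void
emit_line (FILE *f, unsigned columns, const char *text)
{
  char indent[64];

  format_indent (indent, sizeof indent, columns);
  fprintf (f, "%s%s\n", indent, text);
}
```

A generator would then bump the column count by 2 when it opens a brace and drop it again on the closing brace, instead of hard-coding leading spaces in each format string.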

 +   char dest[32];
 +   snprintf (dest, 32, "  res_ops[%d]", j);
 +   const char *optype
 +   = get_operand_type (e->operation,

This seems to be indented too much.

 +   type, e->expr_type,
 +   j == 0
 +   ? NULL : TREE_TYPE (res_ops[0]));
 + /* The genmatch generator progam.  It reads from a pattern description
 +and outputs GIMPLE or GENERIC IL matching and simplification routines.  
 */
 + 
 + int
 + main(int argc, char **argv)

Formatting ;)

 + return 1;
 + 
 +   bool gimple = true;
 +   bool verbose = false;
 +   char *input = argv[argc-1];
 +   for (int i = 1; i < argc - 1; ++i)
 + {
 +   if (strcmp (argv[i], "-gimple") == 0)
 + gimple = true;
 +   else if (strcmp (argv[i], "-generic") == 0)
 + gimple = false;
 +   else if (strcmp (argv[i], "-v") == 0)
 + verbose = true;
 +   else
 + {
 +   fprintf (stderr, "Usage: genmatch [-gimple] [-generic] [-v] input\n");
 +   return 1;
 + }
 + }

Wouldn't --gimple and --generic be nicer?
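
A sketch of the same parsing loop with the double-dash spellings, under the assumption that the last argument remains the input file as in the quoted code. `struct genmatch_opts` and `parse_args` are hypothetical names introduced for the example:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the parsed command line.  */
struct genmatch_opts
{
  bool gimple;          /* true: emit GIMPLE routines; false: GENERIC.  */
  bool verbose;
  const char *input;    /* Pattern description file (last argument).  */
};

/* Parse ARGV into OPTS.  Returns false on a missing input file or an
   unknown option, in which case the caller should print usage.  */
bool
parse_args (int argc, char **argv, struct genmatch_opts *opts)
{
  if (argc < 2)
    return false;

  opts->gimple = true;
  opts->verbose = false;
  opts->input = argv[argc - 1];

  for (int i = 1; i < argc - 1; ++i)
    {
      if (strcmp (argv[i], "--gimple") == 0)
        opts->gimple = true;
      else if (strcmp (argv[i], "--generic") == 0)
        opts->gimple = false;
      else if (strcmp (argv[i], "-v") == 0)
        opts->verbose = true;
      else
        return false;
    }
  return true;
}
```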

Otherwise, LGTM.

Jakub


Re: [PATCH][3/n] Merge from match-and-simplify, first patterns and questions

2014-10-17 Thread Jakub Jelinek
On Wed, Oct 15, 2014 at 01:40:49PM +0200, Richard Biener wrote:
 
 This adds a bunch of simplifications with constant operands
 or ones that simplify to constants, such as a + 0, x * 1.
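
 As a rough illustration of the kind of simplification meant here, the following
 sketch applies the identities a + 0 -> a and x * 1 -> x to integer operands.
 It is not match.pd syntax and not GCC code; `struct operand`, `enum op` and
 `simplify_binary` are hypothetical:

```c
#include <stdbool.h>

enum op { OP_PLUS, OP_MULT };

/* Hypothetical operand: an integer constant, or a variable whose
   identity is stored in VALUE.  */
struct operand
{
  bool is_const;
  long value;
};

/* Try to simplify `A CODE B' to a single operand.  Returns true and
   stores the surviving operand in *RESULT when an identity applies.  */
bool
simplify_binary (enum op code, struct operand a, struct operand b,
                 struct operand *result)
{
  /* The neutral element is 0 for addition and 1 for multiplication.  */
  long neutral = (code == OP_PLUS) ? 0 : 1;

  if (b.is_const && b.value == neutral)
    {
      *result = a;      /* a + 0 -> a,  a * 1 -> a  */
      return true;
    }
  if (a.is_const && a.value == neutral)
    {
      *result = b;      /* 0 + b -> b,  1 * b -> b  */
      return true;
    }
  return false;
}
```

 The sketch is restricted to integers on purpose; floating-point x + 0.0 is not
 an identity when x is -0.0.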
 
 It's a patch mainly to get a few questions answered for further
 pattern merges:
 
  - The branch uses multiple .pd files and includes them from
match.pd trying to group related stuff together.  It has
become somewhat difficult to do that grouping in some
sensible manner so I am not sure this is the best approach.
Any opinion?  We can simply put everything into match.pd
and group visually by overall comments.

That would be probably my preference, unless match.pd grows too big.

  - Each pattern I will add will either be already implemented
in some form in fold-const.c or tree-ssa-forwprop.c.  Once
the machinery is exercised from fold-const.c and
tree-ssa-forwprop.c I can remove the duplicates at the
same time I add a pattern.  Should I do that?

I guess it depends, if the new pattern covers the old one well, sure,
the STRIP_{,SIGN_}NOPS issues might be more important, TREE_SIDE_EFFECTS
probably less important (those shouldn't be really constant expressions
and thus there should be fewer users expecting stuff to be folded).

In any case, we need to be prepared to cure some folding
regressions if people report them and we find them desirable to be
restored.  Hopefully there won't be hundreds of such reports.

Jakub


Re: [PATCH i386 AVX512] [75/n] Update vec_init.

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 04:28:12PM +0400, Kirill Yukhin wrote:
  I wonder whether for these modes it can ever be beneficial to build them
  through interleaves/concatenations etc., if it wouldn't be better to build
  them by storing all values into memory and just reading it back.
 I've tried this example:
 #include <immintrin.h>
 
 unsigned char a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14,
   a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, a29,
   a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, a44,
   a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, a59,
   a60, a61, a62, a63;
 
 __m512i foo ()
 {
   return __extension__ (__m512i)(__v64qi){
 a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14,
   a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, 
 a29,
   a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, 
 a44,
   a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, 
 a59,
   a60, a61, a62, a63 };
 }
 
 w/ and w/o -mavx512bw (and always -mavx512f).
 
 When, this code works, we've got 127 lines of assembly to do this init.
 W/o AVX-512BW we've got  300 lines of code (mostly on GPRs, using sal, and 
 etc.)
 
 Then I've looked into actual assembly w/ -mavx512bw and it turns out that no
 AVX-512BW insn were generated, only AVX-512F (and below). Fixed iterator.

Ok, if it is shorter than copying all those into memory and reading from
memory, so be it.

   -(define_mode_iterator VI48F_512 [V16SI V16SF V8DI V8DF])
   +(define_mode_iterator VI48F_I12_AVX512BW
   +  [V16SI V16SF V8DI V8DF
   +  (V32HI TARGET_AVX512BW) (V64QI TARGET_AVX512BW)])
  
  What does the I12 stand for?  Wasn't it meant to be VI48F_512_AVX512BW
  or I512?
 Actually, I am not awere of any name convention for iterators.
 As far as I understand, name [more or less] for vector mode
 should reflect:
   - Type family of the unit: float or int
   - Size of the unit: 1, 2, 4 etc. bytes
   - If possible, target predicates to enable certain modes in
 given iterator.
 
 The name is:
   - Vector (V)
   - I48F - contains both ints and floats of size 4 and 8
   - I12 - contains ints of size 1 and 2
   - AVX512BW - affected by the target (according to previous note - to be 
 removed)
 
 Maybe it'll be better to name it: VF48_I1248?

I'll leave that to Uros, the patch is ok by me.

Jakub


Re: [PATCH i386 AVX512] [75/n] Update vec_init.

2014-10-17 Thread Uros Bizjak
On Fri, Oct 17, 2014 at 2:57 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Oct 17, 2014 at 04:28:12PM +0400, Kirill Yukhin wrote:
  I wonder whether for these modes it can ever be beneficial to build them
  through interleaves/concatenations etc., if it wouldn't be better to build
  them by storing all values into memory and just reading it back.
 I've tried this example:
 #include <immintrin.h>

 unsigned char a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, 
 a14,
   a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, a29,
   a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, a44,
   a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, a59,
   a60, a61, a62, a63;

 __m512i foo ()
 {
   return __extension__ (__m512i)(__v64qi){
 a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14,
   a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, 
 a29,
   a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, 
 a44,
   a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, 
 a59,
   a60, a61, a62, a63 };
 }

 w/ and w/o -mavx512bw (and always -mavx512f).

 When, this code works, we've got 127 lines of assembly to do this init.
 W/o AVX-512BW we've got  300 lines of code (mostly on GPRs, using sal, and 
 etc.)

 Then I've looked into actual assembly w/ -mavx512bw and it turns out that no
 AVX-512BW insn were generated, only AVX-512F (and below). Fixed iterator.

 Ok, if it is shorter than copying all those into memory and reading from
 memory, so be it.

   -(define_mode_iterator VI48F_512 [V16SI V16SF V8DI V8DF])
   +(define_mode_iterator VI48F_I12_AVX512BW
   +  [V16SI V16SF V8DI V8DF
   +  (V32HI TARGET_AVX512BW) (V64QI TARGET_AVX512BW)])
 
  What does the I12 stand for?  Wasn't it meant to be VI48F_512_AVX512BW
  or I512?
 Actually, I am not awere of any name convention for iterators.
 As far as I understand, name [more or less] for vector mode
 should reflect:
   - Type family of the unit: float or int
   - Size of the unit: 1, 2, 4 etc. bytes
   - If possible, target predicates to enable certain modes in
 given iterator.

 The name is:
   - Vector (V)
   - I48F - contains both ints and floats of size 4 and 8
   - I12 - contains ints of size 1 and 2
   - AVX512BW - affected by the target (according to previous note - to be 
 removed)

 Maybe it'll be better to name it: VF48_I1248?

 I'll leave that to Uros, the patch is ok by me.

Don't want to bikeshed, but VF48_I1248 looks somehow better to me.

Anyway, the patch is OK even without this change.

Thanks,
Uros.


Re: [PATCH] Don't expand string/memory builtins if ASan is enabled.

2014-10-17 Thread Yury Gribov

On 10/17/2014 04:24 PM, Jakub Jelinek wrote:

+/* Returns TRUE if given FCODE corresponds to string or memory builtin 
function.
+ */
+
+static inline bool
+is_memory_builtin (enum built_in_function fcode)
+{
+  return fcode >= BUILT_IN_STRSTR && fcode <= BUILT_IN_BCMP;


This is too fragile and ugly.
IMHO you should list (supposedly not in a special inline, but directly
where you use it) in a switch all the builtins you don't want to expand.


We already do this for BUILT_IN_ASAN_REPORT_LOAD1 ... 
BUILT_IN_ASAN_STOREN but I agree that this one is more ugly.


-Y


Re: [PATCH] Don't expand string/memory builtins if ASan is enabled.

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 05:01:33PM +0400, Yury Gribov wrote:
 On 10/17/2014 04:24 PM, Jakub Jelinek wrote:
 +/* Returns TRUE if given FCODE corresponds to string or memory builtin 
 function.
 + */
 +
 +static inline bool
 +is_memory_builtin (enum built_in_function fcode)
 +{
 +  return fcode >= BUILT_IN_STRSTR && fcode <= BUILT_IN_BCMP;
 
 This is too fragile and ugly.
 IMHO you should list (supposedly not in a special inline, but directly
 where you use it) in a switch all the builtins you don't want to expand.
 
 We already do this for BUILT_IN_ASAN_REPORT_LOAD1 ... BUILT_IN_ASAN_STOREN

I know, but it is still a coherent set of builtins for very similar
purposes, many of them sorted by increasing size number.

 but I agree that this one is more ugly.

The memops builtins are just a random bag of them; it is expected that many
people will add builtins into that range and outside of that range.

Jakub


Re: [PATCH] Simple improvement for predicate computation in if-convert phase.

2014-10-17 Thread Yuri Rumyantsev
Jeff,

I prepared another patch that includes a test case as you requested.

Below are answers to your questions.

 First, for the benefit of anyone trying to understand what you're doing, 
 defining what cd equivalent means would be helpful.

I added the following comment to the function:

   We call basic blocks bb1 and bb2
   cd-equivalent if they are executed under the same condition.


Is it sufficient?

So, do you have a case where the dominated_by_p test above is true and 
is_predicated(bb) returns true as well?  I think this part of the change is 
largely responsible for the hack you're doing with having the function scoped 
static variable join_bb.

I don't have such test-case and I assume that if bb is always
executed, it is not predicated.

I also deleted join_bb in my changes.


Is it OK for trunk now?

Thanks.
Yuri.

2014-10-17  Yuri Rumyantsev  ysrum...@gmail.com
gcc/ChangeLog

* tree-if-conv.c (add_to_predicate_list): Check unconditionally
that bb is always executed to early exit. Use predicate of
cd-equivalent block for join blocks if it exists.
(if_convertible_loop_p_1): Recompute POST_DOMINATOR tree.
(tree_if_conversion): Free post-dominance information.

gcc/testsuite/ChangeLog

* gcc.dg/tree-ssa/ifc-cd.c: New test.



2014-10-17 1:16 GMT+04:00 Jeff Law l...@redhat.com:
 On 10/16/14 05:52, Yuri Rumyantsev wrote:

 Hi All,

 Here is a simple enhancement for predicate computation in if-convert
 phase:

   We use notion of cd equivalence to get simpler predicate for
   join block, e.g. if join block has 2 predecessors with predicates
   p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of
   p1 & p2 | p1 & !p2.
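
 The simplification relies on the Boolean identity that the disjunction of the
 two predecessor predicates collapses to p1. A small exhaustive check, written
 as a sketch with a hypothetical helper name, confirms it:

```c
#include <stdbool.h>

/* Returns true if (p1 && p2) || (p1 && !p2) equals p1 for every
   assignment of truth values to p1 and p2.  */
bool
join_predicate_equals_p1 (void)
{
  for (int p1 = 0; p1 <= 1; p1++)
    for (int p2 = 0; p2 <= 1; p2++)
      {
        bool join = (p1 && p2) || (p1 && !p2);
        if (join != (bool) p1)
          return false;
      }
  return true;
}
```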

 Bootstrap and regression testing did not show any new failures.

 Is it OK for trunk?

 gcc/ChangeLog
 2014-10-16  Yuri Rumyantsevysrum...@gmail.com

 * tree-if-conv.c (add_to_predicate_list): Check unconditionally
 that bb is always executed to early exit. Use predicate of
 cd-equivalent block for join blocks if it exists.
 (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree.
 (tree_if_conversion): Free post-dominance information.

 First, for the benefit of anyone trying to understand what you're doing,
 defining what cd equivalent means would be helpful.




 if-conv.patch


 Index: tree-if-conv.c
 ===
 --- tree-if-conv.c  (revision 216217)
 +++ tree-if-conv.c  (working copy)
 @@ -396,25 +396,51 @@
   }

   /* Add condition NC to the predicate list of basic block BB.  LOOP is
 -   the loop to be if-converted.  */
 +   the loop to be if-converted. Use predicate of cd-equivalent block
 +   for join bb if it exists.  */

   static inline void
   add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
   {
 tree bc, *tp;
 +  basic_block dom_bb;
 +  static basic_block join_bb = NULL;

 if (is_true_predicate (nc))
   return;

 -  if (!is_predicated (bb))
 +  /* If dominance tells us this basic block is always executed,
 + don't record any predicates for it.  */
 +  if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
 +return;

 So, do you have a case where the dominated_by_p test above is true and
 is_predicated(bb) returns true as well?  I think this part of the change is
 largely responsible for the hack you're doing with having the function
 scoped static variable join_bb.




 +
 +  /* If predicate has been already set up for given bb using
 cd-equivalent
 + block predicate, simply escape.  */
 +  if (join_bb == bb)
 +return;

 I *really* dislike the state you're carrying around via join_bb.

 ISTM that if you compute that there's an equivalence, then you just set the
 predicate for the equivalent block and the right things would have happened
 if you had not changed the test above.

 You also need a testcase.  It doesn't have to be extensive, but at least
 some basic smoke test to verify basic operation of this code.  It's
 perfectly fine to scan the debugging dumps for debug output.


 jeff




if-conv.patch.new
Description: Binary data


Re: [PATCH] support ggc hash_map and hash_set

2014-10-17 Thread Alan Lawrence

Sorry, somehow I missed this email. Yes, that appears to have fixed it!

Thank you very much,
Alan

Trevor Saunders wrote:

On Tue, Sep 09, 2014 at 03:37:26PM +0100, Alan Lawrence wrote:

Following this, we're seeing ICEs in tests in gcc.dg/pch.exp and g++.dg/pch.exp,
with cross-builds (hosted on x86_64) targetting bare metal AArch64 and ARM
(aarch64-none-elf, aarch64_be-none-elf and arm-none-eabi; I haven't tested
armeb-none-eabi; builds targeting linux are OK), for *release builds only*.


Could you test the patch below? It seems to work for me, but I'm not
familiar with testing cross compilers.

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index c2a68fd..028b7de 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -1598,8 +1598,9 @@ template<typename D>
 static void
 gt_pch_nx (hash_table<D> *h)
 {
-  gcc_checking_assert (gt_pch_note_object (h->m_entries, h,
-  hashtab_entry_note_pointers<D>));
+  bool success ATTRIBUTE_UNUSED
+= gt_pch_note_object (h->m_entries, h, hashtab_entry_note_pointers<D>);
+  gcc_checking_assert (success);
   for (size_t i = 0; i < h->m_size; i++)
 {
   if (hash_table<D>::is_empty (h->m_entries[i])

   Trev
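
The patch above moves a call with a side effect out of a checking assertion. A self-contained sketch of why that matters, assuming (as the release-only failures suggest) that a disabled checking assertion never evaluates its argument; `CHECKING_ASSERT`, `note_object`, `register_buggy` and `register_fixed` are hypothetical names:

```c
#include <stdbool.h>

/* Simulate a release build: the assertion compiles to nothing and its
   argument is never evaluated.  */
#define CHECKING_ASSERT(expr) ((void) 0)

static int noted_objects;

/* Hypothetical stand-in for a registration call with a side effect.  */
static bool
note_object (void)
{
  noted_objects++;
  return true;
}

/* Buggy shape: the call lives inside the assertion, so it vanishes
   together with the check.  Returns how many objects were noted.  */
int
register_buggy (void)
{
  noted_objects = 0;
  CHECKING_ASSERT (note_object ());
  return noted_objects;
}

/* Fixed shape: perform the call unconditionally, assert on the result.  */
int
register_fixed (void)
{
  noted_objects = 0;
  bool success = note_object ();
  CHECKING_ASSERT (success);
  (void) success;       /* Avoid an unused-variable warning.  */
  return noted_objects;
}
```

In the buggy shape the object is never registered when checking is off, which matches a failure that shows up only in release builds.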


Affected tests:

gcc.dg/pch.exp: all variants of
./except-1.h
./inline-3.h
gcc.dg/pch/except-1.c
gcc.dg/pch/inline-3.c

g++.dg: all variants of
./array-1.H
./empty.H
./externc-1.H
./local-1.H
./pch.H
./static-1.H
./system-1.H
./system-2.H
./template-1.H
./uninst.H
./wchar-1.H

(These then lead to failures of g++.dg/pch/{array-1,...}.C and corresponding
assembly comparisons).

Sample log:

Executing on host: build/obj/gcc2/gcc/testsuite/g++/../../xg++
-Bbuild/obj/gcc2/gcc/testsuite/g++/../../ ./template-1.H
-fno-diagnostics-show-caret -fdiagnostics-color=never  -nostdinc++
-Ibuild/obj/gcc2/aarch64-none-elf/ilp32/libstdc++-v3/include/aarch64-none-elf
-Ibuild/obj/gcc2/aarch64-none-elf/ilp32/libstdc++-v3/include
-Isrc/gcc/libstdc++-v3/libsupc++ -Isrc/gcc/libstdc++-v3/include/backward
-Isrc/gcc/libstdc++-v3/testsuite/util -fmessage-length=0  -O2 -g
-specs=aem-ve.specs-mabi=ilp32 -mcmodel=small  -o template-1.H.gch
(timeout = 300)
spawn build/obj/gcc2/gcc/testsuite/g++/../../xg++
-Bbuild/obj/gcc2/gcc/testsuite/g++/../../ ./template-1.H
-fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++
-Ibuild/obj/gcc2/aarch64-none-elf/ilp32/libstdc++-v3/include/aarch64-none-elf
-Ibuild/obj/gcc2/aarch64-none-elf/ilp32/libstdc++-v3/include
-Isrc/gcc/libstdc++-v3/libsupc++ -Isrc/gcc/libstdc++-v3/include/backward
-Isrc/gcc/libstdc++-v3/testsuite/util -fmessage-length=0 -O2 -g
-specs=aem-ve.specs -mabi=ilp32 -mcmodel=small -o template-1.H.gch

./template-1.H:5:2: internal compiler error: in relocate_ptrs, at ggc-common.c:435
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
compiler exited with status 1
output is:
./array-1.H:4:2: internal compiler error: in relocate_ptrs, at ggc-common.c:435
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

FAIL: ./array-1.H  -g (internal compiler error)
FAIL: ./array-1.H  -g (test for excess errors)
Excess errors:
./array-1.H:4:2: internal compiler error: in relocate_ptrs, at ggc-common.c:435

--Alan

tsaund...@mozilla.com wrote:

From: Trevor Saunders tsaund...@mozilla.com

Hi,

There are still some issues to resolve before this works really nicely, but this
part is probably good enough to be worth reviewing.

For one thing, you can't use ggc hash_map or hash_set in front ends with some
types, or gengtype will decide to put the overloads of the marking routines it
provides in a front end file instead of the one it chose before, breaking other
front ends.  However, that seems to be an unrelated issue, since you can trigger
it without using hash_map/set, so we might as well solve it separately.

I had to have the entry marking functions for set delegate to the traits class
because gcc  4.9.1 issues clearly bogus errors if you inline the code from the
traits implementation.  We may well want to make map work the same way at some
point to enable some of the special GTY attributes like if_marked, but it
doesn't seem to be necessary right now.

bootstrapped + regtested without regressions on x86_64-unknown-linux-gnu, ok?

Trev

gcc/ChangeLog:

2014-09-01  Trevor Saunders  tsaund...@mozilla.com

   * alloc-pool.c: Include coretypes.h.
   * cgraph.h, dbxout.c, dwarf2out.c, except.c, except.h, function.c,
   function.h, symtab.c, tree-cfg.c, tree-eh.c: Use hash_map and
   hash_set instead of htab.
   * ggc-page.c (in_gc): New variable.
   (ggc_free): Do nothing if a collection is taking place.
   (ggc_collect): Set in_gc appropriately.
   * ggc.h (gt_ggc_mx(const char *)): New function.
   (gt_pch_nx(const char *)): Likewise.
   (gt_ggc_mx(int)): Likewise.
   (gt_pch_nx(int)): Likewise.
   * hash-map.h (hash_map::hash_entry::ggc_mx): 

[PATCH] Fix for PR63569

2014-10-17 Thread Martin Liška

Hello.

The following patch fixes PR63569.

Bootstrapped on ppc64-linux, and no regressions seen on x86_64-pc-linux.
Ready for trunk?

Thank you,
Martin
gcc/testsuite/ChangeLog:

2014-10-17  Martin Liska  mli...@suse.cz

* gcc.dg/ipa/ipa-icf-31.c: New test.


gcc/ChangeLog:

2014-10-17  Martin Liska  mli...@suse.cz

* ipa-icf-gimple.c (func_checker::compare_volatility): New function.
(func_checker::compare_gimple_call): Volatility check added.
(func_checker::compare_gimple_assign): Likewise.
* ipa-icf-gimple.h: New function.
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 792a3e4..1b9ee85 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -452,6 +452,17 @@ func_checker::compare_tree_list_operand (tree t1, tree t2)
   return true;
 }
 
+/* Compares if both trees T1 and T2 have equal volatility.  */
+
+bool
+func_checker::compare_volatility (tree t1, tree t2)
+{
+  if (t1 && t2)
+    return TREE_THIS_VOLATILE (t1) == TREE_THIS_VOLATILE (t2);
+
+  return !(t1 || t2);
+}
+
 /* Verifies that trees T1 and T2, representing function declarations
are equivalent from perspective of ICF.  */
 
@@ -663,6 +674,9 @@ func_checker::compare_gimple_call (gimple s1, gimple s2)
   t1 = gimple_get_lhs (s1);
   t2 = gimple_get_lhs (s2);
 
+  if (!compare_volatility (t1, t2))
+    return return_false_with_msg ("different volatility for call statement");
+
   return compare_operand (t1, t2);
 }
 
@@ -696,8 +710,11 @@ func_checker::compare_gimple_assign (gimple s1, gimple s2)
 
       if (!compare_operand (arg1, arg2))
         return false;
-    }
 
+      if (!compare_volatility (arg1, arg2))
+        return return_false_with_msg ("different volatility for assignment statement");
+    }
 
   return true;
 }
diff --git a/gcc/ipa-icf-gimple.h b/gcc/ipa-icf-gimple.h
index 8487a2a..b791c21 100644
--- a/gcc/ipa-icf-gimple.h
+++ b/gcc/ipa-icf-gimple.h
@@ -209,6 +209,10 @@ public:
  two trees are semantically equivalent.  */
   bool compare_tree_list_operand (tree t1, tree t2);
 
+  /* Compares two tree list operands T1 and T2 and returns true if these
+ two trees are semantically equivalent.  */
+  bool compare_volatility (tree t1, tree t2);
+
   /* Verifies that trees T1 and T2, representing function declarations
  are equivalent from perspective of ICF.  */
   bool compare_function_decl (tree t1, tree t2);
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-31.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-31.c
new file mode 100644
index 000..e70d72d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-31.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-fipa-icf -fdump-ipa-icf-details"  } */
+
+
+static int f(int t, int *a) __attribute__((noinline));
+
+static int g(int t, volatile int *a) __attribute__((noinline));
+static int g(int t, volatile int *a)
+{
+  int i;
+  int tt = 0;
+  for(i=0;i<t;i++)
+    tt += *a;
+  return tt;
+}
+static int f(int t, int *a)
+{
+  int i;
+  int tt = 0;
+  for(i=0;i<t;i++)
+    tt += *a;
+  return tt;
+}
+
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump "Equal symbols: 0" "icf"  } } */
+/* { dg-final { scan-ipa-dump "different volatility for assignment statement" "icf"  } } */
+/* { dg-final { cleanup-ipa-dump "icf" } } */
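
As a standalone illustration of the volatility comparison this patch adds, here is a minimal sketch in plain C. It assumes a hypothetical struct node in place of GCC's tree type and models only the volatile bit; it is not the ICF implementation itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC tree node; only the volatile bit matters.  */
struct node {
    bool is_volatile;
};

/* Two operands agree when both exist and carry the same volatile bit,
   or when both are absent.  One missing operand counts as a mismatch.  */
static bool compare_volatility(const struct node *t1, const struct node *t2)
{
    if (t1 && t2)
        return t1->is_volatile == t2->is_volatile;

    return !(t1 || t2);
}
```

In the test case above, f reads through a plain int pointer and g through a volatile int pointer, so a check of this shape is what keeps the two functions from being merged.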


Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Yury Gribov

On 10/17/2014 04:12 PM, Jakub Jelinek wrote:

I had a brief look at what ended up on the branch in the end, and
from what I understand, the 4.9 libasan.so has
__asan_report_store_n and __asan_report_load_n entry points, but does
not have any __asan_loadN/__asan_reportN entrypoints (neither 1/2/4/8/16,
nor variable).
So, what the branch does seems to not match what the library provides.


I agree, __asan_report_loadN is indeed there and misalign tests seem to 
pass fine. Probably I should have examined 4.9 libasan closer.



With -fsanitize=address -O2 --param asan-instrumentation-with-call-threshold=0
foo is again unexpectedly not instrumented, and bar is instrumented
with __asan_load8, which looks wrong to me, because the library does not
provide any such entry point.


By default asan-instrumentation-with-call-threshold is INT_MAX which 
means that compiler will never generate __asan_load*/__asan_store* calls 
unless forced by the user (e.g. for Kasan).



But, in execute_sanopt force !use_calls
for (flag_sanitize & SANITIZE_USER_ADDRESS).


Do you think above limitation is not enough?


Thus, IMHO the:
   if ((flag_sanitize & SANITIZE_USER_ADDRESS) != 0
       && ((size_in_bytes & (size_in_bytes - 1)) != 0
           || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16))
     return;
should be nuked from 4.9, we can do unaligned/non-{1,2,4,8,16}
accesses fine.
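
The size test in the condition quoted above can be sketched in isolation. This is a minimal illustration, reading the condition with the `&` and `>=` operators that the archive appears to have dropped; is_fixed_size_access is a hypothetical helper name, not a GCC function.

```c
#include <stdbool.h>

/* True when SIZE is one of 1, 2, 4, 8 or 16, i.e. a power of two that is
   no larger than 16.  The early return under discussion skipped
   instrumentation whenever this was false.  */
static bool is_fixed_size_access(long size)
{
    if (size <= 0)
        return false;

    bool power_of_two = (size & (size - 1)) == 0;
    bool at_most_16 = (unsigned long) size - 1 < 16;

    return power_of_two && at_most_16;
}
```

A 3-byte or 12-byte access fails the power-of-two half of the test and a 32-byte access fails the range half, which are the cases the proposal wants instrumented as well.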


Right. I'd also import misalign tests.


Or were there any bugfixes needed for __asan_report_{store,load}_n
on the library side?


I don't think so.

-Y



Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 05:45:17PM +0400, Yury Gribov wrote:
 On 10/17/2014 04:12 PM, Jakub Jelinek wrote:
 I had a brief look at what ended up on the branch in the end, and
 from what I understand, the 4.9 libasan.so has
 __asan_report_store_n and __asan_report_load_n entry points, but does
 not have any __asan_loadN/__asan_reportN entrypoints (neither 1/2/4/8/16,
 nor variable).
 So, what the branch does seems to not match what the library provides.
 
 I agree, __asan_report_loadN is indeed there and misalign tests seem to pass
 fine. Probably I should have examined 4.9 libasan closer.
 
 With -fsanitize=address -O2 --param 
 asan-instrumentation-with-call-threshold=0
 foo is again unexpectedly not instrumented, and bar is instrumented
 with __asan_load8, which looks wrong to me, because the library does not
 provide any such entry point.
 
 By default asan-instrumentation-with-call-threshold is INT_MAX which means
 that compiler will never generate __asan_load*/__asan_store* calls unless
 forced by the user (e.g. for Kasan).
 
 But, in execute_sanopt force !use_calls
 for (flag_sanitize & SANITIZE_USER_ADDRESS).
 
 Do you think above limitation is not enough?

Yeah, even if the default is that it doesn't make the non-existing calls,
anyone who uses the parameter will get code that doesn't link.
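
The decision being discussed can be sketched as a pair of predicates. This is a simplified model under the assumption that the threshold comparison is `<` against INT_MAX and `>=` against the access count; the function and parameter names are illustrative, not GCC's.

```c
#include <limits.h>
#include <stdbool.h>

/* Behaviour described for the branch: calls are used whenever the user
   lowers the threshold, regardless of which sanitizer is active.  */
static bool use_calls_ungated(int threshold, int num_accesses)
{
    return threshold < INT_MAX && num_accesses >= threshold;
}

/* Gated variant: calls are used only for kernel-address sanitizing,
   so user-space builds never reference entry points the library lacks.  */
static bool use_calls_gated(int threshold, int num_accesses, bool kernel_address)
{
    return threshold < INT_MAX && kernel_address && num_accesses >= threshold;
}
```

With the default threshold of INT_MAX both variants return false; they differ only once the parameter is set explicitly in a user-space build.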

 
 Thus, IMHO the:
    if ((flag_sanitize & SANITIZE_USER_ADDRESS) != 0
        && ((size_in_bytes & (size_in_bytes - 1)) != 0
            || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16))
      return;
 should be nuked from 4.9, we can do unaligned/non-{1,2,4,8,16}
 accesses fine.
 
 Right. I'd also import misalign tests.
 
 Or were there any bugfixes needed for __asan_report_{store,load}_n
 on the library side?
 
 I don't think so.

So, what about this?  Just checked that with
make -k check-g{cc,++} RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp tsan.exp ubsan.exp'
so far.  Plus if you add misalign tests...

2014-10-17  Jakub Jelinek  ja...@redhat.com

* asan.c (instrument_derefs): Allow instrumentation of odd-sized
accesses even for -fsanitize=address.
(execute_sanopt): Only allow use_calls for -fsanitize=kernel-address.

* c-c++-common/asan/instrument-with-calls-1.c: Add
-fno-sanitize=address -fsanitize=kernel-address to dg-options.
* c-c++-common/asan/instrument-with-calls-2.c: Likewise.

--- gcc/asan.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/asan.c  2014-10-17 15:21:29.921495259 +0200
@@ -1707,10 +1707,6 @@ instrument_derefs (gimple_stmt_iterator
   size_in_bytes = int_size_in_bytes (type);
   if (size_in_bytes <= 0)
     return;
-  if ((flag_sanitize & SANITIZE_USER_ADDRESS) != 0
-      && ((size_in_bytes & (size_in_bytes - 1)) != 0
-          || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16))
-    return;
 
   HOST_WIDE_INT bitsize, bitpos;
   tree offset;
@@ -2780,8 +2776,10 @@ execute_sanopt (void)
  }
 }
 
-  bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
-    && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
+  bool use_calls
+    = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
+      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
+      && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 
   FOR_EACH_BB_FN (bb, cfun)
 {
--- gcc/testsuite/c-c++-common/asan/instrument-with-calls-1.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/testsuite/c-c++-common/asan/instrument-with-calls-1.c   2014-10-17 15:34:06.679627168 +0200
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "--param asan-instrumentation-with-call-threshold=0 -save-temps" } */
+/* { dg-options "-fno-sanitize=address -fsanitize=kernel-address --param asan-instrumentation-with-call-threshold=0 -save-temps" } */
 
 void f(char *a, int *b) {
   *b = *a;
--- gcc/testsuite/c-c++-common/asan/instrument-with-calls-2.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/testsuite/c-c++-common/asan/instrument-with-calls-2.c   2014-10-17 15:34:15.569472032 +0200
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "--param asan-instrumentation-with-call-threshold=1 -save-temps" } */
+/* { dg-options "-fno-sanitize=address -fsanitize=kernel-address --param asan-instrumentation-with-call-threshold=1 -save-temps" } */
 
 int x;
 


Jakub


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Ilya Verbin
On 15 Oct 17:35, Jakub Jelinek wrote:
 But we do want to test them with host fallback, which those lines preclude.
 Just a single dg-require-effective-target offload_device guarded test (which
 there necessarily is, e.g. the 57.* ones) should be sufficient for your
 purposes (if you want to diff UNSUPPORTED vs. PASS tests between runs).
 Right now the result of that test turns all tests in the directory into
 UNSUPPORTED, with the removals you'd just turn a single one or a dozen or
 how many would really need it.
 The fact that the tcl offload_device check succeeded doesn't mean that all
 tests don't use host fallback anyway.
 Additionally to a handful of dg-require-effective-target offload_device
 you could have one which just prints something on stdout depending on if
 it is offloaded or not, you can grep for the output of that in your
 libgomp.log.
 
   Jakub

Agreed.  Patch is fixed and retested.
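
The stdout probe suggested in the quoted review could look roughly like the sketch below. It is only an illustration: the function names and the output format are assumptions, and the OpenMP parts are guarded so the file still builds without -fopenmp, in which case it always reports host execution.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if the target region ran on the host, 0 if it was offloaded.
   Without OpenMP support the region is plain host code, so the answer is 1.  */
static int target_region_ran_on_host(void)
{
    int on_host = 1;
#ifdef _OPENMP
    #pragma omp target map(from: on_host)
    on_host = omp_is_initial_device() != 0;
#endif
    return on_host;
}

/* Writes one greppable line describing where the region executed.  */
static int report_offload(FILE *out)
{
    int on_host = target_region_ran_on_host();
    fprintf(out, "offload-probe: %s\n", on_host ? "host fallback" : "offloaded");
    return on_host;
}
```

Grepping the test log for the offload-probe line would then show whether a given run exercised a device or only the host fallback.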

  -- Ilya


---

diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 094e5ed..071e22f 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -239,3 +239,17 @@ proc libgomp_option_proc { option } {
return 0
 }
 }
+
+# Return 1 if offload device is available.
+proc check_effective_target_offload_device { } {
+return [check_runtime_nocache offload_device_available_ {
+  #include <omp.h>
+  int main ()
+   {
+ int a;
+ #pragma omp target map(from: a)
+   a = omp_is_initial_device ();
+ return a;
+   }
+} ]
+}
diff --git a/libgomp/testsuite/libgomp.c++/c++.exp b/libgomp/testsuite/libgomp.c++/c++.exp
index a9cf41a..da42e62 100644
--- a/libgomp/testsuite/libgomp.c++/c++.exp
+++ b/libgomp/testsuite/libgomp.c++/c++.exp
@@ -42,7 +42,7 @@ if { $blddir != "" } {
 
 if { $lang_test_file_found } {
     # Gather a list of all tests.
-    set tests [lsort [glob -nocomplain $srcdir/$subdir/*.C]]
+    set tests [lsort [find $srcdir/$subdir *.C]]
 
     if { $blddir != "" } {
         set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
diff --git a/libgomp/testsuite/libgomp.c++/examples-4/e.51.5.C b/libgomp/testsuite/libgomp.c++/examples-4/e.51.5.C
new file mode 100644
index 000..4298e23
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/examples-4/e.51.5.C
@@ -0,0 +1,62 @@
+// { dg-do run }
+
+#include <omp.h>
+
+#define EPS 0.01
+#define N 1000
+
+extern "C" void abort (void);
+
+void init (float *a1, float *a2, int n)
+{
+  int s = -1;
+  for (int i = 0; i < n; i++)
+    {
+      a1[i] = s * 0.01;
+      a2[i] = i;
+      s = -s;
+    }
+}
+
+void check (float *a, float *b, int n)
+{
+  for (int i = 0; i < n; i++)
+    if (a[i] - b[i] > EPS || b[i] - a[i] > EPS)
+      abort ();
+}
+
+void vec_mult_ref (float *p, float *v1, float *v2, int n)
+{
+  for (int i = 0; i < n; i++)
+    p[i] = v1[i] * v2[i];
+}
+
+void vec_mult (float *p, float *v1, float *v2, int n)
+{
+  #pragma omp target map(to: v1[0:n], v2[:n]) map(from: p[0:n])
+    #pragma omp parallel for
+      for (int i = 0; i < n; i++)
+        p[i] = v1[i] * v2[i];
+}
+
+int main ()
+{
+  float *p = new float [N];
+  float *p1 = new float [N];
+  float *v1 = new float [N];
+  float *v2 = new float [N];
+
+  init (v1, v2, N);
+
+  vec_mult_ref (p, v1, v2, N);
+  vec_mult (p1, v1, v2, N);
+
+  check (p, p1, N);
+
+  delete [] p;
+  delete [] p1;
+  delete [] v1;
+  delete [] v2;
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c++/examples-4/e.53.2.C b/libgomp/testsuite/libgomp.c++/examples-4/e.53.2.C
new file mode 100644
index 000..75276e7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/examples-4/e.53.2.C
@@ -0,0 +1,43 @@
+// { dg-do run }
+// { dg-require-effective-target offload_device }
+
+#include <stdlib.h>
+
+struct typeX
+{
+  int a;
+};
+
+class typeY
+{
+public:
+  int foo () { return a^0x01; }
+  int a;
+};
+
+#pragma omp declare target
+struct typeX varX;
+class typeY varY;
+#pragma omp end declare target
+
+int main ()
+{
+  varX.a = 0;
+  varY.a = 0;
+
+  #pragma omp target
+{
+  varX.a = 100;
+  varY.a = 100;
+}
+
+  if (varX.a != 0 || varY.a != 0)
+abort ();
+
+  #pragma omp target update from(varX, varY)
+
+  if (varX.a != 100 || varY.a != 100)
+abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/examples-4/e.50.1.c b/libgomp/testsuite/libgomp.c/examples-4/e.50.1.c
new file mode 100644
index 000..45adbe0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/examples-4/e.50.1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+#define N 10
+
+void init (int *a1, int *a2)
+{
+  int i, s = -1;
+  for (i = 0; i < N; i++)
+{
+  a1[i] = s;
+  a2[i] = i;
+  s = -s;
+}
+}
+
+void check (int *a, int *b)
+{
+  int i;
+  for (i = 0; i < N; i++)
+if (a[i] != b[i])
+  abort ();
+}
+
+void vec_mult_ref (int *p)
+{
+  int i;
+  int v1[N], v2[N];
+
+  init (v1, v2);
+
+  for (i = 0; i < N; i++)
+p[i] = v1[i] * 

[PARCH 1/2, x86, PR63534] Fix darwin bootstrap

2014-10-17 Thread Evgeny Stupachenko
Hi,

The patch fixes the first failure in the Darwin bootstrap.
When the PIC register is a pseudo, we don't need to initialize it after setjmp
or a non-local goto.

Is it ok?

ChangeLog:

2014-10-17  Evgeny Stupachenko  evstu...@gmail.com

PR target/63534
* config/i386/i386.md (builtin_setjmp_receiver): Delete.
(nonlocal_goto_receiver): Ditto.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 624a1c1..fc3776f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16927,57 +16927,6 @@
   "* return output_probe_stack_range (operands[0], operands[2]);"
   [(set_attr "type" "multi")])

-(define_expand "builtin_setjmp_receiver"
-  [(label_ref (match_operand 0))]
-  "!TARGET_64BIT && flag_pic"
-{
-#if TARGET_MACHO
-  if (TARGET_MACHO)
-    {
-      rtx xops[3];
-      rtx picreg = gen_rtx_REG (Pmode, PIC_OFFSET_TABLE_REGNUM);
-      rtx_code_label *label_rtx = gen_label_rtx ();
-      emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx));
-      xops[0] = xops[1] = picreg;
-      xops[2] = machopic_gen_offset (gen_rtx_LABEL_REF (SImode, label_rtx));
-      ix86_expand_binary_operator (MINUS, SImode, xops);
-    }
-  else
-#endif
-    emit_insn (gen_set_got (pic_offset_table_rtx));
-  DONE;
-})
-
-(define_insn_and_split "nonlocal_goto_receiver"
-  [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]
-  "TARGET_MACHO && !TARGET_64BIT && flag_pic"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  if (crtl->uses_pic_offset_table)
-    {
-      rtx xops[3];
-      rtx label_rtx = gen_label_rtx ();
-      rtx tmp;
-
-      /* Get a new pic base.  */
-      emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx));
-      /* Correct this with the offset from the new to the old.  */
-      xops[0] = xops[1] = pic_offset_table_rtx;
-      label_rtx = gen_rtx_LABEL_REF (SImode, label_rtx);
-      tmp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, label_rtx),
-                            UNSPEC_MACHOPIC_OFFSET);
-      xops[2] = gen_rtx_CONST (Pmode, tmp);
-      ix86_expand_binary_operator (MINUS, SImode, xops);
-    }
-  else
-    /* No pic reg restore needed.  */
-    emit_note (NOTE_INSN_DELETED);
-
-  DONE;
-})
-
 ;; Avoid redundant prefixes by splitting HImode arithmetic to SImode.
 ;; Do not split instructions with mask registers.
 (define_split


Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.

2014-10-17 Thread Yuri Rumyantsev
Richard,

I reworked the patch as you proposed, but I didn't understand what
you meant by:

So please rework the patch so critical edges are always handled
correctly.

In the current patch, flag_force_vectorize is used (1) to reject phi nodes
with more than 2 arguments and (2) to reject basic blocks with only
critical incoming edges, since support for extended predication of phi
nodes will come in the next patch.

Could you please clarify your statement?
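
For readers following the critical-edge discussion, the sketch below shows the usual definition on a toy control-flow graph: an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The struct layout and MAX_EDGES are hypothetical simplifications and do not mirror GCC's basic_block or edge types; all_preds_critical_p borrows the name suggested in the quoted review.

```c
#include <stdbool.h>

#define MAX_EDGES 8   /* arbitrary bound for this toy graph */

struct block;

struct edge {
    struct block *src;
    struct block *dest;
};

struct block {
    int n_preds;
    int n_succs;
    struct edge *preds[MAX_EDGES];
};

/* An edge is critical when its source has several successors and its
   destination has several predecessors.  */
static bool edge_critical_p(const struct edge *e)
{
    return e->src->n_succs > 1 && e->dest->n_preds > 1;
}

/* True only if every incoming edge of BB is critical.  */
static bool all_preds_critical_p(const struct block *bb)
{
    for (int i = 0; i < bb->n_preds; i++)
        if (!edge_critical_p(bb->preds[i]))
            return false;
    return true;
}
```

A join block reached from one two-way branch and one fall-through block has a non-critical incoming edge, so all_preds_critical_p returns false for it.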

I attached modified patch.

ChangeLog:

2014-10-17  Yuri Rumyantsev  ysrum...@gmail.com

(flag_force_vectorize): New variable.
(edge_predicate): New function.
(set_edge_predicate): New function.
(add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
if destination block of edge is not always executed. Set-up predicate
for critical edge.
(if_convertible_phi_p): Accept phi nodes with more than two args
if FLAG_FORCE_VECTORIZE was set-up.
(ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
(if_convertible_stmt_p): Fix up pre-function comments.
(all_edges_are_critical): New function.
(if_convertible_bb_p): Use call of all_preds_critical_p
to reject block if-conversion with incoming critical edges only if
FLAG_FORCE_VECTORIZE was not set-up.
(predicate_bbs): Skip loop exit block also.  Invoke build2_loc
to compute predicate instead of fold_build2_loc.
Add zeroing of edge 'aux' field.
(find_phi_replacement_condition): Extend function interface:
it returns NULL if given phi node must be handled by means of
extended phi node predication. If the number of predecessors of phi-block
equals 2 and at least one incoming edge is not critical, the original
algorithm is used.
(tree_if_conversion): Temporarily set FLAG_FORCE_VECTORIZE to false.
Nullify 'aux' field of edges for blocks with two successors.




2014-10-17 13:09 GMT+04:00 Richard Biener richard.guent...@gmail.com:
 On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is reduced patch as you requested. All your remarks have been fixed.
 Could you please look at it ( I have already sent the patch with
 changes in add_to_predicate_list for review).

 + if (dump_file && (dump_flags & TDF_DETAILS))
 +   fprintf (dump_file, "More than two phi node args.\n");
 + return false;
 +   }
 +
 +}

 Excess vertical space.


 +/* Assumes that BB has more than 2 predecessors.

 More than 1 predecessor?

 +   Returns false if at least one successor is not on critical edge
 +   and true otherwise.  */
 +
 +static inline bool
 +all_edges_are_critical (basic_block bb)
 +{

 all_preds_critical_p would be a better name

 +  if (EDGE_COUNT (bb->preds) > 2)
 +{
 +  if (!flag_force_vectorize)
 +   return false;
 +}

 as I said in the last review I don't think we should restrict edge
 predicates to flag_force_vectorize.  At least I can't see how
 if-conversion is magically more expensive for that case?

 So please rework the patch so critical edges are always handled
 correctly.

 Ok with that and the above suggested changes.

 Thanks,
 Richard.


 Thanks.
 Yuri.
 ChangeLog
 2014-10-16  Yuri Rumyantsev  ysrum...@gmail.com

 (flag_force_vectorize): New variable.
 (edge_predicate): New function.
 (set_edge_predicate): New function.
 (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
 if destination block of edge is not always executed. Set-up predicate
 for critical edge.
 (if_convertible_phi_p): Accept phi nodes with more than two args
 if FLAG_FORCE_VECTORIZE was set-up.
 (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
 (if_convertible_stmt_p): Fix up pre-function comments.
 (all_edges_are_critical): New function.
 (if_convertible_bb_p): Allow bb has more than two predecessors if
 FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
 to reject block if-conversion with incoming critical edges only if
 FLAG_FORCE_VECTORIZE was not set-up.
 (predicate_bbs): Skip loop exit block also.Invoke build2_loc
 to compute predicate instead of fold_build2_loc.
 Add zeroing of edge 'aux' field.
 (find_phi_replacement_condition): Extend function interface:
 it returns NULL if given phi node must be handled by means of
 extended phi node predication. If number of predecessors of phi-block
 is equal 2 and atleast one incoming edge is not critical original
 algorithm is used.
 (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
 Nullify 'aux' field of edges for blocks with two successors.



 2014-10-15 13:50 GMT+04:00 Richard Biener richard.guent...@gmail.com:
 On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev ysrum...@gmail.com 
 wrote:
 Richard,

 Here is updated patch (part1) for extended if conversion.

 Second part of patch will be sent later.

 Ok, I'm starting to look at this.  I'd still like you to split things up
 more.

  static inline void
  add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
  {
 ...

 +  /* We use notion of cd equivalence to get simplier 

Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Yury Gribov

On 10/17/2014 05:49 PM, Jakub Jelinek wrote:
 Plus if you add misalign tests...

Sure, can do this on Monday.

 -  bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
 -    && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 +  bool use_calls
 +    = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
 +      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
 +      && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;


I agree that original code didn't quite match GNU conventions but can we 
avoid reformatting it to make future backports easier? So e.g.


   bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
+    && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
     && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;

-Y


[PARCH 2/2, x86, PR63534] Fix darwin bootstrap

2014-10-17 Thread Evgeny Stupachenko
Hi,

Some instructions (like the one in PR63534) can have a hidden use of the PIC register.
Therefore we need to keep SET_GOT from being deleted until reload has completed.
The patch prevents SET_GOT from being deleted while the PIC register is a pseudo.
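
The idea of treating the pseudo PIC register as always used can be illustrated with a toy dead-instruction pass driven by use counts. Everything in this sketch is a simplified assumption: struct insn, FIRST_PSEUDO, MAX_REGS and delete_dead are hypothetical, and the pass treats any destination that is never read as dead, which is far cruder than what cse.c or dce.c do.

```c
#include <stdbool.h>

#define FIRST_PSEUDO 16   /* assumed boundary between hard and pseudo registers */
#define MAX_REGS 32

/* Toy instruction: sets DEST and optionally reads SRC (-1 for none).  */
struct insn {
    int dest;
    int src;
    bool deleted;
};

/* Deletes insns whose destination is never read.  When PIC_REG is a pseudo
   it receives one artificial use, so its initialisation survives even if no
   visible insn reads it yet.  Returns the number of deleted insns.  */
static int delete_dead(struct insn *insns, int n, int pic_reg)
{
    int counts[MAX_REGS] = { 0 };
    int deleted = 0;

    for (int i = 0; i < n; i++)
        if (insns[i].src >= 0)
            counts[insns[i].src]++;

    if (pic_reg >= FIRST_PSEUDO)
        counts[pic_reg]++;

    /* Walk backwards so that removing a use can expose earlier dead setters.  */
    for (int i = n - 1; i >= 0; i--)
        if (counts[insns[i].dest] == 0) {
            insns[i].deleted = true;
            deleted++;
            if (insns[i].src >= 0)
                counts[insns[i].src]--;
        }

    return deleted;
}
```

Passing -1 as pic_reg models the behaviour without the extra use, where the setter of the PIC pseudo is removed as soon as its last visible reader goes away.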

Is it ok?

ChangeLog:

2014-10-17  Evgeny Stupachenko  evstu...@gmail.com

PR target/63534
* cse.c (delete_trivially_dead_insns): Consider the PIC register used
while it is a pseudo.
* dce.c (deletable_insn_p): Likewise.

diff --git a/gcc/cse.c b/gcc/cse.c
index be2f31b..062ba45 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -6953,6 +6953,11 @@ delete_trivially_dead_insns (rtx_insn *insns, int nreg)
   /* If no debug insns can be present, COUNTS is just an array
 which counts how many times each pseudo is used.  */
 }
+  /* Pseudo PIC register should be considered as used due to possible
+ new usages generated.  */
+  if (pic_offset_table_rtx
+      && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
+    counts[REGNO (pic_offset_table_rtx)]++;
   /* Go from the last insn to the first and delete insns that only set unused
  registers or copy a register to itself.  As we delete an insn, remove
  usage counts for registers it uses.
diff --git a/gcc/dce.c b/gcc/dce.c
index 5b7d36e..a52a59c 100644
--- a/gcc/dce.c
+++ b/gcc/dce.c
@@ -127,6 +127,10 @@ deletable_insn_p (rtx_insn *insn, bool fast, bitmap arg_stores)
     if (HARD_REGISTER_NUM_P (DF_REF_REGNO (def))
         && global_regs[DF_REF_REGNO (def)])
       return false;
+    /* Initialization of pseudo PIC register should never be removed.  */
+    else if (DF_REF_REG (def) == pic_offset_table_rtx
+             && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
+      return false;

   body = PATTERN (insn);
   switch (GET_CODE (body))


Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 06:15:11PM +0400, Yury Gribov wrote:
 On 10/17/2014 05:49 PM, Jakub Jelinek wrote:
  Plus if you add misalign tests...
 
 Sure, can do this on Monday.

Ok, thanks.

  -  bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
  -    && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
  +  bool use_calls
  +    = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
  +      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
  +      && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 
 I agree that original code didn't quite match GNU conventions but can we
 avoid reformatting it to make future backports easier? So e.g.
 
    bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
 +    && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
      && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;

I can live with that.  So here is updated patch:

2014-10-17  Jakub Jelinek  ja...@redhat.com

* asan.c (instrument_derefs): Allow instrumentation of odd-sized
accesses even for -fsanitize=address.
(execute_sanopt): Only allow use_calls for -fsanitize=kernel-address.

* c-c++-common/asan/instrument-with-calls-1.c: Add
-fno-sanitize=address -fsanitize=kernel-address to dg-options.
* c-c++-common/asan/instrument-with-calls-2.c: Likewise.

--- gcc/asan.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/asan.c  2014-10-17 15:21:29.921495259 +0200
@@ -1707,10 +1707,6 @@ instrument_derefs (gimple_stmt_iterator
   size_in_bytes = int_size_in_bytes (type);
   if (size_in_bytes <= 0)
     return;
-  if ((flag_sanitize & SANITIZE_USER_ADDRESS) != 0
-      && ((size_in_bytes & (size_in_bytes - 1)) != 0
-          || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16))
-    return;
 
   HOST_WIDE_INT bitsize, bitpos;
   tree offset;
@@ -2781,6 +2777,7 @@ execute_sanopt (void)
 }
 
   bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
+    && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
     && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 
   FOR_EACH_BB_FN (bb, cfun)
--- gcc/testsuite/c-c++-common/asan/instrument-with-calls-1.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/testsuite/c-c++-common/asan/instrument-with-calls-1.c   2014-10-17 15:34:06.679627168 +0200
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "--param asan-instrumentation-with-call-threshold=0 -save-temps" } */
+/* { dg-options "-fno-sanitize=address -fsanitize=kernel-address --param asan-instrumentation-with-call-threshold=0 -save-temps" } */
 
 void f(char *a, int *b) {
   *b = *a;
--- gcc/testsuite/c-c++-common/asan/instrument-with-calls-2.c.jj   2014-10-17 12:51:27.0 +0200
+++ gcc/testsuite/c-c++-common/asan/instrument-with-calls-2.c   2014-10-17 15:34:15.569472032 +0200
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "--param asan-instrumentation-with-call-threshold=1 -save-temps" } */
+/* { dg-options "-fno-sanitize=address -fsanitize=kernel-address --param asan-instrumentation-with-call-threshold=1 -save-temps" } */
 
 int x;
 


Jakub


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 06:02:11PM +0400, Ilya Verbin wrote:
 --- /dev/null
 +++ b/libgomp/testsuite/libgomp.c++/examples-4/e.53.2.C
 @@ -0,0 +1,43 @@
 +// { dg-do run }
 +// { dg-require-effective-target offload_device }

Well, this test actually relies not only on offload_device,
but also on a non-shared address space (so, if we ever have an HSA
backend, it would fail there).  So, perhaps not immediately,
but eventually we'll want an effective target for whether the
address space is shared between the offloading device and the
host.

 --- a/libgomp/testsuite/libgomp.c/target-7.c
 +++ b/libgomp/testsuite/libgomp.c/target-7.c
 @@ -1,7 +1,9 @@
 +// { dg-require-effective-target offload_device }
 +

Why?  The test was specially written such that it tests
host fallback (if f is true) too.

  #include <omp.h>
  #include <stdlib.h>
  
 -volatile int v;
 +volatile int v = 0;

Why?

  void
  foo (int f)
 @@ -18,7 +20,7 @@ foo (int f)
if (omp_get_level () != 0 || !omp_is_initial_device ())
  abort ();
#pragma omp target if (v = 1)
 -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 +  if (omp_get_level () != 0 || omp_is_initial_device ())
  abort ();
#pragma omp target device (d) if (v = 1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 @@ -30,7 +32,7 @@ foo (int f)
if (omp_get_level () != 0 || !omp_is_initial_device ())
  abort ();
#pragma omp target if (1)
 -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 +  if (omp_get_level () != 0 || omp_is_initial_device ())
  abort ();
#pragma omp target device (d) if (1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 @@ -59,7 +61,7 @@ foo (int f)
#pragma omp target data if (v = 1) map (to: h)
{
  #pragma omp target if (v = 1)
 -if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 8)
 +if (omp_get_level () != 0 || omp_is_initial_device () || h++ != 8)
abort ();
  #pragma omp target update if (v = 1) from (h)
}
 @@ -87,7 +89,7 @@ foo (int f)
#pragma omp target data if (1) map (to: h)
{
  #pragma omp target if (1)
 -if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 12)
 +if (omp_get_level () != 0 || omp_is_initial_device () || h++ != 12)
abort ();
  #pragma omp target update if (1) from (h)
}

I don't understand any of these changes.

Otherwise it LGTM.

Jakub


Re: [PATCH i386 AVX512 Boostrap] [80/n] Extend expand_sse2_mulvxdi3.

2014-10-17 Thread Kirill Yukhin
Hello,
This is a fix for the bootstrap failure.

Is it OK?

gcc/
* config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Refactor
conditions to fix bootstrap.

--
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7040200..3ddaf3d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45671,21 +45671,12 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
   enum machine_mode mode = GET_MODE (op0);
   rtx t1, t2, t3, t4, t5, t6;
 
-  if (TARGET_AVX512DQ)
-{
-  rtx (*gen) (rtx, rtx, rtx);
-
-  if (mode == V8DImode)
-   gen = gen_avx512dq_mulv8di3;
-  else if (TARGET_AVX512VL)
-   {
- if (mode == V4DImode)
-   gen = gen_avx512dq_mulv4di3;
- else if (mode == V2DImode)
-   gen = gen_avx512dq_mulv2di3;
-   }
-  emit_insn (gen (op0, op1, op2));
-}
+  if (TARGET_AVX512DQ && mode == V8DImode)
+    emit_insn (gen_avx512dq_mulv8di3 (op0, op1, op2));
+  else if (TARGET_AVX512DQ && TARGET_AVX512VL && mode == V4DImode)
+    emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2));
+  else if (TARGET_AVX512DQ && TARGET_AVX512VL && mode == V2DImode)
+    emit_insn (gen_avx512dq_mulv2di3 (op0, op1, op2));
   else if (TARGET_XOP && mode == V2DImode)
 {
   /* op1: A,B,C,D, op2: E,F,G,H */
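
The message does not say which bootstrap error the refactoring addresses. One plausible reading, offered here only as an assumption, is that the removed code could call `gen` without ever assigning it (AVX512DQ enabled, AVX512VL disabled, mode other than V8DImode), which a warnings-as-errors bootstrap would reject. The stand-alone sketch below mirrors the flattened condition chain; the enum names and `pick_mul_expander` are hypothetical stand-ins, not GCC identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes handled by the expander.  */
enum vec_mode { V2DI, V4DI, V8DI };

/* Hypothetical labels for the code path that ends up being taken.  */
enum expander
{
  EXP_AVX512DQ_V8DI,
  EXP_AVX512DQ_V4DI,
  EXP_AVX512DQ_V2DI,
  EXP_OTHER                     /* XOP or generic SSE2 sequence.  */
};

/* Flattened dispatch: every path yields a definite result, so there is
   no function pointer that could be left unassigned.  */
enum expander
pick_mul_expander (bool avx512dq, bool avx512vl, enum vec_mode mode)
{
  if (avx512dq && mode == V8DI)
    return EXP_AVX512DQ_V8DI;
  else if (avx512dq && avx512vl && mode == V4DI)
    return EXP_AVX512DQ_V4DI;
  else if (avx512dq && avx512vl && mode == V2DI)
    return EXP_AVX512DQ_V2DI;
  return EXP_OTHER;
}
```

With AVX512DQ but without AVX512VL, a V4DI multiply now falls through to the remaining branches instead of reaching a call through an unset pointer.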


Re: [PATCH i386 AVX512 Boostrap] [80/n] Extend expand_sse2_mulvxdi3.

2014-10-17 Thread Uros Bizjak
On Fri, Oct 17, 2014 at 4:25 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 This is fix for bootstrap failure.

 Is it OK?

 gcc/
 * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Refactor
 conditions to fix bootstrap.

Well, OK.

Uros.


Re: [PARCH 2/2, x86, PR63534] Fix darwin bootstrap

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 06:16:41PM +0400, Evgeny Stupachenko wrote:
 Hi,
 
 Some instructions (like the one in PR63534) can have a hidden use of the PIC register.
 Therefore SET_GOT must not be deleted until reload has completed.
 The patch prevents SET_GOT from being deleted while the PIC register is a pseudo.

Just curious, do you emit the init_pic_reg unconditionally at the start of
the function in -fpic mode?  What does IRA do in that case, if it sees
a dead setter of something that doesn't seem to be used at that point?
Doesn't it penalize generated code, even if we don't end up with any PIC
references during/after reload?

Jakub


[PATCH, x86, 63534] Fix '-p' profile for 32 bit PIC mode

2014-10-17 Thread Evgeny Stupachenko
Hi,

The patch fixes profiling in 32-bit PIC mode (only the -p option is affected).

x86 bootstrap and make check passed.

SPEC2000 -O2 -p, train data, on Core i7:
CINT -5%
CFP  +1.5%
compared to a compiler before enabling ebx.

Jakub suggested a potential further performance improvement on top of
this patch:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534#c8
A bug is open for it, PR63527, but fixing that bug is more
complicated.

Is it ok?

ChangeLog

2014-10-16  Evgeny Stupachenko  evstu...@gmail.com

PR target/63534
* config/i386/i386.c (x86_function_profiler): Add GOT register init
for mcount call.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a3ca2ed..5117572 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -39119,11 +39126,15 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 }
   else if (flag_pic)
 {
+  fprintf (file,"\tpush\t%%ebx\n");
+  fprintf (file,"\tcall\t__x86.get_pc_thunk.bx\n");
+  fprintf (file,"\taddl\t$_GLOBAL_OFFSET_TABLE_, %%ebx\n");
 #ifndef NO_PROFILE_COUNTERS
   fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
   LPREFIX, labelno);
 #endif
   fprintf (file, "1:\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
+  fprintf (file,"\tpop\t%%ebx\n");
 }
   else
 {


Re: [PATCH 0/17] KASan 4.9 backport

2014-10-17 Thread Yury Gribov

On 10/17/2014 06:18 PM, Jakub Jelinek wrote:

On Fri, Oct 17, 2014 at 06:15:11PM +0400, Yury Gribov wrote:

On 10/17/2014 05:49 PM, Jakub Jelinek wrote:

Plus if you add misalign tests...


Sure, can do this on Monday.


Ok, thanks.


-  bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
-    && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
+  bool use_calls
+    = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
+      && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
+      && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;


I agree that original code didn't quite match GNU conventions but can we
avoid reformatting it to make future backports easier? So e.g.

   bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
+    && (flag_sanitize & SANITIZE_KERNEL_ADDRESS)
     && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;


I can live with that.  So here is updated patch:


Thanks, LGTM.

-Y





Re: [PATCH, x86, 63534] Fix '-p' profile for 32 bit PIC mode

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 06:30:42PM +0400, Evgeny Stupachenko wrote:
 Hi,
 
 The patch fixes profiling in 32-bit PIC mode (only the -p option is affected).
 
 x86 bootstrap and make check passed.
 
 SPEC2000 -O2 -p, train data, on Core i7:
 CINT -5%
 CFP  +1.5%
 compared to a compiler before enabling ebx.
 
 Jakub suggested a potential further performance improvement on top of
 this patch:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534#c8
 A bug is open for it, PR63527, but fixing that bug is more
 complicated.
 
 Is it ok?

Unfortunately I don't think it is ok.
1) you don't set the appropriate bit in pic_labels_used (for ebx)
2) more importantly, it causes the stack to be misaligned (i.e. violating
   the ABI) for the _mcount call, and breaks unwind info.
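
The alignment objection can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the ABI in force requires the stack pointer to be a multiple of 16 immediately before a call instruction; the helper names are hypothetical. The call to the pc thunk is not modelled, since the return address it pushes is popped again when the thunk returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: %esp must be a multiple of 16 right
   before a call instruction.  */
bool
aligned_for_call (uint32_t esp)
{
  return esp % 16 == 0;
}

/* Pushing a 32-bit register moves %esp down by 4 bytes.  */
uint32_t
after_push32 (uint32_t esp)
{
  assert (esp >= 4);            /* No wrap-around in this toy model.  */
  return esp - 4;
}

/* Popping a 32-bit register moves %esp back up by 4 bytes.  */
uint32_t
after_pop32 (uint32_t esp)
{
  return esp + 4;
}
```

If the stack pointer is correctly aligned where the profiler sequence starts, the extra push of %ebx leaves it 4 bytes off at the mcount call; the matching pop restores it only afterwards.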

 2014-10-16  Evgeny Stupachenko  evstu...@gmail.com
 
 PR target/63534
 * config/i386/i386.c (x86_function_profiler): Add GOT register init
 for mcount call.
 
 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index a3ca2ed..5117572 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -39119,11 +39126,15 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
  }
    else if (flag_pic)
  {
 +  fprintf (file,"\tpush\t%%ebx\n");
 +  fprintf (file,"\tcall\t__x86.get_pc_thunk.bx\n");
 +  fprintf (file,"\taddl\t$_GLOBAL_OFFSET_TABLE_, %%ebx\n");
  #ifndef NO_PROFILE_COUNTERS
    fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
    LPREFIX, labelno);
  #endif
    fprintf (file, "1:\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
 +  fprintf (file,"\tpop\t%%ebx\n");
  }
else
  {

Jakub


Re: [PARCH 2/2, x86, PR63534] Fix darwin bootstrap

2014-10-17 Thread Evgeny Stupachenko
Yes, unconditionally.
If pic_reg is unused, RA will allocate a hard register for it and
treat it as free; DCE after reload will then delete SET_GOT.

On Fri, Oct 17, 2014 at 6:20 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Oct 17, 2014 at 06:16:41PM +0400, Evgeny Stupachenko wrote:
 Hi,

 Some instructions (like the one in PR63534) can have a hidden use of the PIC
 register.
 Therefore SET_GOT must not be deleted until reload has completed.
 The patch prevents SET_GOT from being deleted while the PIC register is a pseudo.

 Just curious, do you emit the init_pic_reg unconditionally at the start of
 the function in -fpic mode?  What does IRA do in that case, if it sees
 a dead setter of something that doesn't seem to be used at that point?
 Doesn't it penalize generated code, even if we don't end up with any PIC
 references during/after reload?

 Jakub


[PATCH 0/5] Add preferred_for_{size,speed} attributes

2014-10-17 Thread Richard Sandiford
This patch implements the approach I suggested in:

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00371.html

for fixing PR61360.  To recap, the problem is with the use of enabled in
the i386.md pattern:

(define_insn "*float<SWI48:mode><MODEF:mode>2_sse"
  [(set (match_operand:MODEF 0 "register_operand" "=f,x,x")
        (float:MODEF
          (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m")))]
  "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH"
  "@
   fild%Z1\t%1
   %vcvtsi2<MODEF:ssemodesuffix><SWI48:rex64suffix>\t{%1, %d0|%d0, %1}
   %vcvtsi2<MODEF:ssemodesuffix><SWI48:rex64suffix>\t{%1, %d0|%d0, %1}"
  [(set_attr "type" "fmov,sseicvt,sseicvt")
   (set_attr "prefix" "orig,maybe_vex,maybe_vex")
   (set_attr "mode" "<MODEF:MODE>")
   (set (attr "prefix_rex")
     (if_then_else
       (and (eq_attr "prefix" "maybe_vex")
            (match_test "<SWI48:MODE>mode == DImode"))
       (const_string "1")
       (const_string "*")))
   (set_attr "unit" "i387,*,*")
   (set_attr "athlon_decode" "*,double,direct")
   (set_attr "amdfam10_decode" "*,vector,double")
   (set_attr "bdver1_decode" "*,double,direct")
   (set_attr "fp_int_src" "true")
   (set (attr "enabled")
     (cond [(eq_attr "alternative" "0")
              (symbol_ref "TARGET_MIX_SSE_I387
                           && X87_ENABLE_FLOAT (<MODEF:MODE>mode,
                                                <SWI48:MODE>mode)")
            (eq_attr "alternative" "1")
              /* ??? For sched1 we need constrain_operands to be able to
                 select an alternative.  Leave this enabled before RA.  */
              (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS
                           || optimize_function_for_size_p (cfun)
                           || !(reload_completed
                                || reload_in_progress
                                || lra_in_progress)")
           ]
           (symbol_ref "true")))
   ])

The attribute was only really supposed to test properties of the currently-
selected target.  It wasn't supposed to test function-specific size/speed
properties like the above pattern does.

So the idea was instead to add two new attributes that say whether
an alternative should be used when optimising for speed or size.
These attributes would just be strong optimisation hints; they wouldn't
be correctness properties in the way that enabled is.  There are some
cases where we could end up with a size-only alternative in code
optimised for speed, or vice versa, but the idea would be to reduce
them as far as possible.

The main advantage of this approach is that we can take block-level
size/speed choices into account, rather than just looking at the
function-level choice.
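
As a rough illustration of the idea, the sketch below keeps one bitmask per attribute and combines them per block. It is a minimal model under stated assumptions, not the code in the series: `alt_mask`, `struct insn_attr_masks` and `preferred_alternatives` are hypothetical names, and the model assumes a preferred alternative must also be enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: one bit per insn alternative.  */
typedef uint32_t alt_mask;

/* Hypothetical per-insn record of the three boolean attributes.  */
struct insn_attr_masks
{
  alt_mask enabled;              /* Correctness: usable on this target.  */
  alt_mask preferred_for_size;   /* Hint: fine when optimising for size.  */
  alt_mask preferred_for_speed;  /* Hint: fine when optimising for speed.  */
};

/* Alternatives worth considering in a block with the given size/speed
   choice: the hint mask for that choice, limited to enabled
   alternatives.  */
alt_mask
preferred_alternatives (const struct insn_attr_masks *m, bool block_for_speed)
{
  assert (m != NULL);
  alt_mask hint = block_for_speed ? m->preferred_for_speed
                                  : m->preferred_for_size;
  return m->enabled & hint;
}
```

For the pattern above on a target without inter-unit conversions, this would correspond to all three alternatives being enabled while alternative 1 is dropped from the speed mask only, so a hot block sees alternatives 0 and 2 and a cold block sees all three.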

Series tested on x86_64-linux-gnu.

Thanks,
Richard



[PATCH 1/5] Add recog_constrain_insn

2014-10-17 Thread Richard Sandiford
This patch just adds a new utility function called extract_constrain_insn,
to go alongside the existing extract_constrain_insn_cached.

Note that the extract_insn in lra.c wasn't used when checking is disabled.
The function just moved on to the next instruction straight away.

Richard


gcc/
* recog.h (extract_constrain_insn): Declare.
* recog.c (extract_constrain_insn): New function.
* lra.c (check_rtl): Use it.
* postreload.c (reload_cse_simplify_operands): Likewise.
* reg-stack.c (check_asm_stack_operands): Likewise.
(subst_asm_stack_regs): Likewise.
* regcprop.c (copyprop_hardreg_forward_1): Likewise.
* regrename.c (build_def_use): Likewise.
* sel-sched.c (get_reg_class): Likewise.
* config/arm/arm.c (note_invalid_constants): Likewise.
* config/s390/predicates.md (execute_operation): Likewise.

Index: gcc/recog.h
===
--- gcc/recog.h 2014-09-18 11:40:31.223690858 +0100
+++ gcc/recog.h 2014-10-17 15:44:50.219398486 +0100
@@ -134,6 +134,7 @@ extern void add_clobbers (rtx, int);
 extern int added_clobbers_hard_reg_p (int);
 extern void insn_extract (rtx_insn *);
 extern void extract_insn (rtx_insn *);
+extern void extract_constrain_insn (rtx_insn *insn);
 extern void extract_constrain_insn_cached (rtx_insn *);
 extern void extract_insn_cached (rtx_insn *);
 extern void preprocess_constraints (int, int, const char **,
Index: gcc/recog.c
===
--- gcc/recog.c 2014-09-22 08:36:23.889794255 +0100
+++ gcc/recog.c 2014-10-17 15:44:50.219398486 +0100
@@ -2110,6 +2110,17 @@ extract_insn_cached (rtx_insn *insn)
   recog_data.insn = insn;
 }
 
+/* Do uncached extract_insn, constrain_operands and complain about failures.
+   This should be used when extracting a pre-existing constrained instruction
+   if the caller wants to know which alternative was chosen.  */
+void
+extract_constrain_insn (rtx_insn *insn)
+{
+  extract_insn (insn);
+  if (!constrain_operands (reload_completed))
+fatal_insn_not_found (insn);
+}
+
 /* Do cached extract_insn, constrain_operands and complain about failures.
Used by insn_attrtab.  */
 void
Index: gcc/lra.c
===
--- gcc/lra.c   2014-09-26 16:05:57.868394574 +0100
+++ gcc/lra.c   2014-10-17 15:44:50.219398486 +0100
@@ -1919,8 +1919,9 @@ check_rtl (bool final_p)
   {
if (final_p)
  {
-   extract_insn (insn);
-   lra_assert (constrain_operands (1));
+#ifdef ENABLED_CHECKING
+   extract_constrain_insn (insn);
+#endif
continue;
  }
/* LRA code is based on assumption that all addresses can be
Index: gcc/postreload.c
===
--- gcc/postreload.c2014-08-26 12:09:02.182959856 +0100
+++ gcc/postreload.c2014-10-17 15:44:50.219398486 +0100
@@ -401,15 +401,11 @@ reload_cse_simplify_operands (rtx_insn *
   /* Array of alternatives, sorted in order of decreasing desirability.  */
   int *alternative_order;
 
-  extract_insn (insn);
+  extract_constrain_insn (insn);
 
   if (recog_data.n_alternatives == 0 || recog_data.n_operands == 0)
 return 0;
 
-  /* Figure out which alternative currently matches.  */
-  if (! constrain_operands (1))
-fatal_insn_not_found (insn);
-
   alternative_reject = XALLOCAVEC (int, recog_data.n_alternatives);
   alternative_nregs = XALLOCAVEC (int, recog_data.n_alternatives);
   alternative_order = XALLOCAVEC (int, recog_data.n_alternatives);
Index: gcc/reg-stack.c
===
--- gcc/reg-stack.c 2014-09-18 11:40:31.307689884 +0100
+++ gcc/reg-stack.c 2014-10-17 15:44:50.219398486 +0100
@@ -469,8 +469,7 @@ check_asm_stack_operands (rtx_insn *insn
 
   /* Find out what the constraints require.  If no constraint
  alternative matches, this asm is malformed.  */
-  extract_insn (insn);
-  constrain_operands (1);
+  extract_constrain_insn (insn);
 
   preprocess_constraints (insn);
 
@@ -2016,8 +2015,7 @@ subst_asm_stack_regs (rtx_insn *insn, st
   /* Find out what the constraints required.  If no constraint
  alternative matches, that is a compiler bug: we should have caught
  such an insn in check_asm_stack_operands.  */
-  extract_insn (insn);
-  constrain_operands (1);
+  extract_constrain_insn (insn);
 
   preprocess_constraints (insn);
   const operand_alternative *op_alt = which_op_alt ();
Index: gcc/regcprop.c
===
--- gcc/regcprop.c  2014-10-13 08:02:41.225135081 +0100
+++ gcc/regcprop.c  2014-10-17 15:44:50.227398391 +0100
@@ -762,9 +762,7 @@ copyprop_hardreg_forward_1 (basic_block
}
 
   set = single_set (insn);
-  extract_insn (insn);
-  if (! constrain_operands (1))
-   

[PATCH 2/5] Add preferred_for_{size,speed} attributes

2014-10-17 Thread Richard Sandiford
This is the main patch, to add new preferred_for_size and
preferred_for_speed attributes that can be used to selectively disable
alternatives when optimising for size or speed.  As explained in the
docs, the new attributes are just optimisation hints and it is possible
that size-only alternatives will sometimes end up in a block that's
optimised for speed, or vice versa.

The patch deals with code that directly accesses the enabled_alternatives
mask and that ought to take size/speed choices into account.  The next
patch deals with indirect uses.  Note that I'm not making reload support
these attributes for hopefully obvious reasons :-)

Richard


gcc/
* doc/md.texi: Document preferred_for_size and preferred_for_speed
attributes.
* genattr.c (main): Handle preferred_for_size and
preferred_for_speed in the same way as enabled.
* recog.h (bool_attr): New enum.
(target_recog): Replace x_enabled_alternatives with x_bool_attr_masks.
(get_preferred_alternatives, check_bool_attrs): Declare.
* recog.c (have_bool_attr, get_bool_attr, get_bool_attr_mask_uncached)
(get_bool_attr_mask, get_preferred_alternatives, check_bool_attrs):
New functions.
(get_enabled_alternatives): Use get_bool_attr_mask.
* ira-costs.c (record_reg_classes): Use get_preferred_alternatives
instead of recog_data.enabled_alternatives.
* ira.c (ira_setup_alts): Likewise.
* postreload.c (reload_cse_simplify_operands): Likewise.
* config/i386/i386.c (ix86_legitimate_combined_insn): Likewise.
* ira-lives.c (preferred_alternatives): New variable.
(process_bb_node_lives): Set it.
(check_and_make_def_conflict, make_early_clobber_and_input_conflicts)
(single_reg_class, ira_implicitly_set_insn_hard_regs): Use it instead
of recog_data.enabled_alternatives.
* lra-int.h (lra_insn_recog_data): Replace enabled_alternatives
with preferred_alternatives.
* lra-constraints.c (process_alt_operands): Update accordingly.
* lra.c (lra_set_insn_recog_data): Likewise.
(lra_update_insn_recog_data): Assert check_bool_attrs.

Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi 2014-10-07 13:12:12.227445290 +0100
+++ gcc/doc/md.texi 2014-10-17 15:47:34.349453560 +0100
@@ -1080,7 +1080,7 @@ the addressing register.
 * Class Preferences::   Constraints guide which hard register to put things in.
 * Modifiers::   More precise control over effects of constraints.
 * Machine Constraints:: Existing constraints for some particular machines.
-* Disable Insn Alternatives:: Disable insn alternatives using the @code{enabled} attribute.
+* Disable Insn Alternatives:: Disable insn alternatives using attributes.
 * Define Constraints::  How to define machine-specific constraints.
 * C Constraint Interface:: How to test constraints from C code.
 @end menu
@@ -4006,42 +4006,49 @@ Unsigned constant valid for BccUI instru
 @subsection Disable insn alternatives using the @code{enabled} attribute
 @cindex enabled
 
-The @code{enabled} insn attribute may be used to disable insn
-alternatives that are not available for the current subtarget.
-This is useful when adding new instructions to an existing pattern
-which are only available for certain cpu architecture levels as
-specified with the @code{-march=} option.
-
-If an insn alternative is disabled, then it will never be used.  The
-compiler treats the constraints for the disabled alternative as
-unsatisfiable.
+There are three insn attributes that may be used to selectively disable
+instruction alternatives:
 
-In order to make use of the @code{enabled} attribute a back end has to add
-in the machine description files:
+@table @code
+@item enabled
+Says whether an alternative is available on the current subtarget.
 
-@enumerate
-@item
-A definition of the @code{enabled} insn attribute.  The attribute is
-defined as usual using the @code{define_attr} command.  This
-definition should be based on other insn attributes and/or target flags.
-The attribute must be a static property of the subtarget; that is, it
-must not depend on the current operands or any other dynamic context
-(for example, the location of the insn within the body of a loop).
-
-The @code{enabled} attribute is a numeric attribute and should evaluate to
-@code{(const_int 1)} for an enabled alternative and to
-@code{(const_int 0)} otherwise.
-@item
-A definition of another insn attribute used to describe for what
-reason an insn alternative might be available or
-not.  E.g. @code{cpu_facility} as in the example below.
-@item
-An assignment for the second attribute to each insn definition
-combining instructions which are not all available under the same
-circumstances.  (Note: It obviously only makes sense for definitions
-with more than one alternative.  Otherwise the insn pattern should be
-disabled 

[PATCH 3/5] Pass an alternative_mask to constrain_operands

2014-10-17 Thread Richard Sandiford
After the previous patch there are cases where we want to constrain
operands to any enabled alternative and cases where we want to also take
size/speed preferences into account.  The former applies when
constraining an existing instruction (which might originally have been
in a block with a different size/speed choice) or when making global
decisions.  The latter applies when evaluating a potential optimisation.

This patch therefore passes the mask of allowable alternatives as a
parameter to constrain_operands.
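
A toy model of the new calling convention may help. In the sketch below the caller passes the mask of alternatives it is willing to accept, and the matcher skips everything else. The function name and the precomputed `satisfied` mask are hypothetical simplifications; the real constrain_operands matches operands against constraint strings rather than taking such a mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: one bit per insn alternative.  */
typedef uint32_t alt_mask;

/* Return the index of the first alternative that the caller allows and
   whose constraints are satisfied, or -1 if there is none.  SATISFIED is
   a precomputed stand-in for real constraint matching.  */
int
first_matching_alternative (alt_mask allowed, alt_mask satisfied,
                            int n_alternatives)
{
  assert (n_alternatives >= 0 && n_alternatives <= 32);
  for (int i = 0; i < n_alternatives; i++)
    {
      alt_mask bit = (alt_mask) 1 << i;
      if ((allowed & bit) && (satisfied & bit))
        return i;
    }
  return -1;
}
```

Passing the enabled mask corresponds to the "existing instruction" case described above, while passing a narrower preferred mask corresponds to evaluating a potential optimisation: the same operands can match under the first and be rejected under the second.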

Richard


gcc/
* recog.h (constrain_operands): Add an alternative_mask parameter.
(constrain_operands_cached): Likewise.
(get_preferred_alternatives): Declare new form.
* recog.c (get_preferred_alternatives): New bb-taking instance.
(constrain_operands): Take the set of available alternatives as
a parameter.
(check_asm_operands, insn_invalid_p, extract_constrain_insn)
(extract_constrain_insn_cached): Update calls to constrain_operands.
* caller-save.c (reg_save_code): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload-gcse.c (eliminate_partially_redundant_load): Likewise.
* ree.c (combine_reaching_defs): Likewise.
* reload.c (can_reload_into): Likewise.
* reload1.c (reload, reload_as_needed, inc_for_reload): Likewise.
(gen_reload_chain_without_interm_reg_p, emit_input_reload_insns)
(emit_insn_if_valid_for_reload): Likewise.
* reorg.c (fill_slots_from_thread): Likewise.
* config/i386/i386.c (ix86_attr_length_address_default): Likewise.
* config/pa/pa.c (pa_can_combine_p): Likewise.
* config/rl78/rl78.c (insn_ok_now): Likewise.
* config/sh/sh.md (define_peephole2): Likewise.
* final.c (final_scan_insn): Update call to constrain_operands_cached.

Index: gcc/recog.h
===
--- gcc/recog.h 2014-10-17 15:50:02.0 +0100
+++ gcc/recog.h 2014-10-17 15:50:02.627695847 +0100
@@ -95,8 +95,8 @@ extern void confirm_change_group (void);
 extern int apply_change_group (void);
 extern int num_validated_changes (void);
 extern void cancel_changes (int);
-extern int constrain_operands (int);
-extern int constrain_operands_cached (int);
+extern int constrain_operands (int, alternative_mask);
+extern int constrain_operands_cached (rtx_insn *, int);
 extern int memory_address_addr_space_p (enum machine_mode, rtx, addr_space_t);
 #define memory_address_p(mode,addr) \
memory_address_addr_space_p ((mode), (addr), ADDR_SPACE_GENERIC)
@@ -414,6 +414,7 @@ #define this_target_recog (default_targ
 
 alternative_mask get_enabled_alternatives (rtx_insn *);
 alternative_mask get_preferred_alternatives (rtx_insn *);
+alternative_mask get_preferred_alternatives (rtx_insn *, basic_block);
 bool check_bool_attrs (rtx_insn *);
 
 void recog_init ();
Index: gcc/recog.c
===
--- gcc/recog.c 2014-10-17 15:50:02.0 +0100
+++ gcc/recog.c 2014-10-17 15:50:02.627695847 +0100
@@ -155,8 +155,9 @@ check_asm_operands (rtx x)
   if (reload_completed)
 {
   /* ??? Doh!  We've not got the wrapping insn.  Cook one up.  */
-  extract_insn (make_insn_raw (x));
-  constrain_operands (1);
+  rtx_insn *insn = make_insn_raw (x);
+  extract_insn (insn);
+  constrain_operands (1, get_enabled_alternatives (insn));
       return which_alternative >= 0;
 }
 
@@ -360,7 +361,7 @@ insn_invalid_p (rtx_insn *insn, bool in_
 {
   extract_insn (insn);
 
-  if (! constrain_operands (1))
+  if (! constrain_operands (1, get_preferred_alternatives (insn)))
return 1;
 }
 
@@ -2159,6 +2160,21 @@ get_preferred_alternatives (rtx_insn *in
 return get_bool_attr_mask (insn, BA_PREFERRED_FOR_SIZE);
 }
 
+/* Return the set of alternatives of INSN that are allowed by the current
+   target and are preferred for the size/speed optimization choice
+   associated with BB.  Passing a separate BB is useful if INSN has not
+   been emitted yet or if we are considering moving it to a different
+   block.  */
+
+alternative_mask
+get_preferred_alternatives (rtx_insn *insn, basic_block bb)
+{
+  if (optimize_bb_for_speed_p (bb))
+return get_bool_attr_mask (insn, BA_PREFERRED_FOR_SPEED);
+  else
+return get_bool_attr_mask (insn, BA_PREFERRED_FOR_SIZE);
+}
+
 /* Assert that the cached boolean attributes for INSN are still accurate.
The backend is required to define these attributes in a way that only
depends on the current target (rather than operands, compiler phase,
@@ -2199,7 +2215,7 @@ extract_insn_cached (rtx_insn *insn)
 extract_constrain_insn (rtx_insn *insn)
 {
   extract_insn (insn);
-  if (!constrain_operands (reload_completed))
+  if (!constrain_operands (reload_completed, get_enabled_alternatives (insn)))
 fatal_insn_not_found (insn);
 }
 
@@ -2210,16 

[PATCH 4/5] Remove recog_data.enabled_alternatives

2014-10-17 Thread Richard Sandiford
After the previous patches, this one gets rid of
recog_data.enabled_alternatives and its one remaining use.

Richard


gcc/
* recog.h (recog_data_d): Remove enabled_alternatives.
* recog.c (extract_insn): Don't set it.
* reload.c (find_reloads): Call get_enabled_alternatives.

Index: gcc/recog.h
===
--- gcc/recog.h 2014-10-17 15:50:02.627695847 +0100
+++ gcc/recog.h 2014-10-17 15:51:59.662308095 +0100
@@ -250,12 +250,6 @@ struct recog_data_d
   /* True if insn is ASM_OPERANDS.  */
   bool is_asm;
 
-  /* Specifies whether an insn alternative is enabled using the `enabled'
- attribute in the insn pattern definition.  For back ends not using
- the `enabled' attribute the bits are always set to 1 in expand_insn.
- Bits beyond the last alternative are also set to 1.  */
-  alternative_mask enabled_alternatives;
-
   /* In case we are caching, hold insn data was generated for.  */
   rtx insn;
 };
Index: gcc/recog.c
===
--- gcc/recog.c 2014-10-17 15:50:02.627695847 +0100
+++ gcc/recog.c 2014-10-17 15:51:59.662308095 +0100
@@ -2339,8 +2339,6 @@ extract_insn (rtx_insn *insn)
 
   gcc_assert (recog_data.n_alternatives <= MAX_RECOG_ALTERNATIVES);
 
-  recog_data.enabled_alternatives = get_enabled_alternatives (insn);
-
   recog_data.insn = NULL;
   which_alternative = -1;
 }
Index: gcc/reload.c
===
--- gcc/reload.c2014-10-17 15:50:02.627695847 +0100
+++ gcc/reload.c2014-10-17 15:51:59.666308048 +0100
@@ -2997,13 +2997,14 @@ find_reloads (rtx_insn *insn, int replac
 
  First loop over alternatives.  */
 
+  alternative_mask enabled = get_enabled_alternatives (insn);
   for (this_alternative_number = 0;
        this_alternative_number < n_alternatives;
this_alternative_number++)
 {
   int swapped;
 
-  if (!TEST_BIT (recog_data.enabled_alternatives, this_alternative_number))
+  if (!TEST_BIT (enabled, this_alternative_number))
{
  int i;
 



[PATCH 5/5] Use preferred_for_speed in i386.md

2014-10-17 Thread Richard Sandiford
Undo the original fix for PR61360 and use preferred_for_speed in the
problematic pattern.

I've not written many gcc.target/i386 tests so the markup might need
some work.

Richard


gcc/
* lra.c (lra): Remove call to recog_init.
* config/i386/i386.md (preferred_for_speed): New attribute.
(*float<SWI48:mode><MODEF:mode>2_sse): Override it instead of
enabled.

gcc/testsuite/
* gcc.target/i386/conversion-2.c: New test.

Index: gcc/lra.c
===
--- gcc/lra.c   2014-10-17 15:47:34.357453465 +0100
+++ gcc/lra.c   2014-10-17 15:53:10.889463339 +0100
@@ -2116,11 +2116,6 @@ lra (FILE *f)
 
   lra_in_progress = 1;
 
-  /* The enable attributes can change their values as LRA starts
- although it is a bad practice.  To prevent reuse of the outdated
- values, clear them.  */
-  recog_init ();
-
   lra_live_range_iter = lra_coalesce_iter = 0;
   lra_constraint_iter = lra_constraint_iter_after_spill = 0;
   lra_inheritance_iter = lra_undo_inheritance_iter = 0;
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md 2014-10-01 10:48:51.079918153 +0100
+++ gcc/config/i386/i386.md 2014-10-17 15:53:10.889463339 +0100
@@ -779,6 +779,8 @@ (define_attr "enabled" ""
        ]
        (const_int 1)))
 
+(define_attr "preferred_for_speed" "" (const_int 1))
+
 ;; Describe a user's asm statement.
 (define_asm_attributes
   [(set_attr "length" "128")
@@ -4794,16 +4796,12 @@ (define_insn "*float<SWI48:mode><MODEF:m
               (symbol_ref "TARGET_MIX_SSE_I387
                            && X87_ENABLE_FLOAT (<MODEF:MODE>mode,
                                                 <SWI48:MODE>mode)")
-            (eq_attr "alternative" "1")
-              /* ??? For sched1 we need constrain_operands to be able to
-                 select an alternative.  Leave this enabled before RA.  */
-              (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS
-                           || optimize_function_for_size_p (cfun)
-                           || !(reload_completed
-                                || reload_in_progress
-                                || lra_in_progress)")
            ]
            (symbol_ref "true")))
+   (set (attr "preferred_for_speed")
+     (cond [(eq_attr "alternative" "1")
+              (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS")]
+           (symbol_ref "true")))
    ])
 
 (define_insn "*float<SWI48x:mode><MODEF:mode>2_i387"
Index: gcc/testsuite/gcc.target/i386/conversion-2.c
===
--- /dev/null   2014-10-06 08:13:11.214126005 +0100
+++ gcc/testsuite/gcc.target/i386/conversion-2.c    2014-10-17 15:53:10.893463291 +0100
@@ -0,0 +1,35 @@
+/* { dg-options "-O2 -fno-toplevel-reorder -mfpmath=sse" } */
+/* { dg-require-effective-target lp64 } */
+
+void __attribute__ ((hot, target ("tune=bdver2")))
+f1 (int x)
+{
+  register float f asm ("%xmm0") = x;
+  asm volatile ("#f" :: "x" (f));
+}
+
+void __attribute__ ((cold, target ("tune=bdver2")))
+f2 (int x)
+{
+  register float f asm ("%xmm1") = x;
+  asm volatile ("#f" :: "x" (f));
+}
+
+void __attribute__ ((hot, target ("tune=bdver2")))
+f3 (int x)
+{
+  register float f asm ("%xmm2") = x;
+  asm volatile ("#f" :: "x" (f));
+}
+
+void __attribute__ ((cold, target ("tune=bdver2")))
+f4 (int x)
+{
+  register float f asm ("%xmm3") = x;
+  asm volatile ("#f" :: "x" (f));
+}
+
+/* { dg-final { scan-assembler "sp\\\), %xmm0" } } */
+/* { dg-final { scan-assembler "di, %xmm1" } } */
+/* { dg-final { scan-assembler "sp\\\), %xmm2" } } */
+/* { dg-final { scan-assembler "di, %xmm3" } } */



Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Ilya Verbin
On 17 Oct 16:14, Jakub Jelinek wrote:
  -volatile int v;
  +volatile int v = 0;
 
 Why?

Ok, I'll revert it back.

  --- a/libgomp/testsuite/libgomp.c/target-7.c
  +++ b/libgomp/testsuite/libgomp.c/target-7.c
  @@ -1,7 +1,9 @@
  +// { dg-require-effective-target offload_device }
  +
 
 Why?  The test was specially written such that it tests
 host fallback (if f is true) too.

   void
   foo (int f)
  @@ -18,7 +20,7 @@ foo (int f)
     if (omp_get_level () != 0 || !omp_is_initial_device ())
       abort ();
     #pragma omp target if (v <= 1)
  -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
  +  if (omp_get_level () != 0 || omp_is_initial_device ())
       abort ();
     #pragma omp target device (d) if (v <= 1)
     if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
  @@ -30,7 +32,7 @@ foo (int f)
     if (omp_get_level () != 0 || !omp_is_initial_device ())
       abort ();
     #pragma omp target if (1)
  -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
  +  if (omp_get_level () != 0 || omp_is_initial_device ())
       abort ();
     #pragma omp target device (d) if (1)
     if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
  @@ -59,7 +61,7 @@ foo (int f)
     #pragma omp target data if (v <= 1) map (to: h)
     {
       #pragma omp target if (v <= 1)
  -    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 8)
  +    if (omp_get_level () != 0 || omp_is_initial_device () || h++ != 8)
         abort ();
       #pragma omp target update if (v <= 1) from (h)
     }
  @@ -87,7 +89,7 @@ foo (int f)
     #pragma omp target data if (1) map (to: h)
     {
       #pragma omp target if (1)
  -    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 12)
  +    if (omp_get_level () != 0 || omp_is_initial_device () || h++ != 12)
         abort ();
       #pragma omp target update if (1) from (h)
     }
 
 I don't understand any of these changes.

Here in the original test you have:

  #pragma omp target if (v <= 1)
  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
    abort ();
  #pragma omp target device (d) if (v <= 1)
  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
    abort ();

The two if-statements are identical, but the target pragmas have different
clauses.  The second pragma depends on device (d), so
(f && !omp_is_initial_device ()) works fine.  The first pragma has no device
clause, so where its region runs doesn't depend on 'f'; if an offload device
is present, this check will fail.

So, to have this test working both with offloading and fallback, we need to
remove all pragmas without device-clause.
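
The failure case can be spelled out as a small truth table. The sketch below is a plain-C model of the check, not part of the testcase: `original_check_aborts` is a hypothetical helper, and the model assumes that f != 0 means the device (d) regions are expected to run as host fallback, while a region without a device clause runs on the offload device whenever one is present.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the condition guarding abort () in the original test:
   omp_get_level () != 0 || (f && !omp_is_initial_device ()).  */
bool
original_check_aborts (int level, bool f, bool on_initial_device)
{
  return level != 0 || (f && !on_initial_device);
}
```

With f set and an offload device present, a region without a device clause is not on the initial device, so the model reports an abort; the same inputs are harmless for a device (d) region that really fell back to the host.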

  -- Ilya


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 06:58:17PM +0400, Ilya Verbin wrote:
 Here in the original test you have:
 
   #pragma omp target if (v <= 1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
     abort ();
   #pragma omp target device (d) if (v <= 1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
     abort ();
 
 There are 2 same if-statements, but target pragmas have different clauses.
 The second depends on device (d), and (f && !omp_is_initial_device ()) works
 fine.  But the first one doesn't depend on 'f', and if we have offload device,
 this check will fail.
 
 So, to have this test working both with offloading and fallback, we need to
 remove all pragmas without device-clause.

Well, there is no need to remove them, just the
|| (f && !omp_is_initial_device ())
should be dropped from target regions without device (d) on them.
Where there is no f && guard, the condition should stay.
Do you agree?

Jakub


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Ilya Verbin
On 17 Oct 17:10, Jakub Jelinek wrote:
 On Fri, Oct 17, 2014 at 06:58:17PM +0400, Ilya Verbin wrote:
  Here in the original test you have:
  
    #pragma omp target if (v <= 1)
    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
      abort ();
    #pragma omp target device (d) if (v <= 1)
    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
      abort ();
  
  There are two identical if-statements, but the target pragmas have different
  clauses.  The second one depends on device (d), so the
  (f && !omp_is_initial_device ()) check works fine.  But the first one doesn't
  depend on 'f', so if an offload device is present, this check will fail.
  
  So, to have this test work both with offloading and with host fallback, we
  need to remove all pragmas without a device clause.
 
 Well, there is no need to remove them, just the
   || (f && !omp_is_initial_device ())
 part should be dropped from target regions without device (d) on them.
 Where there is no f && guard, the condition should stay.
 Do you agree?

Yes, should I re-post the patch?

  -- Ilya


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 07:17:31PM +0400, Ilya Verbin wrote:
 On 17 Oct 17:10, Jakub Jelinek wrote:
  On Fri, Oct 17, 2014 at 06:58:17PM +0400, Ilya Verbin wrote:
   Here in the original test you have:
   
     #pragma omp target if (v <= 1)
     if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
       abort ();
     #pragma omp target device (d) if (v <= 1)
     if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
       abort ();
   
   There are two identical if-statements, but the target pragmas have different
   clauses.  The second one depends on device (d), so the
   (f && !omp_is_initial_device ()) check works fine.  But the first one doesn't
   depend on 'f', so if an offload device is present, this check will fail.
   
   So, to have this test work both with offloading and with host fallback, we
   need to remove all pragmas without a device clause.
  
  Well, there is no need to remove them, just the
    || (f && !omp_is_initial_device ())
  part should be dropped from target regions without device (d) on them.
  Where there is no f && guard, the condition should stay.
  Do you agree?
 
 Yes, should I re-post the patch?

Guess just the target-7.c patch is enough, to make sure we agree on the same
thing.

Jakub


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Ilya Verbin
On 17 Oct 17:18, Jakub Jelinek wrote:
 On Fri, Oct 17, 2014 at 07:17:31PM +0400, Ilya Verbin wrote:
  On 17 Oct 17:10, Jakub Jelinek wrote:
   Well, there is no need to remove them, just the
     || (f && !omp_is_initial_device ())
   part should be dropped from target regions without device (d) on them.
   Where there is no f && guard, the condition should stay.
   Do you agree?
  
  Yes, should I re-post the patch?
 
 Guess just the target-7.c patch is enough, to make sure we agree on the same
 thing.

Here it is:

diff --git a/libgomp/testsuite/libgomp.c/target-7.c b/libgomp/testsuite/libgomp.c/target-7.c
index 90de6c5..0fe6150 100644
--- a/libgomp/testsuite/libgomp.c/target-7.c
+++ b/libgomp/testsuite/libgomp.c/target-7.c
@@ -18,7 +18,7 @@ foo (int f)
   if (omp_get_level () != 0 || !omp_is_initial_device ())
     abort ();
   #pragma omp target if (v <= 1)
-  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
+  if (omp_get_level () != 0)
     abort ();
   #pragma omp target device (d) if (v <= 1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
@@ -30,7 +30,7 @@ foo (int f)
   if (omp_get_level () != 0 || !omp_is_initial_device ())
     abort ();
   #pragma omp target if (1)
-  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
+  if (omp_get_level () != 0)
     abort ();
   #pragma omp target device (d) if (1)
   if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
@@ -59,7 +59,7 @@ foo (int f)
   #pragma omp target data if (v <= 1) map (to: h)
   {
     #pragma omp target if (v <= 1)
-    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 8)
+    if (omp_get_level () != 0 || h++ != 8)
       abort ();
     #pragma omp target update if (v <= 1) from (h)
   }
@@ -87,7 +87,7 @@ foo (int f)
   #pragma omp target data if (1) map (to: h)
   {
     #pragma omp target if (1)
-    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 12)
+    if (omp_get_level () != 0 || h++ != 12)
       abort ();
     #pragma omp target update if (1) from (h)
   }


  -- Ilya


Re: [PATCH 7/n] OpenMP 4.0 offloading infrastructure: testsuite

2014-10-17 Thread Jakub Jelinek
On Fri, Oct 17, 2014 at 07:29:26PM +0400, Ilya Verbin wrote:
 On 17 Oct 17:18, Jakub Jelinek wrote:
  On Fri, Oct 17, 2014 at 07:17:31PM +0400, Ilya Verbin wrote:
   On 17 Oct 17:10, Jakub Jelinek wrote:
    Well, there is no need to remove them, just the
      || (f && !omp_is_initial_device ())
    part should be dropped from target regions without device (d) on them.
    Where there is no f && guard, the condition should stay.
Do you agree?
   
   Yes, should I re-post the patch?
  
  Guess just the target-7.c patch is enough, to make sure we agree on the same
  thing.
 
 Here it is:

LGTM, thanks.
 
 diff --git a/libgomp/testsuite/libgomp.c/target-7.c b/libgomp/testsuite/libgomp.c/target-7.c
 index 90de6c5..0fe6150 100644
 --- a/libgomp/testsuite/libgomp.c/target-7.c
 +++ b/libgomp/testsuite/libgomp.c/target-7.c
 @@ -18,7 +18,7 @@ foo (int f)
    if (omp_get_level () != 0 || !omp_is_initial_device ())
      abort ();
    #pragma omp target if (v <= 1)
 -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 +  if (omp_get_level () != 0)
      abort ();
    #pragma omp target device (d) if (v <= 1)
    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 @@ -30,7 +30,7 @@ foo (int f)
    if (omp_get_level () != 0 || !omp_is_initial_device ())
      abort ();
    #pragma omp target if (1)
 -  if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 +  if (omp_get_level () != 0)
      abort ();
    #pragma omp target device (d) if (1)
    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()))
 @@ -59,7 +59,7 @@ foo (int f)
    #pragma omp target data if (v <= 1) map (to: h)
    {
      #pragma omp target if (v <= 1)
 -    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 8)
 +    if (omp_get_level () != 0 || h++ != 8)
        abort ();
      #pragma omp target update if (v <= 1) from (h)
    }
 @@ -87,7 +87,7 @@ foo (int f)
    #pragma omp target data if (1) map (to: h)
    {
      #pragma omp target if (1)
 -    if (omp_get_level () != 0 || (f && !omp_is_initial_device ()) || h++ != 12)
 +    if (omp_get_level () != 0 || h++ != 12)
        abort ();
      #pragma omp target update if (1) from (h)
    }

Jakub


Re: [libatomic PATCH] Fix libatomic behavior for big endian toolchain

2014-10-17 Thread Joseph S. Myers
Changes to architecture-independent files must use 
architecture-independent conditionals, so __BYTE_ORDER__ not __ARMEB__.
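
A minimal sketch of such a conditional, using the compiler-predefined
__BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ macros.  The function subword_shift
and its parameters are hypothetical and only illustrate the pattern; this is
not the libatomic code under review.

```c
#include <stdint.h>

/* Bit position of a SIZE-byte field located BYTE_OFFSET bytes into an
   aligned word of WORD_SIZE bytes.  The test is on __BYTE_ORDER__, so it
   works for any big-endian target, not just one architecture.  */
static inline unsigned
subword_shift (unsigned byte_offset, unsigned size, unsigned word_size)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  return (word_size - size - byte_offset) * 8;
#else
  return byte_offset * 8;
#endif
}
```

Shifting a 32-bit word right by subword_shift (i, 1, 4) and masking with 0xff
then yields the byte stored at offset i in memory on either byte order.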

-- 
Joseph S. Myers
jos...@codesourcery.com


[gomp4] Use GOMP_PLUGIN_ not gomp_plugin_ for libgomp plugin API

2014-10-17 Thread Julian Brown
Hi,

As the title says, this patch makes the libgomp plugin API use the
GOMP_PLUGIN_ prefix rather than gomp_plugin_. This is purely a
mechanical change.
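
For readers unfamiliar with the layout, the pattern being renamed is a thin
exported wrapper around an internal libgomp routine.  A self-contained sketch
follows.  The gomp_malloc body here is a stand-in written for illustration,
on the assumption that the real one reports failure through libgomp's own
error routines.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for libgomp's internal allocator (illustrative only).  */
static void *
gomp_malloc (size_t size)
{
  void *ret = malloc (size);
  if (ret == NULL)
    {
      fprintf (stderr, "out of memory allocating %lu bytes\n",
               (unsigned long) size);
      abort ();
    }
  return ret;
}

/* Plugin-facing entry point: same behaviour, exported under the
   upper-case GOMP_PLUGIN_ prefix.  */
void *
GOMP_PLUGIN_malloc (size_t size)
{
  return gomp_malloc (size);
}
```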

OK for the gomp4 branch?

Thanks,

Julian

ChangeLog

libgomp/
* libgomp-plugin.c (gomp_plugin_*): Rename to...
(GOMP_PLUGIN_*): This.
* libgomp-plugin.h: Likewise.
* libgomp.map: Likewise.
* oacc-host.c (GOMP): Use GOMP_PLUGIN_ in macro expansion.
* oacc-plugin.c (gomp_plugin_*): Rename to...
(GOMP_PLUGIN_*): This.
* plugin-nvptx.c: Likewise.

commit cce63ddb8895d3b51a176d68045b7920affc05e5
Author: Julian Brown jul...@codesourcery.com
Date:   Wed Oct 15 02:05:08 2014 -0700

Use GOMP_PLUGIN_ not gomp_plugin_ for libgomp plugin API.

diff --git a/libgomp/libgomp-plugin.c b/libgomp/libgomp-plugin.c
index 46dd7b0..0f72bb9 100644
--- a/libgomp/libgomp-plugin.c
+++ b/libgomp/libgomp-plugin.c
@@ -31,25 +31,25 @@
 #include "target.h"
 
 void *
-gomp_plugin_malloc (size_t size)
+GOMP_PLUGIN_malloc (size_t size)
 {
   return gomp_malloc (size);
 }
 
 void *
-gomp_plugin_malloc_cleared (size_t size)
+GOMP_PLUGIN_malloc_cleared (size_t size)
 {
   return gomp_malloc_cleared (size);
 }
 
 void *
-gomp_plugin_realloc (void *ptr, size_t size)
+GOMP_PLUGIN_realloc (void *ptr, size_t size)
 {
   return gomp_realloc (ptr, size);
 }
 
 void
-gomp_plugin_error (const char *msg, ...)
+GOMP_PLUGIN_error (const char *msg, ...)
 {
   va_list ap;
   
@@ -59,7 +59,7 @@ gomp_plugin_error (const char *msg, ...)
 }
 
 void
-gomp_plugin_notify (const char *msg, ...)
+GOMP_PLUGIN_notify (const char *msg, ...)
 {
   va_list ap;
   
@@ -69,7 +69,7 @@ gomp_plugin_notify (const char *msg, ...)
 }
 
 void
-gomp_plugin_fatal (const char *msg, ...)
+GOMP_PLUGIN_fatal (const char *msg, ...)
 {
   va_list ap;
   
@@ -82,25 +82,25 @@ gomp_plugin_fatal (const char *msg, ...)
 }
 
 void
-gomp_plugin_mutex_init (gomp_mutex_t *mutex)
+GOMP_PLUGIN_mutex_init (gomp_mutex_t *mutex)
 {
   gomp_mutex_init (mutex);
 }
 
 void
-gomp_plugin_mutex_destroy (gomp_mutex_t *mutex)
+GOMP_PLUGIN_mutex_destroy (gomp_mutex_t *mutex)
 {
   gomp_mutex_destroy (mutex);
 }
 
 void
-gomp_plugin_mutex_lock (gomp_mutex_t *mutex)
+GOMP_PLUGIN_mutex_lock (gomp_mutex_t *mutex)
 {
   gomp_mutex_lock (mutex);
 }
 
 void
-gomp_plugin_mutex_unlock (gomp_mutex_t *mutex)
+GOMP_PLUGIN_mutex_unlock (gomp_mutex_t *mutex)
 {
   gomp_mutex_unlock (mutex);
 }
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 0ecb407..e31573c 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -31,27 +31,27 @@
 
 /* alloc.c */
 
-extern void *gomp_plugin_malloc (size_t) __attribute__((malloc));
-extern void *gomp_plugin_malloc_cleared (size_t) __attribute__((malloc));
-extern void *gomp_plugin_realloc (void *, size_t);
+extern void *GOMP_PLUGIN_malloc (size_t) __attribute__((malloc));
+extern void *GOMP_PLUGIN_malloc_cleared (size_t) __attribute__((malloc));
+extern void *GOMP_PLUGIN_realloc (void *, size_t);
 
 /* error.c */
 
-extern void gomp_plugin_notify(const char *msg, ...);
-extern void gomp_plugin_error (const char *, ...)
+extern void GOMP_PLUGIN_notify(const char *msg, ...);
+extern void GOMP_PLUGIN_error (const char *, ...)
 	__attribute__((format (printf, 1, 2)));
-extern void gomp_plugin_fatal (const char *, ...)
+extern void GOMP_PLUGIN_fatal (const char *, ...)
 	__attribute__((noreturn, format (printf, 1, 2)));
 
 /* mutex.c */
 
-extern void gomp_plugin_mutex_init (gomp_mutex_t *mutex);
-extern void gomp_plugin_mutex_destroy (gomp_mutex_t *mutex);
-extern void gomp_plugin_mutex_lock (gomp_mutex_t *mutex);
-extern void gomp_plugin_mutex_unlock (gomp_mutex_t *mutex);
+extern void GOMP_PLUGIN_mutex_init (gomp_mutex_t *mutex);
+extern void GOMP_PLUGIN_mutex_destroy (gomp_mutex_t *mutex);
+extern void GOMP_PLUGIN_mutex_lock (gomp_mutex_t *mutex);
+extern void GOMP_PLUGIN_mutex_unlock (gomp_mutex_t *mutex);
 
 /* target.c */
 
-extern void gomp_plugin_async_unmap_vars (void *ptr);
+extern void GOMP_PLUGIN_async_unmap_vars (void *ptr);
 
 #endif
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index e1e87d9..538aabb 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -326,15 +326,15 @@ GOACC_2.0 {
 # FIXME: Hygiene/grouping/naming?
 PLUGIN_1.0 {
   global:
-	gomp_plugin_malloc;
-	gomp_plugin_malloc_cleared;
-	gomp_plugin_realloc;
-	gomp_plugin_error;
-	gomp_plugin_notify;
-	gomp_plugin_fatal;
-	gomp_plugin_mutex_init;
-	gomp_plugin_mutex_destroy;
-	gomp_plugin_mutex_lock;
-	gomp_plugin_mutex_unlock;
-	gomp_plugin_async_unmap_vars;
+	GOMP_PLUGIN_malloc;
+	GOMP_PLUGIN_malloc_cleared;
+	GOMP_PLUGIN_realloc;
+	GOMP_PLUGIN_error;
+	GOMP_PLUGIN_notify;
+	GOMP_PLUGIN_fatal;
+	GOMP_PLUGIN_mutex_init;
+	GOMP_PLUGIN_mutex_destroy;
+	GOMP_PLUGIN_mutex_lock;
+	GOMP_PLUGIN_mutex_unlock;
+	GOMP_PLUGIN_async_unmap_vars;
 };
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 7a50d65..a47617a 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ 

Re: -fuse-caller-save - Collect register usage information

2014-10-17 Thread Jeff Law

On 10/17/14 05:00, Richard Biener wrote:


I'm starting to lean towards -foptimize-call-clobbers or similar.


Well, it is really some form of IPA driven register allocation.  Whether
you want to call it -fipa-ra or not is another question - but if we had
such option then enabling it with that option would be fine.

Also users may have no idea what call vs callee clobbers are, but
IPA RA may be a term that is more widely known (or at least google
can come up with something for you).

So - I like -fipa-ra more.

Similarly.  At the heart of the matter, we're utilizing information
about the callee's behaviour to improve the code we generate in the
caller.  That's clearly in IPA's domain IMHO.
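
A small illustration of the kind of code that benefits; the function names are
hypothetical.  The value t is live across the call, so without knowledge of
the callee it must sit in a callee-saved register or be spilled.  If the
compiler knows which call-clobbered registers add_one actually touches, t can
stay in one that is left alone.  Whether a given GCC does this depends on the
target and options.

```c
/* noinline keeps the call visible; it is a GCC extension.  */
static int __attribute__ ((noinline))
add_one (int x)
{
  return x + 1;
}

int
sum_with_call (int a, int b)
{
  int t = a * b;          /* t is live across the call below */
  return add_one (a) + t;
}
```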




Jeff

