[PATCH][2/n] Always 64bit-HWI cleanups
The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case.  Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced.

Both libcpp and vmsdbg.h now use [u]int64_t; switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'.

Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends).  I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan.  Unfortunately any followup will be the whole renaming game at once due to the 'unsigned' issue.  I'll make sure to propose a hwint.h-only patch with a renaming guide for review and expect the actual renaming to take place using a script.

Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok?

After this patch you may use [u]int64_t freely in host sources (lto-plugin already does that - heh).

Thanks,
Richard.

2014-05-23  Richard Biener  <rguent...@suse.de>

	libcpp/
	* configure.ac: Remove long long and __int64 type checks,
	add check for uint64_t and fail if that wasn't found.
	* include/cpplib.h (cpp_num_part): Use uint64_t.
	* config.in: Regenerate.
	* configure: Likewise.

	gcc/
	* configure.ac: Drop __int64 type check.  Insist that we
	found uint64_t and int64_t.
	* hwint.h (HOST_BITS_PER___INT64): Remove.
	(HOST_BITS_PER_WIDE_INT): Define to 64 and remove __int64 case.
	(HOST_WIDE_INT_PRINT_*): Remove 32bit case.
	(HOST_WIDEST_INT*): Define to HOST_WIDE_INT*.
	(HOST_WIDEST_FAST_INT): Remove __int64 case.
	* vmsdbg.h (struct _DST_SRC_COMMAND): Use int64_t for
	dst_q_src_df_rms_cdt.
	* configure: Regenerate.
	* config.in: Likewise.
Index: libcpp/config.in
===================================================================
*** libcpp/config.in	(revision 210847)
--- libcpp/config.in	(working copy)
***************
*** 180,188 ****
  /* Define to 1 if you have the <locale.h> header file.  */
  #undef HAVE_LOCALE_H
  
- /* Define to 1 if the system has the type `long long'. */
- #undef HAVE_LONG_LONG
- 
  /* Define to 1 if you have the <memory.h> header file.  */
  #undef HAVE_MEMORY_H
--- 180,185 ----
***************
*** 231,239 ****
  /* Define to 1 if you have the <unistd.h> header file.  */
  #undef HAVE_UNISTD_H
  
- /* Define to 1 if the system has the type `__int64'. */
- #undef HAVE___INT64
- 
  /* Define as const if the declaration of iconv() needs const. */
  #undef ICONV_CONST
--- 228,233 ----
***************
*** 264,275 ****
  /* The size of `long', as computed by sizeof. */
  #undef SIZEOF_LONG
  
- /* The size of `long long', as computed by sizeof. */
- #undef SIZEOF_LONG_LONG
- 
- /* The size of `__int64', as computed by sizeof. */
- #undef SIZEOF___INT64
- 
  /* If using the C implementation of alloca, define if you know the
     direction of stack growth for your system; otherwise it will be
     automatically deduced at runtime.
--- 258,263 ----
***************
*** 340,345 ****
--- 328,338 ----
  /* Define to 1 if you need to in order for `stat' and other things to work. */
  #undef _POSIX_SOURCE
  
+ /* Define for Solaris 2.5.1 so the uint64_t typedef from <sys/synch.h>,
+    <pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
+    #define below would cause a syntax error. */
+ #undef _UINT64_T
+ 
  /* Define to empty if `const' does not conform to ANSI C. */
  #undef const
***************
*** 361,366 ****
--- 354,363 ----
  /* Define to `int' if <sys/types.h> does not define. */
  #undef ssize_t
  
+ /* Define to the type of an unsigned integer type of width exactly 64 bits if
+    such a type exists and the standard includes do not define it. */
+ #undef uint64_t
+ 
  /* Define to the type of an unsigned integer type wide enough to hold a
     pointer, if such a type exists, and if the system does not define it.
     */
  #undef uintptr_t

Index: libcpp/configure
===================================================================
*** libcpp/configure	(revision 210847)
--- libcpp/configure	(working copy)
*************** $as_echo "$ac_res" >&6; }
*** 1822,1827 ****
--- 1822,1879 ----
  } # ac_fn_c_check_type
  
+ # ac_fn_c_find_uintX_t LINENO BITS VAR
+ # ------------------------------------
+ # Finds an unsigned integer type with width BITS, setting cache variable VAR
+ # accordingly.
+ ac_fn_c_find_uintX_t ()
+ {
+   as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
+   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for uint$2_t" >&5
+ $as_echo_n "checking for uint$2_t... " >&6
Re: [COMMITTED 1/2] Just enumerate all GF_OMP_FOR_KIND_* and GF_OMP_TARGET_KIND_*.
I think it was supposed to note that it uses two bits in the mask.  Did Jakub approve these patches you are committing now?

Thanks,
Richard.

On Fri, May 23, 2014 at 1:32 PM, Thomas Schwinge <tho...@codesourcery.com> wrote:

From: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>

	gcc/
	* gimple.h (enum gf_mask): Rewrite "<< 0" shift expressions used
	for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR,
	GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD,
	GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK,
	GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA,
	GF_OMP_TARGET_KIND_UPDATE.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210854 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog | 7 +++++++
 gcc/gimple.h  | 18 +++++++++---------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index d351c0b..fa2f3c3 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,5 +1,12 @@
 2014-05-23  Thomas Schwinge  <tho...@codesourcery.com>
 
+	* gimple.h (enum gf_mask): Rewrite "<< 0" shift expressions used
+	for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR,
+	GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD,
+	GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK,
+	GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA,
+	GF_OMP_TARGET_KIND_UPDATE.
+
 	* gimplify.c (omp_notice_variable) <case OMP_CLAUSE_DEFAULT_NONE>:
 	Explicitly enumerate the expected region types.
diff --git gcc/gimple.h gcc/gimple.h
index 9df45de..b1970e5 100644
--- gcc/gimple.h
+++ gcc/gimple.h
@@ -91,17 +91,17 @@ enum gf_mask {
     GF_CALL_ALLOCA_FOR_VAR	= 1 << 5,
     GF_CALL_INTERNAL		= 1 << 6,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
-    GF_OMP_FOR_KIND_MASK	= 3 << 0,
-    GF_OMP_FOR_KIND_FOR		= 0 << 0,
-    GF_OMP_FOR_KIND_DISTRIBUTE	= 1 << 0,
-    GF_OMP_FOR_KIND_SIMD	= 2 << 0,
-    GF_OMP_FOR_KIND_CILKSIMD	= 3 << 0,
+    GF_OMP_FOR_KIND_MASK	= (1 << 2) - 1,
+    GF_OMP_FOR_KIND_FOR		= 0,
+    GF_OMP_FOR_KIND_DISTRIBUTE	= 1,
+    GF_OMP_FOR_KIND_SIMD	= 2,
+    GF_OMP_FOR_KIND_CILKSIMD	= 3,
     GF_OMP_FOR_COMBINED		= 1 << 2,
     GF_OMP_FOR_COMBINED_INTO	= 1 << 3,
-    GF_OMP_TARGET_KIND_MASK	= 3 << 0,
-    GF_OMP_TARGET_KIND_REGION	= 0 << 0,
-    GF_OMP_TARGET_KIND_DATA	= 1 << 0,
-    GF_OMP_TARGET_KIND_UPDATE	= 2 << 0,
+    GF_OMP_TARGET_KIND_MASK	= (1 << 2) - 1,
+    GF_OMP_TARGET_KIND_REGION	= 0,
+    GF_OMP_TARGET_KIND_DATA	= 1,
+    GF_OMP_TARGET_KIND_UPDATE	= 2,
 
     /* True on an GIMPLE_OMP_RETURN statement if the return does not require
        a thread synchronization via some sort of barrier.  The exact barrier
-- 
1.9.1
Re: [PATCH] Fix PR rtl-optimization/61278
On Fri, May 23, 2014 at 12:33 PM, Zhenqiang Chen <zhenqiang.c...@linaro.org> wrote:

On 23 May 2014 17:05, Richard Biener <richard.guent...@gmail.com> wrote:

On Fri, May 23, 2014 at 9:23 AM, Zhenqiang Chen <zhenqiang.c...@linaro.org> wrote:

Hi,

The patch fixes PR rtl-optimization/61278.  The root cause of the issue is that df_live does not exist at -O1.

Bootstrap and no make check regression on X86-64.  OK for trunk?

Why do you need to give up?  It seems you can simply avoid marking the block as dirty (though df_get_live_in/out also hands you back DF_LR_IN/OUT if !df_live).  So isn't the df_grow_bb_info the real fix?

The df_get_live_in of the new basic block will be used to analyse later INSNs.  If it is not set or incorrect, it will impact later analysis.  df_grow_bb_info is to make sure the live_in data structure is allocated for the new basic block (although I have not found any case fail without it).  After bitmap_copy (...), we can use it for later INSNs.

Note that df_get_live_in/out are functions tailored to IRA that knows that they handle both df_live and df_lr dependent on optimization level.  Is shrink-wrapping supposed to work with both problems as well?

Yes.  But it seems not perfect at handling the df_lr problem.  When I fixed PR 57637 (https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00897.html), we selected "if DF_LIVE doesn't exist, i.e. at -O1, just give up searching NEXT_BLOCK".

Ok, I see.  Maybe it would be better to completely disable shrink-wrapping when LIVE is not available.

Patch is ok.

Thanks,
Richard.

Thanks!
-Zhenqiang

ChangeLog:
2014-05-23  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

	PR rtl-optimization/61278
	* shrink-wrap.c (move_insn_for_shrink_wrap): Check df_live.

testsuite/ChangeLog:
2014-05-23  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

	* gcc.dg/lto/pr61278_0.c: New test.
	* gcc.dg/lto/pr61278_1.c: New test.
diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index f09cfe7..be17829 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -204,8 +204,15 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
       /* Create a new basic block on the edge.  */
       if (EDGE_COUNT (next_block->preds) == 2)
 	{
+	  /* If DF_LIVE doesn't exist, i.e. at -O1, just give up.  */
+	  if (!df_live)
+	    return false;
+
 	  next_block = split_edge (live_edge);
 
+	  /* We create a new basic block.  Call df_grow_bb_info to make sure
+	     all data structures are allocated.  */
+	  df_grow_bb_info (df_live);
 	  bitmap_copy (df_get_live_in (next_block),
 		       df_get_live_out (bb));
 	  df_set_bb_dirty (next_block);
diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_0.c b/gcc/testsuite/gcc.dg/lto/pr61278_0.c
new file mode 100644
index 000..03a24ae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr61278_0.c
@@ -0,0 +1,30 @@
+/* { dg-lto-do link } */
+/* { dg-lto-options { { -flto -O0 } } } */
+/* { dg-extra-ld-options "-flto -O1" } */
+
+static unsigned int
+fn1 (int p1, int p2)
+{
+  return 0;
+}
+
+char a, b, c;
+
+char
+foo (char *p)
+{
+  int i;
+  for (b = 1; b > 0; b++)
+    {
+      for (i = 0; i < 2; i++)
+	;
+      for (a = 1; a > 0; a++)
+	{
+	  char d[1] = { 0 };
+	  if (*p)
+	    break;
+	  c ^= fn1 (fn1 (fn1 (0, 0), 0), 0);
+	}
+    }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_1.c b/gcc/testsuite/gcc.dg/lto/pr61278_1.c
new file mode 100644
index 000..b02c8ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr61278_1.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+/* { dg-lto-options { { -flto -O1 } } } */
+
+extern char foo (char *);
+
+char d;
+
+int
+main ()
+{
+  foo (&d);
+  return 0;
+}
Re: [PATCH] Disable unroll loop that has header count less than iteration count.
On May 23, 2014 7:26:04 PM CEST, Jan Hubicka <hubi...@ucw.cz> wrote:

On Thu, May 22, 2014 at 11:36 PM, Dehao Chen <de...@google.com> wrote:

If a loop's header count is less than iteration count, the iteration estimation is apparently incorrect for this loop.  Thus disable unrolling of such loops.  Testing ongoing.  OK for trunk if tests pass?

No.  Why don't you instead plug the hole in expected_loop_iterations ()?  That is, why may not loop->header be bogus?  Isn't it maybe the bounding you run into?

/* Returns expected number of LOOP iterations.  The returned value is bounded
   by REG_BR_PROB_BASE.  */
unsigned
expected_loop_iterations (const struct loop *loop)
{
  gcov_type expected = expected_loop_iterations_unbounded (loop);
  return (expected > REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected);
}

I miss a testcase as well.

Richard.

It is an autoFDO thing.  Dehao is basically pushing out changes that should make the compiler more sane about bogus profiles.  At the moment we have tests bb->count != 0 to figure out if the profile is reliable, i.e. we assume that profile feedback is always good and a guessed profile is never good.  Loop optimizers then do not trust the guessed profile to give realistic estimates on the number of iterations and bail out into simple heuristics for estimated profiles that are usually not good at this job - i.e.

  int niters = 0;
  if (desc->const_iter)
    niters = desc->niter;
  else if (loop->header->count)
    niters = expected_loop_iterations (loop);
  ...

This changes with FDO, as the FDO profiles are sometimes bogus (and there is not much to do about it).  I would say that for loop optimization we want a flag saying whether the expected number of iterations is reliable.  We probably want

  if (expected_loop_iterations_reliable_p (loop))
    niters = expected_loop_iterations (loop);

We probably also want to store this information in the loop structure rather than computing it all the time from the profile, since the profile may get inaccurate and we already know how to maintain upper bounds on numbers of iterations, so it should be easy to implement.

This would allow us to preserve info like

  for (i = 0; i < __builtin_expect (n, 10); i++)

that would be a nice feature to have.

Honza

But why not simply never return bogus values from expected_loop_iterations?  We probably want a flag in the .gcda file on whether it was from auto-fdo and only not trust profiles from those.

Richard.

Thanks,
Dehao

gcc/ChangeLog:
2014-05-21  Dehao Chen  <de...@google.com>

	* cfgloop.h (expected_loop_iterations_reliable_p): New func.
	* cfgloopanal.c (expected_loop_iterations_reliable_p): Likewise.
	* loop-unroll.c (decide_unroll_runtime_iterations): Disable unroll
	loop that has unreliable iteration counts.

Index: gcc/cfgloop.h
===================================================================
--- gcc/cfgloop.h	(revision 210717)
+++ gcc/cfgloop.h	(working copy)
@@ -307,8 +307,8 @@ extern bool just_once_each_iteration_p (const stru
 gcov_type expected_loop_iterations_unbounded (const struct loop *);
 extern unsigned expected_loop_iterations (const struct loop *);
 extern rtx doloop_condition_get (rtx);
+extern bool expected_loop_iterations_reliable_p (const struct loop *);
 
-
 /* Loop manipulation.  */
 extern bool can_duplicate_loop_p (const struct loop *loop);

Index: gcc/cfgloopanal.c
===================================================================
--- gcc/cfgloopanal.c	(revision 210717)
+++ gcc/cfgloopanal.c	(working copy)
@@ -285,6 +285,15 @@ expected_loop_iterations (const struct loop *loop)
   return (expected > REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected);
 }
 
+/* Returns true if the loop header's profile count is smaller than expected
+   loop iteration.  */
+
+bool
+expected_loop_iterations_reliable_p (const struct loop *loop)
+{
+  return expected_loop_iterations (loop) > loop->header->count;
+}
+
 /* Returns the maximum level of nesting of subloops of LOOP.  */
 
 unsigned

Index: gcc/loop-unroll.c
===================================================================
--- gcc/loop-unroll.c	(revision 210717)
+++ gcc/loop-unroll.c	(working copy)
@@ -988,6 +988,15 @@ decide_unroll_runtime_iterations (struct loop *loo
       return;
     }
 
+  if (profile_status_for_fn (cfun) == PROFILE_READ
+      && expected_loop_iterations_reliable_p (loop))
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; Not unrolling loop, loop iteration "
+		 "not reliable.");
+      return;
+    }
+
   /* Check whether the loop rolls.  */
   if ((get_estimated_loop_iterations (loop, &iterations)
       || get_max_loop_iterations (loop, &iterations))
Re: Fix broken graph dump
On Sat, May 24, 2014 at 6:44 AM, Xinliang David Li <davi...@google.com> wrote:

The graph dump is broken in trunk (and gcc-4.9) -- the subgraph of the last function gets dumped.  The patch fixes the problem.  Also fixed the function header dump -- the cgraph uid is still used in many places such as the -fdisable|enable-xxx options.

Ok for trunk?

Ok if tested ok.

Thanks,
Richard.

David
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Mon, May 26, 2014 at 10:14 AM, FX fxcoud...@gmail.com wrote: .././../gcc-4.10-20140518/gcc/wide-int.cc:1274:23: error: invalid use of a cast in a inline asm context requiring an l-value: remove the cast or build with -fheinous-gnu-extensions umul_ppmm (val[1], val[0], op1.ulow (), op2.ulow ()); ~~~^ This is PR 61146. You can get around it by adding -fheinous-gnu-extensions to BOOT_CFLAGS. This causes GCC bootstrap to fail on Darwin systems (whose system compiler is clang-based). Since PR 61146 was resolved as INVALID (but I’m not sure it’s the right call, see below), I’ve filed a separate report for the bootstrap issue (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). Regarding PR 61146, I agree with Marc Glisse (comment #3) that the casts in question look weird and should probably be removed, as was done in GMP. This code should be cleaned up, and it will fix bootstrap on clang-based target coincidentally, without adding another kludge. Please post a patch. Thanks, Richard. FX
[PATCH][4/4] Always 64bit-HWI cleanups
This is the last cleanup bit.  Remaining is getting rid of HOST_WIDE_INT in favor of [u]int64_t and adjusting interfaces and interface names.  That's too disruptive at the moment (thus appropriate for a delay until 4.9.1 is out) and I'm not sure if we want to split that work or if such splitting is even possible sensibly.  For example wide-int should get its own abstraction of its storage type (which could be simply [u]int64_t - the actual compute routines can happily split a [u]int64_t into two pieces again if that is desirable on some architectures - I suspect GCC is good enough in optimizing the two element case inline for them).

This moves the relatively new HALF_WIDE_INT stuff to its only user, wide-int.cc, and uses the PRI*64 stuff in the HOST_WIDE_INT_PRINT macros.

Bootstrap / testing in progress on x86_64-unknown-linux-gnu.

Richard.

2014-05-26  Richard Biener  <rguent...@suse.de>

	* hwint.h (*_HALF_WIDE_INT*): Move to ...
	* wide-int.cc (HOST_BITS_PER_HALF_WIDE_INT, HOST_HALF_WIDE_INT):
	... here and remove the rest.
	* hwint.h (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64.

Index: gcc/hwint.h
===================================================================
*** gcc/hwint.h.orig	2014-05-26 10:49:23.009339524 +0200
--- gcc/hwint.h	2014-05-26 11:09:43.597255488 +0200
*************** extern char sizeof_long_long_must_be_8[s
*** 64,103 ****
  # endif
  #endif
  
- /* Print support for half a host wide int.  */
- #define HOST_BITS_PER_HALF_WIDE_INT 32
- #if HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_LONG
- # define HOST_HALF_WIDE_INT long
- # define HOST_HALF_WIDE_INT_PRINT HOST_LONG_FORMAT
- # define HOST_HALF_WIDE_INT_PRINT_C "L"
- # define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
- # define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
- # define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
- # define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
- # define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
- #elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_INT
- # define HOST_HALF_WIDE_INT int
- # define HOST_HALF_WIDE_INT_PRINT ""
- # define HOST_HALF_WIDE_INT_PRINT_C ""
- # define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
- # define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
- # define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
- # define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
- # define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
- #elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_SHORT
- # define HOST_HALF_WIDE_INT short
- # define HOST_HALF_WIDE_INT_PRINT ""
- # define HOST_HALF_WIDE_INT_PRINT_C ""
- # define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
- # define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
- # define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
- # define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
- # define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
- #else
- #error Please add support for HOST_HALF_WIDE_INT
- #endif
- 
  #define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
  #define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)
  #define HOST_WIDE_INT_1U HOST_WIDE_INT_UC (1)
--- 64,69 ----
*************** extern char sizeof_long_long_must_be_8[s
*** 109,156 ****
     typedef before using the __asm_fprintf__ format attribute.  */
  typedef HOST_WIDE_INT __gcc_host_wide_int__;
  
- /* Various printf format strings for HOST_WIDE_INT.  */
- 
- #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
- # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "L"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits.  */
- # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
-     "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x"
- # define HOST_WIDE_INT_PRINT_PADDED_HEX \
-     "%016" HOST_LONG_FORMAT "x"
- #else
- # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "LL"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits.  */
- # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
-     "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
- # define HOST_WIDE_INT_PRINT_PADDED_HEX \
-     "%016" HOST_LONG_LONG_FORMAT "x"
- #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */
- 
- #define HOST_WIDE_INT_PRINT_DEC "%" HOST_WIDE_INT_PRINT "d"
- #define HOST_WIDE_INT_PRINT_DEC_C HOST_WIDE_INT_PRINT_DEC HOST_WIDE_INT_PRINT_C
- #define HOST_WIDE_INT_PRINT_UNSIGNED "%" HOST_WIDE_INT_PRINT "u"
- #define HOST_WIDE_INT_PRINT_HEX "%#" HOST_WIDE_INT_PRINT "x"
- #define HOST_WIDE_INT_PRINT_HEX_PURE "%" HOST_WIDE_INT_PRINT "x"
- 
  /* Provide C99 <inttypes.h> style format definitions for 64bits.  */
  #ifndef HAVE_INTTYPES_H
  #undef PRId64
! #define PRId64 HOST_WIDE_INT_PRINT "d"
  #undef PRIi64
! #define PRIi64 HOST_WIDE_INT_PRINT "i"
  #undef PRIo64
! #define
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Mon, May 26, 2014 at 12:22 PM, Eric Botcazou <ebotca...@adacore.com> wrote:

Hi,

the motivation of this work is to get rid of the range check on Z in:

  function F (X : Integer; Y : Integer) return Positive is
     Z : Integer;
  begin
     if X >= Y then
        return 1;
     end if;
     Z := Y - X;
     return Z;
  end;

An equivalent C testcase is:

  extern void abort (void);

  int foo (int x, int y)
  {
    int z;
    if (x >= y)
      return 1;
    z = y - x;
    if (z <= 0)
      abort ();
    return z;
  }

for which the call to abort is not optimized at -O2.  fold_comparison knows how to optimize X +- C1 CMP C2 to X CMP C2 -+ C1 with undefined overflow, so optimizing X - Y CMP 0 to X CMP Y is a generalization.  Once this is done, forwprop will immediately optimize:

  extern void abort (void);

  int foo (int x, int y)
  {
    int z;
    if (x >= y)
      return 1;
    z = y - x;
    if (z <= 0)
      abort ();
    return 0;
  }

because 'z' has a single use, but the original testcase won't be optimized.  The attached patch implements the missing bits in vrp_evaluate_conditional by manually propagating the operands of a defining PLUS_EXPR or MINUS_EXPR for an SSA name in a condition; another possibility could be DOM instead of VRP.  This comes with 4 testcases: the original Ada testcase, the C equivalent, a testcase for the folding and another one for the -Wstrict-overflow warning.  Tested on x86_64-suse-linux with no regressions.

+	  tree new_const
+	    = fold_build2_loc (loc, reverse_op, TREE_TYPE (arg1), const2, const1);
 
 	  /* If the constant operation overflowed this can be
 	     simplified as a comparison against INT_MAX/INT_MIN.  */
-	  if (TREE_CODE (lhs) == INTEGER_CST
-	      && TREE_OVERFLOW (lhs))
+	  if (TREE_OVERFLOW (new_const))

well, either use int_const_binop above or retain the check (or use TREE_OVERFLOW_P).  Bonus points for using wide-ints here and not relying on TREE_OVERFLOW.

+      /* Transform comparisons of the form X - Y CMP 0 to X CMP Y.  */
+      if (TREE_CODE (arg0) == MINUS_EXPR
+	  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1))

any good reason for using TYPE_OVERFLOW_UNDEFINED on the type of arg1 instead of on the type of the MINUS (yes, they should match, but it really looks odd ... the overflow of the minus has to be undefined)?  Also for EQ_EXPR and NE_EXPR the transform is fine even when !TYPE_OVERFLOW_UNDEFINED (and we seem to perform it already somewhere).  Please look where and try to add the undefined overflow case to it.

As for the VRP pieces I don't understand what is missing here (well, compare_range_with_value and/or compare_values might not handle this case?  Then better fix that to improve symbolic range handling in general?).  Also I have a longstanding patch in my tree that does

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 210931)
+++ gcc/tree-vrp.c	(working copy)
@@ -6919,14 +6919,15 @@ vrp_evaluate_conditional_warnv_with_ops_
   vr0 = (TREE_CODE (op0) == SSA_NAME) ? get_value_range (op0) : NULL;
   vr1 = (TREE_CODE (op1) == SSA_NAME) ? get_value_range (op1) : NULL;
 
+  tree res = NULL_TREE;
   if (vr0 && vr1)
-    return compare_ranges (code, vr0, vr1, strict_overflow_p);
-  else if (vr0 && vr1 == NULL)
-    return compare_range_with_value (code, vr0, op1, strict_overflow_p);
-  else if (vr0 == NULL && vr1)
-    return (compare_range_with_value
+    res = compare_ranges (code, vr0, vr1, strict_overflow_p);
+  if (!res && vr0)
+    res = compare_range_with_value (code, vr0, op1, strict_overflow_p);
+  if (!res && vr1)
+    res = (compare_range_with_value
 	    (swap_tree_comparison (code), vr1, op0, strict_overflow_p));
-  return NULL;
+  return res;
 }
 
 /* Helper function for vrp_evaluate_conditional_warnv.  */

maybe that helps as well.

Thanks,
Richard.

2014-05-26  Eric Botcazou  <ebotca...@adacore.com>

	* fold-const.c (fold_comparison): Clean up and simplify X +- C1 CMP C2
	to X CMP C2 -+ C1 transformation, add X - Y CMP 0 to X CMP Y.
	* tree-vrp.c (vrp_evaluate_conditional_warnv_with_fold): New function.
	(vrp_evaluate_conditional): Call it on SSA names with defining
	PLUS_EXPR or MINUS_EXPR if the evaluation of the condition yielded
	nothing.

2014-05-26  Eric Botcazou  <ebotca...@adacore.com>

	* gcc.dg/fold-compare-8.c: New test.
	* gcc.dg/Wstrict-overflow-25.c: Likewise.
	* gcc.dg/tree-ssa/vrp92.c: Likewise.
	* gnat.dg/opt38.adb: Likewise.

-- 
Eric Botcazou
[PATCH] Fix primitive wi::int_traits
The following patch completes the primitive int_traits with long and long long variants and drops the HWI case.  This fixes Darwin builds where HOST_WIDE_INT is 'long' but int64_t is 'long long'.

Bootstrapped on x86_64-unknown-linux-gnu (and on Darwin by Ian), soon to be committed.

Richard.

2014-05-26  Richard Biener  <rguent...@suse.de>

	* wide-int.h (wi::int_traits <long>, wi::int_traits <unsigned long>,
	wi::int_traits <long long>, wi::int_traits <unsigned long long>):
	Provide specializations.
	(wi::int_traits <HOST_WIDE_INT>,
	wi::int_traits <unsigned HOST_WIDE_INT>): Remove specializations.

Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	(revision 210931)
+++ gcc/wide-int.h	(working copy)
@@ -1446,12 +1446,22 @@ namespace wi
     : public primitive_int_traits <unsigned int, false> {};
 
   template <>
-  struct int_traits <HOST_WIDE_INT>
-    : public primitive_int_traits <HOST_WIDE_INT, true> {};
+  struct int_traits <long>
+    : public primitive_int_traits <long, true> {};
 
   template <>
-  struct int_traits <unsigned HOST_WIDE_INT>
-    : public primitive_int_traits <unsigned HOST_WIDE_INT, false> {};
+  struct int_traits <unsigned long>
+    : public primitive_int_traits <unsigned long, false> {};
+
+#if defined HAVE_LONG_LONG
+  template <>
+  struct int_traits <long long>
+    : public primitive_int_traits <long long, true> {};
+
+  template <>
+  struct int_traits <unsigned long long>
+    : public primitive_int_traits <unsigned long long, false> {};
+#endif
 }
 
 namespace wi
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On May 27, 2014 3:58:06 AM CEST, David Edelsohn dje@gmail.com wrote:

This patch (further) broke bootstrap on AIX.  AIX defaults to 32 bit, although it supports 64 bit HWI.

/nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'int print_statistics(bitmap_descriptor_d**, output_info*)':
/nasfarm/edelsohn/src/src/gcc/bitmap.c:2168:24: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'void dump_bitmap_statistics()':
/nasfarm/edelsohn/src/src/gcc/bitmap.c:2202:15: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/bt-load.c: In function 'void migrate_btr_defs(reg_class, int)':
/nasfarm/edelsohn/src/src/gcc/bt-load.c:1414:34: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_edge_info(FILE*, edge, int, int)':
/nasfarm/edelsohn/src/src/gcc/cfg.c:489:25: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)':
/nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64'

This means AIX has inttypes.h but not the standard-mandated PRI?64 macros.  Does it have stdint.h and therein int64_t?  A possibility is to define the GCC fallbacks when the defines are not defined, as opposed to only when inttypes.h is not available.  Can you investigate the situation on AIX some more?

Thanks,
Richard.

bitmap.i looks like:

int
print_statistics (bitmap_descriptor_d **slot, output_info *i)
{
  bitmap_descriptor d = *slot;
  char s[4096];

  if (d->allocated)
    {
      const char *s1 = d->file;
      const char *s2;
      while ((s2 = strstr (s1, "gcc/")))
	s1 = s2 + 4;
      sprintf (s, "%s:%i (%s)", s1, d->line, d->function);
      s[41] = 0;
      fprintf ((&_iob[2]), "%-41s %9u %15" PRId64 " %15" PRId64 " %15" PRId64
	       " %10" PRId64 " %10" PRId64 "\n",
	       s, d->created, d->allocated, d->peak, d->current,
	       d->nsearches, d->search_iter);
      i->size += d->allocated;
      i->count += d->created;
    }
  return 1;
}

- David
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Mon, May 26, 2014 at 9:01 PM, Patrick Palka <patr...@parcs.ath.cx> wrote:

Hi,

This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions.  These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time.  But in the case of an inline function, the fact that it's a weak symbol is an implementation detail.  And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to achieve $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does.

Bootstrap and regtest against x86_64-unknown-linux-gnu in progress.  Is this patch OK assuming no regressions?

What about always_inline function wrappers as used in FORTIFY_SOURCE?  IIRC the actual referenced function may be declared weak and thus the address can compare to NULL?  Sth like

extern void __foo (void) __attribute__((weak,asm("foo")));
extern inline __attribute__((always_inline,gnu_inline)) void foo (void)
{
  __foo ();
}

int main()
{
  if (foo == 0)
    return 0;
  abort ();
}

The issue is that the inline and alias may be hidden to the user and thus he'll get unwanted and confusing warnings.

Richard.

2014-05-26  Patrick Palka  <patr...@parcs.ath.cx>

	* c-family/c-common.c (decl_with_nonnull_addr_p): Assume inline
	functions are non-null.
	* fold-const.c (tree_single_nonzero_warnv_p): Likewise.
---
 gcc/c-family/c-common.c         | 4 +++-
 gcc/fold-const.c                | 5 ++++-
 gcc/testsuite/g++.dg/inline-1.C | 14 ++++++++++++++
 3 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/inline-1.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 6ec14fc..d4747a0 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4508,7 +4508,9 @@ decl_with_nonnull_addr_p (const_tree expr)
   return (DECL_P (expr)
 	  && (TREE_CODE (expr) == PARM_DECL
 	      || TREE_CODE (expr) == LABEL_DECL
-	      || !DECL_WEAK (expr)));
+	      || !DECL_WEAK (expr)
+	      || (TREE_CODE (expr) == FUNCTION_DECL
+		  && DECL_DECLARED_INLINE_P (expr))));
 }
 
 /* Prepare expr to be an argument of a TRUTH_NOT_EXPR,
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 188b179..2796079 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -16052,7 +16052,10 @@ tree_single_nonzero_warnv_p (tree t, bool *strict_overflow_p)
 	  || (DECL_CONTEXT (base)
 	      && TREE_CODE (DECL_CONTEXT (base)) == FUNCTION_DECL
 	      && auto_var_in_fn_p (base, DECL_CONTEXT (base))))
-	return !VAR_OR_FUNCTION_DECL_P (base) || !DECL_WEAK (base);
+	return !VAR_OR_FUNCTION_DECL_P (base)
+	       || !DECL_WEAK (base)
+	       || (TREE_CODE (base) == FUNCTION_DECL
+		   && DECL_DECLARED_INLINE_P (base));
 
       /* Constants are never weak.  */
       if (CONSTANT_CLASS_P (base))
diff --git a/gcc/testsuite/g++.dg/inline-1.C b/gcc/testsuite/g++.dg/inline-1.C
new file mode 100644
index 000..65beff1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/inline-1.C
@@ -0,0 +1,14 @@
+// { dg-options "-Waddress" }
+
+inline void
+foo (void)
+{
+}
+
+int
+bar (void)
+{
+  return foo == 0; // { dg-warning "never be NULL" }
+}
+
+// { dg-final { scan-assembler-not "foo" } }
Re: [PATCH] Fix an AARCH64/ARM performance regression
On Tue, May 27, 2014 at 10:10 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, I have been informed, that the following check-in may cause a slight performance regression on aarch64 and arm on trunk and gcc-4_9-branch: See https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205899 This can be seen with the following test example:

test.c:

typedef struct { int *x; int *y; int *z; int *m; int *n; int *o; } A;
typedef struct { int x; int y; int z; int m; int n; int o; } B;

A a[6];
B *b;

void test(int i)
{
  A *a1 = &a[i];
  B *b1 = &b[i];
  a1->x = &b1->x;
  a1->y = &b1->y;
  a1->z = &b1->z;
  a1->m = &b1->m;
  a1->n = &b1->n;
  a1->o = &b1->o;
}

when compiled with aarch64-elf-gcc -O3 -S we get the following assembler code:

test:
	adrp	x1, b
	sxtw	x3, w0
	mov	w5, 24
	adrp	x2, a
	add	x2, x2, :lo12:a
	ldr	x4, [x1, #:lo12:b]
	lsl	x1, x3, 2
	sub	x1, x1, x3
	lsl	x1, x1, 4
	smaddl	x0, w0, w5, x4
	add	x3, x2, x1
	add	x7, x0, 8
	add	x6, x0, 16
	add	x8, x0, 4
	add	x5, x0, 12
	add	x4, x0, 20
	str	x0, [x2, x1]
	mov	x0, x3
	mov	x1, x3
	str	x8, [x3, 8]
	str	x7, [x0, 16]!
	str	x6, [x1, 32]!
	str	x5, [x0, 8]
	str	x4, [x1, 8]
	ret

Note the two mov instructions, and that two extra registers are used to store the values. The mov instructions have not been there before the patch.
After some investigation I found out how this happened: The difference is first visible with -fdump-rtl-expand-slim: 1: NOTE_INSN_DELETED 4: NOTE_INSN_BASIC_BLOCK 2 2: r83:SI=x0:SI 3: NOTE_INSN_FUNCTION_BEG 6: r74:DI=sign_extend(r83:SI) 7: r85:DI=high(`b') 8: r84:DI=r85:DI+low(`b') REG_EQUAL `b' 9: r87:SI=0x18 10: r86:DI=sign_extend(r83:SI)*sign_extend(r87:SI) 11: r88:DI=[r84:DI] 12: r76:DI=r88:DI+r86:DI 13: r90:DI=high(`a') 14: r89:DI=r90:DI+low(`a') REG_EQUAL `a' 15: r91:DI=sign_extend(r83:SI) 16: r92:DI=r91:DI 17: r93:DI=r92:DI0x2 18: r94:DI=r93:DI-r91:DI REG_EQUAL r91:DI*0x3 19: r95:DI=r94:DI0x4 20: r94:DI=r95:DI REG_EQUAL r91:DI*0x30 21: r96:DI=r89:DI+r94:DI 22: [r96:DI]=r76:DI 23: r98:DI=high(`a') 24: r97:DI=r98:DI+low(`a') REG_EQUAL `a' 25: r99:DI=sign_extend(r83:SI) 26: r100:DI=r99:DI 27: r101:DI=r100:DI0x2 28: r102:DI=r101:DI-r99:DI REG_EQUAL r99:DI*0x3 29: r103:DI=r102:DI0x4 30: r102:DI=r103:DI REG_EQUAL r99:DI*0x30 31: r104:DI=r97:DI+r102:DI 32: r105:DI=r76:DI+0x4 33: [r104:DI+0x8]=r105:DI 34: r107:DI=high(`a') 35: r106:DI=r107:DI+low(`a') REG_EQUAL `a' 36: r108:DI=sign_extend(r83:SI) 37: r109:DI=r108:DI 38: r110:DI=r109:DI0x2 39: r111:DI=r110:DI-r108:DI REG_EQUAL r108:DI*0x3 40: r112:DI=r111:DI0x4 41: r111:DI=r112:DI REG_EQUAL r108:DI*0x30 42: r113:DI=r106:DI+r111:DI 43: r114:DI=r113:DI+0x10 44: r115:DI=r76:DI+0x8 45: [r114:DI]=r115:DI 46: r117:DI=high(`a') 47: r116:DI=r117:DI+low(`a') REG_EQUAL `a' 48: r118:DI=sign_extend(r83:SI) 49: r119:DI=r118:DI 50: r120:DI=r119:DI0x2 51: r121:DI=r120:DI-r118:DI REG_EQUAL r118:DI*0x3 52: r122:DI=r121:DI0x4 53: r121:DI=r122:DI REG_EQUAL r118:DI*0x30 54: r123:DI=r116:DI+r121:DI 55: r124:DI=r123:DI+0x10 56: r125:DI=r76:DI+0xc 57: [r124:DI+0x8]=r125:DI 58: r127:DI=high(`a') 59: r126:DI=r127:DI+low(`a') REG_EQUAL `a' 60: r128:DI=sign_extend(r83:SI) 61: r129:DI=r128:DI 62: r130:DI=r129:DI0x2 63: r131:DI=r130:DI-r128:DI REG_EQUAL r128:DI*0x3 64: r132:DI=r131:DI0x4 65: r131:DI=r132:DI REG_EQUAL r128:DI*0x30 66: 
r133:DI=r126:DI+r131:DI 67: r134:DI=r133:DI+0x20 68: r135:DI=r76:DI+0x10 69: [r134:DI]=r135:DI 70: r137:DI=high(`a') 71: r136:DI=r137:DI+low(`a') REG_EQUAL `a' 72: r138:DI=sign_extend(r83:SI) 73: r139:DI=r138:DI 74: r140:DI=r139:DI0x2 75: r141:DI=r140:DI-r138:DI REG_EQUAL r138:DI*0x3 76: r142:DI=r141:DI0x4 77: r141:DI=r142:DI REG_EQUAL r138:DI*0x30 78: r143:DI=r136:DI+r141:DI 79: r144:DI=r143:DI+0x20 80: r145:DI=r76:DI+0x14 81: [r144:DI+0x8]=r145:DI Note the offset +0x8 on the instructions 33, 57, 81 but not on 22, 45, 69. This artefact was not there before the patch. Well, this causes little ripples in the later rtl-passes. There is one pass that does a pretty good job at compensating
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Tue, May 27, 2014 at 11:25 AM, Eric Botcazou ebotca...@adacore.com wrote:

+ tree new_const
+ = fold_build2_loc (loc, reverse_op, TREE_TYPE (arg1), const2, const1);
 /* If the constant operation overflowed this can be simplified as a comparison against INT_MAX/INT_MIN. */
- if (TREE_CODE (lhs) == INTEGER_CST
- && TREE_OVERFLOW (lhs))
+ if (TREE_OVERFLOW (new_const))

well, either use int_const_binop above or retain the check (or use TREE_OVERFLOW_P). Bonus points for using wide-ints here and not relying on TREE_OVERFLOW.

The check is useless (you get either NULL_TREE or INTEGER_CST here) but I'll use int_const_binop.

+ /* Transform comparisons of the form X - Y CMP 0 to X CMP Y. */
+ if (TREE_CODE (arg0) == MINUS_EXPR
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1))

any good reason for using TYPE_OVERFLOW_UNDEFINED on the type of arg1 instead of on the type of the MINUS (yes, they should match, but it really looks odd ... the overflow of the minus has to be undefined).

For the sake of consistency with the X +- C1 CMP C2 case just above, but I can change both.

Also for EQ_EXPR and NE_EXPR the transform is fine even when !TYPE_OVERFLOW_UNDEFINED (and we seem to perform it already somewhere). Please look where and try to add the undefined overflow case to it.

Yes, but it's the same for the X +- C1 CMP C2 case, i.e. there are specific cases for X +- C1 EQ/NE C2 and X - Y EQ/NE 0 in fold_binary, so I'm not sure what you're asking.

I'm asking to merge them (move them to fold_comparison).

As for the VRP pieces I don't understand what is missing here (well, compare_range_with_value and/or compare_values might not handle this case? then better fix that to improve symbolic range handling in general?).
Also I have a longstanding patch in my tree that does

Yes, there is an explicit non-handling of symbolic ranges for PLUS_EXPR and MINUS_EXPR in VRP (extract_range_from_binary_expr_1) and the patch works around it by propagating the code instead of the ranges, which is far easier and sufficient here. If you think that the way to go is to handle symbolic ranges for PLUS_EXPR and MINUS_EXPR instead, fine with me, I can try.

Yeah, it would be nice to see some support. The most interesting cases will be symbolic-singleton +- CST where the offset shrinks a constant offset in a symbolic A +- CST (thus we don't get into any overflow issues). Thus handling [a + 1, a + 1] - [1, 1] -> [a, a] for example. We get the offsetted singleton symbolic ranges from conditional asserts a lot.

Thanks, Richard.

-- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Tue, May 27, 2014 at 11:59 AM, Eric Botcazou ebotca...@adacore.com wrote:

I'm asking to merge them (move them to fold_comparison).

OK, I'll repost a first patch doing only the fold-const.c massaging.

Yeah, it would be nice to see some support. The most interesting cases will be symbolic-singleton +- CST where the offset shrinks a constant offset in a symbolic A +- CST (thus we don't get into any overflow issues). Thus handling [a + 1, a + 1] - [1, 1] -> [a, a] for example. We get the offsetted singleton symbolic ranges from conditional asserts a lot.

For the case at hand, you have [x + 1, INF] for y and you want to evaluate the range of y - x, so you really don't want to use the range of x...

Ok. That's similar to the issue I try to address with the VRP snippet I posted yesterday. I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0's value-range refers to op1 or vice versa, and eventually only use op1 symbolically then.

Richard.

-- Eric Botcazou
[PATCH] Try harder for conditional evaluation in VRP
Especially for ops with symbolic ranges it may be preferable to compare one range with an op such as in [x + 1, x + 1] cmp x instead of expanding the range of x on the rhs to sth unrelated. So, try harder. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-05-27 Richard Biener rguent...@suse.de

* tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed.

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 210931)
+++ gcc/tree-vrp.c	(working copy)
@@ -6919,14 +6919,15 @@ vrp_evaluate_conditional_warnv_with_ops_
   vr0 = (TREE_CODE (op0) == SSA_NAME) ? get_value_range (op0) : NULL;
   vr1 = (TREE_CODE (op1) == SSA_NAME) ? get_value_range (op1) : NULL;
 
+  tree res = NULL_TREE;
   if (vr0 && vr1)
-    return compare_ranges (code, vr0, vr1, strict_overflow_p);
-  else if (vr0 && vr1 == NULL)
-    return compare_range_with_value (code, vr0, op1, strict_overflow_p);
-  else if (vr0 == NULL && vr1)
-    return (compare_range_with_value
+    res = compare_ranges (code, vr0, vr1, strict_overflow_p);
+  if (!res && vr0)
+    res = compare_range_with_value (code, vr0, op1, strict_overflow_p);
+  if (!res && vr1)
+    res = (compare_range_with_value
 	    (swap_tree_comparison (code), vr1, op0, strict_overflow_p));
-  return NULL;
+  return res;
 }
 
 /* Helper function for vrp_evaluate_conditional_warnv. */
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On Tue, 27 May 2014, Richard Biener wrote: On May 27, 2014 3:58:06 AM CEST, David Edelsohn dje@gmail.com wrote: This patch (further) broke bootstrap on AIX. AIX defaults to 32 bit, although it supports 64 bit HWI.

/nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'int print_statistics(bitmap_descriptor_d**, output_info*)':
/nasfarm/edelsohn/src/src/gcc/bitmap.c:2168:24: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'void dump_bitmap_statistics()':
/nasfarm/edelsohn/src/src/gcc/bitmap.c:2202:15: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/bt-load.c: In function 'void migrate_btr_defs(reg_class, int)':
/nasfarm/edelsohn/src/src/gcc/bt-load.c:1414:34: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_edge_info(FILE*, edge, int, int)':
/nasfarm/edelsohn/src/src/gcc/cfg.c:489:25: error: expected ')' before 'PRId64'
/nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)':
/nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64'

This means AIX has inttypes.h but not the standard-mandated PRI?64 macros. Does it have stdint.h and therein int64_t? A possibility is to define the GCC fallbacks when the defines are not defined as opposed to only when inttypes.h is not available. Can you investigate some more the situation on AIX? Proposed patch below.

Richard.

Index: gcc/hwint.h
===================================================================
--- gcc/hwint.h	(revision 210932)
+++ gcc/hwint.h	(working copy)
@@ -135,21 +135,30 @@ typedef HOST_WIDE_INT __gcc_host_wide_in
 #define HOST_WIDE_INT_PRINT_HEX "%#" HOST_WIDE_INT_PRINT "x"
 #define HOST_WIDE_INT_PRINT_HEX_PURE "%" HOST_WIDE_INT_PRINT "x"
 
-/* Provide C99 inttypes.h style format definitions for 64bits.  */
-#ifndef HAVE_INTTYPES_H
-#undef PRId64
+/* Provide C99 inttypes.h style format definitions for 64bits.  As
+   64bit types are only conditionally supported also provide fallback
+   definitions for the 64bit variants even if inttypes.h is available.  */
+#ifndef PRId64
 #define PRId64 HOST_WIDE_INT_PRINT "d"
-#undef PRIi64
+#endif
+#ifndef PRIi64
 #define PRIi64 HOST_WIDE_INT_PRINT "i"
-#undef PRIo64
+#endif
+#ifndef PRIo64
 #define PRIo64 HOST_WIDE_INT_PRINT "o"
-#undef PRIu64
+#endif
+#ifndef PRIu64
 #define PRIu64 HOST_WIDE_INT_PRINT "u"
-#undef PRIx64
+#endif
+#ifndef PRIx64
 #define PRIx64 HOST_WIDE_INT_PRINT "x"
-#undef PRIX64
+#endif
+#ifndef PRIX64
 #define PRIX64 HOST_WIDE_INT_PRINT "X"
 #endif
+#ifndef HAVE_INTTYPES_H
+/* 32bit variants go here once we start using them.  */
+#endif
 
 /* Define HOST_WIDEST_FAST_INT to the widest integer type supported
    efficiently in hardware.  (That is, the widest integer type that fits
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On Tue, 27 May 2014, Jakub Jelinek wrote: On Tue, May 27, 2014 at 08:26:35AM +0200, Richard Biener wrote: /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)': /nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64' This means aix has inttypes.h but not the standard mandated PRI?64 macros. Does it have stdint.h and therein int64_t? A possibility is to define the GCC fallbacks when the defines are not defined as opposed to only when inttypes.h is not available. BTW, we should be prepared for C++11 udlit and use spaces around the PRI* macros, in case some compiler treats those as udlits unconditionally. Though that's not likely as it would break existing and valid code. Richard.
[PATCH] Fix HWI AIX breakage
This fixes the AIX breakage by defining __STDC_FORMAT_MACROS earlier, before stdio.h on AIX gets to include inttypes.h. Committed as obvious.

Richard.

2014-05-27 Richard Biener rguent...@suse.de

* system.h (__STDC_FORMAT_MACROS): Define as very first thing.

Index: gcc/system.h
===================================================================
--- gcc/system.h	(revision 210972)
+++ gcc/system.h	(working copy)
@@ -22,6 +22,12 @@ along with GCC; see the file COPYING3.
 #ifndef GCC_SYSTEM_H
 #define GCC_SYSTEM_H
 
+/* Define this so that inttypes.h defines the PRI?64 macros even
+   when compiling with a C++ compiler.  Define it here so in the
+   event inttypes.h gets pulled in by another header it is already
+   defined.  */
+#define __STDC_FORMAT_MACROS
+
 /* We must include stdarg.h before stdio.h.  */
 #include <stdarg.h>
 
@@ -491,7 +497,6 @@ extern void *realloc (void *, size_t);
 #endif
 
 #ifdef HAVE_INTTYPES_H
-#define __STDC_FORMAT_MACROS
 #include <inttypes.h>
 #endif
Re: [PATCH] Try harder for conditional evaluation in VRP
On Tue, 27 May 2014, Steven Bosscher wrote: On Tue, May 27, 2014 at 12:27 PM, Richard Biener wrote: * tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed. No test case? Sorry ;) Happens to patches I uncover in my dev tree. I'll try to come up with sth (I remember coding it when working on some PR ... but I've lost track of which one). Richard.
Re: [PATCH] Try harder for conditional evaluation in VRP
On Tue, 27 May 2014, Richard Biener wrote: On Tue, 27 May 2014, Steven Bosscher wrote: On Tue, May 27, 2014 at 12:27 PM, Richard Biener wrote: * tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed.

No test case? Sorry ;) Happens to patches I uncover in my dev tree. I'll try to come up with sth (I remember coding it when working on some PR ... but I've lost track of which one).

Here is one (yeah, a bit artificial ...). Committed.

Richard.

2014-05-27 Richard Biener rguent...@suse.de

* gcc.dg/tree-ssa/vrp92.c: New testcase.

Index: gcc/testsuite/gcc.dg/tree-ssa/vrp92.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp92.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp92.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+
+void bar (void);
+int foo (int i, int j)
+{
+  int res = 1;
+  if (i < j)
+    {
+      /* We should be able to simplify the following conditional
+	 during propagation.  */
+      if (i > j)
+	res = 0;
+    }
+  /* And compute res as having a value-range of [1,1].  */
+  if (res)
+    return i;
+  return j;
+}
+
+/* { dg-final { scan-tree-dump "res_.: \\\[1, 1\\\]" "vrp1" } } */
+/* { dg-final { scan-tree-dump-not "Threaded" "vrp1" } } */
+/* { dg-final { cleanup-tree-dump "vrp1" } } */
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Tue, May 27, 2014 at 4:06 PM, Patrick Palka patr...@parcs.ath.cx wrote: On Tue, May 27, 2014 at 8:32 AM, Patrick Palka patr...@parcs.ath.cx wrote: On Tue, May 27, 2014 at 3:33 AM, Richard Biener richard.guent...@gmail.com wrote: On Mon, May 26, 2014 at 9:01 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi, This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions. These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time. But in the case of an inline function, the fact that it's a weak symbol is an implementation detail. And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to acheive $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does. Bootstrap and regtest against x86_64-unknown-linux-gnu in progress. Is this patch OK assuming no regressions? What about always_inline function wrappers as used in FORTIFY_SOURCE? IIRC the actual referenced function may be declared weak and thus the address can compare to NULL? Sth like extern void __foo (void) __attribute__((weak,asm(foo))); extern inline __attribute__((always_inline,gnu_inline)) void foo (void) { __foo (); } int main() { if (foo == 0) return 0; abort (); } The issue is that the inline and alias may be hidden to the user and thus he'll get unwanted and confusing warnings. Richard. So as it stands the foo == 0 check in the above example gets folded with or without my patch. The issue here seems to be related to the use of gnu_inline. The address of an inline function marked gnu_inline shouldn't be considered non-NULL because a standalone function does not actually get emitted. 
Of course, one could inspect aliases but in general it is not correct to assume such functions are non-NULL. Does this sound right? This issue has slight overlap with the issue that my patch tries to fix. I could try to handle it as well.

Oh, disregard the above. I think I see what you mean now, Richard. Because __foo is an alias for foo, and __foo is weak, is it safe to assume that foo is non-NULL? That is a tricky question...

No (I wasn't making a folding validity remark). I want to make sure we won't see diagnostics on code like

#include <string.h>
int main()
{
  if (malloc != 0)
    ...
}

just because it is built with -D_FORTIFY_SOURCE. Whether we may fold that check at all is another question, of course (and again that should _not_ differ between -D_FORTIFY_SOURCE or -U_FORTIFY_SOURCE, but it may well also be a glibc issue). And yes, aliases are tricky beasts ...

Richard.
Re: Mark more constants readonly
On Tue, May 27, 2014 at 3:13 PM, Bernd Schmidt ber...@codesourcery.com wrote: I noticed that string constants built by the Fortran frontend don't set TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems to set it for COMPLEX_CST). That was confusing the ptx backend I'm working on. The -fwritable-strings option for C was removed a while ago, so I tested the following patch to just set the flag unconditionally, and passed bootstrap and test on x86_64-linux. Ok? Hmm? Not so obvious. TREE_READONLY has no purpose on tcc_constant nodes if I read documentation correctly (tree.h). In fact base.readonly_flag is unused for tcc_constant (and would be redundant with TREE_CONSTANT). Richard. Bernd
Re: Don't copy constants in build_constant_desc
On Tue, May 27, 2014 at 3:26 PM, Bernd Schmidt ber...@codesourcery.com wrote: When putting a constant into the constant pool, we make a copy of the tree in build_constant_desc. This caused me some problems with the ptx backend, which has a pass to modify the address spaces of all variables and constants to match the requirements of the backend. It would be nice for it to be able to find them all by walking the varpool and the gimple of all functions, and I ran into some testcases where these copies can slip through. After a while it occurred to me that the copy is probably a relic of old obstack times and not necessary anymore. And indeed, in 2.7.2.3 the only call to it looked like this:

  push_obstacks_nochange ();
  suspend_momentary ();
  p->exp = copy_constant (exp);
  pop_obstacks ();

It seems unlikely that anything in the compiler would want to modify a constant after it has been put into the constant pool, and even if so, it should have the responsibility of making the copy itself. So, as an experiment I've bootstrapped and tested the following patch (x86_64-linux), and that seemed to work. Ok to apply along with removal of the now unused copy_constant?

Ok if nobody raises issues within 24h. Thanks, Richard.

Bernd
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On May 27, 2014 6:12:58 PM CEST, Eric Botcazou ebotca...@adacore.com wrote: I'm asking to merge them (move them to fold_comparison). Done (in the end the patch removes more lines than it adds :-). That's what I like! Tested on x86_64-suse-linux with no regressions. OK. Thanks, Richard. 2014-05-27 Eric Botcazou ebotca...@adacore.com * fold-const.c (fold_comparison): Clean up and extend X +- C1 CMP C2 to X CMP C2 -+ C1 transformation to EQ_EXPR/NE_EXPR. Add X - Y CMP 0 to X CMP Y transformation. (fold_binary_loc) EQ_EXPR/NE_EXPR: Remove same transformations. 2014-05-27 Eric Botcazou ebotca...@adacore.com * gcc.dg/fold-compare-8.c: New test. * gcc.dg/Wstrict-overflow-25.c: Likewise.
Re: Fix old bug in div_and_round_double
On Tue, May 27, 2014 at 10:29 PM, Eric Botcazou ebotca...@adacore.com wrote: This is an old bug in div_and_round_double for ROUND_DIV_EXPR: when the code detects that it needs to adjust the quotient, it needs to decide whether it increases or decreases it by 1. This only depends on the expected sign of the quotient, but the test reads:

  if (*hquo < 0)

So if the quotient is 0, the code will always bump it, thus yielding 1, even if the expected quotient is negative. This causes the attached Ada testcase to fail at -O on all active branches... except for mainline, because of the correct implementation of the same test in wi::div_round:

  if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn))
    return quotient - 1;
  else
    return quotient + 1;

It turns out that div_and_round_double has a predicate quo_neg that implements the same test as the wide-int one and ought to be used here. Tested on x86_64-suse-linux (of course this probably doesn't mean anything on the mainline, but it was also tested with 4.7 and 4.9 compilers) and applied on the mainline as obvious.

I suppose you also install on branches? Thanks, Richard.

2014-05-27 Eric Botcazou ebotca...@adacore.com

* double-int.c (div_and_round_double) <ROUND_DIV_EXPR>: Use the proper predicate to detect a negative quotient.

2014-05-27 Eric Botcazou ebotca...@adacore.com

* gnat.dg/overflow_fixed.adb: New test.

-- Eric Botcazou
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: On May 26, 2014, at 2:22 AM, FX fxcoud...@gmail.com wrote: This causes GCC bootstrap to fail on Darwin systems (whose system compiler is clang-based). Since PR 61146 was resolved as INVALID (but I'm not sure it's the right call, see below), I've filed a separate report for the bootstrap issue (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). Since my PR has been closed twice by Andrew Pinski ("it's clang's fault, bouh ouh"), I'd ask the maintainers to step in. Can we please provide a GCC that works for the default darwin setup? Or at least drop darwin as secondary target and document the failure?

The best course of action, post a patch, have it reviewed and put in. Current action, a patch has been posted, the review is outstanding, I'd like to see it put in; though, I am curious why the casts were there in the first place.

Note, haven't added them there, but from what I can test, the casts there can serve as a compile time check that the right type is used, e.g.

unsigned long i;
void foo (void)
{
  asm volatile ("# %0 %1" : "=r" ((unsigned long long) i) : "0" ((unsigned long long) 0));
}

Ah, interesting. A not-so-heinous extension then.

errors out on x86_64 -m32, but compiles fine with -m64, because in the latter case the type has the correct size, while in the former case it doesn't. So, perhaps instead of removing the casts we should replace them with some kind of static assertions (whether

extern char foobar[sizeof (arg) == sizeof (UDItype) && __builtin_classify_type (arg) == 1 ? 1 : -1];

or __builtin_types_compatible_p, or C++ templates for C++, ...

Yeah, a portable (C and C++) static assert would be nice. And also pushing this to gmp then. In the meantime I see nothing wrong in merging from GMP.

Richard.

Jakub
[PATCH] Fix PR61335
This fixes PR61335 - symbolic range propagation in VRP is so weak that we didn't notice this very old serious bug ... compare_values can return don't know which is -2, but PHI node visiting handles it the same as -1 (less than...). Oops. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-05-28 Richard Biener rguent...@suse.de PR tree-optimization/61335 * tree-vrp.c (vrp_visit_phi_node): If the compare of old and new range fails, drop to varying. * gfortran.dg/pr61335.f90: New testcase. Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c (revision 210973) --- gcc/tree-vrp.c (working copy) *** vrp_visit_phi_node (gimple phi) *** 8323,8330 --- 8323,8336 edges == old_edges lhs_vr-type != VR_UNDEFINED) { + /* Compare old and new ranges, fall back to varying if the + values are not comparable. */ int cmp_min = compare_values (lhs_vr-min, vr_result.min); + if (cmp_min == -2) + goto varying; int cmp_max = compare_values (lhs_vr-max, vr_result.max); + if (cmp_max == -2) + goto varying; /* For non VR_RANGE or for pointers fall back to varying if the range changed. */ Index: gcc/testsuite/gfortran.dg/pr61335.f90 === *** gcc/testsuite/gfortran.dg/pr61335.f90 (revision 0) --- gcc/testsuite/gfortran.dg/pr61335.f90 (working copy) *** *** 0 --- 1,117 + ! { dg-do run } + ! { dg-additional-options -fbounds-check } + MODULE cp_units + + INTEGER, PARAMETER :: default_string_length=80, dp=KIND(0.0D0) + + LOGICAL, PRIVATE, PARAMETER :: debug_this_module=.TRUE. + CHARACTER(len=*), PARAMETER, PRIVATE :: moduleN = 'cp_units' + INTEGER, SAVE, PRIVATE :: last_unit_id=0, last_unit_set_id=0 + + INTEGER, PARAMETER, PUBLIC :: cp_unit_max_kinds=8, cp_unit_basic_desc_length=15, +cp_unit_desc_length=cp_unit_max_kinds*cp_unit_basic_desc_length, cp_ukind_max=9 + + CONTAINS + + FUNCTION cp_to_string(i) RESULT(res) + INTEGER, INTENT(in) :: i + CHARACTER(len=6) :: res + + INTEGER :: iostat + REAL(KIND=dp):: tmp_r + + IF (i99 .OR. 
i-9) THEN +tmp_r=i +WRITE (res,fmt='(es6.1)',iostat=iostat) tmp_r + ELSE +WRITE (res,fmt='(i6)',iostat=iostat) i + END IF + IF (iostat/=0) THEN +STOP 7 + END IF + END FUNCTION cp_to_string + + SUBROUTINE cp_unit_create(string) + CHARACTER(len=*), INTENT(in) :: string + + CHARACTER(len=*), PARAMETER :: routineN = 'cp_unit_create', + routineP = moduleN//':'//routineN + + CHARACTER(default_string_length) :: desc + CHARACTER(LEN=40):: formatstr + INTEGER :: i_high, i_low, i_unit, + len_string, next_power + INTEGER, DIMENSION(cp_unit_max_kinds):: kind_id, power, unit_id + LOGICAL :: failure + + failure=.FALSE. + unit_id=cp_units_none + kind_id=cp_ukind_none + power=0 + i_low=1 + i_high=1 + len_string=LEN(string) + i_unit=0 + next_power=1 + DO WHILE(i_lowlen_string) +IF (string(i_low:i_low)/=' ') EXIT +i_low=i_low+1 + END DO + i_high=i_low + DO WHILE(i_high=len_string) +IF ( string(i_high:i_high)==' '.OR.string(i_high:i_high)=='^'.OR. + string(i_high:i_high)=='*'.OR.string(i_high:i_high)=='/') EXIT +i_high=i_high+1 + END DO + DO WHILE(.NOT.failure) +IF (i_high=i_low.OR.i_lowlen_string) EXIT +i_unit=i_unit+1 +IF (i_unitcp_unit_max_kinds) THEN + EXIT +END IF +power(i_unit)=next_power +! parse op +i_low=i_high +DO WHILE(i_low=len_string) + IF (string(i_low:i_low)/=' ') EXIT + i_low=i_low+1 +END DO +i_high=i_low +DO WHILE(i_high=len_string) + IF ( string(i_high:i_high)==' '.OR.string(i_high:i_high)=='^'.OR. +string(i_high:i_high)=='*'.OR.string(i_high:i_high)=='/') EXIT + i_high=i_high+1 +END DO +IF (i_highi_low.OR.i_lowlen_string) EXIT + +IF (i_high=len_string) THEN + IF (string(i_low:i_high)=='^') THEN + i_low=i_high+1 + DO WHILE(i_low=len_string) + IF (string(i_low:i_low)/=' ') EXIT + i_low=i_low+1 + END DO + i_high=i_low + DO WHILE(i_high=len_string) + SELECT CASE(string(i_high:i_high)) + CASE('+','-','0','1','2','3','4','5','6','7
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 10:24 AM, FX fxcoud...@gmail.com wrote: Yeah, a portable (C and C++) static assert would be nice. And also pushing this to gmp then. In the meantime I see nothing wrong in merging from GMP.

One question, one comment: 1. can I count your "I see nothing wrong" as an approval, as in "global reviewers can approve changes to any part of the compiler or associated libraries"?

Well, kind of. But Jakub is as well, so I don't want to override him. So please wait for an ack from Jakub. I agree with him that the casts served a purpose and that, if removed, they need to be replaced with an appropriate assertion measure. Given that inline asm is a GCC extension, calling those casts another extension from the LLVM side is really odd. In fact I think the casts are a very good way of doing this kind of assertions. Are they documented in that regard? If so I'd say it's really LLVM that should be fixed and the workaround on the GCC side is to pass that -fheinous-gnu-extensions flag.

2. I think your quotes around "merging" mean you're not actually thinking of a merge, but for clarification's sake: GMP's longlong.h has apparently a long history of its own, and has many differences with GCC's version. The closest thing to an "upstream" for us would probably be glibc (see the diff attached), from which we last merged on 2014-04-22.

I see. I suppose the gcc side includes the proposed patch and glibc still has those casts, right? In that case there is nothing to merge from glibc (but to glibc eventually).

Richard.

Thanks, FX
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 10:24 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: On May 26, 2014, at 2:22 AM, FX fxcoud...@gmail.com wrote: This causes GCC bootstrap to fail on Darwin systems (whose system compiler is clang-based). Since PR 61146 was resolved as INVALID (but I'm not sure it's the right call, see below), I've filed a separate report for the bootstrap issue (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). Since my PR has been closed twice by Andrew Pinski ("it's clang's fault, bouh ouh"), I'd ask the maintainers to step in. Can we please provide a GCC that works for the default darwin setup? Or at least drop darwin as secondary target and document the failure?

The best course of action, post a patch, have it reviewed and put in. Current action, a patch has been posted, the review is outstanding, I'd like to see it put in; though, I am curious why the casts were there in the first place.

Note, haven't added them there, but from what I can test, the casts there can serve as a compile time check that the right type is used, e.g.

unsigned long i;
void foo (void)
{
  asm volatile ("# %0 %1" : "=r" ((unsigned long long) i) : "0" ((unsigned long long) 0));
}

Ah, interesting. A not-so-heinous extension then.

In that case, how about just protecting the include with:

#if GCC_VERSION >= 4300 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))

rather than:

#if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))

so that clang will fail the version check? At the end of the day we only really care what happens during stage 2 and 3. Cross-compilers built with recentish gccs will still benefit.

Works for me (thus, pre-approved with a comment explaining the version choice).

Thanks, Richard.

Thanks, Richard
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 11:40 AM, Richard Biener richard.guent...@gmail.com wrote: On Wed, May 28, 2014 at 10:24 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: On May 26, 2014, at 2:22 AM, FX fxcoud...@gmail.com wrote: This causes GCC bootstrap to fail on Darwin systems (whose system compiler is clang-based). Since PR 61146 was resolved as INVALID (but I’m not sure it’s the right call, see below), I’ve filed a separate report for the bootstrap issue (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). Since my PR has been closed twice by Andrew Pinski (“it’s clang’s fault, bouh ouh”), I’d ask the maintainers to step in. Can we please provide a GCC that works for the default darwin setup? Or at least drop darwin as secondary target and document the failure? The best coarse of action, post a patch, have it reviewed and put in. Current action, a patch has been posted, the review is outstanding, I’d like to see it put in; though, I am curious why the casts were there in the first place. Note, haven't added them there, but from what I can test, the casts there can serve as a compile time check that the right type is used, e.g. unsigned long i; void foo (void) { asm volatile (# %0 %1 : =r ((unsigned long long) i) : 0 ((unsigned long long) 0)); } Ah, interesting. A not-so-hineous extension then. In that case, how about just protecting the include with: #if GCC_VERSION = 4300 (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) rather than: #if GCC_VERSION = 3000 (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) so that clang will fail the version check? At the end of the day we only really care what happens during stage 2 and 3. Cross-compilers built with recentish gccs will still benefit. Works for me (thus, pre-approved with a comment explaining the version choice). 
Btw, testing coverage would ask for defined (__OPTIMIZE__), too, disabling that path in stage1 unconditionally even for new GCC. Richard. Thanks, Richard. Thanks, Richard
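The gating being discussed can be sketched as a plain preprocessor test. In this self-contained version the `W_TYPE_SIZE` value and the `USE_LONGLONG_H`/`use_longlong_h` names are illustrative stand-ins for what ansidecl.h and longlong.h actually define:

```cpp
// stand-in for the ansidecl.h definition; clang defines __GNUC__ but
// should not be treated as a real GCC for this purpose
#if defined(__GNUC__) && !defined(__clang__)
# define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__)
#else
# define GCC_VERSION 0
#endif

#define W_TYPE_SIZE 64   // illustrative; hwint.h computes this

// mirrors the proposed guard: require a recent real GCC and, for a
// 64-bit word type, a __int128 to serve as the double-word type
#if GCC_VERSION >= 4300 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))
# define USE_LONGLONG_H 1
#else
# define USE_LONGLONG_H 0
#endif

int use_longlong_h () { return USE_LONGLONG_H; }
```

With clang (or any non-GCC compiler) `GCC_VERSION` evaluates to 0, so the `>= 4300` test fails and the longlong.h path is skipped.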
Re: [PATCH, V2] Fix an AARCH64/ARM performance regression
On Wed, May 28, 2014 at 11:02 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, But the comment previously read /* A constant address in TO_RTX can have VOIDmode, we must not try to call force_reg for that case. Avoid that case. */ and you are removing that check. So I guess you want to retain GET_MODE (XEXP (to_rtx, 0)) != VOIDmode or investigate why we don't need to avoid this case anymore. In fact, that's probably the only change necessary, just drop the == BLKmode check? Your lengthy description doesn't reveal if you investigated that. It seems to me it tries to avoid adjusting MEM (symbol-ref) for example? Maybe also for optimization reasons? I have now found out what kind of MEM it is which has a VOIDmode address... Just a few lines above, there is this: if (!MEM_P (to_rtx)) { /* We can get constant negative offsets into arrays with broken user code. Translate this to a trap instead of ICEing. */ gcc_assert (TREE_CODE (offset) == INTEGER_CST); expand_builtin_trap (); to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); } Now, I have never ever seen this block executed. Even the examples from PR 23714 do not execute this block any more. But const0_rtx has VOIDmode. This would satisfy the condition. So if I try: to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); force_reg (mode1, to_rtx); Voila: assertion in emit_move_insn. Somehow this matches literally what Jeff wrote in the comment: A constant address in TO_RTX can have VOIDmode, we must not try to call force_reg for that case. But nothing happens with: to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT); also the rest of the code generation works, and of course this expands a null pointer access. Adding a new comment before the block reflecting your analysis above would be nice as well. Done, see the attached patch v2.
I have also removed the check for MEM_P (to_rtx) in the first if-statement, because this is always satisfied after the if (!MEM_P (to_rtx)) block above. Now both places have exactly the same logic. Bootstrapped and regression-tested on x86_64-linux-gnu. OK for trunk? Ok. Thanks, Richard. Thanks Bernd.
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 11:48 AM, Jakub Jelinek ja...@redhat.com wrote: On Wed, May 28, 2014 at 11:38:55AM +0200, Richard Biener wrote: On Wed, May 28, 2014 at 10:24 AM, FX fxcoud...@gmail.com wrote: Yeah, a portable (C and C++) static assert would be nice. And also pushing this to gmp then. In the meantime I see nothing wrong in merging from GMP. One question, one comment: 1. can I count your “I see nothing wrong” as an approval, as in “global reviewers can approve changes to any part of the compiler or associated libraries”? Well, kind of. But Jakub is as well, so I don't want to override him. So please wait for an ack from Jakub. I agree with him that the casts served a purpose and that, if removed, they need to be replaced with an appropriate assertion measure. I think my preference would be to change include/ansidecl.h to: /* This macro simplifies testing whether we are using gcc, and if it is of a particular minimum version. (Both major minor numbers are significant.) This macro will evaluate to 0 if we are not using gcc at all.*/ #ifndef GCC_VERSION +/* Some compilers pretend to be GCC, even when they are not. */ +#if defined(__clang__) || defined(__INTEL_COMPILER) +#define GCC_VERSION 0 +#else #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) +#endif #endif /* GCC_VERSION */ so that we really can trust the GCC_VERSION macro, casts in lhs of inline asm isn't the only incompatibility clang has. Any other compilers that define __GNUC__? Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. Richard. Jakub
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 12:03 PM, Richard Biener richard.guent...@gmail.com wrote: On Wed, May 28, 2014 at 11:48 AM, Jakub Jelinek ja...@redhat.com wrote: On Wed, May 28, 2014 at 11:38:55AM +0200, Richard Biener wrote: On Wed, May 28, 2014 at 10:24 AM, FX fxcoud...@gmail.com wrote: Yeah, a portable (C and C++) static assert would be nice. And also pushing this to gmp then. In the meantime I see nothing wrong in merging from GMP. One question, one comment: 1. can I count your “I see nothing wrong” as an approval, as in “global reviewers can approve changes to any part of the compiler or associated libraries”? Well, kind of. But Jakub is as well, so I don't want to override him. So please wait for an ack from Jakub. I agree with him that the casts served a purpose and that, if removed, they need to be replaced with an appropriate assertion measure. I think my preference would be to change include/ansidecl.h to: /* This macro simplifies testing whether we are using gcc, and if it is of a particular minimum version. (Both major minor numbers are significant.) This macro will evaluate to 0 if we are not using gcc at all.*/ #ifndef GCC_VERSION +/* Some compilers pretend to be GCC, even when they are not. */ +#if defined(__clang__) || defined(__INTEL_COMPILER) +#define GCC_VERSION 0 +#else #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) +#endif #endif /* GCC_VERSION */ so that we really can trust the GCC_VERSION macro, casts in lhs of inline asm isn't the only incompatibility clang has. Any other compilers that define __GNUC__? Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. So if we want to go down that route I'd rather change the configury that checks whether we are using GCC to be more pedantic and for example parse $CC --version output (not for the actual version but for whether $CC is GCC). Richard. Richard. Jakub
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 11:49 AM, Richard Sandiford rsand...@linux.vnet.ibm.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, May 28, 2014 at 11:40 AM, Richard Biener richard.guent...@gmail.com wrote: On Wed, May 28, 2014 at 10:24 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: On May 26, 2014, at 2:22 AM, FX fxcoud...@gmail.com wrote: This causes GCC bootstrap to fail on Darwin systems (whose system compiler is clang-based). Since PR 61146 was resolved as INVALID (but I’m not sure it’s the right call, see below), I’ve filed a separate report for the bootstrap issue (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). Since my PR has been closed twice by Andrew Pinski (“it’s clang’s fault, bouh ouh”), I’d ask the maintainers to step in. Can we please provide a GCC that works for the default darwin setup? Or at least drop darwin as secondary target and document the failure? The best coarse of action, post a patch, have it reviewed and put in. Current action, a patch has been posted, the review is outstanding, I’d like to see it put in; though, I am curious why the casts were there in the first place. Note, haven't added them there, but from what I can test, the casts there can serve as a compile time check that the right type is used, e.g. unsigned long i; void foo (void) { asm volatile (# %0 %1 : =r ((unsigned long long) i) : 0 ((unsigned long long) 0)); } Ah, interesting. A not-so-hineous extension then. In that case, how about just protecting the include with: #if GCC_VERSION = 4300 (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) rather than: #if GCC_VERSION = 3000 (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) so that clang will fail the version check? At the end of the day we only really care what happens during stage 2 and 3. 
Cross-compilers built with recentish gccs will still benefit. Works for me (thus, pre-approved with a comment explaining the version choice). Btw, testing coverage would ask for defined (__OPTIMIZE__), too, disabling that path in stage1 unconditionally even for new GCC. Ah, but surely the day is near when we use -O or -Og for stage1? I've been using -O for a good few years now and it definitely makes things faster (and without affecting debugging too much -- in the rare cases where it does affect debugging, a recompile with -g is very quick). ATM we get the testing coverage for i686 and ppc32 hosts. So TBH I'd prefer to keep it simple and just bump the version number. Works for me (though see Jakubs idea on the other thread, so please wait until we settled on a solution). Richard. Thanks, Richard
Re: Mark more constants readonly
On Wed, May 28, 2014 at 12:00 PM, Bernd Schmidt ber...@codesourcery.com wrote: On 05/27/2014 04:57 PM, Richard Biener wrote: On Tue, May 27, 2014 at 3:13 PM, Bernd Schmidt ber...@codesourcery.com wrote: I noticed that string constants built by the Fortran frontend don't set TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems to set it for COMPLEX_CST). That was confusing the ptx backend I'm working on. The -fwritable-strings option for C was removed a while ago, so I tested the following patch to just set the flag unconditionally, and passed bootstrap and test on x86_64-linux. Ok? Hmm? Not so obvious. TREE_READONLY has no purpose on tcc_constant nodes if I read documentation correctly (tree.h). In fact base.readonly_flag is unused for tcc_constant (and would be redundant with TREE_CONSTANT). Well, the C frontend still sets it for strings (in fix_string_type), and gcc.dg/lvalue-5.c fails if that is removed. So things are a little inconsistent between frontends. Then fix that instead. The ptx backend shouldn't use TREE_READONLY on tcc_constant, so you can resolve the issue completely within the ptx backend anyway. Richard. Bernd
Re: Fix old bug in div_and_round_double
On Wed, May 28, 2014 at 12:17 PM, Eric Botcazou ebotca...@adacore.com wrote: I suppose you also install on branches? No plan to do so since this isn't a regression, unless you insist. :-) Well, a wrong-code bug plus a very obvious fix certainly qualifies. Richard. -- Eric Botcazou
Re: Fix old bug in div_and_round_double
On Wed, May 28, 2014 at 12:23 PM, Eric Botcazou ebotca...@adacore.com wrote: Well, a wrong-code bug plus a very obvious fix certainly qualifies. Fine with me, onto which branch(es) do you want me to put it? Where you have tested it already, no need to spend extra cycles. Richard. -- Eric Botcazou
[PATCH] Fix PR61045
The following fixes bogus folding (introducing signed overflow) for the X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 transform, where we let through a sign change of the remaining constant operand. Fixed as follows, bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard.

2014-05-28  Richard Biener  rguent...@suse.de

	PR middle-end/61045
	* fold-const.c (fold_comparison): When folding
	X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 also ensure
	the sign of the remaining constant operand stays the same.

	* gcc.dg/pr61045.c: New testcase.

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 211011)
+++ gcc/fold-const.c	(working copy)
@@ -9239,7 +9239,7 @@ fold_comparison (location_t loc, enum tr
   /* Transform comparisons of the form X +- C1 CMP Y +- C2 to
      X CMP Y +- C2 +- C1 for signed X, Y.  This is valid if
      the resulting offset is smaller in absolute value than the
-     original one.  */
+     original one and has the same sign.  */
   if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
       && (TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
       && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
@@ -9258,32 +9258,35 @@ fold_comparison (location_t loc, enum tr
				      a comparison");
       /* Put the constant on the side where it doesn't overflow and is
-	 of lower absolute value than before.  */
+	 of lower absolute value and of same sign than before.  */
       cst = int_const_binop (TREE_CODE (arg0) == TREE_CODE (arg1)
			      ? MINUS_EXPR : PLUS_EXPR,
			      const2, const1);
       if (!TREE_OVERFLOW (cst)
-	  && tree_int_cst_compare (const2, cst) == tree_int_cst_sgn (const2))
+	  && tree_int_cst_compare (const2, cst) == tree_int_cst_sgn (const2)
+	  && tree_int_cst_sgn (cst) == tree_int_cst_sgn (const2))
	{
	  fold_overflow_warning (warnmsg, WARN_STRICT_OVERFLOW_COMPARISON);
	  return fold_build2_loc (loc, code, type,
-				  variable1,
-				  fold_build2_loc (loc,
-						   TREE_CODE (arg1), TREE_TYPE (arg1),
-						   variable2, cst));
+				  variable1,
+				  fold_build2_loc (loc, TREE_CODE (arg1),
						   TREE_TYPE (arg1),
+						   variable2, cst));
	}

       cst = int_const_binop (TREE_CODE (arg0) == TREE_CODE (arg1)
			      ? MINUS_EXPR : PLUS_EXPR,
			      const1, const2);
       if (!TREE_OVERFLOW (cst)
-	  && tree_int_cst_compare (const1, cst) == tree_int_cst_sgn (const1))
+	  && tree_int_cst_compare (const1, cst) == tree_int_cst_sgn (const1)
+	  && tree_int_cst_sgn (cst) == tree_int_cst_sgn (const1))
	{
	  fold_overflow_warning (warnmsg, WARN_STRICT_OVERFLOW_COMPARISON);
	  return fold_build2_loc (loc, code, type,
-				  fold_build2_loc (loc, TREE_CODE (arg0), TREE_TYPE (arg0),
-						   variable1, cst),
-				  variable2);
+				  fold_build2_loc (loc, TREE_CODE (arg0),
+						   TREE_TYPE (arg0),
+						   variable1, cst),
+				  variable2);
	}
     }

Index: gcc/testsuite/gcc.dg/pr61045.c
===================================================================
--- gcc/testsuite/gcc.dg/pr61045.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr61045.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do run } */
+/* { dg-options "-fstrict-overflow" } */
+
+int main ()
+{
+  int a = 0;
+  int b = __INT_MAX__;
+  int t = (a - 2) > (b - 1);
+  if (t != 0)
+    __builtin_abort ();
+  return 0;
+}
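The added condition can be illustrated outside of fold-const.c with plain integers standing in for the tree constants. `transform_ok` is a hypothetical name; its two tests mirror the existing shrinking-offset check and the newly added same-sign check:

```cpp
// c2 is the constant that stays beside Y, cst the combined constant
// (C2 -+ C1).  The fold is only safe if cst moved strictly toward
// zero *and* kept the sign of c2 -- the second test is the fix.
static int sgn (long x) { return (x > 0) - (x < 0); }

bool transform_ok (long c2, long cst)
{
  int cmp = (c2 > cst) - (c2 < cst);   // tree_int_cst_compare analogue
  return cmp == sgn (c2)               // old check: smaller magnitude
         && sgn (cst) == sgn (c2);     // new check: same sign
}
```

For the PR 61045 input (a - 2) > (b - 1) we get c2 = -1 and cst = 1: the old check passes (the compare result equals the sign of c2) but the sign flipped, so the new check correctly rejects the fold.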
[PATCH] Less noisy VRP dump, improve copy handling
The following patch makes the VRP dump less vertical space noisy. It also makes handling of the two forms of copies, a_1 = b_2; and a_1 = PHI b_2 behave more similar by also copying VR_UNDEFINED ranges in the first case and by creating a [b_2, b_2] symbolic range in the second case (if b_2 has a VR_VARYING range). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-05-28 Richard Biener rguent...@suse.de * tree-ssa-propagate.c (add_control_edge): Print less vertical space. * tree-vrp.c (extract_range_from_ssa_name): Also copy VR_UNDEFINED. (vrp_visit_assignment_or_call): Print less vertical space. (vrp_visit_stmt): Likewise. (vrp_visit_phi_node): Likewise. For a PHI argument with VR_VARYING range consider recording it as copy. Index: gcc/tree-ssa-propagate.c === *** gcc/tree-ssa-propagate.c(revision 211015) --- gcc/tree-ssa-propagate.c(working copy) *** add_control_edge (edge e) *** 301,307 cfg_blocks_add (bb); if (dump_file (dump_flags TDF_DETAILS)) ! fprintf (dump_file, Adding Destination of edge (%d - %d) to worklist\n\n, e-src-index, e-dest-index); } --- 301,307 cfg_blocks_add (bb); if (dump_file (dump_flags TDF_DETAILS)) ! fprintf (dump_file, \nAdding Destination of edge (%d - %d) to worklist\n, e-src-index, e-dest-index); } Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c (revision 211015) --- gcc/tree-vrp.c (working copy) *** extract_range_from_ssa_name (value_range *** 1810,1816 { value_range_t *var_vr = get_value_range (var); ! if (var_vr-type != VR_UNDEFINED var_vr-type != VR_VARYING) copy_value_range (vr, var_vr); else set_value_range (vr, VR_RANGE, var, var, NULL); --- 1810,1816 { value_range_t *var_vr = get_value_range (var); ! if (var_vr-type != VR_VARYING) copy_value_range (vr, var_vr); else set_value_range (vr, VR_RANGE, var, var, NULL); *** vrp_visit_assignment_or_call (gimple stm *** 6679,6685 print_generic_expr (dump_file, lhs, 0); fprintf (dump_file, : ); dump_value_range (dump_file, new_vr); ! 
fprintf (dump_file, \n\n); } if (new_vr.type == VR_VARYING) --- 6679,6685 print_generic_expr (dump_file, lhs, 0); fprintf (dump_file, : ); dump_value_range (dump_file, new_vr); ! fprintf (dump_file, \n); } if (new_vr.type == VR_VARYING) *** vrp_visit_stmt (gimple stmt, edge *taken *** 7473,7479 { fprintf (dump_file, \nVisiting statement:\n); print_gimple_stmt (dump_file, stmt, 0, dump_flags); - fprintf (dump_file, \n); } if (!stmt_interesting_for_vrp (stmt)) --- 7473,7478 *** vrp_visit_phi_node (gimple phi) *** 8242,8248 if (dump_file (dump_flags TDF_DETAILS)) { fprintf (dump_file, ! \nArgument #%d (%d - %d %sexecutable)\n, (int) i, e-src-index, e-dest-index, (e-flags EDGE_EXECUTABLE) ? : not ); } --- 8241,8247 if (dump_file (dump_flags TDF_DETAILS)) { fprintf (dump_file, ! Argument #%d (%d - %d %sexecutable)\n, (int) i, e-src-index, e-dest-index, (e-flags EDGE_EXECUTABLE) ? : not ); } *** vrp_visit_phi_node (gimple phi) *** 8260,8275 /* Do not allow equivalences or symbolic ranges to leak in from backedges. That creates invalid equivalencies. See PR53465 and PR54767. */ ! if (e-flags EDGE_DFS_BACK ! (vr_arg.type == VR_RANGE ! || vr_arg.type == VR_ANTI_RANGE)) { ! vr_arg.equiv = NULL; ! if (symbolic_range_p (vr_arg)) { ! vr_arg.type = VR_VARYING; ! vr_arg.min = NULL_TREE; ! vr_arg.max = NULL_TREE; } } } --- 8259,8288 /* Do not allow equivalences or symbolic ranges to leak in from backedges. That creates invalid equivalencies. See PR53465 and PR54767. */ ! if (e-flags EDGE_DFS_BACK) { ! if (vr_arg.type == VR_RANGE ! || vr_arg.type == VR_ANTI_RANGE) { ! vr_arg.equiv = NULL; ! if (symbolic_range_p (vr_arg)) ! { ! vr_arg.type = VR_VARYING; ! vr_arg.min
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 3:15 PM, FX fxcoud...@gmail.com wrote: After lengthy IRC discussions, what Richard and I can live with is !defined(__clang__) in this particular case that uses longlong.h in GCC sources, with a comment why. I’ll test this patch and commit if there is no problem. But right now, current trunk doesn’t build on x86_64-apple-darwin due to error below. Richard, could this be due to your revision 211013? Hum, yeah. But why does it even warn if sizeof (long) == sizeof (long long)? I suppose casting the result of CWI_ELT () to uint64_t fixes this. Do similar errors happen elsewhere? (the hex printfs expect unsigned types but CWI_ELT returns a signed HWI) Richard. FX ../../trunk/gcc/rtl.c: In function ‘void cwi_output_hex(FILE*, const_rtx)’: ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, CWI_ELT (x, --i)); ^ ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, CWI_ELT (x, i)); ^ ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] cc1plus: all warnings being treated as errors make[3]: *** [build/rtl.o] Error 1 make[2]: *** [all-stage2-gcc] Error 2 make[1]: *** [stage2-bubble] Error 2 make: *** [all] Error 2
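The warnings FX quotes are the usual signed-vs-unsigned mismatch for hex conversions. A minimal sketch of the fix Richard suggests, with `print_padded_hex` as an illustrative stand-in for the rtl.c fprintf calls:

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Format a signed 64-bit element the way rtl.c prints CWI_ELT values.
// %x conversions take an unsigned argument, so casting the element to
// uint64_t makes the call match a "%016" PRIx64 format exactly and
// silences -Wformat.
void print_padded_hex (char *buf, std::size_t n, int64_t elt)
{
  std::snprintf (buf, n, "%016" PRIx64, (uint64_t) elt);
}
```

For example, `print_padded_hex (buf, sizeof buf, -1)` yields "ffffffffffffffff" regardless of whether int64_t is long or long long on the host.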
[PATCH] Revert HWI printing change
I have reverted the following for now. Richard. 2014-05-28 Richard Biener rguent...@suse.de Revert 2014-05-28 Richard Biener rguent...@suse.de * hwint.h (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. Index: gcc/hwint.h === *** gcc/hwint.h.orig2014-05-26 10:49:23.009339524 +0200 --- gcc/hwint.h 2014-05-26 11:09:43.597255488 +0200 *** extern char sizeof_long_long_must_be_8[s *** 109,156 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; - /* Various printf format strings for HOST_WIDE_INT. */ - - #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG - # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C L - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - 0x% HOST_LONG_FORMAT x%016 HOST_LONG_FORMAT x - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - %016 HOST_LONG_FORMAT x - #else - # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C LL - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - 0x% HOST_LONG_LONG_FORMAT x%016 HOST_LONG_LONG_FORMAT x - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - %016 HOST_LONG_LONG_FORMAT x - #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */ - - #define HOST_WIDE_INT_PRINT_DEC % HOST_WIDE_INT_PRINT d - #define HOST_WIDE_INT_PRINT_DEC_C HOST_WIDE_INT_PRINT_DEC HOST_WIDE_INT_PRINT_C - #define HOST_WIDE_INT_PRINT_UNSIGNED % HOST_WIDE_INT_PRINT u - #define HOST_WIDE_INT_PRINT_HEX %# HOST_WIDE_INT_PRINT x - #define HOST_WIDE_INT_PRINT_HEX_PURE % HOST_WIDE_INT_PRINT x - /* Provide C99 inttypes.h style format definitions for 64bits. */ #ifndef HAVE_INTTYPES_H #undef PRId64 ! #define PRId64 HOST_WIDE_INT_PRINT d #undef PRIi64 ! #define PRIi64 HOST_WIDE_INT_PRINT i #undef PRIo64 ! #define PRIo64 HOST_WIDE_INT_PRINT o #undef PRIu64 ! #define PRIu64 HOST_WIDE_INT_PRINT u #undef PRIx64 ! #define PRIx64 HOST_WIDE_INT_PRINT x #undef PRIX64 ! 
#define PRIX64 HOST_WIDE_INT_PRINT X #endif /* Define HOST_WIDEST_FAST_INT to the widest integer type supported efficiently in hardware. (That is, the widest integer type that fits in a hardware register.) Normally this is long but on some hosts it --- 75,119 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; /* Provide C99 inttypes.h style format definitions for 64bits. */ #ifndef HAVE_INTTYPES_H + #if HOST_BITS_PER_LONG == 64 + # define GCC_PRI64 HOST_LONG_FORMAT + #else + # define GCC_PRI64 HOST_LONG_LONG_FORMAT + #endif #undef PRId64 ! #define PRId64 GCC_PRI64 d #undef PRIi64 ! #define PRIi64 GCC_PRI64 i #undef PRIo64 ! #define PRIo64 GCC_PRI64 o #undef PRIu64 ! #define PRIu64 GCC_PRI64 u #undef PRIx64 ! #define PRIx64 GCC_PRI64 x #undef PRIX64 ! #define PRIX64 GCC_PRI64 X ! #endif ! ! /* Various printf format strings for HOST_WIDE_INT. */ ! ! #if HOST_BITS_PER_LONG == 64 ! # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT ! # define HOST_WIDE_INT_PRINT_C L ! #else ! # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT ! # define HOST_WIDE_INT_PRINT_C LL #endif + #define HOST_WIDE_INT_PRINT_DEC % PRId64 + #define HOST_WIDE_INT_PRINT_DEC_C % PRId64 HOST_WIDE_INT_PRINT_C + #define HOST_WIDE_INT_PRINT_UNSIGNED % PRIu64 + #define HOST_WIDE_INT_PRINT_HEX %# PRIx64 + #define HOST_WIDE_INT_PRINT_HEX_PURE % PRIx64 + #define HOST_WIDE_INT_PRINT_DOUBLE_HEX 0x% PRIx64 %016 PRIx64 + #define HOST_WIDE_INT_PRINT_PADDED_HEX %016 PRIx64 + /* Define HOST_WIDEST_FAST_INT to the widest integer type supported efficiently in hardware. (That is, the widest integer type that fits in a hardware register.) Normally this is long but on some hosts it
Re: [PATCH][2/n] Always 64bit-HWI cleanups
On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Let's see. Richard.
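Richard Sandiford's suggestion in code form, relying only on the C++ functional-cast notation (helper names here are illustrative):

```cpp
#include <cstdint>

// Widen before shifting: shifting a plain int 1 by 48 would overflow,
// while the functional cast gives a well-defined 64-bit result.
uint64_t bit48 ()
{
  return uint64_t (1) << 48;
}

// The cast notation also works for non-constant operands, which a
// HOST_WIDE_INT_C-style literal-pasting macro cannot do.
int64_t scale_to_64 (int n)
{
  return int64_t (n) * 10000000000;  // forces 64-bit arithmetic
}
```

So for the literal case `uint64_t (1) << 48` replaces `HOST_WIDE_INT_C (1) << 48`, at the cost of spelling the type at each use.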
[PATCH] Fix PR61378
Committed as obvious.

Richard.

2014-06-02  Richard Biener  rguent...@suse.de

	PR tree-optimization/61378
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Initialize
	valueized_anything.

Index: gcc/tree-ssa-sccvn.c
===================================================================
*** gcc/tree-ssa-sccvn.c	(revision 211021)
--- gcc/tree-ssa-sccvn.c	(working copy)
*************** vn_reference_lookup_3 (ao_ref *ref, tree
*** 1613,1619 ****
  	 conditional calls to free falsely clobbering ref because
  	 of imprecise points-to info of the argument.  */
        tree oldargs[4];
!       bool valueized_anything;
        for (unsigned i = 0; i < gimple_call_num_args (def_stmt); ++i)
  	{
  	  oldargs[i] = gimple_call_arg (def_stmt, i);
--- 1613,1619 ----
  	 conditional calls to free falsely clobbering ref because
  	 of imprecise points-to info of the argument.  */
        tree oldargs[4];
!       bool valueized_anything = false;
        for (unsigned i = 0; i < gimple_call_num_args (def_stmt); ++i)
  	{
  	  oldargs[i] = gimple_call_arg (def_stmt, i);
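The bug class is simple enough to show in isolation: a flag that is only set on some paths must start out false, otherwise the later read is undefined. The names below are illustrative, not the sccvn code:

```cpp
// Only some paths set *valueized, as with the conditional
// valueization in vn_reference_lookup_3.  Initializing it up front
// (the fix) makes the caller's later test well-defined on every path.
int maybe_valueize (int arg, bool *valueized)
{
  *valueized = false;   // without this, the positive path leaves junk
  if (arg < 0)
    {
      *valueized = true;
      return -arg;
    }
  return arg;
}
```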
GCC 4.7.4 Status Report, branch frozen for release (candidate)
Status
======

The GCC 4.7 branch is now frozen as I am preparing a first release candidate for GCC 4.7.4. All changes from now on require release manager approval. After GCC 4.7.4 is released the branch will be closed.

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2013-04/msg00121.html

I will send the next status report once GCC 4.7.4 RC1 is ready.
[PATCH] Testcase for PR61346
Committed. Richard. 2014-06-02 Richard Biener rguent...@suse.de PR tree-optimization/61346 * gcc.dg/torture/pr61346.c: New testcase. Index: gcc/testsuite/gcc.dg/torture/pr61346.c === --- gcc/testsuite/gcc.dg/torture/pr61346.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr61346.c (working copy) @@ -0,0 +1,162 @@ +/* { dg-do run } */ + +extern void abort (void); + +typedef int int32_t __attribute__ ((mode (SI))); +typedef int int64_t __attribute__ ((mode (DI))); +typedef __SIZE_TYPE__ size_t; + +struct slice +{ + unsigned char *data; + int64_t len; + int64_t cap; +}; + +void fail (int32_t) __attribute__ ((noinline)); +void +fail (int32_t c) +{ + if (c != 0) +abort (); +} + +struct decode_rune_ret +{ + int32_t r; + int64_t width; +}; + +struct decode_rune_ret decode_rune (struct slice) __attribute__ ((noinline)); +struct decode_rune_ret +decode_rune (struct slice s) +{ + struct decode_rune_ret dr; + dr.r = s.data[0]; + dr.width = 1; + return dr; +} + +_Bool is_space (int32_t) __attribute__ ((noinline)); +_Bool +is_space (int32_t r) +{ + return r == ' '; +} + +struct ret +{ + int64_t advance; + struct slice token; +}; + +struct ret scanwords (struct slice, _Bool) __attribute__ ((noinline)); + +struct ret +scanwords (struct slice data, _Bool ateof) +{ + int64_t advance; + struct slice token; + int64_t start = 0; + { +int64_t width; +for (width = 0; start data.len; start += width) + { + int32_t r = 0; + struct slice s; + if (start data.cap || start 0) + fail (3); + s.data = data.data + (size_t) start; + s.len = data.len - start; + s.cap = data.cap - start; + struct decode_rune_ret dr = decode_rune (s); + r = dr.r; + width = dr.width; + if (!is_space (r)) + break; + } + } + _Bool tmp = ateof; + if (tmp != 0) +goto L1; + else +goto L2; + L1: + tmp = data.len == 0; + L2: + if (tmp != 0) +goto L11; + else +goto L12; + L11: +{ + struct ret r; + advance = 0; + token.data = 0; + token.len = 0; + token.cap = 0; + r.advance = advance; + r.token = token; + return r; +} + L12:; + 
int64_t width; + int64_t i; + for (width = 0, i = start; i data.len; i += width) +{ + int32_t r; + struct slice s; + if (i data.cap || i 0) + fail (3); + s.data = data.data + i; + s.len = data.len - i; + s.cap = data.cap - i; + struct decode_rune_ret dr = decode_rune (s); + r = dr.r; + width = dr.width; + if (is_space (r)) + { + if (i start || i data.cap || i 0) + fail (3); + if (start data.cap || start 0) + fail (3); + struct ret r; + advance = i + width; + token.data = data.data + (size_t) start; + token.len = i - start; + token.cap = data.cap - start; + r.advance = advance; + r.token = token; + return r; + } +} + { +struct ret r; +advance = 0; +token.data = 0; +token.len = 0; +token.cap = 0; +r.advance = advance; +r.token = token; +return r; + } +} + +int +main () +{ + unsigned char buf[1000]; + struct slice s; + __builtin_memset (buf, 0, sizeof (buf)); + buf[0] = ' '; + buf[1] = 'a'; + buf[2] = ' '; + s.data = buf; + s.len = 3; + s.cap = sizeof (buf); + struct ret r; + r = scanwords (s, 1); + if (r.advance != 3 || r.token.data[0] != 'a' || r.token.len != 1) +abort (); + return 0; +}
Re: [PATCH][2/n] Always 64bit-HWI cleanups
On Mon, 2 Jun 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Not sure what you mean by taken though. Isn't it just a C++-style cast? The nice thing about [u]int64_t (N) is that it Just Works for both constant and non-constant N. We shouldn't really need a special macro that can only handle constant N. E.g.: #include stdint.h uint64_t x = uint64_t (1) 48 But HOST_WIDE_INT_C can also be used to write 12443875182037483LL. Not sure if large numbers are parsed as 'long long' (give that C++04 doesn't have long long and integer literals not fitting long invoke undefined behavior). For example config/alpha/alpha.c: GEN_INT (HOST_WIDE_INT_C (0x0ffffff0)), config/alpha/alpha.c: word1 = GEN_INT (HOST_WIDE_INT_C (0xa77b0010a43b0018)); ... 
config/rs6000/rs6000.c: const HOST_WIDE_INT lower_32bits = HOST_WIDE_INT_C(0x); config/rs6000/rs6000.c: const HOST_WIDE_INT sign_bit = HOST_WIDE_INT_C(0x8000); ... (no significant other uses). Richard.
[PATCH] Make HOST_WIDE_INT underlying type match int64_t, adjust printing macros again
This computes the underlying type of int64_t in a configure check and uses that for HOST_WIDE_INT. This makes sure we can use the PRI?64 macros from inttypes.h for HOST_WIDE_INT_PRINT_*, which the following patch re-instantiates. The patch also makes sure inttypes.h defines PRId64 and, if not, provides replacements in hwint.h (like previously), matching int64_t. Built on i586-linux-gnu and x86_64-linux-gnu (to verify both 'long' and 'long long' choices). A bootstrap and regtest on x86_64-unknown-linux-gnu is pending. Ok? Thanks, Richard.

2014-06-02  Richard Biener  rguent...@suse.de

	* configure.ac: Check whether the underlying type of int64_t
	is long or long long.
	* configure: Regenerate.
	* config.in: Likewise.
	* hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t.
	(HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64.

Index: gcc/configure.ac
===================================================================
*** gcc/configure.ac	(revision 211125)
--- gcc/configure.ac	(working copy)
*************** if test x$ac_cv_c_uint64_t = xno -o
*** 316,321 ****
--- 316,350 ----
      AC_MSG_ERROR([uint64_t or int64_t not found])
  fi
  
+ # check what underlying integer type int64_t uses
+ AC_LANG_PUSH(C++)
+ AC_CACHE_CHECK(for int64_t underlying type, ac_cv_int64_t_type, [
+ AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+ #ifdef HAVE_STDINT_H
+ #include <stdint.h>
+ #endif
+ template <typename T> struct X { };
+ template <>
+ struct X<long> { typedef long t; };
+ ]], [[typename X<int64_t>::t x;]])],[ac_cv_int64_t_type=long],[ac_cv_int64_t_type="long long"])])
+ if test x$ac_cv_int64_t_type = xlong; then
+   AC_DEFINE(INT64_T_IS_LONG, 1,
+   [Define if int64_t uses long as underlying type.])
+ else
+   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+ #ifdef HAVE_STDINT_H
+ #include <stdint.h>
+ #endif
+ template <typename T> struct X { };
+ template <>
+ struct X<long long> { typedef long long t; };
+ ]], [[typename X<int64_t>::t x;]])],[],[AC_MSG_ERROR([error verifying int64_t uses long long])])
+ fi
+ AC_LANG_POP(C++)
+ 
+ 
  # -
  # Warnings and checking
  # -
*************** LIBS=$save_LIBS
*** 1055,1067 ****
  AC_SUBST(LDEXP_LIB)
  
  # Use <inttypes.h> only if it exists,
! # doesn't clash with <sys/types.h>, and declares intmax_t.
  AC_MSG_CHECKING(for inttypes.h)
  AC_CACHE_VAL(gcc_cv_header_inttypes_h,
  [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
!   [[#include <sys/types.h>
  #include <inttypes.h>]],
!   [[intmax_t i = -1;]])],
  [gcc_cv_header_inttypes_h=yes],
  [gcc_cv_header_inttypes_h=no])])
  AC_MSG_RESULT($gcc_cv_header_inttypes_h)
--- 1084,1101 ----
  AC_SUBST(LDEXP_LIB)
  
  # Use <inttypes.h> only if it exists,
! # doesn't clash with <sys/types.h>, declares intmax_t and defines
! # PRId64
  AC_MSG_CHECKING(for inttypes.h)
  AC_CACHE_VAL(gcc_cv_header_inttypes_h,
  [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
!   [[#define __STDC_FORMAT_MACROS
!   #include <sys/types.h>
  #include <inttypes.h>]],
!   [[intmax_t i = -1;
!   #ifndef PRId64
!   choke me
!   #endif]])],
  [gcc_cv_header_inttypes_h=yes],
  [gcc_cv_header_inttypes_h=no])])
  AC_MSG_RESULT($gcc_cv_header_inttypes_h)
Index: gcc/hwint.h
===================================================================
*** gcc/hwint.h	(revision 211125)
--- gcc/hwint.h	(working copy)
*************** extern char sizeof_long_long_must_be_8[s
*** 46,58 ****
  #endif
  
  /* Set HOST_WIDE_INT, this should be always 64 bits.
! 
!    With a sane ABI, 'long' is the largest efficient host integer type.
!    Thus, we use that unless we have to use 'long long'
!    because we're on a 32-bit host.  */
  
  #define HOST_BITS_PER_WIDE_INT 64
! #if HOST_BITS_PER_LONG == 64
  #   define HOST_WIDE_INT long
  #   define HOST_WIDE_INT_C(X) X ## L
  #else
--- 46,56 ----
  #endif
  
  /* Set HOST_WIDE_INT, this should be always 64 bits.
!    The underlying type is matched to that of int64_t and assumed
!    to be either long or long long.  */
  
  #define HOST_BITS_PER_WIDE_INT 64
! #if INT64_T_IS_LONG
  #   define HOST_WIDE_INT long
  #   define HOST_WIDE_INT_C(X) X ## L
  #else
*************** extern char sizeof_long_long_must_be_8[s
*** 75,122 ****
     typedef before using the __asm_fprintf__ format attribute.  */
  typedef HOST_WIDE_INT __gcc_host_wide_int__;
  
- /* Various printf format strings for HOST_WIDE_INT.  */
- 
- #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
- # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "L"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits.  */
- # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
-     "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x"
- # define HOST_WIDE_INT_PRINT_PADDED_HEX \
-     "%016" HOST_LONG_FORMAT "x"
- #else
- # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "LL"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits.  */
- # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
-     "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
- # define
Re: Eliminate write-only variables
On Sat, May 31, 2014 at 8:21 PM, Sandra Loosemore san...@codesourcery.com wrote: On 05/20/2014 04:56 AM, Richard Biener wrote: On Tue, May 20, 2014 at 6:29 AM, Jan Hubicka hubi...@ucw.cz wrote: On 05/18/2014 08:45 PM, Sandra Loosemore wrote: On 05/18/2014 02:59 PM, Jan Hubicka wrote: For cases like local-statics-7 your approach can be saved by adding simple IPA analysis to look for static vars that are used only by one function and keeping your DSE code active for them, so we can still get rid of these special-case assignments during late compilation. I am however not quite convinced it is worth the effort - do you have some real world cases where it helps? Um, the well-known benchmark. ;-) I looked at the generated code for this benchmark and see that your patch is indeed not getting rid of all the pointless static variables, while ours is, so this is quite disappointing. I'm thinking now that we will still need to retain our patch at least locally, although we'd really like to get it on trunk if possible. Your patch can really be improved by adding simple IPA analysis to work out what variables are referenced by one function only instead of using the no-longer-that-relevant local static info. As a last IPA pass, for each static variable with no address taken, look at all references and see if they all come from one function or functions inlined to it. Here is another testcase vaguely based on the same kind of signal-processing algorithm as the benchmark, but coded without reference to it. I have arm-none-eabi builds around for both 4.9.0 with our remove_local_statics patch applied, and trunk with your patch. With -O2, our optimization produces 23 instructions and gets rid of all 3 statics, yours produces 33 and only gets rid of 1 of them. Of course it's lame to use static variables in this way, but OTOH it's lame if GCC can't recognize them and optimize them away, too.
-Sandra

void fir (int *coeffs, int coeff_length, int *input, int input_length, int *output)
{
  static int *coeffp;
  static int *inputp;
  static int *outputp;

Here the inputp read is removed only at the 101.dceloop1 pass. That is really late. coeffp is removed late, too.

  int i, c, acc;
  for (i = 0; i < input_length; i++)
    {
      coeffp = coeffs;
      inputp = input + i;
      outputp = output + i;
      acc = 0;
      for (c = 0; c < coeff_length; c++)
	{
	  acc += *coeffp * *inputp;
	  coeffp++;
	  inputp--;
	}

It seems like AA can not work out the fact that inputp is unchanged until that late. Richi, any ideas? Well, AA is perfectly fine - it's just that this boils down to a partial redundancy problem. The stores can be removed by DCE even with

Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c	(revision 210635)
+++ gcc/tree-ssa-dce.c	(working copy)
@@ -278,10 +278,20 @@ mark_stmt_if_obviously_necessary (gimple
       break;
 
     case GIMPLE_ASSIGN:
-      if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
-	  && TREE_CLOBBER_P (gimple_assign_rhs1 (stmt)))
-	return;
-      break;
+      {
+	tree lhs = gimple_assign_lhs (stmt);
+	if (TREE_CODE (lhs) == SSA_NAME
+	    && TREE_CLOBBER_P (gimple_assign_rhs1 (stmt)))
+	  return;
+	if (TREE_CODE (lhs) == VAR_DECL
+	    && !TREE_ADDRESSABLE (lhs)
+	    && TREE_STATIC (lhs)
+	    && !TREE_PUBLIC (lhs) && !DECL_EXTERNAL (lhs)
+	    /* ??? Make sure lhs is only referred to from cfun->decl.  */
+	    && decl_function_context (lhs) == cfun->decl)
+	  return;
+	break;
+      }
     default:
       break;

but I don't have a good way of checking the ??? prerequisite (even without taking its address the statics may be referred to by a) inline copies, b) ipa-split function parts, c) nested functions). I'm sure IPA references are not kept up-to-date. The last reads go away with PRE at the expense of two additional induction variables. If we know that a static variable does not have its address taken and is only referred to from a single function we can in theory simply rewrite it into SSA form during early opts (before inlining its body), for example by SRA.
That isn't even restricted to function-local statics (for statics used in multiple functions but not having address-taken we can do the same with doing function entry / exit load / store). I think that would be a much better IPA enabled optimization than playing games with looking at individual stmts. FAOD, I'm not currently planning to work on this any more myself. I understand Bernd's patch pretty well and could make some minor changes to clean it up if that's what it takes to get it approved, but it sounds like what's wanted is a completely different approach, again, and I don't have any confidence that I could implement
Re: [PATCH] Fix PR ipa/61190
On Mon, Jun 2, 2014 at 11:00 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, the test case g++.old-deja/g++.mike/p4736b.C is mis-compiled with all optimization levels except -O0 and -Og. This probably started with gcc 4.7.x. The constructor Main::Main() is first inlined and then completely optimized away in the dce1 pass. But the thunk is still using the vtable, and therefore crashes. Well, the problem seems to be that the thunk code is not emitted in the normal way using gimple code, but instead by the i386 back-end with a callback. This makes the thunk invisible to the ipa-pure-const pass, which assumes that all values just flow straight through the thunk. See ipa-pure-const.c (analyze_function):

  if (fn->thunk.thunk_p || fn->alias)
    {
      /* Thunk gets propagated through, so nothing interesting happens.  */
      gcc_assert (ipa);
      return l;
    }

But this is not true for a virtual base class: the thunk itself references the vtable, so if nothing else might need the vtable, the optimizer may remove the initialization of the vtable, which happened in this example. The most simple workaround would be the attached patch, which simply falls back to emitting gimple code if virtual base classes are used. Boot-strapped and regression-tested on x86_64-linux-gnu. Ok for trunk? Honza should know better. I'd rather skip the above for thunks going the output_mi_thunk way. That is, the target hook docs should be updated to reflect which kind of thunks it is supposed to handle. Richard. Thanks Bernd.
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Fri, May 30, 2014 at 10:48 AM, Eric Botcazou ebotca...@adacore.com wrote: I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0's value-range refers to op1 or vice versa, and eventually only use op1 symbolically then. Patch along these lines attached. A bit heavy as expected, but it's VRP... It deals with my pet problem, you might want to check it does so with yours. Tested on x86_64-suse-linux with no regressions. Looks mostly ok. Any reason why you are not re-creating MINUS_EXPR in build_symbolic_expr? That is, build inv - t (for non-pointers, of course)? Otherwise if a range becomes -t + inv that will no longer match get_single_symbol for further propagation? Then I'm not sure if

+  /* Try with VR0 and [-INF, OP1].  */
+  set_value_range (&new_vr1, VR_RANGE, vrp_val_min (expr_type), op1, NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;
+
+  /* Try with VR0 and [OP1, +INF].  */
+  set_value_range (&new_vr1, VR_RANGE, op1, vrp_val_max (expr_type), NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;

is a safe thing to do. If it does make a difference to try [-INF, OP1], [OP1, +INF] instead of just [OP1, OP1] then at least it's very suspicious ;) (or an easy missed optimization). So - can you fix the negate thing and drop the four cases trying the +-INF based ranges? Thanks, Richard. 2014-05-30 Eric Botcazou ebotca...@adacore.com * tree-vrp.c (get_single_symbol): New function. (build_symbolic_expr): Likewise. (symbolic_range_based_on_p): New predicate. (extract_range_from_binary_expr_1): Deal with single-symbolic ranges for PLUS and MINUS. Do not drop symbolic ranges at the end.
(extract_range_from_binary_expr): Try harder for PLUS and MINUS if operand is symbolic and based on the other operand. 2014-05-30 Eric Botcazou ebotca...@adacore.com * gcc.dg/tree-ssa/vrp93.c: New test. * gnat.dg/opt38.adb: Likewise. -- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Mon, Jun 2, 2014 at 12:36 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, May 30, 2014 at 10:48 AM, Eric Botcazou ebotca...@adacore.com wrote: I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0's value-range refers to op1 or vice versa, and eventually only use op1 symbolically then. Patch along these lines attached. A bit heavy as expected, but it's VRP... It deals with my pet problem, you might want to check it does so with yours. Tested on x86_64-suse-linux with no regressions. Looks mostly ok. Any reason why you are not re-creating MINUS_EXPR in build_symbolic_expr? That is, build inv - t (for non-pointers, of course)? Otherwise if a range becomes -t + inv that will no longer match get_single_symbol for further propagation? Then I'm not sure if

+  /* Try with VR0 and [-INF, OP1].  */
+  set_value_range (&new_vr1, VR_RANGE, vrp_val_min (expr_type), op1, NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;
+
+  /* Try with VR0 and [OP1, +INF].  */
+  set_value_range (&new_vr1, VR_RANGE, op1, vrp_val_max (expr_type), NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;

is a safe thing to do. If it does make a difference to try [-INF, OP1], [OP1, +INF] instead of just [OP1, OP1] then at least it's very suspicious ;) (or an easy missed optimization). So - can you fix the negate thing and drop the four cases trying the +-INF based ranges? Btw, the testcases are missing in the patch so I can't have a look myself. Richard. Thanks, Richard. 2014-05-30 Eric Botcazou ebotca...@adacore.com * tree-vrp.c (get_single_symbol): New function. (build_symbolic_expr): Likewise.
(symbolic_range_based_on_p): New predicate. (extract_range_from_binary_expr_1): Deal with single-symbolic ranges for PLUS and MINUS. Do not drop symbolic ranges at the end. (extract_range_from_binary_expr): Try harder for PLUS and MINUS if operand is symbolic and based on the other operand. 2014-05-30 Eric Botcazou ebotca...@adacore.com * gcc.dg/tree-ssa/vrp93.c: New test. * gnat.dg/opt38.adb: Likewise. -- Eric Botcazou
Re: [PATCH] Make HOST_WIDE_INT underlying type match int64_t, adjust printing macros again
On Mon, 2 Jun 2014, Jakub Jelinek wrote: On Mon, Jun 02, 2014 at 11:24:23AM +0200, Richard Biener wrote:

2014-06-02  Richard Biener  rguent...@suse.de

	* configure.ac: Check whether the underlying type of int64_t
	is long or long long.
	* configure: Regenerate.
	* config.in: Likewise.
	* hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t.
	(HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64.

LGTM, but give others 24 hours to comment on it too. Testing with an older host compiler revealed two issues - typename use outside of a template and missing quoting around the cache variable check. Thus I am re-testing the following. Richard.

2014-06-02  Richard Biener  rguent...@suse.de

	* configure.ac: Check whether the underlying type of int64_t
	is long or long long.
	* configure: Regenerate.
	* config.in: Likewise.
	* hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t.
	(HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64.

Index: gcc/configure.ac
===================================================================
*** gcc/configure.ac	(revision 211125)
--- gcc/configure.ac	(working copy)
*************** if test x$ac_cv_c_uint64_t = xno -o
*** 316,321 ****
--- 316,350 ----
      AC_MSG_ERROR([uint64_t or int64_t not found])
  fi
  
+ # check what underlying integer type int64_t uses
+ AC_LANG_PUSH(C++)
+ AC_CACHE_CHECK(for int64_t underlying type, ac_cv_int64_t_type, [
+ AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+ #ifdef HAVE_STDINT_H
+ #include <stdint.h>
+ #endif
+ template <typename T> struct X { };
+ template <>
+ struct X<long> { typedef long t; };
+ ]], [[X<int64_t>::t x;]])],[ac_cv_int64_t_type=long],[ac_cv_int64_t_type="long long"])])
+ if test "$ac_cv_int64_t_type" = "long"; then
+   AC_DEFINE(INT64_T_IS_LONG, 1,
+   [Define if int64_t uses long as underlying type.])
+ else
+   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+ #ifdef HAVE_STDINT_H
+ #include <stdint.h>
+ #endif
+ template <typename T> struct X { };
+ template <>
+ struct X<long long> { typedef long long t; };
+ ]], [[X<int64_t>::t x;]])],[],[AC_MSG_ERROR([error verifying int64_t uses long long])])
+ fi
+ AC_LANG_POP(C++)
+ 
+ 
  # -
  # Warnings and checking
  # -
*************** LIBS=$save_LIBS
*** 1055,1067 ****
  AC_SUBST(LDEXP_LIB)
  
  # Use <inttypes.h> only if it exists,
! # doesn't clash with <sys/types.h>, and declares intmax_t.
  AC_MSG_CHECKING(for inttypes.h)
  AC_CACHE_VAL(gcc_cv_header_inttypes_h,
  [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
!   [[#include <sys/types.h>
  #include <inttypes.h>]],
!   [[intmax_t i = -1;]])],
  [gcc_cv_header_inttypes_h=yes],
  [gcc_cv_header_inttypes_h=no])])
  AC_MSG_RESULT($gcc_cv_header_inttypes_h)
--- 1084,1101 ----
  AC_SUBST(LDEXP_LIB)
  
  # Use <inttypes.h> only if it exists,
! # doesn't clash with <sys/types.h>, declares intmax_t and defines
! # PRId64
  AC_MSG_CHECKING(for inttypes.h)
  AC_CACHE_VAL(gcc_cv_header_inttypes_h,
  [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
!   [[#define __STDC_FORMAT_MACROS
!   #include <sys/types.h>
  #include <inttypes.h>]],
!   [[intmax_t i = -1;
!   #ifndef PRId64
!   choke me
!   #endif]])],
  [gcc_cv_header_inttypes_h=yes],
  [gcc_cv_header_inttypes_h=no])])
  AC_MSG_RESULT($gcc_cv_header_inttypes_h)
Index: gcc/hwint.h
===================================================================
*** gcc/hwint.h	(revision 211125)
--- gcc/hwint.h	(working copy)
*************** extern char sizeof_long_long_must_be_8[s
*** 46,58 ****
  #endif
  
  /* Set HOST_WIDE_INT, this should be always 64 bits.
! 
!    With a sane ABI, 'long' is the largest efficient host integer type.
!    Thus, we use that unless we have to use 'long long'
!    because we're on a 32-bit host.  */
  
  #define HOST_BITS_PER_WIDE_INT 64
! #if HOST_BITS_PER_LONG == 64
  #   define HOST_WIDE_INT long
  #   define HOST_WIDE_INT_C(X) X ## L
  #else
--- 46,56 ----
  #endif
  
  /* Set HOST_WIDE_INT, this should be always 64 bits.
!    The underlying type is matched to that of int64_t and assumed
!    to be either long or long long.  */
  
  #define HOST_BITS_PER_WIDE_INT 64
! #if INT64_T_IS_LONG
  #   define HOST_WIDE_INT long
  #   define HOST_WIDE_INT_C(X) X ## L
  #else
*************** extern char sizeof_long_long_must_be_8[s
*** 75,122 ****
     typedef before using the __asm_fprintf__ format attribute.  */
  typedef HOST_WIDE_INT __gcc_host_wide_int__;
  
- /* Various printf format strings for HOST_WIDE_INT.  */
- 
- #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
- # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "L"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits.  */
- # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
-     "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x"
- # define HOST_WIDE_INT_PRINT_PADDED_HEX \
-     "%016" HOST_LONG_FORMAT "x"
- #else
- # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
- # define HOST_WIDE_INT_PRINT_C "LL"
-   /* HOST_BITS_PER_WIDE_INT is 64 bits
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Early local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be necessary to run it from the first group.

void
pass_manager::execute_early_local_passes ()
{
-  execute_pass_list (pass_early_local_passes_1->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->next->next->sub);
}

is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya --

gcc/

2014-05-29  Ilya Enkovich  ilya.enkov...@intel.com

	* tree-chkp.c: New.
	* tree-chkp.h: New.
	* rtl-chkp.c: New.
	* rtl-chkp.h: New.
	* Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
	(GTFILES): Add tree-chkp.c.
	* c-family/c.opt (fchkp-check-incomplete-type): New.
	(fchkp-zero-input-bounds-for-main): New.
	(fchkp-first-field-has-own-bounds): New.
	(fchkp-narrow-bounds): New.
	(fchkp-narrow-to-innermost-array): New.
	(fchkp-optimize): New.
	(fchkp-use-fast-string-functions): New.
	(fchkp-use-nochk-string-functions): New.
	(fchkp-use-static-bounds): New.
	(fchkp-use-static-const-bounds): New.
	(fchkp-treat-zero-dynamic-size-as-infinite): New.
	(fchkp-check-read): New.
	(fchkp-check-write): New.
	(fchkp-store-bounds): New.
	(fchkp-instrument-calls): New.
	(fchkp-instrument-marked-only): New.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New.
Re: [PATCH, i386, Pointer Bounds Checker 10/x] Partitions
On Fri, May 30, 2014 at 7:10 PM, Jeff Law l...@redhat.com wrote: On 05/28/14 10:06, Ilya Enkovich wrote: Hi, This patch keeps instrumented and original versions together and preserves the transparent alias chain during symbol name privatization. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-05-28 Ilya Enkovich ilya.enkov...@intel.com * lto/lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. This part is OK. Note lto/ has its own ChangeLog, so put the ChangeLog entry there and don't use the lto/ prefix in the ChangeLog entry. (privatize_symbol_name): Restore transparent alias chain if required. What exactly are you doing here? The comment in the code doesn't really make it clear what you are doing or why.

+  /* We could change name which is a target of transparent alias
+     chain of instrumented function name.  Fix alias chain if so.  */

So are you saying that we want to change the name? Or that it could have been changed and we have to adjust something because it was changed? I'm certainly not as familiar with the LTO stuff as I should be -- what is the purpose behind chaining the DECL_ASSEMBLER_NAME nodes? Something gross:

/* Nonzero in an IDENTIFIER_NODE if the name is a local alias, whose
   uses are to be substituted for uses of the TREE_CHAINed identifier.  */
#define IDENTIFIER_TRANSPARENT_ALIAS(NODE) \
  (IDENTIFIER_NODE_CHECK (NODE)->base.deprecated_flag)

this should be all moved to the symbol table level. (and IDENTIFIER_NODE shouldn't have to have tree_common.chain and thus become smaller). Richard. jeff
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On Mon, Jun 2, 2014 at 12:48 PM, Ilya Enkovich enkovich@gmail.com wrote: On 30 May 10:59, Jeff Law wrote: On 05/29/14 05:05, Ilya Enkovich wrote: Hi, This patch allows to perform function versioning when some structures are not available yet. It is required to make clones for Pointer Bounds Checker right after SSA build. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-inline.c (copy_cfg_body): Check loop tree existence before accessing it. (tree_function_versioning): Check DF info existence before accessing it.

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4293241..23fef90 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2544,7 +2544,8 @@ copy_cfg_body (copy_body_data * id, gcov_type count, int frequency_scale,
 
   /* If the loop tree in the source function needed fixup, mark the
      destination loop tree for fixup, too.  */
-  if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
+  if (loops_for_fn (src_cfun)
+      && loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
     loops_state_set (LOOPS_NEED_FIXUP);

Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch? I suppose it is just a simple bug with an absent NULL pointer check. Here is the original code:

  /* Duplicate the loop tree, if available and wanted.  */
  if (loops_for_fn (src_cfun) != NULL && current_loops != NULL)
    {
      copy_loops (id, entry_block_map->loop_father, get_loop (src_cfun, 0));
      /* Defer to cfgcleanup to update loop-father fields of basic-blocks.  */
      loops_state_set (LOOPS_NEED_FIXUP);
    }
  /* If the loop tree in the source function needed fixup, mark the
     destination loop tree for fixup, too.  */
  if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
    loops_state_set (LOOPS_NEED_FIXUP);

As you may see, we have a check for an absent loops structure in the first if-statement and no check in the second one.
I hit a segfault and added the check. Actually after SSA is built we always have loops (loops are built once we build the CFG, which happens earlier). So all the above checks are no longer necessary now. Similarly for the PTA info, we just build it from scratch in the copy at some point? Here we also have conditional access like

  /* Reset the escaped solution.  */
  if (cfun->gimple_df)
    pt_solution_reset (&cfun->gimple_df->escaped);

and a following unconditional one I've fixed. Likewise. The init_data_structures pass initializes this (init_tree_ssa). So I'm not sure why you need all this given we are in SSA form? Richard. Assuming the answers to both are yes, then this patch is OK for the trunk when the rest of the patches are approved. It's not great, but it's OK. Thanks! Ilya jeff
Re: [GSoC][match-and-simplify] add bitwise patterns to match.pd
On Mon, Jun 2, 2014 at 1:33 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: Hi, I have tried to add a few bitwise patterns from tree-ssa-forwprop.c:simplify_bitwise_binary and the patterns mentioned in the Simplifications wiki (https://gcc.gnu.org/wiki/Simplifications). How to write a test-case to match multiple gimple statements? Example: For the pattern ~x | ~y -> ~(x & y), test-case:

int f(int x, int y)
{
  int t1 = ~x;
  int t2 = ~y;
  return t1 | t2;
}

-fdump-tree-forwprop-details output:

gimple_match_and_simplified to _5 = ~_7;
f (int x, int y)
{
  int t2;
  int t1;
  int _5;
  int _7;

  <bb 2>:
  t1_2 = ~x_1(D);
  t2_4 = ~y_3(D);
  _7 = x_1(D) & y_3(D);
  _5 = ~_7;
  return _5;
}

I suppose we want to look for matching both:

  _7 = x_1(D) & y_3(D);
  _5 = ~_7;

so only matching on "gimple_match_and_simplified to _5 = ~_7" won't be correct? Yeah, that's a forwprop debugging dump issue and/or of the API it uses. The gimple_match_and_simplify overload using a gimple_stmt_iterator * inserts the other stmts before it instead of delaying that to the caller (and gives it the chance to dump it). You can do the old-school testing by scanning for the IL itself, not some gimple_match_and_simplified dump. Thus in addition to the above also scan for

{ dg-final { scan-tree-dump "\[xy\]_\\d\+\\(D\\) & \[xy\]_\\d\+\\(D\\)" } }

The patterns for x & 0 -> 0 and x & -1 -> x don't get fired from forwprop. I tried:

int f1(int x)
{
  int t1 = 0;
  return x & t1;
}

-fdump-tree-forwprop-details shows the following:

;; Function f1 (f1, funcdef_no=0, decl_uid=1743, symbol_order=0)
f1 (int x)
{
  int t1;

  <bb 2>:
  return 0;
}

That's because constant propagation (which runs before forwprop) already simplified the statement. I have a patch to make use of match-and-simplify from gimple_fold_stmt_to_constant but I have to clean it up. I'll give writing testcases a thought there (hopefully will post and commit a patch later today). Richard.

[gcc]
* match.pd: Add few patterns from simplify_bitwise_binary.
[gcc/testsuite] * gcc.dg/tree-ssa/match-2.c: Add more test-cases. Thanks and Regards, Prathamesh
[PATCH][match-and-simplify]
This is a patch I worked on last week - it makes gimple_fold_stmt_to_constant[_1] use gimple_match_and_simplify (and avoids building trees we throw away because they are not gimple values). It also adds basic constant folding patterns (and comments on what is missing in match.pd - the actual patch I was using was comparing the result from both simplification methods and errors if they don't agree). The patch below (which has been committed) keeps the old path to not regress due to missing patterns. Committed to the branch. Richard.

2014-06-02  Richard Biener  rguent...@suse.de

	* gimple-fold.c (gimple_fold_stmt_to_constant_1): Rename to
	(gimple_fold_stmt_to_constant_2): ... this and make static.
	(gimple_fold_stmt_to_constant_1): New wrapper using
	gimple_match_and_simplify before falling back to
	gimple_fold_stmt_to_constant_2.
	* match.pd: Add basic constant folding patterns.
	Add two commutations.

Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c	(revision 211131)
+++ gcc/gimple-fold.c	(working copy)
@@ -2524,8 +2524,8 @@ maybe_fold_or_comparisons (enum tree_cod
    privatized with the single valueize function used in the various TUs
    to avoid the indirect function call overhead.  */
 
-tree
-gimple_fold_stmt_to_constant_1 (gimple stmt, tree (*valueize) (tree))
+static tree
+gimple_fold_stmt_to_constant_2 (gimple stmt, tree (*valueize) (tree))
 {
   location_t loc = gimple_location (stmt);
   switch (gimple_code (stmt))
@@ -2803,6 +2803,30 @@ gimple_fold_stmt_to_constant_1 (gimple s
     }
 }
 
+tree
+gimple_fold_stmt_to_constant_1 (gimple stmt, tree (*valueize) (tree))
+{
+  tree lhs = gimple_get_lhs (stmt);
+  if (lhs)
+    {
+      tree res = gimple_match_and_simplify (lhs, NULL, valueize);
+      if (res)
+	{
+	  if (dump_file && dump_flags & TDF_DETAILS)
+	    {
+	      fprintf (dump_file, "Match-and-simplified definition of ");
+	      print_generic_expr (dump_file, lhs, 0);
+	      fprintf (dump_file, " to ");
+	      print_generic_expr (dump_file, res, 0);
+	      fprintf (dump_file, "\n");
+	    }
+	  return res;
+	}
+    }
+  /* ???
For now, to avoid regressions. */ + return gimple_fold_stmt_to_constant_2 (stmt, valueize); +} + /* Fold STMT to a constant using VALUEIZE to valueize SSA names. Returns NULL_TREE if folding to a constant is not possible, otherwise returns a constant according to is_gimple_min_invariant. */ Index: gcc/match.pd === --- gcc/match.pd(revision 211131) +++ gcc/match.pd(working copy) @@ -21,6 +21,95 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ +/* Simple constant foldings to substitute gimple_fold_stmt_to_constant_2. */ +(match_and_simplify + (plus @0 integer_zerop) + @0) +(match_and_simplify + (pointer_plus @0 integer_zerop) + @0) +(match_and_simplify + (minus @0 integer_zerop) + @0) +(match_and_simplify + (minus @0 @0) + { build_zero_cst (type); }) +(match_and_simplify + (mult @0 integer_zerop@1) + @1) +(match_and_simplify + (mult @0 integer_onep) + @0) +(match_and_simplify + (trunc_div @0 integer_onep) + @0) +/* It's hard to preserve non-folding of / 0 which is done by a + positional check in fold-const.c (to preserve warnings). The + issue here is that we fold too early in frontends. + Also fold happilt folds 0 / x to 0 (even if x turns out to be zero later). 
*/ +(match_and_simplify + (trunc_div integer_zerop@0 @1) + @0) +(match_and_simplify + (trunc_div @0 @0) + { build_one_cst (type); }) +(match_and_simplify + (trunc_mod @0 integer_onep) + { build_zero_cst (type); }) +(match_and_simplify + (trunc_mod integer_zerop@0 @1) + @0) +(match_and_simplify + (trunc_mod @0 @0) + { build_zero_cst (type); }) +(match_and_simplify + (bit_ior @0 integer_zerop) + @0) +(match_and_simplify + (bit_ior @0 integer_all_onesp@1) + @1) +(match_and_simplify + (bit_and @0 integer_all_onesp) + @0) +(match_and_simplify + (bit_and @0 integer_zerop@1) + @1) +(match_and_simplify + (bit_xor @0 integer_zerop) + @0) +(match_and_simplify + (bit_xor @0 @0) + { build_zero_cst (type); }) +/* tree-ssa/ifc-pr44710.c requires a b ? c : d to fold to 1. + ??? probably runs into issue of recursive folding of a b op0. */ +/* tree-ssa/ssa-ccp-16.c wants to fold hello[i_2] to 0 + (fold_const_aggregate_ref_1). */ +/* tree-ssa/ssa-ccp-19.c wants to fold a1_3-i to MEM[(void *)a] + (get_addr_base_and_unit_offset_1). */ +/* tree-ssa/ssa-ccp-22.c wants to fold b_2(D) = t_1 to 1. + We are missing compare constant folding to type boundaries. */ + +/* The following is simplification done by gimple_fold_stmt_to_constant_1 + to aid propagation engines, producing is_gimple_min_invariants from
Re: [GSoC][match-and-simplify] add bitwise patterns to match.pd
On Mon, Jun 2, 2014 at 2:53 PM, Richard Biener richard.guent...@gmail.com wrote: On Mon, Jun 2, 2014 at 1:33 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: Hi, I have tried to add a few bitwise patterns from tree-ssa-forwprop.c:simplify_bitwise_binary and the patterns mentioned in the Simplifications wiki (https://gcc.gnu.org/wiki/Simplifications). How to write a test-case to match multiple gimple statements? Example: For the pattern: ~x | ~y -> ~(x & y) test-case:

int f(int x, int y)
{
  int t1 = ~x;
  int t2 = ~y;
  return t1 | t2;
}

-fdump-tree-forwprop-details output: gimple_match_and_simplified to _5 = ~_7;

f (int x, int y)
{
  int t2;
  int t1;
  int _5;
  int _7;

  <bb 2>:
  t1_2 = ~x_1(D);
  t2_4 = ~y_3(D);
  _7 = x_1(D) & y_3(D);
  _5 = ~_7;
  return _5;
}

I suppose we want to look for matching both: _7 = x_1(D) & y_3(D); _5 = ~_7; so only matching on "gimple_match_and_simplified to _5 = ~_7" won't be correct? Yeah, that's a forwprop debugging dump issue and/or of the API it uses. The gimple_match_and_simplify overload using a gimple_stmt_iterator * inserts the other stmts before it instead of delaying that to the caller (and gives it the chance to dump it). You can do the old-school testing by scanning for the IL itself, not some gimple_match_and_simplified dump. Thus in addition to the above also scan for { dg-final { scan-tree-dump "\[xy\]_\\d\+\\(D\\) & \[xy\]_\\d\+\\(D\\)" } } The patterns for x & 0 -> 0 and x & -1 -> x don't get fired from forwprop. I tried:

int f1(int x)
{
  int t1 = 0;
  return x & t1;
}

-fdump-tree-forwprop-details shows the following:

;; Function f1 (f1, funcdef_no=0, decl_uid=1743, symbol_order=0)

f1 (int x)
{
  int t1;

  <bb 2>:
  return 0;
}

That's because constant propagation (which runs before forwprop) already simplified the statement. I have a patch to make use of match-and-simplify from gimple_fold_stmt_to_constant but I have to clean it up. I'll give writing testcases a thought there (hopefully will post and commit a patch later today).
Btw,

/* x & 0 -> 0 */
(match_and_simplify
  (bit_and @0 @1)
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && @1 == integer_zero_node)
  { integer_zero_node; })

/* x & -1 -> x */
(match_and_simplify
  (bit_and @0 @1)
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && @1 == integer_minus_one_node)
  @0)

are too restrictive due to comparing @1 against very special tree nodes. The patch I just committed uses integer_zerop and integer_all_onesp instead. I have simplified some of the if-exprs, operands are guaranteed to match. Also if we settle on the solution of providing 'type' as the type of the outermost expression we can simplify them, as bitwise expressions don't change types. Most of the patterns would also apply in commutated form, thus I guess thinking of a good solution for that is next on the list after the decision tree stuff.

/* ((a & b) & ~a) & ~b -> 0 */
(match_and_simplify
  (bit_and (bit_and (bit_and @0 @1) (bit_not @0)) (bit_not @1))
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
  { integer_zero_node; })

isn't this too complex, and instead already (a & b) & ~a is 0? Also that would be still doing two steps in one as we already have a & ~a -> 0, so the pattern (if wanted) should only do the association (a & b) & ~a -> (a & ~a) & b? (note that generally this is something for a reassociation pass, not for simple pattern matching) I have applied the patch with these changes. Richard. Richard. [gcc] * match.pd: Add few patterns from simplify_bitwise_binary. [gcc/testsuite] * gcc.dg/tree-ssa/match-2.c: Add more test-cases. Thanks and Regards, Prathamesh
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
On Mon, Jun 2, 2014 at 2:44 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 15:35 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Early local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Early local passes are a special case because these passes are executed separately for new functions. I did not want to get three special passes instead and therefore made the split inside. Yeah, but all passes are already executed via execute_early_local_passes, so it would be only an implementation detail. If you prefer to split the pass itself, I suppose pass_early_local_passes may be replaced with something like pass_build_ssa_passes + pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks + pass_local_optimization_passes. execute_early_local_passes would execute the gimple pass lists of pass_build_ssa_passes, pass_chkp_instrumentation_passes and pass_local_optimization_passes. I think we cannot have the first stage moved into the lowering passes because it should be executed for newly created functions. Well, let's defer that then. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be necessary to run it from the first group. OK. Will try to remove it from the first group.
void
pass_manager::execute_early_local_passes ()
{
-  execute_pass_list (pass_early_local_passes_1->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->next->next->sub);
}

is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). This function should call only gimple passes for cfun. Therefore we cannot call IPA passes here and have to execute each gimple pass list separately. Ok, given a different split this would then become somewhat more sane anyway. Richard. Ilya Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: Include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New
[PATCH] Fix part of PR61383
In comment #3 it's noted that ifcombine happily hoists possibly trapping stmts ... oops. Fixed like the following, bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-06-02 Richard Biener rguent...@suse.de PR tree-optimization/61383 * tree-ssa-ifcombine.c (bb_no_side_effects_p): Make sure stmts can't trap. * gcc.dg/torture/pr61383-1.c: New testcase.

Index: gcc/tree-ssa-ifcombine.c
===
*** gcc/tree-ssa-ifcombine.c	(revision 211125)
--- gcc/tree-ssa-ifcombine.c	(working copy)
*** bb_no_side_effects_p (basic_block bb)
*** 128,133 ****
--- 128,134 ----
        gimple stmt = gsi_stmt (gsi);

        if (gimple_has_side_effects (stmt)
+ 	  || gimple_could_trap_p (stmt)
  	  || gimple_vuse (stmt))
  	return false;
      }

Index: gcc/testsuite/gcc.dg/torture/pr61383-1.c
===
*** gcc/testsuite/gcc.dg/torture/pr61383-1.c	(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr61383-1.c	(working copy)
*** 0 ****
--- 1,35 ----
+ /* { dg-do run } */
+
+ int a, b = 1, c, d, e, f, g;
+
+ int
+ fn1 ()
+ {
+   int h;
+   for (;;)
+     {
+       g = b;
+       g = g ? 0 : 1 % g;
+       e = a + 1;
+       for (; d < 1; d = e)
+ 	{
+ 	  if (f == 0)
+ 	    h = 0;
+ 	  else
+ 	    h = 1 % f;
+ 	  if (f < 1)
+ 	    c = 0;
+ 	  else if (h)
+ 	    break;
+ 	}
+       if (b)
+ 	return 0;
+     }
+ }
+
+ int
+ main ()
+ {
+   fn1 ();
+   return 0;
+ }
Re: Eliminate write-only variables
On June 2, 2014 5:44:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: Well, I'm hesitant to add a new pass just to optimize an (irrelevant in practice) benchmark. I'm ok with strengthening existing infrastructure or enhancing existing passes. The issue with these mini-passes is that they are very placement sensitive and you don't easily get secondary effects. See the comment (and patch) about DCE - the ??? comment can be addressed the same way Bernd addressed it (use cgraph_function_possibly_inlined_p). Would that optimize the testcase? Note that the issue with nested function use prevails (not sure how to solve that - same issue in Bernd's patch). I think this part of the patch can be made much cleaner by simply adding a used-by-one-function-only flag to variables. We can compute it at the end of the IPA queue and remove the code playing with local statics, nested functions and inlines. I can easily implement this analysis - perhaps it would be useful for AA or something else, too? Yeah, I discussed this with Martin today on IRC. For aliasing we'd like to know whether a decl possibly has its address taken. Currently we only trust TREE_ADDRESSABLE for statics - and LTO might change those to hidden-visibility public :( So we want decl_referred_to_by_single_function and decl_may_have_address_taken. Richard. Honza Thanks, Richard. -Sandra
Re: [PATCH] Fix PR61328: fully initialize symbolic number before using it
On June 3, 2014 9:18:29 AM CEST, Thomas Preud'homme thomas.preudho...@arm.com wrote: When a bitwise OR gimple statement has for operands SSA_NAMEs initialized directly from a memory source (no cast or other unary statement intervening), a symbolic number will be used only partly initialized. This was discovered by valgrind and reported as PR61328. This patch fixes that by moving the initialization code into a separate function that can be called from the two places that need it. There was also a problem of a field of a structure that was set in a function, and the value of this field was read before checking the result of the function call. This would lead to missed optimization. ChangeLog is as follows: 2014-05-29 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/61328 * tree-ssa-math-opts.c (init_symbolic_number): Extract symbolic number initialization from find_bswap_or_nop_1. (find_bswap_or_nop_1): Test return value of find_bswap_or_nop_1 stored in source_expr2 before using the size value the function sets. Also make use of init_symbolic_number () in both the old place and find_bswap_or_nop_load () to avoid reading uninitialized memory when doing recursion in the GIMPLE_BINARY_RHS case. Ok for trunk? OK. Thanks, Richard.

diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index d9afccf..6c26d6d 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1701,6 +1701,30 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
   return true;
 }

+/* Initialize the symbolic number N for the bswap pass from the base element
+   SRC manipulated by the bitwise OR expression.  */
+
+static bool
+init_symbolic_number (struct symbolic_number *n, tree src)
+{
+  n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
+
+  /* Set up the symbolic number N by setting each byte to a value between 1 and
+     the byte size of rhs1.  The highest order byte is set to n->size and the
+     lowest order byte to 1.  */
+  n->size = TYPE_PRECISION (TREE_TYPE (src));
+  if (n->size % BITS_PER_UNIT != 0)
+    return false;
+  n->size /= BITS_PER_UNIT;
+  n->range = n->size;
+  n->n = CMPNOP;
+
+  if (n->size < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+
+  return true;
+}
+
 /* Check if STMT might be a byte swap or a nop from a memory source and returns
    the answer.  If so, REF is that memory source and the base of the memory area
    accessed and the offset of the access from that base are recorded in N.  */
@@ -1713,26 +1737,27 @@ find_bswap_or_nop_load (gimple stmt, tree ref, struct symbolic_number *n)
   HOST_WIDE_INT bitsize, bitpos;
   enum machine_mode mode;
   int unsignedp, volatilep;
+  tree offset, base_addr;

   if (!gimple_assign_load_p (stmt) || gimple_has_volatile_ops (stmt))
     return false;

-  n->base_addr = get_inner_reference (ref, &bitsize, &bitpos, &n->offset,
-				      &mode, &unsignedp, &volatilep, false);
+  base_addr = get_inner_reference (ref, &bitsize, &bitpos, &offset, &mode,
+				   &unsignedp, &volatilep, false);

-  if (TREE_CODE (n->base_addr) == MEM_REF)
+  if (TREE_CODE (base_addr) == MEM_REF)
     {
       offset_int bit_offset = 0;
-      tree off = TREE_OPERAND (n->base_addr, 1);
+      tree off = TREE_OPERAND (base_addr, 1);

       if (!integer_zerop (off))
	{
-	  offset_int boff, coff = mem_ref_offset (n->base_addr);
+	  offset_int boff, coff = mem_ref_offset (base_addr);
	  boff = wi::lshift (coff, LOG2_BITS_PER_UNIT);
	  bit_offset += boff;
	}

-      n->base_addr = TREE_OPERAND (n->base_addr, 0);
+      base_addr = TREE_OPERAND (base_addr, 0);

       /* Avoid returning a negative bitpos as this may wreak havoc later.  */
       if (wi::neg_p (bit_offset))
@@ -1743,11 +1768,11 @@ find_bswap_or_nop_load (gimple stmt, tree ref, struct symbolic_number *n)
	     Subtract it to BIT_OFFSET and add it (scaled) to OFFSET.  */
	  bit_offset -= tem;
	  tem = wi::arshift (tem, LOG2_BITS_PER_UNIT);
-	  if (n->offset)
-	    n->offset = size_binop (PLUS_EXPR, n->offset,
				    wide_int_to_tree (sizetype, tem));
+	  if (offset)
+	    offset = size_binop (PLUS_EXPR, offset,
				 wide_int_to_tree (sizetype, tem));
	  else
-	    n->offset = wide_int_to_tree (sizetype, tem);
+	    offset = wide_int_to_tree (sizetype, tem);
	}

       bitpos += bit_offset.to_shwi ();
@@ -1758,6 +1783,9 @@ find_bswap_or_nop_load (gimple stmt, tree ref, struct symbolic_number *n)
   if (bitsize % BITS_PER_UNIT)
     return false;

+  init_symbolic_number (n, ref);
+  n->base_addr = base_addr;
+  n->offset = offset;
   n->bytepos = bitpos / BITS_PER_UNIT;
   n->alias_set = reference_alias_ptr_type (ref);
   n->vuse = gimple_vuse (stmt);
@@ -1816,28 +1844,12 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
   /* If find_bswap_or_nop_1 returned NULL,
Re: [PATCH][match-and-simplify]
On Mon, 2 Jun 2014, Marc Glisse wrote:

(plus (bit_not @0) @0)
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  { build_int_cst (TREE_TYPE (@0), -1); })
+(match_and_simplify
+  (plus @0 (bit_not @0))
+  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+  { build_int_cst (TREE_TYPE (@0), -1); })

Why not just:

(match_and_simplify
  (plus @0 (bit_not @0))
  { build_all_ones_cst (TREE_TYPE (@0)); })

? Works for vector/complex, I don't know what type a bit_not_expr can have where the simplification wouldn't be true. Sure. Richard. 2014-06-03 Richard Biener rguent...@suse.de * match.pd: Generalize ~A + A -> -1 simplification to all types.

Index: gcc/match.pd
===
--- gcc/match.pd	(revision 211135)
+++ gcc/match.pd	(working copy)
@@ -185,12 +185,10 @@ to (minus @1 @0)
 /* ~A + A -> -1 */
 (match_and_simplify
   (plus (bit_not @0) @0)
-  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
-  { build_int_cst (TREE_TYPE (@0), -1); })
+  { build_all_ones_cst (type); })
 (match_and_simplify
   (plus @0 (bit_not @0))
-  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
-  { build_int_cst (TREE_TYPE (@0), -1); })
+  { build_all_ones_cst (type); })
 /* ~A + 1 -> -A */
 (match_and_simplify
Re: [PATCH, i386, Pointer Bounds Checker 18/x] Expand instrumented builtin function calls
On Mon, Jun 2, 2014 at 4:51 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds support for normal builtin function calls (target ones are not instrumented). The basic idea of the patch is to make a call expr copy with no bounds and expand it instead. If the expr is going to be emitted as a function call, then the original instrumented expr takes place. String functions are handled separately because they are of high interest for the checker. It seems this should be user-controllable (always expand builtins with bounds as calls, or not), rather than deciding what is of interest yourself. From a high-level perspective the expansion path doing inline expansion should be separated from that expanding as a call (or expanding as a different call). I don't like the building of yet another GENERIC call expr in this code ... it goes very much backwards of the idea that we should expand from the GIMPLE representation. You are making such a transition even harder. Can't you do that frobbing from cfgexpand.c:expand_call_stmt instead (please!)? Thanks, Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * builtins.c: Include rtl-chkp.h, tree-chkp.h. (expand_builtin_mempcpy_args): Add orig exp as argument. Support BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK. (expand_builtin_mempcpy): Adjust expand_builtin_mempcpy_args call. (expand_builtin_stpcpy): Likewise. (expand_builtin_memset_args): Support BUILT_IN_CHKP_MEMSET_NOBND_NOCHK. (std_expand_builtin_va_start): Initialize bounds for va_list. (expand_builtin): Support instrumented calls.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7357124..c0140e1 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -59,6 +59,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "ubsan.h"
 #include "cilk.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"

 static tree do_mpc_arg1 (tree, tree, int (*)(mpc_ptr, mpc_srcptr, mpc_rnd_t));
@@ -124,7 +126,7 @@
 static rtx builtin_memcpy_read_str (void *, HOST_WIDE_INT, enum machine_mode);
 static rtx expand_builtin_memcpy (tree, rtx);
 static rtx expand_builtin_mempcpy (tree, rtx, enum machine_mode);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx,
-					enum machine_mode, int);
+					enum machine_mode, int, tree);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, enum machine_mode);
@@ -3284,7 +3286,8 @@ expand_builtin_mempcpy (tree exp, rtx target, enum machine_mode mode)
       tree src = CALL_EXPR_ARG (exp, 1);
       tree len = CALL_EXPR_ARG (exp, 2);
       return expand_builtin_mempcpy_args (dest, src, len,
-					  target, mode, /*endp=*/ 1);
+					  target, mode, /*endp=*/ 1,
+					  exp);
     }
 }
@@ -3296,10 +3299,23 @@
 static rtx
 expand_builtin_mempcpy_args (tree dest, tree src, tree len,
-			     rtx target, enum machine_mode mode, int endp)
+			     rtx target, enum machine_mode mode, int endp,
+			     tree orig_exp)
 {
+  tree fndecl = get_callee_fndecl (orig_exp);
+
   /* If return value is ignored, transform mempcpy into memcpy.  */
-  if (target == const0_rtx && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
+  if (target == const0_rtx
+      && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK
+      && builtin_decl_implicit_p (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK))
+    {
+      tree fn = builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK);
+      tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
+					   dest, src, len);
+      return expand_expr (result, target, mode, EXPAND_NORMAL);
+    }
+  else if (target == const0_rtx
+	   && builtin_decl_implicit_p (BUILT_IN_MEMCPY))
     {
       tree fn = builtin_decl_implicit (BUILT_IN_MEMCPY);
       tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3,
@@ -3484,7 +3500,8 @@ expand_builtin_stpcpy (tree exp, rtx target, enum machine_mode mode)
       lenp1 = size_binop_loc (loc, PLUS_EXPR, len, ssize_int (1));
       ret = expand_builtin_mempcpy_args (dst, src, lenp1,
-					 target, mode, /*endp=*/2);
+					 target, mode, /*endp=*/2,
+					 exp);

       if (ret)
	return ret;
@@ -3778,7 +3795,8 @@ expand_builtin_memset_args (tree dest, tree val, tree len,
  do_libcall:
   fndecl = get_callee_fndecl (orig_exp);
   fcode =
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
On Mon, Jun 2, 2014 at 5:13 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 17:37 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Mon, Jun 2, 2014 at 2:44 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 15:35 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Early local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Early local passes are a special case because these passes are executed separately for new functions. I did not want to get three special passes instead and therefore made the split inside. Yeah, but all passes are already executed via execute_early_local_passes, so it would be only an implementation detail. If you prefer to split the pass itself, I suppose pass_early_local_passes may be replaced with something like pass_build_ssa_passes + pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks + pass_local_optimization_passes. execute_early_local_passes would execute the gimple pass lists of pass_build_ssa_passes, pass_chkp_instrumentation_passes and pass_local_optimization_passes. I think we cannot have the first stage moved into the lowering passes because it should be executed for newly created functions. Well, let's defer that then. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be necessary to run it from the first group. OK. Will try to remove it from the first group.

void
pass_manager::execute_early_local_passes ()
{
-  execute_pass_list (pass_early_local_passes_1->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->sub);
+  execute_pass_list (pass_early_local_passes_1->sub->next->next->next->sub);
}

is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). This function should call only gimple passes for cfun. Therefore we cannot call IPA passes here and have to execute each gimple pass list separately. Ok, given a different split this would then become somewhat more sane anyway. Sorry, didn't catch it. Should I try a different split or defer it? :) Please try a different split. Defer moving the first part to the lowering stage. Richard. Ilya Richard. Ilya Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: Include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps
Re: [PATCH, Pointer Bounds Checker 20/x] Follow transparent alias chains
On Mon, Jun 2, 2014 at 5:15 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, In most cases we follow transparent alias chains when we assemble names, but in some cases it is not done. For instrumented functions it is critical, and the following patch fixes that. It also adds visibility inheritance for instrumented functions. It feels like this should be handled by the symtab code nowadays ... Honza? Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * varasm.c: Include tree-chkp.h. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is an instrumentation clone of a global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise.

diff --git a/gcc/varasm.c b/gcc/varasm.c
index fcae2fa..d473bc7 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "pointer-set.h"
 #include "asan.h"
 #include "basic-block.h"
+#include "tree-chkp.h"

 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data
@@ -1200,6 +1201,30 @@ use_blocks_for_decl_p (tree decl)
   return targetm.use_blocks_for_decl_p (decl);
 }

+/* Follow the IDENTIFIER_TRANSPARENT_ALIAS chain starting at *ALIAS
+   until we find an identifier that is not itself a transparent alias.
+   Modify the alias passed to it by reference (and all aliases on the
+   way to the ultimate target), such that they do not have to be
+   followed again, and return the ultimate target of the alias
+   chain.  */
+
+static inline tree
+ultimate_transparent_alias_target (tree *alias)
+{
+  tree target = *alias;
+
+  if (IDENTIFIER_TRANSPARENT_ALIAS (target))
+    {
+      gcc_assert (TREE_CHAIN (target));
+      target = ultimate_transparent_alias_target (&TREE_CHAIN (target));
+      gcc_assert (! IDENTIFIER_TRANSPARENT_ALIAS (target)
+		  && ! TREE_CHAIN (target));
+      *alias = target;
+    }
+
+  return target;
+}
+
 /* Create the DECL_RTL for a VAR_DECL or FUNCTION_DECL.  DECL should
    have static storage duration.  In other words, it should not be an
    automatic variable, including PARM_DECLs.
@@ -1214,6 +1239,7 @@ make_decl_rtl (tree decl)
 {
   const char *name = 0;
   int reg_number;
+  tree id;
   rtx x;

   /* Check that we are not being given an automatic variable.  */
@@ -1271,7 +1297,12 @@ make_decl_rtl (tree decl)
       return;
     }

-  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+  id = DECL_ASSEMBLER_NAME (decl);
+  if (TREE_CODE (decl) == FUNCTION_DECL
+      && cgraph_get_node (decl)
+      && cgraph_get_node (decl)->instrumentation_clone)
+    ultimate_transparent_alias_target (&id);
+  name = IDENTIFIER_POINTER (id);

   if (name[0] != '*' && TREE_CODE (decl) != FUNCTION_DECL
       && DECL_REGISTER (decl))
@@ -1699,7 +1730,10 @@ assemble_start_function (tree decl, const char *fnname)

   /* Make function name accessible from other files, if appropriate.  */

-  if (TREE_PUBLIC (decl))
+  if (TREE_PUBLIC (decl)
+      || (cgraph_get_node (decl)->instrumentation_clone
+	  && cgraph_get_node (decl)->instrumented_version
+	  && TREE_PUBLIC (cgraph_get_node (decl)->instrumented_version->decl)))
     {
       notice_global_symbol (decl);
@@ -2386,30 +2420,6 @@ mark_decl_referenced (tree decl)
 }

-/* Follow the IDENTIFIER_TRANSPARENT_ALIAS chain starting at *ALIAS
-   until we find an identifier that is not itself a transparent alias.
-   Modify the alias passed to it by reference (and all aliases on the
-   way to the ultimate target), such that they do not have to be
-   followed again, and return the ultimate target of the alias
-   chain.  */
-
-static inline tree
-ultimate_transparent_alias_target (tree *alias)
-{
-  tree target = *alias;
-
-  if (IDENTIFIER_TRANSPARENT_ALIAS (target))
-    {
-      gcc_assert (TREE_CHAIN (target));
-      target = ultimate_transparent_alias_target (&TREE_CHAIN (target));
-      gcc_assert (! IDENTIFIER_TRANSPARENT_ALIAS (target)
-		  && ! TREE_CHAIN (target));
-      *alias = target;
-    }
-
-  return target;
-}
-
 /* Output to FILE (an assembly file) a reference to NAME.  If NAME
    starts with a *, the rest of NAME is output verbatim.  Otherwise
    NAME is transformed in a target-specific way (usually by the
@@ -5544,6 +5554,9 @@ vec<alias_pair, va_gc> *alias_pairs;
 void
 do_assemble_alias (tree decl, tree target)
 {
+  tree orig_decl = decl;
+  tree id;
+
   /* Emulated
Re: [PATCH, Pointer Bounds Checker 21/x] Weakrefs output
On Mon, Jun 2, 2014 at 5:22 PM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch prevents output of both instrumented and not instrumented weakref variants.

Shouldn't one of them be reclaimed instead at some point?

Richard.

Thanks,
Ilya
--
gcc/

2014-06-02  Ilya Enkovich  ilya.enkov...@intel.com

	* cgraphunit.c (output_weakrefs): If there are both instrumented
	and original versions, output only one of them.

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c5c..ae9e699 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2111,9 +2111,13 @@ static void
 output_weakrefs (void)
 {
   symtab_node *node;
+  cgraph_node *cnode;
   FOR_EACH_SYMBOL (node)
     if (node->alias
         && !TREE_ASM_WRITTEN (node->decl)
+        && (!(cnode = dyn_cast <cgraph_node> (node))
+            || !cnode->instrumented_version
+            || !TREE_ASM_WRITTEN (cnode->instrumented_version->decl))
         && node->weakref)
       {
         tree target;
Re: [PATCH, Pointer Bounds Checker 22/x] Inline
On Mon, Jun 2, 2014 at 5:56 PM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch adds support for inlining instrumented calls. Changes are mostly to support returned bounds. Also generated mem-to-mem assignments are registered to be later instrumented with appropriate bounds copy.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-02  Ilya Enkovich  ilya.enkov...@intel.com

	* ipa-inline.c (early_inliner): Check edge has summary allocated.
	* tree-inline.c: Include tree-chkp.h.
	(declare_return_variable): Add arg holding returned bounds slot.
	Create and initialize returned bounds var.
	(remap_gimple_stmt): Handle returned bounds.  Return sequence
	of statements instead of a single statement.
	(insert_init_stmt): Add declaration.
	(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
	(copy_bb): Adjust to changed return type of remap_gimple_stmt.
	(expand_call_inline): Handle returned bounds.  Add bounds copy
	for generated mem to mem assignments.
	* tree-inline.h (copy_body_data): Add fields retbnd and
	assign_stmts.
	* cgraph.c: Include tree-chkp.h.
	(cgraph_redirect_edge_call_stmt_to_callee): Support returned
	bounds.
	* value-prof.c: Include tree-chkp.h.
	(gimple_ic): Support returned bounds.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 1f684c2..4b6996b 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "expr.h"
 #include "tree-dfa.h"
+#include "tree-chkp.h"

 /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this.  */
 #include "tree-pass.h"
@@ -1398,6 +1399,31 @@ cgraph_redirect_edge_call_stmt_to_callee (struct cgraph_edge *e)
       e->speculative = false;
       cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt,
					     new_stmt, false);
+
+      /* Fix edges for BUILT_IN_CHKP_BNDRET calls attached to the
	 processed call stmt.
*/
+      if (gimple_call_with_bounds_p (new_stmt)
+          && gimple_call_lhs (new_stmt)
+          && chkp_retbnd_call_by_val (gimple_call_lhs (e2->call_stmt)))
+        {
+          tree dresult = gimple_call_lhs (new_stmt);
+          tree iresult = gimple_call_lhs (e2->call_stmt);
+          gimple dbndret = chkp_retbnd_call_by_val (dresult);
+          gimple ibndret = chkp_retbnd_call_by_val (iresult);
+          struct cgraph_edge *iedge = cgraph_edge (e2->caller, ibndret);
+          struct cgraph_edge *dedge;
+
+          if (dbndret)
+            {
+              dedge = cgraph_create_edge (iedge->caller, iedge->callee,
+                                          dbndret, e->count,
+                                          e->frequency);
+              dedge->frequency = compute_call_stmt_bb_frequency
+                (dedge->caller->decl, gimple_bb (dedge->call_stmt));
+            }
+          iedge->frequency = compute_call_stmt_bb_frequency
+            (iedge->caller->decl, gimple_bb (iedge->call_stmt));
+        }
       e->frequency = compute_call_stmt_bb_frequency
         (e->caller->decl, gimple_bb (e->call_stmt));
       e2->frequency = compute_call_stmt_bb_frequency
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 4051819..a6fc853 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -2301,11 +2301,15 @@ early_inliner (void)
          info that might be cleared out for newly discovered edges.  */
       for (edge = node->callees; edge; edge = edge->next_callee)
         {
-          struct inline_edge_summary *es = inline_edge_summary (edge);
-          es->call_stmt_size
-            = estimate_num_insns (edge->call_stmt, eni_size_weights);
-          es->call_stmt_time
-            = estimate_num_insns (edge->call_stmt, eni_time_weights);
+          /* We have no summary for new bound store calls yet.
*/
+          if (inline_edge_summary_vec.length () > (unsigned) edge->uid)
+            {
+              struct inline_edge_summary *es = inline_edge_summary (edge);
+              es->call_stmt_size
+                = estimate_num_insns (edge->call_stmt, eni_size_weights);
+              es->call_stmt_time
+                = estimate_num_insns (edge->call_stmt, eni_time_weights);
+            }
           if (edge->callee->decl
               && !gimple_check_call_matching_types (
                    edge->call_stmt, edge->callee->decl, false))
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 23fef90..6557a95 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include
Re: Eliminate write-only variables
On Mon, Jun 2, 2014 at 8:59 PM, Jan Hubicka hubi...@ucw.cz wrote:

Yeah, I discussed this with Martin today on irc. For aliasing we'd like to know whether a decl possibly has its address taken. Currently we only trust TREE_ADDRESSABLE for statics - and LTO might change those to hidden-visibility public :( So we want decl_referred_to_by_single_function

OK, I will implement this one in an IPA mini-pass. It is easy. This property changes by cloning and inlining. What does Martin want to use it for? (I.e. my plan would be to compute it as a last IPA pass, making it useless for IPA analysis/propagation.)

He doesn't want to use it, but we talked and he said he'd have a look. Note that it's important the decls are not referred to by global initializers either. Why is a new pass necessary - can't the IPA reference maintaining code update some flag in the varpool magically?

and decl_may_have_address_taken. We currently clear TREE_ADDRESSABLE for statics that have no address taken at WPA time, and that ought to keep the flag cleared at ltrans (I think I even made a testcase for this). What I think we could improve is to not consider string functions as ADDR operations for this analysis, i.e. it is common to only memset to 0. How would may_have_address_taken differ here?

I want tree.h:may_be_aliased to return false if a decl doesn't have its address taken. We don't trust TREE_ADDRESSABLE for TREE_PUBLIC/DECL_EXTERNAL decls because other TUs may take the address. I want the check to be replaced with sth more symtab-aware, that is, bring in benefits from LTO (and at the same time not be confused by statics brought public with hidden visibility).

Richard.

Honza
Re: [PATCH] Fix ICE due to memory corruption in SRA
On Mon, Jun 2, 2014 at 11:21 PM, Teresa Johnson tejohn...@google.com wrote:

This patch fixes an ICE due to memory corruption discovered while building a large application with FDO and LIPO on the google branch. I don't have a small reproducer, but the same code appears on trunk, and I believe it could also silently result in incorrect code generation.

The problem occurs if SRA is applied on a recursive call. In this case, the redirect_callers vec below contains the recursive edge from node->node. When rebuild_cgraph_edges is invoked, it will free the callee edges of node, including the one recorded in redirect_callers. In the case I looked at, after rebuilding the cgraph edges for node, the address recorded in redirect_callers now pointed to a different cgraph edge, and we later got an ICE because the (incorrect) callee that we tried to modify had the wrong number of arguments. To fix, I simply moved the collection of caller edges to after the cgraph edge rebuilding.

Google ref b/15383777.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Ok.

Thanks,
Richard.

Thanks,
Teresa

2014-06-02  Teresa Johnson  tejohn...@google.com

	* tree-sra.c (modify_function): Record caller nodes after rebuild.

Index: tree-sra.c
===
--- tree-sra.c	(revision 211139)
+++ tree-sra.c	(working copy)
@@ -4925,12 +4925,15 @@ modify_function (struct cgraph_node *node, ipa_par
 {
   struct cgraph_node *new_node;
   bool cfg_changed;
-  vec<cgraph_edge_p> redirect_callers = collect_callers_of_node (node);

   rebuild_cgraph_edges ();
   free_dominance_info (CDI_DOMINATORS);
   pop_cfun ();

+  /* This must be done after rebuilding cgraph edges for node above.
+     Otherwise any recursive calls to node that are recorded in
+     redirect_callers will be corrupted.  */
+  vec<cgraph_edge_p> redirect_callers = collect_callers_of_node (node);
   new_node = cgraph_function_versioning (node, redirect_callers,
					 NULL, NULL, false, NULL, NULL,
					 "isra");

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: RFA: PATCH to ctor_for_folding for c++/61020 (ICE with typeid)
On Tue, Jun 3, 2014 at 3:39 AM, Jason Merrill ja...@redhat.com wrote: The C++ front end wants to be able to build up objects of the various typeinfo derived classes (such as __si_class_type_info) without needing to see the full definition of the class. As part of this we build a VAR_DECL for the vtable, but never define it because we don't actually have declarations of the virtual functions, we're only using it for external references. ctor_for_folding was assuming that all vtables are defined, which broke on this case. Tested x86_64-pc-linux-gnu. OK for trunk? Ok. Thanks, Richard.
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On Tue, Jun 3, 2014 at 7:55 AM, Ilya Enkovich enkovich@gmail.com wrote:

2014-06-02 21:27 GMT+04:00 Jeff Law l...@redhat.com:

On 06/02/14 04:48, Ilya Enkovich wrote:

Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch?

I suppose it is just a simple bug with an absent NULL pointer check. Here is the original code:

  /* Duplicate the loop tree, if available and wanted.  */
  if (loops_for_fn (src_cfun) != NULL
      && current_loops != NULL)
    {
      copy_loops (id, entry_block_map->loop_father,
		  get_loop (src_cfun, 0));
      /* Defer to cfgcleanup to update loop-father fields of basic-blocks.  */
      loops_state_set (LOOPS_NEED_FIXUP);
    }

  /* If the loop tree in the source function needed fixup, mark the
     destination loop tree for fixup, too.  */
  if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
    loops_state_set (LOOPS_NEED_FIXUP);

As you may see, we have a check for an absent loops structure in the first if-statement and no check in the second one. I hit a segfault and added the check.

Downthread you indicated you're not in SSA form which might explain the inconsistency here. If so, then we need to make sure that the loop df structures do get set up properly later.

That is what the init_data_structures pass will do for us as Richard pointed out. Right?

loops are set up during CFG construction and thus are available everywhere. The df structures are set up in the init_data_structures pass which is run before going into SSA form (I'd like to somehow clean up that area).

Richard.

Ilya

Jeff
Re: [PATCH, Pointer Bounds Checker 24/x] PRE
On Tue, Jun 3, 2014 at 9:13 AM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch preserves CALL_WITH_BOUNDS flag for calls during PRE. Ok. Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-03 Ilya Enkovich ilya.enkov...@intel.com * tree-ssa-pre.c (create_component_ref_by_pieces_1): Store CALL_WITH_BOUNDS_P for calls. (copy_reference_ops_from_call): Restore CALL_WITH_BOUNDS_P flag. diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c index 1e55356..d5b9f3b 100644 --- a/gcc/tree-ssa-pre.c +++ b/gcc/tree-ssa-pre.c @@ -2579,6 +2579,8 @@ create_component_ref_by_pieces_1 (basic_block block, vn_reference_t ref, (TREE_CODE (fn) == FUNCTION_DECL ? build_fold_addr_expr (fn) : fn), nargs, args); + if (currop-op2 == integer_one_node) + CALL_WITH_BOUNDS_P (folded) = true; free (args); if (sc) CALL_EXPR_STATIC_CHAIN (folded) = sc; diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c index f7ec8b6..e83d9dc 100644 --- a/gcc/tree-ssa-sccvn.c +++ b/gcc/tree-ssa-sccvn.c @@ -1124,6 +1124,8 @@ copy_reference_ops_from_call (gimple call, temp.opcode = CALL_EXPR; temp.op0 = gimple_call_fn (call); temp.op1 = gimple_call_chain (call); + if (gimple_call_with_bounds_p (call)) +temp.op2 = integer_one_node; temp.off = -1; result-safe_push (temp);
Re: [PATCH, Pointer Bounds Checker 25/x] DCE
On Tue, Jun 3, 2014 at 9:23 AM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch adjusts the alloc-free removal algorithm in DCE to take into account the BUILT_IN_CHKP_BNDRET call returning bounds of allocated memory.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-03  Ilya Enkovich  ilya.enkov...@intel.com

	* tree-ssa-dce.c: Include target.h.
	(propagate_necessity): For free call fed by alloc check bounds
	are also provided by the same alloc.
	(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
	used by free calls.

diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 13a71ce..59a0b71 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "flags.h"
 #include "cfgloop.h"
 #include "tree-scalar-evolution.h"
+#include "target.h"

 static struct stmt_stats
 {
@@ -778,7 +779,23 @@ propagate_necessity (bool aggressive)
		  && DECL_BUILT_IN_CLASS (def_callee) == BUILT_IN_NORMAL
		  && (DECL_FUNCTION_CODE (def_callee) == BUILT_IN_MALLOC
		      || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC))
-		continue;
+		{
+		  tree retfndecl
+		    = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET);
+		  gimple bounds_def_stmt;
+		  tree bounds;
+
+		  /* For instrumented calls we should also check used
+		     bounds are returned by the same allocation call.  */
+		  if (!gimple_call_with_bounds_p (stmt)
+		      || ((bounds = gimple_call_arg (stmt, 1))

Please provide an abstraction for this - I seem to recall seeing multiple (variants) of this. Esp. repeated calls to the target hook above look expensive to me (generally it will not be needed). I'd like to see gimple_call_bndarg (stmt) to get rid of the magic 1 and I'd like to see sth similar to gimple_call_builtin_p, for example gimple_call_chkp_p (stmt, BUILT_IN_CHKP_BNDRET) doing the target hook invocation and the fndecl check.

+			  && TREE_CODE (bounds) == SSA_NAME

What about constant bounds? That shouldn't make the call necessary either?

+			  && (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds))
+			  && is_gimple_call (bounds_def_stmt)
+			  && gimple_call_fndecl (bounds_def_stmt) == retfndecl
+			  && gimple_call_arg (bounds_def_stmt, 0) == ptr))
+		    continue;

So this all becomes

  if (!gimple_call_with_bounds_p (stmt)
      || ((bounds = gimple_call_bndarg (stmt))
	  && TREE_CODE (bounds) == SSA_NAME
	  && (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds))
	  && is_gimple_call (bounds_def_stmt)
	  && gimple_call_chkp_p (bounds_def_stmt, BUILT_IN_CHKP_BNDRET)
...

btw, I wonder how BUILT_IN_CHKP_BNDRET is in scope here but you need a target hook to compare the fndecl? Why isn't it enough to just check DECL_FUNCTION_CODE and DECL_BUILT_IN?

Richard.

+		}
	      }

	  FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
@@ -1204,6 +1221,23 @@ eliminate_unnecessary_stmts (void)
		      && !gimple_plf (def_stmt, STMT_NECESSARY))
		    gimple_set_plf (stmt, STMT_NECESSARY, false);
		}
+	      /* We did not propagate necessity for free calls fed
+		 by allocation function to allow unnecessary
+		 alloc-free sequence elimination.  For instrumented
+		 calls it also means we did not mark bounds producer
+		 as necessary and it is time to do it in case free
+		 call is not removed.  */
+	      if (gimple_call_with_bounds_p (stmt))
+		{
+		  gimple bounds_def_stmt;
+		  tree bounds = gimple_call_arg (stmt, 1);
+		  gcc_assert (TREE_CODE (bounds) == SSA_NAME);
+		  bounds_def_stmt = SSA_NAME_DEF_STMT (bounds);
+		  if (bounds_def_stmt
+		      && !gimple_plf (bounds_def_stmt, STMT_NECESSARY))
+		    gimple_set_plf (bounds_def_stmt, STMT_NECESSARY,
+				    gimple_plf (stmt, STMT_NECESSARY));
+		}
	    }

	  /* If GSI is not necessary then remove it.  */
@@ -1216,6 +1250,7 @@ eliminate_unnecessary_stmts (void)
	  else if (is_gimple_call (stmt))
	    {
	      tree name = gimple_call_lhs (stmt);
+	      tree retfn = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET);

	      notice_special_calls (stmt);
@@ -1233,7 +1268,9 @@ eliminate_unnecessary_stmts (void)
Re: [PATCH][match-and-simplify]
On Tue, 3 Jun 2014, Marc Glisse wrote:

On Tue, 3 Jun 2014, Richard Biener wrote:

On Mon, 2 Jun 2014, Marc Glisse wrote:

 (plus (bit_not @0) @0)
 if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
 { build_int_cst (TREE_TYPE (@0), -1); })
+(match_and_simplify
+  (plus @0 (bit_not @0))
+  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+  { build_int_cst (TREE_TYPE (@0), -1); })

Why not just:

(match_and_simplify
  (plus @0 (bit_not @0))
  { build_all_ones_cst (TREE_TYPE (@0)); })

? Works for vector/complex, I don't know what type a bit_not_expr can have where the simplification wouldn't be true.

Sure.

Thanks. Sorry for not being clear enough, the same remark applies to basically all the bitwise patterns in the file. I could have saved the remark for the pre-merge RFC period. I mostly posted it now so the next batch of patterns can be directly written in a general way so you (Prathamesh or Richard) have fewer to fix later, but it can certainly wait.

/* ~~x -> x */
(match_and_simplify
  (bit_not (bit_not @0))
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  @0)

We can drop the if line.

/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
(match_and_simplify
  (bit_and (bit_ior @0 INTEGER_CST_P@1) INTEGER_CST_P@2)
  if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  (bit_ior (bit_and @0 @2) (bit_and @1 @2)))

Drop the if line and replace INTEGER_CST_P with CONSTANT_CLASS_P. etc.

Yeah, though we don't restrict the types that the BIT_*_EXPR operations apply to anywhere and I wonder if this all applies to FP types for example ... (and if we are not supposed to get FP types (or struct types!?) here then the checking code and the tree.def documentation should be amended). I think the conditions we have now are literally copied from tree-ssa-forwprop.c (and the idea was to copy what that does 1:1).

Richard.
Re: [PATCH, Pointer Bounds Checker 26/x] CCP
On Tue, Jun 3, 2014 at 9:38 AM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch allows BUILT_IN_CHKP_BNDRET as a consumer of the result of a BUILT_IN_STACK_SAVE call.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-03  Ilya Enkovich  ilya.enkov...@intel.com

	* tree-ssa-ccp.c (insert_clobber_before_stack_restore): Handle
	BUILT_IN_CHKP_BNDRET calls.

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index eeefeaf..c138337 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -1848,7 +1848,7 @@ insert_clobber_before_stack_restore (tree saved_val, tree var,
				     gimple_htab *visited)
 {
   gimple stmt, clobber_stmt;
-  tree clobber;
+  tree clobber, fndecl;
   imm_use_iterator iter;
   gimple_stmt_iterator i;
   gimple *slot;
@@ -1880,6 +1880,10 @@ insert_clobber_before_stack_restore (tree saved_val, tree var,
     else if (gimple_assign_ssa_name_copy_p (stmt))
       insert_clobber_before_stack_restore (gimple_assign_lhs (stmt), var,
					   visited);
+    else if (gimple_code (stmt) == GIMPLE_CALL
+	     && (fndecl = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET))
+	     && gimple_call_fndecl (stmt) == fndecl)
+      continue;

Please change this to use the proper abstraction once implemented. It should be sth like

    else if (is_gimple_call (stmt)
	     && gimple_call_builtin_p (stmt, BUILT_IN_CHKP_BNDRET))
      continue;

I assume now that the builtins are not target builtins but added to builtins.def.

Richard.

    else
      gcc_assert (is_gimple_debug (stmt));
 }
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Tue, Jun 3, 2014 at 10:11 AM, Eric Botcazou ebotca...@adacore.com wrote: Looks mostly ok. Any reason why you are not re-creating MINUS_EXPR in build_symbolic_expr? That is, build inv - t (for non-pointers, of course)? It's more uniform and compare_values expects an INTEGER_CST on the RHS, although the patch was lacking a small tweak to the function: @@ -1205,19 +1292,23 @@ compare_values_warnv (tree val1, tree va STRIP_USELESS_TYPE_CONVERSION (val2); if ((TREE_CODE (val1) == SSA_NAME + || (TREE_CODE (val1) == NEGATE_EXPR + TREE_CODE (TREE_OPERAND (val1, 0)) == SSA_NAME) || TREE_CODE (val1) == PLUS_EXPR || TREE_CODE (val1) == MINUS_EXPR) (TREE_CODE (val2) == SSA_NAME + || (TREE_CODE (val2) == NEGATE_EXPR + TREE_CODE (TREE_OPERAND (val2, 0)) == SSA_NAME) || TREE_CODE (val2) == PLUS_EXPR || TREE_CODE (val2) == MINUS_EXPR)) { tree n1, c1, n2, c2; enum tree_code code1, code2; - /* If VAL1 and VAL2 are of the form 'NAME [+-] CST' or 'NAME', + /* If VAL1 and VAL2 are of the form '[-]NAME [+-] CST' or 'NAME', return -1 or +1 accordingly. If VAL1 and VAL2 don't use the same name, return -2. */ - if (TREE_CODE (val1) == SSA_NAME) + if (TREE_CODE (val1) == SSA_NAME || TREE_CODE (val1) == NEGATE_EXPR) { code1 = SSA_NAME; n1 = val1; @@ -1239,7 +1330,7 @@ compare_values_warnv (tree val1, tree va } } - if (TREE_CODE (val2) == SSA_NAME) + if (TREE_CODE (val2) == SSA_NAME || TREE_CODE (val2) == NEGATE_EXPR) { code2 = SSA_NAME; n2 = val2; @@ -1262,6 +1353,11 @@ compare_values_warnv (tree val1, tree va } /* Both values must use the same name. */ + if (TREE_CODE (n1) == NEGATE_EXPR TREE_CODE (n2) == NEGATE_EXPR) + { + n1 = TREE_OPERAND (n1, 0); + n2 = TREE_OPERAND (n2, 0); + } if (n1 != n2) return -2; Ah, ok - makes sense. Otherwise if a range becomes -t + inv that will no longer match get_single_symbol for further propagation? Yes, it will, NEGATE_EXPR is handled by get_single_symbol. Now spotted. Then I'm not sure if + /* Try with VR0 and [-INF, OP1]. 
*/
+  set_value_range (&new_vr1, VR_RANGE, vrp_val_min (expr_type), op1, NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;
+
+  /* Try with VR0 and [OP1, +INF].  */
+  set_value_range (&new_vr1, VR_RANGE, op1, vrp_val_max (expr_type), NULL);
+  extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &new_vr1);
+  if (vr->type != VR_VARYING)
+    return;

is a safe thing to do. If it does make a difference to try [-INF, OP1], [OP1, +INF] instead of just [OP1, OP1] then at least it's very suspicious ;) (or an easy missed optimization).

Yes, it makes a difference and is required to eliminate half-symbolic ranges (the ones deduced from x < y). Otherwise extract_range_from_binary_expr_1 will reject the resulting range because it cannot compare the bounds. As discussed, extract_range_from_binary_expr_1 doesn't do any fiddling with the input ranges, it just computes the resulting range and sees whether it can interpret it. Half-symbolic ranges cannot be interpreted and probably should not be, in the general case at least, because I think it can be very delicate, so extract_range_from_binary_expr will just try all the possible combinations for PLUS and MINUS.

So when computing a range for z in

  z = y - x;

with x = [-INF, y - 1] and y = [x + 1, +INF] (deduced from !(x >= y)) we fail to do sth sensible with [y, y] - [-INF, y - 1] or [x + 1, +INF] - [x, x] but we do sth with [x + 1, +INF] - [-INF, x]? That seems odd to me. With the patch we compute z to [1, +INF(OVF)].

Going the [x + 1, +INF] - [x, x] path first we obtain [1, -x + INF] which fails the sanity checking

  cmp = compare_values (min, max);
  if (cmp == -2 || cmp == 1)
    {
      /* If the new range has its limits swapped around (MIN > MAX),
	 then the operation caused one of them to wrap around, mark
	 the new range VARYING.  */
      set_value_range_to_varying (vr);
    }
  else
    set_value_range (vr, type, min, max, NULL);

but clearly the same reasoning that makes trying with [-INF, x] valid (it's just enlarging the range) can be applied here, too, when computing +INF - x for the upper bound. We can safely increase that to +INF making the range valid for the above test. But I wonder what code path in the routine still relies on that sanity checking to produce a valid result (so I'd rather try removing it, or taking uncomparable bounds as a valid range). Simplest would be to simply do

  set_value_range (vr, type, min, max, NULL);
  return;

and be done with that in the
Re: [PATCH, Pointer Bounds Checker 25/x] DCE
On Tue, Jun 3, 2014 at 1:36 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-03 13:45 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Tue, Jun 3, 2014 at 9:23 AM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adjusts alloc-free removal algorithm in DCE to take into account BUILT_IN_CHKP_BNDRET call returning bounds of allocated memory. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-03 Ilya Enkovich ilya.enkov...@intel.com * tree-ssa-dce.c: Include target.h. (propagate_necessity): For free call fed by alloc check bounds are also provided by the same alloc. (eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET used by free calls. diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c index 13a71ce..59a0b71 100644 --- a/gcc/tree-ssa-dce.c +++ b/gcc/tree-ssa-dce.c @@ -73,6 +73,7 @@ along with GCC; see the file COPYING3. If not see #include flags.h #include cfgloop.h #include tree-scalar-evolution.h +#include target.h static struct stmt_stats { @@ -778,7 +779,23 @@ propagate_necessity (bool aggressive) DECL_BUILT_IN_CLASS (def_callee) == BUILT_IN_NORMAL (DECL_FUNCTION_CODE (def_callee) == BUILT_IN_MALLOC || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC)) - continue; + { + tree retfndecl + = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET); + gimple bounds_def_stmt; + tree bounds; + + /* For instrumented calls we should also check used +bounds are returned by the same allocation call. */ + if (!gimple_call_with_bounds_p (stmt) + || ((bounds = gimple_call_arg (stmt, 1)) Please provide an abstraction for this - I seem to recall seeing multiple (variants) of this. Esp. repeated calls to the target hook above look expensive to me (generally it will not be needed). I'd like to see gimple_call_bndarg (stmt) to get rid of the magic 1 and I'd like to see sth similar to gimple_call_builtin_p, for example gimple_call_chkp_p (stmt, BUILT_IN_CHKP_BNDRET) doing the target hook invocation and the fndecl check. 
I do not see how to get rid of constant in gimple_call_arg call here. What semantics does proposed gimple_call_bndarg have? It has to return the first bounds arg but it's not clear from its name and function seems to be very specialized. Ah, ok. I see now, so the code is ok as-is. + TREE_CODE (bounds) == SSA_NAME What about constant bounds? That shouldn't make the call necessary either? Yep, it would be useless. So please fix. + (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds)) + is_gimple_call (bounds_def_stmt) + gimple_call_fndecl (bounds_def_stmt) == retfndecl + gimple_call_arg (bounds_def_stmt, 0) == ptr)) + continue; So this all becomes if (!gimple_call_with_bounds_p (stmt) || ((bounds = gimple_call_bndarg (stmt)) TREE_CODE (bounds) == SSA_NAME (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds)) is_gimple_call (bounds_def_stmt) gimple_call_chkp_p (bounds_def_stmt, BUILT_IN_CHKP_BNDRET) ... btw, I wonder how BUILT_IN_CHKP_BNDRET is in scope here but you need a target hook to compare the fndecl? Why isn't it enough to just check DECL_FUNCTION_CODE and DECL_BUILT_IN? Call is required because builtins for Pointer Bounds Checker are provided by target depending on available features. That is why gimple_call_builtin_p is not used. I may add new interface function into tree-chkp.h as you propose (e.g. chkp_gimple_call_builtin_p). But I do not see how it may speed-up the code. New function would still need to switch by function code and compare with proper decl which is basically what we have with target hook. I don't understand - does this mean the builtin calls are in the GIMPLE IL even though the target doesn't have the feature? Isn't the presence of the call enough? It is possible to make faster if use the fact that all chkp builtins codes (normal, not target ones) are consequent. Then we may just check the range and use code as an index in array of chkp fndecls to compare decls. Would it be OK to rely on builtin codes numbering? Yes, but that's not my point here. 
In the above code the target hook is called all the time, even if !gimple_call_with_bounds_p (). Providing a suitable abstraction (or simply relying on DECL_FUNCTION_CODE) should fix that. Richard. Ilya Richard. + } } FOR_EACH_SSA_TREE_OPERAND (use
[PATCH][match-and-simplify] Get rid of some stmt expressions
The following arranges for complex C-expressions (multi-stmt ones) in the transform pattern to be outlined to a separate function. This avoids the need for using stmt expressions which are not necessarily supported by all C++ host compilers. The patch doesn't address the stmt expressions being used by the matching code generator - that will be rewritten anyways. Lightly tested, I plan to install this tomorrow. Note that this also gives way of re-numbering captures before code generation so their number increases when for example walking in pre-order. And it gives an easier possibility for querying the largest capture number as well. Richard. 2014-06-03 Richard Biener rguent...@suse.de * genmatch.c (c_expr): Record cpp_tokens, the number of stmts seen and a function identifier. (c_expr::gen_gimple_transform): Generate textual form from the token vector or a call to the outlined function. (write_nary_simplifiers): Adjust. (outline_c_exprs): New function. (write_gimple): Call it. (parse_c_expr): Record a cpp_token vector. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 211131) +++ gcc/genmatch.c (working copy) @@ -190,9 +190,13 @@ struct expr : public operand struct c_expr : public operand { - c_expr (const char *code_) -: operand (OP_C_EXPR), code (code_) {} - const char *code; + c_expr (cpp_reader *r_, veccpp_token code_, unsigned nr_stmts_) +: operand (OP_C_EXPR), r (r_), code (code_), + nr_stmts (nr_stmts_), fname (NULL) {} + cpp_reader *r; + veccpp_token code; + unsigned nr_stmts; + char *fname; virtual void gen_gimple_match (FILE *, const char *, const char *) { gcc_unreachable (); } virtual void gen_gimple_transform (FILE *f, const char *); }; @@ -440,7 +444,47 @@ expr::gen_gimple_transform (FILE *f, con void c_expr::gen_gimple_transform (FILE *f, const char *) { - fputs (code, f); + /* If this expression has an outlined function variant, call it. 
*/ + if (fname) +{ + fprintf (f, %s (type, captures), fname); + return; +} + + /* All multi-stmt expressions should have been outlined. */ + gcc_assert (nr_stmts = 1); + + for (unsigned i = 0; i code.length (); ++i) +{ + const cpp_token *token = code[i]; + + /* Replace captures for code-gen. */ + if (token-type == CPP_ATSIGN) + { + const cpp_token *n = code[i+1]; + if (n-type == CPP_NUMBER + !(n-flags PREV_WHITE)) + { + if (token-flags PREV_WHITE) + fputc (' ', f); + fprintf (f, captures[%s], n-val.str.text); + ++i; + continue; + } + } + + /* Skip a single stmt delimiter. */ + if (token-type == CPP_SEMICOLON + nr_stmts == 1) + continue; + + if (token-flags PREV_WHITE) + fputc (' ', f); + + /* Output the token as string. */ + char *tk = (char *)cpp_token_as_text (r, token); + fputs (tk, f); +} } void @@ -495,9 +539,9 @@ write_nary_simplifiers (FILE *f, vecsim } if (s-ifexpr) { - fprintf (f, if (!); + fprintf (f, if (!(); s-ifexpr-gen_gimple_transform (f, fail_label); - fprintf (f, ) goto %s;, fail_label); + fprintf (f, )) goto %s;, fail_label); } if (s-result-type == operand::OP_EXPR) { @@ -533,11 +577,82 @@ write_nary_simplifiers (FILE *f, vecsim } static void +outline_c_exprs (FILE *f, struct operand *op) +{ + if (op-type == operand::OP_C_EXPR) +{ + c_expr *e = static_cast c_expr *(op); + static unsigned fnnr = 1; + if (e-nr_stmts 1 + !e-fname) + { + e-fname = (char *)xmalloc (sizeof (cexprfn) + 4); + sprintf (e-fname, cexprfn%d, fnnr); + fprintf (f, static tree cexprfn%d (tree type, tree *captures)\n, + fnnr); + fprintf (f, {\n); + unsigned stmt_nr = 1; + for (unsigned i = 0; i e-code.length (); ++i) + { + const cpp_token *token = e-code[i]; + + /* Replace captures for code-gen. 
*/ + if (token-type == CPP_ATSIGN) + { + const cpp_token *n = e-code[i+1]; + if (n-type == CPP_NUMBER + !(n-flags PREV_WHITE)) + { + if (token-flags PREV_WHITE) + fputc (' ', f); + fprintf (f, captures[%s], n-val.str.text); + ++i; + continue; + } + } + + if (token-flags PREV_WHITE) + fputc (' ', f); + + /* Output the token as string. */ + char *tk = (char *)cpp_token_as_text (e-r, token); + fputs (tk, f); + + if (token-type
Re: [PATCH, Pointer Bounds Checker 25/x] DCE
On Tue, Jun 3, 2014 at 3:27 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-03 15:56 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Tue, Jun 3, 2014 at 1:36 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-03 13:45 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Tue, Jun 3, 2014 at 9:23 AM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch adjusts the alloc-free removal algorithm in DCE to take into account the BUILT_IN_CHKP_BNDRET call returning bounds of allocated memory. Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-03  Ilya Enkovich  ilya.enkov...@intel.com

	* tree-ssa-dce.c: Include target.h.
	(propagate_necessity): For free call fed by alloc check bounds
	are also provided by the same alloc.
	(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
	used by free calls.

diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 13a71ce..59a0b71 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "flags.h"
 #include "cfgloop.h"
 #include "tree-scalar-evolution.h"
+#include "target.h"

 static struct stmt_stats
 {
@@ -778,7 +779,23 @@ propagate_necessity (bool aggressive)
		  && DECL_BUILT_IN_CLASS (def_callee) == BUILT_IN_NORMAL
		  && (DECL_FUNCTION_CODE (def_callee) == BUILT_IN_MALLOC
		      || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC))
-		continue;
+		{
+		  tree retfndecl
+		    = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET);
+		  gimple bounds_def_stmt;
+		  tree bounds;
+
+		  /* For instrumented calls we should also check used
+		     bounds are returned by the same allocation call.  */
+		  if (!gimple_call_with_bounds_p (stmt)
+		      || ((bounds = gimple_call_arg (stmt, 1))

Please provide an abstraction for this - I seem to recall seeing multiple (variants) of this. Esp. repeated calls to the target hook above look expensive to me (generally it will not be needed).
I'd like to see gimple_call_bndarg (stmt) to get rid of the magic 1 and I'd like to see sth similar to gimple_call_builtin_p, for example gimple_call_chkp_p (stmt, BUILT_IN_CHKP_BNDRET) doing the target hook invocation and the fndecl check.

I do not see how to get rid of the constant in the gimple_call_arg call here. What semantics does the proposed gimple_call_bndarg have? It has to return the first bounds arg but it's not clear from its name and the function seems to be very specialized.

Ah, ok. I see now, so the code is ok as-is.

+			  && TREE_CODE (bounds) == SSA_NAME

What about constant bounds? That shouldn't make the call necessary either?

Yep, it would be useless. So please fix.

+			  && (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds))
+			  && is_gimple_call (bounds_def_stmt)
+			  && gimple_call_fndecl (bounds_def_stmt) == retfndecl
+			  && gimple_call_arg (bounds_def_stmt, 0) == ptr))
+			continue;

So this all becomes

  if (!gimple_call_with_bounds_p (stmt)
      || ((bounds = gimple_call_bndarg (stmt))
	  && TREE_CODE (bounds) == SSA_NAME
	  && (bounds_def_stmt = SSA_NAME_DEF_STMT (bounds))
	  && is_gimple_call (bounds_def_stmt)
	  && gimple_call_chkp_p (bounds_def_stmt, BUILT_IN_CHKP_BNDRET)
...

btw, I wonder how BUILT_IN_CHKP_BNDRET is in scope here but you need a target hook to compare the fndecl? Why isn't it enough to just check DECL_FUNCTION_CODE and DECL_BUILT_IN?

The call is required because builtins for the Pointer Bounds Checker are provided by the target depending on available features. That is why gimple_call_builtin_p is not used. I may add a new interface function to tree-chkp.h as you propose (e.g. chkp_gimple_call_builtin_p). But I do not see how it may speed up the code. The new function would still need to switch by function code and compare with the proper decl, which is basically what we have with the target hook.

I don't understand - does this mean the builtin calls are in the GIMPLE IL even though the target doesn't have the feature? Isn't the presence of the call enough?

The call is generated using a function decl provided by the target.
Therefore we do not know its code and have to compare using fndecl comparison.

So they are target specific builtins after all? IMNSHO it looks wrong that the middle-end has to care about them in that case. Why can't the builtins themselves be target independent?

It is possible to make it faster if we use the fact that all chkp builtin codes (normal, not target ones) are consecutive. Then we
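For reference, the source-level shape of the pattern the existing alloc/free removal targets (and which the patch extends to bounds-instrumented calls, where free additionally consumes a bounds value produced by a BNDRET call on the same allocation) looks like this — a minimal sketch, not GCC code, and the function name is hypothetical:

```c
#include <stdlib.h>

/* The allocation is never used apart from being freed, so DCE may
   delete the whole malloc/free pair.  With checker instrumentation
   the pass must additionally prove the bounds passed to free come
   from the same allocation's BNDRET call.  */
int wasted_pair (size_t n)
{
  char *p = malloc (n);
  if (!p)
    return -1;
  /* ... no use of *p ... */
  free (p);
  return 0;
}
```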
Re: PR61385: phiopt drops some PHIs
On Tue, Jun 3, 2014 at 3:48 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

apparently it is possible to have a PHI in the middle basic block of value_replacement, so I need to move it as well when I move the statement and remove the block. Bootstrap and testsuite on x86_64-linux-gnu (re-running for various reasons but they had completed successfully yesterday).

2014-06-03  Marc Glisse  marc.gli...@inria.fr

	PR tree-optimization/61385
gcc/
	* tree-ssa-phiopt.c (value_replacement): Copy PHI nodes before
	removing the basic block.
gcc/testsuite/
	* gcc.dg/tree-ssa/pr61385.c: New file.

--
Marc Glisse

Index: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c	(working copy)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+int a, b, c, d, e, f, g;
+
+int
+fn1 ()
+{
+  int *h = &c;
+  for (; c < 1; c++)
+    {
+      int *i = &a, *k = &a;
+      f = 0;
+      if (b)
+	return 0;
+      if (*h)
+	{
+	  int **j = &i;
+	  *j = 0;
+	  d = 0;
+	}
+      else
+	g = e = 0;
+      if (*h)
+	{
+	  int **l = &k;
+	  *l = &g;
+	}
+      d = *h;
+      assert (k == &a || k);
+      assert (i);
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 ();
+  return 0;
+}

Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c	(revision 211178)
+++ gcc/tree-ssa-phiopt.c	(working copy)
@@ -877,20 +877,39 @@ value_replacement (basic_block cond_bb,
	   && operand_equal_for_phi_arg_p (rhs2, cond_lhs)
	   && neutral_element_p (code_def, cond_rhs, true))
	  || (arg1 == rhs2
	      && operand_equal_for_phi_arg_p (rhs1, cond_lhs)
	      && neutral_element_p (code_def, cond_rhs, false))
	  || (operand_equal_for_phi_arg_p (arg1, cond_rhs)
	      && (operand_equal_for_phi_arg_p (rhs2, cond_lhs)
		  || operand_equal_for_phi_arg_p (rhs1, cond_lhs))
	      && absorbing_element_p (code_def, cond_rhs)))
	{
+	  /* Move potential PHI nodes.
*/
+	  gimple_stmt_iterator psi = gsi_start_phis (middle_bb);
+	  while (!gsi_end_p (psi))
+	    {
+	      gimple phi_moving = gsi_stmt (psi);
+	      gimple newphi = create_phi_node (gimple_phi_result (phi_moving),
+					       cond_bb);
+	      int nargs = cond_bb->preds->length ();
+	      location_t loc = gimple_phi_arg_location (phi_moving, 0);
+	      tree phi_arg = gimple_phi_arg_def (phi_moving, 0);
+	      for (int i = 0; i < nargs; ++i)
+		{
+		  edge e = (*cond_bb->preds)[i];
+		  add_phi_arg (newphi, phi_arg, e, loc);

All arguments get the same value (and the PHI in middle-bb is surely a singleton?), so it's way better to re-materialize the PHI as a gimple assignment at the start of the basic block. If they are singletons (which I expect), the easiest way is to propagate their single argument into all uses and simply remove them.

Richard.

+		}
+	      update_stmt (newphi);
+	      gsi_remove (&psi, false);
+	    }
+
	  gsi = gsi_for_stmt (cond);
	  gimple_stmt_iterator gsi_from = gsi_for_stmt (assign);
	  gsi_move_before (&gsi_from, &gsi);
	  replace_phi_edge_with_variable (cond_bb, e1, phi, lhs);
	  return 2;
	}

  return 0;
}
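The neutral-element case value_replacement handles can be seen at the source level: since 0 is the neutral element of addition, the conditional collapses into the plain operation. A sketch of the before/after pattern (illustrative C, not GCC code):

```c
/* Before phiopt: a PHI in cond_bb merges y (taken when x == 0)
   with x + y from the middle block.  */
int before (int x, int y)
{
  if (x == 0)
    return y;
  return x + y;
}

/* After value_replacement: x == 0 implies x + y == y, so the
   branch and the PHI are redundant.  */
int after (int x, int y)
{
  return x + y;
}
```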
Re: Eliminate write-only variables
On Tue, Jun 3, 2014 at 6:19 PM, Jan Hubicka hubi...@ucw.cz wrote: On Mon, Jun 2, 2014 at 8:59 PM, Jan Hubicka hubi...@ucw.cz wrote:

Yeah, I discussed this with Martin today on irc. For aliasing we'd like to know whether a decl possibly has its address taken. Currently we only trust TREE_ADDRESSABLE for statics - and lto might change those to hidden visibility public :( So we want decl_referred_to_by_single_function

OK, I will implement this one in an IPA mini-pass. It is easy. This property changes by cloning and inlining. What does Martin want to use it for? (I.e. my plan would be to compute it as a last IPA pass, making it useless for IPA analysis/propagation)

He doesn't want to use it but we talked and he said he'd have a look. Note that it's important the decls are not referred to by global initializers either.

Why is a new pass necessary - can't the IPA reference maintaining code update some flag in the varpool magically?

If the code to add/remove a reference was updating the flag, it would become out of date as we remove the callgraph during late compilation (we do not keep references for functions we already compiled). I do not want the flags to be computed before the end of the IPA queue so they won't become out of date as we clone/inline. We basically need to walk the references and check that they are all in either one given function F or clones inlined into it. Assuming that the function does not have an unbounded number of references to the given variable, this is basically a constant-time test.

and decl_may_have_address_taken. We currently clear TREE_ADDRESSABLE for statics that have no address taken at WPA time and that ought to keep the flag cleared at ltrans (I think I even made a testcase for this). What I think we could improve is to not consider string functions as ADDR operations for this analysis, i.e. it is common to only memset to 0. How would may_have_address_taken differ here?

I want tree.h:may_be_aliased to return false if a decl doesn't have its address taken.
We don't trust TREE_ADDRESSABLE for TREE_PUBLIC/DECL_EXTERNAL decls because other TUs may take the address. I want the check to be replaced with sth more symtab aware, that is, bring in benefits from LTO (and at the same time be not confused by statics brought public with hidden visibility).

I see, are you sure you need to ignore TREE_ADDRESSABLE for PUBLIC/EXTERNAL? I believe frontends are responsible to set them for all data that may have its address taken (even externally) and the IPA code maintains it - we clear the flag only for variables that are static.

Not sure - we've always not trusted this flag on public/externals. Probably because there could be aliases to them that have their address taken? (well, that's true for locals as well ...) That said, having a varpool way of querying whether a decl may be aliased by a pointer would be nice (with bonus points for handling the alias case).

Or we can just set the flag to true for externally visible vars early in the IPA queue if this is false.

Well, let me do a simple check (remove the restriction and bootstrap/test).

Richard.

Honza
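The property under discussion is easy to picture at the source level: a static whose address is never taken and which is never read is write-only, so every store to it is dead. A minimal hypothetical example of the kind of variable such an IPA pass could eliminate (illustrative only, not from the patch):

```c
/* 'log_level' is static, never read, and never has its address
   taken, so the store below is dead and the variable itself can
   be removed once all references are accounted for.  */
static int log_level;

int compute (int a, int b)
{
  log_level = a;   /* dead store: no read of log_level exists */
  return a * b;
}
```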
Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
On Wed, Jun 4, 2014 at 7:59 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote:

When bswap replaces a bitwise expression involving a memory source by a load possibly followed by a bswap, it is possible that the load has a size smaller than that of the target expression to which the bitwise expression was assigned. So some sort of cast is needed. But there might also be a difference between the size of the expression that was assigned and the size of the load. So three different sizes might be involved. Consider the following example from binutils:

bfd_vma
bfd_getl16 (const void *p)
{
  const bfd_byte *addr = (const bfd_byte *) p;
  return (addr[1] << 8) | addr[0];
}

Here the load will have a size of 16 bits, while the bitwise expression is an int (don't ask me why) but is returned as a 64-bit quantity (bfd_vma maps to the size of host registers). In this case we need two separate casts. One from the 16-bit quantity to int with zero extension, as the high bits are 0. It is always a zero extension because bswap will not do anything in the presence of a sign extension, as depending on the initial value the result would be different (maybe a bswap for a positive number and a random value for a negative number). Then, we need to cast respecting the extension that would have happened had we not replaced the bitwise expression. Here, since the bitwise expression is int, it means we sign extend and then consider the content as being unsigned (bfd_vma is an unsigned quantity). When a bswap is necessary we need to do this double cast after doing the bswap, as the bswap must be done on the loaded value since that's the size expected by the bswap builtin. Finally, this patch also forbids any sign extension *in* the bitwise expression, as the result of the expression would then be unpredictable (depend on the initial value).
The patch works this way:
1) prevent size extension of a bitwise expression
2) record the type of the bitwise expression instead of its size (the size can be determined from the type)
3) use this type to perform a double cast as explained above

2014-05-30  Thomas Preud'homme  thomas.preudho...@arm.com

	PR tree-optimization/61306
	* tree-ssa-math-opts.c (struct symbolic_number): Store type of
	expression instead of its size.
	(do_shift_rotate): Adapt to change in struct symbolic_number.
	(verify_symbolic_number_p): Likewise.
	(init_symbolic_number): Likewise.
	(find_bswap_or_nop_1): Likewise.  Also prevent optimization when
	the result of the expressions is unpredictable due to sign
	extension.
	(convert_via): New function to deal with the casting involved
	from the loaded value to the target SSA.
	(bswap_replace): Rename load_type to range_type to reflect it's
	the type the memory accessed shall have before being cast.
	Select load type according to whether a bswap needs to be done.
	Cast first to rhs with zero extend and then to lhs with sign
	extend to keep semantic of original stmts.
	(pass_optimize_bswap::execute): Adapt to change in struct
	symbolic_number.  Decide if the range accessed should be signed
	or unsigned before being cast to lhs type based on rhs type and
	size.

2014-05-29  Thomas Preud'homme  thomas.preudho...@arm.com

	* gcc.c-torture/execute/pr61306.c: New test.

Patch is in attachment. Is this ok for trunk?

I'd rather change the comparisons

-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;

to work in bits, thus bitsize < 8 * sizeof (int64_t) (note that using BITS_PER_UNIT is bogus here - you are dealing with 8-bit bytes on the host, not whatever the target uses). Otherwise it smells like truncation may lose bits (probably not in practice).

+  /* Sign extension: result is dependent on the value.
*/
+  if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type)
+      && type_size > TYPE_PRECISION (n->type))
+    return NULL_TREE;

whether it's sign-extended depends solely on the sign of the converted entity, so I'm not sure why you are restricting this to signed n->type. Consider

  signed char *p;
  ((unsigned int)p[0]) << 8 | ((unsigned int)p[1]) << 16

the loads are still sign-extended but n->type is unsigned.

I'm failing to get why you possibly need two casts ... you should only need one, from the bswap/load result to the final type (zero-extended as you say - so the load type should simply be unsigned which it is already). So I think that the testcase in the patch is fixed already by doing the n->type change (and a proper sign-extension detection). Can you please split that part out? That range_type and convert_via look wrong and unnecessary to me, and it doesn't look like you have a testcase exercising it?

Thanks,
Richard.
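The sign-extension hazard is easy to demonstrate concretely: with a sign-extended byte in the OR-of-shifted-loads expression, the result is no longer a plain little-endian load, so the bswap pass must not rewrite it. A self-contained illustration (assuming 8-bit bytes; the signed variant uses `* 256` instead of `<< 8` only to avoid the undefined behavior of left-shifting a negative value — the bit pattern is the same):

```c
/* Well-behaved pattern: zero-extended byte loads.  This is what the
   bswap pass may replace with a single 16-bit load.  */
unsigned int getl16_unsigned (const unsigned char *p)
{
  return (unsigned int)(p[1] << 8) | p[0];
}

/* Broken pattern: the signed byte sign-extends before being widened,
   smearing ones into the high bits for byte values >= 0x80.  */
unsigned int getl16_signed (const signed char *p)
{
  return (unsigned int)(p[1] * 256) | (unsigned char) p[0];
}

/* Same two bytes {0x12, 0x90} viewed both ways.  */
unsigned int demo_unsigned (void)
{
  unsigned char u[2] = { 0x12, 0x90 };
  return getl16_unsigned (u);          /* 0x9012 */
}

unsigned int demo_signed (void)
{
  signed char s[2] = { 0x12, (signed char) 0x90 };
  return getl16_signed (s);            /* high bits polluted, != 0x9012 */
}
```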
Re: [PATCH] Fix PR61320: disable bswap for unaligned access on SLOW_UNALIGNED_ACCESS targets
On Wed, Jun 4, 2014 at 8:07 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote:

Hi there,

It seems from PR61320 that the bswap pass causes some problems when it replaces an OR expression by an unaligned access. Although it's not clear yet why the unaligned load does not go through the extract_bit_field codepath, it is necessary to provide a solution as this prevents sparc from bootstrapping. This patch takes the simple approach of cancelling the bswap optimization when the load that would replace the OR expression would be an unaligned load and the target has SLOW_UNALIGNED_ACCESS. In the long run this patch should be reverted as soon as the root cause of the current problem is found. The patch also rewrites the test to take into account the fact that the optimization is not done for some targets. It also adds some alignment hints so that more tests can be run even on STRICT_ALIGNMENT targets.

ChangeLog changes are:

*** gcc/ChangeLog ***

2014-06-03  Thomas Preud'homme  thomas.preudho...@arm.com

	PR tree-optimization/61320
	* tree-ssa-math-opts.c (bswap_replace): Cancel bswap
	optimization when load is unaligned and would be slow for this
	target.

*** gcc/testsuite/ChangeLog ***

2014-06-03  Thomas Preud'homme  thomas.preudho...@arm.com

	* gcc.dg/optimize-bswaphi-1.c: Make variables global when
	possible to enforce correct alignment and make the test work
	better on STRICT_ALIGNMENT targets.  Also adjust dg-final
	selectors when alignment cannot be controlled (read_*_3 ()).
	* gcc.dg/optimize-bswapsi-2.c: Likewise.
	* gcc.dg/optimize-bswapdi-3.c: Likewise.

Bootstrapped on x86_64-linux-gnu and no regression found in the testsuite. Patch is in attachment. It applies on top of the one for PR61306 in the email titled "[PATCH] Fix PR61306: improve handling of sign and cast in bswap" but can be trivially modified to apply directly on trunk should that patch (PR61306) need to be improved.

Is this ok for trunk?
I'd rather wait for the other bug to be fixed properly (given this patch is on top of that). Btw,

+  if (align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type))
+      && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
     return false;

goes to the next line. I too believe that presenting a wider possibly unaligned load to the expander is good and gives it a chance to produce optimal code for the target. I think that PR61320 is probably a dup of PR61306. Still, adding the align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) condition is correct here, aligned loads are fine even in the bswap code-path.

Richard.

Best regards,
Thomas
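Outside the pass internals, the portable way to express the wider, possibly unaligned load the pass wants to emit is a memcpy-based access: the compiler expands it to a single load where unaligned accesses are fast and to byte loads on strict-alignment targets. A general C idiom, not code from the patch:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned buffer.  memcpy
   lets the compiler pick the best access sequence for the target
   instead of risking a trapping unaligned direct load.  */
uint32_t load_u32 (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Round-trip check that is independent of host endianness: store a
   value at an odd (unaligned) offset and read it back.  */
int roundtrip_ok (uint32_t x)
{
  unsigned char buf[8] = { 0 };
  memcpy (buf + 1, &x, sizeof x);
  return load_u32 (buf + 1) == x;
}
```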
Re: [PATCH, i386, Pointer Bounds Checker 10/x] Partitions
On Tue, Jun 3, 2014 at 11:24 PM, Jeff Law l...@redhat.com wrote: On 06/02/14 05:41, Richard Biener wrote: this should be all moved to the symbol table level. (and IDENTIFIER_NODE shouldn't have to have tree_common.chain and thus become smaller). Which ought to be independent of the pointer checking work. There's been some proposals for making first class symbol tables in GCC, but nothing that's made any progress at this point. Maybe I'll sick David on it at some point. You probably missed that we now have such symbol table (ok, it still lacks labels and CONST_DECLs). Richard. jeff
Re: [PATCH] Fix PR54733 Optimize endian independent load/store
On Wed, Jun 4, 2014 at 9:04 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote: From: Christophe Lyon [mailto:christophe.l...@linaro.org] On 29 May 2014 11:58, Thomas Preud'homme thomas.preudho...@arm.com wrote: Does the patch solve the problem you had? What about you Christophe? Hi Thomas, After a quick test, it looks OK to me. Great. What about you Andreas? Does it work fine for you? If yes, is this ok for trunk? Ok. Thanks, Richard. Best regards, Thomas
Re: [PATCH, Pointer Bounds Checker 6/x] New static constructor types
On Wed, Apr 16, 2014 at 2:33 PM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch adds new static constructor types used by Pointer Bounds Checker. It was approved earlier for 4.9 and I'll assume the patch is OK for trunk if no objections arise. Patch was bootstrapped and tested for linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  ilya.enkov...@intel.com

	* ipa.c (cgraph_build_static_cdtor_1): Support constructors
	with chkp ctor and bnd_legacy attributes.
	* gimplify.c (gimplify_init_constructor): Avoid infinite loop
	during gimplification of bounds initializer.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 7441784..67ab515 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -3803,10 +3803,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
	   individual element initialization.  Also don't do this for small
	   all-zero initializers (which aren't big enough to merit
	   clearing), and don't try to make bitwise copies of
-	   TREE_ADDRESSABLE types.  */
+	   TREE_ADDRESSABLE types.
+
+	   We cannot apply such transformation when compiling chkp static
+	   initializer because creation of initializer image in the memory
+	   will require static initialization of bounds for it.  It should
+	   result in another gimplification of similar initializer and we
+	   may fall into infinite loop.  */
	if (valid_const_initializer
	    && !(cleared || num_nonzero_elements == 0)
-	    && !TREE_ADDRESSABLE (type))
+	    && !TREE_ADDRESSABLE (type)
+	    && (!current_function_decl
+		|| !lookup_attribute ("chkp ctor",
+				      DECL_ATTRIBUTES (current_function_decl))))

Simply make the type TREE_ADDRESSABLE?

	  {
	    HOST_WIDE_INT size = int_size_in_bytes (type);
	    unsigned int align;

diff --git a/gcc/ipa.c b/gcc/ipa.c
index 26e9b03..5ab3aed 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -1345,9 +1345,11 @@ make_pass_ipa_whole_program_visibility (gcc::context *ctxt)
 }

 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).
BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.

    FINAL specify whether the externally visible name for collect2 should be produced. */

@@ -1406,6 +1408,20 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final)
       DECL_STATIC_CONSTRUCTOR (decl) = 1;
       decl_init_priority_insert (decl, priority);
       break;
+    case 'P':
+      DECL_STATIC_CONSTRUCTOR (decl) = 1;
+      DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("chkp ctor"),
+					  NULL,
+					  NULL_TREE);

Ick. Please try to avoid using attributes for this. Rather adjust the caller of this function to set a flag in the cgraph node. So I don't like this patch at all.

Richard.

+      decl_init_priority_insert (decl, priority);
+      break;
+    case 'B':
+      DECL_STATIC_CONSTRUCTOR (decl) = 1;
+      DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("bnd_legacy"),
+					  NULL,
+					  NULL_TREE);
+      decl_init_priority_insert (decl, priority);
+      break;
     case 'D':
       DECL_STATIC_DESTRUCTOR (decl) = 1;
       decl_fini_priority_insert (decl, priority);
       break;
@@ -1423,9 +1439,11 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final)
 }

 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.  */
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.
*/ void cgraph_build_static_cdtor (char which, tree body, int priority)
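The 'I'/'D' kinds correspond to what `__attribute__((constructor))` / `((destructor))` produce at the source level, and the PRIORITY argument orders the generated ctors. A minimal sketch using the GCC attribute syntax (not the checker-specific 'P'/'B' kinds, which exist only inside the compiler; 101 is the lowest priority available to user code, and lower priorities run first):

```c
static int order[2];
static int ctor_count;

__attribute__((constructor (101)))
static void init_first (void)
{
  order[ctor_count++] = 1;   /* runs before init_second */
}

__attribute__((constructor (102)))
static void init_second (void)
{
  order[ctor_count++] = 2;
}

/* Both constructors have run before main is entered.  */
int ctors_ran_in_order (void)
{
  return ctor_count == 2 && order[0] == 1 && order[1] == 2;
}
```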
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On Wed, Jun 4, 2014 at 8:46 AM, Jeff Law l...@redhat.com wrote: On 06/03/14 03:29, Richard Biener wrote: On Tue, Jun 3, 2014 at 7:55 AM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 21:27 GMT+04:00 Jeff Law l...@redhat.com: On 06/02/14 04:48, Ilya Enkovich wrote:

Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch?

I suppose it is just a simple bug with an absent NULL pointer check. Here is the original code:

  /* Duplicate the loop tree, if available and wanted.  */
  if (loops_for_fn (src_cfun) != NULL
      && current_loops != NULL)
    {
      copy_loops (id, entry_block_map->loop_father,
		  get_loop (src_cfun, 0));
      /* Defer to cfgcleanup to update loop-father fields of
	 basic-blocks.  */
      loops_state_set (LOOPS_NEED_FIXUP);
    }

  /* If the loop tree in the source function needed fixup, mark the
     destination loop tree for fixup, too.  */
  if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
    loops_state_set (LOOPS_NEED_FIXUP);

As you may see we have a check for an absent loops structure in the first if-statement and no check in the second one. I hit a segfault and added the check.

Downthread you indicated you're not in SSA form which might explain the inconsistency here. If so, then we need to make sure that the loop df structures do get set up properly later.

That is what the init_data_structures pass will do for us, as Richard pointed out. Right?

loops are set up during the CFG construction and thus are available everywhere.

Which would argue that the hunk that checks for the loop tree's existence before accessing it should not be needed. Ilya -- is it possible you hit this prior to Richi's work to build the loop structures as part of CFG construction and maintain them throughout compilation.

That's likely. It's still on my list of janitor things to do to remove all those if (current_loops) checks ...
the df structures are set up in the init_data_structures pass which is run before going into SSA form (I'd like to somehow clean up that area).

OK. So this part should be approved since we've established this code is running prior to going into SSA form.

Agreed.

Thanks,
Richard.

jeff
Re: PR61385: phiopt drops some PHIs
On Wed, Jun 4, 2014 at 9:59 AM, Jeff Law l...@redhat.com wrote: On 06/04/14 01:46, Marc Glisse wrote: On Tue, 3 Jun 2014, Jeff Law wrote: On 06/03/14 08:08, Richard Biener wrote:

All arguments get the same value (and the PHI in middle-bb is surely a singleton?), so it's way better to re-materialize the PHI as a gimple assignment at the start of the basic block. If they are singletons (which I expect), the easiest way is to propagate their single argument into all uses and simply remove them.

I was all for propagating, but replace_uses_by calls fold_stmt on the using statements (that's normal). If I first move the statement and then replace, I may call fold_stmt on a stmt that isn't dominated by the defs of all its arguments. If I first replace, the statement I am supposed to move may have been modified so I need to find out what happened to it. In the end, the assignment seems easier and safer (and not too bad since this case almost never happens in practice).

We certainly want to get rid of them :-) I'd start by finding out which pass left the degenerate, ensure it's not marked as SSA_NAME_OCCURS_IN_ABNORMAL_PHI, then propagate it away. If it's a systemic problem that a particular pass can leave degenerate PHIs, then you might want to schedule the phi-only propagator to run after that pass. It does const/copy propagation seeding from degenerate PHIs, so it ought to be very fast.

This one comes from VRP2 apparently. But it isn't the only pass that produces singleton phis. I didn't look at them closely, but I assume LIM is normal, DOM maybe less so, etc.

DOM & VRP can create them via jump threading. Right now we only fire up the pass_phi_only_cprop after DOM though.

Right, if we want to get rid of the degenerates early we have to run pass_phi_only_cprop after VRP as well. DOM runs not too long after LIM and it will propagate away the degenerates as well.
Not sure where LIM would create degenerate PHIs, but I guess you simply see loop-closed PHIs here which are degenerate. One could argue that propagating away the degenerates ought to occur immediately before any phi-opt pass. I'd say we should run phi_only_cprop after the first VRPs instead and move the 2nd VRP before the existing phi_only_cprop pass. I'll do the appropriate change. bootstrap+testsuite on x86_64-linux-gnu as usual. 2014-06-04 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Replace singleton PHIs with assignments. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. Don't you have to verify neither the source nor the destination are marked with SSA_NAME_OCCURS_IN_ABNORMAL_PHI? And if so, you can't propagate away the PHI (which may mean we need to avoid the optimization in that case). I also believe that when turning a PHI into an assignment you have to preserve the parallel nature of the PHI. If the result of one PHI is used as an input in another PHI, then you have to work harder (see the out-of-ssa code). Perhaps this can never be the case with a degenerate. Hmm, indeed that can happen. So I'd say we should instead simply bail out if the middle-bb has a PHI node. Simple and safe. And with the pass ordering fixes above we shouldn't see degenerates anyway. Richard. Jeff
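The phi-only const/copy propagation Jeff describes is conceptually tiny: a degenerate (single-argument) PHI is just a copy, and propagating it away means rewriting each use of its result to the transitively resolved single argument. A toy model over integer value numbers, with all of the real IR bookkeeping assumed away:

```c
#define NVALS 8

/* copy_of[v] == v means v is a real definition; otherwise v is a
   degenerate PHI whose single argument is copy_of[v].  */
static int copy_of[NVALS];

/* Chase copies to the representative defining value, as the
   propagator would rewrite uses before deleting the PHIs.  */
static int resolve (int v)
{
  while (copy_of[v] != v)
    v = copy_of[v];
  return v;
}

int demo (void)
{
  for (int i = 0; i < NVALS; ++i)
    copy_of[i] = i;
  copy_of[3] = 1;   /* v3 = PHI <v1> */
  copy_of[5] = 3;   /* v5 = PHI <v3> */
  return resolve (5) == 1 && resolve (2) == 2;
}
```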
[PATCH][1/2] Improve DSE
This first patch improves DSE by improving the handling of references with non-invariant addresses such as a->b[i].c in stmt_kills_ref_p_1. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-06-04  Richard Biener  rguent...@suse.de

	* tree-ssa-alias.c (stmt_may_clobber_ref_p): Improve handling
	of accesses with non-invariant address.
	* gcc.dg/tree-ssa/ssa-dse-16.c: New testcase.

Index: gcc/tree-ssa-alias.c
===
*** gcc/tree-ssa-alias.c	(revision 211213)
--- gcc/tree-ssa-alias.c	(working copy)
*** stmt_may_clobber_ref_p (gimple stmt, tre
*** 2174,2184 ****
  static bool
  stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
  {
!   /* For a must-alias check we need to be able to constrain
!      the access properly.
!      FIXME: except for BUILTIN_FREE.  */
!   if (!ao_ref_base (ref)
!       || ref->max_size == -1)
      return false;

    if (gimple_has_lhs (stmt)
--- 2174,2180 ----
  static bool
  stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
  {
!   if (!ao_ref_base (ref))
      return false;

    if (gimple_has_lhs (stmt)
*** stmt_kills_ref_p_1 (gimple stmt, ao_ref
*** 2191,2199 ****
	   might throw as well.  */
        && !stmt_can_throw_internal (stmt))
      {
!       tree base, lhs = gimple_get_lhs (stmt);
        HOST_WIDE_INT size, offset, max_size, ref_offset = ref->offset;
!       base = get_ref_base_and_extent (lhs, &offset, &size, &max_size);
        /* We can get MEM[symbol: sZ, index: D.8862_1] here,
	   so base == ref->base does not always hold.  */
        if (base != ref->base)
--- 2187,2237 ----
	   might throw as well.  */
        && !stmt_can_throw_internal (stmt))
      {
!       tree lhs = gimple_get_lhs (stmt);
!       /* If LHS is literally a base of the access we are done.  */
!       if (ref->ref)
!	  {
!	    tree base = ref->ref;
!	    if (handled_component_p (base))
!	      {
!		tree saved_lhs0 = NULL_TREE;
!		if (handled_component_p (lhs))
!		  {
!		    saved_lhs0 = TREE_OPERAND (lhs, 0);
!		    TREE_OPERAND (lhs, 0) = integer_zero_node;
!		  }
!		do
!		  {
!		    /* Just compare the outermost handled component, if
!		       they are equal we have found a possible common
!		       base.  */
!		    tree saved_base0 = TREE_OPERAND (base, 0);
TREE_OPERAND (base, 0) = integer_zero_node;
!		    bool res = operand_equal_p (lhs, base, 0);
!		    TREE_OPERAND (base, 0) = saved_base0;
!		    if (res)
!		      break;
!		    /* Otherwise drop handled components of the access.  */
!		    base = saved_base0;
!		  }
!		while (handled_component_p (base));
!		if (saved_lhs0)
!		  TREE_OPERAND (lhs, 0) = saved_lhs0;
!	      }
!	    /* Finally check if lhs is equal to the base candidate
!	       of the access.  */
!	    if (operand_equal_p (lhs, base, 0))
!	      return true;
!	  }
!
!       /* Now look for non-literal equal bases with the restriction of
!	   handling constant offset and size.  */
!       /* For a must-alias check we need to be able to constrain
!	   the access properly.  */
!       if (ref->max_size == -1)
!	  return false;
        HOST_WIDE_INT size, offset, max_size, ref_offset = ref->offset;
!       tree base = get_ref_base_and_extent (lhs, &offset, &size, &max_size);
        /* We can get MEM[symbol: sZ, index: D.8862_1] here,
	   so base == ref->base does not always hold.  */
        if (base != ref->base)
*** stmt_kills_ref_p_1 (gimple stmt, ao_ref
*** 2261,2266 ****
--- 2299,2308 ----
        case BUILT_IN_MEMMOVE_CHK:
        case BUILT_IN_MEMSET_CHK:
	  {
+	    /* For a must-alias check we need to be able to constrain
+	       the access properly.  */
+	    if (ref->max_size == -1)
+	      return false;
	    tree dest = gimple_call_arg (stmt, 0);
	    tree len = gimple_call_arg (stmt, 2);
	    if (!tree_fits_shwi_p (len))

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-16.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-16.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-16.c	(working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1-details" } */
+
+struct X { struct A { int a[2]; } b[10]; };
+void foo (struct X *x, int i)
+{
+  struct A a;
+  /* Confuse SRA here with using a variable index, otherwise it will mess
+     with the IL too much.  */
+  a.a[i] = 3;
+  a.a[1] = 0;
+  /* The following store is dead.  */
+  x->b[i].a[0] = 1;
+  x->b[i] = a;
+}
+
+/* { dg-final { scan-tree-dump "Deleted dead store" "dse1"
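The effect of the new testcase can be checked by hand: because `x->b[i] = a` overwrites the whole array element, the earlier store to `x->b[i].a[0]` cannot be observed, which is exactly what the improved stmt_kills_ref_p_1 proves despite the non-invariant index. An executable variant of the same pattern (it only asserts semantic equivalence, it does not inspect the optimization itself):

```c
#include <string.h>

struct X { struct A { int a[2]; } b[10]; };

static void with_dead_store (struct X *x, int i)
{
  struct A a = { { 3, 0 } };
  x->b[i].a[0] = 1;   /* dead: fully overwritten below */
  x->b[i] = a;
}

static void without_dead_store (struct X *x, int i)
{
  struct A a = { { 3, 0 } };
  x->b[i] = a;
}

/* The two functions leave identical final state, proving the first
   store really was dead.  */
int same_final_state (void)
{
  struct X x1, x2;
  memset (&x1, 0, sizeof x1);
  memset (&x2, 0, sizeof x2);
  with_dead_store (&x1, 4);
  without_dead_store (&x2, 4);
  return memcmp (&x1, &x2, sizeof x1) == 0;
}
```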
[PATCH][2/2] Improve DSE
This improves DSE by not stopping the search for a killing stmt at the first may-alias but instead continuing to look for a kill (well, until we hit a may-use or run into limitations of the walker). This fixes some embarrassing missed dead store eliminations (it's also more expensive, but the walking is limited, thus it's O(1) anyway (haha)). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-06-04 Richard Biener rguent...@suse.de PR tree-optimization/60098 * tree-ssa-dse.c (dse_possible_dead_store_p): Walk until we hit a kill. (dse_optimize_stmt): Simplify, now that we found a kill earlier. * gcc.dg/tree-ssa/ssa-dse-15.c: New testcase. Index: gcc/tree-ssa-dse.c === *** gcc/tree-ssa-dse.c.orig 2014-05-06 16:15:43.030962178 +0200 --- gcc/tree-ssa-dse.c 2014-06-04 10:12:33.584954818 +0200 *** dse_possible_dead_store_p (gimple stmt, *** 198,207 break; } } ! /* We deliberately stop on clobbering statements and not only on ! killing ones to make walking cheaper. Otherwise we can just ! continue walking until both stores have equal reference trees. */ ! while (!stmt_may_clobber_ref_p (temp, gimple_assign_lhs (stmt))); *use_stmt = temp; --- 198,205 break; } } ! /* Continue walking until we reach a kill. */ ! while (!stmt_kills_ref_p (temp, gimple_assign_lhs (stmt))); *use_stmt = temp; *** dse_optimize_stmt (gimple_stmt_iterator *** 248,304 if (!dse_possible_dead_store_p (stmt, &use_stmt)) return; /* But only remove *this_2(D) ={v} {CLOBBER} if killed by another clobber stmt. */ if (gimple_clobber_p (stmt) && !gimple_clobber_p (use_stmt)) return; ! /* If we have precisely one immediate use at this point and the ! stores are to the same memory location or there is a chain of ! virtual uses from stmt and the stmt which stores to that same ! memory location, then we may have found redundant store. */ ! if ((gimple_has_lhs (use_stmt) ! && (operand_equal_p (gimple_assign_lhs (stmt), ! gimple_get_lhs (use_stmt), 0))) !
|| stmt_kills_ref_p (use_stmt, gimple_assign_lhs (stmt))) ! { ! basic_block bb; ! /* If use_stmt is or might be a nop assignment, e.g. for ! struct { ... } S a, b, *p; ... ! b = a; b = b; ! or ! b = a; b = *p; where p might be &b, ! or ! *p = a; *p = b; where p might be &b, ! or ! *p = *u; *p = *v; where p might be &v, then USE_STMT ! acts as a use as well as definition, so store in STMT ! is not dead. */ ! if (stmt != use_stmt ! && ref_maybe_used_by_stmt_p (use_stmt, gimple_assign_lhs (stmt))) ! return; ! ! if (dump_file && (dump_flags & TDF_DETAILS)) ! { ! fprintf (dump_file, "Deleted dead store '"); ! print_gimple_stmt (dump_file, gsi_stmt (*gsi), dump_flags, 0); ! fprintf (dump_file, "'\n"); ! } ! ! /* Then we need to fix the operand of the consuming stmt. */ ! unlink_stmt_vdef (stmt); ! ! /* Remove the dead store. */ ! bb = gimple_bb (stmt); ! if (gsi_remove (gsi, true)) ! bitmap_set_bit (need_eh_cleanup, bb->index); ! ! /* And release any SSA_NAMEs set in this statement back to the ! SSA_NAME manager. */ ! release_defs (stmt); } } } --- 246,294 if (!dse_possible_dead_store_p (stmt, &use_stmt)) return; + /* Now we know that use_stmt kills the LHS of stmt. */ + /* But only remove *this_2(D) ={v} {CLOBBER} if killed by another clobber stmt. */ if (gimple_clobber_p (stmt) && !gimple_clobber_p (use_stmt)) return; ! basic_block bb; ! ! /* If use_stmt is or might be a nop assignment, e.g. for ! struct { ... } S a, b, *p; ... ! b = a; b = b; ! or ! b = a; b = *p; where p might be &b, ! or ! *p = a; *p = b; where p might be &b, ! or ! *p = *u; *p = *v; where p might be &v, then USE_STMT ! acts as a use as well as definition, so store in STMT ! is not dead. */ ! if (stmt != use_stmt ! && ref_maybe_used_by_stmt_p (use_stmt, gimple_assign_lhs (stmt))) ! return; ! if (dump_file && (dump_flags & TDF_DETAILS)) ! { ! fprintf (dump_file, "Deleted dead store '"); ! print_gimple_stmt (dump_file, gsi_stmt (*gsi), dump_flags, 0); ! fprintf (dump_file, "'\n"); } + + /* Then we need to fix the operand of the consuming stmt
Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
On Wed, Jun 4, 2014 at 12:42 PM, Thomas Preud'homme thomas.preudho...@arm.com wrote: From: Richard Biener [mailto:richard.guent...@gmail.com] I'd rather change the comparisons - if (n->size < (int)sizeof (int64_t)) - n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1; + if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t)) + n->n &= ((uint64_t)1 << bitsize) - 1; to work in bits, thus bitsize < 8 * sizeof (int64_t) (note that using BITS_PER_UNIT is bogus here - you are dealing with 8-bit bytes on the host, not whatever the target uses). Otherwise it smells like truncation may lose bits (probably not in practice). Ah yes, right. + /* Sign extension: result is dependent on the value. */ + if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type) + && type_size > TYPE_PRECISION (n->type)) + return NULL_TREE; whether it's sign-extended depends solely on the sign of the converted entity, so I'm not sure why you are restricting this to signed n->type. Consider signed char *p; ((unsigned int)p[0]) << 8 | ((unsigned int)p[1]) << 16 the loads are still sign-extended but n->type is unsigned. Indeed, I understood it for convert_via (the requirement to be unsigned) but got it wrong here. I'm failing to get why you possibly need two casts ... you should only need one, from the bswap/load result to the final type (zero-extended as you say - so the load type should simply be unsigned which it is already). Because of the type of the shift constant, the bitwise expression is usually of type int. However, if you write a function that does a 32 bit load in host endianness (like a 32 bit little endian load on x86_64) with a result of size 64 bits, then you need to sign extend, since the source type is signed. This is a situation I encountered in bfd_getl32 in binutils I think. Now if you consider bfd_getl16 instead a direct sign extension is out of the question. Suppose bfd_getl16 is called to read from a memory address that contains 0xff 0xfe.
The bitwise expression would thus be equivalent to the value 0xfeff since it's of type int. Then after the sign extension to 64 bits you'd have 0x000000000000feff. But after replacing the bitwise expression you end up with a 16-bit load into a 16-bit SSA variable. If you sign extend that directly to 64 bits you'll end up with 0xfffffffffffffeff, which is wrong. But if you zero extend to an int value (the type of the bitwise OR expression) and then sign extend to the target type you'll have the correct result. Err, but if you zero-extend directly to the target type you have the correct result, too. But you're right, we can do simpler by sign extending if load size == size of bitwise expression and zero extending else. The change of load_type to range_type would still be needed because in case of a load + bswap it's better to load in the same type as the type of the parameter of bswap. After bswap you'd need to convert to a signed or unsigned value according to the logic above (load size == size of bitwise expression) In the original statement, the bitwise OR expression would have the 2 bytes of higher weight be 0 while the 2 bytes of lower weight would be the value read from memory. The result of the sign extension would be So I think that the testcase in the patch is fixed already by doing the n->type change (and a proper sign-extension detection). I don't remember exactly but I think it didn't fix this bug (but it is a necessary fix anyway). Can you please split that part out? Sure, that part would need to be applied on gcc 4.9 too. I'll try to construct a testcase for that. That range_type and convert_via looks wrong and unnecessary to me, and it doesn't look like you have a testcase exercising it? See above. But nothing for the testsuite? The testcase you add falls foul of sign-extending the loads. Richard.
Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
On Wed, Jun 4, 2014 at 1:15 PM, Thomas Preud'homme thomas.preudho...@arm.com wrote: From: Richard Biener [mailto:richard.guent...@gmail.com] Err, but if you zero-extend directly to the target type you have the correct result, too. Yep, but in some cases we need to sign extend (32-bit bitwise OR stored into a 64-bit result). But you'd convert the 32-bit bitwise OR to the original type anyway (no extension, but sign-change). The extension to the stored type happens with the original IL. As I said, the logic could be simplified by sign extending if load_size == bitwise expression size and zero extending otherwise. I'll rework the patch in this direction. But nothing for the testsuite? The testcase you add falls foul of sign-extending the loads. Ack, I'll add a test for zero extension and one for sign extension. Thanks. Richard. Cheers, Thomas
Re: PR61385: phiopt drops some PHIs
On Wed, Jun 4, 2014 at 1:20 PM, Marc Glisse marc.gli...@inria.fr wrote: On Wed, 4 Jun 2014, Richard Biener wrote: So I'd say we should instead simply bail out if the middle-bb has a PHI node. Sounds good to me, so I am testing the mini-patch I had originally posted to bugzilla: Ok. Thanks, Richard. 2014-06-04 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Punt if there are PHI nodes. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. -- Marc Glisse Index: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy) @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#define assert(x) if (!(x)) __builtin_abort () + +int a, b, c, d, e, f, g; + +int +fn1 () +{ + int *h = &c; + for (; c < 1; c++) +{ + int *i = &a, *k = &a; + f = 0; + if (b) + return 0; + if (*h) + { + int **j = &i; + *j = 0; + d = 0; + } + else + g = e = 0; + if (*h) + { + int **l = &k; + *l = &g; + } + d = *h; + assert (k == &a || k); + assert (i); +} + return 0; +} + +int +main () +{ + fn1 (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 211221) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -842,20 +842,24 @@ value_replacement (basic_block cond_bb, /* Now optimize (x != 0) ? x + y : y to just x + y. The following condition is too restrictive, there can easily be another stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */ gimple assign = last_and_only_stmt (middle_bb); if (!assign || gimple_code (assign) != GIMPLE_ASSIGN || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)) && !POINTER_TYPE_P (TREE_TYPE (arg0)))) return 0; + /* Punt if there are (degenerate) PHIs in middle_bb, there should not be. */ + if (!gimple_seq_empty_p (phi_nodes (middle_bb))) +return 0; + /* Only transform if it removes the condition.
*/ if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, e1)) return 0; /* Size-wise, this is always profitable. */ if (optimize_bb_for_speed_p (cond_bb) /* The special case is useless if it has a low probability. */ && profile_status_for_fn (cfun) != PROFILE_ABSENT && EDGE_PRED (middle_bb, 0)->probability < PROB_EVEN /* If assign is cheap, there is no point avoiding it. */
[PATCH][match-and-simplify] Restore bootstrap somewhat
The following patch is necessary to get to stage2 target library building. Bootstrapped until that point, applied. Richard. 2014-06-04 Richard Biener rguent...@suse.de * genmatch.c (error_cb, fatal_at): Annotate with printf format attribute to silence warning. * gimple-match-head.c (gimple_resimplify1): Check for builtin availability and properly zero out excess operands. (gimple_resimplify2): Likewise. (gimple_resimplify3): Likewise. (maybe_push_res_to_seq): Check for builtin availability. (gimple_match_and_simplify): Likewise. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 211221) +++ gcc/genmatch.c (working copy) @@ -549,6 +549,9 @@ write_gimple (FILE *f, vec<simplify *> static struct line_maps *line_table; static bool +#if GCC_VERSION >= 4001 +__attribute__((format (printf, 6, 0))) +#endif error_cb (cpp_reader *, int, int, source_location location, unsigned int, const char *msg, va_list *ap) { @@ -562,6 +565,9 @@ error_cb (cpp_reader *, int, int, source } static void +#if GCC_VERSION >= 4001 +__attribute__((format (printf, 2, 3))) +#endif fatal_at (const cpp_token *tk, const char *msg, ...) { va_list ap; Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 211221) +++ gcc/gimple-match-head.c (working copy) @@ -105,18 +105,23 @@ gimple_resimplify1 (gimple_seq *seq, else { tree decl = builtin_decl_implicit (*res_code); - tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 1, false); - if (tem) + if (decl) { - /* fold_builtin_n wraps the result inside a NOP_EXPR. */ - STRIP_NOPS (tem); - tem = fold_convert (type, tem); + tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 1, false); + if (tem) + { + /* fold_builtin_n wraps the result inside a NOP_EXPR. */ + STRIP_NOPS (tem); + tem = fold_convert (type, tem); + } } } if (tem != NULL_TREE && CONSTANT_CLASS_P (tem)) { res_ops[0] = tem; + res_ops[1] = NULL_TREE; + res_ops[2] = NULL_TREE; *res_code = TREE_CODE (res_ops[0]); return true; } @@ -157,17 +162,22 @@ gimple_resimplify2 (gimple_seq *seq, else { tree decl = builtin_decl_implicit (*res_code); - tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 2, false); - if (tem) + if (decl) { - /* fold_builtin_n wraps the result inside a NOP_EXPR. */ - STRIP_NOPS (tem); - tem = fold_convert (type, tem); + tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 2, false); + if (tem) + { + /* fold_builtin_n wraps the result inside a NOP_EXPR. */ + STRIP_NOPS (tem); + tem = fold_convert (type, tem); + } } } if (tem != NULL_TREE) { res_ops[0] = tem; + res_ops[1] = NULL_TREE; + res_ops[2] = NULL_TREE; *res_code = TREE_CODE (res_ops[0]); return true; } @@ -221,18 +231,23 @@ gimple_resimplify3 (gimple_seq *seq, else { tree decl = builtin_decl_implicit (*res_code); - tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 3, false); - if (tem) + if (decl) { - /* fold_builtin_n wraps the result inside a NOP_EXPR. */ - STRIP_NOPS (tem); - tem = fold_convert (type, tem); + tem = fold_builtin_n (UNKNOWN_LOCATION, decl, res_ops, 3, false); + if (tem) + { + /* fold_builtin_n wraps the result inside a NOP_EXPR. */ + STRIP_NOPS (tem); + tem = fold_convert (type, tem); + } } } if (tem != NULL_TREE && CONSTANT_CLASS_P (tem)) { res_ops[0] = tem; + res_ops[1] = NULL_TREE; + res_ops[2] = NULL_TREE; *res_code = TREE_CODE (res_ops[0]); return true; } @@ -297,9 +312,11 @@ maybe_push_res_to_seq (code_helper rcode { if (!seq) return NULL_TREE; + tree decl = builtin_decl_implicit (rcode); + if (!decl) + return NULL_TREE; if (!res) res = make_ssa_name (type, NULL); - tree decl = builtin_decl_implicit (rcode); unsigned nargs = type_num_arguments (TREE_TYPE (decl)); gcc_assert (nargs <= 3); gimple new_stmt = gimple_build_call (decl, nargs, ops[0], ops[1], ops[2]); @@ -401,14 +418,17 @@ gimple_match_and_simplify (enum built_in
[PATCH][match-and-simplify] Fix cutpaste error
Committed. Hopefully that restores bootstrap ... Richard. 2014-06-04 Richard Biener rguent...@suse.de * gimple-match-head.c (gimple_match_and_simplify): Fix cut&paste error. Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 211226) +++ gcc/gimple-match-head.c (working copy) @@ -604,7 +604,7 @@ gimple_match_and_simplify (gimple stmt, if (!arg1) return false; } - tree arg2 = gimple_call_arg (stmt, 0); + tree arg2 = gimple_call_arg (stmt, 1); if (valueize && TREE_CODE (arg2) == SSA_NAME) { arg2 = valueize (arg2);
Re: [patch, mips, tree] align microMIPS functions to 16 bits with -Os
On Tue, 3 Jun 2014, Richard Sandiford wrote: Sandra Loosemore san...@codesourcery.com writes: Catherine included an earlier version of this patch with the microMIPS submission a couple years ago: https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00972.html Richard's response was: "Looks like the wrong place to do this. Please treat this as a separate patch and get a tree expert to comment." So, here is the separate patch, finally. :-) Can a tree expert comment? I dug around and didn't see another obvious hook to set function alignment in a target-specific way depending on attributes of the function. Besides the new test cases, I regression-tested this on mips-sde-elf using Mentor's usual assortment of multilibs, specifically including one for microMIPS. OK, I asked Richi on IRC today. To recap, one of the things I was worried about was a test against the address, like (foo & 2) == 0, being optimised away before we set the alignment. Richi pointed out that my idea downthread about cgraph_create_node would also be too late to avoid that. Also, looking at it now, I see that we don't trust DECL_ALIGN on functions anyway. From get_object_alignment_2: /* Extract alignment information from the innermost object and possibly adjust bitpos and offset. */ if (TREE_CODE (exp) == FUNCTION_DECL) { /* Function addresses can encode extra information besides their alignment. However, if TARGET_PTRMEMFUNC_VBIT_LOCATION allows the low bit to be used as a virtual bit, we know that the address itself must be at least 2-byte aligned. */ if (TARGET_PTRMEMFUNC_VBIT_LOCATION == ptrmemfunc_vbit_in_pfn) align = 2 * BITS_PER_UNIT; } And since we use the low bit to encode the ISA mode on MIPS, the upshot is that we never assume any alignment for functions. So there's nothing to worry about after all. Richi suggested just changing the alignment at output time. I assume that would be a case of replacing the DECL_ALIGN in: /* Tell assembler to move to target machine's alignment for functions.
*/ align = floor_log2 (DECL_ALIGN (decl) / BITS_PER_UNIT); if (align > 0) { ASM_OUTPUT_ALIGN (asm_out_file, align); } with a hook. (Is that right?) Yeah, kind of. Of course if DECL_ALIGN on function-decls is unused then we may as well initialize it to 1 in tree.c and at an appropriate stage adjust it to the result of a target hook invocation. Appropriate stage would be the above place (which means DECL_ALIGN is essentially unused for FUNCTION_DECLs). So ... can you massage the DECL_ALIGN macro to ICE on FUNCTION_DECLs and see where we access it? (generic code will possibly trip on it, but the question is whether there is any user that cares?) Richard.
Re: [PATCH, Pointer Bounds Checker 6/x] New static constructor types
On Wed, Jun 4, 2014 at 3:13 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-04 13:58 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Wed, Apr 16, 2014 at 2:33 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds new static constructor types used by Pointer Bounds Checker. It was approved earlier for 4.9 and I'll assume the patch is OK for trunk if no objections arise. Patch was bootstrapped and tested for linux-x86_64. Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * ipa.c (cgraph_build_static_cdtor_1): Support constructors with chkp ctor and bnd_legacy attributes. * gimplify.c (gimplify_init_constructor): Avoid infinite loop during gimplification of bounds initializer. diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 7441784..67ab515 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -3803,10 +3803,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, individual element initialization. Also don't do this for small all-zero initializers (which aren't big enough to merit clearing), and don't try to make bitwise copies of - TREE_ADDRESSABLE types. */ + TREE_ADDRESSABLE types. + + We cannot apply such transformation when compiling chkp static + initializer because creation of initializer image in the memory + will require static initialization of bounds for it. It should + result in another gimplification of similar initializer and we + may fall into infinite loop. */ if (valid_const_initializer && !(cleared || num_nonzero_elements == 0) - && !TREE_ADDRESSABLE (type)) + && !TREE_ADDRESSABLE (type) + && (!current_function_decl + || !lookup_attribute ("chkp ctor", + DECL_ATTRIBUTES (current_function_decl Simply make the type TREE_ADDRESSABLE? Wouldn't it be a hack to mark it addressable just to not hit this condition? It would also require to have an unshared copy of the type to not affect other statements with that type.
{ HOST_WIDE_INT size = int_size_in_bytes (type); unsigned int align; diff --git a/gcc/ipa.c b/gcc/ipa.c index 26e9b03..5ab3aed 100644 --- a/gcc/ipa.c +++ b/gcc/ipa.c @@ -1345,9 +1345,11 @@ make_pass_ipa_whole_program_visibility (gcc::context *ctxt) } /* Generate and emit a static constructor or destructor. WHICH must - be one of 'I' (for a constructor) or 'D' (for a destructor). BODY - is a STATEMENT_LIST containing GENERIC statements. PRIORITY is the - initialization priority for this constructor or destructor. + be one of 'I' (for a constructor), 'D' (for a destructor), 'P' + (for chkp static vars constructor) or 'B' (for chkp static bounds + constructor). BODY is a STATEMENT_LIST containing GENERIC + statements. PRIORITY is the initialization priority for this + constructor or destructor. FINAL specify whether the externally visible name for collect2 should be produced. */ @@ -1406,6 +1408,20 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final) DECL_STATIC_CONSTRUCTOR (decl) = 1; decl_init_priority_insert (decl, priority); break; +case 'P': + DECL_STATIC_CONSTRUCTOR (decl) = 1; + DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("chkp ctor"), + NULL, + NULL_TREE); Ick. Please try to avoid using attributes for this. Rather adjust the caller of this function to set a flag in the cgraph node. It is too late because all early local passes are executed by that time and I need this attribute to be set before them. Ok, so where do you call this from? It should be possible to create the function in GIMPLE form directly, avoiding the need to go through gimplification. Richard. Ilya So I don't like this patch at all. Richard.
+ decl_init_priority_insert (decl, priority); + break; +case 'B': + DECL_STATIC_CONSTRUCTOR (decl) = 1; + DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("bnd_legacy"), + NULL, + NULL_TREE); + decl_init_priority_insert (decl, priority); + break; case 'D': DECL_STATIC_DESTRUCTOR (decl) = 1; decl_fini_priority_insert (decl, priority); @@ -1423,9 +1439,11 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final) } /* Generate and emit a static constructor or destructor. WHICH must - be one of 'I' (for a constructor) or 'D' (for a destructor). BODY - is a STATEMENT_LIST containing GENERIC statements. PRIORITY is the - initialization priority for this constructor
Re: [PATCH, PR 61391]
On Wed, Jun 4, 2014 at 3:17 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Hi All, Here is a simple fix for 61391 - missed a check that statement basic block is inside loop. With this fix test-case from bug is compiled successfully. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? Ok. Thanks, Richard. ChangeLog: 2014-06-04 Yuri Rumyantsev ysrum...@gmail.com PR tree-optimization/61319 * tree-if-conv.c (is_cond_scalar_reduction): Add missed check that stmt belongs to loop.
Re: [PATCH, PR 61391]
On Wed, Jun 4, 2014 at 3:37 PM, Richard Biener richard.guent...@gmail.com wrote: On Wed, Jun 4, 2014 at 3:17 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Hi All, Here is a simple fix for 61391 - missed a check that statement basic block is inside loop. With this fix test-case from bug is compiled successfully. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? Ok. Please also add the testcase. Thanks, Richard. ChangeLog: 2014-06-04 Yuri Rumyantsev ysrum...@gmail.com PR tree-optimization/61319 * tree-if-conv.c (is_cond_scalar_reduction): Add missed check that stmt belongs to loop.