from:"Jakub Jelinek"

Re: [PATCH] Optimize ASAN_CHECK checks

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 12:25:37PM +0100, Dodji Seketeli wrote:
  +  /* True if there is a block with HAS_FREEING_CALL_P flag set
  + on any a path between imm(BB) and BB.  */
 
 s/a//.
 
 Also, I'd rather say on any path between an immediate dominator of
 BB, denoted imm(BB), and BB.  That way, subsequent uses of imm(BB)
 makes sense more for the new comer.  This is only a suggestion.  If
 you feel that formulation is obvious enough, then please ignore my
 comment.

Ok.

  +/* Return true if there might be any call to free/munmap operation
  +   on any path in between DOM (which should be imm(BB)) and BB.  */
  +
  +static bool
  +imm_dom_path_with_freeing_call (basic_block bb, basic_block dom)
  +{
 
 To ease maintainability, maybe we could assert that:
 
gcc_assert (dom == get_immediate_dominator(CDI_DOMINATORS, bb));

Well, that would make the dom argument useless, the point of passing it
around was that because we have to call it in the caller already,
there is no point in calling it again the the callee.
So if we as well call it (most people don't use --disable-checking),
we could just drop the argument.

 ?
 
 And thus remove the assert that is at the caller site of this
 function, later in maybe_optimize_asan_check_ifn:
 
  + basic_block imm = get_immediate_dominator (CDI_DOMINATORS, last_bb);
  + gcc_assert (imm);

The assert is very much intentional here, want to double-check the algorithm
that we indeed always reach gbb through the imm dominator walk (unless
bailing out early, but if we wouldn't, we'd reach it).

  + if (imm_dom_path_with_freeing_call (last_bb, imm))
  +   break;
 
 
 Also, when the 'dom' basic block is NULL, couldn't we just return
 immediately?

get_immediate_dominator (CDI_DOMINATORS, ) should return NULL
just for the ENTRY block, it would be a bug to reach that bb.

  +/* Optimize away redundant ASAN_CHECK calls.  */
  +
  +static bool
  +maybe_optimize_asan_check_ifn (struct sanopt_ctx *ctx, gimple stmt)
  +{
  +  gcc_assert (gimple_call_num_args (stmt) == 4);
  +  tree ptr = gimple_call_arg (stmt, 1);
  +  tree len = gimple_call_arg (stmt, 2);
  +  basic_block bb = gimple_bb (stmt);
  +  sanopt_info *info = (sanopt_info *) bb-aux;
  +
  +  if (TREE_CODE (len) != INTEGER_CST)
  +return false;
  +  if (integer_zerop (len))
  +return false;
  +
  +  gimple_set_uid (stmt, info-freeing_call_events);
 
 I am not sure, but I am wondering if we shouldn't save the previous uid
 of 'stmt' here before setting it, and then restore it before getting out
 of this function.

No, gimple uids are AFAIK undefined at the start of passes, passes that use
them are supposed to initialize them before use (new statements created
during the pass will get 0 there by default), and don't have to clean them
up anyway at the end of pass.

  @@ -89,111 +402,77 @@ sanopt_optimize_walker (basic_block bb,
 
 I think the comment of this sanopt_optimize_walker function should now
 be adapted to say that it optimizes away redundant UBSAN_NULL *and*
 ASAN_CHECK internal function calls.

Ok, will do.

Jakub

Re: [PATCH] Optimize ASAN_CHECK checks

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 01:06:52PM +0100, Dodji Seketeli wrote:
 Jakub Jelinek ja...@redhat.com writes:
 
  I am not sure, but I am wondering if we shouldn't save the previous uid
  of 'stmt' here before setting it, and then restore it before getting out
  of this function.
 
  No, gimple uids are AFAIK undefined at the start of passes, passes that use
  them are supposed to initialize them before use (new statements created
  during the pass will get 0 there by default), and don't have to clean them
  up anyway at the end of pass.
 
 Yeah, this is what I figured by grepping other passes, but I wasn't sure
 :-)
 
 Maybe I should follow up with a doc patch for the (otherwise very terse)
 comment of gimple_set_uid and gimple_uid accessors.

That would be indeed nice (similarly for other stuff that we expect to be
undefined on pass boundaries, or expect to be in certain state at pass
boundaries; in the former case set before uses, and don't care about what
state we leave it in, in the latter case can assume some state first (say 0)
and have to put it back into the same state).  There are various visited
flags and the like, Richard, any ideas what other things might be nice to
document?

Jakub

Re: [changes.html] Document -fdiagnostics-color= default changes

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 01:14:02PM +0100, Manuel López-Ibáñez wrote:
 
   h3 id=c-familyC family/h3
 ul
  +liThe code-fdiagnostics-color=/code option default is now
  +configurable at GCC configury time using
  +code--with-diagnostics-color=/code, can default to
  +codeauto/code - the new default unless configured otherwise,
  +where diagnostics is colorized by default when emitted to terminal,
  +codenever/code, codealways/code or codeauto-if-env/code,
  +which is the default of GCC 4.9 - codeauto/code if non-empty
  +codeGCC_COLORS/code is in the environment, codenever/code
  +otherwise.  Note, as before, having empty codeGCC_COLORS/code
  +variable in the environment will always turn the coloring off, no
  +matter what the default is or what command line options are used./li
 
 This not only affects 'C family' but also Fortran now, so perhaps it
 should go in the Caveats section at the top or on a Building GCC

Well, it doesn't look like something for Caveats section IMHO, it is an
extension of the previous behavior.  The reason I put it into the C family
part is that that is where it was documented for 4.9, and for Fortran IMHO
gcc-5/changes.html just should document that now it supports diagnostics
colors like the C family of frontends.

 section. Apart from that, do you think the following is a bit clearer?
 
 liThe default setting of the code-fdiagnostics-color=/code option is now
 configurable a href
 =https://gcc.gnu.org/install/configure.html;when building GCC/a
 using configuration option code--with-diagnostics-color=/code. The
 possible values are:
 codenever/code, codealways/code, codeauto/code and
 codeauto-if-env/code.
 The new default codeauto/code means to use color only when the
 standard error is a terminal.
 The default in GCC 4.9 was codeauto-if-env/code, which defaults to
 codeauto/code if there is a non-empty codeGCC_COLORS/code
 environment
 variable, and codenever/code otherwise. As in GCC 4.9, an empty
 codeGCC_COLORS/code variable in the environment will always
 disable colors, no
 matter what the default is or what command line options are used./li

Indeed, it does.  So feel free to turn that into patch form.

Jakub

Re: [PATCH] Optimize ASAN_CHECK checks

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 01:16:36PM +0100, Richard Biener wrote:
  That would be indeed nice (similarly for other stuff that we expect to be
  undefined on pass boundaries, or expect to be in certain state at pass
  boundaries; in the former case set before uses, and don't care about what
  state we leave it in, in the latter case can assume some state first (say 0)
  and have to put it back into the same state).  There are various visited
  flags and the like, Richard, any ideas what other things might be nice to
  document?
 
 aux pointers on CFG structures which I believe have to be cleared
 after each pass that uses them (maybe already documented).
 
 Nothing else off the top of my head.

There is at least gimple_plf GF_PLF_{1,2} too, gimple_visited_p, BB_VISITED, 
... 

Jakub

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 01:12:40PM +0100, Bernd Schmidt wrote:
 :(.  So what other option one has to implement something like TLS, even
 using inline asm or similar?  There is %tid, so perhaps indexing some array
 with %tid?
 
 That ought to work. For performance you'd want that array in .shared memory
 but I believe that's limited in size.

Any way to query those limits?  Size of .shared memory, number of threads in
warp, number of warps, etc.?  In OpenACC, are all workers in a single gang
the same warp?

 BTW, one can still invoke OpenMP target regions (even OpenACC regions) from
 multiple host threads, so the question is how without local TLS we can
 actually do anything at all.  Sure, we can pass parameters to the kernel,
 but we'd need to propagate it through all functions.  Or can
 cudaGetParameterBuffer be used for that?
 
 Presumably a kernel could copy its arguments out to memory somewhere when
 it's called?

The question is where.  If it is global memory, then how would you find out
what value is for your team and what value is for some other team?

 - we'll need some synchronization primitives, I see atomic support is
   there, we need mutexes and semaphores I think, is that implementable
   using bar instruction?
 
 It's probably membar you need.
 
 That is a memory barrier, I need threads to wait on each other, wake up one
 another etc.
 
 Hmm. It's worthwhile to keep in mind that GPU threads really behave somewhat
 differently from CPUs (they don't really execute independently); the OMP
 model may just be a poor match for the architecture in general.
 One could busywait on a spinlock, but AFAIK there isn't really a way to put
 a thread to sleep. By not executing independently, I mean this: I believe if
 one thread in a warp is waiting on the spinlock, all the other ones are also
 busywaiting. There may be other effects that seem odd if one approaches it
 from a CPU perspective - for example you probably want only one thread in a
 warp to try to take the spinlock.

So, for a warp, if some threads perform one branch of an if and other
threads another one, all threads perform the first one first (with some
maybe not doing anything), then all the threads the others (again, other
threads not doing anything)?

As for the match, OpenMP isn't written for a particular accelerator, though
supposedly the addition of #pragma omp teams construct was done for NVidia.
So, some OpenMP code may be efficient on PTX, while other code might not be
that much (e.g. if all threads in a warp need to execute the same thing,
supposedly #pragma omp task isn't very good idea for the devices).

Jakub

Re: libsanitizer merge from upstream r221802

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 03:11:16PM +0100, Uros Bizjak wrote:
 The missing definition in system's /usr/include/linux/types.h is protected 
 with:
 
 typedef __u16 __bitwise __le16;
 typedef __u16 __bitwise __be16;
 typedef __u32 __bitwise __le32;
 typedef __u32 __bitwise __be32;
 #if defined(__GNUC__)  !defined(__STRICT_ANSI__)
 typedef __u64 __bitwise __le64;
 typedef __u64 __bitwise __be64;
 #endif
 
 which doesn't work with -std=c++11, but works without problems with
 -std=gnu++11.
 
 As proposed by Jakub at [1], -std=gnu++11 fixes the problem with old
 kernels. Attached patch implements this proposal.
 
 2014-11-14  Uros Bizjak  ubiz...@gmail.com
 
 * sanitizer_common/Makefile.am (AM_CXXFLAGS): Use -std=gnu++11.
 * asan/Makefile.am (AM_CXXFLAGS): Ditto.
 * lsan/Makefile.am (AM_CXXFLAGS): Ditto.
 * interception/Makefile.am (AM_CXXFLAGS): Ditto.
 * tsan/Makefile.am (AM_CXXFLAGS): Ditto.
 * libbacktrace/Makefile.am (AM_CXXFLAGS): Ditto.
 * ubsan/Makefile.am (AM_CXXFLAGS): Ditto.
 * sanitizer_common/Makefile.in: Regenerate.
 * asan/Makefile.in: Ditto.
 * lsan/Makefile.in: Ditto.
 * interception/Makefile.in: Ditto.
 * tsan/Makefile.in: Ditto.
 * libbacktrace/Makefile.in: Ditto.
 * ubsan/Makefile.in: Ditto.

Ok, thanks.  Really no reason for pedantic checking.

Jakub

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 07:37:49AM -0800, Cesar Philippidis wrote:
  Hmm. It's worthwhile to keep in mind that GPU threads really behave
  somewhat differently from CPUs (they don't really execute
  independently); the OMP model may just be a poor match for the
  architecture in general.
  One could busywait on a spinlock, but AFAIK there isn't really a way to
  put a thread to sleep. By not executing independently, I mean this: I
  believe if one thread in a warp is waiting on the spinlock, all the
  other ones are also busywaiting. There may be other effects that seem
  odd if one approaches it from a CPU perspective - for example you
  probably want only one thread in a warp to try to take the spinlock.
 
 Thread synchronization in CUDA is different from conventional CPUs.
 Using the gang/thread terminology, there's no way to synchronize two
 threads in two different gangs in PTX without invoking separate kernels.
 Basically, after a kernel is invoked, the host/accelerator (the later
 using dynamic parallelism) waits for the kernel to finish, and that
 effectively creates a barrier.

I believe in OpenMP terminology a gang is a team, and inter-teams barriers
are not supposed to work etc. (though, I think locks and atomic instructions
still are, so is critical region, so I really hope atomics are atomic even
inter-gang).  So for synchronization (mutexes and semaphores, from which
barriers are implemented; but perhaps could also use bar.arrive and bar.sync)
we mainly need synchronization within the gang.

 Also, keep in mind that PTX doesn't have a global TID. The user needs to
 calculate it using ctaid/tid and friends.

Ok.  Is %gridid needed for that combo too?

Jakub

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 08:37:52AM -0800, Cesar Philippidis wrote:
 On 11/14/2014 08:18 AM, Jakub Jelinek wrote:
 
  Also, keep in mind that PTX doesn't have a global TID. The user needs to
  calculate it using ctaid/tid and friends.
  
  Ok.  Is %gridid needed for that combo too?
 
 Eventually, probably. Currently, we're launching all of our kernels with
 cuLaunchKernel, and that function doesn't take grids into account.

I wonder if cudaLaunchDevice called from PTX will result in a different
%gridid or not, will see next week if I manage to get the HW and SW stack

Jakub

Re: [PATCH, i686] Fix for asan test failures with -m32 happened after EBX enabling in PIC mode

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 05:05:57PM +, Zamyatin, Igor wrote:
   --- /dev/null
   +++ b/gcc/testsuite/gcc.target/i386/pr63845.c
   @@ -0,0 +1,20 @@
   +/* PR sanitizer/63845 */
   +/* { dg-do compile } */
   +/* { dg-require-effective-target ia32 } */
   +/* { dg-require-effective-target fpic } */
   +/* { dg-skip-if No Windows PIC { *-*-mingw* *-*-cygwin } { * } {
   + } } */
   +/* { dg-options -fPIC } */
   +
   +int __attribute__ ((noinline, noclone)) foo (void *p) {
   +  return *(int*)p;
   +}
   +
   +int main ()
   +{
   +  char a = 0;
   +  foo (a);
   +  return 0;
   +}
   +
  
  
  Will this test fail on Linux without your fix? Doesn't testcase need -
  fsanitize=address to fail?
 
 Sure, you're right, will add it

It is not that easy, -fsanitize=address is not supported everywhere.
Better if you stick it into testsuite/gcc.dg/asan/
No point adding effective-target ia32/fpic, there is nothing i?86 specific,
not even ix86/x86_64 specific, nor pic specific in the test, just use
/* { dg-additional-options -fPIC { target fpic } } */

Jakub

Re: [PATCH] DW_AT_APPLE_* DWARF extensions.

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 03:48:22PM +0100, Andrew Burgess wrote:
 * Jakub Jelinek ja...@redhat.com [2014-11-13 14:13:42 +0100]:
 
  On Thu, Nov 13, 2014 at 01:21:21PM +0100, Andrew Burgess wrote:
   I had a look around and couldn't find anything helpful.  The best I
   can offer would be the current path within the llvm source code where
   these are defined.  Would that be sufficient?
 
  That is not useful.  The point is not to suggest where those constants come
  from but what they mean, see e.g. the
  http://www.dwarfstd.org/ShowIssue.php?issue=100909.2type=open
  http://gcc.gnu.org/wiki/TemplateParmsDwarf
  etc. links that describe what the extensions do.
  If LLVM doesn't have any documentation of their extensions, then just
  /* Apple extensions.  */
  is good enough.
 
 I agree, hence my comment about not finding anything useful :)
 
 I've look again, and still can't find anything suitable, so could we
 go with just /* Apple extensions.  */ then?
 
 Patch  Changelog below.

Applied.

Jakub

Re: [PATCH] Optimize ASAN_CHECK checks

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 01:06:52PM +0100, Dodji Seketeli wrote:
 Jakub Jelinek ja...@redhat.com writes:
 
  I am not sure, but I am wondering if we shouldn't save the previous uid
  of 'stmt' here before setting it, and then restore it before getting out
  of this function.
 
  No, gimple uids are AFAIK undefined at the start of passes, passes that use
  them are supposed to initialize them before use (new statements created
  during the pass will get 0 there by default), and don't have to clean them
  up anyway at the end of pass.
 
 Yeah, this is what I figured by grepping other passes, but I wasn't sure
 :-)

Here is what I've committed after another bootstrap/regtest.

2014-11-14  Jakub Jelinek  ja...@redhat.com
Marek Polacek  pola...@redhat.com

* sanopt.c: Include tree-ssa-operands.h.
(struct sanopt_info): Add has_freeing_call_p,
has_freeing_call_computed_p, imm_dom_path_with_freeing_call_p,
imm_dom_path_with_freeing_call_computed_p, freeing_call_events,
being_visited_p fields.
(struct sanopt_ctx): Add asan_check_map field.
(imm_dom_path_with_freeing_call, maybe_optimize_ubsan_null_ifn,
maybe_optimize_asan_check_ifn): New functions.
(sanopt_optimize_walker): Use them, optimize even ASAN_CHECK
internal calls.
(pass_sanopt::execute): Call sanopt_optimize even for
-fsanitize=address.
* gimple.c (nonfreeing_call_p): Return true for non-ECF_LEAF
internal calls.

--- gcc/sanopt.c.jj 2014-11-14 15:11:49.202397369 +0100
+++ gcc/sanopt.c2014-11-14 15:19:40.517912502 +0100
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
 #include langhooks.h
 #include ubsan.h
 #include params.h
+#include tree-ssa-operands.h
 
 
 /* This is used to carry information about basic blocks.  It is
@@ -56,7 +57,30 @@ along with GCC; see the file COPYING3.
 
 struct sanopt_info
 {
-  /* True if this BB has been visited.  */
+  /* True if this BB might call (directly or indirectly) free/munmap
+ or similar operation.  */
+  bool has_freeing_call_p;
+
+  /* True if HAS_FREEING_CALL_P flag has been computed.  */
+  bool has_freeing_call_computed_p;
+
+  /* True if there is a block with HAS_FREEING_CALL_P flag set
+ on any path between an immediate dominator of BB, denoted
+ imm(BB), and BB.  */
+  bool imm_dom_path_with_freeing_call_p;
+
+  /* True if IMM_DOM_PATH_WITH_FREEING_CALL_P has been computed.  */
+  bool imm_dom_path_with_freeing_call_computed_p;
+
+  /* Number of possibly freeing calls encountered in this bb
+ (so far).  */
+  uint64_t freeing_call_events;
+
+  /* True if BB is currently being visited during computation
+ of IMM_DOM_PATH_WITH_FREEING_CALL_P flag.  */
+  bool being_visited_p;
+
+  /* True if this BB has been visited in the dominator walk.  */
   bool visited_p;
 };
 
@@ -69,131 +93,387 @@ struct sanopt_ctx
  a vector of UBSAN_NULL call statements that check this pointer.  */
   hash_maptree, auto_vecgimple  null_check_map;
 
+  /* This map maps a pointer (the second argument of ASAN_CHECK) to
+ a vector of ASAN_CHECK call statements that check the access.  */
+  hash_maptree, auto_vecgimple  asan_check_map;
+
   /* Number of IFN_ASAN_CHECK statements.  */
   int asan_num_accesses;
 };
 
 
-/* Try to optimize away redundant UBSAN_NULL checks.
-   
+/* Return true if there might be any call to free/munmap operation
+   on any path in between DOM (which should be imm(BB)) and BB.  */
+
+static bool
+imm_dom_path_with_freeing_call (basic_block bb, basic_block dom)
+{
+  sanopt_info *info = (sanopt_info *) bb-aux;
+  edge e;
+  edge_iterator ei;
+
+  if (info-imm_dom_path_with_freeing_call_computed_p)
+return info-imm_dom_path_with_freeing_call_p;
+
+  info-being_visited_p = true;
+
+  FOR_EACH_EDGE (e, ei, bb-preds)
+{
+  sanopt_info *pred_info = (sanopt_info *) e-src-aux;
+
+  if (e-src == dom)
+   continue;
+
+  if ((pred_info-imm_dom_path_with_freeing_call_computed_p
+  pred_info-imm_dom_path_with_freeing_call_p)
+ || (pred_info-has_freeing_call_computed_p
+  pred_info-has_freeing_call_p))
+   {
+ info-imm_dom_path_with_freeing_call_computed_p = true;
+ info-imm_dom_path_with_freeing_call_p = true;
+ info-being_visited_p = false;
+ return true;
+   }
+}
+
+  FOR_EACH_EDGE (e, ei, bb-preds)
+{
+  sanopt_info *pred_info = (sanopt_info *) e-src-aux;
+
+  if (e-src == dom)
+   continue;
+
+  if (pred_info-has_freeing_call_computed_p)
+   continue;
+
+  gimple_stmt_iterator gsi;
+  for (gsi = gsi_start_bb (e-src); !gsi_end_p (gsi); gsi_next (gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+
+ if (is_gimple_call (stmt)  !nonfreeing_call_p (stmt))
+   {
+ pred_info-has_freeing_call_p = true;
+ break;
+   }
+   }
+
+  pred_info-has_freeing_call_computed_p

[PATCH] Document --with-diagnostics-color=

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 12:01:43PM +0100, Richard Biener wrote:
  doc/invoke.texi also talks about it, wonder if it shouldn't be mentioned
  elsewhere, install.texi?
 
 Sure - the configure option should be mentioned in install.texi as well.

Here it is.  Ok for trunk?

2014-11-14  Jakub Jelinek  ja...@redhat.com

* doc/install.texi (--with-diagnostics-color=): Document.

--- gcc/doc/install.texi.jj 2014-11-13 15:13:22.0 +0100
+++ gcc/doc/install.texi2014-11-14 16:29:47.941342318 +0100
@@ -1795,6 +1795,15 @@ static data members and inline function
 default for a toolchain with an assembler that accepts it and
 GLIBC 2.11 or above, otherwise disabled.
 
+@item --with-diagnostics-color=@var{choice}
+Tells GCC to use @var{choice} as the default for @option{-fdiagnostics-color=}
+option (if not used explicitly on the command line).  @var{choice}
+can be one of @samp{never}, @samp{auto}, @samp{always}, and @samp{auto-if-env}
+where @samp{auto} is the default.  @samp{auto-if-env} means that
+@option{-fdiagnostics-color=auto} will be the default if @code{GCC_COLORS}
+is present and non-empty in the environment, and
+@option{-fdiagnostics-color=never} otherwise.
+
 @item --enable-lto
 @itemx --disable-lto
 Enable support for link-time optimization (LTO).  This is enabled by


Jakub

[PATCH] DCE dead IFN_GOMP_SIMD_LANE

2014-11-14 Thread Jakub Jelinek

Hi!

When looking at PR59984, I've noticed that in certain cases we leave
around GOMP_SIMD_LANE calls without lhs.
This internal call is intentionally not ECF_CONST, we don't want it moved
before the loop, and the argument is magic.  But, if nothing uses it
anymore, it isn't needed.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2014-11-14  Jakub Jelinek  ja...@redhat.com

* tree-ssa.dce.c (eliminate_unnecessary_stmts): Eliminate
IFN_GOMP_SIMD_LANE without lhs as useless.

--- gcc/tree-ssa-dce.c.jj   2014-11-12 13:28:47.0 +0100
+++ gcc/tree-ssa-dce.c  2014-11-14 16:15:10.356124955 +0100
@@ -1402,6 +1402,11 @@ eliminate_unnecessary_stmts (void)
  maybe_clean_or_replace_eh_stmt (stmt, stmt);
  update_stmt (stmt);
  release_ssa_name (name);
+
+ /* GOMP_SIMD_LANE without lhs is not needed.  */
+ if (gimple_call_internal_p (stmt)
+  gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE)
+   remove_dead_stmt (gsi, bb);
}
  else if (gimple_call_internal_p (stmt))
switch (gimple_call_internal_fn (stmt))

Jakub

Re: [PATCH] Propagate nonfreeing_call_p using ipa-pure-const (take 2)

2014-11-14 Thread Jakub Jelinek

On Thu, Nov 13, 2014 at 12:52:21PM +0100, Jakub Jelinek wrote:
 On Thu, Nov 13, 2014 at 09:39:42AM +0100, Jakub Jelinek wrote:
  What about the:
I wonder if the nonfreeing_call_p function shouldn't be moved elsewhere
though (suggestion where), so that gimple.c doesn't need the cgraph
includes.
  question though (maybe it is more on Richard)?
 
 Tried richi's suggested cgraph.[ch], but that meant all users of
 nonfreeing_call_p had to start including all of ipa-ref.h, lto-streamer.h
 and cgraph.h, so it is probably better to keep it in gimple.[ch].

Here is a new version, which also handles IFN_ABNORMAL_DISPATCHER as
nonfreeing, it doesn't really call anything, just connect abnormal edges.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-14  Jakub Jelinek  ja...@redhat.com

* ipa-pure-const.c (struct funct_state_d): Add can_free field.
(varying_state): Add true for can_free.
(check_call): For builtin or internal !nonfreeing_call_p set
local-can_free.
(check_stmt): For asm volatile and asm with memory set
local-can_free.
(analyze_function): Clear local-can_free initially, continue
calling check_stmt until all flags are computed, dump can_free
flag.
(pure_const_write_summary): Write can_free flag.
(pure_const_read_summary): Read it back.
(propagate_pure_const): Propagate also can_free flag, set
w-nonfreeing_fn if it is false after propagation.
* cgraph.h (cgraph_node): Add nonfreeing_fn member.
* gimple.c: Include ipa-ref.h, lto-streamer.h and cgraph.h.
(nonfreeing_call_p): Return cgraph nonfreeing_fn flag if set.
Also return true for IFN_ABNORMAL_DISPATCHER.
* cgraph.c (cgraph_node::dump): Dump nonfreeing_fn flag.
* lto-cgraph.c (lto_output_node): Write nonfreeing_fn flag.
(input_overwrite_node): Read it back.

--- gcc/cgraph.c.jj 2014-11-14 15:11:46.691442654 +0100
+++ gcc/cgraph.c2014-11-14 15:20:25.642079723 +0100
@@ -1995,6 +1995,8 @@ cgraph_node::dump (FILE *f)
 fprintf (f,  tm_clone);
   if (icf_merged)
 fprintf (f,  icf_merged);
+  if (nonfreeing_fn)
+fprintf (f,  nonfreeing_fn);
   if (DECL_STATIC_CONSTRUCTOR (decl))
 fprintf (f, static_constructor (priority:%i), get_init_priority ());
   if (DECL_STATIC_DESTRUCTOR (decl))
--- gcc/ipa-pure-const.c.jj 2014-11-14 15:11:46.640443574 +0100
+++ gcc/ipa-pure-const.c2014-11-14 15:20:25.643079828 +0100
@@ -112,11 +112,15 @@ struct funct_state_d
   bool looping;
 
   bool can_throw;
+
+  /* If function can call free, munmap or otherwise make previously
+ non-trapping memory accesses trapping.  */
+  bool can_free;
 };
 
 /* State used when we know nothing about function.  */
 static struct funct_state_d varying_state
-   = { IPA_NEITHER, IPA_NEITHER, true, true, true };
+   = { IPA_NEITHER, IPA_NEITHER, true, true, true, true };
 
 
 typedef struct funct_state_d * funct_state;
@@ -559,6 +563,10 @@ check_call (funct_state local, gimple ca
   enum pure_const_state_e call_state;
   bool call_looping;
 
+  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL)
+  !nonfreeing_call_p (call))
+   local-can_free = true;
+
   if (special_builtin_state (call_state, call_looping, callee_t))
{
  worse_state (local-pure_const_state, local-looping,
@@ -589,6 +597,8 @@ check_call (funct_state local, gimple ca
break;
  }
 }
+  else if (gimple_call_internal_p (call)  !nonfreeing_call_p (call))
+local-can_free = true;
 
   /* When not in IPA mode, we can still handle self recursion.  */
   if (!ipa  callee_t
@@ -753,6 +763,7 @@ check_stmt (gimple_stmt_iterator *gsip,
fprintf (dump_file, memory asm clobber is not const/pure\n);
  /* Abandon all hope, ye who enter here. */
  local-pure_const_state = IPA_NEITHER;
+ local-can_free = true;
}
   if (gimple_asm_volatile_p (stmt))
{
@@ -760,7 +771,8 @@ check_stmt (gimple_stmt_iterator *gsip,
fprintf (dump_file, volatile is not const/pure\n);
  /* Abandon all hope, ye who enter here. */
  local-pure_const_state = IPA_NEITHER;
-  local-looping = true;
+ local-looping = true;
+ local-can_free = true;
}
   return;
 default:
@@ -785,6 +797,7 @@ analyze_function (struct cgraph_node *fn
   l-looping_previously_known = true;
   l-looping = false;
   l-can_throw = false;
+  l-can_free = false;
   state_from_flags (l-state_previously_known, l-looping_previously_known,
flags_from_decl_or_type (fn-decl),
fn-cannot_return_p ());
@@ -815,7 +828,10 @@ analyze_function (struct cgraph_node *fn
   gsi_next (gsi))
{
  check_stmt (gsi, l, ipa);
- if (l-pure_const_state == IPA_NEITHER  l-looping  l-can_throw)
+ if (l-pure_const_state

Re: r217562 - in /trunk/libsanitizer: ChangeLog asa...

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 11:29:07AM -0800, Konstantin Serebryany wrote:
 +gcc-patches
 
 On Fri, Nov 14, 2014 at 11:26 AM, Konstantin Serebryany
 konstantin.s.serebry...@gmail.com wrote:
  I am opposed to this change.
  Upstream code builds with -std=c++11.
  Building this code here with another set of options is a time bomb.

But also builds with a different compiler.
-std=gnu++11 is not something very different from C++11.
libstdc++-v3 (the C++11 part in it) are also built with -std=gnu++11, not
-std=c++11, similarly in other libraries shipped with GCC.
Valid C++11 will be still valid in -std=gnu++11 mode...

Jakub

Re: [PATCH] gcc/testsuite: guality.exp: Fix `test_counts' restoration

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 09:01:25PM +, Maciej W. Rozycki wrote:
 2014-11-14  Maciej W. Rozycki  ma...@codesourcery.com
 
   gcc/testsuite/ 
   * g++.dg/guality/guality.exp (check_guality): Fix `test_counts' 
   restoration.

Ok, thanks.

 --- gcc-fsf-trunk-quilt.orig/gcc/testsuite/g++.dg/guality/guality.exp 
 2014-11-14 18:33:47.0 +
 +++ gcc-fsf-trunk-quilt/gcc/testsuite/g++.dg/guality/guality.exp  
 2014-11-14 20:18:35.038856372 +
 @@ -28,7 +28,7 @@ proc check_guality {args} {
set ret [string match *1 PASS, 0 FAIL, 0 UNRESOLVED* $execout]
  }
  remote_file build delete $output
 -array get test_counts [array get saved_test_counts]
 +array set test_counts [array get saved_test_counts]
  return $ret
  }
  

Jakub

Re: Fortran/C interfacing

2014-11-14 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 09:47:17PM +0100, Tobias Burnus wrote:
 Somebody recently suggested (for OpenMP) that we just should use
 bind(C) in the Fortran module, it is too late for OpenMP, as we
 have to keep the *_ entrypoints for compatibility anyway, but
 for OpenACC and new OpenMP functions supposedly you could avoid
 exporting all the *_ wrappers and use * directly.
 Tobis, as our local expert :-) -- how does that resonate with the
 discussion (and implementation) about Fortran/C interfacing
 
 Unfortunately, it a wrapper either on C or on Fortran side is unavoidable.

Ok, just wanted to rise it, if it isn't possible, fine as is.

Jakub

Re: [PATCH] Fix Cilk+ ICEs with overflow builtins (PR middle-end/63884)

2014-11-15 Thread Jakub Jelinek

On Sat, Nov 15, 2014 at 01:03:46PM +0100, Marek Polacek wrote:
 The problem here is that the Cilk+ code wasn't prepared to handle
 internal calls that the new overflow builtins entail.  Fixed by
 checking that the CALL_EXPR_FN isn't NULL.
 
 Looking at cilk-plus.exp, I think this file will need some tweaks
 now that the C default is gnu11...
 
 Bootstrapped/regtested on powerpc64-linux, ok for trunk?
 
 2014-11-15  Marek Polacek  pola...@redhat.com
 
   PR middle-end/63884
 c-family/
   * array-notation-common.c (is_sec_implicit_index_fn): Return false
   for NULL fndecl.
   (extract_array_notation_exprs): Return for NULL node.
 testsuite/
   * c-c++-common/cilk-plus/AN/pr63884.c: New test.

Ok, thanks.

Jakub

Re: [PATCH] Fix minimal alignment calculation for user-aligned types (PR63802)

2014-11-16 Thread Jakub Jelinek

On Fri, Nov 14, 2014 at 06:15:16PM +, Joseph Myers wrote:
 On Fri, 14 Nov 2014, Jakub Jelinek wrote:
 
  On Fri, Nov 14, 2014 at 09:46:14AM +0300, Yury Gribov wrote:
   Hi all,
   
   This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63802 by 
   only
   limiting minimal type alignment with BIGGEST_ALIGNMENT for types with no
   __attribute__((aligned)).
   
   Bootstrapped and regtested on x64.  Ok for trunk?
  
  The function is primarily used by the C FE for _Alignas, and I have no idea
  if such a change is desirable for that very much user visible case.  Joseph?
 
 If it is true that a type satisfying TYPE_USER_ALIGN will never be 
 allocated at an address less-aligned than its TYPE_ALIGN, even if that's 
 greater than BIGGEST_ALIGNMENT, then the change seems correct for C11 
 _Alignof.

I think it depends on which target and where.
In structs (unless packed) the user aligned fields should be properly
aligned with respect to start of struct and the struct should have user
alignment in that case, automatic vars these days use alloca with
realignment if not handled better by the target, so should be fine too.
For data section vars and for common vars I think it really depends on the
target.  Perhaps for TYPE_USER_ALIGN use minimum of the TYPE_ALIGN and
MAX_OFILE_ALIGNMENT ?
For heap objects, it really depends on how it has been allocated, but if
allocated through malloc, the extra alignment is never guaranteed.
So, it really depends in malloc counts or not.

Jakub

Re: [PATCH] gcc/ubsan.c: Extend 'pretty_name' space to avoid memory overflow

2014-11-16 Thread Jakub Jelinek

On Mon, Nov 17, 2014 at 08:16:32AM +0100, Marek Polacek wrote:
 On Mon, Nov 17, 2014 at 06:40:26AM +0800, Chen Gang wrote:
  According to the next code, 'pretty_name' may need additional bytes more
  than 16. For simplify thinking and being extensible in future, extent it
  to 256 bytes, directly.
 
 I think + 128 bytes should be enough for everyone.

I disagree.
Consider:
typedef char 
A[1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1];
A a;

int foo (int j)
{
  
a[j][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0]
 = 1;
}
You need 1159 bytes in that case.  Easily to construct testcase that needs
arbitrary amount.
I think easiest would be to rewrite the code so that it uses pretty_printer
to construct the string, grep asan.c for asan_pp .  Or obstacks, but you don't
have a printer to print integers into it easily.
  if (dom  TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST)
pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC,
tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
  else
/* ??? We can't determine the variable name; print VLA unspec.  */
pretty_name[pos++] = '*';
looks wrong anyway, as not all integers fit into uhwi.
Guess you could use wide_int to add 1 there and pp_wide_int.

Jakub

Re: [PATCH] Fix minimal alignment calculation for user-aligned types (PR63802)

2014-11-17 Thread Jakub Jelinek

On Mon, Nov 17, 2014 at 05:46:55PM +, Joseph Myers wrote:
 On Mon, 17 Nov 2014, Jakub Jelinek wrote:
 
   If it is true that a type satisfying TYPE_USER_ALIGN will never be 
   allocated at an address less-aligned than its TYPE_ALIGN, even if that's 
   greater than BIGGEST_ALIGNMENT, then the change seems correct for C11 
   _Alignof.
  
  I think it depends on which target and where.
  In structs (unless packed) the user aligned fields should be properly
  aligned with respect to start of struct and the struct should have user
  alignment in that case, automatic vars these days use alloca with
  realignment if not handled better by the target, so should be fine too.
  For data section vars and for common vars I think it really depends on the
  target.  Perhaps for TYPE_USER_ALIGN use minimum of the TYPE_ALIGN and
  MAX_OFILE_ALIGNMENT ?
  For heap objects, it really depends on how it has been allocated, but if
  allocated through malloc, the extra alignment is never guaranteed.
  So, it really depends in malloc counts or not.
 
 The question, for both _Alignas and ubsan, is the alignment guaranteed *in 
 valid programs*.
 
 malloc only provides sufficient alignment for types with fundamental 
 alignment requirements (although there are various problems with the C11 
 wording; see DR#445).  So if you use malloc to allocate a type with an 
 extended alignment requirement (without doing extra realignment on the 
 result of malloc), that's not a valid program.  And if an alignment is 
 larger than MAX_OFILE_ALIGNMENT, you get an error for declaring non-stack 
 variables requiring that alignment.  So I don't think there's any need to 
 check MAX_OFILE_ALIGNMENT here.

If so, then Yuri's original patch (the one changing min_align_of_type)
should be fine, right?

Jakub

Re: [PATCH] -fsanitize-recover=list

2014-11-17 Thread Jakub Jelinek

On Mon, Nov 17, 2014 at 06:40:00PM -0800, Alexey Samsonov wrote:
 I've just prepared a patch implementing -fsanitize-recover=list in
 Clang (http://reviews.llvm.org/D6302), writing here to make sure we're
 on
 the same page w.r.t. flag semantics:
 
 * -fsanitize-recover: Enable recovery for all checks enabled by
 -fsanitize= which support it.
 * -fno-sanitize-recover: Disable recovery for all checks.

That is not what I think we've agreed on and what is implemented in GCC.
-fsanitize-recover only enables recovery of the undefined plus
undefined-like sanitizers, in particular it doesn't enable recover from
kernel-address, because -fsanitize-recover should be a deprecated option
and kernel-address never used it before.
So, in GCC -fsanitize-recover stands for
-fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
and -fno-sanitize-recover stands for
-fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow

 * -fsanitize-recover=list: Enable recovery for all selected checks
 or group of checks. It is forbidden to list unrecoverable sanitizers
 here (e.g., -fsanitize-recover=address will produce an error).

We only error on
-fsanitize-recover=address
-fsanitize-recover=thread
-fsanitize-recover=leak
but not say on
-fsanitize-recover=unreachable
which is part of undefined; unreachable isn't recoverable silently.
Likewise -fsanitize-recover=return.  Otherwise one couldn't use
-fsanitize-recover=undefined which is useful.

 * -fno-sanitize-recover=list: Disable recovery for selected checks
 or group of checks.

Jakub

Re: [PATCH] -fsanitize-recover=list

2014-11-17 Thread Jakub Jelinek

On Mon, Nov 17, 2014 at 11:39:47PM -0800, Alexey Samsonov wrote:
  That is not what I think we've agreed on and what is implemented in GCC.
  -fsanitize-recover only enables recovery of the undefined plus
  undefined-like sanitizers, in particular it doesn't enable recover from
  kernel-address, because -fsanitize-recover should be a deprecated option
  and kernel-address never used it before.
 
 Hm, indeed, I messed it up. In the older thread we agree that plain
 -f(no-)sanitize-recover
 should be a deprecated synonym for the current meaning of these flags,
 which only specify recoverability for UBSan-related checks :-/
 
 Has GCC already shipped with existing implementation? I'm just curious
 if it's convenient to have flags that would enable/disable recovery for all
 sanitizers (analogous to -Werror flags which exist in a standalone form, but
 may accept specific warnings, so that one can write
   $(CC) foo.cc -Wno-error -Werror=sign-compare

Well, no sanitizer is on by default, so you can just use the same
list for -fno-sanitize-recover= or -fsanitize-recover= as for -fsanitize=
if you want.

  So, in GCC -fsanitize-recover stands for
  -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
  and -fno-sanitize-recover stands for
  -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
 
   * -fsanitize-recover=list: Enable recovery for all selected checks
   or group of checks. It is forbidden to list unrecoverable sanitizers
   here (e.g., -fsanitize-recover=address will produce an error).
 
  We only error on
  -fsanitize-recover=address
  -fsanitize-recover=thread
  -fsanitize-recover=leak
 
 Right, it's a good idea to error on sanitizer kinds we don't have
 recovery for. I will
 add the errors for TSan/MSan/LSan etc.
 
  but not say on
  -fsanitize-recover=unreachable
  which is part of undefined; unreachable isn't recoverable silently.
  Likewise -fsanitize-recover=return.  Otherwise one couldn't use
  -fsanitize-recover=undefined which is useful.
 
 Can't this be fixed by checking the actual values of -fsanitize-recover= flag
 before expanding the sanitizer groups (i.e. turning undefined into
 null,unreachable,return,)?
 
 We should probably be able to distinguish between 
 -fsanitize-recover=undefined,
 and -fsanitize-recover=unreachable,another checks from undefined.
 In the first case we can enable recovery
 for all parts of undefined that support it. In the second, we can
 produce an error as user
 explicitly tried to enable recovery for sanitizer that doesn't support it.

But in that case you would need to diagnose it already when parsing the
option, or add code that would remember what bits have been set from other
option sets.
In the former case, you'd diagnose even
-fsanitize-recover=unreachable -fno-sanitize-recover=undefined
even when in that case unreachable in the end is not recoverable.
We don't error out for
-fsanitize-recover=address -fno-sanitize-recover=address
because in the end address is not recovered.

Jakub

Re: [PATCH] Fix PR63868 - Guard offloading with ifdef ENABLE_OFFLOADING

2014-11-18 Thread Jakub Jelinek

On Tue, Nov 18, 2014 at 06:31:59PM +0300, Ilya Verbin wrote:
 @@ -8287,7 +8289,9 @@ expand_omp_target (struct omp_region *region)
if (kind == GF_OMP_TARGET_KIND_REGION)
  {
unsigned srcidx, dstidx, num;
 +#ifdef ENABLE_OFFLOADING
struct cgraph_node *node;
 +#endif

Please instead move the struct cgraph_node *node; declaration
right above where it is used for the first time.

There is no goto involved there, and it isn't in a switch either,
so you probably also can do just:
struct cgraph_node *node = cgraph_node::get (child_fn);
instead.

Ok with that change.

 @@ -8414,18 +8418,22 @@ expand_omp_target (struct omp_region *region)
DECL_STRUCT_FUNCTION (child_fn)-curr_properties = 
 cfun-curr_properties;
cgraph_node::add_new_function (child_fn, true);
  
 +#ifdef ENABLE_OFFLOADING
/* Add the new function to the offload table.  */
vec_safe_push (offload_funcs, child_fn);
 +#endif
  
/* Fix the callgraph edges for child_cfun.  Those for cfun will be
fixed in a following pass.  */
push_cfun (child_cfun);
cgraph_edge::rebuild_edges ();
  
 +#ifdef ENABLE_OFFLOADING
/* Prevent IPA from removing child_fn as unreachable, since there are 
 no
refs from the parent function to child_fn in offload LTO mode.  */
node = cgraph_node::get (child_fn);
node-mark_force_output ();
 +#endif
  
/* Some EH regions might become dead, see PR34608.  If
pass_cleanup_cfg isn't the first pass to happen with the

Jakub

[PATCH] Fix -fsanitize=bool -fnon-call-exceptions (PR sanitizer/63913)

2014-11-18 Thread Jakub Jelinek

Hi!

This patch fixes instrumentation of bool/enum loads if they could
throw.  We want to keep the EH on the load, and push
further statements on the fallthru edge.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR sanitizer/63913
* ubsan.c: Include tree-eh.h.
(instrument_bool_enum_load): Handle loads that can throw.

* g++.dg/ubsan/pr63913.C: New test.

--- gcc/ubsan.c.jj  2014-11-14 15:11:49.0 +0100
+++ gcc/ubsan.c 2014-11-18 09:29:20.409209556 +0100
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.
 #include dfp.h
 #include builtins.h
 #include tree-object-size.h
+#include tree-eh.h
 
 /* Map from a tree to a VAR_DECL tree.  */
 
@@ -1135,7 +1136,9 @@ instrument_bool_enum_load (gimple_stmt_i
   || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
 return;
 
+  bool can_throw = stmt_could_throw_p (stmt);
   location_t loc = gimple_location (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
   tree ptype = build_pointer_type (TREE_TYPE (rhs));
   tree atype = reference_alias_ptr_type (rhs);
   gimple g = gimple_build_assign (make_ssa_name (ptype, NULL),
@@ -1145,9 +1148,24 @@ instrument_bool_enum_load (gimple_stmt_i
   tree mem = build2 (MEM_REF, utype, gimple_assign_lhs (g),
 build_int_cst (atype, 0));
   tree urhs = make_ssa_name (utype, NULL);
-  g = gimple_build_assign (urhs, mem);
-  gimple_set_location (g, loc);
-  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  if (can_throw)
+{
+  gimple_assign_set_lhs (stmt, urhs);
+  g = gimple_build_assign_with_ops (NOP_EXPR, lhs, urhs, NULL_TREE);
+  gimple_set_location (g, loc);
+  edge e = find_fallthru_edge (gimple_bb (stmt)-succs);
+  gsi_insert_on_edge_immediate (e, g);
+  gimple_assign_set_rhs_with_ops (gsi, MEM_REF, mem, NULL_TREE);
+  update_stmt (stmt);
+  *gsi = gsi_for_stmt (g);
+  g = stmt;
+}
+  else
+{
+  g = gimple_build_assign (urhs, mem);
+  gimple_set_location (g, loc);
+  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+}
   minv = fold_convert (utype, minv);
   maxv = fold_convert (utype, maxv);
   if (!integer_zerop (minv))
@@ -1169,8 +1187,11 @@ instrument_bool_enum_load (gimple_stmt_i
   gimple_set_location (g, loc);
   gsi_insert_after (gsi, g, GSI_NEW_STMT);
 
-  gimple_assign_set_rhs_with_ops (gsi2, NOP_EXPR, urhs, NULL_TREE);
-  update_stmt (stmt);
+  if (!can_throw)
+{
+  gimple_assign_set_rhs_with_ops (gsi2, NOP_EXPR, urhs, NULL_TREE);
+  update_stmt (stmt);
+}
 
   gsi2 = gsi_after_labels (then_bb);
   if (flag_sanitize_undefined_trap_on_error)
--- gcc/testsuite/g++.dg/ubsan/pr63913.C.jj 2014-11-18 09:34:10.603981487 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr63913.C2014-11-18 09:33:47.0 
+0100
@@ -0,0 +1,12 @@
+// PR sanitizer/63913
+// { dg-do compile }
+// { dg-options -fsanitize=bool -fnon-call-exceptions }
+
+struct B { B (); ~B (); };
+
+double
+foo (bool *x)
+{
+  B b;
+  return *x;
+}

Jakub

[PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek

Hi!

OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
they are never used for integer arithmetics (and there is no way
to represent all their values in RTL if not using CONST_WIDE_INT).
As the following testcase shows, simplify_immed_subreg can be called
with such modes though, e.g. trying to forward propagate a CONST_VECTOR
(i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
in the IL directly) into a SUBREG_REG.
The following patch instead of ICE handles the most common cases (all 0
and all 1 CONST_VECTORs) and returns NULL otherwise.

Before wide-int got merged, the testcase worked, though the code didn't
bother checking anything, just created 0 or constm1_rtx for the two cases
that could happen and if something else appeared, could just return what
matched low TImode (or DImode for -m32).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): For integer modes
wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones
and for other values return NULL_RTX.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-11 00:06:19.0 +0100
+++ gcc/simplify-rtx.c  2014-11-18 10:17:01.198668555 +0100
@@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ {
+   /* Handle just all zeros and all ones CONST_VECTORs in
+  this case.  */
+   if ((vp[0]  value_mask) == 0)
+ elems[elem] = const0_rtx;
+   else if ((vp[0]  value_mask) == value_mask)
+ elems[elem] = constm1_rtx;
+   else
+ return NULL_RTX;
+   for (i = value_bit; i  elem_bitsize; i += value_bit)
+ if ((vp[i / value_bit]  value_mask) != (vp[0]  value_mask))
+   return NULL_RTX;
+   break;
+ }
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
elems[elem] = immed_wide_int_const (r, outer_submode);
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-18 10:12:24.282659318 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}

Jakub

[C++ PATCH] Fix -fsanitize={null,alignment} reference instrumentation (PR sanitizer/63813)

2014-11-18 Thread Jakub Jelinek

Hi!

As the testcase shows, if there is a reinterpret_cast, ubsan
reference instrumentation can ICE, because for NOP_EXPR
to REFERENCE_TYPE we pass argument of that NOP_EXPR, and if it is not
a POINTER_TYPE or POINTER_TYPE to a different type, we can get the
type to use in the diagnostics wrong or ICE.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR sanitizer/63813
* c-ubsan.c (ubsan_maybe_instrument_reference_or_call): Change type
argument to ptype, set type to TREE_TYPE (ptype).  Don't call
get_pointer_alignment for non-pointers.  Use ptype, or if it is
reference type, corresponding pointer type, as type of kind
argument.
(ubsan_maybe_instrument_reference,
ubsan_maybe_instrument_member_call): Adjust callers.

* g++.dg/ubsan/pr63813.C: New test.

--- gcc/c-family/c-ubsan.c.jj   2014-10-29 09:49:47.0 +0100
+++ gcc/c-family/c-ubsan.c  2014-11-18 12:31:42.700607456 +0100
@@ -383,18 +383,19 @@ ubsan_maybe_instrument_array_ref (tree *
 }
 
 static tree
-ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree type,
+ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype,
  enum ubsan_null_ckind ckind)
 {
-  tree orig_op = op;
-  bool instrument = false;
-  unsigned int mina = 0;
-
   if (current_function_decl == NULL_TREE
   || lookup_attribute (no_sanitize_undefined,
   DECL_ATTRIBUTES (current_function_decl)))
 return NULL_TREE;
 
+  tree type = TREE_TYPE (ptype);
+  tree orig_op = op;
+  bool instrument = false;
+  unsigned int mina = 0;
+
   if (flag_sanitize  SANITIZE_ALIGNMENT)
 {
   mina = min_align_of_type (type);
@@ -431,13 +432,20 @@ ubsan_maybe_instrument_reference_or_call
}
   else if (flag_sanitize  SANITIZE_NULL)
instrument = true;
-  if (mina  mina  get_pointer_alignment (op) / BITS_PER_UNIT)
-   instrument = true;
+  if (mina  mina  1)
+   {
+ if (!POINTER_TYPE_P (TREE_TYPE (op))
+ || mina  get_pointer_alignment (op) / BITS_PER_UNIT)
+   instrument = true;
+   }
 }
   if (!instrument)
 return NULL_TREE;
   op = save_expr (orig_op);
-  tree kind = build_int_cst (TREE_TYPE (op), ckind);
+  gcc_assert (POINTER_TYPE_P (ptype));
+  if (TREE_CODE (ptype) == REFERENCE_TYPE)
+ptype = build_pointer_type (TREE_TYPE (ptype));
+  tree kind = build_int_cst (ptype, ckind);
   tree align = build_int_cst (pointer_sized_int_node, mina);
   tree call
 = build_call_expr_internal_loc (loc, IFN_UBSAN_NULL, void_type_node,
@@ -453,7 +461,7 @@ ubsan_maybe_instrument_reference (tree s
 {
   tree op = TREE_OPERAND (stmt, 0);
   op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op,
-TREE_TYPE (TREE_TYPE (stmt)),
+TREE_TYPE (stmt),
 UBSAN_REF_BINDING);
   if (op)
 TREE_OPERAND (stmt, 0) = op;
@@ -471,7 +479,7 @@ ubsan_maybe_instrument_member_call (tree
   || !POINTER_TYPE_P (TREE_TYPE (op)))
 return;
   op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op,
-TREE_TYPE (TREE_TYPE (op)),
+TREE_TYPE (op),
 is_ctor ? UBSAN_CTOR_CALL
 : UBSAN_MEMBER_CALL);
   if (op)
--- gcc/testsuite/g++.dg/ubsan/pr63813.C.jj 2014-11-18 12:37:57.790991586 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr63813.C2014-11-18 12:50:58.723345975 
+0100
@@ -0,0 +1,12 @@
+// PR sanitizer/63813
+// { dg-do compile }
+// { dg-options -fsanitize=undefined -O1 }
+
+struct A {};
+struct B { long foo () const; A bar () const; };
+
+A 
+B::bar () const
+{
+  return *reinterpret_cast A * (foo ());
+}

Jakub

[PATCH] Fix up r217118 - simplify_binary_operation_1 (PR rtl-optimization/63843)

2014-11-18 Thread Jakub Jelinek

Hi!

The case ASHIFTRT: which is meant for this optimization is preceeded by
/* FALLTHRU */ from ROTATE cases, which can't be optimized this way.
Thus, this patch limits this optimization to ASHIFTRT.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/63843
* simplify-rtx.c (simplify_binary_operation_1) case ASHIFTRT: For
optimization of ashiftrt of subreg of lshiftrt, check that code
is ASHIFTRT.

* gcc.c-torture/execute/pr63843.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-18 10:17:01.0 +0100
+++ gcc/simplify-rtx.c  2014-11-18 13:44:26.727792683 +0100
@@ -3118,10 +3118,11 @@ simplify_binary_operation_1 (enum rtx_co
  (subreg:M1 (ashiftrt:M2 (reg:M2)
  (const_int c1 + c2))
   low_part).  */
-  if (!VECTOR_MODE_P (mode)
+  if (code == ASHIFTRT
+  !VECTOR_MODE_P (mode)
SUBREG_P (op0)
CONST_INT_P (op1)
-   (GET_CODE (SUBREG_REG (op0)) == LSHIFTRT)
+  GET_CODE (SUBREG_REG (op0)) == LSHIFTRT
!VECTOR_MODE_P (GET_MODE (SUBREG_REG (op0)))
CONST_INT_P (XEXP (SUBREG_REG (op0), 1))
(GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (op0)))
--- gcc/testsuite/gcc.c-torture/execute/pr63843.c.jj2014-11-18 
13:43:45.417544096 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr63843.c   2014-11-18 
13:43:32.0 +0100
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/63843 */
+
+static inline __attribute__ ((always_inline))
+unsigned short foo (unsigned short v)
+{
+  return (v  8) | (v  8);
+}
+
+unsigned short __attribute__ ((noinline, noclone, hot))
+bar (unsigned char *x)
+{
+  unsigned int a;
+  unsigned short b;
+  __builtin_memcpy (a, x[0], sizeof (a));
+  a ^= 0x80808080U;
+  __builtin_memcpy (x[0], a, sizeof (a));
+  __builtin_memcpy (b, x[2], sizeof (b));
+  return foo (b);
+}
+
+int
+main ()
+{
+  unsigned char x[8] = { 0x01, 0x01, 0x01, 0x01 };
+  if (__CHAR_BIT__ == 8
+   sizeof (short) == 2
+   sizeof (int) == 4
+   bar (x) != 0x8181U)
+__builtin_abort ();
+  return 0;
+}

Jakub

[PATCH] Fix ubsan -fsanitize=signed-integer-overflow expansion (PR sanitizer/63520)

2014-11-18 Thread Jakub Jelinek

Hi!

Apparently, expand_expr with EXPR_WRITE can return
a SUBREG with SUBREG_PROMOTED_VAR_P set on it.  For
UBSAN_CHECK_{ADD,SUB,MUL} expansion, I've been doing just
emit_move_insn into it, which apparently is wrong in that case,
store_expr instead uses convert_move for it.  The
{ADD,SUB,MUL}_OVERFLOW (i.e. __builtin_*_overflow) expansion
shouldn't need it, as the result is complex and complex integers
aren't promoted that way.  As store_expr* uses a tree expression
to store, while I have rtx, I just wrote a short helper function
for this.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR sanitizer/63520
* internal-fn.c (expand_ubsan_result_store): New function.
(expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow):
Use it instead of just emit_move_insn.

* c-c++-common/ubsan/pr63520.c: New test.

--- gcc/internal-fn.c.jj2014-11-12 13:28:47.0 +0100
+++ gcc/internal-fn.c   2014-11-18 15:35:46.395916823 +0100
@@ -395,6 +395,21 @@ expand_arith_overflow_result_store (tree
   write_complex_part (target, lres, false);
 }
 
+/* Helper for expand_*_overflow.  Store RES into TARGET.  */
+
+static void
+expand_ubsan_result_store (rtx target, rtx res)
+{
+  if (GET_CODE (target) == SUBREG  SUBREG_PROMOTED_VAR_P (target))
+/* If this is a scalar in a register that is stored in a wider mode   
+   than the declared mode, compute the result into its declared mode
+   and then convert to the wider mode.  Our value is the computed
+   expression.  */
+convert_move (SUBREG_REG (target), res, SUBREG_PROMOTED_SIGN (target));
+  else
+emit_move_insn (target, res);
+}
+
 /* Add sub/add overflow checking to the statement STMT.
CODE says whether the operation is +, or -.  */
 
@@ -809,7 +824,7 @@ expand_addsub_overflow (location_t loc,
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
{
  if (do_xor)
@@ -904,7 +919,7 @@ expand_neg_overflow (location_t loc, tre
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
expand_arith_overflow_result_store (lhs, target, mode, res);
 }
@@ -1590,7 +1605,7 @@ expand_mul_overflow (location_t loc, tre
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
expand_arith_overflow_result_store (lhs, target, mode, res);
 }
--- gcc/testsuite/c-c++-common/ubsan/pr63520.c.jj   2014-11-18 
15:40:07.271273710 +0100
+++ gcc/testsuite/c-c++-common/ubsan/pr63520.c  2014-11-18 15:40:40.971673904 
+0100
@@ -0,0 +1,16 @@
+/* PR sanitizer/63520 */
+/* { dg-do compile } */
+/* { dg-options -fsanitize=undefined } */
+
+int a;
+
+void
+foo (void)
+{
+  while (1)
+{
+  if (a == 1)
+   break;
+  a -= 1;
+}
+}


Jakub

[PATCH] Fix simd clone vectorization with EH (PR tree-optimization/63915)

2014-11-18 Thread Jakub Jelinek

Hi!

Simd clone vectorization uses vect_finish_stmt_generation which
handles adding the new calls properly to EH tables, but afterwards
we replace the original call with a normal assignment, and if the original
call can throw, we need to remove it from the EH tables.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/63915
* tree-vect-stmts.c (vectorizable_simd_clone_call): Pass
true instead of false as last argument to gsi_replace.

* c-c++-common/gomp/pr60823-4.c: New test.

--- gcc/tree-vect-stmts.c.jj2014-11-14 00:10:39.0 +0100
+++ gcc/tree-vect-stmts.c   2014-11-18 17:02:15.635257023 +0100
@@ -3195,7 +3195,7 @@ vectorizable_simd_clone_call (gimple stm
   set_vinfo_for_stmt (new_stmt, stmt_info);
   set_vinfo_for_stmt (stmt, NULL);
   STMT_VINFO_STMT (stmt_info) = new_stmt;
-  gsi_replace (gsi, new_stmt, false);
+  gsi_replace (gsi, new_stmt, true);
   unlink_stmt_vdef (stmt);
 
   return true;
--- gcc/testsuite/c-c++-common/gomp/pr60823-4.c.jj  2014-11-18 
17:06:38.859319961 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr60823-4.c 2014-11-18 17:06:34.602383632 
+0100
@@ -0,0 +1,7 @@
+/* PR tree-optimization/63915 */
+/* { dg-do run } */
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-options -O2 -fopenmp-simd } */
+/* { dg-additional-options -fpic { target fpic } } */
+
+#include pr60823-2.c

Jakub

[committed] Add testcase for PR tree-optimization/61042

2014-11-18 Thread Jakub Jelinek

Hi!

This PR got fixed by a fix for another PR, but I've committed
the testcase to the testsuite for better coverage.

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/61042
* gcc.c-torture/compile/pr61042.c: New test.

--- gcc/testsuite/gcc.c-torture/compile/pr61042.c.jj2014-11-18 
14:23:56.004671762 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr61042.c   2014-11-18 
14:23:35.0 +0100
@@ -0,0 +1,10 @@
+/* PR tree-optimization/61042 */
+
+int a, b, c[1], d, f;
+
+void
+foo ()
+{
+  for (; b; b++)
+c[0] = c[f]  (d = a);
+}

Jakub

Re: [Patch] pr63937: TARGET_USE_BY_PIECES_INFRASTRUCTURE_P should take an unsigned HOST_WIDE_INT size argument

2014-11-18 Thread Jakub Jelinek

On Tue, Nov 18, 2014 at 09:57:37PM +, James Greenhalgh wrote:
 2014-11-18  James Greenhalgh  james.greenha...@arm.com
 
   PR target/63937
   * target.def (use_by_pieces_infrastructure_p): Take unsigned
   HOST_WIDE_INT as the size parameter.
   * targhooks.c (default_use_by_pieces_infrastructure_p): Likewise.
   * targhooks.h (default_use_by_pieces_infrastructure_p): Likewise.
   * config/arc/arc.c (arc_use_by_pieces_infrastructure_p)): Likewise.
   * config/mips/mips.c (mips_use_by_pieces_infrastructure_p)): Likewise.
   * config/s390/s390.c (s390_use_by_pieces_infrastructure_p)): Likewise.
   * config/sh/sh.c (sh_use_by_pieces_infrastructure_p)): Likewise.
   * config/aarch64/aarch64.c
   (aarch64_use_by_pieces_infrastructure_p)): Likewise.
   * doc/tm.texi: Regenerate.
 
 2014-11-18  James Greenhalgh  james.greenha...@arm.com
 
   * gcc.dg/memset-2.c: New.

Please mention the PR also in the testsuite/ChangeLog entry.
Ok with that change, thanks.

Jakub

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek

On Tue, Nov 18, 2014 at 03:30:56PM -0800, Mike Stump wrote:
  Before wide-int got merged, the testcase worked, though the code didn't
  bother checking anything, just created 0 or constm1_rtx for the two cases
  that could happen and if something else appeared, could just return what
  matched low TImode (or DImode for -m32).
 
 tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for
 units that keeps u in bounds.  Doesn’t this access 32 bytes of OImode
 values in a 16 byte data structure?

No, I'm not touching tmp array at all in that case, only look at vp
individual bytes.  Either they are all 0, or all 0xff, or I return NULL.

 and GET_MODE_BITSIZE (outer_submode) is  MAX_BITSIZE_MODE_ANY_INT, right?
 
 You can’t copy more bytes than the size of the destination has?

On the testcase we have
(subreg:OI (reg:V8SI 1234) 0)
and try to propagate
(const_vector:V8SI [8 x (const_int 0)])
into it.  All zeros even in OImode (or XImode) is still
(const_int 0), all ones even in those modes is (const_int -1),
which are the only constants I'm using for those ultra-wide modes,
anything else will return NULL (but not ICE).

Jakub

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek

On Tue, Nov 18, 2014 at 04:34:12PM -0800, Mike Stump wrote:
 On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote:
  No, I'm not touching tmp array at all in that case, only look at vp
  individual bytes.  Either they are all 0, or all 0xff, or I return NULL.
 
 Oh, sorry, I misread where the break; in your code was going.  I might have 
 been misled by:
 
  -   gcc_assert (GET_MODE_PRECISION (outer_submode)
  -   = MAX_BITSIZE_MODE_ANY_INT);
 
 in your patch.  You don’t need that anymore, do you?  If not, can you remove 
 this part.

I thought the assert is unnecessary given the condition just a few lines
above it.  But can keep it, perhaps gcc_checking_assert would be enough, and
hopefully compiler optimizes it away completely.

 
 The rest looks like normal rtl/vector code, I don’t see anything wrong with 
 it.

Jakub

Re: [PATCH] Fix ubsan -fsanitize=signed-integer-overflow expansion (PR sanitizer/63520)

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 10:02:18AM +0100, Richard Biener wrote:
 On Tue, 18 Nov 2014, Jakub Jelinek wrote:
  Apparently, expand_expr with EXPR_WRITE can return
  a SUBREG with SUBREG_PROMOTED_VAR_P set on it.  For
 
 Huh, that looks bogus to me.  But of course I know nothing
 (read: not enough) to really tell.  Eric?

I've tried to look where it comes from, and it dates back to r2xxx
or so, so forever.
And store_expr has a large:
  else if (GET_CODE (target) == SUBREG  SUBREG_PROMOTED_VAR_P (target))
/* If this is a scalar in a register that is stored in a wider mode
   than the declared mode, compute the result into its declared mode
   and then convert to the wider mode.  Our value is the computed
   expression.  */
handling block.  For targets with only word sized operations something
like that actually makes a lot of sense, I have just not been aware of that.

Jakub

Re: [PATCH] Fix PR63844

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 10:07:07AM +0100, Richard Biener wrote:
 
 The following improves code-generation for PR63844 by using
 a restrict qualified reference type for the OMP receiver decl.
 This improves alias analysis and points-to analysis enough to
 usually allow invariant and store motion where that was possible
 in the non-split-out variant.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?
 
 Sadly I don't have a good testcase (for example one that wasn't
 vectorized before but is now).  We can add one later - ISTR there
 were a few other bugs about not vectorizing with -fopenmp, I'll
 try to find them again and see whether they are fixed.

Most of those PRs should have openmp keyword.

 2014-11-19  Richard Biener  rguent...@suse.de
 
   PR tree-optimization/63844
   * omp-low.c (fixup_child_record_type): Use a restrict qualified
   referece type for the receiver parameter.

Ok, thanks.

 --- gcc/omp-low.c (revision 217692)
 +++ gcc/omp-low.c (working copy)
 @@ -1517,7 +1517,8 @@ fixup_child_record_type (omp_context *ct
layout_type (type);
  }
  
 -  TREE_TYPE (ctx-receiver_decl) = build_pointer_type (type);
 +  TREE_TYPE (ctx-receiver_decl)
 += build_qualified_type (build_reference_type (type), TYPE_QUAL_RESTRICT);
  }
  
  /* Instantiate decls as necessary in CTX to satisfy the data sharing

Jakub

Re: [PATCH, libgomp]: Require vect_simd_clones effective target for examples-4/e.53.5.{c,f90}

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 11:26:25AM +0100, Uros Bizjak wrote:
 2014-11-19  Uros Bizjak  ubiz...@gmail.com
 
 * testsuite/libgomp.c/examples-4/e.53.5.c: Require
 vect_simd_clones effective target.
 * testsuite/libgomp.fortran/examples-4/e.53.5.f90: Ditto.
 
 Tested on x86_64-linux-gnu {,-m32} CentOS 5.11.
 
 OK for mainline?

Yes, thanks.

 Index: testsuite/libgomp.fortran/examples-4/e.53.5.f90
 ===
 --- testsuite/libgomp.fortran/examples-4/e.53.5.f90   (revision 217718)
 +++ testsuite/libgomp.fortran/examples-4/e.53.5.f90   (working copy)
 @@ -1,4 +1,4 @@
 -! { dg-do run }
 +! { dg-do run { target vect_simd_clones } }
  ! { dg-options -O2 }
  ! { dg-additional-options -msse2 { target sse2_runtime } }
  ! { dg-additional-options -mavx { target avx_runtime } }
 Index: testsuite/libgomp.c/examples-4/e.53.5.c
 ===
 --- testsuite/libgomp.c/examples-4/e.53.5.c   (revision 217718)
 +++ testsuite/libgomp.c/examples-4/e.53.5.c   (working copy)
 @@ -1,4 +1,4 @@
 -/* { dg-do run } */
 +/* { dg-do run { target vect_simd_clones } } */
  /* { dg-options -O2 } */
  /* { dg-additional-options -msse2 { target sse2_runtime } } */
  /* { dg-additional-options -mavx { target avx_runtime } } */


Jakub

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote:
  OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
  they are never used for integer arithmetics (and there is no way
  to represent all their values in RTL if not using CONST_WIDE_INT).
  As the following testcase shows, simplify_immed_subreg can be called
  with such modes though, e.g. trying to forward propagate a CONST_VECTOR
  (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
  in the IL directly) into a SUBREG_REG.
 
 So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
 end doing with that OI mode subreg?  (why can't we simply use the
 V4DF one?)

propagate_rtx_1 is called on
(subreg:OI (reg:V8DI 89) 0)
with old_rtx
(reg:V8DI 89)
and new_rtx
(const_vector:V8DI [
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
])

Seems we throw away the result in that case though, because
gen_lowpart_common doesn't like to return low parts of VOIDmode constants
larger if the mode is larger than double int:
1353  innermode = GET_MODE (x);
1354  if (CONST_INT_P (x)
1355   msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT)
1356innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0);
1357  else if (innermode == VOIDmode)
1358innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0);
we hit the last if and as mode is wider than innermode, we return NULL later
on.

  The following patch instead of ICE handles the most common cases (all 0
  and all 1 CONST_VECTORs) and returns NULL otherwise.
  
  Before wide-int got merged, the testcase worked, though the code didn't
  bother checking anything, just created 0 or constm1_rtx for the two cases
  that could happen and if something else appeared, could just return what
  matched low TImode (or DImode for -m32).
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Looks ok to me.  Not sure if the zero/all-ones case really happens that
 much to be worth special-casing - I think you could use
 fixed_wide_intproper-size and simply see if the result is representable
 in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
 the patch may be smaller for that.

So perhaps something like this?  Don't know how much more inefficient it is
compared to what it used to do before.

Or just keep the existing code and just remove the assert and instead return
NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
least during propagation that will make zero change.
Though, in that case I have still doubts about the current code handling right
modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
== 0, we still silently throw away the upper bits, don't we?

BTW, to Mike, the assert has been misplaced, there was first buffer overflow
and only after that the assert.

2014-11-19  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider
than MAX_BITSIZE_MODE_ANY_INT.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute
int units
  = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
  / HOST_BITS_PER_WIDE_INT;
-   HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
-   wide_int r;
+   const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE 
++ HOST_BITS_PER_WIDE_INT - 1)
+   / HOST_BITS_PER_WIDE_INT;
+   HOST_WIDE_INT tmp[tmpsize];
+   typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int;
+   imm_int r;
 
for (u = 0; u  units; u++)
  {
@@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
-   r = wide_int::from_array (tmp, units,
- GET_MODE_PRECISION (outer_submode));
+   r = imm_int::from_array (tmp, units,
+GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 12:04

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote:
  So perhaps something like this?  Don't know how much more inefficient it is
  compared to what it used to do before.
 
 Yes, that looks good.
 
  Or just keep the existing code and just remove the assert and instead return
  NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
  least during propagation that will make zero change.
  Though, in that case I have still doubts about the current code handling 
  right
  modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
  MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
  == 0, we still silently throw away the upper bits, don't we?
 
 Well - not with your added check, no?

For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or
is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
precision?  Mike?

Jakub

Re: [PATCH] Fix sanitizer/63690

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 02:51:11PM +0100, Marek Polacek wrote:
 As the testcase shows, we should really check first that we
 have a MEM_REF, otherwise subsequent TREE_OPERAND might ICE.
 On an invalid testcase we might get e.g. STRING_CST.

Well, addr_object_size should handle STRING_CSTs just fine, so
why not handle STRING_CSTs in there too (similarly to decl_p = true
case).
Supposedly you want decl_p = true for them and need to tweak
  if (!POINTER_TYPE_P (TREE_TYPE (base))  !DECL_P (base))
but otherwise it should work.
On the other side,
int
foo (int x)
{
  return foobar[x];
}
is already handled by -fsanitize=bounds and for some reason
gimple_assign_load_p doesn't consider handled components with
STRING_CST as base as loads so for -fsanitize=object-size
we aren't called for that.

Jakub

Re: [PATCH] Fix sanitizer/63690

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 03:12:05PM +0100, Jakub Jelinek wrote:
 On Wed, Nov 19, 2014 at 02:51:11PM +0100, Marek Polacek wrote:
  As the testcase shows, we should really check first that we
  have a MEM_REF, otherwise subsequent TREE_OPERAND might ICE.
  On an invalid testcase we might get e.g. STRING_CST.
 
 Well, addr_object_size should handle STRING_CSTs just fine, so
 why not handle STRING_CSTs in there too (similarly to decl_p = true
 case).
 Supposedly you want decl_p = true for them and need to tweak
   if (!POINTER_TYPE_P (TREE_TYPE (base))  !DECL_P (base))
 but otherwise it should work.
 On the other side,
 int
 foo (int x)
 {
   return foobar[x];
 }
 is already handled by -fsanitize=bounds and for some reason
 gimple_assign_load_p doesn't consider handled components with
 STRING_CST as base as loads so for -fsanitize=object-size
 we aren't called for that.

The patch is ok as is.

Jakub

[PATCH] gimple_{build_assign,assign_set_rhs}_with_ops* cleanup

2014-11-19 Thread Jakub Jelinek

Hi!

This patch:
1) adds unary op overload to gimple_build_assign_with_ops
   so that many callers don't need to pass NULL_TREE as the last
   argument explicitly
2) adds unary op overload to gimple_assign_set_rhs_with_ops for similar
   reasons
3) renames gimple_assign_set_rhs_with_ops_1 to
   gimple_assign_set_rhs_with_ops so it becomes ternary op overload
   to the existing binary op one and 2) added unary op
and adjusts all users that I found.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-19  Jakub Jelinek  ja...@redhat.com

* gimple.h (gimple_build_assign_with_ops): Add unary arg overload.
(gimple_assign_set_rhs_with_ops_1): Renamed to ...
(gimple_assign_set_rhs_with_ops): ... this.  Adjust binary arg
inline overload to use it.  Add unary arg overload.
* gimple.c (gimple_build_assign_with_ops): New unary arg overload.
(gimple_assign_set_rhs_from_tree): Use
gimple_assign_set_rhs_with_ops instead of
gimple_assign_set_rhs_with_ops_1.
(gimple_assign_set_rhs_with_ops_1): Renamed to ...
(gimple_assign_set_rhs_with_ops): ... this.
* ipa-split.c (split_function): Remove last NULL argument
from gimple_build_assign_with_ops call.
* tree-ssa-loop-im.c
(move_computations_dom_walker::before_dom_children): Likewise.
* tsan.c (instrument_builtin_call): Likewise.
* tree-vect-stmts.c (vect_init_vector, vectorizable_mask_load_store,
vectorizable_conversion, vectorizable_load): Likewise.
* tree-vect-loop.c (vect_is_simple_reduction_1,
get_initial_def_for_induction): Likewise.
* tree-loop-distribution.c (generate_memset_builtin): Likewise.
* tree-vect-patterns.c (vect_handle_widen_op_by_const,
vect_recog_widen_mult_pattern, vect_operation_fits_smaller_type,
vect_recog_over_widening_pattern, vect_recog_rotate_pattern,
vect_recog_vector_vector_shift_pattern, vect_recog_divmod_pattern,
vect_recog_mixed_size_cond_pattern, adjust_bool_pattern_cast,
adjust_bool_pattern, vect_recog_bool_pattern): Likewise.
* tree-ssa-phiopt.c (conditional_replacement, abs_replacement,
neg_replacement): Likewise.
* asan.c (build_shadow_mem_access, maybe_create_ssa_name,
maybe_cast_to_ptrmode, asan_expand_check_ifn): Likewise.
* tree-vect-slp.c (vect_get_constant_vectors): Likewise.
* omp-low.c (lower_rec_input_clauses, expand_omp_for_generic,
expand_omp_for_static_nochunk, expand_omp_for_static_chunk,
simd_clone_adjust): Likewise.
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Likewise.
* gimple-ssa-strength-reduction.c (introduce_cast_before_cand,
replace_one_candidate): Likewise.
* gimple-builder.c (build_type_cast): Likewise.
* tree-ssa-forwprop.c (simplify_rotate): Likewise.
(forward_propagate_addr_expr_1): Remove last NULL argument
from gimple_assign_set_rhs_with_ops call.
(simplify_vector_constructor): Use gimple_assign_set_rhs_with_ops
instead of gimple_assign_set_rhs_with_ops_1.
* tree-ssa-reassoc.c (maybe_optimize_range_tests): Remove last NULL
argument from gimple_build_assign_with_ops call.
(repropagate_negates): Remove last NULL argument from
gimple_assign_set_rhs_with_ops call.
* ubsan.c (ubsan_expand_null_ifn, ubsan_expand_objsize_ifn): Remove
last NULL argument from gimple_build_assign_with_ops call.
(instrument_bool_enum_load): Likewise.  Remove last NULL argument
from gimple_assign_set_rhs_with_ops call.
* tree-ssa-math-opts.c (build_and_insert_cast, convert_mult_to_fma):
Remove last NULL argument from gimple_build_assign_with_ops call.
(bswap_replace): Likewise.  Use gimple_assign_set_rhs_with_ops instead
of gimple_assign_set_rhs_with_ops_1.
(convert_plusminus_to_widen): Use gimple_assign_set_rhs_with_ops
instead of gimple_assign_set_rhs_with_ops_1.
* gimple-fold.c (replace_stmt_with_simplification): Likewise.
(rewrite_to_defined_overflow, gimple_build): Remove last NULL argument
from gimple_build_assign_with_ops call.
* tree-ssa-strlen.c (handle_pointer_plus): Remove last NULL argument
from gimple_assign_set_rhs_with_ops call.
* tree-vrp.c (simplify_truth_ops_using_ranges,
simplify_bit_ops_using_ranges): Remove last NULL argument from
gimple_assign_set_rhs_with_ops call.
(simplify_float_conversion_using_ranges,
simplify_internal_call_using_ranges): Remove last NULL argument from
gimple_build_assign_with_ops call.

--- gcc/gimple.h.jj 2014-11-19 10:45:17.780769013 +0100
+++ gcc/gimple.h2014-11-19 10:52:52.118574835 +0100
@@ -1180,6 +1180,8 @@ gimple gimple_build_assign_with_ops (enu
 tree, tree

Re: [COMMITTED 1/3] Make TARGET_STATIC_CHAIN allow a function type

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 03:58:50PM +0100, Richard Henderson wrote:
 As opposed to always being a decl.  This is a prerequisite
 to allowing the static chain to be loaded for indirect calls.
 
 * targhooks.c (default_static_chain): Remove check for
 DECL_STATIC_CHAIN.
 * config/moxie/moxie.c (moxie_static_chain): Likewise.
 * config/i386/i386.c (ix86_static_chain): Allow decl or type
 as the first argument.
 * config/xtensa/xtensa.c (xtensa_static_chain): Change the name
 of the unused first parameter.
 * doc/tm.texi (TARGET_STATIC_CHAIN): Document the first parameter
 may be a type.
 * target.def (static_chain): Likewise.

r217769 broke lots of tests on i686-linux, haven't verified if everything
below is caused by this, but regparm-1.c at -m32 certainly is.
The ICE is:
regparm-1.c: In function ‘test_realigned’:
regparm-1.c:49:1: internal compiler error: in ix86_expand_prologue, at
config/i386/i386.c:11347
 }
 ^
0x106140a ix86_expand_prologue()
../../gcc/config/i386/i386.c:11347
0x11aaf2a gen_prologue()
../../gcc/config/i386/i386.md:12095
0x99baf9 thread_prologue_and_epilogue_insns()
../../gcc/function.c:5911
0x99cba4 rest_of_handle_thread_prologue_and_epilogue
../../gcc/function.c:6481
0x99cc0c execute
../../gcc/function.c:6519
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.

+FAIL: gcc.dg/pr36015.c (internal compiler error)
+FAIL: gcc.dg/pr36015.c (test for excess errors)
+UNRESOLVED: gcc.dg/pr36015.c compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0  (internal compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0  (test for excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O0  compilation failed to 
produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -fpic (internal compiler 
error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -fpic (test for excess 
errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O0 -fpic compilation 
failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap (internal 
compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap (test for 
excess errors)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap -fpic (internal 
compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap -fpic (test for 
excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap -fpic 
compilation failed to produce executable
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O0 -mforce-drap 
compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O1 -fpic (internal compiler 
error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O1 -fpic (test for excess 
errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O1 -fpic compilation 
failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O1 -mforce-drap -fpic (internal 
compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O1 -mforce-drap -fpic (test for 
excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O1 -mforce-drap -fpic 
compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -flto-partition=none 
-fpic (internal compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -flto-partition=none 
-fpic (test for excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto 
-flto-partition=none -fpic compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -flto-partition=none 
-mforce-drap -fpic (internal compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -flto-partition=none 
-mforce-drap -fpic (test for excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto 
-flto-partition=none -mforce-drap -fpic compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -fpic (internal 
compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -fpic (test for excess 
errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -fpic 
compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -mforce-drap -fpic 
(internal compiler error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -mforce-drap -fpic 
(test for excess errors)
+UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c   -O2 -flto -mforce-drap 
-fpic compilation failed to produce executable
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -fpic (internal compiler 
error)
+FAIL: gcc.dg/torture/stackalign/regparm-1.c   -O2 -fpic (test for excess 
errors)
+UNRESOLVED:

[committed] Allow input and output to be both /dev/null (PR driver/36312, PR driver/63837)

2014-11-19 Thread Jakub Jelinek

Hi!

I've bootstrapped/regtested this patch for Manuel (just small formatting
changes from me), and committed to trunk.

2014-11-19  Manuel López-Ibáñez  m...@gcc.gnu.org
Jakub Jelinek  ja...@redhat.com

PR driver/36312
PR driver/63837
* gcc.c (process_command): Don't check for input/output
filename equality if output is HOST_BIT_BUCKET.
* toplev.c (init_asm_output): Likewise.

--- gcc/gcc.c.jj2014-11-14 00:10:38.0 +0100
+++ gcc/gcc.c   2014-11-19 17:13:53.409985279 +0100
@@ -4155,7 +4155,9 @@ process_command (unsigned int decoded_op
   CL_DRIVER, handlers, global_dc);
 }
 
-  if (output_file  strcmp (output_file, -))
+  if (output_file
+   strcmp (output_file, -) != 0
+   strcmp (output_file, HOST_BIT_BUCKET) != 0)
 {
   int i;
   for (i = 0; i  n_infiles; i++)
--- gcc/toplev.c.jj 2014-11-18 08:26:38.0 +0100
+++ gcc/toplev.c2014-11-19 17:14:47.324047562 +0100
@@ -945,7 +945,8 @@ init_asm_output (const char *name)
}
   if (!strcmp (asm_file_name, -))
asm_out_file = stdout;
-  else if (!canonical_filename_eq (asm_file_name, name))
+  else if (!canonical_filename_eq (asm_file_name, name)
+  || !strcmp (asm_file_name, HOST_BIT_BUCKET))
asm_out_file = fopen (asm_file_name, w);
   else
/* Use fatal_error (UNKOWN_LOCATION) instead of just fatal_error to

Jakub

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 10:38:57AM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote:
  On Wed, 19 Nov 2014, Jakub Jelinek wrote:
  For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
  about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, 
  or
  is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
  precision?  Mike?
  
  It can represent any - well, the RTX at least.  Code then using
  simple wide_int may wreck then though.
 
 So, my worry is this…  once you start in on adding support here or there
 for int modes wider than the largest supported int mode, you create a
 never ending maintenance nightmare with complex rules that one will never
 be able to keep straight.  There needs to be a single line or two, that
 explains the rules that we all agree to, then it will always be clear what
 the rule is.  The largest supported int mode is: x, has a nice, simple,
 easy to explain aspect to it.

Well, at least for TARGET_SUPPORTS_WIDE_INT == 0 my patch would always
create CONST_INTs or CONST_DOUBLEs, which all fit into
MAX_BITSIZE_MODE_ANY_INT bits, CONST_INTs are modeless, so in what wider
mode you use them doesn't matter.  For TARGET_SUPPORTS_WIDE_INT != 0
we could certainly cap it similarly, if wi::min_precision (r, SIGNED) 
MAX_BITSIZE_MODE_ANY_INT we could return NULL_RTX.

Though, following patch is just fine for me too, I don't think it will
make a significant difference:

--- gcc/simplify-rtx.c  2014-11-19 15:39:24.073113107 +0100
+++ gcc/simplify-rtx.c  2014-11-19 22:55:44.201464253 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ return NULL_RTX;
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;


Jakub

Re: OpenACC middle end changes

2014-11-19 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
 Another change that's required is (something like) the following. For ptx,
 we need to know whether to output something as a .func (callable from ptx
 code) or a .kernel (callable from the host). That means we need to mark the
 kernel functions somehow in omp-low.c, and the following does that by way of
 a new attribute (already recognized by the nvptx backend).

I think Richard's and Honza's preference in this case is a flag in
cgraph_node instead of an attribute.

   * omp-low.c (create_omp_child_function): Tag entrypoint
 functions with a special attribute.
 
 diff --git a/gcc/omp-low.c b/gcc/omp-low.c
 index 42ba317..8408025 100644
 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool
 task_copy)
 break;
   }
  }
 +  if (cgraph_node::get_create (decl)-offloadable
 +   !lookup_attribute (omp declare target,
 +   DECL_ATTRIBUTES (current_function_decl)))
 +DECL_ATTRIBUTES (decl)
 +  = tree_cons (get_identifier (omp target entrypoint),
 +   NULL_TREE, DECL_ATTRIBUTES (decl));
 
t = build_decl (DECL_SOURCE_LOCATION (decl),
   RESULT_DECL, NULL_TREE, void_type_node);

Jakub

Re: OpenACC middle end changes

2014-11-19 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:
 Thomas had apparently already pointed out an issue with the new gomp_target
 class (there are multiple similar types of statements we want to handle with
 OpenACC, they have different codes but we want to have function pointers
 operating on any of them) back in July. That seems to have been ignored. By
 necessity, some of David's changes are reverted in the following patch.

I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just
two new gimple_omp_target_kind GF_* flags.

Jakub

Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 02:36:26PM +0300, Evgeny Stupachenko wrote:
 +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
 + shuffles.  */

I think switch (d-vmode) would be more readable.

 +  op = gen_reg_rtx (d-vmode);
 +  t = gen_reg_rtx (V4DImode);
 +  emit_insn (gen_pack (op, dop0, dop1));
 +  emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op),
 const0_rtx,

Too long line, wrap it?

Will leave the rest to Uros.

Jakub

Re: [PATCH] PR63426 Fix various signed integer overflows

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 02:27:52PM +0100, Markus Trippelsdorf wrote:
 2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
 
   * emit-rtl.c (const_wide_int_htab_hash): Likewise.
   * loop-iv.c (determine_max_iter): Likewise.
   (iv_number_of_iterations): Likewise.
   * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
   * varasm.c (get_section_anchor): Likewise.

Ok, with one small change:

 --- a/gcc/varasm.c
 +++ b/gcc/varasm.c
 @@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block *block, 
 HOST_WIDE_INT offset,
  offset = 0;
else
  {
 -  bias = 1  (GET_MODE_BITSIZE (ptr_mode) - 1);
 +  bias = (unsigned HOST_WIDE_INT) 1  (GET_MODE_BITSIZE (ptr_mode) - 1);

Please use HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1.

Jakub

Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote:
 Hi,
 
 MPX runtime checks some feature bits in order to check MPX is fully
 supported.  Runtime does it by cpuid calls but there is a
 __builtin_cpu_supports which may be used for that.  Unfortunately
 currently it doesn't support required bits.  Will it be OK to add them for
 trunk?

I think using cpuid for that is just fine.  __builtin_cpu_supports
is for ISA additions users might actually want to version code for,
MPX stuff, as the instructions are nops without hw support, are not
something one would multi-version a function for.
If anything, AVX512F and AVX512BW+VL might be good candidates for that, not
MPX.

Jakub

Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 06:14:52PM +0100, Marek Polacek wrote:
 This patch fixes a bunch of ICEs related to C++14 constexprs and
 -fsanitize=undefined.  We should ignore ubsan internal functions
 and ubsan builtins in constexpr functions in cxx_eval_call_expression.
 
 Also add proper printing of internal functions into the C++ printer.
 
 Bootstrapped/regtested on ppc64-linux, ok for trunk?

I'd like Jason to review this.  But a few nits:

 @@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
 tree t,
  }
if (DECL_CLONED_FUNCTION_P (fun))
  fun = DECL_CLONED_FUNCTION (fun);
 +
 +  if (!current_function_decl  is_ubsan_builtin_p (fun))
 +return void_node;
 +

I don't understand the !current_function_decl here.

Also, looking at is_ubsan_builtin_p definition, I'd say
it should IMHO at least test DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
before comparing the function name, you can declare
__builtin_ubsan_foobarbaz () and use it and it won't be a builtin.

As for the testcase, I'd like to understand if C++ FE should reject
the constexpr functions when used with arguments that trigger undefined
behavior.  But certainly the behavior should not depend on whether
-fsanitize=undefined or not.
Also, what is the reason for constexpr call flows off the end errors?
Shouldn't that be avoided if any error is found while interpreting the
function?

Jakub

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-20 Thread Jakub Jelinek

On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
  Though, following patch is just fine for me too, I don't think it will
  make a significant difference:
 
 This version is fine by me.

Richard, are you ok with that too?

Bootstrapped/regtested on x86_64-linux and i686-linux now.

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): Return NULL for integer
modes wider than MAX_BITSIZE_MODE_ANY_INT.  If not using
CONST_WIDE_INT, make sure r fits into CONST_DOUBLE.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ return NULL_RTX;
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 12:04:23.490489130 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-19 12:04:23.490489130 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}


Jakub

[PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)

2014-11-20 Thread Jakub Jelinek

Hi!

This patch fixes ICEs if a non-lvalue vector (say cast of one vector
to another vector type) was subscripted and used as lhs.
The following patch, if *vecp is not lvalue, will copy it to a temporary
variable which can be made addressable for the subscription, and afterwards
wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used
on the lhs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/63764
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Change
return type to bool.
* c-common.c: Include gimple-expr.c.
(convert_vector_to_pointer_for_subscript): Change return type to
bool.  If *vecp is not lvalue_p and has VECTOR_TYPE, return true
and copy it into a TARGET_EXPR and use that instead of *vecp
directly.
c/
* c-typeck.c (build_array_ref): Adjust
convert_vector_to_pointer_for_subscript caller.  If it returns true,
call non_lvalue_loc on the result.
cp/
* typeck.c (cp_build_array_ref): Adjust
convert_vector_to_pointer_for_subscript caller.  If it returns true,
call non_lvalue_loc on the result.
testsuite/
* c-c++-common/pr63764-1.c: New test.
* c-c++-common/pr63764-2.c: New test.

--- gcc/c-family/c-common.h.jj  2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.h 2014-11-20 08:38:02.527655971 +0100
@@ -1310,7 +1310,7 @@ extern tree build_userdef_literal (tree
   enum overflow_type overflow,
   tree num_string);
 
-extern void convert_vector_to_pointer_for_subscript (location_t, tree*, tree);
+extern bool convert_vector_to_pointer_for_subscript (location_t, tree *, tree);
 
 /* Possibe cases of scalar_to_vector conversion.  */
 enum stv_conv {
--- gcc/c-family/c-common.c.jj  2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.c 2014-11-20 08:50:21.000573676 +0100
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.
 #include target-def.h
 #include gimplify.h
 #include wide-int-print.h
+#include gimple-expr.h
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
 
@@ -12030,22 +12031,47 @@ build_userdef_literal (tree suffix_id, t
 }
 
 /* For vector[index], convert the vector to a
-   pointer of the underlying type.  */
-void
+   pointer of the underlying type.  Return true if the resulting
+   ARRAY_REF should not be an lvalue.  */
+
+bool
 convert_vector_to_pointer_for_subscript (location_t loc,
-tree* vecp, tree index)
+tree *vecp, tree index)
 {
+  bool ret = false;
   if (TREE_CODE (TREE_TYPE (*vecp)) == VECTOR_TYPE)
 {
   tree type = TREE_TYPE (*vecp);
   tree type1;
 
+  ret = !lvalue_p (*vecp);
   if (TREE_CODE (index) == INTEGER_CST)
 if (!tree_fits_uhwi_p (index)
 || tree_to_uhwi (index) = TYPE_VECTOR_SUBPARTS (type))
   warning_at (loc, OPT_Warray_bounds, index value is out of bound);
 
-  c_common_mark_addressable_vec (*vecp);
+  if (ret)
+   {
+ tree tmp = create_tmp_var_raw (type, NULL);
+ DECL_SOURCE_LOCATION (tmp) = loc;
+ *vecp = c_save_expr (*vecp);
+ if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR)
+   {
+ bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp);
+ *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp);
+ *vecp
+   = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp,
+ *vecp, NULL_TREE, NULL_TREE),
+ non_const);
+   }
+ else
+   *vecp = build4 (TARGET_EXPR, type, tmp, *vecp,
+   NULL_TREE, NULL_TREE);
+ SET_EXPR_LOCATION (*vecp, loc);
+ c_common_mark_addressable_vec (tmp);
+   }
+  else
+   c_common_mark_addressable_vec (*vecp);
   type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type));
   type1 = build_pointer_type (TREE_TYPE (*vecp));
   bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1);
@@ -12065,6 +12091,7 @@ convert_vector_to_pointer_for_subscript
   *vecp = build1 (ADDR_EXPR, type1, *vecp);
   *vecp = convert (type, *vecp);
 }
+  return ret;
 }
 
 /* Determine which of the operands, if any, is a scalar that needs to be
--- gcc/c/c-typeck.c.jj 2014-11-19 15:39:24.044113650 +0100
+++ gcc/c/c-typeck.c2014-11-20 08:38:02.534655847 +0100
@@ -2495,7 +2495,8 @@ build_array_ref (location_t loc, tree ar
 
   gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE);
 
-  convert_vector_to_pointer_for_subscript (loc, array, index);
+  bool non_lvalue
+= convert_vector_to_pointer_for_subscript (loc, array, index);
 
   if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE)
 {
@@ -2557,6 +2558,8 @@ build_array_ref (location_t loc, tree

[PATCH] Fix tree-ssa-strlen ICE introduced by r211956 (PR tree-optimization/61773)

2014-11-20 Thread Jakub Jelinek

Hi!

Before the r211956 changes, the only places that set si-stmt
were required to check that stpcpy has been declared (with the right
prototype) to signal the strlen pass that it can use stpcpy for
optimization.  But r211956 sets si-stmt also for malloca call,
which isn't in any way related to stpcpy.  So, this patch moves the
assertion where it really is needed (for strcat/strcpy and their checking
variants cases).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/61773
* tree-ssa-strlen.c (get_string_length): Don't assert
stpcpy has been prototyped if si-stmt is BUILT_IN_MALLOC.

* gcc.dg/pr61773.c: New test.

--- gcc/tree-ssa-strlen.c.jj2014-11-19 18:47:59.0 +0100
+++ gcc/tree-ssa-strlen.c   2014-11-20 09:46:33.949017462 +0100
@@ -430,7 +430,6 @@ get_string_length (strinfo si)
   callee = gimple_call_fndecl (stmt);
   gcc_assert (callee  DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
   lhs = gimple_call_lhs (stmt);
-  gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
   /* unshare_strinfo is intentionally not called here.  The (delayed)
 transformation of strcpy or strcat into stpcpy is done at the place
 of the former strcpy/strcat call and so can affect all the strinfos
@@ -479,6 +478,7 @@ get_string_length (strinfo si)
case BUILT_IN_STRCPY_CHK:
case BUILT_IN_STRCPY_CHKP:
case BUILT_IN_STRCPY_CHK_CHKP:
+ gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
  if (gimple_call_num_args (stmt) == (with_bounds ? 4 : 2))
fn = builtin_decl_implicit (BUILT_IN_STPCPY);
  else
--- gcc/testsuite/gcc.dg/pr61773.c.jj   2014-11-20 10:12:48.664616764 +0100
+++ gcc/testsuite/gcc.dg/pr61773.c  2014-11-20 10:13:47.384557904 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/61773 */
+/* { dg-do compile } */
+/* { dg-options -O2 } */
+
+void
+foo (char **x)
+{
+  char *p = __builtin_malloc (64);
+  char *q = __builtin_malloc (64);
+  __builtin_strcat (q, abcde);
+  __builtin_strcat (p, ab);
+  p[1] = q[3];
+  __builtin_strcat (p, q);
+  x[0] = p;
+  x[1] = q;
+}

Jakub

[ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)

2014-11-20 Thread Jakub Jelinek

Hi!

Seems the gcc.target/ia64/small-addr-1.c testcase is failing on ia64 since
r210262 but clearly has been failing for much longer if compiled with C++
(just there is insufficient testsuite coverage).
The problem is that for the model attribute (and apparently common_object on
VMS too), the argument of that attribute is supposed to be an identifier
rather than expression (for common_object either an identifier or string),
and these days one has to tell the frontends about that in order not to
get the argument parsed as an expression.

The following untested patch fixes that (tested on small-addr-1.c with
a cross-compiler), I don't have ia64 hw nor spare cycles to test this
though, so I'm just offering the patch as is if anyone wants to test it.
Perhaps better testsuite coverage wouldn't hurt (test the model (small)
attribute also in C++, perhaps test the common_object attribute on VMS?).

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/61137
* config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function.
(TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it.

--- gcc/config/ia64/ia64.c.jj   2014-11-11 00:06:23.0 +0100
+++ gcc/config/ia64/ia64.c  2014-11-20 11:51:59.729478773 +0100
@@ -324,6 +324,7 @@ static bool ia64_vms_valid_pointer_mode
 static tree ia64_vms_common_object_attribute (tree *, tree, tree, int, bool *)
  ATTRIBUTE_UNUSED;
 
+static bool ia64_attribute_takes_identifier_p (const_tree);
 static tree ia64_handle_model_attribute (tree *, tree, tree, int, bool *);
 static tree ia64_handle_version_id_attribute (tree *, tree, tree, int, bool *);
 static void ia64_encode_section_info (tree, rtx, int);
@@ -669,8 +670,26 @@ static const struct attribute_spec ia64_
 #undef TARGET_VECTORIZE_VEC_PERM_CONST_OK
 #define TARGET_VECTORIZE_VEC_PERM_CONST_OK ia64_vectorize_vec_perm_const_ok
 
+#undef TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P
+#define TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P ia64_attribute_takes_identifier_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
+/* Returns TRUE iff the target attribute indicated by ATTR_ID takes a plain
+   identifier as an argument, so the front end shouldn't look it up.  */
+
+static bool
+ia64_attribute_takes_identifier_p (const_tree attr_id)
+{
+  if (is_attribute_p (model, attr_id))
+return true;
+#if TARGET_ABI_OPEN_VMS
+  if (is_attribute_p (common_object, attr_id))
+return true;
+#endif
+  return false;
+}
+
 typedef enum
   {
 ADDR_AREA_NORMAL,  /* normal address area */

Jakub

Re: [PATCH] rs6000: Follow up for signed integer overflow fix

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 07:41:43PM +0100, Markus Trippelsdorf wrote:
 2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
 
   * config/rs6000/rs6000.c (includes_rldic_lshift_p): Cast 0 to unsigned.
   (includes_rldicr_lshift_p): Likewise.
 
 diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
 index a9604cf3fa97..d7958b33ba1a 100644
 --- a/gcc/config/rs6000/rs6000.c
 +++ b/gcc/config/rs6000/rs6000.c
 @@ -16197,10 +16197,10 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop)
unsigned HOST_WIDE_INT c, lsb, shift_mask;
  
c = INTVAL (andop);
 -  if (c == 0 || c == ~0)
 +  if (c == 0 || c == ~(unsigned HOST_WIDE_INT) 0)
   return 0;
  
 -  shift_mask = ~0;
 +  shift_mask = ~(unsigned HOST_WIDE_INT) 0;
shift_mask = INTVAL (shiftop);
  
/* Find the least significant one bit.  */
 @@ -16235,7 +16235,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop)
  {
unsigned HOST_WIDE_INT c, lsb, shift_mask;
  
 -  shift_mask = ~0;
 +  shift_mask = ~(unsigned HOST_WIDE_INT) 0;
shift_mask = INTVAL (shiftop);
c = INTVAL (andop);

You could use ~HOST_WIDE_INT_UC (0) in all the 3 cases.

Jakub

Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.

2014-11-20 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 11:30:11PM +0100, Mark Wielaard wrote:
 --- a/gcc/config/avr/avr-c.c
 +++ b/gcc/config/avr/avr-c.c
 @@ -386,7 +386,8 @@ avr_cpu_cpp_builtins (struct cpp_reader *pfile)
   (as mentioned in ISO/IEC DTR 18037; Annex F.2) which is not
   implemented in GCC up to now.  */
  
 -  if (!strcmp (lang_hooks.name, GNU C))
 +  if (strncmp (lang_hooks.name, GNU C, 5) == 0
 +   strncmp (lang_hooks.name, GNU C++, 7) != 0)

I wonder if the tests for C language shouldn't be better done
as (strncmp (lang_hooks.name, GNU C, 5) == 0
 strchr (0123456789, lang_hooks.name[5]) != NULL)
or (strncmp (lang_hooks.name, GNU C, 5) == 0
 (ISDIGIT (lang_hooks.name[5]) || lang_hooks.name[5] == '\0'))
to make it explicit what we are looking for, not what we aren't.

 +  either, so for now use 0.  Match GNU C++ first, since it needs to
 +  be compared with strncmp, like GNU C, which has the same prefix.  */
 +  if (! strncmp (language_string, GNU C++, 7)
 +|| ! strcmp (language_string, GNU Objective-C++))

Wrong formatting, || should be below ! on the previous line.

 + i = 9;
 +  else if (! strncmp (language_string, GNU C, 5)
 || ! strcmp (language_string, GNU GIMPLE)
 || ! strcmp (language_string, GNU Go))

And here too.  But if you use a different check for C (see above), you could
avoid moving the C++ case first.

 --- a/gcc/langhooks.h
 +++ b/gcc/langhooks.h
 @@ -261,7 +261,8 @@ struct lang_hooks_for_lto
  
  struct lang_hooks
  {
 -  /* String identifying the front end.  e.g. GNU C++.  */
 +  /* String identifying the front end.  e.g. GNU C++.
 + Might include language version being used.  */

As we no longer have GNU C++ as any name, using it as an example
is weird.  So,
  /* String identifying the front end and optionally language standard
 version, e.g. GNU C++98 or GNU Java.  */
?

LGTM otherwise.

Jakub

Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.

2014-11-21 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 11:30:12PM +0100, Mark Wielaard wrote:
 @@ -19592,13 +19597,28 @@ gen_compile_unit_die (const char *filename)
  
language = DW_LANG_C;
if (strncmp (language_string, GNU C++, 7) == 0)
 -language = DW_LANG_C_plus_plus;
 +{
 +  language = DW_LANG_C_plus_plus;
 +  if (dwarf_version = 5 || !dwarf_strict)
 + {
 +   if (strcmp (language_string, GNU C++11) == 0)
 + language = DW_LANG_C_plus_plus_11;
 +   else if (strcmp (language_string, GNU C++14) == 0)
 + language = DW_LANG_C_plus_plus_14;
 + }
 +}

I think best would be to tweak
  if (value  2 || value  4)
error_at (loc, dwarf version %d is not supported, value);
  else
opts-x_dwarf_version = value;
so that we accept value 5 too, and for now, until the
most common consumers are changed, use
  if (dwarf_version = 5 /* || !dwarf_strict */)
so that
- you can actually use it in the test with -gdwarf-5
- you can commit it right away
- people can start playing with what it will mean to support DWARF5

GCC 4.5 also allowed -gdwarf-4 even when DWARF4 has not been released yet.
When there are consumers that can grok it, we can uncomment the
|| !dwarf_strict.

Jason, do you agree?

else if (strncmp (language_string, GNU C, 5) == 0)
  {
language = DW_LANG_C89;
if (dwarf_version = 3 || !dwarf_strict)
 - if (strcmp (language_string, GNU C99) == 0)
 -   language = DW_LANG_C99;
 + {
 +   if (strcmp (language_string, GNU C89) != 0)
 + language = DW_LANG_C99;
 +
 +   if (dwarf_version = 5 || !dwarf_strict)
 + if (strcmp (language_string, GNU C11) == 0)
 +   language = DW_LANG_C11;
 + }

Shouldn't we emit at least DW_LANG_C99 for GNU C11 if
not dwarf_version = 5 /* || !dwarf_strict */ but
dwarf_version = 3 || !dwarf_strict is true?

BTW, noticed we don't have anything for Fortran 2003 and 2008,
filed a DWARF Issue for that.

Jakub

Re: [PATCH] OpenACC for C front end

2014-11-21 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 05:50:33PM -0600, James Norris wrote:
 +   case 'h':
 + if (!strcmp (host, p))
 +   result = PRAGMA_OMP_CLAUSE_SELF;
 + break;
 Shouldn't this be PRAGMA_OMP_CLAUSE_HOST (PRAGMA_OACC_CLAUSE_HOST)
 instead?  It is _HOST in the C++ patch, are there no C tests with
 that clause covering it?
 
 The host clause is a synonym for the self clause. The initial
 C++ patch did not treat host as a synonym and has amended
 accordingly.

Can you add a comment mentioning that (for casual readers)?

 There was a mistake in naming the function:
 c_parser_omp_clause_vector_length.
 Once it was renamed to: c_parser_oacc_clause_vector_length, diff was able to
 keep track.

Great.

 OK to commit after middle end is accepted?

Ok, thanks.

Jakub

Re: [PATCH] OpenACC for C++ front end

2014-11-21 Thread Jakub Jelinek

On Thu, Nov 20, 2014 at 05:33:57PM -0600, James Norris wrote:
 + t = OMP_CLAUSE_ASYNC_EXPR (c);
 + if (t == error_mark_node)
 +   remove = true;
 + else if (!type_dependent_expression_p (t)
 +   !INTEGRAL_TYPE_P (TREE_TYPE (t)))
 +   {
 + error (%async% expression must be integral);
 You have OMP_CLAUSE_LOCATION (c) which you could use for error_at.
 
 I followed the convention that was used elsewhere in the function
 at using error ().

Perhaps it would be better to change even those other spots in the function.
But that can be certainly done as a follow-up patch.

 Thank you for taking the time to review!
 
 OK to commit after middle end has been accepted?

Yes, thanks.

Jakub

Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.

2014-11-21 Thread Jakub Jelinek

On Fri, Nov 21, 2014 at 02:01:55PM +0100, Mark Wielaard wrote:
 gcc/c-family/ChangeLog
 
   PR debug/38757
   * c-opts.c (set_std_c89): Set lang_hooks.name.
   (set_std_c99): Likewise.
   (set_std_c11): Likewise.
   (set_std_cxx98): Likewise.
   (set_std_cxx11): Likewise.
   (set_std_cxx14): Likewise.
   (set_std_cxx1z): Likewise.
 
 gcc/ChangeLog
 
   PR debug/38757
   * config/avr/avr-c.c (avr_cpu_cpp_builtins): Use lang_GNU_C.
   * config/darwin.c (darwin_file_end): Use lang_GNU_CXX.
   (darwin_override_options): Likewise.
   * config/ia64/ia64.c (ia64_struct_retval_addr_is_first_parm_p):
   Likewise.
   * config/rs6000/rs6000.c (rs6000_output_function_epilogue):
   Likewise.
   * dbxout.c (get_lang_number): Likewise.
   (dbxout_type): Likewise.
   (dbxout_symbol_location): Likewise.
   * dwarf2out.c (add_prototyped_attribute): Add DW_AT_prototype
   also for DW_LANG_{C,C99,ObjC}.
   (highest_c_language): New function.
   (gen_compile_unit_die): Call highest_c_language to merge LTO
   TRANSLATION_UNIT_LANGUAGE. Use strncmp language_string to
   determine if DW_LANG_C99 or DW_LANG_C89 should be returned.
   * fold-const.c (fold_cond_expr_with_comparison): Use lang_GNU_CXX.
   * langhooks.h (struct lang_hooks): Add version comment to name.
   (lang_GNU_C): New function declaration.
   (lang_GNU_CXX): Likewise.
   * langhooks.c (lang_GNU_C): New function.
   (lang_GNU_CXX): Likewise.
   * vmsdbgout.c (vmsdbgout_init): Use lang_GNU_C and lang_GNU_CXX.
 
 gcc/testsuite/ChangeLog
 
   PR debug/38757
   * gcc.dg/debug/dwarf2/lang-c89.c: New test.
   * gcc.dg/debug/dwarf2/lang-c99.c: Likewise.

Ok, thanks.

Jakub

[committed] Cherry-pick a libsanitizer bugfix (PR sanitizer/64013)

2014-11-21 Thread Jakub Jelinek

Hi!

I've committed this as obvious.

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR sanitizer/64013
* sanitizer_common/sanitizer_linux.cc (FileExists): Cherry pick
upstream r222532.

--- libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222531)
+++ libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222532)
@@ -283,17 +283,15 @@ uptr internal_execve(const char *filenam
 
 // - sanitizer_common.h
 bool FileExists(const char *filename) {
-#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
   struct stat st;
+#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
   if (internal_syscall(SYSCALL(newfstatat), AT_FDCWD, filename, st, 0))
-return false;
 #else
-  struct stat st;
   if (internal_stat(filename, st))
+#endif
 return false;
   // Sanity check: filename is a regular file.
   return S_ISREG(st.st_mode);
-#endif
 }
 
 uptr GetTid() {

Jakub

[PATCH] Fix VRP handling of {ADD,SUB,MUL}_OVERFLOW (PR tree-optimization/64006)

2014-11-21 Thread Jakub Jelinek

Hi!

As discussed on IRC and in the PR, these internal calls are quite
unique for VRP in that they return _Complex integer result,
which VRP doesn't track, but then extract using REALPART_EXPR/IMAGPART_EXPR
the two results from that _Complex int and to generate good code
it is desirable to get proper ranges of those two results.
The problem is that right now this works only on the first VRP iteration,
the REALPART_EXPR/IMAGPART_EXPR statements are handled if their operand
is set by {ADD,SUB,MUL}_OVERFLOW.  If we iterate because a VR of one
of the internal call arguments changes, nothing in the propagator marks
the REALPART_EXPR/IMAGPART_EXPR statements for reconsideration.

The following patch handles this, by making the internal calls interesting
to the propagator and returning the right SSA_PROP_* for it (depending on
whether any of the value ranges of the REALPART_EXPR/IMAGPART_EXPR immediate
uses would change or not).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/64006
* tree-vrp.c (stmt_interesting_for_vrp): Return true
for {ADD,SUB,MUL}_OVERFLOW internal calls.
(vrp_visit_assignment_or_call): For {ADD,SUB,MUL}_OVERFLOW
internal calls, check if any REALPART_EXPR/IMAGPART_EXPR
immediate uses would change their value ranges and return
SSA_PROP_INTERESTING if so, or SSA_PROP_NOT_INTERESTING
if there are some REALPART_EXPR/IMAGPART_EXPR immediate uses
interesting for vrp.

* gcc.c-torture/execute/pr64006.c: New test.

--- gcc/tree-vrp.c.jj   2014-11-21 10:17:05.0 +0100
+++ gcc/tree-vrp.c  2014-11-21 13:12:09.895013334 +0100
@@ -6949,6 +6949,20 @@ stmt_interesting_for_vrp (gimple stmt)
   (is_gimple_call (stmt)
  || !gimple_vuse (stmt)))
return true;
+  else if (is_gimple_call (stmt)  gimple_call_internal_p (stmt))
+   switch (gimple_call_internal_fn (stmt))
+ {
+ case IFN_ADD_OVERFLOW:
+ case IFN_SUB_OVERFLOW:
+ case IFN_MUL_OVERFLOW:
+   /* These internal calls return _Complex integer type,
+  but are interesting to VRP nevertheless.  */
+   if (lhs  TREE_CODE (lhs) == SSA_NAME)
+ return true;
+   break;
+ default:
+   break;
+ }
 }
   else if (gimple_code (stmt) == GIMPLE_COND
   || gimple_code (stmt) == GIMPLE_SWITCH)
@@ -7101,6 +7115,74 @@ vrp_visit_assignment_or_call (gimple stm
 
   return SSA_PROP_NOT_INTERESTING;
 }
+  else if (is_gimple_call (stmt)  gimple_call_internal_p (stmt))
+switch (gimple_call_internal_fn (stmt))
+  {
+  case IFN_ADD_OVERFLOW:
+  case IFN_SUB_OVERFLOW:
+  case IFN_MUL_OVERFLOW:
+   /* These internal calls return _Complex integer type,
+  which VRP does not track, but the immediate uses
+  thereof might be interesting.  */
+   if (lhs  TREE_CODE (lhs) == SSA_NAME)
+ {
+   imm_use_iterator iter;
+   use_operand_p use_p;
+   enum ssa_prop_result res = SSA_PROP_VARYING;
+
+   set_value_range_to_varying (get_value_range (lhs));
+
+   FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+ {
+   gimple use_stmt = USE_STMT (use_p);
+   if (!is_gimple_assign (use_stmt))
+ continue;
+   enum tree_code rhs_code = gimple_assign_rhs_code (use_stmt);
+   if (rhs_code != REALPART_EXPR  rhs_code != IMAGPART_EXPR)
+ continue;
+   tree rhs1 = gimple_assign_rhs1 (use_stmt);
+   tree use_lhs = gimple_assign_lhs (use_stmt);
+   if (TREE_CODE (rhs1) != rhs_code
+   || TREE_OPERAND (rhs1, 0) != lhs
+   || TREE_CODE (use_lhs) != SSA_NAME
+   || !stmt_interesting_for_vrp (use_stmt)
+   || (!INTEGRAL_TYPE_P (TREE_TYPE (use_lhs))
+   || !TYPE_MIN_VALUE (TREE_TYPE (use_lhs))
+   || !TYPE_MAX_VALUE (TREE_TYPE (use_lhs
+ continue;
+
+   /* If there is a change in the value range for any of the
+  REALPART_EXPR/IMAGPART_EXPR immediate uses, return
+  SSA_PROP_INTERESTING.  If there are any REALPART_EXPR
+  or IMAGPART_EXPR immediate uses, but none of them have
+  a change in their value ranges, return
+  SSA_PROP_NOT_INTERESTING.  If there are no
+  {REAL,IMAG}PART_EXPR uses at all,
+  return SSA_PROP_VARYING.  */
+   value_range_t new_vr = VR_INITIALIZER;
+   extract_range_basic (new_vr, use_stmt);
+   value_range_t *old_vr = get_value_range (use_lhs);
+   if (old_vr-type != new_vr.type
+   || !vrp_operand_equal_p (old_vr-min, new_vr.min

[PATCH] Fix up __builtin_*_overflow expansion on some targets (PR target/63848)

2014-11-21 Thread Jakub Jelinek

Hi!

Apparently, emit_cmp_and_jump_insns can silently generate wrong code
for wider modes on some targets, so this patch changes all those calls in
internal-fn.c to do_compare_rtx_and_jump, which is a wrapper around
emit_cmp_and_jump_insns that should handle the wider mode comparison
expansion.  Unfortunately, the order of arguments is different :(.

No new testcases provided, the existing testsuite exhibited this on various
targets.

Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
testcases for ia64 and Uros tested the testcases on Alpha (in both cases
they previously failed), ok for trunk?

2014-11-21  Jakub Jelinek  ja...@redhat.com

PR target/63848
PR target/63975
* internal-fn.c (expand_arith_overflow_result_store,
expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow): Use
do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns everywhere,
adjust arguments to those functions.  Use unsignedp = true for
EQ, NE, GEU, LEU, LTU and GTU comparisons.

--- gcc/internal-fn.c.jj2014-11-19 18:48:02.0 +0100
+++ gcc/internal-fn.c   2014-11-21 17:34:00.634621461 +0100
@@ -386,8 +386,8 @@ expand_arith_overflow_result_store (tree
   int uns = TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs)));
   lres = convert_modes (tgtmode, mode, res, uns);
   gcc_assert (GET_MODE_PRECISION (tgtmode)  GET_MODE_PRECISION (mode));
-  emit_cmp_and_jump_insns (res, convert_modes (mode, tgtmode, lres, uns),
-  EQ, NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, convert_modes (mode, tgtmode, lres, uns),
+  EQ, true, mode, NULL_RTX, NULL_RTX, done_label,
   PROB_VERY_LIKELY);
   write_complex_part (target, const1_rtx, true);
   emit_label (done_label);
@@ -533,8 +533,8 @@ expand_addsub_overflow (location_t loc,
  ? (CONST_SCALAR_INT_P (op0)  REG_P (op1))
  : CONST_SCALAR_INT_P (op1)))
tem = op1;
-  emit_cmp_and_jump_insns (res, tem, code == PLUS_EXPR ? GEU : LEU,
-  NULL_RTX, mode, false, done_label,
+  do_compare_rtx_and_jump (res, tem, code == PLUS_EXPR ? GEU : LEU,
+  true, mode, NULL_RTX, NULL_RTX, done_label,
   PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -549,7 +549,7 @@ expand_addsub_overflow (location_t loc,
   rtx tem = expand_binop (mode, add_optab,
  code == PLUS_EXPR ? res : op0, sgn,
  NULL_RTX, false, OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (tem, op1, GEU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (tem, op1, GEU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -591,9 +591,9 @@ expand_addsub_overflow (location_t loc,
emit_jump (do_error);
   else if (pos_neg == 3)
/* If ARG0 is not known to be always positive, check at runtime.  */
-   emit_cmp_and_jump_insns (op0, const0_rtx, LT, NULL_RTX, mode, false,
-do_error, PROB_VERY_UNLIKELY);
-  emit_cmp_and_jump_insns (op1, op0, LEU, NULL_RTX, mode, false,
+   do_compare_rtx_and_jump (op0, const0_rtx, LT, false, mode, NULL_RTX,
+NULL_RTX, do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (op1, op0, LEU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -607,7 +607,7 @@ expand_addsub_overflow (location_t loc,
  OPTAB_LIB_WIDEN);
   rtx tem = expand_binop (mode, add_optab, op1, sgn, NULL_RTX, false,
  OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (op0, tem, LTU, NULL_RTX, mode, false,
+  do_compare_rtx_and_jump (op0, tem, LTU, true, mode, NULL_RTX, NULL_RTX,
   done_label, PROB_VERY_LIKELY);
   goto do_error_label;
 }
@@ -619,8 +619,8 @@ expand_addsub_overflow (location_t loc,
 unsigned.  */
   res = expand_binop (mode, add_optab, op0, op1, NULL_RTX, false,
  OPTAB_LIB_WIDEN);
-  emit_cmp_and_jump_insns (res, const0_rtx, LT, NULL_RTX, mode, false,
-  do_error, PROB_VERY_UNLIKELY);
+  do_compare_rtx_and_jump (res, const0_rtx, LT, false, mode, NULL_RTX,
+  NULL_RTX, do_error, PROB_VERY_UNLIKELY);
   rtx tem = op1;
   /* The operation is commutative, so we can pick operand to compare
 against.  For prec = BITS_PER_WORD, I think preferring REG operand
@@ -633,7 +633,7 @@ expand_addsub_overflow (location_t loc,
  ? (CONST_SCALAR_INT_P (op1)  REG_P (op0))
  : CONST_SCALAR_INT_P (op0))
tem = op0;
-  emit_cmp_and_jump_insns (res, tem, GEU

Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing

2014-11-21 Thread Jakub Jelinek

On Fri, Nov 21, 2014 at 10:14:21PM +0300, Ilya Verbin wrote:
  On Fri, 21 Nov 2014 21:44:40, Ilya Verbin wrote:
   You're right. This patch was rebased so many times, that we may forget to
   regenerate it before committing.
 
 Build with liboffloadmic passed.  OK for trunk?
 
   -- Ilya
 
 
   * Makefile.in: Regenerate.

Ok.
 --- a/Makefile.in
 +++ b/Makefile.in
 @@ -35238,9 +35238,6 @@ configure-target-liboffloadmic:
   $(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/liboffloadmic ; \
   $(NORMAL_TARGET_EXPORTS)  \
   echo Configuring in $(TARGET_SUBDIR)/liboffloadmic; \
 -  \
 - this_target=${target_alias}; \
 -  \
   cd $(TARGET_SUBDIR)/liboffloadmic || exit 1; \
   case $(srcdir) in \
 /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \
 @@ -35248,14 +35245,12 @@ configure-target-liboffloadmic:
   sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \
   esac; \
   module_srcdir=liboffloadmic; \
 - srcdiroption=--srcdir=$${topdir}/liboffloadmic; \
 - libsrcdir=$$s/liboffloadmic; \
   rm -f no-such-file || : ; \
   CONFIG_SITE=no-such-file $(SHELL) \
 $$s/$$module_srcdir/configure \
 --srcdir=$${topdir}/$$module_srcdir \
 $(TARGET_CONFIGARGS) --build=${build_alias} --host=${target_alias} \
 -   --target=$${this_target} $${srcdiroption} 
 @extra_liboffloadmic_configure_flags@ \
 +   --target=${target_alias} @extra_liboffloadmic_configure_flags@ \
 || exit 1
  @endif target-liboffloadmic
  

Jakub

Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread Jakub Jelinek

On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
 Hi,
 
 '#pragma omp critical (name)' can be placed in the function, marked
 with '#pragma omp declare target', in this case the corresponding node
 should be marked as offloadable too.
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Please add a testcase for this.

   * omp-low.c (lower_omp_critical): Mark critical sections
   inside target functions as offloadable.
 
 
 diff --git a/gcc/omp-low.c b/gcc/omp-low.c
 index 3924282..6c5774c 100644
 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -9366,16 +9366,6 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
 omp_context *ctx)
 DECL_ARTIFICIAL (decl) = 1;
 DECL_IGNORED_P (decl) = 1;
  
 -   /* If '#pragma omp critical' is inside target region, the symbol must
 -  be marked for offloading.  */
 -   omp_context *octx;
 -   for (octx = ctx-outer; octx; octx = octx-outer)
 - if (is_targetreg_ctx (octx))
 -   {
 - varpool_node::get_create (decl)-offloadable = 1;
 - break;
 -   }
 -
 varpool_node::finalize_decl (decl);
  
 critical_name_mutexes-put (name, decl);
 @@ -9383,6 +9373,20 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
 omp_context *ctx)
else
   decl = *n;
  
 +  /* If '#pragma omp critical' is inside target region or
 +  inside function marked as offloadable, the symbol must be
 +  marked as offloadable too.  */
 +  omp_context *octx;
 +  if (cgraph_node::get (current_function_decl)-offloadable)
 + varpool_node::get_create (decl)-offloadable = 1;
 +  else
 + for (octx = ctx-outer; octx; octx = octx-outer)
 +   if (is_targetreg_ctx (octx))
 + {
 +   varpool_node::get_create (decl)-offloadable = 1;
 +   break;
 + }
 +
lock = builtin_decl_explicit (BUILT_IN_GOMP_CRITICAL_NAME_START);
lock = build_call_expr_loc (loc, lock, 1, build_fold_addr_expr_loc 
 (loc, decl));
  

Jakub

Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-21 Thread Jakub Jelinek

On Sat, Nov 22, 2014 at 12:08:38AM +0300, Ilya Verbin wrote:
  On 21 Nov 2014, at 23:36, Jakub Jelinek ja...@redhat.com wrote:
  
  On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
  Hi,
  
  '#pragma omp critical (name)' can be placed in the function, marked
  with '#pragma omp declare target', in this case the corresponding node
  should be marked as offloadable too.
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
  
  Please add a testcase for this.
 
 By default with disabled offloading it will always PASS.  Add anyway?

Doesn't matter.  We want to test it both with host fallback and offloaded to
various devices.  The latter of course will be tested only if somebody
configures the 2 compilers that way, less often, but still.

Jakub

Re: [PATCH v2] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow

2014-11-21 Thread Jakub Jelinek

On Sat, Nov 22, 2014 at 05:03:37AM +0800, Chen Gang wrote:
 According to the next code, 'pretty_name' may need additional bytes more
 than 16 (may have unlimited length for array type). There is an easy way
 to fix it: use 'pretty_print' for 'pretty_name'.
 
 Let the code meet 2 white spaces alignment coding styles (originally,
 some of code is 1 white sapce alignment).
 
 It passes testsuite under fedora 20 x86_64-unknown-linux-gnu.
 
 2014-11-22  Chen Gang  gang.chen.5...@gmail.com
 
 * ubsan.c (ubsan_type_descriptor): Use 'pretty_print' for
 'pretty_name' to avoid memory overflow

Add a . at the end.

while (deref_depth--  0)
 -pretty_name[pos++] = '*';
 -  pretty_name[pos++] = '\'';
 -  pretty_name[pos] = '\0';
 + pp_star(pretty_name);
 +  pp_quote(pretty_name);

Formatting, missing space before (.  Happens many times in the patch.

 if (dom  TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST)
 - pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC,
 - tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
 + pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC,
 +tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);

You don't know if TYPE_MAX_VALUE (dom) will fit into uhwi, and you are
using signed printing anyway.
You said that using pp_wide_int breaks too many tests, so perhaps
do
  if (tree_fits_uhwi_p (TYPE_MAX_VALUE (dom))
   tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1 != 0)
pp_printf (..., HOST_WIDE_INT_PRINT_UNSIGNED,
  else
pp_wide_int (..., wi::to_widest (TYPE_MAX_VALUE (dom)) + 1);
?

Jakub

Re: [Patch,MIPS] Add default SYS_futex definition in libgomp

2014-11-21 Thread Jakub Jelinek

On Fri, Nov 21, 2014 at 02:30:06PM -0800, Steve Ellcey  wrote:
 2014-11-21  Steve Ellcey  sell...@imgtec.com
 
   * config/linux/mips/futex.h (SYS_futex): Define if not already done.

Ok.

 --- a/libgomp/config/linux/mips/futex.h
 +++ b/libgomp/config/linux/mips/futex.h
 @@ -25,6 +25,11 @@
  /* Provide target-specific access to the futex system call.  */
  
  #include sys/syscall.h
 +
 +#if !defined(SYS_futex)
 +#define SYS_futex __NR_futex
 +#endif
 +
  #define FUTEX_WAIT 0
  #define FUTEX_WAKE 1
  

Jakub

Re: [PATCH, PR 63551] Use proper type in evaluate_conditions_for_known_args

2014-11-22 Thread Jakub Jelinek

On Sat, Nov 22, 2014 at 12:09:46PM +0100, Martin Jambor wrote:
 2014-11-21  Martin Jambor  mjam...@suse.cz
 
   PR ipa/63551
   * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Convert
   value of the argument to the type of the value in the condition.
 
 testsuite/
   * gcc.dg/ipa/pr63551.c: New test.
 
 
 Index: src/gcc/ipa-inline-analysis.c
 ===
 --- src.orig/gcc/ipa-inline-analysis.c
 +++ src/gcc/ipa-inline-analysis.c
 @@ -880,7 +880,10 @@ evaluate_conditions_for_known_args (stru
   }
if (c-code == IS_NOT_CONSTANT || c-code == CHANGED)
   continue;
 -  res = fold_binary_to_constant (c-code, boolean_type_node, val, 
 c-val);
 +  val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val);

VCE should only be used if the sizes of the types are the same.
Is that always the case here?

Jakub

Re: [PATCH] Check for strtol, strtoul, strtoll and strtoull declarations

2014-11-22 Thread Jakub Jelinek

On Sat, Nov 22, 2014 at 09:40:58AM -0500, John David Anglin wrote:
 libiberty ChangeLog:
 2014-11-22  John David Anglin  dang...@gcc.gnu.org
 
   PR other/63694
   * configure.ac: Check for strtol, strtoul, strtoll and strtoull
   declarations.
   * configure: Regenerated.
 
 gcc ChangeLog:
 2014-11-22  John David Anglin  dang...@gcc.gnu.org
 
   PR other/63694
   * configure.ac: Check for strtol, strtoul, strtoll and strtoull
   declarations.
   * configure: Regenerated.
   * config.in: Regenerated.

Looks reasonable to me, but I'll defer it to Ian as libiberty maintainer.

 Index: libiberty/configure.ac
 ===
 --- libiberty/configure.ac(revision 217956)
 +++ libiberty/configure.ac(working copy)
 @@ -678,6 +678,7 @@
AC_CHECK_FUNCS($checkfuncs)
AC_CHECK_DECLS([basename(char *), ffs, asprintf, vasprintf, snprintf, 
 vsnprintf])
AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc, sbrk])
 +  AC_CHECK_DECLS([strtol, strtoul, strtoll, strtoull])
AC_CHECK_DECLS([strverscmp])
libiberty_NEED_DECLARATION(canonicalize_file_name)
  fi
 Index: gcc/configure.ac
 ===
 --- gcc/configure.ac  (revision 217956)
 +++ gcc/configure.ac  (working copy)
 @@ -1213,6 +1213,7 @@
  CXXFLAGS=$CXXFLAGS -I${srcdir} -I${srcdir}/../include $GMPINC
  gcc_AC_CHECK_DECLS(getenv atol asprintf sbrk abort atof getcwd getwd \
   stpcpy strnlen strsignal strstr strverscmp \
 + strtol strtoul strtoll strtoull \
   errno snprintf vsnprintf vasprintf malloc realloc calloc \
   free basename getopt clock getpagesize ffs gcc_UNLOCKED_FUNCS, , ,[
  #include ansidecl.h


Jakub

Re: [PATCH] pr63856 - test case

2014-11-23 Thread Jakub Jelinek

On Sun, Nov 23, 2014 at 01:32:27PM -0500, tsaund...@mozilla.com wrote:
 bug was already fixed, so just add the test case.
 
 tested this only passes with r217909 where it is fixed, ok?
 
 Trev
 
 
 diff --git a/gcc/testsuite/gcc.dg/pr63856.c b/gcc/testsuite/gcc.dg/pr63856.c
 new file mode 100644
 index 000..8fb65c6
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/pr63856.c
 @@ -0,0 +1,14 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fPIC } */
 +typedef int v2si __attribute__ ((vector_size (8)));
 +typedef short v4hi __attribute__ ((vector_size (8)));
 +
 +int __attribute__ ((noinline, noclone)) f (v2si A, int N)

Vector arguments or return values often result in -Wpsabi warnings,
so you certainly want -Wno-psabi in dg-options.
On the other side, -fPIC should be only used for { target pic },
so better put that into dg-additional-options.

 +{
 +  return ((v4hi) A)[N];
 +}
 +
 +int __attribute__ ((noinline, noclone)) g (v2si A, int N)
 +{
 +  return ((v4hi) A)[N];
 +}
 -- 
 2.1.3

Jakub

Re: [PATCH] pr63856 - test case

2014-11-23 Thread Jakub Jelinek

On Sun, Nov 23, 2014 at 03:42:43PM -0500, Trevor Saunders wrote:
  Vector arguments or return values often result in -Wpsabi warnings,
  so you certainly want -Wno-psabi in dg-options.
  On the other side, -fPIC should be only used for { target pic },
  so better put that into dg-additional-options.
 
 ok, thanks how about this then?

The __attribute__ ((noinline, noclone)) is useless, though not harmful,
sorry for missing that.

Ok for trunk either way.

 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/pr63856.c
 @@ -0,0 +1,15 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -Wno-psabi } */
 +/* { dg-additional-options -fPIC { target fpic } } */
 +typedef int v2si __attribute__ ((vector_size (8)));
 +typedef short v4hi __attribute__ ((vector_size (8)));
 +
 +int __attribute__ ((noinline, noclone)) f (v2si A, int N)
 +{
 +  return ((v4hi) A)[N];
 +}
 +
 +int __attribute__ ((noinline, noclone)) g (v2si A, int N)
 +{
 +  return ((v4hi) A)[N];
 +}
 -- 
 2.1.3
 


Jakub

Re: [PATCH v3] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow

2014-11-23 Thread Jakub Jelinek

On Sun, Nov 23, 2014 at 09:13:27AM +0800, Chen Gang wrote:
 2014-11-23  Chen Gang  gang.chen.5...@gmail.com
 
   * ubsan.c (ubsan_type_descriptor): Use 'pretty_print' for
   'pretty_name' to avoid memory overflow.

Ok, with a small nit below.

  gcc/ubsan.c | 63 
 +
  1 file changed, 34 insertions(+), 29 deletions(-)
 
 diff --git a/gcc/ubsan.c b/gcc/ubsan.c
 index b3d5343..3fceff7 100644
 --- a/gcc/ubsan.c
 +++ b/gcc/ubsan.c
 @@ -369,7 +369,7 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style 
 pstyle)
tree dtype = ubsan_get_type_descriptor_type ();
tree type2 = type;
const char *tname = NULL;
 -  char *pretty_name;
 +  pretty_printer pretty_name;
unsigned char deref_depth = 0;
unsigned short tkind, tinfo;
  
 @@ -408,54 +408,58 @@ ubsan_type_descriptor (tree type, enum 
 ubsan_print_style pstyle)
  /* We weren't able to determine the type name.  */
  tname = unknown;
  
 -  /* Decorate the type name with '', '*', struct, or union.  */
 -  pretty_name = (char *) alloca (strlen (tname) + 16 + deref_depth);
if (pstyle == UBSAN_PRINT_POINTER)
  {
 -  int pos = sprintf (pretty_name, '%s%s%s%s%s%s%s,
 -  TYPE_VOLATILE (type2) ? volatile  : ,
 -  TYPE_READONLY (type2) ? const  : ,
 -  TYPE_RESTRICT (type2) ? restrict  : ,
 -  TYPE_ATOMIC (type2) ? _Atomic  : ,
 -  TREE_CODE (type2) == RECORD_TYPE
 -  ? struct 
 -  : TREE_CODE (type2) == UNION_TYPE
 -? union  : , tname,
 -  deref_depth == 0 ?  :  );
 +  pp_printf (pretty_name, '%s%s%s%s%s%s%s,
 +  TYPE_VOLATILE (type2) ? volatile  : ,
 +  TYPE_READONLY (type2) ? const  : ,
 +  TYPE_RESTRICT (type2) ? restrict  : ,
 +  TYPE_ATOMIC (type2) ? _Atomic  : ,
 +  TREE_CODE (type2) == RECORD_TYPE
 +  ? struct 
 +  : TREE_CODE (type2) == UNION_TYPE
 +? union  : , tname,
 +  deref_depth == 0 ?  :  );
while (deref_depth--  0)
 -pretty_name[pos++] = '*';
 -  pretty_name[pos++] = '\'';
 -  pretty_name[pos] = '\0';
 + pp_star (pretty_name);
 +  pp_quote (pretty_name);
  }
else if (pstyle == UBSAN_PRINT_ARRAY)
  {
/* Pretty print the array dimensions.  */
gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
tree t = type;
 -  int pos = sprintf (pretty_name, '%s , tname);
 +  pp_printf (pretty_name, '%s , tname);
while (deref_depth--  0)
 -pretty_name[pos++] = '*';
 + pp_star (pretty_name);
while (TREE_CODE (t) == ARRAY_TYPE)
   {
 -   pretty_name[pos++] = '[';
 +   pp_left_bracket (pretty_name);
 tree dom = TYPE_DOMAIN (t);
 if (dom  TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST)
 - pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC,
 + {
 +   if (tree_fits_uhwi_p (TYPE_MAX_VALUE (dom))
 +tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1 != 0)
 + pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC,
   tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
 +   else
 + pp_wide_int(pretty_name,
 + wi::add (wi::to_widest (TYPE_MAX_VALUE (dom)), 1),
 + TYPE_SIGN (TREE_TYPE (dom)));

Space still missing before ( (and reindenting the following 2 lines).

Jakub

Re: [PATCH v3] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow

2014-11-24 Thread Jakub Jelinek

On Mon, Nov 24, 2014 at 04:28:10PM +0800, Chen Gang wrote:
 On 11/24/14 15:41, Jakub Jelinek wrote:
  On Sun, Nov 23, 2014 at 09:13:27AM +0800, Chen Gang wrote:
 
 [...]
 
  +else
  +  pp_wide_int(pretty_name,
  +  wi::add (wi::to_widest (TYPE_MAX_VALUE (dom)), 1),
  +  TYPE_SIGN (TREE_TYPE (dom)));
  
  Space still missing before ( (and reindenting the following 2 lines).
  
 
 Oh, thanks, if necessary to send patch v4, please let me know.

No, just fix it up before checking in.

Jakub

Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-11-24 Thread Jakub Jelinek

On Mon, Nov 24, 2014 at 02:51:14PM +0100, Marek Polacek wrote:
 --- gcc/cp/constexpr.c
 +++ gcc/cp/constexpr.c
 @@ -1451,6 +1451,43 @@ verify_constant (tree t, bool allow_non_constant, bool 
 *non_constant_p,
return *non_constant_p;
  }
  
 +/* Return true if the shift operation on LHS and RHS is undefined.  */
 +
 +static bool
 +cxx_eval_check_shift_p (enum tree_code code, tree lhs, tree rhs)
 +{
 +  if (code != LSHIFT_EXPR  code != RSHIFT_EXPR)
 +return false;
 +
 +  tree lhstype = TREE_TYPE (lhs);
 +  unsigned HOST_WIDE_INT uprec = TYPE_PRECISION (TREE_TYPE (lhs));
 +
 +  /* [expr.shift] The behavior is undefined if the right operand
 + is negative, or greater than or equal to the length in bits
 + of the promoted left operand.  */
 +  if (tree_int_cst_sgn (rhs) == -1 || compare_tree_int (rhs, uprec) = 0)
 +return true;

I think VERIFY_CONSTANT doesn't guarantee both operands are INTEGER_CSTs.
Consider say:

constexpr int p = 1;
constexpr int foo (int a)
{
  return a  (int) p;
}
constexpr int bar (int a)
{
  return ((int) p)  a;
}
constexpr int q = foo (5);
constexpr int r = bar (2);
constexpr int s = bar (0);

Now, for foo (5) and bar (2) fold_binary_loc returns NULL and thus
your cxx_eval_check_shift_p is not called, for bar (0) it returns
non-NULL and while the result still is not a constant expression and
right now is diagnosed, with your patch it would ICE.

So, I'd just return false if either lhs or rhs are not INTEGER_CSTs.

 +
 +  /* The value of E1  E2 is E1 left-shifted E2 bit positions; [...]
 + if E1 has a signed type and non-negative value, and E1x2^E2 is
 + representable in the corresponding unsigned type of the result type,
 + then that value, converted to the result type, is the resulting value;
 + otherwise, the behavior is undefined.  */
 +  if (code == LSHIFT_EXPR  !TYPE_UNSIGNED (lhstype))
 +{
 +  if (tree_int_cst_sgn (lhs) == -1)
 + return true;
 +  tree t = build_int_cst (unsigned_type_node, uprec - 1);
 +  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
 +  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
 +  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
 +  if (tree_int_cst_lt (integer_one_node, t))
 + return true;

I'll leave to Jason whether this shouldn't be using the various
cxx_eval_*_expression calls instead, or perhaps int_const_binop or wide_int
stuff directly.

Jakub

Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-11-24 Thread Jakub Jelinek

On Mon, Nov 24, 2014 at 04:58:22PM +0100, Marek Polacek wrote:
  Consider say:
  
  constexpr int p = 1;
  constexpr int foo (int a)
  {
return a  (int) p;
  }
  constexpr int bar (int a)
  {
return ((int) p)  a;
  }
  constexpr int q = foo (5);
  constexpr int r = bar (2);
  constexpr int s = bar (0);
  
  Now, for foo (5) and bar (2) fold_binary_loc returns NULL and thus
  your cxx_eval_check_shift_p is not called, for bar (0) it returns
  non-NULL and while the result still is not a constant expression and
  right now is diagnosed, with your patch it would ICE.
  
  So, I'd just return false if either lhs or rhs are not INTEGER_CSTs.
  
 Ok, I'll add that.  Thank for pointing that out.

Note, the above only with -m32 obviously, or supposedly for bar
you could change return type and the cast and type of r/s to
__PTRDIFF_TYPE__ or so.  foo, as gcc always casts the shift count to int,
would supposedly need to stay the way it is and might produce different
diagnostics between -m32 and -m64.
 
   +
   +  /* The value of E1  E2 is E1 left-shifted E2 bit positions; [...]
   + if E1 has a signed type and non-negative value, and E1x2^E2 is
   + representable in the corresponding unsigned type of the result type,
   + then that value, converted to the result type, is the resulting 
   value;
   + otherwise, the behavior is undefined.  */
   +  if (code == LSHIFT_EXPR  !TYPE_UNSIGNED (lhstype))
   +{
   +  if (tree_int_cst_sgn (lhs) == -1)
   + return true;
   +  tree t = build_int_cst (unsigned_type_node, uprec - 1);
   +  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
   +  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
   +  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
   +  if (tree_int_cst_lt (integer_one_node, t))
   + return true;
  
  I'll leave to Jason whether this shouldn't be using the various
  cxx_eval_*_expression calls instead, or perhaps int_const_binop or wide_int
  stuff directly.
 
 ISTR int_const_binop calls wide_int routines wi::rshift/wi::lshift and these
 return 0 and do not have any overflow flag, so that might not help (?).

I meant for the above computation, there you don't check any overflow.

Jakub

[committed] Fix !$omp atomic handling (PR fortran/63938)

2014-11-24 Thread Jakub Jelinek

Hi!

This fixes Fortran !$omp atomic handling to match C/C++, goa_lhs_expr_p
is not able to find out matches of more complex expressions after tree
unsharing, so this patch helps it find what it needs.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
Will backport soon.

2014-11-24  Jakub Jelinek  ja...@redhat.com

PR fortran/63938
* trans-openmp.c (gfc_trans_omp_atomic): Make sure lhsaddr is
simple enough for goa_lhs_expr_p.

* libgomp.fortran/pr63938-1.f90: New test.
* libgomp.fortran/pr63938-2.f90: New test.

--- gcc/fortran/trans-openmp.c.jj   2014-10-10 13:07:42.0 +0200
+++ gcc/fortran/trans-openmp.c  2014-11-24 11:25:40.157171398 +0100
@@ -2683,6 +2683,18 @@ gfc_trans_omp_atomic (gfc_code *code)
 }
 
   lhsaddr = save_expr (lhsaddr);
+  if (TREE_CODE (lhsaddr) != SAVE_EXPR
+   (TREE_CODE (lhsaddr) != ADDR_EXPR
+ || TREE_CODE (TREE_OPERAND (lhsaddr, 0)) != VAR_DECL))
+{
+  /* Make sure LHS is simple enough so that goa_lhs_expr_p can recognize
+it even after unsharing function body.  */
+  tree var = create_tmp_var_raw (TREE_TYPE (lhsaddr), NULL);
+  DECL_CONTEXT (var) = current_function_decl;
+  lhsaddr = build4 (TARGET_EXPR, TREE_TYPE (lhsaddr), var, lhsaddr,
+   NULL_TREE, NULL_TREE);
+}
+
   rhs = gfc_evaluate_now (rse.expr, block);
 
   if (((atomic_code-ext.omp_atomic  GFC_OMP_ATOMIC_MASK)
--- libgomp/testsuite/libgomp.fortran/pr63938-1.f90.jj  2014-11-24 
11:27:33.060113895 +0100
+++ libgomp/testsuite/libgomp.fortran/pr63938-1.f90 2014-11-24 
11:28:13.339379860 +0100
@@ -0,0 +1,14 @@
+! PR fortran/63938
+! { dg-do run }
+
+program pr63938_1
+  integer :: i, x(1)
+  x(1) = 0
+!$omp parallel do
+  do i = 1, 1000
+!$omp atomic
+x(1) = x(1) + 1
+  end do
+!$omp end parallel do
+  if (x(1) .ne. 1000) call abort
+end program pr63938_1
--- libgomp/testsuite/libgomp.fortran/pr63938-2.f90.jj  2014-11-24 
11:28:22.836206793 +0100
+++ libgomp/testsuite/libgomp.fortran/pr63938-2.f90 2014-11-24 
11:28:53.217653132 +0100
@@ -0,0 +1,18 @@
+! PR fortran/63938
+! { dg-do run }
+
+program pr63938_2
+  type t
+integer :: x
+  end type
+  integer :: i
+  type(t) :: x
+  x%x = 0
+!$omp parallel do
+  do i = 1, 1000
+!$omp atomic
+x%x = x%x + 1
+  end do
+!$omp end parallel do
+  if (x%x .ne. 1000) call abort
+end program pr63938_2

Jakub

[PATCH] Fix linemap_line_start (PR preprocessor/60436)

2014-11-24 Thread Jakub Jelinek

Hi!

As mentioned in the PR, when preprocessing very large files, if there are
huge numbers of lines where no #line is emitted, we might not detect
overflowinging into adhoc locations.
Apparently in the add_map case we already handle that fine, by first
stopping tracking columns and after another 256M lines give up on tracking
locations, so this patch just makes sure we enter that path if
going over those limits.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-24  Jakub Jelinek  ja...@redhat.com

PR preprocessor/60436
* line-map.c (linemap_line_start): If highest is above 0x6000
and we are still tracking columns or highest is above 0x7000,
force add_map.

--- libcpp/line-map.c.jj2014-11-12 08:06:57.0 +0100
+++ libcpp/line-map.c   2014-11-24 12:14:52.691276169 +0100
@@ -529,10 +529,10 @@ linemap_line_start (struct line_maps *se
   line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)  1000)
   || (max_column_hint = (1U  ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
   || (max_column_hint = 80
-  ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) = 10))
-{
-  add_map = true;
-}
+  ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) = 10)
+  || (highest  0x6000
+  (set-max_column_hint || highest  0x7000)))
+add_map = true;
   else
 max_column_hint = set-max_column_hint;
   if (add_map)
@@ -543,7 +543,7 @@ linemap_line_start (struct line_maps *se
  /* If the column number is ridiculous or we've allocated a huge
 number of source_locations, give up on column numbers. */
  max_column_hint = 0;
- if (highest 0x7000)
+ if (highest  0x7000)
return 0;
  column_bits = 0;
}

Jakub

[PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)

2014-11-24 Thread Jakub Jelinek

Hi!

The fallback delegitimization I've added as last option mainly for
debug info purposes, when we don't know if the base is a PIC register
or say a PIC register plus some addend, unfortunately in some tests
broke find_base_term, which for PLUS looks only at the first operand
and recursion on it finds a base term, it returns it immediately.
So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base
term is actually in the second operand.

This patch fixes it by swapping the operands, debug info doesn't care about
the order, it won't match in any instruction anyway, but helps alias.c.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-24  Jakub Jelinek  ja...@redhat.com

PR lto/64025
* config/i386/i386.c (ix86_delegitimize_address): Ensure result
comes before (addend - _GLOBAL_OFFSET_TABLE_) term.

--- gcc/config/i386/i386.c.jj   2014-11-24 10:26:26.0 +0100
+++ gcc/config/i386/i386.c  2014-11-24 20:17:27.659091709 +0100
@@ -14848,7 +14848,7 @@ ix86_delegitimize_address (rtx x)
 ...
 movl foo@GOTOFF(%ecx), %edx
 in which case we return (%ecx - %ebx) + foo
-or (%ecx - _GLOBAL_OFFSET_TABLE_) + foo if pseudo_pic_reg
+or foo + (%ecx - _GLOBAL_OFFSET_TABLE_) if pseudo_pic_reg
 and reload has completed.  */
   if (pic_offset_table_rtx
   (!reload_completed || !ix86_use_pseudo_pic_reg ()))
@@ -14859,7 +14859,14 @@ ix86_delegitimize_address (rtx x)
{
  rtx tmp = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME);
  tmp = gen_rtx_MINUS (Pmode, copy_rtx (addend), tmp);
- result = gen_rtx_PLUS (Pmode, tmp, result);
+ /* The order of the operands here is very important.  find_base_term
+will only recurse into the first operand of PLUS if none of the
+arguments is REG with REG_POINTER_P set on it, and if that finds
+base term, it doesn't recurse into the second operand of PLUS.
+We don't want find_base_term to return the artificial
+_GLOBAL_OFFSET_TABLE_ symbol, so ensure it goes into the
+second operand.  */
+ result = gen_rtx_PLUS (Pmode, result, tmp);
}
   else
return orig_x;

Jakub

[PATCH] Fix building of gengtype

2014-11-24 Thread Jakub Jelinek

Hi!

My last 2 bootstraps failed, both because of a race while building
host gengtype (each time different gengtype*.o).

Looking at the history, we have two versions of gengtype, one in build/,
another one in . (host one).
Initially, the copy in build/ was built with -DGENERATOR_FILE, included
bconfig.h, the other did not and included config.h.
Then, Steven noticed that eventhough gengtype is built for host, it really
for many reasons need GENERATOR_FILE define and defined it in the Makefile.
Except that change resulted in always including bconfig.h even in host
gengtype, which is presumably undesirable.
After another half a year Marcus probably hit similar race like I did
tonight, make trying to build host gengtype*.o when bconfig.h has not been
generated yet, and added for gengtype-lex.o $(BCONFIG_H) dependency
(but not to any other of the gengtype*.o files).

Note, config.h has an #error for #ifdef GENERATOR_FILE, to prevent people
from including config.h in build/gen* stuff by mistake.

So, this patch fixes this by just using a different define and defines
GENERATOR_FILE only after including config.h in the host gengtype* object
files.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-25  Jakub Jelinek  ja...@redhat.com

* Makefile.in (gengtype-lex.o): Drop dependency on $(BCONFIG_H).
(CFLAGS-gengtype-lex.o, CFLAGS-gengtype-parse.o,
CFLAGS-gengtype-state.o, CFLAGS-gengtype.o): Add -DHOST_GENGTYPE
instead of -DGENERATOR_FILE.
* gengtype.c: Instead of testing GENERATOR_FILE define, test
HOST_GENGTYPE.  If defined, include config.h and define GENERATOR_FILE
afterwards, otherwise include bconfig.h.
* gengtype-parse.c, gengtype-state.c, gengtype-lex.l): Likewise.

--- gcc/Makefile.in.jj  2014-11-18 18:17:53.0 +0100
+++ gcc/Makefile.in 2014-11-24 21:43:47.960196848 +0100
@@ -2485,27 +2485,27 @@ build/gengenrtl.o : gengenrtl.c $(BCONFI
 # the build-%: rule doesn't apply to them.
 
 gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H)
-gengtype-lex.o: $(CONFIG_H) $(BCONFIG_H)
-CFLAGS-gengtype-lex.o += -DGENERATOR_FILE
+gengtype-lex.o: $(CONFIG_H)
+CFLAGS-gengtype-lex.o += -DHOST_GENGTYPE
 build/gengtype-lex.o: $(BCONFIG_H)
 
 gengtype-parse.o build/gengtype-parse.o : gengtype-parse.c gengtype.h \
   $(SYSTEM_H)
 gengtype-parse.o: $(CONFIG_H)
-CFLAGS-gengtype-parse.o += -DGENERATOR_FILE
+CFLAGS-gengtype-parse.o += -DHOST_GENGTYPE
 build/gengtype-parse.o: $(BCONFIG_H)
 
 gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \
   gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \
   $(XREGEX_H)
 gengtype-state.o: $(CONFIG_H)
-CFLAGS-gengtype-state.o += -DGENERATOR_FILE
+CFLAGS-gengtype-state.o += -DHOST_GENGTYPE
 build/gengtype-state.o: $(BCONFIG_H)
 gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\
   rtl.def insn-notes.def errors.h double-int.h version.h   \
   $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
 gengtype.o: $(CONFIG_H)
-CFLAGS-gengtype.o += -DGENERATOR_FILE
+CFLAGS-gengtype.o += -DHOST_GENGTYPE
 build/gengtype.o: $(BCONFIG_H)
 
 build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\
--- gcc/gengtype.c.jj   2014-11-20 17:06:24.0 +0100
+++ gcc/gengtype.c  2014-11-24 21:44:30.746446814 +0100
@@ -17,10 +17,11 @@
along with GCC; see the file COPYING3.  If not see
http://www.gnu.org/licenses/.  */
 
-#ifdef GENERATOR_FILE
-#include bconfig.h
-#else
+#ifdef HOST_GENGTYPE
 #include config.h
+#define GENERATOR_FILE 1
+#else
+#include bconfig.h
 #endif
 #include system.h
 #include errors.h/* for fatal */
--- gcc/gengtype-parse.c.jj 2014-09-25 15:03:05.0 +0200
+++ gcc/gengtype-parse.c2014-11-24 21:45:19.905585065 +0100
@@ -17,10 +17,11 @@
along with GCC; see the file COPYING3.  If not see
http://www.gnu.org/licenses/.  */
 
-#ifdef GENERATOR_FILE
-#include bconfig.h
-#else
+#ifdef HOST_GENGTYPE
 #include config.h
+#define GENERATOR_FILE 1
+#else
+#include bconfig.h
 #endif
 #include system.h
 #include gengtype.h
--- gcc/gengtype-state.c.jj 2014-09-25 15:01:57.0 +0200
+++ gcc/gengtype-state.c2014-11-24 21:44:59.293946382 +0100
@@ -23,10 +23,11 @@
and Basile Starynkevitch bas...@starynkevitch.net
 */
 
-#ifdef GENERATOR_FILE
-#include bconfig.h
-#else
+#ifdef HOST_GENGTYPE
 #include config.h
+#define GENERATOR_FILE 1
+#else
+#include bconfig.h
 #endif
 #include system.h
 #include errors.h/* For fatal.  */
--- gcc/gengtype-lex.l.jj   2014-09-25 15:02:45.0 +0200
+++ gcc/gengtype-lex.l  2014-11-24 21:45:41.486206761 +0100
@@ -21,10 +21,11 @@ along with GCC; see the file COPYING3.
 %option noinput
 
 %{
-#ifdef GENERATOR_FILE
-#include bconfig.h
-#else
+#ifdef HOST_GENGTYPE
 #include config.h
+#define GENERATOR_FILE 1
+#else
+#include bconfig.h
 #endif
 #include system.h
 

Jakub

Re: [PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 09:13:10AM +0100, Uros Bizjak wrote:
 On Tue, Nov 25, 2014 at 8:40 AM, Uros Bizjak ubiz...@gmail.com wrote:
  On Tue, Nov 25, 2014 at 12:25 AM, Jakub Jelinek ja...@redhat.com wrote:
 
  The fallback delegitimization I've added as last option mainly for
  debug info purposes, when we don't know if the base is a PIC register
  or say a PIC register plus some addend, unfortunately in some tests
  broke find_base_term, which for PLUS looks only at the first operand
  and recursion on it finds a base term, it returns it immediately.
  So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base
  term is actually in the second operand.
 
  This patch fixes it by swapping the operands, debug info doesn't care about
  the order, it won't match in any instruction anyway, but helps alias.c.
 
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
  2014-11-24  Jakub Jelinek  ja...@redhat.com
 
  PR lto/64025
  * config/i386/i386.c (ix86_delegitimize_address): Ensure result
  comes before (addend - _GLOBAL_OFFSET_TABLE_) term.
 
  Can you also swap operands of (%ecx - %ebx) + foo? There is no point
  digging into RTX involving registers only when we know that we are
  looking for foo. This will also be consistent with the code you
  patched below.
 
 Something like attached prototype patch.

Actually, thinking about it more, at least according to
commutative_operand_precedence the canonical order is
what we used to return (i.e. (something - _G_O_T_) + (symbol_ref)
or
(something - _G_O_T_) + (const (symbol_ref +- const))
So perhaps better fix is to follow find_base_value, which does something
like:
/* Guess which operand is the base address:
   If either operand is a symbol, then it is the base.  If
   either operand is a CONST_INT, then the other is the base.  */
if (CONST_INT_P (src_1) || CONSTANT_P (src_0))
  return find_base_value (src_0);
else if (CONST_INT_P (src_0) || CONSTANT_P (src_1))
  return find_base_value (src_1);
and do something similar in find_base_term too.  I.e. perhaps even with
higher precedence over REG_P with REG_POINTER (or lower, in these cases
it doesn't really matter, neither argument is REG_P), choose first
operand that is CONSTANT_P and not CONST_INT_P.

Jakub

Re: [PATCH] Fix building of gengtype

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 12:35:09AM +0100, Jakub Jelinek wrote:
 My last 2 bootstraps failed, both because of a race while building
 host gengtype (each time different gengtype*.o).

Found bootstrap failures even with this patch (dunno what changed on my box
that I started getting these last night, make has not changed), that time
with errors.o and gcc-ar.o.
The generated headers are solved these days in automatic dependencies world
through
# In order for parallel make to really start compiling the expensive
# objects from $(OBJS) as early as possible, build all their
# prerequisites strictly before all objects.
$(ALL_HOST_OBJS) : | $(generated_files)
and build/*.o have explicit dependencies.
I've tried to compare $(ALL_HOST_OBJS) on my box with all *.o */*.o files
I had in stage3 directory, and besides build/*.o, I found:

crtbegin.o crtbeginS.o crtbeginT.o crtend.o crtendS.o crtfastmath.o crtprec32.o 
crtprec64.o crtprec80.o
errors.o gcc-ar.o gcc-nm.o gcc-ranlib.o gengtype-lex.o gengtype.o 
gengtype-parse.o gengtype-state.o

not being listed in ALL_HOST_OBJS.  The crt*.o files come from libgcc build
and thus are ok, the rest I've tried to handle in the following updated
patch.  If the #define GENERATOR_FILE inside of the 5 files is too ugly,
another alternative might be to define both -DHOST_GENERATOR_FILE 
-DGENERATOR_FILE
in Makefile.in and don't error in config.h if GENERATOR_FILE is defined,
if HOST_GENERATOR_FILE is also defined.

2014-11-25  Jakub Jelinek  ja...@redhat.com

* Makefile.in (ALL_HOST_BACKEND_OBJS): Add $(GENGTYPE_OBJS),
gcc-ar.o, gcc-nm.o and gcc-ranlib.o.
(GENGTYPE_OBJS): New.
(gengtype-lex.o, gengtype-parse.o, gengtype-state.o, gengtype.o):
Remove explicit dependencies.
(CFLAGS-gengtype-lex.o, CFLAGS-gengtype-parse.o,
CFLAGS-gengtype-state.o, CFLAGS-gengtype.o): Add -DHOST_GENERATOR_FILE
instead of -DGENERATOR_FILE.
(CFLAGS-errors.o): New.
* gengtype.c: Instead of testing GENERATOR_FILE define, test
HOST_GENERATOR_FILE.  If defined, include config.h and define
GENERATOR_FILE afterwards, otherwise include bconfig.h.
* gengtype-parse.c: Likewise.
* gengtype-state.c: Likewise.
* gengtype-lex.l: Likewise.
* errors.c: Likewise.

--- gcc/Makefile.in.jj  2014-11-25 00:06:43.122178737 +0100
+++ gcc/Makefile.in 2014-11-25 08:55:34.727300843 +0100
@@ -1509,7 +1509,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(C
 ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
-  $(GCOV_TOOL_OBJS) lto-wrapper.o collect-utils.o
+  $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
+  lto-wrapper.o collect-utils.o
 
 # This lists all host object files, whether they are included in this
 # compilation or not.
@@ -2484,30 +2485,31 @@ build/gengenrtl.o : gengenrtl.c $(BCONFI
 # on BCONFIG_H.  For the build objects, add -DGENERATOR_FILE manually,
 # the build-%: rule doesn't apply to them.
 
+GENGTYPE_OBJS = gengtype.o gengtype-parse.o gengtype-state.o \
+  gengtype-lex.o errors.o
+
 gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H)
-gengtype-lex.o: $(CONFIG_H) $(BCONFIG_H)
-CFLAGS-gengtype-lex.o += -DGENERATOR_FILE
+CFLAGS-gengtype-lex.o += -DHOST_GENERATOR_FILE
 build/gengtype-lex.o: $(BCONFIG_H)
 
 gengtype-parse.o build/gengtype-parse.o : gengtype-parse.c gengtype.h \
   $(SYSTEM_H)
-gengtype-parse.o: $(CONFIG_H)
-CFLAGS-gengtype-parse.o += -DGENERATOR_FILE
+CFLAGS-gengtype-parse.o += -DHOST_GENERATOR_FILE
 build/gengtype-parse.o: $(BCONFIG_H)
 
 gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \
   gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \
   $(XREGEX_H)
-gengtype-state.o: $(CONFIG_H)
-CFLAGS-gengtype-state.o += -DGENERATOR_FILE
+CFLAGS-gengtype-state.o += -DHOST_GENERATOR_FILE
 build/gengtype-state.o: $(BCONFIG_H)
 gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\
   rtl.def insn-notes.def errors.h double-int.h version.h   \
   $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
-gengtype.o: $(CONFIG_H)
-CFLAGS-gengtype.o += -DGENERATOR_FILE
+CFLAGS-gengtype.o += -DHOST_GENERATOR_FILE
 build/gengtype.o: $(BCONFIG_H)
 
+CFLAGS-errors.o += -DHOST_GENERATOR_FILE
+
 build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\
   errors.h $(READ_MD_H)
 build/genmodes.o : genmodes.c $(BCONFIG_H) $(SYSTEM_H) errors.h
\
--- gcc/gengtype.c.jj   2014-11-21 10:17:06.135695325 +0100
+++ gcc/gengtype.c  2014-11-25 08:56:18.042523089 +0100
@@ -17,10 +17,11 @@
along with GCC; see the file COPYING3.  If not see
http://www.gnu.org/licenses/.  */
 
-#ifdef GENERATOR_FILE
-#include bconfig.h
-#else
+#ifdef HOST_GENERATOR_FILE
 #include config.h
+#define GENERATOR_FILE 1
+#else
+#include bconfig.h

Re: [PATCH] Remove unnecessary calls to strchr.

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 03:15:04PM +0300, Ilya Tocar wrote:
 As proposed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853
 this patch replaces some function calls with pointer arithmetic.
 I didn't mention PR in Changelog, as they are not actually related.
 Ok for trunk?
 @@ -3408,8 +3408,7 @@ handle_foffload_option (const char *arg)
 if (n == NULL)
   n = strchr (c, '\0');
  
 -   if (strlen (target) == (size_t) (n - c)
 -strncmp (target, c, n - c) == 0)
 +   if (next - cur == n - c  strncmp (target, c, n - c) == 0)

I suppose you could use memcmp here, you know the string lengths.

 @@ -3431,8 +3433,7 @@ handle_foffload_option (const char *arg)
 if (n == NULL)
   n = strchr (c, '\0');
  
 -   if (strlen (target) == (size_t) (n - c)
 -strncmp (c, target, n - c) == 0)
 +   if (next - cur == n - c  strncmp (c, target, n - c) == 0)
   break;

And here too.

Ok with or without those changes.

Jakub

Re: Document __builtin_*_overflow

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 07:50:02PM +0100, Gerald Pfeifer wrote:
 Hi Jakub,
 
 On Wednesday 2014-11-12 14:13, Jakub Jelinek wrote:
  This patch mentions __builtin_*_overflow in gcc-5/changes.html.
  Ok for CVS?
 
 I've fallen a bit behind with GCC patches, sorry.
 
 What do you think about this follow-up patch on top of yours?

LGTM, thanks.

 --- changes.html  23 Nov 2014 14:42:28 -  1.41
 +++ changes.html  25 Nov 2014 18:49:02 -
 @@ -157,14 +157,14 @@
   These builtins have two integral arguments (which don't need to have
   the same type), the arguments are extended to infinite precision
   signed type, code+/code, code-/code or code*/code
 - is performed on those and the result is stored into some integer
 - variable pointed by the last argument.  If the stored value is equal
 - to the infinite precision result, the built-in functions return
 + is performed on those, and the result is stored in an integer
 + variable pointed to by the last argument.  If the stored value is
 + equal to the infinite precision result, the built-in functions return
   codefalse/code, otherwise codetrue/code.  The type of
   the integer variable that will hold the result can be different from
 - the types of arguments.  The following snippet demonstrates how
 - this can be used in computing the size for the codecalloc/code
 - function:
 + the types of the first two arguments.  The following snippet
 + demonstrates how this can be used in computing the size for the
 + codecalloc/code function:
  blockquotepre
  void *
  calloc (size_t x, size_t y)
 @@ -177,8 +177,8 @@
return ret;
  }
  /pre/blockquote
 - On e.g. i?86 or x86-64 the above will result in codemul/code
 - instruction followed by jump on overflow.
 + On e.g. i?86 or x86-64 the above will result in a codemul/code
 + instruction followed by a jump on overflow.
  /li
  liThe option code-fextended-identifiers/code is now enabled
   by default for C++, and for C99 and later C versions.  Various
 
 Gerald

Jakub

Re: [PATCH] gcc parallel make check

2014-11-25 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 03:27:40PM +0100, Tom de Vries wrote:
 This patch fixes that by ensuring that we print that unsupported message only 
 once.
 
 The resulting test result comparison diff is:
 2014-11-25  Tom de Vries  t...@codesourcery.com
 
   * testsuite/libstdc++-prettyprinters/prettyprinters.exp: Add missing
   dg-finish.  Only print unsupported message once.

LGTM.

 --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
 +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp
 @@ -30,7 +30,14 @@ if ![info exists ::env(GUALITY_GDB_NAME)] {
  }
  
  if {! [gdb_version_check]} {
 +dg-finish
 +# Only print unsupported message in one instance.
 +if ![gcc_parallel_test_run_p prettyprinters] {
 + return
 +}
 +gcc_parallel_test_enable 0
  unsupported prettyprinters.exp
 +gcc_parallel_test_enable 1
  return
  }
  
 -- 
 1.9.1
 


Jakub

Re: [PATCH v3] gcc/c-family/c-cppbuiltin.c: Let buffer enough to print host wide integer value

2014-11-25 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 09:41:16AM +0800, Chen Gang wrote:
 On 11/26/14 8:31, Joseph Myers wrote:
  On Wed, 26 Nov 2014, Chen Gang wrote:
  
  +  gcc_assert (wi::fits_to_tree_p (value, char_type_node)
  +|| wi::fits_to_tree_p (value, short_integer_type_node)
  +|| wi::fits_to_tree_p (value, integer_type_node)
  +|| wi::fits_to_tree_p (value, long_integer_type_node)
  +|| wi::fits_to_tree_p (value, long_long_integer_type_node));
  
  It doesn't make sense to check for char or short, since you can't write a 
  constant of one of those types.  And it doesn't make sense to check for 
  int or long when checking for long long, as the ranges of int and long are 
  subsets of that of long long.  So just check long long here.
  
  +  buf = (char *) alloca (strlen (macro) + vlen + extra);
  +
  +  sprintf (buf, %s=%sHOST_WIDE_INT_PRINT_DEC%s%s,

Oh, and please use spaces around HOST_WIDE_INT_PRINT_DEC etc.,
with C++11 user defined literals it is always better to have whitespace
in there.

Jakub

Patch ping^2: [PATCH] -fsanitize=vptr instrumentation (take 2)

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 12, 2014 at 03:05:46PM +0100, Jakub Jelinek wrote:
 On Tue, Oct 28, 2014 at 01:44:50PM +0100, Jakub Jelinek wrote:
  On Mon, Oct 27, 2014 at 05:16:05PM +0100, Jakub Jelinek wrote:
   Here is an updated patch, ok if bootstrap/testing passes (so far just
   checked with
   make -j16 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
   asan.exp tsan.exp ubsan.exp'
   )?
 
 I'd like to ping the
 https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02945.html
 patch.

Ping.

Jakub

Re: [PATCH] Enhance ASAN_CHECK optimization

2014-11-26 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 08:06:00PM +0300, Yury Gribov wrote:
 +/* Traits class for tree hash maps below.  */
 +
 +struct tree_map_traits : default_hashmap_traits
 +{
 +  static inline hashval_t hash (const_tree ref)
 +{
 +  return iterative_hash_expr (ref, 0);
 +}
 +
 +  static inline bool equal_keys (const_tree ref1, const_tree ref2)
 +{
 +  return operand_equal_p (ref1, ref2, 0);
 +}

Formatting.  The {} should be indented like static and return 2 columns to
the right of that.

 @@ -281,37 +316,46 @@ maybe_optimize_asan_check_ifn (struct sanopt_ctx *ctx, 
 gimple stmt)
  
gimple_set_uid (stmt, info-freeing_call_events);
  
 -  auto_vecgimple v = ctx-asan_check_map.get_or_insert (ptr);
 -  if (v.is_empty ())
 +  auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr);
 +  gimple g = maybe_get_dominating_check (*ptr_checks);
 +
 +  tree base_addr = maybe_get_single_definition (ptr);
 +  auto_vecgimple *base_checks = NULL;
 +  if (base_addr)
  {
 -  /* For this PTR we don't have any ASAN_CHECK stmts recorded, so there's
 -  nothing to optimize yet.  */
 -  v.safe_push (stmt);
 -  return false;
 +  base_checks = ctx-asan_check_map.get_or_insert (base_addr);
 +  /* Original pointer might have been invalidated.  */
 +  ptr_checks = ctx-asan_check_map.get (ptr);
  }

For base_addr computation, you don't really need g or ptr_checks,
do you?  So why not move the:
  auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr);
  gimple g = maybe_get_dominating_check (*ptr_checks);
lines below the if?

 @@ -404,10 +445,7 @@ sanopt_optimize_walker (basic_block bb, struct 
 sanopt_ctx *ctx)
basic_block son;
gimple_stmt_iterator gsi;
sanopt_info *info = (sanopt_info *) bb-aux;
 -  bool asan_check_optimize
 -= (flag_sanitize  SANITIZE_ADDRESS)
 -   ((flag_sanitize  flag_sanitize_recover
 - SANITIZE_KERNEL_ADDRESS) == 0);
 +  bool asan_check_optimize = (flag_sanitize  SANITIZE_ADDRESS) != 0;
  
for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
  {

I'm afraid I'm not convinced about this hunk.  If asan (kernel-address) is
recovering, I don't see a difference from not reporting two different
invalid accesses to the same function and not reporting two integer
overflows in the same function, at least if they have different location_t.

Jakub

Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 11:19:42AM +0100, Mark Wielaard wrote:
 Ping. Rebased patch attached.
 
 I have submitted patches for elfutils, valgrind, gdb and binutils for
 this. But they are pending till this patch hits GCC first.

Ok, thanks.

Jakub

Re: [PATCH] Fix PR bootstrap/63995

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 03:41:46PM +0300, Ilya Enkovich wrote:
 Hi,
 
 This patch makes optimization for bounds lifetime reduction to ignore
 debug stetments.  This fixes stage2 and stage3 comparision for
 instrumented boostrap.  OK for trunk?

Please add a small testcase (with -fcompare-debug -fcheck-pointer-bounds (or 
what
other options you need to reproduce it) that fails without the patch and
succeeds with it.

 2014-11-26  Ilya Enkovich  ilya.enkov...@intel.com
 
   PR bootstrap/63995
   * tree-chkp-opt.c (chkp_reduce_bounds_lifetime): Ignore
   debug statement when searching for a new position for
   bounds load/creation statement.
 
 
 diff --git a/gcc/tree-chkp-opt.c b/gcc/tree-chkp-opt.c
 index ff390d7..92e0694 100644
 --- a/gcc/tree-chkp-opt.c
 +++ b/gcc/tree-chkp-opt.c
 @@ -1175,6 +1175,9 @@ chkp_reduce_bounds_lifetime (void)
  
FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op)
   {
 +   if (is_gimple_debug (use_stmt))
 + continue;
 +
 if (dom_bb 
 dominated_by_p (CDI_DOMINATORS,
 dom_bb, gimple_bb (use_stmt)))

Jakub

Re: [PATCH] Fix PR bootstrap/63995

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 04:48:13PM +0300, Ilya Enkovich wrote:
 2014-11-26  Ilya Enkovich  ilya.enkov...@intel.com
 
   PR bootstrap/63995
   * tree-chkp-opt.c (chkp_reduce_bounds_lifetime): Ignore
   debug statement when searching for a new position for
   bounds load/creation statement.
 
 gcc/testsuite/
 
 2014-11-26  Ilya Enkovich  ilya.enkov...@intel.com
 
   PR bootstrap/63995
   * gcc.target/i386/pr63995-2.c: New.

Ok, thanks.

Jakub

Re: [PATCH] Fix sanitizer/63788

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 02:57:20PM +0100, Marek Polacek wrote:
 Ping.

Ok, thanks.
 
 On Wed, Nov 19, 2014 at 08:09:21PM +0100, Marek Polacek wrote:
  As discussed in the PR, the problem here is that when running gfortran,
  the __builtin_object_size decl isn't available, because c-family's
  c_define_builtins wasn't called.  One way how to fix this is to build
  the __builtin_object_size decl in initialize_sanitizer_builtins, if
  needed.  Alternatively we could just bail in instrument_object_size
  if builtin_decl_explicit (BUILT_IN_OBJECT_SIZE) returns NULL_TREE...
  
  No test attached since we don't have any Fortran ubsan infrastructure,
  I've just tried
  ./xgcc -B ./ -B ../x86_64-unknown-linux-gnu/libsanitizer/ubsan/.libs/ 
  -Wl,-rpath=../x86_64-unknown-linux-gnu/libsanitizer/ubsan/.libs/ -B 
  ../x86_64-unknown-linux-gnu/libgfortran/.libs/ -O -fsanitize=undefined 
  testcase.f -lgfortran; ./a.out
  on the testcase attached in PR - and it doesn't ICE.
  
  Bootstrapped/regtested on ppc64-linux, ok for trunk?
  
  2014-11-19  Marek Polacek  pola...@redhat.com
  
  PR sanitizer/63788
  * asan.c (initialize_sanitizer_builtins): Add BT_FN_SIZE_CONST_PTR_INT
  var.  Conditionally build BUILT_IN_OBJECT_SIZE decl.
  (ATTR_PURE_NOTHROW_LEAF_LIST): Define.

Jakub

Re: [PATCH] Enhance array types debug info. for Ada

2014-11-26 Thread Jakub Jelinek

On Wed, Oct 08, 2014 at 09:05:30PM +0200, Pierre-Marie de Rodat wrote:
 gcc/
   * dwarf2out.h (struct array_descr_info): Remove the base_decl field.
   * dwarf2out.c (enum dw_scalar_form): New.
   (struct loc_descr_context): New.
   (add_scalar_info): New.
   (add_bound_info): Add a context parameter.  Use add_scalar_info.
   (loc_list_from_tree): Add a context parameter.  Handle PLACEHOLDER_EXPR
   nodes for type-related expressions.  Likewise for base declarations.
   (loc_descriptor_from_tree): Add a context parameter.
   (subrange_type_die): Update calls to add_bound_info.
   (tls_mem_loc_descriptor): Likewise.
   (loc_list_for_address_of_addr_expr_of_indirect_ref): Add a context
   parameter.  Update calls to loc_list_from_tree.
   (add_subscript_info): Update calls to add_bound_info.
   (gen_array_type_die): Update calls to loc_list_from_tree and to
   add_bound_info.
   (descr_info_loc): Remove.
   (add_descr_info_field): Remove.
   (gen_descr_array_type_die): Switch add_descr_info_field calls into
   add_scalar_info/add_bound_info ones.
   (gen_subprogram_die): Update calls to loc_list_from_tree.
   (gen_variable_die): Likewise.

Replace implicitely with implicitly in the whole patch.

 + to refer to register values).
 +
 +   CONTEXT provides information to customize the location descriptions
 +   generation.  Its context_type field specifies what type is implicitely
 +   referenced by DW_OP_push_object_address.  If it is NULL_TREE, this 
 operation
 +   will not be generated.
 +
 +   If CONTEXT is NULL, the behavior is the same as if the context_type field
 +   was NULL_TREE.  */

as if both context_type and base_decl were NULL_TREE?

 @@ -14311,6 +14351,12 @@ loc_list_from_tree (tree loc, int want_address)
   extending the values properly.  Hopefully this won't be a real
   problem...  */
  
 +  if (context != NULL
 +   context-base_decl == loc
 +   want_address == 0)
 +return new_loc_list (new_loc_descr (DW_OP_push_object_address, 0, 0),
 +  NULL, NULL, NULL);
 +

This isn't guarded with dwarf_version = 3 || !dwarf_strict.  Shouldn't it
be too and return NULL otherwise?

 + expansion_failed (loc, NULL_RTX,
 +   PLACEHOLDER_EXPR for a unexpected type);

for an unexpected type?

 @@ -14533,7 +14594,8 @@ loc_list_from_tree (tree loc, int want_address)
  
   list_ret = loc_list_from_tree (obj,
  want_address == 2
 - !bitpos  !offset ? 2 : 1);
 + !bitpos  !offset ? 2 : 1,
 +   context);

Formatting.  Should use tabs, not spaces.

 +  if (prec = HOST_BITS_PER_WIDE_INT
 +|| tree_fits_uhwi_p (value))

Formatting.  || should be below p in prec.

Would be nice if you tried more than one fortran testcase, say build
all gfortran.dg/ tests with -O0 -g -dA (and perhaps -O2 -g -dA afterwards)
with both unpatched and patched compilers and diff *.s files?

Otherwise, LGTM.

Jakub

Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
 Both C11 and C++14 standards specify that integral promotions are
 performed on both operands of a shift-expression.  This we do just
 fine.  But then we convert the right operand to integer_type_node.
 Not only is this unnecessary, it can also be harfmul, because for
 e.g. 
 void
 foo (unsigned int x)
 {
   if (-1  x != -1)
 bar ();
 }
 with (int) x we lose info that x is nonnegative, which means that
 tree_expr_nonnegative_p cannot fold this expr.  Another problem
 with the conversion is that we weren't able to detect e.g. shift
 by 0x1ULL, since after the conversion this is 0.

Have you checked what the expander does with it?  Say trying to
shift something by __int128 count or similar might upset it.
Wonder about if we don't make similar assumptions in the middle-end.
It might make a difference (sometimes positive, sometimes negative)
for vectorization too.

 2014-11-26  Marek Polacek  pola...@redhat.com
 
   PR c/63862
 c-family/
   * c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
   to op1_utype.

This part is ok regardless of the rest.

Jakub

Re: [PATCH] Enhance ASAN_CHECK optimization

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 11:21:06AM -0700, ygribov wrote:
  Formatting.  The {} should be indented like static
  and return 2 columns to the right of that.
 
 Right.
 
  For base_addr computation, you don't really need g or ptr_checks, 
  do you?  So why not move the: 
auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr); 
gimple g = maybe_get_dominating_check (*ptr_checks); 
  lines below the if?
 
 I can do this.  But then base_checks would be invalidated when I call
 get_or_insert for ptr_checks so I'll still have to hash_map::get.

You're right.

  If asan (kernel-address) is 
  recovering, I don't see a difference from not reporting two different 
  invalid accesses to the same function and not reporting two integer 
  overflows in the same function, at least if they have different
  location_t.
 
 Ok, agreed. BTW how about replacing ' SANITIZE_KERNEL_ADDRESS' with '
 SANITIZE_ADDRESS'?  I know we do not support recovery for userspace but
 having a general enum sounds more logical.

Testing SANITIZE_ADDRESS bit in flag_sanitize_recover doesn't make sense,
testing it in flag_sanitize of course does, but for recover you care whether
the SANITIZE_{KERNEL,USER}_ADDRESS bit in flag_sanitize_recover is set
depending on if SANITIZE_{KERNEL,USER}_ADDRESS is set in
flag_sanitize_recover.
So supposedly
((flag_sanitize  flag_sanitize_recover)
  (SANITIZE_USER_ADDRESS | SANITIZE_KERNEL_ADDRESS)) != 0
is the right check for whether the current address sanitization wants to
recover.

Jakub

Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 07:20:04PM +0100, Marek Polacek wrote:
 On Wed, Nov 26, 2014 at 06:50:29PM +0100, Jakub Jelinek wrote:
  On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
   Both C11 and C++14 standards specify that integral promotions are
   performed on both operands of a shift-expression.  This we do just
   fine.  But then we convert the right operand to integer_type_node.
   Not only is this unnecessary, it can also be harfmul, because for
   e.g. 
   void
   foo (unsigned int x)
   {
 if (-1  x != -1)

I'd say fold-const not simplifying this is a bug even if x is signed int.
I think negative shift counts are undefined even in the middle-end, Richard?
Treating 5  -5 as 5  5 looks just wrong to me, does any FE rely on that?
I don't think the expander will expand it that way though.

Anyway, if the shift count is unnecessary wide type, then the middle-end
won't be able to optimize it as much as it could if it has been a narrower
type.

So, for ubsan purposes (but, shifts are instrumented in the FEs) it is of
course desirable to see the original type, but perhaps it might be a good
idea to cast the shift count to integer_type_node say during gimplification
(perhaps in c_gimplify_expr?).

Jakub

Re: [PATCH] Enhance ASAN_CHECK optimization

2014-11-26 Thread Jakub Jelinek

On Wed, Nov 26, 2014 at 11:42:57AM -0700, ygribov wrote:
  Testing SANITIZE_ADDRESS bit in flag_sanitize_recover doesn't make sense, 
  testing it in flag_sanitize of course does, but for recover you care
  whether 
  the SANITIZE_{KERNEL,USER}_ADDRESS bit in flag_sanitize_recover is set 
  depending on if SANITIZE_{KERNEL,USER}_ADDRESS is set in 
  flag_sanitize_recover. 
 
 Ok, got it. BTW shouldn't we disable local optimization of ASan checks (in
 asan.c) as well? That would be a massive perf hit ...

Ah, you're right, we are already doing that.  So let's just optimize always
even when recovering and we'll see if users don't complain too much.

Jakub

[PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025, take 2)

2014-11-26 Thread Jakub Jelinek

On Tue, Nov 25, 2014 at 09:20:13AM +0100, Jakub Jelinek wrote:
 Actually, thinking about it more, at least according to
 commutative_operand_precedence the canonical order is
 what we used to return (i.e. (something - _G_O_T_) + (symbol_ref)
 or
 (something - _G_O_T_) + (const (symbol_ref +- const))
 So perhaps better fix is to follow find_base_value, which does something
 like:
 /* Guess which operand is the base address:
If either operand is a symbol, then it is the base.  If
either operand is a CONST_INT, then the other is the base.  */
 if (CONST_INT_P (src_1) || CONSTANT_P (src_0))
   return find_base_value (src_0);
 else if (CONST_INT_P (src_0) || CONSTANT_P (src_1))
   return find_base_value (src_1);
 and do something similar in find_base_term too.  I.e. perhaps even with
 higher precedence over REG_P with REG_POINTER (or lower, in these cases
 it doesn't really matter, neither argument is REG_P), choose first
 operand that is CONSTANT_P and not CONST_INT_P.

Here it is.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2014-11-26  Jakub Jelinek  ja...@redhat.com

PR lto/64025
* alias.c (find_base_term): Use std::swap.  Prefer tmp2
if it is CONSTANT_P other than CONST_INT.

--- gcc/alias.c.jj  2014-11-21 10:17:17.0 +0100
+++ gcc/alias.c 2014-11-26 12:31:24.719485590 +0100
@@ -1756,11 +1756,11 @@ find_base_term (rtx x)
if (REG_P (tmp1)  REG_POINTER (tmp1))
  ;
else if (REG_P (tmp2)  REG_POINTER (tmp2))
- {
-   rtx tem = tmp1;
-   tmp1 = tmp2;
-   tmp2 = tem;
- }
+ std::swap (tmp1, tmp2);
+   /* If second argument is constant which has base term, prefer it
+  over variable tmp1.  See PR64025.  */
+   else if (CONSTANT_P (tmp2)  !CONST_INT_P (tmp2))
+ std::swap (tmp1, tmp2);
 
/* Go ahead and find the base term for both operands.  If either base
   term is from a pointer or is a named object or a special address


Jakub

[PATCH] Fix simd clone vectorization (PR tree-optimization/64024)

2014-11-26 Thread Jakub Jelinek

Hi!

As discussed in the PR and on IRC, the problem here is that peeling
for alignment can for some linear argument that during vect analysis
passed simple_iv no longer pass it during vect transform phase.

So, to fix this, this patch remembers the base and step values from
simple_iv during vect analysis and uses them during transform phase
(biased by what the peeling for alignment advanced of course).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-26  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/64024
* tree-vectorizer.h (struct _stmt_vec_info): Remove simd_clone_fndecl
field.  Add simd_clone_info field.
(STMT_VINFO_SIMD_CLONE_FNDECL): Remove.
(STMT_VINFO_SIMD_CLONE_INFO): Define.
* tree-vect-stmts.c (vectorizable_simd_clone_call): Adjust for
STMT_VINFO_SIMD_CLONE_FNDECL becoming first element of
STMT_VINFO_SIMD_CLONE_INFO vector.  For linear arguments, remember
base and linear_step from analysis phase and use it during transform
phase, biased by the difference between LOOP_VINFO_NITERS{_UNCHANGED,}
multiplied by linear_step.
(free_stmt_vec_info): Release STMT_VINFO_SIMD_CLONE_INFO.

* gcc.dg/vect/vect-simd-clone-13.c: New test.
* gcc.dg/vect/vect-simd-clone-14.c: New test.

--- gcc/tree-vectorizer.h.jj2014-11-19 18:48:07.0 +0100
+++ gcc/tree-vectorizer.h   2014-11-26 12:56:00.899824766 +0100
@@ -602,8 +602,10 @@ typedef struct _stmt_vec_info {
  of this stmt.  */
   vecdr_p same_align_refs;
 
-  /* Selected SIMD clone's function decl.  */
-  tree simd_clone_fndecl;
+  /* Selected SIMD clone's function info.  First vector element
+ is SIMD clone's function decl, followed by a pair of trees (base + step)
+ for linear arguments (pair of NULLs for other arguments).  */
+  vectree simd_clone_info;
 
   /* Classify the def of this stmt.  */
   enum vect_def_type def_type;
@@ -677,7 +679,7 @@ typedef struct _stmt_vec_info {
 #define STMT_VINFO_RELATED_STMT(S) (S)-related_stmt
 #define STMT_VINFO_PATTERN_DEF_SEQ(S)  (S)-pattern_def_seq
 #define STMT_VINFO_SAME_ALIGN_REFS(S)  (S)-same_align_refs
-#define STMT_VINFO_SIMD_CLONE_FNDECL(S)   (S)-simd_clone_fndecl
+#define STMT_VINFO_SIMD_CLONE_INFO(S) (S)-simd_clone_info
 #define STMT_VINFO_DEF_TYPE(S) (S)-def_type
 #define STMT_VINFO_GROUP_FIRST_ELEMENT(S)  (S)-first_element
 #define STMT_VINFO_GROUP_NEXT_ELEMENT(S)   (S)-next_element
--- gcc/tree-vect-stmts.c.jj2014-11-19 18:47:59.0 +0100
+++ gcc/tree-vect-stmts.c   2014-11-26 15:38:59.883409014 +0100
@@ -2715,12 +2715,40 @@ vectorizable_simd_clone_call (gimple stm
   else
gcc_assert (thisarginfo.vectype != NULL_TREE);
 
-  if (thisarginfo.dt != vect_constant_def
-  thisarginfo.dt != vect_external_def
-  loop_vinfo
-  TREE_CODE (op) == SSA_NAME
-  simple_iv (loop, loop_containing_stmt (stmt), op, iv, false)
-  tree_fits_shwi_p (iv.step))
+  /* For linear arguments, the analyze phase should have saved
+the base and step in STMT_VINFO_SIMD_CLONE_INFO.  */
+  if (i * 2 + 3 = STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length ()
+  STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2])
+   {
+ gcc_assert (vec_stmt);
+ thisarginfo.linear_step
+   = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]);
+ thisarginfo.op
+   = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 1];
+ /* If loop has been peeled for alignment, we need to adjust it.  */
+ tree n1 = LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo);
+ tree n2 = LOOP_VINFO_NITERS (loop_vinfo);
+ if (n1 != n2)
+   {
+ tree bias = fold_build2 (MINUS_EXPR, TREE_TYPE (n1), n1, n2);
+ tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2];
+ tree opt = TREE_TYPE (thisarginfo.op);
+ bias = fold_convert (TREE_TYPE (step), bias);
+ bias = fold_build2 (MULT_EXPR, TREE_TYPE (step), bias, step);
+ thisarginfo.op
+   = fold_build2 (POINTER_TYPE_P (opt)
+  ? POINTER_PLUS_EXPR : PLUS_EXPR, opt,
+  thisarginfo.op, bias);
+   }
+   }
+  else if (!vec_stmt
+   thisarginfo.dt != vect_constant_def
+   thisarginfo.dt != vect_external_def
+   loop_vinfo
+   TREE_CODE (op) == SSA_NAME
+   simple_iv (loop, loop_containing_stmt (stmt), op,
+iv, false)
+   tree_fits_shwi_p (iv.step))
{
  thisarginfo.linear_step = tree_to_shwi (iv.step);
  thisarginfo.op = iv.base;
@@ -2735,8 +2763,8 @@ vectorizable_simd_clone_call (gimple stm
 
   unsigned int badness = 0;
   struct cgraph_node *bestn = NULL

< 1 2 3 4 5 6 7 8 9 10 >

201 - 300 of 18952 matches

Mail list logo