Re: [PATCH] Optimize ASAN_CHECK checks
On Fri, Nov 14, 2014 at 12:25:37PM +0100, Dodji Seketeli wrote: + /* True if there is a block with HAS_FREEING_CALL_P flag set + on any a path between imm(BB) and BB. */ s/a//. Also, I'd rather say on any path between an immediate dominator of BB, denoted imm(BB), and BB. That way, subsequent uses of imm(BB) makes sense more for the new comer. This is only a suggestion. If you feel that formulation is obvious enough, then please ignore my comment. Ok. +/* Return true if there might be any call to free/munmap operation + on any path in between DOM (which should be imm(BB)) and BB. */ + +static bool +imm_dom_path_with_freeing_call (basic_block bb, basic_block dom) +{ To ease maintainability, maybe we could assert that: gcc_assert (dom == get_immediate_dominator(CDI_DOMINATORS, bb)); Well, that would make the dom argument useless, the point of passing it around was that because we have to call it in the caller already, there is no point in calling it again the the callee. So if we as well call it (most people don't use --disable-checking), we could just drop the argument. ? And thus remove the assert that is at the caller site of this function, later in maybe_optimize_asan_check_ifn: + basic_block imm = get_immediate_dominator (CDI_DOMINATORS, last_bb); + gcc_assert (imm); The assert is very much intentional here, want to double-check the algorithm that we indeed always reach gbb through the imm dominator walk (unless bailing out early, but if we wouldn't, we'd reach it). + if (imm_dom_path_with_freeing_call (last_bb, imm)) + break; Also, when the 'dom' basic block is NULL, couldn't we just return immediately? get_immediate_dominator (CDI_DOMINATORS, ) should return NULL just for the ENTRY block, it would be a bug to reach that bb. +/* Optimize away redundant ASAN_CHECK calls. */ + +static bool +maybe_optimize_asan_check_ifn (struct sanopt_ctx *ctx, gimple stmt) +{ + gcc_assert (gimple_call_num_args (stmt) == 4); + tree ptr = gimple_call_arg (stmt, 1); + tree len = gimple_call_arg (stmt, 2); + basic_block bb = gimple_bb (stmt); + sanopt_info *info = (sanopt_info *) bb-aux; + + if (TREE_CODE (len) != INTEGER_CST) +return false; + if (integer_zerop (len)) +return false; + + gimple_set_uid (stmt, info-freeing_call_events); I am not sure, but I am wondering if we shouldn't save the previous uid of 'stmt' here before setting it, and then restore it before getting out of this function. No, gimple uids are AFAIK undefined at the start of passes, passes that use them are supposed to initialize them before use (new statements created during the pass will get 0 there by default), and don't have to clean them up anyway at the end of pass. @@ -89,111 +402,77 @@ sanopt_optimize_walker (basic_block bb, I think the comment of this sanopt_optimize_walker function should now be adapted to say that it optimizes away redundant UBSAN_NULL *and* ASAN_CHECK internal function calls. Ok, will do. Jakub
Re: [PATCH] Optimize ASAN_CHECK checks
On Fri, Nov 14, 2014 at 01:06:52PM +0100, Dodji Seketeli wrote: Jakub Jelinek ja...@redhat.com writes: I am not sure, but I am wondering if we shouldn't save the previous uid of 'stmt' here before setting it, and then restore it before getting out of this function. No, gimple uids are AFAIK undefined at the start of passes, passes that use them are supposed to initialize them before use (new statements created during the pass will get 0 there by default), and don't have to clean them up anyway at the end of pass. Yeah, this is what I figured by grepping other passes, but I wasn't sure :-) Maybe I should follow up with a doc patch for the (otherwise very terse) comment of gimple_set_uid and gimple_uid accessors. That would be indeed nice (similarly for other stuff that we expect to be undefined on pass boundaries, or expect to be in certain state at pass boundaries; in the former case set before uses, and don't care about what state we leave it in, in the latter case can assume some state first (say 0) and have to put it back into the same state). There are various visited flags and the like, Richard, any ideas what other things might be nice to document? Jakub
Re: [changes.html] Document -fdiagnostics-color= default changes
On Fri, Nov 14, 2014 at 01:14:02PM +0100, Manuel López-Ibáñez wrote: h3 id=c-familyC family/h3 ul +liThe code-fdiagnostics-color=/code option default is now +configurable at GCC configury time using +code--with-diagnostics-color=/code, can default to +codeauto/code - the new default unless configured otherwise, +where diagnostics is colorized by default when emitted to terminal, +codenever/code, codealways/code or codeauto-if-env/code, +which is the default of GCC 4.9 - codeauto/code if non-empty +codeGCC_COLORS/code is in the environment, codenever/code +otherwise. Note, as before, having empty codeGCC_COLORS/code +variable in the environment will always turn the coloring off, no +matter what the default is or what command line options are used./li This not only affects 'C family' but also Fortran now, so perhaps it should go in the Caveats section at the top or on a Building GCC Well, it doesn't look like something for Caveats section IMHO, it is an extension of the previous behavior. The reason I put it into the C family part is that that is where it was documented for 4.9, and for Fortran IMHO gcc-5/changes.html just should document that now it supports diagnostics colors like the C family of frontends. section. Apart from that, do you think the following is a bit clearer? liThe default setting of the code-fdiagnostics-color=/code option is now configurable a href =https://gcc.gnu.org/install/configure.html;when building GCC/a using configuration option code--with-diagnostics-color=/code. The possible values are: codenever/code, codealways/code, codeauto/code and codeauto-if-env/code. The new default codeauto/code means to use color only when the standard error is a terminal. The default in GCC 4.9 was codeauto-if-env/code, which defaults to codeauto/code if there is a non-empty codeGCC_COLORS/code environment variable, and codenever/code otherwise. As in GCC 4.9, an empty codeGCC_COLORS/code variable in the environment will always disable colors, no matter what the default is or what command line options are used./li Indeed, it does. So feel free to turn that into patch form. Jakub
Re: [PATCH] Optimize ASAN_CHECK checks
On Fri, Nov 14, 2014 at 01:16:36PM +0100, Richard Biener wrote: That would be indeed nice (similarly for other stuff that we expect to be undefined on pass boundaries, or expect to be in certain state at pass boundaries; in the former case set before uses, and don't care about what state we leave it in, in the latter case can assume some state first (say 0) and have to put it back into the same state). There are various visited flags and the like, Richard, any ideas what other things might be nice to document? aux pointers on CFG structures which I believe have to be cleared after each pass that uses them (maybe already documented). Nothing else off the top of my head. There is at least gimple_plf GF_PLF_{1,2} too, gimple_visited_p, BB_VISITED, ... Jakub
Re: The nvptx port
On Fri, Nov 14, 2014 at 01:12:40PM +0100, Bernd Schmidt wrote: :(. So what other option one has to implement something like TLS, even using inline asm or similar? There is %tid, so perhaps indexing some array with %tid? That ought to work. For performance you'd want that array in .shared memory but I believe that's limited in size. Any way to query those limits? Size of .shared memory, number of threads in warp, number of warps, etc.? In OpenACC, are all workers in a single gang the same warp? BTW, one can still invoke OpenMP target regions (even OpenACC regions) from multiple host threads, so the question is how without local TLS we can actually do anything at all. Sure, we can pass parameters to the kernel, but we'd need to propagate it through all functions. Or can cudaGetParameterBuffer be used for that? Presumably a kernel could copy its arguments out to memory somewhere when it's called? The question is where. If it is global memory, then how would you find out what value is for your team and what value is for some other team? - we'll need some synchronization primitives, I see atomic support is there, we need mutexes and semaphores I think, is that implementable using bar instruction? It's probably membar you need. That is a memory barrier, I need threads to wait on each other, wake up one another etc. Hmm. It's worthwhile to keep in mind that GPU threads really behave somewhat differently from CPUs (they don't really execute independently); the OMP model may just be a poor match for the architecture in general. One could busywait on a spinlock, but AFAIK there isn't really a way to put a thread to sleep. By not executing independently, I mean this: I believe if one thread in a warp is waiting on the spinlock, all the other ones are also busywaiting. There may be other effects that seem odd if one approaches it from a CPU perspective - for example you probably want only one thread in a warp to try to take the spinlock. So, for a warp, if some threads perform one branch of an if and other threads another one, all threads perform the first one first (with some maybe not doing anything), then all the threads the others (again, other threads not doing anything)? As for the match, OpenMP isn't written for a particular accelerator, though supposedly the addition of #pragma omp teams construct was done for NVidia. So, some OpenMP code may be efficient on PTX, while other code might not be that much (e.g. if all threads in a warp need to execute the same thing, supposedly #pragma omp task isn't very good idea for the devices). Jakub
Re: libsanitizer merge from upstream r221802
On Fri, Nov 14, 2014 at 03:11:16PM +0100, Uros Bizjak wrote: The missing definition in system's /usr/include/linux/types.h is protected with: typedef __u16 __bitwise __le16; typedef __u16 __bitwise __be16; typedef __u32 __bitwise __le32; typedef __u32 __bitwise __be32; #if defined(__GNUC__) !defined(__STRICT_ANSI__) typedef __u64 __bitwise __le64; typedef __u64 __bitwise __be64; #endif which doesn't work with -std=c++11, but works without problems with -std=gnu++11. As proposed by Jakub at [1], -std=gnu++11 fixes the problem with old kernels. Attached patch implements this proposal. 2014-11-14 Uros Bizjak ubiz...@gmail.com * sanitizer_common/Makefile.am (AM_CXXFLAGS): Use -std=gnu++11. * asan/Makefile.am (AM_CXXFLAGS): Ditto. * lsan/Makefile.am (AM_CXXFLAGS): Ditto. * interception/Makefile.am (AM_CXXFLAGS): Ditto. * tsan/Makefile.am (AM_CXXFLAGS): Ditto. * libbacktrace/Makefile.am (AM_CXXFLAGS): Ditto. * ubsan/Makefile.am (AM_CXXFLAGS): Ditto. * sanitizer_common/Makefile.in: Regenerate. * asan/Makefile.in: Ditto. * lsan/Makefile.in: Ditto. * interception/Makefile.in: Ditto. * tsan/Makefile.in: Ditto. * libbacktrace/Makefile.in: Ditto. * ubsan/Makefile.in: Ditto. Ok, thanks. Really no reason for pedantic checking. Jakub
Re: The nvptx port
On Fri, Nov 14, 2014 at 07:37:49AM -0800, Cesar Philippidis wrote: Hmm. It's worthwhile to keep in mind that GPU threads really behave somewhat differently from CPUs (they don't really execute independently); the OMP model may just be a poor match for the architecture in general. One could busywait on a spinlock, but AFAIK there isn't really a way to put a thread to sleep. By not executing independently, I mean this: I believe if one thread in a warp is waiting on the spinlock, all the other ones are also busywaiting. There may be other effects that seem odd if one approaches it from a CPU perspective - for example you probably want only one thread in a warp to try to take the spinlock. Thread synchronization in CUDA is different from conventional CPUs. Using the gang/thread terminology, there's no way to synchronize two threads in two different gangs in PTX without invoking separate kernels. Basically, after a kernel is invoked, the host/accelerator (the later using dynamic parallelism) waits for the kernel to finish, and that effectively creates a barrier. I believe in OpenMP terminology a gang is a team, and inter-teams barriers are not supposed to work etc. (though, I think locks and atomic instructions still are, so is critical region, so I really hope atomics are atomic even inter-gang). So for synchronization (mutexes and semaphores, from which barriers are implemented; but perhaps could also use bar.arrive and bar.sync) we mainly need synchronization within the gang. Also, keep in mind that PTX doesn't have a global TID. The user needs to calculate it using ctaid/tid and friends. Ok. Is %gridid needed for that combo too? Jakub
Re: The nvptx port
On Fri, Nov 14, 2014 at 08:37:52AM -0800, Cesar Philippidis wrote: On 11/14/2014 08:18 AM, Jakub Jelinek wrote: Also, keep in mind that PTX doesn't have a global TID. The user needs to calculate it using ctaid/tid and friends. Ok. Is %gridid needed for that combo too? Eventually, probably. Currently, we're launching all of our kernels with cuLaunchKernel, and that function doesn't take grids into account. I wonder if cudaLaunchDevice called from PTX will result in a different %gridid or not, will see next week if I manage to get the HW and SW stack Jakub
Re: [PATCH, i686] Fix for asan test failures with -m32 happened after EBX enabling in PIC mode
On Fri, Nov 14, 2014 at 05:05:57PM +, Zamyatin, Igor wrote: --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr63845.c @@ -0,0 +1,20 @@ +/* PR sanitizer/63845 */ +/* { dg-do compile } */ +/* { dg-require-effective-target ia32 } */ +/* { dg-require-effective-target fpic } */ +/* { dg-skip-if No Windows PIC { *-*-mingw* *-*-cygwin } { * } { + } } */ +/* { dg-options -fPIC } */ + +int __attribute__ ((noinline, noclone)) foo (void *p) { + return *(int*)p; +} + +int main () +{ + char a = 0; + foo (a); + return 0; +} + Will this test fail on Linux without your fix? Doesn't testcase need - fsanitize=address to fail? Sure, you're right, will add it It is not that easy, -fsanitize=address is not supported everywhere. Better if you stick it into testsuite/gcc.dg/asan/ No point adding effective-target ia32/fpic, there is nothing i?86 specific, not even ix86/x86_64 specific, nor pic specific in the test, just use /* { dg-additional-options -fPIC { target fpic } } */ Jakub
Re: [PATCH] DW_AT_APPLE_* DWARF extensions.
On Fri, Nov 14, 2014 at 03:48:22PM +0100, Andrew Burgess wrote: * Jakub Jelinek ja...@redhat.com [2014-11-13 14:13:42 +0100]: On Thu, Nov 13, 2014 at 01:21:21PM +0100, Andrew Burgess wrote: I had a look around and couldn't find anything helpful. The best I can offer would be the current path within the llvm source code where these are defined. Would that be sufficient? That is not useful. The point is not to suggest where those constants come from but what they mean, see e.g. the http://www.dwarfstd.org/ShowIssue.php?issue=100909.2type=open http://gcc.gnu.org/wiki/TemplateParmsDwarf etc. links that describe what the extensions do. If LLVM doesn't have any documentation of their extensions, then just /* Apple extensions. */ is good enough. I agree, hence my comment about not finding anything useful :) I've look again, and still can't find anything suitable, so could we go with just /* Apple extensions. */ then? Patch Changelog below. Applied. Jakub
Re: [PATCH] Optimize ASAN_CHECK checks
On Fri, Nov 14, 2014 at 01:06:52PM +0100, Dodji Seketeli wrote: Jakub Jelinek ja...@redhat.com writes: I am not sure, but I am wondering if we shouldn't save the previous uid of 'stmt' here before setting it, and then restore it before getting out of this function. No, gimple uids are AFAIK undefined at the start of passes, passes that use them are supposed to initialize them before use (new statements created during the pass will get 0 there by default), and don't have to clean them up anyway at the end of pass. Yeah, this is what I figured by grepping other passes, but I wasn't sure :-) Here is what I've committed after another bootstrap/regtest. 2014-11-14 Jakub Jelinek ja...@redhat.com Marek Polacek pola...@redhat.com * sanopt.c: Include tree-ssa-operands.h. (struct sanopt_info): Add has_freeing_call_p, has_freeing_call_computed_p, imm_dom_path_with_freeing_call_p, imm_dom_path_with_freeing_call_computed_p, freeing_call_events, being_visited_p fields. (struct sanopt_ctx): Add asan_check_map field. (imm_dom_path_with_freeing_call, maybe_optimize_ubsan_null_ifn, maybe_optimize_asan_check_ifn): New functions. (sanopt_optimize_walker): Use them, optimize even ASAN_CHECK internal calls. (pass_sanopt::execute): Call sanopt_optimize even for -fsanitize=address. * gimple.c (nonfreeing_call_p): Return true for non-ECF_LEAF internal calls. --- gcc/sanopt.c.jj 2014-11-14 15:11:49.202397369 +0100 +++ gcc/sanopt.c2014-11-14 15:19:40.517912502 +0100 @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. #include langhooks.h #include ubsan.h #include params.h +#include tree-ssa-operands.h /* This is used to carry information about basic blocks. It is @@ -56,7 +57,30 @@ along with GCC; see the file COPYING3. struct sanopt_info { - /* True if this BB has been visited. */ + /* True if this BB might call (directly or indirectly) free/munmap + or similar operation. */ + bool has_freeing_call_p; + + /* True if HAS_FREEING_CALL_P flag has been computed. */ + bool has_freeing_call_computed_p; + + /* True if there is a block with HAS_FREEING_CALL_P flag set + on any path between an immediate dominator of BB, denoted + imm(BB), and BB. */ + bool imm_dom_path_with_freeing_call_p; + + /* True if IMM_DOM_PATH_WITH_FREEING_CALL_P has been computed. */ + bool imm_dom_path_with_freeing_call_computed_p; + + /* Number of possibly freeing calls encountered in this bb + (so far). */ + uint64_t freeing_call_events; + + /* True if BB is currently being visited during computation + of IMM_DOM_PATH_WITH_FREEING_CALL_P flag. */ + bool being_visited_p; + + /* True if this BB has been visited in the dominator walk. */ bool visited_p; }; @@ -69,131 +93,387 @@ struct sanopt_ctx a vector of UBSAN_NULL call statements that check this pointer. */ hash_maptree, auto_vecgimple null_check_map; + /* This map maps a pointer (the second argument of ASAN_CHECK) to + a vector of ASAN_CHECK call statements that check the access. */ + hash_maptree, auto_vecgimple asan_check_map; + /* Number of IFN_ASAN_CHECK statements. */ int asan_num_accesses; }; -/* Try to optimize away redundant UBSAN_NULL checks. - +/* Return true if there might be any call to free/munmap operation + on any path in between DOM (which should be imm(BB)) and BB. */ + +static bool +imm_dom_path_with_freeing_call (basic_block bb, basic_block dom) +{ + sanopt_info *info = (sanopt_info *) bb-aux; + edge e; + edge_iterator ei; + + if (info-imm_dom_path_with_freeing_call_computed_p) +return info-imm_dom_path_with_freeing_call_p; + + info-being_visited_p = true; + + FOR_EACH_EDGE (e, ei, bb-preds) +{ + sanopt_info *pred_info = (sanopt_info *) e-src-aux; + + if (e-src == dom) + continue; + + if ((pred_info-imm_dom_path_with_freeing_call_computed_p + pred_info-imm_dom_path_with_freeing_call_p) + || (pred_info-has_freeing_call_computed_p + pred_info-has_freeing_call_p)) + { + info-imm_dom_path_with_freeing_call_computed_p = true; + info-imm_dom_path_with_freeing_call_p = true; + info-being_visited_p = false; + return true; + } +} + + FOR_EACH_EDGE (e, ei, bb-preds) +{ + sanopt_info *pred_info = (sanopt_info *) e-src-aux; + + if (e-src == dom) + continue; + + if (pred_info-has_freeing_call_computed_p) + continue; + + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (e-src); !gsi_end_p (gsi); gsi_next (gsi)) + { + gimple stmt = gsi_stmt (gsi); + + if (is_gimple_call (stmt) !nonfreeing_call_p (stmt)) + { + pred_info-has_freeing_call_p = true; + break; + } + } + + pred_info-has_freeing_call_computed_p
[PATCH] Document --with-diagnostics-color=
On Fri, Nov 14, 2014 at 12:01:43PM +0100, Richard Biener wrote: doc/invoke.texi also talks about it, wonder if it shouldn't be mentioned elsewhere, install.texi? Sure - the configure option should be mentioned in install.texi as well. Here it is. Ok for trunk? 2014-11-14 Jakub Jelinek ja...@redhat.com * doc/install.texi (--with-diagnostics-color=): Document. --- gcc/doc/install.texi.jj 2014-11-13 15:13:22.0 +0100 +++ gcc/doc/install.texi2014-11-14 16:29:47.941342318 +0100 @@ -1795,6 +1795,15 @@ static data members and inline function default for a toolchain with an assembler that accepts it and GLIBC 2.11 or above, otherwise disabled. +@item --with-diagnostics-color=@var{choice} +Tells GCC to use @var{choice} as the default for @option{-fdiagnostics-color=} +option (if not used explicitly on the command line). @var{choice} +can be one of @samp{never}, @samp{auto}, @samp{always}, and @samp{auto-if-env} +where @samp{auto} is the default. @samp{auto-if-env} means that +@option{-fdiagnostics-color=auto} will be the default if @code{GCC_COLORS} +is present and non-empty in the environment, and +@option{-fdiagnostics-color=never} otherwise. + @item --enable-lto @itemx --disable-lto Enable support for link-time optimization (LTO). This is enabled by Jakub
[PATCH] DCE dead IFN_GOMP_SIMD_LANE
Hi! When looking at PR59984, I've noticed that in certain cases we leave around GOMP_SIMD_LANE calls without lhs. This internal call is intentionally not ECF_CONST, we don't want it moved before the loop, and the argument is magic. But, if nothing uses it anymore, it isn't needed. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-14 Jakub Jelinek ja...@redhat.com * tree-ssa.dce.c (eliminate_unnecessary_stmts): Eliminate IFN_GOMP_SIMD_LANE without lhs as useless. --- gcc/tree-ssa-dce.c.jj 2014-11-12 13:28:47.0 +0100 +++ gcc/tree-ssa-dce.c 2014-11-14 16:15:10.356124955 +0100 @@ -1402,6 +1402,11 @@ eliminate_unnecessary_stmts (void) maybe_clean_or_replace_eh_stmt (stmt, stmt); update_stmt (stmt); release_ssa_name (name); + + /* GOMP_SIMD_LANE without lhs is not needed. */ + if (gimple_call_internal_p (stmt) + gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE) + remove_dead_stmt (gsi, bb); } else if (gimple_call_internal_p (stmt)) switch (gimple_call_internal_fn (stmt)) Jakub
Re: [PATCH] Propagate nonfreeing_call_p using ipa-pure-const (take 2)
On Thu, Nov 13, 2014 at 12:52:21PM +0100, Jakub Jelinek wrote: On Thu, Nov 13, 2014 at 09:39:42AM +0100, Jakub Jelinek wrote: What about the: I wonder if the nonfreeing_call_p function shouldn't be moved elsewhere though (suggestion where), so that gimple.c doesn't need the cgraph includes. question though (maybe it is more on Richard)? Tried richi's suggested cgraph.[ch], but that meant all users of nonfreeing_call_p had to start including all of ipa-ref.h, lto-streamer.h and cgraph.h, so it is probably better to keep it in gimple.[ch]. Here is a new version, which also handles IFN_ABNORMAL_DISPATCHER as nonfreeing, it doesn't really call anything, just connect abnormal edges. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-14 Jakub Jelinek ja...@redhat.com * ipa-pure-const.c (struct funct_state_d): Add can_free field. (varying_state): Add true for can_free. (check_call): For builtin or internal !nonfreeing_call_p set local-can_free. (check_stmt): For asm volatile and asm with memory set local-can_free. (analyze_function): Clear local-can_free initially, continue calling check_stmt until all flags are computed, dump can_free flag. (pure_const_write_summary): Write can_free flag. (pure_const_read_summary): Read it back. (propagate_pure_const): Propagate also can_free flag, set w-nonfreeing_fn if it is false after propagation. * cgraph.h (cgraph_node): Add nonfreeing_fn member. * gimple.c: Include ipa-ref.h, lto-streamer.h and cgraph.h. (nonfreeing_call_p): Return cgraph nonfreeing_fn flag if set. Also return true for IFN_ABNORMAL_DISPATCHER. * cgraph.c (cgraph_node::dump): Dump nonfreeing_fn flag. * lto-cgraph.c (lto_output_node): Write nonfreeing_fn flag. (input_overwrite_node): Read it back. --- gcc/cgraph.c.jj 2014-11-14 15:11:46.691442654 +0100 +++ gcc/cgraph.c2014-11-14 15:20:25.642079723 +0100 @@ -1995,6 +1995,8 @@ cgraph_node::dump (FILE *f) fprintf (f, tm_clone); if (icf_merged) fprintf (f, icf_merged); + if (nonfreeing_fn) +fprintf (f, nonfreeing_fn); if (DECL_STATIC_CONSTRUCTOR (decl)) fprintf (f, static_constructor (priority:%i), get_init_priority ()); if (DECL_STATIC_DESTRUCTOR (decl)) --- gcc/ipa-pure-const.c.jj 2014-11-14 15:11:46.640443574 +0100 +++ gcc/ipa-pure-const.c2014-11-14 15:20:25.643079828 +0100 @@ -112,11 +112,15 @@ struct funct_state_d bool looping; bool can_throw; + + /* If function can call free, munmap or otherwise make previously + non-trapping memory accesses trapping. */ + bool can_free; }; /* State used when we know nothing about function. */ static struct funct_state_d varying_state - = { IPA_NEITHER, IPA_NEITHER, true, true, true }; + = { IPA_NEITHER, IPA_NEITHER, true, true, true, true }; typedef struct funct_state_d * funct_state; @@ -559,6 +563,10 @@ check_call (funct_state local, gimple ca enum pure_const_state_e call_state; bool call_looping; + if (gimple_call_builtin_p (call, BUILT_IN_NORMAL) + !nonfreeing_call_p (call)) + local-can_free = true; + if (special_builtin_state (call_state, call_looping, callee_t)) { worse_state (local-pure_const_state, local-looping, @@ -589,6 +597,8 @@ check_call (funct_state local, gimple ca break; } } + else if (gimple_call_internal_p (call) !nonfreeing_call_p (call)) +local-can_free = true; /* When not in IPA mode, we can still handle self recursion. */ if (!ipa callee_t @@ -753,6 +763,7 @@ check_stmt (gimple_stmt_iterator *gsip, fprintf (dump_file, memory asm clobber is not const/pure\n); /* Abandon all hope, ye who enter here. */ local-pure_const_state = IPA_NEITHER; + local-can_free = true; } if (gimple_asm_volatile_p (stmt)) { @@ -760,7 +771,8 @@ check_stmt (gimple_stmt_iterator *gsip, fprintf (dump_file, volatile is not const/pure\n); /* Abandon all hope, ye who enter here. */ local-pure_const_state = IPA_NEITHER; - local-looping = true; + local-looping = true; + local-can_free = true; } return; default: @@ -785,6 +797,7 @@ analyze_function (struct cgraph_node *fn l-looping_previously_known = true; l-looping = false; l-can_throw = false; + l-can_free = false; state_from_flags (l-state_previously_known, l-looping_previously_known, flags_from_decl_or_type (fn-decl), fn-cannot_return_p ()); @@ -815,7 +828,10 @@ analyze_function (struct cgraph_node *fn gsi_next (gsi)) { check_stmt (gsi, l, ipa); - if (l-pure_const_state == IPA_NEITHER l-looping l-can_throw) + if (l-pure_const_state
Re: r217562 - in /trunk/libsanitizer: ChangeLog asa...
On Fri, Nov 14, 2014 at 11:29:07AM -0800, Konstantin Serebryany wrote: +gcc-patches On Fri, Nov 14, 2014 at 11:26 AM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: I am opposed to this change. Upstream code builds with -std=c++11. Building this code here with another set of options is a time bomb. But also builds with a different compiler. -std=gnu++11 is not something very different from C++11. libstdc++-v3 (the C++11 part in it) are also built with -std=gnu++11, not -std=c++11, similarly in other libraries shipped with GCC. Valid C++11 will be still valid in -std=gnu++11 mode... Jakub
Re: [PATCH] gcc/testsuite: guality.exp: Fix `test_counts' restoration
On Fri, Nov 14, 2014 at 09:01:25PM +, Maciej W. Rozycki wrote: 2014-11-14 Maciej W. Rozycki ma...@codesourcery.com gcc/testsuite/ * g++.dg/guality/guality.exp (check_guality): Fix `test_counts' restoration. Ok, thanks. --- gcc-fsf-trunk-quilt.orig/gcc/testsuite/g++.dg/guality/guality.exp 2014-11-14 18:33:47.0 + +++ gcc-fsf-trunk-quilt/gcc/testsuite/g++.dg/guality/guality.exp 2014-11-14 20:18:35.038856372 + @@ -28,7 +28,7 @@ proc check_guality {args} { set ret [string match *1 PASS, 0 FAIL, 0 UNRESOLVED* $execout] } remote_file build delete $output -array get test_counts [array get saved_test_counts] +array set test_counts [array get saved_test_counts] return $ret } Jakub
Re: Fortran/C interfacing
On Fri, Nov 14, 2014 at 09:47:17PM +0100, Tobias Burnus wrote: Somebody recently suggested (for OpenMP) that we just should use bind(C) in the Fortran module, it is too late for OpenMP, as we have to keep the *_ entrypoints for compatibility anyway, but for OpenACC and new OpenMP functions supposedly you could avoid exporting all the *_ wrappers and use * directly. Tobis, as our local expert :-) -- how does that resonate with the discussion (and implementation) about Fortran/C interfacing Unfortunately, it a wrapper either on C or on Fortran side is unavoidable. Ok, just wanted to rise it, if it isn't possible, fine as is. Jakub
Re: [PATCH] Fix Cilk+ ICEs with overflow builtins (PR middle-end/63884)
On Sat, Nov 15, 2014 at 01:03:46PM +0100, Marek Polacek wrote: The problem here is that the Cilk+ code wasn't prepared to handle internal calls that the new overflow builtins entail. Fixed by checking that the CALL_EXPR_FN isn't NULL. Looking at cilk-plus.exp, I think this file will need some tweaks now that the C default is gnu11... Bootstrapped/regtested on powerpc64-linux, ok for trunk? 2014-11-15 Marek Polacek pola...@redhat.com PR middle-end/63884 c-family/ * array-notation-common.c (is_sec_implicit_index_fn): Return false for NULL fndecl. (extract_array_notation_exprs): Return for NULL node. testsuite/ * c-c++-common/cilk-plus/AN/pr63884.c: New test. Ok, thanks. Jakub
Re: [PATCH] Fix minimal alignment calculation for user-aligned types (PR63802)
On Fri, Nov 14, 2014 at 06:15:16PM +, Joseph Myers wrote: On Fri, 14 Nov 2014, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 09:46:14AM +0300, Yury Gribov wrote: Hi all, This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63802 by only limiting minimal type alignment with BIGGEST_ALIGNMENT for types with no __attribute__((aligned)). Bootstrapped and regtested on x64. Ok for trunk? The function is primarily used by the C FE for _Alignas, and I have no idea if such a change is desirable for that very much user visible case. Joseph? If it is true that a type satisfying TYPE_USER_ALIGN will never be allocated at an address less-aligned than its TYPE_ALIGN, even if that's greater than BIGGEST_ALIGNMENT, then the change seems correct for C11 _Alignof. I think it depends on which target and where. In structs (unless packed) the user aligned fields should be properly aligned with respect to start of struct and the struct should have user alignment in that case, automatic vars these days use alloca with realignment if not handled better by the target, so should be fine too. For data section vars and for common vars I think it really depends on the target. Perhaps for TYPE_USER_ALIGN use minimum of the TYPE_ALIGN and MAX_OFILE_ALIGNMENT ? For heap objects, it really depends on how it has been allocated, but if allocated through malloc, the extra alignment is never guaranteed. So, it really depends in malloc counts or not. Jakub
Re: [PATCH] gcc/ubsan.c: Extend 'pretty_name' space to avoid memory overflow
On Mon, Nov 17, 2014 at 08:16:32AM +0100, Marek Polacek wrote: On Mon, Nov 17, 2014 at 06:40:26AM +0800, Chen Gang wrote: According to the next code, 'pretty_name' may need additional bytes more than 16. For simplify thinking and being extensible in future, extent it to 256 bytes, directly. I think + 128 bytes should be enough for everyone. I disagree. Consider: typedef char A[1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1]; A a; int foo (int j) { a[j][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 1; } You need 1159 bytes in that case. Easily to construct testcase that needs arbitrary amount. I think easiest would be to rewrite the code so that it uses pretty_printer to construct the string, grep asan.c for asan_pp . Or obstacks, but you don't have a printer to print integers into it easily. if (dom TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST) pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC, tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); else /* ??? We can't determine the variable name; print VLA unspec. */ pretty_name[pos++] = '*'; looks wrong anyway, as not all integers fit into uhwi. Guess you could use wide_int to add 1 there and pp_wide_int. Jakub
Re: [PATCH] Fix minimal alignment calculation for user-aligned types (PR63802)
On Mon, Nov 17, 2014 at 05:46:55PM +, Joseph Myers wrote: On Mon, 17 Nov 2014, Jakub Jelinek wrote: If it is true that a type satisfying TYPE_USER_ALIGN will never be allocated at an address less-aligned than its TYPE_ALIGN, even if that's greater than BIGGEST_ALIGNMENT, then the change seems correct for C11 _Alignof. I think it depends on which target and where. In structs (unless packed) the user aligned fields should be properly aligned with respect to start of struct and the struct should have user alignment in that case, automatic vars these days use alloca with realignment if not handled better by the target, so should be fine too. For data section vars and for common vars I think it really depends on the target. Perhaps for TYPE_USER_ALIGN use minimum of the TYPE_ALIGN and MAX_OFILE_ALIGNMENT ? For heap objects, it really depends on how it has been allocated, but if allocated through malloc, the extra alignment is never guaranteed. So, it really depends in malloc counts or not. The question, for both _Alignas and ubsan, is the alignment guaranteed *in valid programs*. malloc only provides sufficient alignment for types with fundamental alignment requirements (although there are various problems with the C11 wording; see DR#445). So if you use malloc to allocate a type with an extended alignment requirement (without doing extra realignment on the result of malloc), that's not a valid program. And if an alignment is larger than MAX_OFILE_ALIGNMENT, you get an error for declaring non-stack variables requiring that alignment. So I don't think there's any need to check MAX_OFILE_ALIGNMENT here. If so, then Yuri's original patch (the one changing min_align_of_type) should be fine, right? Jakub
Re: [PATCH] -fsanitize-recover=list
On Mon, Nov 17, 2014 at 06:40:00PM -0800, Alexey Samsonov wrote: I've just prepared a patch implementing -fsanitize-recover=list in Clang (http://reviews.llvm.org/D6302), writing here to make sure we're on the same page w.r.t. flag semantics: * -fsanitize-recover: Enable recovery for all checks enabled by -fsanitize= which support it. * -fno-sanitize-recover: Disable recovery for all checks. That is not what I think we've agreed on and what is implemented in GCC. -fsanitize-recover only enables recovery of the undefined plus undefined-like sanitizers, in particular it doesn't enable recover from kernel-address, because -fsanitize-recover should be a deprecated option and kernel-address never used it before. So, in GCC -fsanitize-recover stands for -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow and -fno-sanitize-recover stands for -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow * -fsanitize-recover=list: Enable recovery for all selected checks or group of checks. It is forbidden to list unrecoverable sanitizers here (e.g., -fsanitize-recover=address will produce an error). We only error on -fsanitize-recover=address -fsanitize-recover=thread -fsanitize-recover=leak but not say on -fsanitize-recover=unreachable which is part of undefined; unreachable isn't recoverable silently. Likewise -fsanitize-recover=return. Otherwise one couldn't use -fsanitize-recover=undefined which is useful. * -fno-sanitize-recover=list: Disable recovery for selected checks or group of checks. Jakub
Re: [PATCH] -fsanitize-recover=list
On Mon, Nov 17, 2014 at 11:39:47PM -0800, Alexey Samsonov wrote: That is not what I think we've agreed on and what is implemented in GCC. -fsanitize-recover only enables recovery of the undefined plus undefined-like sanitizers, in particular it doesn't enable recover from kernel-address, because -fsanitize-recover should be a deprecated option and kernel-address never used it before. Hm, indeed, I messed it up. In the older thread we agree that plain -f(no-)sanitize-recover should be a deprecated synonym for the current meaning of these flags, which only specify recoverability for UBSan-related checks :-/ Has GCC already shipped with existing implementation? I'm just curious if it's convenient to have flags that would enable/disable recovery for all sanitizers (analogous to -Werror flags which exist in a standalone form, but may accept specific warnings, so that one can write $(CC) foo.cc -Wno-error -Werror=sign-compare Well, no sanitizer is on by default, so you can just use the same list for -fno-sanitize-recover= or -fsanitize-recover= as for -fsanitize= if you want. So, in GCC -fsanitize-recover stands for -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow and -fno-sanitize-recover stands for -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow * -fsanitize-recover=list: Enable recovery for all selected checks or group of checks. It is forbidden to list unrecoverable sanitizers here (e.g., -fsanitize-recover=address will produce an error). We only error on -fsanitize-recover=address -fsanitize-recover=thread -fsanitize-recover=leak Right, it's a good idea to error on sanitizer kinds we don't have recovery for. I will add the errors for TSan/MSan/LSan etc. but not say on -fsanitize-recover=unreachable which is part of undefined; unreachable isn't recoverable silently. Likewise -fsanitize-recover=return. Otherwise one couldn't use -fsanitize-recover=undefined which is useful. Can't this be fixed by checking the actual values of -fsanitize-recover= flag before expanding the sanitizer groups (i.e. turning undefined into null,unreachable,return,)? We should probably be able to distinguish between -fsanitize-recover=undefined, and -fsanitize-recover=unreachable,another checks from undefined. In the first case we can enable recovery for all parts of undefined that support it. In the second, we can produce an error as user explicitly tried to enable recovery for sanitizer that doesn't support it. But in that case you would need to diagnose it already when parsing the option, or add code that would remember what bits have been set from other option sets. In the former case, you'd diagnose even -fsanitize-recover=unreachable -fno-sanitize-recover=undefined even when in that case unreachable in the end is not recoverable. We don't error out for -fsanitize-recover=address -fno-sanitize-recover=address because in the end address is not recovered. Jakub
Re: [PATCH] Fix PR63868 - Guard offloading with ifdef ENABLE_OFFLOADING
On Tue, Nov 18, 2014 at 06:31:59PM +0300, Ilya Verbin wrote: @@ -8287,7 +8289,9 @@ expand_omp_target (struct omp_region *region) if (kind == GF_OMP_TARGET_KIND_REGION) { unsigned srcidx, dstidx, num; +#ifdef ENABLE_OFFLOADING struct cgraph_node *node; +#endif Please instead move the struct cgraph_node *node; declaration right above where it is used for the first time. There is no goto involved there, and it isn't in a switch either, so you probably also can do just: struct cgraph_node *node = cgraph_node::get (child_fn); instead. Ok with that change. @@ -8414,18 +8418,22 @@ expand_omp_target (struct omp_region *region) DECL_STRUCT_FUNCTION (child_fn)-curr_properties = cfun-curr_properties; cgraph_node::add_new_function (child_fn, true); +#ifdef ENABLE_OFFLOADING /* Add the new function to the offload table. */ vec_safe_push (offload_funcs, child_fn); +#endif /* Fix the callgraph edges for child_cfun. Those for cfun will be fixed in a following pass. */ push_cfun (child_cfun); cgraph_edge::rebuild_edges (); +#ifdef ENABLE_OFFLOADING /* Prevent IPA from removing child_fn as unreachable, since there are no refs from the parent function to child_fn in offload LTO mode. */ node = cgraph_node::get (child_fn); node-mark_force_output (); +#endif /* Some EH regions might become dead, see PR34608. If pass_cleanup_cfg isn't the first pass to happen with the Jakub
[PATCH] Fix -fsanitize=bool -fnon-call-exceptions (PR sanitizer/63913)
Hi! This patch fixes instrumentation of bool/enum loads if they could throw. We want to keep the EH on the load, and push further statements on the fallthru edge. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR sanitizer/63913 * ubsan.c: Include tree-eh.h. (instrument_bool_enum_load): Handle loads that can throw. * g++.dg/ubsan/pr63913.C: New test. --- gcc/ubsan.c.jj 2014-11-14 15:11:49.0 +0100 +++ gcc/ubsan.c 2014-11-18 09:29:20.409209556 +0100 @@ -67,6 +67,7 @@ along with GCC; see the file COPYING3. #include dfp.h #include builtins.h #include tree-object-size.h +#include tree-eh.h /* Map from a tree to a VAR_DECL tree. */ @@ -1135,7 +1136,9 @@ instrument_bool_enum_load (gimple_stmt_i || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) return; + bool can_throw = stmt_could_throw_p (stmt); location_t loc = gimple_location (stmt); + tree lhs = gimple_assign_lhs (stmt); tree ptype = build_pointer_type (TREE_TYPE (rhs)); tree atype = reference_alias_ptr_type (rhs); gimple g = gimple_build_assign (make_ssa_name (ptype, NULL), @@ -1145,9 +1148,24 @@ instrument_bool_enum_load (gimple_stmt_i tree mem = build2 (MEM_REF, utype, gimple_assign_lhs (g), build_int_cst (atype, 0)); tree urhs = make_ssa_name (utype, NULL); - g = gimple_build_assign (urhs, mem); - gimple_set_location (g, loc); - gsi_insert_before (gsi, g, GSI_SAME_STMT); + if (can_throw) +{ + gimple_assign_set_lhs (stmt, urhs); + g = gimple_build_assign_with_ops (NOP_EXPR, lhs, urhs, NULL_TREE); + gimple_set_location (g, loc); + edge e = find_fallthru_edge (gimple_bb (stmt)-succs); + gsi_insert_on_edge_immediate (e, g); + gimple_assign_set_rhs_with_ops (gsi, MEM_REF, mem, NULL_TREE); + update_stmt (stmt); + *gsi = gsi_for_stmt (g); + g = stmt; +} + else +{ + g = gimple_build_assign (urhs, mem); + gimple_set_location (g, loc); + gsi_insert_before (gsi, g, GSI_SAME_STMT); +} minv = fold_convert (utype, minv); maxv = fold_convert (utype, maxv); if (!integer_zerop (minv)) @@ -1169,8 +1187,11 @@ instrument_bool_enum_load (gimple_stmt_i gimple_set_location (g, loc); gsi_insert_after (gsi, g, GSI_NEW_STMT); - gimple_assign_set_rhs_with_ops (gsi2, NOP_EXPR, urhs, NULL_TREE); - update_stmt (stmt); + if (!can_throw) +{ + gimple_assign_set_rhs_with_ops (gsi2, NOP_EXPR, urhs, NULL_TREE); + update_stmt (stmt); +} gsi2 = gsi_after_labels (then_bb); if (flag_sanitize_undefined_trap_on_error) --- gcc/testsuite/g++.dg/ubsan/pr63913.C.jj 2014-11-18 09:34:10.603981487 +0100 +++ gcc/testsuite/g++.dg/ubsan/pr63913.C2014-11-18 09:33:47.0 +0100 @@ -0,0 +1,12 @@ +// PR sanitizer/63913 +// { dg-do compile } +// { dg-options -fsanitize=bool -fnon-call-exceptions } + +struct B { B (); ~B (); }; + +double +foo (bool *x) +{ + B b; + return *x; +} Jakub
[PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
Hi! OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): For integer modes wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones and for other values return NULL_RTX. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-11 00:06:19.0 +0100 +++ gcc/simplify-rtx.c 2014-11-18 10:17:01.198668555 +0100 @@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + { + /* Handle just all zeros and all ones CONST_VECTORs in + this case. */ + if ((vp[0] value_mask) == 0) + elems[elem] = const0_rtx; + else if ((vp[0] value_mask) == value_mask) + elems[elem] = constm1_rtx; + else + return NULL_RTX; + for (i = value_bit; i elem_bitsize; i += value_bit) + if ((vp[i / value_bit] value_mask) != (vp[0] value_mask)) + return NULL_RTX; + break; + } for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); elems[elem] = immed_wide_int_const (r, outer_submode); --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-18 10:12:24.282659318 +0100 +++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 +0100 @@ -0,0 +1,12 @@ +/* PR target/63910 */ +/* { dg-do compile } */ +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */ + +extern void bar (float *c); + +void +foo (void) +{ + float c[1024] = { }; + bar (c); +} Jakub
[C++ PATCH] Fix -fsanitize={null,alignment} reference instrumentation (PR sanitizer/63813)
Hi! As the testcase shows, if there is a reinterpret_cast, ubsan reference instrumentation can ICE, because for NOP_EXPR to REFERENCE_TYPE we pass argument of that NOP_EXPR, and if it is not a POINTER_TYPE or POINTER_TYPE to a different type, we can get the type to use in the diagnostics wrong or ICE. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR sanitizer/63813 * c-ubsan.c (ubsan_maybe_instrument_reference_or_call): Change type argument to ptype, set type to TREE_TYPE (ptype). Don't call get_pointer_alignment for non-pointers. Use ptype, or if it is reference type, corresponding pointer type, as type of kind argument. (ubsan_maybe_instrument_reference, ubsan_maybe_instrument_member_call): Adjust callers. * g++.dg/ubsan/pr63813.C: New test. --- gcc/c-family/c-ubsan.c.jj 2014-10-29 09:49:47.0 +0100 +++ gcc/c-family/c-ubsan.c 2014-11-18 12:31:42.700607456 +0100 @@ -383,18 +383,19 @@ ubsan_maybe_instrument_array_ref (tree * } static tree -ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree type, +ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype, enum ubsan_null_ckind ckind) { - tree orig_op = op; - bool instrument = false; - unsigned int mina = 0; - if (current_function_decl == NULL_TREE || lookup_attribute (no_sanitize_undefined, DECL_ATTRIBUTES (current_function_decl))) return NULL_TREE; + tree type = TREE_TYPE (ptype); + tree orig_op = op; + bool instrument = false; + unsigned int mina = 0; + if (flag_sanitize SANITIZE_ALIGNMENT) { mina = min_align_of_type (type); @@ -431,13 +432,20 @@ ubsan_maybe_instrument_reference_or_call } else if (flag_sanitize SANITIZE_NULL) instrument = true; - if (mina mina get_pointer_alignment (op) / BITS_PER_UNIT) - instrument = true; + if (mina mina 1) + { + if (!POINTER_TYPE_P (TREE_TYPE (op)) + || mina get_pointer_alignment (op) / BITS_PER_UNIT) + instrument = true; + } } if (!instrument) return NULL_TREE; op = save_expr (orig_op); - tree kind = build_int_cst (TREE_TYPE (op), ckind); + gcc_assert (POINTER_TYPE_P (ptype)); + if (TREE_CODE (ptype) == REFERENCE_TYPE) +ptype = build_pointer_type (TREE_TYPE (ptype)); + tree kind = build_int_cst (ptype, ckind); tree align = build_int_cst (pointer_sized_int_node, mina); tree call = build_call_expr_internal_loc (loc, IFN_UBSAN_NULL, void_type_node, @@ -453,7 +461,7 @@ ubsan_maybe_instrument_reference (tree s { tree op = TREE_OPERAND (stmt, 0); op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op, -TREE_TYPE (TREE_TYPE (stmt)), +TREE_TYPE (stmt), UBSAN_REF_BINDING); if (op) TREE_OPERAND (stmt, 0) = op; @@ -471,7 +479,7 @@ ubsan_maybe_instrument_member_call (tree || !POINTER_TYPE_P (TREE_TYPE (op))) return; op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op, -TREE_TYPE (TREE_TYPE (op)), +TREE_TYPE (op), is_ctor ? UBSAN_CTOR_CALL : UBSAN_MEMBER_CALL); if (op) --- gcc/testsuite/g++.dg/ubsan/pr63813.C.jj 2014-11-18 12:37:57.790991586 +0100 +++ gcc/testsuite/g++.dg/ubsan/pr63813.C2014-11-18 12:50:58.723345975 +0100 @@ -0,0 +1,12 @@ +// PR sanitizer/63813 +// { dg-do compile } +// { dg-options -fsanitize=undefined -O1 } + +struct A {}; +struct B { long foo () const; A bar () const; }; + +A +B::bar () const +{ + return *reinterpret_cast A * (foo ()); +} Jakub
[PATCH] Fix up r217118 - simplify_binary_operation_1 (PR rtl-optimization/63843)
Hi! The case ASHIFTRT: which is meant for this optimization is preceeded by /* FALLTHRU */ from ROTATE cases, which can't be optimized this way. Thus, this patch limits this optimization to ASHIFTRT. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR rtl-optimization/63843 * simplify-rtx.c (simplify_binary_operation_1) case ASHIFTRT: For optimization of ashiftrt of subreg of lshiftrt, check that code is ASHIFTRT. * gcc.c-torture/execute/pr63843.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-18 10:17:01.0 +0100 +++ gcc/simplify-rtx.c 2014-11-18 13:44:26.727792683 +0100 @@ -3118,10 +3118,11 @@ simplify_binary_operation_1 (enum rtx_co (subreg:M1 (ashiftrt:M2 (reg:M2) (const_int c1 + c2)) low_part). */ - if (!VECTOR_MODE_P (mode) + if (code == ASHIFTRT + !VECTOR_MODE_P (mode) SUBREG_P (op0) CONST_INT_P (op1) - (GET_CODE (SUBREG_REG (op0)) == LSHIFTRT) + GET_CODE (SUBREG_REG (op0)) == LSHIFTRT !VECTOR_MODE_P (GET_MODE (SUBREG_REG (op0))) CONST_INT_P (XEXP (SUBREG_REG (op0), 1)) (GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (op0))) --- gcc/testsuite/gcc.c-torture/execute/pr63843.c.jj2014-11-18 13:43:45.417544096 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr63843.c 2014-11-18 13:43:32.0 +0100 @@ -0,0 +1,31 @@ +/* PR rtl-optimization/63843 */ + +static inline __attribute__ ((always_inline)) +unsigned short foo (unsigned short v) +{ + return (v 8) | (v 8); +} + +unsigned short __attribute__ ((noinline, noclone, hot)) +bar (unsigned char *x) +{ + unsigned int a; + unsigned short b; + __builtin_memcpy (a, x[0], sizeof (a)); + a ^= 0x80808080U; + __builtin_memcpy (x[0], a, sizeof (a)); + __builtin_memcpy (b, x[2], sizeof (b)); + return foo (b); +} + +int +main () +{ + unsigned char x[8] = { 0x01, 0x01, 0x01, 0x01 }; + if (__CHAR_BIT__ == 8 + sizeof (short) == 2 + sizeof (int) == 4 + bar (x) != 0x8181U) +__builtin_abort (); + return 0; +} Jakub
[PATCH] Fix ubsan -fsanitize=signed-integer-overflow expansion (PR sanitizer/63520)
Hi! Apparently, expand_expr with EXPR_WRITE can return a SUBREG with SUBREG_PROMOTED_VAR_P set on it. For UBSAN_CHECK_{ADD,SUB,MUL} expansion, I've been doing just emit_move_insn into it, which apparently is wrong in that case, store_expr instead uses convert_move for it. The {ADD,SUB,MUL}_OVERFLOW (i.e. __builtin_*_overflow) expansion shouldn't need it, as the result is complex and complex integers aren't promoted that way. As store_expr* uses a tree expression to store, while I have rtx, I just wrote a short helper function for this. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR sanitizer/63520 * internal-fn.c (expand_ubsan_result_store): New function. (expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow): Use it instead of just emit_move_insn. * c-c++-common/ubsan/pr63520.c: New test. --- gcc/internal-fn.c.jj2014-11-12 13:28:47.0 +0100 +++ gcc/internal-fn.c 2014-11-18 15:35:46.395916823 +0100 @@ -395,6 +395,21 @@ expand_arith_overflow_result_store (tree write_complex_part (target, lres, false); } +/* Helper for expand_*_overflow. Store RES into TARGET. */ + +static void +expand_ubsan_result_store (rtx target, rtx res) +{ + if (GET_CODE (target) == SUBREG SUBREG_PROMOTED_VAR_P (target)) +/* If this is a scalar in a register that is stored in a wider mode + than the declared mode, compute the result into its declared mode + and then convert to the wider mode. Our value is the computed + expression. */ +convert_move (SUBREG_REG (target), res, SUBREG_PROMOTED_SIGN (target)); + else +emit_move_insn (target, res); +} + /* Add sub/add overflow checking to the statement STMT. CODE says whether the operation is +, or -. */ @@ -809,7 +824,7 @@ expand_addsub_overflow (location_t loc, if (lhs) { if (is_ubsan) - emit_move_insn (target, res); + expand_ubsan_result_store (target, res); else { if (do_xor) @@ -904,7 +919,7 @@ expand_neg_overflow (location_t loc, tre if (lhs) { if (is_ubsan) - emit_move_insn (target, res); + expand_ubsan_result_store (target, res); else expand_arith_overflow_result_store (lhs, target, mode, res); } @@ -1590,7 +1605,7 @@ expand_mul_overflow (location_t loc, tre if (lhs) { if (is_ubsan) - emit_move_insn (target, res); + expand_ubsan_result_store (target, res); else expand_arith_overflow_result_store (lhs, target, mode, res); } --- gcc/testsuite/c-c++-common/ubsan/pr63520.c.jj 2014-11-18 15:40:07.271273710 +0100 +++ gcc/testsuite/c-c++-common/ubsan/pr63520.c 2014-11-18 15:40:40.971673904 +0100 @@ -0,0 +1,16 @@ +/* PR sanitizer/63520 */ +/* { dg-do compile } */ +/* { dg-options -fsanitize=undefined } */ + +int a; + +void +foo (void) +{ + while (1) +{ + if (a == 1) + break; + a -= 1; +} +} Jakub
[PATCH] Fix simd clone vectorization with EH (PR tree-optimization/63915)
Hi! Simd clone vectorization uses vect_finish_stmt_generation which handles adding the new calls properly to EH tables, but afterwards we replace the original call with a normal assignment, and if the original call can throw, we need to remove it from the EH tables. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR tree-optimization/63915 * tree-vect-stmts.c (vectorizable_simd_clone_call): Pass true instead of false as last argument to gsi_replace. * c-c++-common/gomp/pr60823-4.c: New test. --- gcc/tree-vect-stmts.c.jj2014-11-14 00:10:39.0 +0100 +++ gcc/tree-vect-stmts.c 2014-11-18 17:02:15.635257023 +0100 @@ -3195,7 +3195,7 @@ vectorizable_simd_clone_call (gimple stm set_vinfo_for_stmt (new_stmt, stmt_info); set_vinfo_for_stmt (stmt, NULL); STMT_VINFO_STMT (stmt_info) = new_stmt; - gsi_replace (gsi, new_stmt, false); + gsi_replace (gsi, new_stmt, true); unlink_stmt_vdef (stmt); return true; --- gcc/testsuite/c-c++-common/gomp/pr60823-4.c.jj 2014-11-18 17:06:38.859319961 +0100 +++ gcc/testsuite/c-c++-common/gomp/pr60823-4.c 2014-11-18 17:06:34.602383632 +0100 @@ -0,0 +1,7 @@ +/* PR tree-optimization/63915 */ +/* { dg-do run } */ +/* { dg-require-effective-target vect_simd_clones } */ +/* { dg-options -O2 -fopenmp-simd } */ +/* { dg-additional-options -fpic { target fpic } } */ + +#include pr60823-2.c Jakub
[committed] Add testcase for PR tree-optimization/61042
Hi! This PR got fixed by a fix for another PR, but I've committed the testcase to the testsuite for better coverage. 2014-11-18 Jakub Jelinek ja...@redhat.com PR tree-optimization/61042 * gcc.c-torture/compile/pr61042.c: New test. --- gcc/testsuite/gcc.c-torture/compile/pr61042.c.jj2014-11-18 14:23:56.004671762 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr61042.c 2014-11-18 14:23:35.0 +0100 @@ -0,0 +1,10 @@ +/* PR tree-optimization/61042 */ + +int a, b, c[1], d, f; + +void +foo () +{ + for (; b; b++) +c[0] = c[f] (d = a); +} Jakub
Re: [Patch] pr63937: TARGET_USE_BY_PIECES_INFRASTRUCTURE_P should take an unsigned HOST_WIDE_INT size argument
On Tue, Nov 18, 2014 at 09:57:37PM +, James Greenhalgh wrote: 2014-11-18 James Greenhalgh james.greenha...@arm.com PR target/63937 * target.def (use_by_pieces_infrastructure_p): Take unsigned HOST_WIDE_INT as the size parameter. * targhooks.c (default_use_by_pieces_infrastructure_p): Likewise. * targhooks.h (default_use_by_pieces_infrastructure_p): Likewise. * config/arc/arc.c (arc_use_by_pieces_infrastructure_p)): Likewise. * config/mips/mips.c (mips_use_by_pieces_infrastructure_p)): Likewise. * config/s390/s390.c (s390_use_by_pieces_infrastructure_p)): Likewise. * config/sh/sh.c (sh_use_by_pieces_infrastructure_p)): Likewise. * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p)): Likewise. * doc/tm.texi: Regenerate. 2014-11-18 James Greenhalgh james.greenha...@arm.com * gcc.dg/memset-2.c: New. Please mention the PR also in the testsuite/ChangeLog entry. Ok with that change, thanks. Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Tue, Nov 18, 2014 at 03:30:56PM -0800, Mike Stump wrote: Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for units that keeps u in bounds. Doesn’t this access 32 bytes of OImode values in a 16 byte data structure? No, I'm not touching tmp array at all in that case, only look at vp individual bytes. Either they are all 0, or all 0xff, or I return NULL. and GET_MODE_BITSIZE (outer_submode) is MAX_BITSIZE_MODE_ANY_INT, right? You can’t copy more bytes than the size of the destination has? On the testcase we have (subreg:OI (reg:V8SI 1234) 0) and try to propagate (const_vector:V8SI [8 x (const_int 0)]) into it. All zeros even in OImode (or XImode) is still (const_int 0), all ones even in those modes is (const_int -1), which are the only constants I'm using for those ultra-wide modes, anything else will return NULL (but not ICE). Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Tue, Nov 18, 2014 at 04:34:12PM -0800, Mike Stump wrote: On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote: No, I'm not touching tmp array at all in that case, only look at vp individual bytes. Either they are all 0, or all 0xff, or I return NULL. Oh, sorry, I misread where the break; in your code was going. I might have been misled by: - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); in your patch. You don’t need that anymore, do you? If not, can you remove this part. I thought the assert is unnecessary given the condition just a few lines above it. But can keep it, perhaps gcc_checking_assert would be enough, and hopefully compiler optimizes it away completely. The rest looks like normal rtl/vector code, I don’t see anything wrong with it. Jakub
Re: [PATCH] Fix ubsan -fsanitize=signed-integer-overflow expansion (PR sanitizer/63520)
On Wed, Nov 19, 2014 at 10:02:18AM +0100, Richard Biener wrote: On Tue, 18 Nov 2014, Jakub Jelinek wrote: Apparently, expand_expr with EXPR_WRITE can return a SUBREG with SUBREG_PROMOTED_VAR_P set on it. For Huh, that looks bogus to me. But of course I know nothing (read: not enough) to really tell. Eric? I've tried to look where it comes from, and it dates back to r2xxx or so, so forever. And store_expr has a large: else if (GET_CODE (target) == SUBREG SUBREG_PROMOTED_VAR_P (target)) /* If this is a scalar in a register that is stored in a wider mode than the declared mode, compute the result into its declared mode and then convert to the wider mode. Our value is the computed expression. */ handling block. For targets with only word sized operations something like that actually makes a lot of sense, I have just not been aware of that. Jakub
Re: [PATCH] Fix PR63844
On Wed, Nov 19, 2014 at 10:07:07AM +0100, Richard Biener wrote: The following improves code-generation for PR63844 by using a restrict qualified reference type for the OMP receiver decl. This improves alias analysis and points-to analysis enough to usually allow invariant and store motion where that was possible in the non-split-out variant. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? Sadly I don't have a good testcase (for example one that wasn't vectorized before but is now). We can add one later - ISTR there were a few other bugs about not vectorizing with -fopenmp, I'll try to find them again and see whether they are fixed. Most of those PRs should have openmp keyword. 2014-11-19 Richard Biener rguent...@suse.de PR tree-optimization/63844 * omp-low.c (fixup_child_record_type): Use a restrict qualified referece type for the receiver parameter. Ok, thanks. --- gcc/omp-low.c (revision 217692) +++ gcc/omp-low.c (working copy) @@ -1517,7 +1517,8 @@ fixup_child_record_type (omp_context *ct layout_type (type); } - TREE_TYPE (ctx-receiver_decl) = build_pointer_type (type); + TREE_TYPE (ctx-receiver_decl) += build_qualified_type (build_reference_type (type), TYPE_QUAL_RESTRICT); } /* Instantiate decls as necessary in CTX to satisfy the data sharing Jakub
Re: [PATCH, libgomp]: Require vect_simd_clones effective target for examples-4/e.53.5.{c,f90}
On Wed, Nov 19, 2014 at 11:26:25AM +0100, Uros Bizjak wrote: 2014-11-19 Uros Bizjak ubiz...@gmail.com * testsuite/libgomp.c/examples-4/e.53.5.c: Require vect_simd_clones effective target. * testsuite/libgomp.fortran/examples-4/e.53.5.f90: Ditto. Tested on x86_64-linux-gnu {,-m32} CentOS 5.11. OK for mainline? Yes, thanks. Index: testsuite/libgomp.fortran/examples-4/e.53.5.f90 === --- testsuite/libgomp.fortran/examples-4/e.53.5.f90 (revision 217718) +++ testsuite/libgomp.fortran/examples-4/e.53.5.f90 (working copy) @@ -1,4 +1,4 @@ -! { dg-do run } +! { dg-do run { target vect_simd_clones } } ! { dg-options -O2 } ! { dg-additional-options -msse2 { target sse2_runtime } } ! { dg-additional-options -mavx { target avx_runtime } } Index: testsuite/libgomp.c/examples-4/e.53.5.c === --- testsuite/libgomp.c/examples-4/e.53.5.c (revision 217718) +++ testsuite/libgomp.c/examples-4/e.53.5.c (working copy) @@ -1,4 +1,4 @@ -/* { dg-do run } */ +/* { dg-do run { target vect_simd_clones } } */ /* { dg-options -O2 } */ /* { dg-additional-options -msse2 { target sse2_runtime } } */ /* { dg-additional-options -mavx { target avx_runtime } } */ Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote: OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. So we have (subreg:OI (reg:V4DF ...) ...) for example? What do we end doing with that OI mode subreg? (why can't we simply use the V4DF one?) propagate_rtx_1 is called on (subreg:OI (reg:V8DI 89) 0) with old_rtx (reg:V8DI 89) and new_rtx (const_vector:V8DI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ]) Seems we throw away the result in that case though, because gen_lowpart_common doesn't like to return low parts of VOIDmode constants larger if the mode is larger than double int: 1353 innermode = GET_MODE (x); 1354 if (CONST_INT_P (x) 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT) 1356innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0); 1357 else if (innermode == VOIDmode) 1358innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0); we hit the last if and as mode is wider than innermode, we return NULL later on. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok to me. Not sure if the zero/all-ones case really happens that much to be worth special-casing - I think you could use fixed_wide_intproper-size and simply see if the result is representable in a CONST_INT/CONST_DOUBLE? Can you try if that works? It looks like the patch may be smaller for that. So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? BTW, to Mike, the assert has been misplaced, there was first buffer overflow and only after that the assert. 2014-11-19 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider than MAX_BITSIZE_MODE_ANY_INT. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100 +++ gcc/simplify-rtx.c 2014-11-19 12:28:16.223808178 +0100 @@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute int units = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT; - HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; - wide_int r; + const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE ++ HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT; + HOST_WIDE_INT tmp[tmpsize]; + typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int; + imm_int r; for (u = 0; u units; u++) { @@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); - r = wide_int::from_array (tmp, units, - GET_MODE_PRECISION (outer_submode)); + r = imm_int::from_array (tmp, units, +GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-19 12:04
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote: So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Yes, that looks good. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? Well - not with your added check, no? For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? Jakub
Re: [PATCH] Fix sanitizer/63690
On Wed, Nov 19, 2014 at 02:51:11PM +0100, Marek Polacek wrote: As the testcase shows, we should really check first that we have a MEM_REF, otherwise subsequent TREE_OPERAND might ICE. On an invalid testcase we might get e.g. STRING_CST. Well, addr_object_size should handle STRING_CSTs just fine, so why not handle STRING_CSTs in there too (similarly to decl_p = true case). Supposedly you want decl_p = true for them and need to tweak if (!POINTER_TYPE_P (TREE_TYPE (base)) !DECL_P (base)) but otherwise it should work. On the other side, int foo (int x) { return foobar[x]; } is already handled by -fsanitize=bounds and for some reason gimple_assign_load_p doesn't consider handled components with STRING_CST as base as loads so for -fsanitize=object-size we aren't called for that. Jakub
Re: [PATCH] Fix sanitizer/63690
On Wed, Nov 19, 2014 at 03:12:05PM +0100, Jakub Jelinek wrote: On Wed, Nov 19, 2014 at 02:51:11PM +0100, Marek Polacek wrote: As the testcase shows, we should really check first that we have a MEM_REF, otherwise subsequent TREE_OPERAND might ICE. On an invalid testcase we might get e.g. STRING_CST. Well, addr_object_size should handle STRING_CSTs just fine, so why not handle STRING_CSTs in there too (similarly to decl_p = true case). Supposedly you want decl_p = true for them and need to tweak if (!POINTER_TYPE_P (TREE_TYPE (base)) !DECL_P (base)) but otherwise it should work. On the other side, int foo (int x) { return foobar[x]; } is already handled by -fsanitize=bounds and for some reason gimple_assign_load_p doesn't consider handled components with STRING_CST as base as loads so for -fsanitize=object-size we aren't called for that. The patch is ok as is. Jakub
[PATCH] gimple_{build_assign,assign_set_rhs}_with_ops* cleanup
Hi! This patch: 1) adds unary op overload to gimple_build_assign_with_ops so that many callers don't need to pass NULL_TREE as the last argument explicitly 2) adds unary op overload to gimple_assign_set_rhs_with_ops for similar reasons 3) renames gimple_assign_set_rhs_with_ops_1 to gimple_assign_set_rhs_with_ops so it becomes ternary op overload to the existing binary op one and 2) added unary op and adjusts all users that I found. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-19 Jakub Jelinek ja...@redhat.com * gimple.h (gimple_build_assign_with_ops): Add unary arg overload. (gimple_assign_set_rhs_with_ops_1): Renamed to ... (gimple_assign_set_rhs_with_ops): ... this. Adjust binary arg inline overload to use it. Add unary arg overload. * gimple.c (gimple_build_assign_with_ops): New unary arg overload. (gimple_assign_set_rhs_from_tree): Use gimple_assign_set_rhs_with_ops instead of gimple_assign_set_rhs_with_ops_1. (gimple_assign_set_rhs_with_ops_1): Renamed to ... (gimple_assign_set_rhs_with_ops): ... this. * ipa-split.c (split_function): Remove last NULL argument from gimple_build_assign_with_ops call. * tree-ssa-loop-im.c (move_computations_dom_walker::before_dom_children): Likewise. * tsan.c (instrument_builtin_call): Likewise. * tree-vect-stmts.c (vect_init_vector, vectorizable_mask_load_store, vectorizable_conversion, vectorizable_load): Likewise. * tree-vect-loop.c (vect_is_simple_reduction_1, get_initial_def_for_induction): Likewise. * tree-loop-distribution.c (generate_memset_builtin): Likewise. * tree-vect-patterns.c (vect_handle_widen_op_by_const, vect_recog_widen_mult_pattern, vect_operation_fits_smaller_type, vect_recog_over_widening_pattern, vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern, vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern, adjust_bool_pattern_cast, adjust_bool_pattern, vect_recog_bool_pattern): Likewise. * tree-ssa-phiopt.c (conditional_replacement, abs_replacement, neg_replacement): Likewise. * asan.c (build_shadow_mem_access, maybe_create_ssa_name, maybe_cast_to_ptrmode, asan_expand_check_ifn): Likewise. * tree-vect-slp.c (vect_get_constant_vectors): Likewise. * omp-low.c (lower_rec_input_clauses, expand_omp_for_generic, expand_omp_for_static_nochunk, expand_omp_for_static_chunk, simd_clone_adjust): Likewise. * tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Likewise. * gimple-ssa-strength-reduction.c (introduce_cast_before_cand, replace_one_candidate): Likewise. * gimple-builder.c (build_type_cast): Likewise. * tree-ssa-forwprop.c (simplify_rotate): Likewise. (forward_propagate_addr_expr_1): Remove last NULL argument from gimple_assign_set_rhs_with_ops call. (simplify_vector_constructor): Use gimple_assign_set_rhs_with_ops instead of gimple_assign_set_rhs_with_ops_1. * tree-ssa-reassoc.c (maybe_optimize_range_tests): Remove last NULL argument from gimple_build_assign_with_ops call. (repropagate_negates): Remove last NULL argument from gimple_assign_set_rhs_with_ops call. * ubsan.c (ubsan_expand_null_ifn, ubsan_expand_objsize_ifn): Remove last NULL argument from gimple_build_assign_with_ops call. (instrument_bool_enum_load): Likewise. Remove last NULL argument from gimple_assign_set_rhs_with_ops call. * tree-ssa-math-opts.c (build_and_insert_cast, convert_mult_to_fma): Remove last NULL argument from gimple_build_assign_with_ops call. (bswap_replace): Likewise. Use gimple_assign_set_rhs_with_ops instead of gimple_assign_set_rhs_with_ops_1. (convert_plusminus_to_widen): Use gimple_assign_set_rhs_with_ops instead of gimple_assign_set_rhs_with_ops_1. * gimple-fold.c (replace_stmt_with_simplification): Likewise. (rewrite_to_defined_overflow, gimple_build): Remove last NULL argument from gimple_build_assign_with_ops call. * tree-ssa-strlen.c (handle_pointer_plus): Remove last NULL argument from gimple_assign_set_rhs_with_ops call. * tree-vrp.c (simplify_truth_ops_using_ranges, simplify_bit_ops_using_ranges): Remove last NULL argument from gimple_assign_set_rhs_with_ops call. (simplify_float_conversion_using_ranges, simplify_internal_call_using_ranges): Remove last NULL argument from gimple_build_assign_with_ops call. --- gcc/gimple.h.jj 2014-11-19 10:45:17.780769013 +0100 +++ gcc/gimple.h2014-11-19 10:52:52.118574835 +0100 @@ -1180,6 +1180,8 @@ gimple gimple_build_assign_with_ops (enu tree, tree
Re: [COMMITTED 1/3] Make TARGET_STATIC_CHAIN allow a function type
On Wed, Nov 19, 2014 at 03:58:50PM +0100, Richard Henderson wrote: As opposed to always being a decl. This is a prerequisite to allowing the static chain to be loaded for indirect calls. * targhooks.c (default_static_chain): Remove check for DECL_STATIC_CHAIN. * config/moxie/moxie.c (moxie_static_chain): Likewise. * config/i386/i386.c (ix86_static_chain): Allow decl or type as the first argument. * config/xtensa/xtensa.c (xtensa_static_chain): Change the name of the unused first parameter. * doc/tm.texi (TARGET_STATIC_CHAIN): Document the first parameter may be a type. * target.def (static_chain): Likewise. r217769 broke lots of tests on i686-linux, haven't verified if everything below is caused by this, but regparm-1.c at -m32 certainly is. The ICE is: regparm-1.c: In function ‘test_realigned’: regparm-1.c:49:1: internal compiler error: in ix86_expand_prologue, at config/i386/i386.c:11347 } ^ 0x106140a ix86_expand_prologue() ../../gcc/config/i386/i386.c:11347 0x11aaf2a gen_prologue() ../../gcc/config/i386/i386.md:12095 0x99baf9 thread_prologue_and_epilogue_insns() ../../gcc/function.c:5911 0x99cba4 rest_of_handle_thread_prologue_and_epilogue ../../gcc/function.c:6481 0x99cc0c execute ../../gcc/function.c:6519 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. +FAIL: gcc.dg/pr36015.c (internal compiler error) +FAIL: gcc.dg/pr36015.c (test for excess errors) +UNRESOLVED: gcc.dg/pr36015.c compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O0 compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O0 -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap (test for excess errors) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap -fpic compilation failed to produce executable +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O0 -mforce-drap compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O1 -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O1 -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O1 -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O1 -mforce-drap -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O1 -mforce-drap -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O1 -mforce-drap -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -mforce-drap -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -mforce-drap -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -flto-partition=none -mforce-drap -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -mforce-drap -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -mforce-drap -fpic (test for excess errors) +UNRESOLVED: gcc.dg/torture/stackalign/regparm-1.c -O2 -flto -mforce-drap -fpic compilation failed to produce executable +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -fpic (internal compiler error) +FAIL: gcc.dg/torture/stackalign/regparm-1.c -O2 -fpic (test for excess errors) +UNRESOLVED:
[committed] Allow input and output to be both /dev/null (PR driver/36312, PR driver/63837)
Hi! I've bootstrapped/regtested this patch for Manuel (just small formatting changes from me), and committed to trunk. 2014-11-19 Manuel López-Ibáñez m...@gcc.gnu.org Jakub Jelinek ja...@redhat.com PR driver/36312 PR driver/63837 * gcc.c (process_command): Don't check for input/output filename equality if output is HOST_BIT_BUCKET. * toplev.c (init_asm_output): Likewise. --- gcc/gcc.c.jj2014-11-14 00:10:38.0 +0100 +++ gcc/gcc.c 2014-11-19 17:13:53.409985279 +0100 @@ -4155,7 +4155,9 @@ process_command (unsigned int decoded_op CL_DRIVER, handlers, global_dc); } - if (output_file strcmp (output_file, -)) + if (output_file + strcmp (output_file, -) != 0 + strcmp (output_file, HOST_BIT_BUCKET) != 0) { int i; for (i = 0; i n_infiles; i++) --- gcc/toplev.c.jj 2014-11-18 08:26:38.0 +0100 +++ gcc/toplev.c2014-11-19 17:14:47.324047562 +0100 @@ -945,7 +945,8 @@ init_asm_output (const char *name) } if (!strcmp (asm_file_name, -)) asm_out_file = stdout; - else if (!canonical_filename_eq (asm_file_name, name)) + else if (!canonical_filename_eq (asm_file_name, name) + || !strcmp (asm_file_name, HOST_BIT_BUCKET)) asm_out_file = fopen (asm_file_name, w); else /* Use fatal_error (UNKOWN_LOCATION) instead of just fatal_error to Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 10:38:57AM -0800, Mike Stump wrote: On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote: On Wed, 19 Nov 2014, Jakub Jelinek wrote: For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? It can represent any - well, the RTX at least. Code then using simple wide_int may wreck then though. So, my worry is this… once you start in on adding support here or there for int modes wider than the largest supported int mode, you create a never ending maintenance nightmare with complex rules that one will never be able to keep straight. There needs to be a single line or two, that explains the rules that we all agree to, then it will always be clear what the rule is. The largest supported int mode is: x, has a nice, simple, easy to explain aspect to it. Well, at least for TARGET_SUPPORTS_WIDE_INT == 0 my patch would always create CONST_INTs or CONST_DOUBLEs, which all fit into MAX_BITSIZE_MODE_ANY_INT bits, CONST_INTs are modeless, so in what wider mode you use them doesn't matter. For TARGET_SUPPORTS_WIDE_INT != 0 we could certainly cap it similarly, if wi::min_precision (r, SIGNED) MAX_BITSIZE_MODE_ANY_INT we could return NULL_RTX. Though, following patch is just fine for me too, I don't think it will make a significant difference: --- gcc/simplify-rtx.c 2014-11-19 15:39:24.073113107 +0100 +++ gcc/simplify-rtx.c 2014-11-19 22:55:44.201464253 +0100 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + return NULL_RTX; for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE. */ + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; Jakub
Re: OpenACC middle end changes
On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote: Another change that's required is (something like) the following. For ptx, we need to know whether to output something as a .func (callable from ptx code) or a .kernel (callable from the host). That means we need to mark the kernel functions somehow in omp-low.c, and the following does that by way of a new attribute (already recognized by the nvptx backend). I think Richard's and Honza's preference in this case is a flag in cgraph_node instead of an attribute. * omp-low.c (create_omp_child_function): Tag entrypoint functions with a special attribute. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 42ba317..8408025 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy) break; } } + if (cgraph_node::get_create (decl)-offloadable + !lookup_attribute (omp declare target, + DECL_ATTRIBUTES (current_function_decl))) +DECL_ATTRIBUTES (decl) + = tree_cons (get_identifier (omp target entrypoint), + NULL_TREE, DECL_ATTRIBUTES (decl)); t = build_decl (DECL_SOURCE_LOCATION (decl), RESULT_DECL, NULL_TREE, void_type_node); Jakub
Re: OpenACC middle end changes
On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote: Thomas had apparently already pointed out an issue with the new gomp_target class (there are multiple similar types of statements we want to handle with OpenACC, they have different codes but we want to have function pointers operating on any of them) back in July. That seems to have been ignored. By necessity, some of David's changes are reverted in the following patch. I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just two new gimple_omp_target_kind GF_* flags. Jakub
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
On Thu, Nov 20, 2014 at 02:36:26PM +0300, Evgeny Stupachenko wrote: + /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general + shuffles. */ I think switch (d-vmode) would be more readable. + op = gen_reg_rtx (d-vmode); + t = gen_reg_rtx (V4DImode); + emit_insn (gen_pack (op, dop0, dop1)); + emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx, Too long line, wrap it? Will leave the rest to Uros. Jakub
Re: [PATCH] PR63426 Fix various signed integer overflows
On Thu, Nov 20, 2014 at 02:27:52PM +0100, Markus Trippelsdorf wrote: 2014-11-20 Markus Trippelsdorf mar...@trippelsdorf.de * emit-rtl.c (const_wide_int_htab_hash): Likewise. * loop-iv.c (determine_max_iter): Likewise. (iv_number_of_iterations): Likewise. * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise. * varasm.c (get_section_anchor): Likewise. Ok, with one small change: --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block *block, HOST_WIDE_INT offset, offset = 0; else { - bias = 1 (GET_MODE_BITSIZE (ptr_mode) - 1); + bias = (unsigned HOST_WIDE_INT) 1 (GET_MODE_BITSIZE (ptr_mode) - 1); Please use HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1. Jakub
Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports
On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote: Hi, MPX runtime checks some feature bits in order to check MPX is fully supported. Runtime does it by cpuid calls but there is a __builtin_cpu_supports which may be used for that. Unfortunately currently it doesn't support required bits. Will it be OK to add them for trunk? I think using cpuid for that is just fine. __builtin_cpu_supports is for ISA additions users might actually want to version code for, MPX stuff, as the instructions are nops without hw support, are not something one would multi-version a function for. If anything, AVX512F and AVX512BW+VL might be good candidates for that, not MPX. Jakub
Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
On Thu, Nov 20, 2014 at 06:14:52PM +0100, Marek Polacek wrote: This patch fixes a bunch of ICEs related to C++14 constexprs and -fsanitize=undefined. We should ignore ubsan internal functions and ubsan builtins in constexpr functions in cxx_eval_call_expression. Also add proper printing of internal functions into the C++ printer. Bootstrapped/regtested on ppc64-linux, ok for trunk? I'd like Jason to review this. But a few nits: @@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, } if (DECL_CLONED_FUNCTION_P (fun)) fun = DECL_CLONED_FUNCTION (fun); + + if (!current_function_decl is_ubsan_builtin_p (fun)) +return void_node; + I don't understand the !current_function_decl here. Also, looking at is_ubsan_builtin_p definition, I'd say it should IMHO at least test DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL before comparing the function name, you can declare __builtin_ubsan_foobarbaz () and use it and it won't be a builtin. As for the testcase, I'd like to understand if C++ FE should reject the constexpr functions when used with arguments that trigger undefined behavior. But certainly the behavior should not depend on whether -fsanitize=undefined or not. Also, what is the reason for constexpr call flows off the end errors? Shouldn't that be avoided if any error is found while interpreting the function? Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote: On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote: Though, following patch is just fine for me too, I don't think it will make a significant difference: This version is fine by me. Richard, are you ok with that too? Bootstrapped/regtested on x86_64-linux and i686-linux now. 2014-11-20 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): Return NULL for integer modes wider than MAX_BITSIZE_MODE_ANY_INT. If not using CONST_WIDE_INT, make sure r fits into CONST_DOUBLE. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100 +++ gcc/simplify-rtx.c 2014-11-19 12:28:16.223808178 +0100 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + return NULL_RTX; for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE. */ + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-19 12:04:23.490489130 +0100 +++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-19 12:04:23.490489130 +0100 @@ -0,0 +1,12 @@ +/* PR target/63910 */ +/* { dg-do compile } */ +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */ + +extern void bar (float *c); + +void +foo (void) +{ + float c[1024] = { }; + bar (c); +} Jakub
[PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)
Hi! This patch fixes ICEs if a non-lvalue vector (say cast of one vector to another vector type) was subscripted and used as lhs. The following patch, if *vecp is not lvalue, will copy it to a temporary variable which can be made addressable for the subscription, and afterwards wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used on the lhs. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-20 Jakub Jelinek ja...@redhat.com PR target/63764 c-family/ * c-common.h (convert_vector_to_pointer_for_subscript): Change return type to bool. * c-common.c: Include gimple-expr.c. (convert_vector_to_pointer_for_subscript): Change return type to bool. If *vecp is not lvalue_p and has VECTOR_TYPE, return true and copy it into a TARGET_EXPR and use that instead of *vecp directly. c/ * c-typeck.c (build_array_ref): Adjust convert_vector_to_pointer_for_subscript caller. If it returns true, call non_lvalue_loc on the result. cp/ * typeck.c (cp_build_array_ref): Adjust convert_vector_to_pointer_for_subscript caller. If it returns true, call non_lvalue_loc on the result. testsuite/ * c-c++-common/pr63764-1.c: New test. * c-c++-common/pr63764-2.c: New test. --- gcc/c-family/c-common.h.jj 2014-11-19 15:39:26.606065628 +0100 +++ gcc/c-family/c-common.h 2014-11-20 08:38:02.527655971 +0100 @@ -1310,7 +1310,7 @@ extern tree build_userdef_literal (tree enum overflow_type overflow, tree num_string); -extern void convert_vector_to_pointer_for_subscript (location_t, tree*, tree); +extern bool convert_vector_to_pointer_for_subscript (location_t, tree *, tree); /* Possibe cases of scalar_to_vector conversion. */ enum stv_conv { --- gcc/c-family/c-common.c.jj 2014-11-19 15:39:26.606065628 +0100 +++ gcc/c-family/c-common.c 2014-11-20 08:50:21.000573676 +0100 @@ -60,6 +60,7 @@ along with GCC; see the file COPYING3. #include target-def.h #include gimplify.h #include wide-int-print.h +#include gimple-expr.h cpp_reader *parse_in; /* Declared in c-pragma.h. */ @@ -12030,22 +12031,47 @@ build_userdef_literal (tree suffix_id, t } /* For vector[index], convert the vector to a - pointer of the underlying type. */ -void + pointer of the underlying type. Return true if the resulting + ARRAY_REF should not be an lvalue. */ + +bool convert_vector_to_pointer_for_subscript (location_t loc, -tree* vecp, tree index) +tree *vecp, tree index) { + bool ret = false; if (TREE_CODE (TREE_TYPE (*vecp)) == VECTOR_TYPE) { tree type = TREE_TYPE (*vecp); tree type1; + ret = !lvalue_p (*vecp); if (TREE_CODE (index) == INTEGER_CST) if (!tree_fits_uhwi_p (index) || tree_to_uhwi (index) = TYPE_VECTOR_SUBPARTS (type)) warning_at (loc, OPT_Warray_bounds, index value is out of bound); - c_common_mark_addressable_vec (*vecp); + if (ret) + { + tree tmp = create_tmp_var_raw (type, NULL); + DECL_SOURCE_LOCATION (tmp) = loc; + *vecp = c_save_expr (*vecp); + if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR) + { + bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp); + *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp); + *vecp + = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp, + *vecp, NULL_TREE, NULL_TREE), + non_const); + } + else + *vecp = build4 (TARGET_EXPR, type, tmp, *vecp, + NULL_TREE, NULL_TREE); + SET_EXPR_LOCATION (*vecp, loc); + c_common_mark_addressable_vec (tmp); + } + else + c_common_mark_addressable_vec (*vecp); type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type)); type1 = build_pointer_type (TREE_TYPE (*vecp)); bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1); @@ -12065,6 +12091,7 @@ convert_vector_to_pointer_for_subscript *vecp = build1 (ADDR_EXPR, type1, *vecp); *vecp = convert (type, *vecp); } + return ret; } /* Determine which of the operands, if any, is a scalar that needs to be --- gcc/c/c-typeck.c.jj 2014-11-19 15:39:24.044113650 +0100 +++ gcc/c/c-typeck.c2014-11-20 08:38:02.534655847 +0100 @@ -2495,7 +2495,8 @@ build_array_ref (location_t loc, tree ar gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE); - convert_vector_to_pointer_for_subscript (loc, array, index); + bool non_lvalue += convert_vector_to_pointer_for_subscript (loc, array, index); if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE) { @@ -2557,6 +2558,8 @@ build_array_ref (location_t loc, tree
[PATCH] Fix tree-ssa-strlen ICE introduced by r211956 (PR tree-optimization/61773)
Hi! Before the r211956 changes, the only places that set si-stmt were required to check that stpcpy has been declared (with the right prototype) to signal the strlen pass that it can use stpcpy for optimization. But r211956 sets si-stmt also for malloca call, which isn't in any way related to stpcpy. So, this patch moves the assertion where it really is needed (for strcat/strcpy and their checking variants cases). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-20 Jakub Jelinek ja...@redhat.com PR tree-optimization/61773 * tree-ssa-strlen.c (get_string_length): Don't assert stpcpy has been prototyped if si-stmt is BUILT_IN_MALLOC. * gcc.dg/pr61773.c: New test. --- gcc/tree-ssa-strlen.c.jj2014-11-19 18:47:59.0 +0100 +++ gcc/tree-ssa-strlen.c 2014-11-20 09:46:33.949017462 +0100 @@ -430,7 +430,6 @@ get_string_length (strinfo si) callee = gimple_call_fndecl (stmt); gcc_assert (callee DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL); lhs = gimple_call_lhs (stmt); - gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY)); /* unshare_strinfo is intentionally not called here. The (delayed) transformation of strcpy or strcat into stpcpy is done at the place of the former strcpy/strcat call and so can affect all the strinfos @@ -479,6 +478,7 @@ get_string_length (strinfo si) case BUILT_IN_STRCPY_CHK: case BUILT_IN_STRCPY_CHKP: case BUILT_IN_STRCPY_CHK_CHKP: + gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY)); if (gimple_call_num_args (stmt) == (with_bounds ? 4 : 2)) fn = builtin_decl_implicit (BUILT_IN_STPCPY); else --- gcc/testsuite/gcc.dg/pr61773.c.jj 2014-11-20 10:12:48.664616764 +0100 +++ gcc/testsuite/gcc.dg/pr61773.c 2014-11-20 10:13:47.384557904 +0100 @@ -0,0 +1,16 @@ +/* PR tree-optimization/61773 */ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +void +foo (char **x) +{ + char *p = __builtin_malloc (64); + char *q = __builtin_malloc (64); + __builtin_strcat (q, abcde); + __builtin_strcat (p, ab); + p[1] = q[3]; + __builtin_strcat (p, q); + x[0] = p; + x[1] = q; +} Jakub
[ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)
Hi! Seems the gcc.target/ia64/small-addr-1.c testcase is failing on ia64 since r210262 but clearly has been failing for much longer if compiled with C++ (just there is insufficient testsuite coverage). The problem is that for the model attribute (and apparently common_object on VMS too), the argument of that attribute is supposed to be an identifier rather than expression (for common_object either an identifier or string), and these days one has to tell the frontends about that in order not to get the argument parsed as an expression. The following untested patch fixes that (tested on small-addr-1.c with a cross-compiler), I don't have ia64 hw nor spare cycles to test this though, so I'm just offering the patch as is if anyone wants to test it. Perhaps better testsuite coverage wouldn't hurt (test the model (small) attribute also in C++, perhaps test the common_object attribute on VMS?). 2014-11-20 Jakub Jelinek ja...@redhat.com PR target/61137 * config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function. (TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it. --- gcc/config/ia64/ia64.c.jj 2014-11-11 00:06:23.0 +0100 +++ gcc/config/ia64/ia64.c 2014-11-20 11:51:59.729478773 +0100 @@ -324,6 +324,7 @@ static bool ia64_vms_valid_pointer_mode static tree ia64_vms_common_object_attribute (tree *, tree, tree, int, bool *) ATTRIBUTE_UNUSED; +static bool ia64_attribute_takes_identifier_p (const_tree); static tree ia64_handle_model_attribute (tree *, tree, tree, int, bool *); static tree ia64_handle_version_id_attribute (tree *, tree, tree, int, bool *); static void ia64_encode_section_info (tree, rtx, int); @@ -669,8 +670,26 @@ static const struct attribute_spec ia64_ #undef TARGET_VECTORIZE_VEC_PERM_CONST_OK #define TARGET_VECTORIZE_VEC_PERM_CONST_OK ia64_vectorize_vec_perm_const_ok +#undef TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P +#define TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P ia64_attribute_takes_identifier_p + struct gcc_target targetm = TARGET_INITIALIZER; +/* Returns TRUE iff the target attribute indicated by ATTR_ID takes a plain + identifier as an argument, so the front end shouldn't look it up. */ + +static bool +ia64_attribute_takes_identifier_p (const_tree attr_id) +{ + if (is_attribute_p (model, attr_id)) +return true; +#if TARGET_ABI_OPEN_VMS + if (is_attribute_p (common_object, attr_id)) +return true; +#endif + return false; +} + typedef enum { ADDR_AREA_NORMAL, /* normal address area */ Jakub
Re: [PATCH] rs6000: Follow up for signed integer overflow fix
On Thu, Nov 20, 2014 at 07:41:43PM +0100, Markus Trippelsdorf wrote: 2014-11-20 Markus Trippelsdorf mar...@trippelsdorf.de * config/rs6000/rs6000.c (includes_rldic_lshift_p): Cast 0 to unsigned. (includes_rldicr_lshift_p): Likewise. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index a9604cf3fa97..d7958b33ba1a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -16197,10 +16197,10 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop) unsigned HOST_WIDE_INT c, lsb, shift_mask; c = INTVAL (andop); - if (c == 0 || c == ~0) + if (c == 0 || c == ~(unsigned HOST_WIDE_INT) 0) return 0; - shift_mask = ~0; + shift_mask = ~(unsigned HOST_WIDE_INT) 0; shift_mask = INTVAL (shiftop); /* Find the least significant one bit. */ @@ -16235,7 +16235,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop) { unsigned HOST_WIDE_INT c, lsb, shift_mask; - shift_mask = ~0; + shift_mask = ~(unsigned HOST_WIDE_INT) 0; shift_mask = INTVAL (shiftop); c = INTVAL (andop); You could use ~HOST_WIDE_INT_UC (0) in all the 3 cases. Jakub
Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.
On Thu, Nov 20, 2014 at 11:30:11PM +0100, Mark Wielaard wrote: --- a/gcc/config/avr/avr-c.c +++ b/gcc/config/avr/avr-c.c @@ -386,7 +386,8 @@ avr_cpu_cpp_builtins (struct cpp_reader *pfile) (as mentioned in ISO/IEC DTR 18037; Annex F.2) which is not implemented in GCC up to now. */ - if (!strcmp (lang_hooks.name, GNU C)) + if (strncmp (lang_hooks.name, GNU C, 5) == 0 + strncmp (lang_hooks.name, GNU C++, 7) != 0) I wonder if the tests for C language shouldn't be better done as (strncmp (lang_hooks.name, GNU C, 5) == 0 strchr (0123456789, lang_hooks.name[5]) != NULL) or (strncmp (lang_hooks.name, GNU C, 5) == 0 (ISDIGIT (lang_hooks.name[5]) || lang_hooks.name[5] == '\0')) to make it explicit what we are looking for, not what we aren't. + either, so for now use 0. Match GNU C++ first, since it needs to + be compared with strncmp, like GNU C, which has the same prefix. */ + if (! strncmp (language_string, GNU C++, 7) +|| ! strcmp (language_string, GNU Objective-C++)) Wrong formatting, || should be below ! on the previous line. + i = 9; + else if (! strncmp (language_string, GNU C, 5) || ! strcmp (language_string, GNU GIMPLE) || ! strcmp (language_string, GNU Go)) And here too. But if you use a different check for C (see above), you could avoid moving the C++ case first. --- a/gcc/langhooks.h +++ b/gcc/langhooks.h @@ -261,7 +261,8 @@ struct lang_hooks_for_lto struct lang_hooks { - /* String identifying the front end. e.g. GNU C++. */ + /* String identifying the front end. e.g. GNU C++. + Might include language version being used. */ As we no longer have GNU C++ as any name, using it as an example is weird. So, /* String identifying the front end and optionally language standard version, e.g. GNU C++98 or GNU Java. */ ? LGTM otherwise. Jakub
Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.
On Thu, Nov 20, 2014 at 11:30:12PM +0100, Mark Wielaard wrote: @@ -19592,13 +19597,28 @@ gen_compile_unit_die (const char *filename) language = DW_LANG_C; if (strncmp (language_string, GNU C++, 7) == 0) -language = DW_LANG_C_plus_plus; +{ + language = DW_LANG_C_plus_plus; + if (dwarf_version = 5 || !dwarf_strict) + { + if (strcmp (language_string, GNU C++11) == 0) + language = DW_LANG_C_plus_plus_11; + else if (strcmp (language_string, GNU C++14) == 0) + language = DW_LANG_C_plus_plus_14; + } +} I think best would be to tweak if (value 2 || value 4) error_at (loc, dwarf version %d is not supported, value); else opts-x_dwarf_version = value; so that we accept value 5 too, and for now, until the most common consumers are changed, use if (dwarf_version = 5 /* || !dwarf_strict */) so that - you can actually use it in the test with -gdwarf-5 - you can commit it right away - people can start playing with what it will mean to support DWARF5 GCC 4.5 also allowed -gdwarf-4 even when DWARF4 has not been released yet. When there are consumers that can grok it, we can uncomment the || !dwarf_strict. Jason, do you agree? else if (strncmp (language_string, GNU C, 5) == 0) { language = DW_LANG_C89; if (dwarf_version = 3 || !dwarf_strict) - if (strcmp (language_string, GNU C99) == 0) - language = DW_LANG_C99; + { + if (strcmp (language_string, GNU C89) != 0) + language = DW_LANG_C99; + + if (dwarf_version = 5 || !dwarf_strict) + if (strcmp (language_string, GNU C11) == 0) + language = DW_LANG_C11; + } Shouldn't we emit at least DW_LANG_C99 for GNU C11 if not dwarf_version = 5 /* || !dwarf_strict */ but dwarf_version = 3 || !dwarf_strict is true? BTW, noticed we don't have anything for Fortran 2003 and 2008, filed a DWARF Issue for that. Jakub
Re: [PATCH] OpenACC for C front end
On Thu, Nov 20, 2014 at 05:50:33PM -0600, James Norris wrote: + case 'h': + if (!strcmp (host, p)) + result = PRAGMA_OMP_CLAUSE_SELF; + break; Shouldn't this be PRAGMA_OMP_CLAUSE_HOST (PRAGMA_OACC_CLAUSE_HOST) instead? It is _HOST in the C++ patch, are there no C tests with that clause covering it? The host clause is a synonym for the self clause. The initial C++ patch did not treat host as a synonym and has amended accordingly. Can you add a comment mentioning that (for casual readers)? There was a mistake in naming the function: c_parser_omp_clause_vector_length. Once it was renamed to: c_parser_oacc_clause_vector_length, diff was able to keep track. Great. OK to commit after middle end is accepted? Ok, thanks. Jakub
Re: [PATCH] OpenACC for C++ front end
On Thu, Nov 20, 2014 at 05:33:57PM -0600, James Norris wrote: + t = OMP_CLAUSE_ASYNC_EXPR (c); + if (t == error_mark_node) + remove = true; + else if (!type_dependent_expression_p (t) + !INTEGRAL_TYPE_P (TREE_TYPE (t))) + { + error (%async% expression must be integral); You have OMP_CLAUSE_LOCATION (c) which you could use for error_at. I followed the convention that was used elsewhere in the function at using error (). Perhaps it would be better to change even those other spots in the function. But that can be certainly done as a follow-up patch. Thank you for taking the time to review! OK to commit after middle end has been accepted? Yes, thanks. Jakub
Re: [PATCH 1/2] PR debug/38757 gcc does not emit DW_LANG_C99.
On Fri, Nov 21, 2014 at 02:01:55PM +0100, Mark Wielaard wrote: gcc/c-family/ChangeLog PR debug/38757 * c-opts.c (set_std_c89): Set lang_hooks.name. (set_std_c99): Likewise. (set_std_c11): Likewise. (set_std_cxx98): Likewise. (set_std_cxx11): Likewise. (set_std_cxx14): Likewise. (set_std_cxx1z): Likewise. gcc/ChangeLog PR debug/38757 * config/avr/avr-c.c (avr_cpu_cpp_builtins): Use lang_GNU_C. * config/darwin.c (darwin_file_end): Use lang_GNU_CXX. (darwin_override_options): Likewise. * config/ia64/ia64.c (ia64_struct_retval_addr_is_first_parm_p): Likewise. * config/rs6000/rs6000.c (rs6000_output_function_epilogue): Likewise. * dbxout.c (get_lang_number): Likewise. (dbxout_type): Likewise. (dbxout_symbol_location): Likewise. * dwarf2out.c (add_prototyped_attribute): Add DW_AT_prototype also for DW_LANG_{C,C99,ObjC}. (highest_c_language): New function. (gen_compile_unit_die): Call highest_c_language to merge LTO TRANSLATION_UNIT_LANGUAGE. Use strncmp language_string to determine if DW_LANG_C99 or DW_LANG_C89 should be returned. * fold-const.c (fold_cond_expr_with_comparison): Use lang_GNU_CXX. * langhooks.h (struct lang_hooks): Add version comment to name. (lang_GNU_C): New function declaration. (lang_GNU_CXX): Likewise. * langhooks.c (lang_GNU_C): New function. (lang_GNU_CXX): Likewise. * vmsdbgout.c (vmsdbgout_init): Use lang_GNU_C and lang_GNU_CXX. gcc/testsuite/ChangeLog PR debug/38757 * gcc.dg/debug/dwarf2/lang-c89.c: New test. * gcc.dg/debug/dwarf2/lang-c99.c: Likewise. Ok, thanks. Jakub
[committed] Cherry-pick a libsanitizer bugfix (PR sanitizer/64013)
Hi! I've committed this as obvious. 2014-11-21 Jakub Jelinek ja...@redhat.com PR sanitizer/64013 * sanitizer_common/sanitizer_linux.cc (FileExists): Cherry pick upstream r222532. --- libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222531) +++ libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 222532) @@ -283,17 +283,15 @@ uptr internal_execve(const char *filenam // - sanitizer_common.h bool FileExists(const char *filename) { -#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS struct stat st; +#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS if (internal_syscall(SYSCALL(newfstatat), AT_FDCWD, filename, st, 0)) -return false; #else - struct stat st; if (internal_stat(filename, st)) +#endif return false; // Sanity check: filename is a regular file. return S_ISREG(st.st_mode); -#endif } uptr GetTid() { Jakub
[PATCH] Fix VRP handling of {ADD,SUB,MUL}_OVERFLOW (PR tree-optimization/64006)
Hi! As discussed on IRC and in the PR, these internal calls are quite unique for VRP in that they return _Complex integer result, which VRP doesn't track, but then extract using REALPART_EXPR/IMAGPART_EXPR the two results from that _Complex int and to generate good code it is desirable to get proper ranges of those two results. The problem is that right now this works only on the first VRP iteration, the REALPART_EXPR/IMAGPART_EXPR statements are handled if their operand is set by {ADD,SUB,MUL}_OVERFLOW. If we iterate because a VR of one of the internal call arguments changes, nothing in the propagator marks the REALPART_EXPR/IMAGPART_EXPR statements for reconsideration. The following patch handles this, by making the internal calls interesting to the propagator and returning the right SSA_PROP_* for it (depending on whether any of the value ranges of the REALPART_EXPR/IMAGPART_EXPR immediate uses would change or not). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-21 Jakub Jelinek ja...@redhat.com PR tree-optimization/64006 * tree-vrp.c (stmt_interesting_for_vrp): Return true for {ADD,SUB,MUL}_OVERFLOW internal calls. (vrp_visit_assignment_or_call): For {ADD,SUB,MUL}_OVERFLOW internal calls, check if any REALPART_EXPR/IMAGPART_EXPR immediate uses would change their value ranges and return SSA_PROP_INTERESTING if so, or SSA_PROP_NOT_INTERESTING if there are some REALPART_EXPR/IMAGPART_EXPR immediate uses interesting for vrp. * gcc.c-torture/execute/pr64006.c: New test. --- gcc/tree-vrp.c.jj 2014-11-21 10:17:05.0 +0100 +++ gcc/tree-vrp.c 2014-11-21 13:12:09.895013334 +0100 @@ -6949,6 +6949,20 @@ stmt_interesting_for_vrp (gimple stmt) (is_gimple_call (stmt) || !gimple_vuse (stmt))) return true; + else if (is_gimple_call (stmt) gimple_call_internal_p (stmt)) + switch (gimple_call_internal_fn (stmt)) + { + case IFN_ADD_OVERFLOW: + case IFN_SUB_OVERFLOW: + case IFN_MUL_OVERFLOW: + /* These internal calls return _Complex integer type, + but are interesting to VRP nevertheless. */ + if (lhs TREE_CODE (lhs) == SSA_NAME) + return true; + break; + default: + break; + } } else if (gimple_code (stmt) == GIMPLE_COND || gimple_code (stmt) == GIMPLE_SWITCH) @@ -7101,6 +7115,74 @@ vrp_visit_assignment_or_call (gimple stm return SSA_PROP_NOT_INTERESTING; } + else if (is_gimple_call (stmt) gimple_call_internal_p (stmt)) +switch (gimple_call_internal_fn (stmt)) + { + case IFN_ADD_OVERFLOW: + case IFN_SUB_OVERFLOW: + case IFN_MUL_OVERFLOW: + /* These internal calls return _Complex integer type, + which VRP does not track, but the immediate uses + thereof might be interesting. */ + if (lhs TREE_CODE (lhs) == SSA_NAME) + { + imm_use_iterator iter; + use_operand_p use_p; + enum ssa_prop_result res = SSA_PROP_VARYING; + + set_value_range_to_varying (get_value_range (lhs)); + + FOR_EACH_IMM_USE_FAST (use_p, iter, lhs) + { + gimple use_stmt = USE_STMT (use_p); + if (!is_gimple_assign (use_stmt)) + continue; + enum tree_code rhs_code = gimple_assign_rhs_code (use_stmt); + if (rhs_code != REALPART_EXPR rhs_code != IMAGPART_EXPR) + continue; + tree rhs1 = gimple_assign_rhs1 (use_stmt); + tree use_lhs = gimple_assign_lhs (use_stmt); + if (TREE_CODE (rhs1) != rhs_code + || TREE_OPERAND (rhs1, 0) != lhs + || TREE_CODE (use_lhs) != SSA_NAME + || !stmt_interesting_for_vrp (use_stmt) + || (!INTEGRAL_TYPE_P (TREE_TYPE (use_lhs)) + || !TYPE_MIN_VALUE (TREE_TYPE (use_lhs)) + || !TYPE_MAX_VALUE (TREE_TYPE (use_lhs + continue; + + /* If there is a change in the value range for any of the + REALPART_EXPR/IMAGPART_EXPR immediate uses, return + SSA_PROP_INTERESTING. If there are any REALPART_EXPR + or IMAGPART_EXPR immediate uses, but none of them have + a change in their value ranges, return + SSA_PROP_NOT_INTERESTING. If there are no + {REAL,IMAG}PART_EXPR uses at all, + return SSA_PROP_VARYING. */ + value_range_t new_vr = VR_INITIALIZER; + extract_range_basic (new_vr, use_stmt); + value_range_t *old_vr = get_value_range (use_lhs); + if (old_vr-type != new_vr.type + || !vrp_operand_equal_p (old_vr-min, new_vr.min
[PATCH] Fix up __builtin_*_overflow expansion on some targets (PR target/63848)
Hi! Apparently, emit_cmp_and_jump_insns can silently generate wrong code for wider modes on some targets, so this patch changes all those calls in internal-fn.c to do_compare_rtx_and_jump, which is a wrapper around emit_cmp_and_jump_insns that should handle the wider mode comparison expansion. Unfortunately, the order of arguments is different :(. No new testcases provided, the existing testsuite exhibited this on various targets. Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the testcases for ia64 and Uros tested the testcases on Alpha (in both cases they previously failed), ok for trunk? 2014-11-21 Jakub Jelinek ja...@redhat.com PR target/63848 PR target/63975 * internal-fn.c (expand_arith_overflow_result_store, expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow): Use do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns everywhere, adjust arguments to those functions. Use unsignedp = true for EQ, NE, GEU, LEU, LTU and GTU comparisons. --- gcc/internal-fn.c.jj2014-11-19 18:48:02.0 +0100 +++ gcc/internal-fn.c 2014-11-21 17:34:00.634621461 +0100 @@ -386,8 +386,8 @@ expand_arith_overflow_result_store (tree int uns = TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs))); lres = convert_modes (tgtmode, mode, res, uns); gcc_assert (GET_MODE_PRECISION (tgtmode) GET_MODE_PRECISION (mode)); - emit_cmp_and_jump_insns (res, convert_modes (mode, tgtmode, lres, uns), - EQ, NULL_RTX, mode, false, done_label, + do_compare_rtx_and_jump (res, convert_modes (mode, tgtmode, lres, uns), + EQ, true, mode, NULL_RTX, NULL_RTX, done_label, PROB_VERY_LIKELY); write_complex_part (target, const1_rtx, true); emit_label (done_label); @@ -533,8 +533,8 @@ expand_addsub_overflow (location_t loc, ? (CONST_SCALAR_INT_P (op0) REG_P (op1)) : CONST_SCALAR_INT_P (op1))) tem = op1; - emit_cmp_and_jump_insns (res, tem, code == PLUS_EXPR ? GEU : LEU, - NULL_RTX, mode, false, done_label, + do_compare_rtx_and_jump (res, tem, code == PLUS_EXPR ? GEU : LEU, + true, mode, NULL_RTX, NULL_RTX, done_label, PROB_VERY_LIKELY); goto do_error_label; } @@ -549,7 +549,7 @@ expand_addsub_overflow (location_t loc, rtx tem = expand_binop (mode, add_optab, code == PLUS_EXPR ? res : op0, sgn, NULL_RTX, false, OPTAB_LIB_WIDEN); - emit_cmp_and_jump_insns (tem, op1, GEU, NULL_RTX, mode, false, + do_compare_rtx_and_jump (tem, op1, GEU, true, mode, NULL_RTX, NULL_RTX, done_label, PROB_VERY_LIKELY); goto do_error_label; } @@ -591,9 +591,9 @@ expand_addsub_overflow (location_t loc, emit_jump (do_error); else if (pos_neg == 3) /* If ARG0 is not known to be always positive, check at runtime. */ - emit_cmp_and_jump_insns (op0, const0_rtx, LT, NULL_RTX, mode, false, -do_error, PROB_VERY_UNLIKELY); - emit_cmp_and_jump_insns (op1, op0, LEU, NULL_RTX, mode, false, + do_compare_rtx_and_jump (op0, const0_rtx, LT, false, mode, NULL_RTX, +NULL_RTX, do_error, PROB_VERY_UNLIKELY); + do_compare_rtx_and_jump (op1, op0, LEU, true, mode, NULL_RTX, NULL_RTX, done_label, PROB_VERY_LIKELY); goto do_error_label; } @@ -607,7 +607,7 @@ expand_addsub_overflow (location_t loc, OPTAB_LIB_WIDEN); rtx tem = expand_binop (mode, add_optab, op1, sgn, NULL_RTX, false, OPTAB_LIB_WIDEN); - emit_cmp_and_jump_insns (op0, tem, LTU, NULL_RTX, mode, false, + do_compare_rtx_and_jump (op0, tem, LTU, true, mode, NULL_RTX, NULL_RTX, done_label, PROB_VERY_LIKELY); goto do_error_label; } @@ -619,8 +619,8 @@ expand_addsub_overflow (location_t loc, unsigned. */ res = expand_binop (mode, add_optab, op0, op1, NULL_RTX, false, OPTAB_LIB_WIDEN); - emit_cmp_and_jump_insns (res, const0_rtx, LT, NULL_RTX, mode, false, - do_error, PROB_VERY_UNLIKELY); + do_compare_rtx_and_jump (res, const0_rtx, LT, false, mode, NULL_RTX, + NULL_RTX, do_error, PROB_VERY_UNLIKELY); rtx tem = op1; /* The operation is commutative, so we can pick operand to compare against. For prec = BITS_PER_WORD, I think preferring REG operand @@ -633,7 +633,7 @@ expand_addsub_overflow (location_t loc, ? (CONST_SCALAR_INT_P (op1) REG_P (op0)) : CONST_SCALAR_INT_P (op0)) tem = op0; - emit_cmp_and_jump_insns (res, tem, GEU
Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing
On Fri, Nov 21, 2014 at 10:14:21PM +0300, Ilya Verbin wrote: On Fri, 21 Nov 2014 21:44:40, Ilya Verbin wrote: You're right. This patch was rebased so many times, that we may forget to regenerate it before committing. Build with liboffloadmic passed. OK for trunk? -- Ilya * Makefile.in: Regenerate. Ok. --- a/Makefile.in +++ b/Makefile.in @@ -35238,9 +35238,6 @@ configure-target-liboffloadmic: $(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/liboffloadmic ; \ $(NORMAL_TARGET_EXPORTS) \ echo Configuring in $(TARGET_SUBDIR)/liboffloadmic; \ - \ - this_target=${target_alias}; \ - \ cd $(TARGET_SUBDIR)/liboffloadmic || exit 1; \ case $(srcdir) in \ /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \ @@ -35248,14 +35245,12 @@ configure-target-liboffloadmic: sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \ esac; \ module_srcdir=liboffloadmic; \ - srcdiroption=--srcdir=$${topdir}/liboffloadmic; \ - libsrcdir=$$s/liboffloadmic; \ rm -f no-such-file || : ; \ CONFIG_SITE=no-such-file $(SHELL) \ $$s/$$module_srcdir/configure \ --srcdir=$${topdir}/$$module_srcdir \ $(TARGET_CONFIGARGS) --build=${build_alias} --host=${target_alias} \ - --target=$${this_target} $${srcdiroption} @extra_liboffloadmic_configure_flags@ \ + --target=${target_alias} @extra_liboffloadmic_configure_flags@ \ || exit 1 @endif target-liboffloadmic Jakub
Re: [PATCH][OpenMP] Fix named critical sections inside target functions
On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote: Hi, '#pragma omp critical (name)' can be placed in the function, marked with '#pragma omp declare target', in this case the corresponding node should be marked as offloadable too. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Please add a testcase for this. * omp-low.c (lower_omp_critical): Mark critical sections inside target functions as offloadable. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 3924282..6c5774c 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -9366,16 +9366,6 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, omp_context *ctx) DECL_ARTIFICIAL (decl) = 1; DECL_IGNORED_P (decl) = 1; - /* If '#pragma omp critical' is inside target region, the symbol must - be marked for offloading. */ - omp_context *octx; - for (octx = ctx-outer; octx; octx = octx-outer) - if (is_targetreg_ctx (octx)) - { - varpool_node::get_create (decl)-offloadable = 1; - break; - } - varpool_node::finalize_decl (decl); critical_name_mutexes-put (name, decl); @@ -9383,6 +9373,20 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, omp_context *ctx) else decl = *n; + /* If '#pragma omp critical' is inside target region or + inside function marked as offloadable, the symbol must be + marked as offloadable too. */ + omp_context *octx; + if (cgraph_node::get (current_function_decl)-offloadable) + varpool_node::get_create (decl)-offloadable = 1; + else + for (octx = ctx-outer; octx; octx = octx-outer) + if (is_targetreg_ctx (octx)) + { + varpool_node::get_create (decl)-offloadable = 1; + break; + } + lock = builtin_decl_explicit (BUILT_IN_GOMP_CRITICAL_NAME_START); lock = build_call_expr_loc (loc, lock, 1, build_fold_addr_expr_loc (loc, decl)); Jakub
Re: [PATCH][OpenMP] Fix named critical sections inside target functions
On Sat, Nov 22, 2014 at 12:08:38AM +0300, Ilya Verbin wrote: On 21 Nov 2014, at 23:36, Jakub Jelinek ja...@redhat.com wrote: On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote: Hi, '#pragma omp critical (name)' can be placed in the function, marked with '#pragma omp declare target', in this case the corresponding node should be marked as offloadable too. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Please add a testcase for this. By default with disabled offloading it will always PASS. Add anyway? Doesn't matter. We want to test it both with host fallback and offloaded to various devices. The latter of course will be tested only if somebody configures the 2 compilers that way, less often, but still. Jakub
Re: [PATCH v2] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow
On Sat, Nov 22, 2014 at 05:03:37AM +0800, Chen Gang wrote: According to the next code, 'pretty_name' may need additional bytes more than 16 (may have unlimited length for array type). There is an easy way to fix it: use 'pretty_print' for 'pretty_name'. Let the code meet 2 white spaces alignment coding styles (originally, some of code is 1 white sapce alignment). It passes testsuite under fedora 20 x86_64-unknown-linux-gnu. 2014-11-22 Chen Gang gang.chen.5...@gmail.com * ubsan.c (ubsan_type_descriptor): Use 'pretty_print' for 'pretty_name' to avoid memory overflow Add a . at the end. while (deref_depth-- 0) -pretty_name[pos++] = '*'; - pretty_name[pos++] = '\''; - pretty_name[pos] = '\0'; + pp_star(pretty_name); + pp_quote(pretty_name); Formatting, missing space before (. Happens many times in the patch. if (dom TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST) - pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC, - tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); + pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC, +tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); You don't know if TYPE_MAX_VALUE (dom) will fit into uhwi, and you are using signed printing anyway. You said that using pp_wide_int breaks too many tests, so perhaps do if (tree_fits_uhwi_p (TYPE_MAX_VALUE (dom)) tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1 != 0) pp_printf (..., HOST_WIDE_INT_PRINT_UNSIGNED, else pp_wide_int (..., wi::to_widest (TYPE_MAX_VALUE (dom)) + 1); ? Jakub
Re: [Patch,MIPS] Add default SYS_futex definition in libgomp
On Fri, Nov 21, 2014 at 02:30:06PM -0800, Steve Ellcey wrote: 2014-11-21 Steve Ellcey sell...@imgtec.com * config/linux/mips/futex.h (SYS_futex): Define if not already done. Ok. --- a/libgomp/config/linux/mips/futex.h +++ b/libgomp/config/linux/mips/futex.h @@ -25,6 +25,11 @@ /* Provide target-specific access to the futex system call. */ #include sys/syscall.h + +#if !defined(SYS_futex) +#define SYS_futex __NR_futex +#endif + #define FUTEX_WAIT 0 #define FUTEX_WAKE 1 Jakub
Re: [PATCH, PR 63551] Use proper type in evaluate_conditions_for_known_args
On Sat, Nov 22, 2014 at 12:09:46PM +0100, Martin Jambor wrote: 2014-11-21 Martin Jambor mjam...@suse.cz PR ipa/63551 * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Convert value of the argument to the type of the value in the condition. testsuite/ * gcc.dg/ipa/pr63551.c: New test. Index: src/gcc/ipa-inline-analysis.c === --- src.orig/gcc/ipa-inline-analysis.c +++ src/gcc/ipa-inline-analysis.c @@ -880,7 +880,10 @@ evaluate_conditions_for_known_args (stru } if (c-code == IS_NOT_CONSTANT || c-code == CHANGED) continue; - res = fold_binary_to_constant (c-code, boolean_type_node, val, c-val); + val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val); VCE should only be used if the sizes of the types are the same. Is that always the case here? Jakub
Re: [PATCH] Check for strtol, strtoul, strtoll and strtoull declarations
On Sat, Nov 22, 2014 at 09:40:58AM -0500, John David Anglin wrote: libiberty ChangeLog: 2014-11-22 John David Anglin dang...@gcc.gnu.org PR other/63694 * configure.ac: Check for strtol, strtoul, strtoll and strtoull declarations. * configure: Regenerated. gcc ChangeLog: 2014-11-22 John David Anglin dang...@gcc.gnu.org PR other/63694 * configure.ac: Check for strtol, strtoul, strtoll and strtoull declarations. * configure: Regenerated. * config.in: Regenerated. Looks reasonable to me, but I'll defer it to Ian as libiberty maintainer. Index: libiberty/configure.ac === --- libiberty/configure.ac(revision 217956) +++ libiberty/configure.ac(working copy) @@ -678,6 +678,7 @@ AC_CHECK_FUNCS($checkfuncs) AC_CHECK_DECLS([basename(char *), ffs, asprintf, vasprintf, snprintf, vsnprintf]) AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc, sbrk]) + AC_CHECK_DECLS([strtol, strtoul, strtoll, strtoull]) AC_CHECK_DECLS([strverscmp]) libiberty_NEED_DECLARATION(canonicalize_file_name) fi Index: gcc/configure.ac === --- gcc/configure.ac (revision 217956) +++ gcc/configure.ac (working copy) @@ -1213,6 +1213,7 @@ CXXFLAGS=$CXXFLAGS -I${srcdir} -I${srcdir}/../include $GMPINC gcc_AC_CHECK_DECLS(getenv atol asprintf sbrk abort atof getcwd getwd \ stpcpy strnlen strsignal strstr strverscmp \ + strtol strtoul strtoll strtoull \ errno snprintf vsnprintf vasprintf malloc realloc calloc \ free basename getopt clock getpagesize ffs gcc_UNLOCKED_FUNCS, , ,[ #include ansidecl.h Jakub
Re: [PATCH] pr63856 - test case
On Sun, Nov 23, 2014 at 01:32:27PM -0500, tsaund...@mozilla.com wrote: bug was already fixed, so just add the test case. tested this only passes with r217909 where it is fixed, ok? Trev diff --git a/gcc/testsuite/gcc.dg/pr63856.c b/gcc/testsuite/gcc.dg/pr63856.c new file mode 100644 index 000..8fb65c6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr63856.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fPIC } */ +typedef int v2si __attribute__ ((vector_size (8))); +typedef short v4hi __attribute__ ((vector_size (8))); + +int __attribute__ ((noinline, noclone)) f (v2si A, int N) Vector arguments or return values often result in -Wpsabi warnings, so you certainly want -Wno-psabi in dg-options. On the other side, -fPIC should be only used for { target pic }, so better put that into dg-additional-options. +{ + return ((v4hi) A)[N]; +} + +int __attribute__ ((noinline, noclone)) g (v2si A, int N) +{ + return ((v4hi) A)[N]; +} -- 2.1.3 Jakub
Re: [PATCH] pr63856 - test case
On Sun, Nov 23, 2014 at 03:42:43PM -0500, Trevor Saunders wrote: Vector arguments or return values often result in -Wpsabi warnings, so you certainly want -Wno-psabi in dg-options. On the other side, -fPIC should be only used for { target pic }, so better put that into dg-additional-options. ok, thanks how about this then? The __attribute__ ((noinline, noclone)) is useless, though not harmful, sorry for missing that. Ok for trunk either way. --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr63856.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -Wno-psabi } */ +/* { dg-additional-options -fPIC { target fpic } } */ +typedef int v2si __attribute__ ((vector_size (8))); +typedef short v4hi __attribute__ ((vector_size (8))); + +int __attribute__ ((noinline, noclone)) f (v2si A, int N) +{ + return ((v4hi) A)[N]; +} + +int __attribute__ ((noinline, noclone)) g (v2si A, int N) +{ + return ((v4hi) A)[N]; +} -- 2.1.3 Jakub
Re: [PATCH v3] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow
On Sun, Nov 23, 2014 at 09:13:27AM +0800, Chen Gang wrote: 2014-11-23 Chen Gang gang.chen.5...@gmail.com * ubsan.c (ubsan_type_descriptor): Use 'pretty_print' for 'pretty_name' to avoid memory overflow. Ok, with a small nit below. gcc/ubsan.c | 63 + 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/gcc/ubsan.c b/gcc/ubsan.c index b3d5343..3fceff7 100644 --- a/gcc/ubsan.c +++ b/gcc/ubsan.c @@ -369,7 +369,7 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle) tree dtype = ubsan_get_type_descriptor_type (); tree type2 = type; const char *tname = NULL; - char *pretty_name; + pretty_printer pretty_name; unsigned char deref_depth = 0; unsigned short tkind, tinfo; @@ -408,54 +408,58 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle) /* We weren't able to determine the type name. */ tname = unknown; - /* Decorate the type name with '', '*', struct, or union. */ - pretty_name = (char *) alloca (strlen (tname) + 16 + deref_depth); if (pstyle == UBSAN_PRINT_POINTER) { - int pos = sprintf (pretty_name, '%s%s%s%s%s%s%s, - TYPE_VOLATILE (type2) ? volatile : , - TYPE_READONLY (type2) ? const : , - TYPE_RESTRICT (type2) ? restrict : , - TYPE_ATOMIC (type2) ? _Atomic : , - TREE_CODE (type2) == RECORD_TYPE - ? struct - : TREE_CODE (type2) == UNION_TYPE -? union : , tname, - deref_depth == 0 ? : ); + pp_printf (pretty_name, '%s%s%s%s%s%s%s, + TYPE_VOLATILE (type2) ? volatile : , + TYPE_READONLY (type2) ? const : , + TYPE_RESTRICT (type2) ? restrict : , + TYPE_ATOMIC (type2) ? _Atomic : , + TREE_CODE (type2) == RECORD_TYPE + ? struct + : TREE_CODE (type2) == UNION_TYPE +? union : , tname, + deref_depth == 0 ? : ); while (deref_depth-- 0) -pretty_name[pos++] = '*'; - pretty_name[pos++] = '\''; - pretty_name[pos] = '\0'; + pp_star (pretty_name); + pp_quote (pretty_name); } else if (pstyle == UBSAN_PRINT_ARRAY) { /* Pretty print the array dimensions. */ gcc_assert (TREE_CODE (type) == ARRAY_TYPE); tree t = type; - int pos = sprintf (pretty_name, '%s , tname); + pp_printf (pretty_name, '%s , tname); while (deref_depth-- 0) -pretty_name[pos++] = '*'; + pp_star (pretty_name); while (TREE_CODE (t) == ARRAY_TYPE) { - pretty_name[pos++] = '['; + pp_left_bracket (pretty_name); tree dom = TYPE_DOMAIN (t); if (dom TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST) - pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC, + { + if (tree_fits_uhwi_p (TYPE_MAX_VALUE (dom)) +tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1 != 0) + pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC, tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); + else + pp_wide_int(pretty_name, + wi::add (wi::to_widest (TYPE_MAX_VALUE (dom)), 1), + TYPE_SIGN (TREE_TYPE (dom))); Space still missing before ( (and reindenting the following 2 lines). Jakub
Re: [PATCH v3] gcc/ubsan.c: Use 'pretty_print' for 'pretty_name' to avoid memory overflow
On Mon, Nov 24, 2014 at 04:28:10PM +0800, Chen Gang wrote: On 11/24/14 15:41, Jakub Jelinek wrote: On Sun, Nov 23, 2014 at 09:13:27AM +0800, Chen Gang wrote: [...] +else + pp_wide_int(pretty_name, + wi::add (wi::to_widest (TYPE_MAX_VALUE (dom)), 1), + TYPE_SIGN (TREE_TYPE (dom))); Space still missing before ( (and reindenting the following 2 lines). Oh, thanks, if necessary to send patch v4, please let me know. No, just fix it up before checking in. Jakub
Re: [C++ PATCH] Detect UB in shifts in constexpr functions
On Mon, Nov 24, 2014 at 02:51:14PM +0100, Marek Polacek wrote: --- gcc/cp/constexpr.c +++ gcc/cp/constexpr.c @@ -1451,6 +1451,43 @@ verify_constant (tree t, bool allow_non_constant, bool *non_constant_p, return *non_constant_p; } +/* Return true if the shift operation on LHS and RHS is undefined. */ + +static bool +cxx_eval_check_shift_p (enum tree_code code, tree lhs, tree rhs) +{ + if (code != LSHIFT_EXPR code != RSHIFT_EXPR) +return false; + + tree lhstype = TREE_TYPE (lhs); + unsigned HOST_WIDE_INT uprec = TYPE_PRECISION (TREE_TYPE (lhs)); + + /* [expr.shift] The behavior is undefined if the right operand + is negative, or greater than or equal to the length in bits + of the promoted left operand. */ + if (tree_int_cst_sgn (rhs) == -1 || compare_tree_int (rhs, uprec) = 0) +return true; I think VERIFY_CONSTANT doesn't guarantee both operands are INTEGER_CSTs. Consider say: constexpr int p = 1; constexpr int foo (int a) { return a (int) p; } constexpr int bar (int a) { return ((int) p) a; } constexpr int q = foo (5); constexpr int r = bar (2); constexpr int s = bar (0); Now, for foo (5) and bar (2) fold_binary_loc returns NULL and thus your cxx_eval_check_shift_p is not called, for bar (0) it returns non-NULL and while the result still is not a constant expression and right now is diagnosed, with your patch it would ICE. So, I'd just return false if either lhs or rhs are not INTEGER_CSTs. + + /* The value of E1 E2 is E1 left-shifted E2 bit positions; [...] + if E1 has a signed type and non-negative value, and E1x2^E2 is + representable in the corresponding unsigned type of the result type, + then that value, converted to the result type, is the resulting value; + otherwise, the behavior is undefined. */ + if (code == LSHIFT_EXPR !TYPE_UNSIGNED (lhstype)) +{ + if (tree_int_cst_sgn (lhs) == -1) + return true; + tree t = build_int_cst (unsigned_type_node, uprec - 1); + t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs); + tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs); + t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t); + if (tree_int_cst_lt (integer_one_node, t)) + return true; I'll leave to Jason whether this shouldn't be using the various cxx_eval_*_expression calls instead, or perhaps int_const_binop or wide_int stuff directly. Jakub
Re: [C++ PATCH] Detect UB in shifts in constexpr functions
On Mon, Nov 24, 2014 at 04:58:22PM +0100, Marek Polacek wrote: Consider say: constexpr int p = 1; constexpr int foo (int a) { return a (int) p; } constexpr int bar (int a) { return ((int) p) a; } constexpr int q = foo (5); constexpr int r = bar (2); constexpr int s = bar (0); Now, for foo (5) and bar (2) fold_binary_loc returns NULL and thus your cxx_eval_check_shift_p is not called, for bar (0) it returns non-NULL and while the result still is not a constant expression and right now is diagnosed, with your patch it would ICE. So, I'd just return false if either lhs or rhs are not INTEGER_CSTs. Ok, I'll add that. Thank for pointing that out. Note, the above only with -m32 obviously, or supposedly for bar you could change return type and the cast and type of r/s to __PTRDIFF_TYPE__ or so. foo, as gcc always casts the shift count to int, would supposedly need to stay the way it is and might produce different diagnostics between -m32 and -m64. + + /* The value of E1 E2 is E1 left-shifted E2 bit positions; [...] + if E1 has a signed type and non-negative value, and E1x2^E2 is + representable in the corresponding unsigned type of the result type, + then that value, converted to the result type, is the resulting value; + otherwise, the behavior is undefined. */ + if (code == LSHIFT_EXPR !TYPE_UNSIGNED (lhstype)) +{ + if (tree_int_cst_sgn (lhs) == -1) + return true; + tree t = build_int_cst (unsigned_type_node, uprec - 1); + t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs); + tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs); + t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t); + if (tree_int_cst_lt (integer_one_node, t)) + return true; I'll leave to Jason whether this shouldn't be using the various cxx_eval_*_expression calls instead, or perhaps int_const_binop or wide_int stuff directly. ISTR int_const_binop calls wide_int routines wi::rshift/wi::lshift and these return 0 and do not have any overflow flag, so that might not help (?). I meant for the above computation, there you don't check any overflow. Jakub
[committed] Fix !$omp atomic handling (PR fortran/63938)
Hi! This fixes Fortran !$omp atomic handling to match C/C++, goa_lhs_expr_p is not able to find out matches of more complex expressions after tree unsharing, so this patch helps it find what it needs. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. Will backport soon. 2014-11-24 Jakub Jelinek ja...@redhat.com PR fortran/63938 * trans-openmp.c (gfc_trans_omp_atomic): Make sure lhsaddr is simple enough for goa_lhs_expr_p. * libgomp.fortran/pr63938-1.f90: New test. * libgomp.fortran/pr63938-2.f90: New test. --- gcc/fortran/trans-openmp.c.jj 2014-10-10 13:07:42.0 +0200 +++ gcc/fortran/trans-openmp.c 2014-11-24 11:25:40.157171398 +0100 @@ -2683,6 +2683,18 @@ gfc_trans_omp_atomic (gfc_code *code) } lhsaddr = save_expr (lhsaddr); + if (TREE_CODE (lhsaddr) != SAVE_EXPR + (TREE_CODE (lhsaddr) != ADDR_EXPR + || TREE_CODE (TREE_OPERAND (lhsaddr, 0)) != VAR_DECL)) +{ + /* Make sure LHS is simple enough so that goa_lhs_expr_p can recognize +it even after unsharing function body. */ + tree var = create_tmp_var_raw (TREE_TYPE (lhsaddr), NULL); + DECL_CONTEXT (var) = current_function_decl; + lhsaddr = build4 (TARGET_EXPR, TREE_TYPE (lhsaddr), var, lhsaddr, + NULL_TREE, NULL_TREE); +} + rhs = gfc_evaluate_now (rse.expr, block); if (((atomic_code-ext.omp_atomic GFC_OMP_ATOMIC_MASK) --- libgomp/testsuite/libgomp.fortran/pr63938-1.f90.jj 2014-11-24 11:27:33.060113895 +0100 +++ libgomp/testsuite/libgomp.fortran/pr63938-1.f90 2014-11-24 11:28:13.339379860 +0100 @@ -0,0 +1,14 @@ +! PR fortran/63938 +! { dg-do run } + +program pr63938_1 + integer :: i, x(1) + x(1) = 0 +!$omp parallel do + do i = 1, 1000 +!$omp atomic +x(1) = x(1) + 1 + end do +!$omp end parallel do + if (x(1) .ne. 1000) call abort +end program pr63938_1 --- libgomp/testsuite/libgomp.fortran/pr63938-2.f90.jj 2014-11-24 11:28:22.836206793 +0100 +++ libgomp/testsuite/libgomp.fortran/pr63938-2.f90 2014-11-24 11:28:53.217653132 +0100 @@ -0,0 +1,18 @@ +! PR fortran/63938 +! { dg-do run } + +program pr63938_2 + type t +integer :: x + end type + integer :: i + type(t) :: x + x%x = 0 +!$omp parallel do + do i = 1, 1000 +!$omp atomic +x%x = x%x + 1 + end do +!$omp end parallel do + if (x%x .ne. 1000) call abort +end program pr63938_2 Jakub
[PATCH] Fix linemap_line_start (PR preprocessor/60436)
Hi! As mentioned in the PR, when preprocessing very large files, if there are huge numbers of lines where no #line is emitted, we might not detect overflowinging into adhoc locations. Apparently in the add_map case we already handle that fine, by first stopping tracking columns and after another 256M lines give up on tracking locations, so this patch just makes sure we enter that path if going over those limits. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-24 Jakub Jelinek ja...@redhat.com PR preprocessor/60436 * line-map.c (linemap_line_start): If highest is above 0x6000 and we are still tracking columns or highest is above 0x7000, force add_map. --- libcpp/line-map.c.jj2014-11-12 08:06:57.0 +0100 +++ libcpp/line-map.c 2014-11-24 12:14:52.691276169 +0100 @@ -529,10 +529,10 @@ linemap_line_start (struct line_maps *se line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) 1000) || (max_column_hint = (1U ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map))) || (max_column_hint = 80 - ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) = 10)) -{ - add_map = true; -} + ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) = 10) + || (highest 0x6000 + (set-max_column_hint || highest 0x7000))) +add_map = true; else max_column_hint = set-max_column_hint; if (add_map) @@ -543,7 +543,7 @@ linemap_line_start (struct line_maps *se /* If the column number is ridiculous or we've allocated a huge number of source_locations, give up on column numbers. */ max_column_hint = 0; - if (highest 0x7000) + if (highest 0x7000) return 0; column_bits = 0; } Jakub
[PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)
Hi! The fallback delegitimization I've added as last option mainly for debug info purposes, when we don't know if the base is a PIC register or say a PIC register plus some addend, unfortunately in some tests broke find_base_term, which for PLUS looks only at the first operand and recursion on it finds a base term, it returns it immediately. So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base term is actually in the second operand. This patch fixes it by swapping the operands, debug info doesn't care about the order, it won't match in any instruction anyway, but helps alias.c. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-24 Jakub Jelinek ja...@redhat.com PR lto/64025 * config/i386/i386.c (ix86_delegitimize_address): Ensure result comes before (addend - _GLOBAL_OFFSET_TABLE_) term. --- gcc/config/i386/i386.c.jj 2014-11-24 10:26:26.0 +0100 +++ gcc/config/i386/i386.c 2014-11-24 20:17:27.659091709 +0100 @@ -14848,7 +14848,7 @@ ix86_delegitimize_address (rtx x) ... movl foo@GOTOFF(%ecx), %edx in which case we return (%ecx - %ebx) + foo -or (%ecx - _GLOBAL_OFFSET_TABLE_) + foo if pseudo_pic_reg +or foo + (%ecx - _GLOBAL_OFFSET_TABLE_) if pseudo_pic_reg and reload has completed. */ if (pic_offset_table_rtx (!reload_completed || !ix86_use_pseudo_pic_reg ())) @@ -14859,7 +14859,14 @@ ix86_delegitimize_address (rtx x) { rtx tmp = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME); tmp = gen_rtx_MINUS (Pmode, copy_rtx (addend), tmp); - result = gen_rtx_PLUS (Pmode, tmp, result); + /* The order of the operands here is very important. find_base_term +will only recurse into the first operand of PLUS if none of the +arguments is REG with REG_POINTER_P set on it, and if that finds +base term, it doesn't recurse into the second operand of PLUS. +We don't want find_base_term to return the artificial +_GLOBAL_OFFSET_TABLE_ symbol, so ensure it goes into the +second operand. */ + result = gen_rtx_PLUS (Pmode, result, tmp); } else return orig_x; Jakub
[PATCH] Fix building of gengtype
Hi! My last 2 bootstraps failed, both because of a race while building host gengtype (each time different gengtype*.o). Looking at the history, we have two versions of gengtype, one in build/, another one in . (host one). Initially, the copy in build/ was built with -DGENERATOR_FILE, included bconfig.h, the other did not and included config.h. Then, Steven noticed that eventhough gengtype is built for host, it really for many reasons need GENERATOR_FILE define and defined it in the Makefile. Except that change resulted in always including bconfig.h even in host gengtype, which is presumably undesirable. After another half a year Marcus probably hit similar race like I did tonight, make trying to build host gengtype*.o when bconfig.h has not been generated yet, and added for gengtype-lex.o $(BCONFIG_H) dependency (but not to any other of the gengtype*.o files). Note, config.h has an #error for #ifdef GENERATOR_FILE, to prevent people from including config.h in build/gen* stuff by mistake. So, this patch fixes this by just using a different define and defines GENERATOR_FILE only after including config.h in the host gengtype* object files. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-25 Jakub Jelinek ja...@redhat.com * Makefile.in (gengtype-lex.o): Drop dependency on $(BCONFIG_H). (CFLAGS-gengtype-lex.o, CFLAGS-gengtype-parse.o, CFLAGS-gengtype-state.o, CFLAGS-gengtype.o): Add -DHOST_GENGTYPE instead of -DGENERATOR_FILE. * gengtype.c: Instead of testing GENERATOR_FILE define, test HOST_GENGTYPE. If defined, include config.h and define GENERATOR_FILE afterwards, otherwise include bconfig.h. * gengtype-parse.c, gengtype-state.c, gengtype-lex.l): Likewise. --- gcc/Makefile.in.jj 2014-11-18 18:17:53.0 +0100 +++ gcc/Makefile.in 2014-11-24 21:43:47.960196848 +0100 @@ -2485,27 +2485,27 @@ build/gengenrtl.o : gengenrtl.c $(BCONFI # the build-%: rule doesn't apply to them. gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H) -gengtype-lex.o: $(CONFIG_H) $(BCONFIG_H) -CFLAGS-gengtype-lex.o += -DGENERATOR_FILE +gengtype-lex.o: $(CONFIG_H) +CFLAGS-gengtype-lex.o += -DHOST_GENGTYPE build/gengtype-lex.o: $(BCONFIG_H) gengtype-parse.o build/gengtype-parse.o : gengtype-parse.c gengtype.h \ $(SYSTEM_H) gengtype-parse.o: $(CONFIG_H) -CFLAGS-gengtype-parse.o += -DGENERATOR_FILE +CFLAGS-gengtype-parse.o += -DHOST_GENGTYPE build/gengtype-parse.o: $(BCONFIG_H) gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \ gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \ $(XREGEX_H) gengtype-state.o: $(CONFIG_H) -CFLAGS-gengtype-state.o += -DGENERATOR_FILE +CFLAGS-gengtype-state.o += -DHOST_GENGTYPE build/gengtype-state.o: $(BCONFIG_H) gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\ rtl.def insn-notes.def errors.h double-int.h version.h \ $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H) gengtype.o: $(CONFIG_H) -CFLAGS-gengtype.o += -DGENERATOR_FILE +CFLAGS-gengtype.o += -DHOST_GENGTYPE build/gengtype.o: $(BCONFIG_H) build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\ --- gcc/gengtype.c.jj 2014-11-20 17:06:24.0 +0100 +++ gcc/gengtype.c 2014-11-24 21:44:30.746446814 +0100 @@ -17,10 +17,11 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -#ifdef GENERATOR_FILE -#include bconfig.h -#else +#ifdef HOST_GENGTYPE #include config.h +#define GENERATOR_FILE 1 +#else +#include bconfig.h #endif #include system.h #include errors.h/* for fatal */ --- gcc/gengtype-parse.c.jj 2014-09-25 15:03:05.0 +0200 +++ gcc/gengtype-parse.c2014-11-24 21:45:19.905585065 +0100 @@ -17,10 +17,11 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -#ifdef GENERATOR_FILE -#include bconfig.h -#else +#ifdef HOST_GENGTYPE #include config.h +#define GENERATOR_FILE 1 +#else +#include bconfig.h #endif #include system.h #include gengtype.h --- gcc/gengtype-state.c.jj 2014-09-25 15:01:57.0 +0200 +++ gcc/gengtype-state.c2014-11-24 21:44:59.293946382 +0100 @@ -23,10 +23,11 @@ and Basile Starynkevitch bas...@starynkevitch.net */ -#ifdef GENERATOR_FILE -#include bconfig.h -#else +#ifdef HOST_GENGTYPE #include config.h +#define GENERATOR_FILE 1 +#else +#include bconfig.h #endif #include system.h #include errors.h/* For fatal. */ --- gcc/gengtype-lex.l.jj 2014-09-25 15:02:45.0 +0200 +++ gcc/gengtype-lex.l 2014-11-24 21:45:41.486206761 +0100 @@ -21,10 +21,11 @@ along with GCC; see the file COPYING3. %option noinput %{ -#ifdef GENERATOR_FILE -#include bconfig.h -#else +#ifdef HOST_GENGTYPE #include config.h +#define GENERATOR_FILE 1 +#else +#include bconfig.h #endif #include system.h Jakub
Re: [PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025)
On Tue, Nov 25, 2014 at 09:13:10AM +0100, Uros Bizjak wrote: On Tue, Nov 25, 2014 at 8:40 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Nov 25, 2014 at 12:25 AM, Jakub Jelinek ja...@redhat.com wrote: The fallback delegitimization I've added as last option mainly for debug info purposes, when we don't know if the base is a PIC register or say a PIC register plus some addend, unfortunately in some tests broke find_base_term, which for PLUS looks only at the first operand and recursion on it finds a base term, it returns it immediately. So, it found base term of _GLOBAL_OFFSET_TABLE_, when the right base term is actually in the second operand. This patch fixes it by swapping the operands, debug info doesn't care about the order, it won't match in any instruction anyway, but helps alias.c. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-24 Jakub Jelinek ja...@redhat.com PR lto/64025 * config/i386/i386.c (ix86_delegitimize_address): Ensure result comes before (addend - _GLOBAL_OFFSET_TABLE_) term. Can you also swap operands of (%ecx - %ebx) + foo? There is no point digging into RTX involving registers only when we know that we are looking for foo. This will also be consistent with the code you patched below. Something like attached prototype patch. Actually, thinking about it more, at least according to commutative_operand_precedence the canonical order is what we used to return (i.e. (something - _G_O_T_) + (symbol_ref) or (something - _G_O_T_) + (const (symbol_ref +- const)) So perhaps better fix is to follow find_base_value, which does something like: /* Guess which operand is the base address: If either operand is a symbol, then it is the base. If either operand is a CONST_INT, then the other is the base. */ if (CONST_INT_P (src_1) || CONSTANT_P (src_0)) return find_base_value (src_0); else if (CONST_INT_P (src_0) || CONSTANT_P (src_1)) return find_base_value (src_1); and do something similar in find_base_term too. I.e. perhaps even with higher precedence over REG_P with REG_POINTER (or lower, in these cases it doesn't really matter, neither argument is REG_P), choose first operand that is CONSTANT_P and not CONST_INT_P. Jakub
Re: [PATCH] Fix building of gengtype
On Tue, Nov 25, 2014 at 12:35:09AM +0100, Jakub Jelinek wrote: My last 2 bootstraps failed, both because of a race while building host gengtype (each time different gengtype*.o). Found bootstrap failures even with this patch (dunno what changed on my box that I started getting these last night, make has not changed), that time with errors.o and gcc-ar.o. The generated headers are solved these days in automatic dependencies world through # In order for parallel make to really start compiling the expensive # objects from $(OBJS) as early as possible, build all their # prerequisites strictly before all objects. $(ALL_HOST_OBJS) : | $(generated_files) and build/*.o have explicit dependencies. I've tried to compare $(ALL_HOST_OBJS) on my box with all *.o */*.o files I had in stage3 directory, and besides build/*.o, I found: crtbegin.o crtbeginS.o crtbeginT.o crtend.o crtendS.o crtfastmath.o crtprec32.o crtprec64.o crtprec80.o errors.o gcc-ar.o gcc-nm.o gcc-ranlib.o gengtype-lex.o gengtype.o gengtype-parse.o gengtype-state.o not being listed in ALL_HOST_OBJS. The crt*.o files come from libgcc build and thus are ok, the rest I've tried to handle in the following updated patch. If the #define GENERATOR_FILE inside of the 5 files is too ugly, another alternative might be to define both -DHOST_GENERATOR_FILE -DGENERATOR_FILE in Makefile.in and don't error in config.h if GENERATOR_FILE is defined, if HOST_GENERATOR_FILE is also defined. 2014-11-25 Jakub Jelinek ja...@redhat.com * Makefile.in (ALL_HOST_BACKEND_OBJS): Add $(GENGTYPE_OBJS), gcc-ar.o, gcc-nm.o and gcc-ranlib.o. (GENGTYPE_OBJS): New. (gengtype-lex.o, gengtype-parse.o, gengtype-state.o, gengtype.o): Remove explicit dependencies. (CFLAGS-gengtype-lex.o, CFLAGS-gengtype-parse.o, CFLAGS-gengtype-state.o, CFLAGS-gengtype.o): Add -DHOST_GENERATOR_FILE instead of -DGENERATOR_FILE. (CFLAGS-errors.o): New. * gengtype.c: Instead of testing GENERATOR_FILE define, test HOST_GENERATOR_FILE. If defined, include config.h and define GENERATOR_FILE afterwards, otherwise include bconfig.h. * gengtype-parse.c: Likewise. * gengtype-state.c: Likewise. * gengtype-lex.l: Likewise. * errors.c: Likewise. --- gcc/Makefile.in.jj 2014-11-25 00:06:43.122178737 +0100 +++ gcc/Makefile.in 2014-11-25 08:55:34.727300843 +0100 @@ -1509,7 +1509,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(C ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \ $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \ $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \ - $(GCOV_TOOL_OBJS) lto-wrapper.o collect-utils.o + $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \ + lto-wrapper.o collect-utils.o # This lists all host object files, whether they are included in this # compilation or not. @@ -2484,30 +2485,31 @@ build/gengenrtl.o : gengenrtl.c $(BCONFI # on BCONFIG_H. For the build objects, add -DGENERATOR_FILE manually, # the build-%: rule doesn't apply to them. +GENGTYPE_OBJS = gengtype.o gengtype-parse.o gengtype-state.o \ + gengtype-lex.o errors.o + gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H) -gengtype-lex.o: $(CONFIG_H) $(BCONFIG_H) -CFLAGS-gengtype-lex.o += -DGENERATOR_FILE +CFLAGS-gengtype-lex.o += -DHOST_GENERATOR_FILE build/gengtype-lex.o: $(BCONFIG_H) gengtype-parse.o build/gengtype-parse.o : gengtype-parse.c gengtype.h \ $(SYSTEM_H) -gengtype-parse.o: $(CONFIG_H) -CFLAGS-gengtype-parse.o += -DGENERATOR_FILE +CFLAGS-gengtype-parse.o += -DHOST_GENERATOR_FILE build/gengtype-parse.o: $(BCONFIG_H) gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \ gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \ $(XREGEX_H) -gengtype-state.o: $(CONFIG_H) -CFLAGS-gengtype-state.o += -DGENERATOR_FILE +CFLAGS-gengtype-state.o += -DHOST_GENERATOR_FILE build/gengtype-state.o: $(BCONFIG_H) gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\ rtl.def insn-notes.def errors.h double-int.h version.h \ $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H) -gengtype.o: $(CONFIG_H) -CFLAGS-gengtype.o += -DGENERATOR_FILE +CFLAGS-gengtype.o += -DHOST_GENERATOR_FILE build/gengtype.o: $(BCONFIG_H) +CFLAGS-errors.o += -DHOST_GENERATOR_FILE + build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\ errors.h $(READ_MD_H) build/genmodes.o : genmodes.c $(BCONFIG_H) $(SYSTEM_H) errors.h \ --- gcc/gengtype.c.jj 2014-11-21 10:17:06.135695325 +0100 +++ gcc/gengtype.c 2014-11-25 08:56:18.042523089 +0100 @@ -17,10 +17,11 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -#ifdef GENERATOR_FILE -#include bconfig.h -#else +#ifdef HOST_GENERATOR_FILE #include config.h +#define GENERATOR_FILE 1 +#else +#include bconfig.h
Re: [PATCH] Remove unnecessary calls to strchr.
On Tue, Nov 25, 2014 at 03:15:04PM +0300, Ilya Tocar wrote: As proposed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853 this patch replaces some function calls with pointer arithmetic. I didn't mention PR in Changelog, as they are not actually related. Ok for trunk? @@ -3408,8 +3408,7 @@ handle_foffload_option (const char *arg) if (n == NULL) n = strchr (c, '\0'); - if (strlen (target) == (size_t) (n - c) -strncmp (target, c, n - c) == 0) + if (next - cur == n - c strncmp (target, c, n - c) == 0) I suppose you could use memcmp here, you know the string lengths. @@ -3431,8 +3433,7 @@ handle_foffload_option (const char *arg) if (n == NULL) n = strchr (c, '\0'); - if (strlen (target) == (size_t) (n - c) -strncmp (c, target, n - c) == 0) + if (next - cur == n - c strncmp (c, target, n - c) == 0) break; And here too. Ok with or without those changes. Jakub
Re: Document __builtin_*_overflow
On Tue, Nov 25, 2014 at 07:50:02PM +0100, Gerald Pfeifer wrote: Hi Jakub, On Wednesday 2014-11-12 14:13, Jakub Jelinek wrote: This patch mentions __builtin_*_overflow in gcc-5/changes.html. Ok for CVS? I've fallen a bit behind with GCC patches, sorry. What do you think about this follow-up patch on top of yours? LGTM, thanks. --- changes.html 23 Nov 2014 14:42:28 - 1.41 +++ changes.html 25 Nov 2014 18:49:02 - @@ -157,14 +157,14 @@ These builtins have two integral arguments (which don't need to have the same type), the arguments are extended to infinite precision signed type, code+/code, code-/code or code*/code - is performed on those and the result is stored into some integer - variable pointed by the last argument. If the stored value is equal - to the infinite precision result, the built-in functions return + is performed on those, and the result is stored in an integer + variable pointed to by the last argument. If the stored value is + equal to the infinite precision result, the built-in functions return codefalse/code, otherwise codetrue/code. The type of the integer variable that will hold the result can be different from - the types of arguments. The following snippet demonstrates how - this can be used in computing the size for the codecalloc/code - function: + the types of the first two arguments. The following snippet + demonstrates how this can be used in computing the size for the + codecalloc/code function: blockquotepre void * calloc (size_t x, size_t y) @@ -177,8 +177,8 @@ return ret; } /pre/blockquote - On e.g. i?86 or x86-64 the above will result in codemul/code - instruction followed by jump on overflow. + On e.g. i?86 or x86-64 the above will result in a codemul/code + instruction followed by a jump on overflow. /li liThe option code-fextended-identifiers/code is now enabled by default for C++, and for C99 and later C versions. Various Gerald Jakub
Re: [PATCH] gcc parallel make check
On Tue, Nov 25, 2014 at 03:27:40PM +0100, Tom de Vries wrote: This patch fixes that by ensuring that we print that unsupported message only once. The resulting test result comparison diff is: 2014-11-25 Tom de Vries t...@codesourcery.com * testsuite/libstdc++-prettyprinters/prettyprinters.exp: Add missing dg-finish. Only print unsupported message once. LGTM. --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp @@ -30,7 +30,14 @@ if ![info exists ::env(GUALITY_GDB_NAME)] { } if {! [gdb_version_check]} { +dg-finish +# Only print unsupported message in one instance. +if ![gcc_parallel_test_run_p prettyprinters] { + return +} +gcc_parallel_test_enable 0 unsupported prettyprinters.exp +gcc_parallel_test_enable 1 return } -- 1.9.1 Jakub
Re: [PATCH v3] gcc/c-family/c-cppbuiltin.c: Let buffer enough to print host wide integer value
On Wed, Nov 26, 2014 at 09:41:16AM +0800, Chen Gang wrote: On 11/26/14 8:31, Joseph Myers wrote: On Wed, 26 Nov 2014, Chen Gang wrote: + gcc_assert (wi::fits_to_tree_p (value, char_type_node) +|| wi::fits_to_tree_p (value, short_integer_type_node) +|| wi::fits_to_tree_p (value, integer_type_node) +|| wi::fits_to_tree_p (value, long_integer_type_node) +|| wi::fits_to_tree_p (value, long_long_integer_type_node)); It doesn't make sense to check for char or short, since you can't write a constant of one of those types. And it doesn't make sense to check for int or long when checking for long long, as the ranges of int and long are subsets of that of long long. So just check long long here. + buf = (char *) alloca (strlen (macro) + vlen + extra); + + sprintf (buf, %s=%sHOST_WIDE_INT_PRINT_DEC%s%s, Oh, and please use spaces around HOST_WIDE_INT_PRINT_DEC etc., with C++11 user defined literals it is always better to have whitespace in there. Jakub
Patch ping^2: [PATCH] -fsanitize=vptr instrumentation (take 2)
On Wed, Nov 12, 2014 at 03:05:46PM +0100, Jakub Jelinek wrote: On Tue, Oct 28, 2014 at 01:44:50PM +0100, Jakub Jelinek wrote: On Mon, Oct 27, 2014 at 05:16:05PM +0100, Jakub Jelinek wrote: Here is an updated patch, ok if bootstrap/testing passes (so far just checked with make -j16 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp tsan.exp ubsan.exp' )? I'd like to ping the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02945.html patch. Ping. Jakub
Re: [PATCH] Enhance ASAN_CHECK optimization
On Tue, Nov 25, 2014 at 08:06:00PM +0300, Yury Gribov wrote: +/* Traits class for tree hash maps below. */ + +struct tree_map_traits : default_hashmap_traits +{ + static inline hashval_t hash (const_tree ref) +{ + return iterative_hash_expr (ref, 0); +} + + static inline bool equal_keys (const_tree ref1, const_tree ref2) +{ + return operand_equal_p (ref1, ref2, 0); +} Formatting. The {} should be indented like static and return 2 columns to the right of that. @@ -281,37 +316,46 @@ maybe_optimize_asan_check_ifn (struct sanopt_ctx *ctx, gimple stmt) gimple_set_uid (stmt, info-freeing_call_events); - auto_vecgimple v = ctx-asan_check_map.get_or_insert (ptr); - if (v.is_empty ()) + auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr); + gimple g = maybe_get_dominating_check (*ptr_checks); + + tree base_addr = maybe_get_single_definition (ptr); + auto_vecgimple *base_checks = NULL; + if (base_addr) { - /* For this PTR we don't have any ASAN_CHECK stmts recorded, so there's - nothing to optimize yet. */ - v.safe_push (stmt); - return false; + base_checks = ctx-asan_check_map.get_or_insert (base_addr); + /* Original pointer might have been invalidated. */ + ptr_checks = ctx-asan_check_map.get (ptr); } For base_addr computation, you don't really need g or ptr_checks, do you? So why not move the: auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr); gimple g = maybe_get_dominating_check (*ptr_checks); lines below the if? @@ -404,10 +445,7 @@ sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx) basic_block son; gimple_stmt_iterator gsi; sanopt_info *info = (sanopt_info *) bb-aux; - bool asan_check_optimize -= (flag_sanitize SANITIZE_ADDRESS) - ((flag_sanitize flag_sanitize_recover - SANITIZE_KERNEL_ADDRESS) == 0); + bool asan_check_optimize = (flag_sanitize SANITIZE_ADDRESS) != 0; for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) { I'm afraid I'm not convinced about this hunk. If asan (kernel-address) is recovering, I don't see a difference from not reporting two different invalid accesses to the same function and not reporting two integer overflows in the same function, at least if they have different location_t. Jakub
Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.
On Wed, Nov 26, 2014 at 11:19:42AM +0100, Mark Wielaard wrote: Ping. Rebased patch attached. I have submitted patches for elfutils, valgrind, gdb and binutils for this. But they are pending till this patch hits GCC first. Ok, thanks. Jakub
Re: [PATCH] Fix PR bootstrap/63995
On Wed, Nov 26, 2014 at 03:41:46PM +0300, Ilya Enkovich wrote: Hi, This patch makes optimization for bounds lifetime reduction to ignore debug stetments. This fixes stage2 and stage3 comparision for instrumented boostrap. OK for trunk? Please add a small testcase (with -fcompare-debug -fcheck-pointer-bounds (or what other options you need to reproduce it) that fails without the patch and succeeds with it. 2014-11-26 Ilya Enkovich ilya.enkov...@intel.com PR bootstrap/63995 * tree-chkp-opt.c (chkp_reduce_bounds_lifetime): Ignore debug statement when searching for a new position for bounds load/creation statement. diff --git a/gcc/tree-chkp-opt.c b/gcc/tree-chkp-opt.c index ff390d7..92e0694 100644 --- a/gcc/tree-chkp-opt.c +++ b/gcc/tree-chkp-opt.c @@ -1175,6 +1175,9 @@ chkp_reduce_bounds_lifetime (void) FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op) { + if (is_gimple_debug (use_stmt)) + continue; + if (dom_bb dominated_by_p (CDI_DOMINATORS, dom_bb, gimple_bb (use_stmt))) Jakub
Re: [PATCH] Fix PR bootstrap/63995
On Wed, Nov 26, 2014 at 04:48:13PM +0300, Ilya Enkovich wrote: 2014-11-26 Ilya Enkovich ilya.enkov...@intel.com PR bootstrap/63995 * tree-chkp-opt.c (chkp_reduce_bounds_lifetime): Ignore debug statement when searching for a new position for bounds load/creation statement. gcc/testsuite/ 2014-11-26 Ilya Enkovich ilya.enkov...@intel.com PR bootstrap/63995 * gcc.target/i386/pr63995-2.c: New. Ok, thanks. Jakub
Re: [PATCH] Fix sanitizer/63788
On Wed, Nov 26, 2014 at 02:57:20PM +0100, Marek Polacek wrote: Ping. Ok, thanks. On Wed, Nov 19, 2014 at 08:09:21PM +0100, Marek Polacek wrote: As discussed in the PR, the problem here is that when running gfortran, the __builtin_object_size decl isn't available, because c-family's c_define_builtins wasn't called. One way how to fix this is to build the __builtin_object_size decl in initialize_sanitizer_builtins, if needed. Alternatively we could just bail in instrument_object_size if builtin_decl_explicit (BUILT_IN_OBJECT_SIZE) returns NULL_TREE... No test attached since we don't have any Fortran ubsan infrastructure, I've just tried ./xgcc -B ./ -B ../x86_64-unknown-linux-gnu/libsanitizer/ubsan/.libs/ -Wl,-rpath=../x86_64-unknown-linux-gnu/libsanitizer/ubsan/.libs/ -B ../x86_64-unknown-linux-gnu/libgfortran/.libs/ -O -fsanitize=undefined testcase.f -lgfortran; ./a.out on the testcase attached in PR - and it doesn't ICE. Bootstrapped/regtested on ppc64-linux, ok for trunk? 2014-11-19 Marek Polacek pola...@redhat.com PR sanitizer/63788 * asan.c (initialize_sanitizer_builtins): Add BT_FN_SIZE_CONST_PTR_INT var. Conditionally build BUILT_IN_OBJECT_SIZE decl. (ATTR_PURE_NOTHROW_LEAF_LIST): Define. Jakub
Re: [PATCH] Enhance array types debug info. for Ada
On Wed, Oct 08, 2014 at 09:05:30PM +0200, Pierre-Marie de Rodat wrote: gcc/ * dwarf2out.h (struct array_descr_info): Remove the base_decl field. * dwarf2out.c (enum dw_scalar_form): New. (struct loc_descr_context): New. (add_scalar_info): New. (add_bound_info): Add a context parameter. Use add_scalar_info. (loc_list_from_tree): Add a context parameter. Handle PLACEHOLDER_EXPR nodes for type-related expressions. Likewise for base declarations. (loc_descriptor_from_tree): Add a context parameter. (subrange_type_die): Update calls to add_bound_info. (tls_mem_loc_descriptor): Likewise. (loc_list_for_address_of_addr_expr_of_indirect_ref): Add a context parameter. Update calls to loc_list_from_tree. (add_subscript_info): Update calls to add_bound_info. (gen_array_type_die): Update calls to loc_list_from_tree and to add_bound_info. (descr_info_loc): Remove. (add_descr_info_field): Remove. (gen_descr_array_type_die): Switch add_descr_info_field calls into add_scalar_info/add_bound_info ones. (gen_subprogram_die): Update calls to loc_list_from_tree. (gen_variable_die): Likewise. Replace implicitely with implicitly in the whole patch. + to refer to register values). + + CONTEXT provides information to customize the location descriptions + generation. Its context_type field specifies what type is implicitely + referenced by DW_OP_push_object_address. If it is NULL_TREE, this operation + will not be generated. + + If CONTEXT is NULL, the behavior is the same as if the context_type field + was NULL_TREE. */ as if both context_type and base_decl were NULL_TREE? @@ -14311,6 +14351,12 @@ loc_list_from_tree (tree loc, int want_address) extending the values properly. Hopefully this won't be a real problem... */ + if (context != NULL + context-base_decl == loc + want_address == 0) +return new_loc_list (new_loc_descr (DW_OP_push_object_address, 0, 0), + NULL, NULL, NULL); + This isn't guarded with dwarf_version = 3 || !dwarf_strict. Shouldn't it be too and return NULL otherwise? + expansion_failed (loc, NULL_RTX, + PLACEHOLDER_EXPR for a unexpected type); for an unexpected type? @@ -14533,7 +14594,8 @@ loc_list_from_tree (tree loc, int want_address) list_ret = loc_list_from_tree (obj, want_address == 2 - !bitpos !offset ? 2 : 1); + !bitpos !offset ? 2 : 1, + context); Formatting. Should use tabs, not spaces. + if (prec = HOST_BITS_PER_WIDE_INT +|| tree_fits_uhwi_p (value)) Formatting. || should be below p in prec. Would be nice if you tried more than one fortran testcase, say build all gfortran.dg/ tests with -O0 -g -dA (and perhaps -O2 -g -dA afterwards) with both unpatched and patched compilers and diff *.s files? Otherwise, LGTM. Jakub
Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)
On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote: Both C11 and C++14 standards specify that integral promotions are performed on both operands of a shift-expression. This we do just fine. But then we convert the right operand to integer_type_node. Not only is this unnecessary, it can also be harfmul, because for e.g. void foo (unsigned int x) { if (-1 x != -1) bar (); } with (int) x we lose info that x is nonnegative, which means that tree_expr_nonnegative_p cannot fold this expr. Another problem with the conversion is that we weren't able to detect e.g. shift by 0x1ULL, since after the conversion this is 0. Have you checked what the expander does with it? Say trying to shift something by __int128 count or similar might upset it. Wonder about if we don't make similar assumptions in the middle-end. It might make a difference (sometimes positive, sometimes negative) for vectorization too. 2014-11-26 Marek Polacek pola...@redhat.com PR c/63862 c-family/ * c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR to op1_utype. This part is ok regardless of the rest. Jakub
Re: [PATCH] Enhance ASAN_CHECK optimization
On Wed, Nov 26, 2014 at 11:21:06AM -0700, ygribov wrote: Formatting. The {} should be indented like static and return 2 columns to the right of that. Right. For base_addr computation, you don't really need g or ptr_checks, do you? So why not move the: auto_vecgimple *ptr_checks = ctx-asan_check_map.get_or_insert (ptr); gimple g = maybe_get_dominating_check (*ptr_checks); lines below the if? I can do this. But then base_checks would be invalidated when I call get_or_insert for ptr_checks so I'll still have to hash_map::get. You're right. If asan (kernel-address) is recovering, I don't see a difference from not reporting two different invalid accesses to the same function and not reporting two integer overflows in the same function, at least if they have different location_t. Ok, agreed. BTW how about replacing ' SANITIZE_KERNEL_ADDRESS' with ' SANITIZE_ADDRESS'? I know we do not support recovery for userspace but having a general enum sounds more logical. Testing SANITIZE_ADDRESS bit in flag_sanitize_recover doesn't make sense, testing it in flag_sanitize of course does, but for recover you care whether the SANITIZE_{KERNEL,USER}_ADDRESS bit in flag_sanitize_recover is set depending on if SANITIZE_{KERNEL,USER}_ADDRESS is set in flag_sanitize_recover. So supposedly ((flag_sanitize flag_sanitize_recover) (SANITIZE_USER_ADDRESS | SANITIZE_KERNEL_ADDRESS)) != 0 is the right check for whether the current address sanitization wants to recover. Jakub
Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)
On Wed, Nov 26, 2014 at 07:20:04PM +0100, Marek Polacek wrote: On Wed, Nov 26, 2014 at 06:50:29PM +0100, Jakub Jelinek wrote: On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote: Both C11 and C++14 standards specify that integral promotions are performed on both operands of a shift-expression. This we do just fine. But then we convert the right operand to integer_type_node. Not only is this unnecessary, it can also be harfmul, because for e.g. void foo (unsigned int x) { if (-1 x != -1) I'd say fold-const not simplifying this is a bug even if x is signed int. I think negative shift counts are undefined even in the middle-end, Richard? Treating 5 -5 as 5 5 looks just wrong to me, does any FE rely on that? I don't think the expander will expand it that way though. Anyway, if the shift count is unnecessary wide type, then the middle-end won't be able to optimize it as much as it could if it has been a narrower type. So, for ubsan purposes (but, shifts are instrumented in the FEs) it is of course desirable to see the original type, but perhaps it might be a good idea to cast the shift count to integer_type_node say during gimplification (perhaps in c_gimplify_expr?). Jakub
Re: [PATCH] Enhance ASAN_CHECK optimization
On Wed, Nov 26, 2014 at 11:42:57AM -0700, ygribov wrote: Testing SANITIZE_ADDRESS bit in flag_sanitize_recover doesn't make sense, testing it in flag_sanitize of course does, but for recover you care whether the SANITIZE_{KERNEL,USER}_ADDRESS bit in flag_sanitize_recover is set depending on if SANITIZE_{KERNEL,USER}_ADDRESS is set in flag_sanitize_recover. Ok, got it. BTW shouldn't we disable local optimization of ASan checks (in asan.c) as well? That would be a massive perf hit ... Ah, you're right, we are already doing that. So let's just optimize always even when recovering and we'll see if users don't complain too much. Jakub
[PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025, take 2)
On Tue, Nov 25, 2014 at 09:20:13AM +0100, Jakub Jelinek wrote: Actually, thinking about it more, at least according to commutative_operand_precedence the canonical order is what we used to return (i.e. (something - _G_O_T_) + (symbol_ref) or (something - _G_O_T_) + (const (symbol_ref +- const)) So perhaps better fix is to follow find_base_value, which does something like: /* Guess which operand is the base address: If either operand is a symbol, then it is the base. If either operand is a CONST_INT, then the other is the base. */ if (CONST_INT_P (src_1) || CONSTANT_P (src_0)) return find_base_value (src_0); else if (CONST_INT_P (src_0) || CONSTANT_P (src_1)) return find_base_value (src_1); and do something similar in find_base_term too. I.e. perhaps even with higher precedence over REG_P with REG_POINTER (or lower, in these cases it doesn't really matter, neither argument is REG_P), choose first operand that is CONSTANT_P and not CONST_INT_P. Here it is. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-26 Jakub Jelinek ja...@redhat.com PR lto/64025 * alias.c (find_base_term): Use std::swap. Prefer tmp2 if it is CONSTANT_P other than CONST_INT. --- gcc/alias.c.jj 2014-11-21 10:17:17.0 +0100 +++ gcc/alias.c 2014-11-26 12:31:24.719485590 +0100 @@ -1756,11 +1756,11 @@ find_base_term (rtx x) if (REG_P (tmp1) REG_POINTER (tmp1)) ; else if (REG_P (tmp2) REG_POINTER (tmp2)) - { - rtx tem = tmp1; - tmp1 = tmp2; - tmp2 = tem; - } + std::swap (tmp1, tmp2); + /* If second argument is constant which has base term, prefer it + over variable tmp1. See PR64025. */ + else if (CONSTANT_P (tmp2) !CONST_INT_P (tmp2)) + std::swap (tmp1, tmp2); /* Go ahead and find the base term for both operands. If either base term is from a pointer or is a named object or a special address Jakub
[PATCH] Fix simd clone vectorization (PR tree-optimization/64024)
Hi! As discussed in the PR and on IRC, the problem here is that peeling for alignment can for some linear argument that during vect analysis passed simple_iv no longer pass it during vect transform phase. So, to fix this, this patch remembers the base and step values from simple_iv during vect analysis and uses them during transform phase (biased by what the peeling for alignment advanced of course). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-26 Jakub Jelinek ja...@redhat.com PR tree-optimization/64024 * tree-vectorizer.h (struct _stmt_vec_info): Remove simd_clone_fndecl field. Add simd_clone_info field. (STMT_VINFO_SIMD_CLONE_FNDECL): Remove. (STMT_VINFO_SIMD_CLONE_INFO): Define. * tree-vect-stmts.c (vectorizable_simd_clone_call): Adjust for STMT_VINFO_SIMD_CLONE_FNDECL becoming first element of STMT_VINFO_SIMD_CLONE_INFO vector. For linear arguments, remember base and linear_step from analysis phase and use it during transform phase, biased by the difference between LOOP_VINFO_NITERS{_UNCHANGED,} multiplied by linear_step. (free_stmt_vec_info): Release STMT_VINFO_SIMD_CLONE_INFO. * gcc.dg/vect/vect-simd-clone-13.c: New test. * gcc.dg/vect/vect-simd-clone-14.c: New test. --- gcc/tree-vectorizer.h.jj2014-11-19 18:48:07.0 +0100 +++ gcc/tree-vectorizer.h 2014-11-26 12:56:00.899824766 +0100 @@ -602,8 +602,10 @@ typedef struct _stmt_vec_info { of this stmt. */ vecdr_p same_align_refs; - /* Selected SIMD clone's function decl. */ - tree simd_clone_fndecl; + /* Selected SIMD clone's function info. First vector element + is SIMD clone's function decl, followed by a pair of trees (base + step) + for linear arguments (pair of NULLs for other arguments). */ + vectree simd_clone_info; /* Classify the def of this stmt. */ enum vect_def_type def_type; @@ -677,7 +679,7 @@ typedef struct _stmt_vec_info { #define STMT_VINFO_RELATED_STMT(S) (S)-related_stmt #define STMT_VINFO_PATTERN_DEF_SEQ(S) (S)-pattern_def_seq #define STMT_VINFO_SAME_ALIGN_REFS(S) (S)-same_align_refs -#define STMT_VINFO_SIMD_CLONE_FNDECL(S) (S)-simd_clone_fndecl +#define STMT_VINFO_SIMD_CLONE_INFO(S) (S)-simd_clone_info #define STMT_VINFO_DEF_TYPE(S) (S)-def_type #define STMT_VINFO_GROUP_FIRST_ELEMENT(S) (S)-first_element #define STMT_VINFO_GROUP_NEXT_ELEMENT(S) (S)-next_element --- gcc/tree-vect-stmts.c.jj2014-11-19 18:47:59.0 +0100 +++ gcc/tree-vect-stmts.c 2014-11-26 15:38:59.883409014 +0100 @@ -2715,12 +2715,40 @@ vectorizable_simd_clone_call (gimple stm else gcc_assert (thisarginfo.vectype != NULL_TREE); - if (thisarginfo.dt != vect_constant_def - thisarginfo.dt != vect_external_def - loop_vinfo - TREE_CODE (op) == SSA_NAME - simple_iv (loop, loop_containing_stmt (stmt), op, iv, false) - tree_fits_shwi_p (iv.step)) + /* For linear arguments, the analyze phase should have saved +the base and step in STMT_VINFO_SIMD_CLONE_INFO. */ + if (i * 2 + 3 = STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length () + STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]) + { + gcc_assert (vec_stmt); + thisarginfo.linear_step + = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]); + thisarginfo.op + = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 1]; + /* If loop has been peeled for alignment, we need to adjust it. */ + tree n1 = LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo); + tree n2 = LOOP_VINFO_NITERS (loop_vinfo); + if (n1 != n2) + { + tree bias = fold_build2 (MINUS_EXPR, TREE_TYPE (n1), n1, n2); + tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]; + tree opt = TREE_TYPE (thisarginfo.op); + bias = fold_convert (TREE_TYPE (step), bias); + bias = fold_build2 (MULT_EXPR, TREE_TYPE (step), bias, step); + thisarginfo.op + = fold_build2 (POINTER_TYPE_P (opt) + ? POINTER_PLUS_EXPR : PLUS_EXPR, opt, + thisarginfo.op, bias); + } + } + else if (!vec_stmt + thisarginfo.dt != vect_constant_def + thisarginfo.dt != vect_external_def + loop_vinfo + TREE_CODE (op) == SSA_NAME + simple_iv (loop, loop_containing_stmt (stmt), op, +iv, false) + tree_fits_shwi_p (iv.step)) { thisarginfo.linear_step = tree_to_shwi (iv.step); thisarginfo.op = iv.base; @@ -2735,8 +2763,8 @@ vectorizable_simd_clone_call (gimple stm unsigned int badness = 0; struct cgraph_node *bestn = NULL