Fwd: Subject: Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
Dear All, It is only now that I see that my mail to Mikael and the release managers, to say that I would commit, bounced because of excess MIME content. I apologise for that. I can only say in mitigation that fortran is not release critical and regressions are unlikely because of the conditions that the patch hides behind. Thanks for the reviews Mikael! Committed as revision 194016. Paul -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
Re: [RFA:] fix group-loads of VOIDmode constants, expr.c:emit_group_load_1
Of course this matters only to >64bit (i.e. > registersize) values like TImode, alias __int128. The problem here is that group-loading a constant for a function return-value doesn't work; it's passed to simplify_gen_subreg which horks on the VOIDmode constant. Thankfully, the code below the context handles this case, twice the register-mode, just fine, so let's just gate the simplify_gen_subreg call with a test for a VOIDmode source.

IMO that's not as clear a cut as it seems. simplify_gen_subreg is supposed to work on VOIDmode constants, at least to be callable on them, because there is the simplify_gen_subreg - simplify_subreg - simplify_immed_subreg path.

--
Eric Botcazou
Re: [PATCH i386] Allow cltd/cqto etc on modern CPUs
Hello! The following proposed patch fixed the problem. Note that for Atom, only the CWD instruction is slow with 5 cycle latency, the rest sign extension instructions are fast -- the fix for Atom needs finer grain control and can be done separately. 2010-11-30 Xinliang David Li davi...@google.com * config/i386/i386.c: Allow sign extend instructions (cltd etc) on modern CPUs. OK. Thanks, Uros.
Re: patch to add storage classes to wide int.
Kenneth Zadeck zad...@naturalbridge.com writes:

2) The patch does not work for rtxes at all. Rtxes have to be copied. Trees could be pointer copied. The problem is that CONST_INTs are not canonized in a way that wide-ints are or that trees could be. This comes from the use of the GEN_INT macro that does not take a mode. Without a mode, you do not get the integers into proper form: i.e. sign extended. At the tree level, this is not a problem because the INT_CST constructors take a type. But the best that can be done at the rtl level is to do the sign extension when we convert inside from_rtx (which takes a precision or a mode).

rtxes must be sign-extended too, even under current rules. If you have a case where an rtx constant isn't sign-extended for the mode in which it's being used, then either the constant wasn't created correctly or the code that's calling from_rtx has got the wrong mode. Those are bugs even now.

This is enforced in some places already. E.g. if an insn has a QImode const_int_operand, (const_int 128) will not match (assuming 8 bits per unit of course). I can well imagine this patch hits cases that haven't been caught yet though.

FWIW, I think the thing you mention here...

Fixing this is on Richard Sandiford's and my "it would be nice" list but is certainly out of the question to do as a prereq for this patch.

...is the idea of attaching the mode to the constant, rather than having to keep track of it separately. That's definitely still something I'd like to do, but it wouldn't involve any changes to the canonicalisation rules.

I still agree that abstracting the storage seems like an unnecessary complication. In the other thread, Richard B said:

The patches introduce a lot more temporary wide-ints (your words) and at the same time makes construction of them from tree / rtx very expensive both stack space and compile-time wise. Look at how we for example compute TREE_INT_CST + 1 - int_cst_binop internally uses double_ints for the computation and then instantiates a new tree for holding the result. Now we'd use wide_ints for this requiring totally unnecessary copying.

But surely the expensive part of that operation is instantiating the new tree, with its associated hash table lookup and potential allocation. Trying to save one or two copies of integers from heap to stack seems minor compared to that.

Richard
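The canonical-form rule discussed in this message (a constant used in an N-bit mode must already be sign-extended from N bits, so 128 is not a valid 8-bit constant while -128 is) can be illustrated with a small standalone sketch. The helper names sext_sketch and is_canonical_sketch are hypothetical and are not GCC functions; 8 bits per unit and a two's-complement host are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC API: sign-extend the low PREC bits of
   VALUE into a 64-bit host integer.  PREC must be in the range 1..64.  */
static int64_t
sext_sketch (int64_t value, unsigned prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return value;
  uint64_t mask = ((uint64_t) 1 << prec) - 1;
  uint64_t sign = (uint64_t) 1 << (prec - 1);
  uint64_t low = (uint64_t) value & mask;
  /* Flip the sign bit, then subtract it: values with the sign bit set
     wrap around to the corresponding negative number.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* Hypothetical check: a constant is in canonical form for PREC bits when
   sign-extending it from PREC bits leaves it unchanged.  */
static bool
is_canonical_sketch (int64_t value, unsigned prec)
{
  return sext_sketch (value, prec) == value;
}
```

Under these assumptions, is_canonical_sketch (128, 8) is false and is_canonical_sketch (-128, 8) is true, which matches the QImode (const_int 128) example in the message.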
Re: Fix segfault on degenerate bitfield case
Eric Botcazou ebotca...@adacore.com writes: This is a segfault on a degenerate bitfield case introduced by the rewrite of the bitfield machinery. In Ada, we have bitfields of size zero and we ask the middle-end to generate accesses to them. This doesn't work anymore because get_best_mode now returns VOIDmode instead of QImode in this case, which wreaks havoc later. The patchlet just restores the previous behaviour. It also makes the comment describing the computation of bitregion_end_ more explicit, as the original formulation is a bit terse on second reading, even for the reviewer. :-) Bootstrapped/regtested on x86-64/Linux, applied on the mainline as obvious. Thanks! Richard
[committed] Step down as rtl maintainer
On reflection, I think it'd be better if I stood down as an rtl maintainer. I'll still try to keep the MIPS stuff ticking over though.

Applied as obvious.

Richard

        * MAINTAINERS: Remove self as RTL optimization maintainer.

Index: MAINTAINERS
===
--- MAINTAINERS 2012-12-01 09:29:44.0 +
+++ MAINTAINERS 2012-12-01 09:31:15.491862377 +
@@ -245,7 +245,6 @@ reload Ulrich Weigand uweig...@de.ibm
 reload                 Bernd Schmidt           ber...@codesourcery.com
 dfp.c, related         Ben Elliston            b...@gnu.org
 RTL optimizers         Eric Botcazou           ebotca...@libertysurf.fr
-RTL optimizers         Richard Sandiford       rdsandif...@googlemail.com
 auto-vectorizer        Richard Biener          rguent...@suse.de
 auto-vectorizer        Zdenek Dvorak           o...@ucw.cz
 loop infrastructure    Zdenek Dvorak           o...@ucw.cz
Re: [tsan] Small bugfix
On Sat, Dec 1, 2012 at 4:04 AM, Jakub Jelinek ja...@redhat.com wrote:

Hi! When I've tried to compile the attached testcase (I was trying to see if tsan could discover the emutls.c data race), I got ICEs because expr_ptr in certain cases wasn't is_gimple_val and thus was invalid to pass it directly to a call as argument, fixed thusly. Unfortunately, trying to compile it dynamically against libtsan.so doesn't work (apparently it is using -fvisibility=hidden, but not saying the public entry points have default visibility),

Runtime needs to mark all interface functions as visibility(default), right?

compiling it by hand statically against libtsan.a (we don't have -static-libtsan yet) failed at runtime, complaining the binary isn't a PIE - can't it really support normal executables?

It's not trivial to do fast shadow memory mapping in this case. Initially, non-PIE builds were not planned at all. But I am starting to think that I know how to do it. I will try to look into it in the coming weeks.

and when compiled/linked as PIE, I got

==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 3 (tid=31153, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 4 (tid=31155, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 5 (tid=31156, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
ThreadSanitizer: reported 3 warnings

which is probably not what I was expecting to see.

The thread leak reports are correct, right? The race must be detectable. Can you show the code? The first thing to check is that the memory accesses are instrumented. Also, if you build the runtime with -DTSAN_DEBUG_OUTPUT=2 it will print all incoming events; if you post the log, most likely I will be able to say why the race is not detected.
2012-12-01  Jakub Jelinek  ja...@redhat.com

        * tsan.c (instrument_expr): If expr_ptr isn't a gimple val, first
        store it into a SSA_NAME.

--- gcc/tsan.c.jj 2012-11-30 19:17:13.0 +0100
+++ gcc/tsan.c 2012-11-30 21:50:54.695392123 +0100
@@ -93,10 +93,11 @@ is_vptr_store (gimple stmt, tree expr, b
 static bool
 instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write)
 {
-  tree base, rhs, expr_type, expr_ptr, builtin_decl;
+  tree base, rhs, expr_ptr, builtin_decl;
   basic_block bb;
   HOST_WIDE_INT size;
   gimple stmt, g;
+  gimple_seq seq;
   location_t loc;
 
   size = int_size_in_bytes (TREE_TYPE (expr));
@@ -139,21 +140,25 @@ instrument_expr (gimple_stmt_iterator gs
   rhs = is_vptr_store (stmt, expr, is_write);
   gcc_checking_assert (rhs != NULL || is_gimple_addressable (expr));
   expr_ptr = build_fold_addr_expr (unshare_expr (expr));
-  if (rhs == NULL)
+  seq = NULL;
+  if (!is_gimple_val (expr_ptr))
     {
-      expr_type = TREE_TYPE (expr);
-      while (TREE_CODE (expr_type) == ARRAY_TYPE)
-        expr_type = TREE_TYPE (expr_type);
-      size = int_size_in_bytes (expr_type);
-      g = gimple_build_call (get_memory_access_decl (is_write, size),
-                             1, expr_ptr);
+      g = gimple_build_assign (make_ssa_name (TREE_TYPE (expr_ptr), NULL),
+                               expr_ptr);
+      expr_ptr = gimple_assign_lhs (g);
+      gimple_set_location (g, loc);
+      gimple_seq_add_stmt_without_update (&seq, g);
     }
+  if (rhs == NULL)
+    g = gimple_build_call (get_memory_access_decl (is_write, size),
+                           1, expr_ptr);
   else
     {
       builtin_decl = builtin_decl_implicit (BUILT_IN_TSAN_VPTR_UPDATE);
       g = gimple_build_call (builtin_decl, 1, expr_ptr);
     }
   gimple_set_location (g, loc);
+  gimple_seq_add_stmt_without_update (&seq, g);
   /* Instrumentation for assignment of a function result
      must be inserted after the call.  Instrumentation for
      reads of function arguments must be inserted before the call.
@@ -170,13 +175,13 @@ instrument_expr (gimple_stmt_iterator gs
           bb = gsi_bb (gsi);
           e = find_fallthru_edge (bb->succs);
           if (e)
-            gsi_insert_seq_on_edge_immediate (e, g);
+            gsi_insert_seq_on_edge_immediate (e, seq);
         }
       else
-        gsi_insert_after (gsi, g, GSI_NEW_STMT);
+        gsi_insert_seq_after (gsi, seq, GSI_NEW_STMT);
     }
   else
-    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    gsi_insert_before (gsi, seq, GSI_SAME_STMT);
 
   return true;
 }

        Jakub
[testcase] add testcase for PR53860
Hello, as advised by Alexandre Oliva at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53860#c3 , I am submitting a testcase for PR53860. If OK, I need someone to commit this patch. Tested by make check-g++.

Thanks,
Zdenek Sojka

Changelog:

        PR debug/53860
        * g++.dg/debug/pr53860.C: New testcase.

Index: gcc/testsuite/g++.dg/debug/pr53860.C
===
--- gcc/testsuite/g++.dg/debug/pr53860.C (revision 0)
+++ gcc/testsuite/g++.dg/debug/pr53860.C (revision 0)
@@ -0,0 +1,14 @@
+// PR debug/53860
+// { dg-do compile }
+// { dg-options "-fkeep-inline-functions -fdebug-types-section" }
+
+void
+foo ()
+{
+  struct S
+  {
+    S ()
+    {
+    }
+  };
+}
Re: [committed] Step down as rtl maintainer
Richard Sandiford rdsandif...@googlemail.com wrote:

On reflection, I think it'd be better if I stood down as an rtl maintainer. I'll still try to keep the MIPS stuff ticking over though. Applied as obvious.

You might consider instead demoting yourself to reviewer status. We do not have too many people reviewing rtl bits...

Richard

Richard

        * MAINTAINERS: Remove self as RTL optimization maintainer.

Index: MAINTAINERS
===
--- MAINTAINERS 2012-12-01 09:29:44.0 +
+++ MAINTAINERS 2012-12-01 09:31:15.491862377 +
@@ -245,7 +245,6 @@ reload Ulrich Weigand uweig...@de.ibm
 reload                 Bernd Schmidt           ber...@codesourcery.com
 dfp.c, related         Ben Elliston            b...@gnu.org
 RTL optimizers         Eric Botcazou           ebotca...@libertysurf.fr
-RTL optimizers         Richard Sandiford       rdsandif...@googlemail.com
 auto-vectorizer        Richard Biener          rguent...@suse.de
 auto-vectorizer        Zdenek Dvorak           o...@ucw.cz
 loop infrastructure    Zdenek Dvorak           o...@ucw.cz

--
This message was sent from my Android phone with K-9 Mail.
[Patch, Fortran] PR 55548: SYSTEM_CLOCK with integer(8) provides nanosecond resolution, but only microsecond precision (without -lrt)
Hi all, here is a straightforward patch for the intrinsic procedure SYSTEM_CLOCK. It does two things:

1) It reduces the resolution of the int8 version from 1 nanosecond to 1 microsecond (COUNT_RATE = 1000000).
2) It adds an int16 version with nanosecond precision.

The motivation for item #1 was mainly that the actual precision is usually not better than 1 microsec anyway (unless linking with -lrt). This results in SYSTEM_CLOCK giving values whose last three digits are zero. One can argue that this is not a dramatic misbehavior, but it has disadvantages for certain applications, e.g. using SYSTEM_CLOCK to initialize the random seed in a Monte-Carlo simulation. In general, I would say that the value of COUNT_RATE should not be larger than the actual precision of the clock used.

Moreover, the microsecond resolution for int8 arguments has the advantage that it is compatible with ifort's behavior. Also I think a resolution of 1 microsecond is sufficient for most applications. If someone really needs more, he can now use the int16 version (and link with -lrt).

Regtested on x86_64-unknown-linux-gnu (although we don't actually seem to have any test cases for SYSTEM_CLOCK yet). Ok for trunk?

Btw, does it make sense to also add an int2 version? If yes, which resolution? Note that most other compilers seem to have an int2 version of SYSTEM_CLOCK ...

Cheers,
Janus

2012-12-01  Janus Weil  ja...@gcc.gnu.org

        PR fortran/55548
        * gfortran.map (GFORTRAN_1.5): Add _gfortran_system_clock_16.
        * intrinsics/system_clock.c (system_clock_8): Change resolution to
        one microsec.
        (system_clock_16): New function (with nanosecond resolution).

2012-12-01  Janus Weil  ja...@gcc.gnu.org

        PR fortran/55548
        * intrinsic.texi (SYSTEM_CLOCK): Update documentation of SYSTEM_CLOCK.

2012-12-01  Janus Weil  ja...@gcc.gnu.org

        PR fortran/55548
        * gfortran.dg/system_clock_1.f90: New test case.

pr55548.diff Description: Binary data
system_clock_1.f90 Description: Binary data
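To see why a nanosecond COUNT_RATE on top of a microsecond-precision clock yields counts whose last three digits are zero, here is a minimal C sketch. It assumes the underlying clock delivers whole microseconds (as a gettimeofday-style source would); the helper name ticks_from_usec_clock and the two rate macros are hypothetical and are not taken from libgfortran.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_RATE INT64_C (1000000000) /* ticks per second, 1 ns each */
#define SKETCH_USEC_RATE INT64_C (1000000)    /* ticks per second, 1 us each */

/* Hypothetical helper: convert a (seconds, microseconds) time stamp into a
   tick count at COUNT_RATE ticks per second.  COUNT_RATE is assumed to be
   a multiple of one million so that the conversion is exact.  */
static int64_t
ticks_from_usec_clock (int64_t sec, int64_t usec, int64_t count_rate)
{
  assert (usec >= 0 && usec < SKETCH_USEC_RATE);
  assert (count_rate % SKETCH_USEC_RATE == 0);
  /* Each microsecond contributes count_rate / 1e6 ticks, so with a
     nanosecond rate every result is a multiple of 1000.  */
  return sec * count_rate + usec * (count_rate / SKETCH_USEC_RATE);
}
```

With the nanosecond rate, the time stamp (12 s, 345678 us) becomes 12345678000, ending in three zeros; with the microsecond rate it becomes 12345678, where every digit carries information.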
Re: [tsan] Small bugfix
On Sat, Dec 01, 2012 at 01:53:52PM +0400, Dmitry Vyukov wrote:

On Sat, Dec 1, 2012 at 4:04 AM, Jakub Jelinek ja...@redhat.com wrote:

Hi! When I've tried to compile the attached testcase (I was trying to see if tsan could discover the emutls.c data race), I got ICEs because expr_ptr in certain cases wasn't is_gimple_val and thus was invalid to pass it directly to a call as argument, fixed thusly. Unfortunately, trying to compile it dynamically against libtsan.so doesn't work (apparently it is using -fvisibility=hidden, but not saying the public entry points have default visibility),

Runtime needs to mark all interface functions as visibility(default), right?

Yeah, == SANITIZER_INTERFACE_ATTRIBUTE .

compiling it by hand statically against libtsan.a (we don't have -static-libtsan yet) failed at runtime, complaining the binary isn't a PIE - can't it really support normal executables?

It's not trivial to do fast shadow memory mapping in this case. Initially, non-PIE builds were not planned at all. But I am starting to think that I know how to do it. I will try to look into it in the coming weeks.

Perhaps libtsan.a could be for PIEs only, and document that -static-libtsan has that limitation. And libtsan.so.0 could add in some offset to allow even non-PIE binaries.

and when compiled/linked as PIE, I got

==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 3 (tid=31153, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 4 (tid=31155, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 5 (tid=31156, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
ThreadSanitizer: reported 3 warnings

which is probably not what I was expecting to see.

The thread leak reports are correct, right?

No idea what you mean by thread leak. What exactly is leaking?
The race must be detectable. Can you show the code? The first thing to check is that the memory accesses are instrumented. Also, if you build the runtime with -DTSAN_DEBUG_OUTPUT=2 it will print all incoming events; if you post the log, most likely I will be able to say why the race is not detected.

Ah, on closer inspection I found the bug on the GCC side, forgotten gsi_insert_before instead of gsi_insert_seq_before in one case. Now it reports the race (3 times):

==
WARNING: ThreadSanitizer: data race (pid=3613)
  Write of size 8 at 0x7fc5c8bfae40 by thread 1:
    #0 foo emutlstest.c:56 (exe+0x53b7)
    #1 tf emutlstest.c:102 (exe+0x54a9)
  Previous read of size 8 at 0x7fc5c8bfae40 by thread 2:
    #0 foo emutlstest.c:46 (exe+0x52c7)
    #1 tf emutlstest.c:102 (exe+0x54a9)
  Thread 1 (tid=3615, running) created at:
    #0 pthread_create ??:0 (exe+0xe54c)
    #1 main emutlstest.c:115 (exe+0x505f)
  Thread 2 (tid=3616, running) created at:
    #0 pthread_create ??:0 (exe+0xe54c)
    #1 main emutlstest.c:115 (exe+0x505f)
==

while when using

  uintptr_t offset = __atomic_load_n (&x->offset, __ATOMIC_ACQUIRE);

instead of

  uintptr_t offset = x->offset;

it doesn't report it. So here is the fixed up patch:

2012-12-01  Jakub Jelinek  ja...@redhat.com

        * tsan.c (instrument_expr): If expr_ptr isn't a gimple val, first
        store it into a SSA_NAME.

--- gcc/tsan.c.jj 2012-12-01 12:51:40.437808319 +0100
+++ gcc/tsan.c 2012-12-01 13:11:31.347889889 +0100
@@ -93,10 +93,11 @@ is_vptr_store (gimple stmt, tree expr, b
 static bool
 instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write)
 {
-  tree base, rhs, expr_type, expr_ptr, builtin_decl;
+  tree base, rhs, expr_ptr, builtin_decl;
   basic_block bb;
   HOST_WIDE_INT size;
   gimple stmt, g;
+  gimple_seq seq;
   location_t loc;
 
   size = int_size_in_bytes (TREE_TYPE (expr));
@@ -139,21 +140,25 @@ instrument_expr (gimple_stmt_iterator gs
   rhs = is_vptr_store (stmt, expr, is_write);
   gcc_checking_assert (rhs != NULL || is_gimple_addressable (expr));
   expr_ptr = build_fold_addr_expr (unshare_expr (expr));
-  if (rhs == NULL)
+  seq = NULL;
+  if (!is_gimple_val (expr_ptr))
     {
-      expr_type = TREE_TYPE (expr);
-      while (TREE_CODE (expr_type) == ARRAY_TYPE)
-        expr_type = TREE_TYPE (expr_type);
-      size = int_size_in_bytes (expr_type);
-      g = gimple_build_call (get_memory_access_decl (is_write, size),
-                             1, expr_ptr);
+      g = gimple_build_assign (make_ssa_name (TREE_TYPE (expr_ptr), NULL),
+                               expr_ptr);
+      expr_ptr =
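The two access styles contrasted in this message (a plain read that is reported as racing, and an acquire load through __atomic_load_n that is not) can be sketched in isolation as follows. This is only an illustration using the GCC __atomic builtins; the struct and function names are hypothetical and this is not the emutls.c or test case code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the object whose offset field is shared
   between threads in the test case.  */
struct tls_object_sketch
{
  uintptr_t offset;
};

/* Plain read: if another thread writes x->offset concurrently without
   synchronization, this access takes part in a data race.  */
static uintptr_t
read_offset_plain (struct tls_object_sketch *x)
{
  assert (x != NULL);
  return x->offset;
}

/* Atomic read with acquire ordering: pairs with a release store, so the
   access itself is no longer a racing plain read.  */
static uintptr_t
read_offset_acquire (struct tls_object_sketch *x)
{
  assert (x != NULL);
  return __atomic_load_n (&x->offset, __ATOMIC_ACQUIRE);
}

/* Writer side assumed by this sketch: publish the value with a release
   store so that it pairs with the acquire load above.  */
static void
publish_offset (struct tls_object_sketch *x, uintptr_t value)
{
  assert (x != NULL);
  __atomic_store_n (&x->offset, value, __ATOMIC_RELEASE);
}
```

Whether the writer in the real test case uses a matching atomic store is not shown in the thread; the release store here is an assumption made to keep the sketch self-consistent.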
Re: [committed] Step down as rtl maintainer
On Sat, Dec 1, 2012 at 10:36 AM, Richard Sandiford rdsandif...@googlemail.com wrote:

On reflection, I think it'd be better if I stood down as an rtl maintainer. I'll still try to keep the MIPS stuff ticking over though. Applied as obvious.

Hmm, this isn't really so obvious to me. In fact, it's rather surprising to me. You're one of the best and most knowledgeable reviewers of patches (RTL-related or otherwise). I'm not sure what reflection led you to this decision, but I think it's an unfortunate one.

In any case, thanks for all the nice work you've done and are doing on GCC!

Ciao!
Steven
Re: [tsan] Small bugfix
On Sat, Dec 1, 2012 at 4:23 PM, Jakub Jelinek ja...@redhat.com wrote:

Hi! When I've tried to compile the attached testcase (I was trying to see if tsan could discover the emutls.c data race), I got ICEs because expr_ptr in certain cases wasn't is_gimple_val and thus was invalid to pass it directly to a call as argument, fixed thusly. Unfortunately, trying to compile it dynamically against libtsan.so doesn't work (apparently it is using -fvisibility=hidden, but not saying the public entry points have default visibility),

Runtime needs to mark all interface functions as visibility(default), right?

Yeah, == SANITIZER_INTERFACE_ATTRIBUTE .

compiling it by hand statically against libtsan.a (we don't have -static-libtsan yet) failed at runtime, complaining the binary isn't a PIE - can't it really support normal executables?

It's not trivial to do fast shadow memory mapping in this case. Initially, non-PIE builds were not planned at all. But I am starting to think that I know how to do it. I will try to look into it in the coming weeks.

Perhaps libtsan.a could be for PIEs only, and document that -static-libtsan has that limitation. And libtsan.so.0 could add in some offset to allow even non-PIE binaries.

and when compiled/linked as PIE, I got

==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 3 (tid=31153, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 4 (tid=31155, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
==
WARNING: ThreadSanitizer: thread leak (pid=31150)
  Thread 5 (tid=31156, finished) created at:
    #0 pthread_create ??:0 (exe+0xe4dc)
    #1 main ??:0 (exe+0x505f)
==
ThreadSanitizer: reported 3 warnings

which is probably not what I was expecting to see.

The thread leak reports are correct, right?

No idea what you mean by thread leak. What exactly is leaking?

A thread leak is a joinable thread that was never joined.
I have a pending todo to aggregate them by stack, so in this case it would report 3 threads leaked here. Perhaps I need to report only finished threads; if a thread is still running during exit, perhaps it does not matter. And there is a flag to disable it altogether: TSAN_OPTIONS=report_thread_leaks=0 ./app

The race must be detectable. Can you show the code? The first thing to check is that the memory accesses are instrumented. Also, if you build the runtime with -DTSAN_DEBUG_OUTPUT=2 it will print all incoming events; if you post the log, most likely I will be able to say why the race is not detected.

Ah, on closer inspection I found the bug on the GCC side, forgotten gsi_insert_before instead of gsi_insert_seq_before in one case. Now it reports the race (3 times):

Great! Do you see 3 similar reports? TSan suppresses further reports on the same address and with the same stacks.

==
WARNING: ThreadSanitizer: data race (pid=3613)
  Write of size 8 at 0x7fc5c8bfae40 by thread 1:
    #0 foo emutlstest.c:56 (exe+0x53b7)
    #1 tf emutlstest.c:102 (exe+0x54a9)
  Previous read of size 8 at 0x7fc5c8bfae40 by thread 2:
    #0 foo emutlstest.c:46 (exe+0x52c7)
    #1 tf emutlstest.c:102 (exe+0x54a9)
  Thread 1 (tid=3615, running) created at:
    #0 pthread_create ??:0 (exe+0xe54c)

Is the runtime built with -g/-gmlt?

    #1 main emutlstest.c:115 (exe+0x505f)
  Thread 2 (tid=3616, running) created at:
    #0 pthread_create ??:0 (exe+0xe54c)
    #1 main emutlstest.c:115 (exe+0x505f)
==

while when using

  uintptr_t offset = __atomic_load_n (&x->offset, __ATOMIC_ACQUIRE);

instead of

  uintptr_t offset = x->offset;

it doesn't report it. So here is the fixed up patch:

2012-12-01  Jakub Jelinek  ja...@redhat.com

        * tsan.c (instrument_expr): If expr_ptr isn't a gimple val, first
        store it into a SSA_NAME.

--- gcc/tsan.c.jj 2012-12-01 12:51:40.437808319 +0100
+++ gcc/tsan.c 2012-12-01 13:11:31.347889889 +0100
@@ -93,10 +93,11 @@ is_vptr_store (gimple stmt, tree expr, b
 static bool
 instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write)
 {
-  tree base, rhs, expr_type, expr_ptr, builtin_decl;
+  tree base, rhs, expr_ptr, builtin_decl;
   basic_block bb;
   HOST_WIDE_INT size;
   gimple stmt, g;
+  gimple_seq seq;
   location_t loc;
 
   size = int_size_in_bytes (TREE_TYPE (expr));
@@ -139,21 +140,25 @@ instrument_expr (gimple_stmt_iterator gs
   rhs = is_vptr_store (stmt, expr, is_write);
   gcc_checking_assert (rhs != NULL || is_gimple_addressable (expr));
   expr_ptr = build_fold_addr_expr (unshare_expr (expr));
-  if (rhs == NULL)
+  seq = NULL;
+  if
Re: [patch] Rework RTL CFG graph dumping to dump DOT format
On Mon, Nov 26, 2012 at 4:46 PM, Richard Biener wrote:

Btw, I of course have my own CFG dumper (producing graphviz input) in my local tree - attached for reference (I'm simply using it from gdb sessions).

Here's my version of it. I still have to fix some minor fall-out of not flushing the pretty-printers all over the (inappropriate) place, but the graph dumps seem to work nicely so far. Perhaps you can try it out and see if this is to your liking? :-)

Bootstrapped & tested on {powerpc64,x86_64}-unknown-linux-gnu. As I said: Still fixing some minor tree dump related fall-out.

Ciao!
Steven

graph_dump_draft.diff Description: Binary data
Re: patch to add storage classes to wide int.
On 12/01/2012 04:28 AM, Richard Sandiford wrote:

Kenneth Zadeck zad...@naturalbridge.com writes:

2) The patch does not work for rtxes at all. Rtxes have to be copied. Trees could be pointer copied. The problem is that CONST_INTs are not canonized in a way that wide-ints are or that trees could be. This comes from the use of the GEN_INT macro that does not take a mode. Without a mode, you do not get the integers into proper form: i.e. sign extended. At the tree level, this is not a problem because the INT_CST constructors take a type. But the best that can be done at the rtl level is to do the sign extension when we convert inside from_rtx (which takes a precision or a mode).

rtxes must be sign-extended too, even under current rules. If you have a case where an rtx constant isn't sign-extended for the mode in which it's being used, then either the constant wasn't created correctly or the code that's calling from_rtx has got the wrong mode. Those are bugs even now.

This is enforced in some places already. E.g. if an insn has a QImode const_int_operand, (const_int 128) will not match (assuming 8 bits per unit of course). I can well imagine this patch hits cases that haven't been caught yet though.

I agree that in an ideal world, those canonization rules would be true. But I am a big believer in "trust but verify", and since there was no verification, because there is no mode actually provided, it is not surprising that we fail. I am simply staking out the position that fixing this is an unreasonable precondition for wide-int.

FWIW, I think the thing you mention here...

Fixing this is on Richard Sandiford's and my "it would be nice" list but is certainly out of the question to do as a prereq for this patch.

...is the idea of attaching the mode to the constant, rather than having to keep track of it separately. That's definitely still something I'd like to do, but it wouldn't involve any changes to the canonicalisation rules.

I still agree that abstracting the storage seems like an unnecessary complication. In the other thread, Richard B said:

The patches introduce a lot more temporary wide-ints (your words) and at the same time makes construction of them from tree / rtx very expensive both stack space and compile-time wise. Look at how we for example compute TREE_INT_CST + 1 - int_cst_binop internally uses double_ints for the computation and then instantiates a new tree for holding the result. Now we'd use wide_ints for this requiring totally unnecessary copying.

But surely the expensive part of that operation is instantiating the new tree, with its associated hash table lookup and potential allocation. Trying to save one or two copies of integers from heap to stack seems minor compared to that.

I have made the point that the huge number of wide ints created are actually very cheap, because the larger-than-necessary space is never initialized and the space is on the stack and goes away very quickly. This enhancement to wide int really makes no sense. Mike and I have spent a lot of time going down this rat hole, and it is time to put it to rest.

Richard
Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
Attached is a different fix. It splits DF_REF_IN_NOTE in two: One flag for each kind of note. This allows the dead note removal code to distinguish the source note for the EQ_USES. I needed to remove one flag to keep the df_ref_flags 16-bit, but the DF_REF_SUBREG flag looks completely unnecessary to me, and only ira.c uses it, so it wasn't too hard to scavenge a single bit.

I'll defer this to Paolo.

The patch also includes all places I've found so far where the compiler could create self-referencing notes:

1. optabs.c: Not sure what it was trying to do, but now it just refuses to add a note if TARGET is mentioned in one of the source operands.

OK.

2. gcse.c: gcse_emit_move_after added notes, but none of them was very useful as far as I could tell, and almost all of them turned self-referencing after CPROP. So I propose we just never add notes in this case.

gcse_emit_move_after also preserves existing notes. Are they problematic?

3. cprop.c: It seems to me that the purpose here is to propagate constants. If a reg could not be propagated, then the REG_EQUAL note will not help much either. Propagating constants via REG_EQUAL notes still helps folding comparisons sometimes, so I'm proposing we only propagate those. As a bonus: less garbage RTL because a cprop_constant_p can be shared.

That seems a bit radical to me, especially in try_replace_reg which is used for copy propagation as well. In cprop_jump, why is attaching a note to the jump problematic?

4. fwprop.c: If the SET_DEST is a REG that is mentioned in the SET_SRC, don't create a note. This one I'm not very happy with, because it doesn't handle the case that the REG is somehow simplified out of the SET_SRC, but being smarter about this would only complicate things.

I for one can't think of anything better for the moment, anyway. OK.

Finally, it makes sense to compute the NOTE problem for CSE.

Why? It only uses REG_EQ* notes, so it seems like we're back to the earlier trick of using df_note_add_problem to clean up pre-existing REG_EQ* notes.

--
Eric Botcazou
Re: [testcase] add testcase for PR53860
On Sat, Dec 1, 2012 at 2:23 AM, Zdeněk Sojka zso...@seznam.cz wrote:

Hello, as advised by Alexandre Oliva at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53860#c3 , I am submitting a testcase for PR53860. If OK, I need someone to commit this patch. Tested by make check-g++.

Thanks,
Zdenek Sojka

Changelog:

        PR debug/53860
        * g++.dg/debug/pr53860.C: New testcase.

I checked it in for you. Thanks.

--
H.J.
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Fri, Nov 30, 2012 at 11:00:28PM +0100, Eric Botcazou wrote:

OK, let's tweak the patch as follows:
1) when current_loops is not NULL, we compute may_be_loop_header and whether the loop has more than 1 latch edge exactly,
2) when current_loops is NULL, we use your above method to do the same,
3) once this is done, we return from the function before entering the loop if this is a (potential) header with more than 1 (potential) latch edge.
The comment can say that threading through a loop header with more than 1 latch edge is delicate and cite tree-ssa-threadupdate.c:thread_through_loop_header.

Like this? Regtested/bootstrapped on x86_64-linux, ok for trunk?

2012-12-01  Marek Polacek  pola...@redhat.com

        PR middle-end/54838
        * cprop.c (bypass_block): Determine number of latches.  Return
        when there is more than one latch edge.

        * gcc.dg/pr54838.c: New test.

--- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100
+++ gcc/cprop.c 2012-12-01 16:14:59.387335461 +0100
@@ -1510,13 +1510,28 @@ bypass_block (basic_block bb, rtx setcc,
   if (note)
     find_used_regs (&XEXP (note, 0), NULL);
 
-  may_be_loop_header = false;
-  FOR_EACH_EDGE (e, ei, bb->preds)
-    if (e->flags & EDGE_DFS_BACK)
-      {
-        may_be_loop_header = true;
-        break;
-      }
+  /* Determine whether there are more latch edges.  Threading through
+     a loop header with more than one latch is delicate, see e.g.
+     tree-ssa-threadupdate.c:thread_through_loop_header.  */
+  if (current_loops)
+    {
+      may_be_loop_header = bb == bb->loop_father->header;
+      if (may_be_loop_header
+          && bb->loop_father->latch == NULL)
+        return 0;
+    }
+  else
+    {
+      unsigned n_back_edges = 0;
+      FOR_EACH_EDGE (e, ei, bb->preds)
+        if (e->flags & EDGE_DFS_BACK)
+          n_back_edges++;
+
+      may_be_loop_header = n_back_edges > 0;
+
+      if (n_back_edges > 1)
+        return 0;
+    }
 
   change = 0;
   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
--- gcc/testsuite/gcc.dg/pr54838.c.mp 2012-11-26 14:48:43.783980854 +0100
+++ gcc/testsuite/gcc.dg/pr54838.c 2012-11-26 14:49:51.051158719 +0100
@@ -0,0 +1,24 @@
+/* PR middle-end/54838 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-forward-propagate -ftracer" } */
+
+void bar (void);
+
+void
+foo (void *b, int *c)
+{
+again:
+  switch (*c)
+    {
+    case 1:
+      if (!b)
+        {
+          bar ();
+          return;
+        }
+      goto again;
+    case 3:
+      if (!b)
+        goto again;
+    }
+}

        Marek
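The fallback branch of the patch (used when no loop information is available) boils down to counting the incoming back edges of a block and giving up when there is more than one. A toy, standalone C sketch of that decision is shown here; the edge type, the flag value and the function names are hypothetical stand-ins, not GCC's basic_block/edge API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for GCC's EDGE_DFS_BACK flag; the value is arbitrary.  */
#define SKETCH_EDGE_DFS_BACK 0x1u

/* Toy stand-in for an incoming CFG edge.  */
struct sketch_edge
{
  unsigned flags;
};

/* Count the predecessor edges that were marked as DFS back edges.  */
static unsigned
count_back_edges (const struct sketch_edge *preds, size_t n_preds)
{
  unsigned n_back_edges = 0;
  for (size_t i = 0; i < n_preds; i++)
    if (preds[i].flags & SKETCH_EDGE_DFS_BACK)
      n_back_edges++;
  return n_back_edges;
}

/* Mirror of the decision in the patch's else-branch: the block may be a
   loop header if it has any back edge, and bypassing is refused (false is
   returned) when there is more than one potential latch edge.  */
static bool
may_bypass_block_sketch (const struct sketch_edge *preds, size_t n_preds,
                         bool *may_be_loop_header)
{
  unsigned n_back_edges = count_back_edges (preds, n_preds);
  *may_be_loop_header = n_back_edges > 0;
  return n_back_edges <= 1;
}
```

In the real function the refusal is the early "return 0"; the sketch returns a boolean instead, purely to make the condition easy to exercise in isolation.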
Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)
On Sat, Dec 1, 2012 at 3:54 PM, Eric Botcazou wrote:

The patch also includes all places I've found so far where the compiler could create self-referencing notes:

1. optabs.c: Not sure what it was trying to do, but now it just refuses to add a note if TARGET is mentioned in one of the source operands.

OK.

Thanks. I'll commit this separately.

2. gcse.c: gcse_emit_move_after added notes, but none of them was very useful as far as I could tell, and almost all of them turned self-referencing after CPROP. So I propose we just never add notes in this case.

gcse_emit_move_after also preserves existing notes. Are they problematic?

Yes, they tend to be invalid after PRE because the registers used in the PRE'd expression usually are not live anymore (making the note invalid). Sometimes CPROP re-validates the notes, but it doesn't seem wise to me to rely on that.

3. cprop.c: It seems to me that the purpose here is to propagate constants. If a reg could not be propagated, then the REG_EQUAL note will not help much either. Propagating constants via REG_EQUAL notes still helps folding comparisons sometimes, so I'm proposing we only propagate those. As a bonus: less garbage RTL because a cprop_constant_p can be shared.

That seems a bit radical to me, especially in try_replace_reg which is used for copy propagation as well. In cprop_jump, why is attaching a note to the jump problematic?

Most of the time a note from copy-propagation was not valid because the copy-prop'd reg was not live at the point of the note.

4. fwprop.c: If the SET_DEST is a REG that is mentioned in the SET_SRC, don't create a note. This one I'm not very happy with, because it doesn't handle the case that the REG is somehow simplified out of the SET_SRC, but being smarter about this would only complicate things.

I for one can't think of anything better for the moment, anyway. OK.

I'll commit this along with the optabs.c part.

Finally, it makes sense to compute the NOTE problem for CSE.

Why? It only uses REG_EQ* notes,

Not really. It uses single_set in a few places, including delete_trivially_dead_insns and cse_extended_basic_block.

so it seems like we're back to the earlier trick of using df_note_add_problem to clean up pre-existing REG_EQ* notes.

Again: Not really. I also bootstrapped and tested without the cse.c change. I plan (and promise ;-) to add a REG_EQ* note verifier for GCC 4.9.

Ciao!
Steven
Another merge from trunk to gccgo branch
I've merged revision 194015 from trunk to the gccgo branch. Ian
Re: [i386] scalar ops that preserve the high part of a vector
Hello, here is a patch. If it is accepted, I'll extend it to other vm patterns (mul, div, min, max are likely candidates, but I need to check the doc). It passed bootstrap+testsuite on x86_64-linux. 2012-12-01 Marc Glisse marc.gli...@inria.fr PR target/54855 gcc/ * config/i386/sse.md (sse_vmplusminus_insnmode3): Rewrite pattern. * config/i386/i386-builtin-types.def: New function types. * config/i386/i386.c (ix86_expand_args_builtin): Likewise. (bdesc_args) __builtin_ia32_addss, __builtin_ia32_subss, __builtin_ia32_addsd, __builtin_ia32_subsd: Change prototype. * config/i386/xmmintrin.h: Adapt to new builtin prototype. * config/i386/emmintrin.h: Likewise. * doc/extend.texi (X86 Built-in Functions): Document changed prototype. testsuite/ * gcc.target/i386/pr54855-1.c: New testcase. * gcc.target/i386/pr54855-2.c: New testcase. -- Marc GlisseIndex: gcc/testsuite/gcc.target/i386/pr54855-2.c === --- gcc/testsuite/gcc.target/i386/pr54855-2.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr54855-2.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options -O -msse } */ + +typedef float vec __attribute__((vector_size(16))); + +vec f (vec x) +{ + x[0] += 2; + return x; +} + +vec g (vec x) +{ + x[0] -= 1; + return x; +} + +/* { dg-final { scan-assembler-not mov } } */ Property changes on: gcc/testsuite/gcc.target/i386/pr54855-2.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/testsuite/gcc.target/i386/pr54855-1.c === --- gcc/testsuite/gcc.target/i386/pr54855-1.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr54855-1.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options -O -msse2 } */ + +typedef double vec __attribute__((vector_size(16))); + +vec f (vec x) +{ + x[0] += 2; + return x; +} + +vec g (vec x) +{ + x[0] -= 1; + return x; +} + +/* { dg-final { scan-assembler-not mov } } */ Property changes on: gcc/testsuite/gcc.target/i386/pr54855-1.c ___ Added: svn:eol-style + native Added: 
svn:keywords + Author Date Id Revision URL Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 194017) +++ gcc/config/i386/i386.c (working copy) @@ -27059,22 +27059,22 @@ static const struct builtin_description { OPTION_MASK_ISA_SSE, CODE_FOR_sse_cvttps2pi, __builtin_ia32_cvttps2pi, IX86_BUILTIN_CVTTPS2PI, UNKNOWN, (int) V2SI_FTYPE_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_cvttss2si, __builtin_ia32_cvttss2si, IX86_BUILTIN_CVTTSS2SI, UNKNOWN, (int) INT_FTYPE_V4SF }, { OPTION_MASK_ISA_SSE | OPTION_MASK_ISA_64BIT, CODE_FOR_sse_cvttss2siq, __builtin_ia32_cvttss2si64, IX86_BUILTIN_CVTTSS2SI64, UNKNOWN, (int) INT64_FTYPE_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_shufps, __builtin_ia32_shufps, IX86_BUILTIN_SHUFPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT }, { OPTION_MASK_ISA_SSE, CODE_FOR_addv4sf3, __builtin_ia32_addps, IX86_BUILTIN_ADDPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_subv4sf3, __builtin_ia32_subps, IX86_BUILTIN_SUBPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_mulv4sf3, __builtin_ia32_mulps, IX86_BUILTIN_MULPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_divv4sf3, __builtin_ia32_divps, IX86_BUILTIN_DIVPS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, - { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmaddv4sf3, __builtin_ia32_addss, IX86_BUILTIN_ADDSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, - { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmsubv4sf3, __builtin_ia32_subss, IX86_BUILTIN_SUBSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, + { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmaddv4sf3, __builtin_ia32_addss, IX86_BUILTIN_ADDSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_FLOAT }, + { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmsubv4sf3, __builtin_ia32_subss, IX86_BUILTIN_SUBSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_FLOAT }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmmulv4sf3, __builtin_ia32_mulss, IX86_BUILTIN_MULSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_vmdivv4sf3, 
__builtin_ia32_divss, IX86_BUILTIN_DIVSS, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_maskcmpv4sf3, __builtin_ia32_cmpeqps, IX86_BUILTIN_CMPEQPS, EQ, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_maskcmpv4sf3, __builtin_ia32_cmpltps, IX86_BUILTIN_CMPLTPS, LT, (int) V4SF_FTYPE_V4SF_V4SF }, { OPTION_MASK_ISA_SSE, CODE_FOR_sse_maskcmpv4sf3, __builtin_ia32_cmpleps, IX86_BUILTIN_CMPLEPS, LE, (int) V4SF_FTYPE_V4SF_V4SF }, {
Re: [patch] Rework RTL CFG graph dumping to dump DOT format
On Sat, Dec 1, 2012 at 2:23 PM, Steven Bosscher wrote:

On Mon, Nov 26, 2012 at 4:46 PM, Richard Biener wrote:

Btw, I of course have my own CFG dumper (producing graphviz input) in my local tree - attached for reference (I'm simply using it from gdb sessions).

Here's my version of it. I still have to fix some minor fall-out of not flushing the pretty-printers all over the (inappropriate) place, but the graph dumps seem to work nicely so far. Perhaps you can try it out and see if this is to your liking? :-)

Bootstrapped and tested on {powerpc64,x86_64}-unknown-linux-gnu. As I said: Still fixing some minor tree dump related fall-out.

I only need this fix on top of the patch:

diff -u tree-pretty-print.c tree-pretty-print.c
--- tree-pretty-print.c (working copy)
+++ tree-pretty-print.c (working copy)
@@ -161,6 +161,7 @@
 {
   maybe_init_pretty_print (file);
   dump_generic_node (buffer, t, 0, flags, false);
+  pp_flush (buffer);
 }
 
 /* Dump the name of a _DECL node and its DECL_UID if TDF_UID is set

Bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven
Re: [tsan] Small bugfix
On Sat, Dec 01, 2012 at 04:55:35PM +0400, Dmitry Vyukov wrote:

No idea what do you mean by thread leak. What exactly is leaking?

Thread leak is a joinable but not joined thread. I have a pending todo to aggregate them by stack, so in this case it will be 3 threads leaked here. Perhaps I need to report only finished threads; if a thread runs during exit, perhaps it does not matter. And there is a flag to disable it altogether: TSAN_OPTIONS=report_thread_leaks=0 ./app

Actually, it seems tsan was right about this. When tweaking the test from 2 to 5 threads I ended up with:

int main (void)
{
  pthread_t p[2]; // wrong, should have been p[5]
  int i;
  for (i = 0; i < 5; i++)
    if (pthread_create (&p[i], NULL, tf, NULL))
      return 0;
  for (i = 0; i < 5; i++)
    pthread_join (p[i], NULL);
  return 0;
}

and gcc, when optimizing, kept the first loop iterating 5 times (it just takes the address of elements past the end of the buffer, which is undefined too), while the second loop got optimized into just two iterations (cunrolli opt).

Jakub
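One way to avoid this kind of mismatch is to size the array and bound both loops with a single constant. A minimal sketch, assuming a trivial thread body; `NTHREADS`, `run_threads` and the body of `tf` are illustrative and are not taken from the actual tsan test:

```c
#include <pthread.h>
#include <stddef.h>

/* Single constant shared by the array size and both loops.  */
#define NTHREADS 5

/* Trivial thread body, standing in for the test's tf.  */
static void *
tf (void *arg)
{
  return arg;
}

/* Create and join NTHREADS threads.  Return how many were joined,
   or -1 if a thread could not be created.  */
static int
run_threads (void)
{
  pthread_t p[NTHREADS];
  int created, joined = 0;

  for (created = 0; created < NTHREADS; created++)
    if (pthread_create (&p[created], NULL, tf, NULL))
      break;

  /* Join only the threads that were actually created, so that none
     is left joinable but not joined.  */
  for (int j = 0; j < created; j++)
    if (pthread_join (p[j], NULL) == 0)
      joined++;

  return created == NTHREADS ? joined : -1;
}
```

Depending on the C library, building this may require passing -pthread to the compiler.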
your resignation
i think that this is really a mistake. you are one of the best at this part of the compiler. kenny
Re: [C++ Patch] PR 54170
Hi,

On 12/01/2012 07:13 AM, Jason Merrill wrote:

On 11/30/2012 04:05 PM, Paolo Carlini wrote:

@@ -219,10 +219,15 @@ cp_convert_to_pointer (tree type, tree expr, tsubs
-expr = build_int_cst (type, 0);
+expr = (TREE_SIDE_EFFECTS (expr)
+? build_nop (type, expr)
+: build_int_cst (type, 0));

This seems to rely on a nop being sufficient to convert from any null pointer constant to the appropriate pointer type, which I don't think is safe if the integer is smaller than a pointer. I'm also not sure if we want to rely on nullptr_t expressions actually having the value 0.

Thanks a lot. What about the below? I'm consistently building a COMPOUND_EXPR for the simpler plain pointers too; the hunk for pointers to member functions can be *very* simple: in fact I had already tried this solution but something else was wrong yesterday and seemed not to work, huumpf.

Tested x86_64-linux.

Thanks again,
Paolo.

Index: cp/cvt.c
===
--- cp/cvt.c (revision 194020)
+++ cp/cvt.c (working copy)
@@ -215,16 +215,14 @@ cp_convert_to_pointer (tree type, tree expr, tsubs
         return build_ptrmemfunc (TYPE_PTRMEMFUNC_FN_TYPE (type), expr, 0,
                                  /*c_cast_p=*/false, complain);
 
-      if (TYPE_PTRDATAMEM_P (type))
-        {
-          /* A NULL pointer-to-member is represented by -1, not by
-             zero. */
-          expr = build_int_cst_type (type, -1);
-        }
-      else
-        expr = build_int_cst (type, 0);
+      /* A NULL pointer-to-data-member is represented by -1, not by
+         zero. */
+      tree val = (TYPE_PTRDATAMEM_P (type)
+                  ? build_int_cst_type (type, -1)
+                  : build_int_cst (type, 0));
 
-      return expr;
+      return (TREE_SIDE_EFFECTS (expr)
+              ? build2 (COMPOUND_EXPR, type, expr, val) : val);
     }
   else if (TYPE_PTRMEM_P (type) && INTEGRAL_CODE_P (form))
     {
Index: cp/typeck.c
===
--- cp/typeck.c (revision 194020)
+++ cp/typeck.c (working copy)
@@ -7567,7 +7567,7 @@ build_ptrmemfunc (tree type, tree pfn, int force,
   /* Handle null pointer to member function conversions. */
   if (null_ptr_cst_p (pfn))
     {
-      pfn = build_c_cast (input_location, type, nullptr_node);
+      pfn = build_c_cast (input_location, type, pfn);
       return build_ptrmemfunc1 (to_type, integer_zero_node, pfn);
Index: testsuite/g++.dg/cpp0x/lambda/lambda-nullptr.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-nullptr.C (revision 0)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-nullptr.C (working copy)
@@ -0,0 +1,47 @@
+// PR c++/54170
+// { dg-do run { target c++11 } }
+
+#include <cassert>
+
+struct A;
+typedef A* ptr;
+typedef int (A::*pmf) (int);
+typedef int (A::*pdm);
+
+int total;
+
+void add(int n)
+{
+  total += n;
+}
+
+template <typename RType, typename Callable>
+RType Call(Callable native_func, int arg)
+{
+  return native_func(arg);
+}
+
+template <typename RType>
+RType do_test(int delta)
+{
+  return Call<RType>([=](int delta) { add(delta); return nullptr; }, delta);
+}
+
+template <typename RType>
+void test()
+{
+  total = 0;
+  assert (!do_test<RType>(5));
+  assert (total == 5);
+  assert (!do_test<RType>(20));
+  assert (total == 25);
+  assert (!do_test<RType>(-256));
+  assert (total == -231);
+}
+
+int main()
+{
+  test<ptr>();
+  test<pdm>();
+  test<pmf>();
+}
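The COMPOUND_EXPR built in the cvt.c hunk behaves much like the C comma operator: the left operand is evaluated for its side effects and the result is the right operand. A minimal sketch in plain C, where `calls`, `null_with_side_effect` and `convert_to_null` are names made up for illustration and do not exist in GCC:

```c
#include <stddef.h>

/* Counts how often the side-effecting expression was evaluated.  */
static int calls;

/* An expression with a side effect whose value is a null pointer.  */
static void *
null_with_side_effect (void)
{
  calls++;
  return NULL;
}

/* Rough analogue of build2 (COMPOUND_EXPR, type, expr, val): evaluate
   the operand for its side effect, then yield a known null constant
   of the target type instead of converting the operand's value.  */
static int *
convert_to_null (void)
{
  return ((void) null_with_side_effect (), (int *) 0);
}
```

The point of this shape is that the result never depends on how the operand's own value is represented, which is the concern raised in the review about relying on a nop conversion.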
Re: [patch] reorg.c janitor: remove epilogue_delay_list
On 11/30/2012 03:00 PM, Steven Bosscher wrote: Hello, This epilogue_delay_list probably existed only for text epilogues, but it is now unused. Tested by building a set of cc1-i files at -O2 for SPARC with and without the patch, and verifying that there are no code gen changes. OK for trunk? OK jeff
Re: Non-dominating loop bounds in tree-ssa-loop-niter 3/4
On Wed, Oct 31, 2012 at 3:39 AM, Jan Hubicka hubi...@ucw.cz wrote:

Hi,
this patch implements the logic to remove statements that are known to be undefined and thus expected to not be executed after unrolling. It also removes redundant exits, which I originally tried to do at once, but that does not fly, since the peeling confuses number_of_iterations_exit and it no longer understands the ivs. So now we
1) always remove exits that are known to be redundant by the bounds found,
2) try to peel/unroll,
3) if successful, remove statements from the last iteration.

This silences the array-bounds warnings in my testcase and in many cases of -O3 bootstrap (I am running it now). Still, if one constructs a testcase touching out of bounds in more than one iteration, we will produce a false warning; I will handle that incrementally by similar logic in loop copying.

Bootstrapped/regtested x86_64-linux, OK?

Honza

	* tree-ssa-loop-niter.c (free_loop_bounds): Break out from ...
	(free_numbers_of_iterations_estimates_loop): ... here.
	* tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): New function.
	(remove_redundant_iv_test): New function.
	(try_unroll_loop_completely): Pass in MAXITER; use remove_exits_and_undefined_stmts
	(canonicalize_loop_induction_variables): Compute MAXITER; use remove_redundant_iv_test.
	* cfgloop.h (free_loop_bounds): New function.

	* gcc.dg/tree-ssa/cunroll-10.c: New testcase.
	* gcc.dg/tree-ssa/cunroll-9.c: New testcase.

This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

-- H.J.
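A minimal sketch of the kind of loop the quoted patch description is about, assuming a four-element array. The names `a`, `N` and `fill` are hypothetical and are not taken from the new cunroll testcases. Because a store to a[N] would be undefined, a compiler may conclude that the loop body runs at most N times; if unrolling then produces one more copy of the body, the store in that copy is known to be undefined and is a candidate for removal rather than for an array-bounds warning.

```c
#define N 4

static int a[N];

/* The store a[i] is only defined for i < N, so in any valid execution
   the loop body runs at most N times, whatever value n has.  */
static void
fill (int n)
{
  for (int i = 0; i < n; i++)
    a[i] = i * 2;
}
```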
[committed] Fix testsuite/30_threads/condition_variable/members/53841.cc on hppa*-hp-hpux11*
The attached change adds the -std=gnu++0x -pthread options on hppa*-hp-hpux11*. Test passes with the change. Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. Committed to trunk and 4.7 branch.

Dave
--
J. David Anglin dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)

2012-12-01 John David Anglin dave.ang...@nrc-cnrc.gc.ca

	PR libstdc++/55503
	* testsuite/30_threads/condition_variable/members/53841.cc: Add hppa*-hp-hpux11* to -pthread dg-options.

Index: testsuite/30_threads/condition_variable/members/53841.cc
===
--- testsuite/30_threads/condition_variable/members/53841.cc (revision 193878)
+++ testsuite/30_threads/condition_variable/members/53841.cc (working copy)
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-options "-std=gnu++0x -pthread" { target *-*-freebsd* *-*-netbsd* *-*-linux* powerpc-ibm-aix* } }
+// { dg-options "-std=gnu++0x -pthread" { target *-*-freebsd* *-*-netbsd* *-*-linux* powerpc-ibm-aix* hppa*-hp-hpux11* } }
 // { dg-options "-std=gnu++0x -pthreads" { target *-*-solaris* } }
 // { dg-options "-std=gnu++0x" { target *-*-cygwin *-*-darwin* } }
 // { dg-require-cstdint "" }
[committed] Fix ada build on hpux10
Tested on hppa1.1.-hp-hpux10.20. Committed to trunk and 4.7. Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) 2012-12-01 John David Anglin dave.ang...@nrc-cnrc.gc.ca PR ada/52110 * s-osinte-hpux-dce.ads: Declare pthread_rwlockattr_t and pthread_rwlock_t subtypes. Delete duplicate declaration of clockid_t. * s-taspri-hpux-dce.ads: Change pragma Atomic (Thread) to comment. Index: s-osinte-hpux-dce.ads === --- s-osinte-hpux-dce.ads (revision 193634) +++ s-osinte-hpux-dce.ads (working copy) @@ -244,6 +244,14 @@ type pthread_condattr_t is limited private; type pthread_key_t is private; + -- Read/Write lock not supported on HPUX. To add support both types + -- pthread_rwlock_t and pthread_rwlockattr_t must properly be defined + -- with the associated routines pthread_rwlock_[init/destroy] and + -- pthread_rwlock_[rdlock/wrlock/unlock]. + + subtype pthread_rwlock_t is pthread_mutex_t; + subtype pthread_rwlockattr_t is pthread_mutexattr_t; + --- -- Stack -- --- @@ -444,7 +452,6 @@ end record; pragma Convention (C, timespec); - type clockid_t is new int; CLOCK_REALTIME : constant clockid_t := 1; type cma_t_address is new System.Address; Index: s-taspri-hpux-dce.ads === --- s-taspri-hpux-dce.ads (revision 193634) +++ s-taspri-hpux-dce.ads (working copy) @@ -102,7 +102,9 @@ type Private_Data is record Thread : aliased System.OS_Interface.pthread_t; - pragma Atomic (Thread); + -- pragma Atomic (Thread); + -- Unfortunately, the above fails because Thread is 64 bits. + -- Thread field may be updated by two different threads of control. -- (See, Enter_Task and Create_Task in s-taprop.adb). They put the -- same value (thr_self value). We do not want to use lock on those
[committed] Remove xfail for hppa*-*-hpux* from gcc.dg/torture/pr52402.c
Test no longer fails on trunk. Committed to trunk.

Dave
--
J. David Anglin dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)

2012-12-01 John David Anglin dave.ang...@nrc-cnrc.gc.ca

	PR middle-end/52450
	* gcc.dg/torture/pr52402.c: Remove xfail for hppa*-*-hpux*.

Index: gcc.dg/torture/pr52402.c
===
--- gcc.dg/torture/pr52402.c (revision 194025)
+++ gcc.dg/torture/pr52402.c (working copy)
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-w -Wno-psabi" } */
 /* { dg-require-effective-target int32plus } */
-/* { dg-xfail-run-if "pr52450" { { hppa*-*-hpux* } && { ! lp64 } } } */
 
 typedef int v4si __attribute__((vector_size(16)));
 struct T { v4si i[2]; int j; } __attribute__((packed));
Re: [Patch, Fortran] PR 55548: SYSTEM_CLOCK with integer(8) provides nanosecond resolution, but only microsecond precision (without -lrt)
On Sat, Dec 1, 2012 at 1:17 PM, Janus Weil ja...@gcc.gnu.org wrote: Hi all, here is a straightforward patch for the intrinsic procedure SYSTEM_CLOCK. It does two things: 1) It reduces the resolution of the int8 version from 1 nanosecond to 1 microsecond (COUNT_RATE = 100). 2) It adds an int16 version with nanosecond precision. The motivation for item #1 was mainly that the actual precision is usually not better than 1 microsec anyway (unless linking with -lrt). This results in SYSTEM_CLOCK giving values whose last three digits are zero. One can argue that this is not a dramatic misbehavior, but it has disadvantages for certain applications, like e.g. using SYSTEM_CLOCK to initialize the random seed in a Monte-Carlo simulation. In general, I would say that the value of COUNT_RATE should not be larger than the actual precision of the clock used. Moreover, the microsecond resolution for int8 arguments has the advantage that it is compatible with ifort's behavior. Also I think a resolution of 1 microsecond is sufficient for most applications. If someone really needs more, he can now use the int16 version (and link with -lrt). Regtested on x86_64-unknown-linux-gnu (although we don't actually seem to have any test cases for SYSTEM_CLOCK yet). Ok for trunk? Btw, does it make sense to also add an int2 version? If yes, which resolution? Note that most other compilers seem to have an int2 version of SYSTEM_CLOCK ... No, not Ok. IIRC there was some discussion about COUNT_RATE back when the nanosecond resolution system_clock feature was developed, and one idea was to have different count rates depending on whether clock_gettime is available, as you also suggest in the PR. In the end it was decided to keep a constant count rate as a consistent rate was seen as more important than wasting a few of the least significant bits when nanosecond resolution is not available, since int8 has sufficient range for several centuries even with nanosecond resolution. 
Anyway, I don't feel particularly strongly about this, and if there now is a consensus to have a changing count rate, I can live with that.

But I do object to providing only microsecond resolution for the int8 version. Nanosecond resolution is indeed not necessary in most cases, but as the saying goes, "Precision, like virginity, is never regained once lost." Reducing the resolution is a regression for those users who have relied on this feature; I, for one, have several test and benchmark programs which depend on nanosecond resolution for the int8 system_clock. OTOH, if one wants a microsecond count rate, converting the count value is just a single statement following the system_clock call.

Needing to link with librt in order to access clock_gettime is an unfortunate wart in glibc, but other C libraries exist out there (heck, given the success of Android, glibc is certainly a minority even if you limit yourself to Linux), and of those that provide clock_gettime, most have it directly in libc and not in a separate library.

As for the int16 and int2 variants, F2003 requires that the arguments can be of any kind. AFAICS, this means that different arguments can be of different kinds. So in order to avoid a combinatorial explosion, the only sensible way is for the frontend to create temporaries of suitable kind, then call an appropriate library function etc. As mentioned, int8 already provides a range of several centuries with nanosecond resolution, and thus I see no reason to provide an int16 version; rather, the frontend should create int8 temporaries, call system_clock_8, then copy the results back to the actual int16 arguments.

F2003 requires that all supported kinds should work, which would include kinds 1 and 2, although those versions obviously suffer from a lack of either range, precision or both, to the extent that they are quite useless in practice. I think they could be handled by just setting count to -huge(), count_rate and count_max to 0 in the frontend and not bothering to make a library call at all.

Also, F2003 specifies that COUNT_RATE can be of type real as well; I think this should be handled similarly, by the frontend copying the result from an integer temporary.

So in the end I don't think we need new library entry points, but rather the temporary variable handling stuff in the frontend.

-- Janne Blomqvist
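The two numeric claims in this message (that a signed 64-bit count covers several centuries at nanosecond resolution, and that going from nanoseconds to microseconds is a single statement) can be checked with a small sketch, written in C here rather than Fortran. The helper names `range_in_years` and `ns_to_us` are made up for illustration, and a 365-day year is assumed.

```c
#include <stdint.h>

/* Seconds in a 365-day year (an assumption made for this estimate).  */
#define SECONDS_PER_YEAR (365.0 * 24.0 * 60.0 * 60.0)

/* Years representable by a signed 64-bit count at RATE ticks/second.  */
static double
range_in_years (double rate)
{
  return (double) INT64_MAX / rate / SECONDS_PER_YEAR;
}

/* Convert a nanosecond count to microseconds, discarding the remainder.  */
static int64_t
ns_to_us (int64_t count_ns)
{
  return count_ns / 1000;
}
```

With a rate of 1e9 ticks per second the range comes out at a little over 292 years, which matches the "several centuries" figure above.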
Re: [v3, build] Clear hardware capabilities on libstdc++.so with Sun as
Rainer, This patch is completely wrong and unacceptable because your test applies too generally. Why, exactly, is it appropriate to assume that the -nH assembler command line option means the same thing on all systems and one wants to clear HW capabilities on all systems? Because it is not appropriate and does not mean the same thing on AIX, for example. If you want to test the feature on Solaris, then test it on Solaris and Solaris only. Please fix this or revert it. Thanks, David
[patch stmt.c]: Fix SjLj exception handling
Hi,

recent 4.8 has regressions in g++.old-deja/g++.eh for the catch*.C tests, if the exception mechanism is SjLj. This is due to an off-by-one error in a decreasing loop.

ChangeLog

2012-12-01 Kai Tietz

	* stmt.c (expand_sjlj_dispatch_table): Fix off by one.

Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu. OK to apply?

Regards,
Kai

Index: stmt.c
===
--- stmt.c (Revision 193985)
+++ stmt.c (Arbeitskopie)
@@ -2282,7 +2282,7 @@ expand_sjlj_dispatch_table (rtx dispatch_index,
       tree range = maxval;
       rtx default_label = gen_label_rtx ();
 
-      for (int i = ncases - 1; i > 0; --i)
+      for (int i = ncases - 1; i >= 0; --i)
         {
           tree elt = dispatch_table[i];
           tree low = CASE_LOW (elt);
Re: [patch stmt.c]: Fix SjLj exception handling
On Sat, Dec 1, 2012 at 10:59 PM, Kai Tietz wrote:

Hi, recent 4,8 has regressions in g++.old-deja/g++.eh for the catch*.C tests, if exception-mechanism is SjLj. This is due an off by one failure in an decreasing loop.

ChangeLog

2012-12-01 Kai Tietz

	* stmt.c (expand_sjlj_dispatch_table): Fix off by one.

Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu. Ok for apply?

Regards,
Kai

Index: stmt.c
===
--- stmt.c (Revision 193985)
+++ stmt.c (Arbeitskopie)
@@ -2282,7 +2282,7 @@ expand_sjlj_dispatch_table (rtx dispatch_index,
       tree range = maxval;
       rtx default_label = gen_label_rtx ();
 
-      for (int i = ncases - 1; i > 0; --i)
+      for (int i = ncases - 1; i >= 0; --i)
         {
           tree elt = dispatch_table[i];
           tree low = CASE_LOW (elt);

I can't approve this, but it's obvious. The normal switch expander (expand_case) expects the default case in slot 0, but the SJLJ dispatch table doesn't have a default case.

Ciao!
Steven
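The effect of changing the loop condition from `i > 0` to `i >= 0` can be shown with a small stand-alone sketch. The table of integers and the two helper functions are hypothetical and only mimic the shape of the loop in expand_sjlj_dispatch_table; they are not GCC code.

```c
/* Walk TABLE from the last element down, stopping before index 0.
   This is the off-by-one shape: slot 0 is never visited.  */
static int
sum_skipping_first (const int *table, int ncases)
{
  int sum = 0;
  for (int i = ncases - 1; i > 0; --i)
    sum += table[i];
  return sum;
}

/* Same loop with the corrected condition: slot 0 is included.  */
static int
sum_all (const int *table, int ncases)
{
  int sum = 0;
  for (int i = ncases - 1; i >= 0; --i)
    sum += table[i];
  return sum;
}
```

Skipping slot 0 is only correct when that slot holds something special, such as the default case that expand_case expects there; a table without a default case needs the `>=` form.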
[PATCH] Fix handling of EXPAND_MEMORY for TFmode memory constraint in asm
The attached change fixes the compilation of the following asm in libquadmath/math/fmaq.c:

  asm volatile ("" : : "m" (v.value));

The issue arises because there is no support for directly loading TFmode objects. Ok for trunk?

Dave
--
J. David Anglin dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)

2012-12-01 John David Anglin dave.ang...@nrc-cnrc.gc.ca

	PR middle-end/55198
	* expr.c (expand_expr_real_1): Don't use bitfield extraction for non BLKmode objects when EXPAND_MEMORY is specified.

Index: expr.c
===
--- expr.c (revision 193685)
+++ expr.c (working copy)
@@ -9928,7 +9928,8 @@
 		    && GET_MODE_CLASS (mode) != MODE_COMPLEX_INT
 		    && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT
 		    && modifier != EXPAND_CONST_ADDRESS
-		    && modifier != EXPAND_INITIALIZER)
+		    && modifier != EXPAND_INITIALIZER
+		    && modifier != EXPAND_MEMORY)
 		/* If the field is volatile, we always want an aligned access.
 		   Do this in following two situations:
 		   1. the access is not already naturally
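For readers unfamiliar with the construct, here is a minimal sketch of an empty asm statement with a memory input operand, which is the shape of the statement quoted from fmaq.c. It assumes a compiler that accepts GNU C extended asm (GCC or Clang), and it uses `long double` in place of the quad-precision type because the availability of a TFmode type depends on the target; `force_to_memory` is a made-up name.

```c
/* The empty asm takes VALUE as a memory input operand ("m" constraint),
   so the compiler has to have the value available in memory at this
   point.  No instructions are emitted by the asm itself.  */
static long double
force_to_memory (long double value)
{
  __asm__ volatile ("" : : "m" (value));
  return value;
}
```

The operand's value is not modified by the statement; only the requirement that it be addressable in memory matters, which is why the expander must honour EXPAND_MEMORY here.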
[C++ testcase] PR 55558
Hi,

I'm adding the testcase and closing the PR.

Thanks,
Paolo.

///

2012-12-01 Paolo Carlini paolo.carl...@oracle.com

	PR c++/55558
	* g++.dg/cpp0x/decltype46.C: New.

Index: g++.dg/cpp0x/decltype46.C
===
--- g++.dg/cpp0x/decltype46.C (revision 0)
+++ g++.dg/cpp0x/decltype46.C (working copy)
@@ -0,0 +1,12 @@
+// PR c++/55558
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  static int member;
+};
+
+template<typename T> void foobar ()
+{
+  typedef decltype (A::member) myType;
+}
Re: [Patch, Fortran] PR 55548: SYSTEM_CLOCK with integer(8) provides nanosecond resolution, but only microsecond precision (without -lrt)
Hi Janne, thanks for your feedback ... here is a straightforward patch for the intrinsic procedure SYSTEM_CLOCK. It does two things: 1) It reduces the resolution of the int8 version from 1 nanosecond to 1 microsecond (COUNT_RATE = 100). 2) It adds an int16 version with nanosecond precision. The motivation for item #1 was mainly that the actual precision is usually not better than 1 microsec anyway (unless linking with -lrt). This results in SYSTEM_CLOCK giving values whose last three digits are zero. One can argue that this is not a dramatic misbehavior, but it has disadvantages for certain applications, like e.g. using SYSTEM_CLOCK to initialize the random seed in a Monte-Carlo simulation. In general, I would say that the value of COUNT_RATE should not be larger than the actual precision of the clock used. Moreover, the microsecond resolution for int8 arguments has the advantage that it is compatible with ifort's behavior. Also I think a resolution of 1 microsecond is sufficient for most applications. If someone really needs more, he can now use the int16 version (and link with -lrt). Regtested on x86_64-unknown-linux-gnu (although we don't actually seem to have any test cases for SYSTEM_CLOCK yet). Ok for trunk? Btw, does it make sense to also add an int2 version? If yes, which resolution? Note that most other compilers seem to have an int2 version of SYSTEM_CLOCK ... No, not Ok. IIRC there was some discussion about COUNT_RATE back when the nanosecond resolution system_clock feature was developed, and one idea was to have different count rates depending on whether clock_gettime is available, as you also suggest in the PR. In the end it was decided to keep a constant count rate as a consistent rate was seen as more important than wasting a few of the least significant bits when nanosecond resolution is not available, since int8 has sufficient range for several centuries even with nanosecond resolution. 
Anyway, I don't feel particularly strongly about this, and if there now is a consensus to have a changing count rate, I can live with that. my patch does not implement such a changing (or library-dependent) count rate, but I would indeed prefer this over the current behavior (even more so if there is consensus to reject my current patch). But, I do object to providing only microsecond resolution for the int8 version. Nanosecond resolution is indeed not necessary in most cases, but like the saying goes, Precision, like virginity, is never regained once lost.. Reducing the resolution is a regression for those users who have relied on this feature; well, I wouldn't count it as a real regression, since: a) The Fortran standard does not require a certain resolution, but defines it to be processor dependent. b) You cannot rely on different compilers giving the same resolution values, therefore you cannot expect a fixed resolution if you want to write portable code. I, for one, have several test and benchmark programs which depend on nanosecond resolution for the int8 system_clock. You mean they depend on the count values being delivered in units of nanoseconds? In a portable code you cannot depend on getting actual nanosecond precision, since it may not be available on a certain system. OTOH, if one wants a microsecond count rate, converting the count value is just a single statement following the system_clock call. Certainly converting the units is not a problem. BUT: In a way I think that the current behavior is lying to the user. It pretends to provide a nanosecond resolution, while in fact the values you get might only be precise up to 1 microsecond. In my opinion the COUNT_RATE argument should rather return a value that corresponds to the actual precision of the clock. It should not just report some fixed unit of measure, but tell me something about the precision of my clock. 
If it gives me a value in microseconds, I can simply convert it to nanoseconds if I need to (knowing that the precision is only a microsecond). But if it gives me a value in nanoseconds, how can I know that I'm not getting nanosecond precision?

In summary, I think we should do one of the following:
1) Provide different versions with fixed COUNT_RATE and let the user choose one via the kind of the argument. This is what we do right now, and my patch was expanding on this by providing an additional option. If you want nanosecond resolution, you can still use the int16 version, but now you can also choose to have microsecond resolution (via the int8 option).
2) Adapt the COUNT_RATE according to the actual precision of the clock on the system.
3) A combination of both, where the user could choose via int4/int8/int16 that he does not need more than milli/micro/nanosecond resolution. Then SYSTEM_CLOCK would try to provide this precision, but reduce COUNT_RATE to a lower value if the desired precision cannot be provided.

I think the last one would be most desirable. Any
Re: [PATCH i386] Allow cltd/cqto etc on modern CPUs
On Sat, Dec 1, 2012 at 6:50 AM, Xinliang David Li wrote: 2010-11-30 Xinliang David Li * config/i386/i386.c: Allow sign extend instructions (cltd etc) on modern CPUs. You installed the patch without the ChangeLog entry... (http://gcc.gnu.org/ml/gcc-cvs/2012-12/msg00027.html) Ciao! Steven
[cxx-conversion] LTO-related hash tables
Change LTO-related hash tables from htab_t to hash_table: lto-streamer.h output_block::string_hash_table lto-streamer-in.c file_name_hash_table lto-streamer.c tree_htab The struct string_slot moves from data-streamer.h to lto-streamer.h to resolve compilation dependences. Tested on x86-64. Okay for branch? Index: gcc/ChangeLog 2012-11-30 Lawrence Crowl cr...@google.com * data-streamer.h (struct string_slot): Move to lto-streamer.h. (hash_string_slot_node): Move implementation into lto-streamer.h struct string_slot_hasher. (eq_string_slot_node): Likewise. * data-streamer-out.c: Update output_block::string_hash_table dependent calls and types. * lto-streamer.h (struct string_slot): Move from data-streamer.h (struct string_slot_hasher): New. (htab_t output_block::string_hash_table): Change type to hash_table. Update dependent calls and types. * lto-streamer-in.c (freeing_string_slot_hasher): New. (htab_t file_name_hash_table): Change type to hash_table. Update dependent calls and types. * lto-streamer-out.c: Update output_block::string_hash_table dependent calls and types. * lto-streamer.c (htab_t tree_htab): Change type to hash_table. Update dependent calls and types. * Makefile.in: Update to changes above. 
Index: gcc/data-streamer-out.c
===================================================================
--- gcc/data-streamer-out.c	(revision 193902)
+++ gcc/data-streamer-out.c	(working copy)
@@ -42,8 +42,7 @@ streamer_string_index (struct output_blo
   s_slot.len = len;
   s_slot.slot_num = 0;

-  slot = (struct string_slot **) htab_find_slot (ob->string_hash_table,
-                                                 &s_slot, INSERT);
+  slot = ob->string_hash_table.find_slot (&s_slot, INSERT);
   if (*slot == NULL)
     {
       struct lto_output_stream *string_stream = ob->string_stream;
Index: gcc/lto-streamer-out.c
===================================================================
--- gcc/lto-streamer-out.c	(revision 193902)
+++ gcc/lto-streamer-out.c	(working copy)
@@ -77,8 +77,7 @@ create_output_block (enum lto_section_ty

   clear_line_info (ob);

-  ob->string_hash_table = htab_create (37, hash_string_slot_node,
-                                       eq_string_slot_node, NULL);
+  ob->string_hash_table.create (37);
   gcc_obstack_init (&ob->obstack);

   return ob;
@@ -92,7 +91,7 @@ destroy_output_block (struct output_bloc
 {
   enum lto_section_type section_type = ob->section_type;

-  htab_delete (ob->string_hash_table);
+  ob->string_hash_table.dispose ();

   free (ob->main_stream);
   free (ob->string_stream);
Index: gcc/lto-streamer-in.c
===================================================================
--- gcc/lto-streamer-in.c	(revision 193902)
+++ gcc/lto-streamer-in.c	(working copy)
@@ -49,8 +49,19 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "streamer-hooks.h"

+struct freeing_string_slot_hasher : string_slot_hasher
+{
+  static inline void remove (value_type *);
+};
+
+inline void
+freeing_string_slot_hasher::remove (value_type *v)
+{
+  free (v);
+}
+
 /* The table to hold the file names.  */
-static htab_t file_name_hash_table;
+static hash_table <freeing_string_slot_hasher> file_name_hash_table;


 /* Check that tag ACTUAL has one of the given values.  NUM_TAGS is the
@@ -94,14 +105,14 @@ lto_input_data_block (struct lto_input_b
 static const char *
 canon_file_name (const char *string)
 {
-  void **slot;
+  string_slot **slot;
   struct string_slot s_slot;
   size_t len = strlen (string);

   s_slot.s = string;
   s_slot.len = len;

-  slot = htab_find_slot (file_name_hash_table, &s_slot, INSERT);
+  slot = file_name_hash_table.find_slot (&s_slot, INSERT);
   if (*slot == NULL)
     {
       char *saved_string;
@@ -117,7 +128,7 @@ canon_file_name (const char *string)
     }
   else
     {
-      struct string_slot *old_slot = (struct string_slot *) *slot;
+      struct string_slot *old_slot = *slot;
       return old_slot->s;
     }
 }
@@ -1150,8 +1161,7 @@ void
 lto_reader_init (void)
 {
   lto_streamer_init ();
-  file_name_hash_table = htab_create (37, hash_string_slot_node,
-                                      eq_string_slot_node, free);
+  file_name_hash_table.create (37);
 }


Index: gcc/data-streamer.h
===================================================================
--- gcc/data-streamer.h	(revision 193902)
+++ gcc/data-streamer.h	(working copy)
@@ -44,15 +44,6 @@ struct bitpack_d
   void *stream;
 };

-
-/* String hashing.  */
-struct string_slot
-{
-  const char *s;
-  int len;
-  unsigned int slot_num;
-};
-
 /* In data-streamer.c  */
 void bp_pack_var_len_unsigned (struct bitpack_d *, unsigned HOST_WIDE_INT);
 void bp_pack_var_len_int (struct bitpack_d *, HOST_WIDE_INT);
@@ -90,35 +81,6 @@ const char *bp_unpack_string (struct dat
 unsigned HOST_WIDE_INT
[cxx-conversion] graphite-related hash tables
Change graphite-related hash tables from htab_t to hash_table: graphite-clast-to-gimple.c ivs_params::newivs_index graphite-clast-to-gimple.c ivs_params::params_index graphite-clast-to-gimple.c print_generated_program::params_index graphite-clast-to-gimple.c gloog::newivs_index graphite-clast-to-gimple.c gloog::params_index graphite.c graphite_transform_loops::bb_pbb_mapping sese.c copy_bb_and_scalar_dependences::rename_map Move hash table declarations to a new graphite-htab.h, because they are used in few places. Remove unused: htab_t scop::original_pddrs SCOP_ORIGINAL_PDDRS Remove unused: insert_loop_close_phis insert_guard_phis debug_ivtype_map ivtype_map_elt_info new_ivtype_map_elt Tested on x86-64. Okay for branch? Index: gcc/ChangeLog 2012-11-30 Lawrence Crowl cr...@google.com * graphite-htab.h: New. (typedef hash_table bb_pbb_hasher bb_pbb_htab_type): New. (extern find_pbb_via_hash): Move from graphite-poly.h. (extern loop_is_parallel_p): Move from graphite-poly.h. (extern get_loop_body_pbbs): Move from graphite-poly.h. * graphite-clast-to-gimple.h: (extern gloog) Move to graphite-htab.h. (bb_pbb_map_hash): Fold into bb_pbb_htab_type in graphite-htab.h. (eq_bb_pbb_map): Fold into bb_pbb_htab_type in graphite-htab.h. * graphite-clast-to-gimple.c: Include graphite-htab.h. (htab_t ivs_params::newivs_index): Change type to hash_table. Update dependent calls and types. (htab_t ivs_params::params_index): Likewise. (htab_t print_generated_program::params_index): Likewise. (htab_t gloog::newivs_index): Likewise. (htab_t gloog::params_index): Likewise. * graphite-dependences.c: Include graphite-htab.h. (loop_is_parallel_p): Change hash table type of parameter. * graphite-poly.h (htab_t scop::original_pddrs): Remove unused. (SCOP_ORIGINAL_PDDRS): Remove unused. (extern find_pbb_via_hash): Move to graphite-htab.h. (extern loop_is_parallel_p): Move to graphite-htab.h. (extern get_loop_body_pbbs): Move to graphite-htab.h. * graphite.c: Include graphite-htab.h. 
(htab_t graphite_transform_loops::bb_pbb_mapping): Change type to hash_table. Update dependent calls and types. * sese.h (extern insert_loop_close_phis): Remove unused. (extern insert_guard_phis): Remove unused. (extern debug_ivtype_map): Remove unused. (extern ivtype_map_elt_info): Remove unused. (inline new_ivtype_map_elt): Remove unused. (extern debug_rename_map): Move to .c file. * sese.c (debug_rename_map_1): Make extern. (debug_ivtype_elt): Remove unused. (debug_ivtype_map_1): Remove unused. (debug_ivtype_map): Remove unused. (ivtype_map_elt_info): Remove unused. (eq_ivtype_map_elts): Remove unused. (htab_t copy_bb_and_scalar_dependences::rename_map): Change type to hash_table. Update dependent calls and types. * Makefile.in: Update to changes above. Index: gcc/graphite-htab.h === --- gcc/graphite-htab.h (revision 0) +++ gcc/graphite-htab.h (revision 0) @@ -0,0 +1,60 @@ +/* Translation of CLAST (CLooG AST) to Gimple. + Copyright (C) 2012 Free Software Foundation, Inc. + Contributed by Sebastian Pop sebastian@amd.com. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +#ifndef GCC_GRAPHITE_HTAB_H +#define GCC_GRAPHITE_HTAB_H + +#include hash-table.h +#include graphite-clast-to-gimple.h + +/* Hashtable helpers. 
*/ + +struct bb_pbb_hasher : typed_free_remove bb_pbb_def +{ + typedef bb_pbb_def value_type; + typedef bb_pbb_def compare_type; + static inline hashval_t hash (const value_type *); + static inline bool equal (const value_type *, const compare_type *); +}; + +/* Hash function for data base element BB_PBB. */ + +inline hashval_t +bb_pbb_hasher::hash (const value_type *bb_pbb) +{ + return (hashval_t)(bb_pbb-bb-index); +} + +/* Compare data base element PB1 and PB2. */ + +inline bool +bb_pbb_hasher::equal (const value_type *bp1, const compare_type *bp2) +{ + return (bp1-bb-index == bp2-bb-index); +} + +typedef hash_table bb_pbb_hasher bb_pbb_htab_type; + +extern bool gloog (scop_p, bb_pbb_htab_type); +poly_bb_p find_pbb_via_hash (bb_pbb_htab_type,
[cxx-conversion] gimplify_ctx::temp_htab hash table
Change gimplify.c gimplify_ctx::temp_htab hash table from htab_t to hash_table.

Move struct gimple_temp_hash_elt and struct gimplify_ctx to a new gimplify-ctx.h, because they are used in few places.

Tested on x86-64.

Okay for branch?

Index: gcc/ChangeLog

2012-11-30 Lawrence Crowl cr...@google.com

	* gimple.h (struct gimplify_ctx): Move to gimplify-ctx.h.
	(push_gimplify_context): Likewise.
	(pop_gimplify_context): Likewise.
	* gimplify-ctx.h: New.
	(struct gimple_temp_hash_elt): Move from gimplify.c.
	(class gimplify_hasher): New.
	(struct gimplify_ctx): Move from gimple.h.
	(htab_t gimplify_ctx::temp_htab): Change type to hash_table.
	Update dependent calls and types.
	* gimple-fold.c: Include gimplify-ctx.h.
	* gimplify.c: Include gimplify-ctx.h.
	(struct gimple_temp_hash_elt): Move to gimplify-ctx.h.
	(htab_t gimplify_ctx::temp_htab): Update dependent calls and types
	for new type hash_table.
	(gimple_tree_hash): Move into gimplify_hasher in gimplify-ctx.h.
	(gimple_tree_eq): Move into gimplify_hasher in gimplify-ctx.h.
	* omp-low.c: Include gimplify-ctx.h.
	* tree-inline.c: Include gimplify-ctx.h.
	* tree-mudflap.c: Include gimplify-ctx.h.
	* Makefile.in: Update to changes above.

Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 193902)
+++ gcc/omp-low.c	(working copy)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "rtl.h"
 #include "gimple.h"
+#include "gimplify-ctx.h"
 #include "tree-iterator.h"
 #include "tree-inline.h"
 #include "langhooks.h"
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 193902)
+++ gcc/gimplify.c	(working copy)
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.
#include splay-tree.h #include vec.h #include gimple.h +#include gimplify-ctx.h #include langhooks-def.h /* FIXME: for lhd_set_decl_assembler_name */ #include tree-pass.h /* FIXME: only for PROP_gimple_any */ @@ -88,15 +89,6 @@ static struct gimplify_ctx *gimplify_ctx static struct gimplify_omp_ctx *gimplify_omp_ctxp; -/* Formal (expression) temporary table handling: multiple occurrences of - the same scalar expression are evaluated into the same temporary. */ - -typedef struct gimple_temp_hash_elt -{ - tree val; /* Key */ - tree temp; /* Value */ -} elt_t; - /* Forward declaration. */ static enum gimplify_status gimplify_compound_expr (tree *, gimple_seq *, bool); @@ -131,40 +123,6 @@ mark_addressable (tree x) } } -/* Return a hash value for a formal temporary table entry. */ - -static hashval_t -gimple_tree_hash (const void *p) -{ - tree t = ((const elt_t *) p)-val; - return iterative_hash_expr (t, 0); -} - -/* Compare two formal temporary table entries. */ - -static int -gimple_tree_eq (const void *p1, const void *p2) -{ - tree t1 = ((const elt_t *) p1)-val; - tree t2 = ((const elt_t *) p2)-val; - enum tree_code code = TREE_CODE (t1); - - if (TREE_CODE (t2) != code - || TREE_TYPE (t1) != TREE_TYPE (t2)) -return 0; - - if (!operand_equal_p (t1, t2, 0)) -return 0; - -#ifdef ENABLE_CHECKING - /* Only allow them to compare equal if they also hash equal; otherwise - results are nondeterminate, and we fail bootstrap comparison. */ - gcc_assert (gimple_tree_hash (p1) == gimple_tree_hash (p2)); -#endif - - return 1; -} - /* Link gimple statement GS to the end of the sequence *SEQ_P. If *SEQ_P is NULL, a new sequence is allocated. This function is similar to gimple_seq_add_stmt, but does not scan the operands. @@ -242,8 +200,8 @@ pop_gimplify_context (gimple body) else record_vars (c-temps); - if (c-temp_htab) -htab_delete (c-temp_htab); + if (c-temp_htab.is_created ()) +c-temp_htab.dispose (); } /* Push a GIMPLE_BIND tuple onto the stack of bindings. 
*/ @@ -586,23 +544,22 @@ lookup_tmp_var (tree val, bool is_formal else { elt_t elt, *elt_p; - void **slot; + elt_t **slot; elt.val = val; - if (gimplify_ctxp-temp_htab == NULL) -gimplify_ctxp-temp_htab - = htab_create (1000, gimple_tree_hash, gimple_tree_eq, free); - slot = htab_find_slot (gimplify_ctxp-temp_htab, (void *)elt, INSERT); + if (!gimplify_ctxp-temp_htab.is_created ()) +gimplify_ctxp-temp_htab.create (1000); + slot = gimplify_ctxp-temp_htab.find_slot (elt, INSERT); if (*slot == NULL) { elt_p = XNEW (elt_t); elt_p-val = val; elt_p-temp = ret = create_tmp_from_val (val, is_formal); - *slot = (void *) elt_p; + *slot = elt_p; } else { - elt_p = (elt_t *) *slot; + elt_p = *slot; ret = elt_p-temp; } } Index: gcc/gimple-fold.c
[cxx-conversion] tree-related hash tables
Change tree-related hash tables from htab_t to hash_table: tree-complex.c complex_variable_components tree-parloops.c eliminate_local_variables_stmt::decl_address tree-parloops.c separate_decls_in_region::decl_copies Move hash table declarations to a new tree-hasher.h, to resolve compilation dependences and because they are used in few places. Tested on x86-64. Okay for branch? Index: gcc/ChangeLog 2012-11-30 Lawrence Crowl cr...@google.com * tree-hasher.h: New. (struct int_tree_hasher): New. (typedef int_tree_htab_type): New. * tree-flow.h (extern int_tree_map_hash): Moved into tree-hasher struct int_tree_hasher. (extern int_tree_map_eq): Likewise. * tree-complex.c: Include tree-hasher.h (htab_t complex_variable_components): Change type to hash_table. Update dependent calls and types. * tree-parloops.c: Include tree-hasher.h. (htab_t eliminate_local_variables_stmt::decl_address): Change type to hash_table. Update dependent calls and types. (htab_t separate_decls_in_region::decl_copies): Likewise. * tree-ssa.c (int_tree_map_eq): Moved into struct int_tree_hasher in tree-flow.h. (int_tree_map_hash): Likewise. * Makefile.in: Update to changes above. Index: gcc/tree-complex.c === --- gcc/tree-complex.c (revision 193902) +++ gcc/tree-complex.c (working copy) @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. #include tree-iterator.h #include tree-pass.h #include tree-ssa-propagate.h +#include tree-hasher.h /* For each complex ssa name, a lattice value. We're interested in finding @@ -54,7 +55,7 @@ static veccomplex_lattice_t complex_la /* For each complex variable, a pair of variables for the components exists in the hashtable. */ -static htab_t complex_variable_components; +static int_tree_htab_type complex_variable_components; /* For each complex SSA_NAME, a pair of ssa names for the components. 
*/ static vectree complex_ssa_name_components; @@ -66,7 +67,7 @@ cvc_lookup (unsigned int uid) { struct int_tree_map *h, in; in.uid = uid; - h = (struct int_tree_map *) htab_find_with_hash (complex_variable_components, in, uid); + h = complex_variable_components.find_with_hash (in, uid); return h ? h-to : NULL; } @@ -76,14 +77,13 @@ static void cvc_insert (unsigned int uid, tree to) { struct int_tree_map *h; - void **loc; + int_tree_map **loc; h = XNEW (struct int_tree_map); h-uid = uid; h-to = to; - loc = htab_find_slot_with_hash (complex_variable_components, h, - uid, INSERT); - *(struct int_tree_map **) loc = h; + loc = complex_variable_components.find_slot_with_hash (h, uid, INSERT); + *loc = h; } /* Return true if T is not a zero constant. In the case of real values, @@ -1569,8 +1569,7 @@ tree_lower_complex (void) init_parameter_lattice_values (); ssa_propagate (complex_visit_stmt, complex_visit_phi); - complex_variable_components = htab_create (10, int_tree_map_hash, -int_tree_map_eq, free); + complex_variable_components.create (10); complex_ssa_name_components.create (2 * num_ssa_names); complex_ssa_name_components.safe_grow_cleared (2 * num_ssa_names); @@ -1591,7 +1590,7 @@ tree_lower_complex (void) gsi_commit_edge_inserts (); - htab_delete (complex_variable_components); + complex_variable_components.dispose (); complex_ssa_name_components.release (); complex_lattice_values.release (); return 0; Index: gcc/tree-hasher.h === --- gcc/tree-hasher.h (revision 0) +++ gcc/tree-hasher.h (revision 0) @@ -0,0 +1,56 @@ +/* Data and Control Flow Analysis for Trees. + Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, + 2012 Free Software Foundation, Inc. + Contributed by Diego Novillo dnovi...@redhat.com + +This file is part of GCC. 
+ +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +#ifndef GCC_TREE_HASHER_H +#define GCC_TREE_HASHER_H 1 + +#include hash-table.h +#include tree-flow.h + +/* Hashtable helpers. */ + +struct int_tree_hasher : typed_free_remove int_tree_map +{ + typedef int_tree_map value_type; + typedef int_tree_map compare_type; + static inline hashval_t hash (const
[cxx-conversion] ggc-common hash tables
Change ggc-common hash tables from htab_t to hash_table:

   ggc-common.c loc_hash
   ggc-common.c ptr_hash

Add a new hash_table method elements_with_deleted to meet the needs of ggc-common.c.

Correct many methods with parameter types compare_type to the correct value_type. (Correct code was unlikely to notice the change, but incorrect code will.)

Tested on x86-64.

Okay for branch?

Index: gcc/ChangeLog

2012-11-30 Lawrence Crowl cr...@google.com

	* hash-table.h (class hash_table): Correct many methods with
	parameter types compare_type to the correct value_type.
	(Correct code was unlikely to notice the change.)
	(hash_table::elements_with_deleted): New.
	* ggc-common.c (htab_t saving_htab): Change type to hash_table.
	Update dependent calls and types.
	(htab_t loc_hash): Likewise.
	(htab_t ptr_hash): Likewise.
	(call_count): Rename ggc_call_count.
	(call_alloc): Rename ggc_call_alloc.
	(loc_descriptor): Rename make_loc_descriptor.
	(add_statistics): Rename ggc_add_statistics.
	* Makefile.in: Update to changes above.
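The difference between elements and the new elements_with_deleted can be modelled with two counters, as in the patch below where one returns n_elements - n_deleted and the other returns n_elements. The struct here is a hypothetical stand-in written for illustration; only the two accessor names and the arithmetic come from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the two counters that the accessors read.
struct slot_counts {
    std::size_t n_elements = 0;  // slots filled, including deleted markers
    std::size_t n_deleted = 0;   // of those, slots now marked as deleted

    void insert() { ++n_elements; }
    void remove() {
        assert(n_deleted < n_elements);
        ++n_deleted;             // the slot stays occupied by a marker
    }

    // Live entries only.
    std::size_t elements() const { return n_elements - n_deleted; }
    // Live entries plus slots still occupied by deleted markers.
    std::size_t elements_with_deleted() const { return n_elements; }
};
```

After three insertions and one removal, elements() yields 2 while elements_with_deleted() yields 3.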
Index: gcc/hash-table.h
===================================================================
--- gcc/hash-table.h	(revision 193902)
+++ gcc/hash-table.h	(working copy)
@@ -380,19 +380,19 @@ public:
   void create (size_t initial_slots);
   bool is_created ();
   void dispose ();
-  value_type *find (const compare_type *comparable);
+  value_type *find (const value_type *value);
   value_type *find_with_hash (const compare_type *comparable, hashval_t hash);
-  value_type **find_slot (const compare_type *comparable,
-                          enum insert_option insert);
+  value_type **find_slot (const value_type *value, enum insert_option insert);
   value_type **find_slot_with_hash (const compare_type *comparable,
                                     hashval_t hash, enum insert_option insert);
   void empty ();
   void clear_slot (value_type **slot);
-  void remove_elt (const compare_type *comparable);
+  void remove_elt (const value_type *value);
   void remove_elt_with_hash (const compare_type *comparable, hashval_t hash);
-  size_t size();
-  size_t elements();
-  double collisions();
+  size_t size ();
+  size_t elements ();
+  size_t elements_with_deleted ();
+  double collisions ();

   template <typename Argument,
             int (*Callback) (value_type **slot, Argument argument)>
@@ -431,9 +431,9 @@ hash_table <Descriptor, Allocator>::is_c
 template <typename Descriptor,
           template <typename Type> class Allocator>
 inline typename Descriptor::value_type *
-hash_table <Descriptor, Allocator>::find (const compare_type *comparable)
+hash_table <Descriptor, Allocator>::find (const value_type *value)
 {
-  return find_with_hash (comparable, Descriptor::hash (comparable));
+  return find_with_hash (value, Descriptor::hash (value));
 }


@@ -443,9 +443,9 @@ template <typename Descriptor,
           template <typename Type> class Allocator>
 inline typename Descriptor::value_type **
 hash_table <Descriptor, Allocator>
-::find_slot (const compare_type *comparable, enum insert_option insert)
+::find_slot (const value_type *value, enum insert_option insert)
 {
-  return find_slot_with_hash (comparable, Descriptor::hash (comparable), insert);
+  return find_slot_with_hash (value, Descriptor::hash (value), insert);
 }


@@ -454,9 +454,9 @@ hash_table <Descriptor, Allocator>
 template <typename Descriptor,
           template <typename Type> class Allocator>
 inline void
-hash_table <Descriptor, Allocator>::remove_elt (const compare_type *comparable)
+hash_table <Descriptor, Allocator>::remove_elt (const value_type *value)
 {
-  remove_elt_with_hash (comparable, Descriptor::hash (comparable));
+  remove_elt_with_hash (value, Descriptor::hash (value));
 }


@@ -476,12 +476,23 @@ hash_table <Descriptor, Allocator>::size
 template <typename Descriptor,
           template <typename Type> class Allocator>
 inline size_t
-hash_table <Descriptor, Allocator>::elements()
+hash_table <Descriptor, Allocator>::elements ()
 {
   return htab->n_elements - htab->n_deleted;
 }

+/* Return the current number of elements in this hash table. */
+
+template <typename Descriptor,
+          template <typename Type> class Allocator>
+inline size_t
+hash_table <Descriptor, Allocator>::elements_with_deleted ()
+{
+  return htab->n_elements;
+}
+
+

 /* Return the fraction of fixed collisions during all work with given
    hash table. */

Index: gcc/ggc-common.c
===================================================================
--- gcc/ggc-common.c	(revision 193902)
+++ gcc/ggc-common.c	(working copy)
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "hashtab.h"
+#include "hash-table.h"
 #include "ggc.h"
 #include "ggc-internal.h"
 #include "diagnostic-core.h"
@@ -47,10 +47,6 @@ static ggc_statistics *ggc_stats;
 struct traversal_state;

 static int ggc_htab_delete (void **, void *);
Simplify a VEC_SELECT from one half of a VEC_CONCAT
Hello,

in PR50829, HJ Lu pointed me to this PR for which I already had a patch (I hadn't submitted it because I didn't have a good use case for it).

bootstrap+testsuite on x86_64-linux.

2012-12-02 Marc Glisse marc.gli...@inria.fr

	PR target/44551
gcc/
	* simplify-rtx.c (simplify_binary_operation_1) <VEC_SELECT>: Detect
	when all elements come from one half of a VEC_CONCAT.

gcc/testsuite/
	* gcc.target/i386/pr44551.c: New testcase.

--
Marc Glisse

Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 194037)
+++ gcc/simplify-rtx.c	(working copy)
@@ -3482,44 +3482,77 @@ simplify_binary_operation_1 (enum rtx_co
                   rtx subop0, subop1;

                   gcc_assert (i0 < 2 && i1 < 2);
                   subop0 = XEXP (trueop0, i0);
                   subop1 = XEXP (trueop0, i1);

                   return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1);
                 }
             }

-          if (XVECLEN (trueop1, 0) == 1
-              && CONST_INT_P (XVECEXP (trueop1, 0, 0))
-              && GET_CODE (trueop0) == VEC_CONCAT)
+          /* Detect if all the elements come from the same subpart of a concat.  */
+          if (GET_CODE (trueop0) == VEC_CONCAT)
             {
-              rtx vec = trueop0;
-              int offset = INTVAL (XVECEXP (trueop1, 0, 0)) * GET_MODE_SIZE (mode);
+              rtx new_op0 = NULL_RTX;
+              rtx new_op1 = NULL_RTX;
+              int first = 0;
+              int second = 0;
+              unsigned nelts_first_half = 1;
+              enum machine_mode mode_first_half = GET_MODE (XEXP (trueop0, 0));
+              if (VECTOR_MODE_P (mode_first_half))
+                {
+                  int elt_size = GET_MODE_SIZE (GET_MODE_INNER (mode_first_half));
+                  nelts_first_half = (GET_MODE_SIZE (mode_first_half) / elt_size);
+                }

-              /* Try to find the element in the VEC_CONCAT.  */
-              while (GET_MODE (vec) != mode
-                     && GET_CODE (vec) == VEC_CONCAT)
+              for (int i = 0; i < XVECLEN (trueop1, 0); i++)
                 {
-                  HOST_WIDE_INT vec_size = GET_MODE_SIZE (GET_MODE (XEXP (vec, 0)));
-                  if (offset < vec_size)
-                    vec = XEXP (vec, 0);
+                  rtx j = XVECEXP (trueop1, 0, i);
+                  if (!CONST_INT_P (j))
+                    {
+                      first++;
+                      second++;
+                      break;
+                    }
+                  if (INTVAL (j) < nelts_first_half)
+                    first++;
                   else
+                    second++;
+                }
+
+              if (second == 0)
+                {
+                  new_op0 = XEXP (trueop0, 0);
+                  new_op1 = trueop1;
+                }
+              else if (first == 0)
+                {
+                  int len = XVECLEN (trueop1, 0);
+                  rtvec vec = rtvec_alloc (len);
+                  for (int i = 0; i < len; i++)
                     {
-                      offset -= vec_size;
-                      vec = XEXP (vec, 1);
+                      int j = INTVAL (XVECEXP (trueop1, 0, i)) - nelts_first_half;
+                      RTVEC_ELT (vec, i) = GEN_INT (j);
                     }
-                  vec = avoid_constant_pool_reference (vec);
+                  new_op0 = XEXP (trueop0, 1);
+                  new_op1 = gen_rtx_PARALLEL (VOIDmode, vec);
                 }

-              if (GET_MODE (vec) == mode)
-                return vec;
+              if (new_op0)
+                {
+                  if (VECTOR_MODE_P (GET_MODE (new_op0)))
+                    return simplify_gen_binary (VEC_SELECT, mode, new_op0, new_op1);
+                  if (VECTOR_MODE_P (mode))
+                    return simplify_gen_unary (VEC_DUPLICATE, mode, new_op0,
+                                               GET_MODE (new_op0));
+                  return new_op0;
+                }
             }

           return 0;
         case VEC_CONCAT:
           {
             enum machine_mode op0_mode = (GET_MODE (trueop0) != VOIDmode
                                           ? GET_MODE (trueop0)
                                           : GET_MODE_INNER (mode));
             enum machine_mode op1_mode = (GET_MODE (trueop1) != VOIDmode
                                           ? GET_MODE (trueop1)
Index: gcc/testsuite/gcc.target/i386/pr44551.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr44551.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr44551.c	(revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mavx" } */
+
+#include <immintrin.h>
+
+__m128i
+foo (__m256i x, __m128i y)
+{
+  __m256i r = _mm256_insertf128_si256(x, y, 1);
+  __m128i a = _mm256_extractf128_si256(r, 1);
+  return a;
+}
+
+/* { dg-final { scan-assembler-not "insert" } } */
+/* { dg-final { scan-assembler-not "extract" } } */

Property changes on: gcc/testsuite/gcc.target/i386/pr44551.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native
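The index bookkeeping in the simplify-rtx.c hunk can be read in isolation as the sketch below. It is a simplified model using plain integers instead of RTL: the type and function names (half, narrowed_select, narrow_select) are made up for illustration, and the patch's extra case of a non-constant selector element, which it treats as mixed, is left out.

```cpp
#include <cassert>
#include <vector>

// Which operand of the concat a selector draws from.
enum class half { first, second, mixed };

struct narrowed_select {
    half source;               // operand that supplies every element
    std::vector<int> indices;  // selector rebased onto that operand
};

// Classify a VEC_SELECT-style selector over a concat whose first
// operand has nelts_first_half elements.
narrowed_select narrow_select(const std::vector<int> &selector,
                              int nelts_first_half) {
    assert(nelts_first_half > 0);
    int first = 0, second = 0;
    for (int j : selector) {
        if (j < nelts_first_half)
            ++first;
        else
            ++second;
    }
    if (second == 0)
        return {half::first, selector};  // indices are already valid
    if (first == 0) {
        std::vector<int> rebased;
        for (int j : selector)
            rebased.push_back(j - nelts_first_half);
        return {half::second, rebased};
    }
    return {half::mixed, {}};            // no simplification applies
}
```

For a concat of two 4-element vectors, the selector {4, 5, 6, 7} is classified as coming from the second operand with rebased indices {0, 1, 2, 3}, which corresponds to the insert-then-extract pattern in the new testcase.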
[PATCH] Fix PR55521 by switching libsanitizer from mach_override to mac interpose functions on darwin
The attached patch eliminates PR 55521/sanitizer by switching libasan on darwin from using mach_override to mac function interposition via the importation of the asan/dynamic/asan_interceptors_dynamic.cc file from llvm.org's compiler-rt svn. The changes involve defining USING_MAC_INTERPOSE in configure.ac rather than USING_MACH_OVERRIDE, introduction of the use of USING_MAC_INTERPOSE in Makefile.am to avoid building the interception subdirectory, the passage of -DMAC_INTERPOSE_FUNCTIONS in asan/Makefile.am when USING_MAC_INTERPOSE as well as the introduction of a -DMISSING_BLOCKS_SUPPORT flag to disable code that requires blocks support which FSF gcc lacks. The deprecated usage of USING_MACH_OVERRIDE is also removed from interception/Makefile.am.

Bootstrapped on x86_64-apple-darwin10, x86_64-apple-darwin11 and x86_64-apple-darwin12. Passes...

make -k check RUNTESTFLAGS="asan.exp --target_board=unix'{-m32,-m64}'"

and fixes the previously failing cond1.C test case from PR55521 on all three targets. Okay for gcc trunk?

Jack

2012-12-01 Kostya Serebryany k...@google.com
	   Jack Howarth howa...@bromo.med.uc.edu

/libsanitizer

	PR 55521/sanitizer
	* configure.ac: Define USING_MAC_INTERPOSE when on darwin.
	* Makefile.am: Don't build interception subdir when
	USING_MAC_INTERPOSE defined.
	* asan/Makefile.am: Pass -DMAC_INTERPOSE_FUNCTIONS and
	-DMISSING_BLOCKS_SUPPORT when USING_MAC_INTERPOSE defined.
	Compile asan_interceptors_dynamic.cc but not libinterception
	when USING_MAC_INTERPOSE defined.
	* interception/Makefile.am: Remove usage of USING_MACH_OVERRIDE.
	* configure: Regenerated.
	* Makefile.in: Likewise.
	* asan/Makefile.in: Likewise.
	* interception/Makefile.in: Likewise.
	* asan/asan_intercepted_functions.h: Use MISSING_BLOCKS_SUPPORT.
	* asan/asan_mac.cc: Likewise.
	* asan/dynamic/asan_interceptors_dynamic.cc: Migrate from llvm
	and use MISSING_BLOCKS_SUPPORT.
Index: libsanitizer/asan/asan_mac.cc
===================================================================
--- libsanitizer/asan/asan_mac.cc	(revision 194037)
+++ libsanitizer/asan/asan_mac.cc	(working copy)
@@ -383,7 +383,7 @@ INTERCEPTOR(void, dispatch_group_async_f
                                asan_dispatch_call_block_and_release);
 }

-#if MAC_INTERPOSE_FUNCTIONS
+#if defined(MAC_INTERPOSE_FUNCTIONS) && !defined(MISSING_BLOCKS_SUPPORT)
 // dispatch_async, dispatch_group_async and others tailcall the corresponding
 // dispatch_*_f functions. When wrapping functions with mach_override, those
 // dispatch_*_f are intercepted automatically. But with dylib interposition
Index: libsanitizer/asan/asan_intercepted_functions.h
===================================================================
--- libsanitizer/asan/asan_intercepted_functions.h	(revision 194037)
+++ libsanitizer/asan/asan_intercepted_functions.h	(working copy)
@@ -203,7 +203,7 @@ DECLARE_FUNCTION_AND_WRAPPER(void, __CFI
 DECLARE_FUNCTION_AND_WRAPPER(CFStringRef, CFStringCreateCopy,
                              CFAllocatorRef alloc, CFStringRef str);
 DECLARE_FUNCTION_AND_WRAPPER(void, free, void* ptr);
-#if MAC_INTERPOSE_FUNCTIONS
+#if defined(MAC_INTERPOSE_FUNCTIONS) && !defined(MISSING_BLOCKS_SUPPORT)
 DECLARE_FUNCTION_AND_WRAPPER(void, dispatch_group_async,
                              dispatch_group_t dg,
                              dispatch_queue_t dq, void (^work)(void));
Index: libsanitizer/asan/Makefile.am
===================================================================
--- libsanitizer/asan/Makefile.am	(revision 194037)
+++ libsanitizer/asan/Makefile.am	(working copy)
@@ -4,6 +4,9 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)

 DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_HAS_EXCEPTIONS=1 -DASAN_FLEXIBLE_MAPPING_AND_OFFSET=0 -DASAN_NEEDS_SEGV=1
+if USING_MAC_INTERPOSE
+DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
+endif
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config

@@ -29,8 +32,14 @@ asan_files = \
 	asan_thread.cc \
 	asan_win.cc

-libasan_la_SOURCES = $(asan_files)
+libasan_la_SOURCES = $(asan_files)
+if USING_MAC_INTERPOSE
+libasan_la_SOURCES += dynamic/asan_interceptors_dynamic.cc
+libasan_la_LIBADD = $(top_builddir)/sanitizer_common/libsanitizer_common.la $(top_builddir)/../libstdc++-v3/src/libstdc++.la
+else
 libasan_la_LIBADD = $(top_builddir)/sanitizer_common/libsanitizer_common.la $(top_builddir)/interception/libinterception.la
Re: [PATCH] Fix PR35634
Richard, The testcases assume default signed char and fail on systems with different semantics. I believe that both testcases need to declare c as signed char to consistently test the desired behavior, right? Thanks, David
[doc] extend.texi copy-editing, 8/N (odds and ends)
This patch is another in my series of copy-edits to extend.texi. This installment covers random problems with grammar, punctuation, terminology, etc. that jumped out at me when reading through the chapter, rather than being a systematic search-and-destroy on pervasive usage problems (as most of the previous patches in this series were). Since these changes are supposed to be content-free, I've checked this in.

I've probably reached the end of this sort of microscopic improvement to the text of this chapter. However, I'm still kind of unhappy with the organization of the material and I may attempt to do some rearrangement to e.g. better group all the attribute discussion together.

-Sandra

2012-12-02 Sandra Loosemore san...@codesourcery.com

        gcc/
        * doc/extend.texi: Various corrections to punctuation and grammar
        throughout the file. Use consistent terminology and proper names.
        Correct some minor markup issues.

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 194037)
+++ gcc/doc/extend.texi (working copy)
@@ -170,8 +170,8 @@ statement expression, and that is used t
 Therefore the @code{this} pointer observed by @code{Foo} is not the
 address of @code{a}.
 
-Any temporaries created within a statement within a statement expression
-are destroyed at the statement's end. This makes statement
+In a statement expression, any temporaries created within a statement
+are destroyed at that statement's end. This makes statement
 expressions inside macros slightly different from function calls. In
 the latter case temporaries introduced during argument evaluation are
 destroyed at the end of the statement that includes the function
@@ -196,22 +196,22 @@ the initialization of @code{b}. In the
 temporary is destroyed when the function returns.
 
 These considerations mean that it is probably a bad idea to use
-statement-expressions of this form in header files that are designed to
+statement expressions of this form in header files that are designed to
 work with C++. (Note that some versions of the GNU C Library contained
-header files using statement-expression that lead to precisely this
+header files using statement expressions that lead to precisely this
 bug.)
 
 Jumping into a statement expression with @code{goto} or using a
 @code{switch} statement outside the statement expression with a
 @code{case} or @code{default} label inside the statement expression is
 not permitted. Jumping into a statement expression with a computed
-@code{goto} (@pxref{Labels as Values}) yields undefined behavior.
+@code{goto} (@pxref{Labels as Values}) has undefined behavior.
 Jumping out of a statement expression is permitted, but if the
 statement expression is part of a larger expression then it is
 unspecified which other subexpressions of that expression have been
 evaluated except where the language definition requires certain
 subexpressions to be evaluated before or after the statement
-expression. In any case, as with a function call the evaluation of a
+expression. In any case, as with a function call, the evaluation of a
 statement expression is not interleaved with the evaluation of other
 parts of the containing expression. For example,
 
@@ -278,7 +278,7 @@ do @{
 @} while (0)
 @end smallexample
 
-This could also be written using a statement-expression:
+This could also be written using a statement expression:
 
 @smallexample
 #define SEARCH(array, target) \
@@ -391,9 +391,12 @@ variable initializer, inlining and cloni
 @cindex thunks
 
 A @dfn{nested function} is a function defined inside another function.
-(Nested functions are not supported for GNU C++.) The nested function's
-name is local to the block where it is defined. For example, here we
-define a nested function named @code{square}, and call it twice:
+Nested functions are supported as an extension in GNU C, but are not
+supported by GNU C++.
+
+The nested function's name is local to the block where it is defined.
+For example, here we define a nested function named @code{square}, and
+call it twice:
 
 @smallexample
 @group
@@ -665,7 +668,7 @@ If you are writing a header file that mu
 programs, write @code{__typeof__} instead of @code{typeof}.
 @xref{Alternate Keywords}.
 
-A @code{typeof}-construct can be used anywhere a typedef name could be
+A @code{typeof} construct can be used anywhere a typedef name can be
 used. For example, you can use it in a declaration, in a cast, or inside
 of @code{sizeof} or @code{typeof}.
 
@@ -673,8 +676,9 @@ The operand of @code{typeof} is evaluate
 only if it is an expression of variably modified type or the name of
 such a type.
 
-@code{typeof} is often useful in conjunction with the
-statements-within-expressions feature. Here is how the two together can
+@code{typeof} is often useful in conjunction with
+statement expressions (@pxref{Statement Exprs}).
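For readers who want to see the two features from the last hunk side by side, here is a minimal sketch of typeof used inside a statement expression. It is not taken from extend.texi: the macro name MAX_ONCE and the helpers next_value and max_of_next are hypothetical, and __typeof__ is used instead of typeof so that the sketch should also build in ISO C modes, as the text above recommends for headers.

```c
#include <assert.h>

/* Hypothetical macro: copies each argument into a local of the same
   type, so each argument is evaluated exactly once.  The ({ ... })
   form is the GNU C statement expression; __extension__ only keeps
   pedantic builds quiet about it.  */
#define MAX_ONCE(a, b)                    \
  __extension__ ({ __typeof__ (a) _a = (a); \
                   __typeof__ (b) _b = (b); \
                   _a > _b ? _a : _b; })

/* Hypothetical helper with a visible side effect: increments *COUNTER
   and returns the new value.  */
static int
next_value (int *counter)
{
  return ++*counter;
}

/* Returns the larger of the next counter value and FLOOR.  Because
   MAX_ONCE evaluates its arguments once, the counter is bumped once.  */
static int
max_of_next (int *counter, int floor)
{
  int result = MAX_ONCE (next_value (counter), floor);
  assert (result >= floor);
  return result;
}
```

With a plain `((a) > (b) ? (a) : (b))` macro the call to next_value would run twice whenever it produced the larger value; the statement expression avoids that.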
Re: [PATCH] Fix PR55521 by switching libsanitizer from mach_override to mac interpose functions on darwin
Hi Jack,

IIUC the wrappers for dispatch_async_f, dispatch_sync_f and other dispatch_smth_f do not need blocks support in the compiler, since regular functions are passed into them. So you may want to add the dynamic interceptors for those back.

The remaining problem is that dispatch_async and other functions using blocks won't be intercepted. This may lead to assertion failures in big projects (e.g. we needed those for Chrome).

Overall, the change looks good. Do you want me to backport MISSING_BLOCKS_SUPPORT into the LLVM version of the runtime?

Alex

On Sun, Dec 2, 2012 at 6:43 AM, Jack Howarth howa...@bromo.med.uc.edu wrote:

The attached patch eliminates PR 55521/sanitizer by switching libasan on darwin from using mach_override to mac function interposition via the importation of the asan/dynamic/asan_interceptors_dynamic.cc file from llvm.org's compiler-rt svn. The changes involve defining USING_MAC_INTERPOSE in configure.ac rather than USING_MACH_OVERRIDE, introduction of the use of USING_MAC_INTERPOSE in Makefile.am to avoid building the interception subdirectory, the passage of -DMAC_INTERPOSE_FUNCTIONS in asan/Makefile.am when USING_MAC_INTERPOSE as well as the introduction of a -DMISSING_BLOCKS_SUPPORT flag to disable code that requires blocks support which FSF gcc lacks. The deprecated usage of USING_MACH_OVERRIDE is also removed from interception/Makefile.am.

Bootstrapped on x86_64-apple-darwin10, x86_64-apple-darwin11 and x86_64-apple-darwin12. Passes...

make -k check RUNTESTFLAGS=asan.exp --target_board=unix'{-m32,-m64}'

and fixes the previously failing cond1.C test case from PR55521 on all three targets. Okay for gcc trunk?

Jack

--
Alexander Potapenko
Software Engineer
Google Moscow
[PATCH] Fix PR gcov-profile/55551 (issue6868045)
2012-12-01 Teresa Johnson tejohn...@google.com

        PR gcov-profile/55551
        * lto-cgraph.c (merge_profile_summaries): Handle scaled histogram
        entries that map to the same index.

Index: lto-cgraph.c
===================================================================
--- lto-cgraph.c (revision 193999)
+++ lto-cgraph.c (working copy)
@@ -1345,7 +1345,8 @@ merge_profile_summaries (struct lto_file_decl_data
           /* Save a pointer to the profile_info with the largest
              scaled sum_all and the scale for use in merging the
              histogram. */
-          if (lto_gcov_summary.sum_all > saved_sum_all)
+          if (!saved_profile_info
+              || lto_gcov_summary.sum_all > saved_sum_all)
             {
               saved_profile_info = &file_data->profile_info;
               saved_sum_all = lto_gcov_summary.sum_all;
@@ -1363,17 +1364,20 @@ merge_profile_summaries (struct lto_file_decl_data
              above. Use that to find the new histogram index. */
           int scaled_min = RDIV (saved_profile_info->histogram[h_ix].min_value
                                  * saved_scale, REG_BR_PROB_BASE);
+          /* The new index may be shared with another scaled histogram entry,
+             so we need to account for a non-zero histogram entry at new_ix. */
           unsigned new_ix = gcov_histo_index (scaled_min);
-          lto_gcov_summary.histogram[new_ix].min_value = scaled_min;
+          lto_gcov_summary.histogram[new_ix].min_value
+              = MIN (lto_gcov_summary.histogram[new_ix].min_value, scaled_min);
           /* Some of the scaled counter values would ostensibly need to be placed
              into different (larger) histogram buckets, but we keep things simple
              here and place the scaled cumulative counter value in the bucket
              corresponding to the scaled minimum counter value. */
           lto_gcov_summary.histogram[new_ix].cum_value
-              = RDIV (saved_profile_info->histogram[h_ix].cum_value
-                      * saved_scale, REG_BR_PROB_BASE);
+              += RDIV (saved_profile_info->histogram[h_ix].cum_value
+                       * saved_scale, REG_BR_PROB_BASE);
           lto_gcov_summary.histogram[new_ix].num_counters
-              = saved_profile_info->histogram[h_ix].num_counters;
+              += saved_profile_info->histogram[h_ix].num_counters;
         }
 
       /* Watch roundoff errors. */

--
This patch is available for review at 5
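To illustrate why the destination bucket has to be combined rather than overwritten, here is a small standalone sketch of the merge step. It is not the GCC code: struct toy_bucket, toy_index (a power-of-two stand-in for gcov_histo_index), scale_value and TOY_SCALE_BASE are all hypothetical names invented for the example. One deliberate difference from the hunk above: the sketch checks whether the destination bucket is still empty before taking the minimum, so that a zero-initialized min_value does not win; that guard is an addition of this sketch, not part of the posted patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 64
#define TOY_SCALE_BASE 10000  /* hypothetical stand-in for REG_BR_PROB_BASE */

/* Hypothetical, simplified histogram bucket.  */
struct toy_bucket
{
  uint32_t num_counters;  /* how many counters fell into this bucket */
  uint64_t min_value;     /* smallest counter value in this bucket */
  uint64_t cum_value;     /* sum of the counter values in this bucket */
};

/* Toy index function: bucket i holds values in [2^i, 2^(i+1)), and
   0 maps to bucket 0.  This is not gcov_histo_index.  */
static unsigned
toy_index (uint64_t value)
{
  unsigned ix = 0;
  while (value > 1)
    {
      value >>= 1;
      ix++;
    }
  return ix;
}

/* Scale VALUE by SCALE / TOY_SCALE_BASE, rounding to nearest.  */
static uint64_t
scale_value (uint64_t value, uint64_t scale)
{
  return (value * scale + TOY_SCALE_BASE / 2) / TOY_SCALE_BASE;
}

/* Clear DST, then scale every non-empty bucket of SRC and merge it
   into DST.  Two source buckets can land on the same destination
   index, so the minimum is combined and the sums are accumulated.  */
static void
merge_scaled (struct toy_bucket *dst, const struct toy_bucket *src,
              uint64_t scale)
{
  memset (dst, 0, TOY_BUCKETS * sizeof *dst);
  for (unsigned h = 0; h < TOY_BUCKETS; h++)
    {
      if (src[h].num_counters == 0)
        continue;
      uint64_t scaled_min = scale_value (src[h].min_value, scale);
      unsigned new_ix = toy_index (scaled_min);
      assert (new_ix < TOY_BUCKETS);
      struct toy_bucket *d = &dst[new_ix];
      /* Sketch-only guard: an empty bucket takes scaled_min as is.  */
      if (d->num_counters == 0 || scaled_min < d->min_value)
        d->min_value = scaled_min;
      d->cum_value += scale_value (src[h].cum_value, scale);
      d->num_counters += src[h].num_counters;
    }
}
```

For instance, with a scale of 9000 (0.9), source buckets whose minima are 15 and 16 both scale to 14 and share one destination bucket; plain assignment would silently drop the first bucket's counters.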