Re: [Patch, Fortran, OOP] PR 64244: [4.8/4.9/5 Regression] ICE at class.c:236 when using non_overridable
2014-12-16 7:58 GMT+01:00 Tobias Burnus bur...@net-b.de: Hi Janus, hi all, Janus Weil wrote: here is a regression fix for a problem with the NON_OVERRIDABLE attribute. For non-overridable type-bound procedures we do not have to generate a call to the vtable, but can just translate it to a simple ('non-virtual') function call. In this particular case, an additional generic binding was present, which fooled the compiler to believe that the call goes to an overridable procedure, so it tried to generate a call to a vtable entry which did not exist. The trick is simply to take the NON-OVERRIDABLE attribute from the specific procedure, not the generic one (which means the generic call has to be resolved to a specific one first). The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk? And for 4.8/4.9 after some time? OK. Thanks for the patch! Committed as r218776. I'll wait a bit before putting it to the branches, to see if any problems show up on trunk ... Cheers, Janus
Re: Do not build callgraph for external functions when inlining
Jan Hubicka hubi...@ucw.cz writes: this should be fixed by patch to handle extern aliases correctly I commited later that day. Does the problem still reproduce for you? It works again. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: [Patch, Fortran] PR54687 - Fortran options cleanup (part 2)
Built and currently regtested on x86-64-gnu-linux. OK for the trunk? OK. I’m not sure exactly why we have a hardcoded minimal value of (2^16-1) for “max array constructor”, or a default max stack of 2^15. But that’s a separate issue. Thanks, FX
Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.
On Thu, Dec 11, 2014 at 8:08 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Dec 11, 2014 at 10:56 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng bin.ch...@arm.com wrote: Hi, Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt issue still exists. Current candidate selecting algorithm tends to select fewer candidates given below reasons: 1) to better handle loops with many induction uses but the best choice is one generic basic induction variable; 2) to keep compilation time low. One fundamental weakness of the strategy is the opposite situation can't be handled properly sometimes. For these cases the best choice is each induction variable has its own candidate. This patch fixes the problem by shuffling candidate set after fix-point is reached by current implementation. The reason why this strategy works is it replaces candidate set by selecting local optimal candidate for some induction uses, and the new candidate set (has lower cost) is exact what we want in the mentioned case. Instrumentation data shows this can find better candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64. This patch actually is extension to the first version patch posted at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds another selecting pass with special seed set (more or less like the shuffled set in this patch). Data also confirms this patch can find optimal sets for most loops found by the first one, as well as optimal sets for many new loops. Bootstrap and test on x86_64, no regression on benchmarks. Bootstrap and test on aarch64. Since this patch only selects candidate set with lower cost, any regressions revealed are latent bugs of other components in GCC. I also collected GCC bootstrap time on x86_64, no regression either. Is this OK? The algorithm seems to be quadratic in the number of IV candidates (at least): Yes, I worried about that too, that's why I measured the bootstrap time. One way is restrict this procedure one time for each loop. I already tried that and it can capture +90% loops. Is this sounds reasonable? Yes. That's my suggestion to handle it in the caller of try_improve_iv_set? BTW, do we have some compilation time benchmarks for GCC? There are various testcases linked from PR47344, I don't remember any particular one putting load on IVOPTs (but I do remember seeing IVOPTs in the ~25% area in -ftime-report for some testcases). Hi Jeff Richard, I updated patch according to your review comments. Is this version looks good? I didn't find cases in PR47344 which exercising IVOPT, but I produced one case from PR53852 which runs ivopt for ~17% of total time (28s). This patch does increase IVOPT time to 18%. Unfortunately, I tried the other restriction, it doesn't work as well as this one on spec2k6, if I understood the method correctly. Hi Sebastian, Thanks for help! I managed to run llvm compilation time tests successfully as you suggested. Case Multisource/Benchmarks/mafft/pairlocalalign is regressed but I can't reproduce it in cmd. The running time of compilation of pairlocalalign.c is too small comparing to the results. I also tried to invoke it by using RunSafely.sh but no lucky either. So any documentation on this? Thanks very much! Thanks, bin 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * tree-ssa-loop-ivopts.c (cheaper_cost_with_cand): New function. (iv_ca_replace): New function. (try_improve_iv_set): New parameter try_replace_p. Replace candidates in IVS by calling iv_ca_replace. (find_optimal_iv_set_1): Pass new argument to try_improve_iv_set. gcc/testsuite/ChangeLog 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * gcc.target/aarch64/pr62178.c: New test. Index: gcc/tree-ssa-loop-ivopts.c === --- gcc/tree-ssa-loop-ivopts.c (revision 218200) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -5862,6 +5862,127 @@ iv_ca_prune (struct ivopts_data *data, struct iv_c return best_cost; } +/* Check if CAND_IDX is a candidate other than OLD_CAND and has + cheaper local cost for USE than BEST_CP. Return pointer to + the corresponding cost_pair, otherwise just return BEST_CP. */ + +static struct cost_pair* +cheaper_cost_with_cand (struct ivopts_data *data, struct iv_use *use, + unsigned int cand_idx, struct iv_cand *old_cand, + struct cost_pair *best_cp) +{ + struct iv_cand *cand; + struct cost_pair *cp; + + gcc_assert (old_cand != NULL); + if (cand_idx == old_cand-id) +return best_cp; + + cand = iv_cand (data, cand_idx); + cp = get_use_iv_cost (data, use, cand); + if (cp != NULL + (best_cp == NULL ||
Fwd: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.
CCing Sebastian. Thanks, bin -- Forwarded message -- From: Bin.Cheng amker.ch...@gmail.com Date: Tue, Dec 16, 2014 at 4:42 PM Subject: Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try. To: Richard Biener richard.guent...@gmail.com Cc: Bin Cheng bin.ch...@arm.com, GCC Patches gcc-patches@gcc.gnu.org, Zdenek Dvorak o...@ucw.cz On Thu, Dec 11, 2014 at 8:08 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Dec 11, 2014 at 10:56 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng bin.ch...@arm.com wrote: Hi, Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt issue still exists. Current candidate selecting algorithm tends to select fewer candidates given below reasons: 1) to better handle loops with many induction uses but the best choice is one generic basic induction variable; 2) to keep compilation time low. One fundamental weakness of the strategy is the opposite situation can't be handled properly sometimes. For these cases the best choice is each induction variable has its own candidate. This patch fixes the problem by shuffling candidate set after fix-point is reached by current implementation. The reason why this strategy works is it replaces candidate set by selecting local optimal candidate for some induction uses, and the new candidate set (has lower cost) is exact what we want in the mentioned case. Instrumentation data shows this can find better candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64. This patch actually is extension to the first version patch posted at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds another selecting pass with special seed set (more or less like the shuffled set in this patch). Data also confirms this patch can find optimal sets for most loops found by the first one, as well as optimal sets for many new loops. Bootstrap and test on x86_64, no regression on benchmarks. Bootstrap and test on aarch64. Since this patch only selects candidate set with lower cost, any regressions revealed are latent bugs of other components in GCC. I also collected GCC bootstrap time on x86_64, no regression either. Is this OK? The algorithm seems to be quadratic in the number of IV candidates (at least): Yes, I worried about that too, that's why I measured the bootstrap time. One way is restrict this procedure one time for each loop. I already tried that and it can capture +90% loops. Is this sounds reasonable? Yes. That's my suggestion to handle it in the caller of try_improve_iv_set? BTW, do we have some compilation time benchmarks for GCC? There are various testcases linked from PR47344, I don't remember any particular one putting load on IVOPTs (but I do remember seeing IVOPTs in the ~25% area in -ftime-report for some testcases). Hi Jeff Richard, I updated patch according to your review comments. Is this version looks good? I didn't find cases in PR47344 which exercising IVOPT, but I produced one case from PR53852 which runs ivopt for ~17% of total time (28s). This patch does increase IVOPT time to 18%. Unfortunately, I tried the other restriction, it doesn't work as well as this one on spec2k6, if I understood the method correctly. Hi Sebastian, Thanks for help! I managed to run llvm compilation time tests successfully as you suggested. Case Multisource/Benchmarks/mafft/pairlocalalign is regressed but I can't reproduce it in cmd. The running time of compilation of pairlocalalign.c is too small comparing to the results. I also tried to invoke it by using RunSafely.sh but no lucky either. So any documentation on this? Thanks very much! Thanks, bin 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * tree-ssa-loop-ivopts.c (cheaper_cost_with_cand): New function. (iv_ca_replace): New function. (try_improve_iv_set): New parameter try_replace_p. Replace candidates in IVS by calling iv_ca_replace. (find_optimal_iv_set_1): Pass new argument to try_improve_iv_set. gcc/testsuite/ChangeLog 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * gcc.target/aarch64/pr62178.c: New test. Index: gcc/tree-ssa-loop-ivopts.c === --- gcc/tree-ssa-loop-ivopts.c (revision 218200) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -5862,6 +5862,127 @@ iv_ca_prune (struct ivopts_data *data, struct iv_c return best_cost; } +/* Check if CAND_IDX is a candidate other than OLD_CAND and has + cheaper local cost for USE than BEST_CP. Return pointer to + the corresponding cost_pair, otherwise just return BEST_CP. */ + +static struct cost_pair* +cheaper_cost_with_cand (struct ivopts_data *data, struct iv_use *use, + unsigned int
[PATCH] Fix PR64240
Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux-gnu. OK for the trunk? Index: gcc/ddg.c === --- gcc/ddg.c (revision 218582) +++ gcc/ddg.c (working copy) @@ -77,7 +77,7 @@ mark_mem_use (rtx *x, void *) { subrtx_iterator::array_type array; FOR_EACH_SUBRTX (iter, array, *x, NONCONST) -if (MEM_P (*x)) +if (MEM_P (*iter)) { mem_ref_p = true; break; Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 218582) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * ddg.c (mark_mem_use): Check *iter instead of *x. + 2014-12-10 Felix Yang felix.y...@huawei.com * config/aarch64/aarch64-protos.h (aarch64_function_profiler): Remove pr64240.diff Description: pr64240.diff
Re: [PATCH] Fix PR64240
On December 16, 2014 9:51:25 AM CET, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux-gnu. OK for the trunk? Do you have a testcase? If so please add it. OK. Thanks, Richard. Index: gcc/ddg.c === --- gcc/ddg.c (revision 218582) +++ gcc/ddg.c (working copy) @@ -77,7 +77,7 @@ mark_mem_use (rtx *x, void *) { subrtx_iterator::array_type array; FOR_EACH_SUBRTX (iter, array, *x, NONCONST) -if (MEM_P (*x)) +if (MEM_P (*iter)) { mem_ref_p = true; break; Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 218582) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * ddg.c (mark_mem_use): Check *iter instead of *x. + 2014-12-10 Felix Yang felix.y...@huawei.com * config/aarch64/aarch64-protos.h (aarch64_function_profiler): Remove
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
I am not qualified to review the actual code changes, but from the description it looks good to me. It adds a EH frame to every function, right? In 64-bit mode there is no runtime penalty, right? Do you have any idea about binary size increase? Does gcc build in C++ mode nowadays? It can be a good test. I am somewhat worried about potential corner cases that can lead to compiler crashes or missed/excessive func_entry/exit calls. But I guess there is no way to make it working w/o getting some code in first. Rigorous testing would require running a large C++ app that actually throws and catches exception and precisely verifying stacks in reports. On Mon, Dec 15, 2014 at 9:50 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? On: #include pthread.h int v; int foo (int x) { if (x 99) throw x; v++; return x; } void * tf (void *) { for (int i = 0; i 100; i++) try { foo (i); } catch (int) {} return NULL; } int main () { pthread_t th; if (pthread_create (th, NULL, tf, NULL)) return 0; v++; pthread_join (th, NULL); return 0; } I used to get without this patch: == WARNING: ThreadSanitizer: data race (pid=20449) Read of size 4 at 0x006020e0 by thread T1: #0 foo(int) /usr/src/gcc/obj/gcc/ts.C:10 (ts+0x00400cb9) #1 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #2 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #3 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #4 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #5 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #6 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #7 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #8 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #9 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #10 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #11 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #12 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #13 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #14 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #15 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #16 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #17 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #18 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #19 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #20 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #21 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #22 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #23 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #24 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #25 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #26 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #27 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #28 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #29 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #30 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #31 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #32 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #33 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #34 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #35 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #36 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #37 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #38 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #39 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #40 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #41 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #42 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #43 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #44 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #45 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
Cross referenced this patch from https://code.google.com/p/thread-sanitizer/issues/detail?id=78 On Tue, Dec 16, 2014 at 12:25 PM, Dmitry Vyukov dvyu...@google.com wrote: I am not qualified to review the actual code changes, but from the description it looks good to me. It adds a EH frame to every function, right? In 64-bit mode there is no runtime penalty, right? Do you have any idea about binary size increase? Does gcc build in C++ mode nowadays? It can be a good test. I am somewhat worried about potential corner cases that can lead to compiler crashes or missed/excessive func_entry/exit calls. But I guess there is no way to make it working w/o getting some code in first. Rigorous testing would require running a large C++ app that actually throws and catches exception and precisely verifying stacks in reports. On Mon, Dec 15, 2014 at 9:50 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? On: #include pthread.h int v; int foo (int x) { if (x 99) throw x; v++; return x; } void * tf (void *) { for (int i = 0; i 100; i++) try { foo (i); } catch (int) {} return NULL; } int main () { pthread_t th; if (pthread_create (th, NULL, tf, NULL)) return 0; v++; pthread_join (th, NULL); return 0; } I used to get without this patch: == WARNING: ThreadSanitizer: data race (pid=20449) Read of size 4 at 0x006020e0 by thread T1: #0 foo(int) /usr/src/gcc/obj/gcc/ts.C:10 (ts+0x00400cb9) #1 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #2 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #3 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #4 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #5 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #6 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #7 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #8 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #9 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #10 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #11 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #12 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #13 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #14 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #15 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #16 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #17 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #18 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #19 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #20 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #21 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #22 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #23 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #24 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #25 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #26 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #27 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #28 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #29 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #30 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #31 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #32 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #33 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #34 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #35 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #36 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #37 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #38 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #39 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #40 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #41 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #42 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13)
Re: Fix streaming of target optimization/option nodes
On Mon, 15 Dec 2014, Jan Hubicka wrote: On Mon, 15 Dec 2014, Jan Hubicka wrote: Hi, actually this patch break fortran, I get streaming error in: lto1: internal compiler error: in streamer_get_pickled_tree apparently picking error_mark_node for variable constructor results in reading integer_type... ? Probably the default nodes are referenced by another builtin tree instead and you get inconsistent streaming between f951 and lto1. See the assert placed into record_common_node which you should extend to cover the optimization node trees. It seems that whole common node preloading is a major can of worms ;( It is. I've tried to get rid of most of it but unfortunately those pointer-compares to builtin trees remain (at least va_list node). It was also before SCC tree merging so maybe the situation isn't as bad as it was (and we reliably at least merge cross-TU builtins where necessary). Ideally we'd be down to an explicitely list of pre-loaded nodes, abstracted into a predicate function so we can update the assert from a single place as well. But I'm quite sure it's nothing for stage3 ;) Btw - please update the assert (or even better do that predicate function). Thanks, Richard. . Anyway the problem here is that record_common_node replaces every NULL by error_mark_node. It thus matters what is the last NULL pointer recorded. It used to be TI_CURRENT_OPTIMIZE_PRAGMA but now it is TI_PID_TYPE in some cases, TI_MAIN_IDENTIFIER in others and real error_mark_node in rest of cases. I am testing the following. Honza Index: tree-streamer.c === --- tree-streamer.c (revision 218726) +++ tree-streamer.c (working copy) @@ -324,7 +324,18 @@ preload_common_nodes (struct streamer_tr /* Skip boolean type and constants, they are frontend dependent. */ if (i != TI_BOOLEAN_TYPE i != TI_BOOLEAN_FALSE - i != TI_BOOLEAN_TRUE) + i != TI_BOOLEAN_TRUE + /* MAIN_IDENTIFIER is not always initialized by Fortran FE. */ + i != TI_MAIN_IDENTIFIER + /* PID_TYPE is initialized only by C family front-ends. */ + i != TI_PID_TYPE + /* Skip optimization and target option nodes; they depend on flags. */ + i != TI_OPTIMIZATION_DEFAULT + i != TI_OPTIMIZATION_CURRENT + i != TI_TARGET_OPTION_DEFAULT + i != TI_TARGET_OPTION_CURRENT + i != TI_CURRENT_TARGET_PRAGMA + i != TI_CURRENT_OPTIMIZE_PRAGMA) record_common_node (cache, global_trees[i]); } -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On 12/15/2014 11:00 PM, Sergio Durigan Junior wrote: +# Check if grep supports the '--text' option. + +GREP_TEXT_OPT=--text +if grep --text 21 | grep unrecognized option /dev/null 21 ; then + GREP_TEXT_OPT= +fi + That assumes all greps output unrecognized option on unrecognized options. I don't think that can be counted on being portable? ISTM reversing the logic of the test would be better. IOW, test that --text actually works. Something like this perhaps: # If grep supports the '--text' option, use it. GREP_TEXT_OPT= if echo foo | grep --text foo /dev/null 21 ; then GREP_TEXT_OPT=--text fi Thanks, Pedro Alves
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
On Mon, Dec 15, 2014 at 7:50 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? So the issue is externally throwing EH, right? I wonder if we can teach the unwinder to do __tsan_func_exit instead, or have hooks in it that can be used by tsan? At least the issue seems generic enough for code instrumentation to consider a more general solution? How does -finstrument-functions work with externally throwing EH? Thanks, Richard. On: #include pthread.h int v; int foo (int x) { if (x 99) throw x; v++; return x; } void * tf (void *) { for (int i = 0; i 100; i++) try { foo (i); } catch (int) {} return NULL; } int main () { pthread_t th; if (pthread_create (th, NULL, tf, NULL)) return 0; v++; pthread_join (th, NULL); return 0; } I used to get without this patch: == WARNING: ThreadSanitizer: data race (pid=20449) Read of size 4 at 0x006020e0 by thread T1: #0 foo(int) /usr/src/gcc/obj/gcc/ts.C:10 (ts+0x00400cb9) #1 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #2 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #3 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #4 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #5 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #6 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #7 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #8 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #9 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #10 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #11 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #12 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #13 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #14 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #15 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #16 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #17 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #18 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #19 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #20 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #21 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #22 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #23 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #24 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #25 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #26 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #27 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #28 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #29 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #30 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #31 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #32 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #33 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #34 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #35 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #36 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #37 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #38 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #39 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #40 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #41 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #42 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #43 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #44 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #45 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #46 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #47 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #48 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #49 tf(void*) /usr/src/gcc/obj/gcc/ts.C:18 (ts+0x00400d13) #50
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Tue, Dec 16, 2014 at 09:36:33AM +, Pedro Alves wrote: On 12/15/2014 11:00 PM, Sergio Durigan Junior wrote: +# Check if grep supports the '--text' option. + +GREP_TEXT_OPT=--text +if grep --text 21 | grep unrecognized option /dev/null 21 ; then + GREP_TEXT_OPT= +fi + That assumes all greps output unrecognized option on unrecognized options. I don't think that can be counted on being portable? ISTM reversing the logic of the test would be better. IOW, test that --text actually works. Something like this perhaps: # If grep supports the '--text' option, use it. GREP_TEXT_OPT= if echo foo | grep --text foo /dev/null 21 ; then GREP_TEXT_OPT=--text fi Even better include some non-text bytes around the foo string in the input you feed to grep. Also, perhaps better would be to use a GREP variable, initialized to GREP=grep or GREP=grep --text and just invoke $GREP. Jakub
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Dec 15, 2014, at 2:37 PM, Sergio Durigan Junior sergi...@redhat.com wrote: This weekend I was running GDB's testsuite with many options enabled, and I noticed that, for some specific configurations (specifically when testing gdbserver), I was getting the following error: Binary file outputs/gdb.base/gdb-sigterm/gdb.log matches Right, the problem is that grep is assuming those 6 files are binary, not text. This happens because of this code, in grep: If one looks at those 6 files, one will find that they contain the NUL byte there. They are all printed by the same message, by gdbserver's code: input_interrupt, count = 0 c = 0 ('^@') (The ^@ above is the NUL byte.) So, either, the tool should not generate 0 in the output, which, is rather anti-social, or one should strip the funny characters in a more portable fashion. tr and cat -v come to mind; both should be pretty portable. OK to apply? Try cat | cat -v in there instead.
Re: C++ PATCH for C++14 sized deallocation
spawn /daten/aranym/gcc/gcc-20141216/Build/gcc/testsuite/g++/../../xg++ -B/daten/aranym/gcc/gcc-20141216/Build/gcc/testsuite/g++/../../ /daten/aranym/gcc/gcc-20141216/gcc/testsuite/g++.dg/abi/covariant4.C -fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++ -I/daten/aranym/gcc/gcc-20141216/Build/m68k-linux/libstdc++-v3/include/m68k-linux -I/daten/aranym/gcc/gcc-20141216/Build/m68k-linux/libstdc++-v3/include -I/daten/aranym/gcc/gcc-20141216/libstdc++-v3/libsupc++ -I/daten/aranym/gcc/gcc-20141216/libstdc++-v3/include/backward -I/daten/aranym/gcc/gcc-20141216/libstdc++-v3/testsuite/util -fmessage-length=0 -std=c++14 -pedantic-errors -Wno-long-long -L/daten/aranym/gcc/gcc-20141216/Build/m68k-linux/./libstdc++-v3/src/.libs -B/daten/aranym/gcc/gcc-20141216/Build/m68k-linux/./libstdc++-v3/src/.libs -L/daten/aranym/gcc/gcc-20141216/Build/m68k-linux/./libstdc++-v3/src/.libs -lm -o ./covariant4.exe. /tmp/cc3x0AbC.o: In function `Model::~Model()':. covariant4.C:(.text._ZN5ModelD2Ev[_ZN5ModelD5Ev]+0x1e): undefined reference to `operator delete(void*, unsigned int)'. /tmp/cc3x0AbC.o: In function `Model::~Model()':. covariant4.C:(.text._ZN5ModelD0Ev[_ZN5ModelD5Ev]+0x1a): undefined reference to `operator delete(void*, unsigned int)'. /tmp/cc3x0AbC.o: In function `R::~R()':. covariant4.C:(.text._ZN1RD2Ev[_ZN1RD2Ev]+0x52): undefined reference to `operator delete(void*, unsigned int)'. /tmp/cc3x0AbC.o: In function `A::~A()':. covariant4.C:(.text._ZN1AD2Ev[_ZN1AD2Ev]+0x52): undefined reference to `operator delete(void*, unsigned int)'. /tmp/cc3x0AbC.o: In function `RA::~RA()':. covariant4.C:(.text._ZN2RAD2Ev[_ZN2RAD2Ev]+0x90): undefined reference to `operator delete(void*, unsigned int)'. /tmp/cc3x0AbC.o:covariant4.C:(.text._ZN3EQUD1Ev[_ZN3EQUD1Ev]+0x6a): more undefined references to `operator delete(void*, unsigned int)' follow. collect2: error: ld returned 1 exit status. compiler exited with status 1 Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: [C++ Patch] PR 58650
Hi, On 12/15/2014 11:25 PM, Jason Merrill wrote: Why does error recovery fail? I would expect to be able to just drop the 'friend' and treat it as a normal non-static data member. I agree, that was my first thought too. Unfortunately we do non-trivial preparatory work *before* calling grokdeclarator based on ds_friend which aren't reverted by locally resetting friendp in grokdeclarator, and, eg, build_this_parm gets a null type even if declspecs-locations[ds_friend] itself is reset right at the beginning of grokdeclarator. I'll try the further investigate the above... Thanks, Paolo.
[PATCH][AARCH64]Add ACLE 2.0 predefined macros: __ARM_ALIGN_MAX_PWR and __ARM_ALIGN_MAX_STACK_PWR
Hi all, This is a simple patch to add another two ACLE 2.0 predefined macros into aarch64 backend. They are __ARM_ALIGN_MAX_PWR and __ARM_ALIGN_MAX_STACK_PWR. Currently, those two values are hard-wired to 16. The following clauses from ACLE 2.0 documentation indicate the meaning of those two macros: The macro __ARM_ALIGN_MAX_STACK_PWR indicates (as the exponent of a power of 2) the maximum available stack alignment. The macro __ARM_ALIGN_MAX_PWR indicates (as the exponent of a power of 2) the maximum available alignment of static data. aarch64-none-elf target is tested on on the model. No new regression. Is it Okay for trunk? Regards, Renlin Li gcc/ChangeLog 2014-12-16 Renlin Li renlin...@arm.com * config/aarch64/aarch64.h(TARGET_CPU_CPP_BUILTINS): Define __ARM_ALIGN_MAX_PWR and __ARM_ALIGN_MAX_STACK_PWR.diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 8ed1f84..78417d7 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -29,6 +29,10 @@ builtin_define (__aarch64__); \ builtin_define (__ARM_64BIT_STATE); \ builtin_define_with_int_value \ +(__ARM_ALIGN_MAX_PWR, 16); \ + builtin_define_with_int_value \ +(__ARM_ALIGN_MAX_STACK_PWR, 16); \ + builtin_define_with_int_value \ (__ARM_ARCH, aarch64_architecture_version); \ cpp_define_formatted \ (parse_in, __ARM_ARCH_%dA, aarch64_architecture_version); \
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
On Tue, Dec 16, 2014 at 10:47:06AM +0100, Richard Biener wrote: On Mon, Dec 15, 2014 at 7:50 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? So the issue is externally throwing EH, right? I wonder if we can teach the unwinder to do __tsan_func_exit instead, or have hooks in it that can be used by tsan? At least the issue seems generic enough for code instrumentation to consider a more general solution? How does -finstrument-functions work with externally throwing EH? -finstrument-functions works the way I wrote the patch, in fact the gimplify_function_tree bits I've added were right after the -finstrument-functions handling of the same. Anyway, the alternative would be to wrap the various personality functions like __gcc_personality_v0 __gxx_personality_v0 __gcj_personality_v0 __gccgo_personality_v0 __gnu_objc_personality_v0 call the dlsym (, RTLD_NEXT) version from there and if it returns _URC_INSTALL_CONTEXT , try to figure out what frame it will be in. We can query the IP (_Unwind_GetIP), or the CFA (_Unwind_GetCFA), but then map it through supposedly target dependent code to the actual frame pointers __tsan_func_* store (do they?). The problem with wrapping those personality functions is that I'm not sure how could it work with the various -static* options. Say if you -static-libstdc++ (and link libtsan dynamically), then the __gxx_personality_v0 in the binary will not be wrapped by what is in libtsan.so. If you -static-libstdc++ -static-libtsan, then __gxx_personality_v0 linked in will be very likely from libstdc++ and thus again not overloaded, or if -ltsan would come first (right now it doesn't), then you still couldn't use dlsym to get at the overloaded symbol. So, while the __tsan_func_exit in cleanup is more expensive at runtime, it will work with all the wierdo options people are using. Jakub
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On 12/16/2014 10:03 AM, Mike Stump wrote: input_interrupt, count = 0 c = 0 ('^@') (The ^@ above is the NUL byte.) So, either, the tool should not generate 0 in the output, which, is rather anti-social, Yeah, this is actually a gdbserver debug output that misses an if debugging guard; it shouldn't be printed by default. Still, if we do enable debug output, I agree we shouldn't be printing unprintable characters. Sergio, if you want to work on that, see serial_logchar and the use of isprint. Also, count==0 means the connection was closed; it'd be better even if in addition to isprint, we add a special case that logs client connection closed or some such instead of printing whatever was in 'c', which happens to be \0. or one should strip the funny characters in a more portable fashion. Thanks, Pedro Alves
Re: [PATCH][AArch64] Generalize code alignment
On 12 December 2014 at 14:48, Wilco Dijkstra wdijk...@arm.com wrote: This patch generalizes the code alignment and lets each CPU set function, jump and loop alignment independently. The defaults for A53/A57 are based the original patch by James Greenhalgh. OK for trunk? ChangeLog: 2014-12-13 Wilco Dijkstra wdijk...@arm.com * gcc/config/aarch64/aarch64-protos.h (tune-params): Add code alignment tuning parameters. * gcc/config/aarch64/aarch64.c (generic_tunings) Add code alignment tuning parameters. (cortexa53_tunings): Likewise. (cortexa57_tunings): Likewise. (thunderx_tunings): Likewise. (aarch64_override_options): Use new alignment tunings. OK /Marcus
Re: [PATCH][AArch64] Add TARGET_MIN_DIVISIONS_FOR_RECIP_MUL
On 12 December 2014 at 15:19, Wilco Dijkstra wdijk...@arm.com wrote: Add an override for TARGET_MIN_DIVISIONS_FOR_RECIP_MUL and set the minimum number of divisions to 2. This gives ~0.5% speedup on SPECFP2000/2006. OK for trunk? ChangeLog: 2014-12-13 Wilco Dijkstra wdijk...@arm.com * gcc/config/aarch64/aarch64.c (TARGET_MIN_DIVISIONS_FOR_RECIP_MUL): Define. (aarch64_min_divisions_for_recip_mul): New function. Ok /Marcus
Re: [PATCH ARM]Prefer neon for stringops on a53/a57 in AArch32 mode
On Thu, Nov 13, 2014 at 1:54 PM, Bin Cheng bin.ch...@arm.com wrote: Hi, As commented at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00684.html, this is a simple patch enabling neon memset inlining on cortex-a53/cortex-a57 in AArch32 mode. Test on arm-none-linux-gnueabihf/--with-cpu=cortex-a57/--with-fpu=crypto-neon-fp-arm v8/--with-float=hard. I will further collect benchmark data, see if there is regression. Is it ok if benchmark results are good? 2014-11-13 Bin Cheng bin.ch...@arm.com * config/arm/arm.c (arm_cortex_a53_tune, arm_cortex_a57_tune): Prefer neon for stringops on cortex-a53/a57 in AArch32 mode. I collected perf data for this patch, there is no obvious change on cortex-a57/aarch32, so is it OK? Thanks, bin
Re: [C++ Patch] PR 58650
Hi again, On 12/16/2014 11:17 AM, Paolo Carlini wrote: Hi, On 12/15/2014 11:25 PM, Jason Merrill wrote: Why does error recovery fail? I would expect to be able to just drop the 'friend' and treat it as a normal non-static data member. I agree, that was my first thought too. Unfortunately we do non-trivial preparatory work *before* calling grokdeclarator based on ds_friend which aren't reverted by locally resetting friendp in grokdeclarator, and, eg, build_this_parm gets a null type even if declspecs-locations[ds_friend] itself is reset right at the beginning of grokdeclarator. I'll try the further investigate the above... In better detail: grokdeclarator is called, via grokfield, by cp_parser_member_declaration. The latter stores the friendship information in a friend_p local flag, which remains true when grokdeclarator returns. Then at the end of the function: if (decl) { /* Add DECL to the list of members. */ if (!friend_p) finish_member_declaration (decl); makes all the difference for the crash. Now, I could try resetting in grokdeclarator the primary ds_friend information as stored in the decl_specifiers and read it back, thus update friend_p in cp_parser_member_declaration. It would probably work, but frankly this fiddling only for error recovery makes me a little nervous. What do you think? Thanks, Paolo.
Re: [PATCH] Fix PR64240
On December 16, 2014 9:51:25 AM CET, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux-gnu. OK for the trunk? Do you have a testcase? If so please add it. OK Yes, the patch is updated with the testcase added. Index: gcc/ddg.c === --- gcc/ddg.c (revision 218582) +++ gcc/ddg.c (working copy) @@ -77,7 +77,7 @@ mark_mem_use (rtx *x, void *) { subrtx_iterator::array_type array; FOR_EACH_SUBRTX (iter, array, *x, NONCONST) -if (MEM_P (*x)) +if (MEM_P (*iter)) { mem_ref_p = true; break; Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 218582) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * ddg.c (mark_mem_use): Check *iter instead of *x. + 2014-12-10 Felix Yang felix.y...@huawei.com * config/aarch64/aarch64-protos.h (aarch64_function_profiler): Remove Index: gcc/testsuite/gcc.dg/sms-12.c === --- gcc/testsuite/gcc.dg/sms-12.c (revision 0) +++ gcc/testsuite/gcc.dg/sms-12.c (revision 0) @@ -0,0 +1,43 @@ +/* { dg-do run } */ +/* { dg-skip-if { ! { aarch64-*-* } } { * } { } } */ +/* { dg-options -O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves -fPIC } */ + +extern void abort (void); + +int X[1000]={0}; +int Y[1000]={0}; + +extern void abort (void); + +__attribute__ ((noinline)) +int +foo (int len, long a) +{ + int i; + long res = a; + + len = 1000; + for (i = 0; i len; i++) +res += X[i]* Y[i]; + + if (res != 601) +abort (); + +} + +int +main () +{ + X[0] = Y[1] = 2; + Y[0] = X[1] = 21; + X[2] = Y[3] = 3; + Y[2] = X[3] = 31; + X[4] = Y[5] = 4; + Y[4] = X[5] = 41; + + foo (6, 3); + return 0; +} + +/* { dg-final { cleanup-rtl-dump sms } } */ + Property changes on: gcc/testsuite/gcc.dg/sms-12.c ___ Added: svn:executable + * Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 218582) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * gcc.dg/sms-12.c: New test. + 2014-12-10 Martin Liska mli...@suse.cz * gcc.dg/ipa/pr63909.c: New test. pr64240-v2.diff Description: pr64240-v2.diff
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On 15/12/2014 17:24, Segher Boessenkool wrote: On Mon, Dec 15, 2014 at 04:51:14PM +0100, Paolo Bonzini wrote: Random questions: 1) did you check that it never triggers on e.g. an x86 bootstrap, and that it doesn't trigger too often on PPC64? I have checked on my largish connection of tests for the carry insns on PowerPC, and only two (related) transforms are disabled, and they aren't too important anyway. Well, and the bad transforms are disabled, only just two of-em but much more frequent (long long x; x--;). I haven't checked on x86, but it's a bugfix: don't do things that blow up! I agree---I just want to be sure of the extent of the change. [ That testcase, -m32: long long addSH(long long a, unsigned long b) { return a + ((unsigned long long)b 32); } results in addic 4,4,0 ; addze 3,5 while it could just be add 3,5 ] Ah, that's a pity. But breaking x-- is much worse. 2) if it triggers rarely, should combine just try and give a new UID to i1? That should in principle works, sure. Seems too dangerous for stage3 though. Indeed. Just thinking about it for 5.1. Paolo
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
On Tue, Dec 16, 2014 at 01:25:54PM +0400, Dmitry Vyukov wrote: I am not qualified to review the actual code changes, but from the description it looks good to me. It adds a EH frame to every function, right? In 64-bit mode there is no runtime penalty, right? Do you have any idea about binary size No, not to every function. Only to function that throws or could throw something externally, and only if exceptions are on. increase? Does gcc build in C++ mode nowadays? It can be a good test. GCC builds with C++, but with -fno-exceptions, so it is not a good example, but e.g. libstdc++, as it generally supports exceptions, could be an example. Jakub
Re: [PATCH] [AArch64, NEON] Fix testcases add by r218484
#define DECL_VABD_VAR(VAR) \ be careful with your cut and paste. VABD should probably be VFMA_N here, although it's purely a naming convention :-) The v3 patch attached fixed this minor issue. Thanks. It's OK for me with that change, but I'm not a maintainer. One more question: are there any corner-cases we would want to check? (for instance, rounding, nan, infinity, ...) We don't see any testsuite covers the test of these intrinsics. So we are adding these testcases to test the basic functionality. For now, I don't see any corner-cases that need to be checked for this patch. Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h (working copy) @@ -142,6 +142,10 @@ VECT_VAR_DECL_INIT(buffer, poly, 16, 8); PAD(buffer_pad, poly, 16, 8); VECT_VAR_DECL_INIT(buffer, float, 32, 4); PAD(buffer_pad, float, 32, 4); +#ifdef __aarch64__ +VECT_VAR_DECL_INIT(buffer, float, 64, 2); +PAD(buffer_pad, float, 64, 2); +#endif /* The tests for vld1_dup and vdup expect at least 4 entries in the input buffer, so force 1- and 2-elements initializers to have 4 Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (revision 218582) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma_n.c (working copy) @@ -2,6 +2,7 @@ #include arm-neon-ref.h #include compute-ref-data.h +#if defined(__aarch64__) defined(__ARM_FEATURE_FMA) /* Expected results. */ VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d }; VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 }; @@ -9,33 +10,34 @@ VECT_VAR_DECL(expected,hfloat,64,2) [] = { 0x40890 #define VECT_VAR_ASSIGN(S,Q,T1,W) S##Q##_##T1##W #define ASSIGN(S, Q, T, W, V) T##W##_t S##Q##_##T##W = V -#define TEST_MSG VFMA/VFMAQ +#define TEST_MSG VFMA_N/VFMAQ_N + void exec_vfma_n (void) { /* Basic test: v4=vfma_n(v1,v2), then store the result. */ -#define TEST_VFMA(Q, T1, T2, W, N) \ +#define TEST_VFMA_N(Q, T1, T2, W, N) \ VECT_VAR(vector_res, T1, W, N) = \ vfma##Q##_n_##T2##W(VECT_VAR(vector1, T1, W, N), \ - VECT_VAR(vector2, T1, W, N), \ - VECT_VAR_ASSIGN(Scalar, Q, T1, W)); \ + VECT_VAR(vector2, T1, W, N),\ + VECT_VAR_ASSIGN(scalar, Q, T1, W)); \ vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N)) -#define CHECK_VFMA_RESULTS(test_name,comment) \ +#define CHECK_VFMA_N_RESULTS(test_name,comment) \ {\ CHECK_FP(test_name, float, 32, 2, PRIx32, expected, comment); \ CHECK_FP(test_name, float, 32, 4, PRIx32, expected, comment); \ - CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \ - } +CHECK_FP(test_name, float, 64, 2, PRIx64, expected, comment); \ + } -#define DECL_VABD_VAR(VAR) \ +#define DECL_VFMA_N_VAR(VAR) \ DECL_VARIABLE(VAR, float, 32, 2);\ DECL_VARIABLE(VAR, float, 32, 4);\ - DECL_VARIABLE(VAR, float, 64, 2); + DECL_VARIABLE(VAR, float, 64, 2); - DECL_VABD_VAR(vector1); - DECL_VABD_VAR(vector2); - DECL_VABD_VAR(vector3); - DECL_VABD_VAR(vector_res); + DECL_VFMA_N_VAR(vector1); + DECL_VFMA_N_VAR(vector2); + DECL_VFMA_N_VAR(vector3); + DECL_VFMA_N_VAR(vector_res); clean_results (); @@ -50,20 +52,23 @@ void exec_vfma_n (void) VDUP(vector2, q, float, f, 64, 2, 15.8f); /* Choose init value arbitrarily. */ - ASSIGN(Scalar, , float, 32, 81.2f); - ASSIGN(Scalar, q, float, 32, 36.8f); - ASSIGN(Scalar, q, float, 64, 51.7f); + ASSIGN(scalar, , float, 32, 81.2f); + ASSIGN(scalar, q, float, 32, 36.8f); + ASSIGN(scalar, q, float, 64, 51.7f); /* Execute the tests. */ - TEST_VFMA(, float, f, 32, 2); - TEST_VFMA(q, float, f, 32, 4); - TEST_VFMA(q, float, f, 64, 2); + TEST_VFMA_N(, float, f, 32, 2); + TEST_VFMA_N(q, float, f, 32, 4); + TEST_VFMA_N(q, float, f, 64, 2); - CHECK_VFMA_RESULTS (TEST_MSG, ); + CHECK_VFMA_N_RESULTS (TEST_MSG, ); } +#endif int main (void) { +#if defined(__aarch64__) defined(__ARM_FEATURE_FMA) exec_vfma_n (); +#endif return 0; } Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfma.c
RE: patch to fix PR64110
This issue is already fixed by your commit r218760. Thanks. Hale. -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Hale Wang Sent: Tuesday, December 16, 2014 10:17 AM To: 'Vladimir Makarov'; GCC Patches Subject: RE: patch to fix PR64110 Hi, This commit will cause another GCC build fail for ARM targets. The details are descripted in the following Bugzilla linker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64323. Could you help me to have a look? Thanks, Hale. -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Vladimir Makarov Sent: Saturday, December 13, 2014 4:12 AM To: GCC Patches Subject: patch to fix PR64110 The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 The patch was successfully tested and bootstrapped on x86/x86-64. Committed as rev. 218688. 2014-12-12 Vladimir Makarov vmaka...@redhat.com PR target/64110 * lra-constraints.c (process_alt_operands): Refuse alternative when reload pseudo of given class can not hold value of given mode. 2014-12-12 Vladimir Makarov vmaka...@redhat.com PR target/64110 * gcc.target/i386/pr64110.c: New.
[committed] Fix libtsan data symbolization
Hi! I have posted yesterday a patch to fix data symbolization using libbacktrace, this patch is a backport of the change I've sent upstream that has been committed today. 2014-12-16 Jakub Jelinek ja...@redhat.com * sanitizer_common/sanitizer_symbolizer_libbacktrace.cc, sanitizer_common/sanitizer_symbolizer_libbacktrace.h, sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc: Cherry pick upstream r224308. --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc (revision 218777) +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc (working copy) @@ -165,9 +165,9 @@ uptr LibbacktraceSymbolizer::SymbolizeCo return data.n_frames; } -bool LibbacktraceSymbolizer::SymbolizeData(DataInfo *info) { - backtrace_syminfo((backtrace_state *)state_, info-start, -SymbolizeDataCallback, ErrorCallback, info); +bool LibbacktraceSymbolizer::SymbolizeData(uptr addr, DataInfo *info) { + backtrace_syminfo((backtrace_state *)state_, addr, SymbolizeDataCallback, +ErrorCallback, info); return true; } @@ -185,7 +185,7 @@ uptr LibbacktraceSymbolizer::SymbolizeCo return 0; } -bool LibbacktraceSymbolizer::SymbolizeData(DataInfo *info) { +bool LibbacktraceSymbolizer::SymbolizeData(uptr addr, DataInfo *info) { return false; } --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h (revision 218777) +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h (working copy) @@ -33,7 +33,7 @@ class LibbacktraceSymbolizer { uptr SymbolizeCode(uptr addr, AddressInfo *frames, uptr max_frames, const char *module_name, uptr module_offset); - bool SymbolizeData(DataInfo *info); + bool SymbolizeData(uptr addr, DataInfo *info); // May return NULL if demangling failed. static char *Demangle(const char *name, bool always_alloc = false); --- libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc (revision 218777) +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc (working copy) @@ -593,7 +593,7 @@ class POSIXSymbolizer : public Symbolize // First, try to use libbacktrace symbolizer (if it's available). if (libbacktrace_symbolizer_ != 0) { mu_.CheckLocked(); - if (libbacktrace_symbolizer_-SymbolizeData(info)) + if (libbacktrace_symbolizer_-SymbolizeData(addr, info)) return true; } const char *str = SendCommand(true, module_name, module_offset); Jakub
Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.
Please ignore this one, I will further refine it. Sorry for disturbing! Thanks, bin On Tue, Dec 16, 2014 at 4:42 PM, Bin.Cheng amker.ch...@gmail.com wrote: On Thu, Dec 11, 2014 at 8:08 PM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Dec 11, 2014 at 10:56 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng bin.ch...@arm.com wrote: Hi, Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt issue still exists. Current candidate selecting algorithm tends to select fewer candidates given below reasons: 1) to better handle loops with many induction uses but the best choice is one generic basic induction variable; 2) to keep compilation time low. One fundamental weakness of the strategy is the opposite situation can't be handled properly sometimes. For these cases the best choice is each induction variable has its own candidate. This patch fixes the problem by shuffling candidate set after fix-point is reached by current implementation. The reason why this strategy works is it replaces candidate set by selecting local optimal candidate for some induction uses, and the new candidate set (has lower cost) is exact what we want in the mentioned case. Instrumentation data shows this can find better candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64. This patch actually is extension to the first version patch posted at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds another selecting pass with special seed set (more or less like the shuffled set in this patch). Data also confirms this patch can find optimal sets for most loops found by the first one, as well as optimal sets for many new loops. Bootstrap and test on x86_64, no regression on benchmarks. Bootstrap and test on aarch64. Since this patch only selects candidate set with lower cost, any regressions revealed are latent bugs of other components in GCC. I also collected GCC bootstrap time on x86_64, no regression either. Is this OK? The algorithm seems to be quadratic in the number of IV candidates (at least): Yes, I worried about that too, that's why I measured the bootstrap time. One way is restrict this procedure one time for each loop. I already tried that and it can capture +90% loops. Is this sounds reasonable? Yes. That's my suggestion to handle it in the caller of try_improve_iv_set? BTW, do we have some compilation time benchmarks for GCC? There are various testcases linked from PR47344, I don't remember any particular one putting load on IVOPTs (but I do remember seeing IVOPTs in the ~25% area in -ftime-report for some testcases). Hi Jeff Richard, I updated patch according to your review comments. Is this version looks good? I didn't find cases in PR47344 which exercising IVOPT, but I produced one case from PR53852 which runs ivopt for ~17% of total time (28s). This patch does increase IVOPT time to 18%. Unfortunately, I tried the other restriction, it doesn't work as well as this one on spec2k6, if I understood the method correctly. Hi Sebastian, Thanks for help! I managed to run llvm compilation time tests successfully as you suggested. Case Multisource/Benchmarks/mafft/pairlocalalign is regressed but I can't reproduce it in cmd. The running time of compilation of pairlocalalign.c is too small comparing to the results. I also tried to invoke it by using RunSafely.sh but no lucky either. So any documentation on this? Thanks very much! Thanks, bin 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * tree-ssa-loop-ivopts.c (cheaper_cost_with_cand): New function. (iv_ca_replace): New function. (try_improve_iv_set): New parameter try_replace_p. Replace candidates in IVS by calling iv_ca_replace. (find_optimal_iv_set_1): Pass new argument to try_improve_iv_set. gcc/testsuite/ChangeLog 2014-12-16 Bin Cheng bin.ch...@arm.com PR tree-optimization/62178 * gcc.target/aarch64/pr62178.c: New test.
[PATCH] Add (1 A) 1) folding (PR middle-end/64309)
As discussed in the PR, this adds folding of (1 A) 1) if in a eq/ne comparison. The assembly diff on my x86_64 box is movq%rsp, %rbp .cfi_def_cfa_register 6 movl%edi, -4(%rbp) - movl-4(%rbp), %eax - movl$1, %edx - movl%eax, %ecx - sarl%cl, %edx - movl%edx, %eax - andl$1, %eax - testl %eax, %eax - setne %al + cmpl$0, -4(%rbp) + sete%al movzbl %al, %eax popq%rbp .cfi_def_cfa 7, 8 Since this removes a shift, I was afraid that it could regress ubsan sanitization, but luckily that is not the case and the shift diagnostics seems to be intact. It triggers several times during the bootstrap. Bootstrapped/regtested on x86_64-linux + ppc64-linux, ok for trunk? 2014-12-16 Marek Polacek pola...@redhat.com PR middle-end/64309 * match.pd: Add ((1 A) 1) != 0 - A == 0 and ((1 A) 1) == 0 - A != 0. * gcc.dg/pr64309.c: New test. diff --git gcc/match.pd gcc/match.pd index 083d65f..47b01eb 100644 --- gcc/match.pd +++ gcc/match.pd @@ -599,6 +599,15 @@ along with GCC; see the file COPYING3. If not see build_int_cst (TREE_TYPE (@1), element_precision (type)), @1); })) +/* ((1 A) 1) != 0 - A == 0 */ +(simplify + (ne (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (eq @0 { build_zero_cst (TREE_TYPE (@0)); })) + +/* ((1 A) 1) == 0 - A != 0 */ +(simplify + (eq (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (ne @0 { build_zero_cst (TREE_TYPE (@0)); })) /* Simplifications of conversions. */ diff --git gcc/testsuite/gcc.dg/pr64309.c gcc/testsuite/gcc.dg/pr64309.c index e69de29..710a762 100644 --- gcc/testsuite/gcc.dg/pr64309.c +++ gcc/testsuite/gcc.dg/pr64309.c @@ -0,0 +1,66 @@ +/* PR middle-end/64309 */ +/* { dg-do run } */ +/* { dg-options -fdump-tree-original } */ + +int +fn1 (int n) +{ + return ((1 n) 1) != 0; +} + +int +fn2 (int n) +{ + return (1 (1 n)) != 0; +} + +int +fn3 (int n) +{ + return ((1 n) 1) == 0; +} + +int +fn4 (int n) +{ + return (1 (1 n)) == 0; +} + +int +main (void) +{ + if (fn1 (0) != 1 + || fn1 (1) != 0 + || fn1 (2) != 0 + || fn1 (3) != 0 + || fn1 (4) != 0 + || fn1 (5) != 0) +__builtin_abort (); + + if (fn2 (0) != 1 + || fn2 (1) != 0 + || fn2 (2) != 0 + || fn2 (3) != 0 + || fn2 (4) != 0 + || fn2 (5) != 0) +__builtin_abort (); + + if (fn3 (0) != 0 + || fn3 (1) != 1 + || fn3 (2) != 1 + || fn3 (3) != 1 + || fn3 (4) != 1 + || fn3 (5) != 1) +__builtin_abort (); + + if (fn4 (0) != 0 + || fn4 (1) != 1 + || fn4 (2) != 1 + || fn4 (3) != 1 + || fn4 (4) != 1 + || fn4 (5) != 1) +__builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not (|) original } } */ +/* { dg-final { cleanup-tree-dump original } } */ Marek
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On Mon, Dec 15, 2014 at 10:24:47AM -0600, Segher Boessenkool wrote: On Mon, Dec 15, 2014 at 04:51:14PM +0100, Paolo Bonzini wrote: 1) did you check that it never triggers on e.g. an x86 bootstrap, and that it doesn't trigger too often on PPC64? I have checked on my largish connection of tests for the carry insns on PowerPC, and only two (related) transforms are disabled, and they aren't too important anyway. Well, and the bad transforms are disabled, only just two of-em but much more frequent (long long x; x--;). I haven't checked on x86, but it's a bugfix: don't do things that blow up! It is amazing to me that it didn't show up before. One theory is that instructions that set the condition code as well as a GPR will never combine with a later insn to two insns, always to just one. But nothing made this explicit so AFAICS it is just an accident that it worked before. I'll do an instrumented x86 bootstrap. I did a run for powerpc64, one for powerpc, and one for x86-64. The powerpc64 bootstrap was with pre-installed GMP etc.; the others had those libraries in-tree. type1 is when try_combine used the ancient combine code to split a parallel set and set of cc; type2 is when it used my code to split any other parallel that sets two things; and type0 is when it didn't do either but still ended up with I1 and I2 the same UID (I think it might be called with the same insn twice; not a good thing, it does not know how to handle this; and it is really worrisome that it then sometimes succeeds in combining it). tries is how often that split-orig-I2-to-two code is used; recog is how often it reached the first call to recog (so it passed can_combine_p etc.); fail is how often it eventually failed (after reaching recog), one is how often it combined to one insn, two is how often it combined to two. powerpc64 tries recog failone two type1 39214 39214 38944 202 18 type2 21540 18968 18928 2 38 type0 292 289 0 3 powerpc tries recog failone two type1 21654 21654 21167 485 2 type2 21839 19754 19243 0 509 (*) type0 427 294 0 133 x86-64 tries recog failone two type1 17387 17387 17288 70 29 type2 40413 31681 30242 60 1369 type0 0 0 0 0 Not sure what to make of the high number in the x86-64 type2/two result. Segher (*) The (32-bit) powerpc bootstrap failed (that's what the patch is trying to rectify, after all); the columns on this line don't add up correctly (two are missing; this was a -j60 build).
[Patch, regcprop] Tentative fix for PR 64331
Hi, The cprop_hardreg pass does not consider REG_DEAD notes when propagating, and this causes issues if target specific code uses dead_or_set_regno_p to know if it can clobber registers. For example, regcrop transforms (insn 7 4 8 2 (set (reg:SI 16 r16 [orig:43 D.1617 ] [43]) (reg/v:SI 20 r20 [orig:46 x ] [46])) reduced.c:12 94 {*movsi} (nil)) ... (insn 13 12 14 3 (parallel [ (set (cc0) (compare (reg/v:SI 20 r20 [orig:46 x ] [46]) (const_int 0 [0]))) (clobber (scratch:QI)) ]) reduced.c:17 413 {*cmpsi} (expr_list:REG_DEAD (reg/v:SI 20 r20 [orig:46 x ] [46]) (nil))) ... (insn 17 16 18 4 (parallel [ (set (cc0) (compare (reg:SI 16 r16 [orig:43 D.1617 ] [43]) (reg:SI 24 r24 [orig:48 t_3(D)-b ] [48]))) (clobber (scratch:QI)) ]) reduced.c:20 413 {*cmpsi} (expr_list:REG_DEAD (reg:SI 24 r24 [orig:48 t_3(D)-b ] [48]) (expr_list:REG_DEAD (reg:SI 16 r16 [orig:43 D.1617 ] [43]) (nil into (insn 17 16 18 4 (parallel [ (set (cc0) (compare (reg:SI 20 r20 [orig:43 D.1617 ] [43]) (reg:SI 24 r24 [orig:48 t_3(D)-b ] [48]))) (clobber (scratch:QI)) ]) reduced.c:20 413 {*cmpsi} (expr_list:REG_DEAD (reg:SI 24 r24 [orig:48 t_3(D)-b ] [48]) (expr_list:REG_DEAD (reg:SI 16 r16 [orig:43 D.1617 ] [43]) (nil replacing r16 in insn 17 with r20, which was marked REG_DEAD in insn 13. The AVR backend, when emitting code for insn 13, figures that R20 is dead and therefore happily clobbers it. Killing regs that are marked REG_DEAD fixes the immediate problem. Is that the right approach? What do you guys think? Regards Senthil ChangeLog 2014-12-16 Senthil Kumar Selvaraj senthil_kumar.selva...@atmel.com PR rtl-optimization/64331 * regcprop.c (copyprop_hardreg_forward_1): Kill regs marked REG_DEAD. diff --git gcc/regcprop.c gcc/regcprop.c index daeb980..4f00fc4 100644 --- gcc/regcprop.c +++ gcc/regcprop.c @@ -815,6 +815,9 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd) would clobbers. */ for (link = REG_NOTES (insn); link; link = XEXP (link, 1)) { + if (REG_NOTE_KIND (link) == REG_DEAD) + kill_value (XEXP (link, 0), vd); + if (REG_NOTE_KIND (link) == REG_UNUSED) { kill_value (XEXP (link, 0), vd);
Re: [PATCH] Add (1 A) 1) folding (PR middle-end/64309)
On Tue, 16 Dec 2014, Marek Polacek wrote: As discussed in the PR, this adds folding of (1 A) 1) if in a eq/ne comparison. The assembly diff on my x86_64 box is movq%rsp, %rbp .cfi_def_cfa_register 6 movl%edi, -4(%rbp) - movl-4(%rbp), %eax - movl$1, %edx - movl%eax, %ecx - sarl%cl, %edx - movl%edx, %eax - andl$1, %eax - testl %eax, %eax - setne %al + cmpl$0, -4(%rbp) + sete%al movzbl %al, %eax popq%rbp .cfi_def_cfa 7, 8 Since this removes a shift, I was afraid that it could regress ubsan sanitization, but luckily that is not the case and the shift diagnostics seems to be intact. It triggers several times during the bootstrap. Bootstrapped/regtested on x86_64-linux + ppc64-linux, ok for trunk? 2014-12-16 Marek Polacek pola...@redhat.com PR middle-end/64309 * match.pd: Add ((1 A) 1) != 0 - A == 0 and ((1 A) 1) == 0 - A != 0. * gcc.dg/pr64309.c: New test. diff --git gcc/match.pd gcc/match.pd index 083d65f..47b01eb 100644 --- gcc/match.pd +++ gcc/match.pd @@ -599,6 +599,15 @@ along with GCC; see the file COPYING3. If not see build_int_cst (TREE_TYPE (@1), element_precision (type)), @1); })) +/* ((1 A) 1) != 0 - A == 0 */ +(simplify + (ne (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (eq @0 { build_zero_cst (TREE_TYPE (@0)); })) + +/* ((1 A) 1) == 0 - A != 0 */ +(simplify + (eq (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (ne @0 { build_zero_cst (TREE_TYPE (@0)); })) You can use (for cmp (ne eq) icmp (eq ne) (simplify (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) to combine both patterns. Btw, shoudln't it be (bit_and (...) integer_each_onep)? I'm always unsure about the complex integer case (maybe try a runtime testcase and see what happens - eventually we just don't support bit operations on them...) Thanks, Richard. /* Simplifications of conversions. */ diff --git gcc/testsuite/gcc.dg/pr64309.c gcc/testsuite/gcc.dg/pr64309.c index e69de29..710a762 100644 --- gcc/testsuite/gcc.dg/pr64309.c +++ gcc/testsuite/gcc.dg/pr64309.c @@ -0,0 +1,66 @@ +/* PR middle-end/64309 */ +/* { dg-do run } */ +/* { dg-options -fdump-tree-original } */ + +int +fn1 (int n) +{ + return ((1 n) 1) != 0; +} + +int +fn2 (int n) +{ + return (1 (1 n)) != 0; +} + +int +fn3 (int n) +{ + return ((1 n) 1) == 0; +} + +int +fn4 (int n) +{ + return (1 (1 n)) == 0; +} + +int +main (void) +{ + if (fn1 (0) != 1 + || fn1 (1) != 0 + || fn1 (2) != 0 + || fn1 (3) != 0 + || fn1 (4) != 0 + || fn1 (5) != 0) +__builtin_abort (); + + if (fn2 (0) != 1 + || fn2 (1) != 0 + || fn2 (2) != 0 + || fn2 (3) != 0 + || fn2 (4) != 0 + || fn2 (5) != 0) +__builtin_abort (); + + if (fn3 (0) != 0 + || fn3 (1) != 1 + || fn3 (2) != 1 + || fn3 (3) != 1 + || fn3 (4) != 1 + || fn3 (5) != 1) +__builtin_abort (); + + if (fn4 (0) != 0 + || fn4 (1) != 1 + || fn4 (2) != 1 + || fn4 (3) != 1 + || fn4 (4) != 1 + || fn4 (5) != 1) +__builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not (|) original } } */ +/* { dg-final { cleanup-tree-dump original } } */ Marek -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
[match-and-simplify] allow 't' only in user-defined predicates
This patch rejects 't' outside user-defined predicates. 2014-12-16 Prathamesh Kulkarni prathamesh.kulka...@linaro.org * genmatch.c (parser::parsing_match): New. (parser::parser): Initialize parsing_match to false. (parser::parse_pattern): Reset parsing_match when parsing user-defined predicate. (parser::parse_c_expr): Check if 't' is used when parsing_match is set. Thanks, Prathamesh Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 218778) +++ gcc/genmatch.c (working copy) @@ -2746,6 +2746,7 @@ vecsimplify * simplifiers; vecpredicate_id * user_predicates; bool parsing_match_operand; + bool parsing_match; }; /* Lexing helpers. */ @@ -3059,8 +3060,11 @@ /* If this is possibly a user-defined identifier mark it used. */ if (token-type == CPP_NAME) { - id_base *idb = get_operator ((const char *)CPP_HASHNODE - (token-val.node.node)-ident.str); + const char *lexeme = (const char *) CPP_HASHNODE (token-val.node.node)-ident.str; + if (strcmp (lexeme, t) == 0 !parsing_match) + fatal_at (token, 't' is allowed only in predicates); + + id_base *idb = get_operator (lexeme); user_id *p; if (idb (p = dyn_castuser_id * (idb)) p-is_oper_list) record_operlist (token-src_loc, p); @@ -3497,6 +3501,7 @@ parse_simplify (token-src_loc, simplifiers, NULL, NULL); else if (strcmp (id, match) == 0) { + parsing_match = true; bool with_args = false; if (peek ()-type == CPP_OPEN_PAREN) { @@ -3530,6 +3535,7 @@ fatal_at (token, non-matching number of match operands); p-nargs = e ? e-ops.length () : 0; parse_simplify (token-src_loc, p-matchers, p, e); + parsing_match = false; } else if (strcmp (id, for) == 0) parse_for (token-src_loc); @@ -3569,6 +3575,7 @@ oper_lists = vNULL; user_predicates = vNULL; parsing_match_operand = false; + parsing_match = false; const cpp_token *token = next (); while (token-type != CPP_EOF)
Re: [match-and-simplify] allow 't' only in user-defined predicates
sorry for the noise. I sent it just before our conversation on IRC. On 16 December 2014 at 19:58, Prathamesh Kulkarni prathamesh.kulka...@linaro.org wrote: This patch rejects 't' outside user-defined predicates. 2014-12-16 Prathamesh Kulkarni prathamesh.kulka...@linaro.org * genmatch.c (parser::parsing_match): New. (parser::parser): Initialize parsing_match to false. (parser::parse_pattern): Reset parsing_match when parsing user-defined predicate. (parser::parse_c_expr): Check if 't' is used when parsing_match is set. Thanks, Prathamesh
Re: [PATCH] Fix PR64240
On Tue, Dec 16, 2014 at 11:41 AM, Yangfei (Felix) felix.y...@huawei.com wrote: On December 16, 2014 9:51:25 AM CET, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, This patch fixes an obvious typo which may affect the DDG creation of SMS and make this optimization produce buggy code. Bootstrapped on x86_64-suse-linux. Also passed check-gcc test for aarch64-linux-gnu. OK for the trunk? Do you have a testcase? If so please add it. OK Yes, the patch is updated with the testcase added. Ok. Thanks, Richard. Index: gcc/ddg.c === --- gcc/ddg.c (revision 218582) +++ gcc/ddg.c (working copy) @@ -77,7 +77,7 @@ mark_mem_use (rtx *x, void *) { subrtx_iterator::array_type array; FOR_EACH_SUBRTX (iter, array, *x, NONCONST) -if (MEM_P (*x)) +if (MEM_P (*iter)) { mem_ref_p = true; break; Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 218582) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * ddg.c (mark_mem_use): Check *iter instead of *x. + 2014-12-10 Felix Yang felix.y...@huawei.com * config/aarch64/aarch64-protos.h (aarch64_function_profiler): Remove Index: gcc/testsuite/gcc.dg/sms-12.c === --- gcc/testsuite/gcc.dg/sms-12.c (revision 0) +++ gcc/testsuite/gcc.dg/sms-12.c (revision 0) @@ -0,0 +1,43 @@ +/* { dg-do run } */ +/* { dg-skip-if { ! { aarch64-*-* } } { * } { } } */ +/* { dg-options -O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves -fPIC } */ + +extern void abort (void); + +int X[1000]={0}; +int Y[1000]={0}; + +extern void abort (void); + +__attribute__ ((noinline)) +int +foo (int len, long a) +{ + int i; + long res = a; + + len = 1000; + for (i = 0; i len; i++) +res += X[i]* Y[i]; + + if (res != 601) +abort (); + +} + +int +main () +{ + X[0] = Y[1] = 2; + Y[0] = X[1] = 21; + X[2] = Y[3] = 3; + Y[2] = X[3] = 31; + X[4] = Y[5] = 4; + Y[4] = X[5] = 41; + + foo (6, 3); + return 0; +} + +/* { dg-final { cleanup-rtl-dump sms } } */ + Property changes on: gcc/testsuite/gcc.dg/sms-12.c ___ Added: svn:executable + * Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 218582) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-12-16 Felix Yang felix.y...@huawei.com + + PR rtl-optimization/64240 + * gcc.dg/sms-12.c: New test. + 2014-12-10 Martin Liska mli...@suse.cz * gcc.dg/ipa/pr63909.c: New test.
Fix capture parsing in (match ...
I am testing the following patch to properly setup capture_ids for parsing (match ... Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-12-16 Richard Biener rguent...@suse.de * genmatch.c (parser::parser): Initialize capture_ids. (parser::parse_pattern): Properly allocate capture_ids before using them. Set capture_ids to zero when its lifetime is supposed to finish. (parser::parse_simplify): Allocate capture_ids only if required. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 218747) +++ gcc/genmatch.c (working copy) @@ -3176,7 +3176,8 @@ parser::parse_simplify (source_location expr *result) { /* Reset the capture map. */ - capture_ids = new cid_map_t; + if (!capture_ids) +capture_ids = new cid_map_t; /* Reset oper_lists and set. */ hash_set user_id * olist; oper_lists_set = olist; @@ -3494,7 +3495,10 @@ parser::parse_pattern () const cpp_token *token = peek (); const char *id = get_ident (); if (strcmp (id, simplify) == 0) -parse_simplify (token-src_loc, simplifiers, NULL, NULL); +{ + parse_simplify (token-src_loc, simplifiers, NULL, NULL); + capture_ids = NULL; +} else if (strcmp (id, match) == 0) { bool with_args = false; @@ -3519,6 +3523,7 @@ parser::parse_pattern () expr *e = NULL; if (with_args) { + capture_ids = new cid_map_t; e = new expr (p); while (peek ()-type == CPP_ATSIGN) e-append_op (parse_capture (NULL)); @@ -3530,6 +3535,7 @@ parser::parse_pattern () fatal_at (token, non-matching number of match operands); p-nargs = e ? e-ops.length () : 0; parse_simplify (token-src_loc, p-matchers, p, e); + capture_ids = NULL; } else if (strcmp (id, for) == 0) parse_for (token-src_loc); @@ -3567,6 +3573,7 @@ parser::parser (cpp_reader *r_) simplifiers = vNULL; oper_lists_set = NULL; oper_lists = vNULL; + capture_ids = NULL; user_predicates = vNULL; parsing_match_operand = false;
Re: [PATCH] Add support for exceptions to tsan (PR sanitizer/64265)
On Tue, Dec 16, 2014 at 11:23 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Dec 16, 2014 at 10:47:06AM +0100, Richard Biener wrote: On Mon, Dec 15, 2014 at 7:50 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR, to support exceptions in -fsanitize=thread code, it is desirable to call __tsan_func_exit also when leaving functions by means of exceptions. Adding EH too late sounds too hard to me, so this patch instead adds an internal call during gimplification, makes sure the inliner removes the internal calls from the inline functions (we don't care about inlines, only about functions we emit), and for functions that didn't go through gimplify_function_tree (e.g. omp/tm etc. functions) just keeps using the old __tsan_func_exit additions. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk? So the issue is externally throwing EH, right? I wonder if we can teach the unwinder to do __tsan_func_exit instead, or have hooks in it that can be used by tsan? At least the issue seems generic enough for code instrumentation to consider a more general solution? How does -finstrument-functions work with externally throwing EH? -finstrument-functions works the way I wrote the patch, in fact the gimplify_function_tree bits I've added were right after the -finstrument-functions handling of the same. Anyway, the alternative would be to wrap the various personality functions like __gcc_personality_v0 __gxx_personality_v0 __gcj_personality_v0 __gccgo_personality_v0 __gnu_objc_personality_v0 call the dlsym (, RTLD_NEXT) version from there and if it returns _URC_INSTALL_CONTEXT , try to figure out what frame it will be in. We can query the IP (_Unwind_GetIP), or the CFA (_Unwind_GetCFA), but then map it through supposedly target dependent code to the actual frame pointers __tsan_func_* store (do they?). I suppose we could annotate the CFA with appropriate information? The problem with wrapping those personality functions is that I'm not sure how could it work with the various -static* options. Say if you -static-libstdc++ (and link libtsan dynamically), then the __gxx_personality_v0 in the binary will not be wrapped by what is in libtsan.so. If you -static-libstdc++ -static-libtsan, then __gxx_personality_v0 linked in will be very likely from libstdc++ and thus again not overloaded, or if -ltsan would come first (right now it doesn't), then you still couldn't use dlsym to get at the overloaded symbol. Yeah, which is why I suggested that one might want to have a generic (list of) callbacks that can be registered (like malloc hooks). It would be all extensions but GCC controls the unwinding ABI, right? So, while the __tsan_func_exit in cleanup is more expensive at runtime, it will work with all the wierdo options people are using. True. As it follows existing practice with -finstrument-functions the patch is probably the way to go for GCC 5. I was just wondering of a less heavy-weight solution piggy-backing on the unwinder. I suppose that idea wouldn't work for SJLJ exceptions so you'd need both approaches anyway. Richard. Jakub
Re: [C++ Patch] PR 58650
On 12/16/2014 05:40 AM, Paolo Carlini wrote: In better detail: grokdeclarator is called, via grokfield, by cp_parser_member_declaration. The latter stores the friendship information in a friend_p local flag, which remains true when grokdeclarator returns. Maybe check function_declarator_p in cp_parser_member_declaration? Jason
Re: [PATCH 2/3] Extended if-conversion
Hi Richard, Here is updated patch which includes (1) split critical edges for aggressive if conversion. (2) delete all stuff related to support of critical edge predication. (3) only one function - predicate_scalar_phi performs predication. (4) function find_phi_replacement_condition was deleted since it was included in predicate_scalar_phi for phi with two arguments. I checked that patch works in stress testing mode, i.e. with aggressive if conversion by default. What is your opinion? Thanks. Yuri. 2014-12-11 11:59 GMT+03:00 Richard Biener richard.guent...@gmail.com: On Wed, Dec 10, 2014 at 4:22 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Richard, Thanks for your reply! I didn't understand your point: Well, I don't mind splitting all critical edges unconditionally but you do it unconditionally in proposed patch. I don't mind means I am fine with it. Also I assume that call of split_critical_edges() can break ssa. For example, we can split headers of loops, loop exit blocks etc. How does that break SSA? You mean loop-closed SSA? I'd be surprised if so but that may be possible. I prefer to do something more loop-specialized, e.g. call edge_split() for critical edges outgoing from bb ending with GIMPLE_COND stmt (assuming that edge destination bb belongs to loop). That works for me as well but it is more complicated to implement. Ideally you'd only split one edge if you find a block with only critical predecessors (where we'd currently give up). But note that this requires re-computation of ifc_bbs in if_convertible_loop_p_1 and it will change loop-num_nodes so we have to be more careful in constructing the loop calling if_convertible_bb_p. Richard. 2014-12-10 17:31 GMT+03:00 Richard Biener richard.guent...@gmail.com: On Wed, Dec 10, 2014 at 11:54 AM, Yuri Rumyantsev ysrum...@gmail.com wrote: Richard, Sorry that I forgot to delete debug dump from my fix. I have few questions about your comments. 1. You wrote : You also still have two functions for PHI predication. And the new extended variant doesn't commonize the 2-args and general path Did you mean that I must combine predicate_scalar_phi and predicate_extended scalar phi to one function? Please note that if additional flag was not set up (i.e. aggressive_if_conv is false) extended predication is required more compile time since it builds hash_map. It's compile-time complexity is reasonable enough even for non-aggressive if-conversion. 2. About critical edge splitting. Did you mean that we should perform it (1) under aggressive_if_conv option only; (2) should we split all critical edges. Note that this leads to recomputing of topological order. Well, I don't mind splitting all critical edges unconditionally, thus do something like Index: gcc/tree-if-conv.c === --- gcc/tree-if-conv.c (revision 218515) +++ gcc/tree-if-conv.c (working copy) @@ -2235,12 +2235,21 @@ pass_if_conversion::execute (function *f if (number_of_loops (fun) = 1) return 0; + bool critical_edges_split_p = false; FOR_EACH_LOOP (loop, 0) if (flag_tree_loop_if_convert == 1 || flag_tree_loop_if_convert_stores == 1 || ((flag_tree_loop_vectorize || loop-force_vectorize) !loop-dont_vectorize)) - todo |= tree_if_conversion (loop); + { + if (!critical_edges_split_p) + { + split_critical_edges (); + critical_edges_split_p = true; + todo |= TODO_cleanup_cfg; + } + todo |= tree_if_conversion (loop); + } #ifdef ENABLE_CHECKING { It is worth noting that in current implementation bb's with 2 predecessors and both are on critical edges are accepted without additional option. Yes, I know. tree-if-conv.c is a mess right now and if we can avoid adding more to it and even fix the critical edge missed optimization with splitting critical edges then I am all for that solution. Richard. Thanks ahead. Yuri. 2014-12-09 18:20 GMT+03:00 Richard Biener richard.guent...@gmail.com: On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Richard, Here is updated patch2 with the following changes: 1. Delete functions phi_has_two_different_args and find_insertion_point. 2. Use only one function for extended predication - predicate_extended_scalar_phi. 3. Save gsi before insertion of predicate computations for basic blocks if it has 2 predecessors and both incoming edges are critical or it gas more than 2 predecessors and at least one incoming edge is critical. This saved iterator can be used by extended phi predication. Here is motivated test-case which explains this point. Test-case is attached (t5.c) and it must be compiled with -O2 -ftree-loop-vectorize -fopenmp options. The problem phi is in bb-7: bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 }) { bb 5:
Re: [PATCH] Add (1 A) 1) folding (PR middle-end/64309)
On Tue, Dec 16, 2014 at 03:08:23PM +0100, Richard Biener wrote: You can use (for cmp (ne eq) icmp (eq ne) (simplify (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) to combine both patterns. Btw, shoudln't it be (bit_and (...) Ah, we can have multiple operators to iterate (and yes, it's documented, my bad). integer_each_onep)? I'm always unsure about the complex integer case (maybe try a runtime testcase and see what happens - eventually we just don't support bit operations on them...) Correct - we don't allow complex operands for neither shift, nor bit and. So I left the integer_onep in there. Thanks. 2014-12-16 Marek Polacek pola...@redhat.com PR middle-end/64309 * match.pd: Add ((1 A) 1) != 0 - A == 0 and ((1 A) 1) == 0 - A != 0. * gcc.dg/pr64309.c: New test. diff --git gcc/match.pd gcc/match.pd index 083d65f..dbca99e 100644 --- gcc/match.pd +++ gcc/match.pd @@ -599,6 +599,13 @@ along with GCC; see the file COPYING3. If not see build_int_cst (TREE_TYPE (@1), element_precision (type)), @1); })) +/* ((1 A) 1) != 0 - A == 0 + ((1 A) 1) == 0 - A != 0 */ +(for cmp (ne eq) + icmp (eq ne) + (simplify + (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) /* Simplifications of conversions. */ diff --git gcc/testsuite/gcc.dg/pr64309.c gcc/testsuite/gcc.dg/pr64309.c index e69de29..710a762 100644 --- gcc/testsuite/gcc.dg/pr64309.c +++ gcc/testsuite/gcc.dg/pr64309.c @@ -0,0 +1,66 @@ +/* PR middle-end/64309 */ +/* { dg-do run } */ +/* { dg-options -fdump-tree-original } */ + +int +fn1 (int n) +{ + return ((1 n) 1) != 0; +} + +int +fn2 (int n) +{ + return (1 (1 n)) != 0; +} + +int +fn3 (int n) +{ + return ((1 n) 1) == 0; +} + +int +fn4 (int n) +{ + return (1 (1 n)) == 0; +} + +int +main (void) +{ + if (fn1 (0) != 1 + || fn1 (1) != 0 + || fn1 (2) != 0 + || fn1 (3) != 0 + || fn1 (4) != 0 + || fn1 (5) != 0) +__builtin_abort (); + + if (fn2 (0) != 1 + || fn2 (1) != 0 + || fn2 (2) != 0 + || fn2 (3) != 0 + || fn2 (4) != 0 + || fn2 (5) != 0) +__builtin_abort (); + + if (fn3 (0) != 0 + || fn3 (1) != 1 + || fn3 (2) != 1 + || fn3 (3) != 1 + || fn3 (4) != 1 + || fn3 (5) != 1) +__builtin_abort (); + + if (fn4 (0) != 0 + || fn4 (1) != 1 + || fn4 (2) != 1 + || fn4 (3) != 1 + || fn4 (4) != 1 + || fn4 (5) != 1) +__builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not (|) original } } */ +/* { dg-final { cleanup-tree-dump original } } */ Marek
Re: C++ PATCH for C++14 sized deallocation
On 12/16/2014 05:09 AM, Andreas Schwab wrote: covariant4.C:(.text._ZN5ModelD2Ev[_ZN5ModelD5Ev]+0x1e): undefined reference to `operator delete(void*, unsigned int)'. Can you determine why this reference isn't being satisfied by libstdc++? Jason
[PATCH][ARM]Fix __ARM_SIZEOF_WCHAR_T definition.
Hi all, According to ACLE 2.0, the value of __ARM_SIZEOF_WCHAR_T should be defined in terms of byte, which means it should be 2 or 4. This patch corrects the error in arm backend. arm-none-eabi regression test has been done, no new issues. Okay for trunk? Regards, Renlin Li gcc/ChangeLog: 2014-12-16 Renlin Li renlin...@arm.com * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Correct __ARM_SIZEOF_WCHAR_T macro definition. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index d850982..2c57182 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -74,8 +74,8 @@ extern char arm_arch_name[]; builtin_define_with_int_value (\ __ARM_SIZEOF_MINIMAL_ENUM,\ flag_short_enums ? 1 : 4);\ - builtin_define_type_sizeof (__ARM_SIZEOF_WCHAR_T, \ -wchar_type_node); \ + builtin_define_with_int_value (__ARM_SIZEOF_WCHAR_T, \ + WCHAR_TYPE_SIZE / 8); \ if (TARGET_ARM_ARCH_PROFILE)\ builtin_define_with_int_value ( \ __ARM_ARCH_PROFILE, TARGET_ARM_ARCH_PROFILE); \
Re: [PATCH] Fix for PR ipa/64146
On 12/12/2014 06:21 PM, Dominique Dhumieres wrote: Martin, Your test g++.dg/ipa/pr64146.C fails on darwin: grep bind pr64146.C.051i.icf returns nothing, so the first scan fails, while the second one succeeds. Dominique Hello. You are right, I forgot to decorate test case with: +/* { dg-require-alias } */ Martin
fix aix build error with math.h in gcc/sreal.c
Recent commit 218765 adding sreal::to_double() breaks on AIX due to math.h being included before _LARGE_FILES and __STDC_FORMAT_MACROS being defined later in config.h and system.h, respectively. 2014-12-16 Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com Both config.h and system.h define ABI/API macros for system headers. * sreal.c: Include math.h later. Thanks! /haubi/ Index: gcc/sreal.c === --- gcc/sreal.c (revision 218780) +++ gcc/sreal.c (working copy) @@ -47,9 +47,9 @@ sig == 0 exp == -SREAL_MAX_EXP */ -#include math.h #include config.h #include system.h +#include math.h #include coretypes.h #include sreal.h
Re: [PATCH] Fix for PR ipa/64146
On 12/11/2014 03:03 PM, Richard Biener wrote: On Thu, Dec 11, 2014 at 2:49 PM, Martin Liška mli...@suse.cz wrote: Hello. In PR64146, for position independent code IPA ICF should be more careful about thunk creation. Patch can bootstrap on x86_64-linux-pc and no new regression was seen. Ready for thunk? Hmm, does that merge the functions but keep a call to the original alias which can be overridden at runtime? If so, ok. Hello. No, in this case there's no merge operation processed. Function are going to remain as they are. I hope such behavior is the right one. Martin Thanks, Richard. Thank you, Martin
[PATCH] Teach VRP about x cst even if x is VARYING (PR tree-optimization/64322)
Hi! If for RSHIFT_EXPR vr0 is not VR_RANGE or is symbolic, currently we make the result VARYING, even when we can do much better just by trying to shift the min and max values down. Divisions/modulo already handles it similarly, and +/-/ also handle it. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-12-16 Jakub Jelinek ja...@redhat.com PR tree-optimization/64322 * tree-vrp.c (extract_range_from_binary_expr_1): Attempt to derive range for RSHIFT_EXPR even if vr0 range is not VR_RANGE or is symbolic. * gcc.dg/tree-ssa/vrp95.c: New test. --- gcc/tree-vrp.c.jj 2014-12-01 14:57:30.0 +0100 +++ gcc/tree-vrp.c 2014-12-16 10:17:27.543111649 +0100 @@ -2434,6 +2434,7 @@ extract_range_from_binary_expr_1 (value_ code != MAX_EXPR code != PLUS_EXPR code != MINUS_EXPR + code != RSHIFT_EXPR (vr0.type == VR_VARYING || vr1.type == VR_VARYING || vr0.type != vr1.type @@ -2948,6 +2949,15 @@ extract_range_from_binary_expr_1 (value_ { if (code == RSHIFT_EXPR) { + /* Even if vr0 is VARYING or otherwise not usable, we can derive +useful ranges just from the shift count. E.g. +x 63 for signed 64-bit x is always [-1, 0]. */ + if (vr0.type != VR_RANGE || symbolic_range_p (vr0)) + { + vr0.type = type = VR_RANGE; + vr0.min = vrp_val_min (expr_type); + vr0.max = vrp_val_max (expr_type); + } extract_range_from_multiplicative_op_1 (vr, code, vr0, vr1); return; } --- gcc/testsuite/gcc.dg/tree-ssa/vrp95.c.jj2014-12-16 12:11:19.048361844 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/vrp95.c 2014-12-16 12:24:47.080308362 +0100 @@ -0,0 +1,50 @@ +/* PR tree-optimization/64322 */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-vrp1 } */ + +extern void link_error (); +extern void required_check (); + +long long int +foo (long long int x) +{ + x = sizeof (long long int) * __CHAR_BIT__ - 1; + if (x != 0 x != -1) +link_error (); + return x; +} + +unsigned long long int +bar (unsigned long long int x) +{ + x = sizeof (long long int) * __CHAR_BIT__ - 1; + if (x != 0 x != 1) +link_error (); + return x; +} + +long long int +baz (long long int x) +{ + x = (x sizeof (long long int) * __CHAR_BIT__ - 1) 1; + x = x / 0x1LL; + if (x != 0) +link_error (); + return x; +} + +unsigned long long int +range (unsigned long long int x, int y) +{ + y = 3; + x = sizeof (long long int) * __CHAR_BIT__ - 1 - y; + if (x 15) +link_error (); + if (x == 15) +required_check (); + return x; +} + +/* { dg-final { scan-tree-dump-not link_error vrp1 } } */ +/* { dg-final { scan-tree-dump required_check vrp1 } } */ +/* { dg-final { cleanup-tree-dump vrp1 } } */ Jakub
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Tuesday, December 16 2014, Jakub Jelinek wrote: On Tue, Dec 16, 2014 at 09:36:33AM +, Pedro Alves wrote: On 12/15/2014 11:00 PM, Sergio Durigan Junior wrote: +# Check if grep supports the '--text' option. + +GREP_TEXT_OPT=--text +if grep --text 21 | grep unrecognized option /dev/null 21 ; then + GREP_TEXT_OPT= +fi + That assumes all greps output unrecognized option on unrecognized options. I don't think that can be counted on being portable? ISTM reversing the logic of the test would be better. IOW, test that --text actually works. Something like this perhaps: # If grep supports the '--text' option, use it. GREP_TEXT_OPT= if echo foo | grep --text foo /dev/null 21 ; then GREP_TEXT_OPT=--text fi Even better include some non-text bytes around the foo string in the input you feed to grep. Also, perhaps better would be to use a GREP variable, initialized to GREP=grep or GREP=grep --text and just invoke $GREP. Hm, right. Thank you Pedro and Jakub for the comments. I will work on another version of the patch. -- Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/
Re: [Patch, regcprop] Tentative fix for PR 64331
The cprop_hardreg pass does not consider REG_DEAD notes when propagating, and this causes issues if target specific code uses dead_or_set_regno_p to know if it can clobber registers. As explained in the audit trail, it doesn't have to. Killing regs that are marked REG_DEAD fixes the immediate problem. Is that the right approach? What do you guys think? Nope, consumers of REG_DEAD notes must instead explicitly ask the DF framework to recompute them. -- Eric Botcazou
Re: C++ PATCH for C++14 sized deallocation
Jason Merrill ja...@redhat.com writes: On 12/16/2014 05:09 AM, Andreas Schwab wrote: covariant4.C:(.text._ZN5ModelD2Ev[_ZN5ModelD5Ev]+0x1e): undefined reference to `operator delete(void*, unsigned int)'. Can you determine why this reference isn't being satisfied by libstdc++? $ objdump -tC x86_64-suse-linux/32/libstdc++-v3/src/.libs/libstdc++.so.6.0.21 | grep 'operator delete' 0004fb30 g F .text 001d operator delete[](void*) 0004faf0 g F .text 001d operator delete(void*) 0004fb50 g F .text 001d operator delete[](void*, std::nothrow_t const) 0004fb10 g F .text 001d operator delete(void*, std::nothrow_t const) Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
Re: C++ PATCH for C++14 sized deallocation
On Tue, Dec 16, 2014 at 10:16:42AM -0500, Jason Merrill wrote: On 12/16/2014 05:09 AM, Andreas Schwab wrote: covariant4.C:(.text._ZN5ModelD2Ev[_ZN5ModelD5Ev]+0x1e): undefined reference to `operator delete(void*, unsigned int)'. Can you determine why this reference isn't being satisfied by libstdc++? Note it fails on i686-linux (and x86_64-linux with -m32) too: http://gcc.gnu.org/ml/gcc-testresults/2014-12/msg02036.html Jakub
[PATCH] Fix C++ PATCH for C++14 sized deallocation
On Tue, Dec 16, 2014 at 05:54:04PM +0100, Jakub Jelinek wrote: On Tue, Dec 16, 2014 at 10:16:42AM -0500, Jason Merrill wrote: On 12/16/2014 05:09 AM, Andreas Schwab wrote: covariant4.C:(.text._ZN5ModelD2Ev[_ZN5ModelD5Ev]+0x1e): undefined reference to `operator delete(void*, unsigned int)'. Can you determine why this reference isn't being satisfied by libstdc++? Note it fails on i686-linux (and x86_64-linux with -m32) too: http://gcc.gnu.org/ml/gcc-testresults/2014-12/msg02036.html So something like (untested)? [jmy] is e.g. what is used for operator new... 2014-12-16 Jakub Jelinek ja...@redhat.com * config/abi/pre/gnu.ver (CXXABI_1.3.9): Export not just _Zd[la]Pvm, but also _Zd[la]Pv[jy] to cover other std::size_t manglings. --- libstdc++-v3/config/abi/pre/gnu.ver 2014-12-16 15:03:03.183517188 +0100 +++ libstdc++-v3/config/abi/pre/gnu.ver 2014-12-16 18:08:22.527440407 +0100 @@ -1734,9 +1734,9 @@ CXXABI_1.3.9 { _ZTSPK[no]; # operator delete(void*, std::size_t) -_ZdlPvm; +_ZdlPv[jmy]; # operator delete[](void*, std::size_t) -_ZdaPvm; +_ZdaPv[jmy]; } CXXABI_1.3.8; Jakub
Re: [PATCH] Fix C++ PATCH for C++14 sized deallocation
OK, thanks. Jason
Re: [C++ Patch] PR 58650
Hi, On 12/16/2014 04:10 PM, Jason Merrill wrote: On 12/16/2014 05:40 AM, Paolo Carlini wrote: In better detail: grokdeclarator is called, via grokfield, by cp_parser_member_declaration. The latter stores the friendship information in a friend_p local flag, which remains true when grokdeclarator returns. Maybe check function_declarator_p in cp_parser_member_declaration? I see what you mean: try to somehow realize that grokdeclarator issued an error and we are in error recovery. What about directly addressing NSDMIs, the specific case at issue, thus the below? A bit ad-hoc-ish but on the other hand should be lighter than calling function_declarator_p, etc, to figure out... Well, I'd like to add a consideration in favor of the simplistic solution involving error_mark_node which I proposed earlier today: it has the advantage that we end up with consistent diagnostic for struct A { friend int i; A() { i = 1; } }; and struct B { friend int i = 0; B() { i = 1; } }; as far as the constructor is concerned: that is, 'i' is diagnosed as undeclared in both cases. This is, apparently, what both current clang and edg do. All the other solutions I figured out so far, so to speak recover too much in case of NSDMIs, that is beyond what we do for normal data members: no diagnostic is produced for the constructor in B. Thanks, Paolo. // Index: cp/parser.c === --- cp/parser.c (revision 218777) +++ cp/parser.c (working copy) @@ -21077,8 +21077,16 @@ cp_parser_member_declaration (cp_parser* parser) else if (TREE_CODE (decl) == FIELD_DECL !DECL_C_BIT_FIELD (decl) DECL_INITIAL (decl)) - /* Add DECL to the queue of NSDMI to be parsed later. */ - vec_safe_push (unparsed_nsdmis, decl); + { + + if (friend_p) + /* Explicitly include NSDMIs for error recovery purposes: + avoid crashing later if one is wrongly declared friend + (c++/58650). */ + finish_member_declaration (decl); + /* Add DECL to the queue of NSDMI to be parsed later. */ + vec_safe_push (unparsed_nsdmis, decl); + } } if (assume_semicolon) Index: testsuite/g++.dg/parse/friend12.C === --- testsuite/g++.dg/parse/friend12.C (revision 0) +++ testsuite/g++.dg/parse/friend12.C (working copy) @@ -0,0 +1,7 @@ +// PR c++/58650 + +struct A +{ + friend int i = 0; // { dg-error cannot be declared friend } +// { dg-error non-static data member { target { ! c++11 } } 5 } +};
Re: Fwd: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.
Bin.Cheng wrote: Multisource/Benchmarks/mafft/pairlocalalign is regressed but I can't reproduce it in cmd. The running time of compilation of pairlocalalign.c is too small comparing to the results. I also tried to invoke it by using RunSafely.sh but no lucky either. So any documentation on this? Thanks very much! There is not much documentation on running the llvm test-suite. Here is how I do rerun a single benchmark: In the build directory, if it is clean, i.e., you have just configure'd, you can run make clean and that will traverse all the directories and create them if they do not exist. If you have already run make TEST=simple you do not have to run make clean as you already have all the directories under the build dir. Once you have the benchmark dir in the build dir, just do: $ cd Multisource/Benchmarks/mafft/pairlocalalign $ make clean $ make TEST=simple [... all other variables as mentioned before ...] this way you will only run that specific benchmark. If you need to see which commands RunSafely.sh is running, I would suggest you add some echo $CMD or set -x in there. I think by default you do have the compiler commands. Sebastian
Re: [C++ Patch] PR 58650
On 12/16/2014 12:49 PM, Paolo Carlini wrote: I see what you mean: try to somehow realize that grokdeclarator issued an error and we are in error recovery. What about directly addressing NSDMIs, the specific case at issue, thus the below? A bit ad-hoc-ish but on the other hand should be lighter than calling function_declarator_p, etc, to figure out... Well, since only functions can be friends at this point, since types take a different path, maybe change the if (!friend_p) just above to if (!friend_p || TREE_CODE (decl) != FUNCTION_DECL)? Jason
Re: [PATCH] Add (1 A) 1) folding (PR middle-end/64309)
On Tue, 16 Dec 2014, Marek Polacek wrote: On Tue, Dec 16, 2014 at 03:08:23PM +0100, Richard Biener wrote: You can use (for cmp (ne eq) icmp (eq ne) (simplify (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) to combine both patterns. Btw, shoudln't it be (bit_and (...) Ah, we can have multiple operators to iterate (and yes, it's documented, my bad). integer_each_onep)? I'm always unsure about the complex integer case (maybe try a runtime testcase and see what happens - eventually we just don't support bit operations on them...) Correct - we don't allow complex operands for neither shift, nor bit and. So I left the integer_onep in there. Thanks. Ok. Thanks, Richard. 2014-12-16 Marek Polacek pola...@redhat.com PR middle-end/64309 * match.pd: Add ((1 A) 1) != 0 - A == 0 and ((1 A) 1) == 0 - A != 0. * gcc.dg/pr64309.c: New test. diff --git gcc/match.pd gcc/match.pd index 083d65f..dbca99e 100644 --- gcc/match.pd +++ gcc/match.pd @@ -599,6 +599,13 @@ along with GCC; see the file COPYING3. If not see build_int_cst (TREE_TYPE (@1), element_precision (type)), @1); })) +/* ((1 A) 1) != 0 - A == 0 + ((1 A) 1) == 0 - A != 0 */ +(for cmp (ne eq) + icmp (eq ne) + (simplify + (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) + (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) /* Simplifications of conversions. */ diff --git gcc/testsuite/gcc.dg/pr64309.c gcc/testsuite/gcc.dg/pr64309.c index e69de29..710a762 100644 --- gcc/testsuite/gcc.dg/pr64309.c +++ gcc/testsuite/gcc.dg/pr64309.c @@ -0,0 +1,66 @@ +/* PR middle-end/64309 */ +/* { dg-do run } */ +/* { dg-options -fdump-tree-original } */ + +int +fn1 (int n) +{ + return ((1 n) 1) != 0; +} + +int +fn2 (int n) +{ + return (1 (1 n)) != 0; +} + +int +fn3 (int n) +{ + return ((1 n) 1) == 0; +} + +int +fn4 (int n) +{ + return (1 (1 n)) == 0; +} + +int +main (void) +{ + if (fn1 (0) != 1 + || fn1 (1) != 0 + || fn1 (2) != 0 + || fn1 (3) != 0 + || fn1 (4) != 0 + || fn1 (5) != 0) +__builtin_abort (); + + if (fn2 (0) != 1 + || fn2 (1) != 0 + || fn2 (2) != 0 + || fn2 (3) != 0 + || fn2 (4) != 0 + || fn2 (5) != 0) +__builtin_abort (); + + if (fn3 (0) != 0 + || fn3 (1) != 1 + || fn3 (2) != 1 + || fn3 (3) != 1 + || fn3 (4) != 1 + || fn3 (5) != 1) +__builtin_abort (); + + if (fn4 (0) != 0 + || fn4 (1) != 1 + || fn4 (2) != 1 + || fn4 (3) != 1 + || fn4 (4) != 1 + || fn4 (5) != 1) +__builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not (|) original } } */ +/* { dg-final { cleanup-tree-dump original } } */ Marek -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: fix aix build error with math.h in gcc/sreal.c
On Tue, Dec 16, 2014 at 5:04 PM, Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com wrote: Recent commit 218765 adding sreal::to_double() breaks on AIX due to math.h being included before _LARGE_FILES and __STDC_FORMAT_MACROS being defined later in config.h and system.h, respectively. sreal.c shouldn't include math.h, if really really really needed math.h needs to be included from system.h at the appropriate place. Richard, 2014-12-16 Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com Both config.h and system.h define ABI/API macros for system headers. * sreal.c: Include math.h later. Thanks! /haubi/
Re: [PATCH] Teach VRP about x cst even if x is VARYING (PR tree-optimization/64322)
On Tue, 16 Dec 2014, Jakub Jelinek wrote: Hi! If for RSHIFT_EXPR vr0 is not VR_RANGE or is symbolic, currently we make the result VARYING, even when we can do much better just by trying to shift the min and max values down. Divisions/modulo already handles it similarly, and +/-/ also handle it. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2014-12-16 Jakub Jelinek ja...@redhat.com PR tree-optimization/64322 * tree-vrp.c (extract_range_from_binary_expr_1): Attempt to derive range for RSHIFT_EXPR even if vr0 range is not VR_RANGE or is symbolic. * gcc.dg/tree-ssa/vrp95.c: New test. --- gcc/tree-vrp.c.jj 2014-12-01 14:57:30.0 +0100 +++ gcc/tree-vrp.c2014-12-16 10:17:27.543111649 +0100 @@ -2434,6 +2434,7 @@ extract_range_from_binary_expr_1 (value_ code != MAX_EXPR code != PLUS_EXPR code != MINUS_EXPR + code != RSHIFT_EXPR (vr0.type == VR_VARYING || vr1.type == VR_VARYING || vr0.type != vr1.type @@ -2948,6 +2949,15 @@ extract_range_from_binary_expr_1 (value_ { if (code == RSHIFT_EXPR) { + /* Even if vr0 is VARYING or otherwise not usable, we can derive + useful ranges just from the shift count. E.g. + x 63 for signed 64-bit x is always [-1, 0]. */ + if (vr0.type != VR_RANGE || symbolic_range_p (vr0)) + { + vr0.type = type = VR_RANGE; + vr0.min = vrp_val_min (expr_type); + vr0.max = vrp_val_max (expr_type); + } extract_range_from_multiplicative_op_1 (vr, code, vr0, vr1); return; } --- gcc/testsuite/gcc.dg/tree-ssa/vrp95.c.jj 2014-12-16 12:11:19.048361844 +0100 +++ gcc/testsuite/gcc.dg/tree-ssa/vrp95.c 2014-12-16 12:24:47.080308362 +0100 @@ -0,0 +1,50 @@ +/* PR tree-optimization/64322 */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-vrp1 } */ + +extern void link_error (); +extern void required_check (); + +long long int +foo (long long int x) +{ + x = sizeof (long long int) * __CHAR_BIT__ - 1; + if (x != 0 x != -1) +link_error (); + return x; +} + +unsigned long long int +bar (unsigned long long int x) +{ + x = sizeof (long long int) * __CHAR_BIT__ - 1; + if (x != 0 x != 1) +link_error (); + return x; +} + +long long int +baz (long long int x) +{ + x = (x sizeof (long long int) * __CHAR_BIT__ - 1) 1; + x = x / 0x1LL; + if (x != 0) +link_error (); + return x; +} + +unsigned long long int +range (unsigned long long int x, int y) +{ + y = 3; + x = sizeof (long long int) * __CHAR_BIT__ - 1 - y; + if (x 15) +link_error (); + if (x == 15) +required_check (); + return x; +} + +/* { dg-final { scan-tree-dump-not link_error vrp1 } } */ +/* { dg-final { scan-tree-dump required_check vrp1 } } */ +/* { dg-final { cleanup-tree-dump vrp1 } } */ Jakub -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Go patch committed: Fix parsing of send clauses with composite literals
PR 61273 points out that for ; false; c - false { doesn't parse correctly. It's because the false { is incorrectly interpreted as being a potential composite literal. This patch from Chris Manghane fixes the parsing bug. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 670705a1c9cc go/parse.cc --- a/go/parse.cc Mon Dec 15 12:17:08 2014 -0800 +++ b/go/parse.cc Tue Dec 16 10:50:34 2014 -0800 @@ -3819,7 +3819,7 @@ token = this-peek_token(); if (token-is_op(OPERATOR_CHANOP)) { - this-send_stmt(this-verify_not_sink(exp)); + this-send_stmt(this-verify_not_sink(exp), may_be_composite_lit); if (return_exp != NULL) *return_exp = true; } @@ -3913,13 +3913,13 @@ // Channel = Expression . void -Parse::send_stmt(Expression* channel) +Parse::send_stmt(Expression* channel, bool may_be_composite_lit) { go_assert(this-peek_token()-is_op(OPERATOR_CHANOP)); Location loc = this-location(); this-advance_token(); - Expression* val = this-expression(PRECEDENCE_NORMAL, false, true, NULL, -NULL); + Expression* val = this-expression(PRECEDENCE_NORMAL, false, +may_be_composite_lit, NULL, NULL); Statement* s = Statement::make_send_statement(channel, val, loc); this-gogo_-add_statement(s); } diff -r 670705a1c9cc go/parse.h --- a/go/parse.hMon Dec 15 12:17:08 2014 -0800 +++ b/go/parse.hTue Dec 16 10:50:34 2014 -0800 @@ -245,7 +245,7 @@ void statement_list(); bool statement_list_may_start_here(); void expression_stat(Expression*); - void send_stmt(Expression*); + void send_stmt(Expression*, bool may_be_composite_lit); void inc_dec_stat(Expression*); void assignment(Expression*, bool may_be_composite_lit, Range_clause*); void tuple_assignment(Expression_list*, bool may_be_composite_lit,
Re: [PATCH] Fix -fsanitize=float-cast-overflow with C FE (PR sanitizer/64289)
On Fri, 12 Dec 2014, Jakub Jelinek wrote: Hi! -fsanitize=float-cast-overflow sanitization is done in convert.c and calls there save_expr. Unfortunately, save_expr is a no-go for the C FE, we need c_save_expr, but as convert.c is shared by all FEs, the only way to arrange that would be a new langhook. This patch attempts to fix it the same way as PR54428 did (the other save_expr in c-convert.c). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On Tue, Dec 16, 2014 at 07:28:22AM -0600, Segher Boessenkool wrote: and type0 is when it didn't do either but still ended up with I1 and I2 the same UID (I think it might be called with the same insn twice; not a good thing, it does not know how to handle this; and it is really worrisome that it then sometimes succeeds in combining it). ... and when I make try_combine immediately refuse duplicates in I0,I1,I2 everything works again. When INSN_UID (i1) == INSN_UID (i2), calling df_insn_delete (i1) does set the should be deleted flag on i2 instead, but try_combine later calls df_insn_rescan (i2) which resets it again. This is icky and looks very wrong in the debug dump, but it does work fine. Doing an (effictively) 2-2 combine like this still feels quite unsafe wrt combine possibly looping, but it has worked for over twenty years. Bootstrapping, making new test results, will send a new patch later today. Thanks for pushing me to look much closer, Segher
[PATCH, libgo]: Fix build warning
Hello! When building libgo on CentOS 5.11, following warnings appear: In file included from /usr/include/fcntl.h:38:0, from sysinfo.c:6: /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:317:16: warning: inline function ‘lstat64’ declared but never defined __inline__ int lstat64 (__const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:286:16: warning: inline function ‘fstatat64’ declared but never defined __inline__ int fstatat64 (int __fd, __const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:255:16: warning: inline function ‘fstat64’ declared but never defined __inline__ int fstat64 (int __fd, struct stat64 *__buf) __THROW __nonnull ((2)); ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:250:16: warning: inline function ‘stat64’ declared but never defined __inline__ int stat64 (__const char *__restrict __file, ^ These are emitted from: CC=/home/uros/gcc-build/./gcc/xgcc -B/home/uros/gcc-build/./gcc/ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem /usr/ local/x86_64-unknown-linux-gnu/include -isystem /usr/local/x86_64-unknown-linux-gnu/sys-include-DHAVE_CONFIG_H -I. -I../../../gcc-svn/trunk/libgo -I ../../../gcc-svn/trunk/libgo/run time -I../../../gcc-svn/trunk/libgo/../libffi/include -I../libffi/include -pthread -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 /bin/sh ../../../gcc-svn/trunk/libgo/mksysin fo.sh due to the sys/stat.h, which protects above functions with: # if defined __USE_LARGEFILE64 \ (! defined __USE_FILE_OFFSET64 \ || (defined __REDIRECT_NTH defined __OPTIMIZE__)) Adding -O to OSCFLAGS fixes this issue, as __OPTIMIZE__ is defined with -O. Patch was bootstrapped and regression tested on x86_64-linux-gnu (CentOS 5.11), where it removes the above warnings. OK for mainline? Uros. Index: configure === --- configure (revision 218778) +++ configure (working copy) @@ -13968,7 +13968,7 @@ -OSCFLAGS=-D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 +OSCFLAGS=-D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O case $target in mips-sgi-irix6.5*) # IRIX 6 needs _XOPEN_SOURCE=500 for the XPG5 version of struct Index: configure.ac === --- configure.ac(revision 218778) +++ configure.ac(working copy) @@ -349,7 +349,7 @@ AC_SUBST(GO_SYSCALL_OS_ARCH_FILE) dnl Special flags used to generate sysinfo.go. -OSCFLAGS=-D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 +OSCFLAGS=-D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O case $target in mips-sgi-irix6.5*) # IRIX 6 needs _XOPEN_SOURCE=500 for the XPG5 version of struct
Re: fix aix build error with math.h in gcc/sreal.c
On Tue, Dec 16, 2014 at 5:04 PM, Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com wrote: Recent commit 218765 adding sreal::to_double() breaks on AIX due to math.h being included before _LARGE_FILES and __STDC_FORMAT_MACROS being defined later in config.h and system.h, respectively. sreal.c shouldn't include math.h, if really really really needed math.h needs to be included from system.h at the appropriate place. Hmm, I need math.h for the exponential function. genautomata is also including math.h. Should we thus move it to system.h Something like this? Since i do not caremuch about performance of to_double, we could also just have loop that multiplies/divides by 2 until exponent is reached. It is rather lame though. Honza Index: system.h === --- system.h(revision 218788) +++ system.h(working copy) @@ -45,6 +45,8 @@ #include stdio.h +#include math.h + /* Define a generic NULL if one hasn't already been defined. */ #ifndef NULL #define NULL 0 Index: genautomata.c === --- genautomata.c (revision 218788) +++ genautomata.c (working copy) @@ -113,7 +113,6 @@ #include errors.h #include gensupport.h -#include math.h #include hashtab.h #include vec.h #include fnmatch.h Index: sreal.c === --- sreal.c (revision 218788) +++ sreal.c (working copy) @@ -49,7 +49,6 @@ #include config.h #include system.h -#include math.h #include coretypes.h #include sreal.h Richard, 2014-12-16 Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com Both config.h and system.h define ABI/API macros for system headers. * sreal.c: Include math.h later. Thanks! /haubi/
Go patch committed: Don't crash copying empty composite literal
This patch by Chris Manghane fixes a compiler crash when copying an empty composite literal. This fixes GCC PR 61264. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 22a19b12e472 go/expressions.cc --- a/go/expressions.cc Tue Dec 16 10:50:55 2014 -0800 +++ b/go/expressions.cc Tue Dec 16 11:12:37 2014 -0800 @@ -11588,7 +11588,10 @@ do_copy() { Struct_construction_expression* ret = - new Struct_construction_expression(this-type_, this-vals_-copy(), + new Struct_construction_expression(this-type_, +(this-vals_ == NULL + ? NULL + : this-vals_-copy()), this-location()); if (this-traverse_order_ != NULL) ret-set_traverse_order(this-traverse_order_); @@ -12353,7 +12356,10 @@ Expression* do_copy() { -return new Map_construction_expression(this-type_, this-vals_-copy(), +return new Map_construction_expression(this-type_, + (this-vals_ == NULL + ? NULL + : this-vals_-copy()), this-location()); }
Re: [PATCH, libgo]: Fix build warning
On Tue, Dec 16, 2014 at 11:05 AM, Uros Bizjak ubiz...@gmail.com wrote: When building libgo on CentOS 5.11, following warnings appear: In file included from /usr/include/fcntl.h:38:0, from sysinfo.c:6: /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:317:16: warning: inline function ‘lstat64’ declared but never defined __inline__ int lstat64 (__const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:286:16: warning: inline function ‘fstatat64’ declared but never defined __inline__ int fstatat64 (int __fd, __const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:255:16: warning: inline function ‘fstat64’ declared but never defined __inline__ int fstat64 (int __fd, struct stat64 *__buf) __THROW __nonnull ((2)); ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:250:16: warning: inline function ‘stat64’ declared but never defined __inline__ int stat64 (__const char *__restrict __file, ^ These are emitted from: CC=/home/uros/gcc-build/./gcc/xgcc -B/home/uros/gcc-build/./gcc/ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem /usr/ local/x86_64-unknown-linux-gnu/include -isystem /usr/local/x86_64-unknown-linux-gnu/sys-include-DHAVE_CONFIG_H -I. -I../../../gcc-svn/trunk/libgo -I ../../../gcc-svn/trunk/libgo/run time -I../../../gcc-svn/trunk/libgo/../libffi/include -I../libffi/include -pthread -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 /bin/sh ../../../gcc-svn/trunk/libgo/mksysin fo.sh due to the sys/stat.h, which protects above functions with: # if defined __USE_LARGEFILE64 \ (! defined __USE_FILE_OFFSET64 \ || (defined __REDIRECT_NTH defined __OPTIMIZE__)) Adding -O to OSCFLAGS fixes this issue, as __OPTIMIZE__ is defined with -O. Patch was bootstrapped and regression tested on x86_64-linux-gnu (CentOS 5.11), where it removes the above warnings. OK for mainline? This seems a bit dubious, as it seems that the same problem would occur for any C program that #include's sys/stat.h and is compiled without optimization. I don't mind passing -O when running mksysinfo.sh, but your patch will pass -O to all the C file compilations. That doesn't seem like a good idea--some people might want to debug that code. Can you try either only addding -O for CentOS, or addding it only to mksysinfo.sh? Ian
[Google] Port patch r215585 to Google/4.9 branch
Hi In Google application we hit the same problem as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63341, so we also need the patch r215585 for Google/4.9 branch. It passed following tests: bootstrap and regression test on x86-64. regression test on ppc. Google reference 18687126. OK for Google/4.9 branch? patch Description: Binary data
Re: [Google] Port patch r215585 to Google/4.9 branch
The fix is already in upstream gcc-4.9 branch? If yes, we just need a merge. David On Tue, Dec 16, 2014 at 11:30 AM, Carrot Wei car...@google.com wrote: Hi In Google application we hit the same problem as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63341, so we also need the patch r215585 for Google/4.9 branch. It passed following tests: bootstrap and regression test on x86-64. regression test on ppc. Google reference 18687126. OK for Google/4.9 branch?
Re: [Google] Port patch r215585 to Google/4.9 branch
Yes, it has been long time since last merge, so it is good idea to do another merge. On Tue, Dec 16, 2014 at 11:32 AM, Xinliang David Li davi...@google.com wrote: The fix is already in upstream gcc-4.9 branch? If yes, we just need a merge. David On Tue, Dec 16, 2014 at 11:30 AM, Carrot Wei car...@google.com wrote: Hi In Google application we hit the same problem as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63341, so we also need the patch r215585 for Google/4.9 branch. It passed following tests: bootstrap and regression test on x86-64. regression test on ppc. Google reference 18687126. OK for Google/4.9 branch?
Re: [PATCH, libgo]: Fix build warning
On Tue, Dec 16, 2014 at 8:22 PM, Ian Lance Taylor i...@golang.org wrote: On Tue, Dec 16, 2014 at 11:05 AM, Uros Bizjak ubiz...@gmail.com wrote: When building libgo on CentOS 5.11, following warnings appear: In file included from /usr/include/fcntl.h:38:0, from sysinfo.c:6: /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:317:16: warning: inline function ‘lstat64’ declared but never defined __inline__ int lstat64 (__const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:286:16: warning: inline function ‘fstatat64’ declared but never defined __inline__ int fstatat64 (int __fd, __const char *__restrict __file, ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:255:16: warning: inline function ‘fstat64’ declared but never defined __inline__ int fstat64 (int __fd, struct stat64 *__buf) __THROW __nonnull ((2)); ^ /home/uros/gcc-build/gcc/include-fixed/sys/stat.h:250:16: warning: inline function ‘stat64’ declared but never defined __inline__ int stat64 (__const char *__restrict __file, ^ These are emitted from: CC=/home/uros/gcc-build/./gcc/xgcc -B/home/uros/gcc-build/./gcc/ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem /usr/ local/x86_64-unknown-linux-gnu/include -isystem /usr/local/x86_64-unknown-linux-gnu/sys-include-DHAVE_CONFIG_H -I. -I../../../gcc-svn/trunk/libgo -I ../../../gcc-svn/trunk/libgo/run time -I../../../gcc-svn/trunk/libgo/../libffi/include -I../libffi/include -pthread -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 /bin/sh ../../../gcc-svn/trunk/libgo/mksysin fo.sh due to the sys/stat.h, which protects above functions with: # if defined __USE_LARGEFILE64 \ (! defined __USE_FILE_OFFSET64 \ || (defined __REDIRECT_NTH defined __OPTIMIZE__)) Adding -O to OSCFLAGS fixes this issue, as __OPTIMIZE__ is defined with -O. Patch was bootstrapped and regression tested on x86_64-linux-gnu (CentOS 5.11), where it removes the above warnings. OK for mainline? This seems a bit dubious, as it seems that the same problem would occur for any C program that #include's sys/stat.h and is compiled without optimization. Please note that the above command also defines __USE_FILE_OFFSET64, but indeed, I don't see these warnings on Fedora 20. I don't mind passing -O when running mksysinfo.sh, but your patch will pass -O to all the C file compilations. That doesn't seem like a good idea--some people might want to debug that code. Can you try either only addding -O for CentOS, or addding it only to mksysinfo.sh? You are right, the proposed change was too broad. I am testing following patch: --cut here-- Index: Makefile.am === --- Makefile.am (revision 218785) +++ Makefile.am (working copy) @@ -1812,7 +1812,7 @@ sysinfo.go: s-sysinfo; @true s-sysinfo: $(srcdir)/mksysinfo.sh config.h - CC=$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(OSCFLAGS) $(SHELL) $(srcdir)/mksysinfo.sh + CC=$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(OSCFLAGS) -O $(SHELL) $(srcdir)/mksysinfo.sh $(SHELL) $(srcdir)/mvifdiff.sh tmp-sysinfo.go sysinfo.go $(STAMP) $@ Index: Makefile.in === --- Makefile.in (revision 218785) +++ Makefile.in (working copy) @@ -4421,7 +4421,7 @@ sysinfo.go: s-sysinfo; @true s-sysinfo: $(srcdir)/mksysinfo.sh config.h - CC=$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(OSCFLAGS) $(SHELL) $(srcdir)/mksysinfo.sh + CC=$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(OSCFLAGS) -O $(SHELL) $(srcdir)/mksysinfo.sh $(SHELL) $(srcdir)/mvifdiff.sh tmp-sysinfo.go sysinfo.go $(STAMP) $@ --cut here-- Uros.
Move HWI abs functions inline
Hi, while looking on profiles of firefox linktime I noticed that HWI abs functions are implemented offline that slows down the normalize function. Those are always win when implemented inline. Bootstrapped/regtested x86_64-linux, comitted as obvious. * hwint.c (abs_hwi, absu_hwi): Move to ... * hwint.h (abs_hwi, absu_hwi): ... here; make inline. Index: hwint.c === --- hwint.c (revision 218730) +++ hwint.c (working copy) @@ -124,22 +124,6 @@ popcount_hwi (unsigned HOST_WIDE_INT x) #endif /* GCC_VERSION 3004 */ -/* Compute the absolute value of X. */ - -HOST_WIDE_INT -abs_hwi (HOST_WIDE_INT x) -{ - gcc_checking_assert (x != HOST_WIDE_INT_MIN); - return x = 0 ? x : -x; -} - -/* Compute the absolute value of X as an unsigned type. */ - -unsigned HOST_WIDE_INT -absu_hwi (HOST_WIDE_INT x) -{ - return x = 0 ? (unsigned HOST_WIDE_INT)x : -(unsigned HOST_WIDE_INT)x; -} /* Compute the greatest common divisor of two numbers A and B using Euclid's algorithm. */ Index: hwint.h === --- hwint.h (revision 218730) +++ hwint.h (working copy) @@ -264,4 +264,21 @@ zext_hwi (unsigned HOST_WIDE_INT src, un } } +/* Compute the absolute value of X. */ + +inline HOST_WIDE_INT +abs_hwi (HOST_WIDE_INT x) +{ + gcc_checking_assert (x != HOST_WIDE_INT_MIN); + return x = 0 ? x : -x; +} + +/* Compute the absolute value of X as an unsigned type. */ + +inline unsigned HOST_WIDE_INT +absu_hwi (HOST_WIDE_INT x) +{ + return x = 0 ? (unsigned HOST_WIDE_INT)x : -(unsigned HOST_WIDE_INT)x; +} + #endif /* ! GCC_HWINT_H */
Re: [C++ Patch] PR 58650
Hi, On 12/16/2014 07:18 PM, Jason Merrill wrote: On 12/16/2014 12:49 PM, Paolo Carlini wrote: I see what you mean: try to somehow realize that grokdeclarator issued an error and we are in error recovery. What about directly addressing NSDMIs, the specific case at issue, thus the below? A bit ad-hoc-ish but on the other hand should be lighter than calling function_declarator_p, etc, to figure out... Well, since only functions can be friends at this point, since types take a different path, maybe change the if (!friend_p) just above to if (!friend_p || TREE_CODE (decl) != FUNCTION_DECL)? Indeed, should have figured it out myself. Thanks. That also addresses my consistency concerns. Minor detail: I think we want DECL_DECLARES_FUNCTION_P to include templates, otherwise we regress on, eg, pr60390.C. I'm finishing testing the below. Thanks again, Paolo. / Index: cp/parser.c === --- cp/parser.c (revision 218777) +++ cp/parser.c (working copy) @@ -21069,7 +21069,10 @@ cp_parser_member_declaration (cp_parser* parser) if (decl) { /* Add DECL to the list of members. */ - if (!friend_p) + if (!friend_p + /* Explicitly include, eg, NSDMIs, for better error +recovery (c++/58650). */ + || !DECL_DECLARES_FUNCTION_P (decl)) finish_member_declaration (decl); if (TREE_CODE (decl) == FUNCTION_DECL) Index: testsuite/g++.dg/parse/friend12.C === --- testsuite/g++.dg/parse/friend12.C (revision 0) +++ testsuite/g++.dg/parse/friend12.C (working copy) @@ -0,0 +1,7 @@ +// PR c++/58650 + +struct A +{ + friend int i = 0; // { dg-error cannot be declared friend } +// { dg-error non-static data member { target { ! c++11 } } 5 } +};
Optimize sreal::normalize
Hi, I have optimized inliner to take advantage that the heap is now in sreal and found that using small numbers leads to excessive time (7% of WPA) spent by sreal::normalize. This is because the normalization is implemented by loops that are unnecesary and can account considerable time when tripped too often. This patch implements it via floor_log2 and brings normalize inline. The motivation is to make the constructors from compile time constants to be optimized into constant write. I broke normalize into normalize_up/normalize_down and the hot path to futher aid inlininig (w/o profile inliner will most of time inline everything together but still it is better than one big function with profile feedback at least.) Bootstrapped/regtested x86_64-linux, OK? Honza * sreal.h (sreal::normalize): Implement inline. (sreal::normalize_up): New function. (sreal::normalize_down): New function. Index: sreal.h === --- sreal.h (revision 218765) +++ sreal.h (working copy) @@ -116,7 +116,9 @@ public: } private: - void normalize (); + inline void normalize (); + inline void normalize_up (); + inline void normalize_down (); void shift_right (int amount); static sreal signedless_plus (const sreal a, const sreal b, bool negative); static sreal signedless_minus (const sreal a, const sreal b, bool negative); @@ -178,4 +180,85 @@ inline sreal operator (const sreal a, return a.shift (-exp); } +/* Make significant to be = SREAL_MIN_SIG. + + Make this separate method so inliner can handle hot path better. */ + +inline void +sreal::normalize_up () +{ + int64_t s = m_sig 0 ? -1 : 1; + unsigned HOST_WIDE_INT sig = absu_hwi (m_sig); + int shift = SREAL_PART_BITS - 2 - floor_log2 (sig); + + gcc_checking_assert (shift 0); + sig = shift; + m_exp -= shift; + gcc_checking_assert (sig = SREAL_MAX_SIG sig = SREAL_MIN_SIG); + + /* Check underflow. */ + if (m_exp -SREAL_MAX_EXP) +{ + m_exp = -SREAL_MAX_EXP; + sig = 0; +} + if (s == -1) +m_sig = -sig; + else +m_sig = sig; +} + +/* Make significant to be = SREAL_MAX_SIG. + + Make this separate method so inliner can handle hot path better. */ + +inline void +sreal::normalize_down () +{ + int64_t s = m_sig 0 ? -1 : 1; + int last_bit; + unsigned HOST_WIDE_INT sig = absu_hwi (m_sig); + int shift = floor_log2 (sig) - SREAL_PART_BITS + 2; + + gcc_checking_assert (shift 0); + last_bit = (sig (shift-1)) 1; + sig = shift; + m_exp += shift; + gcc_checking_assert (sig = SREAL_MAX_SIG sig = SREAL_MIN_SIG); + + /* Round the number. */ + sig += last_bit; + if (sig SREAL_MAX_SIG) +{ + sig = 1; + m_exp++; +} + + /* Check overflow. */ + if (m_exp SREAL_MAX_EXP) +{ + m_exp = SREAL_MAX_EXP; + sig = SREAL_MAX_SIG; +} + if (s == -1) +m_sig = -sig; + else +m_sig = sig; +} + +/* Normalize *this; the hot path. */ + +inline void +sreal::normalize () +{ + unsigned HOST_WIDE_INT sig = absu_hwi (m_sig); + + if (sig == 0) +m_exp = -SREAL_MAX_EXP; + else if (sig SREAL_MAX_SIG) +normalize_down (); + else if (sig SREAL_MIN_SIG) +normalize_up (); +} + #endif Index: sreal.c === --- sreal.c (revision 218765) +++ sreal.c (working copy) @@ -96,64 +96,6 @@ sreal::shift_right (int s) m_sig = s; } -/* Normalize *this. */ - -void -sreal::normalize () -{ - int64_t s = m_sig 0 ? -1 : 1; - unsigned HOST_WIDE_INT sig = absu_hwi (m_sig); - - if (sig == 0) -{ - m_exp = -SREAL_MAX_EXP; -} - else if (sig SREAL_MIN_SIG) -{ - do - { - sig = 1; - m_exp--; - } - while (sig SREAL_MIN_SIG); - - /* Check underflow. */ - if (m_exp -SREAL_MAX_EXP) - { - m_exp = -SREAL_MAX_EXP; - sig = 0; - } -} - else if (sig SREAL_MAX_SIG) -{ - int last_bit; - do - { - last_bit = sig 1; - sig = 1; - m_exp++; - } - while (sig SREAL_MAX_SIG); - - /* Round the number. */ - sig += last_bit; - if (sig SREAL_MAX_SIG) - { - sig = 1; - m_exp++; - } - - /* Check overflow. */ - if (m_exp SREAL_MAX_EXP) - { - m_exp = SREAL_MAX_EXP; - sig = SREAL_MAX_SIG; - } -} - - m_sig = s * sig; -} - /* Return integer value of *this. */ int64_t
Re: PING for gcc/flag-types.h (was: Re: [Patch, gcc/flag-types.h + Fortran] PR54687 - Fortran options cleanup)
Tobias Burnus wrote: The Fortran part of the .opt changes has been approved, but for Enum(), some bits had to be moved to gcc/flag-types.h – and review for that file is still missing. https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01068.html Still pinging the gcc/flag-types.h change. However, I have now committed the follow up patch [1] as Rev. 218790 – and parts of this patch, which do not depend on the flag-types.h change, as Rev. 218792. (Hopefully, nothing important got lost during the rediffing.) Attached are the pending bits of this patch. Tobias [1] https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01289.html Tobias Burnus wrote: This patch cleans up Fortran's option handling and moves it closer to the common way of option handling. That's a nice cleanup and additionally, as Manuel points out in the PR, there are a couple of reasons why this makes sense in addition. I have not yet touched all options but one has to start somewhere. Built and currently regtesting on x86-64-gnu-linux. OK for the trunk? Tobias 2014-12-16 Tobias Burnus bur...@net-b.de PR fortran/54687 gcc/ * flag-types.h (gfc_init_local_real, gfc_fcoarray, gfc_convert): New enums; moved from fortran/. gcc/fortran/ * gfortran.h (gfc_option_t): Remove flags which now have a Var(). (init_local_real, gfc_fcoarray): Moved to ../flag-types.h. * libgfortran.h (unit_convert): Add comment. * lang.opt (flag-convert, flag-init_real, flag-coarray): Add Var() and Enum(). * options.c (gfc_handle_coarray_option): Remove. (gfc_init_options, gfc_post_options, gfc_handle_option): Update for *.opt changes. * array.c: Update for flag-variable name changes. * check.c: Ditto. * match.c: Ditto. * resolve.c: Ditto. * simplify.c: Ditto. * trans-array.c: Ditto. * trans-decl.c: Ditto. * trans-expr.c: Ditto. * trans-intrinsic.c: Ditto. * trans-stmt.c: Ditto. * trans-types.c: Ditto. * trans.c: Ditto. diff --git a/gcc/flag-types.h b/gcc/flag-types.h index 52ff7ee..81e8fb8 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -263,4 +263,38 @@ enum lto_partition_model { LTO_PARTITION_MAX = 4 }; + +/* gfortran -finit-real= values. */ + +enum gfc_init_local_real +{ + GFC_INIT_REAL_OFF = 0, + GFC_INIT_REAL_ZERO, + GFC_INIT_REAL_NAN, + GFC_INIT_REAL_SNAN, + GFC_INIT_REAL_INF, + GFC_INIT_REAL_NEG_INF +}; + +/* gfortran -fcoarray= values. */ + +enum gfc_fcoarray +{ + GFC_FCOARRAY_NONE = 0, + GFC_FCOARRAY_SINGLE, + GFC_FCOARRAY_LIB +}; + + +/* gfortran -fconvert= values; used for unformatted I/O. + Keep in sync with GFC_CONVERT_* in gcc/fortran/libgfortran.h. */ +enum gfc_convert +{ + GFC_FLAG_CONVERT_NATIVE = 0, + GFC_FLAG_CONVERT_SWAP, + GFC_FLAG_CONVERT_BIG, + GFC_FLAG_CONVERT_LITTLE +}; + + #endif /* ! GCC_FLAG_TYPES_H */ diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c index e88ba66..e60b938 100644 --- a/gcc/fortran/array.c +++ b/gcc/fortran/array.c @@ -208,7 +208,7 @@ coarray: return MATCH_ERROR; } - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %C, use %-fcoarray=% to enable); return MATCH_ERROR; @@ -591,7 +591,7 @@ coarray: if (!gfc_notify_std (GFC_STD_F2008, Coarray declaration at %C)) goto cleanup; - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %C, use %-fcoarray=% to enable); goto cleanup; diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index 527123d..95c5223 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -1481,7 +1481,7 @@ check_co_collective (gfc_expr *a, gfc_expr *image_idx, gfc_expr *stat, } } - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %L, use %-fcoarray=% to enable, a-where); @@ -2569,7 +2569,7 @@ gfc_check_lbound (gfc_expr *array, gfc_expr *dim, gfc_expr *kind) bool gfc_check_lcobound (gfc_expr *coarray, gfc_expr *dim, gfc_expr *kind) { - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %C, use %-fcoarray=% to enable); return false; @@ -4847,7 +4847,7 @@ gfc_check_image_index (gfc_expr *coarray, gfc_expr *sub) { mpz_t nelems; - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %C, use %-fcoarray=% to enable); return false; @@ -4885,7 +4885,7 @@ gfc_check_image_index (gfc_expr *coarray, gfc_expr *sub) bool gfc_check_num_images (gfc_expr *distance, gfc_expr *failed) { - if (gfc_option.coarray == GFC_FCOARRAY_NONE) + if (flag_coarray == GFC_FCOARRAY_NONE) { gfc_fatal_error (Coarrays disabled at %C, use %-fcoarray=% to enable); return false; @@ -4927,7 +4927,7 @@ gfc_check_num_images (gfc_expr
[SH][committed] Fix fpchg testcase
Hi, The attached patch fixes the gcc.target/sh/fpchg.c test case. The test case actually never worked since it requires at least -O option to output the expected fpchg insn. Moreover, scan-assembler fpchg would match on the file name string in the asm output. Fixed and tested with make check-gcc RUNTESTFLAGS=sh.exp=pr53513-1.c --target_board=sh-sim \{-m4a/-ml,-m4a/-mb,-m4a-single/-ml,-m4a-single/-mb,-m4a-nofpu/-ml,-m4a-nofpu/-mb}. Commited as r218793. Cheers, Oleg gcc/testsuite/ChangeLog: PR target/53513 * gcc.target/sh/fpchg.c: Rename to ... * gcc.target/sh/pr53513-1.c: ... this. Adjust test case to work for -m4a and -m4a-single. Index: gcc/testsuite/gcc.target/sh/pr53513-1.c === --- gcc/testsuite/gcc.target/sh/pr53513-1.c (revision 0) +++ gcc/testsuite/gcc.target/sh/pr53513-1.c (revision 0) @@ -0,0 +1,11 @@ +/* Check that fpchg is used to switch FPSCR.PR mode on SH4A. */ +/* { dg-additional-options -O } */ +/* { dg-skip-if { sh*-*-* } { * } { -m4a -m4a-single } } */ +/* { dg-final { scan-assembler fpchg } } */ +/* { dg-final { scan-assembler-not fpscr } } */ + +double +foo (float a, float b, double c) +{ + return (a * b) + c; +} Index: gcc/testsuite/gcc.target/sh/fpchg.c === --- gcc/testsuite/gcc.target/sh/fpchg.c (revision 218791) +++ gcc/testsuite/gcc.target/sh/fpchg.c (working copy) @@ -1,17 +0,0 @@ -/* Check that fpchg is used to switch precision. */ - -/* { dg-do compile } */ -/* { dg-final { scan-assembler fpchg } } */ -/* { dg-final { scan-assembler-not fpscr } } */ -/* { dg-skip-if { sh*-*-* } { * } { -m4a } } */ - -extern float c; - -void -foo(int j) -{ - while (j--) -c++; - -} -
Go patch committed: force non-abstract type in switch statement
This patch by Chris Manghane fixes a Go compiler crash when using a switch statement where the switch value is an untyped constant. This is not normally a useful thing to do, but of course the compiler should not crash. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 38dd5fea8cc1 go/statements.cc --- a/go/statements.cc Tue Dec 16 11:14:02 2014 -0800 +++ b/go/statements.cc Tue Dec 16 13:00:30 2014 -0800 @@ -3857,7 +3857,11 @@ Expression* val = this-val_; if (val == NULL) val = Expression::make_boolean(true, loc); - Temporary_statement* val_temp = Statement::make_temporary(NULL, val, loc); + + Type* type = val-type(); + if (type-is_abstract()) +type = type-make_non_abstract_type(); + Temporary_statement* val_temp = Statement::make_temporary(type, val, loc); b-add_statement(val_temp); this-clauses_-lower(b, val_temp, this-break_label());
Re: [ping^5] [libgomp] make it possible to use OMP on both sides of a fork
Ping^5 for: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00519.html On Tue, Nov 25, 2014 at 6:54 PM, Nathaniel Smith n...@pobox.com wrote: Ping^4 for: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00519.html On Tue, Nov 18, 2014 at 12:53 AM, Nathaniel Smith n...@pobox.com wrote: Hello, Ping for https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00519.html Patches posted early enough during Stage 1 and not yet fully reviewed may still get in early in Stage 3. Please make sure to ping them soon enough. This patch was initially posted before stage 1 opened... for 4.9. So hopefully that qualifies :-). It would be nice to get it in someday... -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
[SH][committed] Adjust pr54089-1.c test case
Hi, The gcc.target/sh/pr54089-1.c test case started to fail for SH2A. Due to different combine paths the expected rotcr insn is sometimes not generated as expected at -O1. Changing it to -O2 produces the expected insn sequences. Tested with make -k check-gcc RUNTESTFLAGS=sh.exp=pr54089-1.c --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} Committed as r218795. Cheers, Oleg gcc/testsuite/ChangeLog: PR target/54089 * gcc.target/sh/pr54089-1.c: Change optimization level from -O1 to -O2. Index: gcc/testsuite/gcc.target/sh/pr54089-1.c === --- gcc/testsuite/gcc.target/sh/pr54089-1.c (revision 218791) +++ gcc/testsuite/gcc.target/sh/pr54089-1.c (working copy) @@ -1,6 +1,6 @@ /* Check that the rotcr instruction is generated. */ /* { dg-do compile } */ -/* { dg-options -O1 } */ +/* { dg-options -O2 } */ /* { dg-skip-if { sh*-*-* } { -m5*} { } } */ /* { dg-final { scan-assembler-times rotcr 24 } } */ /* { dg-final { scan-assembler-times shll\t 1 } } */
Re: [PATCH] Make dg-extract-results.sh explicitly treat .{sum,log} files as text
On Tuesday, December 16 2014, Mike Stump wrote: So, either, the tool should not generate 0 in the output, which, is rather anti-social, or one should strip the funny characters in a more portable fashion. tr and cat -v come to mind; both should be pretty portable. OK to apply? Try cat | cat -v in there instead. Thanks, Mike. WDYT of the attached patch? It tries to use grep --text if that is available, and falls back to using cat -v otherwise. Unfortunately, it is necessary to provide the filenames to grep here: CNT=`$GREP '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l` so I couldn't use cat -v $SUM_FILES directly. Thanks, -- Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/ diff --git a/contrib/dg-extract-results.sh b/contrib/dg-extract-results.sh index a83c8e8..0ddf25b 100755 --- a/contrib/dg-extract-results.sh +++ b/contrib/dg-extract-results.sh @@ -127,13 +127,28 @@ do done test $ERROR -eq 0 || exit 1 +# Test if grep supports the '--text' option + +GREP=grep + +if echo -e '\x00foo\x00' | $GREP --text foo /dev/null 21 ; then + GREP=grep --text +else + # Our grep does not recognize the '--text' option. We have to + # treat our files in order to remove any non-printable character. + for file in $SUM_FILES ; do +mv $file ${file}.orig +cat -v ${file}.orig $file + done +fi + if [ -z $TOOL ]; then # If no tool was specified, all specified summary files must be for # the same tool. - CNT=`grep '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l` + CNT=`$GREP '=== .* tests ===' $SUM_FILES | $AWK '{ print $3 }' | sort -u | wc -l` if [ $CNT -eq 1 ]; then -TOOL=`grep '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'` +TOOL=`$GREP '=== .* tests ===' $FIRST_SUM | $AWK '{ print $2 }'` else msg ${PROGNAME}: sum files are for multiple tools, specify a tool msg @@ -144,7 +159,7 @@ else # Ignore the specified summary files that are not for this tool. This # should keep the relevant files in the same order. - SUM_FILES=`grep -l === $TOOL $SUM_FILES` + SUM_FILES=`$GREP -l === $TOOL $SUM_FILES` if test -z $SUM_FILES ; then msg ${PROGNAME}: none of the specified files are results for $TOOL exit 1 @@ -233,7 +248,7 @@ else VARIANTS= for VAR in $VARS do -grep Running target $VAR $SUM_FILES /dev/null VARIANTS=$VARIANTS $VAR +$GREP Running target $VAR $SUM_FILES /dev/null VARIANTS=$VARIANTS $VAR done fi @@ -435,6 +450,6 @@ cat ${TMP}/var-* | $AWK -f $TOTAL_AWK # This is ugly, but if there's version output from the compiler under test # at the end of the file, we want it. The other thing that might be there # is the final summary counts. -tail -2 $FIRST_SUM | grep '^#' /dev/null || tail -2 $FIRST_SUM +tail -2 $FIRST_SUM | $GREP '^#' /dev/null || tail -2 $FIRST_SUM exit 0
Improve handling of const functions in inline-analysis
Hi, while looking into inliner's behaviour on sreal I noticed that const bulitins arenot predicted to become constant when their parameters are. This patch implement that. It is not fully optimal, because cost of the const call will be still accounted, but it enables better propagation of invariantness. Also inliner probably should make difference between constant and invariant, but that is for next stage1. Bootstrapped/regtested x86_64-linux, will commit it after bit of further testing on Firefox and tramp3d. Honza * ipa-inline-analysis.c (will_be_nonconstant_predicate): Consider return values of const calls as constants. (estimate_function_body_sizes): Expect calls to have false predicates. Index: ipa-inline-analysis.c === --- ipa-inline-analysis.c (revision 218791) +++ ipa-inline-analysis.c (working copy) @@ -2036,12 +2036,12 @@ will_be_nonconstant_predicate (struct ip struct agg_position_info aggpos; /* What statments might be optimized away - when their arguments are constant - TODO: also trivial builtins. - builtin_constant_p is already handled later. */ + when their arguments are constant. */ if (gimple_code (stmt) != GIMPLE_ASSIGN gimple_code (stmt) != GIMPLE_COND - gimple_code (stmt) != GIMPLE_SWITCH) + gimple_code (stmt) != GIMPLE_SWITCH + (gimple_code (stmt) != GIMPLE_CALL + || !(gimple_call_flags (stmt) ECF_CONST))) return p; /* Stores will stay anyway. */ @@ -2101,9 +2101,10 @@ will_be_nonconstant_predicate (struct ip p = nonconstant_names[SSA_NAME_VERSION (use)]; op_non_const = or_predicates (summary-conds, p, op_non_const); } - if (gimple_code (stmt) == GIMPLE_ASSIGN - TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME) -nonconstant_names[SSA_NAME_VERSION (gimple_assign_lhs (stmt))] + if ((gimple_code (stmt) == GIMPLE_ASSIGN || gimple_code (stmt) == GIMPLE_CALL) + gimple_op (stmt, 0) + TREE_CODE (gimple_op (stmt, 0)) == SSA_NAME) +nonconstant_names[SSA_NAME_VERSION (gimple_op (stmt, 0))] = op_non_const; return op_non_const; } @@ -2683,7 +2684,9 @@ estimate_function_body_sizes (struct cgr else p = true_predicate (); - if (!false_predicate_p (p)) + if (!false_predicate_p (p) + || (is_gimple_call (stmt) + !false_predicate_p (bb_predicate))) { time += this_time; size += this_size;
Cleanup and speedup inliner after conversion of heap to sreals
Hi, conversion to sreal makes it possible to compute badness in more streamlined manner. Together with the sreal::normalize change this patch finally makes fibheap badness calcualtion to be out of radar in profiling and I hope it makes it more maintainable by eliminating the issues from roundoff errors and overflows that was quite painful. Incrementally it is also possible to turn time computation into sreals. Bootstrapped/regtested x86_64-linux, will wait with the commit for the preivous changes to show up in the benchmark testers. Thanks for Martin and Trevor for working on srealsfibheap changes. Honza * ipa-inline.c (cgraph_freq_base_rec, percent_rec): New functions. (compute_uninlined_call_time): Return sreal. (compute_inlined_call_time): Return sreal. (big_speedup_p): Update to be computed with sreals. (relative_time_benefit): Turn to sreal computation; return value as numerator/denominator to save division. (edge_badness): Rewrite to sreals; remove overflow checks and cleanup. (ipa_inline): Initialize cgraph_freq_base_rec and percent_rec. (inline_small_functions): Update dumping; speedup fibheap maintenance. (update_edge_key): Update dumping. Index: ipa-inline.c === --- ipa-inline.c(revision 218730) +++ ipa-inline.c(working copy) @@ -148,6 +148,10 @@ static gcov_type max_count; static sreal max_count_real, max_relbenefit_real, half_int_min_real; static gcov_type spec_rem; +/* sreal constants representing 1/CGRAPH_FREQ_BASE and + 1/100. */ +static sreal cgraph_freq_base_rec, percent_rec; + /* Return false when inlining edge E would lead to violating limits on function unit growth or stack usage growth. @@ -517,37 +521,34 @@ want_early_inline_function_p (struct cgr /* Compute time of the edge-caller + edge-callee execution when inlining does not happen. */ -inline gcov_type +inline sreal compute_uninlined_call_time (struct inline_summary *callee_info, struct cgraph_edge *edge) { - gcov_type uninlined_call_time = -RDIV ((gcov_type)callee_info-time * MAX (edge-frequency, 1), - CGRAPH_FREQ_BASE); - gcov_type caller_time = inline_summary (edge-caller-global.inlined_to - ? edge-caller-global.inlined_to - : edge-caller)-time; + sreal uninlined_call_time = (sreal)callee_info-time + * (sreal)MAX (edge-frequency, 1) + * cgraph_freq_base_rec; + int caller_time = inline_summary (edge-caller-global.inlined_to + ? edge-caller-global.inlined_to + : edge-caller)-time; return uninlined_call_time + caller_time; } /* Same as compute_uinlined_call_time but compute time when inlining does happen. */ -inline gcov_type +inline sreal compute_inlined_call_time (struct cgraph_edge *edge, int edge_time) { - gcov_type caller_time = inline_summary (edge-caller-global.inlined_to - ? edge-caller-global.inlined_to - : edge-caller)-time; - gcov_type time = (caller_time - + RDIV (((gcov_type) edge_time -- inline_edge_summary (edge)-call_stmt_time) - * MAX (edge-frequency, 1), CGRAPH_FREQ_BASE)); - /* Possible one roundoff error, but watch for overflows. */ - gcc_checking_assert (time = INT_MIN / 2); - if (time 0) -time = 0; + int caller_time = inline_summary (edge-caller-global.inlined_to + ? edge-caller-global.inlined_to + : edge-caller)-time; + sreal time = (sreal)caller_time + + ((sreal)(edge_time - inline_edge_summary (edge)-call_stmt_time) + * (sreal)MAX (edge-frequency, 1) + * cgraph_freq_base_rec); + gcc_checking_assert (time = 0); return time; } @@ -557,12 +558,10 @@ compute_inlined_call_time (struct cgraph static bool big_speedup_p (struct cgraph_edge *e) { - gcov_type time = compute_uninlined_call_time (inline_summary (e-callee), - e); - gcov_type inlined_time = compute_inlined_call_time (e, - estimate_edge_time (e)); + sreal time = compute_uninlined_call_time (inline_summary (e-callee), e); + sreal inlined_time = compute_inlined_call_time (e, estimate_edge_time (e)); if (time - inlined_time - RDIV (time * PARAM_VALUE (PARAM_INLINE_MIN_SPEEDUP), 100)) + (sreal)time * (sreal)PARAM_VALUE (PARAM_INLINE_MIN_SPEEDUP) * percent_rec) return true; return false; } @@ -860,42 +859,45 @@ want_inline_function_to_all_callers_p (s #define
Re: [C++ Patch] PR 58650
OK, thanks. Jason
[GOOGLE] Sync google-4_9 autofdo to trunk
This patch syncs google-4_9 autofdo implementation to trunk (as much as possible). Bootstrapped and passed regression test and performance test. OK for google-4_9? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 218795) +++ gcc/auto-profile.c (working copy) @@ -1,5 +1,5 @@ -/* Calculate branch probabilities, and basic block execution counts. - Copyright (C) 2012. Free Software Foundation, Inc. +/* Read and annotate call graph profile from the auto profile data file. + Copyright (C) 2014. Free Software Foundation, Inc. Contributed by Dehao Chen (de...@google.com) This file is part of GCC. @@ -18,19 +18,17 @@ You should have received a copy of the GNU General along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -/* Read and annotate call graph profile from the auto profile data - file. */ +#include config.h +#include system.h #include string.h #include map -#include vector #include set -#include config.h -#include system.h #include coretypes.h #include tree.h #include flags.h +#include vec.h #include basic-block.h #include diagnostic-core.h #include gcov-io.h @@ -73,26 +71,34 @@ along with GCC; see the file COPYING3. If not see Phase 1: Read profile from the profile data file. The following info is read from the profile datafile: - * string_table: a map between function name and its index. - * autofdo_source_profile: a map from function_instance name to - function_instance. This is represented as a forest of - function_instances. - * autofdo_module_profile: a map from module name to its - compilation/aux-module info. - * WorkingSet: a histogram of how many instructions are covered for a - given percentage of total cycles. +* string_table: a map between function name and its index. +* autofdo_source_profile: a map from function_instance name to + function_instance. This is represented as a forest of + function_instances. +* WorkingSet: a histogram of how many instructions are covered for a + given percentage of total cycles. This is describing the binary + level information (not source level). This info is used to help + decide if we want aggressive optimizations that could increase + code footprint (e.g. loop unroll etc.) + A function instance is an instance of function that could either be a + standalone symbol, or a clone of a function that is inlined into another + function. - Phase 2: Early inline. + Phase 2: Early inline + valur profile transformation. Early inline uses autofdo_source_profile to find if a callsite is: - * inlined in the profiled binary. - * callee body is hot in the profiling run. +* inlined in the profiled binary. +* callee body is hot in the profiling run. If both condition satisfies, early inline will inline the callsite regardless of the code growth. + Phase 2 is an iterative process. During each iteration, we also check + if an indirect callsite is promoted and inlined in the profiling run. + If yes, vpt will happen to force promote it and in the next iteration, + einline will inline the promoted callsite in the next iteration. Phase 3: Annotate control flow graph. AutoFDO uses a separate pass to: - * Annotate basic block count - * Estimate branch probability +* Annotate basic block count +* Estimate branch probability After the above 3 phases, all profile is readily annotated on the GCC IR. AutoFDO tries to reuse all FDO infrastructure as much as possible to make @@ -102,16 +108,17 @@ along with GCC; see the file COPYING3. If not see #define DEFAULT_AUTO_PROFILE_FILE fbdata.afdo -namespace autofdo { +namespace autofdo +{ /* Represent a source location: (function_decl, lineno). */ typedef std::pairtree, unsigned decl_lineno; /* Represent an inline stack. vector[0] is the leaf node. */ -typedef std::vectordecl_lineno inline_stack; +typedef auto_vecdecl_lineno inline_stack; /* String array that stores function names. */ -typedef std::vectorconst char * string_vector; +typedef auto_vecchar * string_vector; /* Map from function name's index in string_table to target's execution count. */ @@ -130,7 +137,7 @@ struct count_info /* Map from indirect call target to its sample count. */ icall_target_map targets; - /* Whether this inline stack is already used in annotation. + /* Whether this inline stack is already used in annotation. Each inline stack should only be used to annotate IR once. This will be enforced when instruction-level discriminator @@ -141,15 +148,21 @@ struct count_info /* operator for const char *. */ struct string_compare { - bool operator() (const char *a, const char *b) const -{
Re: [GOOGLE] Sync google-4_9 autofdo to trunk
ok. David On Tue, Dec 16, 2014 at 2:15 PM, Dehao Chen de...@google.com wrote: This patch syncs google-4_9 autofdo implementation to trunk (as much as possible). Bootstrapped and passed regression test and performance test. OK for google-4_9? Thanks, Dehao
Fix min in fibheap template
Hi, using min function of fibheap template results in compile error on non-existent data fields in fibheap node. Fixed thus. Bootstrapped/regtested x86_64-linux, comitted. Honza * fibonacci_heap.h (min): Return m_data instead of non-existing data. Index: fibonacci_heap.h === --- fibonacci_heap.h(revision 218796) +++ fibonacci_heap.h(working copy) @@ -211,7 +211,7 @@ public: if (m_min == NULL) return NULL; -return m_min-data; +return m_min-m_data; } /* Replace data associated with NODE and replace it with DATA. */
[GOOGLE] Do not promote indirect call for AutoFDO in the callee body definition is not available
This patch fixes the bug for undefined symbol in AutoFDO build. Testing on going. OK for google-4_9 branch? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 218784) +++ gcc/auto-profile.c (working copy) @@ -592,6 +592,8 @@ continue; if (!check_ic_target (stmt, node)) continue; + if (!node-definition) +continue; (*map)[callee] = iter-second-total_count (); ret += iter-second-total_count (); }
Re: [PATCH] Updates ssa and inline summary in the correct location for AutoFDO
ping... Thanks, Dehao On Tue, Nov 18, 2014 at 2:29 PM, Dehao Chen de...@google.com wrote: This patch updates ssa and inline summary in the correct location for AutoFDO. Bootstrapped and passed regression test. OK for trunk? Thanks, Dehao gcc/ChangeLog: 2014-11-18 Dehao Chen de...@google.com * auto-profile.c (afdo_annotate_cfg): Invoke update_ssa in the right place. (auto_profile): Recompute inline summary after processing cgraph node. Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 217741) +++ gcc/auto-profile.c (working copy) @@ -1514,7 +1514,14 @@ afdo_annotate_cfg (const stmt_set promoted_stmts) profile_status_for_fn (cfun) = PROFILE_READ; } if (flag_value_profile_transformations) -gimple_value_profile_transformations (); +{ + gimple_value_profile_transformations (); + free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); + calculate_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); +} } /* Wrapper function to invoke early inliner. */ @@ -1592,7 +1599,6 @@ auto_profile (void) early_inline (); autofdo::afdo_annotate_cfg (promoted_stmts); compute_function_frequency (); -update_ssa (TODO_update_ssa); /* Local pure-const may imply need to fixup the cfg. */ if (execute_fixup_cfg () TODO_cleanup_cfg) @@ -1601,6 +1607,7 @@ auto_profile (void) free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); cgraph_edge::rebuild_edges (); +compute_inline_parameters (cgraph_node::get (current_function_decl), true); pop_cfun (); }
Re: [GOOGLE] Do not promote indirect call for AutoFDO in the callee body definition is not available
Does it paper over the real bug? David On Tue, Dec 16, 2014 at 2:38 PM, Dehao Chen de...@google.com wrote: This patch fixes the bug for undefined symbol in AutoFDO build. Testing on going. OK for google-4_9 branch? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 218784) +++ gcc/auto-profile.c (working copy) @@ -592,6 +592,8 @@ continue; if (!check_ic_target (stmt, node)) continue; + if (!node-definition) +continue; (*map)[callee] = iter-second-total_count (); ret += iter-second-total_count (); }
Re: [PATCH] Updates ssa and inline summary in the correct location for AutoFDO
gcc/ChangeLog: 2014-11-18 Dehao Chen de...@google.com * auto-profile.c (afdo_annotate_cfg): Invoke update_ssa in the right place. (auto_profile): Recompute inline summary after processing cgraph node. Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 217741) +++ gcc/auto-profile.c (working copy) @@ -1514,7 +1514,14 @@ afdo_annotate_cfg (const stmt_set promoted_stmts) profile_status_for_fn (cfun) = PROFILE_READ; } if (flag_value_profile_transformations) -gimple_value_profile_transformations (); +{ + gimple_value_profile_transformations (); + free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); + calculate_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); I do not think you need to calcualte post dominators and most likely update_ssa will do its own dominators clauclation if ndeeded. It would be prettier to get those done via TODOs and subpasses, but if it is impossible patch is OK with the change above. What about the performance tweaks we separated out form the original auto-FDO commit? Honza +} } /* Wrapper function to invoke early inliner. */ @@ -1592,7 +1599,6 @@ auto_profile (void) early_inline (); autofdo::afdo_annotate_cfg (promoted_stmts); compute_function_frequency (); -update_ssa (TODO_update_ssa); /* Local pure-const may imply need to fixup the cfg. */ if (execute_fixup_cfg () TODO_cleanup_cfg) @@ -1601,6 +1607,7 @@ auto_profile (void) free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); cgraph_edge::rebuild_edges (); +compute_inline_parameters (cgraph_node::get (current_function_decl), true); pop_cfun (); }
Go patch committed: Dont build hash/equality functions for thunk structs
This patch to the Go frontend changes it to never build hash/equality functions for structures created for thunks (go and defer statements). These functions are never needed, and they can cause problems when a thunk is used to pass an unexported type from a different package to a function defined in that package. The resulting struct type may need to call the comparison routine from the other package, which will fail because the type is not exported. I will propose this as bug492 in the master testsuite. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r c549eac5acc2 go/statements.cc --- a/go/statements.cc Tue Dec 16 13:36:08 2014 -0800 +++ b/go/statements.cc Tue Dec 16 14:47:08 2014 -0800 @@ -1884,6 +1884,8 @@ // Class Thunk_statement. This is the base class for go and defer // statements. +Unordered_set(const Struct_type*) Thunk_statement::thunk_types; + // Constructor. Thunk_statement::Thunk_statement(Statement_classification classification, @@ -2265,7 +2267,20 @@ } } - return Type::make_struct_type(fields, location); + Struct_type *st = Type::make_struct_type(fields, location); + + Thunk_statement::thunk_types.insert(st); + + return st; +} + +// Return whether ST is a type created to hold thunk parameters. + +bool +Thunk_statement::is_thunk_struct(const Struct_type* st) +{ + return (Thunk_statement::thunk_types.find(st) + != Thunk_statement::thunk_types.end()); } // Build the thunk we are going to call. This is a brand new, albeit diff -r c549eac5acc2 go/statements.h --- a/go/statements.h Tue Dec 16 13:36:08 2014 -0800 +++ b/go/statements.h Tue Dec 16 14:47:08 2014 -0800 @@ -985,6 +985,10 @@ bool simplify_statement(Gogo*, Named_object*, Block*); + // Return whether ST is a type created to hold thunk parameters. + static bool + is_thunk_struct(const Struct_type *st); + protected: int do_traverse(Traverse* traverse); @@ -1023,6 +1027,9 @@ void thunk_field_param(int n, char* buf, size_t buflen); + // A list of all the struct types created for thunk statements. + static Unordered_set(const Struct_type*) thunk_types; + // The function call to be executed in a separate thread (go) or // later (defer). Expression* call_; diff -r c549eac5acc2 go/types.cc --- a/go/types.cc Tue Dec 16 13:36:08 2014 -0800 +++ b/go/types.cc Tue Dec 16 14:47:08 2014 -0800 @@ -1593,7 +1593,9 @@ hash_fnname = __go_type_hash_identity; equal_fnname = __go_type_equal_identity; } - else if (!this-is_comparable()) + else if (!this-is_comparable() || + (this-struct_type() != NULL +Thunk_statement::is_thunk_struct(this-struct_type( { hash_fnname = __go_type_hash_error; equal_fnname = __go_type_equal_error;
Re: [GOOGLE] Do not promote indirect call for AutoFDO in the callee body definition is not available
The real bug is in debug info. The callgraph should look like: foo-D1EV-D4EV While everything is inlined into foo, but in the inline stack D1EV is missing. This patch does not fix that problem, but is also reasonable because if callee's definition is not available, we should not promote the indirect call anyway. Dehao On Tue, Dec 16, 2014 at 2:45 PM, Xinliang David Li davi...@google.com wrote: Does it paper over the real bug? David On Tue, Dec 16, 2014 at 2:38 PM, Dehao Chen de...@google.com wrote: This patch fixes the bug for undefined symbol in AutoFDO build. Testing on going. OK for google-4_9 branch? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 218784) +++ gcc/auto-profile.c (working copy) @@ -592,6 +592,8 @@ continue; if (!check_ic_target (stmt, node)) continue; + if (!node-definition) +continue; (*map)[callee] = iter-second-total_count (); ret += iter-second-total_count (); }
Re: [gofrontend-dev] [PATCH] backport libgo patch for ppc relocs in debug/elf
I committed this to the 4.9 branch. Ian On Tue, Dec 16, 2014 at 1:45 PM, Lynn A. Boger labo...@linux.vnet.ibm.com wrote: Hi, Please backport the attached patches to gcc 4.9 based on https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02994.html. 2014-12-16 Lynn Boger labo...@linux.vnet.ibm.com * libgo/go/debug/elf/elf.go: Add reloc types for PPC64 * libgo/go/debug/elf/file.go: Add function applyRelocationsPPC64 * libgo/go/debug/elf/file_test.go: Update test for PPC64 * libgo/go/debug/elf/testdata/go-relocation-test-gcc447-ppc64.obj: new test data -- You received this message because you are subscribed to the Google Groups gofrontend-dev group. To unsubscribe from this group and stop receiving emails from it, send an email to gofrontend-dev+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [GOOGLE] Do not promote indirect call for AutoFDO in the callee body definition is not available
On Tue, Dec 16, 2014 at 3:07 PM, Dehao Chen de...@google.com wrote: The real bug is in debug info. The callgraph should look like: foo-D1EV-D4EV While everything is inlined into foo, but in the inline stack D1EV is missing. Is it DFEed? This patch does not fix that problem, but is also reasonable because if callee's definition is not available, we should not promote the indirect call anyway. It depends. If there is a wrong DFE before the promotion, we will end up missing opportunities. David Dehao On Tue, Dec 16, 2014 at 2:45 PM, Xinliang David Li davi...@google.com wrote: Does it paper over the real bug? David On Tue, Dec 16, 2014 at 2:38 PM, Dehao Chen de...@google.com wrote: This patch fixes the bug for undefined symbol in AutoFDO build. Testing on going. OK for google-4_9 branch? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 218784) +++ gcc/auto-profile.c (working copy) @@ -592,6 +592,8 @@ continue; if (!check_ic_target (stmt, node)) continue; + if (!node-definition) +continue; (*map)[callee] = iter-second-total_count (); ret += iter-second-total_count (); }
Re: [PATCH] combine: If a parallel I2 was split, do not allow a new I2 (PR64268)
On Tue, Dec 16, 2014 at 07:28:22AM -0600, Segher Boessenkool wrote: I did a run for powerpc64, one for powerpc, and one for x86-64. The powerpc64 bootstrap was with pre-installed GMP etc.; the others had those libraries in-tree. type1 is when try_combine used the ancient combine code to split a parallel set and set of cc; type2 is when it used my code to split any other parallel that sets two things; and type0 is when it didn't do either but still ended up with I1 and I2 the same UID (I think it might be called with the same insn twice; not a good thing, it does not know how to handle this; and it is really worrisome that it then sometimes succeeds in combining it). tries is how often that split-orig-I2-to-two code is used; recog is how often it reached the first call to recog (so it passed can_combine_p etc.); fail is how often it eventually failed (after reaching recog), one is how often it combined to one insn, two is how often it combined to two. powerpc64 tries recog failone two type1 39214 39214 38944 202 18 type2 21540 18968 18928 2 38 type0 292 289 0 3 powerpc tries recog failone two type1 21654 21654 21167 485 2 type2 21839 19754 19243 0 509 (*) type0 427 294 0 133 x86-64 tries recog failone two type1 17387 17387 17288 70 29 type2 40413 31681 30242 60 1369 type0 0 0 0 0 The new numbers: powerpc64 tries recog failone two type1 39216 39216 38996 202 18 type2 21540 18968 18928 2 38 powerpc tries recog failone two type1 45276 45276 44196 104040 type2 46149 41589 40553 0 1032 x86-64 tries recog failone two type1 17387 17387 17288 70 29 type2 40088 31625 30196 60 1369 Segher
Re: [gofrontend-dev] [PATCH] backport libgo patch to add ioctl consts
I have committed this to the 4.9 branch. Ian On Thu, Dec 11, 2014 at 12:54 PM, Lynn A. Boger labo...@linux.vnet.ibm.com wrote: Hi all, Please backport the following to gcc 4.9 https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02980.html. There has been a request to get the fixes that went into gcc trunk for gccgo ppc64 ppc64le backported into gcc 4.9. 2014-12-11 Lynn Boger labo...@linux.vnet.ibm.com * libgo/mksysinfo.sh: Add ioctl const values Index: libgo/mksysinfo.sh === --- libgo/mksysinfo.sh (revision 218396) +++ libgo/mksysinfo.sh (working copy) @@ -174,6 +174,9 @@ enum { #ifdef TIOCGWINSZ TIOCGWINSZ_val = TIOCGWINSZ, #endif +#ifdef TIOCSWINSZ + TIOCSWINSZ_val = TIOCSWINSZ, +#endif #ifdef TIOCNOTTY TIOCNOTTY_val = TIOCNOTTY, #endif @@ -192,6 +195,12 @@ enum { #ifdef TIOCSIG TIOCSIG_val = TIOCSIG, #endif +#ifdef TCGETS + TCGETS_val = TCGETS, +#endif +#ifdef TCSETS + TCSETS_val = TCSETS, +#endif }; EOF @@ -780,6 +789,11 @@ if ! grep '^const TIOCGWINSZ' ${OUT} /dev/null 2 echo 'const TIOCGWINSZ = _TIOCGWINSZ_val' ${OUT} fi fi +if ! grep '^const TIOCSWINSZ' ${OUT} /dev/null 21; then + if grep '^const _TIOCSWINSZ_val' ${OUT} /dev/null 21; then +echo 'const TIOCSWINSZ = _TIOCSWINSZ_val' ${OUT} + fi +fi if ! grep '^const TIOCNOTTY' ${OUT} /dev/null 21; then if grep '^const _TIOCNOTTY_val' ${OUT} /dev/null 21; then echo 'const TIOCNOTTY = _TIOCNOTTY_val' ${OUT} @@ -812,8 +826,18 @@ if ! grep '^const TIOCSIG' ${OUT} /dev/null 21; fi # The ioctl flags for terminal control -grep '^const _TC[GS]ET' gen-sysinfo.go | \ +grep '^const _TC[GS]ET' gen-sysinfo.go | grep -v _val | \ sed -e 's/^\(const \)_\(TC[GS]ET[^= ]*\)\(.*\)$/\1\2 = _\2/' ${OUT} +if ! grep '^const TCGETS' ${OUT} /dev/null 21; then + if grep '^const _TCGETS_val' ${OUT} /dev/null 21; then +echo 'const TCGETS = _TCGETS_val' ${OUT} + fi +fi +if ! grep '^const TCSETS' ${OUT} /dev/null 21; then + if grep '^const _TCSETS_val' ${OUT} /dev/null 21; then +echo 'const TCSETS = _TCSETS_val' ${OUT} + fi +fi # ioctl constants. Might fall back to 0 if TIOCNXCL is missing, too, but # needs handling in syscalls.exec.go. -- You received this message because you are subscribed to the Google Groups gofrontend-dev group. To unsubscribe from this group and stop receiving emails from it, send an email to gofrontend-dev+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[PATCH] combine: Do not allow identical insns for I0,I1 and/or I2 (PR64268)
When looking deeper into the problem I discovered that very sometimes try_combine is called with the same insn for I1 and I2. This is quite bad since try_combine does not expect that at all. I expect this is caused by my patches adding a LOG_LINK per register, which can mean a pair of insns has more than one LOG_LINK between them. This patch fixes it by making try_combine refuse identical insns early. Bootstrapped on powerpc64-linux and powerpc-linux; the fails are gone. Also bootstrapped on x86_64-linux. Is this okay for mainline? (regtest on powerpc64-linux in process). Segher 2014-12-15 Segher Boessenkool seg...@kernel.crashing.org gcc/ PR target/64268 * combine.c (try_combine): Immediately return if any of I0,I1,I2 are the same insn. --- gcc/combine.c | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/combine.c b/gcc/combine.c index de2e49f..8e5d1f7 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -2620,6 +2620,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, rtx new_other_notes; int i; + /* Immediately return if any of I0,I1,I2 are the same insn (I3 can + never be). */ + if (i1 == i2 || i0 == i2 || (i0 i0 == i1)) +return 0; + /* Only try four-insn combinations when there's high likelihood of success. Look for simple insns, such as loads of constants or binary operations involving a constant. */ -- 1.8.1.4
Re: PING for gcc/flag-types.h (was: Re: [Patch, gcc/flag-types.h + Fortran] PR54687 - Fortran options cleanup)
On Mon, 15 Dec 2014, Tobias Burnus wrote: The Fortran part of the .opt changes has been approved, but for Enum(), some bits had to be moved to gcc/flag-types.h – and review for that file is still missing. https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01068.html The flag-types.h changes are OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 2/4] Add Visium support to libgcc
On Mon, 15 Dec 2014, Eric Botcazou wrote: Do you have a reason for using fp-bit instead of soft-fp? Apart from the obvious historical reason, probably not, but recently added ports (Blackfin, Epiphany) also use it so I'm not sure we want to change it. I doubt they have any good reason for using it. libgcc files are generally GPL+exception, not LGPL without exception with a very old FSF address (config/visium/div64.c, mod64.c, set_trampoline_parity.c, udiv64.c, udivmod64.c, umod64.c) Files whose copyright/origin is clear are already GPL+exception, but these 6 files were originally imported from Glibc so they aren't in the same basket. I guess I can reuse the copyright notice of soft-fp for them. Well, you'll need FSF approval to relicense - and unless you want to keep the same sources used verbatim in both places, the GPL+exception notice is the obvious one given such approval. (But in any case, putting in LGPL files without a license exception seems a bad idea, because it goes against the standard message about building your program with GCC not imposing restrictions on distribution of the resulting binary.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 3/4] Add Visium support to gcc
On Mon, 15 Dec 2014, Eric Botcazou wrote: (and you should verify that the port builds cleanly with --enable-werror -always, for both 32-bit and 64-bit hosts, when building using current trunk GCC). Do you mean a bootstrap of the cross-compiler with --enable-werror-always on a 32-bit and a 64-bit host? OK, I'll do that. First, bootstrap a native compiler from current trunk. Then, use that native compiler to build the cross compiler configured with --enable-werror-always. (--enable-werror-always is only expected to work when GCC is being built with the same version of GCC, as the compiler may not be -Werror-clean when built with other versions.) Do this for both 32-bit and 64-bit hosts. For a cross-compiler, doing this provides equivalent -Werror coverage to what simple bootstrap does for a native compiler. -- Joseph S. Myers jos...@codesourcery.com